An Effective Approach to Detect Label Noise

dc.contributor.advisor: Ghorbani, Ali A.
dc.contributor.author: Abrishami, Mahdi
dc.date.accessioned: 2023-10-04T13:26:35Z
dc.date.available: 2023-10-04T13:26:35Z
dc.date.issued: 2022-11
dc.description.abstract: With the increased usage of Internet of Things (IoT) devices in recent years, Machine Learning (ML) methods for attack detection in this domain have also advanced rapidly. However, ML models are vulnerable to various classes of adversarial attacks that aim to fool a model into making an incorrect prediction. For instance, label manipulation, or label flipping, is a type of adversarial attack in which the attacker manipulates the labels of training data, thereby biasing the trained model and/or degrading its performance. In practice, the number of samples that can be flipped in such an attack is often limited, restricting the attacker's target selection. Given the importance of securing ML models against Adversarial Machine Learning (AML) attacks, particularly in the IoT domain, this thesis first presents an extensive review of AML in IoT. It then proposes a classification of AML attacks based on the literature, creating a foundation for future research in this domain. Next, this thesis investigates the negative impact of malicious label flipping attacks (intentional label noise) on IoT data. As accurate labels are necessary for ML training, exploring adversarial label noise is an important research topic. Label noise in datasets is not always adversarial, however, and may stem from other causes, such as careless data labelling. Classification is an essential task in machine learning, where the main objective is to predict the categories of unseen data. The presence of label noise in training datasets, whether adversarial or non-adversarial, can negatively impact the performance of supervised classification. Owing to the growing interest in data-centric AI, which aims to improve the quality of training data without increasing model complexity, a range of research has been undertaken to tackle the label noise problem.
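The random label-flipping attack described above can be illustrated with a minimal sketch. This is not code from the thesis; the function name and parameters are illustrative, and the sketch assumes integer class labels in a NumPy array:

```python
import numpy as np

def flip_labels(y, flip_fraction, num_classes, seed=None):
    """Randomly flip a fraction of labels to a different class:
    a simple model of a random label-flipping attack."""
    rng = np.random.default_rng(seed)
    y_noisy = y.copy()
    n_flip = int(flip_fraction * len(y))
    # Pick the attacked samples uniformly at random, without repeats.
    idx = rng.choice(len(y), size=n_flip, replace=False)
    for i in idx:
        # Replace each chosen label with a different, random class.
        choices = [c for c in range(num_classes) if c != y_noisy[i]]
        y_noisy[i] = rng.choice(choices)
    return y_noisy, idx
```

Flipping 20% of a binary-labelled training set, for example, corrupts exactly one in five labels while leaving the features untouched, which is what makes this attack class hard to spot from the data alone.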
However, few works have investigated this problem in the IoT network intrusion detection domain. This thesis addresses the issue of label noise in the intrusion detection domain by presenting a framework to detect samples with noisy labels. The proposed framework's main components are the decision tree classification algorithm and active learning. The framework consists of two steps: making a decision tree robust against the label noise in a dataset, and then using this robust model, with the help of active learning with uncertainty sampling, to detect noisy samples effectively. In this way, the inherent resilience of the decision tree algorithm to label noise is exploited to tackle this issue in datasets. Based on the results of our experiments, the proposed framework can detect a considerable number of noisy samples in the training dataset, with up to 98% noise reduction. The proposed detection method can also be leveraged as a defense against random label flipping attacks, in which adversarial label manipulation is applied randomly.
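The two-step idea above can be sketched in a few lines, assuming scikit-learn. This is a simplified illustration, not the thesis pipeline: the robust-tree training and the iterative active-learning loop are more involved, and the function name and threshold are hypothetical. The sketch flags samples whose given label disagrees with a confident tree prediction:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def detect_noisy_labels(X, y, confidence_threshold=0.9, max_depth=5):
    """Flag samples whose labels disagree with a confident
    decision-tree prediction (a simplified noise detector)."""
    # Limiting tree depth is one common way to keep the tree from
    # overfitting to (and thus memorizing) noisy labels.
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    tree.fit(X, y)
    proba = tree.predict_proba(X)
    pred = tree.classes_[np.argmax(proba, axis=1)]
    confidence = np.max(proba, axis=1)
    # Uncertainty sampling in reverse: samples the model contradicts
    # with high confidence are the label-noise suspects.
    return np.where((pred != y) & (confidence >= confidence_threshold))[0]
```

The confidence threshold trades off precision against recall of the detector: a higher threshold flags fewer samples but with a stronger case that their labels are wrong.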
dc.description.copyright: © Mahdi Abrishami, 2022
dc.format.extent: xvii, 139
dc.format.medium: electronic
dc.identifier.uri: https://unbscholar.lib.unb.ca/handle/1882/37459
dc.language.iso: en
dc.publisher: University of New Brunswick
dc.rights: http://purl.org/coar/access_right/c_abf2
dc.subject.discipline: Computer Science
dc.title: An Effective Approach to Detect Label Noise
dc.type: master thesis
oaire.license.condition: other
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of New Brunswick
thesis.degree.level: masters
thesis.degree.name: M.C.S.

Files

Original bundle
Name: Mahdi Abrishami - Thesis.pdf
Size: 6.07 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.13 KB
Description: Item-specific license agreed upon to submission