A novel feature selection-based sequential ensemble learning method for class noise detection in high-dimensional data
Source of Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
© 2018, Springer Nature Switzerland AG. The many irrelevant or noisy features in high-dimensional data pose significant challenges to feature selection-based methods for detecting mislabeled instances. Traditional methods often perform two dependent steps: first, searching for a relevant feature subspace, and second, training a model on the subspace obtained in the first step. However, a feature subspace selected this way may be unrelated to the noise scores, which degrades detection performance. In this paper, we propose a novel sequential ensemble method, SENF, that integrates these two phases. SENF learns sequential ensembles to obtain a refined feature subspace and improve detection accuracy through iterative sparse modeling with noise scores as the regression target. Through extensive experiments on 8 real-world high-dimensional datasets from the UCI Machine Learning Repository, we show that SENF performs significantly better than, or at least comparably to, the individual baselines as well as the existing state-of-the-art label noise detection method.
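The loop described in the abstract (score instances, fit a sparse regression of the noise scores on the features, keep the selected features, repeat) can be illustrated with a minimal sketch. This is not the authors' SENF implementation: the k-nearest-neighbor disagreement score, the ISTA-based lasso solver, the averaging of scores across rounds, and every function name and parameter value below are assumptions chosen only for illustration.

```python
import numpy as np

def knn_noise_scores(X, y, k=5):
    """Assumed noise score: fraction of the k nearest neighbours with a different label."""
    sq = (X ** 2).sum(axis=1)
    d = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared Euclidean distances
    np.fill_diagonal(d, np.inf)                      # exclude each point itself
    nn = np.argsort(d, axis=1)[:, :k]
    return (y[nn] != y[:, None]).mean(axis=1)

def lasso_ista(X, t, alpha=0.05, n_iter=500):
    """Lasso by proximal gradient: min (1/2n)||t - Xw||^2 + alpha * ||w||_1."""
    n, p = X.shape
    lipschitz = np.linalg.norm(X, 2) ** 2 / n        # largest singular value squared / n
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - t) / n
        z = w - grad / lipschitz
        w = np.sign(z) * np.maximum(np.abs(z) - alpha / lipschitz, 0.0)  # soft threshold
    return w

def sequential_noise_detection(X, y, n_rounds=3, k=5, alpha=0.05):
    """Hypothetical sequential loop: score, sparse-regress scores on features, prune."""
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)   # standardise features
    features = np.arange(X.shape[1])
    all_scores = []
    for _ in range(n_rounds):
        scores = knn_noise_scores(X[:, features], y, k)
        all_scores.append(scores)
        # Noise scores are the regression target; nonzero weights mark kept features.
        w = lasso_ista(X[:, features], scores - scores.mean(), alpha)
        kept = features[np.abs(w) > 1e-8]
        if kept.size == 0 or kept.size == features.size:
            break                                        # nothing left to prune
        features = kept
    # Assumed aggregation: average the noise scores over all rounds.
    return np.mean(all_scores, axis=0), features
```

Calling `sequential_noise_detection(X, y)` with a float feature matrix and an integer label vector returns one score per instance in [0, 1], where higher means more likely mislabeled, together with the indices of the features still retained after the last round.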
Chen, Kai; Guan, Donghai; Yuan, Weiwei; Li, Bohan; Khattak, Asad Masood; and Alfandi, Omar, "A novel feature selection-based sequential ensemble learning method for class noise detection in high-dimensional data" (2018). Scopus Indexed Articles. 1185.