A novel feature selection-based sequential ensemble learning method for class noise detection in high-dimensional data

Document Type

Conference Proceeding

Source of Publication

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Publication Date

1-1-2018

Abstract

© 2018, Springer Nature Switzerland AG. The many irrelevant or noisy features in high-dimensional data pose significant challenges to feature-selection-based methods for detecting mislabeled instances. Traditional methods usually perform two dependent steps: first, they search for a relevant feature subspace; second, they train a model on the subspace obtained in the first step. However, a feature subspace selected this way may be unrelated to the noise scores, which hurts detection performance. In this paper, we propose SENF, a novel sequential ensemble method that integrates these two phases. SENF learns sequential ensembles to obtain a refined feature subspace and to improve detection accuracy through iterative sparse modeling, with noise scores as the regression target attribute. Through extensive experiments on 8 real-world high-dimensional datasets from the UCI machine learning repository [3], we show that SENF performs significantly better than, or at least comparably to, the individual baselines as well as the existing state-of-the-art label noise detection method.
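
The idea of coupling noise scoring with sparse feature selection can be illustrated with a rough sketch. This is not the authors' SENF implementation: the k-nearest-neighbour disagreement score, the coordinate-descent lasso, the averaging of scores across rounds, and every function name and parameter below are assumptions chosen only to show the general loop of "score instances, regress the scores on the features with a sparse model, keep the features with non-zero weights, repeat".

```python
def knn_noise_scores(X, y, k=3):
    """Assumed noise score: fraction of the k nearest neighbours with a different label."""
    scores = []
    for i, xi in enumerate(X):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(xi, xj)), j)
            for j, xj in enumerate(X) if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        scores.append(sum(y[j] != y[i] for j in neighbours) / len(neighbours))
    return scores


def soft_threshold(z, t):
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, target, alpha=0.05, n_sweeps=100):
    """Lasso by cyclic coordinate descent: (1/2n)*||target - Xw||^2 + alpha*||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(target)  # residual target - Xw, with w = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant column carries no information
            # correlation of column j with the residual that excludes w[j]
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            if new_w != w[j]:
                delta = new_w - w[j]
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                w[j] = new_w
    return w


def standardise(X):
    """Scale every column to zero mean and unit variance (constant columns become 0)."""
    n = len(X)
    cols = []
    for col in zip(*X):
        mean = sum(col) / n
        std = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
        cols.append([(v - mean) / std for v in col])
    return [list(row) for row in zip(*cols)]


def sequential_noise_detection(X, y, n_rounds=3, k=3, alpha=0.05):
    """Alternate noise scoring and sparse feature selection (illustrative only)."""
    X = standardise(X)
    features = list(range(len(X[0])))
    rounds = []
    for _ in range(n_rounds):
        sub = [[row[j] for j in features] for row in X]
        scores = knn_noise_scores(sub, y, k)
        rounds.append(scores)
        mean = sum(scores) / len(scores)
        # noise scores act as the regression target of the sparse model
        w = lasso_cd(sub, [s - mean for s in scores], alpha)
        kept = [f for f, wj in zip(features, w) if abs(wj) > 1e-8]
        if not kept or len(kept) == len(features):
            break  # subspace is empty or no longer shrinking
        features = kept
    final = [sum(col) / len(col) for col in zip(*rounds)]  # average over rounds
    return final, features
```

Calling `sequential_noise_detection(X, y)` on a list of feature rows `X` and a label list `y` returns one averaged noise score per instance and the indices of the retained features; instances with the highest scores would be the candidates for mislabeling under these assumptions.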

ISBN

9783030050894

ISSN

0302-9743

Publisher

Springer Verlag

Volume

11323 LNAI

First Page

55

Last Page

65

Disciplines

Computer Sciences

Keywords

Feature selection, Noise filtering, Sequential ensemble

Scopus ID

85059749070

Indexed in Scopus

yes

Open Access

no
