Classification with class noises through probabilistic sampling

Document Type

Article

Source of Publication

Information Fusion

Publication Date

5-1-2018

Abstract

© 2017 Accurately labeled training data plays a critical role in supervised learning. Because labels in practical applications can be erroneous for various reasons, a wide range of algorithms has been developed to identify and remove mislabeled data. In essence, these algorithms adopt a one-zero sampling (OSAM) strategy, in which a sample is selected and retained only if it is recognized as clean. OSAM can make two types of errors: identifying a clean sample as mislabeled and discarding it, or identifying a mislabeled sample as clean and retaining it. Both errors can degrade classification performance. To improve classification accuracy, this paper proposes a novel probabilistic sampling (PSAM) scheme, in which a cleaner sample has a higher chance of being selected. The degree of cleanliness is measured by the confidence in the label. To estimate this confidence accurately, a probabilistic multiple voting approach is proposed that assigns a high confidence value to a clean sample and a low confidence value to a mislabeled one. Finally, we demonstrate that PSAM can effectively improve classification accuracy over existing OSAM methods.
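The contrast between OSAM and PSAM described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the nearest-centroid voter, the random-subset voting loop, the function names (`centroid_predict`, `vote_confidence`, `osam_select`, `psam_select`), the parameters, and the toy data are all assumptions made for illustration. Confidence here is simply the share of voters that agree with a sample's given label, which may differ from the paper's probabilistic multiple voting.

```python
import random


def centroid_predict(train, x):
    """Predict a label for x with a nearest-centroid rule (hypothetical stand-in voter)."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1

    def sq_dist(label):
        # Squared distance from x to the centroid of the given class.
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=sq_dist)


def vote_confidence(data, n_voters=25, subset_frac=0.7, seed=0):
    """Assumed confidence measure: fraction of voters agreeing with each given label."""
    rng = random.Random(seed)
    agree = [0] * len(data)
    for _ in range(n_voters):
        for i, (x, y) in enumerate(data):
            # Each voter is trained on a random subset that excludes the sample itself.
            others = data[:i] + data[i + 1:]
            subset = rng.sample(others, max(1, int(subset_frac * len(others))))
            agree[i] += centroid_predict(subset, x) == y
    return [a / n_voters for a in agree]


def osam_select(data, conf, threshold=0.5):
    """One-zero sampling: keep a sample only if it is judged clean."""
    return [s for s, c in zip(data, conf) if c >= threshold]


def psam_select(data, conf, seed=0):
    """Probabilistic sampling: keep each sample with probability equal to its confidence."""
    rng = random.Random(seed)
    return [s for s, c in zip(data, conf) if rng.random() < c]


# Toy 1-D data: two well-separated classes; the last sample is deliberately mislabeled.
data = [
    ((0.0,), 0), ((0.2,), 0), ((0.1,), 0), ((0.3,), 0),
    ((1.0,), 1), ((1.2,), 1), ((1.1,), 1), ((0.9,), 1),
    ((0.15,), 1),  # lies in the class-0 cluster but carries label 1
]

conf = vote_confidence(data)
kept_osam = osam_select(data, conf)
kept_psam = psam_select(data, conf)
print("confidence:", conf)
print("OSAM kept:", len(kept_osam), "PSAM kept:", len(kept_psam))
```

In this sketch OSAM makes a hard keep-or-discard decision at a fixed threshold, so a borderline confidence estimate leads directly to one of the two error types named in the abstract. PSAM instead keeps each sample with probability proportional to its confidence, so a sample with an uncertain estimate is neither always discarded nor always retained.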

DOI Link

10.1016/j.inffus.2017.08.007

ISSN

1566-2535

Publisher

Elsevier B.V.

Volume

41

First Page

57

Last Page

67

Disciplines

Computer Sciences

Keywords

Mislabeled training data, Multiple voting, One-zero sampling, Probabilistic sampling

Scopus ID

85027514343

Recommended Citation

Yuan, Weiwei; Guan, Donghai; Ma, Tinghuai; and Khattak, Asad Masood, "Classification with class noises through probabilistic sampling" (2018). All Works. 936.
https://zuscholars.zu.ac.ae/works/936

Indexed in Scopus

yes

Open Access

no
