Data from: Fuzziness-based active learning framework to enhance hyperspectral image classification performance for discriminative and generative classifiers

Publication Date



Hyperspectral image classification with a limited number of training samples and without loss of accuracy is desirable, because collecting such data is often expensive and time-consuming. However, classifiers trained with limited samples usually end up with a large generalization error. To overcome this problem, we propose a fuzziness-based active learning framework (FALF), which selects optimal training samples to enhance generalization performance for two different kinds of classifiers, discriminative and generative (e.g. SVM and KNN). The optimal samples are selected by first estimating the boundary of each class and then calculating the fuzziness-based distance between each sample and the estimated class boundaries. Samples that lie closer to the boundaries and have higher fuzziness are chosen as target candidates for the training set. Through detailed experimentation on three publicly available datasets, we show that, when trained with the proposed sample selection framework, both classifiers achieve higher classification accuracy and lower processing time with a small amount of training data than when the training samples are selected randomly. Our experiments demonstrate the effectiveness of the proposed method, which compares favorably with state-of-the-art methods.
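The fuzziness part of the selection step can be illustrated with a minimal sketch. It assumes each sample comes with a vector of class memberships in [0, 1] (for example, classifier output probabilities) and scores it with the standard entropy-style fuzziness measure; the function names `fuzziness` and `select_most_fuzzy` are hypothetical, and the boundary-distance term described above is not modelled here, so this is not the authors' implementation.

```python
import numpy as np

def fuzziness(memberships, eps=1e-12):
    """Entropy-style fuzziness of each row of class memberships.

    Assumption: memberships has shape (n_samples, n_classes) with values in [0, 1].
    """
    # Clip to avoid log(0) for memberships that are exactly 0 or 1.
    mu = np.clip(memberships, eps, 1.0 - eps)
    return -np.mean(mu * np.log(mu) + (1.0 - mu) * np.log(1.0 - mu), axis=1)

def select_most_fuzzy(memberships, k):
    """Return the indices of the k samples with the highest fuzziness."""
    scores = fuzziness(memberships)
    return np.argsort(-scores, kind="stable")[:k]

# Three illustrative samples: confident, highly ambiguous, moderately ambiguous.
memberships = np.array([
    [0.98, 0.01, 0.01],
    [0.40, 0.35, 0.25],
    [0.70, 0.20, 0.10],
])
picked = select_most_fuzzy(memberships, k=2)
print(picked)
```

In this toy input the confident first sample is skipped, and the two more ambiguous samples are proposed as training candidates.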

S1 File. The Indian Pines (corrected) dataset, consisting of 145 × 145 samples and 220 spectral bands with a spatial resolution of 20 m and a spectral range of 0.4–2.5 μm.

Twenty noisy bands were removed prior to the analysis, and the remaining 200 bands were used in our experimental setup. The removed bands are 104–108, 150–163, and 220. The original Indian Pines dataset is available online at [47] [48].
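The band removal can be sketched as follows. This is an illustration only: it assumes the data are held as a NumPy array of shape (145, 145, 220) and that the band numbers listed above are 1-based (band 220 being the last of 220 bands); the function name `remove_noisy_bands` is hypothetical, and a zero-filled array stands in for the real cube.

```python
import numpy as np

def remove_noisy_bands(cube):
    """Drop bands 104-108, 150-163 and 220 (1-based) from the last axis."""
    removed = list(range(104, 109)) + list(range(150, 164)) + [220]
    keep = np.ones(cube.shape[-1], dtype=bool)
    keep[np.array(removed) - 1] = False  # convert 1-based band numbers to 0-based indices
    return cube[..., keep]

# Placeholder cube with the dimensions stated for the dataset.
cube = np.zeros((145, 145, 220), dtype=np.float32)
reduced = remove_noisy_bands(cube)
print(reduced.shape)  # (145, 145, 200)
```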

S2 File. The original Indian Pines ground truths consist of 16 classes.

The ground truth classes and the number of samples per class (class name-number of samples) are as follows: “Alfalfa-46”, “Corn Notill-1428”, “Corn Mintill-830”, “Corn-237”, “Grass Pasture-483”, “Grass Trees-730”, “Grass Pasture Mowed-28”, “Hay Windrowed-478”, “Oats-20”, “Soybean Notill-972”, “Soybean Mintill-2455”, “Soybean Clean-593”, “Wheat-205”, “Woods-1265”, “Buildings Grass Trees Drives-386” and “Stone Steel Towers-93”. The ground truths are freely available at [47] [48].

S3 File. This file contains MATLAB code that reshapes the original Indian Pines dataset and ground truths and writes them to text files, as required by the journal.
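For readers without MATLAB, a comparable conversion can be sketched in Python. This is not the S3 code: the layout (one pixel per row, one band per column), the comma delimiter, the file names, and the function name `cube_to_text` are all assumptions made for illustration. Note also that NumPy flattens in row-major order, whereas MATLAB's `reshape` is column-major, so the pixel ordering may differ from the S3 output.

```python
import os
import tempfile
import numpy as np

def cube_to_text(cube, labels, data_path, labels_path):
    """Write a (rows, cols, bands) cube and a (rows, cols) label map to text files."""
    rows, cols, bands = cube.shape
    pixels = cube.reshape(rows * cols, bands)  # one pixel spectrum per line
    np.savetxt(data_path, pixels, fmt="%g", delimiter=",")
    np.savetxt(labels_path, labels.reshape(-1), fmt="%d")  # one class label per line
    return pixels.shape

# Small synthetic example standing in for the real data.
cube = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
labels = np.array([[0, 1, 2], [3, 0, 1]])
out_dir = tempfile.mkdtemp()
data_path = os.path.join(out_dir, "pixels.txt")
labels_path = os.path.join(out_dir, "labels.txt")
cube_to_text(cube, labels, data_path, labels_path)

# Reading the text file back recovers the original cube.
restored = np.loadtxt(data_path, delimiter=",").reshape(cube.shape)
```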

Repository Name




File Formats


Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Data Availability