1 Introduction
Feature selection has become a very active research field in machine learning. Used in both supervised and unsupervised learning, it is an indispensable tool for reducing the dimensionality of datasets and consequently mitigating the Hughes phenomenon.
This phenomenon arises when the number of features is large relative to the number of available samples. For this reason, feature selection is widely used in many domains, such as gene selection in DNA microarray datasets, band selection in hyperspectral images, and feature selection in image classification and recognition [13, 1, 7].
In supervised classification, it is crucial to select features carefully in order to achieve high classification accuracy and to construct a good classification model. Feature selection attempts to keep the relevant and informative features while reducing the computational complexity [2, 12, 4].
Feature selection approaches fall into three classes: wrapper, embedded and filter. Many such approaches have been proposed for various applications.
In [10], Medjahed et al. proposed a feature selection approach based on SVM-RFE and the Binary Dragonfly algorithm. The objective function is a combination of three classifiers (SVM-LS, C-SVM and ν-SVM).
Their work was applied to DNA microarray data. Gold et al. [5] proposed a Bayesian view of SVM classifiers that identifies irrelevant features by adjusting parameter values.
In [6], performance is measured through the accuracy rate of the classifier. Medjahed et al. [8] proposed two feature selection approaches based on Threshold Accepting and Microcanonical Annealing.
Both approaches were tested on several medical datasets, and the experimental results showed their effectiveness.
In [9], the authors developed a new objective function with two terms: the first is the classification accuracy rate, and the second is the Jeffries-Matusita distance.
The proposed objective function is optimized using the Grey Wolf Optimizer. Their work aims to select the relevant bands in hyperspectral images that provide a high classification accuracy rate while keeping the classes well separated. Minimum Noise Band Selection is a band selection approach proposed in [14]. It assesses the quality of each band based on low correlation and high signal-to-noise ratio (SNR). Search efficiency can be improved by sequential backward selection.
A new band selection framework was developed in [15]. By combining an evolutionary strategy with groupwise selection strategies, the proposed approach reduces the computational burden.
Progressive Band Selection (PBS), a new band selection method, was developed in [3]. PBS differs from traditional band selection: PBS adapts the number of selected bands, p, to the various endmembers used for spectral unmixing, whereas traditional band selection fixes p at the same constant value for all endmembers.
In this paper, a new approach to the band selection problem is proposed. It is based on Simulated Annealing, a metaheuristic widely used in the literature, combined with a new objective function composed of two terms.
Three hyperspectral images widely used for testing image classification, namely the Salinas, Pavia University and Indian Pines scenes, are used to evaluate the performance of the proposed approach.
The remainder of this paper is structured as follows: Section 2 describes the proposed approach, Section 3 presents and discusses the experimental results, and Section 4 draws the conclusion and outlines some perspectives.
2 The Proposed Approach
Typically, each feature selection approach has an objective function, a search strategy, a generator of subset candidates, and a stop criterion.
In our study, the band selection problem has the following mathematical form. Let’s:
where
2.1 The Objective Function
In feature selection, the objective function computes a quality measure for each generated subset. It can take many forms. Generally, it is a filter function if it uses a statistical measure that is independent of the classifier, and a wrapper function if it is computed from the output of a classifier.
In this study, we propose a new objective function based on two terms. The first term is based on the bagging approach.
The advantage of bagging is that it reduces overfitting and improves classification and generalization. The basic idea is to generate several decision trees, each built on a different subset of predictors, and to combine their results. The proposed score is computed as follows:
where,
In other terms,
where,
where,
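The exact formula of the score is not reproduced above, so the following is only a minimal sketch of how a two-term objective of this kind could look, assuming NumPy and scikit-learn are available. The function name `band_subset_score`, the out-of-bag accuracy estimate, the mean absolute correlation as the second term, and the `weight` parameter are all assumptions made for illustration, not the authors' definition.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier


def band_subset_score(X, y, mask, n_trees=100, weight=0.5, seed=0):
    """Hypothetical two-term score for a band subset (higher is better).

    X: (n_samples, n_bands) pixel spectra, y: class labels,
    mask: boolean vector marking the selected bands.
    """
    selected = np.flatnonzero(mask)
    if selected.size == 0:
        return 0.0  # an empty subset cannot be classified

    # Term 1: out-of-bag accuracy of bagged decision trees
    # (the default base estimator of BaggingClassifier is a decision tree).
    bag = BaggingClassifier(n_estimators=n_trees, oob_score=True, random_state=seed)
    bag.fit(X[:, selected], y)
    accuracy = bag.oob_score_

    # Term 2: mean absolute pairwise correlation between the selected bands.
    if selected.size > 1:
        corr = np.abs(np.corrcoef(X[:, selected], rowvar=False))
        upper = corr[np.triu_indices(selected.size, k=1)]
        redundancy = float(upper.mean())
    else:
        redundancy = 0.0

    # Assumed combination: reward accuracy, penalise redundant bands.
    return weight * accuracy + (1.0 - weight) * (1.0 - redundancy)
```

Under these assumptions, a subset made of two identical bands receives no credit from the second term, while a subset of weakly correlated, informative bands is rewarded by both terms.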
2.2 Simulated Annealing Algorithm
The second important component of the feature selection process is the search strategy. In our approach, we use the Simulated Annealing algorithm, a metaheuristic that is widely used in image classification and has shown good performance. Simulated Annealing is an optimization approach for approximating the global optimum in a large search space.
The algorithm is inspired by annealing in metallurgy, in which a metal is heated to a high temperature and then cooled slowly.
The algorithm starts from an arbitrary solution and, through a sequence of transitions, reaches a final state that is taken as the optimal solution:
where
The algorithm is outlined in Algorithm 1. Some parameters of Simulated Annealing must be set carefully: the initial temperature, the cooling schedule and the stop criterion.
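As an illustration of the search strategy, the sketch below applies a generic simulated annealing loop to boolean band masks. It is not a transcription of Algorithm 1: the single-band flip used to generate neighbours, the temperature floor `t_min` used as stop criterion, and the function name are assumptions. The defaults for `t0` and `alpha` follow the initial temperature and geometric cooling factor reported in Section 3.2.

```python
import math
import random


def simulated_annealing(score, n_bands, t0=1000.0, alpha=0.98, t_min=1e-3, seed=0):
    """Maximise score(mask) over boolean band masks with simulated annealing."""
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_bands)]  # arbitrary start
    current_score = score(current)
    best, best_score = current[:], current_score
    t = t0
    while t > t_min:  # stop criterion: temperature floor (an assumption)
        # Neighbour: flip one randomly chosen band in or out of the subset.
        candidate = current[:]
        i = rng.randrange(n_bands)
        candidate[i] = not candidate[i]
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Metropolis rule: always accept improvements,
        # accept worse moves with probability exp(delta / t).
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current[:], current_score
        t *= alpha  # geometric cooling
    return best, best_score
```

With these defaults the temperature falls from 1000 to below 0.001 after roughly 680 iterations, so the early phase behaves almost like a random walk and the late phase like a greedy local search. Any callable that maps a mask to a number can be passed as `score`.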
3 Experimental Results
This section presents the results obtained by applying our approach to three hyperspectral images. We evaluate the proposed approach in terms of overall accuracy, average accuracy and individual class accuracy.
3.1 Dataset
The Pavia University image was acquired by the ROSIS sensor over Pavia, northern Italy. It is composed of 103 bands in the spectral range
A total of
The Indian Pines image was acquired by the AVIRIS sensor over Indian Pines, north-western Indiana, USA. It consists of
The ground truth was composed of
The Salinas hyperspectral image was acquired by the AVIRIS sensor over Salinas Valley, California, USA. The area covered
This image is composed of
3.2 Results and Discussion
In this experiment, the data are split into three subsets: training, testing and validation. Simulated Annealing is configured as follows: the initial temperature is set to 1000, and a geometric cooling schedule is used with a cooling factor of 0.98. Table 1 shows the results obtained by the proposed approach.
Class | Pavia University (%) | Indian Pines (%) | Salinas (%) |
1 | 87.16 | 57.14 | 99.25 |
2 | 90.03 | 60.09 | 99.64 |
3 | 66.75 | 53.21 | 96.54 |
4 | 86.68 | 39.86 | 99.16 |
5 | 98.64 | 82.41 | 97.82 |
6 | 70.01 | 82.42 | 99.37 |
7 | 77.07 | 82.35 | 99.30 |
8 | 77.87 | 87.46 | 76.16 |
9 | 100.00 | 41.67 | 98.60 |
10 | - | 54.45 | 89.37 |
11 | - | 68.57 | 95.48 |
12 | - | 46.35 | 98.01 |
13 | - | 82.11 | 98.00 |
14 | - | 88.54 | 92.52 |
15 | - | 45.26 | 69.14 |
16 | - | 89.29 | 96.50 |
AA | 83.80 | 66.32 | 94.06 |
OA | 84.89 | 67.35 | 89.23 |
Table 1 reports the results obtained by the proposed approach on three widely used hyperspectral images: Salinas, Pavia University and Indian Pines.
The first column gives the class number, the second column the Pavia University results, the third column the Indian Pines results and the last column the Salinas results.
The results are reported in terms of average accuracy (AA), overall accuracy (OA) and individual class accuracy. The last two rows of the table give the AA and OA, respectively.
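For reference, the two summary measures can be computed from a confusion matrix as sketched below. OA is the fraction of all pixels classified correctly, while AA is the unweighted mean of the per-class accuracies, so the two differ when classes have unequal sizes. The function name and the toy confusion matrix are illustrative assumptions; only the nine Pavia University class accuracies are taken from Table 1.

```python
def overall_and_average_accuracy(confusion):
    """Return (OA, AA) in percent from a square confusion matrix.

    confusion[i][j] counts pixels of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))
    per_class = [100.0 * confusion[i][i] / sum(confusion[i]) for i in range(n_classes)]
    oa = 100.0 * correct / total
    aa = sum(per_class) / n_classes
    return oa, aa


# Toy 2-class example (made-up counts): 90/100 and 5/10 pixels correct.
oa, aa = overall_and_average_accuracy([[90, 10], [5, 5]])
print(round(oa, 2), round(aa, 2))  # 86.36 70.0

# AA recomputed from the nine Pavia University class accuracies in Table 1.
pavia = [87.16, 90.03, 66.75, 86.68, 98.64, 70.01, 77.07, 77.87, 100.00]
print(round(sum(pavia) / len(pavia), 2))  # 83.8
```

The second print agrees with the AA of 83.80 reported for Pavia University.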
The algorithm is also running
Until
The classification accuracy rate obtained is very high with an advantage to Salinas images which produced
Figures 4 and 5 illustrate the AA and OA obtained in each execution. Figure 6 shows the classification map for each image, produced by the proposed approach.
We compare the proposed approach with three other optimization methods: Tabu Search (TS), Threshold Accepting (TA) and the Genetic Algorithm (GA).
The comparison was carried out on the Pavia University image, using the same objective function for all methods. Table 2 reports the results obtained by each approach.
The analysis of the results shows that the values obtained by the proposed approach are slightly better than those of TS and TA, and almost the same as those obtained by GA.
4 Conclusion
In this paper, we presented a new objective function for band selection, designed to improve the classification accuracy rate.
The search is performed by Simulated Annealing with a geometric cooling schedule. The objective function is based on two terms: the first is a score measured with 100 decision trees (bagging), and the second is the correlation between the bands.
The performance of the proposed approach was tested on three hyperspectral images: Salinas, Pavia University and Indian Pines.
On the Pavia University image, the proposed approach performed slightly better than Tabu Search and Threshold Accepting and comparably to the Genetic Algorithm.
The overall accuracies obtained were 84.89% on Pavia University, 67.35% on Indian Pines and 89.23% on Salinas. In future work, we plan to improve the approach and the objective function so that they can be adapted to other types of datasets.