1. Introduction
The increasing world demand for natural ingredients for cosmetics, perfumes, and medicines also increases the need for essential oils. Essential oil commodities produced in Indonesia include patchouli, citronella, clove leaf oil, and cananga. Reflecting the growing importance of essential oil research in Indonesia, in 2020 the Ministry of Industry designated essential oil as one of the national research priorities for its use as an antioxidant and anti-aging material.
Among essential oil products, patchouli oil is the largest export commodity, accounting for 60% of Indonesia's essential oil exports. Indonesia is also the largest patchouli oil producer, supplying 85% of the international market (Wahyudi & Ermiati, 2020). Given the growing demand for patchouli oil and the shrinking area available for planting patchouli, efforts are needed to increase the productivity of patchouli cultivation.
Knowing the patchouli variety is important because not all patchouli plants yield oil of good quality. Besides indicating oil quality, the variety also indicates the plant's resistance to pests and diseases, so that preventive measures can be taken during cultivation. Identification of patchouli varieties is usually done by experts, but experts in this field are few, so a tool for identifying patchouli varieties is needed.
In the identification process, the system requires samples of patchouli varieties as input data. However, the number of patchouli samples obtained from the field is imbalanced: the number of samples in one class is considerably higher than in the others. Classification on imbalanced classes may ignore the classes with fewer samples, which can significantly reduce the accuracy of the classification method (Yıldırım, 2016). Furthermore, the features of the minority class are usually difficult to identify (Jeatrakul et al., 2010). One way to handle imbalanced classes is the Synthetic Minority Over-sampling Technique (SMOTE), which generates synthetic minority samples until the minority class is as large as the majority class.
Previous research classified patchouli varieties using texture features extracted with the wavelet method, achieving an accuracy of 83.33% (Dewi et al., 2016). Another study used morphological features, local binary pattern texture features, and convex hulls, with an accuracy of 77.5% (Dewi et al., 2016). These two studies suggest that wavelet feature extraction is more effective than local binary pattern texture features, and that the convex hull feature can increase accuracy in the recognition process (Dewi et al., 2016). Both studies also mention the need to choose a combination of dominant derivative features to improve the accuracy of the recognition process.
In this study, the patchouli identification model is constructed using a combination of leaf morphological, texture, and shape features. The texture features are obtained using the wavelet transform and the shape features are obtained using the convex hull. These three feature groups can be extracted into several sub-features; however, not all of the sub-features have a dominant influence in the process of recognizing varieties. Thus, the optimal selection of features largely determines the success of the classification of patchouli varieties. The effectiveness of the input features is tested using three machine learning methods: a feedforward neural network trained with the backpropagation algorithm, learning vector quantization (LVQ), and the extreme learning machine (ELM).
Patchouli leaf image samples are taken from several varieties, namely Sidikalang, Diploid, Tetraploid, and Patchoulina. The image data are then used as input for the feature extraction process. The result of feature extraction is stored as a vector that serves as input to the classification process using the proposed methods.
2. Related works
Plant identification using leaf image processing has been done in several studies. Principal component analysis (PCA) and elliptic Fourier analysis have been used to extract leaf shapes (Laga et al., 2014; Neto et al., 2006). Furthermore, several studies have identified leaves by combining deep belief networks and multiple features (Liu & Kan, 2016), the pattern of leaf veins (Zhang et al., 2016), and leaf texture (Pahikkala et al., 2015).
Plant leaf identification has also been carried out using three features: shape features extracted using the scale-invariant feature transform (SIFT) method, color features extracted using the color moment method, and texture features extracted using segmentation-based fractal texture analysis (SFTA). The study reports an accuracy of 94% (Jamil et al., 2015). The combination of the three features (morphology, texture, and shape) with PNN as the classification method has also been used to identify medicinal plants, with a maximum accuracy of 74.67% (Herdiyeni et al., 2013).
Wavelet-based extraction of spectral features from hyperspectral images yields promising results for the detection of copper deposits (Abdolmaleki et al., 2017). Bakhshipour et al. (2017) show that feature extraction using wavelets may improve the performance of weed detection. Another study also demonstrates that using wavelets in feature selection can improve recognition performance (Arora et al., 2012).
Morphological features have also been used in plant identification research (Arora et al., 2012). For example, Wu et al. (2007) identified leaf images using morphological features with a probabilistic neural network (PNN) classifier, obtaining an average accuracy of 90.3%. Shape features based on convex hull extraction were used by Lee & Hong (2013), who combined leaf vein and shape features using the fast Fourier transform and convex hulls and obtained an accuracy of 97.19%.
3. Synthetic minority over-sampling technique (SMOTE)
Data imbalance happens when the number of objects in a certain class is significantly higher than in other classes. Classes with a greater number of objects are labeled major classes, while the others are labeled minor classes. Classification methods that do not treat data imbalance may be overwhelmed by the major classes and ignore the minor classes (Chawla et al., 2002).
The SMOTE method, proposed in (Chawla et al., 2002), offers a solution to imbalanced data with a principle different from earlier oversampling approaches. Whereas plain oversampling duplicates randomly chosen observations, SMOTE generates new artificial data for the minor classes until they have an amount of data equivalent to the major classes. The artificial (synthetic) data are generated using k-nearest neighbors, with the number of neighbors selected for ease of implementation. Generating artificial data differs between numerical and categorical features: numerical features are interpolated between a sample and its neighbors measured by Euclidean distance, while categorical features take the mode value of the neighbors.
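To make the interpolation concrete, the following minimal sketch (not the authors' implementation) generates synthetic numerical minority samples with k-nearest neighbors; the function name, the uniform interpolation factor, and the example sizes are illustrative assumptions.

```python
import numpy as np

def smote_oversample(X_min, n_synthetic, k=5, seed=None):
    """Minimal SMOTE-style oversampling for numerical features.

    For each synthetic sample: pick a random minority sample, pick one of
    its k nearest minority neighbors (Euclidean distance), and interpolate
    at a random position between the two (Chawla et al., 2002).
    """
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)   # distances to sample i
        neighbors = np.argsort(d)[1:k + 1]             # skip the sample itself
        j = rng.choice(neighbors)
        gap = rng.random()                             # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.asarray(synthetic)

# Example: raise a 7-sample minority class to a 28-sample majority size.
X_min = np.random.default_rng(0).random((7, 14))       # 14 features, as in this study
X_balanced = np.vstack([X_min, smote_oversample(X_min, 28 - 7, seed=1)])
```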
4. Feature extraction
4.1. Wavelet
A wavelet is a short wave whose energy is concentrated in a short time interval. The wavelet transform is a development of the Fourier transform, which works on periodic waves (Sanjeevi et al., 2001). The wavelet transform provides time and frequency information simultaneously and performs well when analyzing non-periodic signals. Therefore, the wavelet transform is appropriate for signal processing and digital image processing (Feng et al., 2011). Applied to a sequence of discrete signals such as an image, it is called the 2D discrete wavelet transform (DWT-2D) (Yang et al., 2014).
The calculation of DWT-2D is carried out by applying low-pass and high-pass filters to the pixel values of the image. The low-pass and high-pass filters are labelled h and g, as shown in Figure 1. The DWT-2D decomposition of the image is done in 3 levels. At every level, the high-pass filters produce the detailed information of the image, while the low-pass filters produce a rough approximation of the image (Shahbahrami, 2012). The DWT-2D transformation itself uses Eq. (1):
$$X_{WT}(i,j) = \frac{1}{\sqrt{|j|}} \int x(t)\, \psi^{*}\!\left(\frac{t-i}{j}\right) dt \qquad (1)$$

where:
X_WT = wavelet transform function,
ψ* = complex conjugate of the mother wavelet function,
x(t) = the signal being transformed,
t = time,
i, j = translation and scale indices (pixel coordinates).
The reverse transformation of DWT-2D is formulated in Eq. (2):
$$x(t) = \frac{1}{C_{\psi}} \iint X_{WT}(\tau, s)\, \psi\!\left(\frac{t-\tau}{s}\right) \frac{d\tau\, ds}{s^{2}} \qquad (2)$$

where:
x(t) = result of the reverse transformation,
ψ = mother wavelet function,
τ = time shift (translation),
s = scale,
C_ψ = admissibility constant of the mother wavelet.
Filtering with the low-pass and high-pass filters yields 4 image sub-bands, as shown in Figure 2. The four sub-bands are labelled LL (low-low), LH (low-high), HL (high-low), and HH (high-high). The LL sub-band is the result of low-pass filtering in both the horizontal and vertical directions, while HH is the result of high-pass filtering in both directions. LH is the result of low-pass filtering in the horizontal direction and high-pass filtering in the vertical direction, while HL is the result of high-pass filtering in the horizontal direction and low-pass filtering in the vertical direction.
The wavelet transform may be implemented using several algorithms with different wavelet coefficients. One popular choice is the Daubechies wavelet transform. Daubechies wavelets have computation times comparable to other wavelets and can easily handle the edges of images (Singh & Khare, 2014). The wavelet features use energy normalized by the L1 norm and by the L2 norm, computed using Eq. (3). The features are taken from the high-frequency (HH) sub-bands, with feature extraction applied at each level of decomposition. In this study we use 6 wavelet features, which are presented in Table 1.
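As an illustration of these six features, the sketch below uses the PyWavelets library to compute the L1- and L2-normalized energies of the HH (diagonal detail) sub-band at each of the three decomposition levels; the db1 filter choice and the exact normalization by the number of coefficients are assumptions, since Eq. (3) and Table 1 are not reproduced here.

```python
import numpy as np
import pywt  # PyWavelets

def wavelet_energy_features(gray_img, wavelet="db1", levels=3):
    """Six texture features: L1 and L2 energies of the HH sub-band per level.

    pywt.wavedec2 returns [cA_n, (cH_n, cV_n, cD_n), ..., (cH_1, cV_1, cD_1)],
    where cD is the diagonal (HH) detail sub-band of each level.
    """
    coeffs = pywt.wavedec2(np.asarray(gray_img, dtype=float), wavelet, level=levels)
    features = []
    for _, _, cD in coeffs[1:]:                        # one detail tuple per level
        n = cD.size
        features.append(np.sum(np.abs(cD)) / n)        # L1-normalized energy
        features.append(np.sqrt(np.sum(cD ** 2)) / n)  # L2-normalized energy
    return np.array(features)                          # length 6 for 3 levels
```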
4.2. Leaf morphology
Morphological characteristics are divided into two categories, namely basic characteristics and derivative characteristics. The basic leaf features include diameter (D), physiological length (Lp), physiological width (Wp), area (A), and perimeter (P). The diameter is the longest distance between two points on the leaf boundary. Physiological length is measured as the distance between the two leaf base points. Physiological width is the length of the longest line orthogonal to the physiological length line. The area is the number of pixels inside the edge of the leaf, while the perimeter is the number of pixels on the edge of the leaf (Wu et al., 2007).
From these five basic characteristics, six derivative morphological features are obtained, each calculated as a ratio between the basic leaf characteristics (Wu et al., 2007); a computational sketch follows the list:
1. Aspect ratio: the ratio of physiological length (Lp) to physiological width (Wp).
2. Form factor: describes how round the leaf shape is.
3. Rectangularity: describes how rectangular the leaf region is.
4. Narrow factor: the ratio of diameter (D) to physiological length (Lp). This feature indicates whether the leaf blade is symmetric or asymmetric: for a symmetric blade the narrow factor equals 1, while for an asymmetric blade it is greater than 1.
5. Perimeter ratio of diameter: the ratio of perimeter (P) to diameter (D), measuring how oval the leaf is.
6. Perimeter ratio of physiological length and width: the ratio of perimeter (P) to the sum of physiological length and width (Lp + Wp).
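The six ratios can be computed directly from the five basic measurements, as in the sketch below; the formulas follow the definitions in Wu et al. (2007) (e.g. the 4π constant in the form factor), which this paper cites but does not restate.

```python
import math

def derivative_features(D, Lp, Wp, A, P):
    """Six derivative morphological features from the five basic ones:
    diameter D, physiological length Lp and width Wp, area A, perimeter P."""
    return {
        "aspect_ratio": Lp / Wp,                  # elongation of the leaf
        "form_factor": 4 * math.pi * A / P ** 2,  # 1.0 for a perfectly round leaf
        "rectangularity": Lp * Wp / A,            # fit of the bounding rectangle
        "narrow_factor": D / Lp,                  # 1 for a symmetric blade
        "perimeter_ratio_diameter": P / D,        # how oval the leaf is
        "perimeter_ratio_length_width": P / (Lp + Wp),
    }
```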
4.3. Convex Hull
Binary images obtained from preprocessing are used for shape extraction using the convex hull. The shape characteristic is calculated from the difference between the area of the convex hull image and the original leaf image area (Lee & Hong, 2013). Figure 3 shows an example of a convex hull applied to a binary leaf image.
The convexity and solidity characteristics make use of the convex hull, or convex set, defined as the smallest convex polygon that surrounds an object. The convexity value is the ratio of the perimeter length of the convex hull surrounding the object to the object's own perimeter length, calculated using Eq. (4).
Solidity is measured by the ratio of the object's area to the area of its convex hull, using the pixels that make up the convex hull. The ratio is formulated in Eq. (5).
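A sketch of both measures using OpenCV is shown below; it assumes the binary mask produced by preprocessing, and implements convexity as the hull-to-contour perimeter ratio and solidity as the contour-to-hull area ratio described above (Eq. (4) and Eq. (5) are not reproduced in this text).

```python
import cv2

def convexity_solidity(binary_mask):
    """Convexity and solidity of the largest object in a binary leaf mask
    (uint8 image, foreground pixels = 255)."""
    contours, _ = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    leaf = max(contours, key=cv2.contourArea)          # largest blob = the leaf
    hull = cv2.convexHull(leaf)
    convexity = cv2.arcLength(hull, True) / cv2.arcLength(leaf, True)
    solidity = cv2.contourArea(leaf) / cv2.contourArea(hull)
    return convexity, solidity
```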
5. Classification algorithms
5.1. Learning vector quantization (LVQ)
The learning vector quantization (LVQ) network is a supervised artificial neural network developed by Teuvo Kohonen in the mid-1980s (Kohonen, 1995). The network has an input layer, an LVQ layer, and an output layer. The output layer contains one processing element for each class, while the LVQ layer contains several processing elements per class. LVQ has been successfully applied to classification problems (Arifando et al., 2019).
The LVQ architecture consists of an input layer, a Kohonen layer (where input vectors compete to be assigned to a class based on proximity), and an output layer. The steps of the LVQ learning algorithm can be explained as follows (Degang et al., 2007).
1. Let x be an input vector from the training set and W_i be the i-th reference vector.
2. Determine the winner unit c in the competitive process through Eq. (8):

$$c = \arg\min_{i} \lVert x - W_i \rVert \qquad (8)$$

3. Adjust W_c using Eq. (9):

$$W_c(t+1) = W_c(t) \pm \alpha(t)\,[x(t) - W_c(t)] \qquad (9)$$

in which the sign is positive if x is classified correctly by the winner unit and negative if it is classified incorrectly, where α(t) is the corresponding learning rate with 0 < α(t) < 1, and where the reference vectors of all non-winning units are left unchanged, W_i(t+1) = W_i(t) for i ≠ c.
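A minimal LVQ1 training loop corresponding to Eq. (8) and Eq. (9) could look as follows; initializing one reference vector per class from the first training sample of that class, and the mild learning-rate decay, are assumptions of the sketch rather than details given in the paper.

```python
import numpy as np

def train_lvq1(X, y, alpha=0.1, epochs=100, seed=None):
    """LVQ1 with one reference vector per class: the winner moves toward a
    correctly classified input and away from a misclassified one."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    W = np.array([X[y == c][0] for c in classes], dtype=float)
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            c = np.argmin(np.linalg.norm(W - X[i], axis=1))  # winner unit, Eq. (8)
            sign = 1.0 if classes[c] == y[i] else -1.0       # update rule, Eq. (9)
            W[c] += sign * alpha * (X[i] - W[c])
        alpha *= 0.95                                        # decay the learning rate
    return W, classes

def predict_lvq(W, classes, X):
    """Assign each row of X to the class of its nearest reference vector."""
    return classes[np.argmin(np.linalg.norm(W[None] - X[:, None], axis=2), axis=1)]
```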
5.2. Extreme learning machine (ELM)
The extreme learning machine (ELM) is an artificial neural network algorithm often used for classification, regression, clustering, and feature learning, in which the parameters of one or more hidden layers are computed in a single iteration (Tang et al., 2016). The advantage of the ELM method is that it can be thousands of times faster than neural network algorithms that use backpropagation learning (Ding et al., 2015). ELM has been applied to various classification problems (Alfiyatin et al., 2019). The steps of the ELM algorithm are detailed as follows (Samet & Miri, 2012):
1. Suppose there are N distinct training samples (x_j, t_j), and a network with Ñ hidden nodes and activation function g(x):

$$\sum_{i=1}^{\tilde{N}} \beta_i\, g(w_i \cdot x_j + b_i) = o_j, \quad j = 1, \dots, N$$

2. In the equation, the statement that the network can approximate the N samples with zero error means that there exist β_i, w_i, and b_i such that

$$\sum_{i=1}^{\tilde{N}} \beta_i\, g(w_i \cdot x_j + b_i) = t_j, \quad j = 1, \dots, N$$

3. By using the following substitutions, the N equations above can be written compactly as

$$H\beta = T$$

where H is the hidden-layer output matrix with entries H_{ji} = g(w_i · x_j + b_i), β is the matrix of output weights, and T is the matrix of targets. Because the input weights w_i and biases b_i are assigned randomly, training reduces to computing β = H†T, where H† is the Moore–Penrose generalized inverse of H.
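Because the input weights are random and the output weights come from a pseudoinverse, the whole training fits in a few lines, as in the sketch below; the sigmoid activation and uniform weight initialization are assumptions, and the 50 hidden nodes mirror the tuning in Section 6.2.

```python
import numpy as np

def train_elm(X, T, n_hidden=50, seed=None):
    """ELM training: random hidden parameters, analytic output weights.
    X: (N, d) inputs; T: (N, C) one-hot targets. Returns (w, b, beta)."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-1, 1, (X.shape[1], n_hidden))   # random input weights w_i
    b = rng.uniform(-1, 1, n_hidden)                 # random biases b_i
    H = 1.0 / (1.0 + np.exp(-(X @ w + b)))           # hidden-layer output matrix H
    beta = np.linalg.pinv(H) @ T                     # beta = H^+ T (Moore-Penrose)
    return w, b, beta

def predict_elm(w, b, beta, X):
    """Predicted class index for each row of X."""
    H = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return np.argmax(H @ beta, axis=1)
```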
5.3. Backpropagation algorithm
Backpropagation is a supervised learning algorithm for training an artificial neural network composed of multiple layers of neurons. It adjusts the network weights in order to minimize the error between the predicted output and the target output (Rahmi et al., 2016).
The steps in the backpropagation algorithm can be explained as follows (Liu et al., 2016). For each neuron j, let n denote the number of neurons in the previous layer. Let I_j be the net input of neuron j and O_j its output:

$$I_j = \sum_{i=1}^{n} w_{ij} O_i + \theta_j, \qquad O_j = \frac{1}{1 + e^{-I_j}}$$

where w_{ij} is the weight from neuron i to neuron j and θ_j is the bias of neuron j. If the neuron j is in the output layer, the network starts the backpropagation phase. Let t_j be the encoded target output. The algorithm calculates the output error Err_j for the neuron j as

$$Err_j = O_j (1 - O_j)(t_j - O_j)$$

For a hidden neuron j, the error is propagated back from the neurons k of the following layer:

$$Err_j = O_j (1 - O_j) \sum_{k} Err_k\, w_{jk}$$

Let the weights and biases then be updated with learning rate η:

$$w_{ij} = w_{ij} + \eta\, Err_j\, O_i, \qquad \theta_j = \theta_j + \eta\, Err_j$$
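A compact one-hidden-layer implementation of these update rules is sketched below; the layer size, learning rate, and epoch count echo the tuning in Section 6.3 but are otherwise illustrative, and the stochastic (per-sample) update order is an assumption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_backprop(X, T, n_hidden=15, eta=0.01, epochs=150, seed=None):
    """One-hidden-layer network trained with the Err_j update rules above.
    X: (N, d) inputs; T: (N, C) one-hot encoded targets."""
    rng = np.random.default_rng(seed)
    d, C = X.shape[1], T.shape[1]
    W1, b1 = rng.normal(0, 0.1, (d, n_hidden)), np.zeros(n_hidden)
    W2, b2 = rng.normal(0, 0.1, (n_hidden, C)), np.zeros(C)
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            o1 = sigmoid(X[i] @ W1 + b1)            # hidden outputs O_j
            o2 = sigmoid(o1 @ W2 + b2)              # network outputs
            err2 = o2 * (1 - o2) * (T[i] - o2)      # output-layer Err_j
            err1 = o1 * (1 - o1) * (W2 @ err2)      # hidden-layer Err_j
            W2 += eta * np.outer(o1, err2); b2 += eta * err2
            W1 += eta * np.outer(X[i], err1); b1 += eta * err1
    return W1, b1, W2, b2
```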
6. Experimental result
This section discusses the testing of the LVQ, ELM, and backpropagation methods in classifying the input (the features extracted in the previous stage) into 4 classes of patchouli varieties, namely Diploid, Tetraploid, Patchoulina, and Sidikalang. The data consist of 91 samples divided into two parts: 63 training samples and 28 test samples.
Each test scenario is run 10 times and the average result is calculated. The tests aim to find the parameter values that give the highest accuracy, which are then used in the next testing phase.
6.1. LVQ testing
In this stage, we test the best values of the learning rate (α) and the number of epochs for LVQ. Testing is done with learning rates (α) of 0.01, 0.02, 0.03, 0.04, 0.05 up to 0.9 and epochs of 10, 20, 30, 40, 50 up to 200. The learning rate (α) and epoch test results for LVQ are provided in Table 2 and Table 3.
Table 2. LVQ accuracy and training time for varying learning rate (α), with epoch = 20.

| learning rate (α) | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy (%) | 82.86 | 84.64 | 83.21 | 83.93 | 84.29 | 83.93 | 82.50 | 81.43 | 84.64 | 85.00 | 83.57 | 83.93 | 76.07 | 73.93 | 70.71 | 62.14 | 60.36 | 60.71 |
| Time (s) | 0.394 | 0.406 | 0.407 | 0.407 | 0.415 | 0.417 | 0.419 | 0.421 | 0.425 | 0.425 | 0.427 | 0.429 | 0.435 | 0.441 | 0.446 | 0.447 | 0.449 | 0.453 |
Table 3. LVQ accuracy and training time for varying number of epochs, with learning rate α = 0.1.

| epoch | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 | 120 | 140 | 160 | 180 | 200 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy (%) | 84.64 | 85.00 | 84.64 | 84.64 | 84.29 | 85.71 | 85.71 | 85.71 | 83.93 | 86.07 | 83.93 | 85.36 | 85.00 | 85.36 | 84.64 |
| Time (s) | 0.41 | 0.43 | 0.45 | 0.49 | 0.74 | 1.05 | 1.52 | 2.01 | 2.43 | 2.99 | 3.62 | 4.49 | 5.50 | 6.52 | 8.23 |
Testing of the learning rate (α) is done with an initial epoch of 20. As shown in Table 2, accuracy increases as the learning rate grows from 0.01 to 0.1. A larger learning rate enables LVQ to update its weights more quickly and produce better results. However, LVQ performance drops for learning rates greater than 0.1: too high a learning rate can make the training process unstable, since the weights may be updated too aggressively and overshoot their optimum values. Table 2 also shows that there is no significant difference in training time for different learning rates.
Testing of the number of epochs is done using the best learning rate of 0.1. As shown in Table 3, accuracy increases as the number of epochs grows from 10 to 100, since more epochs give LVQ more iterations in which to refine its weights. However, for more than 100 epochs there is no significant improvement in accuracy. We therefore conclude that the best number of epochs is 100, as larger values only increase the training time.
6.2. ELM testing
The scenario of this test is to find the parameter values that give the highest accuracy for use in the next test phase. Parameter testing is performed to find the best number of hidden neurons. Testing is done with 5, 10, 15, 20 up to 100 hidden neurons. The results for the ELM hidden layer size are shown in Table 4.
Table 4. ELM accuracy and training time for a varying number of hidden neurons.

| Number of hidden neurons | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 |
|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy (%) | 77.4 | 79.3 | 79.9 | 80 | 80.1 | 80.3 | 80.3 | 80.3 | 80.3 | 80.4 |
| Time (s) | 0.152 | 0.152 | 0.153 | 0.156 | 0.161 | 0.161 | 0.163 | 0.165 | 0.167 | 0.173 |
Table 4 shows that 50 hidden neurons give the maximum accuracy of 80.43%. The size of the hidden layer influences the level of accuracy: more hidden neurons generally yield better output, at the cost of a longer training time. We increased the number of hidden neurons up to 100 and found no further improvement in accuracy.
6.3. Backpropagation algorithm testing
The best parameter values for backpropagation are determined starting from 25 hidden neurons and 100 epochs, with the learning rate (α) varied from 0.002 to 0.050. Table 5 shows the result of the learning rate testing.
Table 5. Backpropagation accuracy for varying learning rate (α).

| learning rate (α) | 0.002 | 0.004 | 0.006 | 0.008 | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 |
|---|---|---|---|---|---|---|---|---|---|
| accuracy (%) | 79.64 | 80.7 | 82.0 | 84.5 | 86.4 | 86.4 | 86.4 | 86.4 | 86.4 |
Based on Table 5, a learning rate (α) of 0.01 gives the maximum accuracy of 86.424%. The pattern is similar to the learning rate testing for LVQ: a larger learning rate enables backpropagation to update its weights more quickly and produce better results, but accuracy no longer improves for learning rates greater than 0.01.
Table 6 shows the result of testing the number of hidden neurons, using the best learning rate (α) obtained in the previous test. A hidden layer of 15 neurons gives the maximum accuracy of 89.637%. A larger number of hidden neurons does not significantly increase the level of accuracy.
Table 6. Backpropagation accuracy for a varying number of hidden neurons.

| # hidden neurons | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 |
|---|---|---|---|---|---|---|---|---|---|---|
| accuracy (%) | 83.9 | 85.7 | 89.6 | 88.2 | 87.1 | 84.3 | 88.2 | 86.8 | 87.1 | 88.6 |
Epoch testing is carried out using the best learning rate (α) and number of hidden neurons obtained in the previous tests. Accuracy increases as the number of epochs grows from 10 to 150, since more epochs allow backpropagation to further refine its weights in each iteration; for more than 150 epochs there is no significant improvement in accuracy.
6.4. Comparison of input feature and SMOTE combination
At this stage, the learning vector quantization (LVQ), extreme learning machine (ELM), and backpropagation methods are tested using various combinations of extracted features. There are 3 feature groups, namely texture features extracted using wavelet texture analysis, morphological features, and shape features extracted using the convex hull. The combinations used are:
1. Morphology (M)
2. Wavelet (W)
3. Morphology + Wavelet (M+W)
4. Morphology + Wavelet + Convex Hull (M+W+C)
An input vector of length 14 is used to store the 6 morphological features representing the basic leaf characteristics, the 6 wavelet features, and the 2 convex hull features (convexity and solidity).
The tests use the best parameters obtained previously to determine the effect of the feature combinations on accuracy. The results of the feature tests are presented in Table 7 and Table 8.
Table 7. Accuracy for each feature combination without SMOTE.

| Method | M | W | M + W | M + W + C |
|---|---|---|---|---|
| Backpro | 78.21% | 81.394% | 91.77% | 92.49% |
| LVQ | 77.49% | 76.42% | 81.06% | 87.85% |
| ELM | 20.36% | 19.29% | 28.93% | 37.5% |
| Average | 58.68% | 59.03% | 67.25% | 72.61% |
Table 8. Accuracy for each feature combination with SMOTE.

| Method | M | W | M + W | M + W + C |
|---|---|---|---|---|
| Backpro | 84.42% | 92.99% | 94.85% | 94.71% |
| LVQ | 79.13% | 84.28% | 89.28% | 90.56% |
| ELM | 75% | 81.14% | 79.43% | 80.43% |
| Average | 79.52% | 86.13% | 87.85% | 88.56% |
Table 7 and Table 8 show that the best overall combination of features across the methods is "Morphology + Wavelet + Convex Hull", with the highest accuracies reaching 92.493% without SMOTE and 94.711% with SMOTE. In the non-SMOTE feature testing, the good accuracy trend only occurs for the backpropagation algorithm and LVQ; ELM, by contrast, only reaches a best accuracy of 37.5%. In other words, ELM is unable to handle the classification problem on imbalanced data.
Adding features affects all methods, as can be seen from the increase in accuracy as features are added. From these results it can be concluded that the selection of feature combinations is very important for increasing accuracy: when many features are available, the relevant ones must be identified, while with a small number of features the gain in accuracy is less significant but the computational workload of the system is reduced.
7. Conclusions
The target of this research is to obtain the best accuracy in recognizing patchouli varieties using a combination of morphological, texture, and shape features with artificial neural network algorithms. The higher the accuracy, the better the recognition performance of the method used.
In this study, the amount of data used is only 91 samples and the classes are unbalanced. Future studies are expected to improve accuracy by increasing the amount of data and the number of classes so that the classification can be done in more detail. The system can also be developed further using different input features so that it keeps up with scientific knowledge that continues to grow.
Conflict of interest
The authors do not have any type of conflict of interest to declare.