1 Introduction
Individuals afflicted with diabetes mellitus (DM), irrespective of whether it manifests as type 1 or type 2, face a significantly elevated likelihood of developing diabetic retinopathy (DR) [1-2]. Those grappling with this condition may experience visual impairments or, in severe cases, complete blindness [3-4]. A prognostic study indicates that within a mere decade, the prevalence of this ailment is anticipated to double compared to patient counts from a decade prior [5].
Consequently, the incidence of vision impairment is poised to witness a twofold increase. Several factors contribute to the expeditious proliferation of DR cases [6-7]. The foremost factor is the onset age of this condition, which has been documented in DM patients as young as 20 years old [8].
Another critical concern in the context of DR is the dearth of expert technicians that are adept at diagnosing the ailment [9-10]. In response to this challenge, diverse computer technologies have been harnessed to facilitate prompt detection and informed medical decision-making for patients.
Researchers have leveraged advanced tools like deep learning, particularly neural networks, to expedite the detection process not only for DR but also for other diseases, concurrently diminishing the margin of error [11-14].
Techniques such as convolutional neural networks (CNNs) and deep learning methodologies have proven highly efficacious in DR detection, with optimal outcomes realized through their application [15-17]. Many researchers have expedited their work by utilizing pre-trained CNNs, yet bespoke architectures tailored to specific issues have exhibited enhanced detection accuracy, incorporating techniques such as optimization algorithms and fuzzy logic [18].
This study commenced by meticulously selecting the foundational CNN model and the most appropriate pre-processing technique for the designated database. Exhaustive research ensured the identification of the pre-processing method that exhibited superior performance in prior studies.
Similarly, the base CNN model was chosen based on the favorable outcomes discerned in the research. The subsequent phase involved the integration of a fuzzy inference system to optimize the layers of the CNN architecture for DR detection. Despite the prevalent use of fuzzy logic in other studies for classifying diverse databases, its application in the detection and classification of DR remains relatively unexplored.
Section 2 of this paper elucidates the related work conducted by various authors. Section 3 provides comprehensive explanations of fundamental concepts, complemented by illustrative examples, to enhance comprehension of this study.
Section 4 delineates and expounds upon the methods employed in this research. Section 5 presents the results gleaned from the conducted experiments, and finally, Section 6 articulates the conclusions derived from the study.
2 Related Work
Increasing the precision of neural networks, whether for classification or prediction, has been a priority for experts in recent years [19-20]. As a result, many works have proposed different implementations or custom methods that pursue the same objective.
The work in [21] sought to modify the depth of the network, the size and number of convolutional filters, and the number of neurons in the hidden layers. It concluded that, depending on the data set used, the number of filters in the convolution layer may need to be increased or decreased, so increasing the number of filters will not necessarily improve the results.
Recall that a convolutional neural network contains the same kind of hidden layers as a traditional neural network. The work in [22] shows how modifying the number of neurons in each hidden layer changes the behavior of the network and can increase its precision.
To modify the hyperparameters of CNNs, optimization algorithms have been used to speed up the process, one of them being the genetic algorithm. This approach made possible prior work with the APTOS 2019 [23] database, in which a genetic algorithm produced a CNN model that improves the accuracy of diabetic retinopathy classification.
3 Basic Concepts
The previous section mentioned some terms that may be unfamiliar to readers who do not work with intelligent hybrid systems. This section therefore presents the background needed to understand the rest of the work.
3.1 Artificial Neural Networks
One of the most widely employed machine learning tools for disease detection is the supervised artificial neural network. This specific neural network variant enables expert technicians to train the network using a labeled database [24].
When utilizing images in the learning phase, it becomes imperative to specify pertinent information for accurate future classification. This underscores the necessity for the supervision of an expert technician in the realm of image topics.
However, there exists a type of neural network capable of performing this task with convolutional filters: convolutional neural networks [25]. The design of a convolutional neural network model closely resembles that of a feed-forward neural network, with the differentiating factor becoming evident after the input stage [26].
3.1.1. Convolutional Layer
The convolutional layer is the initial layer in the CNN architecture and facilitates the recognition of key characteristics within the input images [27].
Consequently, the network eliminates the requirement for an expert technician to apply preprocessing methods to the images. To achieve this, the convolutional layer necessitates a kernel to derive a new matrix.
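The kernel operation described above can be sketched in a few lines. This is a minimal illustration only: the 4x4 patch and the 3x3 vertical-edge kernel are hypothetical values, and a trained CNN learns its own kernels. As in most deep learning frameworks, the kernel is applied without flipping (cross-correlation), with stride 1 and no padding.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the new matrix."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Weighted sum of the window whose top-left corner is (r, c).
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical grayscale patch with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Hypothetical kernel that responds to dark-to-bright transitions from left to right.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[27, 27], [27, 27]]
```

The output matrix is smaller than the input (2x2 instead of 4x4) because the kernel is only placed where it fits entirely inside the image.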
3.1.2. ReLU Function
The Rectified Linear Unit Function (ReLU) plays a crucial role in Convolutional Neural Networks (CNNs). Activation functions are essential after each individual neuron, and one of the widely utilized functions for CNNs is the ReLU function [28].
ReLU passes every positive value through unchanged and sets every negative value to zero, which keeps the computation simple and thereby reduces the time required for experimentation.
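As a small illustration, ReLU can be written as a one-line function. The sample inputs below are arbitrary values chosen for the example, not activations from the models in this work.

```python
def relu(x):
    """Rectified Linear Unit: keep positive values, replace negatives with zero."""
    return max(0.0, x)


# Arbitrary sample pre-activation values.
activations = [relu(v) for v in [-2.5, -0.1, 0.0, 0.7, 3.0]]
print(activations)  # [0.0, 0.0, 0.0, 0.7, 3.0]
```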
3.1.3. MaxPooling Layer
The activation function serves to expedite the training time, although the image size remains constant, and not all pixels carry equal significance. To address this, a pooling method is employed, with the MaxPooling method being one of the most prevalent in the domain of convolutional neural networks [29].
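The following sketch shows how MaxPooling shrinks a feature map by keeping only the strongest response in each window. The 2x2 window with non-overlapping placement and the sample values are assumptions made for illustration; the pooling sizes of the models in this work may differ.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling: keep the largest value in each size x size window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [5, 4, 1, 1],
    [0, 2, 9, 6],
    [1, 1, 7, 8],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[5, 2], [2, 9]]
```

Each output value summarizes four input pixels, so the width and height are halved while the most significant responses are preserved.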
3.2 Fuzzy Logic
Fuzzy logic is a form of logic that makes it possible to reach “reasoned” conclusions based on ambiguous or imprecise information [30]. One of its great contributions is that it can model situations or behaviors that are vague in themselves; that is, it adapts better to reality than classical logic, where there are only two values to decide between [31].
Human reasoning does not follow classical logic. Instead, it evaluates the environment and decides based on the weight of each variable, so fuzzy logic is more suitable for emulating such mental behavior [32].
It has been used to develop numerous applications in fields such as medicine and bioinformatics [33]. The primary emphasis of this study lies in hybrid systems, which means that two or more distinct techniques must be combined to formulate the proposed method.
The utilization of both CNN and fuzzy logic has been a recurrent theme in previous works, with a noticeable increase in its prevalence over the years [34]. The amalgamation of convolutional neural networks and fuzzy logic typically involves the incorporation of optimization algorithms, such as genetic algorithms or particle swarm optimization, aimed at optimizing the parameters associated with each technology [35].
4 Proposed Methods
This section builds on the concepts explained in the previous section. First, it details the architectures of the CNN models from a previous work to which the proposed method is applied. Next, it presents the APTOS 2019 database, its characteristics, and the two distributions needed for the case studies of this work.
It then describes the preprocessing applied to the database. Finally, it covers the creation and implementation of a fuzzy inference system that modifies the number of filters and neurons in the networks.
4.1 Neural Network Models
Two CNN models were obtained in previous work [36]. The first model, for the binary study case, has an input size of 256x256x3 (width, height, and depth) and 5 convolutional layers, some but not all of which are followed by a MaxPooling layer. The model also has 3 fully connected layers with different numbers of neurons.
Finally, the model ends with a Sigmoid activation function because it addresses a binary study case. The second model, for the multiclass study case, has the same input size and 5 convolutional layers with different hyperparameter values (MaxPooling size and dropout, where applicable). It has 1 fully connected layer and ends with a Softmax activation function because it addresses a multiclass study case.
4.2 APTOS 2019 Database
This DR database has 3662 labeled images that can be used for training and validation [37]. It has 5 classes that represent the damage caused by the disease, which are used for the multiclass study case [38]. Four of these classes can be combined to obtain just 2 classes for the binary study case: images of healthy retinas and images of retinas with diabetic retinopathy [39].
The images contain different kinds of noise and vary in size. Table 1 shows the distribution of the images for this work.
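The merging of classes for the binary study case can be sketched as a simple label mapping. This sketch assumes the usual APTOS 2019 convention in which the labels are integer severity grades from 0 to 4 and grade 0 denotes a healthy retina; the function name `to_binary_label` is hypothetical.

```python
def to_binary_label(grade):
    """Map a severity grade (assumed 0-4, with 0 = healthy) to 0 = healthy, 1 = DR."""
    if grade not in range(5):
        raise ValueError(f"unexpected grade: {grade!r}")
    return 0 if grade == 0 else 1


# Hypothetical multiclass labels for six images.
multiclass_labels = [0, 2, 4, 0, 1, 3]
binary_labels = [to_binary_label(g) for g in multiclass_labels]
print(binary_labels)  # [0, 1, 1, 0, 1, 1]
```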
4.3 Preprocessing Method
This approach involves removing interfering pixels from the background and completely isolating the retina in the image. To achieve this goal, the color images are first transformed to grayscale, which makes conversion to a binary image feasible. The conversion from grayscale to binary evaluates each pixel individually.
A pixel is kept only if its luminosity exceeds a defined threshold; otherwise it is treated as noise. The pixels that pass the threshold are then used for precise retinal extraction. With the resulting binary image, the next step consists of locating and identifying the retina.
Thanks to the binary image, this task is simplified since it only involves identifying the most prominent shape. Once the retina has been detected, its position is extracted and used to isolate the retina from the original images.
Finally, black pixels are inserted as necessary to achieve an image with uniform dimensions in width and height. This method has yielded favorable results compared to other preprocessing techniques [40].
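The steps above can be sketched on a toy image stored as nested lists of RGB tuples. Several details are assumptions made for illustration: the grayscale weights (ITU-R BT.601), the threshold value of 10, and the centering of the padding. The sketch also simplifies the shape detection by taking the bounding box of all pixels above the threshold instead of searching for the most prominent shape.

```python
BLACK = (0, 0, 0)


def to_gray(pixel):
    """Luminosity of an (r, g, b) pixel using BT.601 weights (an assumed choice)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def isolate_retina(image, threshold=10):
    """Crop to the bright region and pad with black pixels to a square image.

    `image` is a list of rows; each row is a list of (r, g, b) tuples.
    """
    # Grayscale + binarization: True where the pixel is brighter than the threshold.
    mask = [[to_gray(p) > threshold for p in row] for row in image]
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("no pixel exceeds the threshold")

    # Simplification: bounding box of every bright pixel, not the largest shape.
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    crop = [row[left:right + 1] for row in image[top:bottom + 1]]

    # Pad with black pixels so that width and height become equal.
    height, width = len(crop), len(crop[0])
    side = max(height, width)
    pad_left = (side - width) // 2
    pad_top = (side - height) // 2
    padded = [
        [BLACK] * pad_left + row + [BLACK] * (side - width - pad_left)
        for row in crop
    ]
    above = [[BLACK] * side for _ in range(pad_top)]
    below = [[BLACK] * side for _ in range(side - height - pad_top)]
    return above + padded + below


# Toy 4x6 image: a bright 2x4 "retina" surrounded by a black background.
bright = (200, 120, 60)
image = [
    [bright if 1 <= r <= 2 and 1 <= c <= 4 else BLACK for c in range(6)]
    for r in range(4)
]
square = isolate_retina(image)
print(len(square), len(square[0]))  # 4 4
```

In practice an image library would perform these steps on arrays, but the order of operations is the same: binarize, locate, crop, pad.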
4.4 Fuzzy Inference System Description
To start, both APTOS 2019 study cases require a Mamdani Type-1 fuzzy inference system (FIS). All membership functions are trapezoidal functions.
The FIS has two inputs: the accuracy achieved with the current quantity of filters or neurons, and that current quantity itself.
The FIS yields a single output, which is the new number of filters for the convolutional layer or neurons for the fully connected layer. Each input and the output has 4 membership functions. The graphical representation of the FIS is depicted in Fig. 1. The accuracy value is normalized to the range of 0 to 1.
The range for the quantity of filters or neurons depends on the convolutional or fully connected layer number; the ranges are detailed in Table 2 and Table 3, respectively.
Fully Connected Layer Number | Range |
1 (Binary Study Case) | [ - 128] |
2 (Binary Study Case) | [ - 256] |
3 (Binary Study Case) | [ - 512] |
1 (Multiclass Study Case) | [ - 512] |
The chosen defuzzification method is centroid. For these experiments, the FIS uses 16 fuzzy if-then rules obtained by trial and error, which are listed in Table 4. A graphical representation of the proposed method is shown in Fig. 2.
Fuzzy Rule Number | Fuzzy Rule |
1 | If (accuracy is very_bad) and (old_filters is very_few) then (new_filters is a_lot) |
2 | If (accuracy is bad) and (old_filters is very_few) then (new_filters is many) |
3 | If (accuracy is good) and (old_filters is very_few) then (new_filters is few) |
4 | If (accuracy is excellent) and (old_filters is very_few) then (new_filters is very_few) |
5 | If (accuracy is very_bad) and (old_filters is few) then (new_filters is a_lot) |
6 | If (accuracy is bad) and (old_filters is few) then (new_filters is many) |
7 | If (accuracy is good) and (old_filters is few) then (new_filters is very_few) |
8 | If (accuracy is excellent) and (old_filters is few) then (new_filters is few) |
9 | If (accuracy is very_bad) and (old_filters is many) then (new_filters is very_few) |
10 | If (accuracy is bad) and (old_filters is many) then (new_filters is few) |
11 | If (accuracy is good) and (old_filters is many) then (new_filters is a_lot) |
12 | If (accuracy is excellent) and (old_filters is many) then (new_filters is many) |
13 | If (accuracy is very_bad) and (old_filters is a_lot) then (new_filters is very_few) |
14 | If (accuracy is bad) and (old_filters is a_lot) then (new_filters is few) |
15 | If (accuracy is good) and (old_filters is a_lot) then (new_filters is many) |
16 | If (accuracy is excellent) and (old_filters is a_lot) then (new_filters is a_lot) |
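To show how the 16 rules of Table 4 could be evaluated, the following is a minimal Mamdani sketch with trapezoidal membership functions and centroid defuzzification. It is not the implementation used in this work. The trapezoid corners in `SETS`, the normalization of the filter count to [0, 1], the sampling resolution, and the example range of 16 to 128 are all hypothetical, since the actual membership parameters and lower bounds are not given in this text.

```python
ACC_LABELS = ["very_bad", "bad", "good", "excellent"]
QTY_LABELS = ["very_few", "few", "many", "a_lot"]

# Hypothetical trapezoid corners (a, b, c, d) on a normalized [0, 1] universe,
# reused for both inputs and the output.
SETS = [
    (0.00, 0.00, 0.15, 0.35),
    (0.15, 0.35, 0.40, 0.60),
    (0.40, 0.60, 0.65, 0.85),
    (0.65, 0.85, 1.00, 1.00),
]

# Table 4: for each old quantity, the consequent when accuracy is
# very_bad, bad, good, excellent (in that order).
RULES = {
    "very_few": ["a_lot", "many", "few", "very_few"],
    "few": ["a_lot", "many", "very_few", "few"],
    "many": ["very_few", "few", "a_lot", "many"],
    "a_lot": ["very_few", "few", "many", "a_lot"],
}


def trapmf(x, a, b, c, d):
    """Trapezoidal membership: rises from a to b, is 1 from b to c, falls from c to d."""
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def suggest_quantity(accuracy, old_qty, lo, hi, steps=200):
    """Return a new filter/neuron count in [lo, hi] for one layer."""
    acc = min(max(accuracy, 0.0), 1.0)
    old = min(max((old_qty - lo) / (hi - lo), 0.0), 1.0)
    acc_mu = dict(zip(ACC_LABELS, (trapmf(acc, *p) for p in SETS)))
    old_mu = dict(zip(QTY_LABELS, (trapmf(old, *p) for p in SETS)))

    # Firing strength per output set: min for AND, max across rules.
    strength = dict.fromkeys(QTY_LABELS, 0.0)
    for old_label, outputs in RULES.items():
        for acc_label, out_label in zip(ACC_LABELS, outputs):
            w = min(acc_mu[acc_label], old_mu[old_label])
            strength[out_label] = max(strength[out_label], w)

    # Centroid of the aggregated (clipped) output sets, sampled on [0, 1].
    num = den = 0.0
    for k in range(steps + 1):
        y = k / steps
        mu = max(min(strength[label], trapmf(y, *p))
                 for label, p in zip(QTY_LABELS, SETS))
        num += y * mu
        den += mu
    return round(lo + (num / den) * (hi - lo))


# Hypothetical layer with an allowed range of 16 to 128 filters.
print(suggest_quantity(accuracy=0.05, old_qty=16, lo=16, hi=128))
print(suggest_quantity(accuracy=0.95, old_qty=16, lo=16, hi=128))
```

With these assumed parameters, a very low accuracy with very few filters fires rule 1 and yields a value near the top of the range, while an excellent accuracy with very few filters fires rule 4 and stays near the bottom.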
The equations of the FIS are given in Eq. 1-8, where Eq. 1-4 correspond to the accuracy input and Eq. 5-8 correspond to the input and output for the number of filters or neurons.
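Eq. 1-8 are not reproduced in this text. As a reminder of their general shape only, a trapezoidal membership function is commonly written as follows, where the corner parameters a < b ≤ c < d are placeholders and not the values used in this work:

```latex
\mu(x; a, b, c, d) = \max\!\left( \min\!\left( \frac{x - a}{b - a},\; 1,\; \frac{d - x}{d - c} \right),\; 0 \right)
```

Each of the 4 membership functions of a variable would be one instance of this form with its own corner values.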
5 Experimental Results
This section brings together the concepts and methods proposed above in experiments on two study cases: binary and multiclass.
5.1 Experiments for APTOS 2019 Binary Study Case
Two experiments were conducted; the initial one involved employing the CNN model derived through the hierarchical genetic algorithm as documented in prior research [41]. The second experiment utilized the CNN model acquired through the previously explained FIS.
Each experiment was iterated 30 times, maintaining consistent hyperparameters: 10 epochs, utilization of the APTOS 2019 database, and the Adam optimizer algorithm.
In the first experiment, the mean accuracy recorded was 0.9021, accompanied by a standard deviation of 0.108434797. Conversely, for the second experiment, the mean accuracy achieved was 0.9526, with a standard deviation of 0.008521158. Detailed results for each iteration are presented in Table 5.
Experiment Number | Genetic Algorithm Accuracy | Fuzzy Logic Accuracy | Experiment Number | Genetic Algorithm Accuracy | Fuzzy Logic Accuracy |
1 | 0.938608468 | 0.934515715 | 16 | 0.927694380 | 0.960436583 |
2 | 0.929058671 | 0.956343770 | 17 | 0.889495254 | 0.938608468 |
3 | 0.911323309 | 0.960436583 | 18 | 0.931787193 | 0.954979539 |
4 | 0.507503390 | 0.952251017 | 19 | 0.950886786 | 0.949522495 |
5 | 0.920873106 | 0.956343770 | 20 | 0.960436583 | 0.956343770 |
6 | 0.930422902 | 0.961800814 | 21 | 0.953615308 | 0.952251017 |
7 | 0.939972699 | 0.961800814 | 22 | 0.897680759 | 0.972714841 |
8 | 0.937244177 | 0.942701221 | 23 | 0.931787193 | 0.949522495 |
9 | 0.924965918 | 0.957708061 | 24 | 0.904502034 | 0.954979539 |
10 | 0.938608468 | 0.956343770 | 25 | 0.934515715 | 0.935879946 |
11 | 0.937244177 | 0.960436583 | 26 | 0.916780353 | 0.950886786 |
12 | 0.937244177 | 0.960436583 | 27 | 0.929058671 | 0.946793973 |
13 | 0.939972699 | 0.952251017 | 28 | 0.934515715 | 0.945429742 |
14 | 0.946793973 | 0.952251017 | 29 | 0.507503390 | 0.954979539 |
15 | 0.930422902 | 0.939972699 | 30 | 0.933151424 | 0.948158264 |
5.1.1. Box Plot for Binary Study Case
A box plot was made to compare the values of the two experiments. The box plot for the binary study case is shown in Fig. 3.
5.1.2. Hypothesis Testing for Binary Study Case
Based on the results in Table 5, the hypothesis test compares the mean accuracy and standard deviation obtained in the first experiment with those of the second. The experiment of the present work obtained a higher mean accuracy, so the claim is that the CNN model obtained by the fuzzy inference system offers a higher mean accuracy than the CNN model obtained by the hierarchical genetic algorithm for the binary study case. With an alpha value of 0.05, the test statistic must exceed the critical value of 1.96 to reject the null hypothesis. The test statistic is 2.5240, so the null hypothesis is rejected and there is enough evidence to support the claim.
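The comparison can be approximated from the summary statistics alone. The sketch below assumes a two-sample z-test with unpooled variances, which the critical value of 1.96 suggests but which is not stated explicitly; the function name `two_sample_z` is hypothetical. It uses the rounded means and standard deviations reported above with 30 runs per experiment.

```python
import math


def two_sample_z(mean_a, sd_a, mean_b, sd_b, n_a=30, n_b=30):
    """z statistic for the difference of two means (b minus a), unpooled variances."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_b - mean_a) / standard_error


# Binary study case: genetic algorithm (a) versus fuzzy inference system (b).
z = two_sample_z(0.9021, 0.108434797, 0.9526, 0.008521158)
print(f"z = {z:.2f}, exceeds 1.96: {z > 1.96}")
```

With these rounded inputs the statistic comes out at about 2.54, in the same range as the reported 2.5240 and above the critical value.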
5.2 Experiments for APTOS 2019 Multiclass Study Case
In the same way as previous experimentation, two experiments were conducted; the first one involved employing the CNN model obtained through the hierarchical genetic algorithm [42] and the second experiment utilized the CNN model acquired through the FIS. Each experiment was iterated 30 times, maintaining consistent hyperparameters: 10 epochs, utilization of the APTOS 2019 database, and the Adam optimizer algorithm.
In the first experiment, the mean accuracy recorded was 0.7191, accompanied by a standard deviation of 0.010199619. Conversely, for the second experiment, the mean accuracy achieved was 0.7299, with a standard deviation of 0.015614013. Detailed results for each iteration are presented in Table 6.
Experiment Number | Genetic Algorithm Accuracy | Fuzzy Logic Accuracy | Experiment Number | Genetic Algorithm Accuracy | Fuzzy Logic Accuracy |
1 | 0.728512943 | 0.717598915 | 16 | 0.688949525 | 0.740791261 |
2 | 0.706684828 | 0.738062739 | 17 | 0.703956366 | 0.720327437 |
3 | 0.714870393 | 0.733969986 | 18 | 0.731241465 | 0.736698508 |
4 | 0.712141871 | 0.712141871 | 19 | 0.710777640 | 0.744884014 |
5 | 0.710777640 | 0.731241465 | 20 | 0.703956366 | 0.727148712 |
6 | 0.714870393 | 0.733969986 | 21 | 0.735334218 | 0.713506162 |
7 | 0.727148712 | 0.723055959 | 22 | 0.728512943 | 0.720327437 |
8 | 0.736698508 | 0.743519783 | 23 | 0.739427030 | 0.731241465 |
9 | 0.740791261 | 0.739427030 | 24 | 0.739427030 | 0.729877234 |
10 | 0.750341058 | 0.729877234 | 25 | 0.718963146 | 0.725784421 |
11 | 0.724420190 | 0.742155552 | 26 | 0.690313756 | 0.727148712 |
12 | 0.725784421 | 0.710777640 | 27 | 0.712141871 | 0.727148712 |
13 | 0.727148712 | 0.724420190 | 28 | 0.706684828 | 0.721691668 |
14 | 0.688949525 | 0.725784421 | 29 | 0.720327437 | 0.735334218 |
15 | 0.716234624 | 0.736698508 | 30 | 0.717598915 | 0.753069580 |
5.2.1. Box Plot for Multiclass Study Case
As in the previous experiment, a box plot was made to compare the values of the two experiments. The box plot for the multiclass study case is shown in Fig. 4.
5.2.2. Hypothesis Testing for Multiclass Study Case
Based on the results in Table 6, the hypothesis test compares the mean accuracy and standard deviation obtained in the first experiment with those of the second.
The experiment of the present work obtained a higher mean accuracy, so the claim is that the CNN model obtained by the fuzzy inference system offers a higher mean accuracy than the CNN model obtained by the hierarchical genetic algorithm for the multiclass study case.
With an alpha value of 0.05, the test statistic must exceed the critical value of 1.96 to reject the null hypothesis. The test statistic is 3.1786, so the null hypothesis is rejected and there is enough evidence to support the claim.
6 Conclusions
In this study, the focus was on employing a Mamdani Type 1 fuzzy inference system to determine the filter numbers based on the previous filter values and the obtained accuracy. Before implementing the proposed method, the mean accuracy and standard deviation of the base CNN model were calculated for comparative analysis. Subsequently, the proposed method was integrated into a pre-existing CNN model.
After the generation of the new CNN model, the FIS was used iteratively to refine it, obtaining the best CNN model together with its mean accuracy and standard deviation. There is room for improvement in the construction of the FIS, such as the number of variables, rules, and membership functions. As future work, these improvements can be implemented on top of the current work to seek a higher mean accuracy with a reduced standard deviation.
Finally, APTOS 2019 is a valuable database for addressing real-world problems, yet it is not the only dataset to which the proposed method could be applied. As future work, we plan to consider different metaheuristics for optimizing the method, as in [43-48]. We also plan to extend the use of fuzzy logic to type-2, as is done in several recent works [49-54].