1 Introduction
The use of intelligent techniques in medical diagnosis is increasingly common, which allows the development of intelligent systems based on knowledge [1, 2]. These techniques have performed very well for the diagnosis of many diseases [3, 4].
In medicine, diagnosing a disease is often difficult because the relationship between disease and symptom is rarely one to one. Diseases may share a range of overlapping symptoms, and the many variables that affect the analysis make it harder to arrive at an accurate diagnosis [5, 6, 7].
Medical diagnosis is a complicated task because it is full of uncertainty and imprecision, so a doctor's diagnosis is not always accurate and may result in a misdiagnosis of the disease.
This problem can be addressed with a fuzzy system, which is able to handle uncertainty in the representation of knowledge [8].
Fuzzy logic is a branch of artificial intelligence that can handle uncertainty by providing the opportunity to model conditions that are imprecisely defined, allowing the simulation of human reasoning in fuzzy data.
It is used as a decision-making tool [1, 3, 6], as well as in intelligent healthcare systems for the detection and prevention of mosquito-borne diseases based on fog computing [9], in Internet of Things-based support systems for dengue analysis and prediction [10], and in systems for identifying and controlling chikungunya virus [11].
Fuzzy logic has had many applications in pattern recognition, medical applications, and computer vision [12].
Many viral diseases transmitted by vectors, such as dengue, yellow fever, chikungunya, and zika, among others, are difficult to distinguish in their early stages, which complicates their diagnosis [13].
An early diagnosis based on the observed signs and symptoms is very important to prevent the disease from spreading, helping to reduce disease and mortality rates [8, 12]. Therefore, it is necessary to develop a diagnostic system that helps determine the level of infection of these four viral diseases: dengue, yellow fever, zika, and chikungunya.
This research work aims to develop a fuzzy medical system to determine the level of infection of these four viral diseases, thereby helping the doctor to make a better diagnosis. The system provides many benefits for both physicians and patients.
This work is organized into seven sections. The second presents the theoretical framework of fuzzy expert systems and fuzzy logic. The third describes medical diagnosis and the viral infections considered. The fourth reviews the related literature. The fifth presents the design and development of the application for the diagnosis of the four diseases. The sixth shows the tests and results of the system. The last section presents the conclusions and future work of the diagnostic system.
2 Fuzzy Expert System
An expert system is a computer program that simulates the decision of a person who has specialized information and experience in a particular field [14]. Some advantages of expert systems are higher reliability, reduced errors, multiple experiences, and consistent performance unaffected by fatigue [14, 15].
An expert system is a useful tool in medical diagnosis because this type of system allows simulating the behavior of a human specialist in the face of a problem in a certain field, grouping the specialist's knowledge and decision-making rules in order to make the best decision.
2.1 The Theory of Fuzzy Logic
Fuzzy logic was proposed by L. Zadeh in 1965 [16] because Boolean logic has a deficiency when dealing with uncertainty and imprecision. Fuzzy logic is a multi-valued logic that allows information to be represented and manipulated, similar to human reasoning and communication [1, 17]. A fuzzy system is made up of a set of membership functions and fuzzy rules, which are used to analyze and reason about data. It uses fuzzy logic instead of Boolean logic [3, 7, 14, 18].
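The theory of fuzzy sets starts from the classical theory of sets, adding a membership function to the set, defined as a real number between 0 and 1. For each fuzzy set $A$ over a universe $X$, a membership or inclusion function $\mu_A$ is defined as in (1):

$$\mu_A : X \rightarrow [0, 1] \tag{1}$$

If $\mu_A(x) = 0$, the element $x$ does not belong to $A$; if $\mu_A(x) = 1$, it belongs to $A$ completely; intermediate values indicate partial membership.
The knowledge base contains the specialized knowledge that is obtained from the human expert, and is represented by fuzzy rules of the IF-THEN form (2) [20]:

$$\text{IF } x_1 \text{ is } A_1 \text{ AND } \dots \text{ AND } x_n \text{ is } A_n \text{ THEN } y \text{ is } C \tag{2}$$

where $x_1, \dots, x_n$ are the input variables, $y$ is the output variable, and $A_1, \dots, A_n$ and $C$ are fuzzy sets (linguistic values) defined on the corresponding universes.
Since fuzzy sets extend classical sets, the basic set operations are defined through their membership functions. For two fuzzy sets $A$ and $B$ on the same universe:
Union:

$$\mu_{A \cup B}(x) = \max\big(\mu_A(x), \mu_B(x)\big) \tag{3}$$

Intersection:

$$\mu_{A \cap B}(x) = \min\big(\mu_A(x), \mu_B(x)\big) \tag{4}$$

Complement:

$$\mu_{\bar{A}}(x) = 1 - \mu_A(x) \tag{5}$$

Finally, an output value must be obtained. There are several methods to calculate this value, but the most widely used is the centroid or center of area, which is calculated using formula (6) [20, 21]:

$$y^{*} = \frac{\int \mu_{\text{out}}(y)\, y \, dy}{\int \mu_{\text{out}}(y)\, dy} \tag{6}$$

where $\mu_{\text{out}}$ is the membership function of the aggregated output fuzzy set.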
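As an illustration, the fuzzy set operations and the centroid defuzzifier can be sketched in Python with NumPy. This is a minimal sketch, not the MATLAB implementation used in this work: it assumes the standard max, min, and one-minus-membership operators and a discrete (sampled) approximation of the centroid, and all function names are hypothetical.

```python
import numpy as np

def fuzzy_union(mu_a, mu_b):
    """Union of two fuzzy sets: pointwise maximum of the memberships."""
    return np.maximum(mu_a, mu_b)

def fuzzy_intersection(mu_a, mu_b):
    """Intersection of two fuzzy sets: pointwise minimum of the memberships."""
    return np.minimum(mu_a, mu_b)

def fuzzy_complement(mu_a):
    """Complement of a fuzzy set: one minus the membership."""
    return 1.0 - np.asarray(mu_a, dtype=float)

def centroid(y, mu):
    """Discrete center of area: sum(y * mu) / sum(mu) over sampled points."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    total = mu.sum()
    if total == 0:
        raise ValueError("the fuzzy set is empty; the centroid is undefined")
    return float((y * mu).sum() / total)

# Memberships of three sample elements in two illustrative fuzzy sets.
mu_a = np.array([0.2, 0.7, 1.0])
mu_b = np.array([0.5, 0.4, 0.0])

# A symmetric triangular output set centered at 5 on the universe [0, 10].
y = np.linspace(0.0, 10.0, 101)
mu_out = np.maximum(0.0, 1.0 - np.abs(y - 5.0) / 2.0)
crisp = centroid(y, mu_out)
```

Because the sample output set is symmetric around 5, its centroid lies at the center of the set, which is the behavior expected from a center-of-area defuzzifier.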
Fuzzy logic systems are based on a specific life cycle model that consists of four components: fuzzification, inference engine, fuzzy rules, and defuzzification [20]. A fuzzy system can be summarized in the following stages, carried out in order [3, 6]:
Fuzzification: convert classical data to fuzzy data using membership functions. Here membership functions are applied to determine the degree of membership to a fuzzy set.
Deduction or Evaluation of rules: the truth value of the antecedent of each rule is calculated and then applied to the consequent of that rule. This results in a fuzzy set being assigned to each output variable for each rule.
Combination or Aggregation: the process of unifying the outputs of all the rules; that is, the membership functions of all previously scaled consequents are combined.
Defuzzification: the objective is to convert the output fuzzy set obtained in the aggregation into a real output number.
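The four stages above can be sketched end to end on a toy problem. The following Python example is only an illustrative assumption: it uses a single hypothetical input (`pain`, on a 0 to 10 scale), a hypothetical output (`risk`), two linear membership functions per variable, and two made-up rules. It does not reproduce the rule base or membership functions of the system described in this work.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    x = np.asarray(x, dtype=float)
    return np.maximum(0.0, np.minimum((x - a) / (b - a), (c - x) / (c - b)))

def infer(pain):
    """Toy Mamdani inference: one input, one output, two rules."""
    if not 0.0 <= pain <= 10.0:
        raise ValueError("pain must lie in [0, 10]")
    y = np.linspace(0.0, 10.0, 1001)  # sampled output universe

    # 1. Fuzzification: degree of membership of the crisp input in each set.
    low_deg = float(tri(pain, -10.0, 0.0, 10.0))   # equals 1 - pain / 10
    high_deg = float(tri(pain, 0.0, 10.0, 20.0))   # equals pain / 10

    # 2. Rule evaluation (min implication clips each consequent):
    #    IF pain is low  THEN risk is low
    #    IF pain is high THEN risk is high
    out_low = np.minimum(low_deg, tri(y, -10.0, 0.0, 10.0))
    out_high = np.minimum(high_deg, tri(y, 0.0, 10.0, 20.0))

    # 3. Aggregation: combine the clipped consequents with max.
    aggregated = np.maximum(out_low, out_high)

    # 4. Defuzzification: discrete center of area.
    return float((y * aggregated).sum() / aggregated.sum())

risk_mid = infer(5.0)
```

With this symmetric setup, an input in the middle of the scale yields an output in the middle of the scale, and larger inputs shift the defuzzified output upward.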
Fuzzy logic is a very useful technique for designing machine learning systems. This technique, derived from artificial intelligence, has been chosen because of its successful use in many health-related scenarios, where it handles the uncertainty and imprecision of real-life applications [3, 22, 23]. In the medical field, fuzzy logic is used as a decision-making tool.
3 Medical Diagnosis
The medical diagnosis of a disease is very important, particularly when several diseases have quite similar symptoms in the early stage [3]. A precise and rapid diagnosis allows the doctor to provide the appropriate treatment to the patient, but in medical diagnosis numerous variables affect the decision process. Having many symptoms to analyze in order to diagnose a patient's illness makes the doctor's job difficult [5].
Medical diagnosis is a complicated task because it is fraught with uncertainty and imprecision. Fuzzy logic deals with this uncertainty by making it possible to model conditions that are imprecisely defined. Fuzzy systems have proven very useful in medical diagnosis, achieving greater accuracy in the results [1, 5].
3.1 Viral Infections
Vector-borne viral infections (Zika, chikungunya, dengue, yellow fever, and others) are of international concern, as they pose a high risk of infection and disease spread [13, 15]. This type of disease can occur together with others, further complicating the diagnosis [8].
Viral diseases transmitted by the Aedes aegypti mosquito are difficult to diagnose and can lead to complications that cause severe disease [13]. Dengue, yellow fever, zika, and chikungunya are difficult to distinguish in their early stages because their symptoms are very similar, which makes diagnosis even harder. These diseases are described below.
Dengue. This is an infectious disease of viral origin that affects babies, young children, and adults. The mosquitoes that transmit dengue fever also transmit chikungunya, yellow fever, and zika virus infection. This disease is difficult to identify because it shares symptoms with other diseases [24]. Dengue symptoms include fever, headache, rash, pain behind the eyes, joint and muscle pain, nausea, vomiting, diarrhea, and conjunctivitis [8, 25].
Dengue can progress to a potentially fatal complication called severe dengue. There are three forms of dengue: dengue fever, dengue hemorrhagic fever, and dengue shock syndrome. The first is not a serious disease, although reinfection through another bite increases the risk of contracting the hemorrhagic variant, which does have a high mortality [4].
The second is less common, but it is a severe form of dengue in which the patient shows bleeding. The third is considered the riskiest and is responsible for most deaths, especially in children [8].
Chikungunya. It is a viral disease with an incubation period that varies from 1 to 12 days. Symptoms are usually high fever, headache, muscle aches, nausea, tiredness, skin rashes, and in some cases minor bleeding, especially affecting the extremities [13, 22]. If joint pain persists, it can result in acute or chronic disease [13].
Zika. This disease is caused by a virus, as is the case with dengue and chikungunya, and it has an estimated incubation period of 3 to 14 days. Common symptoms are fever, rash, conjunctivitis, muscle and joint pain, and vomiting [23]. Vertical transmission of zika from pregnant women to the fetus is responsible for the microcephaly epidemic [13].
Yellow fever. This is an infectious hemorrhagic viral disease transmitted by flavivirus-infected mosquitoes, with an incubation period of 3 to 6 days. Yellow fever is considered a re-emerging disease [21]. Its main symptoms are fever, loss of appetite, headache, joint pain, and vomiting [26]. However, yellow fever can progress to a much more severe form with serious complications.
4 Related Works
Many researchers have applied artificial intelligence techniques in disease diagnosis to optimize the accuracy of medical diagnostic results, including:
In [27], the authors built a tool to diagnose typhoid fever and dengue hemorrhagic fever. Its goal is to prevent errors in case management and reduce mortality, as these two diseases are often misdiagnosed because they share symptoms with other diseases. This expert system uses the Sugeno method of fuzzy logic. The system was created to help the community obtain a diagnosis based on the signs or symptoms experienced and to provide treatment solutions. It was tested on 86 cases with an accuracy of 80.2%.
In [3], an intelligent diagnostic support system was developed to identify chikungunya disease. The system integrates a genetic algorithm and fuzzy logic and is capable of providing a diagnosis of chikungunya. It was tested on 25 cases and matched the results of the training data set in 88% of the cases. An advantage of this genetic-fuzzy system is that it is a low-cost solution and is easy to implement.
In [20], the design of a web-based multiple fever diagnostic system using fuzzy logic considering human physiological symptoms is proposed. The diseases considered in this study are malaria, Lassa fever, dengue fever, typhoid fever, and yellow fever. The classifier has two stages: in the first stage, the type of fever is precisely confirmed using common symptoms; in the second stage the level of infection (mild/acute) is determined. The analysis clearly shows the effectiveness and accuracy in the performance of the system through the elimination of false results.
In [28], the authors built a fuzzy tool for predicting the type of dengue fever, covering the three types of dengue: dengue fever, dengue hemorrhagic fever, and dengue shock syndrome. Their studies suggest that 95% of the results generated by the tool matched the real data when predicting the type of fever in patients.
In [20], the author developed an expert system to diagnose tropical infectious diseases that combines fuzzy logic and certainty factors to diagnose some diseases such as dengue fever, typhoid fever, and Chikungunya. The results of their tests show that the developed system has a 91.07% similarity with the expert.
5 Implementation
In this work, only the use of human physiological symptoms to diagnose a viral disease is considered. The proposed fuzzy system aims to help the doctor diagnose the viral disease present in the patient, determining the level of infection of the four diseases. The system was implemented in MATLAB 9.1 and covers only four types of viral diseases: dengue, zika, chikungunya, and yellow fever.
To build the system, we first defined six input variables and four output variables (see Fig. 1).
The symptoms selected for the proposed system are pain, temperature, bleeding, poor appetite, muscle weakness, and shortness of breath. The output variables are the four viral diseases selected for diagnosis: dengue, zika, chikungunya, and yellow fever. Representing each disease as an output variable allows us to determine the level of infection. Fig. 1 shows the general diagram of the fuzzy system formed by these input and output variables.
5.1 Fuzzification
In this phase, the input variables must be converted to linguistic variables.
The input variables (symptoms) and their possible linguistic values are shown in Table 1. The symptoms are obtained from the patient and the data are entered manually into the program. The output variables are the four diseases, and their possible linguistic values are shown in Table 2.
Table 1. Input variables (symptoms) and their linguistic values.
Symptom | Linguistic values
Pain | No-Pain, Mild, Moderate, Strong, Severe
Temperature | Normal, Low fever, High fever
Hemorrhage | Mild, Medium, Severe
Appetite | Normal, Little, None
Muscular weakness | Normal, Mild, Moderate, Severe
Difficult breathing | Severe-Low, Moderate, Normal

Table 2. Output variables (diseases) and their linguistic values.
Output variable | Linguistic values
Dengue | Mild, Moderate, Strong
Zika | Mild, Moderate, Strong
Chikungunya | Mild, Strong
Yellow fever | Mild, Moderate, Strong, Severe
Fig. 2 shows the membership function of the first input, pain. This is divided into five fuzzy sets: no-pain, mild, moderate, strong, and severe. The ranges of each fuzzy set of the pain variable are: no-pain range from 0 to 2.5, mild range from 1.5 to 4.5, moderate range from 3.5 to 6.5, strong range from 5.5 to 8.5, and severe range from 7.5 to 10.
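To make the fuzzification of the pain variable concrete, the five sets can be sketched in Python. The ranges are the ones stated above, but the shapes are assumptions made only for illustration, since the text gives the ranges and not the curves: the interior sets are taken as triangles peaking at the midpoint of their range, and the two edge sets as shoulders. The names `trapezoid`, `PAIN_SETS`, and `fuzzify_pain` are hypothetical.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside [a, d], 1 on [b, c], linear between.

    Setting a == b or c == d gives a shoulder; b == c gives a triangle.
    """
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Feet of each set follow the stated ranges; peaks and plateaus are assumed.
PAIN_SETS = {
    "no-pain":  (0.0, 0.0, 1.5, 2.5),    # left shoulder (assumed)
    "mild":     (1.5, 3.0, 3.0, 4.5),    # triangle, peak at midpoint (assumed)
    "moderate": (3.5, 5.0, 5.0, 6.5),
    "strong":   (5.5, 7.0, 7.0, 8.5),
    "severe":   (7.5, 8.5, 10.0, 10.0),  # right shoulder (assumed)
}

def fuzzify_pain(x):
    """Return the degree of membership of a pain level in each fuzzy set."""
    return {name: trapezoid(x, *params) for name, params in PAIN_SETS.items()}

degrees = fuzzify_pain(2.5)
```

Under these assumed shapes, a pain level of 2.5 belongs only to the mild set, with a partial degree of membership, while a pain level of 7 belongs fully to the strong set.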
Fig. 3 shows the membership function of the second input, temperature. This is divided into three fuzzy sets: normal, low fever, and high fever. The ranges of each fuzzy set of the temperature variable are: normal range from 36.6 to 37.0, low fever range from 37.1 to 37.9, and high fever range from 38.0 to 40.
Fig. 4 shows the membership function of the third input, hemorrhage. This is divided into three fuzzy sets: mild, medium, and severe. The ranges of each fuzzy set of the hemorrhage variable are: mild range from 0 to 4, medium range from 3 to 8, and severe range from 7 to 10.
Fig. 5 shows the membership function of the fourth input, appetite. This is divided into three fuzzy sets: normal, little, and none. The ranges of each fuzzy set of the appetite variable are: normal range from 0 to 4.5, little range from 3.5 to 7.5, and none range from 6.5 to 10.
Fig. 6 shows the membership function of the fifth entry, muscle weakness. This is divided into four values: normal, mild, moderate, and severe. The ranges of each fuzzy set of the muscle weakness variable are: normal range from 0 to 3, mild range from 2.5 to 5.5, moderate range from 4.5 to 8.5, and severe range from 7.5 to 10.
Fig. 7 shows the membership function of the sixth input, difficult breathing. This is divided into three fuzzy sets: normal, moderate, and severe-low. The ranges of each fuzzy set of the difficult breathing variable are: severe-low range from 50 to 89.99, moderate range from 90 to 94.99, and normal range from 95 to 100. These ranges were chosen based on the oxygen saturation levels recommended in [29, 30].
The output variables are described next (see Fig. 8 to Fig. 11). Fig. 8 shows the membership function of the first output variable: dengue. This is divided into three fuzzy sets: mild, moderate, and strong. The ranges of each fuzzy set of the dengue variable are: the mild range from 0 to 3.99, the moderate range from 4 to 6.99, and the strong range from 7 to 10.
Fig. 9 shows the membership function of the second output variable: zika. This is divided into three fuzzy sets: mild, moderate, and strong. The ranges of each fuzzy set of the zika variable are: the mild range from 0 to 3.99, the moderate range from 4 to 6.99 and the strong range from 7 to 10.
Fig. 10 shows the membership function of the third output variable: chikungunya. This is divided into two fuzzy sets: mild and strong. The ranges of each fuzzy set of the chikungunya variable are: the mild range from 0 to 4.99 and the strong range from 5 to 10.
Fig. 11 shows the membership function of the fourth output variable: yellow fever. This is divided into four fuzzy sets: mild, moderate, strong, and severe. The ranges of each fuzzy set of the yellow fever variable are: the mild range from 0 to 1.99, the moderate range from 2 to 6.99, the strong range from 7 to 7.99, and the severe range from 8 to 10.
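Since the output ranges do not overlap, a crisp output value can be mapped back to its linguistic level with a simple lookup. The following Python sketch uses the ranges stated above; the names `OUTPUT_LEVELS` and `level` are hypothetical, and treating each range as half-open (so that, for example, 3.995 still counts as mild) is an assumption made to cover the small gaps between the stated bounds.

```python
# Lower bound of each linguistic level, per output variable (0 to 10 scale).
OUTPUT_LEVELS = {
    "dengue":       [(0.0, "mild"), (4.0, "moderate"), (7.0, "strong")],
    "zika":         [(0.0, "mild"), (4.0, "moderate"), (7.0, "strong")],
    "chikungunya":  [(0.0, "mild"), (5.0, "strong")],
    "yellow fever": [(0.0, "mild"), (2.0, "moderate"),
                     (7.0, "strong"), (8.0, "severe")],
}

def level(disease, score):
    """Map a crisp infection score in [0, 10] to its linguistic level."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("score must lie in [0, 10]")
    label = None
    for lower_bound, name in OUTPUT_LEVELS[disease]:
        if score >= lower_bound:
            label = name  # keep the last level whose lower bound is reached
    return label

labels = {d: level(d, s) for d, s in
          {"dengue": 5.5, "zika": 8.76,
           "yellow fever": 0.997, "chikungunya": 5.0}.items()}
```

The sample scores passed to `level` are only an illustration of how the lookup behaves on values inside different ranges.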
5.2 Inference Engine
The inference engine is made up of the rules of the knowledge base that determine whether a patient has one or the other disease [24]. In the inference engine, the Mamdani mechanism has been used and the implication process is carried out through the 𝑚𝑖𝑛 operation. If we have the antecedent of a rule, the consequent is obtained by taking the minimum value of the degree of membership of all the variables of the antecedent. In our case, the inference engine helps us determine the level of infection of these four viral diseases.
The rules were generated using the MATLAB Rule Editor. The rule base is the main part of the fuzzy inference system, and the quality of the result depends on it. The fuzzy knowledge base provided by the domain expert is basically a collection of fuzzy IF-THEN rules. In the proposed system, a set of 87 fuzzy rules has been defined using the expert's knowledge of the disease domain.
Fig. 12 shows the MATLAB rule editor that was used to define the 87 fuzzy rules. This image shows the first part of the rule definition, that is, the selection of the first input variables.
Next, two of the rules that have been introduced in the fuzzy rule builder in MATLAB are shown:
IF Pain = Moderate AND Temperature = LowFever AND Hemorrhage = Mild AND Appetite = Low AND Muscular-Weakness = Strong AND Difficult-Breath = Normal THEN Dengue = Strong AND YellowFever = Moderate.
IF Pain = Mild AND Temperature = LowFever AND Hemorrhage = None AND Appetite = Low AND Muscular-Weakness = Mild AND Difficult-Breath = Normal THEN Dengue = Moderate AND Zika = Strong AND YellowFever = Mild.
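A rule of this kind can be encoded as plain data and evaluated with the min operation described for the inference engine. The sketch below is an illustration in Python, not the MATLAB rule base: the dictionary layout and the function names are hypothetical, and the membership degrees in `degrees` are made-up values, not measurements from any patient.

```python
# The first rule above, encoded as antecedent and consequent label maps.
RULE = {
    "if": {
        "Pain": "Moderate",
        "Temperature": "LowFever",
        "Hemorrhage": "Mild",
        "Appetite": "Low",
        "Muscular-Weakness": "Strong",
        "Difficult-Breath": "Normal",
    },
    "then": {"Dengue": "Strong", "YellowFever": "Moderate"},
}

def firing_strength(rule, degrees):
    """AND of all antecedents: minimum of their degrees of membership."""
    return min(degrees[var][label] for var, label in rule["if"].items())

def apply_rule(rule, degrees):
    """Attach the firing strength to every consequent of the rule."""
    strength = firing_strength(rule, degrees)
    return {out: (label, strength) for out, label in rule["then"].items()}

# Hypothetical fuzzified inputs: degree of each relevant linguistic value.
degrees = {
    "Pain": {"Moderate": 0.8},
    "Temperature": {"LowFever": 0.6},
    "Hemorrhage": {"Mild": 1.0},
    "Appetite": {"Low": 0.7},
    "Muscular-Weakness": {"Strong": 0.4},
    "Difficult-Breath": {"Normal": 0.9},
}

result = apply_rule(RULE, degrees)
```

In this example the weakest antecedent (muscular weakness, 0.4) limits the rule, so both consequents are clipped at that degree before aggregation.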
5.3 Defuzzification
There are many techniques to perform defuzzification. Some of these are the center of area, the center of sums, and the mean of maximum [24]. In the fuzzy system implemented in this work we have used the center of area method (see formula (6)). This method is also called the center of gravity, and it is one of the most widely used defuzzification methods. It defines the output as the center of the area under the membership function that characterizes the fuzzy set obtained from the combination (aggregation) of the implication results.
6 Experiments and Results
The predictions made by the system were compared with the predictions of the experts, and the system gave good results in most of the tests. The tests were carried out on 15 cases with viral diseases; some of them are shown below. The accuracy, that is, the percentage of results that agree with the expert, is calculated using formula (7) [27]:

$$\text{Accuracy} = \frac{\text{number of cases that agree with the expert}}{\text{total number of cases}} \times 100\% \tag{7}$$
Fig. 13 shows the first test performed in MATLAB, where patient X presents the following symptoms: pain of 2.5, temperature of 37.4 °C, bleeding of 0, lack of appetite at level 2, muscle weakness at level 4, and difficulty breathing of 93 percent. The diagnosis determined by the fuzzy system was the following: dengue = 5.5, zika = 8.76, yellow fever = 0.997, and chikungunya = 5, concluding that the patient has a higher probability of suffering from zika, since he/she presents a higher level of infection for that disease.
Fig. 14 shows a second test performed in MATLAB, where patient Y has the following symptoms: pain of 7, temperature of 38.4 °C, bleeding of 5, lack of appetite at level 8, muscle weakness at level 9, and difficulty breathing of 70 percent. The diagnosis determined by the fuzzy system was the following: dengue = 5, zika = 5, yellow fever = 7.5, and chikungunya = 5, concluding that the patient has a greater probability of suffering from yellow fever, since he/she presents a higher level of infection for that disease.
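The conclusion drawn in both tests amounts to selecting the disease with the highest defuzzified level of infection. A minimal Python sketch of that selection step follows; the function name `most_likely` is hypothetical, and returning every disease that ties for the highest level is an assumption, since the text does not say how ties are handled.

```python
def most_likely(levels):
    """Return the disease(s) with the highest level of infection."""
    top = max(levels.values())
    return sorted(d for d, value in levels.items() if value == top)

# Output levels reported for the two tests described above.
patient_x = {"dengue": 5.5, "zika": 8.76, "yellow fever": 0.997, "chikungunya": 5.0}
patient_y = {"dengue": 5.0, "zika": 5.0, "yellow fever": 7.5, "chikungunya": 5.0}

pick_x = most_likely(patient_x)
pick_y = most_likely(patient_y)
```

For these two sets of outputs the selection gives zika for patient X and yellow fever for patient Y, in line with the conclusions stated in the text.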
In the results of the two previous exercises, the system provides the level of infection of the four diseases. In these cases, the diagnosis of the system was correct, since it was the same as the doctor's diagnosis, that is, it helped to confirm that the doctor's diagnosis was correct.
The system does not serve as a replacement for medical experts, but as a diagnostic assistance solution. In addition, this system only uses physiological symptoms. The doctor provides the symptoms, and the system determines the level of infection for each of the four diseases. The level of infection of each disease tells the doctor how likely the patient is to suffer from that disease. Supported by the results of the fuzzy system, the doctor will be able to diagnose the disease more quickly and reliably. Having a membership function for each disease allows us to see how likely the patient is to suffer from each disease, helping the doctor make a better decision. Of the 15 cases, 13 results were correct and 2 were incorrect. The experimental results therefore agree with the diagnosis of expert human doctors in 86.6% of the cases.
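As a quick arithmetic check, the agreement rate can be computed directly from the number of correct cases. The short Python sketch below assumes that accuracy is the share of cases that agree with the expert, expressed as a percentage; the function name `accuracy` is hypothetical.

```python
def accuracy(correct, total):
    """Percentage of cases in which the system agrees with the expert."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    return 100.0 * correct / total

agreement = accuracy(13, 15)
print(f"{agreement:.2f}%")  # prints 86.67%
```

For 13 correct cases out of 15 this gives 86.67%, which corresponds to the 86.6% reported in the text when the value is truncated to one decimal place.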
7 Conclusions
The fuzzy system proposed in this work helps to determine the level of infection in the diagnosis of four diseases (dengue, zika, chikungunya, and yellow fever) according to the presence of six symptoms: pain, temperature, bleeding, lack of appetite, muscle weakness, and shortness of breath. The system is not intended to replace doctors, but to help them make a better decision regarding diagnosis.
The system shows the doctor what possibility the patient has of suffering from each disease, which allows the doctor to diagnose with greater certainty, and helps to avoid erroneous diagnoses. The involvement of the human expert is vital, as the entire system is based on the knowledge provided by the expert.
The predictions made by the system have been compared with the expert's predictions, and the system has provided good results in most of the tests. We found that 86.6% of these results were similar to the real data of the patients.
In the future, we intend to add other factors or symptoms that allow a more accurate diagnosis, and we also intend to design another fuzzy system applied to other diseases such as COVID-19. Finally, we intend to implement other techniques to solve the same problem, such as Naive Bayes, neural networks, and C4.5.