Introduction
Fruit scab is caused by the fungus Sphaceloma perseae, and is one of the main phytosanitary problems of avocado (Persea americana Mill.), which reduces its export potential (Tamayo-Molano, 2007). This disease reduces quality and degrades production to second-rate fruit in the domestic market. The incidence of the pathogen can be reduced through a management system based on a combination of effective and timely fungicide sprays, as well as periodic removal of diseased fruit and dead branches within the canopy. Thus, the economic losses caused by scab could be reduced (Hartill, 1991). To avoid over-application of chemical products, growers require alternatives to ensure delivery of quality fruit (Korsten, 2006). Peel color in fruits is a quality attribute that influences consumer preference, and induces the expectation of flavor, taste and palatability (Wadhera & Capaldi-Phillips, 2014). Therefore, it is useful to construct color scales using instruments such as colorimeters, digital image processing and other objective color determination techniques to reduce the subjectivity of operators in the field or postharvest (Castro-Camacho et al., 2013).
Sphaceloma perseae infection results in a brown, corky-looking, irregularly-shaped lesion in the epidermis, but does not produce pulp rot (Tamayo-Molano, 2007). There are few epidemiological works describing the temporal progress of avocado scab, such as graphical scans (Marroquín-Pimentel, 1999) and some spatial distribution models of thrips (Rivera-Martínez et al., 2017). The incidence of the disease (97 %) has been directly related to the incidence of thrips damage (90 %) (Ávila-Quezada et al., 2003). However, the exact damage caused by this disease is not known (Ávila-Quezada et al., 2002).
In Mexico, avocado scab is commonly controlled with continuous applications of agrochemicals, without prior knowledge of the epidemiology of the pathosystem, which negatively affects the environment and the grower's economic situation (Ávila & Marroquín, 2007). Visual scales are a useful tool to evaluate the severity of plant diseases. These scales establish classes on a 0-10 rating, where 0 is a healthy fruit (or any other organ or plant) and 10 is a completely damaged organ (Everett, 1999). Currently, technological tools are available that can help determine the percentage of damage to a crop, such as digital image analysis, information analysis through statistical packages, and different color formats, among others. One color format in which an image can be analyzed is the CIE L*a*b* format, which consists of three components (Iñiguez et al., 1995): L (lightness), "a" (green to red) and "b" (blue to yellow). Bai et al. (2013) used the CIE L*a*b* format and, by taking illumination into account in the learning stage of their proposed model, found that analyzing colors in images with non-complex textures yields a reasonable classification of crops with respect to soil.
In this study, the CIE L*a*b* color format was used to evaluate the percentage of damage due to scab, considering that scab has a brown color in different shades. Therefore, the objective of this study was to provide an innovative and reliable tool, through image analysis, to evaluate the surface damaged by scab in avocado fruit. This will help to make timely decisions and achieve effective management. The images used contained Hass avocado fruits at developmental stages five and six (Ávila-Quezada et al., 2005).
Materials and methods
The color identification strategy proposed in this work consisted of sampling pixels of one color (with different shades of the same color) from an image (Figure 1 and Figure 2). The RGB data of the sampled pixels were transformed to the CIE L*a*b* format, and the CIE L*a*b* coordinates were plotted in a three-dimensional coordinate system. When looking at the plotted points from different angles, it was found that in the "a-b" plane view the pixels corresponding to the same color in different shades fell in the same neighborhood of points (Figure 3). The pixel points were plotted again in the "a-b" plane (discarding the value of L) and a polygon was drawn around them (Figure 4). Subsequently, the initial image (Figure 1) was analyzed pixel by pixel, and the pixels whose "a, b" coordinates were within the polygon were counted (Figure 5). The percentage of green pixels in an image was determined by dividing the number of green pixels found by the total number of pixels in the image and multiplying the result by 100.
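The RGB to CIE L*a*b* transformation used in this strategy can be sketched as follows. The text does not state which RGB space or reference white was used, so this minimal example assumes 8-bit sRGB input and the D65 white point; the function name `srgb_to_lab` is illustrative only.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIE L*a*b* (assumes D65 white point)."""

    def to_linear(channel):
        # Undo the sRGB gamma encoding (input scaled to 0..1).
        c = channel / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = to_linear(r), to_linear(g), to_linear(b)

    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # D65 reference white (assumption).
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        # Piecewise cube-root function from the CIE L*a*b* definition.
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    lightness = 116.0 * fy - 16.0
    a_star = 500.0 * (fx - fy)   # green (negative) to red (positive)
    b_star = 200.0 * (fy - fz)   # blue (negative) to yellow (positive)
    return lightness, a_star, b_star


if __name__ == "__main__":
    # A saturated green pixel should have a negative "a" coordinate.
    print(srgb_to_lab(0, 255, 0))
```

Only the "a" and "b" values returned here would be kept for the polygon step, since the lightness coordinate is discarded.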
Avocado scab control is carried out when the fruit is green, that is, in the stages prior to ripening. Images of Hass avocado fruit were obtained when the fruit was clearly green. When obtaining photos of avocado fruit in the field, it was not necessary to cut the fruit. Seventy images were obtained with different lighting conditions at different times of the day in orchards in Michoacán, Mexico, because some fruits were exposed to the sun, others were completely shaded, and some were partially shaded. In addition, the photos were taken with the fruit in front of a light blue or white sheet of paper to avoid the leaves of the avocado tree in the image. This ensured that the algorithm recognized only the green of the avocado fruit.
With the images obtained, a sample of only green pixels was extracted. These pixels were plotted on the "a-b" plane of the CIE L*a*b* color format, and a polygon was drawn around the projected pixels (Figure 6). The coordinates found for the polygon in "a, b" format are: -25, 80; -14, 67; -7, 26; -4, 14; -5, 0; -10, -6; -20, -8; -50, -10; -71, 5; -85, 32; -79, 65; -58, 75; -31, 81. In the subsequent analysis, pixels whose "a-b" coordinates fell within the polygon were classified as green.
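The membership test against the green polygon can be sketched with a standard ray-casting (even-odd) point-in-polygon check. The text does not say which point-in-polygon algorithm was used, so ray casting is an assumption here; the vertex list is the one reported above, and the names `GREEN_POLYGON` and `point_in_polygon` are illustrative.

```python
# Vertices of the green polygon in (a, b) format, as reported in the text.
GREEN_POLYGON = [
    (-25, 80), (-14, 67), (-7, 26), (-4, 14), (-5, 0), (-10, -6), (-20, -8),
    (-50, -10), (-71, 5), (-85, 32), (-79, 65), (-58, 75), (-31, 81),
]


def point_in_polygon(a, b, polygon):
    """Return True if the point (a, b) lies inside the polygon (ray casting)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        a1, b1 = polygon[i]
        a2, b2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (b1 > b) != (b2 > b):
            # "a" coordinate where the edge crosses that horizontal line.
            a_cross = a1 + (b - b1) * (a2 - a1) / (b2 - b1)
            if a < a_cross:
                inside = not inside
    return inside


if __name__ == "__main__":
    print(point_in_polygon(-40, 40, GREEN_POLYGON))  # a point in the green region
    print(point_in_polygon(20, 30, GREEN_POLYGON))   # a point with positive "a"
```

Points lying exactly on an edge are not handled specially in this sketch; for a coarse color classifier that boundary case should rarely matter.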
To check the correct functioning of the polygon, the number of green pixels (pixels that fell within the polygon) in an image of the avocado fruit was calculated and, as a result, white pixels were placed where green coloring was found (Figure 7). In that image, it was found that there were 42.925 % green pixels. To obtain the percentage of green pixels, the number of green pixels was divided by the total number of pixels in the image, and the result was multiplied by 100. With the above, it was visually found that the polygon had correctly identified all pixels with green coloring.
Subsequently, the percentage of scab color was obtained; to do this, the same procedure was performed as with the green color, except that this time only the scab-colored pixels were identified and plotted on the "a-b" plane of the CIE L*a*b* color format (Figure 8). The coordinates of the polygon in "a, b" format were: 7, 50; 16, 49; 25, 47; 30, 30; 26, 22; 20, 16; 12, 10; 5, 8; 3, 13; 0, 18; 4, 34; 4, 49; 5, 51.
To check for correct operation, we visually assessed whether the polygon correctly identified the scab. To do this, in the original image, the pixels found with the scab color were changed to white (Figure 9). In the image, it was observed that there were 17.37 % of pixels with scab, which was estimated with respect to the total number of pixels in the image.
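The visual check described here (painting detected pixels white and reporting their share of the whole image) can be sketched as below. The image representation (a list of rows of RGB tuples) and the `is_target` callback are assumptions made for illustration; in practice the callback would convert each pixel to "a, b" coordinates and test it against the scab or green polygon.

```python
WHITE = (255, 255, 255)


def mask_and_percentage(image, is_target):
    """Paint matching pixels white and return (masked_image, percentage).

    image     : list of rows, each row a list of (r, g, b) tuples (assumed layout)
    is_target : function taking an (r, g, b) tuple and returning True or False
    """
    total = 0
    hits = 0
    masked = []
    for row in image:
        new_row = []
        for rgb in row:
            total += 1
            if is_target(rgb):
                hits += 1
                new_row.append(WHITE)   # mark the detected pixel
            else:
                new_row.append(rgb)     # leave other pixels unchanged
        masked.append(new_row)
    # Percentage relative to all pixels in the image, as in the text.
    percentage = 100.0 * hits / total if total else 0.0
    return masked, percentage


if __name__ == "__main__":
    # Tiny hypothetical 2 x 2 image and a placeholder classifier.
    demo = [[(200, 10, 10), (10, 200, 10)], [(10, 10, 200), (0, 0, 0)]]
    masked, pct = mask_and_percentage(demo, lambda rgb: rgb[0] > 100)
    print(pct)
```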
Because the percentage of scab must be expressed with respect to the fruit, the sum of the green pixels and the scab-colored pixels in the entire image was taken as an approximation of 100 % of the pixels that make up the fruit. The scab-colored pixels then represent the diseased fraction of that total. Therefore, by means of a rule of three, the percentage of scab-colored pixels (percentage of the disease) with respect to the fruit is obtained.
$$P_{scab} = \frac{S}{S + G} \times 100$$

where $P_{scab}$ is the percentage of scab in the fruit, S is the percentage of pixels with scab found with respect to all pixels in the image and G is the percentage of green pixels found with respect to all pixels in the image. Substituting the above values (S = 17.37 % and G = 42.925 %) into the equation gives a fruit scab percentage of 28.8 %.
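As a worked check of the rule of three, the relation P scab = 100 × S / (S + G) can be evaluated with the values reported in the text. The function name `scab_percentage` is illustrative.

```python
def scab_percentage(s, g):
    """Percentage of scab relative to the fruit.

    s : percentage of scab-colored pixels relative to the whole image
    g : percentage of green pixels relative to the whole image
    """
    fruit = s + g  # green plus scab pixels approximate the whole fruit
    if fruit <= 0:
        raise ValueError("no fruit pixels detected (s + g must be positive)")
    return 100.0 * s / fruit


if __name__ == "__main__":
    # Values from the text: S = 17.37 %, G = 42.925 %
    print(round(scab_percentage(17.37, 42.925), 1))  # 28.8
```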
Subsequently, the percentage of scab on the fruit was calculated with the same images using the diagrammatic scale (Ávila & Marroquín, 2007). All avocado images were digitized in the AutoCAD® program (AUTODESK, 2020), the percentage of scab on the fruit was calculated and the results were compared with the diagrammatic scale. The scab percentages estimated with AutoCAD® (AUTODESK, 2020) were taken as actual or real (observed) data due to the program’s high precision.
To analyze the data, the residuals of all scab percentage estimates were first checked for normal behavior; this was done in the R Statistics program (R Development Core Team, 2020) using the probplot command of the e1071 package reported by Meyer et al. (2019). As a next step, the scab percentages obtained with the polygons (abscissae) were plotted against those obtained with AutoCAD® (AUTODESK, 2020) (ordinates). The regression equation of the form y = β₀ + β₁(x) was found and the coefficient β₀ was analyzed with null hypothesis β₀ = 0, according to the methodology proposed by Infante-Gil and Zárate-de Lara (1984), to determine if the straight line passes through the coordinate (0, 0). Similarly, the coefficient β₁ was analyzed with null hypothesis β₁ = 1, also with the methodology of Infante-Gil and Zárate-de Lara (1984), to determine if for each unit increase in the abscissae the ordinates increase by one unit.
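The two hypothesis tests on the regression coefficients can be sketched with the usual least-squares t statistics. This is a minimal illustration that assumes the standard simple linear regression formulas (the cited methodology is not reproduced here), and the data in the usage example are hypothetical, not the study's measurements.

```python
import math


def regression_t_tests(x, y):
    """Fit y = b0 + b1*x by least squares and return t statistics for
    H0: b0 = 0 and H0: b1 = 1, each with n - 2 degrees of freedom."""
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least 3 paired observations")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    b1 = sxy / sxx                # slope
    b0 = mean_y - b1 * mean_x     # intercept

    # Residual variance and standard errors of the coefficients.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)
    se_b1 = math.sqrt(s2 / sxx)
    se_b0 = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))

    return {
        "b0": b0,
        "b1": b1,
        "t_b0": b0 / se_b0,          # tests H0: b0 = 0
        "t_b1": (b1 - 1.0) / se_b1,  # tests H0: b1 = 1
        "df": n - 2,
    }


if __name__ == "__main__":
    # Hypothetical scab percentages: polygon method (x) vs. reference (y).
    x = [5, 10, 20, 30, 40]
    y = [6, 9, 21, 29, 41]
    print(regression_t_tests(x, y))
```

Each t statistic would then be compared, in absolute value, with the two-sided critical value from a t table for the chosen α and n − 2 degrees of freedom; the null hypothesis is not rejected when the statistic is smaller than the table value.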
Results and discussion
To validate the green and scab color identification polygons of the avocado, the actual percentage of scab on the fruit was obtained. The avocado images were digitized in the AutoCAD® program (AUTODESK, 2020). First, the area of the entire fruit (A1) was digitized, as shown in the dotted line in Figure 10b, and then the area of the fruit containing scab (A2) was digitized, as shown in the solid line in Figure 10b. The percentage of scab on the fruit was estimated by dividing the areas A2/A1 and multiplying it by 100.
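AutoCAD® reports the area of a digitized outline directly, so no code is needed for this step. For readers without the program, the same A2/A1 ratio can be sketched by applying the shoelace formula to outline vertices. The vertex lists and function names below are hypothetical and only illustrate the calculation.

```python
def polygon_area(vertices):
    """Area of a simple polygon from its (x, y) vertices (shoelace formula)."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def scab_percentage_from_outlines(fruit_outline, scab_outline):
    """Percentage of scab as 100 * A2 / A1, with A1 the fruit area and
    A2 the scab area, both given as digitized outlines."""
    a1 = polygon_area(fruit_outline)
    a2 = polygon_area(scab_outline)
    return 100.0 * a2 / a1


if __name__ == "__main__":
    # Hypothetical outlines in arbitrary drawing units.
    fruit = [(0, 0), (10, 0), (10, 10), (0, 10)]
    scab = [(1, 1), (6, 1), (6, 3), (1, 3)]
    print(scab_percentage_from_outlines(fruit, scab))  # 10.0
```

If the scab appears as several separate lesions, the areas of the individual outlines would be summed before dividing by the fruit area.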
Visually, the data showed a linear trend when the observed scab percentages were compared with those estimated by the proposed method (Figure 11), and the equation of the straight line with intercept at the origin was obtained (r² = 0.867).
A first linear regression was performed to obtain the equation of the straight line of the form y = β₀ + β₁(x). The β₀ value obtained was small compared to the scale of the graph. Because of this, a hypothesis test was carried out on the coefficient β₀, where the null hypothesis indicated that β₀ = 0 and the alternate hypothesis indicated that β₀ ≠ 0, with a value of α = 0.01. The test statistic (t₀ = 2.44) was calculated and compared with the table statistic (
On the other hand, a hypothesis test was performed on the coefficient β₁ of the straight line y = β₀ + β₁(x), where the null hypothesis indicated that β₁ = 1 and the alternate hypothesis indicated that β₁ ≠ 1. Likewise, the test statistic (t₀ = -0.17) was calculated and compared with the table statistic (
The regression of the actual scab percentage on the scab percentage found with the diagrammatic scale (Ávila & Marroquín, 2007) was also calculated, and the equation y = 0.5848x was obtained, with r² = 0.80.
Ávila and Marroquín (2007) designed a precise, accurate (R² > 0.8 and β₁ > 0.8) and reproducible diagrammatic logarithmic scale to evaluate scab in avocado. In the present work, with said diagrammatic scale, an r² = 0.80 and β₁ = 0.58 were found. It is likely that, due to the complexity of the symptom patterns and the variable assessment of the evaluator (due to a lack of training), the value of β₁ is sometimes less than 0.8.
These scales are the most commonly used to measure disease severity and are based on the principle of Horsfall and Barratt (1945) and the Weber-Fechner law (Campbell & Madden, 1990). Overestimation of severity levels using logarithmic scales has been reported for Puccinia horiana (Barbosa et al., 2006) and for Ramularia gossypii (Aquino et al., 2008). Leaves with similar severities, but with a different number of lesions, generate a tendency to overestimate the disease, mainly when the number of lesions is very high and their size is small (Sherwood et al., 1983). The six-class diagrammatic logarithmic scales proposed to assess the severity of Corynespora cassiicola-induced leaf and calyx spotting of roselle (Ortega-Acosta et al., 2016) provided good accuracy, precision, and reproducibility in the estimates.
Hock et al. (1992) developed a scale also based on the logarithmic principle for the field evaluation of the severity of the tar spot disease complex of maize; however, the precision values obtained were not adequate due to the complicated quantification system and the high number of classes with which it was developed. Michereff et al. (2006) determined suitable precision values in a first evaluation with the use of scales. Tovar-Soto et al. (2002) had to familiarize the evaluators to increase the precision and accuracy values (Hernández-Ramos & Sandoval-Islas, 2015). Due to the use of numerous scales and varying accuracy, some authors such as Bai et al. (2013) and Yakushev and Kanash (2016) used computer programs to identify color regions in crops using data transformed to the CIE L*a*b* color format. Bai et al. (2013) obtained 87.2 % in the performance of the model they proposed with the CIE L*a*b* color format. In the present work, an r² of 0.867 was found, suggesting that a performance similar to that of Bai et al. (2013) was obtained.
The coefficient of determination obtained with the polygon is acceptable (> 0.8), which shows that precision can be obtained with the use of this measurement strategy. When comparing the actual scab percentages (obtained with AutoCAD®) with those obtained with the polygon method, an r² = 0.867 and a slope of 0.992 were obtained. Therefore, the polygon allowed the assessment of the disease with precision, accuracy and reproducibility. Accuracy is the closeness of an estimated value to the actual value, while precision is the variation or repeatability associated with an estimate, and reproducibility is the absence of variation in estimates when different assessors quantify the same characteristic (Nascimiento et al., 2005).
The polygons found cover most of the green and scab-colored variations; therefore, it is recommended to use the same polygon coordinates to determine the scab percentage in any image with a green avocado fruit and a white or light blue background. If the outside of the avocado fruit is not green, the polygon would have to be adjusted; therefore, it is always recommended to visually check the estimate in at least one image.
Conclusions
The method proposed in this study may be useful in research where clearly different colors need to be identified.
With the polygon-based method proposed in this study, the fitted equation y = 0.9923x, with r² = 0.8672, indicates that the method performs better than the diagrammatic scale (y = 0.5848x, r² = 0.80).
The results generated will support future monitoring of the disease in a novel way.