Investigaciones geográficas
On-line version ISSN 2448-7279; print version ISSN 0188-4611
Abstract
GOMEZ GUERRERO, Jenny Sofía and AGUAYO ARIAS, Mauricio Iván. Performance Evaluation of Rainfall Data Fill-in Methods in Two Morphostructural Areas of South-Central Chile. Invest. Geog [online]. 2019, n.99, e59837. Epub Sep 25, 2019. ISSN 2448-7279. https://doi.org/10.14350/rig.59837.
The quality of the information in meteorological time series has always been a concern for the scientific community. The scarcity of information requires the use of data fill-in techniques and methods that are frequently applied without regard to the orographic features of the study area or to the accuracy of the method itself, leading to inaccurate results with important consequences.
In this context, this paper seeks to evaluate two methods for filling rainfall data, namely Normal Ratio and Linear Regression Model (LRM), applied to two morphostructural zones in the south central region of Chile, through an error analysis of a 32-year series of precipitation data.
Both methods were compared considering 65 of 112 stations across the region, located on the coastal plain and central valley. Subsequently, two time-consistent base stations were defined, one for each area; pluviometric and proximity criteria, as well as the amount of information available, were applied to choose five neighboring stations.
After the correlation between stations was calculated, a probability analysis by quartiles and the Shapiro-Wilk test confirmed the normality of the LRM models, as well as the homogeneity of the adjusted predictions and residuals.
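As an illustration of the regression-based fill-in, the following is a minimal sketch that fits an ordinary least squares line between a target station and a single neighboring station and uses it to estimate missing annual totals. The function names, the single-predictor setup, and the sample values are assumptions made for this example; the abstract does not specify the exact model form, and the normality checks mentioned above are not reproduced here.

```python
from math import sqrt


def fit_linear(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r), where r is the Pearson correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r


def fill_with_regression(target, neighbor):
    """Replace None entries in `target` with regression estimates.

    Assumes `neighbor` is a complete series of the same length (hypothetical
    simplification for this sketch).
    """
    xs = [n for n, t in zip(neighbor, target) if t is not None]
    ys = [t for t in target if t is not None]
    intercept, slope, _ = fit_linear(xs, ys)
    return [
        t if t is not None else intercept + slope * n
        for n, t in zip(neighbor, target)
    ]


# Made-up annual totals in mm, not data from the study.
neighbor_series = [1000.0, 1200.0, 900.0, 1100.0]
target_series = [1100.0, None, 1000.0, 1200.0]
filled_series = fill_with_regression(target_series, neighbor_series)
```

Observed values are left untouched; only the gaps are replaced, so extreme years in the record are preserved.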
The Normal Ratio method evaluated rainfall estimates by weighting mean annual rainfall in the neighboring stations, where each weighting factor corresponds to the ratio between the precipitation figure recorded in the auxiliary station and the mean annual rainfall of the respective station.
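The weighting described above can be sketched as follows, using the common textbook form of the Normal Ratio, in which the estimate is the average over the neighboring stations of the target's mean annual rainfall multiplied by each neighbor's ratio of recorded to mean annual rainfall. The function name, argument layout, and sample values are assumptions for this example, and the authors' exact formulation may differ in detail.

```python
def normal_ratio(target_mean_annual, neighbors):
    """Estimate a missing value at the target station.

    neighbors: list of (recorded_value, mean_annual_rainfall) pairs, one per
    neighboring station, for the period that is missing at the target.
    """
    if not neighbors:
        raise ValueError("at least one neighboring station is required")
    weighted = [
        target_mean_annual * recorded / mean_annual
        for recorded, mean_annual in neighbors
    ]
    return sum(weighted) / len(weighted)


# Made-up values in mm, not data from the study.
estimate = normal_ratio(
    target_mean_annual=1200.0,
    neighbors=[(900.0, 1000.0), (1500.0, 1500.0), (540.0, 600.0)],
)
# Each neighbor contributes 1080, 1200 and 1080 mm, so the estimate is
# approximately 1120 mm.
```

Because each neighbor's record is rescaled by its own long-term mean, stations with very different rainfall regimes can still contribute on a comparable footing.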
The performance of each method was assessed using the following estimators: Mean Error, Coefficient of Determination (CoD), Mean Squared Error (MSE), Root-Mean-Square Error (RMSE), Sum of Squared Residuals (SSR), Mean Relative Error (MRE), and Mean Absolute Percentage Error (MAPE).
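For reference, the estimators listed above can be computed as in the sketch below. Definitions of some of these measures vary between authors; here the CoD is assumed to be 1 minus SSR divided by the total sum of squares, residuals are taken as predicted minus observed, and MRE and MAPE are assumed to be relative to the observed value. The function name and sample values are hypothetical.

```python
from math import sqrt


def error_metrics(observed, predicted):
    """Return a dict of error estimators for paired observed/predicted values.

    Observed values must be nonzero, since MRE and MAPE divide by them.
    """
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    ssr = sum(e ** 2 for e in residuals)
    mse = ssr / n
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "ME": sum(residuals) / n,
        "SSR": ssr,
        "MSE": mse,
        "RMSE": sqrt(mse),
        "CoD": 1 - ssr / sst,
        "MRE": sum(e / o for e, o in zip(residuals, observed)) / n,
        "MAPE": 100 * sum(abs(e / o) for e, o in zip(residuals, observed)) / n,
    }


# Made-up values in mm, not data from the study.
metrics = error_metrics(
    observed=[100.0, 200.0, 400.0],
    predicted=[110.0, 190.0, 400.0],
)
```

In this example the mean error is zero even though two predictions are off by 10 mm, which shows why the quadratic and percentage estimators are reported alongside it.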
The statistical analysis reveals a greater range of temporal variation in precipitation in the Central Valley relative to the Coastal Zone, except for one station, and a positive relationship between altitude and a broader pluviometric range. LRM shows greater data dispersion at station Chiguayante; moreover, according to the CoD, this is the station with the lowest prediction potential.
In most of the cases analyzed, we found an inverse relationship between the sum of squared residuals (SSR) and the number of annual precipitation data available in each station.
The estimators SSR, MSE, and RMSE penalize large residuals. For the 32-year series studied, they reveal that the Normal Ratio yields better performance and lower prediction error at the target stations in both morphostructural areas. For both methods in the selected sample, Dichato was the station with the lowest mean error and Mayulermo the station with the lowest mean relative error.
As Dichato was the station with the greatest Euclidean distance from the base station, distance is ruled out as a major predictive factor, in contrast to our findings regarding data dispersion.
The analysis of residuals (SSR, MSE, RMSE) indicated that the Linear Regression Model is influenced by outliers. However, these values were retained, since eliminating extreme values, as is usually done in regression analysis, may result in losing relevant information about maximum and minimum precipitation that is useful in the analysis of extreme climatic events such as drought. The efficiency of both methods for predicting actual values was evaluated through the estimators SSR and CoD, showing that in the present analysis the Normal Ratio yields a higher CoD and lower residual variability. Although regression remains a widely used and recommended method, the Normal Ratio should be reconsidered for predicting missing data in precipitation series in areas of south-central Chile where neighboring stations have records available to supply the equation with the data it requires.
The quadratic estimators MSE and RMSE allow inferring that those stations showing a lower mean error, where the predictive methods analyzed were most successful, were the stations where precipitation showed a more stable behavior around the mean.
The dimensionless estimators MRE and MAPE confirmed the advantage of the Normal Ratio and determined that the best mean performance of the prediction was related to data dispersion rather than to the Euclidean distance between stations and the base station.
The two methods evaluated offer a simple way to estimate meteorological data when the information available is insufficient; however, the Normal Ratio demonstrated a better performance relative to LRM for estimating missing precipitation data, regardless of the geomorphological area selected.
Keywords: Fill of rainfall data; performance evaluation; morphostructural zones; Linear Regression; Normal Ratio.