SciELO - Scientific Electronic Library Online

 


Atmósfera

Print version ISSN 0187-6236

Abstract

MORALES MARTINEZ, Jorge Luis et al. Analysis of a new spatial interpolation weighting method to estimate missing data applied to rainfall records. Atmósfera [online]. 2019, vol.32, n.3, pp.237-259.  Epub Oct 07, 2020. ISSN 0187-6236.  https://doi.org/10.20937/atm.2019.32.03.06.

In the present work, two new generalized weighted methods for imputing missing data are developed and tested on daily rainfall series. The proposed methodology allows the time series to be fully rebuilt while preserving its statistical properties. Rainfall records from the state of Tabasco, Mexico, for the period 1980-2012 were used to test and evaluate the methodology. Missing data at a given weather station are imputed from daily data recorded at neighboring stations with similar rainfall behavior. The optimal parameters of the proposed formulae are chosen by minimizing the mean absolute error (MAE) with an evolutionary strategy (CMA-ES). The K-means method with the Euclidean distance was used to select suitable neighboring weather stations. Five different methods were applied to estimate the optimal number of clusters: the elbow method, the gap statistic, and the TraceW, Hartigan, and Krzanowski-Lai indices. In addition, the structural stability of the chosen clusters was evaluated to demonstrate that they represent the true structure of the data and are not an artifact of the internal procedure of the clustering algorithm. Results from two statistical tests, the Friedman test and the Nemenyi post hoc test, showed that the two new methods produce statistically significantly better estimates than existing methods in the literature.
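
The general idea of weighting neighboring stations and tuning a parameter by minimizing the MAE can be illustrated with a minimal sketch. It does not reproduce the two formulae proposed in the article, which the abstract does not state. It assumes classical inverse-distance weighting with a single exponent `power`, and it replaces CMA-ES with a plain search over a few candidate exponents. All function names and the data layout (one list of neighbor readings per day, `None` for a missing reading) are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def idw_estimate(neighbor_values: Sequence[Optional[float]],
                 distances: Sequence[float],
                 power: float) -> Optional[float]:
    """Inverse-distance weighted mean of the neighbor readings that are present."""
    num = 0.0
    den = 0.0
    for value, dist in zip(neighbor_values, distances):
        if value is None:          # neighbor also has a gap on this day
            continue
        weight = 1.0 / (dist ** power)
        num += weight * value
        den += weight
    return num / den if den > 0 else None


def mean_absolute_error(observed: Sequence[Optional[float]],
                        estimated: Sequence[Optional[float]]) -> float:
    """MAE over the days where both an observation and an estimate exist."""
    errors = [abs(o - e) for o, e in zip(observed, estimated)
              if o is not None and e is not None]
    if not errors:
        raise ValueError("no overlapping days to compare")
    return sum(errors) / len(errors)


def fit_power(target: Sequence[Optional[float]],
              neighbors: Sequence[Sequence[Optional[float]]],
              distances: Sequence[float],
              candidates: Sequence[float]) -> Tuple[float, float]:
    """Return (exponent, MAE) for the candidate exponent with the lowest MAE.

    Assumption: a simple search over `candidates` stands in for CMA-ES.
    """
    best: Optional[Tuple[float, float]] = None
    for power in candidates:
        estimates = [idw_estimate(day, distances, power) for day in neighbors]
        mae = mean_absolute_error(target, estimates)
        if best is None or mae < best[1]:
            best = (power, mae)
    if best is None:
        raise ValueError("no candidate exponents supplied")
    return best


def impute(target: Sequence[Optional[float]],
           neighbors: Sequence[Sequence[Optional[float]]],
           distances: Sequence[float],
           power: float) -> List[Optional[float]]:
    """Keep observed values and fill gaps with the weighted neighbor estimate."""
    return [obs if obs is not None else idw_estimate(day, distances, power)
            for obs, day in zip(target, neighbors)]
```

In this sketch, `fit_power` would be called on the days where the target station has observations, and the selected exponent would then be passed to `impute` to fill the gaps. The neighbor set itself would come from a separate clustering step, such as the K-means grouping described above.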

Keywords: missing data; rainfall data; K-means clustering; optimization; deterministic interpolation methods.
