SciELO - Scientific Electronic Library Online

 
vol.67 número1The first principle calculations of structural, magneto-electronic, elastic, mechanical, and thermoelectric properties of half-metallic double perovskite oxide Sr2TiCoO6Modeling of the deuteron wave function in coordinate representation and calculations of polarization characteristics índice de autoresíndice de materiabúsqueda de artículos
Home Pagelista alfabética de revistas  

Servicios Personalizados

Revista

Articulo

Indicadores

Links relacionados

  • No hay artículos similaresSimilares en SciELO

Compartir


Revista mexicana de física

versión impresa ISSN 0035-001X

Rev. mex. fis. vol.67 no.1 México ene./feb. 2021  Epub 31-Ene-2022

https://doi.org/10.31349/revmexfis.67.123 

Research

Medical Physics

Mathematical modeling and forecasting of COVID-19: experience in Santiago de Cuba province

E. E. Ramirez-Torres1 

A. R. Selva Castañeda2 

Y. Rodríguez-Aldana3 

S. Sánchez Domínguez4 

L. E. Valdés García5 

A. Palú-Orozco6 

E. R. Oliveros-Domínguez7 

L. Zamora-Matamoros8 

R. Labrada-Claro9 

M. Cobas-Batista10 

D. Sedal-Yanes11 

O. Soler-Nariño12 

P. A. Valdés-Sosa13 

J. I. Montijano14 

L. E. Bergues Cabrales15 

1Departamento de Biomédica, Facultad de Ingeniería en Telecomunicaciones, Informática y Biomédica, Universidad de Oriente, Santiago de Cuba, Cuba.

2Departamento de Telecomunicaciones, Facultad de Ingeniería en Telecomunicaciones, Informática y Biomédica, Universidad de Oriente, Santiago de Cuba, Cuba

3Centro de Estudios de Neurociencias y Procesamiento de Señales, Universidad de Oriente, Santiago de Cuba, Cuba

4Departamento de Matemática, Facultad de Ciencias Naturales y Exactas, Universidad de Oriente, Santiago de Cuba, Cuba

5Centro Provincial de Higiene, Epidemiología y Microbiología, Santiago de Cuba, Cuba

6Centro Provincial de Higiene, Epidemiología y Microbiología, Santiago de Cuba, Cuba

7Dirección de Informatización, Universidad de Oriente, Santiago de Cuba, Cuba

8Departamento de Matemática, Facultad de Ciencias Naturales y Exactas, Universidad de Oriente, Santiago de Cuba, Cuba

9Departamento de Matemática, Facultad de Ciencias Naturales y Exactas, Universidad de Oriente, Santiago de Cuba, Cuba

10Departamento de Matemática, Facultad de Ciencias Naturales y Exactas, Universidad de Oriente, Santiago de Cuba, Cuba

11Facultad de Ciencias Sociales, Universidad de Oriente, Santiago de Cuba, Cuba

12Facultad de Ciencias Sociales, Universidad de Oriente, Santiago de Cuba, Cuba

13Join China-Cuba Lab for Frontier Research in Translational Neurotechnology, Sichuan, China

14Instituto Universitario de Investigación de Matemáticas y Aplicaciones, Universidad de Zaragoza, Zaragoza, Spain

15Centro Nacional de Electromagnetismo Aplicado, Universidad de Oriente, Santiago de Cuba, Cuba. e-mail: berguesc@yahoo.com


Abstract

In the province of Santiago de Cuba, Cuba, the COVID-19 epidemic has a limited progression that shows an early small-number peak of infections. Most published mathematical models fit data with high numbers of confirmed cases. In contrast, small numbers of cases make it difficult to predict the course of the epidemic. We present two known models adapted to capture the noisy dynamics of COVID-19 in the Santiago de Cuba province. Parameters of both models were estimated using the approximate-Bayesian-computation framework with dedicated error laws. One parameter of each model was updated on key dates of travel restrictions. Both models approximately predicted the infection peak and the end of the COVID-19 epidemic in Santiago de Cuba. The first model predicted 57 reported cases and 16 unreported cases. Additionally, it estimated six initially exposed persons. The second model forecasted 51 confirmed cases at the end of the epidemic. In conclusion, an opportune epidemiological investigation, along with the low number of initially exposed individuals, might partly explain the favorable evolution of the COVID-19 epidemic in Santiago de Cuba. With the available data, the simplest model predicted the epidemic evolution with greater precision, and the more complex model helped to explain the epidemic phenomenology.

Keywords: COVID-19 epidemic; mathematical modelling; approximate Bayesian computation; Poisson noise

PACS: 02.70.Uu; 02.30.Zz; 87.10.Ed; 87.15.Aa; 87.19.Xx

1.Introduction

The ongoing COVID-19 epidemic represents a global challenge for health-care systems. The SARS-COV-2 coronavirus causing the COVID-19 is highly transmissible 1,2. Nevertheless, the epidemic propagation in a region/country is influenced by several factors, including governmental measures 3. The Cuban government has implemented several strategies of social isolation to control this epidemic. As a result, a small number of cases is reported in the province of Santiago de Cuba. Mathematical models are useful for understanding the COVID-19 transmission and the influence of government interventions 4-7. Currie et al. 4 suggest that system dynamics (differential-equation-based modeling) is suitable to model government decisions that cause social distancing and isolation.

The most common system dynamics in epidemic modeling are SIR (susceptible - infectious - removed) and SEIR (susceptible-exposed-infectious-removed) type. The classic SIR and SEIR models assume that individuals change classes (e.g., from susceptible to infected) at constant rates. This assumption is not realistic in most situations 8. Additionally, it does not explain why the effective reproduction number of COVID-19 changes over time 9. Some authors introduce time-dependent parameters to formulate realistic models of the current pandemic 5 10-12.

A challenge for COVID-19 modelers is to describe the difference between total cases and diagnosed cases. One approach is to exclude this difference and fit the model to reported data 16-8. Another option is to consider undetected cases as a compartment of the differential equations system 6,7,12,19,20. Finally, some authors avoid the complication of adding a new compartment, and they model the error with different ad hoc methods 5,10,21,22.

Mathematical models have been used to forecast the dynamics of COVID-19 in delimited geographical areas. These models usually fit big-number data after several days of epidemic evolution 7,10,23-26. The official data reported from Santiago de Cuba show a maximum value of 36 active cases reached 31 days after the first confirmed case. This peak is followed by a decreasing trend. These small and noisy data make difficult the model fitting. A common strategy to describe COVID-19 dynamics is to update one or more model parameters based on key days of government actions 7,10,12.

Ordinary differential equation (ODE) based models are widely used to capture the kinetic of biological systems and to predict the behaviour of time-dependent variables. The minimization of a cost function (as a strategy for fitting a given dynamic model) does not take into account the uncertainty of parameters. Additionally, this approach can lead to unrealistic fittings 27.

Bayesian inference offers a platform that deals naturally with both problems, but it requires the calculation of likelihood functions 13-15,28. In many practical problems, likelihood functions become analytically intractable; nevertheless, data can be simulated from a parameter vector by applying some algorithm that depends on the model. Thus, several techniques have been developed to infer parameters without using likelihood functions. These techniques are usually called approximate Bayesian computation (ABC) or free likelihood inference 28,29.

ABC can be used to estimate parameters when the available data of an infectious disease is coarse 30. Besides, ABC accomplishes parameters estimation for models that include dedicated laws to describe the error between total cases and reported cases.

The aim of this paper is to adapt two models reported in the literature to capture the dynamics of the noisy and small-number data of COVID-19 in Santiago de Cuba. For this, we modify an extended SEIR model (10) by introducing a dynamic-mode gamma distribution to describe the error between total and reported cases. Additionally, we adapt the classic SIR model reported in 31 by introducing Poisson noise in the observation of removed cases. For both models, we update a parameter on critical days of government actions and estimate the parameter values with ABC. We are not aware that these modifications have been reported in the literature.

2.Methods

2.1.Data

Because no personal data of patients were used, the ethical approval and individual consent were not applicable.

We used official data of cumulative, infected, and removed cases of COVID-19 in Santiago de Cuba province. These data were reported from March 20th to May 4th, 2020. Daily data was provided by the Provincial Direction of Public Health of Santiago de Cuba and visualized in an information board from a web platform (http://www.covid19scu.uo.edu.cu). Forty-nine patients with SARS-COV-2 were confirmed using real-time polymerase chain reaction (PCR) tests. PCR tests were made in the Laboratory of Molecular Biology, Provincial Center of Hygiene, Epidemiology and Microbiology of Santiago de Cuba.

2.2.Epidemiological evidence

The first day of the COVID-19 in Santiago de Cuba province counted when the first case was diagnosed (March 17th, 2020). The first case (day one), the second case (day four), and the third case (day seven) were contacts of different travelers from various countries with COVID-19 transmission reported. The number of diagnosed cases increased from day seven and they were contacts of three first confirmed cases. The epidemiological investigation revealed that the primary infectious focus was small in Santiago de Cuba province due to the small number of travelers that arrived in this province. All confirmed cases with COVID-19 were attended and treated in the Dr. José Joaquín Castillo Duany hospital of Santiago de Cuba. The confirmed patients remained at least 14 days admitted to the hospital. When two PCR tests were negative, patients were sent home under medical supervision. Additionally, all contacts of these confirmed patients were quickly isolated during 14 days to detect symptomatic and non-symptomatic cases. After 14 days of isolation, unconfirmed cases of COVID-19 were sent home and daily evaluated by physicians.

The government of Santiago de Cuba took different actions that restrained COVID-19 transmission, such as the urban and interprovincial travel restrictions; workers were encouraged to work from their homes; safe transportation was provided for essential people in the different prioritized activities; school activity was temporarily suspended at all educational levels; social movement was restrained; and the modified quarantine was established on the case clusters of importance for disease transmission.

2.3.Methodology

In order to understand and predict COVID-19 in Santiago de Cuba, we have chosen two deterministic models: one more explanatory and the other more parsimonious. The first model allowed to elaborate hypotheses about some possible causes of the few reported cases. The second model was used to make faster and accurate predictions of the epidemic evolution. For both models, we updated a parameter on key days of government action. Also, we included error models to explain the variability of the observed data. Details of the chosen models are shown in the following section.

2.4.Models

Lin et al. 10 reported an SEIR model with three additional classes: the total population size (N), the public perception of risk (D), and the cumulative cases number (C). Also, they modeled the zoonotic transmission during the first month and then only person-to-person transmission, taking into account the emigration rate inside the country.

Unlike 10, we did not consider the zoonotic transmission because infection in Cuba was not originated by animals. Also, we neglected the emigration rate for two reasons: first, the few days elapsed before the transport restriction by air, land, and sea; second, the social isolation actions taken, such as remote work, stopping school activities, among others. As a result of these two assumptions, we obtained the following model, named model-I:

{dSdt=-β(t)NSIdEdt=β(t)NSI-σEdIdt=σE-γIdRdt=γIdDdt=dγI-λDdCdt=σE, (1)

with

β(t)=β0(1-α)(1-DN)k, (2)

where σ-1, γ-1, d, λ-1, β(t), β0, a and k are the mean latent period, the mean infectious period, the proportion of severe cases, the mean duration of public reaction, the dynamic transmission rate, the initial transmission rate, the governmental measure strength and the intensity of individual response, respectively. The last expression of Eq. (1) follows from C = I + R. As a was considered a stepwise function in 10, we used two milestone dates: the days when interprovincial transport (April 10th, 2020) and urban transport (April 25th, 2020) were restricted.

To describe the error in reported data, we assumed that the difference CT - CR between total cases (CT) and reported cases (CR) is a gamma probability distribution. The assumption of a discrete random variable as a continuous one provided some advantages in the assessment of parameter estimation. For instance, it was possible to perform a nonparametric continuous hypothesis test over residues. The shape parameter of the gamma distribution was estimated, and the scale parameter was changed dynamically. This approach allowed us to place the mode closer to zero in the most optimistic scenarios. We used an incremental error in the reported rate during the early days of the epidemic. A dedicated law was formulated for the number of unreported cases per each reported one (U(t)), given by

U(t)={pt+δ, ift<dTpdT+δ,iftdT, (3)

where p and δ are tuning parameters and dT is the time from which COVID-19 transmission showed higher regularity in the case count. We empirically set the values in p = 0.01 day-1, δ = 0.04 and dT = 8 days. We assumed that U(t) increased during the days when almost no new cases arose, and after that, it remained roughly constant. This assumption is consistent with a previous study that reported growth in undocumented cases during the early days of the epidemic. The expression for the dynamic mode of error (M(t)) is

M(t)=CTU(t)+1. (4)

The difference between CT and M(t) was considered a time-dependent approximation of undetected cases.

Model-II consisted of a standard SIR model 31, adapted with the introduction of Poisson noise in the removed cases. For the parameter calibration of model-II, reported cumulative and removed cases were considered. We described β(t) in the model-II as a stepwise function that included changes in key dates. The error was assumed as a standard normal distribution for model-II.

2.5.Parameter estimation

Parameter estimation was performed with the ABC approach 28,33,34. We preferred the sequential Monte Carlo (ABC SMC) method described in the Algorithm 4.8 of 34 because of its well-documented results. Nevertheless, the ABC SMC method can perform slowly computing adaptive tolerances.

This sampler (Algorithm 1 in Appendix A) outsmarts its predecessors in automatic computing of tolerances based on the effective sample size of the previous population. We used N = 104, h0 = 108, T = 5 and a Gaussian acceptance kernel function Kh. We refer readers to for a deeper understanding of ABC samplers and their adjustable parameters.

The reweight step of the chosen ABC SMC sampler led to an optimization problem in the calculation of hm. Thus, it may introduce delays in the computation process, especially if the posterior distribution is distant from the prior distribution. Therefore, we introduced the hybrid version of ABC techniques, named ABC H algorithm (Algorithm 2 in Appendix B) as a hybrid of rejection sampler (ABC RS) and Markov chain Monte Carlo (ABC MCMC) approaches. Although we have not demonstrated the ergodicity of the ABC H algorithm, we empirically ascertained that it found a maximum-a-posteriori (MAP) located within the parameter region determined by the ABC SMC algorithm, consuming less computational time. Hence, we preferred to use the MAPs obtained by the ABC H sampler for quick prediction and the ABC SMC process for more in-depth posterior Bayesian analysis. Specifically, we used the ABC SMC results to compute the credible intervals of posterior distributions. Details of the algorithms ABC SMC and ABC H are shown in A and B, respectively.

As far as we know, the ABC H sampler is a new idea, and its main disadvantage may be the inaccuracy of the sampling from the posterior distribution. This issue was not critical because we aimed to give a quick prediction based on the MAP. Nevertheless, we hypothesize that for a reasonably high value assigned to the fourth sampling choice (w3, corresponding to a conventional ABC MCMC), it is possible to obtain posterior distributions similar to ABC SMC results. Further research is required to verify this hypothesis. We used the scheme w={0.2,0.2,0.3,0.3} for sampling choice. Other values used in ABC H sampler were N = 103, h = 800, and a Gaussian acceptance kernel function Kh.

Because actual data can be directly compared to simulated data for ODE models, it was not necessary to use summary statistics as criteria of divergence. We used the sum of squared errors as the distance criterion, which is closely related to the likelihood 33,35,36, and suitable for infectious-disease modeling 30. We used a methodology for the prior-distribution choice that consisted of several trials mimicking different favorable and unfavorable scenarios, followed by feedback from epidemiologists.

Values of d = 0.2 and σ = 1/3 day-1 in model-I were reported in 10. The remaining parameter values and the initial conditions E(0) and D(0)were estimated with ABC starting from locally uniform prior distributions.

In our approach, two key dates of travel restrictions divided the evolution of the epidemic into three periods. Three governmental action strengths were estimated for these periods: before travel restrictions (α1), after interprovincial travel restriction (α2), and after the urban public transport restriction in Santiago de Cuba (α3). The three values β1, β2, and β3 were considered in model-II taking into account the same criteria for key dates mentioned above.

We calculated the coefficient of determination R2 for the regression analysis of MAP fit to assess the quality of parameter-estimation results for models I and II. We performed the frequentist and nonparametric Mann-Whitney U test , taking as null hypothesis that the actual residues obtained arose from the assumed gamma distribution. Furthermore, an approximation of maximum likelihood (ML) ratio between models (ML of model-I to ML of model-II) was computed with ABC output.2

All simulations were made in a 256-core high-performance-computing (HPC) processor with 256 GB RAM using Python 3.6 39. The HPC machine was acquired by the Flemish Development Cooperation through VLIR-UOS (Flemish Interuniversity Council-University Cooperation for Development of Belgium) in the context of the Institutional University Cooperation program with Universidad de Oriente, Santiago de Cuba.

3.Results

Model-I simulation (Fig. 1) was performed with MAPs calculated with ABC SMC. All parameters and results of model-I fitting are summarized in Table I. The Mann-Whitney U test performed on the residues of model-I fit did not reject the null hypothesis based on a 5% significance level.

Figure 1 Estimated evolution of COVID-19 epidemic in Santiago de Cuba with model-I. The selected parameter values for simulation correspond to the MAP of the approximate posterior distribution. The model-predicted and reported infected individuals a), and the cumulative b), both predicted in simulation (C T and C R predicted) and observed (C R ) cases are plotted. The shaded area is limited by the extremes of 95% credible intervals of posterior distributions obtained with the ABC SMC sampler. 

Table I Summary table of model-I parameters and results. 

Parameter Notation Prior or fixed value MAP (ABC SMC) MAP (ABC H)
Transmission rate

β0

α1

U (0.1, 1.5)

0.4811 days-1

CI(0.3680, 0.5468)

0.5245

0.4340 days-1

0.6179

Fitting parameter α2 U (0.1, 1)

CI(0.3717, 0.7534)

0.2855

0.1801
α3

CI(0.1775, 0.3692)

0.9665

0.9417
Intensity of responds k U (600, 4000)

CI(0.8527, 1.0000)

3430

3405
CI(3335, 3598)
Mean latent period σ-1

3 days

U (1, 40)

- 21 days - 25 days
Mean infectious period γ-1
Proportion of severe cases d 0.2 - -
Mean duration of public reaction λ -1 U(30,30) 20 days 50 days
public reaction Initial susceptible S(0) 9.41 · 105 CI(1, 39) - -
population
Initial exposed E(0) U (1, 10) 6 6
population CI(4, 7)
Initial infectious I(0) 1 - -
population
Initial removed R(0) 0 - -
population
Initial risk perception D(0) U (1, 103) 163 148
CI(142, 188)
Initial cumulative cases C(0) 1 - -

Figure 2 was obtained using the MAPs of posterior distributions of model-II parameters that were computed with ABC SMC. These MAPs and other values from model-II are summarized in Table II.

Figure 2 Estimated evolution of COVID-19 epidemic in Santiago de Cuba with model-II. Predicted infected individuals follow the dynamics of the reported ones a). A long-term scenario of both reported and predicted cumulative cases, and the area limited by the extremes of the 95% credible intervals of posterior distributions obtained with the ABC SMC sampler b) are plotted. 

Table II Summary table of model-II parameters. 

Parameter Notation Prior or fixed value MAP (ABC SMC) MAP (ABC H)
β1 0.5674 days-1 0.5149 days-1
Transmission rate β2 U (0.1, 0.9)

CI(0.3067, 0.7534)

0.5855 days-1

0.6056 days-1
β3

CI(0.4794, 0.7088)

0.3555 days-1

0.3548 days-1
Mean infectious period γ-1 U (1, 10)

CI(0.2443, 0.4525)

2 days

2 days
CI(1, 3)
Interval expectation λ U (1, 20) 18 days 18 days
of removed Initial susceptible S(0) 9.41x105 CI(12, 23) - -
population
Initial infectious I(0) 1 - -
population
Initial removed R(0) 0 - -
population

In all trials made, the ABC H sampler consumed less time and reached similar parameter regions to the ones obtained by ABC SMC (Figs. 3-6) for the same prior distributions.

Model-I predicted a total of 73 infected at the end of the COVID-19 epidemic in Santiago de Cuba, with 21% of undocumented cases. The final value of M(t) was 57, being the model-I prediction of confirmed cases. Model-II forecasted 51 confirmed cases at the end of the epidemic. Additionally, the ML ratio calculated was 0.941; and the values of determination coefficients were R2 = 0.9450 for model-I and R2 = 0.9781 for model-II.

4.Discussion

COVID-19 dynamics in Santiago de Cuba is difficult to capture empirically without the introduction of stepwise functions for parameter 𝛼 in model-I and parameter β in model-II. Results show that model-I (Fig. 1) and model-II (Fig. 2), fitted with ABC algorithms, can predict an early small-number peak of infected cases and the approximate ending of the COVID-19 in Santiago de Cuba. The non-rejection of the null hypothesis on the residuals allows us to continue considering the validity of the error law for model-I.

All MAPs calculated with ABC H are within credible intervals obtained with ABC SMC, except parameter λ of model-I. Even for this parameter, the histogram generated by ABC H partially overlapped the one generated by ABC SMC. We deduce that both algorithms find the same parameter region for the chosen prior. A visual comparison of Fig. 3 with Fig. 4, and Fig. 5 with Fig. 6, reveals that ABC H does not sample from the same distribution as ABC SMC. Although this does not affect the use of ABC H in this work, a modification of the ABC H sampler is required to make it a fully functional ABC algorithm. The scheme w={1,0,0,0} for sampling choice corresponds to a ABC RS process 33,40. The scheme w={0,0,0,1} resembles an ABC MCMC method 41. ABC RS and ABC MCMC are algorithms that sample from adequate posterior distributions but in an inefficient time, so other intermediate schemes may be evaluated in further works.

Figure 3 Posterior density estimation of model-I parameters using ABC SMC sampler and COVID-19 data from Santiago de Cuba. The estimated parameters are β0, k, λ, α1, α2, α3, γ, E(0), and the scale parameter of the gamma distribution (α GAM ). 

Figure 4 Posterior density estimation of model-I parameters using ABC H sampler and COVID-19 data from Santiago de Cuba. The estimated parameters are β0, k, λ, α1, α2, α3, γ, E(0), and the scale parameter of the gamma distribution α GAM

Figure 5 Posterior density estimation of model-II parameters using ABC SMC sampler and COVID-19 data from Santiago de Cuba. The estimated parameters are β1, β2, β3, γ, and the interval expectation of the Poisson distribution (λPoisson). 

Figure 6 Posterior density estimation of model-II parameters using ABC H sampler and COVID-19 data from Santiago de Cuba. The estimated parameters are β1, β2, β3, γ, and the interval expectation of the Poisson distribution (λPoisson). 

The computed percentage of undocumented cases is in contrast with the 86% of unreported infected cases estimated for China 43. This result suggests that the limited progression of the COVID-19 epidemic in Santiago de Cuba may be due to opportune epidemiological investigations, effective control measures in each source of infection, and a low number of initially exposed individuals (E(0) = 6). Additionally, case clusters prevail over transmission clusters for COVID-19 in Santiago de Cuba province. The policymakers used these estimates to complement an investigation into the initial infectious load of SARS-COV-2 in Santiago de Cuba.

The MAPs of a posterior distributions were estimated at α = 0.5245, α2 = 0.2855, and α3 = 0.9665 (Table I). Value of 𝛼 is expected to grow as government measures increase but, surprisingly, the value after the closure of transportation between provinces is lower than before (α2 < α1). We consider two possible explanations for this result:

First, the interpretation of α as government action is somewhat forced. In a previous study 42, α was interpreted as seasonality of transmission associated with the school calendar for a similar formulation of β(t). Consequently, the interpretation of α remains open.

Second, the increase in PCR tests coincides with the period corresponding to α2. Neither the deterministic model nor the error law explicitly takes into account the variability of PCR tests. This variability influences the number of detected cases, and therefore the estimation of all transmission-rate parameters.

Thus, in our results, we do not interpret α as government action. We consider 𝛼 as a multifactorial parameter that works as a calibrator of β(t) dynamics. The only dynamic variable the explicitly influences β(t) in Eq. (2) is 𝐷 10 42. This influence is in agreement with the epidemiological and sociological researches carried out in Santiago de Cuba, which confirm the reduction of the transmission rate by the high-risk perception of both decision-makers and population. Nevertheless, D depends explicitly on I, which in turn depends on E; so the variable D is affected by changes in these state variables; and therefore, they indirectly influence the dynamics of β(t).

The estimated value k = 3430 (k = 1117 in 10) may indicate a good response from individuals in Santiago de Cuba to COVID-19 epidemic.

The mean infection period for model-I may be explained by the inclusion of unreported cases and the delay in the reports of those recovered. The value of this parameter for model-II is plausible because it is the number of days expected for a diagnosed patient to recover, and the diagnosis is usually confirmed several days after the exposition. The interval expectation of removed, λPoisson=18 days with CI[12,23], seems consistent with the patient-discharge policy implemented by the Ministry of Public Health of the Republic of Cuba.

One possible explanation for the ML ratio is that the error law for model-I (Eq. (3)) can only partially describe the complex variability of the reporting rate. As model-II is more parsimonious than model-I, we did not perform some further analysis like Akaike’s information criterion (AIC) or Bayes factor, which should favour model-II over model-I. On the other hand, the model-I allowed us to estimate the number of individuals initially exposed, and to compute the percentage of unreported cases. We argue that simpler models like model-II should be preferred for accurate prediction, whereas more explanatory models like the model-I are to be used for phenomenological analysis.

The chat on the left in Fig. 2 shows a remarkable tracking of the data variation by the fitted curve. A conventional SIR model cannot achieve this. Model-II accomplishes this tracking by dividing the adjustable parameters of the SIR model into three time periods. Furthermore, the cases recovered on the day estimated by the SIR model are not subtracted, but on the day expected by the Poisson error. In other words, model-II is advantageous over a conventional SIR as it is a piecewise SIR with a Poisson noise in the number of removed cases.

We hold that the difference between the predicted and reported data in both models is acceptable because of reporting delays, which are in agreement with previous studies 10,32,44.

One advantage of using a Bayesian approach in parameter estimation is to use different prior distributions in order to set favourable and unfavourable scenarios for models. When the data do not show a peak yet, most models become highly unidentifiable, in such a way that one can find several parameter combinations leading to reasonable fits. Therefore, we deem it essential to complement the mathematical analysis with an evaluation of epidemiological investigation, looking for the most plausible prediction.

The error law for the model-I (Eq. (3)) and the assumption of the gamma distribution cannot fully capture the variability of the reporting rate. This approach is a baseline for future improvements.

The influences of β0, 𝛼, 𝐷 and 𝑘 in β(t)should not be discussed separately. The variables β0, a, D, and k are dynamically interrelated, taking into account that the society (complex system) is constituted by the interaction and coupling of its elements as a global unit, in agreement with 45.

Further studies will evaluate the influence of other variables on the favourable evolution of COVID-19 in Santiago de Cuba: government-induced social isolation, the social response of individuals, and the environmental factors (e.g., temperature and relative humidity). Additionally, future works should study the ABC H performance for intermediate values of w, looking for an acceptable compromise of efficiency and accuracy. Another possibility is to study adaptive schemes of w, in order to accept a group of candidates in a short time, and then ensure sampling of the posterior distribution.

5.Conclusions

Our study indicates that modified SIR and SEIR models, combined with ABC for parameter estimation, can follow the small-number dynamics of the COVID-19 epidemic. More precisely, models follow dynamics with no explosions of new cases and with small-number peaks by including some parameters as stepwise functions of critical days. An opportune epidemiological investigation, along with a low number of initially exposed individuals, may partly explain the favourable evolution of the COVID-19 epidemic in Santiago de Cuba.

Acknowledgements

We gratefully acknowledge the help provided by Professor Scott Sisson in understanding the use of the ABC SMC method. Besides, we appreciate the technical support and invaluable feedback provided by Alexander Alexeis Suárez León, Antonio Iván Ruíz Chaveco, Nelsa María Sagaró Del Campo, Yase Bring Pérez, Rodolfo Hernández Despaigne, Marlon César Texidor Garzón, Digna de la Caridad Bandera Jiménez, Adriana Rodríguez Valdés, Manuel de Jesús Salvador Álvarez, Hilda Morandeira Padrón, Itciar Arias Portales and Sergio Miranda Reyes. Additionally, the authors thank Universidad de Oriente, DATYS-Santiago de Cuba, Dirección Provincial de Salud Pública, and the provincial government officials of Santiago de Cuba.

References

1. S. Sanche, Y. T. Lin, C. Xu, E. Romero-Severson, N. W. Hengartner, and R. Ke, The novel coronavirus, 2019-nCoV, is highly contagious and more infectious than initially estimated, arXiv preprint arXiv:2002.03268 (2020), Available from: https://arxiv.org/ftp/arxiv/papers/2002/2002.03268.pdf. [ Links ]

2. L. Ferretti et al., Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing, Science 368 (2020) eabb6936, https://dx.doi.org/10.1126/science.abb6936. [ Links ]

3. R. M. Anderson, H. Heesterbeek, D. Klinkenberg, and T. D. Hollingsworth, How will country-based mitigation measures influence the course of the COVID-19 epidemic?, The Lancet 395 (2020) 931, https://doi.org/10.1016/S0140-6736(20)30567-5. [ Links ]

4. C. S. M. Currie et al., How simulation modelling can help reduce the impact of COVID-19, J. Simulat. 14 (2020) 83, https://doi.org/10.1080/17477778.2020.1751570. [ Links ]

5. R. F. Reis et al., Characterization of the COVID-19 pandemic and the impact of uncertainties, mitigation strategies, and underreporting of cases in South Korea, Italy, and Brazil, Chaos Solitons Fract. 136 (2020) 109888, https://doi.org/10.1016/j.chaos.2020.109888. [ Links ]

6. S. Zhao and H. Chen, Modeling the epidemic dynamics and control of COVID-19 outbreak in China, Quant. Biol. 8 (2020) 11 https://doi.org/10.1007/s40484-020-0199-0. [ Links ]

7. Z. Liu, P. Magal, O. Seydi, and G. Webb, Understanding unreported cases in the COVID-19 epidemic outbreak in Wuhan, China, and the importance of major public health interventions, Biology 9 (2020) 50, https://doi.org/10.3390/biology9030050. [ Links ]

8. A. L. Lloyd, Destabilization of epidemic models with the inclusion of realistic distributions of infectious periods, P. Roy. Soc. B-Biol. Sci. 268 (2001) 985. https://doi.org/10.1098/rspb.2001.1599. [ Links ]

9. C. Anastassopoulou, L. Russo, A. Tsakris, and C. Siettos, Data-based analysis, modelling and forecasting of the COVID-19 outbreak, PloS One 15 (2020) e0230405. https://doi.org/10.1371/journal.pone.0230405. [ Links ]

10. Q. Lin et al., A conceptual model for the coronavirus disease 2019 (COVID-19) outbreak in Wuhan, China with individual reaction and governmental action, Int. J. Infect. Dis. 93 (2020) 211. https://doi.org/10.1016/j.ijid.2020.02.058. [ Links ]

11. J. Wangping et al., Extended SIR prediction of the epidemics trend of COVID-19 in Italy and compared with Hunan, China, Front. Med. 7 (2020) 169. https://doi.org/10.3389/fmed.2020.00169. [ Links ]

12. G. Giordano et al., Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy, Nat. Med. 26 (2020) 855. https://doi.org/10.1038/s41591-020-0883-7. [ Links ]

13. A. Gelman, Bayes, Jeffreys, prior distributions and the philosophy of statistics, Stat. Sci. 24 (2009) 176. https://doi.org/10.1214/09-STS284D. [ Links ]

14. J. K. Kruschke, Doing Bayesian Data Analysis (Academic Press, Cambridge, 2015). [ Links ]

15. V. Amrhein, T. Roth, and F. Korner-Nievergelt, Identical twins and Bayes’ theorem in the 21st century, F1000Research 2 (2015) 278. https://doi.org/10.12688/f1000research.2-278.v2 [ Links ]

16. E. B. Postnikov, Estimation of COVID-19 dynamics “on a back-of-envelope": Does the simplest SIR model provide quantitative parameters and predictions?, Chaos Solitons Fract. 135 (2020) 109841. https://doi.org/10.1016/j.chaos.2020.109841. [ Links ]

17. D. Fanelli and F. Piazza, Analysis and forecast of COVID-19 spreading in China, Italy and France, Chaos Solitons Fract. 134 (2020) 109761. https://doi.org/10.1016/j.chaos.2020.109761. [ Links ]

18. M. D’Arienzo and A. Coniglio, Assessment of the SARS-CoV-2 basic reproduction number, R0, based on the early phase of COVID-19 outbreak in Italy, Biosaf. Health 2 (2020) 57, https://doi.org/10.1016/j.bsheal.2020.03.004. [ Links ]

19. C. Reno et al., Forecasting COVID-19-Associated Hospitalizations under Different Levels of Social Distancing in Lombardy and Emilia-Romagna, Northern Italy: Results from an Extended SEIR Compartmental Model, J. Clin. Med. 9 (2020) 1492. https://doi.org/10.3390/jcm9051492. [ Links ]

20. B. Ivorra, M. R. Ferrández, M. Vela-Pérez, and A. Ramos, Mathematical modeling of the spread of the coronavirus disease 2019 (COVID-19) taking into account the undetected infections. The case of China, Commun. Nonlinear Sci. 88 (2020) 105303 https://doi.org/10.1016/j.cnsns.2020.105303. [ Links ]

21. G. C. Calafiore, C. Novara, and C. Possieri, A modified SIR model for the COVID-19 Contagion in Italy, https://arxiv.org/pdf/2003.14391.pdf. [ Links ]

22. A. Lachmann, K. M. Jagodnik, F. M. Giorgi, and F. Ray, Correcting under-reported COVID-19 case numbers: estimating the true scale of the pandemic, https://doi.org/10.1101/2020.03.14.20036178. [ Links ]

23. E. L. Piccolomini and F. Zama, Preliminary analysis of COVID-19 spread in Italy with an adaptive SEIRD model, https://arxiv.org/abs/2003.09909. [ Links ]

24. G. Guzzetta et al., Potential short-term outcome of an uncontrolled COVID-19 epidemic in Lombardy, Italy, February to March 2020, Eurosurveillance 25 (2020) 2000293. https://doi.org/10.2807/1560-7917.ES.2020.25.12.2000293. [ Links ]

25. A. J. Kucharski et al., Early dynamics of transmission and control of COVID-19: a mathematical modelling study, Lancet. Infect. Dis. 20 (2020) 553. https://doi.org/10.1016/S1473-3099(20)30144-4. [ Links ]

26. W. C. Roda, M. B. Varughese, D. Han, and M. Y. Li, Why is it difficult to accurately predict the COVID-19 epidemic?, Infect. Dis. Model. 5 (2020) 271. https://doi.org/10.1016/j.idm.2020.03.001. [ Links ]

27. D. Fernández Slezak et al., When the optimal is not the best: parameter estimation in complex biological models, PLOS ONE 5 (2020) e13283. https://doi.org/10.1371/journal.pone.0013283. [ Links ]

28. M. A. Beaumont, Approximate Bayesian computation in evolution and ecology, Annu. Rev. Ecol. Evol. Syst. 41 (2010) 379. https://doi.org/10.1146/annurev-ecolsys-102209-144621. [ Links ]

29. J. Lintusaari et al., Fundamentals and recent developments in approximate Bayesian computation, Syst. Biol. 66 (2017) e66. https://doi.org/10.1093/sysbio/syw077. [ Links ]

30. A. Minter and R. Retkute, Approximate Bayesian Computation for infectious disease modelling, Epidemics 29 (2019) 100368. https://doi.org/10.1016/j.epidem.2019.100368. [ Links ]

31. H. H. Weiss, The SIR model and the foundations of public health, Mat. Mat. 2013 (2013) 17. [ Links ]

32. S. Zhao et al., Estimating the unreported number of novel coronavirus (2019-nCoV) cases in China in the first half of January 2020: a data-driven modelling analysis of the early outbreak, J. Clin. Med. 9 (2020) 388. https://doi.org/10.3390/jcm9020388. [ Links ]

33. T. Toni et al., Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems, J. R. Soc. Interface 6 (2009) 187. https://doi.org/10.1098/rsif.2008.0172. [ Links ]

34. S. A. Sisson, Y. Fan, and M. Beaumont, Handbook of Approximate Bayesian Computation (Chapman and Hall/CRC, 2018). https://doi.org/10.1201/9781315117195. [ Links ]

35. H. Motulsky and A. Christopoulos, Fitting Models to Biological Data Using Linear and Nonlinear Regression: A Practical Guide to Curve Fitting (Oxford University Press, 2004). [ Links ]

36. A. Tarantola, Inverse Problem Theory and Methods for Model Parameter Estimation (Society for Industrial and Applied Mathematics, Philadelphia, 2005), https://doi.org/10.1137/1.9780898717921. [ Links ]

37. P. E. McKnight and J. Najab, Mann-Whitney U Test, in The Corsini Encyclopedia of Psychology, edited by I. B. Weiner and W. E. Craighead (Wiley, New York, 2009), https://doi.org/10.1002/9780470479216.corpsy0524. [ Links ]

38. G. W. Corder and D. I. Foreman, Nonparametric Statistics for Non-Statisticians: A Step-by-Step Approach (Wiley, New York, 2009), https://doi.org/10.1002/9781118165881. [ Links ]

39. G. V. Rossum, The Python Library Reference (12th Media Services, Suwanee, 2018). [ Links ]

40. J. K. Pritchard, M. T. Seielstad, A. Perez-Lezaun, and M. W. Feldman, Population growth of human Y chromosomes: a study of Y chromosome microsatellites, Mol. Biol. Evol. 16 (1999) 1791. https://doi.org/10.1093/oxfordjournals.molbev.a026091. [ Links ]

41. P. Marjoram, J. Molitor, V. Plagnol, and S. Tavaré, Markov chain Monte Carlo without likelihoods, Proc. Natl. Acad. Sci. USA 100 (2003) 15324. https://doi.org/10.1073/pnas.0306899100. [ Links ]

42. D. He et al., Inferring the causes of the three waves of the 1918 influenza pandemic in England and Wales, P. Roy. Soc. B-Biol. Sci. 280 (2013) 20131345. https://doi.org/10.1098/rspb.2013.1345. [ Links ]

43. R. Li et al., Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2), Science 368 (2020) 489. https://doi.org/10.1126/science.abb3221. [ Links ]

44. A. R. Tuite and D. N. Fisman, Reporting, epidemic growth, and reproduction numbers for the 2019 novel coronavirus (2019-nCoV) epidemic, Ann. Intern. Med. 172 (2020) 567. https://doi.org/10.7326/M20-0358. [ Links ]

45. S. Galea, M. Riddle, and G. A. Kaplan, Causal thinking and complex system approaches in epidemiology, Int. J. Epidemiol. 39 (2010) 97. https://doi.org/10.1093/ije/dyp296. [ Links ]

Appendix

A. Algoritm ABC Sequential Monte Carlo Algorithm

Algorithm 1 ABC Sequential Monte Carlo Algorithm

Require:

  • A target posterior density π(θ|y)p(y|θ)π(θ) consisting of a prior distribution π(θ) and a procedure for generating data under the model p(y|θ).

  • A kernel function Kh(u), and an integer N > 0.

  • An initial sampling density 𝑔(𝜃) and sequence of proposal densities gm(θ,θ'),m=1,,M.

  • A value α[0,1] to control the effective sample size.

  • A low dimensional vector of summary statistics s = S(y).

Ensure: A set of weighted parameter vectors (θM(1),wM(1),,θM(N),wM(N)) drawn from πABC(θ|sobs)

Initialise:

For i = 1,…,N Do

  • Generate θ0(i)g(θ) from initial distribution g.

  • Generate y0(i)(t)p(y|θ0(i)) and compute summary statistics s0(i)(t)=S(y0(i)) for t = 1,…, T.

  • Compute weights w0(i)=π(θ0(i))/g(θ0(i)), and set m = 1.

End For

Sampling:

1.Reweight: Determine hm such that

ESS(wm(1),,wm(N)) = ESS(wm1(1),,wm1(N)) where wm(i)=wm1(i)t=1TKhm(||sm-sobs)||)p(s|θ)π(θm)t=1TKhm-1(||sm-1-sobs)||)p(s|θ)π(θm-1)

2. Resample: if ESS(wm1(1),...,wm1(N))<E then resample N particles from the empirical distribution function {θm(i),sm(i)(1),...,sm(i)(T),Wm(i)} where Wm(i)=wm(i)/j=1Nwm(j) and set wm(i)=1/N.

3. Move:

For i=1,,N Do

if wm(i)>0then

  • Generate θ'gm(θm(i),θ), y'(t)p(y|θm(i)) and s'(t)=Sy't for t = 1,…,T.

  • Accept θ with probability min1,t=1TKhm(||s'(t)-sobs)||)π(θ')g(θ'|θm(i))t=1TKhm(||sm(i)(t)-sobs)||)π(θm(i))g(θm(i)|θ')

end if

end for

B. Algorithm ABC Hybrid Algorithm

Algorithm 2 ABC Hybrid Algorithm

Requiere

  • A target posterior density π(θ|y)p(y|θ)π(θ) consisting of a prior distribution π(θ) and a procedure for generating data under the model p(y|θ).

  • A kernel function Kh(u), an integer N > 0 and a tolerance h.

  • Sampling densities gm(θ), gc(θ) and gMC(θ).

  • A normalized vector w={w0,w1,w2,w3} to control the sampling choice and i = 0 to count the accepted candidates.

  • A low dimensional vector of summary statistics s = S(y).

Ensure A set of parameter vectors (θ(1),...,θ(N)) drawn from

πABC(θ|sobs)KhM(||s-sobs)||)p(s|θ)π(θ)ds

Initialise:

  • Sample three initial points {θm,θc,θMC}i=0 from π(θ).

  • Generate y{m,c,MC}p(y|θ{m,c,MC}) and compute summary statistics s{m,c,MC}=S(y{m,c,MC}).

Sampling:

while i < N do

1. Choose sampling mode: sample w´ from w according to weights.

2. Sample candidate:

if w´= w0 then Draw θ´ from π(θ)

else if w´= w1 then Propose θ´ according to gmθθm

else if w´= w2 then Propose θ´ according to gcθθc

else if w´= w3 then Propose θ´ according to gMCθθMC

end if

3. Accept or not:

  • Generate y'p(y|θ') and s'=S(y').

  • Accept θ´ with probability Kh(||s'-sobs)||)

  • If θ´ is accepted, do θMC=θ' with probability min1,π(θ')gMC(θMC|θ')π(θMC)gMC(θ'|θMC) and increment i.

4. Update:

  • Do θm=θ' if Kh(||s'-sobs)||) is the highest computed so far.

  • Propose θc' according to gc(θ|θ',θc), generate yc'p(y|θc'), compute summary statistics sc'=S(yc') and do θc=θc' if Kh(||sc'-sobs)||) is the highest computed so far.

end while

Received: July 19, 2020; Accepted: August 28, 2020

Creative Commons License This is an open-access article distributed under the terms of the Creative Commons Attribution License