Introduction
According to the World Health Organization (WHO), depression is distinguished from typical mood variations and from brief emotional reactions to everyday life challenges. Particularly when it becomes recurrent and reaches moderate or severe intensity, depression can evolve into a significant health concern [1]. It generally emerges early in life, causes a substantial decline in overall functioning, tends to recur, and imposes notable economic and social burdens, placing it among the most debilitating illnesses [2]. In its most severe form, depression can tragically result in suicide, with nearly one million people dying by suicide each year; according to WHO data, it stands as the second leading cause of death among individuals aged 15 to 29. In addition, the COVID-19 pandemic brought numerous overwhelming stressors, including job loss, bereavement of family members, friends, or coworkers, financial instability, and social isolation, especially for individuals living alone. When required, healthcare providers must differentiate between demoralization and depression; however, in-person consultations with qualified mental health experts are not easily accessible to everyone in need [3].
Apart from the difficulties posed by the inability to meet patients with depression in person, psychiatric assessment carries inherent challenges. Such evaluations require considerable effort from specialists to collect objective patient information, and their success heavily relies on the patient's willingness to cooperate and to communicate symptoms and concerns effectively [4]. One technique employed to assess depression is the Montgomery-Åsberg Depression Rating Scale (MADRS), designed to gauge the current severity of an ongoing depressive episode [5]. Clinicians rate ten depression-relevant items through observation and discussion with the patient, and the cumulative score (ranging from 0 to 60) indicates the level of depression: scores below ten are categorized as no depressive symptoms [6], while scores above 30 indicate a severe depressive state [7]. Therefore, an objective detection mechanism based on biological signals is needed to improve timely diagnosis.
The use of wearable devices for monitoring mental and physical health has gained significant popularity, with people now routinely collecting data to improve their well-being and track fitness progress. The data collected by these devices can also hold considerable value from a psychiatric perspective, beyond evaluating overall quality of life: it has the potential to aid in diagnosing various mental health conditions, including depression [8]. Motor activity reflects social patterns influenced by cyclical biological rhythms, which are regulated by the 24-hour circadian clock and interwoven with several ultradian rhythmic cycles lasting 2 to 6 hours [9]. Disrupted biological rhythmic patterns have been proposed as significant indicators of mood episodes [10]. Actigraphy serves as a non-intrusive approach to observing human rest and activity patterns; typically, it involves a wrist-worn device that records gravitational acceleration units [11].
The actigraph is among the devices most frequently employed to collect motor activity data, and numerous studies have leveraged such data to build artificial intelligence models for classifying, detecting, and monitoring the illness. García-Ceja et al. [12] used machine learning to differentiate between depressed and non-depressed patients, employing leave-one-patient-out validation to assess the algorithms' performance; their collective results reveal that sensor data contains valuable information for determining an individual's depression status. Zanella-Calzada et al. [13] proposed an approach to distinguish depressive participants from controls using motor activity recorded by wearable devices: statistical features were extracted from the motor activity signals and used to train a random forest classifier. Galván-Tejada et al. [14] investigated the accelerometer signal from smart bands to identify depressive states based on patients' activity. A statistical feature extraction technique was devised, focusing on the temporal and spectral evolution of the signal, and an intelligent feature selection method based on genetic algorithms (GA) was incorporated to optimize the non-invasive diagnostic process efficiently. The results demonstrate the potential to distinguish depressive states from the smart band activity signal, offering a preliminary, automated tool for near real-time depression diagnosis at a lower computational cost to specialists.
Researchers have also explored motor activity time series to gather insights for identifying potential cases of depression, among other applications. Frogner et al. [15] initially employed One-Dimensional Convolutional Neural Networks (1D-CNN) to assess motor activity for depression detection; the study was subsequently extended to identify three levels of depression (no depression, mild, and severe) using the MADRS scale, and the final model successfully predicts the participants' MADRS scores.
Rodríguez-Ruiz et al. [16] introduced a series of models aimed at classifying depressive and non-depressive episodes during different parts of the day (day, night, and full day) based on participants' motor activity levels, using the Depresjon database, which contains activity data from both depression patients and controls. They also proposed a Random Forest Classifier (RFC) model for multiclass classification, distinguishing schizophrenia, depression, and healthy controls from night-time activity data with an impressive 98 % accuracy across the three classes. Experimental results demonstrated the model's efficacy in identifying episodes of depression and schizophrenia as well as healthy controls, surpassing prior studies that employed computationally expensive algorithms such as CNN and Bidirectional Recurrent Neural Networks (BRNN) and yielding a noteworthy boost in accuracy [17].
Jakobsen et al. [18] investigated the potential of various machine learning algorithms to distinguish depressed patients from healthy controls using motor activity time series. Their research also demonstrated that machine learning's capacity to reveal hidden patterns in the data aligns with the conclusions drawn from previous studies that applied both linear and nonlinear statistical methods to motor activity analysis.
Artificial Intelligence (AI) has proven to be a useful tool for detecting cases of depression. Kour et al. [19] note that unipolar and bipolar depression display similar clinical symptom profiles, presenting a considerable challenge in distinguishing between the two depression types; disruptions in motor activity offer a potential avenue for detecting pathological mental states and may prove valuable in addressing this diagnostic challenge. Consequently, a growing body of research studies this type of data with different AI-derived algorithms. Pacheco-González et al. [20] conducted a comparative analysis of several classification techniques, including conditional inference trees, random forest, K-Nearest Neighbor, support vector machine, and Naïve Bayes, aiming to predict depressive states based on patients' activity measured with a smart band accelerometer. Raihan et al. [21] employed a combination of motor sensor readings and demographic data together with machine learning techniques such as Random Forest (RF), AdaBoost, and Artificial Neural Networks (ANN). Finally, as noted by Singh et al. [22], identifying disruptions in motor activity could serve as a valuable approach to detecting pathological mental states; accordingly, a dedicated motor activity database (Depresjon), containing data from patients with unipolar depression, bipolar depression, and healthy individuals, has proved effective for the timely detection of depression cases.
The objective of this research is to present an approach capable of objectively identifying episodes of depression by employing various AI algorithms, thereby exploring a wider range of options for detecting depression. The approach relies on a limited amount of data intelligently selected with genetic algorithms, which reduce the dimensionality and redundancy of the data, yielding models that can be trained simply on new input data and adapted to different types of patients in different environments. The work also seeks to show that the behavior of each patient, as of each person, differs depending on the situation or environment in which they find themselves; it is therefore important to create adaptive artificial intelligence models for each participant in order to detect depression in a timely manner and to help mental health specialists develop a treatment for each degree of depression encountered.
Materials and methods
The methodology proposed in this work consists of five main stages, as shown in Figure 1. First, the data are obtained from the Depresjon database. Second, in the data pre-processing phase, the dataset is reorganized so that each minute of the original records becomes a feature; this arrangement enables the subsequent feature selection with GA. Finally, to verify the significance of the chosen features, two further stages were undertaken: classification analysis through AI algorithms such as Logistic Regression (LR), Artificial Neural Networks (ANN), Support Vector Machine (SVM), and Decision Tree (DT), followed by result validation.
Data Acquisition
The motor activity dataset comprises patient data monitored with an actigraph watch worn on the right wrist, which measures activity using a piezoelectric accelerometer. This actigraph watch, the "Actiwatch AW4", samples at 32 Hz and records movements above 0.05 g. These movements correspond to particular voltage values, which are stored as activity counts in the Actiwatch memory; the count values correlate directly with the intensity of the movements. Total activity counts were recorded continuously at one-minute intervals [1]. Information was gathered from a group of 22 psychotic patients, all of whom used antipsychotic medications and were admitted to Haukeland University Hospital; the mean age at first hospitalization was 24 ± 9.3 years. The specialist diagnosed the patients using a semi-structured interview based on the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) [23]. This group comprised 3 females and 19 males, with an average age of 42.6 years (range 27 to 69 years). The healthy control group included 23 hospital employees, five students, and four individuals from a practitioner clinic [18].
Data Pre-processing
To obtain a dataset with a number of observations suitable for processing with AI techniques, a one-week period of one-minute interval records was taken from each participant to maintain data diversity. The records were then reordered to identify the number of observations per study participant as a participant-day relation, together with the corresponding registered data.
The Motor Activity Data (MAD) of each study participant are filed in columns "C", spanning a full day from 00:00 to 23:59, with each minute holding its respective MAD record. These columns are transposed and appended as rows of a matrix "A", accompanied by the corresponding output, where 0 indicates the control group and 1 indicates the condition group, as labeled in the source dataset. The same process is applied to the subsequent days of each participant until a week of information is completed, resulting in a matrix A = [385 × 1441] including its output. The process is explained in more detail below.
Given the columns of the days of week 1, we have C11, C12, C13, C14, C15, C16, C17, where the first subscript indicates the week number and the second indicates the day number, each entry carrying the timestamp of the motor activity record generated during that specific minute, as shown in Equation 1. The total number of time intervals defined as features for one day follows from multiplying the 60 minutes in an hour by the 24 hours in a day, giving a column of 1440 rows; these rows represent the minutes of a day with their respective MAD records and thus define the time range used.
Following the transposition, the columns are reorganized into a vector V^T in which each one-minute interval constitutes a feature. Each vector corresponds to one observation per participant, i.e., a row of 1440 columns. For each participant, whether depressed or not (labeled as one or zero), one vector is extracted per day over 7 days (one week), and each vector is appended as a row of matrix "A" together with its output label. This process continues until all 55 participants are incorporated into the new dataset configuration, as illustrated in Equation (2). This arrangement is used for feature selection to identify the most relevant minutes for depression detection.
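As a minimal illustration of this pre-processing, the following R sketch builds the matrix A described above. The per-participant CSV layout (columns timestamp, date, and activity at one-minute resolution) and the folder names data/control and data/condition are assumptions made for the example.

```r
# Minimal sketch of the pre-processing step. The per-participant CSV layout
# (columns: timestamp, date, activity at one-minute resolution) and the
# folder names data/control and data/condition are assumptions.
build_week_matrix <- function(files, label, n_days = 7) {
  rows <- list()
  for (f in files) {
    d    <- read.csv(f, stringsAsFactors = FALSE)
    days <- head(unique(d$date), n_days)            # first week only
    for (day in days) {
      v <- d$activity[d$date == day]
      if (length(v) == 1440)                        # keep complete days only
        rows[[length(rows) + 1]] <- c(v, label)     # 1440 minutes + output
    }
  }
  do.call(rbind, rows)
}

control_files   <- list.files("data/control",   full.names = TRUE)
condition_files <- list.files("data/condition", full.names = TRUE)

A <- rbind(build_week_matrix(control_files,   label = 0),
           build_week_matrix(condition_files, label = 1))
colnames(A) <- c(sprintf("m%04d", 1:1440), "class")  # A is about 385 x 1441
```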
Feature Selection
Bearing in mind that the data collected with the wearable Actiwatch device in one-minute intervals are treated here as features of each observation, this section seeks to select the minutes that provide the most significant information, so as to generate a simple, intelligent model capable of detecting depression. Since the number of features is much higher than the number of observations, this process also aims to reduce redundancy and the amount of data to be processed, thereby improving performance.
The methodology for intelligent feature selection using Genetic Algorithms (GA) is shown in Figure 2.
GAs draw inspiration from nature, particularly the process of natural selection. They implement a population-based search approach grounded in the fundamental principle of "survival of the fittest". At their core, GAs consist of several essential components: chromosome representation, selection, crossover, mutation, and fitness function computation [23].
In this research, a genetic algorithm library known as Genetic Algorithms for Multivariate Statistical Models from Large-Scale Functional Genomic Data 1.4 (GALGO) [24] was implemented. GALGO, designed for the R programming language, serves the purpose of selecting models with high fitness. The process begins by creating a random population of features of a specified size (n). These features are evaluated with a fitness function that assesses their ability to classify the dependent variable, typically yielding an accuracy value.
For the specific task of depression detection, we utilized the logistic function as our classification method. In our study, the genetic algorithm iteratively explores and evolves combinations of genes (intervals of time) from a dataset comprising two classes. The goal is to identify features that effectively distinguish between these classes using the logistic regression (LR) method over 500 generations, considering 1000 possible solutions, and aiming for a fitness level of 95 %. As a result, the intervals of time obtained represent the most relevant contributors to the processed dataset. They are consistently favored within the intelligent models generated by the GA, demonstrating superior performance in classifying subjects.
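The sketch below illustrates the idea of GA-driven minute selection. It is not the GALGO workflow used in this work: it relies on the generic GA R package, a binary chromosome over the 1440 minute features, and a naive resubstitution-accuracy fitness with a small size penalty, all of which are assumptions chosen only to make the mechanism concrete (a real run would use cross-validated fitness and GALGO's fixed-size chromosomes).

```r
# Illustrative sketch of GA-driven minute selection (NOT the GALGO workflow
# used in the paper). Binary chromosome over the 1440 minutes, fitness =
# resubstitution accuracy of a logistic model minus a small size penalty.
library(GA)

X <- A[, 1:1440]                    # minute features from the previous sketch
y <- A[, 1441]                      # 0 = control, 1 = condition

fitness_fn <- function(mask) {
  idx <- which(mask == 1)
  if (length(idx) == 0) return(0)
  d   <- data.frame(X[, idx, drop = FALSE], class = y)
  fit <- suppressWarnings(glm(class ~ ., data = d, family = binomial))
  acc <- mean((predict(fit, type = "response") > 0.5) == d$class)
  acc - 0.001 * length(idx)         # penalize large feature subsets
}

ga_out <- ga(type = "binary", fitness = fitness_fn, nBits = ncol(X),
             popSize = 100, maxiter = 200, run = 50, seed = 123)
selected_minutes <- which(ga_out@solution[1, ] == 1)
```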
Classification Analysis
Before analyzing the dataset, a z-score normalization is applied to the MAD values to temper the effect of outliers, since the variation in some time intervals can be significant. This normalization also places all features on the same scale, which shortens training time and is especially beneficial for modeling applications whose inputs have varying scales; the mean and standard deviation are calculated for each feature [25], as shown in Equation 3. Binary classification models are then generated, since the data contain two possible output classes: depressed (represented as 1) and not depressed (represented as 0). The techniques used to develop the models are ANN, LR, DT, and SVM, and their performance is finally compared.
The artificial neural network is a multilayer perceptron with two hidden layers (4 and 2 neurons) and an output layer with a logistic activation function. The parameters established for the SVM classifier are a linear kernel with a cost of 1.
The development of the models consists of two steps: training and testing. The data were therefore randomly subsampled into two sets, one for each step, with 80 % of the data used for training and the remaining 20 % reserved for testing.
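A minimal sketch of this classification stage, continuing from the sketches above, is shown below. The package choices (e1071 for the linear SVM with cost = 1, rpart for the decision tree, nnet for the neural network, glm for logistic regression) are assumptions, and nnet supports a single hidden layer, so the (4, 2) architecture described above is only approximated.

```r
# Minimal sketch of the classification stage on the GA-selected minutes.
# Package choices are assumptions; nnet has one hidden layer, so the (4, 2)
# architecture is only approximated here.
library(e1071); library(rpart); library(nnet)

d <- data.frame(scale(X[, selected_minutes]),   # z-score normalization
                class = factor(y))

set.seed(123)
idx   <- sample(seq_len(nrow(d)), size = floor(0.8 * nrow(d)))
train <- d[idx, ]                               # 80 % training
test  <- d[-idx, ]                              # 20 % testing

lr_fit  <- glm(class ~ ., data = train, family = binomial)
svm_fit <- svm(class ~ ., data = train, kernel = "linear", cost = 1)
dt_fit  <- rpart(class ~ ., data = train, method = "class")
ann_fit <- nnet(class ~ ., data = train, size = 4, decay = 1e-3,
                maxit = 500, trace = FALSE)

lr_prob <- predict(lr_fit, newdata = test, type = "response")
lr_pred <- factor(ifelse(lr_prob > 0.5, 1, 0), levels = levels(d$class))
```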
Validation
The validation parameters comprise accuracy, sensitivity, specificity, the Receiver Operating Characteristic curve with the Area Under the Curve (ROC/AUC), and the F1-Score.
Accuracy (Acc) is a performance criterion that indicates the degree to which the outcome of a calculation aligns with the correct value [13], as represented in Equation (4).
In this context, TP represents True Positives, TN stands for True Negatives, CP denotes the total of truly positive cases, and CN the total of truly negative cases.
Evaluating a classifier also involves other crucial aspects: precision, the proportion of correctly detected targets among all detected targets, and recall, the proportion of correctly detected targets among all known true targets, as shown in Equations (5) and (6).
FP is False Positive, and FN is False Negative.
The measure that takes into account both precision and recall to evaluate the classification capability of an algorithm is referred to as the F1-Score. It is defined as the harmonic mean of precision and recall [26], as shown in Equation (7).
Sensitivity is the ability to accurately identify data with depressive symptoms; it represents the proportion of condition participants correctly identified [27] and is calculated using Equation (8).
Specificity, the ability to identify data without the condition as healthy, measures the proportion of negative samples that are accurately classified as such [28]. It is calculated using Equation (9).
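For reference, the conventional definitions of these metrics, which presumably correspond to Equations (4)-(9), are:

```latex
% Standard forms; CP = TP + FN and CN = TN + FP as defined in the text.
\begin{align*}
\mathrm{Acc} &= \frac{TP + TN}{CP + CN}, &
\mathrm{Precision} &= \frac{TP}{TP + FP}, &
\mathrm{Recall} &= \frac{TP}{TP + FN}, \\
F_1 &= \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}
            {\mathrm{Precision} + \mathrm{Recall}}, &
\mathrm{Sensitivity} &= \frac{TP}{TP + FN}, &
\mathrm{Specificity} &= \frac{TN}{TN + FP}.
\end{align*}
```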
The ROC curve is a commonly used method for evaluating machine learning models. It provides a visual representation of the classifier's performance, enabling the selection of an appropriate operating point, referred to as the decision threshold, along with the AUC value [29]. The AUC can be calculated through integration, as shown in Equation (10).
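A minimal sketch of the validation stage, continuing from the classification sketch above, is given below; pROC is an assumed package choice for the ROC curve, the AUC, and the 95 % confidence intervals reported later.

```r
# Sketch of the validation stage for the LR model (the same calls apply to
# the other classifiers). pROC is an assumed package choice.
library(pROC)

confusion_metrics <- function(pred, truth) {
  tp <- sum(pred == 1 & truth == 1); tn <- sum(pred == 0 & truth == 0)
  fp <- sum(pred == 1 & truth == 0); fn <- sum(pred == 0 & truth == 1)
  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)                      # identical to sensitivity
  c(accuracy    = (tp + tn) / (tp + tn + fp + fn),
    sensitivity = recall,
    specificity = tn / (tn + fp),
    f1          = 2 * precision * recall / (precision + recall))
}

truth <- as.numeric(as.character(test$class))
confusion_metrics(pred = as.numeric(as.character(lr_pred)), truth = truth)

roc_lr <- roc(response = truth, predictor = lr_prob, quiet = TRUE)
auc(roc_lr)                      # area under the ROC curve
ci.auc(roc_lr)                   # 95 % confidence interval for the AUC
plot(roc_lr)                     # ROC curve as in Figures 5-8
```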
Results and discussion
In this section, we explain the performed experiments and discuss the results.
Initially, Figure 3 shows the motor activity of depressed and non-depressed people throughout the day, taken from the paper entitled "Two-Dimensional Convolutional Neural Network for Depression Episodes Detection in Real Time Using Motor Activity Time Series of Depresjon Dataset" [30]. A clear distinction between the two cases can be observed, with a noticeable decrease in movement among depressed individuals. Data were collected for each of the 55 participants under study (32 healthy and 23 depressed) over one week. Each day was treated as an individual observation for each participant, giving a total of 385 observations (55 participants × 7 days), enough data to develop a classification model for episodes of depression.
The dataset comprises 385 observations, each containing 1440 minutes. Following the feature selection process, the most significant features were chosen, specifically: 15:43, 15:41, 12:11, 7:25, 15:40, 7:29, and 12:15, as depicted in Figure 4. The graph illustrates the frequency of appearance for each feature, with the highest-ranked features displayed in black.
Different artificial intelligence techniques were applied and validated with several metrics, obtaining the following results: the ANN has an accuracy of 0.74, the DT 0.73, the SVM 0.81, and LR 0.83. Since logistic regression presents the best accuracy, it can be concluded that this technique correctly classifies a greater number of test participants than the others. To report the true positives, true negatives, false positives, and false negatives obtained by the implemented algorithms, Table 1 shows their confusion-matrix results; the confusion matrix is often applied in machine learning to evaluate or visualize model behavior in supervised classification scenarios [31]. We additionally report the Matthews Correlation Coefficient (MCC), a more reliable statistical index that produces a high score only if the prediction performs well in all four categories of the confusion matrix [32], and the Kappa Correlation Coefficient (KCC), a metric that summarizes the agreement between two nominal classifications based on the same categories [33].
Table 1. Results of the algorithms with the test data set (GA-selected features).

| ML Algorithm | FP | FN | TP | TN | MCC | KCC |
|---|---|---|---|---|---|---|
| SVM | 5 | 9 | 27 | 36 | 0.635 | 0.632 |
| LR | 5 | 8 | 27 | 37 | 0.659 | 0.660 |
| DT | 11 | 11 | 21 | 34 | 0.411 | 0.411 |
| ANN | 8 | 10 | 24 | 35 | 0.523 | 0.523 |
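As a check, the MCC and KCC values in Table 1 can be closely reproduced from the reported confusion matrices with the standard formulas, as in the following sketch; any small differences with respect to the table come from rounding or implementation details.

```r
# MCC and Cohen's kappa computed directly from a confusion matrix.
mcc_kappa <- function(tp, tn, fp, fn) {
  n   <- tp + tn + fp + fn
  mcc <- (tp * tn - fp * fn) /
         sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  po  <- (tp + tn) / n                                          # observed agreement
  pe  <- ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n^2  # chance agreement
  c(MCC = mcc, KCC = (po - pe) / (1 - pe))
}

mcc_kappa(tp = 27, tn = 37, fp = 5, fn = 8)   # LR row of Table 1
```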
The results of the different validation metrics implemented to assess the performance of the classification models are shown in Table 2, which reports accuracy, AUC, F1-Score, sensitivity, and specificity.
Table 2. Validation metrics of the artificial intelligence algorithms (GA-selected features).

| Validation Metrics | SVM | LR | DT | ANN |
|---|---|---|---|---|
| Accuracy | 0.81 | 0.83 | 0.71 | 0.76 |
| AUC | 0.82 | 0.83 | 0.70 | 0.76 |
| F1-Score | 0.83 | 0.85 | 0.75 | 0.80 |
| Sensitivity | 0.87 | 0.90 | 0.75 | 0.81 |
| Specificity | 0.75 | 0.75 | 0.65 | 0.70 |
As can be seen in the previous table, the specificity values are lower in all the models generated to detect cases of depression. This is due to a class-balance problem, since there are more data from healthy study participants than from participants with some degree of depression. Regardless of this phenomenon, and considering the limited number of observations, the algorithms used, and the small number of features, the results are favorable.
The ROC curves graphically show the relationship between the sensitivity and specificity of the classification models when applied to the test data, giving a picture of the extent to which the study participants (healthy and depressed) are classified correctly.
First, Figure 5 shows the performance obtained by the SVM algorithm in classifying the participants. The model was analyzed in detail through the construction of a Receiver Operating Characteristic (ROC) curve, and several key results indicate its predictive ability: the AUC was 0.822, with a 95 % confidence interval between 0.722 and 0.921. This AUC value, well above the 0.5 chance level and close to 1, suggests that the model has a robust ability to discriminate between classes, indicating promising performance.
Second, a comprehensive analysis of the LR model was carried out in the same classification context. Figure 6 illustrates the performance achieved by the LR algorithm in classifying the participants.
For the LR model, the AUC was calculated to be 0.838, with a 95 % confidence interval between 0.751 and 0.926. This AUC value, which is even closer to 1 than that obtained with the SVM model, reinforces the ability of the Logistic Regression model to effectively distinguish between the classes of interest. The high AUC is a strong indication of its predictive performance.
Comparing the results of both models, we observed that the Logistic Regression model obtained a slightly higher AUC (0.838) compared to the SVM model (0.822). This difference might suggest that the Logistic Regression model performs marginally better on this particular classification task. However, it is important to note that the choice between these models could depend on other factors, such as interpretability and simplicity of the model.
Figure 7 depicts the performance attained through the DT algorithm for participant classification. Although the AUC is lower compared to the previous models (SVM and Logistic Regression), it is still in a range that suggests some discriminative ability. However, this AUC value indicates that the Decision Tree model may have more limited predictive performance in this particular task.
In the framework of the research, the performance of an Artificial Neural Network (ANN) model as shown in Figure 8 was also evaluated in the classification task, along with SVM, Logistic Regression and Decision Tree models. When we compare the AUC of the ANN model (0.764) with the SVM (0.822) and Logistic Regression (0.838) models, we observe that ANN is in an intermediate position in terms of discrimination ability. Although it does not outperform the previous models in AUC, its performance is competitive and can be considered for applications where interpretability is not the main concern.
In summary, the results of the ROC curves of the models are highly promising. The high AUC, the balance between sensitivity and specificity, and the low false positive and false negative rate suggest that the model is an effective tool in the classification task studied. These results have important implications in the context of developing tools to support the diagnosis of depressive episodes, and could be valuable for clinical applications. However, it is important to consider the limitations of the model and future areas of research, such as the optimization of classification thresholds, to maximize its usefulness in the real world.
A scenario that could not be omitted is testing these algorithms with all the features proposed in this research work; therefore, in an additional phase the algorithms were run with the full set of 1440 features to assess their performance. The results are shown in Table 3 and Table 4.
Table 3. Results of the algorithms with the test data set (all 1440 features).

| ML Algorithm | FP | FN | TP | TN | MCC | KCC |
|---|---|---|---|---|---|---|
| SVM | 9 | 16 | 23 | 29 | 0.358 | 0.352 |
| LR | 16 | 20 | 16 | 25 | 0.054 | 0.054 |
| DT | 10 | 18 | 22 | 27 | 0.283 | 0.277 |
| ANN | 10 | 15 | 22 | 30 | 0.349 | 0.346 |
Table 4. Validation metrics of the artificial intelligence algorithms (all 1440 features).

| Validation Metrics | SVM | LR | DT | ANN |
|---|---|---|---|---|
| Accuracy | 0.67 | 0.53 | 0.63 | 0.67 |
| AUC | 0.68 | 0.52 | 0.64 | 0.67 |
| F1-Score | 0.69 | 0.58 | 0.65 | 0.70 |
| Sensitivity | 0.76 | 0.60 | 0.72 | 0.75 |
| Specificity | 0.58 | 0.44 | 0.55 | 0.59 |
The results presented in the previous tables are not as favorable as those in Table 2. This indicates the importance of an intelligent feature selection phase, which streamlines the analysis and processing of the information and leads to better results by avoiding overfitting, reducing redundancy, and selecting the most significant data.
Although several investigations have already obtained important results in the detection of depression, the simplicity of the process developed in this research is an important aspect to consider, since the results obtained with an algorithm as simple as logistic regression and a reduced amount of data contrast with the methodological approaches used in the state of the art. Table 5 shows the results obtained in the most recent research, together with the features used as source data, for work related to the dataset used here.
Table 5. Comparison with related work on the same dataset.

| Author | Features | Technique | Acc. |
|---|---|---|---|
| García-Ceja et al. [1] | Feature vector | SVM | 0.72 |
| Frogner et al. [15] | Feature vector | 1D-CNN | 0.71 |
| Jakobsen et al. [18] | Statistical features | CNN | 0.84 |
| Kumar et al. [34] | Statistical features | CNN | 0.85 |
| Rodríguez-Ruiz et al. [16] | Statistical features | RFC | 0.98 |
| Ghate et al. [35] | Statistical features | Transfer Learning | 0.96 |
| Zakariah et al. [36] | Statistical features | Deep Neural Network | 0.99 |
The table above reveals several noteworthy observations. First, many previous studies employ considerably more intricate methodologies to enhance depression identification through motor activity. These approaches often involve extracting a multitude of statistical features, which substantially increases the number of variables to analyze, process, and feed into classification models. This research aims to make a significant contribution by introducing a feature selection method based on Genetic Algorithms (GA), which had not previously been applied to the Depresjon dataset. The primary objective is to reduce data volume without sacrificing information critical for identifying depressive states, while also mitigating data redundancy.
Additionally, the achieved accuracy provides an initial step toward the development of models that are more efficient in terms of time and computational cost. Finally, it is important to note that the obtained results hold statistical significance compared to the existing literature, despite the limited amount of processed data and the distinctive methodology proposed in this study.
The proposal offers an innovative optimization approach for future work in developing algorithms, methodologies, and tools to aid in the detection of depression.
Conclusions
Artificial intelligence algorithms such as LR, supported by an objective feature selection method such as GA, allowed the efficient generation of a model capable of detecting depression with 83.0 % accuracy using only a few time intervals at different times of the day as the data source. The time intervals that provided significant information for generating models capable of detecting depression were 15:43, 15:41, 12:11, 7:25, 15:40, 7:29, and 12:15 in 24-hour format. Based on these findings, it can be inferred that the methodology introduced in this paper facilitates automatic and objective depression detection through various artificial intelligence approaches, achieving noteworthy accuracy with a set of seven features derived from patients' motor activity. This preliminary development of an assisted diagnostic tool therefore offers potential assistance in mitigating the elevated error rates associated with diagnosing this condition. We also consider that adding other variables, such as type of diet, family history, sex, age, place of birth, and habits, and implementing these algorithms in a real environment to test their efficiency and improve their learning, could help improve the diagnosis of this mental illness in the future.
As future work, we suggest augmenting the number of experimental observations to present more robust results with a greater diversity of data, including variables other than motor activity. In addition, we propose implementing deep learning approaches, such as convolutional neural networks embedded in smart devices capable of monitoring the subjects under different circumstances, or recurrent neural networks, to detect depression early and enable a prevention strategy.
Author contributions
C.H.E.S conceptualized the project, designed and developed the methodology, contributed to the writing of the original manuscript, and participated in the programming of the software. C.E.G.T. participated in the data curation and gathering, carried out formal analyses and participated in the development of the software for the data analyses. A.G.S.R. contributed to the writing, editing and reviewing of the manuscript and participated in the data visualization for their correct interpretation. H.L.G. analyzed and validated the results. H.G.R. obtained funding and financial resources and oversaw the project. J.A.M.B provided access to material resources and equipment, participated in the data gathering and performed experiments. J.M.C.P. supervised and guided the general research and oversaw the project. J.I.G.T. conceptualized the project. All authors reviewed and approved the final version of the manuscript.