Introduction
Identifying and intervening early in developmental delays or disabilities in children is a global challenge. In Mexico, a 6% prevalence of childhood disability has been reported (Economic Commission for Latin America, 2016; World Health Organization, 2012; World Health Organization, 2009; Watkins, 2016; Walker et al., 2011; National Institute of Statistics and Geography [INEGI], 2016), with 25% of children under five years of age showing mild or moderate developmental delays due to socioeconomic risk factors such as poverty, inadequate health services, and a lack of health equity and of a culture of prevention, which could affect their school and work performance. All of these perpetuate the cycle of lack of opportunities for them (Phillips et al., 2016; Alvarado-Ruiz, Martínez-Vázquez, & Sánchez-Pérez, 2013; Grantham-McGregor et al., 2007). It is important for health personnel at the primary health care level to have an instrument with development indicators for reliable screening that will enable them to monitor the first two years of life, and provide timely care and prevention in order to achieve optimal development. To this end, the American Academy of Pediatrics proposes the use of reliable, validated screening instruments for the detection of child development risks (Rydz et al., 2006; Committee on Children with Disabilities, 1994; Bright Futures Steering Committee and Medical Home Initiatives for Children with Special Needs Project Advisory Committee, 2006; Hamilton & Woodbury, 2006; Phillips et al., 2016).
Bearing this in mind, at the Subdirectorate of Rehabilitation of the National System for Integral Family Development, Benavides, Sánchez, and Mandujano (1985) have designed and used the Neurobehavioral Assessment of Infant Development (VANEDELA) for the detection and timely care of risks of neurological sequelae during the process of childhood growth and development. Benavides et al. (1989) conducted an assessment of the validity of the instrument on a population of 97 high- and medium-risk children enrolled in the pediatric follow-up program of the National Institute of Perinatology (aged one, four, eight, and 12 months). The Amiel-Tison Neurologic Evaluation, used as an external criterion, showed that the sensitivity of the test was excellent (1.0) and the specificity low (.73). Over the years, the VANEDELA has been used in several research projects: Arines (1998), at the Tlaltizapan Morelos Rural Center; Sánchez et al. (2007), at the Tláhuac Mothers’ and Children’s Hospital; and Martínez-Vázquez (2001) at the CIMIGEN Mothers’ and Children’s Hospital. Martínez-Vázquez subsequently undertook a second evaluation of the sensitivity and specificity of the behavior formats (DB) and developmental reactions (DR) in the same population, which constitutes the first adaptation of the exploration and qualification criteria published in the second version.
On the basis of the latter, it was used by Ceballos (2007) at an IMSS nursery in Toluca; by Chávez et al. (2012) to monitor premature infants in the NICU at the La Raza Medical Center; and by Alvarado-Ruiz, Martínez-Vázquez, Sánchez-Pérez, and Muñoz-Ledo (2013) for the surveillance and monitoring of low-risk infants at the ISSSTE Tlalpan Family Medicine Clinic. As a result of this experience, as well as feedback from health personnel who use it in their everyday consultations, adaptations were made to the instrument in order to better describe and clarify certain DB and DR items, so that it could be graded more accurately and to ensure that its use not only covered neurological damage risks. Between November 2004 and January 2012, Alvarado-Ruiz et al. (2013) reported part of this new experience in monitoring child development in 3 527 evaluations of 293 newborns and infants taken to a primary health care outpatient clinic on a monthly basis.
The purpose of this new report was to determine the validity (sensitivity and specificity) and reliability (test-retest) of the adaptations made to the behavior forms and development reactions of the VANEDELA screening test in comparison with Gesell’s Developmental Schedule Test.
Method
This is a descriptive, observational, cross-sectional study. A convenience sample was used, with newborns and infants being recruited from the Neurodevelopment Monitoring Laboratory of the National Institute of Pediatrics and the Tlalpan ISSSTE Family Medicine Clinic from 2011 to 2012 within the following age ranges: one to four months (up to seven days before or after); eight to 12 months (up to a fortnight before or after); and 18 to 24 months (up to three weeks before or after). A medical examination was undertaken of the children who, according to their files, met the age criteria and did not have congenital or genetic diseases or syndromes that could severely affect the nervous system. Once the infants had been selected, the parents signed an informed consent letter and then an evaluation was conducted using the VANEDELA test and the Gesell Developmental Schedule Test (GDST) (Gesell & Amatruda, 1981). The same people who had designed the instrument undertook the evaluations of the VANEDELA. The GDST evaluations were conducted by the researcher in charge, a trained child development psychologist with 15 years’ experience using this test. The VANEDELA comprises four sections: somatometry (SM), developmental behavior (DB), developmental reactions (DR), and warning signs (WS) (Sánchez et al., 2007). The tests were conducted during a minimum period of time (15 minutes). This first report describes the sections on DB and DR; the other sections are still being processed for their inclusion in the report.
Infant Development Behavior Form (DB)
This section comprises 60 behaviors evaluated in various areas of development at six age cut-off points: 1, 4, 8, 12, 18, and 24 months. Each cut-off point includes ten items, with one graphic icon assigned to each item rather than the seven icons originally presented, which could mean that two items had to be reviewed in a single graphic icon, which was extremely confusing. As a result, 10 graphic icons were added to the new rating form for DB to facilitate its use (see Form). The grading criteria for 14 items in the DB form that belong to the various cut-off points and one in the DR were specified (Table 1). To interpret the results in this research, the following risk criterion was established: “10 positive items mean that development is as expected. There is a risk of delay if a doubtful or altered score occurs.”
The DR evaluates 10 reactions that enable the infant to organize various movement patterns, achieve the bipedal posture, and move during the first two years. Four types of righting are considered: 1. optical-labyrinthine (one month); 2. head acting on the body (one month); 3. Landau (three months); 4. body acting on the body (eight months); three defense movements: 5. forward protection (eight months), 6. lateral protection (12 months), and 7. backward protection (12 months); and three balance movements: 8. sitting (18 months), 9. crawling (18 months), and 10. standing (24 months). Grading: development was regarded as expected when reactions were present at the age cut-off point evaluated, while a risk of delay was thought to exist when they failed to meet the established criteria.
The Gesell Developmental Schedule Test (GDST) assesses a child’s development during the first six years. During the first year, it assesses development on a monthly basis and, in the second year, it does so every three months. Four areas are evaluated separately: motor skills (fine and gross), adaptive, language (receptive and expressive), and personal and social. The score is expressed as a development coefficient, obtained by dividing the developmental age obtained by the chronological age and multiplying the result by a hundred. A development index of 85% or more is regarded as normal, while 84% or less in one or more areas is regarded as a delay (Gesell & Amatruda, 1981).
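The development coefficient described above can be illustrated with a minimal sketch. The function names, the dictionary keys for the four areas, and the example ages are hypothetical and are not taken from the GDST manual; the sketch also assumes that any value below 85 counts as a delay, since the text does not say how values between 84 and 85 are handled.

```python
def development_quotient(developmental_age_months, chronological_age_months):
    """Development coefficient: developmental age / chronological age x 100."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return developmental_age_months / chronological_age_months * 100


def classify_gdst(quotients_by_area, cutoff=85):
    """Return 'delay' if one or more areas fall below the cutoff, else 'normal'.

    Assumption: values strictly below the cutoff are treated as a delay.
    """
    if any(q < cutoff for q in quotients_by_area.values()):
        return "delay"
    return "normal"


# Hypothetical 12-month-old infant; the developmental ages are illustrative only.
developmental_ages = {"motor": 12, "adaptive": 11, "language": 9, "personal_social": 12}
quotients = {
    area: development_quotient(age, 12) for area, age in developmental_ages.items()
}
result = classify_gdst(quotients)  # language area: 9 / 12 * 100 = 75, below 85
```

In this illustrative case a single area below the threshold is enough to classify the infant as delayed, which mirrors the "one or more areas" rule in the text.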
The Gesell Developmental Schedule Test is regarded as a suitable standard for diagnosis because of its extensive review of child development skills in the first two years. This test has been used since the 1960s by Dr. Cravioto and Delicardi, MSc Psy, who have used it in various studies in Mexico and Latin America. This test has been used for 30 years at the Neurodevelopment Monitoring Laboratory of the National Institute of Pediatrics in clinical work to distinguish children with expected development from those with delays (Alvarado-Ruiz et al., 2013). Before the evaluation, the files of possible candidates were reviewed, they were invited to participate, the procedure was explained to parents, and optimal conditions for the exploration were verified (being alert, and not sleepy or hungry). The VANEDELA was applied, followed by the GDST. At the end of the evaluation, parents were informed of the result, and given suggestions on how to stimulate the infants’ neurodevelopment. A week later, the VANEDELA was re-applied.
Cases with a risk of alterations were discussed with trained personnel in specific sessions to provide suggestions for their care and they were referred to secondary health care. All the evaluations were filmed in 8 mm digital format. Tests were graded at the end of the evaluation and results recorded in the database that same day.
On the basis of the scores obtained, expected results and risks for each form were determined separately. As for the capacity for detection, the following was determined in the DB and DR forms by age cut-off point: no risk if both forms met the criteria as expected, and at risk if the established criteria were not met in either of the forms. The Epidemiological Analysis of Tabulated Data program (EPIDAT v4) was used to estimate the sensitivity, specificity, and predictive value indices. The stability of the instrument was determined through the correlation of results (absent-present) in the first evaluation and at seven days (test-retest) with JMP13.
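The combined DB-DR rule and the validity indices can be sketched as follows. This is only an illustration using the standard 2x2 definitions; it is not the EPIDAT procedure, the function names are hypothetical, and the counts are invented for the example rather than taken from the study.

```python
def combined_risk(db_at_risk, dr_at_risk):
    """Combined rule: at risk if either form fails its criteria."""
    return db_at_risk or dr_at_risk


def diagnostic_indices(tp, fp, fn, tn):
    """Standard validity indices from a 2x2 table against the reference test.

    tp: screen at risk, reference delay      fp: screen at risk, reference normal
    fn: screen no risk, reference delay      tn: screen no risk, reference normal
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        # LR+ is unbounded when there are no false positives.
        "lr_pos": sensitivity / (1 - specificity) if specificity < 1 else float("inf"),
        "lr_neg": (1 - sensitivity) / specificity,
    }


# Hypothetical counts for one age cut-off point (not study data).
indices = diagnostic_indices(tp=16, fp=4, fn=4, tn=36)
```

With these invented counts, sensitivity is 16/20 = .80 and specificity is 36/40 = .90, so LR+ is about 8 and LR- about .22.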
The study was part of the research project entitled, “Ages of Acquisition in Mexican Infants of the Evolutionary Sequences of Target Behaviors in the VANEDELA Screening Test,” approved by the National Institute of Pediatrics Research and Ethics Commissions. No funds from private institutions were used for this research and the authors declare no conflict of interest.
Results
The study evaluated 379 infants who met the established criteria and whose primary caregivers agreed to participate. The mean maternal age was 29.32 ± 5.43 years, the minimum being 16 and the maximum 43. As for the mothers’ educational attainment, 37% held university degrees and 46% had completed high school. Gender distribution was as follows: 196 (52%) boys and 183 (48%) girls. Validity analyses were performed for DB and DB-DR.
DB: Sensitivity (S) was between 79% and 89%; specificity (Sp) was between 83% and 95%; positive likelihood ratio (LR+) was between 5.08 and 18.09; and negative likelihood ratio (LR-) between .12 and .24. DR: S was between 27% and 50%; Sp between 76% and 94%; LR+ between 2.38 and 7.33; and LR- between .54 and .88. DB-DR: S was between 82% and 89%; Sp between 72% and 91%; LR+ between 2.93 and 9.82; and LR- between .12 and .25 (Table 2).
Note: D/DB = relationship between the Gesell diagnostic test and the developmental behavior form. D/DB-DR = relationship between the Gesell diagnostic test and behavior forms and developmental reactions. n = number of cases studied; S = sensitivity; Sp = specificity; LR+ = positive likelihood ratio; LR- = negative likelihood ratio; VR+ = positive validity rate and VR- = negative validity rate. The confidence interval is given in parentheses.
The stability analysis with the test-retest reliability method for each of the items during the first month yielded a Pearson coefficient between .76 and 1, with the exception of item 4, “visual contact,” which scored .49. At four months, nine items presented a range of between .80 and 1, with the exception of item 2, “contact grasp,” which scored .74. At eight months, eight of the items showed a coefficient of between .81 and 1; item 5, “explores the mother’s face with their gaze or touches it,” yielded .64, while item 7, “while lying on stomach, stretches out their arms,” yielded a coefficient of .62. At 12 months, nine items scored coefficients of between .82 and 1, while item 4, “Pick up or grab a ball,” presented .70. At 18 months, nine items showed a coefficient of between .84 and 1; item 2, “puts seeds into a jar,” scored .70. At 24 months, all ten items yielded coefficients of between .81 and 1. The DB scores per cut-off month had coefficients of between .86 and .99.
Of the 10 DRs, eight had coefficients between .80 and 1; DR 2, “Righting the head acting on the body as a block,” and DR 4, “straightening of the body acting on the body,” yielded .70, while the score per cut-off point ranged between .90 and 1 (Table 3).
Note: Test-retest reliability coefficients are presented for a time interval of 7 days. The coefficients are given for the 10 behavioral items graded at each cut-off point and for the DRs that occur at each monthly cut-off point; blanks indicate reactions that are not yet present at that age because they develop over time. DB = developmental behavior and item number; DR = developmental reactions and item number.
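As an illustration of how a test-retest coefficient for a single absent-present item can be obtained, the sketch below computes a Pearson correlation on binary results (1 = present, 0 = absent). It is a plain-Python approximation of the idea, not the JMP13 procedure used in the study, and the two result lists are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Undefined when every infant has the same result on one occasion.
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical results for one item in ten infants: first evaluation and retest.
first_evaluation = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
retest_at_7_days = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]
r = pearson(first_evaluation, retest_at_7_days)
```

In this invented example one infant acquires the behavior between the two sessions, and that single change is enough to bring the coefficient down to roughly .76, which shows how sensitive item-level coefficients are to rapid developmental change in small groups.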
Discussion and conclusion
The VANEDELA made it possible to comply with the WHO recommendations as part of prevention programs (World Health Organization, 2009). The DB form has adequate validity for the detection of risk in the six cut-off points. When the values obtained in the present study were compared with those of 2001, it was found to maintain adequate levels of sensitivity and improved specificity. At four months, specificity increased from 75% to 95%; at eight months, from 67% to 83%; at 12 months, from 45% to 84%; and at 18 months, from 67% to 91% (Martínez-Vázquez, 2001). The DR form achieved acceptable specificity (76% - 94%), but low sensitivity (31% - 50%), since the GDST considered skills not directly related to the DRs evaluated that modified its scores. In future studies, it would be useful to include an additional neurological assessment instrument to measure its sensitivity and specificity more accurately. Using the DB and DR forms together, adequate validity was obtained with high sensitivity. Specificity also remained adequate in most of the cut-off points, except at twelve months in the back-protection reaction, which was found to be below the suggested percentage: 75% (Meisels, 1989). In order to increase specificity, the back-protection reaction (DR 7) should be reviewed and the description expanded, since it says, “It is considered normal (1) when the arms are extended backwards protectively,” to which one should add, “in response to the backward stimulus, the child will be able to turn and put one or both hands out to avoid falling.”
The positive likelihood ratio of the DB form, which indicates the possibility of presenting a risk when the test establishes this, is appropriate for all cut-off points; for one, four and 18 months, it is good; and for eight, 12 and 24 months it is moderate. The negative likelihood ratio, which shows the probability of not having a risk when the test establishes it, is adequate, since it discriminates well for all cut-off points and is moderate for 12 months.
The reliability analysis through the test-retest method over a seven-day period showed stability. Items were detected that showed rapid changes in the acquisition of behavior: at one month, visual contact was achieved; at four months, contact grasp; at eight months, infants explored their mother’s face with interest and when placed on their stomachs, pushed themselves up with their hands; at 12 months, they grabbed and lifted a ball without body support; and at 18 months, they put seeds in a jar. In DRs, at one month, they were able to raise their heads and after eight months, they were able to right themselves by rolling over. Given that established behaviors are measured, they were probably in transition and therefore failed to meet the criteria, but the infants continued to exercise and a favorable environment enabled them to structure their behavior. Further analysis is required to determine whether this is part of the variability of development. The evaluation provides indices on the infant for the caregiver to use, which constitutes a significant contribution to early evaluation, since parents observe their children’s abilities and the strategies provided by professionals will enhance the infants’ development. Conversely, in infants who do not do this, it would constitute an obstacle that would have to be determined. Reactions involving head righting acting on the body (at one month) and righting themselves by rolling over were located at .70, considered minimally acceptable by some authors, such as Devellis (1991), and low for others, as in the case of behaviors. These reactions will have to be reviewed to determine a description of observables in an evolutionary sequence that will make it possible to distinguish rapid changes in development resulting from timely interaction as opposed to an alteration indicating neurological damage.
Contrasting the VANEDELA with other instruments used in Mexico highlights the challenges we face in obtaining an instrument that will permit the early detection of developmental delays in order to ensure their timely care. First, we have the EDI, which reports a specificity between 53% and 62% for the groups aged two to 15 months, and between 53% and 71% for those aged 16 to 60 months, which would lead to the saturation of the health system due to a large number of false positives (Rizzoli-Córdoba et al., 2013; Romo-Pardo, Liendo-Vallejos, Vargas-Lopez, Rizzoli-Córdoba, & Buenrostro-Márquez, 2012). Conversely, the review of tests designed abroad and widely used by Mexican pediatricians, such as the Denver II test in Tlaltizapan Morelos, showed a low sensitivity of 64% and an adequate specificity of 82% (Olivera-Moreno, 2010), while the Capute scales showed a low sensitivity between 17% and 72%, and an adequate specificity of 90% (Cazares-Figueroa, 2008). These reports tally with studies from other countries that have reported this problem (Song, Zhu, & Gu, 1982; Cairney et al., 2016; Accardo & Capute, 2005). A continuous review of screening instruments would make it possible to obtain an instrument suitable for use with the Mexican population.
Although the DB and DR forms of the VANEDELA, which allow timely detection at the primary care level, must be continuously adapted, they remain a valid, reliable instrument, as well as an essential tool for health professionals to detect children with developmental difficulties in a timely manner and to provide appropriate strategies, so that the curve changes from low to high adequacy.
The VANEDELA makes it possible to assess children quickly through its six age cut-off points. Since it is a screening test, it contains the expected milestones for all children under normal conditions for the age cut-off point. Accordingly, a child with 10 positive items is regarded as not being at risk, while one with nine or fewer items is regarded as being at risk of delay, at one of two levels: nine or eight means a mild risk, which was previously called doubtful, and seven or fewer means a high risk, formerly called altered. Its main limitation was revealed when the child reached an intermediate age, meaning that one had to wait for the cut-off point for confirmation of expected development or a risk of delay. Work is currently underway on intermediate milestones that will enable professionals to determine the moment of evolution of a particular behavior.
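The grading rule for the DB form described above can be written as a short sketch. The function name and the returned labels are hypothetical and serve only to make the thresholds explicit.

```python
def classify_db_score(positive_items):
    """Classify a DB cut-off point from the number of positive items (0-10)."""
    if not 0 <= positive_items <= 10:
        raise ValueError("the DB form has 10 items per cut-off point")
    if positive_items == 10:
        return "no risk"
    if positive_items >= 8:
        return "mild risk"  # previously called doubtful
    return "high risk"  # formerly called altered


labels = {score: classify_db_score(score) for score in (10, 9, 8, 7, 0)}
```

Only a full score of 10 counts as no risk; any missing item places the child in one of the two risk levels.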
Given the sociocultural variability existing in Mexico, it is important to expand training in the application of this test so that other professionals can use it and thereby obtain data from a larger sample encompassing various social groups. The VANEDELA is a suitable, user-friendly tool for the timely detection of risks of developmental delays, which will provide health professionals with a useful tool for monitoring infants and advising their families in the event they require a diagnostic evaluation and early intervention services to ensure optimal growth. The DB and DR sections have sufficient validity and reliability in general for them to be used for the detection of risks of developmental delays in primary health care.