I. Introduction
In many countries, repeated aggressive behavior in students, known as bullying, has become a social health problem (Levine & Tamburrino, 2014). This is because victims (the bullied) and aggressors (the bullies) have higher incidences of health complaints, depression (Due, Damsgaard, Lund, & Holstein, 2009; Fleming & Jacobsen, 2009; Hanley & Gibb, 2011), and interpersonal relationship problems (Gilmartin, 1987), which later may result in different forms of violence, criminality, substance abuse, suicidal thoughts and suicide attempts (Bauman, Toomey, & Walker, 2013; Fleming & Jacobsen, 2009). The global prevalence of bullying in high schools has been found to range from 5% to 45%. This figure is higher in women (Craig et al., 2009), and an increase in violent behavior has also been reported (Bickmore, 1997). In addition, it has been reported that the prevalence of school bullying may differ depending on the educational level (Nansel, Overpeck, Pilla, Ruan, Simons-Morton, & Scheidt, 2001; Sawyer, Bradshaw, & O'Brennan, 2008), decreasing in older students but not disappearing (Alzahrani, 2012; Ozkal, 2011). The reduced prevalence in older students is primarily related to physical aspects, but there is an increase in indirect or social forms, like exclusion, verbal and cyberbullying, among others (Alzahrani, 2012; Bauman, Toomey, & Walker, 2013; Ozkal, 2011).
Violence in schools in Mexico has been measured nationwide as part of an extensive survey that measured several child-related aspects in 2000 (Secretaría de Salud, 2006), 2012 (Instituto Federal Electoral, 2012), and 2015 (Instituto Federal Electoral, 2015). In the year 2000, information was obtained from 4,000,000 girls and boys with the objective of estimating violence reported by students in school. The results indicated that 32% of children between the ages of 6 and 9 and 13% of children between the ages of 10 and 13 reported having been victims of violence in school (Secretaría de Salud, 2006). A similar survey, which measured only school violence in the 13 to 15-year-old age group, was administered in 2012 to 2,256,532 children across all the states of Mexico (Instituto Federal Electoral, 2012). The results indicated that 12% of students reported being harassed or intimidated by peers in school and 4% reported being victims of sexual abuse in schools.
The most recent survey was reported in 2015 with a sample size of 2,916,686 children from all the states of Mexico, and violence in schools was only measured in the 10 to 13-year-old age group (Instituto Federal Electoral, 2015). The results showed that 14.9% reported physical violence, 26.3% verbal violence, 19.5% emotional violence, and 2.9% sexual violence in school. These indicators were also measured in the state of Chihuahua, where the study sample is from, and for the state the results were similar, as 14.5% reported physical violence in school, 28.6% verbal violence, 21.6% emotional violence, and 2.8% reported sexual violence in school.
Bullying in Mexico has become an important topic, particularly in places where there has been a lot of violence, such as Ciudad Juárez, which was named the most violent city in the world in 2010 due to the war between drug cartels and the Mexican government. Between 2007 and 2011, more than 9,000 people were murdered in Ciudad Juárez (Valencia & Chacon, 2013), producing social, cultural and psychological consequences. Violence was part of citizens’ everyday lives, as it appeared in television broadcasts, radio, newspapers, social media, and the Internet. Children were aware of the problem since they had to live, and survive, in this environment full of violence and death. It is in this context that researchers in Mexico, and Ciudad Juárez specifically, decided to analyze this problem to intervene and attempt to minimize the negative consequences in children.
Qualitative observation, direct interviews, psychometric tests and self-report methods are accepted ways of evaluating bullying among school-based professionals (Casey, Hayward, & Gowen, 2001; Glover, Gough, Johnson, & Cartwright, 2000; Shapiro & Heick, 2004). Qualitative observation focuses on elaborating on and explicating the experiences of the bullied and the bullies (Patton, Hong, Patel, & Kral, 2015); this method identifies the physical characteristics of the phenomenon in situ better than its verbal and emotional expressions, but its implementation and analysis are laborious processes. Direct interviews focus in depth on the interpretation and perceptions of bullying phenomena, but it takes a long time for researchers to interpret responses while avoiding subjectivity and thinking errors (Creswell, 2003). Psychometric tests are a formal way to diagnose the psychological and emotional effects of bullying (Tyler, 1972), but are impossible to administer on a large scale. Finally, through a combination of closed and open inquiries, self-reports focus on the subjective perception of, or reflection on, a phenomenon to generate self-evaluations of a psychological condition (Alarcón, Pérez-Luco, Salvo, Roa, Jaramillo, & Sanhueza, 2010).
Questionnaires are also useful tools to evaluate bullying behavior because, despite being less sensitive than the methods mentioned above, they are practical and economical (Crothers & Levinson, 2004). In Mexico, questionnaires to evaluate bullying have been administered with no validity reported (Cerezo, 2006; Instituto Nacional de Salud Pública, 2006; Instituto Nacional de Salud Pública, 2012), so the results may not be reliable or valid. In addition, with the exception of the Bull-M, we do not know of validated questionnaires to evaluate bullying behavior in Mexico; the Bull-M has, however, only been validated in junior high schools (Ramos-Jimenez, Wall-Medrano, Esparza-Del Villar, & Hernández-Torres, 2013). Because bullying manifests itself differently with age and its prevalence changes, it becomes necessary to evaluate bullying at different school ages, and thereby implement better and more effective prevention programs. The importance of the Bull-M is that it has been designed to assess school bullying anonymously and in a short time (10-15 min), thus increasing reliability when administered to larger groups of students and in areas with high rates of violence. In light of the foregoing, the aim of this study was to validate the Bull-M at three different school stages (from elementary school to high school) and by sex.
II. Methods
The sample consisted of 2,030 students from grades 5 to 12 (9 to 20-year-old age range) from the metropolitan area of Ciudad Juárez, in the state of Chihuahua, Mexico (see Table I): 1,001 girls and 1,029 boys from 17 elementary schools (1,110 students), six junior high schools (560 students), and six high schools (360 students). The students interviewed were those present in classrooms on the day the Bull-M was administered. According to the National System of School Statistics Information (SNIEE), there are 611 public schools in the city: 460 elementary schools, 106 junior high schools, 33 high schools and 12 universities, with approximately 400,000 students in total. In order to obtain a random and appropriate sample including both genders and all educational levels, a multi-step method was used for sampling. First, schools and educational levels were selected through a stratified probabilistic method using the SNIEE list of schools for Ciudad Juárez, and then students were selected randomly. In addition, no students younger than 5th grade were included as we considered that these students might not have been able to understand the questions asked.
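The school-selection stage of the multi-step sampling can be sketched as follows. This is only an illustrative sketch, not the authors' actual procedure: the stratum sizes and the number of schools drawn per level come from the text, while the school identifiers, the function names, and the random seed are hypothetical.

```python
import random

# School counts per level in the SNIEE list and schools sampled per level,
# both taken from the text. The school IDs built below are hypothetical.
FRAME_SIZES = {"elementary": 460, "junior_high": 106, "high": 33}
SAMPLED = {"elementary": 17, "junior_high": 6, "high": 6}


def build_frame(sizes):
    """Build a placeholder sampling frame: one list of school IDs per level."""
    return {
        level: [f"{level}-{i:03d}" for i in range(1, n + 1)]
        for level, n in sizes.items()
    }


def sample_schools(frame, n_per_level, seed=None):
    """Draw schools without replacement, independently within each stratum."""
    rng = random.Random(seed)
    return {
        level: rng.sample(schools, n_per_level[level])
        for level, schools in frame.items()
    }


frame = build_frame(FRAME_SIZES)
selected = sample_schools(frame, SAMPLED, seed=2013)  # seed is arbitrary
```

Drawing within each educational level separately is what keeps every level represented, which a single simple random draw over all schools would not guarantee.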
The Bull-M was designed with the aim of understanding the phenomenon of bullying anonymously. It contains ten items, eight of which are divided into two theoretical constructs: the bullied factor (items 2-5), to identify if the participant is being bullied by peers; and the bully factor (items 6-9), to identify if the participant bullies others. There is also one introductory item about social relations, and at the end, one item about the physical symptoms related to bullying (see Table II for the content of items). Response options for this questionnaire follow a five-point Likert scale with the following values: never (0), seldom (1), sometimes (2), often (3), and every day (4). Because items 1 and 10 deal with the problem of bullying indirectly, only items 2 to 9 are included in the theoretical constructs for the confirmatory factor analyses as described in the two factors above. The Bull-M was designed initially by four researchers in the field of social and community health, with the support of seven elementary and junior high school teachers with over five years of direct in-class experience in schools with a high degree of violence, and where bullying often occurs. Other studies include a complete validation of the questionnaire design (Ramos-Jimenez, Wall-Medrano, Esparza-Del Villar & Hernández-Torres, 2013).
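The item layout described above can be expressed as a small scoring sketch. The response coding and the assignment of items to factors come from the text; computing each factor score as a simple sum, and the function and key names, are assumptions made for illustration only.

```python
# Likert coding as described in the text.
LIKERT = {"never": 0, "seldom": 1, "sometimes": 2, "often": 3, "every day": 4}

# Item-to-factor assignment as described in the text.
BULLIED_ITEMS = (2, 3, 4, 5)
BULLY_ITEMS = (6, 7, 8, 9)


def score_bull_m(responses):
    """Score one questionnaire.

    responses maps item number (1-10) to a response label.
    Summing items within a factor is an assumption of this sketch.
    """
    values = {item: LIKERT[label] for item, label in responses.items()}
    return {
        "bullied": sum(values[i] for i in BULLIED_ITEMS),
        "bully": sum(values[i] for i in BULLY_ITEMS),
        "item_1": values[1],    # social relations, scored separately
        "item_10": values[10],  # physical symptoms, scored separately
    }


# Hypothetical respondent, for illustration only.
example = {1: "often", 2: "sometimes", 3: "seldom", 4: "never", 5: "often",
           6: "never", 7: "never", 8: "seldom", 9: "never", 10: "sometimes"}
scores = score_bull_m(example)
```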
Once participating schools were selected, written informed consent was obtained from participating students, school administrators and parents prior to administering the questionnaire. If authorization was not given, an alternative school was randomly selected. Participants were allowed 20 minutes to complete the questionnaire, which was administered in a classroom without a teacher or other school authority figure present. Groups were comprised of up to 30 students, and the protocol was approved by school authorities and the Ethics Committee of the Universidad Autónoma de Ciudad Juárez (Autonomous University of Ciudad Juárez).
To analyze the internal reliability of the questionnaire between education levels and genders, Cronbach's alpha analyses were performed. Construct validity was assessed with a confirmatory factor analysis with a two-factor structure, as described previously (Hu & Bentler, 1999), using structural equation modeling with the maximum likelihood estimation (MLE) method. A previous study analyzed the factor structure of the scale using an exploratory factor analysis (EFA; Ramos-Jimenez et al., 2013). An EFA analyzes the factors into which items group together without constraints, and suggests a specific number of factors for the scale according to eigenvalues greater than one (Yong & Pearce, 2013). Once a factor structure has been defined with the EFA, it must be confirmed; the suggested procedure is to perform a confirmatory factor analysis (CFA), which compares observed variables (responses to items) to a proposed factor structure, usually the structure found in the EFA (Kline, 2013). Since the Bull-M has been previously validated, in this study a CFA was used to confirm the factor structure of the scale (Kline, 2013).
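For readers unfamiliar with the reliability index, a minimal sketch of Cronbach's alpha is given below. It uses the standard formula, α = k/(k−1) · (1 − Σ item variances / variance of total scores); the function name and the example responses are hypothetical, and the study itself computed these values in SPSS.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents.

    rows: one list of item scores per respondent, all of equal length.
    Uses sample variances (n - 1 in the denominator).
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are required")
    item_variances = [variance([r[i] for r in rows]) for i in range(k)]
    total_variance = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses (0-4 Likert values) for a four-item factor.
responses = [
    [0, 1, 1, 2],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [4, 4, 3, 4],
    [0, 0, 1, 0],
]
alpha = cronbach_alpha(responses)
```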
Finally, measurement invariance was assessed between men and women. Measurement invariance assesses the equivalence of a measurement across groups to verify whether a scale measures the construct equally. The Bull-M is an indicator of a latent construct, bullying, and this study analyzed whether the items relate to the latent variables equally in men and women. To assess this equivalence, the first step is to analyze the factor structure of the scale in both groups (men and women) and evaluate whether the item responses of the samples fit the proposed two-factor model using structural equation modeling. If the model has an adequate fit, then the item loadings are constrained (metric invariance), assuming they are the same across both groups. If the new model with the constrained loadings has an adequate fit, then the intercept equivalence, error variances and covariance equivalence are evaluated. The structural equation models were analyzed with the Amos 22.0 computer program; all other analyses were performed with SPSS 22.0.
III. Results
The factor structure analyzed in this study was slightly modified from that of a previous study in which the instrument was validated in Mexico (Ramos-Jimenez et al., 2013). The bullied factor includes items 2 to 5, and measures whether the student is bullied by others. The bullies factor evaluates whether the student bullies others. For this structure, items 1 and 10 were removed from the bullied factor since they measure bullying indirectly; item 1 is related to social relations and item 10 is related to physical symptoms. The table of Cronbach’s alpha if item deleted and item-total correlations supported our decision to evaluate item 1 by itself.
The first step was to analyze the internal reliability of the total scale and its factors by calculating Cronbach’s alpha indices for the total scale, the bullied factor, and the bullies factor (see Table II). The internal reliabilities, including all participants, were α=0.82 for the total scale, α=0.76 for the bullied factor, and α=0.78 for the bullies factor. The internal reliability was also calculated for each educational level and by sex (see Table I). Eliminating any of the items would decrease the value of Cronbach’s alpha for each of the factors, indicating that all of the items should be kept. When analyzing the internal reliability of the 10 items, the alpha value would increase only if item 1 were deleted, which is why item 1 is scored separately. Most of the internal reliability values were acceptable; the exceptions, with alpha values below 0.70, were the total scale for middle school (α=0.68), the bullied factor for high school (α=0.67), and the bullies factor for middle school (α=0.67). According to the item-total correlation table, all of the items have Pearson correlations greater than 0.30 (Cristobal, Flavián, & Guinalíu, 2007), except for item 1, which has a very low correlation; this is another reason why it is scored separately.
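The two item-level diagnostics used in this step can be sketched as follows. This is an illustrative sketch, not the SPSS output: the data are hypothetical, and correlating each item with the sum of the remaining items (the "corrected" item-total correlation) is an assumption, since the text does not state which variant was reported.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha from respondent rows of item scores."""
    k = len(rows[0])
    item_variances = [variance([r[i] for r in rows]) for i in range(k)]
    total_variance = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_diagnostics(rows):
    """Alpha if item deleted and item-rest correlation for every item."""
    k = len(rows[0])
    report = []
    for i in range(k):
        rest = [[v for j, v in enumerate(r) if j != i] for r in rows]
        report.append({
            "item": i + 1,
            "alpha_if_deleted": cronbach_alpha(rest),
            "item_rest_r": pearson([r[i] for r in rows], [sum(r) for r in rest]),
        })
    return report


# Hypothetical data in which the first item behaves differently from the rest.
data = [
    [2, 0, 1, 0, 1],
    [0, 1, 1, 2, 1],
    [3, 2, 3, 2, 2],
    [1, 3, 3, 4, 3],
    [2, 4, 4, 3, 4],
    [4, 0, 0, 1, 0],
]
overall_alpha = cronbach_alpha(data)
diagnostics = item_diagnostics(data)
```

An item whose removal raises alpha and whose item-rest correlation falls below 0.30 is a candidate for separate scoring, which is the pattern the text reports for item 1.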
The second step was to perform the confirmatory factor analyses for all of the participants and for each of the educational levels and sexes (see Table III). The model fit was analyzed with the following indices: chi-square (χ2), chi-square divided by degrees of freedom (χ2/df), goodness of fit index (GFI), normed fit index (NFI), comparative fit index (CFI), and the root mean square error of approximation (RMSEA). Since the χ2 test is sensitive to large sample sizes and will be statistically significant most of the time, we used other fit indices that are not sensitive to large sample sizes, as proposed by Hu and Bentler (1999), to evaluate model fit. The cut-off values for a good model fit are the following: GFI≥.90, NFI≥.90, CFI≥.95, and RMSEA≤.06. All of the models had good fit on most of the indices, except for the RMSEA, whose values ranged from 0.074 to 0.115 and exceeded the cut-off. It can be concluded that the models are acceptable both when including all of the participants and for each group.
Note. χ2=Chi-square; df=Degrees of Freedom; GFI=Goodness of Fit Index; NFI=Normed Fit Index; CFI=Comparative Fit Index; RMSEA=Root Mean Square Error of Approximation.
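The cut-off rule applied to the fit indices can be sketched as below. The thresholds are those stated in the text; the function names and the example index values are hypothetical. The RMSEA helper uses a common point-estimate formula, sqrt(max(χ2 − df, 0) / (df · (N − 1))), and is included only to show how the index relates to χ2, df and sample size; the exact computation in a given SEM program may differ slightly.

```python
from math import sqrt

# Cut-offs as stated in the text (following Hu & Bentler, 1999).
CUTOFFS = {
    "GFI": (">=", 0.90),
    "NFI": (">=", 0.90),
    "CFI": (">=", 0.95),
    "RMSEA": ("<=", 0.06),
}


def rmsea(chi2, df, n):
    """Common RMSEA point estimate for a single-group model."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def check_fit(indices):
    """Return, per index, whether the value meets its cut-off."""
    result = {}
    for name, (direction, cutoff) in CUTOFFS.items():
        value = indices[name]
        result[name] = value >= cutoff if direction == ">=" else value <= cutoff
    return result


# Hypothetical index values, chosen only to illustrate the pattern in the
# text: all indices meet their cut-offs except the RMSEA.
fit = check_fit({"GFI": 0.97, "NFI": 0.95, "CFI": 0.96, "RMSEA": 0.074})
```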
Factor loadings and uniqueness values were calculated for each item (see Table IV). A factor loading refers to how much an item contributes to a factor (Yong & Pearce, 2013) and uniqueness is the variance of an item that is due to measurement error and specific variance (Kline, 2013). An item should have a minimal factor loading value of 0.320; if the item has a lower value, then it should be discarded (Tabachnick & Fidell, 2007). The factor loadings for all of the items in all of the analyses were good, ranging from 0.495 to 0.757.
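The relation between a loading and its uniqueness, and the 0.32 retention rule, can be sketched as follows. The sketch assumes standardized loadings and items that load on a single factor, in which case uniqueness equals one minus the squared loading; the function names and the below-threshold example loading are hypothetical.

```python
MIN_LOADING = 0.32  # minimum loading from Tabachnick & Fidell (2007), as cited


def uniqueness(loading):
    """Uniqueness of an item with one standardized loading: 1 - loading^2."""
    return 1.0 - loading ** 2


def screen_items(loadings):
    """Attach uniqueness and a retain/discard flag to each item's loading."""
    return {
        item: {
            "loading": value,
            "uniqueness": uniqueness(value),
            "retain": abs(value) >= MIN_LOADING,
        }
        for item, value in loadings.items()
    }


# The lowest and highest loadings reported in the text, plus one hypothetical
# weak loading to show the discard rule.
screened = screen_items({"lowest": 0.495, "highest": 0.757, "hypothetical": 0.20})
```

Under these assumptions, even the lowest reported loading (0.495) leaves roughly three quarters of that item's variance as uniqueness, which is why loadings are best read alongside the reliability values.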
The final step was to analyze the measurement invariance for the factor structure across men and women. Measurement invariance was assessed by calculating the model fit of the scale for both men and women. Since both samples had good model fit indices for the scale, both scales were analyzed by constraining parameters across both samples (see Table V).
Note. χ2=Chi-square; df=Degrees of Freedom; GFI=Goodness of Fit Index; NFI=Normed Fit Index; CFI=Comparative Fit Index; ΔCFI=Difference in CFI between the baseline model and nested models; RMSEA=Root Mean Square Error of Approximation.
--=Value not calculated.
a= Nested models were compared to Model 1.
** p<.01
For the first model, the model fit of the scale was analyzed without constraints between the two samples. For the second model, the measurement weights were constrained across both samples. The third model included the constraints of model two and, in addition, constrained the measurement intercepts, structural weights, structural intercepts, structural means, and structural covariances. In the fourth and final model, the constraints of model three were present and, in addition, structural residuals and measurement residuals were constrained. For these analyses, Cheung & Rensvold (2002) recommend using the difference in the comparative fit index (ΔCFI) as an index to assess invariance between nested models and the baseline model. The CFI is not sensitive to sample size or non-normal data. They suggest that a ΔCFI of less than or equal to 0.01 indicates a similar model fit (Cheung & Rensvold, 2002). Our results indicate partial measurement invariance, since the difference in the comparative fit indices of the fourth model compared to the baseline model was 0.015, which is greater than the suggested 0.01.
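The ΔCFI decision rule can be sketched as below. The 0.01 threshold and the comparison of each nested model against the unconstrained baseline follow the text; the CFI values are hypothetical and were chosen only so that the fourth model reproduces the reported ΔCFI of 0.015. The function and model names are likewise illustrative.

```python
def invariance_report(baseline_cfi, nested_cfis, threshold=0.01):
    """Compare each nested model's CFI with the baseline model's CFI.

    Returns, per nested model, the CFI drop and whether it stays within the
    threshold. Differences are rounded to three decimals, the precision at
    which fit indices are usually reported, to avoid floating-point noise
    at the boundary.
    """
    report = {}
    for name, cfi in nested_cfis.items():
        delta = round(baseline_cfi - cfi, 3)
        report[name] = {"delta_cfi": delta, "invariant": delta <= threshold}
    return report


# Hypothetical CFI values for illustration only.
report = invariance_report(
    baseline_cfi=0.960,
    nested_cfis={
        "model_2_measurement_weights": 0.958,
        "model_3_intercepts_and_structural": 0.952,
        "model_4_residuals": 0.945,
    },
)
```

With these illustrative values, the first two nested models stay within the threshold and only the residual-constrained model exceeds it, which mirrors the partial invariance described in the text.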
IV. Discussion and conclusions
Bullying is a social problem that originates primarily in homes with an abuse of power, violence, and communication problems (Fatima & Khatoon, 2015), and its consequences can include criminality, substance use and abuse, and suicide (Fleming & Jacobsen, 2009). According to the National Survey for Health and Nutrition (Instituto Nacional de Salud Pública, 2006; 2012), there has been an increase in bullying in Mexico; this study therefore evaluates the self-report Bull-M questionnaire as a tool to assess the prevalence of bullying among students. This instrument was tested and validated in a previous study with students from 7th to 9th grade from northern Mexico (Ramos-Jimenez et al., 2013). The present study extends the sample from 5th to 12th grade and used a gender-balanced sample. The sample was randomly chosen considering all the schools in Ciudad Juárez as listed by the SNIEE.
The main results indicate that the Bull-M questionnaire is a valid instrument to evaluate anonymously the prevalence of aggressors and victims of bullying among students from different grades and by gender. The internal reliability analyses in our sample show adequate Cronbach’s alpha values (0.77 to 0.87) for the total scale across all groups; however, when the reliabilities are analyzed by factor, there is a decrease in the values, which range from 0.67 to 0.80.
The confirmatory factor analyses performed with the total sample and all other groups indicated adequate model fit indices. All of the chi-square analyses were significant, which translates into a bad model fit, but this can be explained by the large sample sizes, according to Hu & Bentler (1999); for this reason there are other fit indices included in the analyses. The factor loadings for all of the items in both factors are strong, ranging from 0.55 to 0.76.
There is a high correlation between the bullied and bullies factors (r=0.828), which might suggest a unidimensional solution. A two-factor solution was kept, as exploratory and confirmatory factor analyses in a previous validation study suggested and confirmed a two-factor solution. The two-factor structure was validated in this study, but with a high correlation between factors. The two factors were kept separate because the theory of bullying identifies two groups of people: the bullies and the bullied. Another reason to keep the two-factor structure is that in recent years Ciudad Juárez has had high levels of social violence due to the war between drug cartels, which has had an impact on society. This violence has been reflected in some school settings, with people becoming victims or perpetrators of violence. The schools sampled for this study were in the most affected areas of Ciudad Juárez; in this setting, the high correlation between the two factors indicates that people who report higher levels of being bullied also report higher levels of bullying.
Measurement invariance across boys and girls was assessed but the results indicated partial measurement invariance. The instrument is invariant in the factor structure and loadings, measurement intercepts, structural means and structural covariances, but it was not invariant in structural and measurement residuals, even though it has been argued that invariance across measurement residuals is an overly restrictive test (Byrne, 2004); for this reason, we suggest it can be used to compare results between boys and girls since it has the same representation of factors in both groups. Measurement invariance across educational levels was not assessed since the sample for middle school was small in comparison with the rest of the groups.
This study has several limitations. First, only the factor structure and internal reliability were evaluated. There is a need to evaluate other types of validity, such as concurrent or discriminant validity, and to analyze the correlation of the Bull-M with other scales that measure related or similar constructs. Second, the RMSEA values for all groups were not ideal, even though the other model fit indices showed ideal values. Third, the internal reliability alpha values were acceptable in most of the sample except for middle school and high school, which produced alpha values below 0.70. Fourth, the total sample size was 2,030, but when the sample was divided by education levels, high school had only 360 students. There is a need for a larger sample in high school and middle school, especially as larger sample sizes are necessary when estimating structural equation models. Fifth, this sample was obtained only from Ciudad Juárez, in northern Mexico; there is a need for samples from other parts of the country, including the south and center.
According to the results, the Bull-M can be used at different educational levels, from 5th grade to 12th grade. The confirmatory factor analyses performed in the total sample, and by educational level and gender, indicate a good model fit with the bullied and bullies factors. Most of the model fit indices are adequate, the factor loadings for all of the items are strong, and the Cronbach’s alpha values indicate adequate internal reliability for both factors and the total scale. The Bull-M is an anonymous and brief instrument that can be used to diagnose bullying problems in schools in northern Mexico.
Validation of the Bull-M in samples that include children from 5th grade to 12th grade was necessary to evaluate bullying in schools in Ciudad Juárez. After the violence escalated in Ciudad Juárez, it became a considerable problem with social and psychological consequences. Schoolchildren started to accept violent behavior as normal, and teachers reported that students would sometimes pretend to be drug dealers during recess at school. School administrators, government institutions and, in particular, civil society organizations started programs to intervene in schools to reduce violence among students.
When these interventions began, bullying assessments were not used or were very informal. The development and validation of the Bull-M provided a low-cost and rapid evaluation of violence at school. This scale has been used in several schools to assess bullying and will be promoted in the city and state government so that it can be applied in all schools to assess violence among students. This tool will provide quantitative information about the problem so that school authorities may decide if it is necessary to intervene with programs targeted at reducing violence. This scale can be validated in other populations to serve as a quick and low-cost tool to detect violence in schools.