Machine Learning Algorithms for Predicting of Academic Achievement

Morales Hernández, Miguel Ángel; González Camacho, Juan Manuel; Robles Vásquez, Héctor; Valle Paniagua, David H. del; Durán Moreno, José Rafael

doi:10.23913/ride.v12i24.1180

RIDE. Revista Iberoamericana para la Investigación y el Desarrollo Educativo

On-line version ISSN 2007-7467

Abstract

MORALES HERNANDEZ, Miguel Ángel et al. Machine Learning Algorithms for Predicting of Academic Achievement. RIDE. Rev. Iberoam. Investig. Desarro. Educ [online]. 2022, vol.12, n.24, e035. Epub 23-May-2022. ISSN 2007-7467. https://doi.org/10.23913/ride.v12i24.1180.

In this research, two machine learning classifiers, a multilayer perceptron (MLP) and a gradient boosting model (GB), were implemented to predict the degree of academic achievement in Spanish and mathematics of basic education students at two stages, sixth grade of primary school (2008) and third grade of secondary school (2011), based on contextual variables obtained from the Enlace test in the state of Tlaxcala, Mexico. Thirteen input variables were considered, and their relative importance was determined with a random forest (RF) classifier. The MLP and GB classifiers were trained and tested on a dataset of 11,036 records of students who remained in the school system from 2008 to 2011, with predictions made for both 2008 and 2011. In Spanish, MLP outperformed GB, with a global classification accuracy (PG) of 70.1% in 2008 and 61.1% in 2011. In mathematics, GB performed better, with a PG of 68.8% in 2008 and 63.5% in 2011. The Spanish score was strongly associated with the degree of academic achievement in mathematics. Scores in Spanish and mathematics had greater relative importance than the contextual factors considered, such as sex, scholarship, and school shift. In the population of students analyzed, in both Spanish and mathematics the proportion of women was higher than that of men at achievement levels 1 (elementary) and 2 (good or excellent); in contrast, this proportion was reversed at achievement level 0 (insufficient).

Keywords: supervised learning; decision trees; school context; artificial neural networks; cross validation.
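
The abstract reports results as a global classification accuracy (PG) over three achievement levels. As a minimal sketch, assuming PG is the usual overall accuracy (correctly classified students divided by all students), it could be computed as shown below. The function names and the ten labels are hypothetical and purely illustrative; they are not the study's code or data.

```python
from collections import Counter

# Achievement levels as coded in the abstract:
# 0 = insufficient, 1 = elementary, 2 = good or excellent.


def global_accuracy(y_true, y_pred):
    """Share of students whose predicted level equals the observed level."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_counts(y_true, y_pred):
    """Count how often each (observed, predicted) pair of levels occurs."""
    return Counter(zip(y_true, y_pred))


# Made-up labels for ten students (illustrative only, not the study's data).
observed = [0, 0, 1, 1, 1, 2, 2, 1, 0, 2]
predicted = [0, 1, 1, 1, 2, 2, 2, 1, 0, 1]

pg = global_accuracy(observed, predicted)
counts = confusion_counts(observed, predicted)
print(f"PG = {pg:.1%}")  # prints "PG = 70.0%"
```

The pair counts give the cells of a confusion matrix, which shows at which achievement level a classifier tends to err, something a single PG value cannot show.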
