I. Introduction
Higher education (HE) offers high transformative potential for students, not only because its main mission is to provide students with technical and professional knowledge and skills, essential for the development of society, but also by stimulating students’ comprehensive development, helping them to become autonomous and critical citizens (Fragoso et al., 2013; Harman, 2017). In recent decades, student numbers have increased in HE institutions and students are gradually becoming more heterogeneous in terms of social, academic, and individual trajectories (Harrison & Waller, 2018). The growing democratization of access to HE poses new challenges for institutions in promoting student retention and course completion. These challenges become even more pronounced if institutions take a proactive approach in supporting the academic success of first-year students. Difficulties in transition and adaptation to college are frequent and increase in more vulnerable first-year students (Ecclestone et al., 2010).
For Tinto (1975), student adaptation and success will only be possible with social and academic integration. The emphasis on interactionism in his model points to dynamic and reciprocal interactions between the student’s personal characteristics, formal and informal characteristics of the higher education institution (HEI), and characteristics of the community the student is placed in (Duarte et al., 2014; Meyer & Marx, 2014). Persistence in or dropout from HE must be understood as a multivariate phenomenon in which several personal and contextual factors interact (Casanova et al., 2018; Freixa et al., 2018; Tinto, 2010). For this reason, dropout can be analyzed as a process of student disconnection from the institution, the course, and academic life, and not as an isolated event or a hasty or impulsive decision (Casanova et al., 2018), with a number of personal and institutional variables playing a role in students’ decisions.
One important predictor in explaining dropout is students’ level of commitment toward the institution and course, for example in terms of their vocational and professional career. In countries where HE enrollments are managed through a numerus clausus system based on average academic grade in secondary education and entrance exam scores, like in Portugal, a considerable percentage of students may not have gained access to their first-choice course or institution (nearly 40% in Portugal). This situation may result in higher persistence and completion rates among students attending a first-choice course, with these rates also associated with higher admission scores (Diseth, 2011; Postareff et al., 2017). In other words, attending a course that is not a student’s first choice may explain lower engagement in learning activities and in relationships with classmates and teachers, reduced levels of satisfaction and self-efficacy, and lower academic expectations, contributing to worse adaptation and academic failure (Fleith et al., 2020). These negative academic experiences prompt student disconnection and dropout intention (Belloc et al., 2011; Casanova et al., 2018; Koshy et al., 2017; Stinebrickner & Stinebrickner, 2014).
One factor related to student commitment to the institution and course, and another important predictor of dropout, is academic achievement (Figuera et al., 2015; Tuero et al., 2018). Students with low levels of achievement, particularly when they do not complete a certain number of curriculum units, exhibit a higher dropout rate (Casanova et al., 2018). This may be associated with the course or area of study, with the highest failure rates found in science and engineering courses. In these courses, more students mention learning difficulties, notably in the core or propaedeutic curriculum units and when previous curricular knowledge from secondary education has been poorly consolidated (Seymour & Hewitt, 1997; Yorke, 2016). These gaps in academic competencies may require additional hours of study, which students are not always able to spare. These students, experiencing failure and difficulties, are less invested in an academic and professional career involving HE, stay at the university with lower academic expectations, and invest less in their education (Casanova et al., 2018; Stinebrickner & Stinebrickner, 2014). All these factors can contribute to a higher dropout rate in first-year students (Crosling et al., 2009; Tinto, 2010).
Students’ sociocultural background introduces a new set of variables that affect HE dropout rates. Students from rural areas or from outside of the city where they attend college report difficulties meeting schedules because of transport, and also cite problems related to housing and food. Sometimes feelings of loneliness can lead to sleeping and eating difficulties, substance use, and a progressive disconnection from classmates and the course (Casanova et al., 2020; Sinval et al., 2021). These difficulties can be more evident in students who leave the parental home to attend university. Without adequate levels of autonomy and maturity, these students find it more difficult to manage their responsibilities and daily activities, and also experience isolation or anxiety due to a lack of social support (Newlon & Lovell, 2017).
Finally, student age appears to be a relevant personal variable in explaining dropout. Sánchez-Gelabert and Elias (2017), with a cohort of 6,367 first-year students of the Autonomous University of Barcelona, conclude that older students who work tend to drop out more frequently than traditional students (younger students not in full or part-time employment). Traditional students tend to exhibit a higher level of family cultural and economic capital that allows them to understand and engage more with academic university life (Tuero et al., 2018). Several studies point to a higher dropout rate among older students (Figuera et al., 2015), which may be more frequent in male students who work (Belloc et al., 2011; Venegas-Muggli, 2019) or in female students with young children to care for (Fragoso et al., 2013). Indeed, while dropout in older male students may reflect new or changing work commitments, in older female students new family responsibilities, like a change in marital status, childbirth or family illnesses, become more important factors in a dropout decision (González-Ramírez & Pedraza-Navarro, 2017; Sánchez-Gelabert & Elías, 2017; Severiens & ten Dam, 2012). In addition, dropout in older students tends to be related to academic achievement. A high rate of dropout among male students may be associated with learning difficulties and lower performance levels (García & Adrogué, 2015), in contrast to better study skills and greater academic success among female students (Andrade et al., 2017; García de Fanelli, 2014). Recent legislation in Portugal has encouraged entry into higher education by individuals aged 23 or over who have not completed secondary education. These students sometimes lack certain learning competencies or study habits (Newlon & Lovell, 2017).
This study aims to contribute to the identification of variables that predict students’ decision to persist in or drop out of higher education. The regression tree method was implemented with persistence/dropout as the criterion and a large number of personal and contextual variables as predictors: sex, age, study habits in secondary education, repetition of grades in basic or secondary education, vocational guidance in secondary education, grade point average to access HE, attendance of a first-choice course, attendance of a first-choice university, attendance of HE away from home, being employed, degree of confidence in graduating in the subject they are studying, and degree of confidence in completing a degree at the university where they are currently studying.
II. Method
Participants. The participants were 2843 first-year students of a public university in northern Portugal, of whom 35.4% had parents with only basic education, 34.3% had at least one parent with secondary education, and 30.3% had at least one parent with tertiary education. The majority of participants were female (55.5%), and the average age was 18.88 years (SD = 3.64), ranging from 16 to 61, with 5.6% being students over the age of 23. The students were enrolled in courses in different areas of study: 38.6% in legal sciences, 34.1% in engineering and architecture, 15.2% in exact sciences, 8.1% in languages, and 4% in health sciences. In addition, 25.5% of the students reported being employed on a part-time or full-time basis.
Instruments. Data collection was carried out at two different points in time. At the first point of data collection (at the time of enrollment at the university), a questionnaire was administered that included items related to information about the student: sex, age, father’s and mother’s education, course of enrollment, attendance of HE away from home, and student employment; information on schooling history and vocational options: grade point average to access HE, attendance of first-choice course and university, degree of confidence in graduating in their subject, and degree of confidence in completing a degree at the university they are studying in; and a short list of seven items describing study habits in secondary education, answered on a 5-point Likert frequency scale. The second stage of data collection took place at the beginning of the following year in conjunction with the university’s administrative services, and with the students’ consent, obtained as part of the first stage of data collection. For this second stage, data were collected on the number of curriculum units passed and on student persistence or dropout after one year of enrollment.
Procedures. The study was carried out respecting the ethical standards of research with human beings, in accordance with the guidelines of the Declaration of Helsinki and the Oviedo Convention, and received a favorable opinion from the Ethics Council of the HE institution. Upon enrollment in the first year of HE, students were informed of the objectives of the study, and they provided free and informed consent in writing. Authorization was also requested for access to data on academic performance and persistence status at the beginning of the following academic year. The confidentiality of the data collected was guaranteed, as was the participants’ right to withdraw or exclude themselves from the study at any time during the research.
We used a regression tree method to explore relationships between variables without a prior theoretical model (Gomes & Jelihovschi, 2019). Several studies applying tree methods to educational data also offer technical arguments that the regression tree method is superior to general linear model techniques for handling data on educational systems that include 1) nominal variables with many categories; 2) ordinal variables for which the assumption of equal distances between the ranges of values is not very plausible; and 3) possible non-linear relationships between predictors and outcome (Gomes & Jelihovschi, 2019; Gomes et al., 2020, 2021).
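To make the splitting logic concrete, the following is a minimal sketch of how a tree can choose a cut-point on a single predictor. It assumes a CART-style Gini impurity criterion, which the text does not specify, and the function names and the toy ages and outcomes are invented for illustration only; they are not study data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted impurity) for the best split 'value <= threshold'."""
    best = (None, float("inf"))
    # Skip the maximum value: splitting there would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Toy data, hypothetical: not taken from the study sample.
ages = [18, 18, 19, 19, 20, 21, 22, 24, 27, 35]
status = ["persist"] * 7 + ["dropout"] * 3

threshold, impurity = best_threshold(ages, status)
print(f"best split: age <= {threshold} (weighted Gini = {impurity:.3f})")
```

A full tree repeats this search over every predictor and then recurses within each resulting subgroup, which is why no prior functional form has to be specified.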
III. Results
Table 1 presents the distribution of students for different variables, separated into two groups: dropout and persistence in the following academic year. The mean and standard deviation are given for non-categorical variables.
Variable | Persistence M | Persistence SD | Dropout M | Dropout SD |
Age | 18.23 | 1.89 | 21.34 | 6.53 |
Grade point average to access HE | 15.16 | 17.54 | 14.69 | 18.85 |
Level of confidence | | | | |
Graduating in subject | 4.57 | .74 | 4.39 | .88 |
Completing a degree at current university | 4.64 | .69 | 4.30 | 1.02 |
Variable | Persistence n | Persistence % | Dropout n | Dropout % |
Sex | | | | |
Male | 892 | 43.8 | 253 | 47.1 |
Female | 1146 | 56.2 | 284 | 52.9 |
Failed a year in basic/secondary education | | | | |
Yes | 298 | 15.6 | 113 | 26.6 |
No | 1618 | 84.4 | 312 | 73.4 |
Vocational guidance | | | | |
Yes | 900 | 47.2 | 162 | 38.2 |
No | 1006 | 52.8 | 262 | 61.8 |
First-choice course | | | | |
Yes | 1098 | 57.5 | 225 | 53.3 |
No | 810 | 42.5 | 197 | 46.7 |
First-choice university | | | | |
Yes | 1413 | 74.9 | 266 | 63.9 |
No | 474 | 25.1 | 150 | 36.1 |
Attending HE away from home | | | | |
Yes | 756 | 39.6 | 177 | 41.4 |
No | 1155 | 60.4 | 251 | 58.6 |
Being employed | | | | |
Yes | 113 | 5.9 | 84 | 19.6 |
No | 1808 | 94.1 | 344 | 80.4 |
An analysis of the students who remain in HE and students who drop out shows similar percentages in the variables sex, first-choice course, and attending HE away from home. In contrast, students who drop out show a higher rate of failure in basic or secondary education (a difference of 11.0 percentage points) and of employment (13.7 percentage points). Fewer students who leave HE have received vocational guidance (9.0 percentage points), and a greater proportion of students who drop out are not attending their first-choice institution (11.0 percentage points).
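The group differences quoted above can be recomputed from the counts in Table 1. The sketch below is only an arithmetic check; the dictionary keys and function names are illustrative choices, not part of the original analysis.

```python
# Counts copied from Table 1, ordered as:
# (persistence yes, persistence no, dropout yes, dropout no)
TABLE1 = {
    "failed_a_year": (298, 1618, 113, 312),
    "vocational_guidance": (900, 1006, 162, 262),
    "first_choice_university": (1413, 474, 266, 150),
    "employed": (113, 1808, 84, 344),
}


def pct_yes(yes, no):
    """Percentage of 'yes' answers among valid responses."""
    return 100.0 * yes / (yes + no)


def point_difference(row):
    """Dropout minus persistence, in percentage points."""
    p_yes, p_no, d_yes, d_no = row
    return pct_yes(d_yes, d_no) - pct_yes(p_yes, p_no)


differences = {name: point_difference(row) for name, row in TABLE1.items()}
for name, diff in differences.items():
    print(f"{name}: {diff:+.1f} percentage points")
```

Computed from the raw counts, the first-choice university gap comes out at about 10.9 points; the 11.0 quoted in the text corresponds to subtracting the rounded percentages (36.1 and 25.1).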
Table 2 presents the mean and standard deviation of results on seven items relating to study habits in secondary education, for students who persist and students who drop out. Mean differences between the two groups were tested with a t-test.
Study habits in secondary education | Persistence n | Persistence M | Persistence SD | Dropout n | Dropout M | Dropout SD | t | p |
Hab1. I made a plan before I started studying or before I started schoolwork | 2305 | 2.87 | 1.21 | 538 | 2.81 | 1.18 | -.959 | .337 |
Hab2. After a test I tried to look at what I got right and wrong to assess my performance | 2305 | 3.58 | 1.11 | 538 | 3.52 | 1.17 | -1.038 | .299 |
Hab3. I tried to study something to gain a proper understanding of what I was learning | 2305 | 4.30 | .74 | 538 | 4.18 | .77 | -3.021 | .003 |
Hab4. If it helped me understand, I summarized, took notes or solved more exercises | 2305 | 4.20 | .93 | 538 | 4.11 | .97 | -1.725 | .085 |
Hab5. I followed a study schedule every day that I set for myself | 2305 | 2.77 | 1.12 | 538 | 2.76 | 1.15 | -.113 | .910 |
Hab6. When I did not understand a subject or exercise, I asked the teacher and classmates for help | 2305 | 3.96 | .90 | 538 | 3.86 | .93 | -1.931 | .054 |
Hab7. I was concerned about finishing schoolwork within the deadlines | 2305 | 4.45 | .85 | 538 | 4.35 | .94 | -2.086 | .037 |
On all seven items, mean scores are higher among students who do not drop out. The differences are significant (p < .05) for Hab3 (“I tried to study something to gain a proper understanding of what I was learning”) and Hab7 (“I was concerned about finishing schoolwork within the deadlines”), and approach significance for Hab6 (“When I did not understand a subject or exercise, I asked the teacher and classmates for help”; p = .054).
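As an illustration of how such a comparison can be computed from summary statistics alone, the sketch below applies Welch's unequal-variance t-test to the Hab3 row of Table 2. The choice of Welch's variant, the normal approximation for the p-value, and the sign convention (dropout minus persistence) are assumptions made here; the text does not state which variant was used, and the function names are illustrative.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic and approximate degrees of freedom for mean_a - mean_b."""
    var_a = sd_a ** 2 / n_a  # squared standard error of group a
    var_b = sd_b ** 2 / n_b  # squared standard error of group b
    t = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, df


def two_sided_p_normal(t):
    """Two-sided p-value from the normal approximation (reasonable when df is large)."""
    return math.erfc(abs(t) / math.sqrt(2))


# Hab3 summary statistics from Table 2: dropout group first, persistence group second.
t_stat, df = welch_t(4.18, 0.77, 538, 4.30, 0.74, 2305)
p_value = two_sided_p_normal(t_stat)
print(f"t = {t_stat:.2f}, df = {df:.0f}, p = {p_value:.4f}")
```

Because the means and standard deviations in the table are rounded to two decimals, a recomputation of this kind should be expected to land near, not exactly on, the reported statistic.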
Figure 1 presents the model that classifies students according to each variable, making it possible to determine which variables are predictors of the decision to remain at or leave the university.
As expected, at the end of the first academic year most students remain at the university (81.1%) and 18.9% of students drop out. The model obtained reveals good predictive capacity (83.8%), although it is a more accurate predictor of cases of persistence (85.3%) than dropout (66.7%). The model classifies students into seven groups based on the values in the variables considered, and in three groups, students are more likely to remain in college than drop out, while in the remaining four groups, students are more likely to drop out.
Analyzing the groups in detail, we find that Group 1, which represents 5% (n = 142) of the total sample, is made up of students over 22 years of age, who exhibit a high probability of dropping out (86%). The data show no other relevant variables for the prediction of this group in the model. For students aged 22 or under, the situation is more complex, with other variables predicting persistence or dropout status.
Group 2, which represents 2% (n = 56) of the total sample, confirms the relevance of the age variable. The group consists of students who completed fewer than 3 CUs in their first year and are aged between 20 and 22 years old and pursuing a degree in social and juridical sciences, who exhibit a 31% probability of dropping out and a 69% probability of remaining in college.
Group 3 is made up of younger students with a lower level of academic performance and accounts for 4% of the total sample (n = 113). For these students, who have completed fewer than 3 CUs in their first year, are aged up to 19 years and are pursuing courses outside of the field of social and juridical sciences, the probability of remaining in college is 62% and the probability of dropping out is 28%.
Group 4 represents the majority of participants (89% of the total sample, n = 2530), and comprises students aged 22 or under who have completed 3 or more CUs during their first year. In this group, students have an 87% probability of persisting and a 13% probability of dropping out.
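The group definitions described above can be read as a short sequence of decision rules. The sketch below encodes them as worded in the text; it is not the fitted model. The function and argument names are hypothetical, the placement of students with exactly 3 CUs is an assumption, and combinations the text does not describe return None.

```python
from typing import Optional


def assign_group(age: int, cus_completed: int, social_juridical: bool) -> Optional[int]:
    """Map a student to Group 1-4 following the rules as worded in the text."""
    if age > 22:
        return 1  # over 22: no further splits are reported for this group
    if cus_completed >= 3:
        return 4  # assumption: exactly 3 CUs falls on this side of the split
    if 20 <= age <= 22 and social_juridical:
        return 2  # fewer than 3 CUs, aged 20 to 22, social and juridical sciences
    if age <= 19 and not social_juridical:
        return 3  # fewer than 3 CUs, aged up to 19, other fields of study
    return None  # combination not described in the text


# Hypothetical students, for illustration only.
for student in [(25, 10, False), (18, 8, True), (21, 1, True), (18, 2, False)]:
    print(student, "-> Group", assign_group(*student))
```

Reading the tree this way makes the ordering of the splits explicit: age is tested first, then curriculum units completed, and field of study only matters within the low-achievement branch.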
Other variables introduced in the analysis (e.g., sex, attending HE away from home, enrolling in a first-choice course and university, repeating grades in earlier education, receiving vocational guidance, being employed, grade point average to access higher education, degree of confidence in graduating in their subject) do not offer a relevant contribution to explaining student dropout and persistence after considering students’ age and academic achievement.
IV. Discussion and conclusions
A considerable percentage of students drop out during their first year at university: in this sample, nearly 20% of students entering this Portuguese public university. As a complex phenomenon related to a dynamic and longitudinal decision process, a range of personal and contextual variables must be considered in explaining it (Casanova et al., 2018; Tinto, 2010). In this study, which uses the regression tree method, several variables have been introduced to differentiate between students who remain in or drop out from this academic institution.
The variable that best predicts the decision to stay at or leave the university is student age, with first-year students over 22 years old exhibiting a high rate of dropout. This group is likely made up of students in employment and with other social and family responsibilities, circumstances that are associated with age and have been found to be relevant in explaining dropout in the literature (Figuera et al., 2015; González-Ramírez & Pedraza-Navarro, 2017; Sánchez-Gelabert & Elias, 2017; Venegas-Muggli, 2019). Furthermore, a recent change to Portuguese law allows individuals aged 23 or over to access HE without secondary education. In these cases, some learning competencies or study habits may not have been acquired, hampering academic achievement in HE, where more learning autonomy is required from students (Newlon & Lovell, 2017). This aspect gains relevance because, after the age of 23, in the model tested (see Figure 1), a second variable explaining the dropout rate is the number of curriculum units completed (at this university the degrees have 10 or 12 CUs per year). The results show that completing only an especially low number of curriculum units (< 3 CUs) has a particular impact on dropout, highlighting the importance of poor academic performance in students’ decision to drop out (Casanova et al., 2018; Tuero et al., 2018).
It is also worth remarking that some variables introduced in the analysis do not appear relevant to the predictive model of student persistence and dropout. An initial descriptive analysis confirms that some of them produce no differentiation (e.g. sex, enrolling in a first-choice course, leaving the parental home), but others do appear relevant (e.g. having repeated grades in basic and secondary education, being employed, participating in vocational guidance programs in secondary education, being in a first-choice academic institution) when considered separately in analysis.
Given the social and institutional impact of first-year dropout in HE, the main objective of this paper was to analyze how some personal and contextual variables converge to predict a student’s decision to persist or to drop out. The regression tree method was chosen to perform the statistical analysis and determine the relevance of a sequence of variables in explaining rates of student persistence or dropout.
The results suggest that students’ age is the best predictor of their decision to drop out. In particular, students over the age of 22 have a higher probability of dropping out, which may be associated with employment and social and family responsibilities that affect their ability to attend classes or complete work with classmates, for example. A decision to drop out becomes even more likely if these students fail to complete at least three curriculum units (a degree at the sample university generally has 10 or 12 curriculum units per academic year). Low academic achievement appears to play an important role in students’ decision to remain in or leave HE.
Further supporting the finding that student age influences dropout is the fact that compared to their younger counterparts, students aged 21 to 23 years also exhibit a higher rate of dropout. Among these students, dropout is more frequent in those with a low level of confidence in completing their degree when starting college, those studying a course in the social and juridical sciences, and those reporting poor study habits in high school. Taking into account that age remains an important and highly predictive variable for dropping out of higher education, educational institutions should create teaching models adjusted to student profiles to facilitate learning in these students, and adopt measures that ensure adaptation to the characteristics and needs of all students.
To conclude, some limitations can be considered in this study. The sample comprised students from a single university, making it difficult to generalize data and conclusions. In Portugal, there are two subsystems in HE (university and polytechnic), each with its own particularities in terms of students’ social background and career projects, and different dropout rates (higher in the polytechnic subsystem). In future, both subsystems should be considered for a fuller explanation of the dropout phenomenon. Including other HE institutions and a larger number of dropout students will likely mean that other variables, assumed not to be relevant in this study, also contribute to explaining the dropout rate. Finally, dropout explanation can be enhanced by mixed-methods research that complements quantitative data with information obtained from interviews with subgroups of students.