I. Introduction
Current trends in higher education have stressed the need for continuous improvement of formative processes through the progressive development of a range of competencies that prepare students to meet the challenges of their future career. Many of these competencies concern verbal communication and involve language skills that are considered a professional tool widely used in pedagogical settings, and in this sense, the teaching profession can be understood to require a complex set of cognitive and socio-affective skills, with language being the main vehicle by which they are conveyed. Teacher language is commonly studied under the theoretical concept of academic language, involving broad mastery of oral and written discourse through higher-order skills in grammatical, lexical, and discursive aspects of language (Schleppegrell, 2004). It is thus hoped that future professionals acquire, in their formative years, an adequate command of academic language that will form the groundwork for the use of more technical, applied language as necessary within their respective disciplines.
Five decades of scientific research have documented this area of study by exploring the characteristics of and variations in teacher language, demonstrating a consistent link with quality of teaching practice (Kalinowski et al., 2020) and a wide-ranging impact on students’ cognitive processes, as teacher language strengthens, for example, abstract thinking, the production of new vocabulary and preferences for certain words (McNally et al., 2019). Early childhood education has not been left behind by these trends (Campbell-Barr & Bogatić, 2017), which shows that for early childhood educators, it is essential to acquire verbal skills that go beyond the mere command of non-academic language common to their immediate social environment.
1.1 Emotions and words
Various theories have been put forward to explain the relationship between words and emotions, revealing the complex functional representation they both have in the brain, distributed throughout different interrelated sensory, motor, affective and other language-related areas and networks (Hinojosa et al., 2019). Words can be understood as specific cognitive and neurophysiological markers serving to store an experience in a person’s consciousness and/or retrieve the experience based on its significance and affective valence. For example, studies have reported how word choice is a good predictor of various psychophysical states (Shahane & Denny, 2019). An emotion-related term or word is understood as a linguistic and communicative expression with an affective connotation that, from a structural perspective, should possess several distinctive features. According to Fontaine et al. (2013), it should include (a) an appraisal component that triggers the emotion; (b) a distinctive tendency toward action; (c) facial movements and changes in tone of voice/speech; (d) observable body reactions; and (e) a subjective component. In words associated with an emotional experience, all these characteristics and/or properties converge with components of a cognitive, behavioral, and neurophysiological nature, and are often apparent as early as one’s first exposure to syllables, with an earlier and faster attentional bias for negative words (Yao et al., 2018). This evidence illustrates the intricate process that underpins the relationship between words and emotions, influencing sentence construction, verbalization and, ultimately, the meanings that subjects attribute to their own experience.
1.2 Evidence-based characterizations of early childhood educators
A fair amount of research has explored the characteristics of early childhood educators, reporting a range of dominant variables and other variables that present various degrees of association with quality of professional competence. For example, better academic credentials are associated with higher quality educational practices (Manning et al., 2019), and may in turn be further distinguished by type of academic training (Nasiopoulou et al., 2017), by degree of professional efficiency (Hu et al., 2018), by repertoire of emotional competencies (Fernández et al., 2019) and/or by theoretical preferences in explaining toddler behavior (Mischo et al., 2012). Other studies have identified educators’ professional competencies based on the country’s educational policies (Oberhuemer, 2015), socio-demographic variables (Unal & Kurt, 2018), characteristics relating to their professional identity (Doğan & Erdiller, 2016) and/or variables of a psychological nature such as the degree of emotional support provided to children (Treviño et al., 2013), attitudes toward early learning (Jeon et al., 2015), and personality traits (Tatalović et al., 2018).
Meta-analyses of the use of language in early education settings have shown that trends have centered on exploring literacy and verbal skill development practices (Markussen-Brown et al., 2017) and have identified the crucial role played by educators’ verbal expertise in teaching and learning processes (Justice et al., 2018). In this vein, for example, emphasis has been placed on the impact of educators’ language on child language acquisition (Muhinyi & Rowe, 2019), on the specific development of expressive language through play with infants (Cuellar & Farkas, 2018), and/or on specific cognitive skills such as print concept, letter naming and/or phonological awareness (Piasta et al., 2019). In addition, recent studies have examined more specific properties of language and the impact of educator syntax on infants’ learning of new vocabulary (Farrow et al., 2020), vocabulary size, syntactical complexity and lexical diversity both of educators and infants (Pizarro et al., 2019), near and clear or far and unclear educator talk (Degotardi et al., 2018), and/or preferences for storytelling and reading versus writing activities and their differential association with verbal syntactic complexity and the acquisition of new vocabulary (Torr, 2020).
However, no recent evidence can be found beyond research on language in purely formal terms, with a small number of studies available having adopted other approaches to this subject. One example is a study of instructional and/or commanding language (Hu et al., 2018) and its unequivocal link to educators’ professional qualifications (Hu et al., 2017). Recent studies have also looked at a particular linguistic style among educators that tends to minimize infants’ emotional experience, discouraging the expression of emotions, with a detrimental effect on the development of socio-affective skills (King & La Paro, 2018). On the other hand, evidence in this regard suggests that preschoolers adapt their exploratory strategies to the structure of the instruction provided by educators (Ruggeri et al., 2019) and, similarly, questions by educators about emotions have a regulating effect on infants’ prosocial behavior and hence a far-reaching impact on lifelong social interactions (Brownell et al., 2012). All this evidence reflects an emerging interest in research on the characteristics of the language used by these professionals, under the umbrella of continuing improvements in the quality of teaching practice, pedagogical efficiency, and professional development.
1.3 This study
According to the Undersecretariat of Early Childhood Education (Subsecretaría de Educación Parvularia) in Chile, as of August 2020 and in the regions covered by this study (Atacama and Antofagasta), there were a total of 1,588 early childhood educators working in government-dependent schools, with many others working in private centers with official state accreditation. Despite clear deficiencies in regulatory policies regarding the formative process for these professionals (Pizarro & Espinoza, 2016), in recent years quality standards have been ensured through the creation of a single, comprehensive, far-reaching curricular directive (Subsecretaría de Educación Parvularia, 2018). This curricular directive has established a series of technical and pedagogical guidelines involving professional skills that educators need to develop, such as using language within their discipline that is consistent with children’s changing development (Subsecretaría de Educación Parvularia, 2019). The relevance of these verbal skills and the recognition of the effect of semantic environments in shaping children’s thoughts, behaviors, social interactions, and emotions make explicitly clear the need to explore the characteristics of the language used by professionals working directly in early education settings. The socio-affective domain also being one of the cornerstones of formative processes in this section of the education system, this research aims to identify the characteristics, relevant dimensions and common uses of the speech of early childhood educators in their regular work environments (classroom and playground).
This study explores the production of socio-affective words by early childhood educators based on the understanding that a lexical unit (a word) can be classified into different semantic categories of meaning. This raised three research questions: (1) Can differential distributions of semantic categories associated with the socio-affective domain be identified by their frequency of use? (2) Can differential distributions of semantic categories associated with the socio-affective domain be grouped into common linguistic profiles? (3) What are the best predictors of socio-affective word production?
II. Method
Design and participants. This descriptive, cross-sectional research adopted an exploratory approach within the framework of the paradigmatic dimension of functional linguistics, which claims that the meaning of words lies in a vast network of interrelated choices (Halliday, 2014). A total of 20 early childhood educators, selected by convenience sampling, participated (all female; X̄ = 29.9 years of age; age range 21 - 40). These participants were from 14 different state and/or state-accredited educational institutions, all located in two nearby cities in northern Chile. All participants were native speakers of Spanish and held at least a four-year bachelor’s degree in early education, worked in the towns where the research was conducted at the time of the study, and had at least one year’s professional experience. In addition, no participant reported any disqualifying condition that may hinder the data collection and transcription process. The groups taught by the participants included three educational levels, with children aged from 24 to 60 months.
Instruments. The data was processed with the help of LIWC2015 v1.6 and its Spanish dictionary (Pennebaker Conglomerates, Inc.). This digital tool for psycholinguistic text analysis was the result of work by Pennebaker et al. (2015) and was developed on the premise that words from a text can be classified into linguistic categories of a grammatical and psychological nature, producing a profile for each individual text based on the percentage represented by each category in the base text. Its internal reliability (Cronbach’s alpha) ranges from .52 to .07, depending on the categories included.
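The principle of dictionary-based profiling described above can be illustrated with a minimal sketch. The category names, the word lists, and the function `category_percentages` are hypothetical and invented for illustration; they are not the LIWC2015 Spanish dictionary or its algorithm, which are proprietary and considerably more elaborate.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary: categories and word lists are invented for
# illustration and are NOT taken from the LIWC2015 Spanish dictionary.
TOY_DICTIONARY = {
    "posemo": {"feliz", "bien", "lindo"},
    "negemo": {"triste", "mal"},
    "social": {"amigo", "amigos", "mamá"},
}


def category_percentages(text, dictionary):
    """Return, per category, the percentage of tokens that belong to it."""
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return {name: 0.0 for name in dictionary}
    counts = Counter()
    for token in tokens:
        for name, words in dictionary.items():
            if token in words:
                counts[name] += 1  # one word may count toward several categories
    return {name: 100.0 * counts[name] / len(tokens) for name in dictionary}


# Six tokens: two positive-emotion words, one social word, no negative words.
profile = category_percentages("Muy bien, tu amigo está feliz", TOY_DICTIONARY)
```

Each value in `profile` is a percentage of the total word count, which is the same kind of quantity as the category means reported in the results.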
Study variables. This study examined a total of 62 variables, divided into two groups: (a) one group aimed at identifying general linguistic aspects (n = 18) and (b) a second group to categorize psychological and socio-affective properties of speech (n = 44; variables listed in Table 1). Each variable corresponds to a semantic category of meaning, and an algorithm is used to classify lexical units (words) based on the degree to which they pertain to different categories, constructed on the basis of normative studies and prior classifications. Although in many cases the definitions of these variables are self-explanatory, more information can be found in Pennebaker et al. (2015).
Procedure. Data was collected for each participant between 10 a.m. and 4 p.m. for three to five non-consecutive days, over a total period of two to three weeks between August 2018 and January 2019. A continuous audio analysis method was employed due to its methodological advantages (Cunningham et al., 2019). This produced recordings varying in duration between 90 and 120 minutes, which included one or two regular teaching activities that responded to curricular standards applicable in the participants’ institutions and had a clear beginning, development, and conclusion. An attempt was made to ensure the recordings were not made on days with special activities that may alter the usual language employed (e.g. Mother’s Day, Month of the Sea, Christmas preparations). The final number of audio files and the variations in their duration were due to circumstances outside the researchers’ control (e.g. poor use of digital recorder and/or forgetfulness), so it was decided to establish a range of between 8 and 10 hours’ duration for the recordings, and two participants were ultimately excluded on this criterion. The recordings were transcribed with the help of the application Live Transcribe for Android and the data was subsequently analyzed with SPSS 26.0 (IBM Corp., NY, USA).
Data analysis. Various nonparametric tests were considered due to the characteristics and size of the sample. The descriptive analysis of lexical densities and general associations between variables included X̄, SD, quartiles, Spearman’s Rho (rs), z-scores, and effect sizes through Cohen’s d. Normality of the distributions was checked with the Shapiro-Wilk test, reliability was determined with Cronbach’s alpha, and a means comparison was performed with the independent samples t test. Possible clusters of similarly used words were defined in two steps: first, through a hierarchical cluster analysis to identify the number of resulting groupings, and then through a k-means algorithm to confirm them. The between-group and within-group significance of clusters was verified with the Kruskal-Wallis test (a rank-based one-way analysis of variance). Lastly, a theoretical regression model was proposed to identify the best predictors of socio-affective word production.
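As an illustration of the reliability statistic named above, the following sketch computes Cronbach’s alpha from the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The function name and the toy scores are assumptions made for illustration; the study itself obtained its coefficients with SPSS.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: list of k lists, each holding one variable's scores for the same n cases."""
    k = len(items)
    n = len(items[0])
    if k < 2 or any(len(column) != n for column in items):
        raise ValueError("need at least two items with the same number of cases")
    # Total score of each case across all items.
    totals = [sum(column[i] for column in items) for i in range(n)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Toy data (hypothetical): two items scored for four cases.
alpha_identical = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]])
alpha_partial = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
```

Two identical items yield an alpha of 1, and the coefficient falls as the items share less variance.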
Ethical aspects. Initial meetings were held and a formal cover letter for the study was sent out. In some cases, permission to record was guaranteed by a preexisting scientific research agreement with the lead author’s institution. Signed consent forms were collected in all cases and the general guidelines of the American Psychological Association (APA) were followed.
III. Results
3.1 Distribution of variables, bivariate correlations, and means comparison
The total number of words compiled was 308,277 (X̄ = 15,413.85; SD = 8,455.38), through 190 hours of effective recording. The following descriptive statistics were obtained for the first group of variables (linguistic variables): total pronouns (X̄ = 12.13; SD = 1.08); personal pronouns (X̄ = 12.31; SD = .87); impersonal pronouns (X̄ = 8.9; SD = .82); first-person singular pronouns (X̄ = 1.29; SD = .42); first-person plural pronouns (X̄ = .42; SD = .25); second-person pronouns (X̄ = 1.54; SD = .44); third-person singular pronouns (X̄ = 6.4; SD = .68); third-person plural pronouns (X̄ = 1.85; SD = .37); verbs (X̄ = 3.41; SD = .67); verbs-first-person singular form (X̄ = 13.26; SD = 1.23); verbs-second-person form (X̄ = 1.44; SD = .33); verbs-first-person plural form (X̄ = 11.23; SD = 1.18); verbs-third-person singular form (X̄ = .01; SD = .01); verbs-third-person plural form (X̄ = 2.63; SD = .66); quantifiers (X̄ = .86; SD = .16); negations (X̄ = 1.85; SD = .54); numbers (X̄ = 1.71; SD = .28); formal language (X̄ = .13; SD = .09); informal language (X̄ = .89; SD = .4); and words of 6 or more letters (X̄ = .56; SD = .22).
For the second group of variables, made up of affective processes, positive emotion, negative emotion, anger, anxiety, sadness, pleasure, social processes, family, friends, and human, the results are presented in Tables 1 and 2. The reliability test suggests an overall Cronbach’s alpha of .63 for all variables (n = 64; p ≤ .05), with a value of .55 for the variables exclusively related to general linguistic aspects (n = 20; p ≤ .05), a value of .6 for all remaining variables excluding the socio-affective ones (n = 33; p ≤ .05), and .65 for the variables exclusively related to the socio-affective domain (n = 11; p ≤ .05). The variable pleasure was included in the analysis due to its experiential proximity to the socio-affective domain, whereas the variable sense was not, because it referred solely to sensory pathways (e.g. watch, see, hear, touch).
Variable | X̄ (SD) | Q* | Shapiro-Wilk** | t | Sig. (2-tailed)*** | Skewness (SD = .51) | Kurtosis (SD = .99) |
---|---|---|---|---|---|---|---|
Affective processes (1) | 3.6(.63) | 4 | .52 | 25.55 | .00 | .57 | .43 |
Positive emotion (2) | 2.94(.54) | 4 | .25 | 24 | .00 | .29 | -.99 |
Negative emotion (3) | .65(.32) | 2 | .00 | 8.83 | .00 | .84 | -.68 |
Anger (4) | .24(.19) | 1 | .00 | 5.55 | .00 | 1.27 | .67 |
Anxiety (5) | .14(.09) | 1 | .46 | 6.85 | .00 | .23 | -1.05 |
Sadness (6) | .14(.08) | 1 | .05 | 7.38 | .00 | .52 | -1.1 |
Pleasure (7) | 1.35(.43) | 3 | .95 | 13.96 | .00 | .01 | .69 |
Social processes (8) | 8.84(.73) | 4 | .11 | 53.48 | .00 | .76 | 2.52 |
Family (9) | .28(.14) | 1 | .02 | 8.8 | .00 | .91 | -.08 |
Friends (10) | .29(.24) | 1 | .00 | 5.23 | .00 | 1.42 | 1.63 |
Human (11) | .25(.17) | 1 | .03 | 6.37 | .00 | 1.22 | 1.54 |
Cognitive processes (12) | 15.67(1.87) | 4 | .8 | 37.37 | .00 | .06 | -.44 |
Insight (13) | 2.81(.7) | 4 | .00 | 17.77 | .00 | 1.99 | 6.2 |
Causation (14) | 1.33(.26) | 3 | .41 | 22.94 | .00 | -.61 | .79 |
Discrepancy (15) | 1.44(.29) | 3 | .48 | 21.58 | .00 | .56 | 1.5 |
Certainty (16) | 1.16(.23) | 3 | .33 | 21.64 | .00 | .45 | -.7 |
Tentative (17) | 1.9(.37) | 3 | .67 | 22.85 | .00 | -.09 | -.2 |
Relativity (18) | 10.88(1.01) | 4 | .39 | 47.72 | .00 | -.16 | -.26 |
Motion (19) | 4.76(.75) | 4 | .9 | 28.36 | .00 | .34 | .00 |
Time (20) | 3.98(.63) | 4 | .18 | 28.2 | .00 | .39 | -1.12 |
Space/Place (21) | 2.7(.7) | 3 | .03 | 17.18 | .00 | 1.18 | 1.01 |
Perceptual processes (22) | 4.66(.75) | 4 | .02 | 27.79 | .00 | 1.4 | 3.74 |
See (23) | 2.23(.88) | 3 | .00 | 11.29 | .00 | 2.72 | 9.17 |
Hear (24) | 1.28(.32) | 3 | .35 | 17.54 | .00 | -.01 | -.11 |
Sense (25) | 1.07(.55) | 2 | .03 | 8.72 | .00 | .81 | -.08 |
Biological processes (26) | 2.46(.63) | 3 | .21 | 17.36 | .00 | .69 | .13 |
Body (27) | 1.17(.3) | 3 | .01 | 17.14 | .00 | 1.05 | .61 |
Health (28) | .15(.05) | 1 | .76 | 13.8 | .00 | -.31 | -.27 |
Ingestion (29) | 1.07(.45) | 2 | .3 | 10.49 | .00 | .3 | -.88 |
Sexual (30) | .29(.17) | 1 | .00 | 7.29 | .00 | 1.46 | 1.61 |
Other variables | | | | | | | |
Achievement (31) | .85(.32) | 2 | .22 | 11.92 | .00 | .24 | 1.35 |
Assent (32) | .85(.24) | 2 | .46 | 15.73 | .00 | .7 | 1.24 |
Exclusion (33) | 2.06(.37) | 3 | .19 | 24.87 | .00 | .08 | .44 |
Death (34) | .09(.07) | 1 | .02 | 5.83 | .00 | .87 | -.21 |
Home (35) | .69(.23) | 2 | .97 | 13.41 | .00 | .4 | .28 |
Inhibition (36) | .28(.1) | 1 | .34 | 12.93 | .00 | .61 | 1.02 |
Inclusion (37) | 3.51(.54) | 4 | .21 | 29.06 | .00 | .6 | .65 |
Money (38) | .33(.13) | 2 | .09 | 11.38 | .00 | 3.34 | .09 |
Nonfluencies (39) | .63(.2) | 2 | .52 | 13.9 | .00 | -.03 | -1.05 |
Past (40) | 1.69(.37) | 3 | .43 | 20.41 | .00 | .08 | -.61 |
Present (41) | 3.21(.43) | 4 | .39 | 33.29 | .00 | -.28 | .56 |
Future (42) | .13(.21) | 1 | .54 | 23.37 | .00 | .07 | -.66 |
Religion (43) | .09(.07) | 1 | .00 | 5.48 | .00 | 1.39 | .09 |
Work (44) | .99(.23) | 2 | .15 | 18.61 | .00 | -.05 | -.07 |
Note: *Quartiles from highest (4) to lowest (1).
**p ≤ .05 ***p ≤ .01
(rs) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
---|---|---|---|---|---|---|---|---|---|---|---|
(1) | 1 | .77** | .53* | .41 | .44* | .27 | .29 | .00 | .4 | .3 | .06 |
(2) | | 1 | -.02 | -.06 | -.1 | -.03 | .29 | -.13 | .22 | .28 | -.22 |
(3) | | | 1 | .73** | .87** | -.03 | .29 | .15 | .31 | .18 | .38 |
(4) | | | | 1 | .64** | .34 | .38 | .02 | .24 | .05 | .47* |
(5) | | | | | 1 | .57** | .1 | .05 | .18 | .16 | .37 |
(6) | | | | | | 1 | .18 | .13 | -.01 | .07 | .03 |
(7) | | | | | | | 1 | .21 | .06 | .51* | -.00 |
(12) | -.43 | -.1 | -.35 | -.29 | -.43 | -.11 | .51* | -.04 | -.27 | -.26 | -.1 |
(13) | -.1 | -.02 | -.02 | -.04 | -.09 | .12 | .06 | .13 | -.18 | .05 | -.06 |
(14) | -.32 | -.23 | -.07 | -.06 | -.1 | .07 | .06 | -.08 | -.35 | -.17 | -.04 |
(15) | -.36 | -.26 | -.08 | -.19 | -.13 | .15 | -.00 | -.18 | -.35 | -.58** | -.22 |
(16) | -.09 | .14 | -.46* | -.28 | -.42 | -.63** | -.16 | .05 | .11 | .12 | .1 |
(17) | -.37 | -.26 | -.16 | -.25 | -.15 | .14 | -.26 | .09 | -.25 | -.5* | -.2 |
(18) | -.08 | .27 | -.33 | -.51 | -.32 | -.26 | .51* | -.24 | -.25 | .08 | -.49* |
(19) | .12 | .38 | -.2 | -.47* | -.25 | .03 | .08 | -.18 | -.23 | -.1 | -.68** |
(20) | -.04 | .12 | -.22 | -.28 | -.16 | -.35 | -.36 | .05 | .23 | .19 | .26 |
(21) | .49* | -.31 | -.14 | -.15 | -.13 | -.15 | .02 | -.18 | -.19 | -.01 | -.04 |
(22) | .03 | .04 | .06 | .15 | -.11 | .01 | .21 | .16 | -.17 | .25 | -.08 |
(23) | .41 | .42 | .13 | .08 | .04 | .08 | .07 | -.24 | -.17 | .03 | -.16 |
(24) | -.55* | -.68** | -.04 | .44 | -.09 | .07 | .08 | .39 | .03 | -.03 | .38 |
(25) | .24 | .23 | .14 | .25 | .05 | -.08 | .58** | .22 | .11 | .49* | .1 |
(26) | .58** | .36 | .28 | .45* | .19 | .03 | .22 | -.01 | .15 | -.1 | -.11 |
(27) | .31 | .19 | .3 | .26 | .26 | .02 | .5* | -.03 | .03 | .35 | -.1 |
(28) | .31 | .03 | .24 | .17 | .41 | .02 | .33 | -.04 | .03 | -.12 | .2 |
(29) | .35 | .25 | -.00 | .27 | -.05 | -.1 | -.00 | -.04 | -.01 | -.43 | -.24 |
(30) | .41 | .33 | .42 | .18 | .23 | .32 | .31 | .31 | .48* | .55* | .39 |
(31) | -.22 | .15 | -.39 | -.47* | -.49* | -.03 | -.34 | -.08 | -.12 | -.37 | -.35 |
(32) | .01 | .43 | -.64** | -.52* | -.6** | -.61** | -.17 | -.5* | .11 | .06 | -.35 |
(33) | -.11 | .09 | -.12 | -.08 | -.27 | .1 | -.08 | .28 | .06 | -.16 | .07 |
(34) | .02 | .09 | .02 | -.04 | -.03 | .15 | -.05 | -.15 | -.31 | -.21 | -.27 |
(35) | -.16 | -.09 | -.19 | -.01 | -.16 | -.1 | -.52* | -.48* | -.22 | -.51* | .04 |
(36) | .00 | .14 | -.16 | .05 | -.04 | -.23 | -.02 | -.4 | -.16 | -.26 | -.01 |
(37) | -.38 | -.08 | -.28 | -.31 | -.26 | -.07 | .07 | .07 | .05 | .12 | .07 |
(38) | -.27 | -.04 | -.37 | .24 | -.32 | -.00 | .38 | .25 | .42 | .12 | .55 |
(39) | .24 | -.09 | .35 | .43 | .32 | -.00 | .08 | .08 | .19 | -.16 | .06 |
(40) | .13 | .6** | -.37 | -.48* | -.39 | -.1 | .03 | -.38 | -.19 | .15 | -.54* |
(41) | -.13 | -.27 | .17 | .33 | .2 | .29 | .06 | -.06 | -.25 | -.32 | .11 |
(42) | -.27 | -.33 | .21 | .17 | .11 | .17 | .13 | .26 | -.13 | -.05 | .01 |
(43) | -.05 | -.13 | -.08 | .29 | -.05 | -.56** | .19 | .08 | .28 | .24 | .08 |
(44) | .39 | .36 | .27 | .18 | .21 | .02 | .14 | .35 | .59** | .08 | .46* |
Note: * p ≤ .05
** p ≤ .01
Bivariate correlations between the variables of the second group were also reviewed using the two-tailed Spearman’s rank correlation coefficient (rs). However, due to space limitations, Table 2 presents only associations in the socio-affective domain. The numbers are given in the same order as in Table 1.
Other significant correlations worthy of mention (p ≤ .01) are informal language / inclusion (-.72), discrepancy / tentative (.79), sense / money (.82), and body / money (-.7).
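For readers unfamiliar with the coefficient used for these associations, Spearman’s rs can be sketched as a Pearson correlation computed on ranks, with tied values receiving the mean of their rank positions. The function names and toy values below are assumptions for illustration and do not reproduce the SPSS procedure used in the study.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed samples."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    covariance = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    spread_x = sum((a - mx) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - my) ** 2 for b in ry) ** 0.5
    return covariance / (spread_x * spread_y)


# Toy data (hypothetical): a perfectly monotonic and a perfectly reversed pair.
rho_up = spearman_rho([1, 2, 3, 4], [10, 20, 30, 40])
rho_down = spearman_rho([1, 2, 3, 4], [40, 30, 20, 10])
```

Because only ranks enter the computation, the coefficient is insensitive to the skewed distributions that several categories in Table 1 exhibit.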
3.2 Cluster analysis
To continue our analysis, we explored the possibility of identifying common profiles of semantic categories associated exclusively with the socio-affective domain, based on their degree of convergence and statistical differentiation. After standardization of the 11 variables, two steps were followed to group the variables. Firstly, an agglomerative hierarchical cluster analysis was applied to determine the best possible grouping of the observations (variables) included in the model (n = 11). This was done by building a hierarchy of observations that are then merged as they move up the hierarchy. For this, the algorithm employed a measure of the squared Euclidean distance. These results are presented in the following dendrogram:
Secondly, a k-means algorithm was applied to confirm the emergent clusters and determine their best predictors, specific distributions, and statistical significance. The Kruskal-Wallis test indicates significant results for the mean centers of affective processes, negative emotion, anger, anxiety, and sadness, and a post hoc analysis using Cohen’s d confirmed these differences for anxiety, affective processes, sadness, and anger, with high, moderate, and low effect sizes (see Table 3).
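The two-step logic described above can be sketched as follows: an agglomerative pass based on squared Euclidean distance suggests the groupings, and a k-means pass started from the resulting group means confirms them. Average linkage, the starting centers, the function names, and the toy scores are all assumptions made for illustration; the linkage method and initialization actually used in SPSS are not stated in this section.

```python
def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    return tuple(sum(column) / len(points) for column in zip(*points))


def agglomerate(points, n_clusters):
    """Merge the two closest clusters (average linkage, an assumption) until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                distance = sum(
                    squared_euclidean(points[p], points[q])
                    for p in clusters[i]
                    for q in clusters[j]
                ) / (len(clusters[i]) * len(clusters[j]))
                if best is None or distance < best[0]:
                    best = (distance, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]


def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm: assign each point to its nearest center, then update the centers."""
    centers = [tuple(center) for center in centers]
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(len(centers)), key=lambda k: squared_euclidean(point, centers[k]))
            for point in points
        ]
        new_centers = []
        for k, old_center in enumerate(centers):
            members = [p for p, label in zip(points, labels) if label == k]
            # An empty cluster keeps its previous center.
            new_centers.append(mean_point(members) if members else old_center)
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Toy, already standardized scores for five hypothetical cases.
toy = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0), (0.0, 0.2)]
groups = agglomerate(toy, 2)
starts = [mean_point([toy[i] for i in group]) for group in groups]
labels, centers = kmeans(toy, starts)
```

In this toy case both steps agree on the same two groups, which is the kind of agreement the confirmation step is meant to check.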
Variable | Cluster 1 center* (z-score), n = 13 | Cluster 2 center* (z-score), n = 7 | Kruskal-Wallis test statistic | Asymp. Sig. | Cohen’s d |
---|---|---|---|---|---|
Affective processes | 4(-.35) | 3.35(.65) | 3.62 | .05 | 1.11 |
Positive emotion | 3.29(-.02) | 2.71(.05) | .00 | 8.37 | - |
Negative emotion | .57(-.53) | .46(.98) | 8.37 | .00 | .51 |
Anger | .22(-.49) | .07(.92) | 6.45 | .01 | .24 |
Anxiety | .21(-.48) | .11(.89) | 8.38 | .00 | 1.62 |
Sadness | .08(-.46) | .26(.86) | 7.28 | .00 | .58 |
Pleasure | 1.02(-.19) | 1.32(.35) | 1.92 | .16 | - |
Social processes | 7.42(-.05) | 10.91(.09) | .26 | .6 | - |
Family | .31(-.25) | .44(.47) | 1.51 | .21 | - |
Friends | .35(-.33) | .95(.62) | 2.39 | .12 | - |
Human | .74(-.31) | .28(.58) | 2.27 | .13 | - |
Note: * The final variance ratio between cluster centers is 3.4
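The effect sizes in Table 3 are expressed as Cohen’s d. A minimal sketch of one common variant, the difference in means divided by the pooled standard deviation, is given here; the text does not state which variant was computed, so the pooled form, the function name, and the toy scores are assumptions.

```python
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_variance ** 0.5


# Toy data (hypothetical): the means differ by 1 and both groups have SD = 2.
d = cohens_d([2, 4, 6], [1, 3, 5])
```

The pooled form weights each group’s variance by its degrees of freedom, which matters when cluster sizes differ, as they do here (13 versus 7).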
Thus the groups initially observed were confirmed, revealing the presence of two main clusters, one predominant (cluster 1; n = 13; 65%), and another smaller one (cluster 2; n = 7; 35%). Furthermore, high homogeneity was observed in the distribution of variables for both groups, with a high density of words associated with the general categories of social processes and affective processes, and then, in particular, the subcategory positive emotion. By contrast, a low density was observed in the scores for negative emotion, and hence for anger, anxiety, and sadness, as well as for human. The variable pleasure exhibits average values in both groups. Figure 2 shows the similarities and differences between the predictors of the clusters in z-scores:
3.3 Predictors of socio-affective words
A multiple regression model was constructed to explore the trends in the data and identify the best predictors for the second set of variables, in strict compliance with the assumptions for this type of statistics: absence of multicollinearity, homoscedasticity, mean and normal distribution of residuals, and the absence of cases that skew the data. After verifying that all these criteria were met and without removing any outlier (all Cook’s distances were below 1), 15 independent variables were entered in the final steps of the model using the stepwise method. Due to the large number of variables, it was decided to report no more than two of the resulting predictors for each category: (1) The best predictors of the social processes category are exclusion and friends [F = 10.37, p < .00; R² change = .32; with a multicollinearity test (VIF) of 1.06 and a Durbin-Watson (DW) coefficient of 2.53]. (2) The category family is predicted by sexual (exemplified by words like “lips”, “love”, “pure”, and “kisses”) and by causation [F = 12.48, p < .00; R² change = .11; VIF of 1; DW of 2.03]. (3) The category friends is predicted by family and ingestion [F = 11.25, p < .00; R² change = .15; VIF of 1; DW of 2.55]. (4) The category human is predicted by motion and family [F = 14.75, p < .00; R² change = .2; VIF of 1.04; DW of 2.45]. (5) The general category affective processes is predicted by positive emotion and negative emotion [F = 177.04, p < .00; R² change = .3; VIF of 1; DW of 2.21]. In turn, (6) positive emotion is predicted by the general category affective processes and the subcategory negative emotion [F = 126.89, p < .00; R² change = .28; VIF of 1.37; DW of 2.33]. (7) Negative emotion is predicted by anger and anxiety [F = 133.63, p < .00; R² change = .11; VIF of 1.2; DW of 2.07]. (8) Anger is predicted by negative emotion and sadness [F = 82.93, p < .00; R² change = .08; VIF of 1.87; DW of 2.09]. (9) Anxiety is predicted by negative emotion and health [F = 30.83, p < .00; R² change = .08; VIF of 1.09; DW of 2.11]. (10) Sadness is predicted by negative emotion and anger [F = 21.22, p < .00; R² change = .24; VIF of 5.78; DW of 1.94]. And (11) pleasure is predicted by body [F = 14.03, p < .00; R² change = .43; VIF of 1; DW of 2.4].
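The two diagnostics reported alongside each model can be illustrated with a short sketch. The Durbin-Watson statistic is the sum of squared successive residual differences divided by the sum of squared residuals, and, in the special case of exactly two predictors, each variance inflation factor equals 1 / (1 − r²), where r is the Pearson correlation between the predictors. The function names and toy values are assumptions; the sketch does not reproduce the stepwise SPSS models.

```python
def durbin_watson(residuals):
    """Values near 2 suggest little first-order autocorrelation in the residuals."""
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("residuals must not all be zero")
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    return numerator / denominator


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors of a two-predictor model: 1 / (1 - r^2)."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    covariance = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    r_squared = covariance ** 2 / (
        sum((a - m1) ** 2 for a in x1) * sum((b - m2) ** 2 for b in x2)
    )
    return 1 / (1 - r_squared)


# Toy data (hypothetical): alternating residuals and two uncorrelated predictors.
dw = durbin_watson([1, -1, 1, -1])
vif = vif_two_predictors([1, 2, 3, 4], [1, -1, -1, 1])
```

Read in the opposite direction, and only under the two-predictor assumption, a VIF of 1.06 would correspond to r² ≈ .06 between the predictors, whereas a VIF of 5.78 would correspond to r² ≈ .83.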
IV. Discussion
4.1 Predominant semantic categories (research question no. 1)
A first glance at the data shows distinctive distributions of words associated with formal aspects of language. Of particular note is the use of (a) the present tense over the past or future; (b) personal pronouns over impersonal ones; (c) pronouns in the second-person singular over first-person pronouns; (d) formal language over informal language; (e) a greater presence of verbs in the second-person form (you); and (f) substantial negation. Next in the analysis, 23 semantic categories stand out for their high or low lexical density in the fourth and first quartiles respectively (Table 1). The first research question has thus been answered in the affirmative, considering the differential distributions of semantic categories that were indeed identified.
The highest lexical density was attained by the general variable cognitive processes, headed by the subcategory insight and exemplified by words like “understand”, “realize”, “make”, and “solution”. These findings can be understood within the context of daily discipline-specific language in early education settings, which are characterized by a wealth of scenarios that demand clear, precise instructions, together with countless warnings and reprimands in response to ongoing unexpected situations. This may be indicative of an oversaturation of words that require an understanding and quick judgment on the part of preschoolers. In addition, these observations shed some light on the specific role played by educational interventions that may have an impact on the development of behavioral self-regulation skills among toddlers, as has been suggested by Bendezú et al. (2017).
The second densest category is relativity, exemplified by words such as “high”, “low”, “full”, and “up to”, and particularly the subcategory motion, exemplified by words such as “bring”, “path”, and “go”, all of which emphasize the here and now of the educational environment. These findings complement the above discussion on the efforts made by educators to raise awareness of and alert to the educational setting and the boundaries by which it is defined, opening up an interesting hypothesis regarding which linguistic features promote the above-mentioned behavioral self-regulation skills in preschoolers, particularly inhibitory control of behavior.
The third semantic category noted for its lexical density is that of social processes, which reflects a use of language to be expected in this educational context. Similarly, the general category perceptual processes was highly present, and within this category, the subcategory see was the mechanism of perception most often referred to by educators, who continually instructed children to be alert to what was happening in the learning environment. Examples of lexical units in this category include verbs like “see”, “look (at)”, and “show”, but also nouns such as “arch” or “picture”. This can be linked to the role of instructional language and the generation of meaning in early education settings, recently noted by Brown et al. (2020), through which educators frame the educational situation, constantly noting behaviors and directing actions. Lastly, the category inclusion achieved a high rank in the distributions observed, and is illustrated by lexical units such as “put it away”, “wait”, “hang on”, and “responsible”.
A further substantial number of categories were poorly represented in the general mosaic of results, in particular those relating to emotions. Given that these findings cannot be contextualized due to the lack of any comparative element, they should be treated with caution. One good example is the apparent contradiction between what Cekaite and Bergnehr (2018) have stressed as the crucial role of language associated with corporeality in teaching and the low lexical densities found in the category biological processes and the subcategory body. Moreover, manually transcribing the audio recordings provided us with a range of field observations regarding the lack of variety in the production of socio-affective words. Indeed, this specific professional context calls for a simple lexis consistent with the developmental tasks undertaken by children at an early age. Nonetheless, for children, the lack of greater lexical complexity may also constitute a limitation as they strive to align their emerging feelings against the linguistic backdrop at their disposal and provided by educators. The progressive nature of language development in the early years of life requires a rich vocabulary able to meet the educational demands of children, as asserted by Dalgren (2016), to diminish their normative verbal response. These observations offer an interesting hypothesis for future work.
4.2 Semantic profiles (research question no. 2)
The semantic categories identified in this study could be differentiated by frequency of use and, to a certain extent, predicted. On the basis of the statistics applied, the two linguistic profiles identified cannot be said to have arisen at random, and they represent different clusters of linguistic choices and preferences for lexical units associated with the socio-affective domain. The high presence, in both profiles, of words associated with the category affective processes and the fact that the subcategory positive emotion is the best predictor could be a feature of language itself in this discipline, but could also be considered a desirable professional trait and skill. This is consistent with work by Decker-Woodrow (2018), who reports that greater emotional support, classroom management skills, and quality interactions are all characteristics of efficient facilitators in early education settings. By contrast, the high presence of the semantic category associated with the social domain, as observed in the second cluster, can be contextualized as suggested by Cejudo and López-Delgado (2017), on the basis of a tendency by some teachers to exhibit a high degree of sociability in their interactions within educational settings. These findings in part suggest a need to acquire appropriate socio-affective teaching skills for a proper, balanced enrichment of both the social and affective domains.
A low density of words associated with negative emotions was also found across all categories identified, which, as indicated above, is a common denominator for the two profiles observed. Again, this could tentatively be explained as a discipline-specific feature of language in this educational setting, but the presence of such a distribution (a high density of words associated with positive emotions and a low density associated with negative emotions) may also point to other variables that affect language production by these professional educators. The level of professional literacy, command of discipline-specific language, professional experience, academic credentials, personality traits and level of professional stress are among the variables that may have this effect. For example, the number of children in each class, high noise levels and/or other aspects of the work environment increase stress levels and may “taint” educators’ speech with words associated with negative emotions. Importantly, the data tended toward differentiating a third cluster characterized by the predominant use of words with a negative emotional connotation, but ultimately, due to the low number of members in the cluster, it was not included.
It should also be noted that the low number of participants in the sample does not allow us to assume that these findings generalize to the wider population of educators. However, the small differences observed between the cluster mean centers suggest high data homogeneity; in other words, word production tends to be similar in both clusters. This similarity in distribution can provisionally be attributed to a two-layer explanation: on the one hand, a consistent formative process, and on the other, deficiencies in training and/or, ultimately, an excessive focus on certain semantic categories (e.g. positive emotions), which may be associated with professional bias due to working with children at such a young age.
4.3 Predictors of socio-affective words (research question no. 3)
Many of the variables failed to meet the assumptions for the regression analysis, which provides a valuable insight into the structure and complexity of this type of linguistic data and the methodology required for the data collection process. The findings showed that most semantic categories in the socio-affective domain - family, friends, affective processes, positive emotion, negative emotion, anxiety, anger, and sadness - are, as expected, largely good predictors of one another. These can be joined by exclusion, sexual, causation, motion, ingestion, health, and body, which are also sufficiently good predictors of the production of words associated with semantic categories in the socio-affective domain. Exploring these predictive relationships between semantic categories and their specific lexical units is shaping up as an attractive line of future research.
4.4 Limitations and future directions
Forthcoming studies in this area should be enriched by extending the methodological design to take into account comparison groups (e.g. teachers in primary and secondary education), and ideally by also including a longitudinal design to observe the behavior and stability of these semantic categories, profiles, and predictors over time. In addition, a larger number of participants would allow the use of parametric statistics and thereby support greater validity and reliability. One other relevant consideration concerns the indiscriminate nature of the data collection. While this study leveraged the advantages of continuously recording information, future work may be more flexible and define more precisely the scope of the corpus from which data is to be extracted, for example by segmenting audio recordings prior to analysis. Furthermore, the software used groups words in linguistic categories that may not conform exactly to the distinctive use of language by the research subjects. Future research should take this into account and describe the particular use of socio-affective language by employing dedicated indices that acknowledge the idiosyncrasies of the participants. One other important issue concerns the exploration of more complex meanings in educators’ socio-affective speech, which would enable a better contextually situated characterization of this linguistic phenomenon. Lastly, a parallel characterization of the production of socio-affective words by children would offer a more comprehensive perspective of the dynamics of communication within the early education classroom, in accordance with research trends in designing studies on language acquisition.
V. Conclusions
The objective of this study was to characterize the production of socio-affective words by early childhood educators by exploring their use, frequencies, and typical groupings during their regular teaching activities with children of preschool age. Our findings suggest dominant lexical densities, two distinctive semantic profiles with differing explanatory capacities, and various predictors of socio-affective words.
It is widely reported in relevant literature that an effective teacher will tend to provide an unwaveringly high level of support to students, offering numerous opportunities for them to practice, make mistakes, and learn. In this sense, a high or low density of words associated with the socio-affective domain has an impact on children’s learning and development process, driving or potentially restricting the development of specific skills and knowledge, for example through the acquisition of appropriate vocabulary and/or the acquisition of richer descriptive approaches to their own emotional experience. In other words, the cognitive scaffolding underpinning teachers’ use of discipline-specific language, and its influence on child language development, is an essential factor that boosts the development of socio-affective skills. This opens up an attractive line of research on the role of instructional language in learning, in view of what was recently reported by Martí and Portolés (2019) and the emotional modulation that early childhood educators’ teaching styles may represent for children’s behavior.
Finally, it is hoped that our study will provide an empirical contribution to understanding the factors associated with quality and effectiveness in teaching in early education by characterizing early childhood educators’ verbal competence, and in particular, by identifying the most appropriate lexical choices to meet children’s socio-affective needs at each stage of their development. It is our hope that these findings will strengthen interest in this line of research and facilitate further work on this linguistic phenomenon in the specific learning context in which these professional educators work. We also hope to contribute to the construction of updated and ad hoc formative proposals in higher education and continuing education for these educators.