Introduction
A special issue of Therya dedicated to Dr. Alfred L. Gardner for his long research career on the diversity of neotropical mammals, especially for his work in México, honors this outstanding scientist by contributing important advances to the knowledge of mammalogy. Our contribution adds to the mission of modern systematic biology: the discovery, description, and classification of the biodiversity on the planet from an evolutionary perspective (Daly et al. 2012). This task involves subjects under debate over the past three decades, such as the species concept (what a species is) and species delimitation (how a species is recognized). Both subjects are closely related but conveniently divided for practical applications (see review by de Queiroz 2007), and over time, species delimitation has taken priority over species concepts (Sites and Marshall 2003, 2004). Given the current rate of species loss, it is urgent to accurately delimit species inasmuch they are the fundamental unit in studies of ecology, systematic, and conservation biology, among other research areas. From the evolutionary standpoint, species delimitation includes the understanding of population-level mechanisms that can be complex (Huang 2020). Populations differentiation through multiple stages at different rates, in part dependent on factors such as generation time, selection pressure, and gene flow. Tracing the process with an acceptable level of certainty depends on the use of appropriate markers (preferably multiple and independent) and the criteria of evaluation (de Queiroz 2007). One of the most reliable strategies is to use multiple sources of evidence (morphology, genetics, ecology, geography, among others) and to base conclusions on their consistency (Knowles and Carstens 2007; Rissler and Apodaca 2007; Carstens et al. 2013).
There are both regions as well as biological groups, which are amenable to test hypotheses about species delimitation. The Mesoamerican region has been repeatedly used as a study model because of its complex physiography and biogeographical history, which is reflected by high biological diversity, including many endemic species (Myers et al. 2000), particularly for highland groups. As for groups of organisms, rodents, reptiles, and insects, among others have served as models to test hypotheses about evolutionary patterns and processes (e. g.Doody et al. 2009; Gilbert and Manica 2015; Maestri et al. 2017). Some species of rodents have been assessed by evaluating their phylogenetic relationships and further used to illuminate the vicariant biogeography of Mesoamerica (e. g.Sullivan et al. 2000; Leon-Paniagua et al. 2007; Almendra et al. 2018; León-Tapia et al. 2021). Such is the case of Reithrodontomys sumichrasti (Family Cricetidae; Bradley 2017), with a particular interest in the high levels of intraspecific divergence reported (Sullivan et al. 2000; Urbina et al. 2006; Hardy et al. 2013).
Reithrodontomys sumichrasti is distributed along the highlands of Mesoamerica, from central México at 1,200 masl to Panama above 3,400 masl, inhabiting temperate pine-oak and cloud forests. Seven subspecies are recognized, which are distributed in three disjunctive spots (Hooper 1952; Hall 1981; Figure 1). The range of R. s. sumichrasti includes portions of the Sierra Madre Oriental, the Mexican Transvolcanic Belt, and the Oaxacan Highlands (type locality El Mirador, Veracruz, México). The distribution of R. s. nerterus is restricted to the west portion of the Mexican Transvolcanic Belt (type locality Nevado de Colima, Jalisco, México) whereas R. s. luteolus is found in the Sierra Madre del Sur (type locality Juquila, Oaxaca, México). R. s. dorsalis occurs in the mountains of the Mexican states of Chiapas and Guatemala (type locality Tonicapan, Guatemala) and R. s. modestus in the highlands of El Salvador, Honduras, and western Nicaragua (type locality Jinotega, Nicaragua). The southernmost distribution of the species includes the Cordillera Central and Cordillera de Talamanca in Costa Rica for R. s. australis (type locality Cartago, Costa Rica) and the extreme east of Costa Rica and high mountains of western Panama for R. s. vulcanius (type locality Chiriquí, Panama; Hooper 1952).
Previous phylogenetic studies using DNA sequences of the mitochondrial Cytochrome b (Cyt-b) gene (Sullivan et al. 2000), or also incorporating the seventh intron of nuclear gene beta-fibrinogen (Fgb-I7) and the second intron of the acid phosphatase type V (Acp5; Hardy et al. 2013) have revealed the existence of several distinct clades that may deserve species-level recognition. Lineages on either side of the Isthmus of Tehuantepec in México were proposed as distinct biological species, but this pattern has been supported by only mtDNA sequences (Sullivan et al. 2000; Hardy et al. 2013). Although it was difficult to elucidate the relationships among networks of populations from central México (Hardy et al. 2013; Figure 2), there was a clear pattern of phylogenetic structure.
Here, we evaluate species delimitation within R. sumichrasti using different methods of analysis than those previously employed to test the hypothesis that R. sumichrasti represents a complex of cryptic species. We also comment on the diversification processes in the region and make taxonomic suggestions.
Materials and methods
Data acquisition. DNA sequences from the mitochondrial gene Cyt-b, and the Fgb-I7 and Acp5 nuclear genes, representing Hardy et al. (2013) populations dataset of Reithrodontomys sumichrasti (n = 226) were obtained from GenBank. We sequenced an additional 11 specimens of R. sumichrasti, five of these from three new geographic localities (64 to 66; Appendix 1). Given the current availability of sequence data for outgroup taxa, we included samples of R. zacatecae, R. megalotis, R. chrysopsis, R. humulis, R. montanus, and R. raviventris from the R. megalotis species group (Musser and Carleton 2005). The updated DNA datasets were realigned with MAFFT v7 [L-INS-i refinement, gap penalty = 3, offset = 0.5] (Katoh et al. 2005) for nuclear markers, and manually for Cyt-b using Geneious Pro v6.1.6 (https://www.geneious.com). The optimal partition scheme (by gene) and models of nucleotide substitution (Cyt-b: GTR+I+G, Fgb-I7: HKY+I+G, Acp5: K80+I+G); were determined with Partition Finder (Lanfear et al. 2014).
Phylogenetic hypothesis. We considered the phylogenetic relationships proposed by Hardy et al. (2013) as our working hypothesis, where two geographic clades are supported as species-level lineages. One species (spA) split ~2.5 million years ago (Ma) and comprises populations from Chiapas south into Central America (clade I; Figure 2). Species (spB) includes 3 haplogroups restricted to México, west of the Isthmus of Tehuantepec (Figure 2), whose most recent common ancestor was placed ~1.36 Ma (see Hardy et al. 2013). To assess support for this phylogenetic hypothesis (Hardy et al. 2013), and for alternative topological arrangements, we applied three methods for assessing species boundaries and species tree estimation (see below) that do not require a guide topology or species assignments to be specified a priori.
Single locus species delimitation. A time-calibrated Bayesian Inference (BI) analysis of Cyt-b for R. sumichrasti samples was run in BEAST2 v.2.6.2 (Bouckaert et al. 2014). We employed a prior rate of evolution of 0.017 substitutions per site per million years (Arbogast et al. 2002) and fossil calibrations (R. moorei, R. wetmorei, R. galushai, R. pratincola, R. rexroadensis, and R. sp.) with an offset of exponential prior for the age (in Ma) of the root (mean = 2.25, offset = 1.3, HD = 95 % between 1.5 to 5.5 Ma; Dalquest 1978; Czaplewski 1987; Martin et al. 2002; Morgan and White 2005; Lindsay and Czaplewski 2011; Martin and Peláez-Campomanes 2014). BI analysis consisted of four Markov chain Monte Carlo (MCMC) chains of 10 million generations, sampling trees every 1,000 generations and with a burn-in of 20 % of the trees. The last 100 trees sampled from each run were analyzed with 1 million generations of the Bayesian General Mixed Yule-Coalescent (bGMYC) model (Reid and Carstens 2012) in the computing environment R (R Core Team 2018). As advised by Reid and Carstens (2012), outgroup taxa were not included in this analysis. For all Bayesian analyses reported herein, stabilization and appropriate Effective Sample Sizes (ESS ≥ 200) of the posterior distributions for model parameters were examined in Tracer 1.8 (Rambaut et al. 2018).
Time-calibrated multiple loci species delimitation. The multiple loci multiple species dataset was analyzed simultaneously with the multi-tree multi-species coalescent method (Heled and Drummond 2010) and the assignment-free species delimitation technique implemented in STACEY (Jones 2017), using BEAST2. The search strategy implemented in STACEY uses a birth-death-collapse prior to approximate alternative delimitation models and node re-height MCMC move that aims to improve the convergence of the species tree estimation, therefore, its performance is subject to the accuracy of divergence times estimation. As recommended, the analysis was run twice, the second time sampling from the prior only; for 100 million generations, trees were sampled every 5,000 generations. A Fossilized Birth-Death model was set on the speciation rate (Heath et al. 2014), time-calibrated as specified above. Topologies and clock rates from individual loci were left unlinked, and substitution rates among branches were drawn from a log-normal distribution with a prior mean rate of 0.017 substitutions per site per million years for the Cyt-b (Arbogast et al. 2002).
Clock-like multiple loci species delimitation. We assessed the probability of alternative species delimitation models and species trees with the Bayesian Phylogenetics and Phylogeography method (BPPv3.2; Yang and Rannala 2014). This assumes a Jukes-Cantor evolutionary model (strict molecular clock) and applies a species tree search strategy that is grounded on the Nearest Neighbor Interchange (NNI) algorithm, followed by its characteristic rjMCMC move. Although it accounts for the uncertainty on estimated rates of evolution compared to *BEAST-STACEY, this method is applicable to inter- and intra-species datasets that meet the criteria of having clock-like evolutionary rates. For this analysis, uniform rooted species trees were assumed, with gamma priors for the population size (α, β) of Θ = (2, 2000) and root age (Tau = τ) τ0 = (4, 2, and 1). The rjMCMC was run with algorithm A11 with fine-tune parameter ε _= 2 (joint unguided species delimitation and species tree inference) for 500,000 generations with a sampling frequency of 200 after a burn-in period of 10,000.
Genetic distances. Cyt-b genetic distances using the Kimura 2-parameter (K2P; Kimura 1980) and the uncorrected P-distances were estimated between and within clades suggested as distinct species using MEGA X (Kumar et al. 2018). This allowed us to make genetic distance comparisons with other values reported for rodents and for R. sumichrasti by Bradley and Baker (2001) and Hardy et al. (2013), respectively.
Ecological niche equivalence. For each species-level clade (clades I-IV, see Results section), we developed present-time Ecological Niche Models (ENMs) with MAXENT 4. (Phillips and Dudik 2008). Correlation between the 19 environmental variables from the WORLDCLIM database (1 km2 resolution; Hijmans et al. 2005) was calculated with ENMtools v1.4.1 (Warren et al. 2010). Then, 9 environmental variables (correlation = r ≤ 0.80) and presence points confirmed with molecular data (Appendix 1) were employed to obtain the ENMs. For clades I-III, 10 bootstrap replicates of presence/background points assigning 15 % of the presence points for training were applied. For clade IV, 10-fold cross-validation replicates were applied because of the limited number of presence records.
To test the hypothesis of niche conservatism between the ENMs from sister clades, a null distribution of 99 estimates of the I Statistics (Warren et al. 2008) and the Schoener’s D (Schoener 1968) measures of niche overlap was generated for each pair of sister clades with the R package DISMO (Hijmans et al. 2017). In addition, a canonical discriminant function (CF) analysis was executed with the package candisc (Friendly and Fox 2015), to distinguish the potential affecting the extent to which their niches have been conserved. For this analysis, current time ENMs were reclassified so that each pixel predicted by each model would equal 1 and the rest of the grid 0. The resultant ENM masks were used to extract for each clade pixel-level data for the 9 environmental variables.
Lineage dispersal. To reconstruct the spatiotemporal history of lineage dispersal in R. sumichrasti we used the Relaxed Random Walk model (RRW; Lemey et al. 2010) as implemented in BEAST2. This model assumes an uncorrelated diffusion rate across the tree and infers the dispersal lineage history in space and time simultaneously, using both the phylogenetic tree and the geographic locations of the samples (Dellicour et al. 2021). To build the RRW we employed the geographic coordinates from each terminal collecting locality as a two-dimensional trait. We assumed a relaxed molecular clock (prior rate = 0.017, SD = 1.0), and the tree priors were calibrated as described above. To visualize the estimated phylogeographic reconstruction, space-time dispersal networks were created using SPREAD 1.0.6 (Bielejec et al. 2011).
Results
Phylogenetic hypothesis and species delimitation. The bGMYC species delimitation analysis of the Cyt-b recovered two species-level clades within R. sumichrasti (P ≥ 0.95), separated by the Isthmus of Tehuantepec (Figure 3; Hypothesis 1). In this phylogeny, samples from new populations 64 to 66 from Guerrero and Oaxaca formed part of clade II. For the BPP and STACEY multiple-loci methods, the highest probability values (BPP, pP = 0.56; STACEY, pP = 0.91) supported Hypothesis five which recovered four divergent clades at the species level (Figure 3). One of them (clade I) was confined to the east and south of the Isthmus of Tehuantepec in México and Central America and the other three (clades II, III, and IV) were restricted to México. The K2P genetic distance values ranged from 5.43 % to 7.52 %, with the lowest value between clades II and IV and the highest between clades I and IV (Table 1). Similar genetic distance values among clades were obtained with the uncorrected P-distances (Table 1).
R. sumichrasti | Clade I | Clade II | Clado III | Clado IV |
---|---|---|---|---|
Clade I | 1.71 | 6.69 | 6.97 | 7.01 |
Clade II | 7.16 | 1.66 | 5.74 | 5.17 |
Clade III | 7.47 | 6.07 | 1.59 | 6.28 |
Clade IV | 7.52 | 5.43 | 6.67 | 0.25 |
The species delimitation methods and the species tree (Figure 4) recovered the ancestral position of clade I (pP = 0.84), with a mean divergence time for the most recent common ancestor (MRCA) of ~2.15 Ma. The bGMYC supported the sister relationship between clades II and IV, whereas the multi-loci methods and the species tree supported the split of clade IV (pP = 0.79; mean divergence time 1.42 Ma), and a sister relationship between clades II and III (pP = 0.70; mean divergence time 0.90 Ma). In addition, the ancestral position of R. chrysopsis with respect to R. megalotis-R. zacatecae and R. sumichrasti was strongly supported (pP = 1.00), with an MRCA mean age estimated at 6.18 Ma. Also, a closer relationship was recovered between R. humulis and R. montanus-R. raviventris (pP = 1.00; mean divergence time 6.43 Ma), although the sister relationship of R. montanus-R. ravivientris received lower probabilities (pP = 0.86; mean divergence time 4.44 Ma).
Ecological niche equivalence. Ecological Niche Models generated for the four species-level clades within R. sumichrasti had AUC values above 0.90 for training data. The inter-clade predictability of the ENM of clade I ranged from 95 % when predicting known localities from clade III to 100 % when predicting known localities of clade IV (Figure 5). Clade IV had the most restricted ENM, and its inter-clade predictability ranged from 0 % when predicting clade III (and vice versa), to 18 % when predicting clade II. The ENMs of clades II and III showed the lowest intra-clade predictability values with 90 % and 95 %, respectively. Quantification of niche overlap with the I and Schoener’s D statistics (from here forward I and D) revealed small amounts of overlap between each clade pair. For all clade pairwise comparisons, the niche identity (niche equivalency) hypothesis was rejected regardless of the similarity measure (I or D; Table 2).
R. sumichrasti | Clade | Schoener’s D | I statistics | Identity test |
---|---|---|---|---|
Clade I | II | 0.1322 | 0.3075 | niche non-equivalency |
III | 0.4369 | 0.7547 | niche non-equivalency | |
IV | 0.2722 | 0.5371 | niche non-equivalency | |
Clade II | III | 0.3803 | 0.6456 | niche non-equivalency |
IV | 0.1872 | 0.3900 | niche non-equivalency | |
Clade III | IV | 0.0260 | 0.0843 | niche non-equivalency |
The canonical variable analysis did not discriminate significantly among the ENMs of the clades (Figure 6). The first and second canonical functions accounted for 97.3 % of the variance and the meaningful structure coefficients (> 0.3) were exclusively related to temperature (BIO1, BIO2, BIO4, BIO5, BIO6, BIO7). Overall, there was more similarity among mean values of each climatic variable between the ENM of clades II and III, whereas the area that occupied clade IV displayed extreme values for the Max Temperature of Warmest Month (BIO5; 27.4 °C), Annual Precipitation (BIO12; 1086 mm), and Precipitation of Driest Quarter (BIO17; 14.86 mm; Table 3).
Climatic Variable | Function 1 Eigen=0.261 | Function 2 Eigen=0.035 | Function 3 Eigen=0.008 | Clade I | Clade II | Clade III | Clade IV |
---|---|---|---|---|---|---|---|
BIO1 | 0.689 | 0.402 | 0.028 | 17.11 | 16.75 | 14.15 | 18.44 |
BIO2 | -0.054 | 0.409 | 0.023 | 11.82 | 12.23 | 12.18 | 12.96 |
BIO4 | 0.632 | 0.086 | 0.021 | 104.09 | 124.54 | 185.44 | 164.25 |
BIO5 | 0.239 | 0.486 | 0.379 | 24.88 | 25.24 | 23.17 | 27.47 |
BIO6 | -0.385 | 0.280 | 0.252 | 9.00 | 8.16 | 4.40 | 8.30 |
BIO7 | -0.614 | 0.671 | 0.015 | 15.88 | 17.09 | 18.77 | 19.17 |
BIO11 | 0.421 | 0.149 | 0.619 | 15.70 | 15.11 | 11.64 | 16.16 |
BIO12 | -0.257 | 0.116 | 0.056 | 1723.79 | 1237.19 | 1157.14 | 1086 |
BIO17 | -0.196 | 0.232 | 0.302 | 79.52 | 34.47 | 98.99 | 14.86 |
EV (%) | 85.575 | 11.724 | 2.700 |
BIO1 = Annual Mean Temperature; BIO2 = Mean Diurnal Range (Mean of monthly (max temp - min temp)); BIO4 = Temperature Seasonality (standard deviation *100); BIO5 = Max Temperature of Warmest Month; BIO6 = Min Temperature of Coldest Month, BIO7 = Temperature Annual Range (BIO5-BIO6); BIO11 = Mean Temperature of Coldest Quarter; BIO12 = Annual Precipitation; BIO17 = Precipitation of Driest Quarter; EV (%) = Percent of explained variance.
Lineage dispersal. The RRW model predicted the ancestral distribution of R. sumichrasti was centered in the SMdC physiographic region (abbreviations described in Figure 2), within the current extent of clade I (Figure 7). This clade started to spread at ~1.80 - 1.75 Ma to the west crossing the Tehuantepec Isthmus towards both the Oaxacan Highlands (OH) and Sierra Madre del Sur (SMdS) where the MRCA of clades II, III, and IV originated. Subsequently (between 1.53 - 1.25 Ma), the MRCA of clade III extended to the Sierra Madre Oriental (SMOr), while clade I colonized the Costa Rican Seasonal Moist Forest (TR*) and Talamancan Range (TR) regions. By ~1.25 to 0.65 Ma, the ancestor of clade IV expanded to the west of the Mexican Transvolcanic Belt (CT as named in Hardy et al. 2013), and by ~ 0.11 Ma most dispersal events occurred when clade II expanded through the central and east of the CT, but also seemed to expand towards the east by the OH (Figure 7).
Discussion
Species delimitation. The use of innovative tools and methodologies to assess species boundaries has helped to clarify taxonomic problems while facilitating the generation of robust hypotheses to reveal cryptic species and describe the speciation processes (Dayrat et al. 2005; Padial et al. 2010). Such is the case of mammals distributed in Mesoamerica, characterized by a peculiar evolutionary history that is linked to the environmental and biogeographical characteristics of this region (see Almendra and Rogers 2012). We used the cricetid rodent R. sumichrasti because it is a good model to evaluate the biogeographical and ecological niche conservatism hypotheses linked to vicariant speciation events in México to Central America. This approach was addressed by other authors (Sullivan et al. 2000; Martínez-Gordillo et al. 2010; Hardy et al. 2013), but this is the first time that the use of mathematical methods for species delimitation and phylogeographic reconstruction is put into practice for this species.
Our results show that the species delimitation methods support the phylogenetic hypotheses one and five with higher posterior probabilities, suggesting that R. sumichrasti is a complex of multiple species. In both hypotheses, clade I was identified as a distinct species, as this result was congruent among the three species delimitation methods. Recognition of clade I at the species level has been suggested previously due to its position in the molecular phylogenies (Sullivan et al. 2000; Hardy et al. 2013), and to the P-distances to the remaining clades (6.15 % to 9.10 %; Hardy et al. 2013). We agree with this species-level suggestion since this clade was placed as an independent sister lineage to the other clades of R. sumichrasti in our phylogenetic trees and also showed the highest genetic divergence (both K2P and P-distances) compared to clades II-IV. The populations belonging to this clade are distributed southeast of the Isthmus of Tehuantepec, from the Sierra Madre de Chiapas, México to western Panama (Hall 1981), and were the first to diverge from a common ancestor ~2.15 Ma. This mean age is close to that reported by Hardy et al (2013; ~2.56 Ma), placing the species diversification within R. sumichrasti at the Plio-Pleistocene boundary (see discussion below).
The proposal that clade I evolved independently was better supported by molecular data than by ecological data. The environmental niche space that this clade occupies predicted the potential distribution areas of the remaining clades with high percentages, although the inverse was not true. In general, R. sumichrasti sensu lato inhabits brush and grass in pine-oak and cloud forests throughout its geographical distribution. However, Hooper (1952) reported a greater diversity of habitats for the subspecies that encompass clade I, particularly for R. s. dorsalis and R. s. australis. This apparently broad environmental range could explain the high percentages of predictability we found, which was also evidenced in the canonical analysis. Nevertheless, non-equivalency of niche was found in the niche identity test. The remaining ecological analyses showed a relatively high similarity between this clade and clades II-IV, suggesting that their differentiation at the species level within R. sumichrasti sensu lato was more favored by geography than by ecology (Peterson et al. 1999).
The species delimitation methods were not consistent in the delimitation of clades II, III, and IV. The single-locus bGMYC (Cyt-b) proposed that the three clades form a single species, while the multiple-loci BPP and STACEY (Cyt-b + Fgb-I7 + Acp5) considered each clade as a distinct species. Molecular delimitation methods are considered a valuable complement to taxonomy based on morphological traits and are often used as part of an integrative approach to validate putative species (Luo et al. 2018). The three delimitation methods used in our study have been recognized for their high performance for this purpose (Jones 2017; Luo et al. 2018), but only two of them (BPP and STACEY) were consistent in this work. The performance and accuracy of each method can be affected by factors including both biological (variation in population size, uninterrupted gene flow) and methodological (input tree), among others, so they can over or underestimate the number of species (Rannala 2015; Luo et al. 2018). For this reason, the use of different molecular delimitation methods is highly recommended with species hypotheses based on the congruence among them (Carstens et al. 2013). In accordance with this suggestion, Hypothesis five (which is based on multiple loci) should be accepted and therefore each clade distributed west of the Isthmus of Tehuantepec constitutes a distinct species-level entity. Hypothesis five (Fig. 2) was also supported by the amount of Cyt-b genetic differentiation among clades. The K2P genetic distance values between pairwise clades II-III, II-IV, and III-IV were 6.07, 5.43, and 6.67, respectively, which are greater than the 5 % value associated with sister species recognition in mammals (Baker and Bradley 2006) including rodents (ranged from 2.70 % to 19.23 %; Bradley and Baker 2001).
Phylogenetic relationships among clades II, III, and IV were different between the Cyt-b tree topology and the species tree, but generally with weak nodal support. In the first case, II and IV were recovered as sister clades, while in the second, clades II and III were more closely related. These results partially coincide with the topologies obtained by Hardy et al. (2013), in which their concatenated DNA tree is consistent with our species tree. On the other hand, none of our phylogenies (gene tree or species tree) recovered sister relationships between clades III and IV, such as those obtained in the Cyt-b tree of Hardy et al. (2013). This is also supported by the ecological results where there is a greater ecological similarity (based on both directions of area predictability) between clades II and III than between clades II and IV or III and IV.
The ecological niche characteristics (from the bioclimatic variables used) of clade II showed high predictability percentages of the ecological suitability areas of clades III and IV, but these tended to have low or null values when the inverse analysis was performed. For example, clade IV predicted only 18 % of clade II and 0 % of clade III. The geographical distribution of each clade could explain the different percentages of predictability of the environmental niche. The wide geographical distribution of clade II includes localities of the CT, SMdS, extreme south of SMOr, and OH, while clade III is distributed in the SMOr, and clade IV is restricted to Coalcomán and Dos Aguas localities, in Michoacán (Hall 1981; Hardy et al. 2013; Figure 1, 2).
Niche pairwise comparisons showed low observed values for D and I similarity indices, mainly between clades III and IV. This is based on the fact that these indices can take values from 0 (no niche overlap) to 1 (total niche overlap; Warren et al. 2008). Closely related species are predicted to share characteristics of their environmental niche due to their common ancestry (Peterson et al. 1999), but niche differentiation can occur when allopatric populations exist, and gene flow is assumed to have been disrupted in the past (Avise 2000; Martínez-Gordillo et al. 2010). This could explain the non-equivalency of niche between these clades, as well as the low values of area predictability, which coincides with reports of Martínez-Gordillo et al. (2010) for different rodent species, including R. sumichrasti.
Bioclimatic data show that clade II shared similar characteristics to the other clades depending on the variable being analyzed. Moreover, clade III was characterized by low temperatures and the second-highest value of annual mean precipitation. These bioclimatic characteristics correspond to the habitat description of R. s. sumichrasti, mainly associated with pine and pine-oak forests, in “areas frequently bathed by clouds and rain (Hooper 1952:72)”. In contrast, clade IV was associated with higher temperatures and lower precipitation values, showing extreme values with respect to the other clades in at least five of the nine variables analyzed. Hardy et al. (2013) highlighted the presence of geographical barriers such as low-lying river drainages that have isolated clade IV populations from other R. sumichrasti sensu lato populations, which could justify our molecular and ecological results regarding the species recognition of this clade.
Phylogeographic history. Our results suggest that the common ancestor of the R. sumichrasti sensu lato originated in the montane regions of northern Central America ~2 Ma ago and expanded to where this species complex currently occurs. Various geographic and environmental factors may have favored and/or limited its dispersal in Central America and México (for more details see Hardy et al. 2013). The montane and intermontane Central America regions have a deep tectonic and volcanic history, which may have influenced the origin and diversification of montane species such as Peromyscus guatemalensis, P. bakeri, and P. carolpattonae (Álvarez-Castañeda et al. 2019). Also, the Pleistocene glacial cycles may have played a key role, due to favorable climatic conditions (Ceballos et al. 2010), which allowed the colonization of new areas and in some cases new habitats, followed by post-glacial isolation that limited the gene flow between populations (Martin 1961). This has been reported in several groups such as plants (e. g.Ramírez-Barahona and Eguiarte 2013), reptiles and amphibians (e. g.Church et al. 2003; Howes et al. 2006), birds (e. g.Johnson and Cicero 2004; Baker 2008), and mammals (e. g.Ceballos et al. 2010; Chiou et al. 2011) including other species of Reithrodontomys (Martínez-Borrego et al. 2022). In addition, geographic regions such as the Isthmus of Tehuantepec seem to have acted as an efficient barrier limiting gene flow between populations that are distributed on both sides of the Isthmus, an accepted explanation for R. sumichrasti and other rodent species (e. g.Sullivan et al. 2000; León-Paniagua et al. 2007; Ordoñez-Garza et al. 2010; Hardy et al. 2013).
The lineage dispersal in México was from populations in the west of the OH and SMdS that currently belong to the clade II, which spread into SMOr (clade III) and the west of CT (clade IV) as well as through the central and east of the CT (clade II). This model would explain the wide geographical distribution of clade II, and also its greater number of haplotypes compared to the other clades (Hardy et al. 2013). Although these dispersal events seem to have occurred relatively recently, the physiographic characteristics of the Mexican mountainous regions (Morrone 2005; Escalante et al. 2009) could have favored relatively faster speciation processes within R. sumichrasti complex, leading to differentiation, at least genetically and ecologically, among each clade analyzed here. This seems to be a common pattern in several species of small mammals, where the allopatric effect and the habitat characteristics each ancestral species occupied resulted in complete speciation of lineages, often associated with cryptic speciation processes (e. g.Arellano et al. 2005; Rogers et al. 2007; León-Tapia et al. 2021; Martínez-Borrego et al. 2022).
Taxonomic considerations. Species delimitation methods and values of genetic divergence support the recognition of populations of R. sumichrasti at the east and south of the Isthmus of Tehuantepec, from Chiapas, México to Central America (Clade I), as a valid species which is different from everything occurring to the west of this geographical barrier. According to this hypothesis, then R. australis (Allen 1895) is the taxonomic name that has priority (Article 23; ICZN 1999). Subspecies distributed across this region of Mesoamerica, beyond the nominotypical would include R. a. dorsalis (Merriam 1901), R. a. modestus (Thomas 1907), and R. a. vulcanius (Bangs 1902).
In addition, the existence of an undescribed species represented by the populations included in clade IV, from Coalcomán and Dos Aguas in Michoacán, México (northwestern SMdS) is supported by species delimitation methods and values of genetic divergence. The disjoint distribution of this genetically distinct clade suggests that it does not belong to R. s. nerterus nor R. s. luteolus. The mountainous region inhabited by this new species is isolated from other mountain ranges in the area by lowlands of up to approximately 400 masl. This pattern of genetic differentiation coincides with the recent description of a new species of the genus Peromyscus (P. greenbaumi; Bradley et al. 2022; but see also León-Tapia et al. 2021). In order to make the formal description based on diagnostic characters that will derive in an appropriate species name, a morphological comparison would be necessary.
Molecular species delimitation and genetic distance values associated to populations from clades II and III indicate that these two lineages should be recognized as valid species. Nomenclatural suggestions are difficult to make due to the sympatry of individuals of some populations from both clades. This was already addressed by Hardy et al. (2013) through nested clade analysis. In our study a phylogeographic pattern of diffusion of the lineages (RRW model) suggests colonization after the separation of clades II and III. Nevertheless, in this work we propose populations comprising clade II should be recognized as R. nerterus (Merriam, 1901). Although we did not include specimens from the type locality of R. nerterus (El Nevado de Colima, Jalisco, México), we analyzed several individuals from sites reported by Hooper (1952) for this taxon. Because clade II includes populations of the known distribution of R. s. luteolus, this taxon should be considered as subspecies of R. nerterus. Clade III should be named as R. sumichrasti; here we also did not include individuals from the type locality (El Mirador, Veracruz, México), but we used specimens from localities that belong to this species. Populations from south Puebla and Northern Oaxaca (28, 1, and 10 in Figure 2), regarded originally as R. s. sumichrasti should be now R. n. luteolus. It remains necessary to evaluate sympatric populations from both clades in order to identify plausible evolutionary processes in this region.