Introduction
In economics, capital is the collection of goods that is calculated by summing the contributions of all the people involved (members of a business partnership, family, etc.). In nature, natural capital is the body of natural resources (for example, plants and animals), that on one hand can generate some benefit for humanity, but also maintains harmony among the services provided in an ecosystem (for example, oxygen generation, pollination, erosion prevention, etc.).
Goods, in economics and in ecology (the set of properties, wealth, or resources that belong to a person or group) must be known (inventoried) to accurately determine the amount of capital. In botany, the activity that helps determine the natural capital of a country, state, or region is herborization; that is, the act of traveling around a natural space and collecting samples (specimens), which after a meticulous formal process, are incorporated into scientific collections known as herbaria. The specimens resulting from herborization and maintained in herbaria constitute the primary materials for plant systematics (Sánchez-González & González-Ledesma, 2007). They are used by taxonomists to make observations and measurements that, once analyzed, generate the academic products of this line of research (floristic inventories, floras, monographs, taxonomic revisions, phytogeographic analyses, etc.).
The collection of botanical specimens is an ancient activity, dating to long before the existence of herbaria as we currently know them. In fact, the concept of “herbarium” alludes to a kind of book illustrating plants, particularly medicinal plants. Currently, an herbarium is a collection of dried, pressed plants that are stored in a collection that follows a particular arrangement based on a classification system (Lot & Chiang, 1986). The specimens stored in herbaria are collected in the field by a wide variety of people, including professional collectors that make their living from the sale and profits of their collections, botanists that study a plant group to understand its diversity, geographic distribution, and evolution, or that are interested in inventorying the floristic richness of a region, and amateurs who are interested in knowing the plants around them. Not all collectors contribute equally to the enrichment of a collection; there are some collectors, mainly those employed by or associated with herbaria, that have made particularly important contributions to the accumulation of specimens protected in the collection. For example, the National Herbarium of Mexico (Herbario Nacional de México; MEXU) at the Institute of Biology of the National Autonomous University of Mexico (Instituto de Biología, Universidad Nacional Autónoma de México), houses just over 1.5 million mounted specimens that are accessible for consultation, generated by the effort of slightly over 7,000 people, each of whom collected somewhere between 1 and over 30,000 specimens.
There have been several attempts to document the people involved in the effort to better characterize the floristic richness of Mexico. Perhaps the most important was the account of Mexico’s botanical collectors done by Knobloch (1983), which reports the existence of about 3,472 collectors. This author highlights the difficulty of generating a total count of collectors, since people interested in collecting are constantly incorporated, which leads to an ever-growing accumulation of names. At the same time, information should be collected from herbaria all over the world, since many of them have records of some collecting activity done in Mexico, such that some names may be omitted.
Another important contribution to the knowledge of the botanical collectors in Mexico is that of Rzedowski et al. (2009), who after an exhaustive bibliographic review of the history of botanical collection in Mexico and of the lives and works of many important collectors in Mexico, synthesized the knowledge on the principal botanical collectors through the 1930s. The authors compiled the intense activity of about 540 collectors but restricted their in-depth study to 332 of them, whose contributions they considered merited a more detailed account of their biography and history.
In a more regional context, Martínez and Hernández (1993) compiled a list of the collectors who contributed to the floristic knowledge of Tamaulipas, identifying the participation of approximately 150 botanists. For his part, Rzedowski (1997) analyzed those that had participated in the floristic knowledge of the Bajío Region and neighboring areas (Guanajuato, Querétaro, and northern Michoacán), reporting the contribution of some 420 botanists, considering 67 of them as the most important.
Our bibliographic review did not reveal additional works on the botanists that have done exploration and collection work in the country as a whole. Nevertheless, there are important contributions that address the activity of several of them. The works mentioned above (particularly Rzedowski et al., 2009) contain references to this literature, citing, for example, the biographies of prominent botanists such as C. G. Pringle (Davis, 1936), E. Palmer (McVaugh, 1956), or the 3 generations of Hintons (Hinton et al., 2019). Similarly, the majority of the “Listados florísticos de México” (Floristic listings of Mexico), published by the Institute of Biology, UNAM, briefly recount the collectors involved in each listing (http://www.ibiologia.unam.mx/BIBLIO68/fulltext/listflor.html).
Currently, identifying the collectors that have contributed to collections in a country is particularly relevant, given that much information about biodiversity is stored in digital databases. It is important to stress that the information available can present problems due to a lack of standardization of fields such as collector, among others. Another problem that frequently occurs in the collector field is that the whole group of collectors is recorded, which makes it more difficult to organize and search information. This lack of tidiness in the data decreases the efficiency of the use of databases, hence the importance of creating catalogues that facilitate data cleaning.
The creation of catalogues that allow the normalization of the information contained within them is fundamental and led to the proposal of the following questions for this work: 1) How many and which collectors are documented as having carried out collecting activities in Mexico? 2) How many of them are prominent based on their large number of collections? 3) In which regions of the country have these prominent collectors (hereafter, “main collectors”) carried out their activities? Answering these questions will surely improve the quality and precision of the information in the digitalized biodiversity databases, which will facilitate their use and analysis to address other, more specific questions about the floristic diversity of Mexico and its patterns of distribution. As such, the focal objective of this work was to generate a list of the main collectors of vascular plants in Mexico associated with 4 additional dimensions of information; once the number of main collectors was determined, their participation was analyzed by state, by the core period of their collections, by the herbarium that houses most of their collections, and by their taxonomic knowledge and expertise (family with the highest number of collections).
Materials and methods
To compile the list of collectors, we analyzed just over 5 million records of botanical specimens available from 2 digital information sources on the flora of Mexico. The first was the National Biodiversity Information System (SNIB in Spanish, “Sistema Nacional de Información sobre Biodiversidad”) run by the Mexican National Commission for the Knowledge and Use of Biodiversity (Conabio in Spanish, “Comisión Nacional para el Conocimiento y Uso de la Biodiversidad”; Conabio, 2020). The other source was the digital repository of the National Herbarium of Mexico (Herbario Nacional de México; MEXU-UNIBIO), currently available through the IBdata online platform (www.ibdata.abaco2.org) of the Institute of Biology, UNAM.
We did a preliminary evaluation to eliminate records that did not pertain to vascular plants (eliminating, for example, algae, bryophytes, and lichens), as well as records from the “Naturalista” platform that indicated “observed in the field” in the “sample source” field. Thus, the working database from which we obtained the names of collectors in Mexico was ultimately composed of 3,704,664 records (Table 1). Based on the “Collector Name” field in the database, the first collector’s name listed was identified. The Ibero-American names were separated into paternal surname, maternal surname, and name(s). In the case of Anglo-Saxon names, they were separated into last name, and given name (including first and middle names). We supplemented this list with a (non-exhaustive) search of additional sources, consulting floristic lists and inventories as well as the Index of Botanists of the Harvard University Herbarium (https://kiki.huh.harvard.edu/databases/botanist_index.html) and the Global Plants catalogue in JSTOR (https://plants.jstor.org/). Once we had processed approximately 89% of the records, we did a non-exhaustive cleaning of the collection number for each record, separating its information into 3 new fields: prefix, number, and suffix. For example, the collection number ‘Breedlove 31727A’ was separated into prefix = ‘Breedlove’, number = 31727, and suffix = A. The records were then grouped by the number and suffix fields to identify duplicates. The records were counted for each collector and related to 4 values (information dimensions): state, collection year, collection in which the sample is housed, and taxonomic family. The families of the vascular plants follow the criteria of Villaseñor (2016) with the modification that all families of ferns were grouped together under ‘Ferns’, so that a larger number of collectors could reach the value used as a cutoff (100 records per family).
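The separation of a collection number into prefix, number, and suffix described above can be sketched as follows. This is only an illustration under stated assumptions: the regular expression, the `CollectionNumber` type, and the function name are hypothetical and are not the cleaning routine actually used in this work, which was non-exhaustive and may have handled more cases.

```python
import re
from typing import NamedTuple, Optional


class CollectionNumber(NamedTuple):
    """Hypothetical container for the 3 fields named in the text."""
    prefix: str
    number: Optional[int]
    suffix: str


# Assumed pattern: an optional non-digit prefix, the first run of digits,
# and whatever follows the digits as the suffix.
_PATTERN = re.compile(
    r"^\s*(?P<prefix>\D*?)\s*(?P<number>\d+)\s*(?P<suffix>.*?)\s*$"
)


def split_collection_number(raw: str) -> CollectionNumber:
    """Split a raw collection number into prefix, number, and suffix."""
    match = _PATTERN.match(raw)
    if match is None:
        # No digits at all (e.g. 's.n.'): keep the text as prefix only.
        return CollectionNumber(raw.strip(), None, "")
    return CollectionNumber(
        match["prefix"], int(match["number"]), match["suffix"]
    )


# Example taken from the text.
parts = split_collection_number("Breedlove 31727A")
print(parts.prefix, parts.number, parts.suffix)
```

Grouping records by the resulting `number` and `suffix` values within a collector would then flag candidate duplicates, as described in the paragraph above.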
Record set | Records | Percentage |
---|---|---|
Source database (total records) | 5,367,577 | 100 |
SNIB (Conabio) | 4,011,670 | 74.7 |
MEXU (Instituto de Biología, UNAM) | 1,355,907 | 25.3 |
Country = Mexico | 3,923,911 | 100 |
SNIB (Mexico) | 2,850,361 | 72.6 |
MEXU (Mexico) | 1,073,550 | 27.4 |
Angiosperms, gymnosperms, ferns, and allies | 3,704,664 | 100 |
We defined a “main collector” as any collector that had 500 or more unique specimen records in the database (eliminating duplicates, considering unique “Collector”-“Collection number” values). We consider that a collector with this number of collections has been actively involved in the collection process, whether as a student involved in a floristic inventory or some other type of botanical project, or as a collector associated with an institution. The list of collectors was associated with 4 dimensions of information: 1) state, 2) herbarium or scientific collection, 3) collection year, and 4) taxonomic families.
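The main-collector criterion can be illustrated with a short sketch. It assumes that each record has already been reduced to a (collector, collection number) pair; the function name and the input layout are hypothetical, and only the cutoff of 500 comes from the text.

```python
from collections import defaultdict

MAIN_COLLECTOR_CUTOFF = 500  # cutoff stated in the text


def find_main_collectors(records, cutoff=MAIN_COLLECTOR_CUTOFF):
    """Return {collector: unique record count} for collectors at or above cutoff.

    `records` is an iterable of (collector, collection_number) pairs;
    repeated pairs are treated as duplicates and counted once.
    """
    unique_numbers = defaultdict(set)
    for collector, collection_number in records:
        unique_numbers[collector].add(collection_number)
    return {
        collector: len(numbers)
        for collector, numbers in unique_numbers.items()
        if len(numbers) >= cutoff
    }


# Toy data with a lowered cutoff so the example stays small.
toy_records = [
    ("Collector A", 1), ("Collector A", 1),  # duplicate of the same event
    ("Collector A", 2), ("Collector A", 3),
    ("Collector B", 1),
]
print(find_main_collectors(toy_records, cutoff=3))
```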
We quantified the number of collectors per biome based on the geographic coordinates of the records when available and superimposed them on a digital map based on the criteria of Villaseñor and Ortiz (2014). “Aquatic vegetation” was added as a sixth biome to the 5 biomes reported in that work. We also identified the collectors that are authors of published vascular plant names according to Villaseñor et al. (2008).
The sum of the number of records attributed to each collector in the database was smaller than the total number of specimens deposited in the biological collections, mainly because not all the specimens deposited in the collections have been digitalized and because our process of identifying the collectors in the database was not exhaustive. As such, the cutoff of 500 specimen records in the database to consider a person a main collector is an arbitrary criterion. Similarly, the limit of specimen records used to generate the listings of states, sampling periods, biological collections, and families into which the collectors’ records were mainly grouped was a similarly arbitrary limit, whose only aim was that the associated lists be informative and useful for future floristic and taxonomic studies of the Mexican flora.
Homonymy Index. We generated a “homonymy index” to estimate the number of incorrectly assigned records. In the process of homogenizing the names of collectors, deciding among homonyms was a common issue; for example, if a record contains as the collector’s name “Martínez, E.”, was the specimen collected by “Esteban Manuel Martínez Salas” or by “Enrique Martínez Ojeda”? Thus, for each collector, we counted the number of pairs of records with the same collection number but different collection years. Each pair of records detected in this way indicates a potential inconsistency, of which one of the possible causes is that they belong to different collectors. Therefore, an index that represents the proportion of records that are inconsistent in the fields “Collection number”-“Collection year” was calculated as: Homonymy index = Ih = Rinc / Rcol, where Rinc is the number of pairs of inconsistent records and Rcol is the number of records attributed to the collector. As an example, consider the following 12 (year, collection number) pairs, each corresponding to a record in the database assigned to a particular collector: (1999,1), (1999,2), (1999,3), (2000,3), (2000,4), (2000,5), (2000,6), (2001,6), (2001,7), (2002,8), (2003,9), (2003,10). There are 10 unique collection numbers (1-10; Rcol) and 2 pairs of records (Rinc) with the same collection number (3 and 6) but different years, so the homonymy index (Ih) for this collector is 2/10 = 20%.
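The worked example above can be reproduced with a minimal sketch. It assumes, following that example, that Rcol is counted as the number of unique collection numbers and that Rinc counts pairs of records sharing a collection number but differing in year; the function name and input layout are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def homonymy_index(records):
    """Compute Ih = Rinc / Rcol for one collector.

    `records` is an iterable of (collection_year, collection_number) pairs.
    Rinc: pairs of records with the same number but different years.
    Rcol: number of unique collection numbers (as in the worked example).
    """
    years_by_number = defaultdict(list)
    for year, number in records:
        years_by_number[number].append(year)
    r_inc = sum(
        1
        for years in years_by_number.values()
        for first, second in combinations(years, 2)
        if first != second
    )
    r_col = len(years_by_number)
    return r_inc / r_col if r_col else 0.0


# The 12 (year, collection number) pairs from the text.
example = [
    (1999, 1), (1999, 2), (1999, 3), (2000, 3), (2000, 4), (2000, 5),
    (2000, 6), (2001, 6), (2001, 7), (2002, 8), (2003, 9), (2003, 10),
]
print(f"Ih = {homonymy_index(example):.0%}")  # prints "Ih = 20%"
```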
On one hand, this index allowed us to evaluate whether the set of records was correctly assigned to each collector; a value of Ih < 5% was considered to represent acceptable quality for the elimination of inconsistencies in the information. On the other hand, the index facilitated the data cleaning process; for example, evaluating the change in the index before and after joining records of 2 collector names that were suspected to be homonyms. If upon joining the sets of records the index exceeded 5%, the names were proposed to be homonyms.
An example of how Ih was applied was between the collector names of 2 people (last names, given names): Miranda, A. and Miranda Moreno, Andrés Gelacio. “Miranda, A.” had an Ih value of 0.010, and “Miranda Moreno, Andrés Gelacio” had an Ih value of 0.007. Combining these 2 names under the assumption that they were a single collector increased the Ih value to 0.299. Given that this value was above 0.05, we decided to keep the names separate, supporting the hypothesis of homonymy.
In another case, the Ih values were used to test for homonymy among the following collector names: Castillo C., G. (Ih = 0.005), Castillo, G. (Ih = 0.001), and Castillo Campos, Gonzalo (Ih = 0.007). When these names were combined, considering them to represent a single person, the Ih value was 0.019, below the 0.05 cutoff value, so we rejected the hypothesis of homonymy and considered all 3 names to be variations of the same collector.
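The decision rule applied in these 2 cases can be sketched as a merge test: pool the records of the candidate names, recompute Ih, and compare it with the 0.05 cutoff. The sketch below is an assumption-laden illustration (hypothetical function names, toy data, Rcol counted as unique collection numbers), not the procedure as implemented in this work.

```python
from collections import defaultdict
from itertools import combinations

IH_CUTOFF = 0.05  # cutoff stated in the text


def homonymy_index(records):
    """Ih = Rinc / Rcol for (year, collection_number) records."""
    years_by_number = defaultdict(list)
    for year, number in records:
        years_by_number[number].append(year)
    r_inc = sum(
        1
        for years in years_by_number.values()
        for first, second in combinations(years, 2)
        if first != second
    )
    return r_inc / len(years_by_number) if years_by_number else 0.0


def likely_same_collector(records_a, records_b, cutoff=IH_CUTOFF):
    """True if pooling both record sets keeps Ih below the cutoff."""
    return homonymy_index(list(records_a) + list(records_b)) < cutoff


# Toy data: name_b continues the numbering of name_a (consistent),
# whereas name_c reuses numbers 1-50 in a different year (inconsistent).
name_a = [(1990, n) for n in range(1, 101)]
name_b = [(1991, n) for n in range(101, 201)]
name_c = [(2005, n) for n in range(1, 51)]

print(likely_same_collector(name_a, name_b))  # name variants of one person
print(likely_same_collector(name_a, name_c))  # probable homonyms
```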
We also analyzed the participation of the collectors in the generation of floristic knowledge at both the biome level and the most important families in their collection efforts. Finally, we introduce an online application to facilitate searches of the main results obtained.
Results
Scientific collections with records of vascular plants of the Mexican flora. The source databases on the vascular plants of Mexico analyzed in this work included a little over 3.7 million records (Table 1). The records were from 375 collections, both within and outside of Mexico. Table 2 shows the collections that contained more than 20,000 records in the database (herbaria abbreviations are according to Thiers, 2016). The most important was MEXU, with 3 times more records than the other collections of Mexican plants. In second place was the collection effort by the Mexican government, through the National Forest and Soil Inventory (Inventario Nacional Forestal y de Suelos), which is carried out to better document the plant diversity of the country’s forests. Much of that material is deposited in MEXU. In third place was the herbarium of the Missouri Botanical Garden (MO), which in collaboration with MEXU and the Royal Botanic Gardens at Kew (K) published a large inventory of Mesoamerican flora. It is important to note that of the 22 scientific collections cited in Table 2, 12 are in Mexico and 10 are located elsewhere. Currently, Mexican herbaria house most specimens, a trend that seemed unattainable as recently as a couple of decades ago.
Scientific collection | Total records | Number of records from main collectors | Percentage of records contributed by main collectors | Number of main collectors |
---|---|---|---|---|
National Herbarium of Mexico (MEXU; Herbario Nacional de México, Instituto de Biología, UNAM) | 1,079,972 | 898,630 | 83.2% | 550 |
Collections of the National Forest and Soil Inventory (INFyS; Recolectas del Inventario Nacional Forestal y de Suelos) | 272,222 | 270,123 | 99.2% | 60 |
Herbarium of the Missouri Botanical Garden (MO) | 234,033 | 208,754 | 89.2% | 454 |
Herbarium of the Institute of Ecology, Xalapa (XAL; Instituto de Ecología, Xalapa) | 210,148 | 177,702 | 84.6% | 372 |
Herbarium of the Institute of Ecology, Pátzcuaro (IEB; Instituto de Ecología, Pátzcuaro) | 161,396 | 142,430 | 88.2% | 373 |
Lundell Herbarium and University of Texas, Austin (LL-TEX) | 165,409 | 138,976 | 84.0% | 377 |
Herbarium of the National School of Biological Sciences of the IPN (ENCB; Herbario de la Escuela Nacional de Ciencias Biológicas, IPN) | 91,700 | 74,069 | 80.8% | 365 |
United States National Herbarium (US) | 80,463 | 71,666 | 89.1% | 261 |
Herbarium of the Center for Scientific Research of Yucatán (CICY; Centro de Investigación Científica de Yucatán) | 55,602 | 45,512 | 81.9% | 149 |
Herbarium of the New York Botanical Garden (NY) | 51,228 | 43,894 | 85.7% | 329 |
Aarhus University “Palm Transect Database” (AAU) | 50,847 | | 0.0% | |
Herbarium of the University of Arizona, Tucson (ARIZ) | 41,317 | 37,379 | 90.5% | 217 |
Herbarium of the Field Museum of Natural History, Chicago (F) | 42,394 | 34,828 | 82.2% | 100 |
Herbarium of the California Academy of Sciences, San Francisco (CAS) | 32,714 | 30,801 | 94.2% | 191 |
Herbarium of the Chiapas University of Sciences and Arts, Tuxtla Gutiérrez (HEM; Universidad de Ciencias y Artes de Chiapas, Tuxtla Gutiérrez) | 30,681 | 10,359 | 33.8% | 12 |
Herbarium of the Universidad Autónoma Chapingo, Texcoco (CHAP) | 24,010 | 19,016 | 79.2% | 206 |
Herbarium of the Autonomous University of Querétaro (QMEX; Universidad Autónoma de Querétaro) | 25,859 | 17,404 | 67.3% | 91 |
Herbarium of the Royal Botanic Gardens, Kew, England (K) | 21,600 | 18,603 | 86.1% | 235 |
National Plant Germplasm Bank (BANGEV; Banco Nacional de Germoplasma Vegetal) | 20,105 | 11,092 | 55.2% | 5 |
Geo B. Hinton Herbarium (GBH) | 20,501 | 4,873 | 23.8% | 4 |
Table 3 presents a summary of the main results of the cleaning process of the source databases. There was variation in the number of records depending on the action; for example, the collector’s name could only be assigned in 88.7% of the records after the cleaning process, leaving 11.3% of the records with uncertainty about the correct name of the collector. Thus, the number of records (discarding duplicates) for which the collector could be confidently identified and was a person rather than a collecting group or company was 1,998,740. The number of repeated (duplicate) records from a single collecting event was 878,290, which represents 30.5% of the total records assigned to a specific person.
Cleaning outcome | Records | Percentage |
---|---|---|
Angiosperms, gymnosperms, ferns, and allies | 3,704,664 | 100.0% |
Assigned a collector after cleaning | 3,286,348 | 88.7% |
“Collector name” field contained names of people | 2,877,030 | 77.7% |
Fields ‘Collector’-‘Collection number’ had unique value | 1,998,740 | 54.0% |
Repeated or duplicate records (% of records where the collector named was a person) | 878,290 | 30.5% |
Collector name field was empty or invalid | 208,950 | 5.6% |
Record was difficult to identify (e.g., a single surname or initials only) | 198,531 | 5.4% |
Collector not assigned after cleaning process | 418,317 | 11.3% |
Main collectors of Mexican flora. The analysis of the database allowed us to identify over 6,500 collectors (Table 4), of which 610 had 500 or more unique collection events attributed to them (without considering repeated records). It is estimated that a large portion of the effort to document Mexico’s natural capital composed of vascular plants is built on their exploration and collection efforts, represented by 1,658,608 records. In other words, about 9% of the collectors have collected 83% of the specimens documenting the great floristic richness of Mexico.
Type of collector | Number of collectors (% of total) | Records (% of total) |
---|---|---|
Main collectors (≥ 500 records in DB) | 610 (9) | 1,658,608 (83) |
Other collectors (<500 records in DB) | 6,052 (91) | 340,128 (17) |
Total | 6,661 (100) | 1,998,736 (100) |
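The proportions quoted for the main collectors follow directly from the counts in Table 4. The short sketch below simply recomputes them; the constant and function names are hypothetical, and the counts are copied from the table as printed.

```python
# Counts copied from Table 4.
MAIN_COLLECTORS = 610
TOTAL_COLLECTORS = 6_661
MAIN_RECORDS = 1_658_608
TOTAL_RECORDS = 1_998_736


def percentage(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


collector_share = percentage(MAIN_COLLECTORS, TOTAL_COLLECTORS)
record_share = percentage(MAIN_RECORDS, TOTAL_RECORDS)
print(
    f"{collector_share:.1f}% of collectors account for "
    f"{record_share:.1f}% of records"
)
```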
Appendix 1 contains the list of 610 main collectors, associated with information that generates a profile of each one. We describe some characteristics, such as the state where they did most of their collection activities, the herbarium or collection where most of their specimens are housed, the period in which they carried out their collecting activities, and the taxonomic family to which most of their collected specimens belong.
Among the list of main collectors, we identified only one set of possible homonyms: “Álvarez Álvarez, Armando”, with collection years ranging from 1939 to 1951 and from 1964 to 1993. The difference in these collection dates did not allow us to conclude whether this name applied to 1 or 2 collectors.
Table 5 shows some details about the main collectors identified. For example, it is interesting to highlight that 18% of these main collectors are women. It is also evident that there has been more interest from collectors in taxonomic groups within Magnoliophyta (monocotyledons and dicotyledons), less in the ferns and allies, and an even smaller proportion in gymnosperms. An equivalent proportion for all the taxonomic groups is observed; except for the gymnosperms, the other groups did not differ substantially in their participation in the collections of these collectors.
Characteristic | Number of main collectors |
---|---|
Main collectors | 610 (100.0%) |
Female | 111 (18.1%) |
Male | 499 (81.9%) |
Taxonomic group* | |
Ferns and allies | 154 (25.3%) |
Gymnosperms | 84 (13.8%) |
Monocots (Liliopsida) | 422 (69.3%) |
Dicots (Magnoliopsida) | 595 (97.7%) |
Author of species description | 225 (36.9%) |
Biome* | |
Seasonally dry tropical forest | 407 (66.8%) |
Humid tropical forest | 288 (47.3%) |
Humid montane forest | 151 (24.8%) |
Temperate forest | 471 (77.3%) |
Aquatic and subaquatic vegetation | 61 (10.0%) |
Xerophytic scrub and grassland | 308 (50.6%) |
Number of collections* | |
Herbaria with records from main collectors | 95 |
Number of years of collecting activity | |
1 to 9 years | 319 (52.4%) |
10 to 19 years | 152 (25.0%) |
20 to 29 years | 69 (11.3%) |
30 to 39 years | 51 (8.4%) |
40 to 49 years | 16 (2.6%) |
50 or more years | 2 (0.3%) |
The specimens collected by main collectors are deposited in 200 collections; however, 95 herbaria had records of 100 or more samples from 1 or more collectors. The National Herbarium of Mexico (MEXU) was the collection with the largest number of main collectors (559). Among them, 226 collectors had the largest number of their specimens in this collection, to which should be added another 60 main collectors that participated in the activities of the National Forest and Soil Inventory (INFyS), whose materials are deposited in MEXU. The collection with the second most important set of collectors was MO (454 collectors), followed by LL-TEX (377), IEB (373), XAL (372), and ENCB (365).
Upon analyzing the biomes in which the main collectors did their work, it can be observed that the largest number of them have focused their work on temperate forests, followed by seasonally dry tropical forests and xerophytic scrub. Probably due to the small area they cover, humid montane forests have not been as widely explored by this group of collectors, which is reflected in the lower activity of many of them in this biome. Similarly, in constituting a specific habitat type, only a small number of main collectors focused on intensive collection of aquatic and subaquatic vegetation.
In evaluating how long this set of collectors has been dedicated to collecting activities, most of them dedicated less than a decade to this occupation. About 25% of the collectors collected for over 2 decades, and a much lower proportion of them participated in collecting efforts for 3 decades or more. In this respect, Dr. Jerzy Rzedowski stands out, having been the most noteworthy botanist in many respects, including as a taxonomist, mentor, administrator of a research institution, herbarium manager, etc. He has contributed specimens to one of the most important collections of Mexican plants at a national level for more than 50 years. Other collectors who have made great contributions to the knowledge of botanical richness, especially in the northwestern part of the country, include Dr. Richard S. Felger (University of Arizona), who also spent nearly 50 years exploring that part of the country collecting botanical specimens. The Hintons (father, son, and grandson) have also been collecting for more than 50 years (Hinton et al., 2019).
Collectors by state. The number of main collectors differed among states. For example, Tlaxcala has only 9 main collectors that have stood out for their collection efforts (with more than 100 records in the state). Similarly, there were only 12 main collectors that stood out in Aguascalientes, 20 in Colima, and 21 in Morelos. In contrast, the states with the highest levels of activity by main collectors were Oaxaca (254), Veracruz (231), Chiapas (203), Jalisco (123), Puebla (120), Michoacán (114), and Guerrero (108). In general, the mode was between 30 and 40 main collectors per state. This is reflected in a higher number of records (collection events), since Tlaxcala had only 5,510 records, while Oaxaca surpassed that number by nearly 40-fold (Fig. 1).
Collectors and collection decades. The frequency by decade of specimens collected compared to the frequency of the main collectors identified shows similar tendencies (Fig. 2). Special attention should be paid to the 1980s, when Mexican institutions had funding specifically earmarked for botanical exploration, as well as the 2010s, when there was intense collection activity by the National Forest Inventory program, which recorded around 300,000 specimens.
Figure 3 shows the distribution over time of the activity of main collectors in Mexico. It is notable that there was intense activity beginning in the 1950s, when Mexican institutions began intense exploration of the territory and became involved in a larger number of floristic inventories.
Distribution of collection events by family. There was a linear relationship between the number of collection events and the number of main collectors using the number of records per family (Fig. 4). This significant relationship supports the cutoff points used in the selection of the collectors analyzed (that is, 500 specimen records to consider a collector a “main collector” and 100 to consider the collector important for a family in this analysis). In the database, we recorded 277 families of flowering plants, including both native and introduced species; this number represents 67% of the 416 families recognized by the Angiosperm Phylogeny Group (APG IV, Chase et al., 2016). Of these figures, main collectors documented specimens for 272 of these families (98% of the total recorded).
The line in figure 4 suggests that the number of records is strongly linearly correlated with the number of main collectors per family, with a slope of 516.3. For example, for the family Poaceae, we identified 219 main collectors who have collected 100 or more specimens from that family, which multiplied by 516.3 yields a total figure of 113,070 records, a number that is close to the total records documented for the family in the database (SNIB+MEXU). Another implication of this significant positive correlation is that the database analyzed is representative of the botanical collections in terms of the collectors and the taxonomic families in Mexico. For the 277 families recorded in the database, only 133 of them are associated with at least one main collector that has collected a minimum of 100 specimens from the family. In contrast, for more than 160 families, such as Aristolochiaceae or Portulacaceae, no main collector reached at least 100 specimens.
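The Poaceae calculation above amounts to multiplying the number of main collectors for a family by the fitted slope. A minimal sketch, assuming a line through the origin (the text reports only the slope, so no intercept is used here) and a hypothetical function name:

```python
SLOPE_RECORDS_PER_COLLECTOR = 516.3  # slope reported for Figure 4


def expected_family_records(main_collectors, slope=SLOPE_RECORDS_PER_COLLECTOR):
    """Estimate records for a family from its number of main collectors.

    Assumes a linear relationship with no intercept term.
    """
    return main_collectors * slope


# Poaceae: 219 main collectors with 100 or more specimens of the family.
poaceae_estimate = expected_family_records(219)
print(round(poaceae_estimate))  # about 113,070 records, as in the text
```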
Homonymy index. The value of this index increased due to records that were missing values in the collection number field or to records that had inconsistencies between the collection number and collection year fields. The latter could be due to, among other causes, mis-assignment of the record to a particular collector. More than half of the main collectors (321) were associated with a homonymy index of less than 1%, while only 7.1% (43) had a value above 10%, which in most cases was due to records that were missing collection numbers. An exhaustive review of these 43 collectors allowed us to decide which of these 2 causes led to the high value and, where records had been mis-assigned, to correct the assignments.
Another point for accessing the list of main collectors is available on the website db-Dalea (www.colectores.abaco2.org; Fig. 5), which was generated specifically to allow the public to consult these data. There, the public can obtain lists by family, state, year, or collecting period, in addition to allowing specific searches by combining criteria among the 4 associated dimensions. An abbreviated version is available on the website AbaTax (www.abatax.abaco2.org) to create identification keys where they can be consulted in a streamlined way as a dynamic list (Murguía-Romero et al., 2021).
Discussion
The results indicate that less than 10% of the collectors contributed more than 80% of the collection effort, which may seem like an overestimation. It must be noted that it was not possible to clean the records completely; for about 11% of the records in the database, the names of the collectors remain uncertain. Our estimations suggest that the total number of collectors is surely higher than previously reported, perhaps exceeding 7,000.
Considering a relatively small set of collectors (610) with respect to the total identified (6,661) may provide advantages for future floristic and taxonomic studies. Once the information associated with these main collectors is better known, data cleaning will require less effort if first focused on this group of main contributors, then on other collectors. It is likely that many of the collectors that could not be identified during the cleaning process are members of local institutions, who once they are identified by curators of the databases of the collections at those institutions, will facilitate the process of data cleaning and normalization.
Several collections in Mexico still have low percentages of digitalization or inclusion in public databases, as is the case of the National School of Biological Sciences of the IPN (ENCB) and the Herbarium of the Mexican Association of Orchidology (AMO). Surely if these collections are considered in more detail, there will be other collectors that could be incorporated into the list of main collectors.
Although there are tools available on the internet to consult lists of collectors, to date none presents an orderly and clean format of names of the people and associated attributes. For example, the list included on the “Harvard University Herbaria & Libraries” website (https://kiki.huh.harvard.edu/databases/botanist_search.php), although useful in some contexts, is not useful for finding collectors from Mexico, since in addition to the inclusion of several homonyms and exclusion of many important Mexican collectors, the names are not associated with important attributes such as the number of specimens or geographic region below the country level. Penn et al. (2018) assigned between 10,000 and 15,000 collectors for Mexico during the second half of the last century; comparing that number with the fewer than 7,000 identified in this work, it seems that that number may be an overestimation, although it should be considered that in this work, we did not count the total collectors, excluding nearly 11% of the records, mainly of collectors that were not well represented in the database.
The informatics tool presented in this work provides a way to generate lists of main collectors in Mexico by state, family, or a combination of both. It thus summarizes the main data in the set of records associated with each main collector.
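The kind of query described above can be illustrated with a minimal sketch. It is not the tool presented in this work; the record fields (`collector`, `state`, `family`), the function name `main_collectors`, the threshold parameter, and the toy records are all assumptions made for illustration.

```python
from collections import Counter


def main_collectors(records, min_records, state=None, family=None):
    """Rank collectors by number of records, optionally filtered by
    state and/or family, keeping those with at least `min_records`.

    `records` is a list of dicts with hypothetical keys
    'collector', 'state' and 'family'.
    """
    counts = Counter(
        r["collector"]
        for r in records
        if (state is None or r["state"] == state)
        and (family is None or r["family"] == family)
    )
    # Most records first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(name, n) for name, n in ranked if n >= min_records]


# Hypothetical toy records, not data from this study.
records = [
    {"collector": "Collector A", "state": "Chiapas", "family": "Orchidaceae"},
    {"collector": "Collector A", "state": "Chiapas", "family": "Fabaceae"},
    {"collector": "Collector A", "state": "Veracruz", "family": "Orchidaceae"},
    {"collector": "Collector B", "state": "Chiapas", "family": "Orchidaceae"},
    {"collector": "Collector B", "state": "Sonora", "family": "Fabaceae"},
    {"collector": "Collector C", "state": "Chiapas", "family": "Orchidaceae"},
]

by_state = main_collectors(records, min_records=2, state="Chiapas")
by_state_and_family = main_collectors(
    records, min_records=1, state="Chiapas", family="Orchidaceae"
)
```

With real data, the threshold would be whatever value defines a main collector, and the filters would be applied to cleaned, normalized collector names.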
Rzedowski et al. (2009) report that approximately 332 collectors participated in collection efforts between 1700 and 1930; 34 of them, each with 500 or more collection records, are included in this study. Unfortunately, those authors did not indicate the total number of samples collected by each of those collectors, so further comparisons are difficult.
Approximately 18% of the main collectors identified in this study are women, highlighting the important contribution women have made to the knowledge of the flora of Mexico. However, this proportion is likely an underestimate due to the combination of 2 factors: first, the method used in this work assigned each record only to the first collector noted on the label; second, in mixed-gender teams the male partner is more often noted first, regardless of the relationship between the partners (known as the Matilda effect). This ordering is frequent: the husband is noted before the wife, the father before the daughter, the brother before the sister. Women begin to appear as first collectors in the first decades of the twentieth century, and their participation has increased recently. An analysis that also considers the second collector noted will likely reveal that the participation of women collectors has been greater than reported in this work.
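The effect of counting only the first collector can be illustrated with a small sketch. It assumes, hypothetically, that the collector field lists team members separated by semicolons and that a curated set of names of women collectors is available; the names and labels below are invented, and the function names are illustrative only.

```python
def split_team(collector_field, sep=";"):
    """Split a collector field into individual names (assumed separator)."""
    return [name.strip() for name in collector_field.split(sep) if name.strip()]


def share_with_women(labels, women, position=None):
    """Fraction of labels that credit a woman.

    position=None considers every name on the label;
    position=0 only the first collector, position=1 only the second.
    """
    hits = 0
    for label in labels:
        team = split_team(label)
        considered = team if position is None else team[position:position + 1]
        if any(name in women for name in considered):
            hits += 1
    return hits / len(labels)


# Hypothetical labels and names, not data from this study.
labels = [
    "B. Dos; A. Uno",   # woman noted second
    "A. Uno",           # woman collecting alone
    "B. Dos",
    "D. Cuatro; C. Tres",  # woman noted second
]
women = {"A. Uno", "C. Tres"}

first_only = share_with_women(labels, women, position=0)
second_only = share_with_women(labels, women, position=1)
any_position = share_with_women(labels, women)
```

In this toy example the first-collector rule credits women on a quarter of the labels, whereas counting any position credits them on three quarters, which is the direction of bias discussed above.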
Another important aspect of having a clean list of collectors is that it facilitates the future capture of specimen data in biological collection databases. A catalogue linked to a selectable list in the data entry interface would allow the person entering the data to choose an existing element. This would streamline the process by reducing the time spent capturing this important field, and it would keep the list of collectors clean from the moment of entry, without the need for a further data cleaning and normalization process.
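A catalogue-backed lookup of this kind might be sketched as follows. This is only an illustration under stated assumptions: the catalogue structure (a canonical name mapped to known variants), the matching rule (ignoring case, periods, and commas), and the example name are hypothetical and do not describe Appendix 1 or any existing database.

```python
import re


def name_key(name):
    """Comparison key: drop periods and commas, ignore case and extra spaces."""
    cleaned = re.sub(r"[.,]", " ", name).casefold()
    return " ".join(cleaned.split())


def build_index(catalogue):
    """Map every known spelling (canonical or variant) to its canonical name."""
    index = {}
    for canonical, variants in catalogue.items():
        for spelling in (canonical, *variants):
            index[name_key(spelling)] = canonical
    return index


def resolve(name, index):
    """Return the canonical name, or None if the name needs a curator's review."""
    return index.get(name_key(name))


# Hypothetical catalogue entry, not taken from Appendix 1.
catalogue = {"A. Ejemplo": ["Ejemplo, A.", "A Ejemplo"]}
index = build_index(catalogue)

resolved = resolve("ejemplo,  a", index)
unresolved = resolve("Z. Desconocido", index)
```

Returning `None` for unmatched names, rather than guessing, leaves the decision to add a new collector to the person curating the catalogue.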
The bulk of the activity of the main collectors identified in this work is concentrated in certain regions, especially the southern and southeastern parts of the country. Few main collectors worked in the north of the country, and most of them were not Mexican. The proximity of the northern part of the country to research centers in the USA is reflected in the more intense collecting activity there by collectors from the USA. For example, of the 21 main collectors that collected in Sonora, 12 (57%) are of foreign nationalities, compared to only 21% of the main collectors in Chiapas and 25% in Veracruz.
The collector’s name is an important field in biodiversity databases. This datum can be used to estimate species representativeness, as well as the feasibility of using or excluding records from biological collections to estimate the abundance or structure of the communities where the species grow (Steege et al., 2011). In addition, a homogeneous and standardized list of the collectors of Mexico is important to facilitate the citation of specimens by botanical researchers, since the primary key used to document a variety of data, for example in a taxonomic revision, is the name of the collector together with the collection number.
We hope that this contribution will lead to better information on vascular plant collectors in Mexico in the near future. It should facilitate the homogenization of collector names among the different databases that are being generated and those that have already been made public. The names included in Appendix 1 should allow the standardization of the information in most of the records associated with them. Future stages that integrate other collectors who have contributed to the formation of the natural capital we possess will allow access to better information in the many studies of Mexican biodiversity that are currently in progress or will be carried out in the future.