1. Introduction
The nature of scientific explanation in biology has been extensively discussed among philosophers of biology for the last 30 years (Braillard and Malaterre 2015). The rise of the new mechanistic philosophy at the beginning of the twenty-first century was a small revolution within this area of research (Machamer et al. 2000; Glennan 2017; Glennan and Illari 2018). According to the new mechanistic philosophy, explanations in the biological domain are causal-mechanistic: in order to explain a certain phenomenon, we need to describe the mechanism that is responsible for it, with its entities, activities and their causal connections. Thus, it is widely accepted that explanations in biology, at least in experimental fields such as molecular biology or physiology, must be causal and mechanistically detailed (Brigandt 2013).
Despite the wide acceptance of the mechanistic philosophy as a way to address many philosophical problems, such as the metaphysical nature of causation, the relation between laws, mechanisms and counterfactuals, the nature of biological functions, emergence, reduction, natural kinds and, in particular, explanation (Glennan and Illari 2018), some voices have raised criticisms of the possibility of extending mechanistic explanation to certain biological fields that do not seem to fit such an account (Braillard and Malaterre 2015). One of the underlying debates is whether there are non-causal explanations in biology (Deulofeu et al. 2021; Mekios 2015; Huneman 2010; Moreno and Suárez 2020; Walsh 2015; Halina 2018). Appealing to non-causal explanations in biology thus opposes both the wide application of the mechanistic account and the general claim that scientific explanations must be causal.
Braillard and Malaterre (2015), acknowledging the enormous success of the new mechanistic philosophy in addressing and analyzing the nature of scientific explanation in biology, argue that current research on biological explanation should be framed by the following four questions: i) Are there natural laws in biology? ii) Does causation play a specific explanatory role in biology? iii) Are there other types of explanation needed?1 iv) Does mechanistic explanation fulfil all expectations? Or, in other words, can we account for certain types of explanation in biology being non-causal? (2015, p. 14).
This paper is a contribution to question iii), and it is framed within a particular debate over the nature of evolutionary explanations. The debate arose within the broader dispute between the causal interpretation of evolutionary theory as a theory of forces (Sober 1984) and the statistical interpretation (Ariew et al. 2015; Walsh 2000, 2007; Walsh et al. 2002, 2017). The debate is largely metaphysical, for the two sides ultimately disagree on the level at which causal relations can be identified. The causal interpretation contends that evolutionary theory is a theory of forces, and that natural selection, one of the main forces of evolution (jointly with migration, mutation and drift), is a population-level causal process. Defenders of the statistical interpretation (the Statistics, henceforth) sustain the opposite claim: causal relations in evolution occur only at the individual level, while selection is just a statistical property of living populations (Walsh 2000). Although metaphysical, the debate has important implications for the nature of evolutionary explanations. According to the causal (or dynamic) interpretation, explanations appealing to natural selection and drift are causal explanations. The Statistics argue otherwise, namely, that natural selection explanations are statistical and non-causal.
The goal of this paper is twofold. First, it aims at presenting the ontological debate between the Causalists and the Statistics regarding the nature of natural selection. Second, it aims at arguing for the existence of statistical explanations in evolutionary biology, showing at the same time that they fit Huneman’s notion of structural explanation. Using Huneman’s scheme will help us identify those features that provide explanatory force to the explanations we target.
In section 2, I introduce the two philosophical interpretations of evolutionary theory, the causal and the statistical, and argue that the statistical characterization of natural selection and drift as statistical phenomena is more appealing, although the debate is still far from being settled.
In section 3, I present a twofold interpretation of natural selection which could serve as a way out of the debate.
In section 4, I sketch the mathematical structure of population genetic models, used to explain dynamic changes in trait structure of populations (the definition of evolution according to the Modern Synthesis).
In section 5, I show how statistical explanations from the Modern Synthesis work and claim that we must accept the existence of autonomous statistical explanations, which rather than appealing to first-level causes of individual lives, deaths and reproductions, appeal to statistical features of populations. I claim that these explanations fulfil Huneman’s conditions for being considered structural explanations.
2. Dynamical and Statistical Interpretations of Evolutionary Theory
The Modern Synthesis, an elegant, very powerful and highly successful theory of evolutionary change, was the result of merging Darwinian natural selection theory with Mendelian genetics.2 Its foundations go back to the works of a few evolutionary biologists from the 1930s on. Fisher and Wright established the basis of population genetics; Dobzhansky performed extensive experiments on populations of Drosophila, confirming Fisher and Wright’s works; Mayr created models of how speciation was possible; and Simpson integrated paleontological ideas with the Modern Synthesis.3
Within the context of the philosophy of biology, the Modern Synthesis Theory has mainly been analyzed as a theory of forces. One of the main reasons for the wide acceptance of this stance is Elliott Sober’s seminal book The Nature of Selection (1984). In this book, Sober draws an analogy between Newtonian mechanics and evolutionary theory.4 Natural selection and drift are seen as the main forces that drive evolutionary change at the level of populations. The former is supposed to be a deterministic process whereas the latter is supposed to be a probabilistic process. Both are seen as the main causes of evolution of populations (jointly with migration and mutation), thus, evolutionary explanations are supposed to be causal, for they appeal to forces. Walsh and colleagues (2002) named this causal account the “dynamic interpretation of evolutionary theory”. Sober expresses his ideas as follows:
In evolutionary theory, the forces of mutation, migration, selection and drift constitute causes that propel populations through a sequence of gene frequencies. To identify the cause of a population’s present state [...] requires describing which evolutionary forces impinged. (1984, p. 191)
Despite the wide success of the dynamic interpretation amongst philosophers of biology, there is a small group of philosophers, whom we can call the Statistics, who argue that natural selection and drift are not forces driving the evolution of populations (Ariew et al. 2015; Matthen and Ariew 2002; Walsh 2003, 2007; Walsh et al. 2002; Walsh et al. 2017). According to the Statistics, selection and drift are properties of “the statistical structure of a population”. Thus, evolutionary theory is not a theory of forces, and population genetics (the discipline that analyzes changes in the genetic distribution of populations), which provides selection and drift explanations, does not appeal to population-level causes but to statistical properties of populations (Walsh 2015; Ariew et al. 2015).
Both approaches accept that evolutionary theory explains change in the trait structure of a population (i.e., evolutionary change) by differentiating certain phenomena of populations such as selection, drift, mutation and migration. However, they disagree about the kind of explanation provided (causal or statistical) and about which causal processes exist.5
In what follows we introduce two considerations that give a clearer picture of the philosophical discussion. Section 2.1 introduces an example, drawn from Walsh and colleagues (2002), that works as an argument to show that explanations appealing to natural selection do not rely on forces to obtain their explanatory force. Section 2.2 introduces the advantages of the statistical account in dealing with the problem of distinguishing between selection and drift.
2.1. Two Experimental Set-ups: Feathers and Coins
Evolutionary theory understood as a theory of forces must provide the means to identify, in cases where traits in a population are modified, which forces impinge on the population and how much each force contributes to the evolutionary change. Selection and drift are thus seen as forces that propel population change, but on many occasions it is difficult to tell whether selection is acting alone, drift is acting alone, or both are taking place.6
In contrast, the Statistics contend that evolutionary theory is thoroughly probabilistic: selection is seen as “discriminate sampling” and drift as “indiscriminate sampling” (Beatty 1984; Walsh 2007). This and other statistical examples, such as the idea of a “blindfolder drawing balls from an urn”, are supposed to highlight that natural selection is a statistical property of a population (Walsh et al. 2002). In Walsh and colleagues’ own words, “[selection and drift are] statistical properties of an assemblage of trial events: births, deaths and reproduction” (2002, p. 453). Forces causing changes, according to the Statistics, can be found only at the level of individuals, and they are neither selection nor drift; they are just causal interactions between members of a population.
The following two experimental set-ups, presented by Walsh and colleagues (2002), illustrate the issue at stake. Imagine, on the one hand, a feather dropped from a height of 1 m and, on the other hand, ten coins drawn at random from a group of 1000 coins (500 of them heads up, 500 tails up). The outcome of the first experiment will be the position where the feather hits the ground. The outcome of the second will be a certain distribution of heads and tails. Both results can be predicted, and in both cases some error may arise between the prediction and the final outcome. Expectation and error, however, are very different in the two examples. The forces that move the feather, mainly the gravitational force, help us predict its final position, while other forces acting at random, such as those caused by the unexpected motion of air molecules, generate the deviation from the expected final position. With the coin set-up things are different. The resulting distribution of heads and tails is not predicted by the action of any force, but by “taking into account the structure of the population being sampled” (2002, p. 454). The laws of probability give us the probability distribution of the possible heads-tails outcomes. Thus, while in the first example we explain the position of the feather by appealing to certain forces, the outcome of heads and tails is explained by appealing to the statistical structure of the population of coins. Accordingly, the error in the first experiment comes from the fact that we neglect some of the forces acting because we do not have access to them.
The error in the coin set-up, by contrast, is just a statistical deviation from the mean. The expected value and the error rate are consequences of the experimental set-up: an array of independent trial events. They are not consequences of the presence of forces acting on the coins, so there is no way to distinguish the causes of the error from the causes of the expected result. Error is not a consequence of our ignorance of the forces at stake.
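The point that expectation and error follow from the structure of the sampled population alone can be made concrete with a short calculation. The following is only a minimal sketch, assuming that the ten coins are drawn without replacement, so that the number of heads follows a hypergeometric distribution; the function and variable names are illustrative and are not taken from Walsh and colleagues (2002).

```python
from math import comb, sqrt

def hypergeom_pmf(k, population, heads_in_population, draws):
    """Probability of exactly k heads when `draws` coins are taken
    without replacement from the population (illustrative helper)."""
    tails_in_population = population - heads_in_population
    return (comb(heads_in_population, k)
            * comb(tails_in_population, draws - k)
            / comb(population, draws))

# Structure of the population described in the text: 1000 coins, 500 heads up.
POPULATION, HEADS, DRAWS = 1000, 500, 10

pmf = [hypergeom_pmf(k, POPULATION, HEADS, DRAWS) for k in range(DRAWS + 1)]

# Expected number of heads and its standard deviation (the "error"),
# both computed from the population structure; no forces are mentioned.
expected_heads = sum(k * pk for k, pk in enumerate(pmf))
spread = sqrt(sum((k - expected_heads) ** 2 * pk for k, pk in enumerate(pmf)))

print(f"expected heads: {expected_heads:.3f}, standard deviation: {spread:.3f}")
```

Nothing in the calculation refers to the dynamics of any individual coin: both the expected outcome and the size of the typical deviation are fixed by the composition of the population and the number of draws.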
The two different experimental set-ups require distinct types of explanations. The explanation of the final position of the feather requires a “dynamical theory”, a theory of forces, while the explanation of the outcome of a series of trials requires a statistical theory, “a theory that deals with the statistical structure of the population in question and the probabilistic nature of sampling” (2002, p. 455).
At this point, a causalist could argue that it is possible to provide a dynamical explanation of the result of each coin, and then explain the result of the 10 trials by explaining each one. However, this explanation does not remove the need for a statistical explanation. A dynamical account cannot tell us why, given the population and its initial conditions, the final outcome (5 heads and 5 tails) is to be expected. Nor does it explain the connection between the 10 trials, the 20 trials, the 40 trials, etc., that is, why the probability distribution has the same form (and, for instance, why the distribution 8 heads-2 tails in a toss of 10 coins is more probable than 80 heads-20 tails in a toss of 100 coins). Accordingly, dynamical and statistical theories differ in structure, in the explanations they provide and in their target explananda.7
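The comparison between 10 and 100 tosses can be checked directly. The sketch below assumes fair and independent tosses, so that the number of heads is binomially distributed; this is an idealisation introduced here for illustration, and the helper name is hypothetical.

```python
from math import comb

def binomial_pmf(heads, tosses, p_heads=0.5):
    """Probability of exactly `heads` heads in `tosses` independent tosses."""
    return comb(tosses, heads) * p_heads ** heads * (1 - p_heads) ** (tosses - heads)

# The same 80/20 proportion of heads is far less probable in the larger sample.
p_8_of_10 = binomial_pmf(8, 10)
p_80_of_100 = binomial_pmf(80, 100)

# The most probable outcome in 10 tosses is the expected one (5 heads, 5 tails).
p_5_of_10 = binomial_pmf(5, 10)

print(f"P(8 heads in 10 tosses)    = {p_8_of_10:.6f}")
print(f"P(80 heads in 100 tosses)  = {p_80_of_100:.3e}")
print(f"P(5 heads in 10 tosses)    = {p_5_of_10:.6f}")
```

The difference between the two probabilities comes entirely from the number of trials, which is the kind of fact that a coin-by-coin dynamical story does not address.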
The statistical set-up allows the Statistics to draw an analogy between the coins and natural selection in a population, arguing that selection is a statistical phenomenon insofar as it focuses on a “sequence of trial events: births, deaths and reproductions” (Walsh et al. 2002, p. 455). The analogy with the coin experiment allows the Statistics to claim that drift cannot be understood as a dynamical error but a statistical error.
In the following section we show how the impossibility of distinguishing selection from drift (conceptually and empirically) in many actual cases constitutes an advantage for the statistical account.
2.2. Selection and Drift
It is often the case that selection and drift happen at the same time in evolving populations. Sometimes though, the problem is to tell whether both are in play when a population is evolving, or only one of them.
John Beatty (1984), aware of this problem, argued that it is conceptually very difficult to distinguish selection from drift but, as we will show, the problem arises only for those who embrace the causal approach.
Beatty illustrates his position with the following scenario (1984). Imagine a population of moths, some of them light and the rest dark. The moth population inhabits a forest with light and dark coloured trees, with a higher proportion of dark trees. Given the environment, population biologists assign dark moths a higher fitness, for they have more chances of landing on dark trees and so of being better camouflaged from predators. As a consequence, the subpopulation of dark moths is expected to increase over time. However, it turns out that, as a matter of chance, dark moths land more often on light trees, and thus the proportion of dark moths in the population decreases. Beatty wonders whether selection and drift are both actually happening in this scenario, because the situation seems blurry. He claims that it would be odd to say that dark moths decreased due to drift while light moths increased due to natural selection. Nor would it make sense to claim that selection was acting alone without drift, or the other way around, because the results deviate from the prediction. Beatty concludes that, given the deviation from the outcome expected by population biologists, drift must explain the change in trait frequencies in the moth population. However, he contends that natural selection, by means of predation pressure, is what actually produces the change in trait frequencies, so both selection and drift are appealed to. This seems an undesirable conclusion, though, and it is the result of the underlying causal account of evolutionary theory. If, as Beatty says, we claim that predation pressure, as selection, is responsible for the unexpected change in trait frequencies in the moth population, then there is no explanatory role left for drift: since selection is seen as a force that causally determines the deaths and survivals of individual organisms, it completely explains the changes in the population’s structure.
There seems to be a contradiction in appealing to both selection and drift.
However, if we hold that selection explains the evolution of populations by appeal to differences in trait fitness, that is, to a statistical property of a population, the answer to the former question (drift or selection?) is rather straightforward. If the outcome deviates from the prediction that population biologists make on the basis of trait fitness, then drift is doing the explanatory work: there is statistical error. If the final distribution of the trait types is as population biologists predicted on the basis of trait fitness, then selection is doing the explanatory work.
The Statistics’ strategy for dealing with this issue, namely appealing to trait fitness as a statistical property of the population being sampled, is much more compelling and less problematic.8 If we are concerned neither with population-level causal processes nor with providing an etiological narrative, and merely contend that, given a population of living beings and a series of trial events (births, deaths and reproductions), we can predict the evolution of the population with respect to a specific trait by appealing to trait fitness, then the solution to Beatty’s scenario is less demanding from a metaphysical standpoint. In Walsh and colleagues’ words: “the conceptual distinction between drift and selection can be drawn without requiring the metaphysical distinction between forces that cause drift and those that cause selection” (2002, p. 465).
As a consequence, we contend that, once drift is understood as statistical error, natural selection is better interpreted as a statistical property of a population than as a population-level causal process.9
Sections 2.1 and 2.2 suggest that the statistical interpretation of natural selection is more appealing, mainly because it fits better with the statistical interpretation of drift. However, these are not, in my view, conclusive arguments that allow us to close the debate. In fact, Ariew and colleagues (2015) claim that the presence of autonomous statistical explanations in population genetics favours but does not imply the ontological thesis defended by the Statistics. In the following section we introduce a possible way out of the debate.
3. A Possible Way Out
Before moving to the analysis and characterisation of statistical explanations, we should mention a distinction which seems to me extremely relevant at this point. Walsh (2007), drawing on Grene’s paper (1961), highlights that there are two dimensions of natural selection, and that this distinction goes back to Fisher’s theorem of natural selection. In her analysis of Fisher’s theorem, Grene contends that we should distinguish genetic selection from natural selection in the following way. The traditional theory of natural selection (Darwin 2004) states a relation between a trait and the ability of its bearers to survive and reproduce in a particular environment, that is, it seeks a causal relation between bearing a trait and being better adapted to the environment. Fisher’s theorem of natural selection, by contrast, appeals to genetic selection, which is basically statistical. Thus, according to Grene, we must distinguish between genetic selection, which is statistical and is depicted by Fisher’s theorem, and Darwinian selection, which is causal and environmentally based. Grene argues that only the latter is actually natural selection. She contends that, given the concepts Fisher defines, his theorem is not a theorem of natural selection but “a statistical device for recording and predicting population change” (1961, p. 30). I agree that Fisher’s theorem is a statistical device, but not that only Darwinian selection counts as natural selection. We have to take into account both dimensions of selection, and not just identify selection with the causal interpretation. I will support my position by drawing on Walsh’s response (2007). First, according to Walsh (2007), Fisher’s theorem provides the statistical apparatus10 of the Modern Synthesis that “seeks to explain changes in gene frequencies by appeal to statistical structures of populations” (2007, p. 301). Why then should we consider selection (and drift) as objective features of biological populations?
Walsh claims that if we argue that selection is a cause of population change, then “the dynamical interpretation conflates the causal study of evolutionary processes with the statistical study of their effects” (2007, p. 302). This is why we must distinguish Darwinian selection from Modern Synthesis selection (genetic selection in Grene’s terms). In general, the Statistics do not deny that evolutionary biology is partly causal, and they agree that we can definitely talk about the causal study of evolutionary processes. They accept that differences in individuals’ propensities to survive and reproduce cause changes in the structure of populations; indeed, they acknowledge that much biological work is devoted to studying the individual-level causes of population change, and Walsh, for instance, is even open to calling this process natural selection (as it resembles the way Darwin used the term). However, Walsh (2007) points out that the Modern Synthesis theory of evolution is not concerned with the identification of individual-level causes of population change (Darwinian selection). The Modern Synthesis, as we have already mentioned, is concerned with explaining changes in population structure by appeal to statistical properties of populations only (Modern Synthesis selection, MS selection henceforth).
According to Walsh, it is essential to clearly demarcate these two differentiated endeavours. In his words:
The causal study of evolution involves an investigation of those mechanisms that cause differential death, survival, and reproduction, and crucially those that secure the high fidelity of inheritance, and the capacity of individuals to produce, sustain, and pass on adaptively significant phenotypes. If the statistical interpretation is correct, the concepts of selection and drift embodied in the modern synthesis theory of evolution play no role in the causal study of evolution. (2007, p. 302)
The distinction discussed above might be a way out of the ontological dispute, since it would turn it into a merely verbal one. As some defenders of the statistical interpretation have mentioned (Ariew 2003; Walsh 2007), we might need both kinds of explanation: population-level explanations in terms of MS selection (appealing to trait fitness), and causal explanations in terms of Darwinian selection, which tell us why an individual bearing a certain trait has higher fitness than individuals in the same population that do not bear that trait (individual fitness). Both are essential to provide a full story of evolution, and what the Statistics are doing here is focusing on MS selection.
In what follows we introduce Modern Synthesis explanations of change in the trait frequencies of populations, in order to show that those explanations do not appeal to any causal information, nor do they use causal language.
4. Modern Synthesis and Population Genetics
Population genetics is the discipline which made it possible to merge Darwinian selection with Mendelian ideas about inheritance, and it is at the core of the Modern Synthesis theory of evolution. The main mathematical models of population genetics have their roots in the works of Fisher, Wright and Haldane between the 1920s and 1940s, and although their respective stances are slightly different, their main achievement was to fit Darwinian selection and Mendelian inheritance together mathematically. At that time, the beginning of the twentieth century, there was a heated discussion about the possibility of merging the two theories; in particular, several Mendelian biologists rejected the gradual picture of evolution depicted by Darwin and supported by biometricians, and claimed that novel mutations could arise in just one step (Provine 1971). The success of Fisher, Wright and Haldane was due to their ability to build mathematical models that described the dynamics of the evolution of a population affected by selective pressures and following the Mendelian principles of inheritance. Evolution, according to population genetics, thus consists in changes in the frequencies of specific genotypes in a population, due to factors such as selection, mutation and migration and/or to the random phenomenon of genetic drift.
Moreover, the Modern Synthesis carried out an important change in population thinking. While Darwin talked about “assemblages of individual organisms” in the study of evolution, the Modern Synthesis “cast biological populations as ensembles of abstract types, commonly gene types” (Walsh 2019, p. 226). The study of evolution given by the Modern Synthesis, Walsh claims, “is the study of the kinematics of these ensembles. The principal virtue of this version of population thinking lies in its capacity to account for evolutionary change without having to advert to the complex, multitudinous properties of individuals” (2019, p. 227).
In this section, we will introduce the main tenets of population genetics models.11 The presentation will be rather sketchy, because we only need to give a general idea of what the mathematical models of the Modern Synthesis are about. As we will show, in order for the models to work, some simplifying and idealized assumptions have to be accepted, such as the idea that individuals in a population mate at random, that populations are infinite, or that generations are non-overlapping (namely, that individuals die once they have produced offspring).
The common feature of all population genetic models is the Hardy-Weinberg principle. Suppose we have a population of diploid, sexually reproducing organisms (diploid organisms contain two copies of each chromosome). When such organisms reproduce sexually, their gametes are haploid, so that each of the two parents provides one copy of each chromosome to form a diploid zygote (animals and some plants follow this pattern). Suppose now that we focus on a specific locus at which there are two possible alleles, $A_1$ and $A_2$, so that there are only three possible genotypes: two homozygotes, $A_1A_1$ and $A_2A_2$, and one heterozygote, $A_1A_2$ (the order of the alleles being irrelevant). The relative frequencies of the two alleles in the population are denoted $p$ and $q$, so that $p + q = 1$, and the relative frequencies of the genotypes are denoted $f(A_1A_1)$, $f(A_1A_2)$ and $f(A_2A_2)$, so that they sum to 1. The Hardy-Weinberg principle states that, assuming that mating is random and that there are no selective pressures, the genotype frequencies in the population will be $f(A_1A_1) = p^2$, $f(A_1A_2) = 2pq$ and $f(A_2A_2) = q^2$, and adding them gives $p^2 + 2pq + q^2 = 1$. These are the equilibrium frequencies that the population will reach after one generation, no matter the initial distribution, supposing that the generations are non-overlapping, that mating is random and that there are no selective pressures at play. With no selective pressures, and assuming that organisms mate at random, the population will continue to have the same genotypic frequencies generation after generation; such a population is said to be in Hardy-Weinberg equilibrium.
One of the major benefits of the Hardy-Weinberg principle is that it makes it easier to model evolutionary changes dynamically, and this is why many models in population genetics assume that the Hardy-Weinberg principle holds. In Okasha’s words:
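The principle can be illustrated numerically. The following is a minimal sketch under the idealisations stated in the text (random mating, no selective pressures, non-overlapping generations); the starting genotype frequencies and the function names are arbitrary choices made here for illustration.

```python
def allele_frequency_a1(f11, f12, f22):
    """Frequency p of allele A1, given the three genotype frequencies."""
    if abs(f11 + f12 + f22 - 1.0) > 1e-9:
        raise ValueError("genotype frequencies must sum to 1")
    # Each A1A1 individual carries two copies of A1, each A1A2 carries one.
    return f11 + 0.5 * f12

def hardy_weinberg(p):
    """Genotype frequencies (A1A1, A1A2, A2A2) after one round of random mating."""
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q

# An arbitrary starting population that is NOT in Hardy-Weinberg proportions.
start = (0.5, 0.2, 0.3)

p0 = allele_frequency_a1(*start)
generation_1 = hardy_weinberg(p0)

p1 = allele_frequency_a1(*generation_1)
generation_2 = hardy_weinberg(p1)

print("allele frequency p:", p0, "->", p1)
print("generation 1:", generation_1)
print("generation 2:", generation_2)
```

After a single generation the genotype frequencies take the form $p^2$, $2pq$, $q^2$, and they stay there in the following generation because the allele frequency itself has not changed.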
When a population is in Hardy-Weinberg equilibrium, it is possible to track the genotypic composition of the population by directly tracking the allelic frequencies (or gametic frequencies). That this is so is clear-for if we know the relative frequencies of all the alleles (at a single locus), and know that the population is in Hardy-Weinberg equilibrium, the entire genotype frequency distribution can be easily computed. (2022, p. 12)
Once this basic picture has been presented, population genetic models are refined so as to capture the evolutionary dynamics of a population when selective pressures are at play. In particular, we will continue to consider a population of diploid, sexually reproducing organisms, focusing on only one locus with two different alleles. This scenario is clearly a simplification, because many populations evolve by changing genotypic frequencies not at one locus but at many. However, for the purposes of this paper, it will be enough to show how population genetic models introduce selection into their mathematical apparatus. Remember that drift is dealt with in population genetic models by assuming that populations are infinite; otherwise, in small populations, the probability that changes in trait frequencies are due to drift is rather high. Recall also that evolution according to population genetic models is usually defined as change in the genetic composition of a population over time.
When selection is introduced into a population genetic model, a further parameter, trait fitness ($w$), is added. A fitness value is associated with each genotype, reflecting the higher or lower probability that organisms bearing that genotype survive and reproduce. Okasha defines it as follows: “[trait fitness for a given genotype] is defined as the average number of successful gametes that an organism of that genotype contributes to the next generation” (2022, p. 13). In our simplified example, each genotype has its own fitness value, so $A_1A_1$, $A_1A_2$ and $A_2A_2$ have fitnesses $w_{11}$, $w_{12}$ and $w_{22}$, respectively. When bearing one of these genotypes confers higher chances of survival and reproduction, the fitness parameters differ and natural selection is operating. Supposing again that the population is at Hardy-Weinberg equilibrium, the three genotypes will produce successful gametes in proportion to their fitness. The average fitness in the population, taking into account the three genotype fitnesses, is $\bar{w} = p^2 w_{11} + 2pq\, w_{12} + q^2 w_{22}$, and thus the total number of gametes will be $N\bar{w}$, $N$ being the size of the population. The mathematical apparatus allows us to derive an equation that tells us what the frequencies of the two alleles $A_1$ and $A_2$ will be. For instance, the frequency of allele $A_1$ in the next generation is $p' = [N p^2 w_{11} + \tfrac{1}{2} N (2pq)\, w_{12}] / (N\bar{w}) = p\,(p\, w_{11} + q\, w_{12}) / \bar{w}$.
Obviously, this presentation of population genetics for the study of the evolution of three genotypes on a two allele locus is a simplification, because many populations evolve not only with respect to two alleles, and selection is rarely the only pressure affecting those changes, so many population genetic models try to capture the evolutionary dynamics of actual populations. However, this presentation is enough to show how those models work, even though we have considered here those that take into account selection, and lack the effects of mutation and migration.
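The one-locus, two-allele selection model sketched in this section can be illustrated with a few lines of code. The sketch assumes the standard form of the recursion for such a model, $p' = p\,(p\, w_{11} + q\, w_{12}) / \bar{w}$ with $\bar{w} = p^2 w_{11} + 2pq\, w_{12} + q^2 w_{22}$, together with the idealisations mentioned above (infinite population, random mating, no mutation or migration). The fitness values, the starting frequency and the function names are hypothetical and chosen only for illustration.

```python
def mean_fitness(p, w11, w12, w22):
    """Average fitness of the population at Hardy-Weinberg proportions."""
    q = 1.0 - p
    return p * p * w11 + 2.0 * p * q * w12 + q * q * w22

def next_allele_frequency(p, w11, w12, w22):
    """Frequency of allele A1 in the next generation under selection alone."""
    q = 1.0 - p
    return p * (p * w11 + q * w12) / mean_fitness(p, w11, w12, w22)

def trajectory(p0, generations, w11, w12, w22):
    """Allele frequencies over successive non-overlapping generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_allele_frequency(freqs[-1], w11, w12, w22))
    return freqs

# Hypothetical fitness values favouring allele A1; p starts at 0.1.
freqs = trajectory(0.1, 50, w11=1.0, w12=0.9, w22=0.8)

for generation in (0, 10, 25, 50):
    print(f"generation {generation:2d}: p = {freqs[generation]:.4f}")
```

The only inputs are the current allele frequency and the three fitness parameters. The code contains no representation of why bearers of one genotype leave more gametes than bearers of another, which matches the way the model itself is specified.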
Despite the high number of simplifications and idealisations, I would like to comment on the role that population genetics plays in contributing to better understanding evolutionary processes. Wade (2021), for instance, argues that population genetic models have not been contrasted enough with data and that their idealisations and simplifications are too important so as to capture important elements of evolution. On the other hand, Lynch (2007) writes that nothing in biology makes sense but at the light of population genetics, that population genetics provides mathematical models that are accurate to capture the evolutionary dynamics of many real populations, giving good predictions and explanations about the frequencies of a specific allele or genotype in a particular population.
Finally, Okasha (2022) claims that population genetics models are silent with respect to fitness differences. This is an important point, for it might be thought that population genetics models must be complemented by first-order causal explanations of the births, deaths and reproductions of the individual organisms in the population. We claim that, even though those individual-level causal explanations could provide a deeper understanding of evolutionary processes, population genetic models provide autonomous statistical explanations. We develop this idea in the following section.
5. Statistical vs Causal Explanations
5.1. The Presence of Statistical Explanations
There is widespread support for the picture that explanations in biology are causal (Glennan and Illari 2018). This support is due in great part, although not exclusively, to the success of the new mechanistic philosophy. Explanations have traditionally been considered causal, although Hempel himself (1965) considered the notion of cause metaphysically laden and advocated a covering law model, which was later shown to be incomplete, in part because of the need to introduce causal information.
Statistical explanations could be seen as a special type of what Huneman (2018) calls structural explanations. Structural explanations are explanations in which the mathematical structure of the system under study plays a key explanatory role for the explanandum phenomenon, and not just a representational role (2018). Huneman (2018) claims that there are different types of structural explanation in science that explain by using mathematics and not by identifying causal relations. Huneman mentions equilibrium explanations (Sober 1983; Kuorikoski 2007; Potochnik 2015; Suárez and Deulofeu 2019), minimal model explanations (Batterman 2002; Batterman and Rice 2014; Ross 2015), statistical explanations (Lange 2013; Walsh 2015) and topological explanations (Huneman 2010; Jones 2014; Woodward 2003). According to Huneman (2018), all types of structural explanation have some features in common:
i) They all aim at accounting for some pattern rather than just detecting patterns in the data (they are not mere representations/descriptions).
ii) The explanandum of a structural explanation, being a property of a system, is not explained by the causal details that lead to it. These details are not explanatory for the behaviour of the whole system; they are abstracted away. To illustrate this point, think about why “stones left falling on the top of the hill end up in the valley” (2018, p. 670). The trajectory of each stone does not matter for the explanandum; what matters is just the fact that all of them end up in the same place.
iii) All structural explanations reach a level of generality that is not achievable by mechanistic research, which is why the specific nature of a mechanism does not figure in the explanation (Moreno and Suárez 2020). Because of that, one can change the nature of the mechanism and the structural explanation in which the mechanism is involved will still be valid.
iv) Finally, all these explanations use formal features formulated in mathematical terms.
In section 5.2, I will show that the explanations provided by the Modern Synthesis, as illustrated in section 4, perfectly fit the four points of a structural explanation. In the statistical models given by Fisher, Wright, Dobzhansky and Haldane, among others, the aggregate of different causal processes, namely the lives, deaths and reproductions of particular individuals in a population, is abstracted away, and no matter how those causal processes are arranged, the models show, by appealing to statistical patterns, that certain ensemble-level trajectories are likely to happen.12
For now, let us introduce statistical explanations in biology, in particular those that appeal to random genetic drift. Lange, for instance, defines statistical explanations when discussing the role that drift plays in evolutionary explanations (2013), although he has studied them in several scientific disciplines besides biology in his seminal book Because without Cause (2016). In his 2013 paper, he calls “really statistical” (RS) explanations those explanations from population biology (those referred to by Walsh when talking about Modern Synthesis models and Modern Synthesis Selection) that treat the phenomenon to be explained as a mere “statistical fallout”, namely “just a statistical fact of life” (p. 169). To illustrate the nature of statistical explanations, Lange appeals to “regression towards the mean”, a statistical phenomenon of a mathematical system. Lange asks why students who scored worse in a first exam scored better in a second. Instead of giving individual causal explanations of why a particular student who scored lower in the first exam scored better in the second, he attributes the phenomenon to a statistical fact about the population system and explains it by appealing to “regression towards the mean”. As in the coin experiment presented by Walsh (section 2.1), we could focus on the individual causal stories of why each student scored better or worse, but that would not be an appropriate answer, for the nature of the system forces us to treat the phenomenon at the population level and to appeal to “regression towards the mean” instead of to the individual causal stories. What must be noted here is that neither the explanation of the numbers of heads and tails nor the explanation of the students’ exam results is given by “describing the world’s network of causal relations” (Lange 2013, p. 172). Both explanations appeal to a statistical feature of the system under investigation, which is a population-level phenomenon. In Lange’s words:
A RS [Really Statistical] explanation does not proceed from the particular chances of various results-or even from the fact that some result’s chance is high or low. It exploits merely the fact that some process is chancy, and so an RS explanation shows the result to be just a statistical fact of life. (2013, p. 173)
So, in these two cases, what makes the explanation non-causal is that the explanans, which appeals to regression or statistical association between outcomes, or to statistical facts about a population system, gets its explanatory force from something other than identifying and appealing to information about causes.
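The exam example can be illustrated with a small simulation. This is only a sketch and not Lange's own model: it assumes that each score is a stable ability plus independent chance noise, and the function names, score scale and seed are hypothetical choices.

```python
import random
import statistics


def simulate_two_exams(n_students=10000, seed=0):
    """Return (exam1, exam2) score pairs. Assumption: each score is a
    stable ability plus independent noise, with no causal link
    between a student's first and second result."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_students):
        ability = rng.gauss(60.0, 10.0)
        exam1 = ability + rng.gauss(0.0, 10.0)
        exam2 = ability + rng.gauss(0.0, 10.0)
        scores.append((exam1, exam2))
    return scores


def bottom_group_means(scores, fraction=0.1):
    """Mean first and second exam score of the students who did
    worst on the first exam."""
    ranked = sorted(scores, key=lambda pair: pair[0])
    cutoff = max(1, int(len(ranked) * fraction))
    bottom = ranked[:cutoff]
    mean_first = statistics.mean(pair[0] for pair in bottom)
    mean_second = statistics.mean(pair[1] for pair in bottom)
    return mean_first, mean_second


if __name__ == "__main__":
    first, second = bottom_group_means(simulate_two_exams())
    print("worst group, exam 1:", round(first, 1))
    print("same group, exam 2:", round(second, 1))
```

Under these assumptions the worst-scoring group improves on average in the second exam, although the code contains no story about why any particular student improved; the improvement follows from the chancy character of the scores alone.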
Moreover, Lange claims that the fact that some information in the explanans of an RS explanation itself needs to be explained by appeal to causes does not make the statistical explanation causal. One example Lange uses is that of the α particles emitted by a radioactive source: the emission rate is roughly constant when measured over a considerable period of time, whereas over short periods it fluctuates strongly. The fact that α particles are emitted at random makes the system subject to the laws of probability, which carry the explanatory power to account for the fluctuations around the mean in the emission of α particles over short periods of time. To explain, however, why a particular sample has “a constant chance of emission”, we have to appeal to causal factors. Still, the fact that the explanans needs causal information does not make the RS explanation causal. He concludes, “A scientific explanation is not responsible for explaining the facts in its own explanans” (2013, p. 175).
When Lange presents drift explanations, he claims that drift should not be seen only as a case of indiscriminate sampling, as many others contend (Beatty 1984). For instance, population genetics, in order to explain why some selectively neutral or deleterious trait became fixed in a population, appeals to drift, and this explanation is statistical because it says that the result is just a fluctuation characteristic of statistical processes (2013). His position contrasts with that of Shapiro and Sober (2007), for whom selection and drift are distinct population-level causal processes, although drift does not necessarily imply indiscriminate sampling. They contend that selection and drift are represented in Modern Synthesis theory by fitness and effective population size. If these parameters are modified, trait frequencies will change; thus, they are causes of the evolution of populations. However, the fact that population size can be a source of more or less drift, and thus play an explanatory role, is not mediated by any law of nature but by the laws of probability, as Lange remarks (2013). This is essential, for it is not due to a causal law or anything similar that the smaller the population, the greater the deviation of the values from the expectations will be. Lange proceeds as follows:
Given the chances of various possible outcomes on any particular independent trial, the relation between population size and the likelihoods of various population-wide outcomes is not contingent. For all of its manipulationist, counterfactual dependence, and probability-raising credentials, population size does not act as a population-level cause in drift explanations since these credentials are mathematically necessary, not beholden to any mere law of nature. (2013, p. 185)
Thus, explanations appealing to drift are not causal but statistical, and are sustained by the mathematical apparatus of population genetics.
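The relation between population size and the magnitude of drift can be made concrete with a short simulation. This is a hedged sketch under an assumption not stated in the text: one generation of neutral reproduction is modelled as binomial sampling of 2N gametes (a Wright-Fisher style idealisation). The function names, replicate count and seed are hypothetical.

```python
import random
import statistics


def next_generation_frequency(p, population_size, rng):
    """Sample 2N gametes, each carrying allele A1 with chance p, and
    return the resulting frequency of A1 (no selection involved)."""
    n_gametes = 2 * population_size
    copies = sum(1 for _ in range(n_gametes) if rng.random() < p)
    return copies / n_gametes


def frequency_spread(p, population_size, replicates=300, seed=1):
    """Variance of the next-generation frequency across replicate
    populations of the same size."""
    rng = random.Random(seed)
    outcomes = [
        next_generation_frequency(p, population_size, rng)
        for _ in range(replicates)
    ]
    return statistics.pvariance(outcomes)


def binomial_variance(p, population_size):
    """Variance implied by the binomial distribution: p(1 - p) / 2N."""
    return p * (1.0 - p) / (2 * population_size)


if __name__ == "__main__":
    for size in (10, 100, 1000):
        print(size, frequency_spread(0.5, size), binomial_variance(0.5, size))
```

The inverse dependence of the spread on N in `binomial_variance` is a consequence of the binomial distribution itself, which is the sense in which, for Lange, the link between population size and drift is mathematically necessary rather than grounded in a law of nature.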
5.2. Autonomous Statistical Explanations and Their Independence from the Ontology Claim
In this section, I argue that statistical explanations can be autonomous, namely, that they do not need to appeal to causal information in order to account for their explanandum (Ariew et al. 2015). In particular, my arguments follow Ariew and colleagues (2015) in arguing that the Modern Synthesis explains the difference in trait structure of a population by appealing to statistical models rather than to causes of evolutionary change, in particular by appealing to “a set of statistical properties of populations, viz. the mean (and variance) of fitness between trait types” (2015, p. 7). This has been called the WALM claim (after Walsh, Ariew, Lewens and Matthen). We will present different arguments to support it.13
First, statisticalists claim that scientific explanations of changes in trait frequencies of populations in the Modern Synthesis are analogous to explanations of certain gas properties in the kinetic theory of gases (Walsh 2003; Ariew et al. 2015). A gas is seen as a complex system, in this case one made up of many gas molecules. In this complex system, there are large-scale statistical regularities that hold for the population as a whole but do not hold for every individual (molecule). What needs to be explained is how this large-scale statistical regularity emerges from the relative chaos of the interactions between individuals (molecules). Ariew and colleagues claim that “the deductive consequences of a statistical model are sufficient to explain such large-scale regularities in complex systems” (2015, p. 13). In more detail, the properties of a gas (temperature, pressure and volume) are population-level properties: they range over the ensemble of molecules. The nature of and relations among these population-level properties, which arise from the aggregation of gas molecules, are given by a statistical property, mean kinetic energy, and it is by appealing to mean kinetic energy that changes in pressure, temperature and volume are explained, not by appealing to the forces that move each molecule along a certain dynamical path. Walsh (2003) claims that there are cases in which, owing to the nature of the explanandum, appealing to statistical patterns is better than appealing to causal process explanations. Evolving populations and gases are among the systems whose behaviour (trait frequency distribution, and temperature, volume and pressure, respectively) is better explained by appealing to statistical models.
In both cases, the explanandum is the existence of population-level patterns, and in both cases the explanans is statistically autonomous and involves two steps: “assumptions that follow for the use of a statistical model and then deduction from that model” (Ariew et al. 2015, p. 14). As they claim, there is no need for additional causal information in order to make the explanation complete; these two steps suffice. Thus, the Modern Synthesis, by using trait fitness in its equations, explains how the distribution of traits in a population changes from an initial condition to a final state. Modern Synthesis explanatory models tell us what the final configuration of a population will be (in terms of trait distribution), given trait fitness and initial conditions. In the kinetic theory of gases, as in Modern Synthesis models, population-level explanations, rather than causal explanations of the behaviour of the individual members of a population, are what prevail and what is preferable for the explanandum at stake.
Second, in the same vein, Woodward (2003), when criticizing Salmon’s causal mechanical model of explanation, discusses cases in which higher-level explanations, statistical ones in our case, are preferable to detailed lower-level causal explanations. First, he claims, it would be far too difficult to capture the details of all members of the population (i.e., gas molecules or individual living beings), and even if we could, the computation would be too complex to be understood (2003, p. 354). Second, Woodward claims that appealing to a higher-level explanation that is not causally detailed can capture features that would not be captured by going down to the individual-level causal processes. The macro properties of a gas, such as its pressure and volume, could be reached through many possible trajectories of the gas molecules and, in parallel, in the Modern Synthesis the final distribution of traits could have been reached through many different individual trajectories of the members of the population. What matters is not each individual trajectory but the final outcome, the population trait distribution. Analogous scenarios arise in equilibrium explanations; for instance, following Sober, “Where causal explanation shows how the event to be explained was in fact produced, equilibrium explanation shows how the event would have occurred regardless of which of a variety of causal scenarios actually transpired” (1983, p. 202). It is thus not the appeal to the causal relations and connections between the individual members of a population that explains its final trait distribution, but the fact that it has been deduced from a large-scale statistical pattern, which takes the form of the equations presented in section 4 and appeals to trait fitness and statistical properties of populations. In Ariew and colleagues’ words, “no matter what the arrangement of the causes are, a particular ensemble level trajectory is highly likely” (2015, p. 15).
As mentioned before, though, it is not denied that there are causal explanations of how a population reached a specific trait distribution. What is claimed here is that MS selection, as appealed to in the Modern Synthesis, does not use this kind of explanation, because the targeted explanandum, namely the dynamical trajectory of a distribution of traits in a population, is not accounted for in terms of individual causal descriptions but by appealing to statistical models (section 4).
A possible objection might arise. We have shown that population genetics explains by making a certain trait structure of a population expectable from some statistical patterns (including initial conditions and trait fitness as essential values). Is it enough, though, to explain the changes in trait frequencies of a population by making them deducible from statistical patterns? Some philosophers have argued that when we have a dynamical model with predictive and descriptive power, we can regard the model as explanatory (Chemero and Silberstein 2008). Another possible reply comes from Batterman and Rice and what they call minimal model explanations. They claim there are models that are explanatory “because of a story of why a class of systems will all display the same large scale behaviour because the details that distinguish them are irrelevant” (2014, p. 349). Ariew and colleagues (2015) reply to this worry as well. They claim that the statistical explanations defined above are not mere deductions but complete explanations, because they are able to provide counterfactual information; in their own words, “statistical explanation tells us how the large-scale regularity would have been different if the statistical properties of the population had been different” (2015, p. 21).
To sum up, we have shown that i) the explananda of Modern Synthesis explanations are general patterns (in terms of dynamic paths of populations from one trait distribution to another), ii) these explananda are not accounted for by appealing to particular causal trajectories of the individuals of a population, iii) the level of generality reached by Modern Synthesis explanatory models would not be possible by providing detailed causal explanations of the lives, deaths and reproductions of the individuals of a population, and iv) Modern Synthesis explanations use the statistical apparatus developed by Fisher, Wright and Haldane, among others. Therefore, we can conclude that the explanations given by population genetics fit Huneman’s definition of a structural explanation (2018).
As Walsh mentioned, evolutionary biologists are nowadays looking into individual causal explanations, so it might be important to find a way to integrate individual-level causal explanations and higher-level population explanations, for instance by using Sandra Mitchell’s account of integrative pluralism (2003, 2009), and to go a step further and determine the relation between these two types of explanation. Such a suggestion might help to resolve the debate between causalists and statisticalists and bring it to a close, although this is not the aim of this paper.
6. Conclusion
The paper has explored two dimensions of the debate over the nature of natural selection. First, it has introduced the two opposed ontological positions, the causal and the statistical, and has suggested that the statistical interpretation is more appealing. At the same time, it has introduced Walsh’s distinction between Darwinian selection and Modern Synthesis Selection as a way out. Second, it has provided an analysis of the nature of statistical explanations from population genetics and has claimed that these explanations, rather than appealing to lower-level causal details of the interactions between individuals of a population, appeal to statistical patterns in order to explain the dynamics of population change regarding its trait structure. Finally, the paper has shown that these explanations fulfil Huneman’s conditions and therefore have to be accepted as structural explanations.14