INTRODUCTION
To avoid the high seed cost of hybrid maize (Zea mays L.) varieties, some farmers in Mexico sow advanced generations of hybrids or apply other management strategies to existing hybrid varieties. This practice has motivated a number of studies on the formation of synthetic varieties from single-cross, three-way line cross, or double-cross hybrids. Among them are one that deals with the theoretical aspects of synthetic varieties derived from single crosses as parents (Sahagún-Castellanos and Villanueva-Verduzco, 1997) and others that generated formulas for predicting the yield of synthetics derived from double crosses (e.g., Sahagún-Castellanos et al., 2005; Márquez-Sánchez, 2008). For three-way line hybrids in particular, which are commonly used in Mexico, the inbreeding coefficient and a formula to predict the yield of the synthetic produced by the random mating of three-way line crosses made with pure lines have been determined (Márquez-Sánchez, 2010).
The synthetic derived from three-way line hybrids (Syn T) is interesting because the genetic participation of the three lines that form such a hybrid is not balanced; however, there are still gaps in our knowledge of its properties. For example, the study by Márquez-Sánchez (2010) did not include the case in which the parent lines have an inbreeding coefficient F (0 ≤ F ≤ 1). In addition, the value that this author obtained for the contribution of intraparental coancestry to the Syn T inbreeding coefficient is not convincing; the hypothesis of this study is that this coancestry is overvalued. The value of F is important because it determines the magnitude of the inbreeding coefficient of a Syn T that can be derived from these lines. This coefficient, in turn, is linearly and inversely related to the genotypic means of some traits of economic interest (grain yield, for example) of this synthetic variety (Busbice, 1970). In this context, the purpose of this work was to derive an unbiased formula for the inbreeding coefficient and a formula to predict the mean of a synthetic variety whose parents are t three-way line hybrids formed with lines whose inbreeding coefficient is any value of F (0 ≤ F ≤ 1).
MATERIALS AND METHODS
In general, the methods used in this study are based on the concepts of genotypic array and gametic array in the context of a one-locus model of a diploid species reproduced by random mating. More specifically, for a population reproduced in this way, if the frequency of the A i gene is p i (i = 1,2,…,a), its gametic (GAA) and genotypic (GEA) arrays are defined as (Sahagún-Castellanos et al., 2013):
$$\mathrm{GAA} = \sum_{i=1}^{a} p_i A_i \quad\text{and}\quad \mathrm{GEA} = \Big(\sum_{i=1}^{a} p_i A_i\Big)^{2} = \sum_{i=1}^{a}\sum_{j=1}^{a} p_i p_j A_i A_j \tag{1}$$
The inbreeding coefficient of a synthetic variety (SV) was viewed as the probability that the genotype of a random individual of that SV is formed by two genes that are identical by descent. The genotypic mean of a synthetic variety, in turn, was viewed as the result of substituting the genotypes of its genotypic array with the corresponding genotypic values (Sahagún-Castellanos et al., 2013). In this work, SVs whose parents are t three-way line hybrids (TWLHs) are studied. It was assumed that each TWLH is represented by m plants and derived from lines whose inbreeding coefficient is F (0 ≤ F ≤ 1). If the lines that form the parental single cross of a TWLH are the virtual populations represented by A 1 A 2 and B 1 B 2, while C 1 C 2 represents the third line, then, following Rodríguez-Pérez et al. (2016) with regard to the probability (P) of identity by descent (≡), it is considered that P(A 1 ≡ A 2) = P(B 1 ≡ B 2) = P(C 1 ≡ C 2) = F. It was also assumed that the coancestry among the 3t lines that form the Syn T is zero.
RESULTS AND DISCUSSION
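The genotypic array of the three-way line cross (A 1 A 2 × B 1 B 2) × C 1 C 2 (GEA T), according to Equation 1, must be:
$$\mathrm{GEA}_T = \tfrac{1}{8}\,(A_1C_1 + A_1C_2 + A_2C_1 + A_2C_2 + B_1C_1 + B_1C_2 + B_2C_1 + B_2C_2) \tag{2}$$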
The Syn T is generated by the random mating of the mt representatives of the t three-way line crosses. As a consequence, this type of mating also occurs among the m representatives of each of these crosses. For example, the population derived from the three-way line cross of Equation 2 has the gametic array: (1/8)A 1+(1/8)A 2+(1/8)B 1+(1/8)B 2+(2/8)C 1+(2/8)C 2.
The population produced by the random mating of the GEA T individuals (Equation 2) must include genotypes formed by two genes from two different parental lines, which do not contribute to inbreeding, and genotypes formed by two genes from the same line, which do contribute. The latter genotypes and their frequencies are:
$$\tfrac{1}{64}(A_1A_1 + 2A_1A_2 + A_2A_2) + \tfrac{1}{64}(B_1B_1 + 2B_1B_2 + B_2B_2) + \tfrac{4}{64}(C_1C_1 + 2C_1C_2 + C_2C_2) \tag{3}$$
Inbreeding Coefficient
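According to Expression (3), the inbreeding coefficient (IC) of the offspring produced by the random mating of a three-way line cross (F T) is:
$$F_T = \tfrac{1}{64}(2 + 2F) + \tfrac{1}{64}(2 + 2F) + \tfrac{4}{64}(2 + 2F) = \frac{3(1 + F)}{16} \tag{4}$$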
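In addition, if the number of parents (three-way line crosses) is t, the two genes of an individual come from the same parent with probability 1/t, and genes from different parents are unrelated. The inbreeding coefficient of the synthetic produced by their random mating (FSyn T) can therefore be expressed in the form:
$$F_{\mathrm{Syn}_T} = \frac{F_T}{t} = \frac{3(1 + F)}{16t} \tag{5}$$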
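The value of F T can be checked numerically from the gametic array given above. The following Python sketch is illustrative only: the function names and the identity-by-descent rule (1 for the same gene, F for the two genes of one line, 0 for genes of different lines) are assumptions taken from the stated model, not code from the study.

```python
from fractions import Fraction

# Gametic array of the three-way cross (A1A2 x B1B2) x C1C2:
# each gene is keyed by (line, gene number) and mapped to its frequency.
GAMETIC_ARRAY = {
    ("A", 1): Fraction(1, 8), ("A", 2): Fraction(1, 8),
    ("B", 1): Fraction(1, 8), ("B", 2): Fraction(1, 8),
    ("C", 1): Fraction(2, 8), ("C", 2): Fraction(2, 8),
}


def ibd_probability(gene_x, gene_y, F):
    """Assumed rule: 1 for the same gene, F within a line, 0 across lines."""
    if gene_x == gene_y:
        return Fraction(1)
    if gene_x[0] == gene_y[0]:
        return Fraction(F)
    return Fraction(0)


def inbreeding_three_way(F):
    """Inbreeding coefficient after random mating of one three-way cross."""
    return sum(
        px * py * ibd_probability(x, y, F)
        for x, px in GAMETIC_ARRAY.items()
        for y, py in GAMETIC_ARRAY.items()
    )


def inbreeding_syn_t(F, t):
    """Inbreeding coefficient of a synthetic of t unrelated three-way crosses."""
    return inbreeding_three_way(F) / t


if __name__ == "__main__":
    for F in (Fraction(0), Fraction(1, 2), Fraction(1)):
        print(F, inbreeding_three_way(F), inbreeding_syn_t(F, t=4))
```

With pure lines (F = 1) the sketch gives 3/8 for a single three-way cross, in agreement with 3(1 + F)/16.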
Below, an analysis is made to determine the genotypic structure of the Syn T in order to know the level of accuracy of an inbreeding coefficient of a Syn T developed with pure lines (Márquez-Sánchez, 2010).
The random mating of the 8 genotypes of the genotypic array of a three-way line cross (Equation 2) produces offspring that can be classified into: a) those produced by the union of two gametes from the same genotype, whether it is from the same individual (self-pollinations) or not (intraparental crosses), and b) those produced by crosses between individuals whose genotypes do not have genes in common. The inbreeding coefficient of the offspring produced by self-pollination is 1/2. The remaining intraparental crosses can be classified into 8 groups of 7 crosses, which have a parent in common. For example, the genotype A 1 C 1 is crossed with each of the 7 remaining genotypes of the GEA T described in Equation 2 (A 1 C 2, A 2 C 1, A 2 C 2, B 1 C 1, B 1 C 2, B 2 C 1 and B 2 C 2). These 7 crosses produce the offspring whose genotypic array broken down by crosses is as follows:
Based on this equation, the inbreeding coefficient of the offspring of these 7 crosses is (2 + 3F)/14.
The inbreeding coefficient of each of the 7 groups of remaining crosses is also (2 + 3F)/14. Therefore, the inbreeding coefficient of the population derived from the offspring of the 8 sets of crosses (F T8) is expressed as:
$$F_{T8} = \frac{2 + 3F}{14} \tag{6}$$
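The value (2 + 3F)/14 can be reproduced by enumerating the crosses directly. The Python sketch below is a hedged illustration: the helper names are hypothetical, and it assumes that each cross produces its four gamete unions with equal probability and that identity by descent is 1 for the same gene, F for the two genes of one line, and 0 otherwise.

```python
from fractions import Fraction
from itertools import product

# The 8 equally frequent genotypes of the three-way cross:
# one gene from line A or B paired with one gene from line C.
GENOTYPES = [
    ((line, k), ("C", l)) for line in "AB" for k in (1, 2) for l in (1, 2)
]


def ibd(gene_x, gene_y, F):
    """Assumed rule: 1 for the same gene, F within a line, 0 across lines."""
    if gene_x == gene_y:
        return Fraction(1)
    if gene_x[0] == gene_y[0]:
        return Fraction(F)
    return Fraction(0)


def cross_inbreeding(g1, g2, F):
    """Mean identity by descent over the 4 gamete unions of g1 x g2."""
    return sum(ibd(x, y, F) for x, y in product(g1, g2)) / 4


def group_inbreeding(parent, F):
    """Mean over the 7 crosses of one genotype with the other 7 genotypes."""
    others = [g for g in GENOTYPES if g != parent]
    return sum(cross_inbreeding(parent, g, F) for g in others) / len(others)


if __name__ == "__main__":
    F = Fraction(1, 2)
    for parent in GENOTYPES:
        print(parent, group_inbreeding(parent, F), (2 + 3 * F) / 14)
```

Under these assumptions, all 8 groups give the same value, and `cross_inbreeding(g, g, F)` returns 1/2 for a self-pollination.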
It is to be expected, particularly when m is large, that the sample of plants from each parent contains groups of the 8 genotypes that form the genotypic array of each three-way line cross (Equation 2). In these cases, F T8 (Equation 6) is not the coancestry between the m individuals that represent a three-way line cross because it does not include the part due to mating between different plants that have the same genotype.
The intraparental coancestry r 0,W is formed by all contributions to the inbreeding coefficient of the offspring produced by the random mating of the m individuals representing the three-way line cross, except those produced by the m self-pollinations. According to these considerations and with Equation 4:
$$r_{0,W} = \frac{m\,\dfrac{3(1 + F)}{16} - \dfrac{1}{2}}{m - 1}$$
Or, in simplified form:
$$r_{0,W} = \frac{3m(1 + F) - 8}{16(m - 1)} \tag{7}$$
This coancestry of the intraparental crosses (Equation 7), reduced to the case F = 1, is (3m - 4)/[8(m - 1)], which differs from the value derived by Márquez-Sánchez (2010) for pure lines (3/8); moreover, if m = 8, Equation 7 reduces to (2 + 3F)/14.
Based on Equation 7, the inbreeding coefficient of the offspring produced by the random mating of the m representatives of a three-way line cross is also expressible as:
$$F_T = \frac{1}{m}\cdot\frac{1}{2} + \frac{m - 1}{m}\, r_{0,W} \tag{8}$$
Equation 8 is reducible to the form F T = 3(1 + F)/16. This result had already been generated based on the definition of the inbreeding coefficient of the offspring produced by the random mating of a three-way line cross (Equation 4).
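The relationship between the intraparental coancestry and F T can be sketched in a few lines of Python. The closed form used for r 0,W is an assumption obtained by solving F T = (1/m)(1/2) + [(m - 1)/m] r 0,W with F T = 3(1 + F)/16; the function names are hypothetical.

```python
from fractions import Fraction


def intraparental_coancestry(m, F):
    """Assumed closed form: r_0,W = [3m(1 + F) - 8] / [16(m - 1)], for m >= 2."""
    F = Fraction(F)
    return (3 * m * (1 + F) - 8) / (16 * (m - 1))


def inbreeding_from_coancestry(m, F):
    """Recombine the m selfings (1/2 each) with the intraparental crosses."""
    r_w = intraparental_coancestry(m, F)
    return Fraction(1, 2 * m) + Fraction(m - 1, m) * r_w


if __name__ == "__main__":
    F = Fraction(1, 2)
    for m in (2, 8, 50):
        print(m, intraparental_coancestry(m, F), inbreeding_from_coancestry(m, F))
```

Whatever the value of m, the recombined coefficient equals 3(1 + F)/16, while the coancestry itself depends on m and equals (2 + 3F)/14 only when m = 8.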
Generalizing, since Syn T is generated by the random mating of t three-way line hybrids, its inbreeding coefficient (FSyn T) in terms of r 0,W (Equation 7) is:
$$F_{\mathrm{Syn}_T} = \frac{1}{t}\left[\frac{1}{2m} + \frac{m - 1}{m}\, r_{0,W}\right] \tag{9}$$
Clearly, FSyn T has an inverse relationship with t and, for a fixed value of t, it reaches its maximum when the parental lines of the hybrids are pure (F = 1).
If, on the other hand, the inbreeding coefficient of the 3t initial lines is F (0 ≤ F ≤ 1) and these lines are subjected to random mating, a synthetic (Syn L) is formed; its inbreeding coefficient (FSyn L), according to Márquez-Sánchez (1993), is:
$$F_{\mathrm{Syn}_L} = \frac{1 + F}{2(3t)} = \frac{1 + F}{6t} \tag{10}$$
According to Equations 5 and 10, respectively, if F = 1 (pure lines):
$$F_{\mathrm{Syn}_T} = \frac{3}{8t} \tag{11}$$
and
$$F_{\mathrm{Syn}_L} = \frac{1}{3t} \tag{12}$$
The frequencies of the genes contributed by the lines differ between Syn L (Equations 10 and 12) and Syn T (Equations 5 and 11): in Syn L they are balanced, whereas in Syn T they are not. Because of this difference, Equations 5 and 10 do not coincide and, consequently, Syn L ≠ Syn T. In addition, FSyn T > FSyn L.
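The comparison between the two synthetics can be illustrated numerically. The sketch below assumes the closed forms FSyn T = 3(1 + F)/(16t) and FSyn L = (1 + F)/(6t), the latter being the usual expression for a synthetic of 3t unrelated lines; both function names are hypothetical.

```python
from fractions import Fraction


def f_syn_t(t, F):
    """Assumed form: synthetic of t three-way crosses built from 3t lines."""
    return 3 * (1 + Fraction(F)) / (16 * t)


def f_syn_l(t, F):
    """Assumed form: synthetic whose parents are the 3t lines themselves."""
    return (1 + Fraction(F)) / (6 * t)


if __name__ == "__main__":
    for t in (1, 2, 4):
        for F in (Fraction(0), Fraction(1, 2), Fraction(1)):
            ratio = f_syn_t(t, F) / f_syn_l(t, F)
            print(t, F, f_syn_t(t, F), f_syn_l(t, F), ratio)
```

Under these assumed forms, the ratio FSyn T / FSyn L is 9/8 for every t and F, which is consistent with FSyn T > FSyn L.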
Genotypic Mean
The concept of genotypic array applied to a synthetic where the genotype of each plant from each parent is identified (Sahagún-Castellanos, 1998) will be applied to derive the genotypic mean of a Syn T.
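Let A pik A qjl be the genotype of the individual whose parents are the individuals p and q (p, q = 1,2,…,m) representing the three-way line hybrids i and j, respectively (i, j = 1,2,…,t), where k and l are the genes contributed by these parent individuals (k, l = 1,2). According to this notation and Equation 1, the population that results from the random mating of the mt parent individuals must have a genotypic array (GEASyn T) expressible as:
$$\mathrm{GEA}_{\mathrm{Syn}_T} = \left(\frac{1}{2mt}\sum_{i=1}^{t}\sum_{p=1}^{m}\sum_{k=1}^{2} A_{pik}\right)^{2} = \frac{1}{4m^{2}t^{2}}\sum_{i=1}^{t}\sum_{j=1}^{t}\sum_{p=1}^{m}\sum_{q=1}^{m}\sum_{k=1}^{2}\sum_{l=1}^{2} A_{pik}A_{qjl} \tag{13}$$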
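If, in Equation 13, Y pik,qjl and ȲSyn T are the genotypic value of A pik A qjl and the genotypic mean of Syn T, respectively:
$$\bar{Y}_{\mathrm{Syn}_T} = \frac{1}{4m^{2}t^{2}}\sum_{i=1}^{t}\sum_{j=1}^{t}\sum_{p=1}^{m}\sum_{q=1}^{m}\sum_{k=1}^{2}\sum_{l=1}^{2} Y_{pik,qjl} \tag{14}$$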
In addition, if: a) Ȳ RMT is the genotypic mean of the t subpopulations generated by the random mating of the m individuals representing each TWLH, and b) Ȳ CP is the mean of the t(t - 1) subpopulations produced by the direct and reciprocal interparental crosses, according to Equation 14:
$$\bar{Y}_{\mathrm{Syn}_T} = \frac{1}{t}\,\bar{Y}_{RMT} + \frac{t - 1}{t}\,\bar{Y}_{CP} \tag{15}$$
This specific result is consistent with that found in general terms by EUCARPIA-INRA (1981), presumably with a different methodology. Equation 15 is a prediction formula that deserves attention. While its application requires the experimental means of only two subpopulation groups, the one used by Márquez-Sánchez (2010) for Syn T is based on the experimental means of each of three subpopulation groups generated by: 1) the interparental crosses (Ȳ CP), 2) the self-pollinations of each parent (Ȳ S1), and 3) the intraparental crosses of each parent (Ȳ CWP). In terms of these three means, Equation 14 also leads to:
$$\bar{Y}_{\mathrm{Syn}_T} = \bar{Y}_{CP} - \frac{1}{t}\left(\bar{Y}_{CP} - \bar{Y}_{CWP}\right) - \frac{1}{mt}\left(\bar{Y}_{CWP} - \bar{Y}_{S1}\right) \tag{16}$$
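How the two-group and three-group predictions relate can be sketched numerically. The Python example below is an illustration under stated assumptions: the mt plants mate at random, so that of the (mt)² possible matings mt are selfings, tm(m - 1) are intraparental crosses and t(t - 1)m² are interparental crosses; the three-group expression is one algebraically equivalent arrangement of that weighted average; and the function names and input means are hypothetical.

```python
from fractions import Fraction


def random_mating_mean(m, y_s1, y_cwp):
    """Mean of one parent's m plants mated at random:
    m selfings and m(m - 1) intraparental crosses out of m * m matings."""
    return (Fraction(y_s1) + (m - 1) * Fraction(y_cwp)) / m


def predict_two_groups(t, y_rmt, y_cp):
    """Prediction from the random-mating and interparental means."""
    return (Fraction(y_rmt) + (t - 1) * Fraction(y_cp)) / t


def predict_three_groups(t, m, y_s1, y_cwp, y_cp):
    """Assumed equivalent arrangement in terms of the three group means."""
    y_s1, y_cwp, y_cp = Fraction(y_s1), Fraction(y_cwp), Fraction(y_cp)
    return y_cp - (y_cp - y_cwp) / t - (y_cwp - y_s1) / (m * t)


if __name__ == "__main__":
    t, m = 4, 10
    y_s1, y_cwp, y_cp = 5, 7, 9  # hypothetical means, arbitrary units
    y_rmt = random_mating_mean(m, y_s1, y_cwp)
    print(predict_two_groups(t, y_rmt, y_cp))
    print(predict_three_groups(t, m, y_s1, y_cwp, y_cp))
```

Both functions return the same value for the same underlying means, so the practical difference between the two predictions lies in which subpopulations have to be formed and evaluated.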
DISCUSSION
Inbreeding Coefficient
The inbreeding coefficient of Syn T for F = 1 (Equation 11) is lower than the one derived by Márquez-Sánchez (2010) for this case (F'Syn T). The formula on which the derivation of F'Syn T was based includes: a) r' 0,W = coancestry between the m individuals representing each parent, b) F 0 = inbreeding coefficient of the parents (three-way line hybrids), and c) r 0,B = coancestry between individuals of different hybrids. The formula used by this author is:
$$F'_{\mathrm{Syn}_T} = \frac{1}{t}\left[\frac{1 + F_0}{2m} + \frac{m - 1}{m}\, r'_{0,W}\right] + \frac{t - 1}{t}\, r_{0,B} \tag{17}$$
In this equation, Márquez-Sánchez (2010) considered that since the parental lines of the TWLHs are not related, r 0,B = 0 and F 0 = 0; also, according to the cited study, for F = 1, r' 0,W = 3/8. With this information, this author found that:
$$F'_{\mathrm{Syn}_T} = \frac{3m + 1}{8mt} \tag{18}$$
Because the lines are unrelated, it must happen that r 0,B = 0 and F 0 = 0. However, the intraparental coancestry between the m individuals representing each parent (r 0,W ) differs from 3/8 when F = 1. In this case, according to Equation 7, r 0,W = (3m - 4)/[8(m - 1)]. This implies that the inaccuracy of r' 0,W is 1/[8(m - 1)] and that with r 0,W instead of r' 0,W in Equation 18, it turns out that instead of F'Syn T we get FSyn T = 3/(8t); that is, F'Syn T has a bias equal to 1/(8mt).
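The size of this bias can be checked with a short Python sketch. The general formula coded below, (1/t)[(1 + F 0)/(2m) + ((m - 1)/m) r 0,W] + ((t - 1)/t) r 0,B, is an assumption about the form of the expression discussed above, and the function name is hypothetical; the two coancestry values are the ones quoted in the text for F = 1.

```python
from fractions import Fraction


def f_syn_general(t, m, r_w, f0=0, r_b=0):
    """Assumed general form of the inbreeding coefficient of a synthetic
    with t parents, each represented by m plants."""
    within = (1 + Fraction(f0)) / (2 * m) + Fraction(m - 1, m) * Fraction(r_w)
    return within / t + Fraction(t - 1, t) * Fraction(r_b)


if __name__ == "__main__":
    t, m = 5, 10
    r_prime = Fraction(3, 8)                    # value from the cited study
    r_exact = Fraction(3 * m - 4, 8 * (m - 1))  # value for F = 1 in this work
    biased = f_syn_general(t, m, r_prime)
    exact = f_syn_general(t, m, r_exact)
    print(biased, exact, biased - exact, Fraction(1, 8 * m * t))
```

For any t and m ≥ 2, the difference between the two results under this assumed form is 1/(8mt), which shrinks as the number of plants per parent grows.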
On the other hand, if the inbreeding coefficient of the lines is F (0 ≤ F ≤ 1), r 0,B and F 0 are not affected, but r 0,W is (Equation 7). This change, applied to Equation 17, produces the unbiased and general IC of Syn T (Equation 5).
Regarding the synthetics derived from only the 3t lines (Syn L) or only the t three-way line crosses (Syn T), the Syn L lines must be the parents that, by self-pollination and intraparental crosses, produce the highest proportion of genotypes formed by two genes identical by descent. This is because each line contains only 1 (F = 1) or 2 (F < 1) genes that are not identical by descent, whereas the offspring of each three-way line cross may contain 3 (when F = 1) or more (when F < 1). These considerations suggest that Syn L should have the higher inbreeding coefficient; however, this is not the case (Equations 9 and 10): the smaller of the two inbreeding coefficients is FSyn L. Part of the explanation for this apparent contradiction is that Syn L has the higher proportion of interparental crosses, which are not inbred (Table 1). For a fixed number of initial lines, Syn L has three times as many parents (3t) as Syn T (t), so the percentage of interparental crosses is always higher in Syn L (Table 1). Another factor that makes FSyn T greater than FSyn L is the imbalance in the frequencies of the genes contributed by the lines that form each TWLH. With balanced gene frequencies (as in Syn L), the formation of genotypes made up of genes that are not identical by descent (which do not contribute to the inbreeding coefficient) is maximized (Sahagún-Castellanos et al., 2013).
Table 1. Percentage of interparental crosses in Syn L and Syn T for different numbers of initial lines (3t).

Synthetic variety | 3t = 3 | 3t = 6 | 3t = 9 | 3t = 12 | 3t = 15 | 3t = 18 |
---|---|---|---|---|---|---|
Syn L | 66.67 | 83.33 | 88.89 | 91.67 | 93.33 | 94.44 |
Syn T | 0.00 | 50.00 | 66.67 | 75.00 | 80.00 | 83.33 |
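The percentages of interparental crosses can be recomputed with a short Python sketch. It assumes that, under random mating among n equally represented parents, a fraction (n - 1)/n of the matings involves two different parents, with n = 3t for Syn L and n = t for Syn T; the function name is hypothetical.

```python
def interparental_percentage(n_parents):
    """Percentage of matings between two different parents,
    assuming random mating among equally represented parents."""
    return 100 * (n_parents - 1) / n_parents


if __name__ == "__main__":
    print("lines  Syn L   Syn T")
    for t in (1, 2, 3):
        syn_l = interparental_percentage(3 * t)  # parents are the 3t lines
        syn_t = interparental_percentage(t)      # parents are the t crosses
        print(f"{3 * t:5d}  {syn_l:5.2f}  {syn_t:5.2f}")
```

Because 3t is greater than t, the percentage is higher for Syn L at every number of initial lines, and the gap narrows as t grows.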
Regarding the origin of the 2m genes carried at a locus by the sample of m plants that represents a three-way line cross of the form (L A × L B) × L C, the L C line invariably contributes half (m), while the L A and L B lines contribute X (X = 0,1,2,…,m) and Y = m - X genes, respectively. By contrast, in a synthetic formed with only 3t lines (3t/2 single crosses), each line (single cross) invariably provides 2m (m) genes. This means that the genetically more stable synthetics are those derived from only one type of parent (lines or single crosses).
Genotypic Mean
Given the magnitudes of the inbreeding coefficients and the inverse linear relationship between them and the genotypic means of the synthetics (Busbice, 1970), the genotypic mean of Syn L for a variable such as grain yield must be greater than that of Syn T. On the other hand, Equation 16 differs from the one derived by Márquez-Sánchez (2010) for the genotypic mean of Syn T. The difference is the sign of the term (1/t)(Ȳ CP - Ȳ CWP), which is negative in this work and positive in that of the cited author. Regarding this difference, the details of the derivation are shown below. From an equation analogous to Equation 14, Sahagún-Castellanos (1998) arrived at an expression that, adapted for ȲSyn T, is:
From the genotypic means of all the offspring of each of the three terms of the Equation 19 numerator [self-pollinations (Ȳ S1), intraparental crosses (Ȳ CWP) and interparental crosses (Ȳ CP)], it turns out that:
According to Sahagún-Castellanos (1998), the prediction based on Equation 20 would be more accurate than that of Equation 15 because it uses a phenotypic mean with a lower variance (Wricke and Weber, 1986). However, its strict application is not realistic because it requires forming and evaluating, in replicated field experiments: a) tm offspring derived from self-pollination, b) t(t - 1) interparental crosses, and c) tm(m - 1) intraparental crosses. Applying Equation 15, on the other hand, only requires forming and evaluating the t populations produced by the random mating of each three-way line cross and the t(t - 1) direct and reciprocal crosses of the t three-way line hybrids.
CONCLUSIONS
In this study, one formula was derived for predicting the genotypic mean and another for the unbiased inbreeding coefficient (IC) of the synthetic variety whose parents are t three-way line crosses generated with 3t unrelated and not necessarily pure lines (FSyn T). FSyn T solves the problem of having only an IC that is overvalued and restricted to the use of pure lines. However, FSyn T is greater than the IC of the synthetic whose parents are the 3t lines (Syn L). This is because: 1) the frequencies of the genes in the 3t parental lines of the Syn L are balanced, whereas those in the t parental three-way line crosses of the Syn T are not; and 2) with 3t parents in the Syn L and t in the Syn T, the percentages of interparental crosses, which do not contribute to inbreeding, are always higher in the Syn L. Regarding the genotypic means, the inverse relationship between them and the ICs implies that the mean of Syn L exceeds that of Syn T for a variable such as grain yield.