Corresponding author: Rongling Wu, 533 McCarty Hall C, University of Florida, Gainesville, FL 32611. E-mail: rwu@stat.ufl.edu
Communicating editor: M. A. ASMUSSEN
| ABSTRACT |
|---|
Two major aspects have made the genetic and genomic study of polyploids extremely difficult. First, increased allelic or nonallelic combinations due to multiple alleles result in complex gene actions and interactions for quantitative trait loci (QTL) in polyploids. Second, meiotic configurations in polyploids undergo a complex biological process including either bivalent or multivalent formation, or both. For bivalent polyploids, different degrees of preferential chromosome pairings may occur during meiosis. In this article, we develop a maximum-likelihood-based model for mapping QTL in tetraploids by considering the quantitative inheritance and meiotic mechanism of bivalent polyploids. This bivalent polyploid model is implemented with the EM algorithm to simultaneously estimate QTL position, QTL effects, and QTL-marker linkage phases by incorporating the impact of a cytological parameter determining bivalent chromosome pairings (the preferential pairing factor). Simulation studies are performed to investigate the performance and robustness of our statistical method for parameter estimation. The implication and extension of the bivalent polyploid model are discussed.
POLYPLOIDS represent a group of plant species that are of great importance to evolutionary studies and plant breeding.
A significant gap that still remains in the current genetic study of polyploids is a serious lack of powerful statistical methods for mapping quantitative trait loci (QTL) on the basis of the genetic map of polymorphic markers. We know of only three articles that deal with the development of QTL-mapping methodologies.
In this article, we have developed a new maximum-likelihood-based statistical infrastructure for mapping QTL in polyploids undergoing bivalent formation during meiosis. Beyond the existing statistical methods, our method integrates quantitative genetic knowledge about gene action and interaction and cytological mechanisms of chromosome pairing to gain better insights into the structure, organization, and function of polyploid genomes. It is observed that for many polyploids there is a higher probability of pairing between more similar chromosomes than between less similar chromosomes.
| MATHEMATICAL MODEL FOR LINKAGE ANALYSIS |
|---|
Meiotic pairing:
Consider a bivalent tetraploid, in which there are four sets of chromosomes. If chromosomes 1 and 2 are genetically more identical, as are chromosomes 3 and 4, there are three different combinations for the bivalent chromosome pairing. One of the three is pairing between the more identical chromosomes, 1 with 2 and 3 with 4 (pairing 1); the other two are pairings between less identical chromosomes, 1 with 3 and 2 with 4 (pairing 2) or 1 with 4 and 2 with 3 (pairing 3). In general, the probability of pairing between more identical chromosomes is higher than that between less identical chromosomes because of the different evolutionary relatedness of the chromosomes. In terms of the preferential pairing factor p, the probabilities of the three pairings are 1/3 + p for pairing 1, 1/3 - p/2 for pairing 2, and 1/3 - p/2 for pairing 3. When p = 0, the four chromosomes in one group pair completely randomly. Extreme autopolyploids follow this pattern. When p = 2/3, chromosome pairing occurs only between homologous chromosomes and never between homeologous ones. This pattern characterizes extreme allopolyploids. Most polyploids are intermediate between these two extremes. Some polyploids that were originally classified as autotetraploids have been found to belong to the intermediate types, with 0 < p < 2/3.
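The pairing probabilities above can be checked with a few lines of Python. This is only an illustrative sketch: the function name and the string labels for the three pairings are assumptions made here, and the value 1/3 + p for the pairing between more similar chromosomes follows from requiring the three probabilities to sum to one.

```python
from fractions import Fraction


def pairing_probabilities(p):
    """Probabilities of the three bivalent pairings for preferential pairing factor p.

    Keys are hypothetical labels: "12|34" means chromosome 1 pairs with 2 and 3 with 4.
    """
    p = Fraction(p)
    if not 0 <= p <= Fraction(2, 3):
        raise ValueError("the preferential pairing factor must lie in [0, 2/3]")
    third = Fraction(1, 3)
    return {
        "12|34": third + p,      # pairing between the more similar chromosomes
        "13|24": third - p / 2,  # pairing between less similar chromosomes
        "14|23": third - p / 2,  # pairing between less similar chromosomes
    }


random_pairing = pairing_probabilities(0)                    # extreme autopolyploid
exclusive_pairing = pairing_probabilities(Fraction(2, 3))    # extreme allopolyploid
```

Exact fractions are used so that the two extreme cases (all three pairings equally likely, or only the first pairing possible) come out without rounding error.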
Tetraploid model for three-point linkage analysis:
Linkage analysis in most diploid organisms is based on inbred line crosses, such as a backcross or F2. However, for many other species including polyploids, inbred lines are not available and, thus, their linkage analysis should be based on a full-sib family derived from outbred parental lines. In such a full-sib family, numerous cross types of genes can be possible. To simplify our description of linkage analysis in polyploids, we first consider fully informative markers between the two parents. Our mapping model can be readily generalized to consider arbitrary polyploid cross types composed of any type of partially informative markers.
Suppose there is a full-sib family of size n derived from two heterozygous tetraploid parents P and Q. Consider two fully informative markers, $t$ and $t+1$, each of which has eight different alleles: four on the chromosomes of parent P and four on the chromosomes of parent Q. The four alleles at marker $t$ are labeled $M^{t}_1$, $M^{t}_2$, $M^{t}_3$, and $M^{t}_4$ for parent P and $N^{t}_1$, $N^{t}_2$, $N^{t}_3$, and $N^{t}_4$ for parent Q, and the four alleles at marker $t+1$ are labeled $M^{t+1}_1$, $M^{t+1}_2$, $M^{t+1}_3$, and $M^{t+1}_4$ for parent P and $N^{t+1}_1$, $N^{t+1}_2$, $N^{t+1}_3$, and $N^{t+1}_4$ for parent Q. Between these two markers there is a putative QTL whose alleles are denoted by P1, P2, P3, and P4 for parent P and Q1, Q2, Q3, and Q4 for parent Q. The recombination fractions between marker $t$ and the QTL, between the QTL and marker $t+1$, and between the two markers are denoted by r1, r2, and r, respectively. For parents P and Q, these three loci (two markers and one QTL) have a total of 576 × 576 = 331,776 possible nonallelic configurations, or linkage phase combinations, one of which can be schematically expressed as
|
(1) |
where lines indicate the individual chromosomes on which the QTL is bracketed by the two markers and ⊗ is the Kronecker product. The specific linkage phase combination of parents P and Q, which is not known a priori, must be inferred from these possibilities for correct QTL mapping on the basis of marker and phenotype observations. In general, the linkage phase of the two markers is known before QTL mapping. Thus, we need to determine only the most likely linkage phase combination from the 24 × 24 = 576 possible arrangements of the QTL relative to its two flanking markers.
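The phase counts quoted above can be reproduced by enumeration. The sketch below rests on one reading of the text, stated here as an assumption: within a parent, the alleles at one locus serve as a reference order, and the alleles at each remaining locus can be arranged over the four chromosomes in 4! = 24 ways. All variable names are illustrative.

```python
from itertools import permutations

CHROMOSOMES = (1, 2, 3, 4)

# Number of ways to assign the four alleles of one locus to the four chromosomes.
arrangements_per_locus = len(list(permutations(CHROMOSOMES)))  # 4! = 24

# Marker phase known: only the QTL alleles are permuted, once in each parent.
qtl_phase_combinations = arrangements_per_locus ** 2  # 24 x 24

# Nothing known: with one locus as reference, the other two loci are permuted
# in each parent (24 x 24 per parent), and the two parents are combined.
per_parent_three_loci = arrangements_per_locus ** 2
all_phase_combinations = per_parent_three_loci ** 2  # 576 x 576
```

The two totals match the 576 and 331,776 combinations given in the text.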
Apart from the effect of different linkage phases on gamete formation frequencies, as is the case in diploid organisms, the three different bivalent pairings (1, 2, and 3) in bivalent polyploids also affect the patterns of gene segregation and, thus, gamete frequencies. However, these two factors have different influences. For a particular parent, there can be only one linkage phase, whereas different bivalent pairings may occur simultaneously with different frequencies. Hence, the overall frequencies of gametes from the three possible bivalent pairings should be expressed in terms of the preferential pairing factor p for parents P and Q:
|
(2) |
where double lines are used to distinguish the two sets of paired chromosomes. For one parent, each of these three bivalent pairings produces four diploid gamete types at a single locus. When the gametes from these pairings are mixed, a total of six gamete types are produced at a locus. Thus, under bivalent pairings, parent P generates 6 × 6 = 36 diploid gamete genotypes at the two markers. The probabilities of these marker gametes can be derived in terms of the preferential pairing factor and the recombination fraction between the two markers. Parent P likewise generates six diploid gamete genotypes at the QTL, which are produced in the same way as the marker gametes [expression (2)]. Table 1 lists the joint probabilities of the two-marker and one-QTL gamete genotypes in parent P when the three possible bivalent pairings occur at meiosis, given the particular linkage phase of expression (1).
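The single-locus statement (four gamete types per pairing, six types once the pairings are mixed) can be illustrated with a short enumeration. This is a sketch under stated assumptions: each gamete receives one chromosome from each bivalent with equal probability, there is no double reduction, and the pairing probabilities are 1/3 + p, 1/3 - p/2, and 1/3 - p/2. The function name and data layout are hypothetical.

```python
from fractions import Fraction
from itertools import product


def single_locus_gamete_frequencies(p):
    """Frequencies of the six diploid gametes at one locus, mixed over the three pairings."""
    p = Fraction(p)
    third = Fraction(1, 3)
    # Each key lists the two bivalents formed by chromosomes 1-4; values are pairing probabilities.
    pairings = {
        ((1, 2), (3, 4)): third + p,
        ((1, 3), (2, 4)): third - p / 2,
        ((1, 4), (2, 3)): third - p / 2,
    }
    frequencies = {}
    for (bivalent_a, bivalent_b), pairing_prob in pairings.items():
        # One chromosome from each bivalent enters the gamete: 2 x 2 = 4 gamete types.
        for a, b in product(bivalent_a, bivalent_b):
            gamete = tuple(sorted((a, b)))
            frequencies[gamete] = frequencies.get(gamete, 0) + pairing_prob / 4
    return frequencies


freqs = single_locus_gamete_frequencies(Fraction(3, 10))
```

Under these assumptions, gametes carrying the two more similar chromosomes together, (1, 2) or (3, 4), have frequency 1/6 - p/4, and the other four gametes have frequency 1/6 + p/8, so preferential pairing makes the former rarer.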
Similarly, for parent Q, we can write the array of the two-marker gamete genotypes and the array of one-QTL gamete genotypes. The probabilities of the two-marker gamete genotypes and of the joint marker and QTL gamete genotypes can also be written. With the information of the two parents, we can express the arrays of zygote genotypes for the markers and the QTL, respectively, as

and the probabilities of two-marker zygote genotypes and of joint marker and QTL zygote genotypes, respectively, as
![]() |
(3) |
![]() |
(4) |
The conditional probabilities of the QTL zygote genotypes upon the marker zygote genotypes can be derived as
![]() |
(5) |
which forms a (1296 × 36) matrix obtained by elementwise division of the two matrices. These conditional probabilities are used for QTL mapping as described in the next section.
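The elementwise division can be sketched on a toy example. The layout used here is an assumption for illustration: rows index marker zygote genotypes, columns index QTL zygote genotypes, and each entry of the joint matrix is divided by the probability of its row's marker genotype. The tiny 2 × 2 matrix stands in for the 1296 × 36 matrix of the text, and its numbers are made up.

```python
def conditional_qtl_given_marker(joint):
    """Convert joint probabilities P(marker, QTL) into conditional ones P(QTL | marker).

    joint[m][q] is the joint probability of marker genotype m and QTL genotype q.
    """
    conditional = []
    for row in joint:
        marker_prob = sum(row)  # marginal probability of this marker genotype
        if marker_prob > 0:
            conditional.append([value / marker_prob for value in row])
        else:
            conditional.append([0.0 for _ in row])  # marker genotype never observed
    return conditional


toy_joint = [
    [0.30, 0.10],
    [0.05, 0.55],
]
toy_conditional = conditional_qtl_given_marker(toy_joint)
```

Each row of the result sums to one, which is what allows it to serve as the set of mixture proportions for offspring carrying that marker genotype.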
| STATISTICAL METHOD FOR QTL MAPPING |
|---|
The mixture model:
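A fundamental statistical model for QTL mapping is the mixture model, in which each observation y is assumed to arise from one of n components:

$$p(y \mid \boldsymbol{\pi}, \boldsymbol{\varphi}, \eta) = \pi_1 f_1(y; \varphi_1, \eta) + \cdots + \pi_n f_n(y; \varphi_n, \eta), \tag{6}$$

where $\boldsymbol{\pi} = (\pi_1, \ldots, \pi_n)$ are the mixture proportions that are constrained to be nonnegative and sum to unity; $\boldsymbol{\varphi} = (\varphi_1, \ldots, \varphi_n)$ are the component-specific parameters, with $\varphi_j$ being specific to component j; and $\eta$ is a parameter that is common to all components.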
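A minimal sketch of such a mixture density follows, assuming normal components that differ in their means (the component-specific parameters) and share one variance (the common parameter), as the article does later for QTL genotypes. The function names are illustrative.

```python
import math


def normal_pdf(y, mean, variance):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((y - mean) ** 2) / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def mixture_density(y, proportions, means, variance):
    """Weighted sum of normal densities; proportions must be nonnegative and sum to one."""
    if min(proportions) < 0 or abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("mixture proportions must be nonnegative and sum to one")
    return sum(w * normal_pdf(y, mu, variance) for w, mu in zip(proportions, means))


# Two components with equal weight, centered at -1 and +1.
density_at_zero = mixture_density(0.0, [0.5, 0.5], [-1.0, 1.0], 1.0)
```

With a single component the mixture reduces to an ordinary normal density, which is a convenient sanity check.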
For the mixture model used in genetic mapping, each component represents a class of QTL genotypes and, thus, the mixture model provides a framework by which observations may be clustered together into different classes of QTL genotypes. The mixture proportions represent the relative frequency of occurrence of each QTL genotype in the population. For a particular two-marker genotype, $M^{t}_{k_1}M^{t}_{k_2}N^{t}_{l_1}N^{t}_{l_2}M^{t+1}_{r_1}M^{t+1}_{r_2}N^{t+1}_{s_1}N^{t+1}_{s_2}$ (superscripts index the two flanking markers), the frequency of the QTL genotype Pu1Pu2Qv1Qv2 is the corresponding conditional probability described by Equation 5 and given in Table 1.
Linear model of a quantitative trait:
The mixture components in the mixture model of Equation 6 follow a normal distribution, with the mean equal to the expected genotypic value (µu1u2v1v2) of a QTL genotype and the variance equal to the residual variance (σ²) within the QTL genotype. The phenotype of a quantitative trait observed for individual i can be described by a linear model,

$$y_i = \sum_{1 \le u_1 < u_2 \le 4} \; \sum_{1 \le v_1 < v_2 \le 4} X_{i(u_1u_2v_1v_2)} \, \mu_{u_1u_2v_1v_2} + e_i, \tag{7}$$

where Xi(u1u2v1v2) is the indicator variable defined as 1 if individual i has the QTL genotype Pu1Pu2Qv1Qv2 and 0 otherwise, and ei is the residual effect, distributed as N(0, σ²). The genotypic value of QTL genotype Pu1Pu2Qv1Qv2 is partitioned into additive and dominant (interaction) effects of different orders:
![]() |
(8) |
In a full-sib family, an individual will inherit two QTL alleles, Pu1Pu2, from parent P and two QTL alleles, Qv1Qv2, from parent Q. Because parents P and Q together have a total of eight different alleles, the above genetic model includes 8 main effects, 28 diallelic interactions [6 due to two alleles from the same parent P (βPPu1u2), 6 due to two alleles from the same parent Q (βQQv1v2), and 16 due to two alleles from different parents (βPQu1v1, βPQu1v2, βPQu2v1, βPQu2v2)], 48 triallelic interactions [24 due to two alleles from parent P and one from parent Q (βPPQu1u2v1, βPPQu1u2v2) and 24 due to one allele from parent P and two from parent Q (βPQQu1v1v2, βPQQu2v1v2)], and 36 tetraallelic interactions.
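The effect counts in this paragraph follow from simple combinatorics over four alleles per parent, which the sketch below spells out. The variable names are illustrative; the estimable-parameter counts on the last line are copied from the text rather than derived.

```python
from math import comb

ALLELES_PER_PARENT = 4
pairs_within_parent = comb(ALLELES_PER_PARENT, 2)  # 6 unordered allele pairs per parent

main_effects = 2 * ALLELES_PER_PARENT                                   # 4 + 4
diallelic = 2 * pairs_within_parent + ALLELES_PER_PARENT ** 2           # 6 + 6 + 16
triallelic = 2 * pairs_within_parent * ALLELES_PER_PARENT               # 24 + 24
tetraallelic = pairs_within_parent ** 2                                 # 6 x 6

total_effects = main_effects + diallelic + triallelic + tetraallelic

# After reparameterization the text reports 6 + 13 + 12 + 4 estimable effects,
# which together with the overall mean match the 36 QTL genotypes.
estimable_effects = 6 + 13 + 12 + 4
qtl_genotypes = pairs_within_parent ** 2
```

The total of 120 raw effects against only 36 distinguishable QTL genotypes is what makes the reparameterization necessary.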
Because some of the main and interaction effects are not independent, a parameterization process based on effect partitioning is needed to obtain a smaller number of estimable independent parameters (Appendix A). After this, estimable parameters include 6 for the main effects, 13 for the diallelic interactions (2 for interactions between alleles from parent P, 2 for parent Q, and 9 for interactions between alleles from different parents), 12 for the triallelic interactions, and 4 for the tetraallelic interactions.
We also used orthogonal polynomials to parameterize the main and interaction effects into linear contrasts, quadratic contrasts, and, if any, cubic contrasts (C.-X. MA and R. L. WU, unpublished results). Yet, we do not report the results from this parameterization approach here because of space limitation.
Computational algorithm:
A maximum-likelihood approach is used to fit a single QTL affecting a quantitative trait in tetraploids. The likelihood of the phenotypes (y) for n offspring in a full-sib family of two outcrossing tetraploids is expressed as

$$L(\mathbf{y} \mid \Omega) = \prod_{i=1}^{n} \left[ \sum_{1 \le u_1 < u_2 \le 4} \; \sum_{1 \le v_1 < v_2 \le 4} p_{u_1u_2v_1v_2 i} \, f_{u_1u_2v_1v_2}(y_i) \right], \tag{9}$$

where Ω = (a, r1 or r2, σ², p) is the vector of unknown parameters containing the overall mean, QTL effects, QTL position, residual variance, and the preferential pairing factor; pu1u2v1v2i is the probability that progeny i has QTL genotype Pu1Pu2Qv1Qv2, which is the probability of the QTL genotype conditional upon the marker genotype (Table 1) when the marker information is combined. Last, fu1u2v1v2(yi) is a normal density for QTL genotype Pu1Pu2Qv1Qv2, with the mean equal to the expected genotypic value from Equation 7 and the variance equal to the residual variance (σ²) within this genotype.
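The likelihood described here is a product over offspring of a weighted sum of normal densities, where the weights are each offspring's conditional QTL genotype probabilities. A minimal sketch of its logarithm follows; the function names and the list-of-lists layout for the genotype probabilities are assumptions for illustration, and the toy data use two genotype classes instead of 36.

```python
import math


def normal_pdf(y, mean, variance):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((y - mean) ** 2) / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def log_likelihood(phenotypes, genotype_probs, genotype_means, residual_variance):
    """Log of the mixture likelihood.

    genotype_probs[i][j] is the probability that offspring i carries QTL genotype j,
    conditional on that offspring's marker genotype.
    """
    total = 0.0
    for y, probs in zip(phenotypes, genotype_probs):
        mixture = sum(w * normal_pdf(y, mu, residual_variance)
                      for w, mu in zip(probs, genotype_means))
        total += math.log(mixture)
    return total


toy_phenotypes = [9.5, 10.2, 11.1]
toy_probs = [[0.8, 0.2], [0.5, 0.5], [0.1, 0.9]]
toy_loglik = log_likelihood(toy_phenotypes, toy_probs, [9.5, 11.0], 0.25)
```

Working on the log scale avoids multiplying many small densities together, which matters once the family size reaches the hundreds.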
As seen from above, the total number of QTL effects equals the number of the QTL genotypes in bivalent tetraploids. This permits us to estimate the overall mean and QTL effect parameters from the estimated genotypic values (µ̂u1u2v1v2) of the QTL genotypes by solving a system of linear equations. From a computational perspective, it is more efficient to estimate the expected genotypic values (µu1u2v1v2) from the mixture model of Equation 7 than to estimate directly the overall mean µ and the QTL effect parameters that comprise vector a. We use the two parameterization approaches, as mentioned above, to estimate vector a from the 36 normal mixture components of QTL genotypes for bivalent tetraploids. The process of estimating µu1u2v1v2 and σ² on the basis of the EM algorithm is given in Appendix B. The maximum-likelihood estimates (MLEs) of r1 or r2 and p can be obtained using a grid approach because each of these two parameters is bounded: 0 ≤ r1 or r2 ≤ 1 and 0 ≤ p ≤ 2/3.
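The grid approach can be sketched as an exhaustive scan over candidate values of the QTL position and the preferential pairing factor. In the sketch below, `profile_loglik` is a hypothetical callable standing in for the maximized log-likelihood at fixed (r1, p), which in the article would come from running the EM algorithm at each grid point; the quadratic toy surface is made up purely to make the example runnable.

```python
def grid_search(profile_loglik, r1_values, p_values):
    """Return (best log-likelihood, r1, p) over all grid points."""
    best = None
    for r1 in r1_values:
        for p in p_values:
            value = profile_loglik(r1, p)
            if best is None or value > best[0]:
                best = (value, r1, p)
    return best


def toy_profile_loglik(r1, p):
    """Stand-in surface peaking at r1 = 0.10 and p = 0.30 (illustrative only)."""
    return -((r1 - 0.10) ** 2) - ((p - 0.30) ** 2)


r1_grid = [i * 0.01 for i in range(1, 20)]            # positions inside a marker interval of 0.20
p_grid = [(2.0 / 3.0) * i / 20 for i in range(21)]    # p between 0 and 2/3
best_value, best_r1, best_p = grid_search(toy_profile_loglik, r1_grid, p_grid)
```

The grid resolution trades accuracy against the cost of one EM run per grid point, so a coarse scan followed by a finer one around the maximum is a common compromise.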
The characterization of linkage phase:
Above, we have derived a statistical procedure for estimating the recombination fraction and the preferential pairing factor in polyploids when their chromosome pairings at meiosis follow the bivalent model. The procedure assumes the linkage phase combination of the two markers and QTL as indicated by display (1). However, this represents only one of the 576 possible combinations for the two phase-known flanking markers and the QTL. Optimal estimates of all parameters should be based on a most likely linkage phase combination. Different linkage phases of the QTL relative to its flanking markers can be assigned on the basis of the permutation of four QTL alleles on four different chromosomes for each parent. A most likely linkage phase combination should correspond to the largest likelihood value calculated from Equation 9.
However, a new question arises when likelihood values are compared among different phase combinations. If we change the linkage phase, we may obtain different estimates for a QTL effect parameter, but we will obtain the same likelihood value. We therefore should impose constraints on the allelic effects of the two parents to obtain comparable likelihood values. In fact, the occurrence of a particular linkage phase implies that alleles should be different for both loci under consideration. A total of 576 phase combinations between the QTL and its flanking fully informative markers are based on the condition that the four QTL alleles are different for each parent. The direct description of such differences can be provided by allelic effects. Thus, we can impose inequality constraints on three allelic effects from each parent. Without loss of generality, such constraints can be taken as
![]() |
(10) |
for parent P and
![]() |
(11) |
for parent Q. Under these constraints, we will obtain different likelihood values from different linkage phase combinations and, therefore, it will be possible to select a most likely linkage phase combination.
Hypothesis tests:
After the optimal estimates for the linkage and linkage phase are obtained on the basis of the largest likelihood value, we test for the significance of linkage by calculating the likelihood-ratio test (LRT) statistic,

$$\mathrm{LRT} = -2 \log \left[ \frac{L(\tilde{\Omega} \mid \mathbf{y}, A_o)}{L(\hat{\Omega} \mid \mathbf{y}, A_o)} \right], \tag{12}$$

where Ao stands for the most likely linkage phase combination between the QTL and its flanking markers under which the likelihood value is highest, calculated from Equation 9 with the above-mentioned constraints. Here, $\hat{\Omega}$ and $\tilde{\Omega}$ stand for the MLEs of the unknown parameters under the full model (at least one element in a is not equal to zero) and the reduced model (a = 0), respectively. By formulating similar reduced models, we can also test for the significance of additive effects or dominance effects at different interaction levels.
As in diploid mapping, simulation studies can be used to determine critical threshold values. We can declare the existence of a significant QTL located between the two flanking markers if the LRT is greater than the critical threshold for an appropriate choice of the type I error rate. Similarly, we can formulate a hypothesis for testing whether or not the preferential pairing factor is equal to zero (the four chromosomes of a set are all homologous; the autopolyploid model) or 2/3 (homeologous chromosomes do not pair; the allopolyploid model). Results from such a test are useful for examining the level of relatedness between different genomes.
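The test procedure can be sketched in two small functions: one computes the LRT from the maximized log-likelihoods of the full and reduced models, and the other takes an empirical critical value from LRT statistics simulated under the null hypothesis. Both function names are hypothetical, and the quantile rule used (the smallest simulated value with at least a 1 - alpha share of the null statistics at or below it) is one reasonable choice rather than the article's stated rule.

```python
import math


def lrt_statistic(loglik_full, loglik_reduced):
    """Likelihood-ratio statistic from maximized log-likelihoods of two nested models."""
    return -2.0 * (loglik_reduced - loglik_full)


def empirical_threshold(null_lrts, alpha=0.05):
    """Empirical critical value from LRT statistics simulated with QTL effects set to zero."""
    if not null_lrts:
        raise ValueError("at least one simulated null statistic is required")
    ordered = sorted(null_lrts)
    index = math.ceil((1.0 - alpha) * len(ordered)) - 1
    index = min(max(index, 0), len(ordered) - 1)
    return ordered[index]


observed = lrt_statistic(loglik_full=-250.0, loglik_reduced=-262.5)
threshold = empirical_threshold([float(v) for v in range(1, 101)], alpha=0.05)
qtl_declared = observed > threshold
```

In practice the list of null statistics would come from repeating the whole mapping procedure on data sets simulated without a QTL, as the article describes.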
| RESULTS |
|---|
Simulation studies are performed to examine the statistical behavior of our bivalent polyploid model. We first focus our simulation on quantifying the effects of trait heritability and sample size on the estimation of QTL parameters and of the bivalent chromosome pairing parameter. Then, we compare the parameter estimates from our method with those from DOERGE and CRAIG's (2000) method, in which completely preferential bivalent chromosome pairings are assumed, and from Hackett et al.'s method, in which random bivalent pairings are assumed.
Experimental design:
Two outcrossing tetraploid parents are simulated for two fully informative markers and a QTL with the assumed linkage phase configuration shown in display (1). The recombination fractions between the two markers and between the first marker and the QTL are set to 0.20 and 0.10, respectively. The preferential pairing factor p = 0.30 is assumed. These two parents are crossed to generate full-sib families of 200, 400, or 800 offspring. For a given sample size, the numbers of offspring in each of the 36 × 36 = 1296 genotypes at these two markers are simulated on the basis of their respective frequencies (Equation 4).
The numbers of offspring within each marker genotype carrying each of the 36 QTL genotypes are simulated on the basis of the conditional probability matrices of Equation 5. Because of the QTL effects, offspring with different QTL genotypes will differ for a quantitative trait. The genotypic values of the offspring carrying different QTL genotypes are calculated on the basis of their structures, as given in D⁻¹a (Appendix A), using the hypothesized values of the overall mean and 35 effects in the vector a (Table 2). The variance among these genotypic values is the genetic variance explained by this QTL. The phenotypic values of the offspring are calculated as an overall mean of 10, plus the genotypic values and the residual effects distributed as N(0, σ²). Different σ² values are assigned by assuming heritability levels of 0.20 and 0.40. The heritability is defined as the proportion of the genetic variance in the total phenotypic variance.
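Because heritability is the ratio of genetic to total phenotypic variance, H² = Vg / (Vg + σ²), the residual variance for a target heritability is σ² = Vg (1 - H²) / H². The sketch below applies this and draws one phenotype; all function names are illustrative, and the genotypic values and frequencies passed in are made-up toy numbers rather than the values of Table 2.

```python
import math
import random


def genetic_variance(genotypic_values, frequencies):
    """Frequency-weighted variance of the genotypic values."""
    mean = sum(f * v for f, v in zip(frequencies, genotypic_values))
    return sum(f * (v - mean) ** 2 for f, v in zip(frequencies, genotypic_values))


def residual_variance(genetic_var, heritability):
    """Residual variance that yields the requested heritability."""
    if not 0.0 < heritability <= 1.0:
        raise ValueError("heritability must lie in (0, 1]")
    return genetic_var * (1.0 - heritability) / heritability


def simulate_phenotype(genotypic_value, residual_var, rng, overall_mean=10.0):
    """Overall mean plus genotypic value plus a normal residual."""
    return overall_mean + genotypic_value + rng.gauss(0.0, math.sqrt(residual_var))


vg = genetic_variance([-1.0, 0.0, 1.0], [0.25, 0.5, 0.25])
sigma2_low_h2 = residual_variance(vg, 0.20)
sigma2_high_h2 = residual_variance(vg, 0.40)
example_phenotype = simulate_phenotype(1.0, sigma2_low_h2, random.Random(1))
```

Halving the heritability from 0.40 to 0.20 more than doubles the residual variance for the same QTL, which is why the low-heritability scenarios need larger families.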
For the simulated marker and phenotypic data, we use the bivalent polyploid model to estimate the unknown parameters contained in the vector (m, r1 or r2, σ², p) and further obtain the MLEs of the gene effects a using the procedure described in Appendix A. By permuting the arrangements of the four QTL alleles among the four chromosomes for each parent, we obtain the MLEs of the parameters with the constraints given in displays (10) and (11) under a total of 576 linkage phase combinations. The phase combination that has the largest likelihood value is regarded as the most likely one, and the MLEs obtained under it are given in Table 2. The simulations are repeated 100 times to calculate the means and standard errors of the MLEs from our model.
The effects of trait heritability and sample size:
Using the computational algorithms described in Appendix B, we obtain the MLEs of the parameters. The recombination fraction between the first marker and the QTL can be accurately estimated for the different sample sizes (n) and heritability (H²) levels considered, although its estimation precision increases with sample size and heritability level. The estimate of the residual variance (σ²) is considerably downward biased, especially for a trait with low heritability, if the sample size used is <400.
The real genotypic values of the 36 QTL genotypes are determined from a = D⁻¹m (see Appendix A). The EM algorithm provides accurate estimates for these genotypic values, even when sample size or heritability is low (results not shown). If the genotypic values can be well estimated, the QTL gene effects (a) can also be well estimated because, according to our parameterization, the sampling variances of â will be reduced relative to those of the estimated genotypic values [see the structure of D⁻¹(D⁻¹)ᵀ in Appendix A]. It is shown that the estimators of additive effects of alleles for each parent have only one-sixteenth of the sampling variance of the estimated residual variance. The estimates of dominant effects vary depending upon the type and degree of interactions. If dominant effects are derived from two alleles of the same parent, their estimators will be even more precise than those of the allelic effects. The estimators of dominant effects derived from two alleles of different parents have the lowest precision, with sampling variances that are 9/64 of the estimated residual variance. It is interesting to note that the estimators of tetraallelic effects have better precision than those of the diallelic dominance effect with two alleles from different parents. From the structure of D⁻¹(D⁻¹)ᵀ, the estimators of different QTL effect parameters are basically independent. Their dependence occurs only within the QTL effects of the same type. The structure analysis of D⁻¹(D⁻¹)ᵀ suggests that the parameterization process of QTL effects will produce favorable effects on their estimates from the EM algorithm as described in Appendix B.
As expected, the allelic (or additive) effects can be estimated both more accurately and more precisely than the dominant effects, and the dominant effects of lower-order interactions can be estimated more precisely than the dominant effects of higher-order interactions (Table 2). It is interesting to note that the diallelic dominance effects between two alleles from the same parent can be estimated better than those between two alleles from different parents.
For all kinds of gene effects in bivalent tetraploids, the estimation accuracy and precision are increased when sample sizes and heritability levels are increased (Table 2). In general, a sample size of 200 can provide reasonably precise estimates of the allelic additive effects for a quantitative trait with a heritability of 0.20. But the estimation precision can be significantly improved if n is increased to 400 or for a quantitative trait with an increased H2 level. There is not much improvement if n is further increased from 400 to 800, even for a less inheritable trait.
For the diallelic dominance effects between two alleles from the same parent, it seems that for a lower heritability (0.20) a sample size of at least 400 is needed to achieve reasonable estimation precision, whereas for a heritability of at least 0.40 a smaller sample size (200) may be adequate, compared to the magnitudes of the actual values of these effects that are hypothesized (Table 2). For the diallelic dominance effects between two alleles from different parents, reasonable estimates need a sample size of at least 400 for a trait with a heritability of at least 0.40. In general, it is difficult to estimate triallelic dominance effects unless a sample size is extremely large (say 800). To obtain reasonable estimates for tetraallelic effects, an extremely large sample size should accompany a highly inheritable quantitative trait (see Table 2).
The estimates of all parameters listed in Table 2 were based on an optimal linkage phase combination selected from all possibilities in terms of the estimated likelihood values. The probabilities of detecting a correct linkage phase combination were estimated for different sample sizes and heritability levels (Table 3). When N = 200 and H² = 0.20, the probability of detecting the correct linkage phase combination is only about one-third. The other outcomes are the detection of two linkage phase combinations, with a probability of about one-quarter, and the detection of an incorrect linkage phase combination, with a probability of about one-half. When the sample size or heritability is doubled, the probability of detecting an incorrect linkage phase combination is reduced. If a sample size of 400 is used for a trait of H² = 0.40, no incorrect linkage phase combination is detected.
The log-likelihood ratios (LRT) of Equation 12 were used to test for the significance of QTL effects under different sample sizes and heritability levels. Except for a few cases where N = 200 and H² = 0.20, the QTL can be detected at a significance level of P = 0.05 in all 100 repeated simulations. The critical threshold value was calculated by simulating data sets with QTL effects set to zero and examining the distribution of the LRT.
The effects of completely preferential pairings and random pairings:
Hackett et al.'s assumption of random bivalent pairings (the autopolyploid model) leads to the same structure of the conditional probability matrix that we have in our bivalent polyploid model. Because our model covers the allo- and autopolyploid model, it can be regarded as the general polyploid model. Here, we make a comparison between Hackett et al.'s method and our method by first looking at the conditional probability matrix derived for the general polyploid model listed in Table 1. From the table it is found that only the conditional probabilities of QTL genotypes of 12 boldfaced marker genotypes contain p and the conditional probabilities of the rest of the 24 marker genotypes do not contain p because p is canceled out. This means that p may have a relatively small influence on the conditional probability matrix and therefore on parameter estimates under the general polyploid model when two markers considered are fully informative. In other words, for fully informative markers, results from the autopolyploid model will be similar to those from the general polyploid model. A small simulation study has confirmed this inference (results not shown).
However, for partially informative markers (![]()
| DISCUSSION |
|---|
The development of statistical methods for mapping QTL in polyploids is one of the most difficult tasks in genetic and genomic study. Quite a few studies of linkage analysis have used polymorphic markers in polyploids.
In this article we report on the development of a novel statistical methodology for QTL mapping in bivalent polyploids that represent an important group of polyploids including alfalfa, potato, and wheat. Using extensive simulations, we examined the robustness and performance of this bivalent polyploid method in estimating QTL effects, QTL position, and QTL linkage phase relative to known-phase markers under different sample sizes and heritability levels. We also compared the results from our method and the current methods on the basis of the allo- and autopolyploid model. Our method has four significant improvements over these current statistical methods for QTL mapping in bivalent polyploids. First, our method incorporates a general bivalent pairing mechanism of meiotic configuration by defining a cytological parameter called the preferential pairing factor. The preferential pairing factor (p) is defined as the propensity of bivalent pairings between more similar rather than less similar chromosomes.
The second improvement of our method is a thorough exploration of QTL action and interaction effects on phenotypes in polyploids. As with diploids, the inheritance mode of QTL in polyploids can be additive or dominant. But compared with diploids, these gene actions and interactions are much more complicated because of an increased number of alleles and allele combinations.
The efficient estimation of these 36 quantitative genetic parameters in tetraploid mapping, therefore, offers the third improvement of our method over the current methods. In this article, we incorporate the EM algorithm for this estimation.
The correct characterization of linkage phases is a prerequisite for genome mapping in species like polyploids, in which homozygous inbred lines cannot be obtained. In this article, we used a modified EM algorithm to simultaneously estimate linkage and linkage phases.
Our method can be extended to several more general situations. First, we should consider statistical properties of using markers with lower degrees of informativeness to map QTL in polyploids.
| ACKNOWLEDGMENTS |
|---|
We thank two anonymous reviewers for their constructive comments. This work is partially supported by an Outstanding Young Investigator Award of the National Natural Science Foundation of China (30128017), a University of Florida Research Opportunity Fund (02050259), and a University of South Florida Biodefense grant (7222061-12) to R.W. The publication of this manuscript is approved as a journal series no. R-08795 by the Florida Agricultural Experiment Station.
Manuscript received November 5, 2002; Accepted for publication September 28, 2003.
| APPENDIX A |
|---|
PARAMETERIZATION OF GENE EFFECTS
All the main and interaction effects in bivalent tetraploids should be parameterized to obtain a group of estimable parameters. In this article, the parameterization of these gene effects is based on different constraints posed on them. The constraints on the allelic (main) effects are expressed as

which lead to six estimable independent parameters. The constraints on the diallelic interaction effects are

which lead to two independent parameters for interactions between two alleles from parents P and Q, respectively, and nine independent parameters for interactions between two alleles each from a different parent. The constraints on the triallelic interaction effects are

which lead to 12 independent parameters. The constraints on the tetraallelic interaction effects are

which lead to four independent parameters.
By these parameterization constraints, a total of 120 QTL effect parameters contained in the genotypic values of Pu1Pu2Qv1Qv2 (1 ≤ u1 < u2 ≤ 4, 1 ≤ v1 < v2 ≤ 4) in Equation 8 can be reduced to 35 independent estimable parameters. Without loss of generality, the 6 independent allelic (main) effect parameters are assigned to alleles P1, P2, and P3 for parent P and Q1, Q2, and Q3 for parent Q. Thirteen diallelic interaction parameters are assigned as βPP12 and βPP13 for two alleles from parent P; βQQ12 and βQQ13 for two alleles from parent Q; and βPQ11, βPQ12, βPQ13, βPQ21, βPQ22, βPQ23, βPQ31, βPQ32, and βPQ33 for two alleles, one from parent P and the other from parent Q. Twelve triallelic interaction parameters are assigned to the allele combinations PPQ121, PPQ122, PPQ123, PPQ131, PPQ132, and PPQ133 for three alleles, two from parent P and one from parent Q; and PQQ112, PQQ212, PQQ312, PQQ113, PQQ213, and PQQ313 for three alleles, one from parent P and two from parent Q. Four tetraallelic interaction parameters are assigned to the allele combinations 1212, 1213, 1312, and 1313.
The vector m of the 36 QTL genotypic values can be expressed in terms of these assigned effect parameters as

$$\mathbf{m} = \mathbf{D}\mathbf{a},$$

where D is a design matrix and a is the vector of gene effects. In Appendix B, we provide the EM algorithm for estimating m, from which the MLEs of a are solved:

$$\hat{\mathbf{a}} = \mathbf{D}^{-1}\hat{\mathbf{m}}.$$

The sampling variance of â is thus

$$\mathbf{V}_{\hat{\mathbf{a}}} = \mathbf{D}^{-1}\mathbf{V}_{\hat{\mathbf{m}}}(\mathbf{D}^{-1})^{T} = \mathbf{D}^{-1}(\mathbf{D}^{-1})^{T}\sigma^{2}.$$

The matrix D⁻¹(D⁻¹)ᵀ has two desirable properties: (1) the elements on its diagonal are much smaller than one, ranging from 9/64 to 1/36, and (2) most elements off its diagonal are zero. The first property implies that the sampling variance of each estimator in the vector a from our parameterization approach is always smaller than the estimated residual variance. The second property suggests that different estimators in vector a are independent of each other.
| APPENDIX B |
|---|
IMPLEMENTATION OF THE EM ALGORITHM
The parameter vector in which we are interested is not the most efficient one to estimate from a computational standpoint. As explained in the text, we define a new vector of unknowns, (m, r1 or r2, $\sigma^{2}$, p), which can be more easily estimated by implementing the EM algorithm.

(B1)

with derivatives

where we define

(B2)
which could be thought of as a posterior probability that progeny i (i = 1, ..., N) has QTL genotype $u_1u_2v_1v_2$ ($1 \le u_1 < u_2 \le 4$, $1 \le v_1 < v_2 \le 4$). We then implement the EM algorithm with the expanded parameter set formed by the new parameter vector and P, where P collects these posterior probabilities. Conditional on P, we solve for the zeros of the partial derivatives of the log-likelihood with respect to the unknown parameters to get our estimates under the constraints of displays (10) and (11) (the M step). The estimates are then used to update P (the E step), and the process is repeated until convergence. The values at convergence are the MLEs.
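The alternation between the E and M steps can be sketched for a simplified case. The code below is an illustration under stated assumptions rather than the authors' implementation: it uses a normal mixture with a common residual variance, only two genotype classes instead of 36, and prior genotype probabilities that are treated as known (as they would be once the QTL position and the preferential pairing factor are fixed). The function names and the toy data are hypothetical.

```python
import math

def normal_pdf(y, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_mixture(y, prior, n_iter=200, tol=1e-10):
    """EM for genotype means and a common residual variance.

    y     : phenotypes, one per progeny
    prior : prior[i][j] = assumed-known probability that progeny i has genotype j
    """
    n, k = len(y), len(prior[0])
    grand_mean = sum(y) / n
    means = [grand_mean] * k
    var = sum((v - grand_mean) ** 2 for v in y) / n
    post = [list(row) for row in prior]
    for _ in range(n_iter):
        # E step: posterior probability of each genotype for each progeny
        post = []
        for i in range(n):
            w = [prior[i][j] * normal_pdf(y[i], means[j], var) for j in range(k)]
            total = sum(w)
            post.append([x / total for x in w])
        # M step: posterior-weighted means and pooled variance
        new_means = [
            sum(post[i][j] * y[i] for i in range(n)) / sum(post[i][j] for i in range(n))
            for j in range(k)
        ]
        new_var = sum(
            post[i][j] * (y[i] - new_means[j]) ** 2 for i in range(n) for j in range(k)
        ) / n
        change = max(abs(a - b) for a, b in zip(means + [var], new_means + [new_var]))
        means, var = new_means, new_var
        if change < tol:
            break
    return means, var, post

# Toy data: two well-separated phenotype clusters with informative priors
y = [0.9, 1.1, 1.0, 0.8, 1.2, 2.9, 3.1, 3.0, 2.8, 3.2]
prior = [[0.9, 0.1]] * 5 + [[0.1, 0.9]] * 5
means, var, post = em_mixture(y, prior)
print(means, var)
```

In this toy setting the M step has closed-form updates. In the full model the M step would also have to respect the constraints of displays (10) and (11).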
The log-likelihood equations for the MLEs of m and $\sigma^{2}$ are given as

The QTL position and the preferential pairing factor are generally estimated by a grid search, in which each parameter is fixed at a series of values across its parameter space. The pair of values at which the likelihood reaches its maximum gives their MLEs.
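The grid approach can be sketched as a plain two-dimensional search. In the example below, `grid_mle` and `toy_loglik` are hypothetical names, and `toy_loglik` is a made-up smooth surface standing in for the log-likelihood that would be obtained, at each grid point, after maximizing over the remaining parameters. The grid ranges and units are also assumptions chosen for illustration.

```python
def grid_mle(profile_loglik, positions, pairing_factors):
    """Return (position, pairing factor, log-likelihood) at the grid maximum.

    profile_loglik(position, p) is assumed to return the log-likelihood
    already maximized over all other parameters at that grid point.
    """
    best = None
    for pos in positions:
        for p in pairing_factors:
            value = profile_loglik(pos, p)
            if best is None or value > best[0]:
                best = (value, pos, p)
    return best[1], best[2], best[0]

def toy_loglik(pos, p):
    """Hypothetical surface peaking at position 12 and pairing factor 0.3."""
    return -((pos - 12.0) ** 2) / 50.0 - 20.0 * (p - 0.3) ** 2

positions = [2.0 * k for k in range(11)]       # 0, 2, ..., 20 (assumed map units)
pairing_factors = [k / 10 for k in range(7)]   # 0.0, 0.1, ..., 0.6 (assumed grid)
pos_hat, p_hat, ll_hat = grid_mle(toy_loglik, positions, pairing_factors)
print(pos_hat, p_hat)
```

A finer grid gives a more precise estimate at the cost of more likelihood evaluations, since each grid point requires its own maximization over the other parameters.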
| LITERATURE CITED |
|---|
ALLENDORF, F. W. and R. G. DANZMANN, 1997 Secondary tetrasomic segregation of MDH-B and preferential pairing of homeologues in rainbow trout. Genetics 145:1083-1092.
BEVER, J. D. and F. FELBER, 1992 The theoretical population genetics of autopolyploidy. Oxf. Surv. Evol. Biol. 8:185-217.
BROUWER, D. J. and T. C. OSBORN, 1999 A molecular marker linkage map of tetraploid alfalfa (Medicago sativa L.). Theor. Appl. Genet. 99:1194-1200.
BROUWER, D. J., S. H. DUKE, and T. C. OSBORN, 2000 Mapping genetic factors associated with winter hardiness, fall growth, and freezing injury in autotetraploid alfalfa. Crop Sci. 40:1387-1396.
BUTRUILLE, D. V. and L. S. BOITEUX, 2000 Selection-mutation balance in polysomic tetraploids: impact of double reduction and gametophytic selection on the frequency and subchromosomal localization of deleterious mutations. Proc. Natl. Acad. Sci. USA 97:6608-6613.
DA SILVA, J., M. E. SORRELLS, W. L. BURNQUIST, and S. D. TANKSLEY, 1995 RFLP linkage map and genome analysis of Saccharum spontaneum. Genome 36:782-791.
DEMPSTER, A. P., N. M. LAIRD, and D. B. RUBIN, 1977 Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 39:1-38.
DOERGE, R. W. and B. A. CRAIG, 2000 Model selection for quantitative trait locus analysis in polyploids. Proc. Natl. Acad. Sci. USA 97:7951-7956.
FISHER, R. A., 1947 The theory of linkage in polysomic inheritance. Philos. Trans. R. Soc. Ser. B 233:55-87.
FJELLSTROM, R. G., P. R. BEUSELINCK, and J. J. STEINER, 2001 RFLP marker analysis supports tetrasomic inheritance in Lotus corniculatus L. Theor. Appl. Genet. 102:718-725.
GRIVET, L., A. D'HONT, D. ROQUES, P. FELDMANN, and C. LANAUD et al., 1996 RFLP mapping in cultivated sugarcane (Saccharum spp): genome organization in a highly polyploid and aneuploid interspecific hybrid. Genetics 142:987-1000.
HACKETT, C. A., 2001 A comment on Xie and Xu: Mapping quantitative trait loci in tetraploid species. Genet. Res. 78:187-189.
HACKETT, C. A., J. E. BRADSHAW, R. C. MEYER, J. W. MCNICOL, and D. MILBOURNE et al., 1998 Linkage analysis in tetraploid species: a simulation study. Genet. Res. 71:143-154.
HACKETT, C. A., J. E. BRADSHAW, and J. W. MCNICOL, 2001 Interval mapping of quantitative trait loci in autotetraploid species. Genetics 159:1819-1832.
HALDANE, J. B. S., 1930 Theoretical genetics of autopolyploids. J. Genet. 22:359-372.
HICKOK, L. G., 1978 Homoeologous chromosome pairing: frequency difference in inbred and intraspecific hybrid polyploid ferns. Science 202:982-984.
HILU, K. W., 1993 Polyploidy and the evolution of domesticated plants. Am. J. Bot. 80:1491-1499.
HOARAU, J. Y., B. OFFMANN, A. D'HONT, A. M. RISTERUCCI, and D. ROQUES et al., 2001 Genetic dissection of a modern sugarcane cultivar (Saccharum spp.). I. Genome mapping with AFLP markers. Theor. Appl. Genet. 103:84-97.
KEMPTHORNE, O., 1957 An Introduction to Genetic Statistics. John Wiley & Sons, New York.
LANDER, E. S. and D. BOTSTEIN, 1989 Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121:185-199.
LEITCH, I. J. and M. D. BENNETT, 1997 Polyploidy in angiosperms. Trends Plant Sci. 2:470-476.
LUO, Z. W., C. A. HACKETT, J. E. BRADSHAW, J. W. MCNICOL, and D. MILBOURNE, 2001 Construction of a genetic linkage map in tetraploid species using molecular markers. Genetics 157:1369-1385.
MATHER, K., 1935 Reductional and equational separation of the chromosomes in bivalents and multivalents. J. Genet. 30:53-78.
MATHER, K., 1936 Segregation and linkage in autotetraploids. J. Genet. 32:287-314.
MENG, X. L. and D. B. RUBIN, 1993 Maximum likelihood estimation via the ECM algorithm: a general framework. Biometrika 80:267-278.
MEYER, R. C., D. MILBOURNE, C. A. HACKETT, J. E. BRADSHAW, and J. W. MCNICHOL et al., 1998 Linkage analysis in tetraploid potato and association of markers with quantitative resistance to late blight (Phytophthora infestans). Mol. Gen. Genet. 259:150-160.
MING, R., S. C. LIU, Y. R. LIN, J. DA SILVA, and W. WILSON et al., 1998 Detailed alignment of Saccharum and Sorghum chromosomes: comparative organization of closely related diploid and polyploid genomes. Genetics 150:1663-1682.
MING, R., S. C. LIU, P. H. MOORE, J. E. IRVINE, and A. H. PATERSON, 2001 QTL analysis in a complex autopolyploid: genetic control of sugar content in sugarcane. Genome Res. 11:2075-2084.
OTTO, S. P. and J. WHITTON, 2000 Polyploid incidence and evolution. Annu. Rev. Genet. 34:401-437.