Interval Mapping of Quantitative Trait Loci in Autotetraploid Species
C. A. Hackett (a), J. E. Bradshaw (b), and J. W. McNicol (a)
(a) Biomathematics and Statistics Scotland, Scottish Crop Research Institute, Invergowrie, Dundee DD2 5DA, Scotland
(b) Scottish Crop Research Institute, Invergowrie, Dundee DD2 5DA, Scotland
Corresponding author: C. A. Hackett, Biomathematics and Statistics Scotland, Scottish Crop Research Institute, Invergowrie, Dundee DD2 5DA, Scotland. E-mail: christine{at}bioss.ac.uk
Communicating editor: C. HALEY
| ABSTRACT |
|---|
This article presents a method for QTL interval mapping in autotetraploid species for a full-sib family derived by crossing two parents. For each offspring, the marker information on each chromosome is used to identify possible configurations of chromosomes inherited from the two parents and the locations of crossovers on these chromosomes. A branch and bound algorithm is used to identify configurations with the minimum number of crossovers. From these configurations, the conditional probability of each possible QTL genotype for a series of positions along the chromosome can be estimated. An iterative weighted regression is then used to relate the trait values to the QTL genotype probabilities. A simulation study is performed to assess this approach and to investigate the effects of the proportion of codominant to dominant markers, the heritability, and the population size. We conclude that the method successfully locates QTL and estimates their parameters accurately, and we discuss different modes of action of the QTL that may be modeled.
LINKAGE analysis and quantitative trait loci (QTL) mapping methods are now well established and widely used for diploid plant species, and there is an increasing interest in extending these methods to autopolyploid species, despite the complications of polysomic inheritance. Linkage maps have been calculated for autotetraploid potato.
Unless a mapping population is very large, it is difficult to detect repulsion linkages between simplex markers in polyploids.
An important use of linkage maps is to locate major genes and QTL for important traits. Early studies of diploid species compared trait means for different phenotypes at a single marker using regression models, and some authors used the same approach in polyploid species.
In this article we present an approach for QTL interval mapping in autotetraploid species in a full-sib family derived by crossing two parents. As for similar populations derived from outbreeding diploid parents, we use all the markers on a chromosome to estimate conditional probabilities for QTL genotypes. The trait values are then related to the QTL genotypes, using a mixture model. We present a simulation study to look at effects of the proportion of codominant to dominant markers, the heritability, and the population size on the ability of the model to locate QTL and to estimate their effects.
| METHODS |
|---|
The mapping population:
The QTL mapping approach is developed for an F1 population derived by crossing two parents, P1 and P2. The phenotypes of m molecular markers are assumed to be known for the parents and n offspring, and trait data are available for the parents and offspring. The parents can have up to eight distinct alleles at each marker or quantitative trait locus: these are represented by A–H, or O for a "null" allele.
Tetrasomic inheritance:
The model for QTL mapping is developed by assuming random chromosomal segregation. The four homologous chromosomes are assumed to pair at random to give two bivalents, and crossing over is assumed to be restricted to within each bivalent. All bivalent pairings are assumed to be equally likely. The possibilities of nonrandom chromosomal pairing, or multivalent formation, are not considered here. We assume that there is no chromatid and no crossover interference.
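As an illustration of the segregation assumptions above, the following minimal Python sketch enumerates the gametes produced at a single locus when chromosomes 1 to 4 of one parent pair at random into two bivalents and each gamete receives one chromosome from each bivalent. The function and constant names are illustrative only and are not taken from the authors' software.

```python
from collections import Counter
from itertools import product

# The three ways in which four homologous chromosomes can form two bivalents.
PAIRINGS = [((1, 2), (3, 4)), ((1, 3), (2, 4)), ((1, 4), (2, 3))]


def gamete_frequencies():
    """Frequencies of the two-chromosome gametes at one locus.

    Each bivalent pairing is taken as equally likely, and a gamete
    receives one chromosome from each bivalent.
    """
    counts = Counter()
    for bivalent_a, bivalent_b in PAIRINGS:
        for chrom_a, chrom_b in product(bivalent_a, bivalent_b):
            counts[tuple(sorted((chrom_a, chrom_b)))] += 1
    total = sum(counts.values())
    return {gamete: count / total for gamete, count in counts.items()}


freqs = gamete_frequencies()
for gamete, freq in sorted(freqs.items()):
    print(gamete, round(freq, 4))
```

Under these assumptions each of the six possible chromosome pairs is transmitted with the same frequency, so the 6 x 6 = 36 offspring genotypes at a locus are equally likely.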
Model for a quantitative trait:
The genotypic value of an individual with QTL genotype $A_iA_jA_kA_l$ is modeled as

$$G_{ijkl} = \mu + \alpha_i + \alpha_j + \alpha_k + \alpha_l + \beta_{ij} + \beta_{ik} + \beta_{il} + \beta_{jk} + \beta_{jl} + \beta_{kl} + \gamma_{ijk} + \gamma_{ijl} + \gamma_{ikl} + \gamma_{jkl} + \delta_{ijkl} \tag{1}$$

where $\mu$ is the population mean, the $\alpha$'s are the main effects of the alleles (analogous to additive effects in a diploid population), and the $\beta$'s, $\gamma$'s, and $\delta$'s are the diallelic, triallelic, and tetraallelic interactions, respectively. The $\beta$'s are analogous to dominance deviations in a diploid population, but there is no diploid analogy to the triallelic and tetraallelic interactions. Appendix A compares the notation of model (1) with that used by various authors when considering two alleles at a tetraploid locus. For our full-sib family model, an individual will inherit alleles $A_i$ and $A_j$, $1 \le i \le 3$, $i < j \le 4$, from parent 1 and $A_k$ and $A_l$, $5 \le k \le 7$, $k < l \le 8$, from parent 2. There are therefore eight main effects, 28 diallelic interactions, 24 triallelic interactions, and 36 tetraallelic interactions in model (1), totaling more than the 36 genotypes available for model fitting. We rewrite model (1) for a full-sib family with indicator variables $X_i$ = 1/0 corresponding to allele $A_i$ present/absent for that individual. Model (1) becomes
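$$G = \mu + \sum_{i} \alpha_i X_i + \sum_{i<j} \beta_{ij} X_i X_j + \sum_{i<j<k} \gamma_{ijk} X_i X_j X_k + \sum_{i<j<k<l} \delta_{ijkl} X_i X_j X_k X_l \tag{2}$$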
There is intrinsic aliasing of {Xi} and higher products, so that some of the parameters are nonestimable. As each individual inherits precisely two alleles from each parent, we have the constraints
$$X_1 + X_2 + X_3 + X_4 = 2, \qquad X_5 + X_6 + X_7 + X_8 = 2 \tag{3}$$
Substituting these into a model with main effects only gives
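$$G = (\mu + 2\alpha_1 + 2\alpha_5) + (\alpha_2 - \alpha_1)X_2 + (\alpha_3 - \alpha_1)X_3 + (\alpha_4 - \alpha_1)X_4 + (\alpha_6 - \alpha_5)X_6 + (\alpha_7 - \alpha_5)X_7 + (\alpha_8 - \alpha_5)X_8 \tag{4}$$

The estimable parameters are $(\mu + 2\alpha_1 + 2\alpha_5)$, $\alpha_2 - \alpha_1$, etc. To estimate the individual parameters, we impose the constraints $\alpha_1 = 0$, $\alpha_5 = 0$, sometimes referred to as cornerpoint constraints.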
There are further constraints on the higher-order terms of Equation 2, for example,
$$X_1X_2 + X_1X_3 + X_1X_4 + X_2X_3 + X_2X_4 + X_3X_4 = 1 \tag{5}$$
and similarly for the other parent. In total, it is possible to fit six main effects, 13 biallelic interactions (2 for interactions between alleles from parent 1, 2 for parent 2, and 9 for interactions between alleles from different parents), 12 triallelic interactions, and 4 tetraallelic interactions. These total 35 effects, equal to the degrees of freedom among the 36 genotype means.
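The aliasing described above can be checked numerically. The sketch below, which is only an illustration and not the authors' code, enumerates the 36 offspring genotypes, builds the indicator variables for alleles 1 to 8, and computes the rank of the main-effects design matrix (a constant plus eight indicators) by exact Gaussian elimination. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def genotypes():
    """All offspring genotypes: two of alleles 1-4 (P1) and two of 5-8 (P2)."""
    for p1 in combinations(range(1, 5), 2):
        for p2 in combinations(range(5, 9), 2):
            yield p1 + p2


def indicator_row(genotype):
    """X_1..X_8: 1 if the allele is present in the genotype, else 0."""
    return [1 if allele in genotype else 0 for allele in range(1, 9)]


def pairwise_product_sum(x):
    """Sum of X_i * X_j over all pairs i < j within x."""
    return sum(a * b for a, b in combinations(x, 2))


def rank(rows):
    """Matrix rank by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


all_genotypes = list(genotypes())
indicators = [indicator_row(g) for g in all_genotypes]
design = [[1] + row for row in indicators]  # constant + eight main effects
main_effects_rank = rank(design)
print(len(all_genotypes), main_effects_rank)
```

The rank is the number of estimable parameters in the main-effects model: the constant plus six allele contrasts, in agreement with the six main effects counted above.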
When fitting models in the simulation study, we concentrate on the main effects model (4), with cornerpoint constraints on the parameter estimates. However, there is no theoretical difficulty in including higher-order terms.
In practice, the offspring QTL genotypes are unknown, and the conditional probabilities of QTL genotypes must be estimated from the marker information.
A mixture model for QTL mapping:
Here we develop a maximum-likelihood approach for fitting a single QTL model, considering one chromosome at a time. Let $q_i \in Q_i$ denote the possible QTL genotypes of individual $i$, and let $g_i \in G_i$ denote the chromosome configurations that are compatible with its marker phenotype $o_i$. By "chromosome configurations" we mean the marker genotypes and the parental chromosomes from which the marker alleles come, so that it is clear how the bivalents occurred to form that offspring and where recombinations occurred. We adopt an interval mapping approach, fitting a QTL at a set of locations along the chromosome and maximizing the likelihood for each location as a function of the QTL parameters $\theta = (\mu, \alpha_2, \alpha_3, \alpha_4, \alpha_6, \alpha_7, \alpha_8, \sigma^2)$, where $\mu$ and $\alpha_i$ are as in Equation 4 and $\sigma^2$ is the residual variance.
The likelihood of the trait and marker data is
![]() |
(6) |
Now,
![]() |
(7) |
assuming conditional independence of (i) the trait value and the marker data given the QTL genotype and (ii) the QTL genotype and the marker phenotype given the chromosome configuration.
We can maximize the log-likelihood by
![]() |
(8) |
The first term on the right-hand side of (8) does not depend on the QTL parameters $\theta$, and the second term may be written as
![]() |
(9) |
Only the third term of the final sum depends on $\theta$ and so contributes to the likelihood equation. The term represents a regression of the trait values on the QTL genotypes, weighted by the conditional probability of each QTL genotype.
![]() |
(10) |
We can therefore use P(qi|gi)P(gi|oi) as initial weights, maximize the likelihood conditional on these, update the QTL genotype probabilities using Equation 10, and repeat until the log-likelihood converges to a maximum. An alternative approach is to calculate the QTL genotype probabilities from the marker data only, regardless of the trait values. In this case we would use a single weighted regression with weights P(qi|gi)P(gi|oi) rather than an iterative process to update the QTL genotype probabilities. We compare both approaches here.
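The iterative scheme can be sketched as follows. This is a deliberately simplified illustration, not the Fortran routines used in the study: it fits one mean per QTL genotype class rather than the additive allele-effect parametrization of model (4), it assumes normally distributed residuals with a common variance, and it takes the marker-based probabilities P(qi|gi)P(gi|oi), summed over configurations, as a given matrix `prior`. All names and the toy data are hypothetical.

```python
import math


def normal_pdf(y, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((y - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_genotype_means(y, prior, max_iter=200, tol=1e-8):
    """Iterative weighted regression with one mean per genotype class.

    y[i] is the trait value of individual i; prior[i][c] is the
    marker-based probability that individual i has genotype class c.
    Returns (means, residual variance, posterior weights, log-likelihood).
    """
    n, k = len(y), len(prior[0])
    weights = [row[:] for row in prior]  # initial weights from markers only
    loglik_old = -math.inf
    for _ in range(max_iter):
        # Weighted regression step: weighted class means and pooled variance.
        means = []
        for c in range(k):
            total_w = sum(weights[i][c] for i in range(n))
            weighted_sum = sum(weights[i][c] * y[i] for i in range(n))
            means.append(weighted_sum / total_w if total_w > 0 else 0.0)
        var = sum(
            weights[i][c] * (y[i] - means[c]) ** 2
            for i in range(n)
            for c in range(k)
        ) / n
        # Update step: combine marker-based probabilities with trait values.
        loglik = 0.0
        for i in range(n):
            joint = [prior[i][c] * normal_pdf(y[i], means[c], var) for c in range(k)]
            total = sum(joint)
            loglik += math.log(total)
            weights[i] = [j / total for j in joint]
        if loglik - loglik_old < tol:
            break
        loglik_old = loglik
    return means, var, weights, loglik


# Toy data: six individuals with known class, two that markers cannot resolve.
TRAIT = [0.1, -0.1, 0.2, 9.9, 10.1, 10.0, 0.0, 10.2]
PRIOR = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3 + [[0.5, 0.5]] * 2
means, var, posterior, loglik = em_genotype_means(TRAIT, PRIOR)
print([round(m, 3) for m in means], round(var, 4))
```

Stopping after the first weighted regression step, without updating the weights, corresponds to the noniterative alternative described above.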
Estimation of QTL genotype probabilities:
In populations such as doubled haploids from inbred diploid lines, every marker is fully informative, and the conditional probability of a QTL genotype depends only on the genotypes of the markers flanking the possible QTL location. When markers are not fully informative, we need to consider the information from all markers on a chromosome to calculate the conditional probabilities of QTL genotypes. The QTL genotype probabilities P(qi|gi)P(gi|oi) factor into two terms, and we consider them separately.
The conditional probability of the chromosome configuration, given the marker phenotypes, P(gi|oi):
One way to calculate these probabilities would be to use a hidden Markov model (HMM). These have been widely used for multipoint mapping in diploid species.
We preferred to use an alternative method via a "branch and bound" algorithm, where we search for chromosome configurations for each offspring that are compatible with the marker phenotype information, arise from a possible bivalent pairing in each parent, and have the minimum number of crossovers. Appendix B describes the process of reconstruction for one individual from a cross between the two parents shown in Table 1. Fig 1 illustrates the eight possible chromosome configurations for this individual that are compatible with the phenotypic information and have the minimum number of crossovers (six).
If there are $m$ marker loci, the recombination frequency between loci $i$ and $i + 1$ is $r_i$, and there are $x_i$ recombinations between them ($0 \le x_i \le 4$), then the probability of a configuration can be calculated as

$$P(g \mid o) \propto \prod_{i=1}^{m-1} r_i^{x_i} (1 - r_i)^{4 - x_i} \tag{11}$$
This assumes that there is no interference and that recombinations occur independently between different pairs of loci. As all the configurations have the same number of crossovers, the probabilities of the different configurations will be similar unless the marker spacing is very uneven. There is, of course, an approximation here as we are ignoring configurations that have more than the minimum number of crossovers compatible with the phenotypes. However, configurations with more crossovers will have smaller probabilities by Equation 11.
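A minimal sketch of this calculation is given below. It assumes that each configuration contributes a weight equal to the product over intervals of r^x (1 - r)^(4 - x), and that the weights are normalized over the set of retained configurations; the normalization and all names are assumptions made for illustration.

```python
def configuration_weight(crossovers, rec_freqs, strands=4):
    """Unnormalized weight of one configuration.

    crossovers[i] is the number of recombinations (0..strands) between
    loci i and i + 1, and rec_freqs[i] is the recombination frequency
    of that interval. Intervals are treated as independent.
    """
    weight = 1.0
    for x, r in zip(crossovers, rec_freqs):
        weight *= r ** x * (1.0 - r) ** (strands - x)
    return weight


def configuration_probabilities(configs, rec_freqs):
    """Normalize the weights over the retained configurations."""
    weights = [configuration_weight(c, rec_freqs) for c in configs]
    total = sum(weights)
    return [w / total for w in weights]


# Two hypothetical configurations, each with two crossovers in total.
configs = [[1, 0, 1], [0, 1, 1]]
even = configuration_probabilities(configs, [0.1, 0.1, 0.1])
uneven = configuration_probabilities(configs, [0.05, 0.2, 0.1])
print(even, uneven)
```

With evenly spaced markers the two configurations receive equal probability, whereas with uneven spacing the configuration whose crossover falls in the longer interval is favored, as noted above.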
The probability of the QTL genotype, given the chromosome configuration P(qi|gi):
Once we have calculated a set of possible chromosome configurations for each individual, we can identify possible QTL genotypes and calculate their probability for putative QTL locations at a set of positions along the chromosome. We assume that there are no double crossovers between markers. The individual illustrated in Fig 1, for example, has inherited chromosomes 1, 2, 6, and 8 at marker loci L5 and L6 for all configurations. Therefore, for QTL locations between L5 and L6, we assume that the QTL genotype is Q1Q2Q6Q8, with probability 1. This individual has also inherited chromosomes 1, 2, 6, and 5 at locus L7, with a crossover between chromosomes 5 and 8. For QTL locations between L6 and L7, there are two possible QTL genotypes, Q1Q2Q6Q8 and Q1Q2Q6Q5. The probability of the former genotype will decrease, and the probability of the latter will increase, as we consider locations at an increasing distance from L6 and closer to L7. To calculate these probabilities, we assume that crossovers follow a Poisson process with the probability of no crossovers in an interval of $M$ morgans equal to $e^{-M}$ and the probability of one crossover as $Me^{-M}$. If the positions of L6 and L7 are $m_6$ and $m_7$ and we want to calculate the probability associated with a QTL at position $m_Q$ between them,
![]() |
(12) |
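The Poisson argument can be illustrated with a short sketch. It assumes exactly one crossover between the flanking markers, as stated above, and asks on which side of the QTL position that crossover fell; the function names and the example positions (in morgans) are hypothetical, and the expressions may be arranged differently from the published equation.

```python
import math


def p_no_crossover(length):
    """Poisson probability of no crossover in an interval (morgans)."""
    return math.exp(-length)


def p_one_crossover(length):
    """Poisson probability of exactly one crossover in an interval."""
    return length * math.exp(-length)


def qtl_genotype_probs(m_left, m_right, m_qtl):
    """Probabilities for a QTL between two markers with one crossover.

    Returns (probability that the QTL carries the chromosome seen at
    the left marker, probability that it carries the one seen at the
    right marker).
    """
    whole = p_one_crossover(m_right - m_left)
    # Crossover to the right of the QTL: left-marker chromosome retained.
    keep_left = p_no_crossover(m_qtl - m_left) * p_one_crossover(m_right - m_qtl) / whole
    # Crossover to the left of the QTL: right-marker chromosome present.
    keep_right = p_one_crossover(m_qtl - m_left) * p_no_crossover(m_right - m_qtl) / whole
    return keep_left, keep_right


# Hypothetical positions: markers at 0.50 and 0.60 M, QTL at 0.525 M.
left, right = qtl_genotype_probs(0.50, 0.60, 0.525)
print(round(left, 4), round(right, 4))
```

Under these assumptions the exponential terms cancel and the probabilities reduce to linear interpolation between the flanking markers, $(m_7 - m_Q)/(m_7 - m_6)$ and $(m_Q - m_6)/(m_7 - m_6)$ in the notation of the example.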
For many positions the set of possible QTL genotypes will depend upon the configuration; e.g., for positions between loci L8 and L9 (configuration 3275), there will be two QTL genotypes (Q1Q2Q7Q5 and Q3Q2Q7Q5) for configurations where the configuration at L8 is 1275 and four QTL genotypes (Q1Q2Q6Q5, Q1Q2Q7Q5, Q3Q2Q6Q5, and Q3Q2Q7Q5) for configurations where the configuration at L8 is 1265. Crossovers between chromosomes 1 and 3 occur independently of crossovers between chromosomes 6 and 7, and so we have probabilities
![]() |
(13) |
Similarly,
![]() |
(14) |
and
![]() |
(15) |
From these formulas, we can calculate QTL genotype probabilities from Equation 10 and use them for a weighted regression to test for the significance of a QTL at position mQ. A plot of the adjusted coefficient of determination R2a against the position mQ should show a maximum at the true QTL location.
Simulation study:
A simulation study was carried out to investigate this approach for QTL mapping and to quantify the effects of the marker type, the trait heritability, and the population size. Two parents were simulated initially and these were crossed to give a population of 200 offspring by random chromosomal segregation. The first simulation consisted of one chromosome with 10 codominant markers, spaced at 10-cM intervals. There were five possible alleles (A–E) and a null allele (O) at each marker locus, with equal probability. The parental marker genotypes were simulated by sampling four alleles for each parent from the set of possible alleles, with replacement. One such parental configuration is shown in Table 1. A QTL was assumed to be situated halfway between markers L2 and L3 and to have eight different alleles. The QTL alleles were assumed to have additive effects, of sizes 0, 1, 1, 2, 0, -1, -1, and -2 for alleles Q1–Q8, respectively. The trait values for each offspring were calculated as an overall mean of 10.0, plus the sum of the effects of their alleles, plus an environmental effect distributed as $N(0, \sigma^2)$. The value of $\sigma^2$ was chosen to give the desired trait heritability $h^2 = \sigma^2_G / (\sigma^2_G + \sigma^2)$, where $\sigma^2_G$ is the genetic variance: $\sigma^2$ = 4.0, 12.0 correspond to heritabilities of 25 and 10%, respectively. The true QTL genotype of each individual was known, so that it was possible to compare parameter estimates from an unweighted regression on the true QTL genotype with the interval mapping approach of weighted regression on the possible QTL genotypes.
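The heritabilities quoted above can be checked from the allele effects. The sketch below assumes that the 36 offspring genotypes are equally likely under random chromosomal segregation and computes the genetic variance exactly; the names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Additive effects of QTL alleles Q1..Q8 as used in the simulation.
ALLELE_EFFECTS = {1: 0, 2: 1, 3: 1, 4: 2, 5: 0, 6: -1, 7: -1, 8: -2}


def genetic_variance(effects=ALLELE_EFFECTS):
    """Variance of the genotypic value over the 36 equally likely genotypes."""
    values = [
        sum(effects[a] for a in p1 + p2)
        for p1 in combinations((1, 2, 3, 4), 2)
        for p2 in combinations((5, 6, 7, 8), 2)
    ]
    mean = Fraction(sum(values), len(values))
    return sum((v - mean) ** 2 for v in values) / len(values)


def heritability(var_g, var_e):
    """Genetic variance as a proportion of total variance."""
    return var_g / (var_g + var_e)


var_g = genetic_variance()
print(var_g, heritability(var_g, 4), heritability(var_g, 12))
```

The genetic variance is 4/3, so environmental variances of 4.0 and 12.0 give heritabilities of exactly 1/4 and 1/10.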
The above simulation has three random stages: simulation of the parents, simulation of the offspring given the parents, and simulation of the environmental error to add to the genotype values; and the study can be replicated at each stage. To see the effect of each level of replication, 10 pairs of parents were simulated, 20 sets of 200 offspring were simulated for each set of parents, and 20 sets of environmental error were simulated for each set of offspring, giving a total of 4000 sets of marker and trait data for analysis. Simulations A and B had heritabilities of 25 and 10%, respectively.
Most experimental data sets will be a mixture of codominant and dominant marker types, and in general the dominant markers are less informative. A further set of simulations (C) was generated to investigate this. The dominant markers were simulated as a mixture of simplex markers (AOOO x OOOO), duplex markers (AAOO x OOOO), and double-simplex markers (AOOO x AOOO), in the proportion found in potato.
We need a threshold for R2a, above which we declare a QTL present. For a single regression on a known QTL genotype in a population of size n, the threshold for declaring significance at a 5% level is the 95% point of an F distribution with 6 and (n - 7) d.f. (Six QTL effects are fitted, rather than eight, due to the constraints discussed in Equation 4.) The F-statistic is related to R2a by
$$R^2_a = \frac{\nu_1 (F - 1)}{\nu_1 F + \nu_2} \tag{16}$$

where $\nu_1$ and $\nu_2$ are the numerator and denominator degrees of freedom. For n = 200, the 95% point for the F-statistic is 2.146, corresponding to $R^2_a$ = 3.3%, and for n = 100, the 95% point for the F-statistic is 2.198, corresponding to $R^2_a$ = 6.8%. In QTL interval mapping, however, we consider the location with the maximum $R^2_a$ for a large number of linked positions and this is difficult to establish theoretically. Instead simulation sets E and F were run, with a similar pattern to simulations A and D, but the QTL effects were all set to zero. In this way, we can see how large a value of $R^2_a$ can be achieved by random variation and establish the distribution of $R^2_a$ under the null hypothesis of no QTL on the chromosome. If the observed peak of $R^2_a$ along a profile exceeds the 95% point of the distribution under the null hypothesis, we declare a QTL present with significance p < 0.05.
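The single-position thresholds quoted above can be reproduced from the F-statistics with a few lines of Python. The conversion used here follows from the usual definitions of the regression F-statistic and the adjusted coefficient of determination; it is algebraically one way of writing the relationship and may differ in form from the published equation. The function names are hypothetical.

```python
def adjusted_r2_from_f(f_stat, df_num, df_den):
    """Adjusted R-squared implied by a regression F-statistic."""
    return df_num * (f_stat - 1.0) / (df_num * f_stat + df_den)


def f_from_adjusted_r2(r2a, df_num, df_den):
    """Inverse conversion: F-statistic implied by an adjusted R-squared."""
    return (df_num + df_den * r2a) / (df_num * (1.0 - r2a))


# Six QTL effects are fitted, so the d.f. are 6 and n - 7.
threshold_200 = adjusted_r2_from_f(2.146, 6, 200 - 7)
threshold_100 = adjusted_r2_from_f(2.198, 6, 100 - 7)
print(round(100 * threshold_200, 1), round(100 * threshold_100, 1))
```

These values apply to a regression on a known QTL genotype at one position; the genome-scan thresholds used in the study come from the null simulations E and F, not from this formula.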
Computing:
All routines were written in Fortran 90.
| RESULTS |
|---|
Reconstruction of the chromosome configurations:
The reconstruction of the chromosome configurations depends only on the marker data and is not affected by the heritability of the trait. The computer program to calculate the possible configurations had an upper limit of 5000 for each individual so that individuals with >5000 possible configurations were excluded from further analysis. No individuals were excluded, using the codominant markers for simulations A and B. The mean numbers of individuals excluded from simulations C and D were 8.8 (SE 0.93) out of 200 individuals and 4.8 (SE 0.96) out of 100 individuals, respectively.
For the individual illustrated in Fig 1, there are eight possible configurations. These agree, and are the same as the true genotypic configuration, for 37 of the 40 alleles. For the other 3 alleles one-half of the configurations have the simulated allele. The mean proportions of alleles that are correct for every chromosome configuration are summarized in Table 3 for simulations A, B, C, and D. The proportions of correct reconstructions were significantly higher for simulations A and B, using codominant markers only, but the proportions for simulations C and D were still high (0.81).
Interval mapping:
For each trait, we obtained a profile of R2a at a series of positions along the chromosome. Fig 2 shows such a profile for a trait from simulation A, with codominant markers and a heritability of 25%. We took the maximum of the profile to indicate the most likely location of the QTL. Table 4 summarizes the estimates of the QTL effects, R2a, and the residual mean square for the different simulation studies. For each study, the mean and standard errors over 4000 data sets (10 parental combinations x 20 offspring combinations x 20 traits) are presented. The row labeled wt4 is the sum of the probabilities of the true QTL genotype for each individual in the weighted regression step of the EM algorithm. If the true QTL genotype has been inferred for each individual with probability 1, then wt4 is equal to the number of individuals n. Similarly, wt3 is the sum of the probabilities of QTL genotypes that are correct for three-quarters of the alleles. The proportions of simulations indicating a QTL in the interval (5–25 cM) are shown.
Model fitting by iterative or noniterative weighted regression: As discussed above, the QTL model can be fitted at each position by a single weighted regression or by an iterative process, updating the QTL genotype probabilities using trait information as in Equation 10. The third and fourth columns of Table 4 compare the model parameters from these two methods of model fitting for simulation set A. The estimated value for the constant term was close to the true value in both cases, and the mean position, the proportion of QTL located to the region between 5 and 25 cM, and the total weight of correct and three-quarters correct QTL genotypes were very similar. For all the QTL allele effects, however, the iterative approach gave an estimate close to the true value, while the noniterative approach gave estimates whose absolute values were biased downward. The noniterative approach also underestimated the percentage variance accounted for and overestimated the residual variance. The same pattern was observed for all of the other simulation sets (results not shown). We conclude that the noniterative approach to model fitting is inadequate, and it is not considered further.
The threshold for declaring a QTL present: Simulation sets E and F (codominant markers and n = 200, and dominant and codominant markers and n = 100, respectively, and with all QTL effects equal to zero) were used to investigate the distribution of R2a when no QTL is segregating. For set E, the mean R2a was 2.6 (SE 0.04), and the upper 95% point of the distribution was 6.5. We therefore took R2a = 6.5 as the threshold above which a QTL was declared present for simulations A and B. Further simulations (results not shown) indicated that the same threshold was appropriate for simulation C, with a mixture of codominant and dominant markers. For set F, with 100 individuals, the mean R2a was 5.1 (SE 0.18), and the upper 95% point of the distribution was 13.7. This is the appropriate threshold for simulation set D.
The effect of heritability:
The fourth and fifth columns of Table 4 compare simulations with heritabilities of 25 and 10%, respectively. Means and standard errors were calculated over all 4000 data sets in each case. For a heritability of 25%, 3998/4000 data sets had R2a greater than the threshold of 6.5. For a heritability of 10%, 3098/4000 data sets had R2a > 6.5, indicating a significant QTL. The sixth column of Table 4 shows the means and standard deviations for these 3098 significant data sets. The standard error of the QTL location increased as the heritability decreased, although
80% of the simulations were still found in the interval between 5 and 25 cM. The weight placed on the true QTL genotypes in the weighted regression decreased slightly for the lower heritability, and the weight on the three-quarters correct QTL genotype increased. The estimates of the QTL effects, R2a, and the residual mean square were still close to the true values. However, the mean QTL effects and R2a for all 4000 data sets were consistently slightly lower than the true value, while they were consistently slightly higher for the 3098 significant data sets. The opposite trend was seen for the residual mean square. This is to be expected, as we are excluding the data sets with the lowest R2a from the mean in the case of the significant sets.
The effect of marker type: The effect of the change from all codominant markers (simulation B) to a mixture of codominant and dominant markers (simulation C) can be seen by comparing columns five and seven (all 4000 data sets) and columns six and eight (significant data sets) of Table 4. In each case the heritability is 10%. For simulation C, 2848/4000 data sets had R2a > 6.5, indicating a significant QTL. The proportion of QTL located in the interval (5–25 cM) fell to 0.66 (0.70 for the significant data sets) and the weight given to the correct QTL genotypes decreased. The absolute values of the QTL effects were biased downward for the mean over all 4000 data sets but less so for the significant data sets.
The effect of population size: The last four columns of Table 4 compare the effect of decreasing the population size from 200 to 100 (simulation D), for the situation of 10% heritability and a mixture of codominant and dominant markers. As discussed above, the threshold for declaring a significant QTL with a population of 100 is R2a > 13.7, and 1233/4000 data sets had R2a in this range. The absolute values of the QTL effects for a population of 100 (estimated over all 4000 data sets) were biased downward slightly more than for 200 individuals, and the standard errors were generally slightly larger. The weight given to the correct QTL genotypes decreased to below half of the population size. The mean of the QTL location changed markedly from 18.5 to 25.8 cM. However, an examination of the distribution of locations showed that the distribution was skewed. The median QTL position was 18.0 for all data sets and 16.0 for the significant data sets, closer to the true location. The estimates of the QTL effects for the significant data sets, and especially the mean R2a, were biased upward. This is to be expected, but is a source of potential bias in the analysis of experimental data sets.
Comparison with regression on true QTL genotype: In a simulation study such as this, the true QTL genotype is known and we can compare the parameter estimates from a regression of each trait on the true QTL genotype to that obtained by our interval mapping procedure. An examination of the estimates produced by regression on the true QTL genotypes shows that these estimates varied substantially. To illustrate this, Fig 3 shows the relationship between the estimate of R2a from the interval mapping and that from regression on the true genotype for each set of simulations. The line x = y is shown on each plot. There was a high correlation in each case, decreasing as the heritability, the informativeness of the markers, and the population size decreased.
| DISCUSSION |
|---|
In this article we proposed and tested a method for interval mapping of QTL in a full-sib population of an autotetraploid species. This method could also be extended for QTL mapping in plant species of higher ploidy. As with diploid species, the precision with which a QTL may be located is affected by the heritability of the trait, the size of the mapping population, and the informativeness of the markers. It is useful to have as high a proportion of codominant markers as possible, both for precision of QTL mapping and for linkage map construction.
The threshold at which a QTL was declared present was calculated by simulating data sets with QTL effects set to zero and examining the distribution of the adjusted coefficient of determination R2a. Using the 95% point of the distribution of R2a gives a test of significance at a 5% level for the presence of a QTL. Using this threshold, the power to detect QTL varied considerably. For simulation set A, with codominant markers, a heritability of 25% and a population of 200 individuals, the power was >99%. This fell to 77% when the heritability was reduced to 10% (set B), to 71% when a mixture of dominant and codominant markers was used (set C), and to 31% when the population size was reduced to 100 (set D). For set D, the true value of R2a is 10%. The mean value of R2a from all the simulations in set D was 11.1% (SE 0.31), which is lower than the threshold for declaring significance (13.7%). The simulations for which a QTL is declared significant are those with values of R2a in the upper tail of the distribution, which have mean 18.9% (SE 0.14), overestimating the true value. The effects of the QTL alleles are similarly overestimated. This problem is not confined to tetraploid analysis.
We used a novel approach of reconstructing possible chromosome configurations from the observed marker phenotypes for each offspring. The branch and bound algorithm was used to identify configurations with the minimum number of crossovers consistent with the observed data. Configurations that did not come from bivalent pairings were rejected. This analysis was motivated by the need for QTL mapping studies in tetraploid potato. A recent ultrahigh-density diploid genetic linkage map of potato chromosome 1 found that chromatids had experienced 0, 1, or 2 recombination events during meiosis (E. ISIDORE, personal communication); that is, one or two chiasmata per chromosome pair had occurred during the meiosis. The same is likely to be true for chromosomes 2–12, given the lengths of the linkage groups (68–108 cM) found using molecular markers.
The accuracy with which the chromosome configurations were reconstructed depended on the type of markers used. In this study we considered codominant markers (for example, microsatellites) and dominant markers (for example, amplified fragment length polymorphisms). For the codominant markers, the dosages and configurations of alleles were obtained by random sampling with replacement from a maximum of five alleles and a null allele. The proportion of alleles reconstructed correctly was lower for a mixture of dominant and codominant markers than for codominant markers alone. The codominant markers with most alleles, and in particular those with most alleles in simplex configurations, gave offspring phenotypes that could occur in the fewest ways. Such markers are therefore the most useful for chromosome reconstruction. Useful marker information could also be obtained by pyrosequencing single nucleotide polymorphisms to measure dosages of alleles for each offspring, which should be more informative in chromosome reconstruction than presence/absence data.
This simulation study, and in particular the chromosome reconstructions, assumed that the marker order was known without error. This may not be the situation for experimental data, and the reconstruction method could also be used to check and improve the locus ordering:
- Drop each marker in the linkage group in turn and calculate the total number of crossovers for all the offspring. If the omission of any marker reduces the number of crossovers markedly compared to the order for the full group, then try other positions for this marker and reposition it where the total number of crossovers is lowest. Repeat for other markers if necessary.
- Examine the distribution of the crossovers for all the offspring and identify individuals with large numbers of crossovers. The marker data corresponding to the crossovers should be checked and corrected where necessary.
Theoretically, we could also use a computer-intensive search method to search directly for the locus order and parental phases that minimize the total number of crossovers in the offspring. However, there are a very large number of possible orders for a tetraploid cross [$m!/2$ orders for $m$ loci, and up to $(4!)^2$ possible phases at each locus], so it is preferable to use pairwise information to reduce the search space as far as possible. There is a need for further research here.
This analysis was restricted to the case of additive effects of the QTL alleles. However, this model may be too simple. For example, many traits in potato display specific as well as general combining ability.
| ACKNOWLEDGMENTS |
|---|
The computer programs for simulating tetraploid data were written by Dr. Z. W. Luo. This research was supported by a research grant from the United Kingdom Biotechnology and Biological Sciences Research Council and by the Scottish Executive Rural Affairs Department.
Manuscript received May 25, 2001; Accepted for publication September 17, 2001.
| APPENDIX A |
|---|
RELATIONSHIP BETWEEN BIOMETRICAL MODELS
Equating model (1) with the parametrizations used by other authors for two alleles at a tetraploid locus, we see that $d$ is a function of the additive effects $\alpha$ and higher-order terms, while $h$, $v$, and $w$ depend on the diallelic, triallelic, and tetraallelic interactions and higher-order terms, respectively.
| APPENDIX B |
|---|
RECONSTRUCTION OF CHROMOSOME CONFIGURATIONS
Here we demonstrate the reconstruction of the possible chromosomes inherited by an offspring from the cross between parents P1 and P2, with genotypes given in Table 1. The phenotype of this offspring is shown in Table A2.
Consider locus L1, with phenotype ACD. The A allele must have come from chromosome 6 (parent P2). The individual does not have a B or E allele and therefore has not inherited chromosome 2 or 5 at this locus. Four configurations remain possible: (i) chromosomes 6 and 7 from P2 together with chromosomes 1 and 3 from P1 (giving genotype ACDO); (ii) chromosomes 6 and 7 from P2 together with chromosomes 1 and 4 from P1 (giving genotype ACDD); (iii) chromosomes 6 and 8 from P2 together with chromosomes 1 and 4 from P1 (giving genotype ACCD); or (iv) chromosomes 6 and 8 from P2 together with chromosomes 3 and 4 from P1 (giving genotype ACDO). Table A2 shows the chromosome configurations giving rise to the phenotypes of this individual at each locus.
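The enumeration for locus L1 can be reproduced mechanically. The sketch below is an illustration, not the authors' program. The allele carried by each parental chromosome at L1 (C, B, O, D on chromosomes 1 to 4 of P1 and E, A, D, C on chromosomes 5 to 8 of P2, with O treated as a null allele that gives no band) is an assumption inferred from the genotypes quoted in the paragraph above rather than copied from the table, and the function name is hypothetical.

```python
from itertools import combinations

# Alleles at locus L1, inferred from the genotypes quoted in the text.
# Assumption: "O" is a null allele that produces no visible band.
L1_ALLELES = {1: "C", 2: "B", 3: "O", 4: "D",   # parent P1
              5: "E", 6: "A", 7: "D", 8: "C"}   # parent P2


def compatible_configurations(alleles, phenotype):
    """Return every choice of two P1 and two P2 chromosomes whose visible
    alleles match the observed phenotype (a string of band labels)."""
    observed = set(phenotype)
    result = []
    for p1 in combinations((1, 2, 3, 4), 2):
        for p2 in combinations((5, 6, 7, 8), 2):
            bands = {alleles[c] for c in p1 + p2} - {"O"}
            if bands == observed:
                result.append(p1 + p2)
    return result


if __name__ == "__main__":
    for config in compatible_configurations(L1_ALLELES, "ACD"):
        genotype = "".join(sorted(L1_ALLELES[c] for c in config))
        print(config, genotype)
```

Under these assumed alleles, the 36 possible chromosome combinations reduce to the four configurations listed above.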
A branch and bound algorithm is then used to identify the chromosome configurations that give the minimum number of recombinations for the complete linkage group. An initial configuration is found by ordering the configurations according to the number of loci for which they are possible and selecting for each locus the most frequent compatible configuration. For this individual, the configurations 1257 and 2357 are jointly most frequent, each being possible for 5 of the 10 loci. Neither of these is possible for locus L1, however. The initial order is
L1 1 4 6 7
L2 1 2 5 7
L3 1 2 5 7
L4 1 2 5 7
L5 1 4 6 7
L6 3 2 6 8
L7 1 2 5 6
L8 1 2 5 7
L9 3 2 5 7
L10 1 2 5 7
with 12 recombinations. The algorithm searches for configurations with the minimum number of crossovers. It is not necessary to test every combination in Table A2: if a combination for, say, L1–L5 already has more recombinations than the current minimum, then it is rejected without considering L6–L10.
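The counting and pruning described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: a recombination is counted as one chromosome changing between adjacent loci, the function names are hypothetical, and the candidate lists for the second and third loci in the small search example are invented for illustration because Table A2 is not reproduced here.

```python
def recombinations(prev, curr):
    """Number of chromosomes in curr that were not present at the previous locus.
    Chromosomes 1-4 belong to P1 and 5-8 to P2, so parents never mix."""
    return len(set(curr) - set(prev))


def total_recombinations(path):
    """Total recombinations along an ordered list of per-locus configurations."""
    return sum(recombinations(a, b) for a, b in zip(path, path[1:]))


def min_recombination_paths(candidates, bound=float("inf")):
    """Branch and bound over candidates[i], the configurations possible at locus i.
    Returns (minimum count, all paths achieving it); bound is an optional
    starting upper limit, e.g. the count of an initial configuration."""
    best = [bound, []]

    def extend(path, cost):
        if cost > best[0]:
            return  # prune: later loci can only add recombinations
        if len(path) == len(candidates):
            if cost < best[0]:
                best[0], best[1] = cost, []
            best[1].append(tuple(path))
            return
        for config in candidates[len(path)]:
            step = recombinations(path[-1], config) if path else 0
            extend(path + [config], cost + step)

    extend([], 0)
    return best[0], best[1]


# Initial order for this individual, as listed in the text (L1 to L10).
INITIAL_ORDER = [(1, 4, 6, 7), (1, 2, 5, 7), (1, 2, 5, 7), (1, 2, 5, 7),
                 (1, 4, 6, 7), (3, 2, 6, 8), (1, 2, 5, 6), (1, 2, 5, 7),
                 (3, 2, 5, 7), (1, 2, 5, 7)]

# Toy search: L1 candidates follow the text; the other two lists are hypothetical.
TOY_CANDIDATES = [
    [(1, 3, 6, 7), (1, 4, 6, 7), (1, 4, 6, 8), (3, 4, 6, 8)],
    [(1, 2, 5, 7), (1, 4, 6, 8)],
    [(1, 4, 5, 6), (1, 2, 5, 7)],
]

if __name__ == "__main__":
    print(total_recombinations(INITIAL_ORDER))
    print(min_recombination_paths(TOY_CANDIDATES))
```

Applied to the initial order, this counting rule reproduces the 12 recombinations stated above. Passing that count as `bound` lets the search discard partial paths as soon as they exceed it.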
The minimum number of recombinations for this individual is six, and there are 20 configurations with this minimum. Up to now, the question of whether the configuration may be produced by bivalent pairing has been ignored, but now this is checked for each configuration.
One possible configuration with six recombinations is
L1 1 4 6 7
L2 1 4 6 7
L3 1 4 6 5
L4 1 2 6 5
L5 1 2 6 8
L6 1 2 6 8
L7 1 2 6 5
L8 1 2 7 5
L9 3 2 7 5
L10 3 2 7 5.
If we consider the first four loci, these suggest that the chromosomes are paired as 1 + 3 and 2 + 4 from P1 and 5 + 7 and 6 + 8 from P2. However, L5 and L6 have chromosomes 6 and 8 from P2, and L8–L10 have chromosomes 5 and 7 together. This configuration is rejected as incompatible with bivalent pairing. However, the configuration
L1 1 4 6 8
L2 1 4 6 8
L3 1 4 6 5
L4 1 2 6 5
L5 1 2 6 8
L6 1 2 6 8
L7 1 2 6 5
L8 1 2 7 5
L9 3 2 7 5
L10 3 2 7 5
is compatible throughout with the chromosomes pairing as 1 + 3, 2 + 4, 6 + 7, and 5 + 8. Of the 20 configurations with six recombinations for this individual, 8 were compatible with bivalent pairings. They can be summarized as
L1 1 4 6 8
L2 1 4 6 5/8
L3 1 4 6 5
L4 1 2 6 5/8
L5 1 2 6 8
L6 1 2 6 8
L7 1 2 6 5
L8 1 2 6/7 5
L9 3 2 7 5
L10 3 2 7 5.
This is represented as a graphical genotype in Fig 1. The 8 configurations coincide for 37 of the 40 alleles, but there is uncertainty about the other 3 alleles. For example, there is definitely a recombination between chromosomes 6 and 7 between L7 and L9, but it is uncertain on which side of L8 it occurred. One of the 8 configurations is the same as the simulated genotype for this individual.
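The bivalent-pairing check and the summary of the 8 configurations can be verified with a short Python sketch. It rests on one assumption about how the check works, which the text implies but does not spell out: a configuration is taken to be compatible if, for each parent, the four chromosomes can be split into two bivalents such that no locus carries both members of the same bivalent. All function names are hypothetical.

```python
from itertools import product

P1, P2 = (1, 2, 3, 4), (5, 6, 7, 8)


def pairings(chroms):
    """The three ways four homologues can form two bivalents."""
    a, b, c, d = chroms
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


def valid_pairings(path, chroms):
    """Pairings under which no locus carries both members of a bivalent."""
    return [p for p in pairings(chroms)
            if not any(set(pair) <= set(cfg) for cfg in path for pair in p)]


def is_bivalent_compatible(path):
    return bool(valid_pairings(path, P1)) and bool(valid_pairings(path, P2))


def total_recombinations(path):
    """One recombination per chromosome that changes between adjacent loci."""
    return sum(len(set(b) - set(a)) for a, b in zip(path, path[1:]))


# The two six-recombination configurations discussed in the text (L1 to L10).
REJECTED = [(1, 4, 6, 7), (1, 4, 6, 7), (1, 4, 6, 5), (1, 2, 6, 5), (1, 2, 6, 8),
            (1, 2, 6, 8), (1, 2, 6, 5), (1, 2, 7, 5), (3, 2, 7, 5), (3, 2, 7, 5)]
ACCEPTED = [(1, 4, 6, 8), (1, 4, 6, 8), (1, 4, 6, 5), (1, 2, 6, 5), (1, 2, 6, 8),
            (1, 2, 6, 8), (1, 2, 6, 5), (1, 2, 7, 5), (3, 2, 7, 5), (3, 2, 7, 5)]

# The summary table: each position lists its alternative chromosomes.
SUMMARY = [((1,), (4,), (6,), (8,)), ((1,), (4,), (6,), (5, 8)),
           ((1,), (4,), (6,), (5,)), ((1,), (2,), (6,), (5, 8)),
           ((1,), (2,), (6,), (8,)), ((1,), (2,), (6,), (8,)),
           ((1,), (2,), (6,), (5,)), ((1,), (2,), (6, 7), (5,)),
           ((3,), (2,), (7,), (5,)), ((3,), (2,), (7,), (5,))]


def expand(summary):
    """All full configurations encoded by a summary with alternatives."""
    per_locus = [list(product(*locus)) for locus in summary]
    return list(product(*per_locus))


if __name__ == "__main__":
    print(is_bivalent_compatible(REJECTED), is_bivalent_compatible(ACCEPTED))
    print(valid_pairings(ACCEPTED, P1), valid_pairings(ACCEPTED, P2))
    configs = expand(SUMMARY)
    print(len(configs), {total_recombinations(c) for c in configs})
```

Under this reading, the first configuration admits no valid pairing for P2, while the second is consistent only with the pairings 1 + 3, 2 + 4 and 5 + 8, 6 + 7. The summary expands to 8 configurations, each with six recombinations.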
If locus L3 were excluded from the analysis, there would be no evidence to establish the first two crossovers between chromosomes 5 and 8. Loci L1, L3, and L5 have unique alleles on chromosome 5, and this individual carries the unique allele from L3 but not from L1 or L5. The true recombinations were between L1 and L2 and between L4 and L5, but the phenotypes observed for L2 and L4 are both compatible with inheriting chromosomes 6 and 8 from parent P2. Without L3, the minimum recombination configuration would have four recombinations, fewer than the number simulated.
Occasionally the minimum recombination configurations have fewer than the simulated number of recombinations, but are all incompatible with bivalent pairing. In this case configurations with minimum + 1, minimum + 2, etc., recombinations are considered until compatible configurations are found.
| LITERATURE CITED |
|---|
AL-JANABI, S. M., R. J. HONEYCUTT, M. MCCLELLAND, and B. W. S. SOBRAL, 1993 A genetic linkage map of Saccharum spontaneum L. SES 208. Genetics 134:1249-1260.
BARNES, D. K., and C. H. HANSON, 1967 An illustrated summary of genetic traits in tetraploid and diploid alfalfa. U.S. Department of Agriculture Technical Bulletin 1370. U.S. Department of Agriculture, Washington, DC.
BEAVIS, W. D., 1994 The power and deceit of QTL experiments: lessons from comparative QTL studies, pp. 250–266 in 49th Annual Corn and Sorghum Industry Research Conference. ASTA, Washington, DC.
BRADSHAW, J. E., 1994 Quantitative genetics theory for tetrasomic inheritance, pp. 71–99 in Potato Genetics, edited by J. E. BRADSHAW and G. R. MACKAY. CAB International, Wallingford, Oxon, UK.
BRADSHAW, J. E., and G. R. MACKAY, 1994 Breeding strategies for clonally propagated potatoes, pp. 467–497 in Potato Genetics, edited by J. E. BRADSHAW and G. R. MACKAY. CAB International, Wallingford, Oxon, UK.
BRADSHAW, J. E., C. A. HACKETT, R. C. MEYER, D. MILBOURNE, and J. W. MCNICOL et al., 1998 Identification of AFLP and SSR markers associated with quantitative resistance to Globodera pallida (Stone) in tetraploid potato (Solanum tuberosum subsp. tuberosum) with a view to marker-assisted selection. Theor. Appl. Genet. 97:202-210.
BROUWER, D. J. and T. C. OSBORN, 1999 A molecular marker linkage map of tetraploid alfalfa (Medicago sativa L.). Theor. Appl. Genet. 99:1194-1200.
CALLEN, D. F., A. D. THOMPSON, H. A. Y. SHEN, H. A. PHILLIPS, and R. I. RICHARDS et al., 1993 Incidence and origin of "null" alleles in the (AC)n microsatellite markers. Am. J. Hum. Genet. 52:922-927.
CHURCHILL, G. A. and R. W. DOERGE, 1994 Empirical threshold values for quantitative trait mapping. Genetics 138:963-971.
DA SILVA, J., 1993 A methodology for genome mapping of autopolyploids and its application to sugarcane (Saccharum spp.). PhD. Dissertation, Cornell University, Ithaca, NY.
DA SILVA, J. A. G., M. E. SORRELLS, W. L. BURNQUIST, and S. D. TANKSLEY, 1993 RFLP linkage map and genome analysis of Saccharum spontaneum. Genome 36:782-791.
DA SILVA, J., R. J. HONEYCUTT, W. BURNQUIST, S. M. AL-JANABI, and M. E. SORRELLS et al., 1995 Saccharum spontaneum L. SES 208 genetic linkage map combining RFLP- and PCR-based markers. Mol. Breed. 1:165-179.
DIGITAL, 1997 DIGITAL Fortran Language Reference Manual. Digital Equipment Corporation, Maynard, MA.
DIWAN, N., J. H. BOUTON, G. KOCHERT, and P. B. CREGAN, 2000 Mapping of simple sequence repeat (SSR) DNA markers in diploid and tetraploid alfalfa. Theor. Appl. Genet. 101:165-172.
EASTON, H. S., 1976 Etude comparative d'effects génétique chez des plantes diploïdes et tetraploïdes isogéniques de Festuca pratensis Huds. Thèse de Doctorat d'Etat des Sciences Naturelles, Université de Paris-Sud, Paris, France.
HACKETT, C. A., 2001 A comment on Xie and Xu: mapping quantitative trait loci in tetraploid species. Genet. Res. 78:187-189.
HACKETT, C. A., J. E. BRADSHAW, R. C. MEYER, J. W. MCNICOL, and D. MILBOURNE et al., 1998 Linkage analysis in tetraploid species: a simulation study. Genet. Res. 71:143-154.
JANSEN, R. C., 1992 A general mixture model for mapping quantitative trait loci by using molecular markers. Theor. Appl. Genet. 85:252-260.
JANSEN, R. C., 1996 A general Monte Carlo method for mapping multiple quantitative trait loci. Genetics 142:305-311.
JIANG, C. J. and Z-B. ZENG, 1997 Mapping quantitative trait loci with dominant and missing markers in various crosses from two inbred lines. Genetica 101:47-58.
KEMPTHORNE, O., 1957 An Introduction to Genetic Statistics. John Wiley & Sons, New York.
LANDER, E. S. and P. GREEN, 1987 Construction of multilocus genetic-linkage maps in humans. Proc. Natl. Acad. Sci. USA 84:2363-2367.
LUO, Z. W., C. A. HACKETT, J. E. BRADSHAW, J. W. MCNICOL, and D. MILBOURNE, 2000 Predicting parental genotypes and gene segregation for tetrasomic inheritance. Theor. Appl. Genet. 100:1067-1073.
LUO, Z. W., C. A. HACKETT, J. E. BRADSHAW, J. W. MCNICOL, and D. MILBOURNE, 2001 Construction of a genetic linkage map in tetraploid species using molecular markers. Genetics 157:1369-1385.
MEYER, R. C., D. MILBOURNE, C. A. HACKETT, J. E. BRADSHAW, and J. W. MCNICOL et al., 1998 Linkage analysis in tetraploid potato and associations of markers with quantitative resistance to late blight (Phytophthora infestans). Mol. Gen. Genet. 259:150-160.
PIJNACKER, L. P. and M. A. FERWERDA, 1984 Giemsa C-banding of potato chromosomes. Can. J. Genet. Cytol. 26:415-419.
RIPOL, M. I., G. A. CHURCHILL, J. A. G. DA SILVA, and M. SORRELLS, 1999 Statistical aspects of genetic mapping in autopolyploids. Gene 235:31-41.
ROBERTS, S. J., 1984 A branch and bound algorithm for determining the optimal feature subset of given size. Appl. Stat. 33:236-241.
SILLS, G. R., W. BRIDGES, S. M. AL-JANABI, and B. W. S. SOBRAL, 1995 Genetic analysis of agronomic traits in a cross between sugarcane (Saccharum officinarum L.) and its presumed progenitor (S. robustum Brandes and Jesw. ex Grassl). Mol. Breed. 1:355-363.
STAM, P., and J. W. VAN OOIJEN, 1995 JoinMap Version 2.0: Software for the Calculation of Genetic Linkage Maps. CPRO-DLO, Wageningen, The Netherlands.
SWAMINATHAN, M. S. and H. W. HOWARD, 1953 The cytology and genetics of the potato (Solanum tuberosum) and related species. Bibliogr. Genet. 16:1-192.
UKOSIT, K. and P. G. THOMPSON, 1997 Autopolyploidy versus allopolyploidy and low-density randomly amplified polymorphic DNA linkage maps of sweetpotato. J. Am. Soc. Hort. Sci. 122:822-828.
UTZ, H. F., A. E. MELCHINGER, and C. C. SCHON, 2000 Bias and sampling error of the estimated proportion of genotypic variance explained by quantitative trait loci determined from experimental data in maize using cross validation and validation with independent samples. Genetics 154:1839-1849.
VAN ECK, H. J., J. R. VAN DER VOORT, J. DRAAISTRA, P. VAN ZANDVOORT, and E. VAN ENCKEVORT et al., 1995 The inheritance and chromosomal localization of AFLP markers in a noninbred potato offspring. Mol. Breed. 1:397-410.
WRIGHT, A. J., 1979 The use of differential coefficients in the development and interpretation of quantitative genetics models. Heredity 43:1-8.
WU, K. K., W. BURNQUIST, M. E. SORRELLS, T. L. TEW, and P. H. MOORE et al., 1992 The detection and estimation of linkage in polyploids using single-dose restriction fragments. Theor. Appl. Genet. 83:294-300.
XIE, C. and S. XU, 2000 Mapping quantitative trait loci in tetraploid populations. Genet. Res. 76:105-115.
YU, K. F. and K. P. PAULS, 1993 Segregation of random amplified polymorphic DNA markers and strategies for molecular mapping in tetraploid alfalfa. Genome 36:844-851.





















