- THIS ARTICLE
-
Abstract
- Full Text (PDF)
- A corrigendum has been published
-
All Versions of this Article:
genetics.108.091058v1
180/3/1743 most recent - Alert me when this article is cited
- Alert me if a correction is posted
- SERVICES
- Email this article to a friend
- Similar articles in this journal
- Similar articles in PubMed
- Alert me to new issues of the journal
- Download to citation manager
- Reprints & Permissions
- CITING ARTICLES
- Citing Articles via Google Scholar
- GOOGLE SCHOLAR
- Articles by Bauman, L. E.
- Articles by Lange, K.
- Search for Related Content
- PUBMED
- PubMed Citation
- Articles by Bauman, L. E.
- Articles by Lange, K.
Originally published as Genetics Published Articles Ahead of Print on September 14, 2008.
Genetics, Vol. 180, 1743-1761, November 2008, Copyright © 2008
doi:10.1534/genetics.108.091058
Mixed Effects Models for Quantitative Trait Loci Mapping With Inbred Strains
Lara E. Bauman*,
,1,
Janet S. Sinsheimer
,
,
,
Eric M. Sobel
and
Kenneth Lange
,
,**
* Department of Genetics, Southwest Foundation for Biomedical Research, San Antonio, Texas 78245-0549,
Department of Biomathematics, University of California, Los Angeles, California 90095-1766,
Department of Biostatistics, University of California, Los Angeles, California 90095-1772,
Department of Human Genetics, University of California, Los Angeles, California 90095-7088 and ** Department of Statistics, University of California, Los Angeles, California 90095-1554
1 Corresponding author: Department of Genetics, Southwest Foundation for Biomedical Research, P.O. Box 760549, San Antonio, TX 78245-0549.
E-mail: lbauman{at}sfbrgenetics.org
>ABSTRACT
METHODS
APPLICATIONS
DISCUSSION
APPENDIX A: REPRESENTATION OF...
APPENDIX B: PROPERTIES OF...
APPENDIX C: COMPUTATION OF...
APPENDIX D: CONSTRUCTION OF...
APPENDIX E: DIFFERENTIATION OF...
APPENDIX F: A COUNTEREXAMPLE...
ACKNOWLEDGEMENTS
LITERATURE CITED
Fixed effects models have dominated the statistical analysis of genetic crosses between inbred strains. In spite of their popularity, the traditional models ignore polygenic background and must be tailored to each specific cross. We reexamine the role of random effect models in gene mapping with inbred strains. The biggest difficulty in implementing random effect models is the lack of a coherent way of calculating trait covariances between relatives. The standard model for outbred populations is based on premises of genetic equilibrium that simply do not apply to crosses between inbred strains since every animal in a strain is genetically identical and completely homozygous. We fill this theoretical gap by introducing novel combinatorial entities called strain coefficients. With an appropriate theory, it is possible to reformulate QTL mapping and QTL association analysis as an application of mixed models involving both fixed and random effects. After developing this theory, our first example compares the mixed effects model to a standard fixed effects model using simulated advanced intercross line (AIL) data. Our second example deals with hormone data. Here multivariate traits and parameter identifiability questions arise. Our final example involves random mating among eight strains and vividly demonstrates the versatility of our models.
IN analyzing gene mapping data from inbred strains, there is always the temptation to borrow models more pertinent to outbred populations. The vast majority of statisticians are wise enough to resist this temptation and turn to analysis methods tailored to specific breeding designs. Fortunately, the typical backcross or F2 design has sufficient symmetry to permit analysis of variance by standard statistical packages. As mammalian geneticists explore more complicated designs involving multiple strains and multiple generations, this analysis paradigm has begun to fracture. It is therefore hardly surprising that the last decade and a half have seen a revival of interest in statistical models for gene mapping with inbred strains. Although we briefly review some of the important contributions to this literature in the next section, it is fair to say that most modern models rely heavily on fixed effects. In contrast, the most successful models for mapping quantitative trait loci (QTL) in outbred populations invoke random effects (HOPPER and MATHEWS 1982; GOLDGAR 1990; SCHORK 1993; AMOS 1994; BLANGERO and ALMASY 1997).
The premise of this article is that, properly formulated, random effects models hold equal promise for more complicated inbred strain data. If a QTL is segregating between two strains, backcross and F2 designs reliably detect it (VALDAR et al. 2006). Models based on fixed allelic effects play a critical role in this process. Traditional designs have two drawbacks. First, the scarcity of recombination events often gives long mapped intervals. Second, when two founder strains of related ancestry are chosen, there may be no segregating QTL. To increase the number of recombination events and the number of segregating QTL, geneticists are turning to more complex designs involving multiple strains. Although the rationale for more complex designs is compelling, they bring in their wake problems of overparameterization. Random effects models neatly circumvent some of the parameterization issues encountered with fixed effects models. Unfortunately, the standard outbred QTL model does not make sense for inbred strains. All individuals of a particular strain are genetically identical and completely homozygous. These cardinal characteristics have subtle consequences when we calculate trait covariances for the descendants of matings between different strains. A logically correct theory for specifying covariances between pairs of individuals is the key to making random effects models respectable for inbred strains.
In this article, we take two approaches to QTL mapping; both capture polygenic background as a source of random variation. The two approaches differ in how they handle variation caused by the QTL. In association mapping, markers are treated one by one as candidate genes, and observed genotypes or allele counts at a marker serve as fixed predictors of trait means. In linkage mapping, markers in the vicinity of the QTL provide prior information on gene sharing, and the QTL contribution is modeled as a random effect. The greatest defect of our models is the blanket assumption of additivity. The greatest strength of our models is their generality in other regards. Thus, there is no limit to the number of founding strains, the depth and complexity of pedigrees, or the number of traits in a multivariate analysis.
To avoid breaking the flow of our discussion, much of the mathematical detail is relegated to the APPENDIXES. The following sections summarize previous contributions, lay out the model with full attention to computation of strain coefficients and relative covariances, resolve the thorny issue of identifiability, apply the models to real and simulated data, and discuss the broader implications and limitations of the models.
ABSTRACT
>METHODS
APPLICATIONS
DISCUSSION
APPENDIX A: REPRESENTATION OF...
APPENDIX B: PROPERTIES OF...
APPENDIX C: COMPUTATION OF...
APPENDIX D: CONSTRUCTION OF...
APPENDIX E: DIFFERENTIATION OF...
APPENDIX F: A COUNTEREXAMPLE...
ACKNOWLEDGEMENTS
LITERATURE CITED
A brief survey of previous methods:
Inbred mammalian strains have unique advantages in genetics. All members of a strain are genetically identical and completely homozygous. Simple crosses between strains involve no phase ambiguities, and any genes mapped can be quickly located in humans and other species by synteny. With mice and other small mammals, breeding is reasonably straightforward, generation times are fairly short, and the environment can be exquisitely controlled.For decades, QTL mapping in inbred strains was considered an exercise in fixed effects modeling. Testing for association between marker genotypes and trait values is readily carried out using several available statistical packages. In the interval method introduced by LANDER and BOTSTEIN (1989), the QTL is allowed to take any position along a chromosome. This makes QTL genotypes unobservable and requires computation of posterior distributions given observed genotypes at the flanking markers. Although the EM algorithm is applicable in this context, it is often slow to converge, and the regression method of HALEY and KNOTT (1992) provides a quick approximation. The permutation test of CHURCHILL and DOERGE (1994) handles multiple testing problems gracefully. The recent program R/qtl (BROMAN et al. 2003), which capitalizes on the R software environment, combines several of these methods with hidden Markov modeling of missing genotypes. Despite these admirable advances, interval mapping is still limited to simple crosses where polygenic background is confounded with random environment. As the field embraces more complex crosses, geneticists no longer have the luxury of ignoring polygenic background, and it seems self-evident that explicitly modeling it will improve statistical inference.
The composite interval mapping method of ZENG (1993, 1994) implemented in QTL Cartographer generalizes interval mapping by including the direct effects of one or more markers unlinked to the QTL. Hence, composite interval mapping can be viewed as an attempt to incorporate polygenic background through fixed effects. If the number of typed markers is large, then it becomes hopeless to include all of them, and some automatic selection of background markers is desirable (MANLY and OLSON 1999).
Although XIE et al. (1989) take important first steps toward including polygenic background as a random effect, they do not derive general covariance expressions. This failure makes it difficult to deal with nonstandard crosses and awkward to combine data from different crosses. In the meantime, the pressure to increase the number of strains per cross has been growing (REBAI and GOFFINET 1993). Of 21 cloned mouse genes listed in Tables 1 and 2 of the review by FLINT et al. (2005), 7 rely on cloning strategies involving multiple strains or outbred mice. These practical concerns are stimulating intense efforts to revamp experimental design and statistical analysis of inbred cross data (LIU and ZENG 2000; HITZEMANN et al. 2002; PLETCHER et al. 2004; LI et al. 2005; CERVINO et al. 2007). Other recent models that delve into multiple QTL models and epistasis are both frequentist (KAO et al. 1999; JANNINKA and JANSENA 2001; SEATON et al. 2002; BROMAN et al. 2003) and Bayesian oriented (SILLANPÄÄ and ARJAS 1998; SEN and CHURCHILL 2001; BROMAN et al. 2003).
|
|
Trait means, variances, and covariances:
We begin our theory development with a basic model applicable to any inbred strain design, including F2, advanced intercross lines, and random mating. Suppose that i and j are two animals generated by a complex cross involving s inbred strains. At t traits of interest, i and j exhibit random vectors Xi and Xj of trait values. For the sake of simplicity, assume further that Xi and Xj reflect the contributions of a single gene whose alleles have additive effects. Our immediate goal is to calculate the expected vectors E(Xi) and E(Xj) and the covariance matrix Cov(Xi, Xj). When i = j, we recover variances as well as covariances. Because of our assumption of additivity, Xi decomposes as the sum Yi + Zi of a maternal contribution Yi plus a paternal contribution Zi. To calculate E(Yi), let Mi denote the originating strain of the maternal gene of i. Although Mi is unobserved, we can calculate the probability Pr(Mi = a) for any given strain a. In terms of these probabilities and the t x 1 mean vector µ(a) of allelic effects on each trait for strain a, we have
![]() |
![]() | (1) |
i(a) is the probability that a randomly sampled gene from i originates from strain a. We refer to
i as the strain fraction vector for animal i;
i has dimension s x 1.
Covariances are derived by the same kind of reasoning. Decompose Xj into the sum Vj + Wj of a maternal contribution Vj plus a paternal contribution Wj. In view of the bilinearity of the covariance operator and the symmetry of maternal and paternal alleles, it suffices to find the covariance Cov(Yi, Vj). Let Nj denote the originating strain of the maternal gene of j. Conditioning on the joint value of Mi and Nj then yields
![]() |
ij(a, b) to be the joint probability that a randomly drawn gene from animal i originates from strain a and a randomly drawn gene from the same locus of animal j originates from strain b. If i and j coincide, then sampling is done with replacement. The t x t covariance matrix between the trait values of i and j becomes
![]() | (2) |
ij(a, b) –
i(a)
j(b), which we collect into an s x s matrix, denoted Cij.
For s strains and t traits, it is convenient to stack the allelic effects into a column vector µ of length st with transpose
![]() |
= µµ* can then be split into t2 blocks
kl each of size s x s. Restricting our attention to the block corresponding to traits k and l, the covariance matrix (2) has entries given by the trace formula
![]() | (3) |
In polygenic inheritance, many independent loci contribute in an additive manner to the traits under consideration. Since trait means and covariances add in this setting, the mean expression (1) and the covariance expressions (2) and (3) remain valid provided we replace µ by
and
by
. Here µl denotes the vector contribution corresponding to locus l rather than the lth component of µ. APPENDIX A shows that every pair (µ,
) consisting of a vector µ and a positive semidefinite matrix
can be represented as two such coordinated linear combinations. Hence, to capture polygenic background, it suffices to estimate arbitrary µ and
. We see later that there is an identifiability issue that must be surmounted in estimating
.
Computation of strain coefficients:
Because the combinatorial coefficients
i(a) and
ij(a, b) are essential in calculating trait means and variances, we need good algorithms to compute these coefficients. Fortunately, we can mimic the logic used in calculating kinship coefficients for outbred populations. Since a pedigree founder i is assumed to be strain pure, one entry of the vector
i = 1, and the remaining entries = 0. Likewise for two founders i and j, one entry of the matrix
ij = 1, and the remaining entries = 0. All other strain fraction vectors
i and strain coefficient matrices
ij are defined recursively starting with the founders.
To avoid circular reasoning, pedigree members are numbered so that parents always precede their children. If animal i is not a founder, then it has parents k and l. Assuming that k and l have already been visited in filling in the strain fractions, we set
![]() | (4) |
i, then without loss of generality we can assume j has been visited already, and we can set
![]() | (5) |
![]() | (6) |
![]() | (7) |
) denotes a diagonal matrix whose diagonal entries coincide with the entries of the vector
.
The initial conditions on founders and the recurrences (4)–(7) completely determine
i and
ij. These in turn determine the Cij matrices, which have a richer mathematical structure than the strain coefficient matrices
ij. APPENDIX B describes several fascinating properties of the Cij matrices. One such property is Cij = 0 between most members of simple crosses, for example, for all F2 animals when i
j or whenever i is a founder or F1.
Variance component models for QTL mapping with outbred populations require conditional kinship coefficients in addition to theoretical kinship coefficients. For exactly the same reasons, we also need conditional strain fractions and coefficient matrices. These depend on observed marker genotypes in the vicinity of a putative QTL. On small pedigrees, it is possible to compute conditional strain coefficient matrices exactly by considering all descent graphs (gene flow patterns) at the QTL and neighboring markers (KRUGLYAK et al. 1996). In practice, inbred strain pedigrees are so large that the number of possible descent graphs is astronomical. Stochastic sampling provides a workable substitute for exhaustive enumeration of descent graphs (SOBEL and LANGE 1996). The Markov chain Monte Carlo (MCMC) method incorporated in the computer program SimWalk samples relevant descent graphs with the appropriate conditional probabilities. Given a descent graph at the QTL, it is trivial to compute strain fractions for all animals and strain coefficient matrices for all pairs of animals in a pedigree. The averages of these quantities over all sampled descent graphs serve as approximations to the conditional strain fractions and strain coefficient matrices.
Strain coefficients convey more information than strain fractions. For instance, it is obvious that
![]() |
be the conditional probability that animal i has unordered genotype a/b at the hypothetical marker given the observed data at the ordinary markers. The relations
![]() |
![]() | (8) |
Variance component models:
Variance component models revolve around the multivariate normal distribution or related distributions such as the multivariate t. Every multivariate normal distribution is uniquely determined by its mean vector
and variance matrix
. If we decompose trait values into independent, additive contributions, then
and
can be expressed as sums over the various contributions. As long as we are willing to take the leap of faith that all random contributions are Gaussian, then trait vectors will be Gaussian as well. For each random contribution, variance matrices are constructed from a constant part and a parametric part. The genetic covariance formula (3) is typical in this regard. The constant parts Cij are forced on us by the nature of the pedigree. The parametric part
with blocks
kl requires estimation.
The environmental contribution to the mean is usually modeled as the sum of a grand mean
plus covariate effects such as age or sex. Random environment and cage effects can be modeled by Kronecker products of variance matrices, provided we order trait values so that all values corresponding to a given trait are contained in a single block, and animals are consistently enumerated across blocks. Given these conventions, the variance matrix under random environment reduces to the Kronecker product
I of the trait variance matrix
and the identity matrix I. Obviously,
is the parametric part; it describes the environmental covariation of the traits in a single animal. The matrix I reflects the independence of the random environments for the various animals. For a random cage effect, we replace the identity matrix by a cage matrix H = (hij), where hij = 1 if animals i and j belong to the same cage and 0 otherwise. The matrix replacing
describes the environmental covariation of the traits for animals in a single cage (LANGE 2002). As an example, heritability analyses generally specify two random effects, additive polygenes and random error/environment,
![]() | (9) |
![]() | (10) |
ic is the cth of C covariates measured on animal i and βkc is the corresponding regression coefficient for trait k.
Once we specify the mean and variance components, the loglikelihood of a pedigree can be written as
![]() |
such as that of Equation 9, and the variance matrix
such as that of Equation 10. Assuming pedigrees behave independently, their loglikelihoods add. Given the overall loglikelihood, parameters can be estimated by maximum likelihood, and statistical inference conducted by standard likelihood ratio tests comparing alternative hypotheses to null hypotheses. LANGE (2002) develops this frequentist approach to estimation and inference in detail. Our computer program Mendel relies on a quasi-Newton algorithm for maximum likelihood estimation. BAUMAN et al. (2005) discusses an alternative EM algorithm as well as factor-analytic parameterizations of variance matrices. Given the presence of covariates and heterogenous pedigree structures, permutation testing is rarely possible. To aid the user in judging significance and model fitting, Mendel reports standard errors of parameters, pedigree deviances, outlier individuals, and various goodness-of-fit statistics.
Two QTL mapping strategies:
There are two specific strategies, association and linkage, for QTL mapping. Variance component models are pertinent to both. Although the two strategies differ in how they portray QTL effects, each captures polygenic background as a random effect. In addition to the strain effects appearing in Equation 1, most models include a grand mean
and fixed effects tied to plausible predictors. If we specify
, then we must impose the vector constraint
on the polygenic mean vector µ. Here the index a ranges over all strains. Random effects include the polygenic effect summarized by Equation 3, random environment plus measurement error, and possibly correlated environment such as cage effects. As described in the next section, the polygenic variance matrix
is not identifiable, and complicated constraints must be imposed on it to compensate for this fact. Regardless of the nature of these constraints, we must compute theoretical strain fractions and strain coefficients to estimate µ and
under the null hypothesis of no QTL effect.
In linkage mapping, markers serve to tag chromosome segments and keep track of recombination events. The genotypes of the causative QTL are unobserved, and the QTL is allowed to assume any position along the genome. Under the alternative hypothesis in linkage mapping, we model the QTL as a random effect in the same way that we modeled the contribution of a single gene with additive effects. The only difference is that we use strain fractions and coefficients calculated conditional on the observed marker data. From here on, we refer to these as conditional strain fractions and coefficients; those calculated unconditionally we call theoretical strain fractions and coefficients. Motivated by Equations 1 and 3, we let
(a) denote the additive effect of the QTL in strain a. Then our earlier reasoning shows that the QTL contribution has mean
![]() |
![]() |
.
Our basic linkage model therefore specifies the trait means and covariances
![]() | (11) |
![]() | (12) |
ic is covariate c of animal i, and βkc is the corresponding regression coefficient for trait k. If we let
denote the average
, then all QTL models that include a grand mean require the constraint
. In the presence of this constraint, the likelihood ratio test of linkage follows asymptotically a
2 distribution with st – t degrees of freedom.
In association mapping, QTL fixed effects are tied to the current marker. The marker is viewed as a candidate gene whose genotypes or alleles directly influence trait means (LANGE et al. 2005); random QTL effects are omitted. Hence, in Equation 12 we drop the random effect
, and in Equation 11 we amend the fixed effect
to represent regression on observed allele counts at the current marker. If the additive model for allelic effects is viewed as too restrictive, then we can regress on observed genotypes. Association testing is again conducted by likelihood ratio statistics.
In the presence of missing genotypes in association testing, we fall back on imputed allele counts or imputed genotype counts. Because genotypes at markers are usually directly observed, little is lost in imputation by ignoring genotypes at flanking markers. In this simpler setting, a fast deterministic algorithm is available for imputation (LANGE et al. 2005). Flanking marker genotypes occasionally resolve phase ambiguities caused by combining closely spaced single nucleotide polymorphisms (SNPs) into supermarkers. Accordingly, the current version of Mendel also accepts MCMC estimates of conditional strain fractions from SimWalk. When each strain carries a different allele at the marker, the allele counts delivered by SimWalk are computed by doubling the conditional strain fractions at the marker. When two strains share a common allele at the marker, the corresponding strain fractions are added before doubling.
Identifiability:
We have seen that the polygenic covariance expression (3) between trait k of animal i and trait l of animal j involves the s x s trait block
kl of an st x st variance matrix
. Unfortunately, estimation of
collides with an identifiability issue. The crux of the problem is the existence of nontrivial matrices
with
![]() |
matrices Emn indexed by unordered strain pairs {m, n}. Here all entries of Emn are 0 except for the diagonal entries emm = enn = 1 and the off-diagonal entries emn = enm = –1. It follows that
![]() |
![]() | (13) |
kl = (
kl,mn) of
.
We can solve the identifiability problem by subtracting the nonidentifiable part of
from
. To achieve this end, we view the positive semidefinite matrix
as a vector in the Euclidean space Rstxst. In this setting the trace function
= tr(AB*) and Frobenius norm ||A||F = tr(AA*)1/2 reduce to the standard inner product and Euclidean norm. To find the nonidentifiable part of
, one projects
onto the vector subspace
of symmetric matrices satisfying Equation 13 for every strain pair {m, n} and every trait block
kl. Formally, the projection P(
) is defined to be the matrix X giving the minimum of
for
.
Fortunately, minimization of
separates into subproblems corresponding to different trait blocks. First, consider a diagonal block
kk of
. To simplify notation, denote its entries by ymn =
kk,mn and the entries of the corresponding block of the projection by xmn = P(
)kk,mn. To find P(
)kk we must minimize the sum of squares
![]() |
kl =
. These come in pairs that must be handled together, so we let
![]() |
![]() |
![]() |
APPENDIX C shows that each of these least-squares problems has solution X = (xmn) with residual
![]() |
. In calculating a covariance, we can ignore symmetrization and replace the matrix
by U. Indeed, the symmetry of Cij implies that
![]() |
klQ) faithfully represents the covariance between trait k of animal i and trait l of animal j. By the same reasoning, we can replace the entire residual matrix
– P(
) by the matrix
![]() | (14) |
.
In reparameterizing
, it is convenient to define an orthogonal matrix O mapping the vector
to the standard basis vector e1. (See APPENDIX D for one version of O.) It follows that
![]() |
zeros out the first row and first column of the matrix. To take advantage of this fact, we express the residual matrix (14) as
![]() | (15) |
![]() |
diag(O*). By our earlier remark, a block
kl of
equals the corresponding block of diag(O)
diag(O*) with its first row and column zeroed out.
We are now close to the desired goal of reparameterizing the residual. The matrix
has entire rows and columns consisting of zeros. Permuting its rows and columns appropriately will move its nontrivial part to an upper-left block, which will be positive definite whenever
is positive definite. The Cholesky decomposition of this upper-left block then serves as a good parameterization of R. To compute the number of parameters for s strains and t traits, observe that the matrix
is st x st. A total of t rows and columns are lost in the zeroing-out process. This leaves an (st – t) x (st – t) upper-left block with (st – t)(st – t + 1)/2 diagonal or subdiagonal entries. For example, with three strains and two traits, there are 10 parameters.
For the sake of clarity, let us summarize how our proposed parameterization leads to trait covariances. It begins with a Cholesky decomposition
of an (st – t) x (st – t) positive definite matrix. The matrix 
* is then subdivided into (s – 1) x (s – 1) trait blocks (
*)kl, and each block is promoted to an s x s trait block
kl by adding a top row and left column of zeros. In matrix notation,
kl = Z(
*)klZ* with Z the s x (s – 1) matrix
![]() |
With these conventions, the covariance between trait k of animal i and trait l of animal j amounts to
![]() | (16) |
Z*AZ on an s x s matrix A deletes the first row and first column of A. This ends our theoretical overview of the model. APPENDIX E shows how to differentiate covariances with respect to parameters, and APPENDIX F supplies a counterexample connecting identifiability and symmetry. We now move on to data analysis.
ABSTRACT
METHODS
>APPLICATIONS
DISCUSSION
APPENDIX A: REPRESENTATION OF...
APPENDIX B: PROPERTIES OF...
APPENDIX C: COMPUTATION OF...
APPENDIX D: CONSTRUCTION OF...
APPENDIX E: DIFFERENTIATION OF...
APPENDIX F: A COUNTEREXAMPLE...
ACKNOWLEDGEMENTS
LITERATURE CITED
A simulated advanced intercross line:
An AIL starts with F1 offspring from an intercross of two inbred strains. The F1 animals are randomly bred to produce the F2 animals, the F2 animals are randomly bred to produce the F3 animals, and so on for a total of n generations. An AIL differs from repeated brother–sister mating, because it involves enough animals to preserve genetic diversity. It draws its strength from the steady accumulation of recombination events over many generations (DARVASI and SOLLER 1995). Simulating data according to an AIL design permits us to compare our mixed effects results with the fixed effects results of the benchmark program QTL Cartographer. This exercise is not meant to be a substitute for an exhaustive study of power and experimental design. Also, the comparison is not entirely fair because QTL Cartographer analyzes the Fn data at the last generation ignoring the previous generations. To reconstruct missing marker information, QTL Cartographer applies an inflated recombination fraction scaled to reflect n.
To create our simulated AIL data, we mated two inbred founder animals and subjected their descendants in each generation to virtual random mating. Generation 10 contained 175 animals in 140 sibships with 492 animals overall. Placing the QTL locus at the midpoint of markers 5 and 6 of 11 equally spaced marker loci, we simulated genotypes by gene dropping and assigned QTL effects on the basis of the genotypes at the QTL. QTL genotypes were then discarded from further analysis. We modeled a univariate trait with a grand mean
= 4, an environmental variance 
= 1, and a 2 x 2 polygenic variance matrix
![]() |
1 = 0.2 and
2 = –0.2 centered around 0. In view of our discussion of identifiability, we can estimate only a single parameter p1 characterizing
. The projection technique discussed yields the value p1 = 0.667. The discussion of the Cij matrices in APPENDIX B explains why genotype data on a single generation also prevent estimation of p1. To provide the most informative comparisons, we ran three analyses: (1) Mendel on the full pedigree with complete genotype and phenotype data (Mendel Full), (2) Mendel on the full pedigree but with phenotype data on only the final F10 generation (Mendel F10), and (3) QTL Cartographer on the final F10 generation with complete genotype and phenotype data (Cartographer). Simply comparing cases Mendel Full and Cartographer is hardly fair; the full pedigree contains more than twice the number of animals in the final generation. Mendel F10 takes advantage of the full genealogy and all genotype data in computing theoretical and conditional strain coefficients. It limits itself to the phenotype sample in the last generation to enable a better comparison to QTL Cartographer.
Before turning to QTL mapping in the Mendel analyses, we fit a baseline model including the grand mean, the polygenic variance, and the environmental variance. We then estimated conditional strain coefficients at each of the 11 marker loci. This put us in a position to estimate the global parameters and the QTL-specific parameters simultaneously at each locus. The evidence in favor of the QTL is summarized by a likelihood ratio test (LRT) statistic following a
distribution; a nonlinear false discovery rate (FDR) correction (BENJAMINI et al. 2001) corrects for multiple testing for all three analyses. Table 1 summarizes the type I error rate, power, and coverage as well as the generating parameters, their estimates, and the standard errors of the estimates at the loci adjacent to the QTL. Successful coverage occurs when the equivalent one-LOD drop interval (4.6 LRT units) includes the QTL. We reject the null hypothesis of no QTL effect when the LRT is significant at the 0.05 level.
The results in Table 1 reflect 100 simulations for a QTL-effect size that yields power >90% for Mendel Full; type I error rates are given as confidence intervals based on 500 simulations under the null hypothesis of no QTL effect. Clearly, the power to detect linkage is drastically reduced when only the F10 generation is available for analysis. This absence of data also makes it difficult for Mendel F10 to estimate the polygenic parameter p1 accurately. For Mendel Full all estimates are within one standard error of their true values, and standard errors are small. QTL Cartographer exhibits slightly better power and coverage than Mendel F10, but with a largely inflated type I error rate. Both methods are easily bested by Mendel Full. These trends continue over a range of smaller QTL effects (data not shown). We are pleased with these results. In our view they demonstrate that application of the mixed effects model sacrifices little in simple settings while generalizing readily to complex pedigrees.
A multivariate four-way cross:
To illustrate the analysis of multivariate traits, we next consider the hormone data of Burke and colleagues (HARPER et al. 2003) on aging UM-HET3 mice. Figure 1 shows how the UM-HET3 mice were created from four founder strains: BALB/cJ (C), C57BL/6J (B6), C3H/HeJ (C3), and DBA/2J (D2). CB6F1 females crossed with C3D2F1 males provided 967 F2 full siblings. At markers with four different alleles, all F2 mice were heterozygous. Thus compared to a two-way cross, the four-way cross doubles the number of founder strains without sacrificing phase certainty. Hormone levels of insulin-like growth factor I (IGF), leptin (Lep), and thyroxine (T4) were measured at 4 and 15 months on each of the F2 mice. Testing maternal and paternal effects separately, Harper et al. found several linked markers in these data via ANOVA, including a maternal allele at D3Mit25 linked to IGF at 15 months, a paternal allele at D3Mit127 linked to Lep at 4 months, and both maternal and paternal alleles linked to Lep at 15 months. It is worth pointing out that ANOVA or MANOVA must be carried out at marker loci. Only here do marker genotypes or allele counts unambiguously define factor levels. With complete genotyping, our model collapses in this setting to the classical models.
|
This multistrain cross highlights identifiability pitfalls inherent in the structure of some crosses and the data collected on them. For example, all F2 mice share the strain fraction vector
. Hence, the polygenic mean is confounded with the grand mean. Using strain trait averages or phenotyping members of the original strains would allow us to estimate the polygenic means, but this is not an option for the current data.
Although the rigid structure of the four-way cross preserves phase certainty, it reduces uncertainty to the point where the polygenic covariance matrix cannot be estimated. Polygenic covariances depend on the combinatorial matrices Cij. We have already noted that Cij = 0 whenever i is a founder or i and j are F1 mice. Straightforward calculations for F2 mice i and j with i
j yield
![]() |
is confounded with the matrix describing the environmental covariances.
Finally, there are identifiability problems with the QTL allelic effects. At the covariance level, the conditional coefficient matrix
is identically 0 when typing is full and different alleles are present in each strain. At the mean level, imposition of the constraint
4 = –
1 –
2 –
3 shows that the genotype-specific means in a purely allelic model can be expressed as the vector
![]() |
We analyze these data in the same manner as the simulated AIL except for graphing the –log10(P-value) instead of LRTs and analyzing multiple map points in the intervals between marker loci. We enjoy two advantages over ANOVA or MANOVA; namely, we can use phenotyped individuals with wholly or partially missing genotypes, and we can estimate both QTL location and effect size.
To carry out a multivariate analysis, one must decide which univariate traits to analyze together. This is not a trivial matter because combining traits exacerbates the multiple testing problem and may add noise and degrade power (AMOS et al. 2001; BAUMAN et al. 2005). With outbred populations it is intertwined with the issue of ascertainment (DAWSON and ELSTON 1984); it may also be a problem with inbred populations since strains are often chosen for a particular experiment on the basis of their average phenotype. We present here the results of two multivariate analyses making these points. The most interesting results from this example data set are on chromosome 3, and we focus on three traits, leptin measurements at both 4 and 15 months and insulin-like growth factor I at 15 months in this region. In univariate analysis, both IGF-15 and Lep-4 show significant linkage to markers on chromosome 3, while Lep-15 shows suggestive linkage. Multivariate analyses are indicated biologically, spatially, and temporally.
We carried out a number of multivariate analyses; some of the results are summarized in Figures 2 and 3 and Table 2. The graphs of –log10(P-value) along chromosome 3 in Figure 2 correspond to the univariate analyses of IGF-15, Lep-4, and Lep-15 and the bivariate analysis of Lep-4 and Lep-15. The univariate graph of IGF-15 peaks over marker D3Mit5. Subjecting the P-values for IGF-15 to the nonlinear FDR correction (BENJAMINI et al. 2001) suggests a single location for IGF-15. Both of the univariate leptin graphs as well as the bivariate graph peak over D3Mit127. After FDR correction, at least two significant map points are suggested over D3Mit127 for the bivariate leptin analysis. Table 2 reports estimates and standard errors for the bivariate leptin mean parameters at marker D3Mit127. These estimates are very similar at the two time points. Although likelihood ratios improve over univariate analysis, P-values do not because the degrees of freedom of the
2 test double. The estimated environmental covariance matrix
![]() | (17) |
|
|
A trivariate analysis of IGF-15, Lep-4, and Lep-15 clearly illustrates that in the case of multivariate traits, more is not always better. Comparing Figure 2 to Figure 3 shows two large peaks: one at marker D3Mit25 and one over marker D3Mit127. After FDR adjustment only the first peak survives, and the evidence for it is compromised. Thus, the trivariate analysis provides no additional linkage information and actually degrades the power to detect linkage. While leptin and IGF share numerous biological interactions, there is no evidence in these data for a common genetic determinant on chromosome 3.
An eight-strain simulated cross:
Our first two examples demonstrated the equivalence of the random effects model to the fixed effects model for standard cross designs and hint at the flexibility of our approach. To demonstrate this flexibility, we now present an eight-strain simulated example that (a) documents how correctly accounting for polygenic background can be beneficial and (b) demonstrates how it is possible to test hypotheses with the kind of unbalanced pedigree data encountered in human studies. As with the simulated AIL example, this exercise is not meant to be a substitute for an exhaustive study of power and experimental design.
Simulation specifics:
Our simulated cross involves a univariate trait, eight inbred strains, and seven pedigrees of nine generations each. We are motivated in part by the heterogeneous stock (MOTT et al. 2000) and the collaborative-cross designs (WILLIAMS et al. 2002). Starting with strain-pure founders, we constructed each pedigree by random mating with a decreasing number of progeny per animal per generation. The average number of animals per pedigree is 366. Random mating ensures substantial diversity in theoretical and conditional strain fractions and coefficients. On the basis of the marker map for chromosome 2 in the UM-HET example of the previous section, we simulated genotypes at six loci using the gene-dropping option of Mendel. Locus 3 serves as the QTL and the remaining loci as markers. Genotypes at the QTL are omitted during linkage analysis.
We generated univariate trait values independently for each pedigree by sampling from a multivariate normal distribution with prescribed means and covariances. If animal i has QTL genotype a/b and trait value Xi, then
![]() |
is the grand mean, µ is the vector of polygenic deviations from the mean, and
is the vector of QTL deviations from the mean. For animals i and j, the polygenic and random environment contributions entail the covariance
![]() |
.
|
Our simulation choices present both opportunities and challenges. For example, the fact that each strain is assigned a unique QTL allele suggests that even a simple F2 cross between two strains would be adequate to map the QTL. This advantage is tempered by the long genetic distances separating the QTL from the flanking markers, by the smallness of the QTL effects, by the similarity of these effects in some strains, and by the discordance of the QTL effects and the polygenic means effects.
In using random effect models for QTL mapping, inclusion of polygenic background is usually a good idea. If polygenic background is present but ignored, then the only way of accounting for relative correlations is through the QTL component. When we analyze the current data omitting polygenic background, every single chromosome location in the linkage analysis achieves a P-value <0.00001. Adding polygenic background causes P-values to reach more reasonable levels, ranging from 0.0019 to 0.3835. Subjecting the P-values to the (FDR) procedure highlights the QTL and one neighboring point as significant (BENJAMINI et al. 2001). Figure 4 plots the function –log10(P-value) along the chromosome; as earlier, the P-values reflect the likelihood ratio tests of the QTL component. The QTL is located at 30 cM from the origin between marker D2Mit323 at 23 cM and marker D2Mit37 at 42 cM.
|
We also used these data to illustrate the application of the QTL association model. As in our linkage analysis, omitting polygenic background leads to unrealistically small P-values. Figure 4 plots the –log10(P-value) for the association analysis with the polygenic background. The association results are similar to the linkage results. The marker with the most significant result is D2Mit323, which is the marker nearest to the QTL. The FDR procedure singles out D2Mit323 as the only significant association.
Comparison of computation times between the two models illustrates the speed of the association analysis. The linkage model requires
4 hr for calculation of the coefficient matrices for each pedigree and
20 hr to estimate the parameters for each of the 17 points. The association model requires
1.5 hr for all calculations at each of the five markers.
ABSTRACT
METHODS
APPLICATIONS
>DISCUSSION
APPENDIX A: REPRESENTATION OF...
APPENDIX B: PROPERTIES OF...
APPENDIX C: COMPUTATION OF...
APPENDIX D: CONSTRUCTION OF...
APPENDIX E: DIFFERENTIATION OF...
APPENDIX F: A COUNTEREXAMPLE...
ACKNOWLEDGEMENTS
LITERATURE CITED
Balanced against these strengths is the need for better-conceived study designs. Unless crosses are carefully structured, some parameters will be unidentifiable. One antidote is to scale back the complexity of a model and reparameterize. Our first two examples illustrate this tactic. Another antidote is to avoid monolithic designs and opt for a mixture of designs that individually reveal different features of a model. Our third example does this.
In random effects models, trait values for most animals are correlated. Logically, one should treat all animals as members of a single large pedigree. At some point this requirement becomes unwieldy. The computational demands of the random effects models are fairly high, so tactics such as pedigree splitting, marker thinning, and marker amalgamation should not be dismissed. It will probably take a combination of these tactics to cope with the large-scale mapping projects now under way (PLETCHER et al. 2004). Fortunately, our experiences with simulated data suggest that a moderate amount of pedigree splitting sacrifices little information.
We have omitted a detailed discussion of how the program SimWalk delivers conditional strain fractions and coefficients. In our experience, SimWalk's MCMC algorithm adequately samples descent graph space. In association analysis, this lengthy process can be dispensed with if information at neighboring markers is ignored. Deterministic algorithms that produce approximate kinship and strain coefficients may ultimately be a better choice than stochastic sampling (GAO et al. 2004; GAO and HOESCHELE 2005). In maximizing loglikelihoods, it is also worth mentioning that Mendel allows the user to set initial parameter values and bounds. This flexibility is valuable in exploring multimodal likelihood surfaces.
Our QTL parameters enter the model at both the mean and the variance level and are not subject to nonnegativity constraints. Thus, the asymptotic distribution of a likelihood ratio test follows a chi-square distribution with degrees of freedom equal to the difference in the number of independent parameters between the underlying nested models. Model selection can be accomplished by likelihood ratio tests or modified criteria such as the Akaike information criteria (AIC) or the Bayesian information criteria (BIC). Multiple testing is certainly an issue. The FDR correction of Benjamini and Hochberg (BENJAMINI et al. 2001) for dependent tests is often a useful cure and provided us with correct inferences in our simulated examples. Extensions such as Storey's optimal discovery procedure (STOREY 2007; STOREY et al. 2007) can lead to more accurate P-values and should be kept in mind.
The assumption of multivariate normality is helpful in maximum likelihood estimation. For univariate traits with excess kurtosis, the multivariate t distribution is a workable substitute for the multivariate normal distribution and is an implemented option in Mendel. It is reasonable to conjecture that some version of the central limit theorem should hold for a polygenic trait over a pedigree (LANGE 1978; LANGE and BOEHNKE 1983). For simple pedigrees generated en masse in a cross, one can check the normality assumption empirically. The impact of departures from normality has been considered by several researchers (BEATY et al. 1985; ALLISON et al. 1999; PRATT et al. 2000). BLANGERO et al. (2000) and SHAM et al. (2000) suggest solutions to gross violations. One can object that QTL effects by their discrete nature cannot be normal. Three responses are possible. First, this objection has never stopped ordinary QTL mapping with outbred populations. Second, under the null hypothesis, the discrete effects disappear. Third, in all but the simplest crosses, application of a rigorous model incorporating both polygenes and major genes is very computationally demanding.
The web site (http://www.genetics.ucla.edu/software) offers the current versions of Mendel and SimWalk for several computing platforms. Ample documentation and sample problems are provided. The experimental versions of Mendel and SimWalk featured in this article will be released publicly as soon as it is practical.
ABSTRACT
METHODS
APPLICATIONS
DISCUSSION
>APPENDIX A: REPRESENTATION OF...
APPENDIX B: PROPERTIES OF...
APPENDIX C: COMPUTATION OF...
APPENDIX D: CONSTRUCTION OF...
APPENDIX E: DIFFERENTIATION OF...
APPENDIX F: A COUNTEREXAMPLE...
ACKNOWLEDGEMENTS
LITERATURE CITED
and a k x 1 vector µ, we now prove that vectors µ1, ... , µn exist such that
and
. To simplify the proof, we pass to the spectral decomposition
= O*DO of
. Here O is an orthogonal matrix, and D is a diagonal matrix whose jth diagonal entry dj is an eigenvalue of
. If n vectors
1, ... ,
n exist such that
and
, then taking µi = O*
i for each i completes the proof.
With the transformed problem, we can work on each dimension j separately. Suppose we can find scalars a1, ... , am such that a1 + ... + am =
j and
+ ... +
= dj. Then we construct m vectors w1, ... , wm whose entries are 0 except for their jth entries wij = ai. These m vectors compose part of the solution set
1, ... ,
n and do not impinge on the parts contributed by other dimensions. To show that appropriate scalars a1, ... , am exist, we consider optimizing the function f(a) =
+ ... +
subject to the affine constraint a1 + ... + am =
j. By introducing a Lagrange multiplier, we can prove that f(a) attains its minimum
when all ai =
j/m. The maximum of f(a) is infinite in all but the trivial case m = 1. For instance, we can take
,
, and all other ai = 0 and send p to
. Since dj must be positive, some positive integer m exists with
< dj. This choice of m puts us in a position to invoke the intermediate value theorem. The set of vectors a = (a1, ... , am) satisfying the constraint is convex and therefore connected. A continuous function on a connected set attains every value between its minimum and maximum values. Hence, there is some a with f(a) = dj.
ABSTRACT
METHODS
APPLICATIONS
DISCUSSION
APPENDIX A: REPRESENTATION OF...
>APPENDIX B: PROPERTIES OF...
APPENDIX C: COMPUTATION OF...
APPENDIX D: CONSTRUCTION OF...
APPENDIX E: DIFFERENTIATION OF...
APPENDIX F: A COUNTEREXAMPLE...
ACKNOWLEDGEMENTS
LITERATURE CITED
ij. Recall that the founder initial conditions and the recurrences (4)–(7) completely determine the strain fraction vectors
i and the strain coefficient matrices
ij. If we retain the conventions that i has parents k and l and j is an animal previously considered, then the last three recurrences translate into the similar recurrences
![]() | (B1) |
![]() | (B2) |
![]() | (B3) |
PROPOSITION 1. In addition to satisfying the recurrences (B1), (B2), and (B3), the matrix Cij
- has all entries 0 when either i or j is a founder,
- is symmetric,
- is positive semidefinite,
- has the vector 1 in its null space,
- has entries Cij(m, n) confined to the interval
for n
m and to the interval
for n = m.
Proof.
- If i is a founder belonging to strain q and j is a founder belonging to strain r, then by definition
i(m) = 1{m=q},
j(n) = 1{n=r}, and
ij(m, n) = 1{m=q}1{n=r}. Thus, all entries of Cij vanish. If i or j is a founder but the other is not, then induction and the recurrences (B1) and (B2) show that all entries of Cij vanish.
- Formula (B3) forces Cii to be symmetric, and the recurrences (B1) and (B2) preserve symmetry.
- Because the recurrences (B1) and (B2) preserve positive semidefiniteness, it suffices to prove that Cii is positive semidefinite. Inspection of formula (B3) further demonstrates that it suffices to prove that diag(
k) –
k
k* is positive semidefinite for all k. Accordingly, let v be an arbitrary vector. The quadratic form
is nonnegative owing to Cauchy's inequality
and the fact that
.
- Again this is a consequence of the recurrences (B1) and (B2) and the validity of the assertion for Cii. In the latter case, the equality
is obvious.
- Because the stated bounds are preserved by recurrences (B1) and (B2), it suffices to consider Cii. The contribution
to a diagonal term in Equation B3 is bounded below by 0 and above by
. The contribution
to an off-diagonal term is bounded below by
and above by 0.
The collection
of all Cij matrices over a pedigree has considerable structure. For example, the symmetry of Cij entails Cij = Cji. With just s = 2 strains, parts b, d, and e of Proposition 1 imply that every Cij is representable as
![]() |
. Furthermore, since aij = 0 whenever i or j is a founder or an F1 individual, straightforward recursive arguments show that within any strictly linear mating designs like Fn, aij = 0 for all i
j. Two-strain systems also produce uninteresting conditional coefficients; straightforward calculations show that
for all i and j at markers that differentiate between the strains with complete genotyping.
To generalize this representation to more than two strains, it is helpful to introduce the s x s matrix Emn where all entries of Emn are 0 except for emm = enn = 1 and emn = enm = –1. There are
such matrices.
PROPOSITION 2. Every matrix Cij from the collection
can be represented as a linear combination
![]() | (B4) |
![]() |
Proof. Each matrix Cii = 0 corresponding to a founder i clearly qualifies. The representation (B4) is preserved by the averaging process of the recurrences (B1) and (B2), so it suffices to prove the representation for a matrix Cii generated by a nonfounder. Again the averaging nature of recurrence (B3) allows us to verify the representation (B4) for a matrix of the form
. Because the set of dyadic rationals constitutes an algebraic field, it is clear by induction that all entries of
k are dyadic rationals. We now claim that
![]() | (B5) |
![]() |
are dyadic rationals, all that remains is to check that the sum of the coefficients is properly bounded. This follows from
![]() | (B6) |
. Equality is achieved only when all
.
ABSTRACT
METHODS
APPLICATIONS
DISCUSSION
APPENDIX A: REPRESENTATION OF...
APPENDIX B: PROPERTIES OF...
>APPENDIX C: COMPUTATION OF...
APPENDIX D: CONSTRUCTION OF...
APPENDIX E: DIFFERENTIATION OF...
APPENDIX F: A COUNTEREXAMPLE...
ACKNOWLEDGEMENTS
LITERATURE CITED
![]() |
![]() |
![]() | (C1) |
![]() |
![]() | (C2) |
n. Rearrangement of Equation (C1) gives
![]() | (C3) |
![]() |
![]() |
![]() | (C4) |
It is easy to check that the constraint (C2) is implicit in this solution. Furthermore, the solution entails the residual
![]() |
, our objective function can now be expressed as
![]() |
. This allows us to reparameterize f(x) as
![]() |
Minimizing the objective function in this form coincides with a classical problem in population genetics. If we assume that m and n represent two possible alleles from s equally frequent alleles and bmn represents a trait value determined by the genotype m/n, then minimizing f(x) corresponds to the problem of determining the additive genetic variance of a centered trait. The solution to this problem is known to be
![]() |
![]() |
![]() |
![]() | (C5) |
n and m = n.
The projection solution (C5) reduces the residual rmn = ymn – xmn to
![]() |
![]() |
![]() |
is a symmetrized version of U. Fortunately, we can represent U as the matrix product
![]() |
. ABSTRACT
METHODS
APPLICATIONS
DISCUSSION
APPENDIX A: REPRESENTATION OF...
APPENDIX B: PROPERTIES OF...
APPENDIX C: COMPUTATION OF...
>APPENDIX D: CONSTRUCTION OF...
APPENDIX E: DIFFERENTIATION OF...
APPENDIX F: A COUNTEREXAMPLE...
ACKNOWLEDGEMENTS
LITERATURE CITED
to the standard basis vector e1 can be explicitly constructed by the Gramm–Schmidt process applied to the basis
, where ek is the standard basis vector with 1 in position k and zeros elsewhere. The first row of O is just
; the subsequent rows take the form
![]() |
provide an orthonormal basis. ABSTRACT
METHODS
APPLICATIONS
DISCUSSION
APPENDIX A: REPRESENTATION OF...
APPENDIX B: PROPERTIES OF...
APPENDIX C: COMPUTATION OF...
APPENDIX D: CONSTRUCTION OF...
>APPENDIX E: DIFFERENTIATION OF...
APPENDIX F: A COUNTEREXAMPLE...
ACKNOWLEDGEMENTS
LITERATURE CITED
= (
mn). If we let
mn denote partial differentiation with respect to
mn, then formula (16) immediately leads to
![]() |

* = (duv). Since
![]() |
![]() |
mn(
*) consists entirely of zeros except for row m and column n. This fact considerably simplifies computation of derivatives. ABSTRACT
METHODS
APPLICATIONS
DISCUSSION
APPENDIX A: REPRESENTATION OF...
APPENDIX B: PROPERTIES OF...
APPENDIX C: COMPUTATION OF...
APPENDIX D: CONSTRUCTION OF...
APPENDIX E: DIFFERENTIATION OF...
>APPENDIX F: A COUNTEREXAMPLE...
ACKNOWLEDGEMENTS
LITERATURE CITED
kl with a symmetrized block residual
![]() |
– P(
) that fails to be positive semidefinite. Consider the matrix
![]() |
![]() |
![]() |
![]() |
ABSTRACT
METHODS
APPLICATIONS
DISCUSSION
APPENDIX A: REPRESENTATION OF...
APPENDIX B: PROPERTIES OF...
APPENDIX C: COMPUTATION OF...
APPENDIX D: CONSTRUCTION OF...
APPENDIX E: DIFFERENTIATION OF...
APPENDIX F: A COUNTEREXAMPLE...
>ACKNOWLEDGEMENTS
LITERATURE CITED
ABSTRACT
METHODS
APPLICATIONS
DISCUSSION
APPENDIX A: REPRESENTATION OF...
APPENDIX B: PROPERTIES OF...
APPENDIX C: COMPUTATION OF...
APPENDIX D: CONSTRUCTION OF...
APPENDIX E: DIFFERENTIATION OF...
APPENDIX F: A COUNTEREXAMPLE...
ACKNOWLEDGEMENTS
>LITERATURE CITED
ALLISON, D. B., M. C. NEALE, R. ZANNOLLI, N. J. SCHORK, C. I. AMOS et al., 1999 Testing the robustness of the likelihood-ratio test in a variance-component quantitative-trait loci-mapping procedure. Am. J. Hum. Genet. 65: 531–544.[CrossRef][Medline]
AMOS, C. I., 1994 Robust variance-components approach for assessing genetic linkage in pedigrees. Am. J. Hum. Genet. 54: 535–543.[Medline]
AMOS, C. I., M. DE ANDRADE and D. K. ZHU, 2001 Comparison of multivariate tests for genetic linkage. Hum. Hered. 51: 133–144.[CrossRef][Medline]
BAUMAN, L. E., L. ALMASY, J. BLANGERO, R. DUGGIRALA, J. SINSHEIMER et al., 2005 Fishing for pleiotropic QTLs in a polygenic sea. Ann. Hum. Genet. 69: 590–611.[CrossRef][Medline]
BEATY, T. H., S. G. SELF, K.-Y. LIANG, M. A. CONNOLLY, G. A. CHASE et al., 1985 Use of robust covariance components models to analyse triglyceride data in families. Ann. Hum. Genet. 49: 315–328.[Medline]
BENJAMINI, Y., D. DRAI, G. ELMER, N. KAFKAFI and I. GOLANI, 2001 Controlling the false discovery rate in behavior genetics research. Behav. Brain Res. 125: 279–284.[CrossRef][Medline]
BLANGERO, J., 1993 Statistical approaches to human adaptability. Hum. Biol. 65: 941–966.[Medline]
BLANGERO, J., and L. ALMASY, 1997 Multipoint oligogenic linkage analysis of quantitative traits. Genet. Epidemiol. 14: 959–964.[CrossRef][Medline]
BLANGERO, J., J. T. WILLIAMS and L. ALMASY, 2000 Robust lod scores for variance component-based linkage analysis. Genet. Epidemiol. 19: S8–S14.[CrossRef][Medline]
BROMAN, K. W., H. WU,
. SEN and G. A. CHURCHILL, 2003 R/qtl: QTL mapping in experimental crosses. Bioinformatics 19: 889–890.
CERVINO, A. C. L., A. DARVASI, M. FALLAHI, C. C. MADER and N. F. TSINOREMAS, 2007 An integrated in silico gene mapping strategy in inbred mice. Genetics 175: 321–333.
CHURCHILL, G. A., and R. W. DOERGE, 1994 Empirical threshold values for quantitative trait mapping. Genetics 138: 963–971.[Abstract]
DARVASI, A., and M. SOLLER, 1995 Advanced intercross lines, an experimental population for fine genetic mapping. Genetics 141: 1199–1207.[Abstract]
DAWSON, D. V., and R. C. ELSTON, 1984 A bivariate problem in human-genetics—ascertainment of families through a correlated trait. Am. J. Med. Genet. 18: 435–448.[CrossRef][Medline]
FLINT, J., W. VALDAR, S. SHIFMAN and R. MOTT, 2005 Strategies for mapping and cloning quantitative trait genes in rodents. Nat. Rev. Genet. 6: 271–286.[CrossRef][Medline]
GAO, G., and I. HOESCHELE, 2005 Approximating identity-by-descent matrices using multiple haplotype configurations on pedigrees. Genetics 171: 365–376.
GAO, G., I. HOESCHELE, P. SORENSEN and F. DU, 2004 Conditional probability methods for haplotyping in pedigrees. Genetics 167: 2055–2065.
GOLDGAR, D. E., 1990 Multipoint analysis of human quantitative genetic-variation. Am. J. Hum. Genet. 47: 957–967.[Medline]
HALEY, C. S., and S. A. KNOTT, 1992 A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69: 315–324.[Medline]
HARPER, J. M., A. T. GALECKI, D. T. BURKE, S. L. PINKOSKY and R. A. MILLER, 2003 Quantitative trait loci for insulin-like growth factor I, leptin, thyroxine, and corticosterone in genetically heterogeneous mice. Physiol. Genomics 15: 44–51.
HITZEMANN, R. W., B. MALMANGER, S. COOPER, S. COULOMBE, C. REED et al., 2002 Multiple cross mapping (MCM) markedly improves the localization of a QTL for ethanol-induced activation. Genes Brain Behav. 1: 214–222.[CrossRef][Medline]
HOPPER, J. L., and J. D. MATHEWS, 1982 Extensions to multivariate normal models for pedigree analysis. Ann. Hum. Genet. 46: 373–383.[Medline]
ITOH, Y., and Y. YAMADA, 1990 Relationships between genotype x environment interaction and genetic correlation of the same trait measured in different environments. Theor. Appl. Genet. 80: 11–16.
JAFFRÉIC, F., and S. D. PLETCHER, 2000 Statistical models for estimating the genetic basis of repeated measures and other function-valued traits. Genetics 156: 913–922.
JANNINKA, J.-L., and R. JANSENA, 2001 Mapping epistatic quantitative trait loci with one-dimensional genome searches. Genetics 157: 445–454.
KAO, C.-H., Z.-B. ZENG and R. D. TEASDALE, 1999 Multiple interval mapping for quantitative trait loci. Genetics 152: 1203–1216.
KRUGLYAK, L., M. J. DALY, M. P. REEVE-DALY and E. S. LANDER, 1996 Parametric and nonparametric linkage analysis: a unified multipoint approach. Am. J. Hum. Genet. 58: 1347–1363.[Medline]
LANDER, E. S., and D. BOTSTEIN, 1989 Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121: 185–199.
LANGE, K., 1978 Central limit theorems for pedigrees. J. Math. Biol. 6: 59–66.[CrossRef]
LANGE, K., 1986 Cohabitation, convergence and environmental covariances. Am. J. Med. Genet. 24: 483–491.[CrossRef][Medline]
LANGE, K., 2002 Mathematical and Statistical Methods for Genetic Analysis, Ed. 2. Springer-Verlag, New York.
LANGE, K., and M. BOEHNKE, 1983 Extensions to pedigree analysis. IV. Covariance components models for multivariate traits. Am. J. Med. Genet. 14: 513–524.[CrossRef][Medline]
LANGE, K., J. S. SINSHEIMER and E. M. SOBEL, 2005 Association testing with Mendel. Genet. Epidemiol. 29: 36–50.[CrossRef][Medline]
LI, R., M. A. LYONS, H. WITTENBURG, B. PIAGEN and G. A. CHURCHILL, 2005 Combining data from multiple inbred line crosses improves the power and resolution of quantitative trait loci mapping. Genetics 169: 1699–1709.
LIU, Y., and Z.-B. ZENG, 2000 A general mixture model approach for mapping quantitative trait loci from diverse cross designs involving multiple inbred lines. Genet. Res. 75: 345–355.[CrossRef][Medline]
MANLY, K. F., and J. M. OLSON, 1999 Overview of QTL mapping software and introduction to Map Manager QT. Mamm. Genome 10: 327–334.[CrossRef][Medline]
MEYER, K., and M. KIRKPATRICK, 2005 Up hill, down dale: quantitative genetics of curvaceous traits. Philos. Trans. R. Soc. 360: 1443–1455.[CrossRef]
MOTT, R., C. J. TALBOT, M. G. TURRI, A. C. COLLINS and J. FLINT, 2000 A method for fine mapping quantitative trait loci in outbred animal stocks. Proc. Natl. Acad. Sci. USA 97: 12649–12654.
PLETCHER, M. T., P. MCCLURG, S. BATALOV, A. I. SU, S. W. BARNES et al., 2004 Use of a dense single nucleotide polymorphism map for in silico mapping in the mouse. PLoS Biol. 2: 2159–2169.
PLETCHER, S. D., 1999 Model fitting and hypothesis testing for age-specific mortality data. J. Evol. Biol. 12: 430–439.[CrossRef]
PLETCHER, S. D., and C. J. GEYER, 1999 The genetic analysis of age-dependent traits: modeling the character process. Genetics 153: 825–835.
PLETCHER, S. D., and F. JAFFRÉIC, 2002 Generalized character process models: estimating the genetic basis of traits that cannot be observed and that change with age or environmental conditions. Biometrics 58: 157–162.[CrossRef][Medline]
PRATT, S. C., M. J. DALY and L. KRUGLYAK, 2000 Exact multipoint quantitative-trait linkage analysis in pedigrees by variance components. Am. J. Hum. Genet. 6: 1153–1157.
PURCELL, S., 2002 Variance components models for gene-environment interaction in twin analysis. Twin Res. 5: 554–571.[CrossRef][Medline]
PURCELL, S., and P. SHAM, 2002 Variance components models for gene-environment interaction in quantitative trait locus linkage analysis. Twin Res. 5: 572–576.[CrossRef][Medline]
REBAI, A., and B. GOFFINET, 1993 Power of tests for QTL detection using replicated progenies derived from a diallel cross. Theor. Appl. Genet. 86: 1014–1022.
SCHORK, N. J., 1993 Extended multipoint identity-by-descent analysis of human quantitative traits: efficiency, power, and modeling considerations. Am. J. Hum. Genet. 53: 1306–1319.[Medline]
SEATON, G., C. S. HALEY, S. A. KNOTT, M. KEARSEY and P. M. VISSCHER, 2002 QTL Express: mapping quantitative trait loci in simple and complex pedigrees. Bioinformatics 18: 339–340.
SEN,
., and G. A. CHURCHILL, 2001 A statistical framework for quantitative trait mapping. Genetics 159: 371–387.
SHAM, P. C., J. H. ZHAO, S. S. CHERNY and J. K. HEWITT, 2000 Variance-components QTL linkage analysis of selected and non-normal samples: conditioning on trait values. Genet. Epidemiol. 19: S22–S28.[CrossRef][Medline]
SILLANPÄÄ, M. J., and E. ARJAS, 1998 Bayesian mapping of multiple quantitative trait loci from incomplete inbred line cross data. Genetics 148: 1373–1388.
SOBEL, E. M., and K. LANGE, 1996 Descent graphs in pedigree analysis: applications to haplotyping, location scores, and marker-sharing statistics. Am. J. Hum. Genet. 58: 1323–1337.[Medline]
STOREY, J. D., 2007 The optimal discovery procedure: a new approach to simultaneous significance testing. J. R. Stat. Soc. B 69: 347–368.[CrossRef]
STOREY, J. D., J. Y. DAI and J. T. LEEK, 2007 The optimal discovery procedure for large-scale significance testing, with applications to comparative microarray experiments. Biostatistics 8: 414–432.
TOWNE, B., R. M. SIERVOGEL and J. BLANGERO, 1997 Effects of genotype-by-sex interaction on quantitative trait linkage analysis. Genet. Epidemiol. 14: 1053–1058.[CrossRef][Medline]
VALDAR, W., J. FLINT and R. MOTT, 2006 Simulating the collaborative cross: power of quantitative trait loci detection and mapping resolution in large sets of recombinant inbred strains of mice. Genetics 172: 1783–1797.
VIEL, K. R., D. M. WARREN, A. BUIL, T. D. DYER, T. E. HOWARD et al., 2005 A comparison of discrete versus continuous environment in a variance components-based linkage analysis of the COGA data. BMC Genet. 6(Suppl. 1): S57.[CrossRef][Medline]
WILLIAMS, R. W., K. W. BROMAN, J. M. CHEVERUD, G. A. CHURCHILL, R. W. HITZEMANN et al., 2002 A collaborative cross for high-precision complex trait analysis. First workshop report of the complex trait consortium. Technical report.
XIE, C., D. D. G. GESSLER and S. XU, 1989 Combining different line crosses for mapping quantitative trait loci using the identical by descent-based variance component method. Genetics 149: 1139–1146.
ZENG, Z.-B., 1993 Theoretical basis for separation of multiple linked gene effects in mapping quantitative trait loci. Proc. Natl. Acad. Sci. USA 90: 10972–10976.
ZENG, Z.-B., 1994 Precision mapping of quantitative trait loci. Genetics 136: 1457–1468.[Abstract]
Communicating editor: K. W. BROMAN
- THIS ARTICLE
-
Abstract
- Full Text (PDF)
- A corrigendum has been published
-
All Versions of this Article:
genetics.108.091058v1
180/3/1743 most recent - Alert me when this article is cited
- Alert me if a correction is posted
- SERVICES
- Email this article to a friend
- Similar articles in this journal
- Similar articles in PubMed
- Alert me to new issues of the journal
- Download to citation manager
- Reprints & Permissions
- CITING ARTICLES
- Citing Articles via Google Scholar
- GOOGLE SCHOLAR
- Articles by Bauman, L. E.
- Articles by Lange, K.
- Search for Related Content
- PUBMED
- PubMed Citation
- Articles by Bauman, L. E.
- Articles by Lange, K.


















































































