A General Framework for Analyzing the Genetic Architecture of Developmental Characteristics
 ^{*} Department of Statistics, University of Florida, Gainesville, Florida 32611
 ^{†} Institute of Statistical Genetics, Zhejiang Forestry University, Lin'an, Zhejiang 311300, People's Republic of China
 1 Corresponding author: Department of Statistics, 533 McCarty Hall C, University of Florida, Gainesville, FL 32611. Email: rwu{at}stat.ufl.edu
Abstract
The genetic architecture of growth traits plays a central role in shaping the growth, development, and evolution of organisms. While a limited number of models have been devised to estimate genetic effects on complex phenotypes, no model has been available to examine how gene actions and interactions alter the ontogenetic development of an organism and transform the altered ontogeny into descendants. In this article, we present a novel statistical model for mapping quantitative trait loci (QTL) determining the developmental process of complex traits. Our model is constructed within the traditional maximumlikelihood framework implemented with the EM algorithm. We employ biologically meaningful growth curve equations to model timespecific expected genetic values and the AR(1) model to structure the residual variancecovariance matrix among different time points. Because of a reduced number of parameters being estimated and the incorporation of biological principles, the new model displays increased statistical power to detect QTL exerting an effect on the shape of ontogenetic growth and development. The model allows for the tests of a number of biological hypotheses regarding the role of epistasis in determining biological growth, form, and shape and for the resolution of developmental problems at the interface with evolution. Using our newly developed model, we have successfully detected significant additive × additive epistatic effects on stem height growth trajectories in a forest tree.
THE evolution of complex organisms, such as animals and plants, does not result simply from the direct transformation of adult ancestors into adult descendants, but rather involves a cascade of developmental processes that produce the new features of each generation. An increasing number of evolutionary studies have been launched to determine the genetic or developmental changes in the rate or timing of developmental processes that must take place to derive a particular phenotype from its ancestor (Rice 1997; Raff 2000; Rougvie 2001). A general view is that the evolution of developmental processes is affected by both the environment and many genes that act singly and in interaction with each other (Lynch and Walsh 1998). However, to accurately predict the direction and rate of trait evolution, a detailed genetic architecture of how genes act and interact to control various stages of development must be quantified.
The genes predisposing for a phenotypic character that displays continuous variation among individuals are referred to as quantitative trait loci (QTL). The genetic effect or variance of QTL includes two components, additive, due to the cumulation of breeding values, and nonadditive, due to allelic (dominant) or nonallelic (epistatic) interactions. Epistatic interactions between different loci can be further partitioned into different types: additive × additive, additive × dominant (or dominant × additive), and dominant × dominant. The presence of epistasis implies that the influence of a gene on the phenotype depends critically upon the context provided by other genes. In the past, the estimation of the additive and nonadditive genetic architecture of a quantitative trait was based on the phenotypes of related individuals (Lynch and Walsh 1998), although this has minimal power to detect the nonadditive genetic variances, especially epistatic variance because epistasis contributes little to the resemblance among relatives (Cheverud and Routman 1995).
The advent of DNAbased linkage maps opens a novel avenue for precisely estimating the genetic architecture of developmental traits (Vaughnet al. 1999). Current statistical methods proposed to detect the main and interaction effects of QTL are based on the phenotypes of a quantitative trait measured at a limited set of landmark ages. More recently, Wu et al. (2002, 2003a,b) and Ma et al. (2002) have derived a powerful functional mapping method for estimating the dynamic changes of QTL effects during a course of ontogenetic growth through the implementation of universal growth laws (Westet al. 2001) and the structured residual (co)variance matrix among different time points (see Kirkpatrick and Heckman 1989; Kirkpatrick et al. 1990, 1994; Pletcher and Geyer 1999). This method has proven to be statistically more powerful and more precise because of a reduced number of parameters being estimated and to be biologically more meaningful due to the consideration of biological principles underlying trait development (Wuet al. 2002). However, this model has not incorporated the estimation process of epistatic interactions and, thus, cannot examine the role of the entire genetic architecture in developmental trajectories.
In this article, we extend the functional mapping method to map any QTL (including additive, dominant, and epistatic) that transforms allelic and/or nonallelic effects into final phenotypes during a continuous process of development represented as ontogenetic trajectories or a path through phenotypetime space (Alberchet al. 1979; Wolfet al. 2000). We derive special procedures to estimate and test the impact of epistasis on trait growth because a growing body of evidence now shows that epistasis plays a more important role in determining developmental changes than originally thought (Rice 1997, 2000; Wolfet al. 2000). Epistasis can trigger an effect on the evolution of development across different levels of biological organization and these include the molecular mechanisms of gene expression and genetic architecture, the evolution of sex and recombination, the genetic coadaptation and its associated outbreeding depression, adaptive evolution, and the very process of speciation (Wolfet al. 2000). We use a maximumlikelihoodbased method, implemented with the expectationmaximization (EM) algorithm, to estimate QTL locations and genetic effects on growth differentiation. Compared with current mapping methods, our method of incorporating growth trajectories tends to be more powerful and more precise in QTL detection and effect estimation, as demonstrated in an example using forest tree data. In practice, our method is economically more feasible than previous methods because it needs a smaller size of genotyped samples to obtain adequate power for QTL detection through the use of repeated measurements for each individual. It can be anticipated that the method proposed in this article will have potential implications for understanding the origin and evolution of development and the contributions of epistatic effects to evolutionary changes in the process of development.
GROWTH EQUATIONS AND MIXTURE MODEL
Growth equations: When growth g is plotted against time t, different forms of growth curves will appear. Among these forms, a logistic growth curve (also referred to as the sigmoid curve of growth; Niklas 1994) is one of the most ubiquitous, having been derived from fundamental physiological and physical principles (Westet al. 2001). The logistic growth curve can be mathematically described by
In evolutionary biology, a question of how a population evolves on a logistic curve is determined by how selection acts on the growth and by the local geometry of the curve itself. Some geometric properties of the growth curve have straightforward biological interpretations. For example, the slope of the logistic curve at any given time point measures the degree to which the value of growth is sensitive to a change in age:
Such a slope represents the rate of growth at a given time. Thus, if the slope at a point is low, then that value of growth is locally buffered against age changes. The rate of growth drops off linearly as the overall size approaches some limit.
From a growth curve, we can derive the timing (t_{I}) of the inflection point, at which the exponential phase ends and the asymptotic phase begins (Niklas 1994). For the logistic curve, t_{I} is derived as
The area under the logistic curve at an interval [t_{1} t_{2}] describes the capacity of a given organism to grow over time. Such an area is the integral of the logistic curve, expressed as
Quantitative genetic model: We start with a simple F_{2} population of size n derived from two homozygous lines. Consider two segregating QTL responsible for a quantitative trait,
Logistic mixture model for mapping epistatic QTL: Unlike traditional statistical models, in which marker information is associated with phenotypic values measured at one time point, our model intends to map QTL for an infinitedimensional trait expressed as a function of time (Kirkpatrick and Heckman 1989; Kirkpatrick et al. 1990, 1994; Pletcher and Geyer 1999). Yet, the modeling of the functional relationship of a trait with time needs to be based on the measurements made at a finite number of landmark ages. It is reasonable to assume that the phenotypes of an infinitedimensional trait measured at all time points 1,..., m for each putative QTL genotype group follow a multivariate normal density,
The assumption of variance stationarity can be satisfied by transforming both sides (TBS) of the growth equation (1), as proposed by Carroll and Ruppert (1984). The transformation at the left side of Equation 1 can lead to a homogeneous variance over times, whereas the transformation at the right side of Equation 1 can preserve the biological properties of growth parameters (a, b, c). Thus, with the TBS model, the favorable advantages of structuring Σ according to (8) can be preserved.
We formulate the likelihood function of the phenotypic data with mdimensional measurements as
The EM algorithm: The maximumlikelihood estimates (MLEs) of the unknown parameters under a twoQTL model can be computed by implementing the EM algorithm (Dempsteret al. 1977; Lander and Botstein 1989). We have incorporated the growth law (1) into the mixturebased likelihood function (9) and derived the loglikelihood equations to estimate Ω. In the E step, calculate the expected conditional (posterior) probability of a twoQTL genotype j_{k}j_{l} given marker genotypes for progeny i,
In practical computations, the QTL position parameters can be viewed as nuisance parameters because two putative QTL can be searched at given positions throughout the entire linkage map. The amount of support for the QTL at particular map positions is often displayed graphically through the use of likelihood maps or profiles, which plot the likelihoodratio test statistics as a function of map positions of the two putative QTL.
HYPOTHESIS TESTS
Different from traditional mapping approaches, our functional mapping for epistatic QTL allows for the tests of a number of biologically meaningful hypotheses. These hypothesis tests can be a global test for the existence of significant QTL, a local test for the genetic effect on growth at a particular time point, a regional test for the overall effect of QTL on a particular period of growth process, or an interaction test for the change of QTL expression across ages.
Global test: Testing whether specific QTL exist to affect the shape of growth trajectories is a first step toward the understanding of the genetic architecture of complex phenotypes. The genetic control over entire growth trajectories can be tested by formulating the following hypotheses:
H_{0} states that no QTL affect growth trajectories (the reduced model), whereas H_{1} proposes that such QTL do exist (the full model). The test statistic for testing the hypotheses (10) is calculated as the loglikelihood ratio of the reduced to the full model,
We can also test the global effects of different genetic components, additive, dominant, and epistatic, on the shapes of entire growth curves. The hypothesis for testing the additive effect of QTL
The test for the dominant effect, β_{k}(t), of QTL
Local test: The local test can test the significance of the main (additive or dominant) effect of each QTL and the interaction (epistatic) effect between the two QTL on growth measured at a time point (t*) of interest. The tests of additive and dominant effects of individual QTL and their epistatic effects can be made on the basis of the corresponding restrictions given in Equations 13, 14, 15, 16, 17, 18, 19, 20. For example, the hypothesis for testing the additive effect of QTL
Regional test: It is likely that an important developmental event often occurs in a time interval rather than simply at a time point. The question of how QTL exert their effects on a period of growth process [t_{1}, t_{2}] can be tested using a regional test approach based on the areas (G_{jkjl} [t_{1}, t_{2}], Equation 4) covered by growth curves. The hypothesis test for the genetic effect on a period of growth process is equivalent to testing the difference between the full model with no restriction and the reduced model with a restriction. The types of restriction used are similar to Equations 13, 14, 15, 16, 17, 18, 19, 20, depending on the additive effects, dominant effects, or epistatic effects of different kinds.
Interaction test: The effects of QTL may change with age, which suggests the occurrence of QTL × age interaction effects on the growth process. The differentiation of g(t) with respect to time t represents a slope of the growth curve (growth rate, Equation 2). If the slopes at a particular time point t* are different between the curves of different QTL genotypes, this means that significant QTL × age interaction occurs between this time point and the next. The test for QTL × age interaction can be formulated with the restriction
The effect of QTL × age interaction on growth can be examined during the entire growth trajectories. The global test of QTL × age interaction due to the additive effect of QTL
Test for the timing of development: During its ontogenetic growth, an organism would experience various developmental events. The genetic determination of the timing of development sheds light on the theoretical integration of evolution and development (Raff 2000; Rougvie 2001). Using our functional mapping model, the genotypic differences in the timing (t_{I}) of the inflection point of maximum growth rate can be tested. According to Equation 2, the test for such a genotypic difference due to the additive effect of QTL
A CASE STUDY
Materials: The power of our statistical model for mapping QTL affecting growth trajectories was demonstrated by a case study in poplar trees. A Populus deltoides clone (designated I69) was used as a female parent to mate with an interspecific P. deltoides × P. nigra clone (designated I45) as a male parent (Wuet al. 1992). A total of 450 1yearold rooted hybrid seedlings from this cross were planted at a spacing of 4 × 5 m at a forest farm near Xuchou City, Jiangsu Province, China. The total stem heights and diameters were measured at the end of each of the growing seasons for each tree. Two parentspecific genetic linkage maps each composed of 19 linkage groups (roughly representing 19 haploid chromosomes in poplar) were constructed from restriction fragment length polymorphism, amplified fragment length polymorphism, and microsatellite markers for this hybrid progeny (Yinet al. 2002) and used for the genetic mapping of QTL affecting complex traits of economical importance in forest trees.
Methods: Yin et al. (2002) used Grattapaglia and Sederoff's (1994) pseudotest backcross strategy to construct two linkage maps each corresponding to a parent. Each testcross marker for these two parentspecific maps is heterozygous in one parent and null in the other. Because the two parents, I69 and I45, are heterozygous, there is no consistent linkage phase among dominant alleles of different markers on the same linkage group; some are in a coupling phase whereas others are in a repulsion linkage phase (Yinet al. 2002). Thus, unlike QTL mapping in inbredline crosses, we need to determine the correct linkage phase between the QTL and the markers flanking it for the pseudotest backcross mapping population (Linet al. 2003). Our functional mapping proposed above was modified to incorporate the uncertainty of the QTLmarker linkage phase into the likelihood function (appendix).
Detection of QTL with significant epistasis: The statistical model built upon a universal logistic growth law (Westet al. 2001) is used to map epistatic QTL responsible for growth trajectories in poplars. All of the 19 linkage groups were scanned on a 2cM scale for the existence of a pair of QTL at different genomic locations. We have successfully detected a few pairs of genomic locations at which two QTL interact to affect stem growth trajectories in poplar. Table 1 and Figure 1 illustrate an example in which a pair of QTL located at the ninth interval [TC/CTG730, TC/CTG735] and the thirteenth interval [AG/CTT570, AG/CTT595] of linkage group D16 (Yinet al. 2002) were found to show significant epistatic effects on stem height growth. The uppercase alleles of these two QTL were observed to be in a coupling phase with dominant alleles of their respective flanking coupling markers. The maximum (93.1) of the landscape of the logLR test statistics across the linkage group (Figure 1), greater than the genomewide threshold at the significance level α= 0.05 (LR_{T} = 85.6) estimated from permutation tests, justifies the adequacy of a twoQTL model incorporating growth curves.
To show the advantages of our functional mapping model, the same data set was analyzed by traditional univariate and multivariate interval mapping approaches. Univariate interval mapping applied to the most differentiated heights at the oldest age (year 11) measured detected no QTL, whereas multivariate interval mapping for three representative ages (years 1, 6, and 11) suggested a marginal QTL at the significance level α= 0.10 (results not shown). These results indicate that our functional mapping approach is statistically more powerful for detecting QTL from a given mapping population.
The pseudotest backcross used here allows for only the significance tests of the additive effects of the two QTL and their additive × additive epistasis. The thresholds at the α= 0.05 level for these tests were calculated by simulation studies with the restrictions 13 and 17, respectively. By comparing the maximum value of the LRs from the functional mapping approach with these thresholds, we found that the additive × additive epistasis has a significant impact on the overall differences of growth curve shapes in stem height growth, whereas these two QTL each display marginally significant effects.
To address a possible violation of the constant variance assumption in the matrix (8), we incorporate the TBS model (Carroll and Ruppert 1984) into our functional mapping framework. Similar results about the estimation of QTL positions and effects were obtained from the TBSbased mapping approach (data not shown). Yet, the TBSbased mapping approach provides more precise estimates of growth curve parameters, with sampling errors reduced by 20–50% compared to those from untransformed data.
The dynamic pattern of QTL effect: The growth curves of height are drawn using the estimates of logistic parameters (Table 1) for four genotypes at two interactive QTL located on linkage group D16 (Figure 2). As described in hypothesis tests, our functional mapping approach can be used to test various genetic hypotheses related to the developmental process on the basis of estimated growth parameters. The additive effect of the QTL located on the thirteenth marker interval is significant throughout the entire growth process measured, but its sign is altered when trees develop into age 7–8 years (Figure 2). The QTL located on the eighth marker interval has nonsignificant additive effect on growth, but it interacts significantly with the QTL on the thirteenth interval. It is not surprising that significant QTL × age interactions are detected on height growth given the change of the signs of the additive and additive × additive effects (Figure 2).
Genetic control over the inflection point: Equation 3 describes the coordinates of the inflection point where the exponential phase ends and the asymptotic phase begins (Niklas 1994). The difference in the coordinates between different genotypes provides important information about the genetics and evolution of growth trajectories. If different growth curves predicted by a QTL have different ages at the inflection point, this indicates that the inflection point is under genetic determination. In our example of poplars, significant additive effect due to the QTL on the thirteenth marker interval and significant additive × additive effect on the timing of the inflection point are detected. The additive effect of the QTL on the thirteenth interval delays the occurrence of the inflection point by about 0.4 year, whereas the additive × additive effect causes the inflection point to occur 0.8 year earlier (Figure 2). Because the inflection point occurs at a time of maximum growth rate, the genetic control of growth trajectory implies that it can be genetically modified to increase a tree's capacity to effectively acquire spatial resources.
DISCUSSION
Increasing evidence has emerged for the role of complex genetic architecture in regulating the ontogenetic development of embryological phenotypes (Cheverudet al. 1983; Atchley 1984; Atchley and Zhu 1997; Vaughnet al. 1999; Carlborget al. 2003) and, ultimately, shaping the evolutionary process of organismic form (Wolfet al. 2000). However, traditional genetic approaches are limited in estimating nonadditive effects. If epistasis is assumed to be absent, as in most quantitative genetic studies, Cockerham's (1963) model based on a mating design provides a nice estimate of the dominant variance. Wu (1996) extended Cockerham's quantitative genetic model to estimate the epistatic variances on the basis of clonal replicates. If both parents and offspring are cloned, Wu's model can estimate the entire additive × additive epistatic variance, a partial additive × dominant epistatic variance, and a partial dominant × dominant epistatic variance.
In this article, we present a statistical approach for mapping any QTL that exert various genetic effects on growth trajectories based on a genetic linkage map. Our approach is unique in that it detects and estimates genetic effects due to allelic/nonallelic actions and interactions of QTL from physiological and developmental principles of growth. This uniqueness makes our approach advantageous in two aspects and leads us to construct a conceptual framework of evolutionary developmental biology (Arthur 2002).
First, we integrate growth equations into a statistical mapping framework to map developmental QTL that guide the trajectories of organ growth and development. Separate QTL analyses of growth at different time points are not powerful to follow the dynamics of QTL effects since relationships of growth at different times are not considered. Multivariate QTL analysis combining all time points takes into account these agedependent relationships (Korolet al. 2001), but it quickly becomes intractable when the number of time points increases. By fitting the expected genetic values at different time points by growth curves (Westet al. 2001) and the residual (co)variance matrix by the AR(1) model (Davidian and Giltinan 1995), our approach estimates a considerably reduced number of parameters, which thus increases significantly the power to detect epistasis. Second, because biological principles are incorporated, our approach sheds better light on the integration of development and epitasis. Our approach allows for the understanding of the genetic basis for growth and development at the cutting edge of biology (Raff 2000; Arthur 2002). Growth and development are two different but related biological processes. It is likely that these two processes share some common genetic mechanisms. By mapping the timing of various developmental events, e.g., the time to first flower, on the growth curves of QTL genotypes, we can test whether the same growth genes are also involved in regulating reproductive behaviors.
The estimation precision of additive, dominant, and epistatic effects on growth at particular time points, at time intervals, or during the entire growth process depends on the estimation precision of the parameters describing growth curves. Our earlier studies using a real example have demonstrated that growth parameters of different QTL genotypes can be precisely estimated (Maet al. 2002; Wu et al. 2002, 2004). The sampling errors of the growth parameters estimated from Fisher's information index are less than onetenth of the parameter values when a modest sample size (90) was used. Moreover, these sampling errors can be reduced if functional mapping is based on the transformbothsides model, as advocated by Carroll and Ruppert (1984), for relaxing the variance stationarity assumption in the AR(1) model. The estimation precision of parameters in the real example is consistent with that from our simulation studies. It is inferred from these results that the estimation of the genetic architecture of a quantitative trait from our functional mapping approach should be adequately precise, although direct assessments of the sampling errors for the gene effect estimation from our approach are needed. In addition, it is possible to modify our analysis model to consider nonstationary variancecovariance structures using structured antedependent models, as proposed by NunezAnton (1997) and NunezAnton and Zimmerman (2000). Such a modification can be viewed as an alternative to the current method based on the variancestationary assumption.
Genetic effects are expressed differentially during ontogeny (Rice 2000; Wolfet al. 2000), as also documented in our case study of poplar trees. Epistasis is regarded as a force to determine the direction of organismic evolution. This controversial view can now be tested and assessed, using this model and implementing the estimation process of epistasis. From an evolutionary perspective, however, our approach should be modified to consider population genetic properties of QTL alleles, QTL genotypes, and the relationship between the QTL and markers. If a mapping population is constructed using a natural population, the theory of linkage disequilibrium mapping should be integrated within the functional mapping framework. Linkage disequilibriumbased mapping provides a powerful tool for finescale mapping of complex traits (Louet al. 2003) and, thus, the combination of this mapping strategy with our functional mapping can gain better insights into the genetic basis of development and evolution.
Acknowledgments
We thank two anonymous referees for their constructive comments on this manuscript. This work is partially supported by an Outstanding Young Investigators Award of the National Science Foundation of China (30128017), a University of Florida Research Opportunity Fund (02050259), and a University of South Florida Biodefense grant (722206112) to R.W. The publication of this manuscript is approved as journal series no. R09586 by the Florida Agricultural Experiment Station.
APPENDIX: ESTIMATION OF QTL PARAMETERS IN A PHASEUNKNOWN PSEUDOTEST BACKCROSS
Suppose an interval is flanked by two markers,
As shown in Lin et al. (2003), the mixturelikelihood function of these four phase combinations can be written as
Footnotes

Communicating editor: M. W. Feldman
 Received September 24, 2003.
 Accepted November 24, 2003.
 Copyright © 2004 by the Genetics Society of America