- THIS ARTICLE
-
Abstract
- Full Text (PDF)
-
All Versions of this Article:
genetics.107.070920v1
176/3/1879 most recent - Alert me when this article is cited
- Alert me if a correction is posted
- SERVICES
- Email this article to a friend
- Similar articles in this journal
- Similar articles in PubMed
- Alert me to new issues of the journal
- Download to citation manager
- Reprints & Permissions
- CITING ARTICLES
- Citing Articles via Google Scholar
- GOOGLE SCHOLAR
- Articles by Zhao, W.
- Articles by Wu, R.
- Search for Related Content
- PUBMED
- PubMed Citation
- Articles by Zhao, W.
- Articles by Wu, R.
Originally published as Genetics Published Articles Ahead of Print on April 15, 2007.
Genetics, Vol. 176, 1879-1892, July 2007, Copyright © 2007
doi:10.1534/genetics.107.070920
Wavelet-Based Parametric Functional Mapping of Developmental Trajectories With High-Dimensional Data
Wei Zhao, Hongying Li, Wei Hou and Rongling Wu1
Department of Statistics, University of Florida, Gainesville, Florida 32611
1 Corresponding author: Department of Statistics, University of Florida, Gainesville, FL 32611.
E-mail: rwu{at}stat.ufl.edu
>ABSTRACT
WAVELET-BASED FUNCTIONAL MAPPING
HYPOTHESIS TESTING
MONTE CARLO SIMULATION
DISCUSSION
APPENDIX
ACKNOWLEDGEMENTS
LITERATURE CITED
The biological and statistical advantages of functional mapping result from joint modeling of the mean-covariance structures for developmental trajectories of a complex trait measured at a series of time points. While an increased number of time points can better describe the dynamic pattern of trait development, significant difficulties in performing functional mapping arise from prohibitive computational times required as well as from modeling the structure of a high-dimensional covariance matrix. In this article, we develop a statistical model for functional mapping of quantitative trait loci (QTL) that govern the developmental process of a quantitative trait on the basis of wavelet dimension reduction. By breaking an original signal down into a spectrum by taking its averages (smooth coefficients) and differences (detail coefficients), we used the discrete Haar wavelet shrinkage technique to transform an inherently high-dimensional biological problem into its tractable low-dimensional representation within the framework of functional mapping constructed by a Gaussian mixture model. Unlike conventional nonparametric modeling of wavelet shrinkage, we incorporate mathematical aspects of developmental trajectories into the smooth coefficients used for QTL mapping, thus preserving the biological relevance of functional mapping in formulating a number of hypothesis tests at the interplay between gene actions/interactions and developmental patterns for complex phenotypes. This wavelet-based parametric functional mapping has been statistically examined and compared with full-dimensional functional mapping through simulation studies. It holds great promise as a powerful statistical tool to unravel the genetic machinery of developmental trajectories with large-scale high-dimensional data.
FUNCTIONAL mapping of dynamic or longitudinal traits, such as growth curves, human immunodeficiency virus dynamics, programmed cell death, circadian rhythms, and pharmacodynamics/pharmacokinetics, advocated by R. Wu and colleagues (MA et al. 2002; WU et al. 2003, 2004a,b,c; ZHAO et al. 2004a,b, 2005; reviewed in WU and LIN 2006), has emerged as an important statistical method to map and identify specific quantitative trait loci (QTL) that underlie the genetic architecture of complex traits. Genetic mapping of longitudinal traits has also received considerable attention by other workers who have developed different statistical designs (MACGREGOR et al. 2005) or methods (YANG et al. 2006) for estimating the temporal pattern of genetic variance and covariance explained by a QTL in a time course. The rationale of functional mapping is to express the genotypic effects of QTL at a series of time points in terms of a continuous growth function with respect to time or other independent variables. Under this formulation, the parameters describing longitudinal trajectories, rather than time-dependent genotypic values, are estimated within a maximum-likelihood framework. Also, unlike traditional genetic mapping strategies, functional mapping estimates the parameters that model the structure of the covariance matrix among multiple different time points, which, therefore, largely reduces the number of parameters being estimated for variances and covariances, especially when the dimension of data is high.
The significant advantage of functional mapping is that it provides a quantitative framework for testing the interplay between genetic (inter)actions and the pattern of development in a time course. Functional mapping constructs a setting for precisely estimating and predicting a number of fundamental events in the genetic control of development (WU et al. 2004a), which include (1) the timing of a QTL to turn on and off to affect growth in a time course, (2) the duration of the dynamic genetic effect of a QTL, (3) the magnitude of the genetic effect of a QTL on maximal growth rate and the lasting period of maximal growth rate, and (4) the pleiotropic effect of the growth QTL on other developmental traits related to growth processes. Despite these elegant features, functional mapping based on simple parametric modeling of mean-covariance structures may be limited in the following two aspects when the dimension of longitudinal data is large. First, one of the most promising directions in genetic studies is to view the formation of a quantitative trait as a dynamic system, dissect the system into a series of fundamental components, and evaluate the developmental complexity of the system on the basis of the significance of each component. Functional mapping has power to detect the genes involved in various stages of development of each component and draw a detailed picture of interactive networks of these genes. However, functional mapping of multivariate high-dimensional data required to describe a system is encumbered with a tremendous computational burden. Second, with increased dimensionality of repeated measurements, it is difficult to model the structure of a time-dependent covariance matrix on the basis of parametric approaches, such as autoregressive (DIGGLE et al. 2002) and antedependence models (NÚÑEZ-ANTÓN and ZIMMERMAN 2000; ZIMMERMAN and NÚÑEZ-ANTÓN 2001) or the Brownian process (SY et al. 1997). Also, high dimension may make the computation of the inverse of such a structured covariance matrix unstable, with the determinant approaching infinity or zero.
An efficient approach for applying functional mapping to high-dimensional data is through dimensionality reduction, i.e., the transformation that brings data from a high- to a low-order dimension. Wavelet dimensionality reduction preserves the signal pattern and yields better or comparative classification accuracy, as shown by Donoho and colleagues (DONOHO and JOHNSTONE 1994; DONOHO 1995; DONOHO et al. 1995). More recently, wavelet-based approaches have been integrated with functional mixed models to extract meaningful information from high-dimensional functional and longitudinal data (MORRIS et al. 2003, 2006; MORRIS and CARROLL 2006). In these applications, wavelet-based approaches, serving as a nonparametric tool to fit curves that cannot be mathematically described, play a similar role to other nonparametric approaches like B-spline smoothing. For many biological processes, such as growth curves, in which explicit mathematical equations exist (WEST et al. 2001), the nonparametric application of wavelet-based approaches will reduce the power of functional mapping to quantitatively test biologically meaningful hypotheses through mathematical models, as itemized above. In this article, we develop a statistical model for integrating parametric functional mapping of developmental trajectories on the basis of wavelet transform methods. By reducing the dimensionality of data through wavelet transform, conventional parametric functional mapping can be performed, as has usually been practiced, and thus its biological relevance and statistical advantage are preserved. It has been shown that the low-dimensionality models are not only computationally efficient, but also more flexible and robust than high-dimensionality models. The statistical properties of this new model are investigated through simulation studies and are compared with those of full-dimensional functional mapping methods.
ABSTRACT
>WAVELET-BASED FUNCTIONAL MAPPING
HYPOTHESIS TESTING
MONTE CARLO SIMULATION
DISCUSSION
APPENDIX
ACKNOWLEDGEMENTS
LITERATURE CITED
Wavelet transform:
Through wavelet transform (also called analysis), a series of original functional data are divided into two sequences of wavelet coefficients with exactly the same length, half of the original signal length (MALLAT 1989, 1998; VIDAKOVIC 1999; JENSEN and LA COUR-HARBO 2001). The first sequence of coefficients is called smooth coefficients, usually denoted by c–j, which correspond to an approximation process of the original signal. The second sequence of coefficients is called detail coefficients, usually denoted by d–j, which correspond to the detail information (subtleties) left by the approximation process. The symbol –j denotes the resolution level at which the initial sequence is split into smooth and detail coefficients. Since detail coefficients are contaminated severely by random errors, shrinking them to zero can help to reduce the overall noise level of the signal (DONOHO and JOHNSTONE 1994; DONOHO 1995; DONOHO et al. 1995).
Here we incorporate the principle of wavelet transform into the framework of functional mapping for developmental trajectories. Suppose there is an experimental mapping population, such as the backcross, with two different genotypes at a putative QTL. Each individual of this population is measured for a growth trait at T different time points. Molecular markers have been genotyped for the population to identify and map QTL that control growth trajectories. For a particular QTL genotype l (l = 1, 2), the phenotypic observation (yi(t
)) for individual i at time t
can be written as
![]() |
i is an indicator variable for individual i denoted as 1 for QTL genotype 1 and 0 for QTL genotype 2, ul(t
) is the genotypic value for QTL genotype l at time t
, and ei(t
) is the residual error at time t following a normal distribution N(0,
2). The residual covariance between two different time points
and
is expressed as
.
Letting yi = {yi(t1), ... , yi(tT)}', ul = {ul(t1), ... , ul(tT)}', and ei = {ei(t1), ... , ei(tT)}', we have
![]() | (1) |
. For the wavelet transformation, a matrix H exists such that
![]() | (2) |
![]() |
![]() |
After the first-resolution Haar wavelet transform, the smooth and detail coefficients of these two signals are expressed, respectively, as
![]() | (3) |
![]() | (4) |
. The process can be iterated until only one smooth and one detail coefficient are produced (WALKER 1999). The pattern of the smooth coefficients in the wavelet space resembles the signal pattern in the time space. In Figure 1, c0 represents a sample of repeated measurements of a logistic growth curve at 12 time points. The smooth coefficients of the first and second Haar wavelet transformation are plotted as c–1 and c–2, respectively. The pattern of c–1 and c–2 coefficients conforms to the signal pattern although they are in two different resolution levels. Because of the similarity, it is reasonable to model the original growth pattern on the basis of low-dimensional smooth coefficients.
|
Likelihood of smooth coefficients:
Let c–j be the new variable with a reduced dimension K (K < T) transformed from the jth resolution Haar wavelet. The likelihood function based on a mixture model containing two QTL genotypes in the backcross can be rewritten, in terms of c–j and marker information M, as
![]() | (5) |
1|i,
2|i) are the mixture proportions corresponding to the frequencies of different QTL genotypes for individual i. Because the marker genotype of each individual is known, these mixture proportions actually represent the conditional probabilities expressed in terms of the recombination fractions between the markers and QTL.
= {
l|i, w
,
–j}
contains unknown parameters about the QTL position and the mean vector and covariance matrix after the wavelet transform, and fl(c
; w
,
–j) is the multivariate normal distribution of individual i that carries QTL genotype l expressed as
![]() | (6) |
= {c
(k1), ... , c
(kK)} is a vector of smooth coefficients for individual i, w
= {wl–j(k1), ... , wl–j(kK)} is a vector of expected smooth coefficients for QTL genotype l, and
–j is the (K x K) residual covariance matrix for the smooth coefficients.
Modeling the mean-covariance structures:
The tenet of functional mapping is used to model the mean vector w
by a simple parametric function of biological meaning denoted as
) with
being a set of mathematical parameters that describe the shape of the growth curve specifically for QTL genotype l. For the first resolution transformation, this modeling process can be mathematically expressed as
![]() | (7) |
with wavelet-transformed data at an appropriate transformation resolution j, trajectory curves in a time course can be drawn individually for different QTL genotypes. Genotypic differences for the curve can be compared and tested for the statistical significance of genetic control over developmental traits.
As shown in functional mapping, the covariance matrix
among different time points can be modeled using parametric approaches. The structure of the covariance matrix can be modeled by the first-order autoregressive [AR(1)] model (DIGGLE et al. 2002), expressed as
![]() | (8) |
< 1 is the proportion parameter with which the correlation decays with time lag. The parameters that model the structure of the covariance matrix are arrayed in
v = (
,
2).
Alternatively, the age-specific change of variance and correlation in the analysis of longitudinal traits can be modeled by ZIMMERMAN and NÚÑEZ-ANTÓN's (2001) SAD model. Using matrix notation, the error term in regression model (1), ignoring the subscript i, can be expressed as
![]() |
= {
(t1), ... ,
(tT)}', and for the first-order SAD [SAD(1)] model
![]() |
The residual covariance matrix of the longitudinal trait is then expressed (JAFFRÉZIC et al. 2003) as
![]() | (9) |

is the innovation covariance matrix, expressed as
![]() |

(t
) can be approached by a polynomial (POURAHMADI 1999) or, for simplicity, is assumed as a constant 
. Thus, the residual matrix
contains the parameters,
v = (
, 
), that model it.
The residual variances and covariances of the smooth coefficients for the developmental trait measured at tT time points after Wavelet transforms can be derived, with expressions depending on the resolution level of transform. For example, the variance of first-resolution smooth coefficient ci–1(k1) is expressed as
![]() |
![]() |
Computational algorithms:
We adopt the previous EM algorithm developed on the basis of log-likelihood function (5) (MA et al. 2002; WU et al. 2004a,b,c) to estimate the QTL position in terms of QTL-marker recombination fractions, the curve parameters (
) that model the mean vector, and the parameters (
v) modeling the covariance matrix. In the APPENDIX, a detailed EM algorithm is given to estimate these parameters. In practice, the QTL position parameter can be viewed as a fixed parameter because a putative QTL can be searched at every 1 or 2 cM on a map interval bracketed by two markers throughout the entire linkage map. The amount of support for a QTL at a particular map position is often displayed graphically through the use of likelihood maps or profiles, which plot the likelihood-ratio test statistic as a function of map position of the putative QTL.
Thresholding rule:
As the noise-free signal is unknown, an approximation of the signal should be sought that is smooth and fits the input well. There are two predominant thresholding schemes for this approximation based on wavelet dimensionality reduction or wavelet shrinkage. One is the hard threshold filter (Hh), or referred to as the "keep or kill" method (ABOUFADEL and SCHLICKER 1999), that removes coefficients below a threshold value, determined by the noise variance. The second is the soft threshold filter (Hs) that shrinks the wavelet coefficients above and below the threshold. Soft thresholding reduces coefficients toward zero (GHAEL et al. 1997). The process of denoising may lose some information in that the denoised signal is irreversibly different from the noisy signal. Thus, thresholding is the cause of this loss of information.If we desire the resulting signal to be smooth, the soft threshold filter should be used, although the hard threshold filter performs better. In practice, it is difficult to choose a threshold value. A small threshold value creates a noisy result near the input, while a large threshold value introduces bias. The optimal threshold is somewhere in between.
Many approaches have been available to determine the threshold level. A universal method assigns a threshold level equal to the variance times, expressed as
![]() | (10) |
![]() | (11) |
N(0, 
), i = 1, ... , n,
![]() | (12) |
greater than the threshold tends to 0. For most applications, the hard thresholding rule keeps only those detail coefficients that are significantly greater than zero. We use the hard thresholding rule as a guidance to either keep or kill the whole level of detail coefficients. The following procedure is proposed to perform data dimensionality reduction through wavelet transforms:
- Select proper orthogonal wavelet filters.
- Calculate the empirical variances for the detail coefficients.
- Apply the hard thresholding rule to the detail coefficients.
- Truncate the whole level of the detail coefficients if they are all set to zero by step 3, and keep the whole level of the detail coefficients otherwise.
- Repeat procedures 1–4 for user prescribed j times. Functional QTL mapping will be conducted on the basis of the smooth coefficients c–j if the rest of detail coefficients are truncated or on the basis of the smooth coefficients c–j+1 otherwise.
ABSTRACT
WAVELET-BASED FUNCTIONAL MAPPING
>HYPOTHESIS TESTING
MONTE CARLO SIMULATION
DISCUSSION
APPENDIX
ACKNOWLEDGEMENTS
LITERATURE CITED
![]() |
![]() |
![]() | (13) |
Other hypothesis tests after a significant QTL is detected include the significance of the additive genetic effect on a certain point or period of growth trajectory during a time course. The wavelet-based parametric model can be used to test the genetic control over particular developmental timing and specific mathematical parameters that determine the shape of a curve. Furthermore, wavelet-based functional mapping facilitates the resolution of fundamental biological issues as follows: (1) If multiple environments are used, as shown in ZHAO et al. (2004b), we can test how the detected QTL interact with the environment in a coordinated manner to determine growth performance at different developmental stages; (2) the model can be readily extended to model multiple QTL for growth trajectories throughout the genome and estimate and test the effects of different components of epistasis on growth patterns and forms (WU et al. 2004a); and (3) wavelet-based parametric functional mapping builds up a bridge between the genetic mechanisms underlying different traits, quantitatively testing for the role of pleiotropic QTL on the phenotypic integration of a biological system.
Unlike the test statistic for the QTL significance by Equation 13, the critical threshold values for all the tests after a significant QTL was detected should be determined from simulation studies. However, the null hypotheses for these tests are well nested with their alternative hypotheses and, for this reason, the corresponding LR values can be thought to asymptotically follow a
2-distribution. Therefore, in theory, the critical values for these tests can be obtained from the
2-table, although the asymptotic properties of these test statistics need to be investigated analytically or empirically.
ABSTRACT
WAVELET-BASED FUNCTIONAL MAPPING
HYPOTHESIS TESTING
>MONTE CARLO SIMULATION
DISCUSSION
APPENDIX
ACKNOWLEDGEMENTS
LITERATURE CITED
We assume a backcross population of a moderately large sample size (200), in which there are two genotypes at each gene. A total of six markers are evenly distributed along a linkage group of length 100 cM. A QTL assumed to affect growth trajectories is located at the center of the linkage group (i.e., 50 cM from either end of the linkage group). By assuming two different QTL genotypic curves, defined by
, where al is the genotype-specific asymptotic growth when time is infinite, bl is the genotype-specific parameter related to the initial growth, and rl is the genotype-specific relative growth rate, phenotypic data were simulated using the linear regression model (2), with the time-dependent covariance of residual errors following the AR(1) model. The variance
2 was selected according to the heritability of H2 = 0.1 and 0.4 at the time point where the genetic variance is in the middle of the largest and the smallest.
It is important to note that the genetic control of a QTL over growth may have different patterns. We consider four such different patterns (Figure 2), as also indicated in WU and LIN (2006), to investigate how functional mapping affects the estimation of genotypic curves for different patterns. Pattern 1 with two parallel genotypic curves shows that a QTL exerts its effect on the entire process of growth, whereas the other three patterns show the changes of QTL effects over the time course because the two genotypic curves are crossed at a certain point of development. Pattern 2 suggests that a QTL has no effect at an earlier stage of development, but displays an increased influence with time after a particular timing. Pattern 3 is the inverse of pattern B and, thus, reflects the impact of an "early QTL." The two genotypic curves in pattern 4 make a crossover, suggesting that the underlying QTL alters the direction of its effect. Curve parameters (
) that defined these two patterns are given in Table 1.
|
|
Both full-dimensional and wavelet-based functional mapping were used to analyze the simulated data including marker information and longitudinal measures, respectively. According to the thresholding rule, the optimal resolution levels of wavelet transform are 1, 2, and 3 and, therefore, the optimal dimensions after reduction are 4, 4, and 8, for the 8-, 16-, and 64-dimensional longitudinal data, respectively. For each simulation scheme, 200 runs were repeated to estimate the square roots of the mean square errors of the maximum-likelihood estimates (MLEs) of unknown parameters as well as averaged computing times in seconds under the two models. Figure 3 illustrates the profiles of LR values across the simulated linkage group for two different levels of heritability of growth curves (0.1 and 0.4) and of dimension of longitudinal data (16 and 64) calculated from full-dimensional and wavelet-based functional mapping. The peaks of the LR profiles near the center of the linkage group suggest that the two models can provide a good estimate of the QTL location for each simulation scheme.
|
The two functional mapping models provide comparable results; i.e., both of them can be well used to estimate the location of a QTL and QTL genotype-specific parameters that define the shape of growth trajectories (Tables 2–4
|
|
|
The computing times used for full-dimensional and wavelet-based functional mapping differ notably. The advantage of wavelet-based over full-dimensional functional mapping in computing efficiency is consistent for growth curves of different heritabilities and different patterns of dynamic genetic effects. Wavelet-based functional mapping is computationally more efficient by approximately one to two and two to four times relative to full-dimensional functional mapping when the dimension of longitudinal data is 8 or 16. This computational efficiency increases to approximately four to eight times when the dimension increases to 64.
ABSTRACT
WAVELET-BASED FUNCTIONAL MAPPING
HYPOTHESIS TESTING
MONTE CARLO SIMULATION
>DISCUSSION
APPENDIX
ACKNOWLEDGEMENTS
LITERATURE CITED
The effective use of the wavelet thresholding model for functional mapping can be attributed to three factors. First, the dimensionality of smooth coefficients is only a fraction of the data dimensionality. The curse of dimensionality is removed by considering models for low-dimensional smooth coefficients. Second, the smooth coefficients capture the pattern of the original signal (see Figure 1). As shown in FAN and LIN (1998), the Fourier transformation preserves information content mostly in the low-frequency Fourier coefficients. Likewise, smooth coefficients are the low-frequency counterpart of the signal (VIDAKOVIC 1999). Third, removing detail coefficients from the transformation corresponds to removing high-frequency contents of the data, which is equivalent to reducing the noise level and the redundancies of the data. Truncating detail coefficients corresponds to a denoising process (DONOHO and JOHNSTONE 1994; DONOHO 1995; DONOHO et al. 1995). By throwing out the noise-rich detail coefficients, the remaining smooth coefficients have a higher signal-to-noise ratio under certain conditions.
We performed simulation studies to investigate the statistical behavior of wavelet-based functional mapping. The data were simulated for multiple time points using QTL genotype-specific logistic curves, but analyzed by both wavelet-based and full-dimensional functional mapping models. The results suggest that the wavelet-based functional mapping outperforms the full-dimensional model in computational efficiency, while these two models provide consistent results about parameter estimation. With an increased dimension of longitudinal data, the advantage of wavelet-based over full-dimensional functional mapping in computation becomes more pronounced.
The advantage of functional mapping in biology comes from the parametric modeling of the dynamic change of genotypic values for a putative QTL over different time points and the testing of QTL effects on the pattern and form of growth curves on the basis of mathematical parameters that define the curves. Wavelet-based shrinkage has been generally used as a nonparametric approach to handle functional data (MORRIS et al. 2003, 2006; MORRIS and CARROLL 2006), thus exhibiting a broad use in situations in which no explicit mathematical function is available to describe longitudinal curves. For many longitudinal trajectories that can be mathematically formulated, such as the sigmoid shape of organismic growth (VON BERTALANFFY 1957; WEST et al. 2001), biexponential decay of viral load after medical treatment (HO et al. 1995; WU and DING 1999), and periodic functions of circadian rhythm (SCHEPER et al. 1999), a nonparametric treatment will unavoidably reduce their biological relevance. One of the most significant relevances of our wavelet-based functional mapping proposed is to preserve all the favorable properties of functional mapping in biologically meaningful hypothesis tests through the integration of wavelet shrinkage within the framework of parametric functional mapping.
For a practical data set, the true mean-covariance structures are not known. As shown by ZHAO et al. (2005), different covariance structures may give different results of QTL location estimation. In our simulation studies, the AR(1) model was used to approximate the structure of the covariance matrix. It would be important to model the mean-covariance structures for the time–space data set using different approaches (see POURAHMADI 1999, 2000; PAN and MACKENZIE 2003; WU and POURAHMADI 2003) and further derive dimensionally reduced models at different resolutions.
The wavelet dimensionality reduction method proposed in this article has power to map and identify QTL responsible for high-dimensional longitudinal traits. It opens a new avenue to draw a detailed picture of the genetic architecture of a complex biological system by dissecting it into different components and developmental stages and studying the genetic regulation and interactions of different QTL involved in various sequential steps of development. As a starting point, we hope to develop more sophisticated wavelet-based functional mapping to take into account the complexity and high dimension of longitudinal data that are needed to describe a biological system in a comprehensive way.
ABSTRACT
WAVELET-BASED FUNCTIONAL MAPPING
HYPOTHESIS TESTING
MONTE CARLO SIMULATION
DISCUSSION
>APPENDIX
ACKNOWLEDGEMENTS
LITERATURE CITED
(l = 1, 2) for the logistic equation of growth trajectories and the covariance parameters
v = (
,
2) based on the AR(1) model from wavelet-based functional mapping. The observed log-likelihood function of Equation 5 is rewritten as
![]() |
![]() |
![]() |
![]() |
and w
are determined according to the thresholding rule.
Assuming that QTL genotypes (Q) for each individual are known, i.e., there are no missing data, we construct the complete log-likelihood function as
![]() |
1 is the indicator variable for a QTL genotype of individual i as defined before.
In the E-step of the EM algorithm, we need to calculate the conditional expectation of the complete log-likelihood given the observed data and the current estimation of parameters,
![]() |
So, we define the E-step by expressing the posterior probabilities of individual i to be QTL genotype 1 or 2 as
![]() |
![]() |
The M-step is derived by solving the log-likelihood equations of the expected complete log-likelihood function given the observed data and current estimations,
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
.
Note that since the smooth coefficients are weighted averages of the neighboring points, the correlation structure between these smooth coefficients resembles the correlation structure of the full-dimensional data. Here we take advantage of this resemblance and directly model the correlation structure of the smooth coefficients. If the likelihood equations cannot be solved directly, the Newton–Raphson algorithm is implemented within the M-step to solve these equations numerically. For an illustration, we show only the method for computing b1. The
th step in the Newton–Raphson algorithm is expressed as
![]() |
The E- and M-steps are iteratively repeated until the estimates of parameters are stable. These stable estimates are regarded as the MLEs of parameters.
ABSTRACT
WAVELET-BASED FUNCTIONAL MAPPING
HYPOTHESIS TESTING
MONTE CARLO SIMULATION
DISCUSSION
APPENDIX
>ACKNOWLEDGEMENTS
LITERATURE CITED
ABSTRACT
WAVELET-BASED FUNCTIONAL MAPPING
HYPOTHESIS TESTING
MONTE CARLO SIMULATION
DISCUSSION
APPENDIX
ACKNOWLEDGEMENTS
>LITERATURE CITED
ABOUFADEL, E., and S. SCHLICKER, 1999 Discovering Wavelets. John Wiley & Sons, New York.
CHURCHILL, G. A., and D. W. DOERGE, 1994 Empirical threshold values for quantitative trait mapping. Genetics 138: 963–971.[Abstract]
DIGGLE, P. J., P. HEAGERTY, K. Y. LIANG and S. L. ZEGER, 2002 Analysis of Longitudinal Data. Oxford University Press, Oxford.
DONOHO, D. L., 1995 De-noising by soft-thresholding. IEEE Trans. Inform. Theory 41: 613–627.[CrossRef]
DONOHO, D. L., and I. M. JOHNSTONE, 1994 Ideal spatial adaptation by wavelet shrinkage. Biometrika 81: 425–455.
DONOHO, D. L., I. M. JOHNSTONE, G. KERKYACHARIAN and D. PICARD, 1995 Wavelet shrinkage: Asymptopia? J. R. Stat. Soc. B 57: 301–369.
FAN, J. Q., and S. K. LIN, 1998 Test of significance when data are curves. J. Am. Stat. Assoc. 93: 1007–1021.[CrossRef]
GHAEL, S. P., A. M. SAYEED and R. G. BARANIUK, 1997 Improved wavelet denoising via empirical Wiener filtering. Proceedings of SPIE: Wavelet Applications in Signal and Imaging, San Diego, Vol. 3169, pp. 389–399.
HO, D. D., A. U. NEUMANN, A. S. PERELSON, W. CHEN, J. M LEONARD et al., 1995 Rapid turnover of plasma virions and CD4 lymphocytes in HIV-1 infection. Nature 373: 123–126.[CrossRef][Medline]
JAFFRÉZIC, F., R. THOMPSON and W. G. HILL, 2003 Structured antedependence models for genetic analysis of repeated measures on multiple quantitative traits. Genet. Res. 82: 55–65.[CrossRef][Medline]
JENSEN, A., and A. LA COUR-HARBO, 2001 Ripples in Mathematics. Springer, New York.
JOHNSTONE, I. M., and B. W. SILVERMAN, 1997 Wavelet threshold estimators for data with correlated noise. J. Roy. Stat. Soc. Ser. B 59: 319–351.[CrossRef]
LANDER, E. S., and D. BOTSTEIN, 1989 Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121: 185–199.
LYNCH, M., and B. WALSH, 1998 Genetics and Analysis of Quantitative Traits. Sinauer, Sunderland, MA.
MA, C. X., G. CASELLA and R. L. WU, 2002 Functional mapping of quantitative trait loci underlying the character process: a theoretical framework. Genetics 161: 1751–1762.
MACGREGOR, S., S. A. KNOTT, I. WHITE and P. M. VISSCHER, 2005 Quantitative trait locus analysis of longitudinal quantitative trait data in complex pedigrees. Genetics 171: 1365–1376.
MALLAT, S. G., 1989 A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans. Patt. Anal. Machine Intell. 11: 674–693.[CrossRef]
MALLAT, S., 1998 A Wavelet Tour of Signal Processing. Academic Press, New York/London/San Diego.
MORRIS, J. S., and R. J. CARROLL, 2006 Wavelet-based functional mixed models. J. R. Stat. Soc. Ser. B 68: 179–199.[CrossRef]
MORRIS, J. S., M. VANNUCCI, P. J. BROWN and R. J. CARROLL, 2003 Wavelet-based nonparametric modeling of hierarchical functions in colon carcinogenesis. J. Am. Stat. Assoc. 98: 573–597.[CrossRef]
MORRIS, J. S., S. ARROYO, B. A. COULL, L. M. RYAN, R. HERRICK et al., 2006 Using wavelet-based functional mixed models to characterize population heterogeneity in accelerometer profiles: a case study. J. Am. Stat. Assoc. 101: 1352–1364.[CrossRef]
NÚÑEZ-ANTÓN, V., and D. L. ZIMMERMAN, 2000 Modeling nonstationary longitudinal data. Biometrics 56: 699–705.[CrossRef][Medline]
PAN, J. X., and G. MACKENZIE, 2003 On modelling mean-covariance structures in longitudinal studies. Biometrika 90: 239–244.
POURAHMADI, M., 1999 Joint mean-covariance models with applications to longitudinal data: unconstrained parameterisation. Biometrika 86: 677–690.
POURAHMADI, M., 2000 Maximum likelihood estimation of generalised linear models for multivariate normal covariance matrix. Biometrika 87: 425–435.
SCHEPER, T. O., D. KLINKENBERG, C. PENNARTZ and J. VAN PELT, 1999 A mathematical model for the intracellular circadian rhythm generator. J. Neurosci. 19: 40–47.
SY, J. P., J. M. TAYLOR and W. G. CUMBERLAND, 1997 A stochastic model for the analysis of bivariate longitudinal AIDS data. Biometrics 53: 542–555.[CrossRef][Medline]
VIDAKOVIC, B., 1999 Statistical Modeling by Wavelets. Wiley, New York.
VON BERTALANFFY, L., 1957 Quantitative laws in metabolism and growth. Q. Rev. Biol. 32: 217–231.[CrossRef][Medline]
WALKER, J. S., 1999 A Primer on Wavelets for Their Scientific Applications. Chapman & Hall/CRC, London/New York/Cleveland/Boca Raton, FL.
WEST, G. B., J. H. BROWN and B. J. ENQUIST, 2001 A general model for ontogenetic growth. Nature 413: 628–631.[CrossRef]
WU, H., and A. DING, 1999 Population HIV-1 dynamics in vivo: applicable models and inferential tools for virological data from AIDS clinical trials. Biometrics 55: 410–418.[CrossRef][Medline]
WU, R. L., and M. LIN, 2006 Functional mapping—how to map and study the genetic architecture of dynamic complex traits. Nat. Rev. Genet. 7: 229–237.[CrossRef][Medline]
WU, R. L., C.-X. MA, W. ZHAO and G. CASELLA, 2003 Functional mapping of quantitative trait loci underlying growth rates: a parametric model. Physiol. Genomics 14: 241–249.
WU, R. L., C.-X. MA, M. LIN and G. CASELLA, 2004a A general framework for analyzing the genetic architecture of developmental characteristics. Genetics 166: 1541–1551.
WU, R. L., C.-X. MA, M. LIN, Z. H. WANG and G. CASELLA, 2004b Functional mapping of quantitative trait loci underlying growth trajectories using a transform-both-sides logistic model. Biometrics 60: 729–738.[CrossRef][Medline]
WU, R. L., Z. H. WANG, W. ZHAO and J. M. CHEVERUD, 2004c A mechanistic model for genetic machinery of ontogenetic growth. Genetics 168: 2383–2394.
WU, W. B., and M. POURAHMADI, 2003 Nonparametric estimation of large covariance matrices of longitudinal data. Biometrika 90: 831–844.
YANG, R. Q., Q. TIAN and S. Z. XU, 2006 Mapping quantitative trait loci for longitudinal traits in line crosses. Genetics 173: 2339–2356.
ZENG, Z.-B., 1994 Precision mapping of quantitative trait loci. Genetics 136: 1457–1468.[Abstract]
ZHAO, W., R. L. WU, C.-X. MA and G. CASELLA, 2004a A fast algorithm for functional mapping of complex traits. Genetics 167: 2133–2137.
ZHAO, W., C.-X. MA, M. CHEVERUD and R. L. WU, 2004b A unifying statistical model for QTL mapping of genotype x sex interaction for developmental trajectories. Physiol. Genomics 19: 218–227.
ZHAO, W., Y. Q. CHEN, G. CASELLA, J. M. CHEVERUD and R. L WU, 2005 A nonstationary model for functional mapping of complex traits. Bioinformatics 21: 2469–2477.
ZIMMERMAN, D. L., and V. NÚÑEZ-ANTÓN, 2001 Parametric modeling of growth curve data: an overview. Test 10: 1–73.[CrossRef]
Communicating editor: J. B. WALSH
- THIS ARTICLE
-
Abstract
- Full Text (PDF)
-
All Versions of this Article:
genetics.107.070920v1
176/3/1879 most recent - Alert me when this article is cited
- Alert me if a correction is posted
- SERVICES
- Email this article to a friend
- Similar articles in this journal
- Similar articles in PubMed
- Alert me to new issues of the journal
- Download to citation manager
- Reprints & Permissions
- CITING ARTICLES
- Citing Articles via Google Scholar
- GOOGLE SCHOLAR
- Articles by Zhao, W.
- Articles by Wu, R.
- Search for Related Content
- PUBMED
- PubMed Citation
- Articles by Zhao, W.
- Articles by Wu, R.










































