Originally published as Genetics Published Articles Ahead of Print on April 19, 2006.

Genetics, Vol. 173, 2247-2255, August 2006, Copyright © 2006
doi:10.1534/genetics.105.054197

On the Quantitative Genetics of Mixture Characters

Daniel Gianola*,†,1, Bjorg Heringstad† and Jorgen Odegaard†

* Department of Animal Sciences, University of Wisconsin, Madison, Wisconsin 53706 and † Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, N-1432 Ås, Norway

1 Corresponding author: Department of Animal Sciences, 1675 Observatory Dr., Madison, WI 53706.
E-mail: gianola@calshp.cals.wisc.edu

Manuscript received December 1, 2005. Accepted for publication April 14, 2006.


    ABSTRACT
 TOP
 ABSTRACT
 MODEL
 TRUNCATION SELECTION
 COVARIANCE BETWEEN RELATIVES
 COVARIANCE STRUCTURE
 CONCLUSION
 APPENDIX
 ACKNOWLEDGEMENTS
 LITERATURE CITED
 
Finite mixture models are helpful for uncovering heterogeneity due to hidden structure. Quantitative genetics issues of continuous characters having a finite mixture of Gaussian components as statistical distribution are explored in this article. The partition of variance in a mixture, the covariance between relatives under the supposition of an additive genetic model, and the offspring–parent regression are derived. Formulas for assessing the effect of mass selection operating on a mixture are given. Expressions for the genetic and phenotypic correlations between mixture and Gaussian traits and between two mixture traits are presented. It is found that, if there is heterogeneity in a population at the genetic or environmental level, then genetic parameters based on theory treating distributions as homogeneous can lead to misleading interpretations. Some peculiarities of mixture characters are: heritability depends on the mean values of the component distributions, the offspring–parent regression is nonlinear, and genetic or phenotypic correlations cannot be interpreted devoid of the mixture proportions and of the parameters of the distributions mixed.


FINITE mixture models, used in biology and in genetics since PEARSON (1894), are helpful for uncovering heterogeneity due to hidden structure or incorrect assumptions. For instance, unknown loci with major effects can create "bumps" (sometimes quite subtle) in a phenotypic distribution, and this type of heterogeneity may be resolved by fitting a mixture, i.e., by calculating conditional probabilities that a datum is drawn from one of the several potential, yet unknown, genotypes. A brief review of the potential usefulness of mixtures for uncovering major genes is in LYNCH and WALSH (1998). Also, many quantitative trait loci detection procedures are based on ideas from mixture models (HALEY and KNOTT 1992).

The quantitative genetics of characters distributed as mixtures has not been studied extensively, although the idea underlies work of, e.g., LATTER (1965) and KIMURA and CROW (1978). Perhaps this is because, until recently, fitting complex hierarchical mixture models to phenotypic data was computationally difficult. However, inference about some quantitative genetic characters via finite mixture models may be warranted in practice. For example, consider mastitis, an inflammation of the mammary gland of cows and goats associated with bacterial infection. The disease affects the dairy industry globally, and it has severe economic effects. Genetic variation in susceptibility to mastitis exists, and selection for increased resistance is feasible (HERINGSTAD et al. 2000). However, recording of mastitis events is not routine in most nations, and milk somatic cell counts (SCC) have been used as a proxy in genetic evaluation of sires (using mixed-effects linear models), because an elevation of SCC is associated with mastitis. It is not obvious how the SCC information should be treated optimally in genetic evaluation. SCC is both an indicator of mastitis and a measure of response to infection. It is reasonable to expect that SCC observations taken on healthy and diseased animals display different distributions, which are "hidden" in the absence of disease recording. Finite mixture models have been suggested in this context by DETILLEUX and LEROY (2000), ØDEGÅRD et al. (2003, 2005), GIANOLA et al. (2004), and BOETTCHER et al. (2005).

This article explores quantitative genetics issues of continuous characters having a finite mixture of Gaussian components as statistical distribution. MODEL introduces notation, gives a specification in which both genetic and residual effects follow mixtures, and derives pertinent marginal and conditional distributions. TRUNCATION SELECTION gives formulas for assessing the effect of mass selection operating on a mixture. Next, COVARIANCE BETWEEN RELATIVES presents the partition of variance in a mixture, the calculation of covariance between relatives under the supposition of an additive genetic model, and illustrates the effect of heterogeneity on the offspring–parent regression. Genetic and phenotypic correlations between mixture and Gaussian traits, and between two mixture traits, are discussed in COVARIANCE STRUCTURE. This article concludes with some comments and with an APPENDIX, where some basic formulas of mixture distributions are presented.


    MODEL
 
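Suppose an observable random variable ($y_i$, phenotype of individual $i$) is drawn from the finite mixture of $G_E$ Gaussian components,

$$p(y_i \mid a_i, \mathbf{P}_e, \boldsymbol{\mu}_e, \boldsymbol{\sigma}^2_e) = \sum_{k=1}^{G_E} P_{e_k}\, N(y_i \mid \mu_k + a_i, \sigma^2_{e_k}), \tag{1}$$
where $\mathbf{P}_e$ is a vector containing the mixing proportions $P_{e_k}$ (summing to 1); $\boldsymbol{\mu}_e$ and $\boldsymbol{\sigma}^2_e$ are each $G_E \times 1$ vectors of means and variances with typical elements $\mu_k$ and $\sigma^2_{e_k}$, respectively; $a_i$ is the genetic value of $i$, and $N(\cdot \mid \cdot, \cdot)$ denotes a univariate normal density with appropriate mean and variance. As shown in the APPENDIX, the mean and variance of this conditional (given the genetic effect) distribution are

$$E(y_i \mid a_i) = \sum_{k=1}^{G_E} P_{e_k}\mu_k + a_i \tag{2}$$
and

$$\mathrm{Var}(y_i \mid a_i) = \sum_{k=1}^{G_E} P_{e_k}\sigma^2_{e_k} + \sum_{k=1}^{G_E} P_{e_k}\mu_k^2 - \left(\sum_{k=1}^{G_E} P_{e_k}\mu_k\right)^2 = \sigma^2_e, \tag{3}$$
respectively, where $\sigma^2_e$ is the residual or "environmental" variance. Informally, $\sum_k P_{e_k}\mu_k^2 - \left(\sum_k P_{e_k}\mu_k\right)^2$ is the part of the environmental variance contributed by population heterogeneity.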

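Assume that the genetic effect $a_i$ is also drawn from the mixture with $G_A$ components

$$p(a_i \mid \mathbf{P}_a, \boldsymbol{\alpha}, \boldsymbol{\sigma}^2_a) = \sum_{m=1}^{G_A} P_{a_m}\, N(a_i \mid \alpha_m, \sigma^2_{a_m}), \tag{4}$$
where $\mathbf{P}_a$, $\boldsymbol{\alpha}$, and $\boldsymbol{\sigma}^2_a$ are the vectors of mixing proportions, component means, and component variances, respectively. Then, $E(a_i) = \sum_{m} P_{a_m}\alpha_m$, and

$$\mathrm{Var}(a_i) = \sum_{m=1}^{G_A} P_{a_m}\sigma^2_{a_m} + \sum_{m=1}^{G_A} P_{a_m}\alpha_m^2 - \left(\sum_{m=1}^{G_A} P_{a_m}\alpha_m\right)^2 = \sigma^2_a, \tag{5}$$
where $\sigma^2_a$ is the genetic variance, and $\sum_m P_{a_m}\alpha_m^2 - \left(\sum_m P_{a_m}\alpha_m\right)^2$ is interpretable as "variance between genetic means." In Gaussian linear models the distribution of the random genetic effects is often taken to be $N(a_i \mid 0, \sigma^2_a)$, where $\sigma^2_a$ is the additive genetic variance, so it may be reasonable to introduce the restriction $\sum_m P_{a_m}\alpha_m = 0$ in the mixture (VERBEKE and LESAFFRE 1996). The joint density of $a_i$ and $y_i$ is obtained by multiplication of (1) and (4), yielding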

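$$p(a_i, y_i) = \sum_{k=1}^{G_E}\sum_{m=1}^{G_A} P_{e_k}P_{a_m}\, N(y_i \mid \mu_k + a_i, \sigma^2_{e_k})\, N(a_i \mid \alpha_m, \sigma^2_{a_m}), \tag{6}$$
which is a finite mixture of $G_E \times G_A$ bivariate normal distributions, with mixing proportion $P_{e_k}P_{a_m}$ for the $km$th component; note that $\sum_k\sum_m P_{e_k}P_{a_m} = 1$. From standard Gaussian linear models theory, given the $km$ component (let the indicator $\delta_{km} = 1$ denote such a situation),

$$\begin{bmatrix} a_i \\ y_i \end{bmatrix} \Big|\; \delta_{km} = 1 \sim N_2\left( \begin{bmatrix} \alpha_m \\ \mu_k + \alpha_m \end{bmatrix}, \begin{bmatrix} \sigma^2_{a_m} & \sigma^2_{a_m} \\ \sigma^2_{a_m} & \sigma^2_{a_m} + \sigma^2_{e_k} \end{bmatrix} \right),$$
where $N_2(\cdot \mid \cdot, \cdot)$ denotes a bivariate normal distribution. Further,

$$a_i \mid y_i, \delta_{km} = 1 \sim N\left( \alpha_m + b_{km}(y_i - \mu_k - \alpha_m),\; \sigma^2_{a_m}(1 - b_{km}) \right),$$
where

$$b_{km} = \frac{\sigma^2_{a_m}}{\sigma^2_{a_m} + \sigma^2_{e_k}}$$
and

$$\sigma^2_{a_m}(1 - b_{km}) = \frac{\sigma^2_{a_m}\sigma^2_{e_k}}{\sigma^2_{a_m} + \sigma^2_{e_k}}.$$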

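Under the standard additive genetic model of FISHER (1918), this regression of "genotype on phenotype" $b_{km}$ is the heritability of the character under the $km$th component of the bivariate mixture. The joint density (6) is also expressible as

$$p(a_i, y_i) = \sum_{k=1}^{G_E}\sum_{m=1}^{G_A} P_{e_k}P_{a_m}\, N(y_i \mid \mu_k + \alpha_m, \sigma^2_{a_m} + \sigma^2_{e_k})\, N\left(a_i \mid \alpha_m + b_{km}(y_i - \mu_k - \alpha_m), \sigma^2_{a_m}(1 - b_{km})\right). \tag{7}$$

The marginal density of $y_i$ is arrived at by integrating (7) over $a_i$, yielding

$$p(y_i) = \sum_{k=1}^{G_E}\sum_{m=1}^{G_A} P_{e_k}P_{a_m}\, N(y_i \mid \mu_k + \alpha_m, \sigma^2_{a_m} + \sigma^2_{e_k}). \tag{8}$$

This is a finite mixture of $G_E \times G_A$ univariate normal distributions with mixing proportions $P_{e_k}P_{a_m}$. From the APPENDIX, the mean and variance of the phenotypic distribution are

$$E(y_i) = \sum_{k=1}^{G_E} P_{e_k}\mu_k + \sum_{m=1}^{G_A} P_{a_m}\alpha_m \tag{9}$$
and

$$\mathrm{Var}(y_i) = \sum_{m} P_{a_m}\sigma^2_{a_m} + \sum_{m} P_{a_m}\alpha_m^2 - \left(\sum_{m} P_{a_m}\alpha_m\right)^2 + \sum_{k} P_{e_k}\sigma^2_{e_k} + \sum_{k} P_{e_k}\mu_k^2 - \left(\sum_{k} P_{e_k}\mu_k\right)^2 = \sigma^2_a + \sigma^2_e. \tag{10}$$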
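The partition of the phenotypic variance into a genetic and a residual part, each carrying a "between means" term, can be checked numerically. The following is a minimal sketch using only the mixture moment formulas of the APPENDIX; the function name `mixture_moments` and all parameter values are hypothetical and chosen purely for illustration.

```python
def mixture_moments(props, means, variances):
    """Mean and variance of a finite Gaussian mixture (APPENDIX formulas)."""
    mean = sum(p * m for p, m in zip(props, means))
    avg_var = sum(p * v for p, v in zip(props, variances))          # average variance
    between = sum(p * m * m for p, m in zip(props, means)) - mean ** 2  # heterogeneity
    return mean, avg_var + between


# Hypothetical parameter values (assumptions, not taken from the article).
P_e, mu, s2_e = [0.7, 0.3], [0.0, 2.0], [1.0, 1.5]       # residual mixture
P_a, alpha, s2_a = [0.6, 0.4], [-0.4, 0.6], [0.5, 0.8]   # genetic mixture

_, var_e = mixture_moments(P_e, mu, s2_e)     # residual variance
_, var_a = mixture_moments(P_a, alpha, s2_a)  # genetic variance

# Phenotypic mixture: one univariate normal component per (k, m) pair.
props = [pe * pa for pe in P_e for pa in P_a]
means = [m + a for m in mu for a in alpha]
variances = [ve + va for ve in s2_e for va in s2_a]
mean_y, var_y = mixture_moments(props, means, variances)

print(f"Var(y) = {var_y:.4f}; genetic + residual = {var_a + var_e:.4f}")
```

With these values the two printed quantities agree, as expected when the genetic and residual component labels are drawn independently.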

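A standard problem in quantitative genetics is that of inferring genetic values from phenotypes. From (7) and (8), the density of the conditional distribution of $a_i$ given $y_i$ is

$$p(a_i \mid y_i) = \sum_{k=1}^{G_E}\sum_{m=1}^{G_A} Q_{km}\, N\left(a_i \mid \alpha_m + b_{km}(y_i - \mu_k - \alpha_m), \sigma^2_{a_m}(1 - b_{km})\right), \tag{11}$$
where

$$Q_{km} = \frac{P_{e_k}P_{a_m}\, N(y_i \mid \mu_k + \alpha_m, \sigma^2_{a_m} + \sigma^2_{e_k})}{\sum_{k'}\sum_{m'} P_{e_{k'}}P_{a_{m'}}\, N(y_i \mid \mu_{k'} + \alpha_{m'}, \sigma^2_{a_{m'}} + \sigma^2_{e_{k'}})}.$$

Hence, the conditional distribution of $a_i$ given $y_i$ is a mixture of the $G_E \times G_A$ normal distributions $N\left(a_i \mid \alpha_m + b_{km}(y_i - \mu_k - \alpha_m), \sigma^2_{a_m}(1 - b_{km})\right)$, where the mixing proportion is $Q_{km}$, the conditional probability that the datum is drawn from $N(y_i \mid \mu_k + \alpha_m, \sigma^2_{a_m} + \sigma^2_{e_k})$, given the observation $y_i$. The best predictor of genetic value is the conditional expectation function

$$E(a_i \mid y_i) = \sum_{k=1}^{G_E}\sum_{m=1}^{G_A} Q_{km}\left[\alpha_m + b_{km}(y_i - \mu_k - \alpha_m)\right] \tag{12}$$

(HENDERSON 1973; BULMER 1980; FERNANDO and GIANOLA 1986; SEARLE et al. 1992), which is a weighted average of the conditional expectations peculiar to each of the $G_E \times G_A$ components of mixture (11). This result is important: the regression of genotype on phenotype is not linear in $y_i$, because the weights $Q_{km}$ themselves depend on $y_i$. Therefore, standard linear models give less than optimal predictions of genetic effects for traits distributed as mixtures. Further, using (A3) in the APPENDIX, the variance of the conditional distribution is

$$\mathrm{Var}(a_i \mid y_i) = \sum_{k}\sum_{m} Q_{km}\,\sigma^2_{a_m}(1 - b_{km}) + \sum_{k}\sum_{m} Q_{km}\left[\alpha_m + b_{km}(y_i - \mu_k - \alpha_m)\right]^2 - \left\{\sum_{k}\sum_{m} Q_{km}\left[\alpha_m + b_{km}(y_i - \mu_k - \alpha_m)\right]\right\}^2. \tag{13}$$

In the standard additive genetic linear model, the variance of the conditional distribution of genotypes given phenotypes is $\sigma^2_a(1 - h^2)$ (FALCONER 1989), where $h^2$ is the coefficient of heritability; this conditional variance is homogeneous and does not depend on the data. In a mixture model, however, the dispersion about the regression function is heteroscedastic and nonlinear on the phenotypic value. Hence, both point and interval predictions of genetic value in mixtures involve strikingly different formulas.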
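The nonlinearity of the regression of genotype on phenotype can be seen with a small numerical sketch. It assumes the structure described above: each phenotypic component has mean $\mu_k + \alpha_m$ and variance $\sigma^2_{a_m} + \sigma^2_{e_k}$, and the within-component regression is $b_{km} = \sigma^2_{a_m}/(\sigma^2_{a_m} + \sigma^2_{e_k})$. The function name `predict_genetic_value` and the parameter values are hypothetical.

```python
from statistics import NormalDist


def predict_genetic_value(y, P_e, mu, s2_e, P_a, alpha, s2_a):
    """Return (E[a | y], Var[a | y]) under the two-level Gaussian mixture."""
    weights, cond_means, cond_vars = [], [], []
    for pe, m, ve in zip(P_e, mu, s2_e):
        for pa, al, va in zip(P_a, alpha, s2_a):
            b = va / (va + ve)  # within-component heritability b_km
            dens = NormalDist(m + al, (va + ve) ** 0.5).pdf(y)
            weights.append(pe * pa * dens)  # unnormalized Q_km
            cond_means.append(al + b * (y - m - al))
            cond_vars.append(va * (1.0 - b))
    total = sum(weights)
    q = [w / total for w in weights]  # Q_km, summing to 1
    mean = sum(qi * cm for qi, cm in zip(q, cond_means))
    second = sum(qi * (cv + cm * cm) for qi, cm, cv in zip(q, cond_means, cond_vars))
    return mean, second - mean ** 2


# Hypothetical example: two residual components, homogeneous genetic distribution.
params = dict(P_e=[0.5, 0.5], mu=[0.0, 3.0], s2_e=[1.0, 1.0],
              P_a=[1.0], alpha=[0.0], s2_a=[1.0])
for y in (-3.0, 0.0, 1.5, 3.0, 6.0):
    mean, var = predict_genetic_value(y, **params)
    print(f"y = {y:5.1f}  E(a|y) = {mean:7.3f}  Var(a|y) = {var:6.3f}")
```

With a single component at each level the function returns the familiar linear predictor and a constant conditional variance; with more than one component both quantities change with $y$.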


    TRUNCATION SELECTION
 
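Consider the standard truncation selection setting in which individuals kept as parents are such that $y_i > t$, with the proportion of individuals selected being $\Pr(y_i > t) = \gamma$. From (8), the distribution of phenotypic values within selected individuals has density

$$p(y_i \mid y_i > t) = \frac{1}{\gamma}\sum_{k=1}^{G_E}\sum_{m=1}^{G_A} P_{e_k}P_{a_m}\, N(y_i \mid \mu_k + \alpha_m, \sigma^2_{a_m} + \sigma^2_{e_k}), \quad y_i > t,$$
where

$$\gamma = \sum_{k=1}^{G_E}\sum_{m=1}^{G_A} P_{e_k}P_{a_m}\gamma_{km}, \qquad \gamma_{km} = 1 - \Phi\left(\frac{t - \mu_k - \alpha_m}{\sqrt{\sigma^2_{a_m} + \sigma^2_{e_k}}}\right). \tag{14}$$

Above, $\gamma_{km}$ is the proportion selected within the $km$th mixture component and $\Phi(\cdot)$ is the standard normal distribution function. The proportion selected $\gamma$ is, thus, a weighted average of the individual component selection proportions $\gamma_{km}$. Since the threshold is fixed, the components that are most prevalent, have largest means, and are most variable will be influential.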

The mean value of selected individuals is

$$E(y_i \mid y_i > t) = \sum_{k=1}^{G_E}\sum_{m=1}^{G_A} w_{km}\left[\mu_k + \alpha_m + i_{km}\sqrt{\sigma^2_{a_m} + \sigma^2_{e_k}}\right], \tag{15}$$
where $i_{km}$ is the selection intensity factor under the $km$th component (FALCONER 1989) and

$$w_{km} = \frac{P_{e_k}P_{a_m}\gamma_{km}}{\gamma}$$
are relative weights summing to 1. The phenotypic superiority of selected individuals or selection differential $S$ is given by the difference between (15) and (9). Further, the mean genetic value of selected parents is

$$E(a_i \mid y_i > t) = \int_t^{\infty} E(a_i \mid y_i)\, p(y_i \mid y_i > t)\, dy_i.$$

Employing (12),

$$E(a_i \mid y_i > t) = \int_t^{\infty} \left\{\sum_{k}\sum_{m} Q_{km}\left[\alpha_m + b_{km}(y_i - \mu_k - \alpha_m)\right]\right\} p(y_i \mid y_i > t)\, dy_i.$$

This expression cannot be evaluated analytically, because it is a highly nonlinear function of the phenotypic values. However, it can be approximated by Monte Carlo procedures, e.g., by drawing samples from the bivariate mixture (7). Accept the draws in which $y > t$, calculate $E(a_i \mid y_i)$ for each accepted $y$, and then average this quantity over the samples kept. Finally, the genetic superiority of accepted parents over the unselected population is

$$\Delta_a = E(a_i \mid y_i > t) - E(a_i).$$

The expected fraction of the selection differential that is realized can be assessed as $\Delta_a/S$, and this will differ from what could be expected from the regression of offspring on midparent, because of nonlinearity (see the following section).
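The Monte Carlo procedure just described can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the seed, the sample size, and the parameter values are all hypothetical, and the predictor is the weighted average of component-specific regressions with weights proportional to $P_{e_k}P_{a_m}$ times the component density of $y$.

```python
import random
from statistics import NormalDist, fmean


def best_predictor(y, P_e, mu, s2_e, P_a, alpha, s2_a):
    """E[a | y]: weighted average of the component-specific linear regressions."""
    num = den = 0.0
    for pe, m, ve in zip(P_e, mu, s2_e):
        for pa, al, va in zip(P_a, alpha, s2_a):
            w = pe * pa * NormalDist(m + al, (va + ve) ** 0.5).pdf(y)
            num += w * (al + va / (va + ve) * (y - m - al))
            den += w
    return num / den


def selection_response(t, n, seed, P_e, mu, s2_e, P_a, alpha, s2_a):
    """Monte Carlo estimates of (proportion selected, S, delta_a) for threshold t."""
    rng = random.Random(seed)
    kept_y = []
    for _ in range(n):
        # Draw the component labels, then the genetic value, then the phenotype.
        k = rng.choices(range(len(P_e)), weights=P_e)[0]
        m = rng.choices(range(len(P_a)), weights=P_a)[0]
        a = rng.gauss(alpha[m], s2_a[m] ** 0.5)
        y = rng.gauss(mu[k] + a, s2_e[k] ** 0.5)
        if y > t:  # truncation selection: accept only phenotypes above t
            kept_y.append(y)
    mean_a = sum(p * al for p, al in zip(P_a, alpha))
    mean_y = sum(p * m for p, m in zip(P_e, mu)) + mean_a
    S = fmean(kept_y) - mean_y  # selection differential
    preds = [best_predictor(y, P_e, mu, s2_e, P_a, alpha, s2_a) for y in kept_y]
    delta_a = fmean(preds) - mean_a  # genetic superiority of selected parents
    return len(kept_y) / n, S, delta_a


# Hypothetical two-component residual mixture, homogeneous genetic distribution.
gamma, S, delta_a = selection_response(
    t=1.5, n=20_000, seed=1,
    P_e=[0.7, 0.3], mu=[0.0, 2.0], s2_e=[1.0, 1.0],
    P_a=[1.0], alpha=[0.0], s2_a=[1.0],
)
print(f"selected = {gamma:.3f}, S = {S:.3f}, delta_a = {delta_a:.3f}, "
      f"delta_a / S = {delta_a / S:.3f}")
```

In the homogeneous case (one component at each level) the ratio `delta_a / S` returned by this sketch equals the heritability; comparing that with the ratio obtained for a mixture gives a feel for the fraction of the selection differential that is realized.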

Effects of truncation selection upon a heterogeneous population, i.e., a mixture, have been studied extensively in quantitative genetics. For example, HILL (1974) and BIJMA and WOOLLIAMS (1999) gave formulas for prediction of response suitable for age-structured populations or for overlapping generations. Also, LATTER (1965), LANDE (1976), and KIMURA and CROW (1978) addressed consequences of truncation selection when there is some grouping structure in a population, e.g., caused by genes of large effects. To illustrate, suppose that a genetic mixture derives from a major locus with two alleles. Prior to selection, the mixing proportions (frequencies) of the three genotypes are, in our notation, Formula 15, and Formula 15. Also, suppose that the environmental distribution is zero-mean normal, with variance Formula 15, independent of the genetic distribution, and that the polygenic genetic variance is homoscedastic and equal to Formula 15 (equivalently, the within major genotype genetic variance is constant). Using (14), the overall selection proportion is Formula 15, and the genotypic frequencies after selection become Formula 15. After selection, employing (15), the phenotypic distribution has mean value

Formula 15

FALCONER (1989) gives approximate expressions for relative fitness of genotypes, e.g., $\gamma_2/\gamma_1$. Note that, under these assumptions, the phenotypic distribution remains a mixture, irrespective of the number of cycles of selection. The means and variance change, however, due to the modification of the genotypic frequencies produced by selection.


    COVARIANCE BETWEEN RELATIVES
 
General:
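The fraction of variance attributable to additive genetic effects (usual definition of heritability) is location invariant for a Gaussian trait, i.e., it does not involve mean values. In a mixture, "heritability" becomes

$$\frac{\sigma^2_a}{\sigma^2_a + \sigma^2_e} = \frac{\sum_m P_{a_m}\sigma^2_{a_m} + \sum_m P_{a_m}\alpha_m^2 - \left(\sum_m P_{a_m}\alpha_m\right)^2}{\sum_m P_{a_m}\sigma^2_{a_m} + \sum_m P_{a_m}\alpha_m^2 - \left(\sum_m P_{a_m}\alpha_m\right)^2 + \sum_k P_{e_k}\sigma^2_{e_k} + \sum_k P_{e_k}\mu_k^2 - \left(\sum_k P_{e_k}\mu_k\right)^2}. \tag{16}$$

The partition of variance depends on component-specific variances ($\sigma^2_{a_m}$, $\sigma^2_{e_k}$), on mixing proportions ($P_{a_m}$ and $P_{e_k}$), and on mean values ($\mu_k$ and $\alpha_m$) as well. In the simpler case in which the genetic distribution is the homogeneous process $a_i \sim N(0, \sigma^2_a)$, heritability becomes

$$\frac{\sigma^2_a}{\sigma^2_a + \sum_k P_{e_k}\sigma^2_{e_k} + \sum_k P_{e_k}\mu_k^2 - \left(\sum_k P_{e_k}\mu_k\right)^2} \tag{17}$$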
and this is expected to be lower than in a homogeneous population because fixed effects contribute to variance. If the residual variance is homoscedastic across mixture components, this reduces further to

Formula 18(18)
where Formula 18. Heterogeneity in means reduces heritability in a mixture Formula 18, and the standard h2 is obtained only when the sampling model invokes a draw from a single component distribution.

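The covariance between phenotypes of related individuals $i$ and $i'$ is

$$\mathrm{Cov}(y_i, y_{i'}) = \mathrm{Cov}(a_i, a_{i'}) \tag{19}$$
after assuming that phenotypes (given the additive genetic values $a_i$, $a_{i'}$) are conditionally independent. To develop the covariance between genetic values further, we assume that these are distributed as the bivariate mixture

$$p(a_i, a_{i'}) = \sum_{m=1}^{G_A} P_{a_m}\, N_2\left( \begin{bmatrix} a_i \\ a_{i'} \end{bmatrix} \Big|\; \begin{bmatrix} \alpha_m \\ \alpha_m \end{bmatrix}, \begin{bmatrix} 1 & A_{ii'} \\ A_{ii'} & 1 \end{bmatrix}\sigma^2_{a_m} \right),$$
where $N_2(\cdot \mid \cdot, \cdot)$ denotes a bivariate normal distribution and $A_{ii'}$ is the additive relationship between the two individuals, assumed constant across all components of the mixture; inbreeding is supposed to be nil. It follows directly that each of the genetic values has the mixture (4) as marginal distribution. Then

$$\mathrm{Cov}(a_i, a_{i'}) = A_{ii'}\sum_{m=1}^{G_A} P_{a_m}\sigma^2_{a_m} + \sum_{m=1}^{G_A} P_{a_m}\alpha_m^2 - \left(\sum_{m=1}^{G_A} P_{a_m}\alpha_m\right)^2. \tag{20}$$

This reduces to $A_{ii'}\sum_m P_{a_m}\sigma^2_{a_m}$ if $\alpha_m = 0$ for every $m$ and to the standard $A_{ii'}\sigma^2_a$ in the absence of heterogeneity in the distribution of genetic effects.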

Regression of offspring on parent:
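Using (10) and (20), the standard formula for the regression of the phenotypic value of a progeny (O) on that of a parent (P) (with $A_{OP} = 1/2$) gives

$$\beta_{OP} = \frac{\mathrm{Cov}(y_O, y_P)}{\mathrm{Var}(y_P)} = \frac{\frac{1}{2}\sum_m P_{a_m}\sigma^2_{a_m} + \sum_m P_{a_m}\alpha_m^2 - \left(\sum_m P_{a_m}\alpha_m\right)^2}{\sigma^2_a + \sigma^2_e}. \tag{21}$$

If the distribution of genetic effects is homogeneous, this simplifies to

$$\beta_{OP} = \frac{\frac{1}{2}\sigma^2_a}{\sigma^2_a + \sum_k P_{e_k}\sigma^2_{e_k} + \sum_k P_{e_k}\mu_k^2 - \left(\sum_k P_{e_k}\mu_k\right)^2}. \tag{22}$$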

The consequences of (21) and (22) are clear: if there is heterogeneity in the distribution either of sampling model residuals or of genetic effects, then $\beta_{OP}$ is affected by the mixing proportions and by the means $\mu_k$. To illustrate, suppose that the genetic distribution is homogeneous; let $G_E = 2$, take $\mu_1 = 0$ as "origin," $\mu_2 = \Delta$, and Formula 22. Then (22) is expressible as

Formula 22
When $P_e = 1$, the formula gives half of heritability, which is a standard result (FALCONER 1989). The function is symmetric with respect to $P_e$; since $P_e(1 - P_e)$ is maximum at $P_e = 1/2$, the regression is minimum at this value. As an example, consider the offspring–parent regression as a function of $P_e$ for four situations with different additive genetic variance ($\sigma^2_a$) and distances between means ($\Delta$) in the two distributions of the mixture: (1) Formula 22; (2) Formula 22; (3) Formula 22; and (4) Formula 22. Situations 1 and 2 correspond to a trait with a heritability of 0.50 under homogeneity, while 3 and 4 are for a lowly heritable trait ($h^2 \approx 0.09$). In 1 and 2, the regression $\beta_{OP}$ decreases from 0.25 to ~0.22 and 0.17, respectively, representing relative decreases in heritability of 12 and 32%. The relative decreases in heritability are 18 and 47% in cases 3 and 4, respectively. In brief, heritability in heterogeneous or admixed populations depends on the mixing proportion, on the mean difference between mixture components, and on the "homogeneous situation" heritability.
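The dependence of the offspring–parent regression on the mixing proportion can be explored with a short sketch. It assumes a homogeneous genetic distribution, two residual components with means 0 and $\Delta$, and a common residual variance, so that the variance between means is $P_e(1 - P_e)\Delta^2$. The residual variance of 1 and the grid of genetic variances and mean differences below are assumed values, picked so that heritability under homogeneity is 0.50 or about 0.09; they are not taken from the article.

```python
def offspring_parent_regression(s2_a, s2_e, delta, p_e):
    """Half the additive variance over the phenotypic variance of a
    two-component residual mixture with means 0 and delta."""
    between = p_e * (1.0 - p_e) * delta ** 2  # variance between component means
    return 0.5 * s2_a / (s2_a + s2_e + between)


# Assumed illustrative settings: (additive variance, distance between means).
cases = [(1.0, 1.0), (1.0, 2.0), (0.1, 1.0), (0.1, 2.0)]
for s2_a, delta in cases:
    homogeneous = offspring_parent_regression(s2_a, 1.0, delta, 1.0)
    admixed = offspring_parent_regression(s2_a, 1.0, delta, 0.5)
    drop = 100.0 * (1.0 - admixed / homogeneous)
    print(f"s2_a = {s2_a:3.1f}, delta = {delta:3.1f}: "
          f"beta {homogeneous:.3f} -> {admixed:.3f} ({drop:.0f}% lower at P_e = 0.5)")
```

Under these assumed settings the regression at $P_e = 0.5$ is close to the values quoted in the text, which suggests, but does not confirm, that similar settings were used there.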


    COVARIANCE STRUCTURE
 
Correlations with a Gaussian trait:
Correlations between a mixture trait and a normally distributed character may be of interest. For example, the mixture trait could be SCC in dairy cattle, with several component distributions corresponding to different unknown statuses of mammary gland disease. The Gaussian trait could be milk yield of a cow. Is the genetic correlation between the two traits affected by heterogeneity of somatic cell count?

Let the model for the Gaussian trait $w$ be

$$w_i = \mu_w + a_{wi} + e_{wi}, \tag{23}$$
where $\mu_w$ is the mean of the trait, $a_{wi} \sim N(0, \sigma^2_{a_w})$ is the additive genetic value of individual $i$ for trait $w$, and $e_{wi} \sim N(0, \sigma^2_{e_w})$ is a residual effect, independent of $a_{wi}$; $\sigma^2_{a_w}$ and $\sigma^2_{e_w}$ are genetic and residual components of variance, respectively. The phenotypic distribution is, thus, $w_i \sim N(\mu_w, \sigma^2_{a_w} + \sigma^2_{e_w})$.

Suppose that the distribution of mixture trait y has two components at each of the genetic and residual levels; i.e., GA = GE = 2. Assume further that

Formula 24(24)

The distribution of each of the two genetic effects entering into the mixture for trait $y$ ($a_1$, $a_2$) is centered at $\alpha$, but has component-specific variances. These two genetic effects may be imperfectly correlated, with genetic covariance Formula 24; for instance, different genes affecting somatic cell count are expressed under the unknown "mastitis" and "no-mastitis" disease conditions generating the mixture. Also, the genetic covariance with the Gaussian trait may be specific to each of the two components. The component-specific residual effects ($e_1$, $e_2$) are heteroscedastic but uncorrelated, as it is not possible to observe "disease" and "no disease" in the same individual at the same time. However, a residual correlation with the Gaussian trait is allowed and assumed peculiar to each component distribution.

Recall from (4) that Formula 24. Now, let {partial}m take the value 1 when the draw is from component m and 0 otherwise. The joint density under m is

Formula 24

so that, unconditionally

Formula 24

Then

Formula 25(25)

Using (5) and (25), and taking $\alpha = 0$, the genetic correlation between the mixture-distributed trait and the Gaussian character is

Formula 26(26)

This reduces to the standard Formula 26 when the distribution of genetic effects for trait y is homogeneous and to

Formula 27(27)
when Formula 27.

The effect of Formula 27 on genetic correlation (27) is illustrated next. Let Formula 27 be a heteroscedasticity factor, where Formula 27, the genetic variance under the first component of the mixture, is viewed as "baseline" genetic variance, i.e., a measure of variability in the absence of heterogeneity. Then

Formula 28(28)
where $\rho_{homo}$ is the genetic correlation in the absence of a mixture and

Formula 28
is the factor by which $\rho_{homo}$ is modified by heterogeneity. Since the sign of Formula 28 is invariant with respect to Formula 28, it suffices to examine function (28) only under positive values of $\rho_{homo}$. Figure 1 displays the relationship between the genetic correlation (28) and Formula 28 for two values of $\rho_{homo}$ (0.7 and 0.3) and of $\lambda$ (1.5 and 2). As Formula 28 increases, the proportion of the component with larger genetic variance ($m = 2$) decreases. The genetic correlation increases monotonically with Formula 28 and more rapidly so at the largest value of genetic heteroscedasticity. Suppose that $w$ is total lactation milk yield in dairy cows and that $y$ is SCC, a mixture trait resulting from the fact that some cows have mastitis (~20–40%). There is evidence (e.g., HERINGSTAD et al. 2006) that the genetic variance of somatic cell count is ~2.5 times larger in healthy than in diseased (clinical cases) cows. Under the assumptions leading to (28), our model predicts that the genetic correlation between milk yield and somatic cell score would decrease as the frequency of mastitis in the population decreases. Similar algebra and considerations hold for the environmental correlation between traits.
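The qualitative behavior described for Figure 1 can be reproduced with a small sketch. It assumes that the genetic covariance with the Gaussian trait is the same in both components and that the genetic means are zero, in which case the correlation is $\rho_{homo}$ divided by the square root of $P + (1 - P)\lambda$, with $P$ the proportion of the baseline component and $\lambda$ the ratio of the second to the first genetic variance. This functional form is derived here from the definitions in the text and is an assumption about (28); the function name is hypothetical.

```python
def genetic_correlation(rho_homo, p_baseline, lam):
    """Genetic correlation between a Gaussian trait and a two-component mixture
    trait, assuming equal genetic covariances and zero genetic means."""
    # Average genetic variance of the mixture, in units of the baseline variance.
    relative_variance = p_baseline + (1.0 - p_baseline) * lam
    return rho_homo / relative_variance ** 0.5


for rho_homo in (0.7, 0.3):
    for lam in (1.5, 2.0):
        values = [genetic_correlation(rho_homo, p / 10.0, lam) for p in range(0, 11, 2)]
        row = ", ".join(f"{v:.3f}" for v in values)
        print(f"rho_homo = {rho_homo}, lambda = {lam}: {row}")
```

Each printed row runs from a baseline proportion of 0 to 1 and increases monotonically toward $\rho_{homo}$, more steeply for the larger heteroscedasticity factor.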


FIGURE 1. Genetic correlation (Rho) between a Gaussian character and a mixture trait for a two-component mixture, as a function of the mixing proportion (Formula 28), for different combinations of $\rho_{homo}$, genetic correlation in absence of mixture, and $\lambda$, heteroscedasticity factor. From top to bottom: (1) $\rho_{homo}$ = 0.7, $\lambda$ = 1.5 (open squares); (2) $\rho_{homo}$ = 0.7, $\lambda$ = 2 (dotted line); (3) $\rho_{homo}$ = 0.3, $\lambda$ = 1.5 (solid line); (4) $\rho_{homo}$ = 0.3, $\lambda$ = 2 (open circles).

 
Consider again the joint distribution (24), and write

Formula 29(29)
where the random variable $\delta_e$ takes the value 1 or 0 with probabilities $P_e$ and $1 - P_e$, respectively; $\delta_a$ is another binary variable taking the values 1 or 0 with probabilities $P_a$ and $1 - P_a$, respectively, and distributed independently of $\delta_e$. These two binary variates are assumed to be independent of $a_{wi}$ and $e_{wi}$ entering into the model for $w_i$ in (23). Then,

Formula 29
and

Formula 29

The phenotypic covariance between y and w is, therefore,

Formula 29

Since the second term is null

Formula 30(30)

Above, Formula 30 is the genetic covariance, as in (25), and Formula 30 is the residual covariance. Collecting (30), (10), plus the fact that Formula 30, and assuming that $\alpha = 0$, yields as phenotypic correlation

Formula 31(31)

The phenotypic correlation depends not only on the underlying components of variance and covariance, but also on the mixing proportions and population means.

If the genetic distribution is homoscedastic Formula 31, and heterogeneity is at the level of the sampling model only, but with Formula 31 and Formula 31, the phenotypic correlation becomes

Formula 32(32)
where Formula 32 is the phenotypic correlation in the absence of a mixture for the residual distribution and Formula 32. To illustrate, take $\mu_1 = 0$ as origin, $\mu_2 = \Delta$, and $\sigma_y = 1$. Then

Formula 33(33)

The phenotypic correlation has a minimum at $P_e = 0.5$ if $\rho_{homo}$ is positive; however, it is maximum at this value of the mixing proportion if $\rho_{homo}$ is negative. Effects of $P_e$ and of $\Delta$ on the phenotypic correlation are shown in Figure 2, for $\rho_{homo} = 0.7$ and $\Delta = 1$ and 2. The function is symmetric and steeper as $\Delta$ increases. For $\Delta = 2$, the correlation decreases from 0.7 (for $P_e = 0$ or 1) to a minimum of ~0.50. The curves are inverted if $\rho_{homo}$ is negative. In short, if the value of Formula 33 is used to measure admixture in the residual distribution, the phenotypic correlation decreases with admixture if it is positive in a homogeneous population. On the other hand, $\rho_{yw}$ increases with admixture if negative under the homogeneous situation.
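The shape of the curves in Figure 2 can be sketched numerically. The function below assumes that admixture adds the between-means variance $P_e(1 - P_e)\Delta^2$ to a unit phenotypic variance while leaving the covariance with the Gaussian trait unchanged, so that the correlation is $\rho_{homo}$ divided by the square root of $1 + P_e(1 - P_e)\Delta^2$. This form is an assumption about (33), inferred from the surrounding text; the function name is hypothetical.

```python
def phenotypic_correlation(rho_homo, p_e, delta):
    """Phenotypic correlation under a two-component residual mixture with
    means 0 and delta, assuming unit phenotypic variance without admixture."""
    inflation = 1.0 + p_e * (1.0 - p_e) * delta ** 2  # relative phenotypic variance
    return rho_homo / inflation ** 0.5


for delta in (1.0, 2.0):
    curve = [phenotypic_correlation(0.7, p / 10.0, delta) for p in range(11)]
    print(f"delta = {delta}: " + ", ".join(f"{v:.3f}" for v in curve))
```

For $\Delta = 2$ the minimum of the printed curve, reached at $P_e = 0.5$, is about 0.49, in line with the value of roughly 0.50 mentioned above.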


FIGURE 2. Phenotypic correlation (Rho) between a Gaussian character and a mixture trait for a two-component mixture, as a function of the mixing proportion ($P_e$). (1) $\rho_{homo}$ = 0.7, $\Delta$ = 1 (solid line); (2) $\rho_{homo}$ = 0.7, $\Delta$ = 2 (line with squares). $\rho_{homo}$, phenotypic correlation in the absence of a mixture; $\Delta$, difference between means of the two distributions.

 
Correlations between two mixture-distributed characters:
Suppose now that measurements are available for traits $y$ and $z$. For simplicity, it is assumed that the joint distribution of $y$ and $z$ arises from a two-component mixture of bivariate normal distributions at the level of the sampling model (that is, given the genetic effects) and from a two-component bivariate normal mixture of genetic effects. Given the independently distributed binary indicator variables $\delta_e$ and $\delta_a$, one can write

Formula 34(34)

Then

Formula 35(35)

Assuming independence between genetic and environmental effects in the distributions, it follows that

Formula 35

where Formula 35 and Formula 35 are the additive and environmental components of covariance between $y$ and $z$ under $m$, and Formula 35 and Formula 35 are potential cross-mixture covariances. Further, under the assumption of a null mean of the component-specific genetic and environmental distributions, as well as of independence of the binary indicator variables $\delta_e$ and $\delta_a$,

Formula 35
and

Formula 35

Since $\delta_e$ and $\delta_a$ are Bernoulli, Formula 35, Formula 35, Formula 35, and Formula 35. Then

Formula 35
and

Formula 35

Employing the preceding results in (35),

Formula 36(36)

Under the assumption that the underlying genetic distributions have zero means, the genetic correlation between the two mixture traits is

Formula 36

The environmental correlation takes the form

Formula 36

and the phenotypic correlation follows from (36) and (10), after setting all $\alpha$'s to 0.


    CONCLUSION
 
Some basic results of standard theory of quantitative genetics under additive inheritance were extended to finite mixture models with Gaussian components. It was found that, if there is heterogeneity in a population at either the genetic or the environmental level, then genetic parameters based on theory treating distributions as homogeneous can lead to misleading interpretations. Some peculiarities of mixture characters are: heritability depends on the mean values of the populations, the offspring–parent regression is nonlinear, and genetic or phenotypic correlations cannot be interpreted devoid of the mixture proportions and of the parameters of the component distributions. Nonlinearity of the offspring–parent regression has been studied before: by ROBERTSON (1977) and GIMELFARB (1986) under dominance and by IM and GIANOLA (1988) for binary traits. GIMELFARB (1986) gave conditions under which the regression would be nearly linear under dominance, e.g., a large number of loci affecting the trait or mild dominance and gene frequencies not far from 0.5. Our results illustrate that nonlinearity can also arise from heterogeneity at the environmental level caused by, for instance, omitting relevant covariates in a linear model for quantitative genetic analysis.

Clearly, standard models for quantitative traits can lead to erroneous results if fitted to heterogeneous data. If a mixture is suspected, two suitable methods for inferring unknown mixture parameters are maximum-likelihood and Bayesian analyses. Procedures for likelihood- or posterior-based inference applied to mixtures are discussed extensively in TITTERINGTON et al. (1985) and MCLACHLAN and PEEL (2000), including situations in which the component distributions are not normal, e.g., skewed survival processes. Implementations suitable for fitting different types of quantitative genetic mixture models have been described and applied by ØDEGÅRD et al. (2003, 2005), GIANOLA et al. (2004), and BOETTCHER et al. (2005). Prediction of breeding values is discussed in GIANOLA (2005). Suitable software for the analysis of mixtures with random effects is available in a forthcoming update of Version 6.0 of the DMU package (MADSEN and JENSEN 2002).

An important issue is how many components should be fitted in a mixture model. Testing for the number of components is a difficult matter, and there may not be a one-to-one correspondence between the number of components fitted and the number of heterogeneous groups (MCLACHLAN and PEEL 2000). For example, several groups may be hidden behind an apparently bimodal distribution, due to limited sample size. Also, if some of the component distributions are skewed, typically the number of components needed for a good fit is larger than the number of groups causing heterogeneity. Probably the most elegant procedures for inferring the number of components needed are Bayesian implementations via the reversible-jump algorithm (RICHARDSON and GREEN 1997), but computations can be extremely taxing.


    APPENDIX
 
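The first and second moments, and the variance of a finite mixture of $K$ Gaussian distributions, with parameters $(P_k, \mu_k, \sigma^2_k)$, $k = 1, 2, \ldots, K$, where the mixture proportions $P_k$ are such that $\sum_{k=1}^{K} P_k = 1$, are

$$E(y) = \sum_{k=1}^{K} P_k\mu_k, \tag{A1}$$

$$E(y^2) = \sum_{k=1}^{K} P_k\left(\mu_k^2 + \sigma^2_k\right), \tag{A2}$$

and

$$\mathrm{Var}(y) = \sum_{k=1}^{K} P_k\sigma^2_k + \sum_{k=1}^{K} P_k\mu_k^2 - \left(\sum_{k=1}^{K} P_k\mu_k\right)^2. \tag{A3}$$

The first term in (A3) can be interpreted as an average variance, while $\sum_k P_k\mu_k^2 - \left(\sum_k P_k\mu_k\right)^2$ measures dispersion between group means or heterogeneity; if the $\mu$'s are equal, this term is null. The variance of the mixture depends not only on individual component variances, but also on group means. If the components have homogeneous variance $\sigma^2$,

$$\mathrm{Var}(y) = \sigma^2 + \sum_{k=1}^{K} P_k\mu_k^2 - \left(\sum_{k=1}^{K} P_k\mu_k\right)^2. \tag{A4}$$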


    ACKNOWLEDGEMENTS
 
Research was supported by the Wisconsin Agriculture Experiment Station and by grants National Research Initiatives Competitive Grants Program/U.S. Department of Agriculture 2003-35205-12833, National Science Foundation (NSF) DEB-0089742, and NSF DMS-044371. Financial support from the Babcock Institute for International Dairy Research and Development, University of Wisconsin, Madison, is acknowledged.


    LITERATURE CITED
 

BIJMA, P., and J. A. WOOLLIAMS, 1999 Prediction of genetic contributions and generation intervals in populations with overlapping generations under selection. Genetics 151: 1197–1210.[Abstract/Free Full Text]

BOETTCHER, P. J., P. MORONI, G. PISONI and D. GIANOLA, 2005 Application of a finite mixture model to somatic cell scores of Italian goats. J. Dairy Sci. 88: 2209–2216.[Abstract/Free Full Text]

BULMER, M. G., 1980 The Mathematical Theory of Quantitative Genetics. Clarendon Press, Oxford.

DETILLEUX, J., and P. L. LEROY, 2000 Application of a mixed normal mixture model to the estimation of mastitis-related parameters. J. Dairy Sci. 83: 2341–2349.

FERNANDO, R. L., and D. GIANOLA, 1986 Optimal properties of the conditional mean as a selection criterion. Theor. Appl. Genet. 72: 822–825.

GIANOLA, D., 2005 Prediction of random effects in finite mixture models with Gaussian components. J. Anim. Breed. Genet. 122: 145–160.

GIANOLA, D., J. ØDEGÅRD, B. HERINGSTAD, G. KLEMETSDAL, D. SORENSEN et al., 2004 Mixture model for inferring susceptibility to mastitis in dairy cattle: a procedure for likelihood-based inference. Genet. Sel. Evol. 36: 3–27.

FALCONER, D. S., 1989 Introduction to Quantitative Genetics. Longman, Burnt Mill, Harlow, UK.

FISHER, R. A., 1918 The correlation between relatives on the supposition of Mendelian inheritance. Trans. R. Soc. Edinb. 52: 399–433.

GIMELFARB, A., 1986 Offspring-parent regression: How linear is it? Biometrics 42: 67–71.

HALEY, C. S., and S. A. KNOTT, 1992 A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69: 315–324.

HENDERSON, C. R., 1973 Sire evaluation and genetic trends, pp. 10–41 in Proceedings of the Animal Breeding and Genetics Symposium in Honor of Dr. Jay L. Lush. American Society of Animal Science and American Dairy Science Association, Champaign, IL.

HERINGSTAD, B., G. KLEMETSDAL and J. RUANE, 2000 Selection for mastitis resistance in dairy cattle: a review with focus on the situation in the Nordic countries. Livest. Prod. Sci. 64: 95–106.

HERINGSTAD, B., D. GIANOLA, Y. M. CHANG, J. ØDEGÅRD and G. KLEMETSDAL, 2006 Genetic associations between clinical mastitis and somatic cell score in early first lactation cows. J. Dairy Sci. 89: 2236–2244.

HILL, W. G., 1974 Prediction and evaluation of response to selection with overlapping generations. Anim. Prod. 18: 117–139.

IM, S., and D. GIANOLA, 1988 Offspring-parent regression for a binary trait. Theor. Appl. Genet. 75: 720–722.

KIMURA, M., and J. F. CROW, 1978 Effect of overall phenotypic selection on genetic change at individual loci. Proc. Natl. Acad. Sci. USA 75: 6168–6171.

LANDE, R., 1976 Natural selection and random genetic drift in phenotypic evolution. Evolution 30: 314–334.

LATTER, B. D. H., 1965 The response to artificial selection due to autosomal genes of large effect. Aust. J. Biol. Sci. 18: 585–598.

LYNCH, M., and B. WALSH, 1998 Genetics and Analysis of Quantitative Traits. Sinauer, Sunderland, MA.

MADSEN, P., and J. JENSEN, 2002 A User's Guide to DMU. A Package for Analysing Mixed Models, Version 6, Release 4.3. Danish Institute of Agricultural Sciences, Tjele, Denmark.

MCLACHLAN, G., and D. PEEL, 2000 Finite Mixture Models. Wiley, New York.

ØDEGÅRD, J., J. JENSEN, P. MADSEN, D. GIANOLA, G. KLEMETSDAL et al., 2003 Mixture models for detection of mastitis in dairy cattle using test-day somatic cell scores: a Bayesian approach via Gibbs sampling. J. Dairy Sci. 86: 3694–3703.

ØDEGÅRD, J., P. MADSEN, D. GIANOLA, G. KLEMETSDAL, J. JENSEN et al., 2005 A Bayesian threshold-normal mixture model for analysis of a continuous mastitis-related trait. J. Dairy Sci. 88: 2652–2659.

PEARSON, K., 1894 Contributions to the mathematical theory of evolution. Philos. Trans. R. Soc. A 185: 71–110.

RICHARDSON, S., and P. J. GREEN, 1997 On Bayesian analysis of mixtures with an unknown number of components (with discussion). J. R. Stat. Soc. B 59: 731–792.

ROBERTSON, A., 1977 The non-linearity of offspring-parent regression, pp. 297–303 in Proceedings of the International Conference on Quantitative Genetics, edited by E. POLLAK, O. KEMPTHORNE and T. B. BAILEY, JR. Iowa State University Press, Ames, IA.

SEARLE, S. R., G. CASELLA and C. E. MCCULLOCH, 1992 Variance Components. Wiley, New York.

TITTERINGTON, D. M., A. F. M. SMITH and U. E. MAKOV, 1985 Statistical Analysis of Finite Mixture Distributions. Wiley, Chichester, UK.

VERBEKE, G., and E. LESAFFRE, 1996 A linear mixed effects model with heterogeneity in the random-effects population. J. Am. Stat. Assoc. 91: 217–221.

Communicating editor: B. J. WALSH





Copyright © 2006 by the Genetics Society of America.