| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |
,1
,




* Departamento de Producción Animal, E.T.S.I. Agrónomos–Universidad Politécnica de Madrid, 28040 Madrid, Spain,
Department of Dairy Science and
Department of Animal Sciences, University of Wisconsin, Madison, Wisconsin 53706 and
Aviagen Ltd., Newbridge EH28 8SZ, Scotland, United Kingdom
1 Corresponding author: Department of Dairy Science, University of Wisconsin, 1465 Observatory Dr., Madison, WI 53706.
E-mail: ogonzalez2{at}wisc.edu
| ABSTRACT |
|---|
|
|
|---|
-metric model, kernel regression, reproducing kernel Hilbert spaces (RKHS) regression, and a Bayesian regression) were compared with a standard procedure of genetic evaluation (E-BLUP) of sires using mortality rates in broilers as a response variable, working in a Bayesian framework. Late mortality (14–42 days of age) records on 12,167 progeny of 200 sires were precorrected for fixed and random (nongenetic) effects used in the model for genetic evaluation and for the mate effect. The average of the corrected records was computed for each sire. Twenty-four SNPs seemingly associated with late mortality were included in three methods used for genomic assisted evaluations. One thousand SNPs were included in the Bayesian regression, to account for markers along the whole genome. The posterior mean of heritability of mortality was 0.02 in the E-BLUP approach, suggesting that genetic evaluation could be improved if suitable molecular markers were available. Estimates of posterior means and standard deviations of the residual variance were 24.38 (3.88), 29.97 (3.22), 17.07 (3.02), and 20.74 (2.87) for E-BLUP, the linear model on SNPs, RKHS regression, and the Bayesian regression, respectively, suggesting that RKHS accounted for more variance in the data. The two nonparametric methods (kernel and RKHS regression) fitted the data better, having a lower residual sum of squares. Predictive ability, assessed by cross-validation, indicated advantages of the RKHS approach, where accuracy was increased from 25 to 150%, relative to other methods.
2.8 million SNPs in the chicken genome. The chicken polymorphism database (ChickVD) is available at http://chicken.genomics.org.cn/index.jsp (WANG et al. 2005), which constitutes a valuable inventory of markers potentially usable as predictors of genetic components underlying disease resistance. In the context of animal breeding, an interesting application of SNPs is in prediction of performance of progeny groups, e.g., progeny of dairy sires, as an alternative to a standard progeny test. Currently, predictions are based on pedigree indexes (information from ancestors), but without using molecular markers. To the extent that there is an association between genetic markers and progeny performance, genomic information might enhance the accuracy of genetic evaluations. For example, SNP information on candidate sires could be used to make early decisions on which of these should be progeny tested. Hence, it is of interest to study whether genotyping sires for SNPs improves the accuracy of predictions over and above that attained with pedigree indexes.
Some methods have been proposed for dealing with the large amount of genomic information currently available (MEUWISSEN et al. 2001; GIANOLA et al. 2003; XU 2003). An issue is whether or not all SNPs should be included in a predictive model. For example, excluding irrelevant SNPs led to more accurate classification in an association study (LONG et al. 2007). Further, incorporating the massive genomic information into genetic evaluations is not trivial from statistical and computational points of view. Many issues need to be considered when using SNPs in the context of traditional methods, e.g., the distributional assumptions made, the fact that the number of SNPs can exceed by far the number of observations in the data, and the difficult problem of estimating and interpreting nonadditive genetic effects. GIANOLA et al. (2006) and GIANOLA and VAN KAAM (2008, accompanying article, this issue) proposed nonparametric methods for predicting genomic values. These methods use weaker assumptions than traditional fully parametric models and allow accounting for nonadditive effects without explicit modeling.
The objective of this study was to compare a standard method (a Bayesian counterpart of empirical best linear unbiased prediction) against methods including genomic information, for genetic evaluation of mortality using data from a broiler population. Four methods including genomic information were used: three methods using 24 "informative" SNPs (a widely used linear
-metric regression and two nonparametric methods, kernel regression and reproducing kernel Hilbert spaces regression) and a Bayesian regression using 1000 SNPs along the genome. Mortality in chickens is a trait with low heritability, thus providing a stringent test of the potential effectiveness of genomic assisted evaluations.
| MATERIALS AND METHODS |
|---|
|
|
|---|
Pedigree information was tracked six generations back, ending up with 1103 sires in the pedigree file. Sires were genotyped for 5523 SNPs chosen among the 2.8 million SNPs identified in the chicken genome sequencing project (WONG et al. 2004). Twenty-four SNPs, selected with the filter and wrapper strategy of LONG et al. (2007) applied to the same data, were used in this research for three of the four methods including genomic information. The filter reduces the original thousands of SNPs to a smaller number (e.g., 50), by using an information gain measure. In the wrapper step, a naive Bayesian classifier (using cross-validation prediction accuracy) evaluates each SNP subset's usefulness, eventually arriving at the 24 SNPs having the highest performance (LONG et al. 2007).
Models:
Four statistical methods, including genomic information plus a standard genetic evaluation procedure ignoring markers, were implemented to analyze sires' transformed adjusted progeny means as a response variable. Let y (200 x 1) be the data vector consisting of standardized log-transformed means of adjusted records, to which the following models were fitted.
Model 1—standard genetic evaluation (E-BLUP):
A genetic evaluation using the pedigree as sole source of genetic information was implemented using a Bayesian approach. The method is a Bayesian equivalent of empirical best linear unbiased prediction for predicting sires' transmitting abilities, as described by HENDERSON (1975). The linear model was
![]() |
is a vector of sire effects,
is the effect of sire i in the pedigree (i = 1, 2,..., 1103), and Z is an incidence matrix of order 200 x 1103 linking u to the observed data. A priori, the sire effects were assumed to be distributed as
, where A is the additive relationship matrix between sires and
is the variance between sires. The residuals e were assumed distributed as
, where
is a diagonal matrix, with its diagonal elements
being the number of progeny of sire i. This dispersion structure for e weights the residuals according to the number of progeny each sire has (SORENSEN and GIANOLA 2002; VARONA et al. 2007). Independent scale inverse chi-square prior distributions were assigned to the sire and residual variances as
and
, respectively, where
and
were the degrees of freedom, and
and
were the corresponding scale parameters. Posterior means of the variance components, as well as of sire effects, were calculated using Gibbs sampling, as described by WANG et al. (1993, 1994). Posterior means were used as estimates of sire merit (transmitting ability).
Model 2—F
-metric model (linear regression on SNPs):
A model with "fixed" additive effects at each of the SNP loci was fitted following the
-metric model (linear regression on SNPs) (
-metric) parameterization described by VAN DER VEEN (1959),
![]() |
is the regression of phenotype (y) on the additive effect of SNP i, with q = 24 being the number of SNPs fitted, and e is a residual. The regressors xia relating regression coefficients for SNP i to y were set up as described by VAN DER VEEN (1959) and ZENG et al. (2005). The codes used were
![]() |
Model 3—kernel regression on SNPs:
A nonparametric procedure was used to infer effects of the different SNPs combinations of sires on performance without making strong assumptions. Consider the model
![]() |
is the transformed average progeny group mean of sire i, as described earlier, and
is a (q x 1) vector representing the genotype of each sire for the q = 24 SNPs. Here,
is some unknown function of the whole SNP genotype for each sire, representing the expected phenotypic value of sires possessing this 24-dimensional SNPs combination, i.e., the conditional expectation function
. The random vector of residuals,
, was assumed distributed independently of
and centered at zero. The conditional expectation function given by
![]() |
![]() |
![]() |
is a kernel function, with smoothing parameter h, which acts as a measurement of "genomic distance" between two SNP combinations: the observed combination and the focal point. The focal point is any SNP genotype combination at which the function
is evaluated. A Gaussian kernel is often employed (NADARAYA 1964; WATSON 1964), but a trinomial kernel was adopted instead (GIANOLA and VAN KAAM 2008), considering the three possible genotypes for every SNP. The incidence of the three genotypes can be described with two free dummy variates. Let
and
be the number of disagreements between a focal SNP combination and the observed combination in sire i for two of the three possible genotypes. In this case, the observed genotypes were AA, Aa, or aa, with A being the allele with the largest frequency, and
and
were calculated as the sum of disagreements over loci from observing genotypes at the corresponding locus. An agreement was coded as 0, whereas a disagreement was coded as 1. Table 1 illustrates the values that
and
take at a given locus for each of the nine possible combinations of observed and focal genotypes. The trinomial kernel function had the form
![]() |
is the observed genomic combination in sire i. The smoothing parameters h1 and h2 take values in the interval (0, 1) and also satisfy
so that
is a suitable candidate kernel. The EGV for each sire was the nonparametric estimate using its corresponding genomic configuration as a focal point. The smoothing parameters (h1, h2) were tuned using the generalized (direct) cross-validation method described in WAHBA et al. (2002). For simplicity, a single parameter h1 = h2 = h was tuned.
|
Model 4—reproducing kernel Hilbert spaces regression:
A second nonparametric procedure, a reproducing kernel Hilbert spaces (RKHS) regression, was used to estimate the effect of different genomic combinations on LM without making strong assumptions. A brief description of the RKHS is given here, and additional details can be found in KIMELDORF and WAHBA (1971), WAHBA (1990, 1999), MALLICK et al. (2005) and GIANOLA and VAN KAAM (2008). This method assumes that distances in the Euclidean space can be represented in an isomorphic and isometric space, with a kernel matrix measuring distances between objects (focal points) in the Hilbert space. In our case, the focal points are the SNP genotypes of each individual, and the kernels involve the distance between objects. In these spaces, the penalized sum of squares has the form
![]() | (1) |
is a vector of unknown functions of SNP genotypes. The second term in this equation acts as a penalty, adding up some deviance depending on the value of the unknown parameter
. The term
is a norm under a Hilbert space. KIMELDORF and WAHBA (1971) found that the function g(x) that minimizes (1) admits the representation
![]() |
is a vector of unknown coefficients, n is the number of sires genotyped, and
is a reproducing kernel used as a basis function, possibly depending on some smoothing parameter(s) h. Since data were preadjusted in advance, effects other than the genomic combinations were not included in the model. In our implementation of RKHS, the intercept
was included in the model as the sole element of
. The following exponential kernel meets the requirements of the RKHS structure,
![]() |
was a score of similarity between two SNPs sequences. The score assigned to each pair of SNP sequences was based on the pairwise sequence alignment (NEEDLEMAN and WUNSCH 1970), with some modifications. The APPENDIX shows the score system in detail. Contrary to the kernels used in GIANOLA et al. (2006) and GIANOLA and VAN KAAM (2008), this kernel does not need tuning smoothing parameters inside of the kernel.
Then, a matrix of kernels K with dimension 200 x 200, and with rows in the form
, j = 1, 2, ..., 200, was constructed. The matrix K is symmetric and positive definite and can be interpreted as a correlation matrix between genomic combinations. The minimizer
of (1) can be expressed in a vectorial manner as
![]() |
Embedding this expression in (1) and considering that β includes only an intercept, the function to be minimized becomes
![]() |
to 0 (MALLICK et al. 2005; GIANOLA et al. 2006; GIANOLA and VAN KAAM 2008), and noting the dependence on parameters
, the RKHS regression equations can be formulated in matrix form as
![]() | (2) |
; recall that
, where ni is the number of progeny of sire i. Equivalently, the RKHS approach can be formulated in terms of the random-effects model
![]() |
) and the residuals assumed to be independently distributed as
and
, with
. This model was fitted within a Bayesian framework with
unknown. Scale inverse chi-square prior distributions were assigned to
and to the residual variance; i.e.,
and
, where
and
were the degrees of freedom, and
and
were the respective prior scale parameters. System (2) resembles the well-known mixed-model equations in animal breeding (HENDERSON 1975). In fact, additional fixed and random effects (including parametric genetic effects) can be included in the model if necessary (GIANOLA et al. 2006). Finally, the EGV of each sire was the nonparametric estimate of its corresponding genomic combination; i.e.,
, where
is the posterior mean of
. A Fortran-90 software was developed to implement the RKHS regression.
Model 5—Bayesian regression:
An adapted version of the Bayesian regression proposed by XU (2003) with Gaussian prior distributions assigned to markers (SNPs) was performed using 1000 SNPs randomly chosen from the whole genome. This model allows each SNP marker to have its own variance, producing differential shrinkage of marker effects. The model was
![]() |
Here,
is 1000 x 1, and bi is a regression coefficient for SNP i (i = 1, 2,..., 1000) assumed normally distributed a priori as
, where
is the unknown variance associated with marker i. The elements of the incidence matrix X with dimension equal to the number of records (n = 200) times the number of markers (p = 1000), relating regression coefficients for SNPs to y, were set up as in model 2 for additivity. The residuals e were assumed distributed as
, with R constructed as in previous models. Details of the Markov chain Monte Carlo (MCMC) sampling method are in XU (2003). The EGV of each sire was inferred by summing up the product of the Bayesian estimates (posterior means) of the regression coefficients, times the code of the respective genotype.
In summary, model 1 accounted for polygenic additive effects, and models 2 and 5 considered only additive effects of markers (24 and 1000, respectively). On the other hand, models 3 and 4 are expected to account for all additive and epistatic effects involving the 24 SNPs chosen. Models 1, 2, 4, and 5 were implemented using Bayesian methods via MCMC procedures, specifically the Gibbs sampler. Each of the analyses was based on a chain of 200,000 iterations, with the first 50,000 iterations discarded as burn-in. There were 15,000 samples used for posterior inference, obtained by drawing every 10th iteration from the chain following burn-in.
Model fit:
After obtaining predicted transmitting abilities (PTA) (model 1) and EGVs (models 2–5) of all sires, these were matched with the observed mortality rates (adjusted and raw progeny means) in the original data set. These averages were modeled via a local weighted regression using the PTA or the EGV as an explanatory variable, to examine whether the PTA or the EGV bore any relationship with mortality rates. Local weighted regression is a nonparametric approach to fitting curves to data on the basis of smoothing (CLEVELAND 1979). This method approximates the relationship between mortality rates (response variable) and the PTA or EGV estimates (explanatory variables) locally by a smooth curve based on a parametric function, using locally weighted least squares. Weights are assigned such that points close (in the Euclidean distance) to the predictor value of interest receive a higher weight. For convenience, fitting was such that one-fifth of the points in the plot were allowed to influence the smooth at each value. The regressions were computed using the R software (R DEVELOPMENT CORE TEAM 2007). The mean square error (MSE) for each method was calculated as the average of the squares of the differences between the actual average and the local weighted regression estimate at each point.
Predictive ability:
The ability of predicting yet to be observed mortality rates was studied by cross-validation for the four methods. Five subsets of data were generated by excluding 20% of the sire progeny means at random. The process was without replacement, so that all sires had missing values in only one of the subsets. This analysis was done for each method, using the variances and smoothing parameters estimated previously. The estimates for sires without phenotypic records were obtained by treating the missing data via the data augmentation algorithm (TANNER and WONG 1987). Thus, each sire had a PTA or an EGV for scenarios with and without records for each of the five methods.
Correlations between predicted and observed adjusted means of the progeny of each sire were calculated for each subset and method. Also, a "global" correlation was calculated using the predicted values from the five subsets together. The model producing the highest correlation was regarded as the one with the best predictive ability of yet to be observed records.
| RESULTS AND DISCUSSION |
|---|
|
|
|---|
|
, the reciprocal of the smoothing parameter
) was estimated at 0.40 (0.07) with RKHS. The interpretation of this parameter is not obvious; in general, the smaller it is, the smoother
is (HASTIE and TIBSHIRANI 1990). Model 3 (kernel regression) does not lead to variance estimates directly. The h smoothing parameter for kernel regression was estimated at 0.45. Comparison of these nonparametric results with those from other studies is not possible, since such methodology has not been used previously in quantitative genetics. The choice and design of kernels are of great importance in nonparametric regression, and they deserve further research for applications in genomic selection.
|
-metric regression model fitted to the data the worst.
|
|
) and Pearson (
) correlations between estimates of sire effects from the different models ranged from 0.27 to 0.93 (Table 3). E-BLUP had a higher agreement with the BR method (
,
) than with either kernel regression or RKHS. This illustrates that the polygenic model provides a close approximation to a BR model with a large number of markers having additive effects. The
-metric genomic evaluation had the weakest correlations with the other methods, including the two nonparametric procedures, even though
, kernel regression, and RKHS used the same information (phenotype and genomic information). This indicates that the
approach produced predictions that were distinct from those from other methods. Also, E-BLUP (which does not incorporate genomic information) and the RKHS methods gave different results, and E-BLUP had a higher correlation with BR than with RKHS, as noted earlier.
|
-metric, respectively. Even though LM has a very low heritability, the RKHS incorporating genomic information from only 24 SNPs attained better predictive ability than either E-BLUP or BR, the latter having 1000 markers in the explanatory structure. The models including genomic information had better predictive ability than E-BLUP in four of five subsets (P < 0.01 under binomial sampling), and the RKHS method had a better predictive ability in three of these four subsets (P = 0.05 under binomial sampling). Within methods considering genomic information, RKHS and kernel regression models can potentially account for nonadditive genetic effects (e.g., dominance and epistasis). The Bayesian regression did not perform better than RKHS, in spite of using a much larger amount of genomic information. This might be due to the strong assumptions placed on BR such as linearity, multivariate normality, or absence of interactions among SNPs. Further studies should deal with the inclusion of large amounts of information in the RKHS regression model and with a comprehensive comparison between methods incorporating SNPs along the whole genome and those based on filtering SNP information, as in LONG et al. (2007).
|
Nonparametric methods are expected to have a better predictive ability than parametric models when relationships between variables are cryptic. However, the interpretation of parameters is not straightforward, or even useful, since the focus is on description and prediction. A low heritability and the categorical nature of LM provided a challenge to all methods, but there was some advantage for the RKHS approach using genomic information. Bootstrapping might give a clearer picture of the uncertainty about differences in predictive ability between methods. However, it is computationally demanding.
LEGARRA et al. (2007) found that methods incorporating genomic information had better predictive ability than BLUP in a model with independent families. In a simulation, GIANOLA et al. (2006) found that RKHS regression had higher accuracy of prediction of genotypic values than linear random regressions, mainly under additive x additive epistasis. Also, these authors assumed a different penalty structure for the nonparametric coefficients and used a Gaussian kernel. The kernel and scoring system developed in this study performed better than the kernel described in GIANOLA et al. (2006) (results not shown).
Other authors have proposed parametric methods for including genomic information into genetic evaluations. For instance, MEUWISSEN et al. (2001) compared least-squares, BLUP, and two Bayesian methods with different prior distributions for the QTL effects, using simulated data. They found that Bayesian methods assigning specific prior distributions to marker effects were more reliable, but their priors matched the simulated parameter values. Also, GIANOLA et al. (2003) and XU (2003) dealt with markers over the entire genome to estimate polygenic effects by applying Bayesian methods. Strong assumptions are needed for these methods, as mentioned earlier.
Pedigree and genomic information were not combined in any of the methods used in this study, due to the small number of sires. However, including both sources of information in analyses with a larger amount of data might improve predictive ability of the methods presented here.
Implications:
Three methods using information from 24 SNPs associated with chicken mortality in broilers, which is a lowly heritable trait, and one method using 1000 SNPs across the whole genome were used for genomic evaluation of late mortality. Comparison of these methods against a standard genetic evaluation using pedigree information indicated that predictive ability was similar for E-BLUP, LM, kernel regression, and BR, but RKHS increased predictive correlations by 0.04 over BR, by 0.10 over E-BLUP, and by 0.12 over a linear regression on SNPs using the
-metric model. The kernel regression and the RKHS procedure fitted the data better than E-BLUP, BR, or
-metric, producing a smaller MSE. It seems that incorporating information on SNPs into genetic evaluation is feasible with nonparametric models, with the RKHS being appealing, because of its ability to deal with complex gene action as well as include parametric components in the model (GIANOLA and VAN KAAM 2008). Further, RKHS had the highest accuracy when predicting performance of sires without progeny records in the analysis. This study suggests that RKHS using just some SNPs having an association with the trait may provide better predictive ability than a BR using markers over the entire genome. This is an important issue in the development of genomic selection procedures that must be considered in future studies in a thorough manner.
Nonparametric methods for incorporating genomic information into genetic evaluations should be studied further, since they are appealing for handling large numbers of SNPs and their interactions. At present, SNP chip developments allow genotyping thousands of SNPs for many individuals. If these methods prove to be better than existing ones, breeding companies could genotype candidates and make earlier decisions on breeding programs and progeny testing, with the potential for increasing the rate of genetic progress (SCHAEFFER 2006). Developing suitable kernels to measure the extent by which haplotypes differ from each other and that allow for weighting the most relevant SNPs differentially constitutes an important research challenge.
| APPENDIX |
|---|
|
|
|---|
Let
and
be two sequences of SNPs within a given chromosome, with the SNPs sorted in ascending order regarding their position in the chromosome. First, calculate the frequencies (
) at locus s of genotype k (with k = 1, 2, or 3 being one of the three possible genotypes, i.e., AA, Aa, or aa). Then, initialize the score S = 0. The algorithm for scoring similarity between two sequences i and j is computed from s going from 1 to the number of SNPs in a given chromosome as
![]() |
Above, S is the score for the similarity of two sequences of SNP in a given chromosome, and subscore is a temporary variable in the algorithm that is initialized to 1 at the beginning of the algorithm and every time the sequences differ in their genotypes. The final score between both sequences is given by the sum of the scores for each chromosome as
![]() |
The lower the score, the more similar the SNP sequences are. Further, the more uncommon SNPs two sequences share, the lower the score is. This score was introduced in the exponential kernel as
, so that the more similar two sequences were, the larger the value of the kernel between sequence x and
was.
| ACKNOWLEDGEMENTS |
|---|
|
|
|---|
| LITERATURE CITED |
|---|
|
|
|---|
CLEVELAND, W. S., 1979 Robust locally weighted regression and smoothing scatterplots. J. Am. Stat. Assoc. 74: 829–836.[CrossRef]
DE GREEF, K. H., L. L. JANSS, A. L. J. VEREIJKEN, R. PIT and C. L. GERRITSEN, 2001 Disease-induced variability of genetic correlations: ascites in broilers as a case study. J. Anim. Sci. 79: 1723–1733.
GIANOLA, D., and J. B. C. H. M. VAN KAAM, 2008 Reproducing kernel Hilbert spaces regression methods for genomic assisted prediction of quantitative traits. Genetics 178: 2289–2303.
GIANOLA, D., M. PEREZ-ENCISO and M. A. TORO, 2003 On marker assisted prediction of genetic value: beyond the ridge. Genetics 163: 347–365.
GIANOLA, D., R. L. FERNANDO and A. STELLA, 2006 Genomic-assisted prediction of genetic value with semiparametric procedures. Genetics 173: 1761–1776.
HASTIE, T. J, and R. J. TIBSHIRANI, 1990 Generalized Additive Models. Chapman & Hall, London.
HAYES, B., J. LAERDAHL, D. LIEN, A. ADZHUBEI and B. HØYHEIM, 2004 Large scale discovery of single nucleotide polymorphism (SNP) markers in Atlantic Salmon (Salmo salar). AKVAFORSK, Institute of Aquaculture Research, Aas, Norway. www.mabit.no/pdf/hayes.pdf.
HENDERSON, C. R., 1975 Best linear unbiased estimation and prediction under a selection model. Biometrics 31: 423–447.[CrossRef][Medline]
JANSS, L. L. G., and N. M. BOLDER, 2000 Heritabilities of and genetic relationships between salmonella resistance traits in broilers. J. Anim. Sci. 78: 2287–2291.
KIMELDORF, G., and G. WAHBA, 1971 Some results on Tchebycheffian spline functions. J. Math. Anal. Appl. 33: 82–95.[CrossRef]
LAMONT, S. J., M. G. KAISER and W. LIU, 2002 Candidate genes for resistance to Salmonella enteritidis colonization in chickens as detected in a novel genetic cross. Vet. Immun. Immunopathol. 87: 423–428.[CrossRef][Medline]
LEGARRA, A., J. M. ELSEN, E. MANFREDI and C. ROBERT-GRANIE, 2007 Validation of genomic selection in an outbred mice population. Proceedings of the 58th Annual Meeting of European Association for Animal Production, August 26–29, 2007, Dublin, Ireland, Session 18, Abstract 1071.
LIU, H. C., H. H. CHENG, V. TIRUNAGARU, L. SOFER and J. BURNSIDE, 2001 A strategy to identify positional candidate genes conferring Marek's disease resistance by integrating DNA microarrays and genetic mapping. Anim. Genet. 32: 351–359.[CrossRef][Medline]
LONG, N., D. GIANOLA, G. J. M. ROSA, K. WEIGEL and S. AVENDAÑO, 2007 Machine learning classification procedure for selecting SNPs in genomic selection: application to early mortality in broilers. J. Anim. Breed. Genet. 124(6): 377–389.[Medline]
MALLICK, B. K., D. GHOSH and M. GHOSH, 2005 Bayesian classification of tumours by using gene expression data. J. R. Stat. Soc. B 67: 219–234.[CrossRef]
MEUWISSEN, T. H. E., B. J. HAYES and M. E. GODDARD, 2001 Prediction of total genetic value using genome-wide dense marker maps. Genetics 157: 1819–1829.
NADARAYA, E. A., 1964 On estimating regression. Theor. Probab. Appl. 9: 141–142.[CrossRef]
NEEDLEMAN, S. B., and C. D. WUNSCH, 1970 A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48: 443–453.[CrossRef][Medline]
R DEVELOPMENT CORE TEAM, 2007 Statistical Computing. R Foundation for Statistical Computing. Vienna. http://www.R-project.org.
SCHAEFFER, L. R., 2006 Strategy for applying genome-wide selection in dairy cattle. J. Anim. Breed. Genet. 123(4): 218–223.[CrossRef][Medline]
SILVERMAN, B. W., 1986 Density Estimation for Statistics and Data Analysis. Chapman & Hall, London.
SORENSEN, D. A., and D. GIANOLA, 2002 Likelihood, Bayesian and MCMC Methods in Quantitative Genetics, pp. 588–595. Springer-Verlag, New York.
TANNER, M. A., and W. H. WONG, 1987 The calculation of posterior distributions by data augmentation. J. Am. Stat. Assoc. 81: 82–86.[CrossRef]
VAN DER VEEN, J. H., 1959 Tests of non-allelic interaction and linkage for quantitative characters in generations derived from two diploid pure lines. Genetica 30: 201–232.[CrossRef][Medline]
VARONA, L., D. A. SORENSEN and R. THOMPSON, 2007 Analysis of litter size and average litter weight in pigs using a recursive model. Genetics 177: 1791–1799.
WAHBA, G., 1990 Spline Model for Observational Data. Society for Industrial and Applied Mathematics, Philadelphia.
WAHBA, G., 1999 Advances in Kernel Methods: Support Vector Machines, Reproducing Kernel Hilbert Spaces and the Randomized GAVC, edited by B. SCHOLKOPF, C. BURGES and A. SMOLA, pp. 68–88. MIT Press, Cambridge, MA.
WAHBA, G., Y. LIN, Y. LEE and H. ZHANG, 2002 Optimal properties and adaptive tuning of standard and non-standard support vector machines, pp. 125–143 in Nonlinear Estimation and Classification, edited by D. DENISON, M. HANSEN, C. HOLMES, B. MALLICK and B. YU. Springer, New York.
WANG, C. S., J. J. RUTLEDGE and D. GIANOLA, 1993 Marginal inferences about variance components in a mixed linear model using Gibbs sampling. Genet. Sel. Evol. 25: 41–62.[CrossRef]
WANG, C. S., J. J. RUTLEDGE and D. GIANOLA, 1994 Bayesian analysis of mixed linear models via Gibbs sampling with an application to litter size in Iberian pigs. Genet. Sel. Evol. 26: 91–115.[CrossRef]
WANG, J., X. HE, J. RUAN, M. DAI, J. CHEN et al., 2005 ChickVD: a sequence variation database for the chicken genome. Nucleic Acids Res. 33: D438–D441.
WATSON, G. S., 1964 Smooth regression analysis. Sankhya Ser. A 26: 359–372.
WONG, G. K., B. LIU, J. WANG, Y. ZHANG, X. YANG et al., 2004 A genetic variation map for chicken with 2.8 millions single nucleotide polymorphism. Nature 432: 717–722.[CrossRef][Medline]
XU, S., 2003 Estimating polygenic effects using markers of the entire genome. Genetics 163: 789–801.
YE, X., S. AVENDAÑO, J. C. M. DEKKERS and S. J. LAMONT, 2006 Association of twelve immune-related genes with performance of three broiler lines in two different hygiene environments. Poult. Sci. 85: 1555–1569.
ZENG, Z. B., T. WANG and W. ZOU, 2005 Modeling quantitative trait loci and interpretation of models. Genetics 169: 1711–1725.
This article has been cited by other articles:
![]() |
D. Gianola and J. B. C. H. M. van Kaam Reproducing Kernel Hilbert Spaces Regression Methods for Genomic Assisted Prediction of Quantitative Traits Genetics, April 1, 2008; 178(4): 2289 - 2303. [Abstract] [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |