Genetics, Vol. 162, 841-849, October 2002, Copyright © 2002

Estimation of Quantitative Trait Locus Allele Frequency via a Modified Granddaughter Design

Joel Ira Weller(a), Hayim Weller(a,b), David Kliger(a), and Micha Ron(a)
(a) Institute of Animal Sciences, ARO, The Volcani Center, Bet Dagan 50250, Israel
(b) Jerusalem College of Technology, Jerusalem 91160, Israel

Corresponding author: Joel Ira Weller, ARO, The Volcani Ctr., P.O. Box 6, Bet Dagan 50250, Israel. E-mail: weller@agri.huji.ac.il

Communicating editor: C. HALEY


*  ABSTRACT

A method is described on the basis of a modification of the granddaughter design to obtain estimates of quantitative trait loci (QTL) allele frequencies in dairy cattle populations and to determine QTL genotypes for both homozygous and heterozygous grandsires. The method is based on determining the QTL allele passed from grandsires to their maternal granddaughters using haplotypes consisting of several closely linked genetic markers. This method was applied to simulated data of 10 grandsire families, each with 500 granddaughters, and a QTL with a substitution effect of 0.4 phenotypic standard deviations and to actual data for a previously analyzed QTL in the center of chromosome 6, with substitution effect of 1 phenotypic standard deviation on protein percentage. In the simulated data the standard error for the estimated QTL substitution effect with four closely linked multiallelic markers was only 7% greater than the expected standard error with completely correct identification of QTL allele origin. The method estimated the population QTL allelic frequency as 0.64 ± 0.07, compared to the simulated value of 0.7. In the actual data, the frequency of the allele that increases protein percentage was estimated as 0.63 ± 0.06. In both data sets the hypothesis of equal allelic frequencies was rejected at P < 0.05.


NUMEROUS studies have shown that individual loci affecting economic traits [quantitative trait loci (QTL)] can be detected via linkage to genetic markers. For species such as cattle with very limited female prolificacy, the most appropriate experimental designs are the daughter and granddaughter designs (WELLER et al. 1990). These designs have already been applied to several commercial populations, and chromosomal regions with segregating QTL have been statistically detected (e.g., HEYEN et al. 1999).

Two general methods have been considered for the statistical analysis of data generated by these designs. GEORGES et al. (1995), using a granddaughter design, analyzed each half-sib family separately. Only the two grandpaternal alleles were considered, and the statistical test determined whether the difference between these two alleles was significant. QTL alleles from different families were not compared. NEIMANN-SORENSEN and ROBERTSON (1961) first suggested that data could be accumulated across families by a chi-square analysis. WELLER et al. (1990) proposed an ANOVA for the granddaughter design, with the effect of sire nested within grandpaternal allele, which was nested within grandsire. This type of analysis decreases the total number of tests and increases power, provided that the QTL has a relatively high polymorphic information content. A priori no assumptions are required with respect to the number of segregating alleles or their relative frequencies.

If several families are analyzed jointly, the total number of segregating QTL alleles is not known. BOVENHUIS and WELLER (1994) and MACKINNON and WELLER (1995) derived methodology on the basis of maximum likelihood to estimate QTL effect, location, and allele frequencies under the assumption that only two alleles are segregating in the population. At the opposite extreme, FERNANDO and GROSSMAN (1989) assumed a random distribution of QTL allele effects and that each individual in the population with unknown parents received two unique QTL alleles. Similarly, GRIGNOLA et al. (1996) derived methodology to determine the variance due to a segregating QTL and its map location, assuming a random distribution of QTL effects. For analysis of complex pedigrees, assumption of the QTL effect as random is mathematically more tractable. However, most schemes proposed to utilize QTL in breeding programs require determination of the QTL genotype of specific individuals (e.g., MACKINNON and GEORGES 1998).

Even if it is assumed that only two QTL alleles are segregating, results from daughter or granddaughter design analyses can give only very limited information on QTL genotype of individual sires or allele frequencies in the general population. Each family includes a common polygenic effect, which is confounded with the specific QTL effect. MACKINNON and WELLER (1995) and SONG and WELLER (1998) proposed methodology to determine sire QTL genotype, but these methods are effective only for QTL with very large effects relative to the polygenic variance for the trait in question. Their estimates of QTL allele frequencies are based chiefly on the frequency of sires heterozygous for the QTL, as compared to homozygous sires.

The relative frequency of the QTL alleles is of paramount importance for marker-assisted selection. If the favorable allele is already at high frequency, then little can be gained by selection. Conversely, a relatively rare favorable allele is very valuable. In this study we propose a method based on a modification of the granddaughter design, termed the modified granddaughter design (MGD) to obtain estimates of allele frequencies in the population and QTL genotypes for both homozygous and heterozygous individuals under the assumption that only two alleles are segregating. Methods to test the "two-allele" hypothesis are also considered.


*  MATERIALS AND METHODS

The experimental design:
The MGD is diagrammed in Fig 1. Assume that a segregating QTL for a trait of interest has been detected and mapped to a chromosomal segment of ~10 cM using either a daughter or a granddaughter design. Consider the maternal granddaughters of a grandsire with a significant contrast between his two paternal alleles. This grandsire will be denoted the heterozygous grandsire. Each maternal granddaughter will receive one allele from her sire, who is assumed to be unrelated to the heterozygous grandsire, and one allele from her dam, who is a daughter of the heterozygous grandsire. Of these granddaughters, one-quarter should receive the grandpaternal QTL allele with the positive effect, one-quarter should receive the negative grandpaternal QTL allele, and one-half should receive neither grandpaternal allele. In the third case, the granddaughter received one of the QTL alleles of her granddam, the mate of the heterozygous grandsire. These granddams can be considered a random sample of the general population with respect to the allelic distribution of the QTL. All genetic and environmental effects not linked to the chromosomal segment in question are assumed to be randomly distributed among the granddaughters or are included in the analysis model, as described below. Thus, unlike a standard granddaughter design, it is possible to compare the effects of the two grandpaternal alleles to the mean QTL population effect.
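The one-quarter, one-quarter, one-half split described above can be checked by enumerating the four equally likely transmission paths. The following is a minimal Python sketch; the function name and the labels Q1, Q2, and "granddam" are illustrative choices, not part of the original analysis.

```python
from fractions import Fraction
from itertools import product


def maternal_allele_origin_probs():
    """Enumerate the origin of the QTL allele a granddaughter gets from her dam.

    The dam carries one allele from the heterozygous grandsire (Q1 or Q2,
    each with probability 1/2) and one allele from the granddam. She then
    passes either her paternal or her maternal allele (each 1/2).
    """
    probs = {"Q1": Fraction(0), "Q2": Fraction(0), "granddam": Fraction(0)}
    for grandsire_allele, dam_passes in product(("Q1", "Q2"), ("paternal", "maternal")):
        path_prob = Fraction(1, 4)  # 1/2 * 1/2 for each path
        if dam_passes == "paternal":
            probs[grandsire_allele] += path_prob
        else:
            probs["granddam"] += path_prob
    return probs


origin_probs = maternal_allele_origin_probs()
```

Recombination is ignored here; the sketch only reproduces the Mendelian expectation for a single locus.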



 
Figure 1. The modified granddaughter design. Only alleles for the QTL are shown. Alleles originating in the heterozygous grandsire are termed Q1 and Q2. Alleles originating in the granddams are termed M1 and M2. Alleles originating in the sires are termed H1, H2, H3, and H4.

Assuming that the QTL is "functionally biallelic" (that is, there are only two alleles with differential expression relative to the quantitative trait) and that allele origin can be determined in the granddaughters, the relative frequencies of the two QTL alleles in the population can be determined by comparing the mean values of the three groups of granddaughters for the quantitative trait. For example, if the two grandpaternal QTL alleles are at approximately equal frequency in the general population, then the mean value of those granddaughters that inherited neither grandpaternal allele should be midway between the two groups of granddaughters that inherited either grandpaternal allele. However, if one allele has a frequency much higher than that of the other allele, then the mean of those individuals receiving neither grandpaternal allele should approach the mean of the granddaughters that received the more frequent grandpaternal allele. This result will hold regardless of dominance at the QTL, since only additive effects will be observed in this design.
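The reasoning in this paragraph can be illustrated numerically. The sketch below assumes a biallelic QTL, so a granddaughter in the "neither" group carries a random population allele in place of a grandpaternal one. The function name and the allele effects of +0.2 and -0.2 are hypothetical values chosen only for illustration.

```python
def expected_group_means(p_Q, effect_Q, effect_q):
    """Expected QTL contribution for the three granddaughter groups.

    p_Q is the population frequency of allele Q. The "neither" group
    receives Q with probability p_Q and q with probability 1 - p_Q.
    """
    neither = p_Q * effect_Q + (1.0 - p_Q) * effect_q
    return {"Q": effect_Q, "q": effect_q, "neither": neither}


# Equal frequencies: the "neither" mean is midway between the two groups.
equal = expected_group_means(0.5, 0.2, -0.2)
# Q much more frequent: the "neither" mean approaches the Q group.
skewed = expected_group_means(0.9, 0.2, -0.2)
```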

Statistical model:
The following linear model, denoted model 1, can be used to derive estimates for the QTL allelic effects relative to the population mean effect at the QTL for each grandsire family,

$$Y_{ij} = Q\,p_{Qij} + q\,p_{qij} + X\,p_{xij} + S_i + e_{ij} \tag{1}$$

where Yij is the trait value for granddaughter j, daughter of sire i; Q and q are the effects of QTL grandpaternal alleles Q and q; pQij and pqij are the probabilities that the granddaughter inherited alleles Q or q; X is the mean effect for daughters that inherited neither grandpaternal allele; pxij is the respective probability; Si is the effect of sire i on the quantitative trait; and eij is the random residual associated with each daughter. A sire effect is included, because it is assumed that the granddaughters will generally be progeny of a small number of sires, and these might not be randomly distributed relative to the granddaughter QTL genotypes. If multiple heterozygous grandsire families are analyzed jointly, under the assumptions that only two alleles are segregating in the population, then this model should be modified to include a grandsire effect, in addition to the sire effect.

The granddaughter records analyzed will be commercial field data, and the statistical analysis must also account for herd effects. Furthermore, if a lactation trait is measured, cows may have multiple records. Thus, the dependent variable must be either yield deviations (VANRADEN and WIGGANS 1991) or genetic evaluations. Genetic evaluations are regressed. This will bias the estimated effect, but will not reduce power of detection (ISRAEL and WELLER 1998). Assuming that the three probabilities can be computed for each individual, this model is a simple linear regression with four variables, Q, q, X, and S.
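To illustrate how a model of this form could be fitted by least squares, a minimal Python sketch follows. It is not the SAS analysis used in this study: the sire effect is omitted for brevity, the five records are noise-free hypothetical values, and the names solve_linear and fit_model1 are illustrative.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_model1(probs, y):
    """Least-squares solutions (Q, q, X) for Y = Q*pQ + q*pq + X*px + e.

    probs: list of (pQ, pq, px) rows, one per granddaughter.
    Sire effects are left out of this sketch (an assumption for brevity).
    """
    k = 3
    xtx = [[sum(row[i] * row[j] for row in probs) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(probs, y)) for i in range(k)]
    return solve_linear(xtx, xty)


# Hypothetical noise-free records generated from known effects.
true_Q, true_q, true_X = 0.2, -0.2, 0.08
probs = [(0.9, 0.0, 0.1), (0.0, 0.95, 0.05), (0.0, 0.0, 1.0),
         (0.5, 0.1, 0.4), (0.1, 0.6, 0.3)]
y = [true_Q * a + true_q * b + true_X * c for a, b, c in probs]
Q_hat, q_hat, X_hat = fit_model1(probs, y)
```

With real data a sire effect would be added to the design, in which case only contrasts such as Q - q and X - q are estimable.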

Assuming that only two QTL alleles are segregating in the population, the expectation of X, $\hat{X}$, will be $p_Q\hat{Q} + (1 - p_Q)\hat{q}$, where $\hat{Q}$ and $\hat{q}$ are the estimated effects of the two grandpaternal QTL alleles and pQ is the probability of Q in the general population. The expectation of pQ, E(pQ), can then be derived as

$$E(p_Q) = \frac{\hat{X} - \hat{q}}{\hat{Q} - \hat{q}} \tag{2}$$

where $\hat{X}$ is the solution for X in Equation 1. $\hat{X} - \hat{q}$ and $\hat{Q} - \hat{q}$ are both estimable; thus E(pQ) is the ratio of two estimable functions. Denoting the numerator as n and the denominator as d, the approximate standard error, SE(pQ), can be derived as

$$SE(p_Q) \approx \left|\frac{n}{d}\right| \sqrt{\frac{\operatorname{Var}(n)}{n^2} + \frac{\operatorname{Var}(d)}{d^2} - \frac{2\operatorname{Cov}(n, d)}{nd}} \tag{3}$$

where Var(n) and Var(d) are the prediction error variances of ($\hat{X} - \hat{q}$) and ($\hat{Q} - \hat{q}$), and Cov(n, d) is their prediction error covariance. The estimates of ($\hat{X} - \hat{q}$) and ($\hat{Q} - \hat{q}$) and their prediction error variances were derived using the "estimate" option of PROC GLM of SAS (SAS INSTITUTE 1988). Cov(n, d) was computed from the inverse of the coefficient matrix used to solve Equation 1. With the equation for q set to zero, Cov(n, d) will be equal to the off-diagonal element of this inverse for the equations for X and Q, multiplied by the residual mean square.
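As a sketch of how the ratio estimate and its approximate standard error might be computed, the Python function below uses the standard first-order (delta-method) approximation for the variance of a ratio. That approximation is an assumption about the form of the calculation. The function name and the numerical inputs are hypothetical.

```python
import math


def allele_frequency_estimate(n, d, var_n, var_d, cov_nd):
    """Ratio estimate p = n / d with a first-order approximate SE.

    n, d: estimates of (X - q) and (Q - q).
    var_n, var_d, cov_nd: their prediction error variances and covariance.
    Uses Var(n/d) ~= (Var(n) - 2 p Cov(n, d) + p^2 Var(d)) / d^2, which is
    algebraically the same as the usual relative-variance form but avoids
    dividing by n.
    """
    p_hat = n / d
    approx_var = (var_n - 2.0 * p_hat * cov_nd + p_hat ** 2 * var_d) / d ** 2
    return p_hat, math.sqrt(approx_var)


# Hypothetical inputs, not values from this study.
p_hat, se = allele_frequency_estimate(0.28, 0.40, 0.0004, 0.0009, 0.0002)
interval = (p_hat - 2 * se, p_hat + 2 * se)  # plus or minus 2 standard errors
```

The approximation is only reasonable when d is well separated from zero relative to its standard error, that is, when the substitution effect is clearly significant.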

Determination of QTL allele origin in the granddaughters:
Generally, a segregating QTL can be identified only by linked markers, such as microsatellites. Although microsatellites are highly polymorphic even in commercial dairy cattle populations (RON et al. 1995), it will not be possible in most cases to determine allele origin in the granddaughters by merely comparing their genotypes to the grandsire genotype for a single marker. If the granddaughter received only one grandpaternal marker allele, it is not known whether she received this allele from the heterozygous grandsire, his mate, or her sire. The only case that is unequivocal is when the granddaughter received neither grandpaternal allele. However, even in this case the granddaughter still could have received one of the grandpaternal QTL alleles due to recombination.

The probability of correct determination of QTL allele origin is increased if the heterozygous grandsire, the granddaughters, and their sires are all genotyped for several closely linked, highly polymorphic microsatellites, assuming that this chromosomal segment includes the QTL. (Genotyping dams would also increase the probability of correct determination, but this is not considered to be a viable option, because the number of dams will be quite large, and obtaining genetic material from these cows once a segregating QTL has been detected will generally not be possible.) Considering the structure of most commercial dairy cattle populations, it should be possible to obtain several hundred maternal granddaughters for each grandsire, all progeny of a very few sires. Once the sires and several hundred of their daughters are genotyped for three or four closely linked markers, it should be possible to determine each sire's haplotypes with nearly complete certainty. Since the markers are closely linked, it should also be possible to determine with a very high probability which paternal haplotype was passed to the daughter and, by elimination, the maternal haplotype. Given the maternal haplotype, the probability of receiving either grandpaternal QTL allele, or neither, can then be determined. Combining all sources of information, the probability of receiving either grandpaternal allele, or neither, can be determined with a high degree of certainty. The probability that granddaughter j of sire i received grandpaternal allele Q, pQij can be computed as

$$p_{Qij} = P(Q \mid HS_{ic},\, PGS,\, PA_i,\, PS_j) \tag{4}$$

where HSic is the haplotype c received from sire i, PGS is the grandsire genotype including phase, PAi is the population frequencies of the alleles in the haplotype received from the dam, and PSj is the genotype of sire i including phase.

Description of the computer algorithm to compute pQij:
A computer program was written to compute pQij on the basis of Equation 4. The following assumptions were employed:

  1. Allele frequencies in the population were estimated from the allelic frequencies in the sires and granddaughters.

  2. The order of the genetic markers and the QTL within the chromosomal segment genotyped was assumed known without error, but the QTL could be between any two markers.

  3. Recombination frequencies among all pairs of adjacent genetic markers and between the QTL and the adjacent genetic markers were assumed known.

  4. All recombination events were assumed to be independent; that is, the Haldane mapping function (HALDANE 1919) was assumed.

  5. Genetic linkage phases of the grandsires and sires were assumed known without error for all genetic markers.

  6. The grandsires were assumed to be heterozygous for all genetic markers.
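Assumption 4 can be made concrete with the Haldane mapping function, r = 0.5(1 - exp(-2d)), where d is the map distance in morgans. The Python sketch below converts between recombination frequency and map distance and splits an interval at its midpoint. The function names are illustrative.

```python
import math


def haldane_r(d_morgans):
    """Recombination frequency for a map distance in morgans (no interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))


def haldane_d(r):
    """Map distance in morgans for a recombination frequency r < 0.5."""
    return -0.5 * math.log(1.0 - 2.0 * r)


def split_interval(r_total):
    """Recombination frequency from either end to the midpoint of an interval."""
    return haldane_r(haldane_d(r_total) / 2.0)


# A locus in the middle of a 5% recombination interval: roughly 0.0257 per side.
r_half = split_interval(0.05)
# Under independence the two halves recombine back to the whole interval.
r_whole = 2.0 * r_half * (1.0 - r_half)
```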

Assuming that the haplotype consists of only three loci, there are 2^3 = 8 different haplotypes that could be passed from the sire to his daughter. Assuming no mutations, many of these possibilities can be excluded if some of the daughter alleles differ from her sire. Considering the possibility of recombination within the haplotype, there may still be several possibilities for the sire haplotype transmitted, but allelic combinations that require recombination will have a lower probability than those of allelic combinations that are based on the assumption that the sire haplotype was transferred intact.
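The eight possible transmitted haplotypes and their probabilities can be enumerated directly. The following Python sketch assumes known phase, independent recombination between adjacent loci, and no mutation. The function name, allele labels, and recombination values are hypothetical.

```python
from itertools import product


def transmitted_haplotype_probs(hap_a, hap_b, recomb):
    """Probability of each haplotype a parent can transmit.

    hap_a, hap_b: the parent's two phased haplotypes (tuples of alleles).
    recomb: recombination frequencies between adjacent loci.
    Each locus is drawn from hap_a (0) or hap_b (1); a switch of origin
    between adjacent loci costs r, no switch costs 1 - r.
    """
    n_loci = len(hap_a)
    out = {}
    for origin in product((0, 1), repeat=n_loci):
        prob = 0.5  # which parental haplotype the first locus comes from
        for k in range(1, n_loci):
            r = recomb[k - 1]
            prob *= r if origin[k] != origin[k - 1] else 1.0 - r
        hap = tuple((hap_a, hap_b)[o][k] for k, o in enumerate(origin))
        out[hap] = out.get(hap, 0.0) + prob
    return out


# Hypothetical sire, heterozygous at three markers, 5% recombination each.
hap_probs = transmitted_haplotype_probs((1, 2, 3), (4, 5, 6), (0.05, 0.05))
```

Candidate haplotypes that conflict with the daughter's observed alleles would then be dropped and the remaining probabilities renormalized.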

Once the putative paternal haplotype is determined, the maternal haplotype is also determined. With respect to the effect of the QTL, the only path of recombination of interest is from the grandsire to his granddaughter. Even if the haplotype followed is relatively short, for example 10 cM, this is still problematic, because the grandsire and his granddaughters are separated by two generations. Assuming that the QTL is in fact located within the segment under consideration, if recombination occurred in the grandsire's sperm, there is uncertainty with respect to which QTL allele was passed to his daughter. With recombination in the subsequent generation, only part of the grandpaternal haplotype will be passed to the granddaughter. This part may or may not include the QTL. With respect to pQij, we considered several possibilities, starting from the simplest to the most complex.

  1. The putative dam haplotype has none of the grandsire alleles. In this case, pQij = pqij = 0, and pxij = 1.

  2. The putative dam haplotype is the same as one of the grandsire haplotypes. If the dam haplotype is the same as the grandsire haplotype including Q, then pqij will be zero, barring the very slight possibility of double recombination. However, there is still a possibility that the dam inherited this haplotype from her dam (the grandsire's mate). pQij is then computed as the probability that the haplotype was passed from the grandsire, divided by the sum of the probabilities of the two possible paths of inheritance. The probability that this haplotype was passed from the granddam is equal to the product of the frequencies of the three marker alleles in the general population times one-half. The probability that this haplotype was passed from the grandsire is [0.5(1 - r1)(1 - r2)]^2, where r1 and r2 are the probabilities of recombination between markers one and two, and two and three, respectively, assuming zero interference. There are other possibilities, but they require both recombination within the marker bracket and passing of paternal alleles from the granddam.

  3. The putative dam haplotype contains only grandpaternal alleles, but corresponds to neither haplotype. As in the previous cases, the dam could have received this haplotype from her dam, with the probability as computed previously. In addition, the whole or part of the haplotype might be of grandpaternal origin. If recombination occurred from grandsire to dam, then the daughter did receive one of the grandpaternal QTL alleles. Alternatively, recombination could have occurred between the dam and her daughter. In this case the granddaughter could have received the QTL allele either from the grandsire or from the granddam.

  4. The dam haplotype contains grandsire alleles for only some of the markers. In this case either the haplotype is derived from the granddam with probability as computed above or recombination occurred between the dam and the granddaughter, and only part of the grandpaternal allele was passed to the granddaughter. In the latter case, the granddaughter could have received either the QTL allele linked to the marker alleles passed from the grandsire or the QTL allele of the granddam.
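The calculation described in case 2 above can be sketched in Python as follows. It implements only that case (a maternal haplotype identical to the grandsire haplotype carrying Q) and ignores the less likely paths mentioned there. The function name and the example allele frequencies are hypothetical.

```python
def prob_grandsire_origin(marker_allele_freqs, recomb):
    """pQij for a maternal haplotype identical to the grandsire Q haplotype.

    Granddam path: product of the population frequencies of the marker
    alleles, times one-half.
    Grandsire path: [0.5 * prod(1 - r)]^2, the haplotype passed intact in
    two successive generations.
    """
    granddam_path = 0.5
    for freq in marker_allele_freqs:
        granddam_path *= freq

    intact = 0.5
    for r in recomb:
        intact *= 1.0 - r
    grandsire_path = intact ** 2

    return grandsire_path / (grandsire_path + granddam_path)


# Three markers whose alleles each have population frequency 0.2,
# with 5% recombination between adjacent markers.
p_Q_common = prob_grandsire_origin((0.2, 0.2, 0.2), (0.05, 0.05))
# The same haplotype built from very common alleles is less informative.
p_Q_frequent = prob_grandsire_origin((0.8, 0.8, 0.8), (0.05, 0.05))
```

The comparison shows why highly polymorphic markers matter: the rarer the marker alleles, the less likely the haplotype arrived through the granddam.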

Three constraints were employed to simplify the computing algorithm:

  1. Granddaughters with valid genotypes for less than three markers were deleted.

  2. Daughters had to receive at least one allele of their sire for all markers with valid genotypes. If this condition was not met, we assumed that the daughter's paternity was incorrect, and the record was discarded.

  3. The probability that a grandsire haplotype was passed to the mother with two or more events of recombination was assumed to be zero.
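The first two constraints amount to a record filter, which might look like the Python sketch below. The data layout (one allele pair per marker, None for a missing genotype) and the function name are assumptions made for illustration.

```python
def keep_record(daughter_genotype, sire_genotype, min_markers=3):
    """Apply the marker-count and paternity checks to one granddaughter.

    Genotypes are lists with one (allele1, allele2) tuple per marker, or
    None where the genotype is missing. A marker counts as valid only if
    both daughter and sire are genotyped (an assumption of this sketch).
    """
    valid = [(d, s) for d, s in zip(daughter_genotype, sire_genotype)
             if d is not None and s is not None]
    if len(valid) < min_markers:
        return False  # constraint 1: too few valid markers
    # constraint 2: at least one sire allele at every valid marker
    return all(set(d) & set(s) for d, s in valid)


sire = [(1, 7), (4, 8), (6, 9)]
ok = keep_record([(1, 2), (3, 4), (5, 6)], sire)
too_few = keep_record([(1, 2), (3, 4), None], sire)
wrong_sire = keep_record([(1, 2), (3, 4), (5, 3)], sire)
```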

Generation of simulated data:
A haplotype consisting of four markers with 5% recombination between each pair of adjacent markers was generated. The QTL was assumed to be located in the center of the haplotype between markers 2 and 3. Recombination frequencies between the QTL and the two adjacent markers were set at 0.02569, corresponding to the Haldane mapping function. Each marker was assumed to have five alleles segregating in the population. Population allelic frequencies for each marker were determined by sampling five times from a uniform distribution. The five sample values for each marker were then divided by their sum, so that the allelic frequencies would sum to unity. The QTL was assumed to have two alleles with population allelic frequencies of 0.7 and 0.3. The QTL was codominant with an additive effect of 0.4 phenotypic standard deviations. Thus the QTL accounted for 0.0672 of the phenotypic variance [2p(1 - p)a^2, where a is the additive QTL effect]. The polygenic additive genetic variance, excluding the QTL, was assumed to be 0.25 of the phenotypic variance. Thus the total heritability was 0.25 + 0.0672 ≈ 0.32.
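The allele-frequency sampling and the QTL variance stated above can be reproduced with a few lines of Python. This is a sketch with illustrative names; the random seed is arbitrary and is not taken from the original simulation.

```python
import random


def sample_allele_frequencies(n_alleles, rng):
    """Draw n uniform values and rescale them so the frequencies sum to one."""
    draws = [rng.random() for _ in range(n_alleles)]
    total = sum(draws)
    return [x / total for x in draws]


def qtl_variance(p, a):
    """Variance from a biallelic additive QTL: 2 p (1 - p) a^2."""
    return 2.0 * p * (1.0 - p) * a ** 2


rng = random.Random(1)  # arbitrary seed for repeatability
marker_freqs = [sample_allele_frequencies(5, rng) for _ in range(4)]

var_qtl = qtl_variance(0.7, 0.4)      # fraction of phenotypic variance
total_heritability = 0.25 + var_qtl   # polygenic plus QTL
```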

Ten grandsires, each with 500 granddaughters, were simulated. All grandsires were assumed to be heterozygous for the QTL and all four markers. Subject to these restrictions, grandsire genotypes were generated by sampling from uniform distributions that are based on the allelic frequencies in the population. Each grandsire mating was assumed to be to a different granddam to produce 5000 daughters. The marker haplotype that each daughter received from her dam was generated by sampling each allele from a uniform distribution according to the population allelic frequencies. QTL alleles were also determined by sampling from a uniform distribution. If a value <0.7 was sampled, then a positive QTL allele was generated. Otherwise, a negative allele was generated. The haplotype passed from the grandsire was generated on the basis of the grandsire genotype and the recombination frequencies given previously. The daughters of the grandsires were randomly mated to a total of 20 sires that were assumed to be unrelated to all other animals. The sire genotypes were generated by the same method as the granddam genotypes. The chromosome haplotype that each daughter received from her sire was simulated on the basis of the sire genotype and the recombination probabilities given previously. The haplotype that each daughter received from her dam was determined by sampling each locus from uniform distributions corresponding to the allelic population frequencies. The granddaughter genotypes were then constructed by sampling from the genotypes of their sire and dam, on the basis of the assumed recombination frequencies. Marker genotypes for all individuals for all markers were assumed known without error.

The quantitative trait value for each granddaughter, Yijk, was simulated as

$$Y_{ijk} = G_i + S_j + Q_{ijk} + e_{ijk} \tag{5}$$

where Gi is the grandsire effect, Qijk is the QTL effect of granddaughter k, and the other terms are as defined previously. The Gi, Sj, and eijk effects were simulated by sampling from normal distributions with standard deviations of 0.125, 0.25, and 0.924, to give a total phenotypic variance of 1, including the variance contributed by the QTL effect.
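A quick check confirms that the stated standard deviations, together with the QTL variance of 0.0672, give a phenotypic variance close to 1. The Python sketch below also shows how one record could be assembled from its components. The names are illustrative, and treating the trait as a simple sum of the grandsire, sire, QTL, and residual terms is an assumption based on the definitions above.

```python
import random

SD_GRANDSIRE = 0.125
SD_SIRE = 0.25
SD_RESIDUAL = 0.924
QTL_VARIANCE = 2 * 0.7 * 0.3 * 0.4 ** 2  # 2p(1 - p)a^2 = 0.0672


def total_phenotypic_variance():
    """Sum of the variance components used in the simulation."""
    return SD_GRANDSIRE ** 2 + SD_SIRE ** 2 + SD_RESIDUAL ** 2 + QTL_VARIANCE


def simulate_trait(grandsire_effect, sire_effect, qtl_effect, rng):
    """One trait value: grandsire + sire + QTL effect + random residual."""
    return grandsire_effect + sire_effect + qtl_effect + rng.gauss(0.0, SD_RESIDUAL)


rng = random.Random(2)  # arbitrary seed
g_i = rng.gauss(0.0, SD_GRANDSIRE)
s_j = rng.gauss(0.0, SD_SIRE)
y_ijk = simulate_trait(g_i, s_j, 0.4, rng)  # 0.4 is a hypothetical QTL value
var_total = total_phenotypic_variance()
```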

For each granddaughter, the simulated grandpaternal QTL allele value (GQijk) was defined as 1 or -1 for granddaughters that received the grandpaternal alleles Q and q, respectively, and 0 for granddaughters that received neither grandpaternal allele. The estimated grandpaternal QTL allele value, E(pQijk), was computed as the expectation of the QTL value on the basis of the probabilities for each grandpaternal QTL allele, as estimated by the computer program based on Equation 4. That is, E(pQijk) = pQijk - pqijk, where pqijk is the probability that the granddaughter received allele q. Results of the simulations were evaluated on the basis of the regression of GQijk on E(pQijk), the estimates of the grandpaternal QTL allelic effects as derived from Equation 1, and E(pQ) as derived from Equation 2. The individual families were analyzed separately, and all families were analyzed jointly.
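The evaluation described here regresses the true allele value (1, -1, or 0) on its expectation, which with these codes is the probability of Q minus the probability of q. A minimal Python sketch follows. The function names and the toy data are hypothetical.

```python
def expected_allele_value(p_Q, p_q):
    """Expectation of the coded allele value: (+1)*pQ + (-1)*pq + 0*px."""
    return p_Q - p_q


def simple_regression(x, y):
    """Slope, intercept, and coefficient of determination of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Toy case with perfectly identified alleles: slope 1, intercept 0, R^2 = 1.
true_values = [1, -1, 0, 1, 0, -1]
estimates = [expected_allele_value(pq_pair[0], pq_pair[1])
             for pq_pair in [(1, 0), (0, 1), (0, 0), (1, 0), (0, 0), (0, 1)]]
slope, intercept, r_squared = simple_regression(estimates, true_values)
```

With uncertain allele origin the estimates shrink toward zero, and the coefficient of determination drops below 1.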

Analysis of actual data:
Various studies have found a major QTL chiefly affecting protein percentage near the middle of chromosome 6 (reviewed by RON et al. 2001). Two Israeli sires were found to be heterozygous for this QTL on the basis of a daughter design analysis. A total of 593 putative maternal granddaughters of sire 2278 were genotyped for six microsatellites covering the region including the QTL. Sire 2278 was heterozygous for all six microsatellites. The genetic map distances between the microsatellites were estimated from the data by CRI-MAP (http://linkage.rockefeller.edu/soft/crimap/), and the assumed QTL location was determined by interval mapping, as described previously (RON et al. 2001). The recombination frequencies were then computed on the basis of the Haldane mapping function. The markers genotyped and the assumed recombination frequencies between the markers and the QTL are given in Table 1. Population allelic frequencies for the microsatellites were estimated from allele frequencies of the genotyped animals.


 

 
Table 1. The markers genotyped on chromosome 6, the number of alleles observed, and the recombination frequencies between adjacent markers

The granddaughters were progeny of 15 sires. Each sire had at least six daughters, so that sire genotype and linkage phase of the sires for all six markers could be determined with very high likelihood. Cows with valid genotypes for fewer than three markers were deleted from the analysis. Likewise, cows that received neither allele of their listed sire for any genetic marker with a valid genotype were assumed not to be daughters of the listed sire and were therefore also deleted from the analysis. Microsatellite genotyping methods were as described previously by RON et al. (2001).

Testing the hypothesis of equal allelic frequency:
If a single grandsire family is analyzed and the two QTL alleles are at equal frequency in the population, then the estimate of X in model 1 should be equal to the mean of Q and q. In this case model 1 can be modified as

$$Y_{ij} = A_Q\,E(p_{Qij}) + S_i + e_{ij} \tag{6}$$

where AQ is one-half the QTL substitution effect and the other terms are as defined previously. In this model, termed model 2, E(pQij) is now a linear regression coefficient with an effect equal to AQ. The hypothesis of equal QTL allele frequency can be tested by an F-statistic computed as the ratio of the difference of the models 1 and 2 sums of squares to the model 1 residual sum of squares. This F-value will have a single numerator degree of freedom (d.f.) and denominator d.f. equal to the residual sum of squares d.f. in model 1. If the F-value is significant, then the hypothesis of equal allele frequency can be rejected. The hypothesis of equal allelic frequency was tested for both the simulated data jointly for all families and the chromosome 6 QTL effect on protein percentage in granddaughters of sire 2278.
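The test statistic can be sketched as a comparison of residual sums of squares from the two fits. The Python function below assumes that the denominator is the model 1 residual mean square (residual sum of squares divided by its d.f.), which is the usual form of such an F-ratio. The function name and the numbers are hypothetical.

```python
def equal_frequency_f(rss_model1, rss_model2, resid_df_model1):
    """F-statistic with 1 numerator d.f. for model 2 nested in model 1.

    rss_model1: residual sum of squares of the full model (model 1).
    rss_model2: residual sum of squares of the restricted model (model 2).
    The drop in residual SS equals the gain in model SS, so either
    difference can be used in the numerator.
    """
    numerator = rss_model2 - rss_model1        # one degree of freedom
    denominator = rss_model1 / resid_df_model1  # residual mean square
    return numerator / denominator


# Hypothetical sums of squares, not values from Table 5.
f_value = equal_frequency_f(rss_model1=400.0, rss_model2=404.0, resid_df_model1=400)
```

For large denominator d.f., the 5% critical value of F with one numerator d.f. is close to 3.84, so the hypothetical value of 4.0 would lead to rejection of equal allele frequencies.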


*  RESULTS

The regressions of GQijk on E(pQijk) are given in Table 2 for each individual grandsire family and for all families jointly. As expected, all regressions were very close to unity, and all y-intercepts were close to zero. Coefficients of determination were in the range of 0.79–0.86 for the individual families and 0.83 for the joint analysis. Thus, on the basis of data from four linked polymorphic loci, the computing algorithm was able to accurately estimate the grandpaternal allele received.


 

 
Table 2. Regressions of simulated QTL grandpaternal allele value on the estimated values in the simulated data

Estimates of QTL grandpaternal allelic effects, E(pQ), and their standard errors are given in Table 3. Allelic effects were estimated relative to the estimate for granddaughters that received neither grandpaternal allele. Since the dependent variable is the phenotypic value of the granddaughters, the expectation of Q - q is equal to the substitution effect. The substitution effect estimated over all families was 0.447, and E(pQ) was 0.644 with a standard error of 0.069. Both estimates are close to the simulated values of 0.4 and 0.7, respectively, and the simulated values are well within the confidence intervals of the estimates, computed as plus or minus 2 standard errors.


 

 
Table 3. Estimates of QTL grandpaternal allelic effects and population allele frequencies for the simulated data

E(pQ) for the individual families ranged from 0.183 to 1.081, with a mean of 0.656 and a standard deviation of 0.28. Equation 2 is based on the assumption that only two QTL alleles are segregating in the population. If this assumption is incorrect, then the E(pQ) could be outside the two-allele hypothesis parameter space of 0–1. The mean of the standard errors for the individual families was 0.25, which is close to the standard deviation among the estimates for E(pQ). Even though the mean of E(pQ) was close to the simulated value of 0.7, the standard error with a QTL effect and sample of this size is too large to reach any meaningful conclusions for the individual families. The confidence interval includes nearly the entire possible range for E(pQ), assuming that only two alleles are segregating.

The standard error of Q - q for the individual families was ~0.135. With 500 granddaughters per family, it is expected that 125 granddaughters should have received each grandpaternal allele. Thus if the allele passed was known without error, the standard error of the estimate would be (2/125)^(1/2) = 0.126. This value is only 7% less than the value derived from the simulations. Thus, increasing the number of markers or using more informative markers could increase the efficiency of the method only marginally. Furthermore, predictions of statistical power, based on the assumption of completely correct QTL allelic identification, will be only slightly inflated.
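The arithmetic behind this comparison can be written out as a worked equation. It assumes a residual variance of approximately 1 (in phenotypic variance units) and two independent group means of 125 records each; both are simplifying assumptions rather than stated details of the analysis.

```latex
\[
\mathrm{SE}(\hat{Q} - \hat{q})
  \approx \sqrt{\frac{\sigma^2}{125} + \frac{\sigma^2}{125}}
  = \sqrt{\frac{2}{125}}
  \approx 0.126 \quad (\sigma^2 \approx 1),
\qquad
\frac{0.135}{0.126} \approx 1.07 .
\]
```

The ratio of about 1.07 corresponds to the 7% difference between the simulated and the ideal standard errors.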

In the analysis of the actual data of chromosome 6, there were 430 granddaughters with valid genotypes. Of these cows, 424 had valid genetic evaluations for all five production traits. The mean probability values for the two grandpaternal alleles were 0.23 and 0.19, compared to the theoretical expected values of 0.25. The mean probability to receive neither allele was 0.57, compared to the expected value of 0.5. Estimates of QTL grandpaternal allelic effects and population allele frequencies for the effect on chromosome 6 are given in Table 4. The estimates of the substitution effects derived previously by RON et al. (2001) for this population by a daughter design are also given. The current estimates were generally somewhat greater, but similar to the previous estimates for all five traits. Since a different sample of cows was analyzed, these results can be considered an independent confirmation of the results of RON et al. (2001). In the daughter and MGD designs the expectation of the contrast estimated on the level of cow breeding value is a regressed estimate of the QTL substitution effect, while in the granddaughter design the expectation of the contrast on the level of the son breeding value is a regressed estimate of the substitution effect.


 

 
Table 4. Estimates of QTL grandpaternal allelic effects and population allele frequencies for the effect on chromosome 6

Except for protein yield, E(pQ) was within the range of 0.5–0.8. The estimate was only 0.2 for protein yield, but the substitution effect for this trait was only marginally significant, and the standard error was 0.38. E(pQ) for protein percentage, which was the trait with the greatest effect relative to its standard error, was 0.63 with a standard error of 0.06. Thus the confidence interval for E(pQ), taken as plus or minus 2 standard errors, is 0.51–0.75. The allele termed Q had a positive effect on all traits, except milk yield.

The ANOVA tables used to test the hypothesis of equal QTL allelic frequency for the simulated data including all families and for the chromosome 6 QTL effect on protein percentage are given in Table 5. In both cases the F-values were significant at the 5% level. Thus the hypothesis of equal QTL allelic frequency can be rejected. Therefore, on the basis of both the F-test and the confidence interval for E(pQ), we can conclude that the population frequency of the allele that increases protein percentage on chromosome 6 is >0.5.


 

 
Table 5. ANOVA table to reject the hypothesis of equal QTL allelic frequency


*  DISCUSSION

Estimates of QTL effects derived from either a daughter or a granddaughter design will generally be biased, due to selection of effects deemed to be "significant" (GEORGES et al. 1995). This will not be the case in the MGD, because effects that were found to be significant in a daughter or granddaughter design analysis are reanalyzed by the MGD on an independent sample of cows (maternal granddaughters). There is therefore no reason to assume that any bias obtained in the original analysis, due to selection, will carry over into the MGD analysis, in which no selection based on statistical significance is performed. All markers and all families selected by the original daughter or granddaughter design analysis will be included in the MGD analysis. Furthermore, GEORGES et al. (1995) showed that the effect of bias is greatest for small QTL effects close to the limit of significance and decreases for large effects, which would be selected as significant in any event. The MGD is applicable only to effects in which the QTL substitution effect is well estimated and, therefore, highly significant.

The second question that must be addressed is the problem of misidentification of sires' QTL genotypes. Again there is no reason to assume that a sire's genotype will be misclassified in both analyses. For example, if a probability value of 5% is used to determine significance of the within-family contrast, the probability that significant contrasts will be obtained by chance in two independent families is 0.05 × 0.05 = 0.25%. These events can therefore easily be detected by inspecting the QTL allele contrasts in the MGD analysis. Any grandsire with a nonsignificant contrast in the MGD analysis will be assumed to be misclassified and will be deleted from the analysis. In the simulated data generated, the contrast between the two grandpaternal alleles was significant for all 10 families at P < 0.05. The opposite mistake, classifying a heterozygous sire as homozygous in the original daughter or granddaughter design analysis, will not bias the results of the MGD analysis. All that will happen is that one less sire will be included in the MGD analysis. The question of bias in the estimation of QTL effects due to misclassification of sires in daughter and granddaughter designs has been considered in detail (ISRAEL and WELLER 2002).

Although the sample size for the QTL on chromosome 6 was relatively small compared to the simulated data, the QTL effect on protein percentage relative to the phenotypic standard deviation is much greater. The substitution effect on protein percentage is approximately equal to 1 phenotypic standard deviation (RON et al. 2001), compared to a substitution effect of 0.4 standard deviations in the simulations. For protein percentage the standard error for E(pQ) was ~10% of the substitution effect, which is similar to the ratio obtained for the joint analysis of all 10 simulated families. Approximate confidence intervals for E(pQ) could also be derived by the nonparametric bootstrap method (VISSCHER et al. 1996). RON et al. (2001) noted that if 75% of the sires are homozygous for the QTL (six out of eight) and only two QTL alleles are segregating in the population, then the frequency of the more frequent allele assuming random mating is ~0.85. Considering that only eight sires were analyzed, this estimate corresponds rather well with the current estimate of 0.63 ± 0.06. The economic value of this allele would also be positive for most common selection indices, which generally give positive values for fat and protein yield, and negative or zero values for milk yield. It is not surprising that the economically favorable allele is already at a relatively high frequency in the commercial population.
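The value of ~0.85 attributed above to RON et al. (2001) can be reproduced under Hardy-Weinberg proportions for a biallelic locus, where the expected fraction of homozygotes is p² + (1 − p)². The sketch below solves this for the frequency of the more frequent allele; the function name is hypothetical, and the calculation assumes random mating and exactly two alleles, as stated in the text.

```python
import math


def major_allele_frequency(homozygous_fraction):
    """Solve p**2 + (1 - p)**2 = homozygous_fraction for p >= 0.5.

    Assumes Hardy-Weinberg proportions at a locus with two alleles.
    """
    if not 0.5 <= homozygous_fraction <= 1.0:
        raise ValueError("with two alleles, homozygosity must lie in [0.5, 1]")
    # p**2 + (1 - p)**2 = H  =>  2p**2 - 2p + (1 - H) = 0
    # =>  p = (1 + sqrt(2H - 1)) / 2 for the more frequent allele.
    return (1.0 + math.sqrt(2.0 * homozygous_fraction - 1.0)) / 2.0


# Six of the eight sires were homozygous for the QTL.
p = major_allele_frequency(6 / 8)
print(round(p, 2))
```

With only eight sires the observed homozygous fraction is itself subject to large sampling error, so this point value is a rough check rather than a precise estimate.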

The analyses of both the simulated and actual data demonstrate that either very large sample sizes or very large effects will be required to obtain reasonable power to test meaningful hypotheses. Thus the MGD does not share the main advantage of the granddaughter design that moderately sized effects can be detected by genotyping relatively few individuals. In both data sets analyzed, the hypothesis of equal QTL allelic frequencies could be rejected, but only marginally, even though the simulated data included records on 5000 granddaughters, and the effect on chromosome 6 is approximately equal to a complete phenotypic standard deviation (RON et al. 2001).

The algorithm employed assumed that the QTL location was known without error. Of course this will almost never be the case for actual data. In the case of chromosome 6, the confidence interval for the QTL affecting protein percentage was 4 cM, very close to marker BM143. We somewhat arbitrarily assigned the QTL a position between BM143 and BMS382, but very close to BM143. Further study is required to determine the effect of inaccurate QTL location on the proposed method. However, since the method is effective only for large QTL or very large samples, it can be assumed that the confidence interval for QTL location will be quite narrow under any situation in which this method is applied.

If significant contrasts are found in several families, then there is a possibility that more than two alleles with significant frequencies are present in the population. An example of the results expected in this case is given in Figure 2. It will generally not be possible to reject the two-allele hypothesis on the basis of the magnitude of the contrasts among the families in the original analysis. However, if the two-allele hypothesis is correct, then the estimates for Q, q, and X should repeat across families. Thus the two-allele hypothesis can be tested by comparing a model that assumes single estimates for Q, q, and X for all families to a model that derives separate estimates for these parameters for each family. Furthermore, if only two QTL alleles are segregating in the population and the significant families are reanalyzed by an MGD as described, the significant within-family contrasts should repeat, but there should be no correlation among families for the within-family contrasts found in the original analysis and the within-family contrasts in the second analysis.



 
Figure 2. Graphic representation of estimated effects assuming three alleles. Five grandsire families, denoted GS1–GS5, are shown. The phenotypic means for the granddaughter groups of each grandsire are plotted as short bands. The mean population value for the quantitative trait is denoted X. It is assumed that three QTL alleles, denoted Q1, Q2, and Q3, are segregating in the population of grandsires, with effects of α1, α2, and α3 relative to the population mean. Grandsires GS1, GS2, and GS3 are homozygous for the QTL, while grandsires GS4 and GS5 are heterozygous. Therefore only a single band is shown for the first three grandsires.

The MGD can also be used to determine QTL genotype of grandsires homozygous for the QTL. Model 1 is then modified as

(7)

with pQij defined as the probability that the granddaughter received either grandpaternal allele. If the grandsire is homozygous for the positive allele, then the estimate of Q should be greater than X, while if the grandsire is homozygous for the negative allele, then Q should be less than X. If the contrast between the Q and X effects is significant, then it can be concluded that the QTL alleles of the grandsire are different from the population mean.

A final application of the MGD is to compare allele frequencies across countries. Many sires are now used widely in different countries. This design could be applied to a sire with many granddaughters in two different countries. Analyzing each country separately, individual country estimates for E(pQ) could be derived. If the two estimates are significantly different, it could be concluded that the QTL allelic frequencies in the two countries are in fact different or that different alleles are segregating for the QTL. Analysis of homozygous sires by the model of Equation 7 for a series of unlinked QTL in different countries can represent the genetic background of each country regarding the QTL. This information could then be used to make decisions on importation of genetic material.
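The text does not specify how two country estimates of E(pQ) would be compared. One simple possibility, sketched below, is an approximate two-sided z-test on the difference between two independent estimates, using their standard errors. This test is a technique substituted here for illustration and is not described in the original analysis; the function name is hypothetical, and the numerical inputs are invented solely to show the calculation.

```python
import math


def compare_estimates(est_a, se_a, est_b, se_b):
    """Approximate z-test for the difference between two independent estimates.

    Returns (z, two_sided_p). Assumes both estimates are independent and
    approximately normally distributed (assumptions of this sketch).
    """
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical E(pQ) estimates and standard errors for two countries.
z, p_value = compare_estimates(0.63, 0.06, 0.40, 0.07)
print(f"z = {z:.2f}, two-sided P = {p_value:.3f}")
```

A significant difference under such a test would still leave open the two interpretations given above: different allelic frequencies, or different alleles segregating for the QTL.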


*  ACKNOWLEDGMENTS

This research was supported by grants from the Israel Milk Marketing Board and the U.S.-Israel Binational Agricultural Research and Development fund. We thank E. Ezra, A. Gilmour, and A. Neiman for computing support and useful discussions.

Manuscript received December 21, 2001; Accepted for publication July 25, 2002.


*  LITERATURE CITED

BOVENHUIS, H. and J. I. WELLER, 1994  Mapping and analysis of dairy cattle quantitative trait loci by maximum likelihood methodology using milk protein genes as genetic markers. Genetics 136:267-280.

FERNANDO, R. L. and M. GROSSMAN, 1989  Marker assisted selection using best linear unbiased prediction. Genet. Sel. Evol. 21:467-477.

GEORGES, M., D. NIELSEN, M. MACKINNON, A. MISHRA, and R. OKIMOTO et al., 1995  Mapping quantitative trait loci controlling milk production in dairy cattle by exploiting progeny testing. Genetics 139:907-920.

GRIGNOLA, F. E., I. HOESCHELE, and B. TIER, 1996  Mapping quantitative trait loci in outcross populations via residual maximum likelihood. I. Methodology. Genet. Sel. Evol. 28:479-490.

HALDANE, J. B. S., 1919  The combination of linkage values and the calculation of distances between the loci of linked factors. J. Genet. 8:299.

HEYEN, D. W., J. I. WELLER, M. RON, M. BAND, and J. E. BEEVER et al., 1999  A genome scan for QTL influencing milk production and health traits in dairy cattle. Physiol. Genomics 1:165-175.

ISRAEL, C. and J. I. WELLER, 1998  Estimation of candidate gene effects in dairy cattle populations. J. Dairy Sci. 81:1653-1662.

ISRAEL, C. and J. I. WELLER, 2002  Effect of type I error threshold on marker-assisted selection in dairy cattle. Livest. Prod. Sci. in press.

MACKINNON, M. J. and J. I. WELLER, 1995  Methodology and accuracy of estimation of quantitative trait loci parameters in a half-sib design using maximum likelihood. Genetics 141:755-770.

MACKINNON, M. J. and M. A. J. GEORGES, 1998  Marker-assisted preselection of young dairy sires prior to progeny-testing. Livest. Prod. Sci. 54:229-250.

NEIMANN-SORENSEN, A. and A. ROBERTSON, 1961  The association between blood groups and several production characters in three Danish cattle breeds. Acta Agric. Scand. 11:163-196.

RON, M., H. LEWIN, Y. DA, M. BAND, and A. YANAI et al., 1995  Prediction of informativeness for microsatellite markers among progeny of sires used for detection of economic trait loci in dairy cattle. Anim. Genet. 26:439-441.

RON, M., D. KLIGER, E. FELDMESSER, E. SEROUSSI, and E. EZRA et al., 2001  Multiple quantitative trait locus analysis of bovine chromosome 6 in the Israeli Holstein population by a daughter design. Genetics 159:727-735.

SAS INSTITUTE, 1988  SAS User's Guide: Statistics, Release 6.03. SAS Institute, Cary, NC.

SONG, J. Z. and J. I. WELLER, 1998  Maximum likelihood estimation of quantitative trait loci parameters with flanking markers in a half-sib population. Proceedings of the 6th World Congress on Genetics Applied to Livestock Production, Armidale, NSW, Australia, Vol. 26, pp. 341–344.

VANRADEN, P. M. and G. R. WIGGANS, 1991  Derivation, calculation and use of national animal model information. J. Dairy Sci. 74:2737-2746.

VISSCHER, P. M., R. THOMPSON, and C. S. HALEY, 1996  Confidence intervals in QTL mapping by bootstrapping. Genetics 143:1013-1020.

WELLER, J. I., Y. KASHI, and M. SOLLER, 1990  Power of "daughter" and "granddaughter" designs for genetic mapping of quantitative traits in dairy cattle using genetic markers. J. Dairy Sci. 73:2525-2537.



