Abstract
A method to measure genomic response to natural and artificial selection by means of genetic markers in livestock is proposed. Genomic response through several levels of selection was measured using sequential testing for distorted segregation of alleles among selected and nonselected sons, single-sperm typing, and a test with records for growth performance. Statistical power at a significance level of 0.05 was >0.5 for a marker linked to a QTL with recombination fractions 0, 0.10, and 0.20 for detecting genomic responses for gene effects of 0.6, 0.7, and 1.0 phenotypic standard deviations, respectively. Genomic response to artificial selection in six commercial bull sire families comprising 285 half-sib sons selected for growth performance was measured using 282 genetic markers evenly distributed over the cattle genome. A genome-wide test using selected sons was significant (P < 0.001), indicating that selection induces changes in the genetic makeup of commercial cattle populations. Markers located in chromosomes 6, 10, and 16 identified regions in those chromosomes that are changing due to artificial selection as revealed by the association of records of performance with alleles at specific markers. Either natural selection or genetic drift may cause the observed genomic response for markers in chromosomes 1, 7, and 17.
GENETIC improvement of farm animals has traditionally been done by artificial selection. In short, artificial selection means that animals with variations better fitted to the production conditions are chosen to breed and, consequently, to pass on favorable characteristics (genes) to their offspring. Theoretically, this process leads to changes in allele frequencies at loci affecting the traits under selection (Crow and Kimura 1970). However, experimental results showing changes in allele frequency at loci located genome-wide in populations undergoing artificial selection have not yet been reported. The avenue taken in recent years has been to use molecular markers to map quantitative trait loci (QTL) in commercial farm animals (e.g., Georges et al. 1995) with the final goal of improving the efficiency of artificial selection through so-called marker-assisted selection.
Genomic response to selection is the change in allele frequency of loci in specific locations of the genome as a result of selection for a quantitative trait. Random anonymous markers such as microsatellites offer the opportunity to investigate genomic response within large progeny groups of bull sires because of the widespread use of artificial insemination in cattle.
The objective of this article is to propose a novel approach to measure genomic response to selection using genetic markers on the basis of sequential testing of the genomic changes taking place through various levels of selection within the offspring or gametes of widely used sires in cattle. It includes (1) testing using young bulls selected for production traits (artificial selection), (2) testing using culled young bulls (zygotic and artificial selection), and (3) testing for selection in sperm cells (gametic selection).
MATERIALS AND METHODS
Population structure and selection: The material for this study is Norwegian cattle, which is a commercial breed specialized in both meat and milk production. Elite sires produce large groups of young bulls. Every year, 400 young bulls (sons of elite sires) are tested at station for growth (measured in grams/day and recorded between 3 and 11 months of age) and conformation. Genetic evaluations for growth utilize the bull’s own performance and ∼30 half-sibs. Of the 400 bulls entering in the station, 200 are selected for high growth. Of these, ∼10 young bulls are eliminated for low fertility and another 70 bulls are eliminated for bad conformation. This process resulted in ∼130 young bulls (∼30% selected) with 250 daughters each that were progeny tested for milk. Table 1 shows the six bull sires used and the number of selected and culled sons.
DNA isolation and molecular markers: Semen for genotyping of sires and their selected half-sib sons (after selection for growth and conformation has taken place) was available from the Norwegian breeding organization. Semen from culled sons was not available, so DNA was extracted from serum samples routinely utilized in paternity testing of young bulls. Serum samples were lysed at 96° for 10 min and used directly in the PCR. A total of 282 genetic markers covering 2807 cM of the cattle genome (261 microsatellites, one protein polymorphism, and 20 coding genes) were typed in six elite bull sires and in 285 sons selected for high growth and good conformation in the commercial production of Norwegian cattle. A previously constructed genetic map (Våge et al. 2000; http://www.nlh.no/ihf/Genkartstorfe/) was used for testing for transmission disequilibrium among selected sons.
Table 1. Identification number and name of the six sire families and number of selected and culled sons per sire family
Distorted segregation of alleles among sperm cells for each of the six sires was tested using single-sperm typing (Li et al. 1988), which utilizes one single-sperm cell and repeated cycles of PCR, allowing amplification of the marker alleles. Single-sperm typing allows testing for a potentially unlimited number of meioses per donor. Approximately 50 sperm cells were isolated for each sire, and DNA extraction followed by PCR amplification was carried out for the markers showing significant results in the previous tests. The percentage of success in amplifying single-sperm cells was between 89 and 96%.
Testing of genomic response to selection: Sequential testing for distorted segregation of alleles from widely used commercial bull sires is carried out in three stages: (1) among progeny selected for high growth using a full coverage of the genome, (2) among culled progeny on the basis of low growth, and (3) among sperm cells of sires. Typing in steps 2 and 3 was carried out only for markers yielding significant results in step 1 to reduce the amount of genotyping and reduce the problem of multiple testing at the last stage.
Testing for distorted segregation of alleles among either selected or culled young bulls can be attained by the following method. Consider an elite sire heterozygous for a marker with alleles A and a. DNA for testing dams’ genotypes was not available. Alleles inherited from dams to sons could be either of the sire alleles (A or a) or any other allele segregating in the population (denoted by a*). If this marker is near a QTL affecting a trait under selection, then the frequency among selected sons inheriting either A or a alleles from their sires would be different from the expected 50:50. This approach is similar to a transmission disequilibrium test developed for mapping loci affecting diseases in which the frequency of transmission of alleles from heterozygous parents to their affected offspring deviates from the expected 50% (Spielman et al. 1993). We developed a maximum-likelihood approach to test for transmission disequilibrium of alleles to half-sib sons. Following Gomez-Raya (2001), the likelihood equation to be maximized is
The genome-wide transmission disequilibrium testing was performed estimating both the transmission parameter and the dam allele frequencies. A joint test for all families for a particular marker was possible using the properties of the $\chi^2$ distribution. That is, the sum of independent variables having central $\chi^2$ distributions follows a central $\chi^2$ distribution with degrees of freedom equal to the sum of the degrees of freedom of the former central $\chi^2$ distributions. A test for selection at the genome level was possible under the assumption of independence of $\chi^2$ across the genome. This is strictly speaking not true because linked markers will have a tendency to cosegregate their alleles. However, the impact of this assumption will be small since each marker will be “unlinked” with most of the other markers in the genome.
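The additivity property used for the joint test can be illustrated with a short Python sketch. It is not the likelihood used in this study: the per-family statistic below is a simplified likelihood-ratio test that assumes the paternal allele of every son can be traced (so no dam allele frequencies are estimated), and each family statistic is assumed to carry 1 d.f. The function names `chi2_sf`, `lrt_traceable`, and `joint_test` are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a central chi-square with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for 1 and 2 d.f., then the recurrence
    # Q(x; k+2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1
    else:
        q, k = math.exp(-half), 2
    while k < df:
        q += math.exp(-half + (k / 2.0) * math.log(half) - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return q

def lrt_traceable(n_A, n_a):
    """Likelihood-ratio statistic for H0: transmission = 0.5 in one family.

    Simplifying assumption: every son's paternal allele is known.
    """
    n = n_A + n_a
    stat = 0.0
    for observed in (n_A, n_a):
        if observed > 0:
            stat += 2.0 * observed * math.log(observed / (n / 2.0))
    return stat

def joint_test(stats, df_each=1):
    """Sum independent family statistics; d.f. add up across families."""
    total = sum(stats)
    df = df_each * len(stats)
    return total, df, chi2_sf(total, df)

# Hypothetical counts (A-carriers, a-carriers) for three sire families.
counts = [(30, 20), (28, 22), (35, 15)]
total, df, p = joint_test([lrt_traceable(a, b) for a, b in counts])
print(f"joint statistic = {total:.2f}, d.f. = {df}, P = {p:.4f}")
```

The same summation extends to a genome-level statistic only under the independence assumption discussed above.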
A statistical test for distorted segregation of alleles among sperm cells was carried out for those markers and bull sires showing significant results in the testing among selected young bulls. A test for distorted segregation using single-sperm typing was performed by counting the sperm cell carriers of alleles A and a ($O_A$ and $O_a$, respectively) and by computing $\chi^2 = (O_A - E_A)^2/E_A + (O_a - E_a)^2/E_a$, where $E_A = E_a = \tfrac{1}{2}(O_A + O_a)$. This test statistic is $\chi^2$ distributed with 1 d.f.
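As a minimal sketch, the single-sperm test statistic above can be computed as follows. The function name `sperm_segregation_test` and the example counts are hypothetical; the P value uses the identity that the upper tail of a 1-d.f. $\chi^2$ equals erfc(sqrt(x/2)).

```python
import math

def sperm_segregation_test(o_A, o_a):
    """Chi-square test (1 d.f.) for 50:50 segregation among typed sperm cells."""
    expected = 0.5 * (o_A + o_a)  # E_A = E_a
    chi2 = (o_A - expected) ** 2 / expected + (o_a - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 d.f.
    return chi2, p_value

# Hypothetical counts: 30 sperm cells carrying A and 18 carrying a.
chi2, p = sperm_segregation_test(30, 18)
print(f"chi2 = {chi2:.2f}, P = {p:.3f}")
```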
A t-test contrasting growth performance among all sons (selected and culled) inheriting alternative alleles from their sires was carried out by
To account for multiple testing, a permutation test (Churchill and Doerge 1994) was performed by shuffling the observations on growth performance 500,000 times and randomly assigning those observations to sons. The shuffling was carried out within the offspring of each sire while keeping the same genotype at all tested markers for each son. The steps for computing experimentwise P values were (1) finding the maximum t-test among all tested markers for each permuted data set, (2) ordering the maximum t-test values across permutations, (3) computing the number of times (ntimes) out of 500,000 permutations that the t-test in the real data was higher than the maximum of the permuted data, and (4) computing the experimentwise P value as 1 - ntimes/500,000.
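The permutation steps can be sketched in Python as below. Several details are assumptions made only for illustration: the t statistic is a pooled-variance two-sample t (the exact form used in the study is not reproduced here), absolute values are compared, the maximum is taken over all sire family by marker combinations, and far fewer permutations than 500,000 are run. All function and variable names are hypothetical.

```python
import math
import random

def t_stat(x, y):
    """Pooled-variance two-sample t statistic (assumed form)."""
    nx, ny = len(x), len(y)
    if nx < 2 or ny < 2:
        return 0.0
    mx, my = sum(x) / nx, sum(y) / ny
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    s2 = ss / (nx + ny - 2)
    if s2 == 0.0:
        return 0.0
    return (mx - my) / math.sqrt(s2 * (1.0 / nx + 1.0 / ny))

def max_abs_t(growth, families, genotypes):
    """Largest |t| over all sire families and markers.

    genotypes[m][i] is 0 or 1: the paternal allele son i inherited at marker m.
    """
    best = 0.0
    for idx in families.values():
        for alleles in genotypes:
            g0 = [growth[i] for i in idx if alleles[i] == 0]
            g1 = [growth[i] for i in idx if alleles[i] == 1]
            best = max(best, abs(t_stat(g0, g1)))
    return best

def experimentwise_p(growth, sire, genotypes, observed_t, n_perm=1000, seed=1):
    """Shuffle growth records within sire; genotypes stay attached to sons."""
    rng = random.Random(seed)
    families = {}
    for i, s in enumerate(sire):
        families.setdefault(s, []).append(i)
    ntimes = 0
    for _ in range(n_perm):
        permuted = list(growth)
        for idx in families.values():
            values = [growth[i] for i in idx]
            rng.shuffle(values)
            for i, v in zip(idx, values):
                permuted[i] = v
        # Count permutations in which the real statistic exceeds the permuted maximum.
        if abs(observed_t) > max_abs_t(permuted, families, genotypes):
            ntimes += 1
    return 1.0 - ntimes / n_perm
```

Because the maximum over all tests is recorded in each permutation, the resulting P value is experimentwise rather than per marker.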
Changes in the transmission disequilibrium parameter by random sampling: The sampling of gametes from a heterozygous sire can lead to changes in the transmission disequilibrium parameter by chance. The approximated accumulated probability of random changes in the transmission parameter can be computed by taking advantage of the normal approximation to the binomial distribution,
Statistical power for a transmission disequilibrium test: For simplicity, it is assumed that a transmission disequilibrium test (TDT) is performed in which inheritance of the allele from the sire can always be traced. Let the sire be heterozygous A/a at a genetic marker and the total number of offspring be n. Assume also that a QTL with alleles Q and q affecting growth is linked to the marker with a recombination fraction c. The linkage phase in the sire is AQ/aq. The distribution of phenotypic values is assumed normal with means $\mu_Q$ and $\mu_q$ for sons inheriting alleles Q and q from their sire. The expected means among sons inheriting alleles A and a are $(1 - c)\mu_Q + c\mu_q$ and $(1 - c)\mu_q + c\mu_Q$, respectively. Therefore, the difference between sons inheriting alleles A and a is $(1 - 2c)(\mu_Q - \mu_q)$. The sons are evaluated for growth performance using relatives’ information. It is assumed that heritability of growth ($h^2$) was 0.40 and accuracy of estimated breeding values ($r$) was 0.80. Gene effects are usually given in phenotypic standard deviations rather than in the scale of the estimated breeding values using relatives’ information. The variance of estimated breeding values is $r^2\sigma_A^2$. Thus, if $g_P$ is the gene effect in phenotypic standard deviations, then the gene effect in standard deviations of estimated breeding values is $g_{\hat{a}} = g_P r/h$. If the marker is linked to a QTL affecting growth, then the proportion of sons inheriting either A or a will depart from 50% among sons selected for high growth. A mixture of two normal densities corresponding to the sons inheriting alternative alleles from their sire was used for the power calculation. The means of the two distributions used the expected value of gene effects in standard deviations of estimated breeding values. The variance was assumed to be the same in the two groups. Under truncation selection, the two groups of offspring contribute unequally to the selected group of sons.
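One possible reading of the scale conversion above is written out below as a worked example. It assumes that estimated breeding values regress on true breeding values with slope $r^2$, which is our interpretation rather than a step stated in the text; $\sigma_P$ denotes the phenotypic standard deviation and $\sigma_{\hat{a}}$ the standard deviation of estimated breeding values.

```latex
\begin{align}
\sigma_{\hat{a}} &= r\,\sigma_A = r\,h\,\sigma_P, \\
\text{expected shift in estimated breeding value} &= r^2\, g_P\,\sigma_P, \\
g_{\hat{a}} &= \frac{r^2\, g_P\,\sigma_P}{r\,h\,\sigma_P} = \frac{g_P\, r}{h}.
\end{align}
% Example with h^2 = 0.40, r = 0.80, g_P = 0.6:
%   g_{\hat{a}} = 0.6 \times 0.80 / \sqrt{0.40} \approx 0.76
% Marker contrast (1 - 2c) g_{\hat{a}}:
%   c = 0    -> 0.76
%   c = 0.10 -> 0.61
%   c = 0.20 -> 0.46
```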
The proportion of sons inheriting A and a alleles can be computed by fixing the total proportion selected (ps) and by using numerical integration on
The normal approximation to the binomial distribution was used to compute power. It is expected that 50% of the sons inherited either allele A or a from the sire under the null hypothesis. Therefore, the value of the abscissa under the null hypothesis is
Power of the mapping strategy for TDT followed by a daughter design: The mapping strategy consisted, first, of carrying out a TDT using selected offspring. The significant markers and families were then tested using records of performance. The selection at the TDT stage has two main advantages in terms of power: (1) The frequency of heterozygotes at the QTL is increased in the selected families in the t-test and (2) the multiple-testing problem might be reduced since the number of tests in the last stage is much reduced. Power of the experiment attributable to increased heterozygosity at the QTL among sire families can be computed following the approach of Weller et al. (1990) but with an increased heterozygosity.
The frequency of heterozygotes at the QTL among selected families based on TDT results is
Figure 1. Accumulated probability of obtaining a given transmission disequilibrium parameter under random sampling of 50 sons.
RESULTS
The accumulated probability of random changes in the transmission disequilibrium parameter when sampling 50 sons from a heterozygous sire is shown in Figure 1. It is unlikely that the transmission parameter falls outside the interval 0.30-0.70 when testing distorted segregation for that family size.
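The order of magnitude of this statement can be checked with a short sketch of the normal approximation to the binomial. The exact expression used for Figure 1 is not reproduced here; the version below assumes no continuity correction, and the function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def accumulated_probability(tau, n, p=0.5):
    """Approximate P(observed transmission <= tau) for n sons under Mendelian segregation."""
    sd = math.sqrt(p * (1.0 - p) / n)
    return normal_cdf((tau - p) / sd)

def prob_within(lower, upper, n):
    """Approximate probability that the observed transmission lies in [lower, upper]."""
    return accumulated_probability(upper, n) - accumulated_probability(lower, n)

# With 50 sons, the chance of staying inside 0.30-0.70 by sampling alone:
print(round(prob_within(0.30, 0.70, 50), 4))
```

Under these assumptions the probability of remaining inside the interval is above 0.99 for 50 sons.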
Statistical power for transmission disequilibrium testing of selection at a significance level of 0.05 is shown in Figure 2. Power to detect distorted segregation of alleles among offspring is >0.5 for QTL having an effect of >0.6 phenotypic standard deviations with full linkage of the marker to the QTL. Power decreases with increasing distance between the marker and the QTL, and for loose linkage, power is low. Therefore, a genome-wide scan should be carried out using markers separated from each other by ∼10 cM. These results also indicate that the Norwegian cattle population is suitable for measuring genomic response of QTL with moderate to large effect.
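The power calculation described in the methods can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of Figure 2: both groups of sons are given unit variance on the estimated-breeding-value scale, the truncation point is found by bisection rather than by the numerical integration used in the study, the test is two-sided at 0.05, and the number of sons `n` is a free, hypothetical input. Function names are also hypothetical.

```python
import math

Z_CRIT_05 = 1.959963984540054  # two-sided 5% critical value of the standard normal

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def selected_fractions(t, delta):
    """Fraction above truncation point t among sons inheriting A and a.

    The two groups are normal with means +delta/2 and -delta/2 and unit variance.
    """
    return 1.0 - normal_cdf(t - delta / 2.0), 1.0 - normal_cdf(t + delta / 2.0)

def transmission_among_selected(g_p, c, ps=0.325, h2=0.40, r=0.80):
    """Expected proportion of selected sons carrying the favorable sire allele."""
    delta = (1.0 - 2.0 * c) * g_p * r / math.sqrt(h2)  # marker contrast in EBV s.d.
    lo, hi = -10.0, 10.0
    for _ in range(200):  # bisection on the truncation point
        mid = 0.5 * (lo + hi)
        s_A, s_a = selected_fractions(mid, delta)
        if 0.5 * (s_A + s_a) > ps:
            lo = mid
        else:
            hi = mid
    s_A, s_a = selected_fractions(0.5 * (lo + hi), delta)
    return s_A / (s_A + s_a)

def tdt_power(g_p, c, n, ps=0.325, h2=0.40, r=0.80):
    """Power of a two-sided test of 50:50 transmission among n selected sons."""
    p1 = transmission_among_selected(g_p, c, ps, h2, r)
    se0 = math.sqrt(0.25 / n)              # standard error under the null
    se1 = math.sqrt(p1 * (1.0 - p1) / n)   # standard error under the alternative
    upper = 0.5 + Z_CRIT_05 * se0
    lower = 0.5 - Z_CRIT_05 * se0
    return (1.0 - normal_cdf((upper - p1) / se1)) + normal_cdf((lower - p1) / se1)

for c in (0.0, 0.10, 0.20):
    print(c, round(tdt_power(0.6, c, n=50), 3))
```

The sketch shows the qualitative pattern described above: power rises with gene effect and falls as the recombination fraction increases.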
It is also of interest to know the gain in power for QTL detection from using a TDT followed by a daughter design (TDT-DD) rather than a simple daughter design (DD). Power for the detection of QTL segregating at moderate or low frequencies increases when markers and families are selected on TDT results (Figure 3). In contrast, QTL segregating at an intermediate frequency do not benefit from the strategy that includes pretesting with TDT (Figure 3).
Figure 2. Power for detecting selection by means of genetic markers using TDT. Recombination fractions (c) were 0, 0.10, and 0.20. The population for typing corresponds to a selected sample of 32.5%. The level of significance is 0.05.
Testing of transmission disequilibrium of alleles among selected sons was performed with 282 genetic markers evenly distributed over the genome. The genome-wide test was highly significant (P < 0.001), indicating that natural and/or artificial selection for growth and conformation induces detectable changes in the genetic makeup of commercial populations (Table 2). Chromosomes 2, 6, 16, and 21 were significant at P < 0.01 whereas chromosomes 1 and 18 were significant at P < 0.05. Values of the likelihood-ratio test statistic for markers showing significant results in the genome-wide search are given in Table 3. The expected number of markers showing significant results at P < 0.01 by chance is ∼3 because 282 markers were used (282 × 0.01 ≈ 2.8), which suggests that there is an excess of significant markers that cannot be attributed to chance. However, for any individual marker, the method cannot distinguish transmission disequilibrium due to selection (i.e., a marker linked to a QTL influencing growth) from that due to chance, because of the amount of multiple testing performed.
Figure 4 shows the likelihood-ratio test statistic for markers in chromosome 2 and sire family 2005 when using estimates of allele frequencies in the dam population. The chromosomal fragments near INRA40 are transmitted in disequilibrium among selected young bulls (P < 0.01). A graphic display of the transmission parameters for this chromosome and sire family is given in Figure 5. A transmission map is constructed using all transmission parameters for all the markers after they are oriented in the two homologous chromosomes. Transmission maps illustrate how different regions of the chromosome are transmitted from sire to selected offspring.
Figure 3. Power of TDT followed by a daughter design vs. a simple daughter design using the population structure of Norwegian cattle.
The next step was to test for distorted segregation of alleles among nonselected offspring. Table 3 also shows the likelihood-ratio test statistic for nonselected sons. Markers INRA40 in family 2005 (P < 0.01), BM1237 in family 2052 (P < 0.10), BM1237 in family 3131 (P < 0.05), and BM6430 in family 2463 (P < 0.10) were significant. The lack of more significant results could be attributed to the weaker selection intensity among nonselected sons, which corresponded to a selected percentage of 66%.
The last step was to test for gametic selection at significant markers in step 1 using single-sperm typing. The results are also given in Table 3. Only marker BM1237 (family 2052) was significant (P < 0.05).
A summary of the transmission parameters in the three testing points (selected sons, nonselected sons, and sperm cells) is given in Figure 6. The transmission parameter is the frequency of transmission of one of the sire’s alleles to his sons. The number in parentheses in the histogram corresponds to the allele that is highly represented among selected sons. The relation between artificial selection and the observed distorted segregation among selected and nonselected sons can be established after performing a t-test contrasting growth performance of sons inheriting alternative alleles from their sire (Table 4). Alleles at a high frequency among selected sons should also increase growth. Markers BM4528 in families 2402 and 2946, BM1237 in family 2052, and BM6340 in family 2463 yielded significant results (P < 0.01), showing increased performance for sons carrying the allele highly represented among selected sons. INRA40 in family 2005 also yielded higher performance for bulls inheriting the allele highly represented among selected sons and was close to being significant. In all cases except marker BM4528 in sire family 2402, the pattern of segregation of alleles was reversed among nonselected sons. INRA128 was not significant for the t-test contrasting growth performance but showed distorted segregation of alleles among selected sons. Marker BM8125 was nonsignificant. These results suggest that sampling might play an important role in the genomic changes in populations undergoing artificial selection.
Table 2. Likelihood-ratio test statistic (LRT) for the segregation of alleles among selected offspring for all markers within each of the 29 autosomal chromosomes
Markers BM1237 in family 3131 and BM8125 in family 2005 were not associated with growth performance. As shown in Figure 6, the allele highly represented among selected sons was also the most abundant among nonselected sons. These results suggest that zygotic selection may have operated prior to the test for growth.
Table 3. Likelihood-ratio test statistic (LRT) using selected and nonselected half-sib sons and single-sperm typing
Another round of typing was carried out by (1) single-sperm typing using marker BM1237 (family 2052) with another sample of sperm cells and (2) typing for markers BM1237 (family 3131) and BM8125 (family 2005) using DNA from 48 half-sib daughters. The purpose of the first was to confirm gametic selection and of the second was to verify if natural selection has operated in that family and marker among the female offspring. The results for either single-sperm typing or half-sib daughters were nonsignificant ($\chi^2$ values ranging between 0.00 and 0.05).
Figure 4. Likelihood-ratio test statistic for markers in chromosome 2 and sire 2005 along the genetic distance (in centimorgans using the Kosambi mapping function). The horizontal line represents the threshold for hypothesis testing at P < 0.01.
Figure 5. Transmission map for chromosome 2 and sire 2005. Estimates of the transmission parameter for each marker along the genetic distance (in centimorgans using the Kosambi mapping function) were obtained assuming known allele frequencies in the dam population. The conditional probability of the most likely linkage phase given the observations for each pair of consecutive markers TGLA44, INRA40, TGLA431, CSFM50, ETH121, BM1223, BM2113, and OARFCB11 ranged from 0.90 to 1 and is shown in the box at the bottom.
Figure 6. Transmission disequilibrium parameter in selected sons, nonselected sons, and single-sperm typing. The values in parentheses are the corresponding alleles to the estimated transmission disequilibrium parameters (alleles highly represented among selected sons). Sire families are given at the top. (▪) Selected, () nonselected, (□) single-sperm typing.
DISCUSSION
Livestock represent a unique opportunity to follow genomic changes through the various levels of selection because of their large progeny groups. Sequential testing for distorted segregation of alleles among sperm cells and selected and culled sons is proposed to establish the main components of selection in commercial livestock at the gametic, zygotic, and human-made levels. Our results indicate that gametic selection (selection of gametes before fertilization), if it exists, is not strong enough to be detected, which could be attributed to the industry testing for fertility traits before young bulls are used to inseminate the cow population.
Table 4. Values of the t-test contrasting growth performance between sons inheriting alternative alleles from their sires
Zygotic selection is the selection from fertilization until the animals are ready for artificial insemination. This could occur if the same allele is more frequent among selected sons and among nonselected sons. This, in fact, occurred for markers BM1237 and BM8125 in our material. Consequently, those markers could be under zygotic selection. For example, it could happen when the marker is linked to a gene reducing viability of embryos or provoking high mortality among young animals. Another explanation is sampling, in which the allele is highly represented in selected and nonselected sons just by chance. Records on how zygotic selection occurred are needed to clearly establish how selection took place.
The main conclusion of this study is that the genetic makeup of commercial cattle is being systematically modified. Not surprisingly, the main cause for these changes is human intervention. The use of records of performance allowed us to establish the main forces influencing genomic response to artificial selection. Both direct and stochastic changes in the allele frequencies are important components of the response. Regions of chromosomes 6, 10, and 16 responded directly to artificial selection for growth in commercial cattle. This knowledge brings two applications: (1) to identify QTL responsible for the observed response and (2) to monitor other known loci with an important physiological role (e.g., immune response) that are located near those chromosomal areas responding to artificial selection. The first is comparable to today’s efforts for mapping QTL in commercial cattle (e.g., Georges et al. 1995). The second opens new possibilities for controlling genomic manipulation of farm animals.
Testing of linkage disequilibrium was performed within families. This was a necessary assumption because we used mostly microsatellites (random anonymous markers) and, therefore, linkage phase of marker and QTL alleles could be different in different sires. Consequently, population-wide linkage disequilibrium was not tested. It is clear from the numbers given in Table 1 that the contribution to the next generation of bull sires was not equal. In particular, family 2005 was highly selected (∼25% of sons were selected) whereas selection pressure on family 2052 was low (∼11%). Genomic response across families could be tested only if polymorphisms in coding genes or markers strongly linked to them were available. This would require a highly dense genetic map.
The research performed in this article was carried out using Norwegian cattle, which is a relatively small population. The American dairy cattle population is much larger, with several elite bull sires having between 100 and 300 sons progeny tested for dairy traits (E. B. Burnside, personal communication). The power for detection would be much higher than for European populations and would allow identification of the areas of the genome under selection for milk traits. This information could be compared with available information from QTL mapping in the dairy cattle population of America (Georges et al. 1995). It is a paradox that QTL of relatively large effect are segregating in the commercial dairy population since fixation at those loci after a few cycles of selection is expected (Crow and Kimura 1970; Gomez-Raya and Klemetsdal 1999). The approach proposed in this article could be used to elucidate if the chromosomal areas where QTL have been detected are also under selection pressure in the American cattle population. For example, mapped QTL in the American population that are not modifying their allele frequencies might provide evidence of other forces maintaining genetic variability.
During the past several years great progress has been achieved in building better comparative maps between livestock and humans/mouse (Band et al. 2000). Identifying breakpoints between conserved syntenic groups in different species is extremely important for the extrapolation of positional information from the highly developed human map to lower density maps in cattle. In the near future, it will be possible to identify positional candidate genes for the loci responding to artificial selection by taking advantage of this wealth of information and thus to establish the genetic nature of the genomic changes operating under selection in farm animals.
Acknowledgments
Information provided by Erling Sehested about the Norwegian cattle population is gratefully acknowledged. Biological material for typing was provided by GENO (breeding organization of Norwegian cattle). This work has been supported by the Norwegian Research Council, project number 130162/130, and title “Strategic QTL Research Plan for Disease Resistance in Atlantic Salmon and Cattle.” Financial support from the Institut de Recerca i Tecnologia Agroalimentaries is also acknowledged.
Footnotes
- Communicating editor: J. B. Walsh
- Received July 20, 2001.
- Accepted August 16, 2002.
- Copyright © 2002 by the Genetics Society of America