Genetics, Vol. 171, 1173-1181, November 2005, Copyright © 2005
doi:10.1534/genetics.105.040782



* Department of Genetics, Hebrew University, 91904 Jerusalem, Israel,
Department of Animal Science and Center for Integrated Animal Genomics, Iowa State University, Ames, Iowa 50011 and
Hy-Line International, Dallas Center, Iowa 50063
2 Corresponding author: Department of Genetics, Hebrew University, 91904 Jerusalem, Israel.
E-mail: soller{at}vms.huji.ac.il
ABSTRACT
χ2') measure. The results show appreciable LD among markers separated by up to 5 cM, decreasing rapidly with increased separation between markers. The LD within 5 cM was strongly conserved across generations and differed among chromosomal regions. Using marker-to-marker LD as an indication for marker-QTL LD, a genome scan of markers spaced 2 cM apart at moderate power would have good chances of uncovering most QTL segregating in these populations. However, of markers showing significant trait associations, only 57% are expected to be within 5 cM of the responsible QTL, and the remainder will be up to 20 cM away. Thus, high-resolution LD mapping of QTL will require dense marker genotyping across the region of interest to allow for interval mapping of the QTL.
For species with large family sizes, specifically dairy cattle (see KHATKAR et al. 2004 for a review), QTL have been detected within breeds but on a within-family basis. Because of their potentially loose linkage to the QTL, markers detected in such within-family studies are likely to be in linkage equilibrium (LE) with the QTL across the population. For these so-called LE markers the linkage phase between markers and the QTL must be established separately for each family and, once established, phased markers must be traced across generations (SOLLER and MEDJUGORAC 1999). This limits the use of these markers for MAS (DEKKERS 2004).
An alternative approach, based on candidate gene analysis (ROTHSCHILD and SOLLER 1997), avoids these problems. Candidate gene markers are expected to be close enough to the causative mutation such that associations are consistent across families because of population-wide linkage disequilibrium (LD) (DEKKERS 2004). This allows candidate gene associations to be detected within breeding populations and to be used immediately for selection in those same populations by selection on so-called LD markers (DEKKERS 2004). One of the first implementations of this approach in animal breeding was for the estrogen receptor gene for litter size in pigs (ROTHSCHILD et al. 1994; SHORT et al. 1997; DEKKERS 2004). At present, however, because of the targeted and time-consuming nature of candidate gene studies, the scope of a candidate gene approach for identifying QTL across the genome is limited. Costs of targeted candidate gene SNP detection and of SNP genotyping are, however, rapidly dropping, which may change the situation dramatically.
An attractive alternative to both LE and candidate gene mapping, which shares the advantages of both, is to look for population-wide LD between anonymous markers and QTL, using a dense marker map. Genome-wide scans for marker-QTL LD are feasible through application of selective DNA pooling (DARVASI and SOLLER 1994; LIPKIN et al. 1998; MOSIG et al. 2001) for the initial scan. Current chicken/cattle maps provide about one microsatellite marker per centimorgan, and there are now 2.8 million SNPs available in chickens, which, after correcting for SNPs appearing in more than one line, amounts to one marker per 374 bp (2,833,578 variant sites in the 1.06-Gb genome) (INTERNATIONAL CHICKEN POLYMORPHISM MAP CONSORTIUM 2004a). Availability of the complete chicken and cattle genome sequences means that essentially unlimited numbers of microsatellite and SNP markers can be readily obtained to saturate the genome to any desired degree. Thus, implementation of genome-wide scans for LD between anonymous markers and QTL in commercial populations is feasible, but depends on the extent of LD that exists in these populations. Theoretical analyses, based on the well-known SVED (1971) equation relating LD to effective population size and map distance, predict LD over very small regions in populations with large effective population size, such that a 0.1- or even 0.01-cM spacing might be required for an effective LD scan. In this context, the report by FARNIR et al. (2000) of extensive marker-to-marker LD over several tens of centimorgans in a study of Dutch Holsteins has important implications for LD mapping and LD-based MAS within commercial animal breeding populations. These results have been supported in further studies in cattle (VALLEJO et al. 2003) and sheep (MCRAE et al. 2002).
The purpose of this study was to examine the extent of marker-to-marker LD (henceforth "marker-LD") in a number of closed breeding lines of commercial egg-laying chickens of the Hy-Line International chicken breeding organization (henceforth Hy-Line). Under the assumption that the presence and extent of marker-LD correspond to the presence and extent of marker-QTL LD, marker-LD provides a basis for determining whether it is feasible to search for marker-QTL LD in commercial chicken lines and allows the marker density and population size required for such a search to be quantified.
MATERIALS AND METHODS
To initiate the study, marker genotype data for 22 individuals, each of the G1 generation of lines 1 and 2, were evaluated for LD. This data set, henceforth termed the "LD22 data set," consisted of two subsets: (i) the LD22 "whole-chromosome" data set, consisting of markers covering chromosomes 3, 4, and 5, and (ii) the LD22 "cluster" data set, consisting of clusters of closely linked markers on chromosomes 1, 2, 6 (line 2 only), and 8. On the basis of results from the LD22 data set, a dense marker scan was implemented for chromosomes 4 and 5 of lines 1 and 3, to study the degree to which population-wide marker-LD was conserved across generations. WANG (2003) previously reported QTL in these populations affecting the traits under selection on chromosome 4, but not on chromosome 5. This data set, denoted the "LD96 data set," consisted of marker genotype data for 32 individuals of each line in each of three generations (a total of 96 individuals per line). For line 1, the data came from generations G1, G2, and G6 and for line 3 from generations G5, G6, and G7.
Markers and genotyping:
The number of microsatellite markers used for each line and chromosome, the average spacing between markers, and the average number of alleles per marker are in Table 1. Complete marker details are in HEIFETZ (2004). The number of alleles per marker averaged ~3.0. Where possible, map locations and distances between markers were according to the consensus map (http://iowa.thearkdb.org/). In some instances, a marker was not in the consensus map, but located on one of the individual maps on which the consensus map is based. In such cases, additional markers to both sides of the marker in question were identified that were present in both the consensus map and the individual map, and the marker in question was positioned on the consensus map proportional to these two markers. Resulting distances were in good agreement with the recently published sequence of the chicken genome (INTERNATIONAL CHICKEN POLYMORPHISM MAP CONSORTIUM 2004b). In any event, map distances represent proportions of recombination and will track the effects of recombination on LD more closely than distances on the sequence map, which may not be as closely tied to recombination rates. A plot of LD against physical distance, however, would be a convenient means of identifying recombination hotspots. An analysis of our data from this point of view will be presented elsewhere.
Measures of linkage disequilibrium:
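Simulation studies (ZHAO et al. 2005) have shown that the standardized chi-square, χ2' (YAMAZAKI 1977; HEDRICK 1987), is the preferred measure of LD for multiallelic markers for purposes of QTL mapping. The measure termed r2 (HILL and ROBERTSON 1968) is very useful for biallelic markers, since it stands in inverse proportion to the sample size needed to demonstrate significance. But when pooling across allele pairs for multiallelic markers, r2 weighted by the product of allele frequencies was found to be strongly undervalued (ZHAO et al. 2004) and this was also found in our data (not shown). Hence this measure will not be considered further. For comparison purposes, the measure D', which was used in other livestock studies (FARNIR et al. 2000; MCRAE et al. 2002; VALLEJO et al. 2003), was included in some analyses. Definitions of D' and χ2' are

$$D' = \sum_{i=1}^{k}\sum_{j=1}^{m} p_i q_j \left|D'_{ij}\right|, \qquad D'_{ij} = \frac{D_{ij}}{D_{\max}}, \qquad D_{ij} = x_{ij} - p_i q_j,$$

$$D_{\max} = \begin{cases} \min\left[p_i q_j,\ (1-p_i)(1-q_j)\right] & \text{if } D_{ij} < 0 \\ \min\left[p_i (1-q_j),\ (1-p_i) q_j\right] & \text{if } D_{ij} > 0 \end{cases}$$

$$\chi^{2\prime} = \frac{1}{l-1}\sum_{i=1}^{k}\sum_{j=1}^{m}\frac{D_{ij}^{2}}{p_i q_j},$$

where $p_i$ and $q_j$ are the frequencies of allele $i$ at the first marker and allele $j$ at the second marker, $x_{ij}$ is the frequency of haplotype $ij$, $k$ and $m$ are the numbers of alleles at the two markers, and $l = \min(k, m)$.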
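As an illustration of how the two measures behave, the following minimal sketch computes them from a table of two-marker haplotype frequencies, using the definitions as they are commonly given in the cited literature (HEDRICK 1987; ZHAO et al. 2005). The function names and the example frequency tables are hypothetical and are not taken from the study; in the study itself, haplotype frequencies were estimated with Arlequin.

```python
def allele_freqs(hap_freq):
    """Marginal allele frequencies (p for marker A rows, q for marker B columns)."""
    p = [sum(row) for row in hap_freq]
    q = [sum(row[j] for row in hap_freq) for j in range(len(hap_freq[0]))]
    return p, q


def chi2_prime(hap_freq):
    """Standardized chi-square: sum of D_ij^2 / (p_i q_j), divided by (l - 1)."""
    p, q = allele_freqs(hap_freq)
    total = 0.0
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            if p_i > 0 and q_j > 0:
                d_ij = hap_freq[i][j] - p_i * q_j
                total += d_ij ** 2 / (p_i * q_j)
    # l = smaller number of segregating alleles at the two markers
    l = min(sum(1 for x in p if x > 0), sum(1 for x in q if x > 0))
    if l < 2:
        raise ValueError("both markers must be polymorphic")
    return total / (l - 1)


def d_prime(hap_freq):
    """Multiallelic D': sum over allele pairs of p_i q_j |D_ij / D_max|."""
    p, q = allele_freqs(hap_freq)
    total = 0.0
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            d_ij = hap_freq[i][j] - p_i * q_j
            if d_ij < 0:
                d_max = min(p_i * q_j, (1 - p_i) * (1 - q_j))
            else:
                d_max = min(p_i * (1 - q_j), (1 - p_i) * q_j)
            if d_ij != 0 and d_max > 0:
                total += p_i * q_j * abs(d_ij) / d_max
    return total


# Hypothetical example: one allele is rare and one haplotype is absent.
rare = [[0.45, 0.45],
        [0.10, 0.00]]
print(round(d_prime(rare), 3), round(chi2_prime(rare), 3))
```

In this hypothetical table the absent haplotype drives D' to its maximum of 1.0 while χ2' stays below 0.1, which is the kind of contrast between the two measures that matters when low-frequency alleles are common.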
To calculate the various LD measures between marker pairs, maximum-likelihood estimates of all two-marker haplotype frequencies were obtained using the software Arlequin (SCHNEIDER et al. 2000; Genetic and Biometry Lab, University of Geneva), which uses the genotypes of each individual for all markers as input. Individuals with a missing genotype for a given marker were excluded when computing LD for that marker.
Critical 0.05 and 0.01 significance levels of the χ2'-statistic were obtained empirically on the assumption that LD is not expected for nonsyntenic marker pairs, and hence the distribution of χ2'-values obtained for such marker pairs represents the distribution under the null hypothesis. Critical values were determined separately for each data set by ranking χ2'-values for all nonsyntenic marker pairs and taking the LD value of the pair whose rank corresponded to the desired significance level.
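The ranking procedure can be sketched as follows. This is a minimal illustration, assuming that the "rank corresponding to the desired significance level" means the rank, counted from the largest value, equal to the significance level times the number of nonsyntenic pairs; the function name and the synthetic null values are hypothetical.

```python
def empirical_critical_value(null_values, alpha):
    """Critical LD value taken from an empirical null distribution.

    null_values: LD values of nonsyntenic marker pairs (the assumed null).
    alpha: desired significance level, e.g. 0.05 or 0.01.
    """
    ranked = sorted(null_values, reverse=True)   # largest LD first
    rank = max(1, round(alpha * len(ranked)))    # 1-based rank from the top
    return ranked[rank - 1]


# Synthetic null distribution of 1000 values, for illustration only.
null_ld = [i / 1000 for i in range(1000)]
crit_05 = empirical_critical_value(null_ld, 0.05)
crit_01 = empirical_critical_value(null_ld, 0.01)
```

A syntenic pair would then be declared to show significant LD at level alpha when its LD value is at least as large as the corresponding critical value.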
Prediction equation for LD:
The well-known equation of SVED (1971), relating LD generated by drift to distance between markers and effective population size (Ne), is a useful way to summarize the extent and decline of LD with distance in a population. This equation was used as the basis for fitting the following model (model 1) to observed LD,
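$$LD_{ijk} = \frac{1}{1 + 4 b_j d_{ijk}} + \varepsilon_{ijk}, \tag{1}$$

where $LD_{ijk}$ is the observed LD for marker pair $i$ of data set $j$ (with $k$ indexing the chromosome or region of the pair), $d_{ijk}$ is the map distance between the two markers, $b_j$ is the coefficient describing the decline of LD with distance for data set $j$, and $\varepsilon_{ijk}$ is a residual. χ2'-values of marker pairs that were up to 100 cM apart were included.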
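A curve of the SVED (1971) form, LD = 1/(1 + 4bd), can be fitted to observed LD by least squares. The sketch below is one simple way to do this and is not the fitting procedure used in the study, which is not described here. It assumes that distance d is expressed in morgans, and it uses a plain grid search with refinement in place of a library optimizer; the function names and the synthetic data are hypothetical.

```python
def predicted_ld(b, d_morgans):
    """Expected LD at map distance d (assumed to be in morgans) for coefficient b."""
    return 1.0 / (1.0 + 4.0 * b * d_morgans)


def sse(b, distances, ld_values):
    """Sum of squared differences between observed and predicted LD."""
    return sum((ld - predicted_ld(b, d)) ** 2
               for d, ld in zip(distances, ld_values))


def fit_b(distances, ld_values, lo=0.01, hi=1000.0, rounds=6, steps=200):
    """Least-squares estimate of b by repeated refinement of a search grid."""
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / steps
        grid = [lo + step * i for i in range(steps + 1)]
        best = min(grid, key=lambda b: sse(b, distances, ld_values))
        # Narrow the search window to the neighbors of the current best value.
        lo, hi = max(best - step, 1e-9), best + step
    return best


# Synthetic, noise-free data generated with b = 11.36, for illustration only.
dist = [0.005, 0.01, 0.02, 0.05, 0.1, 0.2]
ld = [predicted_ld(11.36, d) for d in dist]
b_hat = fit_b(dist, ld)
```

With real data the residuals are far from negligible, so a standard nonlinear least-squares routine that also reports standard errors would usually be preferable to this grid search.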
Tests for chromosomal and regional differences in LD:
Under the assumption that presence and extent of marker-LD corresponds to presence and extent of marker-QTL LD, regions with high marker-LD would be expected to also exhibit greater marker-QTL LD. Such regions would be priority regions for QTL mapping, since they would require lower marker density (on the genetic map distance scale) for equivalent power. In addition, outlier behavior of a chromosomal region with respect to LD might in itself be a "selection signature," indicating presence of a QTL under selection in that region (KIM and NIELSEN 2004). Differences in LD between chromosomes were evaluated using marker pairs that were up to 20 cM apart on chromosomes 4 and 5 from the LD22 and LD96 data sets. Since markers were not evenly distributed within regions, it was necessary to take differences in map distance between markers into account when evaluating regional differences in LD. This was done by conducting the analysis on residuals from model 1 fitted to each data set. In addition, residuals were divided by their predicted value to stabilize the variance, since variance of LD (measured by r2) is expected to be proportional to the square of expected LD (HILL 1981). Thus, observations used for analysis were
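$$y_{ijk} = \frac{LD_{ijk} - \widehat{LD}_{ijk}}{\widehat{LD}_{ijk}}, \qquad \widehat{LD}_{ijk} = \frac{1}{1 + 4 \hat{b}_j d_{ijk}},$$

where $LD_{ijk}$ is the observed LD of marker pair $i$, $d_{ijk}$ is the map distance between its two markers, and $\hat{b}_j$ is the estimate of the decline of LD with distance for data set $j$. To test for differences in LD between chromosomes, the following model (model 2) was used,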
![]() | (2) |
where εijk is a residual. Differences in LD between chromosomal regions on chromosomes 3, 4, and 5 were evaluated using regions of 10 cM that had three or more markers in at least one of the lines. This resulted in four data sets: two LD22 data sets with 90 and 128 marker pairs for lines 1 and 2 in 24 regions, respectively, and two LD96 data sets with 101 and 77 marker pairs for lines 1 and 3 in 16 regions, respectively. Mean marker distance within 10-cM regions was 4.5 cM for the LD22 data set and 4.1 cM for the LD96 data set. Marker-LD was corrected for distance and heterogeneous variance using the procedure described above and analyzed by the following model (model 3) to test for differences between regions,
![]() | (3) |
where εijk is the residual of marker pair i in region k of line j.
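The correction for distance and heterogeneous variance used in this section (the residual from model 1 divided by its predicted value) can be sketched as below. This is a minimal illustration, assuming the prediction has the form 1/(1 + 4bd) with d in morgans; the function name and the example values are hypothetical.

```python
def standardized_residuals(distances, ld_values, b_hat):
    """Residual from the distance curve, scaled by the predicted LD.

    distances: map distances between the markers of each pair (assumed morgans).
    ld_values: observed LD for the same marker pairs.
    b_hat: fitted decline coefficient for the data set.
    """
    scaled = []
    for d, ld in zip(distances, ld_values):
        predicted = 1.0 / (1.0 + 4.0 * b_hat * d)
        scaled.append((ld - predicted) / predicted)
    return scaled


# Two hypothetical marker pairs 2.5 cM apart, one above and one below the curve.
y = standardized_residuals([0.025, 0.025], [0.75, 0.25], b_hat=10.0)
```

The scaled values, rather than raw LD, are what would enter a linear model with line, chromosome, or region effects, so that pairs at different distances are compared on a common scale.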
Correlations between generations:
Pairwise correlations of LD values between generations were calculated using all marker pairs that were present in all generations for each line of the LD96 data set. Expected correlations were obtained by simulation, using the methods described in ZHAO et al. (2004). Linkage disequilibrium between markers with four alleles was simulated over 100 generations of random mating in a population with effective size 100. Correlations between LD in alternate generations were obtained by averaging correlations among generations 91–100 using all marker pairs that were segregating in these generations. For syntenic markers >20 cM apart, the maximum was taken as 100 cM and for the nonsyntenic markers, marker pairs that were 300–350 cM apart were used.
RESULTS
χ2'-measure resulted in a 100-fold reduction: only 0.08–0.30% of nonsyntenic pairs had values between 0.50 and 1.00, and only 0.02–0.20% had values >0.90 (data not shown). With rare exceptions (e.g., in cases of epistasis or where two markers are each in LD with QTL that are under selection), LD is not expected among markers on different chromosomes. Thus, for these data sets it is clear that D' had an unacceptable proportion of high values that do not appear to correspond to the reality of the situation. For this reason, χ2' was used in all subsequent analyses of the data.
Observed LD among nonsyntenic markers was used to derive critical χ2'-values to declare significant LD for syntenic markers. Critical values were first obtained separately by line and generation within data sets. Examination showed that critical values differed between data sets, but within data sets values did not differ between lines or among generations within lines (data not shown). Consequently, lines and generations were pooled within data sets and used to derive critical values by data set (Table 2). As expected, critical LD values were in inverse proportion to the number of individuals used to calculate the LD values. Thus, they were highest for the LD22 data set (based on 22 individuals), somewhat less for individual generations of the LD96 data set (32 individuals), and distinctly lower for the pooled-generation data of the LD96 data set (96 individuals).
χ2'-values as a function of distance. Values are for the LD96 data set of line 3, using marker data from 96 individuals across three generations. This picture of high LD at shorter distances that decays rapidly as distance increases was typical of all other data sets and agrees with previous results (FARNIR et al. 2000; PRITCHARD and PRZEWORSKI 2001) and with theory (SVED 1971).
χ2' against genetic distance for the LD22 (Table 3) and LD96 data sets (Table 4). For all data sets, results for d ≥ 20 cM were almost identical to results for nonsyntenic chromosomes for the corresponding data sets (not shown), with only 10% of χ2'-values ≥0.20 and none ≥0.50. For shorter distances, in the LD22 whole chromosome data (Table 3), the proportion of χ2'-values >0.50 depended strongly on distance between markers and on line, declining from 33–34% for d ≤ 5 cM, to 9–13% for d in the range 5–10 cM, to 6% for d in the range 10–20 cM. The proportion of χ2'-values >0.80 followed a similar trend, going from 15–23% for d ≤ 5 cM, to 3–5% for d in the range 5–10 cM, to 1–3% for d in the range 10–20 cM. Both lines behaved very similarly, with line 1 showing a somewhat higher degree of LD than line 2. Results for the marker clusters conformed to those above, with strong LD for d ≤ 5 cM, and appreciable, but less LD for d in the range 5–10 cM, and greater LD for line 1 than for line 2. When considering the LD96 results (Table 4), although the average amount of LD was greater in line 3 than in line 1 (see later), differences among generations within lines were small and not significant (data not shown). The pooled data sets behaved similarly to the individual generations for d ≤ 5 cM, but for longer distances (d ≥ 5 cM), the pooled data sets showed a higher proportion of χ2'-values in the lowest LD class (0.0–0.1 bin) and a lesser proportion in all other bins (data not shown). The pooled data sets, consisting of 96 observations per line, have lower chance variation than the individual-generation data sets of 32 observations. Thus, these results show that chance variation is not a major source of the high LD values for short marker-to-marker distances, but is for longer distances. Consequently, by increasing sample size it should be possible to reduce the contribution to marker-QTL LD of more remote QTL. This is important for LD mapping. Considering the data averaged over generations for line 1 and d < 5 cM, 24% of markers showed moderate to high LD (χ2' ≥ 0.50), and 11% showed very high LD (χ2' ≥ 0.80); corresponding values for line 3 were 44 and 29%, respectively. The difference between the two lines was highly significant by a chi-square contingency test (P < 0.01). This accords with the history of the lines, since line 3 underwent a hybridization episode in G1, which would be expected to contribute to increased LD in generations G5–G7, and is evaluated further below, with a model that includes correction for differences in marker distances. For line 1, the proportion of LD values in the highest LD bin for the LD96 data set at the closest distances (<5 cM) was 0.11, only about half the corresponding proportion (0.23) for the LD22 data set.
χ2' ≥ 0.50) and 18% showing very high LD (χ2' ≥ 0.80).
Comparing lines, chromosomes, and regions:
Estimates of the decline of LD with distance (bj) obtained from model 1 were 7.92, 11.81, 18.34, and 11.36 for LD22 line 1, LD22 line 2, LD96 line 1, and LD96 line 3, respectively. Estimates for LD22 line 2 and LD96 line 3 were not significantly different on the basis of a t-test, but differences with and among estimates for all other lines were highly significant (P < 0.0001). The smaller estimate for line 1 compared to line 2 in the LD22 data set indicates a greater degree of LD in line 1, which can be attributed to the fact that line 1 is a more recent cross (12 generations vs. 30 generations for line 2). The smaller estimate for line 3 compared to line 1 in the LD96 data set can be attributed to the fact that line 3 was formed at G1 by crossing two lines and hence would still have large amounts of residual short-range LD at the G5–G7 generations studied here. On the basis of the SVED (1971) prediction equation for LD, and assuming that the standard deviation of LD is equal to the predicted LD, as shown by HILL (1981) for r2, predicted and observed (in parentheses) proportions of LD >0.80 for LD22 line 1 and line 2 and for LD96 line 1 and line 3 are: 0.32 (0.23), 0.23 (0.15), 0.10 (0.11), and 0.24 (0.29), respectively. The mean predicted and observed proportion of LD across the four lines was 0.22 (0.20). Thus, the observed proportion of LD >0.80 was virtually as predicted.
Analysis of LD corrected for distance and variance using model 2 also showed a highly significant difference (P = 0.0003) between lines 1 and 2 in the LD22 data set and a significant difference (P = 0.02) between lines 1 and 3 in the LD96 data set. The model 2 analysis did not show significant differences between chromosomes 4 and 5 across lines and the interaction of chromosome and data set was also not significant (data not shown). Thus, this analysis shows that these two chromosomes were not characterized by different degrees of LD.
When evaluating differences in LD between regions, after correcting for genetic distance and heterogeneous variance, following model 3, the interaction of line and region was not significant and hence was dropped from the model. In the LD22 data set, the line effect was significant (P < 0.05) and the region effect was highly significant (P < 0.0001), the latter explaining ~26% of the total variance. In the LD96 data set, the line effect was not significant but the region effect was again significant (P < 0.005), explaining ~18% of the total variance.
Correlations between generations:
Table 5 shows observed Pearson correlations of χ2' for marker pairs between generations of the LD96 data set, according to map distance between markers. For each correlation the expected correlation calculated according to the simulation is also given. For nonsyntenic markers, correlations were consistently low, but were positive and significantly different from zero (P < 0.0007) for both lines, even over four or five generations. For such markers, expected correlations are 0.25 for adjacent generations, similar to what was observed, but close to zero for generations further apart. For syntenic markers at d ≥ 20 cM, correlations were about the same as for nonsyntenic markers for line 1 but about twice as large for line 3. Again, this is probably due to the recent hybridization event in the history of line 3.
20 cM), correlations among generations were reduced considerably and ranged from 0.56 to 0.84. A χ2-test of the difference between observed and expected did not reveal a significant difference.
DISCUSSION
χ2', was a more effective measure than either r2 or D' to evaluate marker-LD in the studied populations: r2 was strongly undervalued in multiallelic situations (data not shown), and D' often gave high values for widely separated or nonsyntenic marker pairs. In contrast, χ2' maintained a full range of values from 0 to 1.0, even with multiallelic markers, and rarely gave high values for widely separated or nonsyntenic marker pairs. For this reason, and also on the basis of simulation results of ZHAO et al. (2005), χ2' was used as the definitive measure of LD in this study. Moderately high and statistically significant values for D' have been reported for widely separated and nonsyntenic marker pairs in studies of dairy cattle (FARNIR et al. 2000; VALLEJO et al. 2003) and sheep (MCRAE et al. 2002). These have been attributed to the population structure imposed by the small effective numbers of sires in livestock populations (FARNIR et al. 2000). It is possible that a population structure of this sort is also responsible for some of the high D'-values obtained in the present study, although an attempt was made to reduce the importance of this factor by choosing individuals that represented as broad a pedigree as possible. A plausible explanation for the high nonsyntenic values of D' in this and other livestock studies relates to the technical artifact that when one or more of the haplotypes expected in a sample are not observed, D' must equal 1.0. Consequently, when one of the alleles in a haplotype is at low frequency in a population, haplotypes containing this allele are also expected to be in low frequency in the population. In this situation, one or more of the expected haplotypes may not be present in the sampled individuals, which causes D' to inflate to 1.0. In studies of human populations, where the D'-measure is commonly used, biallelic markers are the rule, and markers for which one of the alleles is at low frequency are not included. In contrast, in livestock studies where multiallelic microsatellites are used, low-frequency alleles and haplotypes are common. This could explain the high D'-values between nonsyntenic markers obtained in this study and in the other livestock studies listed above. In contrast, χ2' is unaffected by this artifact (ZHAO et al. 2005).
The main finding of the present marker-LD study is the presence in the study populations of widespread LD among markers separated by ≤5 cM. Although there were significant differences in the degree of LD between lines 1 and 2, overall LD levels were high. Taking a rough unweighted average across lines and chromosomes, ~30% of marker pairs in the 0- to 5-cM distance range showed χ2'-values ≥0.50, and 14% showed values ≥0.90. LD dropped off rapidly with increasing distance between markers, with ~15% of pairs separated by 5–10 cM showing LD ≥0.50, and only 5% showing LD ≥0.90. For markers separated by 10–20 cM, the corresponding values were 4 and 0.6%, respectively. It should be emphasized that these results apply to the specific populations of the present study, which are highly selected and partially inbred. The situation in other populations may be very different. Indeed, differences in LD among lines 1, 2, and 3 were consistent with the histories of these lines, and overall levels of LD in these lines were consistent with their relatively small effective population sizes. In contrast to the results reported here, results reported by the INTERNATIONAL CHICKEN POLYMORPHISM MAP CONSORTIUM (2004b) hint that SNP haplotype blocks in their populations are rarely as large as 0.3 cM.
The relationship of LD to distance found in this study for markers separated by ≤5 cM was comparable to that found by FARNIR et al. (2000), using the D'-measure in the Dutch Black and White dairy cattle population. A study of 26 markers on sheep chromosome 6 in a mixed population of Romney, Coopworth, and Perendale sheep also showed comparable values of D' for markers separated by ≤5 cM (MCRAE et al. 2002). A study of LD in North American Holstein cattle included only four marker pairs separated by ≤5 cM; highly significant LD was found for one of these pairs (VALLEJO et al. 2003).
The finding of LD over extended regions (5–10 cM) implies that testing a candidate gene by an association test in a population of this sort could result in significant associations through linked QTL as far as 5–10 cM from the candidate gene marker, which greatly increases the noise factor. Thus, a candidate gene association test should be accompanied by a marker-LD analysis, to give some idea of the degree of LD in the population and hence of the potential region with which the candidate gene marker might be associating. In addition, marker-LD in a population could provide a criterion for constructing and testing a population a priori for its suitability with respect to candidate gene testing. One could decide on the degree of marker-LD against distance that would be acceptable and construct the population accordingly. Similarly, a two-stage approach might be used to lead from moderate- to high-resolution LD QTL mapping. The initial screen could be at a lower marker density to identify suggestive regions, and then a second analysis would be implemented, using a much higher density of markers in these regions. This would be similar to the combined linkage disequilibrium and linkage analysis mapping procedures that have been proposed (MEUWISSEN et al. 2002), in which family-based linkage analysis is used to determine general QTL location, and LD analysis is used for high-resolution mapping.
Examination of the SVED (1971) prediction equation for LD as a function of distance shows that the coefficient bj stands in inverse proportion to the extent of LD; that is, the greater the observed LD is, the smaller the estimate of bj. Thus, the effect of bj is similar to the effect of effective population size (Ne). Nevertheless, simulations show that bj is a biased estimate of Ne when based on χ2' (ZHAO et al. 2005). Consequently, the bj estimates obtained in the present study do not provide estimates of Ne.
This study was unique in showing that short-range LD is strongly conserved in consecutive generations, and almost to the same extent across an interval of four or five generations. Thus, short-range LD established in a given generation can be expected to persist over a number of generations. This is important when using marker-QTL LD for MAS. Another important finding was the presence of significant differences in marker LD among chromosomal regions within lines. To the extent that such differences are generated by drift, it may be useful to concentrate mapping efforts in regions that have high LD since in these regions the likelihood of detecting marker-QTL associations may also be higher. However, if regions of high LD represent selection signatures (KIM and NIELSEN 2004), regions of high marker LD may be regions where selection has already increased the frequency of favorable alleles to high levels or even fixation, and in this case, depending on the approach to fixation, these regions may not be useful for mapping QTL that are still segregating in the population. Further experimental and theoretical studies are needed to clarify these possibilities.
It is of interest to explore the implications of these findings for QTL mapping based on marker-QTL LD, on the assumption that marker-QTL LD occurs at the same levels as marker-marker LD. At a marker spacing of 2 cM, a randomly placed QTL will be within 5 cM of ~5 markers, within 5–10 cM of ~5 markers, and within 10–20 cM of ~10 markers. If the proportion of marker pairs within a given range R that have a level of LD greater than a set threshold T is denoted $LD_R$, and the number of markers in this range is denoted $m_R$, then on elementary probability calculations and assuming independence of LD between adjacent markers, the likelihood that a QTL will have LD > T with at least one marker in the given range is

$$P_R = 1 - (1 - LD_R)^{m_R}.$$

Using the approximate levels of LD found in this study as given above ($LD_{\le 5\,\mathrm{cM}} = 0.30$), it can then readily be calculated that there is a likelihood of $P_{\le 5\,\mathrm{cM}} = 0.83$ that a QTL is in LD at χ2' ≥ 0.50 with a marker that is within 5 cM of the QTL. The corresponding likelihood for χ2' ≥ 0.50 for a marker in the range 5–10 cM is $P_{5\text{–}10\,\mathrm{cM}} = 0.56$ and for a marker in the range 10–20 cM is $P_{10\text{–}20\,\mathrm{cM}} = 0.34$. Thus, at this marker spacing, for any given QTL, there is a likelihood of $P_{0\text{–}20\,\mathrm{cM}} = 1 - (1 - P_{\le 5\,\mathrm{cM}})(1 - P_{5\text{–}10\,\mathrm{cM}})(1 - P_{10\text{–}20\,\mathrm{cM}}) = 0.95$ that marker-QTL LD at χ2' ≥ 0.50 will be found with at least one marker that is within 20 cM of the QTL, and on the average, LD at χ2' ≥ 0.50 will be found for 2.65 markers within 20 cM of the QTL. Even at moderate statistical power of the experiment, therefore, a genome scan at 2-cM spacing would potentially be able to uncover most of the QTL segregating in these populations. At a marker spacing of 5 cM, similar calculations show that about two-thirds of QTL would be in LD at χ2' ≥ 0.50 with at least 1 marker within 20 cM. In this case, an average QTL would be in LD with 1.0 markers, and very high statistical power would be required to realize the LD mapping potential of the population. This said, the evaluation of statistical power for marker-QTL LD association tests in agricultural populations is a complex problem that remains to be adequately addressed.
Nonetheless, whether at a marker spacing of 2 or 5 cM, only 57% of positive marker-QTL association tests will be with markers that are within 5 cM of the QTL, while the remainder will be distributed among markers in the range 5–10 cM (28%) and 10–20 cM (15%). Thus, a finding of marker-QTL association in these populations does not automatically position the QTL close to the marker. Achieving this will require genotyping additional markers, at a spacing of 0.25–1 cM, in the regions of interest.
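The figures quoted in the two preceding paragraphs can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the approximate LD proportions given in the Discussion (0.30, 0.15, and 0.04 for the three distance ranges), and it assumes that the number of markers in a range is twice the width of the range divided by the marker spacing, which reproduces the marker counts quoted for 2-cM spacing. The names used are hypothetical.

```python
import math

# Approximate proportion of marker pairs with LD >= 0.50 in each distance range.
LD_FRACTION = {"0-5": 0.30, "5-10": 0.15, "10-20": 0.04}
# Width of each range in cM on one side of the QTL.
RANGE_WIDTH_CM = {"0-5": 5, "5-10": 5, "10-20": 10}


def scan_summary(spacing_cm):
    """Return per-range probabilities, overall probability, expected number
    of markers in LD with a QTL, and the share of those markers per range."""
    # Assumed marker count: both sides of the QTL, evenly spaced markers.
    markers = {r: 2 * w / spacing_cm for r, w in RANGE_WIDTH_CM.items()}
    # P_R = 1 - (1 - LD_R)^m_R, assuming independence between markers.
    p_range = {r: 1 - (1 - LD_FRACTION[r]) ** markers[r] for r in markers}
    p_any = 1 - math.prod(1 - p for p in p_range.values())
    expected = {r: markers[r] * LD_FRACTION[r] for r in markers}
    total = sum(expected.values())
    share = {r: expected[r] / total for r in expected}
    return p_range, p_any, total, share


p_range_2, p_any_2, total_2, share_2 = scan_summary(2)
p_range_5, p_any_5, total_5, share_5 = scan_summary(5)
```

Under these assumptions the shares by distance range do not depend on marker spacing, because every marker count scales by the same factor; this is why the same 57, 28, and 15% split applies at both 2- and 5-cM spacing.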
An optimal strategy for detecting QTL using population-wide LD remains to be worked out, but it might involve the use of selective DNA pooling (DARVASI and SOLLER 1994; LIPKIN et al. 1998) for the initial scan at close marker spacing, followed by individual genotyping of suggestive markers with an association test in the second stage to confirm marker-QTL LD. Multitrait (KOROL et al. 1995) and multilocus (DE KONING et al. 2001) methods might be useful at this stage to increase statistical power. In addition, multilocus haplotype analysis may increase mapping precision compared to single-marker methods (MEUWISSEN and GODDARD 2000; LEE et al. 2004), although that has been questioned in recent research, which showed that single-marker regression analysis can be just as effective for fine mapping (GRAPES et al. 2004). Studying LD across two or more generations should also be effective in narrowing the region to which the QTL has been mapped.
| LITERATURE CITED |
|---|
ANDERSSON, L., 2001 Genetic dissection of phenotypic diversity in farm animals. Nat. Rev. Genet. 2: 130-138.
DARVASI, A., and M. SOLLER, 1994 Selective DNA pooling for determination of linkage between a molecular marker and a quantitative trait locus. Genetics 138: 1365-1373.
DE KONING, D. J., N. F. SCHULMAN, K. ELI, S. MOISIO, R. INOS et al., 2001 Mapping of multiple quantitative trait loci by simple regression in half-sib designs. J. Anim. Sci. 79: 616-622.
DEKKERS, J. C., 2004 Commercial application of marker- and gene-assisted selection in livestock: strategies and lessons. J. Anim. Sci. 82 (E-Suppl): E313-328.
DEKKERS, J. C. M., and F. HOSPITAL, 2002 The use of molecular genetics in improvement of agricultural populations. Nat. Rev. Genet. 3: 22-32.
FARNIR, R., W. COPPIETERS, J.-J. ARRANZ, P. BERZI, N. CAMBISARO et al., 2000 Extensive genome-wide linkage disequilibrium in cattle. Genome Res. 10: 220-227.
GRAPES, L., J. C. DEKKERS, M. F. ROTHSCHILD and R. L. FERNANDO, 2004 Comparing linkage disequilibrium-based methods for fine mapping quantitative trait loci. Genetics 166: 1561-1570.
HEDRICK, P. W., 1987 Gametic disequilibrium measures: proceed with caution. Genetics 117: 331-341.
HEIFETZ, E. M., 2004 Fine mapping of QTL, methods and results. Ph.D. Thesis, Hebrew University, Jerusalem.
HILL, W. G., 1981 Estimation of effective population size from data on linkage disequilibrium. Genet. Res. 38: 209-216.
HILL, W. G., and A. ROBERTSON, 1968 Linkage disequilibrium in finite populations. Theor. Appl. Genet. 38: 226-231.
INTERNATIONAL CHICKEN POLYMORPHISM MAP CONSORTIUM, 2004a A genetic variation map for chicken with 2.8 million single nucleotide polymorphisms. Nature 432: 717-722.
INTERNATIONAL CHICKEN POLYMORPHISM MAP CONSORTIUM, 2004b Sequence and comparative analyses of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432: 695-716.
KHATKAR, M. S., P. C. THOMSON, I. TAMMEN and H. W. RAADSMA, 2004 Quantitative trait loci mapping in dairy cattle: review and meta-analysis. Genet. Sel. Evol. 36: 163-190.
KIM, Y., and R. NIELSEN, 2004 Linkage disequilibrium as a signature of selective sweeps. Genetics 167: 1513-1524.
KOROL, A. B., Y. I. RONIN and V. M. KIRSHNER, 1995 Interval mapping of quantitative trait loci employing correlated trait complexes. Genetics 140: 1137-1147.
LEE, S. H., and J. H. J. VAN DER WERF, 2004 The efficiency of designs for fine-mapping of quantitative trait loci using combined linkage disequilibrium and linkage. Genet. Sel. Evol. 36: 145-161.
LIPKIN, E., M. O. MOSIG, A. DARVASI, E. EZRA, A. SHALOM et al., 1998 Mapping loci controlling milk protein percentage in dairy cattle by means of selective milk DNA pooling using dinucleotide microsatellite markers. Genetics 149: 1557-1567.
MCRAE, A. F., J. C. MCEWAN, K. G. DODDS, T. WILSON, A. M. CRAWFORD et al., 2002 Linkage disequilibrium in domestic sheep. Genetics 160: 1113-1122.
MEUWISSEN, T. H. E., and M. E. GODDARD, 2000 Fine mapping of quantitative trait loci using linkage disequilibrium with closely linked marker loci. Genetics 155: 421-430.
MEUWISSEN, T. H. E., A. KARLSEN, S. LIEN, I. OLSAKER and M. E. GODDARD, 2002 Fine mapping of a quantitative trait locus for twinning rate using combined linkage and linkage disequilibrium mapping. Genetics 161: 373-379.
MOSIG, M., E. LIPKIN, G. KHUTORESKAYA, E. TCHOURZYNA, M. SOLLER et al., 2001 A whole-genome scan for QTL affecting milk protein percentage in Israel-Holstein cattle by means of selective milk pooling in a daughter design, using adjusted false discovery rate criterion. Genetics 157: 1683-1698.
PRITCHARD, J. K., and M. PRZEWORSKI, 2001 Linkage disequilibrium in humans: models and data. Am. J. Hum. Genet. 69: 1-14.
ROTHSCHILD, M., and M. SOLLER, 1997 Candidate gene analysis to detect genes controlling traits of economic importance in domestic livestock. Probe 8: 13-20.
ROTHSCHILD, M. F., C. JACOBSON, D. A. VASKE, C. K. TUGGLE, T. H. SHORT et al., 1994 A major gene for litter size in pigs. Proceedings of the 5th World Congress on Genetics Applied to Livestock Production, Guelph, ON, Canada, Vol. 21, pp. 225-228.
SCHNEIDER, S., D. ROESSELI and L. EXCOFFIER, 2000 Arlequin Ver. 2.000: A Software for Population Genetics Data Analysis. Genetics and Biometry Laboratory, University of Geneva, Geneva.
SHORT, T. H., M. F. ROTHSCHILD, O. I. SOUTHWOOD, D. G. MCLAREN, A. DE VRIES et al., 1997 Effect of the estrogen receptor locus on reproduction and production traits in four commercial pig lines. J. Anim. Sci. 75: 3138-3142.
SOLLER, M., and J. S. BECKMANN, 1982 Restriction fragment length polymorphisms and genetic improvement. Proceedings of the 2nd World Congress on Genetics Applied to Livestock Production, Madrid, Spain, Vol. 6, pp. 396-404.
SOLLER, M., and J. S. BECKMANN, 1983 Genetic polymorphism in varietal identification and genetic improvement. Theor. Appl. Genet. 67: 25-33.
SOLLER, M., and I. MEDJUGORAC, 1999 A successful marriage: making the transition from QTL mapping to marker assisted selection. Proceedings of the "From Jay Lush to Genomics: Visions for Animal Breeding and Genetics" Conference, Iowa State University, Ames, IA, pp. 85-96.
SVED, J. A., 1971 Linkage disequilibrium and homozygosity of chromosome segments in finite populations. Theor. Popul. Biol. 2: 125-141.
VALLEJO, R. L., Y. L. LI, G. W. ROGERS and M. S. ASHWELL, 2003 Genetic diversity and background linkage disequilibrium in the North American Holstein cattle population. J. Dairy Sci. 86: 4137-4147.
WANG, J., 2003 Interval mapping of QTL with selective DNA pooling data. Ph.D. Thesis, Iowa State University, Ames, IA.
YAMAZAKI, T., 1977 The effect of overdominance on linkage in a multilocus system. Genetics 86: 227-236.
ZHAO, H., D. NETTLETON, M. SOLLER and J. DEKKERS, 2004 Linkage disequilibrium measures between markers as predictors of linkage disequilibrium between markers and QTL. J. Anim. Sci. 82 (Suppl. 2): 9.
ZHAO, H., D. NETTLETON, M. SOLLER and J. C. M. DEKKERS, 2005 Evaluation of linkage disequilibrium measures between multi-allelic markers as predictors of linkage disequilibrium between markers and QTL. Genet. Res. 86: 77-87.
Communicating editor: T. H. D. BROWN