Genetics


Comparing Linkage Disequilibrium-Based Methods for Fine Mapping Quantitative Trait Loci

L. Grapes, J. C. M. Dekkers, M. F. Rothschild and R. L. Fernando
Genetics March 1, 2004 vol. 166 no. 3 1561-1570; https://doi.org/10.1534/genetics.166.3.1561
L. Grapes
Department of Animal Science, Iowa State University, Ames, Iowa 50011
J. C. M. Dekkers
Department of Animal Science, Iowa State University, Ames, Iowa 50011
M. F. Rothschild
Department of Animal Science, Iowa State University, Ames, Iowa 50011
R. L. Fernando
Department of Animal Science, Iowa State University, Ames, Iowa 50011
  • For correspondence: rohan@iastate.edu

Abstract

Recently, a method for fine mapping quantitative trait loci (QTL) using linkage disequilibrium was proposed to map QTL by modeling covariance between individuals, due to identical-by-descent (IBD) QTL alleles, on the basis of the similarity of their marker haplotypes under an assumed population history. In the work presented here, the advantage of using marker haplotype information for fine mapping QTL was studied by comparing the IBD-based method with 10 markers to regression on a single marker, a pair of markers, or a two-locus haplotype under alternative population histories. When 10 markers were genotyped, the IBD-based method estimated the position of the QTL more accurately than did single-marker regression in all populations. When 20 markers were genotyped for regression, as single-marker methods do not require knowledge of haplotypes, the mapping accuracy of regression in all populations was similar to or greater than that of the IBD-based method using 10 markers. Thus for populations similar to those simulated here, the IBD-based method is comparable to single-marker regression analysis for fine mapping QTL.

THE purpose of mapping quantitative trait loci (QTL) in livestock is to identify genes affecting a quantitative trait and ultimately use existing variation in those genes to select superior individuals from a population. One difficulty is that traditional QTL linkage studies identify chromosomal regions, not individual genes, which may affect a trait. Depending on the power of the test and population structure, these regions can range from 20 to 40 cM in size and contain possibly thousands of genes. It is impractical to consider thousands or even hundreds of potential candidate genes to identify the QTL. Therefore, the chromosomal region associated with the trait should be narrowed, i.e., the region should be fine mapped, before attempts to identify the gene are made.

Advanced intercross lines (Darvasi and Soller 1995) and recombinant inbred lines (Taylor 1978) have been proposed as resource populations to be used for fine mapping. In these populations, due to repeated recombination, the linkage disequilibrium (LD) generated by the initial cross is limited to closely linked loci. However, these types of populations are nearly impossible to create for most livestock species, as well as humans, because of time, ethical and financial constraints, as well as inbreeding depression. To overcome this, it has been proposed to use the existing LD from historical recombinations for fine mapping (e.g., Bodmer 1986; Xiong and Guo 1997).

Meuwissen and Goddard (2000) proposed a method to fine map a QTL using LD within a haplotype of closely linked markers. In their work, they showed that haplotype-based LD mapping was more accurate than single-marker-based LD mapping by comparing their method to the transmission-disequilibrium test (TDT) of Rabinowitz (1997). The TDT is, however, restricted to within-family information, unlike the method of Meuwissen and Goddard (2000). The TDT has an advantage in that it is not affected by breed or line differences (population admixture), but this advantage comes at the expense of the power of the test. The method of Meuwissen and Goddard (2000) is affected by population admixture, but it is an inherently more powerful test because it uses across-family information. A simple and more appropriate comparison would be to test the haplotype-based method of Meuwissen and Goddard (2000) against least-squares regression on single markers because both these approaches use within- and between-family information, and both are subject to admixture. Thus, the purpose of this work was to compare the haplotype-based method of Meuwissen and Goddard (2000) to single-marker-based regression methods to determine if haplotypes provide additional information for fine mapping QTL.

The method of Meuwissen and Goddard (2000) maps QTL by modeling the covariance between individuals on the basis of the similarity of their haplotypes. Individuals with similar marker haplotypes will likely share QTL alleles that are identical by descent (IBD) and so will have a higher covariance. Assumptions about the population history are made to model the covariance. Meuwissen and Goddard (2000) showed that their IBD method is quite robust to departures from these assumptions, but it is unclear whether these assumptions affect comparisons with least-squares regression methods. So, determining the impact of population history on comparisons between the methods was the second objective in this study.

METHODS

Population simulations: Following Meuwissen and Goddard (2000) it was assumed that a previous linkage analysis study had mapped a QTL to a region of 2.25–9 cM in size, and within that region 10 biallelic markers were available. Thus, in all simulations, individuals were generated with 10 evenly spaced, biallelic markers, a QTL centered between two adjacent markers, and a trait phenotypic value according to their QTL genotype.

Default population: The IBD method is based upon modeling the covariance between individuals under the following assumptions: (1) variation in a QTL is due to a mutation that occurred 100 generations ago, (2) during the last 100 generations the effective population size was 100, and (3) each marker locus has two alleles with equal frequencies in the founder population. It was known which markers were maternally and paternally inherited so that haplotypes could be constructed. The data under the default simulation were generated under these assumptions with the QTL placed in the middle of the marker haplotype.

Phenotypic values for individuals in the final generation were generated similarly to those in Meuwissen and Goddard (2000). In all simulated populations, except for a crossbred population that is described later, the QTL alleles were uniquely numbered in the founders. So with an effective population size of 100, the initial frequency of each QTL allele is 0.005. In all simulations, one QTL allele with a frequency >0.1 in the final generation was randomly selected to be the mutant QTL allele. This mutant allele was given an additive genetic value of 1, and the value of all other QTL alleles was set to 0. The phenotypic value for each individual in the final generation was calculated by adding the QTL allele effects to an environmental effect sampled from N(0, 1).
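The phenotype model described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function names, the tuple representation of a QTL genotype, and the `rng` argument are hypothetical and were introduced here for the example.

```python
import random


def founder_allele_frequency(effective_size):
    """Initial frequency of each uniquely numbered QTL allele.

    Assumes each of the 2N founder chromosomes carries its own allele label.
    """
    return 1.0 / (2 * effective_size)


def simulate_phenotypes(qtl_genotypes, mutant_allele, rng):
    """Return one phenotypic value per individual.

    qtl_genotypes: list of (allele_1, allele_2) founder-allele labels
                   (hypothetical representation).
    mutant_allele: label of the allele given additive value 1; all
                   other alleles have value 0.
    rng:           any object with a gauss(mu, sigma) method, for
                   example random.Random(seed).
    """
    phenotypes = []
    for allele_1, allele_2 in qtl_genotypes:
        # Sum of the two allele effects (0, 1, or 2).
        genetic_value = (allele_1 == mutant_allele) + (allele_2 == mutant_allele)
        # Environmental effect sampled from N(0, 1).
        phenotypes.append(genetic_value + rng.gauss(0.0, 1.0))
    return phenotypes


if __name__ == "__main__":
    rng = random.Random(1)
    print(founder_allele_frequency(100))
    print(simulate_phenotypes([(7, 12), (12, 12), (3, 7)], 12, rng))
```

With an effective population size of 100, `founder_allele_frequency` reproduces the initial frequency of 0.005 quoted in the text.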

As explained below, additional resources would be necessary to complete an experiment that uses haplotypes as compared to single markers. To determine the haplotypes of an individual, the genotypes of both parents may be required. Assuming all individuals in the final generation have different parents, up to three times as many genotypes would be required for an experiment that uses a haplotype-based analysis as compared to a single-marker-based analysis. Thus, given the same resources, single-marker-based analyses would permit a higher marker density. So, the regression analyses were also simulated with a higher density of 20 markers to compare the methods under more equitable resources.

Alternative populations: To test robustness of the methods to population history assumptions, several populations that differed from the default for one or more conditions were created. In the first, the population was created by crossing two breeds with divergent allele frequencies for two QTL alleles (see Table 1). After crossing, the population was randomly mated for 1, 5, 10, 20, or 100 generation(s). In the second population, the QTL was fixed at a position other than the center of the haplotype. In the third population, marker allele frequencies were assigned at random in the founder generation within a range of 0.2–0.8. In the last population, a “worst-case scenario” that differed from the default for all three conditions listed above was created. Details of all simulations are summarized in Table 1.

Maximum-likelihood estimation (IBD method): To fine map the QTL, phenotypic data in the final generation for a single trait, assuming one record per individual, were modeled following the method of Meuwissen and Goddard (2000) by $y = Xb + a + e$ (1), where $y$ is a vector of phenotypic values, $b$ is a vector of fixed effects, which here reduces to the overall mean, $X$ is an incidence matrix for $b$, which reduces to a vector of ones, $a$ is the vector of random genotypic values at the QTL, and $e$ is the vector of residuals. The variance-covariance matrix of residuals is $\mathrm{Var}(e) = R\sigma_e^2$, where $R$ is an identity matrix. The variance of the vector of genotypic values is $\mathrm{Var}(a) = G_p\sigma_a^2$, where $G_p$ is the additive relationship matrix for the QTL conditional on marker information, when the QTL is at position $p$. Meuwissen and Goddard (2000) fitted $Zh$ in place of $a$ in Equation 1, where $h$ is a vector of random haplotype effects and $Z$ is an incidence matrix for $h$. The size of $h$ is $q \times 1$, where $q$ is the number of unique marker haplotypes in the final generation. Their model assumed that identical marker haplotypes contain the same QTL allele. However, it is theoretically possible for two identical marker haplotypes to contain different QTL alleles. Model (1) does not make this assumption. Thus the covariance is modeled more accurately using Equation 1 than using the model of Meuwissen and Goddard (2000), which likely overestimates the covariance between individuals in some cases.

The additive relationship coefficient between two individuals is twice the probability that a random allele from one individual is identical by descent to a random allele from the other individual. Matrix $G_p$ contains these relationship coefficients for a QTL at position $p$, given the marker haplotypes. To determine IBD probabilities for the QTL on the basis of marker haplotypes, the gene drop method described in Meuwissen and Goddard (2000) was used. This method compares a pair of haplotypes from the final generation by counting the number of markers to the left ($N_l$) and to the right ($N_r$) of the QTL that are consecutively identical in state (IIS). This assigns a haplotype pair to a distinct $(N_l, N_r)$ category. The purpose of the $(N_l, N_r)$ category is twofold. First, the category defines a region around the QTL of size $(N_l, N_r)$ that may be IBD. Second, the number of IBD probabilities that must be estimated is reduced because multiple haplotype comparisons fall into the same $(N_l, N_r)$ category. After assigning a haplotype pair to an $(N_l, N_r)$ category, it is then determined whether the haplotype pair shares QTL alleles that are IBD. The QTL alleles are all uniquely numbered in the founder generation. So, individuals with QTL alleles that are IIS must also be IBD. Each pair of haplotypes from the final generation is categorized by its $(N_l, N_r)$, and the IBD state of its QTL alleles is determined. To obtain estimates of IBD probabilities for each $(N_l, N_r)$ category, the number of times the QTL alleles were IBD for that category was divided by the number of times the $(N_l, N_r)$ category was observed across 100,000 replicates of the default simulation. These probabilities were calculated for each position that the QTL could take. Meuwissen and Goddard (2000) presented these IBD probabilities as approximations to the IBD probabilities that would be calculated if every possible haplotype pair was considered. However, as is demonstrated in the discussion, these IBD probabilities are in fact not approximations to IBD probabilities for individual haplotypes.
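The categorization and tallying steps described above can be illustrated as follows. This is a sketch under assumptions, not the gene drop program used in the study: haplotypes are represented as lists of marker alleles, the QTL is assumed to sit between markers `bracket` and `bracket + 1` (0-based), and both function names are hypothetical.

```python
from collections import defaultdict


def iis_category(hap_a, hap_b, bracket):
    """Return (n_left, n_right) for a pair of marker haplotypes.

    Counts markers that are consecutively identical in state, moving
    outward from the QTL, which is assumed to lie between marker
    indices bracket and bracket + 1.
    """
    n_left = 0
    for i in range(bracket, -1, -1):
        if hap_a[i] != hap_b[i]:
            break
        n_left += 1
    n_right = 0
    for i in range(bracket + 1, len(hap_a)):
        if hap_a[i] != hap_b[i]:
            break
        n_right += 1
    return n_left, n_right


def estimate_ibd_probabilities(observations):
    """Estimate P(IBD) per category as (times IBD) / (times observed).

    observations: iterable of ((n_left, n_right), is_ibd) pairs pooled
                  over simulation replicates.
    """
    times_seen = defaultdict(int)
    times_ibd = defaultdict(int)
    for category, is_ibd in observations:
        times_seen[category] += 1
        if is_ibd:
            times_ibd[category] += 1
    return {cat: times_ibd[cat] / times_seen[cat] for cat in times_seen}
```

Because QTL alleles are uniquely numbered in the founders, the `is_ibd` flag in such a tally can be obtained simply by checking whether the two haplotypes carry the same QTL allele label.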

TABLE 1

Parameters for default and alternative simulated populations

By assuming multivariate normality, the residual log-likelihood of model (1) is $L(G_p, \sigma_a^2, \sigma_e^2) \propto -0.5\,[\ln|V| + \ln|X'V^{-1}X| + (y - X\hat{b})'V^{-1}(y - X\hat{b})]$, where $V = \mathrm{Var}(y) = G_p\sigma_a^2 + R\sigma_e^2$ and $\hat{b}$ is the generalized least-squares estimate of $b$. For every central position of a marker bracket, $p$, that was considered for the QTL, the likelihood was maximized with respect to the variance components $\sigma_a^2$ and $\sigma_e^2$. The position with the highest log-likelihood was the estimated position of the QTL. Simulations using the IBD method for mapping were replicated 1000 times.
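For readers who want to see the residual log-likelihood evaluated numerically, the sketch below computes it for a given relationship matrix and pair of variance components, using the fact that X reduces to a vector of ones in model (1). It is an illustration only: the pure-Python Cholesky routines and all function names are assumptions made here, the additive constant is dropped, and the maximization over the variance components (for example by a grid search at each candidate position) is left out.

```python
import math


def cholesky(a):
    """Lower-triangular factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def solve_spd(low, rhs):
    """Solve V x = rhs given the Cholesky factor of V."""
    n = len(rhs)
    z = [0.0] * n
    for i in range(n):  # forward substitution
        z[i] = (rhs[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def residual_loglik(y, g, var_a, var_e):
    """Residual log-likelihood (up to a constant) with X a vector of ones."""
    n = len(y)
    # V = G * var_a + I * var_e
    v = [[g[i][j] * var_a + (var_e if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    low = cholesky(v)
    log_det_v = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    vinv_one = solve_spd(low, [1.0] * n)
    xtvx = sum(vinv_one)  # 1' V^-1 1, a scalar here
    b_hat = sum(w * yi for w, yi in zip(vinv_one, y)) / xtvx  # GLS mean
    resid = [yi - b_hat for yi in y]
    quad = sum(r * w for r, w in zip(resid, solve_spd(low, resid)))
    return -0.5 * (log_det_v + math.log(xtvx) + quad)
```

In a position scan of the kind described in the text, this function would be called once per candidate bracket with that bracket's relationship matrix, and the bracket giving the largest maximized value would be reported.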

Single-locus regression models: For fine mapping using marker regression methods, the phenotypic data for the final generation were modeled by $y = Xb + e$. (2) In the first single-locus (SL) model, $y$ is a vector of phenotypic data, $b$ is a 2 × 1 vector $(\mu_0, \mu_1)$ that contains the intercept and the regression coefficient for a single-marker locus, and $X$ is an incidence matrix for $b$. The hypothesis $H_0: \mu_1 = 0$ vs. $H_A: \mu_1 \neq 0$ was tested for every marker locus. The position of the marker locus with the largest F-statistic was the estimated position of the QTL. Simulations using any regression-based method for mapping were replicated 10,000 times as they were much less computationally intensive than the IBD method.
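The single-marker scan can be sketched as below. This is an illustrative version, not the authors' code: it assumes each marker is coded as the number of copies of allele 1 carried by an individual (the coding is not stated in the text), and the function names are hypothetical.

```python
def single_marker_f(marker, y):
    """F-statistic (1 and n - 2 d.f.) for regression of y on one marker."""
    n = len(y)
    mean_x = sum(marker) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in marker)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(marker, y))
    syy = sum((v - mean_y) ** 2 for v in y)
    if sxx == 0.0:
        return 0.0  # monomorphic marker carries no information
    ss_regression = sxy ** 2 / sxx
    ss_residual = syy - ss_regression
    if ss_residual <= 0.0:
        return float("inf")  # perfect fit
    return ss_regression / (ss_residual / (n - 2))


def map_qtl_single_marker(marker_columns, y, positions):
    """Return the position (cM) of the marker with the largest F-statistic.

    marker_columns: one list of allele counts per marker (assumed coding).
    positions:      map position of each marker in centimorgans.
    """
    f_values = [single_marker_f(column, y) for column in marker_columns]
    best = max(range(len(f_values)), key=f_values.__getitem__)
    return positions[best], f_values
```

Note that this estimator can only return a marker position, so it cannot land exactly on a QTL simulated between two markers.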

For the second single-locus model (SL2), two adjacent loci were tested for association with the QTL. This model was included to determine if regression on two flanking markers could perform better than regression on a single marker or better than the IBD method, which also attempts to position the QTL between two flanking markers. Phenotypic data for the final generation were modeled as in Equation 2 except that $b$ is a 4 × 1 vector of allelic effects $(\mu_{0i}, \mu_{1i}, \mu_{0j}, \mu_{1j})$ for alleles 0 and 1 at two adjacent marker loci $(i, j)$. The hypothesis $H_0: \mu_{0i} = \mu_{1i}$ and $\mu_{0j} = \mu_{1j}$ vs. $H_A: \mu_{0i} \neq \mu_{1i}$ or $\mu_{0j} \neq \mu_{1j}$ was tested for every pair of adjacent marker loci (marker bracket). The center of the marker bracket with the largest F-statistic was the estimated position of the QTL.

Two-locus haplotype regression model: In this model (HAP), a haplotype was constructed from two adjacent marker loci. This model was included to examine the ability of regression to utilize flanking marker information, but in this case the markers were fit as a haplotype to more closely resemble the IBD method. Phenotypic data for the final generation were modeled as in Equation 2, except that $b$ is a 5 × 1 vector including the intercept and haplotype effects $(\mu, \mu_{00}, \mu_{01}, \mu_{10}, \mu_{11})$ for alleles 0 and 1 at two adjacent marker loci. The hypothesis $H_0: \mu_{00} = \mu_{01}$ and $\mu_{00} = \mu_{10}$ and $\mu_{00} = \mu_{11}$ vs. $H_A: \mu_{00} \neq \mu_{01}$ or $\mu_{00} \neq \mu_{10}$ or $\mu_{00} \neq \mu_{11}$ was tested for every marker bracket. The center of the two-locus haplotype (marker bracket) with the largest F-statistic was the estimated position of the QTL.

Comparison of methods: To evaluate the ability of the methods to estimate the QTL position, the absolute differences between the estimated QTL position and the true QTL position were obtained for each method from each replicate of a simulation as absolute difference $= |\hat{\theta}_i - \theta|$, where $\hat{\theta}_i$ is the estimated QTL position in centimorgans for replicate $i$ and $\theta$ is the true position of the QTL in centimorgans.

Bias of each method was estimated by bias $= \frac{\sum_{i=1}^{n} \hat{\theta}_i}{n} - \theta$, where $n$ is the number of replicates performed for a method.

To test for differences in mapping accuracies between methods, absolute differences for all replicates of a simulation were analyzed using ANOVA (JMP version 5.0; SAS Institute, Cary, NC) with method fit as a fixed effect. Although absolute differences are not normally distributed, ANOVA is known to be robust when the sample size is large as in this study. The least-squares mean of absolute differences (LSMD) was obtained for each method. The LSMD is a measure of a method's ability to estimate the position of the QTL, and a method with a smaller LSMD is preferable.
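The two summary statistics can be computed per method as in the sketch below. The function names are hypothetical. The sketch also relies on the fact that, in a one-way layout with method as the only fixed effect, the least-squares mean for a method coincides with the plain mean of that method's absolute differences; the significance tests from the ANOVA are not reproduced here.

```python
def mean_absolute_difference(estimates, true_position):
    """Mean of |estimate - true position| over replicates (cM)."""
    return sum(abs(e - true_position) for e in estimates) / len(estimates)


def bias(estimates, true_position):
    """Mean estimated position minus the true position (cM)."""
    return sum(estimates) / len(estimates) - true_position


if __name__ == "__main__":
    # Made-up position estimates from four replicates, true QTL at 4.5 cM.
    estimates = [4.5, 5.5, 3.5, 4.5]
    print(mean_absolute_difference(estimates, 4.5))
    print(bias(estimates, 4.5))
```

The example shows why both statistics are reported: estimates scattered symmetrically around the true position give zero bias but a nonzero mean absolute difference.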

RESULTS

Comparison under the default population: The IBD method with 10 markers was compared to the regression methods SL, SL2, and HAP, each with 10 markers. The LSMD for each method using three different marker spacings is presented in Table 2.

The average LSMD across methods using 10 markers was 1.41 cM when the marker spacing was 1 cM, indicating that the mapping resolution of all methods was fairly good. At this marker spacing, an average QTL position estimate could be expected to deviate from the true QTL position by <2 markers or marker brackets. Additionally, average mapping resolution improved roughly in proportion as the marker spacing decreased. The average LSMDs across methods using 10 markers were 0.74 and 0.42 cM for marker spacings of 0.5 and 0.25 cM, respectively. In both cases, an average QTL position estimate could again be expected to deviate from the true QTL position by <2 markers or marker brackets.

TABLE 2

Least-squares mean absolute difference (centimorgans) of QTL position estimates for four mapping methods using 10 or 20 markers under the default scenario

The bias of all four methods under the default simulation was approximately zero. The mean QTL position estimate for each regression method differed from the true QTL position by ≤±0.05 cM, regardless of marker spacing. The IBD method's mean QTL position estimate differed from the true QTL position by 0.1 cM when the marker spacing was 1 cM and differed by ∼0.02 cM when the markers were spaced 0.5 and 0.25 cM apart. A bias of zero was expected because the QTL was positioned in the center of the marker haplotype.

Comparing LSMD across methods, the IBD method was significantly better at estimating the position of the QTL than the SL method with 10 markers (SL-10) for all three marker spacings (Table 2). The SL-10 method was significantly better than the SL2 method with 10 markers (SL2-10) when the marker spacings were 1 and 0.5 cM. Interestingly, fitting a two-locus haplotype in regression (the HAP method) using 10 markers performed similarly to the IBD method regardless of marker spacing.

Next, the regression methods were allowed 20 genotyped markers and were then compared to the IBD method, in an attempt to evaluate the approaches under more equitable genotyping costs, considering that the IBD method requires knowledge of haplotypes. The HAP method also requires knowledge of haplotypes, but it too was allowed 20 markers to determine if additional information could improve its mapping resolution and to provide a more complete comparison. The SL method using 20 markers (SL-20) was significantly better than all other methods at positioning the QTL in its true location when markers were spaced either 0.5 or 0.25 cM apart (Table 2). However, when markers were spaced 0.125 cM apart (0.25 cM for IBD), SL-20 was not significantly better than IBD. With 20 markers, SL2 was significantly poorer than SL-20 and IBD at positioning the QTL. The SL2 method may perform consistently worse than SL because more degrees of freedom are associated with the markers for this model (2 d.f.) than for the SL model (1 d.f.).

Again, biases of the regression-based methods were small (<±0.04 cM) except for the SL2 method with 20 markers at 0.5 cM marker spacing. Its mean position estimate differed from the true position by –0.12 cM. However, at smaller marker spacings, bias of the SL2 method was <–0.04 cM.

In general, LSMD of the SL method was smaller when 20 markers were used as compared to 10 for all marker spacings (Table 2). Interestingly, in the case of SL2, LSMD changed very little when 20 markers were used as compared to 10 for all marker spacings (Table 2). So the ability to utilize extra information from additional markers appears to be dependent upon the method of analysis.

Two-breed cross followed by random mating: Two breeds were simulated, each of effective size 100, which had the same two QTL alleles but at different frequencies (see Table 1). The number of generations of random mating that occurred after the initial cross of the two breeds ranged between 100 and 1. The LSMDs for the IBD method and the SL method with 10 (20) markers for each of the different numbers of generations of random mating are shown in Table 3. Marker spacing was set to 1 (0.5) cM, and the QTL was located at the center of the marker haplotype. Due to the poor performance of the SL2 method in the default population, it was not tested in any of the alternative populations. The HAP method was not tested in any of the alternative populations to focus on the comparison between single-marker-based analysis and the IBD method.

TABLE 3

Least-squares mean absolute difference (centimorgans) of QTL position estimate for mapping methods with 1-cM marker spacing in a two-breed cross followed by random mating

Population admixture affected the accuracy of all methods negatively (Table 3). Even with 100 generations of random mating, LSMD was greater than that in the default population for both methods (Table 2). In fact, the LSMD of the IBD and regression methods was often greater than the LSMD of a randomly selected QTL position, which is 2 cM for the 10-marker case (1 cM spacing) and 2.25 cM for the 20-marker case (0.5 cM spacing) with a centrally located QTL. Note, however, that a centrally located QTL is most favorable for a random estimator of QTL position; i.e., the LSMD of a randomly selected QTL position will be smallest when the true QTL is located in the center of the chromosome. All of the simulated populations, except for the noncentral QTL and worst-case scenario, included a centrally located QTL. So, the accuracy of the methods is compared to the most accurate random QTL position estimate. Bias of the methods remained small, ranging from –0.17 to 0.16 cM. As the number of generations of random mating decreased, LSMD tended to increase. However, when the number of generations of random mating decreased from 100 to 20, LSMD decreased for all methods. This may be due to the fact that initially only two QTL alleles were in this population and after 100 generations of mating the QTL alleles attained extreme frequencies or became fixed in many replicates, resulting in lower mapping resolution.

In nearly all cases, the IBD method was significantly better than the SL-10 method but not significantly different from the SL-20 method (Table 3). With 100 generations of random mating, however, the SL-20 method was significantly better and there was no difference between the IBD and SL-10 methods. When only one generation of random mating occurred after the cross, a situation comparable to an F2 population, the SL-20 and IBD methods were better than the SL-10 method. A basic assumption of the IBD method was violated in this population, namely the assumption about the event that created the linkage disequilibrium. Because regression methods make no assumptions about population history, it was expected that the mapping accuracy of the IBD method would be more negatively affected than that of the regression methods. However, both methods had similar mapping accuracies. So, violating this assumption had no impact on the comparison of the methods.

Noncentral QTL position: In this population, the QTL was positioned halfway between markers 3 and 4 (or markers 6 and 7 when 20 markers were genotyped) and the IBD method was compared to the SL method with 10 (20) markers. The LSMD for each method with marker spacing of 1 (0.5) cM is presented in Table 4.

Both the SL-10 method and the IBD method had larger LSMDs when the QTL was positioned toward the beginning of the marker haplotype instead of at the center. However, the LSMD of the SL-20 method did not change when the QTL was positioned toward the beginning of the marker haplotype. For this population, the SL-20 method was best able to estimate the position of the QTL while the SL-10 method was least able. However, all methods had much greater mapping accuracy than that of a randomly selected QTL position. The LSMD for a randomly chosen QTL position is 2.4 cM when 10 markers (1-cM spacing) are used and the QTL is between markers 3 and 4 and 2.58 cM when 20 markers (0.5-cM spacing) are used and the QTL is located between markers 6 and 7.

Bias was observed in all methods, as expected, due to the noncentral position of the QTL. Bias was smallest for the SL-20 method, at 0.36 cM, followed by the IBD method at 0.51 cM, and the SL-10 method at 0.63 cM (Table 4). Although bias of the SL-20 method increased from 0.02 to 0.36 cM with a noncentral position of the QTL, LSMD of the SL-20 method did not change (Table 4). Unlike the SL-20 method, the SL-10 and IBD methods showed an increase in both bias and LSMD for a noncentral QTL. The bias of all three methods remained relatively small though, as the bias for a randomly selected QTL position is 2 cM for both the 10- and 20-marker case.

TABLE 4

Least-squares mean absolute difference (centimorgans) of QTL position estimate and bias (centimorgans) for mapping methods in three alternate scenarios

Variable marker allele frequencies: In all previous populations, initial frequency of the marker alleles was 0.5. Here marker allele frequencies in the founders were randomly set at each marker locus within a range of 0.2 and 0.8 and then the IBD method was compared to the SL method using 10 (20) markers. The LSMDs for these methods at a marker spacing of 1 (0.5) cM are shown in Table 4.

The performance of all methods in this population was similar to their performance in the default population (Tables 2 and 4). The LSMDs of all methods increased by 0.04 cM or less from their LSMDs in the default. Additionally, the bias for all three methods remained close to zero, ranging from 0.03 to –0.09 cM (Table 4). Comparing methods, the LSMD of the SL-20 method was smallest, while the LSMD of the SL-10 method was highest. This ranking of methods is the same as for the default population. So, it appears that the SL and IBD methods were not sensitive to marker allele frequencies.

Worst-case scenario: The previous alternative populations differed from the default by only one condition. Here, several conditions were changed from the default population to create a worst-case scenario. First, the two breeds described previously were crossed, followed by 10 generations of random mating. Second, the QTL was positioned between marker loci 3 and 4 when 10 markers were genotyped and between marker loci 6 and 7 when 20 markers were genotyped. Third, marker frequencies of the founders were set at random, as described previously.

The IBD method and the SL method using 10 (20) markers were tested for this worst-case scenario with a marker spacing of 1 (0.5) cM and their LSMDs are shown in Table 4. The LSMD of all methods increased drastically compared to the default population. The average LSMD for the SL-10, SL-20, and IBD methods increased from 1.33 cM under the default conditions to 2.52 cM in this population. The LSMDs of the three methods were similar to the LSMD of a randomly selected QTL position, which is 2.4 cM when 10 markers (1-cM spacing) are used and 2.58 cM when 20 markers (0.5-cM spacing) are used and the QTL is in a noncentral location as mentioned previously. Biases also increased markedly, from a range of –0.04 to 0.1 cM in the default scenario, to a range of 1.49 to 1.76 cM in the worst-case scenario (Table 4). These values are similar to the bias of a randomly selected QTL position, which is 2 cM as described previously. Bias was toward the center of the chromosome for all methods. The large positive bias and the near doubling of the LSMD when compared to the default are unique to this population. However, when comparing LSMD across methods, the results are not unique. Here the SL-20 method was not significantly different from the IBD method, and both were significantly better than the SL-10 method. This result is similar to the results from the two-breed cross in which, in nearly all cases, the SL-20 method and the IBD method were similar and significantly better than SL-10 (Table 3).

DISCUSSION

Comparing performances of mapping methods: Results from this work show that least-squares regression on a single marker is an effective method for LD-based fine mapping of QTL if a dense marker map is available. In situations that were both ideal and nonideal for the IBD method of Meuwissen and Goddard (2000), mapping precision of the IBD method was greater than that of the SL method, given an equal number of markers. Mapping precision of the SL method using 20 markers was similar to or greater than that of the IBD method with 10 markers. It should be pointed out, however, that mapping precision of the SL method was underestimated in the populations simulated here, because the SL method estimates the position of the QTL at a marker locus, but the true position of the QTL was always simulated at the center between two marker loci. Thus, the most accurate QTL position estimate the SL method can have is at one of the markers flanking the true QTL, which introduces an inherent level of error for the simulations performed here. In contrast, the IBD method estimates the position of the QTL at the center of a marker bracket, which is where the QTL is simulated, so it does not have an inherent error.

The comparable performance of the IBD and SL methods contradicts the generally held expectation that using more information (i.e., a haplotype) results in better estimates. One possible explanation is that the IBD probability matrices were similar for adjoining positions of the QTL; in other words, they were not sensitive to the position of the QTL. The likelihoods for adjoining positions were therefore also similar, possibly resulting in decreased mapping precision. Further studies will examine how the number of markers considered in the haplotype affects the sensitivity of the IBD probability matrices and mapping precision.

Another possible explanation for this contradictory result may stem from the fact that the regression-based methods model the disequilibrium using location parameters (mean effects of marker alleles), while the IBD method models the disequilibrium using dispersion parameters (variance of genotypic values and error variance). It is well known that location parameters are easier to estimate than dispersion parameters. Thus, single-marker regression-based methods may have an inherent advantage over the IBD method.
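To illustrate what modeling disequilibrium through location parameters can look like, the following is a minimal sketch of a single-marker least-squares scan. It is not the authors' implementation: the 0/1 allele coding, the use of the smallest residual sum of squares as the selection criterion, the function names, and the toy data are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def rss_single_marker(x: Sequence[float], y: Sequence[float]) -> float:
    """Residual sum of squares from the simple regression y = a + b * x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0.0:
        # Monomorphic marker: the slope is not estimable, so nothing is explained.
        return syy
    return syy - sxy ** 2 / sxx


def scan_markers(
    marker_columns: Sequence[Sequence[float]],
    phenotypes: Sequence[float],
    positions_cM: Sequence[float],
) -> Tuple[float, float]:
    """Return (position, RSS) of the marker whose regression fits best."""
    fits: List[Tuple[float, float]] = [
        (rss_single_marker(column, phenotypes), position)
        for column, position in zip(marker_columns, positions_cM)
    ]
    best_rss, best_position = min(fits)
    return best_position, best_rss


# Toy data (hypothetical): three markers scored on six records.
marker_columns = [
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 1],
    [1, 0, 0, 1, 1, 0],
]
phenotypes = [1.0, 1.1, 2.0, 2.1, 0.9, 2.2]
positions_cM = [0.0, 1.0, 2.0]

best_position, best_rss = scan_markers(marker_columns, phenotypes, positions_cM)
print(best_position, round(best_rss, 4))
```

Each marker requires only a mean and a slope to be estimated, which is the sense in which the regression approach works with location parameters rather than variance components.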

Effects of alternative populations: Several alternative populations were considered in this study to test robustness of the fine-mapping methods and to determine if any methods were particularly sensitive to deviations from the default population.

First, in the default, it was assumed that a mutation on a founder chromosome was responsible for creating the linkage disequilibrium in the population. The IBD probabilities were generated under the assumption that 100 generations of random mating in a population of effective size 100 had elapsed since the mutation occurred. Meuwissen and Goddard (2000) showed that the mapping accuracy of their method was not affected by violations of these assumptions such as altering effective population size and the number of generations of random mating since the mutation occurred. However, they did not consider an alternative event to create the initial linkage disequilibrium.

In two alternative populations in this study, the two-breed cross and the worst-case scenario, a cross between two breeds created the initial disequilibrium. It may be that these two breeds diverged from a common population several generations ago and were reintroduced. Sabry et al. (2002) tested the IBD method in a similar population in which four populations diverged from a founder population, were reintroduced after 90 generations, and were allowed to randomly mate for 6 generations. Sabry et al. (2002) found the IBD method to be robust to this population structure, in contrast to our result that performance of the IBD method was much worse in the two-breed cross and the worst-case scenario than in the default population. However, the regression methods also performed much worse in these two alternative populations than in the default population (Tables 2, 3, and 4). In fact, the mapping accuracy of all methods was similar to, or even less than, the accuracy of a randomly selected QTL position for both alternative populations. The worst-case scenario does include a noncentral QTL and randomly set marker allele frequencies, which the two-breed cross does not, but these were shown to have little effect on mapping ability. Thus, the decrease in mapping accuracy for all methods is apparently due to the introduction of population admixture. Other population events such as recent bottlenecks or recurrent mutation at the QTL may also decrease the ability of the methods to fine map a QTL. Further research is needed to compare methods under these scenarios.

Second, any or all methods may be affected if the QTL is not located in the center of the chromosomal region evaluated. If the QTL is closer to either end of a chromosomal region, then there will be fewer markers on one side of the QTL than on the other. Thus, there is no longer a symmetric distribution of information across the chromosomal region. The fact that LSMD of the SL-20 method did not change when the QTL position was shifted toward the beginning of the chromosome (Table 4) supports this idea. The SL-20 method maintained six markers to the left of the alternative QTL position while the IBD and SL-10 methods maintained only three markers. The additional marker information may have allowed the SL-20 method to map the QTL equally well at both QTL positions. Also, additional marker information may have allowed the SL-20 method to maintain smaller bias than the SL-10 or IBD method with a noncentral QTL (Table 4). The finite parameter space considered for the noncentral QTL introduced bias for all methods. Bias of SL-10 was largest (Table 4), indicating that the additional markers, and possibly the decreased marker spacing, of SL-20 greatly improved its mapping accuracy.

Third, IBD probabilities were calculated under the assumption that initial frequencies of all marker alleles were 0.5 and violating this assumption may have an effect on the IBD method. A marker is most informative when its frequency is 0.5 so marker allele frequencies that deviate from 0.5 should also affect any fine-mapping method. However, results from this study showed that the IBD method and the regression-based methods perform as well in this alternative population as in the default population. Thus, the deviation of marker frequencies from 0.5 had essentially no impact on the ability of the methods to map the QTL. This is an important result because it seems unlikely that in an actual population the frequencies of all marker alleles would be 0.5. Markers with more extreme allele frequencies were not considered because they would not be utilized in an experimental situation. So the range of founder allele frequencies used in this population is reasonable because it does not cause marker alleles to have extreme frequencies or to reach fixation in generation 100 such that mapping precision is decreased. Although all methods were robust to this alternative population, the SL-20 method was again best able to estimate the position of the QTL and thus would be the preferred method for a fine-mapping experiment if the markers were available.
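The statement that a marker is most informative at a frequency of 0.5 can be checked with a short sketch. It assumes a biallelic marker and uses expected heterozygosity, 2p(1 - p), as the measure of informativeness; the text does not specify which measure is intended, so this choice is an assumption.

```python
def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1 - p) of a biallelic marker with allele frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return 2.0 * p * (1.0 - p)


# Hypothetical grid of allele frequencies, for illustration only.
frequencies = (0.1, 0.3, 0.5, 0.7, 0.9)
for p in frequencies:
    print(p, round(expected_heterozygosity(p), 2))

most_informative = max(frequencies, key=expected_heterozygosity)
```

The measure peaks at p = 0.5 and declines only gradually for moderate deviations, which is consistent with the finding that moderate departures from 0.5 had little impact on mapping.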

Estimation of IBD probabilities: As noted earlier, IBD probabilities were not obtained for every possible haplotype pair but instead were estimated for groups of haplotype pairs that shared a similar distribution of IIS marker alleles around the QTL. Meuwissen and Goddard (2000) presented the IBD probabilities derived from the gene drop method as approximations to those based on individual haplotype comparisons. In fact, the IBD probabilities based on haplotype pairs are identical to IBD probabilities based on (Nl, Nr) categories. This is because the IBD state of two QTL alleles depends only on the number of consecutive marker alleles flanking the QTL that are IIS. The first pair of non-IIS alleles that is reached indicates a recombination event in the population simulated here. Thus, marker alleles beyond this locus are no longer informative for determining the IBD state of the QTL alleles. This was confirmed by simulating a default population with 4 markers instead of 10 and calculating an IBD probability for each haplotype pair. The IBD probability of each haplotype pair was the same as the IBD probability of the appropriate (Nl, Nr) category for the haplotype pair. This is an important result because if IBD probabilities are based on individual haplotype pairs, the number of IBD probabilities that must be estimated increases exponentially as the number of markers increases. The ability to group haplotype pairs into (Nl, Nr) categories is essential for the efficient use of the IBD method.
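The grouping into (Nl, Nr) categories can be sketched as follows. The sketch assumes biallelic markers ordered along the chromosome and a QTL located between two adjacent markers; the function names, the allele coding, and the convention of counting unordered haplotype pairs (including a haplotype paired with itself) are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def iis_run_lengths(hap_a: Sequence[int], hap_b: Sequence[int], n_left: int) -> Tuple[int, int]:
    """Return (Nl, Nr): consecutive IIS marker alleles to the left and right of the QTL.

    The QTL is assumed to lie between marker index n_left - 1 and n_left.
    Counting stops at the first non-IIS marker on each side.
    """
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must have the same number of markers")
    n_l = 0
    for i in range(n_left - 1, -1, -1):  # walk leftward from the QTL
        if hap_a[i] != hap_b[i]:
            break
        n_l += 1
    n_r = 0
    for i in range(n_left, len(hap_a)):  # walk rightward from the QTL
        if hap_a[i] != hap_b[i]:
            break
        n_r += 1
    return n_l, n_r


def count_categories(n_markers: int, n_left: int) -> int:
    """Number of distinct (Nl, Nr) categories for a given QTL bracket."""
    return (n_left + 1) * (n_markers - n_left + 1)


def count_unordered_haplotype_pairs(n_markers: int, n_alleles: int = 2) -> int:
    """Number of unordered haplotype pairs, allowing a haplotype to pair with itself."""
    n_haplotypes = n_alleles ** n_markers
    return n_haplotypes * (n_haplotypes + 1) // 2


# Hypothetical 10-marker haplotypes with the QTL in the central bracket.
hap_a = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
hap_b = [1, 1, 1, 0, 1, 0, 1, 1, 0, 0]
print(iis_run_lengths(hap_a, hap_b, n_left=5))

for n_markers in (4, 10):
    print(n_markers,
          count_categories(n_markers, n_markers // 2),
          count_unordered_haplotype_pairs(n_markers))
```

Under these counting conventions, 10 markers with a central QTL give 36 categories against 524,800 haplotype pairs, which illustrates why grouping matters for computational efficiency.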

Current use of fine-mapping methodology: The application of fine-mapping methods for positional cloning of a QTL in livestock has appeared only recently (Grisart et al. 2001; Blott et al. 2003). These studies showed that fine mapping of a previously identified chromosomal region was an important step toward identification of the gene and its causative mutation(s). Using a maximum-likelihood approach that simultaneously mined linkage and LD information in outbred half-sib pedigrees from five different dairy cattle populations, Farnir et al. (2002) were able to refine the position of a previously identified QTL on BTA 14. This eventually led to the positional cloning of the DGAT1 gene (Grisart et al. 2001). Blott et al. (2003) modified the method of Farnir et al. (2002) to consider IBD probabilities for sires' haplotypes so that a hierarchical clustering algorithm could be used to group haplotypes to fine map a QTL on BTA 20 affecting milk yield and composition. The bovine growth hormone receptor gene (GHR) was identified as a positional candidate gene and mutation in GHR was found to be associated with milk yield and composition (Blott et al. 2003). Meuwissen et al. (2002) extended the IBD method of Meuwissen and Goddard (2000) to also include pedigree information and fine mapped a QTL for twinning rate in dairy cattle to a region <1 cM. Each of these experiments took advantage of both linkage and LD information for the purposes of fine mapping, so results from this study cannot be extrapolated directly to form a comparison between regression-based fine-mapping methods and the fine-mapping methods used in Grisart et al. (2001), Meuwissen et al. (2002), or Blott et al. (2003).

However, it can be stated that if a fine-mapping experiment were to be conducted using a sample of individuals assumed to be unrelated, regression-based LD mapping methods would be expected to perform as well as IBD-based LD mapping methods. If individuals were related, given the same number of individuals, the expected number of informative markers and haplotypes would decrease, which could decrease mapping precision. Meuwissen and Goddard (2000) showed that mapping precision of their IBD method decreased when phenotypic records from 100 individuals in a population of effective size 50 were used as compared to records from the default population of effective size 100. However, the decrease in mapping precision was not large (Meuwissen and Goddard 2000). Further research is necessary to examine whether population size and relatedness among individuals will impact LD-based mapping methods.

Evidence to support our result that single-marker-based analysis is comparable to haplotype-based analysis was presented in a recent study by Zhang et al. (2003), where a variance-components analysis (Abecasis et al. 2000) was used to detect association between markers and immunoglobulin E concentration in humans. The association results that were obtained using a three-, four-, or five-marker haplotype as a sliding window across the region were not different from the association results obtained using single markers (Zhang et al. 2003). Future studies using experimental data rather than simulated data should also examine haplotype- and single-marker-based analyses to determine their mapping precision under experimental conditions.

Mapping under equitable resources: Justification for the use of 20 markers in regression analysis comes from the need to compare methods as they could be used in an experimental situation. For the population described here, resources required to conduct an experiment using information from a 10-locus haplotype are more comparable to resources required to conduct an experiment using information from 20-marker, rather than 10-marker, genotypes. In practice, it is possible to estimate haplotype information without knowing parental genotypes or to infer the haplotypes when half-sib family information is available, but the IBD method as presented by Meuwissen and Goddard (2000) requires known haplotypes from equally unrelated individuals with no pedigree information. The effect of using estimated haplotype information in the IBD method has not been studied, but it is expected that this will reduce mapping accuracy. It is debatable whether it is statistically fair to compare the SL-20 method to the IBD method with 10 markers but for experimental purposes described here it was considered fair.

The benefit of using 20 instead of 10 markers was most evident in the default population (Table 2) and in the following two alternative populations (Table 4): (1) for a noncentral QTL and (2) when marker allele frequencies were random. Thus, genotyping additional markers can improve the SL method's ability to fine map a QTL by making it more robust. Of course, depending on the extent of the LD, there will be a limit to the extra information that can be obtained by simply genotyping additional markers. It is possible that an optimum number of markers spaced an optimum distance apart exists for fine mapping. Further work is being conducted to examine this hypothesis and to examine additional properties of haplotype-based LD mapping.

Acknowledgments

The authors thank Dan Nettleton for his comments and contribution to this work. This work was supported in part by funding from the United States Department of Agriculture-National Research Initiative, Sygen International, the Iowa Agriculture and Home Economics Experiment Station, and by Hatch Act and State of Iowa funds. Laura Grapes was supported by a United States Department of Agriculture National Needs fellowship in quantitative and molecular genetics.

Footnotes

  • Communicating editor: G. A. Churchill

  • Received November 13, 2003.
  • Accepted December 10, 2003.
  • Copyright © 2004 by the Genetics Society of America

LITERATURE CITED

  1. Abecasis, G. R., L. R. Cardon and W. O. C. Cookson, 2000 A general test of association for quantitative traits in nuclear families. Am. J. Hum. Genet. 66: 279–292.
  2. Blott, S., J. Kim, S. Moisio, A. Schmidt-Küntzel, A. Cornet et al., 2003 Molecular dissection of a quantitative trait locus: a phenylalanine-to-tyrosine substitution in the transmembrane domain of the bovine growth hormone receptor is associated with a major effect on milk yield and composition. Genetics 163: 253–256.
  3. Bodmer, W. F., 1986 Human genetics: the molecular challenge. Cold Spring Harbor Symp. Quant. Biol. LI: 1–13.
  4. Darvasi, A., and M. Soller, 1995 Advanced intercross lines, an experimental population for fine genetic mapping. Genetics 141: 1199–1207.
  5. Farnir, F., B. Grisart, W. Coppieters, J. Riquet, P. Berzi et al., 2002 Simultaneous mining of linkage and linkage disequilibrium to fine map quantitative trait loci in outbred half-sib pedigrees: revisiting the location of a quantitative trait locus with major effect on milk production on bovine chromosome 14. Genetics 161: 275–287.
  6. Grisart, B., W. Coppieters, F. Farnir, L. Karim, C. Ford et al., 2001 Positional candidate cloning of a QTL in dairy cattle: identification of a missense mutation in the bovine DGAT1 gene with major effect on milk yield and composition. Genome Res. 12: 222–231.
  7. Meuwissen, T. H. E., and M. E. Goddard, 2000 Fine mapping of quantitative trait loci using linkage disequilibria with closely linked markers. Genetics 155: 421–430.
  8. Meuwissen, T. H. E., A. Karlsen, S. Lien, I. Olsaker and M. E. Goddard, 2002 Fine mapping of a quantitative trait locus for twinning rate using combined linkage and linkage disequilibrium mapping. Genetics 161: 373–379.
  9. Rabinowitz, D., 1997 A transmission disequilibrium test for quantitative trait loci. Hum. Hered. 47: 342–350.
  10. Sabry, A., M. S. Lund and B. Guldbrandtsen, 2002 Robustness of a variance component QTL fine mapping method. Proceedings of the 7th World Congress on Genetics Applied to Livestock Production, August 19–23, Montpellier, France, Vol. 32, pp. 677–680.
  11. Taylor, B. A., 1978 Recombinant inbred strains: use in gene mapping, pp. 423–438 in Origins of Inbred Mice, edited by H. C. Morse. Academic Press, New York.
  12. Xiong, M., and S. Guo, 1997 Fine-scale mapping of quantitative trait loci using historical recombinations. Genetics 145: 1201–1218.
  13. Zhang, Y., N. I. Leaves, G. G. Anderson, C. P. Ponting, J. Broxholme et al., 2003 Positional cloning of a quantitative trait locus on chromosome 13q14 that influences immunoglobulin E levels and asthma. Nat. Genet. 34: 181–186.

PUBLICATION INFORMATION

Volume 166 Issue 3, March 2004

Genetics: 166 (3)

ARTICLE CLASSIFICATION

INVESTIGATIONS

Comparing Linkage Disequilibrium-Based Methods for Fine Mapping Quantitative Trait Loci

L. Grapes, J. C. M. Dekkers, M. F. Rothschild and R. L. Fernando
Genetics March 1, 2004 vol. 166 no. 3 1561-1570; https://doi.org/10.1534/genetics.166.3.1561
L. Grapes
Department of Animal Science, Iowa State University, Ames, Iowa 50011
J. C. M. Dekkers
Department of Animal Science, Iowa State University, Ames, Iowa 50011
M. F. Rothschild
Department of Animal Science, Iowa State University, Ames, Iowa 50011
R. L. Fernando
Department of Animal Science, Iowa State University, Ames, Iowa 50011
  • For correspondence: rohan@iastate.edu

