Genetics, Vol. 166, 341-350, January 2004, Copyright © 2004

Optimal Designs for Linkage Disequilibrium Mapping and Candidate Gene Association Tests in Livestock Populations

Alessandra Stella (a) and Paul J. Boettcher (b)
(a) CERSA-Fondazione Parco Tecnologico Padano, Consiglio Nazionale delle Ricerche, Segrate 20090, Italy
(b) Istituto Biologia e Biotecnologia Agraria, Consiglio Nazionale delle Ricerche, Segrate 20090, Italy

Corresponding author: Paul J. Boettcher, Palazzo LITA, Via Fratelli Cervi 93, 20090 Segrate MI, Italy. E-mail: boettch@ibba.cnr.it

Communicating editor: C. HALEY


*  ABSTRACT

Simulation was used to evaluate the performance of different selective genotyping strategies when using linkage disequilibrium across large half-sib families to position a QTL within a previously defined genomic region. Strategies examined included standard selective genotyping and different approaches of discordant and concordant sib selection applied to arbitrary or selected families. Strategies were compared as a function of effect and frequency of QTL alleles, heritability, and phenotypic expression of the trait. Large half-sib families were simulated for 100 generations and 2% of the population was genotyped in the final generation. Simple ANOVA was applied and the marker with the greatest F-value was considered the most likely QTL position. For traits with continuous phenotypes, genotyping the most divergent pairs of half-sibs from all families was the best strategy in general, but standard selective genotyping was somewhat more precise when heritability was low. When the phenotype was distributed in ordered categories, discordant sib selection was the optimal approach for positioning QTL for traits with high heritability and concordant sib selection was the best approach when genetic effects were small. Genotyping of a few selected sibs from many families was generally more efficient than genotyping many individuals from a few highly selected sires.


LINKAGE mapping has been used to detect the presence of numerous putative quantitative trait loci (QTL) in livestock species (e.g., ANDERSSON et al. 1994; GEORGES et al. 1995). In general, these studies have positioned the identified QTL within chromosomal regions spanning 10–40 cM. Although this level of precision is sufficient for some applications of marker-assisted selection, additional precision of mapping based on linkage disequilibrium (LD) between the markers and QTL may be needed for a number of reasons. First, results from linkage studies yielding a 10- to 40-cM level of precision may be confidently applied directly only for marker-assisted selection within the families in which the linkage phase between markers and QTL has been established. Mapping based on LD may be used to find markers that are located closer to the QTL of interest and can thus be assumed to have the same linkage phase across the population, even for families within which linkage phase has not been explicitly defined. The efficiency of marker-assisted selection is thus much greater for markers in population-wide LD than for markers in within-family linkage with QTL, because transmission of the QTL can be predicted more accurately and because testing of new families is unnecessary. Alternatively, one may wish to use positional candidate cloning or another strategy to identify the gene responsible for the observed phenotypic effect associated with the marker. In this instance, the length of the chromosomal region within which the potential QTL resides must be shortened to decrease the number of candidate loci to a manageable quantity. For these reasons, efforts are being made in livestock species to detect markers that have LD with QTL that exists across populations (DEKKERS 2003).

TERWILLIGER and GORING (2000) emphasized the importance of optimal experimental design for increasing the efficiency and statistical power of genetic association studies. The power (per genotype) of linkage analysis for the detection of QTL can be increased markedly by the selective genotyping of particular individuals, usually those with extreme phenotypes (LANDER and BOTSTEIN 1989). Selective genotyping has been used successfully in livestock (e.g., KIRKPATRICK et al. 2000) and is often used in linkage analysis of human disorders (e.g., CARDON et al. 1994; FULLERTON et al. 2003). However, the best selection strategies used for linkage studies may not be optimal for detection of LD (ABECASIS et al. 2001). In human studies, selection is typically based on strategies that consider family structure as well as phenotype, selecting multiple sibs with similar or dissimilar phenotypes, depending on the strategy applied (CARDON and FULKER 1994). The properties of selective genotyping that improve the power of linkage mapping are likely to improve the precision of LD mapping, but the optimal strategy will likely depend on the trait, population, and QTL of interest. ABECASIS et al. (2001) compared different strategies of selective genotyping for LD mapping in humans. Their studies were based on simulated data from small nuclear families with up to four full-sibs. The objective of the current study was to use simulation to compare the precision of fine mapping associated with several different selective genotyping strategies that may be applicable to livestock populations, in particular to populations (e.g., dairy cattle) in which artificial insemination is routinely employed to create large families of half-sibs.


*  MATERIALS AND METHODS

Base simulation:
Simulation was used to generate data for the comparison of selective genotyping strategies. Various parameters (frequency of QTL alleles, QTL variance, and total heritability) defining the population were varied in the base simulation to examine the effect of these parameters on the optimal strategy of LD mapping (Table 1). Each combination of these three parameters was simulated, for a total of eight combinations in the base situation.


 

 
Table 1. Parameters systematically varied in the base simulation

The design of the simulation was based roughly on the approach of MEUWISSEN and GODDARD (2000). Previous studies using linkage analysis were assumed to have positioned a putative QTL to lie within a small (<40 cM) chromosomal region. Additional studies then focused exclusively on this segment of the genome. New markers were placed within this region and associations with phenotypes were analyzed to estimate to which marker the QTL was most closely located. This simulation is comparable to a situation in which results from a linkage analysis within a small experimental resource population are subsequently used to identify genomic regions for LD mapping in a large commercial population.

The large population in this study consisted of 50,000 females and 125 males [N = 50,125 and Ne (effective population size) ~500]. Genotypes for each animal consisted of two haplotypes for 11 evenly spaced biallelic markers in a 20-cM chromosomal segment, an associated QTL, and a normally distributed polygenic effect. The QTL was positioned near the central marker (recombination fraction of 0.001). The simulation was based on the gene-drop approach of MACCLUER et al. (1986), in which a single segregating QTL was established in a base population and subsequent generations were simulated to allow recombination to occur among the QTL and linked markers. In the base generation, individuals were randomly assigned two alleles at each of the 11 marker loci. Initial frequencies were set at 0.50 for all marker alleles at all loci. For the QTL locus, each animal was assigned two unique alleles (i.e., original total was 2N QTL alleles). Polygenic effects were drawn from N(0, σ²_A). In subsequent generations, marker and QTL genotypes were assigned according to the rules of Mendelian inheritance, allowing for recombination within the region. Polygenic effects were equal to means of parental values plus a Mendelian sampling component. Discrete generations were assumed. Completely random mating was simulated for the first 99 generations, but all sires had exactly 400 offspring in the final generation. Analyses of LD and comparisons of selection strategies were based on the final generation. The QTL values and phenotypes were assigned only in this generation.
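The transmission step of a gene-drop simulation of this kind can be sketched as follows. This is a minimal illustration, not the authors' program: the Haldane mapping function, the function names, and the omission of the QTL locus (which could be added as one more locus with a recombination fraction of 0.001 from the central marker) are all assumptions made here for brevity.

```python
import math
import random

N_MARKERS = 11        # markers in the segment (from the text)
SEGMENT_CM = 20.0     # segment length in cM (from the text)


def haldane_recombination(distance_cm):
    """Recombination fraction for a map distance (Haldane function, assumed here)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def make_gamete(hap_a, hap_b, rec_fractions, rng):
    """Form one gamete from a parent's two haplotypes.

    hap_a, hap_b: equal-length lists of alleles ordered along the chromosome.
    rec_fractions: recombination fraction between each adjacent pair of loci.
    """
    strands = (hap_a, hap_b)
    current = rng.randrange(2)            # start on a random parental strand
    gamete = [strands[current][0]]
    for locus in range(1, len(hap_a)):
        if rng.random() < rec_fractions[locus - 1]:
            current = 1 - current         # crossover between adjacent loci
        gamete.append(strands[current][locus])
    return gamete


spacing = SEGMENT_CM / (N_MARKERS - 1)    # 2 cM between adjacent markers
rec = [haldane_recombination(spacing)] * (N_MARKERS - 1)
rng = random.Random(1)
sire_hap1 = [0] * N_MARKERS               # hypothetical sire haplotypes
sire_hap2 = [1] * N_MARKERS
gamete = make_gamete(sire_hap1, sire_hap2, rec, rng)
```

An offspring genotype would combine one such gamete from the sire with one from the dam; repeating this over discrete generations lets recombination erode the initial marker-QTL associations.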

The QTL effect was effectively biallelic, inasmuch as numerous QTL alleles were initially simulated, but a nonzero effect was assigned to only a single allele. At the end of 100 generations, the overwhelming majority of the original 2N alleles had been lost due to random drift, and remaining alleles generally had frequencies >>1/2N. The single existing allele that had a frequency (P) closest to a predefined value (Table 1) was identified and assigned a positive effect chosen to ensure that the QTL accounted for a specified proportion of the genetic variance in the last generation, and all other QTL alleles had effects of zero. The variability at the QTL (VQTL) accounted for either 30 or 10% of the total genetic variance. For simulations with the high (P ≅ 0.20) frequency for the positive QTL allele, the most frequent allele was identified and defined simply as the allele with the positive effect. This frequency ranged from 0.12 to 0.40 across replicates with a mean of ~0.20. No anomalies such as segregation distortion or population admixture were simulated.
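A minimal sketch of this assignment step is given below. It assumes purely additive gene action, so that the QTL variance equals 2P(1 - P)a² for an allele effect a; the text does not state the formula actually used, and the allele labels and function names are hypothetical.

```python
import math
from collections import Counter


def pick_qtl_allele(allele_copies, target_freq):
    """Return (allele, frequency) of the surviving allele closest to target_freq."""
    counts = Counter(allele_copies)
    total = len(allele_copies)
    allele = min(counts, key=lambda a: abs(counts[a] / total - target_freq))
    return allele, counts[allele] / total


def substitution_effect(freq, qtl_variance):
    """Allele effect a such that 2*p*(1-p)*a**2 equals qtl_variance (additive model)."""
    return math.sqrt(qtl_variance / (2.0 * freq * (1.0 - freq)))


# Toy example: 10 allele copies left in the final generation (hypothetical labels).
copies = ["A17"] * 2 + ["A03"] * 5 + ["A88"] * 3
allele, p = pick_qtl_allele(copies, target_freq=0.20)
# Phenotypic variance scaled to 1, heritability 0.30, QTL = 30% of genetic variance.
a = substitution_effect(p, qtl_variance=0.30 * 0.30)
```

Because the effect is derived from the realized frequency, the QTL explains the intended share of genetic variance in every replicate even though the allele frequency itself drifts between replicates.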

Phenotypes were assigned by summing polygenic, QTL, and random residual [~N(0, σ²_e)] effects. Heritability was either 0.05 or 0.30 (Table 1). Two types of phenotypes were simulated: (1) continuously and normally distributed and (2) dichotomous. For the dichotomous phenotype, an underlying phenotype was generated on a normal liability scale by using the same procedure applied to the continuously distributed trait, and then a threshold was imposed on this scale such that the highest 10% of the population on the continuous scale exhibited one phenotype and the remainder of the population had another. In other words, the individuals that carried the alternative (positive) QTL allele had a higher probability to express the less common phenotype. Heritability for the dichotomous trait was defined on the liability rather than on the observed scale. The base simulation included all eight combinations of parameters in Table 1 for both types of phenotypes.
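The phenotype model can be illustrated with the sketch below. It assumes a phenotypic variance of 1, an additive QTL, and genotypes sampled independently from the allele frequency rather than produced by a gene drop; all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def variance_components(h2, qtl_share):
    """Split a phenotypic variance of 1 into QTL, polygenic, and residual parts."""
    v_qtl = qtl_share * h2
    return {"qtl": v_qtl, "polygenic": h2 - v_qtl, "residual": 1.0 - h2}


def simulate_liability(rng, parts, p, a):
    """Polygenic + QTL + residual value for one animal (additive QTL assumed)."""
    copies = sum(rng.random() < p for _ in range(2))   # copies of the positive allele
    polygenic = rng.gauss(0.0, math.sqrt(parts["polygenic"]))
    residual = rng.gauss(0.0, math.sqrt(parts["residual"]))
    return polygenic + a * copies + residual


def dichotomize(liabilities, incidence):
    """Code the top `incidence` fraction of liabilities as 1 and the rest as 0."""
    n_high = max(1, round(incidence * len(liabilities)))
    cutoff = sorted(liabilities, reverse=True)[n_high - 1]
    return [1 if x >= cutoff else 0 for x in liabilities]


parts = variance_components(h2=0.30, qtl_share=0.30)
p = 0.20
a = math.sqrt(parts["qtl"] / (2.0 * p * (1.0 - p)))
rng = random.Random(7)
liability = [simulate_liability(rng, parts, p, a) for _ in range(1000)]
binary = dichotomize(liability, incidence=0.10)
# For a standard normal liability the 10% threshold sits near 1.28 SD.
threshold_sd = NormalDist().inv_cdf(0.90)
```

The continuous trait is simply `liability`; the dichotomous trait discards everything except which side of the cutoff each animal falls on.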

Selection strategies:
Selective genotyping of 1000 individuals (2%) was simulated. Six basic selection strategies were defined. The first strategy was simple random selection (RAN) of individuals for genotyping, regardless of phenotype. The second was standard selective genotyping (STD), in which the individuals with phenotypes in the highest and lowest 1% of the population were selected, regardless of family. The remaining strategies considered half-sib (sire) families in the selection process. The third strategy balanced genotyping across sires (BAL), choosing the four highest and four lowest individuals from each half-sib family. The fourth strategy was similar to the discordant sib-selection strategy applied in human studies in which sibs with diverse phenotypes are genotyped (ABECASIS et al. 2001). The 10 sires with the most variability among daughters were identified and the highest and lowest 50 offspring from each sire were genotyped (DIS10). This strategy was repeated with 20 (and 50) sires and 25 (and 10) pairs of offspring (DIS20 and DIS50). The fifth strategy identified and tested the 500 most discordant pairs (DISPAIR) of half-sibs in the population, without targeting specific sires or placing upper or lower limits on the number of pairs per sire. For this strategy, half-sib pairs were formed by matching the highest offspring from each sire with the lowest offspring, the second highest with the second lowest, and so forth. The final strategy was similar to concordant sib selection (ABECASIS et al. 2001), in which siblings with similar extreme phenotypes are chosen. Because half-sib families were much larger in this study (and in many livestock populations), this strategy was altered somewhat from typical concordant sib selection. First, the highest and lowest sires (on the basis of daughter mean) were identified. Then the highest daughters of the highest sires and lowest daughters of the lowest sires were selected. This strategy was applied for the 5, 10, and 25 highest (lowest) bulls (CON5, CON10, and CON25, respectively). The concordant sib-selection strategies involved 100, 50, and 20 offspring per sire, respectively.
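Three of the strategies described above (STD, BAL, and DISPAIR) can be sketched as simple ranking operations. The record layout `(animal_id, sire_id, phenotype)` and the function names are assumptions made for this illustration, and the toy data set is far smaller than the simulated population.

```python
from collections import defaultdict


def by_sire(records):
    """Group (animal_id, sire_id, phenotype) records per sire, sorted by phenotype."""
    families = defaultdict(list)
    for rec in records:
        families[rec[1]].append(rec)
    for offspring in families.values():
        offspring.sort(key=lambda r: r[2])
    return families


def select_std(records, n_total):
    """STD: the n_total/2 lowest and n_total/2 highest phenotypes, ignoring family."""
    ranked = sorted(records, key=lambda r: r[2])
    half = n_total // 2
    return ranked[:half] + ranked[-half:]


def select_bal(records, per_tail):
    """BAL: the per_tail lowest and per_tail highest offspring of every sire."""
    chosen = []
    for offspring in by_sire(records).values():
        chosen += offspring[:per_tail] + offspring[-per_tail:]
    return chosen


def select_dispair(records, n_pairs):
    """DISPAIR: the n_pairs half-sib pairs with the largest within-pair difference."""
    pairs = []
    for offspring in by_sire(records).values():
        for i in range(len(offspring) // 2):
            low, high = offspring[i], offspring[-1 - i]   # i-th lowest with i-th highest
            pairs.append((high[2] - low[2], low, high))
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return [animal for _, low, high in pairs[:n_pairs] for animal in (low, high)]


# Toy data: 18 animals, 3 sires, distinct phenotypes 0..17 (hypothetical values).
records = [(i, i % 3, float((i * 7) % 18)) for i in range(18)]
std = select_std(records, 4)
bal = select_bal(records, 1)
dispair = select_dispair(records, 2)
```

With the population in the text, `select_std(records, 1000)`, `select_bal(records, 4)`, and `select_dispair(records, 500)` would each return 1000 animals, matching the 2% genotyping budget.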

When applying these various strategies to the dichotomous trait, many half-sibs had the same phenotype, making it impossible to rank sibs or to precisely determine the highest and lowest animals. In such cases, selection among animals with the same phenotype was done at random. In addition, the DISPAIR strategy was applied only for the continuous phenotype, because with the dichotomous trait all sires had multiple identical pairs of offspring with one high and one low phenotype.

Analyses:
To test for the most likely location of the QTL, simple ANOVA of marker effects was performed for each of the 11 marker loci. The locus with the highest resulting F-statistic was considered the most likely location of the QTL. Although ANOVA is theoretically appropriate only for continuous traits, it was used also for the dichotomous phenotypes, because the primary objective was to determine the marker with the highest association with phenotypes, rather than to apply a significance test. In addition, the ANOVA F-test applied to a dichotomous variable is functionally related to the R-square and chi-square statistics and, as a result, to the standard φ coefficient for dichotomous variables. Preliminary analyses (our unpublished results) also examined use of half-sib applications of the pedigree disequilibrium test (MARTIN et al. 2000) for the dichotomous trait and the quantitative pedigree disequilibrium test (ZHANG et al. 2001) for the continuous trait, but these approaches generally yielded lower power than did the simple ANOVA and, therefore, less ability to distinguish among selection strategies. Thus, reported results were based on ANOVA. These other approaches may be more robust than simple ANOVA to effects of population stratification (MARTIN et al. 2000), but no such effects were simulated in this study. Statistical significance of the F-test was not considered because the simulation was designed to emulate a situation in which a QTL was assumed a priori to lie within the tested interval and the LD analysis was undertaken simply to determine more precisely the most probable location. In addition, the primary objective of this study was to compare strategies, rather than to determine sample sizes needed to reach a given level of power or precision subject to a certain level of type I error. However, to ensure that no unexpected source of bias favored a given marker position, a null model was applied in which all QTL alleles had effects of zero. This null model was applied to the continuous phenotype at the two different levels of heritability. In a real-life application, one would be advised to apply a significance test to help ensure that a QTL is segregating, especially if the targeted LD mapping is applied to a population that differs from the one that had been used to identify the chromosomal region being tested. In this case, an empirical significance test based on permutation may be advised, given that standard tests may be inappropriate with selective genotyping.
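The marker-by-marker analysis can be sketched as a one-way ANOVA per locus followed by picking the largest F-statistic. The text does not specify how marker genotypes entered the model; treating each genotype class (for example, 0, 1, or 2 copies of one allele) as a level of a single factor is an assumption of this sketch, as are the function names.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of phenotypes."""
    groups = [g for g in groups if g]
    n = sum(len(g) for g in groups)
    k = len(groups)
    if k < 2 or n <= k:
        return 0.0                       # not enough classes or residual df
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    if ss_within == 0.0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def best_marker(genotypes, phenotypes):
    """Index of the marker with the largest F statistic, plus all F values.

    genotypes[i][m] is the genotype class of animal i at marker m.
    """
    n_markers = len(genotypes[0])
    f_values = []
    for m in range(n_markers):
        classes = {}
        for geno, y in zip(genotypes, phenotypes):
            classes.setdefault(geno[m], []).append(y)
        f_values.append(anova_f(list(classes.values())))
    return max(range(n_markers), key=lambda m: f_values[m]), f_values


# Toy data: marker 0 tracks the phenotype, marker 1 does not (hypothetical values).
genotypes = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1]]
phenotypes = [1.0, 1.2, 2.0, 2.1, 3.0, 3.2]
best, f_values = best_marker(genotypes, phenotypes)
```

In a replicate of the simulation, `best` would be compared with the index of the marker nearest the QTL to record whether the position was identified correctly.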

The simulation was repeated 1000 times for each combination of population parameters and the number of times that each of the 11 markers was indicated as the most likely position of the QTL was recorded across all replicates.

Additional scenarios:
In addition to the original eight combinations of QTL frequency, VQTL, and heritability considered in the base simulation (Table 1), performance of the various strategies was examined for other situations defined by changes in these and other parameters.

Decreased QTL frequencies: For this comparison, frequency of the QTL allele with the positive effect was decreased to P = 0.05, while VQTL and heritability were maintained at the values described in Table 1.

Decreased effective population size: In certain livestock species, the relatively recent (in evolutionary terms) widespread use of artificial insemination and national and international genetic evaluation programs has allowed for a high selection intensity of sires of sires. This process has decreased the Ne of these populations to <100 in some cases (HANSEN 2000). The extent of LD is likely to differ with Ne and this difference may affect power of fine mapping and possibly the optimal strategy of selective genotyping. We simulated a decreased Ne in the later generations of the population by decreasing the number of sires of sires. Under this scenario, starting in generation 95 (i.e., total generations - 5), the number of sires of sires was decreased from 125 to 10, a change designed to mimic, to some extent, changes in global dairy cattle populations during recent generations. This breeding strategy decreased the effective population size to ~90 individuals, according to the change in the average inbreeding coefficient of the population during the final two generations.
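Two textbook approximations reproduce the orders of magnitude quoted for Ne. The text does not say which formulas were used, so both functions below are assumptions: Wright's sex-ratio formula for the base population, and Ne = 1/(2ΔF) for an estimate based on the change in average inbreeding. The inbreeding coefficients in the example are hypothetical values chosen only to illustrate the calculation.

```python
def ne_from_sex_numbers(n_males, n_females):
    """Wright's approximation for unequal sex numbers: Ne = 4*Nm*Nf / (Nm + Nf)."""
    return 4.0 * n_males * n_females / (n_males + n_females)


def ne_from_inbreeding(f_previous, f_current):
    """Ne = 1 / (2 * dF), with dF the rate of inbreeding over one generation."""
    delta_f = (f_current - f_previous) / (1.0 - f_previous)
    return 1.0 / (2.0 * delta_f)


# Base population from the text: 125 males and 50,000 females.
base_ne = ne_from_sex_numbers(125, 50_000)

# Hypothetical inbreeding coefficients in two successive generations.
reduced_ne = ne_from_inbreeding(f_previous=0.0, f_current=1.0 / 180.0)
```

The first value is close to the ~500 stated for the base population; the second shows that an Ne of ~90 corresponds to a rate of inbreeding of roughly 0.56% per generation.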

Increased numbers of marker alleles: The markers in the base simulation had two equally common alleles in the base population. In this situation, the marker allele that was coupled with the positive QTL allele in the original founder will also be coupled with a high proportion of QTL alleles with zero effect. The level of LD is expected to be increased with more marker alleles, because this proportion of coupling with zero-value QTL alleles is expected to be decreased. In addition, increasing the number of alleles should increase heterozygosity and average information content of marker loci. To test the effects of an increased number of marker alleles on the selection strategies, we simulated the base population with six, rather than two, equally frequent marker alleles at each locus.
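The two effects described here can be quantified for equally frequent alleles with a short sketch. Expected heterozygosity is computed as one minus the sum of squared allele frequencies; the "sharing" figure is simply the base-generation frequency of any one marker allele, which is a simplified reading of the coupling argument rather than a quantity reported in the study.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity under random mating: 1 - sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def equal_frequencies(n_alleles):
    """Allele frequencies for n_alleles equally common alleles."""
    return [1.0 / n_alleles] * n_alleles


het_two = expected_heterozygosity(equal_frequencies(2))   # biallelic markers
het_six = expected_heterozygosity(equal_frequencies(6))   # six-allele markers

# Fraction of base-generation haplotypes carrying any one given marker allele,
# i.e. how often a zero-effect QTL allele starts out coupled with that marker allele.
sharing_two = 1.0 / 2
sharing_six = 1.0 / 6
```

Moving from two to six alleles raises expected heterozygosity from 0.50 to about 0.83 and cuts the initial sharing from one-half to one-sixth.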

Proportion of the population expressing each of the dichotomous phenotypes: Selective genotyping strategies were compared for the situation in which the two dichotomous phenotypes were observed in proportions of 2:98 and 30:70, rather than the original 10:90. In addition, effectiveness of the various strategies was compared for the situations in which QTL frequency, variance, and heritability were all at high values (Table 1) and the proportion of individuals expressing the high phenotypic class varied from 1 to 99%.

Ordered categorical phenotypes: Selection strategies were compared for two distributions of phenotypes into multiple (more than two) ordered categories. The first scenario included three categories, for which the first class included the individuals with the 5% highest phenotypes on the liability scale, the second group included the next 25%, and the remaining 70% of the population expressed a third phenotype. The second scenario included five ordered phenotypic categories, each with 20% of the population.
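The category boundaries implied by these proportions can be sketched on a standard normal liability scale. The assumption that liability is standard normal (mean 0, variance 1) and the function names are introduced here for illustration only.

```python
from bisect import bisect_right
from itertools import accumulate
from statistics import NormalDist


def category_thresholds(proportions_high_to_low):
    """Liability cut points (SD units) for ordered classes listed from highest to lowest."""
    cumulative = list(accumulate(proportions_high_to_low))[:-1]
    return sorted(NormalDist().inv_cdf(1.0 - c) for c in cumulative)


def categorize(liability, thresholds):
    """Class index: 0 is the lowest class, len(thresholds) is the highest."""
    return bisect_right(thresholds, liability)


# Three classes: top 5%, next 25%, remaining 70%.
three_class = category_thresholds([0.05, 0.25, 0.70])
# Five classes of 20% each.
five_class = category_thresholds([0.20] * 5)
```

For the three-class scenario the cut points fall near 0.52 and 1.64 SD, so an animal with a liability of 2.0 lands in the top class and one with a liability of 0.0 in the bottom class.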


*  RESULTS

Continuous phenotypes:
Table 2 presents the percentages of times that the marker located closest to the QTL was indicated as the most likely location of the QTL for each of the 10 different strategies applied to the quantitative phenotype for each of the eight parameter combinations in the base simulation. In addition, results are presented for the situation in which P = 0.05. For each combination of parameters, the value in italics in Table 2 corresponds to the strategy for which the proper location of the QTL was identified the highest proportion of times. Values that are not statistically different (P > 0.05) from this best value are indicated by asterisks. In terms of general results, precision of QTL location decreased as residual variance increased and influence of the QTL decreased. Precision of QTL location also decreased when VQTL was held constant but frequency of the positive QTL allele decreased. For example, for the RAN strategy, the correct location of the QTL was identified 70.1% of the time when P ≅ 0.20, heritability was 0.30, and the QTL accounted for 30% of the genetic variance. This precision dropped to only 37.4% when P = 0.05 and the other factors remained constant. Thus, precision of QTL positioning decreased as the substitution effect increased and VQTL was held constant. This result may be due to the fact that precision is expected to be linearly related to substitution effect, whereas VQTL is proportional to squared substitution effect. Therefore, the increase in substitution effect needed to maintain VQTL as frequency decreases did not provide an equivalent increase in precision of mapping.


 

 
Table 2. Percentage of times that the correct location of the QTL was identified for various selection strategies and parameters for continuous phenotypes

Although changes in the population parameters had major effects on the level of precision for the location of the QTL, only minor effects were observed on the relative effectiveness of the various strategies. For all combinations of parameters, multiple strategies yielded levels of precision that were not significantly different from the best strategy. However, the particular strategy yielding the highest rate of correct QTL positioning differed across combinations of population parameters and some clear patterns were observed. For example, the STD strategy was universally the most precise strategy when the heritability of the trait was low, regardless of the frequency of alleles at the QTL or VQTL. In contrast, STD was never the highest-ranking strategy when heritability was high. In fact, the STD strategy was always significantly (P < 0.05) less powerful than the optimal strategy when heritability was high and VQTL was low. The DISPAIR strategy was never significantly less precise than the best observed strategy for any of the 12 different combinations and always ranked as the best strategy when heritability and VQTL were high. The BAL and DIS50 strategies were never significantly different from the best strategy when heritability was high, but were usually not as precise as STD for low heritability traits.

Strategies based on concordant sib selection (CON5, CON10, and CON25) were not particularly effective for quantitative traits. For no scenario were any of the concordant sib strategies optimal or even nonsignificantly different from the most precise strategy. ABECASIS et al. (2001) also reported that discordant selection was superior to concordant selection when using small full-sib families to detect LD between markers and genes affecting traits with continuous phenotypes.

Under nearly all circumstances, the application of any of the selection strategies was superior to RAN selection of animals for genotyping. Differences between application of RAN and the optimal strategy were particularly large when heritabilities were low. For example, the frequency for correct positioning of the QTL was nearly three times greater for STD than for RAN (48.8 vs. 18.0%) when P ≅ 0.20 and heritability and VQTL were low, but this difference was reduced (88.3% for DISPAIR vs. 70.1% for RAN) when heritability and VQTL were high. For a few of the scenarios, CON5 and CON10 were similar (or slightly inferior) to RAN.
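The text does not state which test underlies the significance comparisons among strategies. One simple possibility, assumed here purely for illustration, is a two-proportion z-test on the number of correct identifications out of 1000 replicates per strategy; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z(successes_a, successes_b, n_replicates):
    """Pooled two-proportion z statistic and two-sided P value for equal sample sizes."""
    p_a = successes_a / n_replicates
    p_b = successes_b / n_replicates
    pooled = (successes_a + successes_b) / (2 * n_replicates)
    std_error = math.sqrt(pooled * (1.0 - pooled) * 2.0 / n_replicates)
    z = (p_a - p_b) / std_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# DISPAIR (88.3%) vs. RAN (70.1%) with high heritability and high VQTL.
z, p_value = two_proportion_z(883, 701, 1000)
```

Under this assumed test, the 18.2 percentage-point gap between DISPAIR and RAN is many standard errors wide, so even the "reduced" difference at high heritability is far from chance variation across replicates.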

For both the discordant and concordant sib-selection strategies, increased precision was generally obtained by applying a relatively low intensity of selection to sire families, allowing for high intensity of selection of sibs within families. In general, ranked in terms of precision of QTL positioning, DIS50 > DIS20 > DIS10 and CON25 > CON10 > CON5. Two factors likely influenced this ranking of strategies. First, when many offspring of fewer sires were used, the sibs likely shared long common fragments of the targeted chromosomal region, due to the low number of recombination events in the transmission between sire and offspring. Thus, evidence of linkage disequilibrium with the QTL within these families was likely to have extended not only to the closest marker, but also to additional neighboring markers, decreasing the ability to determine the precise location. Second, selective genotyping strategies increase power by inducing a correlation between residual and QTL effects. This correlation may have been increased when few extreme animals per family were selected, because the average magnitude of residual effects was likely increased. In addition, selection intensity of daughters within sire was reduced as a greater proportion of daughters was selected per sire.

The approach of discordant sib selection applied in this study sought to increase precision of QTL positioning in two ways. First, the initial identification of the most variable sires was designed to increase precision by targeting the sires that were most likely to be heterozygous at the QTL. In theory, this process could increase the probability that daughters with opposing phenotypes received different QTL alleles from the sire. Second, selection of equal numbers of extreme half-sibs with opposing phenotypes was designed to increase power by pairing daughters to help account for polygenic contributions of the sire. The first step of formally targeting variable sires seemed to be of relatively little importance in this study, however, inasmuch as the DISPAIR approach, which selected the most extreme half-sib pairs from all sires, regardless of the variability of their daughters, generally performed as well as or better than any of the discordant approaches applied only within selected families. Although heterozygosity of the sire is critical for linkage mapping, this result suggests that it is of little importance for LD mapping, at least when a simple ANOVA is used for analysis.

Results from the null model indicated no systematic bias in QTL position associated with the different strategies. Across the 10 strategies and 11 marker positions, the proportion of times that a given marker was indicated as the most likely position ranged from 7.0 to 11.5%. The sixth marker position (site of the true QTL) was indicated as the most likely location only once in 20 possibilities (10 strategies and two levels of heritability).
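The null-model figures can be put in context with a short calculation. If no marker is favored, each of the 11 markers should be chosen in about 1/11 of replicates; the binomial standard error below assumes 1000 independent replicates per strategy, and the z-scores are only a rough guide because the 11 proportions within a strategy are not independent of one another.

```python
import math

N_MARKERS = 11
N_REPLICATES = 1000

expected = 1.0 / N_MARKERS                                   # about 9.1%
std_error = math.sqrt(expected * (1.0 - expected) / N_REPLICATES)


def z_score(observed_proportion):
    """Distance of an observed proportion from the null expectation, in SE units."""
    return (observed_proportion - expected) / std_error


z_low = z_score(0.070)     # lowest reported proportion
z_high = z_score(0.115)    # highest reported proportion

# Expected number of times the true QTL marker "wins" by chance in 20 comparisons.
expected_hits_at_qtl = 20 * expected
```

The reported extremes lie roughly 2.3 standard errors below and 2.7 above the expectation, which is unremarkable as the range over 220 strategy-by-marker cells, and the single "win" at the sixth marker is in line with the chance expectation of about two.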

Dichotomous phenotypes:
Results for the various strategies for dichotomous phenotypes are shown in Table 3. The general relationships between precision of QTL location and allelic frequency, heritability, and VQTL are similar to those observed for continuous phenotypes, but precision was generally decreased. This decrease in precision is consistent with previous results in the statistical genetics literature regarding losses in statistical power for tests of genetic association when continuous phenotypes are dichotomized (ABECASIS et al. 2001) and with reductions in selection response (FALCONER and MACKAY 1996) and in estimates of heritabilities (GIANOLA 1982) for threshold characters vs. continuously distributed traits.


 

 
Table 3. Percentage of times that the correct location of the QTL was identified for various selection strategies and parameters for dichotomous phenotypes

Changes in the various parameters had much more profound effects on the relative efficiencies of the strategies than that observed for the continuously distributed trait. In particular, strategies involving concordant sib selection were much more effective for dichotomous traits, particularly when heritability was low. For example, when heritability was 0.05 and frequency of the positive QTL allele was high, CON25 was significantly (P < 0.05) more precise than all of the other strategies for both high and low levels of VQTL. The second-best strategy for these scenarios was CON10. The CON25 approach was also optimal when frequency of the positive allele was 0.10, although some other strategies were not significantly different. When frequency of the positive QTL allele was 0.05, various discordant sib-selection strategies yielded higher precision of QTL positioning than did CON25, although not to a statistically significant degree. Different strategies were optimal when heritability was high, with results varying according to the frequency of the positive QTL allele. The STD was optimal when frequency was high, DIS20 for P = 0.10, and DIS10 was optimal for P = 0.05. When heritability was high, the optimum number of half-sib families to use for discordant sib selection seemed to decrease with P. For example, among discordant sib strategies, DIS50 was best when P ≅ 0.20, DIS20 when P = 0.10, and DIS10 when P = 0.05. However, these differences were not always statistically significant (P > 0.05).

Selection of specific half-sib families for application of fine mapping was of much more importance for dichotomous phenotypes than that observed for continuous traits (Table 2), especially when heritability was low. In these situations, because the heritability is low, the phenotype of a given animal provides relatively little information about the genotype of that individual. For example, for the scenarios with P = 0.10, only 1% of the population is expected to be homozygous for the positive QTL allele (on the basis of Hardy-Weinberg frequencies). Homozygous animals should increase among animals with the positive phenotype, but when heritability = 0.05 and VQTL = 0.10, this proportion is expected to increase to only ~1.2%. Proportion of heterozygous individuals among positive individuals is expected to increase to only ~19.8%, from 18% in the general population. In contrast, the mean phenotype of 400 offspring provides much more information about the genotype of a given sire. Among the 50 highest-ranking sires under this scenario, ~3% were homozygous and 32% were heterozygous for the positive QTL allele vs. 0.1 and 6.8%, respectively, among the 50 lowest-ranking sires. Corresponding frequencies among offspring of these sires were 2.6% (homozygous) and 26.7% (heterozygous) for highly ranked sires and 0.3 and 13.1% for the low-ranked sires. Therefore, the selection of individuals for genotyping based on a combination of sire mean plus phenotype increased the precision for QTL positioning by increasing the correlation between phenotype and QTL genotype among tested individuals.
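The baseline genotype frequencies quoted in this paragraph follow directly from Hardy-Weinberg proportions, and the reported enrichments can be expressed as fold changes relative to that baseline. The sketch below only restates the figures given in the text; the function names are hypothetical and no attempt is made to rederive the enriched proportions themselves.

```python
def hardy_weinberg(p):
    """Genotype frequencies at a biallelic locus under Hardy-Weinberg equilibrium."""
    q = 1.0 - p
    return {"homozygous_positive": p * p, "heterozygous": 2.0 * p * q,
            "homozygous_other": q * q}


def fold_enrichment(observed, baseline):
    """Ratio of a genotype frequency in a selected group to its population frequency."""
    return observed / baseline


baseline = hardy_weinberg(0.10)

# Heterozygote frequencies reported in the text for different groups.
het_positive_animals = fold_enrichment(0.198, baseline["heterozygous"])   # by own phenotype
het_top_sires = fold_enrichment(0.32, baseline["heterozygous"])           # by daughter mean
```

Selecting on an animal's own dichotomous phenotype raises the heterozygote frequency by only about 10%, whereas selecting sires on the mean of 400 daughters raises it by nearly 80%, which is the contrast the paragraph draws.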

Alternative scenarios:
Effective population size: Decreasing the effective population size from Ne = 500 to Ne = 90 decreased the precision with which the QTL was positioned, but generally had no major effects on relative efficiency of the different strategies across the different combinations (data not shown). In only one instance did decreasing effective population size have a profound impact on the ranking of selection strategies. This difference was observed for the dichotomous trait, when P = 0.10 and heritability was low. With Ne = 500, CON25 was the highest-ranking strategy (Table 3). However, when Ne = 90, CON25 was no longer the optimal strategy. Instead, DIS10 ranked first in the identification of QTL position (28.4%) when VQTL was high (vs. only 20.3% for CON25). When VQTL was low, DIS20 was the highest-ranking strategy at 15.5% (although CON25 was not significantly different). Otherwise, rankings for dichotomous traits were little affected. For continuous traits, relative rankings were unchanged, as STD still ranked highest for traits with low heritability, across all levels of P, and DISPAIR was the highest-ranked strategy for all situations with high heritability and high VQTL (data not shown).

Table 4 shows the decrease in precision of the best strategy (expressed as a percentage of the maximum value when Ne = 500) observed when effective population size was decreased from 500 to 90. Across scenarios, the average decrease in precision of the best strategy was ~16%. For both types of phenotypes, the decreases in precision due to reduced effective population size tended to be greater when frequency of the positive QTL was lower. For example, with dichotomous phenotypes, when P = 0.10 the average decrease (across four combinations of heritability and VQTL) in precision of the best strategy was 30% vs. only ~9% with P ≅ 0.20. Corresponding decreases for continuous phenotypes were ~16 and 9%, respectively. The loss in precision varied with VQTL (increasing with VQTL) when heritability was 0.05, but not when heritability was 0.30. The greatest effects of effective population size on precision were observed for moderately heritable dichotomous traits affected by a QTL allele at a high frequency. Decreases in precision were likely due to the fact that the genotyped animals shared more recent common ancestors than was the case with higher effective population size, reducing the number of recombination events between the common ancestor and the current generation. In turn, this likely increased LD with markers more distant from the QTL, obscuring its location.
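
The measure reported in Table 4 can be sketched as a simple relative difference. The example below is illustrative only: the function name is hypothetical, and the pairing of values is an assumption, combining the precision of the best strategy at Ne = 90 quoted above (15.5 and 28.4%) with the CON25 values quoted elsewhere in the text for the base situation with dichotomous phenotypes, P = 0.10, and heritability = 0.05 (19.8 and 38.4%).

```python
def relative_decrease(precision_ref, precision_new):
    """Loss in precision, expressed as a percentage of the reference value."""
    return 100.0 * (precision_ref - precision_new) / precision_ref


# (best strategy at Ne = 500, best strategy at Ne = 90), in % of replicates
# with the QTL correctly positioned; pairing of these values is an assumption.
scenarios = {
    "VQTL = 0.10": (19.8, 15.5),
    "VQTL = 0.30": (38.4, 28.4),
}

for label, (ne500, ne90) in scenarios.items():
    print(f"{label}: {relative_decrease(ne500, ne90):.1f}% decrease")
```

Under these assumed pairings the relative loss is larger for the higher VQTL, in line with the trend described for heritability = 0.05.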


 
Table 4. Decrease in precision of the best strategy observed when effective population size was decreased from 500 to 90

Increased numbers of marker alleles: The use of markers with six alleles increased precision for all of the strategies, but had no major effects on ranking of the strategies (data not shown). The main effect observed was that in several cases, although the highest-ranking strategy remained the same, the increased level of precision obtained accentuated its advantage, decreasing the number of other strategies that were not significantly (P > 0.05) different. For example, for dichotomous phenotypes with P = 0.10 and heritability = 0.05, the CON25 strategy was superior, as had been observed for the base situation with biallelic markers. With biallelic markers, four other strategies (STD, DIS10, DIS20, and CON10) were statistically equivalent to CON25 when VQTL was 0.10 and DIS10 was not significantly different when VQTL was 0.30 (Table 3). However, when using markers with six alleles, the proportion of times that the correct QTL location was identified was increased from 19.8 to 29.5% for VQTL = 0.10 and from 38.4 to 59.6% for VQTL = 0.30. At these increased levels of precision, CON25 was statistically superior (P < 0.05) to all other approaches.

In particular cases, the opposite effect on the differentiation of strategies was observed for continuous phenotypes. Once the level of precision approached ~90%, additional marginal gains in precision were seemingly difficult to obtain by implementing slightly different strategies. Therefore, more strategies were statistically indistinguishable from the best strategy than had been the case for the same scenario with biallelic markers. For example, when using high values for heritability and VQTL, and P = 0.10, the proportion of correct positioning for DIS20 (87.3%) and CON25 (85.9%) was not significantly different (P > 0.05) from that of DISPAIR, the best observed strategy (88.8%). These two strategies ranked significantly lower than DISPAIR when biallelic markers were used (Table 2).

Proportion of the population expressing each of the dichotomous phenotypes: When the proportion of individuals expressing the less common phenotype was decreased from 10 to 2%, the precision of QTL positioning was generally increased (Table 5). Precision increased because the effective intensity of selective genotyping was increased among individuals with the high phenotype. The opposite effect was therefore observed when the proportion of individuals with the high phenotype was increased to 30% (Table 6). For most scenarios, the relative efficiency of the various strategies was similar to the base situation (Table 3). In particular, the CON25 strategy generally performed best when heritability was low. Discordant sib selection (DIS20 or DIS50) performed consistently well for traits with high heritability. The strategies most strongly affected by changes in the proportion of the population expressing the higher phenotype were STD and RAN. The precision of STD relative to RAN decreased as the proportion of high phenotypes increased. This decrease was due to the lower effective intensity of selection obtained as the proportion of high phenotypes increased. The family-based (DIS and CON) strategies were less strongly affected by changes in this proportion because intensity of selection of sire families for genotyping remained relatively high, as this step of selection was based on daughter means, the distribution of which remained continuous regardless of the proportions expressing the two phenotypes. A related result was that for some situations, RAN was not significantly different from the best observed strategy, especially when heritability and VQTL were high. In particular, no advantage of STD over RAN was observed when 30% of the population expressed the positive phenotype.

Figure 1 shows the precision (percentage of times that the QTL was correctly positioned) for RAN, STD, and BAL for the scenario in which VQTL = 0.30, heritability = 0.30, and P ≅ 0.20 and varying proportions of the population expressed the high phenotype, ranging from 0.01 to 0.99. When the high phenotype was very rare, STD and BAL were both very precise, with STD holding a small, but nonsignificant advantage, while the precision of RAN was much lower. As the proportion of individuals expressing the high phenotype increased, the precision of STD and BAL decreased and the precision of RAN increased, reaching equivalent levels when 30% of the population expressed the high phenotype. Clearly, when 50% of the population expresses each phenotype, RAN and STD are expected to be essentially equivalent, involving the genotyping of exactly equal numbers of individuals with each phenotype for STD and differing from this ratio due only to sampling with RAN. The three approaches remained similarly effective until the frequency of the high phenotype reached ~60%. From that point the precision of RAN decreased quickly. Efficiency of STD and BAL remained at similar levels until the point when ~90% of the population expressed the positive phenotype. The BAL strategy continued to work efficiently when 99% of the population expressed the high phenotype, whereas the efficiency of STD dropped off sharply.
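
The relationship between the proportion of animals expressing the high phenotype and the intensity of selection can be illustrated with the standard truncation-selection result, i = z/p, where z is the height of the standard normal density at the truncation point. This is a sketch by analogy only: it assumes a normally distributed underlying liability, the function name is hypothetical, and the "effective intensity" discussed above is not necessarily computed in this way.

```python
from statistics import NormalDist


def selection_intensity(proportion):
    """Mean liability (in SD units) of the top `proportion` of a standard
    normal distribution, i.e., the intensity of truncation selection."""
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - proportion)
    return std_normal.pdf(threshold) / proportion


for prop in (0.02, 0.10, 0.30):
    print(f"proportion with high phenotype = {prop:.2f}: "
          f"i = {selection_intensity(prop):.2f}")
```

The intensity falls steadily as the proportion rises from 2 to 30%, which mirrors the loss of advantage of STD over RAN as the high phenotype becomes more common.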



Figure 1. Proportion of times that the correct location was identified for random, standard selective genotyping, and balanced selection of discordant half-sib pairs when the proportion of the population expressing the positive phenotype ranged from 0.01 to 0.90 (for a biallelic locus with frequencies 0.3:0.7 explaining 30% of the genetic variance for a dichotomous trait with heritability 0.30).


 
Table 5. Percentage of times that the correct location of the QTL was identified for various selection strategies when 2% of individuals expressed the less common dichotomous phenotype


 
Table 6. Percentage of times that the correct location of the QTL was identified for various selection strategies when 30% of individuals expressed the less common dichotomous phenotype

Ordered categorical phenotypes: Results for phenotypes with multiple ordered categories were very similar to those for dichotomous traits (thus, data not shown). The CON25 strategy was superior when heritability was low. The DIS20, DIS50, and BAL approaches each ranked first for at least one of the particular scenarios when heritability was high. Although STD never ranked highest, its precision was never statistically different from the best strategy when heritability was high. Precision of QTL positioning was generally ~10% higher than that observed for dichotomous phenotypes (expressed as a percentage of the precision of the best strategy for dichotomous phenotypes). Precision increased because having more categories increased the effective intensity of selective genotyping. In the three-category situation, intensity was increased in the selection of individuals with high phenotypes. With five categories, individuals with low phenotypes were selected more precisely.
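
One way to see why additional categories sharpen selection is to treat the ordered categories as intervals on an underlying normally distributed liability. The sketch below assumes such a threshold model. The function names are hypothetical, and the category proportions are hypothetical values chosen for illustration, not those simulated in this study.

```python
from bisect import bisect_right
from statistics import NormalDist


def thresholds_from_proportions(proportions):
    """Liability thresholds separating ordered categories whose population
    proportions (listed from lowest to highest category) sum to 1."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("category proportions must sum to 1")
    std_normal = NormalDist()
    cuts, cumulative = [], 0.0
    for prop in proportions[:-1]:
        cumulative += prop
        cuts.append(std_normal.inv_cdf(cumulative))
    return cuts


def categorize(liability, cuts):
    """Index (0 = lowest) of the category containing the given liability."""
    return bisect_right(cuts, liability)


# Hypothetical proportions: splitting the top 10% into 8% and 2% lets the
# most extreme animals be distinguished from the merely high ones.
two_cat = thresholds_from_proportions([0.90, 0.10])
three_cat = thresholds_from_proportions([0.90, 0.08, 0.02])

for liability in (0.0, 1.5, 2.5):
    print(liability, categorize(liability, two_cat), categorize(liability, three_cat))
```

In this illustration, animals with liabilities of 1.5 and 2.5 fall in the same class under the dichotomous scoring but in different classes under the three-category scoring, so that selecting only the top class corresponds to a more intense selection on liability.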


*  DISCUSSION

The principal conclusion of this study was that any form of selective genotyping (even when applied in a haphazard way) will generally improve precision of mapping based on LD, but judiciously applied sampling based on simple consideration of the type of phenotype and the underlying genetic parameters can be particularly advantageous. Ten different selection strategies were applied to a wide range of scenarios and, in nearly all cases, even the poorest strategy yielded more precise estimates of the QTL position than did analyses applied to a random sample from the population. Nevertheless, the best strategy was always significantly more precise than the poorest strategy.

Strategies were applied to a wide range of populations and phenotypes defined by different parameters. Several general trends were noted for the effects of changes in various parameters on precision of LD mapping. Many of the trends were consistent with intuition. For example, precision increased as residual variance decreased (or heritability increased). At a fixed level of heritability, precision increased as VQTL increased. At a fixed level of VQTL, precision increased as the frequency of the positive allele was increased from 0.05 to ~0.20 (and the substitution effect decreased correspondingly). Precision was greater for continuous phenotypes than for dichotomous phenotypes (at the same level of underlying genetic control) and precision for dichotomous phenotypes increased as the positive form of the phenotype (associated with the QTL allele with a positive effect) decreased in frequency. Likewise, phenotypes with multiple categories yielded higher precision than did dichotomous phenotypes. Finally, precision increased as the number of marker alleles increased, but decreased as Ne decreased.
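
The inverse relationship between allele frequency and substitution effect at a fixed VQTL follows from the usual expression for the variance contributed by an additive biallelic locus, 2p(1 - p)α². The following sketch makes that relationship explicit. It assumes purely additive gene action and a phenotypic variance of 1, with VQTL taken as the proportion of genetic variance explained by the QTL; the function name is hypothetical.

```python
from math import sqrt


def substitution_effect(p, qtl_variance):
    """Allele substitution effect (alpha) such that 2 p (1 - p) alpha^2
    equals qtl_variance for an additive biallelic locus."""
    return sqrt(qtl_variance / (2.0 * p * (1.0 - p)))


# Assumed scaling: phenotypic variance = 1, so the QTL variance is
# heritability * VQTL (here 0.30 * 0.30 = 0.09).
qtl_variance = 0.30 * 0.30

for p in (0.05, 0.10, 0.20):
    print(f"P = {p:.2f}: alpha = {substitution_effect(p, qtl_variance):.3f} phenotypic SD")
```

As the frequency of the positive allele increases toward 0.20, the substitution effect required to explain the same variance becomes smaller, while the number of informative carriers in the population becomes larger.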

Choice of the optimum strategy depended on the existence of some knowledge about the phenotype of the trait and underlying genetic parameters. In general, this information can be known prior to the selection of individuals for genotyping. The most critical factor influencing the performance of varying strategies was whether the trait was distributed continuously or into a few discrete categories. This aspect about a given trait should clearly be known without exception. Heritability of a given trait was the second most important factor influencing the ranking of the strategies. Some knowledge about the heritability of a trait to which fine mapping is applied usually either will already exist or can be readily estimated. In particular, for populations similar to that simulated in this study (i.e., populations with many recorded phenotypes and large half-sib families), accurate estimates of genetic parameters either should exist or could be easily calculated using the information (phenotypes and sire identification) required to implement most of the strategies simulated.

For continuous phenotypes, choice of the selection strategy seems straightforward. The STD approach was always the highest-ranking strategy when heritability was low. Approaches based on discordant sib selection with many sire families (DISPAIR or DIS50) were the best strategies when phenotypes were continuous and heritability was high. When no knowledge is available about heritability, the DISPAIR strategy would be a logical choice to apply to a continuous phenotype. This strategy ranked highest under several scenarios and was never significantly different from the best observed strategy. In addition, the DISPAIR strategy may be more flexible and easier to apply in real populations, as the other discordant methods simulated were based on choosing equal numbers of pairs from each sire and sires all had the same number of daughters, which is unlikely to be the case in most real populations.
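
The flexibility of a DISPAIR-type approach with unequal family sizes can be sketched as follows. This is one possible reading of "genotyping the most divergent pairs of half-sibs from all families" and not necessarily the procedure used in the simulation: the pairing rule within families (lowest with highest, second lowest with second highest, and so on), the function name, and the record layout are all assumptions.

```python
from collections import defaultdict


def most_divergent_pairs(records, n_pairs):
    """Select the n_pairs half-sib pairs with the largest phenotypic
    difference, pooled across all sire families.

    records: iterable of (animal_id, sire_id, phenotype) tuples.
    Returns a list of (difference, sire_id, low_animal_id, high_animal_id).
    """
    families = defaultdict(list)
    for animal_id, sire_id, phenotype in records:
        families[sire_id].append((phenotype, animal_id))

    candidate_pairs = []
    for sire_id, sibs in families.items():
        sibs.sort()  # ascending by phenotype
        # Pair the i-th lowest sib with the i-th highest sib of the same sire.
        for i in range(len(sibs) // 2):
            low_pheno, low_id = sibs[i]
            high_pheno, high_id = sibs[-1 - i]
            candidate_pairs.append((high_pheno - low_pheno, sire_id, low_id, high_id))

    # Rank pairs across families; no sire is required to contribute equally.
    candidate_pairs.sort(key=lambda pair: pair[0], reverse=True)
    return candidate_pairs[:n_pairs]


records = [
    ("a1", "s1", 1.0), ("a2", "s1", 5.0), ("a3", "s1", 2.0), ("a4", "s1", 4.5),
    ("b1", "s2", 0.0), ("b2", "s2", 3.0),
]
for pair in most_divergent_pairs(records, n_pairs=2):
    print(pair)
```

Because pairs are ranked across the whole population, families of any size can contribute, and a sire with few daughters is neither excluded nor forced to supply a fixed number of pairs.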

For categorical phenotypes, knowledge of the heritability was more critical. When heritability was low, CON25 was the highest-ranking strategy and often differed statistically from all other approaches. When heritability was high, DIS50, DIS20, and STD were the most efficient strategies. The STD approach is likely to be the simplest to apply in real populations. Although such a situation was not simulated here, selection of pairs of half-sibs with opposite dichotomous phenotypes (a strategy similar to DISPAIR for continuous phenotypes) may also be effective.

The VQTL and frequency of the positive QTL allele are parameters that are less likely to be available prior to designing a selection strategy, but these factors had comparatively little effect on the relative efficiencies of the various strategies. In most instances, for a given level of heritability and frequency of the positive QTL allele, the same strategy ranked highest for both VQTL = 0.10 and VQTL = 0.30. Although these parameters have little influence on relative precision of strategies for a given scenario, they do influence precision of QTL positioning. Therefore, prior estimates for these parameters may be useful for determining the number of individuals needed to obtain a given level of statistical precision. Some information about these parameters is typically available in the situation that was assumed for this study, specifically, that the chromosomal region targeted for fine mapping had been identified on the basis of results of previous linkage analyses. Many linkage mapping analyses produce some estimate of the substitution effect of a QTL.

Although sensitivity of the various strategies was tested for variability in a number of parameters affecting power of QTL detection, the genetic model simulated was relatively simple and corresponded best to a large homogeneous outbred population. Additional complicating factors such as selection, population admixture, and segregation distortion were not considered and researchers dealing with LD mapping in a population subject to such factors may want to repeat a similar simulation to ensure that the strategies indicated as optimal in the studied population remain the same under these conditions.


*  ACKNOWLEDGMENTS

The authors gratefully acknowledge the contribution of the anonymous editors in terms of both their time and their expert advice for improving the manuscript.

Manuscript received June 5, 2003; Accepted for publication October 7, 2003.


*  LITERATURE CITED

ABECASIS, G. R., O. C. COOKSON, and L. R. CARDON, 2001  The power to detect linkage disequilibrium with quantitative traits in selected samples. Am. J. Hum. Genet. 68:1463-1474.

ANDERSSON, L., C. S. HALEY, H. ELLEGREN, S. A. KNOTT, and M. JOHANSSON, 1994  Genetic mapping of quantitative trait loci for growth and fatness in pigs. Science 263:1771-1774.

CARDON, L. R. and D. W. FULKER, 1994  The power of interval mapping of quantitative trait loci, using selected sib pairs. Am. J. Hum. Genet. 55:825-833.

CARDON, L. R., S. D. SMITH, D. W. FULKER, W. J. KIMBERLING, and B. F. PENNINGTON et al., 1994  Quantitative trait locus for reading disability on chromosome 6. Science 266:276-279.

DEKKERS, J. C. M., 2003 Commercial application of marker- and gene-assisted selection in livestock: strategies and lessons, p. 4 in Book of Abstracts: 54th Annual Meeting of the European Association of Animal Producers, Vol. 9, edited by Y. VAN DER HONING. Wageningen Academic Publishers, Wageningen, The Netherlands.

FALCONER, D. S., and T. F. C. MACKAY, 1996 Introduction to Quantitative Genetics. Longman, Essex, UK.

FULLERTON, J., M. CUBIN, H. TIWARI, C. WANG, and A. BOMHRA et al., 2003  Linkage analysis of extremely discordant and concordant sibling pairs identifies quantitative-trait loci that influence variation in the human personality trait neuroticism. Am. J. Hum. Genet. 72:879-890.

GEORGES, M., D. NIELSEN, M. MACKINNON, A. MISHRA, and R. OKIMOTO et al., 1995  Mapping quantitative trait loci controlling milk production in dairy cattle by exploiting progeny testing. Genetics 139:907-920.

GIANOLA, D., 1982  Theory and analysis of threshold characters. J. Anim. Sci. 63:217-244.

HANSEN, L. B., 2000  Consequences of selection for milk yield from a geneticist's viewpoint. Invited symposium paper. J. Dairy Sci. 83:1145-1150.

KIRKPATRICK, B. W., B. M. BYLA, and K. E. GREGORY, 2000  Mapping quantitative trait loci for bovine ovulation rate. Mamm. Genome 11:136-139.

LANDER, E. and D. BOTSTEIN, 1989  Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121:185-199.

MACCLUER, J. W., J. L. VANDEBERG, B. READ, and O. A. RYDER, 1986  Pedigree analysis by computer simulation. Zoo Biol. 5:147-160.

MARTIN, E. R., S. A. MONKS, L. L. WARREN, and N. L. KAPLAN, 2000  A test for linkage and association in general pedigrees: the pedigree disequilibrium test. Am. J. Hum. Genet. 67:146-154.

MEUWISSEN, T. H. E. and M. E. GODDARD, 2000  Fine mapping of quantitative trait loci using linkage disequilibria with closely linked marker loci. Genetics 155:421-430.

TERWILLIGER, J. D. and H. H. GORING, 2000  Gene mapping in the 20th and 21st centuries: statistical methods, data analysis, and experimental design. Hum. Biol. 72:63-132.

ZHANG, S., K. ZHANG, J. LI, F. SUN, and H. ZHAO, 2001  Test of linkage and association for quantitative traits in general pedigree: the quantitative pedigree disequilibrium test. Genet. Epidemiol. 21(Suppl. 1):370-375.