Power to Detect Higher-Order Epistatic Interactions in a Metabolic Pathway Using a New Mapping Strategy

Epistatic interactions among quantitative trait loci (QTL) contribute substantially to the variation in complex traits. The main objectives of this study were to (i) compare three- vs. four-step genome scans to identify three-way epistatic interactions among QTL belonging to a metabolic pathway, (ii) investigate by computer simulations the power and proportion of false positives (PFP) for detecting three-way interactions among QTL in recombinant inbred line (RIL) populations derived from a nested mating design, and (iii) compare these estimates to those obtained for detecting three-way interactions among QTL in RIL populations derived from diallel and different partial diallel mating designs. The single-nucleotide polymorphism haplotype data of B73 and 25 diverse maize inbreds were used to simulate the production of various RIL populations. Compared to the three-step genome scan, the power to detect three-way interactions was higher with the four-step genome scan. Higher power to detect three-way interactions was observed for RILs derived from optimally allocated distance-based designs than from nested designs or diallel designs. The power and PFP to detect three-way interactions using a nested design with 5000 RILs were for both the 4-QTL and the 12-QTL scenario of a magnitude that seems promising for their identification.

U NTIL now estimation of the positions of quantitative trait loci (QTL) in plant genetics was accomplished by classical linkage mapping (Lander and Botstein 1989). Recently, the adaption of association mapping in plant genetics has been proposed by several authors (e.g., Vuylsteke et al. 2000;Thornsberry et al. 2001). Both linkage and association mapping methods have merits and limitations for QTL mapping. While linkage mapping methods offer a high power to detect QTL in genomewide approaches, association mapping methods have the merit of a high resolution to detect QTL (Remington et al. 2001). Wu and Zeng (2001) studied a joint linkage and linkage disequilibrium (LD) mapping strategy for natural populations. Using data from a general complex pedigree of cattle, Blott et al. (2003) and Meuwissen et al. (2002) identified candidategene polymorphisms at previously mapped QTL by combining linkage and LD information.
In this study, we examine a genomewide QTL mapping strategy using genome sequence information of recombinant inbred lines (RILs) that were generated from several crosses of parental inbreds. This QTL mapping strategy is based on the idea that the genomes of RILs are mosaics of chromosomal segments of their parental genome. Consequently, within the chromosomal segments the LD information across the parental inbreds is maintained. Thus, if diverse parental inbreds are used as in this study, LD decays within the chromosomal segments over a short physical distance (Wilson et al. 2004). Therefore, the new mapping strategy will show not only a high power to detect QTL in genomewide approaches but also a high mapping resolution when both linkage and LD information are used.
Results from model organisms suggest that epistatic interactions among loci also contribute substantially to the variation in complex traits (Carlborg and Haley 2004;Marchini et al. 2005). While Rebai et al. (1997) applied classical linkage mapping to detect QTL with additive effects in connected mapping populations of maize, Blanc et al. (2006) used such populations to detect two-way epistatic interactions. The power to detect two-way interactions by using different mating designs was examined by Verhoeven et al. (2006). Furthermore, Ritchie et al. (2003) assessed the power of multifactor dimensionality reduction to detect twoway interactions. However, several studies described QTL 3 genetic background interactions (e.g., Doebley et al. 1995;Alonso-Blanco et al. 1998), which can be caused by higher-order epistatic interactions among QTL (Jannink and Jansen 2001). Furthermore, the metabolic pathways that presumably underlie quantitative traits involve multiple interacting gene products and regulatory loci that could generate higher-order epistatic interactions (McMullen et al. 1998). Information about the power for genomewide detection of epistatic interactions among more than two QTL is still lacking.
The objectives of our research were to (i) compare three-vs. four-step genome scans to identify three-way interactions among QTL involved in a metabolic pathway, (ii) investigate by computer simulations the power and proportion of false positives (PFP) for detecting three-way interactions among QTL in RIL populations derived from a nested mating design, and (iii) compare these estimates to those obtained for detecting threeway interactions among QTL using RIL populations derived from diallel and different partial diallel mating designs.
Examined mating designs: The nested association mapping (NAM) data set was established in accordance with the crossing scheme applied in the project ''molecular and functional diversity of the maize genome.'' From each cross of the 25 diverse inbreds with B73, a segregating population with 200 RILs was developed. The diallel association mapping (DAM) data set DAM4875 was generated by deriving RIL populations with 15 RILs from each of the 325 crosses in the diallel (method 4; Griffing 1956) among all 26 inbreds.
The distance-based (DB) data sets DBc 3 r were created by selecting from the 325 crosses in a diallel the c combinations of parental inbreds that show, on the basis of all marker loci, the maximum genetic dissimilarity calculated according to Nei and Li (1979). For the c combinations of parental inbreds r RILs were derived from each combination. In our study the data sets DB75 3 65, DB125 3 39, and DB195 3 25 were examined. For single round robin (SRR) (Verhoeven et al. 2006), 188 RILs were derived from each of the 26 chain crosses, i.e., inbred 1 3 inbred 2, inbred 2 3 inbred 3, . . . , inbred 26 3 inbred 1. The data sets DAM900, DB25 3 36, DB50 3 18, DB100 3 9, and DB150 3 6 were examined only in combination with the NAM data set and were therefore based on the 300 crosses in a diallel among the 25 diverse inbreds.
In a second scenario, 12 QTL, organized in three four-locus pathways, were assumed. Genotypic values of the inbreds were determined by summing up the effects caused by the individual pathways. In this scenario s 2 A ; s 2 AA ; s 2 AAA , and s 2 AAAA were 3.516, 0.820, 0.387, and 0.073, respectively, for a population with allele frequencies of 0.5, whereas for the NAM data set the corresponding variances were 2.034, 0.248, 0.064, and 0.007, respectively.
The phenotypic values of the RILs were generated by adding a normally distributed variable N(0, s 2 E ) to the genotypic values. The error variance was calculated as where s 2 G denotes the genetic variance and h 2 denotes the heritability on an entry-mean basis. On the basis of previous empirical studies, we examined h 2 -values of 0.5 and 0.8 (Flint-Garcia et al. 2005). All simulations were performed with software Plabsoft (Maurer et al. 2004), which is implemented as an extension of the statistical software R (R Development Core Team 2004).
Statistical analyses: Due to their vast number, it is intractable to detect three-way interactions using a three-dimensional genome scan. Therefore, we used two different model selection approaches for identifying three-way interactions that considerably reduce the number of models to be evaluated during the model selection process. For both approaches PROC GLMSELECT of the statistical software SAS (SAS Institute 2005) was used.
Several authors suggested the use of information criteria such as the Akaike information criterion or the Schwarz Bayesian criterion (Piepho and Gauch 2001) or modifications thereof (Bogdan et al. 2004;Baierl et al. 2006) to circumvent the problems connected with multiple likelihood-ratio tests for model selection. In preliminary simulations, however, we used the Schwarz Bayesian criterion and observed a high PFP (data not shown). Therefore, in our study we used the model selection criteria P-to-enter and P-to-stay (Miller 2002), which allows the use of a more conservative threshold.
All SNPs, and also those treated as QTL, were included in both approaches. Hence, QTL detection is not based on LD between QTL and adjacent molecular markers and, thus, the correlation structure among the RILs can be ignored.
Three-step genome scan: The three-step genome scan applied in this study to identify three-way interactions was based on the one-dimensional genome scan described by Jannink and Jansen (2001). In the first step, stepwise multiple linear regression (Efroymson 1960) was performed on y, the phenotypic values of the RILs, as a dependent variable and w 1 , w 2 , . . . , w 653 , the SNP loci, and x, the affiliation of each RIL to a cross of parental inbreds, as independent variables. Independent variables showing P-to-enter or P-to-stay ,1 3 10 À8 were added or kept in the model.
In addition to the variables identified in the first step, the variables w 1 3 x, w 2 3 x, . . . , w 653 3 x were used as independent variables in the stepwise multiple linear regression of the second step, where variable selection was performed only on w 1 3 x, w 2 3 x, . . . , w 653 3 x. Variables showing a P-to-enter or P-tostay ,1 3 10 À5 were added or kept in the model. The i variables w out of the i w 3 x interactions identified in the second step were used in the backward elimination procedure of the third step together with the ð i 2 Þ and ð i 3 Þ possible two-and three-locus interactions among them as independent variables. Variables showing a P-to-stay ,1 3 10 À5 were kept in the model. The model resulting from the third step was designated as the final model.
Four-step genome scan: Stepwise multiple linear regression was also used for the four-step genome scan, where in the first step y was used as a dependent variable and w 1 , w 2 , . . . , w 653 , the SNP loci, as independent variables. The j loci identified in the first step were used together with the two-way interactions, which were constructed by combining the j loci with all loci, as independent variables in the stepwise multiple linear regression of the second step, where variable selection was performed only on the two-way interactions. The single loci and two-way interactions identified in the first and the second step, respectively, were used together with the three-way interactions, which were constructed by combining the two-way interactions with all loci, as independent variables in the stepwise multiple linear regression of the third step. Variable selection was performed only on the three-way interactions. Three twoway interactions are subordinated to each three-way interaction, whereas the significance of two of them was examined in the previous steps. To ensure the detection of three-way epistatic interactions and not the effect of three-way interactions confounded with that of the not examined, subordinated twoway interactions we applied in the fourth step backward elimination on all variables contained in the model resulting from the third step and the not examined, subordinated twoway interactions. In this step, variable selection was performed only on the three-way interactions and the not examined, subordinated two-way interactions. The model resulting from the fourth step was designated as the final model. In each of the four steps, variables showing a P-to-enter or P-to-stay ,1 3 10 À8 were added or kept in the model. This conservative threshold was chosen to warrant an acceptable PFP. To observe an acceptable PFP in studies based on empirical data we suggest using computer simulations to estimate the corresponding P-to-enter and P-to-stay.
The power 1 À b* to detect three-way interactions was calculated as the proportion of three-way interactions correctly identified in the final model out of the total number of threeway interactions simulated. We estimated the PFP (Fernando et al. 2004) as the proportion of three-way interactions for which at least one locus is not a QTL out of the total number of three-way interactions identified in the final model. Averages were calculated across the simulated 50 replications to determine 1 À b* and PFP.

RESULTS
The average map distance between the 653 SNP markers was 2.6 cM. The pairwise genetic dissimilarity among the 26 inbreds ranged from 0.25 to 0.42 (Figure 2). The average frequency of the B73 allele was 0.81 in the RILs of the NAM data set and 0.64 in the RILs of data sets DAM4875 and SRR.
In the three-step genome scan, the power and PFP to detect three-way interactions were for the NAM data set 0.05 and 0.35 (4 QTL; h 2 ¼ 0.5), respectively (data not shown). For 12 QTL and h 2 ¼ 0.5, the power to detect three-way interactions using the NAM data set decreased to 0.00 and PFP increased to 1.00. A power 1 À b* of 0.08 (4 QTL; h 2 ¼ 0.5) was observed for data sets DAM4875 and DB125 3 39, whereas the PFP was 0.40 and 0.35, respectively.
Using the four-step genome scan, a power 1 À b* to detect three-way interactions of 0.28 (4 QTL; h 2 ¼ 0.5) was found for the NAM data set (Table 2). In the scenario with 12 QTL, the power 1 À b* was 0.18. Lower PFP was detected for the 12-QTL scenario (0.29) than for the 4-QTL scenario (0.54). Totals of 1.2 and 1.5 times higher power estimates were observed for the 4-and 12-QTL scenarios of the NAM data set, respectively, when increasing h 2 from 0.5 to 0.8.
For both levels of h 2 , $1.5 times higher power estimates were observed for the combined data set of NAM and DAM900 than for NAM. For the former, PFP was for both levels of h 2 $0.50. The power 1 À b* for combined NAM and DB data sets ranged for 4 QTL and h 2 ¼ 0.5 from 0.27 to 0.42 and for h 2 ¼ 0.8 from 0.33 to 0.60. For these data sets PFP varied for 4 QTL and h 2 ¼ 0.5 between 0.46 and 0.52 and for h 2 ¼ 0.8 between 0.51 and 0.64.

DISCUSSION
The comparison of power 1 À b* of different statistical analyses requires an equal PFP. However, model selection procedures such as those applied in our study do not stringently adhere to a specified PFP (Miller 2002). Nevertheless, in our study similar PFP values were obtained for all mating designs of each simulation scenario under the four-step genome scan and, thus, power estimates can be compared.
Owing to technical reasons, we were not able to ascertain similar PFP values for the four-step genome scan as for the three-step genome scan. However, differences in  1 À b* between both approaches were of such size that the four-step genome scan seems more promising for detecting epistatic QTL in the assumed metabolic pathway than the three-step genome scan irrespective of the mating design. Therefore, the comparison between the examined mating designs was based only on results of the former. Power to detect three-way interactions under different mating designs: Despite a comparable number of RILs in the two data sets NAM and DAM4875, higher power estimates were found for the latter (Table 2). This observation is attributable to the average frequency of the B73 allele, which is closer to 0.5 for DAM4875 than for NAM. Crossing schemes resulting in RILs with an average allele frequency of 0.5 have a high power to detect QTL because the probability that some QTL haplotypes have only a very small class size is minimized (Verhoeven et al. 2006). For the detection of higherorder epistatic QTL this issue is even more important because the number of possible QTL haplotypes increases with an increasing number of QTL. But this reduces the probability that all QTL haplotypes are present in the data set.
The average frequency of the B73 allele was the same for the RILs of DAM4875 and SRR. Nevertheless, in our study a higher power to detect three-way interactions was found for the former. This is in contrast to a result of Verhoeven et al. (2006), who observed a considerably higher power to detect epistatic QTL for SRR than for the same number of RILs derived from a diallel. The different findings can be explained by the different assumptions underlying the simulations. Verhoeven et al. (2006) assumed a distinct allele for each parental inbred. In this case large numbers of small populations show, due to the increased probability that some QTL haplotypes have only very small class size, a lower power to detect epistatic QTL than do a small number of large populations. However, the assumption made by Verhoeven et al. (2006) ignores the consequences of genetic drift that for real data not all QTL segregate in every population (Xu 1996). In our study this fact was considered by using SNP haplotype data of 26 inbreds as a basis of the simulations. Consequently, the mating designs resulting in a large number of small populations have indeed the above-mentioned disadvantage but this is compensated by the large number of individuals within populations segregating for the QTL.
The probability that QTL are segregating is increased in individual line crosses by using parental inbreds that are phenotypically the opposite extremes for the trait of interest (Xu 1998). However, this may not be helpful for detecting QTL for other traits. Furthermore, results of Burkhamer et al. (1998) suggest that inbreds showing a large genetic distance on the basis of molecular markers also strongly differ in their alleles at QTL. Therefore, we examined the DB approach by using only those parental combinations of the diallel to establish RILs that show, on the basis of all marker loci, the maximum genetic dissimilarity. A higher power to detect three-way interactions was observed for DB125 3 39 and DB195 3 25 than was found for NAM, SRR, and DAM4875. This indicated that the DB design is promising for increasing the probability that QTL are segregating in populations.
Nevertheless, we observed a lower power to detect three-way interactions for DB75 3 65 than for DB125 3 39 and DB195 3 25. The opposite result was expected, on the basis of the average genetic dissimilarity among the parental inbreds and the higher number of RILs per segregating population (Xie et al. 1998). Presumably, the reason for our observation is the insufficient sampling of QTL alleles of the base population if the number of selected parental combinations is too low (Muranty 1996;Wu and Jannink 2004).
In summary, the results of our study indicated that for a genetic dissimilarity among the parental inbreds such as that observed in this study the crossing schemes underlying the data sets DB125 3 39 and DB195 3 25 are the most promising designs to detect three-way interactions. However, our results also suggest that only RILs derived from optimally allocated DB designs show an increased power to detect three-way interactions. Nevertheless, the project ''molecular and functional diversity of the maize genome'' applies the crossing scheme underlying the NAM data set to construct RIL populations. The reasons are: (i) The common reference parent B73 has been the subject of extensive genetic and genomic studies (e.g., Morgante et al. 2005), and (ii) crossing the 25 diverse inbreds to the welladapted inbred B73 facilitates both the development and the phenotypic evaluation of RILs.
Factors influencing the power and PFP to detect higher-order epistatic QTL in NAM: Genetic architecture of the trait: A higher power 1 À b* of detecting three-way interactions was observed with a single pathway influencing the phenotypic trait than with three pathways influencing the phenotypic trait (Table 2). This is due to the increased proportion of variance explained by a single pathway. Thus, if pathways explain unequal proportions of the genotypic variance, the power to detect epistatic interactions is higher for pathways explaining a large proportion of the genotypic variance than for pathways explaining a small proportion of the genotypic variance. Higher PFP was observed for the one-pathway scenario than for the three-pathway scenario. This finding can be explained by the higher power to detect three-way interactions in the former than in the latter scenario.
In our study the genotypic values for the four-locus genotypes were arranged in such a way that they are interpretable as molecular interactions among genes (Figure 1) ( Jayaram and Peterson 1990). Nevertheless, for each individual pathway genic effects of four single loci, six two-way interactions, four three-way interactions, and one four-way interaction can be estimated (Yang 2004). In addition, the variances s 2 A ; s 2 AA ; s 2 AAA , and s 2 AAAA can be calculated. The variances s 2 A and s 2 AA determined in our study under the assumption of allele frequencies of 0.5 agree well with those reported by Wolf et al. (2000) for various traits in an F 2 maize population. Thus, we conclude that the assumptions underlying our simulations were realistic. For pathways consisting of k QTL and showing mono-, di-, and trigenic effects similar to those of our study, a higher power to detect three-way interactions when k ¼ 3 is expected. This is because a lower number of genic effects is influencing the genotypic value and, thus, each single genic effect is explaining a higher proportion of the genotypic variance. The opposite is expected for k . 4.
Detection method: For the detection of three-way interactions, the four-step genome scan requires that the genic effects of at least one single locus and one two-way interaction are different from zero in addition to the genic effect of the three-way interaction. In contrast, using the three-step genome scan for the detection of three-way interactions the main effect of the QTL does not matter and only the interaction effect of all three QTL with the genetic background must be different from zero. Epistatic interactions among QTL cause QTL 3 genetic background interactions ( Jannink and Jansen 2001) and, thus, the effect of the latter increases in our study with the size of the genic effects of two-, three-, and four-way interactions. Therefore, for both QTL detection approaches a higher power to detect three-way interactions is expected if genotypic values are assumed for the four-locus genotypes such that all genic effects have a higher absolute value than in our study. The opposite result is expected if all genic effects show a lower absolute value than in our study.
Different trends of the power to detect three-way interactions are expected for the two QTL detection approaches, if all genic effects of at least one of the steps, a single locus or two-way interactions, decrease. While a strong reduction in power to detect three-way interactions is expected by using the four-step genome scan, only a weak reduction is expected for the three-step genome scan. In the extreme, if all genic effects of one of these steps are zero, three-way interactions can be detected only by using the three-step genome scan. However, this case is very unlikely (Marchini et al. 2005). Further research is needed concerning the most promising detection method for epistatic interactions under different genetic architectures.
Probability of QTL segregation: The probability of QTL segregation is influenced by (i) the average and (ii) the variance of the genetic dissimilarity among the parental inbreds of the RILs. The higher the average genetic dissimilarity among the parental inbreds, the higher the power is to detect three-way interactions for RILs derived from all mating designs. For a low average genetic dissimilarity the opposite result is expected. The optimal number of populations and RILs per population is not influenced by the average genetic dissimilarity among the parental inbreds. In contrast, the higher the variance of the genetic dissimilarity among the parental inbreds, the larger the difference in power 1 À b* is between mating designs considering genetic dissimilarities, like DB designs, and mating designs neglecting this information. For a low variance of the genetic dissimilarity the opposite result is expected.
Genetic map distance between marker loci and QTL: In our study all SNPs, and also those treated as QTL, were included in the statistical analyses. This is because the proposed mapping strategy requires that markers are available for the QTL itself, which is true if the genome sequence of all RILs is known. Due to the fast progress of genome sequencing techniques (Churchill et al. 2004;Shendure et al. 2004) this is a realistic assumption in the foreseeable future, which maximizes the power to detect three-way interactions.
A decreased power to detect three-way interactions and an increased PFP are expected if RILs are genotyped for a lower number of SNPs. This is attributable to the decreased probability of substantial LD between QTL and marker loci (Marchini et al. 2005). Furthermore, the application of the proposed mapping strategy to true data sets for which the genome sequence of all individuals is not available requires major modifications, the most important one being that in this case QTL detection is based on LD between QTL and adjacent molecular markers and, thus, the correlation structure among the RILs must be considered. Furthermore, in some cases it might be necessary to (i) assume a certain QTL allele for each parental inbred or (ii) use multilocus haplotype data to infer parental allelic relationships at QTL ( Jansen et al. 2003).
Heritability of the trait: Increasing h 2 from 0.5 to 0.8 resulted for all examined data sets and both numbers of QTL in a considerably higher power to detect three-way interactions. This is because for h 2 ¼ 0.8 the environmental influence on the phenotypic trait is reduced in comparison with h 2 ¼ 0.5. Hence, increasing h 2 by conducting field experiments with several replications in several environments is a promising approach to increase the power to detect three-way interactions. However, in studies with a fixed budget this implies a reduction of the number of RILs to be tested in field experiments. Further research is needed concerning the optimal allocation of resources with respect to the number of RILs and the intensity of their phenotypic evaluation.
A higher PFP was observed for scenarios underlying a high heritability than for scenarios underlying a low heritability. This finding can be explained by the higher power to detect three-way interactions in the former than in the latter scenario.
Number of examined RILs: Up to now, published QTL mapping experiments with replicated trials mostly employed between 100 and 200 progenies (Melchinger et al. 1998). Experiments of this size have a low power to detect epistatic QTL (Mihaljevic et al. 2005). However, in our study the power to detect three-way interactions with 5000 RILs derived from a nested design was relatively high for both the 4-QTL and the 12-QTL scenarios. Also the observed PFP of $0.54 (4 QTL) and 0.29 (12 QTL) was at an acceptable level considering the complex genetic architecture. Nevertheless, near-isogenic lines should be used to validate the identified epistatic interactions. For the validation of each identified threeway interaction, eight near-isogenic lines are required.
For detection of three-way interactions in pathways more complex than that of our study, the NAM data set must be complemented with additional RILs. Furthermore, our results suggest that the optimally allocated DB approach is more appropriate for complementing the NAM data set than deriving additional RILs from a diallel.