## Abstract

We present a new flexible, simple, and powerful genome-scan method (flexible intercross analysis, FIA) for detecting quantitative trait loci (QTL) in experimental line crosses. The method is based on a pure random-effects model that simultaneously models between- and within-line QTL variation for single as well as epistatic QTL. It utilizes the score statistic and thereby facilitates computationally efficient significance testing based on empirical significance thresholds obtained by means of permutations. The properties of the method are explored using simulations and analyses of experimental data. The simulations showed that the power of FIA was as good as, or better than, Haley–Knott regression and that FIA was rather insensitive to the level of allelic fixation in the founders, especially for pedigrees with few founders. A chromosome scan was conducted for a meat quality trait in an F_{2} intercross in pigs where a mutation in the halothane (Ryanodine receptor, RYR1) gene with a large effect on meat quality was known to segregate in one founder line. FIA obtained significant support for the halothane-associated QTL and identified the base generation allele with the mutated allele. A genome scan was also performed in a previously analyzed chicken F_{2} intercross. In the chicken intercross analysis, four previously detected QTL were confirmed at a 5% genomewide significance level, and FIA gave strong evidence (*P* < 0.01) for two of these QTL to be segregating within the founder lines. FIA was also extended to account for epistasis and using simulations we show that the method provides good estimates of epistatic QTL variance even for segregating QTL. Extensions of FIA and its applications on other intercross populations including backcrosses, advanced intercross lines, and heterogeneous stocks are also discussed.

THE detection of quantitative trait loci (QTL) in domestic animals has been greatly enhanced by the design of experimental crosses between highly divergent lines (Andersson 2001). The animals of the founder lines are usually taken from two different breeds with large phenotypic differences, such as European Wild Boar and Large White domestic pigs (Knott *et al*. 1998), Jungle Fowl and White Leghorn chicken (Kerje *et al*. 2003), and selected mouse lines (Brockmann *et al*. 1998). Here large genetic differences between the breeds are expected for the studied trait, and the power of detecting QTL is high even for moderately sized experimental crosses. There is often a substantial within-breed heritability for the studied trait, but still the commonly used regression model to detect QTL assumes a biallelic QTL that is fixed within each of the two founder lines (Haley *et al*. 1994). If the QTL alleles are not fixed, however, the regression model will underestimate the QTL allele effect and the power to detect the QTL decreases (Perez-Enciso and Varona 2000). Furthermore, an increased understanding of the magnitude of the genetic variation within lines is crucial for further studies to identify the causative genes underlying QTL.

Several approaches have been adopted to account for within-line QTL variation in line crosses of outbred lines. Following Goddard (1992), Wang *et al*. (1998) suggested the use of a multibreed model with a fixed breed effect and a random QTL effect. For analysis of F_{2} intercrosses, Knott *et al*. (1996) developed a nested within-half-sib family model that does not assume fixation of QTL alleles in the founder lines, and the number of alleles is constrained only by the number of families. This is a model with fixed effects and the number of estimated parameters increases with the number of half-sib families. Furthermore, the genotypic information of the dams is not included in the model and the sires are assumed to be unrelated.

Perez-Enciso and Varona (2000) developed a mixed-QTL model that accounts for line differences and within-line variation of QTL effects. In this model, which is similar to the model developed by Wang *et al*. (1998), a fixed line effect is estimated together with a random within-line QTL variance. The model is a combination of the regression model (Haley *et al*. 1994) and the QTL variance component (VC) model (Fernando and Grossman 1989; Goldgar 1990). A drawback of the model is, however, the difficulty to compare estimates in different genomic locations as the total QTL variance is a combination of fixed and random effects. The method of Perez-Enciso and Varona (2000) maximizes the likelihood with a nonderivative algorithm and is rather slow. Their model was further developed in Perez-Enciso *et al*. (2001) to account for dominance effects.

Meuwissen and Goddard (2000) developed a model for fine mapping in outbred populations. They model a QTL as a random effect and introduce nonzero correlations between the founder alleles given by estimated linkage disequilibrium within founder haplotypes. We develop this approach further by, instead of estimating the founder correlations from external information about the population history, estimating these correlations directly from the marker–pedigree–phenotype data from the experimental cross.

On the basis of this model, we develop a new flexible genome-scan method for QTL detection in experimental intercrosses. The method is based on a general VC model that utilizes the score statistic to detect QTL. The VC framework relaxes the assumptions of previous fixed-effects models (*e.g*., Haley *et al*. 1994) and ensures simple interpretation and testing of the estimated parameters. The method can be extended to account for epistasis for both segregating and fixed QTL. The use of the score statistic makes the analysis fast and robust, which enables empirical significance testing. This new method is referred to as flexible intercross analysis (FIA).

## METHODS

#### Statistical modeling:

First we present the single-locus VC model, where all base QTL alleles are assumed to be uncorrelated (Fernando and Grossman 1989; Goldgar 1990). Thereafter, we develop the single-locus model to account for within-founder-line correlations. Our model can be used to identify within-line variation in a three-stage process:

Step 1: Perform a genome scan utilizing the score statistic to test for QTL that are either fixed or segregating within founder lines.

Step 2: Estimate the within-line correlation (ρ) at significant QTL locations and test H_{0}: ρ = 1 *vs*. H_{1}: ρ < 1.

Step 3: If H_{0} can be rejected, different correlations will be estimated within lines and different allele effects will be identified within lines by analyzing the least-squares estimates of the base generation allele effects.

The traditional single-locus VC model for general pedigrees with *n* phenotyped individuals is given by

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{v} + \mathbf{a} + \mathbf{e} \tag{1a}$$

(*e.g*., Lynch and Walsh 1998, Chap. 16), where **y** is the vector of individual phenotypes (length *n*), β is the vector of fixed effects and **X** is the corresponding design matrix, **v** is a vector of random individual QTL effects (length *n*) in position τ, **a** is a vector of random polygenic effects (length *n*), and **e** is a vector of residual effects (length *n*). The variance–covariance matrix of **y**, assuming independent allelic effects in the base generation, is

$$\mathbf{V} = \boldsymbol{\Pi}\sigma_v^2 + \mathbf{A}\sigma_a^2 + \mathbf{I}_n\sigma_e^2 \tag{1b}$$

where Π is the genotype identity-by-descent (IBD) matrix (size *n* × *n*) calculated in position τ, σ_{v}^{2} is the corresponding genotype QTL variance, **A** is the additive relationship matrix, σ_{a}^{2} is the variance of polygenic effects, **I**_{n} is the identity matrix of size *n* × *n*, and σ_{e}^{2} is the residual variance.

An alternative, and equivalent, presentation of model (1) is

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{v}^{*} + \mathbf{e} \tag{2}$$

(Rönnegård and Carlborg 2007), where **v*** is the vector of *m* independent normal-distributed base generation QTL alleles with variance ½σ_{v}^{2}, **Z** is an incidence matrix of size *n* × *m* relating individuals with the QTL alleles in the base generation, and **e** is the residual vector with variance σ_{e}^{2}. In the calculations of the IBD matrix in model (1), it is implicitly assumed that the QTL allele effects are uncorrelated in the base generation, so var(**v***) = ½σ_{v}^{2}**I**_{m}, where **I**_{m} is the identity matrix of size *m* × *m*. We also have the identity Π = ½**ZZ**′.
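The link between the incidence matrix and the genotype IBD matrix can be checked on a toy example. The sketch below assumes that allele transmission is known without error, so that **Z** holds exact base-allele counts and the genotype IBD matrix equals half of **ZZ**′; the three-individual example and the helper names (`transpose`, `matmul`, `genotype_ibd`) are hypothetical and not taken from the original analysis.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def genotype_ibd(z):
    """Return 0.5 * Z Z' for an incidence matrix Z of base-allele counts."""
    zzt = matmul(z, transpose(z))
    return [[0.5 * value for value in row] for row in zzt]


# Hypothetical example: four base alleles (two founders), three descendants.
# Row i counts the copies of each base allele carried by individual i.
Z = [
    [1, 0, 1, 0],  # carries base alleles 1 and 3
    [1, 0, 0, 1],  # carries base alleles 1 and 4
    [0, 1, 0, 1],  # carries base alleles 2 and 4
]
PI = genotype_ibd(Z)
```

Individuals sharing one base allele get an IBD coefficient of 0.5, and individuals sharing none get 0, which is the usual interpretation of the genotype IBD matrix for non-inbred individuals.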

##### The single-locus VC QTL model for experimental line crosses:

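In a line cross, the QTL allele effects will be correlated within the base generation lines *A* and *B*. This correlation can be included in model (2). We then get a mixed linear model **y** = **X**β + **Zv*** + **e**, where **v*** is a random effect with *m* = *m*_{A} + *m*_{B} levels, *m*_{A} is twice the number of individuals in line *A*, and *m*_{B} is twice the number of individuals in line *B*. In our model, there is an unknown correlation ρ among the first *m*_{A} levels and among the last *m*_{B} levels of **v***, such that **v*** ∼ MVN(0, **G**) and, *e.g*., for one founder in line *A* and three founders in line *B* (*m*_{A} = 2 and *m*_{B} = 6),

$$
\mathbf{G} = \tfrac{1}{2}\sigma_v^2
\begin{pmatrix}
1 & \rho & 0 & 0 & 0 & 0 & 0 & 0\\
\rho & 1 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & \rho & \rho & \rho & \rho & \rho\\
0 & 0 & \rho & 1 & \rho & \rho & \rho & \rho\\
0 & 0 & \rho & \rho & 1 & \rho & \rho & \rho\\
0 & 0 & \rho & \rho & \rho & 1 & \rho & \rho\\
0 & 0 & \rho & \rho & \rho & \rho & 1 & \rho\\
0 & 0 & \rho & \rho & \rho & \rho & \rho & 1
\end{pmatrix} \tag{3a}
$$

or in alternative notation

$$\mathbf{G} = \tfrac{1}{2}\sigma_v^2\left[(1-\rho)\mathbf{I}_m + \rho\mathbf{J}\right] \tag{3b}$$

with σ_{v}^{2} the genotype QTL variance and **J** a block diagonal matrix with ones in blocks of size *m*_{A} × *m*_{A} and *m*_{B} × *m*_{B}. Hence, **y** ∼ MVN(**X**β, **V**), where

$$\mathbf{V} = \mathbf{Z}\mathbf{G}\mathbf{Z}' + \mathbf{I}_n\sigma_e^2 \tag{4a}$$

Using the gametic IBD matrix formulation (*e.g*., Pong-Wong *et al*. 2001), we have a gametic IBD matrix **W** for the base individuals, with and , where (⊗ denotes the Kronecker product).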

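The key to estimating the two QTL variance components, (1 − ρ)σ_{v}^{2} and ρσ_{v}^{2} (and thereby also ρ), is to formulate the variance–covariance matrix of **y** as

$$\mathbf{V} = (1-\rho)\sigma_v^2\,\boldsymbol{\Pi} + \rho\sigma_v^2\,\boldsymbol{\Pi}_J + \mathbf{I}_n\sigma_e^2 \tag{4b}$$

where Π and Π_{J} are the IBD matrices calculated for base-generation structures with alleles independent and fixed within lines, respectively. From (2), it is possible to show that (4a) and (4b) are equivalent since

$$\mathbf{Z}\mathbf{G}\mathbf{Z}' = \tfrac{1}{2}\sigma_v^2\left[(1-\rho)\mathbf{Z}\mathbf{Z}' + \rho\,\mathbf{Z}\mathbf{J}\mathbf{Z}'\right] = (1-\rho)\sigma_v^2\,\boldsymbol{\Pi} + \rho\sigma_v^2\,\boldsymbol{\Pi}_J ,$$

with Π = ½**ZZ**′ and Π_{J} = ½**ZJZ**′, where **J** is a block diagonal matrix with ones in blocks of size *m*_{A} × *m*_{A} and *m*_{B} × *m*_{B}.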

##### Epistatic VC QTL model:

Including epistasis is an essential part of QTL modeling (Carlborg and Haley 2004). FIA accounts for epistasis by an extension of model (4b) as(5)where Π_{1} and Π_{2} are the IBD matrices for the two QTL, assuming an outbred base generation, and the Hadamard product ∘ was used for the epistatic variance component (Mitchell *et al*. 1997; Rönnegård *et al*. 2008).
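The covariance structure attached to the epistatic variance component can be sketched as an element-wise (Hadamard) product of the two single-locus IBD matrices. The example below is a minimal illustration with made-up 2 × 2 IBD matrices; the function name `hadamard` and the input values are assumptions, and the remaining terms of model (5) are not reproduced here.

```python
def hadamard(a, b):
    """Element-wise product of two equally sized matrices (lists of rows)."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical IBD matrices at two unlinked QTL for two individuals.
PI_1 = [[1.0, 0.5], [0.5, 1.0]]
PI_2 = [[1.0, 0.25], [0.25, 1.0]]
PI_EPISTATIC = hadamard(PI_1, PI_2)
```

Two individuals contribute to the epistatic covariance only to the extent that they share alleles at both loci, which is why the off-diagonal entry of the product is smaller than either single-locus coefficient.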

##### Genome scan with the score statistic:

The score statistic is calculated as *S* = **D**′**F**^{−1}**D** (Cox and Hinkley 1974; Tang and Siegmund 2001; Putter *et al*. 2002), where **D** is the gradient vector for the log-likelihood function under the alternative hypothesis calculated at the parameter values given by the null hypothesis, and **F** is Fisher's information matrix (see Lynch and Walsh 1998).

We base our analysis on the residual log-likelihood function (Patterson and Thompson 1971):

Let be the parameter vector for the VC part of the model. We then have the following partial derivatives:

The elements of the gradient **D** under the null hypothesis of were calculated as(Lynch and Walsh 1998), where , and with and being the estimated polygenic and residual variance under the null hypothesis. The elements of **F** are calculated asNote that the value of **P**_{0} does not change between positions along the genome and that the computational requirements for the score statistic are low.
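A simplified version of the score computation can be sketched as follows. This is not the implementation used in the study: it assumes a single QTL variance component, an intercept as the only fixed effect, no polygenic term, and the standard REML forms for the gradient, −½tr(**P**_{0}Π) + ½**y**′**P**_{0}Π**P**_{0}**y**, and for the information, ½tr(**P**_{0}Π**P**_{0}Π). It also ignores the information shared with the nuisance residual variance. The function name `score_statistic` and the toy data are hypothetical.

```python
def matmul(a, b):
    bt = [list(col) for col in zip(*b)]
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def score_statistic(y, pi):
    """Return (D, F, D*D/F) for one QTL variance component under the null.

    Null model: y = mean + e with var(e) = s2 * I, so that
    P0 = (I - 11'/n) / s2 is the same at every genome position.
    """
    n = len(y)
    mean = sum(y) / n
    centered = [value - mean for value in y]
    s2 = sum(value * value for value in centered) / (n - 1)  # REML estimate
    p0 = [[((1.0 if i == j else 0.0) - 1.0 / n) / s2 for j in range(n)]
          for i in range(n)]
    p0y = [value / s2 for value in centered]
    p0pi = matmul(p0, pi)
    trace = sum(p0pi[i][i] for i in range(n))
    quad = sum(p0y[i] * pi[i][j] * p0y[j] for i in range(n) for j in range(n))
    d = -0.5 * trace + 0.5 * quad
    p0pi_sq = matmul(p0pi, p0pi)
    f = 0.5 * sum(p0pi_sq[i][i] for i in range(n))
    return d, f, d * d / f


# Toy data: individuals 1-2 and 3-4 share alleles and have similar phenotypes.
Y = [1.0, 1.2, 3.0, 3.1]
PI = [
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
]
D, F, S = score_statistic(Y, PI)
```

Because the projection matrix depends only on the null model, it can be computed once and reused at every tested position, which is what keeps the scan cheap.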

##### Parameter estimation:

The parameters in the model are estimated using average information–restricted maximum likelihood (AI–REML) (Johnson and Thompson 1995). The algorithm was programmed in R (R Development Core Team 2004) and set to converge when the log likelihood differed <10^{−4} between iterations. The derivatives of the log-likelihood function used in the algorithm arewhere with **V** given in (4b).

This estimation enables testing of whether the QTL allele effects are identical within founder lines or not by testing H_{0}: ρ = 1 *vs*. H_{1}: ρ < 1 using a likelihood-ratio (LR) test.

##### Identification of segregating alleles within founder lines:

When a QTL with ρ < 1 is detected, we test within which line, and in which individual, the alleles are segregating. First, we estimate separate correlations within each line, so that [following the example in (3)] we have

$$
\mathbf{G} = \tfrac{1}{2}\sigma_v^2
\begin{pmatrix}
(1-\rho_A)\mathbf{I}_2 + \rho_A\mathbf{J}_2 & \mathbf{0}\\
\mathbf{0} & (1-\rho_B)\mathbf{I}_6 + \rho_B\mathbf{J}_6
\end{pmatrix}
$$

where ρ_{A} is the correlation within founder line *A*, ρ_{B} is the correlation within founder line *B*, and **J**_{k} denotes a *k* × *k* matrix of ones. Furthermore, the individual base generation allele effects can be estimated. The least-squares estimates are obtained by constructing a design matrix **X**_{m} from the first *m* columns of the gametic IBD matrix **W**. The base generation allele effects β_{m} are then estimated from the linear model .

##### Calculation of significance thresholds:

The significance thresholds for the genome scan are calculated by means of permutation testing (Churchill and Doerge 1994; Good 2005). An important assumption of a permutation test is that the observations are exchangeable under the null hypothesis (Davison and Hinkley 1997, p. 143; Anderson and Ter Braak 2002) and we guarantee this by permuting residuals estimated from the null model **y** = **X**β + **a** + **e**. Replicates of the phenotypic data were simulated with **y*** = **X**β̂ + **â** + **ê***, where **ê*** is the vector of permuted residuals, and β̂ and **â** are the estimated fixed and random polygenic effects obtained from the null model. (In our analyses of the European Wild Boar × Large White F_{2} cross and the Red Jungle Fowl × White Leghorn F_{2} cross no polygenic effects were detected and we therefore used **â** = **0**.) For each replicate, the score statistic was calculated at every tested position (5 cM apart) along the genome using (4). The empirical distribution of the maximum score value from each replicate was used to obtain significance thresholds.
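The permutation procedure can be sketched in a few lines. The example below is an illustration under stated assumptions rather than the code used in the study: `permutation_threshold`, `toy_scan`, the fitted values, and the residuals are hypothetical, and `toy_scan` is only a stand-in for the score statistic evaluated at each tested position.

```python
import math
import random


def permutation_threshold(fitted, residuals, scan, n_perm=1000, alpha=0.05, seed=1):
    """Empirical (1 - alpha) threshold for the genome-wide maximum statistic.

    fitted    : fitted values from the null model (fixed + polygenic part)
    residuals : residuals from the null model, permuted in each replicate
    scan      : function mapping a phenotype vector to one statistic per position
    """
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_perm):
        permuted = list(residuals)
        rng.shuffle(permuted)
        y_star = [f + e for f, e in zip(fitted, permuted)]
        maxima.append(max(scan(y_star)))  # keep only the maximum over positions
    maxima.sort()
    index = min(n_perm - 1, max(0, math.ceil((1.0 - alpha) * n_perm) - 1))
    return maxima[index]


def toy_scan(y):
    """Hypothetical stand-in for a scan over three genome positions."""
    contrasts = (
        [1, 1, -1, -1, 1, -1],
        [1, -1, 1, -1, 1, -1],
        [1, 1, 1, -1, -1, -1],
    )
    return [sum(c * v for c, v in zip(contrast, y)) ** 2 / len(y)
            for contrast in contrasts]


FITTED = [10.0] * 6
RESIDUALS = [-1.2, 0.4, 0.9, -0.3, 0.7, -0.5]
THRESHOLD = permutation_threshold(FITTED, RESIDUALS, toy_scan, n_perm=200)
```

Taking the maximum over positions within each replicate is what makes the resulting threshold genomewide rather than pointwise.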

#### IBD-matrix estimation:

Several different IBD estimation programs are available. We use a deterministic algorithm based on Pong-Wong *et al*. (2001) to calculate the IBD matrices in (4b). The original algorithm by Pong-Wong *et al*. assumes that founder marker phases are unknown and does not enable general base generation structures. We modified the original algorithm to account for known phases of marker genotypes and consequently to allow user-specified base generation structures. The most likely phases of the marker genotypes in the base generation are estimated using a genetic algorithm that uses both marker and pedigree information as input. The resulting marker genotype phases were subsequently used as input in the IBD estimation algorithm (F. Besnier, B.-W. Kim and Ö. Carlborg, unpublished data).

#### Simulation setup for a single QTL:

We used simulations to study the distribution of ρ and variation of LR for different levels of fixation within lines. In the results, if not otherwise stated, LR = −2(*l*_{0} − *l*_{1}) with *l*_{1} equaling the likelihood of model (4) and *l*_{0} equaling the likelihood of the same model with . Furthermore, we wished to compare the power of the Haley–Knott regression with additive effects (Haley *et al*. 1994) to our more flexible method.

In the analyses of VC estimates and comparisons of LRs to the ones obtained from Haley–Knott regression, 100 replicates were simulated for each of the four cases described below with four founders and 800 F_{2} individuals, and a 20% QTL at a fully informative marker. The setup for these simulations gives expected values of ρ = 1.00, 0.68, 0.20, and 0.00 for cases 1–4, respectively.

In the power analyses, four parameters were varied to thoroughly evaluate the differences between FIA and Haley–Knott regression: number of founders in base generation, number of F_{2} individuals, marker information, and level of fixation within founder lines. The methods were compared by their power to detect a QTL for a given position at a 5% significance level.

##### Base generation size:

Two types of pedigrees were simulated with random mating: one with a small base generation (no. of founders = 4) and one with a large base generation (no. of founders = 50). These are referred to as *small base* and *large base*, respectively. The structure for the small base pedigree was designed to mimic the pedigree of a Red Jungle Fowl × White Leghorn F_{2} cross (Kerje *et al*. 2003) with one jungle fowl male mated to three leghorn females in the base generation. The large base generation structure pedigree consisted of 25 individuals in each line and resembles an F_{2} intercross between two divergently selected body-weight chicken lines described in Jacobsson *et al*. (2005). The two simulated base generation sizes were chosen as most experimental intercrosses are expected to have base generation sizes intermediate to these.

##### Number of F_{2} individuals:

The number of F_{2} individuals simulated was either 200 or 800. A biallelic QTL was simulated with a difference between the two QTL allele effects of √(2*h*^{2}), where *h*^{2} is the QTL heritability in an outbred population. For the pedigrees with 200 F_{2} individuals, *h*^{2} was set to 0.05, whereas *h*^{2} was set to 0.02 for the pedigrees with 800 F_{2} individuals. The phenotype of an F_{2} individual *i* was simulated with *y*_{i} = *v*_{i}^{p} + *v*_{i}^{m} + *e*_{i}, where *v*_{i}^{p} is the QTL allele effect on the paternally inherited chromosome and *v*_{i}^{m} is the QTL allele effect on the maternally inherited chromosome, and *e*_{i} is an i.i.d. normally distributed residual effect with a variance of 1 − *h*^{2}.
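The relation between the allele-effect difference and the QTL heritability can be verified with a short calculation. The sketch below assumes two equally frequent alleles in the outbred reference population and a total phenotypic variance of 1; the function names are hypothetical.

```python
import math
import random


def allele_effect_difference(h2):
    """Difference between the two allele effects giving QTL variance h2."""
    return math.sqrt(2.0 * h2)


def qtl_genotype_variance(h2):
    """Variance of the sum of two independent alleles, each +d/2 or -d/2
    with probability 1/2 (assumed equal allele frequencies)."""
    d = allele_effect_difference(h2)
    allele_variance = (d / 2.0) ** 2
    return 2.0 * allele_variance


def simulate_phenotype(paternal_effect, maternal_effect, h2, rng):
    """Phenotype = paternal allele effect + maternal allele effect + residual,
    with residual variance 1 - h2."""
    return paternal_effect + maternal_effect + rng.gauss(0.0, math.sqrt(1.0 - h2))


RNG = random.Random(2024)
D_SMALL = allele_effect_difference(0.05)
EXAMPLE_Y = simulate_phenotype(D_SMALL / 2.0, -D_SMALL / 2.0, 0.05, RNG)
```

On a scale where the total variance is 100, the same relation gives a difference of about 3.162 for a QTL variance of 5, which matches the value used in the epistasis simulations.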

##### Marker information:

A QTL was simulated either at a fully informative marker or in the middle of a 40-cM marker interval flanked by two fully informative markers.

#### Simulation setup for a QTL with epistatic interaction:

Using simulations, we evaluated how well our model estimates epistatic variance components and how the existence of epistasis influences the estimates of other variance components in the presence of QTL segregation. A small base generation and 800 F_{2} individuals were simulated for cases 1–4 (Table 1). A biallelic QTL was simulated with a difference between the two QTL allele effects of 3.162. An additional unlinked QTL was simulated with no main effect. The base alleles of this QTL were assumed to be unique and to be interacting with the former biallelic QTL. For every possible pairwise combination of alleles between the two QTL a value for the epistatic interaction was drawn from , where is the epistatic variance. The residual variance was 95 − , which gives a heritability for the main QTL effect of 5% in an outbred population. One hundred replicates were simulated for each of the four cases and for , 5, or 10, and a fully informative marker was assumed.

#### Analyses of experimental data:

##### European Wild Boar × Large White F_{2} cross:

To compare the properties of FIA to Haley–Knott regression when applied to real data, we analyzed data from a European Wild Boar × Large White cross, where the causal mutation underlying the studied trait has been detected (Lundström *et al*. 1995) and is known to be segregating in the founders. In this cross, two European Wild Boars were mated to eight Large White sows, producing 191 F_{2} offspring with measured genotypes and phenotypes. Twenty-two markers were genotyped on the studied chromosome 6 at 0.0, 8.6, 36.6, 49.7, 50.5, 62.9, 79.2, 80.4, 83.7, 84.1, 84.8, 90.6, 95.4, 100.7, 101.9, 115.9, 116.7, 119.0, 120.2, 124.0, 127.0, and 170.9 cM. In our analysis, we examined a meat quality trait (reflectance value, EEL, scored over a cross-section of the longissimus dorsi muscle), which is known to be affected by the halothane gene (Lundström *et al*. 1995) located at 80.4 cM. One of the founder boars is known to be heterozygous (*Hal^{N}/Hal^{n}*), whereas all other founders are homozygous for the wild-type allele (*Hal^{N}/Hal^{N}*). We performed a QTL scan on chromosome 6 using Haley–Knott regression and FIA and also explored whether FIA could identify which founder allele carried the mutated halothane allele (*Hal^{n}*). Following Knott *et al*. (1998), we included sex, litter, and slaughter weight as fixed effects in our analysis. The data are described in detail in Knott *et al*. (1998) and Andersson-Eklund *et al*. (1998).

##### Red Jungle Fowl × White Leghorn F_{2} cross:

In a Red Jungle Fowl × White Leghorn F_{2} cross, we performed a full-genome scan using FIA and Haley–Knott regression. In this pedigree, one Red Jungle Fowl male was mated to three White Leghorn females, producing 756 F_{2} offspring with measured genotypes and phenotypes. We used an updated marker map to those reported in Kerje *et al*. (2003), including 439 markers (L. Andersson, personal communication) covering chromosomes 1–28. We analyzed body weight at 200 days of age for which Kerje *et al*. (2003) found two 1% genomewide significant QTL on chromosome 1 (growth 1 at 68 cM and growth 2 at 420 cM) and 5% genomewide significant QTL on chromosome 5 (growth 8 at 21 cM) and chromosome 27 (growth 13 at 20 cM). Two additional 5% genomewide significant QTL were also found in this study, which had large and significant dominance effects. In the original publication, it was noted that there was a within-line QTL variation for growth 2 on the basis of indications from a heterogeneity test among the four largest F_{1} families, but it was not investigated how this influenced the results or within which founder line the QTL was segregating. We repeated the genome scan to explore how many of the previously detected QTL were segregating. Following Kerje *et al*. (2003), we included sex and batch as fixed effects in the model. The data are described in detail in Kerje *et al*. (2003).

## RESULTS

#### Simulations of a single QTL:

##### Estimates of within-line correlations and comparisons to LRs from Haley–Knott regression:

We compared the LRs from our VC model using REML with those obtained from Haley–Knott regression. LRs from Haley–Knott regression decrease rapidly when the level of segregation within lines increases (Table 2), whereas LRs from our model differed only marginally between the evaluated cases. The estimated within-line correlation (ρ) drops from one to zero as the level of fixation decreases in cases 1–4 (Table 2).

##### Power analysis:

There were no substantial differences in power between FIA and Haley–Knott regression when the QTL were fixed within lines (case 1 in Figure 1). The power of FIA was up to 10 times higher, however, when there was segregation within lines (Figure 1, case 4). For case 4, the difference between Haley–Knott regression and FIA was largest when the base generation was small and the number of F_{2} individuals was high (Figure 1a), *i.e*., when there were a large number of copies of each base generation allele in the F_{2}. The difference in power was highest when the QTL was located at the marker.

With a large base generation, the power of both Haley–Knott regression and FIA decreases significantly. For example, in case 4, Haley–Knott regression had no power at all to detect the segregating QTL (detection equal to the type I error rate of 5%). FIA, however, still had some power (18%) to detect the segregating QTL near a fully informative marker in the 800 F_{2} pedigree. Decreasing the number of copies of each base generation allele in the F_{2} increases the uncertainty of the estimated base generation structure, especially when the base is fully outbred or close to outbred. Therefore a decline in the power of FIA going from case 1 to 4, as shown in Figure 1, c and d, is expected.

#### Simulations of a QTL with epistatic interaction:

Both the main and the epistatic QTL variance were estimated satisfactorily with FIA (Table 3), although the average epistatic variance tended to be slightly overestimated for and because the distribution of estimates was highly skewed. Most estimates were, however, very close to the true value with the medians for being <0.02. The estimated residual variances were all close to the simulated ones with no bias detected in any of the simulated cases (results not shown). Thus, FIA gives accurate estimates of the QTL variances also when there are substantial epistatic effects. The estimated within-line correlation was, however, confounded with the epistatic variance, giving an increased for cases 2 and 3 when and . We may therefore expect that the estimated degree of segregation within lines is conservative in the presence of epistasis.

#### Analyses of experimental data:

##### European Wild Boar × Large White F_{2} cross:

In the scan of chromosome 6 for the meat quality trait EEL (reflectance score), FIA gave a maximum score value of 135.9 (LR = 33.3 from REML) and the Haley–Knott regression resulted in LR = 3.5. Both maxima were located at 80 cM (Figure 2). The LR for a segregating QTL *vs*. a fixed QTL was 26.0 (*P* < 0.01) with an estimated within-line correlation of . The 1% chromosomewise significance threshold was obtained using 10,000 permutations. The estimates for different within-line correlations were and , thus indicating that the QTL is segregating within the wild boars but not within the domestic pigs. The estimates of the allelic effects clearly show that only one of the wild boar alleles (allele no. 3) carries the halothane mutation (Figure 3).

##### Red Jungle Fowl × White Leghorn F_{2} intercross:

Six genomewide significant QTL were detected with both FIA and Haley–Knott regression (Table 4). FIA showed that at least two of the six QTL segregate within the founder lines. The segregating QTL are located on chromosomes 1 and 27. The REML estimates (chromosome 1, 488 cM: , ; chromosome 27, 21 cM: , ) and the estimated allele effects (Figures 4 and 5) indicate that both QTL are segregating among the White Leghorn base individuals but not within the Red Jungle Fowl male. Moreover, for both of the two segregating QTL, it was possible to separate the allele effects into three distinct groups (Figures 4 and 5), indicating that these QTL are triallelic.

An additional QTL on chromosome 5 was segregating according to the test for fixation (*P* = 0.01, Table 4), but the REML estimates for the model with different within-line correlations did not converge. The estimates of the base allele effects (Figure 6) were significantly different within the Jungle Fowl founder, whereas the differences within the White Leghorn founders were relatively small.

The QTL heritabilities (calculated as , with estimated as the residual variance without any random effect in the model) were between 0.04 and 0.10. The 5% genomewide significance threshold was calculated from 1000 replicates of permuted data.

## DISCUSSION

We have developed a new, general and powerful method for QTL detection in line-cross experiments, which does not rely on the assumption of fixation within founder lines. The method is based on VC theory and includes a parameter of within-line correlation to model segregation of QTL within the founder lines. It uses the score statistic for significance testing, which enables fast and robust genome scans based on empirical significance thresholds. The new method, FIA, is powerful in detecting fixed and segregating QTL and provides good estimates of epistasis among QTL. Our simulations showed that the reduction in power, compared to Haley–Knott regression, was only marginal for fixed QTL, whereas the gain in power was substantial for segregating QTL. Since the power to detect fixed QTL is high in F_{2} line crosses, the advantage of getting a major increase in the power to detect segregating QTL supersedes the minor reduction in the power to detect fixed QTL, and we therefore recommend FIA as a standard QTL method in F_{2} intercross designs.

A substantial gain in power, compared to Haley–Knott regression, was also shown in real data (Figure 2) from a European Wild Boar × Large White F_{2} cross, where the functional halothane gene affecting meat quality as well as the mutated base generation allele could be identified using FIA but not with Haley–Knott regression. In the analyses of body weight in the Red Jungle Fowl × White Leghorn F_{2} cross, there was a tendency for the estimated within-line correlation to decrease as the QTL effect decreased (Table 4). The reason for this decrease is not known. It might be due to chance, since the uncertainty of the estimate increases as the QTL effect decreases, or it might have an underlying genetic explanation, as QTL with large effects are more likely to have been fixed during selection for increased body weight.

The method was given for an F_{2} pedigree, but the extension to backcrosses, deeper pedigrees (*e.g*., advanced intercross lines; Darvasi and Soller 1995), and several founder lines (*e.g*., heterogeneous stocks; Talbot *et al*. 1999) is relatively straightforward since it is a VC-based method. The only principal modification needed is to construct suitable IBD matrices and, in the case of several founder lines, allow for several within-line correlations. An analysis of heterogeneous stocks, for instance, could enable estimation of correlations between founder line QTL effects and thereby improve QTL detection. Previous studies indicate that similarities between the founder lines vary between genome positions (Talbot *et al*. 1999). We might therefore expect to get different correlation estimates at different QTL positions. An analysis of this kind should improve our understanding of how the QTL were generated in the founder lines.

Following the modeling of Meuwissen and Goddard (2000), we included both the between- and the within-line effects as random. There are several advantages of using a pure random QTL-effects model. Including a potential QTL as a fixed effect in a linear model does not account for the extra sampling variation due to uncertain transmission of alleles through the pedigree (Xu 1998a; Feenstra *et al*. 2006), whereas a VC approach does (Rönnegård and Carlborg 2007). Moreover, Xu (1998b) pointed out that the interpretation in terms of heritabilities and average gene substitution effects is conceptually more logical in VC modeling than in models where the family-specific QTL effects are included as fixed effects. Furthermore, when the parameters of interest are modeled as a combination of fixed and random, model selection using a residual likelihood is not straightforward (Welham and Thompson 1997) since the residual likelihood is based on the residual values after fitting the fixed effects. Hence, standard derivative-based REML estimation algorithms cannot be used, and the method of Perez-Enciso and Varona (2000), therefore, maximizes the likelihood with a relatively slow nonderivative algorithm. The method of Perez-Enciso and Varona (2000) may be powerful but is too slow for practical use in large-scale QTL studies where empirical thresholds are desirable.

The nested within-half-sib family model developed by Knott *et al*. (1996) allows for QTL segregating within lines and is simple and fast. Its use in F_{2} intercrosses is, however, limited to pedigrees with a low degree of relationship between the F_{1} individuals. Moreover, the power decreases with the number of F_{1} males, because each male is assumed to have fixed and independent QTL effects.

In our model, we did not include dominance. Dominance is possible to implement in a VC model assuming an outbred base population (*e.g*., Xu 1996). Including dominance in our segregation model should be feasible by incorporating the two dominance IBD matrices (Xu 1996) obtained by letting the base generation alleles be all independent in the first matrix and fixed within lines in the second matrix. Dominance was not included here to keep the model and the analyses simple. FIA did not detect the two QTL with large dominance effects published in Kerje *et al*. (2003) and our intention is to develop the model to include dominance in the future.

Our analysis of additive-by-additive epistasis showed that both the variance component of the main QTL effect and the epistatic effect could be adequately estimated. The within-line correlation was, however, overestimated when epistasis was included. This bias might be explained by the fact that we did not include the within-line correlation in the interaction part of the model. We plan to include within-line correlation in the modeling of epistasis, which will require a REML algorithm that allows a nonlinear VC model since the parameter ρ is included both in the variance component of the main effect and the epistatic effect.

We used a common correlation within lines in the genome-scan model. A possible extension would be to use a model with different correlations within the two founder lines. We do, however, recommend that the number of parameters (and the degrees of freedom used) be kept as small as possible, so as not to decrease the power to detect fixed QTL. For the same reason, we do not expect that different QTL variances in the two founder lines would improve the model.

The REML estimates for different within-line correlations did not converge for the QTL on chromosome 5. Convergence problems in REML are common when many VCs are included and our model with different within-line correlations includes three different VCs for the QTL effect plus the residual variance. The observation that the QTL is likely to be segregating only within the jungle fowl founder line (Figure 6), consisting of a single individual and only two alleles, can also be a reason why the model did not converge. One of the great advantages of using the score statistic in the genome scan is that problems of convergence are completely avoided, since the calculation of the score statistic is noniterative.
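The noniterative character of a score statistic can be illustrated with a deliberately simplified sketch. It assumes a single variance component, an intercept-only null model, and the ML (not REML) score evaluated at the null, so it is not the FIA statistic itself, which involves both between- and within-line components. The function names and the toy covariance matrix are hypothetical.

```python
import random


def score_statistic(y, pi):
    """ML score for one variance component, evaluated at the null (component = 0).

    y  : list of phenotypes (intercept-only null model is an assumption)
    pi : n x n covariance structure, e.g., an IBD matrix, as a list of lists
    """
    n = len(y)
    mean = sum(y) / n
    r = [v - mean for v in y]                  # residuals under the null model
    s2 = sum(v * v for v in r) / n             # closed-form ML residual variance
    quad = sum(r[i] * pi[i][j] * r[j] for i in range(n) for j in range(n))
    trace = sum(pi[i][i] for i in range(n))
    return 0.5 * (quad / s2 ** 2 - trace / s2)  # computed in one pass, no iteration


def permutation_p_value(y, pi, n_perm=200, seed=1):
    """Empirical p-value: share of permuted statistics >= the observed one."""
    rng = random.Random(seed)
    observed = score_statistic(y, pi)
    hits = 0
    for _ in range(n_perm):
        shuffled = y[:]
        rng.shuffle(shuffled)                  # break the phenotype-genotype link
        if score_statistic(shuffled, pi) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy example: two pairs of individuals sharing alleles within each pair.
toy_pi = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
toy_y = [1.0, 1.0, 5.0, 5.0]
print(score_statistic(toy_y, toy_pi))
print(permutation_p_value(toy_y, toy_pi))
```

Because every quantity is available in closed form, each permutation costs only one quadratic form, which is what makes empirical thresholds affordable in this simplified setting.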

The computational requirements of FIA are higher than those of Haley–Knott regression, but the computation time is still low. For the simulated pedigree with 800 F_{2} individuals, the computation time at one position for Haley–Knott regression was <1 sec, whereas the calculations of our score statistic took <10 sec (on a standard laptop computer with a 1.33-GHz PowerPC G4 processor). Including epistasis (Equation 5) increased the computation time to <20 sec. Our implementation of the score statistic in R was not numerically optimized, and we may therefore expect the computation time to be reduced even further in the future.

When generating F_{2} pedigrees with large base generations for QTL analyses, it is plausible that several base generation alleles will be represented by only a few or no copies in the F_{2} generation. Hence, a substantial proportion of the genetic information about segregation among the founders cannot be utilized. Our results show that the power to detect QTL segregating within lines increases when there are few F_{0} and many F_{2} individuals, since there will be more copies of each base generation allele among the F_{2} individuals. Increasing the number of founders will, however, increase the chance of having alleles with different effects if they are not fixed within lines. Our simulations were not designed to recommend an optimum number of founders to be used in F_{2} intercrosses, but the results showed a clear increase in power to detect QTL segregating within lines when the ratio of F_{2} to F_{0} individuals was increased from 16 to 50 (Figure 1). On the basis of these results, a rough guideline would be to use <20 founders for an intercross of 1000 F_{2} individuals.
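The arithmetic behind this guideline can be sketched as follows. The sketch assumes that all founders contribute equally to the F_{2} generation, so that the expected number of copies of each base generation allele equals the ratio of F_{2} to F_{0} individuals. The function names are hypothetical.

```python
def expected_copies_per_founder_allele(n_f2, n_f0):
    """Expected F2 copies of one founder allele under equal founder contributions.

    The F2 generation holds 2 * n_f2 allele copies at a locus, spread over
    the 2 * n_f0 alleles carried by the founders.
    """
    return (2 * n_f2) / (2 * n_f0)


def founders_for_ratio(n_f2, ratio):
    """Number of founders that gives the requested F2-to-F0 ratio."""
    return n_f2 / ratio


# The two ratios compared in the simulations, applied to 1000 F2 individuals.
for ratio in (16, 50):
    n_f0 = founders_for_ratio(1000, ratio)
    copies = expected_copies_per_founder_allele(1000, n_f0)
    print(ratio, n_f0, copies)
```

With 1000 F_{2} individuals, a ratio of 50 corresponds to 20 founders, which is where the bound in the guideline comes from.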

A major advantage of our model is that it is quite general and easy to implement. We therefore expect that the method will have large practical importance for future QTL analyses of line crosses. The information that the method gives about which QTL are likely to segregate within lines, the number of alleles having different effects, and which founders are likely to carry these alleles will be useful for fine mapping of QTL and detection of functional genes. The ability to obtain this new information in a single analysis with a method that has the same, or higher, power as the most frequently used method in all evaluated scenarios indicates that it will be the method of choice in the future. For more complex models with, *e.g*., dominance and linked QTL, further investigation is required.

## APPENDIX: DERIVING THE THEORETICAL VALUES OF THE VARIANCE COMPONENTS FOR CASES 1–4

The theoretical values of the variance components for cases 1–4 can be derived using Henderson's (1953) method 1. It is important to keep in mind that the estimated variances in a VC model are measures of the variances in the population that the founders were taken from. If the *m* founders are outbred, the 2*m* QTL alleles are independently sampled from a common metapopulation, whereas if the two founder lines are fixed then only two QTL alleles have been drawn from the common metapopulation of alleles. Hence, a VC QTL model estimates the QTL variance of this metapopulation.

Let *y* be the vector of allele effects in the founders and *a* the simulated QTL allele effect. Then for the small base population with four founders we have for cases 1, 2, 3, and 4, respectively,

Let **Z** be the incidence matrix relating the founder allele effects to lines. Then for cases 1, 2, and 4 we have and for case 3

For simplicity, we assume that the only fixed effect we have is the population mean, which is also the case in our simulations. Let θ be the parameter vector θ = (μ, σ_{b}^{2}, σ_{w}^{2})′, where μ is the mean of the population from which the QTL alleles have been sampled, σ_{b}^{2} is the variance between lines, and σ_{w}^{2} is the within-line variance. The genotypic QTL variance is then given by 2(σ_{b}^{2} + σ_{w}^{2}) and the within-line correlation is given by ρ = σ_{b}^{2}/(σ_{b}^{2} + σ_{w}^{2}).

The theoretical value for the genotypic variance is easy to obtain intuitively for case 1. In case 1 the two random effects are known given *y* above, and since they have been sampled from a common population of allelic effects, the estimated variance of this population is the sample variance with *N* = 2, which is equal to *a*^{2}/2, and the genotypic variance is twice the allelic variance. Hence, the genotypic variance is *a*^{2} for case 1.

For cases 2–4, it is more difficult to derive the expectation of θ intuitively. The expectation of θ may then be obtained from Henderson's method 1 as where where *n*_{0} = 8 and **J** is an *n*_{0} × *n*_{0} matrix of ones, where , and .
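As a numerical check on the case 1 value, the calculation can be sketched for a balanced one-way layout, where Henderson's method 1 coincides with the usual ANOVA estimators. The vector of allele effects used here (four alleles at +*a*/2 in one line and four at −*a*/2 in the other, with *a* = 1) is an assumption chosen so that the two lines differ by *a*; it is not taken from the appendix. The function names are hypothetical, and the sketch does not cover unbalanced layouts.

```python
def henderson_method1_balanced(groups):
    """Between- and within-line variance estimates for a balanced one-way layout.

    groups: list of equally sized lists of allele effects, one list per line.
    """
    k = len(groups)                  # number of lines
    n = len(groups[0])               # alleles per line
    n0 = k * n                       # total number of founder alleles
    total = sum(sum(g) for g in groups)

    # Uncorrected sums of squares used by method 1
    t_total = sum(v * v for g in groups for v in g)
    t_lines = sum(sum(g) ** 2 / n for g in groups)
    t_mean = total ** 2 / n0

    ms_between = (t_lines - t_mean) / (k - 1)
    ms_within = (t_total - t_lines) / (n0 - k)

    var_between = (ms_between - ms_within) / n
    var_within = ms_within
    return var_between, var_within


def genotypic_variance(var_between, var_within):
    """Genotypic variance as twice the total allelic variance."""
    return 2 * (var_between + var_within)


def within_line_correlation(var_between, var_within):
    """Share of the allelic variance that lies between lines."""
    return var_between / (var_between + var_within)


a = 1.0
case1 = [[a / 2] * 4, [-a / 2] * 4]   # assumed allele effects, fixed within lines
vb, vw = henderson_method1_balanced(case1)
print(vb, vw, genotypic_variance(vb, vw), within_line_correlation(vb, vw))
# prints: 0.5 0.0 1.0 1.0
```

Under these assumptions the between-line variance is *a*^{2}/2, the within-line variance is zero, and the genotypic variance is *a*^{2}, in agreement with the value stated for case 1.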

## Footnotes

^{2}*Present address:* Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, SE-75007 Uppsala, Sweden.

Communicating editor: L. McIntyre

- Received October 9, 2007.
- Accepted February 9, 2008.

- Copyright © 2008 by the Genetics Society of America