Originally published as Genetics Published Articles Ahead of Print on December 6, 2006.

Genetics, Vol. 175, 1267-1274, March 2007, Copyright © 2007
doi:10.1534/genetics.106.064063

Haplotype Probabilities for Multiple-Strain Recombinant Inbred Lines

* Research Unit Genetics and Biometry, Research Institute for the Biology of Farm Animals (FBN), Dummerstorf, Germany 18196 and † Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland 21205

1 Corresponding author: Department of Biostatistics, Johns Hopkins University, 615 N. Wolfe St., Baltimore, MD 21205–2179.
E-mail: kbroman@jhsph.edu

Manuscript received July 28, 2006. Accepted for publication November 26, 2006.

ABSTRACT

Recombinant inbred lines (RIL) derived from multiple inbred strains can serve as a powerful resource for the genetic dissection of complex traits. The use of such multiple-strain RIL requires a detailed knowledge of the haplotype structure in such lines. BROMAN (2005) derived the two- and three-point haplotype probabilities for $2^n$-way RIL; the former required hefty computation to infer the symbolic results, and the latter were strictly numerical. We describe a simpler approach for the calculation of these probabilities, which allowed us to derive the symbolic form of the three-point haplotype probabilities. We also extend the two-point results for the case of additional generations of intermating, including the case of $2^n$-way intermated recombinant inbred populations (IRIP).


RECOMBINANT inbred lines (RIL) can serve as powerful tools for genetic mapping. An RIL is formed by crossing two inbred strains followed by repeated matings among relatives (e.g., selfing or sibling mating) to create a new inbred line whose genome is a mosaic of the parental genomes. As each RIL is an inbred strain and so can be propagated eternally, a panel of RIL has a number of advantages for genetic mapping: one need genotype each strain only once; one can phenotype multiple individuals from each strain to reduce individual, environmental, and measurement variability; multiple invasive phenotypes can be obtained on the same set of genomes, including measurements on a single invasive phenotype over time or in different environments; and, as the breakpoints in RIL are more dense than those that occur in any one meiosis, greater mapping resolution can be achieved.

Members of the Complex Trait Consortium have recently begun the development of a large panel of eight-way RIL in the mouse (THREADGILL et al. 2002; COMPLEX TRAIT CONSORTIUM 2004). An eight-way RIL is formed by intermating eight parental inbred strains, followed by repeated selfing or sibling mating to produce a new inbred line whose genome is a mosaic of the eight parental strains. (Figure 1, A and B, illustrates the production of eight-way RIL by selfing and sibling mating, respectively.) This panel will serve as a valuable community resource for mapping the loci that contribute to complex phenotypes in the mouse.


FIGURE 1.—

The production of eight-way RIL by selfing (A) and by sibling mating (B) and of eight-way RIL+ by selfing (C) and by sibling mating (D).

 
In general, one might consider the development of a panel of $2^n$-way RIL, mixing the genomes of $2^n$ different inbred lines. One might also consider an additional generation of interbreeding, preceding the process of inbreeding, to increase the density of breakpoints on the final RIL; we call this the RIL+ design. In $2^n$-way RIL, inbreeding begins with individuals at generation $n$; in $2^n$-way RIL+, two $G_n$ individuals from independent "funnels" (with initial crosses in the same order, but with no shared recombination events) are crossed, and inbreeding begins at generation $n + 1$. The production of eight-way RIL+ by selfing and sibling mating is shown in Figure 1, C and D, respectively. Note that in eight-way RIL+, one may mate cousins at generation $G_2$, as these individuals have no shared recombination events. For higher-order RIL+, a more extensive set of matings will be required to ensure that the individuals at generation $G_{n-1}$ exhibit independent recombination events.

Further, it has been proposed to include some number of generations of random mating prior to inbreeding, a design that has been called an intermated recombinant inbred population (IRIP). Multiple designs for the formation of $2^n$-way IRIP might be considered. First, one might create an unlimited population of individuals at generation $n$, each from a funnel having initial crosses in the same order, but with such crosses completely independent between individuals. Second, the individuals at generation $n$ might each come from an independent, random funnel, with the order of the initial crosses completely randomized, though with all $2^n$ parental strains represented. We focus on the latter design, as it requires the formation of a single large population from which a panel of IRIP may be developed. The former design would require separate populations of intermating individuals for each line to be formed. Note that the use of random funnels makes the IRIP design distinct from the RIL+ design, which uses a fixed funnel.

The use of multiple-strain RIL panels will require a detailed understanding of the haplotype structure in such lines. At any given genomic position, an RIL will be homozygous for one of the $2^n$ possible parental alleles; a haplotype is the set of alleles at linked loci along a chromosome. We seek to understand the pattern of exchanges among the parental alleles along an RIL chromosome. In particular, the decision of whether to include additional generations of intermating should be based upon an understanding of the additional mapping precision that such intermating will provide.

The seminal article of HALDANE and WADDINGTON (1931) provided the basic results for the standard two-way RIL by selfing or by sibling mating: they derived both two- and three-point haplotype probabilities (i.e., the probabilities for all possible two- and three-locus haplotypes) for such two-way RIL. WINKLER et al. (2003) calculated the two-point haplotype probabilities for the case of two-way IRIP. BROMAN (2005) derived the two- and three-point haplotype probabilities for four- and eight-way RIL, though with enormous computational effort. Only numerical results were provided for the three-point probabilities.

Here, we improve on the work of HALDANE and WADDINGTON (1931) and BROMAN (2005). We describe a simpler approach for the calculation of two- and three-point probabilities in $2^n$-way RIL, which allowed us to determine exact formulas for the three-point probabilities. We also extend the results on two-point haplotype probabilities for the case of $2^n$-way RIL+ and $2^n$-way IRIP. Our results on the map expansion obtained in each design will provide a useful guide to investigators considering the development of $2^n$-way RIL and considering whether additional generations of intermating should be performed.


TWO POINTS
Here we derive the two-point haplotype probabilities on the fixed chromosome in $2^n$-way RIL, RIL+, and IRIP. We consider both selfing and sibling mating, and we focus on the autosome. (Results for the X chromosome may be derived in a similar manner. However, because the X chromosome recombines in females but not in males, different alleles have different numbers of opportunities for recombination before they arrive at the four-chromosome bottleneck, and so even single-point results are difficult to write down for the general $2^n$-way case.) We also derive the quantity analogous to the recombination fraction, but for the fixed RIL chromosome. Throughout, $r$ denotes the recombination fraction between the two loci at meiosis. Note that in the case of sibling mating, we generally assume $n \ge 2$ (that is, $2^n \ge 4$).

Selfing:

Two-way RIL:

HALDANE and WADDINGTON (1931) derived the two-locus haplotype probabilities for two-way RIL by selfing. Here, we describe a simpler solution to the problem.

Let $W_1W_2 \mid X_1X_2$ denote the haplotypes for a $G_k$ individual (for $k > 0$), with subscripts denoting the alleles at the two loci. Let $p_1$ denote the probability that the $W_1W_2$ haplotype goes on to be fixed, and let $p_2$ denote the probability that the $W_1X_2$ haplotype goes on to be fixed. By symmetry, $\Pr(X_1X_2 \text{ fixed}) = \Pr(W_1W_2 \text{ fixed})$ and $\Pr(X_1W_2 \text{ fixed}) = \Pr(W_1X_2 \text{ fixed})$, and so $2p_1 + 2p_2 = 1$.

Further, if we condition on the first step, we have $p_1 = 2[(1 - r)/2]\,p_1 + 2(1/2)^2\,p_2$. That is, the probability that the $W_1W_2$ haplotype is fixed is the probability that it is transmitted intact to the next generation (and this can occur in two ways) and then becomes fixed, plus the probability that $W_1$ is transmitted to one gamete and $W_2$ is transmitted to the other gamete (each with probability 1/2) and then these are brought together at fixation (and this can occur in two ways). Substituting $p_2 = (1 - 2p_1)/2$, we find $p_1 = 1/[2(1 + 2r)]$ and $p_2 = r/(1 + 2r)$.

Finally, note that in the $G_1$ generation, $W_i \equiv A$ and $X_i \equiv B$. Thus, in a two-way RIL by selfing, $\Pr(AA \text{ fixed}) = p_1 = 1/[2(1 + 2r)]$ and $\Pr(AB \text{ fixed}) = p_2 = r/(1 + 2r)$. These are the haplotype probabilities derived by HALDANE and WADDINGTON (1931).
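The two-way selfing result can be checked independently of the first-step argument by iterating the exact genotype distribution over generations of selfing. The following is a minimal sketch, not the authors' method: the function names `gametes` and `fixation_by_selfing` and the choice of 80 generations are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product

def gametes(h1, h2, r):
    """Two-locus gamete distribution from an individual with haplotypes h1 | h2."""
    return [
        (h1, (1 - r) / 2),            # nonrecombinant copy of h1
        (h2, (1 - r) / 2),            # nonrecombinant copy of h2
        ((h1[0], h2[1]), r / 2),      # recombinant: locus 1 from h1, locus 2 from h2
        ((h2[0], h1[1]), r / 2),      # recombinant: locus 1 from h2, locus 2 from h1
    ]

def fixation_by_selfing(r, generations=80):
    """Probability of each fixed haplotype, starting from the G1 individual AA | BB."""
    dist = {(("A", "A"), ("B", "B")): 1.0}
    for _ in range(generations):
        nxt = defaultdict(float)
        for (h1, h2), p in dist.items():
            g = gametes(h1, h2, r)
            # a selfed offspring receives two independent gametes from the same parent
            for (ga, pa), (gb, pb) in product(g, g):
                nxt[(ga, gb)] += p * pa * pb
        dist = nxt
    fixed = defaultdict(float)
    for (h1, h2), p in dist.items():
        if h1 == h2:                  # both chromosomes identical at both loci
            fixed[h1] += p
    return dict(fixed)

if __name__ == "__main__":
    r = 0.1
    probs = fixation_by_selfing(r)
    print(probs[("A", "A")], 1 / (2 * (1 + 2 * r)))
    print(probs[("A", "B")], r / (1 + 2 * r))
```

After enough generations the residual unfixed probability is negligible, so the two printed pairs should agree to many decimal places.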

$2^n$-way RIL:

The results for higher-order RIL by selfing may be immediately derived from the results for two-way RIL, due to the two-chromosome bottleneck at the start of inbreeding. We consider the generation of $2^n$-way RIL via a funnel, in which the genomes are brought together as rapidly as possible, followed immediately by inbreeding (see Figure 1A). In the following, we assume $n \ge 2$. Let $L_1, L_2, \ldots, L_{2^n}$ denote the parental lines, and consider the cross $[(L_1 \times L_2) \times (L_3 \times L_4)] \times \cdots$. We also use $L_i$ to denote the allele from that line.

Let $W_1W_2 \mid X_1X_2$ denote the alleles on the two chromosomes in generation $n$, at which inbreeding begins. We must have $W_1, W_2 \in \{L_1, \ldots, L_{2^{n-1}}\}$ and $X_1, X_2 \in \{L_{2^{n-1}+1}, \ldots, L_{2^n}\}$.

To derive the haplotype probabilities for the fixed $2^n$-way RIL chromosome, we first determine the haplotype probabilities at the start of inbreeding and then combine them with the results for two-way RIL. We begin with the calculation of the haplotype probabilities at the start of inbreeding. We consider the case that the $L_1$ allele will be fixed at the first locus; other probabilities follow by symmetry.

To obtain $\Pr(W_1 = W_2 = L_1)$, note that there must be no recombination at any of the initial mixing generations, and that the $L_1L_1$ haplotype must be transmitted at each generation. Thus we see that $\Pr(W_1 = W_2 = L_1) = [(1 - r)/2]^{n-1}$. Similarly, $\Pr(W_1 = L_1, W_2 = L_2) = (r/2)[(1 - r)/2]^{n-2}$, as the two loci must recombine at the first generation but not at subsequent generations, and the $L_1$ allele at the first locus must always be transmitted. Finally, for $i = 0, 1, \ldots, n - 2$ and $j = 1, \ldots, 2^i$, we have $\Pr(W_1 = L_1, W_2 = L_{2^i+j}) = (1/2)^{2i}(r/2)[(1 - r)/2]^{n-2-i}$.

We now proceed to calculate the haplotype probabilities for the fixed RIL chromosome. The probability that the fixed haplotype is $L_1L_j$, for $j = 1, \ldots, 2^{n-1}$, is simply the probability $\Pr(W_1 = L_1, W_2 = L_j)$ multiplied by the probability that the $W_1W_2$ haplotype gets fixed. For $k = 2^{n-1} + 1, \ldots, 2^n$, the probability that the fixed haplotype is $L_1L_k$ is $\Pr(W_1 = L_1, X_2 = L_k) = (1/2)^{2n-2}$, multiplied by the probability that the $W_1X_2$ haplotype gets fixed. Thus the two-locus haplotype probabilities in a $2^n$-way RIL by selfing are as follows:

$$
\Pr(L_1L_m) =
\begin{cases}
\dfrac{(1-r)^{n-1}}{2^n(1+2r)} & m = 1,\\[6pt]
\dfrac{r(1-r)^{n-2-i}}{2^{n+i}(1+2r)} & m = 2^i + j,\ i = 0,\ldots,n-2,\ j = 1,\ldots,2^i,\\[6pt]
\dfrac{r}{2^{2n-2}(1+2r)} & m = 2^{n-1}+1,\ldots,2^n.
\end{cases}
\tag{1}
$$

The probability that the RIL chromosome is fixed at different alleles at the two loci (the quantity analogous to the recombination fraction) is then $R = 1 - 2^n\Pr(L_1L_1) = 1 - (1 - r)^{n-1}/(1 + 2r)$. The map expansion in a $2^n$-way RIL by selfing is then $dR/dr|_{r=0} = n + 1$. (For a short proof of the fact that $dR/dr|_{r=0}$ corresponds to the map expansion, see the APPENDIX.)
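The construction above can be traced step by step in exact arithmetic. The sketch below is illustrative only (the function names `ril_selfing_two_point` and `R_selfing` are assumptions, not part of the original work): it multiplies the start-of-inbreeding probabilities by the two-way fixation probabilities and then forms $R$.

```python
from fractions import Fraction

def ril_selfing_two_point(n, r):
    """Pr(L1 L_m fixed), m = 1..2^n, for a 2^n-way RIL by selfing (n >= 2)."""
    r = Fraction(r)
    p1 = 1 / (2 * (1 + 2 * r))       # Pr(W1W2 fixed), from the two-way result
    p2 = r / (1 + 2 * r)             # Pr(W1X2 fixed), from the two-way result
    keep = (1 - r) / 2               # intact transmission of one given haplotype
    probs = {1: keep ** (n - 1) * p1}
    for i in range(n - 1):           # i = 0, ..., n-2
        # L1 and L_{2^i+j} each survive i meioses, recombine once, then stay intact
        start = Fraction(1, 4) ** i * (r / 2) * keep ** (n - 2 - i)
        for j in range(1, 2 ** i + 1):
            probs[2 ** i + j] = start * p1
    for k in range(2 ** (n - 1) + 1, 2 ** n + 1):
        # alleles on the two different chromosomes at the start of inbreeding
        probs[k] = Fraction(1, 2) ** (2 * n - 2) * p2
    return probs

def R_selfing(n, r):
    """Probability that the fixed chromosome carries different alleles at the two loci."""
    return 1 - 2 ** n * ril_selfing_two_point(n, r)[1]

if __name__ == "__main__":
    n, r = 3, Fraction(1, 10)
    probs = ril_selfing_two_point(n, r)
    print(sum(probs.values()))               # total probability of L1 at the first locus
    h = Fraction(1, 10 ** 8)
    print(float(R_selfing(n, h) / h))        # difference quotient approximating dR/dr at 0
```

Using `Fraction` keeps every intermediate value exact, so the check that the probabilities sum to $2^{-n}$ is not affected by rounding.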

$2^n$-way RIL+:

In $2^n$-way RIL+ by selfing, one crosses two $G_n$ individuals, generated from independent funnels, and then performs repeated selfing starting at generation $n + 1$ (see Figure 1C). To calculate the two-locus haplotype probabilities for this case, we need to revise the haplotype probabilities for the generation in which inbreeding begins. We now have $W_1, W_2, X_1, X_2 \in \{L_1, \ldots, L_{2^n}\}$. These haplotype probabilities use those from the formation of $2^n$-way RIL, but with an additional generation of recombination.

We have $\Pr(W_1 = W_2 = L_1) = [(1 - r)/2]^n$. For $i = 0, \ldots, n - 1$ and $j = 1, \ldots, 2^i$, $\Pr(W_1 = L_1, W_2 = L_{2^i+j}) = (1/2)^{2i}(r/2)[(1 - r)/2]^{n-1-i}$.

Calculation of the haplotype probabilities on the fixed RIL+ chromosome proceeds as before, but a particular allele may come from either chromosome. Thus, for example, the probability that the RIL+ is fixed at $L_1L_1$ is $\Pr(W_1 = W_2 = L_1)$ times the chance that the $W_1W_2$ haplotype gets fixed, plus $\Pr(X_1 = X_2 = L_1)$ times the chance that the $X_1X_2$ haplotype gets fixed, plus $\Pr(W_1 = X_2 = L_1)$ times the chance that the $W_1X_2$ haplotype gets fixed, plus $\Pr(X_1 = W_2 = L_1)$ times the chance that the $X_1W_2$ haplotype gets fixed. This gives $\Pr(L_1L_1) = 2[(1 - r)/2]^n/[2(1 + 2r)] + 2(1/2)^{2n}[r/(1 + 2r)]$. The other cases are similar, and so the two-locus haplotype probabilities in a $2^n$-way RIL+ by selfing are as follows:

$$
\Pr(L_1L_m) =
\begin{cases}
\dfrac{(1-r)^{n} + r\,2^{1-n}}{2^n(1+2r)} & m = 1,\\[6pt]
\dfrac{r\,[\,2^{-i}(1-r)^{n-1-i} + 2^{1-n}\,]}{2^{n}(1+2r)} & m = 2^i + j,\ i = 0,\ldots,n-1,\ j = 1,\ldots,2^i.
\end{cases}
\tag{2}
$$

The probability that the RIL+ chromosome is fixed for different alleles is then $R = 1 - 2^n\Pr(L_1L_1) = 1 - [(1 - r)^n + r\,2^{1-n}]/(1 + 2r)$. The map expansion for the $2^n$-way RIL+ design by selfing is then $n + 2 - 2^{1-n}$.
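As a quick numerical check of the RIL+ map expansion, one can sum the four routes to fixation of $L_1L_1$ described above and take a difference quotient of $R$ near $r = 0$. This is a sketch under the stated formulas; the function names and the step size `h` are illustrative assumptions.

```python
from fractions import Fraction

def ril_plus_selfing_same_allele(n, r):
    """Pr(L1L1 fixed) in a 2^n-way RIL+ by selfing, summed over the four routes."""
    r = Fraction(r)
    p1 = 1 / (2 * (1 + 2 * r))            # Pr(W1W2 fixed) = Pr(X1X2 fixed)
    p2 = r / (1 + 2 * r)                  # Pr(W1X2 fixed) = Pr(X1W2 fixed)
    intact = ((1 - r) / 2) ** n           # Pr(W1 = W2 = L1) = Pr(X1 = X2 = L1)
    across = Fraction(1, 2) ** (2 * n)    # Pr(W1 = X2 = L1) = Pr(X1 = W2 = L1)
    return 2 * intact * p1 + 2 * across * p2

def map_expansion_estimate(n, h=Fraction(1, 10 ** 8)):
    """Forward-difference estimate of dR/dr at r = 0, with R = 1 - 2^n Pr(L1L1)."""
    def R(r):
        return 1 - 2 ** n * ril_plus_selfing_same_allele(n, r)
    return (R(h) - R(0)) / h

if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, float(map_expansion_estimate(n)), n + 2 - 2.0 ** (1 - n))
```

The two printed columns should agree up to the small truncation error of the forward difference.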

$2^n$-way IRIP($s$):

In the formation of $2^n$-way IRIP($s$) by selfing, one generates an unlimited population of $G_n$ individuals from random funnels, intermates them for $s$ generations, and then inbreeds, by selfing, a random individual from the $n + s$ generation.

At generation $G_n$, in the case of the funnel $[(L_1 \times L_2) \times (L_3 \times L_4)] \times \cdots$, the haplotype probabilities for the first chromosome are $\Pr(W_1 = W_2 = L_1) = [(1 - r)/2]^{n-1}$ and $\Pr(W_1 = L_1, W_2 = L_{2^i+j}) = (1/2)^{2i}(r/2)[(1 - r)/2]^{n-2-i}$ for $i = 0, \ldots, n - 2$, $j = 1, \ldots, 2^i$. The other chromosome has a similar structure, but for the other alleles.

In the IRIP, we consider individuals from random funnels. That is, each individual comes from a cross of the form $[(L_{\pi(1)} \times L_{\pi(2)}) \times (L_{\pi(3)} \times L_{\pi(4)})] \times \cdots$, where $(\pi(1), \ldots, \pi(2^n))$ is a random permutation of $(1, 2, \ldots, 2^n)$. The haplotype probabilities for a random individual at generation $G_n$ then become $\Pr(W_1 = W_2 = L_1) = (1/2)[(1 - r)/2]^{n-1} = (1 - r)^{n-1}/2^n$, and, for $j \ne 1$, $\Pr(W_1 = L_1, W_2 = L_j) = [1 - (1 - r)^{n-1}]/[2^n(2^n - 1)]$. We thus have complete symmetry among the $2^n$ alleles.

A random $G_{n+1}$ individual is formed from a cross between two random $G_n$ individuals. Using the results above, we then have the haplotype probabilities $\Pr(W_1 = W_2 = L_1) = [(1 - r)/2]^n$. We can use the symmetry of alleles to conclude that, for $j \ne 1$, $\Pr(W_1 = L_1, W_2 = L_j) = [1 - 2^n\Pr(W_1 = W_2 = L_1)]/[2^n(2^n - 1)] = [1 - (1 - r)^n]/[2^n(2^n - 1)]$.

The disequilibrium parameter at $G_{n+1}$ is $D = \Pr(W_1 = W_2 = L_1) - [\Pr(W_1 = L_1)]^2 = [(1 - r)/2]^n - 1/2^{2n}$. The disequilibrium parameter at $G_{n+s}$ (for $s \ge 1$), after $s - 1$ additional generations of random mating, is $D(1 - r)^{s-1}$, and so at this generation $\Pr(W_1 = W_2 = L_1) = D(1 - r)^{s-1} + 1/2^{2n} = (1 - r)^{n+s-1}/2^n + [1 - (1 - r)^{s-1}]/2^{2n}$.
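The geometric decay of the disequilibrium can be illustrated by iterating the haplotype frequency one generation of random mating at a time and comparing with the closed form. This is a minimal sketch assuming an infinite random-mating population, as in the text; the function names are hypothetical.

```python
from fractions import Fraction

def irip_same_allele_prob(n, s, r):
    """Pr(W1 = W2 = L1) on a random chromosome at generation G_{n+s}, s >= 1."""
    r = Fraction(r)
    f = Fraction(1, 2 ** n)                 # single-locus frequency of L1
    prob = ((1 - r) / 2) ** n               # value at G_{n+1}
    for _ in range(s - 1):                  # s - 1 further generations of random mating
        # a gamete is nonrecombinant (prob 1 - r) or joins two independent alleles
        prob = (1 - r) * prob + r * f * f
    return prob

def closed_form(n, s, r):
    """(1 - r)^(n+s-1) / 2^n + [1 - (1 - r)^(s-1)] / 2^(2n)."""
    r = Fraction(r)
    return (1 - r) ** (n + s - 1) / 2 ** n + (1 - (1 - r) ** (s - 1)) / 4 ** n

if __name__ == "__main__":
    r = Fraction(1, 7)
    for n in (2, 3):
        for s in (1, 2, 5):
            print(n, s, irip_same_allele_prob(n, s, r) == closed_form(n, s, r))
```

Each iteration multiplies the disequilibrium by $(1 - r)$, which is why the recursion and the closed form agree exactly.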

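For $j \ne 1$, we again have, by symmetry, $\Pr(W_1 = L_1, W_2 = L_j) = [1 - 2^n\Pr(W_1 = W_2 = L_1)]/[2^n(2^n - 1)]$, which we do not write out.

Finally, we arrive at the haplotype probabilities on the fixed $2^n$-way IRIP($s$) chromosome, which are derived as before:

$$
\Pr(L_1L_1) = \frac{1}{1+2r}\left[\frac{(1-r)^{n+s-1}}{2^n} + \frac{1 + 2r - (1-r)^{s-1}}{2^{2n}}\right],
\qquad
\Pr(L_1L_j) = \frac{2^{-n} - \Pr(L_1L_1)}{2^n - 1}\quad (j \ne 1).
\tag{3}
$$

It follows that the probability that the $2^n$-way IRIP($s$) is fixed at different alleles at the two loci is $R = 1 - \{(1 - r)^{n+s-1} + 2^{-n}[1 + 2r - (1 - r)^{s-1}]\}/(1 + 2r)$, and so the map expansion is $n + (s + 1)(1 - 2^{-n})$.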

Sibling mating:

Two-way RIL:

HALDANE and WADDINGTON (1931) derived the two-locus haplotype probabilities for two-way RIL by sibling mating. Their derivation involved the solution of a system of 22 linear equations. Here we describe a simpler solution to the problem.

Let $W_1W_2 \mid X_1X_2 \times Y_1Y_2 \mid Z_1Z_2$ denote the haplotypes for the pair of individuals at generation $G_k$ (for $k \ge 0$). Let $q_1$, $q_2$, and $q_3$ denote the probabilities that the $W_1W_2$, $W_1X_2$, and $W_1Y_2$ haplotypes, respectively, go on to be fixed. Others follow by symmetry, and so we have $4q_1 + 4q_2 + 8q_3 = 1$.

At $G_0$, $W_i \equiv X_i \equiv A$ and $Y_i \equiv Z_i \equiv B$, and so $\Pr(AA \text{ fixed}) = 2(q_1 + q_2)$ and $\Pr(AB \text{ fixed}) = 4q_3$. At $G_1$, $W_i \equiv Y_i \equiv A$ and $X_i \equiv Z_i \equiv B$, and so $\Pr(AA \text{ fixed}) = 2(q_1 + q_3)$. Equating this to the previous equation, we see immediately that $q_2 = q_3$.

Looking forward one generation (with the same technique used for the case of two-way RIL by selfing), we find $\Pr(AA \text{ fixed}) = 4[(1 - r)/2]\,q_1 + 4(1/2)^2\,q_2 + 8(1/2)^2\,q_3$: the $AA$ haplotype may be transmitted intact to any of the four chromosomes of the next generation, or its two alleles may arrive on different chromosomes, either within the same individual or in different individuals. Thus $2(q_1 + q_2) = 2(1 - r)q_1 + 3q_2$, and so $q_2 = 2rq_1$. Using the fact that $4q_1 + 4q_2 + 8q_3 = 4q_1 + 12q_2 = 1$, we obtain $q_1 = 1/[4(1 + 6r)]$ and $q_2 = r/[2(1 + 6r)]$.

Finally, we obtain, for two-way RIL by sibling mating, $\Pr(AA \text{ fixed}) = 2(q_1 + q_2) = (1 + 2r)/[2(1 + 6r)]$ and $\Pr(AB \text{ fixed}) = 4q_3 = 2r/(1 + 6r)$. These are the haplotype probabilities derived by HALDANE and WADDINGTON (1931).
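The sibling-mating result can also be checked by brute force, iterating the exact distribution of the mating pair across generations instead of solving the first-step equations. The sketch below is an illustration, not the authors' procedure; the function names, the canonical ordering used to merge equivalent states, and the choice of 120 generations are assumptions.

```python
from collections import defaultdict

def gamete_dist(ind, r):
    """Two-locus gamete distribution of an individual given as a pair of haplotypes."""
    h1, h2 = ind
    out = defaultdict(float)
    out[h1] += (1 - r) / 2
    out[h2] += (1 - r) / 2
    out[(h1[0], h2[1])] += r / 2
    out[(h2[0], h1[1])] += r / 2
    return out

def fixation_by_sib_mating(r, generations=120):
    """Probability of each fixed haplotype, starting from the G0 pair AA|AA x BB|BB."""
    AA, BB = ("A", "A"), ("B", "B")
    dist = {((AA, AA), (BB, BB)): 1.0}
    for _ in range(generations):
        nxt = defaultdict(float)
        for (mother, father), p in dist.items():
            # distribution of one offspring; haplotype order within it is irrelevant
            child = defaultdict(float)
            for gm, pm in gamete_dist(mother, r).items():
                for gf, pf in gamete_dist(father, r).items():
                    child[tuple(sorted((gm, gf)))] += pm * pf
            # the two siblings are independent draws given their parents
            for c1, p1 in child.items():
                for c2, p2 in child.items():
                    nxt[tuple(sorted((c1, c2)))] += p * p1 * p2
        dist = nxt
    fixed = defaultdict(float)
    for (i1, i2), p in dist.items():
        if i1[0] == i1[1] == i2[0] == i2[1]:   # all four chromosomes identical
            fixed[i1[0]] += p
    return dict(fixed)

if __name__ == "__main__":
    r = 0.1
    probs = fixation_by_sib_mating(r)
    print(probs[("A", "A")], (1 + 2 * r) / (2 * (1 + 6 * r)))
    print(probs[("A", "B")], 2 * r / (1 + 6 * r))
```

Sorting haplotypes within an individual and individuals within a pair relies on the symmetry of the autosomal case; it keeps the state space small without changing the fixation probabilities.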

Four-way RIL:

Our method for calculating the two-locus haplotype probabilities for two-way RIL by sibling mating (above) included the results for four-way RIL. The $q_i$ defined above are exactly the two-locus haplotype probabilities for four-way RIL by sibling mating. If we let $L_1, \ldots, L_4$ denote the four alleles, we have $\Pr(L_iL_i \text{ fixed}) = q_1 = 1/[4(1 + 6r)]$ and $\Pr(L_iL_j \text{ fixed}) = q_2 = r/[2(1 + 6r)]$ for $i \ne j$. These results are the same as those obtained by BROMAN (2005).

$2^n$-way RIL:

Derivation of the two-point haplotype probabilities on the fixed chromosome in a $2^n$-way RIL by sibling mating is similar to the case of selfing, although we must consider the four chromosomes at the start of inbreeding. Let $W_1W_2 \mid X_1X_2 \times Y_1Y_2 \mid Z_1Z_2$ denote the two-locus haplotypes in the two individuals at generation $G_{n-1}$, prior to inbreeding (see Figure 1B), and note that $W_i \in \{L_1, \ldots, L_{2^{n-2}}\}$, $X_i \in \{L_{2^{n-2}+1}, \ldots, L_{2^{n-1}}\}$, $Y_i \in \{L_{2^{n-1}+1}, \ldots, L_{2^{n-1}+2^{n-2}}\}$, and $Z_i \in \{L_{2^{n-1}+2^{n-2}+1}, \ldots, L_{2^n}\}$. To determine the haplotype probabilities on the fixed chromosome, we first determine the probabilities that particular alleles survive to the $G_{n-1}$ generation and then multiply those by the probabilities that such alleles go on to be fixed.

For the first part, note that $\Pr(W_1 = W_2 = L_1) = [(1 - r)/2]^{n-2}$. For $i = 0, \ldots, n - 3$ and $j = 1, \ldots, 2^i$, we have $\Pr(W_1 = L_1, W_2 = L_{2^i+j}) = (1/2)^{2i}(r/2)[(1 - r)/2]^{n-3-i}$. Further note that, for $k = 1 + 2^{n-1}, \ldots, 2^{n-2} + 2^{n-1}$, we have $\Pr(W_1 = L_1, Y_2 = L_k) = \Pr(W_1 = L_1)\Pr(Y_2 = L_k) = (1/2)^{2(n-2)}$.

Combining these results with the fact that $\Pr(W_1W_2 \text{ fixed}) = 1/[4(1 + 6r)]$ and $\Pr(W_1X_2 \text{ fixed}) = \Pr(W_1Y_2 \text{ fixed}) = r/[2(1 + 6r)]$, we obtain the following two-point haplotype probabilities for the fixed $2^n$-way RIL:

$$
\Pr(L_1L_m) =
\begin{cases}
\dfrac{(1-r)^{n-2}}{2^n(1+6r)} & m = 1,\\[6pt]
\dfrac{r(1-r)^{n-3-i}}{2^{n+i}(1+6r)} & m = 2^i + j,\ i = 0,\ldots,n-3,\ j = 1,\ldots,2^i,\\[6pt]
\dfrac{r}{2^{2n-3}(1+6r)} & m = 2^{n-2}+1,\ldots,2^n.
\end{cases}
\tag{4}
$$

It follows that the probability that the $2^n$-way RIL is fixed at different alleles at the two loci is $R = 1 - 2^n\Pr(L_1L_1) = 1 - (1 - r)^{n-2}/(1 + 6r)$, and so the map expansion is $n + 4$.

$2^n$-way RIL+:

In $2^n$-way RIL+ by sibling mating, $G_n$ individuals from independent funnels are crossed to form the $G_{n+1}$ generation, at which point inbreeding by sibling mating begins (see Figure 1D). At generation $n$, we have that $W_i, Y_i \in \{L_1, \ldots, L_{2^{n-1}}\}$ and $X_i, Z_i \in \{L_{2^{n-1}+1}, \ldots, L_{2^n}\}$. Derivation of the two-point haplotype probabilities proceeds with two changes: there is an additional generation of recombination prior to the start of inbreeding, and the $L_1L_1$ haplotype may now be fixed in four possible ways: $W_1 = W_2 = L_1$ and the $W_1W_2$ haplotype is fixed, $Y_1 = Y_2 = L_1$ and the $Y_1Y_2$ haplotype is fixed, $W_1 = L_1$ and $Y_2 = L_1$ and the $W_1Y_2$ haplotype is fixed, and finally $W_2 = L_1$ and $Y_1 = L_1$ and the $Y_1W_2$ haplotype is fixed.

We first look at the haplotype probabilities at generation $n$. We have $\Pr(W_1 = W_2 = L_1) = [(1 - r)/2]^{n-1}$. For $i = 0, \ldots, n - 2$, $j = 1, \ldots, 2^i$, $\Pr(W_1 = L_1, W_2 = L_{2^i+j}) = (1/2)^{2i}(r/2)[(1 - r)/2]^{n-2-i}$.

To obtain the two-point haplotype probabilities on the fixed $2^n$-way RIL+ chromosome, we note that, for example, the probability that the $L_1L_1$ haplotype is fixed is $2[(1 - r)/2]^{n-1}\Pr(W_1W_2 \text{ fixed}) + 2(1/2)^{2(n-1)}\Pr(W_1Y_2 \text{ fixed})$. The probability that the $L_1L_k$ haplotype is fixed, for $k > 2^{n-1}$, is $4(1/2)^{2(n-1)}\Pr(W_1X_2 \text{ fixed})$, with the 4 coming from the fixation of $W_1X_2$, $W_1Z_2$, $Y_1X_2$, or $Y_1Z_2$. Thus, the final results are as follows:

$$
\Pr(L_1L_m) =
\begin{cases}
\dfrac{(1-r)^{n-1} + r\,2^{2-n}}{2^n(1+6r)} & m = 1,\\[6pt]
\dfrac{r\,[\,2^{-i}(1-r)^{n-2-i} + 2^{2-n}\,]}{2^{n}(1+6r)} & m = 2^i + j,\ i = 0,\ldots,n-2,\ j = 1,\ldots,2^i,\\[6pt]
\dfrac{r}{2^{2n-3}(1+6r)} & m = 2^{n-1}+1,\ldots,2^n.
\end{cases}
\tag{5}
$$

It follows that the probability that the $2^n$-way RIL+ is fixed at different alleles at the two loci is $R = 1 - [(1 - r)^{n-1} + r\,2^{2-n}]/(1 + 6r)$, and so the map expansion is $n + 5 - 2^{2-n}$.

$2^n$-way IRIP($s$):

In the formation of $2^n$-way IRIP($s$) by sibling mating, one generates an unlimited population of $G_n$ individuals from random funnels, intermates them for $s$ generations, and then inbreeds. The haplotype probabilities at generation $n + s$, at which inbreeding begins, are the same as for the case of $2^n$-way IRIP($s$) by selfing, and so we have, at generation $n + s$, $\Pr(W_1 = W_2 = L_1) = (1 - r)^{n+s-1}/2^n + [1 - (1 - r)^{s-1}]/2^{2n}$. The probabilities $\Pr(W_1 = L_1, W_2 = L_j)$ for $j \ne 1$ may be derived by symmetry.

Derivation of the haplotype probabilities on the fixed $2^n$-way IRIP($s$) chromosome proceeds as before, and so we obtain the following:

$$
\Pr(L_1L_1) = \frac{1}{1+6r}\left[\frac{(1-r)^{n+s-1}}{2^n} + \frac{1 + 6r - (1-r)^{s-1}}{2^{2n}}\right],
\qquad
\Pr(L_1L_j) = \frac{2^{-n} - \Pr(L_1L_1)}{2^n - 1}\quad (j \ne 1).
\tag{6}
$$

It follows that the probability that the $2^n$-way IRIP($s$) is fixed at different alleles at the two loci is $R = 1 - \{(1 - r)^{n+s-1} + 2^{-n}[1 + 6r - (1 - r)^{s-1}]\}/(1 + 6r)$, and so the map expansion is $n + (s + 5)(1 - 2^{-n})$.

Summary:

Here we have derived the two-point haplotype probabilities for $2^n$-way RIL, RIL+, and IRIP. Perhaps our most important results concern the map expansion in the different designs, as these indicate the increased mapping resolution that may be obtained. The RIL+ and IRIP designs require additional generations of mating, and this additional effort must be weighed against the improved precision provided.

The map expansions for $2^n$-way RIL, RIL+, and IRIP($s$) by selfing are assembled in Table 1. The map expansion in the RIL+ design is somewhat less than 1 unit greater than that for the RIL. In the IRIP, one obtains slightly less than a 1-unit increase in the map expansion for each additional generation of intermating.


TABLE 1

Map expansion in $2^n$-way RIL, RIL+, and IRIP
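The closed-form map expansions stated in the preceding sections can be tabulated for any $n$ and $s$ with a few lines of code. The sketch below simply encodes those six expressions; the dictionary `MAP_EXPANSION`, the function names, and the design and mating labels are illustrative choices.

```python
# Map expansion dR/dr at r = 0 for each design, as stated in the text.
MAP_EXPANSION = {
    ("RIL", "selfing"):  lambda n, s: n + 1,
    ("RIL", "sib"):      lambda n, s: n + 4,
    ("RIL+", "selfing"): lambda n, s: n + 2 - 2.0 ** (1 - n),
    ("RIL+", "sib"):     lambda n, s: n + 5 - 2.0 ** (2 - n),
    ("IRIP", "selfing"): lambda n, s: n + (s + 1) * (1 - 2.0 ** (-n)),
    ("IRIP", "sib"):     lambda n, s: n + (s + 5) * (1 - 2.0 ** (-n)),
}

def map_expansion(design, mating, n, s=0):
    """Map expansion for a 2^n-way design; s is used only by the IRIP(s) design."""
    return MAP_EXPANSION[(design, mating)](n, s)

def print_table(n, s):
    """Print selfing and sibling-mating map expansions for each design."""
    print(f"{'design':8s}{'selfing':>10s}{'sib':>10s}")
    for design in ("RIL", "RIL+", "IRIP"):
        row = [map_expansion(design, m, n, s) for m in ("selfing", "sib")]
        print(f"{design:8s}{row[0]:10.3f}{row[1]:10.3f}")

if __name__ == "__main__":
    print_table(n=3, s=4)    # eight-way designs, IRIP with s = 4
```

Note that with $s = 1$ the IRIP expression for selfing coincides with the RIL+ expression, since both add exactly one generation of recombination before inbreeding.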

 


THREE POINTS
A technique similar to that used above for the case of two points may be used to derive the three-point haplotype probabilities in RIL. BROMAN (2005) derived these quantities, but obtained only numerical solutions. By our approach, we may obtain exact formulas for the three-point haplotype probabilities.

We focus exclusively on the autosome in four- and eight-way RIL by sibling mating. Exact formulas for four- and eight-way RIL by selfing were presented in BROMAN (2005). Results for higher-order RIL, RIL+, and IRIP may be obtained from the results provided below, and a similar technique may be used to derive results for the X chromosome.

We consider three points and assume that the recombination fractions in the two intervals are the same, $r_{12} = r_{23} = r$. Let $c$ denote the three-point coincidence at meiosis, $c = \Pr(\text{double recombinant})/r^2$. Note that $c$ is generally a function of $r$, with, for most organisms, $c = 0$ for small $r$ (indicating strong positive crossover interference) and $c = 1$ for $r = 1/2$. We define $r_{13}$ to be the recombination fraction between the first and third loci, so that $c = (2r - r_{13})/(2r^2)$ and so $r_{13} = 2r(1 - cr)$.

To simplify some of the notation in what follows, define $r_{00} = 1 - 2r + cr^2$, the chance that a nonrecombinant haplotype is transmitted; $r_{01} = r(1 - cr)$, the chance that the second but not the first interval recombines; and $r_{11} = cr^2$, the chance that both intervals recombine.
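A short exact-arithmetic check makes the relationships among these quantities explicit: the four recombination patterns sum to one, $r_{13} = 2r_{01}$, and the coincidence is recovered from $r$ and $r_{13}$. The function names below are illustrative assumptions.

```python
from fractions import Fraction

def transmission_probs(r, c):
    """Return (r00, r01, r11) for equal interval recombination fractions r."""
    r, c = Fraction(r), Fraction(c)
    r00 = 1 - 2 * r + c * r ** 2     # neither interval recombines
    r01 = r * (1 - c * r)            # exactly one given interval recombines
    r11 = c * r ** 2                 # both intervals recombine
    return r00, r01, r11

def r13(r, c):
    """Recombination fraction between the first and third loci."""
    r, c = Fraction(r), Fraction(c)
    return 2 * r * (1 - c * r)

def coincidence(r, r_13):
    """Recover c from r and r13 via c = (2r - r13) / (2 r^2)."""
    r, r_13 = Fraction(r), Fraction(r_13)
    return (2 * r - r_13) / (2 * r ** 2)

if __name__ == "__main__":
    r, c = Fraction(1, 5), Fraction(1, 3)
    r00, r01, r11 = transmission_probs(r, c)
    print(r00 + 2 * r01 + r11)       # total probability over the four patterns
    print(r13(r, c) == 2 * r01)      # r13 counts both single-recombinant patterns
    print(coincidence(r, r13(r, c)) == c)
```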

Four-way RIL:

We consider the case of four-way RIL by sibling mating. Let $p_{ijk}$ denote the probability that the $ijk$ haplotype is fixed. (For ease of notation, here we denote the four alleles as the integers 1, 2, 3, 4.) Taking account of the various symmetries, there are seven distinct haplotype probabilities, shown in Table 2. Note that we must have $4p_{111} + 8p_{112} + 4p_{121} + 16p_{113} + 8p_{131} + 16p_{123} + 8p_{132} = 1$.


TABLE 2

Three-point haplotype probabilities on an autosome in four-way RIL by sibling mating

 
To derive these seven probabilities, we condition on the first step toward inbreeding. For example, we can write $p_{111} = 2(r_{00}/2)\,p_{111} + 4[(r_{00} + r_{01})/4]\,p_{113} + 2[(r_{00} + r_{11})/4]\,p_{131}$. This is derived as follows: the 111 haplotype can be fixed if it is transmitted intact in the first generation and then that haplotype goes on to be fixed (and this can happen in two ways); or if the 1 alleles at two adjacent loci are transmitted to one offspring and the 1 allele at the remaining locus is transmitted to the other offspring, and these are brought together at fixation (and this can occur in four different ways); or finally if the 1 alleles at the first and third loci are transmitted to one offspring and the 1 allele at the middle locus to the other offspring, and these are brought together at fixation (and this can happen in two ways).

Similar arguments lead to the following additional six equations:

Formula 7(7)

We have now defined a set of eight linear equations in seven unknowns. The seven unknown probabilities may be easily derived by the consideration of the first equation plus six of the other seven equations. The solutions, displayed in Table 2, were obtained via Mathematica (WOLFRAM RESEARCH 2003). Note that one may collapse the three-point probabilities to obtain the two-point probabilities derived earlier.

A quantity analogous to the three-point coincidence, but for the fixed RIL chromosome, may be calculated from these results, as $C = (1 - 4p_{111} - 8p_{112} - 16p_{113})/R^2$, which gives the following:

Formula 8(8)

Eight-way RIL:

The three-point haplotype probabilities for the autosome in eight-way RIL by sibling mating may be immediately derived from the results on four-way RIL, using the equations in Table 7 of BROMAN (2005). We do not write these out, but do derive the quantity analogous to the three-point coincidence, for the fixed eight-way RIL chromosome:

Formula 9(9)

In the work of BROMAN (2005), nearly 3 years of total computer time were used to derive the above quantity, although the results were strictly numerical and were for the case of no interference (c = 1) and for a model of strong positive crossover interference. Here, we have shown a simpler method to derive the result, which allowed us to obtain explicit formulas for the three-point probabilities. The formulas in this section match the numerical results of BROMAN (2005) to within round-off error.


DISCUSSION
We have improved on the work of HALDANE and WADDINGTON (1931) and BROMAN (2005), describing a simpler approach for the calculation of two- and three-point haplotype probabilities in multiple-strain RIL. Our simpler solution (which is an instance of the standard trick for calculations with Markov chains: condition on the first step) allowed us to derive exact formulas for the three-point haplotype probabilities in four- and eight-way RIL by sibling mating. Moreover, we have extended the results on two-point haplotype probabilities for the case of additional generations of intermating in the $2^n$-way RIL+ and IRIP designs. It is important to emphasize that the results on IRIP are based on the assumption of an infinite population of intermating individuals. With the finite populations that would be used in practice, the progress to inbreeding would be more rapid and the realized map expansion would be somewhat less than our theoretical calculations indicate.

While our results on the two-point haplotype probabilities will play an important role in methods for reconstructing the RIL haplotypes on the basis of incompletely informative markers, such as single-nucleotide polymorphisms (SNPs), perhaps the greatest value of this work concerns the map expansion provided by the different designs. The precision of localization of a quantitative trait locus (QTL) depends critically upon the density of breakpoints in the mapping population, but the increased density of breakpoints in RIL+ and IRIP must be weighed against the additional generations of intermating (and of inbreeding) required. In this regard, it should be emphasized that there is an important trade-off between the power to identify novel QTL and the precision of localization of QTL. The LOD threshold for significance in a genomewide scan for QTL increases as the density of breakpoints increases. This is shown clearly in the results of LANDER and BOTSTEIN (1989) on the LOD threshold for the dense-map case: the threshold increases with the effective genetic length of the genome. Thus, while the introduction of additional generations of interbreeding in the formation of RIL will lead to greater mapping precision, a larger RIL panel will be required to identify QTL with a given effect size.

MARTIN and HOSPITAL (2006) recently pointed out that the maximum-likelihood estimate of the recombination fraction between two markers, on the basis of breakpoint frequencies in an RIL panel, is subject to some bias. They presented a method, using a Taylor expansion, for reducing the bias. They further described a method for testing for crossover interference with RIL data. Their methods could also be used with the multiple-strain RIL considered herein. While these results are quite interesting, we wish to point out that, in the use of RIL for QTL mapping, interest is in the breakpoint frequencies themselves and not in the underlying recombination fractions. Moreover, an understanding of recombination at meiosis, particularly regarding crossover interference, might best be studied in a large backcross or intercross, rather than with RIL, as the process of inbreeding to develop RIL is subject to considerable selection, and so our understanding of recombination on the basis of breakpoint frequencies in RIL would likely be distorted.

MARTIN and HOSPITAL (2006) viewed the term "map expansion" as misleading, as it really concerns an increased frequency of breakpoints and no real change in the genetic map. We, however, still prefer the phrase, and no useful alternatives have been proposed; it provides a useful shorthand for a more complex phenomenon. They further take issue with the treatment, in software, of RIL as a backcross through equations such as R = 4r/(1 + 6r) and with an assumption of no crossover interference, as even if meiosis exhibits no interference, occurrences of breakpoints in adjacent intervals on an RIL chromosome are not independent. (Note that this lack of independence was identified by HALDANE and WADDINGTON in 1931.) To the contrary, however, as was stated in BROMAN (2005), the breakpoint process on an RIL chromosome, at least for the mouse, will be more closely approximated by a Poisson process than is the crossover process at meiosis, which in the mouse exhibits extremely strong positive crossover interference (see BROMAN et al. 2002). Thus the current approach for multipoint QTL mapping in RIL, embodied in software such as MapMaker/QTL (LANDER et al. 1987), is entirely reasonable.


APPENDIX
Let $r$ denote the recombination fraction (at meiosis) for an interval, and let $R(r)$ denote the analogous quantity for an RIL (or RIL+ or IRIP). Several authors (e.g., WINKLER et al. 2003; BROMAN 2005) have referred to the fact that the map expansion in an RIL is $dR/dr|_{r=0}$, but this has been stated without proof. The map expansion for a given RIL design is particularly important for our work, and so we wish, in this APPENDIX, to provide a proof of the result. This expands on a comment in TEUSCHER et al. (2005).

Let d denote map distance (that is, the average number of crossovers in an interval), and let D(d) denote the corresponding average number of breakpoints in that interval on an RIL (or RIL+ or IRIP) chromosome. We wish to show that D(d) is linear in d and in particular that D(d) = ad for some a.

First, note that D(d1 + d2) = D(d1) + D(d2) for all d1, d2. This comes from the fact that di is the average number of crossovers in an interval, while D(di) is the average number of breakpoints in the interval on the RIL chromosome, and so both are additive.

It follows that D(nd) = nD(d) for any nonnegative integer n. Further, D(d) = D(nd/n) = nD(d/n), and so D(d/n) = D(d)/n for any positive integer n. Thus D(qd) = qD(d) for all nonnegative rationals q. Surely D(d) is continuous, and so then D(xd) = xD(d) holds for all nonnegative x.

Now pick any d0 > 0 and let a = D(d0)/d0. Then for any d, D(d) = D[(d/d0)d0] = (d/d0)D(d0) = ad. Thus we have shown D(d) = ad for some a.

Now let $r$ denote the recombination fraction corresponding to the distance $d$ and let $R(r)$ denote the analogous quantity for the RIL, and note that for small $r$, $r \approx d$ and likewise $R(r) \approx D(d)$. It follows that $dR/dr|_{r=0} = \lim_{r \to 0} R(r)/r = \lim_{d \to 0} D(d)/d = a$.
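The limiting argument can be illustrated numerically for one design. The sketch below uses $R(r) = 1 - (1 - r)^{n-1}/(1 + 2r)$ for $2^n$-way RIL by selfing and, purely as an assumption for illustration, the Haldane map function $r = (1 - e^{-2d})/2$ to convert map distance to recombination fraction; the appendix itself requires only that some map function exists. The function names are hypothetical.

```python
import math

def haldane_r(d):
    """Recombination fraction for map distance d (Morgans) under the Haldane map function."""
    return -0.5 * math.expm1(-2.0 * d)      # equals (1 - exp(-2d)) / 2, stable for small d

def R_ril_selfing(n, r):
    """Closed form for the fixed-chromosome analog of r in a 2^n-way RIL by selfing."""
    return 1.0 - (1.0 - r) ** (n - 1) / (1.0 + 2.0 * r)

def breakpoint_ratio(n, d):
    """R(r(d)) / d, which should approach the map expansion as d -> 0."""
    return R_ril_selfing(n, haldane_r(d)) / d

if __name__ == "__main__":
    n = 3
    for d in (1e-1, 1e-2, 1e-3, 1e-4):
        print(d, breakpoint_ratio(n, d))    # ratio tends toward n + 1 as d shrinks
```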

Note that this result requires no assumption about crossover interference, but does require the existence of a map function: that there is a one-to-one relationship between the recombination fraction for an interval and the average number of crossovers in the interval.


ACKNOWLEDGEMENTS
The authors thank Volker Guiard for helpful discussions, Gudrun Brockmann for suggesting our collaboration, and two anonymous reviewers for helpful comments. This work was supported in part by National Institutes of Health grant GM074244 (to K.W.B.).


LITERATURE CITED

BROMAN, K. W., 2005 The genomes of recombinant inbred lines. Genetics 169: 1133–1146.

BROMAN, K. W., L. B. ROWE, G. A. CHURCHILL and K. PAIGEN, 2002 Crossover interference in the mouse. Genetics 160: 1123–1131.

COMPLEX TRAIT CONSORTIUM, 2004 The Collaborative Cross, a community resource for the genetic analysis of complex traits. Nat. Genet. 36: 1133–1137.

HALDANE, J. B. S., and C. H. WADDINGTON, 1931 Inbreeding and linkage. Genetics 16: 357–374.

LANDER, E. S., and D. BOTSTEIN, 1989 Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121: 185–199.

LANDER, E. S., P. GREEN, J. ABRAHAMSON, A. BARLOW, M. J. DALY et al., 1987 MAPMAKER: an interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics 1: 174–181.

MARTIN, O. C., and F. HOSPITAL, 2006 Two- and three-locus tests for linkage analysis using recombinant inbred lines. Genetics 173: 451–459.

TEUSCHER, F., V. GUIARD, P. E. RUDOLPH and G. A. BROCKMANN, 2005 The map expansion obtained with recombinant inbred strains and intermated recombinant inbred populations for finite generation designs. Genetics 170: 875–879.

THREADGILL, D. W., K. W. HUNTER and R. W. WILLIAMS, 2002 Genetic dissection of complex and quantitative traits: from fantasy to reality via a community effort. Mamm. Genome 13: 175–178.

WINKLER, C. R., N. M. JENSEN, M. COOPER, D. W. PODLICH and O. S. SMITH, 2003 On the determination of recombination rates in intermated recombinant inbred populations. Genetics 164: 741–745.

WOLFRAM RESEARCH, 2003 Mathematica, Version 5. Wolfram Research, Champaign, IL.

Communicating editor: D. HOULE

