The Length of the Intact Donor Chromosome Segment Around a Target Gene in Marker-Assisted Backcrossing
Matthias Frisch and Albrecht E. Melchinger
Institute of Plant Breeding, Seed Science, and Population Genetics, University of Hohenheim, 70593 Stuttgart, Germany
Corresponding author: Albrecht E. Melchinger, Institute of Plant Breeding, Seed Science, and Population Genetics, University of Hohenheim, 70593 Stuttgart, Germany. E-mail: melchinger{at}uni-hohenheim.de
Communicating editor: Z-B. ZENG
| ABSTRACT |
|---|
Recurrent backcrossing is an established procedure to transfer target genes from a donor into the genetic background of a recipient genotype. By assessing the parental origin of alleles at markers flanking the target locus one can select individuals with a short intact donor chromosome segment around the target gene and thus reduce the linkage drag. We investigated the probability distribution of the length of the intact donor chromosome segment around the target gene in recurrent backcrossing with selection for heterozygosity at the target locus and homozygosity for the recurrent parent allele at flanking markers for a diploid species. Assuming no interference in crossover formation, we derived the cumulative density function, probability density function, expected value, and variance of the length of the intact chromosome segment for the following cases: (1) backcross generations prior to detection of a recombinant individual between the target gene and the flanking marker; (2) the backcross generation in which for the first time a recombinant individual is detected, which is selected for further backcrossing; and (3) subsequent backcross generations after selection of a recombinant. Examples are given of how these results can be applied to investigate the efficiency of marker-assisted backcrossing for reducing the length of the intact donor chromosome segment around the target gene under various situations relevant in breeding and genetic research.
RECURRENT backcrossing with selection for presence of a target gene is a well-established breeding method for introgressing desirable genes from a donor into the genetic background of a recipient genotype used as recurrent parent. With the development of high-density linkage maps in most crop species, it became possible to monitor the parental origin of alleles at DNA markers throughout the entire genome. Selection of individuals, which not only carry the target gene but also are homozygous for the recurrent parent alleles at a large portion of markers, can accelerate recovery of the recurrent parent genome and reduce the number of backcross generations required for gene introgression. This approach is called background selection and was first proposed by ![]()
The goal of background selection is to reduce the donor genome proportion across the whole genome (![]()
= $te^{-tx}$. Calculating the expectation $E_t(X)$ by assuming a chromosome of infinite length yields FISHER's (1949, p. 50) formula
$$E_t(X) = \frac{1}{t}. \tag{1}$$
![]() |
(2) |
The objective of this study was to extend earlier results concerning the length of the intact donor chromosome segment around the target gene to backcross programs with selection for (a) presence of a target gene and (b) homozygosity of the recurrent parent allele at flanking markers. Knowledge of the probability distribution of the length of the chromosome segment around a target gene is useful in (i) choosing the flanking markers depending on the effect of their position on the intact chromosome segment length and (ii) estimating the intact chromosome segment length on the basis of the marker genotype during a backcross program.
We derived the cumulative density functions (cdf's) and the pdf's of the intact donor chromosome segment length for (1) backcross generations prior to detection of a recombinant between the target gene and the marker(s), (2) the backcross generation in which a recombinant is first detected and selected for further backcrossing, and (3) backcross generations after selection of a recombinant. The respective expected values and variances can be calculated from these density functions.
| ASSUMPTIONS, DEFINITIONS, AND SOME BASIC RESULTS |
|---|
We assume that the process of crossover formation during meiosis in a diploid species is completely described by the following two properties: (A) The location of crossovers is uniformly distributed along the chromosome, and (B) the number of crossovers, which occur during one meiosis on a chromosome region of length l, follows a Poisson distribution with parameter l. Assumptions A and B are mathematically equivalent to those made by ![]()
For our derivations we use an algebra of events, the union and intersection operators ∪ and ∩, and the subset relation ⊂. Let u and v denote two positions on a chromosome, measured in morgan units with the coordinate origin at the target locus. For all calculations concerning only one side of the target locus, we assume without loss of generality u < v and u, v > 0. The map distance between the target locus and the end of the chromosome is l. The following notation is used for events in generation BCi:
- Zu,v,i: No crossover occurred in the interval [u, v).
- Ou,v,i: An odd number of crossovers occurred in [u, v).
- Eu,v,i: An even number (including zero) of crossovers occurred in [u, v).
Note that $Z_{u,v,i} \subset E_{u,v,i}$. Furthermore, we define
- $N_{u,v,t} = \bigcap_{i=1}^{t} E_{u,v,i}$: No recombination occurred between the loci at positions u and v in any of the generations BC1 to BCt.
- $R_{u,v,t} = \bigcap_{i=1}^{t-1} E_{u,v,i} \cap O_{u,v,t}$ for t > 1 and $R_{u,v,1} = O_{u,v,1}$ for t = 1: Recombination between the loci at positions u and v occurred for the first time in generation BCt.
Note that the events Zu,v,i, Ou,v,i, and Eu,v,i describe recombination in the interval [u, v) in generation BCi, whereas Nu,v,t and Ru,v,t refer to the accumulation of events in generations BC1 to BCt.
From assumptions A and B follows directly (![]()
$$P(Z_{u,v,i}) = e^{-d} \tag{3}$$
$$P(O_{u,v,i}) = e^{-d}\sinh d \tag{4}$$
$$P(E_{u,v,i}) = e^{-d}\cosh d \tag{5}$$
where d = v - u. Following Haldane's original derivation, we use hyperbolic functions because they are easier to handle in subsequent derivations than the more common formulas with exponential functions only.
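As a quick numerical cross-check of these three probabilities, the sketch below sums the Poisson probability mass function of assumption B directly and compares it with the hyperbolic closed forms $e^{-d}$, $e^{-d}\sinh d$, and $e^{-d}\cosh d$. The function names are illustrative only and do not come from the article.

```python
import math


def poisson_pmf(k: int, d: float) -> float:
    """P(exactly k crossovers) on a region of map length d (assumption B)."""
    return math.exp(-d) * d**k / math.factorial(k)


def p_zero(d: float) -> float:
    """P(Z): no crossover in an interval of length d."""
    return math.exp(-d)


def p_odd(d: float) -> float:
    """P(O): odd number of crossovers in an interval of length d."""
    return math.exp(-d) * math.sinh(d)


def p_even(d: float) -> float:
    """P(E): even number of crossovers (including zero)."""
    return math.exp(-d) * math.cosh(d)


def p_odd_by_summation(d: float, max_k: int = 60) -> float:
    """Same probability as p_odd, obtained by summing the odd Poisson terms."""
    return sum(poisson_pmf(k, d) for k in range(1, max_k, 2))


# Example: a marker 0.3 M from the target locus.
d = 0.3
print(p_zero(d), p_odd(d), p_even(d))
print(abs(p_odd(d) - p_odd_by_summation(d)))  # difference is at rounding level
```

Note that `p_odd(d)` is algebraically the same as Haldane's mapping function $(1 - e^{-2d})/2$, so the hyperbolic form is only a change of notation.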
Formation of crossovers in different generations is stochastically independent. Moreover, assumptions A and B imply that for every generation crossover formation in nonoverlapping intervals is independent (![]()
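For K, L ∈ {Z, O, E}, s ≠ s', and arbitrary intervals [u, v) and [u', v'),
$$P(K_{u,v,s} \cap L_{u',v',s'}) = P(K_{u,v,s})\,P(L_{u',v',s'}) \tag{6}$$
and for K, L ∈ {Z, O, E, N, R}, arbitrary generations s and s', and [u, v) ∩ [u', v') = Ø,
$$P(K_{u,v,s} \cap L_{u',v',s'}) = P(K_{u,v,s})\,P(L_{u',v',s'}) \tag{7}$$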
Equation 6 and Equation 7 can be used to calculate the probabilities of events N and R as
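$$P(N_{u,v,t}) = \left(e^{-d}\cosh d\right)^{t} \tag{8}$$
$$P(R_{u,v,t}) = \left(e^{-d}\cosh d\right)^{t-1} e^{-d}\sinh d \tag{9}$$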
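The probabilities of N and R can be sketched in a few lines. The code assumes that, by independence across generations, $P(N_{u,v,t})$ is the t-th power of the single-generation probability of an even crossover count and $P(R_{u,v,t})$ multiplies t - 1 such factors by one odd-count probability. The function names are hypothetical.

```python
import math


def p_no_recombination(d: float, t: int) -> float:
    """P(N): even crossover count on an interval of length d in each of BC1..BCt."""
    return (math.exp(-d) * math.cosh(d)) ** t


def p_first_recombination(d: float, t: int) -> float:
    """P(R): even counts in BC1..BC(t-1), then an odd count in BCt."""
    even = math.exp(-d) * math.cosh(d)
    odd = math.exp(-d) * math.sinh(d)
    return even ** (t - 1) * odd


# The events R_1, ..., R_t and N_t are mutually exclusive and exhaustive,
# so their probabilities should add up to one.
d, t = 0.5, 6
total = sum(p_first_recombination(d, i) for i in range(1, t + 1))
total += p_no_recombination(d, t)
print(total)
```

The same partition of the probability space is what the theorem of total probability relies on when the conditional results are later combined.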
We define a random variable X, which describes the length of the donor chromosome segment attached on one side of the target gene. The event "the donor chromosome segment attached on one side of the target gene is greater than a certain value x" is denoted by {X > x}. Hence, the cdf of the random variable X is F(x) = 1 - P({X > x}).
| SEGMENT ATTACHED ON ONE SIDE OF THE TARGET GENE |
|---|
Under assumptions A and B, the two random variables that describe the length of the intact donor chromosome segments attached on each side of the target gene are stochastically independent. We use this property to first derive the distribution of each random variable and then combine the results to obtain formulas for the total length. The core of the approach is the derivation of the cdf's from conditional probabilities; further properties of the distribution such as pdf, expectation, and variance can be derived with standard methods.
Generations prior to detection of a recombinant:
We first investigate the length of the intact chromosome segment in backcross generation BCt under the condition that no recombination between the target gene and a marker at position y occurred in any backcross generation BCi (i ≤ t). We distinguish three cases: (1) The attached chromosome segment is smaller than the flanking marker distance; (2) the attached chromosome segment is greater than the flanking marker distance but smaller than the distance between target gene and the end of the chromosome; and (3) the attached chromosome segment comprises the complete distance between target gene and the end of the chromosome.
Case 1. x ∈ [0, y):
The event "the intact chromosome segment is greater than a certain value x ∈ [0, y) in generation BCt and no recombination was observed between the target gene and the marker" occurs if and only if no crossover event happened in the interval [0, x) in all backcross generations BCi (i ≤ t) and an even number of crossovers occurred in the interval [x, y) in all backcross generations BCi (i ≤ t):
$$\{X_t > x\} \cap N_{0,y,t} = \bigcap_{i=1}^{t} \left(Z_{0,x,i} \cap E_{x,y,i}\right). \tag{10}$$
With Equation 3, Equation 5, and Equation 7 we obtain
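$$P(\{X_t > x\} \cap N_{0,y,t}) = \left[e^{-x}\,e^{-(y-x)}\cosh(y-x)\right]^{t} = e^{-ty}\cosh^{t}(y-x), \tag{11}$$
which is required for calculation of the conditional probability
$$P(\{X_t > x\} \mid N_{0,y,t}) = \frac{P(\{X_t > x\} \cap N_{0,y,t})}{P(N_{0,y,t})} = \frac{\cosh^{t}(y-x)}{\cosh^{t} y}. \tag{12}$$
Hence, the cdf of the attached chromosome segment length is
$$F_t(x \mid N_{0,y,t}) = 1 - \frac{\cosh^{t}(y-x)}{\cosh^{t} y}. \tag{13}$$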
Case 2. x ∈ [y, l):
The event "the attached chromosome segment is greater than a certain value x ∈ [y, l) in generation BCt and no recombination was observed between the target gene and the marker" occurs if and only if no crossover happened in the interval [0, x) in all backcross generations BCi (i ≤ t). From x ∈ [y, l) follows {Xt > x} ⊂ N0,y,t and hence,
$$\{X_t > x\} \cap N_{0,y,t} = \{X_t > x\} = \bigcap_{i=1}^{t} Z_{0,x,i}. \tag{14}$$
In analogy to the calculations in Equations 10 to 13 we obtain the cdf
$$F_t(x \mid N_{0,y,t}) = 1 - \frac{e^{-t(x-y)}}{\cosh^{t} y}. \tag{15}$$
Case 3. x = l:
The event "the attached chromosome segment takes its maximum value in generation BCt, {Xt = l}, and no recombination was observed between the target gene and the marker" occurs if and only if no crossover occurred in the interval [0, l) in all backcross generations BCi (i ≤ t). From {Xt = l} ⊂ N0,y,t follows
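$$\{X_t = l\} \cap N_{0,y,t} = \{X_t = l\} = \bigcap_{i=1}^{t} Z_{0,l,i} \tag{16}$$
and from this we get
$$P(\{X_t = l\} \mid N_{0,y,t}) = \frac{e^{-t(l-y)}}{\cosh^{t} y}. \tag{17}$$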
The discrete character of Xt for the value x = l must be taken into account when calculating the expectation and variance of Xt.
Pdf for Cases 1 and 2:
Differentiation of Equation 13 and Equation 15 with respect to x yields the pdf
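$$f_t(x \mid N_{0,y,t}) = \begin{cases} \dfrac{t\,\sinh(y-x)\cosh^{t-1}(y-x)}{\cosh^{t} y} & \text{for } x \in [0, y), \\[2ex] \dfrac{t\,e^{-t(x-y)}}{\cosh^{t} y} & \text{for } x \in [y, l). \end{cases} \tag{18}$$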
Note that ft(x|N0,y,t) is not continuous for x = y.
Generation in which a recombinant is detected and selected:
Let us now assume that recombination between the target gene and the marker occurred for the first time in generation BCs. The event "the chromosome segment attached on this side of the target gene is greater than a certain value x ∈ [0, y) and recombination is observed between the target gene and the marker" occurs if and only if no crossover happened in the interval [0, x) in all backcross generations BCi (i ≤ s), an even number of crossovers occurred in the interval [x, y) in all backcross generations BCi (i < s), and an odd number of crossovers occurred in the interval [x, y) in generation BCs:
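$$\{X_s > x\} \cap R_{0,y,s} = \bigcap_{i=1}^{s} Z_{0,x,i} \cap \bigcap_{i=1}^{s-1} E_{x,y,i} \cap O_{x,y,s}. \tag{19}$$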
In analogy to the calculations in Equations 10 to 13 we obtain the cdf
$$F_s(x \mid R_{0,y,s}) = 1 - \frac{\cosh^{s-1}(y-x)\,\sinh(y-x)}{\cosh^{s-1} y\,\sinh y} \quad \text{for } x \in [0, y). \tag{20}$$
Differentiation with respect to x yields the corresponding pdf of the attached chromosome segment length under the condition that recombination between the target gene and the marker occurred for the first time in generation BCs:
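$$f_s(x \mid R_{0,y,s}) = \begin{cases} \dfrac{(s-1)\cosh^{s-2}(y-x)\sinh^{2}(y-x) + \cosh^{s}(y-x)}{\cosh^{s-1} y\,\sinh y} & \text{for } x \in [0, y), \\[2ex] 0 & \text{otherwise.} \end{cases} \tag{21}$$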
Note that fs(x|R0,y,s) is not continuous for x = y.
Subsequent generations after selection of a recombinant:
We now investigate the distribution of the length of the attached segment on one side of the target, when selection of a recombinant individual, on the basis of a flanking marker at position y, was carried out in generation BCs and backcrossing is continued for another t - s generations. The event "the attached chromosome segment is greater than a certain value x in generation BCt and recombination is observed between the target gene and the marker in generation BCs (s ≤ t)" occurs if and only if no crossover occurred in the interval [0, x) in all generations BCi (i ≤ t), an even number of crossovers occurred in the interval [x, y) in all backcross generations BCi (i < s), and an odd number of crossovers occurred in the interval [x, y) in generation BCs:
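$$\{X_t > x\} \cap R_{0,y,s} = \bigcap_{i=1}^{t} Z_{0,x,i} \cap \bigcap_{i=1}^{s-1} E_{x,y,i} \cap O_{x,y,s}. \tag{22}$$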
In analogy to the calculations in Equations 10 to 13 we obtain the cdf
$$F_t(x \mid R_{0,y,s}) = 1 - \frac{e^{-(t-s)x}\cosh^{s-1}(y-x)\,\sinh(y-x)}{\cosh^{s-1} y\,\sinh y} \quad \text{for } x \in [0, y). \tag{23}$$
Differentiation with respect to x yields the corresponding pdf of the attached chromosome segment length in generation BCt under the condition that recombination between the target gene and the marker occurs for the first time in generation BCs:
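$$f_t(x \mid R_{0,y,s}) = \begin{cases} \dfrac{e^{-(t-s)x}\left[(t-s)\cosh^{s-1}(y-x)\sinh(y-x) + (s-1)\cosh^{s-2}(y-x)\sinh^{2}(y-x) + \cosh^{s}(y-x)\right]}{\cosh^{s-1} y\,\sinh y} & \text{for } x \in [0, y), \\[2ex] 0 & \text{otherwise.} \end{cases} \tag{24}$$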
Note that for s = t, Equation 24 simplifies to Equation 21 and that ft(x|R0,y,s) is not continuous for x = y.
Expected values and variances:
From the presented pdf's, expected values and variances of the distribution of X on one side of the target gene can be obtained with standard methods of calculus,
![]() |
(25) |
![]() |
(26) |
![]() |
(27) |
![]() |
(28) |
with
![]() |
(29) |
![]() |
(30) |
Note that integration must be performed separately for the intervals of definition of f(x|N0,y,t).
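The piecewise integration can be sketched numerically. Instead of integrating the pdf on each interval and adding the discrete mass at x = l, the hypothetical helper below integrates the survival function P(X > x | N), using E(X) = ∫ P(X > x) dx and E(X²) = ∫ 2x P(X > x) dx over [0, l]; this handles the point mass automatically. The survival function is written out from the event descriptions for Cases 1 and 2 and should be read as a reconstruction, not as the article's typeset equations.

```python
import math


def survival_no_recombinant(x: float, y: float, t: int) -> float:
    """P(X_t > x | no recombination between target and marker at y), 0 <= x < l."""
    if x < y:
        # Case 1: no crossover in [0, x), even count in [x, y), in every generation.
        return (math.cosh(y - x) / math.cosh(y)) ** t
    # Case 2: no crossover in [0, x) in every generation.
    return math.exp(-t * (x - y)) / math.cosh(y) ** t


def midpoint_integral(f, a: float, b: float, n: int = 2000) -> float:
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def moments_no_recombinant(y: float, l: float, t: int):
    """Mean and variance of X_t given N, integrating separately on [0, y] and [y, l]."""
    s = lambda x: survival_no_recombinant(x, y, t)
    xs = lambda x: 2.0 * x * s(x)
    mean = midpoint_integral(s, 0.0, y) + midpoint_integral(s, y, l)
    second = midpoint_integral(xs, 0.0, y) + midpoint_integral(xs, y, l)
    return mean, second - mean**2


mean, var = moments_no_recombinant(y=0.2, l=1.0, t=1)
print(mean, math.sqrt(var))
```

Splitting the integral at y matters because the survival function has a kink there; a single quadrature rule across the kink would converge more slowly.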
The expectation and variance of the length of the intact chromosome segment attached on one side of the target gene, when selecting in generations BC1 and BC2 for recombinants between the target gene and the marker, are presented in Table 1. In Appendix A we demonstrate how these equations were derived using Et(x|R0,y,1) as an example.
Selection of recombinants without marker analyses in previous generations:
There are situations in practice when the genotype of flanking markers is not examined right from the beginning of a backcrossing program but only in advanced generations. Frequently, the flanking markers used for identification of recombinants are positioned fairly distant from the target gene to have a high probability of success for recovering at least one recombinant with a limited population size (![]()
Numerical results:
Fig 1 shows the expected length Et(X) and the standard deviation SDt(X) of the intact chromosome segment attached on one side of the target gene in generations BC1 to BC15 (t = 1 ... 15) for backcross programs with and without background selection at flanking markers in generation BC1. The target locus is positioned at distance l = 1.0 M from the chromosome end and the flanking marker is located at distance y = 0.1, 0.2, 0.3, 0.4, 0.5 M from the target locus.
In generation BC1, E1(X|R0,y,1) = tanh(y/2) ≈ y/2 (Table 1). Hence, the expected length of the intact chromosome segment is approximately 0.25 M when selecting for a flanking marker at 0.5 M distance. Without marker-assisted selection, a value of approximately 0.25 M is reached only in generation BC4. The standard deviation of the length of the intact chromosome segment is distinctly smaller with marker-assisted selection in early backcross generations than without. For example, SD1(X|R0,0.1,1) = 0.015 while without selection at the flanking marker SD1(X) = 0.375. In advanced backcross generations, the differences between the two schemes become smaller. However, an expected length of the intact chromosome segment of approximately 0.05 M, as reached in generation BC1 with a flanking marker 0.1 M distant, is not reached even after 15 backcross generations without background selection.
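The comparison of expected lengths can be reproduced with two closed forms. The first is the BC1 expectation tanh(y/2) quoted from Table 1. The second is an assumption: without marker selection the one-sided survival function is taken as $e^{-tx}$ on [0, l), consistent with the exponential pdf quoted in the Introduction, which integrates to $(1 - e^{-tl})/t$. Function names are illustrative.

```python
import math


def expected_with_marker_bc1(y: float) -> float:
    """E_1(X | recombinant selected in BC1 at a marker y morgans away)."""
    return math.tanh(y / 2.0)


def expected_without_marker(t: int, l: float) -> float:
    """E_t(X) with selection for the target gene only (assumed survival e^{-tx} on [0, l))."""
    return (1.0 - math.exp(-t * l)) / t


l = 1.0
for y in (0.1, 0.5):
    print(f"marker at {y} M, BC1: {expected_with_marker_bc1(y):.4f} M")
for t in range(1, 16):
    print(f"no marker, BC{t}: {expected_without_marker(t, l):.4f} M")
```

Under these assumptions the unselected expectation falls to about 0.25 M in BC4 and is still above 0.05 M in BC15, which matches the statements made about Fig 1.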
Fig 2 shows the pdf's and cdf's of the length of the intact chromosome segment attached on one side of the target gene in generation BC5 for backcross programs with background selection at a flanking marker positioned at y = 0.1 or y = 0.5 M distant from the target gene, when a recombinant is selected in generations BC1 to BC5. The pdf for selection in generation BC5 at a marker with distance y = 0.1 M has only a small negative slope, whereas for selection in generation BC1 the pdf for large attached chromosome segments (x is near y) is only about half the absolute value of the pdf of small linked segments (x is near 0). The effect of the generation of selection on the difference in the pdf for small compared with large x values is even greater for a marker at position y = 0.5 M. Here, the probability of having large attached chromosome segments (x is near y) is almost zero when selecting for the flanking marker in generation BC1. The cdf for selection in generation BC1 is greater than for selection in BC2 to BC5, the difference being larger for y = 0.5 M than for y = 0.1 M. Consequently, the probability of having a smaller intact chromosome segment is greater with selection in an early compared with a late generation.
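The shape statements about Fig 2 can be checked with a small sketch of the pdf in generation BCt after selection of a recombinant in BCs. The density is obtained by differentiating the survival function implied by the event description in "Subsequent generations after selection of a recombinant"; it is a reconstruction under assumptions A and B, and `pdf_after_selection` is a hypothetical name.

```python
import math


def pdf_after_selection(x: float, y: float, s: int, t: int) -> float:
    """Density of the one-sided segment length in BCt, recombinant selected in BCs."""
    if not 0.0 <= x < y:
        return 0.0
    c = math.cosh(y - x)
    sh = math.sinh(y - x)
    bracket = (t - s) * c ** (s - 1) * sh + (s - 1) * c ** (s - 2) * sh**2 + c**s
    return math.exp(-(t - s) * x) * bracket / (math.cosh(y) ** (s - 1) * math.sinh(y))


def end_to_start_ratio(y: float, s: int, t: int) -> float:
    """pdf just below x = y divided by pdf at x = 0."""
    return pdf_after_selection(y - 1e-12, y, s, t) / pdf_after_selection(0.0, y, s, t)


for y in (0.1, 0.5):
    for s in (1, 5):
        print(f"y = {y} M, selection in BC{s}: ratio = {end_to_start_ratio(y, s, 5):.3f}")
```

For y = 0.1 M the ratio is close to one half with selection in BC1 and close to one with selection in BC5; for y = 0.5 M and selection in BC1 it is far smaller, in line with the description of the figure.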
| TOTAL LENGTH OF THE INTACT DONOR CHROMOSOME SEGMENT |
|---|
In this section we use an abbreviated notation. The two sides of the target locus are named a and b, which are used as subscripts to mark parameters for the respective side. Parameters without subscript a or b refer to sums of both sides: X = Xa + Xb, l = la + lb, y = ya + yb. The subscripts for the generation were dropped, and fa(xa) and fb(xb) can be any of the previously derived pdf's, referring to a certain backcross generation t. The events Ra and Rb denote that recombination between the target gene and the marker occurred on the respective side of the target gene in backcross generation s ≤ t; Na and Nb denote that no recombination occurred in any backcross generation s ≤ t. Note that the generation s, in which recombination occurred, can be different for sides a and b.
Expected values and variances:
Calculation of expected values and variances is straightforward due to the stochastic independence of Xa and Xb:
$$E(X) = E(X_a) + E(X_b) \tag{31}$$
$$\operatorname{var}(X) = \operatorname{var}(X_a) + \operatorname{var}(X_b) \tag{32}$$
For derivation of the pdf's we distinguish the following three cases. The cdf's can be obtained by integrating the pdf's.
Case 1. Recombination on both sides of the target gene:
Under condition Ra ∩ Rb both random variables Xa and Xb are continuous [i.e., fa and fb are either ft(x|R0,y,t) or ft(x|R0,y,s)]. Without loss of generality, we assume ya ≥ yb. Because of the stochastic independence of Xa and Xb, the joint density of (xa, xb) is calculated by multiplying the marginal densities. Consider a certain length x of the intact donor chromosome segment: from x = xa + xb follows xb = x - xa. Hence, the probability density for any x is obtained by integration of the joint density of (xa, x - xa) over all possible values for xa that result in x = xa + xb. We denote this integral by
![]() |
(33) |
The limits of integration, i.e., the minimum and maximum length of the chromosome segment attached on side a, depend on the total length of the intact donor chromosome segment x. For x < yb the whole donor chromosome segment may be on either side of the target gene; hence, xa can range from 0 to x. If yb ≤ x, the length on side a has to be at least x - yb. The maximum length of xa under condition Ra is ya. Hence, the pdf can be written in terms of i as
![]() |
(34) |
This principle is illustrated in Appendix B, using as an example the pdf of the length of the intact donor chromosome segment around the target gene in generation BC1, when selection is for recombinants at flanking markers on both sides.
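The convolution principle for Case 1 can be sketched numerically for generation BC1. The one-sided density cosh(y - x)/sinh(y) on [0, y) used here is the BC1 special case implied by E1(X|R0,y,1) = tanh(y/2); the integration limits max(0, x - yb) and min(x, ya) encode the case distinction described above. All function names are hypothetical.

```python
import math


def pdf_one_side_bc1(x: float, y: float) -> float:
    """One-sided density in BC1 given recombination inside [0, y)."""
    return math.cosh(y - x) / math.sinh(y) if 0.0 <= x < y else 0.0


def midpoint_integral(f, a: float, b: float, n: int = 200) -> float:
    """Composite midpoint rule; returns 0 for an empty interval."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def pdf_total_bc1(x: float, ya: float, yb: float) -> float:
    """Density of X = Xa + Xb: integrate fa(xa) * fb(x - xa) over feasible xa."""
    lower = max(0.0, x - yb)   # side b cannot exceed yb
    upper = min(x, ya)         # side a cannot exceed ya or x
    return midpoint_integral(
        lambda xa: pdf_one_side_bc1(xa, ya) * pdf_one_side_bc1(x - xa, yb),
        lower,
        upper,
    )


ya, yb = 0.15, 0.05
area = midpoint_integral(lambda x: pdf_total_bc1(x, ya, yb), 0.0, ya + yb)
mean = midpoint_integral(lambda x: x * pdf_total_bc1(x, ya, yb), 0.0, ya + yb)
print(area, mean)
```

The mean of the total length should agree with the sum of the one-sided expectations, tanh(ya/2) + tanh(yb/2), which gives an independent check on the integration limits.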
Case 2. Recombination on one side of the target gene:
Without loss of generality we consider Na ∩ Rb [i.e., fa is ft(x|N0,y,t) and fb is either ft(x|R0,y,t) or ft(x|R0,y,s)] and distinguish la > yb and la ≤ yb. For la > yb, the value of xa can range between 0 and x if x < yb; otherwise, the minimum of xa is x - yb and the maximum is la. If x > la, it is also possible that no recombination on side a occurred (Xa = la) and the probability density for this case adds to the integral because of the discrete character of Xa for xa = la. Hence, we obtain for la > yb
![]() |
(35) |
In analogy, the pdf for la ≤ yb is
![]() |
(36) |
Note that for condition Na the function fa is defined depending on the value of x (Equation 18). This must be taken into account for the integration.
Case 3. No recombination on either side:
We have Na and Nb [i.e., fa and fb are ft(x|N0,y,t)]. Without loss of generality we assume la
lb. In analogy to the above cases, the pdf of the length of the intact chromosome segment can be derived as
![]() |
(37) |
The random variable X is discrete for x = l:
![]() |
(38) |
Numerical results:
Fig 3 shows the pdf's and cdf's of the total length of the intact donor chromosome segment around a target gene for various combinations of marker distances ya and yb, which sum up to 0.2 M. As reflected by the shape of the pdf's, for a symmetric marker bracket there is a high chance of detecting recombinants with a medium length of the intact chromosome segment, while for an asymmetric marker bracket, the probability of finding recombinants with a short or long intact chromosome segment increases.
| DISCUSSION |
|---|
Genetic model:
Following earlier studies (![]()
We used HALDANE's (1919) Poisson model due to its mathematical simplicity, its exponential interevent distribution, and the stochastic independence of crossover formations in adjacent chromosome regions, which allowed us to derive closed analytical formulas for the problems addressed in this article. Applying gamma interevent distributions would in most instances yield unwieldy formulas, which could only be numerically approximated. Moreover, as pointed out by ![]()
Under the assumption of positive chiasma interference (![]()
Comparison with earlier studies:
Our results for marker-assisted selection of recombinants can readily be used to derive the cdf, when selection is only for presence of the target gene. For any y ∈ (0, l) and generation BCt, the disjoint events R0,y,i (i = 1, ... , t) and N0,y,t represent a mutually exclusive partition of the entire probability space. Using the theorem of total probability, we obtain for y < x < l:
![]() |
(39) |
Inserting Equation 8, Equation 9, Equation 15, and Equation 20, we obtain FISHER's (1949, p. 50) probability p presented in the Introduction, which is also the basis of HANSON's (1959) formula. Summarizing, the relation between the three studies can be described as follows: ![]()
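The total-probability argument can be checked numerically. For y < x < l the conditional cdf given any R0,y,i equals 1, because the intact segment then ends before the marker, so only the term for N0,y,t depends on x. The sketch below assumes the reconstructed forms $P(R_{0,y,i}) = (e^{-y}\cosh y)^{i-1} e^{-y}\sinh y$, $P(N_{0,y,t}) = (e^{-y}\cosh y)^t$, and $F_t(x|N_{0,y,t}) = 1 - e^{-t(x-y)}/\cosh^t y$; the function name is hypothetical.

```python
import math


def cdf_by_total_probability(x: float, y: float, t: int) -> float:
    """Unconditional cdf of the one-sided segment length for y < x < l."""
    if x <= y:
        raise ValueError("this sketch only covers y < x")
    even = math.exp(-y) * math.cosh(y)
    odd = math.exp(-y) * math.sinh(y)
    # Weights of the partition R_1, ..., R_t, N_t.
    weight_recombinant = sum(even ** (i - 1) * odd for i in range(1, t + 1))
    weight_none = even**t
    cdf_given_recombinant = 1.0  # segment is shorter than y < x
    cdf_given_none = 1.0 - math.exp(-t * (x - y)) / math.cosh(y) ** t
    return weight_recombinant * cdf_given_recombinant + weight_none * cdf_given_none


x, t = 0.8, 4
for y in (0.1, 0.3, 0.5):
    print(y, cdf_by_total_probability(x, y, t), 1.0 - math.exp(-t * x))
```

Whatever marker position y is chosen, the mixture collapses to $1 - e^{-tx}$, the exponential law whose density $te^{-tx}$ appears in the Introduction.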
Applications of the theory:
Marker-assisted backcrossing is applied to the following tasks in breeding and genetic research: (1) transfer of a target gene, which may be a transgene or another major gene (e.g., a disease resistance gene); (2) transfer of a chromosome region, which contains a favorable allele at a putative quantitative trait locus (QTL); and (3) development of near-isogenic lines (NILs). Our theoretical results can be applied to the experimental design of such backcross programs and for monitoring the length of the attached chromosome segment in various generations.
Transfer of a gene:
To optimize marker-assisted selection for transfer of a gene, ![]()
![]() |
(40) |
where z+, y+a, y+b denote heterozygosity at the target locus and two flanking markers at distance ya and yb, respectively, and y-a, y-b denote homozygosity for the recurrent parent allele at the respective loci. Without loss of generality we assume ya
yb. If several individuals of the most preferable genotype (according to the above ordering) are found, selection of the best among them is based on a selection index calculated from the genotype at additional markers on the carrier chromosome of the target gene and on the noncarrier chromosomes as proposed by ![]()
Before starting a t-generation backcross program, our results can be used to determine a priori the effect of the population size n1, ... , nt in generations BC1 to BCt and the flanking marker distances ya and yb on the probability distribution of the intact chromosome segment in the selected individual in generation BCt. The pdf of the attached chromosome segment length on one side of the target gene in generation BCt is a mixture of the conditional pdf's for selection of a recombinant in one of generations BC1 to BCt and the conditional pdf for the case that no recombinant is selected up to generation BCt. The respective weights are calculated from the multinomial distribution, following the principle described in detail in Equations 37 to 39 of ![]()
{a, b} in generation BCt is
![]() |
(41) |
where
![]() |
(42) |
The probability that the attached chromosome segment comprises the complete distance between the target gene and the end of the chromosome is
![]() |
(43) |
The cdf, expectation, and variance can be obtained from Equation 41 and Equation 43 with standard methods, and the distribution of the total length of the intact chromosome segment is obtained according to the principle described in theory. These formulas can be used before starting the backcross program to calculate the following:
- The expected length of the intact chromosome segment for given flanking marker distances ya, yb, and population sizes nt;
- the population sizes n1, ... , nt required for given flanking marker distances ya and yb to obtain a desired value for the expected intact donor chromosome segment length, or to obtain with a given probability an intact chromosome segment length shorter than a value u by using F(u); and
- the flanking marker distances ya and yb required for given population sizes nt to obtain a desired value for the expected intact donor chromosome segment length, or to obtain with a given probability an intact chromosome segment length shorter than a value u by using F(u).
During the breeding program, our results can be used to infer the length of the intact chromosome segment from the known genotype of an individual (a posteriori situation). We illustrate this by a three-generation backcross program. Consider a single recombinant in generation BC1. On the side of no recombination, the probability distribution of the length of the chromosome segment attached to the target gene is described by the equations derived in Generations prior to detection of a recombinant, whereas on the side of recombination, the results derived in Generation in which a recombinant is detected and selected apply. These results also apply in generation BC2 to the second side of the target gene, when recombination occurs. The results derived in Subsequent generations after selection of a recombinant apply in generation BC2 to the side, where recombination occurred already in generation BC1, as well as to both sides of the target gene in more advanced backcross generations. Consequently, the given formulas allow a complete description of the length of the chromosome segment attached to the target gene in such a backcross program.
Introgression of favorable alleles at quantitative trait loci:
Marker-assisted selection in QTL introgression usually comprises selection for presence of the donor allele at two markers za and zb delimiting the interval in which the putative QTL was detected and for the recurrent parent allele at markers ya and yb flanking the QTL interval [za, zb] (![]()
Development of near-isogenic lines:
A set of NILs, of which each line differs from any other line in one chromosome region, can be employed for confirmation, reanalysis, and fine mapping of QTL (![]()
Selection of several individuals and application in animal breeding:
In developing the presented theory, we assumed that in each generation one individual was backcrossed to the recurrent parent. However, especially in an animal breeding context lower selection intensities may be desirable, for example, by backcrossing all recombinant individuals recovered in a backcross population. Since our results on pdf and cdf are valid irrespective of the number of recombinant individuals selected per generation, they also apply to such breeding plans.
Furthermore, our approach can be extended to derive the distribution of the intact donor chromosome segment in the "best" of several recombinant individuals for two important special cases using results from order statistics. The latter requires the stochastic independence of the length of the intact donor chromosome segment for the individuals under consideration. This holds true (a) for BC1 populations or (b) in advanced backcross generations s, if each BCs individual traces back to a different ancestor in generation BC1. Consider one side of the target gene and suppose that m recombinant individuals are found. Then, the pdf of the first order statistic is obtained as
![]() |
(44) |
(![]()
![]() |
(45) |
This result can be used to calculate the moments of the first order statistic, referring to the length of the attached donor chromosome segment in the recombinant individual with the shortest segment in a sample of m.
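The moments of the first order statistic can be sketched for BC1 recombinants. The code uses the standard density of the minimum of m independent, identically distributed variables, m[1 - F(x)]^(m-1) f(x), together with the BC1 one-sided survival function sinh(y - x)/sinh(y), which is an assumption consistent with E1(X|R0,y,1) = tanh(y/2). The names below are illustrative.

```python
import math


def survival_bc1(x: float, y: float) -> float:
    """P(X > x) for a BC1 recombinant with marker at y, 0 <= x < y."""
    return math.sinh(y - x) / math.sinh(y)


def pdf_bc1(x: float, y: float) -> float:
    """Density of the one-sided segment length for a BC1 recombinant."""
    return math.cosh(y - x) / math.sinh(y)


def pdf_shortest(x: float, y: float, m: int) -> float:
    """Density of the shortest segment among m independent BC1 recombinants."""
    return m * survival_bc1(x, y) ** (m - 1) * pdf_bc1(x, y)


def expected_shortest(y: float, m: int, n: int = 2000) -> float:
    """Mean of the first order statistic by the midpoint rule on [0, y]."""
    h = y / n
    return h * sum((k + 0.5) * h * pdf_shortest((k + 0.5) * h, y, m) for k in range(n))


for m in (1, 2, 5, 10):
    print(m, expected_shortest(0.2, m))
```

With m = 1 the result reduces to tanh(y/2), and the expected length of the best recombinant shrinks as more independent recombinants are available to choose from.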
Placement of flanking markers:
If several markers on both sides of the target gene are available, it is of interest to compare the effect of symmetric versus asymmetric placement of flanking markers on the intact donor chromosome segment length. As reflected by the shape of the pdf's shown in Fig 3, for a symmetric marker bracket there is a high chance of detecting recombinants with a medium length of the intact chromosome segment, while for an asymmetric marker bracket, the probability of finding recombinants with a short or long intact chromosome segment increases. Larger population sizes are required in a backcross program with an asymmetric rather than a symmetric marker bracket (![]()
Generation of selection:
A marker-assisted backcross program usually comprises three or more generations. Hence, it is of interest to compare the effect of the generation in which a recombinant is selected on the intact donor chromosome segment length in the final breeding product. The probability of having a smaller intact chromosome segment is greater with selection in an early generation than with selection in an advanced generation (Fig 2), because crossover events in subsequent generations after selection may result in a further reduction of the intact chromosome segment. The shape of the pdf of X in the final backcross generation (Fig 2) reveals that with a closely linked flanking marker (y = 0.1 M) and selection in an advanced generation, individuals with a short intact chromosome segment occur almost as frequently as individuals with a long intact segment (compared with y). However, with increasing marker distance (y = 0.5 M) and selection in early generations, the chance of recovering individuals with relatively short segments is considerably increased.
These results show that in practical breeding programs selection of recombinants between marker and target in early generations is not only advantageous with respect to the resources required (![]()
Donor and recipient are elite:
In backcross programs for transfer of a desirable gene from one elite line to another, it is not necessary to have a maximum reduction of the attached chromosome segment because tight linkage of undesirable traits is unlikely and there may be even positive effects caused by the attached chromosome segment (![]()
In such a breeding program selection for recombinants between the target gene and a flanking marker is effective even when the marker is fairly distant from the target gene. For example, a saving of three backcross generations concerning the expected length of the linked chromosome segment is realized with a marker distance of y = 0.5 M (Fig 1). The considerably reduced standard deviation of the linked chromosome segment length with background selection compared to selection only for the target gene (Fig 1) reflects the fact that without marker-assisted selection large intact segments occur quite frequently in early generations. This is due to the absence of crossover events between the target gene and the end of the chromosome and results in the undesired introgression of large intact donor chromosome segments.
Because recombinants between the target gene and fairly distant flanking markers occur with a high probability even in small backcross populations (![]()
Donor is unadapted and recipient is elite:
In a backcross program for transfer of a target gene from unadapted material into breeding material used for variety development, a short attached chromosome segment is important. In a classical backcross program more than the generally recommended six backcross generations are required in this case (![]()
In a backcross program with tightly linked flanking markers, the sequential analysis of markers surrounding the target gene can assure an economic use of resources: First, a fairly distant flanking marker is analyzed. Assuming a given population size, its distance from the target locus can be determined such that with a high probability at least one single or double recombinant is found (![]()
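The choice of the distance of the first marker can be illustrated with a small calculation. The sketch below is a simplification and rests on assumptions not taken from the article: a single flanking marker on one side, n individuals that all carry the target gene, one backcross generation, and Haldane's mapping function. All function names are hypothetical.

```python
import math


def haldane_recombination(d):
    """Recombination frequency for map distance d (Morgans), Haldane's mapping function."""
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def prob_at_least_one_recombinant(d, n):
    """P(at least one of n target-gene carriers is recombinant
    between the target and a marker at distance d)."""
    return 1.0 - (1.0 - haldane_recombination(d)) ** n


def min_marker_distance(n, q):
    """Smallest distance d (Morgans) for which
    prob_at_least_one_recombinant(d, n) >= q.

    Returns None if even an unlinked marker (r = 0.5) cannot reach q.
    """
    r_needed = 1.0 - (1.0 - q) ** (1.0 / n)
    if r_needed >= 0.5:
        return None
    # Invert Haldane's mapping function r = (1 - exp(-2d)) / 2.
    return -0.5 * math.log(1.0 - 2.0 * r_needed)
```

For example, `min_marker_distance(50, 0.99)` gives the distance at which 50 carriers contain at least one recombinant with probability 0.99. A more closely linked marker would then be assayed only in the recombinants found at this first step.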
Further research needs: Especially in early backcross generations, donor chromosome segments not directly attached to the target gene contribute a substantial amount to the total fraction of the undesirable donor genome in a backcross individual. We are currently investigating whether our approach can be extended to obtain a complete description of the distribution of the total donor genome proportion for a given marker genotype at several markers distributed throughout the genome.
| ACKNOWLEDGMENTS |
|---|
We greatly appreciate the suggestions and comments of two anonymous reviewers, which helped to improve this article.
Manuscript received May 19, 2000; Accepted for publication November 20, 2000.
| APPENDIX A |
|---|
Calculation of Et(X|R0,y,1):
We demonstrate calculation of the expected length of the intact chromosome segment attached on one side of the target gene in generation BCt, when selection for a recombinant individual was performed in generation BC1 (s = 1). The results in Table 1 are obtained with analogous calculations.
Inserting s = 1 in Equation 24 yields the pdf
![]() |
(A1) |
for x ∈ [0, y). Calculation of the expected value requires integration:
![]() |
(A2) |
Equation A2 is not defined for t = 2 because both the numerator and the denominator are 0. With l'Hôpital's rule we calculate the limit of Et(X|R0,y,1) for t → 2 and obtain
![]() |
(A3) |
| APPENDIX B |
|---|
Calculation of f(x|Ra ∩ Rb):
We illustrate calculation of the pdf of the length of the intact donor chromosome segment around the target gene in generation BC1, when recombination between the target gene and markers occurred on both sides. From Equation 21 we obtain
![]() |
(B1) |
![]() |
(B2) |
For calculation of the pdf of X = Xa + Xb, we need the integral i(α, β) (Equation 33). First we calculate the indefinite integral
![]() |
(B3) |
Substituting xb = x − xa and integrating yields
![]() |
(B4) |
The integral i(α, β) is then the difference of the antiderivative in Equation B4 evaluated at α and at β, and hence, we obtain the pdf
![]() |
(B5) |
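The step from the two one-sided densities to the pdf of the sum follows the usual convolution of two independent lengths. A generic sketch, in which f_a, f_b and the marker positions y_a, y_b on the two sides are placeholder names rather than the article's notation, and in which independence of the two sides is assumed (no interference):

$$f_X(x) = \int_{\max(0,\,x - y_b)}^{\min(x,\,y_a)} f_a(x_a)\, f_b(x - x_a)\, dx_a, \qquad 0 \le x < y_a + y_b .$$

The integration limits keep both x_a ∈ [0, y_a) and x_b = x − x_a ∈ [0, y_b).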
| APPENDIX C |
|---|
Selection of recombinants without marker analyses in previous generations:
In practical breeding programs, there are situations when a flanking marker is analyzed for the first time in an advanced backcross generation BCs. Here, we derive the distribution of the length of the donor chromosome segment attached on one side of the target gene for two such cases. We define the event
$B_{u,v,s} = \bigcup_{i=1}^{s} R_{u,v,i}$: Recombination between loci at positions u and v was observed in generation BCs but it is unknown in which generation BCi (i ≤ s) recombination occurred for the first time.
The probability of event Bu,v,s is
![]() |
(C1) |
Marker assay only in an advanced backcross generation, B0,y,s:
In this case, the distribution of the length of the chromosome segment attached on one side of the target gene is a mixture of the distributions under condition P(R0,y,i) with weights
![]() |
(C2) |
Hence, we have for x ∈ [0, y)
![]() |
(C3) |
![]() |
(C6) |
![]() |
(C7) |
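The mixture weights can be illustrated numerically. The sketch below is not taken from the article's equations. It assumes that recombination occurs for the first time in generation BCi with probability (1 − r)^(i−1) r, where r is the recombination frequency between target and marker under Haldane's mapping function, and that each weight is this probability divided by the probability of B0,y,s. The function name is hypothetical.

```python
import math


def mixture_weights(y, s):
    """Weights P(R_{0,y,i}) / P(B_{0,y,s}) for i = 1, ..., s.

    Assumption: first recombination in BC_i has probability (1 - r)**(i - 1) * r,
    with r from Haldane's mapping function for map distance y (Morgans).
    """
    r = 0.5 * (1.0 - math.exp(-2.0 * y))
    first_in_generation = [(1.0 - r) ** (i - 1) * r for i in range(1, s + 1)]
    prob_b = 1.0 - (1.0 - r) ** s  # marker recombinant at the latest in BC_s
    return [p / prob_b for p in first_in_generation]
```

Because the events R0,y,i are disjoint, the weights sum to 1. Under this assumption earlier generations always receive the larger weights, and the weights approach 1/s as the marker distance y approaches 0.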
Marker assay with more closely linked markers after detection of recombinants, B0,y,s ∩ R0,y*,s:
In generation BCs, recombination between the target gene and a marker at position y*, which was analyzed in all previous generations, was observed for the first time. Recombination between the target gene and a second marker at position y < y*, which was analyzed for the first time in generation BCs, was also observed.
It is unknown in which generation BCi (i ≤ s) recombination between the target gene and the marker at position y occurred. The distribution of the length of the chromosome segment attached on one side of the target is a mixture of the distributions under conditions R0,y,i ∩ R0,y*,s with weights
![]() |
(C8) |
where
![]() |
(C9) |
and
![]() |
(C10) |
Let BCz be the generation in which recombination between the target gene and the marker at position y* occurred. It can be shown that, for z = s and x ∈ [0, y),
![]() |
(C11) |
![]() |
(C12) |
![]() |
(C13) |
and, for z < s and x ∈ [0, y),