Corresponding author: John A. Woolliams, Roslin Institute (Edinburgh), Roslin, Midlothian EH25 9PS, United Kingdom. E-mail: john.woolliams@bbsrc.ac.uk
Communicating editor: R. G. SHAW
| ABSTRACT |
|---|
Tractable forms of predicting rates of inbreeding (ΔF) in selected populations with general indices, nonrandom mating, and overlapping generations were developed, with the principal results assuming a period of equilibrium in the selection process. An existing theorem concerning the relationship between squared long-term genetic contributions and rates of inbreeding was extended to nonrandom mating and to overlapping generations. ΔF was shown to be ~1/4(1 - ω) times the expected sum of squared lifetime contributions, where ω is the deviation from Hardy-Weinberg proportions. This relationship cannot be used for prediction since it is based upon observed quantities. Therefore, the relationship was further developed to express ΔF in terms of expected long-term contributions that are conditional on a set of selective advantages that relate the selection processes in two consecutive generations and are predictable quantities. With random mating, if selected family sizes are assumed to be independent Poisson variables then the expected long-term contribution could be substituted for the observed, providing 1/4 (since ω = 0) was increased to 1/2. Established theory was used to provide a correction term to account for deviations from the Poisson assumptions. The equations were successfully applied, using simple linear models, to the problem of predicting ΔF with sib indices in discrete generations since previously published solutions had proved complex.
WRAY and THOMPSON (1990) proved a fundamental relationship between the sum of squared long-term genetic contributions of ancestors and rates of inbreeding for random mating populations in discrete generations. One consequence of this relationship was that rates of inbreeding were tied to the numerator relationship matrix for the first time. This narrowed the conceptual gap between the central parameter for genetic evaluation of individuals using best linear unbiased prediction and one of the key properties of a breeding scheme. Another important consequence was to set out in a formal way a model for the mechanics of inheritance of selected advantage, a concept that […] ΔF in mass selection through modeling pathway extensions. However, this was done by using a recursive algorithm, so that although the mechanics were clear, the overall structure of the prediction remained obscure.
[…] ΔF. It was shown to have terms involving variances of family size in one generation, with additional terms for the proliferation or reduction of ancestral lines over many generations that could be predicted as a result of the selective advantage of the ancestor. Furthermore, it was clear that under equilibrium conditions, the model would lend itself to geometric summation of terms across generations. This led to simple forms for the expected long-term contribution of an ancestor. […] ΔF in mass selection. They obtained a neater closed form for ΔF than that derived by […]
This article examines the issues raised by the work described above. First, the relationship between ΔF and the realized long-term genetic contributions is extended to include nonrandom mating and overlapping generations. Second, an important result for the prediction of ΔF is developed by demonstrating a relationship between ΔF and the expected squared long-term contribution conditional on the selective advantages for random mating. Finally, as an example of application, predictions of ΔF for sib indices, previously considered by […]
| RELATIONSHIP BETWEEN ΔF AND LONG-TERM GENETIC CONTRIBUTIONS |
|---|
This section discusses the relationship between ΔF and realized long-term genetic contributions. In doing so, it derives the expected increase in homozygosity at the level of a neutral locus in contrast to the matrix method of […]
Discrete generations:
Consider one of these alleles in the base population at a neutral locus (say allele B). Let the gene frequency at time t, in the parents of sex q that have been selected to produce generation t + 1, be denoted by PB(q, t). The gene frequency can be described in terms of genetic contributions similar to Equation 1 of […]:
$$P_B(q,t) = \sum_i r_{i,0}(q,t)\,A_{i,0} + \sum_{u=1}^{t}\sum_i r_{i,u}(q,t)\,a_{i,u} \tag{1}$$
where ri,u(q, t) is the genetic contribution of individual i born at time u to the parents of sex q at time t, with breeding value for frequency of allele B given by Ai,u and Mendelian sampling terms ai,u = Ai,u - 1/2(Asire + Adam). Equation 1 separates out the base generation, which provides the foundation alleles, and subsequent generations, which influence the frequency of the allele through the Mendelian sampling of their parent alleles. The variance of the Mendelian sampling terms will depend on Asire and Adam; Var(ai,u) = 0 if both Asire and Adam are homozygotes, 1/8 if they are both heterozygotes, or 1/16 otherwise. Since B is unique, Ai,0 is 0 for all individuals except for one individual for which Ai,0 = 1/2. The genetic contribution of an individual to the generation of its birth is ri,t(m, t) = 1/Xm if i is male or 0 if i is female, and ri,t(f, t) = 1/Xf if i is female or 0 if i is male.
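The three values quoted for Var(ai,u) can be checked by enumerating the four equally likely gamete pairs. The sketch below is illustrative only: the function name and the tuple encoding of genotypes are assumptions, and the breeding value is taken to be the frequency of allele B within an individual (0, 1/2, or 1), as the text implies.

```python
from fractions import Fraction
from itertools import product


def mendelian_sampling_variance(sire, dam, allele="B"):
    """Exact Var(a) for an offspring of the given parental genotypes.

    Genotypes are 2-tuples of allele labels. The breeding value A is the
    frequency of `allele` within an individual, and the Mendelian sampling
    term is a = A_offspring - 1/2 (A_sire + A_dam).
    """
    def freq(genotype):
        return Fraction(sum(g == allele for g in genotype), 2)

    parent_mean = (freq(sire) + freq(dam)) / 2
    # Each (sire gamete, dam gamete) pair occurs with probability 1/4.
    deviations = [freq((s, d)) - parent_mean for s, d in product(sire, dam)]
    mean = sum(deviations) / 4
    return sum((x - mean) ** 2 for x in deviations) / 4


print(mendelian_sampling_variance(("B", "B"), ("b", "b")))  # both homozygotes
print(mendelian_sampling_variance(("B", "b"), ("B", "b")))  # both heterozygotes
print(mendelian_sampling_variance(("B", "b"), ("b", "b")))  # one heterozygote
```

Exact fractions are used so that the results can be compared directly with 0, 1/8, and 1/16.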
Initially assume that there is random mating. For any generation the probability of homozygotes for B is obtained from the product of the gene frequencies in the male and in female parents and is PB(m, t)PB(f, t). The inbreeding coefficient Ft for the neutral locus is then the sum over all distinct alleles at the locus,
![]() |
(2) |
where ri,u(q, t - 1) is the average contribution to parents of sex q at time t - 1. (Note the breeding values and Mendelian sampling terms will depend on the allele but this dependence has not been made explicit to spare notation.) For each allele the cross-product terms in Ai,0 Aj,0 are zero since Ai,0 = 0 except for a single individual. Since the Mendelian sampling terms from different individuals are independent of all other terms for a neutral locus, all cross-products of the Mendelian sampling terms are zero.
More precisely, for each allele and each ancestor, the term Σi ri,u(m, t - 1)ri,u(f, t - 1)a²i,u should be the sum of products of contributions of the ancestor to each male and female mating pair:
![]() |
(3) |
This will account for any nonrandom mating of parents. For a neutral locus, the covariance between ri and ai will be 0 ([…]) […] E[Σi Σmates (j(m),j*(f)) ri,u(j(m), t - 1)ri,u(j*(f), t - 1)]E[a²i,u]. Let the first of these, the expectation of the cross-products of contributions to mates, be Cu(t - 1). Note that (i) Ct-1(t - 1) = 0 since an individual without offspring cannot contribute to both sexes and (ii) the first term in Equation 2 is 1/2C0(t - 1) since A²i,0 has a value 1/4 for each of its two alleles and 0 otherwise.
Assume equilibrium values for (i) the deviation from Hardy-Weinberg frequencies arising from the nonrandom mating (ω, equivalent to αI of […]) and (ii) […] ΔF, attained by generation 2 (this assumption is removed later); then Equation 2 can be further simplified using results given in Appendix A, namely, Σalleles E[a²i,u] = 1/4 for u = 1 and 1/4(1 - ω)(1 - ΔF)^(u-1) for u ≥ 2. Therefore,
![]() |
(4) |
![]() |
(5) |
Subtracting (5) from (4) and rearranging terms,
![]() |
(6) |
Assuming equilibrium, then a steady state of pedigree development will occur and the expectation of the cross-products will be determined by the number of generations over which they have developed, i.e., Cu(t) = Cu-1(t - 1) since both terms represent contributions t - u generations after the birth of the ancestor. This is not a strong assumption in the context of the problem since in the absence of an equilibrium there would be no single ΔF to predict.
Therefore, the terms in Cu(t) can be modified to terms in Cu-1(t - 1), and each term of the sum within the square brackets of Equation 6 can be reduced to -ΔFCu(t - 1). After repeating this process for the C2(t) term [and temporarily neglecting the term in ωΔFC1(t - 1)],
![]() |
(7) |
For large enough t, the terms in Cu(t) will converge for a given u. Therefore, 1/2C0(t) ≈ 1/2C0(t - 1), and 1/4C1(t) - 1/4ωC1(t - 1) ≈ 1/4(1 - ω)C1(t - 1); then by adding and subtracting the term 1/2ΔFC0(t),
![]() |
(8) |
Finally, note E[Ft+1 - Ft] = ΔFE[1 - Ft] and that the term in square brackets in Equation 8 is E[Ft], giving
![]() |
(9) |
This result holds for t large enough for contributions from early generations to have converged. If it is assumed that the base generation used for defining the inbreeding coefficients was chosen to be part of a period of equilibrium, then C1(t - 1) = C0(t) = C, and
$$\Delta F = \tfrac{1}{4}(1-\omega)\,C\,\bigl[1 - \tfrac{1}{2}C\bigr]^{-1} \tag{10}$$
where C is the sum of squared converged contributions for a generation, chosen arbitrarily within the period of equilibrium. Including the term neglected between Equation 6 and Equation 7 would replace [1 - 1/2C]⁻¹ by [1 - (1/2 + 1/4ω)C]⁻¹. For random mating, omitting the term [1 - 1/2C]⁻¹ leads to an underestimate with a fractional error of ~1/2C, which in turn is ~2ΔF.
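The size of the neglected factor can be illustrated numerically. The sketch below assumes that Equation 10 has the form ΔF = 1/4(1 - ω)C[1 - 1/2C]⁻¹, which is inferred from the error terms quoted above rather than read from the equation itself; the function names and the values of C are hypothetical.

```python
def delta_f_full(c, omega=0.0):
    """Assumed form of Equation 10: 1/4 (1 - omega) C / (1 - C/2)."""
    return 0.25 * (1.0 - omega) * c / (1.0 - 0.5 * c)


def delta_f_leading(c, omega=0.0):
    """Same expression with the factor [1 - C/2]^-1 omitted."""
    return 0.25 * (1.0 - omega) * c


for c in (0.02, 0.05, 0.10):  # hypothetical sums of squared contributions
    full = delta_f_full(c)
    leading = delta_f_leading(c)
    fractional_error = (full - leading) / full
    # The fractional error equals C/2, which is close to 2 * delta F.
    print(c, round(full, 5), round(fractional_error, 5), round(2 * full, 5))
```

Under this assumed form the fractional underestimate is exactly C/2, and it differs from 2ΔF only by a term of order C².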
Since C = E[Σi Σmates (j(m),j*(f)) ri,u(j(m), t) ri,u(j*(f), t)] for large u << t, for any i the terms ri,u(j(m), t) and ri,u(j(f), t) converge to the same value for all j in generation t providing the population mixes. This value will be the long-term contribution of ancestor i to the population, denoted by ri. This will occur with or without random mating. Thus C = E[Σi r²i] for a generation of ancestors, which leads to
![]() |
(11) |
![]() |
(12) |
In Equation 12, the expectations are conditional on the individual i being a selected ancestor; however, since ri = 0 for an unselected ancestor, Equation 12 can also be given as
![]() |
(13) |
where Tm and Tf are the number of candidates for selection in each sex and the expectation is for a candidate (i.e., it is not conditional on i being selected). (E[ΔF] is used in Equation 12 and Equation 13, rather than simply ΔF, to emphasize that the result is an expectation over replicate populations.)
This result was obtained for ω = 0 by […] ΔF" may exist. Second, the derivation using the probability of homozygosity for an assumed allele is of value since the proof of […] ΔF with nonrandom mating and it is now clear why this was so.
Even though the development of the pedigree may be in equilibrium (which will imply the genetic variance being selected upon is in equilibrium) this does not imply that equilibrium values of ω and ΔF for the alleles defined in the arbitrary base are immediately attained. Equation 4, using Appendix A, assumes that these parameters were in equilibrium for the Mendelian sampling in generation 2. However, the following argument shows that this does not affect the result. Assume the equilibrium conditions have not been attained by generation 2; then for this generation plus a small number of generations following (i.e., up to attainment of equilibrium) there will be terms of the form […]Cu(t) in Equation 4 and […]Cu(t - 1) in Equation 5. Providing t is sufficiently large compared to the period of attainment, these terms will cancel in Equation 6 since Cu(t) is a convergent series. Thus Equations 10-13 will hold for the equilibrium values of ω and ΔF.
Overlapping generations:
If ΔF is taken per unit time then the structure of the preceding proof holds. The reduction in the variance of the Mendelian sampling term over initial cohorts, before an equilibrium ΔF/unit time is established, is not straightforward since it will depend upon the age structure of the population; but the previous argument used to overcome deviations from equilibrium can be applied. However, one distinction in overlapping generations is that the base generation will contain the equivalent of L cohorts, where L is the period of time over which the long-term contributions sum to one, since this is the period required for the population to turn over a generation for those genes destined to remain in the population in the long-term. […] ΔF/generation) or equivalently 1/2C0(t)L (ΔF/unit time), where L is the generation interval. Thus the error term in Equation 10 is [1 - 1/2CL]⁻¹, and consequently ignoring this term results in an underestimate with a fractional error of 2 × (ΔF per generation). Equation 11 is obtained by summing over all individuals born in a single cohort. With overlapping generations, individual ancestors within cohorts will have different life histories, since they will be used at different breeding ages or for different purposes. If Xq is the number of individuals with a lifetime breeding profile categorized by q, then the approximation will be
![]() |
(14) |
where the expectations are over the squared contributions from a single cohort and are conditional on selection in category q. Although the approach is different, Equation 14 is equivalent to the result of […]
| RELATIONSHIP BETWEEN ΔF AND EXPECTED CONTRIBUTIONS |
|---|
Since ΔF is proportional to E[r²i], the task of predicting ΔF in selected populations would be made easier if tractable and general methods for calculating expected squared contributions were available. However, E[r²i] = µ²i + σ²i and consequently there is a need to predict both the mean and variance of the contributions. Commonly, the prediction of means is a simpler task than the prediction of variances. General methods for predicting expected long-term contributions in selected populations have been developed by […] ΔF. The relationship will need to assume random mating and is developed by conditioning on the selective advantage(s), si, for an ancestor. The selective advantage(s) of the ancestor, if inherited, will partly determine the breeding success of its descendants, with diminishing impact over generations. The proof uses the result E[r²i] = Es[r²i|si] = Es[µ²i] + Es[σ²i], where µi = E[ri|si] and σ²i = Var[ri|si], and the subscript s on the E indicates that the expectation is being taken over the selective advantages.
Monoecious population:
The proof is simplest in the case of a monoecious diploid population of X parents in discrete generations without selfing. Random mating is assumed (ω = 0). Extension to overlapping generations and to two sexes follows by analogy but is complicated by the need for matrices, and so this extension is made in Appendix B. The long-term contribution of individual i is given by
![]() |
(15) |
These sums may be restricted to the selected offspring since unselected offspring have no long-term contribution. It is assumed that conditional on the selective advantage si of the parent i, the genetic contribution of the offspring is independent of the number of offspring selected from parent i (denote this number by ni). Then from Equation 15,
![]() |
(16) |
![]() |
(17) |
Equation 17 requires random mating. Let […]n,i and Vn,i be the mean and variance of ni|si; then
![]() |
(18) |
The derivation of µi in a general genetic framework was described by […]
The variance σ²i is derived using the statistical result that the unconditional variance is the expectation of the conditional variance plus the variance of the conditional expectation. Applying this result to Equation 16 and Equation 17 gives
![]() |
(19) |
Assume now that the number selected from parent i has a Poisson distribution. For example, this would be the case if litter size before selection had a Poisson distribution. Then […]n,i can replace Vn,i in the second term of Equation 19 to obtain
![]() |
(20) |
which can be recognized as
![]() |
(21) |
If expectations are now taken over si, […] […]n,i and E[r²j|si, j offspring of i]. A heuristic explanation is that if there were a covariance, then this would result in selection for increased squared contributions, breaking the assumption of equilibrium. The right-hand side is then equal to 1/2Es[r²i|si], since Es[[…]n,i] = 2. Therefore,
![]() |
(22) |
which leads to the result that
![]() |
(23) |
Finally, if X is the number of parents in each generation, then
$$E[\Delta F] = \tfrac{1}{2}\,X\,E_s[\mu_i^2] \tag{24}$$
The power of this result is that it requires only the mean conditional on the selective advantages to be modeled, which can be done for a wide class of genetic structures using the methods of […]
One of the critical assumptions of the proof leading to (24) is that the selected family sizes are distributed as a Poisson variable. However, departures from this will occur, for example, (i) when the litter sizes are not Poisson; (ii) when negative covariances between full-sibs and between half-sibs are induced by using sib indices for selection; (iii) when selection intensity becomes large; and (iv) when there are common environmental variances associated with litters. (The occurrence of the last two causes will depend on the model chosen for s, which is addressed in the DISCUSSION.)
To account for this deviation let Vn,i = […]n,i + Vn,dev,i in Equation 19, where Vn,dev,i may be positive or negative according to the circumstances. Then the component in […]n,i can be treated as previously and Equation 21 becomes
![]() |
(25) |
and Equation 23 becomes
![]() |
(26) |
with the result
![]() |
(27) |
Anticipating an observed result, the magnitude of terms involving si in E[rj|si, j offspring of i] contributes very little to the second term of Equation 27 and only the constant term, independent of si, needs be considered. In the current context E[rj|si, j offspring of i] ≈ X⁻¹ and the second term in Equation 27 becomes 1/8Es[Vn,dev,i]/X. For example, in mass selection with fixed litter sizes, […] -no⁻¹, where no is the number of offspring per parent, with the result that the correction for the deviation from Poisson is (-8T)⁻¹ where T is the total number of individuals born.
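The arithmetic behind the mass selection example can be sketched as follows. Two readings of the damaged source are assumed here and should be treated as such: that the deviation for fixed litter sizes is Vn,dev = -1/no, and that T = X × no. The values of X and no are hypothetical.

```python
from fractions import Fraction


def poisson_deviation_correction(v_dev, n_parents):
    """Second term of Equation 27 as quoted in the text: (1/8) V_n,dev / X."""
    return Fraction(1, 8) * Fraction(v_dev) / n_parents


X = 20    # hypothetical number of parents per generation
n_o = 8   # hypothetical number of offspring per parent
T = X * n_o  # assumed definition of the total number born

# Assumed reading of the garbled source: V_n,dev = -1 / n_o.
correction = poisson_deviation_correction(Fraction(-1, n_o), X)
print(correction, Fraction(-1, 8 * T))
```

With these assumptions the correction equals -1/(8T); a different convention for counting offspring per parent would change the implied value of Vn,dev but not the final form.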
One of the benefits of Equation 24 is that the rate of inbreeding can be obtained from predicting means, often using regression techniques. Accounting for deviations from the Poisson distribution introduces the need for estimating variances of family size to obtain Equation 27. Nevertheless, the multigenerational problem of estimating the variance of a long-term genetic contribution has been reduced to estimating the variance of family size after selection in a single generation.
Extension to overlapping generations:
With overlapping generations, individuals within a cohort that are selected to breed at any point in their lifetime can be divided into breeding categories. These categories are defined by the age of breeding, how often, and for what purpose the individual breeds. Categories are particularly important in selection. As an example, consider mass selection where all selected individuals can have progeny born at ages 1, 2, or 3. If the population is making genetic progress the average merit of individuals born 3 years ago is less than the average merit of an individual born 1 year ago. Therefore an offspring of a 3-year-old parent will have a selective disadvantage compared to an offspring of a 1-year-old parent and so is expected to make a smaller genetic contribution in the long-term (see […]).
For these reasons partition of the selected individuals into categories is necessary to obtain the general result. It is assumed that the categories are defined so that an individual belongs to a single category that describes its lifetime genetic contribution. To continue the example of mass selection, where the only distinction among parents is the breeding age, there would be potentially seven categories. If {x} denotes age x at breeding, then these categories are {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}. The number of categories will inevitably depend on the complexity of the breeding scheme, but the essential point is that they can be defined and enumerated. Let nc be the number of categories indexed from q = 1 ... nc, and µi(q) be the expected long-term contribution of individual i in category q conditional on its selective advantage si(q) with variance σ²i(q). The steps given above in Equations 16-27 for a single category remain the same but changes are needed since terms must be redefined as vectors and matrices. The notation to develop the argument therefore becomes more complex but the result remains simple. For this reason the proof is given in Appendix B. The conclusion is that if family sizes after selection are assumed to be distributed as independent Poisson variables, then
$$E[\Delta F] = \tfrac{1}{2}\sum_{q=1}^{n_c} X_q\,E_s\bigl[\mu_{i(q)}^2\bigr] \tag{28}$$
This simple result shows that the rate of inbreeding, when approximated by the sum of squared contributions, is equal to one half of the sum of the squares of expected lifetime contributions. Instead of using the observed contribution, as in Equation 12, the expected contribution can be substituted, but this is done at the cost of changing the coefficient from 1/4 to 1/2. This is because the expected contribution is being used to model both the mean and the variance.
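The breeding categories used in the mass selection example of this section can be enumerated mechanically as the nonempty subsets of the possible breeding ages. The helper below is a hypothetical illustration, not part of the original method.

```python
from itertools import combinations


def breeding_categories(ages):
    """Return every nonempty combination of breeding ages, smallest first."""
    ages = sorted(ages)
    return [
        combo
        for size in range(1, len(ages) + 1)
        for combo in combinations(ages, size)
    ]


categories = breeding_categories([1, 2, 3])
print(len(categories))
print(categories)
```

For k possible breeding ages this gives 2^k - 1 categories, so the number of terms in the sum over q grows quickly with the complexity of the scheme.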
As previously, for a parent from category q, define the matrix Vn(q),dev of size nc × nc to be the (co)variance matrix for the number of selected offspring in each of the nc categories, expressed as deviations from independent Poisson variances. For each q, neglecting terms in s (for empirical reasons given earlier), there will be a term δq defined by αTVn(q),devα, where α is the vector with the qth element equal to the expected long-term contribution for an individual from category q, i.e., Es[µi(q)] = αq. Note δq may be negative since it is a variance deviation and is not a variance. This term is introduced in Equation B6 of Appendix B. From Appendix B we arrive at
![]() |
(29) |
Although the proof has been based upon a monoecious diploid organism with no selfing, the extension to a dioecious organism is clear from the proof for overlapping generations. Having discrete generations with two sexes is identical to having two categories, i.e., males and females. Finally note that, other than assuming an equilibrium and random mating, there have been no assumptions on the type of selection index used, the nature of the genetic variation, or the population structure.
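Treating the two sexes as two categories gives a simple check on the result. The sketch below assumes the verbal form of Equation 28 (one half of the sum over categories of Xq times the expected squared mean contribution) and the no-selection case in which every parent of a sex has the same expected contribution, 1/(2Xm) or 1/(2Xf). Under those assumptions the expression reduces to the classical 1/(8Xm) + 1/(8Xf). Function names and parent numbers are hypothetical.

```python
def expected_delta_f(categories):
    """Half the sum of X_q * E[mu_q^2] over (X_q, E[mu_q^2]) pairs."""
    return 0.5 * sum(x_q * mu_sq for x_q, mu_sq in categories)


def delta_f_without_selection(x_m, x_f):
    """Two categories (males, females) with equal expected contributions."""
    mu_m = 1.0 / (2 * x_m)
    mu_f = 1.0 / (2 * x_f)
    return expected_delta_f([(x_m, mu_m ** 2), (x_f, mu_f ** 2)])


x_m, x_f = 20, 60  # hypothetical numbers of male and female parents
print(delta_f_without_selection(x_m, x_f))
print(1 / (8 * x_m) + 1 / (8 * x_f))
```

Selection raises the rate above this baseline because the expected contributions then vary among parents, which increases the expected squared mean.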
| APPLICATIONS AND RESULTS |
|---|
Sib indices in discrete generations:
The theory is illustrated by selection on a general sib index of the form I = b1(P - P̄fs) + b2(P̄fs - P̄hs) + b3P̄hs, where P is the phenotype of the candidate, P̄fs is the phenotypic mean of its full-sibs (including candidate), and P̄hs is the phenotypic mean of its half-sibs (including candidate and full-sibs). Mass selection is a special case, with b1 = b2 = b3 = 1 (or any constant >0). This formulation was used also by […]
In […]
Expected long-term genetic contributions were modeled following […] αq + ßTq(si(q) - s̄q), where si(q) denotes the vector of selective advantages for a selected individual of sex q expressed as a deviation from the mean of its contemporaries s̄q, ßq is the vector of regression coefficients of ri(q) on si(q) - s̄q, αq is the mean contribution of selected parents of sex q, and T denotes the transpose. In the parameterization used, the mean of Ai(fs) is always zero. To simplify the notation it is assumed that Ai(hs) is already expressed as a deviation from the mean of the contemporary group, and so s̄q is omitted from this point onwards.
Step 1. Prediction of expected contributions:
The prediction of expected genetic contributions is covered in detail by […] α = (αm, αf)T and ß = (ßTm, ßTf). In discrete generations, (αm, αf) = [1/(2Xm), 1/(2Xf)] always. Solutions for ß are obtained applying the method of […] (αm, αf) = (0.0250, 0.0083), ß = (0.0447, 0.0149, 0.0130).
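The quoted mean contributions follow directly from the numbers of parents. The parent numbers used below, Xm = 20 and Xf = 60, are not stated in the surviving text of this step; they are inferred from the quoted values 0.0250 and 0.0083 and should be read as an assumption.

```python
def mean_contributions(x_m, x_f):
    """Mean long-term contribution of a selected male and a selected female."""
    return 1.0 / (2 * x_m), 1.0 / (2 * x_f)


x_m, x_f = 20, 60  # inferred from the quoted values, not stated in the source
alpha_m, alpha_f = mean_contributions(x_m, x_f)
print(round(alpha_m, 4), round(alpha_f, 4))

# Each sex contributes half of the genes, so contributions sum to one.
print(x_m * alpha_m + x_f * alpha_f)
```

The second printed value confirms that the mean contributions of all selected parents sum to one, half from each sex.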
Step 2. Rates of inbreeding assuming Poisson variances:
From step 1, µi(m) = [0.0250 + 0.0447Ai(hs)]. The expected squared mean is a simple sum of squared terms: XmE[µ²i(m)] = Xm[0.0250² + 0.0447²σ²(Ai(hs))(1 - 1/Xm)]. The (1 - 1/Xm) term accounts for variances about the sample mean of the selected group rather than the true mean.
The terms arising from XfE[µ2i(f)] are calculated analogously. Since the two selected advantages of the females are mutually independent, the expected mean squared is simply the sum of squared terms. The expected long-term contribution of a female parent is

and the sum of squared means is

As previously mentioned, the term is defined as a deviation from the mean over all ancestors so σ²(Ai(fs)) requires no correction.
The rate of inbreeding ignoring deviations from Poisson variances is predicted from ΔF = 1/2(XmE[µ²i(m)] + XfE[µ²i(f)]) = […] = 0.0158.
Step 3. Correction for deviations of Vn from Poisson variances:
Deviations from Poisson variances can be accounted for by correcting the rate of inbreeding using Equation 28, where δq = αTVn(q),devα and Vn(q),dev is the (2 × 2) matrix with (co)variances of the number of selected offspring of a parent of sex q (q = m, f) as a deviation from independent Poisson variances. The calculation of the deviation from Poisson family variance for fixed numbers of selection candidates per full-sib family is described in Appendix D. The approach adopted was derived in detail by […]
The total correction to the predicted ΔF is 0.0016, and the prediction, using Equation 29, is 0.0175. The mean ΔF derived from 1000 simulations was 0.0183 (SE = 0.0001).
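The reported figures can be combined to check the size of the remaining prediction error. The sketch below uses only the rounded values quoted above; the variable names are hypothetical. Adding the rounded components gives 0.0174 rather than the reported 0.0175, which is presumably a rounding effect in the published inputs.

```python
poisson_prediction = 0.0158   # step 2, Poisson family sizes assumed
correction = 0.0016           # step 3, deviation from Poisson variances
reported_prediction = 0.0175  # value quoted for Equation 29
simulated = 0.0183            # mean of 1000 simulations

recomputed = poisson_prediction + correction
relative_error = (reported_prediction - simulated) / simulated

print(round(recomputed, 4))
print(round(100 * relative_error, 1))  # percentage error of the prediction
```

On these figures the corrected prediction underestimates the simulated rate by between 4% and 5%, compared with roughly 14% before the correction.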
General fit:
Extensive simulations were carried out assuming an infinitesimal model with factorial combinations of Xm = 20, 40, 80; d = 1, 2, 3 (and 5 for Xm = 20, 40); total offspring of 4, 8, and 16 per full-sib family equally divided between sexes; and with h2 = 0.1, 0.2, 0.4, and 0.6; weights used were (1.0, 0.75, 0.5) for d > 1 [changed to (1.0, 0.75, 0.75) for d = 1] and (1.0, 1.5, 2.0) for d > 1 [changed to (1.0, 1.5, 1.5) for d = 1]. Classical weights were also examined since these weights were the subject of the study of […]
With weights (1.0, 0.75, 0.5, or 0.75) the accuracy was excellent for all schemes, with all errors <4%. With weights (1.0, 1.5, 1.5, or 2.0) accuracy was also very good, accurately tracking trends with the changes in the parameters and with a large majority of errors <2% with the exception of d = 3, h2 = 0.4, where underestimates of up to 8% were observed. The trends in rates of inbreeding were also accurately tracked with classical weights with no increases in the magnitude of the errors, even though schemes had rates of inbreeding >0.03.
The most serious trend in the errors was a pattern of underprediction characterized by high mating ratio and large family sizes (both of which increase the selection intensity) and increased family weights. More surprisingly, the errors also increased with the numbers of parents at a constant d (i.e., Xm = 20, Xf = 60 compared to Xm = 80, Xf = 240), and also the errors were not present for h2 = 0.01 and increased sharply as h2 increased. To explore these errors further, the long-term contributions for selected males were plotted against Ai(hs) for the following schemes with d = 3, weights (1.0, 1.5, 2.0): I, Xm = 20, h2 = 0.4, no = 16; II, Xm = 80, h2 = 0.4, no = 16; III, Xm = 80, h2 = 0.01, no = 16; and IV, Xm = 80, h2 = 0.4, with no = 4. The results for simulated (S) and predicted (P) were as follows: I, S = 0.0231, P = 0.0220; II, S = 0.0070, P = 0.0058; III, S = 0.0028, P = 0.0029; IV, S = 0.0037, P = 0.0037. Note that scheme II is simply scheme I with four times the number of parents and expected long-term contributions of I are consequently four times bigger than II. The prediction of ΔF for scheme II is close to (but not precisely) 1/4 of that for I. However, the ratio of the simulated ΔF for scheme II compared to I was closer to 1/3, i.e., much greater than would be expected from scaling. Serious prediction error occurs only for scheme II.
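The ratios and relative errors discussed for schemes I to IV can be recomputed from the simulated (S) and predicted (P) values quoted above. This is a small arithmetic sketch; the dictionary layout and names are hypothetical.

```python
# (simulated, predicted) rates of inbreeding for schemes I-IV, as quoted.
schemes = {
    "I": (0.0231, 0.0220),
    "II": (0.0070, 0.0058),
    "III": (0.0028, 0.0029),
    "IV": (0.0037, 0.0037),
}

predicted_ratio = schemes["II"][1] / schemes["I"][1]  # near 1/4
simulated_ratio = schemes["II"][0] / schemes["I"][0]  # nearer 1/3

relative_errors = {
    name: (predicted - simulated) / simulated
    for name, (simulated, predicted) in schemes.items()
}

print(round(predicted_ratio, 3), round(simulated_ratio, 3))
for name, err in relative_errors.items():
    print(name, round(100 * err, 1))
```

The relative errors make the contrast explicit: scheme II is underpredicted by about 17%, whereas the other three schemes are within about 5% of the simulated values.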
Figure 1 shows that the accuracy of prediction with low h2 (scheme III) is because the linear model used is a good fit (i.e., the contributions are a simple linear regression on the selective advantage) and similarly for low selection intensity (scheme IV). However, for both the other two schemes the linear model predicts a substantial proportion of the selected males to have negative contributions, although rates of inbreeding are accurately predicted in one case (scheme I) but not in the other (scheme II).
Closer replicate-by-replicate analysis shows that despite the expectation, the substantially greater variance of contributions (approximately proportional to ΔF/Xm) in scheme I obscures the nonlinearity in the majority of replicates. When both linear and quadratic terms for the selective advantage were included in a regression model for observed contributions, the quadratic term was not statistically significant (defined here as P < 0.01) in >60% of the replicates. In contrast, for scheme II, this percentage was <15%. Thus the accuracy of prediction depends on the goodness-of-fit of the linear model within a replicate, so more parents may promote greater proportional prediction errors, even though these errors will be associated with lower rates of inbreeding.
The pattern of the correction for deviations from Poisson distribution for selected family sizes is worth noting. These corrections are negative for b2, b3 < 1, reduce in size as the index weights increase, and were generally positive for b2, b3 > 1. For mass selection, b1 = b2 = b3 = 1, the correction is of the order of -1/(8T).
| DISCUSSION |
|---|
The theory described in this article provides a powerful tool for predicting rates of inbreeding in selected populations and for providing insights into the forces that contribute to the rate of loss of variation. The relationship of […]
Theory:
The first theorem relating the rate of inbreeding in a population to the squared long-term contributions was previously derived by […] (ΔF ≈ 1/4Σir²i) is not exact and was shown to underestimate the rate of inbreeding by a fraction of the order of (2ΔF), providing there was no major deviation from random mating, and is therefore small for any practical scheme. In overlapping generations, with rates of inbreeding per unit time and per generation both of interest, it is shown that this error is 2(ΔF/generation), where the generation interval was defined by the period over which the long-term genetic contributions sum to 1.
The importance of the relationship between rates of inbreeding and squared genetic contributions is that it holds for selected populations, with no assumptions on the form of selection, providing (i) the genes are ultimately mixed, and (ii) an equilibrium exists over which a stable ΔF may be defined. A further caveat is that the rate obtained applies to a neutral, unlinked gene. The extension of other relationships to predict ΔF in selected populations does not always hold. For example, using the relationship Var(Δq) = q(1 - q)ΔF, where q is the frequency of a neutral gene and Δq is the change in frequency per unit time, will not hold if selection is not random since it assumes mutual independence of Δq over consecutive intervals. The increments, Δq, are also correlated for overlapping generations due to the many intervals over which the progeny of a single parent may be selected. As a consequence the justification for the proof by […] ΔF with overlapping generations is invalid, even in the absence of genetic selection, although the result is correct and agrees with the previous proof of […]
The form of Equation 4 shows that the sum of squared long-term contributions for any given cohort may be usefully interpreted in the absence of an equilibrium. The sum of squared contributions for a cohort is the proportion of the new variation (the Mendelian sampling variance) arising from within that cohort that is lost to the population in the long term. This includes all mutational variance arising in prior generations, since the choice of base is arbitrary. Therefore the sum of squared contributions of cohorts (particularly those still to converge!) is important, irrespective of equilibrium, and provides a meaningful measure of risk, and merits attention in both breeding and conservation schemes. The operational tools described by ![]()
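The interpretation of the sum of squared contributions as a measure of risk can be made concrete with a small calculation. The sketch below applies the approximation ΔF ≈ 1/4(1 - α) × (sum of squared long-term contributions) to a single cohort in discrete generations, where the contributions are assumed to sum to 1. The function name and the two example cohorts are hypothetical and chosen only for illustration.

```python
def rate_of_inbreeding(contributions, alpha=0.0):
    """Approximate rate of inbreeding from one cohort's long-term contributions.

    contributions: long-term genetic contributions, assumed to sum to 1
                   (discrete generations).
    alpha: deviation from Hardy-Weinberg proportions (0 for random mating).
    """
    if abs(sum(contributions) - 1.0) > 1e-9:
        raise ValueError("contributions are expected to sum to 1")
    return 0.25 * (1.0 - alpha) * sum(r * r for r in contributions)


n = 20
equal = [1.0 / n] * n                            # every ancestor contributes equally
skewed = [0.5] + [0.5 / (n - 1)] * (n - 1)       # one ancestor dominates

print(rate_of_inbreeding(equal))    # equals 1 / (4 * n) for equal contributions
print(rate_of_inbreeding(skewed))   # larger, because contributions are unequal
```

For a fixed number of ancestors, equal contributions minimize the sum of squares, so any inequality in contributions raises the predicted rate.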
The second, novel theorem derived in this article shows how the formulas with observed long-term contributions may be translated into formulas with expected long-term contributions. The latter are advantageous since they use predictable entities. The major change is that the expected can be substituted for the observed, providing the constant of proportionality is increased from 1/4 to 1/2. The critical step in the proof is that the error variance of a long-term contribution given the selective advantage is related to the square of its mean, i.e., the coefficient of variation is relatively constant. Apart from the assumption of random mating, the scope of this proof is very broad, and it is applicable to overlapping generations. The validity of the derivation was checked using general sib-indices as an example in discrete generations, and a companion article (![]()
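The change of constant from 1/4 to 1/2 can be sketched in two lines. The notation here is an assumption made for illustration: r_i is the observed long-term contribution of ancestor i, s_i its selective advantage, and μ_i = E[r_i | s_i] the expected contribution. The sketch takes the simplest case, random mating (α = 0) with the conditional error variance approximately equal to the squared mean, and omits the correction for deviations from the Poisson assumptions.

```latex
\begin{aligned}
E\left[r_i^2 \mid s_i\right] &= \mu_i^2 + \operatorname{Var}\left(r_i \mid s_i\right) \approx 2\mu_i^2, \\
\Delta F &\approx \tfrac{1}{4} \sum_i E\left[r_i^2 \mid s_i\right] \approx \tfrac{1}{2} \sum_i \mu_i^2 .
\end{aligned}
```

Under these assumptions the doubling of the constant simply reflects that the expected square of a contribution is about twice the square of its expectation.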
Prediction:
Usable predictions were obtained by ![]()
In any attempt to obtain prediction formulas, a balance has to be achieved between accuracy and simplicity. We have used simple linear models to interpret the theory. Thus in application the prediction consists of two elements: (i) the squared expected contribution and (ii) the deviation from independent Poisson families. The first of these elements was applied precisely as described by ![]()
The choice of selective advantages has as an objective the minimum number needed to make the selective processes in different time periods independent. Using sib indices as an example, the authors considered both the method presented, where only breeding values were included as selective advantages, and an alternative definition in which the selective advantages were the half-sib mean and deviation of the full-sib mean from the half-sib mean. The potential benefit from the alternative parameterization is that the environmental covariances in the index arising from the sib means are accounted for within the expected long-term contribution. Conditioning on the sib means is more than is strictly necessary for conditional independence between generations. However, while results using the alternative parameterization were as accurate in most cases (results not shown), the underestimates explored in the results tended to be more severe. One reason for this is that terms included in the expected long-term contribution are modeled by linear functions, whereas modeling the environmental correlations by the method of ![]()
Nonlinear relationships between the selective advantage and long-term contributions occurred when high intensities of selection were combined with moderate heritabilities, large numbers of parents, and high mating ratios. Results from including quadratic terms in the model for the expected long-term contribution (unpublished) confirm that the serious prediction errors arise from the assumption of linearity rather than from Equation 29.
There are good reasons to believe that these departures from linearity should not prove a major problem where the objective is to design effective breeding schemes. First, on pragmatic grounds the curvilinear relationship shown in Fig 1 suggests that 15% of selected males were being used with no expectation of long-term contribution to the population (this percentage is even higher if the contributions were plotted against the observed half-sib mean!). The resources used to keep and breed these animals are clearly wasted. In an ideal selection scheme, an ancestor's long-term contributions will be zero or, once its Mendelian sampling term is above a critical threshold, linearly related to the sampling term (![]()
2 for which the methods presented here had a good fit.
In conclusion, this article has (i) established a broader theorem (compared to ![]()
| ACKNOWLEDGMENTS |
|---|
J.A.W. gratefully acknowledges financial support from the Ministry of Agriculture, Fisheries and Food (United Kingdom), and the support and encouragement of Prof. A. Maki-Tanila, who gave an opportunity for this work to be initiated. The contribution of P.B. was financially supported by the Netherlands Technology Foundation (STW) and coordinated by the Earth and Life Science Foundation (ALW).
Manuscript received March 30, 1999; Accepted for publication December 6, 1999.
| APPENDIX A |
|---|
THE EXPECTED MENDELIAN SAMPLING VARIANCE
The expected Mendelian sampling variance in generation 1 summed ove