Genetics, Vol. 154, 1851-1864, April 2000, Copyright © 2000

Predicting Rates of Inbreeding in Populations Undergoing Selection

John A. Woolliams^a and Piter Bijma^b
a Roslin Institute (Edinburgh), Roslin, Midlothian EH25 9PS, United Kingdom
b Animal Breeding and Genetics Group, Wageningen Institute of Animal Sciences, Wageningen Agricultural University, 6700 AH Wageningen, The Netherlands

Corresponding author: John A. Woolliams, Roslin Institute (Edinburgh), Roslin, Midlothian EH25 9PS, United Kingdom. E-mail: john.woolliams@bbsrc.ac.uk

Communicating editor: R. G. SHAW


*  ABSTRACT

Tractable forms of predicting rates of inbreeding (ΔF) in selected populations with general indices, nonrandom mating, and overlapping generations were developed, with the principal results assuming a period of equilibrium in the selection process. An existing theorem concerning the relationship between squared long-term genetic contributions and rates of inbreeding was extended to nonrandom mating and to overlapping generations. ΔF was shown to be ~1/4(1 - ω) times the expected sum of squared lifetime contributions, where ω is the deviation from Hardy-Weinberg proportions. This relationship cannot be used for prediction since it is based upon observed quantities. Therefore, the relationship was further developed to express ΔF in terms of expected long-term contributions that are conditional on a set of selective advantages that relate the selection processes in two consecutive generations and are predictable quantities. With random mating, if selected family sizes are assumed to be independent Poisson variables then the expected long-term contribution could be substituted for the observed, providing 1/4 (since ω = 0) was increased to 1/2. Established theory was used to provide a correction term to account for deviations from the Poisson assumptions. The equations were successfully applied, using simple linear models, to the problem of predicting ΔF with sib indices in discrete generations since previously published solutions had proved complex.


WRAY and THOMPSON (1990) proved a fundamental relationship between the sum of squared long-term genetic contributions of ancestors and rates of inbreeding for random mating populations in discrete generations. One consequence of this relationship was that rates of inbreeding were tied to the numerator relationship matrix for the first time. This narrowed the conceptual gap between the central parameter for genetic evaluation of individuals using best linear unbiased prediction and one of the key properties of a breeding scheme. Another important consequence was to set out in a formal way a model for the mechanics of inheritance of selected advantage, a concept that ROBERTSON (1961) had introduced but had left unclarified. An achievement of the methods of WRAY and THOMPSON (1990) was to obtain, for the first time, accurate predictions of ΔF in mass selection through modeling pathway extensions. However, this was done by using a recursive algorithm, so that although the mechanics were clear, the overall structure of the prediction remained obscure.

WOOLLIAMS et al. (1993) advanced the understanding of the structure of the prediction by obtaining a closed form for the prediction of ΔF. It was shown to have terms involving variances of family size in one generation, with additional terms for the proliferation or reduction of ancestral lines over many generations that could be predicted as a result of the selective advantage of the ancestor. Furthermore, it was clear that under equilibrium conditions, the model would lend itself to geometric summation of terms across generations. This led to simple forms for the expected long-term contribution of an ancestor. WRAY et al. (1994) extended the methods to index selection, although the form of the model is a hybrid of the approach of WOOLLIAMS et al. (1993) and HILL (1972), since the conditional arguments of pathway extension that had been carried out for mass selection were found to be too complex for index selection. Nevertheless, worthwhile predictions were made available in a tractable form.

SANTIAGO and CABALLERO (1995) used an approach that made no direct reference to the theory of contributions to predict ΔF in mass selection. They obtained a neater closed form for ΔF than that derived by WOOLLIAMS et al. (1993) through an argument based on total drift, relating the change through selection to loss of genetic variance. Unlike the previous work of WRAY and THOMPSON (1990) and WOOLLIAMS et al. (1993), who had considered the population in relation to an unselected base generation, SANTIAGO and CABALLERO (1995) developed predictions based upon equilibrium genetic variance. NOMURA (1996) extended the approach of SANTIAGO and CABALLERO (1995) to mass selection with overlapping generations but with the important restriction that the males and females selected from a cohort remain the same in both number and identity throughout the breeding life of the cohort.

This article examines the issues raised by the work described above. First, the relationship between ΔF and the realized long-term genetic contributions is extended to include nonrandom mating and overlapping generations. Second, an important result for the prediction of ΔF is developed by demonstrating a relationship between ΔF and the expected squared long-term contribution conditional on the selective advantages for random mating. Finally, as an example of application, predictions of ΔF for sib indices, previously considered by WRAY et al. (1994), are reexamined using the equilibrium methods for expected long-term contributions developed by WOOLLIAMS et al. (1999) and compared to results from simulation.


*  RELATIONSHIP BETWEEN ΔF AND LONG-TERM GENETIC CONTRIBUTIONS

This section discusses the relationship between ΔF and realized long-term genetic contributions. In doing so, it derives the expected increase in homozygosity at the level of a neutral locus in contrast to the matrix method of WRAY and THOMPSON (1990). The notation that is used is shown in Table 1. The model for the population is assumed, for the present, to have discrete generations with X_m male parents and X_f female parents. For calculation of inbreeding coefficients every allele is considered as unique in the base population (t = 0). It does not matter if the base generation has the structure of an unselected and unrelated population.


 

 
Table 1. Notation used to derive Equations 1-27

Discrete generations:
Consider one of these alleles in the base population at a neutral locus (say allele B). Let the gene frequency at time t, in the parents of sex q that have been selected to produce generation t + 1, be denoted by P_B(q, t). The gene frequency can be described in terms of genetic contributions similar to Equation 1 of WOOLLIAMS et al. (1999). Let A_i be the gene frequency of an allele B in individual i, where A_i = 1, 1/2, or 0 if i is BB, B·, or ··, respectively (where · represents any other allele); then the individual gene frequencies can be treated as breeding values for frequency. The average of the gene frequency in the parents of sex q in generation t is given by

(1)

where r_{i,u}(q, t) is the genetic contribution of individual i born at time u to the parents of sex q at time t, with breeding value for frequency of allele B given by A_{i,u} and Mendelian sampling terms a_{i,u} = A_{i,u} - 1/2(A_sire + A_dam). Equation 1 separates out the base generation, which provides the foundation alleles, and subsequent generations, which influence the frequency of the allele through the Mendelian sampling of their parent alleles. The variance of the Mendelian sampling terms will depend on A_sire and A_dam; Var(a_{i,u}) = 0 if both A_sire and A_dam are homozygotes, 1/8 if they are both heterozygotes, or 1/16 otherwise. Since B is unique, A_{i,0} is 0 for all individuals except for one individual for which A_{i,0} = 1/2. The genetic contribution of an individual to the generation of its birth is r_{i,t}(m, t) = X_m^{-1} if i is male or 0 if i is female, and r_{i,t}(f, t) = X_f^{-1} if i is female or 0 if i is male.
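The three Mendelian sampling variances quoted above can be checked by enumerating the gametes each parent can transmit. The following is a minimal sketch, assuming only that a parent with gene frequency A for allele B transmits B with probability A; the function name is illustrative and not part of the original notation.

```python
from fractions import Fraction
from itertools import product


def mendelian_sampling_variance(a_sire, a_dam):
    """Variance of a = A_offspring - (A_sire + A_dam)/2 for allele B.

    a_sire and a_dam are the parents' own gene frequencies for B:
    1 (BB), 1/2 (B.) or 0 (..).
    """
    def gametes(freq):
        # A parent with gene frequency `freq` transmits B with probability `freq`.
        freq = Fraction(freq)
        return [(1, freq), (0, 1 - freq)]

    parent_mean = (Fraction(a_sire) + Fraction(a_dam)) / 2
    variance = Fraction(0)
    for (g_s, p_s), (g_d, p_d) in product(gametes(a_sire), gametes(a_dam)):
        offspring_freq = Fraction(g_s + g_d, 2)
        # The Mendelian sampling term has mean zero, so E[a^2] is its variance.
        variance += p_s * p_d * (offspring_freq - parent_mean) ** 2
    return variance


half = Fraction(1, 2)
print(mendelian_sampling_variance(1, 0))        # both parents homozygous: 0
print(mendelian_sampling_variance(half, half))  # both heterozygous: 1/8
print(mendelian_sampling_variance(half, 0))     # one heterozygous parent: 1/16
```

Exact fractions are used so that the values 0, 1/8, and 1/16 are reproduced without rounding.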

Initially assume that there is random mating. For any generation the probability of homozygotes for B is obtained from the product of the gene frequencies in the male and in female parents and is P_B(m, t)P_B(f, t). The inbreeding coefficient F_t for the neutral locus is then the sum over all distinct alleles at the locus,

(2)

where r_{i,u}(q, t - 1) is the average contribution to parents of sex q at time t - 1. (Note the breeding values and Mendelian sampling terms will depend on the allele but this dependence has not been made explicit to spare notation.) For each allele the cross-product terms in A_{i,0}A_{j,0} are zero since A_{i,0} = 0 except for a single individual. Since the Mendelian sampling terms from different individuals are independent of all other terms for a neutral locus, all cross-products of the Mendelian sampling terms are zero.

More precisely, for each allele and each ancestor, the term Σ_i r_{i,u}(m, t - 1) r_{i,u}(f, t - 1) a_{i,u}^2 should be the sum of products of contributions of the ancestor to each male and female mating pair:

(3)

This will account for any nonrandom mating of parents. For a neutral locus, the covariance between r_i and a_i will be 0 (WOOLLIAMS and THOMPSON 1994; WOOLLIAMS et al. 1999), and the expectation of Equation 3 is E[Σ_i Σ_{mates (j(m), j*(f))} r_{i,u}(j(m), t - 1) r_{i,u}(j*(f), t - 1)] E[a_{i,u}^2]. Let the first of these, the expectation of the cross-products of contributions to mates, be C_u(t - 1). Note that (i) C_{t-1}(t - 1) = 0 since an individual without offspring cannot contribute to both sexes and (ii) the first term in Equation 2 is 1/2 C_0(t - 1) since A_{i,0}^2 has a value 1/4 for each of its two alleles and 0 otherwise.

Assume equilibrium values for (i) the deviation from Hardy-Weinberg frequencies arising from the nonrandom mating (ω, equivalent to α_I of CABALLERO and HILL 1992) and (ii) ΔF, attained by generation 2 (this assumption is removed later); then Equation 2 can be further simplified using results given in Appendix A, namely, Σ_alleles E[a_{i,u}^2] = 1/4 for u = 1 and 1/4(1 - ω)(1 - ΔF)^{u-1} for u >= 2. Therefore,

(4)


(5)

Subtracting (5) from (4) and rearranging terms,

(6)

Assuming equilibrium, then a steady state of pedigree development will occur and the expectation of the cross-products will be determined by the number of generations over which they have developed, i.e., C_u(t) = C_{u-1}(t - 1) since both terms represent contributions t - u generations after the birth of the ancestor. This is not a strong assumption in the context of the problem since in the absence of an equilibrium there would be no single ΔF to predict.

Therefore, the terms in C_u(t) can be modified to terms in C_{u-1}(t - 1), and each term of the sum within the square brackets of Equation 6 can be reduced to -ΔF C_u(t - 1). After repeating this process for the C_2(t) term [and temporarily neglecting the term in ωΔF C_1(t - 1)],

(7)

For large enough t, the terms in C_u(t) will converge for a given u. Therefore, 1/2 C_0(t) ≈ 1/2 C_0(t - 1), and 1/4 C_1(t) - 1/4 ω C_1(t - 1) ≈ 1/4(1 - ω)C_1(t - 1); then by adding and subtracting the term 1/2 ΔF C_0(t),

(8)

Finally, note E[F_{t+1} - F_t] = ΔF E[1 - F_t] and that the term in square brackets in Equation 8 is E[F_t], giving

(9)

This result holds for t large enough for contributions from early generations to have converged. If it is assumed that the base generation used for defining the inbreeding coefficients was chosen to be part of a period of equilibrium, then C_1(t - 1) = C_0(t) = C,

(10)

where C is the sum of squared converged contributions for a generation, chosen arbitrarily within the period of equilibrium. Including the term neglected between Equation 6 and Equation 7 would replace [1 - 1/2 C]^{-1} by [1 - (1/2 + 1/4 ω)C]^{-1}. For random mating, omitting the term [1 - 1/2 C]^{-1} leads to an underestimate with a fractional error of ~1/2 C, which in turn is ~2ΔF.
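The size of the neglected factor can be illustrated numerically. The sketch below assumes, reading from the prose here and from the abstract, that the relationship has the form ΔF ≈ 1/4(1 - ω)C[1 - 1/2 C]^{-1}; the function name and the value of C are hypothetical and chosen only for illustration.

```python
def delta_f_from_c(c, omega=0.0, apply_factor=True):
    """Rate of inbreeding from C, the expected sum of squared converged contributions.

    Assumed form (read off the prose, not copied from the numbered equation):
        dF ~ 1/4 * (1 - omega) * C * [1 - C/2]^-1
    """
    base = 0.25 * (1.0 - omega) * c
    return base / (1.0 - 0.5 * c) if apply_factor else base


c = 0.04  # hypothetical value of C under random mating (omega = 0)
with_factor = delta_f_from_c(c)
without_factor = delta_f_from_c(c, apply_factor=False)
fractional_error = (with_factor - without_factor) / with_factor

print(with_factor, without_factor)
print(fractional_error, 0.5 * c, 2.0 * without_factor)
```

With ω = 0 the fractional error equals C/2, and because the uncorrected ΔF is C/4 this is the same as twice the uncorrected ΔF, in line with the statement in the text.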

Since C = E[Σ_i Σ_{mates (j(m), j*(f))} r_{i,u}(j(m), t) r_{i,u}(j*(f), t)] for large u << t, for any i the terms r_{i,u}(j(m), t) and r_{i,u}(j(f), t) converge to the same value for all j in generation t providing the population mixes. This value will be the long-term contribution of ancestor i to the population, denoted by r_i. This will occur with or without random mating. Thus C = E[Σ_i r_i^2] for a generation of ancestors, which leads to

(11)


(12)

In Equation 12, the expectations are conditional on the individual i being a selected ancestor; however, since r_i = 0 for an unselected ancestor, Equation 12 can also be given as

(13)

where T_m and T_f are the number of candidates for selection in each sex and the expectation is for a candidate (i.e., it is not conditional on i being selected). (E[ΔF] is used in Equation 12 and Equation 13, rather than simply ΔF, to emphasize that the result is an expectation over replicate populations.)
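The convergence of contributions, and the use of their squares, can be illustrated with a small pedigree simulation. This is a sketch under simplifying assumptions that are not those of the derivation above: a monoecious population of constant size, random choice of two distinct parents for every offspring (no selection, no selfing), and ω = 0. The function name is illustrative.

```python
import random


def simulate_contributions(n_parents, n_generations, seed=1):
    """Track contributions of base ancestors through a random-mating pedigree.

    Returns a matrix where entry [j][i] is the contribution of base
    ancestor i to individual j of the final generation.
    """
    rng = random.Random(seed)
    # In the base generation each individual contributes only to itself.
    contrib = [[1.0 if i == j else 0.0 for i in range(n_parents)]
               for j in range(n_parents)]
    for _ in range(n_generations):
        new_generation = []
        for _ in range(n_parents):
            p1, p2 = rng.sample(range(n_parents), 2)  # two distinct parents
            # An offspring receives half of each parent's ancestral contributions.
            new_generation.append([0.5 * (contrib[p1][i] + contrib[p2][i])
                                   for i in range(n_parents)])
        contrib = new_generation
    return contrib


contrib = simulate_contributions(n_parents=20, n_generations=60)
n = len(contrib)
# Largest difference, over ancestors, between any two current individuals.
spread = max(max(row[i] for row in contrib) - min(row[i] for row in contrib)
             for i in range(n))
# Long-term contribution of each base ancestor (mean over current individuals).
r = [sum(row[i] for row in contrib) / n for i in range(n)]
delta_f = 0.25 * sum(x * x for x in r)  # 1/4 * (1 - omega) * sum r_i^2, omega = 0

print(spread, sum(r), delta_f)
```

After enough generations every current individual carries the same contribution from a given ancestor, the contributions of a generation of ancestors sum to one, and the value printed last is the realized quantity for a single replicate rather than its expectation.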

This result was obtained for ω = 0 by WRAY and THOMPSON (1990) but the derivation differs in several aspects. First, in the derivation of Wray and Thompson the base was unselected and therefore not in equilibrium at the start of the selection process, and this led to an impression that the contributions used for estimating rates of inbreeding must be the generation after an unselected base. It is now evident that the choice of generation on which the estimate is obtained is arbitrary except that it is at the start of some period of local equilibrium during which some "equilibrium ΔF" may exist. Second, the derivation using the probability of homozygosity for an assumed allele is of value since the proof of WRAY and THOMPSON (1990) is heavily based upon the properties of the numerator relationship matrix. Third, it extends the result to incorporate nonrandom mating, although the result was given without proof by WOOLLIAMS and THOMPSON (1994). CABALLERO and HILL (1992) noted that the result of WRAY and THOMPSON (1990) was a poor predictor of ΔF with nonrandom mating and it is now clear why this was so.

Even though the development of the pedigree may be in equilibrium (which will imply the genetic variance being selected upon is in equilibrium) this does not imply that equilibrium values of ω and ΔF for the alleles defined in the arbitrary base are immediately attained. Equation 4, using Appendix A, assumes that these parameters were in equilibrium for the Mendelian sampling in generation 2. However, the following argument shows that this does not affect the result. Assume the equilibrium conditions have not been attained by generation 2; then for this generation plus a small number of generations following (i.e., up to attainment of equilibrium) there will be terms of the form δC_u(t) in Equation 4 and δC_u(t - 1) in Equation 5. Providing t is sufficiently large compared to the period of attainment, these terms will cancel in Equation 6 since C_u(t) is a convergent series. Thus Equations 10-13 will hold for the equilibrium values of ω and ΔF.

Overlapping generations:
If ΔF is taken per unit time then the structure of the preceding proof holds. The reduction in the variance of the Mendelian sampling term over initial cohorts, before an equilibrium ΔF/unit time is established, is not straightforward since it will depend upon the age structure of the population; but the previous argument used to overcome deviations from equilibrium can be applied. However, one distinction in overlapping generations is that the base generation will contain the equivalent of L cohorts, where L is the period of time over which the long-term contributions sum to one, since this is the period required for the population to turn over a generation for those genes destined to remain in the population in the long-term. WOOLLIAMS et al. (1999) show this genetic generation interval is different from the average age of the parents when there are selection advantages between groups (see also BIJMA and WOOLLIAMS 1999). To balance (8) there is a need to add and subtract terms of magnitude 1/2 C_0(t) (ΔF/generation) or equivalently 1/2 C_0(t) L (ΔF/unit time), where L is the generation interval. Thus the error term in Equation 10 is [1 - 1/2 CL]^{-1}, and consequently ignoring this term results in an underestimate with a fractional error of 2 x (ΔF per generation). Equation 11 is obtained by summing over all individuals born in a single cohort. With overlapping generations, individual ancestors within cohorts will have different life histories, since they will be used at different breeding ages or for different purposes. If X_q is the number of individuals with a lifetime breeding profile categorized by q, then the approximation will be

(14)

where the expectations are over the squared contributions from a single cohort and are conditional on selection in category q. Although the approach is different, Equation 14 is equivalent to the result of HILL (1972, 1979) when random selection and random mating is assumed. However, Equation 14 clearly shows that the rate of inbreeding is related to the sum of squared lifetime contributions irrespective of selection and nonrandom mating.


*  RELATIONSHIP BETWEEN ΔF AND EXPECTED CONTRIBUTIONS

Since ΔF is proportional to E[r_i^2], the task of predicting ΔF in selected populations would be made easier if tractable and general methods for calculating expected squared contributions were available. However, E[r_i^2] = µ_i^2 + σ_i^2 and consequently there is a need to predict both the mean and variance of the contributions. Commonly, the prediction of means is a simpler task than the prediction of variances. General methods for predicting expected long-term contributions in selected populations have been developed by WOOLLIAMS et al. (1999). The objective of the following section is to obtain a relationship between the variance of the long-term contributions and their expectations, which will then permit development of general methods for the prediction of E[r_i^2] and consequently for ΔF. The relationship will need to assume random mating and is developed by conditioning on the selective advantage(s), s_i, for an ancestor. The selective advantage(s) of the ancestor, if inherited, will partly determine the breeding success of its descendants, with diminishing impact over generations. The proof uses the result E[r_i^2] = E_s[E[r_i^2 | s_i]] = E_s[µ_i^2] + E_s[σ_i^2], where µ_i = E[r_i | s_i] and σ_i^2 = Var[r_i | s_i], and the subscript s on the E indicates that the expectation is being taken over the selective advantages.

Monoecious population:
The proof is simplest in the case of a monoecious diploid population of X parents in discrete generations without selfing. Random mating is assumed (ω = 0). Extension to overlapping generations and to two sexes follows by analogy but is complicated by the need for matrices, and so this extension is made in Appendix B. The long-term contribution of individual i is given by

(15)

These sums may be restricted to the selected offspring since unselected offspring have no long-term contribution. It is assumed that conditional on the selective advantage s_i of the parent i, the genetic contribution of the offspring is independent of the number of offspring selected from parent i (denote this number by n_i). Then from Equation 15,

(16)


(17)

Equation 17 requires random mating. Let θ_{n,i} and V_{n,i} be the mean and variance of n_i | s_i; then

(18)

The derivation of µ_i in a general genetic framework was described by WOOLLIAMS et al. (1999).

The variance σ_i^2 is derived using the statistical result that the unconditional variance is the expectation of the conditional variance plus the variance of the conditional expectation. Applying this result to Equation 16 and Equation 17 gives

(19)

Assume now that the number selected from parent i has a Poisson distribution. For example, this would be the case if litter size before selection had a Poisson distribution. Then θ_{n,i} can replace V_{n,i} in the second term of Equation 19 to obtain

(20)

which can be recognized as

(21)

If expectations are now taken over s_i, WOOLLIAMS and BIJMA (1999) show that by assuming an equilibrium there is no covariance between θ_{n,i} and E[r_j^2 | s_i, j offspring of i]. A heuristic explanation is that if there were a covariance, then this would result in selection for increased squared contributions, breaking the assumption of equilibrium. The right-hand side is then equal to E_s[r_i^2 | s_i], since E_s[θ_{n,i}] = 2. Therefore,

(22)

which leads to the result that

(23)

Finally, if X is the number of parents in each generation, then

(24)

The power of this result is that it requires only the mean conditional on the selective advantages to be modeled, which can be done for a wide class of genetic structures using the methods of WOOLLIAMS et al. (1999). Note that the set of selective advantages used for conditioning must completely describe the interrelationship between one generation of selection and the next. This is embodied in the assumption that conditioning on the selective advantage s_i removes associations between the number of offspring selected and the subsequent success of the offspring. For example, the mates of the individual provide a selective advantage that must be accounted for (WOOLLIAMS and THOMPSON 1994; SANTIAGO and CABALLERO 1995).
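The chain of reasoning in this subsection can be written out compactly. The following is a sketch assembled from the verbal description, not a transcription of the numbered equations, so the scaling of intermediate lines may differ from them. It assumes a monoecious population with random mating, that each parent passes half of each selected offspring's contribution to its own total, that offspring contributions are conditionally independent given s_i, and that the equilibrium makes E[r_j^2] for offspring equal to E[r_i^2] for parents.

```latex
\begin{align*}
r_i &= \tfrac{1}{2}\sum_{j \,\in\, \text{offspring of } i} r_j, \\
\mu_i = E[r_i \mid s_i] &= \tfrac{1}{2}\,\theta_{n,i}\,E[r_j \mid s_i], \\
\sigma_i^2 = \operatorname{Var}[r_i \mid s_i]
  &= \tfrac{1}{4}\,\theta_{n,i}\,\operatorname{Var}[r_j \mid s_i]
   + \tfrac{1}{4}\,V_{n,i}\,\bigl(E[r_j \mid s_i]\bigr)^2, \\
\text{Poisson } (V_{n,i} = \theta_{n,i}): \quad
\sigma_i^2 &= \tfrac{1}{4}\,\theta_{n,i}\,E[r_j^2 \mid s_i], \\
E_s[\sigma_i^2] &= \tfrac{1}{4}\,E_s[\theta_{n,i}]\,E[r_j^2]
   = \tfrac{1}{2}\,E[r_i^2]
   \quad \text{since } E_s[\theta_{n,i}] = 2, \\
E[r_i^2] = E_s[\mu_i^2] + E_s[\sigma_i^2]
  \;&\Longrightarrow\; E_s[\sigma_i^2] = E_s[\mu_i^2],
  \qquad E[r_i^2] = 2\,E_s[\mu_i^2], \\
E[\Delta F] \approx \tfrac{1}{4}\,X\,E[r_i^2] &= \tfrac{1}{2}\,X\,E_s[\mu_i^2].
\end{align*}
```

The last line shows why only the conditional mean needs to be modeled: under the Poisson assumption the variance of the contribution is carried by the same quantity as the mean, which doubles the coefficient from 1/4 to 1/2.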

One of the critical assumptions of the proof leading to (24) is that the selected family sizes are distributed as a Poisson variable. However, departures from this will occur, for example, (i) when the litter sizes are not Poisson; (ii) when negative covariances between full-sibs and between half-sibs are induced by using sib indices for selection; (iii) when selection intensity becomes large; and (iv) when there are common environmental variances associated with litters. (The occurrence of the last two causes will depend on the model chosen for s, which is addressed in the DISCUSSION.)

To account for this deviation let V_{n,i} = θ_{n,i} + V_{n,dev,i} in Equation 19, where V_{n,dev,i} may be positive or negative according to the circumstances. Then the component in θ_{n,i} can be treated as previously and Equation 21 becomes

(25)

and Equation 23 becomes

(26)

with the result

(27)

Anticipating an observed result, the magnitude of terms involving s_i in E[r_j | s_i, j offspring of i] contributes very little to the second term of Equation 27 and only the constant term, independent of s_i, needs be considered. In the current context E[r_j | s_i, j offspring of i] ≈ X^{-1} and the second term in Equation 27 becomes 1/8 E_s[V_{n,dev,i}]/X. For example, in mass selection with fixed litter sizes, SANTIAGO and CABALLERO (1995) used the approximation that E_s[V_{n,dev,i}] ≈ -n_o^{-1}, where n_o is the number of offspring per parent, with the result that the correction for the deviation from Poisson is (-8T)^{-1} where T is the total number of individuals born.
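The arithmetic behind the correction for fixed litter sizes can be followed step by step. The sketch below assumes the correction term is 1/8 E_s[V_{n,dev,i}]/X as stated above and reads T as X times n_o; that reading of T, the function name, and the numerical values are assumptions made only for illustration.

```python
from fractions import Fraction


def poisson_deviation_correction(n_parents, mean_v_dev):
    """Correction term (1/8) * E_s[V_n,dev] / X for non-Poisson family sizes."""
    return Fraction(1, 8) * Fraction(mean_v_dev) / n_parents


X = 40    # hypothetical number of parents per generation
n_o = 8   # hypothetical number of offspring per parent
T = X * n_o  # total number born, under the assumed reading T = X * n_o

# Approximation quoted in the text for mass selection with fixed litter sizes.
v_dev = Fraction(-1, n_o)
correction = poisson_deviation_correction(X, v_dev)

print(correction, Fraction(-1, 8 * T))
```

The correction is negative because fixed litter sizes reduce the variance of selected family size below the Poisson value, so the Poisson-based prediction would otherwise overestimate ΔF.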

One of the benefits of Equation 24 is that the rate of inbreeding can be obtained from predicting means, often using regression techniques. Accounting for deviations from the Poisson distribution introduces the need for estimating variances of family size to obtain Equation 27. Nevertheless, the multigenerational problem of estimating the variance of a long-term genetic contribution has been reduced to estimating the variance of family size after selection in a single generation.

Extension to overlapping generations:
With overlapping generations, individuals within a cohort that are selected to breed at any point in their lifetime can be divided into breeding categories. These categories are defined by the age of breeding, how often, and for what purpose the individual breeds. Categories are particularly important in selection. As an example, consider mass selection where all selected individuals can have progeny born at ages 1, 2, or 3. If the population is making genetic progress the average merit of individuals born 3 years ago is less than the average merit of an individual born 1 year ago. Therefore an offspring of a 3-year-old parent will have a selective disadvantage compared to an offspring of a 1-year-old parent and so is expected to make a smaller genetic contribution in the long-term (see BIJMA and WOOLLIAMS 1999). If an individual is a parent at all ages then its genetic contribution is expected to be greater than an individual chosen for breeding only at a single age. Breeding purpose is also important: if one group of parents are given more mating opportunities, then these would be expected to have more offspring and, other factors being equal, ultimately a greater long-term genetic contribution.

For these reasons partition of the selected individuals into categories is necessary to obtain the general result. It is assumed that the categories are defined so that an individual belongs to a single category that describes its lifetime genetic contribution. To continue the example of mass selection, where the only distinction among parents is the breeding age, there would be potentially seven categories. If {x} denotes age x at breeding, then these categories are {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}. The number of categories will inevitably depend on the complexity of the breeding scheme, but the essential point is that they can be defined and enumerated. Let n_c be the number of categories indexed from q = 1 ... n_c, and µ_{i(q)} be the expected long-term contribution of individual i in category q conditional on its selective advantage s_{i(q)} with variance σ_{i(q)}^2. The steps given above in Equations 16-27 for a single category remain the same but changes are needed since terms must be redefined as vectors and matrices. The notation to develop the argument therefore becomes more complex but the result remains simple. For this reason the proof is given in Appendix B. The conclusion is that if family sizes after selection are assumed to be distributed as independent Poisson variables, then
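The enumeration of breeding categories in the mass selection example can be reproduced mechanically. This is a minimal sketch, assuming that a category is any nonempty set of breeding ages; the function name is illustrative.

```python
from itertools import combinations


def breeding_categories(ages):
    """Enumerate every nonempty set of breeding ages as a lifetime category."""
    categories = []
    for size in range(1, len(ages) + 1):
        categories.extend(combinations(ages, size))
    return categories


categories = breeding_categories([1, 2, 3])
print(len(categories))  # 7
print(categories)
```

For k possible breeding ages this gives 2^k - 1 categories, which is why the number of categories grows quickly with the complexity of the breeding scheme.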

(28)

This simple result shows that the rate of inbreeding, when approximated by the sum of squared contributions, is equal to one half of the sum of the squares of expected lifetime contributions. Instead of using the observed contribution, as in Equation 12, the expected contribution can be substituted, but this is done at the cost of changing the coefficient from 1/4 to 1/2. This is because the expected contribution is being used to model both the mean and the variance.
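A quick consistency check is possible for the case of no selection. The sketch below assumes the result takes the form 1/2 Σ_q X_q E[µ_{i(q)}^2], as described in the preceding paragraph, with two categories (males and females) and with every parent of a sex having the same expected contribution 1/(2X_q). The function name is illustrative, and the numbers of parents are those of the example scheme used later in the article.

```python
from fractions import Fraction


def delta_f_from_expected(terms):
    """1/2 * sum over categories of X_q * E[mu_i(q)^2].

    `terms` is an iterable of (X_q, E[mu_i(q)^2]) pairs.
    """
    return Fraction(1, 2) * sum(x_q * mean_sq for x_q, mean_sq in terms)


X_m, X_f = 20, 60
alpha_m = Fraction(1, 2 * X_m)  # expected contribution of a male parent
alpha_f = Fraction(1, 2 * X_f)  # expected contribution of a female parent

# Without selection mu equals alpha for every parent, so E[mu^2] = alpha^2.
predicted = delta_f_from_expected([(X_m, alpha_m ** 2), (X_f, alpha_f ** 2)])
classical = Fraction(1, 8 * X_m) + Fraction(1, 8 * X_f)

print(predicted, classical)  # 1/120 1/120
```

The agreement with the classical expression 1/(8X_m) + 1/(8X_f) for random selection with Poisson family sizes illustrates why the coefficient must be 1/2 rather than 1/4 when expected contributions replace observed ones.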

As previously, for a parent from category q, define the matrix V_{n(q),dev} of size n_c x n_c to be the (co)variance matrix for the number of selected offspring in each of the n_c categories, expressed as deviations from independent Poisson variances. For each q, neglecting terms in s (for empirical reasons given earlier), there will be a term δ_q defined by α^T V_{n(q),dev} α, where α is the vector with the qth element equal to the expected long-term contribution for an individual from category q, i.e., E_s[µ_{i(q)}] = α_q. Note δ_q may be negative since it is a variance deviation and is not a variance. This term is introduced in Equation B6 of Appendix B. From Appendix B we arrive at

(29)

Although the proof has been based upon a monoecious diploid organism with no selfing, the extension to a dioecious organism is clear from the proof for overlapping generations. Having discrete generations with two sexes is identical to having two categories, i.e., males and females. Finally note that, other than assuming an equilibrium and random mating, there have been no assumptions on the type of selection index used, the nature of the genetic variation, or the population structure.


*  APPLICATIONS AND RESULTS

Sib indices in discrete generations:
The theory is illustrated by selection on a general sib index of the form I = b_1(P - fs) + b_2(fs - hs) + b_3 hs, where P is the phenotype of the candidate, fs is the phenotypic mean of its full-sibs (including candidate), and hs is the phenotypic mean of its half-sibs (including candidate and full-sibs). Mass selection is a special case, with b_1 = b_2 = b_3 = 1 (or any constant >0). This formulation was used also by WRAY et al. (1994) in their study of rates of inbreeding. Every generation, the highest ranking X_m sires and X_f dams are selected as parents for the next generation. Each sire is mated at random to d = X_f/X_m dams and each dam produces a total of n_o offspring, n_m male, and n_f female, which are available for selection in the next generation. The unselected base population is assumed to have a phenotypic variance of 1 with a heritability of h_0^2 for the selected trait. Additional notation used for the sib index is shown in Table 2. An example is given at each step and this is a selection scheme for X_m = 20, X_f = 60, n_m = n_f = 4, with index weights b_1 = 1, b_2 = 1.5, and b_3 = 2. The principal parameters for this scheme are presented in Table 3 for easy reference.
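The index can be computed directly from the family structure. The sketch below uses hypothetical phenotypes for one sire family with two dams; the function name and the data are illustrative only, while the index weights are those of the example scheme.

```python
from statistics import mean


def sib_index(p, fs_mean, hs_mean, b1, b2, b3):
    """Sib index I = b1*(P - fs) + b2*(fs - hs) + b3*hs."""
    return b1 * (p - fs_mean) + b2 * (fs_mean - hs_mean) + b3 * hs_mean


# Hypothetical phenotypes: one sire, two dams, four offspring per dam.
half_sib_family = [[1.2, 0.4, -0.3, 0.9], [0.1, -0.5, 0.7, 0.3]]
hs = mean(p for full_sibs in half_sib_family for p in full_sibs)

for full_sibs in half_sib_family:
    fs = mean(full_sibs)  # full-sib mean, including the candidate
    for p in full_sibs:
        mass = sib_index(p, fs, hs, 1, 1, 1)       # reduces to the phenotype P
        weighted = sib_index(p, fs, hs, 1, 1.5, 2)  # example scheme weights
        print(round(p, 3), round(mass, 3), round(weighted, 3))

X_m, X_f = 20, 60
print(X_f // X_m)  # mating ratio d in the example scheme
```

With b_1 = b_2 = b_3 = 1 the family means cancel and the index is the phenotype itself, which is the sense in which mass selection is a special case.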


 

 
Table 2. Genetic parameters for a population selected with a sib index


 

 
Table 3. Principal parameters, as described in Table 2, for the example selection scheme used throughout

In WRAY et al. (1994) the selective advantages were based on the breeding values A_{i(x)}, and this approach is adopted here but slightly modified. A sire i has one selective advantage, namely, its own breeding value plus the average breeding value of its d mates (i.e., its mate group) and this aggregate value is denoted by A_{i(hs)}. A dam i has two selective advantages: first, the selective advantage of its mate (A_{i(hs)}) and second, its own breeding value expressed as a deviation from the average breeding value of the mate group to which it belongs (denoted A_{i(fs)}). The average breeding value of the full-sib family from dam i is 1/2(A_{i(hs)} + A_{i(fs)}). Thus, in this hierarchical scheme, s_{i(m)} = (A_{i(hs)}), and s_{i(f)} = (A_{i(hs)}, A_{i(fs)})^T. The two selective advantages for a dam are independent.

Expected long-term genetic contributions were modeled following WOOLLIAMS et al. (1999) as $E[r_{i(q)} \mid s_{i(q)}] = \mu_{i(q)} = \alpha_q + \beta_q^T(s_{i(q)} - \bar{s}_q)$, where $s_{i(q)}$ denotes the vector of selective advantages for a selected individual of sex q, which enters as a deviation from the mean of its contemporaries $\bar{s}_q$, $\beta_q$ is the vector of regression coefficients of $r_{i(q)}$ on $s_{i(q)} - \bar{s}_q$, $\alpha_q$ is the mean contribution of selected parents of sex q, and T denotes the transpose. In the parameterization used, the mean of $A_{i(fs)}$ is always zero. To simplify the notation it is assumed that $A_{i(hs)}$ is already expressed as a deviation from the mean of the contemporary group, and so $\bar{s}_q$ is omitted from this point onwards.

Step 1. Prediction of expected contributions: The prediction of expected genetic contributions is covered in detail by WOOLLIAMS et al. (1999). The current article only summarizes the procedure for a sib index, without derivation. Prediction of $\mu_{i(q)}$ requires the prediction of $\alpha = (\alpha_m, \alpha_f)^T$ and $\beta = (\beta_m^T, \beta_f^T)$. In discrete generations, $(\alpha_m, \alpha_f) = [1/(2X_m), 1/(2X_f)]$ always. Solutions for $\beta$ are obtained by applying the method of WOOLLIAMS et al. (1999), using BULMER's (1980) equilibrium genetic variances. A summary of the equations used is given in Appendix C. For the example, $(\alpha_m, \alpha_f) = (0.0250, 0.0083)$ and $\beta = (0.0447, 0.0149, 0.0130)$.
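As a small numerical check of step 1, the following Python sketch reproduces the mean contributions of the example scheme and evaluates the linear model for one sire and one dam. The regression coefficients are the values quoted in the text; the selective advantages (0.5 and -0.2) and the function names are hypothetical and chosen only for illustration.

```python
def mean_contributions(x_m, x_f):
    """Mean long-term contribution per selected sire and dam in
    discrete generations: (1/(2*Xm), 1/(2*Xf))."""
    return 1.0 / (2 * x_m), 1.0 / (2 * x_f)


def expected_contribution(alpha, beta, s):
    """Linear model mu = alpha + beta' s, with s already expressed as a
    deviation from the contemporary mean."""
    return alpha + sum(b * v for b, v in zip(beta, s))


x_m, x_f = 20, 60
alpha_m, alpha_f = mean_contributions(x_m, x_f)

# Regression coefficients quoted in the text for the example scheme.
beta_m = (0.0447,)          # sire: coefficient on A_hs
beta_f = (0.0149, 0.0130)   # dam: coefficients on (A_hs, A_fs)

# Hypothetical selective advantages, for illustration only.
mu_sire = expected_contribution(alpha_m, beta_m, (0.5,))
mu_dam = expected_contribution(alpha_f, beta_f, (0.5, -0.2))

print(round(alpha_m, 4), round(alpha_f, 4))
print(round(mu_sire, 5), round(mu_dam, 5))
```

The mean contributions sum to 1 over all selected parents (Xm times 1/(2Xm) plus Xf times 1/(2Xf)), which gives a quick sanity check on any set of inputs.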

Step 2. Rates of inbreeding assuming Poisson variances: From step 1, $\mu_{i(m)} = 0.0250 + 0.0447A_{i(hs)}$. The expected squared mean is a simple sum of squared terms: $X_m E[\mu_{i(m)}^2] = X_m[0.0250^2 + 0.0447^2\,\nu(A_{i(hs)})(1 - X_m^{-1})]$. The $(1 - X_m^{-1})$ term accounts for variances about the sample mean of the selected group rather than the true mean.

The terms arising from $X_f E[\mu_{i(f)}^2]$ are calculated analogously. Since the two selective advantages of the females are mutually independent, the expected mean squared is simply the sum of squared terms. From step 1, the expected long-term contribution of a female parent is

$$\mu_{i(f)} = 0.0083 + 0.0149A_{i(hs)} + 0.0130A_{i(fs)}$$

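and the sum of squared means is

$$X_f E[\mu_{i(f)}^2] = X_f[0.0083^2 + 0.0149^2\,\nu(A_{i(hs)})(1 - X_m^{-1}) + 0.0130^2\,\nu(A_{i(fs)})].$$
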
As previously mentioned, the term $A_{i(fs)}$ is defined as a deviation from the mean over all ancestors, so $\nu(A_{i(fs)})$ requires no correction.

The rate of inbreeding ignoring deviations from Poisson variances is predicted from $\Delta F = \tfrac{1}{2}(X_m E[\mu_{i(m)}^2] + X_f E[\mu_{i(f)}^2]) = 0.0158$.
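Step 2 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `delta_f_poisson` is hypothetical, the variances passed in the usage line are placeholders (the values of Table 3 are not reproduced here), and the finite-sample factor (1 - 1/Xm) is assumed to apply to the A_hs term of both sexes.

```python
def delta_f_poisson(x_m, x_f, beta_m, beta_f, var_hs, var_fs):
    """Rate of inbreeding assuming independent Poisson family sizes:
    half the sum over selected parents of the expected squared
    long-term contribution."""
    alpha_m = 1.0 / (2 * x_m)
    alpha_f = 1.0 / (2 * x_f)
    # Variance about the sample mean of the selected group (assumed to
    # apply to the A_hs term for sires and dams alike).
    finite = 1.0 - 1.0 / x_m
    e_mu2_m = alpha_m ** 2 + beta_m ** 2 * var_hs * finite
    e_mu2_f = (alpha_f ** 2
               + beta_f[0] ** 2 * var_hs * finite
               + beta_f[1] ** 2 * var_fs)  # A_fs needs no correction
    return 0.5 * (x_m * e_mu2_m + x_f * e_mu2_f)


# Without selection (all regression coefficients zero) the expression
# reduces to 1/(8*Xm) + 1/(8*Xf).
no_selection = delta_f_poisson(20, 60, 0.0, (0.0, 0.0), 0.0, 0.0)

# Placeholder variances, for illustration only.
with_selection = delta_f_poisson(20, 60, 0.0447, (0.0149, 0.0130), 0.3, 0.15)
print(round(no_selection, 6), with_selection > no_selection)
```

Setting the regression coefficients to zero recovers the familiar 1/(8Xm) + 1/(8Xf) for random selection, which is a convenient check that the factor of 1/2 is applied to expected rather than observed contributions.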

Step 3. Correction for deviations of $V_n$ from Poisson variances: Deviations from Poisson variances can be accounted for by correcting the rate of inbreeding using Equation 28, where $\delta_q = \alpha^T V_{n(q),dev}\,\alpha$ and $V_{n(q),dev}$ is the (2 x 2) matrix with (co)variances of the number of selected offspring of a parent of sex q (q = m, f) as a deviation from independent Poisson variances. The calculation of the deviation from Poisson family variance for fixed numbers of selection candidates per full-sib family is described in Appendix D. The approach adopted was derived in detail by BURROWS (1984), although extension to two sexes was required and the method was made more flexible by incorporating results from MENDELL and ELSTON (1974). Applying the method to the example gives a total correction to the predicted {Delta}F of 0.0016, and the prediction, using Equation 29, is 0.0175. The mean {Delta}F derived from 1000 simulations was 0.0183 (SE = 0.0001).

General fit: Extensive simulations were carried out assuming an infinitesimal model with factorial combinations of Xm = 20, 40, 80; d = 1, 2, 3 (and 5 for Xm = 20, 40); total offspring of 4, 8, and 16 per full-sib family equally divided between sexes; and with h2 = 0.1, 0.2, 0.4, and 0.6; weights used were (1.0, 0.75, 0.5) for d > 1 [changed to (1.0, 0.75, 0.75) for d = 1] and (1.0, 1.5, 2.0) for d > 1 [changed to (1.0, 1.5, 1.5) for d = 1]. Classical weights were also examined since these weights were the subject of the study of WRAY et al. (1994), although they are suboptimal after the first round of selection from an unselected base population. Results have been tabulated and summarized by WOOLLIAMS and BIJMA (1999).
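The factorial design described above can be enumerated with a short Python sketch. The dictionary keys and function names are hypothetical; only the two fixed weight sets are enumerated, since the classical weights depend on the parameters of each scheme.

```python
from itertools import product


def mating_ratios(x_m):
    """d = 5 was simulated only for Xm = 20 and Xm = 40."""
    return (1, 2, 3, 5) if x_m in (20, 40) else (1, 2, 3)


def index_weights(weight_set, d):
    """With d = 1 the third weight is set equal to the second."""
    b1, b2, b3 = weight_set
    return (b1, b2, b2) if d == 1 else (b1, b2, b3)


WEIGHT_SETS = ((1.0, 0.75, 0.5), (1.0, 1.5, 2.0))

schemes = [
    {
        "x_m": x_m,
        "x_f": x_m * d,          # d = Xf / Xm
        "d": d,
        "n_o": n_o,              # total offspring per full-sib family
        "n_m": n_o // 2,         # equally divided between the sexes
        "n_f": n_o // 2,
        "h2": h2,
        "weights": index_weights(w, d),
    }
    for x_m in (20, 40, 80)
    for d in mating_ratios(x_m)
    for n_o, h2, w in product((4, 8, 16), (0.1, 0.2, 0.4, 0.6), WEIGHT_SETS)
]

print(len(schemes))
```

Enumerating the design in this way makes it easy to confirm which combinations were omitted (d = 5 with Xm = 80) and how the weights change when d = 1.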

With weights (1.0, 0.75, 0.5, or 0.75) the accuracy was excellent for all schemes, with all errors <4%. With weights (1.0, 1.5, 1.5, or 2.0) accuracy was also very good, accurately tracking trends with the changes in the parameters and with a large majority of errors <2% with the exception of d = 3, h2 = 0.4, where underestimates of up to 8% were observed. The trends in rates of inbreeding were also accurately tracked with classical weights with no increases in the magnitude of the errors, even though schemes had rates of inbreeding >0.03.

The most serious trend in the errors was a pattern of underprediction characterized by high mating ratio and large family sizes (both of which increase the selection intensity) and increased family weights. More surprisingly, the errors also increased with the numbers of parents at a constant d (i.e., Xm = 20, Xf = 60 compared to Xm = 80, Xf = 240), and also the errors were not present for h2 = 0.01 and increased sharply as h2 increased. To explore these errors further, the long-term contributions for selected males were plotted against Ai(hs) for the following schemes with d = 3, weights (1.0, 1.5, 2.0): I, Xm = 20, h2 = 0.4, no = 16; II, Xm = 80, h2 = 0.4, no = 16; III, Xm = 80, h2 = 0.01, no = 16; and IV, Xm = 80, h2 = 0.4, with no = 4. The results for simulated (S) and predicted (P) were as follows: I, S = 0.0231, P = 0.0220; II, S = 0.0070, P = 0.0058; III, S = 0.0028, P = 0.0029; IV, S = 0.0037, P = 0.0037. Note that scheme II is simply scheme I with four times the number of parents and expected long-term contributions of I are consequently four times bigger than II. The prediction of {Delta}F for scheme II is close to (but not precisely) 1/4 of that for I. However, the ratio of the simulated {Delta}F for scheme II compared to I was closer to 1/3, i.e., much greater than would be expected from scaling. Serious prediction error occurs only for scheme II.
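The comparison among schemes I to IV can be made explicit with a few lines of Python. The simulated (S) and predicted (P) values are those quoted in the text; the variable and function names are hypothetical.

```python
# (simulated, predicted) rates of inbreeding quoted in the text.
results = {
    "I": (0.0231, 0.0220),
    "II": (0.0070, 0.0058),
    "III": (0.0028, 0.0029),
    "IV": (0.0037, 0.0037),
}


def relative_error(simulated, predicted):
    """Prediction error as a fraction of the simulated value
    (negative values indicate underprediction)."""
    return (predicted - simulated) / simulated


errors = {name: relative_error(s, p) for name, (s, p) in results.items()}

# Scheme II has four times the number of parents of scheme I, so simple
# scaling would suggest a ratio near 1/4.
ratio_predicted = results["II"][1] / results["I"][1]
ratio_simulated = results["II"][0] / results["I"][0]

for name, err in errors.items():
    print(name, f"{100 * err:+.1f}%")
print(round(ratio_predicted, 3), round(ratio_simulated, 3))
```

The relative error is largest in magnitude for scheme II, and the ratio of simulated rates is close to 0.30 whereas the ratio of predicted rates is close to 0.26, in line with the description above.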

Fig 1 shows that the accuracy of prediction with low h2 (scheme III) is because the linear model used is a good fit (i.e., the contributions are a simple linear regression on the selective advantage) and similarly for low selection intensity (scheme IV). However, for both the other two schemes the linear model predicts a substantial proportion of the selected males to have negative contributions, although rates of inbreeding are accurately predicted in one case (scheme I) but not in the other (scheme II).



 
Figure 1. The expected long-term contribution and lower and upper quartiles obtained from simulation (as a function of the selective advantage Ai(hs)), together with the expected long-term contribution predicted from assuming a linear model for four example schemes. The curves obtained from simulation are the result of sampling 8000 individuals. The following schemes all have d = 3 with weights (1.0, 1.5, 2.0): I, Xm = 20, h2 = 0.4, no = 16; II, Xm = 80, h2 = 0.4, no = 16; III, Xm = 80, h2 = 0.01, no = 16; and IV, Xm = 80, h2 = 0.4, no = 4. △, linear prediction; •, simulated expectation; ○, lower and upper quartiles.

Closer replicate-by-replicate analysis shows that despite the expectation, the substantially greater variance of contributions (approximately proportional to {Delta}F/Xm) in scheme I obscures the nonlinearity in the majority of replicates. When both linear and quadratic terms for the selective advantage were included in a regression model for observed contributions, the quadratic term was not statistically significant (defined here as P < 0.01) in >60% of the replicates. In contrast, for scheme II, this percentage was <15%. Thus the accuracy of prediction depends on the goodness-of-fit of the linear model within a replicate, so more parents may promote greater proportional prediction errors, even though these errors will be associated with lower rates of inbreeding.
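A within-replicate test of the kind described above might be sketched as follows in Python. This is an illustration under stated assumptions rather than the analysis actually performed: the data are synthetic, the function names are hypothetical, and the two-sided P value uses a normal approximation to the t distribution.

```python
import math
import random


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with
    partial pivoting (inputs are not modified)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, n):
            f = m[k][col] / m[col][col]
            for c in range(col, n + 1):
                m[k][c] -= f * m[col][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        tail = sum(m[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (m[k][n] - tail) / m[k][k]
    return x


def quadratic_term_test(s, r):
    """Least-squares fit of r = c0 + c1*s + c2*s^2.
    Returns (c2, t statistic, approximate two-sided P value)."""
    rows = [[1.0, v, v * v] for v in s]
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(3)]
           for i in range(3)]
    xty = [sum(row[i] * y for row, y in zip(rows, r)) for i in range(3)]
    coef = solve(xtx, xty)
    resid = [y - sum(c * x for c, x in zip(coef, row))
             for row, y in zip(rows, r)]
    sigma2 = sum(e * e for e in resid) / (len(r) - 3)
    # Var(c2) = sigma2 * [(X'X)^-1]_33, obtained by solving X'X z = e3.
    inv_33 = solve(xtx, [0.0, 0.0, 1.0])[2]
    t = coef[2] / math.sqrt(sigma2 * inv_33)
    p = math.erfc(abs(t) / math.sqrt(2.0))  # normal approximation
    return coef[2], t, p


# Synthetic replicate with 80 selected males and a curved relationship.
random.seed(1)
s = [random.gauss(0.0, 1.0) for _ in range(80)]
r = [0.005 + 0.004 * v + 0.003 * v * v + random.gauss(0.0, 0.001) for v in s]
c2, t_stat, p_value = quadratic_term_test(s, r)
print(p_value < 0.01)
```

Applying such a test to each replicate and counting how often the quadratic term is significant at P < 0.01 gives the kind of percentage compared between schemes I and II.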

The pattern of the correction for deviations from Poisson distribution for selected family sizes is worth noting. These corrections are negative for b2, b3 < 1, reduce in size as the index weights increase, and were generally positive for b2, b3 > 1. For mass selection, b1 = b2 = b3 = 1, the correction is of the order of -1/(8T).


*  DISCUSSION

The theory described in this article provides a powerful tool for predicting rates of inbreeding in selected populations and for providing insights into the forces that contribute to the rate of loss of variation. The relationship of WRAY and THOMPSON (1990) has been derived directly from consideration of identity by descent and has been extended to cover overlapping generations and nonrandom mating. Applicability was then advanced by showing how expected long-term contributions, which are predictable by general methods, can be used in place of observed long-term contributions to predict the rates of inbreeding, if random mating is assumed. Finally, the methods were applied to sib indices in discrete generations, for which the previous solutions were complex (WRAY et al. 1994). In doing so, some insight was gained into the origin of the prediction errors, and these appeared to arise from the goodness-of-fit of the models used to implement the theory rather than those used to derive it.

Theory:
The first theorem relating the rate of inbreeding in a population to the squared long-term contributions was previously derived by WRAY and THOMPSON (1990), but the proof here has several useful extensions. In contrast to WRAY and THOMPSON (1990), the proof is direct in using identity by descent rather than properties of the numerator relationship matrix, and it also incorporates nonrandom mating and overlapping generations. The simplest relationship ($\Delta F \approx \tfrac{1}{4}\sum_i r_i^2$) is not exact and was shown to underestimate the rate of inbreeding by a fraction of the order of (2{Delta}F), providing there was no major deviation from random mating; the error is therefore small for any practical scheme. In overlapping generations, with rates of inbreeding per unit time and per generation both of interest, it is shown that this error is 2({Delta}F/generation), where the generation interval was defined by the period over which the long-term genetic contributions sum to 1.
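The relationship between observed long-term contributions and the rate of inbreeding can be illustrated with a minimal Python sketch. The contributions are hypothetical values for a cohort of four sires and eight dams, the function name is an assumption, and the factor (1 - omega) follows the form given in the abstract, with omega = 0 for random mating.

```python
def delta_f_from_contributions(contributions, omega=0.0):
    """Approximate rate of inbreeding as 1/4 * (1 - omega) times the
    sum of squared long-term contributions of one cohort."""
    return 0.25 * (1.0 - omega) * sum(r * r for r in contributions)


# Hypothetical cohort: contributions of each sex sum to 1/2.
equal = [0.125] * 4 + [0.0625] * 8
unequal = ([0.25, 0.15, 0.10, 0.0]
           + [0.125, 0.125, 0.0625, 0.0625, 0.0625, 0.0625, 0.0, 0.0])

for cohort in (equal, unequal):
    assert abs(sum(cohort) - 1.0) < 1e-9  # contributions sum to 1

df_equal = delta_f_from_contributions(equal)
df_unequal = delta_f_from_contributions(unequal)
print(df_equal, df_unequal)
```

For a fixed number of parents, any departure from equal contributions increases the sum of squares and hence the rate of inbreeding implied by the relationship.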

The importance of the relationship between rates of inbreeding and squared genetic contributions is that it holds for selected populations, with no assumptions on the form of selection, providing (i) the genes are ultimately mixed, and (ii) an equilibrium exists over which a stable {Delta}F may be defined. A further caveat is that the rate obtained applies to a neutral, unlinked gene. The extension of other relationships to predict {Delta}F in selected populations does not always hold. For example, using the relationship Var({delta}q) = q(1 - q){Delta}F, where q is the frequency of a neutral gene and {delta}q is the change in frequency per unit time, will not hold if selection is not random since it assumes mutual independence of {delta}q over consecutive intervals. The increments, {delta}q, are also correlated for overlapping generations due to the many intervals over which the progeny of a single parent may be selected. As a consequence the justification for the proof by HILL (1979) for {Delta}F with overlapping generations is invalid, even in the absence of genetic selection, although the result is correct and agrees with the previous proof of HILL (1972). Closer examination of HILL (1979) shows that its justification lies in an intuitive argument for the relationship that was to be proved later by WRAY and THOMPSON (1990). Consequently the methods derived here may be seen to arise as a natural development of the results of HILL (1972, 1979) for selected populations.

The form of Equation 4 shows that the sum of squared long-term contributions for any given cohort may be usefully interpreted in the absence of an equilibrium. The sum of squared contributions for a cohort is the proportion of the new variation (the Mendelian sampling variance) arising from within that cohort that is lost to the population in the long term. This includes all mutational variance arising in prior generations, since the choice of base is arbitrary. Therefore the sum of squared contributions of cohorts (particularly those still to converge!) is important, irrespective of equilibrium, and provides a meaningful measure of risk, and merits attention in both breeding and conservation schemes. The operational tools described by GRUNDY et al. (1998) are based upon controlling sums of squared contributions of cohorts and have meaning and validity beyond the infinitesimal model (e.g., VILLANUEVA et al. 1999). However, there are clearly greater problems in providing deterministic predictive tools to analyze population dynamics if the assumption of equilibrium is removed, and those provided by WOOLLIAMS et al. (1999) assume this equilibrium.

The second, novel theorem derived in this article is concerned with showing how the formulas with observed long-term contributions may be translated into formulas with expected long-term contributions. The latter are advantageous since they use predictable entities. The major change is that the expected can be substituted for the observed, providing the constant of proportionality is increased from 1/4 to 1/2. The critical step in the proof is that the error variance of a long-term contribution given the selective advantage is related to the square of its mean, i.e., the coefficient of variation is relatively constant. Apart from random mating, the scope of this proof is very broad and is applicable to overlapping generations. The validity of the derivation was checked using general sib indices as an example in discrete generations, and a companion article (BIJMA et al. 2000) provides verification in overlapping generations for mass selection with lifetime selection, thereby removing a serious restriction of NOMURA (1996). The limitation to random mating arises from Equation 17, although in one special case, partial full-sib mating with no selection, the analysis can be completed (using results of GHAI 1965) and shown to agree with the results of CABALLERO and HILL (1992). This provides an indirect verification of Equation 13 for nonrandom mating.

WOOLLIAMS et al. (1999) show how the expected long-term contribution may be calculated in general for different inheritance models (e.g., imprinted variation, maternal additive, or sex-linked variation) with different selection indices (sib indices or best linear unbiased predictors). Using long-term contributions follows the path of WRAY and THOMPSON (1990) and WOOLLIAMS et al. (1993) and differs from SANTIAGO and CABALLERO (1995) (mass selection in discrete generations) and NOMURA (1996) (a special case of mass selection with overlapping generations), who base their predictions on genetic variation transmitted to descendants. This is because the approach using genetic variation cannot be sustained for general selection schemes. SANTIAGO and CABALLERO (1995) suggest (their Equation 13) that a change in covariance between a general selective advantage and a neutral gene following selection is determined by the reduction in genetic variation. This is true for mass selection, where the index of selection is solely a function of the total breeding value and residual error, but will not be true in general (WOOLLIAMS et al. 1999). BIJMA et al. (2000) show why there is agreement between the two approaches for mass selection in discrete generations and also why the current methods are required to cope with overlapping generations.

Prediction:
Usable predictions were obtained by WRAY et al. (1994), and an alternative form based upon WRAY et al. (1994) was used by VILLANUEVA and WOOLLIAMS (1997). However, the method of WRAY et al. (1994) was complicated, although it attempted to model the expected proliferation of ancestral lines. The authors believe the proposed method is conceptually simpler than that of WRAY et al. (1994) and is open to development.

In any attempt to obtain prediction formulas, a balance has to be achieved between accuracy and simplicity. We have used simple linear models to interpret the theory. Thus in application the prediction consists of two elements: (i) the squared expected contribution and (ii) the deviation from independent Poisson families. The first of these elements was applied precisely as described by WOOLLIAMS et al. (1999), with corrections for finite numbers only being used to obtain the sample variance of selective advantages. No other modifications were needed because the other terms in the squared expected contribution were estimates of regression coefficients, which were assumed to be relatively robust to finite sampling. This assumption may be justified in part by the excellent agreement obtained by WOOLLIAMS et al. (1999) between simulations and deterministic predictions of expected long-term contributions. The second element, calculating the deviation from independent Poisson families, only required extension of the method of BURROWS (1984) to two sexes. The correlation coefficients among full-sibs and half-sibs used for calculating this element were those obtained assuming infinite numbers but, to compensate for this, no reduction for finite samples was applied to the squared means.

The choice of selective advantages has as an objective the minimum number needed to make the selective processes in different time periods independent. Using sib indices as an example, the authors considered both the method presented, where only breeding values were included as selective advantages, and an alternative definition in which the selective advantages were the half-sib mean and deviation of the full-sib mean from the half-sib mean. The potential benefit from the alternative parameterization is that the environmental covariances in the index arising from the sib means are accounted for within the expected long-term contribution. Conditioning on the sib means is more than is strictly necessary for conditional independence between generations. However, while results using the alternative parameterization were as accurate in most cases (results not shown), the underestimates explored in the results tended to be more severe. One reason for this is that terms included in the expected long-term contribution are modeled by linear functions, whereas modeling the environmental correlations by the method of BURROWS (1984) allows part of the nonlinearity to be accounted for. Therefore, the more terms that are included linearly in the expected long-term contribution, the greater the errors arising from nonlinearity.

Nonlinear relationships between the selective advantage and long-term contributions occurred when high selection intensities were combined with moderate heritabilities, large numbers of parents, and high mating ratios. Results from including quadratic terms in the model for the expected long-term contribution (unpublished) confirm that the serious prediction errors arise from the assumption of linearity rather than from Equation 29.

There are good reasons to believe that these departures from linearity should not prove a major problem where the objective is to design effective breeding schemes. First, on pragmatic grounds the curvilinear relationship shown in Fig 1 suggests that 15% of selected males were being used with no expectation of long-term contribution to the population (this percentage is even higher if the contributions were plotted against the observed half-sib mean!). The resources used to keep and breed these animals are clearly wasted. In an ideal selection scheme, an ancestor's long-term contributions will be zero or, once its Mendelian sampling term is above a critical threshold, linearly related to the sampling term (WOOLLIAMS and THOMPSON 1994; GRUNDY et al. 1998). Consequently it would be expected that in an ideal scheme, the long-term contribution of a selected ancestor will show an approximate linearity with its breeding value. This argument suggests that if the design objective is for a scheme to generate gain efficiently from the resources available, a linear model for the relationship between the long-term contribution and the selective advantage should prove sufficient. If so, then the need for improved deterministic models to cater for the schemes with large prediction errors would be removed. The viewpoint that the schemes with large prediction errors are inefficient is supported by the results of VILLANUEVA and WOOLLIAMS (1997), who showed that when using sib indices, efficient schemes had d ≤ 2, for which the methods presented here had a good fit.

In conclusion, this article has (i) established a broader theorem (compared to WRAY and THOMPSON 1990) concerning the relationship between squared long-term genetic contributions and rates of inbreeding, in particular extending the theorem to nonrandom mating and to overlapping generations; (ii) shown that, for random mating, the relationship can be generalized from long-term contributions that are simply observed to encompass expected long-term contributions that can be predicted; and (iii) shown how these equations might be interpreted with simple linear models in the context of predicting rates of inbreeding with sib indices in discrete generations. Together with the findings of WOOLLIAMS et al. (1999), the findings of this study show how rates of inbreeding may be predicted in general populations with complex structures and genetic models.


*  ACKNOWLEDGMENTS

J.A.W. gratefully acknowledges financial support from the Ministry of Agriculture, Fisheries and Food (United Kingdom), and the support and encouragement of Prof. A. Maki-Tanila, who gave an opportunity for this work to be initiated. The contribution of P.B. was financially supported by the Netherlands Technology Foundation (STW) and coordinated by the Earth and Life Science Foundation (ALW).

Manuscript received March 30, 1999; Accepted for publication December 6, 1999.


*  APPENDIX A

THE EXPECTED MENDELIAN SAMPLING VARIANCE
The expected Mendelian sampling variance in generation 1 summed ove