Corresponding author: Jinliang Wang, Regent's Park, London NW1 4RY, United Kingdom. E-mail: jinliang.wang{at}ioz.ac.uk
Communicating editor: J. B. WALSH
| ABSTRACT |
|---|
An approach to the optimal utilization of marker and pedigree information in minimizing the rates of inbreeding and genetic drift at the average locus of the genome (not just the marked loci) in a small diploid population is proposed, and its efficiency is investigated by stochastic simulations. The approach is based on estimating the expected pedigree of each chromosome by using marker and individual pedigree information and minimizing the average coancestry of selected chromosomes by quadratic integer programming. It is shown that the approach is much more effective and much less computationally demanding to implement than previous ones. For pigs with 10 offspring per mother genotyped for two markers (each with four alleles at equal initial frequency) per chromosome of 100 cM, the approach can increase the average effective size for the whole genome by 40 and 55% if mating ratios (the number of females mated with a male) are 3 and 12, respectively, compared with the corresponding values obtained by optimizing between-family selection using pedigree information only. The efficiency of the marker-assisted selection method increases with increasing amount of marker information (number of markers per chromosome, heterozygosity per marker) and family size, but decreases with increasing genome size. For less prolific species, the approach is still effective if the mating ratio is large so that a high marker-assisted selection pressure on the rarer sex can be maintained.
A major genetic problem in maintaining small populations under captive breeding is the inescapable accumulation of inbreeding and genetic drift over generations, which puts them in jeopardy of immediate extinction due to inbreeding depression and also risks their survival in the long run due to the depletion of genetic variation and loss of evolutionary potential. [...] 2N, where N is the actual size of a monoecious population or a dioecious population with equal numbers of males and females [...]
The traditional method described above for making selection decisions to minimize inbreeding utilizes only pedigree information, which describes the expected relationship among individuals. With the same pedigree, however, individuals of diploid species still vary greatly in their genetic makeup or in the realized genetic relationship among them. In this respect, genetic markers are very useful for inferring the realized genetic relationship and could be potentially utilized in increasing Ne of small populations. For using marker information to increase Ne, several approaches such as frequency-dependent selection and selection for heterozygosity at marker loci have been proposed.
For a better utilization of genetic markers to increase Ne, [...]
In this article, I propose a simple method that optimizes the use of marker and pedigree information simultaneously to minimize inbreeding and genetic drift in a small population. Given such marker and pedigree information, the average (realized) coancestry for all loci between two diploid genomes can be estimated and then used to choose individuals that result in the minimum average (realized) coancestry among them by a standard quadratic integer programming technique. Numerical examples show that the proposed method is powerful in decreasing inbreeding and genetic drift, and also efficient in implementation compared with previous methods.
| THEORY AND METHOD |
|---|
First, the average realized coancestry between two homologous chromosomes given their respective marker genotypes and pedigrees is computed. Second, the mean coancestry between any two diploid genomes is obtained by averaging over chromosomes and is used in minimizing the average coancestry among all selected individuals by integer programming. I consider the simplest case of one marker locus per chromosome in detail and then extend the treatment to two and more marker loci per chromosome. Throughout this article, genes are used when their identities in state or identities by descent are not distinguished, while alleles are used to refer to different allelic variants at a locus in the population.
One marker locus per chromosome:
Let us consider a diploid population consisting of N1 males and N2 females, each male being mated randomly with N2/N1 females (called mating ratio and denoted by r hereafter) at each discrete generation. Each mating is assumed to give n offspring of each sex that are marker-genotyped for selection. There are at least two codominant alleles segregating in the population at a single marker locus on each chromosome of the genome. The problem is to select, from the 2nT = 2nN2 offspring available with known marker and pedigree information, N1 males and N2 females with the minimum average coancestry among them as the next generation. This is accomplished by the following procedures.
[...] s* = 1 or 2 for father or mother), then marker A comes from parent s with probability 1, PA,s = 1.
[...] (i = 1, 2). If marker A is identical to the marker on homologue i only, then QA,is = 1 and QA,i*s = 0 (i ≠ i* = 1, 2).
[...] A,is. Consider the gene at a locus situated x morgans away from the marker locus on homologue A. The probabilities that the gene and marker A have not and have recombined during transmission are $y_x = \tfrac{1}{2}(1 + e^{-2x})$ and $1 - y_x$, respectively, assuming Haldane's mapping function. Assuming that loci are distributed uniformly along the chromosome and integrating $y_x$ over the intervals 0-L1 and 0-L2:
[Equation 1]
where L1 and L2 are the distances from the marker locus to the two ends of the chromosome with total length L = L1 + L2. The probability that a random gene on offspring homologue A comes from homologue i in parent s is simply the weighted average
[Equation 2]
The equation is derived noting that RA,i*s = (1 - QA,is)PA,s = PA,s - RA,is (see step 1.3 above).
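As an illustration of the averaging step, the following minimal Python sketch evaluates Haldane's non-recombination probability and its mean over loci spread uniformly along a chromosome. The function names are hypothetical, and dividing the two integrals by the total length L = L1 + L2 is an assumption of this sketch; it is not a transcription of Equation 1.

```python
import math


def non_recombination_prob(x):
    """Probability that a gene x morgans from the marker has NOT recombined
    with it during one transmission, under Haldane's mapping function."""
    return 0.5 * (1.0 + math.exp(-2.0 * x))


def mean_non_recombination(L1, L2):
    """Mean of the non-recombination probability over loci spread uniformly
    along a chromosome whose marker lies L1 morgans from one end and L2
    morgans from the other (assumption: integrals divided by L1 + L2)."""
    def integral(length):
        # Closed form of the integral of (1 + exp(-2x)) / 2 from 0 to length.
        return 0.5 * length + 0.25 * (1.0 - math.exp(-2.0 * length))

    return (integral(L1) + integral(L2)) / (L1 + L2)


if __name__ == "__main__":
    # Marker at the centre of a 1-morgan chromosome.
    print(mean_non_recombination(0.5, 0.5))
```

The mean falls toward 1/2 as the chromosome gets longer, which is one way to see why a single marker becomes less informative about distant loci.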
[...] A',i's'. Denoting the average coancestry between homologues i (in parent s for offspring homologue A) and i' (in parent s' for offspring homologue A') as Gis,i's', the coancestry between offspring homologues A and A' is
[Equation 3]
The self-coancestry of any homologue, such as A, is always 1, GA,A = 1. The coancestry between different homologues within an offspring is calculated as
[Equation 4]
Equation 4 is derived as follows. The two homologues within an offspring always come from separate parents. The probability that homologues A and A' in the offspring originate from the parents of sexes s and s' (s ≠ s' = 1 or 2), respectively, is PA,s or PA',s' (PA',s' = PA,s) as calculated in step 1.1. Given PA,s, the probability that A is from homologue i in parent s is [...], and the probability that A' is from homologue i' in parent s' is [...]. The average coancestry between homologues A and A' is then
[...]
Since PA',s' = PA,s, the above expression reduces to (4).
[Equation 5]
where subscript Aj (A'j') refers to homologue A (A') in offspring j (j'). The self-coancestry of offspring j is
[Equation 6]
The coancestry between the two whole diploid genomes j and j',
j,j', is obtained by averaging gj,j' over chromosomes in the genome weighted by their lengths.
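The length-weighted average described here can be sketched in a few lines of Python. The function name and the example values are hypothetical; the only rule taken from the text is that per-chromosome coancestries are weighted by chromosome length.

```python
def genome_coancestry(per_chromosome, lengths):
    """Average the per-chromosome coancestries between two individuals,
    weighting each chromosome by its map length."""
    if len(per_chromosome) != len(lengths):
        raise ValueError("one coancestry value is needed per chromosome")
    total_length = sum(lengths)
    weighted = sum(g * l for g, l in zip(per_chromosome, lengths))
    return weighted / total_length


if __name__ == "__main__":
    # Two chromosomes of 1 and 3 morgans (illustrative values only).
    print(genome_coancestry([0.2, 0.4], [1.0, 3.0]))
```

With chromosomes of equal length, as in the simulations reported later, the weighted average is simply the arithmetic mean over chromosomes.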
[...] js,j's', the optimization is realized by minimizing the function
[Equation 7]
subject to the restriction
[Equation 8]
where indicator variable ujs = 1 or 0 if offspring j of sex s is selected or not. Equations 7 and 8 can be solved by integer quadratic programming or by simulated annealing methods.
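For very small candidate sets, the selection problem can be illustrated by exhaustive enumeration, which is a stand-in here and not the integer quadratic programming or simulated annealing named above. In the sketch below, the function names, the nested-list coancestry matrix `theta`, and the weighting (each sex contributing half of the genes, so that a selected male has weight 1/(2 N1) and a selected female 1/(2 N2)) are assumptions made for illustration.

```python
from itertools import combinations


def mean_coancestry(theta, males, females):
    """Average coancestry of the genes passed on by the chosen parents,
    assuming each sex contributes half of the genes (sketch assumption)."""
    weights = {j: 0.5 / len(males) for j in males}
    weights.update({j: 0.5 / len(females) for j in females})
    return sum(weights[j] * weights[k] * theta[j][k]
               for j in weights for k in weights)


def select_parents(theta, male_ids, female_ids, n_males, n_females):
    """Try every admissible set of parents and keep the one with the lowest
    average coancestry. Feasible only for small numbers of candidates."""
    best = None
    for males in combinations(male_ids, n_males):
        for females in combinations(female_ids, n_females):
            value = mean_coancestry(theta, males, females)
            if best is None or value < best[0]:
                best = (value, males, females)
    return best


if __name__ == "__main__":
    # Candidates 0 and 1 are males, 2 and 3 are females (illustrative matrix).
    theta = [
        [0.50, 0.25, 0.25, 0.00],
        [0.25, 0.50, 0.25, 0.25],
        [0.25, 0.25, 0.50, 0.25],
        [0.00, 0.25, 0.25, 0.50],
    ]
    print(select_parents(theta, [0, 1], [2, 3], 1, 1))
```

The number of candidate sets grows combinatorially with the number of offspring, which is why a dedicated integer programming or annealing routine is needed for populations of realistic size.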
When marker information is not available, the above marker-assisted selection procedure reduces to the method of selection based on minimizing the average coancestry of selected offspring calculated from pedigree information only.
A,is =
from the derivation of (2). From (3)-(5), it is clear that the coancestry between any two offspring is the average over the four pairs of parents, as expected. Therefore, the MAS procedure described above also applies to optimized between-family selection using pedigree information only. For the application of MAS in practice, missing marker genotypes for an individual or for some chromosomes within an individual can be dealt with similarly.
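The pedigree-only case can be illustrated with the standard tabular recursion for coancestry, in which the coancestry between two individuals is the average of the coancestries between one of them and the parents of the other. This is a minimal sketch, not the article's program; the function name, the `(sire, dam)` index encoding, and the assumption that founders are unrelated and non-inbred are illustrative choices.

```python
def pedigree_coancestry(pedigree):
    """Coancestry matrix from a pedigree.

    pedigree: list of (sire, dam) index pairs, one per individual, ordered so
    that parents precede their offspring; None marks an unknown parent.
    Founders are assumed unrelated and non-inbred (sketch assumption).
    """
    n = len(pedigree)
    f = [[0.0] * n for _ in range(n)]
    for j, (sire, dam) in enumerate(pedigree):
        # Self-coancestry is 1/2 (1 + coancestry between the two parents).
        between_parents = 0.0
        if sire is not None and dam is not None:
            between_parents = f[sire][dam]
        f[j][j] = 0.5 * (1.0 + between_parents)
        for i in range(j):
            # Average of the coancestries of i with each parent of j.
            with_sire = f[i][sire] if sire is not None else 0.0
            with_dam = f[i][dam] if dam is not None else 0.0
            f[i][j] = f[j][i] = 0.5 * (with_sire + with_dam)
    return f


if __name__ == "__main__":
    # Individuals 0 and 1 are founders; 2 and 3 are their full-sib offspring.
    matrix = pedigree_coancestry([(None, None), (None, None), (0, 1), (0, 1)])
    print(matrix[2][3])
```

For two offspring of the same generation, applying the recursion twice gives the average over the four pairs of parents, in line with the statement above.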
Two marker loci per chromosome:
The amount of marker information for, and its relevance to, MAS are increased by using more than one marker locus per chromosome. With two or more marker loci per chromosome, however, additional difficulty comes with determining the linkage phase. In the following, I outline the procedure for the use of two marker loci per chromosome in MAS to minimize the average coancestry among selected individuals.
k) of linkage phase k can be obtained by Bayes's theorem. The linkage phase with the larger
k value is accepted for the parent. In programming, these operations to determine linkage phases are facilitated by assigning a specific value to each marker allele and each marker genotype. Although the power of this procedure to infer linkage phases decreases with the decline in the number of offspring per parent, it seems to work well because the MAS efficiency is satisfactory (compared with a single marker locus; see results in Table 1) even with only two offspring per mother (unequal sex ratio) being available for selection.
k in the next generation if the offspring is selected as a parent.
[...] s = 1, 2), then homologue A comes from parent s with probability 1, PA,s = 1. Otherwise, homologue A could come from either parent. The probability that it comes from parent s can be calculated, by using Bayes's theorem, as $P_{A,s} = P_{A,s}(AA^*) / [P_{A,s}(AA^*) + P_{A,s^*}(AA^*)]$, where $P_{A,s}(AA^*)$ is the probability of obtaining homologues A and A* in the offspring given that they originate from parents s and s*, respectively. Probabilities $P_{A,s}(AA^*)$ and $P_{A,s^*}(AA^*)$ can be calculated easily by comparing the marker genotypes and linkage phases of the parents and the offspring. Consider the following parental and offspring haplotypes for two markers G and H as an example: GH/gh for the father, Gh/gH for the mother, and GH/Gh for the offspring. If the offspring homologue with marker alleles G and H on it is denoted as A, then obviously $P_{A,1}(AA^*) = \tfrac{1}{4}(1 - c)^2$, $P_{A,2}(AA^*) = \tfrac{1}{4}c^2$, and $P_{A,1} = (1 - c)^2 / [(1 - c)^2 + c^2]$, where c is the recombination fraction between marker loci G and H.
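The worked example (father GH/gh, mother Gh/gH, offspring GH/Gh) can be checked numerically. The sketch below assumes equal prior probabilities for the two parental origins and takes the two joint probabilities from Mendelian segregation: both offspring haplotypes are non-recombinant if GH came from the father, and both are recombinant if GH came from the mother. The function name is hypothetical.

```python
def prob_gh_from_father(c):
    """Posterior probability that the offspring haplotype GH came from the
    father, for father GH/gh, mother Gh/gH and offspring GH/Gh, where c is
    the recombination fraction between the two marker loci."""
    if not 0.0 <= c <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    # GH from father and Gh from mother: two non-recombinant gametes.
    from_father = 0.25 * (1.0 - c) ** 2
    # GH from mother and Gh from father: two recombinant gametes.
    from_mother = 0.25 * c ** 2
    return from_father / (from_father + from_mother)


if __name__ == "__main__":
    for c in (0.0, 0.1, 0.5):
        print(c, prob_gh_from_father(c))
```

The parental origin is certain with complete linkage (c = 0) and completely ambiguous with free recombination (c = 0.5).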
[...] A,is, given PA,s. The following four cases are possible. If both markers on homologue A are identical with those on each haploid of the parent, then obviously
[Equation 9]
If the marker at locus M (= 1 or 2 for left or right marker locus) is identical only with that on haploid i of the parent, and the marker at locus M* (≠ M = 1, 2) is identical with both marker genes in the parent, then it can be derived, using a procedure similar to the single marker case, that
[Equation 10]
where LM is the distance between the left marker and the left end (if M = 1) or the right marker and the right end (if M = 2) of the chromosome. If both markers on homologue A are identical only with those on haploid i in parent s, then
[Equation 11]
where L3 is the distance between the markers and L = L1 + L2 + L3.
If markers at loci M and M* (M ≠ M* = 1, 2) on homologue A are identical only with those on haploids i and i* (i ≠ i* = 1, 2) in parent s, respectively, then
[Equation 12]
For all the above cases, we always have that
[Equation 13]
The probability that a gene taken at random on homologue A comes from haploid i in the parent of sex s is
[Equation 14]
[...] A,is and the same formula described in the single-marker case. For a double-heterozygous offspring, two linkage phases need to be considered separately using procedures 2 and 3. Therefore, 4, 8, or 16 possible pairs of homologues (one from each offspring) need to be considered and the corresponding average coancestries calculated when none, one, or both of the two offspring are double heterozygotes for markers on the chromosome in question.
Many marker loci per chromosome:
Three or more marker loci per chromosome can be treated similarly to the two-marker case shown above. With an increasing number of marker loci per chromosome, the formulations become inevitably more complicated, but the extra efficiency diminishes because the restricting factor to MAS efficiency is usually the number of genotyped offspring per parent in practice.
As the number of informative marker loci increases, the parental origin of each homologue can be inferred with increasing confidence. In the limit, the identity for every bit of a homologue can be deduced from marker information. The MAS efficiency is constrained, therefore, only by the number of marker-genotyped offspring available for selection. This extreme case is considered in the simulations described below assuming that the origin of each chromosome is completely known.
| SIMULATION RESULTS |
|---|
The efficiency in increasing Ne of the marker-assisted selection method developed above was investigated by stochastic simulations and compared with that of previous methods. A total of 174 loci, each with two alleles of equal initial frequency, equally spaced on each chromosome were simulated and the realized (harmonic) mean effective size was calculated from both the decrease in heterozygosity, averaged over loci and replicate runs, and the increase in the variance of allele frequency among replicate runs, averaged over loci, between generations 5 and 20. The two methods yielded essentially the same results, which were averaged as the realized mean effective size. The efficiency of MAS was expressed as percentage increase in realized mean effective size relative to the corresponding value without MAS, which was obtained by the same procedure but using pedigree information only. The simulated population consisted either of five breeding males and 5r breeding females when the mating ratio r = 1 or 3, or of three breeding males and 3r breeding females when r = 6 or 12. Each chromosome was assumed to be 100 cM in map length, and each marker locus was assumed to have four codominant alleles at equal initial frequency. Each female had an equal number of offspring (half of each sex) genotyped for selection. For a set of parameters, 100 replicates were run.
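The heterozygosity-based estimate can be sketched with the standard expectation that heterozygosity declines by a factor 1 - 1/(2Ne) per generation. The code below is an illustrative assumption about how such an estimate may be computed, not the simulation program used here; the function names and input values are hypothetical.

```python
def ne_from_heterozygosity(h_start, h_end, generations):
    """Effective size implied by the decline in mean heterozygosity over a
    number of generations, assuming H_end / H_start = (1 - 1/(2 Ne)) ** t."""
    per_generation = (h_end / h_start) ** (1.0 / generations)
    return 1.0 / (2.0 * (1.0 - per_generation))


def percent_increase(ne_with_mas, ne_reference):
    """MAS efficiency as the percentage increase in effective size over the
    reference scheme that uses pedigree information only."""
    return 100.0 * (ne_with_mas / ne_reference - 1.0)


if __name__ == "__main__":
    # Heterozygosity at generations 5 and 20 for a population with Ne = 10.
    h5 = 0.4
    h20 = h5 * (1.0 - 1.0 / 20.0) ** 15
    print(ne_from_heterozygosity(h5, h20, 15))
    print(percent_increase(14.0, 10.0))
```

Because the estimate is taken over an interval of generations, it corresponds to a harmonic mean of the per-generation effective sizes within that interval.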
When a single-marker locus, situated at the center of a chromosome, was used for MAS, the increases in harmonic mean effective size are as shown in Table 1 (columns 2-6) for different mating ratios, numbers of chromosomes, and family sizes (half of each sex). For equal numbers of males and females, the efficiency of the present method is similar to that of [...]
When the numbers of the two sexes are different, an imbalance in genetic contribution among parents is introduced each generation. For example, a female parent whose son is selected contributes more genetically than a female parent whose daughter is selected. The imbalance in genetic contribution among parents can be minimized at each generation by using marker and pedigree information simultaneously in the present approach. Its efficiency, therefore, is much higher than that of our previous approach. When eight offspring (half of each sex) from each mother are genotyped for a single marker (four alleles at equal frequency) per chromosome in a haploid genome of 20 chromosomes (of 1 M each), for example, the average Ne can be increased by 28, 31, and 35% for mating ratios 3, 6, and 12, respectively, by the present method, while the corresponding increases are about 10, 9, and 8% by the previous method.
In comparing the efficiencies, we should note that the reference selection schemes are different between the two approaches. The reference scheme in the present approach is optimized between-family selection using pedigree information, which is superior to the reference selection scheme of [...] 13-19% depending on mating ratios (data not shown), which is similar in performance to the selection scheme combined with group mating proposed by [...]
The MAS efficiency increases with increasing mating ratio. This is because a larger mating ratio results in a higher degree of imbalance in genetic contribution among parents and also a larger paternal family size for a given maternal family size. Both factors tend to increase the efficiency of MAS. When the mating ratio is greater than one, MAS even with two genotyped offspring per mother is still effective, and the efficiency increases rapidly with increasing mating ratios (Table 1). This relationship between mating ratio and MAS efficiency for the present method is in contrast to that for our previous one, where MAS efficiency decreases with increasing mating ratio because the within-family variation in genetic contribution on which MAS acts becomes less important compared with between-family variation.
The simulation results for the efficiency of MAS with two marker loci per chromosome, the left (right) locus being situated one-third morgan to the left (right) end of the chromosome, are listed in Table 1 (columns 7-11). Compared with a single marker, use of two marker loci per chromosome generally increases MAS efficiency, and the increase is greatest when the number of chromosomes is small and family size is large, where the restricting factor for MAS efficiency is the amount of marker information and its relevance to all loci over the whole chromosome.
Compared with our previous method, use of two markers per chromosome in the present method increases the effective size enormously, especially when mating ratio is high. This is because with an increasing mating ratio, the variation in genetic contribution among families that is ignored in the previous MAS method becomes increasingly important compared with within-family variation in determining the inbreeding and drift processes.
Other issues related to the use of two marker loci per chromosome, such as marker location and chromosome length, have been considered by [...]
Columns 12-16 in Table 1 show the MAS efficiency when the identity of each chromosome is completely known, which is realized in practice by using many informative markers per chromosome. As is clear, full knowledge of chromosome origins increases MAS efficiency enormously compared with the one- or two-marker case, especially when family size is large. The number of markers required to infer unambiguously the chromosome identity varies depending on the recombination frequency of the chromosome. With no recombination (e.g., males in Drosophila), for example, one informative marker per chromosome is enough.
In Table 1, I considered populations consisting of either 5 (if r = 1 or 3) or 3 (if r = 6 or 12) males. The absolute population size, however, does not much influence the MAS efficiency if it is not very small. When r = 3 and 8 offspring per mother are genotyped for a single marker (with four alleles at equal frequency) per chromosome (1 M) in a haploid genome of 20 chromosomes, for example, the increases in Ne are 28, 31, and 30% for male numbers being 5, 10, and 20, respectively.
| DISCUSSION |
|---|
In this study, an approach to optimizing within- and between-family selections simultaneously by using marker and pedigree information, and thus minimizing the rate of inbreeding or genetic drift for a small diploid population, was proposed and its efficiency was investigated by simulations. The new approach actually estimates and records the expected pedigree of each chromosome by using marker and pedigree (for individuals) information, and the chromosome pedigree is then used in a formal way to minimize the average coancestry among selected chromosomes by standard integer programming. The target of selection is chromosomes, while individuals are considered only as carriers of chromosomes and selection units. The approach is much more effective if the mating ratio is larger than one, compared with our previous MAS method that considered separately between-family selection on pedigree and within-family selection on marker information.
It is also computationally simpler and more efficient compared with [...] 40 and 70% when a single and two markers (each with four alleles) are used in the selection, respectively. In contrast, the corresponding values are 51 and 102% if the present approach is used. It is not clear why the approach proposed herein is more efficient. A direct comparison between the two approaches in a single simulation study would be helpful.
In the context of animal breeding, the probability of descent for a QTL allele (PDQ) conditional on linked marker information has been used to compute the conditional covariance of additive effects of the QTL alleles within and between individuals for the purpose of increasing the response to selection for a quantitative trait.
Although for convenience some simplified situations were considered in the simulation, the approach applies to a wide variety of complexities encountered in practice. These issues (e.g., overlapping generations, nonrandom mating, unequal length of chromosomes, dominant markers, and different numbers and frequencies of marker alleles) as well as the potential impact on fitness and adaptation to captivity have been discussed in our previous investigation.
At present, the major obstacle to the practical application of MAS seems to be that there are few species with the necessary information on markers and their chromosomal distributions. Such information is, however, rapidly accumulating. It should be emphasized that, with MAS, the effective size varies over loci, depending on their location relative to the marker. Loci situated close to markers have a much larger effective size than those far from markers. With the same (harmonic) mean increase in Ne by MAS, therefore, it is better to use many less informative markers scattered on a chromosome rather than a single marker with high heterozygosity.
| ACKNOWLEDGMENTS |
|---|
I am grateful to Armando Caballero, Jesús Fernández, Bill Hill, Bill Jordan, Miguel Toro, and two anonymous referees for helpful comments.
Manuscript received August 4, 2000; Accepted for publication October 24, 2000.
| LITERATURE CITED |
|---|
BALLOU, J. D., and R. C. LACY, 1995 Identifying genetically important individuals for management of genetic variation in pedigreed populations, pp. 76-111 in Population Management for Survival and Recovery. Analysis Methods and Strategies in Small Population Conservation, edited by J. D. BALLOU, M. GILPIN and T. J. FOOSE. Columbia University Press, New York.
CABALLERO, A., 1994 Developments in the prediction of effective population size. Heredity 73:657-679.
CABALLERO, A. and M. A. TORO, 2000 Interrelations between effective population size and other pedigree tools for the management of conserved populations. Genet. Res. 75:331-343.
CHEVALET, C. and H. ROCHAMBEAU, 1986 Variabilité génétique et contrôle des souches non-consanguines. Sci. Tech. Anim. Lab. 11:251-257.
CROW, J. F., and M. KIMURA, 1970 Introduction to Population Genetics Theory. Harper & Row, New York.
FERNÁNDEZ, J. and M. A. TORO, 1999 The use of mathematical programming to control inbreeding in selection schemes. J. Anim. Breed. Genet. 116:447-466.
FERNANDO, R. L. and M. GROSSMAN, 1989 Marker-assisted selection using best linear unbiased prediction. Genet. Sel. Evol. 21:467-477.
FRANKHAM, R., 1995 Conservation genetics. Annu. Rev. Genet. 29:305-327.
GODDARD, M. E., 1992 A mixed model for analyses of data on multiple genetic markers. Theor. Appl. Genet. 83:878-886.
GOWE, R. S., A. ROBERTSON, and B. D. H. LATTER, 1959 Environment and poultry breeding problems. 5. The design of poultry control strains. Poultry Sci. 38:462-471.
KIMURA, M. and J. F. CROW, 1963 On the maximum avoidance of inbreeding. Genet. Res. 4:399-415.
LACY, R. C., 1995 Clarification of genetic terms and their use in the management of captive populations. Zoo Biol. 14:565-578.
LACY, R. C., 1997 Importance of genetic variation to the viability of mammalian populations. J. Mammal. 78:320-335.
NEJATI-JAVAREMI, A., C. SMITH, and J. P. GIBSON, 1997 Effect of total allelic relationship on accuracy of evaluation and response to selection. J. Anim. Sci. 75:1738-1745.
PRESS, W. H., S. A. TEUKOLSKY, W. T. VETTERLING and B. P. FLANNERY, 1992 Numerical Recipes in Fortran 77, Ed. 2. Cambridge University Press, Cambridge, UK.
ROBERTSON, A., 1964 The effect of non-random mating within inbred lines on the rate of inbreeding. Genet. Res. 5:164-167.
TORO, M., L. SILIÓ, J. RODRIGÁÑEZ, and C. RODRIGUEZ, 1998 The use of molecular markers in conservation programs of live animals. Genet. Sel. Evol. 30:585-600.
TORO, M., L. SILIÓ, J. RODRIGÁÑEZ, C. RODRIGUEZ, and J. FERNÁNDEZ, 1999 Optimal use of genetic markers in conservation programs. Genet. Sel. Evol. 31:255-261.
WANG, J., 1997 More efficient breeding systems for controlling inbreeding and effective size in animal populations. Heredity 79:591-599.
WANG, J. and A. CABALLERO, 1999 Developments in predicting the effective size of subdivided populations. Heredity 82:212-226.
WANG, J. and W. G. HILL, 2000 Marker assisted selection to increase effective population size by reducing Mendelian segregation variance. Genetics 154:475-489.
WANG, T., R. L. FERNANDO, S. VAN DER BEEK, M. GROSSMAN, and J. A. M. VAN ARENDONK, 1995 Covariance between relatives for a marked quantitative trait locus. Genet. Sel. Evol. 27:251-274.