Analysis of Population Structure in Autotetraploid Species
Joëlle Ronfortᵃ, Eric Jenczewskiᵃ, Thomas Bataillonᵃ, and François Roussetᵇ
ᵃ Laboratoire de Génétique et d'Amélioration des Plantes, Institut National de la Recherche Agronomique, Domaine de Melgueil, 34130 Mauguio, France
ᵇ Laboratoire Génétique et Environnement, Institut des Sciences de l'Evolution, Université des Sciences et Techniques du Languedoc, 34095 Montpellier, France
Corresponding author: Joëlle Ronfort, Laboratoire de Génétique et d'Amélioration des Plantes, Institut National de la Recherche Agronomique, Domaine de Melgueil, 34130 Mauguio, France. E-mail: ronfort{at}ensam.inra.fr
Communicating editor: M. SLATKIN
| ABSTRACT |
|---|
Population structure parameters commonly used for diploid species are reexamined for the particular case of tetrasomic inheritance (autotetraploid species). Recurrence equations that describe the evolution of identity probabilities for neutral genes in an "island model" of population structure are derived assuming tetrasomic inheritance. The expected equilibrium value of FST is computed. In contrast to diploids, the correlation of genes between individuals within populations with respect to genes between populations (FST) may vary among loci due to the particular segregation patterns expected under tetrasomic inheritance and is consequently inappropriate for estimating demographic parameters in such populations. We thus define a new parameter (ρ) and derive its relationship with Nm. This relationship is shown to be independent of both the selfing rate and the proportion of double reduction. Finally, the statistical procedure required to evaluate these parameters using data on the distribution of gene frequencies among autotetraploid populations is developed.
Due to its frequent occurrence among angiosperm species (from 30 to 50%; [...]) [...]
Due to the addition of divergent genomes, inheritance in allopolyploids is disomic; i.e., pairing behavior during meiosis is similar to that of nonhomologous pairs of chromosomes in diploids. In contrast, segregation patterns in autopolyploids are much more complex because more than two homologous chromosomes can pair during meiosis. Multivalents leading to polysomic inheritance are formed. This does not necessarily lead to random assortment of homologous chromosomes into gametes; two sister chromatids may also segregate into the same gamete (Figure 1). This phenomenon, known as "double reduction," is specific to autopolyploids. It increases the production of homozygous gametes compared to what is expected under random chromosome segregation and is thus likely to alter many basic expectations of population genetics.
Probably because of the agronomic significance of polyploid species, the consequences of polysomic inheritance and double reduction have been investigated, especially for self-fertilization and regular systems of inbreeding [...]
For diploids, the distribution of genetic diversity within and among natural populations is commonly analyzed using theoretical models of population structure, for instance, the island model or the stepping-stone model. Functions of probabilities of gene identity within and between units (populations, subpopulations), such as FST [...]
The aim of this article is to develop a theoretical framework for the analysis of population structure in autotetraploid species. Recurrence equations that describe probabilities of gene identity under the island or isolation by distance models may be generalized for the case of tetrasomic inheritance; the case of the island model is given here as an illustration. Equilibrium values for traditional F-statistics parameters are derived. Because the proportion of double reduction may vary over loci, we define an additional function of probabilities of gene identity. This parameter seems appropriate to analyze population structure in autotetraploids, because its relationship with the migration rate and the population size is shown to be independent of both the selfing rate and the proportion of double reduction. Finally, following [...]
| HIERARCHICAL GENIC STRUCTURE AND DEFINITION OF PARAMETERS |
|---|
Let Q stand for the probability of identity: Q0 for pairs of genes within individuals, Q1 for pairs of genes between individuals within subpopulations, and Q2 for pairs of genes between subpopulations. Throughout this article, the notation Qj will refer to probabilities of identity in state (IIS) and the indices j (j = 0 to 2) to the same pairs of genes. A dot over a parameter will denote probabilities of identity by descent (IBD) (i.e., Q̇j), and the standard notation ≡ will be used to distinguish the definition of parameters from their values under particular models of population structure.
Under tetrasomic inheritance, four genes are available at a given locus. A random pair of genes within individuals (Q0) can then originate either from the same gamete (probability 1/3) or from two different gametes (probability 2/3). If QA and QB denote the probabilities of IIS associated, respectively, with these two categories of pairs of genes, then Q0 = (1/3)QA + (2/3)QB.
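The 1/3 and 2/3 weights can be checked by enumerating the six unordered pairs among the four genes of an individual. The sketch below is only an illustration: the gene labels and the function names `pair_class_probabilities` and `q0` are hypothetical, not taken from the article.

```python
from itertools import combinations

# Label the four genes of a tetraploid by the gamete that contributed them:
# genes 0 and 1 came from one gamete, genes 2 and 3 from the other.
# (Illustrative labels, not from the article.)
GAMETE_OF_GENE = {0: "g1", 1: "g1", 2: "g2", 3: "g2"}


def pair_class_probabilities():
    """Return (P[same gamete], P[different gametes]) for a random gene pair."""
    pairs = list(combinations(GAMETE_OF_GENE, 2))  # the 6 unordered pairs
    same = sum(GAMETE_OF_GENE[a] == GAMETE_OF_GENE[b] for a, b in pairs)
    return same / len(pairs), (len(pairs) - same) / len(pairs)


def q0(q_a, q_b):
    """Within-individual identity as a weighted mix of the two pair classes."""
    p_same, p_diff = pair_class_probabilities()
    return p_same * q_a + p_diff * q_b
```

Only two of the six pairs join genes that came from the same gamete, which gives the weight 1/3 for QA and 2/3 for QB.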
Following [...]
![]() |
(1) |
![]() |
(2) |
![]() |
(3) |
Another parameter we will consider is
![]() |
(4) |
This parameter is analogous to the "correlation between truly outcrossed mates" in diploids [...] in autotetraploids, this relationship is moreover independent of the proportion of double reduction and therefore identical for all loci, whatever their distance to the centromere.
Let us define Q̇r as the IBD probability for two genes in different individuals located either in two different subpopulations (r = 2) or in the same subpopulation (r = 1), and use the relationship between coalescence of genes and identity probabilities [...]
![]() |
(5) |
![]() |
(6) |
Because T2 corresponds to a coalescence time, using the relationships between the coalescence of genes and identity probabilities, E[(1 - µ)^(2T2)] represents the IBD probability for a pair of genes when both are sampled in the same individual (Figure 2):
![]() |
(7) |
Unlike T2, T1 in this instance is not a coalescence time but rather the "waiting time" for two genes initially at distance r to migrate within the same individual (Figure 2). To define T1, we do not make any reference to identity between the two genes under consideration, and this waiting time will depend only on the initial distance between the two genes (r = 1 or 2) and on the way genes migrate within and between subpopulations. Hence, E[(1 - µ)^(2T1)], a quantity indexed by the initial distance r in what follows, is not an IBD probability but simply denotes the probability that neither gene has mutated during T1. Since double reduction affects only transition probabilities for genes within individuals, it affects neither T1 nor this probability. Both are consequently independent of the proportion of double reduction.
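To make the meaning of an expectation such as E[(1 - µ)^(2T)] concrete, the sketch below evaluates it for a purely hypothetical waiting time T that is geometric with per-generation success probability `p`. The geometric law, the parameter values, and the function names are assumptions for illustration only; the article does not state the distribution of T1 or T2 in the surviving text.

```python
def no_mutation_probability(mu, p, max_t=20_000):
    """Sum P(T = t) * (1 - mu)**(2 t) for t = 1..max_t, T ~ Geometric(p)."""
    gamma = (1.0 - mu) ** 2      # neither gene mutates in one generation
    total = 0.0
    prob_t = p                   # P(T = 1)
    gamma_pow = gamma            # gamma ** 1
    for _ in range(max_t):
        total += prob_t * gamma_pow
        prob_t *= 1.0 - p        # P(T = t + 1) = (1 - p) * P(T = t)
        gamma_pow *= gamma
    return total


def no_mutation_probability_closed_form(mu, p):
    """Closed form of the same series: p * gamma / (1 - (1 - p) * gamma)."""
    gamma = (1.0 - mu) ** 2
    return p * gamma / (1.0 - (1.0 - p) * gamma)
```

Under this assumption the truncated sum and the closed form agree, and the value falls as the expected waiting time 1/p grows, since more generations leave more opportunity for mutation.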
Now, following (2), and using (3), the IBD probability for two genes in different individuals at distance r reduces to
![]() |
(8) |
·: ![]() |
(9) |
This parameter is of interest for two reasons: (1) because the no-mutation probabilities E[(1 - µ)^(2T1)] introduced above (one for each r) are independent of the coefficient of double reduction, this equation shows that this is also true for ρ̇; (2) as will be shown later, the expected value of ρ̇ can be deduced with minimal effort from previous models of haploid populations.
Consider now,
![]() |
(10) |
Noting that

we can always write

which reduces to
![]() |
(11) |
This may be compared to the result of the diploid model with selfing:
![]() |
(12) |
| EQUILIBRIUM VALUES OF THE PARAMETERS IN AN ISLAND MODEL |
|---|
We consider a finite island model [...] for genes within a subpopulation, and b = [...] for genes from different subpopulations. In each subpopulation, a proportion S of offspring is produced through selfing, and the proportion of double reduction for the studied locus is denoted α.
When individuals are autotetraploid, there are 4N genes in each subpopulation. Then, provided neither gene has mutated [which occurs with probability (1 - µ)^2], genes originating from the same subpopulation are identical by descent with probability (1 + 3Q̇0)/(4N) + (1 - 1/N)Q̇1, while genes from different subpopulations are identical by descent with probability Q̇2. The recurrence relations for Q̇1 and Q̇2 are as follows (t denoting time in generations):
![]() |
(13) |
![]() |
(14) |
Combining these two relationships, we obtain
![]() |
(15) |
At equilibrium, the Qi's do not change, hence
![]() |
(16) |
Using d = a - b, this equation can be expressed as
![]() |
(17) |
Noting that
-
1 =
, then substituting this into (10) and using (17), yields
![]() |
(18) |
=
, Equation 18 becomes ![]() |
(19) |
i.e.,
![]() |
(20) |
![]() |
(21) |
As one may note, we do not need to know identity probabilities within subpopulations (Q0 and Q1) to derive these results. For diploids, the expected equilibrium value of FIS depends on the selfing rate (S) and the population size (N). For autotetraploids, it also depends on the proportion of double reduction, which increases the proportion of homozygous gametes produced [...]. With neither selfing nor double reduction (S = 0 and α = 0), FIS = 0. Equation 21 can then be further simplified into
![]() |
(22) |
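The independence from within-individual identity can be illustrated numerically. The sketch below assumes a recursion built from the identity probabilities stated at the start of this section: two genes whose parents lie in the same subpopulation are identical with probability (1 + 3Q0)/(4N) + (1 - 1/N)Q1, and with probability Q2 otherwise. It treats Q0 as a free input and solves for the equilibrium Q1 and Q2. The forms of `a` and `b` in `island_ab` are the usual finite-island expressions for a gene-level migration rate `m` among `n` subpopulations; they, the ratio in `rho_odds`, and all function names are assumptions, since the article's own equations did not survive extraction.

```python
def island_ab(m, n):
    """Probabilities that two genes had parents in the same subpopulation.

    a: genes sampled in the same subpopulation; b: in different ones.
    Assumed finite-island forms, not taken from the article.
    """
    a = (1.0 - m) ** 2 + m ** 2 / (n - 1)
    b = (1.0 - a) / (n - 1)
    return a, b


def equilibrium_q1_q2(q0, N, a, b, mu):
    """Solve the two-equation equilibrium for (Q1, Q2) with Q0 held fixed."""
    g = (1.0 - mu) ** 2                 # neither gene mutated
    c = (1.0 + 3.0 * q0) / (4.0 * N)    # both genes copied from one parent
    k = 1.0 - 1.0 / N                   # genes copied from different parents
    # Q1 = g * (a * (c + k * Q1) + (1 - a) * Q2)
    # Q2 = g * (b * (c + k * Q1) + (1 - b) * Q2)
    a11, a12 = 1.0 - g * a * k, -g * (1.0 - a)
    a21, a22 = -g * b * k, 1.0 - g * (1.0 - b)
    r1, r2 = g * a * c, g * b * c
    det = a11 * a22 - a12 * a21         # Cramer's rule for the 2 x 2 system
    return (r1 * a22 - a12 * r2) / det, (a11 * r2 - a21 * r1) / det


def rho_odds(q0, q1, q2):
    """(Q1 - Q2) / ((1 + 3 Q0)/4 - Q1), i.e. rho / (1 - rho) as assumed here."""
    return (q1 - q2) / ((1.0 + 3.0 * q0) / 4.0 - q1)
```

With `d = a - b` and `g = (1 - mu)**2`, the ratio returned by `rho_odds` equals `g * d / (N * (1 - g * d))` whatever value of Q0 is supplied. Under these assumptions, anything that acts only through Q0, such as selfing or double reduction, leaves that ratio unchanged.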
Expected values of ρ can be computed for other mutation models as previously described [...] ρr/(1 - ρr) ≈ r/(2Dσ^2) + Constant, for a pair of populations at distance r in a one-dimensional model, and ρr/(1 - ρr) ≈ ln(r)/(2πDσ^2) + Constant, in a two-dimensional model, where D is the population density and σ^2 is a measure of dispersal [...]
| POPULATION PARAMETERS ESTIMATION |
|---|
Consider a dataset describing the genotypic constitution of autotetraploid individuals sampled (at random) from a set of r subpopulations. Each subpopulation is represented by ni individuals (sample size), where i refers to the ith subpopulation. To build estimators for the level of population differentiation, we use the linear model with hierarchical effects (subpopulations, individuals within subpopulations, and genes within individuals) developed by [...] the kth gene (1 ≤ k ≤ 4, instead of 1 ≤ k ≤ 2 for diploids) in the jth sampled individual (1 ≤ j ≤ ni) of the ith subpopulation (1 ≤ i ≤ r). For a particular allele u, xijk:u = 1 if the gene is u and xijk:u = 0 otherwise, and the ANOVA setup is as follows:

Using the same developments as for diploids [...]
![]() |
(23a) |
![]() |
(23b) |
![]() |
(23c) |
where S1 = Σi ni, S2 = Σi ni^2, Wd ≡ S1 - r, Wa ≡ S1 - S2/S1, and Ww ≡ r - 1.
From Equations 23a, 23b, and 23c, we obtain
![]() |
(24) |
![]() |
(25) |
![]() |
(26) |
Now, noting that 1 + 3Q0 - 4Q1 = 1 - Q0 + 4(Q0 - Q1) =
, we have
![]() |
(27) |
An estimator of
IT
1 -
is
![]() |
(28) |
![]() |
(29) |
For all these parameters, multilocus estimates (i.e., combining the information from all alleles and all loci) are defined as the sum of locus-specific numerators divided by the sum of locus-specific denominators [...]
![]() |
(30) |
[...] α = 0 for all the studied loci. As soon as α > 0 for at least one locus, only the estimate of ρ will have this property.
| DISCUSSION |
|---|
The aim of this study was to adapt the use of Wright's FST to estimate population structure and gene flow in autotetraploid species. In contrast to diploids, FST estimates in autotetraploids are expected to vary across loci as a consequence of different amounts of double reduction during meiosis (Figure 1 and Introduction). This problem is illustrated in Equation 11 because FIS will vary depending on both the selfing rate and the proportion of double reduction (α). Since the proportion of double reduction for a given locus is difficult to assess empirically and because population structure estimates should be based on several loci, we defined a new function of identity probabilities, ρ, which is an analogue to the "correlation between truly outcrossed mates" previously defined for diploids [...] comes mainly from the fact that this relationship is also independent of the proportion of double reduction and therefore identical for all loci, whatever their distance to the centromere. The parameter ρ can consequently be used to assess population structure over many loci, without any prior knowledge concerning the proportion of double reduction.
Inspection of the relationship between ρ and FST (11) shows that FST is increased by a factor (1 + 3FIS)/4 when self-fertilization or double reduction occurs within subpopulations. This means that, like self-fertilization, double reduction reduces the effective subpopulation size and hence promotes differentiation among subpopulations (for the studied locus). The complication due to partial selfing or double reduction can be absorbed in the single parameter FIS and by defining the effective population size as NZ = [...]. Equation 21 can then be used with NZ replacing N, i.e., [...] = [...], while [...] is still equal to ρ/(1 - ρ) [...] 1/(2Nm[...] + 2Nµ), which depends only on the migration rate, mutation rate, and the demographic population size (i.e., N, not NZ). Comparison of Equation 11 with the results of the diploid model (12) further shows that self-fertilization has a greater influence on differentiation in autotetraploids than in diploids.
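The factor (1 + 3FIS)/4 can be checked numerically from identity probabilities. The sketch below assumes the usual identity-probability forms of the F-statistics, and a form of ρ inferred from the quantity (1 + 3Q0)/4 used in the island-model section. These definitions and the function names are assumptions, because the article's numbered equations did not survive extraction.

```python
def f_statistics(q0, q1, q2):
    """Return (F_IT, F_IS, F_ST, rho) from identity probabilities.

    Assumed definitions (not confirmed by the surviving text):
      F_IT = (Q0 - Q2) / (1 - Q2)
      F_IS = (Q0 - Q1) / (1 - Q1)
      F_ST = (Q1 - Q2) / (1 - Q2)
      rho  = (Q1 - Q2) / ((1 + 3 Q0) / 4 - Q2)
    """
    f_it = (q0 - q2) / (1.0 - q2)
    f_is = (q0 - q1) / (1.0 - q1)
    f_st = (q1 - q2) / (1.0 - q2)
    rho = (q1 - q2) / ((1.0 + 3.0 * q0) / 4.0 - q2)
    return f_it, f_is, f_st, rho


def odds(x):
    """x / (1 - x), the scale on which the factor applies."""
    return x / (1.0 - x)


# Hypothetical identity probabilities, for illustration only.
f_it, f_is, f_st, rho = f_statistics(0.5, 0.3, 0.1)
factor = (1.0 + 3.0 * f_is) / 4.0
```

Under these assumed definitions, `odds(f_st)` equals `factor * odds(rho)` exactly, and `rho` equals `4 * f_st / (1 + 3 * f_it)`, so the factor quoted above applies on the F/(1 - F) scale.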
When ignoring selfing and double reduction, the expected effect of drift under the island model of population structure is halved at equilibrium as compared to expectations for diploids, i.e., FST [...] 1/(1 + 4Nm[...] + 4Nµ) [...] = 0): FST = [...].
Following the linear model derived in [...] can be computed through hierarchical analyses of variance of gene frequencies. Simulations were performed to assess possible bias in the estimation of ρ due to small sample sizes. We simulated a finite island model composed of n monoecious subpopulations of size N. In each subpopulation, 10 neutral, independent loci (recombination rate = 0.5), each with K possible allelic states (K-allele model), and all segregating according to the same proportion of double reduction (α), were modeled. Initial frequencies of the different allelic states were made equal in all the subpopulations (initial frequency = 1/K). Each subpopulation had the same mating system: complete outcrossing (S = 0) or partial selfing (S > 0). We assumed discrete and nonoverlapping generations. Mutation occurred at a rate µ per locus per generation, each allele having an equal chance to mutate toward one of the K - 1 other allelic states. Migration occurred through male gametes only: to produce the next generation in a given subpopulation, each pollen grain was sampled independently, and with probability m it was chosen among gametes from the remaining n - 1 subpopulations. As shown in Figure 3, the discrepancies between the average value of the estimator and the expected value of ρ are very small even for small sample sizes, with either S = 0 or S > 0 and α > 0.
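A single-locus version of such a simulation might look like the sketch below. It is not the authors' program. The gamete model (with probability `alpha` a gamete carries two copies of one parental gene, otherwise two distinct parental genes chosen at random), the restriction of migration to outcrossed pollen, the random initial genotypes (allele frequencies of 1/K in expectation only), and all names are assumptions made for illustration.

```python
import random


def gamete(parent, alpha, rng):
    """Draw a two-gene gamete from a tetraploid parent (tuple of 4 genes)."""
    if rng.random() < alpha:             # double reduction: sister chromatids
        g = rng.choice(parent)
        return (g, g)
    return tuple(rng.sample(parent, 2))  # two distinct parental genes


def mutate(allele, mu, K, rng):
    """K-allele model: mutate to one of the K - 1 other states with prob. mu."""
    if rng.random() < mu:
        return rng.choice([s for s in range(K) if s != allele])
    return allele


def next_generation(pops, S, alpha, m, mu, K, rng):
    """Produce one discrete, nonoverlapping generation in every subpopulation."""
    n = len(pops)
    new_pops = []
    for i, pop in enumerate(pops):
        offspring = []
        for _ in range(len(pop)):
            mother = rng.choice(pop)
            ovule = gamete(mother, alpha, rng)
            if rng.random() < S:         # selfing: pollen from the same plant
                pollen = gamete(mother, alpha, rng)
            else:                        # outcrossing, pollen may be immigrant
                donor = i
                if n > 1 and rng.random() < m:
                    donor = rng.choice([d for d in range(n) if d != i])
                pollen = gamete(rng.choice(pops[donor]), alpha, rng)
            genes = ovule + pollen
            offspring.append(tuple(mutate(g, mu, K, rng) for g in genes))
        new_pops.append(offspring)
    return new_pops


def simulate(n, N, K, S, alpha, m, mu, generations, seed=0):
    """Run the island model and return the final genotypes."""
    rng = random.Random(seed)
    pops = [[tuple(rng.randrange(K) for _ in range(4)) for _ in range(N)]
            for _ in range(n)]
    for _ in range(generations):
        pops = next_generation(pops, S, alpha, m, mu, K, rng)
    return pops
```

A call such as `simulate(n=3, N=5, K=4, S=0.2, alpha=0.1, m=0.05, mu=0.01, generations=10)` returns a list of subpopulations, each a list of four-gene tuples, which could then be sampled and passed to an estimator. Extending the sketch to several independent loci would amount to repeating it with different seeds.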
We wrote a computer program estimating F-statistics and the parameter ρ according to the ANOVA setup developed above (details of the computations are given in the appendix). The program provides estimates of ρ, FST, FIS, and FIT for each allele as well as estimates combining data over alleles and over loci. To test for a departure from FST = 0, the program allows for Fisher's exact test on (population × genotypes) contingency tables [for each locus separately, see [...]]
| ACKNOWLEDGMENTS |
|---|
We thank D. Couvet and P. Jarne for discussions, M. Raymond for advice concerning the computer program, and J. M. Prosperi for comments on the manuscript. This work was supported by a grant from the French "Bureau des Ressources Génétiques" to E.J. and J.R. This is contribution number 98-085 of the Institut des Sciences de l'Evolution.
Manuscript received March 4, 1998; Accepted for publication June 23, 1998.
| APPENDIX 1 |
|---|
Computation of expected sum of squares of gene frequencies involved in estimating ρ and F-statistics: Let Eu denote the expectation of xijk:u, and
u, the expected frequency of the allele u. Then
[(xijk:u- Eu)2] =
u -
2u , where
denotes expectation. Then, summing over all alleles, we obtain
![]() |
(A1) |
![]() |
(A2) |
This, derived for different pairs of genes, yields the covariances
![]() |
(A3) |
![]() |
(A4) |
![]() |
(A5) |

i.e.,
![]() |
(A6) |
i.e.,
![]() |
(A7) |
i.e.,
![]() |
(A8) |
where S1 = Σi ni and S2 = Σi ni^2.
Now, the basic relationship
[
riwi(xi - x)2] =
[
riwi (xi - E)2] -
[
riwi(x - E)2] can be used to write sum of squares expectations, for genes within individuals,

and using (A6) and (A7), we obtain
![]() |
(A9) |
Following the same procedure and denoting Wd ≡ S1 - r, Wa ≡ S1 - S2/S1, and Ww ≡ r - 1, we find the following sum of squares expectations: for genes between individuals within subpopulations (using A6 and A7),
![]() |
(A10) |
![]() |
(A11) |
As for diploids (![]()
i:u + ßj:u +
ijk:u) can be expressed as linear functions of identity probabilities, i.e.,
![]() |
(A12) |
![]() |
(A13) |
![]() |
(A14) |
ANOVA framework for the estimation of ρ and F-statistics:
To compute sums of squares, gene frequencies were used directly instead of indicator variables (xijk). This method is based on the following relationships between gene frequency estimates and the indicator variable [see [...]]:

where pAi =
and PAAi = P0,i +
+
with P0,i, P1,i , and P2,i standing, respectively, for the proportion of monogenic (AAAA), trigenic (AAAa), and digenic (AAab) individuals in the ith population (![]()
| LITERATURE CITED |
|---|
BENNETT, J. H., 1968 Mixed self- and cross-fertilization in a tetrasomic species. Biometrics 24:485-500.
BEVER, J. D. and F. FELBER, 1992 The theoretical population genetics of autopolyploidy. Oxford Surv. Evol. Biol. 8:185-217.
BRETAGNOLLE, F. and J. D. THOMPSON, 1995 Gametes with somatic chromosome number: mechanisms of their formation and role in the evolution of autopolyploid plants. New Phytol. 129:1-22.
BRETAGNOLLE, F. and J. D. THOMPSON, 1996 An experimental study of ecological differences in winter growth between sympatric diploid and autotetraploid Dactylis glomerata. J. Ecol. 84:343-351.
COCKERHAM, C. C., 1969 Variance of gene frequencies. Evolution 23:72-84.
COCKERHAM, C. C., 1973 Analysis of gene frequencies. Genetics 74:679-700.
COCKERHAM, C. C. and B. S. WEIR, 1987 Correlations, descent measures: drift with migration and mutation. Proc. Natl. Acad. Sci. USA 84:8512-8514.
COCKERHAM, C. C. and B. S. WEIR, 1993 Estimation of gene flow from F-Statistics. Evolution 47:855-863.
CRAWFORD, D. J., 1985 Electrophoretic data and plant speciation. Syst. Bot. 10:405-416.
CROW, J. F. and K. AOKI, 1984 Group selection for a polygenic behavioral trait: estimating the degree of population subdivision. Proc. Natl. Acad. Sci. USA 81:6073-6077.
DEMARLY, Y., 1963 Génétique des tétraploïdes et amélioration des plantes. Ann. Amelior. Plantes 13:307-400.
GALLAIS, A., 1990 Théorie de la Selection en Amélioration des Plantes. Masson, Paris.
GLENDINNING, D. R., 1989 Some aspects of autotetraploid population dynamics. Theor. Appl. Genet. 78:233-242.
GRANT, V., 1981 Plant Speciation, Ed. 2. Columbia University Press, New York.
HALDANE, J. B. S., 1930 Theoretical genetics of autotetraploids. J. Genet. 22:359-372.
HAMRICK, J. L. and J. W. GODT, 1990 Allozyme diversity in plant species, pp. 43-63 in Plant Population Genetics, Breeding and Genetic Resources, edited by A. H. D. BROWN, M. T. CLEGG, A. L. KAHLER and B. S. WEIR. Sinauer Associates Inc., Sunderland, MA.
LEVIN, D. A., 1983 Polyploidy and novelty in flowering plants. Am. Nat. 122:1-25.
LEWIS, W. H., 1980 Polyploidy in angiosperms: dicotyledons, pp. 241-268 in Polyploidy, Biological Relevance, edited by W. H. LEWIS. Plenum Press, New York.
LOVELESS, M. D. and J. L. HAMRICK, 1984 Ecological determinants of genetic structure in plant populations. Annu. Rev. Ecol. Syst. 15:65-95.
MALÉCOT, G., 1948 Les Mathématiques de l'Hérédité. Masson, Paris.
MALÉCOT, G., 1975 Heterozygosity and relationship in regularly subdivided populations. Theor. Popul. Biol. 8:212-241.
MOODY, M. E., L. D. MUELLER, and D. E. SOLTIS, 1993 Genetic variation and random drift in autotetraploid populations. Genetics 134:649-657.
NAGYLAKI, T., 1983 The robustness of neutral modes of geographical variation. Theor. Popul. Biol. 24:268-294.
PETIT, C., J. D. THOMPSON, and F. BRETAGNOLLE, 1996 Phenotypic plasticity in relation to ploidy level and corm production in the perennial grass Arrhenatherum elatius. Can. J. Bot. 74:1964-1973.
PETIT, C., P. LESBROS, X. GE, and J. D. THOMPSON, 1997 Variation in flowering phenology and selfing rate across a contact zone between diploid and tetraploid Arrhenatherum elatius (Poaceae). Heredity 79:31-40.
RAYMOND, M. and F. ROUSSET, 1995 An exact test for population differentiation. Evolution 49:1280-1283.
REYNOLDS, J., B. S. WEIR, and C. C. COCKERHAM, 1983 Estimation of the coancestry coefficient: basis for a short-term genetic distance. Genetics 105:767-779.
RIESEBERG, L. H. and M. F. DOYLE, 1989 Tetrasomic segregation in the naturally occurring autotetraploid Allium nevii (Alliaceae). Hereditas 111:31-36.
ROUSSET, F., 1996 Equilibrium values of measures of population subdivision for stepwise mutation processes. Genetics 142:1357-1362.
ROUSSET, F., 1997 Genetic differentiation and estimation of gene flow from F-statistics under isolation by distance. Genetics 145:1219-1228.
SLATKIN, M., 1991 Inbreeding coefficients and coalescence times. Genet. Res. 58:167-175.
SLATKIN, M. and N. H. BARTON, 1989 A comparison of three indirect methods for estimating average levels of gene flow. Evolution 43:1349-1368.
SLATKIN, M. and L. VOELM, 1991 FST in a hierarchical island model. Genetics 127:627-629.
SOKAL, R. R., and F. J. ROHLF, 1995 Biometry, Ed. 3. Freeman and Company, New York.
SOLTIS, D. E. and P. S. SOLTIS, 1989 Genetic consequences of autopolyploidy in Tolmiea (Saxifragaceae). Evolution 43:586-594.
SOLTIS, D. E. and P. S. SOLTIS, 1993 Molecular data and the dynamic nature of polyploidy. Crit. Rev. Plant Sci. 12:243-273.
STEBBINS, G. L., 1971 Chromosomal Evolution in Higher Plants. Addison-Wesley, Reading, MA.
STEBBINS, G. L., 1985 Polyploidy, hybridization and the invasion of new habitats. Ann. MO Bot. Gard. 72:824-832.
TACHIDA, H., 1985 Joint frequencies of alleles determined by separate formulation for the mating and mutation systems. Genetics 111:963-974.
TACHIDA, H. and H. YOSHIMARU, 1996 Genetic diversity in partially selfing populations with the stepping-stone structure. Heredity 77:469-475.
THOMPSON, J. D. and R. LUMARET, 1992 The evolutionary dynamics of polyploid plants: origins, establishment and persistence. Trends Ecol. Evol. 7:302-307.
WALLER, D. M. and S. E. KNIGHT, 1989 Genetic consequences of outcrossing in the cleistogamous annual, Impatiens capensis. II. Outcrossing rates and genotypic correlations. Evolution 43:860-869.
WEIR, B. S., 1996 Genetic Data Analysis II: Methods for Discrete Population Genetic Data. Sinauer Associates, Sunderland, MA.
WEIR, B. S. and C. C. COCKERHAM, 1984 Estimating F-Statistics for the analysis of population structure. Evolution 38:1358-1370.
WRIGHT, S., 1938 The distribution of gene frequencies in populations of polyploids. Proc. Natl. Acad. Sci. USA 24:372-377.
WRIGHT, S., 1951 The genetical structure of populations. Ann. Eugen. 15:323-354.
[Figure 3: comparison of the estimator with expected equilibrium values of ρ, for sample sizes e = 10 and e = 30. Lines were computed using the expected equilibrium value of ρ for small Nm. Simulations were performed for two other parameter sets: N = 50, S = 0.2, [...].]












