Comparing Relative Rates of Pollen and Seed Gene Flow in the Island Model Using Nuclear and Organelle Measures of Population Structure
Matthew B. Hamilton (a, b) and Judith R. Miller (c)
(a) Department of Biology, Georgetown University, Washington, DC 20057
(b) Biological Dynamics of Forest Fragments Project, National Institute for Research in the Amazon, Manaus, AM 69011-970, Brazil
(c) Department of Mathematics, Georgetown University, Washington, DC 20057
Corresponding author: Matthew B. Hamilton, Reiss 406, Georgetown University, 37th and O Streets NW, Washington, DC 20057. E-mail: hamiltm1{at}georgetown.edu
Communicating editor: D. CHARLESWORTH
| ABSTRACT |
|---|
We describe a method for comparing nuclear and organelle population differentiation (FST) in seed plants to test the hypothesis that pollen and seed gene flow rates are equal. Wright's infinite island model is used, with arbitrary levels of self-fertilization and biparental organelle inheritance. The comparison can also be applied to gene flow in animals. Since effective population sizes are smaller for organelle genomes than for nuclear genomes and organelles are often uniparentally inherited, organelle FST is expected to be higher at equilibrium than nuclear FST even if pollen and seed gene flow rates are equal. To reject the null hypothesis of equal seed and pollen gene flow rates, nuclear and organelle FST's must differ significantly from their expected values under this hypothesis. Finite island model simulations indicate that infinite island model expectations are not greatly biased by finite numbers of populations (≥100 subpopulations). The power to distinguish dissimilar rates of pollen and seed gene flow depends on confidence intervals for fixation index estimates, which shrink as more subpopulations and loci are sampled. Using data from the tropical tree Corythophora alta, we rejected the null hypothesis that seed and pollen gene flow rates are equal but cannot reject the alternative hypothesis that pollen gene flow is 200 times greater than seed gene flow.
GENE flow in natural populations can take place in two ways: through the movement of gametes and through the movement of individuals such as progeny or adults. In seed plants, adults are usually immobile, while gene flow results from the dispersal of both pollen and seeds. In diploid species, seeds carry two nuclear alleles while pollen grains carry one nuclear allele. As a result, pollen and seed gene flows are expected to have different effects on deme size and overall levels of population structure.
The vast majority of direct measures of plant gene flow have focused on the pollen component of gene flow.
Organelle genomes provide a mechanism to estimate levels of gene flow independently in pollen and seeds, due to uniparental inheritance.
To correctly interpret these data, it is necessary to account for differences in the effective population sizes of nuclear and organelle genomes. The effective population size of organelle genomes is less than that of diploid nuclear genomes because organelle genomes are haploid and are often uniparentally inherited.
Our goal in this article is to provide a framework to test specific hypotheses about rates of pollen and seed gene flow in the island model using estimates of fixation indices from natural populations. In particular, we attempt to consolidate the theory of population structure for nuclear and organelle genomes influenced by drift and migration in WRIGHT's (1931) infinite island model. We combine and extend previous results to simultaneously include the effects of arbitrary levels of organelle genome effective population size, self-fertilization, and patterns of inheritance of organelles on expected nuclear and organelle population structures near equilibrium. We then describe a method for comparing empirical estimates of nuclear and organelle fixation indices to test the null hypothesis that rates of seed and pollen gene flow are identical under the island model. We investigate the power of the hypothesis test using simulation and show that the sampling variability of the comparison is fairly straightforward to estimate and interpret. Our comparison therefore provides a useful alternative to existing estimators of the ratio of nuclear and organelle gene flow rates established by Ennos.
Since actual populations are not infinite in number, we examined expectations for nuclear and organelle population structure in a finite island model using simulations. These simulations explore the power of our hypothesis test to distinguish fixation index estimates from nuclear and organelle data. Specifically, they indicate the dependence of power on the total number of subpopulations in the island model, the number of subpopulations sampled, and the number of nuclear loci sampled. These results suggest sampling designs that will increase the chance of rejecting the null hypothesis of identical rates of pollen and seed gene flow when the true values are different. Although we employ the perspective of pollen- and seed-mediated gene flow in plants, the results presented here are generally applicable to systems of nuclear diploid and haploid (sex) chromosomes or nuclear and organelle gene flow in animals as well.
Throughout this article we assume no heteroplasmy of organelle genomes in gametes. Although the rate of loss of organelle genetic diversity within individuals depends on several factors, […]
| INFINITE ISLAND MODEL |
|---|
Let us consider the effective population size (Ne) of selectively neutral alleles in diploid nuclear genomes and selectively neutral haplotypes in organelle genomes. In a population of Ne individuals there are 2Ne nuclear allele copies. Cytoplasmic genomes are inherited from the single haplotype present in one parent, assuming complete homoplasmy and strict uniparental transmission. In hermaphrodites there are Ne haplotype copies, since all individuals are capable of transmitting an organelle genome to progeny because each individual can act as an effective maternal or paternal parent. In a dioecious species, only one sex or one-half of the population (assuming a 1:1 breeding sex ratio) is able to transmit an organelle genome to progeny, giving Ne/2 haplotype copies. Thus, the effective population sizes of nuclear and organelle genomes will differ due to their mode of inheritance. Under these assumptions, the effective population size of an organelle genome is expected to be one-half that of the nuclear genome for hermaphrodites and one-quarter that of the nuclear genome for dioecious organisms.
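We now review WRIGHT's (1931, 1951) island model expectation for the inbreeding coefficient in generation t under drift and migration at rate m:

$$F_t = (1 - m)^2\left[a + b\,F_{t-1}\right] \tag{1}$$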
where F is the inbreeding coefficient, a is the probability of autozygosity, and b is the probability of allozygosity (b = 1 - a). At equilibrium, Ft = Ft-1 = Feq. Solving Equation 1 for Feq then yields
$$F_{eq} = \frac{a\,(1 - m)^2}{1 - b\,(1 - m)^2} \tag{2}$$
For a diploid nuclear locus we substitute a = 1/(2Ne) and b = 1 - 1/(2Ne) and assume that m^2 terms are small enough to ignore. This gives an approximation of the amount of fixation among subpopulations at equilibrium:

$$F_{ST(n)} \approx \frac{1}{4N_e m + 1} \tag{3}$$
Using similar logic, we can approximate the degree of fixation among subpopulations at equilibrium for organelles by substituting the appropriate probabilities of autozygosity and allozygosity. For hermaphrodites, where there are Ne haplotype copies and a = 1/Ne, we obtain

$$F_{ST(oh)} \approx \frac{1}{2N_e m + 1} \tag{4}$$
and for dioecious species with a 1:1 breeding sex ratio we obtain
$$F_{ST(o)} \approx \frac{1}{N_e m + 1} \tag{5}$$
(Since usage varies in the literature, we note that throughout this article FST denotes a population parameter while terms such as RST, $\hat{\theta}$, and "FST estimates" refer to estimators of the parameter FST.)
Expressions 3–5 predict that at equilibrium organelle loci should exhibit more fixation among populations than nuclear loci given an identical rate of migration.
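As a numerical check on the relationship among Expressions 3–5, the following minimal Python sketch evaluates the approximate equilibrium fixation index 1/(kNem + 1), with k = 4 for diploid nuclear loci, k = 2 for organelle loci in hermaphrodites, and k = 1 for organelle loci in dioecious species with a 1:1 breeding sex ratio. These coefficients are inferred from the effective-size argument above. The function names, the parameter values, and the exact recursion used for comparison (a standard form, F_t = (1 - m)^2 [a + (1 - a) F_{t-1}]) are assumptions made for illustration.

```python
def fst_approx(ne, m, k):
    """Approximate equilibrium F_ST = 1 / (k * Ne * m + 1), ignoring m^2 terms."""
    return 1.0 / (k * ne * m + 1.0)


def fst_exact(m, a):
    """Equilibrium of the assumed recursion F_t = (1 - m)^2 * (a + (1 - a) * F_{t-1})."""
    retained = (1.0 - m) ** 2  # probability that neither sampled copy is a migrant
    return a * retained / (1.0 - (1.0 - a) * retained)


if __name__ == "__main__":
    ne, m = 50, 0.02  # hypothetical values chosen so that Ne * m = 1
    cases = {
        "nuclear (diploid)": (4, 1.0 / (2 * ne)),      # 2Ne allele copies
        "organelle, hermaphrodite": (2, 1.0 / ne),     # Ne haplotype copies
        "organelle, dioecious": (1, 2.0 / ne),         # Ne/2 haplotype copies
    }
    for label, (k, a) in cases.items():
        print(f"{label}: approx={fst_approx(ne, m, k):.3f}, exact={fst_exact(m, a):.3f}")
```

For a fixed Nem the ordering of the three values illustrates why organelle loci are expected to show more fixation than nuclear loci even when migration rates are identical.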
Components of gene flow in seed plants:
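In seed plants, population structure due to gene migration is determined by the dispersal of nuclear alleles and organelle haplotypes in pollen and seeds. With outcrossing rate t and migration rates mpollen and mseed, the total migration rates for biparentally inherited nuclear alleles, paternally transmitted organelle haplotypes, and maternally transmitted organelle haplotypes are, respectively,

$$m_{nuclear} = t\left(\tfrac{1}{2}m_{pollen} + m_{seed}\right) + (1 - t)\,m_{seed} = \tfrac{1}{2}\,t\,m_{pollen} + m_{seed} \tag{6}$$

$$m_{paternal} = t\left(m_{pollen} + m_{seed}\right) + (1 - t)\,m_{seed} = t\,m_{pollen} + m_{seed} \tag{7}$$

$$m_{maternal} = t\,m_{seed} + (1 - t)\,m_{seed} = m_{seed} \tag{8}$$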
Organelle genomes may also be transmitted to zygotes by either a maternal or a paternal parent in a uniparental fashion. If α is the rate of maternal transmission and β the rate of paternal transmission for an organelle genome, then α + β = 1. Weighting the maternal and paternal components of migration by α and β, respectively, and noting that the α(1 - t)mseed and β(1 - t)mseed terms sum to (1 - t)mseed, yields an expression for the components of gene migration for an organelle haplotype with biparental transmission:

$$m_{organelle} = t\left(\beta\,m_{pollen} + m_{seed}\right) + (1 - t)\,m_{seed} = \beta\,t\,m_{pollen} + m_{seed} \tag{9}$$
We can verify this result by substituting values for several rates of biparental organelle transmission. With strict maternal transmission (α = 1) Equation 9 reduces to Equation 8, with strict paternal transmission (β = 1) Equation 9 reduces to Equation 7, and with biparental transmission (α = β = 0.5) Equation 9 reduces to Equation 6.
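The reductions just described can be checked with exact arithmetic. The sketch below is a minimal illustration, not the authors' code: it assumes the components of migration take the forms (1/2)t·mpollen + mseed for nuclear alleles and β·t·mpollen + mseed for organelle haplotypes, which reproduce the values (3/2)m, (5/4)m, and (21/20)m quoted in the worked examples of this article. The function names are hypothetical.

```python
from fractions import Fraction


def m_nuclear(t, m_pollen, m_seed):
    """Total migration of nuclear alleles: half of outcross pollen flow plus seed flow."""
    return Fraction(1, 2) * t * m_pollen + m_seed


def m_organelle(t, m_pollen, m_seed, beta):
    """Total migration of organelle haplotypes with paternal transmission rate beta.

    The maternal transmission rate is alpha = 1 - beta.
    """
    return beta * t * m_pollen + m_seed


if __name__ == "__main__":
    m = Fraction(1)              # equal pollen and seed rates, expressed in units of m
    t = Fraction(1, 2)           # hypothetical mixed mating system
    print("maternal only :", m_organelle(t, m, m, Fraction(0)))
    print("paternal only :", m_organelle(t, m, m, Fraction(1)))
    print("beta = 1/2    :", m_organelle(t, m, m, Fraction(1, 2)))
    print("nuclear       :", m_nuclear(t, m, m))
```

Using `Fraction` rather than floating point keeps the comparison between the biparental organelle case and the nuclear case exact.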
Null and alternative hypotheses for rates of pollen and seed gene flow:
As described above, our goal is to establish an analytical framework in which to compare different degrees of population differentiation (measured as FST) for nuclear and organelle genomes and to infer possible differences in migration rates of pollen and seed. Such comparisons require clear null hypotheses. In particular, we desire a test of the null hypothesis that pollen and seed gene flow rates are identical. Since equilibrium FST's for nuclear and organelle genomes are expected to differ due to differences in the effective size of each genome (Equations 3–5), showing that FST estimates are significantly different does not demonstrate that migration rates of pollen and seed differ. To establish a testable null hypothesis, it is necessary to account for both the effective size of allele populations for different genomes and the contributions of seed and pollen dispersal to fixation among populations at equilibrium.
The most basic null hypothesis is that seed and pollen gene flow rates are equal. If this is the case, then the mseed and mpollen terms in the expressions for components of gene flow (Equations 6–9) are equal so that the subscripts can be dropped. This provides a means of establishing null hypotheses for differences in FST estimated for different genomes. Specifically, suppose that an organism has genome 1 and genome 2, whose FST values approximately satisfy FST1 = 1/(a1Nem + 1) and FST2 = 1/(a2Nem + 1). Solving for Nem in terms of FST1 and inserting the result into the formula for FST2 yields the equation
$$F_{ST2} = \frac{1}{\dfrac{a_2}{a_1}\left(\dfrac{1}{F_{ST1}} - 1\right) + 1} \tag{10}$$
which is the expected relationship between the two FST values under the null hypothesis of equal migration rates. FST1 and FST2 will differ by a maximum amount only when the amount of fixation among populations is small (in Equation 10, as FST1 approaches zero, FST1/FST2 approaches a2/a1). As FST1 and FST2 approach one, we also expect the fixation indices to converge and differ by a factor approaching one. If a (say) 95% confidence interval for FST1 has been estimated to be [c1min, c1max], and if the null hypothesis that mseed = mpollen is true, then a 95% confidence interval for FST2 is obtained by letting FST1 = c1min and then FST1 = c1max in Equation 10, yielding the expected 95% confidence interval [c2min, c2max] for FST2 (we note that this is true because the left-hand side of Equation 10 is an increasing function of FST1). As we show later, this means that a test of the null hypothesis of equal migration rates can be carried out by comparing the confidence interval for FST2 obtained from genetic data with that expected under the null. Before discussing such a hypothesis test in detail, however, we demonstrate the calculation of appropriate a1 and a2 values under the null (mseed = mpollen) by considering several examples.
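The mapping between the two fixation indices, and the way a confidence interval is carried through it, can be sketched in a few lines of Python. The sketch follows the derivation in the text (solve FST1 = 1/(a1Nem + 1) for Nem, then insert the result into FST2 = 1/(a2Nem + 1)); the function names and the example interval are hypothetical.

```python
def expected_fst2(fst1, a1, a2):
    """Expected F_ST of genome 2 given F_ST of genome 1 under equal migration rates."""
    if not 0.0 < fst1 <= 1.0:
        raise ValueError("fst1 must lie in (0, 1]")
    nem = (1.0 / fst1 - 1.0) / a1      # solve F_ST1 = 1 / (a1 * Ne*m + 1) for Ne*m
    return 1.0 / (a2 * nem + 1.0)      # insert Ne*m into F_ST2 = 1 / (a2 * Ne*m + 1)


def expected_interval(ci1, a1, a2):
    """Map both endpoints of an interval for F_ST1; valid because the map is increasing."""
    lower, upper = ci1
    return expected_fst2(lower, a1, a2), expected_fst2(upper, a1, a2)


if __name__ == "__main__":
    a1, a2 = 6, 2                      # hypothetical coefficients for genomes 1 and 2
    ci1 = (0.05, 0.15)                 # hypothetical 95% interval for F_ST1
    low, high = expected_interval(ci1, a1, a2)
    print(f"expected interval for F_ST2: ({low:.3f}, {high:.3f})")
```

Because the map is monotonic, transforming only the two endpoints is sufficient; no resampling is needed to obtain the expected interval under the null hypothesis.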
First, consider an outcrossing (t = 1) hermaphrodite with strict maternal organelle transmission (α = 1). The components of gene migration are then mnuclear = (1/2)m + m or (3/2)m for nuclear alleles (from Equation 6) and mmaternal = m for organelle alleles (from Equation 8). If we then substitute these migration values into our equilibrium expectations for fixation among populations we obtain for nuclear alleles (Equation 3)
$$F_{ST(n)} \approx \frac{1}{4N_e\left(\tfrac{3}{2}m\right) + 1} = \frac{1}{6N_e m + 1} \tag{11}$$
and for organelle haplotypes in a hermaphrodite (Equation 4)
$$F_{ST(oh)} \approx \frac{1}{2N_e m + 1} \tag{12}$$
Therefore, the values used in Equation 10 are 6 and 2, providing an expectation for the difference between the organelle and nuclear fixation indices. Thus, if seed and pollen migration rates are identical in an outcrossing hermaphrodite with strict maternal organelle inheritance, our null hypothesis is that organelle haplotypes should show up to three times as much fixation among populations as nuclear alleles as FST(n) approaches zero. This result is logical since nuclear and organelle alleles disperse at identical rates in seeds, but nuclear alleles also disperse in pollen as haploids. In short, nuclear alleles show less fixation among populations because the effective population size of nuclear alleles is larger and they experience two components of dispersal.
As a second example, again assuming that pollen and seed dispersal rates are equal, we derive the appropriate null hypotheses for the case when organelle genomes have some degree of biparental inheritance. Consider a dioecious species with a mixed mating system (t = 0.5) and organelle transmission that is 90% maternal (α = 0.9) and 10% paternal (β = 0.1). Equation 9 is used to determine the components of organelle allele migration in pollen and seed. In this case morganelle = (21/20)m [note that this migration rate is greater than that for strict maternal inheritance by a difference of (1/2)βm because of dispersal of β organelle genomes in pollen, but only one-half of the paternally transmitted organelles actually disperse in pollen since the selfing rate is 1 - t or 1/2]. Next we observe that total migration of nuclear alleles is given by Equation 6. In this case mnuclear = (5/4)m, slightly less than would be the case under complete outcrossing. This is reasonable since selfing increases migration in seed and reduces migration in pollen.
Substituting the total migration rate for organelles into Equation 5 gives
$$F_{ST(o)} \approx \frac{1}{N_e\left(\tfrac{21}{20}m\right) + 1} = \frac{1}{\tfrac{21}{20}N_e m + 1} \tag{13}$$
while substitution of the total migration rate for nuclear alleles into Equation 3 yields
$$F_{ST(n)} \approx \frac{1}{4N_e\left(\tfrac{5}{4}m\right) + 1} = \frac{1}{5N_e m + 1} \tag{14}$$
In this case the values used in Equation 10 are 21/20 and 5. For values of FST(n) approaching zero, therefore, the fixation among populations for organelle haplotypes in this example is expected to be almost five times greater than that for nuclear alleles.
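The coefficients for both worked examples can be reproduced with a short calculation. This is a sketch under stated assumptions: equal pollen and seed migration rates, total migration (in units of m) of (1/2)t + 1 for nuclear alleles and βt + 1 for organelle haplotypes, and genome-size factors of 4 (nuclear), 2 (organelle, hermaphrodite), and 1 (organelle, dioecious). The function name `null_coefficients` is hypothetical.

```python
from fractions import Fraction


def null_coefficients(t, beta, dioecious):
    """Return (a_nuclear, a_organelle) for use in Equation 10 when m_pollen = m_seed."""
    t, beta = Fraction(t), Fraction(beta)
    m_nuc = Fraction(1, 2) * t + 1            # nuclear migration in units of m
    m_org = beta * t + 1                      # organelle migration in units of m
    a_nuclear = 4 * m_nuc                     # diploid nuclear genome: factor 4
    a_organelle = (1 if dioecious else 2) * m_org
    return a_nuclear, a_organelle


if __name__ == "__main__":
    examples = {
        "outcrossing hermaphrodite, maternal": (1, 0, False),
        "dioecious, t = 1/2, 10% paternal": (Fraction(1, 2), Fraction(1, 10), True),
    }
    for label, args in examples.items():
        a_n, a_o = null_coefficients(*args)
        print(f"{label}: a_nuclear={a_n}, a_organelle={a_o}, max ratio={a_n / a_o}")
```

The ratio a_nuclear/a_organelle is the largest factor by which organelle fixation is expected to exceed nuclear fixation, approached as the nuclear fixation index tends to zero.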
In general, when seed and pollen dispersal rates are equal, the total migration rate of organelle alleles for all levels of biparental transmission will be at least m. This is because all alleles must be dispersed in seed with any level of outcrossing. The total migration of organelle alleles will thus be increased by the degree of paternal transmission for outcross matings, since dispersal in these matings is through both pollen and seed.
The examples presented above take the perspective of establishing null hypotheses for relative levels of nuclear and organelle subdivision when pollen and seed migration rates are equal. However, it is a simple extension to incorporate into a null hypothesis any available data on rates of pollen and seed gene flow. If, for example, pollen and seed migration rates as well as the mating system have been estimated, it will be possible to formulate a null hypothesis for the degree of fixation among populations for organelle and nuclear alleles at drift-migration equilibrium. It is also possible to use observed levels of fixation among populations for organelle and nuclear alleles to estimate rates of pollen and seed migration that would be expected to produce such a pattern. For all of these cases, however, it is important to bear in mind that the hypotheses generated are dependent on the assumptions of Wright's island model, namely that populations are near equilibrium and that drift and migration are the dominant processes influencing population subdivision.
At this point we switch from considering only theoretical expectations, which are exact, to potential comparisons of FST(n) and FST(o) under various regimes of pollen and seed gene flow based on empirical estimates of fixation indices influenced by sampling variance.
Methods for comparing nuclear and organelle gene flow:
Before presenting an application and power analysis of the null hypotheses we have developed in the sections above, we pause to compare our approach to that of Ennos, whose statistic estimates the ratio of pollen to seed gene flow from nuclear and organelle fixation indices:
![]() |
(15) |
where A = (1/FST(n)) - 1 and C = (1/FST(oh)) - 1 (this relationship was obtained by substitution of total gene migration rates into Equation 3 and Equation 4 above, and then setting these quantities equal to each other and solving to produce the ratio in Equation 15). When the amount of seed gene flow is small, y is near unity and the ratio of pollen to seed gene flow is then (A - 2C)/C.
Ennos's statistic is a point estimate of the ratio of pollen to seed gene flow estimated from nuclear and organelle fixation indices. To test hypotheses about rates of seed and pollen gene flow or to compare ratio estimates from empirical data, it is necessary to estimate the sampling variance of the statistic(s) employed. Because Ennos's statistic is a ratio and has a denominator with a strongly nonlinear relationship to the organelle fixation index, its sampling variance is problematic to estimate and interpret. A simple calculation indicates the reason for this difficulty: If x and y are known to within error bounds Δx and Δy, then the error Δf of the linear approximation to (1 - x)/(1 - y) satisfies, to leading order, the equation

$$\Delta f \approx \frac{1}{1 - y}\left[\Delta x + \frac{1 - x}{1 - y}\,\Delta y\right] \tag{16}$$

which grows without bound as y approaches one. This calculation shows that the natural approach of using a Taylor series to estimate the sampling variance of Ennos's statistic will introduce unacceptable amounts of error for the regimes we are interested in, namely low rates of seed gene flow. What is more, it shows why even an empirical approach such as bootstrapping will produce error bounds that may be complicated to interpret. The reason is that the large coefficient 1/(1 - y) multiplies both the Δx and the Δy terms, making it difficult to determine whether error in estimating the numerator or denominator is the factor most limiting the precision of the overall estimate. We have thus chosen not to employ Ennos's statistic, but rather to estimate FST(n) and FST(o) separately, because this approach yields empirical sampling distributions that are more easily interpreted.
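The amplification of error as y approaches one can be illustrated numerically. The sketch below uses first-order error propagation for f(x, y) = (1 - x)/(1 - y); the function names and the input values are hypothetical and are chosen only to contrast a moderate value of y with a value near one.

```python
def ratio(x, y):
    """f(x, y) = (1 - x) / (1 - y)."""
    return (1.0 - x) / (1.0 - y)


def linear_error_bound(x, y, dx, dy):
    """First-order bound on the error in f given error bounds dx and dy.

    |df/dx| = 1 / (1 - y) and |df/dy| = (1 - x) / (1 - y)^2, so both
    contributions carry the common factor 1 / (1 - y).
    """
    return (dx + (1.0 - x) / (1.0 - y) * dy) / (1.0 - y)


if __name__ == "__main__":
    x, dx, dy = 0.06, 0.02, 0.02        # hypothetical estimate and error bounds
    for y in (0.50, 0.90, 0.96, 0.99):
        bound = linear_error_bound(x, y, dx, dy)
        print(f"y={y:.2f}  f={ratio(x, y):8.2f}  first-order error bound={bound:10.2f}")
```

With identical error bounds on x and y, the bound on the ratio grows rapidly as y nears one, which is the regime of low seed gene flow discussed above.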
We stress that whether one compares two FST estimates by calculating their ratio, by comparing confidence intervals (as we do below), or by any other means, it is the error in estimating each individually that ultimately determines the power of comparison. We have chosen to create separate confidence intervals for FST(n) and FST(o) on the grounds that this approach makes it easier to interpret the sources of estimation error, no additional estimation error is introduced by linear approximation of a markedly nonlinear function, and the construction of null hypotheses is more obvious. In some cases, it could be convenient to compute a single statistic that summarizes the comparison between the two FST estimates. One could use Ennos's statistic for this purpose, although for the reasons given above, we argue that a difference of scaled FST estimates is preferable. This is because the sampling variance of the difference is simply the sum of the individual sampling variances and therefore is more straightforward to interpret than the sampling variance of a ratio. It seems to us, however, that use of any single statistic has the potential to obscure patterns clearly discernible in the separate FST estimates.
| FINITE ISLAND MODEL SIMULATIONS |
|---|
Using simulations of drift and gene flow in a finite island model, we estimated nuclear and organelle fixation indices and their 95% confidence intervals across a range of total numbers of subpopulations and numbers of subpopulations sampled. The goal of these simulations was to examine the power to distinguish differences in the values of fixation indices for nuclear and organelle loci in the finite island model under equal and unequal rates of pollen and seed gene flow.
We conducted simulations of drift and migration for monoecious individuals with arbitrary rates of paternal gamete (pollen) and progeny (seed) migration, outcrossing, rates of maternal and paternal transmission of organelles, effective size of subpopulations, and total numbers of subpopulations. Nuclear genotypes were diploid, all loci had two alleles, and generations were nonoverlapping. All subpopulations were equal and constant in size and mating was random within subpopulations. Allele frequency change caused by genetic drift was modeled with KIMURA's (1980) "pseudo-sampling variable" (PSV) method. Migration was based on individual zygote or gamete movement where the product of the migration rate and the effective size (Nem) determined the number of gametes or progeny leaving or entering each population every generation. When Nem was not a whole number, the remainder determined the chances of an individual migration event in each generation. Genotype-specific immigration rates were determined by the average genotype frequency among all subpopulations, equivalent to assuming that all subpopulations have equal rates of gene flow with all other subpopulations. In each generation the order of events was genetic drift, gamete dispersal, mating, and progeny dispersal.
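To make the structure of such a simulation concrete, the following is a deliberately simplified Python sketch of a finite island model for a single biallelic organelle locus in a hermaphrodite. It is not the authors' program: drift is modeled by direct binomial sampling rather than the pseudo-sampling variable method, migration is applied deterministically before drift rather than in the order of events listed above, only seed-mediated migration of a maternally inherited haplotype is represented, and the fixation index is computed as a simple variance ratio. All function names and parameter values are hypothetical.

```python
import random


def simulate_haploid_island(n_demes=200, n_copies=50, m=0.02,
                            generations=200, p0=0.5, seed=1):
    """Return per-deme haplotype frequencies after drift and migration."""
    rng = random.Random(seed)
    freqs = [p0] * n_demes
    for _ in range(generations):
        p_bar = sum(freqs) / n_demes            # migrant pool: mean over all demes
        new_freqs = []
        for p in freqs:
            p_mig = (1.0 - m) * p + m * p_bar   # frequency after migration
            # drift: draw n_copies gene copies with success probability p_mig
            k = sum(1 for _ in range(n_copies) if rng.random() < p_mig)
            new_freqs.append(k / n_copies)
        freqs = new_freqs
    return freqs


def fst_from_frequencies(freqs):
    """Variance-ratio F_ST = Var(p) / (p_bar * (1 - p_bar)) across demes."""
    n = len(freqs)
    p_bar = sum(freqs) / n
    if p_bar <= 0.0 or p_bar >= 1.0:
        return float("nan")                     # undefined once the allele is fixed or lost
    var_p = sum((p - p_bar) ** 2 for p in freqs) / n
    return var_p / (p_bar * (1.0 - p_bar))


if __name__ == "__main__":
    freqs = simulate_haploid_island()
    print(f"simulated F_ST across {len(freqs)} demes: {fst_from_frequencies(freqs):.3f}")
    print(f"infinite island approximation 1/(2*Ne*m + 1): {1.0 / (2 * 50 * 0.02 + 1):.3f}")
```

Repeating the run with different seeds, or recording the fixation index every generation, gives a rough sense of the generation-to-generation variation that a finite number of subpopulations introduces.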
The fixation indices ($\hat{\theta}$ and GST) were calculated in our simulations by two methods: analysis of variance […]
Before presenting the simulation results in detail, we note that two types of sampling variance may cause allele frequencies in finite numbers of subpopulations to evolve differently than expected under the infinite island model.
In the simulations presented here, fixation indices ($\hat{\theta}$ and GST) were estimated using all individuals (i.e., a complete census) in randomly sampled subpopulations. Although the use of a complete census might obscure a potentially significant source of sampling error (i.e., error due to within-population sampling of individuals), in fact it is well known in the sample survey literature […]
| SIMULATION RESULTS |
|---|
Although not completely realistic in all aspects, the finite island model is a better approximation of actual organisms than the infinite island model since biological populations are clearly not infinite in number. The key question addressed in our simulations is, "Can the finite island model be used to predict levels of population structure under drift and migration such that it is possible to distinguish cases where rates of pollen and seed gene flow are roughly equal or where rates are very different?" To answer this question we ran simulations with a variety of parameter values to estimate FST(n) and FST(o). We first discuss simulation results for equal rates of pollen and seed gene flow.
The simulation results allowed us to compare levels of population subdivision in the finite and infinite island models. In general, the level of population subdivision for both the nuclear and organelle genomes in the finite island model closely matched expectations from the infinite island model. However, the finite case does differ in two major respects from the infinite case. First, fixation index estimates showed more random variation across generations with decreasing total numbers of populations. This result is expected since a finite number of subpopulations causes random variance in total population allele frequency, which can be thought of as total population genetic drift, and increases as the total number of subpopulations decreases.
Fig 3 shows simulation results for 10 populations sampled from a total of 200 populations with equal and moderate rates of pollen and seed gene flow (Nem = 4.0). Simulation results in Fig 4 are for the same conditions except that migration rates are very low (Nem = 0.02). In both cases, random variation in fixation index estimates across generations from the small total number of subpopulations is evident and the 95% confidence intervals for the nuclear and organelle estimates broadly overlap. In these examples the organelle genome showed higher levels of population subdivision (as expected due to the smaller effective size of a uniparentally inherited, haploid genome), although the nuclear and organelle fixation indices are statistically indistinguishable at the 95% confidence level.
These examples are representative of our simulations for up to 10,000 total subpopulations under equal rates of pollen and seed gene flow. Sampling more subpopulations decreased the width of confidence intervals, while increasing the number of subpopulations in the finite island model reduced generation-to-generation variation in fixation index estimates. As expected, the two methods of fixation index estimation ($\hat{\theta}$ or GST) gave very similar results with two alleles, so only results for $\hat{\theta}$ are presented.
We also explored organelle and nuclear population differentiation for 100, 200, 500, and 1000 total subpopulations in the finite island model with sampling sizes of 5, 10, 25, and 50 subpopulations. Simulation results for mean $\hat{\theta}$ and the 2.5 and 97.5 percentiles from 1000 independent subpopulation samples for each set of conditions are given in Table 1.
These simulations highlight two patterns in the finite island model. First, mean estimates of $\hat{\theta}$ were relatively insensitive to variation in the total number of populations in the island model. In Table 1 the infinite island model expectations are FST = 1/7 for nuclear loci and FST = 1/3 for organelle loci. All levels of total subpopulations in the finite island model produced mean estimates of $\hat{\theta}$ that were near infinite island model expectations. This demonstrates that infinite island model expectations for levels of organelle and nuclear population differentiation should not be greatly biased by finite numbers of populations if there are at least 100 populations experiencing gene flow. In practice, the sampling variance associated with empirical estimates of fixation indices will likely be much greater than systematic bias in fixation indices due to finite numbers of subpopulations. For example, compare the 95% confidence intervals for fixation indices in Fig 3 and Fig 4 with the difference between the infinite island model expected and finite island model simulation mean of $\hat{\theta}$ (Table 1).
The second pattern apparent in Table 1 is the relationship between the number of populations sampled to estimate fixation indices and the distribution of those estimates. The confidence intervals for nuclear and organelle estimates of $\hat{\theta}$ were independent of the total number of island subpopulations, but clearly widened as the number of sampled subpopulations decreased. It may, therefore, be difficult to statistically distinguish levels of nuclear and organelle population differentiation when small numbers of populations are sampled. For example, the 95% confidence intervals for organelle and nuclear $\hat{\theta}$ overlapped when 5 or 10 subpopulations were sampled across all levels of total subpopulations. However, when 25 or 50 subpopulations were sampled the distributions of organelle and nuclear $\hat{\theta}$ were nonoverlapping in all cases.
Sampling multiple loci reduced the width of fixation index confidence intervals. Table 2 shows simulation results for mean $\hat{\theta}$ and the 2.5 and 97.5 percentiles from 1000 independent samples of 10 subpopulations for 1, 5, and 10 nuclear loci drawn from 100, 200, 500, and 1000 total subpopulations. In Table 2, as in Table 1, the infinite island model expectations are FST = 1/7 for nuclear loci. The confidence intervals for nuclear estimates of $\hat{\theta}$ were relatively independent of the total number of subpopulations, but decreased as the number of loci sampled increased.
It will be increasingly difficult to distinguish nuclear and organelle fixation estimates under equal rates of pollen and seed gene flow as Nem approaches zero (or as FST approaches one). For any level of Nem, however, sampling more subpopulations and loci will reduce the sampling variance of fixation index estimates and give greater power to distinguish nuclear and organelle population differentiation. Finally, we note that, using simulation, it should also be possible to provide a power analysis of any empirical comparison of nuclear and organelle fixation indices. Thus, failure to reject the null hypothesis of equal rates of pollen and seed gene flow can be interpreted in the context of the number of subpopulations and loci sampled.
| AN EXAMPLE |
|---|
As an empirical example, we tested the hypothesis that seed and pollen gene flow are equal using organelle haplotype and nuclear genotype data from the hermaphroditic tropical tree Corythophora alta. Chloroplast haplotype data placed $\hat{\theta}$ at 0.962 with a 95% confidence interval of 0.835–1.0 (here $\hat{\theta}$ and the confidence interval based on 1000 bootstrap population samples with replacement were estimated with a Matlab program as in the simulations). Data for one nuclear microsatellite locus from these same individuals and populations (locus CTC 40-3 no. 12, n = 147; M. B. HAMILTON, unpublished data) provided an estimate of RST = 0.059 with a 95% confidence interval of 0.009–0.182 (estimated with RSTCALC 2.2).
Empirical estimates of the outcrossing rate and degree of chloroplast maternal transmission are not yet available, so we assumed complete outcrossing and complete maternal organelle transmission (the implications of these assumptions are considered below). Under these assumptions in the infinite island model with equal rates of pollen and seed gene flow, applying Equation 10 to the observed 95% confidence interval (0.009, 0.182) for FST(n) yields an expected 95% confidence interval (C.I.) for FST(oh) of (0.027, 0.400) [with an expected point estimate of FST(oh) = 0.158 or 2.68 times the nuclear fixation estimate]. This expected interval does not overlap the observed chloroplast confidence interval of 0.835–1.0. Likewise, applying Equation 10 to the observed 95% confidence interval (0.835, 1.0) for FST(oh) yields an expected 95% C.I. for FST(n) of (0.628, 1.0) [with an expected point estimate of FST(n) = 0.869]. Again, this expected interval does not overlap the observed nuclear confidence interval of (0.009, 0.182).
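The interval comparison for C. alta can be retraced with the sketch below. It assumes coefficients of 6 for the nuclear genome and 2 for the organelle genome, as in the outcrossing hermaphrodite example with strict maternal transmission, and uses the mapping obtained by solving FST1 = 1/(a1Nem + 1) for Nem. The function names are hypothetical; the interval endpoints are those reported in the text.

```python
def expected_fst2(fst1, a1, a2):
    """Expected F_ST of genome 2 given F_ST of genome 1 under equal migration rates."""
    nem = (1.0 / fst1 - 1.0) / a1
    return 1.0 / (a2 * nem + 1.0)


def intervals_overlap(ci_a, ci_b):
    """True when two closed intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


A_NUCLEAR, A_ORGANELLE = 6, 2                  # assumed: t = 1, strict maternal transmission
observed_nuclear = (0.009, 0.182)              # observed 95% C.I. for the nuclear locus
observed_chloroplast = (0.835, 1.0)            # observed 95% C.I. for the chloroplast

# Expected organelle interval implied by the nuclear data under the null hypothesis.
expected_chloroplast = tuple(
    expected_fst2(f, A_NUCLEAR, A_ORGANELLE) for f in observed_nuclear
)
# Expected nuclear interval implied by the chloroplast data under the null hypothesis.
expected_nuclear = tuple(
    expected_fst2(f, A_ORGANELLE, A_NUCLEAR) for f in observed_chloroplast
)

if __name__ == "__main__":
    print("expected chloroplast interval:", [round(v, 3) for v in expected_chloroplast])
    print("overlaps observed chloroplast interval:",
          intervals_overlap(expected_chloroplast, observed_chloroplast))
    print("expected nuclear interval:", [round(v, 3) for v in expected_nuclear])
    print("overlaps observed nuclear interval:",
          intervals_overlap(expected_nuclear, observed_nuclear))
```

Both comparisons lead to the same conclusion because the mapping is monotonic and invertible; scaling either interval onto the other genome's scale is a matter of convenience.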
Finite island model simulations provide an additional point of view on these results. Simulation results for a hermaphrodite with equal rates of pollen and seed gene flow, complete outcrossing, strict maternal organelle transmission, and Nem = 4.0 or Nem = 0.02 are shown in Fig 3 and Fig 4, respectively. These simulations establish a null hypothesis for nuclear and organelle population structure in the finite island model with equal rates of pollen and seed gene flow. Clearly, the C. alta data do not fit these predictions. In particular, the simulations predict that the 95% confidence intervals for nuclear and organelle fixation indices are likely to overlap considerably, even though the organelle genome point estimates are often greater than the nuclear point estimates as expected. In the actual data, the 95% confidence intervals do not overlap and the nuclear RST is near zero, while the chloroplast FST is not significantly different from one. Further, as discussed in the preceding paragraph, the 95% confidence intervals for nuclear and organelle fixation indices still do not overlap when either is scaled by Equation 10 under the assumption of equal rates of pollen and seed gene flow.
Having rejected at the 90% level the null hypothesis that pollen and seed gene flow rates are equal, we used the finite island model simulations to explore expectations under various alternative hypotheses for rates of seed and pollen gene flow. Simulations were run using equal effective population sizes for both genomes but unequal pollen and seed migration rates. The gene flow rates for pollen and seed were based on migration values that would produce the observed levels of nuclear and chloroplast Nem, given that the values of Ne for the two genomes must be equal.
The results of one simulation are given in Fig 5, which shows nuclear and organelle fixation among populations when pollen gene flow is 200 times greater than seed gene flow (mp = 0.08, ms = 0.0004, Ne = 50). With asymmetric rates of pollen and seed gene flow, the simulations predict that organelle FST reaches and remains at one due to low levels of gene flow in seeds, while the nuclear locus shows modest levels of population subdivision due to a higher rate of gene flow in pollen (Fig 5). The simulation results closely match the observed levels of population subdivision in C. alta. Indeed, the confidence intervals for the simulation and observed fixation index estimates broadly overlap for both the organelle and the nuclear genomes. Thus, the observed data are consistent with the expectation of asymmetric levels of pollen and seed gene flow in the finite island model.
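As a rough cross-check on these parameter values, the infinite island equilibrium expectations can be computed directly. The sketch below assumes the standard approximations FST(n) ≈ 1/(1 + 4Ne(ms + mp/2)) and FST(oh) ≈ 1/(1 + 2Ne·ms) for a hermaphrodite with complete outcrossing and strict maternal organelle transmission, and it treats Ne = 50 as the number of individuals per population. These formulas, that interpretation of Ne, and the function names are assumptions of the sketch; it does not reproduce the finite island simulations of Fig 5.

```python
# Infinite island equilibrium expectations under asymmetric gene flow.
# ASSUMPTIONS: standard island-model approximations, hermaphrodites,
# complete outcrossing, strict maternal organelle inheritance.

def nuclear_fst(ne, m_seed, m_pollen):
    """Nuclear FST: seeds carry both parental genomes, pollen carries half."""
    return 1.0 / (1.0 + 4.0 * ne * (m_seed + m_pollen / 2.0))

def organelle_fst(ne, m_seed):
    """Organelle FST: haploid genome that moves only in seeds."""
    return 1.0 / (1.0 + 2.0 * ne * m_seed)

if __name__ == "__main__":
    ne, m_pollen, m_seed = 50, 0.08, 0.0004   # values quoted for Fig 5
    print("pollen/seed migration ratio:", round(m_pollen / m_seed))
    print("expected nuclear FST:  ", round(nuclear_fst(ne, m_seed, m_pollen), 3))
    print("expected organelle FST:", round(organelle_fst(ne, m_seed), 3))
```

Under these assumptions the expected nuclear value falls inside the observed nuclear interval (0.009, 0.182) and the expected organelle value falls inside the observed chloroplast interval (0.835, 1.0), in line with the simulation results described above.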
In this example, we do not have estimates of either the outcrossing rate or the rate of maternal chloroplast transmission in C. alta. However, departures from strict maternal transmission of organelles would tend to decrease the equilibrium level of population structure in the chloroplast genome due to organelle gene flow in pollen. Self-fertilization with strict maternal inheritance would tend to increase nuclear population structure or, with some degree of biparental organelle inheritance, both nuclear and organelle population structure. Because the simulation fixation index confidence intervals are large, we cannot distinguish complete outcrossing from a small degree of selfing or strict maternal inheritance from limited paternal organelle transmission (results not shown).
A complete test of the hypothesis that C. alta nuclear and chloroplast population structure is consistent with asymmetric seed and pollen gene flow rates will require estimates of the outcrossing rate and mode of organelle transmission. However, a substantial departure from strict outcrossing and maternal transmission would be required to qualitatively modify the predicted levels of population subdivision under these asymmetric rates of gene flow, given the large confidence intervals about both the empirical estimates and the simulation expectations.
| DISCUSSION |
|---|
We have presented a general framework for deriving expectations for levels of population differentiation for nuclear and organelle genomes in the island model, taking into account the potentially complex components of gene flow in seed plants. This framework allows comparison of levels of population differentiation for different genomes and tests of the null hypothesis that rates of pollen and seed gene flow are equal. Such comparisons will help to elucidate the mechanisms that contribute to plant deme size and population structure and will be increasingly practical as joint estimates of nuclear and organelle population differentiation are collected in natural populations.
In the finite island model with limited migration, fixation indices are expected to increase toward the level predicted by the infinite island model and then, after many generations have elapsed, to decline toward zero.
We presented simulation results for a small number of total subpopulations to demonstrate that the infinite and finite island models provide very similar expectations for population differentiation. Even in the case of 5 populations sampled from 100 total subpopulations, the simulations indicate that we will be able to distinguish grossly unequal rates of pollen and seed gene flow. Our test will not be sensitive to minor differences in gene flow rates where small numbers of populations are sampled. Indeed, the stochastic nature of the drift process causes substantial random variation in allele frequencies among populations that results in widely variable estimates of the fixation index when a small number of populations are sampled. In sum, the results show that the power of empirical comparisons is largely dependent on the number of subpopulations sampled. Comparisons of nuclear and organelle fixation indices will be increasingly able to distinguish different rates of pollen and seed gene flow by sampling more populations, since this will reduce fixation index confidence intervals.
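The dependence of estimate variability on the number of sampled populations can be illustrated with a small stochastic sketch. It is not the finite island simulation used here: as a stand-in, it assumes that allele frequencies at a single biallelic locus are drawn independently from a Beta distribution whose variance corresponds to a chosen FST, and it uses a simple moment estimator, var(p)/[p̄(1 - p̄)], rather than the estimators applied to the data. The parameter values and function names are hypothetical.

```python
import random
import statistics

def sample_fst(n_pops, rng, fst=0.1, p_mean=0.5):
    """One FST estimate from n_pops allele frequencies drawn from a Beta
    distribution with mean p_mean and variance fst * p_mean * (1 - p_mean)."""
    a = p_mean * (1.0 - fst) / fst
    b = (1.0 - p_mean) * (1.0 - fst) / fst
    freqs = [rng.betavariate(a, b) for _ in range(n_pops)]
    p_bar = statistics.mean(freqs)
    # Simple moment estimator: among-population variance over p_bar(1 - p_bar).
    return statistics.pvariance(freqs) / (p_bar * (1.0 - p_bar))

def spread_of_estimates(n_pops, reps=2000, seed=1):
    """Standard deviation of FST estimates across replicate samples."""
    rng = random.Random(seed)
    estimates = [sample_fst(n_pops, rng) for _ in range(reps)]
    return statistics.stdev(estimates)

if __name__ == "__main__":
    for n_pops in (5, 20, 50):
        print(n_pops, "populations: SD of FST estimates =",
              round(spread_of_estimates(n_pops), 3))
```

In this sketch the spread of estimates shrinks as more populations are sampled, which is the pattern that limits the power of comparisons based on few populations.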
Another way to narrow confidence intervals for fixation index estimates, and thereby aid comparisons of nuclear and organelle gene flow, is to sample multiple loci. Multilocus data are easily obtainable for nuclear fixation index estimates since unlinked loci can be utilized. However, organelle genomes are single-linkage units, so that observed haplotypes constitute a single locus even if sampled from multiple regions of an organelle genome. Two-locus estimates of the fixation index for organelles may possibly be obtained in plants by sampling both chloroplast and mitochondrial genomes, if organelles have contrasting patterns of uniparental inheritance.
When comparing nuclear and organelle fixation estimates, it is important to consider what might cause departures from the null hypothesis of equal rates of pollen and seed gene flow. The genetic model that allows us to equate a fixation index with a simple function of the effective population size and migration rate assumes that equilibrium between drift and migration has nearly been reached. As a result, samples drawn from populations not at equilibrium may cause the null hypothesis to be erroneously rejected. In other words, if equilibrium is violated, it is possible that a significant departure from the null hypothesis could be observed when the rates of pollen and seed gene flow are actually equal. However, GST has been shown to reach equilibrium rapidly under assumptions similar to those here [...] and GST reached approximate equilibrium in <100 generations. Since the effective size of organelle loci is less than that of nuclear loci, we expect the approach to fixation or loss by drift to be faster for organelle loci. Therefore, organelle loci should approach equilibrium more rapidly than nuclear loci if rates of pollen and seed gene flow are approximately equal. For populations where genetic structure is decreasing, [...]
The island model assumes equal levels of gene flow among all populations, which may not be appropriate for many species. In particular, the island model does not incorporate isolation by distance, which is observed in many natural populations.
Although it is sometimes stated that populations are not expected to be near a state of drift-migration equilibrium [...]
| ACKNOWLEDGMENTS |
|---|
We thank J. Braverman for discussion and an anonymous National Science Foundation (NSF) proposal reviewer for insight into the sampling variance of Ennos's ratio estimator. The final manuscript was improved by the comments of two anonymous reviewers. This research was supported by NSF grant DEB-9983014 to M.B.H.
Manuscript received December 18, 2001; Accepted for publication September 5, 2002.
| LITERATURE CITED |
|---|
BIRKY, C. W., T. MARUYAMA, and P. FUERST, 1983 An approach to population and evolutionary genetic theory for genes in mitochondria and chloroplasts, and some results. Genetics 103:513-527.
BIRKY, C. W., P. FUERST, and T. MARUYAMA, 1989 Organelle gene diversity under migration, mutation, and drift: equilibrium expectations, approach to equilibrium, effects of heteroplasmic cells, and comparison to nuclear genes. Genetics 121:613-627.
CARON, H., S. DUMAS, G. MARQUE, C. MESSIER, and E. BANDOU et al., 2000 Spatial and temporal distribution of chloroplast DNA polymorphism in a tropical tree species. Mol. Ecol. 9:1089-1098.
CHAKRABORTY, R., and H. DANKER-HOPFE, 1991 Analysis of population structure: a comparative study of different estimators of Wright's fixation indices, pp. 203-254 in Handbook of Statistics: Statistical Methods in Biological and Medical Sciences, Vol. 8, edited by C. R. RAO and R. CHAKRABORTY. Elsevier Science, New York.
CHESSER, R. K., 1998 Heteroplasmy and organelle gene dynamics. Genetics 150:1309-1327.
CLARK, A. G., 1988 Deterministic theory of heteroplasmy. Evolution 42:621-626.
CORRIVEAU, J. L. and A. W. COLEMAN, 1988 Rapid screening method to detect potential biparental inheritance of plastid DNA and the results for over 200 angiosperm species. Am. J. Bot. 75:1443-1458.
CRAWFORD, T. J., 1984 The estimation of neighborhood parameters for plant populations. Heredity 52:273-283.
CROW, J. F., 1986 Basic Concepts in Population, Quantitative, and Evolutionary Genetics. W. H. Freeman, New York.
CROW, J. F. and K. AOKI, 1984 Group selection for a polygenic behavioral trait: estimating the degree of population subdivision. Proc. Natl. Acad. Sci. USA 81:6073-6077.




















