Genetics, Vol. 155, 833-854, June 2000, Copyright © 2000

The Effects of Pollen and Seed Migration on Nuclear-Dicytoplasmic Systems. II. A New Method for Estimating Plant Gene Flow From Joint Nuclear-Cytoplasmic Data

Maria E. Orive (a, b) and Marjorie A. Asmussen (b)
(a) Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas 66045-2106
(b) Department of Genetics, University of Georgia, Athens, Georgia 30602-7223

Corresponding author: Maria E. Orive, Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045-2106. E-mail: orive{at}ukans.edu

Communicating editor: A. G. CLARK


*  ABSTRACT

A new maximum-likelihood method is developed for estimating unidirectional pollen and seed flow in mixed-mating plant populations from counts of joint nuclear-cytoplasmic genotypes. Data may include multiple unlinked nuclear markers with a single maternally or paternally inherited cytoplasmic marker, or with two cytoplasmic markers inherited through opposite parents, as in many conifer species. Migration rate estimates are based on fitting the equilibrium genotype frequencies under continent-island models of plant gene flow to the data. Detailed analysis of their equilibrium structures indicates when each of the three nuclear-cytoplasmic systems allows gene flow estimation and shows that, in general, it is easier to estimate seed than pollen migration. Three-locus nuclear-dicytoplasmic data only increase the conditions allowing seed migration estimates; however, the additional dicytonuclear disequilibria allow more accurate estimates of both forms of gene flow. Estimates and their confidence limits for simulated data sets confirm that two-locus data with paternal cytoplasmic inheritance provide better estimates than those with maternal inheritance, while three-locus dicytonuclear data with three modes of inheritance generally provide the most reliable estimates for both types of gene flow. Similar results are obtained for hybrid zones receiving pollen and seed flow from two source populations. An estimation program is available upon request.


The juxtaposition of biparental and uniparental inheritance gives joint nuclear-cytoplasmic data special utility in decomposing plant gene flow and estimating the rates of pollen and seed migration. Moreover, plants offer two major cytoplasmic genomes, mitochondrial (mt) and chloroplast (cp) DNA, and three different combinations of uniparental inheritance patterns for this purpose. In most plant species, mtDNA and cpDNA are both inherited maternally, but in some species the two organelles are both inherited paternally or through opposite parents (KIRK and TILNEY-BASSETT 1978; BIRKY 1988; BOBLENZ et al. 1990; HARRISON and DOYLE 1990; MOGENSEN 1996). This gives us three main types of joint nuclear-cytoplasmic data with varying degrees of power for estimating plant gene flow.

The first is two-locus cytonuclear data with a biparentally inherited nuclear marker and a maternally inherited cytoplasmic marker. This is the most common, but may be the least powerful, form of nuclear-cytoplasmic data for estimating pollen flow, since pollen then carries only the nuclear marker. The second is two-locus cytonuclear data with a paternally inherited cytoplasmic marker; although less common, this type of data can be much more informative, since pollen flow will now be reflected in both the nuclear and cytoplasmic markers. The third and final class is three-locus, nuclear-mitochondrial-chloroplast data combining biparental nuclear inheritance with both forms of uniparental cytoplasmic inheritance. Such dicytonuclear data with three distinct modes of inheritance are currently available from conifer species in the family Pinaceae, which inherit their mitochondria maternally and chloroplasts paternally (NEALE et al. 1986; WAGNER et al. 1987; NEALE and SEDEROFF 1989; DONG and WAGNER 1994), and should be the most informative system of all.

The theoretical foundation for using these three types of data for the estimation of plant gene flow has been laid out in a series of continent-island migration models for mixed-mating populations. These have fully delimited the effects of unidirectional pollen and seed flow upon both standard two-locus cytonuclear systems, with a single maternally or paternally inherited cytoplasmic marker (ASMUSSEN and SCHNABEL 1991; SCHNABEL and ASMUSSEN 1992), and three-locus dicytonuclear systems with both modes of cytoplasmic inheritance (ASMUSSEN and ORIVE 2000). This extensive analytic framework now allows us to estimate rates of pollen and seed migration from any of the three main types of nuclear-cytoplasmic data in plants by fitting the expected equilibrium frequencies to the observed joint genotypic counts.

Although cytonuclear disequilibria are not necessary for estimation per se, permanent nonrandom associations nonetheless increase the chances that the equilibrium state will depend on, and thus allow estimates of, the rates of pollen and seed migration. In this regard, ASMUSSEN and SCHNABEL (1991) found that, with maternal cytoplasmic inheritance, nonzero cytonuclear disequilibria are maintained only if migrant seeds carry nonrandom cytonuclear associations; pollen dispersal has only a small effect on the disequilibria caused by seed migration. However, with paternal cytoplasmic inheritance (SCHNABEL and ASMUSSEN 1992), pollen migration significantly affects the equilibrium cytonuclear structure of the resident population through nonrandom cytonuclear associations in the migrant pollen. In addition, the presence of both types of gene flow can then generate permanent cytonuclear disequilibria via intermigrant admixture effects, such as differences in nuclear and chloroplast allele frequencies in migrant pollen and seeds.

For both two-locus systems, the factors necessary for permanent cytonuclear associations can arise in many biological situations. Disequilibria will be present in migrant pollen or seeds, for example, when suitable selection or other nonrandomizing forces act on the source population(s), as well as when gene flow is contributed by multiple, genetically distinct sources, as might be expected in hybrid zones and other areas of admixture. Similarly, allele frequency differences between the two forms of gene flow can be caused by the presence of distinct sources for migrant pollen and seeds, by multiple pollen and seed sources whose relative contributions vary with the form of gene flow, or by selection or other evolutionary forces acting during the life cycle of the source population(s).

Finally, dicytonuclear systems with opposite uniparental inheritance of the two cytoplasmic markers (ASMUSSEN and ORIVE 2000) can produce permanent cytonuclear associations via all of the pathways for the two-locus systems, as well as cytoplasmic and three-locus associations through new interactions involving all three genomes. The latter require allelic cytonuclear disequilibria for both cytoplasmic markers in migrants, or different allele frequencies in migrant pollen or seeds for one of the three markers plus nonrandom associations in migrant seeds between the other two markers. These results suggest that three-locus, nuclear-mtDNA-cpDNA data juxtaposing both forms of uniparental inheritance should be especially powerful for estimating pollen and seed migration rates, since they provide the greatest number of avenues for the accumulation of nonrandom associations.

Here we develop a formal maximum-likelihood procedure for estimating both types of plant gene flow based on this three-part theoretical framework, using the general, dicytonuclear migration model developed in a companion article (ASMUSSEN and ORIVE 2000). Unlike previous methods (PETIT et al. 1993; ENNOS 1994; HU and ENNOS 1999), this new approach allows separate estimates of seed and pollen gene flow rates rather than their ratio or the product of effective population size and migration rate, Nm. The method developed here utilizes joint genotype counts from either a two-locus cytonuclear system (with either a maternally or paternally inherited cytoplasmic marker) or a full three-locus dicytonuclear system with both modes of uniparental transmission. In each case, the maximum-likelihood estimates are the parameter values whose equilibrium genotypic frequencies best fit the data. This method thus requires that the cytonuclear or dicytonuclear structure of the population under analysis be at equilibrium and no longer changing.

We begin by deriving the general conditions under which each system can be used in this way for estimation, illustrating these through consideration of some important special cases. In addition, we test the relative utility of these three types of data for decomposing plant gene flow into pollen and seed migration via simulated data. Finally, since cytonuclear and dicytonuclear data also offer a valuable tool for studying gene flow into a hybrid zone and other areas of admixture (ASMUSSEN et al. 1989; AVISE et al. 1990; PAIGE et al. 1991; SITES et al. 1996; GOODISMAN and ASMUSSEN 1997; GOODISMAN et al. 1998), we extend our analysis to the case of a hybrid plant population receiving pollen and seeds from two source populations or species.


*  CONDITIONS FOR GENE FLOW ESTIMATION

We wish to determine the conditions under which the cytonuclear and dicytonuclear models of unidirectional pollen and seed migration developed previously (ASMUSSEN and SCHNABEL 1991; SCHNABEL and ASMUSSEN 1992; ASMUSSEN and ORIVE 2000) can be used to estimate plant gene flow. As in ASMUSSEN and ORIVE (2000), mtDNA here represents a maternally inherited and cpDNA a paternally inherited cytoplasmic marker. The underlying models assume each locus is diallelic with alleles A and a at the nuclear marker, alleles (cytotypes) M and m at the mitochondrial marker, and C and c at the chloroplast marker. There are thus 6 possible joint genotypes in each of the two-locus n-mtDNA and n-cpDNA systems (Table 1) and 12 possible joint n-mtDNA-cpDNA genotypes (Table 2). Definitions and notation for all variables (Table 3) follow those given in a companion article (ASMUSSEN and ORIVE 2000). Plain upper case letters denote variables in adults of the resident (study) population, with a caret (^) indicating an equilibrium value; the corresponding values in the two migrant pools are indicated by overbars, with upper case letters denoting values in migrant pollen and lower case letters denoting the values in migrant seeds. For instance, $P$ represents the nuclear allele frequency in the resident population, $\hat{P}$ the equilibrium nuclear allele frequency in this population, and $\bar{P}$ and $\bar{p}$ the nuclear allele frequencies in migrant pollen and seeds, respectively. The frequencies of the diallelic cytonuclear combinations carried by migrant pollen are denoted as in Table 4.


 

 
Table 1. Joint n-mtDNA and n-cpDNA genotype frequencies and decomposition in terms of disequilibria


 

 
Table 2. Joint n-mtDNA-cpDNA genotype frequencies and decomposition in terms of disequilibria


 

 
Table 3. Key frequency and disequilibrium variables in cytonuclear and dicytonuclear systems


 

 
Table 4. Joint cytonuclear frequencies in migrant pollen

All three migration models apply to mixed-mating populations with nonoverlapping generations (CLEGG 1980) in which reproduction is a combination of selfing (probability s) and of outcrossing (probability 1 - s), where 0 < s < 1, as well as to populations that reproduce solely by outcrossing (s = 0) or solely by selfing (s = 1). The study population is censused among adults, and gene flow occurs each generation via pollen and seeds following the generation cycle shown in Fig. 1. Here, the pollen migration rate M represents the fraction of outcrossed pollen derived each generation from migrants, with the remaining fraction, 1 - M, of local origin, where 0 <= M < 1; the overall fraction of migrant pollen per generation is thus the product, M(1 - s). Similarly, each generation a fraction m of the total seed pool is assumed to be derived from the source population(s) and the remaining fraction 1 - m from the resident population, where 0 <= m < 1. The migration rates and mating system, as well as the genetic composition of the source(s), are assumed constant over time; this corresponds to a continent-island model of migration with unidirectional migration (Fig. 2). In addition, we assume that the markers are unaffected by selection, mutation, or random genetic drift in the resident population.



 
Figure 1. Adult census model showing the two types of gene flow within a generation cycle. Pollen migration occurs first, at rate M, followed by seed migration at rate m. Mating is a mixture of selfing, which occurs at rate s, and outcrossing, which occurs at rate 1 - s. Here, mtDNA and cpDNA represent a maternally inherited and a paternally inherited cytoplasmic marker, respectively.



 
Figure 2. Continent-island migration model with unidirectional pollen (M) and seed (m) migration.

General considerations:
A basic consideration for any estimation procedure is that the number of parameters to be estimated not exceed the degrees of freedom (i.e., number of independent classes) in the data. This means that at most 5 parameters can be estimated from the 6 joint genotypes in two-locus cytonuclear data (Table 1), and at most 11 from the 12 joint genotypes in three-locus nuclear-mitochondrial-chloroplast (n-mtDNA-cpDNA) data (Table 2). Tests for goodness-of-fit between observed and expected frequencies will require a further degree of freedom, leaving at most 4 parameters that may be estimated from each two-locus system and 10 from the full, three-locus system. In each system, additional degrees of freedom and estimation power are possible by assaying multiple, unlinked nuclear markers.
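The counting argument above can be written out as a small sketch. The function name and its flag are illustrative choices, not part of the authors' program.

```python
def max_estimable_parameters(n_classes: int, reserve_for_fit_test: bool = True) -> int:
    """Upper bound on the number of parameters estimable from n_classes genotype classes."""
    # Class frequencies sum to 1, so only n_classes - 1 of them are independent.
    degrees_of_freedom = n_classes - 1
    if reserve_for_fit_test:
        # A goodness-of-fit test between observed and expected counts uses one more.
        degrees_of_freedom -= 1
    return degrees_of_freedom


# Two-locus data have 6 joint genotypes; three-locus data have 12.
print(max_estimable_parameters(6), max_estimable_parameters(12))
```

With the goodness-of-fit reservation switched off, the same function returns the bounds of 5 and 11 quoted in the text.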

The estimation method developed here additionally requires that the equilibrium state of the relevant cytonuclear or dicytonuclear migration model depend on the parameter(s) being estimated. This requires that the parameter(s) independently affect the equilibrium value of at least one variable out of the set of independent variables used to describe the system. Sufficient sets of equilibrium values for each of the three possible types of data are provided by the theory developed in ASMUSSEN and SCHNABEL (1991), SCHNABEL and ASMUSSEN (1992), and ASMUSSEN and ORIVE (2000). Using the definitions and notation in Table 3, a set of five independent variables for the equilibrium of a two-locus n-mtDNA system with maternal inheritance includes the final frequencies of nuclear alleles ($\hat{P}$) and nuclear heterozygotes ($\hat{V}$), the final mitochondrial frequency ($\hat{X}_M$), and the final allelic and genotypic cytonuclear disequilibria giving associations between mtDNA cytotypes and nuclear alleles ($\hat{D}_{A/M}$) and heterozygotes ($\hat{D}_{Aa/M}$). The corresponding set of variables for a two-locus n-cpDNA system with paternal inheritance includes the final frequencies of the nuclear alleles and nuclear heterozygotes ($\hat{P}$ and $\hat{V}$), the equilibrium chloroplast frequency ($\hat{X}_C$), and the final allelic ($\hat{D}_{A/C}$) and heterozygote ($\hat{D}_{Aa/C}$) cytonuclear disequilibria between the nuclear and chloroplast markers. The formulas for these are given in Appendix A as (A1–A8).

Last, for a full three-locus nuclear-dicytoplasmic system, a sufficient set of 11 variables includes the 8 variables listed above for the two cytonuclear systems, together with the final cytoplasmic disequilibrium between the two cytoplasmic markers ($\hat{D}_{M/C}$) and the joint allelic and genotypic disequilibria for the three markers, measuring associations between the joint mitochondrial-chloroplast cytotypes and the nuclear alleles ($\hat{D}_{A/MC}$) and heterozygotes ($\hat{D}_{Aa/MC}$) [(A9–A11) in Appendix A]. An alternate parameterization of the three-locus system replaces the joint disequilibria ($\hat{D}_{A/MC}$ and $\hat{D}_{Aa/MC}$) with the final three-way allelic ($\hat{D}_{A/M/C}$) and heterozygote ($\hat{D}_{Aa/M/C}$) disequilibria [(A12–A13) in Appendix A], which measure the associations among the three markers (nuclear, mitochondrial, and chloroplast) after taking into account all of the possible two-way associations between them (nuclear-mitochondrial, nuclear-chloroplast, and mitochondrial-chloroplast).

General estimation conditions:
Examination of these equilibrium formulas readily gives the general conditions under which each of the three systems can be used to estimate the rate of seed (m) and pollen (M) migration. These are given in Table 5 for the general case of mixed-mating populations. The necessary conditions include (i) nuclear polymorphism in migrant seeds ($0 < \bar{p} < 1$), (ii) unequal nuclear or chloroplast allele frequencies in migrant pollen and seeds ($\bar{P} \neq \bar{p}$ or $\bar{X}_C \neq \bar{x}_C$), and (iii) the existence of two-locus disequilibria between chloroplast and nuclear alleles in migrant pollen ($\bar{D}_{A/C} \neq 0$) or seeds ($\bar{d}_{A/C} \neq 0$), or between chloroplast and mitochondrial alleles in migrant seeds ($\bar{d}_{M/C} \neq 0$). Allele frequency differences between the two types of migrant pools (ii) could be caused by distinct sources for migrant pollen and seeds, or by selection or other evolutionary forces acting during the life cycle of the source population. Such intermigrant differences could also be caused by having unequal pollen and seed dispersal from multiple sources (e.g., source 1 may contribute a greater fraction of migrant pollen than migrant seeds, while source 2 contributes a greater fraction of migrant seeds than pollen). Migrant pollen or seeds would be expected to carry associations between chloroplast and nuclear or mitochondrial alleles (iii) when selection or other evolutionary forces act on the source population, as well as when the pollen or seeds are contributed by multiple, genetically distinct sources, as might be expected in hybrid zones and other areas of admixture.


 

 
Table 5. Conditions under which n-mtDNA, n-cpDNA, and n-mtDNA-cpDNA data can be used to estimate the rate of seed (m) and pollen (M) migration for the general mixed-mating model

The results given in Table 5 show that it is harder to estimate pollen flow than seed flow; whenever the data allow estimation of pollen migration, the rate of seed migration can also be estimated. This is not surprising, given that only two of the three markers experience movement via pollen migration, while all three move during seed migration. Moreover, an inspection of the conditions in Table 5 reveals that there are at least three situations when only seed migration can be estimated. Two of these arise when there are equal nuclear allele frequencies in the two migrant pools ($\bar{P} = \bar{p}$). In such cases, the rate of pollen flow cannot be estimated using cytonuclear data with maternal cytoplasmic inheritance (n-mtDNA), while estimation of the seed migration rate requires only that the migrant seeds be polymorphic at the nuclear marker ($0 < \bar{p} < 1$). The latter condition also allows estimates of the seed migration rate from cytonuclear data with paternal cytoplasmic inheritance (n-cpDNA) and n-mtDNA-cpDNA data with both forms of cytoplasmic inheritance in cases where none of the three systems allows estimation of the pollen migration rate (i.e., no intermigrant allele frequency differences in the nuclear or paternally inherited marker, or allelic cytonuclear disequilibria for the paternally inherited marker in migrant pollen and seeds). Finally, nonrandom associations between the two cytoplasmic markers ($\bar{d}_{M/C} \neq 0$) allow seed migration estimates from n-mtDNA-cpDNA data in the absence of all four conditions that allow pollen migration estimates.

Three-locus data allow estimation of the gene flow parameters whenever data can be used from at least one of the two-locus cytonuclear systems and also sometimes when neither of these allows estimation. However, dicytonuclear data with three modes of inheritance increase only the conditions for estimating the seed migration rate; their conditions for estimating the pollen migration rate (M) are the same as for two-locus data with paternal inheritance (n-cpDNA). Three-locus data may nevertheless increase the power to detect the two forms of gene flow and/or give more accurate estimates because the rates of pollen and seed flow enter into more of the terms that determine the final state of the dicytonuclear system via cytonuclear and dicytonuclear disequilibria in migrant seeds (Appendix A). These migrant associations are not listed explicitly in the estimation conditions given in Table 5, since they require that migrant seeds be polymorphic at the nuclear marker, which is itself sufficient for estimation of the rate of seed migration.

Estimation conditions for special cases:
To further explore the conditions under which the two gene flow rates may be estimated, we consider the important special cases of the general continent-island model of unidirectional pollen and seed migration presented in a companion article (ASMUSSEN and ORIVE 2000). These include (1) seed migration alone (0 < m < 1, M = 0), (2) pollen migration alone (0 < M < 1, m = 0), (3) complete random mating (s = 0), (4) complete self-fertilization with seed migration (s = 1, 0 < m < 1), (5) equal nuclear allele frequencies in the two migrant pools ($\bar{P} = \bar{p}$), (6) equal frequencies of the paternally inherited cytoplasmic marker in the two migrant pools ($\bar{X}_C = \bar{x}_C$), (7) equivalent migrant pools ($\bar{P} = \bar{p}$, $\bar{X}_C = \bar{x}_C$, $\bar{D}_{A/C} = \bar{d}_{A/C}$), and (8) no migrant disequilibria ($\bar{D} = \bar{d} = 0$ for all migrant pollen and seed disequilibria). The equilibrium structure in these eight simpler cases provides valuable insight into the conditions under which the equilibria for the dicytonuclear system and the two two-locus cytonuclear systems lose their dependence on, and their utility for estimating, either of the two migration rates. The details for these are given in ASMUSSEN and ORIVE (2000).

Seed migration alone (0 < m < 1, M = 0): The equilibria for all three systems (n-mtDNA, n-cpDNA, and n-mtDNA-cpDNA) depend on, and allow the estimation of, the seed migration rate when this is the sole form of gene flow, as long as the migrant seeds are polymorphic for the nuclear marker ($0 < \bar{p} < 1$). Data from the n-mtDNA-cpDNA system can be used in the additional case where migrant seeds carry nonrandom associations between the two cytoplasmic markers ($\bar{d}_{M/C} \neq 0$).

Pollen migration alone (0 < M < 1, m = 0): The absence of seed migration places significant constraints on the estimation of the pollen migration rate, M. The equilibrium for cytonuclear systems with maternal inheritance (n-mtDNA) is independent of M and thus cannot be used to estimate the rate of pollen migration if this is the only form of gene flow. The n-cpDNA and n-mtDNA-cpDNA systems with a paternally inherited cytoplasmic marker do allow estimation of the pollen migration rate, provided that the migrant pollen carry nonrandom cytonuclear allelic associations ($\bar{D}_{A/C} \neq 0$).

Complete random mating (s = 0): Complete random mating places no additional constraints on estimation for the rates of pollen and seed flow beyond those given in Table 5 for the general case of mixed-mating populations.

Complete self-fertilization with seed migration (s = 1, 0 < m < 1): The opposite extreme of complete selfing is distinctive since the lack of outcrossing means such populations are also closed to pollen flow. Both types of two-locus cytonuclear systems allow estimation of the seed migration rate (m) in purely selfing populations as long as the migrant seed pool includes either heterozygous seeds ($\bar{v} \neq 0$) or cytonuclear disequilibria for that two-locus system ($\bar{d}_{N/*} \neq 0$ for N = A, AA, Aa, or aa and * = M or C). Dicytonuclear data may also be used in such cases, as well as when migrant seeds carry joint or three-way disequilibria ($\bar{d}_{N/*} \neq 0$ for N = A, AA, Aa, or aa and * = MC or M/C).

Equal nuclear allele frequencies in migrant pollen and seeds ($\bar{P} = \bar{p}$): Although we have noted a number of situations that can produce unequal frequencies in the two forms of gene flow, in other cases they will be the same. When this holds for the nuclear marker, the residents' nuclear allele frequency reaches the common migrant value ($\hat{P} = \bar{P} = \bar{p}$). As a result, the cytonuclear system with maternal cytoplasmic inheritance (n-mtDNA) loses its dependence on the pollen migration rate and thus cannot be used to estimate this type of gene flow (ASMUSSEN and SCHNABEL 1991). The n-cpDNA system with paternal cytoplasmic inheritance and the full n-mtDNA-cpDNA system can still be used to estimate both types of gene flow under all of the other conditions given for the general case (Table 5), although their estimation power may be reduced due to the elimination of several intermigrant admixture effects that contribute to permanent disequilibria.

Equal frequencies of the paternally inherited cytoplasmic marker in the two migrant pools ($\bar{X}_C = \bar{x}_C$): If the paternally inherited marker has equal frequencies in the two migrant pools, its frequency in the resident population approaches the common migrant value ($\hat{X}_C = \bar{X}_C = \bar{x}_C$). As with equal nuclear allele frequencies, this reduces the power for estimating gene flow rates for both the n-cpDNA and n-mtDNA-cpDNA systems with a paternally inherited marker by eliminating many of the intermigrant factors that generate permanent disequilibria.

Equivalent migrant pools ($\bar{P} = \bar{p}$, $\bar{X}_C = \bar{x}_C$, $\bar{D}_{A/C} = \bar{d}_{A/C}$): If the two migrant pools are equivalent, cytonuclear data with maternal inheritance (n-mtDNA) are restricted to estimating the rate of seed migration (and then only if migrant seeds are polymorphic for the nuclear marker, as is true whenever $\bar{P} = \bar{p}$). The two- and three-locus systems with a paternally inherited cytoplasmic marker (n-cpDNA and n-mtDNA-cpDNA) still allow estimation of both types of gene flow as long as the migrant pollen and seeds carry nonrandom associations between nuclear and paternally transmitted cytoplasmic alleles ($\bar{D}_{A/C}, \bar{d}_{A/C} \neq 0$).

No migrant disequilibria ($\bar{D} = \bar{d} = 0$, for all migrant pollen and seed disequilibria): With no migrant disequilibria, it should be possible to estimate both types of gene flow from any of the three systems as long as the two migrant pools differ in their nuclear allele frequencies ($\bar{P} \neq \bar{p}$, see special case 5 above). Further, the cytonuclear system with paternal transmission (n-cpDNA) and the full three-locus system (n-mtDNA-cpDNA) can also provide both estimates if migrant pollen and seeds have distinct frequencies of the paternally inherited marker ($\bar{X}_C \neq \bar{x}_C$).


*  ESTIMATING GENE FLOW

Here we present a new method to estimate rates of unidirectional pollen and seed migration using joint cytonuclear or dicytonuclear frequencies. We focus specifically on the case of uniparentally inherited cytoplasmic markers where, in the dicytonuclear case, these are inherited through opposite parents. This method uses data in the form of counts in adults of joint n-mtDNA, n-cpDNA, or n-mtDNA-cpDNA genotypes, where mtDNA here represents a maternally inherited marker and cpDNA a paternally inherited marker. We first outline the general approach for the simplest cases with a single diallelic marker from each genome and then indicate two initial extensions to data involving multiple unlinked nuclear markers and/or multiallelic markers. The utility of this method is illustrated with simulated data, which also allows us to compare estimates from dicytonuclear data with opposite cytoplasmic inheritance with estimates obtained from two-locus cytonuclear data with a single maternally or paternally inherited cytoplasmic marker. A program implementing this estimation procedure is available from the authors upon request.

Maximum-likelihood estimation:
We use maximum likelihood (EDWARDS 1992 Down) in conjunction with the general continent-island model of pollen and seed dispersal (Fig 1 and Fig 2) developed in ASMUSSEN and ORIVE 2000 Down to find the rates of seed and pollen migration whose equilibrium cytonuclear or dicytonuclear genotype frequencies best fit the observed genotypic counts. The equilibrium frequencies of the two-locus cytonuclear genotypes (n-mtDNA and n-cpDNA, Table 1) or the three-locus genotypes (n-mtDNA-cpDNA, Table 2) are specified in terms of the migration and selfing rates (M, m, s) and the known migrant seed genotype frequencies (e.g., 11, ... , 22) and migrant pollen frequencies (e.g., 1C, ... , 2C, Table 4), using the equilibrium formulas in (A1–A13) for the general case in conjunction with the decompositions shown in Table 1 and Table 2. We then jointly estimate the three parameters M, m, and s by finding their values that maximize the log-likelihood function for the observed joint genotypic counts, using the "simulated annealing" minimization routine amebsa from Numerical Recipes in C, Ed. 2 (PRESS et al. 1992 Down).

The full log-likelihood function for dicytonuclear data is given by

$$\ln L = N_{AA/M/C}\,\ln \hat{U}_{11} + \cdots \qquad (1)$$

(ignoring the constant multinomial coefficient), with one term of this form for each of the 12 joint genotypes in Table 2, where, for example, $N_{AA/M/C}$ gives the observed count of adults with the AA/M/C joint genotype, and $\hat{U}_{11}$ gives the equilibrium frequency of that joint genotype for the specified parameter values under our model. If the migrant frequencies from the source population are unknown, the program may be modified to estimate them simultaneously. In this case, the migrant frequencies become parameters of the likelihood function, in addition to the migration and selfing rate parameters M, m, and s. The 95% confidence limits are found by bootstrapping the data set a user-specified number of times and dropping the lowest 2.5% and highest 2.5% of each estimate. The program gives the lower and upper bounds on the confidence interval as well as the length of the interval, obtained by subtracting the lower from the upper bound.
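The pieces described here (a multinomial log-likelihood without its constant coefficient, resampling of the observed sample, and percentile confidence limits) can be sketched as follows. This is not the authors' program: the function names, the made-up counts, and the stand-in estimator in the usage lines are all assumptions. A real analysis would re-fit M, m, and s to each bootstrap sample using the equilibrium formulas of Appendix A.

```python
import math
import random


def log_likelihood(counts, freqs):
    """Multinomial log-likelihood with the constant coefficient dropped."""
    total = 0.0
    for n, f in zip(counts, freqs):
        if n == 0:
            continue              # empty classes contribute nothing
        if f <= 0.0:
            return float("-inf")  # an observed class that the model rules out
        total += n * math.log(f)
    return total


def resample_counts(counts, rng):
    """Draw sum(counts) individuals with replacement from the observed sample."""
    draws = rng.choices(range(len(counts)), weights=counts, k=sum(counts))
    resampled = [0] * len(counts)
    for g in draws:
        resampled[g] += 1
    return resampled


def percentile_interval(estimates, level=0.95):
    """Trim the lowest and highest (1 - level) / 2 of the bootstrap estimates."""
    ordered = sorted(estimates)
    cut = int(len(ordered) * (1.0 - level) / 2.0 + 1e-9)
    kept = ordered[cut:len(ordered) - cut]
    lower, upper = kept[0], kept[-1]
    return lower, upper, upper - lower  # bounds and interval length


# Usage with made-up counts; the "estimate" is just the first class frequency.
rng = random.Random(1)
observed = [30, 25, 20, 25]
boot = [resample_counts(observed, rng)[0] / sum(observed) for _ in range(200)]
print(percentile_interval(boot))
```

With 200 bootstrap replicates, trimming 2.5% from each tail discards the 5 smallest and 5 largest estimates.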

Multiple nuclear markers:
An extension for estimating rates of seed and pollen migration from data with multiple unlinked nuclear markers is straightforward. In this case, the overall likelihood is calculated as a product of the likelihoods for each of the nuclear markers considered separately (ASMUSSEN et al. 1989 Down); the overall log likelihood is thus a sum of n factors like (1), one for each of the n unlinked nuclear markers. For example, if Nij,k,l;x gives the observed count for a particular joint genotype involving nuclear marker x (where the nuclear alleles are now denoted by numerical indices i,j, the mitochondrial cytotype by the index k, and the chloroplast cytotype by the index l) and ij,k,l;x gives the expected frequency of that joint genotype under our model, the log-likelihood function is given by

$$\ln L = \sum_{x=1}^{n} \sum_{i,j,k,l} N_{ij,k,l;x} \ln \hat{U}_{ij,k,l;x} \qquad (2)$$

where the outer sum is over each nuclear marker, x = 1, ... , n. Estimates for the parameters are obtained by maximizing this composite function via the optimization routine described above.
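A minimal sketch of the composite log-likelihood, assuming the counts and model frequencies for each marker are supplied as parallel lists (the names and the toy numbers are illustrative only):

```python
import math


def marker_log_likelihood(counts, freqs):
    """Log-likelihood contribution of one nuclear marker (constant term dropped)."""
    total = 0.0
    for n, f in zip(counts, freqs):
        if n > 0:
            if f <= 0.0:
                return float("-inf")
            total += n * math.log(f)
    return total


def composite_log_likelihood(counts_by_marker, freqs_by_marker):
    """Sum the per-marker log-likelihoods over all unlinked nuclear markers."""
    return sum(
        marker_log_likelihood(c, f)
        for c, f in zip(counts_by_marker, freqs_by_marker)
    )


# Two hypothetical markers, each with two genotype classes for brevity.
counts_by_marker = [[2, 1], [1, 3]]
freqs_by_marker = [[0.5, 0.5], [0.25, 0.75]]
composite = composite_log_likelihood(counts_by_marker, freqs_by_marker)
print(composite)
```

Because the markers are treated as independent, the sum of logs equals the log of the product of the per-marker likelihoods; the same parameter values (M, m, s) would generate the expected frequencies for every marker.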

Consideration of linked nuclear markers requires simultaneous consideration of the various recombination frequencies (BARTON and TURELLI 1991; TURELLI and BARTON 1994) and is beyond the scope of this initial treatment.

Multiallelic data:
An extension to multiple alleles greatly increases the number and complexity of associations considered by this model. As a first step to addressing multiallelic data, such data may be converted to diallelic form by grouping one allele vs. all others at a locus. This approach was chosen for the estimation program developed here, because it is the simplest and allows the user to specify the grouping for the locus, rather than arbitrarily averaging over all possible groupings, which could be difficult to interpret. Further theory must be developed to allow a more comprehensive multiallelic analysis.
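The grouping of one allele vs. all others can be sketched as follows. The allele labels, the data layout (a dictionary from unordered allele pairs to counts), and the function names are assumptions for illustration and do not describe the input format of the estimation program.

```python
from collections import Counter


def collapse_genotype(genotype, focal_allele):
    """Recode a diploid genotype as focal allele 'A' vs. all other alleles 'a'."""
    return "".join(sorted("A" if a == focal_allele else "a" for a in genotype))


def collapse_counts(multiallelic_counts, focal_allele):
    """Pool multiallelic genotype counts into the three diallelic classes."""
    out = Counter()
    for genotype, n in multiallelic_counts.items():
        out[collapse_genotype(genotype, focal_allele)] += n
    return dict(out)


# Made-up counts at a three-allele nuclear locus.
example = {("1", "1"): 10, ("1", "2"): 7, ("2", "3"): 5, ("1", "3"): 3, ("3", "3"): 2}
diallelic = collapse_counts(example, "1")
print(diallelic)
```

Choosing a different focal allele gives a different diallelic data set, which is why the grouping is left to the user rather than averaged over.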

Results from simulated data:
We tested this method with simulated diallelic data sets containing a single nuclear marker; counts of the 12 possible joint three-locus genotypes (NAA/M/C ... , Naa/m/c) in Table 2 were generated as random samples from the equilibrium genotypic distributions under specified migration and selfing rates and specified migrant pollen and seed frequencies. Due to the many possible combinations of parameter values, the examples chosen are not meant to be exhaustive, but instead illustrate some of the factors that affect gene flow estimation using this method. To assess the relative utility of the three possible types of data, we also extracted counts for the two-locus genotypes in each two-locus cytonuclear system. For each run, all three data sets were bootstrapped 200 times to construct 95% confidence intervals for each of the three estimated parameters (M, m, and s).
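The sampling and extraction steps can be sketched as below. The genotype frequencies used here are an arbitrary uniform placeholder rather than equilibrium values from the model, and the function names are hypothetical.

```python
import random
from collections import Counter
from itertools import product

# The 12 joint three-locus genotypes: nuclear x mitochondrial x chloroplast.
GENOTYPES = list(product(["AA", "Aa", "aa"], ["M", "m"], ["C", "c"]))


def simulate_counts(freqs, n, seed=0):
    """Draw n individuals at random from the given joint genotype frequencies."""
    rng = random.Random(seed)
    draws = rng.choices(GENOTYPES, weights=[freqs[g] for g in GENOTYPES], k=n)
    tally = Counter(draws)
    return {g: tally.get(g, 0) for g in GENOTYPES}


def two_locus_counts(counts, cytoplasm):
    """Collapse three-locus counts to n-mtDNA or n-cpDNA counts."""
    idx = 1 if cytoplasm == "mtDNA" else 2
    out = Counter()
    for g, n in counts.items():
        out[(g[0], g[idx])] += n
    return dict(out)


freqs = {g: 1.0 / 12 for g in GENOTYPES}  # placeholder, not model equilibria
three_locus = simulate_counts(freqs, 300)
n_mt = two_locus_counts(three_locus, "mtDNA")
n_cp = two_locus_counts(three_locus, "cpDNA")
print(n_mt, n_cp)
```

Extracting both two-locus data sets from the same three-locus sample keeps the three systems comparable, since they differ only in which marker is ignored.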

We consider two different populations (designated A and B) whose migrant compositions differ in the conditions that allow estimation of the two rates of gene flow (Table 5). For population A, the nuclear allele frequency is equal in migrant pollen and seeds (0.7 in both), while the frequency of the paternally inherited marker differs in the two migrant pools (1.0 in pollen, 0.7 in seeds), allowing us to examine the effect of having only one type of intermigrant frequency difference on estimating the two migration rates. The resident population receives migrant seeds of two types, AA/M/C and aa/m/c (11 = 0.7, 22 = 0.3), which produces cytoplasmic as well as all possible allelic and homozygote disequilibria in the migrant seeds (M/C = A/* = AA/* = -aa/* = 0.21 and Aa/* = Aa/M/C = 0, where * indicates M, C, or MC; A/M/C = AA/M/C = -aa/M/C = -0.084). The migrant pollen are also of two types, A/C and a/C (1C = 0.7, 1C = 0.3, A/C = 0), but carry no nonrandom associations. Population B receives the same types of migrant seeds as population A and thus has the same disequilibria in the migrant seed pool. The migrant pollen also still carry no allelic cytonuclear disequilibrium, but are now monomorphic for both the nuclear and paternally transmitted cytoplasmic markers (1C = 1.0). As a result, population B has intermigrant frequency differences for both these markers ( = 0, = 0.7, C = 1.0, C = 0.7), allowing more ways to generate permanent disequilibria and estimate gene flow than population A.

These two situations, with disequilibria in migrant seeds but not in migrant pollen, could arise if a population receives migrant seeds from multiple sources, but migrant pollen from only one source, distinct from the sources for migrant seeds. For instance, a central population might receive animal-dispersed seeds from two nearby populations but only receive wind-dispersed pollen in one prevailing direction from a more distant population. The gene flow estimates in the numerical examples below would then represent composite estimates of the overall rates of pollen and seed flow from all sources. The hybrid zone model in the subsequent section outlines how to estimate the separate contributions from each source.

For each migrant composition, we examined various combinations of the seed migration (m = 0, 0.05, 0.1), pollen migration (M = 0, 0.05, 0.1, 0.2), and selfing (s = 0.1, 0.5, 0.9) rates. We estimated these parameters from three simulated data sets with representative sample sizes of N = 100, 300, and 500. The results for m and M are given in TABLE ACC, where the symbol <z> indicates the estimated value for each parameter z. The estimates are normalized relative to their deviation from the true value,

When the parameter value is zero, the estimate itself is given instead. Also presented in TABLE ACC are the lengths of the 95% confidence intervals for the estimates of m (Im) and M (IM), each of which has a maximum possible length of 1.0.
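The displayed formula for the normalized deviation is not reproduced in this copy. The sketch below assumes it is the absolute deviation of the estimate relative to the true value, which agrees with the reported case in which an estimate of 0 for a true rate of 0.01 gives a normalized value of 1.0. The function name is hypothetical.

```python
def normalized_deviation(estimate, true_value):
    """Assumed form: |estimate - true| / true, or the raw estimate if true is zero."""
    if true_value == 0:
        return estimate
    return abs(estimate - true_value) / true_value


print(normalized_deviation(0.0, 0.01))   # estimate of zero for a true rate of 0.01
print(normalized_deviation(0.03, 0.0))   # true rate zero: the estimate itself
```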

When only the frequency of the paternally inherited cytoplasmic marker differs between the two migrant pools (population A, Table C1), the n-mtDNA data, with a maternally inherited cytoplasmic marker, cannot estimate the pollen migration rate (Table 5) because the equilibrium state for this cytonuclear system is then independent of the pollen migration rate (ASMUSSEN and SCHNABEL 1991). In such cases, however, the n-cpDNA data set, with a paternally inherited cytoplasmic marker, and the n-mtDNA-cpDNA data set, with both forms of cytoplasmic inheritance, generally give good estimates of both seed and pollen migration rates, considering accuracy together with the size of the confidence interval (Δ(m) < 0.5 for 83.3% of estimates, Im < 0.23 for all estimates; Δ(M) < 0.5 for 75% of estimates, IM < 0.4 for 71% of estimates).

All three systems consistently give good estimates when the allele frequencies of both the nuclear and the paternally inherited cytoplasmic marker differ in the migrant pools (population B, Table C2), so that permanent disequilibria are generated by intermigrant admixture effects. Such differences in allele frequencies would be expected, for example, if multiple, genetically distinct sources each contributed differentially to the migrant pollen and seed pools. When there is no seed migration (m = 0), none of the three systems can estimate the pollen migration rate in our population B examples because the absence of allelic cytonuclear disequilibrium in migrant pollen (A/C = 0) makes the equilibrium state independent of the pollen migration rate (see the special case m = 0 above). In other cases with no seed migration, where migrant pollen carry cytonuclear allelic disequilibrium (A/C != 0), estimation of the pollen migration rate is possible from two- and three-locus cytonuclear data containing a paternally inherited marker (n-cpDNA and n-mtDNA-cpDNA). This, however, requires knowledge of the initial frequency for the maternally inherited marker, X(0)M, which is now the marker's expected frequency since its value is not affected by pollen migration alone (Appendix A); if necessary, this may be treated as an unknown parameter and jointly estimated along with M and s.

In general, increased sample sizes appear to improve the accuracy of the estimates and decrease the size of the confidence intervals for both migrant compositions (A and B), although there is a great deal of variability among different simulated data sets. High selfing rates (s = 0.9) generally worsen estimates of pollen migration, although again there is a great deal of variation across runs. These poor estimates may arise because high selfing reduces the total fraction of migrant pollen, M(1 - s), so that such pollen contribute less to the genetic composition of the population at equilibrium. The equilibrium genotype frequencies are thus less informative about the pollen migration rate under high selfing.
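To make the size of this effect concrete, the short sketch below evaluates M(1 - s) at the three selfing rates used in the simulations, taking M = 0.1 as an example value; the function name is hypothetical.

```python
def migrant_pollen_fraction(M, s):
    """Total fraction of pollen that is migrant, M(1 - s)."""
    return M * (1.0 - s)


for s in (0.1, 0.5, 0.9):
    print(s, migrant_pollen_fraction(0.1, s))
```

At s = 0.9 the migrant pollen fraction is one-ninth of its value at s = 0.1, which leaves correspondingly little signal for estimating M.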

Fig 3 and Fig 4 give results from two particular test runs for these same migrant compositions (A and B). In each case, the estimates of the seed migration rate <m> and the pollen migration rate <M> for the three systems (n-mtDNA, n-cpDNA, and n-mtDNA-cpDNA) are given for the three sample sizes: N = 100, 300, and 500 (estimates of selfing rate not shown). The dashed lines give the actual values of the migration rates, the solid boxes give the estimates, and the open boxes indicate the upper and lower bounds for the 95% confidence limits.



Figure 3. Estimates of seed migration rate (m) and pollen migration rate (M), where the nuclear allele frequencies are equal in the migrant pollen and seed pools, the frequencies of the paternally inherited cytoplasmic marker differ, and s = 0.1. Values for migrant pools were 11 = 0.7, 22 = 0.3, 1C = 0.7, and 1C = 0.3. Disequilibria in the migrant pools were M/C = A/* = AA/* = -aa/* = 0.21, Aa/* = Aa/M/C = 0 (where * indicates M, C, or MC), A/M/C = AA/M/C = -aa/M/C = -0.084, and A/C = 0. Solid boxes give estimates while open boxes give the 95% confidence limits. The cytonuclear system with maternal inheritance (n-mtDNA) cannot estimate M for this case; details are given in text.



Figure 4. Estimates of seed migration rate (m) and pollen migration rate (M), where the frequencies of both the nuclear and the paternally inherited cytoplasmic markers differ between the migrant pollen and seed pools, and s = 0.5. Values for migrant pools were 11 = 0.7, 22 = 0.3, and 1C = 1.0. Migrant disequilibria are as in Fig 3. Solid boxes give estimates while open boxes give the 95% confidence limits.

Fig 3 gives an example from population A (equal nuclear allele frequencies, but unequal frequencies of the paternally inherited cytoplasmic marker, in the two migrant pools) with a selfing rate of s = 0.1. Estimates of seed migration rates are close to the true value and confidence intervals are small for all three systems (Im <= 0.154), with the exception of n-cpDNA for N = 100 (Im = 0.227). For this example, the n-mtDNA system could not estimate the pollen migration rate because, as noted above for the special case of equal nuclear allele frequencies in the migrant pools, the equilibrium state for cytonuclear systems with a maternally inherited cytoplasmic marker does not then depend on the pollen migration rate. The n-mtDNA-cpDNA system gave smaller confidence intervals for all of the estimates.

An example from population B (unequal frequencies of both the nuclear and the paternally inherited cytoplasmic markers in the two migrant pools) where the selfing rate is s = 0.5 is given in Fig 4. All three systems give good estimates for the seed migration rate (Δ(m) <= 0.335), and both n-mtDNA and n-mtDNA-cpDNA give consistently small confidence intervals for this estimate (0.035 <= Im <= 0.125). For the pollen migration rate, only the n-mtDNA system with maternal cytoplasmic inheritance and the n-mtDNA-cpDNA system with three forms of inheritance gave good results for N = 100 and 500 (Δ(M) <= 0.345), with the n-mtDNA-cpDNA system giving slightly smaller confidence intervals for all of the estimates. All three systems overestimated the pollen migration rate for the run at N = 300.


*  HYBRID ZONE MODEL

We next consider explicitly the estimation of unidirectional migration rates from multiple sources, such as would be found in a hybrid zone or other areas of admixture. Here, we estimate the migration rates for pollen and seed from each source separately, in contrast to the previous framework that allowed only composite estimates for the total amount of each form of gene flow. The migration model for this application is depicted in Fig 5. We assume that each generation a fixed fraction M1 of outcrossed pollen in the hybrid population is derived from source population 1 (species 1), and a fraction M2 is derived from source population 2 (species 2), where both sources have constant genetic compositions. Similarly, a fixed fraction m1 of the seeds migrate from source population 1 and a fixed fraction m2 from source population 2. The total pollen and seed migration rates are then M = M1 + M2 and m = m1 + m2, respectively, with the remaining fractions contributed by the resident population. The general model described in ASMUSSEN and ORIVE (2000) applies here with the migration rates (M, m) now given by these decompositions.



Figure 5. Migration model with pollen (M1, M2) and seed (m1, m2) migration from two populations into a hybrid zone.

Composition of total migrant pools:
Frequencies in each total migrant pool are now the weighted averages of the corresponding frequencies in the two source populations. For example, the nuclear allele frequency in migrant pollen will be

where (i) is the nuclear allele frequency in migrant pollen from source population i. The disequilibria in the total migrant pools will be the result of admixture between the contributions of the two sources and can be calculated using (15–18) of the companion article (ASMUSSEN and ORIVE 2000). For instance, the cytoplasmic disequilibrium in the total migrant seed pool is

where, once again, the superscript indicates the source population. We can use this new model to jointly estimate the four migration rates (M1, M2, m1, m2) in hybrid zones in much the same way we used the original model to estimate the two (composite) migration rates (M, m).
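The two displayed expressions in this subsection are not reproduced in this copy. As a hedged sketch of the calculation they describe, the code below forms a pooled frequency as the migration-rate-weighted average over sources, and a pooled two-marker disequilibrium from the standard admixture identity (within-source disequilibria plus the covariance of marker frequencies among sources). That this identity matches (15–18) of the companion article term by term is an assumption, and all names are hypothetical.

```python
def pooled_frequency(rates, freqs):
    """Weighted average of source frequencies, weighted by migration rate."""
    total = sum(rates)
    if total <= 0:
        raise ValueError("at least one source must contribute migrants")
    return sum(r * f for r, f in zip(rates, freqs)) / total


def pooled_disequilibrium(rates, freqs_x, freqs_y, diseq):
    """Two-marker disequilibrium in the admixed pool.

    Within source i the joint frequency is x_i * y_i + D_i; the pooled
    disequilibrium is the pooled joint frequency minus the product of the
    pooled marginal frequencies.
    """
    total = sum(rates)
    if total <= 0:
        raise ValueError("at least one source must contribute migrants")
    weights = [r / total for r in rates]
    x_bar = sum(w * x for w, x in zip(weights, freqs_x))
    y_bar = sum(w * y for w, y in zip(weights, freqs_y))
    joint = sum(w * (x * y + d) for w, x, y, d in zip(weights, freqs_x, freqs_y, diseq))
    return joint - x_bar * y_bar


# Seeds: 70% from a source fixed for M and C, 30% from a source fixed for m and c.
d_mc = pooled_disequilibrium((0.07, 0.03), (1.0, 0.0), (1.0, 0.0), (0.0, 0.0))
p_pollen = pooled_frequency((0.1, 0.2), (1.0, 0.0))
print(d_mc, p_pollen)
```

With a 0.7 : 0.3 mix of two sources fixed for alternative cytotypes, the pooled cytoplasmic disequilibrium comes to 0.21, the value quoted earlier for the migrant seed pools of populations A and B.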

For our numerical test of the estimation procedure for hybrid zones, we focus on the case where the two source populations show fixed differences at all three loci. This corresponds to having diagnostic nuclear, mitochondrial, and chloroplast markers for the two source populations, as might be found in a hybrid zone where two genetically diverged taxa come into contact. Such a situation is presumably the optimal case for estimation; the utility in other cases can be determined via simulations in the same way as we have done here. We assume that source population 1 is fixed for AA/M/C and source population 2 is fixed for aa/m/c. The migration rates from the two sources then uniquely determine the frequencies in migrant seeds,

(3)

and in migrant pollen (Table 4),

(4)

as well as their disequilibria,

(5)


(6)


(7)


(8)

where * indicates M, C, or MC. These expressions show that the sign of the three-way disequilibria can serve as a useful indicator of asymmetry in seed migration from two genetically distinct sources, since (6) will be positive if m2 > m1, negative if m1 > m2, and zero only when m1 = m2.
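Equations (3)–(8) are not reproduced in this copy, so the sketch below is derived only from the stated assumptions (source 1 fixed for AA/M/C, source 2 fixed for aa/m/c) and is not a transcription of them. Under those assumptions every marker in a migrant seed is determined by its source, so the two-way disequilibria equal the variance x(1 - x) of the source indicator and the three-way disequilibria equal its third central moment x(1 - x)(1 - 2x), where x = m1/(m1 + m2). The dictionary keys and function names are hypothetical.

```python
def _share(a, b):
    """Fraction of a migrant pool contributed by source 1."""
    if a + b <= 0:
        raise ValueError("migrant pool is empty")
    return a / (a + b)


def diagnostic_migrant_pools(m1, m2, M1, M2):
    """Migrant pool composition when the two sources carry diagnostic markers."""
    x = _share(m1, m2)  # migrant seeds from source 1, all AA/M/C
    y = _share(M1, M2)  # migrant pollen from source 1, all A/C
    return {
        "seed_AA/M/C": x,
        "seed_aa/m/c": 1.0 - x,
        "pollen_A/C": y,
        "pollen_a/c": 1.0 - y,
        "seed_two_way": x * (1.0 - x),                      # M/C, A/*, AA/*, and -aa/*
        "seed_three_way": x * (1.0 - x) * (1.0 - 2.0 * x),  # A/M/C, AA/M/C, and -aa/M/C
        "seed_heterozygote": 0.0,                           # Aa/* and Aa/M/C
        "pollen_two_way": y * (1.0 - y),                    # A/C in migrant pollen
    }


pools = diagnostic_migrant_pools(0.07, 0.03, 0.1, 0.2)
print(pools["seed_two_way"], pools["seed_three_way"])
```

A 0.7 : 0.3 seed mix gives two-way disequilibria of 0.21 and three-way disequilibria of -0.084, matching the values quoted for the migrant seed pools of populations A and B, and the three-way term changes sign with m2 - m1 as stated in the text.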

Results from simulated data:
Several different combinations of migration and selfing rates (M1, M2, m1, m2, s) were tested, selected from the same range used in the general case. The numerical results are given in TABLE ADD. As for the original model, due to the many possible combinations of parameter values, the examples given in TABLE ADD are not meant to be exhaustive but are illustrative of some of the factors that impact this estimation method. The cytonuclear system with a maternally inherited cytoplasmic marker (n-mtDNA) consistently had trouble estimating pollen migration rates, generally giving the largest confidence intervals for M1 and M2 and often having very poor estimates as well. With equal rates of pollen and seed migration (m1 = m2 = M1 = M2 = 0.1), the cytonuclear system with a paternally inherited cytoplasmic marker (n-cpDNA) was also often poor at estimating the pollen migration rate, and, for low or intermediate selfing rates, had larger confidence intervals for estimates of seed migration rates than either the n-mtDNA or n-mtDNA-cpDNA systems. Note that, for this case, the migrant pools are equivalent, with equal allele frequencies for the nuclear and paternally inherited cytoplasmic markers (0.5 for each marker in both migrant pollen and seeds) so that there are no simple intermigrant effects.

In the examples where only one source population contributes migrant pollen (M1 = 0.0), the n-cpDNA system with paternal cytoplasmic inheritance tended to estimate the seed migration rate from the other source population as zero instead of the true value of 0.01 (<m2> = 0.0, Δ(m2) = 1.0). This may be because, with diagnostic markers and only one source of migrant pollen, the migrant pollen pool is fixed at both the nuclear and paternally inherited loci (the frequencies of A and C in migrant pollen are both 0) and has no allelic cytonuclear disequilibrium (A/C = 0). In contrast, the three-locus system with both modes of cytoplasmic inheritance was usually quite successful at estimating all four migration parameters, and, in cases where one of the two-locus systems failed to estimate migration rates (Δ(z) > 0.5), the n-mtDNA-cpDNA system generally succeeded.

With smaller sample sizes (N = 100), there were occasionally runs where none of the three systems could estimate a parameter. For example, for m1 = 0, m2 = 0.05, M1 = 0.1, M2 = 0, s = 0.9, and N = 100, by chance the simulated data set did not include any heterozygous individuals (NAa/m/C = NAa/m/c = 0) although the equilibrium heterozygote frequency was 0.0388 for that parameter set. All three systems estimated the seed migration rate to be zero in this case (<m2> = 0.0, Δ(m2) = 1.0). This is indicative of the problem that all iterative maximum-likelihood methods encounter when there are missing observations, as can occur with small sample sizes (WEIR 1996). Paralleling the original model, there was a tendency for high selfing rates to decrease the accuracy of estimates for pollen migration rates.
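How often such an empty class should arise can be gauged with a one-line calculation. The sketch assumes that 0.0388 is the total equilibrium frequency of heterozygotes and that the N individuals are sampled independently; the function name is hypothetical.

```python
def prob_class_absent(freq, n):
    """Probability that a class with the given frequency is missing from n independent draws."""
    return (1.0 - freq) ** n


p_no_heterozygotes = prob_class_absent(0.0388, 100)
print(p_no_heterozygotes)
```

Under these assumptions roughly 2% of samples of size 100 would contain no heterozygotes at all, so an occasional run of this kind is expected among the many simulated data sets; the same calculation with n = 500 gives a far smaller probability.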

Fig 6, Fig 7, Fig 8, and Fig 9 give examples from two particular test runs. As before, the estimates of the seed migration rates (Fig 6 and Fig 8, <m1> and <m2>) and the pollen migration rates (Fig 7 and Fig 9, <M1> and <M2>) for the three systems (n-mtDNA, n-cpDNA, and n-mtDNA-cpDNA) are given for the three sample sizes: N = 100, 300, and 500. Again, the dashed lines give the actual values for each of the migration rates, the solid boxes give the estimates, and the open boxes indicate the upper and lower bounds for the 95% confidence limits.



Figure 6. Estimates of the two seed migration rates in a hybrid zone when the two seed and two pollen migration rates differ (m1 = 0.05, m2 = 0.01, M1 = 0.1, M2 = 0.2), where the frequencies of both the nuclear and the paternally inherited cytoplasmic markers differ between the migrant pollen and seed pools, and s = 0.5. Each source population was fixed for diagnostic markers, so the migration rates determined the migrant composition; see Equations 3–8 in text. Solid boxes give estimates while open boxes give the 95% confidence limits.



Figure 7. Estimates of the two pollen migration rates in a hybrid zone when the two seed and two pollen migration rates differ (m1 = 0.05, m2 = 0.01, M1 = 0.1, M2 = 0.2), where the frequencies of both the nuclear and the paternally inherited cytoplasmic markers differ between the migrant pollen and seed pools, and s = 0.5, as in Fig 6. Each source population was fixed for diagnostic markers, so the migration rates determined the migrant composition; see Equations 3–8 in text. Solid boxes give estimates while open boxes give the 95% confidence limits. Note the difference in scale between this figure and Fig 6, which gives estimates for the two seed migration rates.



Figure 8. Estimates of the two seed migration rates in a hybrid zone when the two seed and two pollen migration rates are the same (m1 = 0.1, m2 = 0.1, M1 = 0.1, M2 = 0.1), where the frequencies of the nuclear and the paternally inherited cytoplasmic markers are each equal in the migrant pollen and seed pools, and s = 0.1. Each source population was fixed for diagnostic markers, so the migration rates determined the migrant composition; see Equations 3–8 in text. Solid boxes give estimates while open boxes give the 95% confidence limits.



Figure 9. Estimates of the two pollen migration rates in a hybrid zone when the two seed and two pollen migration rates are the same (m1 = 0.1, m2 = 0.1, M1 = 0.1, M2 = 0.1), where the frequencies of the nuclear and the paternally inherited cytoplasmic markers are each equal in the migrant pollen and seed pools, and s = 0.1, as in Fig 8. Each source population was fixed for diagnostic markers, so the migration rates determined the migrant composition; see Equations 3–8 in text. Solid boxes give estimates while open boxes give the 95% confidence limits. Note the difference in scale between this figure and Fig 8, which gives estimates for the two seed migration rates.

In the example given by Fig 6 and Fig 7, all four migration rates differed (m1 = 0.05, m2 = 0.01, M1 = 0.1, M2 = 0.2) and the selfing rate was s = 0.5. Increasing sample size generally decreased the size of the confidence intervals for all three systems, except for the pollen migration rates for N = 500. The three-locus system (n-mtDNA-cpDNA) gave consistently smaller confidence intervals for pollen migration rates than the other two systems. The n-cpDNA system with paternal inheritance gave very poor estimates for the smaller of the seed migration rates (Fig 6, m2), while the n-mtDNA system with maternal inheritance always performed poorly in estimating pollen migration (Fig 7). In contrast, the n-mtDNA-cpDNA system gave reasonable estimates for all four migration rates.

For the example shown in Fig 8 and Fig 9, all four migration rates were the same (m1 = m2 = M1 = M2 = 0.1) and the selfing rate was s = 0.1. Again, increasing sample size decreased the confidence intervals for most of the estimates, although the effect was not as great as in the previous example. The cytonuclear system with paternal inheritance (n-cpDNA) gave large confidence intervals and often poor estimates for all four migration rates. The cytonuclear system with maternal inheritance (n-mtDNA) again gave very poor estimates for the pollen migration rates (Fig 9). All three systems underestimated the seed migration rates and overestimated the pollen migration rates for the N = 300 run, indicating that a particular data set may poorly reflect the "true" genotypic frequencies for a population, leading to inaccurate estimates. Once again, the n-mtDNA-cpDNA system performed best overall.


*  DISCUSSION

The juxtaposition of biparental and uniparental inheritance in the same individual makes joint cytonuclear data particularly useful for decomposing plant gene flow and estimating the pollen (haploid) and seed (diploid) components. Here we have used previously developed continent-island models of unidirectional migration (ASMUSSEN and SCHNABEL 1991; SCHNABEL and ASMUSSEN 1992; ASMUSSEN and ORIVE 2000) to develop a new method for estimating both forms of plant gene flow in mixed-mating populations. A program implementing this approach is available from the authors on request. This procedure can utilize joint two-locus cytonuclear data, with either maternal or paternal cytoplasmic inheritance, as well as joint nuclear-mitochondrial-chloroplast data with both forms of cytoplasmic inheritance, for any number of unlinked nuclear markers. The underlying theory is based on diallelic loci, with multiallelic data accommodated in this initial implementation by grouping one user-specified allele vs. all others. Three-locus data from species where both organelles are transmitted through the same parent have not been considered since these are equivalent to a two-locus cytonuclear system with a multiallelic cytoplasmic marker. Such three-locus systems have estimation benefits over data from either two-locus cytonuclear system, but presumably only from the increased degrees of freedom in the data. Biparental inheritance of cytoplasmic markers has been reported for several species of plants (METZLAFF et al. 1981; MEDGYESY et al. 1986), but we have not included it here since such data lack the special utility of uniparentally inherited cytoplasmic markers.

We have focused here on censusing adults, both for convenience and since, in general, assaying three markers from adult tissues will be easier than doing so from seeds, especially for species whose seeds are small. However, seeds from conifers may be particularly easy to assay, especially for nuclear allozymes (CONKLE 1971). The method presented here can be readily extended to populations censused at the seed stage, by converting the equilibrium genotype