Properties of Maximum Likelihood Male Fertility Estimation in Plant Populations
M. T. Morgan, Department of Botany, Department of Genetics and Cell Biology, Washington State University, Pullman, Washington 99164-4238
Corresponding author: M. T. Morgan, Department of Botany, Department of Genetics and Cell Biology, Washington State University, Pullman, Washington 99164-4238, mmorgan{at}wsu.edu (E-mail).
Communicating editor: A. H. D. BROWN
| ABSTRACT |
|---|
Computer simulations are used to evaluate maximum likelihood methods for inferring male fertility in plant populations. The maximum likelihood method can provide substantial power to characterize male fertilities at the population level. Results emphasize, however, the importance of adequate experimental design and evaluation of fertility estimates, as well as limitations to inference (e.g., about the variance in male fertility or the correlation between fertility and phenotypic trait value) that can be reasonably drawn.
One half of the nuclear genes in most plants pass through the male reproductive pathway, yet estimates of male fertility based on ecological observations such as dispersal distances of pollen analogues or observed pollinator movements can be "disappointingly crude."
Genetic markers can assist male fertility estimation. The most powerful marker-based methods ...
Here I use computer simulation to document statistical power of maximum likelihood methods and to identify conditions when reasonable insight into male fertility variation can be obtained. The focus is on allozyme data, where factors contributing to manageable experimental designs are well understood; speculation on possible results from highly variable markers is presented in DISCUSSION. Results indicate the importance of genetic exclusion probability ...
| MATERIALS AND METHODS |
|---|
Maximum likelihood estimation:
The maximum likelihood method estimates a vector $\lambda$ of male fertilities, using a matrix $X$ of genetic data. Each element $\lambda_j$ of the fertility vector corresponds to the fertility of the $j$th unique male genotype, while the matrix entry $X_{ij}$ is the probability of observing offspring genotype $i$ given the genotypes of the maternal parent and the $j$th putative paternal parent. The likelihood of the observed progeny is

$$L(\lambda) = \prod_i \sum_j X_{ij}\lambda_j \qquad (1)$$

The goal is to identify the vector of male fertilities maximizing this likelihood.
A maximum of the likelihood can be found using the expectation maximization algorithm. Each iteration updates $\lambda_j$ to a value $\lambda'_j$ using the formula

$$\lambda'_j = \frac{1}{N} \sum_i \frac{X_{ij}\lambda_j}{\sum_k X_{ik}\lambda_k} \qquad (2)$$

where $N$ is the number of offspring assayed. The product $X_{ij}\lambda_j$ in the numerator represents the expectation step, while the division and outer sum correspond to maximization. The algorithm used here starts with an initial vector of male fertilities $\lambda$ in which elements are equal and sum to one, $\sum_j \lambda_j = 1$. Iteration proceeds until the change in the log of the likelihood is less than $10^{-5}$ per iteration.
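The iteration described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used in the study: the function name `em_male_fertility`, the variable name `lam` for the fertility vector, and the `max_iter` safeguard are all hypothetical, and the sketch assumes every offspring is compatible with at least one putative father.

```python
import math


def em_male_fertility(X, tol=1e-5, max_iter=10000):
    """Estimate male fertilities by expectation maximization.

    X[i][j] is the probability of observing offspring i's genotype given
    its mother and putative father j. Assumes each row of X has at least
    one positive entry. Returns (fertilities, log likelihood).
    """
    n_offspring = len(X)
    n_males = len(X[0])

    def log_likelihood(lam):
        # Log of Equation 1: sum over offspring of log(sum_j X_ij * lam_j).
        return sum(
            math.log(sum(x * l for x, l in zip(row, lam))) for row in X
        )

    # Start from equal fertilities that sum to one.
    lam = [1.0 / n_males] * n_males
    log_lik = log_likelihood(lam)

    for _ in range(max_iter):
        totals = [0.0] * n_males
        for row in X:
            denom = sum(x * l for x, l in zip(row, lam))
            for j, x in enumerate(row):
                # Fractional assignment of this offspring to male j.
                totals[j] += x * lam[j] / denom
        # Averaging over offspring gives the updated fertilities.
        lam = [t / n_offspring for t in totals]

        new_log_lik = log_likelihood(lam)
        converged = abs(new_log_lik - log_lik) < tol
        log_lik = new_log_lik
        if converged:
            break

    return lam, log_lik


# Toy data: three offspring compatible only with male 0, one only with male 1.
X_example = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
fertilities, final_log_lik = em_male_fertility(X_example)
print(fertilities)
```

In the toy data every offspring is assigned without ambiguity, so the estimates are simply the observed proportions of progeny sired by each male. With incomplete exclusion, rows of `X` have several positive entries and the estimates depend on the fractional assignments.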
Simulation methodology:
Simulation was used to evaluate the statistical power of the estimation procedure and to evaluate inference about male fertility. Simulations centered around a "standard" parameter set. The standard set assumed a dioecious population of 25 male and 25 female parents, with 20 progeny assayed per maternal family. Genetic data in the standard set consist of eight loci, each with two equally frequent alleles (expected exclusion probability = 0.81; observed exclusions in simulations, e.g., in Figure 1, are less than this because of the finite number of paternal parents). This parameter set involves assaying a reasonable number of progeny for a combination of loci with exclusion probabilities toward the high end of that attainable with allozyme markers. Natural populations are likely to have more than 25 potential males, but the analyses presented below suggest that this realistic situation results in poor statistical properties. Loci are in Hardy-Weinberg and linkage equilibrium and are inherited in a Mendelian fashion. Parental genotypes are known without error. Expected male fertilities were chosen from a Gaussian distribution with mean equal to the number of progeny simulated and coefficient of variation equal to $CV_g$; zero fertility was assigned when negative deviates were drawn. The actual fertility coefficient of variation $CV_m$ (i.e., variation in male fertility realized in a simulation) includes this source of variation and an additional multinomial component associated with sampling. Numbers of male and female parents, progeny array size, and number of loci were varied one at a time, with $CV_g$ ranging between zero and one (with $CV_g < 0.7$, virtually all males sire some offspring, whereas for $CV_g = 1$, the distribution of male fertilities is nearly Poisson and ~35% of males sire no offspring). Each parameter combination involved 500 replicates.
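The expected exclusion probabilities quoted for these marker sets can be checked with a short calculation. The sketch below assumes the standard paternity exclusion probability for a codominant two-allele locus with known maternal genotype, pq(1 - pq), and independent loci; the function names are hypothetical and introduced only for illustration.

```python
def locus_exclusion(p):
    """Expected paternity exclusion probability at one codominant
    two-allele locus with allele frequency p and a known mother."""
    pq = p * (1.0 - p)
    return pq * (1.0 - pq)


def combined_exclusion(per_locus):
    """Combine exclusion probabilities across independent loci: a male is
    excluded overall if he is excluded at one or more loci."""
    not_excluded = 1.0
    for e in per_locus:
        not_excluded *= 1.0 - e
    return 1.0 - not_excluded


# Two equally frequent alleles at each locus.
eight_loci = combined_exclusion([locus_exclusion(0.5)] * 8)
twelve_loci = combined_exclusion([locus_exclusion(0.5)] * 12)
print(round(eight_loci, 2), round(twelve_loci, 2))
```

Each locus excludes a random non-father with probability 0.1875, which compounds to about 0.81 over eight loci and about 0.92 over 12 loci, matching the expected values used in the text.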
Statistical power was evaluated using a likelihood ratio statistic computed from log L. For each statistical test, 500 data sets were simulated assuming equal male fertility, $CV_g = 0$. The log-likelihood ratio values from these simulations represent the null distribution against which fertility distributions with $CV_g > 0$ are to be compared. Statistical power for each scenario with $CV_g > 0$ is determined as the proportion of log-likelihood ratio values more extreme (larger) than 95% of the values under the assumption of equal expected fertility.
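The power calculation just described reduces to comparing two sets of simulated test statistics. The following sketch assumes the statistic values have already been computed for each replicate; the function name `empirical_power` and the toy inputs are hypothetical.

```python
import math


def empirical_power(null_stats, alt_stats, alpha=0.05):
    """Proportion of alternative-scenario statistics that exceed the
    (1 - alpha) quantile of the simulated null distribution."""
    ordered = sorted(null_stats)
    n = len(ordered)
    # Index of the value at or below which (1 - alpha) of the null
    # statistics fall; rounding guards against floating-point error.
    k = math.ceil(round((1.0 - alpha) * n, 9)) - 1
    critical = ordered[k]
    exceed = sum(1 for s in alt_stats if s > critical)
    return exceed / len(alt_stats)


# Toy inputs: 100 null values 0..99, so the critical value is 94.
null_values = list(range(100))
alt_values = [90, 95, 99, 100]
print(empirical_power(null_values, alt_values))
```

With 500 null replicates, as in the simulations, the critical value is the 475th smallest null statistic, and power is the fraction of replicates under unequal fertility that exceed it.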
Two measures were used to characterize estimated vs. actual fertilities. The first, $\widehat{CV}_m/CV_m$, compared the estimated to actual male fertility coefficient of variation (this is also the ratio of estimated and actual male fertility standard deviations because the mean estimated and actual male fertility is the same). The fertility coefficient of variation represents the opportunity for selection, and $\widehat{CV}_m/CV_m$ provides an indication of whether this opportunity will be over- or underestimated in paternity analyses. The second measure is the correlation between estimated and actual fertilities. This correlation is important in analyses of selection attempting to correlate phenotypic trait value with a measure of fitness, because it determines the maximum possible correlation between trait and fitness.
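Both summary measures are simple to compute from paired vectors of estimated and actual fertilities. The sketch below is illustrative only: the function names and the toy fertility values are hypothetical, and the coefficient of variation uses the population (not sample) standard deviation as an assumption.

```python
import math


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance) / mean


def pearson_correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Toy fertilities for three males (same mean, different spread).
actual = [10.0, 20.0, 30.0]
estimated = [12.0, 20.0, 28.0]

cv_ratio = coefficient_of_variation(estimated) / coefficient_of_variation(actual)
correlation = pearson_correlation(estimated, actual)
print(cv_ratio, correlation)
```

A ratio below one means the estimates understate the realized fertility variation, and a ratio above one means they overstate it. The correlation is computed independently of the ratio, so estimates can rank males perfectly while still misjudging the spread, as in the toy values here.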
| RESULTS |
|---|
Simulation results in Figure 1 indicate that statistical power to reject the null hypothesis of equal male fertility can be high, provided that male fertility is not too uniformly distributed. Paternity analyses benefit from large progeny sizes, many maternal progeny arrays, many loci (high exclusion probabilities), and few paternal parents. The lower panels of Figure 1 suggest that the total number of progeny assayed is important because similar curves result when comparable total progeny are assayed (e.g., 10 progeny from 25 mothers = 250 total progeny vs. 20 progeny from 12 mothers = 240 total progeny).
Estimation of the male fertility variance may be biased, and there may not be a strong correlation between actual and estimated fertility (Table 1). These difficulties are particularly apparent when the actual variance is limited or when many male fertilities are estimated. Even in scenarios with 12 loci and, hence, extraordinary exclusion probability (expected exclusion probability = 0.92), the maximum likelihood method overestimates variance in male fertility by 1.5- to 2-fold. With eight loci and moderate exclusion probability (expected exclusion probability = 0.81), the correlation between actual and estimated fertility ranges from 0.25, when there are many males with limited fertility variation, to 0.65, when substantial fertility variation among relatively few males is estimated using many or large maternal families. With the exclusion probability offered by 12 loci, the correlation between actual and estimated fertility can rise to 0.83. When males have equal expected fertility, replicates with 50 females or 40 progeny per female show a slight decrease in performance of the estimators compared with standard parameter values involving fewer females or progeny. A similar pattern is observed when male fertility variation is summarized as a ratio of expected values, rather than as the expected value of ratios, so that the difference is not likely to result from uncertainty in the denominator of $\widehat{CV}_m/CV_m$. Instead, this result may reflect an underlying bias in the imperfectly estimated fertilities, reinforced by larger sample sizes.
| DISCUSSION |
|---|
Maximum likelihood methods can detect significant male fertility variation when applied to appropriate data sets ... exclusion probability = 0.92. The correlation between estimated and actual fertility can reduce the correlation between trait value and relative fertility in a selection analysis by 50% or more (Table 1). These results suggest how experimental design can enhance statistical power, and they indicate limits to inference drawn from such experiments.
Experimental populations are well suited to inference of male fertility.
Genetic information (exclusion probability) plays a prominent but not exclusive role in male fertility estimation. For instance, all parameter sets involving eight loci in Figure 1 have the same exclusion probability, yet statistical power varies from near zero to one, depending on other aspects of experimental design and the actual amount of fertility variation. The results of Table 1 similarly show the importance of factors other than exclusion probability in characterizing fertility variation. Even if exclusion were complete and fertility assigned without error, under the hypothesis of uniform expected male fertility, the error of individual fertility estimates follows a multinomial distribution with sampling variance inversely proportional to the total number of progeny surveyed.
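As a rough worked example of this sampling limit, assume complete exclusion, $M$ males with equal expected fertility, and $N$ progeny surveyed in total. The symbols $M$, $N$, and $n_j$ (the number of progeny sired by male $j$) are introduced here for illustration and do not appear in the original text. Each $n_j$ is then binomial with success probability $1/M$, so that

$$\mathrm{Var}\!\left(\frac{n_j}{N}\right) = \frac{1}{N}\,\frac{1}{M}\left(1 - \frac{1}{M}\right), \qquad \frac{\sqrt{\mathrm{Var}(n_j)}}{E[n_j]} = \sqrt{\frac{M - 1}{N}}$$

For the standard parameter set ($M = 25$ males and $N = 25 \times 20 = 500$ progeny), the second expression gives $\sqrt{24/500} \approx 0.22$. Sampling alone would therefore produce a realized fertility coefficient of variation of roughly one-fifth even when no male differs from another in expected fertility.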
Modern molecular markers may substantially expand the applicability of paternity analyses, although available data sets only hint at appropriate parameters for further investigation. Simple sequence repeats (SSRs) are one promising genetic marker with abundant polymorphism and codominant expression. Although many SSR loci are found in rice ...
Computer simulation and resampling techniques may continue to play an important part in paternity studies. Preliminary analysis, using knowledge of marker variation, population structure, and proposed experimental design, might help to determine whether a full-scale study will be informative.
Finally, the method of estimating paternity used here represents only one form of analysis. ADAMS and co-workers ...
| ACKNOWLEDGMENTS |
|---|
This research was supported by a Natural Sciences and Engineering Research Council of Canada postdoctoral fellowship. DANIEL SCHOEN, PETER SMOUSE, and anonymous reviewers provided many helpful comments on earlier versions.
Manuscript received November 18, 1997; Accepted for publication February 27, 1998.
| LITERATURE CITED |
|---|
ADAMS, W. T., 1992 Gene dispersal within forest tree populations. New Forests 6:217-240.
ADAMS, W. T., and D. S. BIRKES, 1991 Estimating mating patterns in forest tree populations, pp. 157-172 in Biochemical Markers in the Population Genetics of Forest Trees, edited by S. FINESCHI, M. E. MALVOLTI, F. CANNATA and H. H. HATTEMER. SPB Academic Publishing, The Hague.
ADAMS, W. T., D. S. BIRKES and V. J. ERICKSON, 1992 Using genetic markers to measure gene flow and pollen dispersal in forest tree seed orchards, pp. 37-61 in Ecology and Evolution of Plant Reproduction, edited by R. WYATT. Chapman & Hall, New York.
ARNOLD, S. J. and M. J. WADE, 1984 On the measurement of natural and sexual selection: theory. Evolution 38:709-719.
BROWN, A. H. D., 1990 Genetic characterization of plant mating systems, pp. 145-162 in Plant Population Genetics, Breeding, and Genetic Resources, edited by A. H. D. BROWN, M. T. CLEGG, A. L. KAHLER and B. S. WEIR. Sinauer Associates, Sunderland, MA.
BROYLES, S. B. and R. WYATT, 1990 Paternity analysis in a natural population of Asclepias exaltata: multiple paternity, functional gender, and the `pollen-donation' hypothesis. Evolution 44:1454-1468.
BURCZYK, J., W. T. ADAMS, and J. Y. SHIMIZU, 1996 Mating patterns and pollen dispersal in a natural knobcone pine (Pinus attenuata Lemmon.) stand. Heredity 77:251-260.
CHAKRABORTY, R., P. E. SMOUSE, and T. R. MEAGHER, 1988 Parentage analysis with genetic markers in natural populations. I. The expected proportion of offspring with unambiguous paternity. Genetics 118:527-536.
CHASE, M., R. KESSELI, and K. BAWA, 1996 Microsatellite markers for population and conservation genetics. Am. J. Bot. 83:51-57.
CHEN, X., S. TEMNYKH, Y. XU, Y. G. CHO, and S. R. MCCOUCH, 1997 Development of a microsatellite framework map providing genome-wide coverage in rice (Oryza sativa L.). Theor. Appl. Genet. 95:553-567.
CONNER, J. K., S. RUSH, S. KERCHER, and P. JENNETTEN, 1996 Measurements of natural selection on floral traits in wild radish (Raphanus raphanistrum). 2. Selection through lifetime male and total fitness. Evolution 50:1137-1146.
CROW, J. F., 1958 Some possibilities for measuring selection intensities in man. Hum. Biol. 30:1-13.
DAWSON, I. K., R. WAUGH, A. J. SIMONS, and W. POWELL, 1997 Simple sequence repeats provide a direct estimate of pollen-mediated gene dispersal in the tropical tree Gliricidia sepium. Mol. Ecol. 6:179-183.
DEVLIN, B. and N. C. ELLSTRAND, 1990 Male and female fertility variation in wild radish, a hermaphrodite. Am. Nat. 136:87-107.
DEVLIN, B., K. ROEDER, and N. C. ELLSTRAND, 1988 Fractional paternity assignment: theoretical development and comparison to other methods. Theor. Appl. Genet. 76:369-380.
DEVLIN, B., J. CLEGG, and N. C. ELLSTRAND, 1992 The effect of flower production on male reproductive success in wild radish populations. Evolution 46:1030-1042.
HARTL, D. L., and A. G. CLARK, 1989 Principles of Population Genetics. Sinauer Associates, Sunderland, MA.
KOHN, J. R. and S. C. H. BARRETT, 1992 Experimental studies on the functional significance of heterostyly. Evolution 46:43-55.
LANDE, R., 1976 Natural selection and random genetic drift in phenotypic evolution. Evolution 30:314-334.
LANDE, R. and S. J. ARNOLD, 1983 The measurement of selection on correlated characters. Evolution 37:1210-1226.
LI, C. C., 1955 Population Genetics. The University of Chicago Press, Chicago.
ROEDER, K., B. DEVLIN, and B. G. LINDSAY, 1989 Application of maximum likelihood methods to population genetic data for the estimation of individual fertilities. Biometrics 45:363-379.
SCHOEN, D. J. and S. C. STEWART, 1986 Variation in male reproductive investment and male reproductive success in white spruce. Evolution 40:1109-1120.
SMITH, J. S. C., E. C. L. CHIN, H. SHU, O. S. SMITH, and S. J. WALL et al., 1997 An evaluation of the utility of SSR loci as molecular markers in maize (Zea mays): comparisons with data from RFLPs and pedigree. Theor. Appl. Genet. 95:163-173.
SMOUSE, P. E. and T. R. MEAGHER, 1994 Genetic analysis of male reproductive contributions in Chamaelirium luteum (L.) Gray (Liliaceae). Genetics 136:313-322.
SNOW, A. A. and P. O. LEWIS, 1993 Reproductive traits and male fertility in plants: empirical approaches. Annu. Rev. Ecol. Syst. 24:331-351.