Abstract
The consequences of variable rates of clonal reproduction on the population genetics of neutral markers are explored in diploid organisms within a subdivided population (island model). We use both analytical and stochastic simulation approaches. High rates of clonal reproduction will positively affect heterozygosity. As a consequence, nearly twice as many alleles per locus can be maintained and population differentiation estimated as F_{ST} value is strongly decreased in purely clonal populations as compared to purely sexual ones. With increasing clonal reproduction, effective population size first slowly increases and then points toward extreme values when the reproductive system tends toward strict clonality. This reflects the fact that polymorphism is protected within individuals due to fixed heterozygosity. Contrarily, genotypic diversity smoothly decreases with increasing rates of clonal reproduction. Asexual populations thus maintain higher genetic diversity at each single locus but a lower number of different genotypes. Mixed clonal/sexual reproduction is nearly indistinguishable from strict sexual reproduction as long as the proportion of clonal reproduction is not strongly predominant for all quantities investigated, except for genotypic diversities (both at individual loci and over multiple loci).
The essential feature of sexual reproduction is that genetic material from different ancestors is brought together in a single individual. Although sexual reproduction is dominant in eukaryotic organisms (e.g., Charlesworth 1989; West et al. 1999), many organisms of major medical or economic importance are known to reproduce mainly or strictly clonally (e.g., Milgroom 1996; Taylor et al. 1999; Tibayrenc 1999). The presence or absence of a sexual process crucially determines the genetics at both the individual and the population level and leads to several straightforward predictions. At the individual level, clonality will produce a strong correlation between alleles within individuals at different loci, as they share a common history within a clonal lineage. Sex, on the other hand, will break these associations, allowing for many more potential genetic combinations. Further, in diploids, absence of sex will promote divergence between alleles within loci, as the two copies will accumulate different mutations over time. This effect has been termed the “Meselson effect” and has recently been experimentally documented in bdelloid rotifers, which are believed to have been reproducing strictly clonally over long evolutionary time (Butlin 2000; Mark Welch and Meselson 2000, 2001). Heterozygosity is thus expected to increase indefinitely under clonal propagation (Birky 1996; Judson and Normark 1996). On the other hand, theoretical considerations predict that the effective population size of clonal organisms should be lower than that of panmictic ones (e.g., Orive 1993; Milgroom 1996). However, the few theoretical population genetics studies that we are aware of provide ambiguous conclusions on that topic (Orive 1993; Berg and Lascoux 2000) and numerous field observations support this ambiguity (e.g., Butlin et al. 1998; Gabrielsen and Brochmann 1998; Cywinska and Hebert 2002).
Thus, “whether organisms with clonal reproduction necessarily have lower genetic diversity is unclear” (Orive 1993, p. 337). These ambiguities illustrate how little is known about the population genetics consequences of clonal reproduction. In the absence of theoretical models providing clear expectations, estimating the rate of clonal reproduction in natural populations appears problematic (e.g., Anderson and Kohn 1998) and even the detection of purely clonal populations is often controversial (e.g., Tibayrenc 1997; Vilgalys et al. 1997). Clonality is not just an academic matter (Tibayrenc 1997). Many diploid organisms believed to reproduce mainly or strictly clonally are of major medical, veterinary, and economic importance, including pathogenic fungi such as Candida or protozoans such as Trypanosoma. A better understanding of the reproductive system of such organisms might be crucial for planning successful long-term drug administration or vaccination programs (Tibayrenc et al. 1991; Milgroom 1996; Taylor et al. 1999).
Here we present both analytical and stochastic simulation results for the population genetics of clonally and partially clonally reproducing populations. We focus on a simple population subdivision model (island model) and restrict our work to neutral mutations. We derive the identities by descent, F-statistics, and mean coalescence times of alleles and genotypes for variable rates of clonal reproduction. We also investigate the allelic and genotypic diversities maintained under different rates of clonal reproduction.
MODEL ASSUMPTIONS AND GENETIC IDENTITIES
We consider a subdivided monoecious population of diploid individuals, which reproduce clonally with probability c, with sexual reproduction occurring at the complementary probability (1 - c). Sexual reproduction in the model follows random union of gametes, self-fertilization occurs at a rate s, and each subpopulation is composed of N adults. In our model, individuals, rather than gametes, migrate following an island model (Wright 1951) at a rate m, implying that a migrant has an equal probability of reaching any of the subpopulations. We further assume stable census sizes and population structure and no selection. The life cycle involves nonoverlapping generations and juvenile migration. The precise sequence goes as follows:
Adult reproduction and subsequent death
Juvenile dispersal
Regulation of juveniles, the survivors reaching adulthood
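The three steps above can be sketched as a single generation update. This is a minimal illustration, not the EASYPOP implementation: the function names are hypothetical, juveniles are assumed to be produced in large numbers so that regulation amounts to filling each of the N adult slots independently, and a non-selfed second parent is drawn at random from the whole subpopulation (so it may occasionally coincide with the first).

```python
import random

def offspring(adults, c, s, rng):
    """Produce one juvenile from the adults of a single subpopulation."""
    mother = rng.choice(adults)
    if rng.random() < c:
        return mother                      # clonal: the diploid genotype is copied whole
    # Sexual: selfing with probability s, otherwise a second parent drawn at random
    # (assumption: the draw is from the whole subpopulation, mother included).
    father = mother if rng.random() < s else rng.choice(adults)
    return (rng.choice(mother), rng.choice(father))

def next_generation(demes, c, s, m, rng):
    """Reproduction, juvenile dispersal, and regulation for every subpopulation.

    demes: list of subpopulations, each a list of diploid genotypes (allele, allele).
    """
    n = len(demes)
    new_demes = []
    for i, adults in enumerate(demes):
        survivors = []
        for _ in range(len(adults)):       # regulation: census size N stays constant
            if n > 1 and rng.random() < m:
                # Island model: an immigrant comes from any other subpopulation.
                source = rng.choice([j for j in range(n) if j != i])
            else:
                source = i
            survivors.append(offspring(demes[source], c, s, rng))
        new_demes.append(survivors)
    return new_demes
```

With c = 1 and m = 0, every genotype in the next generation is an exact copy of a genotype from the same subpopulation, which is a quick sanity check on the sketch.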
Because of the symmetry of the island model, only the following probabilities of identity by descent are needed to describe the apportionment of genetic variation in a subdivided monoecious population.
F: The inbreeding coefficient, defining the probability that two alleles drawn at random from a single individual are identical by descent.
θ: Coancestry of individuals drawn at random from within the same subpopulation, defined as the probability that two randomly sampled alleles from two different individuals within a subpopulation are identical by descent.
α: Coancestry of individuals randomly drawn from different populations. This is defined as the probability that two randomly sampled alleles from two individuals in different subpopulations are identical by descent.
The identities may be calculated in juveniles (F_{J}, θ_{J}, α_{J}) or in adults (F_{A}, θ_{A}, α_{A}), that is, before or after migration, respectively. In a first step, we express identities between adults one generation forward in time (t + 1) as functions of juvenile identities (t + ½). Adult identities are affected only by dispersal,
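As an illustration of how dispersal alone transforms juvenile identities into adult identities, the following sketch is derived here under the assumption that a migrant settles in any of the other n - 1 subpopulations with equal probability. The symbols n (number of subpopulations), q_{s}, and q_{d} are notation introduced for this sketch: q_{s} is the probability that two juveniles found in the same subpopulation after dispersal were born in the same subpopulation, and q_{d} is the same probability for two juveniles found in different subpopulations.

```latex
\begin{aligned}
F_{A}(t+1)      &= F_{J}(t+\tfrac{1}{2}),\\
\theta_{A}(t+1) &= q_{s}\,\theta_{J}(t+\tfrac{1}{2}) + (1-q_{s})\,\alpha_{J}(t+\tfrac{1}{2}),\\
\alpha_{A}(t+1) &= q_{d}\,\theta_{J}(t+\tfrac{1}{2}) + (1-q_{d})\,\alpha_{J}(t+\tfrac{1}{2}),
\end{aligned}
\qquad
q_{s} = (1-m)^{2} + \frac{m^{2}}{n-1},
\qquad
q_{d} = \frac{1-q_{s}}{n-1}.
```

The first line reflects that whole individuals migrate, so the two alleles carried by one individual travel together and F is unchanged by dispersal.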
For analytical convenience, recurrence equations for identities by descent can be presented in matrix form,
INDIVIDUAL-BASED SIMULATIONS
To obtain the variances of the quantities of interest, as well as multilocus behavior, we additionally performed stochastic individual-based simulations, as implemented in the software EASYPOP (version 1.7.4; Balloux 2001). For all simulations, we used 20 loci with a mutation rate of 10^{-5}. Mutations had an equal probability of generating any of the 99 possible allelic states. This relatively high number of allelic states keeps the probability of obtaining indistinguishable alleles through different mutational events (homoplasy) low. Genetic diversity was set to the maximum possible value at the first generation and the simulation was then run for 10,000 generations, the point at which all statistics measured in EASYPOP (F_{IS}, F_{ST}, H_{S}, H_{T}, and the number of alleles) had reached equilibrium. All simulations were replicated 20 times.
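The mutation scheme described above corresponds to a K-allele model, which can be sketched as follows. This is an illustrative stand-in rather than EASYPOP code; the function and constant names are hypothetical, and whether a mutation may land back on the current state is an assumption made here for simplicity.

```python
import random

N_STATES = 99   # number of possible allelic states, as in the simulations above
MU = 1e-5       # mutation rate per locus and generation

def mutate(allele, rng, mu=MU, n_states=N_STATES):
    """Return the allele after one generation under a K-allele mutation model."""
    if rng.random() < mu:
        # Assumption: every state is equally likely, including the current one.
        return rng.randrange(n_states)
    return allele
```

Under this scheme two independent mutations produce the same state with probability 1/99, which is why a large number of states keeps homoplasy rare.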
F-STATISTICS
Deviations from random mating are generally expressed by means of F-statistics (Wright 1951). They are the most commonly used tools for describing gene flow and breeding structure in both theoretical and empirical studies (reviewed in Balloux and Lugon-Moulin 2002). F-statistics are defined as
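A minimal sketch of how F-statistics follow from the three identities F, θ, and α, assuming the standard definitions as ratios of identities by descent (the function name is hypothetical):

```python
def f_statistics(F, theta, alpha):
    """Return (F_IS, F_ST, F_IT) from identities by descent.

    F     : identity of two alleles within an individual
    theta : identity of alleles from two individuals of one subpopulation
    alpha : identity of alleles from individuals of different subpopulations
    """
    f_is = (F - theta) / (1.0 - theta)      # within-population deviation
    f_st = (theta - alpha) / (1.0 - alpha)  # population differentiation
    f_it = (F - alpha) / (1.0 - alpha)      # total deviation
    return f_is, f_st, f_it
```

For instance, F = 0 with θ = 0.2 gives F_IS = -0.25, a heterozygote excess. The three values always satisfy (1 - F_IT) = (1 - F_IS)(1 - F_ST).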
Within-population deviations from random mating (F_{IS}): Replacing the solutions of Equation 7 in (8), we get F_{IS} after migration for subdivided populations with a mixed system of clonal and sexual reproduction (selfing set to 1/N) and zygotic migration
In Figure 1, we plot F_{IS} as obtained from Equation 10 against the rate of clonal reproduction. We also give values obtained from individual-based simulations. Analytical and stochastic simulation results are in excellent agreement. From Figure 1, it can be seen that for very high values of clonal reproduction, huge heterozygote excesses are obtained. However, as long as there is a small proportion of sexual reproduction, F_{IS} stays close to what is expected under panmixia; a significant excess of heterozygotes occurs only for extreme rates of asexuality. As long as there is mutation in the system, F_{IS} cannot reach -1 even for strict clonality. If the product of the number of individuals in the complete population (nN) times the mutation rate is high, the F_{IS} value for complete clonality can be very much offset from -1. The reason for this can be seen from Equation 8. Under clonal reproduction all individuals will be heterozygous and this will not be changed by mutation, so F = 0, while θ decreases with increasing mutation rate.
The F_{IS} estimates from the stochastic simulations in Figure 1 are averaged over loci and replicates and do not reveal anything about the strong influence of the rate of clonal reproduction on the variance over loci. This huge variation among loci, in particular for low rates of sexual reproduction, is illustrated by standard errors in F_{IS} (Figure 2). The lowest variations are obtained with pure clonality and with <95% of clonality.
Population differentiation (F_{ST}): Again by replacing the solutions of Equation 7 in (8), we obtain F_{ST} for subdivided populations with a mixed system of clonal, selfing, and sexual reproduction after migration:
EFFECTIVE POPULATION SIZE
Effective population size: The effective population size (Wright 1931) is the parameter summarizing the amount of genetic drift to which a population is subjected. It is quantified as the number of idealized randomly mating individuals that experience the same amount of random fluctuations at a neutral locus as the population under scrutiny. The dynamics of idealized randomly mating individuals are described by the Wright-Fisher model, whose well-known properties lead to different definitions of the effective population size depending on whether the quantities of interest are the variance of change in allelic frequencies, inbreeding coefficients, or the rate of decline in heterozygosity (Ewens 1982; Whitlock and Barton 1997). Here we introduce a new definition of effective size called the coalescence effective size,
There is a strict relationship between identity-by-descent probabilities and coalescence times (Slatkin 1991; Rousset 1996). The probability of identity of any pair of alleles is the probability that neither allele has undergone mutation since their most recent common ancestor (Hudson 1990). Recalling Equation 7,
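The link between identity and coalescence time can be illustrated numerically. The sketch below assumes, for illustration only, that the coalescence time T of two alleles is geometrically distributed with a per-generation coalescence probability p (for example 1/(2N) in an idealized Wright-Fisher population) and that each lineage mutates with probability u per generation; p, u, and the function names are notation introduced here.

```python
def identity_closed_form(p, u):
    """Identity probability when the coalescence time T is geometric with parameter p."""
    gamma = (1.0 - u) ** 2          # neither lineage mutates during one generation
    return p * gamma / (1.0 - (1.0 - p) * gamma)

def identity_by_summation(p, u, t_max=20_000):
    """Same quantity, summing Pr(T = t) * (1 - u)**(2 * t) over t = 1..t_max."""
    gamma = (1.0 - u) ** 2
    term = p * gamma                # contribution of t = 1
    total = 0.0
    for _ in range(t_max):
        total += term
        term *= (1.0 - p) * gamma   # move from t to t + 1
    return total
```

When u is much smaller than p, the identity is close to 1 - 2u E[T] with E[T] = 1/p, so a longer mean coalescence time translates directly into a lower identity.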
Genotypic and allelic effective population size: We have shown that increased rates of clonal reproduction will increase the allelic effective population size, and thus clonal populations are expected to maintain more alleles at neutral loci than are sexually reproducing ones. We can go a step further and address the issue of how clonal reproduction will affect the number of different genotypes maintained. The coalescence approach allows us to capture these trends qualitatively by calculating the genotypic effective population size. To obtain this quantity, we need, in addition to F and θ, the probability ρ that three alleles randomly sampled in two different individuals are identical. These three variables are necessary to calculate the probability Δ that two genotypes are identical. However, these higher-order coefficients are complicated and we therefore limit ourselves to a nonsubdivided monoecious population without mutation. We follow the approach of Cockerham (1971, pp. 243-244) to calculate the dynamics of these four variables. Collecting the identities given in Appendix A leads to the following system of recurrence equations:
GENETIC DIVERSITIES
We can now take a closer quantitative look at how genetic diversity is distributed between alleles and genotypes with the stochastic simulations. Allelic diversity can be expressed as the effective number of alleles, n_{e}, corresponding to the number of equally frequent alleles needed to observe a given genetic diversity, which is
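A minimal sketch of the effective number of alleles, assuming the usual definition n_e = 1/(1 - H), where H = 1 - Σ p_i² is the expected heterozygosity computed from allele frequencies p_i (the function name is hypothetical):

```python
def effective_number_of_alleles(freqs):
    """Number of equally frequent alleles that would give the same gene diversity."""
    homozygosity = sum(p * p for p in freqs)   # equals 1 - H
    return 1.0 / homozygosity
```

Four equally frequent alleles give n_e = 4, whereas two alleles at frequencies 0.9 and 0.1 give n_e of about 1.22, reflecting that the rare allele contributes little to diversity.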
DISCUSSION
We used both an analytical approach and stochastic individual-based simulations to describe the dynamics of genetic variance in subdivided populations, characterized by various levels of clonal reproduction. Higher rates of asexual reproduction will increase heterozygosity and decrease population differentiation. Diversity at single loci will be higher in clonal organisms than in sexuals, whereas the opposite is true for genotypic diversity. With the exception of genotypic diversity (both at single loci and over multiple loci), which decreases at a constant rate with increasing rates of asexual reproduction, all other quantities investigated are significantly affected only when sexual reproduction becomes rare.
Our results thus suggest that strict clonality may easily be detected in diploid populations due to heterozygote excess. Furthermore, very low levels of sex (cryptic sex) may also be revealed by F_{IS} values that are low on average and show very large variance among loci, though DNA alterations may also lead to a similar pattern in a strictly clonal population. For instance, Candida albicans is known to undergo mitotic recombinations including chromosomal translocation (Lott et al. 1999). Much effort has been put into testing for evidence of strict clonal reproduction with traditional population genetics (e.g., Tibayrenc et al. 1991) or through testing for the Meselson effect (high divergence at the two alleles of a single locus within individuals; reviewed in Butlin 2000). Extreme genetic divergence at single loci within individuals has been documented in bdelloid rotifers, which are believed to be ancient asexuals (Mark Welch and Meselson 2000, 2001). The Meselson effect could, however, not be detected in other potentially old asexual lineages (Schön et al. 1998; Normark 1999). Whether this is due to rare sex or perhaps to extremely frequent gene conversion events (the copy of the DNA sequence of one chromosome on the other) is an unresolved issue to date.
Empirical data on genetic variation and its apportionment by means of F-statistics in clonal lineages, as compared to sexually reproducing populations of the same species, are rare. Furthermore, studies using dominant genetic markers (e.g., randomly amplified polymorphic DNAs) do not properly allow for the disentanglement between genetic variation within loci and within genotypes. Indeed, as can be seen from Figure 6, the absolute genetic diversity (the sum of allelic and genotypic variability) does not provide any clear prediction on the rate of clonal reproduction. Another potential problem stems from the difficulty in ruling out the presence of rare sexual reproduction. However, a recent study by Delmotte et al. (2002) comparing eight sexual with five asexual populations of the aphid Rhopalosiphum padi could provide a test for our model. Their empirical results are overall in good agreement with our analytical expectations. As we expect, Delmotte et al. (2002) report increased excess in expected heterozygotes (F_{IS}) for asexuals and lower differentiation (F_{ST}) between asexual populations than between sexual populations. They also report lower genotypic variation and lower allelic variation in asexuals than in sexuals. This relatively good agreement between our model and their data suggests that these asexual populations have not experienced sexual reproduction in recent times. The discrepancy in allelic diversity could be due to different factors (e.g., sampling, extinction-recolonization dynamics). However, even if we assumed all else being equal between the sexual and asexual aphid populations, selection could still reduce genetic diversities more effectively in the asexual populations. Mutations under strong directional selection make linked loci behave as if they were evolving under smaller effective population sizes (Robertson 1961).
Due to the complete absence of recombination in strictly clonal organisms, any strongly deleterious dominant mutation will drive the lineage in which it appeared to extinction. New beneficial mutations will also reduce the effective population sizes of clones, as lineages with a new beneficial mutation will displace other lineages.
Indeed, our model does not include natural selection, so that our results apply strictly to neutral genetic variability or more generally to relatively weakly selected polymorphisms subject to genetic drift. Genetic drift is the main force driving allele frequencies as long as the selection differential s between alleles is not much above the inverse of effective population size (1/N_{e}). For higher selection differentials, the effect of genetic drift becomes negligible. However, our predictions should hold even for relatively large selection differentials in clonal and nearly clonal organisms, as the efficacy of selection acting simultaneously at linked sites is considerably reduced (Hill and Robertson 1966).
We assumed identical fitness (in both mean and variance) for clonally and sexually produced offspring. The rate of clonal reproduction is not a heritable trait in our model, as it is a fixed property of the population (clonally produced individuals do not have a higher chance to reproduce clonally themselves). Therefore, different fecundities for sexually or clonally produced offspring would result only in increasing the variance in reproductive success and thus would decrease the effective population size. Our results are thus qualitatively robust to reasonable differences in relative fitness between clonally and sexually produced offspring.
Finally, our model could lead to the development of new approaches to infer the rate of clonal reproduction. Our results show that all estimators based on identities by descent (including linkage disequilibrium approaches) are expected to be rather insensitive to the rate of clonal reproduction as long as it does not become strongly predominant. It is therefore doubtful that such estimators will allow precise inferences on the actual rate of clonal reproduction unless it is very close or equal to 1. As genotypic diversity decreases smoothly with the rate of clonal reproduction, one promising alternative approach would be to build estimators of clonal reproduction as functions of the relative genotypic and allelic identities.
APPENDIX A: GENOTYPIC PROBABILITIES OF IDENTITIES BY DESCENT
We follow the same rationale as Cockerham (1971, pp. 242-243), but add the dynamics of θ. When an offspring is produced clonally, its two alleles are not independent. When we sample alleles and look back to their common parent, the two genes of a clone always stem from the same individual. Two clones are randomly sampled with probability c^{2}; the four genes stem either from the same parent or from two different ones. The identity between genotypes and three alleles reads
APPENDIX B: COEFFICIENT OF EQUATION 23
If c = 0, then the mean coalescence time for two genotypes in the Wright-Fisher setting reduces to
Performing a Taylor expansion of first degree under large population size and substituting some close integers yields the approximation for Equation B2:
Acknowledgments
We thank Nathalie Charbonnel, Sylvain Gandon, Jerôme Goudet, Andy Overall, Franck Prugnolle, François Renaud, Max Reuter, Denis Roze, Michel Tibayrenc, and two anonymous referees for very inspiring conversations and comments; François Rousset for having given access to unpublished material; and Sylvain de l’Hérault for his strong support. F.B. was supported by the Biotechnology and Biological Sciences Research Council and by grant 823A067616 from the Swiss National Science Foundation.
Footnotes

Communicating editor: M. K. Uyenoyama
Received November 6, 2002.
Accepted April 11, 2003.
Copyright © 2003 by the Genetics Society of America