## Abstract

In sexually reproducing isogamous species, syngamy between gametes is generally not indiscriminate, but rather restricted to occurring between complementary self-incompatible mating types. A longstanding question regards the evolutionary pressures that control the number of mating types observed in natural populations, which ranges from two to many thousands. Here, we describe a population genetic null model of this reproductive system, and derive expressions for the stationary probability distribution of the number of mating types, the establishment probability of a newly arising mating type, and the mean time to extinction of a resident type. Our results yield that the average rate of sexual reproduction in a population correlates positively with the expected number of mating types observed. We further show that the low number of mating types predicted in the rare-sex regime is primarily driven by low invasion probabilities of new mating type alleles, with established resident alleles being very stable over long evolutionary periods. Moreover, our model naturally exhibits varying selection strength dependent on the number of resident mating types. This results in higher extinction and lower invasion rates for an increasing number of residents.

- mating types
- isogamy
- facultative sex
- self-incompatibility
- balancing selection
- frequency-dependent selection
- finite populations

IN isogamous species, the gamete size differentiation that defines the sexes in anisogamous species is absent (Lehtonen *et al.* 2016). Despite their morphological similarity, the gametes of isogamous species are typically not interchangeable, but rather fall into one of a number of genetically determined self-incompatible gamete classes, termed mating types. Syngamy can only occur between gametes of distinct and complementary mating types.

The evolutionary explanation for this self-incompatibility (SI), which limits the number of potential mates available to an isogamous organism, has been the subject of debate, with a number of competing hypotheses proposed (Billiard *et al.* 2011). These include theories that such SI alleles limit inbreeding depression (Charlesworth and Charlesworth 1979), increase encounter rates between gamete pairs (Hoekstra 1982; Hadjivasiliou *et al.* 2015; Hadjivasiliou and Pomiankowski 2019), allow for ploidy level detection and the instigation of the zygote developmental program (Haag 2007; Perrin 2012), or manage cytoplasmic conflict by promoting uniparental inheritance of organelles (UPI) (Hurst and Hamilton 1992; Hadjivasiliou *et al.* 2012).

Empirical observations of the number of mating types vary between species, ranging from two to many thousands (Kothe 1996). For example, intermediate numbers of mating types are reported in ciliates of the *Tetrahymena*-species [ different mating types (Doerder *et al.* 1995; Phadke and Zufall 2009)] and slime molds [ mating types (Bloomfield *et al.* 2010; Clark and Haskins 2010)]. Larger numbers can be found in fungal populations; in *Coprinellus disseminatus* the global population is estimated to contain 123 different mating types (James *et al.* 2006), while *Schizophyllum commune* has a staggering distinct types (Kothe 1999). This naturally leads to the question which type of evolutionary pressures govern this diversity in mating type number? This has been the subject of much debate, motivated in part by a discrepancy between these empirical observations and simple evolutionary reasoning.

From a theoretical standpoint, one might naïvely expect to see a very large number of mating types within any given species, due to the “rare sex advantage” of novel types (Iwasa and Sasaki 1987). Since mating types are self-incompatible, rare types have more opportunities for mating, and, thus, each type experiences negative-frequency dependent selection. Therefore, each novel mating type produced by mutation should establish in the population, and the number of mating types should consistently grow. However, this prediction stands in stark contrast to what is observed in the natural world. Although isogamous species with hundreds, or even thousands, of mating types are possible (Kothe 1996), examples of such species are very rare; the vast majority have very few (typically two) mating types (Hadjivasiliou 2014). Explaining this discrepancy between theory and empirical observation has been the focus of much work, and multiple theories have been proposed (Billiard *et al.* 2011).

One prominent hypothesis is that UPI drives the evolution of two mating types (Hadjivasiliou *et al.* 2012), with larger mating type numbers becoming less stable with the increased complexity of coordinating an organelle donor-receiver program (Hurst and Hamilton 1992). While recent modeling work has shown that, when the frequency-dependent effects of UPI are accounted for, an invading donor-receiver program does not reduce the expected number of mating types (Hadjivasiliou *et al.* 2013), perceived empirical support for the theory comes from the fact that many species with more than two mating types have developed mechanisms to ensure homoplasmy without such a program. For instance, in *Paramecium bursaria*, with up to eight mating types (Phadke and Zufall 2009), and *S. commune*, sexual reproduction is achieved following the exchange of nuclei between cells without cytoplasmic mixing (Birky 1995). However, this in turn leads to an opposing question: if it really is only a donor-receiver program that limits the number of mating types, why do ciliates and Agaricomycetes, with these alternative methods for ensuring homoplasmy, still feature species with very few types [as low as two and three respectively, see James (2015)]?

A further hypothesis (Hoekstra 1982; Hoekstra *et al.* 1991), with renewed attention (Hadjivasiliou and Pomiankowski 2016), suggests that it is cell–cell signaling between gametes that limits the number of mating types. Here coevolution between two resident types has been shown to potentially limit the evolutionary success of a third mutant mating type. If derived from a resident type, this new mutant must essentially cross a fitness valley before it can develop encounter rates with the residents comparable to their current pairwise encounter rate, limiting its invasion potential (Hadjivasiliou and Pomiankowski 2016).

In each of these hypotheses, a biologically plausible emergent mechanism is sought that generates a selective advantage to a pair of mating types, thus limiting their number to two. A notable exception can be found in Iwasa and Sasaki (1987). Here, it was demonstrated that, under certain dynamics for the mating type encounter rate, the advantage to rare mating types could be suppressed. In the limit of infinitely long-lived gametes (that can always survive until a suitable partner becomes available), selection for mating type numbers greater than two could be eliminated entirely. It was verbally suggested that, under such a scenario, genetic drift would purge new mating types, limiting their number. This led to a bimodal prediction for the number of types; populations would either have two (given a particular set of encounter rate dynamics and immortal gametes), or have infinitely many otherwise (with each at infinitely low frequency).

While the conditions required for two mating types in Iwasa and Sasaki (1987) were stringent (and, indeed, the possibility of intermediate numbers of mating types impossible) it was notable for suggesting that genetic drift may have a key role to play in determining mating type number. In a similar vein, recent work has emphasized the relevance of finite population size null models (*i.e.*, models in which all mating types are phenotypically similar) for addressing the distribution of mating type numbers observed in nature (Constable and Kokko 2018; Czuppon and Rogers 2019). These studies stress that, even in the absence of species-specific biological processes, the number of mating types in any real finite population cannot be infinite. Instead, the expected number of types will arise from balance between mutations (which introduce new mating types) and extinctions (which decrease the number of types), leading to a number of mating types well below the studied population size.

As an explanation for the low number of mating types often observed in isogamous species, this mutation-extinction balance hypothesis may seem at first improbable, particularly in the light of a number of classic population genetic studies of SI alleles in plants (Wright 1939, 1960, 1964; Ewens 1964; Nagylaki 1975; Yokoyama and Nei 1979; Yokoyama and Hetherington 1982), see also Clark and Kao (1994) for a review. From a modeling perspective, the dynamics of SI alleles in gametophytic species, those where SI is determined at the haploid pollen stage (Bod’ová *et al.* 2018), closely resemble those of SI mating type alleles in isogamous species. [A comparison with the dynamics of SI alleles in sporophytic species, in which SI of the haploid pollen is determined by the diploid parent, can be more complicated due to the complex dominance relationships that regulate SI in some of these species (Thompson and Taylor 1966; Prigoda *et al.* 2005; Billiard *et al.* 2007)]. In a seminal paper, Wright (1939) predicted that, for the model plant *Oenothera organensis*, a population of ∼500 individuals could sustain ∼13 SI alleles (see also Crosby 1966). As the number of predicted SI alleles would rise with increasing effective population size (which intuitively reduces genetic drift and hence extinction rates), many more SI mating type alleles might naïvely be expected in isogamous species, for which effective population sizes can be of the order of (Baranova *et al.* 2015).

While, in the above context, the plausibility of the mutation-extinction balance hypothesis might seem doubtful, an important biological feature of isogamous species with less prominence in plants is the potential to reproduce asexually. In many isogamous species, long periods of asexual reproduction are punctuated by rare bouts of facultative sex. For instance, in the single-celled green alga, *Chlamydomonas reinhardtii* [with two mating types (Goodenough *et al.* 2007)], recent genomic estimates have placed the rate of sexual reproduction to be once in every 770 asexual generations (Hasan and Ness 2018). Sex in yeast, which also typically have two mating types (Butler 2007), appears to be rarer still, with estimates of once in every 1000–3000 asexual generations reported in some species (Tsai *et al.* 2008).

In Hadjivasiliou *et al.* (2016), a computational model was used to show that long periods of asexual reproduction would lead to substantial drift in mating types frequencies. Using a population genetic model, Constable and Kokko (2018) demonstrated further that, as sex becomes increasingly rare, the relative strength of genetic drift over selection for novel mating types is amplified. This leads to a lower expected number of mating types in mutation-extinction balance [a similar observation has been made of Solanaceae gametophytic SI alleles (Vallejo-Marín and Uyenoyama 2008)]. This was true even in the absence of any species-specific selective mechanisms. A survey of available empirical data further supported the view that the rate of facultative sex is a key predictor of the number of mating types in isogamous species.

Analytic bounds on the number of expected mating types under a mutation-extinction balance were calculated in Constable and Kokko (2018), using the stationary distribution of a simple Moran-type model of mating type dynamics. In Czuppon and Rogers (2019), an approximation of the expected number of mating types was calculated under the assumption that sex is obligate. Muirhead and Wakeley (2009) used a similar model to calculate the stationary distribution of the frequency spectrum of mating type alleles; however, estimates on the number of expected mating types were not explicitly calculated. In addition, facultative sex was not modeled (although the mathematical analysis employed can account for this factor).

While such estimates of the number of mating type alleles in the stationary distribution are informative, they obscure the dynamic processes that drive and maintain this equilibrium: the establishment and extinction of alleles. These quantities are important for two reasons.

First, they are both independent of the arrival rate of new SI alleles. Null models of mating type dynamics often equate the arrival rate of new SI alleles with an effective mutation rate, and, in doing so, take a course grained approach to the genetic, molecular, and physiological mechanisms underlying mating type determination. However, this effective mutation rate is a confounding parameter, as it is difficult to estimate empirically. [A similar problem was encountered by Wright, who conceded that there could be argument about what a “reasonable” rate should be (Wright 1939)]. In contrast, the effective mutation rate affects neither the establishment probability of a mutant mating type nor the expected extinction time of a resident type. These quantities thus provide mutation rate independent measures of the evolutionary dynamics of mating types that can be used to test the feasibility of the mutation-extinction hypothesis.

Second, whereas the stationary distribution of mating type number can be observed only over evolutionary time periods, the establishment probability of new mating types and the mean extinction time of mating types can be observed over significantly shorter time periods. These quantities are thus important for providing empirically testable insight into the evolution of mating type number.

Extinction rates have previously been studied in related systems featuring negative frequency-dependent selection, which results in high levels of polymorphism. In Takahata (1990) a one-dimensional stochastic diffusion was used to identify the dynamics of the polymorphic major histocompatibility complex (MHC) system with a time rescaled neutral coalescent. This allowed for estimating the extinction rates of MHC alleles. These results were verified numerically in Takahata and Nei (1990) and later in Slatkin and Muirhead (1999). The same theoretical approach was adapted to gametophytic SI in plants, obtaining results showing that the time to the most recent common ancestor of a sex allele might even exceed the speciation time (Vekemans and Slatkin 1994). In the specific context of mating type alleles, such results have thus far been absent. However, recently, Czuppon and Rogers (2019) calculated the establishment probability of newly arising mating type for populations in which sex is obligate. We will rely on this quantity to obtain our estimate on the mean extinction time of resident mating types.

In this paper, we complement this literature by exploring scenarios of SI under facultative sex. We begin by calculating an analytic expression for the stationary distribution of the number of mating types, extending the results of Constable and Kokko (2018) (where only bounds of the mode of this distribution were calculated). We then provide an expression for the establishment probability of a novel mating type in facultatively sexual populations, generalizing the results of Czuppon and Rogers (2019). Finally, combining these expressions, we calculate the mean time to extinction of a resident mating type allele in a novel way. To be more precise, instead of relying solely on the one-dimensional dynamics of a focal mating type (in mating type frequency space) we make use of the dynamics on the number of mating types (in mating type number space). We conclude by discussing the biological implications of these results.

## Model

We consider a population in which the self-incompatible mating type of an individual is determined by one of an infinite set of potential alleles at a single locus. Such single locus determination systems are found, for example, in the social amoebae *Dictyostelium discoideum* [with three mating types (Bloomfield *et al.* 2010)] and *Didymium squamulosum* [with up to 12 mating types (Clark and Haskins 2010)]. The evolutionary dynamics are determined by a Moran process; generations are overlapping and the population is assumed to be at an ecological equilibrium at which the number of individuals, *N*, is constant over time. We denote the number of individuals carrying mating type *i* by such that .

Asexual reproduction happens with probability . In this case, a randomly chosen individual produces a clone that replaces one of the individuals in the population, also chosen uniformly at random.

During sexual reproduction, which occurs with probability , two randomly chosen individuals mate if they express different mating types. In the case of a successful mating, an offspring individual is produced that replaces one of the other individuals in the population. The mating type of the offspring is chosen at random from among the two parental alleles.

Additionally, we consider the emergence of novel mating types through mutations which occur at rate *m*. We implement mutations according to the infinitely many alleles model (Kimura and Crow 1964), where the mutated individual expresses a completely new mating type not previously present in the population. For simplicity, we consider the case in which mutation events are decoupled from reproduction. Mutants possess the same characteristics as any other mating type in the population; that is, they are self-incompatible and mate with nonself types at the same *per capita* rate as the resident types.

Mathematically implementing the model described above, the probability per unit time for a type *i* to increase by one, and a type *j* to decrease by one, through a birth–death event is given by(1)The reproduction term is split into an asexual component (the first term) and a sexual component (the second term). We note that the sum in the sexual reproduction term goes over all non-*i* mating types present in the population, generating a reproductive advantage to rare mating types. The probability per unit time that a novel mating type *i* is generated from an ancestral mating type *j* is given by(2)which can only occur when type *i* is not already present in the population. As described in Equation S6 in the Supplemental Information, the full expression for the probability transition rate given in Equation 2 features an additional normalization constant, included to ensure that the mutation rate is independent of the population composition (*i.e.*, that ).

The deterministic limit of this model is described by a system of ordinary differential equations, where the dynamics (neglecting mutations) of a the mating type, with frequency is given by(3)This system has been studied in Iwasa and Sasaki (1987) (see their Mating kinetics I). The dynamical system possesses an internal stable fixed point where all mating types are present at equal frequencies. Analogous finite population size models lead to stochastic differential equations, and have been analyzed in Czuppon and Rogers (2019). Note that, in that paper, mutation was implemented in a distinct manner, with the assumption that mutation occurs *with* reproductive events. While that implementation is biologically more reasonable, the choice leads to only minor quantitative differences in dynamics.

### Data availability

The authors state that all data necessary for confirming the conclusions presented in the article are represented fully within the article. The code and simulation outputs can be found on github: https://github.com/gwaconstable/InvExtDynMatTypes. Supplemental material available at Figshare: https://doi.org/10.25386/genetics.8099852.

## Results

In this section, we mathematically analyze the null model just presented. We begin by characterizing the long-time equilibrium behavior of the model, which is described by its stationary distribution. This allows us to answer the key question of how many mating types, *M*, are predicted by the null model as a function of the population size *N*, mutation rate *m*, and, importantly, relative rate of asexual reproduction, *c*. However, while this results in an important benchmark for the expected number of mating types in real populations, it provides little insight into the dynamics of mating type number. To this end, we further calculate the probability that a novel mutant mating type establishes in the population and the expected extinction time of a resident mating type allele. These quantities provide a deeper insight into the ongoing dynamics of mating type number in real populations.

### The stationary distribution of individuals carrying each mating type allele

We denote by the stationary distribution of the number of individuals of each mating type. Constable and Kokko (2018) showed that an analytic solution for is accessible as the probability transition rates in Equations 1 and 2 can be decomposed into the product of a birth function and death function that each depend only on the number of each mating type reproducing and dying respectively;(4)where(5)Under this decomposition, the stationary distribution of mating type alleles takes the exact form(6)where is the vector n reordered with its largest entries first, N is a normalization constant that enforces and is defined as(7)The full derivation of Equation 6 is given in Constable and Kokko (2018). In terms of birth–death processes, it is interesting to note that Equation 6 is the product of standard stationary distributions obtained for a two-allele (single variable) population (Karlin and Taylor 1975), moderated by a series of effective population sizes. A similar observation has been made in a related model of multi-allelic selection (Muirhead and Wakeley 2009).

### The stationary distribution of the number of mating type alleles

The stationary distribution of populations with *M* mating types is defined as . This gives the probability of finding a population with a given set of parameters (*N*, *c*, and *m*) in a state with *M* mating types at long times. is related to the distribution through the following summation;(8)where is the set of all vectors n that represent a population with *M* present mating types (that is, all vectors n with *M* nonzero elements).

While Equation 8 can be expressed neatly, evaluating this quantity is problematic as it involves summations of Equation 6 over sets of the infinite vector n. To make analytic progress, we note that, when the population size *N* is large and the per-generation mutation rate is small, at intermediate times the population resides in a quasi-stationary distribution in the region of a fixed point of the dynamics in the deterministic limit. More precisely, when the population is comprised of *M* mating types, this fixed point is given by for each of the *i* present mating types, and otherwise. Following an extinction or mutation event (which changes the number of mating types in the population), the population quickly relaxes to a new quasi-stationary distribution in the region of an alternate fixed point with mating types, *i.e.*, . Within the limit of large *N* and small then, can be approximated by a superposition of these quasi-stationary distributions.

Briefly, the quasi-stationary distribution of the population in the region of a fixed point can be calculated by conducting a diffusion approximation on the underlying Moran model and linearizing the resultant advection–diffusion equation about a deterministic fixed point (the van Kampen approximation). The full calculation is conducted in the Supplemental Information, where we show that these quasi-stationary distributions are Gaussian (see Equations S18–S22).

Each of the quasi-stationary distributions can now be renormalized, using Equation 6 to “pin” the height of each quasi-stationary distribution to the height of Equation 6 in the region of the deterministic fixed point [see Hufton *et al.* (2016) and Vasconcelos *et al.* (2017) for similar approaches]. The full calculation is detailed in Section S3 in the Supplemental Information. Substituting the resulting approximation for into Equation 8, and taking the limit of large *N*, we find(9)where ℳ is a normalization constant, such that and(10)Note that when sexual reproduction is obligate, and , while when sex is facultative and rare, θ becomes large.

Comparing our approximate expression for the distribution of the number of mating types, Equation 9, against the results of stochastic simulation of the population, we find excellent agreement (see Figure 1). We note that, in a facultatively sexual population, where the role of genetic drift is amplified by high clonal reproduction rates, *c*, extinctions can even lower the number of mating types to one (see Supplemental Information, Section S8). In this scenario, no further sexual reproduction events can take place, and the population reproduces purely asexually until a new mating type is generated by mutation.

### The mode number of mating types

Since the distribution is unimodal, obtaining an estimate for its mode is straightforward numerically. We define by the ratio of probabilities of having *M* and mating types;(11)For any given set of parameters, this ratio is independent of the normalization constant ℳ. Substituting Equation 9 into Equation 11, we obtain a simplified analytic expression for (see Supplemental Information, Equation S42). Since the distribution is unimodal, finding its mode is equivalent to finding the value of *M* for which starts to decrease. The approximate mode of , which we denote , can then be obtained as the solution to the equation(12)The function has a single root, and, thus, solving for is numerically straightforward.

In Figure 2, we plot the mode number of mating types as a function of the per-generation mutation rate, and the population size, *N*, for four different rates of asexual to sexual reproduction. Here, we see that facultative sex has a strong influence over the number of mating types, with even a ratio of asexual to sexual divisions (Figure 2b) reducing the expected number of mating types by an order of magnitude compared to the obligately sexual scenario (Figure 2a).

### The establishment probability of a new mating type allele

Having approximated the mode of the number of mating types, we now proceed to study the evolutionary dynamics of the number of mating types. We begin by computing the probability of a successful establishment, , of a newly arising mutant in a population of *M* resident mating types. We define as the probability that this novel mating type allele (initially at a frequency ) reaches the new stationary frequency of mating types before any of the resident mating types goes extinct.

For the deterministic equilibrium is an internal fixed point. This is crucial for our approximation of the establishment probability, since we will identify it by the survival probability of a corresponding branching process (see Section S5 in the Supplemental Information). Furthermore, we assume that , *i.e.*, the population size is sufficiently large that the resident mating types are not quickly lost through genetic drift. Under these assumptions, we can extend the computation from Czuppon and Rogers (2019) for obligate sexual reproduction, and find(13)In the case of obligate sex , reduces to , recovering the result found in Czuppon and Rogers (2019). For facultative sex we see that the establishment probability is reduced. Since asexual reproduction increases the time to reach the stationary state in the deterministic system (*i.e.*, the selection strength for even mating type ratios is reduced, see also Equation 3), newly arising mutants spend more time at low frequencies where they are susceptible to purging by genetic drift. This is reflected by decreasing with *c* (see Figure 3). Our definition of the establishment probability includes that none of the resident mating types become extinct during the invasion process. This impedes a straightforward comparison with the case of obligate asexual reproduction, (*i.e.*, a neutral Moran model). The corresponding establishment probability might naïvely be assumed to be ; however, this value ignores the survival of all resident mating types, and, therefore, overestimates the actual establishment probability.

### Mean time until the extinction of a mating type allele

Our final goal is to calculate the mean extinction time of a mating type allele. We assume that, initially, the population is close to its equilibrium (*i.e.*, all mating types are approximately at equal frequencies). We then model the arrival of mating types (through mutation events) and their extinction (due to genetic drift) as a birth–death process on the number of mating types. Let be the probability per unit time that a population with *M* resident mating types gains a new mating type, and be the probability per unit time that a mating type goes extinct.

Although we are interested in calculating the time between extinction events, we first turn our attention to the stationary probability distribution of this effective “birth–death” process in the number of mating types; . Solving the equations for the stationary distribution of the effective birth–death process, we obtain the well-known result (see for instance, Allen 2011, Chapter 6)(14)Rearranging for , we find(15)where we have used the definition of given in Equation 11. The mean time between extinction events is the inverse of this effective death rate, *i.e.*,(16)where the factor *N* accounts for the fact that we are measuring the time until an extinction event in time units of generations. Further, we note that, since is determined using (which itself accounts for covariances in allelic frequencies in the stationary state), our expression for captures the effect of allele frequency fluctuations in all *M* types.

As we have already analytically calculated (see Equation 11), all we now need to evaluate the mean time until the extinction of a mating type allele is an expression for (see Equation 16). We assume that the effective “probability birth rate,” (the probability per unit time that the number of mating types in the system increases) is given by the product of the mutation rate and the establishment probability of a mutant allele (see Equation 13):(17)This approximation relies on the frequencies of the resident types being close to their deterministic equilibrium (an assumption made in the derivation of ). In general, this will be true if the state containing *M* types is sufficiently stable and if mutations are occurring sufficiently infrequently that the population has time to relax to this quasi-stationary state. If these conditions are not met, it is possible that a new mutant mating type may arise in a population with a highly uneven mating type distribution, where will no longer provide an accurate estimate of the establishment probability. We discuss these points further below.

Inserting this effective birth rate into Equation 16, and substituting from Equation 11 (see Equation S42), we can express the mean extinction time as(18)We first note that, as we would expect, Equation 18 is independent of the mutation rate of novel mating type alleles (the mean time until the extinction of a mating type allele should not depend on the time until a new mating type allele arrives). Further, we can see that, when the population size, *N*, is large, the mean time to extinction is dominated by the term raised to the power of *N* in Equation 18 [explaining the linear growth of for large *N* in Figure 4]. Although Equation 18 is lengthy, it is useful as it allows us to quickly evaluate the mean extinction time for arbitrary parameter values. We can then rapidly explore parameter regimes that would be prohibitively time consuming to simulate (see Figure 5).

The expression derived in Equation 18 becomes increasingly accurate with increasing population size, *N*, decreasing rates of asexual reproduction, *c*, and smaller numbers of initial mating types, *M*. The approximation can break down, however, when either *N* is small, or *c* or *M* are large. In this latter range, as the population dynamics become increasingly dominated by genetic drift, various assumptions involved in the derivation of Equation 18 can become invalid. Most importantly, the Gaussian approximation for the quasi-stationary distribution of a focal mating type breaks down. The true distribution becomes increasingly flat, while the Gaussian approximation does not respect the boundary conditions requiring non-negative number of mating types . Since this results in erroneous approximations of , (the crucial ingredient calculating Equation 14) when the number of mating types is highly unstable (*i.e.*, when ), we do not expect our analytically derived mean extinction time to provide reasonable results in this regime. This reasoning is assessed in more detail and validated in the Supplemental Information (see Section S6).

We now seek a more quantitative measure of when we expect Equation 18 to remain valid. We first note that, naturally, should increase monotonically with population size, and decrease monotonically with the rate of asexual reproduction and the number of resident mating types. However, on evaluating Equation 18, we find that these expectations are violated as the relative strength of genetic drift increases (*i.e.*, as the initial number of mating types, , becomes small), or, conversely, as the strength of selection for equal numbers of mating types decreases (*i.e.*, *c* becomes large). In these regimes, the dynamics of the mating type alleles approach neutrality, and extinction time is better approximated by standard results on the neutral multi-allelic Moran model Baxter *et al.* (2007) (see also Equation S51 in the Supplemental Information);(19)Combining and , the mean time to extinction of a mating type allele can then be approximated by(20)In Figure 4, we can see that this captures the results of simulations very well.

In Figure 5, we plot predicted extinction times for regions of parameter space that are prohibitively time consuming to investigate numerically. The gray regions that dominate the parameter space for low asexual reproduction rates depict extinction rates > generations. To set this in context, we compare this to the evolutionary history of fungi. Fungi first evolved ∼1.5 billion years ago (Wang *et al.* 1999). Assuming ∼100 generations a year, a conservative estimate considering the doubling time of yeast, which is ∼90 min, this leads to an order of magnitude guess of generations for the entire evolutionary history of fungi. Hence, in the context of obligate sex (see Figure 5 top-left panel), mating types alleles are expected to remain fixed in a population while their number remains below six. This condition is severely relaxed in populations in which sex is facultative (see, for example, Figure 5, bottom-right panel). Even though extinction times remain high for realistic effective population sizes, some turnover of mating types might be expected over long time periods, *e.g.*, the evolutionary history of fungi. Moreover, this observation indicates that mating types loss is most likely be driven by extreme population bottlenecks decreasing the effective population size *N*.

## Discussion

The evolutionary mechanisms that drive the number of mating type alleles observed across species has been the topic of numerous theoretical studies. Our analysis in the context of haploid SI adds new results and perspectives to the evolutionary dynamics of this type of balancing selection.

A specificity of mating type SI, as studied here, when compared to gametophytic SI, as prevalent in plants, is the possibility to reproduce asexually. Our findings support and extend the previous result that switching between an asexual and a sexual life cycle significantly reduces the number of mating types in a population (Constable and Kokko 2018). Empirically, available data appears to support the view that more frequent sex is correlated with more mating types [see Table 1, and again Constable and Kokko (2018)]. However, we are hampered from making any strong empirical claims in this area as a result of a paucity of empirical estimates of rates of sex in natural populations. In most species, such estimates are absent; however, with new methodologies and understanding for estimating the rate of sex arising increasingly frequently (Hartfield *et al.* 2018; Nieuwenhuis *et al.* 2018; Ennos and Hu 2019), it is our hope that this gap in the literature will soon be filled.

In Table 1, we test our model quantitatively against four species where estimates for the rate of sex are available. We compare the number of mating types observed empirically with the mode number predicted theoretically. Since estimates for the mutation rate of new mating type alleles and the effective population size parameters are difficult to obtain, we considered a range of values. We find that, while facultative sex explains much of the variation in mating type number, there are quantitative disagreements. In particular, for a range of parameters (particularly large effective population size), the number of mating types is overestimated in *Saccharomyces cerevisiae*, *C. reinhardtii* and Tetrahymena, while it is underestimated in *S. commune*.

Given the simplicity of the model that we have proposed, these quantitative disagreements are not unexpected. An interesting open problem is to explore other null models that are capable of explaining these discrepancies. Essentially this task translates to identifying additional mechanisms that decrease selection for more mating types (where their number is overestimated) or increase this selection strength (where their number is underestimated). There exist a number of biologically reasonable potential candidates.

In Equation 1, we have assumed mass action encounter rate dynamics between gametes, leading to a linear relationship between a mating type’s frequency and its probability of finding a sexual partner [*i.e.*, ]. This implementation neglects active mate search. In reality, in species such as *C. reinhardtii* or the diatom *Ditylum brightwellii*, which have developed active methods to increase the encounter rate between complementary mating types (Waite and Harrison 1992; Snell and Goodenough 2009), this term might be more accurately described by a decreasing concave up function (see, for example, Ashby and Gupta 2014). This would qualitatively recapitulate the results of Iwasa and Sasaki (1987) (Mating Kinetics 3 and 4), and lead to a decrease in the predicted number of mating types.

Further, our model assumes that a mating type is determined by a single locus. Thus, our model is restricted to bipolar mating type systems, as opposed to tetrapolar systems, where the mating type is determined by alleles at two loci (Nieuwenhuis *et al.* 2013). We expect that additional loci may lead to larger numbers of mating types maintained in the population (as mating type allele combinations can be regenerated through recombination, the extinction rate of mating types will be reduced). Indeed, this would be consistent with the larger number of mating types empirically observed in the tetrapolar *S. commune*, as well as the general observation that tetrapolarity is associated with an increase in mating type diversity (Nieuwenhuis *et al.* 2013). While it is interesting to note that the number of alleles predicted by our model capture the number of *A* and *B* SI complexes in *S. commune* separately (see Table 1), theoretically addressing the mutation-extinction hypothesis for tetrapolar systems in a systematic way, along with the effects of encounter rate dynamics, will be interesting areas for future investigations.

Thus far, we have focused on modeling considerations that may well strengthen the mutation-extinction balance hypothesis. Of course, non-neutral differences between mating types must be considered as well, and a pluralistic view that incorporates the remaining hypotheses for observed mating type numbers may be necessary. For instance, we have not accounted for the fact that newly arising mutants are unlikely to be fully compatible with (or equally as fit as) resident mating types, a biologically important consideration (Power 1976; Hadjivasiliou and Pomiankowski 2016; Krumbeck *et al.* 2019) that would lower the number of types relative to our idealized estimate. In addition, we have not considered how forms of homothalism (*i.e.*, the evolution of self-compatibility) may affect our results. For instance most yeasts have evolved the ability to switch mating types between sexual generations (Nieuwenhuis and Immler 2016; Nieuwenhuis *et al.* 2018), while, among ciliates, some species exhibit probabilistic mating type expression (Paixão *et al.* 2011). Indeed, as demonstrated in Hadjivasiliou *et al.* (2016), the rapid loss of mating type diversity when sex is rare and local population sizes are low (see Figure 5) can drive selection for mating type switching in order to maintain the capacity for sexual reproduction. Such factors make a comparison of our model with these species problematic. Further mechanisms that may limit the number of mating types that are not included in our model, including UPI and costly mate search, are reviewed in Billiard *et al.* (2011).

Our model for haploid SI is conceptually similar to extensively studied gametophytic SI systems observed in plants. Yet, we take a novel mathematical approach in estimating the number of mating types supported by a finite population. Instead of using the extinction boundaries of the one-dimensional diffusion approximation of a focal mating type as in most previous studies (Wright 1939, 1960, 1964; Ewens 1964; Yokoyama and Nei 1979; Yokoyama and Hetherington 1982; Vallejo-Marín and Uyenoyama 2008), we study a birth–death process on the total number of mating types. Utilizing just the local description of the stationary distribution around an interior stable fixed point, we circumvent the problem of the diffusion approximation for systems with a stable deterministic behavior being inaccurate at the boundaries (Assaf and Meerson 2017). This leads to robust predictions for the stationary probability distribution (Figure 1), and, importantly, the mean extinction time of a mating type allele, (Figure 4).

In calculating this mean extinction time, we deviate from the approaches taken before that rely on the one-dimensional dynamics of a focal mating type (Takahata 1990; Vekemans and Slatkin 1994). This previous method compares the one-dimensional diffusion to a time-rescaled neutral coalescent, thus obtaining estimates for the diversification rate (*i.e.*, the establishment rate of a novel mating type). This comparison is conducted by assuming constant selective strength. In our model, this would be equivalent to assuming that the term is constant. However, with a varying number of present mating types (and indeed, also with fluctuations in the mating type frequencies themselves), this value is not constant in time. Hence, a comparison with a time-rescaled neutral coalescent does not seem appropriate since the selection strength would need to be reassessed after each coalescence event that is associated with a loss of a mating type. This variation in selection strength has been empirically observed by analyzing the genome of various *Coprinus cinereus* (mushroom fungus) populations worldwide, which shows that selective forces in the mating type dynamics are dependent on the number of distinct lineages (May *et al.* 1999); see also Richman (2000) for other examples in the context of balancing selection. Our technique, explicitly computing the mean extinction time by the effective birth–death process on the number of mating types, avoids this problem. Since the varying selective strength enters in both the establishment probability and the stationary distribution, our computed extinction time accounts for the fluctuating selective pressure.

These explicit estimates of establishment probabilities and typical extinction times yield new insight into the dynamics that drive the low number of mating types predicted by our model in the rare sex regime. One might initially suspect that these numbers are the result of a high turnover of mating type alleles, *i.e.*, frequent extinction and invasion events. However, we find that, in fact, they are a result of very low invasion probabilities for novel mating types, combined with rapidly decreasing extinction times as a function of resident mating type diversity. It is worth mentioning that, although our approximations for the extinction time break down when the resident state becomes highly unstable (*i.e.*, when ), these represent states that would be very short-lived in nature, and thus are not relevant from a biological perspective.

Finally, in Table 1, we link our theoretical observation of small numbers of mating type alleles yielding very large extinction times (see Figure 4) to some species examples. Indeed, the large extinction times found by the quantitative study are in line with previous empirical observations of long terminal branches in allelic genealogies under negative-frequency dependent selection, *e.g.*, in fungi (May *et al.* 1999) and Solanaceae (Uyenoyama 1997). Similar to previous simulation studies (Slatkin and Muirhead 1999; Gervais *et al.* 2011), we find that the number of resident mating types strongly influences the diversification rate; the larger the resident number, the lower the diversification rate. This slowdown is ubiquitously observed in natural systems under balancing selection, such as SI in fungi (May *et al.* 1999) or Solanaceae (Uyenoyama 1997) and MHC-systems (Solberg *et al.* 2008). We find that long terminal branches in SI systems, corresponding to old mating type allele ages [with some being older than the corresponding species age (Richman 2000)] emerge naturally under balancing selection for two reasons: (i) the stability of the internal equilibrium can lead to extremely large extinction times of alleles, thus enabling such long branches; and (ii) the lower establishment rate due to a large number of resident SI alleles slows down the diversification rate, and, hence, decreases the number of newly established mating types.

In conclusion, we analyzed the evolutionary dynamics of self-incompatible mating types in facultatively sexual isogamous species. Our results refine previous estimates on the number of mating types maintained in a finite population as well as on the establishment probability of a newly arising mating type allele. Furthermore, we have used these results to compute the mean extinction time of a focal mating type via an effective birth–death process describing the number of mating types. This estimate naturally incorporates variation in selection strength due to varying numbers of resident mating types, a fact that previous studies have failed to incorporate. We are therefore able to qualitatively explain the empirically and numerically observed slowdown of allelic diversification rates in populations under balancing selection. The here presented methodology is theoretically extendable to other systems exhibiting negative-frequency dependent selection.

## Acknowledgments

We thank the Department of Evolutionary Theory at the Max Planck Institute for Evolutionary Biology in Plön for inviting G.W.A.C., resulting in this collaboration. We also thank Deborah and Brian Charlesworth for their insightful feedback when presenting this work. We are also grateful to Sylvain Billiard for directing our attention toward diversification rates of mating types and for comments on the manuscript which increased its readability. Lastly, the suggestions of two anonymous reviewers helped to clarify certain aspects of the manuscript; most notably this improved the discussion around Equation 18. P.C. received funding from the Agence Nationale de la Recherche (ANR), grant number ANR-14-ACHN-0003-01 provided to Florence Débarre. G.W.A.C. thanks the Leverhulme Early Career Fellowship provided by the Leverhulme Trust for funding.

## Footnotes

Supplemental material available at Figshare: https://doi.org/10.25386/genetics.8099852.

*Communicating editor: A. Agrawal*

- Received May 9, 2019.
- Accepted August 4, 2019.

- Copyright © 2019 by the Genetics Society of America