## Abstract

Classical population genetic theory generally assumes either a fully haploid or fully diploid life cycle. However, many organisms exhibit more complex life cycles, with both free-living haploid and diploid stages. Here we ask what the probability of fixation is for selected alleles in organisms with haploid-diploid life cycles. We develop a genetic model that considers the population dynamics using both the Moran model and Wright–Fisher model. Applying a branching process approximation, we obtain an accurate fixation probability assuming that the population is large and the net effect of the mutation is beneficial. We also find the diffusion approximation for the fixation probability, which is accurate even in small populations and for deleterious alleles, as long as selection is weak. These fixation probabilities from branching process and diffusion approximations are similar when selection is weak for beneficial mutations that are not fully recessive. In many cases, particularly when one phase predominates, the fixation probability differs substantially for haploid-diploid organisms compared to either fully haploid or diploid species.

- fixation probability
- Moran model
- Wright–Fisher model
- haploid-diploid life cycle
- variance effective population size

Classical genetic theories generally assume either a fully haploid (haplont) or fully diploid (diplont) life cycle (*e.g.*, Crow and Kimura 1970). Organisms often exhibit more complex sexual life cycles, however, with both free-living haploid and diploid stages (haploid-diploid life cycle) (Mable and Otto 1998). For example, marine macroalgae generally exhibit an alternation of generations between a haploid-gametophyte stage and a diploid-sporophyte stage (Bell 1994, 1997). Furthermore, haploid and diploid life stages can differ substantially in morphology. For example, diploid-dominant species (*e.g.*, Laminariales) have large diploid sporophytes and small haploid gametophytes, with the reverse found in haploid-dominant species (*e.g.*, *Cutleria multifida*) (Hirose 1975; Bell 1994, 1997).

Despite haploid-diploid life cycles being common in several groups of organisms, classical population genetic quantities, such as the fixation probability and the effective population size, have not been determined for haploid-diploid organisms. The fixation probability is fundamental to predicting the rate of adaptation of a species with an alternation between free-living haploid and diploid generations. It is also important for predicting the ability of organisms to persist in a changing environment and for predicting how patterns of molecular evolution depend on the life cycle. Indeed, a total of 5480 articles mention the probability of fixation in a genetic context [Google Scholar search, September 10, 2016: (“fixation probability” ‖ “probability of fixation”) (gene‖genetic‖allele‖mutation‖mutant)], in contexts ranging from the interpretation of sequence data, to the prediction of quantitative trait variation, to the spread of disease resistance, *etc*.

In addition, the fixation probability is needed to model the evolution of the life cycle itself (*i.e.*, the relative proportion of haploidy *vs.* diploidy), when one accounts for several of the complexities seen in haploid-diploid species such as coexistence of haploids and diploids in a population (*e.g.*, Destombe *et al.* 1989; Hiraoka and Yoshida 2010) and differences in spatial distribution and dispersal ability (*e.g.*, Bell 1997). In a related article, we model the dynamics of a modifier allele controlling the life cycle (Mable and Otto 1998), explicitly tracking the demographic and spatial dynamics of haploids and diploids (K. Bessho and S. P. Otto, unpublished data). This related work requires the fixation probability in haploid-diploid species, as derived in the current article.

Given the fundamental importance of the fixation probability in evolutionary theory, there is a surprising lack of stochastic analyses describing the fate of selected alleles with haploid-diploid life cycles, compared to the many analyses on exclusively haploid or diploid life cycles (*e.g.*, Crow and Kimura 1970; Ewens 2004). To fill this gap, we develop genetic models that consider the population dynamics for haploid and diploid generations, deriving the fixation probability for a selected allele in such species. These results are further evaluated by comparison to numerical simulations.

## Model

To calculate the fixation probability, we first extended the classical population genetic models, the Moran model and the Wright–Fisher (WF) model, to account for free-living haploid and diploid phases. In these models, we track the haploid and diploid population dynamics and changes in allele frequencies under the assumption of a one-locus genetic model with two alleles (resident allele *R* and mutant allele *M*). We assume the following: (1) that the total population size *N* is fixed, (2) that there is an obligate alternation of generations between haploid and diploid phases, (3) that the species is monoecious with equal investment in male and female gametes, and (4) that mating is random.

Because the number of individuals with genotype (*GT*) is a Markov process over time in both Moran and WF models, we let represent the number at time *t* as a random variable, and we let represent the realized number in a given generation. The index (*GT*) indicates the genotype of individuals or cells [(*GT*) = *R* or *M* for haploids and (*GT*) = *RR*, *RM*, or *MM* for diploids]. Summed over all genotypes, the population size is *N* . Parameters and variables in our model are represented in Table 1.

### Moran model

In the Moran model, at every time step, death, reproduction, fertilization, and recruitment occur. First, one individual is randomly eliminated from the population in proportion to its mortality rate This individual is then replaced by a birth. At reproduction, diploids produce haploid spores at rate and haploids produce female haploid gametes at rate [and male gametes at rate The female gametes can be fertilized with male gametes with a fertilization success of where we assume that male gametes are not limiting. All haploid spores and fertilized diploid zygotes are candidate reproductive cells to replace the dead individual and are equally likely to be recruited.

### Wright–Fisher (WF) model

We next consider a different type of population dynamics using the WF model. In this model, all diploids produce a number of spores, and haploids produce female gametes As before, the female gametes are fertilized with male gametes with a fertilization success rate of After the end of each time step, all parents are eliminated from the population, and the next generation is constructed by randomly sampling *N* individuals in proportion to the number of haploid spores and diploid zygotes produced by the previous generation (see Equation D1).

### Individual-based simulations

To check the accuracy of the analytical results, we simulated the dynamics numerically. We started the system fixed on the R allele, assuming that the system has reached ecological and demographic equilibrium (*i.e.*, in the absence of selection, see Appendix A and Appendix D for details), with haploids and diploids (rounded to have integer numbers of individuals) and with giving the frequency of haploids in the population (see Equation A8b for the Moran model and Equation D2 for the WF model). Next, we mutated one resident allele at random and simulated until this allele was fixed or lost in the population. To estimate the fixation probability, we repeated the process and estimated the probability from the resulting frequency of fixation.
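The simulation procedure described above can be sketched in a few lines for the WF variant. This is a minimal illustration, not the authors' Mathematica code from File S1. The names `wf_step`, `run_until_absorbed`, and `estimate_fixation`, the dictionaries `w` (per-capita output of spores or female gametes) and `n` (genotype counts), the genotype-independent fertilization success `f`, and the assumption that male gamete output is proportional to female gamete output are all choices made here for illustration. The initial number of haploids `n_hap` is passed in directly rather than computed from Equation D2.

```python
import random

HAP = ("R", "M")
DIP = ("RR", "RM", "MM")


def wf_step(n, w, f, N, rng):
    """One generation: all parents die, N recruits are sampled from the pool."""
    # Spores from diploids; RM diploids transmit R and M equally (fair meiosis).
    spores = {
        "R": w["RR"] * n["RR"] + 0.5 * w["RM"] * n["RM"],
        "M": w["MM"] * n["MM"] + 0.5 * w["RM"] * n["RM"],
    }
    # Female gamete output of haploids (assumption: male output is proportional).
    gam = {g: w[g] * n[g] for g in HAP}
    total_gam = gam["R"] + gam["M"]
    zyg = {g: 0.0 for g in DIP}
    if total_gam > 0:
        q = gam["M"] / total_gam  # mutant frequency among male gametes
        # Factor f/2: fertilization success and half of resources spent on males.
        zyg["RR"] = 0.5 * f * gam["R"] * (1.0 - q)
        zyg["RM"] = 0.5 * f * (gam["R"] * q + gam["M"] * (1.0 - q))
        zyg["MM"] = 0.5 * f * gam["M"] * q
    pool = {**spores, **zyg}
    types = list(pool)
    draws = rng.choices(types, weights=[pool[t] for t in types], k=N)
    return {t: draws.count(t) for t in types}


def run_until_absorbed(N, n_hap, w, f, mutant, rng):
    """Return True if allele M fixes, False if it is lost."""
    n = {"R": n_hap, "M": 0, "RR": N - n_hap, "RM": 0, "MM": 0}
    if mutant == "M":      # mutation arises in a haploid
        n["R"] -= 1
        n["M"] += 1
    else:                  # mutation arises in a diploid (one allele mutated)
        n["RR"] -= 1
        n["RM"] += 1
    while True:
        if n["M"] + n["RM"] + n["MM"] == 0:
            return False
        if n["R"] + n["RM"] + n["RR"] == 0:
            return True
        n = wf_step(n, w, f, N, rng)


def estimate_fixation(N, n_hap, w, f, reps, seed=0):
    """Fraction of replicate runs in which a single new mutation fixes."""
    rng = random.Random(seed)
    n_alleles = n_hap + 2 * (N - n_hap)  # diploids carry two alleles each
    fixed = 0
    for _ in range(reps):
        mutant = "M" if rng.random() < n_hap / n_alleles else "RM"
        fixed += run_until_absorbed(N, n_hap, w, f, mutant, rng)
    return fixed / reps
```

A call such as `estimate_fixation(100, 60, w, 1.0, reps=10000)` returns the observed fraction of runs ending in fixation. Its sampling error shrinks roughly as one over the square root of `reps`, so small fixation probabilities need many replicates.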

### Data availability

The analysis, numerical simulations, and scripts to generate the original figures were coded in Wolfram Mathematica 10 (Supplemental Material, File S1), available for download from the Dryad Digital Repository (http://doi.org/10.5061/dryad.n4k76).

## Results

### Fixation probability

#### Fixation probabilities in the Moran model using a branching process approximation:

We first develop a branching process approximation (*e.g.*, Haldane 1927; Yeaman and Otto 2011) for haploid-diploid life cycle species. Specifically, we calculate the fixation probability of a single allele that appears in the haploid phase (genotype *M*), and the fixation probability for a single allele that appears in a heterozygous diploid (genotype *RM*), after which we calculate the average probability of fixation for an allele that appears randomly in any individual. When using a branching process approach, the fates of mutations in different individuals are assumed to be independent of one another, which implicitly requires that the population size is large and that mutant alleles are lost or established while rare. Under these assumptions, we obtain approximations for the fixation probabilities (Appendix B):(1a)(1b)where we define the expected reproductive rate the average fertilization rate for mutant gametes (averaged over cases where the mutant is carried by the sperm and when it is carried by the egg), the expected reproductive rate of a haploid mutant and the expected reproductive rate of a haploid resident Observe that the expected reproductive rate of haploids is scaled by the average value of for the combining gametes divided by two, which accounts for the costs of sex (specifically for the possibility that not all female gametes are fertilized and for the fact that haploids must invest half of their reproductive resources in male gametes, which are unable to reproduce on their own). Equations 1a and 1b apply as long as the fixation probabilities and are positive, which occurs when the average mutant fitness is greater than the resident, where the relevant average across the haploid-diploid life cycle requires This condition agrees with the invasion condition in a deterministic model (K. Bessho and S. P. Otto, unpublished data). 
When haploids and diploids have similar reproductive rates the fixation probability of a mutation appearing first in a haploid tends to be larger than a mutation appearing first in a diploid because all of the haploid mutant offspring bear the mutant allele, whereas only half of the offspring of an *RM* diploid do. As the cost of sex rises, however declines relative to the probability of fixation becomes increasingly more likely when the mutant first appears in a diploid, thereby delaying this cost.
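For readers unfamiliar with the branching process approach, the classical single-type case (Haldane 1927) gives a useful reference point. The sketch below is not Equations 1a and 1b, which involve two coupled classes (haploid and diploid carriers). It assumes a single class of carrier whose offspring number is Poisson distributed with mean `mean_offspring`, and it solves for the establishment probability by fixed-point iteration on the extinction probability. The function name and its arguments are illustrative.

```python
import math


def establishment_probability(mean_offspring, tol=1e-13, max_iter=1_000_000):
    """Probability that a single-type Poisson branching process never dies out.

    The extinction probability q is the smallest root of q = exp(m * (q - 1)).
    Iterating from q = 0 converges monotonically to that smallest root.
    """
    if mean_offspring <= 1.0:
        return 0.0  # a critical or subcritical lineage is lost with certainty
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp(mean_offspring * (q - 1.0))
        converged = abs(q_next - q) < tol
        q = q_next
        if converged:
            break
    return 1.0 - q


if __name__ == "__main__":
    for s in (0.01, 0.05, 0.2):
        u = establishment_probability(1.0 + s)
        print(f"s = {s}: u = {u:.5f}, 2s = {2 * s:.5f}")
```

For small `s` the result approaches Haldane's approximation of 2*s*, and the gap widens as selection becomes stronger. This is the same pattern the text describes when comparing the strong-selection and weak-selection branching results.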

The average fixation probability of a new mutation that arises uniformly at random among the individuals present in the resident population, can then be defined as:(2)where is the equilibrium proportion of haploids in the resident population (A8b) and where the “2” accounts for the fact that diploids carry two alleles that could have potentially mutated. Equation 2 allows the fixation probability to be calculated for arbitrarily strong selection, but it will be accurate only as long as the fixation probabilities (1) are not negative.
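One natural reading of the averaging described above can be written explicitly. The symbols are assumptions introduced here because the original notation is not recoverable from this text: $u_H$ and $u_D$ denote the fixation probabilities for a mutation first appearing in a haploid or in a heterozygous diploid, and $\hat{\rho}_H$ denotes the equilibrium proportion of haploids. Weighting each class by the number of alleles it carries gives:

```latex
\bar{u} \;=\; \frac{\hat{\rho}_H \, u_H \;+\; 2\,(1-\hat{\rho}_H)\, u_D}
                   {\hat{\rho}_H \;+\; 2\,(1-\hat{\rho}_H)}
```

Under this reading, the denominator is the total number of alleles per individual in the population, so $\bar{u}$ is a proper average over all alleles that could have mutated.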

#### Fixation probabilities in the Moran model using a diffusion approximation:

Next, we derive the fixation probability of a selected mutant allele in a haploid-diploid population using Kolmogorov’s backward equation, given the first and second moments for the fluctuations in allele frequencies. This diffusion equation provides an adequate approximation for the fixation probability even for small populations and deleterious mutations (*e.g.*, Kimura 1957), unlike a branching process approximation. Rather than analyzing the genotype frequencies, *x*_{(}_{GT}_{)}, we transform the model into a new set of variables that allow us to perform a separation of timescales, assuming that selection occurs on a slower timescale and that ecological and demographic processes rapidly bring the system to a “quasi-equilibrium” (see Appendix C). This approach allows the dynamics of the mutant allele to be described by moments that depend only on the parameters and the current allele frequency (Appendix C).

Specifically, we let the allele frequencies in haploids and diploids equal and respectively, and focus on the following new variables: the difference in allele frequencies between haploids and diploids the frequency of haploids in the population the departure from the Hardy–Weinberg principle in diploids and the allele frequency averaged across ploidy levels and weighted by the expected life span of haploids and diploids Only with this weighted allele frequency is it possible to separate the timescales so that the leading-order change in *p _{M}* is independent of the other variables (Appendix C). This averaging also corresponds to weighting by the reproductive value of each class (Taylor 1990).

To perform the separation of timescales, we assume selection is weak by first defining the following selection [*s*_{(}_{GT}_{)}] and dominance (*h*) coefficients: and We then assume weak selection by setting for each life history parameter, where is a small term.

The fast-scale dynamics can be tracked by determining changes in the system to leading order assuming that selection alters these dynamics only over a longer period of time. Doing so, we find that the system rapidly approaches a quasi-equilibrium state where there is a similar allele frequency in haploid and diploid populations the fraction of haploids is approximately the population is nearly at Hardy–Weinberg and the allele frequency does not change (see Appendix C and File S1). We then calculate the average change in allele frequency to leading order, which requires that we keep linear-order terms in and which accounts for selection (Appendix C and File S1).

Furthermore to leading order, the second and higher moments of the change in allele frequency are the same as for the neutral case (*i.e.*, when R and M are equivalent, dropping all terms involving Consequently, we derive the second moment for the change in allele frequency using the diffusion limit when selection is absent (Appendix C and File S1), which allows us to determine the variance effective population size for this model. We also confirmed that the third moment goes to zero in the diffusion limit for the Moran model, justifying the use of a diffusion approximation (Karlin and Taylor 1981, p. 165).

Using the first and second moments for the expected change in the frequency of mutant alleles, and we can calculate the backward equation for the fixation probability with an initial allele frequency *p*_{0}, (Appendix C):(3a)where(3b)(3c)and is the average selection experienced by a rare allele across haploid and diploid stages. Here, we scale time as described in Appendix C and define the arithmetic mean mortality rate as Under the diffusion approximation, selection is assumed to be weak and of order in which case both and are of when we take the diffusion limit Solving Equations 3a–3c with the boundary conditions, and we then obtain the fixation probability for any diffusion process (Crow and Kimura 1970, p. 424):(4a)with When a mutation arises at random on a single chromosome within the population, the expected initial allele frequency is:(4b)again weighting the allele frequency in each ploidy phase by its longevity.
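For reference, the standard form of the backward equation and its solution (Crow and Kimura 1970) is sketched below. The notation is assumed for illustration: $M(p)$ and $V(p)$ stand for the first and second moments of the change in allele frequency per unit of scaled time, and $u(p_0)$ is the fixation probability from initial frequency $p_0$. The specific expressions for $M$ and $V$ in the haploid-diploid Moran model are those referred to as Equations 3b and 3c and are not reproduced here.

```latex
0 \;=\; M(p_0)\,\frac{\mathrm{d}u}{\mathrm{d}p_0}
   \;+\; \frac{V(p_0)}{2}\,\frac{\mathrm{d}^2 u}{\mathrm{d}p_0^{2}},
\qquad u(0)=0,\quad u(1)=1,

u(p_0) \;=\; \frac{\int_0^{p_0} G(x)\,\mathrm{d}x}{\int_0^{1} G(x)\,\mathrm{d}x},
\qquad
G(x) \;=\; \exp\!\left(-\int_0^{x} \frac{2\,M(y)}{V(y)}\,\mathrm{d}y\right).
```

Only the ratio $M/V$ enters the solution, which is why the result can later be summarized by an effective population size together with effective selection and dominance coefficients.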

#### Fixation probabilities in the WF model using a branching process approximation:

We next develop the fixation probabilities and using the WF model, again applying both branching process and diffusion methods (Appendix D).

The branching process approximation yields a pair of coupled nonlinear equations in and (Equations D4a and D4b), which can be solved numerically to obtain the fixation probabilities for an allele appearing at random within the population when selection is strong:(5a)where now the fraction of haploids in the resident population is given by (see Equation D2). For compactness, we have written the fitness of a resident haploid considering the cost of fertilization and sex as

We can also approximate the fixation probabilities and explicitly when selection is weak. To do so, we define and where again we assume weak selection by setting for small To leading order in the fixation probabilities for an allele initially in a haploid individual or in a diploid individual become:(5b)(5c)where the average selective effect for a rare allele in the WF model is defined as Plugging Equation 5b and (5c) into (5a), we have(6)This branching process approximation is valid only for mutations whose average effect is beneficial

#### Fixation probabilities in the WF model using a diffusion approximation:

We also derive the fixation probability using a diffusion approach, finding the appropriate drift and diffusion coefficients for use in Equation 4a. Again, we sought a transformation of variables that allowed a separation of timescales, such that the leading-order term in selection depended only on the parameters and the allele frequency. For the WF model, where all parents die each generation, the appropriate definition of the average mutant allele frequency is (Appendix D). Applying the separation of timescales, we obtain the drift and diffusion coefficients:(7a)(7b)where time is measured in units of *N* generations (see Appendix D for the derivation). Equation 4a then gives the fixation probability, with the initial allele frequency for a mutation appearing at random on a single chromosome within the population. In contrast to the branching process approximation, this fixation probability can be calculated for both beneficial and deleterious mutations, but assumes that this selection is weak.

#### Effective genetic parameters:

To compare the fixation probabilities derived above to classical results for fully haploid or fully diploid populations, we define the variance effective population size selection coefficient and dominance coefficient that would give the same fixation probability obtained by a diffusion approximation in the diploid WF model, a classical standard for comparison.

We first define the variance effective population size using the number of diploid individuals in a WF model that would exhibit the same allele frequency variation for a haploid-diploid species by defining (Ewens 2004). Note that *N*_{e} would be the census size *N* in the diploid-only WF model but *N*/2 in the haploid-only WF model, because we use the diploid WF model to define the effective parameters. From the diffusion approximations above, we find:(8a)(8b)Note that the variance effective population size is substantially reduced in populations where one of the phases is rare or reflecting the increased amount of drift that occurs in that phase in species with a strict alternation of generations. When the death rates are equal is half reflecting the higher variance in reproductive success that is typically observed in the Moran model compared to the WF model (*e.g.*, Otto and Day 2007, p. 583 and p. 643).

We next consider the effective selection and dominance coefficients. To do so, we define *s*_{e} and *h*_{e} as the selection coefficient and the degree of dominance in the classic diploid WF model, in which the drift term for the change in allele frequency is (see Crow and Kimura 1970). Note that the drift term obtained for haploid-diploid populations per generation has the same functional form with respect to (see Equation 3b and Equation 7a). Consequently, we can find *s*_{e} and *h*_{e} from the pair of equations describing the changes in allele frequency when *p*_{M} = 0 and when *p*_{M} = 1, obtaining: (9a) (9b) for the Moran model, and (10a) (10b) for the WF model. For an additive beneficial allele we have and

Given the above definitions for the variance effective population size (*N*_{e}), the effective selection coefficient (*s*_{e}), and the effective dominance coefficient (*h*_{e}), we regain the fixation probabilities derived above, from the fixation probability in the classic diploid WF model (Kimura 1957, 1962; Crow and Kimura 1970, p. 427): (11) For arbitrary dominance, the result must be numerically integrated, but for additive mutations we have: (12) Assuming weak positive selection on an initially rare mutation in a large population we obtain the classic approximation to the fixation probability, The result is the same as that obtained using branching processes if we assume weak selection, as given by Equation 6 for the WF model: (13a) and by taking the Taylor series of Equation 2 to leading order in selection for the Moran model: (13b) We next turn to numerical analyses of these equations to better understand how the fixation probability depends on life history parameters of haploids and diploids.
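The numerical integration mentioned above for arbitrary dominance can be sketched as follows. The code uses the classic diploid WF diffusion result of Kimura under an assumed fitness convention of 1, 1 + *hs*, and 1 + *s* for the three diploid genotypes. The argument names `Ne`, `s`, `h`, and `p0` are placeholders for the effective parameters and initial frequency, and the function names are illustrative. The convention may differ by constant factors from the definitions used in Equations 9 to 12, so this should be checked against those definitions before use.

```python
import math


def _simpson(func, a, b, n=2000):
    """Composite Simpson's rule on [a, b] using an even number n of panels."""
    if a == b:
        return 0.0
    step = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * step)
    return total * step / 3.0


def kimura_fixation(p0, Ne, s, h, n=2000):
    """Diffusion fixation probability with arbitrary dominance h.

    Assumes drift term s*p*(1-p)*(h + (1-2h)*p) and variance p*(1-p)/(2*Ne).
    """
    def G(x):
        return math.exp(-2.0 * Ne * s * (2.0 * h * x + (1.0 - 2.0 * h) * x * x))

    return _simpson(G, 0.0, p0, n) / _simpson(G, 0.0, 1.0, n)


def additive_closed_form(p0, Ne, s):
    """Closed form for h = 1/2 under the same fitness convention."""
    if s == 0:
        return p0
    return (1.0 - math.exp(-2.0 * Ne * s * p0)) / (1.0 - math.exp(-2.0 * Ne * s))


if __name__ == "__main__":
    p0 = 1.0 / 2000.0
    for h in (0.0, 0.5, 1.0):
        print(h, kimura_fixation(p0, Ne=1000, s=0.01, h=h))
```

Setting `s = 0` recovers the neutral expectation that the fixation probability equals the initial frequency, and `h = 0.5` reproduces the additive closed form. A fully recessive beneficial allele (`h = 0`) has a lower fixation probability than a dominant one, consistent with the text's remark that fully recessive alleles are the case the branching approximation cannot handle.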

### Comparisons to numerical simulations

#### Similar haploid and diploid life histories:

We first consider the accuracy of these analytical approximations by comparing the fixation probability to numerical simulations when the life history parameters are equivalent between haploid and diploid residents (*β*_{R} = *β*_{RR} and *d*_{R} = *d*_{RR} for the Moran model, *w*_{R} = *w*_{RR} for the WF model). This scenario may be appropriate for isomorphic species with morphologically similar haploid and diploid stages (*e.g.*, the green alga *Ulva*, the brown alga *Dictyota*, the red alga *Chondrus*, *etc*.).

In Figure 1, we illustrate the fixation probability for the Moran model (A–C) and WF model (D–F) as a function of the average selection experienced by a rare allele, where selection acts only in the haploid phase (Figure 1, A and D; or in the diploid phase (remaining panels; and assuming for simplicity that selection acts only through fertility Figure 2 represents the fixation probability for different values of when selection does not act on the allele while rare (*i.e.*, the allele is fully recessive in diploids and has no selective benefit in haploids, in the Moran (A) and WF model (B). In Figure 3, we illustrate how the fixation probability varies for a given total number of individuals, *N*.

The branching process approximation allowing for strong selection (solving numerically for from Equation 2 and from Equation 5a with Equation D4) accurately predicts the fixation probability from numerical simulations regardless of the strength of selection (red curve, see inset panels of Figure 1), but this method is valid only when the population size *N* is large and the average effect of a rare mutation is beneficial, failing when there is no selection on a rare allele (Figure 2). As expected, approximating the branching process results for weak selection (Equation 13) is accurate only when selection is weak (black dashed line in Figure 1).

By contrast, the diffusion equation (4a) with drift and diffusion coefficients (3) for the Moran model and (7) for the WF model provides a good approximation as long as selection is weak (Figure 1, Figure 2, and Figure 3, blue curves), even when the average effect of a rare mutation is not beneficial and/or the population size is small. Also, the diffusion well approximates the fixation probability when the mutation is fully recessive and has no effect in haploids (Figure 2), which is a case that cannot be handled by the branching process approximation because the fate of the allele is not determined while it is rare.

Because we have assumed equal death rates for haploids and diploids, the fixation probabilities behave similarly for the Moran (Figure 1, A–C) and WF models (Figure 1, D–F), except for the fact that the fixation probability is approximately halved in the Moran model, due to the difference in *N*_{e} (see Equation 8).

#### Different haploid and diploid life histories:

We next consider the fixation probability in organisms whose life histories differ between haploid and diploid stages for the Moran model, and for the WF model), as would be appropriate for heteromorphic species (*e.g.*, the green alga *Derbesia*, the brown alga *Macrocystis*, the red alga *Porphyra*, *etc*.). Again, both the branching process and diffusion approximations work well under the conditions assumed during their derivation (Figure 5). This figure also compares the results to classical haploid-only or diploid-only models, showing that the fixation probability in a haploid-diploid population differs dramatically as the frequency of haploids *vs.* diploids in the population varies.

We first discuss the case where the mortality rates of haploids and diploids are equivalent With equal death rates, haploids and diploids have the same expected life span in the Moran model, as is assumed in the WF model where all individuals die at each time step. In this case, the fixation probability of beneficial mutations with weak selection and a large population becomes for the Moran model and for the WF model, using either branching processes or a diffusion approximation (Equation 13). We next compare these fixation probabilities for haploid-diploid populations to the classic results for selection only in haploids (Moran: WF: *vs.* selection only in diploids (Moran: WF: ), assuming selection is additive.

The fixation probability (13) reaches a maximum of for the Moran model and for the WF model when (Figure 4). When haploids comprise two-thirds of a haploid-diploid population, there is an equal number of alleles present in haploids and in diploids, which maximizes (minimizing drift for a newly arisen mutation), thereby maximizing the fixation probability for a given total number of individuals. At this maximum, the fixation probability (Equation 13, black dashed curve in Figure 4) is the average of the fixation probability in a fully haploid population (*s*_{H} for the Moran model and 2*s*_{H} for the WF model) and in a fully diploid population (*s*_{D} for the Moran model and 2*s*_{D} for the classic diploid WF model). In Figure 4, we explore a case where the fixation probability is the same in haplont and diplont populations (grey dashed curve), in which case the fixation probability in a haploid-diploid population under weak selection (13) is only the same if two-thirds of the population is haploid (black dashed curve). The simulation results tend to fall slightly below this expectation, however, due to the small population size and relatively large selection coefficients assumed (black dots in Figure 4).

When the fraction of haploids departs from two-thirds, however, the fixation probability decreases dramatically (Figure 4), reaching zero when most individuals are haploids or diploids This sensitivity to the relative proportions of haploids and diploids occurs because the new allele must be carried by both haploid and diploid individuals, in alternating generations, in a haploid-diploid population. Hence, when the ratio of haploids to diploids in the population is very skewed, the strength of random genetic drift is increased in whichever ploidy phase has fewer individuals.
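The two-thirds optimum can be checked with a short calculation under stated assumptions. Suppose that, with equal mortality and strict alternation of generations, the per-generation variance in the averaged allele frequency is one quarter of the sum of the sampling variances in the haploid and diploid phases. For a haploid fraction `rho` this gives an effective size proportional to 4·rho·(1 − rho)/(2 − rho), while a single new mutation starts at a frequency proportional to 1/(2 − rho). Their product, normalized to equal one at its peak, is the hypothetical `skew_factor` below. This is a reconstruction for illustration and not a transcription of Equation 13 or Equation 14.

```python
def skew_factor(rho):
    """Weak-selection fixation probability relative to its value at rho = 2/3.

    rho is the fraction of individuals that are haploid (0 <= rho <= 1).
    Assumes equal mortality in both phases and strict alternation of generations.
    """
    return 8.0 * rho * (1.0 - rho) / (2.0 - rho) ** 2


if __name__ == "__main__":
    grid = [i / 10000 for i in range(10001)]
    best = max(grid, key=skew_factor)
    print(f"maximum near rho = {best:.4f}")
    for rho in (0.05, 0.25, 0.5, 2 / 3, 0.9, 0.99):
        print(f"rho = {rho:.3f}: relative fixation probability = {skew_factor(rho):.3f}")
```

Under these assumptions the factor vanishes as `rho` approaches zero or one and peaks where haploids make up two-thirds of individuals, matching the qualitative pattern described for Figure 4.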

We next allow the mortality rates of haploids and diploids to differ as well As seen in Figure 5A, we again find that the fixation probability approaches zero whenever the ratio of haploids to diploids is extremely skewed near zero or one). With mortality rates varying between the ploidy levels, however, the peak fixation probability no longer occurs at As a consequence, increasing the mortality rate of haploids relative to diploids may increase or decrease the fixation probability of a beneficial mutation, depending on the overall population frequency of haploids. In Figure 5B, we illustrate the parameter space within which the fixation probability in a haploid-diploid population is larger than in a fully haploid and fully diploid population (red), or smaller than the two (blue), as a function of the frequency of haploids and the relative mortality rate of haploids

At an intuitive level, beneficial alleles are better preserved and more likely to fix when the more common ploidy phase also has the lower mortality rate, allowing alleles to remain in this phase for longer.

#### Ploidally antagonistic selection:

In this section, we consider cases where selection acts in opposite directions in haploid and diploid phases (ploidally antagonistic selection). The above derivations have not made assumptions about the sign of selection in each phase, although the branching process approximation assumes that the fixation probabilities in (1) are both positive. To assess the robustness of these approximations when selection acts in opposite directions in haploids and diploids, we set and for and varied the degree of antagonism (*z*) in Figure 6.

The diffusion approximation (blue curves) applies only while selection in both phases is weak. The diffusion approximation depends only on the average strength of selection (see Equation 3b and Equation 7a) and fails to capture the large oscillations in allele frequency that occur between haploid and diploid phases when *z* is large.

The branching process does, however, capture the decline in the fixation probability that occurs as the oscillations in selection across haploid and diploid phases increase in magnitude. When ploidally antagonistic selection becomes very extreme, however, selection no longer favors allele *M* when it becomes common. We can measure selection on *M* once common as the difference in fitness of *M*-bearing *vs.* *R*-bearing individuals relative to the now common genotypes, *MM* and *M*: averaging across the two phases. Selection only remains positive in Figure 6, favoring M when it becomes common, if *z* falls between −0.377 and 0.177. Beyond these points, the branching process begins to break down because the loss or fixation of *M* is no longer decided while it is rare. Indeed, once *z* extends beyond −0.432 and 0.232, both alleles *M* and *R* can invade when rare according to the branching process approximation in both the Moran and WF models. Beyond these values, we expect ploidally antagonistic selection to maintain polymorphism for extended periods of time [see bottom of Figure 6, A and B, in File S1; the stochastic analog of the protected polymorphism identified by Immler *et al.* 2012].

In the WF model, a further complexity arises because of the fact that a mutation that appears in a haploid will only be in diploid offspring, haploid grand-offspring, *etc*., because of the assumption of completely nonoverlapping generations. This alternation can lead to the situation where the entire population is composed of M haploids and RR diploids, followed in the next generation by R haploids and MM diploids; continuing to alternate until the population happens to lose one genotype or the other. At that point, allele M tends to fix only when its geometric mean fitness is higher: which requires *z* < 0.122. Because M is now in a different context, finding itself in MM diploids that are disfavored when *z* is positive, the branching process fails to capture the probability of fixation for large values of *z*.

Overall, we find that our results continue to apply with ploidally antagonistic selection as long as selection in each phase is weak for the diffusion approximation and the fate of an allele is decided while it remains rare for the branching process approximation.

## Discussion

Sexual organisms often exhibit complex life cycles alternating between different ploidy phases, with free-living haploid and diploid individuals. We develop a genetic model explicitly considering the population dynamics of haploid and diploid individuals by generalizing classical genetic models: the Moran model and the WF model. Using a branching process and a diffusion approximation, our work is the first to derive the fixation probability for organisms with haploid-diploid life cycles.

The fixation probability obtained by using branching processes (Equation 2) provides a good approximation to our simulation results when the average effect of a mutation is beneficial and the total population size *N* is sufficiently large, as long as ploidally antagonistic selection is not so strong that the allele becomes deleterious as it rises in frequency. On the other hand, the fixation probability from the diffusion approximation (Equation 4a) is accurate for both deleterious and beneficial mutations for both small and large populations, but only when selection is weak in each phase.

When average selection is weak, additive, and positive, and when starting from a single mutation, the branching process and diffusion approximations converge to Equation 13a for the WF model and Equation 13b for the Moran model. When the mortality rates are equal in haploids and diploids, the fixation probability reduces in both models to: (14) where for the WF model and for the Moran model, reflecting the increased stochasticity in the Moran model, which samples individuals both to die and to give birth (all parents die in the WF model). Equation 14 demonstrates an important aspect of the fixation probability in organisms with free-living haploid and diploid phases: beneficial alleles are less likely to fix in populations with skewed ratios of haploids to diploids. This contrasts with classical results for haplont or diplont populations (Crow and Kimura 1970), where the other ploidy phase is assumed to be effectively infinite (no drift).

The “average effect” of selection, which averages selection across reproduction, fertilization, and mortality for both haploid and diploid phases, plays a key role in the fixation probability in haploid-diploid populations. Mutations can have deleterious effects at one ploidy level, but not the other, and still be beneficial overall. Alternatively, such ploidally antagonistic selection can maintain polymorphism (Immler *et al.* 2012). In that case, the diffusion model can also be used with the mean and variance components derived here to obtain the time to loss of one or the other allele, or the stationary distribution, as well as the fixation probability.

Our results in a haploid-diploid population offer an interesting comparison to models with separate sexes (*e.g.*, Crow and Kimura 1970; Hill 1979; Pollak 1990; Caballero 1995) or two patches (*e.g.*, Maruyama 1970; Whitlock and Barton 1997; Whitlock 2003; Yeaman and Otto 2011). Consider first the case with separate sexes. To see the connection it is instructive to recalculate the variance effective population size using the method described in Crow and Kimura (1970), rather than the more formal derivation in Appendix D. In a population with two sexes consisting of *M* adult males and *F* adult females (both diploid) with allele frequencies *p*_{♂} and *p*_{♀}, the average allele frequency among offspring is (*p*_{♂}+*p*_{♀})/2 and the variance in allele frequency is [Var(*p*_{♂})+Var(*p*_{♀})]/4. This in turn equals assuming similar allele frequencies in the two sexes (*p*_{♂} = *p*_{♀} = *p*). Setting this to the variance expected in the classic diploid WF model without separate sexes, we obtain the variance effective population size (Crow and Kimura 1970). For *X*-linked genes, the average allele frequency within the population is instead (*p*_{♂}+2*p*_{♀})/3, so that the expected variance is (note there is only one allele in males), yielding One might initially assume that a haploid-diploid population would have a similar variance effective population size to an *X*-linked gene, but this is not the case. With a strict alternation of generations, a gene spends half of the generations in haploids and half in diploids, so the average allele frequency is (now averaged over the alternation of generations), giving a variance in allele frequency of and a variance effective population size of (Equation 8b). 
This is most similar to the two-sex, autosomal model but with and In both cases, descendants have an evolutionary history spent half in one type (males or haploids) and half in the other (females or diploids), and the variance effective population size is sensitive to low numbers of either type. As a consequence, is maximized in the haploid-diploid model when the phase with fewer genes (haploids) is relatively common whereas is maximized in the sex-linked model when the phase with fewer genes (males) is relatively rare.
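The effective-size comparison in this paragraph can be checked numerically. The sketch below is not taken from the article: it assumes binomial sampling variance within each class and equates the resulting variance to *p*(1 − *p*)/(2*N*_e), as in the classic diploid WF model. The closed forms in the comments follow from that assumption alone, and the names `n_hap` and `n_dip` are ours, so the haploid-diploid expression may be written differently from Equation 8b.

```python
def ne_two_sex(males, females):
    # Autosomal locus, two sexes (assumed variance):
    # [p(1-p)/(2M) + p(1-p)/(2F)] / 4 = p(1-p) / (2 Ne)
    return 4.0 * males * females / (males + females)


def ne_x_linked(males, females):
    # X-linked locus, one allele per male (assumed variance):
    # [p(1-p)/M + 4 p(1-p)/(2F)] / 9 = p(1-p) / (2 Ne)
    return 9.0 * males * females / (4.0 * males + 2.0 * females)


def ne_haploid_diploid(n_hap, n_dip):
    # Strict alternation of generations (assumed variance):
    # [p(1-p)/N_H + p(1-p)/(2 N_D)] / 4 = p(1-p) / (2 Ne)
    return 4.0 * n_hap * n_dip / (n_hap + 2.0 * n_dip)


def best_fraction(ne_func, total=1000, steps=9999):
    """Grid search for the fraction of the first class (males or haploids)
    that maximizes Ne when the total number of individuals is fixed."""
    fractions = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(fractions, key=lambda f: ne_func(f * total, (1.0 - f) * total))
```

Under these assumptions, `best_fraction(ne_haploid_diploid)` lands near 2 − √2 ≈ 0.59 haploids, whereas `best_fraction(ne_x_linked)` lands near √2 − 1 ≈ 0.41 males, in line with the contrast drawn above. The same formulas give `ne_haploid_diploid(n_hap, n_dip)` equal to `ne_two_sex(n_hap / 2, n_dip)`.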

The haploid-diploid model can also be thought of as a two-patch model, sharing similarities with models of subdivided populations (*e.g.*, Tachida and Iizuka 1991; Gavrilets and Gibson 2002; Yeaman and Otto 2011). In the classical island and stepping stone models, the fixation probability in a subdivided population is the same as that in an undivided population of the same total population size (Maruyama 1970). This assumes, however, “conservative migration,” where each population contributes to the dynamics of a metapopulation according to the local population size and where migration has no effect on the local population size (Whitlock and Barton 1997; Whitlock 2003). The haploid-diploid model, however, does not exhibit conservative migration. Rather, all haploids become diploids and vice versa (equivalent to a migration rate of 100% in the WF model). When one population is smaller (say haploids), that population has a much larger impact on the dynamics of the whole population because the larger population (say diploids) descends entirely from it. Consequently, as we have shown here, the fixation probability of a haploid-diploid population does not obey Maruyama’s result for a single population, either haploid or diploid, with the same total number of individuals (Figure 4 and Figure 5).

By deriving the fixation probability for haploid-diploid populations, this work has extended classical population genetic results to organisms with an alternation of generations, allowing a better understanding of the role of stochasticity and drift in such species. Given that a large number of genes are expressed during both the haploid and diploid generations (Coelho *et al.* 2007), the theoretical framework developed here also allows for the integration of selection of these “shared” (pleiotropic) genes across the life cycle.

## Acknowledgments

We thank Ron Blutrich for help in the development of this manuscript. We also thank H. Otsuki, A. Sasaki, Y. Uchiumi, M. Whitlock, R. Yamaguchi, the members of the laboratory of S.P.O., and two anonymous reviewers for helpful suggestions and discussions. This project was funded by a Grant-in-Aid from a Japan Society for the Promotion of Science (JSPS) postdoctoral fellowship for research abroad, a JSPS postdoctoral fellowship (16J05204) to K.B., and by a Natural Sciences and Engineering Research Council of Canada grant (183611) to S.P.O.

## Appendix A: Change in Allele Frequency Within a Haploid-Diploid Moran Model

The Moran model is a Markov process where the state of the population at time *t* + 1 can be treated probabilistically given the frequencies of all genotypes in the previous time step. At each time step, a single individual is chosen to die and one is chosen to give birth, so that the population size remains constant.

We first choose an individual to die, weighting the chance that each type is eliminated by its mortality rate. Thus, the probability that an individual with genotype (*GT*) is eliminated from the population is

We then replace that individual with a birth chosen randomly from all of the spores and zygotes that could be produced by the remaining population at time *t*. Because we assume a large population size, we assume that the change in genotype frequency due to the death in this time step is negligible for common genotypes (this assumption is not made in the simulations or for rare genotypes in our analysis). The probability that a haploid spore is recruited is then proportional to the rate at which diploids reproduce by meiosis to produce either R- or M-bearing spores:(A2a)(A2b)Given that resident and mutant gametes are produced at rates and respectively, and accounting for the probability of fertilization, the probability that a diploid zygote is recruited is:(A3a)(A3b)(A3c)Overall, the probability that genotype (*GT*) is recruited into the empty site is:(A4)To derive the first moment of mutant allele frequency fluctuations, we calculate the conditional expected value for the change in any function *F* of the random variable as(A5)where and

Using Equation A1 and (A4), the expected change in the number of individuals with genotype (*GT*), is:(A6)Note that Equation A6 can be rewritten as(A7)where and

These expected changes in genotype frequencies can be used to track the dynamics of the system and determine its equilibrium properties.
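The death step described at the start of this appendix can be sketched as a weighted draw. This is a minimal illustration, assuming that the chance a genotype is removed is proportional to its mortality rate times its current count, which is our reading of the verbal description of Equation A1. The dictionaries `counts` and `mortality` and both function names are hypothetical.

```python
import random


def death_probabilities(counts, mortality):
    """Probability that the next death removes an individual of each genotype.

    counts:    dict mapping genotype label -> number of individuals (hypothetical)
    mortality: dict mapping genotype label -> mortality rate (hypothetical)
    """
    # Weight each genotype by (mortality rate) x (number of individuals).
    weights = {g: mortality[g] * counts[g] for g in counts}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}


def sample_death(counts, mortality, rng=random):
    """Draw the genotype of the individual that dies in this time step."""
    probs = death_probabilities(counts, mortality)
    genotypes = list(probs)
    return rng.choices(genotypes, weights=[probs[g] for g in genotypes], k=1)[0]
```

With equal mortality rates, the death probabilities reduce to the genotype frequencies; raising the mortality rate of one ploidy phase shifts deaths toward that phase.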

### Equilibrium

When the mutant allele is absent, the resident population approaches an equilibrium with haploids of genotype *R* (*x*_{R}) and diploids of genotype *RR* (*x*_{RR}), with stochastic fluctuations around this point. Assuming the population size is large (dropping terms of order 1/*N*), we can solve for the equilibrium number of haploids and diploids,(A8a)using (A7), which yields and to obtain the frequencies:(A8b)Equation A8b emphasizes the fact that haploids are most common when diploids reproduce and/or die at a higher rate, and vice versa.

## Appendix B: Fixation Probability in the Haploid-Diploid Moran Model Using a Branching Process Approximation

To derive the fixation probability of a mutant allele, we first apply the branching process approximation for a two-patch model (*e.g.*, Yeaman and Otto 2011). Because the fate of a mutant allele is assumed to be determined while it is rare in a branching process approximation, we ignore homozygous *MM* mutants and consider only the change in number of haploid mutants *X*_{M} and diploid mutants *X*_{RM}. As in Appendix A, we assume that the life cycle consists of the death of a single individual, followed by the birth of an individual from the pool of haploid spores and zygotes that could be produced at that time step. In this birth step, we again assume that the population size is large enough that we can ignore the small change in that occurs following the death of an individual with one of the common genotypes.

When a single haploid mutant appears in the resident population, the numbers of individuals are *x*_{M} = 1, and *x*_{RM} = *x*_{MM} = 0. When a single diploid mutant appears in the resident population, we have *x*_{M} = 0, *x*_{RM} = 1, and *x*_{MM} = 0. Plugging these into Equation A1, the probabilities that a mutant individual is chosen to die are: where (assuming the population is large) and otherwise residents are sampled to die (*i.e.*, and We add “|*H*” or “|*D*” to the subscript to clarify that we start with one haploid or one diploid mutant individual.

When a single haploid mutant appears in the resident population, the expected number of zygotes is: and Because the diploid population consists solely of *RR* individuals, the rate of production of haploid spores is: and Because *N* is assumed to be very large in a branching process approximation and homozygotes are rare, the probability that the sampled offspring will be of genotype (*GT*) given by Equation A4 is approximately:(B2)where

When a single diploid mutant appears in the resident population, we have and Because *N* is large and homozygous mutants rare, the probability that a mutant offspring is sampled is:(B3)Let equal the probability that a mutant allele introduced into a single haploid individual with genotype *M* (single diploid individual with genotype *RM*) is lost from the descendant population at some point in the future. Assuming that its fate is determined before the homozygote mutant *MM* appears, that the population is at demographic equilibrium, and that the probabilities of loss are independent of each other; these probabilities satisfy the law of total probability, accounting for all possible death-birth transitions:(B4)when starting with one initial *M* haploid, and(B5)when starting with one initial *RM* diploid. The probabilities of fixation are then defined as the chance that loss does not occur: and Recalling the branching process assumption that the fate of the allele is determined while the mutation is rare, the probability that a single mutant allele is chosen to die or reproduce remains roughly constant during this period. The pair in Equation B5 can thus be solved to give the fixation probabilities:(B6a)(B6b)Because the population size is very large, the probabilities of death or recruitment of a mutant and are very small. Hence, the fixation probabilities are well approximated by:(B7a)(B7b)Plugging (B1), (B2), and (B3) into (B7) along with the frequency of haploids among residents, (A8b), we obtain (1).

Furthermore, we can apply a Taylor expansion in to Equation 1, assuming weak selection. Specifically, we plug (A8b) into (1) and rewrite and in terms of the life history parameters and the mutant’s effect on those parameters Carrying out the Taylor series, the fixation probability is then approximately(B8a)(B8b)Plugging (B8) into (2), we have (13) for the weak-selection approximation to the fixation probability based on branching processes in the Moran model.

## Appendix C: Fixation Probability in the Haploid-Diploid Moran Model Using a Diffusion Approximation

To derive the fixation probability from a diffusion approximation, we begin by deriving the first and second moments for the expected change in mutant allele frequency.

### Separation of Timescales

Because the total number of individuals is held constant at *N*, dynamics for *X*_{(GT)} are described by four variables. Transforming the original system of Equation A7 into the new variables, and we have equivalent dynamics for the expected changes in the new variables. Applying a separation of timescales (*e.g.*, Nagylaki 1976; Otto and Day 2007), we then calculate the slow dynamics for the change in frequency of mutant alleles The details of the calculations are given in a supplementary Mathematica file (File S1).

Let represent the expected change in each variable across one generation given the current state of the population:(C1)The exact functional form for each variable is provided in File S1.

To approximate the above with weak selection, we apply a Taylor series expansion in to Equation C1. To constant order we have: (C2a) (C2b) (C2c) (C2d) (see the detailed form of the functions in File S1). We assume that the population rapidly reaches a quasi-equilibrium according to these equations, which are independent of selection and reflect ecological dynamics and the approach to Hardy–Weinberg. Solving for this quasi-equilibrium by setting Equations C2 to zero, we have the steady state approximations: and which gives for any *p*_{M}. Consequently, the population quickly approaches a state where there is a similar allele frequency between the haploid and diploid populations, a Hardy–Weinberg ratio in the diploid population, and a haploid-to-diploid ratio close to that in the resident population given by Equation A8b. We refer to this as the quasi-equilibrium to emphasize that the dynamics will still change slowly due to selection.

After the state of the population approaches this steady state, the variables change over a longer timescale due to selection and To derive these slow dynamics, we set and in Equation C1 and perform a Taylor series expansion around now keeping order terms. Doing so, we find that the change in allele frequency represents a closed variable system that does not depend on the dynamics of the other variables to this order:(C3)where the weighted mean mortality rate is and the harmonic mean mortality rate is

### Appropriate Weight for the Average Allele Frequency *p*_{M}

When we transform the variables using different allele frequency definitions, the dynamics of the allele frequency did depend on the other variables to That is, defining the average allele frequency as with an arbitrary weighting, *a*, only (as used above) allows the allele frequency dynamics to be independent of the other variables to (see File S1).

### Deriving the Second and Third Moments

To derive the backward diffusion equation, we need both the second and third moments of change in mutant allele frequencies. Because we assume weak selection, we derive the second and third moments from the leading-order dynamics where the allele *R* and *M* are equivalent and As described above, to leading order, the system rapidly approaches a state where the allele frequencies are similar between haploids and diploids, diploids are at Hardy–Weinberg, and the frequency of haploids is In the following, we assume that these steady state relationships hold.

We define the change in mutant allele frequency when an individual with genotype (*a*) dies and is replaced by an individual with genotype (*b*) as(C4)where the number of the genotype that died decreases by one the number of the genotype that was born increases by one and the number of all other genotypes remains the same when We can ignore the case where an individual is replaced by an individual of the same genotype because that causes no change in allele frequency. For example, when a haploid resident is replaced by a haploid mutant and can be described as and this event occurs with probability Using Equation C3, we can define the expected moments of change in mutant allele frequency through a series of twenty terms:(C5)Transforming variables into and taking and transforming the timescale by a new time variable and where the change in this transformed time variable is and taking the limit as we have the expected second moment for the instantaneous change in mutant allele frequency(C6a)and the expected third moment:(C6b)The diffusion equation is then described by the backward Equation 3a with the first moment from Equation C3 and the second moment from Equation C6a. The details of these calculations are described in File S1.

## Appendix D: Fixation Probability in the Haploid-Diploid WF Model

Here we apply the branching process and diffusion approximation to obtain the fixation probability of a mutant allele in the WF model.

### Equilibrium

In the WF model, the offspring generation is formed by sampling randomly from the spores and zygotes produced by the previous generation of adults, who all die. Letting be the probability that a reproductive cell of genotype (*GT*) is sampled to occupy an empty site at the recruitment step, the expectation for change in the number of individuals of genotype (*GT*) is:(D1)Equation D1 is based on the multinomial distribution and assumes that the number of spores and zygotes is large relative to *N*, so that sampling is nearly with replacement.
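The recruitment step described above amounts to a multinomial draw of *N* offspring. The following sketch illustrates that sampling with replacement, assuming the recruitment probabilities for each genotype have already been computed. The argument `recruit_probs` and the function name are hypothetical placeholders, not the article's notation.

```python
import random


def wf_generation(recruit_probs, n, rng=random):
    """Sample one Wright-Fisher offspring generation of size n.

    recruit_probs: dict mapping genotype label -> probability that a
                   recruited reproductive cell has that genotype
                   (hypothetical input; values are used as relative weights).
    Returns a dict mapping genotype label -> number of offspring.
    """
    genotypes = list(recruit_probs)
    # Sampling with replacement is equivalent to a multinomial draw.
    draws = rng.choices(
        genotypes, weights=[recruit_probs[g] for g in genotypes], k=n
    )
    counts = {g: 0 for g in genotypes}
    for g in draws:
        counts[g] += 1
    return counts
```

Because all adults die, the returned counts fully describe the next generation; the number of haploids therefore fluctuates from one generation to the next rather than being fixed.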

In a large population, the equilibrium fraction of haploids can be found deterministically by solving and (recalling that and The equilibrium in a resident population then satisfies and where:(D2)When we derive the fixation probability, we assume that the frequency of haploids remains near this equilibrium. Despite ignoring fluctuations in the frequency of haploids, we find that the fixation probability remains accurate (*e.g.*, Figure 1).

### Branching Process Approximation

Because the branching process approximation focuses on mutant alleles while they are rare, we again only consider the change in the number of haploid mutants *X*_{M} and diploid heterozygotes *X*_{RM}.

When a single haploid mutant appears in the resident population, the expected fraction of offspring produced by this parent is where the first “2” accounts for the fact that *RM* offspring can be formed when the mutant serves as the mother or as the father; where is the average fertility, which gets divided by two to account for the cost of sex; and where we define Similarly, when a single diploid mutant arises in the resident population, the expected fraction of M haploid offspring it produces is , where the “2” here accounts for the segregation of *M* alleles into only half of the haploid spores produced by an *RM* diploid.

Although sampling in the WF model follows a multinomial distribution, the sampling distribution for a rare mutant is approximately Poisson in a large population (as assumed in a branching process approximation). The probabilities for loss would then satisfy:(D3a)(D3b)where is the Poisson probability of producing *k* offspring. Using the Taylor series expansion for the exponential function we can rewrite (D3) as:(D4a)(D4b)Assuming that the fixation probabilities and are small under weak selection, applying the Taylor series expansion, and ignoring the third order terms, we have(D5a)(D5b)Solving (D5) and dropping terms in the solution gives Equation 5 for the fixation probability in the WF model under weak selection using a branching process approximation.
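The loss probabilities of such an alternating, Poisson-offspring branching process can also be found numerically, without the weak-selection expansion. The sketch below assumes that a rare *M* haploid leaves a Poisson number of *RM* diploid offspring with mean `lam_h`, and a rare *RM* diploid leaves a Poisson number of *M* haploid offspring with mean `lam_d`. Both means are hypothetical placeholders for the expected offspring numbers defined above. Summing the Poisson series then gives loss probabilities satisfying q_H = exp(−λ_H (1 − q_D)) and q_D = exp(−λ_D (1 − q_H)), which is our reading of the missing Equations D3 and D4.

```python
import math


def fixation_probs(lam_h, lam_d, tol=1e-14, max_iter=1_000_000):
    """Survival probabilities of a mutant lineage starting from one M haploid
    or one RM diploid, under strict alternation with Poisson offspring.

    lam_h: mean number of RM diploid offspring per M haploid (assumed input)
    lam_d: mean number of M haploid offspring per RM diploid (assumed input)
    Returns (P_H, P_D).
    """
    q_h, q_d = 0.0, 0.0  # loss probabilities; start from 0 to reach the smallest root
    for _ in range(max_iter):
        new_q_h = math.exp(-lam_h * (1.0 - q_d))
        new_q_d = math.exp(-lam_d * (1.0 - q_h))
        converged = abs(new_q_h - q_h) < tol and abs(new_q_d - q_d) < tol
        q_h, q_d = new_q_h, new_q_d
        if converged:
            break
    return 1.0 - q_h, 1.0 - q_d
```

For mean offspring numbers slightly above one, the returned values are close to the sum of the two excesses (λ_H − 1) + (λ_D − 1), that is, twice the average advantage per generation. When λ_H λ_D ≤ 1 the iteration converges to certain loss.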

### First Moment of Change in Average Allele Frequency

As with the Moran model, we next apply a separation of timescales to describe the first moment for the change in mutant allele frequency. First, we transform the variables, using the same definitions for the Moran model except that the average allele frequency is defined as [Again, we set and found that the only value of the weighting term *a* that caused the change in allele frequency to be independent of the other variables to leading order in selection was *a* = 1/2.]

We then calculate the expected change in each variable, assuming weak selection. To constant order (no selection), we find:(D6a)(D6b)(D6c)(D6d)Consequently, the system rapidly approaches the quasi-equilibrium, and (see Equation D2), after which the system slowly changes according to selection. Taking the next term in the Taylor series we find that the change in mutant allele frequency is:(D7)which does not depend on any of the other variables, to this order. That is, the system of equations has been reduced to one variable describing selection. Equation D7 provides the drift coefficient for use in the diffusion Equation 4a. The drift in allele frequency is driven both by the average selection acting on allele *M* and the nonadditive effects of selection on the homozygous *MM* genotype once the mutant allele frequency becomes common.

### Second Moment of Change in Average Allele Frequency

We next derive the diffusion coefficient for use in Equation 4a; to leading order, it can be obtained by ignoring selection, as in Appendix C (see supporting information in File S1).

We assume that the numbers of both haploids and diploids are large and focus on the sampling properties of mutations within each of these two pools, ignoring the impact of the small fluctuations in the frequency of haploids around the quasi-equilibrium Focusing first on the haploid pool, haploid offspring are sampled according to a binomial distribution, with Ignoring terms of order the first and second moment of change in the number of mutant individuals among the population of haploids are then:(D8a)(D8b)where and

We next consider the pool of diploid offspring. Because the number of individuals in the next generation depends on a trinomial distribution, we have and where and Furthermore, because the random variables are correlated, we have

Using these and ignoring terms of order the expected values for the first and second moments of change in the number of mutant individuals among the diploid offspring pool are:(D9a)(D9b)(D9c)(D9d)(D9e)These genotypic frequencies can be used to obtain the change in mutant allele frequency among diploids by calculating the change in the random variable From (D9), we have(D10a)(D10b)To consider the change in mean mutant allele frequency across the entire population (combining haploid and diploid pools), we calculate the first and second moment of change in allele frequency: [recall that we defined the allele frequency as to isolate the dynamics of the allele frequency from the other variables]. Assuming the changes in mutant allele frequency in haploid and diploid populations are nearly independent in a large population we have the first and second moments for the change in allele frequency to constant order in (D11a)(D11b)We transform timescales using the time variable and the allele variable (Otto and Day 2007). Taking the limit as we then obtain the diffusion coefficient for a neutral allele from Equation D11b, We also have the drift coefficient from Equation D7 Furthermore, because the difference in frequency of a mutant allele between the haploid and diploid pools is very small so that we have the diffusion coefficient, Equation 7b. With the drift and diffusion coefficients in hand, the fixation probability for the WF model can be readily calculated using standard methods (Karlin and Taylor 1981).
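The final step, going from drift and diffusion coefficients to a fixation probability, can be illustrated with the standard boundary-value solution u(p₀) = ∫₀^{p₀} G(x) dx / ∫₀¹ G(x) dx, where G(x) = exp(−∫₀ˣ 2M(y)/V(y) dy). The sketch below evaluates this numerically for arbitrary coefficient functions. The callables `drift` and `diffusion` are generic placeholders supplied by the user, not the article's Equations D7 and 7b, and the midpoint rule is simply one convenient choice that avoids evaluating the ratio at the boundaries, where the diffusion coefficient vanishes.

```python
import math


def fixation_probability(p0, drift, diffusion, n_cells=20000):
    """Fixation probability from initial frequency p0 for a one-dimensional
    diffusion with first moment drift(p) and second moment diffusion(p)."""
    h = 1.0 / n_cells
    exponent = 0.0    # integral of 2*drift/diffusion up to the left cell edge
    total = 0.0       # integral of G up to the left cell edge
    below_p0 = None   # integral of G from 0 to p0, once it has been reached
    for k in range(n_cells):
        left = k * h
        mid = left + 0.5 * h
        rate = 2.0 * drift(mid) / diffusion(mid)
        g_mid = math.exp(-(exponent + 0.5 * rate * h))
        cell = g_mid * h
        if below_p0 is None and p0 <= left + h:
            # Linear interpolation inside the cell that contains p0.
            below_p0 = total + cell * (p0 - left) / h
        total += cell
        exponent += rate * h
    if below_p0 is None:  # p0 at (or rounded past) the upper boundary
        below_p0 = total
    return below_p0 / total
```

As a check on the sketch, supplying `drift = lambda p: s * p * (1 - p)` and `diffusion = lambda p: p * (1 - p) / (2 * n)` recovers the classic closed form (1 − e^{−4nsp₀})/(1 − e^{−4ns}), and a zero drift returns u(p₀) = p₀.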

## Footnotes

*Communicating editor: L. M. Wahl*

Supplemental material is available online at http://doi.org/10.5061/dryad.n4k76.

- Received July 6, 2016.
- Accepted November 10, 2016.

- Copyright © 2017 by the Genetics Society of America