## Abstract

Mutation, genetic drift, and selection are considered the main factors shaping genetic variation in nature. There is a lack, however, of general predictions accounting for the mutual interrelation between these factors. In the context of the background selection model, we provide a set of equations for the joint prediction of the effective population size and the rate of fixation of deleterious mutations, which are applicable both to sexual and asexual species. For a population of *N* haploid individuals and a model of deleterious mutations with effect *s* appearing with rate *U* in a genome *L* Morgans long, the asymptotic effective population size (*N*_{e}) and the average number of generations (*T*) between consecutive fixations can be approximated by and . The solution is applicable to Muller’s ratchet, providing satisfactory approximations to the rate of accumulation of mutations for a wide range of parameters. We also obtain predictions of the effective size accounting for the expected nucleotide diversity. Predictions for sexual populations allow for outlining the general conditions where mutational meltdown occurs. The equations can be extended to any distribution of mutational effects and the consideration of hotspots of recombination, showing that *N*_{e} is rather insensitive and not proportional to changes in *N* for many combinations of parameters. This could contribute to explain the observed small differences in levels of polymorphism between species with very different census sizes.

POPULATIONS accumulate mutations at a rate that is dependent on the interaction with selection and drift (Lande 1995; Charlesworth 2012). When the effective population size is small, genetic drift reduces the efficiency of purifying selection and deleterious mutations can be fixed. If the reduction is maintained over time, the accumulation of mutations causes a progressive decline in fitness that eventually leads the population to extinction. This process is referred to as the mutational meltdown (Lynch and Gabriel 1990; Lynch *et al.* 1995; Whitlock and Bürger 2005).

In the context of asexual populations, the accumulation of deleterious mutations is called Muller’s ratchet (Muller 1932; Felsenstein 1974). The argument is that, with time, the last few individuals from the fittest class, that is the class with the minimum number of mutations, will get an additional deleterious mutation at any locus and the least-loaded class will disappear to be replaced by a new fittest class with one more mutation. Without recombination, each loss of the least-loaded class is as irreversible as the click of a ratchet. The mechanism has been widely studied (Haigh 1978; Bell 1988; Stephan *et al.* 1993; Lande 1998; Gordo and Charlesworth 2000a,b; Etheridge *et al.* 2007; Soderberg and Berg 2007; Jain 2008; Rouzine *et al.* 2008; Neher and Shraiman 2012; Good *et al.* 2014) and there are several approaches for predicting the average number of generations (*T*) between two consecutive clicks. For scenarios where the ratchet operates slowly, there are precise methods available to predict *T* using diffusion approximation theory (*e.g.*, Gordo and Charlesworth 2000a,b; Jain 2008; Neher and Shraiman 2012). For more general scenarios, methods based on traveling-wave theory are also available (Rouzine *et al.* 2008; Good *et al.* 2014).

We argue that the prediction of the rate of the ratchet is basically the same problem as the prediction of the rate of fixation of deleterious mutations in the background selection model (Charlesworth 2013) and, consequently, a solution based on the classical concept of the effective population size *N*_{e} (Wright 1931; recently reviewed by Wang *et al.* 2016) should apply if appropriate equations for *N*_{e} considering the fixation of deleterious mutations can be found. When the ratchet is at mutation-selection-drift equilibrium, each click of the ratchet corresponds, on average, to a fixation of a deleterious allele (Charlesworth and Charlesworth 1997). Felsenstein (1974) established that Muller’s ratchet is an aspect of the Hill–Robertson interference (Hill and Robertson 1966) affecting the genetic variance in both sexual and asexual finite populations under selection. Although the interference was postulated to be a consequence of selection, genetic drift is also needed to develop disequilibrium if gene effects are independent. Thus, the rate of fixation of deleterious mutations depends on the effective population size, but in reverse, the effective population size is a function of the genetic variance for fitness, which is related to the rate of fixation of deleterious mutations. Although the effective population size and the rate of accumulation of deleterious mutations are dependent on each other, there is not a general theory that considers this interrelation to make a joint prediction.

Most of the theoretical developments on the prediction of Muller’s ratchet assume constant mutation effects with some inferences for variable effects. Charlesworth *et al.* (1993) showed that the appropriated value to predict the size of the least-loaded class with variation in selection coefficients (*s*) is the harmonic mean of the effects. This leads to the conclusion that a reduction in *N*_{e} for the background selection model would mainly be determined by mutations with relatively small effects, up to the extent that they are not effectively neutral. The boundary around *s* = 1/*N*_{e} is a transition region from mutations under effective selection to effectively neutral mutations. Soderberg and Berg (2007) dealt with the problem of a continuous distribution of effects without recombination by considering only the part of the distribution with effects larger than a subjective cut-off value (*s* > 4/*N*_{eb}, where *N*_{eb} is the effective population size applicable to small effects). Good *et al.* (2014) adapted their theory for genetic diversity to a distribution of weak effects by substituting *s* by the squared root of the second moment of the distribution of effects.

Here we extend the theory of the effective population size to consider the reduction in variance caused by the fixation of deleterious mutations in models of fixed and variable gene effects. We argue that this theory, in combination with the equation for the probability of fixation of Malecót (1952) and Kimura (1957), allows for a general prediction of *N*_{e} and the rate of fixation of deleterious mutations with or without recombination and, consequently, also gives a general approximation for the particular problem of predicting the rate of progression of Muller’s ratchet in asexual populations. The analysis shows that, under background selection, the norm is that *N*_{e} is not proportional to the census size *N* of the populations, particularly when variable distributions of gene effects are considered. This fact may have important consequences on the levels of genetic variation predicted in natural populations: *N*_{e} can be quite insensitive to changes in population size, which could contribute to explain the relatively small differences in levels of polymorphism observed between species with very different census sizes (Lewontin 1974; Lynch and Conery 2003; Leffler *et al.* 2012; Corbett-Detig *et al.* 2015).

## Methods

Let us consider a finite population with random mating and a constant number *N* of haploid reproducers. Each individual is made up of a genome *L* Morgans long. Every generation, *N* new individuals are generated to found the next generation. Each new individual is produced by fusion and meiosis of two random reproducers sampled with replacement and receives a number of new deleterious mutations taken from a Poisson distribution with mean *U*. The locations of the new mutations are randomly scattered along the genome and their effects are independent. Thus, an individual carrying *k* mutations with deleterious effects *s*_{1}, *s*_{2}, ..., *s _{k}* has an absolute fitness (1 −

*s*

_{1})(1 −

*s*

_{2})…(1 −

*s*). If the deleterious effects are considered to be invariant (

_{k}*s*), then the fitness of an individual carrying

*k*detrimental mutations is (1 −

*s*)

*. Assuming a constant population size is the same as considering a competitive model in which the relative fitness of the individual is (1 −*

^{k}*s*)

*/*

^{k}*W*, where

*W*is the mean of the absolute fitness values. Therefore, the input of new genetic variation for relative fitness per generation is

*V*

_{M}=

*Us*

^{2}. The census size

*N*of the population can be very large but the effective population size,

*N*

_{e}, is reduced by selection by such an amount that drift could lead some deleterious mutations to escape from selection and be fixed.

### Prediction of the asymptotic *N*_{e} accounting for the reduction in variance due to fixation of deleterious mutations with fixed effects

In what follows we define *N*_{e} as the asymptotic effective population size, *i.e.*, that measuring the asymptotic rate of change in inbreeding or genetic drift in a closed population with constant size over generations, which is maintained under invariable breeding and evolutionary conditions. The prediction of *N*_{e} for a background selection model is (Santiago and Caballero 1998), where *V* is the variance of relative fitness and *Q* is the amplification factor of the effects of selection over generations (Robertson 1961). The random association generated by sampling in a given generation between a focal neutral gene and the selected mutations does not disappear in the next generation but remains until recombination, drift, and selection cancel it out (Robertson 1961; Santiago and Caballero 1995). The expected total change in gene frequency of the focal neutral gene over generations relative to the change in the first generation will then accumulate over generations to a limiting value *Q* and the corresponding effect on variance of changes for an infinitely large number of focal neutral genes is then squared. In a model of deleterious mutations of fixed effect *s* appearing at a rate *U* per haploid genome per generation on a genome of length *L* M, *N*_{e} can be predicted by(1)as approximated by Hudson and Kaplan (1995) and Nordborg *et al.* (1996), focusing on the effect of selection on nucleotide diversity; by Santiago and Caballero (1998), using a genetic drift approach; and by Nicolaisen and Desai (2013) using a coalescent approach.

Equation 1 assumes that the variance for relative fitness is that expected for an infinitely large population (*V* = *Us*) in which mutations are never fixed. When there is a chance for fixation of deleterious mutations in finite populations, however, *V* is reduced as mean fitness declines over generations. If mutations have effect *s* and there is a fixation every *T* generations, the rate of decline in fitness per generation is −*s*/*T*. This rate must be equal to the reduction in fitness due to mutation (−*Us*) plus the expected response to selection, where the latter equals the amount of variance of fitness (*V*) according to Fisher’s fundamental theorem of natural selection (Fisher 1930). Thus, −*s*/*T* = −*Us* + *V* and, therefore,(2)The above expression is a particular case of the more general Equation 15A from García-Dorado (2007).

The relative cumulative value *Q*_{r} over generations (*i*) contributed by a new association of the focal neutral gene with a particular selected locus having a recombination frequency *r* between them, is(Santiago and Caballero 1998). The ratio *V*_{M}/*V* is the rate of reduction of genetic variation due to selection and drift, which for now we consider to be constant for all the selected loci at equilibrium.

Selection on distant genes does not have influence on the drift process of the focal neutral gene and, for close linkage, recombination rates and genetic distances are nearly proportional. Assuming this proportionality across the genome, the cumulative effect of selection over generations (*VQ*^{2}) for a neutral gene located in the middle of a chromosome *L* M long can be given by(a more detailed deduction of this expression in given in Supplemental Material, File S1A). By using this integral to represent the cumulative effect over the genome, we also assume that the selected sites are uniformly and densely distributed. Substituting *V*_{M} = *Us*^{2} and *V* = *Us* − *s*/*T* from above, the effective size accounting for an arbitrary rate of fixation of deleterious mutations can be approximated as (see File S1B)(3)If the rate of fixation is negligible (*T* → ∞) then Equation 3 reduces to (1), as expected.

It is important to note that the derivation of these equations corresponds to the concept of the asymptotic effective population size under selection, as defined above. For a cohort of new alleles, this *N*_{e} represents the magnitude of drift or inbreeding rate that is reached after a fairly large number of generations counted since they first appeared by mutation (Santiago and Caballero 1998). Until that moment, drift at the focal loci is expected to increase with time. We argue that the asymptotic effective size is the pertinent measure of drift to predict the rate of fixation of deleterious mutations, whose transitions from one copy to fixation are dominated by long-term associations with selective variants at closely linked loci.

The intensity of genetic drift is expected to be larger for old mutations than for recent ones. In terms of effective population size, the asymptotic *N*_{e} corresponds to old mutations but the particular *N*_{e} values associated to recent mutations are expected to be larger. The partial *N*_{e(}_{t}_{)} value which represents the intensity of drift after *t* generations is given bywhere is the corresponding cumulative value up to generation *t*.

### Heterozygosity effective size (*N*_{e}_{H})

_{H}

With regards to neutral variation, the heterozygosity of a population is the result of the accumulation of contributions by mutations that appeared at different generations backward in time, which are under different drift intensities. Strictly speaking, there is not a single effective size value that describes the spectrum of neutral variation in selected populations. The heterozygosity effective size, *N*_{e}* _{H}*, obtained from the standing heterozygosity

*H*at neutral loci by solving the simple equation

*H*= 2

*μN*

_{e}

*, where*

_{H}*μ*is the rate of neutral mutation, is a sort of average effective size of convenience that is larger than the asymptotic

*N*

_{e}:(4)where each element of the infinite sum is the contribution to current heterozygosity by neutral mutations, which occurred

*t*generations backward in time. The difference between asymptotic

*N*

_{e}and heterozygosity

*N*

_{e}

*is not expected to be large with recombination, because most of the change in partial*

_{H}*N*

_{e(}

_{t}_{)}values during the lifespan of the cohort of neutral mutations occurs in the first generations (see below). However, the difference between the asymptotic

*N*

_{e}and

*N*

_{e}

*can be substantial under tight linkage and small effects (*

_{H}*N*

_{e}

*s*< 1) as will be shown below.

### Joint prediction of the asymptotic *N*_{e} and the rate of fixation of deleterious mutations

The expected number of generations between consecutive fixations (*T*) (or its inverse 1/*T*, the rate of fixation of deleterious mutations) can be predicted by the product of the number of newly arising mutations per generation (*NU*) and their probability of fixation (*P*_{f}), which was developed by Malecót (1952) and Kimura (1957) as *P*_{f} ≈ (2*sN*_{e}/*N*)/(exp[2*sN*_{e}] − 1). We argue that an approximation for the average number of generations between consecutive fixations can be given by *T* ≈ 1/(*NUP*_{f}), *i.e.*(5)Thus, we have an approximation of *N*_{e} accounting for the rate of fixation of deleterious mutations (Equation 3) and another of *T* from the probability of fixation of mutations given as a function of *N*_{e} (Equation 5). These equations then provide a system of two independent equations with two unknowns, *T* and *N*_{e}, given values of *N*, *U*, and *s*. Equation 3 can be rearranged in a more convenient form by solving for *T* (see File S1C). The prediction of the standing variance for fitness *V* is straightforward by substituting *T* into Equation 2. This variance *V* is the corresponding one to be used in Equation 4 to predict heterozygosity.

### Extension of the prediction of *N*_{e} to an arbitrary distribution of mutation effects

The prediction of *N*_{e} can be easily extended to account for any continuous distribution of mutational effects, *f*(*s*) (see File S1D),(6)where the corresponding terms for any particular effect *s* are the same as for fixed effects derived above, *i.e.*, , , , and . The mutation rate *U*_{s} for an effect *s* is obtained from the distribution, *f*(*s*), of effects: .

Equation 6 needs to be solved numerically for *N*_{e}, which is present at both sides of the equality sign.

### Heterogeneity of sites and hotspots of recombination

The effect of selection on the asymptotic *N*_{e} for a focal neutral site under linkage has generally been considered to be the addition of the effects of selected loci. For a number *n* of selected loci, this can be indicated in a discrete form by the cumulative variance,to be included in the prediction of , where *r*_{i} is the recombination frequency between the neutral locus and the *i*th selected gene, and *V*_{Mi} and *V*_{i} are the mutational variance and the standing genetic variance contributed by locus *i* at equilibrium (the ratio *V*_{Mi}/*V*_{i} is the rate of reduction of genetic variance due to selection and drift at that locus).

This approach has been widely used to predict the landscape of neutral variation in a genome region with variable mutation and recombination rates over the sites by considering that the terms of the sum are independent. However, the reduction of variance due to fixation of deleterious mutations makes the variation for fitness at a particular site dependent on the effects of the other selected sites (Hill and Robertson 1966). The particular *N*_{ei} value for each selected site *i* must be included in the terms of the sum because *V*_{i} = *s*_{i}*U*_{i} − *s*_{i}/*T*_{i}, as previously shown, and the rate of fixation 1/*T*_{i} at site *i* depends on *N*_{ei}, which, in turn, depends on the particular *V*_{i} values at the other sites. Therefore, all the terms of the sum depend on each other and all *N*_{ei} values have to be estimated at the same time.

The problem becomes complex if sites are not equivalent, that is, if they differ for recombination and mutation rates and effects. With regards to recombination, an approximate solution for *N*_{e}, valid for any site, can be obtained if we assume that mutation rates and distributions of effects are equal for all loci and the pattern of recombination is regularly repeated over the genome. File S1E gives a solution for the particular case of *m* segments flanked by hotspots of recombination evenly spaced over the genome and an arbitrary distribution of effects. The prediction of *N*_{e} can be approximated by(7)where *K* is the proportion of recombination events located at hotspots, occurring in the remainder fraction (1 − *K*) within the segments between hotspots. The terms and are the mutational variance and the genetic variance for fitness contributed by the segment. As effects and rates are uniformly distributed, then and .

### Evaluation of expressions by simulation

Computer simulations were carried out to evaluate the predictions of *N*_{e} and *T*. Haploid populations of size *N* ranging from 10 to 1.5 × 10^{7} individuals were run with selection-reproduction-mutation cycles for a large number of generations to reach the selection–drift balance. An infinitely large number of biallelic loci with potential effect on fitness was assumed over a genome of length *L* = 0, 0.1, or 1 M. Effects on fitness were considered either constant (*s* = 0.001, *s* = 0.01, or *s* = 0.05) or taken from an exponential distribution with mean . Selection was implemented by sampling with replacement individuals with probabilities proportional to their fitness values. Having selected a mating pair, a zygote was formed by allowing recombination events to occur between the genomes, in a number that was sampled from a Poisson distribution with mean *L*. Then, new mutations were assigned to the zygote in a number that was sampled from a Poisson distribution with mean *U* (0.001, 0.01, 0.2, or 1).

The fixation of deleterious mutations is a long-term stochastic process, and we argue that it must be dependent on the asymptotic effective population size *N*_{e}. The computation of *N*_{e} from simulations was carried out by two different methods that give estimations of the asymptotic value. For the first one, biallelic neutral loci were inserted, equally spaced across the genome. After the population had reached mutation-selection equilibrium, neutral alleles were set up at frequency 0.5 and the rate of increase in neutral gene frequency variance over replicates (*V*_{q}) was scored for consecutive generations (*t*) over a period of 300–500 generations until an asymptotic stage was approximately reached. Thus, the effective population size at generation *t* was calculated as *N*_{e(}_{t}_{)} = [0.25 − *V*_{q,}_{t}_{− 1}]/[*V*_{q,}_{t}_{− 1} − *V*_{q,}* _{t}* ]. The effective size

*N*

_{e(}

_{t}_{)}was also calculated from the rate of decline in heterozygosity in consecutive generations (

*H*

_{t}) of the neutral alleles as

*N*

_{e(}

_{t}_{)}= 1/[1 − (

*H*

_{t}/

*H*

_{t − 1})], and the same results were obtained as for the gene frequency variance calculation, as expected. The asymptotic

*N*

_{e}value was approximated by

*N*

_{e(}

_{t}_{)}for a sufficiently large

*t*.

For the second method of estimating the effective size from simulations, this was obtained from the simulated rate of fixation 1/*T* of deleterious mutations by solving Equation 5 for *N*_{e}. We will call this measure the “fixation effective size.” We checked that, as expected, this measure of the effective size was very close to the simulated asymptotic effective size as described above. Because this latter method demands much less computational time than the preceding one, we generally used this one except for those scenarios where the rate of fixation is so slow that the method is not useful, and the previous one was applied. Results were averages of a variable number of replicates or generations (thousands to millions) depending on the population sizes and mutational parameters.

To estimate the heterozygosity effective population size, *N*_{e}* _{H}*, a neutral locus was allocated in a central position of the genome. For each generation, the value of each individual’s neutral locus was changed by adding a mutational effect randomly sampled from the standard normal distribution. Under this distribution of mutational effects, the equilibrium variance is the expected heterozygosity effective population size

*N*

_{e}

*at the neutral locus (Hill 1982).*

_{H}### Data availability

The authors state that all data necessary for confirming the conclusions presented in the article are represented fully within the article.

## Results

### Computation of the effective size under the background selection model

In the background selection model, with a constant flux of deleterious mutations, the effective population size can be computed in different ways (Figure 1). Equation 3 refers to the asymptotic effective size under selection and linkage which, as mentioned above, may take a large number of generations to be reached depending on the genome length, the population size, and the mutational parameters. An illustration of this approach is shown in Figure 1A with simulation results for constant deleterious effects. The simulated asymptotic *N*_{e} computed from the increase in gene frequency variance at neutral loci is close to those calculated from the simulated rate of fixation of deleterious mutations by solving Equation 5 for *N*_{e} (fixation effective size; right-hand circles). This agreement supports the hypothesis that the asymptotic *N*_{e} value is the relevant parameter to represent the drift process responsible for the fixation of deleterious mutations in the long term. In contrast, differences between the asymptotic *N*_{e} and heterozygosity *N*_{e}* _{H}* are larger. These differences are a consequence of the process of accumulation of variation contributed by neutral mutations that occurred in previous generations, as explained before.

The same pattern is observed when deleterious effects are variable. Figure 1B shows the three effective size estimates (asymptotic, fixation, and heterozygosity) for a continuous distribution of effects *f*(*s*). Here, the fixation effective size *N*_{e} is computed by solving the corresponding equation for the observed rate of change in fitness due to fixations of deleterious mutations:(8)where *ds*/*T*_{s} is the rate of fixation of mutations with effects within the range from *s* to *s* + *ds*.

In summary, in a background selection model the asymptotic effective size and the fixation effective size are very similar, as hypothesized, and the heterozygosity *N*_{e}* _{H}* can differ to some extent under certain scenarios. In order to compare predictions with simulations of rates of fixation we will generally consider, for simplicity, simulated fixation effective size values obtained by solving Equation 5 for constant effects and Equation 8 for variable effects, unless stated otherwise.

### Predictions of the asymptotic effective size accounting for fixation of deleterious mutations

Figure 2 represents the ratio *N*_{e}/*N* over a wide range of *N* and *s* values. For scenarios where *s* ≈ 1/*N*_{e} and below (left-hand region of the thick line determining *s* = 1/*N*_{e}), Equation 1 for *N*_{e} does not hold, as there is fixation of deleterious mutations (*N*_{e} is predicted by Equation 3). In this region, it can be noticed that *N*_{e} is not proportional to changes in *N*, *i.e.*, *N*_{e} becomes almost insensitive to increases in *N* (the ratio *N*_{e}/*N* decreases for increasing *N*), except when effects are so small that mutations are effectively neutral (plateau at the left-hand top corner), where the fixation rate is 1/*T* ≈ *U* and *N*_{e} equals *N*. When the effects are large (*s* >> 1/*N*_{e}) (right-hand region of the thick line *s* = 1/*N*_{e}) fixation is negligible (1/*T* → 0) and there is proportionality in changes of *N*_{e} and *N* (the ratio remains constant for increasing *N*).

### Predictions of the heterozygosity effective size, *N*_{e}_{H}

_{H}

Figure 3 shows a comparison between predictions of the ratio *N*_{e}/*N* (green line) and the predictions of *N*_{e}* _{H}*/

*N*(blue line). The latter is equivalent to the reduction of the neutral diversity of a population under selection relative to the diversity of the same population without selection (π/π

_{0}). Both predictions overlap for intermediate and large values of selection coefficients

*s*, but they are quite different for small

*s*values. Predictions from Good

*et al.*(2014) for π/π

_{0}(red line) show a small deviation from simulations, which is persistent in the region of weakly selected mutations.

### Rate of fixation of deleterious mutations with no recombination

The rate of fixation of deleterious mutations from our predictions provides a solution for the rate of advance of Muller’s ratchet in asexual populations. Table 1 and Figure 4 show simulation results and predictions for 1/*T* and *T*, respectively, using Equations 3–5 as well as equations by (Gordo and Charlesworth 2000a,b), Rouzine *et al.* (2008), Neher and Shraiman (2012), and Good *et al.* (2014) for the rate of progress of Muller’s ratchet (the different methods used are given in File S1F). Predictions by Gordo and Charlesworth and Neher and Shraiman’s equations are appropriate when the rate of fixation is low but produce underestimations of 1/*T* (Table 1) or overestimations of *T* (Figure 4), particularly those of Neher and Shraiman, for scenarios with high fixation rate. This deviation is probably a consequence of the assumption that the proportion of individuals in the least-loaded class is exp[−*U*/*s*], which is the expectation for an infinitely large population, but it deviates from the real value when the rate of fixation is high. The theory by Good *et al.* (2014) largely overpredicts *T* when the ratchet runs slowly and Rouzine’s (2008) predictions are generally precise, but give large underestimations of *T* when the fixation rate is very slow (Figure 4). Equations 3–5 give reasonable predictions for the range of scenarios considered, although provide biased estimates in the slow-ratchet regime.

### Rate of fixation of deleterious mutations with recombination

Predictions of asymptotic effective size and the rate of fixation of deleterious mutations with Equations 3–5 for different genome lengths (*L*) are shown in Table 2 compared with simulations. In general, a satisfactory agreement is found between predictions and simulations. The nonlinear relationship between *N*_{e} and *N* can also be observed from the table. For example, for *U* = 1, *s* = 0.01, and *L* ≤0.1, *N*_{e} is around or below 100 both for *N* = 10^{3} and 10^{4}.

We can use the predictive Equations 3–5 to illustrate the general conditions for the mutational meltdown. Figure 5A shows the rate of fixation relative to that for neutral genes (1/*TU*) for different population sizes and mutation effects. From this figure it becomes evident that there is a quick shift from significant rates of fixation to nearly null fixation rates as either *N* or *s* increase. Therefore, the transition between the conditions under which fixation occurs, or not, can be effectively represented by a line, as shown in Figure 5, B and C. These show the lines of transition between conditions of significant accumulation of detrimental mutations and low rates of accumulation as a function of *L*, *U*, *s*, and *N*. In asexual organisms (*L* = 0), the progress of the ratchet is strongly dependent on the mutation rate. For sexual species (*L* > 0) the dependence on mutation rates is less conspicuous. Note, however, that these results refer to a model where there are only deleterious mutations, so they cannot be taken as general.

### Prediction of the asymptotic *N*_{e} for a variable distribution of mutation effects and heterogeneity of recombination rates

Predictions of *N*_{e} for a variable distribution of mutation effects using Equation 6 are in good agreement with simulations in situations either with or without recombination (Figure 6; continuous blue lines). On the opposite side, predictions using the classical Equation 1 (red dashed lines), with the average instead of the constant effect *s*, are larger than observed *N*_{e} values and the deviation increases for increasing *N*. This result was expected as the size of the least-loaded class is mainly dependent on the harmonic mean of the distribution of effects (Charlesworth *et al.* 1993). Equation 6 is in fact the harmonic mean of effects weighted by their contributions to the response to selection relative to the contributions in an equivalent but infinitely large population, that is, without fixations.

Now consider situations where recombination takes place at fixed hotspots (simulations shown as triangles and squares in Figure 6B), where predictions by Equation 7 are rather accurate. In this case, the average effective population size for the whole genome can be drastically reduced relative to that for evenly distributed recombination points. Moreover, *N*_{e} becomes more independent on changes in *N* as the proportion of events of recombination occurring at hotspots increases.

Finally, Figure 7 shows the ratio *N*_{e}/*N* for a wide range of mutational effects, rates of mutation, and levels of linkage. The analysis indicates that *N*_{e} becomes very insensitive to increases in population size for an ample set of parameters, and this is more extensive with no recombination when the shape of the distribution becomes very leptokurtic (β = 0.2; Figure S1). In contrast, with a uniform recombination map, the shape of the distribution of effects has virtually no effect on the predictions of *N*_{e} and the general conclusion is that *N*_{e} and *N* are approximately proportional when *U*/*L* < 1, but the proportionality tends to disappear as this ratio increases (Figure S1).

## Discussion

When the effects of deleterious mutations are so small that there is a chance of fixation, the preexistent equation for *N*_{e} under background selection (Equation 1) does not hold. The reason is that the genetic variance, which is a variable in the equation, is reduced below the expectation for infinitely large populations. The prediction of the standing variance is complicated because it is dependent on *N*_{e} itself: the genetic variance is dependent on the rate of fixation that, in turn, is dependent on *N*_{e}. We argued that the effective population size and the rate of fixation of deleterious mutations can be jointly predicted by solving a system of two equations: a new equation for the asymptotic *N*_{e} that takes into account the fixation of deleterious mutations (Equation 3), and the well-known equation for the rate of fixation of selected genes by Malecót (1952) and Kimura (1957) (Equation 5).

The solution found is rather general, covering sexual and asexual populations and is given in the same terms as the main developments on *N*_{e}, which facilitates the application to populations with particular genetic, temporal, or spatial structures. For sexual species, the equations allow for a prediction of the rate of fixation of deleterious mutations under different linkage levels, permitting a detailed assessment of the theoretical scenarios where mutation accumulation may lead to mutational meltdown (Lynch and Gabriel 1990; Lynch *et al.* 1995; Whitlock and Bürger 2005). For asexual species, the equations are also an approximate solution for the rate of advance of Muller’s ratchet. That is, the method proposed unifies the concepts of Muller’s ratchet and mutational meltdown (Lynch *et al.* 1995)

The search for a prediction of the speed of Muller’s ratchet has been previously based on the analysis of the fluctuations in size of the fittest class due to selection, mutation, and drift (Haigh 1978; Bell 1988; Gordo and Charlesworth 2000a,b; Etheridge *et al.* 2007; Jain 2008; Rouzine *et al.* 2008; Neher and Shraiman 2012). Eventually, these fluctuations determine the extinction of this class and its substitution by a class with an increased load of mutations. In our model, there is an expected increment of one deleterious mutation per individual during the time (*T*) between two consecutive fixations (Lynch and Gabriel 1990; Charlesworth and Charlesworth 1997). Given that the distribution and density of the genetic variation remains constant at mutation-selection-drift equilibrium, this increment means that a fixation of a detrimental gene occurs on average during the same time span. Following this rationale, our systems of Equations 3–5 provide reasonably good approximations of the average time between fixations *T* covering a wide range of scenarios with the largest deviations in the slow-ratchet regime, where fixations occur every many thousands or millions of generations. Probably, fluctuations of fitness variance around the value *V*, which is assumed to be constant in the model, lead to a loss of precision that is critical in the region of very rare fixations.

We have extended the theory to a variable distribution of mutation effects (Equation 6), which provides predictions of the rate of decay in fitness. These predictions, as well as computations by Equation 7, assume that there is a single *N*_{e} value that represents the drift process affecting all the spectrum of selective effects. This assumption can be approximately held as far as the population is at mutation-selection-drift equilibrium, which is the implicit condition in the derivation of the original equation for the probability of fixation by Kimura (1957). However, when selective sweeps of beneficial mutations are also included in the model, mutations with different deleterious effects are fixed with probabilities that do not correspond to the same *N*_{e} value in Equation 5 (Messer and Petrov 2013). The reason for this is that selective sweeps induce large fluctuations in *N*_{e} over time, which obviously affect different deleterious mutations in different ways as given by Kimura’s equation: fixation rates of mutations with different deleterious effects are differentially affected as *N*_{e} fluctuates. Simulation results given in File S1G and Figure S2 illustrate these remarks.

Prediction of *N*_{e} for a variable distribution of effects is a fundamental problem, which has not been completely solved so far. Charlesworth *et al.* (1993) suggested the use of the harmonic mean of selection coefficients, but this solution cannot be readily applied to a continuous distribution of effects that includes weakly selected mutations. Thus, Soderberg and Berg (2007) split the distribution into three parts (large, medium, and small effects) and assumed that only the *N*_{e} value corresponding to the large-effects part would determine the probability of fixation of weakly selected mutations. As small effects are ignored, predicted *N*_{e} severely overestimates the real values. The solution given by Good *et al.* (2014) is an adaptation for their theory for genetic diversity to a distribution of weak effects by substituting *s* by the squared root of the second moment of the distribution of effects, but it is only applicable to weak effects. Here we showed that the harmonic mean provides a reasonably accurate solution when the reduction in fitness variance due to fixation of deleterious mutations is considered.

We have argued that the ultimate probability of fixation of deleterious mutations is dependent on the asymptotic effective size *N*_{e}. However, conceptually, this *N*_{e} is not the one to be included in the well-known equation for neutral diversity *π* = 2*μN*_{e}. The prediction of π requires a heterozygosity effective size, *N*_{e}* _{H}*, which takes into account the differential intensity of the drift process acting on mutations that appeared at different generations backward in time. The value of

*N*

_{e}

*can, however, be straightforwardly estimated with the approach proposed (Equation 4).*

_{H}The nonlinear relationship between *N*_{e} and *N*, explicit in the predictive equation for *N*_{e} accounting for the fixation of deleterious mutations (Equation 3), may have an important consequence in our understanding of the genetic diversity in nature. A limitation attributed to the background selection model is the impossibility of explaining the relative homogeneity of variation between species given the huge differences in census sizes across them. Genetic diversities at neutral sites of most eukaryotes and also most prokaryotes fall within a range of an order of magnitude (Lynch and Conery 2003; Leffler *et al.* 2012), while the average census sizes for entire species surely differ in more than three orders of magnitude; showing that levels of polymorphism are quite insensitive to variation of population size (Lewontin 1974).

Several alternative or complementary explanations have been given for this paradox. One would be the greater impact of nonequilibrium demographic perturbations, such as a high variation in reproductive success, in species with large population sizes, as shown by Romiguier *et al.* (2014). In a genome-wide diversity analysis of 76 nonmodel animal species, they showed that long-lived or low-fecundity species were genetically less diverse than short-lived or high-fecundity species, pointing toward this explanation of the Lewontin’s paradox.

An alternative explanation is the possible constrain in genomic variation imposed by natural selection, particularly in low recombination genome regions, via fixation of beneficial mutations and elimination of deleterious ones. A number of theoretical studies have found a nonlinearity between the heterozygosity effective size *N*_{e}* _{H}* and the census size

*N*(Santiago and Caballero 1998, Gordo

*et al.*2002, O’Fallon

*et al.*2010, Neher and Hallatschek 2013, Neher

*et al.*2013, Good

*et al.*2014, this article). From empirical data, Corbett-Detig

*et al.*(2015), analyzing genome diversity across 40 species, have shown that natural selection removes more variation at linked neutral sites in species with large population sizes (indirectly inferred from proxies such as body size and species range) than in those with lower population sizes, giving support to this explanation. The interpretation made by Corbett-Detig

*et al.*(2015), however, has been argued by Coop (2016), who gives an alternative one pointing toward the idea that demographic fluctuations are the main determinants of the levels of diversity across species.

Preexistent equations for *N*_{e} under background selection are linear functions of *N* (Hudson and Kaplan 1995; Nordborg *et al.* 1996; Santiago and Caballero 1998; Nicolaisen and Desai 2013), which has led to the consideration that genetic drift implies that neutral variation depends linearly on population size (Gillespie 2000). However, the inclusion of the rate of fixation in Equation 3 introduces a nonlinear relationship between *N*_{e} and *N*: when *N*_{e}*s* is ∼1 or lower, the ratio *N*_{e}/*N* decreases for increasing *N*. To a similar extent, this also occurs for the heterozygosity effective size, *N*_{e}* _{H}*. For a model with a continuous distribution of deleterious effects in the range from 0 to 1, there will always be some

*s*values for which

*N*

_{e}

*s*< ∼1 for any

*N*

_{e}and, therefore, the ratio

*N*

_{e}/

*N*will no longer be strictly constant (Figure 6, Figure 7, and Figure S1). In other words, strong reductions of

*N*values lead weakly selected genes to be nearly neutral, reducing the impact of selection on the reduction of

*N*

_{e}in a kind of buffering effect. Although weakly selected genes have a small contribution to the standing genetic variation, they are very relevant to the cumulative effect of selection on

*N*

_{e}because they can persist in the population for a long time. Thus, the simplification of considering a constant effect

*s*results in the erroneous conclusion that the background selection model predicts a linear relationship between

*N*

_{e}and

*N*, and this does not hold with the more realistic assumption that mutations effects are highly variable (Kousathanas and Keightley 2013).

The ratio *N*_{e}/*N* can be affected by the shape of the distribution of mutation effects (parameter β of a gamma distribution in Figure S1). With asexual reproduction, the more leptokurtic the distribution of effects (the lower β), the wider the spectrum of parameters where progressive increases in *N* lead to lower increases of *N*_{e}, which is consistent with the observed loose dependence of *N*_{e} on *N* in nature. For sexual reproduction, the nonlinear relationship between *N*_{e} and *N* is very evident when *U*/*L* > 10, irrespective of the shape of the distribution of mutational effects (Figure S1).

Note, however, that the direct application of the genome length in the predictive equations of *N*_{e} (3) and (6) assumes that recombination events occur with equal probability between adjacent deleterious sites evenly spaced over the genome. In contrast, most evidence suggests that recombination concentrates in most species at particular hotspots (Myers *et al.* 2005; Berg *et al.* 2010; Grey *et al.* 2011). Although there is variation in the location of the hotspots within and between populations and over time (Anjali *et al.* 2011; Munch *et al.* 2014), hotspots shape the structure of linkage disequilibrium in populations (Stump 2002; De La Vega *et al.* 2005; Nishant *et al.* 2006; Lau *et al.* 2007) and, therefore, they also probably determine how selected loci affect *N*_{e} at neutral linked sites. In this article we have also derived an expression of *N*_{e} in this scenario with some simplifying assumptions (Equation 7). The results indicate that the presence of recombination hotspots further reduces the ratio *N*_{e}/*N* (Figure 5B).

Our equation of the effective size, therefore, states that there is a nonlinear relationship between *N*_{e} and *N*, which may contribute to explain Lewontin’s paradox under the hypothesis that natural selection is a determinant of the level of variation across population sizes. However, this effect is likely to be only part of the explanation, because rather restrictive scenarios (*U*/*L* > 1) are needed for the nonlinearity between *N*_{e} and *N* to have a drastic effect. Let us consider a numerical example using human data with a number of plausible parameters. First, it is generally assumed that recombination hotspots account for around 70% of recombination events (average numbers taken from Myers *et al.* 2005 and the International HapMap Consortium 2007 based on a single population), but an in-depth analysis combining data from several populations predict that 95–97% of all crossovers can be explained by hotspots (Khil and Camerini-Otero 2010). Thus, we shall assume that 97% of the ∼30 M of the human genetic map are assigned to hotspots and the remaining 3% is assigned at any other random sites. Second, from the analyses of sequence data, and by making assumptions about the fraction of the genome constrained by selection, deleterious mutation rates for humans have been estimated to be of the order of *U* = 1.1 per haploid genome (Keightley 2012). This is probably a lower bound as it disregards transposable elements subject to selection, insertions, and deletions, as well as adaptive mutations and weakly selected mutations, which lead to an underestimation of the proportion of sites in the genome that are subject to negative selection (Lesecque *et al.* 2012). Although difficult to say, let us tentatively assume that *U* is 10× larger, as suggested by Reed *et al.* (2005). In a euchromatic genome with ∼3 Gb, a typical segment of 100 kb flanked by hotspots (this is the average segment size in a genome with 30,000 hotspots) could account for an average rate of mutation equal to 3.7 × 10^{−4}. Finally, the analysis from nonsynonymous mutations producing segregating amino acid polymorphisms in humans (Boyko *et al.* 2008) suggests a distribution of mutation effects close to a gamma distribution with shape parameter β = 0.2 and average effect = 0.03. This estimated average effect is likely to be an overestimation, as it only accounts for amino acid-changing (nonsynonymous) mutations. Let us assume that a more realistic average is 10× lower, = 0.003. Under all these assumptions, *i.e.*, by assigning parameters *L* = 30 M, *m* = 30,000, *K* = 0.97, *U* = 11, = 0.003 and β = 0.2 to Equation 7, the ratios *N*_{e}/*N* for populations of size *N* = 10^{2}, 10^{4}, 10^{6}, and 10^{8} are *N*_{e}/*N* = 0.930, 0.601, 0.136, and 0.018, respectively. This illustrates the nonlinear relationship between *N*_{e} and *N*. The corresponding predictions of *N*_{e}/*N* using the previous Equation 1 are invariably 0.480.

However, if less extreme parameters are considered in the above example, then the nonlinear relationship between *N*_{e} and *N* fades. For instance, if *U* is assumed to be 1.1 and = 0.03 keeping the other parameters invariant, the prediction of the ratios *N*_{e}/*N* for populations of size *N* = 10^{2}, 10^{4}, 10^{6}, and 10^{8} are *N*_{e}/*N* = 0.973, 0.937, 0.785, and 0.678, respectively. Thus, it seems to be that background selection could explain Lewontin’s paradox but only for some combinations of parameters that are in the limit of the range of present day estimations. It may well be, however, a different situation in other scenarios, such as selfing species, where the interrelation between inbred matings and selection drastically reduces the effective size (Santiago and Caballero 1995). This is, in fact, compatible with the results from Coop (2016), who found that the selective interpretation of Lewontin’s paradox could well apply to selfing species.

Summing up, the appropriate inclusion of the interaction between the effects of genetic drift and selection in the equations of *N*_{e} provides a way to address some problems, particularly those related with the accumulation of deleterious mutations, and modifies the predictions of genetic variation under the background selection model. In this context, the distribution of effects and the global pattern of recombination become determinants of the variation in natural populations. Estimating these parameters in real populations is important to assess if the background selection model can effectively explain the observed homogeneity of variation between species and maybe also within the genome (Gossmann *et al.* 2011).

## Acknowledgments

We thank Brian Charlesworth for useful discussions on the topic and Michael Desai and Benjamin Good for constructive comments on the manuscript and for providing valuable simulation data. This work was funded by Ministerio de Economía y Competitividad (CGL2012-39861-C02-01 and CGL2016-75904-C02), Xunta de Galicia (GPC2013-011), and Fondos Feder: “Unha maneira de facer Europa.”

## Footnotes

*Communicating editor: G. Coop*Supplemental material is available online at www.genetics.org/lookup/suppl/doi:10.1534/genetics.116.188250/-/DC1.

- Received February 17, 2016.
- Accepted September 20, 2016.

- Copyright © 2016 by the Genetics Society of America