Abstract
The distribution of fitness effects (DFE) is central to many questions in evolutionary biology. However, little is known about the differences in DFE between closely related species. We use >9000 coding genes orthologous one-to-one across great apes, gibbons, and macaques to assess the stability of the DFE across great apes. We use the unfolded site frequency spectrum of polymorphic mutations (n = 8 haploid chromosomes per population) to estimate the DFE. We find that the shape of the deleterious DFE is strikingly similar across great apes. We confirm that effective population size (Ne) is a strong predictor of the strength of negative selection, consistent with the nearly neutral theory. However, we also find that the strength of negative selection varies more than expected given the differences in Ne between species. Across species, mean fitness effects of new deleterious mutations covaries with Ne, consistent with positive epistasis among deleterious mutations. We find that the strength of negative selection for the smallest populations, bonobos and western chimpanzees, is higher than expected given their Ne. This may result from a more efficient purging of strongly deleterious recessive variants in these populations. Forward simulations confirm that these findings are not artifacts of the way we are inferring Ne and DFE parameters. All findings are replicated using only GC-conservative mutations, thereby confirming that GC-biased gene conversion is not affecting our conclusions.
- DFE
- deleterious mutations
- beneficial mutations
- epistasis
- compensatory evolution
- effective population size
ALL organisms undergo mutation. Studying the effect of mutations on fitness is fundamental to explain the patterns of genetic diversity within and between species and to predict the effect of population size on the probability of survival, or extinction of a species (Eyre-Walker and Keightley 2007). The fitness effects of all new mutations that can occur in a given genome are described by the distribution of fitness effects (DFE). The allele frequency distributions or the site frequency spectrum (SFS) contains information to infer the DFE of new mutations. Current statistical methods infer the DFE while accounting for demography and other sources of distortion in the SFS (Eyre-Walker et al. 2006; Keightley and Eyre-Walker 2007; Boyko et al. 2008; Schneider et al. 2011; Galtier 2016; Kim et al. 2017; Tataru et al. 2017; Barton and Zeng 2018).
The DFE of new deleterious amino acid mutations is assumed to be similar in closely related species, while the mean effect of those deleterious mutations (Sd = 2Nesd) is expected to increase as a function of the effective population size (Ne) (Kimura 1983; Ohta 1992). Very little is known about the variability in the DFE of new beneficial mutations across species as well as how this variability affects the estimates of the deleterious DFE. Most studies have described the DFE of new deleterious mutations of individual populations, while the study of the differences in the DFE between populations or species has remained unexplored. To our knowledge, Huber et al. (2017) is the first work that tries to explain, under different theoretical models, the detected differences in the deleterious DFE of new amino acid mutations in humans, Drosophila, yeast, and mice. They find that humans have more strongly deleterious mutations than Drosophila and that species complexity (measured by the number of different phenotypic dimensions under stabilizing selection) correlates positively with the degree of deleteriousness of new mutations as expected under Fisher’s geometrical model (Fisher 1930). They find that these differences in the DFE cannot be explained by differences in population size or demography between the species. Recently, Zhen et al. (2018) argued that there is a higher proportion of beneficial amino acid changing mutations in humans compared to mice and flies when accounting for the population size of the outgroup species. Zhen et al. (2018) use divergence to detect beneficial mutations that reached fixation many generations ago, many of those were strongly beneficial mutations that contribute disproportionately more to divergence than polymorphism. Here we focus instead on new weakly beneficial (still segregating) mutations. The power of our approach to infer with precision the tail of the beneficial DFE is reduced. However, relying solely on polymorphisms means that our estimates are more robust to ancient fluctuations in the effective population size that affected the probability of fixation of slightly selected mutations in the ancestral populations (Rousselle et al. 2018).
Whether the full DFE varies on shorter timescales remains an open question. It also remains to be considered the contribution of new weakly beneficial mutations to the SFS and by extension the estimates of the deleterious DFE. Tataru et al. (2017) have shown that beneficial mutations can indeed contribute to the SFS and affect the estimates of the deleterious DFE. In fact, Huber et al. (2017) reported more weakly beneficial mutations in humans than in Drosophila, indicating that weakly beneficial mutations are relevant in humans and probably are also important in other nonhuman great apes. A recent study has found that ∼3/4 of adaptive substitutions in the human lineage were driven by weakly beneficial amino acid mutations (Uricchio et al. 2019). In this work, we compare the DFE in a panel of closely related species, the nine great ape populations studied by Prado-Martinez et al. (2013). Great apes share most of their genes and genomic configuration. Gene density, mutation rate, and recombination rate (at least at broad scale) are very similar in these species (Stevison et al. 2016; Kronenberg et al. 2018; Smith et al. 2018; Besenbacher et al. 2019). Thus, our inference of the DFE is affected only marginally by variation in these factors. Interestingly, there is substantial variation in the effective population size (Ne) and remarkable differences in the population histories of great apes (Mailund et al. 2011; Scally et al. 2012; Prado-Martinez et al. 2013; Bataillon et al. 2015; McManus et al. 2015; de Manuel et al. 2016). This variation in Ne allows us to test predictions of the nearly neutral theory (Ohta 1992). By polarizing mutations and using the unfolded SFS (uSFS), we are able to infer the DFE of new weakly beneficial mutations without relying on divergence and make a cleaner inference of the current deleterious DFE. This allows us to investigate if Ne also has an effect on the beneficial DFE. For instance, it is commonly assumed that the rate and effect size of beneficial and deleterious mutations are shared between closely related species, but these quantities might change if the fitness of the population changes (Silander et al. 2007). Recent works have shown that GC-biased gene conversion (gBGC) can bias the inference of the DFE and population history (Bolívar et al. 2018; Pouyet et al. 2018). Here we also assess the effect of gBGC by replicating our analyses using only GC-conservative mutations (A<−>T and C<−>G), which are unaffected by gBGC.
Finally, we use the method of Tataru and Bataillon (2019) (polyDFEv2.0) to test for invariance of DFE across species. Under this new method any fitted parameter can be shared across species or fitted independently for each species. In this work, we compare the fit of several models to investigate which aspects of the full DFE are shared and which aspects differ between great apes.
Materials and Methods
Data sets
SNP calls from the autosomes are retrieved from Prado-Martinez et al. (2013) for nine great ape populations: Homo sapiens (this sample includes three African and six non-African individuals), Pan paniscus, Pan troglodytes ellioti, Pan troglodytes schweinfurthi, Pan troglodytes troglodytes, Pan troglodytes verus, Gorilla gorilla gorilla, Pongo abelii, and Pongo pygmaeus. Hereafter we refer to these species as humans, bonobos, Nigeria-Cameroon chimpanzees, eastern chimpanzees, central chimpanzees, western chimpanzees, western lowland gorillas, Sumatran orangutans, and Bornean orangutans, respectively. To allow a fair comparison between species (note that only four individuals of western chimpanzees and central chimpanzees were sequenced), we down sample all great apes to eight randomly chosen haploid chromosomes, while positions called in fewer than eight chromosomes are discarded. Table S1 shows the final list of analyzed genes per species. In Prado-Martinez et al. (2013) all reads are mapped to the human reference genome (hg18). We lift over the original variant call format (VCF) to hg19/GRCh37.75 coordinates to take advantage of more recent functional annotations. To avoid errors introduced by mismapping due to paralogous variants and repetitive sequences, we also restrict all analyses to a set of sites with a unique mapping to the human genome as in Cagan et al. (2016). Additionally, we require positions to have at least fivefold coverage in all individuals per species. Only the remaining set of sites is used in further analyses. Genomes are annotated using the SnpEff and SnpSift software (Cingolani et al. 2012) (version 4.3 m, last accessed June 2017) and the human database GRCh37.75. We extract zerofold nonsynonymous and fourfold synonymous sites from the codon information for the canonical transcript provided by SnpEff (twofold and threefold degenerate sites are discarded). We assume that the degeneracy and gene annotations are constant across species.
Polarization of mutations
To estimate the full DFE using DNA diversity data, we first need to polarize SNPs to call-derived variants (Boyko et al. 2008; Schneider et al. 2011; Tataru et al. 2017). We use two outgroup species (Nomascus leucogenys or gibbons, and Macaca mulatta or macaques) and a probabilistic method to polarize SNPs using Kimura 2-parameter (K2) model (Keightley and Jackson 2018). We downloaded a multiple species alignment from the University of California, Santa Cruz server between all known coding sequences in the human reference genome (GRCh37.75/hg19), gibbon (Nleu3.0/nomLeu3), and macaque (BGI CR_1.0/rheMac3) (from http://hgdownload.cse.ucsc.edu/goldenPath/hg19/multiz100way/alignments/knownGene.exonNuc.fa.gz). Sites are retained for analysis if there is no missing data in the focal species or either outgroup species. We also removed CpG sites, as CpG hypermutability may result in polarization errors (Keightley and Jackson 2018). CpG sites are defined as sites that are CpG in their context in either the focal species or any of the outgroup (including both REF and ALT alleles). In doing so, we remove 4% of all zerofold nonsynonymous and 5% of all fourfold synonymous sites that passed our previous quality filters. We compare the uSFS from the K2 model and a more complex model allowing six symmetrical rates between all potential nucleotide changes (R6 model) obtaining indistinguishable uSFS for synonymous and nonsynonymous mutations (data not shown). Table S1 shows the uSFS for synonymous and nonsynonymous mutations in all great apes. All our results can be recapitulated from that table.
Estimation of the DFE and bootstrapping
We use the polyDFEv2.0 framework (Tataru and Bataillon 2019) to estimate and compare the DFE across species by means of likelihood ratio tests (LRTs). The inference is performed on the uSFS data only (divergence counts to an outgroup are not fitted) and uSFS data are fitted using a DFE model comprising both deleterious (γ-distributed) and beneficial (exponentially distributed) mutations. polyDFE assumes that new mutations in a genomic region arise as a Poisson process with an intensity that is proportional to the length of the region and the mutation rate per nucleotide (μ). We assume that μ remained constant across great apes (Besenbacher et al. 2019). Both an ancestral SNP misidentification error (ε) and distortion parameters (ri) can be estimated. See Table 1 and Table 2 for the list of parameters estimated in each model. To ensure that the likelihood function is reliably maximized we perform 10 runs of maximization with randomly starting values.
The estimation of the DFE entails substantial statistical uncertainty. To obtain the sampling variance of parameter estimates and approximate confidence intervals, we use a bootstrap approach. Bootstraps are generated by resampling the data at the site level by parametric bootstrapping. We assume that all counts in the uSFS are independent variables following a Poisson distribution, with means specified by the observed uSFS data. This is in line with the modeling assumption that the number of mutations in each uSFS entry follows a Poisson process (Hartl et al. 1994; Tataru et al. 2017). We use the R function bootstrapData (from https://github.com/paula-tataru/polyDFE/blob/master/postprocessing.R) to get the bootstrap replicates.
Statistical test for differences in DFE across species and model averaging
polyDFE returns maximum likelihood (ML) estimates, and therefore, LRTs and the Akaike information criteria (AIC) can be used to compare models (Tataru and Bataillon 2019). The LRT entails fitting two nested models, where one reduced model is a special case of a more general model. A P-value is obtained by assuming that the log of the ratio of the MLs of the two models follows a X2 distribution parameterized by the difference in the number of degrees of freedom (i.e., number of estimated parameters) between the two models. A P-value < 5% means that the reduced model is rejected in favor of the more parameter-rich model. LRTs are performed comparing the pairs of models indicated in the main text (see Table 1 and Table 2 for the lists of models). This approach of comparing the DFE parameters between species through LRTs has been applied before (Huber et al. 2017; Zhen et al. 2018). Simulations demonstrate that the method guards against excessive type 1 error while retaining substantial power to detect differences across data sets when the ri parameters are estimated independently for each data set (Tataru and Bataillon 2019). We use the R function compareModels (from https://github.com/paula-tataru/polyDFE/blob/master/postprocessing.R) to compare pairs of models.
Model averaging provides a way to obtain honest estimates that account for model uncertainty. To produce the model average estimates of the beneficial DFE reported at Table 3 we weight each competing model (M3I, M3S, M2I, and M2S from Table 2) according to their AIC following this equation:where ΔAICj is obtained as
We use the R function getAICweights (from https://github.com/paula-tataru/polyDFE/blob/master/postprocessing.R) to do the model averaging.
Subsampling
To assess the contribution of the variation in sd and Ne to the realized strength of purifying selection (Sd) across great apes, we run a multiple linear regression. We first obtain statistically independent measures of Sd and sd. Note that sd = Sd/2Ne. To do so, we divide the coding genome into two halves drawing random sites without replacement. We use the first half to estimate Sd,1 and the second half to estimate sd,2. To estimate sd,2, we divide Sd,2 by 2Ne,1. Ne,1 is estimated by dividing θ (Watterson 1975), estimated using fourfold synonymous sites coming from the first half of the genome, by the genome-wide mutation rate per site and generation (μ = 1.65 × 10−8). To obtain an independent estimate of Ne, we use fourfold synonymous sites of the second half of the genome. We repeat this subsampling 100 times. We then run a multiple linear regression in a log-log scale with the R function lm[log(Sd,1) ∼ log(sd,2) + log(Ne,2)]. As a sanity check we run the complementary analysis lm[log(Sd,2) ∼ log(sd,1) + log(Ne,1)], obtaining equivalent results (data not shown). The relative importance of sd and Ne is assessed using standardized regression coefficients (β).
Other statistical analyses
All statistical analyses commented above are performed within the R framework (version 3.4.4) except the DFE estimation and the phylogenetically aware regression. To perform the phylogenetically aware regression between Ne and various summary statistics of the full DFE, we use BayesTraitsV3 (Pagel and Meade 2006). We set the method to employ random-walk and ML. Significance is assessed by comparing a model where the correlation is a fitted parameter to a model where the correlation is fixed to 0, by means of a likelihood ratio statistic. To perform the bivariate correlations, we use the function cor.test and the Pearson method. To perform the quadratic regression between Ne and Sd (or sd), we use the R function lm[y∼ poly(x, degree = 2)]. We then compare the linear regression and polynomial regression with an LRT. All figures are generated using the R package “ggplot2.”
Data availability
The final list of analyzed positions is available upon request. The authors confirm that all data necessary for confirming the conclusions of the article are present within the article, figures, and tables. Supplemental material available at FigShare: https://doi.org/10.25386/genetics.9705281.
Results
We compiled polymorphism data from nine great ape populations: bonobos, western chimpanzees, central chimpanzees, Nigeria-Cameroon chimpanzees, eastern chimpanzees, western lowland gorillas, Sumatran orangutans, Bornean orangutans, and humans. A cosmopolitan sample of humans (three African and six non-African diploid individuals) was used to represent human genetic diversity. All our population data are retrieved from Prado-Martinez et al. (2013). To allow a fair comparison, each population is subsampled to eight haploid chromosomes. We investigate >9000 autosomal coding genes that are orthologous one-to-one across great apes, gibbons, and macaques (the two outgroups we use to calculate the uSFS). See Table S1 for the whole list of analyzed genes and the uSFS. The proportion of shared SNPs between species is below 1% but it rises to 10–30% between pairs of chimpanzee populations and between the orangutan species, respectively (Table S2). This imperfection of the data makes the chimpanzee populations and orangutan populations statistically nonindependent.
The inference of the DFE from the uSFS can be affected by polarization errors (Hernandez et al. 2007), gBGC (Bolívar et al. 2018), tight linkage (Eyre-Walker and Keightley 2009; Kousathanas and Keightley 2013), ascertainment bias, nonrandom sampling, population structure, and demographic changes (Eyre-Walker et al. 2006; Keightley and Eyre-Walker 2007; Boyko et al. 2008). To quantify and correct polarization errors, we estimate a polarization error parameter (εanc) (Williamson et al. 2005; Boyko et al. 2008; Glémin et al. 2015). To account for additional biases, we compare the observed uSFS for putatively neutral (fourfold degenerate) synonymous mutations to the expected uSFS for neutral mutations under the Wright–Fisher model. To do that, we estimate a series of nuisance parameters (ri) using the uSFS for fourfold synonymous and zerofold nonsynonymous mutations [as in Eyre-Walker et al. (2006)]. If the ri parameters are significantly different from 1, we jointly estimate the ri parameters together with the parameters of the DFE; otherwise, only the DFE parameters are estimated. The ri parameters are adjusted for each allele frequency class in the uSFS. This effectively assumes that the neutral and selected uSFS are distorted in an identical way by these factors. Weakly beneficial mutations can also distort the uSFS and by extension our estimates of the deleterious DFE (Tataru et al. 2017). We therefore also coestimate the rate and strength of new beneficial mutations together with the ri, εanc, and the two parameters defining the deleterious DFE: the shape, b, and mean, Sd. Here we assume, like several studies in humans, that the deleterious DFE are γ-distributed (Boyko et al. 2008; Eyre-Walker and Keightley 2009; Galtier 2016; Huber et al. 2017; Kim et al. 2017). We investigate two beneficial DFE: an exponential and a more general discrete distribution. Given the very similar results obtained with these two beneficial DFE in the main text, we will only show the results obtained with the exponential distribution. The results obtained assuming a discrete distribution are reported in the Table S4.
Model choice
Before comparing the DFE between species we used a variety of models to fit our data at the species level. Differences among models include the joint estimation of DFE parameters and extra nuisance parameters that can correct for departures from a constant population size Wright–Fisher model, the proportion of beneficial mutations, and potential polarization errors. Table 1 shows the different DFE models and parameters used in this work.
We find that the uSFS of all great apes show significant departures from the Wright–Fisher model with constant population size (ri ≠ 1 as judged by an LRT) (Table S3, column M0 vs. M1). These nuisance parameters capture all evolutionary processes that equally affect synonymous and nonsynonymous mutations, such as demography, population structure, sampling bias, and linked selection. The performance of these nuisance parameters under demographic histories relevant for great apes is investigated using simulations presented in the Supplemental Material. In general, the inference quality is very high with simulated data, particularly when some aspects of the DFE are jointly estimated across species (see below).
We find that the models including εanc (M2 and M4) do not fit the data significantly better than the models assuming εanc = 0 (M1 and M3) (Table S3, column M1 vs. M2 and M3 vs. M4), except in eastern chimpanzees. This indicates that our polarization strategy is effective or that a substantial fraction of polarization errors is accounted for by the ri parameters [as argued by Galtier (2016)]. We do not find strong evidence for beneficial amino acid mutations segregating in our samples (Table S3, column M1 vs. M3 and M2 vs. M4). Consistent with these findings, the most flexible model (M2) is never the preferred model, as judged by our LRTs (Table S3). M3 is instead the preferred model across all species. This model assumes only deleterious mutations and no polarization errors but includes the distortion parameters (ri). The modest contribution of beneficial mutations is corroborated by the goodness-of-fit analyses (Figure 1). The results comparing the exponential and discrete beneficial DFE can be consulted in Table S4. Both distributions fit the data equally well. Larger samples sizes will be needed to assess which beneficial distribution is more realistic. Hereafter, we will refer to M3 as the purely deleterious model and M2 as the flexible model.
Observed and expected uSFS. The solid lines show the observed number of zerofold nonsynonymous changes and the dashed lines show the observed number of fourfold synonymous changes. The dots and circles represent the expected counts of neutral and selected sites under the M2 and M3 models (shown as blue and red, respectively). The y-axis is on log scale.
Assessing the stability/invariance of the DFE across great apes
Next, we sought to investigate whether the full DFE varies across great apes. We compare two models: one where only the shape parameter (b) of the deleterious DFE is shared across species and the mean (Sd) can vary independently as expected for populations with different effective sizes, and the other where all the parameters, including b, vary independently across species (this is equivalent to model 3 in Table 1). Note that, in a few cases, shape estimates of the deleterious DFE are sensitive to the model assumptions and, in particular, whether the model includes beneficial mutations (M1 and M2) or not (M3 and M4). For example, in bonobos the estimated shape parameter is b = 0.09 under the purely deleterious model, but it increases to b = 0.22 under the flexible model, which includes beneficial mutations (Table S4). This is because under the flexible model a substantial proportion of new mutations are inferred as slightly beneficial or nearly neutral (pb ∼ 14–17% and average Sb ∼ 0.65–0.79). In other words, in bonobos the shape of the deleterious DFE is sensitive to the presence/absence of beneficial mutations in the model. Hence, to accommodate these cases we also include the flexible model where both the beneficial DFE parameters and εanc are jointly estimated.
Table 2 shows the details of the set of constrained models used to test DFE invariance. In all of these models the distortion parameters (ri) are estimated independently for each species as recommended by Tataru and Bataillon (2019). Figure 2 shows that the density distributions for the shape of the deleterious DFE largely overlap between species. We find that the model estimating the shape of the deleterious DFE independently in each species does not explain the data significantly better than the model where the shape parameter is shared across species (shared b = 0.1588, bootstrap confidence interval 95% = 0.1344–0.1736) (M3S vs. M3I LRT P-value = 0.77). Although visually it seems that bonobos show a shift toward lower shape parameters, this result is not significant (M3 independent b = 0.0938 vs. M3 shared b = 0.1588; LRT P-value = 0.07). The inferred shape of the deleterious DFE increases very slightly when beneficial mutations are also modeled (shared b = 0.1842, bootstrap confidence interval 95% = 0.1697–0.2184) (M2S vs. M2I LRT P-value = 0.97) (Table S5). Moreover, we find that the model where the shape of the deleterious DFE is shared across species but the beneficial DFE is estimated independently for each species (M2S) does not fit the data significantly better than the equivalent model without beneficial mutations (M3S) (M3S vs. M2S LRT P-value = 0.99).
Sampling distribution of shape parameter estimates (b) for the DFE estimated in each species. Sampling distributions are obtained by 100 bootstrap replicates for each species and b is estimated under the purely deleterious model (M3I). The dotted black line (b = 0.1588) indicates the mean of the shared shape parameter under the most likely model (M3S). See Table 2 for a description of the models.
Population genetics theory predicts that natural selection will be weaker in small populations due to random genetic drift (Ohta 1992). If we assume that the shape of the deleterious DFE is constant across great apes, we find a strong positive correlation (Pearson’s correlation coefficient r = 0.91 [0.88, 0.93]) (Figure 3A) between the population scaled mean effect size of deleterious mutations (|Sd| = |2Nesd|) and our estimate of the effective population size (Ne = θS/(4μ), where θS is synonymous diversity and μ the mutation rate per site and generation (μ = 1.65 × 10−8) (Ségurel et al. 2014). This correlation remains significant after accounting for species phylogenetic dependence (BayesTrait V3 P-value < 0.01). Forward simulations confirm that Sd and Ne are significantly correlated as expected by the nearly neutral theory (Supplemental Analyses). We also find that the mean selection coefficient of deleterious mutations, sd (calculated dividing Sd by 2Ne), is positively correlated to Ne (r = 0.76 [0.67, 0.85], BayesTrait V3 P-value < 0.01) (Figure 3B). This implies that populations with smaller size have mutations that are, on average, less deleterious. We checked using simulations that our method of inference does not introduce any covariation between sd and Ne (Supplemental Analyses). It does not. We then assessed the relative contribution of the variation in Ne and sd to the differences in Sd. To do so, we use a multiple linear regression. Sd estimates are subject to considerable uncertainty and statistical noise and this can introduce a spurious (negative) correlation between Sd and sd estimates. Hence, we first obtained Sd and sd estimates that are statistically independent by splitting the coding genome in two (Castellano et al. 2016, 2018a; James et al. 2017) (see Subsampling). One half is used to estimate Sd and the other half to estimate sd. We run 100 replicates to compute the empirical confidence intervals. We find that Ne and sd are both significantly correlated with Sd in our multiple linear regression (Table 4). The standardized regression coefficients (β) suggest that direct variation in Ne and sd are quantitatively of very similar magnitude (β(Ne) = 0.43 [0.22, 0.85] and β(sd) = 0.69 [0.47, 1.07]). Although our predictor variables (Ne and sd) are correlated, their variance inflation factors are small (∼ 1.1), suggesting that multicollinearity is a minor issue.
Relationship between Ne and Sd on log scale (A), sd on log scale (B), the proportion of strongly deleterious mutations (C), the proportion of weakly deleterious mutations (D) and the proportion of effectively neutral mutations (E). Note that Ne (x-axis) is on log scale. Each violin plot represents 100 bootstrap replicates under the most likely model (M3S). See Table 2 for a description of the models.
Of note, although a linear regression explains most of the variance in Sd (R2 = 0.83 [0.77, 0.86]), the relationship seems to reach a plateau for the less populous great apes (Figure 3 A–B). In fact, we find that a curvilinear function (quadratic regression) fits the data significantly better (R2 = 0.90 [0.80, 0.95]) than a linear regression (LRT linear vs. quadratic regression P-value < 0.01). This result is not observed in the simulated data sets (Supplemental Analyses) and it suggests that other forces beyond genetic drift can modulate the strength of purifying selection in the smallest great ape populations (see Discussion).
One can argue that our sample sizes are too small to estimate Sd with precision and that the findings commented above are merely driven by noise or by some particular demography biasing the estimates of the DFE parameters inferred using polyDFE. However, estimates of the proportion of mutations in a given Sd range from the DFE are often less noisy than mean Sd estimates (Keightley and Eyre-Walker 2007; Eyre-Walker and Keightley 2009). We demonstrate with our simulations that when the shape parameter is co-estimated across species, not only these proportions can be reliably estimated using a small sample size, but also Sd (Supplemental Analyses). We find that the fraction of strongly deleterious mutations (Sd ≤ −10) is positively correlated to Ne (Figure 3C and Table 3). There is a negative correlation between the fraction of effectively neutral mutations (−1 < Sd ≤ 0) and Ne (Figure 3E and Table 3), as predicted by the nearly neutral theory (Ohta 1992). Interestingly, there is a negative, but very shallow, correlation between the fraction of weakly deleterious mutations (−10 < Sd ≤ −1) and Ne (Figure 3D and Table 3). This is consistent with a recent work that found that the effect of linked selection (mainly background selection) on genetic diversity is very similar along the genomes of great apes (Castellano et al. 2018b).
To further assess whether there is systematic variation in Sd across great apes, we compare a purely deleterious model where both b and Sd are shared across all species (M3SS) against a model where only b is shared between species (M3S). In both models, the ri and the population scaled mutation rate parameters are estimated independently for each species. The model with variation in the mean Sd across great apes fits the data significantly better than the model where the mean Sd is the same across species (M3S vs. M3SS LRT P-value = 5.9e−40) (Table S5E). This result strongly supports the fact that there is systematic variation in the strength of purifying selection across the great apes.
The selective effect, Sb, of new beneficial mutations might also depend on Ne. For instance, some authors have reported that species with a larger Ne tend to have higher rates of beneficial substitutions (Strasburg et al. 2011; Gossmann et al. 2012). This is because large populations will have to wait less time for the appearance of new beneficial mutations. Once the beneficial mutation has appeared, natural selection will be more effective in populations with large Ne. In contrast, other studies have reported that proxies for Ne and the rate of beneficial substitutions are poorly correlated (Galtier 2016). This is expected if small Ne populations have higher rates of new beneficial mutations, pb, because they are further away from their fitness optimum due to fixed or segregating deleterious mutations (Hartl and Taubes 1996; Poon and Otto 2000).
To test whether our data supports any of those opposing views we use a model averaging approach to estimate the effect and proportion of new beneficial mutations (Tataru and Bataillon 2019). This allows us to factor in the fact that, as measured via LRT, there is only very weak (statistically nonsignificant) evidence for the presence of beneficial mutations in the polymorphism data. We consider a set of four competing models (M3I, M3S, M2I, and M2S from Table 2) and weight them by their AIC. Figure S1 shows the weight of each model per species. This approach has been applied before in the context of detection of adaptive molecular evolution (Kjeldsen et al. 2012; Rousselle et al. 2019). We find a nonsignificant negative correlation between the model-averaged rate of new beneficial mutations, pb, and Ne (Figure S2A and Table 3). There is no correlation between the model-averaged Sb and Ne (Figure S2B and Table 3). The expected rate of beneficial nonsynonymous substitutions (the product of rate of new beneficial mutations, pb, and the mean strength of positive selection, Sb) is also uncorrelated to Ne (Figure S2C and Table 3). Model-averaged estimates of the proportion of beneficial mutations, pb, suggest that beneficial mutations remain very rare (Figure S3) and Sb estimates are always small (Sb ≪ 1) and similar across great apes (Figure S4). Nevertheless, it is also worth mentioning that one of the two smallest populations of great apes, bonobos, show a substantial proportion (1–2%) of new beneficial mutations (Figure S3 and Table 5). Larger sample sizes will be required to quantify this interesting class of effectively neutral beneficial mutations more precisely.
Accounting for gBGC
gBGC can distort the uSFS of synonymous and nonsynonymous mutations differently due to differences in their GC-content (Bolívar et al. 2018). Furthermore, larger populations are expected to be more affected by gBGC due to higher effective rates of recombination and gene conversion. Thus, the degree of bias will also scale with Ne. To check that our findings are not merely driven by differences in the intensity of gBGC, we repeated our analysis using GC-conservative mutations (A<−>T and C<−>G), which are unaffected by gBGC. Table S6 shows the estimated parameters of the DFE for GC-conservative mutations for each species and model. We also find that the model where a single shape b is shared across species is preferred over the model with a separate shape parameter for each species (shared b = 0.1694 [0.1123, 0.2176]) (M3S vs. M3I LRT P-value = 0.63) (Figure S5). This is also true when beneficial mutations are included (shared b = 0.2789 [0.2014, 0.9997]) (M2S vs. M2I LRT P-value = 0.70). We also find evidence of systematic variation in the strength of negative selection for GC-conservative mutations across species. A model where the shape is shared but the mean Sd is estimated independently is preferred over a model where both the shape and the mean Sd are shared (M3S vs. M3SS LRT P-value = 4.4e−6).
We confirm that sd covaries with Ne (Table 3) and that the strength of purifying selection in bonobos and western chimpanzees is higher than expected given their Ne (Figure S6). The correlation between Ne and the summary statistics of the DFE also persist when considering only nonsynonymous GC-conservative mutations (Figure S6–S8, Table S7, and Table 3). The goodness-of-fit analysis for GC-conservative mutations is presented in Figure S9.
Discussion
In this work, we have inferred and compared the full DFE of new heterozygous mutations between humans and their closest living relatives. To estimate the DFE we used only allele frequency distributions for fourfold synonymous and zerofold nonsynonymous changes. We found that the shape of the deleterious DFE is remarkably constant across great apes [b = 0.16 (0.13, 0.17)]. This result is robust to gBGC, polarization errors and proportions and effect of new beneficial mutations. Our estimate of the shape parameter in humans is consistent with older reports (b = 0.18–0.20) (Eyre-Walker et al. 2006; Keightley and Eyre-Walker 2007; Boyko et al. 2008) and recent estimates where 1000 of human chromosomes are reanalyzed under a complex human demography (b = 0.17–0.21) (Kim et al. 2017). While our analysis did not infer specific demographic parameters, coestimating nuisance parameters for the different uSFS classes does capture the effects of demography and allows us to recover the underlying DFE. We confirmed the robustness of our approach by running forward simulations with demographic histories relevant for great apes (Supplemental Analyses).
If we assume that within the great apes, not only are recombination rate, mutation rate, gene density, or gene expression levels conserved, but also protein function and regulatory and metabolic networks are equivalent, then it may be possible to explain the differences in the DFE between species by differences in their effective population size. Hence, here we ask, does Ne affect the DFE, all else being equal? We found evidence for systematic variation in the strength of negative selection (Sd = 2Nesd) across great apes. The correlation between our estimate of the species effective population size (Ne), based on the current levels of diversity at fourfold synonymous sites (θS), and the estimated strength of negative selection (Sd) is very strong for all mutations and GC-conservative mutations (explaining ∼80% and ∼60% of the variance in the strength of purifying selection, respectively). This result is consistent with nearly neutral theory (Ohta 1992), assuming that the deleterious DFE (in sd units) is constant over evolutionary time. This constancy is expected under a model where protein function is the main driver of fitness effects of mutations, and where protein function hardly changes between species (at the evolutionary time scale spanning great apes). However, we also found evidence that suggests that sd has not remained constant across great apes. We expect that Sd scales proportionally with Ne. However, we found more pronounced differences in Sd than expected given our estimates of the effective population size. In other words, we find that the mean absolute effect size of deleterious mutations (sd) is also correlated to Ne. Using data simulated with a constant sd, we show that our estimation procedure does not drive the covariation between Ne and sd estimates. Interestingly, this result is consistent with positive epistasis. Under positive epistasis |sd| will decrease as fitness (and probably Ne) decreases (Silander et al. 2007). This is because new deleterious mutations will be less detrimental in a genetic background that already contains deleterious variants than in a genetic background free of deleterious mutations. Population genetics theory predicts that in small populations drift can overwhelm selection. Slightly deleterious mutations may thus reach higher frequencies and even reach fixation causing fitness to decline (Kimura et al. 1963; Bataillon and Kirkpatrick 2000). This means that new deleterious mutations will have a higher chance of interacting with other preexisting deleterious variants in a small population than in a large population. The prevalence of epistasis between old and new deleterious mutations is thus expected to increase when Ne decreases and we believe this is reflected in our estimates of the DFE (Figure 4). We propose that the variation in the strength of purifying selection across great apes is doubly affected by Ne. First, for a given selection coefficient, Ne determines the efficiency of purifying selection (Ohta 1992), and second, Ne determines the amount of potential epistatic interactions occurring in a given individual, which, in turn, will modulate the effect of new deleterious mutations (Poon and Otto 2000). This is an exciting result that deserves further theoretical exploration and empirical validation.
Schematic representation of the potential number of epistatic interactions between old (in gray) and new (in red) deleterious variants as a function of Ne. Each gray horizontal line represents a chromosome, while open circles represent deleterious mutations.
Moreover, for additive mutations, the effect of deleterious mutations should decrease with decreasing Ne. Surprisingly, we find that in bonobos and western chimpanzees, the two smallest great ape populations, the mean effect size of deleterious mutations increases. We do not see this overestimation of Sd in our forward simulations of the bonobo and western chimpanzee demographic histories. Note that in our simulations all mutations are codominant. We hypothesize that the efficient purging of strongly deleterious recessive variants in bonobos and western chimpanzees might explain this result (Barrett and Charlesworth 1991; Glémin 2003). Bonobos and eastern lowland gorillas (not analyzed in this study) show an excess of inbreeding compared to the other great apes, suggesting small population sizes or a fragmented population (Prado-Martinez et al. 2013). Similarly, the western chimpanzees are thought to have spread from a very small ancestral population (Prado-Martinez et al. 2013; de Manuel et al. 2016) and show, as expected by theory, a higher proportion of putatively deleterious variants compared to central chimpanzees (Han et al. 2019). Recent work in Ibex and wolves suggests that bottlenecks can indeed favor the purging of strongly deleterious recessive mutations while allowing the accumulation of weakly deleterious additive mutations (Grossen et al. 2019; Robinson et al. 2019). Note, however, that our estimates assume that all mutations segregating and affecting fitness are codominant (h = 0.5). Whether the estimation of DFE parameters is robust to variation in the joint effects of the dominance of mutations, inbreeding and demography remains an open question.
Regarding the beneficial portion of the DFE, we do not find any statistically significant contribution of beneficial mutations to the uSFS counts. Thus, we are unable to support an increase in the strength of positive selection (Sb) with Ne (Nam et al. 2017), as well as an increase in the expected rate of beneficial substitutions (pb × Sb) with Ne (Eyre-Walker et al. 2006). Note, however, that in this work we do not use substitution data, only polymorphisms. Rare and strongly beneficial mutations will fix quickly and contribute relatively more to divergence counts than to uSFS counts (Eyre-Walker and Keightley 2007). Hence, our results are still compatible with the view that a sizeable amount of divergence at the amino acid level is driven by relatively rare but strongly beneficial mutations in the great apes (Nam et al. 2017). Our choice was to avoid using divergence data to estimate the full DFE. The reason for doing so is that fixed mutations may introduce biases in the estimation of the full DFE if ancient fluctuations in the effective population size are not properly modeled (Tataru et al. 2017; Rousselle et al. 2018; Zhen et al. 2018). A second explanation to the apparent lack of beneficial mutations is our modest sample size (n = 8 haploid chromosomes per population). Thus, although most adaptive substitutions seem to be weakly beneficial in humans (Uricchio et al. 2019), we might need very large sample sizes to quantify accurately the DFE of new beneficial mutations because new beneficial mutations are still very rare relative to deleterious ones.
Using a model averaging framework where the different competing models are weighted by their AIC, we find that between 1 and 2% of new mutations are mildly beneficial in bonobos. However, the estimated population scaled effect size of beneficial mutations is below one in all great apes. Small populations tend to be further away from their fitness optimum due to a higher genetic load and/or fixation of slightly deleterious mutations. As a consequence, a new mutation has a higher probability of being beneficial in a small than in a large population (Hartl and Taubes 1996; Poon and Otto 2000; Silander et al. 2007). The interactions within and among genes will allow new mutations to compensate/restore for the fitness effects of other, fixed or polymorphic, slightly deleterious mutations. Hence, there is no need for very unlikely back mutations to restore the fitness losses incurred by previous mutations (Charlesworth and Eyre-Walker 2007). An interesting implication of such mode of evolution is that rates of adaptive substitutions may not be driven only by external conditions (such as viruses, see Enard et al. 2016; Castellano et al. 2019; Uricchio et al. 2019) but also by the amount of deleterious mutations already present in the genome as this mutation load conditions the current level of adaptation in a population. This mechanism is not often invoked to explain Darwinian adaptation (due to environmental changes), yet a small pool of compensatory mutations will contribute to the amino acid differences between species in the long term (Hartl and Taubes 1996). The induced epistasis imply that mutations are only conditionally beneficial or deleterious and that the DFE and Ne might not be independent as commonly assumed.
Finally, we discuss some limitations of our study. We have assumed the same mutation rate in all populations to estimate Ne based on current levels of synonymous diversity, θS. Synonymous diversity has been used repeatedly in several related studies as a proxy for Ne (Gossmann et al. 2010; Strasburg et al. 2011; Phifer-Rixey et al. 2012; Galtier 2016). This might be a problem because θS is jointly influenced by variation in Ne and the mutation rate. In this work we have assumed the same mutation rate per site and generation across all great apes. However, there is evidence that both the generation time and the mutation rate per year vary across great apes (Amster and Sella 2016; Jónsson et al. 2017; Thomas et al. 2018; Besenbacher et al. 2019). We checked how more realistic estimates of mutation rate per generation could affect our results. With data retrieved from Besenbacher et al. (2019) we find that the mutation rate per generation is ∼23% higher in chimpanzees, bonobos, and orangutans than in humans. Gorillas and humans have a very similar mutation rate per generation despite having the shortest and longest generation times across great apes, respectively. We show that our results, including the correlation between sd and Ne, remain robust to the variation in the mutation rate per generation across great apes (Figure S10 and Table S8). We emphasize that estimates of mutation rate per generation are prone to much uncertainty due to our limited knowledge about generation times in nature. Second, synonymous diversity is a poor proxy for Ne if there is widespread selection on codon usage or other kinds of selection on synonymous changes due to, for example, the regulation of splicing or gene expression. However, this is unlikely to affect some great apes more than others or to affect more synonymous than nonsynonymous changes. Selection on codon usage is a weak force and great apes are in general not highly populous species. The robustness of our results to gBGC, which is typically correlated to GC-content and codon usage bias, does not suggest that this is a major issue with this analysis. Furthermore, the fact that all our results have been replicated using GC-conservative mutations indicates that gBGC is not likely affecting our conclusions. Third, we cannot explain why we do not observe signals of mildly beneficial mutations in western chimpanzees, a population with a level of genetic diversity equivalent to that found in bonobos. Other factors beyond the genome-wide amount of DNA diversity, such as the rate of change of the environment (Lourenço et al. 2013), inbreeding, or population structure, might be triggering the emergence of those weakly beneficial mutations specifically in bonobos. Fourth, fluctuations in the effective population size (along time or the genome) might affect neutral and selected mutations differently (Gordo and Dionisio 2005; Brandvain and Wright 2016; Castellano et al. 2018a; Torres et al. 2019). In other words, the Ne that better describes the population dynamics of neutral mutations might be different from the Ne that better describes the population dynamics of weakly selected mutations or strongly selected mutations (Messer and Petrov 2013; Rousselle et al. 2018). These decouplings in the Ne of neutral and selected mutations can affect our results and interpretations. However, the high inference accuracy of polyDFE under background selection and the complex demographic histories of great apes should make our inference robust to such nonequilibrium dynamics.
Conclusions
We have made the first comparison of the full distribution of fitness effects of new amino acid mutations across great apes. By comparing the fit of a series of nested models to polymorphism data, we have identified which aspects of the DFE are shared between humans and their closest living relatives. Our analysis shows that the shape of the deleterious DFE is shared across these species. Consistent with the nearly neutral theory, we have found that the average population scaled effect size of new deleterious mutations (Sd) is strongly correlated to our estimate of Ne. Interestingly, there is also covariation between the average effect size of new deleterious mutations (sd) and Ne suggesting a role for positive epistasis. We further find that the two smallest great ape populations, western chimpanzees and bonobos, show a comparatively larger strength of purifying selection, which is compatible with the efficient purging of deleterious recessive variants in small populations. The LRT does not favor a richer DFE model including beneficial mutations over models considering only deleterious mutations. However, when we use a model averaging approach, we estimate a small proportion of mildly beneficial mutations only in bonobos. This finding is consistent with compensatory epistasis, which predicts a larger rate of beneficial mutations in small populations. This work invites further investigation of the relationship between epistasis and Ne. Our study demonstrates the simple but perhaps underappreciated fact that the effect of mutations is dynamic, even between closely related species, and may depend on the genetic background on which they arise.
Acknowledgments
We thank Martin Lascoux, Donate Weghorn, Miguel Rodriguez, and three anonymous reviewers for helpful discussion. We further thank Alex Cagan for sharing the set of sites with a unique mapping to the human genome with at least five-fold coverage in all individuals per species. This work was supported by the Danish Council for Independent Research (grant number 4181-00358).
Footnotes
Supplemental material available at FigShare: https://doi.org/10.25386/genetics.9705281.
Communicating editor: S. Wright
- Received July 8, 2019.
- Accepted August 29, 2019.
- Copyright © 2019 by the Genetics Society of America