Gene Duplication, Gene Conversion and the Evolution of the Y Chromosome
Tim Connallon, Andrew G. Clark


Nonrecombining chromosomes, such as the Y, are expected to degenerate over time due to reduced efficacy of natural selection compared to chromosomes that recombine. However, gene duplication, coupled with gene conversion between duplicate pairs, can potentially counteract forces of evolutionary decay that accompany asexual reproduction. Using a combination of analytical and computer simulation methods, we explicitly show that, although gene conversion has little impact on the probability that duplicates become fixed within a population, conversion can be effective at maintaining the functionality of Y-linked duplicates that have already become fixed. The coupling of Y-linked gene duplication and gene conversion between paralogs can also prove costly by increasing the rate of nonhomologous crossovers between duplicate pairs. Such crossovers can generate an abnormal Y chromosome, as was recently shown to reduce male fertility in humans. The results represent a step toward explaining some of the more peculiar attributes of the human Y as well as preliminary Y-linked sequence data from other mammals and Drosophila. The results may also be applicable to the recently observed pattern of tetraploidy and gene conversion in asexual, bdelloid rotifers.

NONRECOMBINING chromosomes are often associated with genetic degradation and a loss of functional genes, and nowhere is this pattern more exaggerated than on the Y chromosome (Charlesworth and Charlesworth 2000; Bachtrog 2006). However, in addition to the more widely recognized pattern of gene loss, genome sequences of mammals and Drosophila are also yielding evidence for Y-linked functional gene gain followed by amplification of duplicate genes (Skaletsky et al. 2003; Koerich et al. 2008; Carvalho et al. 2009; Krsticevic et al. 2009; Hughes et al. 2010). Duplication and retention of functional Y-linked gene copies is somewhat surprising because evolutionary theory predicts an opposing pattern. First, to the extent that gene duplicates are fixed via positive selection, they are less likely to become fixed on nonrecombining relative to recombining chromosomes (Otto and Goldstein 1992; Clark 1994; Yong 1998; Otto and Yong 2002; Tanaka and Takahasi 2009). Second, regardless of whether Y-linked duplicates become fixed via genetic drift or by natural selection, the actions of Muller's ratchet, genetic hitchhiking, and background selection are expected to greatly increase the probability that Y-linked genes degenerate into nonfunctional pseudogenes (Charlesworth and Charlesworth 2000; Bachtrog 2006; Engelstadter 2008).

The issue is more complex when one considers data from the well-characterized human Y chromosome. A majority of functional Y-linked genes are members of duplicate gene pairs residing within large palindromes and are almost exclusively testis expressed (Skaletsky et al. 2003). In contrast to many of the single-copy genes with X-linked homologs, members of Y-linked gene families are apparently not degenerating, but rather have become fixed and maintained over many millions of years (Skaletsky et al. 2003; Yu et al. 2008). Although Y chromosomes are not well characterized in other taxa, currently available data suggest that duplication is a common feature of Y chromosomes in other mammal species as well as Drosophila (Rozen et al. 2003; Verkaar et al. 2004; Murphy et al. 2006; Alföldi 2008; Wilkerson et al. 2008; Krsticevic et al. 2009; Geraldes et al. 2010). Thus, patterns of gene duplication and retention, for at least a subset of Y-linked genes, may be a general rule of Y chromosome evolution.

Another attribute of the mammalian Y appears to be relevant for duplicate gene evolution. Comparative analysis between humans and chimpanzees suggests ongoing recombination between the gene duplicate pairs that reside on the same Y chromosome. Such “intrachromosomal” recombination includes both nonreciprocal (gene conversion) and reciprocal exchange (crossing over) between gene duplicate pairs (Rozen et al. 2003; Lange et al. 2009). Gene conversion between the duplicates potentially maintains gene function by counteracting stochastic forces of Y chromosome degeneration (Rozen et al. 2003; Charlesworth 2003; Noordam and Repping 2006). The rationale behind this hypothesis is subtle. As with other clonally inherited chromosomes, each evolutionary lineage of the Y is physically coupled to, and its evolutionary fate is influenced by, the presence of deleterious mutations. Mutation-bearing lineages represent evolutionary dead ends unless they can somehow remove or compensate for deleterious mutations. Recombination between duplicates can “rescue” functionality via gene conversion between functional and nonfunctional copies.

On the other hand, double-strand DNA breaks, which precede gene conversion events (Marais 2003), also precede crossing over. Crossovers between Y-linked genes can generate acentric and dicentric Y chromosomes, resulting in infertility and disruption of the sex determination pathway (e.g., Repping et al. 2002; Heinritz et al. 2005; Lange et al. 2009). Considering both gene conversion and crossing over on the Y, recombination can be viewed as a factor that either constrains (via gene conversion) or promotes (via crossing over) Y chromosome degeneration.

These observations concerning Y chromosome gene content and recombination raise interesting questions that have not been formally addressed by evolutionary theory (but see the recent study by Marais et al. 2010). First, what conditions favor the evolutionary invasion of Y-linked gene duplicates, and does recombination influence the probability that duplicates eventually become fixed within a population? Second, what effect does recombination have on Y-linked fitness and the maintenance of functional duplicate genes? To address these questions, we develop and analyze a series of population-genetic models of Y chromosome evolution. We show that, when direct selection on gene duplicates is weak, biased gene conversion can increase, whereas crossing over will decrease, their probability of fixation. For duplicates with larger fitness effects, the probability of fixation is largely independent of Y-linked recombination. Finally, gene conversion has a major impact on the retention of functional Y-linked genes that are already fixed within the population and maintains multiple gene copies with or without selection favoring these duplicates.


Gene conversion and the invasion of new gene duplicates:

We first consider conditions favoring the evolutionary invasion of new Y-linked duplicate genes at low initial frequency within the population. Deterministic invasion dynamics are described for a two-locus model, and it is shown separately that the two-locus model characterizes duplicate gene invasion conditions on a Y chromosome carrying an arbitrary number of genes (see supporting information, File S1). We then develop and analyze a diffusion approximation and perform stochastic simulations to examine the probability that a rare gene duplicate eventually becomes fixed within a population of small size.

Invasion of a new gene duplicate:

Consider a single Y-linked locus with a functional allele, A, and a nonfunctional allele, a. Mutation from A to a occurs at rate u per generation and there is no back mutation. By introducing a duplication of the locus, the population is expanded to include five genotypic classes: the original single-copy classes (A and a), those with two functional gene copies (AA), those with one functional and one nonfunctional copy (Aa), and those with two nonfunctional copies (aa). As in the single-locus case, transitions between states (AA → Aa or aA; Aa or aA → aa) can occur by mutation, at a rate of u per locus; because there are now two loci, the mutation rate per chromosome is 2u.

For Y chromosomes carrying duplicates, recombination (crossing over and gene conversion) can potentially occur between loci. Throughout our analysis, we examine cases where recombination occurs at a rate of d per paralog pair, per generation. The probability that a single recombination event is a crossover, which generates an abnormal (sterile) Y chromosome (e.g., Repping et al. 2002; Heinritz et al. 2005; Lange et al. 2009), is equal to the constant c. The remainder of recombination events (1 − c) represent gene conversion events between duplicate pairs. Gene conversion involving Aa or aA individuals yields AA or aa sperm with probability b and 1 − b, respectively. Thus, b can be viewed as a biased gene conversion parameter, where the functional copy A preferentially replaces the nonfunctional a whenever b > 0.5 (there is no bias when b = 0.5).

Compared to individuals with two functional gene copies, individuals with zero functional copies suffer a fitness reduction of s, while those with one functional copy suffer a reduction of sh, where h is equivalent to a dominance coefficient. Complete masking of a nonfunctional allele occurs when h = 0, and there is no direct fitness benefit of carrying two vs. one functional gene. Partial masking occurs when 1 > h > 0; in such cases, there is a fitness benefit of having two functional copies. Genotypes, genotypic fitness, and zygotic frequencies are described in Table 1.

Table 1. Parameterization for the gene duplicate invasion model

For a sequence of events of (i) birth, (ii) selection, (iii) mutation, (iv) recombination, and (v) random mating (and ignoring factors of u^2), the frequency change of each genotype, per generation, is given by the following six recursions, Math Math Math Math Math Math where mean fitness is Math + Math.

To describe conditions promoting the invasion of duplicates, we analyzed the stability of an evolutionary equilibrium in which duplicated genotypes are absent from the population. Under such a condition, the frequencies x1 and x0 equilibrate to x0 = u(1 − sh)/(s − sh) and x1 = 1 − x0, and the leading eigenvalue of the stability matrix is Math (1a)

Selection favors the invasion of a duplicate when the leading eigenvalue is greater than one (Otto and Day 2007). The magnitude of the leading eigenvalue also represents the strength of selection acting in favor of a rare duplicate gene [i.e., the probability of fixation is proportional to λ (Otto and Bourguet 1999; Otto and Yong 2002); see below for additional details]. Without recombination (d = 0), the leading eigenvalue reduces to

$$\lambda = \frac{1 - 2u}{(1 - u)(1 - sh)} \tag{1b}$$

and evolutionary invasion of a duplicate-bearing Y is favored when sh > u/(1 − u). Duplicates are favored when the direct fitness benefit of additional functional gene copies outweighs the indirect consequences of doubling the deleterious mutation rate, as previously reported for both haploid and diploid systems without recombination (Clark 1994; Otto and Yong 2002; also see Otto and Goldstein 1992).
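The no-recombination invasion condition can be recovered with a short calculation. The sketch below is not the full stability analysis: it assumes the fitness scheme described in the text (AA has fitness 1; the single-copy A haplotype has fitness 1 − sh), assumes the resident single-copy population sits at mutation–selection balance, tracks only the rare AA class, and ignores terms of order u^2.

```latex
\begin{align*}
\bar{w} &= (1 - sh)(1 - u)
  && \text{(resident mean fitness, from requiring } x_1' = x_1 \text{ at equilibrium)} \\
x_{11}' &= x_{11}\,\frac{1 - 2u}{\bar{w}}
  && \text{(AA has fitness 1 and escapes mutation at both loci with probability } 1 - 2u\text{)} \\
\frac{1 - 2u}{(1 - sh)(1 - u)} > 1
  &\iff sh\,(1 - u) > u
  \iff sh > \frac{u}{1 - u}
\end{align*}
```

Read this way, the duplicate spreads only when the fitness advantage over the resident (sh) exceeds the extra mutational loss that comes with carrying a second mutable locus.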

How does recombination alter the evolutionary dynamics of Y chromosomes? When duplicates do not directly increase fitness (sh = 0), and there is no recombination, selection never favors invasion (Equation 1b above). We can ask whether gene conversion expands the conditions favorable to invasion of a duplicate in a way that is similar to previous models of gene duplication with crossing over (Otto and Yong 2002). By permitting Y-linked recombination between duplicates, and assuming that the crossover rate is zero (dc = 0; hence, all recombination is by gene conversion), the leading eigenvalue can be approximated for low rates of gene conversion (d ≈ 0, per generation), Math (1c) which indicates that selection favors duplicates (λ > 1) when gene conversion is biased toward transmission of functional over nonfunctional gene copies (b > 0.5). Numerical evaluation of Equation 1a indicates that, although higher rates of gene conversion can increase the leading eigenvalue (and hence the probability of invasion), this positive relationship quickly saturates. Thus, a low rate of gene conversion has about as much of an impact on the leading eigenvalue as a high rate does. Nevertheless, the strength of such positive selection (with magnitude of λ − 1) is on the order of the mutation rate (u) and is therefore extremely weak. Stochastic simulations (see below) show that the probability of duplicate fixation is only marginally influenced by biased gene conversion alone.

Further analysis of Equation 1a shows that, as with the case of no recombination (Otto and Yong 2002), selection will favor duplicates if they directly increase fitness (sh > 0). Gene conversion (including unbiased gene conversion: b = 0.5) can increase the strength of selection favoring invasion of a duplicate (λ − 1; Figure 1). However, the relative impact of gene conversion is minor when sh ≫ u. In other words, when there are weak direct benefits of having multiple gene copies, the strength of natural selection favoring Y-linked gene duplicates will be enhanced by gene conversion between paralogs. This conclusion holds if the crossover rate between duplicate pairs (dc) is small (Figure 1). As the rate of crossing over increases, the production of abnormal Y haplotypes can generate purifying selection against Y chromosomes that carry gene duplicates.

Figure 1.—

Gene conversion can enhance the strength of positive selection for rare duplicate genes, whereas crossovers select against duplicates. Selection coefficient approximations (λ − 1) are based on the leading eigenvalue (Equation 1a), as described and justified in the text, and are presented as a ratio of selection with (d > 0) vs. without recombination (d = 0). Representative results are presented for u = 10^−5 and assume that there is no gene conversion bias (i.e., b = 0.5).

Why should gene conversion broaden duplicate invasion conditions under weak selection? An intuitive explanation can be reached by considering the recursion dynamics for a population fixed for the single-gene haplotype. Because this explanation is heuristic, we ignore crossovers and assume that they do not occur (c = 0). The rate of increase for a rare haplotype with two functional gene copies depends on its relative competitiveness against the resident, single-copy haplotype. For initial condition x11 = 1/N and x10 = x00 = 0, the expected proportion of functional duplicate haplotypes (x11) within the gamete pool is Math, and the duplicate is favored when Math. Invasion is clearly facilitated by gene conversion (db > 0). Nevertheless, because the term 2u(1 − db) is extremely small, gene conversion will only marginally influence the probability of fixation whenever sh ≫ u.

Probability of duplicate fixation:

The deterministic model presented above can be modified to describe the evolutionary dynamics in finite populations. Following Otto and Bourguet (1999) and Otto and Yong (2002), the selection coefficient for a rare gene duplicate can be approximated as λ − 1, where λ is the leading eigenvalue of the stability matrix (Equation 1a, above). Given this selection coefficient, the probability that a rare duplicate is eventually fixed can be estimated by diffusion approximation (Kimura 1957, 1962), with drift and diffusion coefficients M = (λ − 1)x(1 − x) and V = x(1 − x)/N, respectively, where x is the frequency of a duplicate-bearing Y haplotype and N is the Y chromosome effective population size. For an initial frequency of 1/N, the probability that a duplicate is fixed will be

$$\Pr(\text{fix}) = \frac{1 - e^{-2(\lambda - 1)}}{1 - e^{-2N(\lambda - 1)}} \tag{2}$$
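The diffusion result for the M and V given above is easy to evaluate numerically. The following is a minimal sketch, assuming the standard Kimura solution for those coefficients with initial frequency 1/N; the function name `fixation_probability` is illustrative and does not come from the original analysis.

```python
import math


def fixation_probability(lam, N):
    """Diffusion approximation for a duplicate starting at frequency 1/N.

    lam : leading eigenvalue; lam - 1 plays the role of the selection coefficient
    N   : Y chromosome effective population size

    Intended for lam >= 1 (or only weakly below 1); strongly negative
    selection combined with large N can overflow the exponential.
    """
    s = lam - 1.0
    if abs(s) < 1e-12:
        return 1.0 / N  # neutral limit of the formula
    # (1 - exp(-2s)) / (1 - exp(-2Ns)), written with expm1 so that
    # precision is kept when 2s is very small
    return math.expm1(-2.0 * s) / math.expm1(-2.0 * N * s)
```

For example, `fixation_probability(1.0, 1000)` returns the neutral value 1/N, and `fixation_probability(1.01, 1000)` is close to 2(λ − 1) = 0.02, the two limits discussed in the text.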

To assess the validity of Equation 2, we conducted computer simulations that incorporate mutation, selection, and genetic drift. Each simulation was initiated at x11 = 1/N, x0 = u(1 − hs)/(s − sh), and x1 = 1 − x11 − x0. To generate genotypic frequencies for the next generation, N genotypes were randomly drawn from a multinomial distribution, after selection, from the six genotypes described above. Mutation–selection–drift recursions were iterated until the duplicate genotype was either fixed or lost from the population. Equation 2 provides a good approximation for the probability of duplicate fixation over a broad range of parameter space (Figure 2 and Figure S1). As direct selection on a duplicate approaches zero (sh → 0), the probability of fixation approaches 1/N. As direct selection increases in strength (1 ≫ λ − 1 ≫ 1/N), the probability of fixation approaches 2(λ − 1).
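A simulation of this kind can be sketched in a few lines. The version below is an illustration under stated assumptions rather than the simulation code used for the figures: it covers only the no-recombination case (d = 0), so the abnormal-Y class never arises and five genotypes suffice; it samples N offspring after selection and mutation; it requires h < 1; and the function names are hypothetical.

```python
import random


def simulate_duplicate_fate(N, u, s, h, rng):
    """Return True if duplicate-bearing Ys fix, False if they are lost (d = 0)."""
    # Fitness in the order A, a, AA, Aa, aa
    w = [1 - s * h, 1 - s, 1.0, 1 - s * h, 1 - s]
    x0 = u * (1 - s * h) / (s - s * h)   # single-copy mutation-selection balance
    n_a = round(x0 * N)                  # resident copies of the nonfunctional allele
    counts = [N - 1 - n_a, n_a, 1, 0, 0] # one new AA haplotype (x11 = 1/N)
    while True:
        dup = counts[2] + counts[3] + counts[4]
        if dup == 0:
            return False
        if dup == N:
            return True
        # selection: weight each class by its fitness
        A, a, AA, Aa, aa = (c * wi for c, wi in zip(counts, w))
        # mutation, ignoring terms of order u^2
        expected = [A * (1 - u), a + A * u,
                    AA * (1 - 2 * u), Aa * (1 - u) + AA * 2 * u, aa + Aa * u]
        # drift: multinomial sampling of N offspring
        draws = rng.choices(range(5), weights=expected, k=N)
        counts = [draws.count(g) for g in range(5)]


def estimate_fixation(N, u, s, h, replicates, seed=1):
    """Proportion of replicate runs in which the duplicate becomes fixed."""
    rng = random.Random(seed)
    fixed = sum(simulate_duplicate_fate(N, u, s, h, rng) for _ in range(replicates))
    return fixed / replicates
```

Calling `estimate_fixation(1000, 1e-5, s, h, replicates)` over a range of sh values gives proportions that can be set against the analytical approximation; many replicates are needed when sh is small, because fixation is then rare.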

Figure 2.—

The probability of fixation for Y-linked duplicate genes. The solid line depicts the analytical approximation from Equation 2. Circles represent the proportion of duplicate genotypes (out of 100,000 replicate simulations for each data point) that eventually become fixed within the population. Results are shown for d = 0, N = 1000, and u = 10^−5, per locus, per generation. Values of d > 0 yield approximately the same results (see Figure S1).

Gene conversion had little impact on the probability of duplicate fixation (see Figure S1). As shown above, the leading eigenvalue of the stability matrix is not substantially influenced by gene conversion unless sh is of similar order to u. Even though the selection coefficient approximation (λ − 1) can increase with gene conversion, its absolute magnitude under weak direct selection (sh ≈ 0) will generally be too small for natural selection to be effective, unless of course Nu > 1, which is particularly unlikely for Y-linked loci. Thus, gene conversion is unlikely to significantly enhance the rate of duplicate gene fixation, but can potentially reduce the fixation rate of duplicates if the rate of deleterious crossovers between paralogs is high.

Gene conversion and the maintenance of gene duplicates:

A major hypothesis inspired by the human Y chromosome is that gene conversion between duplicates may prevent the accumulation of mutations and ultimately prevent or slow down Y chromosome degeneration due to Muller's ratchet (Charlesworth 2003; Rozen et al. 2003; Noordam and Repping 2006). To formally evaluate this possibility, we considered two models for the maintenance of functional Y-linked genes. We first conducted simulations of our two-locus model with initial condition x11 = 1 (a pair of functional duplicates is initially fixed within the population) and analyzed whether gene conversion prevented the loss of one or both of the functional gene copies. Gene conversion between Y-linked paralogs decreased the rate of gene loss under a wide range of fitness conditions, including the extreme case where there was no direct benefit of having two, as opposed to one, functional gene copies (Figure S2). Although gene conversion can substantially reduce the rate of gene loss, the results indicate that loss of completely redundant genes (where sh = 0) will persist under gene conversion, albeit at a substantially reduced rate.

Prior models of Muller's ratchet generally find that the rate at which deleterious mutations become fixed depends upon both the strength of purifying selection and the number of loci evolving on an asexual chromosome (Charlesworth and Charlesworth 2000; Bachtrog 2008). To account for selection and gene conversion across many loci, we extended our model to describe the degeneration of Y chromosomes carrying an arbitrary number of genes. To permit gene conversion, we assumed that each Y initially carries n distinct gene types, each with a duplicate copy (for a total of 2n loci). Because the increased number of genes greatly expands the number of possible genotypic and fitness states (and consequently the matrix of transition probabilities between states), we made a simplifying assumption that each of the n gene types represents an essential male fertility factor. Males lacking a functional copy of one or more gene types are sterile and comprise a heterogeneous genotypic class with reproductive success of zero. Although the essentiality assumption is useful for modeling purposes, it will often be biologically reasonable because Y-linked genes, at least in mammals and Drosophila, are often essential for male fertility. For example, human Y chromosome microdeletions within Y-palindromic regions are often associated with spermatogenic failure (Noordam and Repping 2006; Lange et al. 2009). In Drosophila melanogaster, mutations in at least three of seven currently Y-annotated genes (kl-2, kl-3, and kl-5, as well as an additional set of unannotated genes: kl-1, ks-1, and ks-2; data obtained from are known to cause male-sterile phenotypes. Nevertheless, the overall agreement between our multilocus and two-locus results (the latter does not assume essentiality; see Figure S2) suggests that a violation of the essentiality assumption is unlikely to strongly affect our conclusions.

For each paralog pair, there are three possible genotypes: both loci functional, one functional and one nonfunctional, and both nonfunctional. Transitions between genotypic states can occur by mutation, by gene conversion, or by crossing over, with crossover yielding an abnormal Y chromosome. For individuals carrying a structurally normal Y, fitness follows the function w = (1 − sh)^k (0)^j, where j refers to the number of gene pairs with both copies nonfunctional, and k refers to the number of pairs where one of the two gene copies is functional (0 ≤ k ≤ n). Individuals with j > 0 and individuals carrying abnormal Y chromosomes are sterile. After selection, the reproductive contribution of an individual with k Y-linked mutations is xk wk / w̄, where xk is the zygotic frequency of k-bearing males, wk = (1 − sh)^k is the fitness of a male with k mutations, and mean male fitness with respect to the Y is w̄ = Σ xk wk, summed over k = 0, ..., n. (The reproductive contribution of sterile individuals is zero.)
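The fitness function and the post-selection contributions can be written out directly. This is a small sketch assuming the reading of the text in which each fertile class contributes in proportion to its zygotic frequency times its fitness; the function names are illustrative.

```python
def fitness(k, j, s, h, abnormal=False):
    """Fitness of a Y with k half-functional pairs and j fully nonfunctional pairs."""
    if abnormal or j > 0:
        return 0.0  # sterile: an essential gene type is missing, or the Y is abnormal
    return (1.0 - s * h) ** k


def reproductive_contributions(x, s, h):
    """Post-selection share of each fertile class.

    x[k] is the zygotic frequency of fertile males with k mutations (j = 0).
    Sterile males make up the remainder, 1 - sum(x), and contribute nothing.
    Returns (contributions, mean_fitness).
    """
    weighted = [xk * fitness(k, 0, s, h) for k, xk in enumerate(x)]
    w_bar = sum(weighted)  # mean male fitness with respect to the Y
    return [v / w_bar for v in weighted], w_bar
```

For instance, with x = [0.5, 0.3, 0.1] (so 10% of zygotes are sterile), s = 0.5, and h = 0.2, mean fitness is 0.5 + 0.3(0.9) + 0.1(0.81) = 0.851 and the contributions sum to one.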

To facilitate analytical tractability, we assume that the rates of recombination and mutation are both small enough to ignore multiple mutation and multiple recombination events per generation. In other words, there is a zero probability of an individual with k mutations producing a fertile son with k − 2 or k + 2 mutations. This assumption is justified as long as 2nu ≪ 1 and nd ≪ 1, which requires that the mutation and recombination rate per locus is small, and the number of loci mutable to a nonfunctional allele is much smaller than the reciprocal of the mutation or gene conversion rate: n ≪ min[1/u, 1/d]. Because n represents a small fraction of Y-linked nucleotides (i.e., it represents a very specific functional class), this assumption is biologically reasonable. Nevertheless, a violation of these assumptions is expected to make our results conservative by downwardly biasing the speed of Muller's ratchet (which is enhanced by a higher mutation rate) and minimizing the positive effect of gene conversion (higher gene conversion rates increasingly counteract Muller's ratchet). Extending across the 2n loci, the probability that a Y chromosome experiences one mutation is Pr(M = 1) = 2nu = U. The probability that zero mutations occur is Pr(M = 0) = 1 − U. The probability of a recombination event between one of the n paralog pairs is Pr(R = 1) = nd = D. The probability of no recombination is Pr(R = 0) = 1 − D.

Given a sequence of events of (i) birth, (ii) selection, (iii) mutation, (iv) recombination, and (v) random mating, the frequency of fertile males in the next generation follows the recursion Math

The “least-loaded” (k = 0) and “most-loaded” (k = n) classes of fertile males follow the recursions Math and Math, respectively. The frequency of sterile males in the next generation (via crossover, mutation, or gene conversion) will be Math
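One way to make the multilocus life cycle concrete is to write the single-event transitions explicitly. The sketch below is derived from the verbal description of the model, not copied from the published recursions, and it rests on these assumptions: a mutation hitting an intact pair moves a Y from k to k + 1; a mutation hitting the last functional copy of a pair causes sterility; a mutation hitting an already nonfunctional copy changes nothing; a crossover in any pair produces an abnormal (sterile) Y; and conversion in a half-functional pair restores function with probability b or silences the pair with probability 1 − b. Mutation and recombination are applied as two sequential steps. All function names are hypothetical.

```python
def next_generation(x, n, u, d, c, b, s, h):
    """One generation for fertile classes x[0..n]; returns (new_x, w_bar).

    Frequencies are zygotic, so sum(new_x) < 1 whenever sterile males are
    produced; the sterile frequency is 1 - sum(new_x).
    """
    w = [(1 - s * h) ** k for k in range(n + 1)]
    w_bar = sum(xk * wk for xk, wk in zip(x, w))
    p = [xk * wk / w_bar for xk, wk in zip(x, w)]  # after selection

    # mutation: at most one hit among the 2n loci
    m = [0.0] * (n + 1)
    for k, pk in enumerate(p):
        up = 2 * (n - k) * u     # hits an intact pair -> k + 1
        lethal = k * u           # hits the last functional copy -> sterile
        m[k] += pk * (1 - up - lethal)
        if k < n:
            m[k + 1] += pk * up

    # recombination: at most one event among the n pairs
    r = [0.0] * (n + 1)
    for k, mk in enumerate(m):
        het = k * d * (1 - c)    # conversion within a half-functional pair
        cross = n * d * c        # crossover in any pair -> abnormal, sterile
        r[k] += mk * (1 - het - cross)
        if k > 0:
            r[k - 1] += mk * het * b  # functional copy overwrites the broken one
        # with probability het * (1 - b) the pair loses function -> sterile
    return r, w_bar


def equilibrate(n, u, d, c, b, s, h, generations=20000):
    """Iterate from a mutation-free population; returns (x, w_bar)."""
    x = [1.0] + [0.0] * n
    w_bar = 1.0
    for _ in range(generations):
        x, w_bar = next_generation(x, n, u, d, c, b, s, h)
    return x, w_bar
```

Under these assumptions, comparing `equilibrate(..., d=0, ...)` with `equilibrate(..., d=2*u, ...)` shows how gene conversion changes the frequency of the least-loaded class `x[0]`.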

Deterministic equilibria and mean fitness of the Y:

When there is no recombination between duplicates (D = 0), mean Y chromosome fitness as well as the distribution of mutations among individuals can be analytically determined. If mutations that eliminate duplicate gene function are deleterious (sh > 0), and the number of unique Y-linked genes is large (n ≫ U/sh), the population approaches the equilibrium: Math, Math, and Math. This is analogous to the case of mutation–selection balance with incomplete dominance (sh > 0), with a Y-linked genetic load of L = U ≈ 1 − e^(−U) (e.g., Haldane 1937; Kimura and Maruyama 1966; Kondrashov and Crow 1988). When knocking out a duplicate yields no fitness effect (sh = 0), or the number of Y-linked genes is small (n ≪ U/sh), the population approaches the equilibrium: Math, Math, and Math. Under this scenario, the genetic load is reduced by a factor of 2, to L = U/2 ≈ 1 − e^(−U/2) (Haldane 1937).
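The two load values can be motivated by a self-replacement argument. This is a sketch under the single-event assumptions stated in the text (at most one mutation per Y per generation, D = 0), not the derivation used in the original analysis: whichever class persists at equilibrium must exactly replace itself each generation.

```latex
\begin{align*}
\text{least-loaded class } (k = 0,\ sh > 0):\quad
  & \hat{x}_0 = \hat{x}_0\,\frac{1 - U}{\bar{w}}
  \;\Rightarrow\; \bar{w} = 1 - U, \quad L = 1 - \bar{w} = U \\
\text{most-loaded class } (k = n,\ sh = 0):\quad
  & \hat{x}_n = \hat{x}_n\,\frac{1 - nu}{\bar{w}}
  \;\Rightarrow\; \bar{w} = 1 - \tfrac{U}{2}, \quad L = \tfrac{U}{2}
\end{align*}
```

In the second case only the n remaining functional copies are mutational targets that matter (a hit sterilizes the male), while hits on the n already nonfunctional copies change nothing, which is why the effective rate is nu = U/2.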

Gene conversion between duplicates increases the frequency of the least-mutated class (Figure 3 and Figure S3), whether or not there is a gene conversion bias favoring functional over nonfunctional loci. The frequency of the least-loaded class represents a quantity of particular importance for adaptation on clonally transmitted chromosomes such as the Y (Charlesworth and Charlesworth 2000). Without recombination, the unit of selection is the chromosome rather than the locus. Beneficial mutations that are associated with mutation-free genetic backgrounds are relatively likely to become fixed (Peck 1994; Orr and Kim 1998) and do not permit hitchhiking of deleterious mutations during a selective sweep (Rice 1987). However, as the frequency of the least-loaded class becomes small, virtually all beneficial mutations will arise in inferior genetic backgrounds. This will limit the adaptive potential of the Y chromosome. Because it increases the fraction of mutant-free Y chromosomes, gene conversion is expected to enhance the fixation probability for beneficial mutations and can reduce the deleterious consequences of hitchhiking.

Figure 3.—

Gene conversion increases the frequency of Y chromosome haplotypes that carry zero deleterious mutations (i.e., the “least-loaded” genotypic class). The cost of a mutation eliminating function of a copy of each duplicate pair is represented by sh (this cost increases from left to right on the x-axis). The relative proportion of mutation-free Y chromosomes in recombining vs. nonrecombining populations is presented as a ratio of the two scenarios (gene conversion increases the proportion of mutation-free Y's when this ratio is greater than one). The number of distinct, Y-linked genes is represented by n. Results are presented for c = 0, b = 0.5, and u = 5 × 10^−4, per locus, per generation, and D = U = 2nu. Additional results are presented in Figure S3.

By shifting the mutational distribution toward relatively mutation-free genotypes, gene conversion also increases mean Y chromosome fitness. This effect does not depend on a gene conversion bias, but can be amplified when conversion events favor functional over nonfunctional variants (for models yielding similar conclusions about the genetic load, albeit by different approaches, see Bengtsson 1986, 1990, and especially Ohta 1989).

These long-term effects of gene conversion can be accounted for by a straightforward explanation. When the fitness cost of silencing both copies of a duplicate pair is much greater than the cost of silencing one of the copies (when duplicates partially or completely mask deleterious mutations: h < 0.5), selection across Y chromosomes mimics truncation selection, which is particularly efficient at removing deleterious alleles (e.g., Kondrashov 1988; Ohta 1989). Truncation selection arises because mutations on a relatively mutation-free Y will generally affect one copy of a pair, with the second, functional copy compensating for loss of the first. As the number of mutations on a Y increases, so does the probability of silencing the second copy of a pair. Consequently, the deleterious effect of each mutation increases faster than linearly with the number of mutations carried on a Y.

Without recombination, the accumulation of mutations is unidirectional, and the population will tend to evolve toward the edge of the truncation point (n mutations at distinct genes), particularly if masking by duplicates is strong (i.e., having two functional copies provides the same fitness as one copy). At the extreme of sh = 0 (complete masking), the population evolves to contain n functional genes, each distinct. Gene conversion restores variability by permitting bidirectional transitions (e.g., k to k − 1 and k + 1 mutations). Y chromosomes that are closer to the truncation point have a higher probability of transitioning (by mutation or recombination) beyond the truncation point where they are removed by selection. Consequently, the population distribution shifts toward fewer mutations per Y. However, if selection in favor of functional duplicates is strong relative to the number of Y-linked genes (sh > 0; n large), most individuals will carry few mutations, the truncation point becomes irrelevant to Y chromosome evolution, selection shifts toward multiplicative epistasis, and gene conversion does not strongly influence mean fitness or the distribution of mutations among Y chromosomes. This explanation accounts for the decreased impact of gene conversion on mutation-free Y chromosomes, as the strength of selection (sh) increases (Figure 3 and Figure S3).

Muller's ratchet and the accumulation of nonfunctional genes:

The deterministic results (presented above) represent an upper limit for Y chromosome fitness. In finite populations, where Muller's ratchet operates, mean fitness can further decrease with each successive loss of “mutation-free” individuals. Once lost from the population, mutation-free genotypes are unlikely to be recovered by back mutation or positive selection because they must initially arise within the current least-loaded class and subsequently avoid stochastic loss (Peck 1994; Orr and Kim 1998; Gordo and Charlesworth 2000).

To explore the influence of gene conversion on the rate and severity of Y chromosome degeneration via Muller's ratchet, we conducted a series of stochastic simulations, varying the selection and recombinational parameters (u, h, n, d, c, b). We first use the recursions presented above to bring the frequencies of each genotypic class to deterministic equilibrium. Convergence to equilibrium is followed by 100,000 generations of simulation under a mutation–selection–drift model and constant male population size. For each generation, genotype frequencies were sampled from a pseudorandom multinomial distribution (pseudorandom numbers generated with R; R Development Core Team 2005), with genotypes randomly sampled after selection, mutation, and recombination.
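The simulation scheme can be illustrated with a compact Wright-Fisher sketch. It is not the R code used for the figures and makes several labeled assumptions: transitions follow the single-event rules described in the text (as interpreted here), each generation samples N zygotes including sterile ones, and runs start from a mutation-free population rather than from the deterministic equilibrium. Function names are hypothetical.

```python
import random


def expected_next(x, n, u, d, c, b, sh):
    """Expected zygote frequencies [fertile k = 0..n, sterile] after one generation."""
    w = [(1 - sh) ** k for k in range(n + 1)]
    w_bar = sum(xk * wk for xk, wk in zip(x, w))
    out = [0.0] * (n + 2)
    for k in range(n + 1):
        pk = x[k] * w[k] / w_bar               # share of fathers after selection
        up, lethal = 2 * (n - k) * u, k * u    # single-mutation outcomes
        after_mut = {k: pk * (1 - up - lethal)}
        if k < n:
            after_mut[k + 1] = pk * up
        for kk, mass in after_mut.items():
            het, cross = kk * d * (1 - c), n * d * c
            out[kk] += mass * (1 - het - cross)
            if kk > 0:
                out[kk - 1] += mass * het * b  # conversion restores function
    out[n + 1] = max(0.0, 1.0 - sum(out[:n + 1]))  # all sterile outcomes
    return out


def simulate_ratchet(N, n, u, d, c, b, sh, generations, seed=0):
    """Return (final counts per class, mean mutations per fertile Y)."""
    rng = random.Random(seed)
    counts = [N] + [0] * (n + 1)               # every Y starts mutation-free
    for _ in range(generations):
        if sum(counts[:n + 1]) == 0:
            break                              # no fertile males remain
        x = [cnt / N for cnt in counts[:n + 1]]
        probs = expected_next(x, n, u, d, c, b, sh)
        draws = rng.choices(range(n + 2), weights=probs, k=N)  # genetic drift
        counts = [0] * (n + 2)
        for g in draws:
            counts[g] += 1
    fertile = sum(counts[:n + 1])
    if fertile == 0:
        return counts, float("nan")
    mean_k = sum(k * cnt for k, cnt in enumerate(counts[:n + 1])) / fertile
    return counts, mean_k
```

Running `simulate_ratchet` over several seeds with d = 0 and with d > 0, and averaging the mean number of mutations per fertile Y, gives a rough picture of how gene conversion slows the loss of functional copies; single runs are noisy when N is small.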

When there is no gene conversion between duplicates, Muller's ratchet can operate rapidly, causing Y-linked fitness decay and loss of functional genes. Representative simulation results are shown in Figures 4 and 5. In agreement with previous theory (Haigh 1978; Gordo and Charlesworth 2000; Bachtrog 2008), the impact of the ratchet is strongest when the ancestral Y carries many functional gene duplicates and when mutations have small individual fitness effects. Relatively low rates of gene conversion can rescue Y-linked genes from stochastic loss via Muller's ratchet and thereby increase mean fitness of the Y (Figures 4 and 5). Increasing the total mutation and gene conversion rates on the Y (U and D, respectively) amplifies the differences between recombining and nonrecombining chromosomes, whereas a decrease in these compound parameters (U, D → 0) eliminates these long-term evolutionary differences. This effect occurs both with and without biased gene conversion between duplicates.

Figure 4.—

Intrapalindrome gene conversion prevents the erosion of Y chromosome gene content and enhances adaptation on the Y. N represents the Y-linked effective size, sh is the fitness cost associated with mutations to one copy of each duplicate pair, t refers to the generation within the simulation, and n is the number of distinct genes on the chromosome (including duplicates, each Y carries 2n genes). Results are presented for c = 0, b = 0.5, and u = 5 × 10^−4, per locus, per generation. Each data point represents the average of 10 simulation replicates. Since estimates of gene conversion from human–chimp comparisons suggest that D may be considerably higher than the mutation rate (Rozen et al. 2003), the results, if anything, will underestimate the impact of gene conversion on functional gene retention.

Figure 5.—

The proportion of loss-of-function duplicates following 100,000 generations of mutation, selection, and genetic drift. Parameters are described in the Figure 4 legend and throughout the text. Results are presented for c = 0, b = 0.5, u = 5 × 10⁻⁴, per locus, per generation, and D on the order of the mutation rate, D = U = 2nu. Each point represents the average of 10 replicate simulations.

Gene conversion appears to constrain the accumulation of deleterious mutations in a way that is similar to crossing over in traditional models of Muller's ratchet. Under both models, the rate at which the ratchet “clicks” (that is, the least mutated class of individuals is lost) is highest when individual mutations are weakly deleterious and/or the chromosome-wide mutation rate (an increasing function of the mutation rate per locus and the number of loci) is high (Charlesworth and Charlesworth 2000; Bachtrog 2008). The similar consequences of gene conversion and crossing over are not surprising: both processes permit chromosomal transitions from more to fewer mutations and this, along with purifying selection, can counteract the steady accumulation of new deleterious mutations within a population.
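The dependence of the click rate on mutational effect size and chromosome-wide mutation rate can be made concrete with a standard result from Haigh's (1978) single-copy model: at mutation–selection balance, the expected size of the least mutated class is n0 = N exp(−U/s), and the ratchet turns quickly when n0 is small. The sketch below is only an illustration under that simpler model, not the duplicate-gene model analyzed here; the values of N, s, and n are hypothetical, while u = 5 × 10⁻⁴ and U = 2nu follow the figure legends.

```python
import math

def least_loaded_class_size(N, U, s):
    """Expected number of mutation-free chromosomes at mutation-selection
    balance in Haigh's (1978) model: n0 = N * exp(-U / s)."""
    return N * math.exp(-U / s)

if __name__ == "__main__":
    u = 5e-4   # per-locus mutation rate (from the figure legends)
    N = 1000   # hypothetical Y-linked effective size
    for n in (10, 50):           # hypothetical numbers of distinct genes
        U = 2 * n * u            # chromosome-wide rate, U = 2nu
        for s in (0.01, 0.05):   # hypothetical fitness costs per mutation
            n0 = least_loaded_class_size(N, U, s)
            print(f"n={n:3d}  U={U:.3f}  s={s:.2f}  n0={n0:8.2f}")
```

Smaller s or larger U shrinks n0, so the least mutated class is more easily lost by drift, in line with the qualitative statement above.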


Previous theory indicates that selection does not generally favor the invasion of a rare duplicate gene unless there is a direct benefit of carrying an additional gene copy (Clark 1994) or there is recombination between the paralogs (Yong 1998; Otto and Yong 2002; Tanaka and Takahasi 2009). We have shown that gene conversion between duplicates can broaden the parameter conditions favoring the invasion of duplicate genes from low initial frequency. Biased gene conversion, with conversion favoring undamaged over damaged gene copies, can generate positive selection for rare duplicates that do not provide a direct fitness benefit (that is, individuals with two functional copies have fitness equal to those with one). However, the strength of positive selection acting on such duplicates is weak (on the order of the mutation rate). This result is in agreement with a recent simulation study, which also found that gene conversion does not strongly promote the invasion of new Y-linked duplicates (Marais et al. 2010).

The invasion dynamics of rare duplicate genes bear some similarities to models of adaptation within gene families (Walsh 1985; Mano and Innan 2008), which show that gene conversion can enhance the probability that a weakly beneficial allele becomes fixed. In our model, gene conversion alone is unlikely to overpower genetic drift unless Nu ≫ 1, yet this condition is rarely (if ever) expected to arise within animal populations, particularly with respect to Y-linked loci that have reduced effective size relative to other nuclear genes. Furthermore, there is no biological reason to suspect that gene conversion will necessarily be biased against mutant copies of a particular gene. We therefore expect that Y-linked duplicates will most likely become fixed by genetic drift, unless they directly increase the fitness of those who carry them (for additional discussion of duplicate gene fixation, see Innan and Kondrashov 2010). Likewise, deleterious Y-linked crossover events can generate selection against gene duplicates. This factor will have little impact on the probability of fixation or loss unless the crossover rate is relatively high and direct selection on the duplicate is weak or absent.
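The claim that conversion-induced selection rarely overpowers drift can be checked numerically with Kimura's diffusion approximation for the fixation probability of a single new copy in a haploid Wright–Fisher population, P = (1 − e^(−2s)) / (1 − e^(−2Ns)). This is a textbook approximation, not the eigenvalue-based analysis used in this article; the sketch assumes a selection coefficient s of the same order as the mutation rate u, and the population sizes are hypothetical.

```python
import math

def fixation_probability(N, s):
    """Diffusion approximation for a single new haploid variant:
    P = (1 - exp(-2s)) / (1 - exp(-2Ns)); the neutral limit is 1/N."""
    if s == 0:
        return 1.0 / N
    # expm1 keeps precision when s and N*s are very small
    return math.expm1(-2.0 * s) / math.expm1(-2.0 * N * s)

if __name__ == "__main__":
    u = 1e-5  # hypothetical mutation rate; s is assumed to be of this order
    for N in (1_000, 10_000, 1_000_000):
        p = fixation_probability(N, s=u)
        print(f"N={N:>9,d}  Nu={N * u:6.2f}  P/neutral={p * N:8.3f}")
```

The ratio of P to the neutral expectation 1/N stays close to 1 until Nu is well above 1, which is the regime described above as unlikely for Y-linked loci.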

Y chromosome recombination can exert a profound influence on the retention of functional copies of genes that have already become fixed within the population. Our simulations show that low rates of gene conversion are sufficient to maintain Y-linked genes and counteract degradation via Muller's ratchet. These results are conservative, as higher rates enhance the preservation of functional gene copies. Thus, once gene conversion has evolved, it can potentially provide a degree of stability on an otherwise evolutionarily unstable Y chromosome. Interestingly, Marais et al. (2010) observed that the rate of invasion for gene conversion modifier alleles does not greatly exceed neutral expectations unless they greatly increase the gene conversion rate. This suggests that, while low rates of conversion may slow the rate of Muller's ratchet, the evolution of the gene conversion rate itself may be much more constrained.

The large number of genes within the “ampliconic” region of the human Y (Skaletsky et al. 2003) should provide a large target for mutations, creating an opportunity for Muller's ratchet to act. This role of gene conversion on the Y is therefore likely to explain patterns of gene retention on the human Y chromosome. It is less clear whether similar patterns characterize other animal species. Current (albeit incomplete) data suggest that gene family amplification and retention might be common Y chromosome attributes (Rozen et al. 2003; Verkaar et al. 2004; Murphy et al. 2006; Alföldi 2008; Wilkerson et al. 2008; Krsticevic et al. 2009), although the prevalence of Y-linked gene conversion outside the human and chimp lineages remains uncertain (but see Geraldes et al. 2010). Future sequencing efforts, including evidence for gene conversion among Y-linked genes in nonhuman species, will help to determine the general relevance of the duplication and gene conversion model presented here.

Within-chromosome crossovers can generate an abnormal, sterility-inducing Y (Lange et al. 2009) and potentially represent a deleterious fitness consequence of Y-linked recombination. As the number of Y-linked loci that interact via recombination increases, so too should the rate of deleterious crossovers. This cost implies an upper limit to the number of Y-linked duplicate genes (or, in humans, to the size of Y-linked palindromes), beyond which crossing over becomes unbearably costly. From this perspective, duplication and recombination represent a costly mechanism of Y chromosome preservation.

In addition to the Y chromosome, our findings have implications for asexually reproducing species. Recent reports suggest that the asexual bdelloid rotifers are tetraploid (Mark Welch et al. 2008) and that gene conversion occurs between gene copies (Hur et al. 2008; Mark Welch et al. 2008). Our model supports the verbal claim that gene conversion between homologous gene copies might aid in DNA damage repair and prevent the genomic degradation that is expected to accompany strict asexual reproduction. Unlike the Y chromosome scenario, crossovers between homologous, tetraploid chromosomes will tend to avoid deleterious chromosomal aberrations. The relative rate of nonhomologous crossovers is an empirical question that may be difficult to assess, given the likely association between chromosome abnormalities and embryonic death, which will lead to a pronounced bias toward “normal” chromosomes. On the other hand, crossing over between homologous chromatids is likely to generate copy number polymorphism, which adds a level of complexity to the evolutionary dynamics of autosomal gene duplicates or gene families. This may lead to different evolutionary consequences of crossing over and gene conversion in asexual lineages compared to the results that we report for the Y chromosome and represents an interesting avenue for future theoretical research.


We are grateful to Roman Arguello, Clement Chow, Margarida Cardoso-Moreira, Qixin He, Lacey Knowles, Amanda Larracuente, Rich Meisel, Nadia Singh, and two anonymous reviewers for discussion and comments that substantially improved the quality of the manuscript and to Sarah Otto for comments about the eigenvalue-selection-coefficient approximation and for sharing an unpublished manuscript. This work was supported by National Institutes of Health grant GM64590 to A.G.C. and A. B. Carvalho.


  • Received March 17, 2010.
  • Accepted May 31, 2010.

