Abstract
Hotspots regulate the position and frequency of Spo11 (Rec12)-initiated meiotic recombination, but paradoxically they are suicidal and are somehow resurrected elsewhere in the genome. After the DNA sequence-dependent activation of hotspots was discovered in fission yeast, nearly two decades elapsed before the key realizations that (A) DNA site-dependent regulation is broadly conserved and (B) individual eukaryotes have multiple different DNA sequence motifs that activate hotspots. From our perspective, such findings provide a conceptually straightforward solution to the hotspot paradox and can explain other, seemingly complex features of meiotic recombination. We describe how a small number of single-base-pair substitutions can generate hotspots de novo and dramatically alter their distribution in the genome. This model also shows how equilibrium rate kinetics could maintain the presence of hotspots over evolutionary timescales, without strong selective pressures invoked previously, and explains why hotspots localize preferentially to intergenic regions and introns. The model is robust enough to account for all hotspots of humans and chimpanzees repositioned since their divergence from the latest common ancestor.
The conflict between the evolutionary persistence of hotspots and the instability intrinsic to their mode of action implies a deep flaw in our understanding of the mechanism of meiotic recombination (Pineda-Krch and Redfield 2005, p. 2321).
...Homologous recombination may be regulated primarily by a finite number of discrete DNA sites and proteins that interact with those sites (Wahls and Smith 1994, abstract).
Meiotic Recombination Hotspots
In meiosis, crossover recombination structures help to align paired homologous chromosomes on the metaphase plate of the first meiotic division, and this alignment is required for the faithful segregation of homologs (Gerton and Hawley 2005). Meiotic recombination is clustered at hotspots that regulate its frequency and distribution along chromosomes. Jürg Kohli’s laboratory discovered DNA sequence-dependent activation of recombination hotspots in fission yeast ∼20 years ago (Schuchert et al. 1991). At about the same time, Tom Petes’ laboratory provided evidence for such regulation in budding yeast (White et al. 1991, 1993; Fan et al. 1995), which is highly diverged from fission yeast. The yeast paradigms long stood alone, but we now know that DNA sequence elements also help to position meiotic recombination at hotspots in mammals (Myers et al. 2008; Baudat et al. 2010). In each case, it appears that sequence-specific DNA-binding proteins trigger epigenetic modifications of chromatin structure that help to regulate the initiation of recombination by Spo11 (Rec12) (Kon et al. 1997; Yamada et al. 2004; Hirota et al. 2007; Buard et al. 2009; Baudat et al. 2010; Myers et al. 2010; Parvanov et al. 2010).
Three approaches have been used to discover regulatory (hotspot) DNA sequence motifs. The first approach has been to map hotspot locations genetically and then, by using scanning base-pair substitution mutagenesis in the genome, to define the DNA sequence(s) required for activity (Gutz 1971; Szankasi et al. 1988; Schuchert et al. 1991). This approach is laborious and is practical only in model organisms such as yeast. Indeed, the scanning mutational, “gold standard” approach for documenting unambiguously the DNA sequence dependence of hotspots has, to our knowledge, been carried out only in fission yeast (Schuchert et al. 1991; Steiner et al. 2009, 2011).
The second approach, made possible by high-resolution mapping of hotspot positions across sequenced genomes (e.g., Gerton et al. 2000; Myers et al. 2005; Ptak et al. 2005; Winckler et al. 2005; Cromie et al. 2007), has been to search computationally for motifs that are nonrandomly associated with hotspots. This method identified a consensus motif that is present at a subset of meiotic crossover hotspots (COH) and at hotspots of nonallelic homologous recombination (NAHR) in humans (Myers et al. 2008). Nucleotide polymorphisms within the motif correlate with attenuated hotspot activity, providing evidence that the motif is indeed recombinogenic.1 Closely related hotspot-associated motifs were subsequently detected in humans and mice (Baudat et al. 2010; Kong et al. 2010; Myers et al. 2010). One caveat is that correlation does not demonstrate causation, and there is at least one example where a hotspot motif predicted computationally (Blumental-Perry et al. 2000) was dispensable for hotspot activity when tested experimentally (Haring et al. 2004). Computational searches are prone to false-negative results, too, and can miss (Cromie et al. 2007) motifs that are known to be recombinogenic and that are associated with >20% of hotspots throughout the genome (Schuchert et al. 1991; Wahls and Davidson 2010).
Recognizing that hotspot motifs are likely prevalent but elusive, Walter Steiner’s group developed a third, “brute force” biological approach for motif discovery (Steiner et al. 2009, 2011). They screened individually the frequency of meiotic recombination in ∼46,000 fission yeast strains harboring short, randomized nucleotide sequences within a test locus. A subset of candidate motifs was subsequently retested using the rigorous criterion of base-pair substitution mutagenesis. The authors showed that at least five distinct DNA sequence motifs activate hotspots, and they provided compelling evidence that there are many more regulatory motifs yet to be discovered.
The GENETICS articles by Steiner et al. (2009, 2011) deserve special mention, both for the insightful experimental approach and for the implications of the findings. The multiplicity of cis-acting regulatory elements is striking, as is the fact that very different motifs and binding proteins can function redundantly to promote recombination. Such findings support the idea that discrete DNA sites regulate much, perhaps most, meiotic recombination. They also provide a fresh new perspective with which to consider published data and long-extant puzzles, as described below.
Hotspot Paradox
About 40 years ago Herbert Gutz (1971) described the fundamental characteristics of meiotic recombination hotspots. First, hotspots are regulated in cis because they are allele specific and display Mendelian inheritance. Second, a given hotspot allele promotes recombination in only a subset of meioses. Third, a chromosome that harbors an activated hotspot serves preferentially as the recipient of genetic information from the homologous chromosome (gene conversion), and a subset of gene conversions is accompanied by crossing over (Gutz 1971; Schuchert and Kohli 1988). Consequently, when heterozygous, the chromosome region harboring the hotspot is preferentially converted into a hotspot-inactive state (Figure 1A). The conversion rate varies according to hotspot, with at least 1% of meiotic products being converted at highly active hotspots of mice and yeast (Grimm et al. 1994; Cromie et al. 2005; Guillon et al. 2005). Hotspots therefore seed their own destruction and on the evolutionary timescale should be lost from the population.
Figure 1. Model for the evolutionarily rapid redistribution of meiotic recombination hotspots. (A) Hotspots (“H”) act as recipients of genetic information during gene conversion, leading preferentially to loss of the hotspot (“C,” cold). (B) For every recombination-promoting DNA sequence motif [the M26 DNA site of fission yeast is illustrated (Schuchert et al. 1991)], there is a reservoir of cryptic motifs. (C) Over time, DNA sequence-dependent hotspots are rendered inactive by mutation (“M”) or gene conversion (“GC”). Hotspots arise de novo when mutations change cryptic DNA sequence motifs into hotspot motifs. Consequently, a small number of base-pair changes can dramatically alter the distribution of hotspots in the genome.
The genomic distribution of hotspots varies markedly between closely related taxa (Ptak et al. 2005; Winckler et al. 2005), by ∼50% or more between species of the same genus (Tsai et al. 2010) and, to a lesser extent, even between members of the same species (Kong et al. 2010), illustrating the evolutionary transience of hotspots. These changes occur rapidly, and even in humans one can chart the eventual death of individual hotspots (Jeffreys and Neumann 2009). Nevertheless, recombination hotspots remain abundant in sexually active eukaryotes. Therein lies the “hotspot paradox” (Boulton et al. 1997). Individual hotspots are suicidal, but, collectively, hotspots are somehow maintained. Moreover, the dynamic, evolutionarily rapid redistribution of hotspots requires that the mechanisms for replacement be facile and relatively plastic with regard to chromosomal location. We suggest that the mechanisms are coupled to, and can be explained fully by, the DNA sequence-dependent regulation of recombination hotspots.
Equilibrium Dynamics of Mutations and Gene Conversion: A Model
Two mechanisms have been shown experimentally to remove and add recombination hotspots in the genome. These are gene conversion (Gutz 1971) and base-pair substitutions (Szankasi et al. 1988; Schuchert et al. 1991). Below we describe evidence and models for how these mechanisms participate in a self-regulating, dynamic equilibrium that helps to maintain and reposition hotspots in the genome over time.
There are now at least 10 different DNA sequence motifs demonstrated by base-pair mutagenesis (5 in fission yeast), implicated by deletion studies (3 in budding yeast), or inferred from association studies (1 class each in humans and mice) to regulate hotspot activity (Schuchert et al. 1991; White et al. 1993; Myers et al. 2008; Steiner et al. 2009; Baudat et al. 2010). Each DNA sequence motif is short, commensurate with the molecular determinants for binding of a hotspot-activating protein or complex. For example, only 7 bp are required for hotspot activity of the M26 DNA site in fission yeast (Schuchert et al. 1991), and only 8 bp of the 13-bp human crossover consensus motif are conserved (Myers et al. 2008). Notably, base-pair substitutions that create or ablate discrete DNA sites can generate and abolish hotspot activity, respectively (Schuchert et al. 1991; Steiner and Smith 2005; Steiner et al. 2009, 2011). Thus, within the genome is a collection of hotspot-active DNA sites and a reservoir of “cryptic” DNA sequence motifs that can be rendered active by as little as a single-base-pair substitution (Figure 1B).
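The notion of a cryptic reservoir can be made concrete with a minimal sketch. It assumes the M26 heptamer 5′-ATGACGT-3′ as the example motif, scans a single strand only, and treats every single-base variant as potentially cryptic; the function names and the toy sequence are hypothetical and serve only as illustration.

```python
from typing import List

BASES = "ACGT"

# M26 heptamer of fission yeast (Schuchert et al. 1991), used here only as an example.
M26 = "ATGACGT"


def single_bp_variants(motif: str) -> List[str]:
    """Return every sequence that differs from `motif` at exactly one position."""
    variants = []
    for i, original in enumerate(motif):
        for base in BASES:
            if base != original:
                variants.append(motif[:i] + base + motif[i + 1:])
    return variants


def find_cryptic_sites(sequence: str, motif: str) -> List[int]:
    """Return 0-based start positions where `sequence` carries a cryptic variant.

    Only the given strand is scanned; the reverse complement is ignored.
    """
    cryptic = set(single_bp_variants(motif))
    k = len(motif)
    return [i for i in range(len(sequence) - k + 1) if sequence[i:i + k] in cryptic]


if __name__ == "__main__":
    print(len(single_bp_variants(M26)))  # 21 variants for a 7-bp site
    # Hypothetical toy sequence: ATGACGG at position 2 is one substitution from M26.
    print(find_cryptic_sites("CCATGACGGTT", M26))  # [2]
```

A 7-bp site yields 3 × 7 = 21 variants, which is the size of the reservoir per motif under these assumptions.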
A model for evolutionarily rapid redistribution of meiotic recombination hotspots is presented in Figure 1C. Individual, DNA sequence-dependent hotspots are inactivated by mutations within the DNA site or by gene conversion in meiosis. Opposing this trend is the de novo generation of recombinogenic DNA sites by mutations within cryptic DNA sequence motifs. In contrast, previous models suggested that cis-acting mutations cannot compensate successfully for the loss of hotspots by gene conversion (Boulton et al. 1997; Pineda-Krch and Redfield 2005; Coop and Myers 2007; Peters 2008). The distinction (and major conceptual shift) between models lies in the nature and density of DNA sequence motifs within the genome, which are factors not considered in the prior reports. (In a subsequent section, “As Easy as A-G-C-T?”, we describe a third class of model that, like ours, involves DNA sequence motifs.)
The frequency with which spontaneous mutations generate hotspots de novo is likely high enough to support rapid evolutionary change because the reservoir of cryptic DNA sequence motifs is vast. For example, every regulatory DNA site 7 bp in length has 21 cryptic permutations that might be rendered hotspot active by a single-base-pair substitution (Figure 1B). These occur on average once every 780 nucleotides along each strand of DNA in the genome (4^7 ÷ 21, assuming random sequence DNA). In fission yeast (Schuchert et al. 1991; Steiner et al. 2009), in budding yeast (White et al. 1991, 1993; Fan et al. 1995), and likely in humans (Myers et al. 2008; Berg et al. 2010), multiple different DNA sequence motifs are recombinogenic, and each of those motifs has a corresponding cryptic reservoir (e.g., Table S1). If one considers the recombinogenic DNA sequence motifs already defined experimentally in fission yeast (Schuchert et al. 1991; Steiner et al. 2009, 2011), there is on average a cryptic, single-base-pair variant DNA element about every 194 bp along the genome (Table S1). In other words, ∼0.17% (1/582) of spontaneous mutations will generate a DNA sequence motif already known to be recombinogenic. This value calculated from the experimentally defined motifs sets the lower bound because there are additional recombination-promoting DNA sequences of fission yeast whose functional motifs remain to be defined by base-pair mutagenesis, and still more are predicted statistically (Steiner et al. 2009). Given the surprisingly high frequency with which hotspot motifs are created (at least 0.17% per mutation), and that there are many intervening mitoses for each meiosis, the rate at which hotspots are created de novo by mutations might offset the rate at which they are lost through meiotic gene conversion. Further evidence supporting this model and its applicability to primates and to evolutionary timescales is described below.
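The arithmetic in this paragraph can be reproduced with a short sketch. It assumes random-sequence DNA with equal base frequencies, one activating position per cryptic site, and three equally likely substitutions at that position. The factor of three is one reading of how the 1/582 figure follows from the 194-bp spacing (582 = 3 × 194), and the function names are hypothetical.

```python
def cryptic_spacing(motif_length: int) -> float:
    """Mean distance (nt, one strand) between cryptic variants of one motif.

    Assumes random-sequence DNA with equal base frequencies.
    """
    n_variants = 3 * motif_length           # three alternative bases per position
    return 4 ** motif_length / n_variants   # a specific k-mer occurs once per 4**k nt


def activating_fraction(spacing_bp: float) -> float:
    """Fraction of single-base substitutions that convert a cryptic site to a motif.

    Assumes one activating position per cryptic site and three equally likely
    substitutions at that position, only one of which creates the motif.
    """
    return 1.0 / (spacing_bp * 3)


if __name__ == "__main__":
    print(round(cryptic_spacing(7)))                   # 4**7 / 21, about 780 nt
    print(f"{activating_fraction(194) * 100:.2f}%")    # 1/582, about 0.17%
```

With the 194-bp spacing taken from Table S1, the sketch returns the lower-bound fraction quoted in the text.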
Why Do Hotspots Avoid ORFs?
Another long-standing enigma is why recombination hotspots are located preferentially within intergenic regions (IGRs) and occur much less frequently within the ORFs of protein-coding genes (e.g., Nicolas et al. 1989; Baudat and Nicolas 1997; Gerton et al. 2000; Buhler et al. 2007; Cromie et al. 2007; Frazer et al. 2007; Robine et al. 2007). This is apparently not due to avoidance of genes, but rather reflects avoidance of protein-coding regions. Hotspots do occur within genes, but when they do so, they are more prevalent within introns than within exons (Kong et al. 2010). Similarly, while hotspots are underrepresented in transcribed regions coding for proteins, they are abundant in transcribed regions that produce long, polyadenylated, noncoding RNAs (Wahls et al. 2008). We suggest that the regulation of hotspots by short DNA sequence motifs provides a mechanism for these phenomena.
The natural inclination is to ask, what mechanisms direct hotspots preferentially toward noncoding regions? From this perspective our DNA sequence-dependent model might seem unsatisfactory, because if mutations stochastically “sprinkle” hotspot motifs into the genome over time, then one would expect hotspots to arise with equal probability in coding and noncoding regions. One possibility is that natural selection favors newly arising hotspot motifs in noncoding regions (see below). However, one can ask the same question in a different way: what mechanisms might direct hotspots preferentially away from coding regions? This simple change in perspective, applied to existing data, revealed a causal link between molecular mechanisms for hotspot genesis and molecular mechanisms for negative selective forces (reduced organismal fitness) that help to drive the localization of hotspots (Figure 2).
Figure 2. Hotspot-activating mutations within ORFs can, coincidentally, decrease the overall fitness of the organism. (A) Diagram of the fission yeast ade6 locus with positions of alleles in the ORF. (B) The nonsense mutation (lowercase letters) that created the ade6-M26 allele (Szankasi et al. 1988) simultaneously created a 7-bp DNA site (box) that promotes recombination (Schuchert et al. 1991). A similar mutation that created the ade6-M375 allele generated neither a hotspot motif nor hotspot activity. (C) To illustrate published findings (Gutz 1971), we plated serial dilutions of cells on medium with and without adenine. The observed decrease in organismal fitness (single dagger) is due to the opal (stop) codons because it can be alleviated by a suppressor tRNA (asterisk). Furthermore, the decrease in fitness is unrelated to the hotspot motif or to hotspot activity because the hotspot is active in the presence of the suppressor tRNA (Goldman and Smallets 1979). Thus, while the decrease in fitness and the hotspot share a common origin, they are in fact independent (coincidental) consequences of the single-base-pair substitution. And because the de novo hotspot motif is tightly linked to the stop codon, negative selective forces that operate due to and upon the stop codon will also affect the hotspot motif. By these molecular mechanisms, natural selection disfavors most hotspots that arise by mutations in protein-coding regions, relative to those that arise in noncoding regions.
Mutation of cryptic DNA sequence motifs within ORFs can generate hotspot-active DNA sites (Schuchert et al. 1991; Virgin et al. 1995; Steiner et al. 2009), but more often than not such mutations will also alter the sequence of the encoded protein or lead to premature termination of protein synthesis. For example, the single-base-pair substitution that created an M26 hotspot DNA site in the ade6 gene of fission yeast (Schuchert et al. 1991) also introduced a stop codon (Szankasi et al. 1988). This conferred a decrease in fitness because the cells can no longer grow unless adenine is added to the culture medium (Gutz 1971). Therefore natural selection, against the mutated proteins, would disfavor or eliminate the majority of hotspot-proficient DNA sites that arise within ORFs. In contrast, mutations that create hotspot-active DNA sites within IGRs or introns would not trigger negative selection due to mutated proteins. As DNA sequence-regulated hotspots arise de novo in the genome via mutation (a stochastic process), differential selective pressures would subsequently drive their localization away from coding regions and hence toward IGRs and introns. Indeed, the majority of DNA sequence motifs known to activate hotspots are found preferentially within noncoding regions (Steiner et al. 2011), providing evidence that such a drive operates across the genome and supporting our model. Interestingly, the bias is greater for motifs that are active than for those that are inactive, suggesting that additional factors influence the function of hotspot motifs, the selective forces that shape their dynamic repositioning over time, or both.
Our model predicts that a fraction of hotspot-generating mutations within protein-coding regions, namely translationally silent mutations and occasionally missense mutations, would be tolerated. Experimental and correlative data are consistent with these predictions. First, the negative selective forces elicited by a nonsense mutation (coincident with generation of a hotspot motif) can be uncoupled from hotspot activity by a nonsense suppressor tRNA (Figure 2) (Gutz 1971; Goldman and Smallets 1979). Second, while DNA sequence-dependent hotspots are found preferentially in noncoding regions, they are also present in coding regions (Steiner and Smith 2005; Wahls and Davidson 2010; Steiner et al. 2011).
We note that negative selection against de novo hotspot motifs (Figure 2) need not be restricted to protein-coding regions. In principle, any noncoding region of the genome whose primary DNA sequence is important functionally would be similarly constrained as a target for the evolutionary retention of de novo, sequence-dependent hotspots. Well-defined examples of noncoding, sequence-constrained features include centromeres, telomeres, and silent mating-type loci, each of which is depleted for recombination (Choo 1998; Petes 2001). Parenthetically, regional variation of hotspot motif density is not the only factor that enhances or attenuates recombination regionally. Additional factors, such as centromeric heterochromatin, can actively suppress the initiation of recombination (Robine et al. 2007; Ellermeier et al. 2010). As another example, the linear element protein Rec10 (orthologous to synaptonemal complex protein Red1) can suppress the function of some hotspot motifs (Pryce et al. 2005), perhaps by sequestering motifs from their binding proteins, or from the recombination machinery, or both.
Snapshot in Evolutionary Time
Further support of the models can be found in the sequence of the fission yeast genome, which reflects the sum total of dynamic changes in all preceding mitoses and meioses. Mutations are stochastic, so the rates at which any given DNA site is created or ablated by mutation should be equivalent. However, DNA sites that activate recombination hotspots are removed preferentially by gene conversion when heterozygous (Gutz 1971; Schuchert et al. 1991) and hence should be lost from the genome over successive generations. This process is evident because all of the recombinogenic DNA sequence motifs are underrepresented in the genome, relative to the mean frequencies of corresponding single-base-pair variants (cryptic motifs) (Figure 3A). There are several implications.
Figure 3. Equilibrium kinetics of DNA sequence-dependent hotspots. (A) Plots show frequencies of hotspot DNA sequence motifs and each single-base-pair variant (“Cryptic”) motif in the fission yeast genome (Table S1). Bars indicate mean ± SEM. (B) Hotspot and cryptic motifs are interchanged by mutations and gene conversion. The ratio of motif frequencies (indicated by box sizes) affects rate vectors, driving the system to equilibrium.
First, on the laboratory-experimental (Schuchert et al. 1991; Steiner et al. 2009) and inferred-historical (Figure 3A) timescales, multiple, distinct DNA sequence motifs of fission yeast promote meiotic recombination and are suicidal. DNA sequence-dependent hotspots of mammals, inferred from association studies, are likewise suicidal (Jeffreys and Neumann 2009; Myers et al. 2010).
Second, despite their suicidal tendencies, recombination-promoting DNA sequence motifs remain present (Figure 3A) and active (Schuchert et al. 1991; Steiner and Smith 2005; Steiner et al. 2009; Wahls and Davidson 2010) in the fission yeast genome. Such motifs must have arisen, and presumably continue to arise over time, by mutations (Figure 1). This rationale applies equally well for motif-dependent, suicidal hotspots of other eukaryotes.
Third, the persistence of hotspot motifs in the face of experimentally documented loss and gain requires, necessarily, a balance between rates of loss and gain. A mechanism to regulate this balance (Figure 3B) is described in the next paragraph.
Fourth, it has been assumed a priori that strong positive selection is required to maintain the presence of hotspots in the genome (Boulton et al. 1997; Pineda-Krch and Redfield 2005; Coop and Myers 2007; Peters 2008; Ubeda and Wilkins 2011). However, the need to invoke such selection is attenuated if one considers that short DNA sequence motifs regulate recombination. Gene conversion removes hotspots over time, so the frequency of any given hotspot motif in the genome is lower than the average frequency of corresponding cryptic motifs (Figure 3A). Rate kinetics come into play. When the frequency ratio of hotspot-to-cryptic motifs (average frequency) is high (e.g., 1:1), then mutations would have a negligible net effect on motif frequencies and conversion will preferentially reduce the frequency of hotspot motifs (Figure 3B, top). When the ratio of hotspot-to-cryptic motifs is low (e.g., 1:10), then mutations will preferentially increase the frequency of hotspot motifs (Figure 3B, bottom). Thus, substrate concentration-dependent rate kinetics of conversion and mutation would drive the system to equilibrium. Even if a hotspot motif is eliminated from the genome, it would ultimately be resurrected from the vast pool of cryptic motifs by the inexorable stochastic process of mutation (Figures 1B and 3B). Speculatively, first-order rate kinetics could provide the primary force for retention (and commensurate repositioning) of DNA sequence-regulated hotspots over evolutionary timescales.
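The rate argument can be illustrated with a toy first-order model; it is a sketch under stated assumptions, not a fitted description of any genome. Let H be the number of copies of a hotspot motif of length L and c the mean number of copies per cryptic variant, held constant as a vast reservoir. Each cryptic copy becomes the motif at rate mu/3 (one specific substitution), each motif copy is lost by mutation at rate L·mu, and it is lost by gene conversion at rate g. The parameters mu and g, their values, and the function names are all hypothetical.

```python
import math


def equilibrium_ratio(mu: float, g: float, motif_length: int = 7) -> float:
    """Equilibrium ratio of hotspot-motif copies to mean copies per cryptic variant.

    Gain per unit c: 3*L variants, each mutating to the motif at rate mu/3, i.e. L*mu.
    Loss per motif copy: mutation at any of L positions (L*mu) plus conversion (g).
    """
    gain_per_unit_cryptic = motif_length * mu
    loss = motif_length * mu + g
    return gain_per_unit_cryptic / loss


def hotspot_count(t: float, h0: float, c_mean: float, mu: float, g: float,
                  motif_length: int = 7) -> float:
    """Closed-form H(t) for dH/dt = L*mu*c_mean - (L*mu + g)*H, with c_mean constant."""
    loss = motif_length * mu + g
    h_eq = equilibrium_ratio(mu, g, motif_length) * c_mean
    return h_eq + (h0 - h_eq) * math.exp(-loss * t)


if __name__ == "__main__":
    mu, g, c_mean = 1e-9, 63e-9, 1000.0   # hypothetical per-generation rates and pool size
    print(equilibrium_ratio(mu, 0.0))      # no conversion: motif and variants equally common
    print(equilibrium_ratio(mu, g))        # conversion depresses the motif frequency
    # Starting at 1:1 the count falls; starting far below equilibrium it rises.
    print(hotspot_count(1e7, 1000.0, c_mean, mu, g))
    print(hotspot_count(1e7, 10.0, c_mean, mu, g))
```

In this sketch the mutational gain and loss cancel at a 1:1 ratio, so conversion alone lowers the motif count, whereas at low ratios mutation adds motifs faster than it removes them. This mirrors the two cases described in the text.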
Selective Pressures and Population Genetics
We do not mean to imply, in the preceding section, that natural selection has no role in the positioning or evolutionary maintenance of hotspots. Indeed, negative selection demonstrably impinges upon some de novo hotspots (Figure 2). Additional forces, selective and nonselective, likely contribute to hotspot dynamics and a subset of such forces is described here.
Meiotic recombination is broadly conserved and, with few exceptions, crossover recombination is required for the faithful segregation of homologs in the first meiotic division (Gerton and Hawley 2005). However, while natural selection operates to maintain recombination, mathematical modeling indicated that the known benefits of recombination (on fertility and viability) are insufficient to maintain hotspots in the face of their loss by gene conversion (Boulton et al. 1997; Pineda-Krch and Redfield 2005; Peters 2008). Furthermore, in the absence of crossover interference recombination rates can be titrated down by >10-fold, to approximately one crossover per chromosome pair, before there is a perceptible decrease in fitness due to aberrant chromosome segregation (Kan et al. 2011). Two possibilities might explain this. First, selection for meiotic recombination per se has little or no role in maintaining hotspots. Second, selection for recombination has a key role in maintaining hotspots, but we have not yet identified the benefits of having multiple recombination events (active hotspots) on each chromosome in each meiosis. Somewhere between these extremes, natural selection might operate through a combination of cis- and trans-acting factors (Peters 2008; Ubeda and Wilkins 2011), including recombinogenic DNA sites and their binding proteins.
The DNA-binding protein Prdm9 (Meisetz) is implicated to be a chromatin-remodeling transcription factor (Hayashi et al. 2005), a “species-incompatibility” protein involved in hybrid sterility (Mihola et al. 2009), and a hotspot-activating protein (Baudat et al. 2010; Myers et al. 2010; Parvanov et al. 2010). Interestingly, its DNA-binding domain (and hence its hotspot motif selectivity) is evolving rapidly, and there is apparently positive selection for newly arising variants, at least in some taxa (Oliver et al. 2009; Thomas et al. 2009; Ponting 2011). Therefore, selective pressures that drive the rapid evolution of Prdm9 likely help to shape the recombination landscape by changing where Prdm9 promotes recombination, without necessarily changing the overall number of Prdm9-dependent hotspots. It has been suggested, conversely, that selection for the recombination-promoting functions of Prdm9 might help to drive its rapid evolution (Ponting 2011). This idea seems plausible but tentative, particularly in the context of points raised in the preceding paragraph. At issue is what the actual selective forces are and whether any of them operate via recombination.
Boulton et al. (1997) pointed out that hotspots might be maintained in part by selection for aspects of cellular physiology other than recombination. This is illustrated well by the Atf1-Pcr1 heterodimer that, like Prdm9, is both a hotspot-activating factor and a transcription factor (Wahls and Smith 1994; Shiozaki and Russell 1996; Wilkinson et al. 1996; Kon et al. 1997). In its latter role, the Atf1-Pcr1 heterodimer regulates the induced transcription of core environmental stress response genes required for cells to survive under a wide variety of different stress conditions (Chen et al. 2003; Davidson et al. 2004). The decrease in fitness observed in mutants lacking this protein complex is attributable to defects in transcription. Notably, the recombination-promoting activity of the Atf1-Pcr1 heterodimer maps to a different domain of Atf1 than that required for fitness under stress (Gao et al. 2008). Such findings support mechanistically the insight of Boulton et al. (1997). One is left with the question of whether there is any selection for the recombination-promoting activities of proteins such as Prdm9 and the Atf1-Pcr1 heterodimer. Intuition would suggest that such forces exist, even though they have so far eluded detection.
All sequence-specific DNA-binding proteins known or implicated to activate hotspots are also transcription factors (White et al. 1991, 1993; Wahls and Smith 1994; Kon et al. 1997; Steiner et al. 2009, 2011; Baudat et al. 2010; Myers et al. 2010; Parvanov et al. 2010). Discussions of selective forces that might operate upon their DNA-binding sites are beyond the scope of this article, but can be found elsewhere (e.g., Hahn et al. 2003; Doniger and Fay 2007; Babbitt 2010; He et al. 2011). It is sufficient to say that natural selection, which can drive newly arising transcription-factor-binding sites toward some locations of the genome and away from others (e.g., Figure 2), helps coincidentally to drive hotspots preferentially to IGRs and promoter-containing regions. As for hotspot-activating proteins, natural selection upon the DNA sequence motifs might be largely or entirely distinct from natural selection upon recombination hotspot activity itself. Each possible scenario is fully compatible with our model for the stochastic generation of hotspot motifs from cryptic motifs.
Last but not least, population genetics can markedly influence the distribution of hotspots over evolutionary time frames. For example, simulation modeling indicated that allelic drift can affect hotspot positioning in humans due to small effective population sizes and bottlenecks (Coop and Myers 2007). Similarly, population genetics likely had a key role in the positioning of hotspots in closely related species of the genus Saccharomyces (Tsai et al. 2010). The population-genetic influences are not restricted to cis-acting determinants because allelic variation of trans-acting factors (e.g., Prdm9) also has a role in specifying the positions of hotspots (Berg et al. 2010).
For context, two molecular mechanisms are known to ablate and create hotspots. These are gene conversion (Gutz 1971) and base-pair substitutions (Szankasi et al. 1988; Schuchert et al. 1991). These primary determinants of change likely operate together in a dynamic equilibrium to help maintain and reposition DNA sequence-dependent hotspots (Figures 1 and 3). Superimposed are other forces—selective and nonselective—that help to shape the recombination landscape over time (e.g., Figure 2). Summarized metaphorically, base-pair substitutions can seed the field, and additional forces can subsequently do the weeding.
As Easy as A-G-C-T?
DNA sequence motifs that activate meiotic recombination hotspots have been exceptionally difficult to identify, due mainly to their short lengths, their context-variable penetrance,2 and their functional redundancy (Wahls and Davidson 2010). However, the absence of evidence is not evidence for absence. Paradigms established long ago in fission yeast and budding yeast (Schuchert et al. 1991; White et al. 1991, 1993) have recently been confirmed or implicated in metazoans (Myers et al. 2008; Baudat et al. 2010) and protozoa (Jiang et al. 2011). Furthermore, individual species demonstrably have (Steiner et al. 2009, 2011) or likely have (Myers et al. 2005, 2008; Berg et al. 2010; Jiang et al. 2011) multiple, different hotspot-activating motifs. And to the extent tested, each motif apparently helps to regulate as much as 20–41% of recombination in the genome on the basis of frequency distributions of double-stranded DNA breaks and crossovers, respectively (Myers et al. 2008; Wahls and Davidson 2010). Such findings render into theory the hypothesis that “a significant fraction of recombination may be regulated by a finite number of discrete [DNA] sites such as M26” (Wahls and Smith 1994, p. 1699).
We now suggest, on the basis of the experimental evidence discussed in preceding sections, that most de novo hotspots arise from mutations that create recombination-promoting DNA sequence motifs (Figures 1 and 3). Mutations can also alter the DNA-binding-site specificity of hotspot-activating proteins, such as Prdm9, and hence relocate a subset of hotspots (Baudat et al. 2010). However, the frequency of such “shifts” is likely many orders of magnitude lower than the frequency with which mutations change cryptic motifs into hotspot motifs. [Consider a “mutational target” density of a few per genome vs. >6 × 10^4 per genome (Table S1).] And once a shift has occurred, the newly chosen hotspot motif would become subject to the concentration-dependent rate kinetics of conversion and mutation that drive the motif to dynamic equilibrium in the genome (Figure 3). Indeed, both the shift and subsequent motif-specific drive can be inferred from comparing Prdm9-associated motifs of humans to those of chimpanzees (Myers et al. 2010).
The “equilibrium dynamics” model for the overall maintenance of hotspot numbers and the Prdm9 “shift” model are mechanistically distinct and mutually complementary. The shift model provides a way to relocate, in one fell swoop, a subset of hotspots. Such punctuated changes, which for rapidly evolving Prdm9 have probably occurred several times during human evolution (Oliver et al. 2009; Berg et al. 2010), replace one set of motifs with another. The occasional shifts a priori would not substantially change the total number of hotspot motifs in the genome, so it is difficult to envision how the process could successfully counteract the relentless loss of motifs by gene conversion. The equilibrium dynamics model, on the other hand, does not explain punctuated shifts, but it does provide a way to replace continuously those hotspot motifs lost to conversion (Figure 3) and to progressively move hotspots throughout the genome (Figure 1). Notably, this model applies to all DNA sequence-dependent hotspots, not only to those whose binding proteins undergo atypically rapid evolution of DNA-binding-site specificity.³ Together, the two models, each based on the theory that discrete DNA sequence motifs help to position meiotic recombination, can explain many features of hotspot biology.
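The concentration-dependent kinetics invoked above can be illustrated with a toy calculation. The Python sketch below is not the model of Figure 3; it assumes a simple two-state, first-order scheme in which a fixed pool of sites is either cryptic or hotspot-active, with cryptic sites gaining activity by mutation and active sites losing it by conversion or mutation. The function names, the pool size, and both rate constants are hypothetical placeholders chosen only for illustration.

```python
def equilibrium_hotspots(n_sites, k_gain, k_loss):
    """Steady-state number of active motifs for the assumed two-state scheme.

    At equilibrium, gain equals loss: k_gain * (n_sites - H) = k_loss * H.
    """
    return n_sites * k_gain / (k_gain + k_loss)


def simulate(n_sites, k_gain, k_loss, h0, steps):
    """Iterate the assumed gain/loss kinetics for a number of time steps."""
    h = float(h0)
    for _ in range(steps):
        gained = k_gain * (n_sites - h)  # cryptic motifs converted to hotspot motifs
        lost = k_loss * h                # hotspot motifs removed by conversion or mutation
        h += gained - lost
    return h


if __name__ == "__main__":
    # Hypothetical values, not estimates from the data discussed in the text.
    N_SITES, K_GAIN, K_LOSS = 60_000, 1e-3, 2e-3
    print("predicted equilibrium:", equilibrium_hotspots(N_SITES, K_GAIN, K_LOSS))
    for start in (0, N_SITES):
        print("start", start, "->", simulate(N_SITES, K_GAIN, K_LOSS, start, 10_000))
```

Under these assumptions the number of active motifs approaches the same value whether the simulation starts with none or with every site active, which is the sense in which opposing rates could maintain hotspot numbers without strong selection.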
Accounting for Hotspots That Move
Humans and chimpanzees share almost 99% DNA sequence identity (Chimpanzee Sequencing and Analysis Consortium 2005) but few hotspot positions (Ptak et al. 2005; Winckler et al. 2005). By inference, most hotspot positions of the latest common ancestor have been ablated (converted away) during species divergence (Coop and Myers 2007). Could the hotspot motif model (Figure 1) explain all of the newly positioned hotspots? Yes, if one assumes that the density of cryptic motifs in primate genomes is similar to that documented in fission yeast (Figure 4, Table S1). There are ∼3.5 × 10⁷ single-base-pair differences (mutations) between humans and chimpanzees (Chimpanzee Sequencing and Analysis Consortium 2005). If at least 0.17% of mutations change a cryptic motif into a hotspot motif, then together humans and chimpanzees would have at least 59,500 de novo hotspot motifs, relative to the latest common ancestor. At first approximation, these would be sufficient to account for all hotspots in each organism [∼25,000 (Myers et al. 2005)]. Thus our hotspot motif model (Figure 1) is robust enough to account for the repositioning of hotspots and the maintenance of hotspot numbers over evolutionary time scales, with or without additional factors such as shifts. We view this as a provisional conclusion, pending a more systematic and comprehensive identification of the nature and density of hotspot motifs within primate (and other) genomes.
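The arithmetic behind this estimate can be checked directly. The short Python sketch below uses only the three figures quoted in the paragraph above (∼3.5 × 10⁷ differences, a 0.17% fraction, and ∼25,000 hotspots per species); the function and constant names are ours and purely illustrative.

```python
SINGLE_BP_DIFFERENCES = 3.5e7    # human vs. chimpanzee differences quoted in the text
MOTIF_CREATING_FRACTION = 0.0017 # 0.17% of mutations assumed to create a hotspot motif
HOTSPOTS_PER_SPECIES = 25_000    # approximate hotspot count per species quoted in the text


def de_novo_motifs(differences, fraction):
    """Expected number of new hotspot motifs among the observed differences."""
    return differences * fraction


def required_fraction(differences, hotspots_needed):
    """Smallest motif-creating fraction that yields the needed number of hotspots."""
    return hotspots_needed / differences


if __name__ == "__main__":
    new_motifs = de_novo_motifs(SINGLE_BP_DIFFERENCES, MOTIF_CREATING_FRACTION)
    needed = 2 * HOTSPOTS_PER_SPECIES  # both species combined
    print(f"de novo motifs: {new_motifs:,.0f}")
    print(f"hotspots to account for: {needed:,}")
    print(f"break-even fraction: {required_fraction(SINGLE_BP_DIFFERENCES, needed):.4%}")
```

On these inputs, 59,500 new motifs exceed the roughly 50,000 hotspots of the two species combined, and the break-even fraction is about 0.14%. The comparison is only as good as the assumption that primate cryptic-motif density resembles that of fission yeast.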
Figure 4. Hotspots created by mutation could account for hotspots repositioned over evolutionary timescales. Gene conversion preferentially removes hotspots from the genome (Gutz 1971; Boulton et al. 1997; Jeffreys and Neumann 2009), and this driving force is thought to have ablated most of the hotspots that were present in the latest common ancestor of humans and chimpanzees (Coop and Myers 2007). A second driving force is base-pair mutations, which can generate and remove hotspots (Szankasi et al. 1988; Schuchert et al. 1991; Fox et al. 1997; Steiner et al. 2009, 2011). These (and potentially other) opposing forces must operate in a dynamic equilibrium (e.g., Figure 3) for hotspots to be maintained in the genome over evolutionary timescales. We suggest that most hotspots arise from mutations that create hotspot motifs and that this process could account for hotspots repositioned during species divergence. The calculations, which assume a density of regulatory motifs similar to that of fission yeast (asterisk), illustrate this point. Additional mechanisms, such as shifts in the DNA-binding site specificity of hotspot-activating proteins, could also relocate hotspots.
Conclusions
Occasionally, a change in one’s perspective yields a clear solution to a seemingly intractable problem. One such problem is explaining, mechanistically, the distribution and dynamics of meiotic recombination hotspots. The GENETICS articles by Walter Steiner et al. (2009, 2011) revealed that many different DNA sequence motifs of the same organism are recombinogenic, providing an important piece for the puzzle. Our realization that cryptic motifs are densely packed in the genome and can be changed easily into hotspot motifs provided a fresh perspective for the interpretation of existing data. The resulting model, which applies to all DNA sequence-dependent hotspots, describes a mechanism for the dynamic, evolutionarily rapid redistribution of hotspots in the genome. The model also shows how hotspot numbers could be maintained over time without strong selective pressures. It explains why hotspots localize preferentially to IGRs and introns. And it can explain existing data on hotspot repositioning during species divergence.
There are many interesting questions. For example, what regulates the context-variable penetrance of hotspot motifs? What mechanisms underlie sex-specific differences in the distribution of recombination? What are the relative contributions of mutation-conversion equilibria and shifts to hotspot dynamics? The answers to such questions probably lie within the constellation of recombinogenic DNA sequence motifs and the regulation of proteins that bind to those DNA sites. We are, as revealed by the insightful recent work of Steiner et al. (2009, 2011), looking at the “tip of the iceberg” for such regulation.
Acknowledgments
We thank Reine Protacio for the construction and plating of strains (Figure 2); Giulia Baldini, Jun Gao, Fengling Kan, and Reine Protacio for helpful discussions; Adam Wilkins and anonymous reviewers for constructive suggestions; and the National Institute of General Medical Sciences at the National Institutes of Health for research support (grant GM81766).
Footnotes
Supporting information is available online at http://www.genetics.org/lookup/suppl/doi:10.1534/genetics.111.134130/-/DC1.
1 Experimental data support the association data. Tandem copies of a hypervariable minisatellite (SAT) sequence, and its binding proteins, promote homologous recombination in cultured cells (Wahls et al. 1990, 1991; Wahls and Moore 1998). If one allows for a 1-bp gap, there is perfect identity between the SAT sequence (5′-CCACC–TGCCCACCTCT-3′) and the conserved positions within the COH/NAHR consensus motif (5′-CCNCCNTNNCCNC-3′). Given circular permutation of tandem repeats, additional alignments can be made.
2 We propose the term “context-variable penetrance” to describe the fact that individual, discrete DNA sequence motifs known to be recombinogenic exhibit variable levels of hotspot activity at different locations in the genome.
3 Ten different sequence-specific DNA-binding proteins are known or implicated to help activate hotspots. These are Atf1-Pcr1 heterodimer, Php2-Php3-Php5 complex, Bas1, Bas2, Rap1, Rst2, and Prdm9 (White et al. 1991, 1993; Wahls and Smith 1994; Kon et al. 1997; Steiner et al. 2009, 2011; Baudat et al. 2010; Myers et al. 2010; Parvanov et al. 2010). Prdm9 is the only one whose DNA-binding specificity is known to change rapidly over time. Moreover, Prdm9 is absent from many eukaryotes, including some vertebrate taxa, and it has lost some or all of its functions in other taxa due to mutations (Oliver et al. 2009).
Copyright © 2011 by the Genetics Society of America
Available freely online through the author-supported open access option.