Mounting evidence points to differences in gene regulation as a major source of phenotypic variation. MicroRNA-mediated post-transcriptional regulation has emerged recently as a key factor controlling gene activity during development. MicroRNA genes are abundant in genomes, acting as managers of gene expression by directing translational repression. Thus, understanding the role of microRNA sequence variation within populations is essential for fully dissecting the origin and maintenance of phenotypic diversity in nature. In this study, we investigate allelic variation at microRNA loci in the nematode Caenorhabditis briggsae, a close relative of C. elegans. Phylogeographic structure in C. briggsae partitions most strains from around the globe into a “temperate” or a “tropical” clade, with a few strains having divergent, geographically restricted genotypes. Remarkably, strains that follow this latitudinal dichotomy also differ in temperature-associated fitness. With this phylogeographic pattern in mind, we examined polymorphisms in 18 miRNAs in a global sample of C. briggsae isolates and tested whether newly isolated strains conform to this phylogeography. Surprisingly, nucleotide diversity is relatively high in this class of gene that generally experiences strong purifying selection. In particular, we find that miRNAs in C. briggsae are substantially more polymorphic than in Arabidopsis thaliana, despite similar background levels of neutral site diversity between the two species. We find that some mutations suggest functional divergence on the basis of requirements for target site recognition and computational prediction of the effects of the polymorphisms on RNA folding. These findings demonstrate the potential for miRNA polymorphisms to contribute to phenotypic variation within a species. Sequences were deposited in GenBank under accession nos. JN251323–JN251744.
GROWING empirical evidence points to altered gene regulation as a major cause of phenotypic variation (Wray 2007; Carroll 2008). For instance, in nematodes, morphological differences in the excretory duct cell between Caenorhabditis elegans and C. briggsae are due in part to the divergence of cis-regulatory sequences (Wang and Chamberlin 2002). Similarly, divergent regulatory sequences may lead to differential chemosensory receptor expression patterns in nematode neurons (Nokes et al. 2009). Moreover, the high level of transcription factor sequence variation both between Caenorhabditis species and segregating within populations highlights the potential significance of transcriptional regulation evolution in this genus (Haerty et al. 2008; Jovelin 2009; Jovelin et al. 2009).
However, gene expression regulation is not limited to transcriptional regulation. Changes in regulation can occur at many levels, including chromatin remodeling and accessibility, alternative splicing, transcription factor and RNA stability, and post-translational modifications (Alonso and Wilkins 2005). In addition, post-transcriptional gene regulation mediated by small RNAs has emerged recently as a key factor controlling gene activity (Bartel and Chen 2004). One class of small RNA genes, the microRNAs (miRNAs), plays critical roles in development by fine tuning gene expression and by buffering regulatory networks against noise and environmental perturbations (Wu et al. 2009; Herranz and Cohen 2010). MiRNAs are abundant small regulatory RNAs (∼22 bp) that bind to the 3′-UTR of mRNAs to direct translational repression or, less frequently, mRNA cleavage, when overall sequence complementarity is high (Ambros 2004; Bartel 2004; Plasterk 2006). First identified in the nematode C. elegans (Lee et al. 1993), miRNAs are widespread among animals, plants, fungi, and viruses, with 15,172 miRNA genes from 142 species now annotated in miRBase release 16 (Kozomara and Griffiths-Jones 2011). Primary miRNA transcripts (pri-miRNA) are cleaved in the nucleus by the Drosha RNase III endonuclease, liberating a miRNA precursor (pre-miRNA) that is exported to the cytoplasm by the export receptor exportin-5. Pre-miRNAs have a stereotypical stem-loop hairpin structure that is critical for processing by the enzyme Dicer, then generating an RNA heteroduplex comprising the mature miRNA (miR) and its complementary sequence (miR*). The mature miRNA is then loaded into the RNA-induced silencing complex (RISC) where it can bind to its target mRNAs by sequence complementarity of a 7-bp long motif, the seed sequence (Bartel 2004).
In the nematode C. briggsae, a close relative to C. elegans, nucleotide variation is strongly partitioned according to geographical origin, with the majority of strains collected from nature falling into one of the two “temperate” and “tropical” clades and a minority of strains exhibiting genetically divergent multilocus haplotypes that do not follow this latitudinal differentiation (Graustein et al. 2002; Cutter et al. 2006; Dolgin et al. 2008; Howe and Denver 2008; Cutter et al. 2010; Raboin et al. 2010). C. elegans and C. briggsae both are cosmopolitan species and found commonly on rotting vegetation, suggesting that, contrary to a long-standing paradigm, their natural habitat is not soil, but environments rich in bacteria, on which they feed (Felix and Braendle 2010). Temperature is an important ecological variable for many ectothermic organisms, including nematodes (see Anderson et al. 2007). In particular, fitness-associated responses to temperature vary across genetic backgrounds in C. briggsae, depending strongly on the geographical origin of the strains. Temperate-latitude isolates have greater fecundity at low temperatures and reduced fecundity at high temperatures than do isolates from the tropical phylogeographic clade, perhaps reflecting local adaptation to climatic temperature profiles (Prasad et al. 2010).
RNA folding and stability are temperature dependent, suggesting that selection could favor alleles that confer a stable secondary structure over a broad range of temperatures or, alternatively, alleles that form temperature-specific hairpins. For instance, in A. thaliana, two MIR824 alleles, one with a temperature-sensitive structure and one with a thermoresistant structure, are maintained at intermediate frequency by balancing selection (de Meaux et al. 2008). Thus, distinct climatic regimes experienced by different populations might impose divergent selective pressure on miRNA folding structures, providing either a target for preserved function of gene regulation or a means of facilitating local adaptation.
Motivated by the phylogeographic divergence and temperature-dependent phenotypic differences in C. briggsae, in this study, we investigate nucleotide variation at 18 miRNA loci in a global sample of C. briggsae isolates that covers the range of known diversity in this species. These miRNA genes represent 12% of the C. briggsae miRNAs (de Wit et al. 2009), for which we (i) quantify the level of segregating variation and investigate selective pressures acting in present-day populations, (ii) test for divergent alleles between tropical and temperate strains, and (iii) test for divergent miRNA forms in the strains derived from unique geographic locations. We also examine 19 newly collected strains from 13 locations from around the globe, relative to previously sampled strains. We find that despite purifying selection to maintain the miRNA hairpin secondary structure, some of the strains from unique geographic locations carry alleles with mutations indicative of functional divergence.
Materials and Methods
C. briggsae strains, PCR amplification, and sequencing
We examined nucleotide diversity at miRNA loci for 25 to 48 strains, including 19 new isolates from 13 locations (Table 1). The full list of strains used to examine miRNA nucleotide variation is given in Supporting Information, Table S1. In addition, we investigated the relationships of 17 of these newly collected strains (Table 1 and Figure S1) relative to a worldwide sample of 117 strains that were used in previous studies on C. briggsae diversity, based on a standard set of six loci (Cutter et al. 2006, 2010; Dolgin et al. 2008). Newly isolated nematode strains were gifts from Marie-Anne Felix (“JU” strain prefix), Matthew Rockman (QG), Christian Braendle (NIC), Guo-Xiu Wang (VX), and Erik Andersen and Leonid Kruglyak (QX) (Table 1). The new strains were collected mostly from rotting fruits (Table 1), identified through mating tests, and founded from a single hermaphrodite individual to establish selfing strains. All strains were maintained using standard C. elegans protocols (Brenner 1974).
Gene fragments of ∼700 bp, containing mostly intronic sequence and corresponding to the six loci used in previous studies of C. briggsae diversity, were amplified to examine the relationship of the newly collected strains to the global distribution of C. briggsae. PCR and sequencing of these six loci used published primers (Cutter et al. 2006). A total of 18 miRNAs corresponding to 12% of the C. briggsae miRNA complement (de Wit et al. 2009) also were amplified and sequenced, along with flanking sequence. MiRNA-specific primers were designed from the C. briggsae genomic sequence (Table S2) (Stein et al. 2003). For each strain, DNA was isolated from large populations of worms using the DNeasy Blood and Tissue kit (Qiagen). Amplifications were processed in 30-μl reaction volumes with 1.5 μl DMSO, 3 μl dNTPs (6.6 mM), 3 μl 10X Buffer (Fermentas), 2.4 μl MgCl2, 0.36 μl of each primer (50 μM), 0.18 μl of Taq polymerase (Fermentas), and 2 μl of genomic DNA. Cycling conditions were: 95° for 4 min followed by 35 cycles of 95° for 1 min, 55° or 58° for 1 min, and 72° for 1 min. Amplification products were sequenced at the University of Arizona Genetics Core sequencing facility. Primers were designed such that forward and reverse sequences strongly overlap, resulting in all loci being sequenced on both strands. All polymorphisms were visually verified in sequence chromatograms. In addition, we independently reamplified and resequenced all 18 miRNAs in a subset of four strains (NIC19, JU1341, JU1348, and BW287) to confirm that PCR or sequencing errors were not the cause of observed polymorphisms. This redundant sequencing confirmed all observed allelic variation for all miRNAs within these four strains, indicating that sequencing errors do not introduce spurious miRNA alleles into our dataset. Primer sequences were manually deleted from each locus prior to analysis. Sequences were deposited in GenBank under accession nos. JN251323–JN251744.
Orthologs of the six protein-coding genes and the 18 miRNAs were identified in the Caenorhabditis sp. 9 genome assembly using the BLASTN program (http://genome.wustl.edu/tools/blast) (Altschul et al. 1990). Each miRNA with ∼600 bp of flanking sequence was used in the BLAST search to ensure identification of the C. sp. 9 ortholog and to avoid retrieving nonorthologous miRNAs with similar sequence. C. sp. 9 was used as an outgroup because C. sp. 9 and C. briggsae are closely related and do not exhibit saturation at synonymous sites (Cutter et al. 2010; Woodruff et al. 2010; Jovelin and Phillips 2011).
We constructed multiple-sequence alignments by eye for each gene using BioEdit (Hall 1999). Sequences from the six loci used in studies of C. briggsae diversity were concatenated and combined with published sequences (Cutter et al. 2006, 2010; Dolgin et al. 2008). We then inferred relationships among C. briggsae strains from the concatenated sequences using unrooted neighbor networks generated with a Jukes–Cantor distance in SplitsTree 4.10 (Huson and Bryant 2006). Neighbor networks are particularly suited for analyzing relationships among taxa for which reticulated evolution such as recombination events may be nonnegligible (Huson and Scornavacca 2011).
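For readers unfamiliar with the Jukes–Cantor correction, the following minimal Python sketch shows how a pairwise distance of this kind can be computed from two aligned sequences. It is an illustration only and not the SplitsTree implementation; the function names and the choice to skip gapped or ambiguous sites are assumptions made here.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites among positions where both
    sequences carry an unambiguous base (gaps and Ns are skipped;
    this gap handling is an assumption of the sketch)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def jukes_cantor(seq_a, seq_b):
    """Jukes-Cantor distance: d = -(3/4) * ln(1 - 4p/3)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        raise ValueError("distance undefined: sequences are saturated")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

The correction inflates the raw proportion of differences to account for multiple substitutions at the same site, which matters little for closely related strains but more for the interspecies comparison.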
We combined our polymorphism data from the six nuclear loci with published data from 23 other loci (Cutter and Choi 2010) to compare nucleotide variation at miRNA and non-miRNA loci. Nucleotide diversity (π) (Nei 1987) was measured for each locus using DnaSP 5.10 (Librado and Rozas 2009). We determined the ancestral/derived state of each polymorphism found in the C. briggsae miRNAs by comparing to the C. sp. 9 miRNA orthologs and examined the frequency of the derived polymorphism in the global sample of C. briggsae alleles.
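As an illustration of the diversity statistic, π can be thought of as the average number of pairwise differences per site among sampled alleles. The sketch below computes it for a gap-free alignment; it is not the DnaSP procedure, which handles gaps and missing data in its own way, and the function name is hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(alignment):
    """Average pairwise differences per site (pi) for a list of
    equal-length, gap-free sequences."""
    n = len(alignment)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(alignment[0])
    if length == 0 or any(len(s) != length for s in alignment):
        raise ValueError("sequences must be non-empty and equal in length")
    total_diffs = 0
    for a, b in combinations(alignment, 2):
        total_diffs += sum(1 for x, y in zip(a, b) if x != y)
    n_pairs = n * (n - 1) / 2
    return total_diffs / n_pairs / length
```

For example, four 4-bp alleles in which two carry a variant at one site give 4 differing pairs out of 6, so π = (4/6)/4 ≈ 0.167 per site.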
We then examined the structural context of each miRNA polymorphism, first by using the stem-loop structure of the AF16 reference strain allele (de Wit et al. 2009). We also investigated how temperature might affect miRNA folding patterns by examining the minimum free energy (MFE) structure of each miRNA allele at different temperatures, using the Vienna RNA server (Gruber et al. 2008). We examined the effect of polymorphism on miRNA folding at two extreme temperatures, 14° and 30°, in addition to an intermediate temperature of 20°. The effect of temperature on self-fecundity is heritable in C. briggsae and discriminates strains on the basis of their geographical origin. Lifetime fecundity is higher for temperate strains at low temperatures and higher for tropical strains at higher temperature (Prasad et al. 2010), prompting us to hypothesize that temperature-dependent RNA folding by miRNAs might correlate with phenotypic differentiation.
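One simple way to quantify how much two predicted structures differ is to compare the sets of base pairs encoded in their dot-bracket strings, the notation that RNA folding programs commonly output. The sketch below is an assumption-laden illustration rather than the procedure used in this study: the function names are hypothetical, the dot-bracket strings are expected to come from a folding program run separately at each temperature, and the comparison applies only to structures of equal length (alleles differing by indels would first need a common coordinate system).

```python
def base_pairs(structure):
    """Return the set of (i, j) paired positions in a dot-bracket string."""
    stack = []
    pairs = set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.add((stack.pop(), i))
        elif ch != ".":
            raise ValueError("unexpected character: %r" % ch)
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return pairs


def base_pair_distance(struct_a, struct_b):
    """Number of base pairs present in one structure but not the other."""
    if len(struct_a) != len(struct_b):
        raise ValueError("structures must have the same length")
    return len(base_pairs(struct_a) ^ base_pairs(struct_b))
```

A distance of 0 means the two folds are identical; small values correspond to added bulges or slightly longer loops, and large values to a rearranged hairpin.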
Finally, we investigated selective constraints on miRNAs by examining single nucleotide polymorphism (SNP) density and nucleotide divergence on the full pre-miRNAs, mature sequences (miRs), and flanking sequence. Flanking sequences include non-miRNA sequence upstream and downstream of miRNAs as well as intergenic sequence for miRNAs found in clusters. We included indels in the SNP density calculation, with indels larger than 1 bp being treated as a single polymorphism, and applied a correction for the difference in sample size among the miRNAs investigated, analogous to Watterson’s sample size correction (Watterson 1975). The SNP density per base pair is the ratio of the number of SNPs to the total length of sequence examined, divided by log[n − 1], where n is the sample size. Divergence was measured between C. briggsae AF16 and C. sp. 9 JU1422, limiting the analysis to the orthologs of the sequences used in the computation of the SNP density. Interspecies divergence was measured with a Jukes–Cantor distance in MEGA 5, excluding indels (Kumar et al. 2008).
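The SNP density described above can be written out as a short calculation. The sketch below assumes the natural logarithm (the text says only "log") and takes the polymorphism count as an input in which each indel has already been counted once; the input values in the usage note are hypothetical. Watterson's harmonic-sum factor is included only for comparison.

```python
import math


def snp_density(n_polymorphisms, length_bp, n_strains):
    """SNP density per bp: (polymorphisms / length) / log(n - 1).
    Natural log is assumed here."""
    if n_strains < 3:
        raise ValueError("need n >= 3 so that log(n - 1) is positive")
    if length_bp <= 0:
        raise ValueError("sequence length must be positive")
    return n_polymorphisms / length_bp / math.log(n_strains - 1)


def watterson_a(n_strains):
    """Watterson's correction factor: sum of 1/i for i = 1 .. n - 1."""
    return sum(1.0 / i for i in range(1, n_strains))
```

With hypothetical inputs of 5 polymorphisms in 1000 bp sampled in 25 strains, `snp_density(5, 1000, 25)` divides 0.005 by ln(24); dividing the same ratio by `watterson_a(25)` instead would give a somewhat smaller value because the harmonic sum exceeds ln(n − 1).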
C. briggsae diversity
C. briggsae is distributed all over the world, and previous analyses of nucleotide diversity showed that most strains cluster in one of two main phylogeographic groups, with a few strains from several locations having divergent haplotypes with little genetic affinity to either clade (Cutter et al. 2006, 2010; Dolgin et al. 2008; Howe and Denver 2008; Raboin et al. 2010). We first sought to test whether 17 newly collected strains from 12 locations conform to this established pattern of phylogeography, by comparing them to a worldwide dataset of 117 C. briggsae isolates (Table 1 and Figure S1).
Among 17 new strains, we identified 14 distinct haplotypes, of which 9 were not described previously (Figure S1). Most of the newly collected strains fall into either the temperate or the tropical clades, with 2 notable exceptions (Figure 1). Nearly all strains from the tropical latitudes form a phylogeographic group that also includes a newly identified strain (VX0033) from a more northerly location in mainland China (30.61°N), suggesting that the geographic range of this clade extends farther than previously thought, beyond the formal definition of tropics (23°N or S latitude). Remarkably, none of the new isolates from tropical locations fall in the temperate clade, supporting the idea that most C. briggsae strains are separated by latitude. We also identified 2 genetically distinct strains from Taiwan (NIC19) and Guangshui, China (VX0034) that fall outside of the latitudinal dichotomy and that show little evidence of recombination with the remaining isolates (Figure 1), consistent with additional sampling continuing to discover rare, novel, geographically restricted genetic backgrounds in C. briggsae (Cutter et al. 2010). These results are consistent with most C. briggsae strains being genetically differentiated on the basis of their latitudinal geographical origin, with pockets of divergent genotypes in both high- and low-latitude parts of the world that evolve essentially independently from the bulk of C. briggsae genotypes.
Patterns of miRNA sequence evolution in C. briggsae
With our understanding of the phylogeographic relationships among C. briggsae strains in mind, we then investigated nucleotide variation at miRNA loci for a subset of them from across the world (Table S1). We sequenced 25–48 strains for each of 18 miRNAs, representing 12% of the miRNA complement in the C. briggsae genome (de Wit et al. 2009). These miRNAs are located on five chromosomes (none on chromosome V; Table 2) and 12 of them occur in three clusters (on chromosomes II, III, and X).
Overall, we discovered a total of 45 polymorphisms in miRNA genes, including 38 single base-pair mutations and seven insertion/deletions of ≥1 bp, the majority of which are located in the stem region of the hairpin structure (Figure 2). A fifth of the polymorphisms (9/45) are derived alleles that segregate at high frequency in C. briggsae, with one derived high-frequency allele located in the mature sequence of Cbr-mir-2224 (Figure 3).
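Polarizing a polymorphism with an outgroup, as done here with C. sp. 9, amounts to treating the allele shared with the outgroup as ancestral and the other as derived. The following sketch illustrates that logic for a single biallelic site; the function name is hypothetical and sites that cannot be polarized (outgroup base absent from the sample, or more than two alleles) simply return `None`.

```python
from collections import Counter


def derived_allele_frequency(ingroup_bases, outgroup_base):
    """Frequency of the derived allele at one biallelic site.
    ingroup_bases: one base per sampled strain at this site.
    outgroup_base: the base in the outgroup ortholog (taken as ancestral).
    Returns None when the site cannot be polarized."""
    valid = {"A", "C", "G", "T"}
    counts = Counter(b for b in ingroup_bases if b in valid)
    total = sum(counts.values())
    if total == 0 or len(counts) != 2 or outgroup_base not in counts:
        return None
    return (total - counts[outgroup_base]) / total
```

A site at which 8 of 9 sampled strains differ from the outgroup base would be scored as a derived allele at frequency 8/9, the kind of high-frequency derived variant referred to above.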
The distribution of substitutions between orthologs of C. briggsae and C. sp. 9 shows the following pattern of conservation across the miRNA regions (number of substitutions shown in parentheses): miR* (5) < miR (14) < loop (29) < secondary loop (39) < stem (60). This pattern of constraint among regions of miRNAs is fully consistent with the polymorphism data as well (Figure 2). Surprisingly, the complementary mature sequences miR* have fewer substitutions than the functional mature sequences. Nevertheless, the complementary sequence miR* is degraded soon after the mature miR sequence loads in the RISC and is often present in the cell at levels too low for detection (Bartel 2004). For instance, miR* was cloned for only a third of the miRNAs analyzed here (de Wit et al. 2009). Comparison of the mean divergence between the miR and miR* sequences suggests that long-term selection pressure is higher on the mature miR sequence (K = 0.032 ± 0.017) than on the miR* sequence (K = 0.042 ± 0.025), similar to the SNP pattern observed across miRNA loci in Arabidopsis and humans (Saunders et al. 2007; Ehrenreich and Purugganan 2008). Clearly, the high conservation of the seed sequence reflects strong purifying selection associated with its requirement for target recognition.
The level of nucleotide diversity at pre-miRNA loci ranges from 0 to 9.36 × 10⁻³ per site (Table 2). Three miRNAs, Cbr-mir-2222-1, Cbr-mir-2225, and Cbr-mir-2226, contained no variants even with ≥30 strains sampled, suggesting that purifying selection may be strong across the entire miRNA sequence. Lending further support to this idea, nucleotide diversity at miRNA loci (n = 18, mean π = 2.902 × 10⁻³) is significantly lower than the background nucleotide diversity in C. briggsae (n = 29, mean π = 5.531 × 10⁻³) (Wilcoxon two-sample P = 0.019). However, nucleotide diversity was sampled in a different set of strains for the miRNAs and non-miRNAs (this work and Cutter and Choi 2010). To further investigate selective constraints acting on C. briggsae miRNA loci, we compared interspecific and intraspecific nucleotide variation in miRNAs and in mature miRs with their flanking sequence. Overall, interspecies nucleotide divergence is two times higher in flanking sequence than in the pre-miRNA sequences and four times higher than in the mature sequences, consistent with purifying selection generally eliminating variants in miRNA loci, with strongest selective constraints on the mature sequence involved in post-transcriptional regulation (Figure 4).
Despite strong purifying selection operating on mature miRNAs, we discovered that five mature miR sequences nevertheless harbor polymorphisms. Three of the variants are transversion SNPs from an adenine and one is a thymine-to-cytosine transition (Figure 3). Interestingly, one of the polymorphisms is a 10-bp-long deletion in Cbr-mir-64b that overlaps five nucleotides of the seed sequence in the divergent strain NIC19 from Taiwan (Figure 5). Exact matching of the seed sequence with the target messenger RNA is required for proper miRNA regulation (Bartel 2004), although some classes of miRNA binding sites also require pairing in the 3′ end or in the center of the mature sequence (Brennecke et al. 2005; Shin et al. 2010). Consequently, the deletion of ∼70% of the seed sequence strongly suggests that the NIC19 allele of Cbr-mir-64b is not functional. Additionally, Cbr-mir-64b harbors the most segregating polymorphisms within C. briggsae of any of the miRNA loci we examined, implying that it might be subject to less stringent purifying selection than other C. briggsae miRNAs. Among the 18 miRNAs analyzed here, Cbr-mir-64b also is the only miRNA with a substitution in the seed sequence between C. briggsae and C. sp. 9 orthologs (Figure 5), suggesting divergent mRNA target sets between the two species (Table 2).
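The "∼70%" figure follows directly from the overlap between the deletion and the 7-nt seed (5/7 ≈ 0.71). A minimal sketch of that interval arithmetic is given below; the function name and the coordinates in the usage note are hypothetical and are not the actual positions in Cbr-mir-64b.

```python
def fraction_removed(feature, deletion):
    """Fraction of a feature covered by a deletion.
    Both arguments are half-open (start, end) intervals on the same
    coordinate system."""
    f_start, f_end = feature
    d_start, d_end = deletion
    if f_end <= f_start or d_end <= d_start:
        raise ValueError("intervals must have positive length")
    overlap = max(0, min(f_end, d_end) - max(f_start, d_start))
    return overlap / (f_end - f_start)
```

With an illustrative seed at positions (1, 8) and a 10-bp deletion at (3, 13), `fraction_removed((1, 8), (3, 13))` evaluates to 5/7, matching the proportion of the seed reported as deleted in the NIC19 allele.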
The strains that have been isolated from a few locations also contribute to the higher level of polymorphism observed at Cbr-mir-64b. Among the seven polymorphisms found in Cbr-mir-64b, four are specific to the divergent strains from Kerala, Taipei, and Nairobi. This is a general pattern across miRNA loci: 47 and 60% of the polymorphisms found in miRNAs and miRs, respectively, are specific to the divergent strains from Kerala, Nairobi, Montreal, Taipei, and Guangshui, indicating relatively high divergence of miRNA sequences for these strains that have distinctive genetic backgrounds.
Effects of temperature and polymorphisms on miRNA folding structures
The patterns of nucleotide polymorphism and divergence we documented for miRNAs clearly implicate purifying selection to maintain the integrity of the hairpin structure (Figure 4). We investigated further whether the polymorphisms found within C. briggsae might nevertheless identify obvious candidates for functional differentiation as a consequence of how the polymorphisms affect RNA folding. Most polymorphisms have little effect on the hairpin structure (Table 3). However, several alleles from the geographically restricted C. briggsae strains carry mutations that alter the secondary structure and reduce its thermodynamic stability (Figure 6). The 10-bp deletion in NIC19’s allele of Cbr-mir-64b not only removes most of the seed sequence but also results in a clover leaf-like secondary structure that is very different from the typical miRNA hairpin structure. Similarly, a 9-bp insertion in JU1348’s allele of Cbr-mir-64 modifies the shape of the hairpin, resulting in a complex secondary structure with a bifurcating stem. Several point mutations in the JU1341 allele of Cbr-mir-2224 also lead to the formation of an additional stem (Figure 6). These results indicate that the strains showing divergent haplotypes based on neutral markers (Figure 1) also harbor genetic differences in miRNAs that are indicative of functional differentiation from the other C. briggsae isolates.
Although we detected allelic variants both in the temperate and the tropical strains, no polymorphism was found exclusively in all strains within one of these two phylogeographic clades. Most of the alleles segregate either at high or at low frequency (Figure 3), and the two intermediate-frequency variants do not differentiate the tropical and temperate groups of strains. Thus, the hypothesis that selection could have favored distinct alleles in the temperate and tropical clades does not hold for this collection of 18 miRNAs.
We next investigated how temperature might affect RNA folding by comparing the MFE structure of each allele to the AF16 reference strain allele’s MFE structure at three different temperatures. First, we noted that temperatures from 14° to 30° do not change the predicted hairpin structure, or alter it only slightly, for the majority of the miRNAs (Table 3). However, there was a strong effect on the predicted AF16 fold for three miRNAs (not shown): temperatures below 30° result in complex secondary structures for the AF16 allele of Cbr-mir-64b, Cbr-mir-54d, and Cbr-mir-41b, whereas the stereotypical hairpin structure is restored at 30°. Second, for each temperature that we explored, nearly half of the variants (45%) do not influence the miRNA secondary structure (Table 3). Nearly all of the variants that do alter the miRNA secondary structure have little impact on overall hairpin formation, and either create bulges or extend the length of the secondary and primary loops. Two polymorphisms do strongly change the hairpin structure at 14°, however: a G→A transition at position +78 in Cbr-mir-64 for strain EG5612 and a 6-bp deletion at position −54 in Cbr-mir-54d for strain JU1341, but these effects disappear when temperature is increased to 20° and 30°, respectively. Two other mutations in strain JU1341 have temperature-dependent effects (Table 3). There is a T→C transition that increases the length of a secondary loop in Cbr-mir-64b at 30° and the effect of a C→T transition disappears in Cbr-mir-2224 at 30°, although other mutations in this miRNA do alter the hairpin structure (Figure 6). These results suggest that C. briggsae miRNAs generally are tolerant to variation in temperature even in the face of sequence polymorphism, which may explain the observed lack of correlation between miRNA genotype and geographical origin.
Mutations generated in the laboratory during forward genetic screens or using reverse genetics are powerful means to investigate gene function and have proven to be particularly effective in understanding developmental mechanisms and their evolution (Brenner 1974; Nüsslein-Volhard and Wieschaus 1980; Nüsslein-Volhard et al. 1984; Jansen et al. 1997). However, it is unclear how such mutations, often with strong deleterious effects, may contribute to the long-term evolution of the trait they affect. Investigation of allelic variation in natural populations is a valuable complement to understanding the evolution of particular gene function (Kammenga et al. 2008; Jovelin 2009). We applied this philosophy to the nematode C. briggsae by examining nucleotide polymorphism and divergence at miRNA loci, a class of regulatory molecules with important roles in development. The patterns of polymorphism and divergence clearly demonstrate that purifying selection is the dominant mode of natural selection acting on miRNA genes (Figure 2 and Figure 4), similar to the pattern observed in humans and Arabidopsis (Chen and Rajewsky 2006; Saunders et al. 2007; Ehrenreich and Purugganan 2008; Quach et al. 2009). In particular, the mature sequence of miRNAs is subject to strong purifying selection, consistent with its important role in gene regulation. Despite strong constraint overall, we identified notable examples of functionally relevant polymorphisms in C. briggsae miRNAs.
SNPs altering miRNA structure
Although most polymorphisms have little effect on miRNA folding (Table 3), we identified some mutations that likely cause complex and atypical secondary structures (Figure 6), suggesting that these polymorphisms may have functional effects. At least one such allele, in Cbr-mir-64b, is clearly not functional due to a 10-bp deletion that removes most of the seed sequence that is required for interaction with target mRNAs (Figure 5 and Figure 6). Interestingly, the common allele of this miRNA also harbors a substitution in the seed sequence between C. briggsae and C. sp. 9 (Figure 5), suggesting that Cbr-mir-64b orthologs in these two species regulate distinct sets of target genes. Because Cbr-mir-64b is the only miRNA with seed sequence GGACCGC in the C. briggsae genome (de Wit et al. 2009) its function cannot be rescued by same-family miRNAs (Alvarez-Saavedra and Horvitz 2010).
One possible explanation for the pattern of nucleotide variation at Cbr-mir-64b is its recent origin in the lineage leading to C. briggsae and C. sp. 9. Orthologs of Cbr-mir-64b were not found in other nematode species (de Wit et al. 2009) and there is evidence that young miRNAs evolve faster than more phylogenetically conserved miRNAs (Lu et al. 2008a; Fahlgren et al. 2010; Nozawa et al. 2010). It is conceivable that young miRNAs have limited functional scope and contribute little to organismal fitness and, consequently, are more subject to genetic drift until they either become inactive or are coopted into regulatory networks. Lending support for this view, most newly arisen miRNAs become inactive and disappear during evolution (Lu et al. 2008b; Nozawa et al. 2010). Alternatively, gene loss may be an important mechanism of phenotypic diversification (Aboobaker and Blaxter 2003). In contrast to Cbr-mir-64b, the miRNA Cbr-mir-64 is not a young miRNA and orthologs are present in other nematode species (de Wit et al. 2009). However, its seed sequence is conserved in two other miRNAs found together in the same cluster on chromosome III (de Wit et al. 2009), suggesting that if the 9-bp insertion found in strain JU1348 is deleterious (Figure 6), then the function of Cbr-mir-64 could potentially be rescued by other members of the mir-64 family (Alvarez-Saavedra and Horvitz 2010).
Phylogeography of miRNA polymorphism in C. briggsae
The background level of nucleotide diversity at silent sites in C. briggsae is similar to that observed in A. thaliana (Nordborg et al. 2005; Cutter et al. 2010). And yet, we found that 28% (5/18) of the C. briggsae miRNAs surveyed contained a polymorphism in the mature sequence compared to only 3% (2/66) of miRNAs in a global sample of A. thaliana accessions (Ehrenreich and Purugganan 2008). Notably, plant miRNAs typically require perfect or near-perfect complementarity with their target sites, whereas animal miRNAs preferentially bind to their targets through complementarity of the seed sequence (Axtell et al. 2011). The difference between A. thaliana and C. briggsae in polymorphism levels within mature miRNA sequences could be due in part to differences in pairing requirements with their target sites, with stronger purifying selection on the overall mature sequence in plant miRNAs.
Strikingly, pre-miRNAs in C. briggsae harbor twice as much nucleotide diversity as pre-miRNAs in A. thaliana (Ehrenreich and Purugganan 2008), despite the fact that background levels of polymorphism at neutral sites are very similar between the two species. This surprisingly higher incidence of polymorphisms among the miRNAs of C. briggsae suggests that miRNA-mediated gene expression variation could be particularly important in C. briggsae, perhaps facilitated by the high differentiation of the geographically restricted isolates. Remarkably, all mutations having strong effects on miRNA folding (Figure 6), as well as half of the total polymorphisms, are found in the few divergent C. briggsae strains that have been isolated from only a few locations. Although mutations in the seed sequence are a clear indication of functional divergence, nucleotide variation at other positions in the mature sequence and in the miRNA precursor also can have consequences on miRNA function (Duan et al. 2007; de Meaux et al. 2008; Jazdzewski et al. 2008, 2009), and some miRNA binding sites require base pairing beyond the seed sequence (Brennecke et al. 2005; Shin et al. 2010). Thus, the polymorphisms in the geographically restricted strains suggest that distinct miRNA-mediated gene regulation might be operating and raise the possibility that these strains differ in some aspects of their biology relative to the vast majority of C. briggsae strains. Close examination of phenotypic differences in the newly identified divergent genotypes might provide insights into the microevolution of developmental processes and/or life-history traits within this species (Figure 1 and Cutter et al. 2010).
Most strains of C. briggsae from around the world show a distinct haplotype structure based on geographical origin that correlates with temperature-dependent effects on fitness (Figure 1, Figure S1, and Cutter et al. 2006, 2010; Dolgin et al. 2008; Howe and Denver 2008; Prasad et al. 2010; Raboin et al. 2010). Because RNA folding also is temperature dependent, we hypothesized that miRNAs with fixed nucleotide differences between the temperate and tropical phylogeographic groups might confer distinct miRNA secondary structures (de Meaux et al. 2008). Surprisingly, none of the polymorphisms for the 18 miRNA loci that we identified were specific to either of these two phylogeographic groups, as most alleles segregate at high or low frequency (Figure 3). Thus, the pattern of nucleotide variation in miRNA loci contrasts sharply with the pattern of differentiation at putatively neutral loci that have been investigated to date. One reason for this disparity may be that the strong selective pressure on miRNA sequence keeps most new mutations at low frequency, preventing fixation in any one clade (Figure 4). Nevertheless, we found that most variants did not change the secondary structure when we computationally manipulated temperature (Table 3), suggesting that there may be little or no selective pressure to maintain alternative variants in each phylogeographic group. This contrasts with the situation of temperature-sensitive balancing selection for MIR824 in A. thaliana (de Meaux et al. 2008). Although we did not detect a correlation between miRNA genotype and strain latitudinal origin, the remaining miRNAs and other small RNAs in the C. briggsae genome merit investigation for their potential to mediate local adaptation through small RNA divergence.
In conclusion, we document strong overall purifying selection on miRNAs within contemporary populations of C. briggsae. Nevertheless, our analysis of polymorphism in 12% of C. briggsae’s miRNAs uncovered substantial nucleotide variation in mature sequences. Our results motivate further investigation of miRNA sequence variation in C. briggsae and identify strong candidates for experimental work to dissect the functional consequences of allelic variation on gene expression regulation among divergent strains of C. briggsae.
We thank Erik Andersen, Christian Braendle, Marie-Anne Felix, Leonid Kruglyak, Matt Rockman, and Guo-Xiu Wang for kindly providing C. briggsae strains. Some of the C. sp. 9 sequence data were produced by The Genome Center at Washington University School of Medicine in St. Louis. R.J. is supported by a postdoctoral fellowship from the Ontario Ministry of Research and Innovation and A.D.C. is supported by the Natural Sciences and Engineering Research Council of Canada and a Canada Research Chair in Evolutionary Genomics.
Supporting information is available online at http://www.genetics.org/content/suppl/2011/09/02/genetics.111.132795.DC1.
- Received July 14, 2011.
- Accepted August 18, 2011.
- Copyright © 2011 by the Genetics Society of America