Originally published as Genetics Published Articles Ahead of Print on July 14, 2005.
Genetics, Vol. 171, 639-653, October 2005, Copyright © 2005
doi:10.1534/genetics.104.038851
Multiple Signatures of Positive Selection Downstream of Notch on the X Chromosome in Drosophila melanogaster
Vanessa Bauer DuMont and Charles F. Aquadro1
Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853
1 Corresponding author: Department of Molecular Biology and Genetics, 235 Biotechnology Bldg., Cornell University, Ithaca, NY 14853.
E-mail: cfa1{at}cornell.edu
To identify genomic regions affected by the rapid fixation of beneficial mutations (selective sweeps), we performed a scan of microsatellite variability across the Notch locus region of Drosophila melanogaster. Nine microsatellites spanning 60 kb of the X chromosome were surveyed for variation in one African and three non-African populations of this species. The microsatellites identified an ~14-kb window for which we observed relatively low levels of variability and/or a skew in the frequency spectrum toward rare alleles, patterns predicted at regions linked to a selective sweep. DNA sequence polymorphism data were subsequently collected within this 14-kb region for three of the D. melanogaster populations. The sequence data strongly support the initial microsatellite findings; in the non-African populations there is evidence of a recent selective sweep downstream of the Notch locus near or within the open reading frames CG18508 and Fcp3C. In addition, we observe a significant McDonald-Kreitman test result suggesting too many amino acid fixations species wide, presumably due to positive selection, at the unannotated open reading frame CG18508. Thus, we observe within this small genomic region evidence for both recent (skew toward rare alleles in non-African populations) and recurring (amino acid evolution at CG18508) episodes of positive selection.
THE observed positive correlation between nucleotide diversity and the rate of crossing over at a locus in Drosophila melanogaster (BEGUN and AQUADRO 1992) raised the possibility that within this species, the fixation of advantageous mutations has been frequent. The role of mutation in shaping the correlation has been rejected for Drosophila (no relationship between divergence and recombination; BEGUN and AQUADRO 1992). However, the relative roles of positive selection vs. the continuous removal of deleterious mutations ("background selection"; CHARLESWORTH et al. 1993; HUDSON and KAPLAN 1995) are still debated (see ANDOLFATTO 2001a and AQUADRO et al. 2001). Attempts to distinguish between these forces have led to results that are mixed or difficult to interpret (e.g., AQUADRO et al. 1994; BEGUN and WHITLEY 2000a; LANGLEY et al. 2000; ANDOLFATTO 2001a,b; NACHMAN 2001; KAUER et al. 2002; SCHÖFL and SCHLÖTTERER 2004).
While this debate continues, a growing number of studies show that positive selection has affected molecular evolution at specific genomic regions within D. melanogaster. These studies have largely been conducted at loci for which there were a priori expectations that advantageous fixations had occurred. Recent examples of such loci in Drosophila are Pgm (VERRELLI and EANES 2000), Relish (BEGUN and WHITLEY 2000b), mth (SCHMIDT et al. 2000; DUVERNELL et al. 2003), desat2 (TAKAHASHI et al. 2001), and immunity-related proteins (SCHLENKE and BEGUN 2003).
These results have motivated scans of the genome for patterns and levels of variability at multiple microsatellites and/or short segments of nucleotide sequence that might indicate recent positive selection (see review by SCHLÖTTERER 2002a). The premise is that markers closest to the target of a sweep would show a reduction of variability and/or a nonequilibrium allele frequency distribution (depending on the strength of selection, time since the sweep, and regional rates of recombination; e.g., KAPLAN et al. 1989). Genome scans have been performed in two ways. Some studies are based on coarse large-scale chromosomal or genome-wide scans of variation (e.g., SCHLÖTTERER 2002b; PAYSEUR et al. 2002; GLINKA et al. 2003; KAUER et al. 2003; KAYSER et al. 2003; ORENGO and AGUADÉ 2004; STORZ et al. 2004). Others are based on genotyping markers in specific genomic regions. The primary goal of these studies is to localize, on a much finer scale, the specific targets of positive selection (KOHN et al. 2000; NURMINSKY et al. 2001; HARR et al. 2002; SCHLENKE and BEGUN 2004). Studies of this sort should shed light on the types of genetic changes that have resulted in fitness differences within and between species. Here, we perform a fine-scale scan of microsatellite variability centered at the Notch locus region of the X chromosome in D. melanogaster.
Notch was originally chosen for analysis because it was assumed to be evolving neutrally. However, we previously reported a significant Hudson-Kreitman-Aguadé (HUDSON et al. 1987) test result within exon 6 of the Notch locus (a region we previously called Notch 3'; BAUER DUMONT et al. 2004). The deviation suggested either too little variability or too much divergence at this region of Notch. Synonymous site divergence is nearly three times higher than the average level observed between D. melanogaster and D. simulans at this region. In addition, further investigation revealed that positive selection has accelerated the fixation of some of these synonymous mutations (i.e., preferred and unpreferred along the D. simulans and D. melanogaster lineages, respectively; BAUER DUMONT et al. 2004). Yet, aside from the accelerated divergence, there remained the observation of low levels of variability at Notch 3', especially when considering the moderately high recombination rate at Notch. Thus, we remained curious whether additional data could detect other patterns indicative of a recent selective sweep at the Notch region. To address this question, we performed a microsatellite genome scan within this region in D. melanogaster. We surveyed variation at nine microsatellites spanning a 60-kb region encompassing the Notch locus in four population samples of this species (Figure 1). The microsatellite data suggest that at least one selective sweep has occurred within an ~14-kb window. Additional nucleotide polymorphism data were collected for a 10,558-bp segment within this putatively swept region in three populations of D. melanogaster (Figure 1). These data allow us to reject neutrality in favor of a model incorporating a recent selectively driven fixation downstream of Notch in at least some non-African populations. We also detect ongoing fixations of amino acid mutations due to positive selection at the open reading frame CG18508.
Samples:
Four population samples of D. melanogaster were surveyed for microsatellite variability. These include Zimbabwe (Sengwa Wildlife Research Institute), the United States (Arvin and Soda Lake, CA), Ecuador (Atacame), and China (Beijing). The Zimbabwe, United States, and China populations were also surveyed for nucleotide sequence variability across a 10,558-bp region including and extending downstream from the region we previously sequenced and had labeled Notch 3' (BAUER DUMONT et al. 2004). Collection data for these populations have been reported previously (BEGUN and AQUADRO 1991, 1994, 1995). For all populations, extracted X chromosome lines (BEGUN and AQUADRO 1994) were used. Thus each line represents a single X chromosome from the respective populations. The sample sizes for the microsatellite study were 49, 34, 45, and 72 chromosomes for Zimbabwe, the United States, Ecuador, and China, respectively, while those for sequencing were 12, 12, and 15 chromosomes for Zimbabwe, China, and the United States, respectively. One allele from a United States population of D. simulans was also sequenced for the entire 10,558-bp region, and one allele from D. yakuba was sequenced for the CG18508 open reading frame.
Analysis of microsatellite variability:
Cosmid clones 163A10 (accession no. AL035436) and 140G11 (accession no. AL035395) completely span Notch. After omitting overlapping regions, these clones cover a total of 76,889 bp of which 10,092 bp extend upstream (and centromere distal) of Notch, 35,367 bp encompass the Notch locus, and 31,430 bp extend downstream (and centromere proximal) of Notch. Using the "find" option of the DNASTAR program EditSeq, we searched this sequence for all dinucleotide motifs with lengths greater than five perfect repeats. Thirty-six repeats were found, nine of which were chosen for further analysis on the basis of their length and location. The microsatellites were named on the basis of their location (in kilobases) within the combined sequence of these clones. The microsatellites surveyed for variability are denoted as follows: 7.9, 28.6, 33.3, 37.3, 45.7, 46.8, 50.7, 57.8, and 67.8. Thus, our microsatellite survey spans 59.9 kb, which corresponds to bases 2,891,188–2,953,111 of the D. melanogaster genomic scaffold (Release 3.2.1). The primers and conditions used to amplify these microsatellites can be found in the supplemental data (at http://www.genetics.org/supplemental/). Forward primers were labeled with the fluorescent dye FAM (Applied Biosystems, Foster City, CA). PCR product lengths were determined on an ABI 373XL automated sequencer using the ABI programs Genescan and Genotyper.
The Bottleneck program (CORNUET and LUIKART 1996) was used to evaluate the relationship between the observed number of alleles and expected heterozygosity at each microsatellite. We report the two-phase model results, which were determined under the default settings of the program (variance 30.0, probability 70%). The LnRV and LnRH tests (SCHLÖTTERER 2002b; KAUER et al. 2003) were also applied to the microsatellite data to test for population-specific reductions in variability. The tests were performed by comparing our 9 Notch region loci to data from 118 other X-linked microsatellite loci reported by KAUER et al. (2003). For these tests, levels of variability at monomorphic loci were adjusted by replacing one allele with another that is one repeat unit different from the original allele length, following the suggestion of SCHLÖTTERER (2002b) and KAUER et al. (2003).
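The quantities behind the LnRV and LnRH comparisons can be sketched as follows. This is only an illustration, not the software used in the study: the function names and the toy allele lists are hypothetical, and the LnRH form shown, which uses the stepwise-mutation estimator (1/(1 - H))^2 - 1, is an assumption about how the cited tests are defined and should be checked against SCHLÖTTERER (2002b) and KAUER et al. (2003). The sketch stops at the raw log ratios; standardizing them against the reference loci is not shown.

```python
import math
from collections import Counter


def adjust_monomorphic(alleles):
    """If a locus is monomorphic, change one allele by one repeat unit,
    mirroring the adjustment described in the text."""
    if len(set(alleles)) == 1:
        return [alleles[0] + 1] + list(alleles[1:])
    return list(alleles)


def expected_heterozygosity(alleles):
    """Unbiased gene diversity: n/(n-1) * (1 - sum of squared allele frequencies)."""
    n = len(alleles)
    counts = Counter(alleles)
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_p2)


def repeat_variance(alleles):
    """Sample variance in repeat number."""
    n = len(alleles)
    mean = sum(alleles) / n
    return sum((a - mean) ** 2 for a in alleles) / (n - 1)


def ln_rv(pop1, pop2):
    """Log ratio of variances in repeat number (pop1 relative to pop2)."""
    return math.log(repeat_variance(pop1) / repeat_variance(pop2))


def ln_rh(pop1, pop2):
    """Log ratio of heterozygosity-based estimators (assumed form, see lead-in).
    Undefined when H reaches 1, i.e., when every sampled allele is distinct."""
    def theta_h(alleles):
        h = expected_heterozygosity(alleles)
        return (1.0 / (1.0 - h)) ** 2 - 1.0
    return math.log(theta_h(pop1) / theta_h(pop2))


# Hypothetical repeat counts at one locus (not data from this study).
non_african = adjust_monomorphic([10] * 8 + [11] * 2)
african = adjust_monomorphic([8, 9, 10, 10, 11, 11, 12, 12, 13, 14])
print(ln_rv(non_african, african), ln_rh(non_african, african))
```

Negative values indicate less variation in the first population than in the second; whether a locus is an outlier depends on where its value falls in the distribution across the reference loci.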
Analysis of sequence polymorphism data:
Six overlapping regions were amplified by PCR (SAIKI et al. 1988), yielding 10,558 bp of sequence surveyed for variability in the United States and Zimbabwe population samples. This sequence includes and continues downstream from the region that we had previously surveyed and labeled Notch 3' (BAUER DUMONT et al. 2004). For the Chinese population, only three additional segments were PCR amplified and sequenced, resulting in a total of four segments analyzed in this population. Information pertaining to these regions, including PCR primer details, can be found in the supplemental data (http://www.genetics.org/supplemental/). The total 10,558-bp region lies between bases 2,897,500 and 2,907,597 of the D. melanogaster genomic scaffold (Release 3.2.1).
Sequencing was performed by the Biotechnology Resource Center DNA Sequencing Facility at Cornell University (http://www.brc.cornell.edu) using ABI chemistry and running products out on the ABI 377 or ABI 3700 machines. Both strands were sequenced in D. simulans. In D. melanogaster, the lines sequenced were each homozygous for a different X chromosome, so calling heterozygotes was not an issue. Roughly half of the region was sequenced on a single strand for this resequencing effort. Sequences were aligned using MegAlign of the DNASTAR software package and analyzed using the DnaSP 4.0 program (ROZAS and ROZAS 1997). This program was used to calculate π and θ, estimates of 3Neµ (since the sequence is on the X chromosome). DnaSP 4.0 was also used to perform the following tests of the neutral theory: Tajima's (TAJIMA 1989) D, Fu and Li's (FU and LI 1993) D, Fay and Wu's (FAY and WU 2000) H, and the McDonald-Kreitman test (MCDONALD and KREITMAN 1991). Sliding window plots of these tests were also obtained using DnaSP 4.0. P-values for the Tajima's D, Fu and Li's D, and Fay and Wu's H test statistics across the entire region were obtained using the coalescent simulator of DnaSP 4.0 assuming no recombination. For the China data set, DnaSP 4.0 was used to obtain P-values for the test statistics incorporating recombination. For this analysis we used R = 3Nerm = 70, where r is the rate of recombination per base pair per generation and m is the length in base pairs of the region analyzed. Due to computer time constraints in analyzing the USA and Zimbabwe data sets using DnaSP 4.0, the ms program of HUDSON (2002) was used to determine the significance of these tests incorporating recombination. The recombination rate used for these data sets was R = 333. These recombination rates were chosen because they are half the expected R for these regions given the sex-averaged genetic map-based recombination rate at Notch of r = 2.1 × 10⁻⁸ recombinants/generation/bp (HEY and KLIMAN 2002) and an estimate of Ne of 1 × 10⁶ (KREITMAN 1983) for D. melanogaster. This should be conservative with regard to detecting significant skews in the frequency spectrum.
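The arithmetic behind these recombination parameters can be checked with a short calculation. The sketch below assumes r = 2.1 × 10⁻⁸ per bp per generation, Ne = 1 × 10⁶, and the full 10,558-bp region, as stated in the text; the function name is hypothetical.

```python
def region_recombination(ne, r_per_bp, length_bp, x_factor=3):
    """R = 3 * Ne * r * m for an X-linked region of m base pairs."""
    return x_factor * ne * r_per_bp * length_bp


# Values quoted in the text for the full 10,558-bp region.
R_full = region_recombination(ne=1e6, r_per_bp=2.1e-8, length_bp=10558)
R_half = R_full / 2  # the halved value used for the significance tests

print(round(R_full))  # 665
print(round(R_half))  # 333
```

The full-region value matches the R of 665 used in the neutral and bottleneck simulations, and the halved value matches the R of 333 used for the United States and Zimbabwe test statistics.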
We compared patterns of variability observed in our data to those from simulations to evaluate how extreme the empirical patterns are from that expected under the standard neutral model or different population bottleneck scenarios (details in RESULTS). Simulations were carried out using the ms program for a sequence of 10,558 bp and a recombination rate per region (R) of 665. Neutral steady-state simulations were conditioned on the number of segregating sites observed within the United States. We also considered 16 different bottleneck scenarios. In these simulations, the number of segregating sites in the United States was taken to represent that of the present population size, and the number of segregating sites observed in Zimbabwe was assumed to be representative of the ancestral population size prior to the bottleneck associated with the founding of the non-African populations. We simulated bottlenecks that occurred 5000, 7000, 10,000, or 20,000 years ago (50,000, 70,000, 100,000, or 200,000 generations ago, assuming 10 generations per year) as these bracket recently published estimates of the age of the out-of-Africa population bottleneck [a minimum of 6400 years ago (BAUDRY et al. 2004); 16,000 years or 0.021 3Ne generations ago (HADDRILL et al. 2005)]. For each time point we simulated four different bottleneck reductions, which differed in strength and behavior (i.e., followed with or without exponential growth). The ms program performs simulations going backward in time for which changes in population size are relative to the present size. In the bottleneck simulations without exponential growth, a population of the size inferred for the United States remains constant until the specified time of the bottleneck upon which the population size instantaneously increases to that observed in Zimbabwe. Given that we observe ~41% of the variation in the United States (147 segregating sites) as compared to Zimbabwe (362 segregating sites), by design, these simulations incorporate a bottleneck of ~59% (100% − 41%) and represent the least severe considered. We also simulated bottlenecks of greater severity. Moving backward in time, the population size shrinks such that at the time of the bottleneck the population size is 67, 75, or 99% smaller than that observed in Zimbabwe. At this point the population size instantaneously jumps to that observed in Zimbabwe. For each scenario, 1000 replicates were performed.
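The grid of 16 scenarios described above can be enumerated in a few lines. This is a bookkeeping sketch only: it reproduces the ages, the conversion to generations, and the reduction levels stated in the text, but it does not encode which scenarios include exponential growth and it does not generate ms command lines. All variable names are hypothetical.

```python
from itertools import product

GENERATIONS_PER_YEAR = 10      # assumption stated in the text
S_US, S_ZIMBABWE = 147, 362    # segregating sites reported for each sample

retained = S_US / S_ZIMBABWE   # fraction of Zimbabwe variation seen in the United States
ages_years = [5000, 7000, 10000, 20000]
reductions = [1 - retained, 0.67, 0.75, 0.99]  # least severe first

scenarios = [
    {
        "age_years": age,
        "age_generations": age * GENERATIONS_PER_YEAR,
        "reduction": reduction,
    }
    for age, reduction in product(ages_years, reductions)
]

print(len(scenarios))                 # 16
print(round(retained * 100))          # 41
print(round((1 - retained) * 100))    # 59
```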
The composite likelihood method for detecting positive selection along a recombining chromosome of KIM and STEPHAN (2002) was applied to the entire 10,558-bp sequence. This method compares the likelihood of observed patterns of nucleotide sequence variation under either a selective sweep or a standard neutral model of molecular evolution. The objective is to determine whether a selective sweep has occurred and to predict the location of the target of the sweep if detected. The two programs needed to perform this test (i.e., clsw and ssw) were downloaded from http://128.151.242.156/~orrlab/People_Yuseob.HTML. The clsw program performs the composite likelihood comparison. We used the -1 option (resulting in θ being estimated from our data directly) and a recombination rate (4Nr) of 0.063/bp and assumed that the mutation rate was uniform across the region. We placed no constraint on the localization of the target of any potential selective sweep (target could be placed anywhere within the 10,558-bp segment or outside of it). We used the version of the program that assumes that the derived state of a segregating site is known, which was deduced using comparisons to D. simulans. At segregating sites for which the derived state could not be determined in this manner (either because a third base was segregating in D. simulans or due to an insertion/deletion difference between the species) we assumed the base with the higher frequency to be ancestral. For the United States, 31 of 147 segregating sites analyzed fell into this category, while for Zimbabwe the number was 52 of 336. As noted by KIM and STEPHAN (2002), this assumption will have little effect on detecting selection if the hitchhiking event was old. If the event was recent, the power of the analysis to detect positive selection is decreased. Therefore, this treatment of our data should be conservative. We deleted regions from analysis for which there were N's (incomplete data) for some sequences within a population sample.
To determine the significance of the composite likelihood output of clsw, we produced 1000 neutral genealogies using the simulation program ssw as described in KIM and STEPHAN (2002). These simulations were conditioned on the θ value estimated from the original data. The outputs of the neutral simulations were subsequently run through clsw. To obtain a P-value, we determined how many of the 1000 neutral genealogies had a composite likelihood ratio equal to or greater than that observed for our data.
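The simulation-based P-value described here is simply the fraction of neutral replicates whose statistic is at least as large as the observed one. A minimal sketch, with a hypothetical function name and made-up simulated values rather than actual clsw output:

```python
def empirical_p_value(observed, simulated):
    """Fraction of simulated statistics equal to or greater than the observed one."""
    hits = sum(1 for value in simulated if value >= observed)
    return hits / len(simulated)


# Hypothetical likelihood ratios from 1000 neutral replicates:
# 993 fall below the observed value and 7 exceed it.
simulated_ratios = [2.0] * 993 + [30.0] * 7
p = empirical_p_value(observed=26.84, simulated=simulated_ratios)
print(p)  # 0.007
```

With 1000 replicates, an observed statistic that no replicate equals or exceeds is reported as P < 0.001.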
The msHH program of PRZEWORSKI (2003; http://email.eva.mpg.de/~przewors/), a rejection-sampling method to estimate the age of a selective sweep, was also applied to the data. As detailed below, the results of the KIM and STEPHAN (2002) method suggest that a recent sweep has occurred within this 10,558-bp region. For the United States population, the KIM and STEPHAN (2002) method places the target of the sweep near a fixed difference observed between the African and non-African populations. We applied the msHH method to the sequence 5' (the upstream region) and 3' (the downstream region) of this fixed difference. The method was applied in two ways. First, the program was run considering only the 1000 bp immediately flanking the fixed difference on each side. Second, the program was run considering the total upstream and downstream sequence. The input of the first 1000 bp upstream was as follows: 16 segregating sites, Tajima's D of −1.140, and 5 haplotypes in a sample of 15. For the entire upstream region the input was as follows: the length of the region is 5423 bp, 56 segregating sites, Tajima's D of −0.840, and 15 haplotypes in a sample of 15. The input of the first 1000 bp downstream was 18 segregating sites, Tajima's D of −1.504, and 6 haplotypes in a sample of 15. For the entire downstream region the input was as follows: the length of the region is 5135 bp, 91 segregating sites, Tajima's D of −1.186, and 15 haplotypes in a sample of 15. For all regions 2000 accepted trials were obtained with the tolerance of Tajima's D (one criterion to determine acceptance) set at 0.10. The following estimates were used as the mean of the prior distributions: mutation rate of 1 × 10⁻⁸/nucleotide/generation, recombination rate of 2.1 × 10⁻⁸ recombinants/bp/generation, and effective population size of 7.5 × 10⁵ (the latter being the X chromosome effective population size).
Microsatellite variability:
Nine dinucleotide repeats selected from genomic sequence to be distributed across a 60-kb region encompassing the 30-kb Notch gene (Figure 1) were sampled for length variation in population samples of D. melanogaster from Zimbabwe, the United States, Ecuador, and China. Table 1 lists measures of variability for each of the microsatellites in each population (see Table S1 in supplemental data at http://www.genetics.org/supplemental/ for the distribution of PCR fragment lengths). For these nine loci, the non-African populations harbor on average fewer alleles and have lower heterozygosity than the Zimbabwe population. Of particular note is a localized reduction of variability apparent in Figure 2. All populations show some reduction in heterozygosity for microsatellites 33.3–46.8. However, microsatellites are notoriously heterogeneous in mutation rate among loci due particularly to different numbers of perfect repeats (e.g., BRINKMANN et al. 1998; SCHUG et al. 1998; BACHTROG et al. 2000; ELLEGREN 2000). To circumvent this difficulty, we evaluated whether the neutral relationship between the number of alleles at a locus and the frequency distribution of those alleles (as measured by heterozygosity) was met at each microsatellite. We assessed the significance of a departure using the computer program Bottleneck (CORNUET and LUIKART 1996). This program evaluates the probability of observing the expected heterozygosity at individual loci on the basis of allele frequencies given the observed number of alleles assuming a two-phase mutation model (i.e., the majority of microsatellite mutations are stepwise but occasionally a larger jump in allele length occurs; results were similar when a stepwise model was assumed). A deficiency of heterozygosity (negative DH/sd value) indicates an excess of low-frequency alleles and would be consistent with linkage to a recent selective sweep.
Bottleneck results for the nine Notch region microsatellites are given in Table 1 and depicted in Figure 2. In all of the non-African populations there is a cluster of microsatellites for which we observe strongly negative DH/sd values (the United States and China) or no variation (Ecuador). While none of the microsatellites reject neutrality when evaluated as a two-tailed test (with significance cutoff of 0.025), the departure is close to significant at microsatellite 45.7 in the United States and at microsatellite 33.3 for the United States and Chinese populations. Zimbabwe lacks the strong tendency of negative DH/sd values in the center of the region surveyed (microsatellite loci 33.3–46.8).
We also applied the LnRV and LnRH multilocus tests (SCHLÖTTERER 2002b; KAUER et al. 2003) to our microsatellite data. The goal of these tests is to detect loci that are outliers to the distribution of the ratio of microsatellite variation observed across loci when one population is compared to another (variation being measured as either variance in repeat number or expected heterozygosity per locus, respectively). Loci that are significant outliers show a population-specific excess or deficiency in variation, which is interpreted as the signature of population-specific balancing or directional selection in that region of the genome. We compared the microsatellites at Notch to a set of 118 X-linked loci surveyed for variation in population samples from Zimbabwe and Europe (KAUER et al. 2003; data from supplemental material at www.genetics.org/cgi/content/full/165/3/1137/DC1). To perform the tests, our data from the United States, China, and Ecuador were individually combined with the European data of KAUER et al. (2003) and were compared to the combined Zimbabwe data set. Results of these tests are reported in Table 1.
The LnRV test indicates a significant deficiency of variation in the United States at microsatellites 45.7 and 46.8. Interestingly, the LnRH test indicates that microsatellite 37.3 has a significant excess of variation in the United States compared to Zimbabwe. The excess expected heterozygosity at this one locus is due to the presence of only two intermediate frequency alleles in the United States sample. In Ecuador, microsatellites 7.9 and 45.7 show a significantly reduced level of variation with the LnRH test. The LnRV test similarly reveals that microsatellites 7.9, 45.7, and 46.8 are significantly less variable in Ecuador compared to Zimbabwe. All three of these microsatellites are monomorphic in our Ecuador sample but show "normal" levels of variation in Zimbabwe. While our China sample shows a similar pattern of reduced variation at some of these loci, they are not significant outliers in the LnRV or LnRH tests comparing China to Zimbabwe. These tests can be taken only as suggestive since for our non-African populations, we are comparing Notch-region microsatellite variation within the United States, Ecuador, and China to variation observed at other X-linked microsatellites found in European samples. Nonetheless, taken together, the results of the Bottleneck, LnRV, and LnRH tests point to an ~14-kb window with unusual levels and frequency distributions of microsatellite variability suggestive of one or more recent selective sweeps in the non-African populations (Figure 2, Table 1).
Nucleotide sequence data:
To evaluate whether positive selection played a role in the 14-kb window of reduced and/or skewed microsatellite variability, we assayed nucleotide sequence variation across a 10,558-bp segment. This region includes the 3'-end of Notch plus four open reading frames just downstream of the Notch transcript (Figure 1). We note that this sequence includes only the last 4.4 kb of the 14-kb window. We focused on this region because it includes the Fcp3C protein. Given this protein's expression during the formation of the follicle cuticle, it belongs to a class of proteins often found to be rapidly evolving due to positive selection (e.g., TSAUR and WU 1997; SWANSON et al. 2001; SWANSON and VACQUIER 2002). Nucleotide variability was surveyed beyond microsatellite 46.8 (the most 3' skewed/invariant microsatellite) to ensure the sampling of sequence unaffected by the potential sweep. In all populations the first 1.5 kb of sequence is what we had previously called Notch 3' (BAUER DUMONT et al. 2004). In the Zimbabwe and United States population samples a region spanning 10,558 bp was sequenced. For the Chinese population, we sequenced three segments (in addition to the region previously labeled Notch 3') resulting in a total of 5 kb surveyed within the 10,558-bp segment. We labeled the regions sequenced in China: Notch 3', the CG18508 region, the Fcp3C region, and the 3' region, which corresponds to the last 2 kb of the 10,558-bp sequence.
The segregating sites within this 10,558-bp region in the D. melanogaster populations are represented in Figure S1 in supplemental data at http://www.genetics.org/supplemental/. The levels of variability and results of tests of neutrality considering the site frequency spectrum are given in Table 2. As has been previously observed at X-linked loci (e.g., BEGUN and AQUADRO 1993; ANDOLFATTO 2001b; GLINKA et al. 2003), the United States population harbors less variation than Zimbabwe. All frequency spectrum-based tests of neutrality are negative in the United States and Zimbabwe across the total 10,558 bp. When the neutral cutoff of these statistics is determined by simulating neutrality with recombination, Fu and Li's D is significantly negative across the region in Zimbabwe. Both Tajima's D and Fay and Wu's H are significantly negative in the United States. The results in both populations remain significant after correcting for multiple tests indicating both an excess of rare alleles and high-frequency-derived mutations in the United States and an excess of young (rare) segregating variants in Zimbabwe across this 10,558-bp sequence in D. melanogaster. In China the test statistics are negative at the Notch 3', CG18508, and Fcp3C regions and are positive at the 3' region. After multiple testing corrections, none of the tests is significant in this population. The non-African population results are in agreement with the microsatellite survey. The Zimbabwe results were unexpected, as we did not detect a skew in the frequency spectrum with the microsatellite Bottleneck analysis.
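To make concrete why an excess of rare variants yields a negative Tajima's D, the statistic can be computed directly from the per-site allele counts. The sketch below implements the standard formula of TAJIMA (1989); it is not the DnaSP implementation used in the study, it ignores missing data and multiple hits, and it gives no P-value (those came from coalescent simulations). The function name and the toy counts are hypothetical.

```python
import math


def tajimas_d(n, allele_counts):
    """Tajima's D for n sequences.

    allele_counts holds, for each segregating site, the number of
    sequences carrying one of the two alleles (between 1 and n - 1).
    """
    s = len(allele_counts)
    if s == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")

    # Mean number of pairwise differences (pi), summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in allele_counts)

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Difference between pi and Watterson's estimate, scaled by its standard deviation.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Toy samples of 15 sequences with 20 segregating sites each.
d_rare = tajimas_d(15, [1] * 20)          # every variant is a singleton
d_intermediate = tajimas_d(15, [7] * 20)  # every variant at intermediate frequency
print(d_rare, d_intermediate)
```

In the toy example the all-singleton sample gives a negative D and the intermediate-frequency sample gives a positive D, which is the direction of skew discussed in the text.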
Figure 3 depicts sliding window plots of variability, total divergence, Tajima's D, Fu and Li's D, and Fay and Wu's H across the 10,558-bp sequence for the United States and Zimbabwe populations. Fluctuations in variation do not correspond to coding and noncoding boundaries. In Zimbabwe, the fluctuations in variability follow the fluctuations of divergence very closely, except within the Notch locus where divergence is very high and synonymous sites are not evolving neutrally (BAUER DUMONT et al. 2004). All the test statistics tend to be negative on average but Fu and Li's D is more consistently negative across the region.
In the United States, fluctuations in levels of variation depart dramatically from levels of divergence between coding regions CG18508 and Fcp3C. Within this region we observe a valley of variability but a peak in divergence. We also observe a prominent valley in the Tajima's D and Fay and Wu's H statistics within this region. These test statistic results are strongly influenced by a 1.3-kb region for which all the variability is observed in one individual allele (USA16; Figure S1 in supplemental data at http://www.genetics.org/supplemental/). However, the tests remain significant even when this individual is removed from the analysis showing that the tendency for variants to be rare is true for the entire sample. Data from China are consistent with the patterns observed in the United States. The Fcp3C segment in this population has lower variability and has a greater skew in the frequency spectrum toward rare alleles compared to the adjacent segments.
These data suggest that the non-African populations sampled have undergone a recent perturbation of gene genealogies consistent with a selective sweep between bases 14,000 and 20,000 of our sequence. Interestingly, within this region is an 811-bp region (between bases 17,210 and 18,020) with no variation within the United States population and two fixed differences between the African and non-African populations (all but 100 bp of this gap has been surveyed in China with no variation observed).
We used the ms program of HUDSON (2002) to test whether observing an 811-bp region of no variation is expected when the variation is neutral and at mutation-drift equilibrium, as well as with a recent population bottleneck. Bottlenecks were considered because non-African populations of D. melanogaster are thought to have recently expanded out of Africa, possibly associated with a founder effect (DAVID and CAPY 1988; LACHAISE et al. 1988; BAUDRY et al. 2004). Genealogies were constructed under a standard steady-state neutral model as well as with varying strengths of, and times since, a bottleneck followed with or without exponential growth (see MATERIALS AND METHODS). All simulations were conditioned on the total number of segregating sites observed across the 10,558-bp sequence in the United States. The ms program assumes that the mutation rate is uniform across the sequence, which appears valid given that divergence across this gap of variation between the CG18508 and Fcp3C open reading frames is not reduced (Figure 3). Results from the simulations are given in Table 3. Only 7 of 1000 steady-state genealogies had a gap of variation of ≥811 bp (P = 0.007). Only for bottlenecks with a 99% reduction 7000, 10,000, or 20,000 years ago do we observe a gap of variation of ≥811 bp in more than 5% of the simulated genealogies.
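The gap statistic used in these simulations can be sketched as follows. This is only an illustration of the counting step, assuming that segregating-site positions have already been extracted from each simulated replicate and converted to base pairs; the function names, the toy positions, and the choice to count the stretches at the two ends of the sequence as gaps are assumptions, not details given in the text.

```python
def largest_gap(positions, length):
    """Longest stretch (in bp) without a segregating site.
    Assumption: the stretches before the first site and after the
    last site are counted as gaps too."""
    points = [0] + sorted(positions) + [length]
    return max(b - a for a, b in zip(points, points[1:]))

def gap_proportion(replicates, length, observed_gap):
    """Fraction of simulated replicates whose largest gap is at least
    as long as the observed gap."""
    hits = sum(1 for sites in replicates
               if largest_gap(sites, length) >= observed_gap)
    return hits / len(replicates)

# Two hypothetical replicates on a 2000-bp sequence.
toy_replicates = [
    [100, 200, 1500],    # largest gap: 1300 bp
    [500, 1000, 1500],   # largest gap: 500 bp
]
print(gap_proportion(toy_replicates, 2000, 811))
```

Applied to 1000 replicates of the 10,558-bp sequence, a proportion of this kind corresponds to the values reported in Table 3 (for example, 7 of 1000 steady-state genealogies).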
KIM and STEPHAN (2002) proposed a composite likelihood analysis for the detection and localization of the targets of recent selective sweeps. Their method compares the likelihood of the distribution of variation and its frequency spectrum observed across a region of the genome under a selective sweep model with that under an equilibrium neutral model. In Zimbabwe we cannot reject neutrality (likelihood ratio of 6.00, P-value of 0.447). However, the United States data strongly reject the standard neutral model (likelihood ratio of 26.84, P-value < 0.001) in favor of a model with a single recent selective sweep. The test remains significant even when the analysis is done without the USA16 allele that contributes all of the variation for a 1.3-kb segment corresponding in part to the large valley in Tajima's D and Fay and Wu's H in Figure 3 (likelihood ratio of 11.76, P-value < 0.001). In China we could perform the test with only six alleles because they were the only ones sequenced consistently across all regions (due to fly lines and DNA no longer being available). Even though we expect the power of the Kim and Stephan method to be compromised by a smaller sample size, we also observe a significantly better fit to selection in this population (likelihood ratio of 5.51, P-value of 0.039).
The likelihood ratio surface indicating the location of the target of the selective sweep in the United States and Zimbabwe populations is shown in Figure 4. The likelihood surface for Zimbabwe is flat. The pattern is strikingly different for the United States with a single major peak in the likelihood surface. The location for the target of the selective sweep in the United States sample is predicted to be at or near position 17,607. For the more limited data from China, the predicted target is at or near position 18,499. In the United States the predicted target is between the open reading frames CG18508 and Fcp3C and is within the 811-bp gap of variation. Given the orientation of these coding regions (Figure 1), the target is 5' of CG18508 and 3' of Fcp3C. In China the target is predicted to be within the first exon of Fcp3C. We note that the target in the United States is 5' of the deepest valleys of the Tajima's D and Fay and Wu's H. As previously mentioned, this pattern is due to all of the segregating sites being present on a single allele, presumably due to recombination or gene conversion during the sweep. Thus this region is predicted to be adjacent to, but not contain, the target of the sweep.
Could a recent population bottleneck associated with the out-of-Africa expansion of D. melanogaster lead spuriously to the significant rejection of neutrality we obtained using the Kim and Stephan likelihood ratio test? To evaluate this possibility, we applied their test to each of the individual output simulations obtained for various bottleneck scenarios reported in Table 3. Only for a bottleneck with a 99% reduction 10,000 years ago ($0.033 \times 3N_e$ generations ago) do we observe likelihood ratios as large as those of the United States (barely) above the 5% level (proportion 5.1%; Table 3). As has been previously noted (JENSEN et al. 2005), such bottlenecks appear to be the most confounding when distinguishing demographic evolutionary histories from those involving positive selection. To the extent that the major "out-of-Africa" bottleneck occurred <10,000 years ago, as estimated by BAUDRY et al. (2004), the reduction and skew in variation that we observe just downstream of Notch are most consistent with the action of positive selection.
The KIM and STEPHAN (2002) method assumes that the selective sweep has just completed. However, it is of interest to estimate a more precise age. PRZEWORSKI (2003) has proposed a resampling method to estimate the time since a predicted selective sweep. This is done by repeatedly constructing genealogies on the basis of the prior distributions of such parameters as the time since a sweep, the strength of selection, mutation rate, recombination rate, and effective population size. Genealogies, and thus the parameters used to generate them, are accepted as potential forces shaping the observed patterns of variation if the number of segregating sites, the number of haplotypes, and the frequency spectrum as parameterized by Tajima's D match what is empirically observed in the data. To apply this method to our data we assumed that the fixed difference between the United States and Zimbabwe populations at position 17,604 is the target of the sweep that appears to have affected the non-African populations sampled. We assume this because the fixed difference is within the 811-bp gap of variability and is 3 bp away from the target predicted by the Kim and Stephan method. Data 5' of position 17,604 were called the "upstream region" and data 3' the "downstream region." We analyzed the first 1000 bp flanking both the fixed difference and the total upstream and downstream regions. Analysis was performed looking at upstream and downstream regions separately because this method considers only one direction from the target.
PRZEWORSKI (2003) considers a sweep to be recent if the mode of the estimated age in units of 4N generations (3N for X-linked loci) is 0.25 (a quarter of the time expected under the neutral theory). The mode for the age for the first 1000 bp upstream is between 0 and 0.05 and for the first 1000 bp downstream is between 0.15 and 0.25. As the Przeworski method also estimates the effective population size we can convert these estimates to the number of generations. An age of 20,000 and 35,000 generations ago is predicted for the first 1000 bp upstream and downstream, respectively. The total upstream and downstream segments predict an age between 200,000 and 300,000 generations ago. A predicted older age for the total upstream and downstream analyses was expected as they include regions apparently unaffected by the sweep (i.e., ends of the sequence, see Figure 3). Assuming 10 generations per year, this sweep is estimated to have completed between 2,000 and 30,000 years ago when considering both the 1000 bp and total sequence analyses.
Another parameter of interest concerning the putative sweep is the strength of selection (s). With the PRZEWORSKI (2003) method, s ranges between 0.00007 and 0.05. The average s is 0.026 for both the total upstream and downstream analyses and 0.016 and 0.015 for the first 1000 bp upstream and downstream, respectively. The estimate of $1.5N_e s$ from the KIM and STEPHAN (2002) method is 757 for the United States population. If we assume an effective population size ($N_e$) of $1 \times 10^{6}$, s is estimated to be 0.0005; if we assume an $N_e$ of $1 \times 10^{5}$, s is 0.005.
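The conversion from the scaled Kim and Stephan estimate to a selection coefficient is simple arithmetic, sketched below. The function name and its `scale` parameter are illustrative; the values 757 and the two effective population sizes are those given in the text.

```python
def selection_coefficient(scaled_strength, n_e, scale=1.5):
    """Solve scale * N_e * s = scaled_strength for s.
    scale = 1.5 is the X-linked scaling used in the text."""
    return scaled_strength / (scale * n_e)

for n_e in (1e6, 1e5):
    s = selection_coefficient(757, n_e)
    print(f"Ne = {n_e:.0e}: s = {s:.5f}")
```

The two results round to the 0.0005 and 0.005 quoted above, which makes explicit how strongly the inferred s depends on the assumed effective population size.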
We tested all coding regions within the sequence for evidence of rapid amino acid evolution using the McDonald-Kreitman test (MCDONALD and KREITMAN 1991). The tests were performed combining data across the populations (Table 4). Most coding regions within this 10,558-bp segment do not depart from neutral expectations. However, we observe a significant departure at CG18508, indicating an acceleration of amino acid fixations or an excess of synonymous polymorphisms at this locus. Synonymous sites are often assumed to be evolving neutrally, although this has been shown not to be the case at some loci in Drosophila (e.g., SHIELDS et al. 1988; KLIMAN and HEY 1993; AKASHI 1994; BAUER DUMONT et al. 2004). However, there is no evidence that synonymous sites have been affected by positive selection at CG18508 using the method of BAUER DUMONT et al. (2004), suggesting that the McDonald-Kreitman test results are due to an acceleration of amino acid fixations at CG18508.
D. yakuba was used as an outgroup to infer on which lineage each fixed difference occurred between D. melanogaster and D. simulans. This was done to determine if the amino acid fixations have occurred equally along each species' lineage. Five of the nonsynonymous fixed differences occurred along the D. melanogaster lineage while seven were inferred to have occurred on the D. simulans lineage. The nonsynonymous fixations thus appear to have occurred equally along each of these species' lineages. When considering only the fixations that have occurred along the D. melanogaster lineage, the McDonald-Kreitman test is close to being significant (nine synonymous polymorphisms and four synonymous fixed differences compared to one nonsynonymous polymorphism and five nonsynonymous fixed differences; Fisher's exact test P-value = 0.057). We note that the small numbers mean that there is little statistical power to reject the null hypothesis in this lineage-specific analysis. However, the results are in the same direction as the unpolarized analysis shown above suggesting that CG18508 has experienced an elevated number of amino acid fixations due to positive selection specifically in D. melanogaster.
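The lineage-specific McDonald-Kreitman comparison above is a 2 x 2 contingency test, and the reported P-value can be checked with a short sketch of a two-sided Fisher's exact test. The implementation below is a generic textbook version written for illustration (the function name is an assumption); the four counts are those given in the text.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]:
    sum the probabilities of all tables with the same margins that are
    no more likely than the observed one."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    lo = max(0, col1 - (n - row1))
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Rows: synonymous, nonsynonymous; columns: polymorphic, fixed.
p = fisher_exact_two_sided(9, 4, 1, 5)
print(round(p, 3))  # 0.057
```

The result agrees with the P-value of 0.057 reported for the D. melanogaster lineage.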
On the basis of microsatellite and nucleotide polymorphism data, the Zimbabwe population conforms, for the most part, to neutral expectations across the 10,558-bp region extending downstream of Notch. The exception is the significantly negative Fu and Li's D-test statistic across the entire region. While this is a pattern expected during the recovery phase after a sweep, none of the other tests detects such a departure within this population, and the test is uniformly negative across the region. This pattern would also be consistent with a recent population expansion, and recent studies indicate that this population has experienced such a population perturbation (GLINKA et al. 2003; F. A. REED, R. NEILSEN and C. F. AQUADRO, unpublished data). Thus to date, the most compelling explanation for the patterning of variation observed at this region within this population is neutrality coupled with a population expansion.
On the other hand, there is strong evidence at the polymorphism level for a recent sweep in the non-African populations. The microsatellite and nucleotide sequence data reveal a localized reduction of variability and a skew in the frequency spectrum toward rare alleles centered at the 3'-end and downstream of the Notch locus. These data result in a significant KIM and STEPHAN (2002) test with the target of the sweep near CG18508 and Fcp3C. The predicted sweep appears to be young as estimated from the PRZEWORSKI (2003) method. The estimated age of between 2,000 and 30,000 years corresponds to when D. melanogaster is thought to have expanded out of Africa (DAVID and CAPY 1988; LACHAISE et al. 1988) and is consistent with many studies suggesting a high rate of selective sweeps in non-African populations of D. melanogaster (e.g., BEGUN and AQUADRO 1993; VERRELLI and EANES 2000; TAKAHASHI et al. 2001; ANDOLFATTO 2001b; HARR et al. 2002; KAUER et al. 2002; GLINKA et al. 2003; KAUER et al. 2003; ORENGO and AGUADÉ 2004). We consider below the possibility that nonequilibrium demography has confounded our results.
We next consider the significant McDonald-Kreitman test result at the 99 amino acid-predicted protein-coding gene CG18508. Given that there is no evidence of selection affecting synonymous site evolution at this locus, we conclude that in D. melanogaster there has been an acceleration of amino acid fixations due to positive selection. At present, no biological function has been attributed to this coding region. However, its homolog can be found in the D. pseudoobscura genomic sequence, and expressed sequence tags (ESTs) have been detected in D. melanogaster (two in the embryo, two in adult testes, and one from the adult head; FLYBASE CONSORTIUM 2003). The only functional domain identified for CG18508, using SignalP 2.0 (http://www.cbs.dtu.dk/services/SignalP-2.0/), is a signal sequence at the N-terminal end of the 99 amino acid protein in both the D. melanogaster and D. pseudoobscura sequence.
Could the selective fixation of amino acids at CG18508 be the source of the recent sweep detected in the non-African populations? The McDonald-Kreitman results suggest a species-wide phenomenon as there are no fixed amino acid differences between the non-African and African populations. In contrast, the strongest evidence of a recent sweep from polymorphism data is found only in the non-African populations. Therefore, we feel that there are two likely explanations for the pattern of polymorphism in non-African populations: (1) a selectively driven fixation adjacent to the CG18508 coding region possibly due to an advantageous regulatory variant or (2) nonequilibrium population history of the non-African populations.
It is assumed that non-African populations of D. melanogaster are recently derived from an ancestral species range in Africa (DAVID and CAPY 1988; LACHAISE et al. 1988). As a result non-African populations may have historically experienced bottlenecks, admixture, and/or population expansion. A growing number of studies describe how such demographic effects perturb the tests used in this study to detect deviations from a Wright-Fisher equilibrium population model. For example, PRZEWORSKI (2002) has shown that population admixture can result in significant test departures with Fay and Wu's H test. In addition, bottlenecks and migration have been shown to affect the Tajima's D, Fu and Li's D, and Fay and Wu's H test statistics (e.g., TAJIMA 1989; FU and LI 1993; FAY and WU 2000; WAKELEY and ALIACAR 2001; WAKELEY 2003). Also, the Kim and Stephan method has a high false positive rate in the presence of strong population bottlenecks or population structure with low migration rates (JENSEN et al. 2005). Interestingly, demographic effects are not thought to strongly affect the McDonald-Kreitman test when implemented as simply a test of the neutral theory (WAKELEY 2003; but see also EYRE-WALKER 2002).
The effect of potential bottlenecks on our inference of positive selection depends on their strength and age (Table 3; JENSEN et al. 2005). DNA variability at multiple loci for non-African and African samples of D. melanogaster appears most consistent with a relatively severe and recent (on the order of $0.02 \times 3N_e$ generations ago) out-of-Africa bottleneck (BAUDRY et al. 2004; see also HADDRILL et al. 2005). The patterns of variability across the 10,558 bp studied here in the United States population appear more reduced and skewed than can be explained by such a severe and recent bottleneck alone. As noted in Table 3, a Kim and Stephan likelihood ratio of 26.848 or higher is observed at most 0.02% of the time for a severe bottleneck $0.023 \times 3N_e$ generations ago. Also, the selective sweep model is not rejected for the United States and Chinese data by the recently proposed goodness-of-fit test of JENSEN et al. (2005). The data are found to fit the selective sweep model of KIM and STEPHAN (2002) significantly better than a general evolutionary model, which could include nonequilibrium demography and/or population structure. Therefore, a recent selective sweep remains the most likely explanation for the window of reduced and skewed polymorphism in the CG18508 region of D. melanogaster in the non-African populations sampled.
What might be the selective target of this putative sweep? Four fixed differences between the African and non-African populations within the 10,558-bp segment are observed. One is within the 3'-end of Notch (position 14,203), two are located within the 811-bp gap of variability in the United States population (17,433 and 17,604), and the other is located 5' of Fcp3C (position 19,095). Only two of these fixed differences (14,203 and 17,604) have the derived mutation in the non-African populations. Interestingly, the 17,604 fixed difference is within the 811-bp gap of variability and is relatively close to the predicted target of the sweep estimated from the Kim and Stephan method (895 bp from the predicted site in China and 3 bp from the prediction in the United States).
We searched for transcription factor binding sites in the noncoding sequence between the CG18508 and Fcp3C coding regions using the web-based program MatInspector (http://www.genomatix.de/cgi-bin/matinspector/matinspector.pl). The fixed difference at position 17,604 is within the active site of a putative Caudal transcription factor binding domain, and this binding domain is predicted only in the non-African populations. Caudal is expressed in the embryo, during oogenesis, and in the adult testes, which corresponds to when and where CG18508 and Fcp3C are expressed. Therefore, the recent sweep out of Africa could be due to a change in expression level between populations at either CG18508 or Fcp3C. A similar result has been reported by ODGERS et al. (2002), who demonstrated a selectively driven change in promoter sequence (and possibly expression levels) at Esterase 6 between African and non-African populations of D. melanogaster. In addition, there is evidence for a weak correlation between gene expression evolution and protein evolution in yeast and worms (GU et al. 2002; CASTILLO-DAVIS et al. 2004) that may suggest a link between rapid amino acid evolution at CG18508 and regulatory sequence evolution nearby. Future work will attempt to verify this potential Caudal binding site difference between African and non-African populations of D. melanogaster.
We note that between this study and that of BAUER DUMONT et al. (2004) we detect evidence of positive selection acting over time at synonymous sites within Notch, at nonsynonymous sites within CG18508, and of a recent sweep in non-African populations within the CG18508 region. Thus, three apparently independent scenarios suggest the action of positive selection within this 10,558-bp segment.
How does this density of positive selection compare to that inferred as necessary to explain the genome-wide correlation between polymorphism and recombination in D. melanogaster? We assume that only two of the three aforementioned scenarios are the result of the fixation of mutations with a selection coefficient large enough to play a part in the correlation between variability and recombination (i.e., recent sweep downstream of Notch and McDonald-Kreitman results at CG18508). At CG18508, 16 nonsynonymous fixations separate D. melanogaster and D. simulans. Given the ratio of polymorphic to divergent synonymous sites at this locus (nine polymorphic vs. eight divergent) we would have expected, under the neutral theory, roughly one nonsynonymous fixation relative to the one nonsynonymous polymorphism observed. Considering both the results at CG18508 itself and the results of the recent selective sweep that appears to be centered upstream of this open reading frame, we crudely infer that between 2 and 15 fixations due to "strong" positive selection have occurred within this 10.5-kb region. Using the method proposed by KIM and STEPHAN (2000) to estimate the frequency and intensity of positive selection in the presence of background selection, ANDOLFATTO (2001a) estimated $\alpha\nu$ (where $\alpha = 2Ns$ and $\nu$ is the beneficial substitution rate per generation) to be $1.0 \times 10^{-7}$ from the observed relationship between nucleotide sequence variability and recombination in an African sample of D. melanogaster. Assuming $N = 1 \times 10^{6}$ for D. melanogaster and values of s equal to 0.00007, 0.005, 0.01, and 0.05 (corresponding to the range of selection coefficients estimated from our data by the PRZEWORSKI 2003 or KIM and STEPHAN 2002 methods) leads to estimates of $1 \times 10^{-9}$ to $1 \times 10^{-12}$ beneficial fixations per generation per nucleotide. These rates of adaptive substitution are consistent with BIERNE and EYRE-WALKER's (2004) maximum likelihood estimate of $1 \times 10^{-11}$ obtained comparing synonymous and nonsynonymous polymorphism and divergence within and between D. melanogaster, D. simulans, and D. yakuba. The former two species are thought to have diverged 2.5 MYA, which, assuming 10 generations per year, means a total divergence time of $5.0 \times 10^{7}$ generations. For a region of the size we have studied downstream of Notch (10,558 bp), we would thus expect totals of 528, 5.3, 2.6, or 0.5 beneficial substitutions to have accumulated along the total lineage separating contemporary D. melanogaster and D. simulans. These estimates bracket the number that we have inferred in our present analysis for the 10.5-kb region downstream of Notch. Broader genome scans of the X chromosome in D. melanogaster that have scanned from 29 to 109 microsatellites or sequenced regions also have found numerous putative sweep targets (HARR et al. 2002; GLINKA et al. 2003; ORENGO and AGUADÉ 2004). Additional high-resolution molecular population genetic studies of high-recombination regions of the genome will help refine our knowledge of the nature and frequency of adaptive fixations.
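The arithmetic behind the expected numbers of beneficial substitutions can be sketched as follows, assuming that the scaled estimate is the product of 2Ns and the per-nucleotide, per-generation rate, as the reported values imply. Function and variable names are illustrative. Note that the 528 quoted in the text corresponds to the rate for s = 0.00007 rounded to 1e-9; the unrounded rate is closer to 7e-10.

```python
def beneficial_rate(scaled_estimate, n, s):
    """Per-nucleotide, per-generation rate of beneficial substitution,
    obtained by dividing the scaled estimate by 2Ns."""
    return scaled_estimate / (2.0 * n * s)

def expected_substitutions(rate, region_bp, generations):
    """Expected number of beneficial substitutions in a region."""
    return rate * region_bp * generations

SCALED_ESTIMATE = 1.0e-7          # ANDOLFATTO (2001a), from the text
N = 1.0e6                         # assumed population size, from the text
REGION_BP = 10558
GENERATIONS = 2 * 2.5e6 * 10      # two lineages, 2.5 MY, 10 generations/year

for s in (0.005, 0.01, 0.05):
    rate = beneficial_rate(SCALED_ESTIMATE, N, s)
    total = expected_substitutions(rate, REGION_BP, GENERATIONS)
    print(f"s = {s}: rate = {rate:.1e}, expected = {total:.1f}")

# Upper bound using the rounded rate of 1e-9 quoted in the text.
print(round(expected_substitutions(1e-9, REGION_BP, GENERATIONS)))
```

The loop gives approximately 5.3, 2.6, and 0.5 substitutions, and the rounded upper rate gives 528, matching the totals reported above.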
AKASHI, H., 1994 Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 136: 927–935.
ANDOLFATTO, P., 2001a Adaptive hitchhiking effects on genome variability. Curr. Opin. Genet. Dev. 11: 635–641.
ANDOLFATTO, P., 2001b Contrasting pattern of X-linked and autosomal nucleotide variation in Drosophila melanogaster and Drosophila simulans. Mol. Biol. Evol. 18: 279–290.
AQUADRO, C. F., D. J. BEGUN and E. C. KINDAHL, 1994 Selection, recombination, and DNA polymorphism in Drosophila, pp. 46–56 in Non-neutral Evolution: Theories and Molecular Data, edited by BRIAN GOLDING. Chapman & Hall, New York.
AQUADRO, C. F., V. BAUER DUMONT and F. A. REED, 2001 Genome-wide variation in the human and fruitfly: a comparison. Curr. Opin. Genet. Dev. 11: 627–634.
BACHTROG, D., M. AGIS, M. IMHOF and C. SCHLÖTTERER, 2000 Microsatellite variability differs between dinucleotide repeat motifs - evidence from Drosophila melanogaster. Mol. Biol. Evol. 17: 1277–1285.
BAUDRY, E., B. VIGINIER and M. VEUILLE, 2004 Non-African populations of Drosophila melanogaster have a unique origin. Mol. Biol. Evol. 8: 1482–1491.
BAUER DUMONT, V., J. C. FAY, P. P. CALABRESE and C. F. AQUADRO, 2004 DNA variability and divergence at the Notch locus in Drosophila melanogaster and D. simulans: a case of accelerated synonymous site divergence. Genetics 167: 171–185.
BEGUN, D. J., and C. F. AQUADRO, 1991 Molecular population genetics of the distal portion of the X chromosome in Drosophila: evidence for genetic hitchhiking of the yellow-achaete region. Genetics 129: 1147–1158.
BEGUN, D. J., and C. F. AQUADRO, 1992 Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature 356: 519–520.
BEGUN, D. J., and C. F. AQUADRO, 1993 African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature 365: 548–550.
BEGUN, D. J., and C. F. AQUADRO, 1994 Evolutionary inferences from DNA variation at the 6-phosphogluconate dehydrogenase locus in natural populations of Drosophila: selection and geographic differentiation. Genetics 136: 155–171.
BEGUN, D. J., and C. F. AQUADRO, 1995 Molecular variation at the vermilion locus in geographically diverse populations of Drosophila melanogaster and D. simulans. Genetics 140: 1019–1032.
BEGUN, D. J., and P. WHITLEY, 2000a Reduced X-linked nucleotide polymorphism in Drosophila simulans. Proc. Natl. Acad. Sci. USA 97: 5960–5965.
BEGUN, D. J., and P. WHITLEY, 2000b Adaptive evolution of Relish, a Drosophila NF-kappaB/IkappaB protein. Genetics 154: 1231–1238.
BIERNE, N., and A. EYRE-WALKER, 2004 The genomic rate of adaptive amino acid substitution in Drosophila. Mol. Biol. Evol. 21: 1350–1360.
BRINKMANN, B., M. KLINTSCHAR, F. NEUHUBER, J. HÜHNE and B. ROLF, 1998 Mutation rate in human microsatellites: influence of the structure and length of the tandem repeat. Am. J. Hum. Genet. 62: 1408–1415.
CASTILLO-DAVIS, C. I., D. L. HARTL and G. ACHAZ, 2004 cis-regulatory and protein evolution in orthologous and duplicate genes. Genome Res. 14: 1530–1536.
CHARLESWORTH, B., M. T. MORGAN and D. CHARLESWORTH, 1993 The effect of deleterious mutations on neutral molecular variation. Genetics 134: 1289–1303.
CORNUET, J. M., and G. LUIKART, 1996 Description and power analysis of two tests for detecting recent population bottlenecks from allele frequency data. Genetics 144: 2001–2014.
DAVID, J. R., and P. CAPY, 1988 Genetic variation of Drosophila melanogaster natural populations. Trends Genet. 4: 106–111.
DUVERNELL, D. D., P. S. SCHMIDT and W. F. EANES, 2003 Clines and adaptive evolution in the methuselah gene region in Drosophila melanogaster. Mol. Ecol. 12: 1277–1285.
ELLEGREN, H., 2000 Microsatellite mutations in the germline: implications for evolutionary inference. Trends Genet. 16: 551–558.
EYRE-WALKER, A., 2002 Changing effective population size and the McDonald-Kreitman test. Genetics 162: 2017–2024.
FAY, J. C., and C.-I. WU, 2000 Hitchhiking under positive Darwinian selection. Genetics 155: 1405–1413.
FLYBASE CONSORTIUM, 2003 The FlyBase database of the Drosophila genome projects and community literature. Nucleic Acids Res. 31: 172–175.
FU, Y-X., and W-H. LI, 1993 Statistical tests of neutrality of mutations. Genetics 133: 693–709.
GLINKA, S., L. OMETTO, S. MOUSSET, W. STEPHAN and D. DE LORENZO, 2003 Demography and natural selection have shaped genetic variation in Drosophila melanogaster: a multi-locus approach. Genetics 165: 1269–1278.
GU, Z., D. NICOLAE, H. H-S. LU and W-H. LI, 2002 Rapid divergence in expression between duplicate genes inferred from microarray data. Trends Genet. 18: 609–613.
HADDRILL, P. R., K. R. THORNTON, B. CHARLESWORTH and P. ANDOLFATTO, 2005 Multilocus patterns of nucleotide variability and the demographic and selection history of Drosophila melanogaster populations. Genome Res. 15: 790–799.
HARR, B., M. KAUER and C. SCHLÖTTERER, 2002 Hitchhiking mapping: a population-based fine-mapping strategy for adaptive mutations in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 99: 12949–12954.
HEY, J., and R. M. KLIMAN, 2002 Interactions between natural selection, recombination and gene density in the genes of Drosophila. Genetics 160: 595–608.
HUDSON, R. R., 2002 Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics 18: 337–338.
HUDSON, R. R., and N. L. KAPLAN, 1995 Deleterious background selection with recombination. Genetics 141: 1605–1617.
HUDSON, R. R., M. KREITMAN and M. AGUADÉ, 1987 A test of neutral molecular evolution based on nucleotide data. Genetics 116: 153–159.
JENSEN, J. D., Y. KIM, V. BAUER DUMONT, C. F. AQUADRO and C. D. BUSTAMANTE, 2005 Distinguishing between selective sweeps and demography using DNA polymorphism data. Genetics 170: 1401–1410.
KAPLAN, N. L., R. R. HUDSON and C. H. LANGLEY, 1989 The "hitchhiking effect" revisited. Genetics 123: 887–899.
KAUER, M., B. ZANGERL, D. DIERINGER and C. SCHLÖTTERER, 2002 Chromosomal patterns of microsatellite variability contrast sharply in African and non-African populations of Drosophila melanogaster. Genetics 160: 247–256.
KAUER, M. O., D. DIERINGER and C. SCHLÖTTERER, 2003 A microsatellite variability screen for positive selection associated with the "out of Africa" habitat expansion of Drosophila melanogaster. Genetics 165: 1137–1148.
KAYSER, M., S. BRAUER and M. STONEKING, 2003 A genome scan to detect candidate regions influenced by local natural selection in human populations. Mol. Biol. Evol. 20: 893–900.
KIM, Y., and W. STEPHAN, 2000 Joint effects of genetic hitchhiking and background selection on neutral variation. Genetics 155: 1415–1427.
KIM, Y., and W. STEPHAN, 2002 Detecting a local signature of genetic hitchhiking along a recombining chromosome. Genetics 160: 765–777.
KLIMAN, R. M., and J. HEY, 1993 Reduced natural selection associated with low recombination in Drosophila melanogaster. Mol. Biol. Evol. 10: 1239–1258.
KOHN, M. H., H.-J. PELZ and R. K. WAYNE, 2000 Natural selection mapping of the warfarin-resistance gene. Proc. Natl. Acad. Sci. USA 97: 7911–7915.
KREITMAN, M., 1983 Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila melanogaster. Nature 304: 412–417.
LACHAISE, D., M. CARIOU, J. R. DAVID, F. LEMEUNIER, L. TSACAS et al., 1988 Historical biogeography of the Drosophila melanogaster species subgroup, pp. 159–225 in Evolutionary Biology, edited by M. K. HECHT, B. WALLACE and G. T. PRANCE. Plenum, New York.
LANGLEY, C. H., B. P. LAZZARO, W. PHILLIPS, E. HEIKKINEN and J. M. BRAVERMAN, 2000 Linkage disequilibria and the site frequency spectra in the su(s) and su(wa) region of the Drosophila melanogaster X chromosome. Genetics 156: 1837–1852.
MCDONALD, J. H., and M. KREITMAN, 1991 Adaptive protein evolution at the Adh locus in Drosophila. Nature 20: 652–654.
NACHMAN, M. W., 2001 Single nucleotide polymorphisms and recombination rate in humans. Trends Genet. 17: 481–485.
NURMINSKY, D., D. DE AGUIAR, C. D. BUSTAMANTE and D. L. HARTL, 2001 Chromosomal effects of rapid gene evolution in Drosophila melanogaster. Science 291: 128–130.
ODGERS, W. A., C. F. AQUADRO, C. W. COPPIN, M. J. HEALY and J. G. OAKESHOTT, 2002 Nucleotide polymorphism in the Est6 promoter, which is widespread in derived populations of Drosophila melanogaster, changes the level of Esterase 6 expressed in the male ejaculatory duct. Genetics 162: 785–797.
ORENGO, D. J., and M. AGUADÉ, 2004 Detecting the footprint of positive selection in a European population of Drosophila melanogaster: multilocus pattern of variation and distance to coding regions. Genetics 167: 1759–1766.
PAYSEUR, B. A., A. D. CUTTER and M. W. NACHMAN, 2002 Searching for evidence of positive selection in the human genome using patterns of microsatellite variability. Mol. Biol. Evol. 19: 1143–1153.
PRZEWORSKI, M., 2002 The signature of positive selection at randomly chosen loci. Genetics 162: 1179–1189.
PRZEWORSKI, M., 2003 Estimating the time since the fixation of a beneficial allele. Genetics 164: 1667–1676.
ROZAS, J., and R. ROZAS, 1997 DnaSP version 2.0: A novel software package for extensive molecular population genetics analysis. Comput. Appl. Biosci. 13: 307–311.
SAIKI, R. K., D. H. GELFAND, D. STOFFEL, S. J. SCHARF, R. HIGUCHI et al., 1988 Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239: 487–491.
SCHLENKE, T. A., and D. J. BEGUN, 2003 Natural selection drives Drosophila immune system evolution. Genetics 164: 1471–1480.
SCHLENKE, T. A., and D. J. BEGUN, 2004 Strong selective sweep associated with a transposon insertion in Drosophila simulans. Proc. Natl. Acad. Sci. USA 101: 1626–1631.
SCHÖFL, G., and C. SCHLÖTTERER, 2004 Patterns of microsatellite variability among X chromosomes and autosomes indicate a high frequency of beneficial mutations in non-African D. simulans. Mol. Biol. Evol. 21: 1384–1390.
SCHLÖTTERER, C., 2002a Towards a molecular characterization of adaptation in local populations. Curr. Opin. Genet. Dev. 12: 683–687.
SCHLÖTTERER, C., 2002b A microsatellite-based multilocus screen for the identification of local selective sweeps. Genetics 160: 753–763.
SCHMIDT, P. S., D. D. DUVERNELL and W. F. EANES, 2000 Adaptive evolution of a candidate gene for aging in Drosophila. Proc. Natl. Acad. Sci. USA 97: 10861–10865.
SCHUG, M. D., C. M. HUTTER, K. A. WETTERSTRAND, M. S. GAUDETTE, T. F. C. MACKAY et al., 1998 The mutation rates of di-, tri- and tetranucleotide repeats in Drosophila melanogaster. Mol. Biol. Evol. 15: 1751–1760.
SHIELDS, D. C., P. M. SHARP, D. G. HIGGINS and F



