Genetics, Vol. 154, 1681-1691, April 2000, Copyright © 2000

Molecular Variation at the In(2L)t Proximal Breakpoint Site in Natural Populations of Drosophila melanogaster and D. simulans

Peter Andolfatto and Martin Kreitman
Committee on Genetics, Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637

Corresponding author: Peter Andolfatto, Institute of Cell, Animal and Population Biology, Ashworth Labs, Kings Bldgs., University of Edinburgh, Edinburgh, EH9 3JT Scotland, United Kingdom. E-mail: peter.andolfatto@ed.ac.uk

Communicating editor: A. G. CLARK


*  ABSTRACT

A previous study of nucleotide polymorphism in a Costa Rican population of Drosophila melanogaster found evidence for a nonneutral deficiency in the number of haplotypes near the proximal breakpoint of In(2L)t, a common inversion polymorphism in this species. Another striking feature of the data was a window of unusually high nucleotide diversity spanning the breakpoint site. To distinguish between selective and neutral demographic explanations for the observed patterns in the data, we sample alleles from three additional populations of D. melanogaster and one population of D. simulans. We find that the strength of associations among sites found at the breakpoint varies between populations of D. melanogaster. In D. simulans, analysis of the homologous region reveals unusually elevated levels of nucleotide polymorphism spanning the breakpoint site. As with American populations of D. melanogaster, our D. simulans sample shows a marked reduction in the number of haplotypes but not in nucleotide diversity. Haplotype tests reveal a significant deficiency in the number of haplotypes relative to the neutral expectation in the D. simulans sample and some populations of D. melanogaster. At the breakpoint site, the level of divergence between haplotype classes is comparable to interspecific divergence. The observation of interspecific polymorphisms that differentiate major haplotype classes in both species suggests that haplotype classes at this locus are considerably old. When considered in the context of other studies on patterns of variation within and between populations of D. melanogaster and D. simulans, our data appear more consistent with the operation of selection than with simple demographic explanations.


THE chromosomal rearrangement In(2L)t is one of four common polymorphic inversions with stable geographic frequency clines in natural populations of Drosophila melanogaster. While rare in temperate climates, it reaches frequencies of 40–60% in tropical populations of Australasia and Africa (KNIBB 1982; BENASSI et al. 1993). The existence of parallel latitudinal clines across different continents and hemispheres suggests that natural selection maintains at least some inversion polymorphisms in this species (KNIBB 1982). The mechanism by which inversions become established in natural populations and the mode of selection operating on them are not well understood.

In a recent study of nucleotide variation spanning the proximal In(2L)t breakpoint, ANDOLFATTO et al. (1999) found that the inversion has a recent origin relative to standard lineages. The authors also noted that standard chromosomes exhibit a several hundred-base pair window of elevated nucleotide polymorphism directly spanning the In(2L)t breakpoint site. Interestingly, most of the nucleotide variation in this window is distributed between, and not within, two deeply diverged standard haplotype classes. A much larger (~2 kb) DNA interval, which includes this window of elevated polymorphism, revealed little evidence for recombination despite the large number of intermediate frequency (i.e., informative) polymorphic sites.

A departure from the neutral prediction for the number of haplotypes can arise as a result of selection or demographic shifts. In particular, a reduction in the number of haplotypes in a sample is expected under models of balancing selection or population subdivision (STROBECK 1987). Similar patterns are expected for partial selective sweeps (MAYNARD-SMITH and HAIGH 1974; KAPLAN et al. 1989; BRAVERMAN et al. 1995; HUDSON et al. 1997) or traffic models (KIRBY and STEPHAN 1996). Under a neutral equilibrium model with recombination, the distribution of the expected number of haplotypes in a population sample can be determined by coalescent simulation (HUDSON 1990; FU 1996). ANDOLFATTO et al. (1999) introduce a test of the neutral equilibrium model with recombination (based on STROBECK 1987) to detect subregions of a data set with unusual haplotype structure. A large window of polymorphisms spanning the In(2L)t breakpoint was shown to have fewer haplotypes than expected under the neutral model. This pattern is apparent whether In(2L)t chromosomes are included in the analysis or not.

Similar haplotype deficiencies have recently been reported at Sod, vermilion, Fbp2, and Su(H) in D. melanogaster (HUDSON et al. 1994; BEGUN and AQUADRO 1995; BENASSI et al. 1999; DEPAULIS et al. 1999) as well as at the Pgd, runt, G6pd, and vermilion loci in D. simulans (BEGUN and AQUADRO 1994; HAMBLIN and VEUILLE 1999; LABATE et al. 1999). One explanation is that the recent expansion of African populations of D. melanogaster and D. simulans to other parts of the world (DAVID and CAPY 1988; LACHAISE et al. 1988) resulted in the deficiency in haplotype and nucleotide diversity observed in non-African populations (HALE and SINGH 1991; BEGUN and AQUADRO 1993, 1994, 1995). However, a characteristic feature of demographic shifts is that their signature is expected over the whole genome rather than localized to any particular locus. Interestingly, polymorphic sites further away from Sod, Su(H), and the In(2L)t breakpoint in D. melanogaster reveal more recombination (BENASSI et al. 1993; HUDSON et al. 1997; ANDOLFATTO et al. 1999), consistent with expectations under certain selection models (KAPLAN et al. 1989; BRAVERMAN et al. 1995; KIRBY and STEPHAN 1996; HUDSON et al. 1997).

Distinguishing between selective and demographic explanations for patterns of nucleotide diversity at a particular locus on the basis of data from a single sample is difficult. Here, we expand the polymorphism data set of ANDOLFATTO et al. (1999) to include samples from three additional geographically diverse populations of D. melanogaster and one population of D. simulans. Our study focuses on a 1-kb region spanning the proximal breakpoint of In(2L)t that exhibits both a window of elevated polymorphism and strong linkage disequilibrium among sites in the Costa Rican sample. Data on the geographic distribution of variation at this locus in D. melanogaster and comparisons with D. simulans may help us distinguish among evolutionary scenarios. If the strong linkage disequilibrium observed in this population is due to ancient balancing or epistatic selection, we may expect to see similar haplotype structure in all D. melanogaster populations and, potentially, in D. simulans. We may also expect to see a number of trans-specific polymorphisms. Alternatively, if the pattern in Costa Rica is the result of a recent contraction in population size as suggested by data from the X chromosome (BEGUN and AQUADRO 1993), we expect to observe reduced nucleotide variation in non-African samples relative to African samples. Finally, if the unusual haplotype structure in the Costa Rican D. melanogaster sample is due to the recent increase in In(2L)t's frequency (ANDOLFATTO et al. 1999), we would not expect a similar pattern in a population sample of D. simulans (which lacks In(2L)t).


*  MATERIALS AND METHODS

Population samples and sequencing:
The isolation of the In(2L)t proximal breakpoint (34A8–9 on the cytological map) and the collection of polymorphism data from a San Jose, Costa Rica D. melanogaster population sample are described by ANDOLFATTO et al. (1999). Three additional population samples are from Florida City, Florida; Yeppoon, Australia; and Zimbabwe, Africa, and were chosen primarily because they are geographically diverse. Only standard alleles are sampled in this study; In(2L)t alleles sampled from all four populations are described in ANDOLFATTO et al. (1999).

Genomic DNA was prepared from wild-caught females from the Florida City population. Individuals were karyotyped by PCR with the use of standard and In(2L)t-specific primer pairs (ANDOLFATTO et al. 1999). For Yeppoon and Zimbabwe (which included individuals from both Harare and Sengwa), we chose one In(2L)t heterozygote male per isofemale line (kindly provided by C.-I. Wu). A 1-kb segment spanning the inversion breakpoint site was PCR amplified from D. melanogaster In(2L)t heterozygotes, using standard-specific primers (see Fig 1). To obtain alleles from a D. simulans population (Arena Farms, Maryland), the following cross was carried out: multiple males from each isofemale line were crossed to virgin female In(2L)t homozygotes of D. melanogaster. The resulting hybrid progeny (all female) were heterozygous for In(2L)t. This allowed the recovery of individual D. simulans alleles by PCR with standard arrangement-specific primers (one individual per isofemale line).



Figure 1. PCR sampling strategy of a 1-kb region of the proximal breakpoint region from In(2L)t heterozygotes. The cytological position of this region (C/D) is ~34A8–9 on chromosome 2L. Standard-specific primer pairs were used to PCR amplify standard alleles from individual inversion heterozygotes.

Polyethylene glycol (PEG)-precipitated templates were directly sequenced on both strands using a dRhodamine Terminator Cycle sequencing kit (Applied Biosystems, Foster City, CA) and run on an ABI377XL Automated Sequencer. Sequences were analyzed with ABI Sequence Analysis v3.0 software; contigs were managed with Sequencher v3.0 software. Sequences collected in this study have been deposited into GenBank under accession nos. AF217926–AF217949. Intraspecific alignments have been deposited into the EMBL database (ftp://ftp.ebi.ac.uk/pub/databases/embl/align/) under accession nos. DS41064–DS41065.

Polymorphism analyses:
Although we report all segregating polymorphisms (Fig 2 and Fig 3), we have restricted our analyses to two-state single-nucleotide polymorphisms and insertion-deletions within each population (see Table 1). The neutral mutation parameter θ = 4Neµ, where Ne is the effective population size of the species and µ is the neutral mutation rate, is estimated from both π, the average pairwise difference per base pair (TAJIMA 1983), and S, the number of polymorphic sites in the sample (θw; WATTERSON 1975). Certain analyses of polymorphism and divergence were performed with DnaSP v3.0 software (ROZAS and ROZAS 1999).
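The two estimators of the neutral mutation parameter described above can be illustrated with a minimal sketch. It assumes an ungapped alignment given as equal-length strings; the function names and the toy alignment are hypothetical and are not taken from DnaSP or from the data of this study.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Return the column indices that carry exactly two states (biallelic sites)."""
    return [i for i in range(len(seqs[0])) if len({s[i] for s in seqs}) == 2]


def pi_per_site(seqs):
    """Average number of pairwise differences per base pair (Tajima 1983)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs) / len(seqs[0])


def theta_w_per_site(seqs):
    """Watterson's (1975) estimator: S divided by a1 = sum_{i=1}^{n-1} 1/i, per base pair."""
    n = len(seqs)
    a1 = sum(1.0 / i for i in range(1, n))
    return len(segregating_sites(seqs)) / a1 / len(seqs[0])


# Toy alignment (hypothetical): 4 chromosomes, 4 sites, 2 of them segregating.
toy = ["AAGT", "AAGT", "ACGT", "ACGA"]
print(pi_per_site(toy), theta_w_per_site(toy))
```

Under neutral equilibrium both quantities estimate the same parameter, so a large difference between them in one direction or the other is what Tajima's D summarizes.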



Figure 2. Summary of polymorphic variation found in four population samples of D. melanogaster (cr, Costa Rica; fc, Florida City; y, Yeppoon; zh, Harare, Zimbabwe; zs, Sengwa, Zimbabwe). All mutations have been polarized using D. simulans as an outgroup; the reference sequence represents the ancestral state where possible (those sites with ambiguous polarity are indicated with an asterisk). M denotes complex mutations; i and d are insertions and deletions of the length indicated, respectively. Sites with more than two states or that overlap with deletions were excluded from analyses. The black bar indicates the position of the In(2L)t breakpoint, where inverted chromosomes are fixed for a 94-bp deletion. Polymorphism number 1 (nucleotide position 5) corresponds to polymorphism 76 (nucleotide position 2012) in Fig 2 of ANDOLFATTO et al. (1999). Polymorphic sites 5, 77, and 83 were previously interpreted as fixed differences between karyotypes in the study of ANDOLFATTO et al. (1999). Polymorphic site numbers 9, 64, 76, and 77 (boxed columns) are shared with D. simulans (Fig 3).



Figure 3. Summary of polymorphic variation found for the In(2L)t proximal breakpoint site homologue in a Maryland population of D. simulans. All mutations have been polarized using D. melanogaster as an outgroup; the reference sequence represents the ancestral state where possible (those sites with ambiguous polarity are indicated with an asterisk). The black bar represents the approximate position of the In(2L)t breakpoint in D. melanogaster. Sites with more than two states or that overlap with deletions were excluded from analyses. Polymorphic sites 6, 57, 68, and 69 (boxed columns) are shared with D. melanogaster (Fig 2).


 
Table 1. Summary information for D. melanogaster and D. simulans population samples

The population recombination rate, C = 4Ner, where r is the recombination rate per base pair per generation, is estimated in three ways. A lower bound for C is based on the minimum number of inferred recombination events in the history of a sample (RM; HUDSON and KAPLAN 1985). Cmin is defined as the highest value of C such that <2.5% of simulated data sets have RM or more inferred recombination events. This estimate is not strictly conservative when used in our haplotype test (see below) but is useful since it represents a lower bound for the recombination rate (HUDSON and KAPLAN 1985; WALL 1999). To estimate Cmin, coalescent simulations were carried out for a neutral panmictic population conditional on the sample size, n, the number of segregating sites, S, and the population recombination rate, C (HUDSON 1993). A second estimate, Chud, is an estimate of the expected population recombination rate and is obtained from polymorphism data by the method of HUDSON (1987). Chud is not employed in statistical tests. A third estimate of C is Clab = 4Neρ(1 - 2q(1 - q)), where ρ is the estimate of rates of crossing over per base pair per generation for the cytological band 34A based on laboratory crosses (COMERON et al. 1999) and q is the estimated frequency of In(2L)t; Ne is taken to be 10^6 (KREITMAN 1983). Laboratory estimates of recombination (ρ) based on the exchange of distant flanking markers interpolated to the intragenic scale (i.e., several kilobases) are likely to be underestimates of the true rate of exchange (r), since they ignore the added contribution of gene conversion (ANDOLFATTO and NORDBORG 1998). Clab is not conservative in statistical tests but has the advantage of being independent of the sampled data and is our best a priori guess at the true population recombination rate in the chromosomal region studied.
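As a rough illustration of the Clab formula, the sketch below evaluates 4Neρ(1 - 2q(1 - q)) for a given inversion frequency q. The factor 1 - 2q(1 - q) is one minus the expected frequency of inversion heterozygotes, in which recombination is assumed to be absent. The function name and the optional `length_bp` scaling are assumptions made for this example; the default inputs are the Ne and ρ values quoted in the text.

```python
def c_lab(q, ne=1e6, rho=1.47e-8, length_bp=1):
    """Clab = 4 * Ne * rho * (1 - 2q(1 - q)).

    q         : frequency of In(2L)t in the population
    ne        : effective population size (10^6 in the text)
    rho       : laboratory crossing-over rate per bp per generation (band 34A)
    length_bp : hypothetical convenience factor to scale from per-bp to a region
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be a frequency between 0 and 1")
    return 4.0 * ne * rho * (1.0 - 2.0 * q * (1.0 - q)) * length_bp


# With no inversion the full rate applies; at q = 0.5 the rate is halved.
print(c_lab(0.0), c_lab(0.5))
```

The reduction is largest at q = 0.5, where half of all individuals are expected to be inversion heterozygotes.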

Statistical tests of neutral equilibrium and panmictic population models:
Tajima's D statistic (TAJIMA 1989) is used to characterize the skew in the frequency distribution of segregating mutations in our samples. To test for geographic differentiation between population samples of standard chromosomes, we use permutation tests described by HUDSON et al. (1992a). Differentiation between populations for haplotype frequencies is measured by the statistic χ² (NEI 1987, p. 110). The statistic K* (HUDSON et al. 1992a) is based on the frequencies of individual segregating sites in two (or more) populations. These two statistics were the most powerful under all parameters considered in HUDSON et al. (1992a). For each test, 100,000 permutations of the data were carried out. We report the one-tailed probability that the two samples were drawn from a single panmictic population. A program to perform these tests was kindly provided by R. Hudson. For comparisons to earlier studies, we also report Fst (HUDSON et al. 1992b) although significance levels are not assessed for this statistic.
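For readers unfamiliar with Tajima's D, the following sketch computes the statistic from the sample size n, the number of segregating sites S, and the mean number of pairwise differences k (total over the region, not per site), using the standard constants of TAJIMA (1989). It is an illustration only; significance in this study is assessed by coalescent simulation, which is not reproduced here, and the function name is hypothetical.

```python
from math import sqrt


def tajimas_d(n, s, k):
    """Tajima's D for n chromosomes, s segregating sites, and mean pairwise difference k."""
    if n < 4 or s < 1:
        # For n = 3 the variance term is zero, so D is undefined.
        raise ValueError("need at least 4 chromosomes and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Five singletons in a sample of 10 give k = 5 * (2 / 10) = 1.0 and a negative D.
print(tajimas_d(10, 5, 1.0))
```

Positive values indicate an excess of intermediate-frequency variants and negative values an excess of rare variants.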

We use the haplotype test of ANDOLFATTO et al. (1999) to detect deviations from the neutral model in the number of observed haplotypes in our population samples. Given a polymorphism data set with n chromosomes and S segregating sites, we define Sk to be the largest number of consecutive segregating sites that contain only k different haplotypes (1 < k < n). An empirical distribution of Sk is determined from 10,000 simulations using an infinite-sites, panmictic coalescent model conditional on n, S, and C (HUDSON 1993). We then calculate the proportion, pk, of simulated data sets that contain at least one stretch of Sk consecutive segregating sites having k or fewer haplotypes. This is equivalent to calculating the proportion of simulated data sets that have Sk greater than or equal to Sk observed in the data. Since choosing any particular value of k is arbitrary, we correct for the implicit multiple tests involved. This corrected P value is determined from further coalescent simulations that compare the actual smallest pk value with simulated smallest pk values. For D. melanogaster populations, haplotype tests were performed on constructed random samples based on the population's frequency of In(2L)t (ANDOLFATTO et al. 1999). The presence of an inversion in our D. melanogaster samples makes estimators of C difficult to interpret. For this reason, the robustness of our results is tested over a wide range of values for C.


*  RESULTS

Polymorphism and recombination among D. melanogaster samples:
The sampled region spans the proximal In(2L)t breakpoint (Fig 1) and includes 840 bp from region C (within the inverted region) and 160 bp from region D (outside the inverted region). The inversion breakpoint is located between positions 840 and 934 where all sampled In(2L)t chromosomes are fixed for a 94-bp deletion (ANDOLFATTO et al. 1999). Nucleotide and insertion-deletion variation found in four population samples of D. melanogaster standard chromosomes is summarized in Fig 2. Our estimates of π, θ, and C/θ (Table 1) include all biallelic variation in a given population sample (see MATERIALS AND METHODS). Two polymorphisms detected in the Florida City sample (polymorphic sites 77 and 83) as well as one in the Yeppoon sample (polymorphic site 5) were previously interpreted as fixed differences between standard and In(2L)t lineages (ANDOLFATTO et al. 1999). Thus, since they are fixed in the In(2L)t class, these polymorphic sites are at intermediate frequencies in all populations. Polymorphic site 77 is a trans-specific polymorphism (see below).

Two features of the data differ between population samples of D. melanogaster (Fig 2). First, the Costa Rican and Miami samples appear to have stronger linkage disequilibrium among sites than Yeppoon and Zimbabwe samples. However, in contrast to previous studies for X-linked loci in various populations of D. melanogaster (BEGUN and AQUADRO 1993, 1994, 1995), there is no evidence for an African/non-African difference in levels of nucleotide diversity (Table 1). Second, the frequency spectrum of polymorphisms is sharply skewed toward intermediate frequency mutations in the Costa Rican sample (Table 1). This skew is much less dramatic for the Florida City and Yeppoon samples and is in the direction of an excess of rare polymorphisms in the Zimbabwe sample. Tajima's D is significantly positively skewed for the Costa Rican sample for all C ≥ 0 [constructed random samples (CRS), Table 1]. However, the interpretation of this P value is difficult since this sampled region was preselected because it appeared unusual in this population. Tajima's D was not significantly skewed for CRS of any other population when 0 ≤ C ≤ Clab.

Estimates of the population recombination rate also differ among samples. Under neutral equilibrium assumptions, the ratio Chud/θw (Table 1) is an estimate of the expected number of recombination events per mutation in the sample. An independent measure of this quantity based on laboratory estimates of the recombination rate in this chromosomal region (ρ = 1.47 × 10^-8 per base pair per generation; COMERON et al. 1999) and the neutral mutation rate (µ ~1.6–3.0 × 10^-9 per site per generation, assuming 10 generations per year; HARADA et al. 1993; LI 1997) yields a ratio (ρ/µ) of ~4.5–9.0. If we assume an In(2L)t frequency of 50% and no recombination in inversion heterozygotes, this range becomes ~2.3–4.5. While the Zimbabwe and Yeppoon samples of standard chromosomes are roughly in agreement with these estimated ranges, Costa Rica and Florida City samples yield a Chud/θw ratio >10-fold smaller than expected (Table 1). Since Chud is a summary of the amount of linkage disequilibrium in a sample (HUDSON 1987) and population samples have similar estimates of θw, lower than expected Chud/θw ratios reflect stronger linkage disequilibrium in the American samples relative to Yeppoon and Zimbabwe samples. The Chud/θw ratios are uniformly low in CRS (Table 1), likely due to the linkage disequilibrium introduced by In(2L)t chromosomes.

Geographic differentiation between population samples of standard chromosomes:
Pairwise Fst estimates (Table 2, top right) suggest that differentiation between populations is low. Permutation tests described by HUDSON et al. (1992a) were conducted on standard chromosome samples to test a panmictic population model (Table 2). In pairwise comparisons, the χ² statistic reveals haplotype differentiation between Costa Rica and all other populations. No other significant differentiation between haplotypes (χ²) is detected in pairwise comparisons of populations. Two of six tests based on site frequencies (K*) have P values near the 0.05 level (Zimbabwe/Costa Rica and Zimbabwe/Florida City). Caution should be exercised in interpreting the P values in Table 2 since they are not corrected for multiple tests. However, these tests do suggest haplotype differentiation between standard chromosomes of Costa Rican and other populations.
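The logic of a permutation test for differentiation can be sketched as below. Note that this example substitutes a different statistic for those used in the study: it uses Fst in the form 1 - Hw/Hb (mean within-population over mean between-population pairwise differences, after HUDSON et al. 1992b) rather than χ² or K*, and, unlike the analysis reported here, attaches a permutation P value to it. All function names and the toy samples are hypothetical; each sample needs at least two sequences.

```python
import random
from itertools import combinations


def n_diff(x, y):
    """Number of sites at which two aligned sequences differ."""
    return sum(a != b for a, b in zip(x, y))


def hudson_fst(pop1, pop2):
    """Fst = 1 - Hw/Hb for two samples of aligned sequences."""
    within = []
    for pop in (pop1, pop2):
        pairs = list(combinations(pop, 2))
        within.append(sum(n_diff(x, y) for x, y in pairs) / len(pairs))
    hw = sum(within) / 2.0
    hb = sum(n_diff(x, y) for x in pop1 for y in pop2) / (len(pop1) * len(pop2))
    return 1.0 - hw / hb if hb > 0 else 0.0


def permutation_p(pop1, pop2, n_perm=1000, seed=0):
    """One-tailed P: fraction of label permutations with Fst >= the observed Fst."""
    rng = random.Random(seed)
    observed = hudson_fst(pop1, pop2)
    pooled = list(pop1) + list(pop2)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if hudson_fst(pooled[:len(pop1)], pooled[len(pop1):]) >= observed:
            hits += 1
    return hits / n_perm


# Toy samples (hypothetical): completely differentiated populations.
pop_a = ["AA", "AA", "AA"]
pop_b = ["TT", "TT", "TT"]
print(hudson_fst(pop_a, pop_b), permutation_p(pop_a, pop_b))
```

Because chromosomes are reassigned to populations at random, the permutation distribution reflects what would be expected if both samples were drawn from a single panmictic population.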


 
Table 2. Geographic differentiation among standard chromosomes in D. melanogaster

Polymorphism patterns in a D. simulans population sample:
Fig 3 summarizes polymorphism data for the In(2L)t proximal breakpoint homologue in a North American population of D. simulans. Estimates of π, θ, and C/θ are given in Table 1. In agreement with previous data from nucleotide variation in D. melanogaster and D. simulans (reviewed in MORIYAMA and POWELL 1996), estimates of θ from π and S are approximately twofold larger in D. simulans than in D. melanogaster (Fig 2). Unexpectedly, strong associations among polymorphic sites are observed in the D. simulans data set, similar to those seen in North American D. melanogaster samples. The D. simulans data set contains only three detected recombination events (by a four gamete test; HUDSON and KAPLAN 1985), despite a large number of informative sites. Tajima's D (Table 1) is positive but not significant under conservative estimates of recombination. Levels of nucleotide diversity in D. simulans show marked variation across the sequenced region (Fig 4A). As with D. melanogaster, a window spanning the In(2L)t breakpoint site homologue in D. simulans exhibits much higher levels of nucleotide diversity than average (~1.6 and 3.3% for silent sites in D. melanogaster and D. simulans, respectively; MORIYAMA and POWELL 1996).
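The four gamete test mentioned above can be sketched as follows. Under the infinite-sites model, a pair of biallelic sites showing all four two-site gametes implies at least one recombination event between them; counting the largest set of such intervals that do not overlap gives a lower bound in the spirit of RM (HUDSON and KAPLAN 1985). The sketch assumes haplotypes coded as strings over biallelic segregating sites; the function names and toy data are hypothetical.

```python
from itertools import combinations


def has_four_gametes(haps, i, j):
    """True if sites i and j show all four two-site gametes in the sample."""
    return len({(h[i], h[j]) for h in haps}) == 4


def min_recombination_events(haps):
    """Lower bound on recombination events: maximum number of disjoint incompatible intervals."""
    n_sites = len(haps[0])
    intervals = [(i, j) for i, j in combinations(range(n_sites), 2)
                 if has_four_gametes(haps, i, j)]
    # Greedy selection by right endpoint yields the largest set of disjoint intervals.
    intervals.sort(key=lambda ij: ij[1])
    count, last_end = 0, -1
    for i, j in intervals:
        if i >= last_end:  # intervals may share an endpoint site
            count += 1
            last_end = j
    return count


# Toy data (hypothetical): every pair of sites is incompatible.
toy_haps = ["000", "011", "101", "110"]
print(min_recombination_events(toy_haps))
```

In the toy data the intervals (0, 1) and (1, 2) are disjoint, so at least two events are inferred, even though three site pairs fail the test.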




Figure 4. (A) Sliding window of nucleotide diversity (π) across the sequenced region in D. melanogaster (dotted line) and D. simulans (solid line). All windows have an equal number of sites; the window size (excluding gaps) is 100 bp and the increment 10%. The approximate position of the In(2L)t breakpoint deletion is indicated by the black bar (positions 840–934, Fig 2). Nucleotide positions do not correspond exactly to those in Fig 2 and Fig 3. (B and C) Average pairwise divergence between major haplotype classes of D. melanogaster and D. simulans vs. interspecific divergence (dotted line). Haplotype classes were arbitrarily defined: D. melanogaster class 1 alleles are cr30, cr08, cr47, cr52, cr38, and cr66 (Fig 2); D. simulans class 1 alleles are ar02, ar03, ar05, and ar12 (Fig 3). Dxy and Da are average pairwise divergence and net divergence, respectively (cf. NEI 1987).
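A sliding-window profile of this kind can be sketched as below. The sketch assumes an alignment given as equal-length strings with '-' for gaps, drops every column containing a gap so that all windows hold the same number of sites, and reads "increment 10%" as a step of one-tenth of the window (10 sites for a 100-bp window); that reading, the function name, and the toy alignment are assumptions of this example.

```python
from itertools import combinations


def sliding_pi(seqs, window=100, step=10):
    """Return (midpoint column, pi per site) for windows over the ungapped columns."""
    cols = [i for i in range(len(seqs[0])) if all(s[i] != "-" for s in seqs)]
    pairs = list(combinations(range(len(seqs)), 2))
    # Mean pairwise difference at each retained column.
    col_pi = [sum(seqs[a][i] != seqs[b][i] for a, b in pairs) / len(pairs)
              for i in cols]
    profile = []
    for start in range(0, len(cols) - window + 1, step):
        midpoint = cols[start + window // 2]
        profile.append((midpoint, sum(col_pi[start:start + window]) / window))
    return profile


# Toy alignment (hypothetical) with one gapped column, using a 2-site window.
toy = ["AAAA-A", "ATAA-A"]
print(sliding_pi(toy, window=2, step=1))
```

Reporting the midpoint in original alignment coordinates keeps the profile comparable with the position of the breakpoint, although gap removal means positions are only approximate.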

Divergence between haplotypes matches interspecific divergence:
There is evidence that the major haplotype classes in both species are old relative to divergence between species. Fig 4, B and C, show levels of divergence between two "haplotype classes" of the Costa Rican sample of D. melanogaster and our sample of D. simulans. Uncertainty in the alignment between the two species, especially near the breakpoint site, makes a quantitative assessment of divergence (and shared polymorphisms) difficult. However, a tentative alignment between D. melanogaster and D. simulans sequences reveals (qualitatively) that the breakpoint region has elevated interspecific divergence (Fig 4, B and C). Strikingly, average pairwise divergence between the two haplotype classes in both species matches or exceeds the estimated divergence between species near the In(2L)t breakpoint site. These observations can only be considered qualitative both because the alignment between D. melanogaster and D. simulans is poor and because the assignment of alleles to haplotype classes is arbitrary.
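The two divergence measures plotted in Fig 4 can be illustrated with a short sketch: Dxy is the average number of differences per site between alleles drawn from different classes, and net divergence Da subtracts the average within-class diversity, Da = Dxy - (Dx + Dy)/2 (cf. NEI 1987). The sketch assumes ungapped aligned sequences and at least two alleles per class; the function names and toy classes are hypothetical.

```python
from itertools import combinations


def per_site_diff(x, y):
    """Proportion of sites at which two aligned sequences differ."""
    return sum(a != b for a, b in zip(x, y)) / len(x)


def within_diversity(group):
    """Average pairwise difference per site within one haplotype class."""
    pairs = list(combinations(group, 2))
    return sum(per_site_diff(x, y) for x, y in pairs) / len(pairs)


def dxy(group1, group2):
    """Average pairwise difference per site between two haplotype classes."""
    total = sum(per_site_diff(x, y) for x in group1 for y in group2)
    return total / (len(group1) * len(group2))


def net_divergence(group1, group2):
    """Da = Dxy - (Dx + Dy) / 2."""
    return dxy(group1, group2) - (within_diversity(group1) + within_diversity(group2)) / 2.0


# Toy haplotype classes (hypothetical).
class1 = ["AAAA", "AAAT"]
class2 = ["TTAA", "TTAT"]
print(dxy(class1, class2), net_divergence(class1, class2))
```

Because Da removes variation segregating within classes, it is the more appropriate quantity to set against interspecific divergence when classes are themselves polymorphic.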

Shared polymorphisms:
Shaded columns in Fig 2 and Fig 3 show the positions of four trans-specific polymorphisms (sites 9, 64, 76, and 77 in Fig 2; sites 6, 57, 68, and 69 in Fig 3). All are at intermediate frequency (i.e., sampled more than once) in D. simulans and CRS of D. melanogaster. In D. melanogaster, two of the four polymorphisms (9 and 76, Fig 2) differentiate two major standard haplotype classes in the Costa Rican population (AGAG and GKGG). Although sampled only once in the Florida City sample, polymorphism 77 is fixed in In(2L)t chromosomes and forms an inversion-specific haplotype (AGAA; ANDOLFATTO et al. 1999). In D. simulans, two of the four polymorphisms (68 and 69, Fig 3) define two major haplotype classes in the D. simulans sample (GGAA and RKGG). The haplotypes formed by these polymorphisms are to some extent trans-specific. The ATGG haplotype, sampled twice in the D. simulans sample (Fig 3), is also found in the Florida City, Yeppoon, and Zimbabwe samples of D. melanogaster. The GGGG haplotype is found at intermediate frequency in the D. simulans sample and all D. melanogaster samples. One of these haplotypes (i.e., ATGG or GGGG) may be ancestral.

Analysis of haplotype structure:
We use the haplotype test of ANDOLFATTO et al. (1999) to determine whether haplotype structure in our D. simulans sample is unusual under a neutral equilibrium model. The data depart from the neutral model when C = Cmin (P = 0.029, see Table 3). P values were significant for all simulations with C ≥ 0 (P < 0.035) and P decreased monotonically for C > Cmin. For comparison, we constructed CRS for each D. melanogaster population based on its estimated In(2L)t frequencies: 20.8% for Costa Rica, 25.0% for Florida City, 22.9% for Yeppoon, and 58.2% for Zimbabwe (ANDOLFATTO et al. 1999). The neutral model was rejected (Table 3) for both the Costa Rican and Florida City populations when C = Cmin; P = 0.012 and P = 0.043, respectively. For both populations, the neutral model was rejected in all simulations when C ≥ 0 (maximum probabilities: Costa Rica, P < 0.015; Florida City, P < 0.046). Similar tests on CRS for Yeppoon and Zimbabwe (Table 3) do not reject the neutral model when C = Cmin. When we condition simulations on C = Clab, all probabilities become much lower; the null model is rejected for the Yeppoon sample assuming C = Clab (P = 0.0175).


 
Table 3. Summary of haplotype tests on D. melanogaster and D. simulans populations


*  DISCUSSION

Elevated polymorphism at the In(2L)t breakpoint site among standard chromosomes:
Sliding-window analyses of polymorphism in the D. melanogaster (Costa Rica) and D. simulans samples reveal unusually high levels of nucleotide diversity near the In(2L)t breakpoint site (Fig 4A). This pattern is reminiscent of the predicted signature of balancing selection (HUDSON and KAPLAN 1988; KREITMAN and HUDSON 1991). A plausible alternative to balancing selection is simply that different DNA regions differ either in mutation rates or in levels of selective constraint. Variation in mutation rates across the sequence is unlikely given the observation that segregating mutations on In(2L)t chromosomes (not shown here) do not cluster near the inversion breakpoint (ANDOLFATTO et al. 1999).

An argument in favor of heterogeneity in selective constraint across the region is that species polymorphism and divergence appear to be coupled (Fig 4). Indeed, the region immediately spanning the breakpoint appears to have elevated divergence as well as elevated levels of polymorphism (but seemingly lower divergence than that observed between major haplotype classes within each species). Comparisons of polymorphism and divergence for the In(2L)t proximal breakpoint with other loci (i.e., the HKA test of HUDSON et al. 1987) suggest that heterogeneity in constraint is a sufficient explanation for diversity levels in both species (results not shown). In addition, the analysis of a larger region surrounding the breakpoint (8.4 kbp) revealed at least two candidate exons (ANDOLFATTO et al. 1999). The orientation of these putative exons is consistent with the presence of an additional exon or regulatory region in the 5' region of the 1-kb region investigated here.

While heterogeneity in levels of constraint may be sufficient to explain levels of nucleotide diversity in the In(2L)t breakpoint region, it cannot explain the unusual distribution of this variation among haplotypes. This feature of the data is difficult to reconcile with a neutral equilibrium model given the rate of recombination expected in this chromosomal region (i.e., Clab; see Table 3). In addition, the several trans-specific polymorphisms in this region tend to fall on the deepest branches of genealogies in samples from both species. This observation can be taken as evidence that some component of this elevated window of nucleotide diversity surrounding the breakpoint is due to the long persistence of polymorphisms rather than simply a higher substitution rate (as suggested by the elevated divergence).

Geographic patterns and haplotype structure in D. melanogaster:
We detect significant geographic differentiation of standard haplotypes between populations of D. melanogaster despite low values of Fst (Table 2). The simplest explanation for the marked reduction in haplotype diversity in American samples is the recent expansion of African populations into the Americas (DAVID and CAPY 1988; LACHAISE et al. 1988) resulting in founder effects or the recent admixture of previously subdivided populations. Several lines of evidence conflict with these simple demographic explanations for the D. melanogaster data. First, recent data from microsatellite loci (IRVIN et al. 1998) offer no evidence for recent bottlenecks in non-African populations of D. melanogaster. Second, the four populations sampled do not differ from each other in levels of nucleotide diversity at the In(2L)t breakpoint (Table 2). When the presence of In(2L)t in each population is accounted for (CRS, Table 1), diversity is actually lowest in the African population. Several other autosomal loci show similar patterns (Gld, HAMBLIN and AQUADRO 1997; Acp26A, TSAUR et al. 1998; Adh, S. C. TSAUR, unpublished results; Tra, R. KULATHINAL, personal communication; but see AGUADÉ 1998, 1999). Thus, the apparent split between African and non-African populations reported for X-linked genes (BEGUN and AQUADRO 1993–1995) does not seem to generalize to the autosomes. Third, the three non-African populations of D. melanogaster sampled for the In(2L)t breakpoint show distinctly different patterns of polymorphism (see Table 2). For example, Florida City and Yeppoon samples, in contrast to Costa Rica, reveal considerably more evidence for recombination and have less extreme values of Tajima's D. Finally, while both reduced haplotype diversity and a skew toward high frequency variants are observed in the Costa Rica data set, a survey of a larger region surrounding the In(2L)t breakpoint revealed lower frequency polymorphisms and more evidence for recombination (ANDOLFATTO et al. 1999).
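
For readers unfamiliar with Tajima's D, the following is a minimal sketch of the statistic as defined by TAJIMA 1989, computed from the sample size n, the number of segregating sites S, and the mean number of pairwise differences k. Positive values indicate an excess of intermediate-frequency variants and negative values an excess of rare variants. The function names and toy sequences are hypothetical; this is not the software used for the analyses reported here.

```python
from itertools import combinations
from math import sqrt

def summarize(seqs):
    """Return (S, k): segregating sites and mean pairwise differences."""
    n = len(seqs)
    length = len(seqs[0])
    S = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    total = sum(sum(1 for x, y in zip(a, b) if x != y)
                for a, b in combinations(seqs, 2))
    k = total / (n * (n - 1) / 2)
    return S, k

def tajimas_d(n, S, k):
    """Tajima's D from sample size n, segregating sites S, and k."""
    if S == 0:
        raise ValueError("D is undefined when there are no segregating sites")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    return (k - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))

# Hypothetical toy alignment (not data from the study).
toy = ["AAAA", "AAAT", "AATT", "AAAA"]
S_toy, k_toy = summarize(toy)
d_toy = tajimas_d(len(toy), S_toy, k_toy)
```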

Thus, whether considering more populations, loci, or sites, one is led to some feature of the data that makes simple demographic explanations for the D. melanogaster data unlikely. It could be argued that, under demographic models with intermediate levels of recombination, a large variance in patterns of polymorphism is expected after a bottleneck (R. HUDSON, personal communication). All demographic models, however, predict a reduction in the expected genome-wide level of polymorphism in bottlenecked populations. This pattern is not generally observed in the available autosomal data for D. melanogaster.
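
The expectation of reduced polymorphism after a bottleneck can be illustrated with the standard result for loss of heterozygosity by drift in a Wright-Fisher population, H_t = H_0 (1 - 1/(2N))^t. The sketch below is a deliberately simplified illustration that ignores new mutation and migration; it is not a model fitted in this study, and the parameter values are hypothetical.

```python
def expected_heterozygosity(h0, pop_size, generations):
    """Expected heterozygosity after drift in a diploid population.

    Uses H_t = H_0 * (1 - 1/(2N))**t, assuming constant size N and
    no input of new variation by mutation or migration.
    """
    return h0 * (1.0 - 1.0 / (2.0 * pop_size)) ** generations

# Hypothetical values: a bottleneck of 50 individuals for 100 generations.
before = 0.01
after = expected_heterozygosity(before, pop_size=50, generations=100)
```

Because the loss applies to every neutral locus, the reduction is expected genome-wide on average, even though individual loci may vary widely around that expectation.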

Comparing patterns in D. melanogaster and D. simulans:
The observation of fewer haplotypes than expected for the In(2L)t proximal breakpoint homologue in D. simulans could again be taken as evidence for a recent range expansion of African populations (as for D. melanogaster above). For example, data for vermilion and G6pd loci show evidence for reduced haplotype diversity in some non-African populations of D. simulans relative to African populations (HAMBLIN and VEUILLE 1999). It has been repeatedly suggested that the data in D. simulans reflect recent population admixture (HASSON et al. 1998; HAMBLIN and VEUILLE 1999; LABATE et al. 1999).

An increasing number of reports of unusual haplotype structure in population samples of both D. melanogaster and D. simulans point to simple explanations, such as those based on the demographic histories of these species. However, while D. melanogaster and D. simulans may have certain similarities in their demographic histories (i.e., a possible recent expansion from Africa), it seems unlikely that these histories will be similar enough to produce identical patterns at any particular locus under investigation. While multiple loci reveal evidence for geographic differentiation in both D. melanogaster and D. simulans (e.g., HALE and SINGH 1991; BEGUN and AQUADRO 1993–1995; HAMBLIN and VEUILLE 1999), the pattern of haplotype structure varies from locus to locus. For example, while some loci (e.g., Pgd, this study) show a nonneutral deficiency of haplotypes in North American populations of D. simulans, others (e.g., vermilion, Gld) do not (BEGUN and AQUADRO 1995; HAMBLIN and AQUADRO 1996). Also, in contrast to the pattern reported here, unusual patterns at these loci are not trans-specific. For example, the vermilion locus shows a reduction in nucleotide diversity and number of haplotypes in a North American population of D. melanogaster but not D. simulans (BEGUN and AQUADRO 1995).

Selection-based explanations for the unusual linkage disequilibrium patterns observed at the In(2L)t breakpoint site should also be entertained. The expansion of African populations of D. melanogaster and D. simulans into more temperate climates may have been accompanied by selection at many loci. In regions of intermediate recombination, this could lead to considerable heterogeneity in haplotype structure both among loci and among populations. This is a reasonable interpretation of the pattern seen in different populations of D. melanogaster at the In(2L)t breakpoint. It is also possible that the elevated nucleotide diversity, deficiency of haplotypes, and trans-specific polymorphisms that differentiate major haplotype classes are the result of long-standing epistatic interactions. Evidence for selective constraints and putative exons near the In(2L)t breakpoint (ANDOLFATTO et al. 1999) suggests that this region is itself a potential target of selection. The relatively recent appearance of In(2L)t (ANDOLFATTO et al. 1999) may seem to preclude its relation to the pattern of linkage disequilibrium observed at its breakpoint among the ancient standard lineages. However, several theoretical studies suggest that a newly arising inversion is likely to confer a fitness advantage in the presence of preexisting epistatic interactions (KIMURA 1956; WASSERMAN 1968; CHARLESWORTH and CHARLESWORTH 1973; CHARLESWORTH 1974; ALVAREZ and ZAPATA 1997).

Demography and selection are not mutually exclusive hypotheses and the two forces may in fact interact to produce even greater deviations than expected under either class of models (cf. KAPLAN et al. 1991; NORDBORG 1997; SLATKIN and WIEHE 1998). Even if a long-standing epistatic interaction exists at the In(2L)t proximal breakpoint site in standard chromosomes, the recent increase in In(2L)t's frequency (ANDOLFATTO et al. 1999) and demographic perturbations, such as population expansion, may have affected geographic patterns of variation for standard alleles at this locus. If both selection and demographic shifts have influenced patterns of variability, then the demographic history of D. melanogaster and D. simulans will have to be better understood before selection at any particular locus can be inferred.


*  ACKNOWLEDGMENTS

We thank J. Comeron, R. Hudson, R. Kulathinal, M. Przeworski, S. C. Tsaur, and J. Wall for helpful discussions. R. Hudson and J. Wall provided computer programs. This manuscript was improved with comments from P. Awadalla, W. Eanes, an anonymous reviewer, and especially M. Przeworski. We thank Jean Gladstone for excellent technical assistance and Chung-I Wu for Australian and African fly lines. This research was supported by National Science Foundation grant DEB-9408869 and National Institutes of Health grant R01GM39355 to M.K. P.A. holds a Postgraduate Scholarship from the Natural Sciences and Engineering Research Council of Canada.

Manuscript received May 8, 1999; Accepted for publication December 22, 1999.


*  LITERATURE CITED

AGUADÉ, M., 1998  Different forces drive the evolution of the Acp26Aa and Acp26Ab accessory gland genes in the Drosophila melanogaster species complex. Genetics 150:1079-1089.

AGUADÉ, M., 1999  Positive selection drives the evolution of the Acp29AB accessory gland protein in Drosophila. Genetics 152:543-551.

ALVAREZ, G. and C. ZAPATA, 1997  Conditions for protected inversion polymorphism under supergene selection. Genetics 146:717-722.

ANDOLFATTO, P. and M. NORDBORG, 1998  The effect of gene conversion on intralocus associations. Genetics 148:1397-1399.

ANDOLFATTO, P., J. D. WALL, and M. KREITMAN, 1999  Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics 153:1297-1311.

BEGUN, D. J. and C. F. AQUADRO, 1993  African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature 365:548-550.

BEGUN, D. J. and C. F. AQUADRO, 1994  Evolutionary inferences from DNA variation at the 6-Phosphogluconate Dehydrogenase locus in natural populations of Drosophila: selection and geographic differentiation. Genetics 136:155-171.

BEGUN, D. J. and C. F. AQUADRO, 1995  Molecular variation at the vermilion locus in geographically diverse populations of Drosophila melanogaster and Drosophila simulans. Genetics 140:1019-1032.

BÉNASSI, V., S. AULARD, S. MAZEAU, and M. VEUILLE, 1993  Molecular variation of Adh and P6 genes in an African population of Drosophila melanogaster and its relation to chromosomal inversions. Genetics 134:789-799.

BÉNASSI, V., F. DEPAULIS, G. K. MEGHLAOUI, and M. VEUILLE, 1999  Partial sweeping of variation at the Fbp2 locus in a West African population of Drosophila melanogaster. Mol. Biol. Evol. 16:347-353.

BRAVERMAN, J. M., R. R. HUDSON, N. L. KAPLAN, C. H. LANGLEY, and W. STEPHAN, 1995  The hitchhiking effect on the site frequency-spectrum of DNA polymorphisms. Genetics 140:783-796.

CHARLESWORTH, B., 1974  Inversion polymorphism in a two-locus genetic system. Genet. Res. 23:259-280.

CHARLESWORTH, D. and B. CHARLESWORTH, 1973  Selection of new inversions in multi-locus genetic systems. Genet. Res. 21:167-183.

COMERON, J. M., M. KREITMAN, and M. AGUADÉ, 1999  Natural selection on synonymous sites is correlated with gene length and recombination in Drosophila. Genetics 151:239-249.

DAVID, J. R. and P. CAPY, 1988  Genetic variation of Drosophila melanogaster natural populations. Trends Genet. 4:106-111.

DEPAULIS, F., L. BRAZIER, and M. VEUILLE, 1999  Selective sweep at the Drosophila melanogaster Suppressor of Hairless locus and its association with the In(2L)t inversion polymorphism. Genetics 152:1017-1024.

HASSON, E., I. N. WANG, L. W. ZENG, M. KREITMAN, and W. F. EANES, 1998  Nucleotide variation in the triose-phosphate isomerase (Tpi) locus of Drosophila melanogaster and Drosophila simulans. Mol. Biol. Evol. 15:756-769.

FU, Y. X., 1996  New statistical tests of neutrality for DNA samples from a population. Genetics 143:557-570.

HALE, L. R. and R. S. SINGH, 1991  Contrasting patterns of genetic structure and evolutionary history as revealed by mitochondrial DNA and nuclear gene-enzyme variation. J. Genet. 70:79-89.

HAMBLIN, M. T. and C. F. AQUADRO, 1996  High nucleotide sequence variation in a region of low recombination in Drosophila simulans is consistent with the background selection model. Mol. Biol. Evol. 13:1133-1140.

HAMBLIN, M. T. and C. F. AQUADRO, 1997  Contrasting patterns of nucleotide sequence variation at the glucose dehydrogenase (Gld) locus in different populations of Drosophila melanogaster. Genetics 145:1053-1062.

HAMBLIN, M. T. and M. VEUILLE, 1999  Population structure among African and derived populations of Drosophila simulans: evidence for ancient subdivision and recent admixture. Genetics 153:305-317.

HARADA, K., S. I. KUSAKABE, T. YAMAZAKI, and T. MUKAI, 1993  Spontaneous mutation rates in null and band-morph mutations of enzyme loci in Drosophila melanogaster. Jpn. J. Genet. 68:605-616.

HUDSON, R. R., 1987  Estimating the recombination parameter of a finite population model without selection. Genet. Res. 50:245-250.

HUDSON, R. R., 1990 Gene genealogies and the coalescent process, pp. 1–44 in Oxford Surveys in Evolutionary Biology, Vol. 7, edited by D. J. FUTUYMA and J. ANTONOVICS. Oxford University Press, Oxford.

HUDSON, R. R., 1993 The how and why of generating gene genealogies, pp. 23–36 in Mechanisms of Molecular Evolution, edited by N. TAKAHATA and A. G. CLARK. Japan Scientific Society, Tokyo.

HUDSON, R. R. and N. L. KAPLAN, 1985  Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147-164.

HUDSON, R. R. and N. L. KAPLAN, 1988  The coalescent process in models with selection and recombination. Genetics 120:831-840.

HUDSON, R. R., M. KREITMAN, and M. AGUADÉ, 1987  A test of neutral molecular evolution based on nucleotide data. Genetics 116:153-159.

HUDSON, R. R., D. D. BOOS, and N. L. KAPLAN, 1992a  A statistical test for detecting geographic subdivision. Mol. Biol. Evol. 9:138-151.

HUDSON, R. R., M. SLATKIN, and W. P. MADDISON, 1992b  Estimating levels of gene flow from DNA sequence data. Genetics 132:583-589.

HUDSON, R. R., K. BAILEY, D. SKARECKY, J. KWIATOWSKI, and F. J. AYALA, 1994  Evidence for positive selection in the Superoxide Dismutase (Sod) region of Drosophila melanogaster. Genetics 136:1329-1340.

HUDSON, R. R., A. G. SÁEZ, and F. J. AYALA, 1997  DNA variation at the Sod locus of Drosophila melanogaster: an unfolding story of natural selection. Proc. Natl. Acad. Sci. USA 94:7725-7729.

IRVIN, S. D., K. A. WETTERSTRAND, C. M. HUTTER, and C. F. AQUADRO, 1998  Genetic variation and differentiation at microsatellite loci in Drosophila simulans: evidence for founder effects in New World populations. Genetics 150:777-790.

KAPLAN, N., R. R. HUDSON, and M. IIZUKA, 1991  The coalescent process in models with selection, recombination and geographic subdivision. Genet. Res. 57:83-91.

KAPLAN, N. L., R. R. HUDSON, and C. H. LANGLEY, 1989  The hitchhiking effect revisited. Genetics 123:887-899.

KIMURA, M., 1956  A model of a genetic system which leads to closer linkage by natural selection. Evolution 10:278-287.

KIRBY, D. A. and W. STEPHAN, 1996  Multi-locus selection and the structure of variation at the white gene of Drosophila melanogaster. Genetics 144:635-645.

KNIBB, W. R., 1982  Chromosomal inversion polymorphism in Drosophila melanogaster II. Geographic clines and climatic associations in Australasia, North America and Asia. Genetica 58:213-221.

KREITMAN, M., 1983  Nucleotide polymorphism at the Alcohol Dehydrogenase locus of Drosophila melanogaster. Nature 304:412-417.

KREITMAN, M. and R. R. HUDSON, 1991  Inferring the evolutionary histories of the Adh and Adh-dup loci in Drosophila melanogaster from patterns of polymorphism and divergence. Genetics 127:565-582.

LABATE, J. A., C. H. BIERMANN, and W. F. EANES, 1999  Nucleotide variation at the runt locus in Drosophila melanogaster and Drosophila simulans. Mol. Biol. Evol. 16:724-731.

LACHAISE, D., M. L. CARIOU, J. R. DAVID, F. LEMEUNIER, and L. TSACAS et al., 1988  Historical biogeography of the Drosophila melanogaster species subgroup. Evol. Biol. 22:159-225.

LI, W. H., 1997 Molecular Evolution. Sinauer Press, Sunderland, MA.

MAYNARD-SMITH, J. and J. HAIGH, 1974  The hitch-hiking effect of a favorable gene. Genet. Res. 23:23-35.

MORIYAMA, E. N. and J. R. POWELL, 1996  Intraspecific nuclear DNA variation in Drosophila. Mol. Biol. Evol. 13:261-277.

NEI, M., 1987 Molecular Evolutionary Genetics. Columbia University Press, New York.

NORDBORG, M., 1997  Structured coalescent processes on different time scales. Genetics 146:1501-1514.

ROZAS, J. and R. ROZAS, 1999  DnaSP version 3: an integrated program for molecular population genetics and molecular evolutionary analysis. Bioinformatics 15:174-175.

SLATKIN, M. and T. WIEHE, 1998  Genetic hitchhiking in a subdivided population. Genet. Res. 71:155-160.

STROBECK, C., 1987  Average number of nucleotide differences in a sample from a single subpopulation: a test for population subdivision. Genetics 117:149-153.

TAJIMA, F., 1983  Evolutionary relationship of DNA sequences in finite populations. Genetics 105:437-460.

TAJIMA, F., 1989  Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.

TSAUR, S. C., C. T. TING, and C. I. WU, 1998  Positive selection driving the evolution of a gene of male reproduction, Acp26Aa, of Drosophila: II. Divergence versus polymorphism. Mol. Biol. Evol. 15:1040-1046.

WALL, J. D., 1999  Recombination and the power of statistical tests of neutrality. Genet. Res. 74:65-79.

WASSERMAN, M., 1968  Recombination-induced chromosomal heterosis. Genetics 58:125-139.

WATTERSON, G. A., 1975  On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7:256-276.



