Genetics, Vol. 170, 709-718, June 2005, Copyright © 2005
doi:10.1534/genetics.104.036483
,1
,
* Division of Population Genetics, National Institute of Genetics, Mishima 411-8540, Japan
Department of Genetics, Graduate University for Advanced Studies (Sokendai), Mishima 411-8540, Japan
School of Advanced Sciences, Graduate University for Advanced Studies (Sokendai), Shonan Village, Hayama, Kanagawa 240-0193, Japan
1 Corresponding author: Division of Population Genetics, National Institute of Genetics, 1111 Yata, Mishima 411-8540, Japan.
E-mail: atakahas{at}lab.nig.ac.jp
| ABSTRACT |
|---|
Recently, a large family of odorant-binding proteins (OBPs) was identified in Drosophila (GALINDO and SMITH 2001; GRAHAM and DAVIES 2002; HEKMAT-SCAFE et al. 2002; VOGT et al. 2002). The family contains up to 51 putative members that are expressed in olfactory and gustatory organs (GALINDO and SMITH 2001; HEKMAT-SCAFE et al. 2002). The exact function of these proteins is not known, but they seem to play an important role in odor detection by restricting the odorants accessible to specific receptors (VOGT et al. 1991). We expect to find a reasonable number of null mutants in OBPs, considering that many such null mutants have been found on the receptor side.
We focused on one tandemly duplicated pair of odorant-binding proteins, Obp57d and Obp57e. These two genes are known to be expressed in the same four cells of tarsi (GALINDO and SMITH 2001) and thus may have an overlapping function that could have weakened the functional constraints on one or both of them. Our survey indeed found two types of null mutations (deletions that disrupt the transcript) in Obp57e. One of them was found at a fairly high frequency in the populations we examined. Our aim in this study was to analyze the surrounding DNA regions of these two genes and to investigate if there was any pattern that deviated from neutrality.
| MATERIALS AND METHODS |
|---|
RT-PCR:
RNA was extracted from the adult tissues of KY02073 on the third day after eclosion. Wings, tarsi, labella, antennae, and maxillary palps were collected from 10 individual males and females and stored in RNAlater RNA stabilization reagent (QIAGEN, Valencia, CA). Poly(A)+ RNA was extracted with the RNeasy mini kit (QIAGEN) and cDNA was synthesized using the SuperScript first-strand synthesis system (Invitrogen, Carlsbad, CA). A BD Advantage 2 PCR kit (CLONTECH, BD Biosciences, San Jose, CA) was used to amplify each transcript. The reactions were hot started at 95° for 1 min, followed by 35 cycles of 95° for 30 sec and 64° for 1 min, and a final step at 64° for 1 min. The primers used were as follows: Obp57d, 5'-TGTACCGCATCTGGCTTGTA-3' and 5'-ACTTGTGGGACCTTTTCACG-3'; Obp57e, 5'-TTGGACCAACTTACACTGTGTTT-3' and 5'-ACTGGCCAATTCTCCATCAC-3'. The primers for the ribosomal protein 49 gene (Rp49), 5'-AGATCGTGAAGAAGCGCACCAAG-3' and 5'-CACCAGGAACTTCTTGAATCCGG-3', were used as internal controls. All the primer pairs were selected in the coding regions and designed to span introns. The products from these genes were visualized on a 1.5% agarose gel by loading 10 µl from the 50-µl total reaction volume.
Data analyses:
DNA sequences were aligned using the Clustal W program (THOMPSON et al. 1994). The level of nucleotide diversity (π), nucleotide divergence, population recombination rate (C = 4Ner, where Ne is effective population size and r is recombination rate per site per generation; HUDSON 1987), and linkage disequilibrium (LD) were estimated using the DnaSP program v.4.0 (ROZAS et al. 2003). The sliding-window analyses of π and divergence were also performed with this program. The test of heterogeneity of π in the sliding-window analyses was conducted by generating 100 random sequence data sets by coalescent simulation using ProSeq v.2.91 (FILATOV 2002). The parameters (π, C, etc.) used to generate these sequences were estimated from the population samples. The estimation and permutation tests of FST levels (HUDSON et al. 1992) between the Obp57e indel types at positions 1151–1160 were performed using the ProSeq program. Tajima's test (TAJIMA 1989) and Fay and Wu's test (FAY and WU 2000) were performed using the DnaSP program. To determine the critical values of the test statistics, we performed 1000 coalescent simulations, as implemented in the DnaSP program, with the observed level of recombination (C = 0.001 for both regions, estimated by the method of HUDSON 1987). HKA tests (HUDSON et al. 1987) were also performed using the DnaSP program.
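As a rough illustration of two of the summary statistics named above, the following sketch computes π and Tajima's D from a gap-free alignment. It is not the DnaSP implementation; the function names and the input format (a list of equal-length strings) are assumptions made for this example, and the constants follow the standard formulation of TAJIMA (1989).

```python
from itertools import combinations
from math import sqrt


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """pi: mean pairwise differences per site (alignment assumed gap-free)."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def segregating_sites(seqs):
    """Number of alignment columns with more than one state."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def tajima_d(seqs):
    """Tajima's D; returns 0.0 when there are no segregating sites."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if s == 0:
        return 0.0
    k = mean_pairwise_differences(seqs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
```

Critical values would still have to come from coalescent simulations, as described in the text; this sketch only produces the point estimates.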
| RESULTS |
|---|
πa and πs, respectively, among the four lines with intact exons to determine if the amino acid changes are suppressed by functional constraints. In both genes, the finding that πa < πs (Obp57d: πa = 0.011, πs = 0.038; Obp57e: πa = 0.0032, πs = 0.027) suggests functional constraints in these genes. In addition, among the six amino acid changes in Obp57d and the two found in the four intact Obp57e lines (Figure 1), four were conservative in terms of charge, polarity, and volume according to the criteria of ZHANG (2000). Only one site in each of the two genes (at nucleotide positions 624 and 1535) was radically changed in terms of charge. Another site in each of the genes (at positions 538 and 1559) was radically changed in terms of polarity and volume. These results suggest that these OBP genes are functional, because most of the observed amino acid changes were conservative. We also confirmed the exon-intron boundary predicted in the database by sequencing the RT-PCR products (data not shown) to make sure that the deletions found in the coding regions disrupt translation. We surveyed 13 African isofemale lines and 3 non-African lines to investigate whether the frameshift mutation of Obp57e at positions 1151–1160 is present in other regions of the world. The results indicate that the mutation found in Kyoto is not a locally occurring new mutation, but exists in worldwide populations and thus could be quite old (Table 1).
This frameshift mutation (deletion at positions 1151–1160) could be merely drifting to high frequency due to weak functional constraint. Alternatively, if it is under some type of balancing selection, a sliding-window analysis should reveal elevated nucleotide diversity around this mutation. The analysis showed a peak of nucleotide diversity of silent sites at the window surrounding this mutation (Figure 4A). We tested whether this heterogeneity in nucleotide diversity levels was statistically significant. To be conservative, we used the concatenated sequence of introns and third-position amino acid sites. The concatenated sequence with a size of 998 bp also showed an elevated π level at four adjacent 100-bp windows with a step size of 25 bp (175 bp). The maximum π value of 175-bp windows with a step size of 1 was 0.032. In this data set, π = 0.012 and the estimated C = 4Ner = 0.001. We generated 100 sets of 16 random sequences using these parameters by coalescent simulation. The probability that the maximum π value among 175-bp windows with a step size of 1 is ≥0.032 in these generated data sets was P = 0.04. Thus, heterogeneity of nucleotide diversity in this gene was significant at the 5% level. Nevertheless, it should be noted that because the simulations assume a constant panmictic population, we cannot exclude the possibility that the true variance of π and P might be greater than that obtained by the simulation model.
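The logic of this heterogeneity test can be sketched as follows: compute π in every window, take the maximum, and ask how often simulated data sets reach a maximum at least as large. The code below is a minimal illustration under the assumption of a gap-free alignment given as equal-length strings; the function names are hypothetical, and the simulated maxima are assumed to come from an external coalescent simulator (ProSeq in the text), which is not reproduced here.

```python
from itertools import combinations


def window_pi(seqs, start, window):
    """Nucleotide diversity per site within columns [start, start + window)."""
    chunk = [s[start:start + window] for s in seqs]
    pairs = list(combinations(chunk, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / len(pairs) / window


def sliding_window_pi(seqs, window, step=1):
    """List of (window start, pi) for every full window along the alignment."""
    length = len(seqs[0])
    return [(start, window_pi(seqs, start, window))
            for start in range(0, length - window + 1, step)]


def max_window_pi(seqs, window, step=1):
    """Largest windowed pi, the test statistic used for heterogeneity."""
    return max(pi for _, pi in sliding_window_pi(seqs, window, step))


def empirical_p(observed_max, simulated_maxima):
    """Fraction of simulated data sets whose maximum is >= the observed one."""
    hits = sum(m >= observed_max for m in simulated_maxima)
    return hits / len(simulated_maxima)
```

With the values reported above, `observed_max` would be 0.032 for 175-bp windows, and `simulated_maxima` would hold one maximum per simulated set of 16 sequences.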
Sliding-window analyses of divergence between the wild-type and deletion alleles and of nucleotide diversity within each of the two types were then performed to determine whether the frameshift mutation (deletion at positions 1151–1160) is responsible for the heterogeneity of nucleotide diversity observed in Figure 4A. The former showed a peak pattern similar to that of the pooled sample, whereas the latter two did not. This indicates that the peak pattern of π is due to divergence of these two alleles. The peak of divergence between the two alleles shown in Figure 4B (0.077 at the window surrounding position 1164) may be close to the divergence level between D. melanogaster and D. simulans, which is estimated to be ~0.05–0.10. For example, the divergence in the Obp57d region was estimated to be 0.10 including all sites, and 0.15 when using silent sites. Recent data by DUMONT et al. (2004) showed that the average divergence between D. melanogaster and D. simulans was 0.078 among 27 intron regions, and 0.124 among synonymous sites of 81 genes.
We also tested for a significant association of Obp57e sequence variation with the wild-type or null (deletion) variation at positions 1151–1160 by analyzing differentiation. We used FST (HUDSON et al. 1992), a parameter generally used to measure population differentiation, to test whether the differentiation between the two alleles at positions 1151–1160 is greater than expected from a random mixing of a set of sequences. The strain with another null mutant at positions 1226–1241 (KY02101) was excluded from the analysis. The sequence differentiation between the wild-type and null (deletion) alleles was significant in the Obp57e region (FST = 0.68, P < 0.001), but not in the proximal Obp57d region (FST = 0.17, P = 0.064).
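The differentiation test can be illustrated with a small sketch in which sequences are grouped by allele type rather than by population. It uses the estimator FST = 1 − Hw/Hb (mean pairwise differences within groups over mean pairwise differences between groups) and a label-shuffling permutation test. This is an assumption-laden illustration, not the ProSeq implementation: the function names, the equal weighting of the two within-group means, and the number of permutations are all choices made for this example.

```python
import random
from itertools import combinations, product


def hamming(a, b):
    """Number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(group):
    """Mean pairwise differences within one group (needs >= 2 sequences)."""
    pairs = list(combinations(group, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def fst(group1, group2):
    """FST = 1 - Hw/Hb; returns 0.0 if the groups show no between-group differences."""
    hw = (mean_within(group1) + mean_within(group2)) / 2.0
    hb = sum(hamming(a, b) for a, b in product(group1, group2))
    hb /= len(group1) * len(group2)
    return 0.0 if hb == 0 else 1.0 - hw / hb


def permutation_test(group1, group2, n_perm=1000, seed=0):
    """Observed FST and the fraction of label shuffles with FST >= observed."""
    rng = random.Random(seed)
    observed = fst(group1, group2)
    pooled = list(group1) + list(group2)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of sequences to the two groups
        if fst(pooled[:len(group1)], pooled[len(group1):]) >= observed:
            hits += 1
    return observed, hits / n_perm
```

In the setting of the text, `group1` and `group2` would hold the wild-type and deletion sequences for a given region, with KY02101 left out.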
The LD between pairs of all the polymorphic sites is shown in Figure 5A. After correcting for multiple tests, there were 21 significant pairs of sites in the Obp57e region surrounding the position of the frameshift mutation at positions 1151–1160, whereas in the Obp57d region, there was only one significant pair. No significant pairs were found after correction when all the sites from both regions were compared, indicating that a strong LD between the two regions did not exist (see also Figure 2 polymorphism table). There were more pairs of sites in LD in the Obp57e region than in the Obp57d region. To see if this difference was due to a longer range of LD in the Obp57e region, LD expressed by the square of the correlation coefficient (R2) was plotted against the distance between the pairs of sites (Figure 5B). The expected R2 = 1/(1 + k × distance) + 1/16 (WEIR and HILL 1986), where k is a constant in the unit of 4Ner. The best fit was k = 0.0049 for the Obp57d region and k = 0.0070 for the Obp57e region. R2 falls off more quickly in the latter region than in the former, indicating that the larger number of significant pairs in the latter region was probably not due to a smaller r.
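To make the decay curve concrete, the sketch below evaluates the expected R2 as a function of distance and picks the k that minimizes the squared error over a grid. Reading the constant 1/16 as 1/n with n = 16 sampled sequences is an assumption of this example, as are the function names and the grid-search fitting procedure; the text does not state how the best fit was obtained.

```python
def expected_r2(distance, k, n=16):
    """Expected R2 between two sites: 1 / (1 + k * distance) + 1 / n."""
    return 1.0 / (1.0 + k * distance) + 1.0 / n


def fit_k(distances, r2_values, k_grid, n=16):
    """Return the k from k_grid with the smallest sum of squared residuals."""
    def sse(k):
        return sum((r2 - expected_r2(d, k, n)) ** 2
                   for d, r2 in zip(distances, r2_values))
    return min(k_grid, key=sse)


# Compare the two fitted curves reported in the text at a few distances (bp).
for d in (100, 500, 1000):
    print(d, round(expected_r2(d, 0.0049), 3), round(expected_r2(d, 0.0070), 3))
```

A larger k gives a lower expected R2 at every positive distance, which is the sense in which LD decays faster in the Obp57e region.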
| DISCUSSION |
|---|
Another relevant data set is available from the same Kyoto population. INOMATA et al. (2004) analyzed DNA sequences of a gustatory receptor gene, Gr5a, in 152 samples collected at the same collection site in Kyoto in the same year. The authors rejected neutrality by Tajima's test and Fay and Wu's test and argued the possibility of both selection and demographic factors. Nevertheless, further analysis of their data (52 haplotypes with 59 segregating sites identified in 1786 bp) provided us with some information on the population structure. One thousand coalescent simulations on their data showed that the probability of having ≤52 haplotypes would be P = 0.183. Despite the large sample size in their study, a paucity of haplotypes, which would be evidence of a strong recent population bottleneck, was not detected. Although we cannot deny the possibility of a complicated population history affecting our gene region, a simple bottleneck model that produces the biallelism observed in Obp57e is not likely.
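The idea behind such a haplotype-number test can be sketched with a small simulation: draw a neutral genealogy, drop a fixed number of mutations on its branches, count the distinct haplotypes, and repeat. The code below is only an illustration under strong assumptions that the text does not confirm for the original analysis: a constant-size population, no recombination, and conditioning on the number of segregating sites. All function names are hypothetical.

```python
import random


def simulate_haplotype_count(n, s, rng):
    """Haplotype count for n sequences with s mutations on one coalescent tree."""
    # Each lineage is identified by the set of sampled sequences it is ancestral to.
    lineages = [frozenset([i]) for i in range(n)]
    lengths = {lin: 0.0 for lin in lineages}
    while len(lineages) > 1:
        k = len(lineages)
        # Waiting time to the next coalescence, with rate k(k-1)/2.
        t = rng.expovariate(k * (k - 1) / 2.0)
        for lin in lineages:
            lengths[lin] += t
        a, b = rng.sample(range(k), 2)
        merged = lineages[a] | lineages[b]
        lineages = [lin for i, lin in enumerate(lineages) if i not in (a, b)]
        lineages.append(merged)
        lengths[merged] = 0.0
    # Exclude the root lineage; place mutations in proportion to branch length.
    branches = [b for b in lengths if len(b) < n]
    weights = [lengths[b] for b in branches]
    mutated = rng.choices(branches, weights=weights, k=s)
    # A sequence carries a mutation if it descends from the mutated branch.
    haplotypes = {tuple(i in m for m in mutated) for i in range(n)}
    return len(haplotypes)


def haplotype_p_value(n, s, observed, n_sim=1000, seed=1):
    """Fraction of simulations with at most `observed` distinct haplotypes."""
    rng = random.Random(seed)
    counts = [simulate_haplotype_count(n, s, rng) for _ in range(n_sim)]
    return sum(c <= observed for c in counts) / n_sim
```

For the data set discussed above the call would be `haplotype_p_value(152, 59, 52)`. Because this sketch omits recombination, its result should not be expected to match the reported P = 0.183.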
Although population bottleneck does not seem to be a large factor in this population, we should note that the level of nucleotide diversity within deletion strains is lower than that within wild-type strains (Figure 4B). Interestingly, this tendency can be seen in the African populations as well (Figure 7). Although many aspects of this gene region favor balancing selection, it may be sensible to consider additional factors beyond a simple selection model to explain the above pattern. One scenario could be that a small number of null mutants have recently spread in many populations by either directional or balancing selection after being maintained for a long time by balancing selection in a relatively restricted area.
Our data showed a higher number of significant pairs of sites in LD in the Obp57e region than in the Obp57d region (Figure 5A), but did not indicate a longer range of LD in the former (Figure 5B). The length of the region affected by balancing selection, characterized by elevated nucleotide diversity and strong LD, is influenced by the population recombination rate C = 4Ner. Several genes in Drosophila show patterns of nucleotide diversity compatible with balancing selection: Adh (KREITMAN and HUDSON 1991), Idgf1 (ŽUROVCOVÁ and AYALA 2002), and Est-6 (AYALA et al. 2002), as indicated by elevated nucleotide variation around the sites predicted to have functional divergences. The widths of these peaks are all <500 bp, as seen in Obp57e in our study (Figure 4A). In Arabidopsis R genes, compatibility of the balancing selection model has been investigated by coalescent simulation using the estimate of 2Ner from the map distances (STAHL et al. 1999; TIAN et al. 2002; MAURICIO et al. 2003). The widths of the peaks in nucleotide diversity are longer (several kilobases) than those in Drosophila genes. This is not surprising since LD in Arabidopsis is known to extend much farther than that in Drosophila. A genomic scale analysis in Arabidopsis thaliana showed that LD decays within ~250 kb (NORDBORG et al. 2002), whereas in D. melanogaster, it typically decays within 1 kb (e.g., LONG et al. 1998). Despite the detailed experimental data on recombination frequencies (map distances), it is still difficult to obtain a reasonable estimate of population recombination rate in Drosophila. Indeed, there is an excess of LD relative to the standard neutral model in the non-African Drosophila population, given the estimated rate of crossing over from the map distances (ANDOLFATTO and PRZEWORSKI 2000; WALL et al. 2002). Therefore, we could not obtain the expected peak width in nucleotide diversity by coalescent simulations as in the studies of Arabidopsis R genes (STAHL et al. 1999; TIAN et al. 2002; MAURICIO et al. 2003).
Our data also could not reject selective neutrality using Tajima's test, which is a standard method for detecting balancing selection. With regard to another R gene in Arabidopsis, Rps2, CAICEDO et al. (1999) initially did not favor a balancing selection model to explain their data because of the nonsignificant Tajima's D. However, MAURICIO et al. (2003) reanalyzed the region with larger-scale data and succeeded in finding other statistical evidence supporting the selection hypothesis. Probably due to the small sample size, Tajima's test lacked power to detect variants linked to alleles at frequencies of 5/16 = 0.31 (wild type) and 11/16 = 0.69 (frameshift) in our data.
GALINDO and SMITH (2001) demonstrated that Obp57d and Obp57e are coexpressed exclusively in four cells associated with chemosensory bristles of tarsi. They also showed that expressing Grim, a proapoptotic factor that induces programmed cell death, in the cells expressing these genes results in decreased sensitivity to sucrose, which was determined by measuring the proboscis extension reflex. This suggests that the two genes are expressed in cells that are important for gustation. However, GALINDO and SMITH (2001) did not report expression of these genes in wings or labella, despite their investigation of these organs, whereas our RT-PCR detected expression there (Figure 3). A possible reason for this incongruence may be that they detected expression by fusing several kilobases of an upstream regulatory sequence for each gene to a reporter gene. This assay may have missed expression in some organs due either to regulatory elements further upstream or to a positional effect of the transgene. The null mutant of Obp57e found in our study could serve as a naturally occurring gene knockout for further understanding the function of these OBP genes since, to date, the only mutant available for OBPs in Drosophila is lush, which was generated by P-mutagenesis (KIM et al. 1998).
Although the molecular pattern favors the balancing selection hypothesis for Obp57e, it is not easy to infer the advantage and disadvantage of losing the gene function. One possibility involves the formation of dimers, as pointed out by SÁNCHEZ-GRACIA et al. (2003) regarding the two duplicated genes OS-E and OS-F, which are coexpressed in the same cells. The formation of dimers in insect OBPs has been demonstrated under physiological conditions (SANDLER et al. 2000). If Obp57d and Obp57e can form heterodimers in the coexpressed cells, the dosage of Obp57e may be important for determining sensitivity to particular odorants. If this is the case, there may be a slight heterozygous advantage of presence/absence alleles at this locus. This possibility rests on many assumptions that require further experimental investigation. In general, the high frequency of a null mutation may provide insights into the mode of selection as it relates to functional activity of a class of proteins.
| ACKNOWLEDGEMENTS |
|---|
| FOOTNOTES |
|---|
| LITERATURE CITED |
|---|
ANDOLFATTO, P., and M. PRZEWORSKI, 2000 A genome-wide departure from the standard neutral model in natural populations of Drosophila. Genetics 156: 257–268.
AYALA, F. J., E. S. BALAKIREV and A. G. SÁEZ, 2002 Genetic polymorphism at two linked loci, Sod and Est-6, in Drosophila melanogaster. Gene 300: 19–29.
CAICEDO, A. L., B. A. SCHAAL and B. N. KUNKEL, 1999 Diversity and molecular evolution of the RPS2 resistance gene in Arabidopsis thaliana. Proc. Natl. Acad. Sci. USA 96: 302–306.
DUMONT, V. B., J. C. FAY, P. P. CALABRESE and C. F. AQUADRO, 2004 DNA variability and divergence at the Notch locus in Drosophila melanogaster and D. simulans: a case of accelerated synonymous site divergence. Genetics 167: 171–185.
FAY, J. C., and C.-I WU, 2000 Hitchhiking under positive Darwinian selection. Genetics 155: 1405–1413.
FILATOV, D. A., 2002 ProSeq: a software for preparation and evolutionary analysis of DNA sequence data sets. Mol. Ecol. Notes 2: 621–624.
GALINDO, K., and D. P. SMITH, 2001 A large family of divergent Drosophila odorant-binding proteins expressed in gustatory and olfactory sensilla. Genetics 159: 1059–1072.
GRAHAM, L. A., and P. L. DAVIES, 2002 The odorant-binding proteins of Drosophila melanogaster: annotation and characterization of a divergent gene family. Gene 292: 43–55.
HEKMAT-SCAFE, D. S., C. R. SCAFE, A. J. MCKINNEY and M. A. TANOUYE, 2002 Genome-wide analysis of the odorant-binding protein gene family in Drosophila melanogaster. Genome Res. 12: 1357–1369.
HUDSON, R. R., 1987 Estimating the recombination parameter of a finite population model without selection. Genet. Res. 50: 245–250.
HUDSON, R. R., M. KREITMAN and M. AGUADÉ, 1987 A test of neutral molecular evolution based on nucleotide data. Genetics 116: 153–159.
HUDSON, R. R., M. SLATKIN and W. P. MADDISON, 1992 Estimation of levels of gene flow from DNA sequence data. Genetics 132: 583–589.
INOMATA, N., H. GOTO, M. ITOH and K. ISONO, 2004 A single-amino-acid change of the gustatory receptor gene, Gr5a, has a major effect on trehalose sensitivity in a natural population of Drosophila melanogaster. Genetics 167: 1749–1758.
KIM, M. S., A. REPP and D. P. SMITH, 1998 LUSH odorant-binding protein mediates chemosensory responses to alcohols in Drosophila melanogaster. Genetics 150: 711–721.
KREITMAN, M., and R. R. HUDSON, 1991 Inferring the evolutionary histories of the Adh and Adh-dup loci in Drosophila melanogaster from patterns of polymorphism and divergence. Genetics 127: 565–582.
LONG, A. D., R. F. LYMAN, C. H. LANGLEY and T. F. C. MACKAY, 1998 Two sites in the Delta gene region contribute to naturally occurring variation in bristle number in Drosophila melanogaster. Genetics 149: 999–1017.
MAURICIO, R., E. A. STAHL, T. KORVES, D. TIAN, M. KREITMAN et al., 2003 Natural selection for polymorphism in the disease resistance gene Rps2 of Arabidopsis thaliana. Genetics 163: 735–746.
MENASHE, I., O. MAN, D. LANCET and Y. GILAD, 2002 Population differences in haplotype structure within a human olfactory receptor. Hum. Mol. Genet. 11: 1381–1390.
MENASHE, I., O. MAN, D. LANCET and Y. GILAD, 2003 Different noses for different people. Nat. Genet. 34: 143–144.
NORDBORG, M., J. O. BOREVITZ, J. BERGELSON, C. C. BERRY, J. CHORY et al., 2002 The extent of linkage disequilibrium in Arabidopsis thaliana. Nat. Genet. 30: 190–193.
ROBERTSON, H. M., C. G. WARR and J. R. CARLSON, 2003 Molecular evolution of the insect chemoreceptor gene superfamily in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 100: 14537–14542.
ROZAS, J., J. C. SÁNCHEZ-DELBARRIO, X. MESSEGUER and R. ROZAS, 2003 DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19: 2496–2497.
SÁNCHEZ-GRACIA, A., M. AGUADÉ and J. ROZAS, 2003 Patterns of nucleotide polymorphism and divergence in the odorant-binding protein genes OS-E and OS-F: analysis in the melanogaster species subgroup of Drosophila. Genetics 165: 1279–1288.
SANDLER, B. H., L. NIKONOVA, W. S. LEAL and J. CLARDY, 2000 Sexual attraction in the silkworm moth: structure of the pheromone-binding-protein-bombykol complex. Chem. Biol. 7: 143–151.
STAHL, E. A., G. DWYER, R. MAURICIO, M. KREITMAN and J. BERGELSON, 1999 Dynamics of disease resistance polymorphism at the Rpm1 locus of Arabidopsis. Nature 400: 667–671.
TAJIMA, F., 1989 Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123: 585–595.
TAKANO-SHIMIZU, T., A. KAWABE, N. INOMATA, N. NAMBA, R. KONDO et al., 2004 Interlocus nonrandom association of polymorphisms in Drosophila chemoreceptor genes. Proc. Natl. Acad. Sci. USA 101: 14156–14161.
THOMPSON, J. D., D. G. HIGGINS and T. J. GIBSON, 1994 CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22: 4673–4680.
TIAN, D., H. ARAKI, E. STAHL, J. BERGELSON and M. KREITMAN, 2002 Signature of balancing selection in Arabidopsis. Proc. Natl. Acad. Sci. USA 99: 11525–11530.
VOGT, R. G., G. D. PRESTWICH and M. R. LERNER, 1991 Odorant-binding-protein subfamilies associate with distinct classes of olfactory receptor neurons in insects. J. Neurobiol. 22: 74–84.
VOGT, R. G., M. E. ROGERS, M. D. FRANCO and M. SUN, 2002 A comparative study of odorant binding protein genes: differential expression of the PBP1-GOBP2 gene cluster in Manduca sexta (Lepidoptera) and the organization of OBP genes in Drosophila melanogaster (Diptera). J. Exp. Biol. 205: 719–744.
WALL, J. D., P. ANDOLFATTO and M. PRZEWORSKI, 2002 Testing models of selection and demography in Drosophila simulans. Genetics 162: 203–216.
WEIR, B. S., and W. G. HILL, 1986 Nonuniform recombination within the human beta-globin gene cluster. Am. J. Hum. Genet. 38: 776–781.
ZHANG, J., 2000 Rates of conservative and radical nonsynonymous nucleotide substitutions in mammalian nuclear genes. J. Mol. Evol. 50: 56–68.
ŽUROVCOVÁ, M., and F. J. AYALA, 2002 Polymorphism patterns in two tightly linked developmental genes, Idgf1 and Idgf3, of Drosophila melanogaster. Genetics 162: 177–188.