Genetics, Vol. 178, 1061-1072, February 2008, Copyright © 2008
doi:10.1534/genetics.107.079046
Department of Biological Sciences, Tokyo Metropolitan University, Tokyo 192-0397, Japan
1 Address for correspondence: Department of Biological Sciences, Tokyo Metropolitan University, 1-1 Minami Osawa, Hachioji, Tokyo 192-0397, Japan.
E-mail: mts{at}tmu.ac.jp
| ABSTRACT |
|---|
With the completion of many genome sequences, however, comparisons of genomic data have raised the question of whether all differences in multigene-family size are consequences of selection. Alternatively, they might be caused merely by the stochastic gain and loss of genes. Indeed, results from genome analyses revealed that, at least in part, the size difference in multigene families between species can be explained by neutral evolution (KAREV et al. 2003, 2004; REED and HUGHES 2004; HAHN et al. 2005; DE BIE et al. 2006; RUDNICKI et al. 2006). On the basis of these observations, it was proposed that the size difference in multigene families is not a consequence, but a cause, of evolutionary changes in phenotypes (NEI 2005).
These two theories, however, may not look at the same phenomenon. It is known that genes generated by a duplication undergo two successive but distinct stages of evolution (LYNCH and CONERY 2000). At the earlier stage, the two genes are functionally identical and tend to be reduced to a single gene by degeneration of either gene. Once they have functionally diverged from each other, however, both genes independently contribute to fitness, and selection pressure maintains the two genes stably for a long time. The selection model of gene-family evolution may explain differences at the later stage, while the stochastic gain-and-loss model may fit events occurring at the earlier stage. Thus, it is important to know which stage contributes more to the size difference between the gene families of interest, because it influences the conclusion of analyses.
Genes encoding odorant-binding proteins (OBP), secreted molecules that function in insects' chemosensilla, form a large family in an insect genome. In the Drosophila melanogaster genome, there are 50 OBP genes; this number is comparable to that of odorant receptors and gustatory receptors (GALINDO and SMITH 2001; GRAHAM and DAVIES 2002; HEKMAT-SCAFE et al. 2002). Similarly to those of the chemoreceptor gene families, the size of the OBP gene family also varies between species, suggesting that OBP genes are under the control of the same kind of evolutionary mechanisms that determine the sizes of chemoreceptor gene families (XU et al. 2003; FORÊT and MALESZKA 2006).
Two OBP genes, Obp57d and Obp57e, have been identified as responsible for species-specific host-plant preference in D. sechellia (MATSUO et al. 2007). These genes are expected to be under selective pressure from ecological conditions, and their evolution might also have affected behavior in other species. Here, by comparing the genomic sequences at the Obp57d/e locus from 27 Drosophila species, we revealed the rapid evolution of the two OBP genes, which resulted in gene-number differences between species. The differences were divided into two classes: the difference of the genes that have functionally diverged from each other and the difference of the genes that appear to be at the early stage of evolution after gene duplication and thus have not yet functionally diverged. Our findings demonstrate that the comparative analysis of many genomic sequences from closely related species is useful for discriminating between these two classes of differences.
| MATERIALS AND METHODS |
|---|
Fly stocks:
The fly stocks used in this study are listed in Table 1. All the stocks are maintained in our laboratory, except for D. melanogaster and D. pseudoobscura, whose genomic sequences were obtained from FlyBase (release 5.1 and 2.0, respectively).
ORF identification:
The genomic sequences at the Obp57d/e region were searched for second exons (ORF) using an OBP signature cysteine motif (C-X10-C-X8-C). First exons (ORF) were determined among possible ORFs using the following criteria: the ORF starts with ATG, its length is sufficient (>50 bp), and the exon–intron boundary can be assigned to form an in-frame connection with the corresponding second exon. By using these criteria, the first exon and the exon–intron boundary could be uniquely determined for every second exon found by using the C-X10-C-X8-C motif.
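The motif search described above can be illustrated with a minimal sketch. It is not the procedure used in this study: the function names, the restriction to the three forward reading frames, and the use of the standard genetic code are all assumptions made for illustration. The sketch translates a DNA string in each forward frame and reports every position where the C-X10-C-X8-C spacing occurs without an intervening stop codon.

```python
import re

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# C-X10-C-X8-C: Cys, any 10 residues, Cys, any 8 residues, Cys.
# [^*] excludes stop codons; the lookahead reports overlapping matches too.
OBP_MOTIF = re.compile(r"(?=(C[^*]{10}C[^*]{8}C))")


def translate(dna, frame):
    """Translate one forward reading frame (0, 1, or 2); unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def find_motif_hits(dna):
    """Return (frame, amino-acid position, matched peptide) for every motif hit."""
    hits = []
    for frame in range(3):
        protein = translate(dna, frame)
        for match in OBP_MOTIF.finditer(protein):
            hits.append((frame, match.start(), match.group(1)))
    return hits
```

A complete search would also scan the reverse complement, and candidate first exons would then be filtered by the ATG, length, and in-frame criteria listed above.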
Amino acid sequence analyses:
Alignment and phylogenetic analyses of the deduced amino acid sequences were carried out using the MEGA3.1 sequence analysis package (KUMAR et al. 2004). The signal peptide sequence was predicted using SignalP 3.0 (BENDTSEN et al. 2004). The ancestral amino acid sequence at each internal node was inferred by the maximum-parsimony (MP) method (FITCH 1971). An original script running on the R statistical package was used for ancestral state inference and for counting the number of amino acid substitution events for each site (see supplemental materials at http://www.genetics.org/supplemental/ for the script and a detailed description). Types I and II functional divergences between Obp57d and Obp57e were examined using DIVERGE 2.0 (GU and VELDEN 2002; GU 2006).
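As an illustration of how substitution events can be counted per site under maximum parsimony, the following is a minimal sketch and not the R script referred to above. It uses a Sankoff-style dynamic program with unit costs, which yields the same minimum number of changes as Fitch's method on a bifurcating tree and also accepts nodes with more than two children. The tree encoding (nested tuples with leaf names as strings), the function names, and the assumption of a gap-free alignment are hypothetical choices.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def site_costs(node, leaf_states):
    """Return {state: minimum substitutions in the subtree if `node` has that state}."""
    if isinstance(node, str):  # a leaf, identified by its name
        observed = leaf_states[node]
        return {s: (0 if s == observed else float("inf")) for s in AMINO_ACIDS}
    child_tables = [site_costs(child, leaf_states) for child in node]
    costs = {}
    for s in AMINO_ACIDS:
        total = 0
        for table in child_tables:
            # Cheapest child state, paying 1 if it differs from the parent state.
            total += min(table[t] + (s != t) for t in AMINO_ACIDS)
        costs[s] = total
    return costs


def min_substitutions(tree, leaf_states):
    """Minimum number of substitutions needed to explain one alignment column."""
    return min(site_costs(tree, leaf_states).values())


def substitutions_per_site(tree, alignment):
    """Apply the parsimony count to every column of a gap-free alignment."""
    length = len(next(iter(alignment.values())))
    return [
        min_substitutions(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(length)
    ]
```

For example, with the hypothetical tree `(("a", "b"), ("c", "d", "e"))`, a column in which a and b carry A, c and e carry S, and d carries T requires at least two substitutions, whereas a fully conserved column requires none.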
| RESULTS |
|---|
Site-specific analysis of functional constraint against amino acid substitution:
Obp57d or Obp57e knock-out flies showed similar changes in behavioral response to octanoic acid, indicating that these two OBP genes, at least in part, share the same function in the perception of octanoic acid (MATSUO et al. 2007). We searched for amino acid sites that are conserved between Obp57d and Obp57e. For the analysis, we selected species with single copies of the Obp57d and Obp57e genes, to ensure better conservation of function in each gene (Figure 5A). A bifurcating tree is not adequate to describe the phylogenetic relationship between the selected species. Thus, we employed a multibranched tree for ancestral state inference by the MP method, and the substitution events at all branches were counted. The total numbers of substitutions were almost the same between Obp57d and Obp57e, indicating that the strengths of the overall constraints on these two genes are equivalent (Table 2). When the distribution of the number of amino acid substitutions at each site was analyzed, 16 sites were conserved, more than expected under the negative binomial distribution (see supplemental Figure S1 at http://www.genetics.org/supplemental/). Amino acids at these sites are shown with those in D. pseudoobscura and D. obscura (Table 3). In addition to the six OBP-signature cysteines, three sites at positions 59, 97, and 124 are conserved between the obscura and melanogaster groups. They are candidates for the amino acids that determine the common function of Obp57d and Obp57e.
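The comparison between the observed number of conserved sites and the negative binomial expectation can be sketched as follows. The fitting procedure actually used is described in the supplemental materials and is not reproduced here; this sketch assumes a method-of-moments fit, and the function names are hypothetical. Given the number of substitutions at each site, it estimates the two parameters of the distribution and tabulates observed against expected site counts for each number of substitutions.

```python
import math
from collections import Counter


def fit_negative_binomial(counts):
    """Method-of-moments estimates (r, p); requires variance greater than mean."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    if var <= mean:
        raise ValueError("no overdispersion: variance must exceed the mean")
    r = mean ** 2 / (var - mean)
    p = mean / var
    return r, p


def nb_pmf(k, r, p):
    """Probability of exactly k substitutions at a site, with mean r(1-p)/p."""
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log(1 - p)
    )
    return math.exp(log_pmf)


def observed_vs_expected(counts):
    """Map each substitution count k to (observed sites, expected sites)."""
    r, p = fit_negative_binomial(counts)
    observed = Counter(counts)
    n = len(counts)
    return {
        k: (observed.get(k, 0), n * nb_pmf(k, r, p))
        for k in range(max(counts) + 1)
    }
```

An excess of observed over expected sites at zero substitutions would correspond to the kind of conserved-site surplus described above; whether that excess is significant requires a separate test that this sketch does not perform.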
The coefficients for type I functional divergence (θI) between Obp57d and Obp57e are significantly different from zero for both trees 2 and 3, showing that the site-specific evolutionary rate differs between the two clades. On the other hand, the coefficients for type II functional divergence (θII) are not significantly different from zero, showing that there is no site-specific shift of amino acid property between the two clades.
| DISCUSSION |
|---|
α-helical domains, and (3) OBP has six cysteines at particular intervals that are necessary for appropriate conformation. Most of the other sites in the OBP genes are not conserved at the amino acid level (GALINDO and SMITH 2001; GRAHAM and DAVIES 2002; HEKMAT-SCAFE et al. 2002). Because most OBP genes in a genome are supposed to have diverged from others in function, the comparison of the amino acid sequences of OBP genes within a genome is not effective for elucidating the relationship between the structure and the specific function of each OBP. Thus, it is preferable to compare orthologs from closely related but different species, which are expected to retain the same function, e.g., ligand repertoire. By comparing the orthologous genes from many species, we found the conserved amino acids in both Obp57d and Obp57e. They might be the key sites for the specific functions shared by these two genes. We also found the type I functionally diverged sites between the two OBP genes. They are possibly the key amino acids responsible for the specific functions of each OBP.
Functional divergence between Obp57d and Obp57e:
There are two theories for the functional divergence of duplicated genes: subfunctionalization and neofunctionalization (ROTH et al. 2007). The subfunctionalization of duplicated genes is a key process in the DDC model, in which the functions of an ancestral gene are divided between the duplicated genes, which functionally complement each other. On the other hand, the neofunctionalization of duplicated genes results in the acquisition of a novel function by one gene, while the other gene preserves the ancestral function. The ME tree of the amino acid sequences showed that Obp57d and Obp57e have equally diverged from the OBP gene in the obscura group, suggesting that the neofunctionalization of either gene is not likely. Also, type II functional divergence was not supported, which means that there was no radical substitution of amino acids leading to the acquisition of a novel function. However, not all of the type I diverged sites between Obp57d and Obp57e appear to be caused by the loss of functional constraints; among the 19 conserved sites that are clade specific, 12 sites are not conserved in the obscura group, in which the ancestral functions should be conserved (Table 3). The specific condition for OBP genes needs to be considered to understand these observations. Because most sites in OBP genes are evolutionarily free, acquisition of a novel function after gene duplication might be observed as an increase of functional constraints at the sites that had been free before duplication. Such site-specific differences of evolutionary rate will be detected as type I divergence, but in this case, it should be related to neofunctionalization rather than subfunctionalization. Positional shift of functionally important sites, for example, may cause such changes. Our analysis did not include insertion/deletion variations, which clearly affect positional relationships between functional amino acids.
Thus, it remains possible that each of Obp57d and Obp57e inherited a subdivision of ancestral functions (subfunctionalization), and at the same time they gained a novel function that is specific to each of the two OBP genes (neofunctionalization). It has been proposed that subfunctionalization has a role as a transition state to neofunctionalization (HE and ZHANG 2005; RASTOGI and LIBERLES 2005). This possibility should be examined experimentally by in vitro assay, as well as by behavioral assay of genetically manipulated flies.
Birth-and-death process and selection:
The ananassae subgroup and the auraria–rufa lineage provide interesting examples in which either Obp57d or Obp57e has been lost. Even in the subfunctionalization process, duplicated genes require selective pressure for the preservation of ancestral functions. This selective pressure is also necessary for the maintenance of both subfunctionalized genes. In other words, the species lacking either subfunctionalized gene may exhibit defective phenotypes that had been deleterious during the subfunctionalization process. Thus, the loss of either gene in the ananassae subgroup and the auraria–rufa lineage may indicate a shift in selective pressure, such as a reduction in population size leading to genetic drift, or an environmental change leading to a shift in food availability. Indeed, some of the amino acids conserved in the other species that have both Obp57d and Obp57e (Tables 3 and 5) are not conserved in the ananassae subgroup and the auraria–rufa lineage (Figure 2), indicating that the functional constraint on the remaining gene has changed. It should also be noted that the OBP gene number at the Obp57d/e locus is under a particular selection mechanism. In natural populations of D. melanogaster, there is polymorphism at the Obp57e locus (TAKAHASHI and TAKANO-SHIMIZU 2005). The Obp57e null allele was found worldwide, indicating the existence of balancing selection. This indicates that there is a selection-based mechanism that affects the gene number at the Obp57d/e locus. The mechanism of this selection may have great importance as a determinant of the OBP gene family size.
In contrast to the examples discussed above, the multiple Obp57d genes in D. takahashii, D. biarmipes, D. ficusphila, D. elegans, and D. varians and the two Obp57e genes in D. constricta appear to be at the early stage of evolution after gene duplications in each lineage. Although the amino acids conserved in Obp57d genes in the other species (Tables 3 and 5) are not conserved in some of these multiple Obp57d genes (Figure 2), this may indicate, in this case, a gene degeneration process rather than functional divergence. Indeed, Obp57d pseudogenes are found in D. takahashii and D. elegans. Analyses of intraspecies variations in each species might reveal selection pressure on these extra Obp57d genes, if any. Although the observed number difference of genes at the earlier stage of evolution (e.g., three Obp57d genes in D. biarmipes) is larger than that at the later stage (e.g., loss of Obp57d in D. auraria) and may contribute more to the size difference of the gene family between species, its contribution to phenotypic differences would be less than that of the number difference of the functionally diverged genes.
Functional relationship with receptors:
In the evolution of OBP genes, not only the populational and environmental factors, but also the local factors at the molecular level are important determinants of selection pressure. For its proper function, OBP must be coexpressed with functionally corresponding chemoreceptors in the same sensilla (XU et al. 2005). Changes in the structure (function) and expression pattern of the corresponding receptor are expected to change the selection pressure on OBP genes. More importantly, changes in the expression pattern of OBP itself also alter its local environment. The evolution in the expression pattern of each OBP may affect the selection pressure on itself, possibly resulting in the gene number difference between species. It is necessary to examine whether these genes are expressed in the same pattern as seen in D. melanogaster or in different patterns in the species in which the number of Obp57d/e genes is altered.
Conclusion:
There are two classes of gene number differences in the Obp57d/e region: the difference of the genes that have functionally diverged from each other and the difference of the genes that appear to be functionally identical. Although both of these two classes contribute to the size difference of the gene family between species, their contributions to the phenotypic differences are not equal, and the evolutionary mechanisms underlying them are different. Thus, it is important to distinguish between these two classes in the analysis of the size difference of multigene families among species. Comparisons of many genomic sequences from closely related species are effective for this purpose.
| LITERATURE CITED |
|---|
BENDTSEN, J. D., H. NIELSEN, G. VON HEIJNE and S. BRUNAK, 2004 Improved prediction of signal peptides: SignalP 3.0. J. Mol. Biol. 340: 783–795.
DA LAGE, J.-L., G. J. KERGOAT, F. MACZKOWIAK, J.-F. SILVAIN, M.-L. CARIOU et al., 2007 A phylogeny of Drosophilidae using the Amyrel gene: questioning the Drosophila melanogaster species group boundaries. J. Zool. Syst. Evol. Res. 45: 47–63.
DE BIE, T., N. CRISTIANINI, J. P. DEMUTH and M. W. HAHN, 2006 CAFE: a computational tool for the study of gene family evolution. Bioinformatics 22: 1269–1271.
FITCH, W. M., 1971 Toward defining the course of evolution: minimum change for a specific tree topology. Syst. Zool. 20: 406–416.
FORÊT, S., and R. MALESZKA, 2006 Function and evolution of a gene family encoding odorant binding-like proteins in a social insect, the honey bee (Apis mellifera). Genome Res. 16: 1385–1394.
GALINDO, K., and D. P. SMITH, 2001 A large family of divergent Drosophila odorant-binding proteins expressed in gustatory and olfactory sensilla. Genetics 159: 1059–1072.
GILAD, Y., V. WIEBE, M. PRZEWORSKI, D. LANCET and S. PÄÄBO, 2004 Loss of olfactory receptor genes coincides with the acquisition of full trichromatic vision in primates. PLoS Biol. 2: E5.
GO, Y., Y. SATTA, O. TAKENAKA and N. TAKAHATA, 2005 Lineage-specific loss of function of bitter taste receptor genes in humans and nonhuman primates. Genetics 170: 313–326.
GRAHAM, L. A., and P. L. DAVIES, 2002 The odorant-binding proteins of Drosophila melanogaster: annotation and characterization of a divergent gene family. Gene 292: 43–55.
GU, X., 2006 A simple statistical method for estimating type-II (cluster-specific) functional divergence of protein sequences. Mol. Biol. Evol. 23: 1937–1945.
GU, X., and K. V. VELDEN, 2002 DIVERGE: phylogeny-based analysis for functional-structural divergence of a protein family. Bioinformatics 18: 500–501.
HAHN, M. W., T. DE BIE, J. E. STAJICH, C. NGUYEN and N. CRISTIANINI, 2005 Estimating the tempo and mode of gene family evolution from comparative genomic data. Genome Res. 15: 1153–1160.
HE, X., and J. ZHANG, 2005 Rapid subfunctionalization accompanied by prolonged and substantial neofunctionalization in duplicate gene evolution. Genetics 169: 1157–1164.
HEKMAT-SCAFE, D. S., C. R. SCAFE, A. J. MCKINNEY and M. A. TANOUYE, 2002 Genome-wide analysis of the odorant-binding protein gene family in Drosophila melanogaster. Genome Res. 12: 1357–1369.
JUNIER, T., and M. PAGNI, 2000 Dotlet: diagonal plots in a web browser. Bioinformatics 16: 178–179.
KAREV, G. P., Y. I. WOLF and E. V. KOONIN, 2003 Simple stochastic birth and death models of genome evolution: Was there enough time for us to evolve? Bioinformatics 19: 1889–1900.
KAREV, G. P., Y. I. WOLF, F. S. BEREZOVSKAYA and E. V. KOONIN, 2004 Gene family evolution: an in-depth theoretical and simulation analysis of non-linear birth-death-innovation models. BMC Evol. Biol. 4: 32.
KUMAR, S., K. TAMURA and M. NEI, 2004 MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief. Bioinform. 5: 150–163.
LYNCH, M., and J. S. CONERY, 2000 The evolutionary fate and consequences of duplicate genes. Science 290: 1151–1155.
MATSUO, T., S. SUGAYA, J. YASUKAWA, T. AIGAKI and Y. FUYAMA, 2007 Odorant-binding proteins Obp57d and Obp57e affect taste perception and host-plant preference in Drosophila sechellia. PLoS Biol. 5: e118.
NEI, M., 2005 Selectionism and neutralism in molecular evolution. Mol. Biol. Evol. 22: 2318–2342.
RASTOGI, S., and D. A. LIBERLES, 2005 Subfunctionalization of duplicated genes as a transition state to neofunctionalization. BMC Evol. Biol. 5: 28.
REED, W. J., and B. D. HUGHES, 2004 A model explaining the size distribution of gene and protein families. Math. Biosci. 189: 97–102.
ROTH, C., S. RASTOGI, L. ARVESTAD, K. DITTMAR, S. LIGHT et al., 2007 Evolution after gene duplication: models, mechanisms, sequences, systems, and organisms. J. Exp. Zool. B Mol. Dev. Evol. 308: 58–73.
RUDNICKI, R., J. TIURYN and D. WÓJTOWICZ, 2006 A model for the evolution of paralog families in genomes. J. Math. Biol. 53: 759–770.
TAKAHASHI, A., and T. TAKANO-SHIMIZU, 2005 A high-frequency null mutant of an odorant-binding protein gene, Obp57e, in Drosophila melanogaster. Genetics 170: 709–718.
XU, P. X., L. J. ZWIEBEL and D. P. SMITH, 2003 Identification of a distinct family of genes encoding atypical odorant-binding proteins in the malaria vector mosquito, Anopheles gambiae. Insect Mol. Biol. 12: 549–560.
XU, P., R. ATKINSON, D. N. M. JONES and D. P. SMITH, 2005 Drosophila OBP LUSH is required for activity of pheromone-sensitive neurons. Neuron 45: 193–200.
Communicating editor: N. TAKAHATA