Genetics, Vol. 165, 613-621, October 2003, Copyright © 2003

A Family of Genes Clustered at the Triplo-lethal Locus of Drosophila melanogaster Has an Unusual Evolutionary History and Significant Synteny With Anopheles gambiae

Douglas R. Dorer (a), Jamie A. Rudnick (1, b), Etsuko N. Moriyama (b, c), and Alan C. Christensen (b)
a Department of Microbiology, Meharry Medical College, Nashville, Tennessee 37208
b School of Biological Sciences, University of Nebraska, Lincoln, Nebraska 68588
c Plant Science Initiative, University of Nebraska, Lincoln, Nebraska 68588

Corresponding author: Alan C. Christensen, 348 Manter Hall, University of Nebraska, Lincoln, NE 68588-0118. E-mail: achristensen2@unl.edu

Communicating editor: S. HENIKOFF


*  ABSTRACT

Within the unique Triplo-lethal region (Tpl) of the Drosophila melanogaster genome we have found a cluster of 20 genes encoding a novel family of proteins. This family is also present in the Anopheles gambiae genome and displays remarkable synteny and sequence conservation with the Drosophila cluster. The family is also present in the sequenced genome of D. pseudoobscura, and homologs have been found in Aedes aegypti mosquitoes and in four other insect orders, but it is not present in the sequenced genome of any noninsect species. Phylogenetic analysis suggests that the cluster evolved prior to the divergence of Drosophila and Anopheles (250 MYA) and has been highly conserved since. The ratio of synonymous to nonsynonymous substitutions and the high codon bias suggest that there has been selection on this family both for expression level and function. We hypothesize that this gene family is Tpl, name it the Osiris family, and consider possible functions. We also predict that this family of proteins, due to the unique dosage sensitivity and the lack of homologs in noninsect species, would be a good target for genetic engineering or novel insecticides.


WHEN the Drosophila genome was surveyed for dosage-sensitive regions, only one was found that was both triplo-lethal and haplo-lethal (LINDSLEY et al. 1972). This locus, located in cytological region 83D4,5-E1,2, was called the Triplo-lethal locus, abbreviated Tpl. Stocks carrying a duplication of Tpl on one homolog and a deficiency on the other homolog are viable and provide a powerful selection for either up or down mutations when crossed to a wild-type fly (DENELL 1976). Using this selection, KEPPY and DENELL (1979) were able to obtain duplications and deficiencies of Tpl but were unable to isolate point mutations following EMS or formaldehyde mutagenesis. ROEHRDANZ and LUCCHESI (1980) were also unable to isolate point mutations of Tpl following EMS mutagenesis, although they did isolate mutations in the Suppressor of Tpl locus, which has been shown to encode the Ell protein, a general transcription elongation factor (DORER et al. 1995; EISSENBERG et al. 2002).

Denell proposed three hypotheses to explain the lack of point mutations at Tpl: (1) the locus is very small so the mutation rate is very low, (2) the locus does not encode a protein and therefore is less sensitive to single base changes, or (3) the locus consists of a gene cluster with at least partial redundancy, such that mutation of one of the genes does not rescue the lethality of a duplication of the entire cluster. The small size hypothesis predicts that as the number of mutagenized chromosomes increases, the chance that a mutant will be found also increases. However, we have subsequently screened >10^6 chromosomes and still have not isolated point mutations (DORER and CHRISTENSEN 1990; our unpublished data). The non-protein-coding hypothesis predicts that transposon insertions or inversion breakpoints would still inactivate the locus, but in spite of considerable effort we have never isolated a P-element insertional mutation in Tpl (DORER and CHRISTENSEN 1990; our unpublished data). Thus the most likely hypothesis is that Tpl consists of a cluster of genes with at least partial redundancy.

Using the complete genomic sequence of Drosophila melanogaster we have tested the prediction that the Triplo-lethal region contains a cluster of genes with high similarity. To do so, we first defined the molecular limits of Tpl by isolating and mapping duplications and deletions and then examined the sequence within those limits for repeated genetic units. We describe here the discovery of a multigene family in the Triplo-lethal region, consistent with the best hypothesis based on the genetic data. Although the proteins encoded by this family are novel, the sequences have features that allow us to make predictions about their function. We predict that a family of genes whose dosage is so critical will be well conserved and show evidence of strong selection on expression levels. Comparison of the D. melanogaster gene family with the orthologous genes in Anopheles gambiae allows us to analyze the expression, selection, and evolution of the family.


*  MATERIALS AND METHODS

Drosophila rearrangements and mapping:
Drosophila stocks were previously described (DORER and CHRISTENSEN 1990; DORER et al. 1995) or were obtained from the Indiana University Drosophila Stock Center. Duplications and deficiencies of Tpl were generated by crossing flies carrying both a Δ2-3 source of transposase and a single P element inserted near Tpl to YSX.YL, In(1)EN y;;Dp(3;3)Tpl pp/Df(3R)Tpl10 pp. Survivors were backcrossed to the Dp/Df line to establish a stock carrying the new rearrangement, often flanked by the starting P element. Three single P-element insertions were used: P{ry+t7.2=PZ}l(3)0108601086, inserted in Rm62; P{hsneo} l(3)neo331, inserted in castor (COOLEY et al. 1988); and RS2/24, inserted in pollux (VINCENT et al. 1990). Inverse PCR was used to amplify the DNA flanking P-induced rearrangements as described (SPRADLING et al. 1999). DNA sequencing was done by the University of Nebraska DNA Sequencing Facility. Breakpoints were mapped by comparing these sequences to the Drosophila genome using BLAST (ALTSCHUL et al. 1997).

Bioinformatics:
Sequence similarity searches were done using the BLAST server at http://www.ncbi.nlm.nih.gov (ALTSCHUL et al. 1990, 1997) or Vector NTI from Informax. Targeting predictions were done using TargetP 1.01 (NIELSEN et al. 1997; EMANUELSSON et al. 2000). Transmembrane helix predictions were done with TMHMM 2.0 (KROGH et al. 2001). Multiple alignments based on amino acid sequences were generated by MULTICLUSTAL (YUAN et al. 1999) and ClustalW (THOMPSON et al. 1994). On the basis of this alignment, amino acid distances were estimated by the JTT method (JONES et al. 1992).

Phylogenetic relationships were reconstructed with the neighbor-joining (NJ) method (SAITOU and NEI 1987). Bootstrap supporting values were estimated from 1000 repetitions of bootstrap sampling. JTT distance estimation, NJ tree reconstruction, and bootstrap analysis were conducted with PHYLIP version 3.6.a3 (FELSENSTEIN 2001).
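The bootstrap sampling step can be illustrated with a minimal sketch. It is not the PHYLIP implementation; it only shows the idea of resampling alignment columns with replacement to build one pseudo-alignment per replicate. The function name, the toy sequences, and the random seed are assumptions made for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns drawn with replacement."""
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


# Toy alignment; names and residues are invented, not Osiris sequences.
toy = {"geneA": "MKTLLVC", "geneB": "MKSLLIC", "geneC": "MRTLMVC"}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
```

Each replicate would then be passed through the same distance estimation and tree reconstruction as the original alignment, and the bootstrap value of a node is the percentage of replicate trees that contain it.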

Codon usage bias was measured as the "effective number of codons" (ENC) as developed by WRIGHT (1990). Numbers of synonymous and nonsynonymous substitutions per site were estimated by the method of LI (1993). General statistical analyses were conducted with JMP 5 (SAS Institute).
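For readers unfamiliar with ENC, the following is a minimal sketch of the calculation as it is commonly described for the standard genetic code: a codon homozygosity F is computed for each amino acid, averaged within each degeneracy class, and combined as Nc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6. This is not the software used in this study. The function name, the decision to skip amino acids seen fewer than twice, and the cap at 61 are assumptions of this sketch.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order ('*' marks stop codons).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(("".join(p) for p in product(BASES, repeat=3)), AMINO))

# Group synonymous sense codons by amino acid.
SYNONYMS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        SYNONYMS[aa].append(codon)

# Number of amino acids in each degeneracy class of the standard code.
CLASS_SIZES = {2: 9, 3: 1, 4: 5, 6: 3}


def effective_number_of_codons(cds):
    """ENC of an in-frame coding sequence (sketch; see assumptions above)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    f_values = defaultdict(list)  # degeneracy class -> homozygosities
    for codons in SYNONYMS.values():
        k = len(codons)
        n = sum(counts[c] for c in codons)
        if k == 1 or n < 2:
            continue  # Met/Trp carry no information; n < 2 is undefined
        s = sum((counts[c] / n) ** 2 for c in codons)
        f = (n * s - 1) / (n - 1)
        if f > 0:
            f_values[k].append(f)
    mean_f = {k: sum(v) / len(v) for k, v in f_values.items()}
    # If isoleucine is absent, approximate F3 by the mean of F2 and F4.
    if 3 not in mean_f and 2 in mean_f and 4 in mean_f:
        mean_f[3] = (mean_f[2] + mean_f[4]) / 2
    missing = [k for k in CLASS_SIZES if k not in mean_f]
    if missing:
        raise ValueError(f"no usable data for degeneracy classes {missing}")
    nc = 2 + sum(size / mean_f[k] for k, size in CLASS_SIZES.items())
    return min(nc, 61.0)
```

A sequence that uses a single codon for every amino acid gives the minimum value of 20, and a sequence that uses all synonymous codons equally reaches the cap of 61.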


*  RESULTS

Molecular mapping of Tpl:
Because we and others have been unable to isolate point or transposon insertion mutations in Tpl, in spite of a very powerful selection, we generated duplications and deletions flanking single P-element insertions (PRESTON et al. 1996). Because the proximal breakpoint of Dp(3;3)Tpl is just distal to the Rm62 gene (DORER et al. 1990), we chose three single P-element insertions near Rm62 as starting points for new duplications and deficiencies. Of the 24 duplications and 43 deficiencies that resulted, we were able to accurately map 29 of them by inverse PCR and sequencing. The smallest duplication, Dp(3;3)TplJE10B, is duplicated from Rm62 through Pak, a region of ~334 kb. The smallest deficiency, Df(3R)Tpl6F, also has its distal breakpoint in Pak, confirming that Tpl is located between Rm62 and Pak.
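The mapping logic can be sketched as an interval intersection: because every Tpl duplication and every Tpl deficiency must include the locus, Tpl has to lie in the region common to all of them. In the sketch below, coordinates are in kilobases measured from Rm62. Only the ~334-kb span comes from the text; the other intervals and the function name are hypothetical.

```python
def common_interval(intervals):
    """Return the (start, end) region shared by all intervals, or None."""
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if start < end else None


# Hypothetical rearrangements as (proximal, distal) breakpoints in kb.
rearrangements = [
    (0, 334),     # smallest duplication: Rm62 through Pak (~334 kb)
    (-120, 410),  # a larger duplication (invented)
    (-40, 334),   # a deficiency with its distal breakpoint in Pak (invented)
]
tpl_limits = common_interval(rearrangements)
```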

A novel family of proteins is encoded within the Triplo-lethal region:
Examination of this region in the Drosophila genome (ADAMS et al. 2000) reveals a previously undescribed family of genes that is consistent with the unusual genetic properties of Tpl. This family of genes is located in the 168-kb region from CG15585 through CG15188, which represents roughly half of the Tpl region defined by the breakpoints. Only 7 of the 27 genes in this region are not members of this family (CG31562, NPFR1, CG15589, CG15597, CG15594, CG31556, and CG31560). BLASTP and PSI-BLAST searching has revealed only 3 members of the family outside of this cluster (see below). Of these 23 family members, 18 are shown aligned in Figure 1. The others are described separately below. The complete alignment is given in Supplemental Data at http://www.genetics.org/supplemental/. All of the proteins appear to have endoplasmic reticulum signal peptides. In addition to the signal peptide, these proteins have three conserved domains. The first is near the amino terminus and consists of a pair of cysteines usually separated by seven to nine amino acids. The second consists of four hydrophobic blocks, separated by lysines. A proline is usually present in the first two hydrophobic blocks. This region is predicted to be a transmembrane domain (KROGH et al. 2001) and resembles a stop-transfer anchor; thus these are type I transmembrane proteins with the N terminus outside the cell and the C terminus inside (GODER and SPIESS 2001). The third domain is a region rich in conserved histidines and tyrosines, including the highly conserved sequence AQXLAY near the carboxyl terminus. A number of different endocytic signaling motifs include tyrosines (BONIFACINO and DELL'ANGELICA 1999; ROYLE et al. 2002), and copies of one such signal, YXXØ, are boxed in Figure 1. Because the Isis locus partially rescues the effect of trisomy for this region (DORER et al. 1993), we call this family of proteins the Osiris family and have named the genes Osiris 1 through Osiris 23 on the basis of their position in the cluster (see Figure 1 legend).
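A scan for YXXØ motifs can be sketched with a regular expression. The sketch assumes that Ø stands for a bulky hydrophobic residue and takes that set to be L, I, M, F, and V; the residue set, the function name, and the example peptide are assumptions, not taken from the alignment.

```python
import re

# Y, any two residues, then a bulky hydrophobic residue (assumed set).
# The lookahead lets overlapping motifs be reported as well.
YXXO = re.compile(r"(?=(Y..[LIMFV]))")


def find_yxxo_motifs(protein):
    """Return (0-based position, motif) for every YXXØ match."""
    return [(m.start(), m.group(1)) for m in YXXO.finditer(protein.upper())]


# Invented peptide used only to exercise the function.
hits = find_yxxo_motifs("AAYQRLGGYAAF")
```

For the invented peptide this reports two motifs, YQRL at position 2 and YAAF at position 8.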




Figure 1. Alignment of 18 members of the Osiris multigene family. Identical and strongly conserved amino acids are indicated with black shading, similar residues are shaded gray, and the YXXØ motifs (see text) are boxed. Consensus residues are defined as being present in at least 50% of the proteins. Numbers in parentheses indicate where that number of amino acids is not shown. The signal peptide and the nonconserved region between it and the cysteine motifs are not shown. The revised Osiris names correspond to the provisional CG names as follows: Osiris 1, CG15585; Osiris 2, CG1148; Osiris 3, CG1150; Osiris 4, CG10303; Osiris 5, CG15590; Osiris 6, LD21503; Osiris 7, CG1153; Osiris 8, CG15591; Osiris 9, CG15592; Osiris 10, CG15593; Osiris 11, CG15596; Osiris 12, CG1154; Osiris 13, CG15595; Osiris 14, CG1155, Sp558; Osiris 15, CG1157; Osiris 16, CG31561; Osiris 17, CG15598; Osiris 18, CG1169; Osiris 19, CG15189; Osiris 20, CG15188; Osiris 21, CG14925; Osiris 22, CG8644; and Osiris 23, CG15538.

Four protein sequences from the cluster at 83E that appear to be members of the Osiris family cannot be aligned as well and were not shown in Figure 1, but are aligned in Supplemental Data. These four proteins are Osiris 10, Osiris 13, Osiris 14, and Osiris 17. Each includes the transmembrane domain and the conserved tyrosine motif; however, Osiris 10 appears to be internally repeated, and Osiris 13 and Osiris 14 are diverged at the ends, perhaps because of errors in predicting the exons. Osiris 17 includes copies of the cysteine motif at both the amino and carboxyl termini and is predicted to be mitochondrial. Despite these differences, the four are recognizable as members of the family, and annotation errors may account for the differences.

Through BLAST and PSI-BLAST queries of the NCBI database, we have identified three other members of this family, encoded at three different sites elsewhere in the genome. Osiris 21 (CG14925, polytene region 32E) and Osiris 23 (CG15538, 99F) appear to be typical members of the family (Figure 1). Osiris 22 (CG8644, 87E) lacks the N-terminal cysteines, although it is otherwise very similar to the others (supplemental data at http://www.genetics.org/supplemental/). None of these loci are triplo-lethal or located within haplo-insufficient regions (LINDSLEY et al. 1972).

The Osiris gene family is highly conserved between Drosophila and Anopheles:
We compared the genes in this region to the A. gambiae genomic sequence (HOLT et al. 2002) and found that Anopheles has an orthologous family of proteins, mostly encoded in two clusters on chromosome 2R. Phylogenetic relationships of the Osiris genes from both species were determined as described in MATERIALS AND METHODS, and the resulting tree is shown in Figure 2. With very few exceptions the closest relative of each gene is its ortholog in the other species, rather than any of the paralogs in the same species. These orthologous pairs were also supported with high bootstrap values (>90%). This suggests that the family diverged by gene duplication before the divergence of Drosophila and Anopheles, but each member of the family has since retained its unique features.
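The observation that each gene is closer to its ortholog than to any paralog can be checked directly on a pairwise distance matrix, as in the sketch below. It is a simplification of the tree-based analysis, not a replacement for it. The gene names and distances are invented, and the helper names are hypothetical.

```python
def nearest_neighbor(gene, distances):
    """Return the gene with the smallest distance to `gene`."""
    others = {g: d for g, d in distances[gene].items() if g != gene}
    return min(others, key=others.get)


def nearest_is_cross_species(gene, distances, species_of):
    """True if the closest relative of `gene` is from the other species."""
    return species_of[nearest_neighbor(gene, distances)] != species_of[gene]


# Invented distances for two Drosophila (dm) and two Anopheles (ag) genes.
pairs = {
    ("dm1", "ag1"): 0.30, ("dm2", "ag2"): 0.40,
    ("dm1", "dm2"): 1.20, ("ag1", "ag2"): 1.10,
    ("dm1", "ag2"): 1.30, ("dm2", "ag1"): 1.25,
}
distances = {}
for (a, b), d in pairs.items():
    distances.setdefault(a, {})[b] = d
    distances.setdefault(b, {})[a] = d
species_of = {"dm1": "dm", "dm2": "dm", "ag1": "ag", "ag2": "ag"}

all_cross = all(nearest_is_cross_species(g, distances, species_of)
                for g in distances)
```

A pattern in which every nearest neighbor is cross-species is what one expects when the duplications predate the split between the two lineages.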



Figure 2. Neighbor-joining phylogeny based on the amino acid alignment of the 23 D. melanogaster and 22 A. gambiae Osiris family genes. The tree is rooted with the Osiris 22/agCG47244 pair as an outgroup. The numbers at the nodes are bootstrap supporting values (%). Only values >70% are shown.

To understand the close relationships between orthologs, we examined the codon usage biases of the genes and the base substitution patterns between orthologous pairs. Codon usage bias was measured as the effective number of codons. It can range from 20 (where only one codon is used for each of the 20 amino acids and thus codon usage is most biased) to 61 (where all possible codons are used and there is no bias). The average ENC for >12,000 genes from D. melanogaster is 49, with a range from 28 to 61 (MORIYAMA 2003). It has been shown that highly expressed genes have high codon usage bias as seen by a low ENC. For example, the average ENC among ribosomal protein genes is 39 (MORIYAMA 2003). The average ENC from the 23 Osiris genes from D. melanogaster is 45.1 with a range from 31 to 61. The ENC from the 22 A. gambiae homologs ranges between 33 and 57 with an average of 40.6. The codon usage bias was significantly correlated between orthologous pairs of genes (R = 0.52, P = 0.02). The average G + C contents at the fourfold degenerate positions are 71% (D. melanogaster) and 74% (A. gambiae) and are also significantly correlated between orthologous pairs (R = 0.54, P = 0.01). On the other hand, both species have low and uncorrelated G + C content in introns (33% for D. melanogaster and 43% for A. gambiae; R = 0.12, P >> 0.05). These observations indicate that there is strong selection on these genes and on the maintenance of their expression levels (DURET 2002; HURST et al. 2002).
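G + C content at fourfold degenerate positions can be sketched as follows. In the standard genetic code, a third codon position is fourfold degenerate when the first two bases alone fix the amino acid, which holds for eight codon families. The function name and the example sequences are assumptions; this is not the code used for the analysis.

```python
# First two bases of the eight fourfold degenerate codon families
# (Leu CTN, Val GTN, Ser TCN, Pro CCN, Thr ACN, Ala GCN, Arg CGN, Gly GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def gc4(cds):
    """Fraction of G or C at fourfold degenerate third positions."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    thirds = [cds[i + 2] for i in range(0, usable, 3)
              if cds[i:i + 2] in FOURFOLD_PREFIXES]
    if not thirds:
        return None  # no fourfold degenerate sites in this sequence
    return sum(base in "GC" for base in thirds) / len(thirds)


# Four alanine codons ending in C, G, A, and T: half are G or C.
example = gc4("GCCGCGGCAGCT")
```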

A significant negative correlation is observed between the numbers of synonymous substitutions per site and the codon usage bias of orthologous pairs (R = 0.76, P = 0.002). This again implies that these genes are under translational selection. Remarkably, the orthologous pairs are similar enough to estimate the synonymous substitution rates for 15 of them. The average ratio of nonsynonymous to synonymous substitutions per site is 0.37, and the ratios range from 0.25 to 0.58. These ratios are <1.0 (Figure 3), indicating that these genes are under selection. Not surprisingly, these ratios are higher than those obtained within the Drosophila lineage. BERGMAN et al. (2002), for example, reported that ~90% of such ratios obtained from comparisons among three Drosophila species are under 0.2. Interestingly, nonsynonymous substitution rates are correlated with synonymous substitution rates (Figure 3; Spearman Rho = 0.6, P = 0.02). Such correlations have been described in Drosophila (AKASHI 1994) and mammalian genes (MAKALOWSKI and BOGUSKI 1998). In the case of Drosophila genes, translational selection both on codon usage bias and on amino acid substitutions is considered to cause such correlations (AKASHI 1994).



Figure 3. Correlation between the numbers of nonsynonymous and synonymous substitutions per site. The dashed line shows where the ratio between the numbers of synonymous and nonsynonymous substitutions per site is 1.0. The numbers of nonsynonymous substitutions were estimated from 20 orthologous pairs, but synonymous substitution rates could be determined for only 14 pairs, due to the arithmetic violations caused when there are too many substitutions. Data from these 14 pairs are plotted.
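The "arithmetic violations" mentioned in the legend arise because multiple-hit corrections take the logarithm of a quantity that becomes zero or negative when the observed divergence is too high. The sketch below uses the Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3), as a simple stand-in. It is not the LI (1993) method, which uses a more elaborate correction, but it fails in the same way at saturation. The function name is an assumption.

```python
import math


def jukes_cantor_distance(p):
    """Substitutions per site from the observed proportion of differences.

    Returns None when the correction is undefined (p >= 0.75), which is
    the kind of saturation that prevents a rate from being estimated.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    inner = 1.0 - 4.0 * p / 3.0
    if inner <= 0.0:
        return None  # log of a non-positive number: sites are saturated
    return -0.75 * math.log(inner)


moderate = jukes_cantor_distance(0.30)   # a finite estimate
saturated = jukes_cantor_distance(0.80)  # None: too many substitutions
```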

The Osiris gene cluster displays significant synteny:
The map locations of most of the A. gambiae orthologs are known (HOLT et al. 2002), and the families maintain a remarkable degree of synteny in the two species. The genes found in the Tpl cluster in D. melanogaster are found in the same order in A. gambiae in two clusters on chromosome 2. Ten of the family members are in polytene region 18CD, and another seven are in 15D. A comparison of the Drosophila and Anopheles clusters is shown in Figure 4. The block of genes from Osiris 1 through Osiris 12 retains one of the largest regions of microsynteny found in a comparison of the two genomes (ZDOBNOV et al. 2002). In this interval, 9 of 11 pairs of orthologous Osiris genes are in the same order. With only two exceptions in each species the genes are all transcribed from the same strand. This synteny also includes the nonfamily members NPFR1 and CG15589/agCG45916. Similarly, the block from Osiris 14 to Osiris 20 contains seven family members in the same order as their Anopheles orthologs in 15D. Two of the unlinked Drosophila family members (Osiris 21 and Osiris 22) have Anopheles orthologs that are also unlinked to the main clusters. The other unlinked Drosophila family member, Osiris 23, does not have a clear ortholog.
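One way to quantify "genes in the same order" is the longest increasing subsequence of ortholog ranks: list the genes in their order along one chromosome, replace each by the rank of its ortholog along the other, and find the largest set that is still in ascending order. This is a sketch of that idea, not necessarily how the count in the text was obtained. The rank list is a toy arrangement with two local swaps, not the actual Osiris gene order, and the function name is hypothetical.

```python
from bisect import bisect_left


def longest_collinear_set(ranks):
    """Length of the longest strictly increasing subsequence of ranks."""
    tails = []  # tails[i] = smallest tail of an increasing run of length i+1
    for r in ranks:
        i = bisect_left(tails, r)
        if i == len(tails):
            tails.append(r)
        else:
            tails[i] = r
    return len(tails)


# Toy example: 11 genes, two adjacent pairs swapped relative to the other species.
toy_ranks = [1, 2, 3, 5, 4, 6, 7, 8, 10, 9, 11]
collinear = longest_collinear_set(toy_ranks)
```

With two swapped neighbors out of 11 genes, 9 genes remain collinear in this toy arrangement.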



Figure 4. Map of the Osiris genes in D. melanogaster and A. gambiae. Drosophila genes are shown in the center and Anopheles genes are to either side. Unlinked or unmapped genes are indicated off of the chromosomes and with dashed borders. Orthologous pairs are connected by lines. Nonhomologous genes that have orthologs in a syntenic position are also shown in italics. All the mapped genes are transcribed from top to bottom except agCG45918, Osiris 4, Osiris 11, and agCG47253.

Osiris family members in other species:
Homologs to the Osiris genes have also been sequenced in the dipteran insects D. pseudoobscura and Aedes aegypti. Not surprisingly, all 23 family members are found in the D. pseudoobscura genome, completely syntenic with the D. melanogaster genes (http://hgsc.bcm.tmc.edu/drosophila). No other insect genomes have been reported to be completely sequenced as of this writing. However, a partial cDNA from A. aegypti (GenBank accession no. BQ789636) encodes a homolog of Osiris 11. The cDNA sequence contains the putative transmembrane domain, and the high degree of conservation of this region is shown in Figure 5A.



Figure 5. Alignment of D. melanogaster Osiris proteins with homologs from other insect species. (A) Osiris 11. Only the C-terminal 114 amino acids of the A. aegypti homolog are available from GenBank. D. melanogaster, GenBank accession no. NP_649630; D. pseudoobscura, translation of nucleotides 51,261–52,215 of Contig9442 from 1/13/03 sequence release (http://hgsc.bcm.tmc.edu/drosophila); A. gambiae, agCP2541, GenBank accession no. EAA08588; A. aegypti, translation of partial cDNA clone AeP-230, GenBank accession no. BQ789636. (B) Osiris 14. Only the N-terminal 215 amino acids of the A. mellifera homolog are available from GenBank. D. melanogaster, GenBank accession no. AAF34808; D. pseudoobscura, translation of nucleotides 144,856–145,100, 145,218–145,425, and 145,495–145,851 of Contig5874_Contig1844.5 from 3/26/03 sequence release (http://hgsc.bcm.tmc.edu/drosophila); A. mellifera, translation of ESTs BB160010B10C05.5 (GenBank accession no. BI512836), BB160020B10B02.5 (GenBank accession no. BI515793), and BB170024B10E08.5 (GenBank accession no. BI509951).

We have also observed Osiris genes in four other insect orders. At least 17 different homologs to the Drosophila genes are found by BLAST searches of the first assembly of the honeybee (Apis mellifera) genome (http://titan.biotec.uiuc.edu/bee/honeybee_project.htm). Three honeybee expressed sequence tags (ESTs; BI512836, BI515793, and BI509951) together encode a probable ortholog to Osiris 14 that contains the signal peptide, the paired cysteines, and the transmembrane domain (Figure 5B). cDNAs of other Osiris genes have been recovered from the lepidopterans Bombyx mori (BP121280, BP117216, BP119727, BP119640, BP119167, and others), Helicoverpa armigera (BU038419), and Manduca sexta (BF046752); the coleopteran Cicindela campestris (BQ475392); and the hemipteran Toxoptera citricida (CB855036). In contrast to the conservation demonstrated in these insect species, no homologs have been sequenced in any other phyla to date. This suggests that the function of the Osiris proteins may be insect or arthropod specific.

Expression:
The lethal phase of Tpl aneuploids is late embryonic or early larval, with the tracheae and the gut the first tissues to be affected (SMOYER et al. 2004). Expression data are available from the Berkeley Drosophila Genome Project for some of the Osiris genes (http://www.fruitfly.org) and are partly consistent with the phenotype. The embryonic expression of six of these genes (Osiris 3, Osiris 7, Osiris 9, Osiris 17, Osiris 18, and Osiris 20) peaks in stages 13–16, while Osiris 19 is detected throughout embryogenesis and with a much broader tissue distribution than the others. Osiris 9, Osiris 18, and Osiris 20 are expressed in the embryonic tracheal system; Osiris 9 is expressed in the esophagus and foregut; and Osiris 17 is expressed in the esophagus and hindgut. Additional family members known to be expressed in late embryonic stages include Osiris 2, Osiris 6, Osiris 14, and Osiris 15, consistent with a late embryonic lethal phase.

Expression of Osiris genes in later stages has also been observed. Osiris 7 shows two peaks of expression during metamorphosis in addition to one late in embryogenesis (ARBEITMAN et al. 2002). The A. aegypti cDNA homolog to Osiris 11 is expressed during metamorphosis (KREBS et al. 2002), and the Osiris 14 homolog in A. mellifera was found in an adult brain cDNA library (WHITFIELD et al. 2002).


*  DISCUSSION

The Triplo-lethal locus has been mysterious since its discovery in 1972, primarily because point mutations and transposon insertional mutants have not been isolated (KEPPY and DENELL 1979; ROEHRDANZ and LUCCHESI 1980; DORER and CHRISTENSEN 1990). In 1979 Denell proposed three hypotheses to explain the peculiar genetic properties of Tpl (KEPPY and DENELL 1979). Subsequent work suggested that a cluster of related genes is the most likely hypothesis. We tested this by molecularly defining the limits of Tpl to ~334 kb and examining the genomic sequence of that region. We found 20 genes that are clustered and closely related and encode a novel family of transmembrane proteins. With 23 total members, this is one of the largest gene families in Drosophila. Of the 1437 sequence similarity groups reported at http://www.fruitfly.org/annot/similarity.html, only 26 groups are larger.

Homologous genes have been found only in insects, and the function of this family is unknown in A. gambiae or any other insect species. We predict that the orthologous families in Anopheles and other insects will be dosage sensitive. Reciprocal crosses between D. melanogaster carrying duplications and deficiencies of Tpl with D. simulans, D. mauritiana, and D. sechellia have shown that Tpl is both triplo- and haplo-lethal to the interspecific hybrids of both sexes (our unpublished data). These genes are also located in one of the longest regions of microsynteny between D. melanogaster and A. gambiae (ZDOBNOV et al. 2002). Comparison of the sequences in the two species shows that in almost all cases the most closely related gene is the ortholog, rather than any of the paralogs. The low synonymous substitution rate appears to be due to strong codon bias, suggesting selection on expression level. The nonsynonymous substitution rate is lower than the synonymous rate, suggesting selection on function. These two observations and the clustering of 20 of the 23 family members in 83DE are consistent with the Osiris cluster of proteins being the Triplo-lethal locus.

The genetic data for Tpl (KEPPY and DENELL 1979) led Denell to propose that the functions of the individual genes in the cluster are partially redundant; another possibility is a threshold effect such that duplication or deletion of multiple genes is needed to see lethal dosage effects. Either of these possibilities could explain the inability of point mutants to complement duplications of the entire cluster. Subdivision of the Osiris cluster and transformation with members of the family would allow these hypotheses to be distinguished, as well as allowing a formal demonstration that the Osiris cluster is Tpl. Maintenance of the linkage relationships throughout evolution is reminiscent of the synteny maintained within the homoeotic gene families (ZHANG and NEI 1996). It is possible that the tight linkage is maintained in response to selection; perhaps the genes are coordinately regulated or imprinted (HURST et al. 2002). In any case, the stability of the linkage arrangements and the sequences through long periods of time is unusual.

Without knowing the precise function of these proteins it is hard to know why this genetic region is so dosage sensitive. However, the relative concentrations of membrane proteins can affect rates of association and assembly of complexes (HOFFMAN and EDELMAN 1983; BRAVO et al. 2000; KEENAN CURTIS and KANE 2002), leading to dosage sensitivity. The very high sequence conservation in the hydrophobic domain is intriguing. This region is somewhat long for a transmembrane domain, and it is not completely hydrophobic, showing a periodicity of prolines and glycines, as well as lysines. These sequences are conserved in Anopheles, which suggests that this region has been under selection for more than interaction with membrane lipids. This transmembrane domain likely interacts with other proteins (possibly each other in complexes) or with an intramembrane protease such as the rhomboid protein (URBAN and FREEMAN 2002). Given that Tpl-trisomics are partially rescued by hyperoxia (SMOYER et al. 2004), the conserved pair of cysteines is also intriguing. It is possible that the extracellular cysteine pair responds to redox potential and the intracellular tyrosines play a role in signaling. Finally, we suggest that because of the extreme dosage-sensitive lethality, the impossibility of rescuing duplications with point mutants, and the lack of homologs in other phyla, this family of proteins would make an ideal target for genetic modifications or insecticide discovery in Anopheles and other dipteran pests.


*  FOOTNOTES

1 Present address: Department of Forestry and Natural Resources, Purdue University, West Lafayette, IN 47907-2033.


*  ACKNOWLEDGMENTS

We thank Deidre Potter for excellent advice; Larry Harshman, Han Asard, and Bob Weldon for helpful conversations; Laura Smoyer, John Engelman, and Marilyn Cadden for technical assistance; and Joel Eissenberg for critical comments on the manuscript. This work was supported in part by grants to A.C.C. from the National Science Foundation and to D.R.D. from the National Institutes of Health (S06 GM08037-31).

Manuscript received March 26, 2003; Accepted for publication May 16, 2003.


*  LITERATURE CITED

ADAMS, M. D., S. E. CELNIKER, R. A. HOLT, C. A. EVANS, and J. D. GOCAYNE et al., 2000  The genome sequence of Drosophila melanogaster.. Science 287:2185-2195.[Abstract/Free Full Text]

AKASHI, H., 1994  Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 136:927-935.[Abstract]

ALTSCHUL, S. F., W. GISH, W. MILLER, E. W. MYERS, and D. J. LIPMAN, 1990  Basic local alignment search tool. J. Mol. Biol. 215:403-410.[Medline]

ALTSCHUL, S. F., T. L. MADDEN, A. A. SCHAFFER, J. ZHANG, and Z. ZHANG et al., 1997  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.[Abstract/Free Full Text]

ARBEITMAN, M. N., E. E. M. FURLONG, F. IMAM, E. JOHNSON, and B. H. NULL et al., 2002  Gene expression during the life cycle of Drosophila melanogaster. Science 297:2270-2275.[Abstract/Free Full Text]

BERGMAN, C. M., B. D. PFEIFFER, D. E. RINCON-LIMAS, R. A. HOSKINS, A. GNIRKE et al., 2002 Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome. Genome Biol. 3: 0086.1–0086.20.

BONIFACINO, J. S. and E. C. DELL'ANGELICA, 1999  Molecular bases for the recognition of tyrosine-based sorting signals. J. Cell Biol. 145:923-926.[Free Full Text]

BRAVO, A., B. ILLANA, and M. SALAS, 2000  Compartmentalization of phage {phi}29 DNA replication: interaction between the primer terminal protein and the membrane-associated protein p1. EMBO J. 19:5575-5584.[Medline]

COOLEY, L., R. KELLEY, and A. SPRADLING, 1988  Insertional mutagenesis of the Drosophila genome with single P elements. Science 239:1121-1128.[Abstract/Free Full Text]

DENELL, R. E., 1976  The genetic analysis of a uniquely dose-sensitive chromosomal region of Drosophila melanogaster.. Genetics 84:193-210.[Abstract/Free Full Text]

DORER, D. R. and A. C. CHRISTENSEN, 1990  The unusual spectrum of mutations induced by hybrid dysgenesis at the Triplo-lethal locus of Drosophila melanogaster.. Genetics 125:795-801.[Abstract]

DORER, D. R., A. C. CHRISTENSEN, and D. H. JOHNSON, 1990  A novel RNA helicase gene tightly linked to the Triplo-lethal locus of Drosophila melanogaster.. Nucleic Acids Res. 18:5489-5494.[Abstract/Free Full Text]

DORER, D. R., M. A. CADDEN, B. GORDESKY-GOLD, G. HARRIES, and A. C. CHRISTENSEN, 1993  Suppression of a lethal trisomic phenotype in Drosophila melanogaster by increased dosage of an unlinked locus. Genetics 134:243-249.[Abstract]

DORER, D. R., D. H. EZEKIEL, and A. C. CHRISTENSEN, 1995  The Triplo-lethal locus of Drosophila: reexamination of mutants and discovery of a second-site suppressor. Genetics 141:1037-1042.[Abstract]

DURET, L., 2002 Evolution of synonymous codon usage in metazoans. Curr. Opin. Genet. Dev. 12:640-649.

EISSENBERG, J. C., J. MA, M. A. GERBER, A. CHRISTENSEN, and J. A. KENNISON et al., 2002 dELL, an essential RNA polymerase II elongation factor with a general role in development. Proc. Natl. Acad. Sci. USA 99:9894-9899.

EMANUELSSON, O., H. NIELSEN, S. BRUNAK, and G. VON HEIJNE, 2000 Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J. Mol. Biol. 300:1005-1016.

FELSENSTEIN, J., 2001 PHYLIP (Phylogeny Inference Package), version 3.6.a3. University of Washington, Seattle.

GODER, V. and M. SPIESS, 2001 Topogenesis of membrane proteins: determinants and dynamics. FEBS Lett. 504:87-93.

HOFFMAN, S. and G. M. EDELMAN, 1983 Kinetics of homophilic binding by embryonic and adult forms of the neural cell adhesion molecule. Proc. Natl. Acad. Sci. USA 80:5762-5766.

HOLT, R. A., G. M. SUBRAMANIAN, A. HALPERN, G. G. SUTTON, and R. CHARLAB et al., 2002 The genome sequence of the malaria mosquito Anopheles gambiae. Science 298:129-149.

HURST, L. D., E. J. B. WILLIAMS, and C. PÁL, 2002 Natural selection promotes the conservation of linkage of co-expressed genes. Trends Genet. 18:604-606.

JONES, D. T., W. R. TAYLOR, and J. M. THORNTON, 1992 The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 8:275-282.

KEENAN CURTIS, K. and P. M. KANE, 2002 Novel vacuolar H+-ATPase complexes resulting from overproduction of Vma5p and Vma13p. J. Biol. Chem. 277:2716-2724.

KEPPY, D. O. and R. E. DENELL, 1979 A mutational analysis of the triplo-lethal region of Drosophila melanogaster. Genetics 91:421-441.

KREBS, K. C., K. L. BRZOZA, and Q. LAN, 2002 Use of subtracted libraries and macroarray to isolate developmentally specific genes from the mosquito, Aedes aegypti. Insect Biochem. Mol. Biol. 32:1757-1767.

KROGH, A., B. LARSSON, G. VON HEIJNE, and E. L. L. SONNHAMMER, 2001 Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305:567-580.

LI, W. H., 1993 Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J. Mol. Evol. 36:96-99.

LINDSLEY, D. L., L. SANDLER, B. S. BAKER, A. T. C. CARPENTER, and R. E. DENELL et al., 1972 Segmental aneuploidy and the genetic gross structure of the Drosophila genome. Genetics 71:157-184.

MAKALOWSKI, W. and M. S. BOGUSKI, 1998 Evolutionary parameters of the transcribed mammalian genome: an analysis of 2,820 orthologous rodent and human sequences. Proc. Natl. Acad. Sci. USA 95:9407-9412.

MORIYAMA, E. N., 2003  Encyclopedia of the Human Genome. Macmillan, London in press.

NIELSEN, H., J. ENGELBRECHT, S. BRUNAK, and G. VON HEIJNE, 1997 Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10:1-6.

PRESTON, C. R., J. A. SVED, and W. R. ENGELS, 1996 Flanking duplications and deletions associated with P-induced male recombination in Drosophila. Genetics 144:1623-1638.

ROEHRDANZ, R. L. and J. C. LUCCHESI, 1980 Mutational events in the triplo- and haplo-lethal region (83DE) of the Drosophila melanogaster genome. Genetics 95:355-366.

ROYLE, S. J., L. K. BOBANOVIC, and R. D. MURRELL-LAGNADO, 2002 Identification of a non-canonical tyrosine-based endocytic motif in an ionotropic receptor. J. Biol. Chem. 277:35378-35385.

SAITOU, N. and M. NEI, 1987 The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.

SMOYER, L. K., D. R. DORER, K. W. NICKERSON, and A. C. CHRISTENSEN, 2004  Phenotype of the Triplo-lethal locus of Drosophila melanogaster and its suppression by hyperoxia. Genet. Res. in press.

SPRADLING, A. C., D. STERN, A. BEATON, E. J. RHEM, and T. LAVERTY et al., 1999 The Berkeley Drosophila Genome Project gene disruption project: single P-element insertions mutating 25% of vital Drosophila genes. Genetics 153:135-177.

THOMPSON, J. D., D. G. HIGGINS, and T. J. GIBSON, 1994 CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.

URBAN, S. and M. FREEMAN, 2002 Intramembrane proteolysis controls diverse signalling pathways throughout evolution. Curr. Opin. Genet. Dev. 12:512-518.

VINCENT, J.-P., J. A. KASSIS, and P. H. O'FARRELL, 1990 A synthetic homeodomain binding site acts as a cell type specific, promoter specific enhancer in Drosophila embryos. EMBO J. 9:2573-2578.

WHITFIELD, C. W., M. R. BAND, M. F. BONALDO, C. G. KUMAR, and L. LIU et al., 2002 Annotated expressed sequence tags and cDNA microarrays for studies of brain and behavior in the honey bee. Genome Res. 12:555-566.

WRIGHT, F., 1990 The ‘effective number of codons’ used in a gene. Gene 87:23-29.

YUAN, J., A. AMEND, J. BORKOWSKI, R. DEMARCO, and W. BAILEY et al., 1999 MULTICLUSTAL: a systematic method for surveying Clustal W alignment parameters. Bioinformatics 15:862-863.

ZDOBNOV, E. M., C. VON MERING, I. LETUNIC, D. TORRENTS, and M. SUYAMA et al., 2002 Comparative genome and proteome analysis of Anopheles gambiae and Drosophila melanogaster. Science 298:149-159.

ZHANG, J. and M. NEI, 1996 Evolution of Antennapedia-class homeobox genes. Genetics 142:295-303.
