Genetics, Vol. 153, 179-219, September 1999, Copyright © 1999

An Exploration of the Sequence of a 2.9-Mb Region of the Genome of Drosophila melanogaster: The Adh Region

M. Ashburner (a,b), S. Misra (d), J. Roote (a), S. E. Lewis (d), R. Blazej (g), T. Davis (c), C. Doyle (g), R. Galle (g), R. George (g), N. Harris (g), G. Hartzell (d), D. Harvey (d,e), L. Hong (d), K. Houston (g), R. Hoskins (g), G. Johnson (a), C. Martin (1,g), A. Moshrefi (g), M. Palazzolo (2,g), M. G. Reese (d), A. Spradling (f), G. Tsang (d,e), K. Wan (g), K. Whitelaw (g), B. Kimmel (2,g), S. Celniker (g), and G. M. Rubin (g,d,e)
a Department of Genetics, University of Cambridge, Cambridge, CB2 3EH, England,
b EMBL—European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, England,
c Department of Pathology, University of Wales College of Medicine, Cardiff, CF4 4XN, Wales,
d Berkeley Drosophila Genome Project, Department of Molecular and Cell Biology, University of California, Berkeley, California 94720-3200,
e Howard Hughes Medical Institute, Life Sciences Annex, University of California, Berkeley, California 94720,
f Howard Hughes Medical Institute, Carnegie Institution of Washington, Baltimore, Maryland
g Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, Berkeley, California 94720

Corresponding author: M. Ashburner, Department of Genetics, Downing St., Cambridge, CB2 3EH, England. E-mail: m.ashburner@gen.cam.ac.uk

Communicating editor: T. C. KAUFMAN


*  ABSTRACT

A contiguous sequence of nearly 3 Mb from the genome of Drosophila melanogaster has been determined from a series of overlapping P1 and BAC clones. This region covers 69 polytene chromosome bands on chromosome arm 2L, including the genetically well-characterized "Adh region." A computational analysis of the sequence predicts 218 protein-coding genes, 11 tRNAs, and 17 transposable element sequences. At least 38 of the protein-coding genes are arranged in clusters of 2 to 6 closely related genes, suggesting extensive tandem duplication. The gene density is one protein-coding gene every 13 kb; the transposable element density is one element every 171 kb. Of 73 genes in this region identified by genetic analysis, 49 have been located on the sequence; P-element insertions have been mapped to 43 genes. Ninety-five (44%) of the known and predicted genes match a Drosophila EST, and 144 (66%) have clear similarities to proteins in other organisms. Genes known to have mutant phenotypes are more likely to be represented in cDNA libraries, and far more likely to have products similar to proteins of other organisms, than are genes with no known mutant phenotype. Over 650 chromosome aberration breakpoints map to this chromosome region, and their nonrandom distribution on the genetic map reflects variation in gene spacing on the DNA. This is the first large-scale analysis of the genome of D. melanogaster at the sequence level. In addition to the direct results obtained, this analysis has allowed us to develop and test methods that will be needed to interpret the complete sequence of the genome of this species.

Before beginning a Hunt, it is wise to ask someone what you are looking for before you begin looking for it. MILNE 1926


IT is nearly 100 years since W. E. Castle and his colleagues at Harvard University introduced Drosophila melanogaster to the joys and rigors of scientific research (KOHLER 1994). From that slender beginning, research with this small fly has dominated genetics and much of biology. It is, therefore, wholly appropriate that Drosophila melanogaster should join the new elite of organisms whose genomes will be sequenced in their entirety (MIKLOS and RUBIN 1996; RUBIN 1998). That goal is still some time away, but significant progress has already been made, with the determination of the complete sequences of the 338-kb bithorax and 430-kb Antennapedia regions (LEWIS et al. 1995; MARTIN et al. 1995; S. CELNIKER, B. PFEIFFER, J. KNAFELS, C. MAYEDA, C. MARTIN and M. PALAZZOLO, unpublished results) and with over 40 Mb of genomic sequence now available in the public domain (BERKELEY DROSOPHILA GENOME PROJECT 1999; EUROPEAN DROSOPHILA GENOME PROJECT 1999). There are many reasons, both pragmatic and theoretical, for wanting to complete the sequence of a model organism such as Drosophila. On a practical level, the availability of this sequence will be of immediate benefit to all studying particular genes. More theoretically, only by the completion of this sequence can we contemplate a description of the protein universe of Drosophila, can we answer with assurance the question of gene number in Drosophila, can we know the nature, number, and distribution of noncoding regions of DNA (including transposable elements), or can we explore the Drosophila genome for regularities in sequence organization that may correlate with chromosome organization. Moreover, the availability of the complete sequence of Drosophila will itself be a major impetus to evolutionary studies and to comparative insect genomics. Finally, but by no means least important, the sequence itself will spur functional studies, themselves of great interest to all biologists, especially those struggling to interpret the function of genes of the larger genomes of mammals.

The analysis and interpretation of long genomic sequences pose several unsolved problems, among which are gene prediction and correlation of genetically identified loci with computationally predicted genes. We have selected the 2.9-Mb Adh region, a region of the genome of D. melanogaster that was already well characterized by conventional genetic analyses, as a test-bed to develop and evaluate approaches to large-scale genomic sequence annotation in Drosophila. This chromosome region is defined as the 69 polytene chromosome bands from 34C4 to 36A2 on chromosome arm 2L, which is the region between (and including) the previously known genes kuzbanian (kuz) and dachshund (dac). Genetic analysis of this chromosome region began with the studies of E. H. Grell in the early 1960s and the recovery of an Adh- deletion, Df(2L)64j (GRELL et al. 1968). W. Sofer and students, especially J. M. O'Donnell (O'DONNELL et al. 1977), recovered several more deletions, using formaldehyde as a mutagen, and defined 12 loci by complementation analysis among 33 EMS-induced lethal mutations uncovered by these deletions. These studies have been continued in the last 20 years by M. Ashburner's group (e.g., WOODRUFF and ASHBURNER 1979a,b).

Genetic analysis has defined 73 genes in this chromosome region. Of these genes, 65 are represented by mutant alleles and 8 more are predicted on the basis of the phenotypes of overlapping deletions. Of those with mutant alleles, 49 genes have at least one lethal allele (i.e., they are genes whose activities are vital), 6 are known only from sterile alleles (2 male sterile and 4 female sterile), 8 only from alleles with clear visible phenotypes, and 2 genes have alleles with no gross phenotype: Adh and smi35A. Forty-nine protein-coding genes (and 5 tRNA genes) in this region had been molecularly characterized prior to or during our work; these included 7 that had not been identified by genetic analysis. In addition to a collection of over 1038 different mutant alleles of genes in this region, the genetic analysis was enormously aided by a very large collection of chromosome aberrations, including 86 inversions, 109 translocations, 317 deletions, and 40 duplications. Apart from some conventional recombination mapping in the early stages of the project, all genes have been ordered by deletion mapping. The genetic positions of the breakpoints of many inversions and translocations have been mapped with respect to the genes, often by combining these breakpoints with others to synthesize deletions or duplications.

These genetic data posed two major questions. The first was that of "saturation": What proportion of the genes had been identified by the genetic analysis? It is well known (e.g., BARRETT 1980) that the distribution of mutant hits to genes defies any rigorous statistical estimation of the size of the class of genes that are mutationally silent (see LEFEVRE and WATKINS 1986). This is particularly true in the present case, since many independent mutagenesis screens using a variety of deletions have been done, as have several specific locus screens. These mutation screens have been done with a variety of chemical agents, with ionizing radiation, and with P elements, and although the most mutable genes in general screens have 50 or more alleles (e.g., wb and esg), we already know, or predict, some genes that have been refractory, including those eight genes predicted from overlapping deletion phenotypes. Moreover, we had no experimental estimate of the number of genes that give no phenotype when mutant (see below). The second question is that raised by the very nonrandom clustering of aberration breakpoints. There are two extreme interpretations of this clustering: that the different regions differ in target size or that there is some intrinsic property that biases the recovery of chromosomal breaks. Both this question and that of "saturation" have been answered from the analysis of the sequence of this region.

There is direct experimental evidence, or prediction, for 229 genes in the 2.9 Mb of sequenced DNA. Of these, there is evidence for function or some hint of function from sequence matches for 102 genes. One of the challenges for the future is to discover, by experiment, the function of all of the genes.


*  MATERIALS AND METHODS

Genetics:
All of the mutations and chromosome aberrations used in this study are fully described in FlyBase (FLYBASE CONSORTIUM 1999). Table 1 presents a summary of the mutations that have been identified. The majority of these have been published in previous articles from M. Ashburner's laboratory, and others have been given to us by colleagues; those that are new are described in FlyBase. Where possible we have mapped aberration breakpoints genetically by combining the elements of translocations (by segregation) or inversion breakpoints (by recombination, using autosynaptic intermediates in the case of pericentric inversions; see GUBB 1998) so as to synthesize deletions whose limits could be mapped by complementation. All genetic crosses were, unless otherwise stated, done between balancer heterozygotes and care was always taken to allow any very delayed progeny to eclose. A failure of complementation is based upon the absence of nonbalancer progeny, usually in progenies of 200 flies or more. Crosses were routinely done on standard laboratory food at 25°.


 

 
Table 1. Genes in the Adh region identified by genetic analysis

P elements from several laboratories, from screens for lethal P elements on chromosome 2 (see SPRADLING et al. 1995), were screened against three deletions that, in sum, cover the entire genetic interval of interest (Df(2L)b84a7, Df(2L)A48, and Df(2L)r10) and then mapped more precisely using appropriate deletions and mutant alleles. We are very grateful to I. Kiss for the preliminary screen with his P-element collection. Further P elements were initially identified only on the basis of the chromosomal mapping of their insertion site by in situ hybridization to polytene chromosomes, using a P-element probe and standard techniques. These were then subjected to genetic analysis, typically tests for complementation with appropriate deletions and mutant alleles representative of candidate loci. The EP lines used in this study were from the collection described by RORTH et al. 1998.

P-element excisions and male recombinants were generated using P{Δ2-3}99B as the source of an active P transposase. These derivatives were then characterized by conventional genetic complementation analyses.

Cytology:
For conventional polytene chromosome analysis we used propionic-carmine-orcein squash preparations. In situ hybridization was performed by standard procedures using biotinylated probes and horseradish peroxidase staining. Polytene chromosomes were interpreted using the revised maps of C. B. and P. N. Bridges (see LEFEVRE 1976).

Clones:
The P1 clone library, with an average insert size of 80 kb, was that prepared from an isogenic y; cn bw sp stock in the vectors pNS583tet14Ad10 and pAd10sacBII (STERNBERG 1990) and described by SMOLLER et al. 1991. The strategy for building contigs of overlapping clones has been described by KIMMERLY et al. 1996. The first stage was to build a "framework" map of the genome of D. melanogaster by mapping over 2600 of the P1 clones to the polytene chromosomes by in situ hybridization (HARTL et al. 1994). Then, short sequence tagged sites (STS) were used to determine overlaps between P1 clones by STS-content mapping, using a PCR-based approach (OLSON et al. 1989; GREEN and OLSON 1990). STS sequences were derived from a number of sources: end sequences of P1 clones, insertion sites of P elements determined after plasmid rescue or inverse PCR, and sequences of known Drosophila genes. BAC clones were from a newly constructed library in pBACe3.6 (OSOEGAWA et al. 1998; K. OSOEGAWA, A. MAMMOSER and P. DE JONG, unpublished results). This is a 20-hit library from a partial EcoRI digestion of DNA from the y; cn bw sp isogenic stock.

The P1 clones were first assembled into a small number of contigs by screening a 5-hit P1 clone library. By generating STS sequences from the ends of these contigs, mapping them to a second, larger P1 clone library (10 hit), and carrying out directed PCR experiments, these contigs were merged into two, of 0.8 Mb and 1.9 Mb, plus an isolated P1 clone containing the kuzbanian gene. The gaps between the two long contigs and between the isolated P1 clone and the 1.9-Mb contig were closed by screening the BAC clone library with sequences prepared from the appropriate end clones.
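The logic of STS-content mapping can be illustrated with a minimal sketch: clones that share at least one STS are taken to overlap, and sets of mutually connected clones form contigs. This is not the software used in the project; the function name `build_contigs` and the clone and STS names in the example are hypothetical, and real mapping data would also need to handle false-positive and false-negative PCR results.

```python
from collections import defaultdict


def build_contigs(sts_content):
    """Group clones into contigs, assuming clones that share an STS overlap.

    sts_content maps a clone name to the set of STS markers detected on it.
    Returns a sorted list of contigs, each a sorted list of clone names.
    """
    # Index the clones by the STS markers they contain.
    clones_by_sts = defaultdict(set)
    for clone, markers in sts_content.items():
        for sts in markers:
            clones_by_sts[sts].add(clone)

    # Union-find structure over clones.
    parent = {clone: clone for clone in sts_content}

    def find(clone):
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]  # path compression
            clone = parent[clone]
        return clone

    # Join every pair of clones that shares a marker.
    for members in clones_by_sts.values():
        ordered = sorted(members)
        for other in ordered[1:]:
            parent[find(other)] = find(ordered[0])

    contigs = defaultdict(set)
    for clone in sts_content:
        contigs[find(clone)].add(clone)
    return sorted(sorted(group) for group in contigs.values())


# Hypothetical example: two clones share "sts2"; a third stands alone.
example = {
    "P1_A": {"sts1", "sts2"},
    "P1_B": {"sts2", "sts3"},
    "P1_C": {"sts9"},
}
print(build_contigs(example))
```

An isolated clone, such as the one carrying kuzbanian, appears in this representation as a contig of one.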

DNA sequencing:
The sequence of the Adh region has been assembled by first determining the sequences of the 51 individual P1 clones that comprise the 0.8-Mb and 1.9-Mb contigs. The gap between the two contigs was filled by sequencing the BAC clone BACR44L22. The gap between the P1 clones DS07660 and DS01368 was filled by sequencing BACR48E02. Table 2 lists the clones sequenced and their DDBJ/EMBL/GenBank accession numbers.


 

 
Table 2. Sequenced P1 and BAC clones in region 34D-36A

The sequencing strategies have evolved over time. Essentially, ca. 3-kb subclone libraries of randomly sheared DNA were prepared from each P1 clone in plasmid vectors. The sequences of both ends of each plasmid insert were determined using primers complementary to the vector and these sequences were used to assemble a set of overlapping 3-kb clones that span an entire P1 clone. The 3-kb clones were then sequenced using a combination of transposon-mediated sequencing (KIMMEL et al. 1997) and custom oligonucleotide-primed sequence runs. All sequences were determined on both DNA strands and assembled using the PHRAP program (P. GREEN, unpublished results). The error rate was estimated using PHRAP quality scores as <1 in 10,000. We wrote our own genomic assembler to generate a single complete sequence of the entire region from the individual clone sequences. The core alignment software used in this assembler was the sim4 program of FLOREA et al. 1998. The assembler iteratively runs sim4 against pairs of sequences that are known to overlap from the physical mapping data. The assembler then uses the exact alignment that covers the two ends of the clones to incrementally construct the complete sequence, performing reverse complementation when needed.
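The incremental construction described above can be sketched as follows. This is an illustration only and not the authors' assembler: it replaces the sim4 alignment step with a simple exact suffix/prefix comparison, and the function names and the `min_overlap` parameter are assumptions introduced for the example.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_overlap(left, right, min_overlap):
    """Length of the longest suffix of `left` that exactly matches a prefix
    of `right`; 0 if no match of at least `min_overlap` bases exists."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def extend_assembly(assembly, clone, min_overlap=4):
    """Append `clone` to the growing assembly, trying both orientations."""
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    for candidate in (clone, reverse_complement(clone)):
        size = longest_overlap(assembly, candidate, min_overlap)
        if size:
            # Keep the assembly and add only the non-overlapping tail.
            return assembly + candidate[size:]
    raise ValueError("no overlap of the required length found")


# Toy example: the second clone overlaps the first by "TGCA".
print(extend_assembly("ACGTTGCA", "TGCAGGTT"))
```

A clone supplied in the opposite orientation gives the same result, because the reverse complement is tried when the forward orientation does not overlap. Real clone overlaps are many kilobases long and may contain sequencing discrepancies, which is why an alignment program rather than exact matching is needed in practice.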

cDNA identification and sequencing:
cDNA clones derived from genes in the 34D-36A region were identified by searching for sequence matches between the genomic DNA sequence and 5' expressed sequence tags (ESTs) from the Berkeley Drosophila Genome Project (BDGP)/Howard Hughes Medical Institute (HHMI) Drosophila EST project (http://www.fruitfly.org/EST/). In addition, cDNAs corresponding to crp, heix, l(2)35Fe, anon-35Fa, anon-35F/36A, BG:DS02740.2, BG:DS02740.4, BG:DS02740.8, BG:DS02740.9, and BG:DS02740.10 were isolated by screening the LD cDNA library using the method of MUNROE et al. 1995. The LD cDNA library was made from poly(A)+-selected RNA from 0–22-hr embryos, size fractionated (~1 to 6 kb), and directionally cloned in either the Stratagene (La Jolla, CA) Uni-Zap XR vector or the pOT2 plasmid (both EcoRI/XhoI digested; L. HONG, unpublished results). For each gene, the longest available cDNA was sequenced from one strand to allow unambiguous alignment with the genomic sequence. The cDNA sequences were aligned with the genomic sequence using the sim4 program of FLOREA et al. 1998. Because these cDNA sequences were low-pass, single-stranded sequence, it was not always possible to construct a single open reading frame from sim4 alignments. In those cases, adjustments were made by an annotator. The virtual cDNA sequences were verified using the ORFfinder program (v. 0.1, E. FRISE, unpublished results) and their structures relative to the genomic sequence manually checked in CloneCurator (see below).

Molecular mapping of P-element insertion sites:
The precise insertion sites of all P elements described here were determined by comparison of the reference genomic sequence with a sequence that spanned the junction between a P element and the genome using sim4. These junction sequences were determined from either plasmid-rescued clones or inverse PCR products, as described in SPRADLING et al. 1999. The insertion site is reported as the first base pair of the 8-bp target site duplication generated by the P-element insertion.

Sequence analysis:
Two broad categories of computational method were used together to predict and identify genes. The first was gene prediction algorithms, based on the statistical properties of protein-coding regions. The second category of method used alignment algorithms for predictions based upon similarities of the sequence with other sequences in the public domain, both nucleic acid and protein.

The main gene prediction program used in the early stages of this analysis was GENEFINDER (v. 0.83; GREEN 1995), trained on a Drosophila sequence data set (G. HELT, unpublished results). GENEFINDER predicts genes on the basis of the statistical properties of their sequence, codon usage, codon preference, and splice site profiles. More recently, we made a comparison of the performance of a number of different programs using the sequence of the P1 clone DS02740. This showed that GENSCAN (v. 1.0; BURGE 1997; BURGE and KARLIN 1997), trained on a vertebrate sequence data set, gave more reliable predictions than GENEFINDER, GENIE (REESE et al. 1997), or a version of GRAIL trained on a Drosophila sequence training set (XU et al. 1995). This comparison showed a tendency for GENSCAN to overpredict genes. This characteristic was complemented by GENEFINDER, which tends to underpredict genes. For this reason, both programs were used for the final data analyses, using their default parameters. Predictions with scores lower than 45 for GENSCAN or 20 for GENEFINDER were ignored. No current gene prediction program behaves well with introns that are either very large or very small, and these errors were corrected, whenever possible, by using available alignment data. tRNA genes were predicted using the tRNAscan-SE program (v. 1.02) of LOWE and EDDY 1997.
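The score cutoffs stated above amount to a simple per-program filter. The sketch below assumes that each prediction is held as a dictionary with `program` and `score` keys; that record layout, the function name, and the example gene labels are hypothetical and do not reflect the output format of either program.

```python
# Cutoffs taken from the text: predictions scoring lower than these are ignored.
THRESHOLDS = {"GENSCAN": 45.0, "GENEFINDER": 20.0}


def keep_predictions(predictions, thresholds=THRESHOLDS):
    """Return the predictions whose score reaches the cutoff for their program."""
    return [
        prediction
        for prediction in predictions
        if prediction["score"] >= thresholds[prediction["program"]]
    ]


# Hypothetical records for illustration.
predictions = [
    {"gene": "g1", "program": "GENSCAN", "score": 50.0},
    {"gene": "g2", "program": "GENSCAN", "score": 44.9},
    {"gene": "g3", "program": "GENEFINDER", "score": 20.0},
    {"gene": "g4", "program": "GENEFINDER", "score": 12.0},
]
print([p["gene"] for p in keep_predictions(predictions)])
```

Passing a different `thresholds` mapping makes it easy to explore how the set of retained predictions changes with the cutoff.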

To estimate the statistical properties of D. melanogaster protein-coding regions a nonredundant data set of coding regions (CDS) was made. By nonredundant we mean that for any one gene only one CDS is included, even if the gene encodes multiple protein products (the one included was usually the longest complete sequence available from the EMBL Nucleic Acid Sequence Data Library). All of the CDS regions were checked for legitimate start and stop codons and for a continuous open reading frame in between these. Four genes with non-ATG starts were included in this data set (CTG, amn, ewg; GTG, Cha; CTC, cpo) following advice from D. Cavener, as were two CDSs (oaf and kelch) with in-frame UGA codons, perhaps coding for selenocysteine. This data set of 1335 CDSs was used for the construction of normalized codon and di-codon (hexamer) tables (HELT 1997) and is available as cds_sequence_set.embl.v1.5 from ftp://ftp.ebi.ac.uk/pub/databases/edgp/sequence_sets/ and as na_embl.dros.v1.5 from http://www.fruitfly.org/sequence/download.html.
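The kind of check described for each CDS (a legitimate start codon, a terminal stop codon, and an uninterrupted reading frame between them) can be sketched as below. The function name and its two optional parameters are assumptions made for illustration; they are one possible way of accommodating the non-ATG starts and in-frame UGA codons mentioned in the text, not a description of how the data set was actually curated.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_cds(cds, allowed_starts=("ATG",), allow_internal_tga=False):
    """Return True if `cds` (DNA, 5' to 3') looks like a complete coding region.

    The sequence must be a whole number of codons, begin with an allowed
    start codon, end with a stop codon, and contain no internal stop codon.
    """
    cds = cds.upper()
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if codons[0] not in allowed_starts:
        return False
    if codons[-1] not in STOP_CODONS:
        return False
    # Optionally tolerate in-frame TGA (UGA in the mRNA).
    internal_stops = STOP_CODONS - {"TGA"} if allow_internal_tga else STOP_CODONS
    return not any(codon in internal_stops for codon in codons[1:-1])


print(check_cds("ATGGCCTAA"))
print(check_cds("CTGGCCTAA", allowed_starts=("ATG", "CTG")))
```

Exceptional genes would be passed through with the relevant option switched on rather than by loosening the default check for every sequence.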

Databases against which similarity searches were made included GenBank, dbEST, SWISS-PROT, SPTREMBL, and sequences from the European Drosophila Genome Project (EDGP). Updates of these were collected weekly, the sequence data sorted into species-specific files, and all submissions from the Berkeley Drosophila Genome Project removed to provide data sets for searches. These data sets were then processed to append all database cross-references to FASTA header lines. For sequence similarity searches the BLASTN, BLASTX, and TBLASTX programs (version 2.0a) of W. GISH (unpublished results) were used (with the option B = 1,000,000, options filter = SEG + XNU).

Transposable elements were screened using a nonredundant data set of transposable element sequences from which all "flanking" DNA sequences had been trimmed. This data set was originally derived from the EMBL Nucleotide Sequence Data Library records, but as our analysis progressed more complete sequences of elements only known before from partial sequence were added, replacing incomplete sequences. This data set is available from ftp://ftp.ebi.ac.uk/pub/databases/edgp/sequence_sets/transposon_sequence_set.embl and from http://www.fruitfly.org/sequence/download.html (as na_te.dros).

A collection of repetitive sequences from D. melanogaster, not otherwise included in the transposable element sequence set, was also made. This data set includes, e.g., satellite DNA sequences and a miscellany of sequences annotated as being repetitive by FlyBase. It is not as nonredundant as the other two data sets, and was only used for screening for sequences similar to those previously described as repetitive. The data set is available from ftp://ftp.ebi.ac.uk/pub/databases/edgp/sequence_sets/repeat_sequence_set.embl and http://www.fruitfly.org/sequence/download.html (as na_re.dros).

The data output from these various computational analyses is voluminous and requires intelligent filtering to remove redundant and irrelevant information before being passed to the human annotators. Moreover, the task of annotation is almost impossible without tools for the visualization of these data. An application, BLAST Output Parser (v. 01; BOP), was written (S. LEWIS, unpublished results). BOP summarizes all automatically computed analysis data for an individual sequence into one file (i.e., all output from the programs mentioned previously: BLAST, GENSCAN, etc.). This file is in XML syntax. BOP also removes as much of the "noise" as possible (e.g., redundant matches, "shadow" matches on the noncoding strand, and matches to sequences of very biased base composition). These condensed data were then presented to the annotator in a graphical view (CloneCurator v. 0.1; S. LEWIS, N. HARRIS, S. MISRA and G. HELT, unpublished results).

CloneCurator was used to isolate individual genes from the clone sequences, based on expert evaluation of these analyses. CloneCurator allowed the annotator to compare results from different programs and to view the results using filters to determine a desired level of probability of prediction. The annotator used this visual summary to endorse a set of results as evidence, thereby generating a verified annotation. Annotations can be edited in CloneCurator and the annotators can add textual comments to any particular annotation, assign gene symbols, etc. This program was used to generate nucleic acid and amino acid FASTA files for each gene annotation. When a gene spanned more than one clone, manual intervention by an annotator was necessary to construct virtual mRNA sequences.

Open reading frames of predicted genes were validated using ORFfinder (v. 0.1; E. FRISE, unpublished results) and all predicted proteins were then tested with BLASTP (v. 2.0a) with the options filter = SEG + XNU (unless the results are stated as being "unfiltered") against SWISS-PROT and SPTREMBL protein sets organized into nine taxonomic groups (Drosophila, Caenorhabditis elegans, Saccharomyces cerevisiae, other invertebrates, primates, rodents, other vertebrates, plants, and bacteria). Matches with an expectation above P = 10^-7 (i.e., weaker matches) were ignored.

Protein domains and motifs were analyzed against the PROSITE (release 15.0; HOFMANN et al. 1999) and PFAM (v. 2.1.1; SONNHAMMER et al. 1997; BATEMAN et al. 1999) databases using the programs PPSEARCH [a Unix implementation of MacPattern at http://www2.ebi.ac.uk/services.html (FUCHS 1994)] and HMMER 2.1 (EDDY 1998). PROSITE output was filtered using EMOTIF (NEVILL-MANNING et al. 1998) at the European Bioinformatics Institute (EBI). The SAPS program (version of July 23, 1993; BRENDEL et al. 1992) was run from the EBI server (http://www2.ebi.ac.uk/SAPS/) to analyze various compositional features of predicted protein sequences. The PSORTII suite of programs (HORTON and NAKAI 1997), trained on the proteins of S. cerevisiae, was used to predict the subcellular localization of proteins. Sequence alignments were generated using CLUSTALW (HIGGINS et al. 1996) from the European Bioinformatics Institute server (http://www2.ebi.ac.uk/services.html).

The output from the various sequence analysis programs is archived on FlyBase as FlyBase-Annotation files linked to the sequenced clones. Version 1 of these files includes the analyses used for this article. Subsequent versions will result from reanalysis of the sequence data.

Nomenclature:
All genes are named according to the conventions agreed between the Berkeley and European Drosophila Genome Projects and FlyBase (http://flybase.bio.indiana.edu/docs/nomenclature). Each gene is given a unique name composed of three parts: a prefix (BG for genes defined by the Berkeley Project, EG for those defined by the European Project), followed by a clone name and an integer. The clone name is that of the clone on which the gene was first defined (regardless of whether or not the gene overlaps more than one clone). The final integer is simply a serial number, and does not imply the order of a gene within a clone. An example is BG:DS09218.6, the sixth gene annotated on P1 clone DS09218. If a gene was already known to FlyBase, then a formal name is still assigned but will be treated by FlyBase as a synonym of the established name.
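The three-part naming convention lends itself to a small parser. The sketch below is illustrative only: the regular expression is an assumption about which characters may appear in a clone name (letters and digits), and the names `AnnotationName` and `parse_annotation_name` are hypothetical.

```python
import re
from typing import NamedTuple


class AnnotationName(NamedTuple):
    project: str  # "BG" (Berkeley) or "EG" (European)
    clone: str    # clone on which the gene was first defined
    serial: int   # serial number; does not imply order within the clone


# Assumed pattern: prefix, colon, alphanumeric clone name, dot, integer.
_NAME_PATTERN = re.compile(r"^(BG|EG):([A-Za-z0-9]+)\.(\d+)$")


def parse_annotation_name(name):
    """Split a formal gene name such as 'BG:DS09218.6' into its parts."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a formal annotation name: {name!r}")
    project, clone, serial = match.groups()
    return AnnotationName(project, clone, int(serial))


print(parse_annotation_name("BG:DS09218.6"))
```

Established FlyBase symbols such as Adh do not follow this pattern and are rejected by the parser, which matches their treatment as primary names with the formal name as a synonym.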

All genes known to FlyBase are named by those names and symbols declared by FlyBase as valid. In addition, the historical names of the lethals identified by the genetic analysis of the Adh region are given.

Availability of data and materials:
The DNA sequence of the Adh region is made available for file transfer protocol (ftp) and searching (using BLAST) at http://www.fruitfly.org/data/genomic_fasta/Adh_and_cactus. All sequence data from genomic clones, ESTs, cDNAs, and P-element flanking regions are deposited in GenBank. Supplementary tables of data, cited in this article as Tables S1, S2, and S3, are available from http://www.genetics.org/supplemental/. Accession numbers for the genomic sequences are given in Table 2, for P-element flanking regions in Table S1 (http://www.genetics.org/cgi/content/full/153/1/179/DC1), and for cDNAs and ESTs in Table S2 (http://www.genetics.org/cgi/content/full/153/1/179/DC2). P1 clones are available from laboratories listed on FlyBase. cDNA clones are available from Research Genetics (Huntsville, AL) or from Genome Systems (St. Louis, MO). BAC clones (library RPCI-98) are available from Dr. P. de Jong (Roswell Park Cancer Institute, Buffalo, NY). P-element alleles are available from the Bloomington and Szeged Drosophila Stock Centers or from the Berkeley Drosophila Genome Project (BDGP). The annotated sequences can be viewed through FlyBase as CloneCurator reports.


*  RESULTS AND DISCUSSION

The physical map and sequence of the Adh region:
The physical map of the Adh region was assembled and sequenced from P1 and BAC clones as described in MATERIALS AND METHODS. The P1 clones formed three contigs, one of 1,940,896 bp, one of 798,089 bp, and the third, a single P1 clone. The gap between the 1.9-Mb and 0.8-Mb contigs could not be closed in P1 clones, but was readily closed by screening the BAC library; it was found to be 43,803 bp in length. A BAC clone also linked the isolated P1 clone (DS07660) to the distal end of the 1.9-Mb contig. This gap was 35,162 bp in length. The total length of sequence studied is 2,919,020 bp. A summary of the interpretation of this sequence is given in Figure 1, with an expanded view of three selected regions in Figure 2.





 
Figure 1. A summary molecular map of the Adh region, covering 2.9 Mb of DNA. Genes located on the top of each map are transcribed from distal to proximal (with respect to the telomere of chromosome arm 2L); those on the bottom are transcribed from proximal to distal. The gene symbols used in this figure are boldface type; if not the formal symbol then the latter is shown in a lighter font (formal symbols are abbreviated, their BG: prefix being omitted from Figure 1 and Figure 2). P-element insertions are shown as triangles projecting to the molecular map. Red bars indicate transcribed regions, with intron-exon structures as predicted. Those in dark red are confirmed by a cDNA or were previously known; those in light red have only GENEFINDER or GENSCAN predictions (with cutoffs of 20 and 45, respectively). The blue and green boxes are BLASTX or TBLASTX matches detected using genomic DNA sequences from a GenBank submission (usually a single P1 or BAC clone) to search against sequences of other species in the databases. Similarities are shown in green for expectations between P = 10^-8 and P = 10^-50; blue for expectations of P = 10^-51 or lower. Once translations of predicted or known genes were used for BLASTP searches, some similarities that had not been detected using the nucleic acid sequence of the genomic clones were found. A summary of these BLASTP data is found in Table S2 (http://www.genetics.org/cgi/content/full/153/1/179/DC2). Transposable elements are indicated by black boxes and are named according to FlyBase. Genes defined genetically are shown above the map. Genes whose symbols are within square brackets are not tied to the map. These genes are indicated above a horizontal line when their order with respect to the genes below the line is not known. A scale in kilobases is shown; ~1 cm = 10 kb.



 
Figure 2. Enlarged views of the Sos-RpII33, l(2)35Bb-vas, and twe-chif regions. Symbols and conventions as in Figure 1. A scale in kilobases is shown; ~3 cm = 10 kb.

General features of the sequence:
The overall base composition of the sequence is 40.82% G + C, to be compared to the figure of 43% for the genome as a whole (LAIRD and MCCARTHY 1969). The G + C contents of functionally different regions of the sequence, protein-coding regions, introns, and intergenic spacer are 49.7, 38.7, and 39.6%, respectively (intergenic regions may well be overestimated in size, because the gene prediction programs will have missed 5' exons distant from the body of a gene unless full-length cDNAs were available). The average number of exons per gene is 4.4, but this figure must be treated with caution for the reasons just mentioned.
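For readers who wish to repeat this kind of calculation on a sequence of their own, G + C content is simply the fraction of unambiguous bases that are G or C. The sketch below is a generic illustration; the function name is hypothetical, and the choice to exclude ambiguous characters such as N from the denominator is an assumption rather than a statement of how the figures above were computed.

```python
def gc_percent(seq):
    """Percentage of G + C among the unambiguous bases (A, C, G, T) of `seq`."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Toy examples; ambiguous characters (N) are left out of the denominator.
print(gc_percent("GGCCAATT"))
print(gc_percent("ACGTNN"))
```

Applied separately to annotated coding regions, introns, and intergenic segments, the same function gives a per-class breakdown of the kind quoted in the text.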

Gene prediction in the Adh region:
A primary objective of the sequence analysis was to identify genes, both protein coding and others (e.g., tRNA), in the 2.9 Mb of sequenced DNA. We predict the existence of 229 genes, of which 218 are predicted to be protein coding and 11 tRNA coding (Figure 1). The bases for the predictions are summarized in Table S2 (http://www.genetics.org/cgi/content/full/153/1/179/DC2). Forty-one of the protein-coding genes are predicted only on the basis of a high score with a gene-finding program; of these, 16 have both GENSCAN and GENEFINDER predictions (above the thresholds we used), 2 have only GENEFINDER predictions, and 23 only GENSCAN predictions. All of the other protein-coding genes are predicted by either (or both) sequence similarities (a BLAST score of P < 10^-7; 156, 71%) or a match with a Drosophila EST, cDNA, or genomic sequence (110, 52% of protein-coding genes). (Seventeen more genes had matches to Drosophila ESTs, but these matches were clearly due to the ESTs being derived from genes encoding similar sequences, i.e., from paralogous genes.)

It is important to get an estimate of the false-negative and false-positive frequencies of prediction. A GENSCAN threshold of 45 fails to predict 22 protein-coding genes predicted by other means (or known prior to this work). Of these 22, 10 have EST matches and 3 were known prior to this analysis (Mst35Ba, Mst35Bb, and cni). Lowering the threshold for GENSCAN to 30 would include 8 of these 22 false negatives, but this would also predict a further 25 protein-coding genes in this region, none of which would have any other support. The GENEFINDER program, at a threshold of 20, fails to predict 56 of the protein-coding genes. Of these false negatives, 35 have support from experimental data and 21 have support from GENSCAN predictions [Table S2 (http://www.genetics.org/cgi/content/full/153/1/179/DC2)]. One feature of GENSCAN that we have noticed is that its scores tend to be low in regions of very high gene density.

ESTs and cDNA sequences of genes in the Adh region:
Even the best computational methods are imperfect in their ability to determine the intron-exon structures of genes from genomic sequence alone. Moreover, because such methods rely on information from codon usage and the maintenance of open reading frames, they are inherently unable to predict the presence of introns in 5' or 3' untranslated regions or to predict the transcriptional start sites. For these reasons it is necessary to isolate and sequence cDNAs (or RT-PCR products). We have used sequence matches between the genomic sequence and 5' ESTs as a rapid way of identifying cDNAs for sequencing [see MATERIALS AND METHODS; Table S2 (http://www.genetics.org/cgi/content/full/153/1/179/DC2)]. cDNAs corresponding to 95 genes were identified by matches to ESTs (44% of known or predicted protein-coding genes) at a time when the total number of Drosophila ESTs available was 53,000.

Of the 68 protein-coding genes for which there was some prior knowledge (i.e., both genetic and molecular data or molecular data alone), 50 (74%) have ESTs; of the 150 genes that are newly discovered, only 44 (29%) have ESTs. This is a rather surprising result. It may indicate either a bias in the sample of genes that had already been studied or an overprediction of new genes, or it may be a biologically interesting result (see below).

P-element hits:
Several collections of lethal P elements were screened against deletions that, in sum, covered the entire Adh region (see SPRADLING et al. 1995, SPRADLING et al. 1999). We have also analyzed genetically P elements from these collections that had not been recovered in the screens for lethals or semilethals, but which were found to map to the region by in situ hybridization to polytene chromosomes or by a sequence match of the sequences flanking the P-element insertion (SPRADLING et al. 1999). Similarly, sequences flanking 2300 insertions of the EP element (RORTH et al. 1998) were determined (J. REHM and G. RUBIN, unpublished data) and used to identify 24 EP insertions in this region. From these screens, and from those identified by others, 181 independent P-element insertions in 43 genes have been identified [Table 1 and Table S1 (http://www.genetics.org/cgi/content/full/153/1/179/DC1)]. P-element insertions in 35 genes give a lethal, or semilethal, sterile, or visible phenotype. In the remaining eight genes all known insertions are without obvious phenotypic effect.

Gene density in the Adh region:
Of the 229 genes, 218 are protein coding and 11 are tRNAs. The average gene density for protein-coding genes is one per 13.4 kb. The average size of the genes, as estimated both from computational analysis and the "full"-length cDNAs, is 5.5 kb (from ATG to terminator, including introns). The average gene density of one gene per 13.4 kb hides enormous variation in density. Some regions are very dense, with genes being separated by only a few hundreds of base pairs; others are, by comparison, very gene poor (see Figure 1 and Figure 2).

There are few studies of long genomic sequences of Drosophila that we can use for comparison with the Adh region. Preliminary analyses of 2 Mb of genomic sequence from region 1–3 of the X chromosome give a gene density of one gene per 8 kb (T. BENOS and M. ASHBURNER, unpublished analyses of European Drosophila Genome Project data). In the 338-kb bithorax region there are 13 known or predicted genes (1 per 24 kb), but 3 of these (Ubx, abd-A, and Abd-B) are exceptionally large (22 to 78 kb for their coding regions alone). In the Antp region Celniker et al. (S. CELNIKER, B. PFEIFFER, J. KNAFELS, C. MAYEDA, C. MARTIN and M. PALAZZOLO, unpublished data) have identified 26 protein-coding genes in 430 kb, a density of 1 gene per 16.5 kb. MALESZKA et al. 1998 predicted 12 genes within one 67-kb P1 clone from the base of the X chromosome (1 gene per 5.6 kb).

Transcriptional bias:
The number of genes transcribed from each DNA strand is approximately equal (121 vs. 108). In very gene-dense regions there is a strong tendency for the direction of transcription to alternate (see Figure 1); overall, however, the pattern of transcriptional direction appears to be random. This was tested by expressing the pattern as a binary string and attempting to compress it using the Lempel-Ziv compression algorithm (ZIV and LEMPEL 1977). The string did not compress any better than did 1000 randomly generated strings of the same length.

Estimates of total gene number in Drosophila:
Any estimate of total gene number, based on the analysis of the Adh region, depends on this region being "typical" of the genome as a whole, with respect to the number of genes. This is a difficult question to answer with any rigor. Genetically, there are no indications that the Adh region is atypical. The number of genes discovered by genetic analysis is, given the number of polytene chromosome bands included, very similar to that in other well-studied regions. Classical "saturation" studies give a ratio of lethal complementation groups to polytene chromosome bands of ~0.84 (Table 3); for the Adh region this ratio is 0.81.


 

 
Table 3. Selected regions of the genome of D. melanogaster subjected to "saturation" genetic analysis for lethal complementation groups, showing the average ratio of lethal loci to polytene chromosome bands

Our estimates of the total gene number rely on estimates of the total DNA content of D. melanogaster. This has been independently estimated to be 170 Mb by RUDKIN 1972 (and cited in KAVENOFF and ZIMM 1973), using UV microspectrophotometry of diploid ganglion cells; by RASCH et al. 1971, using Feulgen microspectrophotometry of sperm and haemocyte cells; and by KAVENOFF and ZIMM 1973, from the kinetics of relaxation of whole chromosome-length DNA molecules. The kinetics of reassociation of denatured DNA gave a slightly lower estimate (LAIRD 1971). Of this 170 Mb of DNA, some 21% is estimated to be low-complexity satellite sequence (LOHE and BRUTLAG 1987) and 12% transposable elements and other repeated sequences, such as the histone and rRNA genes (LAIRD and MCCARTHY 1968). This gives an estimate of ~115 Mb of "unique" DNA sequence.

Simple arithmetic, 115 Mb/13.4 kb, gives an estimate of 8600 protein-coding genes for the Drosophila genome as a whole. This is a remarkably low number, being less than half as much again as that of the yeast S. cerevisiae (6000; MEWES et al. 1997) and less than half the number now estimated for Caenorhabditis elegans (19,090; THE C. ELEGANS SEQUENCING CONSORTIUM 1998). An independent estimate can be made, knowing that the sequenced region covers 69 polytene chromosome bands, an average of 42 kb/band plus its adjacent interband [rather higher than Sorsa's estimate of 21.6 kb/band (SORSA 1988)]. The total band number is estimated to be 5160 (V. Sorsa, quoted in ASHBURNER 1989). In terms of band number, therefore, the Adh region is 1.34% of the total. If the density of genes per band in this region is typical of the genome as a whole, then this leads to an estimate of 16,975 genes. Our two estimates of the total gene number in D. melanogaster, 8600 and 16,975, bracket the estimate of 12,000 by MIKLOS and RUBIN 1996, based on the sizes of 276 individual genes.
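The two extrapolations can be retraced with the following short Python sketch, which uses only figures quoted in the text. The variable and function names are illustrative. The gene count that enters the band-based estimate is not stated explicitly, so the sketch evaluates it for both the 218 protein-coding genes and all 229 genes; this is an assumption made for illustration.

```python
# Figures quoted in the text
UNIQUE_DNA_KB = 115_000   # ~115 Mb of "unique" sequence
KB_PER_GENE = 13.4        # one protein-coding gene per 13.4 kb
REGION_BANDS = 69         # polytene bands covered by the sequenced region
TOTAL_BANDS = 5160        # estimated total band number
REGION_KB = 2900          # ~2.9 Mb of sequenced DNA


def band_based_estimate(genes_in_region: int) -> float:
    """Scale a gene count for the region up by the ratio of total to region bands."""
    return genes_in_region * TOTAL_BANDS / REGION_BANDS


# Estimate 1: unique DNA divided by the average DNA per gene
density_estimate = UNIQUE_DNA_KB / KB_PER_GENE

# Supporting quantities for estimate 2
kb_per_band = REGION_KB / REGION_BANDS       # about 42 kb per band
band_fraction = REGION_BANDS / TOTAL_BANDS   # about 1.34% of all bands

if __name__ == "__main__":
    print(f"density-based estimate: {density_estimate:.0f} genes")
    print(f"kb per band: {kb_per_band:.1f}; fraction of bands: {band_fraction:.2%}")
    for n in (218, 229):  # assumed gene counts, see lead-in
        print(f"band-based estimate with {n} genes: {band_based_estimate(n):.0f}")
```

Under these assumptions the density-based figure rounds to 8600, and the quoted band-based figure of 16,975 falls between the values obtained with 218 and 229 genes.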

Local duplications of genes:
A number of genes in Drosophila have been found to exist as locally duplicated gene pairs. Members of a pair may be functionally distinct (e.g., en, inv) or functionally redundant (e.g., gsb-d, gsb-p; ph-d, ph-p). The most obvious model for the origin of gene pairs is unequal recombination (STURTEVANT 1925; INGRAM 1961; BAGLIONI 1963; SMITHIES et al. 1962) followed by sequence divergence.

In this chromosome region we have identified at least 12 (protein-coding) gene repeats. One had already been identified, first in Drosophila pseudoobscura (SCHAEFFER and AQUADRO 1987), i.e., Adh and Adhr, genes just 300 bp apart that have protein products only 33% identical in sequence, yet with a conserved position of introns. Remarkably, Adhr is only transcribed as a dicistronic transcript with Adh (BROGNA and ASHBURNER 1997). The second gene repeat is a triplication of three zinc finger domain transcription factors, escargot, worniu, and snail, within 150 kb. The proteins encoded by these genes show 31–37% pairwise identity. Interestingly, although each of these is required for viability, there is some residual functional redundancy between at least esg and sna (see Appendix). The third example is BG:DS01514.2 and BG:DS05899.1, two genes 7.5 kb apart that encode protein products 43% identical in sequence; these proteins show similarity to mouse long-chain fatty acid coenzyme-A ligase. Mst35Ba and Mst35Bb are a tandem pair of genes encoding protamine-like proteins characterized by RUSSELL and KAISER 1993. These proteins are 91% identical over their common region (that of Mst35Bb is longer by 25 amino acids than that of Mst35Ba). At the nucleic acid level the duplication extends over ~1 kb.

Five genes, closely clustered in the region between RpII33 and Ance, show between 30 and 37% amino acid sequence similarities. These are BG:DS00941.11–BG:DS00941.15, genes whose proteins are about the same size but all lack any sequence matches. BG:DS00180.7–BG:DS00180.10, BG:DS00180.12, and BG:DS00180.14 are six genes all with epidermal growth factor (EGF) domains clustered within a few tens of kilobases just distal to rk. Their sequence similarities are not high, but are evidence of ancient duplications.

In the region between the lace and CycE genes there are six predicted genes within 21 kb, each encoding a protein of the astacin subfamily of Zn-metalloproteases (BARRETT et al. 1998; BG:BACR44L22.1–BG:BACR44L22.4, BG:BACR44L22.6, and BG:BACR44L22.8). The predicted protein sequences of these genes are between 29 and 64% identical. There are two clusters of genes encoding proteins predicted to be serine proteases. One is of two genes within 14.8 kb and showing 45% pairwise similarity (BG:DS06874.4 and BG:DS06874.6); the other is a pair of genes within 10.2 kb showing 35% sequence similarity (BG:DS07108.1 and BG:DS07108.5). Right at the proximal margin of the region sequenced are three genes encoding proteins identified by KAWAMURA et al. 1999 as imaginal disc growth factors (see below). These genes show 51–55% pairwise similarity in sequence and are within 7.7 kb (Idgf1, Idgf2, and Idgf3). Interestingly, there is evidence for a tandem triplication of chitinase genes, which these resemble, in mosquitoes (DE LA VEGA et al. 1998). A further triplication is exemplified by beat and two similar genes, beat-B and beat-C, first discovered in this sequence by T. PIPES (personal communication). These three genes are not contiguous, but are clustered within 200 kb. The proteins predicted for beat-B and beat-C are 51 and 46% identical, respectively, to that of beat. The three genes have a similar structure. The final example of duplicate genes is that of noc and BG:DS06238.3, a gene some 100 kb distal, which we suggest is elB (see below). These two genes encode Zn-finger proteins with 27% amino acid identity.

The 38 genes in the 34C-36A region that appear to be members of tandem series represent 17% of the total number of protein-coding genes. This is a minimum estimate, because a BLASTP search of all 218 known and predicted protein sequences against themselves identifies other potential duplications, which require further study. Many of these duplications are very old, as judged by the sequence similarities between members of a set. Tandem series of genes are also a feature of C. elegans (THE C. ELEGANS SEQUENCING CONSORTIUM 1998; THE C. ELEGANS GENOME SEQUENCING PROJECT 1999) and Arabidopsis thaliana (BEVAN et al. 1998). In C. elegans the fraction of genes included in tandem sets of two or more (18%) is about the same as that found in the Adh region (JONES 1999). One possible reason why C. elegans appears to have more genes than D. melanogaster would be that these local tandem arrays are, on average, larger in C. elegans. The data available so far do not support this suggestion.

Genes within genes:
The first example of a gene known to be entirely included within another gene was that of a pupal cuticle protein gene (Pcp) fully encoded within an intron of ade3 (HENIKOFF et al. 1986). Since then, >30 examples have been discovered (data from FlyBase) and in the majority of cases (25/32) the included gene is transcribed from the opposite strand of the including gene. In the Adh region we have identified 17 examples of nested genes, 12/17 following the majority rule of antiparallel transcription.

The inclusion of Adh within osp was first suggested by genetic data, because osp aberrations mapped to either side of Adh (CHIA et al. 1985; see below). This suggestion, and the inclusion of Adhr in the same intron, was confirmed by molecular analysis (MCNABB et al. 1996) and is proven here by the comparison of the sequence of a full-length osp cDNA with the genomic sequence (see below). Two other predicted genes are within osp: BG:DS07721.1 and BG:DS09219.1.

An open reading frame in the 5' intron of vasa (vig, for vasa intronic gene) was first identified by K. EDWARDS (personal communication) by a comparison of sequences from D. grimshawi with those from this project. There is another CDS within vasa: BG:DS00929.15 in the long third intron, first identified as a ubiquitous transcript from RNA blots with genomic DNA by P. LASKO (personal communication; see STYHLER et al. 1998). The other examples of putative included genes are BG:BACR48E02.1, BG:BACR48E02.2, and BG:BACR48E02.3, all included within the second intron of B4; BG:DS07486.3, BG:DS07486.4, and BG:DS07486.5 included within introns of beat-B, the former in intron 1 and the latter two in intron 2; BG:DS03792.2 is within wb; BG:DS03192.4 is within BG:DS03192.2; BG:DS07295.4 is within BG:DS07295.1; BG:DS07660.1 is within kuz; and BG:DS01514.1 is within BG:DS01514.3.

The phenotypes of overlapping and contiguous deletions—the search for more genes:
We have evidence that the genetic screens failed to recover mutations at loci expected to have scorable phenotypes; the failure to recover any alleles of beat is an example (see Appendix). One new lethal locus (l(2)35Fg) was discovered when the chromosome 2 P elements were systematically screened. One further genetic technique to discover genes is to systematically screen heterozygotes between two overlapping deletions. We have made transheterozygotes between all possible pairs of deletions that, by genetic criteria, abut, i.e., the distal end of one and the proximal end of another are located between the same pair of genes identified by mutant alleles. These pairs of deletions may or may not physically overlap.

Pairwise combinations (836) have been made and the genotypes scored for viability, male and female fertility, and obvious visible phenotypes. Although these phenotypes could be the result of the additive effects of haplo-insufficiency, we have predicted the existence of four lethal loci from these data, two loci required for male fertility and two loci required for female fertility (each "locus" could include more than one gene, of course). A variation on this protocol for the discovery of mutant phenotypes is to test combinations of deletions that are known to overlap by only one gene with a mutant phenotype in the presence of a transgene that is known independently to rescue the mutant phenotype. If the transgene rescues the deficiency heterozygote to phenotypic normality, then we can conclude that no other genes capable of giving a mutant phenotype are located in the deleted interval; and if not, then we can conclude the existence of a previously unsuspected locus.

Overlapping Ance- deletions are lethal, which is expected, since Ance itself is a vital gene. There is, however, evidence for another lethal near Ance, because the lethality of some, but not all, overlapping deletion pairs can be rescued by a 16.5-kb transformant that includes both Ance and anon-34Ea (carried on P{RACE}). l(2)34Ec is predicted on the basis of the failure of this transformant to rescue the lethality of, e.g., Df(2L)SR407/Df(2L)b82a1. This predicted gene is not in the overlap of, e.g., Df(2L)SR407/Df(2L)b74c6.

The existence of ms(2)35Bi, between the 5' exons of osp and l(2)35Bb, is predicted on the basis of viable, but male-sterile, overlapping deletion heterozygotes (see Appendix). l(2)35Cc is predicted on the basis of the recessive lethality of Df(2L)rd9 (ASHBURNER et al. 1990). rd9 is lethal with deletions of rd; all five other known alleles of rd are hemizygous viable. The existence of l(2)35Cc is confirmed by the complementation behavior of deletions generated from gftPZ06430 by male recombination. Of nine deletions, one extended distally and was rd+ but lethal with Df(2L)rd9 and gft; the other eight extended proximally from gft to include ms(2)35Ci.

The region between esg and sna is, genetically, rather complex. From the phenotypes of overlapping deletions ASHBURNER et al. 1990 identified a region that, when homozygously deleted, can result in either lethality or an absence of the halteres. These phenotypes are separable; e.g., the Df(2L)osp38/Df(2L)TE35D-22 heterozygote is viable and lacks halteres, but Df(2L)osp18/Df(2L)TE35D-22 is lethal. Both map between esg and worniu. The lethal is here named l(2)35Cg. There is another predicted lethal in this region, simply called l by ASHBURNER et al. 1990 (Figure 2). It (l(2)35Ch) is predicted from the lethality of, e.g., Df(2L)el20 when heterozygous with Df(2L)Scorv25. There is only one gene prediction in the esg-worniu interval; this is BG:DS03023.4.

fs(2)35Ec is inferred from the sterility of Df(2L)RA5 females heterozygous with 18 different deletions, e.g., Df(2L)TE35D-3. The existence of fs(2)35Ed is suggested by the sterility of Df(2L)RM5/Df(2L)TE35D-2 females and of four similar genotypes; this gene may correspond to beat-C. ms(2)35Eb is inferred from the male sterility of the heterozygote Df(2L)RA5/Df(2L)TE35D-14. The predicted female steriles, fs(2)35Ec and fs(2)35Ed, are tentative; we are concerned that these phenotypes may simply result from haplo-insufficiency, particularly for BicC.

There are several regions that are homozygous viable when deleted. We estimate that the longest of these, the overlap of Df(2L)A178 and Df(2L)A446, is 190 kb. This overlap deletes or disrupts four known genes (noc, Adh, Adhr, and osp), eight tRNA genes, and five predicted protein-encoding genes in the noc-BG:DS07721.3 interval.

The structure and function of gene products:
We have used three computational techniques to infer structural and functional attributes of the products of the genes predicted for this chromosome region. These are searches for protein motifs or domains using the PFAM and PROSITE databases, BLASTP similarities of the predicted open reading frames with proteins in the SWISSPROT and SPTREMBL databases, and some analysis of protein features using the PSORT and SAPS programs (see MATERIALS AND METHODS). In general, we have been rather conservative in making these inferences, as we have for gene prediction in general. These functional inferences are summarized in Table S3 (http://www.genetics.org/cgi/content/full/153/1/179/DC3), using a classification now being developed by the Gene Ontology Consortium (FlyBase, Mouse Genome Informatics and the Saccharomyces Genome Database; GO 1999). Of the 218 known or predicted protein-coding genes, we know, from previous work by others, or have inferred, the function of less than half (91, 42%). Of these, 41 are obviously enzymes and 18 are predicted to be proteases; the rest cover the functional spectrum from structural proteins (e.g., cuticle protein) to growth factors and transporters. From our analysis of protein motifs we predict that 16 of the proteins are DNA or RNA binding; the PSORT analysis predicts that 82 are nuclear localized, but this may well be an overestimate. There are some features of the domain analysis that deserve further study: the cluster of six genes (BG:DS00180.10 and neighbors) whose products are predicted to have EGF domains in particular.

Evolutionary conservation:
Of the 218 known or predicted protein-coding genes, 156 (72%) have clear matches with those in other organisms [summarized in Table S2 (http://www.genetics.org/cgi/content/full/153/1/179/DC2)]. Of these, 120 have matches to the sequences of C. elegans, 69 to the sequence of S. cerevisiae, 35 to sequences of A. thaliana, 114 to sequences from rodents (nearly all mouse, with a few rat), 125 to human sequences, and 128 to rodent + human sequences. Thirty proteins have matches in yeast, C. elegans, Arabidopsis, and rodents + human, and 55 in yeast, C. elegans, and rodents + human. With the exception of S. cerevisiae and C. elegans (whose genomes are entirely sequenced, or almost so) these numbers reflect the available sequence data, although, overall, they are an impressive witness to the conservation of protein sequence across very different taxa. These sequence similarities are, of course, very useful for making functional inferences about new Drosophila genes; they must, however, be treated with some caution as the evolution of function and sequence may not be as tightly linked as is sometimes believed. We see evidence for this in the genes of this region; e.g., the three genes we first identified by their sequence characteristics as chitinases are in fact secreted imaginal disc growth factors, as has been shown experimentally (KAWAMURA et al. 1999). The inferences we have made are only hypotheses that demand experimental verification or falsification.

In addition to sequence similarities between genes in this chromosome region and sequences from other taxa, 49 of the predicted or known protein-coding genes have significant database matches outside the Adh region to the known protein universe of Drosophila. This is from a sample of only 2000 or so proteins, <15% of the expected total. The conclusion, which is no great surprise, is that nearly all proteins of Drosophila will be members of protein sequence families. In some cases the similarities in sequence between different proteins are very striking, e.g., the two "stress-activated" mitogen activated protein (MAP) kinases p38b and Mpk2 are 77% identical in sequence (see Appendix). There is no obvious clustering of the genes that are paralogs of genes in the Adh region; this would have been evidence of large-scale genomic duplications, such as are found in S. cerevisiae (WOLFE and SHIELDS 1997).

Correspondence between known genes and the sequence:
One of the major objectives of this study was to identify the 73 genes known or predicted from the genetic analyses on the sequence and, if possible, to infer their function. For those that had been sequenced previously their identification was straightforward. Others have been identified by mapping to the sequence the sites of insertion of P-element alleles and by correlating the genetic and sequence maps. Forty-nine of these 73 genes have been identified on the sequence [see Figure 1 and Table S2 (http://www.genetics.org/cgi/content/full/153/1/179/DC2)]. For the remaining 24, candidate sequences can be identified, but no firm correlation can be made on the available data. Detailed consideration of these 49 genes and others of interest identified on the sequence is given in the Appendix.

Genes with phenotypes are more likely to be conserved:
Genes that can mutate to an observable phenotype are far more conserved than those that cannot. The data are shown in Table 4. We compare the sequence similarities between known and predicted proteins in two groups: the first is of all 218 proteins, the second just that subset of 49 encoded by genes for which we have phenotypically detectable mutant alleles. Even at a BLASTP threshold of P = 10^-50, 63% of the 49 genes with phenotypes (and known sequences) have sequence similarities in other taxa, compared to only 31% for the total sample of 218 genes. This difference is also observed if one only considers the comparisons to individual species, such as C. elegans and S. cerevisiae, whose genomes are completely sequenced; this argues that the observation cannot be due to an ascertainment bias.


 

 
Table 4. A comparison of the sequence similarities between genes with known mutant phenotypes and those without

We know, or predict from genetic data, that 73 out of 218 genes have mutant phenotypes. If we assume that the 24 genes that we have not yet managed to tie to the sequence are as conserved as the 49 that we have, then we can calculate the expected properties of the total sets of genes with and without mutant phenotypes. For example, we can predict 46/73 will have BLASTP hits to other species at an expectation of P = 10^-50. Because there are only 67 hits to other species from the total of 218 genes (at this cutoff) we can conclude that 63% of the genes with mutant phenotypes are conserved, but only 14% (21/(218-73)) of the genes without detectable mutant phenotypes. If we raise the BLASTP cutoff to P = 10^-100, then the numbers are even more striking: 37 and 2%, respectively, for genes of the two classes.
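The arithmetic behind the 63% and 14% figures can be laid out step by step, as in the following minimal Python sketch. It uses only the numbers quoted in the text; the variable names are illustrative, and the extrapolation from the 49 identified genes to all 73 is the assumption stated above.

```python
TOTAL_GENES = 218             # known or predicted protein-coding genes
GENES_WITH_PHENOTYPE = 73     # known or predicted to have mutant phenotypes
TOTAL_HITS = 67               # genes with BLASTP hits at P = 10^-50
FRACTION_CONSERVED = 0.63     # observed among the 49 genes tied to the sequence

# Extrapolate the observed fraction to all 73 genes with phenotypes
expected_hits_with = round(FRACTION_CONSERVED * GENES_WITH_PHENOTYPE)   # 46

# The remaining hits must belong to genes without a detectable phenotype
hits_without = TOTAL_HITS - expected_hits_with                          # 21
genes_without = TOTAL_GENES - GENES_WITH_PHENOTYPE                      # 145

fraction_with = expected_hits_with / GENES_WITH_PHENOTYPE
fraction_without = hits_without / genes_without

if __name__ == "__main__":
    print(f"with phenotype:    {expected_hits_with}/{GENES_WITH_PHENOTYPE} = {fraction_with:.0%}")
    print(f"without phenotype: {hits_without}/{genes_without} = {fraction_without:.0%}")
```

The same steps, applied to the counts at the P = 10^-100 cutoff in Table 4, would give the 37 and 2% figures; those counts are not repeated in the text, so they are not reproduced here.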

We realize that this analysis has its limitations. The distinction between genes with and without discernible mutant phenotypes is not hard and fast, but we point out that the great majority of mutant phenotypes known in this chromosome region are very obvious, i.e., lethality, sterility, or marked changes to adult morphology. We can, in addition, have reasonable confidence that mutations have been detected in nearly all of the genes in this region that can mutate to these phenotypes.

Conserved genes are more highly expressed:
Genes known previous to this analysis are far more likely to have ESTs than those newly discovered (see above). We were concerned that this could indicate an overoptimism in predicting new genes. Yet the analysis of Table 4 shows that this cannot be so, or at least it cannot be the entire reason. Genes with BLAST similarities with P values <10^-7 are unlikely to be false predictions. Yet in the total data set of 218 genes we see that the fraction that have ESTs increases the higher we set the expectation: for "all" species hits it is 48% at P = 10^-7, 53% at P = 10^-20, 60% at P = 10^-50, and 80% for P = 10^-100. Genes with mutant phenotypes have ESTs at an overall higher frequency than do those without phenotypes (Table 4). The observation that "conserved" genes are more highly expressed than are "nonconserved" genes, as judged by the occurrence of ESTs, was first made by GREEN et al. 1993 in their analysis of evolutionarily conserved regions in proteins. They suggested that highly expressed genes might be under a higher selection pressure. The similar bias in C. elegans, where genes with matches to proteins in distant taxa (i.e., non-Nematodes) are three times more likely to have an EST than genes with no such match, was confirmed by an analysis of the C. elegans sequence (THE C. ELEGANS SEQUENCING CONSORTIUM 1998).

tRNA genes:
An initial rush of enthusiasm mapped many tRNA genes by in situ hybridization to the polytene chromosomes and many of these were subsequently cloned and sequenced (e.g., KUBLI 1982). A total of 182 tRNA genes have so far been mapped in Drosophila (data from FlyBase), yet others remain to be discovered (e.g., tryptophan and cysteine tRNAs). Many tRNA genes occur in clusters, either of isoaccepting or diverse tRNAs. A cluster of five glycine tRNAs was already known in the Adh region (MENG et al. 1988; 13 others are known). In addition we have identified a single glutamine tRNA (the first to be sequenced in Drosophila; BG:DS01514.1) and a single leucine tRNA (five others are known; BG:DS03192.1), four proline tRNAs (two others are known), one (BG:DS04641.2) immediately distal to the glycyl-tRNA cluster, and three (BG:DS01486.2–.4) just proximal to this cluster, immediately distal to osp. The 100-kb region between noc and osp therefore contains nine tRNA genes.

Transposable elements:
About 12% of the genome of D. melanogaster is estimated to be composed of transposable element sequences, ribosomal DNA, and core histone genes (LAIRD and MCCARTHY 1968; SPRADLING and RUBIN 1981). Seventeen elements have been recognized in the sequence of the Adh region; 6 are LINE-like elements (G, F, Doc, and jockey), 11 are retrotransposons with long terminal repeats (LTRs; copia, roo, 297, blood, mdg1-like and yoyo; see Figure 1 and Figure 2). This is an average spacing of 1 element per 171 kb. On the basis of kinetic data the "middle-repetitive" sequences of D. melanogaster had been estimated to be ~5.6 kb in length, and separated by 13 kb or more of single-copy DNA (MANNING et al. 1975; CRAIN et al. 1976).

A new retrotransposon element has been identified. It has been called yoyo in view of its sequence similarity with an element of the medfly Ceratitis capitata with this name. The yoyo LTR seems to be a hotspot for P-element insertion; k08808, a lethal allele of l(2)35Bc, is inserted in an LTR of yoyo and at least four other examples are known of P elements in yoyo LTRs (PZ06264, EP(2)0533, EP(2)0396, and EP(2)0417).

About 1.8% of the sequence of the Adh region is within identified transposable elements. This is much less than the 9% of the genome as a whole estimated to be composed of such sequences (SPRADLING and RUBIN 1981). The reason for this difference is probably that the density of transposable elements is higher in the heterochromatic and peri-heterochromatic regions of the chromosomes (see SUN et al. 1997). Perhaps only half the retroviral elements are euchromatic. That this is so is indicated by a comparison of the total numbers of elements estimated by DNA reassociation kinetics and those seen in the euchromatic arms by in situ hybridization. For the 412 element, e.g., the numbers were 40 (POTTER et al. 1979) and 26 (STROBEL et al. 1979), respectively, in Oregon-R; similar data were found for the 297 and copia elements.

There are other sequences that are clearly related to those of transposable elements but whose identity cannot be confidently stated. For example, on P1 clone DS07108 there are three very A + T-rich sequence regions that show similarities to elements such as 297 and mdg1 but appear to be very degenerate. In addition, in an intron of crp there is an 860-bp sequence very similar to the repetitive element described as Su(Ste) (BALAKIREVA et al. 1992).

Breakpoint distribution:
We have mapped genetically 658 aberration breakpoints to this region of the Drosophila genome. Sixty-three breakpoints disrupt genes. Of these breakpoints many had previously been mapped to chromosome walks, usually in λ phage. Ninety-four of these were mapped to restriction fragments in the 450-kb "Adh" walk from Ashburner's laboratory (CHIA et al. 1985; MCGILL et al. 1988; DAVIS et al. 1990; DAVIS et al. 1997; GUBB et al. 1990; CHEAH et al. 1994; MCNABB et al. 1996), while others had been mapped to the vasa (LASKO and ASHBURNER 1988), Su(H) (SCHWEISGUTH and POSAKONY 1992), Sos (BONFINI et al. 1992), BicC (MAHONE et al. 1995), beat (FAMBROUGH and GOODMAN 1996), twe (ALPHEY et al. 1992), fzy (DAWSON et al. 1995), and cni (ROTH et al. 1995) regions. Computer-generated restriction maps of the sequences of these regions were used to correlate these data with the sequence map. This was reasonably straightforward, the major adjustments being those needed to take transposable elements into account. We have compared the genetic and physical distributions of chromosome breakpoints in several ways. One is shown in Figure 3. In this figure we plot the number of breakpoints in each defined genetic interval against the length of DNA in that interval. It is clear that the two parameters are well correlated [Spearman's rank coefficient (SPEARMAN 1904) r_s = 0.78, t_43 = 8.17, P < 0.001], despite some degree of ascertainment bias in the data (most marked in the intervals surrounding b, where very large-scale irradiation experiments have been done). Thus, the nonrandom clustering of aberration breakpoints seen in genetic mapping experiments is due to differing DNA target sizes rather than to some intrinsic property of the sequences themselves.
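The reported test statistic can be checked from the rank coefficient and its degrees of freedom. The sketch below assumes the usual t approximation for a rank correlation, t = r_s * sqrt(df / (1 - r_s^2)) with df = n - 2; the text does not state which formula was used, so this is a consistency check rather than a description of the authors' procedure, and the function name is ours.

```python
import math


def spearman_t(r_s: float, df: int) -> float:
    """t statistic for testing a rank correlation against zero.

    Uses t = r_s * sqrt(df / (1 - r_s**2)), where df = n - 2.
    """
    return r_s * math.sqrt(df / (1.0 - r_s ** 2))


# Values reported in the text: r_s = 0.78 with 43 degrees of freedom.
t_value = spearman_t(0.78, 43)
print(f"t = {t_value:.2f}")
```

Under that assumption the reported values are mutually consistent: the computed statistic agrees with the t of 8.17 given in the text.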



Figure 3. A comparison of the distribution of DNA with that of genetically mapped chromosome breakpoints in the kuz-dac region. The genetic positions of 571 chromosome breakpoints have been determined (J. Roote, M. Ashburner and colleagues, data in FlyBase) with respect to 48 genes. The number in each gene interval is plotted along with the DNA content (in kilobases) of the same interval. The DNA lengths were measured between the chromosomally distal ends of genes (as defined by the predictions; see Figure 1).


*  CONCLUSIONS

We chose the Adh region of D. melanogaster for our first experiment in megabase sequencing and sequence analysis because this region had been subjected to genetic analysis in greater detail than any of comparable size in a metazoan species. This has allowed us to integrate sequence analysis with saturating mutational analysis on a scale not previously seen in any metazoan organism.

A critical feature of the data is that the genes are not subject to ascertainment bias: they share only a common chromosomal location. The comparison of the sequences of genes known to be required for a "normal" phenotype with those for which no phenotypically mutant alleles are known has shown a surprisingly strong correlation between evolutionary conservation and "essentialness" of function. The fact that two independent measures of functional importance (evolutionary conservation over 500 million years and requirement for normal phenotype) are correlated has significant implications. For example, it argues that functionally essential genes are not organism specific, nor are their functions protected by gene duplication. Functionally essential genes show a second characteristic: on average they are expressed at higher levels, as judged by their representation in EST collections, than are genes that are not required for a normal phenotype.

MIKLOS and RUBIN (1996) estimated that ~30% of the genes of D. melanogaster are "vital"; i.e., loss of their function will result in lethality. Estimates of the fraction of the genes that are vital from our present analyses give the slightly lower figure of 24%, because we have 53 genes known or suspected from genetic data to be lethals, out of a total of 218 protein-coding genes.

One major challenge is to discover the functions of not only those genes for which mutant alleles are already known, but also those for which no alleles have been recovered in the screens performed so far. One general approach will be to engineer dominant gain-of-function alleles of these, e.g., by using the P element engineered by RORTH (1996). Another approach will be to make double mutant combinations when we have reason to believe that a gene may be "redundant" due to a second gene in the genome. For example, mutations of BG:DS08249.2 could be selected on a background mutant for the other known glycerol phosphate oxidase gene. Finally, the sequences or patterns of expression of a gene might suggest more appropriate phenotypic or biochemical assays to perform in search of its function.

This analysis of just 2.9 Mb of Drosophila sequence has been enormously informative and rewarding. Despite the fact that there is much more to be learned about this sequence, and the proteins it encodes, it has proved to be an invaluable experiment in preparation for the complete genomic sequence of this little fly, which we expect within the next year. Two matters are not in doubt; first, there is enough even in 2.9 Mb to keep biologists busy for many years and, second, their work will be invaluable in furthering our understanding not only of how Drosophila works and how it evolved, but also of human gene function.


*  FOOTNOTES

This article is dedicated to the birth of Aden Misra Siebel, who waited so patiently to join us.
1 Present address: JGI Sequencing Centre, Walnut Creek, CA 94598.
2 Present address: Amgen Inc., Thousand Oaks, CA 91320.


*  ACKNOWLEDGMENTS

We have benefited greatly from the resources of FlyBase, supported by grants from the National Institutes of Health (NIH) and the Medical Research Council (MRC, London). We thank K. Matthews and the Drosophila Stock Center, Bloomington, supported by grants from the National Science Foundation, both for keeping our own mutant strains and for supplying many others.

Many colleagues have made information available to us in advance of its publication. In particular we thank L. Alphey, M. Anderson, R. Anholt, J. Baker, U. Banerjee, S. Baumgartner, P. Benos, M. Botchan, S. D. M. Brown, A. Campos, D. Cavener, W. Chia, S. Cohen, I. Darboux, I. Dawson, J. B. Duffy, K. Edwards, G. Fedorowicz, C. Flores, J. Gates, M. Gatti, S. Hayashi, J. Heilig, A. Hudson, T. Ip, B. Iyengar, S. Jones, L. Keegan, D. Kiehart, I. Kiss, F. Laski, P. Lasko, M. Leptin, H. Mistry, T. Pipes, M. Pflumm, J. Posakony, J.-M. Reichhart, F. Schweisguth, K. Schmid, C. Sunkel, M. Taylor, C. Thummel, P. Tolias, J. Tower, N. Wakabayashi-Ito, A. Willingham, and last, but by no means least, C. Zuker.

We thank the following individuals who helped produce the sequence data presented in this article: A. Aghavani, D. A. Alcivare, T. T. Arcaina, D. Aragnol, E. Baxter, M. M. Bondoc, M. Chew, A. Chiang, P. A. Critz, I. Darboux, C. A. Davis, C. L. Ericsson, D. Fambrough, D. E. Farnan, J. Flanagan, K. M. Gunning, S. R. Hummasti, M. A. Jaklevic, K. E. Kadner, K. Karra, L. Kearney, K. Kim, S. F. Kim, S. H. Kim, C. L. Ko, B. Lee, K. D. Lewis, M. Li, K. J. Lindquist, M. Lomatan, V. M. Lustre, M. U. Machrus, C. A. Mayeda, P. Mazda, T. M. Miguel, C. A. Miller, M. S. Mok, M. Moshrefi, K. Nixon, J. M. Pacleb, S. Park, S. G. Patel, B. Pfeiffer, D. Punch, A. Salva, R. F. Santos, E. Snir, S. G. S. Subramanian, R. Svirskas, M. Taylor, B. Towne, B. Twomey, A. Yee, R. T. Yeh, C. Yu, R. Zhang, and L. L. Zieran. We thank S. Mullaney for drawing Figure 1 and Figure 2.

Stocks or clones have been kindly provided to us by L. Alphey, C. Flores, J. Gates, L. Keegan, D. Kiehart, M. Phillips, T. Pipes, K. Tatei, P. Tolias and C. Zuker. We thank A. Brazma, European Bioinformatics Institute (EBI), for testing for randomness in the direction of transcription, W. Fleischmann (EBI) for the EMOTIF analyses, P. Horton (Osaka) for running the PSORT programs for us, J. Barrett (Cambridge) for statistical advice, T. Benos (EBI) for help with the sequence data sets, and R. Svirskas (Motorola Inc.) for providing informatics support at Berkeley. The help, advice, and programs of G. Helt (Berkeley) have been absolutely invaluable.

M.A. and G.M.R. thank both past and present members of their groups for both material and intellectual (not to say moral) support for this work: at Cambridge, S. Brogna, W. Chia, C. Detwiler, A. de Grey, D. Gubb, D. Huen, R. Karp, D. Kimbrell, P. Lasko, S. McGill, S. McNabb, S. Tsubota, S. Russell, and R. Woodruff; at Berkeley, E. Frise, G. Mardon, D. J. Pan, M. Simon, and T. Xu. This work would have been quite impossible without the dedicated and skillful technical support of, at Cambridge, P. Thompson, D. Coulson, B. Durrant, J. Faithfull, P. Fletcher, S. Herrmann, T. Littlewood, T. Morley, M. Omar, M. Shelton, J. Trenear, and Y. Zhang; at Berkeley, A. Beaton, S. Chai, M. Evans-Holm, T. Laverty, D. Simas, and C. Suh.

G. M. Rubin thanks M. Bissell, A. Chatterjee, P. Oddone, C. Shank, and others at Lawrence Berkeley National Laboratory for their continuous support and encouragement, as well as L. Rubin for her patience. M.A., S.L., and S.M. thank W. M. Gelbart and his colleagues for their hospitality at Harvard.

Work in Cambridge on the genetic and molecular analyses of the Adh region has been continuously supported since 1983 by an MRC Programme Grant G8225539 to M. Ashburner and colleagues. Work in Berkeley was supported by the Drosophila Genome Center Grant from the NIH (P50 HG00750) and grants from the Department of Energy (DOE/DE-FG03-98ER62625 and DOE/DE-FG03-99ER62739) to G. Rubin and colleagues as well as by the Howard Hughes Medical Institute.

Note added in proof: LANDIS and TOWER (1999) show that the chiffon protein shares two domains with Dbf4p of S. cerevisiae. smi35A has now been cloned and sequenced independently by CLEGHON and colleagues (EMBL:AF168467); MIN and BENZER (1999) have described BG:DS01514.2 as bubblegum and have shown that a mutant allele has a neurodegeneration phenotype that can be corrected by feeding larvae glyceryl trioleate oil. Note that this gene is one of two in this region predicted to code for a long-chain fatty acid coenzyme A ligase (the other is BG:DS05899.1). BG:DS00941.1 was identified as encoding a carbonate dehydratase by analysis of our sequence by HEWETT-EMMETT and TASHIAN (D. HEWETT-EMMETT and R. E. TASHIAN, 1996, Functional diversity, conservation, and convergence in the evolution of the α-, β-, and γ-carbonic anhydrase gene families. Mol. Phylogen. Evol. 5: 50–77); they called this gene CAH, subsequently changed to CAH1 (D. HEWETT-EMMETT, personal communication).

Manuscript received March 24, 1999; Accepted for publication June 15, 1999.


*  APPENDIX

DETAILED DESCRIPTION OF GENES IDENTIFIED IN THE Adh REGION
B4:
B4 was discovered by SOTILLOS et al. (1997) and is the gene 823 bp distal to, and divergently transcribed from, kuz. The P-element insertion PZ05337 is within B4. This mutation is viable and fertile with Df(2L)b84a7, a deletion that includes it. The P element k01405 [a cluster mate of k01403, Table S1 (http://www.genetics.org/cgi/content/full/153/1/179/DC1)] is a lethal kuz allele but may also affect B4 function, since the viability of hemizygous k01405 flies can be increased by C765:GAL4 driving UAS:B4 (SOTILLOS et al. 1997). B4 corresponds to BG:DS07660.4 (only the N terminus is on this sequence) and the predicted protein has no similarity to other proteins, even when the full-length protein of SOTILLOS et al. (1997) is used in a BLASTP analysis.

kuz (l(2)34Da):
l(2)34Da was first identified as being a lethal associated with TE34Ca, an insertion of G. Ising's w+rst+ element, and its alleles TE34Cb and TE34Cc (M. ASHBURNER and J. ROOTE, unpublished observations). It is kuzbanian, encoding a disintegrin-like metalloprotease of the ADAM family (BG:DS07660.3; FAMBROUGH et al. 1996; ROOKE et al. 1996). kuz is required for Notch signal transduction, perhaps for the proteolytic cleavage of the Notch protein (PAN and RUBIN 1997; SOTILLOS et al. 1997). Several P-element alleles of kuz are known, only some of which are lethal [see Table S1 (http://www.genetics.org/cgi/content/full/153/1/179/DC1)].

BG:DS07660.1:
This gene is predicted to encode a protein of 453 amino acids that shows significant similarity in sequence to sodium/phosphate cotransporters of mammals (e.g., BLASTP, P = 10^-59, 32% identity, over 88% of length, to the brain-specific sodium-dependent inorganic phosphate transporter of rat, SP:Q28722). It is also similar (30% identity over 86% of length) to a Na+-dependent inorganic phosphate cotransporter of D. melanogaster mapped to 43BC (EMBL:Y07720). PSORT predicts that the protein has eight transmembrane domains, as do other members of this protein family (GRIFFITH and SANSOM 1998).
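Throughout this appendix, similarities are summarized as a percentage identity over a percentage of the protein's length. The text does not define these measures precisely, so the following Python sketch is only one plausible reading: identity is taken as identical residues divided by alignment columns (the convention BLAST uses when it reports identities), and coverage as the aligned query residues divided by the full query length. The function name and the toy alignment are hypothetical and are not taken from the article.

```python
def identity_and_coverage(aligned_query: str, aligned_subject: str, query_length: int):
    """Return (percent identity, percent of query covered) for one gapped alignment.

    Both strings are rows of the same alignment, with '-' marking gaps.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned strings must have equal length")
    columns = len(aligned_query)
    # A column counts as a match only if both residues are identical and not gaps.
    matches = sum(
        1 for q, s in zip(aligned_query, aligned_subject) if q == s and q != "-"
    )
    query_residues = sum(1 for q in aligned_query if q != "-")
    identity = 100.0 * matches / columns
    coverage = 100.0 * query_residues / query_length
    return identity, coverage


# Hypothetical toy alignment of a 12-residue query (illustration only).
query_row = "MKT-LLVAGS"
subject_row = "MKSALLV-GS"
ident, cov = identity_and_coverage(query_row, subject_row, query_length=12)
print(f"{ident:.0f}% identity over {cov:.0f}% of length")
```

Reporting both numbers matters because a high identity over a short stretch (a shared domain) and a moderate identity over the full length (a likely ortholog) support different conclusions.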

BG:BACR48E02.4:
By virtue of significant sequence similarity with the human and mouse RAS-suppressor protein RSU1 (e.g., BLASTP, P = 10^-73, 55% identity over 86% of its length with SP:Q15404), this predicted gene probably codes for a small GTPase regulatory/interacting protein similar to that identified in mice by CUTLER et al. (1992) in an expression cloning assay for suppressors of the v-Ras phenotype.

BG:DS01368.1:
The predicted protein product of this gene is weakly similar (BLASTP, P = 10^-20, 26% identity over 51%) to a hypothetical protein of C. elegans (C26B9.1, SPTREMBL:Q18202).

BG:DS08249.2:
This gene almost certainly encodes a Drosophila mitochondrial glycerol 3-phosphate dehydrogenase, but it is not that known as Gpo (which maps to chromosome arm 2R; DAVIS and MACINTYRE 1988). The protein product predicted for BG:DS08249.2 has significant matches to the mitochondrial glycerol 3-phosphate dehydrogenase of organisms as different as human (P = 10^-105 with SP:P43304) and Saccharomyces cerevisiae (P = 10^-83 with SP:P32191) as well as having both PROSITE and PFAM flavin adenine dinucleotide (FAD)-dependent glycerol-3-phosphate dehydrogenase matches.

BG:DS08249.3:
The product of BG:DS08249.3 has a PROSITE (PS00518) and PFAM RING-finger domain (PF00097, P = 7.9 × 10^-9) but the only significant BLASTP match is with a hypothetical human protein (P = 10^-75, 43% identity over 91% of its length with SPTREMBL:O75598). Weaker matches are seen with other C3HC4-type zinc finger proteins, e.g., the Lnx protein of mouse (P = 10^-11, with SPTREMBL:O70623) and a hypothetical protein of C. elegans (P = 10^-9, with F45G2.6, SPTREMBL:O62248).

BG:DS00797.1:
The P element k07245 is a viable and phenotypically invisible insertion (although associated with a lethal chromosome) that is located 9 bp 5' to the putative start of transcription of this gene. One out of 135 transposase-induced excisions of this P element is a long distally extending deletion (at least to kuz); this deletion is not mutant for l(2)34Db, giving a distal limit for this gene. The protein encoded by BG:DS00797.1 is predicted to be a transmembrane domain protein (PSORT), similar to the EMP70 protein of S. cerevisiae (BLASTP, P = 10^-94, 34% identity over 72% of length) and a related protein from Arabidopsis thaliana (SPTREMBL:O04091). It has been suggested by SCHIMMOLER et al. (1998) that members of the EMP70 protein family may be involved in small molecule transport in the endosome.

BG:DS00797.2:
This hypothetical protein is similar to proteins from Escherichia coli, S. cerevisiae, and Pennisetum ciliare, whose functions are unknown but that belong to the same protein family (UPF0010).

p38b:
This gene, encoding a MAP kinase (MAPK), corresponds to BG:DS00797.3 as shown by its sequence. It was first found on our sequence by HAN et al. (1998) and was also identified by an EDGP STS sequence (ESTS:186F5S). It is implicated in the antimicrobial response pathway, overexpression downregulating the induction of defense proteins by bacteria. p38b has also been identified by ADACHI-YAMADA et al. (1999), who have shown that it is involved in the TGFβ signalling pathway, because expression of a dominant-negative form causes a dpp-like phenotype and enhances the dpp mutant phenotype. p38b is very similar in protein sequence to human mitogen-activated protein kinase p38 (72% amino acid sequence identity). In D. melanogaster, there is a second p38 homolog, Mpk2, mapping at 95E. The proteins encoded by p38b and Mpk2 (= p38a) are nearly 80% identical in amino acid sequence. Whether or not they can functionally substitute for each other is not yet known. p38b is the fourth MAPK to be identified in Drosophila; the others are the products of the rolled and basket genes, belonging to the ERK2 and JNK families of MAP kinases, respectively; both p38b and Mpk2 belong to the stress-activated family.

BG:DS00797.4:
The conceptual protein of this predicted gene only shows significant similarity with one of unknown function from C. elegans, F26C11.1 (BLASTP, P = 10^-38 with SPTREMBL:Q17843, a protein with PROSITE histidine acid phosphatase signatures), and another of unknown function from the plant Pimpinella brachycarpa (BLASTP, P = 10^-37 with SPTREMBL:O81652).

BG:DS00797.5:
The predicted protein of BG:DS00797.5 has a PFAM ABC transporter pattern (P = 1.9 × 10^-40) and shows BLASTP similarities in its C-terminal exon with ABC transporters from mammals, but the identities are relatively low (~32%). At a similar level of identity, it resembles a hypothetical protein of C. elegans, F33E11.4, which also belongs to the ABC transporter protein family.

BG:DS00797.6:
The protein of this predicted gene shows significant similarities with only two others: one is a hypothetical protein of C. elegans, K09A11.1, said to be similar to transposases (P = 10^-14, 21% identity over 39% of residues with SPTREMBL:Q21374) and the other is the transposase of the Hermit element of Lucilia cuprina (P = 10^-11, 19% identity over 66% of residues with SPTREMBL:Q25239).

anon-34Da:
This gene was named for transcript 7 of BONFINI et al. (1992), mapping ~20 kb distal to Sos. From its position, it probably corresponds to BG:DS00797.7, and this may correspond to l(2)34Db. The predicted protein is similar to the SEC7 protein of S. cerevisiae (P = 10^-171, 33% identity over 28% of length). In yeast this protein is essential for vegetative growth and is involved in endoplasmic reticulum to Golgi protein transport (ACHSTETTER et al. 1988; FRANZUSOFF et al. 1991). The Drosophila protein also shares a domain with the bovine guanyl-nucleotide exchange protein (MORINAGA et al. 1996; 61% identity over 46% of length) and a similar protein is found in A. thaliana (F23E12.60).

BG:DS00941.1:
This is a Drosophila carbonate dehydratase. It shows highly significant BLASTP matches over its entire length with this enzyme from human, mouse, Chlamydomonas, Anabaena, and zebra fish, and there is a similar sequence predicted in C. elegans (R173.1). In vertebrates there are several carbonate dehydratases with different subcellular localizations. BG:DS00941.1 is most similar to the human CA7 and mouse Car2 genes, known or presumed to code for cytosolic forms of the enzyme, which catalyzes the hydration of carbon dioxide. There is biochemical evidence for three carbonate dehydratase genes in Drosophila (CHOUDHARY et al. 1992), but these genes had not been characterized at the molecular level.

BG:DS00941.2:
This gene would appear to code for one of two Drosophila RNA adenosine deaminases (BASS 1997). The other is also a predicted gene from the EDGP (EG:BACN35H14.1). The protein predicted from BG:DS00941.2 shows ~30% identity over its entire length to the double-stranded RNA adenosine deaminases of human, mouse, and Xenopus, which are involved in pre-mRNA editing. There are equally similar proteins predicted for S. cerevisiae (YGL243W), S. pombe, and C. elegans (T20H4.4) that lack double-stranded RNA-binding domains, and the S. cerevisiae protein has been shown to be an adenosine deaminase acting on tRNA (ADAT).

BG:DS00941.2 was independently identified as an RNA adenosine deaminase by L. KEEGAN (personal communication), who has named it Adat. The absence of a ds-RNA-binding domain from this protein, and in vitro studies of the expressed protein, have led L. KEEGAN and colleagues (personal communication) to the conclusion that this protein functions as a tRNA, rather than as a pre-mRNA, adenosine deaminase. This gene probably does not correspond to l(2)34Db, because we expect BG:DS00941.2 to have been included in a 15-kb KpnI Sos transgene (BONFINI et al. 1992) that does not rescue alleles of this lethal locus.

BG:DS00941.3:
The only significant BLASTP match with the protein predicted for BG:DS00941.3 is to a human cDNA sequence (SP:O43351) that matches the human EST EMBL:AA085966, itself said to be similar to the human P31 proteasome subunit (P = 10^-10, 53% identity over 21% of length).

Sos (l(2)34Ea):
l(2)34Ea was one of the most mutable genes in the early EMS mutagenesis experiments. It is the gene named Son of sevenless by ROGGE et al. (1991), who recovered an allele as a dominant suppressor of a gain-of-function allele of sevenless. The same gene was identified as an enhancer of sevenless by SIMON et al. (1991), who showed it to encode a guanine-nucleotide exchange factor required for signal transduction in the RAS pathway (see also BONFINI et al. 1992). Sos corresponds to BG:DS00941.4, as is shown by direct sequence comparison.

black:
The first mutant allele of the black body color gene was discovered by T. H. Morgan in October 1910. It is a nonvital gene and all mutant alleles result in very darkly pigmented adult flies and white pupal cases. The phenotype results from a failure to synthesize β-alanine (HODGETTS 1972) and can be corrected by dietary β-alanine (JACOBS 1974). β-alanine forms an adduct with dopamine (WRIGHT 1987) and this is required for proper tanning of the cuticle (the β-alanyl-dopamine synthetase is probably the product of the ebony gene; see WALTER et al. 1996). There are two possible pathways of β-alanine synthesis, by decarboxylation of aspartic acid and by pyrimidine catabolism (JACOBS 1974). The facts that black mutant alleles are enhanced by mutations in su(r), which encodes the NAD+-dependent dihydrouracil dehydrogenase (BAHN 1972), and that 6-azathymidine produces a black phenocopy (PEDERSEN 1982) suggested that pyrimidine catabolism is the more important in Drosophila.

The predicted gene BG:DS00941.5 maps between Sos and BG:DS00941.6; we argue that the latter is l(2)34Dc (see below). This is precisely the genetic location of black by deletion mapping; moreover, these three genes are so very closely spaced that we can be confident that no others are to be found in this 18-kb interval. BG:DS00941.5 shows a good match (45–50% identity) to glutamate decarboxylase from mammals (mice, human) and to the rat cysteine sulfinate decarboxylase (SPTREMBL:Q64611). The Drosophila gene had been sequenced by PHILLIPS et al. (1993). A cDNA of this gene, the gift of M. Phillips, crosses the breakpoint of Tp(2;3)b79d6, an aberration allele of black that is viable when hemizygous with long deletions of the black region. There is a second gene encoding glutamate decarboxylase in Drosophila, which is required for the synthesis of the neurotransmitter γ-aminobutyric acid, Gad1, mapping to 64A (JACKSON et al. 1990). Glutamate decarboxylase is known to have aspartate decarboxylase (ADC) activity in mammals (PORTER and MARTIN 1988). This suggests that the absence of β-alanine in black mutations is due to a failure of aspartic acid catabolism, rather than of pyrimidine breakdown, despite the data of JACOBS (1974) that indicated no difference in the decarboxylation of 14C-labeled aspartic acid between a black strain and a wild type (see also PHILLIPS et al. 1993).

tamas (l(2)34Dc):
This was identified as a lethal locus from eight EMS-induced alleles. Adult escapers have missing bristles on the head and notum and blistered wings with some disruption of the wing veins. This gene has been deletion mapped to between black and l(2)34Dd or l(2)34Df (the last two genes have not been ordered genetically). Because l(2)34Dd is a Drosophila homolog of yeast SOP2 (below; BG:DS00941.7) and because BG:DS00941.6 is the only open reading frame between black and Sop2 in a very closely packed interval, we conclude that l(2)34Dc is BG:DS00941.6, i.e., encodes the catalytic subunit of the mitochondrial DNA polymerase, previously sequenced from Drosophila by two groups (LEWIS et al. 1996; ROPP and COPELAND 1996). It is interesting that BG:DS00941.9, just 8 kb proximal, encodes the accessory subunit of this enzyme (see below).

B. IYENGAR, J. ROOTE and A. R. CAMPOS (unpublished results) identified an EMS-induced mutation of l(2)34Dc in a screen for larvae defective in their response to light. This phenotype was found to be a consequence of a defect in larval locomotor behavior. Four mutant alleles of l(2)34Dc, which they call tamas, were sequenced; two were missense mutations and the others small (1-bp and 5-bp) deletions within the coding region of the gene encoding the catalytic subunit of the mitochondrial DNA polymerase.

Sop2 (l(2)34Dd):
This gene is known only from three EMS-induced lethal alleles. HUDSON and COOLEY (1998) have shown, by transformation rescue, that these are in BG:DS00941.7, a Drosophila homologue of the Schizosaccharomyces pombe Suppressor of Profilin 2 (SOP2) gene. A similar sequence is the 41-kD subunit of the human ARP2/3 complex, a protein complex involved in the control of actin-filament assembly (WELCH et al. 1997).

Orc5 (l(2)34Df):
Only two EMS-induced lethal alleles are known for l(2)34Df. Genetically, l(2)34Df maps between l(2)34Dc and l(2)34Dd or between l(2)34Dd and l(2)34De, and there are two candidate-predicted genes: BG:DS00941.8 and BG:DS00941.9. The former, BG:DS00941.8, encodes the Drosophila Origin Recognition Complex subunit 5 protein (GOSSEN et al. 1995) and, as shown by M. PFLUMM (personal communication) by transformation rescue, is l(2)34Df. This conclusion places l(2)34Df between l(2)34Dd and l(2)34De.

MtpolB (l(2)34De):
Genetically, l(2)34De maps between l(2)34Dd (Sop2) or l(2)34Df (Orc5) and l(2)34Dg (RpII33). The evidence for the gene order l(2)34De l(2)34Dg comes from complementation data with T(2;3)b89e12, which is l(2)34Dd- l(2)34Df- l(2)34De- l(2)34Dg+. There is only one predicted gene in the 1.9 kb separating Orc5 and RpII33, which is the gene encoding the accessory subunit of the mitochondrial DNA polymerase (BG:DS00941.9; WANG et al. 1997). It is a reasonable hypothesis that l(2)34De encodes this protein.

RpII33 (l(2)34Dg):
l(2)34Dg was first identified from two EMS-induced lethal alleles; subsequently the P-element insertion k05605 was shown to be allelic. This insertion is in the 5' of BG:DS00941.10, encoding a homolog to the 33-kD subunit of RNA polymerase II from mammals, S. cerevisiae, and A. thaliana; we can be confident that this is indeed the RpII33 gene of Drosophila, because the amino acid identities are ~68% between the entire Drosophila protein and its human homolog.

BG:DS08220.1:
This is a predicted gene with a match to human and C. elegans EST sequences of unknown function. The P elements PZ06646 and rN149 are phenotypically silent insertions at the same nucleotide 1 kb upstream of this transcription unit; the viable insertion k10802 is inserted 11 bp 5' to this transcription unit. Over 180 transposase-induced excisions of the PZ06646 element have been recovered; all are viable when heterozygous with long deletions of the 34D-35B interval. Three (of 84) transposase-induced excisions of rN149 are associated with lethal mutations, two of which map distal to BG:DS08220.1 and, presumably, are due to secondary events, and the third of which deletes Ance-wb. The product of BG:DS08220.1 may well be involved in a signal transduction pathway. The most similar proteins are the hypothetical KIAA0167 human protein (BLASTP, P = 10^-148, 42% identity over 51% of residues) and hypothetical C. elegans protein Y39A1A.15B (BLASTP, P = 10^-139, 45% identity over 30% of residues), but significant similarities are seen over short regions with the pig and rat inositol 1,3,4,5-tetrakisphosphate receptor (or binding protein).

anon-34Ea:
This gene was defined by FlyBase for a transcript immediately 5' to Ance detected by TATEI et al. (1995). It is BG:DS08220.2, and is without any significant database matches. The 16.5-kb EcoRI fragment transformed by TATEI et al. (1995) carries anon-34Ea (and Ance) and rescues mutant alleles of Ance, as well as the homozygously deleted region in Df(2L)b88f32/Df(2L)nBR55 heterozygotes (TATEI et al. 1995). The viable insertion EP(2)2171 is inserted within the first exon of this gene.

Ance (l(2)34Eb):
This vital gene was identified by two EMS-induced alleles. It was shown by transformation rescue to encode a peptidyl-dipeptidase A, similar to human angiotensin-converting enzyme, hence Ance, by TATEI et al. (1995). It is BG:DS08220.3, and was also sequenced by CORNELL et al. (1995), but mismapped by them to 34A. Ance protein is an early marker for amnioserosal differentiation (TATEI et al. 1995; FRANK and RUSHLOW 1996), where it is activated by the zen homeodomain transcription factor (RUSCH and LEVINE 1997). There is a second gene encoding an angiotensin-converting enzyme-like protein in Drosophila, Acer, mapping at 29D (TAYLOR et al. 1996). Clearly, these are not functionally redundant; indeed, HOUARD et al. (1998) show that the purified ANCE and ACER enzymes, which are 47% identical in amino acid sequence, have different substrate specificities and expression patterns.

Acyp:
A. Bairoch identified a sequence encoding a homolog of vertebrate acylphosphatase in our sequence of DS00180; this is BG:DS00180.1 (SP:P56544). Biochemical studies of the protein expressed in E. coli confirm its function (PIERI et al. 1998).

BG:DS00180.2, BG:DS00180.3:
BG:DS00180.2 and BG:DS00180.3 are predicted genes whose protein sequences are 28% identical and have valine/proline-rich repeats. These proteins have significant database matches in unfiltered BLASTP to articulins, cytoskeletal proteins of the epiplasm of flagellates and ciliates. Articulins are characterized by VPVPxxVxxxV repeats (MARRS and BOUCK 1992). BG:DS00180.2, e.g., has four copies of a VIK[K|E]V[P|H]VPV motif and four copies of a PVEKx[V|I]HVPV[H|K]V motif.
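Repeat motifs written in this notation can be searched for mechanically. The Python sketch below assumes that a bracketed group such as [K|E] means "either residue at this position" and that a lowercase x means "any residue"; under that reading each motif translates directly into a regular expression. The helper names are ours, and the test sequence is a hypothetical string built for illustration, not the BG:DS00180.2 protein.

```python
import re


def motif_to_regex(motif: str) -> str:
    """Translate the motif notation used in the text into a regular expression.

    '[K|E]' (alternatives) becomes the character class '[KE]';
    'x' (any residue) becomes '.'.
    """
    return motif.replace("|", "").replace("x", ".")


def count_motif(motif: str, protein: str) -> int:
    """Count occurrences of a motif, allowing overlaps, via a lookahead."""
    pattern = re.compile(f"(?={motif_to_regex(motif)})")
    return sum(1 for _ in pattern.finditer(protein))


# Hypothetical sequence for illustration only (NOT the real protein).
toy_protein = "MAVIKKVPVPVAAVIKEVHVPVGGPVEKAVHVPVHVSS"

for motif in ("VIK[K|E]V[P|H]VPV", "PVEKx[V|I]HVPV[H|K]V"):
    print(motif, "->", motif_to_regex(motif), count_motif(motif, toy_protein))
```

The lookahead is used so that tandem repeats sharing residues would still be counted separately; for non-overlapping repeats a plain `re.findall` would give the same count.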

BG:DS00180.5:
The protein of this predicted gene has a limited region of similarity with angiotensin-converting enzymes from mammals and Drosophila, e.g., 42% identity over 13% to the human DCP1 protein (SP:P12821). It does not have a PROSITE zinc metallopeptidase, zinc-binding region signature, nor is it similar overall with either the Ance or Acer proteins. The existence of this gene is based on ab initio prediction; it has no EST matches.

BG:DS00180.12, BG:DS00180.7, BG:DS00180.8, BG:DS00180.9, BG:DS00180.10, and BG:DS00180.14:
These are a cluster of predicted genes, all of which show features of extracellular protein domains, such as EGF repeats and similarities to vertebrate tenascins and fibrillins. Inter se their similarities are in the twilight zone (18–28% identity) except for BG:DS00180.12 and BG:DS00180.8 (37% identity). Four of these genes have Drosophila EST sequences. Their relationships and structures require further study.

BG:DS00180.11:
This is one of the two genes in this region that encode cytochrome P450s [the other is l(2)35Fb]. The most similar protein is Cyp28a1 of D. mettleri (68% identity), one of a new family of cytochrome P450s identified as being induced by isoquinoline alkaloids found in the cactus hosts of this desert species (DANIELSON et al. 1997).

rk:
rickets was discovered after UV mutagenesis by EDMONDSON 1948. All alleles cause a recessive visible phenotype characterized by bent legs (especially those of the metathorax) and unexpanded wings (at least in strong alleles). It is not lethal, because overlapping deletions [e.g., Df(2L)el80f1/Df(2L)b85f1A] are viable (and extreme rickets). There is one P-element allele known, rk11P; its insertion site maps some 4 kb upstream of the rk sequence, which J. BAKER (personal communication) identified as corresponding to BG:DS00180.13. This gene encodes a 7TM protein that may be a neuropeptide hormone receptor because it shows sequence similarity to the mammalian G-protein-coupled lutropin-choriogonadotrophic hormone receptor (BLASTP, P = 10⁻⁹¹ with SP:P22888; J. BAKER, personal communication). The rickets protein is also similar in sequence to the product of the Drosophila Fsh gene, described as being related to the mammalian glycoprotein hormone receptors (HAUSER et al. 1997).

BG:DS01514.2 and BG:DS05899.1:
These genes are of rather different structure. The former has seven exons and the latter two. Yet their predicted proteins are of similar length (668 and 681 amino acids, respectively) and 43% identical (71% similar) in sequence. Both show significant similarities with long-chain-fatty-acid-CoA-ligases from species as different as Archaeoglobus fulgidus, yeasts, and mammals, and with similar genes in C. elegans (R09E10.3) and A. thaliana (T08I13.8). This is presumably their function in Drosophila. The P element k09909 maps to BG:DS01514.2. M. LEPTIN and C. COELHO (personal communication) have sequenced cDNAs for both of these genes.

l(2)34Fa:
This vital gene is known from two EMS alleles and one P-element insertion (k00811). The insertion site of the latter has been sequenced and falls 1.4 kb 5' to the open reading frame of BG:DS05899.2. The predicted product of this gene has no sequence matches.

BG:DS05899.7:
The predicted protein of this gene shows similarities to a variety of proteins from C. elegans, S. cerevisiae, Arabidopsis, and mammals. These all have leucine-rich repeats in common with BG:DS05899.7.

BG:DS05899.3:
The product of BG:DS05899.3 is cysteine rich and has relatively low similarities (BLASTP expectations in the range P = 10⁻⁹ to 10⁻¹²) with mammalian fibrillin 1 precursors as well as with the apx-1 gene product of C. elegans. The latter is a Delta-like protein expressed maternally in the worm and interacting with the glp-1 protein (a homolog of Drosophila Notch) in the determination of the anterior-posterior axis of the four-cell embryo (MELLO et al. 1994).

BG:DS05899.4:
This gene is predicted to encode a nicotinic acetylcholine receptor alpha chain. It shows 54% identity (over 57% of its length) with the human neuronal nicotinic acetylcholine receptor alpha-7 chain precursor (CHRNA7, SP:P36544) and its homologs in chicken and mouse. Three other nicotinic acetylcholine receptor alpha chains are known in Drosophila, two in 96A on chromosome arm 3R and one at 7E on the X chromosome (data from FlyBase).

BG:DS01523.2:
BG:DS01523.2 is predicted to encode a protein that has relatively low similarity (25% identity) to Drosophila midline fasciclin and fasciclin-like proteins from chick (SPTREMBL:O42390), mouse (osteoblast specific factor 2, SPTREMBL:Q62009), and a human TGFβ-induced protein (SP:Q15582). The C-terminal region of this 1894-residue predicted protein is very threonine rich (overall the predicted protein is 17.6% threonine), with many small repeat motifs, e.g., nine copies of TT[P|R|N]APTTT[D|E|K], five copies of TTTTA, four of TTTTS, and four of EITTT.

smi35A:
smi35A was identified by ANHOLT et al. 1996 on the basis of the reduced avoidance of benzaldehyde and other noxious chemicals associated with a P-element insertion. R. ANHOLT (personal communication) has discovered that a similar phenotype is associated with the insertions k16716 and k06901. Both these and the original smi35A insertion map within a 21-bp interval some 12 kb 5' to wb. Indeed, k16716, but not the other two insertions, is associated with a very weak wing-blister phenotype (when hemizygous with a wb- deletion). However, the strongest smell-impaired phenotype is associated with the insertion k11509, which maps some 30 kb more distally, within the 5' exon of BG:DS01523.3 (R. ANHOLT, personal communication). One of 128 transposase-induced losses of this element is a lethal allele of wb. This predicted gene encodes a YAK1/DYRK family protein kinase; the Drosophila and human proteins (DYRK2, SPTREMBL:Q92630) are 56% identical over 44% of the length of the former.

wb (l(2)34Fb):
Alleles of wing blister are the most common lethals in EMS screens against deletions uncovering the Adh region. The alleles vary from being completely lethal to viable, with adult flies having a characteristic blister in the central wing. Several P-element alleles have been sequenced, some of which are lethal alleles and some viable. A lethal insertion, PZ09437, maps within a long intron of BG:DS03792.1, a gene encoding a protein similar to both laminin α-1 and α-2 chains of mouse and human. This gene has also been studied by MARTIN et al. 1999, who have determined both its molecular structure and its expression. The gene is among the largest in the Adh region, over 70 kb in length with a predicted mRNA of 10.8 kb spliced with at least 16 exons. Its size presumably accounts for its mutability, not only with EMS but also after irradiation; three chromosome aberrations are associated with wb alleles [T(2;3)6r28, In(2LR)DTD121, and T(2;3)H68]. There is an independent gene prediction included within wb, BG:DS03792.2.

BG:DS01068.10:
This is one of several predicted genes to encode a serine protease. The protein of BG:DS01068.10 is similar to trypsins from several organisms, from Streptomyces glaucescens to macaque. It is most similar to the theta-trypsin of D. melanogaster (37% identity over its entire length).

BG:DS01068.6:
This is another gene encoding a protein conserved between yeasts and flies, but all of whose significant matches are themselves hypothetical. PSORT strongly predicts this protein to be nuclear. The matches are to F32E10.1 of C. elegans (45% identity over 77% of residues), YGR145W of S. cerevisiae (38% identity over 76% of residues), and SPCC330.09 of S. pombe (37% identity over 78% of residues). Mammalian EST matches indicate that a similar gene (or genes) will be found in mouse and human in due course.

Rab14:
Rab14 is one of many genes in D. melanogaster encoding RAS-related proteins. By direct sequence comparison BG:DS01068.7 is Rab14, which had been sequenced by SATOH et al. 1997 but mapped by them to 36A-B. This gene is also identified by an STS sequence derived from a cosmid mapped to 34F-35A (ESTS:57H4T, EMBL:Z50609). The phenotypically silent P-element insertion k08712 is inserted at the 5' end of Rab14. Three transposase-induced excisions of k08712 are lethal (of 37 recovered). One is an allele of l(2)35Aa and two are alleles of l(2)34Fd. The lethality of the l(2)35Aa- derivative of k08712 is rescued by a P-element insertion carrying a 5-kb l(2)35Aa rescue fragment (given to us by C. Flores) in the Df(2L)k08712-rv21/Df(2L)TE35B-7 heterozygote, which is deleted for Rab14, l(2)35Aa, spel1, and ppk. These data suggest that l(2)34Fd is distal to Rab14, and that Rab14 itself is not a vital gene.

l(2)35Aa:
Seven EMS-induced lethal alleles of l(2)35Aa are known. l(2)35Aa corresponds to BG:DS01068.8, which encodes a protein similar to a polypeptide N-acetylgalactosaminyltransferase of human (SPTREMBL:Q10471), as was demonstrated by FLORES and ENGELS 1999 by transformation rescue of mutant alleles and the overlapping deletions Df(2L)b84h1 and Df(2L)TE35B-7.

spel1:
spellchecker-1 encodes a Drosophila protein probably involved in DNA mismatch repair, because it carries a mutS protein family signature (FLORES and ENGELS 1999). It corresponds to BG:DS01068.9. spel1 is not a vital gene because a 5-kb l(2)35Aa transgene rescues the lethality of overlapping deletions [Df(2L)TE35B-7/Df(2L)b84h1] that are homozygously deleted for both l(2)35Aa and spel1 (FLORES and ENGELS 1999).

ppk:
pickpocket encodes a protein whose sequence shows it to be a member of the DEG/ENaC protein superfamily (ADAMS et al. 1998; WALDMANN and LAZDUNSKI 1998). ADAMS et al. 1998 suggest that it may act as an ion channel protein in mechanosensory signal transduction, because it is expressed in a subset of multidendritic neurons. It corresponds to BG:DS06238.1. It is not a vital gene because the deletions Df(2L)A400 and Df(2L)b88h49 both remove ppk (M. ANDERSON, personal communication) and these deletions are viable when heterozygous with each other (see also above). This gene has also been sequenced by DARBOUX et al. 1998 and described as a multidendritic neuron sodium channel protein.

elbow (el) and pupal (pu):
The genetics of the elbow-no ocelli region have long been known to be complex (see DAVIS et al. 1997). elbow and pupal have been known for many years, although, until the genetic analysis of the Adh region began, only a single allele of elbow had been recovered. The complex complementation patterns between the many alleles of elbow that have now been analyzed suggest that this "gene" is in fact two, elB and elA, and that mutations of each can act as dominant enhancers of mutations of the other. The insertion EP(2)2039 is a weak elbowB allele, and enhances Sco, as do other alleles of elB; transposase-induced excisions of this element either revert the elbow phenotype, remain elbow, or are deletions extending proximal-ward (to include pupal) or distal-ward (to include l(2)35Aa). This P element is inserted at the 5' end of the GENSCAN prediction for BG:DS06238.3, encoding a Zn-finger protein. We suggest that this gene is elB. If BG:DS06238.3 is elB, then BG:DS06238.4, predicted to encode a protein with similarity to a Drosophila pupal cuticle protein (60% identity over 28% of length with the Edg84A protein), is probably pupal (whose most obvious phenotype is a failure of wing expansion), and BG:DS08340.1 is probably elA. BG:DS08340.1 is wholly contained within the 20-kb deletion associated with el1; its sequence has no significant database matches. Although elB, pu, and elA are all nonvital individually, deleting all three genes results in pharate adult lethality, the adult escapers having crippled legs.

noc:
no-ocelli was first identified by the absence of ocelli in certain viable overlapping deletion heterozygotes (ASHBURNER et al. 1982A). Subsequently, a number of viable alleles were found, including one associated with G. Ising's w+ rst+ TE, TE146 (now TE35B; GUBB et al. 1985). A lethal complementation group, described as l(2)35Ba, was clearly associated with noc, because heterozygotes between the EMS-induced lethal alleles of this group and viable noc alleles had no ocelli. In fact, this lethal locus and noc are the same gene, the viable alleles all being in 3' regulatory regions (CHIA et al. 1985; MCGILL 1985; DAVIS et al. 1990; CHEAH et al. 1994). Three of the EMS-induced alleles die as embryos, showing a failure of embryonic head involution with hypertrophy of the supraesophageal ganglion (CHEAH et al. 1994). Paradoxically, overlapping deletions for noc die as larvae, with no central nervous system phenotype; these three EMS alleles are recessive antimorphs (see discussion in CHEAH et al. 1994). noc encodes a protein with a C2H2-like zinc finger and several long poly-alanine runs and corresponds to BG:DS04641.1, as shown by direct sequence comparison with the data of CHEAH et al. 1994. This protein shows sequence similarity with the human SP1 and SP2 transcription factors.

noc shows complex genetic interactions with mutations at the elA and elB loci (DAVIS et al. 1997). It therefore is of some interest that BG:DS06238.3, which we suggest is elB and maps ~100 kb distal to noc, encodes a zinc finger protein showing 27% amino acid sequence identity with the noc protein.

BG:DS01486.1:
Ubiquitin-protein ligases are required for the ubiquitination of proteins destined for breakdown via the 26S proteasome. BG:DS01486.1 is the 12th gene in this family to be discovered in D. melanogaster (data from FlyBase); there are at least 13 in S. cerevisiae (SACCHAROMYCES GENOME DATABASE 1999) and at least 10 in C. elegans (WORMPEP 1999). BG:DS01486.1 shows high identities (up to 83%) with 17-kD ubiquitin-conjugating enzyme E2 of organisms from yeast (UBC13p) to human (VARSHAVSKY 1997).

osp:
outspread was first recognized in Cambridge by the outspread wing phenotype of certain viable overlapping deletion heterozygotes (WOODRUFF and ASHBURNER 1979A). Subsequently, E. H. Grell (cited in LINDSLEY and ZIMM 1992) identified an EMS-induced allele and many have been found since. It is not a vital gene, as complete deletions of osp are viable. Molecular mapping of aberration breakpoints associated with osp alleles on a phage chromosome walk showed that three mapped distal to Adh and four mapped proximal to this gene, leading to the conclusion that Adh was contained within osp (CHIA et al. 1985). Subsequent work (MCNABB et al. 1996) strengthened this hypothesis and, from our cDNA sequencing, we found that coding exons of osp map both distal to Adh and proximal (BG:DS01486.7). Adh and Adhr appear not to be the only genes included within osp; in addition to these are two transposable elements (roo and jockey) and two predicted genes, BG:DS07721.1 and BG:DS09219.1. The second of these would be transcribed in the same direction as osp, and may be part of osp itself, if osp has an alternative transcript that has not yet been found as a cDNA (we already know of alternative transcripts of this gene that differ in their 3' exons). BG:DS07721.1 cannot be part of osp because it would be transcribed from the opposite strand (its existence is predicted by an EST sequence).

There are two P-element insertions in the 5' exon of osp: one (rJ571) causes an osp phenotype, the other (k13218) does not. (A minority of transposase-induced excisions of k13218, 10 out of 225, are phenotypically outspread.) The gene is the largest we have found in the sequenced region, extending over 95 kb, with 5.3- and 3.9-kb cDNAs.

The predicted osp protein has a pleckstrin homology (PH) domain (PFAM:PF00169), implicating a role in the cytoskeleton. It shows some similarity to a protein involved in the control of the actin cytoskeleton in mice (p116Rip, SPTREMBL:P97434), to the myosin heavy chain products of the human MYH3 and MYH8 genes, and to the S. cerevisiae gene product USO1 involved in intracellular protein transport.

Adh and Adhr:
These are a pair of related genes, coding for proteins with 33% amino acid identity. The positions of the two introns that interrupt the coding regions of each are the same in the two genes, supporting the hypothesis that they arose by tandem duplication (SCHAEFFER and AQUADRO 1987). The transcript of Adhr is much rarer than that of Adh and is always found as an Adh-Adhr dicistronic mRNA (BROGNA and ASHBURNER 1997). These genes correspond to BG:DS01486.8 and BG:DS01486.9, respectively. Despite its sequence matches Adhr is probably not an alcohol dehydrogenase; it is not an essential gene (ASHBURNER 1998).

BG:DS00810.1:
The product of this predicted gene has a significant BLASTP score (P = 10⁻¹⁹, 34% identity over 46% of length) to a hypothetical protein of C. elegans (ZK652.6).

BG:DS06874.2:
High BLASTP scores (P = 10⁻⁶⁰, 39% identity over 94% of length) identify the product of BG:DS06874.2 as being involved in a G-protein signal transduction pathway, because it is similar to the human protein GPS1 (and its rat homolog) isolated as a cDNA that suppresses gain-of-function mutations in the pheromone response pathway of S. cerevisiae and the RAS pathway in mammalian cells (SPAIN et al. 1996). The mammalian protein, and its Drosophila homolog, are also similar to the FUS6 protein of A. thaliana, which is a negative regulator of light-mediated signal transduction (CASTLE and MEINKE 1994).

BG:DS06874.3:
The protein predicted to be the product of BG:DS06874.3 has a PROSITE ATP/GTP-binding site motif A (P-loop) and PROSITE AAA-protein family signature. Its closest sequence match in the yeast genome is MSP1, encoding an AAA family ATPase of the inner mitochondrial membrane presumed to be involved in protein sorting (NAKAI et al. 1993; P = 10⁻⁶³, 42% identity over 77% of its length). There are similar proteins in C. elegans (K04D7.2), A. thaliana (T14P8.7), and human (SKD1), and the BG:DS06874.3 protein shows 36% amino acid sequence identity with the TER94 gene product of D. melanogaster, isolated as a homolog of the yeast CDC48 protein (PINTER et al. 1998). The CDC48 protein is an essential AAA-family ATPase required for membrane fusion (YEAST PROTEOME DATABASE 1998). The AAA family ATPases are a functionally diverse group of proteins, many of which are associated with the membranes of cell organelles (PATEL and LATTERICH 1998). The predicted protein of BG:DS06874.3 has a long C-terminal coiled-coil domain (PSORT prediction).

BG:DS06874.4 and BG:DS06874.6:
The predicted protein products of these genes are 45% identical in amino acid sequence, and both products show significant similarities with a variety of serine proteases from organisms as different as C. elegans and human. These are not vital genes, because the heterozygote between the deletions Df(2L)A72 and Df(2L)A47, which removes both of these genes, is viable (J.-M. REICHHART, personal communication).

BG:DS03431.1:
We predict that the protein product of BG:DS03431.1 is a cation-dependent amino acid transporter. It shows 31% amino acid identity with the Drosophila inebriated protein (a Na⁺/Cl⁻-dependent neurotransmitter transporter; SOEHNGE et al. 1996), and similar identities with Na⁺/Cl⁻-dependent transporters from human (SLC6A6, a taurine transporter), Manduca sexta (KAAT1, amino acid transporter), rat (SLC6A11, GABA transporter), and even Methanococcus jannaschii (MJ1319, a putative sodium-dependent transporter). As expected for a protein of this function the BG:DS03431.1 product is predicted by PSORT to have 12 transmembrane domains.

Mst35Ba and Mst35Bb:
These are a tandem pair of related genes that encode protamine-like proteins (RUSSELL and KAISER 1993). They are probably not vital because Df(2L)TE35D-5/Df(2L)TE35B-9 and Df(2L)TE35B-9/Df(2L)osp29 survive but are male sterile, suggesting that one or both of these may be required for male fertility. This conclusion is tentative, because these deletions remove much more than just these two Mst genes, but we reserve the symbol ms(2)35Bi for the genetic factor(s) responsible for this sterility. These protamine-like genes correspond to BG:DS03431.2 and BG:DS03431.3, respectively.

BG:DS03144.1:
This is a large predicted gene (~13.5 kb) with 11 predicted exons. Significant BLASTP matches are seen with a number of poorly characterized putative glycosyl phosphatidyl inositol (GPI)-anchored membrane-bound proteins with immunoglobulin-like domains [e.g., the D. melanogaster Amalgam protein and locust lachesin (P = 10⁻²⁹ with SP:Q26474; KARLSTROM et al. 1993)].

BG:DS03323.1:
The BG:DS03323.1 protein shares a region of 61% amino acid identity (over 28% of its length) with that coded for by the strawberry-notch gene of D. melanogaster. We have tested deficiencies that include BG:DS03323.1 for interactions with sno alleles, with negative results. This protein is also similar to hypothetical proteins from human (R31180_1, P = 10⁻²³¹), C. elegans (F20H11.2, P = 10⁻²⁵²), and A. thaliana (YUP8H12R.3, P = 10⁻¹⁷⁹) and to a probable methylase or helicase from the pNL1 plasmid of Sphingomonas aromaticivorans (orf235), itself showing 31% identity to the sno protein.

BG:DS01219.3:
This protein shows weak similarity (29% identity over 47% of length) with the neuromusculin protein of Drosophila, a cell-adhesion protein, and with a fragment of the FAR-2 protein of Gallus (SPTREMBL:Q90843, 32% identity over 22% of length).

BG:DS01219.1:
This shows weak similarity to a hypothetical protein of C. elegans (C26B9.1, P = 10⁻¹⁷, 31% identity over 47% of length).

l(2)35Bb and l(2)35Bc:
Five lethal complementation groups were identified in the interval between osp and Su(H). Of these, l(2)35Bb is the most distal, because only it is included within Df(2L)fn3; l(2)35Bd is the most proximal, because only it is included within Df(2L)Ctxrv1. The remaining three loci, l(2)35Bc, l(2)35Be, and l(2)35Bf, were unordered between these loci.

k11524 is a lethal allele of l(2)35Bb, which, by the sequence of its insertion site, maps 5' to BG:DS01291.1 (a gene prediction supported by several ESTs) and within the GENSCAN prediction BG:DS00929.16. k08808 is a lethal allele of l(2)35Bc. Two out of seven induced derivatives of this element revert this lethality; three are deletions; one extends distally to include osp, as well as l(2)35Bb, l(2)35Bc, l(2)35Be, and l(2)35Bf; one extends distally to include only l(2)35Bc and l(2)35Be; and the third extends proximally to include l(2)35Bc and l(2)35Bd. This establishes the following gene order: l(2)35Bf, l(2)35Be, l(2)35Bc. The insertion site of k08808 is within the LTR of a yoyo element. Confusingly, in the DNA sequenced, there is a yoyo element within an intron of l(2)35Bb. However, k08808 is not an allele of this gene. We assume that in the chromosome into which k08808 inserted there was a yoyo element in l(2)35Bc. It is probable that l(2)35Bc corresponds to either BG:DS00929.4 or to BG:DS00929.3 (see below).
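The ordering argument above rests on one rule: a deletion removes a contiguous stretch of chromosome, so the loci it uncovers must be adjacent in the true gene order. A minimal sketch of that reasoning as a brute-force consistency check follows. The function names and the shortened locus labels (Bb for l(2)35Bb, and so on) are hypothetical, and osp is left out of the first deletion because it lies outside the interval being ordered.

```python
from itertools import permutations


def is_contiguous(order, deleted):
    """True if the loci removed by one deletion are adjacent in this order."""
    positions = sorted(order.index(locus) for locus in deleted)
    return positions[-1] - positions[0] + 1 == len(positions)


def consistent_orders(loci, deletions, most_distal, most_proximal):
    """Enumerate distal-to-proximal orders compatible with every deletion."""
    middle = [g for g in loci if g not in (most_distal, most_proximal)]
    results = []
    for perm in permutations(middle):
        order = (most_distal, *perm, most_proximal)
        if all(is_contiguous(order, d) for d in deletions):
            results.append(order)
    return results


loci = ["Bb", "Bc", "Bd", "Be", "Bf"]
# The three deletions recovered among derivatives of k08808, as described above.
k08808_deletions = [
    {"Bb", "Bc", "Be", "Bf"},  # extends distally (also removes osp)
    {"Bc", "Be"},              # extends distally, shorter
    {"Bc", "Bd"},              # extends proximally
]
# Bb is taken as most distal and Bd as most proximal, from Df(2L)fn3 and
# Df(2L)Ctxrv1 respectively.
orders = consistent_orders(loci, k08808_deletions, "Bb", "Bd")
```

Under these assumptions a single order survives, and it places l(2)35Bf, l(2)35Be, and l(2)35Bc in the sequence stated in the text.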

BG:DS00929.2:
The protein product of BG:DS00929.2 has a PFAM ankyrin repeat pattern (PF00023, P = 5.3 × 10⁻²¹) and is similar to ankyrin R of human (39% identity over 57% of length), to the D. melanogaster ankyrin protein (47% identity over 41% of length), and to similar proteins of other taxa. Ankyrins, as their name suggests, are involved in anchoring cytoskeletal proteins to the plasma membrane.

BG:DS00929.3:
This protein is probably a Drosophila homolog of the human transcription-factor-associated protein DR1 (61% identity over 65% of length). It shows comparable similarity with the Xenopus homolog (SPTREMBL:O13068) and significant similarity with the Saccharomyces and Arabidopsis homologs (SPTREMBL:Q92317 and SP:P49592, respectively). The DR1 protein interacts with the TATA-binding protein TBP to repress both basal and activated transcription (YEUNG et al. 1994).

BG:DS00929.4:
We can make no predictions about the function of the protein of BG:DS00929.4, yet it is conserved, with 54% identity (over 77% of its length) with the hypothetical YGR024C protein of S. cerevisiae. It also shows weak similarity with MTH972 of Methanobacterium thermoautotrophicum (29% identity over 67% of length), but this too is of unknown function.

l(2)35Bd:
This is a lethal locus known from six EMS-induced alleles, a P-element allele (PZ10408), and an allele on the cytologically complex translocation Tp(3;2)AntpCtx. The latter allele may be due to a second-site mutation, as SCHWEISGUTH and POSAKONY 1992 mapped the 35B breakpoint of this translocation 12 kb distal to Su(H), a position some 18 kb proximal to BG:DS00929.5, the predicted gene in which PZ10408 lies. The breakpoint mapped by SCHWEISGUTH and POSAKONY 1992 must be correct, as it was the position of the fusion fragment with Antp from which they initiated the chromosome walk to Su(H). BG:DS00929.5 encodes a protein similar to the mRNA cap methyltransferases of S. cerevisiae and S. pombe (34–35% identity over 60–64% of the length of BG:DS00929.5).

BG:DS00929.6:
Although only one GABA receptor has been well studied in Drosophila (Rdl, a mutation of which results in cyclodiene resistance), there is at least one other known, Lcch3 (see HOSIE et al. 1997 for review), and evidence of a third from the EDGP sequence data (EG:30B8.6). The predicted BG:DS00929.6 protein is 56% identical in sequence over a short domain with the rat GABA-BR1B receptor (SPTREMBL:O08621) and shows weak similarity (24–30% identity) with the human metabotropic glutamate receptor GRM8 (SP:O00222), the Fugu pheromone receptor CA12 (SPTREMBL:O73638), and the Drosophila metabotropic glutamate receptor Glu-RA (SP:P91685). PSORT predicts that the BG:DS00929.6 protein has seven transmembrane domains.

BG:DS00929.7:
The BG:DS00929.7 protein is similar to fibrinogens from mammals and to a similar protein in C. elegans (SPTREMBL:Q18914). For example, the identity with the human fibrinogen alpha chain precursor is 42% over 95% of the length of BG:DS00929.7. There is a similar degree of similarity (39% identity) to the Drosophila scabrous protein. The scabrous product is a secreted glycoprotein and its fibrinogen-related domain is required for activity (LEE et al. 1998).

BG:DS00929.8:
The only significant similarities for the protein of this predicted gene are to the yellow proteins of D. melanogaster (SP:P09957) and D. subobscura (SPTREMBL:O02437). In both cases the similarity is 43% amino acid identity over 67% of the length of the BG:DS00929.8 protein.

l(2)35Bg:
This is a lethal locus identified by two EMS alleles, a PM hybrid dysgenesis allele and a P-element insertion, k10011. The P element is in a very short predicted gene, BG:DS00929.9, just distal to Su(H). The protein is similar (57–74% identity) to others of unknown function in human (A-152E5.9), C. elegans (T20B12.7), and S. cerevisiae (YKR071C). V. MOREL and F. SCHWEISGUTH (personal communication) have shown that a 1.9-kb deletion isolated by excision of an unmarked P element in Su(H) does not complement lethal alleles of either Su(H) or l(2)35Bg. This lethality is rescued by a transformant carrying the transcription unit immediately 5' to Su(H), called transcript B by SCHWEISGUTH and POSAKONY 1992; l(2)35Bg corresponds, therefore, to BG:DS00929.9.

Su(H) (l(2)35Bh):
Loss-of-function alleles and deletions of Su(H) act as dominant suppressors of Hairless, while a gain-of-function allele and duplications of the wild-type gene act as dominant enhancers of H (see NASH 1965; ASHBURNER 1982). Adult escapers of loss-of-function alleles have an extreme vg-like wing phenotype and almost no macrochaetae (ASHBURNER 1982). The gene was cloned by FURUKAWA et al. 1992 and by SCHWEISGUTH and POSAKONY 1992 and encodes a transcription factor. Notch activation by its ligand Delta results in the translocation of the Su(H) protein from the cytoplasm to the nucleus (GUO et al. 1996) where it regulates E(spl) complex transcription (e.g., BAILEY and POSAKONY 1995). Su(H) corresponds to BG:DS00929.10.

ck:
crinkled was first identified by Bridges in 1930 (BRIDGES and BREHME 1944), but the original allele has been lost. New alleles were discovered by ASHBURNER et al. 1982B (see GUBB et al. 1984), and these cause a very similar phenotype to that described by Bridges. Mutant alleles are lethal or semilethal; escaper adults have stubbly bristles, multiple trichomes, and feathery aristae; embryos have abnormal denticles (NUSSLEIN-VOLHARD et al. 1984). The insertion of G. Ising's w+ rst+ TE element TE35BC interrupts BG:DS00929.11, the predicted gene immediately proximal to Su(H) where, indeed, ck deletion maps. This gene encodes an unconventional myosin (myosin VIIA) and was cloned and sequenced on this basis by D. KIEHART (personal communication; see CHEN et al. 1991). The P element PZ07130 is inserted just 28 bp 5' to the presumed start of transcription of ck. It is, phenotypically, a weak ck allele and most (34/48) transposase-induced excisions revert this phenotype; three were stronger ck alleles and six were deletions extending either proximally to include TfIIS or distally to include Su(H). Mutations in the human and murine myosin VIIA cause deafness, Usher syndrome type 1B in human (WEIL et al. 1995), and shaker-1 in mouse (GIBSON et al. 1995). It is striking that in strong shaker-1 alleles of mouse (e.g., Myo7a816SB) there are defects in organization of the stereocilia of the cochlea (SELF et al. 1998); the stereocilia are analogous to the epidermal cell hairs of Drosophila. A second analogous phenotype is seen in the trichomes of epidermal cells of Arabidopsis mutant for the ZWI kinesin-like protein (OPPENHEIMER et al. 1997). The ZWI protein and myosin VIIA proteins share a C-terminal MyTH4 domain (PFAM:PF00784; CHEN et al. 1996).

TfIIS (l(2)35Cf):
There is only one genetically characterized gene that maps between ck and vasa. This is l(2)35Cf, known from PM hybrid dysgenic alleles that escape to give flies with a held-out wing and rough eye phenotype (ASHBURNER et al. 1990). The only gene predicted in this region is BG:DS00929.12, which encodes an RNA-polymerase II elongation factor, TfIIS (MARSHALL et al. 1990; OH et al. 1995; XIE and PRICE 1996). The identification of l(2)35Cf with TfIIS is supported by the mapping of the proximal breakpoint of Df(2L)64j by LASKO and ASHBURNER 1988. This breakpoint maps ~15 kb distal to the EcoRI site that is 1 kb 3' to the 3' end of vasa; Df(2L)64j is l(2)35Cf- vasa+ and the breakpoint is predicted to be within BG:DS00929.12.

vas, vig, and BG:DS00929.15:
vasa is a maternal-effect lethal, and embryos from homozygous mothers have a "posterior" phenotype with no abdomen or pole cells (SCHUPBACH and WIESCHAUS 1986). It encodes a DEAD-box RNA-dependent ATPase that is localized to the pole plasm of oocytes and is sequestered by the pole cells of the embryo (HAY et al. 1988; LASKO and ASHBURNER 1988, LASKO and ASHBURNER 1990). The vasa protein interacts with the oskar protein and, with this and the tudor protein, is a pole granule component (BREITWIESER et al. 1996). vasa corresponds to BG:DS00929.14. When first characterized, its 5' exon was missed, but was subsequently discovered (see STYHLER et al. 1998). This exon is separated by a 6.6-kb intron from the rest of the gene and this intron includes BG:DS00929.13, named vasa intronic gene (vig) by K. EDWARDS (personal communication). Two P elements, EP(2)0812 and k07233, map within the putative coding region of vig. Genetically, both behave as alleles of vasa, e.g., being female sterile when heterozygous with the EMS-induced allele vasa3. P. LASKO (personal communication) has discovered another gene included within vasa. This is BG:DS00929.15 and its existence was also predicted by GENSCAN. While ESTs for vig have been found, none, so far, are known for this gene.

BG:DS04929.1:
The protein predicted for BG:DS04929.1 only shows a low degree of similarity (22–25% identity over 15–18% of its length) with hypothetical proteins from C. elegans (F56A8.1), S. pombe (PI030), and S. cerevisiae (YBR086C). PSORT predicts the Drosophila protein to have seven transmembrane domains.

stc (l(2)35Cb):
shuttle craft was characterized by STROUMBAKIS et al. 1996 as a protein related in sequence to the mammalian transcription factor NF-X1; in addition to cysteine-rich domains, characteristic of NF-X1, it has an RD RNA-binding domain. It corresponds to l(2)35Cb, known from five EMS-induced alleles, the proximal breakpoint of In(2L)dpps22, and two P-element alleles. The insertion site of one of the latter has been sequenced. Lethal alleles of l(2)35Cb die as embryos that do not hatch due to a failure of the peristaltic movements required for hatching (STROUMBAKIS et al. 1996; TOLIAS and STROUMBAKIS 1998). The stc sequence corresponds to BG:DS04929.4. Just 5' to this sequence is a short open reading frame (BG:DS04929.3) that also has a PROSITE C2H2 type zinc finger domain and is similar to other zinc finger proteins (e.g., 46% identity over 39% of length to human ZNF41). Curiously, the insertion site of PZ05441 (called PZ9 by STROUMBAKIS et al. 1996) is within an intron of this second open reading frame. Extensive genetic tests have confirmed the allelism of this insertion with other l(2)35Cb alleles. In addition, STROUMBAKIS et al. 1996 reverted the stc phenotype associated with PZ05441 by P-element excision. One possibility is that there is an undetected 5' exon of stc distal to BG:DS04929.3; another is that BG:DS04929.3, rather than stc, is l(2)35Cb. The former possibility is suggested because STROUMBAKIS et al. 1996 state that homozygotes for PZ05441 lack protein that reacts with an anti-STC antibody.

BG:DS03192.2:
BG:DS03192.2 is predicted to encode a protein with leucine-rich repeats. It has a PFAM LRR domain (PF00560, P = 9.3 × 10^-142) and shows significant BLASTP matches with a variety of proteins, all of which have similar domains, including the product of the Drosophila chaoptin gene.

BG:DS07295.1:
We infer that the product of BG:DS07295.1 is a metal ion transporter. It shows 47% identity with the human zinc transporter ZNT-3 and 58% identity with the rat zinc transporter ZNT-2. It is also similar to the S. pombe gene product SPAC23C11.1p, implicated in zinc/cadmium resistance, and the S. cerevisiae protein zrc1p. Loss-of-function zrc1 mutations are hypersensitive to zinc and cadmium and to oxidative stress (KAMIZONO et al. 1989; KOBAYASHI et al. 1996).

BG:DS07295.5:
The product of BG:DS07295.5 is weakly similar to a c-MYC binding protein of human (SP:Q99471) and hypothetical proteins from C. elegans (F35H10.6) and M. jannaschii (MJ0648). The BLASTP scores to all of these are just at the limit of the threshold used in these analyses (P = 10^-7 to 10^-8).
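As a minimal sketch of what applying such a threshold involves, the Python fragment below keeps only those database hits whose P value is at or below a cutoff. The cutoff of 1e-7 is an assumption read from the range quoted above, and the subject names and P values are invented for illustration; none of this reproduces the pipeline actually used in the analysis.

```python
def significant_hits(hits, cutoff=1e-7):
    """Return (subject, p_value) pairs with p_value <= cutoff, best hit first.

    `hits` maps a subject identifier to a BLASTP-style P value.
    The default cutoff is an assumed value, not a documented setting.
    """
    kept = [(subject, p) for subject, p in hits.items() if p <= cutoff]
    return sorted(kept, key=lambda pair: pair[1])


# Hypothetical hits: names and P values are placeholders, not real results.
example_hits = {
    "subject_1": 1e-8,
    "subject_2": 1e-7,   # exactly at the assumed cutoff, so it is kept
    "subject_3": 1e-5,   # weaker than the cutoff, so it is dropped
    "subject_4": 3e-42,
}

kept_hits = significant_hits(example_hits)
for subject, p in kept_hits:
    print(f"{subject}\tP = {p:.1e}")
```

Hits that sit right at the boundary, as the matches described here do, are retained by an inclusive comparison but would be lost with a slightly stricter cutoff, which is why such similarities are reported as weak.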

BG:DS05639.1:
The BG:DS05639.1 protein shows weak sequence similarities (~20%, with BLASTP scores between P = 10^-7 and 10^-9) to several myosin heavy chain proteins, including the unc-54 protein of C. elegans and a nonmuscle myosin of chick (SP:P14105). PSORT predicts long coiled-coil regions in this protein.

gft (l(2)35Cd):
This lethal, known from seven EMS-induced, one γ-ray-induced, one P-element insertion, and one PM hybrid dysgenesis-induced allele, plus one of obscure origin, has been named guftagu by MISTRY 1997. Escapers have unexpanded wings and small eyes (ASHBURNER et al. 1990). Mistry showed that l(2)35Cd alleles, or a deletion for this gene, act as dominant suppressors of the complex visible phenotype that results from the ectopic expression of the G-protein Gαs driven by certain enhancer-trapped GAL4 elements. gft corresponds to BG:DS07851.2, as shown by both comparison with Mistry's sequence (H. MISTRY, personal communication) and by the sequence of the insertion site of PZ06430. The sequence is similar to a human cullin and similar proteins in several other organisms, including the cul-3 gene product of C. elegans (48% identity over 99% of length) and a hypothetical product of the human cDNA KIAA0617 (68% identity over entire length). In S. cerevisiae cullin family proteins are components of the anaphase-promoting complex (APC2p; KRAMER et al. 1998) and of the SCF complex (Cdc53p; LAMMER et al. 1998), both targeting proteins into the ubiquitin-dependent degradation pathway.

BG:DS07851.3:
The BG:DS07851.3 protein is probably a member of the YER057C/yjgF family defined by PROSITE pattern PS01094 and PFAM domain PF01042 (P = 4.4 × 10^-55). Like other members of this family the BG:DS07851.3 protein is small (138 amino acids); most family members are of unknown function, although the mammalian perchloric acid soluble protein, e.g., the human PSP (SP:P52758), is described as a translational inhibitor (SCHMIEDEKNECHT et al. 1996).

ms(2)35Ci:
BG:DS07851.10 is a weak GENSCAN prediction (score of 35) with neither ESTs nor any significant sequence matches. A P element associated with a male-sterile mutation, ms(2)46AB02316 (CASTRILLON et al. 1993), has been rescued and its flanking sequence maps to a predicted intron of BG:DS07851.10. This is consistent with our genetic mapping of the male-sterile phenotype, which is within both Df(2L)osp18 and Df(2L)A263. It is possible that the prediction of BG:DS07851.10 is false and that the male-sterile phenotype is due to the insertion of this P element 1 kb 5' to BG:DS07851.8. Because ms(2)46AB is clearly an inappropriate name we call this gene ms(2)35Ci.

BG:DS07851.6:
The only significant protein database match of BG:DS07851.6 is to the Drosophila Taf110 protein, a subunit of TFIID (40% amino acid identity over 37% of its length). There are also BLASTP matches to similar proteins in human and yeast (but below the cutoff expectation we have used).

esg (l(2)35Ce):
escargot is the most frequent site of P-element insertion in this chromosome region; over 50 independent insertions have been recovered, as well as three EMS-induced alleles and four alleles associated with chromosome aberrations. The P-element alleles vary in phenotype; of 56 characterized, 35 are lethal or semilethal (as hemizygotes with esg- deletions) but 19 are viable (see Table S1). Twenty of these P-element insertions have been sequenced; all map between 192 bp and 1258 bp 5' to the start of the esg protein-coding region, as did those sequenced by WHITELEY et al. 1992; there are 15 sites in this region at which P elements have inserted. Escapers of lethal or semilethal alleles usually show abnormalities in abdominal differentiation, though some unusual alleles [e.g., esgdgl, once thought to be a different gene, dgl of ASHBURNER et al. 1990] show a failure of the dorsal and ventral surfaces of the wing to fuse (ASHBURNER et al. 1990). esg was independently identified by three groups (WHITELEY et al. 1992; HAYASHI et al. 1993) and the identification of BG:DS07851.7 as esg is both from a comparison of this genomic sequence with previous data and from mapping the precise insertion sites of 15 different P-element alleles. esg encodes a C2H2 class zinc finger domain protein. This protein is required for the maintenance of diploidy in imaginal disc cells; in its absence these arrest in G2 and continue to endoreplicate (HAYASHI 1996). It is interesting that esg and snail show evidence of functional redundancy. Not only do they cross-regulate and bind similar DNA targets, but in esg- sna- embryos some wing disc markers (e.g., vestigial) that are expressed in either single mutant are not expressed (FUSE et al. 1996; see also YAGI and HAYASHI 1997). T. IP (personal communication) has evidence of a degree of functional redundancy between esg, sna, and worniu (see below).

worniu (l(2)35Da):
The predicted gene immediately proximal to esg (BG:DS03023.1) also encodes a C2H2-class zinc finger protein, similar to those encoded by esg and snail. This is probably l(2)35Da, known from eight EMS-induced alleles. Loss of l(2)35Da function results in embryonic lethality, with disrupted cuticle belts (ASHBURNER et al. 1990). T. IP (personal communication) has suggested the name worniu for BG:DS03023.1 (worniu is Chinese for snail). The gene has been independently identified by K. SCHMID (personal communication), and S. I. ASHRAF and T. IP (personal communication) have rescued mutant alleles by transformation.

BG:DS03023.4:
This gene is predicted only on the basis of a GENSCAN score; it has neither ESTs nor significant database matches. From its position it is a good candidate for l(2)35Cg.

BG:DS03023.2:
This is yet another protein whose only significant matches are to hypothetical proteins of unknown function from the sequences of C. elegans and S. cerevisiae. The BG:DS03023.2 protein shows 32% identity (over 89% of its length) to the C. elegans F31D4.2 protein and 27% identity (over 83% of its length) to the YMR027W protein of S. cerevisiae. From its position this predicted gene may correspond to l(2)35Ch.

sna (l(2)35Db):
snail encodes a product required for mesoderm determination; mutant embryos fail to form a ventral furrow (GRAU et al. 1984; LEPTIN 1994 for review). Like worniu and esg, snail encodes a C2H2-class zinc finger domain transcription factor (BOULAY et al. 1987; ALBERGA et al. 1991) and direct sequence comparison shows that it corresponds to BG:DS01845.1.

Tim17:
This gene encodes a preprotein translocase of the inner mitochondrial membrane that is highly conserved in different organisms. It was identified in our sequence by BOMER et al. 1996 and corresponds to BG:DS01845.2.

lace (l(2)35Dc):
This is a vital gene: strong alleles are lethal, and the embryos show head defects, but weak alleles, and some heteroallelic combinations, give viable adult flies with supernumerary wing veins, hence the name lace (ASHBURNER et al. 1990). It is known from over 14 EMS-induced alleles, an allele associated with T(Y;2)b8, and six P-element insertions. The insertion site of one of the P-element alleles was sequenced and shown to be located at the 5' end of BG:DS01845.3, a gene that encodes a protein with similarity to serine palmitoyl transferases from organisms as different as yeast and human (52% amino acid sequence identity to human serine palmitoyl transferase subunit II). We presume this to be the function of the product of lace.

kek3:
kekkon3 was identified by J. DUFFY (personal communication) as being similar to kek1 and kek2 of MUSACCHIO and PERRIMON 1996. These genes encode transmembrane proteins with both leucine-rich repeats and an immunoglobulin domain and are targets of the Egfr signal transduction pathway (see SAPIR et al. 1998). kek3 corresponds to BG:DS04862.1, predicted by PSORT to have a single transmembrane domain with an internal C terminus.

BG:BACR44L22.1, BG:BACR44L22.8, BG:BACR44L22.2, BG:BACR44L22.3, BG:BACR44L22.4, and BG:BACR44L22.6:
These six genes encode proteins of ~250 amino acids, all with clear similarities to zinc metallopeptidases of the M12A subfamily (see BARRETT et al. 1998). These genes presumably evolved by duplication, because they show between 29 and 64% pairwise sequence identities. BG:BACR44L22.2 and BG:BACR44L22.3 are the most similar pair, and BG:BACR44L22.4 and BG:BACR44L22.8 the most divergent pair.
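For readers who want to see how a table of pairwise identities yields a "most similar" and a "most divergent" pair, the following Python sketch computes percent identity for every pair in a small set of pre-aligned sequences. It assumes identity is counted over aligned columns in which neither sequence has a gap, which is one common convention and not necessarily the one used here. The three toy sequences and their names are invented and are not the BACR44L22 proteins.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def pairwise_identities(aligned):
    """Map each unordered pair of names to its percent identity."""
    return {
        (m, n): percent_identity(aligned[m], aligned[n])
        for m, n in combinations(aligned, 2)
    }


# Toy aligned fragments, invented for illustration only.
toy_alignment = {
    "geneA": "MKTAHELGHA",
    "geneB": "MKTAHEIGHA",
    "geneC": "MRS-HELGWA",
}

identities = pairwise_identities(toy_alignment)
most_similar = max(identities, key=identities.get)
most_divergent = min(identities, key=identities.get)
print("most similar:", most_similar, round(identities[most_similar], 1))
print("most divergent:", most_divergent, round(identities[most_divergent], 1))
```

With six genes there are 15 such pairs, so the reported range of 29 to 64% summarizes the extremes of a 15-entry table of this kind.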

BG:DS07108.4:
BLASTP matches with the translation of BG:DS07108.4 include a large number of extracellular proteins with leucine-rich repeats. Other than the fact that this protein has three PFAM:PF00560 leucine-rich repeat patterns, indicative of protein-protein interactions, we can make no inference concerning its function.

BG:DS07108.2:
This protein is probably a calcium channel subunit, because it shows 36% identity (over 30% of its length) to the human alpha-2/delta subunit (EMBL:AF042793) and similar identities to mouse and rabbit L-type calcium channel subunits (SPTREMBL:O08532 and SP:P13806). It is also similar to the C. elegans unc-36 protein, which has the characteristics of a calcium channel α-subunit.

BG:DS07108.1 and BG:DS07108.5:
The BG:DS07108.1 protein is predicted to be a serine-type protease. It has similarities with several mammalian, worm, and bacterial serine proteases, but is most similar (36% identity over 61% of its length) to the antibacterial serine protease, Limulus factor D, from the Japanese horseshoe crab (KAWABATA et al. 1996; SPTREMBL:P91817). The BG:DS07108.5 protein is similar, showing 33% identity (over 89% of length) with Limulus factor D. These two genes probably arose by tandem duplication; their protein sequences are 35% identical. We know that these two genes are nonvital, because the deletion heterozygote Df(2L)75c/Df(2L)TE35D-17, which removes both, is viable (J.-M. REICHHART, personal communication). The viable P-element insertion PZ09259 maps 12 kb 3' to BG:DS07108.5; it may be an allele.

CycE (l(2)35Dd):
This gene was identified first from embryonic lethal alleles that may escape to give flies with a small eye phenotype (ASHBURNER et al. 1990). It was first cloned by RICHARDSON et al. 1993 using "cyclin box" probes and is very similar to the G1 cyclin, cyclin E, of S. cerevisiae, and, indeed, the Drosophila gene will functionally complement cln2 cln3 yeast (EDGAR 1994 for review). Three EMS-induced alleles, nine P-element alleles, a w+ rst+ insertion (TE35D), and the breakpoint of T(2;3)G16 are the known mutations. The insertion sites of three P-element alleles have been sequenced, and all fall at the 5' end of BG:DS07108.3, which is indeed CycE by direct sequence comparison.

BG:DS09217.1:
This prediction has matching EST sequences and both GENEFINDER and GENSCAN predictions, but the only significant database match is with a hypothetical protein of C. elegans (ZK809.3, 36% identity over 89% of length). Its position makes it a good candidate for l(2)35Di.

l(2)35Df:
Four of the five known EMS-induced alleles of l(2)35Df are lethal; one (l(2)35DfHL58) gives viable, but female-sterile, escapers with a small bristle phenotype (ASHBURNER et al. 1990). In addition, the P-element insertion k14423 is a lethal allele of this locus. This P element is inserted 13 bp from the start of the most 5'-extending cDNA of BG:DS09217.2. BG:DS09217.2 encodes a protein similar to the SKI2W helicase of human and the MTR4 ATP-dependent DEIH motif RNA helicase of S. cerevisiae. The greatest similarity (62% identity over 87% of length) is to the translation (SWISS-PROT:P42285) of a human EST sequence (KIAA0052, EMBL:D29641). M. TAYLOR and D. ARAGNOL (personal communication) have found that the BG:DS09217.2 transcript is substantially reduced in l(2)35DfP15, suggesting that this predicted gene is indeed l(2)35Df.

Gli (l(2)35Dg):
Gliotactin encodes a transmembrane-spanning protein with a serine esterase-like motif (AULD et al. 1995). All known EMS-induced alleles are embryonic lethal (ASHBURNER et al. 1990). By comparison of the genomic sequence with that published by AULD et al. 1995, Gli is BG:DS09217.3. AULD et al. 1995 generated several null alleles by imprecise P-element excision; they die as late embryos that are morphologically normal. They are, however, paralyzed and the electrophysiological data suggest that the hemolymph-nerve barrier has broken down, the glial cells being permeable to K+ ions.

BG:DS09217.4:
The BG:DS09217.4 protein is similar (24–40% identities) to hypothetical proteins from human (the KIAA0547 cDNA), C. elegans (B0285.4), S. cerevisiae (YOL141W), S. pombe (SPBC19C7.08c), and A. thaliana (T7I23.16). Despite this conservation nothing can be inferred about the function of BG:DS09217.4.

l(2)35Ea:
This lethal complementation group was known from two alleles, one EMS induced, the other probably radiation induced. The P element PZ05271 is a viable and fertile insertion within the first exon of BG:DS09217.5, which is predicted to encode a C2H2-type zinc finger protein (J. GATES and C. THUMMEL, personal communication). Although this insertion and the two classical alleles complement, both give adults with crippled legs and small wings when heterozygous with a deletion. J. GATES (personal communication) recovered a transposase-induced male recombinant of PZ05271. This is a lethal allele of l(2)35Ea, strongly suggesting that this gene is BG:DS09217.5. If so, then this means that BG:DS09217.4 and BG:DS09217.6 probably correspond to l(2)35De and l(2)35Dh, but the available data cannot determine which is which.

BG:DS09217.6:
The BG:DS09217.6 protein shows weak identities (25% over 46% of its length) with the human and murine 86-kD subunit of ATP-dependent DNA helicase II (SP:P13010 and SP:P27641). This single-stranded DNA helicase is a heterodimer and, with KU70, binds DNA ends as part of the DNA-dependent protein kinase complex involved in nonhomologous DNA end-joining (CRITCHLOW and JACKSON 1998).

BG:DS02252.3:
This protein shows only weak similarities with the IMH1 protein of S. cerevisiae (20% identity over 36% of length) and with a human homolog of the yeast Spc98 protein (P = 10-98 with SPTREMBL:O60852), a protein that is associated with centrosomal γ-tubulin (MURPHY et al. 1998).

BG:DS02252.2:
The BG:DS02252.2 protein matches at 22–28% identity over its C-terminal two-thirds several tektins, particularly the C1 tektin of the sea urchin Strongylocentrotus purpuratus (BLASTP, P = 10^-42). Tektins are filamentous proteins that form heteropolymeric protofilaments of flagellar microtubules (NORRANDER et al. 1996). In the BG:DS02252.2 protein the RPNVELCRD motif is present as RPNVENCRD. At lower statistical significance the BG:DS02252.2 protein is similar to many myosin heavy chain proteins, including the Drosophila zipper protein (22% amino acid identity over 61% of length); like these, this protein has a long coiled-coil domain (predicted by PSORT).

BG:DS00365.1:
The BG:DS00365.1 protein matches sequences of aminopeptidase N from taxa as different as Lactococcus and Felis silvestris. The identities to the mammalian enzymes are 33–34% over 75–80% of the length of BG:DS00365.1 (e.g., to the human ANEP protein, SP:P15144). Aminopeptidase N enzymes are membrane-bound zinc metalloproteases and PSORT predicts an N-terminal signal sequence for the BG:DS00365.1 protein.

BG:DS00365.2:
The BG:DS00365.2 protein has a PROSITE Alpha-2-macroglobulin family thiolester region signature and belongs to the PFAM:PF00207 Alpha-2-macroglobulin family (P = 1.5 × 10^-107). It shows ~33% sequence identity with alpha-2 macroglobulin of mammals and 28% identity (over 55% of its length) with the Limulus alpha-2 macroglobulin. Whether or not these similarities indicate that BG:DS00365.2 is a protease inhibitor needs to be determined by experiment. In Limulus the protein is restricted in its distribution to hemocytes (IWAKI et al. 1996). TBLASTN searches of all available Drosophila sequence data with the human alpha-2 macroglobulin sequence identify three further genes in this family: one is Mcr, mapping to 28DE (T. CROWLEY, personal communication to FlyBase), and the other two are on Berkeley Drosophila Genome Project (BDGP) P1 clones mapping to 28BC (DS01509) and 37F (DS08491), respectively (J.-M. REICHHART, personal communication).

BG:DS00365.3:
Sequence similarities of the order of 26–32% with serine carboxypeptidases from the Aedes mosquito (SP:P42660), A. thaliana (SP:P32826), and the so-called lysosomal protective protein of human and mouse (e.g., SP:P10619; a S10 family peptidase) suggest that the product of this gene is a serine carboxypeptidase.

beat-B and beat-C:
These genes were identified by T. Pipes and C. Goodman by virtue of their sequence similarity with beat. cDNA sequences, determined by T. PIPES (personal communication), correspond to BG:DS00365.4 and BG:DS00913.1, respectively, both predicted by GENSCAN. The proteins predicted for these genes are similar to that of beat—38% identity in the case of beat-B, 30% (over a shorter common region) in the case of beat-C. All three genes are within 200 kb, and have similar intron/exon structures. T. PIPES (personal communication) has shown that beat-C is expressed in the embryonic pole cells and is removed by Df(2L)RA5. These data suggest that it might correspond to fs(2)35Ed, an inferred locus. beat-C is not vital, because deletions that overlap this gene (e.g., Df(2L)TE35D-19/Df(2L)RA5) are viable when heterozygous (T. PIPES and D. FAMBROUGH, personal communication).

BG:DS07486.3:
This is the third gene in this region predicted to encode a serine peptidase with similarity to Limulus factor D. In the case of BG:DS07486.3 the similarity is 33% identity over 29% of its length, less than for either BG:DS07108.1 or BG:DS07108.5. BG:DS07486.3 is also similar, to about the same extent, to serine proteases of a variety of organisms from Streptomyces griseus to human.

BG:DS07486.2:
This is a gene predicted to encode a leucine-rich repeat protein (PFAM:PF00560, P = 1.7 × 10^-12). It shows a quite strong match to outer arm dynein light chain 2 of the sea urchin Anthocidaris crassispina (P = 10^-37, 43% identity over entire length) and a weaker match to a hypothetical LRR protein of C. elegans (K10D2.1).

BicC:
Bicaudal C, when mutant, has a dominant maternal-effect semilethal phenotype (NUSSLEIN-VOLHARD et al. 1982; MOHLER and WIESCHAUS 1986). BicC activity is required for both the migration of the somatic follicle cells over the anterior oocyte and for the determination of the anterior-posterior polarity of the oocyte itself (MAHONE et al. 1995). This gene has been sequenced by MAHONE et al. 1995 and corresponds to BG:DS00913.2. The product of BicC is a KH domain protein that may be RNA binding.

beat:
Despite being a vital gene, no point alleles of beat were recovered in the Cambridge screens. Two chromosome aberrations, In(2L)C163.41 and In(2L)dppd36, were found to be associated with a semilethality in the region where beat is now known to map, but the genetic data were at that time not consistent enough for the identification of a gene (ASHBURNER et al. 1990; more recent data, with a larger deletion set, show that both of these inversions are leaky alleles of beat and are, in fact, broken within beat). By direct sequence comparison beat corresponds to BG:DS00913.3. beat is required for motorneuron pathfinding; in mutant embryos the intersegmental nerve fails to find its target muscles (VAN VACTOR et al. 1993). HOLMES et al. 1998 recovered an EMS-induced mutation disrupting Bolwig's organ but not affecting the motorneurons of larvae. This mutation, tric, is almost certainly an allele of beat (HOLMES and HEILIG 1998). beat encodes a secreted protein and FAMBROUGH and GOODMAN 1996 suggest that this may function as an antiadhesive during nerve fasciculation, because the mutant phenotype can be partly suppressed by mutations in Fasciclin 2 and connectin (TESSIER-LAVIGNE and GOODMAN 1996, for review).

BG:DS04095.2:
The only similarities seen with the protein predicted from BG:DS04095.2 are to the predicted protein from the D. melanogaster anon-fe2C9 gene (SPTREMBL:O16052, 32% identity over 83% of length) and its D. yakuba homolog.

Ca-α1D (l(2)35Fa):
The four known alleles of l(2)35Fa defined a lethal gene; strong alleles are embryonic lethal, but heterozygotes for two weak alleles may eclose, with a held-out wing phenotype (ASHBURNER et al. 1990; see also EBERL et al. 1998). ZHENG et al. 1995 sequenced a gene coding for an α1 subunit of a calcium channel protein. This is BG:DS02795.1, and EBERL et al. 1998 have shown that the l(2)35Fa alleles are mutant for this protein. This is the most complex gene in this region, with 31 predicted exons. A gene 45 kb proximal to Ca-α1D (BG:DS07473.1) also has some sequence similarity to L-type calcium channel subunits.

PRL-1:
The expected product of BG:DS07473.3 matches prenylated protein tyrosine phosphatases from organisms as different as C. elegans and human; its C terminus (CSVQ) suggests that it may be geranyl geranylated. The sequence similarities are high, e.g., 59% amino acid sequence identity (over 92% of length) to the human PRL-1 (SPTREMBL:O00648) and 73% identity to its C. elegans homolog (T1D2.2, SPTREMBL:Q22582). PRL-1 was identified from a partial cDNA sequence by ZHENG et al. (EMBL:AF063902). The P elements k09834, PZ03264, and EP(2)0311 are inserted within an intron and have no observable phenotypic effect. All 49 transposase-induced excisions of PZ03264 are viable, but 2 (of 145) excisions of k09834 are lethal. One is deleted for twe, the other for twe and crp. These data suggest that PRL-1 is not a vital gene.

twe:
twine is a maternal-effect lethal, but is also required for male fertility. These phenotypes are separable, as two newly characterized P-element alleles, k08310 and EP(2)0613, are male sterile but female fertile when heterozygous with tweHB5. twine is mat(2)synHB5 of SCHUPBACH and WIESCHAUS 1989. Characterized by ALPHEY et al. 1992 and COURTOT et al. 1992, twine encodes a homolog of the S. pombe CDC25 protein tyrosine phosphatase; indeed, it was first identified by a cDNA that could rescue the cdc25-22 mutation of this yeast (JIMENEZ et al. 1990). As shown by direct sequence comparison, it is BG:DS02740.1. Function of twe is required for both oogenesis and male meiosis, and there is genetic evidence that twe is a vital gene, because Df(2L)el18/Df(2L)RN2 and Ts(2Lt;4Lt)TE35B-101 + Ts(2Rt;4Rt)DTD22/Df(2L)RN2 are lethal; twe is the only gene in the 4-kb overlap between Df(2L)el18 and Df(2L)RN2 [the latter is broken within twe, as is T(2;4)DTD22]. A 10-kb transgene from L. Alphey carried on P{twe+10.0}, however, rescues the sterility of twe alleles, but not the lethality of Df(2L)el18/Df(2L)RN2, whereas a large duplication [Dp(2;3)osp3] rescues both the sterility and lethality of twe alleles.
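The deletion argument used here rests on simple interval logic: a fly heterozygous for two overlapping deficiencies lacks both copies only of the DNA in their overlap, so lethality of that genotype points to a vital function within the overlap. The Python sketch below illustrates the reasoning with invented coordinates (in kb) and placeholder names; the deficiency limits and gene positions are hypothetical and are not taken from the sequence.

```python
def overlap(a, b):
    """Return the shared (start, end) of two intervals, or None if disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None


def genes_in(interval, genes):
    """Names of genes that overlap the interval at all, sorted by name."""
    lo, hi = interval
    return sorted(name for name, (s, e) in genes.items() if s < hi and e > lo)


# Hypothetical coordinates in kb; not the real limits of any deficiency.
deficiency_a = (100, 206)   # removes everything from 100 to 206
deficiency_b = (202, 300)   # its distal break falls inside geneY below

# Hypothetical gene spans in kb.
genes = {
    "geneX": (190, 199),
    "geneY": (201, 206),
    "geneZ": (210, 220),
}

shared = overlap(deficiency_a, deficiency_b)
candidates = genes_in(shared, genes)
print("overlap:", shared, "size:", shared[1] - shared[0], "kb")
print("genes disrupted in both deficiencies:", candidates)
```

In this toy layout the overlap is 4 kb and contains a single gene, mirroring the shape of the inference drawn for twe; the failure of the 10-kb transgene to rescue lethality shows why such an inference remains provisional.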

BG:DS02740.2:
The BG:DS02740.2 protein is a member of the WD-40 repeat protein family (NEER et al. 1994; PFAM:PF00400, P = 1.2 × 10^-23) characteristic of the β-subunit of G proteins but also found in a number of other proteins. There are three WD-40 repeats in the N-terminal one-third of this protein. The most similar protein is the hypothetical protein of C. elegans, F33G12.2 (SPTREMBL:Q19986, 35% sequence identity over 63% of residues).

crp (l(2)35Fd):
l(2)35Fd is a P-element insertion hotspot; 21 independent alleles are known, but only 2 EMS-induced alleles. One EMS allele (crpRAR46) escapes to give adults with a pleiotropic phenotype (rough, small eyes; held-out and narrow, pointed wings; malformed legs; ASHBURNER et al. 1990). The P-element alleles escape when heterozygous with this EMS allele (with a narrow, pointed wing phenotype) but rarely when heterozygous with crp- deletions. Function of this gene has been shown to be required for tracheal branching by CHIU and KRASNOW 1997, and they have named it cropped for this reason. It is BG:DS02740.3, a 22-kb gene. The gene structure prediction based on a cDNA sequence comparison with the genomic DNA indicated that two of the three P-element sites that were sequenced (PZ00232 and k07829) are 16 kb apart, on either side of the long intron. This gene encodes a Drosophila homologue of the human AP4 transcription factor (BLASTP, P = 10^-32; SP:Q01664). There is, in the DNA sequenced, a Su(Ste)-like repetitive sequence in the long intron of this gene; the insertion EP(2)0721 in this sequence is not lethal.

BG:DS02740.4:
BG:DS02740.4 encodes a predicted protein with 30% sequence identity (over 54% of its length) to the human protein kinase A anchoring protein. It is less strongly similar to a hypothetical protein from C. elegans (B0336.4, SPTREMBL:Q10955). Like the human protein kinase A anchoring protein, the BG:DS02740.4 protein has a PFAM:PF00615 regulator of G protein signalling domain (P = 3.7 × 10^-7), characteristic of GTPase-activating proteins that interact with the α-subunit of G proteins (DE VRIES et al. 1995). Protein kinase A anchoring protein interacts with the RII subunits of cyclic AMP-dependent kinase (protein kinase A), affecting its subcellular localization (PAWSON and SCOTT 1997 for review).

l(2)35Fb:
This locus is known only from one spontaneous and one EMS-induced allele. The lethal period is late and there are many adult escapers. Transformation rescue experiments by A. WILLINGHAM (personal communication) show that it corresponds to BG:DS02740.6, which encodes a cytochrome P450. Its closest mammalian gene products are the phenobarbital-inducible cytochrome P450s CYP2B6 of human and CYP2B4 of rabbit (32% sequence identity). Alleles of this locus have also been recovered as mechanosensory defectives in C. Zuker's laboratory (C. ZUKER, personal communication). There are over 20 genes encoding cytochrome P450s now known in Drosophila; this is the first with a clear mutant phenotype.

heixuedian (l(2)35Fc):
Two P-element alleles in this gene, previously known from two EMS-induced alleles, have been rescued and the sequences of their insertion sites determined; transposase-induced loss of the P elements reverts the lethal phenotype (N. WAKABAYASHI-ITO, personal communication). They map to BG:DS02740.7, coding for a putative transmembrane protein (PSORT prediction). heix is expressed in the hemocyte/macrophage cell lineage. Mutant larvae show an overproliferation of hemocytes and accumulate melanotic "tumors" (L. HONG and G. M. RUBIN, unpublished data). The only sequence similarity seen with the conceptual heix protein sequence is to one described as a probable 1,4-dihydroxy-2-naphthoate octaprenyltransferase of Bacillus subtilis (SP:P39582; P = 10^-23, 31% sequence identity over 74% of length). This protein is also matched by some mouse EST sequences (e.g., EMBL:AA000881, EMBL:AA087043).

BG:DS02740.8:
This is a C2H2 zinc finger domain protein and shows significant BLASTP matches with several proteins of this family, most significantly with the Zfp35 protein of mouse (SP:P15620, 38% amino acid sequence identity over 46% of length).

BG:DS02740.9:
BG:DS02740.9 shows 53% amino acid sequence identity (over 95% of its length) to human and rodent glial maturation factor β (SP:P17774, SP:P17774). The Drosophila protein has a PFAM:PF00241 domain characteristic of cofilin/tropomyosin-type actin-binding proteins (P = 3.1 × 10^-17), as do the GMF proteins. GMF was identified as a brain protein. Its precise function is not known, but it appears to play a role in signal transduction because, when phosphorylated, it inhibits the ERK1/ERK2 family of MAP kinases and enhances the activity of the p38 MAP kinase. There is also evidence that it forms a complex with the p38 MAP kinase (LIM and ZAHEER 1996).

anon-35Fa:
This gene was named by FlyBase for the region encoding transcript III near cornichon (ROTH et al. 1995). From a comparison of the sequence and gene prediction data with the map of the cornichon region (Figure 6 of ROTH et al. 1995), it is clear that this is BG:DS02740.11, encoding a protein similar to one of unknown function in C. elegans (ZK418.5, SP:Q23483, 44% identity over 78% of length) and to a human seven-pass transmembrane protein (SP:O75790, 50% identity over 86% of length). PSORT predicts the presence of five transmembrane domains in the anon-35Fa protein.

Sed5 (l(2)35Ff):
Sed5 encodes a putative syntaxin family vesicle targeting protein involved in ER-Golgi transport, homologous to the SED5 protein of S. cerevisiae, and was characterized by BANFIELD et al. 1994 from DNA corresponding to transcript II of ROTH et al. 1995 and DAWSON et al. 1995. The single EMS allele is a larval/pupal lethal (ASHBURNER et al. 1990). Sed5 is the predicted gene BG:DS02740.12, as shown by direct comparison with the published sequence. DAWSON et al. 1995 mapped the distal limit of Df(2L)H60-3, a l(2)35Ff- cni- fzy- deletion, and I. Dawson (quoted in ROTH et al. 1995) mapped the distal end of the l(2)35Ff+ cni- fzy- deletion Df(2L)III18. These data support the identification of Sed5 with BG:DS02740.12.

cni:
cornichon (ASHBURNER et al. 1990) is a maternal-effect lethal required for dorsal-ventral signalling in the germ line (ROTH et al. 1995). In cni- the oocyte shows abnormal anterior-posterior polarity, a phenotype similar to that seen in mutant gurken embryos (GONZALEZ-REYES et al. 1995). By comparison with the published sequence, cni corresponds to BG:DS02740.13. ROTH et al. (1995) suggest that the cni protein is required for signal transduction in the Egfr pathway, at least during oogenesis (see also GONZALEZ-REYES et al. 1995). Very similar proteins, of unknown function, are known from mouse (e.g., SPTREMBL:O35372, 56% identity over entire length; see HWANG et al. 1999) and C. elegans genomic sequence (T09E8.3, SPTREMBL:Q22361, 49% identity over 93% of length). A protein related in sequence has been identified in S. cerevisiae as the ER vesicle protein Erv14p, thought to be needed for the export of particular cargos from the ER (POWERS and BARLOWE 1998). Interestingly, erv14 yeast cells also show a polarity defect, a haploid-specific defect in the site of bud formation.

fzy:
fizzy (NUSSLEIN-VOLHARD et al. 1984) is a vital gene known from several EMS alleles, one P-element allele and one X-ray-induced allele. Escapers carrying weak alleles are female sterile. Lethal embryos show metaphase arrest (DAWSON et al. 1993) and fizzy is required for the normal mitotic degradation of cyclin A and cyclin B (DAWSON et al. 1995; SIGRIST et al. 1995). It is a WD-40 repeat family protein that is a homolog of the S. cerevisiae CDC20, although the fly gene cannot functionally rescue cdc20 mutations (DAWSON et al. 1995). It is BG:DS02740.14. There are clear homologs in C. elegans (ZK1307.6), Xenopus, rodents, and humans (e.g., 58% sequence identity over 71% of length to SPTREMBL:Q12834; WEINSTEIN et al. 1994).

cact:
Embryos from homozygous cactus mothers have a ventralized phenotype (SCHUPBACH and WIESCHAUS 1989), known to be due to the failure to restrict the dorsal protein from dorsal nuclei (ROTH et al. 1989). cactus codes for the Drosophila equivalent of IκB (GEISLER et al. 1992); it is BG:DS02740.15. The dorsal protein is a homolog of NFκB. A large number of both EMS and P-element alleles are known, the result of site-specific screens by ROTH et al. (1991).

anon-35F/36A:
This gene was named by FlyBase for a 1.2-kb transcript immediately 3' to cactus (GEISLER et al. 1992, Figure 2). It is BG:DS02740.16, which encodes a protein similar to the product of the NIF3 gene of S. cerevisiae (SP:P53081, 39% identity over 80% of length) about which little is known. The viable and fertile P-element insertion k17003 may be an allele of this gene; it is inserted 1185 bp upstream of the putative transcript.

l(2)35Fe:
A vital gene known only from a single EMS allele (which is a larval/pupal lethal; ASHBURNER et al. 1990) and a single P-element insertion. The insertion site of the latter was sequenced after plasmid rescue and maps to the 5' end of BG:DS02740.17, encoding a protein similar to the bacterial 50S ribosomal subunit protein L4, to a protein of unknown function from C. elegans (T23B12.1, SPTREMBL:O17005, 46% identity over 75% of length), and to the translations of several mouse and human EST sequences. The similarity with bacterial L4 ribosomal proteins (e.g., 37% identity over 66% of length to that of Bacillus stearothermophilus, SP:P28601) and chloroplast L4 ribosomal proteins (e.g., 39% identity over 56% of length to the chloroplast L4 of Odontella sinensis, SP:P49546) suggests that this gene encodes a mitochondrial ribosomal protein. This inference is supported by a strong PSORT prediction for mitochondrial localization.

chif:
Females homozygous for some mutant chiffon alleles lay eggs with a fragile chorion that are not fertilized; other alleles are zygotic lethals (T. Schupbach, quoted in ASHBURNER et al. 1990; LINDSLEY and ZIMM 1992). It has been independently cloned and characterized by LANDIS and TOWER (1999) and corresponds to BG:DS09218.2. The P element k04216 is a female-fertile insertion in the first intron of chif. Some induced excisions (4/149) of this element are female sterile when heterozygous with chif- deletions. The only protein sequence similarities seen with the chif protein are limited to two short regions of 45 and 38 amino acids, with the rad51 protein of S. pombe (SPTREMBL:O59836).

BG:DS09218.4:
This gene encodes a protein disulfide isomerase (PDI), as judged by 52% amino acid sequence identity (over 93% of its length) with the human protein (SP:Q15084) and similarly significant matches to homologs from cow, rat, C. elegans, and S. cerevisiae. PDI is an enzyme of the lumen of the endoplasmic reticulum required for the folding of proteins that contain disulfide bridges. Like other PDIs, the Drosophila protein has a PROSITE thioredoxin family active site and a PFAM:PF00085 thioredoxin pattern (P = 1.7 × 10^-96). This is the second protein disulfide isomerase to be discovered in Drosophila. The other maps to chromosome arm 3L (MCKAY et al. 1995) and is only 17% identical in protein sequence to BG:DS09218.4. Both proteins have the C terminus KDEL, indicative of retention in the endoplasmic reticulum (MUNRO and PELHAM 1987).
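
The C-terminal retention signal mentioned in this entry is simple to test for computationally. The following is a minimal sketch, not the method used in this analysis; the function name and the example sequences are made up for illustration, and only an exact match to the KDEL tetrapeptide is checked.

```python
def has_er_retention_signal(protein_sequence, signal="KDEL"):
    """Return True if the sequence ends with the given C-terminal tetrapeptide."""
    # Normalize case and drop surrounding whitespace or a trailing stop symbol ("*").
    sequence = protein_sequence.strip().upper().rstrip("*")
    return sequence.endswith(signal)


# Hypothetical sequences for illustration only.
print(has_er_retention_signal("MKLLAVSGAKDEL"))   # KDEL at the C terminus
print(has_er_retention_signal("MKDELLAVSGA"))     # KDEL present, but internal
```

Variant retention signals could be checked by passing a different `signal` argument; which variants are biologically relevant is outside the scope of this sketch.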

BG:DS09218.5:
The only significant BLASTP match to the BG:DS09218.5 protein is to the hypothetical protein HI0912 of Haemophilus influenzae (29% sequence identity over 39% of length).

BG:DS02780.1:
This is another protein characterized by leucine-rich repeats. Like BG:DS07108.4, it shows BLASTP matches to a number of extracellular proteins.

Idgf1, Idgf2, and Idgf3:
These three genes are contiguous within 7.7 kb and encode proteins 51–55% identical in sequence. They all show sequence similarities with chitinases, but have been identified by KAWAMURA et al. (1999) as coding for imaginal disc growth factors. They are secreted into the medium by cultured imaginal disc cells and promote imaginal disc growth. In larvae they are highly expressed in the fat body. They correspond to BG:DS02780.5, BG:DS02780.4, and BG:DS02780.2.

dac (l(2)36Ae):
dachshund is a vital gene, although some mutant alleles escape to produce flies with rough eyes and crippled legs (hence its name). Alleles of dac were also identified as dominant suppressors of the hypermorphic mutation of the EGF receptor, EgfrEllipse (MARDON et al. 1994). Nine EMS and a single P-element allele are known. By comparison with the published sequence (MARDON et al. 1994) it corresponds to BG:DS02780.3, and is the most proximal gene in the region sequenced (in fact our sequence only includes the 3' end of this gene). dac encodes a nuclear protein (perhaps a transcription factor), and its expression driven by dpp:GAL4 induces the development of ectopic eyes, perhaps because dac normally acts as a target of the eyeless PAX6 transcription factor. This interpretation is complicated by the fact that ectopic dac can also induce ey expression (SHEN and MARDON 1997).


*  LITERATURE CITED

ACHSTETTER, T., A. FRANZUSOFF, C. FIELD, and R. SCHEKMAN, 1988 SEC7 encodes an unusual, high molecular weight protein required for membrane traffic from the yeast Golgi apparatus. J. Biol. Chem. 263:11711-11717.

ADACHI-YAMADA, T., M. NAKAMURA, K. IRIE, Y. TOMOYASU, and Y. SANO et al., 1999 p38 MAP kinase can be involved in TGF-beta superfamily signal transduction in Drosophila wing morphogenesis. Mol. Cell. Biol. 19:2322-2329.

ADAMS, C. M., M. G. ANDERSON, D. G. MOTTO, M. P. PRICE, and W. A. JOHNSON et al., 1998 Ripped pocket and pickpocket, novel Drosophila DEG/ENaC subunits expressed in early development and in mechanosensory neurons. J. Cell Biol. 140:143-152.

ALBERGA, A., J. L. BOULAY, E. KEMPE, C. DENNEFELD, and M. HAENLIN, 1991 The snail gene required for mesoderm formation in Drosophila is expressed dynamically in derivatives of all three germ layers. Development 111:983-992.

ALPHEY, L., J. JIMENEZ, H. WHITE-COOPER, I. DAWSON, and P. NURSE et al., 1992 twine, a cdc25 homolog that functions in the male and female germline of Drosophila. Cell 69:977-988.

ANHOLT, R. R. H., R. F. LYMAN, and T. F. C. MACKAY, 1996 Effects of single P-element insertions on olfactory behavior in Drosophila melanogaster. Genetics 143:293-301.

ASHBURNER, M., 1982 The genetics of a small autosomal region of Drosophila melanogaster containing the structural gene for Alcohol dehydrogenase. III. Hypomorphic and hypermorphic mutations affecting the expression of Hairless. Genetics 101:447-459.

ASHBURNER, M., 1989 Drosophila: A Laboratory Handbook. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

ASHBURNER, M., 1998 Speculations on the subject of alcohol dehydrogenase and its properties in Drosophila and other flies. Bioessays 20:949-954.

ASHBURNER, M., C. S. AARON, and S. TSUBOTA, 1982a The genetics of a small autosomal region of D. melanogaster, including the structural gene for Alcohol Dehydrogenase. V. Characterization of X-ray-induced Adh null mutations. Genetics 102:421-435.

ASHBURNER, M., S. TSUBOTA, and R. C. WOODRUFF, 1982b The genetics of a small chromosome region of Drosophila melanogaster containing the structural gene for Alcohol dehydrogenase. IV. Scutoid, an antimorphic mutation. Genetics 102:401-420.

ASHBURNER, M., P. THOMPSON, J. ROOTE, P. F. LASKO, and Y. GRAU et al., 1990 The genetics of a small autosomal region of Drosophila melanogaster containing the structural gene for alcohol dehydrogenase. VII. Characterization of the region around the snail and cactus loci. Genetics 126:679-694.

AULD, V. J., R. D. FETTER, K. BROADIE, and C. S. GOODMAN, 1995 Gliotactin, a novel transmembrane protein on peripheral glia, is required to form the blood-nerve barrier in Drosophila. Cell 81:757-767.

BAGLIONI, C., 1963 Correlations between genetics and chemistry of human haemoglobins, pp. 405–475 in Progress in Molecular Genetics, Vol. 1, edited by J. H. TAYLOR. Academic Press, New York.

BAHN, E., 1972  A suppressor locus for the pyrimidine requiring mutant: rudimentary. Dros. Inf. Serv. 49:98.

BAILEY, A. M. and J. W. POSAKONY, 1995 Suppressor of Hairless directly activates transcription of Enhancer of split complex genes in response to Notch receptor activity. Genes Dev. 9:2609-2622.

BALAKIREVA, M. D., Y. Y. SHEVELYOV, D. I. NURMINSKY, K. J. LIVAK, and V. A. GVOZDEV, 1992 Structural organization and diversification of Y-linked sequences comprising Su(Ste) genes in Drosophila melanogaster. Nucleic Acids Res. 20:3731-3736.

BANFIELD, D. K., M. J. LEWIS, C. RABOUILLE, G. WARREN, and H. R. B. PELHAM, 1994 Localization of Sed5, a putative vesicle targeting molecule, to the cis-Golgi network involves both its transmembrane domain and cytoplasmic domains. J. Cell Biol. 127:357-371.

BARRETT, J. A., 1980 The estimation of the number of mutationally silent loci in saturation-mapping experiments. Genet. Res. 35:33-44.

BARRETT, A. J., N. D. RAWLINGS and J. F. WOESSNER, 1998 Handbook of Proteolytic Enzymes. Academic Press, San Diego.

BASS, B. L., 1997 RNA editing and hypermutation by adenosine deamination. Trends Biochem. Sci. 22:157-162.

BATEMAN, A., E. BIRNEY, R. DURBIN, S. R. EDDY, and R. D. FINN et al., 1999 Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins. Nucleic Acids Res. 27:260-262.

BELOTE, J. M., F. M. HOFFMANN, M. MCKEOWN, R. L. CHORSKY, and B. S. BAKER, 1990 Cytogenetic analysis of chromosome region 73AD of Drosophila melanogaster. Genetics 125:783-793.

BERKELEY DROSOPHILA GENOME PROJECT, 1999 http://www.fruitfly.org/.

BEVAN, M., I. BANCROFT, E. BENT, K. LOVE, and H. GOODMAN et al., 1998 Analysis of 1.9 Mb of contiguous sequence from chromosome 4 of Arabidopsis thaliana. Nature 391:485-488.

BOMER, U., J. RASSOW, N. ZUFALL, N. PFANNER, and M. MEIJER et al., 1996 The preprotein translocase of the inner mitochondrial membrane: evolutionary conservation of targeting and assembly of Tim17. J. Mol. Biol. 262:389-395.

BONFINI, L., C. A. KARLOVICH, C. DASGUPTA, and U. BANERJEE, 1992 The Son of sevenless gene product: a putative activator of Ras. Science 255:603-606.

BOULAY, J. L., C. DENNEFELD, and A. ALBERGA, 1987 The Drosophila developmental gene snail encodes a protein with nucleic acid binding fingers. Nature 330:395-398.

BREITWIESER, W., F. H. MARKUSSEN, H. HORSTMANN, and A. EPHRUSSI, 1996 Oskar protein interaction with Vasa represents an essential step in polar granule assembly. Genes Dev. 10:2179-2188.

BRENDEL, V., P. BUCHER, I. NOURBAKHSH, B. E. BLAISDELL, and S. KARLIN, 1992 Methods and algorithms for statistical analysis of protein sequences. Proc. Natl. Acad. Sci. USA 89:2002-2006.

BRIDGES, C. B., and K. S. BREHME, 1944 The Mutants of Drosophila melanogaster. Publs. Carnegie Instn. 552.

BROGNA, S. and M. ASHBURNER, 1997 The Adh-related gene of Drosophila melanogaster is expressed as a functional dicistronic messenger RNA: multigenic transcription in higher organisms. EMBO J. 16:2023-2031.

BURGE, C., 1997 Identification of genes in human genomic DNA. Ph.D. Thesis, Stanford University, Stanford, CA.

BURGE, C. and S. KARLIN, 1997 Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268:78-94.

C. ELEGANS SEQUENCING CONSORTIUM, THE, 1998 Genomic sequence of the nematode C. elegans: a platform for investigating biology. Science 282: 2012–2018.

C. ELEGANS GENOME SEQUENCING PROJECT, THE, 1999 How the worm was won. Trends Genet. 15: 51–58.

CASTLE, L. A. and D. W. MEINKE, 1994 A FUSCA gene of Arabidopsis encodes a novel protein essential for plant development. Plant Cell 6:25-41.

CASTRILLON, D. H., P. GONCZY, S. ALEXANDER, R. RAWSON, and C. G. EBERHART et al., 1993 Toward a molecular genetic analysis of spermatogenesis in Drosophila melanogaster: characterization of male-sterile mutants generated by single P-element mutagenesis. Genetics 135:489-505.

CHEAH, P. Y., Y. B. MENG, X. YANG, D. A. KIMBRELL, and M. ASHBURNER et al., 1994 The Drosophila l(2)35Ba/nocA gene encodes a putative Zn finger protein involved in the development of the embryonic brain and the adult ocellar structures. Mol. Cell. Biol. 14:1487-1499.

CHEN, T. L., K. A. EDWARDS, R. C. LIN, L. W. COATS, and D. P. KIEHART, 1991  Drosophila myosin heavy chain at 35BC. J. Cell Biol. 115:330a.

CHEN, Z.-Y., T. HASSON, P. M. KELLEY, B. J. SCHWENDER, and M. F. SCHWARTZ et al., 1996 Molecular cloning and domain structure of human myosin-VIIa, the gene product defective in Usher Syndrome 1B. Genomics 36:440-448.

CHIA, W., R. KARP, S. MCGILL, and M. ASHBURNER, 1985 Molecular analysis of the Adh region of the genome of Drosophila melanogaster. J. Mol. Biol. 186:689-706.

CHIU, S. K. and M. A. KRASNOW, 1997  Identification of new genes required for the formation of terminal tracheal branches. A. Conf. Dros. Res. 38:229A.

CHOUDHARY, M., M. B. COULTHART, and R. S. SINGH, 1992 A comprehensive study of genic variation in natural populations of Drosophila melanogaster. VI. Patterns and processes of genic divergence between Drosophila melanogaster and its sibling species, Drosophila simulans. Genetics 130:843-853.

CORNELL, M. J., T. A. WILLIAMS, N. S. LAMANGO, D. COATES, and P. CORVOL et al., 1995 Cloning and expression of an evolutionary conserved single-domain angiotensin converting enzyme from Drosophila melanogaster. J. Biol. Chem. 270:13613-13619.

COURTOT, C., C. FANKHAUSER, V. SIMANIS, and C. F. LEHNER, 1992 The Drosophila cdc25 homolog twine is required for meiosis. Development 116:405-416.

CRAIN, W. R., F. C. EDEN, W. R. PEARSON, E. H. DAVIDSON, and R. J. BRITTEN, 1976 Absence of short period interspersion of repetitive and non-repetitive sequences in the DNA of Drosophila melanogaster. Chromosoma 56:309-326.

CRITCHLOW, S. E. and S. P. JACKSON, 1998 DNA end-joining: from yeast to man. Trends Biochem. Sci. 23:394-398.

CUTLER, M. L., R. H. BASSIN, L. ZANONI, and N. TALBOT, 1992 Isolation of rsp-1, a novel cDNA capable of suppressing v-Ras transformation. Mol. Cell. Biol. 12:3750-3756.

DANIELSON, P. B., R. J. MACINTYRE, and J. C. FOGLEMAN, 1997 Molecular cloning of a family of xenobiotic-inducible drosophilid cytochrome p450s: evidence for involvement in host-plant allelochemical resistance. Proc. Natl. Acad. Sci. USA 94:10797-10802.

DARBOUX, I., E. LINGUEGLIA, D. PAURON, P. BARBRY, and M. LAZDUNSKI, 1998 A new member of the amiloride-sensitive sodium channel family in Drosophila melanogaster peripheral nervous system. Biochem. Biophys. Res. Commun. 246:210-216.

DAVIS, M. B. and R. J. MACINTYRE, 1988 A genetic analysis of the α-glycerophosphate oxidase locus in Drosophila melanogaster. Genetics 120:755-766.

DAVIS, T., J. TRENEAR, and M. ASHBURNER, 1990 The molecular analysis of the el-noc complex of Drosophila melanogaster. Genetics 126:105-119.

DAVIS, T., M. ASHBURNER, G. JOHNSON, D. GUBB, and J. ROOTE, 1997 Genetic and phenotypic analysis of the genes of the elbow-no-ocelli region of chromosome 2L of Drosophila melanogaster. Hereditas 126:67-75.

DAWSON, I. A., S. ROTH, M. AKAM, and S. ARTAVANIS-TSAKONAS, 1993 Mutations of the fizzy locus cause metaphase arrest in Drosophila melanogaster embryos. Development 117:359-376.

DAWSON, I. A., S. ROTH, and S. ARTAVANIS-TSAKONAS, 1995 The Drosophila cell cycle gene fizzy is required for normal degradation of cyclins A and B during mitosis and has homology to the CDC20 gene of Saccharomyces cerevisiae. J. Cell Biol. 129:725-737.

DE LA VEGA, H., C. A. SPECHT, Y. LIU, and P. W. ROBBINS, 1998 Chitinases are a multi-gene family in Aedes, Anopheles and Drosophila. Insect Mol. Biol. 7:233-239.

DE VRIES, L., M. MOUSLI, A. WURMSER, and M. G. FARQUHAR, 1995 GAIP, a protein that specifically interacts with the trimeric G protein G alpha i3, is a member of a protein family with a highly conserved core domain. Proc. Natl. Acad. Sci. USA 92:11916-11920.

EBERL, D. F., D. REN, G. FENG, L. J. LORENZ, and D. VAN VACTOR et al., 1998 Genetic and developmental characterization of Dmca1D, a calcium channel α1 subunit gene in Drosophila melanogaster. Genetics 148:1159-1169.

EDDY, S. R., 1998 HMMER2.1 Profile hidden Markov models for biological sequence analysis. http://hmmer.wustl.edu/.

EDGAR, B. A., 1994 Cell cycle. Cell-cycle control in a developmental context. Curr. Biol. 4:522-524.

EDMONDSON, M. E., 1948  New mutants report. Dros. Inf. Serv. 22:53.

EUROPEAN DROSOPHILA GENOME PROJECT, 1999 http://edgp.ebi.ac.uk/.

FAMBROUGH, D. and C. S. GOODMAN, 1996 The Drosophila beaten path gene encodes a novel secreted protein that regulates defasciculation at motor axon choice points. Cell 87:1049-1058.

FAMBROUGH, D., D. PAN, G. M. RUBIN, and C. S. GOODMAN, 1996 The cell surface metalloprotease/disintegrin Kuzbanian is required for axonal extension in Drosophila. Proc. Natl. Acad. Sci. USA 93:13233-13238.

FLOREA, L., G. HARTZELL, Z. ZHANG, G. M. RUBIN, and W. MILLER, 1998 A computer program for aligning a cDNA sequence with a genomic DNA sequence. Genome Res. 8:967-974.

FLORES, C. and W. R. ENGELS, 1999 Microsatellite instability in Drosophila spellchecker1 (MutS homolog) mutants. Proc. Natl. Acad. Sci. USA 96:2964-2969.

FLYBASE CONSORTIUM, 1999 The FlyBase database of the Drosophila genome projects and community literature. Nucleic Acids Res. 27:85-88.

FRANK, L. H. and C. RUSHLOW, 1996 A group of genes required for maintenance of the amnioserosa tissue in Drosophila. Development 122:1343-1352.

FRANZUSOFF, A., K. REDDING, J. CROSBY, R. S. FULLER, and R. SCHEKMAN, 1991 Localization of components involved in protein transport and processing through the yeast Golgi apparatus. J. Cell Biol. 112:27-37.

FUCHS, R., 1994 Predicting protein functions: a versatile tool for the Apple Macintosh. CABIOS 10:171-178.

FURUKAWA, T., S. MARUYAMA, M. KAWAICHI, and T. HONJO, 1992 The Drosophila homolog of the immunoglobulin recombination signal-binding protein regulates peripheral nervous system development. Cell 69:1191-1197.

FUSE, N., S. HIROSE, and S. HAYASHI, 1996 Determination of wing cell fate by the escargot and snail genes in Drosophila. Development 122:1059-1067.

GAUSZ, J., G. BENCZE, H. GYURKOVICS, M. ASHBURNER, and D. ISH-HOROWICZ et al., 1979 Genetic characterization of the 87C region of the third chromosome of Drosophila melanogaster. Genetics 93:917-934.

GEISLER, R., A. BERGMANN, Y. HIROMI, and C. NUSSLEIN-VOLHARD, 1992 cactus, a gene involved in dorsoventral pattern formation of Drosophila, is related to the IκB gene family of vertebrates. Cell 71:613-621.

GIBSON, F., J. WALSH, P. MBURU, A. VARELA, and K. A. BROWN et al., 1995 A type VII myosin encoded by the mouse deafness gene shaker-1. Nature 374:62-64.

GENE ONTOLOGY CONSORTIUM, 1999 http://www.ebi.ac.uk/~ashburn/GO/ and http://www.fruitfly.org/~suzi/.

GONZALEZ-REYES, A., H. ELLIOTT, and R. D. ST. JOHNSTON, 1995 Polarization of both major body axes in Drosophila by gurken-torpedo signalling. Nature 375:654-658.

GOSSEN, M., D. T. S. PAK, S. K. HANSEN, J. K. ACHARYA, and M. R. BOTCHAN, 1995 A Drosophila homolog of the yeast origin recognition complex. Science 270:1674-1677.

GRAU, V., G. CARTERET, and P. SIMPSON, 1984 Mutation and chromosomal rearrangements affecting the expression of snail, a gene involved in embryonic patterning in Drosophila melanogaster. Genetics 108:347-360.

GREEN, E. D. and M. V. OLSON, 1990 Systematic screening of yeast artificial-chromosome libraries by use of the polymerase chain reaction. Proc. Natl. Acad. Sci. USA 87:1213-1217.

GREEN, P., 1995 GENEFINDER Documentation. http://www.ibc.wustl.edu/bio_data/genefinder.html.

GREEN, P., D. LIPMAN, L. HILLIER, R. WATERSTON, and D. STATES et al., 1993 Ancient conserved regions in new gene sequences and the protein databases. Science 259:1711-1716.

GRELL, E. H., K. B. JACOBSON, and J. B. MURPHY, 1968 Alterations of genetic material for analysis of alcohol dehydrogenase isozymes of Drosophila melanogaster. Ann. NY Acad. Sci. 151:441-455.

GRIFFITH, J. K., and C. E. SANSOM, 1998 The Transporter Facts Book. Academic Press, San Diego.

GUBB, D., 1998 Chromosome mechanics: the genetic manipulation of aneuploid stocks, pp. 109–130 in Drosophila: A Practical Approach, edited by D. B. ROBERTS. IRL Press, Oxford.

GUBB, D., M. SHELTON, J. ROOTE, S. MCGILL, and M. ASHBURNER, 1984  The genetic analysis of a large transposing element of Drosophila melanogaster. The insertion of a w+ rst+ TE into the ck locus. Chromosoma 91:54-64.

GUBB, D., J. ROOTE, G. HARRINGTON, S. MCGILL, and B. DURRANT et al., 1985 A preliminary genetic analysis of TE146, a very large transposing element of Drosophila melanogaster. Chromosoma 92:116-123.

GUBB, D., M. ASHBURNER, J. ROOTE, and T. DAVIS, 1990 A novel transvection phenomenon affecting the white gene of Drosophila melanogaster. Genetics 126:167-176.

GUO, M., L. Y. JAN, and Y. N. JAN, 1996 Control of daughter cell fates during asymmetric division: interaction of numb and Notch. Neuron 17:27-41.

HAN, Z. S., H. ENSLEN, X. HU, X. MENG, and I.-H. WU et al., 1998 A conserved p38 mitogen-activated protein kinase pathway regulates Drosophila immunity gene expression. Mol. Cell. Biol. 18:3527-3539.

HARTL, D. L., D. I. NURMINSKY, R. W. JONES, and E. R. LOZOVSKAYA, 1994 Genome structure and evolution in Drosophila: applications of the framework P1 map. Proc. Natl. Acad. Sci. USA 91:6824-6829.

HAUSER, F., H. P. NOTHACKER, and C. J. GRIMMELIKHUIJZEN, 1997 Molecular cloning, genomic organization, and developmental regulation of a novel receptor from Drosophila melanogaster structurally related to members of the thyroid-stimulating hormone, follicle-stimulating hormone, luteinizing hormone/choriogonadotropin receptor family from mammals. J. Biol. Chem. 272:1002-1010.

HAY, B. A., L. Y. JAN, and Y. N. JAN, 1988 A protein component of Drosophila polar granules is encoded by vasa and has extensive sequence similarity to ATP-dependent helicases. Cell 55:577-587.

HAYASHI, S., 1996  Checkpoint mechanism that maintains diploidy in Drosophila: CDC2 inhibits S phase entry in G2 by a kinase independent mechanism. Cell Struct. Funct. 21:694.

HAYASHI, S., S. HIROSE, T. METCALFE, and A. D. SHIRRAS, 1993 Control of imaginal cell development by the escargot gene of Drosophila. Development 118:105-115.

HEITZLER, P., D. COULSON, M. T. SAENZ-ROBLES, M. ASHBURNER, and J. ROOTE et al., 1993 Genetic and cytogenetic analysis of the 43A-E region containing the segment polarity gene costa and the cellular polarity genes prickle and spiny-legs in Drosophila melanogaster. Genetics 135:105-115.

HELT, G., 1997 Data visualization and gene discovery in Drosophila melanogaster. Ph.D. Thesis, University of California, Berkeley, CA.

HENIKOFF, S., M. A. KEENE, K. FECHTEL, and J. W. FRISTROM, 1986 Gene within a gene: nested Drosophila genes encode unrelated proteins on opposite DNA strands. Cell 44:33-42.

HIGGINS, D. G., J. D. THOMPSON, and T. J. GIBSON, 1996 Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 266:383-402.

HILLIKER, A. J., S. H. CLARK, W. M. GELBART, and A. CHOVNICK, 1981 Cytogenetic analysis of the rosy micro-region, polytene chromosome interval 87D2-4; 87E12-F1, of D. melanogaster. Dros. Inf. Serv. 56:65-72.

HODGETTS, R. B., 1972 Biochemical characterization of mutants affecting the metabolism of β-alanine in Drosophila. J. Insect Physiol. 18:937-947.

HOFMANN, K., P. BUCHER, L. FALQUET, and A. BAIROCH, 1999 The PROSITE database, its status in 1999. Nucleic Acids Res. 27:215-219.

HOLMES, A. L. and J. S. HEILIG, 1998 Fasciclin II and beaten path modulate intercellular adhesion in larval visual organ development. Development 126:261-272.

HOLMES, A. L., R. N. RAPER, and J. S. HEILIG, 1998 Genetic analysis of Drosophila larval optic nerve development. Genetics 148:1189-1201.

HORTON, P. and K. NAKAI, 1997  Better prediction of protein cellular localization sites with the k nearest neighbors classifier. Proc. Int. Conf. Intelligent Syst. Mol. Biol. 5:147-152.

HOSIE, A. M., K. ARONSTEIN, D. B. SATTELLE, and R. H. FFRENCH-CONSTANT, 1997 Molecular biology of insect neuronal GABA receptors. Trends Neurosci. 20:578-583.

HOUARD, X., T. A. WILLIAMS, A. MICHAUD, D. DANI, and R. E. ISAAC et al., 1998 The Drosophila melanogaster-related angiotensin-I-converting enzymes Acer and Ance. Distinct enzymic characteristics and alternative expression during pupal development. Eur. J. Biochem. 257:599-606.

HUDSON, A. and L. COOLEY, 1998  Analysis of the Drosophila Arp2/3 complex in oogenesis. A. Dros. Res. Conf. 39:289B.

HWANG, S.-Y., B. OH, Z. ZHANG, W. MILLER, and D. SOLTER et al., 1999 The mouse cornichon gene family. Dev. Genes Evol. 209:120-125.

INGRAM, V. N., 1961 Gene evolution and the haemoglobins. Nature 189:704-708.

IWAKI, D., S. KAWABATA, Y. MIURA, A. KATO, and P. B. ARMSTRONG et al., 1996 Molecular cloning of Limulus alpha 2-macroglobulin. Eur. J. Biochem. 242:822-831.

JACKSON, F. R., L. M. NEWBY, and S. J. KULKARNI, 1990 Drosophila GABAergic systems: sequence and expression of glutamic acid decarboxylase. J. Neurochem. 54:1068-1078.

JACOBS, M. E., 1974 Beta-alanine and adaptation in Drosophila. J. Insect Physiol. 20:859-866.

JIMENEZ, J., L. ALPHEY, P. NURSE, and D. M. GLOVER, 1990 Complementation of fission yeast cdc2ts and cdc25ts mutants identifies two cell cycle genes from Drosophila: a cdc2 homologue and string. EMBO J. 9:3565-3571.

JONES, S. J. M., 1999 Computational analysis of the Caenorhabditis elegans genome sequence. Ph.D. Thesis, Open University, England.

JUDD, B. H., M. W. SHEN, and T. C. KAUFMAN, 1972 The anatomy and function of a segment of the X chromosome of Drosophila melanogaster. Genetics 71:139-156.

KAMIZONO, A., M. NISHIZAWA, Y. TERANISHI, K. MURATA, and A. KIMURA, 1989 Identification of a gene conferring resistance to zinc and cadmium ions in the yeast Saccharomyces cerevisiae. Mol. Gen. Genet. 219:161-167.

KARLSTROM, R. O., L. P. WILDER, and M. J. BASTIANI, 1993 Lachesin: an immunoglobulin superfamily protein whose expression correlates with neurogenesis in grasshopper embryos. Development 118:509-522.

KAVENOFF, R. and B. H. ZIMM, 1973 Chromosome-sized DNA molecules from Drosophila. Chromosoma 41:1-27.

KAWABATA, S., F. TOKUNAGA, Y. KUGI, S. MOTOYAMA, and Y. MIURA et al., 1996 Limulus factor D, a 43-kDa protein isolated from horseshoe crab hemocytes, is a serine protease homologue with antimicrobial activity. FEBS Lett. 398:146-150.

KAWAMURA, K., T. SHIBATA, O. SAGET, D. PEEL, and P. J. BRYANT, 1999 A new family of growth factors produced by the fat body and active on Drosophila imaginal disc cells. Development 126:211-219.

KIMMEL, B. E., M. J. PALAZZOLO, C. H. MARTIN, J. D. BOEKE and S. E. DEVINE, 1997 Transposon-mediated DNA sequencing, pp. 455–532 in Genome Analysis, Vol. 1, edited by B. BIRREN, E. D. GREEN, S. KLAPHOLZ, R. M. MYERS and J. ROSKAMS. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

KIMMERLY, W. J., K. STULTZ, S. LEWIS, K. LEWIS, and V. LUSTRE et al., 1996 A P1-based physical map of the Drosophila euchromatic genome. Genome Res. 6:414-430.

KOBAYASHI, S., S. MIYABE, S. IZAWA, Y. INOUE, and A. KIMURA, 1996 Correlation of the OSR/ZRCI gene product and the intracellular glutathione levels in Saccharomyces cerevisiae. Biotechnol. Appl. Biochem. 23:3-6.

KOHLER, R. E., 1994 Lords of the Fly: Drosophila Genetics and the Experimental Life. University of Chicago Press, Chicago.

KOZLOVA, T. Y., V. F. SEMESHIN, I. V. TRETYAKOVA, E. B. KOKOZA, and V. PIRROTTA et al., 1994  Molecular and cytogenetical characterization of the 10A1-2 band and adjoining region in the Drosophila melanogaster polytene X chromosome. Genetics 136:1063-1073[Abstract].

KRAMER, K. M., D. FESQUET, A. L. JOHNSON, and L. H. JOHNSTON, 1998  Budding yeast RSI1/APC2, a novel gene necessary for initiation of anaphase, encodes an APC subunit. EMBO J. 17:498-506[Medline].

KUBLI, E., 1982  The genetics of transfer RNA in Drosophila.. Adv. Genet. 21:123-172[Medline].

LAIRD, C. D., 1971  Chromatid structure: relationship between DNA content and nucleotide sequence diversity. Chromosoma 32:378-406[Medline].

LAIRD, C. D. and B. J. MCCARTHY, 1968  Nucleotide sequence homology within the genome of Drosophila melanogaster. Genetics 60:323-334[Free Full Text].

LAIRD, C. D. and B. J. MCCARTHY, 1969  Molecular characterization of the Drosophila genome. Genetics 63:865-882[Free Full Text].

LAMMER, D., N. MATHIAS, J. M. LAPLAZA, W. JIANG, and Y. LIU et al., 1998  Modification of yeast Cdc53p by the ubiquitin-related protein rub1p affects function of the SCFCdc4 complex. Genes Dev. 12:914-926[Abstract/Free Full Text].

LANDIS, G. and J. TOWER, 1999  The Drosophila chiffon gene is required for chorion gene amplification, and is related to the yeast Dbf4 regulator of DNA replication and cell cycle. Development 126 (in press).

LASKO, P. F. and M. ASHBURNER, 1988  The product of the Drosophila gene vasa is very similar to eukaryotic initiation factor 4A. Nature 335:611-617[Medline].

LASKO, P. F. and M. ASHBURNER, 1990  Posterior localization of vasa protein correlates with, but is not sufficient for, pole cell development. Genes Dev. 4:905-921[Abstract/Free Full Text].

LEE, E. C., S. Y. YU, X. HU, M. MLODZIK, and N. E. BAKER, 1998  Functional analysis of the fibrinogen-related scabrous gene from Drosophila melanogaster identifies potential effector and stimulatory protein domains. Genetics 150:663-673[Abstract/Free Full Text].

LEFEVRE, G., 1976 A photographic representation and interpretation of the polytene chromosomes of Drosophila melanogaster salivary glands, pp. 31–66 in The Genetics and Biology of Drosophila, Vol. 1a, edited by M. ASHBURNER and E. NOVITSKI. Academic Press, London.

LEFEVRE, G. and W. S. WATKINS, 1986  The question of the total gene number in Drosophila melanogaster. Genetics 113:869-895[Abstract/Free Full Text].

LEPTIN, M., 1994  Morphogenesis: control of epithelial cell shape changes. Curr. Biol. 4:709-712[Medline].

LEWIS, E. B., J. D. KNAFELS, D. R. MATHOG, and S. E. CELNIKER, 1995  Sequence analysis of the cis-regulatory regions of the bithorax complex of Drosophila. Proc. Natl. Acad. Sci. USA 92:8403-8407[Abstract/Free Full Text].

LEWIS, D. L., C. L. FARR, Y. WANG, A. T. LAGINA, and L. S. KAGUNI, 1996  Catalytic subunit of mitochondrial DNA polymerase from Drosophila embryos: cloning, bacterial overexpression, and biochemical characterization. J. Biol. Chem. 271:23389-23394[Abstract/Free Full Text].

LIM, R. and A. ZAHEER, 1996  In vitro enhancement of p38 mitogen-activated protein kinase activity by phosphorylated glia maturation factor. J. Biol. Chem. 271:22953-22956[Abstract/Free Full Text].

LINDSLEY, D. L., and G. G. ZIMM, 1992 The Genome of Drosophila melanogaster. Academic Press, San Diego.

LITTLETON, J. T. and H. J. BELLEN, 1994  Genetic and phenotypic analysis of thirteen essential genes in cytological interval 22F1-2;23B1-2 reveals novel genes required for neural development in Drosophila. Genetics 138:111-123[Abstract].

LOHE, A. R. and D. L. BRUTLAG, 1987  Adjacent satellite DNA segments in Drosophila. J. Mol. Biol. 194:171-179[Medline].

LOWE, T. M. and S. R. EDDY, 1997  tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequences. Nucleic Acids Res. 25:955-964[Abstract/Free Full Text].

MAHONE, M., E. E. SAFFMAN, and P. F. LASKO, 1995  Localized Bicaudal-C RNA encodes a protein containing a KH domain, the RNA binding motif of FMR1. EMBO J. 14:2043-2055[Medline].

MALESZKA, R., H. G. DE COUET, and G. L. G. MIKLOS, 1998  Data transferability from model organisms to human beings: insights from the functional genomics of the flightless region of Drosophila. Proc. Natl. Acad. Sci. USA 95:3731-3736[Abstract/Free Full Text].

MANNING, J. E., C. W. SCHMID, and N. DAVIDSON, 1975  Interspersion of repetitive and nonrepetitive DNA sequences in the Drosophila melanogaster genome. Cell 4:141-155[Medline].

MARDON, G., N. M. SOLOMON, and G. M. RUBIN, 1994  dachshund encodes a nuclear protein required for normal eye and leg development in Drosophila. Development 120:3473-3486[Abstract].

MARRS, J. A. and G. B. BOUCK, 1992  The two major membrane skeletal proteins (articulins) of Euglena gracilis define a novel class of cytoskeletal proteins. J. Cell Biol. 118:1465-1475[Abstract/Free Full Text].

MARSHALL, T. K., H. GUO, and D. H. PRICE, 1990  Drosophila RNA polymerase II elongation factor DmS-II has homology to mouse S-II and sequence similarity to yeast PPR2. Nucleic Acids Res. 18:6293-6298[Abstract/Free Full Text].

MARTIN, C. H., C. A. MAYEDA, C. A. DAVIS, C. L. ERICSSON, and J. D. KNAFELS et al., 1995  Complete sequence of the bithorax complex of Drosophila. Proc. Natl. Acad. Sci. USA 92:8398-8402[Abstract/Free Full Text].

MARTIN, D., S. ZUSMAN, X. LI, E. L. WILLIAMS, and N. KHARE et al., 1999  wing blister, a new Drosophila laminin α chain required for cell adhesion and migration during embryonic and imaginal development. J. Cell Biol. 145:191-201[Abstract/Free Full Text].

MCGILL, S., 1985 Molecular studies of the Adh region of Drosophila melanogaster. Ph.D. Thesis, University of Cambridge, England.

MCGILL, S., W. CHIA, R. KARP, and M. ASHBURNER, 1988  The molecular analysis of an antimorphic mutation of Drosophila melanogaster, Scutoid. Genetics 119:647-661[Abstract/Free Full Text].

MCKRAY, R. D., L. ZHU, and R. D. SHORTRIDGE, 1995  A Drosophila gene that encodes a member of the protein disulfide isomerase/phospholipase C-α family. Insect Biochem. Mol. Biol. 25:647-654[Medline].

MCNABB, S., S. GREIG, and T. DAVIS, 1996  The alcohol dehydrogenase gene is nested in the outspread locus of Drosophila melanogaster. Genetics 143:897-911[Abstract].

MELLO, C. C., B. W. DRAPER, and J. R. PRIESS, 1994  The maternal genes apx-1 and glp-1 and establishment of dorsal-ventral polarity in the early C. elegans embryo. Cell 77:95-106[Medline].

MENG, Y. B., R. D. STEVENS, W. CHIA, S. MCGILL, and M. ASHBURNER, 1988  Five glycyl tRNA genes within the noc gene complex of Drosophila melanogaster. Nucleic Acids Res. 16:7189[Free Full Text].

MEWES, H. W., K. ALBERMANN, M. BÄHR, D. FRISHMAN, and A. GLIESSNER et al., 1997  Overview of the yeast genome. Nature 387(Suppl.):7-8[Medline].

MIKLOS, G. L. G. and G. M. RUBIN, 1996  The role of the genome project in determining gene function: insights from model organisms. Cell 86:521-529[Medline].

MILNE, A. A., 1926 Winnie-the-Pooh. Methuen, London.

MIN, K.-T. and S. BENZER, 1999  Preventing neurodegeneration in the Drosophila mutant bubblegum. Science 284:1985-1988[Abstract/Free Full Text].

MISTRY, H., 1997 Identification of loci interacting with Gαs signalling in Drosophila melanogaster. Ph.D. Thesis, University of Cambridge, England.

MOHLER, J. and E. WIESCHAUS, 1986  Dominant maternal-effect mutations of Drosophila melanogaster causing the production of double-abdomen embryos. Genetics 112:803-822[Abstract/Free Full Text].

MORINAGA, N., S. C. TSAI, J. MOSS, and M. VAUGHAN, 1996  Isolation of a brefeldin A-inhibited guanine nucleotide-exchange protein for ADP ribosylation factor (ARF) 1 and ARF3 that contains a Sec7-like domain. Proc. Natl. Acad. Sci. USA 93:12856-12860[Abstract/Free Full Text].

MUNRO, S. and H. R. PELHAM, 1987  A C-terminal signal prevents secretion of luminal ER proteins. Cell 48:899-907[Medline].

MUNROE, D. J., R. LOEBBERT, E. BRIC, T. WHITTON, and D. PRAWITT et al., 1995  Systematic screening of an arrayed cDNA library by PCR. Proc. Natl. Acad. Sci. USA 92:2209-2213[Abstract/Free Full Text].

MURPHY, S. M., L. URBANI, and T. STEARNS, 1998  The mammalian gamma-tubulin complex contains homologues of the yeast spindle pole body components spc97p and spc98p. J. Cell Biol. 141:663-674[Abstract/Free Full Text].

MUSACCHIO, M. and N. PERRIMON, 1996  The Drosophila kekkon genes: novel members of both the leucine-rich repeat and immunoglobulin superfamilies expressed in the CNS. Dev. Biol. 178:63-76[Medline].

NAKAI, M., T. ENDO, T. HASE, and H. MATSUBARA, 1993  Intramitochondrial protein sorting: isolation and characterization of the yeast MSP1 gene which belongs to a novel family of putative ATPases. J. Biol. Chem. 268:24262-24269[Abstract/Free Full Text].

NASH, D., 1965  The expression of 'Hairless' in Drosophila and the role of two closely linked modifiers of opposite effect. Genet. Res. 6:175-189.

NEER, E. J., C. J. SCHMIDT, R. NAMBUDRIPAD, and T. F. SMITH, 1994  The ancient regulatory-protein family of WD-repeat proteins. Nature 371:297-300[Medline].

NEVILL-MANNING, C. G., T. D. WU, and D. L. BRUTLAG, 1998  Highly specific protein sequence motifs for genome analysis. Proc. Natl. Acad. Sci. USA 95:5865-5871[Abstract/Free Full Text].

NORRANDER, J. M., A. PERRONE, L. A. AMOS, and R. W. LINCK, 1996  Structural comparison of tektins and evidence for their determination of complex spacings in flagellar microtubules. J. Mol. Biol. 257:385-397[Medline].

NUSSLEIN-VOLHARD, C., E. WIESCHAUS and G. JURGENS, 1982 Segmentierung bei Drosophila. Verh. Ges. Dtsch. Zool. 1982: 91–104.

NUSSLEIN-VOLHARD, C., E. WIESCHAUS, and H. KLUDING, 1984  Mutations affecting the pattern of the larval cuticle in Drosophila melanogaster. I. Zygotic loci on the second chromosome. Roux's Arch. Dev. Biol. 193:267-282.

O'DONNELL, J. M., H. C. MANDEL, M. KRAUSS, and W. SOFER, 1977  Genetic and cytogenetic analysis of the Adh region in Drosophila melanogaster. Genetics 86:553-566[Abstract/Free Full Text].

OH, Y., J. YOON, and K. BAEK, 1995  Isolation and characterization of the gene encoding the Drosophila melanogaster transcriptional elongation factor, TFIIS. Biochim. Biophys. Acta 1262:99-103[Medline].

OLSON, M. V., L. HOOD, C. CANTOR, and D. BOTSTEIN, 1989  A common language for physical mapping of the human genome. Science 245:1434-1435[Free Full Text].

OPPENHEIMER, D. G., M. A. POLLOCK, J. VACIK, D. B. SZYMANSKI, and B. ERICSON et al., 1997  Essential role of a kinesin-like protein in Arabidopsis trichome morphogenesis. Proc. Natl. Acad. Sci. USA 94:6261-6266[Abstract/Free Full Text].

OSOEGAWA, K., P. Y. WOON, B. ZHAO, E. FRENGEN, and M. TATENO et al., 1998  An improved approach for construction of bacterial artificial chromosome libraries. Genomics 52:1-8[Medline].

PAN, D. and G. M. RUBIN, 1997  Kuzbanian controls proteolytic processing of Notch and mediates lateral inhibition during Drosophila and vertebrate neurogenesis. Cell 90:271-280[Medline].

PATEL, S. and M. LATTERICH, 1998  The AAA team: related ATPases with diverse functions. Trends Cell Biol. 8:65-71[Medline].

PAWSON, T. and J. D. SCOTT, 1997  Signalling through scaffold, anchoring, adaptor proteins. Science 278:2075-2080[Abstract/Free Full Text].

PEDERSEN, M. B., 1982  Enhancement and suppression of the black mutant and induction of black phenocopies in Drosophila melanogaster. Hereditas 97:329.

PHILLIPS, A. M., L. B. SALKOFF, and L. E. KELLY, 1993  A neural gene from Drosophila melanogaster with homology to vertebrate and invertebrate glutamate decarboxylases. J. Neurochem. 61:1291-1301[Medline].

PIERI, A., F. MAGHERINI, G. LIGURI, G. RAUGEI, and N. TADDEI et al., 1998  Drosophila melanogaster acylphosphatase: a common ancestor for acylphosphatase isoenzymes of vertebrate species. FEBS Lett. 433:205-210[Medline].

PINTER, M., G. JEKELY, R. J. SZEPSESI, A. FARKAS, and U. THEOPOLD et al., 1998  TER94, a Drosophila homolog of the membrane fusion protein CDC48/p97, is accumulated in nonproliferating cells: in the reproductive organs and in the brain of the imago. Insect Biochem. Mol. Biol. 28:91-98[Medline].

PORTER, T. G. and D. L. MARTIN, 1988  Non-steady state kinetics of brain glutamate decarboxylase resulting from the interconversion of the apo- and holoenzyme. Biochim. Biophys. Acta 874:235-244.

POTTER, S. S., W. J. BROREIN, P. DUNSMUIR, and G. M. RUBIN, 1979  Transcription of elements of the 412, copia and 297 dispersed repeated gene families in Drosophila. Cell 17:415-427[Medline].

POWERS, J. and C. BARLOWE, 1998  Transport of Axl2p depends on the Erv14p, an ER-vesicle protein related to the Drosophila cornichon gene product. J. Cell Biol. 142:1209-1222[Abstract/Free Full Text].

RASCH, E. M., H. J. BARR, and R. W. RASCH, 1971  The DNA content of sperm of Drosophila melanogaster. Chromosoma 33:1-18[Medline].

REESE, M. G., F. H. EECKMAN, D. KULP, and D. HAUSSLER, 1997  Improved splice site detection in Genie. J. Comput. Biol. 4:311-323[Medline].

RICHARDSON, H. E., L. V. O'KEEFE, S. I. REED, and R. SAINT, 1993  A Drosophila G1-specific cyclin E homolog exhibits different modes of expression during embryogenesis. Development 119:673-690[Abstract].

ROGGE, R. D., C. A. KARLOVICH, and U. BANERJEE, 1991  Genetic dissection of a neurodevelopmental pathway: son of sevenless functions downstream of the sevenless and EGF receptor tyrosine kinases. Cell 64:39-48[Medline].

ROOKE, J., D. PAN, T. XU, and G. M. RUBIN, 1996  KUZ, a conserved metalloprotease-disintegrin protein with two roles in Drosophila neurogenesis. Science 273:1227-1231[Abstract].

ROPP, P. A. and W. C. COPELAND, 1996  Cloning and characterization of the human mitochondrial DNA polymerase, DNA polymerase γ. Genomics 36:449-458[Medline].

RØRTH, P., 1996  A modular misexpression screen in Drosophila detecting tissue-specific phenotypes. Proc. Natl. Acad. Sci. USA 93:12418-12422[Abstract/Free Full Text].

RØRTH, P., K. SZABO, A. BAILEY, T. LAVERTY, and J. REHM et al., 1998  Systematic gain-of-function genetics in Drosophila. Development 125:1049-1057[Abstract].

ROTH, S., D. STEIN, and C. NUSSLEIN-VOLHARD, 1989  A gradient of nuclear localization of the dorsal protein determines dorsoventral pattern in the Drosophila embryo. Cell 59:1189-1202[Medline].

ROTH, S., Y. HIROMI, D. GODT, and C. NUSSLEIN-VOLHARD, 1991  cactus, a maternal gene required for proper formation of the dorsoventral morphogen gradient in Drosophila embryos. Development 112:371-388[Abstract].

ROTH, S., F. S. NEUMAN-SILBERBERG, G. BARCELO, and T. SCHUPBACH, 1995  cornichon and the EGF receptor signaling process are necessary for both anterior-posterior and dorsal-ventral pattern formation in Drosophila. Cell 81:967-978[Medline].

RUBIN, G. M., 1998  The Drosophila genome project: a progress report. Trends Genet. 14:340-341[Medline].

RUDKIN, G. T., 1972 Replication in polytene chromosomes, pp. 59–85 in Developmental Studies on Giant Chromosomes, edited by W. BEERMANN. Springer-Verlag, Berlin.

RUSCH, J. and M. LEVINE, 1997  Regulation of a dpp target gene in the Drosophila embryo. Development 124:303-311[Abstract].

RUSSELL, S. R. H., and K. KAISER, 1993 mst35b, a male germline specific gene. Abstracts 13th Eur. Dros. Res. Conf.: I2.

SACCHAROMYCES GENOME DATABASE, 1999 http://genome-www.stanford.edu/Saccharomyces/.

SAPIR, A., R. SCHWEITZER, and B. Z. SHILO, 1998  Sequential activation of the EGF receptor pathway during Drosophila oogenesis establishes the dorsoventral axis. Development 125:191-200[Abstract].

SATOH, A. K., F. TOKUNAGA, and K. OZAKI, 1997  Rab proteins of Drosophila melanogaster: novel members of the Rab-protein family. FEBS Lett. 404:65-69[Medline].

SCHAEFFER, S. W. and C. F. AQUADRO, 1987  Nucleotide sequence of the Adh gene region of Drosophila pseudoobscura: evolutionary change and evidence for an ancient gene duplication. Genetics 117:61-73[Abstract/Free Full Text].

SCHIMMOLER, F., E. DIAZ, B. MUHLBAUER, and S. P. PFEFFER, 1998  Characterization of a 76kDa endosomal, multispanning membrane protein that is highly conserved throughout evolution. Gene 216:311-318[Medline].

SCHMIEDEKNECHT, G., C. KERKHOFF, E. ORSO, J. STOEHR, and C. ASLANIDIS et al., 1996  Isolation and characterization of a 14.5-kDa trichloroacetic-acid-soluble translational inhibitor protein from human monocytes that is upregulated upon cellular differentiation. Eur. J. Biochem. 242:339-351[Medline].

SCHUPBACH, T. and E. WIESCHAUS, 1986  Germline autonomy of maternal-effect mutations altering the embryonic body pattern of Drosophila. Dev. Biol. 113:443-448[Medline].

SCHUPBACH, T. and E. WIESCHAUS, 1989  Female sterile mutations on the second chromosome of Drosophila melanogaster. I. Maternal effect mutations. Genetics 121:101-117[Abstract/Free Full Text].

SCHWEISGUTH, F. and J. W. POSAKONY, 1992  Suppressor of Hairless, the Drosophila homolog of the mouse recombination signal-binding protein gene, controls sensory organ cell fates. Cell 69:1199-1212[Medline].

SELF, T., M. MAHONY, J. FLEMING, J. WALSH, and S. D. M. BROWN et al., 1998  Shaker-1 mutations reveal roles for myosin VIIA in both development and function of cochlea hair cells. Development 125:557-566[Abstract].

SHEN, W. and G. MARDON, 1997  Ectopic eye development in Drosophila induced by directed dachshund expression. Development 124:45-52[Abstract].

SIGRIST, S., G. RIED, and C. F. LEHNER, 1995  Dmcdc2 kinase is required for both meiotic divisions during Drosophila spermatogenesis and is activated by the twine cdc25 phosphatase. Mech. Dev. 53:247-260[Medline].

SIMON, M. A., D. D. L. BOWTELL, G. S. DODSON, T. R. LAVERTY, and G. M. RUBIN, 1991  Ras1 and a putative guanine nucleotide exchange factor perform crucial steps in signaling by the sevenless protein tyrosine kinase. Cell 67:701-716[Medline].

SMITHIES, O., G. E. CONNELL, and G. H. DIXON, 1962  Chromosomal rearrangements and the evolution of haptoglobin genes. Nature 196:232-236[Medline].

SMOLLER, D. A., D. PETROV, and D. L. HARTL, 1991  Characterization of bacteriophage P1 library containing inserts of Drosophila DNA of 75–100 kilobase pairs. Chromosoma 100:487-494[Medline].

SOEHNGE, H., X. HUANG, M. BECKER, P. WHITLEY, and D. CONOVER et al., 1996  A neurotransmitter transporter encoded by the Drosophila inebriated gene. Proc. Natl. Acad. Sci. USA 93:13262-13267[Abstract/Free Full Text].

SONNHAMMER, E. L., S. R. EDDY, and R. DURBIN, 1997  Pfam: a comprehensive database of protein domain families based on seed alignments. Proteins 28:405-420[Medline].

SORSA, V., 1988 Chromosome Maps of Drosophila, Vols. 1 and 2. CRC Press, Boca Raton, FL.

SOTILLOS, S., F. ROCH, and S. CAMPUZANO, 1997  The metalloprotease-disintegrin Kuzbanian participates in Notch activation during growth and patterning of Drosophila imaginal discs. Development 124:4769-4779[Abstract].

SPAIN, B. H., K. S. BOWDISH, A. PACAL, S. FLUCKIGER STAUB, and D. KOO et al., 1996  Two human cDNAs, including a homolog of Arabidopsis FUS6 (COP11), suppress G-protein- and mitogen-activated protein kinase-mediated signal transduction in yeast and mammalian cells. Mol. Cell. Biol. 16:6698-6706[Abstract].

SPEARMAN, C., 1904  The proof and measurement of association between two things. Am. J. Psychol. 15:72-101.

SPRADLING, A. C. and G. M. RUBIN, 1981  Drosophila genome organization: conserved and dynamic aspects. Annu. Rev. Genet. 15:219-264[Medline].

SPRADLING, A. C., D. M. STERN, I. KISS, J. ROOTE, and T. LAVERTY et al., 1995  Gene disruptions using P transposable elements: an integral component of the Drosophila genome project. Proc. Natl. Acad. Sci. USA 92:10824-10830[Abstract/Free Full Text].

SPRADLING, A. C., D. STERN, A. BEATON, E. J. RHEM, and N. MOZDEN et al., 1999  The BDGP gene disruption project: single P-element insertions mutating 30% of Drosophila autosomal genes. Genetics 153:135-177[Abstract/Free Full Text].

STATHAKIS, D. G., E. S. PENTZ, M. E. FREEMAN, J. KULLMAN, and G. R. HANKINS et al., 1995  The genetic and molecular organization of the Dopa decarboxylase gene cluster of Drosophila melanogaster. Genetics 141:629-655[Abstract].

STERNBERG, N., 1990  Bacteriophage P1 cloning system for the isolation, amplification, and recovery of DNA fragments as large as 100 kilobase pairs. Proc. Natl. Acad. Sci. USA 87:103-107[Abstract/Free Full Text].

STROBEL, E., P. DUNSMUIR, and G. M. RUBIN, 1979  Polymorphisms in the chromosomal locations of elements of the 412, copia and 297 dispersed repeated gene families in Drosophila. Cell 17:429-439[Medline].

STROUMBAKIS, N. D., Z. LI, and P. P. TOLIAS, 1996  A homolog of human transcription factor NF-X1 encoded by the Drosophila shuttle craft gene is required in the embryonic central nervous system. Mol. Cell. Biol. 16:192-201[Abstract].

STURTEVANT, A. H., 1925  The effects of unequal crossing over at the Bar locus in Drosophila. Genetics 10:117-147[Free Full Text].

STYHLER, S., A. NAKAMURA, A. SWAN, B. SUTER, and P. LASKO, 1998  vasa is required for GURKEN accumulation in the oocyte, and is involved in oocyte differentiation and germline cyst development. Development 125:1569-1578[Abstract].

SUN, X., J. WAHLSTROM, and G. KARPEN, 1997  Molecular structure of a functional Drosophila centromere. Cell 91:1007-1019[Medline].

TATEI, K., H. CAI, Y. T. IP, and M. LEVINE, 1995  Race: a Drosophila homologue of the angiotensin converting enzyme. Mech. Dev. 51:157-168[Medline].

TAYLOR, C. A. M., D. COATES, and A. D. SHIRRAS, 1996  The Acer gene of Drosophila codes for an angiotensin-converting enzyme homologue. Gene 181:191-197[Medline].

TESSIER-LAVIGNE, M. and C. S. GOODMAN, 1996  The molecular biology of axon guidance. Science 274:1123-1133[Abstract/Free Full Text].

TOLIAS, P. P. and N. D. STROUMBAKIS, 1998  The Drosophila zygotic lethal gene shuttle craft is required maternally for proper embryonic development. Dev. Genes Evol. 208:274-282[Medline].

VAN VACTOR, D., H. SINK, D. M. FAMBROUGH, R. TSOO, and C. S. GOODMAN, 1993  Genes that control neuromuscular specificity in Drosophila. Cell 73:1137-1153[Medline].

VARSHAVSKY, A., 1997  The ubiquitin system. Trends Biochem. Sci. 22:383-387[Medline].

WALDMANN, R. and M. LAZDUNSKI, 1998  H+-gated cation channels: neuronal acid sensors in the NaC/DEG family of ion channels. Curr. Biol. 8:418-424.

WALTER, M. F., L. L. ZEINEH, B. C. BLACK, W. E. MCIVOR, and T. R. WRIGHT et al., 1996  Catecholamine metabolism and in vitro induction of premature cuticle melanization in wild type and pigmentation mutants of Drosophila melanogaster. Arch. Insect Biochem. Physiol. 31:219-233[Medline].

WANG, Y., C. L. FARR, and L. S. KAGUNI, 1997  Accessory subunit of mitochondrial DNA polymerase from Drosophila embryos. Cloning, molecular analysis, and association in the native enzyme. J. Biol. Chem. 272:13640-13646[Abstract/Free Full Text].

WEIL, D., S. BLANCHARD, J. KAPLAN, P. GUILFORD, and F. GIBSON et al., 1995  Defective myosin VIIA gene responsible for Usher syndrome type 1B. Nature 374:60-61[Medline].

WEINSTEIN, J., F. W. JACOBSEN, J. HSU-CHEN, T. WU, and L. G. BAUM, 1994  A novel mammalian protein, p55CDC, present in dividing cells is associated with protein kinase activity and has homology to the Saccharomyces cerevisiae cell division cycle proteins Cdc20 and Cdc4. Mol. Cell. Biol. 14:3350-3363[Abstract/Free Full Text].

WELCH, M. D., A. H. DE PACE, S. VERMA, A. IWAMATSU, and T. MITCHISON, 1997  The human ARP2/3 complex is composed of evolutionarily conserved subunits and is localized to cellular regions of dynamic actin filament assembly. J. Cell Biol. 138:375-384[Abstract/Free Full Text].

WHITELEY, M., P. D. NOGUCHI, S. M. SENSABAUGH, W. F. ODENWALD, and J. A. KASSIS, 1992  The Drosophila gene escargot encodes a zinc finger motif found in snail-related genes. Mech. Dev. 36:117-127[Medline].

WOLFE, K. H. and D. C. SHIELDS, 1997  Molecular evidence for an ancient duplication of the entire yeast genome. Nature 387:708-713[Medline].

WOODRUFF, R. C. and M. ASHBURNER, 1979a  The genetics of a small autosomal region of Drosophila melanogaster containing the structural gene for alcohol dehydrogenase. I. Characterization of deficiencies and mapping of Adh and visible mutations. Genetics 92:117-132[Abstract/Free Full Text].

WOODRUFF, R. C. and M. ASHBURNER, 1979b  The genetics of a small autosomal region of Drosophila melanogaster containing the structural gene for alcohol dehydrogenase. II. Lethal mutations in the region. Genetics 92:133-149[Abstract/Free Full Text].

WORMPEP, 1999 http://www.sanger.ac.uk/Projects/C_elegans/wormpep/.

WRIGHT, T. R. F., 1987  The genetics of biogenic amine metabolism, sclerotization, and melanization in Drosophila melanogaster. Adv. Genet. 24:127-222[Medline].

XIE, Z. and D. H. PRICE, 1996  Purification of an RNA polymerase II transcript release factor from Drosophila. J. Biol. Chem. 271:11043-11046[Abstract/Free Full Text].

XU, Y., G. HELT, J. R. EINSTEIN, G. M. RUBIN and E. C. UBERBACHER, 1995 Drosophila GRAIL: an intelligent system for gene recognition in Drosophila DNA sequences, pp. 128–135 in Symposium on Intelligence in Neural and Biological Systems. IEEE Computer Society, Los Alamitos, CA.

YAGI, Y. and S. HAYASHI, 1997  Role of the Drosophila EGF receptor in determination of the dorsoventral domains of escargot expression during primary neurogenesis. Genes Cells 2:41-53[Abstract].

YEUNG, K. C., J. A. INOSTROZA, F. H. MERMELSTEIN, C. KANNABIRAN, and D. REINBERG, 1994  Structure-function analysis of the TBP-binding protein Dr1 reveals a mechanism for repression of class II gene transcription. Genes Dev. 8:2097-2109[Abstract/Free Full Text].

YEAST PROTEOME DATABASE, 1998 The Yeast Proteome Handbook. Ed. 5. Proteome Inc., Beverly, MA.

ZHENG, W., G. FENG, D. REN, D. F. EBERL, and F. HANNAN et al., 1995  Cloning and characterization of a calcium channel α1 subunit from Drosophila melanogaster with similarity to the rat brain type D isoform. J. Neurosci. 15:1132-1143[Abstract].

ZIV, J. and A. LEMPEL, 1977  A universal algorithm for sequential data compression. IEEE Trans. Inf. Theory 23:337-343.




[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
H. Schenkel, S. Hanke, C. De Lorenzo, R. Schmitt, and B. M. Mechler
P Elements Inserted in the Vicinity of or Within the Drosophila snRNP SmD3 Gene Nested in the First Intron of the Ornithine Decarboxylase Antizyme Gene Affect Only the Expression of SmD3
Genetics, June 1, 2002; 161(2): 763 - 772.
[Abstract] [Full Text] [PDF]


Home page
JCBHome page
A. M. Hudson and L. Cooley
A subset of dynamic actin rearrangements in Drosophila requires the Arp2/3 complex
J. Cell Biol., February 18, 2002; 156(4): 677 - 687.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
D. E. Koryakov, I. F. Zhimulev, and P. Dimitri
Cytogenetic Analysis of the Third Chromosome Heterochromatin of Drosophila melanogaster
Genetics, February 1, 2002; 160(2): 509 - 517.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
F. Chanut, K. Woo, S. Pereira, T. J. Donohoe, S.-Y. Chang, T. R. Laverty, A. P. Jarman, and U. Heberlein
Rough eye Is a Gain-of-Function Allele of amos That Disrupts Regulation of the Proneural Gene atonal During Drosophila Retinal Differentiation
Genetics, February 1, 2002; 160(2): 623 - 635.
[Abstract] [Full Text] [PDF]


Home page
J. Cell Sci.Home page
X. Fei, B. He, and P. N. Adler
The growth of Drosophila bristles and laterals is not restricted to the tip or base
J. Cell Sci., January 10, 2002; 115(19): 3797 - 3806.
[Abstract] [Full Text] [PDF]


Home page
Genome ResHome page
V. N. Bolshakov, P. Topalis, C. Blass, E. Kokoza, A. della Torre, F. C. Kafatos, and C. Louis
A Comparative Genomic Analysis of Two Distant Diptera, the Fruit Fly, Drosophila melanogaster, and the Malaria Mosquito, Anopheles gambiae
Genome Res., January 1, 2002; 12(1): 57 - 66.
[Abstract] [Full Text] [PDF]


Home page
DevelopmentHome page
G. C. T. Pipes, Q. Lin, S. E. Riley, and C. S. Goodman
The Beat generation: a multigene family encoding IgSF proteins related to the Beat axon guidance molecule in Drosophila
Development, November 15, 2001; 128(22): 4545 - 4552.
[Abstract] [Full Text] [PDF]


Home page
Proc. Natl. Acad. Sci. USAHome page
M. Llimargas and P. A. Lawrence
Seven Wnt homologues in Drosophila: A case study of the developing tracheae
PNAS, November 15, 2001; (2001) 251304398.
[Abstract] [Full Text] [PDF]


Home page
Genome ResHome page
R. A. George, J. P. Woolley, and P. T. Spellman
Ceramic Capillaries for Use in Microarray Fabrication
Genome Res., October 1, 2001; 11(10): 1780 - 1783.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
K. J. Schmid and C. F. Aquadro
The Evolutionary Analysis of ""Orphans"" From the Drosophila Genome Identifies Rapidly Diverging and Incorrectly Annotated Genes
Genetics, October 1, 2001; 159(2): 589 - 598.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
L. A. Lee, L. K. Elfring, G. Bosco, and T. L. Orr-Weaver
A Genetic Screen for Suppressors and Enhancers of the Drosophila PAN GU Cell Cycle Kinase Identifies Cyclin B as a Target
Genetics, August 1, 2001; 158(4): 1545 - 1556.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
H. Butler, S. Levine, X. Wang, S. Bonyadi, G. Fu, P. Lasko, B. Suter, and R. Doerig
Map Position and Expression of the Genes in the 38 Region of Drosophila
Genetics, August 1, 2001; 158(4): 1597 - 1614.
[Abstract] [Full Text] [PDF]


Home page
Mol. Biol. CellHome page
V. Kondylis, S. E. Goulding, J. C. Dunne, and C. Rabouille
Biogenesis of Golgi Stacks in Imaginal Discs of Drosophila melanogaster
Mol. Biol. Cell, August 1, 2001; 12(8): 2308 - 2327.
[Abstract] [Full Text] [PDF]


Home page
DevelopmentHome page
J. Cong, W. Geng, B. He, J. Liu, J. Charlton, and P. N. Adler
The furry gene of Drosophila is important for maintaining the integrity of cellular extensions during morphogenesis
Development, July 15, 2001; 128(14): 2793 - 2802.
[Abstract] [Full Text] [PDF]


Home page
Genome ResHome page
R. S. Hewes and P. H. Taghert
Neuropeptides and Neuropeptide Receptors in the Drosophila melanogaster Genome
Genome Res., June 1, 2001; 11(6): 1126 - 1142.
[Abstract] [Full Text] [PDF]


Home page
Genome ResHome page
P. V. Benos, M. K. Gatt, L. Murphy, D. Harris, B. Barrell, C. Ferraz, S. Vidal, C. Brun, J. Demaille, E. Cadieu, et al.
From First Base: The Sequence of the Tip of the X Chromosome of Drosophila melanogaster, a Comparison of Two Sequencing Strategies
Genome Res., May 1, 2001; 11(5): 710 - 730.
[Abstract] [Full Text]


Home page
Genome ResHome page
S. Rogic, A. K. Mackworth, and F. B.F. Ouellette
Evaluation of Gene-Finding Programs on Mammalian Sequences
Genome Res., May 1, 2001; 11(5): 817 - 832.
[Abstract] [Full Text]


Home page
Genes Dev.Home page
M. A. Hiller, T.-Y. Lin, C. Wood, and M. T. Fuller
Developmental regulation of transcription by a tissue-specific TAF homolog
Genes & Dev., April 15, 2001; 15(8): 1021 - 1030.
[Abstract] [Full Text]


Home page
Chem SensesHome page
R. R.H. Anholt, J. J. Fanara, G. M. Fedorowicz, I. Ganguly, N. H. Kulkarni, T. F.C. Mackay, and S. M. Rollmann
Functional Genomics of Odor-guided Behavior in Drosophila melanogaster
Chem Senses, February 1, 2001; 26(2): 215 - 221.
[Abstract] [Full Text] [PDF]


Home page
DevelopmentHome page
M. Pflumm and M. Botchan
Orc mutants arrest in metaphase with abnormally condensed chromosomes
Development, January 5, 2001; 128(9): 1697 - 1707.
[Abstract] [PDF]


Home page
Genome ResHome page
B. T. Wakimoto
Doubling the Rewards: Testis ESTs for Drosophila Gene Discovery and Spermatogenesis Expression Profile Analysis
Genome Res., December 1, 2000; 10(12): 1841 - 1842.
[Full Text]


Home page
Genome ResHome page
J. Andrews, G. G. Bouffard, C. Cheadle, J. Lü, K. G. Becker, and B. Oliver
Gene Discovery Using Computational and Microarray Analysis of Transcription in the Drosophila melanogaster Testis
Genome Res., December 1, 2000; 10(12): 2030 - 2043.
[Abstract] [Full Text]


Home page
GeneticsHome page
W. Geng, B. He, M. Wang, and P. N. Adler
The tricornered Gene, Which Is Required for the Integrity of Epidermal Cell Extensions, Encodes the Drosophila Nuclear DBF2-Related Kinase
Genetics, December 1, 2000; 156(4): 1817 - 1828.
[Abstract] [Full Text]


Home page
GeneticsHome page
E. G. Pasyukova, C. Vieira, and T. F. C. Mackay
Deficiency Mapping of Quantitative Trait Loci Affecting Longevity in Drosophila melanogaster
Genetics, November 1, 2000; 156(3): 1129 - 1146.
[Abstract] [Full Text]


Home page
GeneticsHome page
A. M. Huang and G. M. Rubin
A Misexpression Screen Identifies Genes That Can Modulate RAS1 Pathway Signaling in Drosophila melanogaster
Genetics, November 1, 2000; 156(3): 1219 - 1230.
[Abstract] [Full Text]


Home page
Nucleic Acids ResHome page
J. Trzcinska-Danielewicz and J. Fronk
SURVEY AND SUMMARY: Exon-intron organization of genes in the slime mold Physarum polycephalum
Nucleic Acids Res., September 15, 2000; 28(18): 3411 - 3416.
[Abstract] [Full Text] [PDF]


Home page
Proc. Natl. Acad. Sci. USAHome page
C. Nathan and M. U. Shiloh
Reactive oxygen and nitrogen intermediates in the relationship between mammalian hosts and microbial pathogens
PNAS, August 1, 2000; 97(16): 8841 - 8848.
[Abstract] [Full Text] [PDF]


Home page
JCBHome page
T. Brody and A. Cravchik
Drosophila melanogaster G Protein-coupled Receptors
J. Cell Biol., July 24, 2000; 150(2): F83 - F88.
[Abstract] [Full Text] [PDF]


Home page