Abstract
We describe a technique to tag Drosophila proteins with GFP at their native genomic loci. This technique uses a new, small P transposable element (the Wee-P) that is composed primarily of the green fluorescent protein (GFP) sequence flanked by consensus splice acceptor and splice donor sequences. We demonstrate that insertion of the Wee-P can generate GFP fusions with native proteins. We further demonstrate that GFP-tagged proteins have correct subcellular localization and can be expressed at near-normal levels. We have used the Wee-P to tag genes with a wide variety of functions, including transmembrane proteins. A genetic analysis of 12 representative fusion lines demonstrates that loss-of-function phenotypes are not caused by the Wee-P insertion. This technology allows the generation of GFP-tagged reagents on a genome-wide scale with diverse potential applications.
The advent of green fluorescent protein (GFP) as a means to visualize proteins in living cells has begun a revolution in many fields of cellular biology (Chalfie et al. 1994). Three major drawbacks currently hamper experiments using GFP tagging. First, proteins of interest, once tagged, must generally be overexpressed, and this overexpression can generate novel phenotypes that disrupt the processes being studied. Second, overexpression will often disrupt the fine architecture of protein localization. Finally, to tag a protein, a full-length cDNA must be available.
We have developed a GFP trap in Drosophila that is capable of generating full-length GFP fusion proteins throughout the genome. This GFP trap is based on the mobile genetic P element in Drosophila in which a GFP sequence is flanked by splice acceptor and splice donor sequences. This enables GFP to be spliced onto the native gene transcripts and take advantage of the endogenous splicing apparatus. The expression of GFP fusion proteins generated in this manner is controlled by endogenous genomic regulatory sequences. As a result, the correct spatial and temporal expression patterns are achieved and the resulting fusion proteins are not expressed at elevated levels.
Several transposable-element-based trapping methodologies have been recently pursued. Most gene traps and enhancer traps in Drosophila have been designed to facilitate gene disruption (Bier et al. 1989; Wilson et al. 1989; Lukacsovich et al. 2001). In contrast, the GFP trap that we describe here is designed to generate intact fusion proteins for studies of subcellular localization in a living cell. This GFP trap is the first described that has little apparent alteration of protein function. The protein-trapping technology that we describe here is similar to recent work from the Chia laboratory (Morin et al. 2001). A fundamental difference between these approaches resides in the construction of the mobile P element. To make the Wee-P less mutagenic than standard P elements, we have used the FLP-FRT system to remove the mini-w+ gene so that the resulting Wee-P element is small in size and lacks a second transcription unit. Our genetic data suggest that the Wee-P rarely generates loss-of-function phenotypes when homozygous or when placed in trans to deficiency chromosomes that uncover the insertion location. This allows the Wee-P to be visualized as a homozygous element, thereby tagging a larger portion of the endogenous transcript, which can be essential for visualizing proteins of low abundance. In addition, this may allow the Wee-P protein fusions to be used as reporters for changes in protein abundance during development or in response to experimental manipulation.
MATERIALS AND METHODS
Construction of the Wee-P element: The Wee-P element consists of a GFP sequence flanked by splice donor and splice acceptor sites, a mini-white gene flanked by FRT sites, an ampicillin resistance gene and origin of bacterial replication (pUC segment), and P-element inverted repeat ends. To generate this transposon, we first made an intermediate vector containing the following three elements: (1) a pUC segment and P-element ends cut from pCaSpeR-4, (2) a mini-white gene flanked by FRT sites cut from FLP-IMP (Keller et al. 2002), and (3) a short linker sequence. The GFP sequence was PCR amplified from pEGFP tubulin (CLONTECH, Palo Alto, CA) using the primers JB NotI GFP-2 (TTAATTTACTATGCGGCCGCTTTACTTAATTACAGGGTCTATATAAGCAGAGCTGG) and JB-GFP BamHI (ATTCCAAGTTCTGCAGATAAAGAATTACTTACCTCTCGAGATCTGAGTCCGGAC) and inserted into this intermediate vector. The most conserved Drosophila splice acceptor site, CAG, and splice donor site, GTAAGT (Mount et al. 1992), were inserted at the 5′ end and 3′ end, respectively, of the GFP sequence. To increase the likelihood that the cell’s splicing machinery will recognize these artificially inserted splice sites, these splice sites were both positioned adjacent to 10- to 12-bp AT-rich regions characteristic of intronic DNA (Mount et al. 1992). The GFP sequence also contains its own ATG codon for translation initiation. The pUC segment of the vector was used for subcloning purposes only and does not integrate into the genomic DNA of the fly. The mini-white gene was used as a marker for transformation of the injected vector and subsequently removed using heat-shock FLP-induced FRT recombination, as described previously (Golic and Lindquist 1989). Therefore, the resulting P element transposed to create independent Wee-P-element insertions is ∼1.9 kb in length.
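The primer design described above can be checked with a short script. The following is a minimal sketch, not part of the original methods: it assumes the two primer sequences are read with the line-break spaces removed, that the JB-GFP BamHI primer anneals to the antisense strand (so the splice donor is looked for in its reverse complement), and the helper names are illustrative only.

```python
# Primer sequences as printed in the text, with line-break spaces removed.
FWD_PRIMER = "TTAATTTACTATGCGGCCGCTTTACTTAATTACAGGGTCTATATAAGCAGAGCTGG"  # JB NotI GFP-2
REV_PRIMER = "ATTCCAAGTTCTGCAGATAAAGAATTACTTACCTCTCGAGATCTGAGTCCGGAC"    # JB-GFP BamHI

NOTI_SITE = "GCGGCCGC"       # NotI recognition sequence
SPLICE_ACCEPTOR = "CAG"      # consensus acceptor named in the text
SPLICE_DONOR = "GTAAGT"      # consensus donor named in the text


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def at_fraction(seq):
    """Fraction of A/T bases in seq (0.0 for an empty string)."""
    return sum(base in "AT" for base in seq) / len(seq) if seq else 0.0


def upstream_of(seq, motif, n):
    """Up to n bases immediately 5' of the first occurrence of motif."""
    i = seq.find(motif)
    return seq[max(0, i - n):i] if i >= 0 else ""


def downstream_of(seq, motif, n):
    """Up to n bases immediately 3' of the first occurrence of motif."""
    i = seq.find(motif)
    return seq[i + len(motif):i + len(motif) + n] if i >= 0 else ""


# Assumption: the second primer is written antisense, so flip it first.
rev_sense = reverse_complement(REV_PRIMER)

acceptor_flank = upstream_of(FWD_PRIMER, SPLICE_ACCEPTOR, 12)
donor_flank = downstream_of(rev_sense, SPLICE_DONOR, 10)

print("NotI site in forward primer:", NOTI_SITE in FWD_PRIMER)
print("Bases 5' of acceptor:", acceptor_flank, at_fraction(acceptor_flank))
print("Donor present on sense strand:", SPLICE_DONOR in rev_sense)
print("Bases 3' of donor:", donor_flank, at_fraction(donor_flank))
```

Under these assumptions, both flanks reported by the script are strongly AT-rich, which is consistent with the 10- to 12-bp AT-rich regions described in the text.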
Genetics: y, w, P[hsFLP12, ry+]; Sco/CyO used for FRT recombination of the mini-white gene from the original Wee-P element vector was obtained from the Bloomington Stock Center. The stock containing the transposase enzyme (w; CyO/Sp; ry, Dr Δ2-3/TM6) was kindly provided by John Roote. The initial en masse crosses to this transposase source were with silent, starter Wee-P elements (ST) that are in w- backgrounds. w/Y; Wee-PST/+; Dr Δ2-3/+ were crossed en masse to virgin w females derived from a stock of w/Y P(hs-hid) (a gift from the Matt Scott lab) that were heat shocked to yield high numbers of virgin females. Crosses were done in large cages on apple or grape juice plates that held ∼300 parental flies. Each cage was given a unique number and screened for several weeks until the yield of progeny was low. Care was taken to minimize duplicated work due to the effects of premeiotic clusters by recording in detail the initial GFP expression pattern and cage number of each GFP-positive animal selected. Each GFP-expressing line typically had three separate Wee-P insertion events, suggesting a high frequency of transposition (the absence of a scorable marker independent of the GFP precludes the determination of the transposition frequency using traditional markers such as variegated eye color). To determine which Wee-P insertion causes the GFP expression, we outcrossed each line two generations to a y w stock, selecting on the GFP expression pattern. After two generations we typically observed only a single band by inverse PCR (IPCR) originating from a single Wee-P insertion. Infrequently, two bands persisted after outcrossing. In this case, both bands were sequenced and only one was generally capable of generating a fusion event.
The Df(3R)Δ356 stock was a generous gift of the Zinsmaier laboratory. The following lines (provided by Bloomington Stock Center) were used for genetic analysis: (kis1cn1bw1sp1/SM6a), (y, w; P{w[+mC] = lacW}Hsc 70-4 L3929/TM3 Ser), (Df(3R)P13), and (l(2)k10423).
Screening techniques: Embryos were aged until late embryogenesis and dechorionated using 50% bleach for 2-3 min followed by a brief rinse through a sieve in water. The embryos were then placed in a small dish of heptane and spread with the heptane on microscope slides with pipettors. Because they sank in the heptane, the embryos formed single layers on the microscope slides. Five to ten seconds after the heptane evaporated, we placed halocarbon oil over the embryos and then screened visually for GFP on a GFP dissection microscope using a ×10 compound objective (Zeiss). We also screened first instar larvae by rinsing them off of plates with PBT (1× PBS, 0.1% Tween) into a sieve; we then rinsed them with water, placed them in a petri dish with a thin layer of PBT, and screened visually with a GFP dissecting microscope.
Molecular analysis of P-element insertions and Western blotting: The insertion site of the Wee-P element in generated lines was determined using inverse PCR, as described by the Berkeley Drosophila Genome Project (http://www.fruitfly.org/about/methods/inverse.pcr.html). PCR amplification was performed using first-round primers IPCR-4 (CAATCATATCGCTGTCTCAC) and IPCR-6 (GATTAACCCTTAGCATGTCC) and second-round nested primers IPCR-7 (ACTATTCCTTTCACTCGCAC) and IPCR-9 (ACCTCTCGAGATCTGAGTCC). Western blots were done with standard SDS-PAGE gels according to standard protocols.
RESULTS
Construction and expression of the Wee-P: To generate new reagents for live visualization of proteins we developed a P-element-based strategy for tagging proteins with GFP at their native genomic loci. This approach takes advantage of the endogenous splicing apparatus utilized by the majority of genes in Drosophila: ∼82% of the genes in Drosophila have introns and, of those, there are an average of 2.5 introns per gene (Long et al. 1995; Deutsch and Long 1999; M. Long, personal communication). There is also evidence that exons are frequently modular, encoding distinct motifs within a gene (Long et al. 1995). Therefore, genes with exons may tolerate the insertion of a new motif such as GFP without significant alterations to their function.
To create a small P element that would act as a GFP trap and hop frequently, we took a two-step strategy. First, we designed a P element containing the GFP sequence as well as a scorable marker (mini-white) that would allow us to identify transgenic animals harboring the P-GFP element (Figure 1A). Second, we excised the mini-white gene using the FLP recombinase system (Figure 1B; Golic and Lindquist 1989). After removing the mini-white gene from the P-GFP element, the total size of the P element was ∼1.9 kb (compared to standard mini-white-containing P elements that are 10-15 kb; see Bier et al. 1989; Wilson et al. 1989). We reasoned that a smaller P element would be less mutagenic and increase the likelihood that a trapped protein would be expressed at endogenous levels by the native genomic transcriptional regulatory sequences. We refer to this P element as the “Wee-P” due to its small size. As P elements show a tendency to insert near or in the 5′ untranslated region (UTR) of genes, we chose to keep the ATG of GFP in the Wee-P so that 5′ spliced GFP fusions would be expressed (Liao et al. 2000). Therefore, we also placed consensus translation initiation sequences just 5′ to the ATG (Cavener and Ray 1991). Although sequences associated with an initial ATG were placed in the Wee-P, we deliberately chose not to include any promoter sequences so that GFP expression would accurately reflect the expression of the trapped protein. As introns occur in three phases, we constructed three different Wee-P elements appropriate for each intron phase (Figure 1C).
Screening with the Wee-P for GFP-tagged proteins: For our initial screen we crossed Wee-PST0 (a second chromosome phase 0 Wee-P insertion that, due to its location, does not form a fusion protein) to a stock of Δ2-3, Dr/TM6b (Figure 1D). Males heterozygous for Wee-PST0 and Δ2-3, Dr were then crossed to y w animals. Embryos were aged until late embryogenesis to allow time for zygotic protein translation before dechorionation and visualization on microscope slides. Embryos were plated as a monolayer on standard microscope slides and visualized using a fluorescent dissection microscope equipped with a ×10 compound objective. Each slide contained ∼2000 embryos and was screened in 30 min. Using this method GFP+ embryos were found at a frequency of ∼1/400 animals.
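The screening figures quoted above imply a simple throughput estimate. The sketch below only restates that arithmetic; the constant and function names are illustrative, and the numbers are the approximate values given in the text rather than measured rates.

```python
# Approximate values taken from the text.
EMBRYOS_PER_SLIDE = 2000      # embryos plated per slide
MINUTES_PER_SLIDE = 30        # time to screen one slide
GFP_POSITIVE_ONE_IN = 400     # GFP+ embryos found at about 1 in 400


def expected_positives(n_embryos):
    """Expected number of GFP+ embryos among n_embryos screened."""
    return n_embryos / GFP_POSITIVE_ONE_IN


def embryos_per_hour():
    """Embryos screened per hour at the stated slide rate."""
    return EMBRYOS_PER_SLIDE * 60 / MINUTES_PER_SLIDE


per_slide = expected_positives(EMBRYOS_PER_SLIDE)
per_hour = expected_positives(embryos_per_hour())
print(f"Expected GFP+ embryos per slide: {per_slide:g}")
print(f"Expected GFP+ embryos per hour of screening: {per_hour:g}")
```

At these rates a single slide would be expected to yield about five GFP+ embryos, although, as noted later in the text, only a fraction of GFP+ animals represent true in-phase fusions.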
Each embryo identified as having GFP expression was placed in an individual vial. Adults that emerged were then crossed to y w. In some cases the Δ2-3, Dr chromosome had to be selected against. Initially, we performed IPCR directly on the progeny of the initial GFP+ animal. However, these animals typically had three or more independent Wee-P insertions, which complicated the completion of the IPCR step. We therefore did two rounds of outcrossing on the basis of GFP expression to clean the chromosomes. IPCR on these animals generally yielded single clear bands. We then performed sequence analysis of these single bands to determine the site and orientation of the putative fusion event. In a number of cases sequence analysis of RT-PCR products was performed to confirm the exact nature of the fusion event (Table 1).
Figure 1. Construction and genetic crosses with the Wee-P. (A) The first step in constructing the Wee-P was to generate a P element that had a mini-white gene for scoring the initial injections. Flanking this mini-white were two FRT sites for eventual excision. Intron sequences and splice acceptor and splice donor sites flank the ORF of GFP. The mini-white, GFP, and flanking intron sequences are shown to scale. The FRT sites and 31-bp inverted repeats are enlarged for simplicity in viewing. (B) Following excision of the mini-white by FLP recombinase, the remaining P element is only 1.9 kb long and is composed essentially of the GFP ORF and flanking intron and splicing sequences. Scale as in A. (C) Introns can insert in any one of three positions relative to coding sequence. We have constructed three distinct Wee-P elements appropriate for each of the three intron phases. To compensate for the phases, we inserted nucleotide sequences (green) that would be as minimally disruptive as possible to the coding sequence. For example, in our phase 2 construct, just before the “gtaagt” splice donor sequence, we placed a “CT.” We chose CT because it will encode for an extra leucine, a common amino acid with a small and poorly reactive side chain. (D) Our first cross is designed to combine a transposase source (Δ2-3) with a starter Wee-P element (Wee-PST) that, due to its insertion location, does not form a fusion protein. In the second cross, males with the Wee-PST and the Δ2-3 are combined with y w virgins. These y w virgins are collected from a stock that has a “hs-hid” element on the Y chromosome, which, following heat shock, allows for rapid collection of thousands of virgins. Once a GFP+ embryo or larva has been isolated, it is outcrossed two times to clean up the chromosomes and then IPCR is performed on the line.
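The phase compensation described for the three Wee-P constructs can be illustrated with a little frame arithmetic. The following is a sketch under stated assumptions: it uses the standard definition of intron phase (phase 0 falls between codons, phase 1 after the first base of a codon, phase 2 after the second), and it assumes that the inserted exon carries enough leading bases to complete the interrupted codon. Only the trailing "CT" of the phase 2 construct is taken from the text; the padding function and its names are hypothetical.

```python
# Leucine codons in the standard genetic code.
LEUCINE_CODONS = {"CTT", "CTC", "CTA", "CTG", "TTA", "TTG"}


def exon_padding(phase):
    """Return (lead, tail) padding lengths for an exon inserted into an
    intron of the given phase so that the host reading frame is kept.

    lead: bases that complete the codon interrupted by the intron.
    tail: bases that begin a codon finished by the downstream exon.
    This is an illustrative model, not the authors' stated design rule.
    """
    if phase not in (0, 1, 2):
        raise ValueError("intron phase must be 0, 1, or 2")
    return (3 - phase) % 3, phase


def keeps_frame(phase, core_length):
    """True if lead padding + core exon + tail padding is a multiple of 3."""
    lead, tail = exon_padding(phase)
    return (lead + core_length + tail) % 3 == 0


# In a phase 2 intron the downstream exon supplies the third codon base,
# so a trailing "CT" always yields a leucine codon, whatever that base is.
ct_codons = ["CT" + base for base in "ACGT"]
print(ct_codons, all(codon in LEUCINE_CODONS for codon in ct_codons))

for phase in (0, 1, 2):
    print(phase, exon_padding(phase), keeps_frame(phase, 300))
```

The core length of 300 is an arbitrary multiple of three chosen for illustration; any in-frame coding block gives the same result.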
Approximately 60% of all GFP-positive animals represent in-phase fusion events. Of the remaining 40% of animals that are selected on the basis of GFP expression, a large percentage are due to a hot spot for Wee-P insertions in the lola locus. In these animals, the GFP amino acid sequence plus a few additional nonsense amino acid residues are translated after the Wee-P exon fuses with a 5′ UTR exon of the lola locus that contains multiple stop codons. Other nonfusion events include cases where the Wee-P element has landed in the 5′ UTR of a gene whose first intron is not in the same phase as the Wee-P element.
Identity and subcellular localization of GFP fusion events: To date we have analyzed >100 lines that express GFP. We have characterized a subset of these lines in molecular and genetic detail to assess the properties of Wee-P fusion events. A diverse array of proteins have been tagged with GFP using the Wee-P element, including transmembrane proteins as well as cytoplasmic and nuclear proteins (Figures 2, 3, 4; Table 1). The GFP fusion events can occur N-terminally as well as within the open reading frame (ORF). N-terminal fusions are derived from a variety of insertion sites including upstream of the 5′ UTR, within an exon of the 5′ UTR, and within introns bound by 5′ UTR exons. Internal fusion events are derived from insertions in introns bound by coding exons and constitute about half of the fusions (Table 1).
Table 1. Summary of selected Wee-P fusions
For internal fusion events the average size of the intron into which Wee-P elements have been observed to insert is 2.9 kb (SD = 4.1 kb). The smallest intron into which a Wee-P has inserted and generated a fusion is 183 bp. In comparison, the average intron size for fusions discovered with the protein trap transposon is ∼8 kb (Morin et al. 2001), suggesting that the shorter Wee-P transposon may have a significant advantage in trapping shorter introns.
An analysis of subcellular localization for Wee-P fusions demonstrates that the fusion proteins are correctly localized in each insertion line analyzed to date. For example, the Wee-P114:GFP fusion with calmodulin is widely expressed and enriched in neuronal tissues (Figure 2A). In a second example, a fusion with Ciboulot (a G-actin-binding protein) is distributed in several different tissues, such as the central nervous system, the chordotonal organs, and the trachea (Figure 2, C-E). The subcellular localization of Wee-P3:GFP:Cib within each of these cell types is cytoplasmic, as predicted for a G-actin-binding protein. A fusion with the transmembrane protein Sec61 shows localization consistent with an endoplasmic reticulum (ER) resident protein (Figure 2F). Finally, in the Wee-P:GFP fusion with the putative chromatin remodeling protein Kismet, GFP is localized to the nucleus as predicted, where it is observed in puncta (Figure 2, G and H; Daubresse et al. 1999).
Molecular analysis of Wee-P fusion events: We examined a subset of Wee-P fusions by Western blots using anti-GFP antibodies. In all the fusions analyzed we found bands of the correct predicted size (Figure 3A). Furthermore, these Western blots indicate that there is generally no evidence of altered protein stability or degradation. It is also important to determine how much of the native transcript includes the GFP sequence. A previously published antibody to the Cib protein worked well on Westerns and demonstrates that a majority of the protein made from the Wee-P3:Cib locus includes GFP (Figure 3B; Boquet et al. 2000). However, in this instance, there is a reduction in the total protein expression level. To address this question further we performed RT-PCR on five different Wee-P lines and found that in each case, in addition to transcripts that include the GFP, wild-type transcripts are also formed (Table 1).
Analysis of the Wee-P1:GFP:Hsc70-4 fusion: To illustrate in more detail the functioning of the Wee-P, we present a detailed examination of the Wee-P1:GFP:Hsc70-4 fusion (Figure 4). This Wee-P inserted in an intron of Hsc70-4 (Figure 4A), a broadly expressed and essential gene that plays important roles as a molecular chaperone (Carbajal et al. 1993) and in catalyzing the uncoating of clathrin-coated vesicles (Newmyer and Schmid 2001). Hsc70-4 has also been proposed to play an important role in vesicle fusion at the neuromuscular junction (NMJ; Bronk et al. 2001). RT-PCR using primers specific for GFP and the open reading frame of Hsc70-4 indicates that a fusion mRNA is made and demonstrates that the native splicing apparatus recognizes the splice donor sites in the Wee-P (Figure 4A). Additional RT-PCR using primers designed to GFP and the exon composed of the 5′ UTR of Hsc70-4 also indicates that the 5′ UTR exon is linked to this new mRNA and demonstrates that the native splicing apparatus also recognizes the splice acceptor sequences in the Wee-P (Figure 4A). The fusion product between GFP and Hsc70-4 should be a perfect fusion of GFP onto the Hsc70-4 protein without any additional alterations to Hsc70-4. Sequence analysis of our RT-PCR product indicates that such a perfect fusion has occurred. Western analysis of Wee-P1 flies using an anti-GFP antibody indicates that a protein of the predicted ∼98 kD is formed (Figure 3A). A genetic analysis of Wee-P1, described in detail below, indicates that the Wee-P1 insertion does not alter Hsc70-4 function.
Figure 2. Wee-P fusions show a wide variety of expression patterns. (A) Wee-P114:GFP:Cam is found at the neuromuscular junction in third instar larvae, as well as in the motoneuron axons and in the muscle nuclei. (B) Wee-P114:GFP:Cam is found in the soma and dendrites of the third instar sensory neurons of the peripheral nervous system. (C) Wee-P3:GFP:Cib is expressed throughout the optic lobes of the larval central nervous system and in a subset of cells within the ventral ganglion. (D) Cytoplasmic Wee-P3:GFP:Cib expression is observed within the chordotonal organs of the third instar larvae. (E) Cytoplasmic Wee-P3:GFP:Cib expression is observed in the tracheal system of the third instar larvae. (F) Wee-P4:GFP:Sec61 is localized in patches of the cytoplasm of cells. Depicted are epithelial cells of a third instar larva. (G) Wee-P2:GFP:Kis expression is widespread and localized to the nucleus, as has been previously characterized for this gene. Expression of Wee-P2:GFP:Kis is shown in an imaginal disc at low magnification. (H) At higher magnification, discrete puncta of Wee-P2:GFP:Kis fusion proteins in the nuclei are visible. Such puncta were noted in all cell types examined.
Figure 3. Wee-P fusion proteins are of the predicted size and can reveal expression patterns for uncharacterized genes. (A) Representative fusion lines were blotted and probed with anti-GFP antibodies. Single clear bands are seen in all cases and are approximately the correct predicted sizes. (B) An analysis of the Wee-P3:GFP:Cib fusion in homozygous Wee-P3 flies with an anti-Cib antibody indicates that the majority of the protein made from the locus is tagged with GFP. There is a reduction in the total amount of protein made from this locus in comparison to y w controls. (C) Wee-P20 is a fusion with the uncharacterized but annotated gene CG9894. Third instar muscle nuclei are shown.
We have examined the tissue localization pattern of Wee-P1:GFP:Hsc70-4 in larvae and found a broad and low-level expression of GFP in all cells and a strong expression in tissues undergoing morphological changes, such as third instar imaginal discs (data not shown). Previous studies with RNA in situ have revealed that Hsc70-4 is expressed in virtually all cells, with enrichment in tissues undergoing extensive rapid growth and changes in shape (Perkins et al. 1990). Thus the Wee-P1:Hsc70-4 fusion provides direct evidence that GFP fusions from Wee-Ps can reflect the correct tissue localization patterns.
Analysis of Wee-P1 also underscores that the correct subcellular localization can occur in Wee-P fusions. Prior studies have demonstrated that Hsc70-4 is found both in the cytoplasm and in the nucleus (Carbajal et al. 1993), and such a distribution for GFP is observed in Wee-P1 (Figure 4C). Recent research has also demonstrated that Hsc70-4 is enriched at the third instar neuromuscular synapse where it is involved in vesicle cycling (Bronk et al. 2001). In Wee-P1, GFP expression is enriched at the NMJ and, within the NMJ, Wee-P1:GFP:Hsc70-4 is localized to hot spots (Figure 4B).
Figure 4. The Wee-P1:GFP:Hsc70-4 fusion reveals a subsynaptic localization at the NMJ and a heat-shock-induced increase in nuclear expression. (A) Splicing events that generate the Wee-P1:GFP:Hsc70-4 fusion. (B) Wee-P1:GFP:Hsc70-4 is concentrated at the neuromuscular synapse and shows subsynaptic hot spots that resemble the distribution of synaptic vesicles within this nerve terminal. (C) Heat-shock treatment leads to an increase in nuclear localization of Wee-P1:GFP:Hsc70-4. (Left) Third instar muscle nuclei from a Wee-P1 heterozygote that was not given a heat shock. (Right) Third instar muscle nuclei from an animal that was given a heat shock. (D) Heat-shock treatment of Wee-P1 flies correlates with a 400% increase in nuclear expression of Wee-P1:GFP:Hsc70-4.
This fusion also demonstrates appropriate trafficking of a Wee-P:GFP fusion protein. In response to heat shock, Hsc70-4 translocates to the nucleus (Carbajal et al. 1993). We subjected Wee-P1 animals to heat shock and examined the fluorescence intensity of GFP in muscle nuclei in animals before and after heat shock. We observed a 400% increase in nuclear GFP fluorescence intensity following heat shock (Figure 4, C and D). To control for nonspecific effects on protein translation and translocation, we performed a similar analysis with Wee-P114 and Wee-P4. In neither case did heat shock alter GFP fluorescence intensity. Thus, the Wee-P1:GFP:Hsc70-4 fusion traffics correctly in response to heat-shock stimulation.
Wee-P fusions can be used to probe the expression patterns of uncharacterized genes: Our data examining Wee-P fusions with previously characterized genes indicate that fusion proteins are stable and behave normally within their endogenous cellular environment. On the basis of these data, it is reasonable to suspect that fusions with novel and previously uncharacterized genes may also reflect correct subcellular localization. For example, we have isolated a fusion with Larp, a large (150-kD) protein that is largely uncharacterized. The Larp protein has a conserved La domain that is most closely related to the yeast Sro9p gene (Chauvet et al. 2000). Sro9p is primarily cytoplasmic and thought to bind mRNAs and modulate their translation (Sobel and Wolin 1999). We observe that Wee-P5:GFP:Larp is widely expressed in the embryo and larval stages and within cells is localized to the cell cytoplasm (data not shown). As a second example, Wee-P20 is a fusion with the gene CG9894, which encodes a protein of unknown function. This fusion is abundantly expressed and highly localized to the nucleus in a wide variety of cell types although not in all cell types (Figure 3C).
Wee-P insertions, in general, do not disrupt the genes to which they fuse: For the Wee-P fusion events to be most useful, it is important that the presence of the Wee-P element not disrupt the function of the gene to which the fusion has occurred. We have attempted to address these issues by undertaking a genetic analysis of a representative subset of Wee-P fusion lines (Table 2). For nine of the fusion events that we analyzed genetically, there was no observable phenotype in animals homozygous for the Wee-P insertion. In three cases we performed our phenotypic analysis on the chromosome bearing the Wee-P element placed in trans over either a deficiency that uncovered the locus or a null allele of the fused gene. In all three cases none of the predicted phenotypes were observed. Thus the absence of phenotypes in this random sampling of Wee-P lines suggests that Wee-P insertions are largely nonmutagenic.
For some of the genes for which we have fusion events, an allelic series has been previously described. For example, null alleles of the chaperone gene Hsc70-4 die as first instar larvae, strong hypomorphic alleles die as late larvae-early pupae, and the weakest hypomorphic alleles die as early pupae-adults. The few adults of the weakest hypomorphic Hsc70-4 alleles that manage to eclose all have visibly rough eyes, bent wings, and thin, short bristles (FlyBase 1999; Bronk et al. 2001). When Wee-P1 is crossed to two different null alleles of Hsc70-4, transheterozygous, fertile adults eclose at a wild-type frequency and they show no morphological changes to their eyes, bristles, or wings, suggesting that the presence of the GFP in Wee-P1 has not disrupted Hsc70-4 function. Another example is Wee-P6, which forms a fusion with the protein Belle, an RNA helicase. Null alleles of Belle are lethal while hypomorphic alleles produce sterile adults (Jones and Rawls 1988; Castrillon et al. 1993). When Wee-P6 is crossed to Df(3R)P13, transheterozygotes of the Wee-P6/Df(3R)P13 genotype eclose at wild-type ratios and both males and females are fertile.
Table 2. Summary of genetic analysis on representative Wee-P lines
There are several possible explanations for the lack of observable phenotypes in the examples presented above. First, the added GFP exon may be inert. In the literature there are now many examples of GFP-tagged proteins that are fully functional (e.g., Sweeney and Davis 2002). Alternatively, since both wild-type and GFP incorporated transcripts are usually generated (Table 1), normal function may be contributed by the wild-type transcript. However, this may not account for the lack of phenotype in examples where weak hypomorphic phenotypes should be observed such as Hsc70-4 (see above). A list of the Wee-P lines examined genetically, as well as what is known about any alleles of the fused genes, is given in Table 2.
DISCUSSION
We have developed a broadly applicable technique for systematically tagging proteins with GFP on a genome-wide scale. Importantly, the tagged proteins are governed by their endogenous transcriptional and translational controls. Fusion proteins generated with this technology show the predicted tissue-level expression pattern and the correct subcellular localization allowing microdomains of protein localization to be visualized in a living organism. We anticipate that this technology will enable a new generation of experiments to be pursued in Drosophila. It may prove possible to use these reagents to study protein localization, trafficking, turnover, concentration, and translation within an in vivo genetic system without the caveat of protein overexpression that often precludes reliable experimental interpretation.
Wee-P technology is most significant if a tagged protein retains all of the features of tissue expression and subcellular localization for the native protein, and if the fusion protein functions normally. Our analysis of Wee-P1:GFP:Hsc70-4 and other fusion lines demonstrates that Wee-Ps can accurately reflect the expression patterns of the genes they tag (Figures 2 and 4). In addition, our findings that none of the representative 12 Wee-P lines analyzed have phenotypes suggests that Wee-Ps are generally not mutagenic. Thus, Wee-P technology should allow for rapid and accurate assessments of the temporal and spatial patterns of genes.
Although ∼40% of the GFP-expressing lines collected in the initial screen are not true fusions, in the great majority of cases this fact can be discovered after just the IPCR step. This is because the insertion site of the Wee-P element immediately indicates whether a true protein-trapping event has occurred or whether some other event, such as fusion with a UTR exon, has occurred. Thus, whether or not a GFP-expressing line is a true fusion can be determined rapidly, without a time-consuming molecular analysis.
A database of GFP-tagged proteins generated with Wee-P technology can have many potential applications. First, GFP-tagged proteins can reveal new patterns of gene expression for novel genes, as exemplified by Wee-P20:CG9894, as well as previously characterized genes, as exemplified by the tracheal and chordotonal expression of Wee-P3:GFP:Cib (Figures 2 and 3). Second, these reagents allow for subcellular protein localization to be characterized in detail in the living animal. Third, fusion events can tag specific splice variants of a gene, allowing for detailed analysis that is otherwise difficult. Such isoform-specific tagging has occurred in our two Wee-P:GFP:Kis lines, in which only the larger isoform has been tagged. Fourth, the Wee-P will tag proteins regardless of their size, enabling fusion events with large proteins to be generated, which are otherwise difficult reagents to create. Our Wee-P:GFP:Kis lines, which have tagged a 17-kb open reading frame, are also good examples of this point. Finally, patterns of GFP expression in specific cell types can be reagents to study developmental processes such as axon guidance, synaptic plasticity, or tracheal development (see Wee-P3:GFP:Cib and Wee-P1:GFP:Hsc70-4 in Figure 2 and Figure 4, respectively).
Several important considerations must be taken into account when using Wee-P fusions. Most important is the question of the retention of wild-type protein function in the fused lines. Our phenotypic analysis of 12 of the Wee-P lines indicated no alteration to function and suggests that Wee-P insertions are largely nonmutagenic. However, we anticipate that some fusion events will alter protein function, mostly in those cases where the GFP disrupts a critical domain of the protein. Although exons are frequently modular, encoding distinct motifs within a gene (Long et al. 1995), there are instances to the contrary.
Our goal is to generate as many GFP-tagged proteins as possible with Wee-P elements. We anticipate making these reagents freely available so that other workers can perform similar screens for genes in their area of interest, and we are constructing a website to facilitate widespread use of this technology.
Acknowledgments
We thank G. Ruiz and S. Gandhi for helpful assistance. P.J.C. was supported by a postdoctoral fellowship from the Medical Investigation of Neurodevelopmental Disorders Institute at the University of California, Davis. S.T.S. is a Wellcome Prize Traveling Fellow (058327/Z/99/Z). This work was supported by a Burroughs Wellcome Young Investigator Award, a Merck Scholar Award, and a National Institutes of Health grant (44908-32374) to G.W.D.
Footnotes
- Communicating editor: K. Anderson
- Received March 31, 2003.
- Accepted August 11, 2003.
- Copyright © 2003 by the Genetics Society of America