Genetics, Vol. 148, 233-242, January 1998, Copyright © 1998, Genetics Society of America

Conserved Subfamilies of the Drosophila HeT-A Telomere-Specific Retrotransposon

Olga N. Danilevskaya, Ky Lowenhaupt, and Mary Lou Pardue
Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Corresponding author: Mary Lou Pardue, Department of Biology 68-670, Massachusetts Institute of Technology, Cambridge, MA 02139, mlpardue@mit.edu (E-mail).

Communicating editor: V. G. FINNERTY


*  ABSTRACT

HeT-A, a major component of Drosophila telomeres, is the first retrotransposon proposed to have a vital cellular function. Unlike most retrotransposons, more than half of its genome is noncoding. The 3' end contains >2.5 kb of noncoding sequence. Copies of HeT-A differ by insertions or deletions and multiple nucleotide changes, which initially led us to conclude that HeT-A noncoding sequences are very fluid. However, we can now report, on the basis of new sequences and further analyses, that most of these differences are due to the existence of a small number of conserved sequence subfamilies, not to extensive sequence change during each transposition event. The high level of sequence conservation within subfamilies suggests that they arise from a small number of replicatively active elements. All HeT-A subfamilies show preservation of two intriguing features. First, segments of extremely A-rich sequence form a distinctive pattern within the 3' noncoding region. Second, there is a strong strand bias of nucleotide composition: The DNA strand running 5' to 3' toward the middle of the chromosome is unusually rich in adenine and unusually poor in guanine. Although not faced with the constraints of coding sequences, the HeT-A 3' noncoding sequence appears to be under other evolutionary constraints, possibly reflecting its roles in the telomeres.


THE telomere-specific retrotransposon, HeT-A, is unusual because a large part of its DNA is noncoding; the 3' half of the element has no significant open reading frames (BIESSMANN et al. 1994; DANILEVSKAYA et al. 1994; PARDUE et al. 1996A). Telomeres on Drosophila chromosomes contain tandem arrays of complete and partial HeT-A elements. Because partial HeT-A elements are preferentially truncated from the 5' end, the 3' noncoding regions are overrepresented in telomeres. We speculate that the enrichment of the 3' noncoding sequences might have a significant effect on the chromatin structure of this part of the chromosome. The 3' noncoding DNA shows a pattern of nucleotide repeats that, although imperfect, are highly conserved between elements. This conservation suggests that these sequences may play a role in forming chromatin structures by specific protein binding. This suggestion is reinforced by the fact that complete HeT-A elements are found only in telomere regions, regions identified as heterochromatic by MULLER 1938. The association of HeT-A with heterochromatin is further bolstered by a second unusual finding about this element: long segments of the 3' noncoding region (without associated coding regions and 5' ends) have been incorporated into families of tandem repeats found at several loci along the length of the heterochromatic Y chromosome (DANILEVSKAYA et al. 1993). Although the noncoding sequence of HeT-A is well represented in two different heterochromatic environments, telomeres and the Y chromosome, no HeT-A-related sequences have ever been found in euchromatin.

Our initial study of HeT-A element sequences (BIESSMANN et al. 1992) was based on 3' noncoding sequences from five elements: Two of these elements had transposed onto broken ends shortly before they were cloned, and three were from established telomeres. (One from the second group, A4-4, is only 132 bp long. Because the 3' noncoding regions of the other four elements in this study are 1.5–2.7 kb, we will not include A4-4 in our analyses.) These noncoding sequences showed almost as much sequence conservation as was seen for HeT-A coding regions in a later study (PARDUE et al. 1996B). Pairwise comparisons of the 3' noncoding sequences showed greater than 75% sequence identity between all elements when gapped regions were omitted (BIESSMANN et al. 1992). HeT-A coding regions have been found to differ by as much as 20% in both nucleotide and amino acid sequences (PARDUE et al. 1996B). It is important to note that both the 3' noncoding and the coding region studies included elements that had transposed shortly before they were cloned and therefore should be young, active elements. The recently transposed elements were no less diverged than the elements from established telomeres. We conclude that the sequence differences seen among these copies are not primarily a result of decay of older sequences.

Despite their sequence conservation, the four large noncoding regions in the first study differed because of insertions and/or deletions, some of which were of significant size. The pairwise comparisons required 16–27 gaps for each kilobase of sequence aligned. Gaps were distributed over the noncoding region, but it was notable that insertions and/or deletions in one element frequently overlap or coincide with insertions and/or deletions in other elements. Most of these changes must have resulted from different events. Therefore, the spatial coincidences of the changes suggested that, although these sequences are noncoding, they must be under some type of selection influencing either the production or the survival of aberrations and limiting these aberrations to certain regions.

Our early study suggested that HeT-A sequences might be rapidly changing. High mutability has been documented for transposable elements and viruses that, like HeT-A, have an RNA-templated stage in their replication cycle (PRESTON 1996; PRESTON and DOUGHERTY 1996; CASACUBERTA et al. 1995). Changes are usually thought to result from high error rates during RNA transcription or reverse transcription. On the other hand, the repeats added to chromosomes by telomerase are also RNA-templated and, within each species, these repeats are remarkable for their lack of variation (BLACKBURN 1992). HeT-A is a retrotransposable element, but it also has a role in chromosome structure much like that of the telomerase-generated repeats. Because of this chromosomal role, HeT-A might show less variation than is typically seen for transposable elements. HeT-A's unusual situation as both an important component of the telomeres and as a retrotransposable element makes it interesting to understand the nature of sequence variation among HeT-A elements. We now report additional HeT-A sequences and analyses showing that the differences seen between elements reflect a limited number of replicatively active HeT-A subfamilies, rather than rapid sequence variation during transposition.


*  MATERIALS AND METHODS

DNA sequencing:
3' noncoding regions from elements 23Zn1 and 23Zn3 were subcloned from the λ23Zn clone (DANILEVSKAYA et al. 1992) and sequenced by the dideoxy chain-termination technique (SANGER et al. 1977) using the Sequenase kit, version 2.0 (United States Biochemical Corp., Cleveland, OH). Sequence of both strands was obtained for all subclones. The new sequences lie adjacent to sequences from this same clone previously recorded as accession number U06920. Therefore, the new sequences have been added to the U06920 entry in GenBank.

DNA sequence analysis:
Sequence from the 23Zn elements was compared to previously published sequences from our laboratory and other laboratories (see Table 1 for description of sequences included). Analyses were made with the WinGenesis programs (Team Associates, Westerville, OH) and programs from the University of Wisconsin Genetics Computer Group (DEVEREUX et al. 1984). Dotplots in Figure 2 and Figure 4 were made with WinGenesis programs using a window of 20, a criterion of 7, and the Unitary cost matrix. The dotplot of the entire element (Figure 3) was made on a VAX running the Genetics Computer Group program with a window of 21 and a stringency of 14, giving a plot essentially equivalent to the smaller plots. Pairwise alignments were made using the Unitary cost matrix. Multiple-sequence alignments were made with the Multalin program, version 3.0 (CORPET 1988), using the default parameters. Pairwise comparisons were terminated at the end of the shorter element in each pair for quantitation of divergence. Nucleotide (nt) divergence was calculated from the alignments using only those positions where neither sequence had a gap.
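The divergence calculation described above can be illustrated with a minimal Python sketch. It is not the WinGenesis or Genetics Computer Group code; the function names, the "-" gap symbol, and the toy alignment are assumptions made for illustration only. Counting each contiguous run of gap characters as one gap is likewise an assumption about how gaps were tallied.

```python
def ungapped_divergence(aligned_a, aligned_b, gap="-"):
    """Fraction of mismatches among columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # gapped columns are excluded from the calculation
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared


def count_gaps(aligned_a, aligned_b, gap="-"):
    """Count gap runs; each contiguous run in either sequence counts once (assumed convention)."""
    gaps = 0
    for seq in (aligned_a, aligned_b):
        in_gap = False
        for ch in seq:
            if ch == gap and not in_gap:
                gaps += 1
            in_gap = ch == gap
    return gaps


# Hypothetical toy alignment, not HeT-A sequence.
toy_a = "ACGT-ACGTA"
toy_b = "ACCTTAC--A"
toy_divergence = ungapped_divergence(toy_a, toy_b)  # 1 mismatch in 7 ungapped columns
toy_gaps = count_gaps(toy_a, toy_b)                 # one gap run in each sequence
```

Sequence identity over the aligned, ungapped positions is then 1 minus this divergence.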



Figure 1. Diagram of the head-to-tail array of HeT-A elements in the λ23Zn clone, depicted as the RNA coding strand. The two flanking elements, 23Zn2 and 23Zn3, are truncated by the restriction sites used to clone the DNA. Coding regions are marked in grey. Noncoding regions are white. Arrowheads indicate the oligo(A) segments that form the junction between each element and the next, more proximal element. Lengths of these oligo(A) segments differ from element to element, as expected if the initiation of reverse transcription at the 3' end of the RNA is imprecise. (For examples of this variation see BIESSMANN et al. 1990.) Because of this variation, we do not include these oligo(A) segments in calculations of sequence similarity. When numbering sequences in the 3' to 5' direction, we begin numbering immediately 5' of the oligo(A). The bars below indicate the regions with >99% identity to RT473 (in element 23Zn3) and RT394 (in element 23Zn1).



Figure 2. Dot matrix comparisons of the 3' noncoding regions of several HeT-A elements. All elements have intact 3' ends, but most are truncated by cloning at the 5' end. Therefore, comparison was begun at the 3' end and extended toward the 5' end. The plot has been oriented so that comparison is initiated at the bottom right. Sequence numbering begins at the 3' end (not including the oligo(A) tail) and extends 5'. Comparisons are terminated at the end of the most truncated element in each pair. (A) 23Zn1 is plotted against RT394. The unbroken diagonal shows the near identity of sequence of these two elements over >2.6 kb. The off-diagonal clusters indicate segments of sequence repetition along the elements. (B) 23Zn1 is plotted against RT473 (2.7 kb). Gaps and shifts in the diagonal line demonstrate some of the larger insertions and/or deletions that distinguish the subfamilies represented by these elements. Despite differences, a strong pattern of sequence repetition can be seen in the off-diagonal clusters. (C) 23Zn1 is plotted against HeT-A44P (2.2 kb). The two subfamilies represented here are less different than the two shown in B. (D) HeT-A44P is plotted against 1187 (2.2 kb). These two elements appear to represent a third subfamily. Plots were made by the WinGenesis program using a window of 20, a criterion of 7, and the Unitary cost matrix.



Figure 3. Dot matrix analysis showing patterns of sequence repeats in a complete HeT-A element. When the sequence of 23Zn1 is compared to itself, off-diagonal clusters indicating regions of sequence repetition are seen in the 3' and 5' noncoding regions but are almost entirely absent within the coding region. The diagrams of HeT-A RNA forming the axes show the extent of the coding region between the 5' and 3' noncoding regions. (A)n indicates the 3' oligo(A) on the RNA. The plot has been oriented as in Figure 1 to allow easy comparison of the 3' noncoding regions. Sequence numbering for Figure 3 is 5' to 3'.



Figure 4. Enlargement from the dot matrix comparison of 23Zn1 and RT473 (sequence 750 to 1250 nt from 3' end). The region of 15 clusters surrounding the diagonal in Figure 2B is enlarged to reveal the pattern of segments of repetitive sequences. As in Figure 2, sequence numbering is 3' to 5' and the plot was made from the 3' end. Analysis of sequences in the repetitive pattern shows that their similarity is mainly due to their A-rich nature. See Table 3.


 
Table 1. HeT-A elements compared in this study


*  RESULTS

Sequence comparisons of HeT-A 3' noncoding regions suggest that there are a limited number of HeT-A subfamilies:
Because the first HeT-A elements analyzed (BIESSMANN et al. 1992) showed marked sequence variations, we were surprised to find the next two elements that we sequenced were nearly identical to two members of that initial group. This was the first suggestion that there are conserved subfamilies of HeT-A. Both new elements came from a single clone (λ23Zn) isolated from a stock with normal telomeres (DANILEVSKAYA et al. 1992). λ23Zn was constructed in Moscow from an Oregon R stock that had been in Russia since 1967 (V. GVOZDEV, personal communication). The stock used for λ23Zn can only be distantly related to any of the stocks from which other HeT-A sequences have been derived since those stocks have all been kept in the United States. The DNA cloned in λ23Zn comes from an established telomere (PARDUE et al. 1996B) and appears to have been excised from a head-to-tail tandem array of three complete HeT-A elements (Figure 1). Cloning has removed part of each of the flanking elements. The central element, 23Zn1, is intact, but the downstream element, 23Zn2, is truncated at the 3' end, and the upstream element, 23Zn3, is truncated at the 5' end. (For convenience, we use 5' and 3' designations as for the RNA encoded by HeT-A.)

The 3' noncoding region of element 23Zn1 is nearly identical to the sequence that transposed in the initial healing event of the broken chromosome end in the RT394 stock (BIESSMANN et al. 1992). The two sequences can be aligned continuously over the entire 2686 bp of sequence cloned from element RT394 (Figure 2A). In marked contrast to the earlier sequence comparisons, the 23Zn1 and RT394 alignment required only 18 gaps. Thirteen of these gaps are only single-nucleotide gaps, and the longest gap is 5 nucleotides. The 5-nucleotide gap is within a long homopolymer of dA/dT and might reflect slippage during DNA replication, either in Drosophila or during cloning in Escherichia coli. (Some of the differences between elements might have been introduced in cloning or sequencing. It is not feasible to check apparent errors against genomic DNA because the genome has many copies of HeT-A.) The nucleotide sequences of 23Zn1 and RT394 are 99.5% identical over their entire length (Table 2).


 
Table 2. Pairwise comparisons of 3' noncoding regions of HeT-A elements

Two of the elements in our initial analysis of HeT-A sequences, RT394 and RT473, had transposed to heal two terminally deleted chromosomes less than 4 years before they were cloned (BIESSMANN et al. 1992). (Although "healing" of telomeres has not been defined in molecular terms, we will use healing here to mean the attachment of telomere DNA to broken chromosome ends.) The two transpositions had occurred in the same genetic background, yet the two elements are different enough to require 70 gaps for alignment. Many of the insertions and/or deletions that require these gaps are large, as can be seen when the RT473 sequence is compared with the RT394/23Zn1 sequence by dot matrix analysis (Figure 2B). Even when sequence in the gapped regions is ignored and only aligned sequence is compared, RT473 is significantly more diverged from RT394 and 23Zn1 than RT394 and 23Zn1 are from each other (Table 2). Thus, RT473 appears to define a different subgroup of elements, both on the basis of insertions and/or deletions and on the basis of nucleotide changes. 23Zn3, the element immediately upstream from 23Zn1, belongs to this second subgroup. 23Zn3 has only 80.5% sequence identity to its neighbor 23Zn1 but has 99.3% identity with the RT473 sequence. The 23Zn3/RT473 alignment, like the 23Zn1/RT394 alignment, required very few gaps. Most are single-nucleotide gaps. The longest gap is 4 nt.

The sequence identity in the 23Zn1/RT394 and 23Zn3/RT473 pairs is significantly above that seen in any earlier comparisons of the 3' noncoding regions and appears to justify classifying each pair as a subfamily. The two subfamilies are adjacent in the 23Zn clone, but the healing event in RT394 is independent of the healing event in RT473. Because both healing events occurred in the same experiment, it appears that both subfamilies can be active in the same stock.

Two additional HeT-A sequences have been published from other laboratories. The 1187 element was cloned in a study of the telomere-associated repeats of the X chromosome (KARPEN and SPRADLING 1992), and HeT-A44P was cloned in a walk from telomere-associated regions of chromosome 3R (LEVIS et al. 1993). These elements may identify a third subfamily. For example, the alignment of 23Zn1 to HeT-A44P (Figure 2C) shows gaps and offsets, indicating insertions and/or deletions, that are not seen in the alignment of HeT-A44P to 1187 (Figure 2D). In the regions where sequence can be aligned, HeT-A44P also has a higher level of nucleotide identity with 1187 than do any other elements (Table 2).

The relationships suggested by the dotplots are supported by quantification of the pairwise nucleotide sequence comparisons (Table 2). We have considered base changes separately from insertions and/or deletions because the two types of changes might have different origins and patterns of occurrence. However, the subfamily relationships are the same whether they are determined by the number of base changes or the number of insertions and/or deletions; the two types of changes seem to be accumulating at relatively similar rates. In addition to the 23Zn1/RT394 and 23Zn3/RT473 subfamilies discussed above, HeT-A44P and 1187 show the strong similarity predicted from the dotplot and appear to form a subfamily. TA1 and TA2 may identify two additional subfamilies because they do not show a strong similarity to any of the other elements or to each other. However, these last two elements come from a clone, λT-A, in which sequence rearrangements are so extensive that we think it may have come from an older region of the telomere and have undergone sequence decay. Although differences in sequence identity allow HeT-A elements to be divided into several subfamilies, there is strong sequence conservation across those subfamilies. None of the elements has less than 70% sequence identity with any other element.

Conserved features of HeT-A subfamilies:
There are several distinctive structural features shared by all of the HeT-A subfamilies. The most striking of these is the pattern of sequence repetitions along the length of the 3' noncoding region. These repetitions are easily seen in the dot matrix comparisons as a more or less regular pattern of off-diagonal dots. The most intriguing component of this pattern is a cluster of dots around the diagonal indicating overlapping repeat segments, ~1000 nt from the 3' end of each element. This cluster has three subunits in members of the 23Zn1/RT394 subfamily, four subunits in members of the HeT-A44P/1187 subfamily, and five subunits in members of the 23Zn3/RT473 subfamily. Similar repeat segments are also found, somewhat less distinctly, spaced at intervals along the entire nontranslated region of each element. They are absent in the coding region of the element (Figure 3) but are found again in the 5' nontranslated region. There is also one 0.3-kb region of continuous sequence repeat within the 3' noncoding region of the 23Zn1/RT394 and the 1187/HeT-A44P families.
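To illustrate how internal repeats appear as off-diagonal dots, the following minimal Python sketch builds a window-based dot matrix. It is not the WinGenesis or Genetics Computer Group program. The `min_matches` parameter is hypothetical: it treats the threshold as a minimum number of matching positions per window, which is an assumed reading of the "stringency" setting, and the meaning of the WinGenesis "criterion" is not specified in the text. The toy sequence is also hypothetical.

```python
def dotplot(seq_a, seq_b, window=21, min_matches=14):
    """Return (i, j) window start positions where at least `min_matches`
    of the `window` aligned positions are identical."""
    a, b = seq_a.upper(), seq_b.upper()
    hits = []
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            matches = sum(1 for k in range(window) if a[i + k] == b[j + k])
            if matches >= min_matches:
                hits.append((i, j))
    return hits


# Self-comparison of a toy sequence containing the same 7-nt segment twice.
toy = "GATTACA" + "CCCCC" + "GATTACA"
hits = dotplot(toy, toy, window=7, min_matches=7)

# The main diagonal (i == j) is the sequence matching itself;
# off-diagonal hits mark the repeated segment.
off_diagonal = [(i, j) for i, j in hits if i != j]
```

In a self-comparison the repeated segment produces a symmetric pair of off-diagonal hits, one on each side of the main diagonal. This brute-force scan takes time proportional to the product of the two sequence lengths times the window size, which is adequate for elements of a few kilobases.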

An enlargement of the central cluster of repeated segments from the 23Zn1/RT473 dot matrix comparison is shown in Figure 4. Analysis of the dot matrix in this figure showed 163 segments of 10 or more nucleotides that are detected as overlapping repeat segments at the stringency used for the analysis. The only obvious sequence motif in these segments is an enrichment for oligo(A). In these segments, adenine frequently appears as a homopolymer or in a series of doublets and triplets separated by thymidine or cytosine. The 3' noncoding region of HeT-A RNA (and the coding strand of HeT-A DNA) is adenine-rich and guanine-poor throughout (Table 3). The base composition and marked strand bias are strongly conserved in the eight elements sequenced. When the 163 repeat segments between nt 700 and 1200 of 23Zn1 and RT473 (indicated by diagonal lines on the dot matrix shown in Figure 4) are summed, they show almost a doubling of the already high adenine content. The enrichment comes largely at the expense of guanine, although both cytosine and thymidine are also decreased.
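A base composition of the kind summarized in Table 3 can be computed with a short Python sketch such as the one below. The function names and the toy sequence are hypothetical and are not taken from HeT-A; the sketch only shows how strand bias follows from composition, since an A-rich, G-poor strand necessarily has a T-rich, C-poor complement.

```python
from collections import Counter


def base_composition(seq):
    """Return the fraction of A, C, G and T in `seq` (other symbols are ignored)."""
    counts = Counter(ch for ch in seq.upper() if ch in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: counts[base] / total for base in "ACGT"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Hypothetical A-rich toy segment (11 A, 1 C, 1 G, 2 T).
toy_segment = "AAATAACAAAGAAAT"
plus_strand = base_composition(toy_segment)
minus_strand = base_composition(reverse_complement(toy_segment))
```

Comparing `plus_strand["A"]` with `minus_strand["A"]` gives a simple measure of how unevenly adenine is distributed between the two strands.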


 
Table 3. Base compositions of 3' noncoding regions of HeT-A elements


*  DISCUSSION

The HeT-A family is divided into a limited number of subfamilies:
Analysis of the four new HeT-A sequences discussed here and comparison of these sequences with the original four sequences shows that the 3' noncoding region of HeT-A is less variable than it initially appeared to be (BIESSMANN et al. 1992). Although the four elements in our earlier analysis had no more than 86% sequence identity in any pairwise comparison and many insertions and/or deletions >3 bp, the addition of four sequences to the set reveals two pairs with >99% sequence identity and a third pair with >95% identity. Insertion and/or deletion gaps required for alignment within any of these three pairs are significantly smaller and less frequent than the gaps required for alignment between pairs, supporting the designation of these pairs as subfamilies. Because the small set of HeT-A sequences available contains two members of each of three different subfamilies, we conclude that the number of HeT-A subfamilies is not large.

There appear to be a limited number of replicatively active HeT-A elements:
The conservation of sequences within HeT-A subfamilies is surprising in view of studies on the rate of sequence divergence in other elements that have an RNA-templated step in their replication. If HeT-A elements experience a high level of sequence change at each RNA-templated transposition and every transposed element is replicatively active, errors should be amplified at subsequent transpositions. Such a scenario would lead to a rapidly diverging population of elements in the genome. Our results show a level of sequence conservation within HeT-A subfamilies that does not fit the predictions of this model. Instead the subfamilies are most easily explained by a limited number of replicatively active HeT-A elements. In this case, the majority of elements in the genome would be separated from one of these "master elements" by only one step of reverse transcription, and the progeny of each "master element" would form a subfamily.
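The contrast between the two scenarios can be made concrete with a deliberately simplified Python sketch. It assumes a constant per-site error probability at each RNA-templated copying step and ignores back mutations, multiple changes at one site, insertions and deletions, and selection. The error rate and step counts used here are hypothetical illustrations, not measurements for HeT-A.

```python
def expected_identity(per_step_error, steps):
    """Expected fraction of sites unchanged after `steps` independent copying
    steps, ignoring back mutation and multiple hits at the same site."""
    return (1.0 - per_step_error) ** steps


def master_model_identity(per_step_error):
    """Two copies, each one reverse-transcription step away from the same
    master element, are separated by two copying steps in total."""
    return expected_identity(per_step_error, 2)


def chain_model_identity(per_step_error, transpositions_each):
    """Two lineages that each pass through `transpositions_each` successive
    transpositions after their common ancestor."""
    return expected_identity(per_step_error, 2 * transpositions_each)


# Hypothetical values chosen only to show the direction of the effect.
rate = 0.001          # assumed per-site error probability per copying step
chain_length = 50     # assumed number of successive transpositions per lineage

master = master_model_identity(rate)
chain = chain_model_identity(rate, chain_length)
```

Under these assumptions the expected identity between two copies falls geometrically with the number of copying steps separating them, so the master element scenario preserves high identity while the sequential scenario does not.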

Rapid sequence change has been reported for many elements with an RNA-based step in replication. DRAKE 1993 has calculated that RNA viruses have a spontaneous mutation rate 300-fold higher than that of DNA viruses. Retroviruses are known to be rapidly varying (see PRESTON 1996; PRESTON and DOUGHERTY 1996). Using a reporter system designed to measure changes occurring during a single cycle of replication without strong selection, PARTHASARATHI et al. 1995 measured a mutation rate of 3.3% per cell cycle in murine leukemia virus. Nearly half of the changes detected in these experiments were gross rearrangements. Retrotransposon variation has received much less study; however, a clever reporter system has recently been used to study single cycles of replication of the yeast LTR retrotransposon, Ty1 (GABRIEL et al. 1996). The study found that the mutation rate for Ty1 was comparable to rates seen for retroviruses, although all of the Ty1 mutations were base substitutions; no insertions or deletions were detected. Whether the lack of rearrangements in Ty1 represents a real difference between retrotransposons and retroviruses or simply the idiosyncrasies of the reporter systems used is an open question.

Although it seems reasonable to assume that non-LTR retrotransposons have the same high mutation rates found for other retroelements, no mutation rate measurements have been reported for this class of elements. Our study does not give such rates, but it does set some limits on the mutability of HeT-A elements. The crucial finding is that the two recently transposed elements, RT394 and RT473, are essentially identical to two elements in the λ23Zn clone, 23Zn1 and 23Zn3, and that these two sets of nearly identical elements were found in the first eight elements for which sequence has been obtained. (As mentioned earlier, we ignore the tiny element A4-4, which is noninformative.) The history of the stocks from which the 23Zn (DANILEVSKAYA et al. 1992) and the RT elements (MASON et al. 1984) were derived ensures that the RT transpositions occurred a minimum of 15 years after they could have shared an ancestral element with λ23Zn; for at least 15 years before the RT transpositions occurred, the λ23Zn stock was in Moscow while the RT stock was in the United States. Drosophila are sometimes held at 18° to extend the generation time and decrease the need for stock transfer. If we assume culture at this temperature for the entire 15 years, there would have been a minimum of 130 fly generations between the common ancestor and the RT transpositions. There would have been nearly twice as many generations if flies had been kept at 25° for the entire time. The time when the stock was moved to Moscow (V. GVOZDEV, personal communication) provides a minimum estimate of separation from other Oregon R stocks that might actually have occurred any time after the original collection of the Oregon R stock (1925 or earlier; LINDSLEY and ZIMM 1992).
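The generation estimate above can be checked with simple arithmetic, sketched here in Python. The only inputs taken from the text are the 15 years and the minimum of 130 generations; the year length and the exact halving of the generation interval at 25° are assumptions made for illustration, since the text says only "nearly twice as many".

```python
DAYS_PER_YEAR = 365.25  # assumed average year length


def implied_generation_days(years, generations):
    """Average interval between generations implied by a generation count."""
    return years * DAYS_PER_YEAR / generations


def generations_in(years, generation_days):
    """Number of generations that fit into `years` at a given interval."""
    return years * DAYS_PER_YEAR / generation_days


# 15 years and 130 generations imply an interval of about 42 days at 18°.
slow_interval = implied_generation_days(15, 130)

# If the interval were exactly halved at 25° (an assumption), the count would
# double; the text's "nearly twice as many" suggests somewhat fewer than this.
fast_generations = generations_in(15, slow_interval / 2)
```

The implied interval of roughly six weeks per generation is consistent with stocks being transferred infrequently at the lower temperature.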

It is worth noting that the λ23Zn elements were from an established telomere; therefore, we have no way to estimate when they transposed. Transposition of RT394 and RT473 was onto terminally deleted chromosomes produced by J. Mason, and the times of their transpositions are known (BIESSMANN et al. 1990; MASON et al. 1984). The RT394 and RT473 transpositions show that both of these subfamilies were active within the same stock at about the same time. Whether the two subfamilies are active within the same individual is an open question.

The differences between RT394 and 23Zn1, as well as those between RT473 and 23Zn3, are smaller than would be expected if HeT-A had a high mutation rate during the RNA-templated stage and most products of transposition then went on to serve as donors for subsequent transpositions. This situation would produce sequential amplifications of changes, and would not maintain >99% identity over the time period that separates RT394 from 23Zn1 or RT473 from 23Zn3. This high level of identity would be more easily explained if all members of a subfamily descended from the same replicatively active master element. Master elements have been implicated in replication of other non-LTR retrotransposons; sequence analyses of mammalian LINES-1 elements suggested that only a small number of elements give rise to the population (HARDIES et al. 1986). More recent evidence indicates that the dynamics of determining which LINES-1 elements within the genome are replicatively active may be complex (ADEY et al. 1994; FURANO and USDIN 1995).

An alternative explanation for the high level of similarity within HeT-A subfamilies might be that the HeT-A RNA step is relatively free of error. We think it less likely that fidelity of reverse transcription can be the major reason for strong sequence conservation; however, it may contribute to the result. There is some precedent for differences in the error rates of different reverse transcriptases; the reverse transcriptase of human immunodeficiency virus (HIV-1) is 10-fold less accurate than that of avian myeloblastosis virus (AMV) and has a different spectrum of errors (BEBENEK et al. 1989).

A pattern of A-rich regions and a strong strand bias are conserved in the HeT-A subfamilies:
Dot matrix analyses reveal a striking pattern of sequence repeats extending through the noncoding portion of HeT-A but ending at the junction with the open reading frame. Despite the insertions, deletions, and nucleotide changes that differentiate the HeT-A subfamilies, the repeat pattern is conserved, although the number of subunits in the central cluster varies from three to five in the subfamilies studied. Regions identified as repeats appear to be an extreme manifestation of the strong strand bias that is another conserved feature of the subfamilies. In HeT-A elements, the strand running 5' to 3' toward the centromere (and the RNA transposition intermediate) is extremely A-rich and G-poor. Base compositions of all of the elements studied differ very little (Table 3). This strand bias holds for both the coding and the noncoding regions of HeT-A. It is interesting that one of the features frequently noted for telomerase-generated repeats is a strong strand bias of base composition.

The repeats detected in the dot plot are regions where the adenine composition has risen to almost two thirds of the total and guanine is nearly absent. The conservation of these regions of aberrant base composition suggests a role in chromatin structure. Short homopolymer A runs are frequently involved in DNA bending (KOO et al. 1986). The HeT-A regions might serve this function or they might serve as binding sites for proteins. In either case, it remains a puzzle why all of the A-rich regions are on the same DNA strand over such a long distance. Neither bending nor protein binding would be expected to recognize a single strand over the length of these 3' regions. It would seem that either bending or binding could be accomplished by A-rich regions on one strand in some places and on the other strand elsewhere. Such strand alternation is not seen for HeT-A. The strand bias continues over the entire length of the element, including the coding region, suggesting that there are other influences involved in maintaining the nucleotide distribution. Because there is an RNA phase in HeT-A transposition, we speculate that the strand bias is determined by heretofore unsuspected nucleotide preferences of either RNA polymerase or reverse transcriptase.

Like the sequences added by telomerase to telomeres of most organisms, HeT-A serves to extend the chromosomes and thereby compensate for any sequence loss, protecting distal genes. If its only function is to protect the chromosome end, why does HeT-A have such an unusual and conserved 3' noncoding region? Typical retrotransposons contain little except sequence coding for their own transposition machinery. It would seem that any of the typical elements could serve as an adequate sequence buffer, if buffering were the only role of telomere elements. The finding that nearly half the genome of HeT-A consists of noncoding sequences raises the possibility that this element has more than one role at the telomere. The patterns of sequence conservation revealed by our studies support the hypothesis that there are evolutionary pressures for maintenance of the 3' noncoding sequences of HeT-A elements. We have suggested that one additional role for these sequences is in controlling chromatin structure, but, in any event, understanding these pressures will give new insights into the multiple roles of telomeres.


*  ACKNOWLEDGMENTS

This work has been supported by grant GM50315 from the National Institutes of Health. We are grateful to P. G. DEBARYSHE, N. C. HOGAN and K. L. TRAVERSE for comments on the manuscript.

Manuscript received April 28, 1997; Accepted for publication September 15, 1997.


*  LITERATURE CITED

ADEY, N. B., S. A. SCHICHMAN, D. K. GRAHAM, S. N. PETERSON, and M. H. EDGELL et al., 1994  Rodent L1 evolution has been driven by a single dominant lineage that has repeatedly acquired new transcriptional regulatory sequences. Mol. Biol. Evol. 11:778-789.

BEBENEK, K., J. ABBOTTS, J. D. ROBERTS, S. H. WILSON, and T. A. KUNKEL, 1989  Specificity and mechanism of error prone replication by human immunodeficiency virus-1 reverse transcriptase. J. Biol. Chem. 264:16948-16956.

BIESSMANN, H., J. MASON, K. FERRY, M. D'HULST, and K. VALGEIRSDOTTIR et al., 1990  Addition of telomere-associated HeT DNA sequences "heals" broken chromosome ends in Drosophila. Cell 61:663-673.

BIESSMANN, H., K. VALGEIRSDOTTIR, A. LOFSKY, C. CHIN, and B. GINTHER et al., 1992  HeT-A, a transposable element specifically involved in "healing" broken chromosome ends in Drosophila. Mol. Cell. Biol. 12:3910-3918.

BIESSMANN, H., B. KASRAVI, T. BUI, G. FUJIWARA, and L. E. CHAMPION et al., 1994  Comparison of two active HeT-A retroposons of Drosophila melanogaster. Chromosoma 103:90-98.

BLACKBURN, E. H., 1992  Telomerases. Annu. Rev. Biochem. 61:113-129.

CASACUBERTA, J. M., S. VERNHETTES, and M. A. GRANDBASTIEN, 1995  Sequence variability within the tobacco retrotransposon TnT1 population. EMBO J. 14:2670-2678.

CORPET, F., 1988  Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res. 16:10881-10890.

DANILEVSKAYA, O. N., D. A. PETROV, M. A. PAVLOVA, A. KOGA, and E. V. KURENOVA et al., 1992  A repetitive DNA element, associated with telomeric sequences in Drosophila melanogaster, contains open reading frames. Chromosoma 102:32-40.

DANILEVSKAYA, O., A. LOFSKY, E. V. KURENOVA, and M. L. PARDUE, 1993  The Y chromosome of Drosophila melanogaster contains a distinctive subclass of HeT-A-related repeats. Genetics 134:531-543.

DANILEVSKAYA, O., F. SLOT, M. PAVLOVA, and M. L. PARDUE, 1994  Structure of the Drosophila HeT-A transposon: a retrotransposon-like element forming telomeres. Chromosoma 103:215-224.

DEVEREUX, J., P. HAEBERLI, and O. SMITHIES, 1984  A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12:387-395.

DRAKE, J. W., 1993  Rates of spontaneous mutation among RNA viruses. Proc. Natl. Acad. Sci. USA 90:4171-4175.

FURANO, A. and K. USDIN, 1995  DNA "fossils" and phylogenetic analysis. J. Biol. Chem. 270:25301-25304.

GABRIEL, A., M. WILLEMS, E. H. MULES, and J. D. BOEKE, 1996  Replication infidelity during a single cycle of Ty1 retrotransposition. Proc. Natl. Acad. Sci. USA 93:7767-7771.

HARDIES, S. C., S. L. MARTIN, C. F. VOLIVA, C. A. HUTCHISON, III, and M. H. EDGELL, 1986  An analysis of replacement and synonymous changes in the rodent L1 repeat family. Mol. Biol. Evol. 3:109-125.

KARPEN, G. H. and A. S. SPRADLING, 1992  Analysis of subtelomeric heterochromatin in the Drosophila minichromosome Dp1187 by single P element insertional mutagenesis. Genetics 132:737-753.

KOO, H. F., H.-M. WU, and D. M. CROTHERS, 1986  DNA bending at adenine-thymine tracts. Nature 320:501-506.

LEVIS, R. W., R. GANESAN, K. HOUTCHENS, L. A. TOLAR, and F.-M. SHEEN, 1993  Transposons in place of telomeric repeats at a Drosophila telomere. Cell 75:1083-1093.

LINDSLEY, D. L. and G. G. ZIMM, 1992  The Genome of Drosophila melanogaster. Academic Press, San Diego.

MASON, J. M., E. STROBEL, and M. M. GREEN, 1984  mu-2: mutator gene in Drosophila that potentiates the induction of terminal deficiencies. Proc. Natl. Acad. Sci. USA 81:6090-6094.

MULLER, H. J., 1938  The remaking of chromosomes. Collect. Net 13:182-198.

PARDUE, M. L., O. N. DANILEVSKAYA, K. LOWENHAUPT, F. SLOT, and K. L. TRAVERSE, 1996a  Drosophila telomeres: new views on chromosome evolution. Trends Genet. 12:48-52.

PARDUE, M. L., O. N. DANILEVSKAYA, K. LOWENHAUPT, J. WONG, and K. ERBY, 1996b  The gag coding region of the Drosophila telomeric retrotransposon, HeT-A, has an internal frame shift and a length polymorphic region. J. Mol. Evol. 43:572-583.

PARTHASARATHI, S., A. VARELA-ECHAVARRIA, Y. RON, B. D. PRESTON, and J. P. DOUGHERTY, 1995  Genetic rearrangements occurring during a single cycle of murine leukemia virus vector replication: characterization and implications. J. Virol. 69:7991-8000.

PRESTON, B. D., 1996  Error-prone retrotransposition: rime of the ancient mutators. Proc. Natl. Acad. Sci. USA 93:7427-7431.

PRESTON, B. D. and J. P. DOUGHERTY, 1996  Mechanisms of retroviral mutation. Trends Microbiol. 4:16-21.

SANGER, F., S. NICKLEN, and A. R. COULSON, 1977  DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.



