Rider is a novel and recently active Ty1-copia-like retrotransposon isolated from the T3238fer mutant of tomato. Structurally, it is delimited by a duplication of target sites and contains two long terminal direct repeats and an internal open reading frame, which encodes a Ty1-copia-type polyprotein with characteristic protein domains required for retrotransposition. The family of Rider elements has an intermediate copy number and is scattered in the chromosomes of tomato. Rider family members in the tomato genome share high sequence similarity, but different structural groups were identified (full-size elements, deletion derivatives, and solo LTRs). Southern blot analysis in Solanaceae species showed that Rider was a Lycopersicon-specific element. Sequence analysis revealed that among other plants, two Arabidopsis elements (named as Rider-like 1 and Rider-like 2) are most similar to Rider in both the coding and noncoding regions. RT–PCR analysis indicates that Rider is constitutively expressed in tomato plants. The phylogeny-based parsimony analysis and the sequence substitution analyses of these data suggest that these Rider-like elements originated from a recent introgression of Rider into the tomato genome by horizontal transfer 1–6 million years ago. Considering its transcriptional activity and the recent insertion of the element into at least two genes, Rider is a recently active retrotransposon in the tomato genome.
Long terminal repeat (LTR) retrotransposons are structurally similar to retroviruses and can move within a host genome from one chromosome position to another through an RNA intermediate by a “copy-and-paste” mechanism (Kumar and Bennetzen 1999). LTR retrotransposons form a large group of mobile elements that are distinct from DNA transposons, which move as a DNA molecule. A typical complete LTR retrotransposon has an open reading frame (ORF) in the internal region flanked by two LTRs. Normally, the ORF is divided into two regions, gag and pol; the former encodes an RNA-binding protein gag, which forms a virus-like particle, and the latter encodes several proteins, such as protease, integrase, reverse transcriptase, and RNase H, which are required for cDNA synthesis and integration of cDNA into host chromosomes (Boeke and Corces 1989). On the basis of domain order within the pol protein and sequence similarity, LTR retrotransposons are further classified as either the Ty3-gypsy- (Metaviridae) or the Ty1-copia- (Pseudoviridae) type (Kumar and Bennetzen 1999). Both types of retrotransposons are found in high numbers in the genomes of higher plants (from 5.5% of the Arabidopsis thaliana genome to >50% of the genomic content of Zea mays) (SanMiguel et al. 1996; Arabidopsis Genome Initiative 2000). There is increasing evidence that LTR retrotransposons are a major driving force for genome evolution and that they contribute to genome organization as well as to the regulation of gene activity (Flavell et al. 1992). To date, a series of LTR retrotransposons have been isolated and characterized from a wide range of plant taxa, from monocotyledonous to dicotyledonous plants. The Ty1-copia elements show a broad insertion pattern, heterogeneity, and sequence variability (Kumar and Bennetzen 1999). Some of these elements display an insertion bias toward coding regions (Flavell et al. 1992).
Most of the isolated Ty1-copia retrotransposons are nonfunctional due to the presence of stop codons, frameshift mutations, or deletions, and only a few Ty1-copia retrotransposons characterized from plants have been shown to be active, such as Tnt1 and Tto1 from tobacco, the Tos elements from rice, and BARE-1 from barley (Flavell et al. 1992). The activity of these elements was detected only during certain stages of plant development or stress conditions, indicating that their transposition is regulated at the transcriptional level (Flavell et al. 1992). To better understand the structure and diversity of LTR retrotransposons as well as their impact on genomes, it is necessary to isolate and characterize additional active elements. The discovery of those active LTR retrotransposons not only could lead to a better understanding of retrotransposon gene expression and transposition, but also could be useful as genetic markers (Ellis et al. 1998).
Vertical and horizontal (cross-species) transfer are postulated as two alternative ways in which LTR retrotransposons evolve. Vertical transmission by descent from a common ancestor has been proposed as the main distribution mechanism of LTR retrotransposons (Jordan and McDonald 1998), while the data on horizontal transfer, the process by which DNA elements move between distinct organisms, have been less conclusive because the events occurred too long ago or between such closely related species that the alternative vertical transmission hypothesis cannot be completely eliminated (Jordan and McDonald 1998; Jordan et al. 1999). In prokaryotes, a significant proportion of genetic diversity is based on the acquisition of sequences from distantly related organisms. Lateral gene transfer plays an integral role in the evolution of bacterial genomes and in the diversification and speciation of the enterics and other bacteria (Ochman et al. 2000). However, in eukaryotes horizontal gene transfer is much less studied. The few best-documented examples of horizontal transfer in animals involve the P element (Silva and Kidwell 2000), mariner (Robertson et al. 1998), and a copia element (Jordan et al. 1999) as well as other elements (Silva et al. 2004). In plants, horizontal transfer of a transposable Mu-like element (MULE) between Setaria and rice and of the retrotransposon RIRE1 within the genus Oryza has been documented (Diao et al. 2006; Roulin et al. 2008).
In this article, we report the isolation and characterization of the copia-like element Rider from T3238fer, a spontaneous iron-inefficient mutant of tomato (Brown and Chaney 1971). Our results reveal that it is a new, Lycopersicon-specific type of Ty1-copia retrotransposon. It might have originated from the Arabidopsis genome by a horizontal transfer event and appears to retain transpositional activity.
MATERIALS AND METHODS
The cosmid library of T3238fer was constructed with a pWEB cosmid cloning kit following the manufacturer's instructions (EPICENTRE Biotechnologies): ∼20 μg of genomic DNA was first sheared by passing it through a standard pipette tip. After repairing fragments to generate blunt ends, 30- to 40-kb DNA fragments were selected on a low-melting-point agarose gel by comparison with a supplied 36-kb DNA standard and purified according to the instructions of the manufacturer. Subsequently, the selected DNA fragments were ligated into the supplied blunt-ended Cloning-Ready pWEB cosmid vector, packaged using ultra-high efficiency Max-Plax Lambda Packaging Extracts (>10⁹ pfu/mg for phage λ), and plated on the included EPI100-T1R phage T1-resistant Escherichia coli plating strain. The library screening was performed with an improved PCR-based library screening method (Cheng et al. 2004). Two pairs of gene-specific primers flanking the two ends of the insertion event of FER were designed and used for the library screening. They are left1 5′-GCTGCAATGTGTCGCCCTTT-3′, left2 5′-GCTTTGCGATCCTTAGTTC-3′, right1 5′-CCTCTCCTTTTGCGCTCCGA-3′, and right2 5′-GTTATGCATATTGGGCTTATTAAT-3′. The insertion fragment was sequenced by primer walking with an Applied Biosystems ABI Prism 3700 DNA Analyzer and the BigDye Terminator kit v3.1, following the standard instructions of the manufacturer.
To identify the LTR and target-site duplication (TSD) sequences as well as the internal domains, the sequence was analyzed by comparison with itself and Tnt1-94 (X13777) with the Dotter program (Sonnhammer and Durbin 1995). The tRNA-Met sequence used for the identification of the primer binding site (PBS) locus was obtained from a tRNA database (Lowe and Eddy 1997) (http://rna.wustl.edu/tRNAdb/), and a search for conserved protein domains was carried out with RPS-Blast (Marchler-Bauer et al. 2003) (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi). The insertion times of LTR retrotransposons with both LTRs were determined in a manner similar to the method by Ma et al. (2004). For each complete copy, the two LTRs were aligned using the GCG program, and the nucleotide substitution (transitions + transversions) rate between the two LTRs was estimated by the Kimura two-parameter method of MEGA software (indels and microsatellites were not taken into account in estimating these divergence rates) (Kumar et al. 2004). LTR divergence rates were then converted into dates using the average substitution rate of 6.96 × 10⁻⁹ substitutions/synonymous site/year (Ma et al. 2004). BLAST searches (Altschul et al. 1997) with a nucleotide sequence of the Rider-1 reverse transcriptase (RT) domain against public and local nucleotide databases were used for the identification of related elements with an E-value threshold of E < 1 × 10⁻¹⁰. The sequence of the conserved RT domain extracted from each identified element was defined as described by Xiao et al. (2008), and the RT neighbor-joining tree (Poisson correction) was constructed using MEGA 2.1 (Kumar et al. 2004) based on ClustalX alignments.
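The dating step described above can be sketched in a few lines of Python. This is a minimal illustration, not the MEGA implementation: the function names are hypothetical, the Kimura two-parameter formula is the standard textbook form, and the conversion T = K/(2r) is an assumption (the usual convention for LTR pairs, since each LTR accumulates substitutions independently); the text itself states only that divergence was converted into dates with r = 6.96 × 10⁻⁹.

```python
import math

# Purine<->purine and pyrimidine<->pyrimidine changes are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.

    Columns with a gap or any non-ACGT symbol are skipped, mirroring the
    statement that indels were not taken into account.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites      # proportion of transitional differences
    q = transversions / sites    # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def insertion_time_years(distance, rate=6.96e-9):
    """Convert LTR divergence into years, assuming T = K / (2 * rate)."""
    return distance / (2 * rate)
```

For a pair of identical LTRs the distance is zero, so the estimated insertion time is zero; for diverged pairs, the result depends on the alignment actually used and need not reproduce the published table to the last digit.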
The GenBank accession numbers for the sequences used in the phylogenetic analysis are as follows: Rider, EU195798; TY4, M94164; pCal, AF007776; Tdh5, AJ439552; Le.Copia, AY678298; TLC1.1, AF279585; Tnt1, X13777; Rider-like 1, AL138663; At.Copia2, AF287471; Ta1.1, X13291; Ta1.3, X53973; AtRE1, AB021263; AtRE2.1, AB021266; Os.Copia1, NM_001072329; Os.Copia2, U72726; Osr1, AB046118; Zm.Copia, AF466202; Stonor, AF082134; Ins2, AF434192; BARE, Z17327; Osser, X69552; CIRE1.1, AM040263; MIRE1, AY196987; Tgmr, U96748; Ta.Copia, DQ890165; Tar1, AB008772; Ib.Copia, AY830138; Pb.Copia, AJ416708.
To perform the fluorescence in situ hybridization (FISH) and fiber-FISH, immature tomato flower buds of ∼3.0 mm were harvested and fixed in Carnoy's solution (ethanol:glacial acetic acid = 3:1). Microsporocytes at meiosis were squashed in acetocarmine solution as described by Wu (1967). Slides were frozen in liquid nitrogen. After removing coverslips, they were dehydrated in an ethanol series (70, 90, and 100%) prior to use in FISH. The FISH procedure used for chromosomes was performed according to Jiang et al. (1995). A HindIII restriction fragment that contained the LTR and gag-protease region of Rider was used as the Rider-specific probe and labeled by nick translation with digoxigenin-16-dUTP (Roche). After hybridization and washing, chromosomes and FISH signal images were captured under an Olympus BX61 fluorescence microscope coupled to a microCCD camera.
Genomic DNA extraction was performed as described by Li et al. (2004). For Southern blot analysis, ∼10 μg genomic DNA was separately digested with EcoRI or HindIII. Gel separation, blotting of DNA fragments onto Hybond-N+ membrane (Amersham), and hybridization were performed according to the protocol described by Ling et al. (1996). The probes were synthesized by PCR amplification and labeled with 32P by the Prime-a-Gene System (Promega).
Total RNA was isolated from leaves and roots of 3-week-old T3238 and T3238fer plants and flowers as described by Li et al. (2004) and treated with RNase-free DNase I (Promega) to eliminate genomic DNA, and then 2 μg/sample was used for reverse transcription with the M-MLV RT kit (Promega) according to the manufacturer's instructions. The reactions were diluted sixfold for RT–PCR analysis according to the protocol described by Li et al. (2004). The internal fragment-specific primers used in the RT–PCR experiments are internal-forward and internal-reverse, as described for the real-time PCR analysis. The other primers used in this experiment are p-l 5′-GGGCCGACGGTATACAATG-3′, p-r 5′-ACCGAGAGGCTCTGATACCA-3′, and fer 5′-CAAAGGCACGAGGACTGACC-3′. The PCR reaction profile included 26 cycles of 30 sec at 94°, 50 sec at 55°, and 50 sec at 72°, preceded by an initial denaturation (3 min at 94°) and followed by a final extension step (5 min at 72°). Reaction products were separated by agarose gel electrophoresis.
For monitoring the real-time PCR reactions, we used the Mastercycler ep realplex system (Eppendorf) and DyNAmo SYBR Green quantitative PCR (qPCR) kit for fluorescence labeling (New England Biolabs) with optimized PCR protocols. The PCR reaction system (20 μl) contained 10 μl DyNAmo SYBR Green qPCR mix, 0.5 μM forward and reverse primer, and 2 μl DNA template (0–100 ng). The standard amplification profile consisted of 2 min at 95° for initial denaturation and 40 cycles of amplification of 10 sec at 95° for denaturation, 30 sec at 60° for annealing/extension, and 20 sec at 72° with a single fluorescence measurement. Melt curve analysis was performed subsequent to the PCR run to examine the specificity of amplification. After adjusting baseline cycles and calculating threshold values, the number of cycles halfway through the exponential phase (Ct value) was obtained. The relative standard curve method was used for quantification of copy number with a 10-fold dilution series of 4.8 × 10−9 copies/μl of Rider as plasmid DNA. The genome sample was genomic DNA of T3238 (5.0 μg/μl), with water as the negative control. When converting the result of relative copy number into absolute copy number, we used a tomato genome size of 960 Mb (Van der Hoeven et al. 2002). The quantification of each sample was repeated three times. To amplify LTR and internal fragments, the following PCR primer pairs were used: LTR-f, 5′-TGAATCGGACCCGCTACAA-3′; LTR-r, 5′-TGACCGGGCAGCGAGA-3′; internal-f, 5′-TGTCCTAGGAGAGAGTGGTTC-3′; and internal-r, 5′-CATTAGAGAGCATGCACCTTG-3′.
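The relative standard curve calculation can be illustrated with a short Python sketch. It is not the instrument software: the function names are hypothetical, the least-squares fit of Ct against log10(copies) is the generic approach, and the conversion from DNA mass to genome equivalents assumes an average mass of 650 g/mol per base pair, a common approximation that is not stated in the text. The 960-Mb genome size is taken from the text and is treated here as the size of one genome equivalent.

```python
import math

AVOGADRO = 6.022e23
BP_MASS_G_PER_MOL = 650.0  # assumption: average mass of one base pair


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to get template copies in the reaction."""
    return 10 ** ((ct - intercept) / slope)


def genome_equivalents(dna_ng, genome_bp=960e6):
    """Number of genome equivalents contained in dna_ng of genomic DNA."""
    grams_per_genome = genome_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return dna_ng * 1e-9 / grams_per_genome


def copies_per_genome(ct, slope, intercept, dna_ng, genome_bp=960e6):
    """Absolute copy number per genome equivalent for one sample."""
    return copies_from_ct(ct, slope, intercept) / genome_equivalents(dna_ng, genome_bp)
```

In use, the five plasmid dilutions supply `copies` and `ct_values` for the fit, and the Ct of the genomic sample together with its template mass gives the per-genome copy number. Averaging the three replicate measurements before or after conversion is a separate choice that the sketch leaves open.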
Isolation and sequence analysis of the Ty1-copia-like retrotransposon Rider:
T3238fer, a spontaneous iron-inefficient mutant of tomato, was found in the 1960s (Brown and Chaney 1971). Our previous work has demonstrated that the mutation of T3238fer was caused by an insertion of a large DNA fragment in the first exon of FER, blocking the expression of this gene that functions in controlling iron deficiency responses and iron uptake (Ling et al. 2002). T3238fer is a lethal mutant under normal culture conditions because of its inefficiency in acquiring iron from the soil. Using a PCR-based library screening method (Cheng et al. 2004) and the specific PCR primers of FER, a cosmid clone covering the whole insertion was isolated from a genomic cosmid library of T3238fer and sequenced by primer walking. Finally, a 6-kb sequence, which contained the complete insertion and the exon of FER, was obtained.
Using the dotplot program (Sonnhammer and Durbin 1995), the insertion fragment in FER was identified to be 4876 bp long and flanked by two 5-bp short direct repeats (5′-CTTTT-3′). Adjacent to the short direct repeats there are two 397-bp LTRs that terminate in the consensus sequence 5′-TG…CA-3′, and a 3921-bp internal putative coding region (Figure 1A). The short direct repeat is a typical TSD, which is a characteristic feature of a transposon insertion event. On the basis of the identification of LTR sequences, the inserted element in FER possibly belongs to the class of LTR retrotransposons.
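The structural hallmarks listed above (a TSD flanking the element, and direct terminal repeats beginning with TG and ending with CA) can be checked programmatically. The following Python sketch is a simplified stand-in for the dot-plot analysis, under two assumptions: the element boundaries are already known, and the two LTRs are identical, so that an exact prefix/suffix comparison is enough. The function name and the returned keys are hypothetical.

```python
def describe_insertion(sequence, start, end, tsd_len=5):
    """Check TSD and LTR hallmarks of an element at sequence[start:end].

    Coordinates are 0-based and half-open. The LTR is taken as the longest
    stretch shared exactly by the 5' and 3' ends of the element.
    """
    if start < tsd_len or end + tsd_len > len(sequence):
        raise ValueError("not enough flanking sequence to read the TSD")
    seq = sequence.upper()
    element = seq[start:end]
    left_tsd = seq[start - tsd_len:start]
    right_tsd = seq[end:end + tsd_len]

    # Longest exact direct repeat shared by both ends of the element.
    ltr_len = 0
    for n in range(len(element) // 2, 0, -1):
        if element[:n] == element[-n:]:
            ltr_len = n
            break
    ltr = element[:ltr_len]

    return {
        "tsd": left_tsd if left_tsd == right_tsd else None,
        "ltr_length": ltr_len,
        "internal_length": len(element) - 2 * ltr_len,
        "tg_ca_termini": ltr.startswith("TG") and ltr.endswith("CA"),
    }
```

With the sizes reported here, the two 397-bp LTRs account for 794 bp of the 4876-bp element, and 3921 bp corresponds to exactly 1307 codons. Elements whose LTRs have diverged would need an alignment-based comparison instead of the exact match used in this sketch.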
In the internal region between the LTRs, two conserved sites, the primer binding site (PBS) and the polypurine tract site (PPT) (Figure 1A), were identified 3 bp downstream from the left LTR and closely upstream from the right LTR, respectively. The PBS is part of a conserved 5′-TGGTATCAGAGCC-3′ sequence, which serves to complement the 3′ sequence of tRNA (Met) and to activate the transcription of mRNA (Akama and Tanifuji 1989). The polypurine tract site with the conserved AGGGGGAG motif (Friant et al. 1996) is speculated to prime the synthesis of second-strand DNA. Between the two sites, a large single ORF was identified. It encodes a polyprotein composed of 1307 amino acids. Searching for conserved domains using RPS-Blast allowed us to identify gag, zinc-finger, protease (pro), integrase (int), and RT domains within the polyprotein together with the domain of ribonuclease H (RH). In comparison with all published LTR elements, we found that the general organization of these domains was very similar to Ty1-copia retrotransposons both in order and in length. A homology matrix comparison of the deduced amino acid sequences between this element and tobacco Tnt1 is shown in Figure 1B. The identity value at the protein level between the two ORFs was 45% and the two polyproteins showed a common order of gag, pro, int, RT, and RH domains. Whereas different domains showed different levels of sequence similarity, the protease, integrase, and reverse transcriptase-RNase H domains were most conserved with identities of 43% in 98 amino acids, 50% in 158 amino acids, and 57% in 599 amino acids, respectively. The gag polyprotein is less conserved with 40% identity in 246 residues. This high degree of conservation in amino acid sequence and domain order indicates that the tomato element contains all characteristic protein domains required for retrotransposition, suggesting that this element belongs to a new Ty1-copia family of LTR retrotransposons. Most recently, Xiao et al. (2008) described a gene duplication underlying the morphological variation of tomato fruit mediated by the retrotransposon Rider, whose sequence is completely identical to our isolated element. Thus, here we call our isolated element Rider, too.
Rider elements in the tomato genome:
To identify sequences homologous to Rider, we searched all the published tomato BAC or PAC sequences using the full-length sequence of Rider as the query. A set of 32 matches (>200 bp and identity value >90%) was identified from 25 tomato bacterial artificial chromosome (BAC) or P1 artificial chromosome (PAC) clones distributed on chromosomes 1, 4, 5, 8, and 10. These matches were highly similar in sequence and differed from any retrotransposable elements previously isolated in tomato. Among these matches, 6 are full-size homologs with the same overall structure as Rider, 4 are putative nonautonomous elements that have partially or completely lost their internal coding regions, and 22 are solo LTRs or deletion derivatives. Such divergent structural groups indicate that frequent rearrangements, deletions, or template switching events occurred in the Rider family of retrotransposons.
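The match criteria above (>200 bp and >90% identity) amount to a simple filter over the search results. As a minimal sketch, assuming the hits are available in the 12-column tabular layout that BLAST+ writes with `-outfmt 6` (the text does not say which output format was used), the filter could look like this in Python; the function name is hypothetical.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_hits(tabular_text, min_length=200, min_identity=90.0):
    """Keep hits longer than min_length bp with identity above min_identity %."""
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        # Skip blank lines, comment lines, and malformed rows.
        if not row or row[0].startswith("#") or len(row) < len(FIELDS):
            continue
        hit = dict(zip(FIELDS, row))
        hit["pident"] = float(hit["pident"])
        hit["length"] = int(hit["length"])
        if hit["length"] > min_length and hit["pident"] > min_identity:
            hits.append(hit)
    return hits
```

Sorting the retained hits into full-size elements, nonautonomous elements, and solo LTRs still requires inspecting which parts of Rider each hit covers, which this filter does not attempt.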
We analyzed the six full-size Riders individually (Table 1) and found that they all have completely conserved PBS and PPT sites and a single internal ORF (although Rider-3 and Rider-6 have one or two stop codons in the gag–pro regions). Distinct TSDs could also be identified for every element except Rider-5, and the TSD sequence of each element was different, e.g., 5′-CTTTT-3′ in Rider-1 and 5′-CTAGC-3′ in Rider-2 (Table 1). This suggests that integration sites of Rider in tomato chromosomes might be randomly distributed. As the two LTRs are created by a duplication of a single template during replication of the element, followed by the accumulation of random mutations, the degree of their divergence is proportional to the time passed since the insertion of the element. Therefore, the nucleotide divergence between the two LTRs of each full-size element was compared and used to estimate the insertion time. As shown in Table 1, five elements contained few mutations (>98% identity) between the two LTRs. Rider-6 had six nucleotide changes in the 397-bp LTR sequence and was determined to be the oldest insertion (1.07 MYA). Rider-1, Rider-2, and Rider-7 were the three youngest elements. All had completely identical LTRs and their insertion must have occurred very recently. This is consistent with the occurrence of the T3238fer mutation caused by Rider-1 in the 1960s.
When using the Rider element as the query sequence of BLASTN, another interesting match, GTOM5, was obtained from the nucleotide database of NCBI. The GTOM5 mRNA (X67143) is an aberrant transcript of the phytoene synthase gene (Psyl) that was isolated from the tomato mutant yellow flesh. This sequence contains the first 326-bp nucleotides of the Psyl gene followed by 181 bp of an inserted sequence. Southern hybridization analysis had shown that this inserted sequence was highly repetitive in the tomato genome (Fray and Grierson 1993). Blast analysis revealed that the repeated sequence was similar to part of the Rider LTR, sharing 96% identity at the nucleotide level. Fray and Grierson (1993) reported that the insertion of the repeated segment was the reason for the transcriptional lesion of the phytoene synthase gene resulting in the yellow-fruited mutant yellow flesh (Price and Drinkard 1908). Therefore, this mutant is most likely another case of a Rider mutational insertion event in the tomato genome.
Four nonautonomous Rider elements were also analyzed. Three of them, identified in the sequences AJ439079, AP009276, and AC188781, showed degraded TSD sites; the two from AP009276 and AC188781 still contained intact PBS and PPT sites. In the fourth, which was identified in the 5′ upstream region of the tomato gene LEACO1 (X58273) and previously described as a typical repeated sequence with multiple copies in the tomato genome (Blume et al. 1997), the TSD and PBS/PPT sites were completely degenerated.
Transcriptional activity analysis:
The tomato EST database (http://sgn.cornell.edu) and NCBI EST database were searched for evidence of the transcriptional activity of Rider. In tomato, seven EST hits with >95% identity were obtained. The lengths of these EST hits varied from 433 to 808 bp. Three of them are located in the RT–RH domain and four are in the LTR. Then we experimentally analyzed the transcriptional activity of the Rider-1 that inserted in the FER gene. As mentioned above, the aberrant transcript GTOM5 was caused by a Rider insertion and the expression of this RNA was under the control of the Psyl promoter (Fray and Grierson 1993). To test the Rider transcription driven by the FER promoter, two FER gene-specific primers (left-1 and right-1) designed from the 5′ and 3′ regions of the first exon of FER were used in combination with the Rider-specific primer LTR-r located in the reverse direction of the left LTR or other internal and LTR primers (Figure 2A), respectively. Using left-1/LTR-r primers for RT–PCR, a very weak band was detected in the root sample of T3238fer with 40 PCR cycles, while no products were identified with the other primer pairs (Figure 2B). This is consistent with the root-specific transcriptional activity of FER (Ling et al. 2002) and indicates that the band might be due to FER promoter activity. However, this transcription was at a very low level and could not extend into the internal region of Rider due to a stop signal in the LTR regions.
The RT–PCR amplification with the primers of LTR-l/LTR-r and internal-l/internal-r, derived from LTR and gag–protease sequences (Figure 2A), respectively, was performed with RNAs from leaves, roots, and flowers of T3238 and the T3238fer mutant. As shown in Figure 2, positive PCR products were amplified from roots, leaves, and flowers of T3238fer. Sequencing analysis verified that the sequences of the PCR products were 98–100% similar to Rider-1, which indicated that they were from the transcription of Rider-1 or its homologs. RNA samples from plants grown under stress conditions such as drought or tissue culture were also analyzed, but no differences in expression were found (data not shown).
Copy-number estimation and chromosome distribution:
With the primers specific to LTR and protease domain sequences, a quantitative real-time PCR amplification was performed to estimate the copy number of Rider elements in the tomato genome. The relative standard curve was made by a 10-fold dilution series of plasmid DNA containing Rider. As shown in Figure 3A, the slopes of the two five-point standard curves, Ct = f(−log[DNA]), were −1.954 and −0.652, respectively, with a correlation index >0.95 for both curves. From the standard curves, the number of LTRs and internal regions of Rider in the T3238 genome were determined and are shown in Figure 3A (right): a total of 195 LTR copies and 66 protease copies were estimated in the T3238 genome. The ratio calculated from copy numbers of LTR vs. protease regions was >2. Considering that each Rider element has one copy of the internal region and two LTRs, the ratio suggests that many solo Rider LTRs exist in the tomato genome.
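The reasoning from the LTR-to-internal ratio to the presence of solo LTRs can be made explicit with a small worked calculation in Python. It is only an illustration: the function name is hypothetical, and the estimate assumes that every internal region detected by qPCR is flanked by exactly two LTRs, which ignores deletion derivatives that retain both LTRs but lack the amplified internal fragment. The text itself reports no solo-LTR count.

```python
def solo_ltr_estimate(ltr_copies, internal_copies):
    """Rough excess of LTRs over those expected from complete elements."""
    # Each complete element contributes two LTRs and one internal region.
    paired_ltrs = 2 * internal_copies
    return {
        "ltr_to_internal_ratio": ltr_copies / internal_copies,
        "excess_ltrs": max(0, ltr_copies - paired_ltrs),
    }


# Copy numbers reported above for the T3238 genome.
estimate = solo_ltr_estimate(ltr_copies=195, internal_copies=66)
```

With 195 LTR copies and 66 internal copies, 132 LTRs would be accounted for by complete elements, leaving an excess of 63 LTRs under this simplification.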
To study the global distribution of Rider members on tomato chromosomes, a fiber-FISH analysis was carried out with the fragment containing both the LTR and gag–protease regions as probe. The in situ hybridization showed that the Rider members were abundant and scattered in tomato chromosomes (Figure 3B). This localization differs from most plant Ty1-copia group elements, which are euchromatic (Kumar et al. 1997).
Rider elements in other species of the Solanaceae family:
To check the distribution of the Rider retrotransposon family in Solanaceae, six Lycopersicon species (L. hirsutum, L. chmielewskii, L. peruvianum, L. chilense, L. pimpinellifolium, and L. esculentum) and three distant relatives (coffee, potato, and tobacco) were analyzed by Southern blot hybridization together with T3238fer. As shown in Figure 4, with probes derived from the LTR region and the internal gag domain of Rider, hybridization signals were observed in all Lycopersicon species even after washing at high stringency (2× SSC and 0.1% SDS), whereas no hybridization signals were detected in coffee, potato, and tobacco even at low washing stringency (2× SSC and 0.5% SDS). These results suggest that the Rider retrotransposon family is specific to Lycopersicon species. Additionally, the intensity of hybridization signals varied among the species. With both probes, L. chilense displayed the weakest hybridization signals among the species investigated, whereas Rider elements are more abundant in L. esculentum (Figure 4).
To further confirm the presence of Rider-retrotransposable elements in other species, we carried out a BLASTN search in the nr/nt database of NCBI with the internal coding region of Rider as the query sequence. A total of 156 matches, distributed in 14 genomes and sharing 40% of the common region with the query sequence, were obtained (Figure 5). All the 156 nucleotide sequences were extracted for alignment with the full-size Rider sequence individually, and the regions of nucleotide identities aligned to Rider were drawn. Figure 5B (red indicates that the identity score is high) illustrates the regions of nucleotide identities among the representative elements from the 14 species aligned to Rider. All of these elements only partially matched Rider in internal regions such as int, RT, or RH domains, except two putative Ty1-copia-like elements from Arabidopsis BAC clone AL138663 and AC007188. The two putative Arabidopsis elements matched Rider in both coding and noncoding regions with 75% identity at the nucleotide level (Figure 6) and were named Rider-like 1 and Rider-like 2, respectively. They are two complete retrotransposable elements with two 248-bp long LTRs terminated by the consensus sequence 5′-TG…CA-3′, which share high sequence identity to the 3′ part of Rider LTRs (Figure 6B). The PBS/PPT sites of the two Rider-like's are identical to Rider also. Moreover, Rider-like 2 has a pair of distinct TSD sequences (TCTCC) whereas the TSD sequence of Rider-like 1 is degenerated. LTR–nucleotide divergence analysis of Rider-like 1 and Rider-like 2 showed that their LTRs share 93 and 91% identity, respectively. It allows us to estimate that the insertion time of Rider-like 1 and Rider-like 2 is ∼4.89 and 5.60 MYA, respectively (Table 1). This result suggests that the “birth” of Rider-like 1 and Rider-like 2 in the Arabidopsis genome occurred after the radiation of tomato and potato genomes 12 MYA (Tanksley et al. 1992). 
In the Arabidopsis genome, there are only two copies of Rider-like, and no transcriptional products were detected by either EST database searching or RT–PCR amplification with Rider-like-specific primers (data not shown), suggesting that Rider-like 1 and Rider-like 2 are transcriptionally inactive.
To gain insight into the evolutionary relationship of Rider and other copia-type retrotransposons, the sequences of their conserved RT domains were extracted from 24 previously identified copia-type elements as well as from Rider and its homologs (from Oryza sativa and Triticum aestivum, as shown in Figure 5B) for the phylogenetic tree reconstruction. As shown in Figure 7, the unrooted neighbor-joining tree revealed that the various RT sequences clustered into several distinct major evolutionary lineages supported by high bootstrap value and topographic structure. The long branches and loose lineage structure indicate a considerable diversity and similar time of origin of each group, consistent with the previous classification of plant copia families (Wicker and Keller 2007). In the tree, Rider was clustered into the group with Rider-like 1 of Arabidopsis, Os.Copia1, and Ta.Copia, but not with the published elements from Solanaceae (shaded dark green in Figure 5). This is consistent with previous BLASTN searching as shown in Figure 5.
Sequence analysis revealed that the Rider element has all the structural features characteristic of Ty1-copia-type retrotransposons. Its LTR is completely different from any other known LTR element, whereas the internal regions share common sequences at a low identity level with the putative Ty1-copia-like elements identified from the plant genomes with the exception of Rider-like 1 and Rider-like 2 of Arabidopsis. Furthermore, Rider showed a constitutive transcriptional activity in all tissues investigated (Figure 2), and two recent mutational insertions were observed in FER of the tomato mutant T3238fer (Brown and Chaney 1971) and Psyl of the mutant yellow flesh (Fray and Grierson 1993). More recently, Xiao et al. (2008) found a Rider (Rider-7)-mediated gene duplication in the tomato genome that resulted in morphological variation of the fruit. These results strongly support Rider as a novel type of Ty1-copia retrotransposable element that was recently actively transposing in the tomato genome.
Vertical inheritance and horizontal transfer are considered the two main mechanisms for the evolution of a transposable element. Southern hybridization and homology search analysis revealed that Rider is specific to Lycopersicon species (Figure 4) and is highly similar to the Ty1-copia-like elements (Rider-like 1 and Rider-like 2) of Arabidopsis (Figures 5 and 6). Even the noncoding region, including the LTR and PBS/PPT sites, which is expected to evolve more rapidly than the coding region of a retrotransposon, showed high levels of identity between Rider of tomato and Rider-like of Arabidopsis.
The fact that Rider elements are absent in the genomes of related species of Lycopersicon such as potato, tobacco, and coffee (Figure 4) might be interpreted, in the framework of vertical inheritance, as a loss of Rider in these species. However, this inference suffers from two serious problems. The first problem is that one has to assume many independent losses of the Rider-like elements after Solanaceae diverged from the most recent ancestor shared with Arabidopsis. For example, in the phylogeny shown in Figure 4, three independent losses have to be invoked in the lineages toward Coffea arabica, Nicotiana tabacum, and Solanum tuberosum to ensure that the Rider-like elements existed in every ancestor of these species and were eventually passed to the tomato genus. The horizontal transfer hypothesis seems much more likely because it assumes only one introgression event into the ancestral tomato genome. This phylogeny-based parsimony analysis strongly supports the horizontal transfer hypothesis. The second problem of the vertical inheritance hypothesis is that, if it were correct, the Rider family would be at least as old as the common ancestor of Solanaceae and Arabidopsis, which are known to have diverged 100–120 MYA (Long et al. 1996). According to an evolutionary rate of 5.9–6.96 × 10⁻⁹ substitutions per synonymous site per year in plants (White and Doebley 1999; Ma et al. 2004), the silent sequences (nonprotein sequences) in the Rider members in the two branches should retain only 31–41% identity, which is significantly lower than the ∼75% that we observed. On the basis of this analysis, the high identity with the Rider homologs in the distantly related species Arabidopsis (Figure 5) also supports the view that Rider originated from a horizontal gene transfer.
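The order of magnitude of the expected identity under vertical inheritance can be checked with a short calculation. The sketch below assumes a Jukes-Cantor model, in which two lineages separated for T years at rate r accumulate K = 2rT substitutions per site and the expected identity is 1/4 + (3/4)·exp(−4K/3). The text does not state which model produced the 31–41% range, so this is an illustration of the argument rather than a reconstruction of the original calculation, and the function name is hypothetical.

```python
import math


def jc_expected_identity(rate_per_site_per_year, divergence_years):
    """Expected identity of neutral sites under a Jukes-Cantor model.

    Both lineages evolve independently, so the accumulated distance is
    K = 2 * rate * time substitutions per site.
    """
    k = 2 * rate_per_site_per_year * divergence_years
    return 0.25 + 0.75 * math.exp(-4 * k / 3)


# Bounds taken from the rates and divergence times quoted above.
upper = jc_expected_identity(5.9e-9, 100e6)    # slowest rate, shortest time
lower = jc_expected_identity(6.96e-9, 120e6)   # fastest rate, longest time
```

Under these assumptions both bounds fall far below the ∼75% identity observed between Rider and the Arabidopsis Rider-like elements, which is the point of the argument; the exact percentages depend on the substitution model chosen.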
Rider-like elements in Arabidopsis are ∼5.6 MY old, older than the most ancient Rider element detected in tomato (1.07 MY; Table 1) but younger than the speciation of tomato (∼12 MYA) (Tanksley et al. 1992). Considering that Rider-like in Arabidopsis is inactive and that the 5′ half of its LTRs is largely unalignable with that of the tomato Rider, we infer that Rider-like of Arabidopsis and Rider of tomato both derive from an ancestral Rider element in the genome of Arabidopsis or a related species, through horizontal transfer that occurred 1–5.6 MYA. Such a transfer was presumably mediated by a vector of some kind, such as viruses, bacteria, fungi, or sap-sucking insects (Silva et al. 2004; Diao et al. 2006; Roulin et al. 2008). The transpositional activity of the element was retained in the Lycopersicon genome but lost in Arabidopsis.
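Element ages of this kind are commonly estimated from the sequence divergence between the two LTRs of a single element, which are identical at the time of insertion. The sketch below assumes that standard relation, T = K / (2r); the method behind the ages in Table 1 is not restated in this section. The function name and the example inputs are hypothetical, not data from the study.

```python
def insertion_age_years(ltr_divergence, rate_per_site_per_year):
    """Estimate insertion age from the divergence K between an element's two LTRs.

    Uses T = K / (2r): both LTRs accumulate substitutions independently
    after insertion, so the divergence grows at twice the per-site rate.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    if ltr_divergence < 0:
        raise ValueError("divergence cannot be negative")
    return ltr_divergence / (2.0 * rate_per_site_per_year)


# Hypothetical inputs for illustration only (not taken from Table 1).
example_divergence = 0.02   # substitutions per site between the two LTRs
example_rate = 1e-8         # substitutions per site per year

age = insertion_age_years(example_divergence, example_rate)
print(f"estimated insertion age: {age / 1e6:.2f} MY")
```

Because the estimate scales inversely with the assumed rate, ages are only comparable between elements when the same rate is applied to all of them.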
FISH mapping and copy number determination demonstrated that the tomato genome contains an intermediate number of Rider homologs (∼195 LTR copies and 66 internal copies) scattered across the chromosomes. This distribution is consistent with the observation that the target site duplications (TSDs) are not conserved among different members of the family. This feature distinguishes Rider from other plant Ty1-copia elements (Flavell et al. 1992; Kumar et al. 1997). More detailed knowledge of the distribution and integration sites of Rider elements will be gained from the International Tomato Sequencing Project (http://www.sgn.cornell.edu/about/tomato_sequencing.pl).
Southern hybridization analysis also revealed many Rider insertion polymorphisms among cultivated tomato and its closely related wild species, which points to a potential use of Rider in taxonomic studies. Several marker systems based on retrotransposons have been developed for plant genome and biodiversity analysis (Purugganan and Wessler 1995), and techniques allowing the PCR analysis of retrotransposon insertion polymorphisms have recently extended the application of retrotransposons as genetic markers (Waugh et al. 1997; Ellis et al. 1998). Therefore, studying the distribution of Rider in more tomato species could help to evaluate and clarify the phylogenetic relationships within the genus Lycopersicon.
The transposition of retrotransposons is believed to be regulated mainly at the transcriptional level. Most retrotransposons reported to date are silent, or are active only during certain stages of plant development (Kumar et al. 1997) or under specific stress conditions (Kumar et al. 1997; Beguiristain et al. 2001; Rico-Cabanas and Martinez-Izquierdo 2007). Such limited activity might minimize the deleterious effects of element amplification in the genome (Kumar and Bennetzen 1999). Rider, in contrast, was constitutively transcribed in all tested tissues of tomato under different growth conditions, yet it is present in only an intermediate number of copies (∼100) in the tomato genome; its copy number therefore does not correlate with its transcriptional activity. The replication cycle of an LTR retrotransposon comprises four steps (transcription, translation, reverse transcription, and integration of the cDNA), and regulation at any of them can limit the transposition rate. The constitutive expression of Rider indicates that its amplification is not suppressed at the transcriptional level, so the transposition rate of Rider might be regulated after transcription. Constitutively expressed LTR retrotransposons have also been observed in other plants (Neumann et al. 2003).
In conclusion, Rider is a novel, young, and recently active Ty1-copia-like retrotransposon in the tomato genome that may have originated by horizontal transfer. This finding agrees with earlier conclusions that the movement and amplification of retrotransposons are major contributors to genome evolution and genetic diversity in plants (Jin and Bennetzen 1994; Bergthorsson et al. 2003; Long et al. 2003; Wang et al. 2006). The possible hijacking of nuclear genes by retrotransposons could transfer them between organisms, leading to the origination of a new gene or a novel recombination of genes and thus representing a drastic innovation in a genome (Jin and Bennetzen 1994; Wang et al. 2006). Further characterization of the Rider family will give new insight into the role of these retrotransposons in the genome evolution and genetic diversity of tomato species.
The authors are grateful to Barbara Hohn (The Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland) for critical reading of the manuscript. This work was supported by the Chinese Academy of Sciences (grant nos. KSCX2-YW-N-001 and KSCX2-YW-N-056), the Ministry of Science and Technology of China (grant nos. 2005cb20904 and 2006AA10A105), and the National Natural Science Foundation of China (grant no. 30530460).
Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under accession no. EU195798.
Communicating editor: J. A. Birchler
- Received November 26, 2008.
- Accepted January 9, 2009.
- Copyright © 2009 by the Genetics Society of America