Arabidopsis PAI Gene Arrangements, Cytosine Methylation and Expression
Stacey Melquist, Bradley Luff, and Judith Bender
Department of Biochemistry, Johns Hopkins University School of Public Health, Baltimore, Maryland 21205
Corresponding author: Judith Bender, Department of Biochemistry, Johns Hopkins University School of Public Health, 615 N. Wolfe St., Baltimore, MD 21205. E-mail: jbender{at}welchlink.welch.jhu.edu
Communicating editor: J. A. BIRCHLER
| ABSTRACT |
|---|
Previous analysis of the PAI tryptophan biosynthetic gene family in Arabidopsis thaliana revealed that the Wassilewskija (WS) ecotype has four PAI genes at three unlinked sites: a tail-to-tail inverted repeat at one locus (PAI1-PAI4) plus singlet genes at two other loci (PAI2 and PAI3). The four WS PAI genes are densely cytosine methylated over their regions of DNA identity. In contrast, the Columbia (Col) ecotype has three singlet PAI genes at the analogous loci (PAI1, PAI2, and PAI3) and no cytosine methylation. To understand the mechanism of PAI gene duplication at the polymorphic PAI1 locus, and to investigate the relationship between PAI gene arrangement and PAI gene methylation, we analyzed 39 additional ecotypes of Arabidopsis. Six ecotypes had PAI arrangements similar to WS, with an inverted repeat and dense PAI methylation. All other ecotypes had PAI arrangements similar to Col, with no PAI methylation. The novel PAI-methylated ecotypes provide insights into the mechanisms underlying PAI gene duplication and methylation, as well as the relationship between methylation and gene expression.
THE model higher plant Arabidopsis thaliana has a small genome size of ~10⁸ bp. Despite the relative simplicity of the Arabidopsis genome, many functions in this plant are encoded by gene families rather than by a single gene. For example, many of the enzymes in the tryptophan biosynthetic pathway are encoded by two- (![]()
In particular, the gene family encoding the enzyme that catalyzes the third step of the tryptophan biosynthetic pathway, phosphoribosylanthranilate isomerase (PAI), displays a number of intriguing features. The PAI gene family has been characterized previously in detail in two standard laboratory strains of Arabidopsis, Columbia (Col), and Wassilewskija (WS; ![]()
The high degree of sequence identity across PAI gene family members is unusual in comparison to other Arabidopsis tryptophan gene families. For example, the duplicated tryptophan synthase β-subunit genes TSB1 and TSB2 are 85% identical to each other in their exon sequences (exclusive of presumed chloroplast target sequences), but they are highly divergent in their transcribed untranslated sequences (![]()
-subunit genes ASA1 and ASA2 are 67% identical to each other in their predicted amino acid sequences, but they are divergent in their exon and intron nucleic acid sequences and in their intron structures (![]()
In WS, there are two more unusual features of the PAI gene family. First, whereas WS carries PAI2 and PAI3 genes that are almost identical to the Col PAI2 and PAI3 genes, at the PAI1 locus, WS carries an inverted-repeat duplication of the PAI1 gene PAI1-PAI4 (![]()
The second unusual feature of the WS PAI genes is that all four full-length genes and the partial pai4 5' gene are densely cytosine methylated (![]()
Because the PAI1 locus is polymorphic between WS and Col, we reasoned that an analysis of PAI gene copy number, arrangements, and methylation in other Arabidopsis ecotypes might provide new insights into the correlation between gene structure and the onset of cytosine methylation, as well as provide new genetic tools for understanding methylation and its relationship to gene expression. In addition, the PAI structures observed in other ecotypes could elucidate the mechanism of PAI gene duplication. To investigate this possibility, we screened PAI gene structure and methylation by Southern blot analysis for 39 additional ecotypes of Arabidopsis isolated from around the world. We found that whereas most ecotypes had PAI gene structures identical to that observed for Col (three unmethylated genes) and two ecotypes had PAI gene structures nearly identical to that observed for WS (four methylated genes), four ecotypes had novel PAI arrangements, with a variation of the inverted-repeat PAI gene structure at the PAI1 locus and PAI cytosine methylation. On the basis of our results, we discuss possible models for the differential methylation of PAI gene arrangements, the evolution of the polymorphic PAI gene family, and the effects of methylation on PAI gene expression.
| MATERIALS AND METHODS |
|---|
Arabidopsis strains and growth conditions:
Arabidopsis ecotypes were obtained from the Arabidopsis Biological Resource Center at Ohio State University, with the exception of C24, which was obtained from Patrick Masson (University of Wisconsin, Madison, WI). Plants were grown under continuous light in Fafard Growing Mix 2 (Griffin Greenhouse Supplies).
Plant genomic DNA preparation and Southern blot analysis:
Plant genomic DNA was prepared as follows. Four-week-old plants were frozen in liquid nitrogen and ground into a fine powder with a mortar and pestle. Ground tissue (~10 g) was thawed in 20 ml lysis buffer (100 mM Tris-OH, pH 9.5, 1.4 M NaCl, 20 mM EDTA, pH 8.0, 2% hexadecyltrimethylammonium bromide, 1% polyethylene glycol, Mr 8000), plus 50 µl 2-mercaptoethanol and then heated at 74° for 20 min. The lysed tissue was cooled to room temperature and extracted with an equal volume of CHCl3. The aqueous phase was transferred to a fresh tube and precipitated with an equal volume of isopropanol at room temperature for 30 min, followed by centrifugation at 5000 × g for 20 min. The pellet was resuspended in 1 ml 0.75 M NaCl, incubated with 5 µl 10 mg/ml RNase A at 37° for 30 min, mixed with 0.25 ml of water and 0.75 ml of JetStar (Genomed) column equilibration buffer E4, and centrifuged for 10 min at 5000 × g to pellet insoluble material. The DNA sample supernatant was loaded on an equilibrated JetStar minicolumn and washed, eluted, and isopropanol precipitated as described by the manufacturer. DNA pellets were resuspended to a final concentration of ~0.5 µg/µl in water. For Southern blot analysis, ~1 µg of genomic DNA was digested with the restriction endonuclease of interest, electrophoresed through a 0.7% TBE agarose gel, transferred to a Hybond-N (Amersham, Arlington Heights, IL) membrane, and fixed by UV cross-linking. Purified probe DNA fragments were labeled using the MegaPrime kit (Amersham) as described by the manufacturer and hybridized to Southern blots in Church buffer (![]()
Construction and screening of genomic DNA libraries:
Genomic DNA isolated from Kas-1, C24, Ita-0, or Cvi-0 was partially digested with Sau3AI to an average fragment size of 15–20 kb, ligated with λ DASH (Stratagene, La Jolla, CA) arms, and packaged into phage particles using Gigapack Gold (Stratagene) in vitro phage packaging extracts as described by the manufacturer. Each library contained 100,000–200,000 primary clones. PAI-positive plaques were isolated by transferring to Hybond-N membranes and hybridizing with a PAI cDNA or direct-repeat probe as described for Southern blot analysis above.
DNA sequencing:
Genomic DNA fragments were subcloned from λ library isolates into pBluescript II KS+ (Stratagene) and were sequenced either from standard T3 and T7 primer sites in the vector or from custom-designed internal primers by the Johns Hopkins University Department of Biological Chemistry Biosynthesis and Sequencing Facility.
PCR analysis of sequences flanking the PAI1-proximal direct repeat:
PCR primers based on sequences lying just outside the 731-bp pai4 5' duplication in WS and the heterologous 944-bp sequence present at this region in Col (see Figure 3) were used to amplify the analogous regions from Kas-1, C24, and Ita-0 genomic DNA using standard PCR reagents and conditions. These three ecotypes gave fragments that were identical in size and in PstI or HincII restriction patterns to the Col fragment. Also, genomic clones of this region from Kas-1 and Cvi-0 were explicitly sequenced and found to be nearly identical to Col.
Isolation and analysis of PAI cDNAs:
Standard Col (![]()
Reverse transcriptase PCR analysis of PAI cDNAs:
Total Arabidopsis RNA isolated as described (![]()
Northern blot analysis:
RNA was prepared from whole 4-wk-old plants, electrophoresed, transferred to Hybond-N (Amersham) membranes using standard procedures (![]()
-tubulin probe to normalize for differences in loading.
| RESULTS |
|---|
Determination of PAI gene structures and methylation in Arabidopsis ecotypes:
Because the PAI1 locus is structurally polymorphic between the previously characterized Col and WS Arabidopsis ecotypes (![]()
As a preliminary screen of both PAI structure and methylation, we digested genomic DNA from 39 new ecotypes with the methylation-sensitive restriction endonuclease isoschizomers HpaII and MspI, followed by Southern blot analysis with a PAI1 cDNA probe. Each PAI locus carries a single conserved HpaII/MspI site in the second PAI intron with different flanking HpaII/MspI sites (Figure 3). Therefore, each PAI locus yields distinct HpaII/MspI fragments, diagnostic of either fully cleaved unmethylated or partially cleaved methylated sequences. PAI methylation detected by HpaII/MspI digestion was previously shown to correlate with PAI methylation across the rest of the WS PAI genes detected by several other restriction endonucleases and by a genomic methylation sequencing technique (![]()
Out of the 39 new ecotypes examined, 31 had HpaII/MspI PAI patterns identical to those observed in Col, diagnostic of three unmethylated genes (Table 1, Figure 2). Two other ecotypes had Col-related patterns. Lu-1 had PAI1 and PAI2 patterns identical to those observed for Col, but a different PAI3 pattern. This pattern was most likely caused by a HpaII/MspI PAI3 polymorphism because when Lu-1 DNA was cleaved with XhoI, it yielded a band pattern identical to Col (data not shown). Ll-0 had PAI2 and PAI3 patterns identical to those observed for Col, but no detectable PAI1 fragment. This pattern was verified with a XhoI digest (data not shown).
Two ecotypes had band patterns related to those observed in WS, diagnostic of a complex PAI arrangement at the PAI1 locus plus dense cytosine methylation across all three PAI loci (Table 1). One PAI-methylated ecotype, Bur-0, had PAI patterns identical to those observed in WS. Another PAI-methylated ecotype, Nd-0, had PAI1-PAI4 and PAI2 patterns identical to those observed in WS, but a different PAI3 pattern. This pattern was most likely caused by a HpaII/MspI PAI3 polymorphism because when Nd-0 DNA was cleaved with XhoI, it yielded a band pattern identical to WS (data not shown). Four other PAI-methylated ecotypes, Kas-1, C24, Ita-0, and Cvi-0, displayed band patterns consistent with novel complex structures at the PAI1 locus (Figure 1 and Figure 2).
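The ecotype counts reported in the two preceding paragraphs can be reconciled with a short tally. The sketch below is bookkeeping only: every number is taken from the text, while the dictionary labels and variable names are illustrative and are not the authors' terminology.

```python
# Tally of the 39 newly screened ecotypes, using only counts stated in the text.
new_ecotypes = {
    "Col-identical pattern, unmethylated": 31,
    "Col-related pattern, unmethylated (Lu-1, Ll-0)": 2,
    "WS-like pattern, methylated (Bur-0, Nd-0)": 2,
    "novel PAI1 locus, methylated (Kas-1, C24, Ita-0, Cvi-0)": 4,
}

total_new = sum(new_ecotypes.values())
methylated_new = sum(n for label, n in new_ecotypes.items()
                     if "unmethylated" not in label)
unmethylated_new = total_new - methylated_new

# Add the two previously characterized reference ecotypes.
unmethylated_all = unmethylated_new + 1  # plus Col
methylated_all = methylated_new + 1      # plus WS

print(f"new ecotypes screened: {total_new}")
print(f"new PAI-methylated ecotypes: {methylated_new}")
print(f"all unmethylated (with Col): {unmethylated_all}")
print(f"all methylated (with WS): {methylated_all}")
```

The totals including Col and WS are the ones to compare against any later statement that counts all ecotypes examined rather than only the new ones.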
Southern blot analysis of the four novel PAI-methylated ecotypes using the enzyme HindIII, which is relatively insensitive to cytosine methylation, confirmed that there were only three PAI loci in each of these strains (Figure 1). The PAI2 and PAI3 loci were structurally invariant relative to Col and WS, whereas the PAI1 locus was polymorphic. The sizes of the PAI1 locus bands were consistent with either singlet genes or inverted-repeat genes spaced further apart than in WS. Southern blot analysis with the enzyme XhoI (restriction map shown in Figure 3), and subsequent cloning and sequencing of the PAI1 locus from each ecotype (see below), showed that each of the novel PAI-methylated ecotypes in fact carried a variation of the PAI1-PAI4 inverted-repeat structure.
The PAI1-PAI4 locus in WS is flanked by ~3 kb of direct-repeat sequences (Figure 3; ![]()
Sequence analysis of PAI1 locus structures:
To obtain a detailed understanding of the novel PAI1 loci detected by Southern blot in Kas-1, C24, Ita-0, and Cvi-0, we constructed genomic libraries from these strains and screened for PAI1 locus clones using direct-repeat and/or PAI1 cDNA probes. Relevant portions of the clones were subcloned and sequenced. We also extended the existing sequences around the cloned WS and Col PAI1 loci. These analyses, as well as the complete sequence of a P1 clone carrying the PAI2 region of the genome (![]()
Sequencing of the WS PAI1-PAI4 locus over a continuous region of ~11 kb (Figure 3) showed that the direct repeats flanking PAI1-PAI4 were nearly identical to each other: the PAI1-proximal repeat was 2880 bp long, and the PAI4-proximal repeat was 2905 bp long because of the presence of a 25-bp insertion near the end furthest away from PAI4. Each of the repeats contained a complete open reading frame for a gene with identity to histone H2B, although this gene was not represented in the Arabidopsis Expressed Sequence Tag (EST) database. Immediately upstream of the PAI1-proximal direct repeat was a 731-bp partial pai4 5' duplication followed by unique sequences with no significant identity to the sequences in the database. The unique sequences immediately beyond the PAI4-proximal direct repeat encoded ribosomal protein S15a (![]()
Previous sequence analysis of the Col PAI1 locus indicated that this locus carries a deleted direct repeat of 813 bp relative to the WS sequence downstream of PAI1, followed by the S15a gene (![]()
The novel methylated ecotypes all had overall structures similar to that of WS at the PAI1 locus, with two direct repeats flanking a PAI1-PAI4 inverted repeat. However, each ecotype had unique variations on this basic pattern (Figure 3). For example, the Kas-1 and C24 ecotypes had the same structure as Col upstream of PAI1. The Ita-0 ecotype also had a Col-like sequence upstream of PAI1, except that substituting for the 33 bp of direct-repeat sequence most proximal to PAI1 was 300 bp of the PAI1-distal end of the direct-repeat sequence in an inverted orientation. In Cvi-0, the region upstream of PAI1 had a more complex structure than any of the other ecotypes examined. Beyond the PAI1-proximal direct repeat, there was a duplication consisting of a full-length PAI gene oriented in the same direction as PAI4 (PAI4*) plus the last 207 bp of a direct-repeat sequence. Beyond this deleted direct-repeat PAI4* duplication was the same upstream sequence found in Col, Kas-1, C24, and Ita-0.
In the region upstream of PAI4 the ecotypes Kas-1, Ita-0, and Cvi-0 each had a full direct repeat followed by S15a sequences similar to WS (Figure 3). In C24, however, 96 bp around the junction between PAI4 and its flanking direct repeat was substituted with 929 bp of heterologous sequences. These sequences consisted of a 911-bp duplication of the sequences found immediately adjacent to the upstream end of the PAI1-proximal direct repeat, followed by 18 bp of unique sequence at the junction with PAI4. Beyond this duplication, C24 carried direct-repeat and S15a sequences analogous to those found in the other ecotypes (Figure 3).
Another focus of our analysis was the sequence between the PAI1-PAI4 inverted repeats, representing the sequences 3' of each PAI gene. In Col, the PAI1 3' sequence extended for 470 bp downstream of the translational stop codon with almost perfect identity to the Col PAI2 gene 3' sequence, followed by 10 bp of heterologous sequence and then 813 bp of flanking direct-repeat sequence (Figure 3 and Figure 4). In the ecotypes WS, Kas-1, and C24, the PAI1-PAI4 inverted-repeat central sequences consisted of deleted palindromes of the 470-bp downstream sequence (Figure 4). In Ita-0 and Cvi-0, PAI1 and PAI4 both carried the full 470-bp downstream sequence with a short novel sequence sandwiched in between. In Ita-0, this central sequence carried 24-bp inverted-repeat ends plus another repeat of the 24-bp sequence and a repeat of the last 43 bp of the PAI2-identical downstream sequences in the middle. In Cvi-0, the central sequence carried a short duplication of a PAI fifth-exon sequence flanked by novel sequences. There was no identity between Ita-0 and Cvi-0 central sequences.
With a combination of sequencing and restriction mapping analysis, we were able to construct HpaII/MspI restriction maps for the novel PAI1 loci. This information plus the observed fragment sizes on a HpaII/MspI Southern blot (Figure 2) allowed a determination of which sites in each PAI1-PAI4 structure were cytosine methylated (marked with asterisks in Figure 3). Methylation was almost entirely contained within the PAI-identical sequences, with little or no spread into flanking direct repeats.
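The logic of inferring methylated sites from a restriction map plus observed fragment sizes can be illustrated with a small sketch. It is a simplified model, not the authors' procedure: it assumes the commonly described behavior of the two enzymes at CCGG (HpaII blocked when either cytosine is methylated, MspI blocked only when the outer cytosine is methylated), treats the region as a linear molecule whose ends act as fragment ends, and ignores which fragments a given probe would detect. The site positions and region length are hypothetical.

```python
from typing import Dict, List, Set


def cut_positions(sites: Dict[int, Set[str]], enzyme: str) -> List[int]:
    """Return the CCGG site positions that the enzyme is able to cut.

    `sites` maps a site position (bp) to the set of methylated cytosines at
    that site: "outer" (first C of CCGG) and/or "inner" (second C).
    """
    cuts = []
    for pos, methylated in sorted(sites.items()):
        if enzyme == "HpaII":
            blocked = bool(methylated)        # assumed: any methylation blocks
        elif enzyme == "MspI":
            blocked = "outer" in methylated   # assumed: inner-C methylation tolerated
        else:
            raise ValueError(f"unknown enzyme: {enzyme}")
        if not blocked:
            cuts.append(pos)
    return cuts


def fragment_sizes(length: int, cuts: List[int]) -> List[int]:
    """Fragment sizes for a linear region of `length` bp cut at `cuts`."""
    bounds = [0] + sorted(cuts) + [length]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical 5000-bp region with three CCGG sites.
region_length = 5000
unmethylated = {1000: set(), 2500: set(), 4000: set()}
methylated = {1000: set(), 2500: {"inner"}, 4000: {"inner", "outer"}}

for label, sites in (("unmethylated", unmethylated), ("methylated", methylated)):
    for enzyme in ("HpaII", "MspI"):
        sizes = fragment_sizes(region_length, cut_positions(sites, enzyme))
        print(label, enzyme, sizes)
```

Working backward, a fragment larger than the map predicts for full cleavage implies that at least one intervening site was protected, which is how the asterisked sites in a restriction map can be assigned.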
PAI cDNA abundance and function studies:
Using a deletion mutant derivative of WS that lacks the PAI1-PAI4 inverted-repeat genes, we previously showed that cytosine methylation correlates with a loss of expression for the PAI2 gene (![]()
To better understand which PAI genes are expressed in PAI-methylated ecotypes, we analyzed PAI transcripts in WS by cloning and sequencing PAI cDNAs from a standard WS library. This approach revealed that PAI1 was the only detectable species (Table 2). In contrast, screening of Col and Ler cDNA libraries revealed that PAI1 and PAI2 were equally abundant in either of these unmethylated PAI ecotypes, with PAI3 approximately fourfold less abundant than either of its sister genes. In both Ler and WS, rare PAI1 cDNAs were found that had upstream direct-repeat sequences spliced to PAI1 first-exon sequences (Table 2). These rare cDNAs are best explained as transcripts that initiate from upstream S15a sequences in the direct-repeat rather than more proximal PAI1 sequences. The material that is deleted between the rare transcript start sites and the PAI1 first-exon site corresponds to predicted intron material in the 5' untranslated region of the S15a transcript on the basis of a comparison between an S15a cDNA sequence (GenBank accession no. R31305) and the genomic sequence. In the Ler 5' spliced transcript, the 3' junction of the S15a first upstream exon is spliced directly to a PAI1 upstream site. In the WS 5' spliced transcript, the 3' junction of the S15a first upstream exon is correctly spliced to the second S15a exon, and then a cryptic site in the second exon is spliced to the same PAI1 upstream site used in the Ler transcript (Table 2).
cDNA analysis also suggested that PAI4 and PAI3, besides being poorly expressed, do not encode functional PAI enzymes. WS PAI4 contains a 9-bp deletion in the fifth exon relative to PAI1 and PAI2 (![]()
Sequencing of PAI3 cDNAs isolated from Col and Ler libraries indicated that the PAI3 transcript was incorrectly spliced to yield a 25-bp insertion of intron sequences between the fourth and fifth exons (Table 2). The insertion is most likely caused by a single-base polymorphism that changes the consensus 3' intron site in the fourth PAI intron from AG to TG. PAI3 cDNAs are predicted to encode a protein with 12 novel amino acids downstream of the fourth exon followed by a premature termination codon. When expressed in a PAI-deficient E. coli strain, the Col PAI3 cDNA failed to complement tryptophan auxotrophy. Therefore, the PAI3 gene is not likely to produce PAI activity in any ecotype that carries the PAI3 splice site polymorphism, including Col, WS, C24, Kas-1, Ita-0, and Cvi-0 (as assessed by sequencing cloned PAI3 genes and/or by monitoring a PstI polymorphism created by the PAI3 splice junction mutation).
RT-PCR analysis of PAI gene expression:
As for WS, the novel PAI-methylated ecotypes Kas-1, C24, Ita-0, and Cvi-0 all had steady-state levels of total PAI transcripts slightly higher than the levels measured in Col (Figure 5A). Each PAI-methylated ecotype also displayed low levels of higher-molecular-weight PAI transcripts, as previously observed in WS (![]()
PAI transcripts expressed from the PAI1-PAI4 inverted-repeat genes can be distinguished from transcripts expressed from the singlet PAI2 and PAI3 genes in most PAI-methylated ecotypes by a restriction site polymorphism. Specifically, the PAI1 and PAI4 genes in the ecotypes WS, Kas-1, C24, and Ita-0 lack a conserved second exon SacI site, whereas the PAI2 and PAI3 genes carry this site. We performed RT-PCR on total RNA from these ecotypes using primers to common PAI sequences in the first and fourth exons that flank the polymorphic SacI site and then cleaved the PCR product with SacI. This analysis showed that in WS, Kas-1, C24, and Ita-0, none of the PAI RT-PCR product cleaved with SacI (Figure 5B), indicating that PAI2 and PAI3 are not expressed at significant levels in these ecotypes (estimated to be <10% of total transcripts) and that the bulk of expression comes from PAI1 and/or PAI4. As a control for SacI cleavage, we performed the analogous RT-PCR reaction on RNA from the Col and Cvi-0 ecotypes, where all the PAI genes carry the SacI site. We found that in these ecotypes, the PCR product was completely cleaved.
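The reasoning behind the SacI assay can be written out as a small calculation: the fraction of RT-PCR product that cleaves equals the share of transcripts coming from genes that carry the site. The sketch below is illustrative only. The presence or absence of the SacI site per gene follows the text, but the expression levels are hypothetical placeholders, and the model assumes that all transcripts amplify with equal efficiency.

```python
from typing import Dict


def fraction_cleaved(expression: Dict[str, float],
                     has_site: Dict[str, bool]) -> float:
    """Share of RT-PCR product expected to be cut by the enzyme."""
    total = sum(expression.values())
    if total <= 0:
        raise ValueError("no transcript to amplify")
    cleavable = sum(level for gene, level in expression.items() if has_site[gene])
    return cleavable / total


# SacI site status as described in the text.
ws_like_sites = {"PAI1": False, "PAI4": False, "PAI2": True, "PAI3": True}
col_sites = {"PAI1": True, "PAI2": True, "PAI3": True}

# Hypothetical relative transcript levels, for illustration only.
ws_like_expression = {"PAI1": 1.0, "PAI4": 0.0, "PAI2": 0.0, "PAI3": 0.0}
col_expression = {"PAI1": 4.0, "PAI2": 4.0, "PAI3": 1.0}

ws_fraction = fraction_cleaved(ws_like_expression, ws_like_sites)
col_fraction = fraction_cleaved(col_expression, col_sites)

print(f"WS-like fraction cleaved: {ws_fraction:.2f}")
print(f"Col fraction cleaved: {col_fraction:.2f}")
```

In this model an uncleaved product cannot by itself separate PAI1 from PAI4, since both lack the site, so a second polymorphism is needed to tell the two inverted-repeat genes apart.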
In the ecotypes with the PAI SacI polymorphism, additional RT-PCR experiments that distinguish between PAI1 and PAI4 transcripts showed that only PAI1 was significantly expressed. For WS, PAI1 and PAI4 can be distinguished by the 9-bp deletion in the PAI4 fifth exon. RT-PCR of this region with flanking primers to common PAI sequences showed that there is no detectable PAI4-sized transcript (Figure 6A). Therefore, this experiment confirms the result of the WS cDNA abundance analysis, that PAI1 is the abundant transcript. The primer set used in this analysis also flanks the PAI3 25-bp intron insertion region between the fourth and fifth exons, but no PAI3-sized transcript was detected in either WS or Col. Because cDNA analysis shows that PAI3 can be transcribed in Col (Table 2), our failure to detect this species via RT-PCR of total Col RNA suggests that our assay conditions were not sensitive enough to detect this weakly expressed gene. In particular, the PAI3 transcript might have been discriminated against during PCR amplification because it is longer than the PAI1 or PAI4 species. Nonetheless, this RT-PCR experiment suggests that PAI3 is not a major transcript in either WS or Col.
In the ecotype Kas-1, PAI1 can be distinguished from PAI4 by a polymorphism that destroys a conserved PstI site in the third exon of PAI1. RT-PCR analysis of Kas-1 RNA with the same primer set used for the SacI analysis, followed by cleavage of the PCR product with PstI, showed that none of the Kas-1 product was cleaved (Figure 6B). As a control, the same product amplified from Col RNA, where all three PAI genes contain the PstI site, was completely cleaved. Therefore, like WS, Kas-1 expresses transcripts primarily from PAI1. Furthermore, the Kas-1 PAI4 gene has a deletion of 7 bp in the second exon, which is predicted to alter the reading frame so that the protein is terminated near the end of the putative chloroplast transit sequence (![]()
In the ecotype C24, PAI1 can be distinguished from PAI4 by a 6-bp deletion in the PAI4 third exon. RT-PCR of this region with flanking primers to common PAI sequences showed that there is no detectable PAI4-sized transcript (Figure 6C). Furthermore, the C24 PAI4 gene carries a 91-bp deletion extending from the middle of the fourth intron into the middle of the fifth exon, which is predicted to disrupt the correct splicing of the fourth-intron and fifth-exon coding sequences. Thus, regardless of expression, no functional PAI enzyme would be produced from the C24 PAI4 gene.
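Several of the polymorphisms described above are small insertions or deletions within coding sequence, and whether each one shifts the reading frame follows from its length modulo 3. The sketch below applies that arithmetic to the sizes given in the text. The labels are illustrative, and the C24 91-bp deletion is left out because it spans an intron-exon boundary, where this simple rule does not apply.

```python
# Net length change in the transcript coding sequence (bp), from the text.
exon_indels = {
    "WS PAI4, fifth-exon deletion": -9,
    "Kas-1 PAI4, second-exon deletion": -7,
    "C24 PAI4, third-exon deletion": -6,
    "PAI3 transcript, retained intron sequence": +25,
}


def preserves_reading_frame(length_change: int) -> bool:
    """An indel keeps the downstream frame only if its length is a multiple of 3."""
    return length_change % 3 == 0


frame_status = {label: preserves_reading_frame(change)
                for label, change in exon_indels.items()}

for label, in_frame in frame_status.items():
    change = exon_indels[label]
    print(f"{label}: {change:+d} bp, "
          f"{'in frame' if in_frame else 'frameshift'}")
```

The frameshift calls agree with the text, which describes an altered reading frame for the Kas-1 7-bp deletion and a premature termination codon for the PAI3 insertion. An in-frame result says nothing on its own about whether the encoded protein is functional.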
In the ecotype Ita-0, PAI1 can be distinguished from PAI4 by a polymorphism that destroys a conserved DdeI site just upstream of the translational start in the first exon. This polymorphic site is contained in the RT-PCR product from the SacI analysis shown in Figure 5B. However, it was difficult to visualize the polymorphism by direct cleavage of the RT-PCR fragment with DdeI followed by gel electrophoresis because the relevant cleavage products were obscured by background nonspecific PCR species. We therefore used the alternative strategy of subcloning the Ita-0 SacI analysis RT-PCR products into a pBluescript II KS+ plasmid vector and analyzing individual clones by DdeI digest (Figure 6D). An analogous subclone of a Cvi-0 PAI1 RT-PCR product was used as a control for the restriction pattern given by a PAI gene carrying the upstream DdeI site. All of 24 independent Ita-0 subclones tested in this way lacked the upstream DdeI site, indicating that in Ita-0, PAI1 is the only abundant transcript species.
To determine which PAI transcripts are expressed in Cvi-0, we cloned the RT-PCR fragments generated in the SacI analysis (Figure 5B) and sequenced 10 independent clones, using single-base polymorphisms unique to each PAI gene as a means of identification. All 10 Cvi-0 clones were PAI1. Therefore, in all PAI-methylated ecotypes, PAI1 is the only significantly expressed PAI gene.
| DISCUSSION |
|---|
Our studies of cytosine methylation, structural polymorphisms, and gene expression indicate that an inverted repeat in the Arabidopsis genome displays a number of unusual behaviors. The PAI1-PAI4 inverted-repeat structures analyzed here consist of ~2 kb of nearly perfect mirror image sequence separated by not more than 247 bp and not less than 90 bp of nonpalindromic sequence (Figure 4). The inverted repeats are densely cytosine methylated over their regions of mirror image identity, without a significant spread into neighboring sequences (Figure 2 and Figure 3). Moreover, the inverted repeats are associated with dense methylation of the unlinked identical sequences PAI2 and PAI3 (Figure 2). These observations suggest that inverted repeats provide uniquely favorable substrates for methylation. The wide variation in PAI inverted-repeat structures observed across ecotypes (Figure 3 and Figure 4) suggests that these structures are unusually unstable, perhaps because they are difficult to replicate accurately and/or because they are recombinationally very active. Finally, in contrast to the singlet PAI2 gene, the inverted-repeat PAI1 gene is not silenced by cytosine methylation, suggesting that it is relatively exposed to the transcription machinery.
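For readers less familiar with the term, the mirror-image arrangement described here means that one arm of the structure is the reverse complement of the other, with a nonpalindromic spacer in between. The sketch below checks that property on a toy sequence. The function names, the 10-bp arms, and the 6-bp spacer are invented for illustration; the actual PAI1-PAI4 arms are ~2 kb with 90 to 247 bp of central sequence.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_inverted_repeat(seq: str, arm_length: int, max_mismatches: int = 0) -> bool:
    """True if the last `arm_length` bases mirror the first `arm_length` bases.

    `max_mismatches` allows for arms that are nearly, but not perfectly, identical.
    """
    if arm_length <= 0 or 2 * arm_length > len(seq):
        raise ValueError("arms must be non-empty and must not overlap")
    left_arm = seq[:arm_length]
    right_arm = seq[-arm_length:]
    expected_right = reverse_complement(left_arm)
    mismatches = sum(a != b for a, b in zip(right_arm, expected_right))
    return mismatches <= max_mismatches


def spacer_length(seq: str, arm_length: int) -> int:
    """Length of the central sequence between the two arms."""
    return len(seq) - 2 * arm_length


# Toy example: arm, spacer, then the reverse complement of the arm.
arm = "ATGGCGTTAC"
spacer = "GGGAAA"
inverted = arm + spacer + reverse_complement(arm)
direct = arm + spacer + arm  # same arm repeated in the same orientation

print(is_inverted_repeat(inverted, len(arm)))
print(is_inverted_repeat(direct, len(arm)))
print(spacer_length(inverted, len(arm)))
```

The same check distinguishes the inverted PAI1-PAI4 arrangement from the flanking direct repeats, which are copies in the same orientation.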
Inverted repeats trigger cytosine methylation:
Our original observation that the PAI genes are methylated in the WS ecotype but not in the Col ecotype could be explained by several models, including the difference in PAI gene arrangement and copy number between the two strains or strain-to-strain variation in the efficiency of the methylation machinery. The observations reported here argue that PAI gene arrangements, rather than variations in other loci, determine PAI cytosine methylation. Specifically, none of the 34 ecotypes with two or three unlinked singlet PAI genes displays PAI methylation, whereas all 7 ecotypes with a PAI1-PAI4 inverted repeat at the PAI1 locus display dense PAI methylation (Table 1). Consistent with the model that the inverted repeat locus triggers cytosine methylation, we found that when the WS inverted repeat is combined with the unmethylated Col PAI genes in WS x Col hybrid plants, the Col PAI genes become methylated de novo within a few generations of inbreeding (![]()
The inverted repeat could promote cytosine methylation via DNA/DNA interactions, RNA/DNA interactions, or both. Evidence suggesting that PAI methylation is triggered by DNA/DNA interactions comes from two observations. First, methylation is coextensive with PAI DNA sequence identity, including intron and promoter sequences (Figure 3; ![]()
An alternative model is that the inverted repeat gives rise to unusual hairpin RNA molecules as a result of readthrough transcription, and that these molecules promote PAI cytosine methylation and silencing, perhaps because they are converted into double-stranded RNA molecules (![]()
Genesis of the PAI gene family:
The sequence divergence between PAI3 and its sister PAI genes suggests that the two classes of genes are relatively evolutionarily distant from each other. However, the close sequence identity among PAI1, PAI2, and PAI4 argues that they arose more recently from a common progenitor. Transposon-mediated rearrangements provide a mechanism for horizontal transfer of sequences within a genome and for generation of tandem-sequence duplications. For example, transposon-generated, inverted-repeat sequence duplications have been characterized previously at the nivea locus in Antirrhinum majus (![]()
Regardless of the mechanism(s) of PAI sequence duplication, it is most likely that PAI2 on chromosome 5 was the progenitor gene that was copied to produce PAI1 and/or PAI1-PAI4 on chromosome 1. In particular, the 3' sequences that are duplicated in PAI1, PAI2, and PAI4 (but not PAI3) include the 3' end of the FAD8 gene (![]()
Our analysis of structural variants of the PAI1 locus across six ecotypes of Arabidopsis suggests that all the variants are related and arise from a common progenitor structure. Our data support either of two models for the genesis of the PAI1 locus. One model is that the structure found in Col and 32 other ecotypes (Table 1) is the progenitor structure and that this structure underwent a duplication and rearrangement event to generate an inverted repeat of PAI genes flanked by full direct-repeat sequences in one unusual lineage. This initial inverted-repeat structure then underwent further deletions and rearrangements to generate the variety of inverted-repeat structures observed in PAI-methylated ecotypes today. Presumably, the high degree of variation from the progenitor inverted-repeat structure was due to both the instability of the inverted repeat (![]()
An alternative model is that the progenitor structure of all ecotypes was an inverted repeat of two PAI genes flanked by full-length direct repeats. This common structure could have given rise to the Col structure by deletion of one PAI gene and part of a flanking direct repeat, and to the inverted-repeat ecotype structures by other rearrangement events (Figure 3). In this scenario, the predominance of the Col structure over methylated inverted-repeat structures in the wild population (Table 1) might reflect a greater fitness of the Col structure. Because Col expresses PAI enzyme from two unlinked genes, PAI1 and PAI2 (Table 2), this redundancy protects Col from deleterious consequences of PAI gene mutations. In contrast, the PAI-methylated ecotypes express PAI enzyme only from the PAI1 gene (Table 2, Figure 5 and Figure 6), making them vulnerable to tryptophan auxotrophy via PAI gene mutation. This vulnerability might therefore account for the underrepresentation of PAI-methylated ecotypes in the wild population.
Although Col and the inverted-repeat ecotypes all have the potential to yield a deletion at the PAI1 locus because of homologous recombination between the flanking direct-repeat sequences (![]()
In the PAI-methylated ecotypes Cvi-0, Ita-0, C24, Kas-1, and WS, the major structural difference is the sequence between the inverted-repeat PAI genes (Figure 3 and Figure 4). Cvi-0 and Ita-0 are the only ecotypes that carry extra sequences in this central region relative to the sequences downstream of Col PAI1 and PAI2 (Figure 4). Other structural differences unique to particular ecotypes can best be explained as secondary events (Figure 3). For example, the WS pai4 5' duplication could have been generated by pairing between the direct-repeat sequences, followed by gene conversion of the sequences adjacent to the PAI1-proximal repeat to sequences adjacent to the PAI4-proximal repeat. Similarly, the C24 PAI4 promoter rearrangement could have been generated by pairing between the direct-repeat sequences, followed by gene conversion of the sequences adjacent to the PAI4-proximal repeat to sequences adjacent to the PAI1-proximal repeat. The Ita-0 PAI1 promoter rearrangement could have been generated by pairing between the two inverted-repeat PAI genes, with gene conversion of the PAI1 promoter sequences to the PAI4 promoter sequences. The Cvi-0 PAI4* duplication is most simply explained by unequal crossing over between direct repeats to amplify a structure consisting of direct-repeat 1-PAI1*-PAI4*-direct-repeat 2-PAI1-PAI4-direct-repeat 3, followed by a deletion of most of direct-repeat 1 and PAI1*.
Cytosine methylation has been shown to suppress homologous recombination (![]()
Cytosine methylation and PAI gene expression:
Cytosine methylation is usually correlated with a loss of transcription from methylated sequences (![]()
Why is the methylated PAI1 gene expressed while the methylated PAI2 gene is silenced? Detailed analysis of cytosine methylation patterns for WS PAI1 and PAI2 has revealed no significant differences in the extent or density of methylation (![]()
| ACKNOWLEDGMENTS |
|---|
We thank Laura Pawlowski, Gromoslaw Smolen, and Tomoko Hamma for technical assistance. This work was supported by a Basil O'Connor Starter Scholar Award 5-FY98-0535 from the March of Dimes, a National Institute of Environmental Health Sciences Training Grant to S.M. (ES 07141), and a Searle Scholars Award 97-E-103 to J.B.
Manuscript received April 1, 1999; Accepted for publication May 10, 1999.
| LITERATURE CITED |
|---|
AUSUBEL, F. M., R. BRENT, R. E. KINGSTON, D. D. MOORE, J. G. SEIDMAN et al., 1989 Current Protocols in Molecular Biology. Greene Publishing Associates and Wiley-Interscience, New York.
BENDER, J., 1998 Cytosine methylation of repeated sequences in eukaryotes: the role of DNA pairing. Trends Biochem. Sci. 23:252-256.
BENDER, J. and G. R. FINK, 1995 Epigenetic control of an endogenous gene family is revealed by a novel blue fluorescent mutant of Arabidopsis. Cell 83:725-734.
BHATT, A. M., C. LISTER, N. CRAWFORD, and C. DEAN, 1998 The transposition frequency of Tag1 elements is increased in transgenic Arabidopsis lines. Plant Cell 10:427-434.
BOLLMANN, J., R. CARPENTER, and E. S. COEN, 1991 Allelic interactions at the nivea locus of Antirrhinum. Plant Cell 3:1327-1336.
BONHAM-SMITH, P. C. and M. M. MOLONEY, 1994 Nucleotide and protein sequences of a cytoplasmic ribosomal protein S15a gene from Arabidopsis thaliana. Plant Physiol. 106:401-402.
CHURCH, G. M. and W. GILBERT, 1984 Genomic sequencing. Proc. Natl. Acad. Sci. USA 81:1991-1995.
ELLEDGE, S. J., J. T. MULLIGAN, S. W. RAMER, M. SPOTTSWOOD, and R. D. DAVIS, 1991 λYES: a multifunctional cDNA expression vector for the isolation of genes by complementation of yeast and Escherichia coli mutations. Proc. Natl. Acad. Sci. USA 88:1731-1735.
GIBSON, S., V. ARONDEL, K. IBA, and C. SOMERVILLE, 1994 Cloning of a temperature-regulated gene encoding a chloroplast omega-3 desaturase from Arabidopsis thaliana. Plant Physiol. 106:1615-1621.
HANFSTINGL, U., A. BERRY, E. A. KELLOGG, J. T. COSTA, III, W. RUDIGER et al., 1994 Haplotypic divergence coupled with lack of diversity at the Arabidopsis thaliana alcohol dehydrogenase locus: roles for both balancing and directional selection? Genetics 138:811-828.
HENDERSON, S. T. and T. D. PETES, 1993 Instability of a plasmid-borne inverted repeat in Saccharomyces cerevisiae. Genetics 134:57-62.
INNAN, H., R. TERAUCHI, and N. T. MIYASHITA, 1997 Microsatellite polymorphism in natural populations of the wild plant Arabidopsis thaliana. Genetics 146:1441-1452.
JEDDELOH, J. A., J. BENDER, and E. J. RICHARDS, 1998 The DNA methylation locus DDM1 is required for maintenance of gene silencing in Arabidopsis. Genes Dev. 12:1714-1725.
KASS, S. U., D. PRUSS, and A. P. WOLFFE, 1997 How does DNA methylation repress transcription? Trends Genet. 13:444-449.
KUNKEL, T. A., J. D. ROBERTS, and R. A. ZAKOUR, 1987 Rapid and efficient site-specific mutagenesis without phenotypic selection. Methods Enzymol. 154:367-382.
LAST, R. L., P. H. BISSINGER, D. J. MAHONEY, E. R. RADWANSKI, and G. R. FINK, 1991 Tryptophan mutants in Arabidopsis: the consequences of duplicated tryptophan synthase β genes. Plant Cell 3:345-358.
LI, J., J. ZHAO, A. B. ROSE, R. SCHMIDT, and R. L. LAST, 1995 Arabidopsis phosphoribosylanthranilate isomerase: molecular genetic analysis of triplicate tryptophan pathway genes. Plant Cell 7:447-461.
LUFF, B., L. PAWLOWSKI, and J. BENDER, 1999 An inverted repeat triggers cytosine methylation of identical sequences in Arabidopsis. Mol. Cell 3:505-511.
MALOISEL, L. and J.-L. ROSSIGNOL, 1998 Suppression of crossing-over by DNA methylation in Ascobolus. Genes Dev. 12:1381-1389.
METTE, M. F., J. VAN DER WINDEN, M. A. MATZKE, and A. J. M. MATZKE, 1999 Production of aberrant promoter transcripts contributes to methylation and silencing of unlinked homologous promoters in trans. EMBO J. 18:241-248.
MINET, M., M.-E. DUFOUR, and F. LACROUTE, 1992 Complementation of Saccharomyces cerevisiae auxotrophic mutants by Arabidopsis thaliana cDNAs. Plant J. 2:417-422.
MONTGOMERY, M. K. and A. FIRE, 1998 Double-stranded RNA as a mediator in sequence-specific genetic silencing and co-suppression. Trends Genet. 14:255-258.
NAGY, F., S. A. KAY, and N.-H. CHUA, 1988 Analysis of gene expression in transgenic plants, pp. B4/11B4/12 in Plant Molecular Biology Manual, edited by S. B. GELVIN and R. A. SCHILPEROORT. Kluwer Academic Publishers, Dordrecht, The Netherlands.
NIYOGI, K. K. and G. R. FINK, 1992 Two anthranilate synthase genes in Arabidopsis: defense-related regulation of the tryptophan pathway. Plant Cell 4:721-733.
NIYOGI, K. K., R. L. LAST, G. R. FINK, and B. KEITH, 1993 Suppressors of trp1 fluorescence identify a new Arabidopsis gene, TRP4, encoding the anthranilate synthase β subunit. Plant Cell 5:1011-1027.
RUSKIN, B. and G. R. FINK, 1993 Mutations in POL1 increase the mitotic instability of tandem inverted repeats in Saccharomyces cerevisiae. Genetics 134:43-56.
SATO, S., H. KOTANI, Y. NAKAMURA, T. KANEKO, E. ASAMIZU et al., 1997 Structural analysis of Arabidopsis thaliana chromosome 5. I. Sequence features of the 1.6 Mb regions covered by twenty physically assigned P1 clones. DNA Res. 4:215-230.
STINARD, P. S., D. S. ROBERTSON, and P. S. SCHNABLE, 1993 Genetic isolation, cloning, and analysis of a Mutator-induced, dominant antimorph of the maize amylose extender1 locus. Plant Cell 5:1555-1566.
WATERHOUSE, P. M., M. W. GRAHAM, and M.-B. WANG, 1998 Virus resistance and gene silencing in plants can be induced by simultaneous expression of sense and antisense RNA. Proc. Natl. Acad. Sci. USA 95:13959-13964.





