Allopolyploidy is formed by combining two or more divergent genomes and occurs throughout the evolutionary history of many plants and some animals. Transcriptome analysis indicates that many genes in various biological pathways, including flowering time, are expressed nonadditively (different from the midparent value). However, the mechanisms for nonadditive gene regulation in a biological pathway are unknown. Natural variation of flowering time is largely controlled by two epistatically acting loci, namely FRIGIDA (FRI) and FLOWERING LOCUS C (FLC). FRI upregulates FLC expression that represses flowering in Arabidopsis. Synthetic Arabidopsis allotetraploids contain two sets of FLC and FRI genes originating from Arabidopsis thaliana and A. arenosa, respectively, and flower late. Inhibition of early flowering is caused by upregulation of A. thaliana FLC (AtFLC) that is trans-activated by A. arenosa FRI (AaFRI). Two duplicate FLCs (AaFLC1 and AaFLC2) originating from A. arenosa are expressed in some allotetraploids but silenced in other lines. The expression variation in the allotetraploids is associated with deletions in the promoter regions and first introns of A. arenosa FLCs. The strong AtFLC and AaFLC loci are maintained in natural Arabidopsis allotetraploids, leading to extremely late flowering. Furthermore, FLC expression correlates positively with histone H3-Lys4 methylation and H3-Lys9 acetylation and negatively with H3-Lys9 methylation, epigenetic marks for gene activation and silencing. We provide evidence for interactive roles of regulatory sequence changes, chromatin modification, and trans-acting effects in natural selection of orthologous FLC loci, which determines the fate of duplicate genes and adaptation of allopolyploids during evolution.
GENOME sequence analysis indicates that polyploidy can be found throughout the evolutionary history and genetic diversity of all eukaryotes (Ohno 1970), including flowering plants (Masterson 1994). Many important agricultural crops, including wheat, cotton, and Brassica, are allopolyploid, and many, including maize and Arabidopsis (Masterson 1994; Blanc and Wolfe 2004), have identifiable polyploidy in their ancestry. The common occurrence of allopolyploids in nature suggests that the combination of evolutionarily divergent genomes confers selective advantages (Grant 1981; Levin 1983; Ramsey and Schemske 1998; Matzke et al. 1999; Wendel 2000; Levy and Feldman 2002; Osborn et al. 2003; Comai 2005). Some duplicate loci contribute to a wider range of enzyme and biochemical activity than their parental genotypes (i.e., hybrid vigor), whereas others may become functionally divergent during evolution. Therefore, allopolyploids are generally more adaptive than their progenitors in higher altitudes and latitudes and in broader climates (Grant 1981).
Biologically, allopolyploidy provides a unique system for studying regulatory interactions between evolutionarily divergent loci originating in orthologous genomes. In the early stages of polyploidization, an allopolyploid must reconcile regulatory and transcriptome divergence (Wang et al. 2006) between the divergent species that have been separated for millions of years. As a result, many genes involved in various regulatory pathways are expressed nonadditively (or differently from the midparent value) (Wang et al. 2004, 2006). However, the underlying molecular mechanisms for nonadditive gene regulation are poorly understood (Osborn et al. 2003; Chen and Ni 2006). Flowering time is an important trait for plant evolution and speciation. The pathways affecting the induction of flowering in response to vernalization or autonomous regulation (under short days) have been elucidated in Arabidopsis (Simpson and Dean 2002; Simpson et al. 2004; He and Amasino 2005). In the genetic pathway, FRIGIDA (FRI) acts epistatically on FLOWERING LOCUS C (FLC) and FRI enhances FLC expression and inhibits early flowering (Johanson et al. 2000). FLC, a MADS-box transcription factor, represses flowering (Michaels and Amasino 1999; Sheldon et al. 2002). The FLC locus has strong and weak alleles in A. thaliana ecotypes (e.g., Columbia vs. Ler) because of transposon insertion in the first intron (Gazzani et al. 2003; Michaels et al. 2003). FRI is a coiled-coil nuclear protein and exerts positive and epistatic effects on FLC expression (Johanson et al. 2000).
Here we studied the genetic basis and phenotypic outcome of regulatory interactions between FRI and FLC loci in Arabidopsis allopolyploids. The synthetic allotetraploids contain both A. thaliana FLC (AtFLC) and A. arenosa FLC (AaFLC) and only A. arenosa FRI (AaFRI) because A. thaliana FRI (AtFRI) is nonfunctional. In the natural allotetraploid A. suecica, AsFLC1 and -2 diverged from A. arenosa AaFLC1 and -2, respectively, whereas AtFLC originating from A. thaliana is identical to that in A. thaliana. We demonstrated that genetic interactions between A. arenosa FRI and A. thaliana FLC and A. arenosa FLC loci contribute to late flowering in the synthetic allotetraploids and natural A. suecica strains. AaFRI complements defective AtFRI through trans-activation of AtFLC, which determines late flowering in the synthetic allotetraploids. Genetic interactions between AaFRI and AtFLC loci provide a molecular basis for selecting strong AtFLC and AsFLC1 in A. suecica, coincident with its natural habitats in cold climates (O'Kane et al. 1995; Sall et al. 2003). The trans-acting effects of AaFRI on AtFLC are related to histone acetylation and methylation at some specific lysine residues. AaFLC and AtFLC expression displays allelic variation and correlates with changes in cis-regulatory elements. We propose a model that suggests cis- and trans-regulation and chromatin modification of divergent orthologous loci (e.g., FRI and FLC) in the progenitors determine the fate of duplicate regulatory pathways during allopolyploid formation and evolution.
MATERIALS AND METHODS
The synthetic A. suecica lines were produced by pollinating an autotetraploid A. thaliana (Ler, accession no. CS3900) (2n = 4x = 20) with autotetraploid A. arenosa (CS3901) (2n = 4x = 32). A total of 25 allotetraploid lines (F1) were produced. Three independent allotetraploid lines, Allo733 (CS3895), Allo738 (CS3896), and Allo745 (CS3897), were selfed to the sixth generation (S6) (Wang et al. 2004). A. suecica strains were AsLC1 (CS22505), As9502 (CS22509), and As13 (a gift from Luca Comai at the University of Washington). A. thaliana Columbia (Col) (CS6673) and San Feliu 2 (SF2) (CS1516) were obtained from the Arabidopsis Biological Resource Center. Seeds harvested from the allotetraploid plants were germinated on Murashige–Skoog medium (Sigma). To mimic a winter/spring transition, we grew plants in a growth chamber with short-day conditions (24°/20° day/night and 8 hr of light/day) for 8 weeks followed by long-day conditions (24°/20° day/night and 16 hr of light/day). The transgenic seedlings (T0) resistant to kanamycin (50 mg/liter) were selected and grown under the long-day conditions. Except as noted otherwise, seedling leaves were harvested at the vegetative stage (4 weeks in A. thaliana, 8 weeks in A. arenosa, synthetic allotetraploids, and A. suecica), and subjected to DNA and RNA analyses and chromatin immunoprecipitation (ChIP) assays. Flowering time was recorded from the date of seed germination to the development of the first flower. A total of 24 plants (each genotype) in three replications each with 8 plants were used for statistical analysis.
Molecular cloning and sequencing analysis of A. arenosa FLC and FRI genes in A. arenosa and A. suecica:
Total RNA was isolated using the Trizol reagent (Invitrogen). Full-length AaFLC cDNA fragments were amplified from A. arenosa and a natural A. suecica line (As9502) using the forward and reverse primers AaFLC-F: 5′-AAATTAGGGCACAAAGCCCTCTCGG-3′ and AaFLC-R: 5′-CAACCGCCGATTTAAGGTGGCTA-3′. To clone AaFRI, we used a 5′-RACE kit (Invitrogen) to obtain and sequence the 5′ region of AaFRI in A. arenosa. The full-length AaFRI cDNA was then amplified using the primer pair AaFRI-F: 5′-CGCTTTCTCATGGCCAATTAT-3′ and AaFRI-R: 5′-CGCGGATCCTGCATTCTTAAGCCCCAAAC-3′. A. arenosa and A. suecica FLC promoter regions were amplified using the primer pair pFLC-F: 5′-ATGGCGAAGGTGAAATGCATAC-3′, located in At5g10150, upstream of AtFLC, and pFLC-R: 5′-AGCTTTCTCGATGAGACCGT-3′, located in the 5′ end of the first exon in AtFLC. The first intron regions of AaFLC and AsFLC were amplified by PCR using primers designed from exons 1 and 2, respectively, InFLC-F: 5′-AGCCCTCTCGGAGACAGAAG-3′ and InFLC-R: 5′-CAGGCTGGAGAGATGACAAA-3′.
We cloned the PCR-amplified full-length cDNA fragments into pGEM-T vector (Promega) and sequenced 5–20 individual inserts from each cloning event. DNA sequence data were analyzed using DNAStar LASERGENE programs, version 5.05. Multiple alignments were performed using Megalign ClustalW alignment of DNAStar, and percent amino acid identity indicates the percentage of identical residues between complete full-length cDNA sequences. Phylogenetic trees were constructed using ClustalW protein alignments and PAUP 4.0 (Swofford 2003), using maximum parsimony analysis with heuristic search and stepwise addition options, and were confirmed using bootstrap analysis with heuristic search and 1,000 replicates.
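The percent identity values reported from the alignments can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned to equal length (for example, exported from a ClustalW alignment), with gaps written as "-". The function name and the choice to count a gap opposite a residue as a mismatch are assumptions made for illustration; the Megalign software may use a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical columns between two pre-aligned sequences.

    Columns in which both sequences carry a gap are skipped; a gap
    opposite a residue counts as a mismatch (an illustrative convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * identical / compared
```

For instance, `percent_identity("ACGT", "ACGA")` compares four columns, three of which match.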
For protein domain analysis, we used the translated amino acid sequence of AaFRI as query in the InterPro database (http://www.ebi.ac.uk/interpro/index.html) and SWISS-PROT (http://us.expasy.org/ExpasyHunt/).
RT–PCR and cleaved amplified polymorphic sequence analyses:
For each reaction, 10 μg of total RNA was treated with DNase I, and the first-strand cDNA was synthesized using SuperScript II reverse transcriptase (Invitrogen). An aliquot (1/100) of cDNA was used as template in the PCR with one cycle of 94° for 2 min followed by 30 cycles of amplification at 94° for 30 sec, 53° for 30 sec, and 72° for 90 sec. Act2 was used as internal control (Wang et al. 2004). AtFLC, AaFLC, or AsFLC transcripts from the position 223 to 491 (+1 ATG codon) were amplified using the primer pair shown in supplemental Figure 1 (http://www.genetics.org/supplemental/). The amplified products were digested with ClaI to distinguish the transcripts between AtFLC and AaFLC1/AaFLC2 or AsFLC1/AsFLC2.
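The logic of the cleaved amplified polymorphic sequence assay can be sketched in silico: an amplicon that contains a ClaI site (ATCGAT, cut as AT^CGAT) yields two or more fragments, whereas an amplicon lacking the site stays intact. The function name and the toy sequences in the usage note are hypothetical and are not the FLC amplicons used in this study.

```python
from typing import List

CLAI_SITE = "ATCGAT"  # ClaI recognition sequence
CUT_OFFSET = 2        # ClaI cuts after the second base: AT^CGAT


def clai_fragments(amplicon: str) -> List[int]:
    """Return top-strand fragment lengths after a complete ClaI digest."""
    seq = amplicon.upper()
    cuts = []
    start = seq.find(CLAI_SITE)
    while start != -1:
        cuts.append(start + CUT_OFFSET)
        start = seq.find(CLAI_SITE, start + 1)
    # Fragment boundaries: amplicon start, every cut position, amplicon end.
    bounds = [0] + cuts + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]
```

A toy 13-bp sequence with one site, `clai_fragments("GGGATCGATCCCC")`, gives fragments of 5 and 8 bases, while a sequence without the site is returned as a single fragment of its full length.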
Allele-specific gene expression with TaqMan detection system:
To measure quantitative levels of allele-specific expression, quantitative PCR was performed in an ABI7500 thermal cycler (Applied Biosystems) with the TaqMan probe detection system. Briefly, we used the primer pair TM-FLC-F: 5′-GAAACA(A/G)CATGCTGATCTTAAA-3′ and TM-FLC-R: 5′-CAT(A/G)GTGTG(A/G)ACCATAGTTCGAGCTT-3′ to amplify both AtFLC and AaFLC, which were detected specifically and quantified simultaneously by two TaqMan probes, namely, AtFLC-probe: FAM-CTTGGATCATCAGTCAA-MGBNFQ and AaFLC-probe: VIC-CCTTGGATATTCAGTCAA-MGBNFQ. The data were normalized to the expression level of an internal control Act2 (At5g09810), which was quantified by the PCR products amplified using primer pair: 5′-GTCTGTGACAATGGAACTGGAA-3′ and 5′-CTTTCTGACCCATACCAACCAT-3′.
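Normalization to Act2 can be illustrated with a small sketch. The text does not state which quantification formula was used, so the 2^(-ΔCt) calculation below is an assumption, as are the function names, the premise of roughly 100% amplification efficiency, and the premise that the two probes report with equal efficiency.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Expression of a target relative to a reference gene as 2^-(dCt).

    Assumes about 100% amplification efficiency for both assays.
    """
    return 2.0 ** -(ct_target - ct_reference)


def allele_fractions(ct_atflc: float, ct_aaflc: float, ct_act2: float) -> dict:
    """Share of AtFLC and AaFLC signal after normalizing each to Act2."""
    at = relative_expression(ct_atflc, ct_act2)
    aa = relative_expression(ct_aaflc, ct_act2)
    total = at + aa
    return {"AtFLC": at / total, "AaFLC": aa / total}
```

With hypothetical threshold cycles of 24 for AtFLC, 25 for AaFLC, and 22 for Act2, AtFLC accounts for two-thirds of the combined normalized signal.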
Single-strand conformation polymorphism analysis:
Single-strand conformation polymorphism (SSCP) analysis was performed using the 0.5× mutation detection enhancement (MDE, Cambrex Bio Science) gel containing 7.5% (w/v) urea. After electrophoresis at room temperature for 18 hr with constant watts (6 W), the gel was fixed in 10% acetic acid for 30 min followed by washing three times 5 min each in deionized water. The gel was stained using silver nitrate solution (0.5 g silver nitrate in 500 ml H2O) for 30 min and quickly rinsed in H2O and transferred to an ice-cold sodium carbonate solution [45 g sodium carbonate in 1500 ml H2O plus 450 μl thiosulfate solution (10 mg/ml) and 1.5 ml formaldehyde]. When clear bands appeared, 10% acetic acid was added to stop development. Gel images were taken using a CCD camera.
To determine allelic variation in allotetraploids and their progenitors, we cloned and sequenced every fragment present in the SSCP gel (Figure 6A). The sequence data were used to designate specific allele/locus names.
Plasmid construction and plant transformation:
The plasmid pART27-35S-AaFRI was constructed as follows. The EcoRI/BamHI fragment in pGEM-T Easy vector (Promega) consisting of full-length AaFRI cDNA was cloned into compatible sites of pKANNIBAL vector, resulting in a 35S-AaFRI cassette, which was subcloned into binary vector pART27. The vectors were generously supplied by Peter Waterhouse of the CSIRO Plant Industry, Canberra, Australia. Arabidopsis (Columbia and Ler) was transformed by Agrobacterium-mediated transformation (floral dipping method). No transformants were obtained from Ler plants.
Bisulfite DNA sequencing:
Genomic DNA (∼2 μg) of At2Ler, At4Ler, A. arenosa, F1-12, -19, -22, Allo733, Allo738, and As9502 was digested using EcoRI and purified (QIAGEN). The purified DNA was treated in a bisulfite solution as previously described (Johnson et al. 2002). Genomic fragments (∼300 bp) upstream of the ATG codon were amplified with the primers designed using the web-based software MethPrimer (http://www.urogene.org/methprimer/index1.html) (supplemental Figure 1, http://www.genetics.org/supplemental/). The AaFLC fragments amplified from A. arenosa, F1-12, -19, -22, Allo733, Allo738, and As9502 were subject to direct sequencing. The AtFLC fragments were cloned into pGEM-T vector (Promega), and 8–10 individual inserts were sequenced in each line (At2Ler, At4Ler, Allo733, and As9502).
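The readout of bisulfite sequencing can be sketched as follows: after treatment and PCR, unmethylated cytosines read as T, whereas methylated cytosines remain C, so methylated positions are those where a reference C is still a C in the read. The function names and the toy sequence in the usage note are hypothetical; only the top strand is modeled.

```python
from typing import FrozenSet, List


def bisulfite_convert(seq: str, methylated: FrozenSet[int] = frozenset()) -> str:
    """Expected top-strand read after bisulfite treatment and PCR.

    `methylated` holds 0-based positions of methylated cytosines, which
    are protected from conversion; every other C is read as T.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def methylated_cytosines(reference: str, read: str) -> List[int]:
    """0-based positions where a reference C is still C in the read."""
    return [
        i
        for i, (r, c) in enumerate(zip(reference.upper(), read.upper()))
        if r == "C" and c == "C"
    ]
```

For the toy reference `ACGTCCGA` with a methylated cytosine at position 4, the expected read is `ATGTCTGA`, and comparing that read back to the reference recovers position 4.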
Chromatin immunoprecipitation assays:
The ChIP assays were performed using a protocol modified from previously published methods (Bastow et al. 2004; He et al. 2004; Tian et al. 2005). For each assay, fresh leaves (∼3 g) were subjected to vacuum infiltration in a formaldehyde (1%) solution for crosslinking the chromatin proteins to DNA. Chromatin was extracted and sonicated (Fisher, Model 60 sonicator) at half maximal power for five 10-sec pulses with chilling on ice for 3 min after each pulse. The average size of the resultant DNA fragments produced was ∼0.3–1.0 kb. We used an aliquot of chromatin solution (1/10 of total volume) as input DNA to determine the DNA fragment sizes. The remaining chromatin solution was diluted 10-fold and divided into two aliquots: one was incubated using 10 μl of antibodies (anti-dimethyl-histone H3-Lys4, anti-dimethyl-H3-Lys9, or anti-acetyl-H3-Lys9; Upstate Biotechnology) and the other incubated without antibodies (mock). The immunoprecipitated DNA was amplified by semiquantitative PCR using the primers designed from upstream sequences of the FLC ATG codon (supplemental Figure 1, http://www.genetics.org/supplemental/). Two independent experiments were performed in each assay.
RESULTS
Genetic dominance of late flowering in synthetic Arabidopsis allotetraploids and natural A. suecica:
Genetically stable Arabidopsis allotetraploids were generated by interspecific hybridization between A. thaliana (Ler) autotetraploids and A. arenosa (Figure 1A) that diverged ∼6 million years ago (Koch et al. 2000). Under a combination of short- (8 weeks) and long-day conditions, A. thaliana (Ler) autotetraploids flowered in 57 ± 10 days (corresponding to 8–10 true leaves), and A. arenosa flowered in 102 ± 12 days (Figure 1C). All 25 synthetic allotetraploids (F1) flowered later (119–142 days) than tetraploid parents, indicating nonadditive effects (overdominance in this case) and immediate changes in flowering time after allopolyploidization. Although the F1 lines varied in flowering time, their offspring flowered at similar times after five generations of selfing. Notably, Allo745, a synthetic allotetraploid that was outcrossed to A. suecica (LC1) (Wang et al. 2004), flowered later than Allo733 and -738 but earlier than A. suecica strains. A. suecica is a natural allotetraploid or amphidiploid containing the genomes derived from A. thaliana and A. arenosa ancestors (O'Kane et al. 1995) (Figure 1B). The formation of A. suecica is estimated to have occurred from ∼20,000 years (on the basis of chloroplast DNA) (Sall et al. 2003) to ∼1.5 million years (Koch et al. 2000). A. suecica strains did not flower until 150–220 days after seed germination. It is likely that the genetic interactions between A. arenosa and A. thaliana loci contribute to late flowering in the synthetic allotetraploids and natural A. suecica strains.
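The comparison with the midparent value can be made concrete with the mean flowering times reported above (57 days for the A. thaliana autotetraploid and 102 days for A. arenosa). The sketch below is only an arithmetic illustration: the function names, the category labels, and the `tolerance` parameter are assumptions, and a formal analysis would test the difference statistically using the replicated plants rather than compare means directly.

```python
def midparent(parent1: float, parent2: float) -> float:
    """Midparent value: the mean of the two parental trait values."""
    return (parent1 + parent2) / 2.0


def classify(hybrid: float, parent1: float, parent2: float,
             tolerance: float = 0.0) -> str:
    """Place a hybrid trait value relative to the midparent and parents."""
    mpv = midparent(parent1, parent2)
    low, high = min(parent1, parent2), max(parent1, parent2)
    if abs(hybrid - mpv) <= tolerance:
        return "additive"
    if hybrid > high:
        return "above high parent"
    if hybrid < low:
        return "below low parent"
    return "nonadditive, within parental range"
```

Here `midparent(57, 102)` is 79.5 days, and both ends of the observed F1 range (119 and 142 days) fall above the later-flowering parent, consistent with the overdominance described in the text.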
Sequence analysis of FLCs in A. thaliana, A. arenosa, and A. suecica:
Microarray analysis indicated that FLC is upregulated (Wang et al. 2006) in the synthetic allotetraploids that are late flowering. To determine the molecular basis of FLC upregulation and flowering-time variation in Arabidopsis allotetraploids, we characterized FLC full-length cDNAs in A. arenosa and A. suecica (Figure 2A). A. arenosa contains AaFLC1 and AaFLC2 that have 96 and 98% identities in nucleotide and amino acid sequences, respectively. Interestingly, among nine independently isolated BACs, each contained two AaFLC fragments within a 10-kb region (J. Wang and Z. J. Chen, unpublished data), suggesting that AaFLC1 and -2 are tandem duplicate genes. Compared to AtFLC, AaFLC1 and AaFLC2 have 95.4 and 94.8% nucleotide-sequence identities, respectively.
In self-pollinating A. suecica (As9502), AaFLC1-like locus (designated AsFLC1) diverged from AaFLC1 (Figure 2B), whereas AtFLC in A. suecica is identical to FLC in A. thaliana, suggesting AtFLC sequence is highly conserved during polyploid evolution. AsFLC1 has 99, 98.5, and 94.9% nucleotide-sequence identities to AaFLC1, AaFLC2, and AtFLC, respectively. AsFLC2 corresponding to AaFLC2 was not identified by sequencing >20 cDNA fragments amplified in A. suecica, but a partial genomic sequence of AsFLC2 was 99.6% identical to AaFLC2, suggesting that AsFLC2 is likely silenced in A. suecica. In the phylogenetic tree, FLC loci in A. thaliana, A. arenosa, and A. suecica are closely related. Many Brassica FLC loci, except BnFLC1, diverged from AtFLC with ∼86% nucleotide-sequence identity. BnFLC1 in B. napus is rooted in the clade next to the Arabidopsis locus, suggesting that it is an ancient homolog that might have existed before the species divergence between Arabidopsis and Brassica. Other members of the MAF/MADS-box gene family, including MAFs (Ratcliffe et al. 2001), are distantly related to FLC.
We further examined the cis-regulatory changes in FLC orthologs by cloning promoters and first introns of all possible FLC loci in A. thaliana (Col and Ler), A. arenosa, and A. suecica. Without exception, AaFLC and AsFLC had an ∼1.0-kbp deletion (Figure 3A and supplemental Figure 1, http://www.genetics.org/supplemental/) located 253 bp upstream of the ATG, giving rise to a short promoter compared with the AtFLC promoter. As a result, the upstream regulatory elements were deleted in the AaFLC and AsFLC promoters but present in the AtFLC promoter region. Coincidentally, this region was identified as a minimal promoter (∼250 bp) in a previous study (Sheldon et al. 2002).
The first intron of FLC loci displayed a high degree of sequence changes (Figure 3A). A major deletion (∼560 bp) was identified in the upstream region (from +855 to +1412) of the intron in AaFLC1, AaFLC2, and AsFLC2 but not in AsFLC1. A MUTATOR-like transposon (∼1.3 kbp) was inserted in A. thaliana (Ler) (Gazzani et al. 2003; Michaels et al. 2003), but absent in AaFLC1/2, AsFLC1/2, or AtFLC(Col). Sequence length in that region varied in AtFLC(Col), AaFLC1/2, and AsFLC1/2. AsFLC1 showed sequence features different from other loci (dashed line).
Using the primers flanking the 1.3-kbp transposon insertion, we genotyped FLC loci in Arabidopsis allotetraploids and their progenitors. In the synthetic allotetraploids, all lines (F1) each possessed one AtFLC(Ler) and two A. arenosa loci (Figure 3B). AaFLC1 was absent in one allotetraploid (no. 22) due to allelic variation in the outcrossing A. arenosa strains, and the allele was amplified using a new primer pair (data not shown). In the selfing progeny (S6) of Allo733 and -738, each had three FLC fragments corresponding to AtFLC, AaFLC1, and AaFLC2, respectively. Similarly, the self-pollinating natural A. suecica strains had three loci, AtFLC, AsFLC1, and AsFLC2, of which AtFLC was derived from a Col-like strain as determined by sequencing the fragment (data not shown).
Upregulation of AtFLC and downregulation of AaFLC in synthetic allotetraploids:
Using the FLC sequences, we designed TaqMan probes to discriminate the transcripts between AtFLC and AaFLC1/2 or AsFLC1/2 (Figure 2B). Alternatively, the amplified RT–PCR products were distinguishable using ClaI that cleaves AtFLC. Allelic variation between AaFLC1 and -2 or AsFLC1 and -2 was indistinguishable. We found that AtFLC was immediately reactivated in de novo allotetraploid lines (F1) and expressed at constantly high levels in selfing progeny and A. suecica (Figure 4A), whereas AaFLC1/2 was downregulated in the synthetic allotetraploids and AsFLC1/2 was highly expressed in A. suecica. Using quantitative RT–PCR and Pearson correlation analyses, we found that the flowering-time variation (days to flowering) in A. thaliana, A. arenosa, synthetic allotetraploids, and A. suecica (Figure 1C) is correlated with the cumulative levels of AtFLC and AaFLC1/2 or AsFLC1/2 transcripts (Figure 4B) (r² = 0.94). Allo745 displayed increased levels of AaFLC1/2 and AsFLC1/2 expression and late flowering because of outcrossing to A. suecica (Wang et al. 2004).
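The Pearson correlation between days to flowering and cumulative FLC transcript levels can be computed as in the sketch below, which uses only the standard library. The function name is hypothetical, and the numbers in the usage note are placeholders chosen to be perfectly linear, not measurements from this study.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient of paired values."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)
```

Squaring the returned value gives r². For the placeholder pairs `[1, 2, 3]` and `[2, 4, 6]`, r is 1 up to floating-point rounding; in practice each pair would be one genotype's mean days to flowering and its summed, Act2-normalized FLC transcript level.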
AaFRI complements AtFRI and likely trans-activates AtFLC in Arabidopsis allotetraploids:
A winter-annual ecotype, such as A. thaliana San Feliu 2 (SF2), contains both AtFRI and AtFLC loci (Lee et al. 1993). Moreover, FRI from SF2 has moderate effects on FLC expression in Ler background (Gazzani et al. 2003; Michaels et al. 2003). However, it is unknown whether AaFRI and AtFLC originating in two divergent species are genetically compatible. To test this, we cloned a full-length AaFRI cDNA (supplemental Figure 3, http://www.genetics.org/supplemental/) and another partial AaFRI-like fragment (data not shown) in A. arenosa. Putative AaFRI and AtFRI (At4g00650) had 92 and 89% identities in nucleotide and amino acid sequences, respectively. Like AtFRI, AaFRI is a plant-specific gene and encodes a predicted nuclear coiled-coil protein containing a FRIGIDA domain (Johanson et al. 2000).
To test trans-acting effects of AaFRI on AtFLC activation, we overexpressed AaFRI in A. thaliana Col plants (Figure 5A). Among eight independent transgenic lines examined, all flowered late. The control (At2Col) flowered in ∼40 days under long-day conditions, and SF2 flowered in ∼85 days. The severity of abnormal phenotypes in the transgenic plants was correlated with the high levels of AaFRI overexpression, which may induce deleterious effects on other biological pathways. AaFRI was highly expressed in the transgenic lines, but AtFLC was expressed at a high level similar to that in SF2 (Figure 5B). The correlation between the levels of AaFRI expression and AtFLC upregulation was not proportional, suggesting that other regulators in vernalization, autonomous, and photoperiod pathways affect FLC expression (Simpson and Dean 2002; Bastow et al. 2004; He et al. 2004; Sung and Amasino 2004; He and Amasino 2005). Compared to relatively small trans-acting effects between ecotypes (Michaels et al. 2003), AaFRI has large trans-activating effects on AtFLC expression in the allotetraploids.
AtFLC, AaFLC, and AsFLC expression variation contributes to late flowering in Arabidopsis allotetraploids:
Do changes in cis-regulatory elements affect FLC expression and flowering-time variation in synthetic allotetraploids and A. suecica? AtFLC was instantaneously upregulated in all synthetic and natural allotetraploid lines tested (Figure 4), suggesting that despite ∼6 million years of divergence between A. thaliana and A. arenosa (Koch et al. 2000), AtFLC was trans-activated by A. arenosa FRI in the synthetic allotetraploids, probably because it has an intact promoter and intron (Figure 3A). AaFLC1 and AaFLC2 have short promoter regions (Figure 3A) and their expression levels were highly variable (Figure 4). To distinguish allelic expression patterns, we used SSCP analysis followed by cloning and sequencing individual fragments in each locus. The data (Figure 6A) indicated that AaFLC2 expression was barely detectable in A. arenosa and undetectable in F1 lines, Allo733, and -738, leading to the silencing of AsFLC2 in A. suecica. Therefore, the contribution of AaFLC2 and AsFLC2 to flowering-time variation in A. arenosa and A. suecica is negligible. The fate of AaFLC1 is unpredictable: it was highly expressed in A. arenosa, F1-12, -14, and -19 but poorly expressed in F1-4, -22, and -23, Allo733, and -738, suggesting epigenetic variation of AaFLC1 expression in the synthetic allotetraploids. Low levels of AaFLC2 and AsFLC2 expression were probably associated with sequence deletions and mutations in the first introns. Deletions in the same upstream region (supplemental Figure 1, http://www.genetics.org/supplemental/) were shown to cause downregulation of FLC in A. thaliana (Sheldon et al. 2002). Collectively, the data suggest that sequence divergence in the promoters and introns (Figure 3A) provides a molecular basis for differential regulation of FLC orthologs in Arabidopsis allotetraploids (Figure 6A). This does not preclude the possibility that other genes in the flowering pathways (Simpson and Dean 2002; He and Amasino 2005) contribute to late flowering during evolution. Indeed, AsMAF1, an AtMAF1 homolog (Ratcliffe et al. 2001; Scortecci et al. 2001), was upregulated in three natural A. suecica strains (Figure 6A).
FLC expression variation is mediated by histone H3-Lys9 acetylation, H3-Lys4 methylation, and H3-Lys9 methylation:
To determine the cause of AtFLC upregulation and AaFLC1 and AsFLC1 expression variation in the synthetic and natural allotetraploids, we investigated the levels of H3-Lys9 acetylation and H3-Lys4 and H3-Lys9 methylation using ChIP assays (He et al. 2003, 2004; Bastow et al. 2004) and semiquantitative PCR amplification with the locus-specific primers designed for A. thaliana (Ler) and A. arenosa, respectively (supplemental Figure 1, http://www.genetics.org/supplemental/). AtFLC reactivation was associated with the increased levels of histone H3-Lys9 acetylation and H3-Lys4 dimethylation (Figure 6B), two epigenetic marks for gene activation. Similarly, AsFLC1 upregulation in A. suecica correlated with the increased levels of H3-Lys9 acetylation and H3-Lys4 dimethylation. A low level of H3-Lys4 dimethylation was detected in Allo733, which correlates with the relatively low level of AaFLC1 expression in this synthetic allotetraploid (Figures 4 and 6A). This may be because AaFLC is reactivated only in a subset of cells or because gene expression changes are reversible (Bastow et al. 2004) in the synthetic allotetraploids. Indeed, a high level of H3-Lys9 acetylation was detected, which may be related to variable and unstable levels of AaFLC1 expression in Allo733. Using antibodies against H3-Lys9 dimethylation, an epigenetic mark for gene repression, we found H3-Lys9 dimethylation was dramatically reduced in the AaFLC1/AsFLC1 promoter and moderately reduced in the AtFLC promoter. The data suggest that AtFLC activation and AsFLC1 expression are mediated by histone acetylation and methylation at H3-Lys4 and -Lys9 sites. It is notable that H3-Lys4 dimethylation and Lys9 dimethylation levels may vary in some assays because H3-Lys9 dimethylation levels do not correlate with FLC repression in nonvernalized plants (Bastow et al. 2004).
Residual levels of H3-Lys9 dimethylation detected in AtFLC and low levels of H3-Lys4 dimethylation and high levels of H3-Lys9 acetylation detected in AaFLC1 may also suggest that other regions, such as the first intron, are important to FLC expression (Sheldon et al. 2002; Bastow et al. 2004). Alternatively, other genes in the vernalization and photoperiod pathways may contribute to the FLC repression (Levy and Dean 1998; Gendall et al. 2001; Simpson and Dean 2002; He et al. 2003; Sung and Amasino 2004; He and Amasino 2005).
Indeed, vernalization reduced the expression of AtFLC and Aa/AsFLC1/2 and reversed late-flowering phenotype in A. arenosa and A. suecica strains to early flowering (Figure 7), suggesting that the first intron contains regulatory elements required for vernalization (Sheldon et al. 2002). Consistent with the previous study (Finnegan et al. 2005), no changes in DNA methylation were detected in the AtFLC and AaFLC promoter regions using the bisulfite sequencing method (supplemental Figure 1, http://www.genetics.org/supplemental/). The data indicate functional AaFRI and a selective combination of AaFLC1 and AtFLC expression contribute to the natural variation of flowering time in response to vernalization (extended winter-like cold temperatures) in Arabidopsis allopolyploids as in the A. thaliana ecotypes.
DISCUSSION
Flowering-time variation in Arabidopsis allopolyploids:
Allotetraploids may provide a unique genetic system to test the effects of sequence divergence on the expression of orthologous genes originating in different species (Chen and Ni 2006). Arabidopsis allotetraploids are late flowering mainly because of trans-activation of AtFLC by AaFRI. The data suggest that AaFRI and AtFLC, the two genes responsible for natural variation of flowering time, are conserved between A. thaliana and A. arenosa after ∼6 million years of evolution (Koch et al. 2000). The trans-acting effect of AaFRI on AtFLC reflects a strategy of allopolyploids using the best combination of orthologous loci in a genetic pathway. This selective epistatic interaction may be determined by an intact promoter and the first intron of AtFLC. AaFLC loci possess deletions both in the promoter and the first intron. Therefore, compared to AtFLC, AaFLC loci are relatively weak. Between two AaFLC loci, AaFLC2 expression is repressed immediately in the synthetic allotetraploids, whereas AaFLC1 expression levels are variable in the synthetic allotetraploids, which may correspond to epigenetic transient expression states during selfing (Wang et al. 2004). A strong AaFLC1 is selected in A. suecica. The selection likely acts on the first intron of AaFLC1, leading to the high level of AsFLC1 expression because no sequence divergence was detected in the AsFLC1 and AaFLC1/2 promoters (Figure 3 and supplemental Figure 1, http://www.genetics.org/supplemental/). It is intriguing that although AaFLC1 and AaFLC2 share similar promoter and intron sequences, AaFRI trans-activates AtFLC and maintains the expression of AaFLC1 but not AaFLC2 in the new allotetraploids and natural A. suecica strains. Sequence variation between the first introns of AsFLC1 and AaFLC1 (Figure 2) may also suggest that a different A. arenosa strain is the genome donor of A. suecica.
In addition to AaFRI effects on AtFLC expression, flowering-time variability in the synthetic allotetraploids may be related to the allelic variation in the first intron of FLC. For example, the low levels of FLC RNA accumulation in A. thaliana Ler result from a transposable element inserted in the intron, which induces repressive chromatin modifications mediated by short interfering RNAs generated from homologous transposable elements in the genome (Liu et al. 2004). AtFLC in A. suecica is derived from a Col-like ecotype and does not contain the insertion, which is associated with a high level of AtFLC expression in A. suecica. In contrast, the first intron of AaFLC2 may have little effect on changes in FLC expression in the synthetic and natural allotetraploids. Despite the absence of a transposable element in the first intron, the AaFLC2 expression level is very low in the synthetic allotetraploids and undetectable in A. suecica.
Note that >80 genes regulate flowering time in Arabidopsis (Levy and Dean 1998). Within FRI and FLC loci, there are numerous allelic variations detected among various ecotypes (Lempe et al. 2005; Shindo et al. 2005). Therefore, the molecular basis for this complex flowering-time trait remains to be carefully dissected in the allotetraploid plants that combine two interactive regulatory pathways inherited from two divergent progenitors.
A model for nonadditive gene regulation in allotetraploids:
Our data suggest a model for genetic interactions between orthologous loci in a genetic pathway (Chen and Ni 2006) that mediates flowering-time variation in Arabidopsis allotetraploids. The model explains how new allopolyploid species assemble a functionally compatible pathway by selecting and modifying the expression of orthologous loci originating from divergent species. During ∼6 million years of evolution (Koch et al. 2000), A. arenosa and A. thaliana (Ler) diverged in flowering habits, probably because of selective adaptation to cold and warm climates (O'Kane et al. 1995; Sall et al. 2003), respectively. Sequence evolution of FRI and FLC loci leads to a nonfunctional AtFRI in A. thaliana (Johanson et al. 2000) and cis-regulatory changes in A. thaliana and A. arenosa FLC loci. In synthetic allotetraploids, A. arenosa FRI interacts in trans with the downstream gene, AtFLC, making the synthetic allotetraploids winter annual in a dosage-dependent manner. Interestingly, it was trans-activation of AtFLC that determined genetic dominance of late flowering in the synthetic allotetraploids, despite evolutionary divergence between A. thaliana and A. arenosa. Low levels of AaFLC expression may be associated with cis-regulatory changes in A. arenosa loci. AtFLC1 and AsFLC1 with intact cis-regulatory elements (promoters and/or introns) are selectively associated with a strong winter-annual habit in natural A. suecica strains, whereas AaFLC2 and AsFLC2 expression appears to be dispensable. The effects of AaFRI on AtFLC and AaFLC1/AsFLC1 upregulation are mediated by histone acetylation and methylation. FRI is in the same pathway as FRIGIDA-LIKE 1 (FRL1) (Michaels et al. 2004) and FRIGIDA-ESSENTIAL 1 (FES) (Schmitz et al. 2005). Although there is no direct evidence for FRI interacting with these proteins, FRI may interact with protein complexes such as PAF1 (He et al. 2004) responsible for locus-specific chromatin modifications.
The current model may be generalized to explain the fate of duplicate genes involved in biological pathways during allopolyploidization. Many orthologous genes in the progenitors might have evolved to possess divergent cis-regulatory elements that confer strong or weak, dominant or recessive alleles, tissue-specific expression, and/or developmental regulation. The regulatory networks may be reset by chromatin modification immediately after allopolyploidization, leading to novel variation and increased fitness (Grant 1981; Wendel 2000; Osborn et al. 2003; Comai 2005). We provide mechanistic evidence that altered regulatory networks (Birchler 2001; Osborn et al. 2003) and cis- and trans-regulation (Wittkopp et al. 2004) between the divergent biological pathways lead to epigenetic reprogramming of a biological (flowering) pathway after polyploidization. A similar mechanism may be responsible for the functional diversification of duplicate genes in developmental regulation of gene expression, a phenomenon known as subfunctionalization of duplicate genes (Lynch and Force 2000; Adams et al. 2003). It is notable that flowering time directly affects plant reproduction and adaptation. Therefore, sequence evolution and epigenetic regulation play interactive and pervasive roles in reconciling the regulatory incompatibilities between divergent genomes, leading to natural variation and selective adaptation during allopolyploid evolution.
We thank Richard Amasino, Gary E. Hart, Yuehui He, and Edward Himelblau for critical suggestions. The work is supported by grants from the National Institutes of Health (GM-067015) and the National Science Foundation Plant Genome Research Program (DBI0077774) to Z.J.C.
- Received January 31, 2006.
- Accepted March 14, 2006.
- Copyright © 2006 by the Genetics Society of America