Genetics, Vol. 174, 2181-2202, December 2006, Copyright © 2006
doi:10.1534/genetics.106.064543

* Genetics and Plant Breeding, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany,
Genetics and Evolution, Max Planck Institute for Chemical Ecology, 07745 Jena, Germany and
Department of Biology, Duke University, Durham, North Carolina 27708-0338
1 Corresponding author: Max Planck Institute for Plant Breeding Research, Carl-von-Linné Weg 10, 50829 Cologne, Germany.
E-mail: demeaux{at}mpiz-koeln.mpg.de
| ABSTRACT |
|---|
Our current understanding of cis-regulatory evolution is largely based on patterns of DNA conservation. There is now clear evidence that function in noncoding DNA (ncDNA) is broadly maintained. Whole-genome sequence comparisons between species have uncovered numerous segments of conserved noncoding DNA (DERMITZAKIS et al. 2004). Constraints on conserved segments are often experimentally related to functional conservation (CLIFTEN et al. 2001; KOCH et al. 2001; BOFFELLI et al. 2003). Interestingly, levels of constraint in ncDNA are found to vary across species (KEIGHTLEY and GAFFNEY 2003; KEIGHTLEY et al. 2005); they also can be larger than in protein-coding regions (BEJERANO et al. 2004). However, opportunities for adaptive evolution and lineage-specific functional changes remain poorly understood.
Recently, a study of nucleotide polymorphism and divergence within and between two Drosophila species over a large number of noncoding DNA loci suggested that many ncDNA changes, mostly located in UTRs, may have undergone adaptive evolution (ANDOLFATTO 2005). In humans, multiple instances of adaptive changes have been observed at specific cis-regulatory loci (BAMSHAD et al. 2002; ROCKMAN et al. 2003; HAHN et al. 2004; ROCKMAN et al. 2004), although some examples remain controversial (SABETI et al. 2005). In other species, including Drosophila, examples of neutral nucleotide variation at cis-regulatory regions have been reported (BALHOFF and WRAY 2005; FAY and BENAVIDES 2005; MACDONALD and LONG 2005).
Overall, the relationship between nucleotide and functional variation in cis-regulatory DNA has rarely been characterized, and little is known about the amount and type of variation to be expected within and among closely related species. In-depth characterization of cis-regulatory variation at both nucleotide and functional levels is required to gain some insight into the baseline evolutionary scenario of functional noncoding regulatory DNA. The most compelling models of cis-regulatory evolution are based on Drosophila developmental genes that are controlled by internal signals and whose misexpression is fatal to the organism (PHINCHONGSAKULDIT et al. 2004; LUDWIG et al. 2005). So far, the generation of cis-regulatory novelties in a less constrained expression context has received little attention. In contrast to animals, plants continuously fine-tune their development to prevailing environmental conditions. Thus, plant models of cis-regulatory evolution in genes controlled by environmental signals may shed light on the possible adaptive role of cis-regulatory variation.
Previously, we examined regulatory polymorphisms within Arabidopsis thaliana. We established a robust allele-specific assay to examine cis-regulatory variation in response to abiotic, biotic, or developmental changes (DE MEAUX et al. 2005). This assay of allele-specific expression in F1 heterozygotes effectively controls for background variation and allows the detection of subtle cis-regulatory differences. We focused on the promoter region of the chalcone synthase (CHS) gene because it is among the best-characterized promoters in plants and its expression is induced by multiple cues (HARTMANN et al. 1998; KOCH et al. 2001; LOGEMANN and HAHLBROCK 2002). Among those cues, light and insect herbivory were shown to upregulate CHS expression in A. thaliana (REYMOND et al. 2000; JENKINS et al. 2001; WADE et al. 2001). In addition, CHS is the branch-point enzyme of a pathway involved in the interaction between plants and their abiotic and biotic environments (WINKEL-SHIRLEY 2001). Hence this gene is likely to play a role in adaptive evolution (see JOHNSON and DOWD 2004). In A. thaliana, we found substantial functional cis-regulatory variation in CHS expression. However, patterns of nucleotide variation in the A. thaliana CHS promoter showed no evidence of non-neutral evolution in this inbreeding annual species (DE MEAUX et al. 2005).
In this article, we elucidate the evolutionary dynamics of CHS cis-regulation in Arabidopsis and report microevolutionary patterns of cis-regulation. These experiments were conducted with A. halleri and A. lyrata, two self-incompatible species that differ substantially in their ecology (MITCHELL-OLDS 2001). In central Europe, A. halleri grows in highly competitive open meadows, whereas A. lyrata is restricted to low-competition habitats on exposed rocks (HOFFMANN 2005). We compared our findings to data from the model species A. thaliana (DE MEAUX et al. 2005), which differs from other species in the genus by its self-compatibility and low species-wide levels of diversity (WRIGHT et al. 2003; SCHMID et al. 2005).
We analyzed allele-specific expression levels in the progeny of intra- and interspecific crosses and evaluated functional polymorphism and divergence of CHS cis-regulation. We combined our functional assay with an analysis of polymorphism and divergence at the nucleotide level in the CHS 5' upstream intergenic region and addressed the following questions: (i) How does cis-regulatory diversity in A. lyrata and A. halleri compare to that in A. thaliana?, (ii) What qualitative and quantitative divergence in CHS cis-regulation is seen among species?, and (iii) Is a footprint of selection detectable in the intergenic region containing the CHS promoter in any of the three Arabidopsis species examined?
At the nucleotide level, the evolutionary dynamics of the CHS promoter region differed among the Arabidopsis species examined, in agreement with their genomewide patterns of diversity. Thus, patterns of variation at the CHS promoter region show no indication of adaptive evolution. Nonetheless, patterns of functional diversity point to abundant cis-regulatory variation within and between species, most of which results from qualitative differences in the response to individual environmental cues. Our results reveal that CHS cis-regulation evolves mainly by modification of cis-regulatory response modules to one but not all environmental cues.
| MATERIALS AND METHODS |
|---|
t (between populations). Per-site nucleotide diversity was also computed within populations (w). Species-wide patterns of nucleotide polymorphism were summarized by various test statistics: Tajima's D, based on the difference between two estimators of intraspecific diversity, and Fay and Wu's H, which makes use of an outgroup sequence to analyze the frequency of derived polymorphisms (TAJIMA 1989; FAY and WU 2000). Associations among nucleotide variants can be summarized by Wall's B statistic, which evaluates the proportion of adjacent segregating sites that partition the sequence sample in the same way (WALL 1999). Such sites occur along the same branch of the coalescent tree and the test examines whether branch length is compatible with the neutral equilibrium model. The compatibility of the H and B statistics with evolution under the standard neutral model was tested by 1000 coalescent simulations. The Hudson, Kreitman, and Aguadé (HKA) test is based on the prediction that, for a particular region of the genome, the rate of divergence between species is proportional to the levels of polymorphism within species (HUDSON et al. 1987). This test compares the ratio of intraspecific polymorphism to interspecific divergence at two loci. HKA tests were performed for silent positions using silent segregating sites and the silent divergence value (NEI 1987) to compare the intergenic region and the CHS coding region. In the intergenic region, all positions were considered to be silent. These four neutrality tests (Tajima's D, Fay and Wu's H, Wall's B, and HKA) focus on different characteristics of nucleotide polymorphism and thus effectively summarize the evolutionary history of the CHS intergenic region. Intergenic sequences were examined for known transcription-factor-binding sites as previously described (DE MEAUX et al. 2005). Gene conversion tracts between alleles identified in the three different species were searched for using the algorithm described by BETRAN et al. (1997) and implemented in DnaSP. This algorithm uses the frequency of a nucleotide at a site to determine whether the site is informative for detecting a conversion event between groups of sequences. The length of the conversion event is determined by the distance between informative sites (BETRAN et al. 1997). We investigated the gene genealogy among haplotypes using the program TCS, to account for population-level phenomena such as recombination or persistence of the ancestral haplotype (CLEMENT et al. 2000).
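To make two of the neutrality statistics named above concrete, the following minimal Python sketch computes Tajima's D from the standard published formula (TAJIMA 1989) and an unnormalized Fay and Wu's H as the difference between pairwise diversity and the estimator weighted toward high-frequency derived variants. It is an illustration only and not the DnaSP implementation used in this study; the function names, the gap-free toy alignment, and the use of per-locus rather than per-site values are assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def segregating_sites_and_pi(seqs):
    """Return (S, pi): segregating sites and mean pairwise differences per locus.

    Assumes `seqs` is a list of equal-length aligned strings without gaps.
    """
    columns = list(zip(*seqs))
    s = sum(1 for col in columns if len(set(col)) > 1)
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return s, total / len(pairs)


def tajimas_d(n, s, pi):
    """Tajima's D from sample size n, segregating sites s, and per-locus pi."""
    if s == 0:
        raise ValueError("D is undefined without segregating sites")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


def fay_wu_h(n, derived_counts):
    """Unnormalized Fay and Wu's H = pi - theta_H.

    `derived_counts` lists, for each segregating site, how many of the n
    sampled sequences carry the derived state (polarized by an outgroup).
    """
    denom = n * (n - 1)
    pi = sum(2 * i * (n - i) for i in derived_counts) / denom
    theta_h = sum(2 * i * i for i in derived_counts) / denom
    return pi - theta_h
```

An excess of rare variants pushes D below zero, whereas an excess of high-frequency derived variants pushes H below zero; significance would still have to be assessed by coalescent simulation, as described in the text.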
We further obtained four A. thaliana x A. lyrata hybrid progeny from three A. thaliana (Ei-2, Kas-1, and Ag-0) and four A. lyrata parental genotypes (AL3, AL22, AL52, AL41) and three A. thaliana x A. halleri hybrid progeny from two A. thaliana (Kas-1, Ka-0) and two A. halleri parental genotypes (AH4, AH12), for a total of 81 and 25 hybrids, respectively (see supplemental Table 1c at http://www.genetics.org/supplemental/). Only crosses using A. thaliana as the mother were successful, and not all crosses were equally successful (see supplemental Table 3 at http://www.genetics.org/supplemental/).
Seeds were sown on humid filter paper in small petri dishes and vernalized at 4° in the dark for 2 weeks, followed by 2 weeks in Voetsch reach-in chambers (12-hr day, 20° day temperature, 16° night temperature, 70% humidity) for germination. The whole procedure was repeated until germination was successful. One-week-old seedlings were transplanted into single pots and assigned to random positions in York walk-in growth chambers in the following conditions: 11-hr day, 21° day temperature, 16° night temperature.
CHS expression experiments:
In A. thaliana, CHS gene expression is repressed in the dark and strongly induced in the light (JENKINS et al. 2001). CHS expression is also induced upon feeding by Plutella xylostella larvae (H. VOGEL and T. MITCHELL-OLDS, unpublished results). To assess the cis-regulatory diversity of CHS expression in response to these environmental cues, plants either were placed for 48 hr in the dark followed by 8 hr of strong light or were challenged with P. xylostella larvae for 24 hr, following DE MEAUX et al. (2005).
Three- to 6-month-old plants were used for the CHS expression experiments. For intraspecific crosses, CHS expression experiments were performed in two independent trials separated by a 2-week interval. Half of the progeny of each cross were randomly assigned to one or the other trial. Within each trial, half of the progeny were randomly assigned to be sampled for CHS expression in the dark and the other half for CHS expression in the light. All plants, however, underwent the entire dark/light treatment. Insect-feeding experiments were carried out at least 4 weeks after the light experiment. Likewise, within each trial, half of the progeny were randomly assigned to be sampled for CHS expression in insect-challenged leaves and the other half for CHS expression in control leaves of insect-free plants.
For the interspecific progeny, CHS expression experiments were conducted in the same way but in a single trial. For the A. thaliana x A. lyrata F1 progeny, flowers were harvested at the time point between the end of the dark period and the beginning of the light period on the plants that had been assigned to the dark treatment, to look at CHS expression independently of the influence of light. CHS is known to be specifically upregulated in A. thaliana flowers where flavonoids are produced abundantly (BURBULIS et al. 1996). Due to delayed flowering, flower-specific CHS expression was not studied in the A. thaliana x A. halleri progeny.
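The two-level random assignment described for the intraspecific crosses (progeny split between two trials, then split between sampling conditions within each trial) can be sketched as follows. This is an illustrative reconstruction, not the procedure actually used by the authors; the function name `assign_design`, the plant identifiers, and the fixed random seed are hypothetical.

```python
import random


def assign_design(plant_ids, seed=0):
    """Split plants at random into two trials, then into two sampling conditions.

    Returns a dict mapping each plant id to a (trial, condition) tuple.
    The seed is fixed only to make the illustration reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(plant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    design = {}
    for trial, plants in ((1, shuffled[:half]), (2, shuffled[half:])):
        cut = len(plants) // 2
        # The list is already in random order, so slicing gives a random split.
        for plant in plants[:cut]:
            design[plant] = (trial, "dark")
        for plant in plants[cut:]:
            design[plant] = (trial, "light")
    return design
```

Calling `assign_design(range(20))` places five plants in each trial-by-condition cell; the same function could be rerun with the labels "insect" and "control" for the herbivory experiment.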
Quantitative analysis of allele-specific CHS expression:
Approximately 2 sq cm of leaf material (or two flower buds) was harvested. RNA extraction and cDNA synthesis were performed as described previously (DE MEAUX et al. 2005). Allele-specific CHS mRNA was quantified using the quantitative properties of pyrosequencing (NEVE et al. 2002; DE MEAUX et al. 2005). To control for possible position effects in the thermocycler, cDNA samples together with DNA extracted from heterozygous plants were randomly distributed across 96-well plates prior to PCR. For the A. lyrata progeny, the number of samples allowed a hierarchical randomization of cDNA samples across plates within a given trial. For the A. halleri progeny, no hierarchical randomization was performed. The pyrosequencing reactions were performed using the PyrosequencerAB device (Biotage, Uppsala, Sweden) as previously described (DE MEAUX et al. 2005).
In each of the two species, three single nucleotide polymorphisms (SNPs) located in the CHS coding region were used to measure allele-specific CHS expression: SNP1008 (PCR primers 5'-TCGGTCAGGCTCTTTTCAGTG-3' and 5'-TGTCCGTCTATGGCACCATC-3', sequencing primer 5'-GGGAGGATGGTCTGT-3'), SNP572 (PCR primers 5'-GGAAACGCCACATGCATCTG-3' and 5'-TCCTTGATGGCCTTCACTGC-3', sequencing primer 5'-TTAGGGACTTCAACC-3'), and SNP591 (PCR primers 5'-GGAAACGCCACATGCATCTG-3' and 5'-TCCTTGATGGCCTTCACTGC-3', sequencing primer 5'-CTGCCGCTTCTTTGCC-3') in A. lyrata; SNPB6 (PCR primers 5'-GGAAACGCCACATGCATCTG-3' and 5'-TCCTTGATGGCCTTCACTGC-3', sequencing primer 5'-TAAGCGCACATGTGTGG-3'), SNPM6 (PCR primers 5'-GGAAACGCCACATGCATCTG-3' and 5'-TCCTTGATGGCCTTCACTGC-3', sequencing primer 5'-GATGTCCTGTCGGGTG-3'), and SNPCZ (PCR primers 5'-GACCGACCTCAAGGAGAAG-3' and 5'-TTGATGGCCTTCACTGCCG-3', sequencing primer 5'-CTAGCTTAGGGACTTCA-3') in A. halleri; for A. thaliana x A. lyrata hybrids, we used SNP1230 (PCR primers 5'-ACCTTCCATCTCCTCAAGG-3' and 5'-CTCTTCCTTTAGTCCTAGC-3', sequencing primer 5'-CCTTTAGTCCTAGCTT-3') and SNP587 (PCR primers 5'-GACCGACCTCAAGGAGAAG-3' and 5'-TTGATGGCCTTCACTGCCG-3', sequencing primer 5'-CCTAGCTTAGGGAC-3'); for A. thaliana x A. halleri hybrids, we used SNP587 and SNP1370 (PCR primers 5'-AGGTGGAGATAAAGCTAGG-3' and 5'-AAGACACCCCACTCCAACCC-3', sequencing primer 5'-CTCCAACCCTTCTCCT-3'). Only half of the progeny of AL22xAL12, AL22xAL3, AH22xAH4, and AH22xAH5 was analyzed due to heterozygosity of either the AL22 parent for SNP591 or the AH22 parent for SNPM6. Promoter allele genotyping indicated that individuals analyzed in these progeny harbored either the sequenced AL22 or the AH22-1 alleles at the CHS intergenic region described below (Table 1). The genotyping assay is described below. For several progeny, as well as for interspecific hybrids, expression data were analyzed using two independent SNP assays.
Data obtained with different SNP assays were all significantly correlated (minimum P = 0.027, Table 3). The strength of the correlation between SNP assays depends on the overall amount of cis-regulatory variation in the progeny.
Evaluation of methylation in the CHS intergenic region:
Approximately 2 g of leaves from natural genotypes Ei-2 (A. thaliana), AH4 (A. halleri), AL52 (A. lyrata), and from six and seven A. thaliana x A. halleri and A. thaliana x A. lyrata hybrids, respectively, were collected. DNA was extracted using the Midi-Prep DNA extraction kit (QIAGEN, Valencia, CA). Quantitative evaluation of methylation was performed at three CpG sites by M. Pettersson at Biotage (Premium CpG Methylation Service, Uppsala, Sweden) at two CpG sites located within the core promoter. The two CpG sites are located 2 bp upstream and within the A-box. The A-box is an essential element of the core promoter located between two equally essential elements (the MRE and ACE elements; LOGEMANN and HAHLBROCK 2002). These two CpG sites correspond to positions 1342 and 1346 on the alignment provided in the supplemental data at http://www.genetics.org/supplemental/.
Individual genotyping in progeny from heterozygous parents:
In the AL22 parent, only one allele was detected in both intergenic and coding regions by the sequencing strategy described above. However, SNP591 revealed that the AL22 parent is heterozygous in the CHS coding region. Using a singleton carried by the AL22 individual at position 1367 (see alignment provided in supplemental data at http://www.genetics.org/supplemental/), we genotyped alleles in the intergenic region of all AL22 progeny by pyrosequencing, using PCR primers 5'-AAAGGGGGCTAACAACTAGCC-3' and 5'-GAAAGATGGCGGAGAGTG-3' and the SNP primer 5'-GGGAAAAAGGAGATG-3'. This analysis showed that SNP591 allowed the assessment of individuals carrying the identified AL22 allele. The AL22xAL7 progeny were assessed by the SNP1008 assay, which is based on a singleton carried by the AL7 cDNA allele. In these progeny, individuals carrying either allele could be assessed. Similarly, we genotyped the AL52 intergenic allele of all individuals in the AL41xAL52 progeny (position 1333 in the alignment provided as supplemental data at http://www.genetics.org/supplemental/, PCR primers as above, SNP primer 5'-ATGGACGGGCGGATGAAG-3'). We further used singletons found in intergenic allele AL11-1 to confirm that this allele was not present in the AL12 individual used for crosses (position 914 in the alignment provided as supplemental data at http://www.genetics.org/supplemental/, PCR primers 5'-GAGTTAAGTATGCACGTG-3' and 5'-TACGTACACCAACAAAAGGG-3', SNP primer 5'-GGAGATTTCACTTCCC-3'). In A. halleri, RFLP blots confirmed the size and number of alleles obtained by sequencing (not shown). AH22 and AH4 alleles were genotyped at positions 921 and 935, respectively (see alignment provided in supplemental data at http://www.genetics.org/supplemental/), using PCR primers 5'-GAGTTAAGTATGCACGTG-3' and 5'-TACGTACACCAACAAAAGGGG-3' and SNP primer 5'-GTAGAGTTTCTCCACC-3'.
Statistical analysis of expression data:
We conducted the statistical analysis in three steps. In the first step, we tested for trial effects, excluding measurements made on heterozygous DNA, and performed the following GLM analysis:
![]() | (1) |
In the second step, we reincorporated the DNA measurements and investigated the existence of main effects or interaction. In this second analysis, the treatment source of variation included five treatments: cDNA samples from dark-maintained, light-maintained, insect-damaged, and control leaves as well as DNA samples from heterozygous individuals. In A. halleri, main and interaction effects involving trials were not significant. Subsequently, we included DNA samples and repeated the GLM analysis with a modified model (1) without trial effects. In A. lyrata, trial effect was significant in one of three SNP assays. Thus DNA samples were randomly attributed to one or the other trial and the GLM analysis was repeated using model (1).
In the third step, we dissected the main effects of genotypes and treatments as well as their interaction. For this, we conducted a separate GLM analysis for each progeny, with the following model
![]() | (2) |
where µ is the grand mean, Ij is the jth CHS expression environment or treatment, Pl is the lth PCR and pyrosequencing plate (for A. lyrata data, Pkl is the lth PCR and pyrosequencing plate in the kth trial), Tk is the kth trial, ITjk represents the interaction between treatment and trial, and C is a technical covariate following DE MEAUX et al. (2005). The trial effect and the trial x treatment interaction were not included in the model for the analysis in A. halleri because no significant trial effect was found in the global analysis (see above). When possible, Mm, the effect of the mth mother, was added to the model. When data for a single genotype could be collected with more than one SNP assay, data obtained with both SNP assays were pooled, a SNP effect was added to the model, and the plate effect was nested within the SNP. If the size of the data set was too small, i.e., if fewer than two repeated measurements were available per cell to test all effects described above, a model (3) without interaction was used. For three parental combinations (AL22xAL7, AL41xAL52, and AH22xAH4), individuals of the progeny were genotyped and an allele effect was added to the GLM model, as well as an interaction between allele and treatment. Results for each GLM analysis are reported in supplemental Tables 4 and 6 at http://www.genetics.org/supplemental/.
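As a reading aid, the terms defined above can be assembled into an additive per-progeny model. The expression below is a sketch only: it assumes a purely additive structure, and the response symbol y (the relative allelic proportion measured in a sample) and the residual term ε are not named in the text and are introduced here for illustration. For the A. lyrata data, the plate term would be nested within trial (P_{kl}).

```latex
y_{jkl} = \mu + I_j + T_k + IT_{jk} + P_l + C + \varepsilon_{jkl}
```

Under the same assumptions, the A. halleri analysis would drop T_k and IT_{jk}, and the optional mother, SNP, and allele effects mentioned above would enter as further additive terms.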
Species-specific CHS expression in F1 A. thaliana x A. lyrata hybrids was analyzed using a similar GLM model with modifications, depending on the sample size. For example, in the A. thaliana x A. lyrata hybrids, missing data prevented the analysis of a genotype x treatment interaction. For the A. thaliana x A. halleri F1 hybrids, sample size was limited and CHS expression could not be studied in each genotype combination and each environment. In addition, all cDNA and DNA samples fit within one 96-well plate for one SNP assay. Therefore, for these F1 hybrids, we used a GLM model that did not investigate either PCR plate or genotype effect.
To identify treatments in which relative allelic expression of a progeny was significantly different, we performed a post-hoc test using Tukey's honest significant difference (HSD) test, which compares each treatment least squares (LS) mean with every other treatment mean in a pairwise manner and keeps the family-wise type I error at no more than 0.05. This test is suitable for pairwise comparisons performed without a priori expectations about which pairs of average measurements may differ (QUINN and KEOUGH 2002). For the progeny in which the parental allele effect was investigated, a Tukey's HSD test was performed to identify treatment x allele means that were significantly different (reported in supplemental Table 5 at http://www.genetics.org/supplemental/).
Fold-difference estimates:
Calibrated pyrosequencing data provide a rough estimate of the amplitude of cis-regulatory differences between species or between genotypes within species. The mean of calibrated pyrosequencing data, or the mean relative allelic proportion of CHS in a biallelic CHS cDNA pool, was computed for each F1 progeny and each CHS expression environment. The highest mean value identified by the GLM analysis as significantly different from allelic proportions in DNA was used to calculate an approximate maximum fold difference of CHS mRNA abundance due to cis-regulatory variation.
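One plausible way to turn a mean relative allelic proportion into an approximate fold difference is to take the ratio of the two allelic shares in the biallelic cDNA pool. The sketch below illustrates that arithmetic; the exact formula is not given in the text, so the ratio p / (1 - p) and the function name `fold_difference` are assumptions made for this example.

```python
def fold_difference(allelic_proportion):
    """Convert one allele's share of a biallelic cDNA pool into a fold difference.

    Assumes the fold difference is the ratio of the two allelic shares,
    reported so that the more abundant allele is in the numerator.
    """
    p = allelic_proportion
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    ratio = p / (1.0 - p)
    return max(ratio, 1.0 / ratio)
```

Under this assumption, equal shares (0.5, the expectation for heterozygous DNA) give a fold difference of 1, and a share of 0.75 corresponds to a threefold difference in mRNA abundance between the two alleles.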
| RESULTS |
|---|
91% identical to the orthologous A. thaliana sequence (KOCH et al. 2001; DE MEAUX et al. 2005). In A. lyrata, the intergenic region was sequenced in 14 accessions from nine locations in Europe and one location in North America. Nineteen alleles varied in size from 1248 to 3399 bp, with large multiple independent insertions (Figure 1). Interestingly, all these insertions occurred between two regions, which were highlighted as strongly constrained in the Brassicaceae (KOCH et al. 2001). We found that one of these insertions contained a 200-bp fragment of a Mariner transposable element (FESCHOTTE and WESSLER 2002). The history of these insertions is likely to be complex. For example, in the AL10 allele, a 426-bp insertion was found, which is in the same position and alignable with the insertion found in alleles AL22, AL7, AL8, AL41, and AL42 (see Figure 1), but has modified nucleotides at the junction points. We removed these insertions from the sequences for the purpose of the alignment-based analysis of diversity (see supplemental data at http://www.genetics.org/supplemental/ for the alignments and the location of the large insertions). In the remaining alignment, we found 42 SNPs (20 singletons) and 47 indels (of which 26 were singletons). Indels ranged in size from 1 to 44 bp, with 28 of 47 affecting only one nucleotide position.
t), reached 0.010 (Table 2). This value falls within the range of the silent diversity levels found at eight loci in A. lyrata (RAMOS-ONSINS et al. 2004). Interestingly, within one population, the level of diversity was comparable to species-wide diversity (w = 0.008 in Lilienfeld, Austria, Table 1). By contrast, individuals sampled in two Swedish populations were almost identical (only one 4-bp indel difference in a TAn tract), pointing to a heterogeneous distribution of diversity within this species. A nonsignificant Tajima's D indicated that the frequency distribution of SNP polymorphisms did not deviate from expectations under the neutral-equilibrium model (D = 0.225, P > 0.1). The value of D was typical of previously sampled loci in A. lyrata (RAMOS-ONSINS et al. 2004). Using A. thaliana as an outgroup, we analyzed the frequency distribution of derived mutations by Fay and Wu's H test. No excess of high-frequency-derived mutations was detected (H = 0.413, P > 0.5). Association patterns between adjacent sites indicated that branch length in the coalescent tree is compatible with an equilibrium neutral model (Wall's B = 0.121, P > 0.2; WALL 1999). The HKA comparison of polymorphism-to-divergence ratios across loci indicates whether a given DNA region has an unusual rate of evolution or polymorphism. We compared the ratio of polymorphism to divergence in the CHS coding region to that in the intergenic region, using A. thaliana as outgroup. The HKA test results were nonsignificant (χ2 = 0.413, P > 0.5). In A. halleri, the intergenic region was sequenced in eight accessions from six locations in Europe. Ten alleles were uncovered, varying in size from 1085 to 1704 bp (AH4-1 and AH6-2 are identical). Difference in allele length was mostly due to a large indel found in the 5' part of the alleles AH4-2 and AH3 (Figure 1). In addition, two large indels (>40 bp) were observed at different positions along the sequence. A 42-bp deletion was observed only in allele AH4-2 and a 167-bp insertion was found in the AH11 and AH12 alleles. For alignment-based analysis of diversity, these regions were removed.
A total of 53 SNPs were observed, with five singletons. Eighteen indels of <40 bp were observed, with only one being a singleton. Population structure was apparent in our sample (Snn = 0.604, P < 0.001); therefore we performed neutrality tests on samples containing one sequence for each population. The AH4 individual harbored two divergent alleles and the inclusion of one or the other allele in the subsample substantially modified the levels of diversity. Therefore neutrality tests were performed on two subsamples, one containing AH4-1 and the other AH4-2, to reflect the range of values that can be observed (Table 2). The level of diversity, t, reached 0.021 (with AH4-1) or 0.025 (with AH4-2), falling within the range of synonymous diversity at eight coding loci in A. halleri (RAMOS-ONSINS et al. 2004).
The haplotype structure of diversity in A. halleri differed from that found in A. lyrata (Figure 2). In contrast to A. lyrata alleles, A. halleri alleles formed three clades. In particular, one of these clades was closer to A. lyrata than to the other two clades (Table 4). Average levels of pairwise nucleotide differences between clade 3 and the other two clades reached π = 0.04, exceeding species-wide levels of diversity. This level of diversity is comparable to the level of nucleotide divergence observed in the intergenic region between A. lyrata and A. halleri (K per site = 0.037). To verify that CHS is a single-copy gene, we looked at allelic segregation in the progeny of two crosses: AH3xAH4 and AH4xAH22. The RFLP profiles of 10 progeny of each cross indicated that alleles segregated in a manner consistent with the expectations for alleles at a single locus (not shown). No significant deviation from neutrality was detected in the patterns of diversity in our sample (Table 2). A nonsignificant Tajima's D detected no deviation from expectations under the neutral-equilibrium model (D = 0.76 or D = 0.96, both P > 0.2). The coding region of CHS exhibits singular polymorphism features in A. halleri (RAMOS-ONSINS et al. 2004), with a highly significant Fay and Wu's H, presumably indicative of genetic introgression. In the intergenic region, Fay and Wu's H was not significant (H = 7.2 or H = 2.93, minimum P = 0.09), nor was the difference in the polymorphism-to-divergence ratio between the CHS intergenic and coding regions (HKA, χ2 = 2.06 or 2.14, minimum P > 0.1). Association patterns between adjacent sites indicated that branch length in the coalescent tree is compatible with an equilibrium neutral model (Wall's B = 0.667 or 0.622, minimum P > 0.9; WALL 1999). Each natural accession found to be heterozygous harbored alleles from different clades. Thus population subdivision is unlikely to explain this haplotype structure. Instead, in the two divergent alleles (AH3 and AH4-2), several gene conversion tracts were detected throughout the sequence (BETRAN et al. 1997). Three tracts involved a conversion from A. lyrata into A. halleri and one a conversion from A. thaliana into A. halleri. No similar pattern was found in the analysis of A. lyrata alleles. Introgression of related Arabidopsis species into A. halleri was previously suggested by a multilocus analysis (RAMOS-ONSINS et al. 2004). Therefore, the existence of divergent allelic lineages in A. halleri appeared compatible with genomewide patterns of variation. These alleles appear to have originated from the recombination of existing diversity and are not the result of a distinct evolutionary history along a separate branch of the genealogical tree. Haplotype networks, such as the one presented in Figure 2, allow the representation of recombination events and lineage mixtures and thus illustrate this phenomenon more accurately than a conventional phylogenetic tree.
Expression diversity:
In F1 individuals obtained from intra- as well as interspecific crosses, parental CHS cis-regulatory regions are in perfect linkage with parental coding regions and experience the same trans-regulatory background. Thus, the relative amount of parental CHS mRNA reflects the relative activity of parental cis-regulatory regions (COWLES et al. 2002). This approach allows us to evaluate the amount of cis-regulatory variation. In our assay, we used DNA from F1 individuals to experimentally model the null hypothesis of no difference in activity between parental cis-regulatory alleles (DE MEAUX et al. 2005). Indeed, heterozygous DNA contains equal amounts of parental alleles. A total of 461 and 421 F1 progeny were obtained from intraspecific crosses between multiple genotypes in A. lyrata and A. halleri, respectively (summarized in supplemental Table 1 at http://www.genetics.org/supplemental/). Plants were subjected to four different CHS induction treatments (48 hr in the dark, a subsequent 8 hr in the light, herbivory by P. xylostella, and insect-free control; see MATERIALS AND METHODS). Leaf tissue was subsequently harvested from these plants to examine CHS cis-regulatory variation. Relative allelic amounts were determined in a total of 709 and 597 samples (summarized in supplemental Table 2 at http://www.genetics.org/supplemental/). In addition, a total of 81 and 25 individuals were obtained from crosses between A. thaliana and A. lyrata or A. halleri, respectively, which yielded a total of 361 measurements of relative allelic amounts (supplemental Table 3 at http://www.genetics.org/supplemental/). CHS expression was also examined in floral tissue in the A. thaliana x A. lyrata hybrids.
Expression diversity in A. lyrata:
Using a GLM model, we investigated the effect of treatments (i.e., DNA and cDNA pools) and genotypes (progeny of a parental combination; see MATERIALS AND METHODS).
Three SNP assays were used to evaluate cis-regulatory diversity in A. lyrata. For each SNP assay, a significant effect of genotype and CHS induction treatments was detected (P ≤ 0.031, Table 5). Likewise, the interactions between genotype and induction treatments were always highly significant (P < 0.001, Table 5). This reveals that relative allelic expression varies across the CHS expression environments in a way that depends on the genotype of the progeny.
The analysis of the allelic differences in CHS expression in plants maintained 48 hr in the dark yielded three to four functional groups of cis-regulatory alleles (see Figure 3). The detailed analysis of the pairwise comparison of cis-regulatory activity indicates the following relationship in cis-regulatory activity: AL52-1
AL41 = AL52-2 >> AL22/AL7/AL3/AL10 >> AL12. The unknown AL22 intergenic allele could form an additional class but it is not known whether it is different from AL41. All cis-regulatory alleles showed equal activity after 8 hr of exposure to strong light as indicated by a nonsignificant difference between light-exposed-leaf cDNA samples and DNA samples for any of the genotypes. Thus these four functional groups respond differently to the onset of light, as they compensate in various degrees for the variable level of CHS expression in the dark.
Large-indel differences alone did not explain cis-regulatory differences in A. lyrata. For example, the known AL22 allele and AL3 had different large-indel content but no functional difference, whereas AL41 and AL7 were functionally different despite an identical large-indel content (Figures 2 and 3).
The average allelic proportion measured in our assay provided a rough estimate of the maximum fold difference in mRNA levels driven by cis-regulatory variation in each CHS expression environment (Table 6). In A. lyrata, maximums of 3.1- and 2.5-fold differences were observed in leaves maintained in the dark or challenged by herbivory, respectively.
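The conversion between an allelic proportion and a fold difference can be sketched as follows. This is a minimal illustration that assumes the fold difference is the ratio of the more abundant allele to the less abundant one, p / (1 - p); the exact computation behind Table 6 is not spelled out here, and the function names are hypothetical. Under that assumption, a 3.1-fold difference corresponds to an allelic proportion of about 0.76, and a 2.5-fold difference to about 0.71.

```python
def fold_difference(proportion_a):
    """Fold difference between two alleles given the proportion of allele A.

    The ratio is always expressed as major allele over minor allele,
    so proportions of 0.25 and 0.75 give the same result.
    """
    if not 0.0 < proportion_a < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    major = max(proportion_a, 1.0 - proportion_a)
    return major / (1.0 - major)


def proportion_from_fold(fold):
    """Proportion of the major allele implied by a given fold difference."""
    if fold < 1.0:
        raise ValueError("fold difference must be at least 1")
    return fold / (1.0 + fold)


for fold in (3.1, 2.5):
    print(f"{fold}-fold -> major allele proportion {proportion_from_fold(fold):.2f}")
```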
Expression differences in interspecific hybrids:
To evaluate the functional cis-regulatory divergence among Arabidopsis species, we crossed A. thaliana genotypes with both A. lyrata and A. halleri. Hybrid individuals have a haploid copy of each parental genome (i.e., 13 chromosomes) and are sterile. They are morphologically similar to their non-A. thaliana parent. In total, five CHS expression environments were assessed (48 hr dark, 8 hr light, 24 hr insect feeding and respective control, expression in flowers after 48 hr in the dark). Altogether, 245 and 116 relative allelic measurements were performed for A. thaliana × A. lyrata and A. thaliana × A. halleri F1 progeny, respectively (summarized in supplemental Table 3 at http://www.genetics.org/supplemental/).
Cis-regulatory differences between A. thaliana and A. lyrata:
In the A. thaliana × A. lyrata F1 progeny, our assay did not detect CHS expression in either dark-maintained leaves or control non-insect-challenged leaves. Detection of CHS expression in the A. thaliana × A. halleri progeny with the same SNP assay (see below) suggested differences in transcription factor expression between hybrid types. The GLM analysis was conducted on a data set that included hybrid DNA samples and mRNA samples collected from three CHS expression environments (flowers, leaves after light exposure, and insect-damaged leaves; Table 8). The analysis examined the following sources of variation: CHS expression environment, parental genotype, SNP assay, and interactions of SNP x treatment and SNP x parental genotype, as well as a technical covariate (see MATERIALS AND METHODS).
Absence of large maternal effect and methylation:
In each of the two species A. lyrata and A. halleri, four of nine crosses yielded individuals from both reciprocal crosses (supplemental Table 1 at http://www.genetics.org/supplemental/). In only two instances (two A. halleri progeny) was there any suggestion of reciprocal differences due to the direction of the cross (supplemental Table 6 at http://www.genetics.org/supplemental/; P = 0.047 for AH22xAH4 and P = 0.032 for AH12xSie).
Additionally, studies of newly formed allopolyploids suggest that interspecific hybrids may experience dramatic expression changes due to methylation of one or both parental copies (ADAMS et al. 2004; WANG et al. 2004). The interspecific hybrids obtained for this study were not polyploid. It seemed sensible, however, to evaluate the potential impact of methylation on the observed variation. We extracted DNA from leaves of the hybrids and from some of their parental genotypes. Levels of methylation were assessed at three potentially methylated CpG sites in the core promoter. No methylation could be detected at these sites in either parent or in the hybrid progeny. This suggests that bringing two distinct haploid genomes together in these interspecific hybrids did not dramatically alter methylation at the CHS intergenic region.
No simple candidate mutation to explain functional variation:
In A. thaliana, a light-responsive box was found to be polymorphic and to correlate with cis-regulatory differences in dark-maintained and light-exposed leaves (DE MEAUX et al. 2005). In both A. halleri and A. lyrata, this box is conserved. Thus, the cause of the differential cis-regulatory activity in the dark must lie elsewhere. Association between polymorphisms and functional cis-regulatory differences has been successful in A. thaliana, where levels of nucleotide diversity are low. In A. lyrata and A. halleri, alleles instead differ on average at >13 positions (π > 0.01 in both species examined here). Therefore, the observed functional diversity, either within or between species, could not be tracked down to any single polymorphic sequence feature. Likewise, it was not possible to determine whether nucleotide differences in the ACEMRE conserved regulatory element have functional consequences on CHS cis-regulation in A. thaliana vs. A. lyrata or A. halleri. We did not identify a candidate polymorphic motif to explain functional variation found within and between these species. It is interesting, however, to note that a W-box was lost through introgression of a sequence fragment from A. lyrata into the intergenic alleles AH4-2 and AH3. W-boxes are bound by WRKY transcription factors, involved in different types of stress and developmental responses (EULGEM et al. 2000). Whether this element is directly involved in the weaker cis-regulatory activity of the AH4-2 and AH3 alleles has to be tested experimentally.
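The link between the average number of differing positions and per-site nucleotide diversity mentioned above can be made concrete with a short Python sketch. It uses the standard definition of nucleotide diversity as the mean number of pairwise differences divided by the number of aligned sites; the toy alignment and function names are hypothetical, and dividing by the full alignment length while skipping gapped positions is a simplifying assumption. For instance, 13 average differences over roughly 1300 aligned sites would give a value of 0.01.

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Count mismatching positions, skipping sites where either sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1 for x, y in zip(seq_a, seq_b)
        if x != y and x != "-" and y != "-"
    )


def nucleotide_diversity(alignment):
    """Return (mean pairwise differences, per-site nucleotide diversity)."""
    pairs = list(combinations(alignment, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    mean_diff = sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)
    return mean_diff, mean_diff / len(alignment[0])


# Toy alignment for illustration only; these are not CHS intergenic sequences.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
mean_diff, pi = nucleotide_diversity(toy_alignment)
print(f"mean pairwise differences: {mean_diff:.2f}, per-site diversity: {pi:.3f}")
```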
| DISCUSSION |
|---|
Modular cis-regulatory variation in Arabidopsis:
To evaluate functional cis-regulatory variation at the species level, we performed crosses between parental genotypes sampled in different locations throughout the native range of A. lyrata and A. halleri. By means of these crosses, we compared expression of different alleles within the same cells, and thus in the same trans-regulatory background. Because cis-regulatory and coding regions of each parent are linked, differences in the relative amount of allelic mRNA directly reflect allelic cis-regulatory differences. Using this same approach, we also evaluated the functional divergence of CHS cis-regulation among species in the