
Originally published as Genetics Published Articles Ahead of Print on October 8, 2006.

Genetics, Vol. 174, 2181-2202, December 2006, Copyright © 2006
doi:10.1534/genetics.106.064543


Cis-regulatory Evolution of Chalcone-Synthase Expression in the Genus Arabidopsis

Juliette de Meaux*,†,1, A. Pop† and T. Mitchell-Olds‡

* Genetics and Plant Breeding, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany, † Genetics and Evolution, Max Planck Institute for Chemical Ecology, 07745 Jena, Germany and ‡ Department of Biology, Duke University, Durham, North Carolina 27708-0338

1 Corresponding author: Max Planck Institute for Plant Breeding Research, Carl-von-Linné Weg 10, 50829 Cologne, Germany.
E-mail: demeaux@mpiz-koeln.mpg.de

Manuscript received August 10, 2006. Accepted for publication September 26, 2006.


    ABSTRACT
 
The contribution of cis-regulation to adaptive evolutionary change is believed to be essential, yet little is known about the evolutionary rules that govern regulatory sequences. Here, we characterize the short-term evolutionary dynamics of a cis-regulatory region within and between two closely related species, A. lyrata and A. halleri, and compare our findings to A. thaliana. We focused on the cis-regulatory region of chalcone synthase (CHS), a key enzyme involved in the synthesis of plant secondary metabolites. We observed patterns of nucleotide diversity that differ among species but do not depart from neutral expectations. Using intra- and interspecific F1 progeny, we have evaluated functional cis-regulatory variation in response to light and herbivory, two environmental cues known to induce CHS expression. We find that substantial cis-regulatory variation segregates within and among populations as well as between species, some of which results from interspecific genetic introgression. We further demonstrate that, in A. thaliana, CHS cis-regulation in response to herbivory is greater than in A. lyrata or A. halleri. Our work indicates that the evolutionary dynamics of a cis-regulatory region are characterized by pervasive functional variation, achieved mostly by modification of response modules to one but not all environmental cues. Our study did not detect the footprint of selection on this variation.


THE rhythm, dynamics, and location of gene expression are fundamentally important for development of phenotypes. Transcription is controlled in part by the interaction of regulatory proteins (trans-regulatory factors) with specific DNA regions (cis-regulatory DNA regions or promoters). In contrast with the explicit constraints imposed on protein-coding DNA by the genetic code, the functional architecture of cis-regulatory DNA is less apparent. Expression variation is widespread within and between species (OLEKSIAK et al. 2002; BECHER et al. 2004; KHAITOVICH et al. 2004; KLIEBENSTEIN et al. 2006), and subtle changes in expression can significantly affect the phenotype (WANG et al. 1999; GOMPEL et al. 2005). Accordingly, cis-regulatory DNA is thought to play a prominent role in adaptive evolution (KING and WILSON 1975; WRAY et al. 2003). Rewiring the regulatory network through cis-changes may indeed allow the generation of phenotypic novelties while simultaneously preserving key physiological and developmental functions.

Our current understanding of cis-regulatory evolution is largely based on patterns of DNA conservation. There is now clear evidence that function in noncoding DNA (ncDNA) is broadly maintained. Whole-genome sequence comparisons between species have uncovered numerous segments of conserved noncoding DNA (DERMITZAKIS et al. 2004). Constraints on conserved segments are often experimentally related to functional conservation (CLIFTEN et al. 2001; KOCH et al. 2001; BOFFELLI et al. 2003). Interestingly, levels of constraint in ncDNA are found to vary across species (KEIGHTLEY and GAFFNEY 2003; KEIGHTLEY et al. 2005); they also can be larger than in protein-coding regions (BEJERANO et al. 2004). However, opportunities for adaptive evolution and lineage-specific functional changes remain poorly understood.

Recently, a study of nucleotide polymorphism and divergence within and between two Drosophila species over a large number of noncoding DNA loci suggested that many ncDNA changes, mostly located in UTRs, may have undergone adaptive evolution (ANDOLFATTO 2005). In humans, multiple instances of adaptive changes have been observed at specific cis-regulatory loci (BAMSHAD et al. 2002; ROCKMAN et al. 2003; HAHN et al. 2004; ROCKMAN et al. 2004), although some examples remain controversial (SABETI et al. 2005). In other species, including Drosophila, examples of neutral nucleotide variation at cis-regulatory regions have been reported (BALHOFF and WRAY 2005; FAY and BENAVIDES 2005; MACDONALD and LONG 2005).

Overall, the relationship between nucleotide and functional variation in cis-regulatory DNA has rarely been characterized, and little is known about the amount and type of variation to be expected within and among closely related species. In-depth characterization of cis-regulatory variation at both nucleotide and functional levels is required to gain some insight into the baseline evolutionary scenario of functional noncoding regulatory DNA. The most compelling models of cis-regulatory evolution are based on Drosophila developmental genes that are controlled by internal signals and whose misexpression is fatal to the organism (PHINCHONGSAKULDIT et al. 2004; LUDWIG et al. 2005). So far, the generation of cis-regulatory novelties in a less constrained expression context has received little attention. In comparison to animals, plants continuously fine tune their development to prevailing environmental conditions. Thus, plant models of cis-regulatory evolution in genes controlled by environmental signals may shed light on the possible adaptive role of cis-regulatory variation.

Previously, we examined regulatory polymorphisms within Arabidopsis thaliana. We established a robust allele-specific assay to examine cis-regulatory variation in response to abiotic, biotic, or developmental changes (DE MEAUX et al. 2005). This assay of allele-specific expression in F1 heterozygotes effectively controls for background variation and allows the detection of subtle cis-regulatory differences. We focused on the promoter region of the chalcone synthase (CHS) gene because it is among the best-characterized promoters in plants and its expression is induced by multiple cues (HARTMANN et al. 1998; KOCH et al. 2001; LOGEMANN and HAHLBROCK 2002). Among those cues, light and insect herbivory were shown to upregulate CHS expression in A. thaliana (REYMOND et al. 2000; JENKINS et al. 2001; WADE et al. 2001). In addition, CHS is the branch-point enzyme of a pathway involved in the interaction between plants and their abiotic and biotic environments (WINKEL-SHIRLEY 2001). Hence this gene is likely to play a role in adaptive evolution (see JOHNSON and DOWD 2004). In A. thaliana, we found substantial functional cis-regulatory variation in CHS expression. However, patterns of nucleotide variation in the A. thaliana CHS promoter showed no evidence of non-neutral evolution in this inbreeding annual species (DE MEAUX et al. 2005).

In this article, we elucidate the evolutionary dynamics of CHS cis-regulation in Arabidopsis and report microevolutionary patterns of cis-regulation. These experiments were conducted with A. halleri and A. lyrata, two self-incompatible species that differ substantially in their ecology (MITCHELL-OLDS 2001). In central Europe, A. halleri grows in highly competitive open meadows, whereas A. lyrata is restricted to low-competition habitats on exposed rocks (HOFFMANN 2005). We compared our findings to data from the model species A. thaliana (DE MEAUX et al. 2005), which differs from other species in the genus by its self-compatibility and low species-wide levels of diversity (WRIGHT et al. 2003; SCHMID et al. 2005).

We analyzed allele-specific expression levels in the progeny of intra- and interspecific crosses and evaluated functional polymorphism and divergence of CHS cis-regulation. We combined our functional assay with an analysis of polymorphism and divergence at the nucleotide level in the CHS 5' upstream intergenic region and addressed the following questions: (i) How does cis-regulatory diversity in A. lyrata and A. halleri compare to that in A. thaliana?, (ii) What qualitative and quantitative divergence in CHS cis-regulation is seen among species?, and (iii) Is a footprint of selection detectable in the intergenic region containing the CHS promoter in any of the three Arabidopsis species examined?

At the nucleotide level, the evolutionary dynamics of the CHS promoter region differed among the Arabidopsis species examined, in agreement with their genomewide patterns of diversity. Thus, patterns of variation at the CHS promoter region show no indication of adaptive evolution. Nonetheless, patterns of functional diversity point to abundant cis-regulatory variation within and between species, most of which results from qualitative differences in the response to individual environmental cues. Our results reveal that CHS cis-regulation evolves mainly by modification of cis-regulatory response modules to one but not all environmental cues.


    MATERIALS AND METHODS
 
Sequencing:
The 5' flanking region at the CHS gene (henceforth referred to as the "intergenic region") was sequenced from 15 and 8 individuals in A. lyrata and Arabidopsis halleri, respectively. All accessions are diploid and their geographic origin is described in Table 1. Each accession was named as follows: the two letters indicate the species, the first number indicates the population, the second number the individual in the population (if several individuals were studied in the population), and the number after the hyphen describes the allele (if more than one allele was uncovered in a given individual). All A. lyrata accessions were provided by M. J. Clauss (Max Planck Institute for Chemical Ecology, Jena, Germany) with the exception of AL11, AL12, and AL10, which were provided by T. Mitchell-Olds and T. Sharbel (Institute of Plant Genetics and Crop Plant Research, Gatersleben, Germany). A. halleri accessions were collected by M. J. Clauss (AH4 and AH5) and T. Mitchell-Olds. P. Saumitou-Laprade (University of Lille, Lille, France) provided the seeds for AH11, AH12, AH21, and AH22. Young leaves from each accession were ground in liquid nitrogen and DNA was subsequently purified following standard CTAB protocol. CHS is single copy in the A. thaliana genome (KOCH et al. 2000). A Southern blot analysis confirmed that CHS occurs also as a single copy in A. lyrata and A. halleri (not shown). To amplify the CHS intergenic region in A. lyrata and A. halleri, we used the annotated A. thaliana genome to design a forward primer in the closest adjacent putative open reading frame (ORF) 5' upstream from CHS (5'-AGGACAATCGTTGATCCAG-3') and a reverse primer in the first CHS exon (5'-GTAGTCAGGATACTCCGC-3'). The adjacent ORF (annotated AT5G13920) belongs to the Zinc knuckle protein family (http://www.Arabidopsis.org). The PCR was conducted as previously described with the exception of the use of a 54° annealing temperature for PCR cycling (DE MEAUX et al. 2005). 
Two independent PCRs were performed and products were cloned using a TOPO TA cloning kit (Invitrogen Life Technologies, Paisley, UK). Six clones per PCR were sequenced on one strand with an ABI3700 capillary sequencer using primers placed approximately every 500 bp. Sequences were assembled with Seqman 5.0 (DNASTAR) and each variable site was checked by examining sequence chromatograms. Sites found to be polymorphic across clones obtained from separate PCRs indicated the segregation of two alleles in the PCR pool. Allele sizes were checked by RFLP in A. halleri (not shown). In A. lyrata, genotyping assays based on singletons detected a second allele in AL22 (henceforth called "unknown AL22 allele") but not in AL12. The CHS exons 1 and 2 were sequenced following RAMOS-ONSINS et al. (2004) in the 7 A. lyrata and 5 A. halleri parental genotypes used for the expression assay (Table 1). Sequences are available from the EMBL Nucleotide Sequence Database under accession nos. AM296511–AM296543.



 
TABLE 1 Geographical origin of the individuals analyzed and levels of diversity found within populations

 
Population genetic analyses:
We assumed that individuals for which a single sequence was obtained carried two identical alleles. Sequences were aligned with Megalign 5.03 (DNASTAR). The DnaSP 4.0 program (ROZAS and ROZAS 1999) was used for both intra- and interspecific analyses of nucleotide polymorphism (Table 2). Deviations from panmixia, e.g., population subdivision, violate basic assumptions of most neutrality tests. We investigated the existence of genetic differentiation using the Snn estimator developed by HUDSON et al. (1992) and implemented in DnaSP. This estimator is a nucleotide-sequence-based measure of genetic differentiation between populations. Significance of Snn was tested using 1000 permutations. Following RAMOS-ONSINS et al. (2004), we chose one sequence per location if significant genetic differentiation was detected in our sample to perform neutrality tests on the basis of species-wide estimates of diversity. In A. halleri, two summary values of variation are reported (Table 2) because distinct subsamples sometimes yielded markedly different values. This was not the case for A. lyrata. In the reduced sample, per-site nucleotide diversity is described as πt (between populations). Per-site nucleotide diversity was also computed within populations (πw). Species-wide patterns of nucleotide polymorphism were summarized by various test statistics: Tajima's D, based on the differences between two estimators of intraspecific diversity, and Fay and Wu's H, which makes use of an outgroup sequence to analyze the frequency of derived polymorphisms (TAJIMA 1989; FAY and WU 2000). Associations among nucleotide variants can be summarized by Wall's B statistics, which evaluate the proportion of adjacent segregating sites that partition equally the sequence sample (WALL 1999). Such sites occur along the same branch of the coalescent tree and the test examines whether branch length is compatible with the neutral equilibrium model.
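The differentiation test described above can be illustrated with a minimal sketch. It assumes the usual nearest-neighbor definition of Snn (the average, over sequences, of the fraction of each sequence's nearest neighbors that come from the same population) and a simple label-shuffling permutation test. The function names, the toy data, and the use of Hamming distance on gap-free sequences are assumptions made for illustration; this is not the DnaSP implementation used in the study.

```python
import random


def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def snn(seqs, pops):
    """Nearest-neighbor statistic: mean fraction of nearest neighbors
    that belong to the same population as the focal sequence."""
    total = 0.0
    for i, s in enumerate(seqs):
        dists = [(hamming(s, t), j) for j, t in enumerate(seqs) if j != i]
        dmin = min(d for d, _ in dists)
        nearest = [j for d, j in dists if d == dmin]  # ties are shared
        total += sum(pops[j] == pops[i] for j in nearest) / len(nearest)
    return total / len(seqs)


def snn_permutation_test(seqs, pops, n_perm=1000, seed=0):
    """Return (observed Snn, fraction of label permutations with Snn >= observed)."""
    rng = random.Random(seed)
    observed = snn(seqs, pops)
    labels = list(pops)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if snn(seqs, labels) >= observed:
            hits += 1
    return observed, hits / n_perm


# Hypothetical toy sample: two populations fixed for different haplotypes.
toy_seqs = ["AAAA", "AAAA", "TTTT", "TTTT"]
toy_pops = [0, 0, 1, 1]
observed, p_value = snn_permutation_test(toy_seqs, toy_pops)
```

With real data, `seqs` would hold the aligned intergenic sequences and `pops` the sampling location of each sequence; a small permutation fraction would indicate genetic differentiation.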
The compatibility of H and B statistics with evolution under the standard neutral model was tested by 1000 coalescent simulations. The Hudson, Kreitman, and Aguadé (HKA) test is based on the prediction that, for a particular region of the genome, the rate of divergence between species is proportional to the levels of polymorphism within species (HUDSON et al. 1987). This test compares the ratio of intraspecific polymorphism to interspecific divergence in two loci. HKA tests were performed for silent positions using silent segregating sites and the silent divergence value (NEI 1987) to compare the intergenic region and the CHS coding region. In the intergenic region, all positions were considered to be silent. These four neutrality tests (Tajima's D, Fay and Wu's H, Wall's B, and HKA) focus on different characteristics of nucleotide polymorphism and thus effectively summarize the evolutionary history of the CHS intergenic region. Intergenic sequences were examined for known transcription-factor-binding sites as previously described (DE MEAUX et al. 2005). Gene conversion tracts between alleles identified in the three different species were searched using the algorithm described by BETRAN et al. (1997) and implemented in DnaSP. This algorithm uses the frequency of a nucleotide at a site to determine if the site is informative to detect a conversion event between groups of sequences. The length of the conversion event is determined by the distance between informative sites (BETRAN et al. 1997). We investigated the gene genealogy among haplotypes using the program TCS, to account for population-level phenomena such as recombination or persistence of the ancestral haplotype (CLEMENT et al. 2000).
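For readers unfamiliar with the summary statistics named above, the following sketch computes the number of segregating sites, the mean number of pairwise differences, and Tajima's D using the standard formulas of TAJIMA (1989). It assumes equal-length, gap-free sequences; the function names and the toy alignment are hypothetical, and the sketch is not a substitute for the DnaSP analyses reported here.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Count alignment columns that carry more than one nucleotide state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differences over all sequence pairs (pi, per locus)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def tajimas_d(seqs):
    """Tajima's D; returns None when the variance term is zero (n < 4)
    or when the sample has no segregating sites."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    pi = mean_pairwise_differences(seqs)
    theta_w = s / a1  # Watterson's estimator, per locus
    return (pi - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment of four sequences and four sites.
toy = ["AAAA", "AAAT", "AATT", "ATTT"]
per_site_pi = mean_pairwise_differences(toy) / len(toy[0])
d_value = tajimas_d(toy)
```

Dividing the mean number of pairwise differences by the alignment length gives a per-site diversity comparable in spirit to the πt and πw values of Table 2.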



 
TABLE 2 Summary statistics

 
Allele-specific quantification of CHS expression in F1:
Plant material used for crosses:
The accessions used to perform crosses were chosen from different populations covering a representative part of the species ranges. Whenever possible, accessions found to be homozygous in the intergenic region were chosen. Because both A. halleri and A. lyrata are self-incompatible, crosses were performed by simply rubbing stamen of the paternal genotype with the pistil of the maternal genotype. In A. lyrata, seven accessions were used to generate 17 F1 progeny, 10 of which yielded enough individuals for statistical analysis (supplemental Table 1a at http://www.genetics.org/supplemental/). In A. halleri, five accessions were used to generate 8 F1 progeny large enough to be analyzed statistically (supplemental Table 1b at http://www.genetics.org/supplemental/). Additional combinations could not be used, either because crosses remained unsuccessful (presumably due to self-incompatibility) or because parental alleles could not be differentiated by any polymorphism. When possible, reciprocal crosses were performed to control for maternal effects (see supplemental Table 1, a and b, at http://www.genetics.org/supplemental/).

We further obtained four A. thaliana x A. lyrata hybrid progeny from three A. thaliana (Ei-2, Kas-1, and Ag-0) and four A. lyrata parental genotypes (AL3, AL22, AL52, AL41) and three A. thaliana x A. halleri hybrid progeny from two A. thaliana (Kas-1, Ka-0) and two A. halleri parental genotypes (AH4, AH12), for a total of 81 and 25 hybrids, respectively (see supplemental Table 1c at http://www.genetics.org/supplemental/). Only crosses using A. thaliana as a mother were successful, and not all crosses were equally successful (see supplemental Table 3 at http://www.genetics.org/supplemental/).

Seeds were sown on humid filter paper in small petri dishes and vernalized at 4° in the dark for 2 weeks, followed by 2 weeks in Voetsch reach-in chambers (12-hr day, 20° day temperature, 16° night temperature, 70% humidity) for germination. The whole procedure was repeated until germination was successful. One-week-old seedlings were transplanted into single pots and assigned to random positions in York walk-in growth chambers in the following conditions: 11-hr day, 21° day temperature, 16° night temperature.

CHS expression experiments:
In A. thaliana, CHS gene expression is repressed in the dark and strongly induced in the light (JENKINS et al. 2001). CHS expression is also induced upon feeding by Plutella xylostella larvae (H. VOGEL and T. MITCHELL-OLDS, unpublished results). To assess the cis-regulatory diversity of CHS expression in response to these environmental cues, plants either were placed for 48 hr in the dark followed by 8 hr of strong light or were challenged with P. xylostella larvae for 24 hr, following DE MEAUX et al. (2005).

Three- to 6-month-old plants were used for the CHS expression experiments. For intraspecific crosses, CHS expression experiments were performed in two independent trials separated by a 2-week interval. Half of the progeny of each cross were randomly attributed to one or the other trial. Within each trial, half of the progeny were randomly assigned to be sampled for CHS expression in the dark and the other half for CHS expression in the light. All plants, however, underwent the entire dark/light treatment. Insect-feeding experiments were carried out at least 4 weeks after the light experiment. Likewise, within each trial, half of the progeny were randomly assigned to be sampled for CHS expression in insect-challenged leaves and the other half for CHS expression in control leaves of insect-free plants.
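The allocation scheme described above (half of each progeny per trial and, within each trial, half per sampling condition) can be sketched as follows. The function name, the plant identifiers, and the use of a seeded pseudorandom generator are illustrative assumptions; the source does not specify how the random assignment was carried out.

```python
import random


def assign_design(plant_ids, conditions=("dark", "light"), seed=None):
    """Randomly split plants into two trials, then split each trial
    into the two sampling conditions. Returns {plant: (trial, condition)}."""
    rng = random.Random(seed)
    plants = list(plant_ids)
    rng.shuffle(plants)
    half = len(plants) // 2
    design = {}
    for trial, group in ((1, plants[:half]), (2, plants[half:])):
        quarter = len(group) // 2
        for plant in group[:quarter]:
            design[plant] = (trial, conditions[0])
        for plant in group[quarter:]:
            design[plant] = (trial, conditions[1])
    return design


# Hypothetical progeny of eight F1 plants from one cross.
design = assign_design(range(8), seed=42)
```

The same function could be reused for the insect-feeding experiment by passing `conditions=("insect", "control")`.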

For the interspecific progeny, CHS expression experiments were conducted in the same way but in a single trial. For the A. thaliana–A. lyrata F1 progeny, flowers were harvested at the time point between the end of the dark period and the beginning of the light period on the plants that had been attributed to the dark treatment, to look at CHS expression independently from the influence of light. CHS is known to be specifically upregulated in A. thaliana flowers where flavonoids are produced abundantly (BURBULIS et al. 1996). Due to delayed flowering, flower-specific CHS expression was not studied in the A. thaliana–A. halleri progeny.

Quantitative analysis of allele-specific CHS expression:
Approximately 2 sq cm of leaf material (or two flower buds) was harvested. RNA extraction and cDNA synthesis were performed as described previously (DE MEAUX et al. 2005). Allele-specific CHS mRNA was quantified using the quantitative properties of pyrosequencing (NEVE et al. 2002; DE MEAUX et al. 2005). To control for possible position effects in the thermocycler, cDNA samples together with DNA extracted from heterozygous plants were randomly distributed across 96-well plates prior to PCR. For the A. lyrata progeny, the number of samples allowed a hierarchical randomization of cDNA samples across plates within a given trial. For the A. halleri progeny, no hierarchical randomization was performed. The pyrosequencing reactions were performed using the PyrosequencerAB device (Biotage, Uppsala, Sweden) as previously described (DE MEAUX et al. 2005).

In each of the two species, three single nucleotide polymorphisms (SNPs) located in the CHS coding region were used to measure allele-specific CHS expression: SNP1008 (PCR primers 5'-TCGGTCAGGCTCTTTTCAGTG-3' and 5'-TGTCCGTCTATGGCACCATC-3', sequencing primer 5'-GGGAGGATGGTCTGT-3'), SNP572 (PCR primers 5'-GGAAACGCCACATGCATCTG-3' and 5'-TCCTTGATGGCCTTCACTGC-3', sequencing primer 5'-TTAGGGACTTCAACC-3'), and SNP591 (PCR primers 5'-GGAAACGCCACATGCATCTG-3' and 5'-TCCTTGATGGCCTTCACTGC-3', sequencing primer 5'-CTGCCGCTTCTTTGCC-3') in A. lyrata; SNPB6 (PCR primers 5'-GGAAACGCCACATGCATCTG-3' and 5'-TCCTTGATGGCCTTCACTGC-3', sequencing primer 5'-TAAGCGCACATGTGTGG-3'), SNPM6 (PCR primers 5'-GGAAACGCCACATGCATCTG-3' and 5'-TCCTTGATGGCCTTCACTGC-3', sequencing primer 5'-GATGTCCTGTCGGGTG-3'), and SNPCZ (PCR primers 5'-GACCGACCTCAAGGAGAAG-3' and 5'-TTGATGGCCTTCACTGCCG-3', sequencing primer 5'-CTAGCTTAGGGACTTCA-3') in A. halleri; for A. thaliana–A. lyrata hybrids, we used SNP1230 (PCR primers 5'-ACCTTCCATCTCCTCAAGG-3' and 5'-CTCTTCCTTTAGTCCTAGC-3', sequencing primer 5'-CCTTTAGTCCTAGCTT-3') and SNP587 (PCR primers 5'-GACCGACCTCAAGGAGAAG-3' and 5'-TTGATGGCCTTCACTGCCG-3', sequencing primer 5'-CCTAGCTTAGGGAC-3'); for A. thaliana–A. halleri hybrids, we used SNP587 and SNP1370 (PCR primers 5'-AGGTGGAGATAAAGCTAGG-3' and 5'-AAGACACCCCACTCCAACCC-3', sequencing primer 5'-CTCCAACCCTTCTCCT-3'). Only half of the progeny of AL22xAL12, AL22xAL3, AH22xAH4, and AH22xAH5 were analyzed due to heterozygosity of either the AL22 parent for SNP591 or the AH22 parent for SNPM6. Promoter allele genotyping indicated that individuals analyzed in these progeny harbored either the sequenced AL22 or the AH22-1 alleles at the CHS intergenic region described below (Table 1). The genotyping assay is described below. For several progeny, as well as for interspecific hybrids, expression data were analyzed using two independent SNP assays.
Data obtained with different SNP assays were all significantly correlated (minimum P = 0.027, Table 3). The strength of the correlation between SNP assays depends on the overall amount of cis-regulatory variation in the progeny.



 
TABLE 3 Correlation between SNP assays for parental allelic combinations harboring two SNP differences in the CHS coding region

 
Ratios of polymorphic over monomorphic sequencing peaks were deduced from pyrosequencing measurements, which provided an estimate of relative allelic concentration in mRNA pools. The ratios were calibrated as previously described by WITTKOPP et al. (2004). For interspecific crosses, a marked PCR bias was observed for all three SNP assays, and the standard curve was better modeled by a second-degree polynomial equation. This might result from the higher sequence divergence of orthologous mRNAs and may partly explain the higher variance of the pyrosequencing measurement observed in the quantification of species-specific CHS mRNA levels in interspecific F1 hybrids (see below).
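A second-degree polynomial standard curve of the kind mentioned above could be fitted as in the following sketch. It assumes that the curve maps a measured peak ratio onto a known allelic proportion from a dilution series; the direction of the mapping, the function names, and the standards shown are hypothetical and are not taken from the study or from WITTKOPP et al. (2004).

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x**2 via the normal equations."""
    # Power sums of x and cross sums with y define the 3x3 linear system.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (m[i][3] - tail) / m[i][i]
    return coef


def calibrate(measured_ratio, coef):
    """Convert a measured ratio into a calibrated allelic proportion."""
    c0, c1, c2 = coef
    return c0 + c1 * measured_ratio + c2 * measured_ratio ** 2


# Hypothetical standards: measured ratios and the known proportions they represent.
measured = [0.10, 0.30, 0.50, 0.70, 0.90]
known = [0.02 + 0.80 * x + 0.15 * x ** 2 for x in measured]
coef = fit_quadratic(measured, known)
calibrated = calibrate(0.50, coef)
```

A linear standard curve is the special case in which the quadratic coefficient is close to zero, which is one way to see why a strong PCR bias calls for the extra term.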

Evaluation of methylation in the CHS intergenic region:
Approximately 2 g of leaves from natural genotypes Ei-2 (A. thaliana), AH4 (A. halleri), AL52 (A. lyrata), and from six and seven A. thaliana–A. halleri and A. thaliana–A. lyrata hybrids, respectively, were collected. DNA was extracted using the Midi-Prep DNA extraction kit (QIAGEN, Valencia, CA). Quantitative evaluation of methylation was performed by M. Pettersson at Biotage (Premium CpG Methylation Service, Uppsala, Sweden) at two CpG sites located within the core promoter. One CpG site is located 2 bp upstream of the A-box and the other within the A-box. The A-box is an essential element of the core promoter located between two equally essential elements (the MRE and ACE elements; LOGEMANN and HAHLBROCK 2002). These two CpG sites correspond to positions 1342 and 1346 on the alignment provided in the supplemental data at http://www.genetics.org/supplemental/.

Individual genotyping in progeny from heterozygous parents:
In the AL22 parent, only one allele was detected in both intergenic and coding regions by the sequencing strategy described above. However, SNP591 revealed that the AL22 parent is heterozygous in the CHS coding region. Using a singleton carried by the AL22 individual at position 1367 (see alignment provided in supplemental data at http://www.genetics.org/supplemental/), we genotyped alleles in the intergenic region of all AL22 progeny by pyrosequencing, using PCR primers 5'-AAAGGGGGCTAACAACTAGCC-3' and 5'-GAAAGATGGCGGAGAGTG-3' and the SNP primer 5'-GGGAAAAAGGAGATG-3'. This analysis showed that SNP591 allowed the assessment of individuals carrying the identified AL22 allele. The AL22xAL7 progeny were assessed by the SNP1008 assay, which is based on a singleton carried by the AL7 cDNA allele. In these progeny, individuals carrying either allele could be assessed. Similarly, we genotyped the AL52 intergenic allele of all individuals in the AL41xAL52 progeny (position 1333 in the alignment provided as supplemental data at http://www.genetics.org/supplemental/, PCR primers as above, SNP primer 5'-ATGGACGGGCGGATGAAG-3'). We further used singletons found in intergenic allele AL11-1 to confirm that this allele was not present in the AL12 individual used for crosses (position 914 in the alignment provided as supplemental data at http://www.genetics.org/supplemental/, PCR primers 5'-GAGTTAAGTATGCACGTG-3' and 5'-TACGTACACCAACAAAAGGG-3', SNP primer 5'-GGAGATTTCACTTCCC-3'). In A. halleri, RFLP blots confirmed the size and number of alleles obtained by sequencing (not shown). AH22 and AH4 alleles were genotyped at positions 921 and 935, respectively (see alignment provided in supplemental data at http://www.genetics.org/supplemental/), using PCR primers 5'-GAGTTAAGTATGCACGTG-3' and 5'-TACGTACACCAACAAAAGGGG-3' and SNP primer 5'-GTAGAGTTTCTCCACC-3'.

Statistical analysis of expression data:
We conducted the statistical analysis in three steps. In the first step, we tested for trial effects, excluding measurements made on heterozygous DNA, by performing the following GLM analysis

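$$Y_{ijkl} = \mu + G_i + I_j + T_k + P_l + C + GI_{ij} + IT_{jk} + GT_{ik} + GTI_{ijk} + \varepsilon_{ijkl} \qquad (1)$$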
where µ is the grand mean, Gi is the effect of the ith genotypic combination or cross, Ij is the jth CHS expression environment or treatment (i.e., dark-maintained, light-maintained, insect-damaged, and control leaves), Tk is the kth trial, Pl is the lth PCR and pyrosequencing plate (for A. lyrata data, Pkl is the lth PCR and pyrosequencing plate in the kth trial), C is a technical covariate following DE MEAUX et al. (2005) and GIij, ITjk, GTik, and GTIijk represent interactions between cross x treatment, treatment x trial, cross x trial, and cross x treatment x trial, respectively. When possible, Mim, the effect of the mth mother in the ith genotype was added to the model. A significant cross x treatment effect indicates that CHS cis-regulatory alleles respond differently to the different expression conditions examined in this study. Within a genotypic combination or cross, several allelic combinations may be segregating if the parents are heterozygous. In this analysis, allelic combinations are taken together. This approach is conservative as the presence of different allelic combinations with different effects on cis-regulation will tend to increase the variance and consequently to decrease power to detect functional cis-regulatory differences between parents.

In the second step, we reincorporated the DNA measurements and investigated the existence of main effects or interaction. In this second analysis, the treatment source of variation included five treatments: cDNA samples from dark-maintained, light-maintained, insect-damaged, and control leaves as well as DNA samples from heterozygous individuals. In A. halleri, main and interaction effects involving trials were not significant. Subsequently, we included DNA samples and repeated the GLM analysis with a modified model (1) without trial effects. In A. lyrata, trial effect was significant in one of three SNP assays. Thus DNA samples were randomly attributed to one or the other trial and the GLM analysis was repeated using model (1).

In the third step, we dissected the main effects of genotypes and treatments as well as their interaction. For this, we conducted a separate GLM analysis for each progeny, with the following model

$$Y_{jkl} = \mu + I_j + P_l + T_k + IT_{jk} + C + \varepsilon_{jkl} \qquad (2)$$

where µ is the grand mean, Ij is the jth CHS expression environment or treatment, Pl is the lth PCR and pyrosequencing plate (for A. lyrata data, Pkl is the lth PCR and pyrosequencing plate in the kth trial), Tk is the kth trial, ITjk represents the interaction between treatment and trial, and C is a technical covariate following DE MEAUX et al. (2005). The trial effect and the trial x treatment interaction were not included in the model for analysis in A. halleri because no significant trial effect was found in the global analysis (see above). When possible, Mm, the effect of the mth mother, was added to the model. When data for a single genotype could be collected with more than one SNP assay, data obtained with both SNP assays were pooled, a SNP effect was added to the model, and the plate effect was nested within the SNP. If the size of the data set was too small, i.e., if fewer than two repeated measurements were available per cell to test all effects described above, a model (3) without interaction was used. For three parental combinations (AL22xAL7, AL41xAL52, and AH22xAH4), individuals of the progeny were genotyped and an allele effect, as well as an interaction between allele and treatment, was added to the GLM model. Results for each GLM analysis are reported in supplemental Tables 4 and 6 at http://www.genetics.org/supplemental/.

Species-specific CHS expression in F1 A. thaliana–A. lyrata hybrids was analyzed using a similar GLM model with modifications, depending on the sample size. For example, in the A. thaliana–A. lyrata hybrids, missing data prevented the analysis of a genotype x treatment interaction. For the A. thaliana–A. halleri F1 hybrids, sample size was limited and CHS expression could not be studied in each genotype combination and each environment. In addition, all cDNA and DNA samples fit within one 96-well plate for one SNP assay. Therefore, for these F1 hybrids, we used a GLM model that did not investigate either PCR plate or genotype effect.

To identify treatments in which relative allelic expression of a progeny was significantly different, we performed a post-hoc test using Tukey's honest significant difference (HSD) test, which compares each treatment least squares (LS) mean with every other treatment mean in a pairwise manner and controls the family-wise type I error at no more than 0.05. This test is suitable for pairwise comparisons performed without a priori expectations about which pairs of mean measurements may differ (QUINN and KEOUGH 2002). For the progeny in which parental allele effect was investigated, a Tukey's HSD test was performed to identify treatment x allele means that were significantly different (reported in supplemental Table 5 at http://www.genetics.org/supplemental/).

Fold-difference estimates:
Calibrated pyrosequencing data provide a rough estimate of the amplitude of cis-regulatory differences between species or between genotypes within species. The mean of calibrated pyrosequencing data, or the mean relative allelic proportion of CHS in a biallelic CHS cDNA pool, was computed for each F1 progeny and each CHS expression environment. The highest mean value identified by the GLM analysis as significantly different from allelic proportions in DNA was used to calculate an approximate maximum fold difference of CHS mRNA abundance due to cis-regulatory variation.
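The conversion from a mean allelic proportion to an approximate fold difference can be illustrated with a minimal sketch. It assumes that the fold difference is taken as the ratio of the two allelic abundances in the cDNA pool, normalized by the same ratio in heterozygous DNA (0.5 when both alleles are equally represented). The function name `fold_difference` and its parameters are hypothetical and are not taken from the original analysis.

```python
def fold_difference(cdna_proportion, dna_proportion=0.5):
    """Approximate fold difference in mRNA abundance between two alleles.

    cdna_proportion: mean calibrated proportion of one allele in the cDNA pool.
    dna_proportion: proportion of the same allele in heterozygous DNA
                    (assumed to be 0.5 unless measured otherwise).
    """
    for p in (cdna_proportion, dna_proportion):
        if not 0.0 < p < 1.0:
            raise ValueError("proportions must lie strictly between 0 and 1")
    # Allele-to-allele abundance ratio in cDNA and in the DNA reference.
    cdna_ratio = cdna_proportion / (1.0 - cdna_proportion)
    dna_ratio = dna_proportion / (1.0 - dna_proportion)
    # Normalizing by the DNA ratio removes any assay bias shared by both pools.
    return cdna_ratio / dna_ratio


if __name__ == "__main__":
    # Illustrative value only: an allele making up 80% of the cDNA pool.
    print(round(fold_difference(0.8), 2))
```

Under these assumptions, an allele that accounts for 80% of the transcripts in a biallelic pool would correspond to roughly a fourfold difference; values below 1 indicate that the other allele is the more abundant one.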


    RESULTS
 
Nucleotide variation:
We sequenced the 5' flanking region at the CHS gene (the intergenic region) from 15 accessions in A. lyrata and from 8 accessions in A. halleri. Table 1 summarizes the number of accessions sequenced and the number of alleles obtained and their length. All accessions are diploid. The intergenic regions of A. halleri and A. lyrata are ~91% identical to the orthologous A. thaliana sequence (KOCH et al. 2001; DE MEAUX et al. 2005). In A. lyrata, the intergenic region was sequenced in 14 accessions from nine locations in Europe and one location in North America. Nineteen alleles varied in size from 1248 to 3399 bp, with large multiple independent insertions (Figure 1). Interestingly, all these insertions occurred between two regions, which were highlighted as strongly constrained in the Brassicaceae (KOCH et al. 2001). We found that one of these insertions contained a 200-bp fragment of a Mariner transposable element (FESCHOTTE and WESSLER 2002). The history of these insertions is likely to be complex. For example, in the AL10 allele, a 426-bp insertion was found, which is in the same position and alignable with the insertion found in alleles AL22, AL7, AL8, AL41, and AL42 (see Figure 1), but has modified nucleotides at the junction points. We removed these insertions from the sequences for the purpose of the alignment-based analysis of diversity (see supplemental data at http://www.genetics.org/supplemental/ for the alignments and the location of the large insertions). In the remaining alignment, we found 42 SNPs (20 singletons) and 47 indels (of which 26 were singletons). Indels ranged in size from 1 to 44 bp, with 28 of 47 affecting only one nucleotide position.


FIGURE 1.— Distribution of polymorphisms along the intergenic region upstream from the CHS open reading frame in (a) A. lyrata and (b) A. halleri. Bars on the top and bottom part of the graph indicate single nucleotide and insertion/deletion polymorphisms, respectively. Shaded bars along the sequence delineate the phylogenetic footprints found by KOCH et al. (2001) in the Brassicaceae. The hatched box indicates the 5'-UTR. Nucleotide positions along the sequence are indicated as base-pair distances from the ATG.

 
Significant population structure was detected (Snn = 0.685, P < 0.001); hence one sequence was randomly chosen in each location, following RAMOS-ONSINS et al. (2004). Over this reduced sample the level of diversity (measured as the average pairwise number of differences per site, or πt) reached 0.010 (Table 2). This value falls within the range of the silent diversity levels found at eight loci in A. lyrata (RAMOS-ONSINS et al. 2004). Interestingly, within one population, the level of diversity was comparable to species-wide diversity (πw = 0.008 in Lilienfeld, Austria; Table 1). By contrast, individuals sampled in two Swedish populations were almost identical (only one 4-bp indel difference in a TAn tract), pointing to a heterogeneous distribution of diversity within this species. A nonsignificant Tajima's D indicated that the frequency distribution of SNP polymorphisms did not deviate from expectations under the neutral-equilibrium model (D = 0.225, P > 0.1). The value of D was typical of previously sampled loci in A. lyrata (RAMOS-ONSINS et al. 2004). Using A. thaliana as an outgroup, we analyzed the frequency distribution of derived mutations by Fay and Wu's H test. No excess of high-frequency-derived mutations was detected (H = 0.413, P > 0.5). Association patterns between adjacent sites indicated that branch length in the coalescent tree is compatible with an equilibrium neutral model (Wall's B = 0.121, P > 0.2; WALL 1999). The HKA comparison of polymorphism-to-divergence ratios across loci indicates whether a given DNA region has an unusual rate of evolution or polymorphism. We compared the ratio of polymorphism to divergence in the CHS coding region to that in the intergenic region, using A. thaliana as outgroup. The HKA test results were nonsignificant (χ2 = 0.413, P > 0.5).
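For readers unfamiliar with the summary statistics used above, the following minimal sketch shows how the average number of pairwise differences and Tajima's D can be computed from a gap-free alignment, using the standard constants of Tajima (1989). It is an illustration only: the function names are hypothetical, indels and missing data are ignored, and significance (obtained in practice from coalescent simulations or a beta approximation) is not evaluated.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Count alignment columns that carry more than one nucleotide."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def tajimas_d(seqs):
    """Tajima's D for equal-length, gap-free sequences (needs n >= 4)."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    k = mean_pairwise_differences(seqs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its variance.
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    toy = ["AAAA", "AAAT", "AATT", "ATTT"]  # made-up alignment
    per_site_pi = mean_pairwise_differences(toy) / len(toy[0])
    print(segregating_sites(toy), per_site_pi, tajimas_d(toy))
```

Dividing the mean number of pairwise differences by the alignment length gives a per-site diversity comparable in definition to the π values reported here, although the toy alignment is invented and bears no relation to the CHS data.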

In A. halleri, the intergenic region was sequenced in eight accessions from six locations in Europe. Ten alleles were uncovered, varying in size from 1085 to 1704 bp (AH4-1 and AH6-2 are identical). Difference in allele length was mostly due to a large indel found in the 5' part of the alleles AH4-2 and AH3 (Figure 1). In addition, two large indels (>40 bp) were observed at different positions along the sequence. A 42-bp deletion was observed only in allele AH4-2 and a 167-bp insertion was found in the AH11 and AH12 alleles. For alignment-based analysis of diversity, these regions were removed.

A total of 53 SNPs were observed, with five singletons. Eighteen indels of <40 bp were observed, with only one being a singleton. Population structure was apparent in our sample (Snn = 0.604, P < 0.001); therefore we performed neutrality tests on samples containing one sequence for each population. The AH4 individual harbored two divergent alleles and the inclusion of one or the other allele in the subsample modified substantially the levels of diversity. Therefore neutrality tests were performed on two subsamples, one containing AH4-1 and the other AH4-2, to reflect the range of values that can be observed (Table 2). The level of diversity, πt, reached 0.021 (with AH4-1) or 0.025 (with AH4-2), falling within the range of synonymous diversity at eight coding loci in A. halleri (RAMOS-ONSINS et al. 2004).
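The Snn statistic used to detect population structure is the nearest-neighbor statistic of Hudson (2000): for each sequence, the fraction of its nearest neighbors (in number of differences) that come from the same locality, averaged over all sequences. The sketch below is a minimal illustration under the assumption of gap-free, equal-length sequences; the names `hamming` and `snn` are hypothetical, and the permutation step normally used to obtain a P-value is omitted.

```python
def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def snn(seqs, pops):
    """Nearest-neighbor statistic for sequences with population labels.

    seqs: list of aligned sequences of equal length.
    pops: list of population labels, one per sequence.
    """
    n = len(seqs)
    total = 0.0
    for j in range(n):
        # Distance from sequence j to every other sequence, with its label.
        dists = [(hamming(seqs[j], seqs[i]), pops[i])
                 for i in range(n) if i != j]
        dmin = min(d for d, _ in dists)
        # Ties are kept: all sequences at the minimum distance count.
        nearest = [p for d, p in dists if d == dmin]
        total += sum(p == pops[j] for p in nearest) / len(nearest)
    return total / n


if __name__ == "__main__":
    # Made-up example with two fully differentiated localities.
    print(snn(["AAAA", "AAAT", "TTTT", "TTTA"], ["a", "a", "b", "b"]))
```

Values close to 1 indicate strong differentiation between localities, whereas values near the expectation under random assignment of labels indicate no structure.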

The haplotype structure of diversity in A. halleri differed from that found in A. lyrata (Figure 2). In contrast to A. lyrata alleles, A. halleri alleles formed three clades. In particular, one of these clades was closer to A. lyrata than to the other two clades (Table 4). Average levels of pairwise nucleotide differences between clade 3 and the other two clades reached π = 0.04, exceeding species-wide levels of diversity. This level of diversity is comparable to the level of nucleotide divergence observed in the intergenic region between A. lyrata and A. halleri (K per site = 0.037). To verify that CHS is a single-copy gene, we looked at allelic segregation in the progeny of two crosses: AH3xAH4 and AH4xAH22. The RFLP profiles of 10 progeny of each cross indicated that alleles segregated in a manner consistent with the expectations for alleles at a single locus (not shown). No significant deviation from neutrality was detected in the patterns of diversity in our sample (Table 2). A nonsignificant Tajima's D detected no deviation from expectations under the neutral-equilibrium model (D = –0.76 or D = 0.96, both P > 0.2). The coding region of CHS exhibits singular polymorphism features in A. halleri (RAMOS-ONSINS et al. 2004), with a highly significant Fay and Wu's H, presumably indicative of genetic introgression. In the intergenic region, Fay and Wu's H was not significant (H = –7.2 or H = –2.93, minimum P = 0.09), nor was the difference in the polymorphism-to-divergence ratio between the CHS intergenic and coding regions (HKA, χ2 = 2.06 or 2.14, minimum P > 0.1). Association patterns between adjacent sites indicated that branch length in the coalescent tree is compatible with an equilibrium neutral model (Wall's B = 0.667 or 0.622, minimum P > 0.9; WALL 1999). Each natural accession found to be heterozygous harbored alleles from different clades. Thus population subdivision is unlikely to explain this haplotype structure.
Instead, in the two divergent alleles (AH3 and AH4-2), several gene conversion tracts were detected throughout the sequence (BETRAN et al. 1997). Three tracts involved a conversion from A. lyrata into A. halleri and one a conversion from A. thaliana into A. halleri. No similar pattern was found in the analysis of A. lyrata alleles. Introgression of related Arabidopsis species into A. halleri was previously suggested by a multilocus analysis (RAMOS-ONSINS et al. 2004). Therefore, the existence of divergent allelic lineages in A. halleri appeared compatible with genomewide patterns of variation. These alleles appear to have originated from the recombination of existing diversity and are not the result of a distinct evolutionary history along a separate branch of the genealogical tree. Haplotype networks, such as the one presented in Figure 2, allow the representation of recombination events and lineage mixtures and thus illustrate this phenomenon more accurately than a conventional phylogenetic tree.


FIGURE 2.— Synthetic representation of the parsimony haplotype network built for the portion of the intergenic region that is alignable across species in Arabidopsis (see MATERIALS AND METHODS). Numbers indicate the number of intermediate changes along each branch. Thick solid lines indicate branches with larger number of changes that determine major clades. A. lyrata forms a separate clade. Three different haplotype groups are observed in A. halleri. For A. lyrata, haplotypes found in central Europe are indicated in gray and are interspersed in the network with haplotypes found in peripheral populations. The gray line reflects homoplasy between A. halleri and A. lyrata.

 


 
TABLE 4 Average pairwise nucleotide divergence in A. halleri, within and among allelic clades defined by the haplotype network in Figure 2

 
Levels and patterns of polymorphism and divergence at noncoding DNA regions may reflect constraints on functionally important regulatory sites. The 5' part of the intergenic region has substantially diverged across the genus Arabidopsis. The three species aligned poorly in this region (see alignment in supplemental data at http://www.genetics.org/supplemental/). Even within A. halleri, no reliable alignment could be obtained over this region, due to large insertions and deletions. By contrast, we found low levels of diversity in the 3' part of the intergenic region, which contains the core promoter of CHS. We examined polymorphism within the CHS promoter stretches found to be highly conserved throughout the Brassicaceae (KOCH et al. 2001). One and two polymorphisms were found to segregate in A. halleri and A. lyrata, respectively, all within the conserved region around the ACE and MRE regulatory elements. However, these polymorphisms did not affect any of the three elements required for expression of CHS in response to light or fungal elicitors (HARTMANN et al. 1998; LOGEMANN and HAHLBROCK 2002). No segregating polymorphism was found in the three other conserved sequence blocks in the CHS promoter (KOCH et al. 2001). Two fixed differences between A. thaliana and either A. halleri or A. lyrata were also observed in the ACE-MRE conserved block, one of them affecting the ACE element (see alignment in supplemental data at http://www.genetics.org/supplemental/), which is necessary for light-responsive CHS expression. No fixed differences were found between A. lyrata and A. halleri in any of the four phylogenetic footprints found by KOCH et al. (2001) in the Brassicaceae. The low levels of nucleotide divergence and polymorphism in this region did not allow the application of an HKA test to examine whether its evolutionary rate is unusually low.

Expression diversity:
In F1 individuals obtained from intra- as well as interspecific crosses, parental CHS cis-regulatory regions are in perfect linkage with parental coding regions and experience the same trans-regulatory background. Thus, the relative amount of parental CHS mRNA reflects the relative activity of parental cis-regulatory regions (COWLES et al. 2002). This approach allows us to evaluate the amount of cis-regulatory variation. In our assay, we used DNA from F1 individuals to experimentally model the null hypothesis of no difference in activity between parental cis-regulatory alleles (DE MEAUX et al. 2005). Indeed, heterozygous DNA contains equal amounts of parental alleles. A total of 461 and 421 F1 progeny were obtained from intraspecific crosses between multiple genotypes in A. lyrata and A. halleri, respectively (summarized in supplemental Table 1 at http://www.genetics.org/supplemental/). Plants were subjected to four different CHS induction treatments (48 hr in the dark, 8 hr in the light, herbivory by P. xylostella, and insect-free control; see MATERIALS AND METHODS). Leaf tissue was subsequently harvested from these plants to examine CHS cis-regulatory variation. Relative allelic amounts were determined in a total of 709 and 597 samples (summarized in supplemental Table 2 at http://www.genetics.org/supplemental/). Along with this, a total of 81 and 25 individuals were obtained from crosses between A. thaliana and either A. lyrata or A. halleri, respectively, which yielded a total of 361 measurements of relative allelic amounts (supplemental Table 3 at http://www.genetics.org/supplemental/). CHS expression was also examined in floral tissue in the thaliana–lyrata hybrids.
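The logic of using heterozygous DNA as the empirical null can be illustrated with a small sketch. Note that the analyses reported here rely on GLM models and Tukey's HSD test; the exact permutation test below is a simpler stand-in chosen only to make the comparison concrete, and the function name `exact_permutation_p` and all numerical values are hypothetical. Because every split of the pooled measurements is enumerated, the sketch is practical only for small samples.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(cdna, dna):
    """Two-sided exact permutation test on the difference in means.

    cdna: allelic proportions measured in cDNA of one treatment.
    dna:  allelic proportions measured in heterozygous DNA (the null
          reference, in which both alleles are equally represented).
    """
    pooled = list(cdna) + list(dna)
    n_cdna = len(cdna)
    observed = abs(mean(cdna) - mean(dna))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    # Enumerate every way of relabeling n_cdna measurements as "cDNA".
    for chosen in combinations(indices, n_cdna):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance guards against floating-point ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


if __name__ == "__main__":
    # Invented measurements: one allele overrepresented in cDNA.
    cdna = [0.70, 0.72, 0.68, 0.71, 0.69, 0.73]
    dna = [0.49, 0.50, 0.51, 0.50, 0.50, 0.49]
    print(exact_permutation_p(cdna, dna))
```

A small P-value in such a comparison would suggest that the allelic ratio in mRNA departs from the ratio in DNA, which, under the reasoning above, points to a difference in cis-regulatory activity between the two parental alleles.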

Expression diversity in A. lyrata:
Using a GLM model, we investigated the effect of treatments (i.e., DNA and cDNA pools) and genotypes (progeny of a parental combination; see MATERIALS AND METHODS).

Three SNP assays were used to evaluate cis-regulatory diversity in A. lyrata. For each SNP assay, a significant effect of genotype and CHS induction treatments was detected (P ≤ 0.031, Table 5). Likewise, the interactions between genotype and induction treatments were always highly significant (P < 0.001, Table 5). This reveals that relative allelic expression varies across the CHS expression environments in a way that depends on the genotype of the progeny.



 
TABLE 5 Global GLM analysis conducted separately for each SNP assay in A. lyrata

 
We subsequently conducted a separate analysis of variance for each genotype (supplemental Table 5 at http://www.genetics.org/supplemental/). If the treatment effect was significant, we further performed a Tukey's post-hoc multiple mean comparison test to identify which CHS expression environment yielded differences in the relative ratios of parental CHS mRNA (Figure 3). All but two pairs of alleles exhibited significant functional differences in at least one CHS expression environment with respect to equal expression of parental alleles (Figure 3).
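As a simplified illustration of the F-value reported for the treatment effect, the sketch below computes a one-way analysis-of-variance F statistic across treatment groups. It is not the model used in this study, which also includes plate, trial, and covariate terms; the function name `one_way_f` and the example values are hypothetical, and no P-value is computed.

```python
from statistics import mean


def one_way_f(groups):
    """One-way ANOVA F statistic and its degrees of freedom.

    groups: list of lists, one list of measurements per treatment
            (for example dark, light, insect, control, and DNA).
    Returns (F, df_between, df_within).
    """
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n = len(all_values)
    grand = mean(all_values)
    # Variation of treatment means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of measurements around their own treatment mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


if __name__ == "__main__":
    # Invented data for three treatments with three measurements each.
    print(one_way_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]))
```

A large F indicates that allelic proportions differ among treatments by more than expected from the variation within treatments; a post-hoc test such as Tukey's is then needed to identify which treatments differ.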


FIGURE 3.— Box plots reporting the relative CHS cis-regulatory activity in A. lyrata F1 individuals from 10 parental combinations in response to dark, light, and insect feeding (with corresponding control). For each cross, the y-axis of the box plot indicates the relative expression level. In a box, the center horizontal line marks the median of the sample. The length of the box shows the range within which the central 50% of the values fall, with the box edges at the first and third quartiles. The whiskers show the range of observed values that fall within 1.5 times the midrange (or length of the box). The horizontal gray line indicates the expected value for equal promoter activity of both parental cis-regulatory regions in individuals of the progeny, as measured by relative allele abundance in DNA samples of the heterozygous individuals. For each genotype, an analysis of variance was conducted (see MATERIALS AND METHODS). Indicated here is the F-value of the treatment effect for each parental combination. Letters within the box plots indicate the result of the post-hoc multiple mean comparison (Tukey's test). The absence of a letter in common indicates significant differences in LS means. An "(a)" indicates samples that were analyzed with two independent SNP assays.

 
If parental genotypes used for the crosses are heterozygous, different combinations of intergenic alleles will segregate in the progeny. In particular, two singletons in the AL22 intergenic region and cDNA were found to segregate in all F1 progeny of AL22, demonstrating that this individual is heterozygous. We genotyped each allele in the AL22xAL7 progeny and performed a GLM analysis with and without allele effect and allele x treatment interaction. An analysis not taking into account the allelic combination found no significant difference in expression between AL22 and AL7 alleles (F4,98 = 1.16, P = 0.143) and accounted poorly for variation (R2 = 0.166). Instead, the analysis incorporating allele effect accounted for a much greater part of variation (R2 = 0.730) and found significant effects of treatment (F4,87 = 5.48, P = 0.001), allele (F1,87 = 86.68, P <0.001), and allele x treatment interaction (F4,87 = 17.71, P < 0.001). Post-hoc multiple mean comparison tests revealed that the identified AL22 allele was not significantly different from the AL7 allele, whereas the unknown AL22 allele differed markedly from the AL7 allele (Figure 4; see also supplemental Tables 4 and 5 at http://www.genetics.org/supplemental/). In individuals carrying the unknown AL22 allele, AL22 mRNA was overrepresented in the dark, as well as in both insect-damaged and control leaves but not in light-exposed leaves. We also genotyped the AL52 alleles in the AL52xAL41 progeny and found significant treatment and allele effects (F4,56 = 8.469, P < 0.001 and F1,56 = 6.813, P = 0.012, respectively; see also Figure 4). Post-hoc tests indicated that only the AL52-1 intergenic allele is significantly different from the AL41 allele, while the AL52-2 is not. However, the interaction between allele and treatment effect was not significant (F4,56 = 1.891, P = 0.125), indicating that the difference is tenuous (Figure 4; supplemental Tables 4 and 5 at http://www.genetics.org/supplemental/). 
This analysis demonstrates that distinct cis-regulatory alleles can segregate within populations. Such an analysis is possible only if a second allele can be differentiated and if the progeny are large enough for an allele effect to be incorporated (for example, the AL22xAL10 progeny were too small for a similar analysis to be performed). In the AL22xAL12, AL22xAL41, and AL22xAL3 progeny, only individuals carrying the known AL22 alleles were analyzed (see MATERIALS AND METHODS) and we could not identify any second allele in AL41, AL7, AL12, or AL3. The results described above indicate that a statistical analysis that examines parental allelic combinations in bulk tends to mask some of the cis-regulatory differences existing between parents. It remains possible that the larger variance observed for some measurements results from unknown allelic combinations segregating in the progeny.


FIGURE 4.— Cis-regulatory variation in progeny in which different combinations of parental alleles are segregating. Box plots report the activity of one parental cis-regulatory allele relative to each of the two cis-regulatory alleles of the other parent in four different CHS expression environments. (a) Activity of each AL52 allele relative to the cis-regulatory allele of AL41. Only the AL52-1 allele is functionally different from the AL41 allele. (b) Activity of each AL22 allele relative to the cis-regulatory allele of AL7. Only one AL22 allele is functionally different from the AL7 allele. (c) Activity of each AH4 allele relative to the AH22-1 cis-regulatory allele. Only the AH4-2 allele is functionally different from the AH22-1 allele. F-value and associated P-values of the interaction between CHS expression environment (treatment) and allele are indicated below each box plot. Letters indicate significantly different pairs of treatment x allele means.

 
A large insect-specific difference in CHS cis-regulation was detected in the cross between genotypes AL22 and AL10 (Figure 3). This difference was confirmed by two independent SNP assays and resulted from a relative decrease in AL10 cis-regulatory activity that was also apparent in a few plants obtained from a cross between genotypes AL10 and AL52 (not shown). The unknown AL22 allele was more active in both insect challenged and control leaves than the known AL22 allele. This may explain the larger variance observed in the AL22xAL10 progeny.

The analysis of the allelic differences in CHS expression in plants maintained 48 hr in the dark yielded three to four functional groups of cis-regulatory alleles (see Figure 3). The detailed analysis of the pairwise comparison of cis-regulatory activity indicates the following relationship in cis-regulatory activity: AL52-1 ≥ AL41 = AL52-2 >> AL22/AL7/AL3/AL10 >> AL12. The unknown AL22 intergenic allele could form an additional class but it is not known whether it is different from AL41. All cis-regulatory alleles showed equal activity after 8 hr of exposure to strong light as indicated by a nonsignificant difference between light-exposed-leaf cDNA samples and DNA samples for any of the genotypes. Thus these four functional groups respond differently to the onset of light, as they compensate in various degrees for the variable level of CHS expression in the dark.

Large-indel differences alone did not explain cis-regulatory differences in A. lyrata. For example, the known AL22 allele and AL3 had different large-indel content but no functional difference, whereas AL41 and AL7 were functionally different despite an identical large-indel content (Figures 2 and 3).

The average allelic proportion measured in our assay provided a rough estimate of the maximum fold difference in mRNA levels driven by cis-regulatory variation in each CHS expression environment (Table 6). In A. lyrata, maximums of 3.1- and 2.5-fold differences were observed in leaves maintained in the dark or challenged by herbivory, respectively.



 
TABLE 6 Cis-regulatory fold difference observed in A. lyrata and A. halleri

 
Expression diversity in A. halleri:
Three SNP assays were used to look at cis-regulatory variation in A. halleri. No trial effect was detected (see MATERIALS AND METHODS). Significant genotype and treatment effects were detected for two of three assays (P < 0.001, Table 7). For the third SNP assay (SNPM6), only the treatment effect was found to be significant (P < 0.001, Table 7) but a marginally significant interaction between genotype and treatment was found (P = 0.048).



 
TABLE 7 Global GLM analysis conducted separately for each SNP assay in A. halleri

 
We further conducted a separate GLM analysis for each genotype to identify statistically differentiated responses among genotypes (supplemental Table 6 at http://www.genetics.org/supplemental/). Figure 5 reports the pairwise comparison of cis-regulatory activity for each parental combination as well as the result of the post-hoc multiple mean comparison tests. Interestingly, the AH3 cis-regulatory allele was significantly less active than the alleles of AH5, AH12, or AH22 in all conditions although no clearly significant cis-regulatory differences were found in the progeny of AH3xAH4 (Figure 5; supplemental Table 6 at http://www.genetics.org/supplemental/). The AH3 intergenic alleles, as well as the AH4-2 allele, belong to clade 3. This clade is highly divergent from the other alleles segregating in A. halleri, in particular from the AH4-1 allele also carried by the AH4 individual (Figure 2). We genotyped individuals in the AH4xAH22 progeny for the AH4 allele that they inherited (note that the individuals that we analyzed in these progeny all harbored the same AH22-1 CHS intergenic region; see MATERIALS AND METHODS). A significant effect of the AH4 allele on expression was found as well as a significant interaction between treatment and the AH4 allele (F1,97 = 84.04, P < 0.001, F1,97 = 16.28, P < 0.001; supplemental Table 5 at http://www.genetics.org/supplemental/). Only F1 individuals harboring the promoter allele combination AH4-2 and AH22-1 showed significantly higher expression of the AH22-1 allele (Figure 4). The AH4xAH3 progeny was also genotyped. No significant effect of the AH4 allele was detected on expression data. However, restricted sample size limited our ability to detect a significant difference between the AH4-1/AH3 and AH4-2/AH3 promoter allele combinations. No significant cis-regulatory difference was observed between AH22-1 and AH5 (Figure 5).


FIGURE 5.— Box plots reporting the relative CHS cis-regulatory activity in A. halleri F1 individuals from eight parental combinations in response to dark, light, and insect feeding (with corresponding control). The horizontal gray line indicates the expected value for equal promoter activity of both parental cis-regulatory regions in the progeny. Indicated here is the F-value of the treatment effect for each progeny. Letters within the box plots indicate the result of post-hoc multiple mean comparison (Tukey's test; see MATERIALS AND METHODS). The absence of a letter in common indicates significant differences in LS means. An "(a)" indicates samples that were analyzed with two independent SNP assays.

 
Altogether, our comparative study of six alleles of the CHS intergenic region uncovered three functional groups. AH3 and AH4-2 constituted a set of alleles that are divergent at both nucleotide and functional levels. CHS cis-regulation in AH12 showed moderate but significant differences from AH5. No cis-regulatory difference was detected among individuals harboring CHS intergenic alleles AH22-1, AH4-1, and AH5. In our study, cis-regulatory diversity in A. halleri controlled at most a fourfold difference in CHS mRNA level, as observed in dark-maintained leaves of the progeny of AH12xAH3 (Table 6). High levels of nucleotide divergence in the intergenic region appeared to explain a large part of, but not all, cis-regulatory variation segregating in A. halleri.

Expression differences in interspecific hybrids:
To evaluate the functional cis-regulatory divergence among Arabidopsis species, we crossed A. thaliana genotypes with both A. lyrata and A. halleri. Hybrid individuals have a haploid copy of each parental genome (i.e., 13 chromosomes) and are sterile. They are morphologically similar to their non-A. thaliana parent. In total, five CHS expression environments were assessed (48 hr dark, 8 hr light, 24 hr insect feeding and respective control, expression in flowers after 48 hr in the dark). Altogether, 245 and 116 relative allelic measurements were performed for A. thaliana–A. lyrata and A. thaliana–A. halleri F1 progeny, respectively (summarized in supplemental Table 3 at http://www.genetics.org/supplemental/).

Cis-regulatory differences between A. thaliana and A. lyrata:
In the A. thaliana–A. lyrata F1 progeny, our assay did not detect CHS expression in either dark-maintained leaves or control non-insect-challenged leaves. Detection of CHS expression in the A. thaliana–A. halleri progeny with the same SNP assay (see below) suggested differences in transcription factor expression between hybrid types. The GLM analysis was conducted on a data set that included hybrid DNA samples and mRNA samples collected from three CHS expression environments (flowers, leaves after light exposure, and insect-damaged leaves; Table 8). The analysis examined the following sources of variation: CHS expression environment, parental genotype, SNP assay, and interactions of SNP x treatment and SNP x parental genotype, as well as a technical covariate (see MATERIALS AND METHODS).



 
TABLE 8 Cis-regulatory variation of CHS expression

 
A significant treatment effect was detected (F3,218 = 173.393, P < 0.001). Post-hoc multiple mean comparison tests indicated that this effect resulted mostly from the relative overexpression of the CHS mRNA of A. thaliana in insect-challenged leaves and, to a lesser degree, from slight overexpression of the A. thaliana mRNA in flowers (see Figure 6; supplemental Figure 1 at http://www.genetics.org/supplemental/). The first response is most likely insect specific because our assay failed to detect CHS expression in most control leaves but not in insect-challenged leaves. In addition, in the few control leaf samples where CHS expression could be detected, no skew toward one or the other parental CHS mRNA was apparent. Fold-change estimates indicated that the A. thaliana CHS mRNA allele was four times more induced by insect feeding than its ortholog in A. lyrata (Table 9).


FIGURE 6.— Box plots reporting the relative CHS cis-regulatory activity in F1 interspecific hybrids in response to different CHS expression environments. The y-axis of the box plots indicates the relative mRNA level of each parental species. The horizontal gray line indicates the expected value for equal promoter activity of both parental cis-regulatory regions in the progeny. Letters within the box plots indicate the result of post-hoc multiple mean comparison (Tukey's test; see MATERIALS AND METHODS). The absence of a letter in common indicates significant differences in LS means. Because in most samples CHS expression was not detectable, the CHS expression data for control non-insect-damaged leaves were excluded from the data analysis in A. thaliana x A. lyrata. An "(a)" indicates samples that were analyzed with two independent SNP assays.

 


 
TABLE 9 Cis-regulatory fold change observed in A. thaliana x A. lyrata and A. thaliana x A. halleri progenies

 
Cis-regulatory differences between A. thaliana and A. halleri:
Species-specific levels of CHS expression were also investigated in A. thaliana–A. halleri diploid hybrids. The GLM incorporated variation attributable to the CHS expression environment, SNP assay, interaction between SNP assay and treatment, and a technical covariate (see MATERIALS AND METHODS). Because of the limited sample size, CHS expression was not studied in all environments for some genotypes. Therefore, the effect of the parental genotypes was not incorporated into the analysis. The GLM analysis detected a significant treatment effect (F4,105 = 23.125, P < 0.001; Table 8). Post-hoc tests indicated a significant difference between insect-damaged leaves and all other treatments, including measurements made in hybrid DNA and control leaves (Figure 6; supplemental Figure 1 at http://www.genetics.org/supplemental/). Thus, the A. thaliana CHS gene was more induced by insect feeding than its ortholog in A. halleri. In addition, in the dark as well as in control non-insect-damaged leaves (which were collected in the early morning), the A. halleri CHS gene appeared to be more highly expressed (i.e., presumably less repressed) than its ortholog in A. thaliana. The A. thaliana CHS transcript was twice as abundant as its A. halleri ortholog in insect-damaged leaves, whereas in the dark the A. halleri CHS transcript was three times more abundant than its A. thaliana ortholog (Table 9).
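As an illustration of how an allele-specific fold change of the kind reported in Table 9 can be read, the following Python sketch converts one allele's share of the transcript signal into an allelic ratio and normalizes it by the ratio measured in hybrid genomic DNA, where both alleles are present in equal copy number. The function names and input fractions are hypothetical, and this is not necessarily the estimator used in this study (see MATERIALS AND METHODS).

```python
def allelic_ratio(frac_a):
    """Convert allele A's share of the signal (strictly between 0 and 1)
    into an A:B ratio."""
    if not 0.0 < frac_a < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return frac_a / (1.0 - frac_a)


def cis_fold_change(frac_cdna, frac_gdna=0.5):
    """Allelic ratio in cDNA divided by the ratio in hybrid genomic DNA.

    frac_gdna is the share measured for the same allele in genomic DNA of
    the hybrid; 0.5 (equal copy number) is assumed when no control exists.
    """
    return allelic_ratio(frac_cdna) / allelic_ratio(frac_gdna)


# Hypothetical measurements, not data from this study.
induced = cis_fold_change(0.80)   # allele A makes up 80% of the transcripts
dark = cis_fold_change(0.25)      # allele A makes up 25% of the transcripts
print(round(induced, 2), round(1.0 / dark, 2))
```

A value above 1 means that allele A is relatively overexpressed; the reciprocal of a value below 1 gives the fold excess of the other allele.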

Absence of large maternal effect and methylation:
In each of the two species A. lyrata and A. halleri, four of nine crosses yielded progeny from both directions of the cross (supplemental Table 1 at http://www.genetics.org/supplemental/). In only two instances (two A. halleri progeny) was there any suggestion of a difference due to the direction of the cross (supplemental Table 6 at http://www.genetics.org/supplemental/; P = 0.047 for AH22xAH4 and P = 0.032 for AH12xSie).

Additionally, studies of newly formed allopolyploids suggest that interspecific hybrids may experience dramatic expression changes due to methylation of one or both parental copies (ADAMS et al. 2004; WANG et al. 2004). The interspecific hybrids obtained for this study were not polyploid. It seemed sensible, however, to evaluate the potential impact of methylation on the observed variation. We extracted DNA from leaves of the hybrids and from some of their parental genotypes. Levels of methylation were assessed at three potentially methylated CpG sites in the core promoter. No methylation could be detected at these sites in either the parents or the hybrid progeny. This suggests that bringing two distinct haploid genomes together in these interspecific hybrids did not dramatically alter methylation of the CHS intergenic region.

No simple candidate mutation to explain functional variation:
In A. thaliana, a light-responsive box was found to be polymorphic and to correlate with cis-regulatory differences in dark-maintained and light-exposed leaves (DE MEAUX et al. 2005). In both A. halleri and A. lyrata, this box is conserved. Thus, the cause of the differential cis-regulatory activity in the dark must lie elsewhere. Association between polymorphisms and functional cis-regulatory differences has been successful in A. thaliana, where levels of nucleotide diversity are low. In A. lyrata and A. halleri, alleles instead differ on average at >13 positions (π > 0.01 in both species examined here). Therefore, the observed functional diversity, either within or between species, could not be traced to any single polymorphic sequence feature. Likewise, it was not possible to determine whether nucleotide differences in the ACE–MRE conserved regulatory element have functional consequences for CHS cis-regulation in A. thaliana vs. A. lyrata or A. halleri. We did not identify a candidate polymorphic motif to explain the functional variation found within and between these species. It is interesting, however, to note that a W-box was lost through introgression of a sequence fragment from A. lyrata into the intergenic alleles AH4-2 and AH3. W-boxes are bound by WRKY transcription factors, which are involved in different types of stress and developmental responses (EULGEM et al. 2000). Whether this element is directly involved in the weaker cis-regulatory activity of the AH4-2 and AH3 alleles remains to be tested experimentally.
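For readers less familiar with the statistic, nucleotide diversity (π) is the average number of differences between two sequences drawn from the sample, divided by the number of aligned sites. The Python sketch below computes both quantities for a toy alignment. The sequences are invented for illustration, the function name is hypothetical, and the handling of gaps and missing data that dedicated population genetics software provides is deliberately left out.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Return (mean pairwise differences, pi per site) for aligned sequences.

    All sequences must have the same length; every column is counted as a
    site, so alignment gaps are treated like any other character.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    n_sites = len(seqs[0])
    if n_sites == 0 or any(len(s) != n_sites for s in seqs):
        raise ValueError("sequences must be aligned and non-empty")
    pairs = list(combinations(seqs, 2))
    # Count mismatching columns for every pair of sequences.
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    mean_diffs = total / len(pairs)
    return mean_diffs, mean_diffs / n_sites


# Toy alignment of three invented 10-bp alleles (not data from this study).
alleles = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
mean_diffs, pi = nucleotide_diversity(alleles)
print(round(mean_diffs, 3), round(pi, 3))
```

With many segregating differences per pair of alleles, as described above for A. lyrata and A. halleri, any single polymorphism is confounded with numerous others, which is why a simple association with functional differences is difficult.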


    DISCUSSION
 
Factors governing the diversity and evolution of cis-regulatory DNA are poorly understood. Here, we have characterized the standing variation of cis-regulation at the CHS locus at both nucleotide and functional levels in Arabidopsis. We show that large cis-regulatory differences segregate within species, both within and among populations. We further show that CHS cis-regulation has changed considerably among species, with alteration of the response to specific cues, which may be of ecological relevance. Our study reveals that CHS cis-regulation evolves in a modular fashion. In addition, we show that the patterns of nucleotide variation in the intergenic region upstream from CHS are complex and variable among species, yet they reveal no significant departure from neutrality. Interestingly, our study also documents some consequences of interspecific gene flow on cis-regulatory variation in A. halleri.

Modular cis-regulatory variation in Arabidopsis:
To evaluate functional cis-regulatory variation at the species level, we performed crosses between parental genotypes sampled in different locations throughout the native range of A. lyrata and A. halleri. By means of these crosses, we compared expression of different alleles within the same cells, and thus in the same trans-regulatory background. Because cis-regulatory and coding regions of each parent are linked, differences in the relative amount of allelic mRNA directly reflect allelic cis-regulatory differences. Using this same approach, we also evaluated the functional divergence of CHS cis-regulation among species in the