Microarray analysis of gene expression patterns in immature ear, seedling, and embryo tissues from the maize inbred lines B73 and Mo17 identified numerous genes with variable expression. Some genes had detectable expression in only one of the two inbreds; most of these genes were detected in the genomic DNA of both inbreds, indicating that the expression differences are likely caused by differential regulation rather than by differences in gene content. Gene expression was also monitored in the reciprocal F1 hybrids B73 × Mo17 and Mo17 × B73. The reciprocal F1 hybrid lines did not display parental effects on gene expression levels. Approximately 80% of the differentially expressed genes displayed additive expression patterns in the hybrids relative to the inbred parents. The ∼20% of genes that display nonadditive expression patterns tend to be expressed at levels within the parental range, with minimal evidence for novel expression levels greater than the high parent or less than the low parent. Analysis of allele-specific expression patterns in the hybrid suggested that intraspecific variation in gene expression levels is largely attributable to cis-regulatory variation in maize. Collectively, our data suggest that allelic cis-regulatory variation between B73 and Mo17 dictates maintenance of inbred allelic expression levels in the F1 hybrid, resulting in additive expression patterns.
MAIZE demonstrates a large amount of intraspecific sequence variation (Tenaillon et al. 2001; Vroh Bi et al. 2005) in the form of single nucleotide and insertion/deletion polymorphisms. The sequencing of allelic regions in multiple maize inbreds has identified numerous examples of nonhomologies in repetitive sequences and Helitron-transposed exon gene fragments (Fu and Dooner 2002; Song and Messing 2003; Brunner et al. 2005; Lai et al. 2005; Morgante et al. 2005). Additionally, recent microarray studies have identified transcriptional differences between different maize lines (Ma et al. 2006; Swanson-Wagner et al. 2006). This substantial intraspecific diversity makes maize an interesting system for studying the sources and effects of transcriptional diversity.
Intraspecific allelic variation is often attributed to qualitative changes that affect the nature of the gene products and quantitative changes that alter the amount of the gene product produced. Quantitative changes in gene expression may be the result of cis- or trans-variations in gene regulation (Wittkopp et al. 2004). Cis-regulation is due to variation that is genetically linked to the locus with differential expression. Alternatively, trans-regulation is due to variation at unlinked loci affecting the level of gene expression. The relative prevalence of cis- and trans-regulatory variation has been monitored using both expression quantitative trait loci (eQTL) and allele-specific expression (ASE) methodologies (Monks et al. 2004; Morley et al. 2004; Cheung et al. 2005; Doss et al. 2005; Pastinen et al. 2005; Stranger et al. 2005). These studies have documented a strong contribution of cis-regulatory variation toward explaining intraspecific variation in mammalian gene expression.
These approaches have also been used to study the contribution of cis-regulatory variation to quantitative variation in maize. An eQTL approach found that 80% of the maize eQTL with an LOD score >7.0 mapped to the physical location of the differentially expressed gene (Schadt et al. 2003). The use of allele-specific expression analysis found that 11 of 15 genes in hybrid maize displayed evidence for allelic variation in gene expression levels (Guo et al. 2004). These two studies suggest that a large proportion of the intraspecific variation in gene expression levels in maize can be attributed to cis-regulatory variation.
One current challenge is to relate the observed modes of allelic regulation with overall gene expression differences observed in maize inbreds and hybrids. Birchler et al. (2003) discussed two extreme molecular models of gene expression in an F1 hybrid, which are not mutually exclusive on a transcriptome-wide level. One model posits that hybrids exhibit additive expression levels, in which the hybrid expression matches the midparent level of the inbreds. The alternative model proposes that hybrids may experience interactions between alleles that alter regulatory networks and result in a novel type of gene action that is different from either parent or additive values. Nonadditive gene expression occurs whenever the expression level in the hybrid deviates from the predicted midparent value and can include instances of gene expression within the range of the parents (similar to high-parent or low-parent expression) or outside the range of the parental values (over high parent or under low parent). Previous studies in maize have provided evidence for nonadditive gene expression in hybrid maize relative to inbred maize (Song and Messing 2003; Auger et al. 2005).
Microarray-based analyses can be useful as a screening tool for comparing the gene expression of inbred and hybrid lines, revealing the extent of quantitative changes and the mode of gene action in the hybrid. Specifically, it is possible to determine the amount of additive gene expression relative to nonadditive expression in a hybrid. In relating the concepts of additivity or nonadditivity to molecular mechanisms it is important to separate nonadditive genes into those that fall within high- and low-parent levels (within the parental range) as opposed to over high-parent and under low-parent levels (hybrid expression outside the parental range).
Microarrays have been used to monitor the mode of gene action in Arabidopsis hybrids (Vuylsteke et al. 2005) and both intraspecific and interspecific Drosophila hybrids (Gibson et al. 2004; Ranz et al. 2004). A significant portion (5–10%) of genes were found to display over high-parent or under low-parent expression levels in hybrids of Drosophila and Arabidopsis. Clustering analyses revealed that in both Drosophila and Arabidopsis the two inbred parents were more similar to each other in expression profile than to the hybrid (Gibson et al. 2004; Vuylsteke et al. 2005). It should be noted that microarray-based approaches are quite good at documenting the prevalence of strongly nonadditive expression but may have limited resolution for the discrimination of truly additive expression patterns from slightly nonadditive expression patterns. By monitoring the expression of genes in the genetically identical reciprocal hybrids it was found that numerous genes exhibit parental effects on expression (Gibson et al. 2004; Vuylsteke et al. 2005).
There is an impressive range of genomic and regulatory variation observed between maize inbred lines. However, the effect of these variations on hybrid gene expression remains largely unknown. Toward this end, this study attempts to better define the variations in gene expression between maize inbreds, how this variation affects gene expression levels in the F1 hybrid, and the relative contribution of cis- and trans-variation.
MATERIALS AND METHODS
Oligonucleotide microarrays designed for detection of maize genes were purchased from Affymetrix (Santa Clara, CA). This microarray was designed using EST contig sequences. There are 15 probes for each represented gene on the microarray. The EST sequences are derived from multiple inbred lines and contain sequence polymorphisms. All polymorphic positions were masked such that when possible, the probe sets will be robust for multiple genotypes. The maize Affymetrix array contains 17,622 probe sets that are designed to detect the expression of 13,495 genes. Some genes are represented by multiple probe sets designed to detect sense and antisense expression or the expression of alternative transcripts. The Affymetrix array likely represents approximately one-third of the maize genes and is likely to have a partial bias toward highly expressed genes. However, we also noted low expression states for many genes and a consistent lack of expression for almost one-quarter of the genes on the array, indicating that genes with low transcription are represented on the array.
Plant growth and tissue collection:
Eleven-day-old seedlings:
Seeds for inbreds Mo17 and B73 and hybrids Mo17 × B73 and B73 × Mo17 were sown and grown under greenhouse conditions for 11 days and then sampled for gene expression analysis. Three biological replicates were planted and grown sequentially over an ∼5-week period during May 2005. Seeds were planted such that one seed of each genotype was present in every pot. Twelve pots were selected for tissue collection; thus 12 plants were sampled per genotype. All plants were sampled between 9 am and 10 am and plants were cut immediately above the highest root; thus all aboveground tissues and meristems were sampled. The sampled tissues were flash frozen in liquid nitrogen and stored at −80° prior to RNA isolation.
Immature ears from inbreds Mo17 and B73 and hybrids Mo17 × B73 and B73 × Mo17 were collected from field-grown maize plants during July 2005 for gene expression analysis. The maize plants were grown on the St. Paul campus Agricultural Experiment Station. Ears were collected for each genotype between 9 am and 11 am on each of three consecutive days from various positions in the field. Ears ranging from 3.7 to 5.4 mm in length were sampled. Ears were flash frozen in liquid nitrogen following the manual removal of husks and silks.
Embryos were collected from kernels of inbreds Mo17 and B73 and hybrids Mo17 × B73 and B73 × Mo17 for gene expression analysis. Samples were collected during August 2005 from field-grown maize plants grown on the St. Paul campus Agricultural Experiment Station. Inbred and hybrid crosses were performed at three different time intervals within the same week. Embryos were collected at 19 days after pollination (DAP) for the three bioreplicates. All tissues were flash frozen in a dry ice-cooled ethanol bath and subsequently stored at −80°. We collected the plant materials for each biological replicate on the same date between 9 am and 10 am. The three biological replicates represent collections performed on three different dates.
For seedling RNA isolation, tissues from 12 seedlings/genotype/biological replicate were pooled and ground in liquid nitrogen. For immature ear RNA isolation, tissues from 6 ears/genotype/biological replicate were pooled and ground in liquid nitrogen. RNAs were extracted using Trizol reagent according to the manufacturer's instructions (Invitrogen, Carlsbad, CA). All RNAs were subsequently subjected to DNAse treatment and phenol:chloroform extraction. RNAs were precipitated with 0.1× volume sodium acetate (pH 5.5) and 2.5× volume ethanol. Resuspended RNAs were purified further using the RNeasy system, according to the manufacturer's instructions (QIAGEN, Valencia, CA). For embryo RNA isolation, five embryos from six different 19 DAP ears were pooled and ground in liquid nitrogen. Embryo RNAs were isolated using the plant RNeasy kit, according to the manufacturer's instructions (QIAGEN). All purified RNA samples were quantified and checked for quality using the Nanodrop spectrophotometer (Nanodrop Technologies, Montchanin, DE) and agarose gel electrophoresis.
Microarray hybridizations and statistical analysis:
Affymetrix microarray hybridizations were performed for the seedling, immature ear, and 19 DAP embryo RNAs. Hybridizations for three biological replicates/tissue type were performed for each of four genotypes: inbreds Mo17 and B73 and hybrids Mo17 × B73 and B73 × Mo17. A total of 8 μg of total RNA was labeled for each hybridization according to the manufacturer's instructions (Affymetrix). Hybridization chemistries were performed at the University of Minnesota Microarray Facility.
The signal data from the microarrays were processed using two different methods, MAS5.0 and GC-RMA. The GCOS software package v1.2 (Affymetrix) was used to produce a MAS5.0 signal and presence–absence calls. Normalization for the MAS5.0 signal was performed with a user-defined value of 1 and a scaling factor of 500 using the default GCOS parameters. These signals were then imported into GeneSpring (Agilent Technologies, Palo Alto, CA) and a per-chip normalization to the 50th percentile and a per-gene normalization to median were performed. GC-RMA processing of the .cel files was performed using GeneSpring software. The GC-RMA processing performs a normalization between the arrays and a subsequent per-gene normalization was applied to the resulting values (Wu et al. 2004). Differentially expressed genes were identified by performing a one-way ANOVA on the GC-RMA or MAS5.0 values using a parametric test with no assumption of equal variance. A Benjamini and Hochberg multiple testing correction was applied using three different false-discovery rate significance thresholds: 0.05, 0.10, or 0.25, such that, respectively, 5, 10, or 25% of the genes identified in a test are expected to be falsely identified. The statistical classification of the expression state of the differentially expressed genes in the hybrid was performed by comparing the hybrid expression level with each of the two parents separately. This analysis allowed for classification of genes as “over high parent,” “high parent,” “between parent,” “low parent,” or “under low parent.” The classification analysis was performed using GeneSpring software to perform an ANOVA with a Benjamini and Hochberg FDR of 0.05. Genes were judged to be between parent if they were significantly different from both parental lines and had a raw expression value between the two parental lines.
A second analysis was done to test for nonadditive expression. In this analysis the inbred midparent value was calculated for each of the three biological replicates and compared with the average hybrid expression for each of the three biological replicates. A two-tailed homoscedastic t-test was performed and all genes with P < 0.05 were considered to be nonadditively expressed. The GeneSpring software was used to perform a hierarchical clustering analysis using a Pearson correlation method to create gene or condition trees on the basis of specified gene lists, conditions, and genotypes.
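The midparent comparison can be illustrated with a minimal sketch. It assumes that replicates are paired by index, that the two reciprocal hybrids are averaged within each replicate, and that significance is judged against the tabulated two-tailed critical value of Student's t for 4 degrees of freedom (2.776 at P = 0.05). The function names are hypothetical and this is not the GeneSpring procedure itself.

```python
import math
from statistics import mean, variance

# Two-tailed critical value of Student's t for alpha = 0.05 and
# df = 3 + 3 - 2 = 4 (standard table value).
T_CRIT_DF4 = 2.776


def pooled_t(a, b):
    """Equal-variance (homoscedastic) two-sample t statistic."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    diff = mean(a) - mean(b)
    if sp2 == 0:
        # Degenerate case: no within-group variation at all.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(sp2 * (1 / na + 1 / nb))


def is_nonadditive(b73, mo17, hyb_bxm, hyb_mxb, t_crit=T_CRIT_DF4):
    """Compare per-replicate midparent values with per-replicate hybrid means.

    Each argument is a list of expression values, one per biological replicate.
    Pairing replicates by position is an assumption of this sketch.
    """
    midparent = [(p1 + p2) / 2 for p1, p2 in zip(b73, mo17)]
    hybrid = [(h1 + h2) / 2 for h1, h2 in zip(hyb_bxm, hyb_mxb)]
    return abs(pooled_t(hybrid, midparent)) > t_crit
```

With three replicates per group the test has only 4 degrees of freedom, so modest deviations from the midparent value will often go undetected.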
To test for the possibility of polymorphisms that resulted in probe-specific effects and false discovery of differential expression, individual probe-level testing was performed. The individual probe signals were extracted and a per-chip normalization was applied. For each differentially expressed gene, the difference between each of the 15 perfect match–mismatch probe signals was determined for each of the three biological replicates and used to perform an independent sample comparison of means assuming normality with a cutoff of P = 0.05. The number of probe pairs that pass this test for each gene was determined.
Present–absent gene analysis:
Genes that were called present in only one of the two inbred genotypes were identified on the basis of the MAS5.0 presence–absence calls. BLAST analyses were performed using these sequences to query the NCBI GSS sequences derived from maize. Primers were designed using Primer3 software (Rozen and Skaletsky 2000). PCR reactions were performed in a 15 μl total volume containing ∼25 ng of DNA, 2 pmol of each primer, 0.4 units of HotStar Taq polymerase (Eppendorf), 1.56 μl of 10× reaction buffer, and 0.2 μl of 25 mM dNTPs. Conditions of the PCR were as follows: 94° for 15 min, 35 cycles of 94° for 30 sec, 60° for 30 sec, 72° for 2 min, followed by 72° for 7 min. Amplified products were separated in a 1% agarose TBE gel and visualized by ethidium bromide staining. Validation of expression patterns was performed on cDNA templates using the same PCR protocols.
RNAs from all tissues were treated with DNAse prior to allele-specific expression analyses. cDNAs were synthesized from all three biological replicates of Mo17 × B73 and B73 × Mo17 hybrid RNAs. Three mixed cDNAs were also synthesized from equal mixes of the biological replicates of Mo17 and B73 inbred RNAs. The cDNAs were reverse transcribed using Superscript III reverse transcriptase, according to the manufacturer's instructions (Invitrogen).
PCR-based assays for allele-specific expression analyses were designed for 27 genes, in collaboration with Sequenom (San Diego). The genes were randomly chosen by comparing the set of differentially expressed genes with the B73/Mo17 sequences available at Panzea (Zhao et al. 2006) and selecting examples with B73/Mo17 SNP polymorphisms. PCR and extension PCR reactions on cDNA and DNA templates were performed using the manufacturer's specifications (Sequenom). Mass spectrometry quantification of allele ratios was performed at the University of Minnesota Genotyping Facility.
B73 and Mo17 DNA samples in known ratios were used to standardize the allele-specific quantification procedure for each gene. DNAs from reciprocal hybrid seedlings were used to standardize 1:1 allelic ratios. DNAs from early hybrid endosperms were reciprocally used to standardize 2:1 allelic ratios. DNAs from B73 and Mo17 inbreds were reciprocally mixed in 4:1 ratios to further standardize the methodology. Multiple allele-specific measurements were performed using the DNA from 1:1, reciprocal 2:1, and reciprocal 4:1 templates to generate a robust eightfold dynamic range standard curve for each gene assay. All assays used in this study display a strong correlation between the known input ratio of B73:Mo17 and the measured ratio.
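One way such a standard curve could be fit is by ordinary least squares on the B73 fraction, as sketched below. The input ratios (1:4, 1:2, 1:1, 2:1, 4:1) come from the text. The function names, and the idea of inverting the fitted line to correct a measurement, are assumptions made for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares slope, intercept, and Pearson r for y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / (sxx * syy) ** 0.5


# Known B73:Mo17 input ratios expressed as the B73 fraction of the template:
# 1:4, 1:2, 1:1, 2:1, 4:1
KNOWN_B73_FRACTION = [1 / 5, 1 / 3, 1 / 2, 2 / 3, 4 / 5]


def calibrate(measured_fraction, slope, intercept):
    """Map a raw measured B73 fraction back onto the known-input scale."""
    return (measured_fraction - intercept) / slope
```

A correlation coefficient near 1 between known and measured fractions corresponds to the strong correlation reported for the assays. The slope and intercept indicate any systematic allele bias in a given assay.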
Multiple measurements of the ratio of the two alleles were performed for each of the three biological replicates of mixed RNA and the six F1 RNAs in the tissue type that displayed differential expression. Statistical analyses were performed using a two-tailed homoscedastic t-test (significance threshold P = 0.10). Three comparisons were tested: mixed RNA vs. a known 1:1, F1 RNA vs. a known 1:1, and mixed RNA vs. F1 RNA. Genes were described as differentially expressed if the mixed inbred RNA was not equal to the known 1:1 for the two alleles. Genes were described as trans-regulated if the mixed inbred RNA was not equal to 1:1 but the F1 RNA was equal to 1:1. Genes were described as cis-regulated if the mixed and F1 RNA were not statistically different and both were different from a known 1:1. Genes were described as cis- and trans-regulated if the F1 and mixed RNA differed from each other and each differed from a known 1:1.
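The decision rules above can be written as a small function. This sketch takes the outcomes of the three t-tests as booleans (True meaning a significant difference). The function and argument names are hypothetical.

```python
def classify_regulation(mix_differs_from_1to1, f1_differs_from_1to1,
                        mix_differs_from_f1):
    """Classify a gene from the three allele-ratio comparisons.

    Each argument is True when the corresponding t-test is significant.
    """
    if not mix_differs_from_1to1:
        # Inbred mix is indistinguishable from 1:1.
        return "not differentially expressed"
    if not f1_differs_from_1to1:
        # Allelic bias in the inbred mix disappears in the hybrid.
        return "trans"
    if not mix_differs_from_f1:
        # Hybrid retains the allelic ratio seen in the inbred mix.
        return "cis"
    # Hybrid ratio matches neither the inbred mix nor 1:1.
    return "cis and trans"
```

The order of the checks matters: a gene is assigned to a regulatory class only after the mixed inbred RNA has been shown to differ from 1:1.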
RESULTS
Many genes are differentially expressed in B73 relative to Mo17:
We sought to investigate the transcriptional variation of two maize heterotic inbreds, and the transcriptional alterations that accompany F1 hybridization. The maize Affymetrix GeneChips were used to profile the expression patterns in maize inbred lines B73 and Mo17 as well as reciprocal F1 hybrids to compare the expression patterns of the inbreds and hybrids. Microarray hybridizations were performed with RNA isolated from seedling, immature ear, or embryo tissue of B73, Mo17, B73 × Mo17, or Mo17 × B73 genotypes. Expression was detected in multiple tissues for many of the genes present on the array (see supplemental Table 1 and Figure 1 at http://www.genetics.org/supplemental/). While expression was detected for many of the probe sets in multiple tissues, the level of expression varied significantly between the tissues.
A correlation plot indicates that there are numerous genes with differential expression in B73 relative to Mo17 (Figure 1A). Separate statistical analyses were performed using either the MAS5.0 or GC-RMA processed signals to identify genes that are differentially expressed in inbred B73 relative to Mo17 (Table 1; supplemental Table 2 at http://www.genetics.org/supplemental/). Depending upon the tissue, data processing, and statistical methods, we identified 4–18% of genes as differentially expressed in B73 relative to Mo17. For the remaining analyses in this study, we have used the most restrictive statistical analysis [ANOVA with a false discovery rate (FDR) threshold of 0.05, no assumption of equal variance] on the GC-RMA processed signals such that we were limited to genes that are most likely to be differentially expressed. The fold change for differentially expressed genes (as identified in our statistical test between B73 and Mo17) varied from 1.04 to 1070, 1.08 to 827, and 1.04 to 2380 in seedling, immature ear, and embryo, respectively.
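For readers unfamiliar with the FDR thresholds used here, the Benjamini and Hochberg step-up procedure can be sketched as follows. This is a generic illustration of the procedure, not the GeneSpring implementation, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues, fdr=0.05):
    """Return a list of booleans, True where a test is declared significant.

    Step-up rule: find the largest rank k (1-based, ascending p-values)
    with p_(k) <= (k / m) * fdr, then accept the k smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            cutoff_rank = rank
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant
```

Relaxing the threshold from 0.05 to 0.10 or 0.25 can only enlarge the accepted set, which is why the number of genes called differentially expressed depends on the threshold chosen.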
Analysis of Affymetrix probe level effects:
The high rate of intraspecific sequence variation in maize could lead to a high error rate when using Affymetrix microarrays to compare the relative expression of two different genotypes. In a comparison of B73 and Mo17 sequences it was found that, on average, insertion/deletion polymorphisms occur every 309 bp and single nucleotide polymorphisms occur every 79 bp (Vroh Bi et al. 2005). These data were obtained by focusing on 3′ UTRs. Additionally, it was found that 44% of 592 randomly selected sequences contained a polymorphism in B73 relative to Mo17. The design of the maize Affymetrix microarray attempted to control for these intraspecific sequence differences. There are 15 probes for each represented gene on the microarray. The probes for the microarray are designed on the basis of EST contig sequences. The EST contig sequences are derived from multiple inbred genotypes. Any positions exhibiting sequence polymorphism or sequencing errors were masked out during the probe design phase. Therefore, when sequence information was available, the probes will match both B73 and Mo17 alleles.
However, while the microarray design attempted to counter the problem of intraspecific sequence variation, the lack of complete sequence for B73 and Mo17 means that in some cases it is possible that some probes would match only one of the two alleles. If this is the case, a gene may be called differentially expressed on the basis of one or two probes that provide a significantly different hybridization signal for one allele. We searched through the genes we identified as differentially expressed to identify potential examples. The normalized signal values for all perfect match and mismatch probes for the differentially expressed genes were extracted and analyzed using a t-test. The number of probes (out of 15) per gene that were significantly different was then determined (see supplemental Figure 2 at http://www.genetics.org/supplemental/). For the majority of differentially expressed genes, at least 9 of the 15 probes yielded significantly different signals. A separate analysis was performed using the perfect match probe signal (instead of the perfect match–mismatch difference) and similar results were obtained (data not shown). These results suggest that the majority of examples of differential expression are not likely to be erroneous calls based on allelic sequence differences, but are instead likely to be examples of differential expression.
Very few genes exhibit parental effects on gene expression levels:
The phenotypic characteristics of maize F1 hybrids, including yield, are not influenced by the direction of the cross, although microarray-based analyses of gene expression levels of reciprocal hybrids in Drosophila and Arabidopsis have identified numerous genes with altered expression on the basis of the direction of the cross (Gibson et al. 2004; Vuylsteke et al. 2005). The design of our experiment allowed a comparison of the F1 hybrid derived from crossing a female B73 by male Mo17 with a hybrid derived from female Mo17 crossed by male B73. Correlation plots of the microarray data indicate that there is much less transcriptional variation between reciprocal hybrids than between parental inbred lines (Figure 1, A and B; supplemental Figure 3 at http://www.genetics.org/supplemental/). Our statistical analyses failed to find strong evidence for differential expression of genes on the basis of parental effects (Table 1; supplemental Table 2 at http://www.genetics.org/supplemental/). When relaxed false discovery rates were applied, a small number of genes were identified as differentially expressed.
Gene expression levels in the hybrid are within the range of the inbred parents:
A brief examination of the correlation between the inbred parents and the hybrid offspring suggested significant additivity in gene expression patterns (Figure 1C; supplemental Figure 3 at http://www.genetics.org/supplemental/). We sought to characterize the level of gene expression in the hybrids relative to the two parental inbred lines by dividing the genes into two groups on the basis of the statistical analysis of inbred expression level in each tissue. The group I genes show no significant difference in the expression levels of the two parental lines, while the group II genes were differentially expressed in the inbreds B73 and Mo17. Given the novel heterotic phenotypes, we expected to find novel expression states for a subset of group I genes. These novel expression states would be due to either over high-parent gene action or epistatic interactions. Surprisingly, we found that group I genes do not show novel expression states in hybrids (supplemental Figure 4 at http://www.genetics.org/supplemental/). ANOVA was performed on the group I genes to test for expression changes in the hybrids. With a Benjamini and Hochberg FDR of either 0.05 or 0.20, no genes were identified with differential expression in either hybrid relative to either inbred.
The expression state for group II genes, those that are differentially expressed in the two parents, can be described as over high parent, high parent, between parent (less than the high parent, but greater than the low parent), low parent, or under low parent. A statistical analysis comparing the expression levels of the group II genes in the hybrid with each of the two parental inbred lines reveals that the majority (61–82%) of these genes are expressed at between-parent levels (Figure 2A), indicating that additive or near-additive expression patterns may be very common in the hybrid. To directly address this question of additive vs. subtle nonadditive expression patterns in the hybrid, a t-test statistical analysis was performed to compare the inbred average (midparent value) with the hybrid average expression levels using the three biological replicates. This test indicates that ∼20% of the group II genes exhibit nonadditive expression in the hybrid with a significance of P < 0.05 (20.2% for seedling, 23.6% for immature ear, and 23.2% for embryo).
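The five expression states can be assigned with a simple rule set, sketched below. Only the between-parent rule is stated explicitly in the methods. The rules used here for the high-parent and low-parent classes (indistinguishable from one parent, different from the other) and the "unclassified" fallback are assumptions of this sketch, as are the function and argument names.

```python
def classify_hybrid(hybrid, high_parent, low_parent,
                    differs_from_high, differs_from_low):
    """Assign a hybrid expression state for a group II gene.

    hybrid, high_parent, low_parent: mean expression values.
    differs_from_high / differs_from_low: True when the hybrid is
    significantly different from that parent.
    """
    if differs_from_high and differs_from_low:
        if hybrid > high_parent:
            return "over high parent"
        if hybrid < low_parent:
            return "under low parent"
        return "between parent"
    if differs_from_low:
        # Assumed rule: indistinguishable from the high parent only.
        return "high parent"
    if differs_from_high:
        # Assumed rule: indistinguishable from the low parent only.
        return "low parent"
    # Assumed fallback: hybrid not separable from either parent.
    return "unclassified"
```

Note that a between-parent call does not by itself imply additivity. A hybrid value can lie within the parental range and still deviate from the midparent value, which is why the separate midparent test is needed.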
While this indicates that a significant proportion of differentially expressed genes experience nonadditive expression, our previous statistical categorization (Figure 2A) indicates that these genes are still within the parental range. Almost no group II genes (5 of 5,640) show evidence for over high-parent or under low-parent expression states. A comparison of hybrid expression levels relative to inbred parents supports the conclusions of the statistical tests (Figure 2B). Clustering analyses were performed for group I and II genes in all three tissues to search for gene clusters that might exhibit novel expression states in the hybrid relative to the inbred parents (Figure 2, C–E; supplemental Figure 4 at http://www.genetics.org/supplemental/). We did not find evidence for any such gene clusters.
A significant number of genes are only expressed in one inbred line:
The MAS5.0 algorithm provides a call of “Present” (P), “Marginal” (M), or “Absent” (A) expression for each gene on the basis of a set of criteria. We obtained the P, M, A calls for each of the differentially expressed genes in the microarray study and identified examples that were called absent in all three biological replicates of one inbred but present in all three biological replicates of the other inbred (Table 2). A significant number of genes that displayed present–absent expression patterns were identified. Every one of these genes was called present in all six biological replicates of F1 genotypes (three B73 × Mo17 replicates and three Mo17 × B73 replicates). This results in a slightly larger number of expressed genes in the hybrid relative to either inbred parent. The P, M, A calls for all of the present–absent genes were obtained in all three tissues. We found that 427 genes (2.4%) were expressed in at least one tissue of one inbred but not expressed in any of the three tissues of the other inbred.
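The present–absent filter described above amounts to requiring unanimous calls across replicates. A minimal sketch follows, assuming the calls are available as dictionaries mapping a probe set identifier to its list of per-replicate calls. The data layout, function name, and example identifiers are hypothetical.

```python
def present_absent_genes(calls_b73, calls_mo17):
    """Find genes present in every replicate of one inbred and absent in
    every replicate of the other.

    calls_*: dict mapping gene ID -> list of 'P', 'M', or 'A' calls,
    one per biological replicate.
    """
    result = {}
    for gene in calls_b73.keys() & calls_mo17.keys():
        b, m = calls_b73[gene], calls_mo17[gene]
        if all(c == "P" for c in b) and all(c == "A" for c in m):
            result[gene] = "B73 only"
        elif all(c == "A" for c in b) and all(c == "P" for c in m):
            result[gene] = "Mo17 only"
    return result
```

A single marginal or discordant call in any replicate excludes a gene, so the filter is conservative.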
The present–absent genes may reflect differences in the genic content of B73 and Mo17 (as initially suggested in Fu and Dooner 2002) or may be examples of differences in the regulation of conserved loci. A combination of BLAST sequence similarity searches and PCR amplification methods was used to separate these two possibilities. Although the maize genome is not sequenced, the majority of maize genes have been tagged through reduced representation genome survey sequencing (GSS) of B73 genomic DNA (Springer et al. 2004). We performed BLAST searches of the B73 GSS sequences using 43 genes expressed in Mo17 but not in B73 as queries. A highly similar sequence was detected in the B73 genome for the majority (33/43) of these genes (Table 3; supplemental Table 3 at http://www.genetics.org/supplemental/). This rate is comparable to the rate achieved when using the genes expressed in B73 but not in Mo17 as queries (83/94). Additionally, PCR was used to test for the presence of 43 Mo17-specific and 94 B73-specific genes in genomic DNA. The majority of genes (110/137) were detected in both B73 and Mo17, with only a small subset showing amplification in only one genotype (Table 3; supplemental Table 3 at http://www.genetics.org/supplemental/). In combination, these results suggest that the majority of genes that are expressed in only one inbred are present in both inbred lines, but may experience differences in transcriptional regulation.
The present–absent expression calls were validated for a subset of the genes that could be amplified from the genomic DNA of both parental inbreds. PCR was performed using cDNA derived from B73 and Mo17 ear or seedling tissue (depending upon the tissue that displayed present–absent expression in the microarray analysis). The majority (69/90) displayed the expected present–absent expression pattern (Table 3). Many of the genes that were not validated (14/21) had detectable expression in both inbreds but less product was amplified in the inbred line that had been called absent in the microarray analysis.
Variation in gene expression can be primarily explained by cis-regulatory variation:
The high number of genes that vary in B73 relative to Mo17 could be the result of a small number of alterations of trans-acting factors, each with large effects, or the result of many cis-acting variants (Wittkopp et al. 2004). The predominance of additive expression levels in the hybrid suggests that most regulatory variation occurs via cis-acting variants or dosage-dependent trans-acting variants. One method to differentiate between trans-acting and cis-acting factors involves the use of quantitative allele-specific expression assays (Cowles et al. 2002; Yan et al. 2002; Bray et al. 2003; Lo et al. 2003; Pastinen et al. 2004; Wittkopp et al. 2004). A gene that is completely subject to trans-regulation is expected to have equal levels of expression of the two parental alleles in the F1 hybrid. A gene that is subject to cis-regulation will show a bias in the relative expression of the two alleles in the F1.
We designed quantitative allele-specific assays for 27 genes that were identified as differentially expressed in our microarray analyses. The genes were chosen on the basis of the availability of a useful B73/Mo17 polymorphism, not on the basis of expression levels, fold change, or annotation. The quantitative SNP-based Sequenom technology (Jurinke et al. 2005) was used to test the relative expression level for the two alleles in a mix of RNA from the two parents as well as in the F1 hybrid (Table 4). One of the assays was designed for a gene that was differentially expressed in all three tissues and six of the assays represented genes that were differentially expressed in two tissues, resulting in a total of 35 tissue-by-assay combinations. These data were used to validate the Affymetrix data as well as to test for cis- vs. trans-regulatory changes. The relative proportion of the B73 allele in a mix of inbred B73 and Mo17 RNA was compared to the predicted proportion derived from the Affymetrix data. The two measures were strongly correlated, and in most cases the allele-specific assays supported a conclusion of differential expression (Figure 3A).
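As a rough illustration of how a predicted allelic proportion might be derived from array data, the sketch below computes the expected B73 share of transcript in an equal-parts RNA mix. It is a minimal sketch, assuming linear-scale (not log-transformed) signal values and a 1:1 mix of parental RNA; the function name and the example values are hypothetical and are not taken from the study.

```python
def predicted_b73_fraction(b73_signal: float, mo17_signal: float) -> float:
    """Expected B73 share of transcript in a 1:1 mix of B73 and Mo17 RNA.

    Assumes both inputs are linear-scale array signals for the same gene
    (an assumption of this sketch, not a detail given in the text).
    """
    if b73_signal < 0 or mo17_signal < 0:
        raise ValueError("signals must be non-negative")
    total = b73_signal + mo17_signal
    if total == 0:
        raise ValueError("at least one inbred must show signal")
    return b73_signal / total


# Hypothetical gene with threefold higher signal in B73 than in Mo17.
print(predicted_b73_fraction(300.0, 100.0))  # 0.75
```

A value near 0.5 would correspond to no expression difference between the inbreds, which is the case the allele-specific assay is meant to confirm or refute.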
A comparison of the relative fraction of transcript derived from the B73 allele in the F1 and the mixed inbred RNA showed a high correlation, indicating that much of the allelic expression variation can be attributed to cis-regulation (Figure 3B). A t-test was used to identify significant differences between the relative fraction of the B73 allele in the F1 RNA compared to either the mixed B73 and Mo17 RNA or a 1:1 genomic DNA sample (Table 4). The results of the t-tests were then used to classify modes of gene regulation. Genes that showed the same allelic ratios in the inbreds and hybrids were determined to be cis-regulated. Genes that showed allelic bias in the inbreds, but equal proportions in the hybrid, were determined to be trans-regulated. Genes in which hybrid allelic proportions matched neither parental nor equal proportions were determined to be regulated by a combination of cis- and trans-factors. Of 35 assays, we classified 18 genes as cis-regulated, 1 gene as trans-regulated, 13 genes as cis- and trans-regulated, and 3 genes as not differentially expressed between inbreds.
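The classification rules described above can be sketched as a small decision function. This is an illustrative sketch only: it takes t-test P-values as inputs rather than computing them, the 0.05 threshold is an assumed value, and the third comparison (mixed inbred RNA vs. a 1:1 genomic DNA sample) is an assumption about how the "not differentially expressed" class could be assigned. The function and parameter names are hypothetical.

```python
def classify_regulation(
    p_f1_vs_mix: float,
    p_f1_vs_equal: float,
    p_mix_vs_equal: float,
    alpha: float = 0.05,  # assumed significance threshold
) -> str:
    """Assign a regulatory class from three allelic-ratio comparisons.

    p_f1_vs_mix    : F1 RNA vs. mixed B73 + Mo17 RNA
    p_f1_vs_equal  : F1 RNA vs. 1:1 genomic DNA
    p_mix_vs_equal : mixed inbred RNA vs. 1:1 genomic DNA (assumed input)
    """
    if p_mix_vs_equal >= alpha:
        # No allelic bias between the inbreds to explain.
        return "not differentially expressed"
    if p_f1_vs_mix >= alpha:
        # Hybrid keeps the inbred allelic ratio.
        # Checked first, so an F1 that matches both the mix and 1:1
        # (an ambiguous, low-power case) is reported as cis here.
        return "cis"
    if p_f1_vs_equal >= alpha:
        # Inbred bias disappears in the hybrid.
        return "trans"
    # Hybrid ratio matches neither the inbred ratio nor 1:1.
    return "cis and trans"


print(classify_regulation(0.60, 0.001, 0.001))   # cis
print(classify_regulation(0.001, 0.40, 0.001))   # trans
print(classify_regulation(0.001, 0.002, 0.001))  # cis and trans
```

The order of the checks is a design choice of this sketch; a different ordering would change how ambiguous cases are labeled.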
The cis- and trans-regulatory variation was also investigated for a subset of the present–absent genes that displayed detectable expression levels in only one of the two inbred lines. Sequencing of the PCR amplicons derived from genomic DNA identified useful polymorphisms in 21 of the genes. A combination of agarose gel electrophoresis (for insertion/deletion polymorphisms) and sequencing (for single nucleotide polymorphisms) was used to monitor the presence of amplicons derived from the two alleles using both F1 hybrid DNA and cDNA as independent templates (supplemental Table 3 and supplemental Figure 6 at http://www.genetics.org/supplemental/). This approach provided a qualitative assessment of cis- vs. trans-regulatory variation for the present–absent genes, but it is not sensitive enough to discriminate trans-only effects from combined cis- and trans-effects. At least some transcript was detected for both alleles in hybrid cDNA for 6 of the genes tested, implying at least a partial contribution of trans-regulation in these genes. For the remaining 15 genes, only one of the two alleles was detected in hybrid cDNA, implying primarily cis-regulatory variation.
Gene expression patterns in inbred vs. hybrid maize:
We measured transcript levels in the maize inbred lines B73 and Mo17 as well as in the reciprocal hybrids. Given the high level of transcriptional variation between inbreds B73 and Mo17, we expected to observe novel expression phenotypes in the F1 hybrid, assuming that the variant alleles might have epistatic effects on gene regulation. Surprisingly, there is little evidence for such epistatic effects. On the basis of our microarray data, the expression level for the hybrid is nearly always statistically within the range of the two parents and frequently at intermediate midparent levels. This is in contrast to the observed phenotypes for the F1 hybrids, which are superior to both parents. Approximately 20% of the genes analyzed in our study displayed nonadditive expression patterns in the hybrid; the hybrid expression levels of these genes were typically similar to those of the high or low parent. It should be noted that while microarrays are a very useful tool for profiling the relative expression level for a large number of genes, the technology does not provide precise measurements. In our study, with three biological replicates per tissue type, we could accurately assess the prevalence of major nonadditive expression patterns but had limited ability to distinguish subtly nonadditive gene expression patterns from additive expression. Our finding that ∼80% of the differentially expressed genes are indistinguishable from additive expression is similar to that of a recent study comparing the transcriptional profiles of B73, Mo17, and Mo17 × B73 14-day seedlings using a maize cDNA microarray (Swanson-Wagner et al. 2006). Using nine biological replicates, the authors found that 78% of the differentially expressed genes displayed hybrid expression levels that were not significantly different from midparent, and the majority of nonadditive profiles displayed hybrid expression levels statistically within the parental range (Swanson-Wagner et al. 2006).
However, the high frequency of additive expression patterns in hybrids found in our study is somewhat in contrast to several previous studies, both in maize and in other systems. Song and Messing (2003) found nonadditive hybrid expression patterns for individual zein gene copies in maize endosperms. Additionally, Auger et al. (2005) reported a high frequency of nonadditive expression profiles in hybrid mature leaf tissues. The reason for the differences in our data set compared to these previous studies is unknown, but may relate to differences in experimental design. For instance, the proportion of additive vs. nonadditive expression patterns in the tissues sampled [growing endosperm tissue in the Song and Messing (2003) study, terminally differentiated leaf tissues in the Auger et al. (2005) study, and young growing tissues in our study and Swanson-Wagner et al. (2006)] may in fact be different. Additionally, there are inherent differences in the technical and statistical procedures used in these studies that may have contributed to these differences. Also, the Auger et al. (2005) study was designed to test the effects on genes in all three genomes in the cell, which would represent a different sampling from that of the microarrays. Importantly, however, both the Auger et al. (2005) study and our study are consistent in finding that hybrid expression profiles grossly over high parent are rarely observed.
Furthermore, our results differ from microarray studies that reported modes of gene action in Arabidopsis and Drosophila hybrids (Gibson et al. 2004; Ranz et al. 2004; Vuylsteke et al. 2005). In Drosophila, there are numerous examples of genes showing novel expression states in the hybrid relative to the inbred strains (Gibson et al. 2004). Clustering analyses in Drosophila found that the inbred strains are more similar to each other than to the hybrid strains. The difference between the findings of our study and those of the Drosophila or Arabidopsis studies can be illustrated by comparing the [d]/[a] plots (Gibson et al. 2004; Vuylsteke et al. 2005; supplemental Figure 7 at http://www.genetics.org/supplemental/). Genes with [d]/[a] values >1 or < −1 are generally considered to exhibit overdominance for the phenotype measured (in this case, gene expression). In Drosophila, 5% of genes showed overdominant expression phenotypes (Gibson et al. 2004), while nearly 10% of the Arabidopsis genes displayed overdominant expression patterns (Vuylsteke et al. 2005). In our study, <0.01% of genes exhibit [d]/[a] values that would suggest overdominant expression levels. This difference is notable because it may indicate that the modes of gene action following hybridization differ across species and/or are affected by the degree of genetic similarity between the parents.
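For readers unfamiliar with the [d]/[a] ratio, the following minimal sketch shows one common way to compute it. It assumes the standard quantitative-genetic convention in which [a] is half the difference between the two parental values and [d] is the deviation of the hybrid from the midparent value; the exact scale and sign convention used in the cited studies may differ, and the function name and example values are hypothetical.

```python
def dominance_ratio(parent1: float, parent2: float, hybrid: float) -> float:
    """Return [d]/[a] for one gene.

    Convention assumed in this sketch:
      a = (high parent - low parent) / 2   (additive effect, always > 0)
      d = hybrid - midparent               (dominance deviation)
    so 0 is additive, +1 or -1 equals the high or low parent, and values
    beyond that range lie outside the parental range.
    """
    high, low = max(parent1, parent2), min(parent1, parent2)
    a = (high - low) / 2
    if a == 0:
        raise ValueError("parents do not differ; [d]/[a] is undefined")
    midparent = (high + low) / 2
    return (hybrid - midparent) / a


print(dominance_ratio(100.0, 300.0, 200.0))  # 0.0  (midparent, additive)
print(dominance_ratio(100.0, 300.0, 300.0))  # 1.0  (equal to high parent)
print(dominance_ratio(100.0, 300.0, 350.0))  # 1.5  (above high parent)
```

Because the ratio divides by the parental difference, it becomes unstable for genes whose parents differ only slightly, which is one reason such plots are usually restricted to differentially expressed genes.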
Prevalence of cis-acting variation:
In our study we have combined allele-specific expression data with microarray analyses to document the sources of transcriptional variation between maize inbreds and relate this to the expression patterns observed in maize hybrids. Our microarray data indicate that many genes display additive or near-additive expression levels in the hybrid. Furthermore, our allele-specific expression data indicate that the parental allelic expression ratios are approximately maintained in the hybrid, which is consistent with the previous finding that maize hybrids frequently display biased allelic expression (Guo et al. 2004). The majority of differentially expressed genes in our study (31/32 genes studied by quantitative allele-specific expression and 15/21 genes monitored by qualitative allele-specific analysis) are regulated in part or completely by cis-acting variation, similar to results observed in interspecific hybrid Drosophila (Wittkopp et al. 2004). Many of the instances in which genes have been classified as regulated by both cis- and trans-variation actually appear to be explained primarily by cis-regulation (Figure 3B). Cis-regulated alleles that maintain their respective parental expression values following hybridization should result in hybrid expression values tending toward midparent levels, as was observed. In combination, these data indicate that the majority of variation in gene expression between B73 and Mo17 is due to cis-regulatory variation, suggesting a mechanism for the high frequency of additive or near-additive expression patterns in the hybrid (Figure 4).
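The link between cis-regulation and midparent expression can be written out under a simplifying assumption that is ours rather than a result of the study: regulation is purely cis, so each allele copy produces the same amount of transcript in the hybrid as it does in its inbred parent. The symbols below are hypothetical notation, with E_B and E_M denoting total expression in the B73 and Mo17 inbreds (two allele copies each) and f_B denoting the B73 allelic fraction in the hybrid.

```latex
E_{\mathrm{F1}} = \tfrac{1}{2}E_{B} + \tfrac{1}{2}E_{M} = \frac{E_{B} + E_{M}}{2},
\qquad
f_{B} = \frac{E_{B}/2}{E_{B}/2 + E_{M}/2} = \frac{E_{B}}{E_{B} + E_{M}}
```

Under this assumption the hybrid total equals the midparent value, and the hybrid allelic fraction equals the fraction expected in an equal mix of inbred RNA, which matches the pattern described for the cis-regulated genes.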
The majority of genes that display nonadditive expression patterns in our study, the Swanson-Wagner et al. (2006) study, and the Auger et al. (2005) study are expressed at levels that are within the range of expression of the two inbred parents. These nonadditive expression patterns may be explained by the presence of dominant (or codominant) trans-acting factors. For example, a dominant transcriptional activator or repressor in B73 or Mo17 could lead to nonadditive expression patterns (Figure 4).
The prevalence of cis-acting regulation provides an explanation for the relative lack of epistatic interactions that would result in novel expression states in the hybrid. Potential cis-regulatory changes could include sequence variation (likely, but not necessarily, at promoter sites), altered chromatin states, or differences in RNA stability. Although it was initially surprising that cis-regulation is more commonly observed than trans-regulation, the sequence data on maize intraspecific genome variation may provide an explanation. Multiple studies (Fu and Dooner 2002; Song and Messing 2003; Brunner et al. 2005) have found that the repetitive elements surrounding a gene in different maize inbreds are frequently distinct. Therefore, alleles from different inbreds often reside in distinct chromatin “neighborhoods” that may have subtle cis-acting effects upon expression levels.
Alternatively, differential allelic expression in the F1 may result from transcriptional memory, in which the parental expression levels are maintained possibly via heritable epigenetic states. A mechanism involving transcriptional memory would likely be nonpermanent, and subsequent generations may eventually generate equivalent allelic expression. For now, however, the cis-acting mechanisms and generational stability of hybrid allelic variation remain to be elucidated.
Considerations of the relationship between expression data and heterosis:
It is difficult to link the phenomenon of heterosis, in which hybrid plants exhibit superior phenotypes relative to the inbred parents, to molecular mechanisms. For example, Ma et al. (2006) observed fewer gene expression differences between inbred W23 and hybrid ND101 × W23 than was anticipated on the basis of phenotypic differences. It is likely that heterosis is quite complex as it involves numerous component phenotypes and involves both quantitative and qualitative variation. Our experimental design does not directly address the underlying causes or mechanisms of heterosis but can be useful for general considerations of the contribution of gene expression variation to heterosis. Within our data set there is little evidence to support a model of heterosis that involves overdominant or underdominant gene expression patterns, as the hybrid plants almost never exhibit novel expression states. Instead, we observed differences in the inbred expression levels (or tissue-specific expression patterns) for many genes, with the hybrids displaying expression at the midparent levels. The midparent expression level per se may be phenotypically advantageous; however, we have no direct evidence of this.
The authors thank Peter Hermanson, Michelle Carlson, Elizabeth Chrans, and Dinesha Walek for providing technical help with sample preparation and data collection. The Minnesota Supercomputing Institute provided access to software packages used for data analysis. We thank Shawn Kaeppler, Patrick Schnable, Ronald Phillips, Peter Tiffin, Neil Olszewski, Irina Makarevitch, and four anonymous reviewers for providing invaluable discussions and feedback. This work was supported by the National Science Foundation (grant no. DBI-0227310 to N.M.S.) and the University of Minnesota Graduate School.
Communicating editor: J. A. Birchler
- Received May 12, 2006.
- Accepted May 12, 2006.
- Copyright © 2006 by the Genetics Society of America