While most gene transcription yields RNA transcripts that code for proteins, a sizable proportion of the genome generates RNA transcripts that do not code for proteins, but may have important regulatory functions. The brain-derived neurotrophic factor (BDNF) gene, a key regulator of neuronal activity, is overlapped by a primate-specific, antisense long noncoding RNA (lncRNA) called BDNFOS. We demonstrate reciprocal patterns of BDNF and BDNFOS transcription in highly active regions of human neocortex removed as a treatment for intractable seizures. A genome-wide analysis of activity-dependent coding and noncoding human transcription using a custom lncRNA microarray identified 1288 differentially expressed lncRNAs, of which 26 had expression profiles that matched activity-dependent coding genes and an additional 8 were adjacent to or overlapping with differentially expressed protein-coding genes. The functions of most of these protein-coding partner genes, such as ARC, include long-term potentiation, synaptic activity, and memory. The nuclear lncRNAs NEAT1, MALAT1, and RPPH1, composing an RNAse P-dependent lncRNA-maturation pathway, were also upregulated. As a means to replicate human neuronal activity, repeated depolarization of SY5Y cells resulted in sustained CREB activation and produced an inverse pattern of BDNF-BDNFOS co-expression that was not achieved with a single depolarization. RNAi-mediated knockdown of BDNFOS in human SY5Y cells increased BDNF expression, suggesting that BDNFOS directly downregulates BDNF. Temporal expression patterns of other lncRNA-messenger RNA pairs validated the effect of chronic neuronal activity on the transcriptome and implied various lncRNA regulatory mechanisms. lncRNAs, some of which are unique to primates, thus appear to have potentially important regulatory roles in activity-dependent human brain plasticity.
THE availability of mammalian genome sequences has made it possible to delineate the boundaries and structures of all genes in a genome and has demonstrated an abundance of non-protein-coding transcriptional units that rivals the numbers of known protein-coding genes (reviewed in Carninci and Hayashizaki 2007). Complex and potentially functional regulatory relationships between protein-coding and noncoding genes, including noncoding RNA genes that are poorly conserved across different species, have recently been delineated (Katayama et al. 2005; Engstrom et al. 2006). These long noncoding RNA (lncRNA) genes can be defined by four fundamental criteria: encoding transcripts that lack any open reading frames (ORFs) >100 amino acids or possessing protein database homologies (Dinger et al. 2008); being within the known range of lengths of mammalian mRNAs; support by transcript-to-genome alignments from complementary DNA (cDNA) data; and absence of matches to any known noncoding-RNA classes. Functionally, lncRNAs can have regulatory effects on coding mRNAs through a number of mechanisms, including those involving endogenous antisense lncRNA transcripts that repress their sense-strand protein-coding partners (Katayama et al. 2005; Yu et al. 2008).
Endogenous lncRNAs can also have catalytic roles, as exemplified by the TERC telomerase RNA, and by the RNAse P and MRP RNAs required for processing of other RNAs. lncRNAs essential to nuclear architecture include NEAT1 and NEAT2. Nuclear hormone receptors, homeobox transcription factors, tumor suppressors, and immune regulators are all endogenously modulated by lncRNAs (reviewed in Lipovich et al. 2010). Numerous lncRNAs are transcribed in the vicinity of known protein-coding genes and regulate those known genes through epigenetic mechanisms. Regulation of protein-coding genes by overlapping, or nearby encoded, lncRNAs is central in cancer, cell cycle, and reprogramming (reviewed in Lipovich et al. 2010; Loewer et al. 2010; Orom et al. 2010). lncRNAs encoded in an antisense orientation to, and overlapping with, known protein-coding genes are particularly abundant, and the small number of antisense lncRNAs characterized to date is replete with novel functions. Endogenous antisense lncRNAs are essential in mammalian X-inactivation (Tian et al. 2010); can directly regulate tumor suppressors; function through dicer-independent mechanisms; and may be rapidly evolving or not conserved, raising the potential for new regulation of old genes over evolutionary time (Lipovich et al. 2010). RNA interference (RNAi) and overexpression of lncRNAs in cell lines generate reproducible phenotypes, as we and others have shown (Bernard et al. 2010; Sheik Mohamed et al. 2010). Hundreds of human lncRNAs bind the polycomb repressor complex 2 (PRC2), a key epigenetic negative regulator (Khalil et al. 2009). In addition to high-throughput evidence of interactions with epigenetic factors, specific epigenetic roles of lncRNAs are beginning to be defined. Antisense lncRNAs actively and specifically modulate gene expression by serving as effectors of epigenetic changes at target loci (Yu et al. 2008). These changes include antisense lncRNA-mediated epigenetic silencing of the sense-strand protein-coding gene promoter; such silencing can be abrogated by Argonaute-2-dependent, small-RNA-mediated suppression of the antisense lncRNA, resulting in “RNA activation” of the sense gene (Morris et al. 2008). Promoter-overlapping antisense lncRNAs can also be targeted by exogenous short RNAs that regulate sense gene expression, also via Argonaute (Schwartz et al. 2008). Despite these promising examples, a majority of the thousands of other lncRNAs evident in transcriptome data still remain devoid of assigned functions.
This abundance of lncRNAs, many of which are primate-specific, warrants a systematic assessment of whether they have functional, regulatory roles. Perhaps nowhere might this be more important than in the human brain that is composed of a diverse set of cell types connected through complex synaptic arrangements. The degree of synaptic activity in the brain can be translated into functional and structural changes through activity-dependent changes in gene expression (Katz and Shatz 1996). Although these changes can be effected through direct activation of synaptic genes, they can also be achieved through the release of neurotrophic factors such as brain-derived neurotrophic factor (BDNF) that have direct effects on synaptic architecture and indirect effects by producing changes in gene expression (Isackson et al. 1991; Binder et al. 2001). BDNF, a member of the nerve growth factor family, regulates the survival and differentiation of neuronal populations, axonal growth and pathfinding, and dendritic growth and morphology and has been linked to many human brain disorders (reviewed in Bibel and Barde 2000; Binder and Scharfman 2004; Hu and Russek 2008). BDNF messenger (mRNA) and protein are upregulated by seizure activity in animal models of epilepsy as well as in human brain tissues that display increased epileptic activities (Ernfors et al. 1991; Lindvall et al. 1994; Nibuya et al. 1995; Beaumont et al. 2012). The genomic locus encoding BDNF is structurally complex and also encodes BDNFOS, a primate-specific lncRNA that is antisense to the coding BDNF gene (Liu et al. 2006; Aid et al. 2007; Pruunsild et al. 2007). BDNF and BDNFOS form double-stranded duplexes, suggesting a potential for BDNFOS to post-transcriptionally regulate BDNF (Pruunsild et al. 2007). Antisense knockdown of BDNFOS, in fact, has recently been shown to increase BDNF expression in HEK293 cells and promotes neuronal outgrowth in vitro (Modarresi et al. 2012)
BDNF binding to its receptors results in a diverse array of downstream signaling pathways including the activation of cyclic adenosine monophosphate response element binding protein (CREB), which, in turn, can also regulate BDNF by binding to a cognate site within the BDNF gene (Tao et al. 1998; Spencer et al. 2008). Activation of CREB by phosphorylation at serine 106 as a result of neuronal activity leads to changes in gene expression that cause reinforcement and stabilization of more active neuronal circuits (reviewed in Herdegen and Leah 1998; Kandel 2001; Matynia et al. 2002; West et al. 2002). Downstream from phosphorylated CREB (pCREB), immediate early genes have been shown to mediate long-lasting changes in neuronal structure and excitability. Upstream of CREB activation, several known signaling pathways are rapidly activated in response to neuronal activity (Kandel 2001; reviewed in West et al. 2002), including CaMKinase IV, protein kinase A, and MAPK. We have recently observed a pattern of transcriptional activation in human brain regions where seizures start that strongly implicates sustained MAPK/CREB activation and downstream coding gene activations that could underlie layer-specific changes in synaptic architecture that makes these regions prone to seizures (Rakhade et al. 2005; Barkmeier et al. 2012; Beaumont et al. 2012).
Given that human lncRNA genes tend to be less well-conserved than protein-coding genes, and can give rise to unique transcripts not found in other species, we sought out a uniquely human system to examine activity-dependent gene expression for both coding and noncoding RNAs using a pairwise comparison of human cortical regions displaying variable degrees of epileptic activities. These brain regions were removed as part of surgical treatment for intractable seizures. We show that regions of human neocortex that display increased activity and BDNF expression have reduced BDNFOS expression and that BDNFOS directly downregulates BDNF in vitro in a neuronal cell line. We developed a custom microarray platform to perform a transcriptome-wide discovery of other regulatory lncRNAs and matched these to nearby or overlapping, differentially expressed protein-coding genes to develop a genome-wide list of lncRNA–mRNA gene pairs. Many of the coding mRNAs identified in this way are known to modulate activity-dependent gene expression in the human brain, suggesting that these lncRNA–mRNA pairs form a newly revealed regulatory network of human brain plasticity.
Materials and Methods
Human brain tissue
Informed consent was obtained from seven patients who underwent surgery for medically intractable epilepsy. Extreme care was taken to ensure that our study did not influence surgical decision making. All patients underwent presurgical evaluation and identification of epileptic and control regions as previously described (Rakhade et al. 2005; Beaumont et al. 2012). To localize epileptic brain regions that displayed both clinical seizures and interictal epileptiform discharges (spikes), a two-stage surgical approach using subdural electrodes with continuous brain-surface recordings (electrocorticography) and video monitoring was undertaken for 2–5 days. For these studies, paired tissue samples from neocortex within each patient displaying high and low amounts of interictal (between seizures) spiking were used to compare differential gene expression as a function of brain activity (Loeb 2010). To avoid introducing additional variables into the analysis, each block of tissue was examined histopathologically and demonstrated a normal six-layered neocortical structure without lesions. The paired analysis of high- and low-spiking neocortex within each patient is also critical to isolate the variable under study, which is the degree of activity. Total RNA was prepared using a modification of the protocol described previously (Beaumont et al. 2012). The difference was that only gray matter was used by pooling two to three nearby strips of gray matter that extended from the pial surface to the white matter from each block of tissue corresponding to a given electrode location. This pooling method helps correct for differences in dissections that could lead to over- or under-representation of specific cortical layers.
Cell cultures, transfections, and depolarizations
The SH-SY5Y cell line (ATCC) was maintained in Dulbecco’s modified Eagle’s medium supplemented with 10% FBS and used for experiments. Cells between 17 and 25 passages were transfected with BDNFOS-targeting and BC013641-targeting small interfering RNAs (siRNAs) by electroporation according to the manufacturer’s instructions at ∼80% confluence (Neon electroporation system, Invitrogen). The electroporation conditions used for SH-SY5Y cell transfection were the following: 1200 V; pulse width: 20 ms; and number of pulses: 2. Prior to the experiments, these conditions had been optimized using a condition matrix, a control siRNA, and fluorescent reporters (data not shown). Single and multiple depolarizations of cells were induced by adding 100 mM KCl (final concentration) to the medium at different time points as indicated in the Figure 5 legend.
Quantitative PCR, siRNAs, and primers
Total RNA from cultured SH-SY5Y cells was isolated with an RNAeasy mini kit according to the manufacturer’s instructions (QIAgen). The first-strand cDNA was prepared using SuperScript First-Strand cDNA kit (Invitrogen), and mRNA and lncRNA expression levels were determined by Taqman quantitative real-time PCR (Taqman qPCR). BDNFOS siRNAs designated S1, S2, S3, and S4 were custom-designed and synthesized by Invitrogen. The BDNFOS Taqman primer/probe combos were custom-designed by uploading FASTA-format sequences of preferred amplicon regions to the ABI Taqman custom-design website and were purchased from ABI/Life Technologies. This vendor does not release the actual primer and probe sequences of custom-designed amplicons to the users. An siRNA against the housekeeping gene glyceraldehyde 3-phosphate dehydrogenase (GAPDH) was used to rule out nonspecific effects. While this siRNA knocked down GAPDH, it had no effects on BDNF, BDNFOS, and Lin7C at 24 and 48 hr (see supporting information, Figure S1 and File S4).
Western blot analysis
Cell lysates were prepared with SDS sample buffer (Sigma) and subjected to Western blotting to measure CREB phosphorylation as described (Beaumont et al. 2012). Briefly, proteins separated on 4–20% gradient sodium dodecyl sulfate-polyacrylamide gel were electrically transferred onto nitrocellulose membrane. After blocking with 5% (v/v) skim milk in TBS containing 0.05% Tween-20 for 1 hr at room temperature (RT), the membrane was incubated with rabbit polyclonal antibody against pCREB (Cell Signaling) at a dilution of 1:1000 for 1 hr at RT and then with specific secondary antibody coupled with HRP (1:5000) for 1 hr at RT. pCREB was visualized with ECL detection system (Pierce). The membrane was then stripped and reprobed with CREB antibody (Cell Signaling) at (1:1000) to measure total CREB.
Seven 60-mer probes per gene, unambiguously mapping by BLAT (Kent 2002) to a single genomic location and free of interspersed and simple repeats, were designed using the Agilent Technologies OpenGenomics eArray interface for 5586 of the 6736 lncRNA genes from Jia et al. (2010). The remaining lncRNA genes had been excluded because of eArray failure to yield seven probes per gene or because the eArray-designed probes failed our subsequent check for genomic uniqueness and absence of repeats. As a positive control, we also included seven probes each for 111 of the 137 previously determined protein-coding epileptic genes (Beaumont et al. 2012) and for six housekeeping control genes. The eArray Fill Array feature was used to randomly select control protein-coding gene probes to fill all features that would have otherwise remained vacant (<2% of total features on a 44,000-feature, i.e., “44k,” array cell). The entire probe set was printed in quadruplicate on each slide using the Agilent 4 × 44,000 high-density oligonucleotide microarray platform. Our custom lncRNA microarray contained the combined resulting set of probes (File S2).
Our Agilent 60mer probes are longer than the 25mers used on the Affymetrix platform, and more importantly, we tested each probe (after the probe was proposed by the Agilent EArray design software) for genomic mapping uniqueness [by University of California at Santa Cruz (UCSC) BLAT] and for repetitive element overlaps (by RepeatMasker). Only repeat-free and uniquely-mapping 60mer probes were included. Specificity is assured by these sequence qualities of the probes as well as by our strategy of profiling each lncRNA gene with seven unique probe sequences (not seven replicates of the same probe).
A dye-flip quadruplicate two-color microarray experiment was performed on each within-patient pair of high-spiking and low-spiking surgically resected samples on both the Agilent human genome-wide array (G4112A) and our custom lncRNA array as described, but using a different labeling method (Beaumont et al. 2012). We used the Epicentre protocol to generate aminoallyl-amplified RNA (aRNA) for subsequent amplification and labeling with either cyanine or Alexa dyes. For our custom lncRNA arrays, we used labeling with Alexa dyes (Alexa-647 and Alexa-555, Invitrogen) within the flip-dye design, as described by the manufacturer (SuperScript Indirect RNA Amplification System, Invitrogen) (Holloway et al. 2008). For every patient, each of the quadruplicates was hybridized on four separate slides. Four slides of 4 × 44,000 Agilent arrays (4 arrays, each composed of the same set of ∼44,000 probes) were used to screen seven patients. All slides were scanned as described previously (Beaumont et al. 2012).
Because our lncRNA custom microarray platform is new, we also used qPCR to validate a representative subset of differentially expressed lncRNAs. We considered which specific probes were responsible for the differential expression of each coding and noncoding gene observed across all seven patients and used probes to target only the region of each transcript that was overlapped by the differentially expressed probes. Positive correlation coefficients were seen in all cases, ranging from 0.61 to 0.96 (File S1) between the array and qPCR results within each patient; all protein-coding gene differential expression results were from the G4112A or F catalog protein-coding microarray, and all lncRNA results were from our lncRNA custom microarray (Figure S4 in File S4).
All lncRNA–mRNA overlaps in this work are in the antisense orientation. For all lncRNA–mRNA neighbor pairs, there is a spacer between the two genes along the genome, regardless of strand. All probes on our microarray are strand-specific and, therefore, even in the case of an lncRNA–mRNA antisense pair, will exclusively profile either only the lncRNA (on the custom array described in this article) or only the mRNA (on the Agilent catalog array).
Microarray statistical methods
To identify those differentially expressed lncRNAs that may be directly regulating their overlapping or neighboring protein-coding genes, we integrated our custom lncRNA expression microarray data with our conventional mRNA expression microarray data for the in vivo high-/low-activity cortical sample pairs from all seven patients analyzed with both array types. For each epilepsy patient, we had a within-patient sample pair of a high-spiking and a low-spiking region. This within-patient sample pair was analyzed, using the same dye-flip quadruplicate strategy, for both the catalog coding (G4112A) and the custom lncRNA microarray. Differentially expressed genes were identified from both microarray platforms but using the same strategy. Consistency between arrays was first examined by correlating the fold change of all protein-coding control genes common to both arrays, which was possible because our 111 “epileptic transcriptome” genes from the prior protein-coding array work (Beaumont et al. 2012) were used as controls on the lncRNA array. We used the average value of the seven probes corresponding to each control gene on the lncRNA custom array. For 140 catalog (Agilent G4112A) coding-array probes corresponding to these 111 genes, Pearson’s correlation coefficient was 0.90, attesting to very high reproducibility between the coding array and the noncoding custom array.
Scanned microarray images from coding and noncoding microarrays were analyzed by the software Agilent Feature Extraction (Agilent, V10.3.1) with the default protocol GE2_107_Sep09. A fluorescent correction factor was determined using both qRT-PCR and Agilent Spike-IN probes. This correction factor was then applied on the fluorescence intensity (fluorescence at exponent 1.125) and improved the fold change prediction. The fluorescence distribution inside each repetition of the microarray experiments was normalized by R V2.11 (R Development Core Team 2010) using the library “limma” (Smyth and Speed 2003) in a two-step process: (i) normalization of the intensity of fluorescence between dyes using a Loess correction (iterations: 50; span: 0.05) and (ii) independent scaling of fluorescence intensity on the same range across all the arrays for each dye using quantile normalization. The quality of the normalization process of the microarray fluorescence was validated using MA plot density and distribution analysis. Normality was asserted using the Anderson–Darling test from the library Nortest (Gross 2006). For each array, the background level was globally computed using the median of the fluorescence intensity of the negative control probes and subtracted from the signal of each probe.
Once normalized, the microarrays were further analyzed using standard statistical methods (Kerr and Churchill 2001b; Wolfinger et al. 2001). The differentially expressed genes between high and low spiking were determined using a two-step mixed model analysis of variance (Jin et al. 2001) with the library LME4 (Bates et al. 2009). This mixed model approach has been used to compute the fitted effect and the random effects simultaneously (Littell et al. 1996). To improve the sensibility of the analysis (Kerr et al. 2000; Jin et al. 2001), computation did not use the ratio but instead used dye fluorescence intensity indexed by the type of RNA (Tanaka et al. 2000) (RNA from the high-spiking area or RNA from the low-spiking area). The false discovery rate (FDR) and corrected P-value for each gene was computed with “R” using the library “fdrtool” (Strimmer 2009). The differentially expressed genes were detected using fold change and significance simultaneously (Tanaka et al. 2000) and were determined as significantly differentially expressed if their fold change, for at least one probe per gene, was ≥1.4 and if their FDR was ≤5%.
In addition to a number of custom approaches to identify 8 cis-acting coding lncRNA pairs, 26 trans-acting lncRNAs were identified as significant and activity-dependent by their tight correlation (Pearson’s correlation coefficient minimum of 0.9) to a well-known group of 13 activity-dependent protein-coding genes (Rakhade et al. 2007; Beaumont et al. 2012), which themselves had been co-expressed with a Pearson’s correlation coefficient of 0.95. These results were displayed graphically using Cytoscape (Smoot et al. 2011). To include a trans-acting lncRNA in this group, at least one probe (of the seven available probes) representing the lncRNA gene had to meet this statistical requirement.
To study the genes represented both on the catalog array and on our custom array, we used genomic localization of all transcripts along the same human genome assembly, hg19, to find differentially expressed transcripts from the custom noncoding and the G4112A catalog coding array that were close to each other along the genome or overlapped in an antisense orientation within a genomic region. The genomic position, strand, and exon/intron location information for each transcript is contained in the all_mrna BED file of the UCSC Genome Database.
Reciprocal patterns of BDNF and BDNFOS expression in electrically active human brain
Patients who fail to respond to medical management of their seizures can greatly benefit from a two-stage surgical procedure where long-term in vivo brain-surface recordings are used to identify and remove epileptic brain regions. We have used this human system recently to identify a relatively small group of genes, including BDNF that are differentially expressed in regions of the human neocortex where seizures start (Rakhade et al. 2005; Beaumont et al. 2012). While removing seizure-onset regions is key to a good outcome for improved seizure control, seizures from these brain regions are relatively infrequent compared to the small, but extremely frequent “interictal,” epileptic discharges that can occur almost constantly between seizures in some brain regions (Staley et al. 2005). In fact, several of the genes induced at seizure-onset zones correlate precisely with interictal spiking rather than with seizure frequency (Rakhade et al. 2007), suggesting that interictal spiking may be the driving force behind this altered expression pattern. Consistently, an animal model of interictal spiking without seizures was sufficient to produce neuronal layer-specific changes in these genes (Barkmeier et al. 2012). Here we have focused on brain regions with different levels of interictal spiking to identify the relationships between coding and noncoding transcripts in the in vivo human brain.
Figure 1A shows a table of seven patients used for the present study, together with quantified in vivo spike frequencies, tissue locations, and pathological descriptions. Patients varied in both sex and age, but were chosen because of the availability of both high- and low-interictalspiking neocortical brain samples from nearby brain regions for each patient that were removed as part of their seizure surgery treatment. Figure 1B shows how each of these pairs was selected with a short sample of the electroencephalogram recording that illustrates the large difference in interictal spiking. It is important to emphasize that, because of genetic differences, medication effects, and effects of tissue processing, our internally controlled experimental design is crucial (Rakhade et al. 2005; Beaumont et al. 2012). Although patients are listed with different pathological diagnoses from multiple neocortical regions, only tissue samples that showed a normal cortical architecture were used so as not to influence the major variable of interest: increased brain activity.
Because of the potential regulatory relationship of transcripts that code for BDNF with those that encode the partially antisense BDNFOS, as a first step we compared the relative expression levels of BDNF and BDNFOS between paired high- and low-spiking regions of human neocortex using qPCR for each patient (Figure 1C). In most patients, BDNF expression was higher in more electrically active regions, whereas BDNFOS lncRNA levels were significantly reduced in the high-spiking regions. We used EGR1 expression as a positive control for high-spiking human cortical brain regions as its expression has been shown to be directly proportionate to interictal brain activity (Rakhade et al. 2007). These results raise the possibility that increased BDNF levels could in part be regulated by a decrease of the antisense BDNFOS RNA.
BDNFOS is a negative regulator of BDNF in an in vitro human cell culture system
The genomic antisense orientation of BDNF and BDNFOS is shown in Figure 2A, where both overlapping and non-overlapping regions are delineated. We have previously demonstrated that perturbation of lncRNA levels at multiple cis-antisense lncRNA–mRNA pairs affects levels of the cognate mRNAs (Katayama et al. 2005). To distinguish whether the lncRNA BDNFOS directly regulates BDNF mRNA levels, we custom-designed three siRNAs targeting human BDNFOS (Figure 2A) and used qPCR to interrogate BDNFOS lncRNA and BDNF mRNA levels after the siRNA transfections. BDNFOS siRNAs were individually transfected into the human neuroblastoma cell line SH-SY5Y by electroporation and caused reproducible BDNFOS knockdown at 24 hr (all three siRNAs) and at 48 hr (only S2). Two of the siRNAs led to knockdown of BDNFOS by >70% (Figure 2B). BDNFOS knockdown by these double-stranded RNAs (dsRNAs) consistently led to an increase in BDNF mRNA levels (between 1.5- and 3.5-fold-change), suggesting that the cis-antisense BDNFOS RNA functions as a negative regulator of human BDNF (Figure 2B).
lncRNA genes in gene chains—loci where three or more genes are joined through shared antisense overlaps and bidirectional promoters—are a general property of the mammalian genome (Engstrom et al. 2006). Human BDNFOS is part of a three-gene genomic positional chain: it shares a putative bidirectional promoter with the LIN7C gene at its 5′ end while also encompassing an exonic cis-antisense overlap with BDNF exonic sequences at its 3′ end. BDNFOS knockdown by the same dsRNAs also increased the mRNA levels of LIN7C, suggesting that BDNFOS may negatively regulate other genes at its locus.
Transcriptome-wide profiling of all known human protein-coding and lncRNAs reveals activity-dependent regulatory pairs and networks
Our functional validation of the primate-specific BDNF/BDNOS pair suggests the potential for many more coding/noncoding regulatory relationships across the human genome that may vary as a function of brain activity. Here we utilized these same paired RNA samples from the same seven patients to identify the activity-dependent coding/noncoding interactome. To achieve this, we developed a custom lncRNA microarray, which allowed us to compare transcriptional profiles of lncRNAs to coding RNAs from a commercial genome-wide coding array (Figure 3). This new custom array is based on our previously defined and characterized human lncRNA gene catalog from experimental transcriptome data represented by cDNA and EST sequences in GenBank, totaling 6736 lncRNA transcriptional units (Jia et al. 2010). Our human lncRNA gene catalog is mostly nonredundant with respect to other recently published human lncRNA collections (Figure S2 in File S4). In contrast to our custom lncRNA array, current commercial microarray platforms do not adequately represent many genomically complex loci, including those encoding lncRNA genes and sense–antisense pairs (Orlov et al. 2007; Jia et al. 2010).
Both platforms utilized a dye-flip (Kerr and Churchill 2001a) quadruplicate experimental design to obtain the most accurate statistical comparison of each pair of tissue samples from each patient (Yao et al. 2004; Rakhade et al. 2005; Beaumont et al. 2012). Each within-patient sample pair was analyzed, using the same dye-flip quadruplicate strategy, for both the catalog coding (G4112A) and the custom lncRNA microarray. Differentially expressed genes were identified from both microarray platforms, but using the same strategy. Consistency between arrays was first examined by correlating the fold-change of all differentially expressed protein-coding control genes common to both arrays, which was possible because our 111 epileptic transcriptome genes from the prior protein-coding array work were used as controls on the lncRNA array. We used the average value of the seven probes corresponding to each control gene on the lncRNA custom array. For 140 catalog (Agilent G4112A) coding-array probes corresponding to these 111 genes, Pearson’s correlation coefficient was 0.90, attesting to very high reproducibility between the coding array and the noncoding custom array.
To define a gene as differentially expressed, we required at least one microarray oligonucleotide probe corresponding to that gene to be ≥1.4-fold differentially expressed with FDR ≤5% in a groupwise analysis of all seven patients. These thresholds were selected based on a power analysis using this flip-dye quadruplicate design (Loeb and Beaumont 2009). Using this criterion, we identified 4004 protein-coding genes from the catalog array (1944 upregulated and 2060 downregulated in high-activity areas; File S1). On the lncRNA arrays, 86 of the 111 positive control genes were upregulated, and 1288 lncRNA genes were differentially expressed between high-activity and low-activity neocortical regions (698 upregulated lncRNA genes and 590 downregulated lncRNA genes in high-activity areas; File S2). BDNF was represented on both the coding microarray and, as a brain-expressed known control gene, on the lncRNA microarray. BDNF was upregulated in high-activity tissue from all seven patients according to both our array platforms: coding microarray, median 3.6-fold change; lncRNA microarray, median 2.8-fold change.
To integrate the coding and noncoding transcriptomes of the human neocortex (File S3), we then determined which of the differentially expressed protein-coding genes were encoded by genomic loci overlapping, or adjacent to, the loci which also encoded differentially expressed lncRNA genes as outlined in Figure 3A. Here, we analyzed the entire extent of differential expression for lncRNAs that participate in sense–antisense pairs and in non-overlapping gene neighbor pairs such that one gene in the pair was protein-coding whereas the other gene encoded the lncRNA. Specifically, we identified all cis-encoded gene pairs in which both a protein-coding gene and a noncoding (lncRNA) gene were expressed from the same locus. We refer to these pairs as coding–noncoding gene pairs. We then separated the pairs into two categories—antisense and neighbor—both of which carry the potential for mRNA regulation by a paired lncRNA (Jia et al. 2010; Lipovich et al. 2010). We defined a cis-antisense gene pair as two genes transcribed from the opposite strands of the same locus in a configuration such that at least some sequence in at least one exon overlaps one exon of the other gene. We defined a neighbor-gene pair as any gene pair such that the nearest boundaries of two nearby, but nonoverlapping, genes are <10 kb away from one another. In this study, “cis” therefore refers to any same-locus (not necessarily same-allele) regulatory mechanisms, which include antisense-mediated regulation by lncRNAs of protein-coding genes that are encoded in the same locus. “Trans” refers to any regulation involving genes encoded at multiple distinct genomic loci.
Of our lncRNAs differentially expressed at high-activity regions, 290 were members of sense–antisense gene pairs (File S3). We define codifferential expression as a differential expression profile of two genes such that the differential expression of one gene is either inversely or directly correlated with the differential expression of the other gene across multiple sample pairs, each of which originates from a different patient and all of which are statistically significant. Only 4 of these 290 mRNA–lncRNA cis-antisense pairs were codifferentially expressed in all seven patients (Figure 3B). Only 1 of the 4 pairs (BDNFOS/BDNF) featured an inverse differential expression profile. The other 3 pairs all had a positive, direct correlation. This is, in fact, not surprising, given the prevalence of synergistic, as opposed to inverse, expression patterns in mammalian cis-antisense pairs in response to a stimulus or to a knockdown (Katayama et al. 2005). These 3 pairs featured lncRNAs cis-antisense to MAPK1IP1L (MAP Kinase 1 Interacting Protein 1-like, potentially a modulator of MAP Kinase 1, whose role centers on the CREB activation pathway upstream of brain activity-dependent gene expression); PURB (purine-rich element-binding protein, a gene expression regulator); and C11ORF96, which we have shown by bioinformatic analysis to be a human homolog of the rat AG2 gene (Matsuo et al. 2000), induced as a consequence of sustained long-term potentiation in vivo in rat hippocampus and therefore implicated in neuronal plasticity. In summary, the protein-coding genes at 3 of the 4 codifferentially expressed lncRNA–mRNA cis-antisense pairs have known neuronal functions centered on synaptic plasticity. The remaining 286 lncRNA–mRNA cis-antisense pairs did not have any correlation between the mRNA and the lncRNA, within each antisense gene pair, across the seven patients.
A total of 276 lncRNAs differentially expressed at high-activity brain regions resided at genomic loci where a protein-coding gene and an lncRNA gene were non-overlapping but within 10 kb of each other along the genome (Jia et al. 2010). However, only four mRNA–lncRNA neighbor-gene pairs were codifferentially expressed in the groupwise analysis of the seven patients (Figure 3B). These four codifferentially expressed neighbor-gene pairs contained lncRNA genes neighboring the protein-coding genes ARC (activity-regulated cytoskeleton-associated), a key regulator of neuronal receptor endocytosis required for both synaptic plasticity and long-term memory; L-plastin (LCP1), relevant to the activity-dependent MAPK/CREB activation by its placement within the human MAPK interactome (Bandyopadhyay et al. 2010); SMEK2, a regulatory subunit of Ser/Thr phosphatase 4; and CYR61, a secreted protein that associates with the extracellular matrix and the cell surface, regulates Akt activation (Goodwin et al. 2010), and is differentially expressed in autism (Garbett et al. 2008).
Of the 1288 lncRNA genes determined by our custom microarray to be differentially expressed at high-activity areas of the human neocortex, 846 remain largely refractory to functional interpretation as they were not genomically near, or antisense to, any known protein-coding genes. These include the lncRNA MALAT-1, originally discovered as a predictor of metastasis and survival in lung cancer (Ji et al. 2003) and now known to be a regulator of several synaptic genes (Bernard et al. 2010; Lipovich et al. 2010). Some are key components of specific nuclear bodies, while other lncRNAs regulate imprinting genes and still others perform essential catalytic roles (Bernard et al. 2010; reviewed in Lipovich et al. 2010). Differential expression of 5 of these known nuclear RNAs (Figure 3B, bottom) was significant. MIAT, the sole member of this group that was downregulated in the more active areas delineates a novel neuronal nuclear domain (Sone et al. 2007) shown to be both a direct target and a putative co-activator of the transcription factor Oct4 (Sheik Mohamed et al. 2010). In contrast, the levels of the other 4 known nuclear lncRNAs were increased in the more active areas. These lncRNAs included: KCNQ1OT1, which may regulate imprinting by recruiting the DNA methyltransferase DNMT1 to differentially methylated regions (Mohammad et al. 2010); RPPH1, the catalytic-RNA component of RNase P, essential for tRNA 5′ end maturation and for regulating Pol III-dependent tRNA transcription (Jarrous and Reiner 2007); NEAT1, an essential component of nuclear paraspeckles that suppresses the nucleocytoplasmic export of Alu-containing RNAs; and NEAT2 (MALAT-1), an essential component of nuclear speckles and a regulator of synaptic genes (Bernard et al. 2010).
We also used a second unbiased approach to identify activity-dependent lncRNAs with potential importance in synaptic plasticity transcriptional regulatory networks. We have shown that a number of coding genes, including EGR1, EGR2, and FOS, are expressed in human brain in direct relation to the degree of epileptic discharges (Rakhade et al. 2007). Using co-expression clustering of protein-coding genes, we identified these and 8 additional genes (13 total) that have the same pattern of expression across the seven patients and then identified 26 lncRNAs whose pattern of expression correlated with this group of coding genes. Figure 4, constructed from our coding/noncoding transcriptome quantitation integration by Cytoscape software (Smoot et al. 2011), illustrates co-expression of these 26 differentially expressed lncRNA genes with the 13 differentially expressed protein-coding genes. In Figure 4, genes that are closer together are more tightly linked. While it appears that there are two linked clusters of coding vs. noncoding genes, this apparent separation may in fact be due to the overall lower level of expression of the lncRNAs compared to the mRNAs (by a factor of almost 80-fold). Some of these lncRNAs, such as NEAT1, are not at or near known coding-gene loci (Figure 3), while others are adjacent to known genes that are not differentially expressed. IL8RBP, a complex mosaic lncRNA transcript that combines unique upstream exons with an IL8RB pseudogene downstream exon, is differentially expressed, although its parental gene IL8RB and the RUFY4 known gene that the pseudogene overlaps, are not detectable above background in the same samples on our protein-coding gene arrays.
Time-dependent patterns of lncRNA–mRNA codifferential expression with chronic depolarization of cultured human neuronal cells
To facilitate the study of primate-specific, activity-dependent, coding–noncoding regulatory networks, we developed an in vitro system of repeated depolarization using the human SH-SY5Y neuroblastoma cell line. Depolarization with supraphysiological concentrations of KCl has been used extensively as a means to study CREB activation and downstream transcription in neuronal cells in culture (Sheng et al. 1990; Connolly and Kingsbury 2010). Figure 5A shows that, while a single treatment of these cells with 100 mM KCl leads only to transient CREB activation (CREB phosphorylation, detectable by the pCREB Ser106 antibody), repeated 5-min exposures with KCl separated by 2-hr intervals lead to more sustained CREB activation, similar to that observed in highly spiking human neocortex (Beaumont et al. 2012) and in an animal model of frequent interictal spiking (Barkmeier et al. 2012) (Figure 5B).
We then compared gene expression over a 48-hr time course together with BDNF and BDNFOS expression by qPCR. Repeated depolarization leads to a marked and more sustained increase in EGR1 activation (a marker of epileptic activity in the brain) (Figure 5B, bottom panel). Even though EGR1 goes up within 4 hr, both BDNF and BDNFOS are initially downregulated. However, whereas BDNF and BDNFOS almost return to baseline levels 24 hr after a single depolarization, cultures that were repeatedly depolarized show a small but significant increase in BDNF expression, while BDNFOS remains downregulated at 24 hr. Importantly, the 48-hr time point extends this trend, maintaining the sustained increase in BDNF expression and demonstrating an even stronger downregulation of BDNFOS. While this is likely an oversimplification of a dynamic and complex set of regulatory interactions unrelated to BDNFOS, the chronic depolarization-induced reciprocal BDNFOS depletion and BDNF mRNA increase in culture parallel the inverse relationship between BDNFOS and BDNF in high-activity areas of the human brain shown in Figure 1. Therefore, we applied this cell culture system to interrogate the activity-dependent expression patterns of other lncRNAs and their cis-encoded partner mRNAs from Figure 3.
In Figure 5C, chronic depolarization of SH-SY5Y cells revealed a number of distinct time-dependent patterns. Unlike the cis-antisense BDNF/BDNFOS pair, the AF086035-MAPK1P1L cis-antisense pair had an opposite effect of decreasing lncRNA and increasing mRNA levels early in the time course, although by 24 hr the lncRNA level showed a slight increase. The BC013641-ARC neighbor-gene pair displayed increased expression of both genes at the 4-hr time point in the depolarization-treated Sh-sy5y cells, mirroring the increased expression of both genes in the high-activity human brain. At subsequent time points, ARC mRNA levels decreased back to the pretreatment levels although we observed a sustained elevated level of the BC013641 lncRNA encoded near the ARC gene along the genome (Figure 5C). Since the BC013641 gene is located ∼6 kb from ARC with a divergent genomic orientation relative to ARC we designed two custom siRNA oligonucleotides targeting BC013641. Although both siRNAs knocked down BC013641, only one led to a 1.25-fold increase in the mRNA level of the neighboring ARC gene, suggesting that reciprocal gene expression directionality at lncRNA–mRNA pairs may occur at neighbor-gene loci such as BC013641-ARC, and not solely at sense–antisense loci such as BDNFOS-BDNF (data not shown). The BC047792-PURB cis-antisense pair showed increased expression of both genes, which mirrored the coordinate increase observed in high-spiking brain regions; however, in contrast to BC013641-ARC, expression of both transcripts was maximal at the 8-hr time point and returned to baseline at 24 hr, showing no sustained increase in lncRNA expression. Finally, we looked at the time course of one stand-alone nuclear lncRNA, NEAT1, which has a potentially far-reaching regulatory role. NEAT1 goes up within 4 hr, returns to baseline at 8 hr, but shows some chronic elevated expression at 24 hr.
A primate-specific lncRNA regulatory mechanism for BDNF
A striking feature of the BDNF/BDNFOS locus is the complexity of its genomic landscape, which is highly representative of the genomic properties observed at lncRNA-encoding loci throughout mammalian genomes. In addition to residing in a three-gene chain, with LIN7C sharing its bidirectional promoter and BDNF overlapping its 3′ end, BDNFOS may have emerged in recent mammalian evolution after the primate–rodent divergence. A possible recent origin for this lncRNA gene is supported by two lines of evidence. First, several splice sites of BDNFOS are poorly conserved outside of primates (data not shown), suggesting that the genomic structure of BDNFOS either arose or was modified specifically in primate evolution. This is consistent with our recent finding that lncRNA genes may comprise a majority of primate-specific genes (Tay et al. 2009). Second, there is no evidence for a BDNFOS-like gene between the mouse Lin7c and Bdnf genes in any public mouse cDNA and EST sequence data represented by UCSC Genome Browser transcript-to-genome alignments (not shown); however, a recently identified non-orthologous postional equivalent Bdnf antisense transcript was found in the mouse, suggesting an evolutionarily distinct, but similar, mechanism for lncRNA-dependent regulation of Bdnf (Engstrom et al. 2006; Modarresi et al. 2012). This genomic and evolutionary complexity of the BDNF/BDNFOS locus suggests that functional lncRNAs in the human brain may be characterized by interspecies nonconservation or high divergence of their gene structures. This is of particular interest because of the persistent inverse codifferential expression of the BDNF/BDNFOS gene pair as a function of human brain activity shown here together with the observed increase in BDNF mRNA levels following knockdown of BDNFOS. Our rescue of LIN7C by BDNFOS RNAi indicates that BDNFOS function may, in part, be nuclear and epigenetic. This would be consistent with the recent demonstration that an antisense lncRNA acts epigenetically by modulating target transcription (Tay et al. 2009). A possible explanation for the upregulation of both BDNF and LIN7C via BDNFOS RNAi might involve BDNFOS-mediated PRC2 recruitment to this locus. PRC2 association with lncRNAs (Khalil et al. 2009) makes BDNFOS-PRC2 binding a distinct possibility. Knockdown of BDNFOS, preventing BDNFOS-mediated PRC2 targeting at this locus, would then result in increased LIN7C and BDNF mRNA levels, which is what we observed. BDNF is known to be under epigenetic control: it is activated by acetylation of multiple H3 lysine residues in its promoter chromatin (Tian et al. 2010) and is repressed in vivo by H3K27Me2 (Tsankova et al. 2006), a direct PRC2-catalyzed modification (reviewed in Margueron and Reinberg 2011). Our model for BDNFOS-mediated PRC2 recruitment in LIN7C and BDNF suppression does not contradict the concurrent possibility of cytoplasmic, post-transcriptional BDNFOS-BDNF regulation; in fact, the efficiency of our RNAi knockdowns of BDNFOS implies cytoplasmic localization, since RNA-induced silencing complex (RISC) activity is cytoplasmic. Future work in this field should also clarify whether BDNF mRNA has a reciprocal regulatory impact on BDNFOS.
To determine whether BDNFOS may regulate BDNF transcription, as opposed to splicing or RNA stability, we performed Taqman qPCR with a custom amplicon spanning part of the last exon and part of the last intron of BDNF (hg19::chr11:27,680,112-27,680,229). While there was little change in our ability to detect this amplicon with BDNFOS RNAi S1 and S2 treatment, there was an increase with S3, raising the possibility that the BDNFOS effect is at the level of new transcription (data not shown). However, more work is needed to address the question of whether BDNFOS directly regulates BDNF transcription.
In summary, our findings, which center on the poorly conserved but functional lncRNA BDNFOS, provide a uniquely human view of activity-dependent transcriptional regulatory networks in the brain whose endogenous components cannot be modeled in rodents or other nonprimate species. A set of differentially expressed microRNAs, including miR-30a-5p, act as post-transcriptional inhibitors of BDNF in the prefrontal cortex (Mellios et al. 2008). Our demonstration of the primate-specific lncRNA BDNFOS as an inhibitor of BDNF complements this earlier miRNA work, suggesting that BDNF is targeted by multiple RNA-mediated regulatory mechanisms involving short and long, ancient as well as evolutionarily young, noncoding RNAs.
This is the first study where reciprocal lncRNA–mRNA regulation is inferred from the in vivo human brain in a groupwise analysis of multiple living patients and then validated by RNAi in a human neuronal cell line. Moreover, reciprocal regulation in sense–antisense pairs is an exception rather than the rule (Katayama et al. 2005). Here, we pinpoint an lncRNA, BDNFOS, which, through its ability to regulate BDNF, may be a key novel contributor to epileptogenesis in a locus where future mechanistic analysis is warranted.
Genome-wide integration of coding/noncoding RNAs as a function of human brain activity
Recently, we generated a stringently filtered catalog of human lncRNAs and described the genomic positional relationships between these lncRNAs and protein-coding genes, providing insights into lncRNA functions (Jia et al. 2010). Despite their prominence in the transcriptome, most lncRNAs remain poorly understood, although lncRNAs may contribute to the biological complexity of regulatory networks (reviewed in Mattick and Makunin 2006). Because of their abundance, lncRNAs may be even more important than microRNAs. microRNAs function as post-transcriptional repressors, but lncRNAs have additional mechanisms to positively and negatively regulate cotranscriptional and post-transcriptional alterations in gene expression. Here we have used these insights to develop a custom lncRNA microarray to provide the first genome-wide analysis of human brain lncRNA-based regulatory networks as a function of electrical brain activity. Several co-expressed lncRNA/coding gene pairs identified here have important roles in activity-dependent synaptic plasticity either directly, such as BDNF and others involved in the MAPK/CREB signaling, or indirectly through the expression of regulatory lncRNAs such as MALAT-1 (Bernard et al. 2010). Our focus on the relationship between coding mRNAs and lncRNAs with respect to brain activity is complementary to other human brain transcriptome studies such as those focusing on developmental, regional, and disease-related gene expression patterns (Johnson et al. 2009; Voineagu et al. 2011), but significantly expands upon those studies through our annotation of the human lncRNA transcriptome and its expression-level relationships with specific protein-coding genes.
This genome-wide lncRNA expression survey of electrically active human neocortex has uncovered hundreds of lncRNAs differentially expressed between more and less electrophysiologically active areas of the human neocortex. Of these lncRNAs, 26 are expressed directly in proportion to known activity-dependent genes (Figure 4), and therefore these lncRNAs could represent biomarkers and drug targets for human brain diseases, such as epilepsy (Loeb 2011). The co-expression clustering topology (Figure 4) suggests a network where mRNAs and lncRNAs are linked by previously uncharacterized lncRNA nodes (such as BC009864) as hubs with spoke edges extending simultaneously to multiple mRNAs and other lncRNAs. We also observed eight lncRNA–mRNA cis-antisense and neighbor-gene pairs characterized by coordinated differential expression of both genes in each coding–noncoding pair, suggesting lncRNA-mediated regulation of protein-coding gene expression in the brain, and the even more intriguing reciprocal possibility that some mRNAs may function at the RNA level to regulate lncRNA expression or in bidirectionally regulated feedback loops in cis. Other lncRNAs such as NEAT1 were detected only by the trans-regulation analysis, where we searched for lncRNAs whose expression was highly correlated with protein-coding genes regardless of the genomic mapping location of those coding genes. Although the role of the NEAT2 (MALAT-1) lncRNA from nuclear speckles in synaptic gene regulation is now known (Bernard et al. 2010), our study complements that work by implicating NEAT1, the lncRNA from nuclear paraspeckles that is encoded near the NEAT2 locus, in regulatory interactions with activity-dependent genes in the brain. Our three lines of evidence for activity-dependent NEAT1 function in the neocortex are our detection of NEAT1 as a differentially expressed lncRNA on the custom microarray analysis of human brain samples, our demonstration of activity-dependent NEAT1 expression in depolarized human SY5Y cell culture, and the assignment of NEAT1 as a central node to a co-expression cluster of specific coding and noncoding RNAs (Figure 4). Our cis-regulation and trans-regulation analyses uncovered different, nonredundant sets of lncRNAs, suggesting that specific lncRNAs are involved in both types of regulation, which for any given lncRNA may be mutually exclusive. These results represent the first functional evidence for a remarkably diverse pattern of lncRNA expression in the human brain.
Functionality of coding/noncoding RNA regulatory networks in the human brain
Our microarray results and our qPCR analysis of both the epilepsy patient samples and the recurrent-depolarization SH-SY5Y cell culture time course jointly represent the first demonstration that known lncRNAs are activity-dependent both in vivo and in cell culture. The complex, but similar, pattern of lncRNA–mRNA expression in activated human brain and in a chronically depolarized human neuronal cell line enables the temporal characterization of these regulatory pathways and provides a new system in which to study these complex, primate-specific transcriptional regulatory networks. We previously performed time-course analysis of mammalian cis-antisense pairs in cell culture subjected to a specific stimulus, such as lipopolysaccharide induction of macrophages, revealing a wide diversity of temporal differential expression patterns. These patterns, with concordant or synergistic regulation of the two paired genes, were observed at most of the differentially expressed cis-antisense pairs (similar to our results in Figure 5C). Less frequently, the patterns showed inverse or reciprocal regulation of the two paired genes at a locus, similar to our results in Figure 2B (Katayama et al. 2005).
Fundamental functional roles have been previously established for a relatively few lncRNAs. We show upregulation of three nuclear RNAs—RNase P (RPPH1), NEAT1, and MALAT-1—in high-activity areas of the neocortex. The catalytic-RNA component of RNAse P is essential for the 3′-end cleavage of both NEAT1 (Sunwoo et al. 2009) and NEAT2/MALAT-1 (Wilusz et al. 2008). Therefore, these three lncRNAs may compose an lncRNA-mediated lncRNA maturation network in highly active brain regions (Figure S3 in File S4). The function of this induced network may be to modulate the expression of synaptic genes, such as those whose mRNA levels are regulated by MALAT-1 (Bernard et al. 2010). This RNA-mediated regulatory network may function either independently from, or synergistically with, the MAPK/CREB pathway to regulate activity-dependent gene expression.
While BDNF regulation by BDNFOS bolsters previous precedents for reverse-genetic approaches to functional validation of lncRNA genes (Sheik Mohamed et al. 2010), further mechanistic studies of the novel regulatory lncRNAs will be needed to delineate the functions of these widely heterogeneous lncRNAs. For example, it is important to distinguish nuclear epigenetic from post-transcriptional and cytoplasmic regulatory mechanisms of lncRNAs. Such studies should be aided by structural insights into the mammalian lncRNAome, following in the footsteps of existing whole-transcriptome empirical RNA secondary structure delineation methodologies (Kertesz et al. 2010). It has also become increasingly evident that human lncRNAs perform diverse yet crucial functions, one of which is to regulate mRNA stability on a transcriptome-wide scale through repetitive elements embedded in exons of many lncRNAs (Gong and Maquat 2011). Ribonucleoprotein complexes that enable lncRNA function and complexes that facilitate lncRNA-mediated regulation of mRNAs in sense–antisense pairs can be identified by affinity columns and mass spectrometric analysis. This identification will allow therapeutic targeting of lncRNA–mRNA regulatory relationships. Finally, integrated differential expression analysis of the protein-coding and the long noncoding transcriptome represents only a limited entry point into transcriptional regulatory networks underlying activity-dependent gene expression. A comprehensive assessment of this network is possible only if all transcript classes, including mRNAs, lncRNAs, microRNAs, and the recently discovered endo-siRNAs (Smalheiser et al. 2011), are profiled jointly.
The genomic complexity of the human AK093366-AG2 lncRNA–mRNA cis-antisense pair (Figure S5 in File S4) is reminiscent of that observed in the BDNFOS-BDNF lncRNA–mRNA cis-antisense pair. Both the BDNFOS lncRNA gene and the AK093366 lncRNA gene contained primate-specific splice sites. The splice donor of AK093366's sole intron is primate-specific because it is harbored within an AluJb repeat. Alu repeats are the best-known class of primate-specific interspersed repeats (Greally 2002), and therefore key gene structure elements, including splice sites, contained within Alu repeats provide direct evidence that the corresponding gene structures either arose or were modified after the mammalian radiation, specifically in the primate lineage. Notably, EST-supported cis-antisense lncRNA transcription of the Alu-containing AK093366 transcriptional unit extends substantially beyond the UCSC C11ORF96 (AG2) gene model and well into the C11ORF96 ORF. This underscores the utility of EST data, much of which remains unincorporated into reference gene models and annotations in delineating the boundaries of lncRNA genes, including those involved in putative regulatory relationships with protein-coding counterparts. While Alu-containing lncRNAs have recently been implicated in the in trans post-transcriptional regulation of gene expression via effecting mRNA decay (Gong and Maquat 2011), our analysis suggests distinct cis-regulatory roles in overlapping-gene regulation for certain Alu-containing lncRNAs—specifically, AK093366. Two of the four codifferentially expressed lncRNA–mRNA cis-antisense pairs in the human neocortex, BDNFOS-BDNF and AK093366-AG2, thus feature primate-specific sequence at lncRNA gene splice junctions. The protein-coding genes BDNF and AG2 are overlapped by endogenous antisense lncRNAs containing exonic Alu repeats, and these gene pairs are codifferentially expressed in active areas of the human epileptic neocortex.
BDNFOS-mediated regulation of BDNF provides initial evidence that primate-specific regulation of conserved protein-coding genes by cis-antisense lncRNAs takes place in epilepsy, a complex human brain disorder. The co-expression of mRNAs and nonconserved lncRNAs at loci other than BDNF, including AG2, suggests that primate-specific regulation of conserved genes by nonconserved lncRNAs in the human brain may not be unique to the BDNF locus.
We thank J. B. Brown, Department of Statistics, University of California, Berkeley, for helpful comments on the manuscript. This work was funded by National Institutes of Health (NIH)/National Institute of Neurological Disorders and Stroke grants R01NS045207 and R01NS058802 (J.A.L.) and by internal funds from Wayne State University (L.L.). Microarray development was supported by NIH/National Institute on Drug Abuse grant 1R03DA026021-01 (L.L.). Microarray scanning was performed by the Core Facility of the Department of Pediatrics, School of Medicine, Wayne State University.
Communicating editor: M. Johnston
- Received February 9, 2012.
- Accepted August 24, 2012.
- Copyright © 2012 by the Genetics Society of America