The biological basis of voluntary exercise is complex and simultaneously controlled by peripheral (ability) and central (motivation) mechanisms. The accompanying natural reward, potential addiction, and the motivation associated with exercise are hypothesized to be regulated by multiple brain regions, neurotransmitters, peptides, and hormones. We generated a large (n = 815) advanced intercross line of mice (G4) derived from a line selectively bred for increased wheel running (high runner) and the C57BL/6J inbred strain. We previously mapped multiple quantitative trait loci (QTL) that contribute to the biological control of voluntary exercise levels, body weight, and composition, as well as changes in body weight and composition in response to short-term exercise. Currently, using a subset of the G4 population (n = 244), we examined the transcriptional landscape relevant to neurobiological aspects of voluntary exercise by means of global mRNA expression profiles from brain tissue. We identified genome-wide expression quantitative trait loci (eQTL) regulating variation in mRNA abundance and determined the mode of gene action and the cis- and/or trans-acting nature of each eQTL. Subsets of cis-acting eQTL, colocalizing with QTL for exercise or body composition traits, were used to identify candidate genes based on both positional and functional evidence, which were further filtered by correlational and exclusion mapping analyses. Specifically, we discuss six plausible candidate genes (Insig2, Socs2, DBY, Arrdc4, Prcp, IL15) and their potential role in the regulation of voluntary activity, body composition, and their interactions. These results develop a potential initial model of the underlying functional genomic architecture of predisposition to voluntary exercise and its effects on body weight and composition within a neurophysiological framework.
VOLUNTARY exercise levels are considerably variable in mice (e.g., Lightfoot et al. 2004) and humans (e.g., Centers for Disease Control and Prevention (CDC) 2003); may be an important predictor of human health-related quality of life (Booth et al. 2000; Friedenreich and Orenstein 2002); and are influenced by multiple environmental variables, genetic factors, and their interactions (review by Garland et al. 2011b). Furthermore, multiple lines of evidence suggest that the biological basis of voluntary exercise behavior is composed of both ability and motivation (Waters et al. 2008; Meek et al. 2009; reviewed in Knab and Lightfoot 2010; Garland et al. 2011b). These two components are almost certainly not mutually exclusive in their contribution to voluntary exercise, their relative influence may vary among individuals, and each undoubtedly has a genetic basis (Dishman 2008; Bray et al. 2009).
From a neurobiological perspective, exercise in humans and rodents is hypothesized to be self-rewarding (e.g., Brené et al. 2007), potentially addictive (MacLaren and Best 2010), and a highly motivated behavior (Sherwin 1998). For example, Sherwin and Nicol (1996) demonstrated that mice are motivated to engage in wheel running, even when the cost of gaining access to a wheel is increased by requiring shallow water traverses. Not only are mice apparently willing to incur a cost to gain access to a running wheel, but also operant conditioning studies have demonstrated that both rats and mice are motivated to lever press for the opportunity to run (Belke 2006; Belke and Garland 2007). In addition, selective breeding for both elevated endurance capacity in rats and elevated wheel running in mice has resulted in alterations to neurobiological pathways that appear to delay the onset of exercise-induced fatigue (rats: Foley et al. 2006) and increase motivation for wheel-running behavior (mice: Rhodes et al. 2005). Although a detailed understanding of the neurobiology of exercise is still years away, potential mechanistic foundations include multiple brain regions encompassing interactions between neurotransmitters, peptides, and hormones (see figure 1 in Kotz et al. 2008; figure 5 in Garland et al. 2011b). More specifically, pharmacological experiments on mice selectively bred for elevated wheel running have implicated alterations in dopamine function (Rhodes et al. 2005; Knab and Lightfoot 2010; Mathes et al. 2010) and endocannabinoid signaling (Keeney et al. 2008) as underlying the neurobiology of high voluntary exercise.
In addition to voluntary exercise, dopamine and endocannabinoid signaling, among other central nervous system processes, have also been linked to aspects of eating behavior and obesity (Cagniard et al. 2006; Davis et al. 2008; Garland et al. 2011b). The interactions of these redundant neural systems are currently poorly understood (Lenard and Berthoud 2008), but it has been demonstrated in mice (Kumar et al. 2010) and humans (Cai et al. 2006) that food intake and physical activity, both components of energy balance, may be governed by a similar underlying genetic architecture. For example, Mathes et al. (2010) observed significant differential gene expression, with associated changes in the dopamine pathway (D1 and D2 receptors), G-proteins, and adenylate cyclase in mice selectively bred for either high running or obesity relative to a nonselected outbred strain of mice.
Previously, we generated an advanced intercross line (AIL; G4) of mice that is broadly useful for investigation of the phenotypic relationships and genetic architecture of voluntary exercise behavior and related body composition traits. The AIL was created through reciprocal crosses between a line selectively bred for high voluntary wheel running and the inbred strain C57BL/6J (Kelly et al. 2010a). The use of an AIL (as opposed to an F2 population) resulted in a threefold expansion of the genetic map (Kelly et al. 2010b) and led to increased QTL mapping resolution with reduced confidence intervals (C.I.’s) (Farber et al. 2011). The high-runner (HR) line utilized in creation of the AIL is a result of a replicated artificial selection experiment for increased voluntary wheel-running behavior on days 5 and 6 of a 6-day wheel exposure (Swallow et al. 1998). The HR lines have diverged significantly from the control lines with an ∼2.5- to 3-fold increase in total revolutions/day with correlated changes in a number of morphological, physiological, and behavioral traits (reviewed in Swallow et al. 2009; Garland et al. 2011a).
Using this AIL, we have previously reported multiple QTL for voluntary exercise traits including daily wheel running (distance, duration, average speed, and maximum speed), running values averaged across days 5 and 6 of the 6-day test, and the running trajectory (slope and intercept) across the wheel-access period (Kelly et al. 2010b). We have also localized QTL controlling relationships between voluntary exercise, food consumption, and changes in body weight and composition (Kelly et al. 2011). In this report, as a complementary approach to further our understanding of the genetic architecture of exercise and its relationship with changes in body weight and composition, we examined global gene expression profiles in brain tissue. By combining QTL mapping with large-scale gene expression analysis, our primary goal was to further dissect these complex traits and make the selection of candidate genes underlying predisposition loci more efficient (e.g., Pomp 2005). By initially using brain tissue, we attempted to capture the transcriptional landscape relevant to motivational aspects of voluntary exercise. Expression quantitative trait loci (eQTL) were identified, their additive and dominance gene action calculated, and the cis- and/or trans-acting nature determined. Subsets of cis-acting eQTL that mapped under loci for exercise or body composition traits were used to produce a list of potential candidate genes on the basis of their genomic position and/or known function. Combined with correlational and exclusion mapping analyses, these results begin to develop a more detailed description of the underlying genetic architecture of predisposition to voluntary exercise and its effects on body weight and composition within a neurophysiological framework.
Materials and Methods
Population and phenotyping
A G4 population (n = 815) was generated from a reciprocal cross between mice selectively bred for high voluntary wheel running (HR line) and the inbred strain C57BL/6J (B6). Details regarding the creation and phenotyping of the G4 population have been described elsewhere (Kelly et al. 2010a). Only methodological details relevant to the current suite of phenotypes and the corresponding statistical analyses will be described here. Details regarding the final set of SNPs utilized for the QTL analyses (n = 530), with an average genome-wide spacing of 4.7 Mb, have been provided elsewhere (Kelly et al. 2010b), and complete genotypes are provided in the Supporting Information, File S1.
G4 mice at ∼8 weeks of age were assessed for body weight and composition (percentage fat tissue and percentage lean tissue) (EchoMRI-100, Echo Medical Systems, Houston, TX) and then were individually housed with access to running wheels (circumference = 1.1 m) (Lafayette Instruments, Lafayette, IN; model 80850) for 6 sequential days. Voluntary running was recorded electronically at 1-min intervals for 23–24 hr daily. The following daily traits were calculated: distance (total revolutions), time spent running (cumulative 1-min intervals in which at least 1 revolution was recorded), average speed (total revolutions/time spent running), and maximum speed (highest number of revolutions in any 1-min interval within a 24-hr period). In addition, we calculated mean values of the above traits on days 5 and 6 of the 6-day test, and the slope and intercept for regressions of distance, time, average speed, and maximum speed across the 6 days of wheel exposure. After body weight and composition measures were taken on day 6, mice were decapitated and brains were immediately harvested, placed on a chilled aluminum block, separated into left and right hemispheres, flash-frozen in liquid nitrogen, and stored at −80°. Percentage change in body mass in response to 6 days of voluntary wheel running was calculated as [(pre-wheel mass – post-wheel mass)/pre-wheel mass] × 100. Percentage change (after wheel access) in percentage body fat (and lean) was calculated as [(% post-wheel access – % pre-wheel access)/% pre-wheel access] × 100. All procedures were approved by and were conducted in accordance with the guidelines set forth by the Institutional Animal Care and Use Committee at the University of North Carolina at Chapel Hill.
RNA isolation and microarray analysis
Total RNA was isolated and purified from a homogenate of the right hemisphere of the brain from a subset of the G4 population using TRIzol (Invitrogen, Carlsbad, CA). The subpopulation (n = 244) was utilized due to financial limitations in running microarrays and represented the population-wide variation in running distance. Among the subpopulation, each of two parent-of-origin types [whether a G4 individual was descended from a progenitor (F0) cross of HR♀ × B6♂ or B6♀ × HR♂] and sexes were equally represented, as these two factors have known effects on wheel-running behavior and body composition-related traits (see Kelly et al. 2010a). Within each of the four subpopulation categories, the 62 individuals were nonrandomly chosen to represent the entire distribution of running distance (i.e., low, moderate, and high runners) observed across each subpopulation and across the full population. Four individuals were removed from the final analyses due to a lack of genotype information.
Following isolation and purification, RNA quality and quantity was determined by spectrophotometry (NanoDrop ND-1000, NanoDrop Technologies, Wilmington, DE) and bio-analysis (Genomics and Bioinformatics Core Facility, Lineberger Comprehensive Cancer Center, University of North Carolina School of Medicine, Chapel Hill, NC). Expression profiles were generated for 45,281 transcripts using the MouseWG-6 v2.0 Beadchip (Illumina, San Diego) and processed utilizing Microarray Services (Expression Analysis; Durham, NC). Following Gordon et al. (2010), transcript expression profiles were normalized using Loess-Quantile normalization methods with R v 2.8.1 statistical software (R Development Core Team, 2008; http://www.r-project.org, lumi package) (see File S2). Transcripts with a detection score ≥0.95 were considered to be expressed above background and utilized for correlation and eQTL (Burns et al. 2010; Gordon et al. 2010; O’Leary and Osborne 2011).
Phenotypic correlations were calculated using SAS (version 9.1, SAS Institute, Cary, NC) between all genes significantly expressed above background levels and exercise and body composition phenotypes previously measured in the G4 population. In an attempt to determine if sex or parent-of-origin type were potentially underlying significant correlations with transcript levels, partial correlations were calculated adjusting for both factors (Kelly et al. 2010b, 2011). For all correlation analyses, adjustments for multiple comparisons were performed in SAS using the false-discovery-rate procedure controlling the overall type I error rate at 5% (Curran-Everett 2000).
eQTL were identified using the multiple imputation method (Sen and Churchill 2001) within R/qtl (Broman et al. 2003) for the R environment. Statistical models included effects of sex and parent of origin. Permutation (n = 1000) tests of 100 randomly selected transcripts yielded an average significance threshold of a log of the odds (LOD) > 3.8 (an approach similar to van Nas et al. 2010). This average threshold was calculated using R/qtl, which assumes an F2 population structure and therefore independence among individuals. We acknowledge that our approach is only one of potentially many and that the threshold estimates may be imperfect in the context of available techniques. As new approaches that are more computationally efficient (e.g., Zhang et al. 2012) and applicable to the current data set become available, thresholds computed here may be revisited.
Due to our more complex breeding history (G4 as opposed to F2), some adjustment of the LOD threshold is warranted (Valdar et al. 2009). The subpopulation utilized here (n = 248) was composed of 54 unique families with an average of four individuals representing each family (median = 4; minimum = 1; maximum = 11). For comparison, 57 unique families (average = 13 individuals per family; median = 13; minimum = 4; maximum = 28) were represented in the overall QTL mapping population (n = 815). Hence, we did not statistically account for family structure as in previous studies (see Kelly et al. 2010b, 2011). Rather, given the reduced population size, short number of intercrosses, and the relatively well-balanced mating design in our G4, we chose the expedient approach of setting a higher LOD threshold; eQTL with LOD ratio scores >4.3 (P-value < 0.00005) were deemed significant (previously chosen by Schadt et al. 2003). In addition, we chose to highlight only candidate genes with high relative LOD scores or extensive existing functional support (as in the case of arrestin domain containing 4, or Arrdc4). We computed the distance of the mapped location of significant eQTL to the midpoint of the physical location of the gene that each represented. If the distance was ≤10 Mb, eQTL were classified as cis-acting or local eQTL. If the distance was >10 Mb (including on another chromosome), eQTL were classified as trans-acting or distant (following Doss et al. 2005). In addition, the percentage variation explained and the additive and dominance effects of each significant cis- and trans-acting eQTL were estimated in R/qtl (see Kelly et al. 2011).
Exclusion mapping analysis
For a subset of candidate genes uncovered by eQTL analysis, we examined gene regions to determine if the haplotypes of the parental strains (HR and B6) were identical by descent (IBD) utilizing SNPs from the Mouse Diversity array (Yang et al. 2009). We genotyped a subset of representative individuals from the F0 parental strains (n =12, HR; n =1, B6) using the complete Mouse Diversity array containing 623,124 SNPs. We utilized these SNPs to determine (1) whether the interval is homozygous or heterozygous in each sample and (2) whether samples are IBD in each homozygous interval. As previously described in Farber et al. (2011), we determined the frequency of heterozygous calls in windows of 200 consecutive SNPs in each one of the 12 HR individuals independently. We then identified intervals that are IBD among all 12 HR founders and B6 using the same 200-SNP window and threshold (>97% genotype identity) approach used to identify the segregating regions (following Farber et al. 2011).
Following Loess-Quantile normalization, 17,571 (of 45,281) transcripts were identified with a detection score (calculated across all 244 mice) ≥0.95. Non-normalized and normalized transcript files are provided in File S3, File S4, and File S5. Pearson partial correlations were generated between these transcripts and a total of 36 exercise-related phenotypes and 17 traits related to body weight, body composition, and change in body weight and composition as a result of 6 days of exercise. Correlations were adjusted for factors with known phenotypic effects (sex and parent of origin). After adjustment for multiple testing, 348 (∼0.04% of total possible) partial correlations were found to be statistically significant (P ≤ 0.05), indicating putative functional relevance (Figure 1). Relationships between exercise-related traits and transcript levels accounted for the largest proportion (61.8%) of observed significant correlations (Figure 1). Among the exercise traits, running duration represented the largest percentage of significant relationships with transcript levels (48.6%). Collectively, body weight and composition-related traits accounted for 37.3% of significant correlations (Figure 1). Changes in body weight and composition, as a result of 6 days of exercise, represented only 0.006% of significant correlations with transcript levels. Greatest-magnitude correlations between exercise/body composition traits and transcript levels are presented in Table S1.
Notably, adjusted partial correlations revealed that Insig2 was significantly correlated with body mass post exercise (r = −0.33, P = 0.0166; top correlation, see Table S1), with running duration on day 1 (r = 0.30, P = 0.0362), with the trajectory of running duration across all 6 days (r = −0.34, P = 0.0084), and with the intercept of running duration (r = 0.32, P = 0.0084).
In total, 2441 cis-acting and 2164 trans-acting statistically significant eQTL were observed (Figure 2, A and B). Cis-acting eQTL were distributed genome-wide. The average LOD score for cis-acting eQTL was 18.7 (range = 4.3–108.2), while for trans-acting eQTL the mean LOD score was 15.0 with a range of 4.3–92.9. Among 100 randomly selected, statistically significant, trans-acting eQTL the average size of the 95% C.I. (defined by one LOD drop) was 15.9 Mb (median = 15.3 Mb) with an average of 30.5 (median = 24.0) transcripts physically residing with the C.I.’s. Among cis-acting eQTL, the median distance of the mapped location to the midpoint of the physical location of the gene was 1.94 Mb, and the distance was negatively correlated with the significance level (Figure 2C). For comparison, in early generations of a mouse recombinant inbred strain panel, the pre-Collaborative Cross, the median liver eQTL-gene distance was 0.92 Mb (Aylor et al. 2011). In addition, in an outbred mouse population (MF1 stock; Harlan, Indianapolis), the median distance of eQTL peak markers from the physical gene location was 0.67 Mb, and for 25% of the genes eQTL-gene distance was <300 kb (Ghazalpour et al. 2008). Trans-acting eQTL were identified on all chromosomes, with a potential master regulatory region observed on the distal end of Chr. 1 at ∼170–180 Mb based; 332 trans-acting eQTL mapping to this region (Figure 2B).
Significant cis-acting eQTL individually explained, on average, 26.35% of the total phenotypic variation for mRNA abundance of the underlying transcript, with a median percentage variance explained of 19.63% and a range of 0.45–86.65%. Average additive effects for cis-acting eQTL were almost universally statistically significant (98.4% or 2404 of 2441), with increasing expression as a result of the HR allele comprising ∼42% (n = 998) and increasing expression as a result of the B6 allele comprising ∼58% (n = 1406) of the total. In addition, average dominance effects were generally small with little statistical significance. Furthermore, we observed significant dominance effects only in the absence of significant additive effects for three cis-acting eQTL: BC020354 (Chr. 20, gene location: ∼147.42 Mb; eQTL location: 138.78 Mb); Slc22a17 (Chr. 14, gene location: ∼55.53 Mb; eQTL location: 63.19 Mb); and Arrdc4 (Chr. 7, gene location: ∼75.89 Mb; eQTL location: 76.26 Mb). Uniquely, Arrdc4 mapped under a previously detected QTL (82.6 Mb, C.I. = 75–86 Mb) for running duration on day 1 (of a 6-day exposure to running wheels), which also displayed significant dominance effects in the absence of significant additive effects (Kelly et al. 2010b).
Significant trans-acting eQTL individually explained, on average, 10.38% of the total phenotypic variation for mRNA abundance of the underlying transcript with a median percentage variance explained of 8.74% and a range of 0.01–86.90%. Average additive effects for trans-acting eQTL were generally large and frequently statistically significant (76.7% or 1659 of 2164), with increasing effects as a result of the HR allele comprising ∼49.5% (n = 821) and increasing effects as a result of the B6 alleles comprising ∼50.5% (n = 838) of the total. In addition, average dominance effects were large and statistically significant for 673 eQTL. Among these trans-acting eQTL, dominance was frequently found in the absence of significant additive effects. Trans-eQTL composing the potential master regulatory region located on the distal end of Chr. 1 similarly displayed additive effects that were frequently significant. Significant dominance effects were also noted for 102 of 332 eQTL located in the potential master regulatory region.
Using the larger G4 population of which a subset was analyzed in the present study, we previously identified 39 significant and 18 suggestive QTL representing various exercise traits (Kelly et al. 2010b). Here, we compared the locations of cis-acting eQTL within the C.I.’s (defined by 1 LOD drop) of QTL observed for subsets of the mean exercise traits (distance, duration, average speed, and maximum speed) on days 1 and 2 (Figure 3B) and on days 5 and 6 of the 6-day wheel-access period (Figure 3, A and C–F).
Comparisons between cis-acting eQTL and running distance QTL (mean on days 5 and 6) revealed 30 positional candidate genes on Chr. 7 (Figure 3A and Table S2), all with significant additive effects and little dominance. For running distance on day 1, 20 potential candidate genes were identified under 2 QTL on Chr. 1, with all having significant additive effects with little dominance. On day 2, 17 overlapping candidate genes were identified under one significant QTL (Figure 3B and Table S2). For mean running duration on days 5 and 6, we identified 40 candidate genes on Chr. 7, with 30 overlapping with running distance (Figure 3C and Table S2). The additional 10 candidate genes unique to running duration resulted from an expansion of the C.I.’s for running duration loci (91–129 Mb) as compared to the C.I. for the running distance QTL (99–124 Mb) and also displayed significant additive effects. In total, we observed 50 statistically significant cis-acting eQTL that mapped under the previously identified QTL for an average running speed on days 5 and 6 of the 6-day wheel exposure. Thirty candidate genes were located on Chr. 2 (Figure 3D and Table S2) and the remaining 20 candidate genes were observed on Chr. 14 (Figure 3E and Table S2). Of these, similar to those described above, eQTL generally had large and statistically significant additive effects with little dominance. For maximum running speed (on days 5 and 6), 33 candidate genes were identified on Chr. 2 with 30 overlapping with those identified for running speed (Figure 3D and Table S2). In addition, 53 candidate genes, with significant additive effects, were identified on Chr. 11 for maximum running speed (Figure 3F and Table S2).
In addition to comparisons between cis-acting eQTL and exercise QTL, we examined colocalizing cis-acting eQTL and loci previously identified for change in body weight and body composition in response to exercise. Comparisons between cis-acting eQTL and loci observed for percentage change in body mass, as a result of 6 days of exercise, revealed 18 candidate genes on Chr. 11 (Figure 4A and Table S2). For percentage change in percentage fat mass (as described in Kelly et al. 2011) we identified 36 candidate genes on Chr. 1 and 40 on Chr. 5 (Figure 4, B and C; Table S2). In addition, we observed 27 candidate genes on Chr. 5 for percentage change in percentage lean mass, all of which overlapped with those identified for percentage change in percentage fat mass (Figure 4D and Table S2). In general, eQTL had large significant additive effects with little dominance.
Of the cis-acting eQTL identified above, Insig2 was among the most statistically significant (LOD = 100.0; physical location: ∼124 Mb) and mapped near the peaks of the following previously identified trait QTL: body mass pre-exercise (peak: 116 Mb), body mass post exercise (∼106 Mb), running distance on day 1 (∼113 Mb) and day 2 (∼134 Mb), intercept of running distance (∼123 Mb), running duration on day 1 (∼136 Mb), trajectory of running duration (∼123 Mb), and intercept of running duration (∼134 Mb). Insig2 was also mapped to ∼83.0 Mb on Chr. 7 (trans-acting, LOD = 7.4). In addition, IL15 (physical location: Chr. 8 at ∼84.86 Mb) was mapped (83.9 Mb, LOD = 45.9) under previously identified loci for percentage fat mass and percentage lean mass (post exercise; see Kelly et al. 2011). Two additional cis-acting candidate genes of interest (discussed below) were prolylcarboxypeptidase, angiotensinase C (Prcp: physical location: Chr. 7 at ∼100.08 Mb), which mapped to 93.0 Mb (LOD = 99.5); and Arrdc4 (physical location = Chr. 7 at ∼75.89 Mb), which mapped to 76.3 Mb (LOD = 4.4). Both potential candidate genes mapped under previously identified loci for exercise related traits.
In isolated cases, we characterized the pattern of haplotype diversity in the founders. The haplotype analysis was used to identify if the select candidate genes fell within a region of IBD, making them a lower priority, or in segregating regions, making them a higher priority. We performed a detailed investigation of the cis-acting eQTL for Insig2, IL15, Prcp, and Arrdc4. These loci were chosen because they were among the most statistically significant and/or because they directly mapped under the confidence intervals of a variety of previously identified phenotypic QTL related to exercise and/or body composition. On the basis of the characterization of the pattern of haplotype diversity in the founders, we concluded that Insig2, IL15, Prcp, and Arrdc4 all fell within a segregating region of the genome as opposed to IBD regions.
Only two genes have been identified, on the basis of rodent studies, as potentially strong candidates involved in the regulation of physical activity: dopamine receptor 1 (Drd1) and nescient helix loop helix 2 (Nhlh2) (Lightfoot 2011). Combining QTL mapping with large-scale gene expression analysis (Jansen and Nap 2001), or eQTL mapping, is becoming increasingly commonplace in a variety of organisms (e.g., Druka et al. 2010; Liu 2011; Parts et al. 2011) and has made the selection of candidate genes underlying predisposition loci more efficient. Here we used approaches similar to those described by Lightfoot (2011) to identify candidate genes potentially playing a role in the regulation of voluntary exercise (wheel running), body composition, and their interactions. For specific candidate genes, we discuss these analyses in the context of existing functional studies. To our knowledge, an approach that bridges the gap between the neurophysiology of, and genetic predisposition to, voluntary exercise and body composition-related traits has not been addressed.
Mode of gene action
Here, we examined the percentage of phenotypic variance explained and the average additive and dominance effects of all statistically significant cis- and trans-acting eQTL (n = 4605 in total). While the cis-acting eQTL, on average, explained a much higher percentage (more than twice the amount) of the total phenotypic variation, the range of variance explained by cis- and trans-acting eQTL was very similar (0.45–86.65% vs. 0.01–86.90%, respectively). Some trans-acting eQTL explained a very high percentage of the total phenotypic variation, and this can in part be explained by how we chose to define trans-acting eQTL: as the eQTL located >10 Mb from the gene’s physical midpoint. Therefore, per this definition, derived from statistical effects not biological ones (see Schadt et al. 2003), a trans-acting eQTL may be mapped to the same chromosome and even as close as 10.5 Mb from the physical midpoint of the gene. However, this was certainly not the rule, as the trans-acting eQTL that explained the highest percentage of phenotypic variance (86.90%) were not mapped to the same chromosome as the physical location of the gene. Although average additive effects were generally large across all eQTL, they were large and most pervasive for cis-acting eQTL (98.4% as opposed to 76.7% for trans-acting of the total). Conversely to what was observed for additive effects, dominance appeared to play a lesser role for the cis-acting eQTL, while, for the trans-acting eQTL, dominance effects were more frequently large and statistically significant.
Similar differences, as noted here, in mode of gene action have been previously observed in an F1 hybrid diversity panel of Arabidopsis thaliana (Zhang et al. 2011). Zhang et al. tested 21,803 expression traits (from the fifth or sixth true leaf) with 56,819 genome-wide SNPs and observed a tendency for locally associated SNPs to be additively inherited, while distantly related SNPs were mostly dominant. Zhang et al. (2011) also observed that additive effects were generally larger than dominant ones and more frequently fell into local regulatory regions.
Although the increasing effects resulting from each of the founder alleles (HR and B6) were nearly identical among the trans-acting eQTL, among the cis-acting eQTL increasing effects as a result of the B6 allele were more pervasive. The allelic bias among cis-acting eQTL may potentially reflect probe bias toward the strain-specific reference genome used to design microarray probes or directional detection of functional polymorphisms (see Verdugo et al. 2010). Alternatively, the differences in gene action between HR and B6 may be reflective of different historical selective forces, a chief difference being the artificial selection of the HR strain for increased voluntary activity.
Identification of potential candidate genes
Many genes were identified with multiple lines of evidence implicating them as candidates for exercise QTL; here we discuss selected examples only. While we discuss single candidate genes here, we acknowledge that the combined effects of multiple linked eQTL could potentially cause phenotypic QTL. One highly significant (LOD = 100.0) cis-acting eQTL located on Chr. 1 (∼123.2 Mb), Insig2 (insulin-induced gene 2), colocalized with loci previously identified for exercise and body composition-related traits. Insig2 has been implicated in the functional regulation of lipid and cholesterol metabolism, positively (Herbert et al. 2006; Lyon et al. 2007) and negatively (Dina et al. 2007) associated with human obesity, associated with cholesterol biosynthesis, and characterized as a strong candidate susceptibility gene for total plasma cholesterol levels.
A subnetwork of Insig2 consisted of several additional genes (BC014805, Socs2, and Mod1) also implicated in obesity phenotypes (Cervino et al. 2005). In the present study, a trans-acting eQTL was mapped for Socs2 (physically located on Chr. 10 at ∼94.8 Mb) to Chr.1 at 170 Mb, a locus previously identified for percentage change in percentage adiposity in response to 6 days of wheel running.
Interleukin (IL)-15α (cytokine-specific receptor) knockout mice have been shown to exhibit a leaner phenotype with lower fat composition, elevated home-cage activity and heat dissipation (light and dark phases), greater food intake (light phase only), and raised oxygen consumption (light phase only) (He et al. 2010). In the present study, an eQTL for IL15 was mapped to Chr. 8 (cis-acting, LOD = 45.9) and Chr. 1 (trans-acting, LOD = 5.5), colocalizing with previously identified QTL for body composition and wheel-running traits, respectively. Furthermore, IL15 expression level was significantly correlated with running duration, body mass, and percentage fat mass.
Prcp was found to be a highly significant (LOD = 99.5) cis-acting eQTL mapped to 93.0 Mb on Chr. 7, a region contained within the confidence intervals of previously (see Kelly et al. 2010b) identified QTL for running distance and duration (see Figure 3C and Kelly et al. 2010b). Wallingford et al. (2009) observed that the inhibition of Prcp activity in vivo decreased food intake in wild-type and obese mice, and Prcp-null mice were leaner and shorter than wild-type controls and resistant to high-fat diet-induced obesity (Wallingford et al. 2009). Similar phenotypes have been observed in the HR strain of mice utilized here (Swallow et al. 1999; Kelly et al. 2006; Vaanholt et al. 2008).
A cis-acting eQTL for Arrdc4 was mapped under a previously detected QTL for running duration on day 1, and notably both the QTL and eQTL displayed significant dominance effects in the absence of significant additive effects. Arrdc4 has been shown to alter glucose metabolism (Patwari et al. 2009), and in cancer cells is induced under lactic acidosis and repressed with glucose deprivation (Chen et al. 2010). Notably, HR mice have previously demonstrated adaptive plasticity in GLUT-4 abundance (Gomes et al. 2009), possibly indicating that the relationship between running duration and glucose usage may be facilitated through Arrdc4.
A final candidate, DBY or Ddx3y [DEAD (Asp-Glu-Ala-Asp) box polypeptide 3], is located on the Y-chromosome gene. We did not attempt to map eQTL for DBY (due to a lack of markers on the Y chromosome), but did identify several highly significant correlations between DBY expression levels and exercise and body composition traits (see Table S1). The significant correlations were large and were present only when relationships were not adjusted for sex or parent-of-origin type. Dby mRNA is retained in mouse spermatozoa and important for male zygotic development, possibly implicating a mechanistic role for spermatozoa mRNA during embryonic stages (Yao et al. 2010a,b). Previously, we demonstrated significant parent-of-origin effects on a variety of exercise and body composition traits, with these effects being most pronounced for males in some cases (Kelly et al. 2010a). Given the current results, we hypothesize that DBY, via the early environment of the zygote, may in part explain the large parent-of-origin effects.
Limitations to the current approach
Perhaps surprisingly, genes regulating dopamine signaling were not expressed above background in the present study. While other studies (e.g., Bronikowski et al. 2004; Mathes et al. 2010) have targeted specific brain regions (e.g., dorsal striatum and nucleus accumbens, hippocampus), we chose to use the entire right hemisphere of the brain, which may have caused significant signal dilution and, in effect, led us to miss additional important candidate genes. We acknowledge that an alternative approach would have been to target a specific brain region previously implicated in the regulation of exercise behavior (see Introduction for examples). In our opinion, there is no one particular brain region sufficient to account for the diversity of behavioral and physiological traits measured in the G4 population (e.g., running distance, weight regulation, food consumption). Our compromise was to bisect the hemispheres, run our initial expression assays on one hemisphere, and reserve the remaining hemisphere for potential follow-up studies in a more focused/targeted fashion. This strategy, we believe, takes advantage of the fact that 186 of the top 500 genes expressed in the brain are expressed in all cell types (Lein et al. 2006).
In addition to not utilizing a localized brain region, we also recognize the possible limitations of the current QTL/microarray approach of identifying candidate genes (potential limitations are discussed at length in Verdugo et al. 2010). While we acknowledge these potential pitfalls, they did not restrict our approach to the current eQTL analysis. Therefore, although we provide supporting evidence for the proposed potential candidate genes above, we acknowledge that our results are most strongly relevant to the current methods, and caution should be taken when generally extrapolating these findings. In addition to the support provided here, additional lines of evidence will be needed (described in Lightfoot 2011) to validate the functional role of these candidate genes, especially with regard to voluntary exercise behavior. Regardless, the proximate goal of the integrative genomics approach adopted here is efficient gene discovery relevant to predisposition to engage in voluntary activity and the resultant modification of body composition. The ultimate goal is to use these results to adopt better-informed approaches to using physical activity as a preventative and therapeutic treatment for chronic disease.
We thank Z. Yun for assistance with animal care and data collection; Chris Wiesen, the University of North Carolina’s Odum Institute for Research in Social Science statistical consultation; and Fernando Pardo-Manuel de Villena and Ryan J. Buus, Department of Genetics at the University of North Carolina, Chapel Hill, for assistance with the IBD analysis. This work was supported by National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) grant DK-076050 to D. Pomp. S. A. Kelly was partially supported through a National Institute of Mental Health-funded (5T32MH075854-04) Interdisciplinary Obesity Training program. Phenotypes were collected using the Animal Metabolism Phenotyping core facility within the University of North Carolina’s Clinical Nutrition Research Center (funded by NIDDK grant DK056350).
Communicating editor: S. F. Chenoweth
- Received January 15, 2012.
- Accepted March 20, 2012.
- Copyright © 2012 by the Genetics Society of America