A long-term series of experiments to map QTL influencing wood property traits in loblolly pine has been completed. These experiments were designed to identify and subsequently verify QTL in multiple genetic backgrounds, environments, and growing seasons. Verification of QTL is necessary to substantiate a biological basis for observed marker-trait associations, to provide precise estimates of the magnitude of QTL effects, and to predict QTL expression at a given age or in a particular environment. Verification was based on the repeated detection of QTL among populations, as well as among multiple growing seasons for each population. Temporal stability of QTL was moderate, with approximately half being detected in multiple seasons. Fewer QTL were common to different populations, but the results are nonetheless encouraging for restricted applications of marker-assisted selection. QTL from larger populations accounted for less phenotypic variation than QTL detected in smaller populations, emphasizing the need for experiments employing much larger families. Additionally, 18 candidate genes related to lignin biosynthesis and cell wall structure were mapped genetically. Several candidate genes colocated with wood property QTL; however, these relationships must be verified in future experiments.
A continuous, as opposed to discrete, distribution of phenotypic values is a feature of many traits important to animal and plant breeding, as well as many traits impacting human health. Variation in these quantitative, or complex, traits is influenced by multiple genetic loci with relatively small effects coupled with environmental and epistatic interactions. Quantitative trait locus (QTL) mapping is a well-developed discipline that dissects the inheritance of complex traits into discrete Mendelian genetic factors. The number and location of chromosomal regions affecting trait variation and the magnitude of their effects can be determined by associating genotypes with phenotypes in a segregating population. In a limited number of cases, QTL mapping has identified genetic markers suitable for the improvement of breeding populations by marker-assisted selection (Bernacchi et al. 1998; Hardin 2000) as well as genes with direct influence on phenotype (Cormier et al. 1997; Frary et al. 2000; Steinmetz et al. 2002).
Numerous factors influence the ability to detect a QTL. Using computer simulations, Beavis (1994) first showed the impact of sample size. Small segregating populations (100-200 progeny) typical of many QTL mapping experiments resulted in the detection of few loci with disproportionately large effects on phenotype and with poor congruence of QTL position between simulations. These findings were supported empirically in maize by Melchinger et al. (1998) and by Utz et al. (2000). In addition, genetic background, environment, and interactions among QTL affect QTL detection. For example, the expression of a QTL in long-lived organisms, such as perennial plants, is likely to be modified by changing biotic and abiotic factors on a seasonal or yearly basis. With particular reference to sample size, many experimental studies have ignored these limitations (Utz et al. 2000), which has left an incomplete portrayal of the genetic architecture of many complex traits. Larger, replicated follow-up experiments that sample time, space, and genotypes are required. Furthermore, these experiments provide the opportunity to verify QTL detected in earlier experiments to support an underlying biological basis for observed marker-trait associations.
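The sample-size effect described above can be illustrated with a minimal simulation sketch. This is not the procedure of Beavis (1994): the number of loci, the 3% of variance assigned to each locus, the fixed cutoff of F > 6.63 (roughly P < 0.01 with 1 d.f.), and all function names are assumptions chosen only for illustration. Loci are simulated as unlinked, with a two-class genotype coding.

```python
import math
import random

def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)

def mean_detected_r2(n_progeny, n_loci=10, var_per_locus=0.03,
                     f_cutoff=6.63, n_reps=40, seed=1):
    """Mean estimated R^2 among loci that pass the cutoff (hypothetical setup).

    Total phenotypic variance is scaled to 1, so each locus truly explains
    var_per_locus of it. Genotypes are coded 0/1 with equal probability.
    """
    rng = random.Random(seed)
    effect = math.sqrt(var_per_locus / 0.25)          # a 0/1 genotype has variance 0.25
    resid_sd = math.sqrt(1.0 - n_loci * var_per_locus)
    detected = []
    for _ in range(n_reps):
        geno = [[1 if rng.random() < 0.5 else 0 for _ in range(n_progeny)]
                for _ in range(n_loci)]
        pheno = [effect * sum(g[i] for g in geno) + rng.gauss(0.0, resid_sd)
                 for i in range(n_progeny)]
        for g in geno:
            r2 = pearson_r2(g, pheno)
            f_stat = r2 * (n_progeny - 2) / (1.0 - r2)
            if f_stat > f_cutoff:                      # locus declared "detected"
                detected.append(r2)
    return sum(detected) / len(detected) if detected else float("nan")

if __name__ == "__main__":
    for n in (100, 1000):
        print(n, round(mean_detected_r2(n), 3))
```

Under these assumptions, the loci that pass the cutoff in a family of 100 necessarily carry estimated effects well above the simulated 3%, whereas estimates from a family of 1000 stay much closer to the simulated value.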
Among forest trees, QTL mapping has focused on wood properties and traits related to adaptation and growth (reviewed in Sewell and Neale 2000). In loblolly pine (Pinus taeda L.), the leading timber species of North America, the physical and chemical properties of wood have been studied extensively in a single population of modest size (Sewell et al. 2000, 2002). The physical properties of wood, which have a major influence on the quality and end use of sawed lumber, include wood-specific gravity (wsg) and microfibril angle (mfa). The chemical properties of wood, which impact the pulping process, are largely determined by the relative amounts of cellulose, hemicellulose, and lignin. Biologically, wood is essentially a matrix of cell walls and the lumen of secondary xylem (Megraw 1985). It can be considered as the end product of the collective action of many genes modulating the morphology and composition of secondary xylem cell walls in response to environmental and developmental signals. Consistent with this hypothesis, Sewell et al. (2000) identified 39 QTL for wsg and 7 for mfa, each accounting for 5.4-15.7% of the phenotypic variance.
Verification of the findings of Sewell et al. (2000, 2002) is required to assess how well the genetic architecture of wood properties in loblolly pine has been described. True QTL verification implies complete replication of all experimental parameters. For many conifer species, low to moderate rates of clonal propagation, coupled with environmental heterogeneity inherent in large test sites, complicate the design of such experiments. In this report, QTL verification is defined as the repeated detection, at a similar position on the genetic map, of a QTL controlling a trait under more than one set of experimental conditions. The goals of this study were (1) to perform a QTL analysis on a larger (N = 457), independent population from the same pedigree as the original QTL detection population to verify the existence of previously detected QTL; (2) to obtain accurate estimates of the percentage of phenotypic variation explained by the QTL; (3) to compare QTL detected in an unrelated pedigree with QTL in the original detection pedigree to determine if a similar suite of QTL is expressed and if additional QTL are revealed; and (4) to identify positional candidate genes underlying wood property phenotypes by genetic mapping and colocation with QTL influencing wood properties.
MATERIALS AND METHODS
Mapping populations: Three populations from two three-generation outbred pedigrees of loblolly pine were considered (Table 1). QTL influencing wood properties were initially identified in the detection population of the QTL pedigree, which consists of 172 progeny located at six sites in Oklahoma and Arkansas (Groover et al. 1994; Sewell et al. 2000, 2002). The verification population consists of 457 progeny derived from remating the parents of the QTL pedigree. The unrelated population consists of 445 progeny derived from the base pedigree (Devey et al. 1994). Both the verification and unrelated populations were established by Weyerhaeuser in 1994 on adjacent sites near New Bern, North Carolina. A smaller population of the base pedigree and the detection population are reference populations used for routine genetic mapping in loblolly pine (http://dendrome.ucdavis.edu/Synteny/refmap.html).
Genotypic data and map construction: Methods pertaining to restriction fragment length polymorphism (RFLP) and expressed sequence tag polymorphism analyses in loblolly pine followed Devey et al. (1991) and Temesgen et al. (2001), respectively. Evenly spaced markers were chosen for each mapping population, with preference given to those that segregated in both parents (i.e., fully informative markers). Sex-averaged linkage maps of the verification population and the unrelated population were constructed from genotypic segregation data using Joinmap 1.4 (Stam 1993) according to Sewell et al. (1999). A consensus map, which allowed the relative placement of QTL onto a single integrated map, was constructed according to Sewell et al. (1999).
Phenotypic measurements: A 5-mm radial wood core was taken for each progeny of the verification and unrelated populations at ∼1.4 m above ground and cropped at the pith and outer ring. Earlywood and latewood measurements for a variety of physical wood properties (Table 1) were determined for either one or three growth rings (rings 4-6 from the pith, respectively) and their averages were calculated (composite traits) as described in Sewell et al. (2000). Rings 4-6 represent predominantly juvenile wood in loblolly pine, which requires 7-10 years of growth before the onset of mature wood production (Megraw 1985). A number of cores were excluded due to an excess of compression wood; as a result, the actual number of phenotypic data points per trait ranged from 381 to 409 in the verification population and 408 to 434 in the unrelated population. A 12-mm core was also taken from 280 progeny of only the verification population for analysis of chemical properties of earlywood and latewood in the fifth growth ring according to Sewell et al. (2002). Models for projection to latent structures (PLS) were used to predict the chemical composition of cell walls (i.e., α-cellulose, lignin, galactan, xylan, and mannan content) from pyrolysis molecular beam mass spectrometry data using multivariate statistics (Davis et al. 1999). PLS estimates from two independently analyzed subsamples were averaged and used as traits in QTL analysis. Chemical properties were measured by chemical content per unit weight rather than per unit volume. Because wood is ∼97% lignin and holocellulose (i.e., α-cellulose and hemicellulose), an inverse relationship exists between lignin and cellulose content on a per-unit-weight basis. As a result, an increase in lignin content could actually be due to a reduction in α-cellulose and vice versa. Therefore, the QTL detected are described as cell wall chemistry (cwc) traits (Table 1) rather than as QTL associated with any specific wood chemistry component (Sewell et al. 2002).
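The per-unit-weight argument can be made concrete with a small worked sketch. The component masses below are hypothetical values chosen for illustration and are not measurements from this study; only the ∼97% share of lignin plus holocellulose is taken from the text.

```python
def weight_fractions(masses):
    """Convert component masses to fractions of total dry weight."""
    total = sum(masses.values())
    return {name: mass / total for name, mass in masses.items()}

# Hypothetical masses (arbitrary units); lignin + holocellulose = 97% of the total.
before = {"lignin": 28.0, "alpha_cellulose": 42.0, "hemicellulose": 27.0, "other": 3.0}

# Only the alpha-cellulose mass is reduced; the lignin mass is unchanged.
after = dict(before, alpha_cellulose=36.0)

frac_before = weight_fractions(before)
frac_after = weight_fractions(after)

if __name__ == "__main__":
    print(round(frac_before["lignin"], 3), round(frac_after["lignin"], 3))
```

In this sketch the lignin fraction rises even though no additional lignin was deposited, which is why a QTL detected for one component on a weight basis cannot be assigned to that component alone.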
QTL analysis: Associations between segregating genetic markers and phenotypic variability for wood property traits in the verification and unrelated mapping populations were detected using the interval mapping approaches of Knott et al. (1997) and QTL Express (Seaton et al. 2002), a World Wide Web-based interface for the method of Haley et al. (1994). A minor modification to the genotype file was required to enable running the F2 QTL Analysis Servlet of QTL Express with data from a three-generation outbred pedigree (C. S. Haley, personal communication). Estimated QTL positions and associated F-statistics were identical between software packages.
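The general idea of regression-based interval mapping can be sketched as follows. This is a deliberately simplified illustration, not the model implemented in QTL Express or described by Knott et al. (1997): it follows the alleles of a single parent only (coded 0/1 by grandparental origin), assumes the Haldane mapping function, and fits one regression coefficient, whereas the outbred model fits paternal, maternal, and interaction effects. All function names are hypothetical.

```python
import math

def haldane_r(d_cm):
    """Recombination fraction for a map distance in cM (Haldane, no interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))

def qtl_prob(m1, m2, d1_cm, d2_cm):
    """P(Q = 1 | flanking alleles m1, m2) for a position d1 cM from M1 and d2 cM from M2."""
    r1, r2 = haldane_r(d1_cm), haldane_r(d2_cm)
    p_q1 = (1.0 - r1) if m1 == 1 else r1            # P(Q=1 | M1)
    p_q0 = 1.0 - p_q1
    p_m2_q1 = (1.0 - r2) if m2 == 1 else r2         # P(M2 | Q=1)
    p_m2_q0 = r2 if m2 == 1 else (1.0 - r2)         # P(M2 | Q=0)
    denom = p_q1 * p_m2_q1 + p_q0 * p_m2_q0
    return 0.5 if denom == 0 else p_q1 * p_m2_q1 / denom

def f_statistic(x, y):
    """F ratio (1 and n - 2 d.f.) for the simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0:
        return 0.0
    ss_reg = sxy * sxy / sxx
    ss_res = syy - ss_reg
    return float("inf") if ss_res <= 0 else ss_reg / (ss_res / (n - 2))

def scan_interval(m1, m2, pheno, interval_cm, step_cm=5.0):
    """Regress phenotype on P(Q = 1) at each step; return [(position_cm, F)]."""
    results = []
    pos = 0.0
    while pos <= interval_cm + 1e-9:
        x = [qtl_prob(a, b, pos, interval_cm - pos) for a, b in zip(m1, m2)]
        results.append((pos, f_statistic(x, pheno)))
        pos += step_cm
    return results
```

The position with the largest F ratio along a linkage group is taken as the estimated QTL location; significance is judged separately against a threshold.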
Each linkage group was scanned at 5-cM intervals for locations explaining a high proportion of the phenotypic variance using a one-QTL model interval analysis. Only regions of the genome that exceeded chromosome-wide P < 0.05 (suggestive level) or P < 0.01 (significant level) significance in support of the existence of a QTL are reported. These thresholds were determined by performing 1000 permutations of the data as implemented in QTL Express. Note that these thresholds correspond approximately to genome-wide significance levels of 0.6 and 0.12, respectively, following Bonferroni correction (Lynch and Walsh 1998). Therefore, it is probable that some QTL are false positives, but are reported to the mapping community as recommended by Lander and Kruglyak (1995). To compare these results to QTL observed previously in the detection population, it was also necessary to assess marker-trait associations using the suggestive (0.01 > P > 0.005) and significant (P < 0.005) thresholds employed by Sewell et al. (2000, 2002). Only a few differences in accepting or rejecting the null hypothesis by either method were found; therefore, comparisons between experiments were considered valid and a reanalysis of the detection population was not performed.
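The two threshold ideas in this paragraph, permutation-derived chromosome-wide thresholds and their Bonferroni relation to genome-wide levels, can be sketched as below. This is an illustrative outline rather than the QTL Express implementation; the test statistic (absolute difference between genotype-class means) and all function names are assumptions, and any per-position statistic could be substituted.

```python
import math
import random

def abs_mean_difference(geno, pheno):
    """Hypothetical test statistic: |mean(pheno | g = 1) - mean(pheno | g = 0)|."""
    ones = [p for g, p in zip(geno, pheno) if g == 1]
    zeros = [p for g, p in zip(geno, pheno) if g == 0]
    if not ones or not zeros:
        return 0.0
    return abs(sum(ones) / len(ones) - sum(zeros) / len(zeros))

def permutation_threshold(pheno, geno_by_marker, stat=abs_mean_difference,
                          alpha=0.05, n_perm=1000, seed=0):
    """Empirical (1 - alpha) quantile of the chromosome-wide maximum statistic.

    Phenotypes are shuffled relative to genotypes, which breaks any
    marker-trait association while preserving linkage among the markers.
    """
    rng = random.Random(seed)
    shuffled = list(pheno)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(stat(g, shuffled) for g in geno_by_marker))
    maxima.sort()
    idx = min(n_perm - 1, max(0, math.ceil((1.0 - alpha) * n_perm) - 1))
    return maxima[idx]

def genome_wide_bound(chromosome_alpha, n_linkage_groups=12):
    """Bonferroni upper bound on the genome-wide type I error rate."""
    return min(1.0, chromosome_alpha * n_linkage_groups)

if __name__ == "__main__":
    print(genome_wide_bound(0.05), genome_wide_bound(0.01))
```

With 12 linkage groups, chromosome-wide levels of 0.05 and 0.01 give bounds of approximately 0.6 and 0.12, matching the genome-wide values quoted in the text.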
A two-dimensional analysis at 5-cM intervals was also performed to fit a two-QTL model for each linkage group. Permutation tests have not been implemented for this model in QTL Express and the suggestive and significant levels of Sewell et al. (2000, 2002) were used in determining significance.
The model used to test the effect of QTL alleles as reported in Sewell et al. (2000, Table 3) assumes that the grandparents of each parent have divergent wsg phenotypes. This assumption is not valid for other traits measured for progeny of the detection and verification populations since grandparent phenotypes were not assessed. For the same reason, the unrelated population may violate this assumption. Thus, comparisons of the magnitude and direction of the parental effects and interaction effects among populations are difficult to interpret and these effects are not reported.
Candidate gene mapping: Candidate gene selection emphasized structural genes of phenylpropanoid metabolism involved in monolignol synthesis, including phenylalanine ammonia-lyase (PAL), cinnamate 4-hydroxylase (C4H), coumarate 3-hydroxylase (C3H), caffeate O-methyltransferase, 4-coumarate:CoA ligase (4CL), caffeoyl-CoA O-methyltransferase (CCoAOMT), and cinnamyl alcohol dehydrogenase (CAD). Also included were loblolly pine genes homologous to (1) laccase, a gene potentially involved in polymerization of lignin monomers; (2) three genes involved in supplying methyl groups for lignin biosynthesis via S-adenosyl methionine (SAM), including SAM synthetase (SAMS), S-adenosyl homocysteine hydrolase (SAHH), and glycine hydroxymethyl-transferase; and (3) five genes encoding arabinogalactan proteins (AGPs). These AGPs are abundantly and differentially expressed in differentiating xylem (Loopstra and Sederoff 1995; Loopstra et al. 2000; Zhang et al. 2003) and may play critical roles in wood development. PCR amplification primers can be viewed at the Genetics website (http://www.genetics.org/supplemental).
Primer pairs were designed with CPrimer using individual loblolly pine expressed sequence tags (ESTs) or contig assemblies of EST sequences accessed through the National Center for Biotechnology Information web server. Contig assembly was done using Sequencher (Gene Codes, Ann Arbor, MI). For several genes, it was possible to distinguish different members of a gene family within a contig, which allowed the selective amplification and mapping of individual gene family members. PCR amplification was performed as described previously (Harry et al. 1998) except HotStarTaq DNA Polymerase (QIAGEN, Valencia, CA) and a 60° annealing temperature were used. Genotypic data for markers segregating in any of the loblolly pine reference populations were obtained primarily by denaturing gradient gel electrophoresis according to Temesgen et al. (2001). Genotypic data for SAHH were generated by pyrosequencing (http://www.pyrosequencing.com) with the oligonucleotide 5′-CGGCGAGTATCAAGTT-3′ following PCR amplification. For PAL-1, a segregating banding pattern was obtained after restriction digestion with NlaIII (New England Biolabs, Beverly, MA). cDNA clones encoding an arabinogalactan-like protein (Pta3HZ) and CAD were mapped previously by RFLP analysis (Sewell et al. 1999).
RESULTS
Linkage map construction: The sex-averaged linkage map of the verification population consists of 103 markers distributed across 12 linkage groups (LG) of loblolly pine (2n = 24). The map spans 1305 cM, slightly larger than that of Sewell et al. (2000, 2002). Only one fully informative marker on LG 8 was available. In this case, a sex-averaged map was not constructed and QTL analysis was performed on the individual parental maps. The linkage map of the unrelated population, which covers 890 cM, consists of 73 markers distributed across 10 of the 12 LGs. Marker order on the four parental maps and the consensus map is essentially identical to that published previously (Sewell et al. 1999, 2002; Brown et al. 2001).
QTL mapping: All QTL observed in the detection population using both the one-QTL and two-QTL models were reported in Sewell et al. (2000, Tables 4-7; 2002, Tables 3-4). QTL detected in the verification and unrelated populations can be viewed at the Genetics website at http://www.genetics.org/supplemental. Many of these QTL are independent estimations of the same QTL since many of these traits are highly associated (i.e., the same trait measured from annual rings and the composite trait derived from these rings). Accordingly, unique QTL are defined as the subset of QTL influencing the same traits that map within ∼15 cM of one another (Sewell et al. 2000, 2002). Further interpretations are based solely on these unique QTL.
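The definition of a unique QTL can be sketched as a simple grouping step. The record layout (trait, linkage group, position in cM), the example values, and the rule of measuring the 15-cM window from the first position in each cluster are assumptions made for illustration; the source states only that QTL for the same trait mapping within ∼15 cM of one another are treated as one.

```python
from collections import defaultdict

def unique_qtl(qtl, window_cm=15.0):
    """Collapse QTL for the same trait and linkage group into clusters.

    qtl: iterable of (trait, linkage_group, position_cm) tuples, where
    per-ring and composite estimates of a trait share one trait label.
    Returns a list of (trait, linkage_group, [positions]) clusters.
    """
    groups = defaultdict(list)
    for trait, lg, pos in qtl:
        groups[(trait, lg)].append(pos)

    clusters = []
    for (trait, lg), positions in sorted(groups.items()):
        positions.sort()
        current = [positions[0]]
        for pos in positions[1:]:
            if pos - current[0] <= window_cm:   # still within the window
                current.append(pos)
            else:                               # start a new unique QTL
                clusters.append((trait, lg, current))
                current = [pos]
        clusters.append((trait, lg, current))
    return clusters

# Hypothetical records: three estimates of earlywood wsg and one of latewood wsg on LG 1.
example = [("ewsg", 1, 10.0), ("ewsg", 1, 20.0), ("ewsg", 1, 40.0), ("lwsg", 1, 12.0)]

if __name__ == "__main__":
    for cluster in unique_qtl(example):
        print(cluster)
```

In the hypothetical example, the estimates at 10 and 20 cM collapse into one unique QTL, while the estimate at 40 cM and the latewood estimate remain separate.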
Verification population: A total of 44 unique QTL were detected in the verification population using the one- and two-QTL models, including 10 QTL for earlywood wsg, 8 QTL for latewood wsg, 12 QTL for the percentage volume of latewood (%lw), 4 QTL for latewood mfa, 5 QTL for earlywood cwc, and 5 QTL for latewood cwc. No QTL were detected for earlywood mfa. With the exception of a QTL on LG 5, which accounted for as much as 15.9% of the phenotypic variation in latewood wsg, the percentage of variance explained by each QTL was generally small, ranging from 1.7 to 5.7%. These effects are two- to threefold smaller than those reported by Sewell et al. (2000, 2002) for the same traits and likely represent more accurate estimates of the true QTL effects owing to the larger segregating population analyzed.
Unrelated population: A total of 12 unique QTL for physical wood properties were detected in the unrelated population using the one- and two-QTL models, including 5 QTL for latewood wsg, 5 QTL for %lw, and 2 QTL for latewood mfa. No QTL were detected for wsg or mfa of earlywood. The percentage of the phenotypic variance explained by each QTL was also small, ranging from 1.8 to 4.4%. Fewer QTL were detected in this pedigree in part due to less complete genome coverage. In addition, the grandparents of the unrelated population were not chosen on the basis of divergent phenotypic values and, as a result, fewer QTL may be segregating in the mapping population.
QTL verification: For comparative purposes, unique QTL in the three populations were placed into 15-cM regions of the consensus genetic map of loblolly pine on the basis of the position of homologous flanking markers (Figure 1). In some cases (e.g., LG 7 of the unrelated population), this assignment is only approximate due to insufficient numbers of such markers. The 95% confidence interval of each QTL likely varies between experiments and in many cases will be considerably >15 cM; however, this bin size was chosen for illustrative purposes and in keeping with the definition of a unique QTL used here and previously. QTL verification (i.e., the repeated detection of a QTL) was possible at three different levels: (1) across growing seasons, (2) between mapping populations of the same pedigree, and (3) between unrelated pedigrees.
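The binning and comparison step can be sketched as follows, assuming each unique QTL has already been given a position on the consensus map. The tuple layout, trait labels, and example positions are hypothetical, and fixed bins starting at 0 cM are an assumption; the source describes placement relative to homologous flanking markers.

```python
def bin_key(trait, lg, pos_cm, bin_cm=15.0):
    """Identify the consensus-map bin that holds a QTL for a given trait."""
    return (trait, lg, int(pos_cm // bin_cm))

def repeated_detections(pop_a, pop_b, bin_cm=15.0):
    """Return the QTL of pop_a that share a trait, linkage group, and bin with pop_b."""
    bins_b = {bin_key(t, lg, p, bin_cm) for t, lg, p in pop_b}
    return [q for q in pop_a if bin_key(*q, bin_cm) in bins_b]

# Hypothetical unique QTL as (trait, linkage group, consensus position in cM).
pop_a = [("ewsg", 1, 12.0), ("lwsg", 5, 47.0), ("%lw", 5, 80.0)]
pop_b = [("ewsg", 1, 3.0), ("%lw", 5, 76.0), ("lwsg", 7, 47.0)]

if __name__ == "__main__":
    shared = repeated_detections(pop_a, pop_b)
    print(len(shared), "of", len(pop_a), "QTL repeated")
```

Fixed bins can separate two QTL that lie close together on either side of a bin boundary, so a comparison of this kind is best read alongside the flanking-marker evidence rather than as a strict test.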
Across growing seasons: Within each population, the colocation of QTL detected over multiple annual rings for a given trait represents a form of verification across a developmental gradient. For traits where QTL were estimated in more than one growing season (e.g., wsg and %lw in all populations and mfa in the detection population), 56 of 91 (62%) QTL were detected in more than one ring. For example, the majority of wsg QTL in the verification population were supported by both composite trait and individual ring analyses.
Between mapping populations of the same pedigree: Sewell et al. (2000, 2002) identified 61 unique QTL influencing physical and chemical wood properties in the detection population. Of the 44 QTL detected in the verification population, 12 (27%) are potentially repeated detections of the same QTL (Table 2 and Figure 1). For example, 6 QTL for earlywood wsg were found in similar locations in both populations on LGs 1, 3, 5 (2 QTL), 6, and 12. Despite the increased power of the larger verification population to detect QTL of small effect, fewer QTL than reported in the detection population were found. Several confounding factors may have contributed to these findings, including differences in both the number of annual rings sampled and the test sites for each of the two populations. Sewell et al. (2000) suggested that the onset of mature wood production might induce the expression of a new suite of QTL not detected in juvenile wood. The overall success of QTL verification, therefore, is probably biased downward since not all QTL found in the detection pedigree, in particular those detected in rings 8-11, may be directly comparable to those detected in the juvenile wood cores of the verification population.
Between unrelated pedigrees: Of 12 unique QTL influencing physical wood properties in the unrelated population, 4 (33%) mapped to similar locations in the detection population (Table 2 and Figure 1). These included latewood wsg QTL on LGs 2 and 7 and %lw QTL on LGs 5 and 7. The unrelated population was not phenotyped for cwc traits. It was rare to observe a QTL common to all populations, with only a single %lw QTL on LG 5 being found. Its detection suggests that this QTL is affected less by genetic background (e.g., lack of epistasis) and potentially by genotype × environment interactions than all other detected QTL.
Candidate gene mapping: The map positions of 18 candidate genes with known or putative roles in the biosynthesis of lignin or components of the cell wall are shown in Figure 1. Of particular interest were those candidate genes colocating with QTL verified by repeated detection (Table 2). A laccase mapped near a verified QTL on LG 1 influencing earlywood wsg. C4H, GlyHMT, and Pta14A9 on LG 3 and PtaAGP6 on LG 5 also mapped near verified QTL for earlywood wsg. CCoAOMT mapped to a region on LG 6 containing verified QTL influencing %lw. C3H, 4CL, and PtaAGP4 mapped to a region on LG 7 possibly influencing latewood wsg and %lw. Finally, a member of the SAM synthetase gene family (SAMS-2) mapped to LG 8 near a cluster of QTL affecting latewood wsg and cwc.
DISCUSSION
The size and expense of experiments to verify QTL have proven to be an obstacle to their widespread implementation. In a review of QTL mapping in forest trees (Sewell and Neale 2000), QTL verification was a component of only 2 of 20 experiments described (Wilcox et al. 1997; Frewen et al. 2000). QTL verification is essential to substantiate a biological basis for observed marker-trait associations, to provide precise estimates of the magnitude of QTL effects, and to predict whether a QTL will be expressed at a given developmental age or in a particular environment. Despite a number of confounding factors, these experiments lead to several conclusions regarding the genetic control of wood properties in loblolly pine.
Unlike agronomic crop species, which develop to maturity within a single season, forest trees are long lived, experiencing both seasonal cycles and maturation processes over decades. Half of the QTL influencing wsg and %lw were consistently detected over multiple growing seasons. QTL controlling mfa were equally stable when measured across multiple years. The structural and regulatory genes underlying these QTL may be the primary determinants of the physical properties of juvenile wood whereas QTL detected in only a single year may represent physiological processes activated in response to biotic or abiotic variation. However, it is not known which, if any, of these QTL contribute to mature wood properties. Once these populations have grown sufficiently to ensure the production of mature wood, a second QTL analysis is necessary to determine the consistency of QTL expression at maturity.
The components of wsg, in general, were detected more consistently than those of mfa or cwc in both the verification and unrelated pedigrees. This may be a reflection of high heritabilities for wsg (0.2 < h² < 1; Zobel and Jett 1995), although estimates of h² for mfa and cwc are limited. Between the detection and verification populations, factors in addition to the problem of sampling different annual rings may also have contributed to inconsistent estimates of QTL. First, overlapping but not identical marker sets were used for construction of the genetic maps. Coupled with differences in family sizes, different recombination distances between markers common to both populations were observed in some cases, which may have obscured the orthologous relationship between QTL. Second, statistical significance thresholds are not static but vary among linkage groups and experiments according to sample size, marker density, and the proportion and pattern of missing data, among other factors (Churchill and Doerge 1994). Although our preliminary analyses showed that the different statistical criteria gave rise to similar results, it is possible that at least some of the QTL of suggestive significance in the detection population may arise from type I error. Finally, the impact of the environment on QTL expression and detection remains to be addressed. Forest trees grow under conditions of great environmental heterogeneity, which impacts tree physiology and growth and the properties of wood (Zobel and Jett 1995). Inconsistent QTL detection between environments may reflect differing environmental influences on specific metabolic pathways in the formation of wood. Appropriately designed experiments deploying large amounts of clonal material over multiple test sites need to be performed.
The repeated detection of QTL in unrelated families was difficult. Given the outcrossing mating system of pine and other conifers, this is not surprising since both a QTL and its genetic marker will not be polymorphic in every family. (As a corollary, QTL must be identified in multiple families to account for all genomic regions affecting trait variation.) The populations used for QTL mapping in agronomic crops, such as F2 intercrosses or recombinant inbred lines, are considerably more efficient for QTL detection since nearly all genetic markers and QTL segregate. These differences are apparent when comparing the extent of QTL verification between unrelated genotypes in agronomic crops and loblolly pine. For example, Moncada et al. (2001) reported that 11 of 25 (44%) QTL detected in an interspecific cross of rice were identified previously in similar positions. Lan and Paterson (2001) reported 9 of 17 (53%) QTL controlling plant size in Brassica oleracea were common between F2 populations derived from two different intervarietal crosses, and 27% of QTL were common among three crosses. The existence of QTL that exert major effects on phenotypes, such as those controlling fruit size and shape in tomato (reviewed by Grandillo et al. 1999) and height and flowering time in maize (Lin et al. 1995), has further facilitated QTL verification in agronomic crops.
Successful QTL verification raises the prospect of marker-assisted selection (MAS). Strauss et al. (1992) discussed several serious obstacles to its implementation in forest tree breeding programs. While technical concerns, such as the cost of marker development, have largely vanished, practical considerations remain. The domestication of forest trees has begun only recently; thus, breeding programs contain a wide diversity of germplasm. Unlike crop plants, in which strong linkage disequilibrium (LD) is created by inbreeding or hybridization, breeding populations of forest trees show little LD, giving rise to inconsistent marker-trait associations among genotypes. Plantation site heterogeneity and the extended time and variable climatic conditions that trees experience before harvesting may result in unpredictable QTL effects. Nonetheless, MAS within families for juvenile wood property traits seems feasible, given the modest level of QTL verification between the detection and verification populations. One circumstance in which QTL mapping and MAS may be warranted is when clonal deployment over large areas is anticipated for a small number of highly valued families. Family selection will be possible under two scenarios: if the gene underlying a QTL is identified or if a genetic marker in linkage disequilibrium with the molecular polymorphism causing trait variation at the population level is discovered.
Isolating the gene underlying a QTL is an enormous undertaking even in species with small genomes (e.g., Frary et al. 2000), and it is unlikely that map-based cloning will be used in pines [the C value of loblolly pine is 21-23 pg (Wakamiya et al. 1993)]. A population-based association approach using positional candidate genes is an alternate strategy. Our screen of genes involved in lignin biosynthesis or encoding arabinogalactan-like proteins provided plausible candidates controlling wood property traits. For example, C4H, C3H, 4CL, and CCoAOMT of monolignol synthesis may have influences on wsg. However, the populations used for this analysis are in strong linkage disequilibrium, and, as such, a very large region of DNA including many additional genes is implicated by these findings. Validation of these results is being undertaken in a natural population of loblolly pine, in which linkage disequilibrium between any two sites is minimized due to historical recombination events. A successful association test of single nucleotide polymorphisms in these candidate genes with wood property phenotypes promises to enable family selection at the allele level, regardless of pedigree or family relationships.
The authors thank C. Dana Nelson and three anonymous reviewers for critical reading of the manuscript and Robert Saich for technical assistance. This research was funded in part by the United States Department of Agriculture National Research Initiative Plant Genome grant 96-35300-3719, National Science Foundation grant 9975806, and Department of Energy Agenda 2020 grant DE-AC05-96OR22464.
Communicating editor: A. H. D. Brown
- Received November 1, 2002.
- Accepted April 4, 2003.
- Copyright © 2003 by the Genetics Society of America