Abstract
Functional genetic redundancy is widespread in plants and could have an important impact on phenotypic diversity if the multiple gene copies act in an additive or dosage-dependent manner. We have cloned four Brassica rapa homologs (BrFLC) of the MADS-box flowering-time regulator FLC, located at the top of chromosome 5 of Arabidopsis thaliana. Relative rate tests revealed no evidence for differential rates of evolution and the ratios of nonsynonymous-to-synonymous substitutions suggest BrFLC loci are not under strong purifying selection. BrFLC1, BrFLC2, and BrFLC3 map to genomic regions that are collinear with the top of At5, consistent with a polyploid origin. BrFLC5 maps near a junction of two collinear regions to Arabidopsis, one of which includes an FLC-like gene (AGL31). However, all BrFLC sequences are more closely related to FLC than to AGL31. BrFLC1, BrFLC2, and BrFLC5 cosegregate with flowering-time loci evaluated in populations derived by backcrossing late-flowering alleles from a biennial parent into an annual parent. Two loci segregating in a single backcross population affected flowering in a completely additive manner. Thus, replicated BrFLC genes appear to have a similar function and interact in an additive manner to modulate flowering time.
DUPLICATION of genes, as chromosomal blocks, individually, or by whole genome polyploidization, is thought to be a major mechanism for creating new genetic and phenotypic diversity. The impact of paralogous genes on diversification is particularly striking in flowering plants where as many as 70% of species, including many of our most important crop plants, show evidence for polyploidy (Masterson 1994). The selective advantage of genetic redundancy is not well understood, but having multiple copies of genes could contribute to phenotypic diversity through the functional divergence of redundant genes. Although examples of this have been discovered, many duplicated genes appear to retain their original function (see Wendel 2000 for review). The lower than expected frequency of duplicate gene silencing (e.g., Nadeau and Sankoff 1997) also suggests that maintenance of duplicated gene function is an important feature in evolution. Force et al. (1999) have hypothesized subfunctionalization as one explanation for retention of duplicated gene function, that is, the accumulation of different mutations in duplicated genes that cause each locus to control only a subset of the original gene's functions. However, by this mechanism, the retention of function would contribute little to phenotypic diversity.
A mechanism by which retention of duplicated gene function could impact phenotypic diversity is if each gene copy contributed to the control of the phenotype in a dosage-dependent manner. Increases in enzymatic activity and gene expression are associated with increasing ploidy (e.g., Roose and Gottlieb 1980; Guo et al. 1996), and a study of Hox group 3 genes in mice found that paralogous loci can act in a dosage-dependent manner to affect phenotype (Manley and Capecchi 1997). In plants, changes of regulatory genes are believed to be particularly important for the diversification of plant phenotypes (Doebley and Lukens 1998; Shepard and Purugganan 2002), and alleles at several key regulatory genes controlling developmental processes are known to interact in an additive manner (e.g., Tb1, Lukens and Doebley 1999; fw2.2, Frary et al. 2000; CO, Koornneef et al. 1991; and FLC, Michaels and Amasino 2000). These additive or dosage-dependent effects at a single regulatory locus could be expanded through gene replication if multiple copies of the genes also interacted in an additive or dosage-dependent manner.
Brassica species, which include several important crops with a wide range of morphologies, are hypothesized to be ancient polyploid relatives of Arabidopsis thaliana (Lagercrantz 1998). A major component of the morphological diversity in Brassica species is variation in flowering time. In A. thaliana, many genes have been identified that control flowering time, but much of the natural variation involves allelic variation at FLOWERING LOCUS C (FLC) or at FRIGIDA (FRI), a regulator of FLC expression (Michaels and Amasino 1999; Sheldon et al. 1999; Johanson et al. 2000). FLC acts in a dosage-dependent manner to delay flowering (Michaels and Amasino 1999; Sheldon et al. 1999). Alleles that cause late flowering produce intermediate phenotypes when heterozygous with early flowering alleles, and transgenic expression of additional FLC genes leads to even later flowering phenotypes (Michaels and Amasino 2000). Diploid Brassica species contain three copies of the genomic region that corresponds to the top of chromosome 5 in A. thaliana (At5) where FLC is located (Osborn et al. 1997; Lagercrantz 1998; Parkin et al. 2002). QTL having large effects on flowering time have been mapped to these genome regions in Brassica rapa (Teutonico and Osborn 1994; Osborn et al. 1997), B. napus (Ferreira et al. 1995; Osborn et al. 1997; Butruille et al. 1999), B. oleracea (Bohuon et al. 1998; Lan and Paterson 2000), B. nigra (Lagercrantz et al. 1996), and B. juncea (Axelsson et al. 2001). Thus, multiple copies of a gene homologous to a flowering-time gene on At5, such as FLC, could contribute to the wide range of variation in flowering time observed in Brassica species.
Axelsson et al. (2001) hypothesized that the QTL that they identified in several Brassica species correspond to homologs of CO, another flowering-time gene at the top of At5 that is involved in photoperiod regulation of flowering (Putterill et al. 1995). Their evidence was based on confidence intervals for QTL and map positions or hypothesized map positions for CO and FLC homologs. Kole et al. (2001) provided strong evidence that VFR2, a flowering-time locus in B. rapa, is allelic to a B. rapa FLC homolog. VFR2 was originally identified as a QTL in a segregating population derived from annual and biennial oilseed B. rapa parents (Osborn et al. 1997). However, after backcrossing the late-flowering allele into the annual parent, the flowering-time effects conferred by VFR2 segregated as a single Mendelian locus that mapped 13 cM from a CO homolog but cosegregated exactly with a FLC homolog (Kole et al. 2001). The effect of the late allele was almost completely additive and was nearly eliminated by 3 weeks of vernalization. These data strongly suggest that VFR2 is a B. rapa homolog of FLC. Tadege et al. (2001) subsequently reported the cloning of five FLC homologs (BnFLC1-BnFLC5) from the allopolyploid B. napus (n = 19) by screening a cDNA library. Expression of these cDNA with a 35S promoter in A. thaliana delayed flowering, and the expression of the five BnFLCs in B. napus was reduced by vernalization. Their results are consistent with B. napus having multiple functional homologs of FLC; however, the total number and origins of FLC homologs and their effects through allelic variation in Brassica species are not known. This information could provide important new insight into the evolution of replicated genes.
In this study we report on the cloning of four genomic FLC genes from the diploid B. rapa (n = 10) and three genes from B. oleracea (n = 9). These genes were compared to each other and to A. thaliana genes by sequence analysis and comparative mapping. Phenotypic effects associated with the four BrFLC sequences were determined by evaluating flowering-time variation in backcross populations segregating for FLC loci individually or in combinations. Our results provide evidence that polyploidy has contributed to phenotypic variation for flowering time in B. rapa through replication of FLC, an important regulatory gene that acts in a dosage-dependent manner.
MATERIALS AND METHODS
Cloning and sequence analysis of Brassica FLC genes: Plants of the biennial B. rapa oilseed cultivar, Per, were grown in a growth chamber for 2 weeks under long-day (LD) conditions (16 hr light:8 hr dark) at 21°. Total RNA was extracted from leaves using the TRI reagent (Sigma, St. Louis) as directed by the manufacturer. First-strand cDNA was synthesized with the SuperScript II reverse transcriptase (Life Science Technology, Gaithersburg, MD) using the poly(dT)-M13 primer (5′-GTA AAA CGA CGG CCA GTC CCT TTT TTT TTT TTT T-3′). The first-strand cDNA was used as template to amplify BrFLC cDNA with the FLC44 primer (5′-CGG CTT AGA TCT CCG GCG ACT-3′) and the poly(dT)-M13 primer. The PCR products were cloned into pGEM-T Easy vectors (Promega, Madison, WI) and sequenced. All cDNA corresponded to a single BrFLC gene.
To isolate additional genomic Brassica FLC genes, conserved primers were designed by aligning the BrFLC cDNA with A. thaliana FLC cDNA (AF116527; Figure 1a, exon 2 and exon 7 primers) and used for 35 cycles of PCR with genomic DNA from doubled haploid lines of B. rapa (IMB218) and B. oleracea (TO1000). PCR products were excised from the gel, purified using the GFX PCR DNA and gel band isolation kit (Amersham Biosciences, Piscataway, NJ), and cloned into pGEM-T vectors (Promega).
Plasmid inserts were sequenced using the ABI PRISM dye terminator cycle sequencing ready reaction kit (PE Applied Biosystems, Foster City, CA). At least two independent clones from separate PCR reactions were sequenced for each locus. Sequencing contigs were assembled using the Sequencher software package (GeneCodes, Ann Arbor, MI). After sequence analysis (see below), locus-specific primers were designed from a variable region of exon 4 of the B. rapa sequences (Figure 1a).
FLC sequences from B. rapa (AY115675-AY115678), B. oleracea (AY115672-AY115674), and A. thaliana (AF116528) were aligned using the Multiple Alignment Program (Huang 1996) and by eye. Exon and intron boundaries were identified by comparison to A. thaliana mRNA sequence (AF116527) and by checking boundary consensus sequences (Brown et al. 1996). The coding sequences for FLC from A. thaliana, B. rapa, B. oleracea, and B. napus (AY036888-AY036892) and for two FLC-like genes from A. thaliana, AGL27 (AF312665) and AGL31 (AY052229), were aligned using CLUSTALW (Thompson et al. 1994) with manual adjustments.
Phylogenetic analyses were done using PAUP*, version 4.0 (Swofford 2000) with both maximum-parsimony and maximum-likelihood methods. For maximum-likelihood analyses, the transition:transversion ratio (ts/tv) was estimated from the data. The base frequencies and the gamma shape parameter α were both determined empirically from the data. Heuristic searches were done with tree-bisection-reconnection branch swapping. Bootstrap support values (BS) were estimated by doing 10,000 “fast” replicates using the parsimony criterion.
The rate of molecular evolution of the B. rapa FLC genes was tested by a relative rate test (Tajima 1993) with A. thaliana FLC sequence as the outgroup, using the MEGA2 software package (Kumar et al. 2001). Similarly, AGL27 and AGL31 were compared using FLC as an outgroup. The method of Nei and Gojobori (1986) was used to calculate the number of synonymous substitutions per synonymous site (dS) and the number of nonsynonymous substitutions per nonsynonymous site (dN) and their ratio (dN/dS) using the codeml program in the PAML software package (Yang 2000).
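The logic of the relative rate test can be illustrated with a minimal Python sketch. It assumes three pre-aligned nucleotide sequences of equal length and skips any column containing a gap or ambiguous base; the function name and the toy sequences in the usage note are hypothetical and are not taken from the MEGA2 implementation used in this study.

```python
def tajima_relative_rate(seq1, seq2, outgroup):
    """Tajima's (1993) relative rate test on three aligned sequences.

    Returns (m1, m2, chi_square), where m1 counts sites at which only seq1
    differs from the other two sequences and m2 counts sites at which only
    seq2 differs. The statistic has 1 degree of freedom.
    """
    if not (len(seq1) == len(seq2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    m1 = m2 = 0
    for a, b, o in zip(seq1.upper(), seq2.upper(), outgroup.upper()):
        if a not in valid or b not in valid or o not in valid:
            continue  # assumption: gapped or ambiguous columns are ignored
        if a != b and b == o:
            m1 += 1  # change unique to the seq1 lineage
        elif a != b and a == o:
            m2 += 1  # change unique to the seq2 lineage
    if m1 + m2 == 0:
        return m1, m2, 0.0
    chi_square = (m1 - m2) ** 2 / (m1 + m2)
    return m1, m2, chi_square
```

Under equal rates the two counts are expected to be equal, so a chi-square value below 3.84 (the 5% critical value for 1 d.f.) gives no evidence of rate differences between the two ingroup sequences. For example, `tajima_relative_rate("CCCAAAAAAA", "AAAGAAAAAA", "AAAAAAAAAA")` counts three lineage-specific changes in the first sequence and one in the second.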
Genetic mapping and map comparisons: Two regions containing flowering-time QTL, FR1 and FR2, were mapped using two backcross populations. The populations were derived from two recombinant inbred (RI) lines from a previously described B. rapa population (Kole et al. 1997) created from a cross between Per and R500, an annual sarson. For FR1, an RI line (PQ3) had Per alleles at restriction fragment length polymorphism (RFLP) loci flanking the FR1 region on linkage group R2 (Osborn et al. 1997) and R500 alleles at RFLP loci flanking other flowering-time QTL. This line was backcrossed to R500 for three generations with selection at each generation for plants having Per alleles at marker loci flanking FR1. One BC3 plant heterozygous at FR1 was self-pollinated and 78 BC3S1 plants were grown in the field, along with 20 replicates of both early (R500) and late (PQ3) parents, under the same conditions as described by Kole et al. (2001). For FR2, an RI line (PQ1050) was selected that had Per alleles at RFLP markers on the linkage group R3 flanking FR2 and that had R500 alleles at RFLP markers flanking other known flowering-time QTL (Osborn et al. 1997). The RI line was crossed to R500, and a hybrid plant (genetically equivalent to a BC1) was selfed. One hundred resulting BC1S1 plants and five replicates of the early parent (R500) and the late parent (PQ1050) were grown in a growth chamber under LD conditions at 21°.
Linkage maps for the FR1 and FR2 regions were generated using RFLP and simple sequence repeat (SSR) marker loci. DNA was extracted as described in Kidwell and Osborn (1992) and was analyzed for RFLP by Southern blot hybridizations as described by Teutonico and Osborn (1994). Probes hybridized to blots included DNA clones found on linkage groups containing flowering-time QTL (Osborn et al. 1997), the four isolated B. rapa FLC genomic clones, and the A. thaliana FLC and CO clones used by Kole et al. (2001). In addition, several PCR markers were used. The exon 7 primer with the exon 4 locus-specific primers for BrFLC2 and BrFLC3 (Figure 1a) gave polymorphic PCR products (Figure 1b). Also, several SSR markers provided by D. Lydiate (personal communication) were used. In total, 14 marker loci per population were utilized. Linkage maps for the FR1 (R2) and the FR2 (R3) regions were constructed by analyzing segregation data from the two backcross populations using JoinMap 3.0 (CPRO-DRO, Wageningen).
Brassica probes used for RFLP analyses in this study (R2 and R3) and in previous studies (R10 = LG8, Kole et al. 2001; N2, N3, and N10, Osborn et al. 1997 and Butruille et al. 1999) were sequenced to infer the position of putative homologs within the A. thaliana genome. The sequences were compared to the A. thaliana genomic sequence as provided on February 15, 2001, on the TAIR database (http://www.arabidopsis.org). Sequences were compared by BLASTn analysis (Altschul et al. 1997), and putative homology relationships were established if pairwise comparison BLAST scores were ≥82 (L. Lukens, F. Zou, D. Lydiate, I. Parkin and T. Osborn, unpublished results).
Flowering-time evaluation, QTL mapping, and gene interaction analysis: The 78 BC3S1 plants segregating for FR1, the 100 BC1S1 plants segregating for FR2, and the 326 F2 plants segregating for both FR1 and VFR2 (described below) were evaluated for flowering time by counting the number of days after sowing to the first open flower (DTF) and the number of leaves on the main axis at flowering (LN). Using the linkage maps constructed for the FR1 and FR2 populations, QTL for flowering time were analyzed using QTL Cartographer (Basten et al. 2001) with a 10-cM covariate window for composite interval mapping (CIM). For both FR1 and FR2 populations, the broad-sense heritability for flowering time was estimated from variance components using the average variance of the parents and the hybrid as an estimate of the environmental variance and the variance of the segregating populations as an estimate of the phenotypic variance.
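The heritability estimate described above can be sketched in a few lines of Python. The sketch assumes the conventional relationship H² = (VP − VE)/VP, with VE taken as the unweighted average of the sample variances of the nonsegregating groups (parents and hybrid); the exact weighting used in the study is not stated, and the function name is hypothetical.

```python
from statistics import variance


def broad_sense_heritability(segregating, nonsegregating_groups):
    """Estimate H^2 = (V_P - V_E) / V_P from flowering-time scores.

    segregating: trait values from the segregating population (estimates V_P).
    nonsegregating_groups: lists of trait values, one per genetically uniform
    group (e.g., each parent and the hybrid); their average sample variance
    is used as V_E (an assumption of this sketch).
    """
    v_p = variance(segregating)
    v_e = sum(variance(g) for g in nonsegregating_groups) / len(nonsegregating_groups)
    return (v_p - v_e) / v_p
```

Because the nonsegregating groups are genetically uniform, any variation within them is attributed to the environment, and the remainder of the phenotypic variance in the segregating population is attributed to genotype.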
To study the interactions of two putative FLC loci, an F2 population (326 plants) that included both VFR2 (described in Kole et al. 2001) and FR1 in an R500 background was created by crossing two BC3S1 homozygous plants (fr1/fr1,VFR2/VFR2 × FR1/FR1,vfr2/vfr2). The F2 population was grown in the field under conditions as reported in Kole et al. (2001). The 326 F2 plants were screened by RFLP for VFR2 with the BrFLC1 clone and for FR1 with the BrFLC2 PCR polymorphism. To test the effects of the two putative FLC loci and their interactions, a two-factor analysis of variance was done using Proc MIXED of SAS (Littell et al. 1996) with means weighted according to the frequency of individuals in each two-locus class.
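The idea of testing whether two loci act additively can be illustrated with a weighted least-squares fit to the nine two-locus class means. The following Python sketch is not the Proc MIXED analysis used here; it assumes each genotype is coded as the number of late alleles (0, 1, or 2) at each locus, fits y = mu + a1·x1 + a2·x2 with class counts as weights, and reports the fraction of among-class variation explained. All function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_additive_model(class_means, class_counts):
    """Weighted fit of class means to mu + a1*x1 + a2*x2.

    class_means and class_counts are dicts keyed by (x1, x2), the number of
    late alleles at locus 1 and locus 2. Returns (mu, a1, a2, r_squared),
    where r_squared is the weighted fraction of among-class variation
    explained by the purely additive model.
    """
    keys = list(class_means)
    normal = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for key in keys:
        w, y = class_counts[key], class_means[key]
        row = (1.0, float(key[0]), float(key[1]))
        for i in range(3):
            rhs[i] += w * row[i] * y
            for j in range(3):
                normal[i][j] += w * row[i] * row[j]
    mu, a1, a2 = solve_linear(normal, rhs)
    total = sum(class_counts[k] for k in keys)
    grand = sum(class_counts[k] * class_means[k] for k in keys) / total
    ss_total = sum(class_counts[k] * (class_means[k] - grand) ** 2 for k in keys)
    ss_resid = sum(
        class_counts[k] * (class_means[k] - (mu + a1 * k[0] + a2 * k[1])) ** 2
        for k in keys
    )
    return mu, a1, a2, 1.0 - ss_resid / ss_total
```

In this coding, a1 and a2 are the effects of substituting one late allele at each locus, and any dominance or locus-by-locus interaction appears as residual variation, lowering the reported fraction below 1.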
RESULTS
Cloning and analysis of Brassica FLC sequences: Cloning and sequence characterization: Five cDNA clones were analyzed by sequencing. Four of these clones were identical (BrFLC cDNA1) and contained 896 bp (coding 196 amino acids) corresponding to the seven exons of A. thaliana FLC with 75% identity (85% for the coding region). The fifth clone (BrFLC cDNA2) was 100% identical to BrFLC cDNA1 for exons 1-6; the final exon and the 3′-untranslated region (3′-UTR) were highly divergent and had no significant homology to any other sequence in GenBank. This was apparently a splicing variant of the same gene as BrFLC cDNA1, as explained below.
Alignment of BrFLC cDNA1 and FLC allowed us to design highly conserved primers in exons 2 and 7 (Figure 1a), and amplification with these primers yielded three distinct fragments after gel separation (Figure 1b). Cloning and sequencing of these PCR products resulted in the identification of four BrFLC genes (BrFLC1, BrFLC2, BrFLC3, and BrFLC5) and three B. oleracea genes (BoFLC1, BoFLC3, and BoFLC5). Locus names are based on their similarity to BnFLC cDNA sequences reported by Tadege et al. (2001; see below). Southern blot analysis using FLC as a probe identified four prominent restriction fragments in B. rapa (Figure 1c). The correspondence between the four FLC restriction fragments and our four cloned BrFLC genes was confirmed by locus-specific Southern hybridization analyses (Figure 1c).
Figure 1. Cloning and analysis of B. rapa FLC homologs. (a) Summary of genomic structure of the A. thaliana FLC gene showing sequences of conserved primers from exons 2 and 7 used to clone B. rapa homologs. The chart lists intron and exon sizes (in base pairs) for FLC and the four B. rapa FLC homologs (BrFLC1-BrFLC5) cloned from a rapid-cycling doubled-haploid line of B. rapa (IMB). The nucleic acid alignment revealed a highly variable region of exon 4 used to design gene-specific forward primers (sequences underlined). (b) PCR amplification of genomic FLC sequences in IMB and the B. rapa cultivars Per and R500. Conserved primers (exon 2 and exon 7) amplify all four paralogs. Separation of FLC1 and -3 is not seen because of similar lengths (1616 and 1651 bp, respectively). Amplification specificity is shown by using gene-specific forward primers with a conserved exon 7 reverse primer. (c) Southern blot hybridizations of MspI-digested DNA of IMB, R500, and Per hybridized with exon 2 through exon 7 genomic probes of FLC and of BrFLC1-BrFLC5. The A. thaliana probe hybridizes to all four Brassica genes, whereas the gene-specific probes show very little cross-hybridization.
Figure 2. Phylogenetic analyses of Brassica FLC homologs. Aligned coding sequences of FLC homologs from B. napus (Bn; Tadege et al. 2001), B. rapa (Br), B. oleracea (Bo), and the FLC-like genes AGL27 and AGL31 were used for phylogenetic analyses. Phylogenetic analyses were done using both maximum-parsimony (a) and maximum-likelihood (b) methods. The number of nucleotide changes is shown on the top of each branch and bootstrap support values are shown below the branch of the consensus tree from the two most parsimonious trees (a). Analyses show four well-supported Brassica FLC clades, represented by each of the four BrFLC sequences. The BnFLC sequences are sisters to the BrFLC sequences. Three clades form a monophyletic Brassica group (BrFLC2, BrFLC3, and BrFLC5 clades). The analyses show all FLC sequences are monophyletic with respect to the FLC-like sequences AGL27 and AGL31. However, the two analyses differ in the placement of BrFLC1; parsimony gives a monophyletic Brassica clade (a) and maximum likelihood shows a paraphyletic relationship (b).
Exon and intron boundaries were identified by comparison to the A. thaliana cDNA sequence and by checking boundary consensus sequences (data summarized in Figure 1a). The BrFLC coding regions were 81.8-84.6% identical to FLC. Exon size was highly conserved among the Brassica and A. thaliana FLC sequences. The one exception was exon 4 of BrFLC2, for which both the Per and the R500 alleles had a 56-bp deletion (established by partial sequencing of the Per and R500 BrFLC2 alleles) that eliminated part of exon 4 (18 bp) and intron 4 (38 bp).
Several introns were conserved in length and sequence. In particular, intron 3 was highly conserved and was the only intron whose sequences could be confidently aligned with 74.4-81.3% sequence similarity to FLC. Other introns were more polymorphic. Intron 2 varied 1.7-fold in length. Per and R500 alleles of BrFLC3 had two indels of 17 and 21 bp relative to one another in intron 2 (established by partial sequencing of these alleles). Intron 1 was not cloned because of its large size in A. thaliana (3.5 kb). However, PCR analyses of B. rapa genomic DNA revealed that BrFLC3 has a relatively small intron 1 (∼1140 bp), while the other loci also have large (>3 kb) intron 1 sequences (data not shown). Intron 6 was highly variable in length (5.3-fold). Sequence comparisons with the 3′ sequence of BrFLC cDNA2 revealed that it contained a portion of intron 6 of BrFLC5 that was in frame with the exon 6 sequence, but excluded exon 7. Hence, BrFLC cDNA1 and BrFLC cDNA2 are alternate splice variants of the same locus, BrFLC5. A putative 51-bp insertion of noncoding mitochondrial DNA (92% similarity) was also identified in intron 6 of BrFLC1. The sequencing of the A. thaliana genome revealed 14 such insertions, ranging in size from 94 to 3500 bp (Arabidopsis Genome Initiative 2000). The mitochondrial insertion was not present in BoFLC1.
FLC phylogeny: Phylogenetic analyses were conducted using a total of 451 bp of aligned coding sequence from exons 2-7 (Figure 1a), excluding indels. A total of 206 sites were polymorphic, and 145 were phylogenetically informative. Maximum-parsimony analysis resulted in two most parsimonious trees with a length of 308 (consistency index of 0.85; retention index of 0.87; consensus tree shown in Figure 2a). Maximum-likelihood analysis yielded a tree with ln L = -1905, an estimated ts/tv ratio of 1.382, and rate variation among nucleotide sites estimated as gamma shape parameter α = 1.53 (Figure 2b).
Table 1. dN and dS substitutions among Brassica FLC sequences compared to A. thaliana FLC
The phylogenetic analysis of FLC and FLC-like sequences showed several interesting relationships (Figure 2). First, all FLC sequences from Brassica species fall into four well-supported clades, each of which we refer to by the BrFLC sequence included in the clade. Sequences from each of the three species are present in each clade with the exception of a B. oleracea FLC in the BrFLC2 clade. Second, one sequence, at most, from a base Brassica diploid is found in any one group, suggesting that there has not been recent gene duplication. Third, both analyses (Figure 2) give a monophyletic group including the BrFLC2, BrFLC3, and BrFLC5 clades with high parsimony bootstrap support (90% BS), but with poor resolution within the group [only 64% BS for a BrFLC3/BrFLC5 clade with parsimony (Figure 2a) and weak support for a BrFLC2/BrFLC3 clade with maximum-likelihood analysis (Figure 2b)]. Fourth, parsimony and likelihood analyses differ with respect to the placement of the BrFLC1 clade, which is monophyletic with the other Brassica FLC sequences with parsimony, but paraphyletic with likelihood. Fifth, three of the five BnFLC sequences (BnFLC1, BnFLC3, and BnFLC5) cloned by Tadege et al. (2001) are sisters to the BrFLC sequences. Finally, both parsimony and likelihood analyses suggest that the Brassica sequences are more closely related to FLC than to the paralogous AGL27 and AGL31.
FLC sequence analyses: Duplicate loci that have diverged in function can show differential rates of evolution. Tajima’s relative rate tests comparing the BrFLC sequences to one another and using A. thaliana FLC as an outgroup gave chi-square values between 0 and 1 with all values being nonsignificant. Hence, we find no evidence that one locus is evolving more rapidly or more slowly than the others. Comparative mapping studies raised questions about the relationship of BrFLC5 to AGL31 (see below). Hence, we wanted to resolve several issues regarding the AGL31 cluster of genes, designated FLC-like sequences 2, 3, and 4 (FLCL2-4 by Tadege et al. 2001). FLCL2-4 genes are 62.2-74.6% identical to FLC for predicted coding regions. Although FLC and the AGL31 cluster are paralogous due to a larger chromosomal duplication event (Arabidopsis Genome Initiative 2000), FLCL2-4 are more similar (67.7-82.7%) to AGL27 than to FLC. AGL27 is located in a nonduplicated region of chromosome 1. Relative rate tests were used to test for functional divergence of AGL31 by comparison to AGL27 using FLC as an outgroup. The test showed that AGL31 is not significantly different from AGL27.
Table 2. Ratio (dN/dS) of dN and dS substitutions among BrFLC sequences
The number of dS and the number of dN and their ratio (dN/dS) were calculated for BrFLC and BnFLC sequences as compared to FLC (Table 1). A ratio near 0 is evidence for strong amino acid conservation and purifying selection and a ratio of ≥1.0 suggests neutral or positive selection. The dN/dS ratios for the Brassica FLC sequences compared to FLC ranged from 0.26 to 0.36 (Table 1) and from 0.31 to 0.53 when BrFLC sequences were compared to one another (Table 2). These values are similar to the ratios found between Brassica and A. thaliana CO genes (0.39-0.44), but much higher than the average of 0.10 (Lagercrantz and Axelsson 2000) and 0.14 (Tiffin and Hahn 2002) for other Brassica genes compared to their A. thaliana homologs. The ratio for Brassica FLC sequences is also higher than the average for the K-box and C regions of several MADS-box genes (Purugganan et al. 1995).
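How a dN/dS ratio is obtained from site and difference counts can be shown with a short Python sketch of the final step of the Nei and Gojobori (1986) method, in which the proportions of synonymous and nonsynonymous differences are each corrected for multiple hits with the Jukes-Cantor formula. The counting of synonymous and nonsynonymous sites and differences from codon alignments is assumed to have been done already, and the function names are hypothetical; this is not the codeml implementation.

```python
import math


def jukes_cantor_distance(p):
    """Correct a proportion of differing sites p for multiple substitutions."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(syn_diffs, syn_sites, nonsyn_diffs, nonsyn_sites):
    """Return (dN, dS, dN/dS) from pre-counted differences and sites.

    syn_diffs / syn_sites is the proportion of synonymous differences (pS);
    nonsyn_diffs / nonsyn_sites is the nonsynonymous proportion (pN).
    """
    d_s = jukes_cantor_distance(syn_diffs / syn_sites)
    d_n = jukes_cantor_distance(nonsyn_diffs / nonsyn_sites)
    if d_s == 0.0:
        raise ValueError("dS is zero; the ratio is undefined")
    return d_n, d_s, d_n / d_s
```

A pair of sequences with many synonymous but few nonsynonymous differences yields a ratio well below 1, the pattern expected under purifying selection.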
Genetic mapping and map comparisons: BrFLC1 was determined to be the FLC locus mapped onto linkage group R10 on the basis of comparisons to results reported by Kole et al. (2001). The position of BrFLC2 was mapped onto R2 using the 78 BC3S1 plants. The R2 map included a total of 14 genetic marker loci spanning 58.9 cM. Both BrFLC3 and BrFLC5 were mapped onto R3 using the 100 BC1S1 plants. The R3 map contained 14 marker loci covering 54.9 cM.
Comparative mapping between A. thaliana and linkage groups from B. rapa (R2, R3, and R10) and their homologs in B. napus (N2, N3, and N10) confirmed extensive synteny and collinearity among these groups and with chromosome 5 of A. thaliana (At5; Figure 3). The collinearity consisted of two blocks, one having homology to the top of At5 and the second with inverted orientation to a region on the bottom of At5. The first region from marker wg1a10 to wg6b2 corresponded to 0.11-7.50 Mb of At5. The second region from marker tg6a12 to ec3d3 corresponded to 26.7-18.2 Mb of At5. After the second shared region of collinearity to the bottom of At5, R2 then shared homology to At1 (26.6-29.0 Mb), R3 shared homology to At2 (13.0-17.8 Mb), and R10 terminated. Hence, all three linkage groups (R2, R3, and R10) appear to share common chromosomal breakpoints compared to A. thaliana.
BrFLC2, BrFLC3, and BrFLC1 all mapped to the expected collinear region (3.13 Mb on At5 where FLC is located) of R2, R3, and R10, respectively. However, BrFLC5 did not map to a region having collinearity to the top of At5. It mapped to the interval between the runs of collinearity with the bottom of At5 and At2 (Figure 3).
Figure 3. Comparative map of Brassica linkage groups containing FLC homologs. Vertical lines represent homologous linkage groups in B. rapa (R) and B. napus (N). ▪, relative positions (but not relative distances) of RFLP loci mapped in previous studies on R2, R3, and R10 (B. rapa linkage groups equivalent to LG2, LG3, and LG8 of Kole et al. 1997; Osborn et al. 1997; Kole et al. 2001) and on N2, N3, and N10 (B. napus linkage groups from Osborn et al. 1997; Butruille et al. 1999). ♦, relative positions of markers used in this study. Markers connected with horizontal lines are RFLP loci detected with the same probe. Homology of RFLP markers to A. thaliana, based on comparative mapping (Osborn et al. 1997; Kole et al. 2001) or on BLAST searches using DNA sequences of RFLP probes (L. Lukens, F. Zou, D. Lydiate, I. Parkin and T. Osborn, unpublished results), is shown by the name of the A. thaliana genomic bacterial artificial chromosomes and genomic positions in megabases of DNA. Brassica genome regions that are collinear with four A. thaliana genome segments are enclosed in boxes. Loci wg4f4 on N3 and N10 and ec3f12 on N3 are not in collinear positions, but their map positions are based on a consensus map from five populations (Butruille et al. 1999) and may be incorrect. Open bars indicate the positions of QTL for flowering time based on backcross populations in this study (FR1, VFR1, and FR2) and one qualitative trait locus (VFR2; Kole et al. 2001). Map positions of B. rapa homologs of the A. thaliana flowering-time gene FLC are shown, three of which correspond to the positions of flowering-time genes. Three of the BrFLC genes (BrFLC2, BrFLC3, and BrFLC1) map to positions on R2, R3, and R10 that show collinearity to the top of At5 where FLC is located. However, BrFLC5 maps to an interval between regions of collinearity to the bottom of At5 and to At2.
Figure 4. Genetic mapping and QTL analyses of B. rapa chromosomes R2 and R3. The effects on flowering time associated with BrFLC2, BrFLC3, and BrFLC5 were analyzed in two populations derived by backcrossing alleles from a biennial (Per) into an annual (R500) B. rapa cultivar. Marker names, distances in centimorgans, LOD plots, and QTL positions with 1 and 2 LOD confidence intervals are shown for R2 (a) and R3 (b). (a) Seventy-eight BC3S1 plants were screened with 14 RFLP markers on R2, including BrFLC2 and CO. The largest QTL (R2 = 80%; LOD score = 35) was centered on BrFLC2. CO was outside the confidence interval. (b) One hundred BC1S1 plants were screened with 10 RFLP markers on R3, including BrFLC3 and BrFLC5, and with 4 SSR markers. The largest QTL (R2 = 39.0%; LOD score = 10.6) was centered on BrFLC5. No QTL effect was identified with BrFLC3. A CO locus on R3 was not identified in this population.
Flowering time, QTL, and gene interaction analyses: R2 (FR1) population: The 78 BC3S1 plants grown in a field required 51-83 DTF and formed 14-35 LN, with means of 63.3 DTF and 25.7 LN. The flowering-time variation was greatly reduced in another set of BC3S1 plants after 3 weeks of vernalization (data not shown). DTF was significantly correlated with LN (r = 0.78; P < 0.01). The average DTF and LN for 20 plants of the early flowering parent (R500) were 51.5 and 17.9, respectively. For the late flowering parent (PQ3) the means were 75.8 DTF and 29.4 LN. Fourteen genetic marker loci, including BrFLC2 and spanning 58.9 cM of R2, were used for QTL analysis (Figure 4a). CIM revealed two QTL. The major QTL (FR1, LOD = 34.7) centered on the BrFLC2 locus, explained 80.6% of the variation, and had an additive effect of 9.4 DTF. The correlation of BrFLC2 genotypes with flowering time is summarized in Figure 5a. A second, smaller QTL (VFR1, LOD = 8.1) centered at 53.9 cM and explained 14.0% of the variation in the population with an additive effect of 4.1 DTF. The broad-sense heritability for flowering time of this population was estimated to be 0.96.
R3 (FR2) population: The 100 unvernalized BC1S1 plants grown in a growth chamber had a range of flowering times from 46 to 92 DTF and from 22 to 44 LN, with averages of 66.4 DTF and 28.9 LN. The variation was greatly reduced in another set of BC1S1 plants after 3 weeks of vernalization (data not shown). DTF was significantly correlated with LN (r = 0.74; P < 0.01). Based on averages of five plants, the early flowering parent (R500) had values of 43.8 DTF and 20.8 LN, the hybrid had values of 69.4 DTF and 27.8 LN, and the late flowering parent (PQ1050) had values of 92.5 DTF and 37.3 LN. Fourteen genetic marker loci spanning 54.9 cM of R3 including BrFLC3 and BrFLC5 were used for QTL analysis (Figure 4b). There was segregation distortion for wg4a4 (P < 0.01) with fewer plants having the homozygous Per genotypes. The BrFLC5 and sn0319 loci were similarly distorted (P < 0.05). Composite interval mapping (CIM) gave a single QTL (LOD = 10.64) centered on the BrFLC5 locus. This QTL explained 39.0% of the variation in flowering time with an additive effect of 8.0 DTF. The broad-sense heritability for flowering time of this population was estimated to be 0.95.
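Segregation distortion at a codominant marker can be assessed with a chi-square goodness-of-fit test, sketched below in Python. The test used for the loci above is not specified, so this is an assumption: it compares observed genotype counts with the 1:2:1 ratio expected after selfing a heterozygote and uses the closed-form upper-tail probability exp(−χ²/2), which holds for 2 degrees of freedom. The function name is hypothetical.

```python
import math


def segregation_chi_square(n_hom_early, n_het, n_hom_late, expected_ratio=(1, 2, 1)):
    """Chi-square test of three genotype counts against an expected ratio.

    Returns (chi_square, p_value). The p-value formula exp(-chi2 / 2) is
    exact only for 2 degrees of freedom, i.e., three genotype classes with
    a fully specified expected ratio.
    """
    observed = (n_hom_early, n_het, n_hom_late)
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    chi_square = 0.0
    for obs, part in zip(observed, expected_ratio):
        expected = total * part / ratio_sum
        chi_square += (obs - expected) ** 2 / expected
    p_value = math.exp(-chi_square / 2.0)
    return chi_square, p_value
```

A deficit of one homozygous class, such as fewer homozygous Per genotypes than expected, inflates the statistic and lowers the p-value below the chosen significance threshold.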
R2 (FR1) and R10 (VFR2): We analyzed interactions between two putative FLC genes by comparing two BC3S1 populations segregating for FR1 (Figure 5a) and VFR2 (Figure 5b) alone, with an F2 population segregating for both FR1 and VFR2 that was derived by crossing two BC3S1 homozygous plants (fr1/fr1,VFR2/VFR2 × FR1/FR1,vfr2/vfr2; Figure 5c). The days to flower for the 78 BC3S1 plants segregating for FR1 discussed above were plotted by genotype at the BrFLC2 locus (Figure 5a). Similarly, the days to flower for the BC3S1 plants segregating for VFR2 reported in Kole et al. (2001) were plotted by genotype at the BrFLC1 locus (Figure 5b). The 326 F2 plants had a mean DTF of 92.8 with a range of 50-150 DTF, and the mean LN was 36.5 with a range of 14-53 LN. At 150 days after planting, the experiment was terminated with 18 plants having never flowered. DNA from all plants was used to genotype at BrFLC1 and BrFLC2, giving nine genotypic classes. The days to flower of the 326 F2 plants segregating for both FR1 and VFR2 were plotted on the basis of genotype at both BrFLC1 and BrFLC2 (Figure 5c).
Figure 5. Genetic effects for late-flowering alleles at FR1 (BrFLC2) and VFR2 (BrFLC1). (a) Days to flowering for FR1 genotypes segregating in a BC3S1 population and scored as BrFLC2 marker classes. R/R, R/P, and P/P are homozygous R500, heterozygous, and homozygous Per genotypes, respectively. Error bars represent the 95% confidence interval for the mean flowering time associated with each genotypic class. (b) Days to flowering for VFR2 genotypes segregating in a BC3S1 population and scored for BrFLC1 marker classes, as reported by Kole et al. (2001). (c) Days to flowering for the nine genotypes of FR1 and VFR2 loci segregating in a single population and scored as BrFLC1 and BrFLC2 marker classes. Each line represents an FR1 genotypic class at the three genotypes of VFR2. Data are from an F2 population (326 plants) that included both QTL segregating in an R500 background, derived from a cross of two BC3S1 homozygous plants (fr1fr1VFR2VFR2 × FR1FR1vfr2vfr2). Additive effects at the two QTL explain 98% of the genetic variation for flowering time, suggesting that these QTL are duplicate copies of the same gene that have retained similar function.
To test the main and interaction effects of two BrFLC loci on flowering time, the F2 data were subjected to a two-factor analysis of variance. The full genetic model explained 87% of the flowering-time variation. Ninety-eight percent of this genetic variation was due to the individual additive effects of BrFLC1 (72.2%) and BrFLC2 (25.4%), similar to the results for the populations with each gene segregating alone (Figure 5). Dominance at BrFLC1 was significant in the F2 population, as were some of the epistatic interactions, but in total these nonadditive effects explained only 2.4% of the genetic variation.
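The partition of genetic variation into additive, dominance, and epistatic parts can be illustrated with orthogonal contrasts over the nine two-locus genotype classes. The sketch below is a simplification and not the analysis of variance performed in this study: it works on class means rather than individual plants, assumes ideal F2 genotype frequencies (1:2:1 at each locus, loci unlinked), and uses hypothetical means. All names (`variance_components`, `ADD`, `DOM`) are introduced here for illustration.

```python
from itertools import product

W = (0.25, 0.50, 0.25)     # assumed ideal F2 genotype frequencies per locus
ONE = (1.0, 1.0, 1.0)
ADD = (-1.0, 0.0, 1.0)     # additive contrast (0, 1, 2 late alleles)
DOM = (-0.5, 0.5, -0.5)    # dominance contrast, orthogonal to ADD under W

def variance_components(means):
    """Partition variance among nine genotype means into orthogonal parts.

    means[i][j] is the mean trait value for i late alleles at locus 1
    and j late alleles at locus 2 (i, j in 0..2).
    """
    terms = {
        "additive_1": (ADD, ONE), "additive_2": (ONE, ADD),
        "dominance_1": (DOM, ONE), "dominance_2": (ONE, DOM),
        "epistasis_aa": (ADD, ADD), "epistasis_ad": (ADD, DOM),
        "epistasis_da": (DOM, ADD), "epistasis_dd": (DOM, DOM),
    }
    out = {}
    for name, (c1, c2) in terms.items():
        num = den = 0.0
        for i, j in product(range(3), repeat=2):
            w = W[i] * W[j]          # frequency of the two-locus genotype
            c = c1[i] * c2[j]        # contrast coefficient for this class
            num += w * c * means[i][j]
            den += w * c * c
        out[name] = num * num / den  # variance explained by this contrast
    return out

# Hypothetical, purely additive class means (days to flowering); NOT study data.
means = [[60 + 15 * i + 9 * j for j in range(3)] for i in range(3)]
comps = variance_components(means)
total = sum(comps.values())
for name, v in comps.items():
    print(f"{name:13s} {100 * v / total:5.1f}%")
```

With purely additive means, all of the variance falls in the two additive terms; any curvature within a locus shows up as dominance, and any non-parallelism between loci shows up in the epistatic terms.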
DISCUSSION
Brassica species contain a wide range of morphological variations that have been selected for use as vegetables, oilseeds, and condiments. The expression of these variations may be due, in part, to allelic variation at redundant copies of key regulatory genes controlling developmental processes. Genes that affect phenotypes in a dosage-dependent manner would be particularly effective at expanding phenotypic diversity if they contained allelic variation at multiple functional copies. Our findings suggest that FLC is such a gene in B. rapa.
Cloning and analysis of Brassica FLC sequences: Using a PCR-based cloning approach, we identified four FLC homologs from B. rapa (named BrFLC1, BrFLC2, BrFLC3, and BrFLC5; Figure 1) and three B. oleracea homologs (BoFLC1, BoFLC3, and BoFLC5). Our ability to accurately identify and distinguish the different homologs was established by locus-specific PCR (Figure 1b) and by Southern blot analysis (Figure 1c). Southern blot hybridization with the four individual BrFLC clones accounted for all the restriction fragments detected by hybridization with an A. thaliana FLC probe (Figure 1c). We were not able to clone a BoFLC2 sequence, and Southern blot analysis suggested that this locus does not exist or is highly diverged in the rapid cycling B. oleracea TO1000 (data not shown). However, additional loci are likely in B. oleracea, including a tandem duplication of BoFLC1 (A. Millar, G. King and N. Salathia, personal communication).
Our results using Tajima’s relative rate test do not support the hypothesis of differential rates of evolution of the different Brassica FLC loci. Thus, we assumed that differential rates of evolution would not complicate our phylogeny reconstructions. Our phylogenetic analyses provide several interesting hypotheses for the origins of the duplication events giving rise to multiple FLC loci in Brassica. If the duplication events in the Brassica lineage all took place following the divergence from the Arabidopsis lineage, then the Brassica clade would be monophyletic. Both analyses (Figure 2) give a monophyletic clade of BrFLC2, FLC3, and FLC5 (but with poor internal resolution). However, parsimony analysis (Figure 2a) and the maximum-likelihood analysis (Figure 2b) differ in their placement of the BrFLC1 clade. Parsimony analysis has the BrFLC1 clade as being monophyletic with the BrFLC2, FLC3, and FLC5 clades (but with only 68% BS support) and maximum likelihood has the BrFLC1 clade as sister to A. thaliana FLC and to the remaining Brassica FLC sequences. Hence, the phylogeny does not resolve whether the duplication event leading to the BrFLC1 clade and the ancestor of the BrFLC2, -3, and -5 clades occurred before or after the divergence of the Brassica and Arabidopsis lineages.
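For readers unfamiliar with Tajima's relative rate test, the idea is to compare, against an outgroup, the number of sites at which only one of two sequences has changed. The sketch below is a minimal illustration and not the software used in this study; the function name and the toy sequences are hypothetical. Under equal rates the two counts are expected to be equal, and (m1 - m2)^2 / (m1 + m2) is referred to a chi-square distribution with 1 d.f.

```python
import math

def tajima_relative_rate(seq1, seq2, outgroup):
    """Tajima's relative rate test on three aligned sequences.

    m1 counts sites where only seq1 differs (seq2 == outgroup != seq1);
    m2 counts sites where only seq2 differs. Returns (m1, m2, chi2, p).
    """
    if not (len(seq1) == len(seq2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    m1 = m2 = 0
    for a, b, c in zip(seq1.upper(), seq2.upper(), outgroup.upper()):
        if "-" in (a, b, c) or "N" in (a, b, c):
            continue  # skip gaps and ambiguous bases
        if b == c and a != c:
            m1 += 1
        elif a == c and b != c:
            m2 += 1
    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    p = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival, 1 d.f.
    return m1, m2, chi2, p

# Toy aligned sequences; NOT FLC data from the study.
m1, m2, chi2, p = tajima_relative_rate("ATGGCTAAGC", "ATGGTTAAGA", "ATGGCTAAGA")
print(m1, m2, chi2, p)
```

A nonsignificant result, as reported here for the Brassica FLC loci, means the data give no reason to reject equal rates along the two lineages.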
Our phylogeny reconstruction also shows that three of the five BnFLC sequences (BnFLC1, BnFLC3, and BnFLC5) cloned by Tadege et al. (2001) are sisters to the BrFLC sequences, suggesting that these and probably BnFLC2 (Figure 2) are B. rapa FLC homologs from B. napus. The homology of BnFLC4 is uncertain since we did not detect a BoFLC2. B. napus (n = 19) is an interspecific allopolyploid between B. rapa (n = 10) and B. oleracea (n = 9). We obtained partial sequence of four B. rapa and three B. oleracea FLC sequences, suggesting that at least seven FLC loci should be in B. napus. Tadege et al. (2001) obtained evidence for only five FLC loci in B. napus on the basis of their cDNA library screening. Our unpublished mapping data with natural B. napus show at least seven FLC loci; thus the likely B. oleracea FLC homolog sequences in B. napus have yet to be identified and merit additional study.
Previous studies have established homology between the top of chromosome 5 of A. thaliana and three Brassica linkage groups (Lagercrantz et al. 1996; Osborn et al. 1997; Bohuon et al. 1998; Parkin et al. 2002). Our analysis confirms these previous results, showing strong collinearity between regions of three B. rapa linkage groups (R2, R3, and R10) and the top of At5. We also mapped FLC loci (BrFLC2, BrFLC3, and BrFLC1) at the predicted collinear regions of R2, R3, and R10, respectively (Figure 3). We attempted to map CO loci in the At5 homologous regions of B. rapa because others have argued that QTL mapping to these regions is due to alleles of this gene (Bohuon et al. 1998; Axelsson et al. 2001). We identified polymorphisms that mapped to R2 and to R10, but none that mapped to R3. This could simply be due to a lack of allelic variation at or near CO in our cross; however, efforts to clone CO homologs resulted in only two cloned CO orthologs from B. nigra (BniCOa and BniCOb; Lagercrantz and Axelsson 2000) and only a single pair of homeologous CO sequences from N2 and N12 of polyploid B. napus (Robert et al. 1998). Differences in the copy number and organization of Brassica CO and FLC loci, including the presence of a fourth Brassica FLC locus (BrFLC5), suggest that the two genes located only 2 Mb apart in the A. thaliana genome may have had different evolutionary pathways in Brassica species.
Whereas BrFLC1, BrFLC2, and BrFLC3 map within collinear regions, BrFLC5 maps on R3 to the interval between two stretches of collinearity with At2 and with the bottom of At5 (Figure 3). Analyses of the A. thaliana genome sequence found that the region around and including FLC (2.9-3.3 Mb) on the top of At5 was duplicated to the region from 26.4 to 27.1 Mb on the bottom of At5 (Arabidopsis Genome Initiative 2000). This duplicated region contains four similar tandem copies of a MADS-box gene (FLCL2-4 of Tadege et al. 2001), including AGL31, that are all presumably paralogs of FLC. Although the proximity of BrFLC5 to the collinear region containing the AGL31 cluster suggests that BrFLC5 could be orthologous to one of these loci, several observations suggest that BrFLC5 is an ortholog of FLC and not of AGL31. First, we tested whether BrFLC5 might have several tandem copies of the gene, as the AGL31 region does. We observed no evidence for this on the basis of Southern blot analysis of Per and R500 DNA using six restriction enzymes (DraI, BamHI, MspI, CfoI, EcoRV, and XbaI) and the BrFLC5 sequence as a probe (data not shown). Second, the comparative mapping data showed that the orientations of the collinear regions between R3 and the bottom of At5 are inverted relative to one another (Figure 3). To explain the orthology of BrFLC5 and one of the AGL31 genes, one would have to hypothesize either that the inversion event occurred after the fusion of the At2 and At5 regions, leaving the AGL31 ortholog at the breakpoint, or that some other complex chromosomal rearrangement took place. Third, phylogenetic analyses clearly show that BrFLC5 is more closely related to FLC than to AGL31 (Figure 2). Finally, BrFLC5 does not appear to have changed more rapidly than any of the other BrFLC sequences, and relative rate tests of AGL31 with AGL27, using FLC as an out-group, did not give evidence of differential rates of evolution. Hence, all lines of evidence suggest that BrFLC5 is an ortholog of FLC and not of AGL31. Determination of how an FLC ortholog was duplicated and inserted in this location on R3 will require additional experimental work.
Functional constraints may be reduced for duplicate genes, and we found mixed evidence for this for BrFLC genes. The higher dN/dS ratios for BrFLC sequences compared to the average of other MADS-box genes and the large variation in intron length suggest that they are not under strong purifying selection. Hence, the proteins have some flexibility to allow new amino acid sequences. However, except for the deletion in BrFLC2, there is strong conservation for exon size, with only a few changes in amino acid chain length (Figure 1a). The flexibility for allowing amino acid substitutions, reflected in the high dN/dS ratios, could indicate that the BrFLC sequences are undergoing rapid evolution, as Lagercrantz and Axelsson (2000) argue for Brassica CO sequences. The higher ratio can be interpreted to mean either relaxed sequence constraint while maintaining function or selection for diverse sequences and function. To test for conservation in function of the different BrFLC loci, we determined the phenotypic effects associated with alleles at BrFLC loci by using the sequences as candidate genes in segregation analyses.
Effects of FLC regions on flowering time: We found that three of our four cloned B. rapa FLC homologs, BrFLC1, BrFLC2, and BrFLC5, cosegregate with loci controlling flowering time in populations derived by backcrossing alleles from a biennial B. rapa into an annual B. rapa. BrFLC1 cosegregates exactly with the VFR2 locus on R10 reported by Kole et al. (2001). BrFLC2 and BrFLC5 map within confidence intervals of flowering-time QTL FR1 on R2 and FR2 on R3, respectively (Figure 4). These results generally agree with two previous QTL studies using F2 (Teutonico and Osborn 1995) and RI (Osborn et al. 1997) populations derived from the same parental lines, although there were differences in the magnitude and positions of effects. Populations used in the previous studies segregated for many loci affecting flowering time, and the QTL estimations may have been biased by chance associations of unlinked genomic regions. We used backcrossing in the current study to eliminate allelic variation at nontarget QTL, minimizing the bias that could be created by other segregating regions. This appeared to be very effective for the two QTL on R2, which were estimated in a population derived after three generations of backcrossing and whose combined effects closely matched the heritability estimate of the population. It was less effective for the QTL on R3, which was estimated in a BC1S1 population and accounted for only about one-half of the heritable variation of this population. A QTL effect on R3 near BrFLC3 was not detected in this or previous studies; however, the parents may have BrFLC3 alleles with small differential effects on flowering time that could be detectable after additional backcrossing.
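The comparison between QTL effects and heritability made above can be reproduced as simple arithmetic from the values reported in the Results. The sketch below assumes that the percentages of variation explained by separate QTL can be summed (i.e., that their effects do not overlap); the function name is introduced here for illustration only.

```python
def heritable_fraction_explained(qtl_r2_percent, broad_sense_h2):
    """Share of heritable variation captured by the detected QTL.

    qtl_r2_percent: percentage of phenotypic variance explained by each QTL.
    broad_sense_h2: broad-sense heritability on a 0-1 scale.
    Summing the per-QTL percentages assumes the QTL effects do not overlap.
    """
    if not 0.0 < broad_sense_h2 <= 1.0:
        raise ValueError("heritability must be in (0, 1]")
    return sum(qtl_r2_percent) / 100.0 / broad_sense_h2

# Values reported in the Results for the two backcross populations.
r2_pop = heritable_fraction_explained([80.6, 14.0], 0.96)  # BC3S1, QTL on R2
r3_pop = heritable_fraction_explained([39.0], 0.95)        # BC1S1, QTL on R3
print(f"R2: {r2_pop:.2f}  R3: {r3_pop:.2f}")
```

On these figures the two R2 QTL account for nearly all of the heritable variation, whereas the single R3 QTL accounts for well under all of it, consistent with residual segregation at other loci after only one backcross.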
Other researchers have found flowering-time variation associated with these same genomic regions in B. rapa and other Brassica species (Bohuon et al. 1998; Axelsson et al. 2001; Österberg et al. 2002), and this variation was attributed to replicated CO loci (Axelsson et al. 2001). Our results indicate that most of the difference in flowering time between the annual and biennial B. rapa that we analyzed is controlled by replicated FLC loci. One of these flowering loci (VFR2) mapped as a single Mendelian locus precisely with BrFLC1 (Kole et al. 2001), and the other two were mapped as QTL in defined backcross populations to regions containing FLC homologs within the QTL confidence intervals. Further evidence that these loci correspond to FLC came from the reduction in flowering-time effects after vernalization. Finally, we tested the combined effects of alleles segregating at two loci, FR1 (BrFLC2) and VFR2 (BrFLC1), and found little evidence for epistasis; additive effects of the two loci accounted for 98% of the genetic variation for flowering time. This result supports our hypothesis that FR1 and VFR2 are duplicate copies of the same gene that have maintained a similar function. The overexpression of BnFLC in A. thaliana by Tadege et al. (2001) also provides evidence that multiple FLC loci encode functional gene products, but it does not demonstrate the allelic effects of these loci in Brassica. This is important for determining the role of replicated genes in phenotypic diversity and could be further demonstrated by analyzing the phenotypic effects of additional alleles derived from diverse genotypes, by studying their gene expression pattern, and by transformation experiments using Brassica FLC alleles expressed from native promoters.
Michaels and Amasino (2000) presented the “flowering rheostat” model to explain the additive effects of FLC alleles at endogenous and transgenic FLC loci in A. thaliana. In their model, additional copies of FLC act in an additive manner to increase the time to flowering, like settings on a rheostat, until biennialism is obtained. Our results from analyzing the interaction effects of two FLC loci fit this model and illustrate how replicated copies of FLC could expand the rheostat-like effect of the gene. The effects of gene dosage on an important trait like flowering time may explain why replicated FLC genes have been retained and have apparently maintained ancestral function. Retention of replicate gene function has been observed at higher than expected frequencies (Lynch and Conery 2000). For genes that act in a dosage-dependent manner, the expansion of phenotypic variation through gene replication may be one reason for the widespread success of polyploids and for the retention of duplicate gene function.
Acknowledgments
We thank J. Chris Pires and two anonymous reviewers for valuable comments and Josh Uduall and Enrique Leon for help with figures. Support was provided by the U.S. Department of Agriculture National Research Initiative Competitive Grants Program to T.C.O. The research in R.A.’s lab is supported by the College of Agricultural and Life Sciences of the University of Wisconsin and by grants from the U.S. Department of Agriculture National Research Initiative Competitive Grants Program and the National Science Foundation. M.E.S. was supported by a Molecular Biosciences Training Grant and by the D.C. Smith Fellowship at the University of Wisconsin. P.Q. was supported by a scholarship from Central University of Venezuela. L.L. was supported by a National Science Foundation Biotechnology Fellowship.
Footnotes
- Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under accession nos. AY115672-AY115678.
- Communicating editor: A. Paterson
- Received June 12, 2002.
- Accepted August 26, 2002.
- Copyright © 2002 by the Genetics Society of America