A Map-Based Cloning Strategy Employing a Residual Heterozygous Line Reveals that the GIGANTEA Gene Is Involved in Soybean Maturity and Flowering

Flowering is indicative of the transition from vegetative to reproductive phase, a critical event in the life cycle of plants. In soybean (Glycine max), a flowering quantitative trait locus, FT2, corresponding to the maturity locus E2, was detected in recombinant inbred lines (RILs) derived from the varieties “Misuzudaizu” (ft2/ft2; JP28856) and “Moshidou Gong 503” (FT2/FT2; JP27603). A map-based cloning strategy using the progeny of a residual heterozygous line (RHL) from the RIL was employed to isolate the gene responsible for this quantitative trait locus. A GIGANTEA ortholog, GmGIa (Glyma10g36600), was identified as a candidate gene. A common premature stop codon at the 10th exon was present in the Misuzudaizu allele and in other near isogenic lines (NILs) originating from Harosoy (e2/e2; PI548573). Furthermore, a mutant line harboring another premature stop codon showed an earlier flowering phenotype than the original variety, Bay (E2/E2; PI553043). The e2/e2 genotype exhibited elevated expression of GmFT2a, one of the florigen genes that leads to early flowering. The effects of the E2 allele on flowering time were similar among NILs and constant under high (43°N) and middle (36°N) latitudinal regions in Japan. These results indicate that GmGIa is the gene responsible for the E2 locus and that a null mutation in GmGIa may contribute to the geographic adaptation of soybean.

FLOWERING represents the transition from the vegetative to the reproductive phase in plants. Various external cues, for example, photoperiod and temperature, are known to initiate plant flowering under the appropriate seasonal conditions. Of these cues, photoperiod sensitivity is one of the important keys that enables crops to adapt to a wide range of latitudes (Jung and Muller 2009). To understand the molecular mechanism of flowering, extensive studies have been performed using model plant species, such as Arabidopsis thaliana and rice (Oryza sativa), and these have revealed the numerous regulatory network components associated with flowering (Jung and Muller 2009). Several steps are involved in perceiving the external light signals and integrating these into floral stimuli in the plant leaf organ, and many of these components, including photoreceptors such as phytochromes, as well as clock genes, clock-associated proteins, and transcriptional factors, have been well characterized.
In A. thaliana, flowering is promoted under long-day (LD) conditions through the CONSTANS (CO)-mediated photoperiodic flowering pathway that regulates the expression of the FLOWERING LOCUS T (FT) encoding florigen (Samach et al. 2000; Suarez-Lopez et al. 2001; Abe et al. 2005). The stability of CO mediated by phytochrome A and cryptochrome avoiding proteasome protein degradation increases under LD conditions and induces the expression of FT in leaves (Yanovsky and Kay 2002; Valverde et al. 2004). The GIGANTEA (GI) mutant, discovered half a century ago, shows the late-flowering phenotype in LD conditions (Redei 1962). GI encodes a nuclear-localized membrane protein that functions upstream of CO and FT (Koornneef et al. 1998; Fowler et al. 1999; Huq et al. 2000; Mizoguchi et al. 2005). GI coupled with the blue light receptor protein, FLAVIN binding KELCH REPEAT F-BOX with PAS/LOV domains (FKF), forms a blue-light-dependent complex that degrades the CYCLING DOF FACTOR 1 (CDF) protein that binds the promoter region of CO and thereby induces CO expression (Nelson et al. 2000; Imaizumi et al. 2003, 2005; Sawa et al. 2007). Moreover, GI regulates another pathway that controls FT expression via microRNAs that cooperate with another transcriptional factor (Jung et al. 2007).
Supporting information is available online at http://www.genetics.org/cgi/content/full/genetics.110.125062/DC1. Available freely online through the author-supported open access option.
Sequence data described in this article have been deposited at the DNA Data Bank in Japan (DDBJ) Data Libraries under accession nos. AB554196-AB554222. The first 5 sequences and the subsequent 20 sequences correspond to AFLP fragments and BAC end sequences, respectively. Putative coding sequences of GmGIa, originating from Moshidou Gong 503, and GmGIb, originating from the Misuzudaizu allele, have been deposited under accession nos. AB554221 and AB554222, respectively. Sequences of the five BAC clones have been deposited under accession nos. AP011822 (WBb35C13), AP011811 (MiB300H01), AP011821 (WBb225N14), AP011813 (MiB319A04), and AP011810 (MiB039C03), respectively.
The rice CO ortholog, Hd1, was discovered as a quantitative trait locus (QTL) affecting the heading date in the cross between japonica and indica rice cultivars (Yano et al. 2000; Kojima et al. 2002). In contrast to CO in A. thaliana, Hd1 is considered to promote the expression of FT orthologs under inductive short-day (SD) conditions and to repress it under LD conditions (Yano et al. 2000). Overexpression of OsGI, an ortholog of the Arabidopsis GI gene, in transgenic rice increased Hd1 expression, whereas expression of the FT ortholog was suppressed and resulted in a late-flowering phenotype (Hayama et al. 2003). Many other QTL for heading date in rice have been identified and some of the genes were found to encode novel transcriptional factors with no orthologs in the A. thaliana genome (Doi et al. 2004; Xue et al. 2008). Although some components of the flowering response, for example, CO, GI, and FT, are conserved in many crops, such as wheat (Triticum aestivum; Nemoto et al. 2003; Zhao et al. 2005; Yan et al. 2006), barley (Hordeum vulgare; Dunford et al. 2005; Yan et al. 2006), pea (Pisum sativum; Hecht et al. 2007), tomato (Solanum lycopersicum; Lifschitz et al. 2006), onion (Allium cepa; Taylor et al. 2010), cucurbits (Lin et al. 2007), and grapevine (Vitis vinifera; Carmona et al. 2007), the mechanisms controlling the flowering pathway may differ between species, and these therefore must be elucidated at the species level.
Of these, the E1, E3, and E4 loci are considered as photoperiod sensitivity loci under various light conditions (Saidon et al. 1989; Cober et al. 1996; Abe et al. 2003). The E3 (Glyma19g41210) and E4 (Glyma20g22160) genes encode phytochrome A (PhyA) homologs (Watanabe et al. 2009; http://www.phytozome.net/), and the recessive alleles of these loci confer an early flowering phenotype under LD conditions extended by incandescent (e4) or fluorescent lighting (e3), respectively (Abe et al. 2003; Watanabe et al. 2009). The expression of GmFTs, the soybean FT orthologs that promote flowering in transgenic Arabidopsis plants, cannot be suppressed in lines carrying the recessive alleles of both PhyA genes under LD conditions (Kong et al. 2010). Although other flowering gene orthologs have been analyzed using information from EST libraries and by expression analyses (Tasma and Shoemaker 2003; Hecht et al. 2005), knowledge of the association between other E alleles, including E2, and flowering genes is still limited.
The soybean genome sequence based on the American cultivar "Williams82" has been available since January 2008 and was recently published (Schmutz et al. 2010).
The availability of such information in public databases can greatly accelerate the identification of genes controlling agronomically important traits. However, reliable and efficient mapping methods are a prerequisite. Although comparison of near isogenic lines (NILs) is the most effective way of evaluating a locus that affects a phenotype, developing NILs by backcrossing is a time-consuming and laborious process. Alternatively, using a mapping population derived from a residual heterozygous line (RHL) has been proposed. An RHL selected from a recombinant inbred line (RIL) population harbors a heterozygous region where the target QTL is located, but contains a homozygous background for most other regions of the genome. The progeny of the RHL is expected to show a simple phenotypic segregation based on the effects of the target QTL at the heterozygous region. Tuinstra et al. (1997) used a similar term, heterogeneous inbred family, for progenies of RHLs to identify the QTL associated with seed weight in sorghum. The RHL strategy has already been used to identify loci underlying pathogen resistance in soybean (Njiti et al. 1998; Meksem et al. 1999; Triwitayakorn et al. 2005).
In previous studies using the same population as in this study, three flowering-time QTL, FT1, FT2, and FT3, corresponding to maturity loci E1, E2, and E3, respectively, were identified (Yamanaka et al. 2001; Watanabe et al. 2004). Although FT2 has a larger effect on flowering time than FT3 (E3), the gene responsible for this QTL is still unknown. The gene responsible for the E3 locus was isolated using an RHL-derived map-based cloning strategy (Watanabe et al. 2009). So far, however, there are no other successful examples of the isolation of genes responsible for QTL using RHLs in soybean. The aims of this study were therefore to confirm the utility of the RHL strategy for map-based cloning of a gene for a QTL; to identify the gene responsible for the FT2 (E2) locus; to characterize the E2 gene, including its relationship with soybean florigen genes; and to determine the genetic stability of the E2 allele under different environmental conditions.

MATERIALS AND METHODS
Plant materials and flowering-time investigation: A population of 156 RILs (F8:10) that was derived from a cross between the two varieties, Misuzudaizu (accession no. JP28856 in the National Institute of Agrobiological Sciences Genebank) and Moshidou Gong 503 (JP27603), was used. This population had been used previously for linkage map construction and QTL analysis of agronomic traits. Three QTL for flowering time, FT1, FT2, and FT3, were identified at LG C2 (chromosome 6), LG O (chromosome 10), and LG L (chromosome 19), respectively. The late-flowering alleles FT1, FT2, and FT3 are partially dominant over the early-flowering alleles ft1, ft2, and ft3, respectively. Misuzudaizu harbored the late-flowering alleles of the FT1 and FT3 loci, whereas Moshidou Gong 503 carried the late-flowering allele of the FT2 locus.
Screening methods for identifying candidate lines for RHLs with large phenotypic variance for flowering time were performed as previously described (Watanabe et al. 2009). RIL 6-8 was found to be heterozygous for the FT2 locus. Seeds of RHL 6-8 were sown on May 21, 2003, and seedlings were grown under natural conditions at Chiba University, Matsudo (35°78′ N, 139°90′ E). One pair of NILs with contrasting alleles for FT2, 6-8-FT2 and 6-8-ft2, was selected from the progenies of this RHL. Additionally, several plants heterozygous for the FT2 locus were screened from the progeny of RHL 6-8 using molecular markers linked to this QTL. All seeds obtained from heterozygous plants were bulked to develop a large segregating population for fine mapping. This population, consisting of 888 plants, together with NILs 6-8 (211 plants for 6-8-FT2 and 168 plants for 6-8-ft2), was sown on May 30, 2006, at the Japan International Research Center for Agricultural Sciences, Hachimandai, Tsukuba (36°03′ N, 140°04′ E), and the seedlings were transplanted to the field on June 12 and grown under natural conditions. A total of 21 recombinants were screened from this population using new DNA markers tightly linked to the FT2 locus. The progenies, consisting of 24-74 plants derived from these recombinants, and the NILs 6-8 were sown on May 24, 2007, at Hokkaido University, Sapporo (43°07′ N, 141°39′ E), and seedlings were transplanted to the field on June 7 and grown under natural conditions. A mutant line carrying a premature stop codon mutation in the GmGIa (E2) gene and the original variety [Bay; PI553043, plant introduction (PI) number deposited in the National Plant Germplasm System in the United States] were sown on June 23, 2008, at the National Institute of Agrobiological Sciences, Tsukuba (36°02′ N, 140°11′ E) and grown under natural day-length conditions.
DNA isolation: Genomic DNA was extracted from fresh trifoliate leaves of 2-week-old seedlings using the standard cetyltrimethyl ammonium bromide (CTAB) method (Murray and Thompson 1980) and used for amplified fragment length polymorphism (AFLP) analysis, fine mapping, progeny tests, and confirmation of the E2 allele.
AFLP analysis: Templates for the AFLP reaction were prepared based on the method of Vos et al. (1995) using 150 ng DNA for restriction enzyme digestion with EcoRI and MseI. Selective amplification was performed using combinations of EcoRI (E) primers and MseI (M) primers, each with three selective nucleotides. The nomenclature for the AFLP markers was expressed as En1Mn2, with the letters E and M denoting the EcoRI and MseI primers, respectively, and the subsequent n1 and n2 codes representing the three selective nucleotides for each primer. The amplification procedures, conditions for polyacrylamide gel electrophoresis, and methods for detection and scoring of the polymorphic bands followed those described by Hayashi et al. (2001). Bulked segregant analysis (BSA) was used to identify the locus affecting segregation of flowering time in a candidate RHL. Two DNA pools, an early-flowering bulk and a late-flowering bulk from the RIL 6-8 subpopulation, were used as DNA templates. Additional polymorphic AFLP markers were identified using NILs 6-8 with the remaining primer pairs among all 4096 possible AFLP primer combinations.
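The figure of 4096 primer combinations follows from the three selective nucleotides on each primer: 4^3 = 64 EcoRI primers and 64 MseI primers give 64 × 64 pairs. The following minimal sketch enumerates them; the function and variable names are illustrative assumptions and are not part of the original protocol.

```python
from itertools import product

BASES = "ACGT"

def selective_extensions(n_selective=3):
    """Return every possible selective-nucleotide extension of the given length."""
    return ["".join(p) for p in product(BASES, repeat=n_selective)]

# Hypothetical names: one list of extensions per primer family.
eco_extensions = selective_extensions()   # EcoRI (E) primers
mse_extensions = selective_extensions()   # MseI (M) primers

# Every E primer paired with every M primer.
primer_pairs = [(e, m) for e in eco_extensions for m in mse_extensions]

print(len(eco_extensions), len(mse_extensions), len(primer_pairs))  # 64 64 4096
```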
Development of sequence characterized amplified region markers for fine mapping of the FT2 locus: The detected polymorphic bands were isolated using the pGEM T-easy vector system (Promega KK, Tokyo) and sequenced with an ABI PRISM 3100 avant Genetic Analyzer using a BigDye Terminator v3.1 Cycle Sequencing kit (Applied Biosystems Japan, Tokyo) according to the manufacturer's instructions. Primer3 (Rozen and Skaletsky 2000) was used to design primers for new sequence characterized amplified region (SCAR) markers originating from AFLPs.
Construction of a physical contig of the FT2 region: Two libraries of BAC clones constructed from the genomic DNA of Misuzudaizu and Williams 82 were used. Library screening and end-sequencing were performed as described previously (Xia et al. 2005). Several BAC clones were identified using the SCAR markers. No positive clone was obtained for the SCAR marker originating from AFLP E37M27/47. The flanking region of this AFLP fragment was amplified using the adapter ligation method described previously (Tsuchiya et al. 2009).
The nucleotide sequences of five BAC clones (WBb35C13, MiB300H01, WBb225N14, MiB319A04, and MiB039C03) were determined according to the bridging shotgun method described previously. The generated sequences were assembled using Phred-Phrap programs (Philip Green, University of Washington, Seattle). A lower threshold of acceptability for the generation of consensus sequences was set at a Phred score of 20 for each base. These BAC sequences were used to develop new DNA markers and to predict the candidate gene for the FT2 locus.
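For orientation, a Phred score Q corresponds to an estimated base-calling error probability of 10^(-Q/10), so the threshold of 20 used here means at most a 1-in-100 chance that a base is miscalled. The sketch below shows the conversion; the function names are illustrative assumptions, and this is not the Phred-Phrap implementation itself.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to the estimated base-calling error probability."""
    return 10 ** (-q / 10)

def passes_threshold(q, min_q=20):
    """Return True when a base meets the minimum Phred score (20 in this study)."""
    return q >= min_q

for q in (10, 20, 30):
    print(q, phred_to_error_probability(q), passes_threshold(q))
```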
DNA marker analysis: For fine mapping, seven DNA markers developed from the AFLP and BAC sequences listed in Table 1 were used to determine the genotypes of the progenies of the RHL. Genomic DNA (20-30 ng) was used as template, and PCR reactions were performed using Ex-Taq (Takara Bio, Shiga, Japan) with 30 cycles at 96° for 30 sec, 58° for 30 sec, and 72° for 1 min. Some markers required restriction enzyme digestion to detect a polymorphism (Table 1). PCR products were separated by 10% (w/v) polyacrylamide gel electrophoresis and visualized with ethidium bromide (EtBr).
Analysis of GmGI transcripts and diagnostic marker for the e2 allele: Total RNA from leaves of NILs 6-8 was extracted by the Trizol method (Invitrogen Japan K.K., Tokyo). Primers designed from the predicted start codon to stop codon region, based on the BAC sequence, were used to obtain the GmGI transcript. For the RT-PCR reaction, 1 μg of total RNA was used for first-strand cDNA synthesis with ReverTra Ace (TOYOBO, Osaka, Japan) and a standard oligo(dT20) primer according to the manufacturer's instructions. The cDNAs were then diluted twofold with PCR-grade water, and 2-μl aliquots were used as RT-PCR templates. Specific primers for RT-PCR of GmGIa were P1a (5′-TGTCGTCATCTTCGTCTTCG-3′) and P2 (5′-CCAGAGCAGAGTCACAAGCA-3′) and for GmGIb were P1b (5′-CATCGTTTCACCCACTGAGA-3′) and P2. The PCR conditions using Ex-Taq consisted of 30 cycles at 96° for 30 sec, 58° for 30 sec, and 72° for 3 min. PCR products were isolated, cloned into the pGEM T-easy vector system, and sequenced. To evaluate the null allele of GmGIa, specific primers for the detection of the premature stop codon were 5′-AAGCCTATGCCAGCTAGGTATTT-3′ and 5′-GAAGCCCATCAGAGGCATGTCTTATT-3′. The PCR conditions using Ex-Taq consisted of 30 cycles at 96° for 30 sec, 58° for 30 sec, and 72° for 1 min. PCR products were digested with DraI (Takara Bio) and then separated by 10% (w/v) polyacrylamide gel electrophoresis and visualized by EtBr staining.
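The diagnostic marker relies on DraI, which recognizes TTTAAA and cuts between TTT and AAA to leave blunt ends, so that one allele's amplicon is cleaved and the other's is not. The sketch below illustrates this in silico. Both amplicon sequences are short hypothetical examples invented for illustration, not the real GmGIa amplicons, and the sketch makes no claim about which allele carries the site.

```python
DRAI_SITE = "TTTAAA"   # DraI recognition sequence; blunt cut between TTT and AAA
CUT_OFFSET = 3         # cut position measured from the start of the site

def drai_fragments(amplicon):
    """Return the fragments produced by complete DraI digestion of a linear amplicon."""
    seq = amplicon.upper()
    fragments = []
    start = 0
    pos = seq.find(DRAI_SITE)
    while pos != -1:
        cut = pos + CUT_OFFSET
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(DRAI_SITE, pos + 1)
    fragments.append(seq[start:])
    return fragments

# Hypothetical toy amplicons differing by a single base after the TTT.
allele_with_site = "CCAGCTAGGTATTT" + "AAAGGCTTACG"
allele_without_site = "CCAGCTAGGTATTT" + "GAAGGCTTACG"

print([len(f) for f in drai_fragments(allele_with_site)])     # two fragments
print([len(f) for f in drai_fragments(allele_without_site)])  # one undigested fragment
```

On a gel, the cleaved allele would appear as shorter bands and the uncleaved allele as a single full-length band, which is how the two genotypes are distinguished.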
Construction of soybean mutant libraries: Seeds of the soybean cultivar, Bay, were independently treated with two different mutagens (X ray or EMS). For X-ray treatment, dry seeds were irradiated with 200 Gy X rays at an exposure rate of 3 Gy/min. For EMS treatment, seeds were soaked in a 0.35% (w/v) EMS solution for 12 hr and then rinsed in tap water for 8 hr. M2 seeds were obtained from self-fertilized M1 plants. Green leaves were harvested from individual M2 plants for DNA preparation. Genomic DNAs were purified using diatomaceous earth columns, followed by CTAB extraction. Pooled DNAs from eight individuals were used for mutant screening.
Expression analysis of soybean florigen gene: Two combinations of NILs, NILs 6-8, Bay and Bay-e2 mutant were used, with eight plants for each line. Seeds were sown on June 13, 2008, in a greenhouse under natural day-length conditions. A piece of a fully expanded trifoliolate leaf was collected, with three replicates, at 9:00 AM 4 weeks after sowing. The methods for phenotypic investigation, RNA extraction, and cDNA synthesis were essentially as described above. The RT-PCR conditions using Ex-Taq consisted of 25 cycles at 96° for 30 sec, 58° for 30 sec, and 72° for 30 sec with GmFT2a-specific primers 5′-ATCCCGATGCACCTAGCCCA-3′ and 5′-ACACCAAACGATGAATCCCCA-3′. The GmTubulin gene was used as an internal control and amplified for 28 cycles at 96° for 30 sec, 58° for 30 sec, and 72° for 30 sec with primers 5′-TCTTGGACAACGAAGCCATCT-3′ and 5′-AAGCCTATGCCAGCTAGGTATTT-3′. PCR products were separated by 13% (w/v) polyacrylamide gel electrophoresis and visualized by SYBR Green (Invitrogen) staining. The levels of GmTubulin expression were used to calculate the relative expression levels of genes using Image J software (http://rsb.info.nih.gov/ij/). Three independent experiments were performed.
Data analysis: Data were analyzed using R software (http://cran.r-project.org/) for one-way or two-way classification analysis of variance (ANOVA) without the assumption of equal variance. Multiple regression analysis was applied to estimate the additive and dominance effects of the FT2 locus, details of which were described previously. Phylogenetic analysis of the GIGANTEA protein was performed using the neighbor-joining (NJ) method with the program MEGA 4.0 (Kumar et al. 2008).

RESULTS
Development of an RHL for the FT2 locus: Most RILs showed a small within-line variance in flowering time, with an average standard deviation (S.D.) of 1.33 (supporting information, Table S1 and Table S2), similar to that of the parents [estimated S.D. values for Misuzudaizu (Mi) and Moshidou Gong 503 (Mo) were 1.20 and 1.59, respectively (Table S3)], whereas RIL 6-8 exhibited a larger S.D. value of 3.52. We expected that RIL 6-8 would harbor some heterozygous region in its genome, including the QTL for flowering time. Using BSA, a polymorphic AFLP marker, E7M19, was detected between the early-flowering bulk and the late-flowering bulk derived from the progeny of RIL 6-8. Mapping results and QTL analysis for flowering time using the RILs suggested that this marker was located close to the LOD peak position of the QTL assigned FT2 (Figure 1). This indicated that RIL 6-8 harbored a heterozygous region, including the FT2 locus, in the previous generation (F7) and showed phenotypic segregation in the subsequent generation (F8). A region covering 10 cM, including the FT2 locus, was found through DNA marker analysis to have segregated in the progeny of RIL 6-8 (Figure 1). The plants heterozygous for this region, designated as RHL 6-8, generated NILs 6-8-FT2 and -ft2 among their progeny. The difference between NILs for flowering time was highly significant (P < 0.001) with 57.5 ± 1.72 days (n = 168) for 6-8-ft2 and 67.6 ± 1.56 days (n = 210) for 6-8-FT2. These results indicated that the progeny derived from RHL 6-8 would be suitable for fine mapping of the FT2 locus. We developed additional DNA markers tightly linked to this QTL using NILs 6-8.
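As a rough check on the size of the NIL difference, a Welch-type t statistic (which does not assume equal variances) can be computed from the reported summary values alone. This is only a sketch: it assumes that the ± values are standard deviations, and it is not necessarily the exact test the authors ran in R.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom from summary statistics."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Reported values: 6-8-FT2 (late) versus 6-8-ft2 (early); +/- values assumed to be S.D.
t_stat, df = welch_t(67.6, 1.56, 210, 57.5, 1.72, 168)
print(f"t = {t_stat:.1f}, df = {df:.0f}")
```

A t statistic of this magnitude with several hundred degrees of freedom is consistent with the reported P < 0.001.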
Screening of AFLP markers and construction of a physical contig around the FT2 locus: From the progeny of RHL 6-8, we selected several NILs harboring a narrower heterozygous region, including the FT2 locus, for the development of new AFLP markers tightly linked to this locus. Figure S1 shows the scheme used to screen for the AFLP markers. Among the products amplified from all 4096 possible primer pair combinations, only five polymorphic bands showed constant polymorphism between the contrasting genotypes of FT2/FT2 and ft2/ft2 in NILs 6-8. We confirmed the genetic positions of these markers using the RILs (Figure S1). These polymorphic bands were excised from the gel, sequenced, and converted to SCAR markers. Three SCAR markers, originating from these AFLP bands, were developed. Using these SCAR markers, 10 BAC clones were identified after screening of two independent genomic DNA libraries. A physical contig covering this region was constructed on the basis of the results of PCR analysis using the BAC end sequences (Figure S2). Five of the 10 BAC clones (WBb35C13, MiB300H01, WBb225N14, MiB319A04, and MiB039C03) were then subjected to shotgun sequence analysis. Each BAC clone was separately analyzed and assembled, and the sequence information was then combined using overlapping sequences. The total length covered by the five clones was 430 kbp. A total of seven DNA markers, including two AFLP-derived markers (markers 1 and 4) and five PCR-based markers developed from these BAC sequences (markers 2, 3, 5, 6, and 7), were used in the fine-mapping experiments to precisely restrict the FT2 locus (Table 1).
Fine mapping of the FT2 locus: A population consisting of 888 plants, derived from several RHL 6-8 plants, was used for fine mapping of the FT2 locus. Markers 1, 4, 6, and 7 were used for preliminary screening (Table S4 and Table S5). Recombination between these markers was not observed in 822 plants but was found in 21 other plants. The remaining 45 individuals were omitted from the analysis because of missing data for phenotypes or genotypes (Table S5). The numbers of FT2 homozygous late-flowering genotype (n = 213, 68.5 ± 1.08), heterozygous (n = 420, 63.9 ± 1.95), and ft2 homozygous early-flowering genotype (n = 210, 58.1 ± 2.50) fitted well with a 1:2:1 segregation ratio (Table S6), and differences in flowering time among genotypes based on marker 4 were significant (P < 0.001). The additive effect and the dominance effect of this QTL were estimated to be −5.17 days and 0.57 days, respectively. Furthermore, the ratio of genetic variance explained by the FT2 locus accounted for 87.9% of the total variance, indicating that the variation observed in this population was largely controlled by the single-QTL effect. On the other hand, the 21 plants harboring recombination between these markers were analyzed in more detail using additional DNA markers (Figure 2A). One-way ANOVA of the phenotypic data of each recombinant classified by genotypes of DNA markers enabled us to estimate the location of the QTL more precisely (Figure 2B). The highest statistic was obtained at the locus of marker 4 (F = 91.25, P < 0.001), indicating that the QTL was close to this marker. Of these recombinants, line #060501 had a recombination point between marker 3 and marker 4. Another line, #060120, generated a recombination between marker 4 and marker 5 (Figure 2). Markers 3 and 5 originated from the end sequences of MiB300H01 (Table 1).
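The 1:2:1 fit and the approximate size of the additive and dominance effects can be reproduced from the genotype counts and means reported above. The following sketch uses only the Python standard library. The genotype-mean formulas are a textbook simplification of the multiple regression the authors used, and the sign convention (negative additive effect for the early-flowering ft2 allele) is an assumption.

```python
import math

def chi_square_1_2_1(n_late, n_het, n_early):
    """Chi-square goodness-of-fit test against a 1:2:1 ratio (2 degrees of freedom)."""
    observed = [n_late, n_het, n_early]
    total = sum(observed)
    expected = [total * 0.25, total * 0.5, total * 0.25]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For exactly 2 degrees of freedom the upper-tail probability is exp(-chi2 / 2).
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value

def additive_and_dominance(mean_late, mean_het, mean_early):
    """Estimate additive and dominance effects from the three genotype means."""
    midparent = (mean_late + mean_early) / 2
    additive = (mean_early - mean_late) / 2   # assumed sign: effect of the ft2 allele
    dominance = mean_het - midparent
    return additive, dominance

chi2, p_value = chi_square_1_2_1(213, 420, 210)
additive, dominance = additive_and_dominance(68.5, 63.9, 58.1)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
print(f"additive = {additive:.2f} days, dominance = {dominance:.2f} days")
```

The chi-square value falls far below the 5% critical value of 5.99 for 2 degrees of freedom, and the estimates from the genotype means land close to the reported −5.17 and 0.57 days.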
The phenotypic segregation patterns in progenies of lines #060120 and #060501 were evaluated to restrict the precise position of the FT2 locus, since both lines had a crossover within a single BAC clone, MiB300H01. The NILs (6-8-FT2 and 6-8-ft2) and two homozygous lines (#060452 and #060528) showed small levels of phenotypic segregation compared with lines #060120 and #060501 (Table 2 and Table S7). A clear phenotypic segregation classified by the genotype of marker 4 was observed in progenies of lines #060120 and #060501 (Table 2). Considering the recombination points in each line and the cosegregation of phenotypes with genotypes of marker 4 in the lines, this result indicated that the FT2 locus was restricted to a single BAC clone, MiB300H01. To identify the gene responsible for this QTL, the nucleotide sequence of this BAC clone was investigated.
Candidate gene for the FT2 locus: The complete sequence of the physical contig, assembled from the five BAC clones, corresponded to a 427-kbp sequence from 44,415 to 44,843 kbp in Gm10 of the soybean genome sequence available in the public database. The 94-kbp sequence of MiB300H01 coincided with the sequence from 44,693 to 44,787 kbp in Gm10. The average similarity between the sequences of Misuzudaizu and Williams 82 in this region was 99.6%. Nine annotated genes (Glyma10g36580-36670) were predicted in this region. One of these genes, Glyma10g36600, with a high level of similarity to the GI gene, was considered a strong candidate for the FT2 locus, since the loss of function of GI is known to cause drastic changes in the flowering phenotype of other plant species (Fowler et al. 1999; Hecht et al. 2007). On the basis of this assumption, we isolated the complete predicted coding region using an RNA sample extracted from leaves of NIL 6-8-FT2. We hereafter refer to this gene as GmGIa, since another GI gene, GmGIb, was also obtained from the same RNA sample. GmGI genes isolated from the Mo late-flowering allele (GmGIa-Mo and GmGIb-Mo) were found to encode proteins consisting of 1170 and 1168 amino acids, respectively. Amino acid identity between the two GmGI proteins was close to 97.3%. The predicted amino acid sequences of GmGIa and GmGIb were completely identical to Glyma10g36600 and Glyma20g30980, respectively. Phylogenetic analysis showed that these soybean GI proteins displayed a high level of similarity (71-91%) to GI proteins from dicots and monocots (Figure 3). The presence of paralogous genes with high levels of similarity attracted our attention. However, as there were no experimental data for GmGIb, we focused on the GmGIa gene in the present study.
The coding sequence of GmGIa extended over a 20-kbp genomic region and contained 14 exons (Figure 4A). Marker 4, which cosegregated with the QTL genotypes and originated from the AFLP marker E60M38, was located in the 5th intron (Figure 4A). Compared to GmGIa-Mo, the Misuzudaizu early-flowering allele, GmGIa-Mi, showed four single nucleotide polymorphisms (SNPs) in its coding sequence (Figure 4B). One of these SNPs, detected in the 10th exon, introduced a premature stop codon mutation that led to a truncated 521-aa GI protein in the GmGIa-Mi allele. This stop codon mutation was considered a candidate for a functional nucleotide polymorphism in GmGIa. A derived cleaved amplified polymorphic sequence (dCAPS) marker was developed to examine the identity of this stop codon mutation in other NILs for the E2 locus. Testcross experiments had previously been performed using various combinations of NILs originating from Harosoy (e2/e2) and Clark (E2/E2). The genotypes of all isolines tested coincided well with the genotype of this diagnostic dCAPS marker (Figure S3). This result indicated that the gene responsible for the FT2 QTL was probably identical to the E2 gene and that a conserved mutation might have caused the early flowering phenotype in their recessive allele. To validate the significance of the mutation in the GmGIa gene, we screened a mutant line from the mutagen-treated libraries. The sequence of GmGIa in the wild-type Bay cultivar was completely identical to the E2 allele. One mutant line harboring a deletion in the 10th exon that caused a truncated protein (735 aa; Figure 4B and Figure S4) showed a significantly earlier flowering phenotype (39.8 ± 1.16; n = 8) than the wild type (47.6 ± 1.06; n = 29) under natural day-length conditions. These results indicated that the mutation in GmGIa causes the early flowering phenotype in soybean.
GI has the conserved function of controlling the expression of the FT (florigen) gene in Arabidopsis, rice, and pea (Hayama et al. 2003;Mizoguchi et al. 2005;Hecht et al. 2007). We examined the relationship between the GmGIa (E2) gene and the soybean florigen homologous genes GmFT2a (Glyma16g26660) and GmFT5a (Glyma16g04830). The diurnal expression profile of GmFT2a/5a showed that the highest level of expression was at 4 hr after dawn under SD conditions (Kong et al. 2010). We analyzed the expression of GmFT2a at 9:00 AM 4 weeks after sowing using E2 (FT2) NILs grown under natural day-length conditions whose photoperiod changed from LD to SD for soybean because of the sowing in midsummer. A clear association between the GmFT2a expression level and the early flowering phenotype was observed in both NILs ( Figure 5). However, there was no significant difference in the GmFT5a expression levels between these NILs (data not shown). These results suggest that GmGIa probably controlled flowering time through the regulation of GmFT2a. The recessive alleles, whether generated naturally or artificially, were perhaps unable to suppress GmFT2a expression and resulted in the early flowering phenotype. In addition to studying the molecular mechanisms of E2 gene functions, we considered it important to evaluate the genetic stability of the E2 allele and its interaction with the environment, and we therefore compared the genetic effects of the E2 gene under different latitudinal conditions.
Environmental stability of the genetic effects of the E2 allele: We performed fine-mapping experiments and progeny tests at two different locations, Tsukuba (36°03′ N, 140°04′ E) and Sapporo (43°07′ N, 141°39′ E). Differences between these experimental locations provided us with information about the environmental stability of the E2 allele. Two-way ANOVA was used to estimate the variance components associated with genetic and environmental effects and with the interaction between the genetic and environmental conditions (G × E; Table 3). All factors were statistically significant, but most of the variance was accounted for by the environmental (77.5%) and genetic (15.2%) components. This indicated that differences in photoperiod, caused by the differences in latitude, probably had significant effects on suppressing the initiation of the flowering process. On the other hand, a small G × E interaction (0.1%) indicated that the genetic effect of the E2 allele was invariable between the different environments. Comparisons of the genetic effect of the E2 allele in three different experiments support the stability of the E2 allele because the additive effects observed in Tsukuba, in Sapporo, and in the mutant analysis were 5.2, 4.3, and 3.9 days, respectively.

DISCUSSION
In this study, we have isolated the gene responsible for the soybean maturity locus E2 using an RHL-derived map-based cloning strategy. We used the same strategy to identify the gene responsible for the target QTL of another maturity locus, E3, in a previous study (Watanabe et al. 2009). These results indicate that the RHL strategy is an extremely useful method for determining the position of a QTL and for identifying the responsible gene. The probability of discovering RHLs for a target QTL depends on the ratio of heterozygosity in a population and the size of the population. If p is the probability of heterozygosity in a population of size n, then the probability of detecting k individuals with a heterozygous genotype is $C(n, k)\,p^{k}(1 - p)^{n - k}$ on the basis of a binomial distribution. In the case of an F7 generation of RILs, the ratio of heterozygosity (p) is 1.56%, and with a population size of 200 (n), the probability of detecting at least one RHL is >0.95. In our case, we discovered one and two RHLs from RILs, consisting of 156 lines, for fine mapping of the genes responsible for E2 and E3 (Watanabe et al. 2009), respectively. We therefore propose that a combination of QTL analysis, using an F6-F8 RIL population, and the RHL strategy is useful for dissecting QTL for agronomic traits in crops where backcrossing is both time-consuming and laborious. Indeed, in these generations the homozygous ratio is sufficiently high to evaluate traits with replication, while the heterozygosity ratio is still high enough to allow the identification of a sufficient number of RHLs.
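The binomial argument above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names are illustrative, and the heterozygosity formula assumes simple selfing from a fully heterozygous F1 without selection, so that the expected heterozygous fraction at a locus halves each generation. Under that assumption it reproduces the value of p = 1.56% quoted for the F7 generation.

```python
from math import comb


def heterozygosity(generation: int) -> float:
    """Expected heterozygous fraction at one locus in generation F_g.

    Assumption (not stated in the text): pure selfing from an F1 that is
    heterozygous at the locus, so the fraction halves each generation.
    """
    return 0.5 ** (generation - 1)


def prob_exactly_k(n: int, k: int, p: float) -> float:
    """Binomial probability of exactly k heterozygous lines among n lines."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def prob_at_least_one(n: int, p: float) -> float:
    """Probability of finding at least one heterozygous line (1 - P(k = 0))."""
    return 1.0 - prob_exactly_k(n, 0, p)


if __name__ == "__main__":
    p_f7 = heterozygosity(7)  # 1/64, i.e. about 1.56%
    print(f"p(F7) = {p_f7:.4%}")
    print(f"P(at least one RHL | n = 200) = {prob_at_least_one(200, p_f7):.3f}")
    print(f"P(at least one RHL | n = 156) = {prob_at_least_one(156, p_f7):.3f}")
```

With n = 200 the result is just above 0.95, in agreement with the text. With the 156 lines used here it is somewhat lower (roughly 0.91) for any single locus.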
Our results demonstrate that the gene responsible for the soybean maturity locus E2 is a GI ortholog. GI plays an important role in flowering through the control of CO and FT mRNA expression levels under inductive conditions in a wide range of plant species, including monocots and dicots such as rice and Arabidopsis (Koornneef et al. 1998; Fowler et al. 1999; Hayama et al. 2003; Mizoguchi et al. 2005). Moreover, many studies have characterized the expression profile of GI and its association with that of circadian clock genes in barley, Brachypodium distachyon, and Lemna gibba (Dunford et al. 2005; Serikawa et al. 2008; Hong et al. 2010). In these studies, the expression of GI orthologs showed circadian rhythms. Thakare et al. (2010) compared the expression of flowering-time gene orthologs in soybean using E1 NILs that have the genetic background of Harosoy. Although the E1 gene is a major maturity gene, the responsible gene has not yet been isolated (Bernald 1971; Yamanaka et al. 2004). The authors used Glyma10g36600, the gene responsible for E2, as a GIGANTEA ortholog in their expression experiments. Glyma10g36600 showed a circadian rhythm in both LD and SD conditions regardless of the e2/e2 genotype of Harosoy, and there were no significant differences in the expression patterns between the E1 NILs (Thakare et al. 2010). This indicates that the circadian rhythm of GmGIa expression was not affected by the premature stop codon in the e2 allele. Kong et al. (2010) identified functional soybean FT homologs involved in the PhyA-mediated photoperiodic response under SD conditions. In our experiment, the early flowering phenotype caused by the loss of function of GmGIa was probably related to the expression level of GmFT2a (Figure 5). On the other hand, the PhyA-mediated pathway showed larger differences in GmFT expression levels in comparisons of NILs for the PhyA genes (e3 or e3e4; Kong et al. 2010). This was probably due to differences in the pathways controlled by GmPhyA and GmGI that promote GmFT expression, to differences in the experimental conditions, or to differences in the genetic backgrounds used as experimental materials. Considering the similarities in amino acid sequence and functionality of GmGIa and the GmFTs with other GIs and FTs, respectively, the mechanisms controlling the photoperiodic pathways are most probably conserved between soybean and other plant species. In contrast, some other studies have reported genetic interactions between the E1 and E2 loci in which the recessive e1 allele suppresses or weakens the effects of the E2 allele (Bernald 1971; Yamanaka et al. 2000; Watanabe et al. 2004). Further investigations are therefore needed to clearly elucidate the relationships between functional GmGIs and other genetic components, including the E1 gene.
Redundancy among highly similar genes that originate from genome duplication can increase the chance of evolving subfunctionalized genes (Adams and Wendel 2005). Indeed, hexaploid wheat has a greater potential for successful adaptation to a wide range of environmental conditions than tetraploid wheat (Dubcovsky and Dvorak 2007). Soybean is a typical paleopolyploid species, and the duplicated regions extend over the whole genome (Shoemaker et al. 1996; Tsubokura et al. 2008; Schmutz et al. 2010). The soybean genome is considered to originate from two duplication events: the soybean lineage-specific paleotetraploidization [13-15 million years ago (Mya)] and the early duplication in legumes that occurred near the origin of the papilionoid lineage (44-59 Mya; Schlueter et al. 2004; Schmutz et al. 2010). Representative examples of such gene redundancy affecting the same agronomic trait are the two photoreceptor genes, GmPhyA2 and GmPhyA3, which underlie the photoperiod-sensitive maturity loci E4 and E3, respectively (Watanabe et al. 2009). In some cases, the E3 and E4 alleles are found to have overlapping functions in flowering responses depending on light quality and photoperiod length (Cober et al. 1996; Abe et al. 2003), and the geographical distribution of the recessive e4 allele is found to be restricted to the high-latitude regions of Japan (Kanazawa et al. 2009). These facts indicate that duplication and subfunctionalization of PhyA genes contribute to the divergence of flowering time and maturity in soybean. Such duplicated gene phenomena may well explain the situation of the GmGIs. Premature stop codon mutations in the GI genes, induced by various mutagens, were found to have severe effects on flowering as well as pleiotropic effects on other traits in Arabidopsis (Araki and Komeda 1993; Fowler et al. 1999) and pea (Hecht et al. 2007). GI is associated not only with circadian rhythms (Mizoguchi et al. 2005; Martin-Tryon et al. 2007) but also with several phenotypes and physiological processes, including photomorphogenesis (Paltiel et al. 2006; Oliverio et al. 2007), cold stress response (Cao et al. 2005), oxidative stress tolerance (Kurepa et al. 1998; Cao et al. 2006), and starch accumulation (Eimert et al. 1995). Association analysis using flowering genes in Arabidopsis has shown that the FRIGIDA and FLC genes, which are related to the vernalization pathway, contribute to geographic adaptation. However, SNPs in the GI gene, not related to the null allele, had only small effects on flowering time under LD conditions in natural accessions or in inbred lines derived from intercrosses with multiple parental lines. Although GI-overexpressing transgenic rice plants and specific RNA interference lines have been evaluated, a natural loss-of-function GI rice mutant has not yet been discovered. In barley, a correlation between the GI locus and QTL related to flowering traits has also not yet been reported (Dunford et al. 2005).
The location in the 5th intron of marker 4, originating from AFLP E60M38, is represented by the shaded bar. (B) The substitution of nucleotides and amino acid residues (in parentheses) in the coding sequence are displayed above the boxes that represent each exon. Nucleotide and amino acid changes in the case of nonsynonymous and nonsense mutations from the ft2 to the FT2 allele, on the basis of the start codon position, are shown. In the early flowering ft2 allele, a single nucleotide substitution causing a premature stop codon was discovered in the 10th exon. The mutant line, harboring a nucleotide deletion in the same exon as the ft2 allele, had a truncated GI protein that similarly caused the early flowering phenotype under natural day-length conditions. The truncated protein lengths of the three alleles are compared at the bottom.
In contrast to species that harbor a single GI gene in their genome, in soybean the loss of function of E2 has little influence on growth except for phenotypes directly associated with flowering. The high level of sequence similarity (97%) between GmGIa (Glyma10g36600) and GmGIb (Glyma20g30980) indicates that these genes originated from the soybean lineage-specific duplication. Although we have no experimental data regarding the functions of GmGIb, it is likely that these duplicated soybean GI genes cooperate or functionally compensate for each other and that the loss of function of GmGIa in soybean varieties contributes to their geographic adaptation, as with the GmPhyAs. Studies of such duplicated genes, represented by the GmGIs and GmPhyAs, especially in comparison with Vitis vinifera, which contains no recent genome duplication (Jaillon et al. 2007), or with plants that have complex genomes, such as Zea mays, T. aestivum, and Populus trichocarpa, will provide further novel insights into the functions of these genes.
In this study, we have demonstrated, through a map-based cloning strategy using the progeny of an RHL, that the gene responsible for the soybean maturity locus E2 is an ortholog of GI, GmGIa. The e2/e2 genotype caused early flowering by inducing the expression of the soybean florigen gene homolog, GmFT2a, whereas the effect of the E2 allele on flowering under different environments was stable. These results indicate that the functions of GI in flowering are conserved in soybean and that null mutations in the GI gene may be useful resources for adapting plants with complex genomes, such as soybean, to a wide range of geographic regions. More information regarding the biochemical functions of the soybean GI genes and their interactions with other maturity genes is required, and this will provide new and intriguing insights into the flowering network in plants.
Figure 5. Expression analysis of GmFT2a between NILs of E2 under natural day-length conditions. Two pairs of NILs were used. Quantitative RT-PCR experiments for GmFT2a were performed using RNA isolated from trifoliolate leaves sampled at 9:00 AM 4 weeks after sowing. A representative polyacrylamide gel demonstrating the expression levels is shown. Each bar indicates the average expression level of GmFT2a compared with that of the GmTubulin gene with three independent replications and time to flowering; standard deviations of each line are shown below the bar graph.
The white, black and gray chromosomal segments represent the Misuzudaizu, the Moshidou Gong 503 and heterozygous alleles, respectively. The first screening of AFLP markers was performed using NILs6-8 (No 1 and 2). Polymorphic markers detected between NILs were reanalyzed using subfamilies selected from the progenies of . Five AFLP fragments tightly linked to the FT2 locus were highly reproducible and were used in subsequent fine mapping experiments. Representative segregation patterns of each fragment among the NILs and RIL population using one AFLP marker, E37M31, are shown in the box and in B, respectively. The arrowhead indicates the polymorphic band detected between NILs, and asterisks indicate the polymorphic markers detected between Misuzudaizu and Moshidou Gong 503 originating from other genetic loci.

TABLES S1-S7
Tables S1-S7 are available for download as an Excel file at http://www.genetics.org/cgi/content/full/genetics.110.125062/DC1.
Table S1: Data for standard deviation in RILs
Table S2: Raw data for flowering time in RILs
Table S3: Phenotypic variation in parental lines and RHL6-8
Table S4: Raw data for fine mapping experiments
Table S5: Frequency of flowering time classified with the line and QTL genotype
Table S6: Average values for fine mapping and NIL evaluation
Table S7: Raw data for progeny test