Three sex-determining (SD) genes, SRY (mammals), Dmy (medaka), and DM-W (Xenopus laevis), have been identified to date in vertebrates. However, how and why a new sex-determining gene appears remains unknown, as do the switching mechanisms of the master sex-determining gene. Here, we used positional cloning to search for the sex-determining gene in Oryzias luzonensis and found that GsdfY (gonadal soma derived growth factor on the Y chromosome) has replaced Dmy as the master sex-determining gene in this species. We found that GsdfY showed high expression specifically in males during sex differentiation. Furthermore, the presence of a genomic fragment that included GsdfY converts XX individuals into fertile XX males. Luciferase assays demonstrated that the upstream sequence of GsdfY contributes to the male-specific high expression. Gsdf is downstream of Dmy in the sex-determining cascade of O. latipes, suggesting that emergence of the Dmy-independent Gsdf allele led to the appearance of this novel sex-determining gene in O. luzonensis.
In most vertebrates, sex is determined genetically. Mammals and birds with cytogenetically well-differentiated sex chromosomes have sex determination systems that differ between the taxonomic classes but not within them (Solari 1994). In mammals, for example, the sex-determining (SD) gene SRY/Sry on the Y chromosome has a universal role in sex determination (Gubbay et al. 1990; Sinclair et al. 1990; Koopman et al. 1991; Foster et al. 1992). By contrast, some fish groups, such as salmonids, sticklebacks, and Oryzias fishes, have sex chromosomes that differ among closely related species (Devlin and Nagahama 2002; Woram et al. 2003; Takehana et al. 2007a; Ross et al. 2009).
A DM-domain gene, Dmy, was the first SD gene identified in a nonmammalian vertebrate, the fish medaka Oryzias latipes (Matsuda et al. 2002, 2007). In this species, the term Y chromosome is employed to refer to a recombining chromosome that carries the male-determining gene Dmy, and X is used for the homologous chromosome; these are not a heteromorphic pair. This gene is conserved among all wild populations of O. latipes examined to date (Shinomiya et al. 2004). The closely related species O. curvinotus also has Dmy on its Y chromosome, which is orthologous to the O. latipes Y chromosome (Matsuda et al. 2003). However, Dmy has not been detected in any other type of fish, including other Oryzias fishes (Kondo et al. 2003). Analysis of the Y-specific region of the O. latipes sex chromosome has demonstrated that Dmy arose from duplication of the autosomal Dmrt1 gene (Nanda et al. 2002; Kondo et al. 2006). This Dmrt1 duplication is estimated to have occurred within the last 10 million years in a common ancestor of O. latipes, O. curvinotus, and O. luzonensis. In O. luzonensis, however, no functional duplicated copy of Dmrt1 has been detected (Kondo et al. 2003) (Figure 5A).
O. luzonensis possesses an XX–XY system, which is homologous to an autosomal linkage group (LG 12) in O. latipes (Hamaguchi et al. 2004; Tanaka et al. 2007). This species, like O. latipes, has homomorphic sex chromosomes without recombination suppression between them. This supports the hypothesis that Dmy lost its SD function and disappeared after a new SD gene appeared in O. luzonensis. O. luzonensis may, therefore, be very informative for studying the evolution of master SD genes and of the early stages of sex-chromosome differentiation.
Materials and Methods
O. luzonensis was collected by M. J. Formacion and H. Uwa at Solsona, Ilocos Norte, Luzon, the Philippines, in 1982, and has been maintained as a closed colony (Formacion and Uwa 1985). In the d-rR strain, the wild-type allele R of r (a sex-linked pigment gene) is located on the Y chromosome. The body of the female was white, whereas that of the male was orange-red (Hyodo-Taguchi and Sakaizumi 1993). These fishes were supplied by a subcenter (Niigata University) of the National BioResource Project (medaka) supported by the Ministry of Education, Culture, Sports, Science, and Technology of Japan.
DNA and RNA extraction
Total RNA and genomic DNA were extracted from each hatched embryo after homogenization in a 1.5-ml tube with 350 μl RLT buffer supplied with the RNeasy Mini kit (Qiagen). The homogenized lysates were centrifuged and supernatants were used for RNA extraction with the RNeasy Mini kit and the RNase-Free DNase set protocol (Qiagen). Precipitated material was used for DNA extraction by using the DNeasy tissue kit (Qiagen) according to the manufacturer’s protocol.
Genotyping of the SD region was conducted by using genomic PCR of fin clip DNA. Genomic PCR was performed by using four sets of primers designed in the SD region (Supporting Information, Table S1). PCR conditions were as follows: 5 min at 95°, followed by 35 cycles of 20 s at 95°, 30 s at 58°, and 1 min at 72°, followed by 5 min at 72°.
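As a rough planning aid, the programmed hold time of the genotyping PCR can be worked out from the cycling conditions given above. The following is a minimal sketch: it ignores ramping between temperatures (an assumption, since ramp rates are not stated), and the function name is hypothetical.

```python
# Cycling program of the genotyping PCR, taken from the conditions in the text.
# Ramp times between temperature steps are NOT included (assumption of this sketch).
INITIAL_DENATURATION_S = 5 * 60      # 5 min at 95°
CYCLE_STEPS_S = [20, 30, 60]         # 20 s at 95°, 30 s at 58°, 1 min at 72°
N_CYCLES = 35
FINAL_EXTENSION_S = 5 * 60           # 5 min at 72°


def program_hold_time_s(initial_s, steps_s, cycles, final_s):
    """Total programmed hold time in seconds, excluding ramping."""
    return initial_s + cycles * sum(steps_s) + final_s


total_s = program_hold_time_s(
    INITIAL_DENATURATION_S, CYCLE_STEPS_S, N_CYCLES, FINAL_EXTENSION_S
)
print(f"{total_s} s = {total_s / 60:.1f} min")
```

The real run time on a thermal cycler will be longer because of heating and cooling between steps.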
Construction of genomic libraries and chromosome walking
A BAC genomic library (LMB1) was constructed from cultured cells that were derived from an embryo that was produced by mating an XX female with a sex-reversed XX male. The cultured cells were embedded in agarose gel and then partially digested with SacI. The fragments in the size range of 150–225 kb were selected. The size-selected DNA fragments were ligated to pKS145 vector and used to transform DH10B. A total of 36,864 BAC clones were picked and arrayed to 384-well microtiter plates and then 3D DNA pools were constructed for clone screening by PCR. Chromosome walking started at OluX2-8. Inserted end fragments of positive BAC clones were amplified by using vectorette PCR and then used to assemble positive clones (Arnold and Hodgson 1991). An amplified end fragment at the far end of the SD side was used in the subsequent screening of the BAC library.
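The logic of screening an arrayed library through 3D pools can be illustrated with a short sketch. The text gives only the clone count and the use of 3D pools; the pooling dimensions used here (one pool per plate, per row, and per column of a standard 16 × 24 plate) and all names in the code are assumptions for illustration, not the scheme actually used.

```python
from __future__ import annotations

from typing import NamedTuple

ROWS, COLS = 16, 24        # layout of a standard 384-well plate
TOTAL_CLONES = 36_864      # number of BAC clones stated in the text


class Well(NamedTuple):
    plate: int
    row: int
    col: int


def n_plates(total_clones: int, wells_per_plate: int = ROWS * COLS) -> int:
    """Number of plates needed to array all clones (ceiling division)."""
    return -(-total_clones // wells_per_plate)


def pools_for(well: Well) -> tuple[str, str, str]:
    """Plate, row, and column pools that contain the clone in this well."""
    return (f"P{well.plate}", f"R{well.row}", f"C{well.col}")


def decode(positive_pools: set[str], plates: int) -> list[Well]:
    """Wells consistent with a set of PCR-positive pools."""
    return [
        Well(p, r, c)
        for p in range(plates) if f"P{p}" in positive_pools
        for r in range(ROWS) if f"R{r}" in positive_pools
        for c in range(COLS) if f"C{c}" in positive_pools
    ]


plates = n_plates(TOTAL_CLONES)
n_pools = plates + ROWS + COLS
target = Well(plate=41, row=7, col=19)   # hypothetical positive clone
hits = decode(set(pools_for(target)), plates)
print(plates, n_pools, hits)
```

Under these assumptions a single positive clone is located with one PCR per pool instead of one PCR per clone. When several clones are positive at once, the decoded list also contains false candidates that need an individual confirmation PCR.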
A fosmid library was constructed with the pCC1FOS vector (Epicentre Technology), following the manufacturer’s protocol, from a YY individual obtained by mating an XY male with a sex-reversed XY female. We made 48 fosmid clone pools, which contained 2000 fosmid clones per tube. Seven fosmid clones that correspond to the SD region in the X chromosome were isolated by the PCR screening method, which decreased the number of fosmid clones in the pool by three PCR steps.
BAC DNA was hydrodynamically sheared to average sizes of 1.5 and 4.5 kb, and the DNA was ligated into a pUC18 vector. We sequenced each BAC to a coverage of 13× by using dye-terminator chemistry. Individual BACs were assembled from the shotgun sequences by using Phred version (ver.) 0.000925.c, Crossmatch ver. 0.990319, and Phrap ver. 0.990319 (Codon Code), as well as PCP ver. 2.1.6 and Cap4 ver. 2.1.6 (Paracel). The gaps in each BAC were closed with a combination of BAC walking, directed PCR, and resequencing of individual clones. The sequence of the fosmid clones was determined by using the same method.
The predicted mature domain of GSDF, additional members of the transforming growth factor (TGF)-β superfamily, and another human cystine-knot cytokine (brain-derived neurotrophic factor, BDNF) were aligned by using molecular evolutionary genetics analysis (MEGA) ver. 3.1 software (http://www.megasoftware.net). The GenBank accession numbers of the aligned amino acid sequences are as follows: human TGF-β1, NP_000651.3; mouse TGF-β1, NP_035707.1; zebrafish TGF-β1a, NP_878293.1; rainbow trout TGF-β1, CAA67685.1; medaka TGF-β1, ENSORLP00000001563; human growth-differentiation factor 5, NP_000548.1; mouse GDF5, NP_032135.2; zebrafish GDF5, CAA72733.1; medaka GDF5, ENSORLP00000003714; human inhibin α (INHα), NP_002182.1; mouse INHα, NP_034694.3; zebrafish INHα, CAK11253.1; rainbow trout INHα, BAB19272.1; medaka INHα, ENSORLP00000002713; human anti-Mullerian hormone (AMH/MIS), NP_000470.2; mouse AMH, NP_031471.2; zebrafish AMH, NP_001007780.1; medaka Amh, NP_001098198.1; zebrafish GSDF, NP_001108140.1; rainbow trout GSDF, ABF48201.1; medaka GSDF, NP_001171213.1; human BDNF, NP_001137277.1; mouse BDNF, NP_031566.4; zebrafish BDNF, NP_571670.2; and rainbow trout BDNF, ACY54685.1.
RT–PCR was performed by using a One-Step RT–PCR kit (QIAGEN). Aliquots (20 ng) of total RNA samples were used as templates in 25-μl reaction volumes.
The PCR conditions were: 30 min at 55°; 15 min at 95°; cycles of 20 s at 96°, 30 s at 55°, and 60 s at 72°; and 5 min at 72°. The number of cycles for each gene was adjusted to be within the linear range of amplification, specifically 35 cycles for predicted genes (PGs) and 24 cycles for β-actin. Specific primers for PGs were designed in each exon (Table S1).
Expression levels were quantified by using RNA from the body trunks of fry from −2 to 10 days after hatching (dah). Concentrations were adjusted to a total of 5 ng for each real-time assay. Using base substitutions between GsdfX and GsdfY (gonadal soma derived growth factor on the Y chromosome), primers were designed to examine the expression profiles of GsdfX and GsdfY (Table S1). Quantitative gene expression analysis was performed on an ABI PRISM 7000 (ABI) using a One-Step SYBR Prime-Script RT–PCR kit (Takara Bio). The PCR conditions were 5 min at 42°, 10 s at 95°, and then 40 cycles of 5 s at 95° and 30 s at 65°.
In situ hybridization
Fry at 5 dah and adult gonads were fixed in 4% paraformaldehyde in PBS at 4° overnight. Digoxigenin (DIG)-labeled RNA probes were generated by in vitro transcription with a DIG RNA labeling kit (Roche, Basel, Switzerland) from a GsdfY cDNA plasmid. Sections were deparaffinized, hydrated, treated with proteinase K (10 μg/ml) at 37° for 5 min, and hybridized with the DIG-labeled antisense RNA probes at 60° for 18–24 hr. Hybridization signals were detected by using an alkaline phosphatase-conjugated anti-DIG antibody (Roche) with NBT/BCIP (Roche) as the chromogen.
We made two constructs to obtain transgenic lines. First, we inserted a fluorescent reporter gene into the fosmid clone containing GsdfY (OluFY3-1); this was a crystal lens-specific crystallin-γM2 promoter driving red fluorescent protein (RFP). The fluorescent reporter was inserted into the fosmid vector by using a Quick and Easy BAC Modification kit (Gene Bridges, Dresden, Germany), which relies on homologous recombination in Escherichia coli. This construct (construct 1) contained 3.5 kb of the coding sequence, 20 kb of the upstream region, and 13 kb of the downstream region. By removing other PGs, we obtained the second construct (construct 2), which contained only GsdfY, by using In-Fusion (Takara Bio) methods. We amplified two fragments: a 7.3-kb genomic sequence containing 3.5 kb of the coding region, 1.8 kb of the upstream region, and 2 kb of the downstream region, and a fosmid vector sequence. We then cloned the GsdfY fragments into vectors by using an In-Fusion Advantage PCR Cloning kit (Takara Bio). Reporter gene integration was similarly achieved.
Fertilized eggs were collected within 20 min of spawning and were microinjected. We used DNA at 10 ng/μl in Yamamoto’s solution (133 mM NaCl, 2.7 mM KCl, 2.1 mM CaCl2, 0.2 mM NaHCO3; pH 7.3). The injected eggs were incubated at 27° until hatching.
The GsdfY and GsdfX luciferase reporter plasmids (Luc Y and Luc X) were generated by cloning the 3-kb upstream region of each Gsdf into the vector (pGL.4.14; Promega) by using the In-Fusion Advantage PCR Cloning Kit (Takara Bio) with designed primers (Table S1). Modified reporter plasmids (Luc 1–6) were generated on the basis of Luc Y and Luc X by using In-Fusion methods. HEK293 cells were cultured at 37° in DMEM (Invitrogen) supplemented with 10% fetal bovine serum (Gibco); 2.5 × 10⁴ cells were plated in each well of 96-well plates 24 hr before transfection. The cells were transfected with 100 ng of the GsdfY luciferase reporter, GsdfX luciferase reporter, or modified luciferase reporters, and 100 ng of TK-Renilla luciferase plasmid (pGL.4.79; Promega) with Lipofectamine 2000 reagent (Invitrogen) and opti-MEM (Invitrogen). After 40 hr, luciferase assay was performed with the Dual-Glo Luciferase Reporter Assay system (Promega) and a Wallac 1420 ARVO-SX multilabel counter (Perkin Elmer). The levels of firefly luciferase activity were normalized against Renilla luciferase activity. At least three independent experiments were performed.
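The normalization step can be sketched as follows. This is an illustrative calculation only: the text states that firefly activity was normalized against Renilla activity but does not describe how replicates were combined, so averaging the per-well ratios and expressing one reporter relative to another are assumptions, and the counts below are invented placeholders rather than data from the study.

```python
from statistics import mean


def normalized_activity(firefly: float, renilla: float) -> float:
    """Firefly signal divided by the co-transfected Renilla control signal."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


def relative_activity(sample_wells, reference_wells) -> float:
    """Mean normalized activity of a sample relative to a reference reporter."""
    sample = mean(normalized_activity(f, r) for f, r in sample_wells)
    reference = mean(normalized_activity(f, r) for f, r in reference_wells)
    return sample / reference


# Hypothetical raw counts as (firefly, Renilla) pairs, one pair per experiment.
luc_x = [(1000.0, 500.0), (1200.0, 600.0), (900.0, 450.0)]
luc_y = [(4000.0, 500.0), (4400.0, 550.0), (3600.0, 450.0)]

fold = relative_activity(luc_y, luc_x)
print(f"Luc Y / Luc X = {fold:.1f}")
```

Dividing by the Renilla signal corrects each well for differences in cell number and transfection efficiency before reporters are compared.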
Results and Discussion
Nine genes were predicted in the SD region
The SD region of O. luzonensis maps between eyeless and 171M23F on LG 12 (Tanaka et al. 2007). We performed further linkage analysis and obtained two male recombinants for this region. One male had a recombination breakpoint between OluX2-8 and OluX2-25, and the other had a breakpoint between OluX3-34 and OluX4-6, refining the SD region to between OluX2-8 and OluX4-6 (Figure 1A).
We constructed a BAC library of an XX fish and a fosmid library of a YY fish, and we made physical maps of the SD region of the X and Y chromosomes. This region was covered with two BAC clones (OluBXKN2 and OluBXKN1) on the X chromosome and with seven fosmid clones (OluFY13-1, OluFY24-1, OluFY18-1, OluFY3-1, OluFY8-1, OluFY7-1, and OluFY29-1) on the Y chromosome (Figure 1A). The entire nucleotide sequence was determined by using shotgun sequencing, except for a repetitive region in OluFY3-1 and OluBXKN1. Restriction analysis of both clones demonstrated that the length of the repetitive region was the same for both chromosomes (data not shown). The SD region is ∼180 kb for the X and Y chromosomes, and both chromosomes exhibit high sequence identity with no large deletions or insertions. The gene-prediction program Genscan identified nine genes in this region; all are found on both the X and Y chromosomes (Figure 1A).
GsdfY is responsible for male-specific high expression during sex differentiation
To examine whether the predicted genes (PGs) are expressed during sexual differentiation, we performed RT–PCR for each PG. The first difference in germ cell number is seen 3 dah in O. luzonensis (Nakamoto et al. 2009). Given that expression of the SD gene Dmy precedes the first morphological gonadal difference in O. latipes, the SD gene of O. luzonensis should function sometime before 3 dah. RT–PCR detected the expression of seven of the nine genes at 0 dah (Figure 1B). Only one gene, PG5, shows higher expression in XY embryos than in XX embryos.
We determined the full-length mRNA sequence of PG5 on the X and Y chromosomes using 5′ and 3′ RACE. The longest open reading frame (ORF) spanned five exons and encodes a putative protein of 215 amino acids (Figure 2, A and B). The N-terminal regions are rich in hydrophobic amino acid residues and are followed by a potential cleavage site comprising Ala and Phe (amino acid residues 19 and 20; Figure 2B). Phylogenetic analysis of the mature domain of the cystine-knot cytokines revealed that the PG5 sequence is found in the same clade as Gsdf, which is a member of the TGF-β superfamily (Figure 2C). When we compared Gsdf on the X chromosome (GsdfX) with that on the Y chromosome (GsdfY), we found 12 base substitutions in the full-length mRNA, including two synonymous substitutions in the ORF; however, the amino acid sequences of GsdfX and GsdfY are the same.
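The distinction between synonymous and amino acid-changing substitutions in the ORF can be made concrete with a small sketch. The two sequences below are toy examples, not the GsdfX and GsdfY sequences, and the function names are hypothetical. The sketch assumes aligned, in-frame ORFs of equal length and classifies each differing codon by whether the encoded amino acid changes under the standard genetic code.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: aa for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(orf: str) -> str:
    """Translate an in-frame ORF into one-letter amino acid codes."""
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def classify_substitutions(orf_x: str, orf_y: str) -> dict:
    """Count base differences, split by whether the codon's amino acid changes."""
    if len(orf_x) != len(orf_y) or len(orf_x) % 3 != 0:
        raise ValueError("ORFs must be aligned, in frame, and of equal length")
    counts = {"synonymous": 0, "nonsynonymous": 0}
    for i in range(0, len(orf_x), 3):
        codon_x, codon_y = orf_x[i:i + 3], orf_y[i:i + 3]
        n_diff = sum(a != b for a, b in zip(codon_x, codon_y))
        if n_diff == 0:
            continue
        same_aa = CODON_TABLE[codon_x] == CODON_TABLE[codon_y]
        counts["synonymous" if same_aa else "nonsynonymous"] += n_diff
    return counts


toy_x = "ATGGCTTTCAAA"   # toy ORF encoding M-A-F-K
toy_y = "ATGGCCTTTAAA"   # two third-position changes, same protein
counts = classify_substitutions(toy_x, toy_y)
print(counts, translate(toy_x) == translate(toy_y))
```

In the toy pair both differences fall on third codon positions, so the two alleles encode the same protein, which is the pattern the text reports for the GsdfX and GsdfY ORFs.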
Using the base substitutions between GsdfX and GsdfY, we examined the expression profiles of GsdfX and GsdfY using real-time PCR. Expression of Gsdf was higher in XY embryos than in XX embryos from 2 days before hatching (dbh) to 10 dah (Figure 3A). In the XY embryo, GsdfY expression was higher than GsdfX expression at 0 dah, whereas it was similar to GsdfX expression at 5 and 10 dah.
At 5 dah in the developing gonads, supporting cells surrounding the germ cells expressed Gsdf in both XY and XX embryos, although Gsdf expression was much higher in XY embryos (Figure 3, B and C). In the adult testis, Gsdf was detected in the Sertoli cells around type A spermatogonia (Figure 3D); in the adult ovary, Gsdf was expressed in the granulosa cells surrounding well-developed oocytes (Figure 3E).
GsdfY induced fertile XX male in O. luzonensis
We performed overexpression experiments using a GsdfY genomic clone. First, we used a fosmid clone (OluFY3-1) that spans 20 kb upstream and 13 kb downstream of GsdfY. Construct 1, containing GsdfY, PG3, and PG4, was injected into one-cell–stage embryos of O. luzonensis (Figure S1A). In generation zero (G0), we obtained 54 adult fish with the transgene, one of which was a sex-reversed XX male (Table 1). We mated the XX male with a normal female to obtain G1 progeny, and G2 progeny were obtained from an XX male of the G1 progeny. All fish bearing the transgene developed as males in both the G1 and G2 progeny, whereas all fish without the transgene developed as females. Consequently, we established a transgenic strain (strain 1) whose sex was determined by the transgene construct 1. Next, we made a construct (construct 2) that contained 3.5 kb of GsdfY, as well as 1.8 kb of its upstream region and 2 kb of its downstream region, but no other predicted genes (Figure S1B). As with the previous transgenic experiment, we established a strain (strain 2) whose sex was determined by the transgene (Table 1). To confirm the mRNA expression of both strains, we examined embryos at 0 dah by using real-time PCR. XX embryos carrying the transgene expressed higher levels of Gsdf than did XX embryos without the transgene in both strains (data not shown).
GsdfY-specific mutations involved in high expression
We hypothesized that there were GsdfY sequences specific for the high expression within construct 2. According to Gautier et al. (2011), the Gsdf proximal gene promoter harbors evolutionarily conserved cis-regulatory motifs among fish species. To find these sequences, we compared 1.8 kb upstream and 2.0 kb downstream of GsdfY with those of GsdfX and Gsdf in O. latipes. We found 13 substitutions between the X and Y in the upstream region, 9 of them GsdfY-specific mutations, and 31 between the X and Y in the downstream region (including 20 GsdfY-specific mutations) (Figure 4, A and B). We used a luciferase assay to assess the 9 GsdfY-specific upstream mutation sites. The GsdfY reporter plasmid with all mutations in the Y-type allele (Luc Y) showed higher luciferase activity than the GsdfX reporter plasmid (Luc X) (Figure 4C). Luciferase activity was significantly decreased in recombinant constructs Luc 3–6, whereas two constructs (Luc 1 and 2) showed high luciferase activity, equal to that of Luc Y. Because the constructs yielding high expression all had Y-type mutations 1, 2 or 3–6 in addition to mutations 6–9, we conclude that Y-type mutations 6–9 are necessary for the high expression and that either 1, 2, or 3–6 Y-type mutations are also required.
GsdfY induced sex-reversal in O. latipes
In O. latipes, the ortholog of GsdfX/Y is located on an autosome (LG12). Gsdf in XY fish shows significantly higher expression levels compared with that in XX fish during sex differentiation, suggesting that expression levels of Gsdf are directly or indirectly controlled by Dmy (Shibata et al. 2010). To examine whether Dmy-independent expression of GsdfY induces sex reversal in O. latipes, we injected construct 1 into one-cell–stage embryos of the d-rR strain of O. latipes. Consequently, we established an O. latipes strain (strain 3) whose sex was determined by GsdfY from O. luzonensis (Table 1). Real-time PCR revealed that this strain showed high expression of GsdfY in an XX embryo at 0 dah (data not shown).
The evolutionary process leading to a novel SD gene
Our results strongly suggest that GsdfY is the SD gene in O. luzonensis and represents a new SD gene in vertebrates. Three SD genes, SRY, Dmy, and DM-W, have been identified (Yoshimoto et al. 2008). These genes encode transcription factors, whereas Gsdf encodes a secretory protein belonging to the TGF-β superfamily and was originally identified as a factor controlling the proliferation of primordial germ cells and spermatogonia in rainbow trout (Sawatari et al. 2007). Since homologous sequences with high similarity to Gsdf have not been found in nonpiscine species, Gsdf is likely unique to teleosts. The three SD genes are not allelic. Dmy and DM-W might have emerged by duplication of DMRT1 and are located on the Y and W chromosomes, respectively (Sawatari et al. 2007; Yoshimoto et al. 2008). SRY is believed to have arisen from SOX3 130–170 million years ago (mya), suggesting that it was formerly allelic to SOX3 (Marshall-Graves 2002). Although GsdfY appeared in the same way as SRY, it remains allelic to GsdfX likely because of its more recent origin (within 5 mya) (Tanaka et al. 2007).
Expression analysis and our reporter assay suggest that cis-regulatory sequences of GsdfY are involved in higher expression of the gene in males (Figures 3 and 4). In silico analysis of the regulatory motif suggested that the sequences containing 6–9 mutations are a steroidogenic factor 1 (SF1) binding site (i.e., SF1 can bind upstream of GsdfX but not of GsdfY). GsdfY may have evolved from ancestral Gsdf by acquiring high expression during an earlier stage of sex determination via a change in the SF1 binding site. In O. latipes, Gsdf shows high expression specifically in males during sex differentiation (Shibata et al. 2010). Since Dmy determines sex in O. latipes, the sex-specific high expression of Gsdf should be triggered by Dmy in this species. However, the transgene expressing GsdfY in O. latipes is sufficient to induce fertile XX males (Table 1). During O. luzonensis sex differentiation, other genes, such as Sox9a2, Dmrt1, and Foxl2, which are presumably downstream of Gsdf, show expression patterns similar to those in O. latipes (Nakamoto et al. 2009). Taken together, these results imply that O. luzonensis and O. latipes share a common sex differentiation pathway downstream of Gsdf and that, if high Gsdf expression can be achieved during sex differentiation, then the XX embryo will develop as a male without Dmy.
Wilkins proposed that sex-determination pathways grow by the successive addition of upstream control elements to an ancient conserved downstream module (Wilkins 1995). For example, in Drosophila, doublesex determined the sex in the ancestral state. Then, sex-related genes were added in succession upstream of doublesex to give the present SD cascade (Pomiankowski et al. 2004). In O. luzonensis, the scenario is somewhat different (Figure 5, A and B). Gsdf was downstream of Dmy in the ancestor of O. luzonensis. Mutations involved in high expression of Gsdf without the Dmy signal then accumulated, until the expression exceeded the threshold that determines male development, leading to the new SD gene GsdfY. If these mutations induced high expression independently of Dmy, individuals with either Dmy or GsdfY would develop as males, and those with neither Dmy nor GsdfY would develop as females. Since mating occurs only between males (with either Dmy or GsdfY) and females (with neither Dmy nor GsdfY), the sex ratio did not become skewed toward males. In this population, two SD genes (Dmy and GsdfY) could temporarily coexist. Finally, if the chromosome with Dmy is lost from this population, the master SD gene Dmy would be replaced by GsdfY. We conclude that SD cascades can also evolve by expression of a downstream gene becoming independent of an existing sex-determining gene, and usurping control of the downstream cascade.
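The sex-ratio argument in this scenario can be checked with a simple expectation. The sketch below is a deliberately reduced model, not an analysis from the study: it assumes that every father carries exactly one hemizygous male-determining factor (either Dmy or GsdfY), that mothers carry neither, and that a father transmits his factor to half of his offspring. The function name and the chosen frequencies are hypothetical.

```python
from fractions import Fraction


def offspring_male_fraction(freq_gsdfy_fathers: Fraction) -> Fraction:
    """Expected male fraction among offspring in a mixed Dmy/GsdfY population.

    freq_gsdfy_fathers is the fraction of fathers whose male-determining
    factor is GsdfY; all remaining fathers carry Dmy instead.
    """
    if not 0 <= freq_gsdfy_fathers <= 1:
        raise ValueError("frequency must lie between 0 and 1")
    half = Fraction(1, 2)
    freq_dmy_fathers = 1 - freq_gsdfy_fathers
    # A father passes his single factor to half of his offspring, which
    # develop as males; the other half inherit no factor and are females.
    sons_from_dmy = freq_dmy_fathers * half
    sons_from_gsdfy = freq_gsdfy_fathers * half
    return sons_from_dmy + sons_from_gsdfy


frequencies = [Fraction(0), Fraction(1, 10), Fraction(1, 2), Fraction(1)]
ratios = [offspring_male_fraction(q) for q in frequencies]
for q, ratio in zip(frequencies, ratios):
    print(f"GsdfY fathers: {q}, male offspring: {ratio}")
```

Under these assumptions the expected proportion of males is one half whatever the mix of Dmy and GsdfY fathers, which is consistent with the two SD genes coexisting without a skewed sex ratio.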
In Oryzias fishes, >20 extant species are recognized, and the sex chromosomes of 7 species have been identified: O. latipes (LG1), O. curvinotus (LG1), O. luzonensis (LG12), O. minutillus (LG8), O. dancena (LG10), O. hubbsi (LG5), and O. javanicus (LG16) (Takehana et al. 2007a,b, 2008; Nagai et al. 2008). Dmy has been detected only in O. latipes and O. curvinotus. Since sex chromosomes homologous to LG12 have not evolved repeatedly, Gsdf cannot be the SD gene in the 4 remaining species. We are now examining the role of Gsdf on the SD cascade in these species.
We thank C. L. Peichel for constructive comments on the manuscript, Y. Takehana and A. Shinomiya for advice, and the Sequencing Technology Team at the RIKEN Genomic Sciences Center for BAC and fosmid clone sequencing. This work was supported by KAKENHI via Grants-in-Aid for Scientific Research on the priority area “Comparative Genomics” from the Ministry of Education, Culture, Sports, Science, and Technology of Japan and via Grants-in-Aid for Scientific Research on the priority area “Scientific Research B” (16370094 and 20370086).
Supporting information is available online at http://www.genetics.org/content/suppl/2012/02/23/genetics.111.137497.DC1.
Communicating editor: D. Charlesworth
- Received December 5, 2011.
- Accepted February 1, 2012.
- Copyright © 2012 by the Genetics Society of America