Isolation and Characterization of Y Chromosome Sequences From the African Malaria Mosquito Anopheles gambiae
Jaroslaw Krzywinski (a), Deborah R. Nusskern (b), Marcia K. Kern (a), and Nora J. Besansky (a)
(a) Center for Tropical Disease Research and Training, Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana 46556
(b) Celera Genomics, Rockville, Maryland 20850
Corresponding author: Nora J. Besansky, University of Notre Dame, 317 Galvin Life Science Center, Notre Dame, IN 46556. E-mail: nbesansk@nd.edu
Communicating editor: M. A. F. NOOR
| ABSTRACT |
|---|
The karyotype of the African malaria mosquito Anopheles gambiae contains two pairs of autosomes and a pair of sex chromosomes. The Y chromosome, constituting ~10% of the genome, remains virtually unexplored, despite the recent completion of the A. gambiae genome project. Here we report the identification and characterization of Y chromosome sequences of total length approaching 150 kb. We developed 11 Y-specific PCR markers that consistently yielded male-specific products in specimens from both laboratory colony and natural populations. The markers are characterized by low sequence polymorphism in samples collected across Africa and by presence in more than one copy on the Y. Screening of the A. gambiae BAC library using these markers allowed detection of 90 Y-linked BAC clones. Analysis of the BAC sequences and other Y-derived fragments showed massive accumulation of a few transposable elements. Nevertheless, more complex sequences are apparently present on the Y; these include portions of an ~48-kb-long unmapped AAAB01008227 scaffold from the whole genome shotgun assembly. Anopheles Y appears not to harbor any of the genes identified in Drosophila Y. However, experiments suggest that one of the ORFs from the AAAB01008227 scaffold represents a fragment of a gene with male-specific expression.
Sex chromosomes of many groups of animals and plants originated independently from a pair of ordinary autosomes after acquisition of a major sex-determining locus.
The mosquito Anopheles gambiae, a major vector of human malaria, has a karyotype consisting of two pairs of autosomes and a pair of sex chromosomes. The Y chromosome contains a male-determining factor(s) that dominantly induces male development by its presence in an XX/XY system. The Y chromosome, constituting ~10% of the genome, remains virtually unexplored. Despite whole-genome shotgun sequencing of A. gambiae from male and female templates, no sequence data has yet been assigned to that chromosome.
We initiated a study aiming at the isolation of A. gambiae Y chromosome sequences for two main reasons. First, identification of such sequences may allow new insights into the evolution of Y chromosome sequence and structure, which is of great general interest. Second, it may allow development of new markers that would lead to a better characterization of anopheline population history and geographic structure. Thorough understanding of population structure and gene flow among A. gambiae populations is critical for effective implementation of malaria control strategies. Evidence based on existing markers generally suggests that this species, despite being broadly distributed across sub-Saharan Africa, has a shallow population structure and a strikingly weak effect of distance on differentiation.
Male-specific markers are difficult to isolate, because the nonrecombining Y chromosome consists mostly of repetitive elements that share high similarity with sequences on other chromosomes. Yet Y chromosome DNA fragments have been identified in a variety of organisms, primarily mammals, but also in fish, flies, and plants. The strategies used in those studies include construction of libraries from flow-sorted Y chromosomes.
| MATERIALS AND METHODS |
|---|
DNA samples:
Specimens used in the study included mosquitoes field collected in Senegal and Burkina Faso in 1997 and F1 progeny of females collected in Kenya in 1987.
Southern blot hybridization:
Restriction endonuclease-digested DNA was separated by electrophoresis on a 0.8% agarose gel and transferred by capillary blotting onto Hybond-N+ membranes (Amersham Biosciences) in 10× standard saline citrate (SSC) buffer.
Differential hybridization:
A λDASH II (Stratagene, La Jolla, CA) genomic library prepared from partially digested A. gambiae SUA strain DNA of both sexes was plated at ~30,000 PFU on each of the 12 (150-mm) plates. Plaque lifts using Duralose membranes (Stratagene) were performed in quadruplicate. Two sets of filters each were differentially screened with equivalent amounts of male or female A. gambiae ZAN/U strain genomic DNA radiolabeled by random priming using a HighPrime kit (Roche, Indianapolis). The filters were washed at high stringency. Phage that reproducibly hybridized only with the male total genomic DNA were plugged into 1 ml of SM phage dilution buffer and rescreened to obtain purified putative male-specific phage. The phage were harvested from liquid lysates and DNA was isolated according to a published protocol.
Subcloning and sequencing:
Phage inserts were subcloned into pBluescript SK+ (Stratagene) and, after electroporation, amplified in Escherichia coli DH10B (GIBCO BRL, Gaithersburg, MD). Plasmids were purified using a QIAprep Spin Miniprep kit (QIAGEN, Valencia, CA). PCR products were gel purified using a QIAquick Gel Extraction kit (QIAGEN) and sequenced directly or after cloning into the pGEM-T Easy vector (Promega, Madison, WI). Cloned PCR templates were PCR amplified and gel purified prior to sequencing. Sequencing was performed using ABI BigDye terminator chemistry (PE Applied Biosystems) on an ABI 377 sequencer. Sequences were assembled and verified by inspection of both strands using ABI Sequence Navigator software. Similarity searches of the obtained DNA sequences against the GenBank nr database were performed using BLASTN and BLASTX.
Bacterial artificial chromosome library screening:
To identify clones containing Y chromosome-derived inserts, a bacterial artificial chromosome (BAC) genomic DNA library (ND-TAM) constructed from DNA prepared from both males and females of the A. gambiae PEST strain was screened using the Y-specific markers.
Subtraction of BAC clones:
Preparation of driver and tester:
BAC clone DNA was sonicated in 30% glycerol until the resulting random fragments ranged from 100 bp to 2 kb in size. To eliminate fragments <100 bp, the fragmented DNA was purified using the StrataPrep PCR purification kit (Stratagene) and eluted in 50 µl of water. Driver DNA (8 µg) was biotinylated with Photoprobe biotin (Vector Laboratories, Burlingame, CA) by thermal coupling and, following purification according to manufacturer's recommendation, resuspended in 8 µl of Tris-EDTA (pH 8.0). Tester DNA (3 µg) was treated with mung bean nuclease (1 unit/µg of DNA) for 30 min at 30° to remove single-stranded extensions and, after phenol/chloroform extraction and ethanol precipitation, was resuspended in 4 µl of Tris-EDTA. Adapter sequences were generated from mixtures of complementary oligonucleotides OL1 (5'-ACCGTCGTCCATCCAGTCGCAATCC-3') and OL2 (5'-GGATTGCGACTGGATGGA-3') by heat denaturation and slow cooling to room temperature. Adapter sequences were ligated to the tester DNA for 14 hr at 14° using 400 units of T4 DNA ligase. Prior to hybridization the tester was diluted in 20 mM HEPES-HCl (pH 6.6), 50 mM NaCl, and 0.2 mM EDTA (pH 8.0).
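The adapter design described above can be checked in silico. The following is a minimal sketch in Python, using only the two oligonucleotide sequences given in the text; the helper names `reverse_complement` and `annealed_overhang` are illustrative and not part of the original protocol. It shows that OL2 is complementary to the 3' portion of OL1, so the annealed adapter should have one blunt end (available for ligation to the blunt-ended tester) and a short single-stranded 5' extension on OL1.

```python
# Oligonucleotide sequences as given in the text (written 5'->3').
OL1 = "ACCGTCGTCCATCCAGTCGCAATCC"
OL2 = "GGATTGCGACTGGATGGA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def annealed_overhang(long_oligo: str, short_oligo: str) -> str:
    """Return the unpaired 5' extension of long_oligo when short_oligo
    base-pairs with its 3' end (hypothetical helper for illustration).

    Raises ValueError if short_oligo is not complementary to the 3' end.
    """
    paired = reverse_complement(short_oligo)
    if not long_oligo.endswith(paired):
        raise ValueError("short oligo does not pair with the 3' end of the long oligo")
    return long_oligo[: len(long_oligo) - len(paired)]


if __name__ == "__main__":
    # The 18 nt of OL2 pair with the last 18 nt of OL1, leaving 7 nt unpaired.
    print(annealed_overhang(OL1, OL2))
```

Because OL1 carries the full adapter sequence, using OL1 alone as the PCR primer, as described in the subtractive hybridization step, is consistent with this layout.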
Subtractive hybridization: A total volume of 5 µl of the hybridization mixture containing 2 µg of driver DNA mixed with 40 ng of tester DNA, 50 mM HEPES-HCl (pH 8.0), 0.5 M NaCl, and 0.2 mM EDTA (pH 8.0) covered with a drop of mineral oil was denatured at 98° for 5 min, cooled to 68°, and incubated at 68° for 24 hr. Then 2 µg of freshly denatured driver DNA was added to the hybridization mixture and incubated at 68° for an additional 48 hr. DNA from the hybridization mixture was precipitated, resuspended in TBST binding buffer [0.1 M Tris (pH 7.5), 150 mM NaCl, 0.1% Tween 20] and biotinylated homo- and heteroduplexes were removed using Vectrex Avidin D (Vector Laboratories) according to the manufacturer's protocol. The resulting supernatant was used as a template in PCR amplification using the OL1 oligonucleotide as a primer. PCR products were cloned into pGEM-T Easy vector and individual clones were sequenced.
In silico searches of A. gambiae genome:
Searches for Y-linked scaffolds harboring potential coding sequences were performed as described previously.
RNA extraction and reverse transcriptase-PCR experiments:
Total RNA was extracted using TRIZOL (GIBCO BRL). Residual DNA contamination was eliminated with DNase I (Invitrogen, Carlsbad, CA). Reverse transcriptase-PCR (RT-PCR) was performed using the Superscript One-Step RT-PCR kit (GIBCO BRL). All experiments were done according to the manufacturer's protocol.
PCR assays:
PCR mixtures contained 1 µl template DNA (1/100 of the DNA extracted from a single mosquito), 1.5 mM MgCl2, 20 mM Tris-HCl (pH 8.4), 50 mM KCl, 0.2 mM each dNTP, 25 pmol of each primer, and 2.5 units Taq polymerase in a total volume of 50 µl. The PCR reactions were performed in the Perkin-Elmer (Norwalk, CT) 9600 thermocycler with an initial denaturation at 94° for 3 min, followed by 35 cycles of 94° for 20 sec, 55°–58° for 30 sec, and 72° for 1 min, followed by a final elongation step at 72° for 10 min.
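As a quick sanity check on the reaction setup, the cycling program and primer amounts above can be written down as data. This is a minimal Python sketch; the class and function names are illustrative, the annealing temperature is entered at the lower bound of the stated range, and the time estimate counts hold times only (ramping between temperatures, which depends on the instrument, is ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hold in the thermocycler program."""
    temp_c: int
    seconds: int


# Program as stated in the text; annealing is marker dependent within the stated range.
INITIAL_DENATURATION = Step(94, 3 * 60)
CYCLE = (Step(94, 20), Step(55, 30), Step(72, 60))
N_CYCLES = 35
FINAL_ELONGATION = Step(72, 10 * 60)


def hold_time_seconds() -> int:
    """Total programmed hold time, excluding temperature ramps."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL_DENATURATION.seconds + N_CYCLES * per_cycle + FINAL_ELONGATION.seconds


def primer_concentration_um(pmol: float, volume_ul: float) -> float:
    """Final primer concentration in µM (1 pmol/µl equals 1 µM)."""
    return pmol / volume_ul


if __name__ == "__main__":
    total = hold_time_seconds()
    print(f"{total} s ({total // 60} min {total % 60} s)")
    print(f"{primer_concentration_um(25, 50)} µM each primer")
```

Under these assumptions the program holds for a little over 77 min, and 25 pmol of primer in 50 µl corresponds to 0.5 µM.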
| RESULTS |
|---|
Identification of Y-linked clones through differential hybridization:
An A. gambiae phage genomic library was screened using the total genomic DNA of males or females as probes. Phage that produced signals only with the male probe were purified by rescreening. Of 34 isolated phage from which DNA was extracted, restriction digested, blotted, and probed with genomic DNA from males or females (reverse Southern blot), 16 phage hybridized uniquely or significantly more strongly with male DNA. The 10 phage giving the most distinct male-specific signals were subcloned, and selected individual fragments were used as probes in genomic Southern blot analysis to confirm association of the phage inserts with the Y chromosome. Subclones from 5 phage showed much stronger hybridization to male DNA and/or hybridized to fragments unique to males, in addition to fragments common to both sexes. Subclones from other tested phage appeared to be present in both sexes, with comparable intensity of hybridization signal.
Two phage, 3-7 and 6-8, hybridized only with male DNA on reverse Southern blots and the subclones derived from them hybridized strongly to fragments unique to males on genomic Southern blots. These two phage were characterized further by restriction mapping and subclone sequencing. GenBank nr database searches using subclone sequences revealed an accumulation in both 3-7 and 6-8 of densely packed transposable elements (TEs), including mtanga and undescribed TEs similar to mag and Pao from Bombyx mori and mdg1 from D. melanogaster.
A 4.1-kb subclone from 3-7 contained a fragment of an mdg1-like retrotransposon with its open reading frame (ORF2) interrupted by a 682-bp fragment lacking similarity to any known coding sequence, followed by a mag-like retrotransposon with an apparently complete ORF1 and a partial ORF2. To confirm Y chromosome linkage of the 3-7 clone we implemented a PCR approach based on the premise that primers designed from the Y should allow amplification exclusively from male DNA, provided target sequences on the Y had diverged enough from sequences on other chromosomes. Initially a primer pair designed within the noncoding fragment was used (3-7_4H2F 5'-CATAGTGTCATAACCAGCACG-3' and 3-7_4H2R 5'-TCTTCTTCGGGAACACGATG-3'). However, a product of the expected size was amplified from genomic DNA of both sexes, although usually with more robust product from male samples. This result together with the male-specific bands observed in the genomic Southern analysis (data not shown) suggested association of the noncoding sequence with the Y, without Y specificity. Two additional primers (mag-mdgF and mag-mdgR; Table 1), each designed within the coding sequences of the flanking retrotransposons, permitted amplification of a male-specific product, providing convincing evidence for Y linkage of the 3-7 sequences (Fig 1). Here, the PCR pinpoints a unique genomic rearrangement caused by an integration of the mag-like retrotransposon into an mdg1-like retrotransposon on the Y chromosome. The male specificity of the mag-mdg1 marker was confirmed by PCR on males and females from the PEST laboratory colony and on field-derived specimens of A. gambiae.
The nature of the noncoding region from the 4.1-kb subclone was explored further. A BLASTN search of the A. gambiae genome using the 4.1-kb subclone sequence as a query identified a single 458-kb scaffold (AAAB01008977) mapped to chromosome 2, containing sequence 97–99% identical with both the noncoding region and ORFs of the mag-like element. Remarkably, a 187-bp fragment from the 5' end of the noncoding region was directly repeated on the scaffold, with the repeats spaced by 4482 nucleotides that bore, at the amino acid level, high similarity to mag element proteins, suggesting the presence of a long terminal repeat (LTR) retrotransposon. This putative element has identical LTR sequences and is bracketed by 5-bp direct repeats (CTGTT) probably representing a duplication of the chromosomal target sequence produced during element insertion. Thus, the noncoding sequence in the 4.1-kb subclone of phage 3-7 appears to be a retrotransposon's integral part, consisting of the 5' LTR and a long putative leader region preceding ORF1. Presence of a highly similar copy of the element on chromosome 2 explains amplification of the noncoding region from both sexes, and the more abundant PCR product from males suggests that the mag-like element is present in multiple copies on the Y. However, stronger amplification in males could also result from mismatches between one of the primers and the target site on the autosomal copy of the element. The primer's 5' end contains a putative target site for the retroelement, the 5-bp sequence CATAG, located directly upstream of the LTR, which differs from the duplicated target site on chromosome 2 in four of five positions. The 3' end of the mag-like element from the 3-7 phage has not been sequenced entirely and it is not known if the CATAG sequence represents a genuine target site. If it does, the mag-like retroelement would have no sequence specificity for insertions, similar to the mag retrotransposon.
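The target-site comparison in this paragraph is easy to reproduce. The sketch below, in Python, uses only sequences quoted in the text (the forward primer 3-7_4H2F and the two 5-bp sites); the function name `count_mismatches` is illustrative. It checks that the primer begins with the putative Y-linked target site and counts the positions at which that site differs from the duplicated site on chromosome 2.

```python
def count_mismatches(site_a: str, site_b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(site_a) != len(site_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(site_a, site_b))


# Sequences quoted in the text (5'->3').
PRIMER_3_7_4H2F = "CATAGTGTCATAACCAGCACG"
Y_TARGET_SITE = "CATAG"      # putative target site upstream of the LTR in phage 3-7
CHR2_TARGET_SITE = "CTGTT"   # duplicated target site on the chromosome 2 scaffold

primer_covers_y_site = PRIMER_3_7_4H2F.startswith(Y_TARGET_SITE)
differing_positions = count_mismatches(Y_TARGET_SITE, CHR2_TARGET_SITE)

if __name__ == "__main__":
    print(primer_covers_y_site, differing_positions)
```

Only the first position (C) is shared, so the 5' end of the forward primer would be mismatched against the chromosome 2 copy at four positions, which is the basis for the alternative explanation of weaker amplification from females.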
Marker development from end sequence of Y-linked BAC clone:
The availability of the ND-TAM BAC genomic library containing DNA from both sexes of A. gambiae allowed us to screen for large DNA inserts derived from the Y chromosome using a male-specific mag-mdg1 PCR marker. This library, representing ~14 genome equivalents and containing 30,720 clones with an average insert size of 133 kb, [...] ~750 bp. Extending both sequences by primer walking revealed their complete divergence downstream from a short fragment highly similar to a middle repetitive sequence identified earlier in an A. gambiae autosomal telomere.
A microsatellite (AT)16 was found in the 104C2 sequence downstream from the region shared with 174B1. This was the first microsatellite identified as linked to the A. gambiae Y, a potentially valuable finding, because high mutation rates within microsatellite regions make them powerful markers for population genetics studies. A primer pair (104C2SP6F2 and 104C2SP6R; Table 1) designed on the flanks of the repeat amplified a male-specific PCR product of the expected size (Fig 1). Occasionally a single fragment was amplified in females; however, its size was ~1 kb larger in females than in males.
Direct sequencing of the 104C2SP6F2-R PCR products from individual PEST males resulted in initially unambiguous sequence, which deteriorated at the end of the microsatellite region. Multiple peaks observed downstream from simple repeats are often a consequence of Taq polymerase slippage on low complexity sequence; yet another explanation for this phenomenon is more likely here. BLASTN searches of the A. gambiae genome with the 104C2SP6 marker identified 12 scaffolds with microsatellite flanking sequences identical to those of the marker, but with the microsatellite regions containing from 13 to 65 dinucleotide repeats. Apparently, the 104C2SP6F2-R primers amplify several distinct Y-linked loci harboring microsatellites of different lengths. Interestingly, some of the size divergent loci seem to be closely linked in the genome. Screening of the BAC library with 104C2SP6F2-R primers identified clones containing more than one such locus. Using those BAC clones as template, the 104C2SP6F2-R primers produced either two fragments easily separated on a 2% agarose gel or a smear of PCR products ranging in size from 250 to 300 bp, rather than the expected single band. Similarly, amplification of the marker from some males derived from natural populations yielded two or more bands.
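To make the relationship between repeat number and amplicon length concrete, here is a minimal Python sketch. It is not the analysis used in the study: the function names are illustrative, the toy sequence is invented for demonstration, and the size calculation assumes that loci differ only in the number of AT units (the text reports identical flanking sequences for the scaffolds found by BLASTN).

```python
import re


def longest_repeat_run(seq: str, unit: str = "AT") -> int:
    """Number of tandem copies of `unit` in the longest uninterrupted run in seq."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def product_size_difference(repeats_a: int, repeats_b: int, unit_len: int = 2) -> int:
    """Length difference (bp) between two amplicons with identical flanks."""
    return abs(repeats_a - repeats_b) * unit_len


if __name__ == "__main__":
    # Hypothetical flanks around an (AT)16 tract, for illustration only.
    toy_locus = "GGC" + "AT" * 16 + "CCG"
    print(longest_repeat_run(toy_locus))
    # Span between the shortest and longest tracts reported in the text.
    print(product_size_difference(13, 65))
```

With tracts of 13 to 65 dinucleotide repeats, loci with identical flanks could differ in length by up to about 100 bp, so co-amplification of several such loci would be expected to give mixed-length products and unreadable sequence downstream of the repeat.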
All but one scaffold identified by BLASTN using the 104C2SP6 marker sequence were unmapped and very short, undoubtedly representing fragments of the Y chromosome that could not be assembled into larger contigs. In the exceptional scaffold (AAAB01008960) mapped to chromosome 2, two sequence regions were nearly identical to the marker sequence. Both regions had target sites perfectly matching the primers, which should have yielded 270- and 301-bp PCR products. However, none of the tested female specimens yielded the expected products. This discrepancy between in silico analysis and laboratory experiments likely resulted from incorrect sequence assembly of this chromosome 2 scaffold within a genomic region containing repetitive sequences in common with those flanking the marker. We have further evidence for incorrect assembly of A. gambiae scaffolds (see below).
Development of Y markers through BAC DNA subtraction:
Finding identical or highly similar sequences in the random Y-linked clones led us to the hypothesis that large portions of the Y chromosome in A. gambiae are composed of a few highly abundant transposable elements, among which may exist small islands of more complex sequences. On the basis of this hypothesis we designed a subtractive hybridization strategy to eliminate repetitive sequences and enrich for low-copy-number fragments. DNA of BAC clone 104B1 was used as a "driver" and pooled DNA of clones 104C2, 106G16, 106K11, and 174B1 was used as a "tester." After random fragmentation, driver DNA was biotinylated and tester DNA was blunt ended and ligated to an adaptor. After liquid hybridization of tester with driver DNA and removal of homo- and heteroduplexes, the remaining molecules were PCR amplified, cloned, and sequenced. Seven clones containing putative nonrepetitive sequences were evaluated as potential Y chromosome PCR markers. Three of the clones contained sequences that largely overlapped the same genomic region and of those one sequence was selected for primer design. In total, of five candidates under investigation, two new markers, S23 and S291, were developed (Fig 1). The primer sequences for the markers are given in Table 1. Rarely, nonspecific products were amplified in females using the S291 marker primers.
Large-scale BAC library screening for Y-linked clones:
The ends of the ND-TAM BAC library were sequenced as part of the A. gambiae genome sequencing effort.
The emerging view of the A. gambiae Y as a sink for a small number of highly repetitive TEs was reinforced from analysis of the available 134 BAC end sequences. Collectively, they constituted >80 kb, assuming an average sequence read of 600 bp. BLAST searches revealed that all of this sequence is repetitive. Nearly 40% of the sequences were mdg1- or 412-like retrotransposon fragments. Other frequent retrotransposon sequences were mag-like (12%) and mtanga (10%). This composition may have resulted from a disproportionate accumulation of those elements compared to other repetitive sequences on the Y and/or from a bias arising during the BAC library construction. The repetitive character of BAC ends bearing no similarity to known TEs was confirmed by BLASTN searches against the A. gambiae genome. These sequences did not allow us to identify Y-linked scaffolds with complex, low-copy-number sequences. Scaffolds identified in silico using the BAC end sequences fell into two categories: very short unmapped scaffolds, likely originating from the Y, that were composed entirely of repetitive elements and scaffolds mapped to other chromosomes by independent evidence, apparently identified because highly repetitive query sequences are ubiquitous in the A. gambiae genome.
Screening of the BAC library suggested close linkage of different markers, because many clones were hit with more than one marker. To evaluate the extent of linkage, each clone confirmed to be Y linked with one marker was screened for the presence of the remaining available markers. Among all 90 identified clones, 27 clones were hit with all four markers, 34 were hit with three markers, 23 were hit with two markers, and only 6 were hit with one marker (see Appendix). Many of the multiple-hit clones may contain overlapping sequences, with markers embedded in the shared genomic regions. Several clones found to share one identical end sequence seem to support this hypothesis, although it was not tested whether their DNA originated from the same region of the Y chromosome. This is apparently not the case with the 5 initially identified BAC clones, because analysis of their DNA subjected to restriction digestion showed restriction patterns lacking any common-sized bands. It is conceivable that most of the identified Y-linked BAC clones carry different regions of the Y but that they are identified with multiple markers because of the ubiquity of the marker sequences on that chromosome. Further study could elucidate the relative location of these markers on the Y.
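The marker-hit distribution reported above can be tallied directly. The short Python sketch below uses only the counts given in the text; the variable names are illustrative.

```python
# Number of BAC clones hit by exactly k of the four markers, as reported in the text.
CLONES_BY_MARKER_COUNT = {4: 27, 3: 34, 2: 23, 1: 6}

total_clones = sum(CLONES_BY_MARKER_COUNT.values())
total_marker_hits = sum(k * n for k, n in CLONES_BY_MARKER_COUNT.items())
mean_markers_per_clone = total_marker_hits / total_clones
fraction_multi_marker = (total_clones - CLONES_BY_MARKER_COUNT[1]) / total_clones

if __name__ == "__main__":
    print(total_clones)
    print(f"{mean_markers_per_clone:.2f} markers per clone on average")
    print(f"{fraction_multi_marker:.1%} of clones carry two or more markers")
```

The four classes sum to the 90 clones reported, and more than nine in ten clones carry at least two markers, which is the pattern the paragraph attributes either to overlap among clones or to the ubiquity of the marker sequences on the Y.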
Search for Y-linked scaffolds within the A. gambiae genome:
The A. gambiae genome has been sequenced from genomic libraries constructed separately from female and male DNA. [...] ~50 and 33%, respectively, as our investigation suggested (~35% X; ~51% autosome). For scaffolds with undetermined chromosome linkage ("unmapped"), the male contribution was ~51%, suggesting that Y chromosome sequences may not be well represented in the unmapped scaffold set because of cloning problems and/or due to the small size of the Y chromosome relative to other chromosomes.
We identified 975 scaffolds containing fragments originating exclusively from male libraries among all 8845 unmapped scaffolds (see supplementary data at http://www.genetics.org/supplemental/). These scaffolds constituted a male-derived sequence database created using standalone BLAST. In an attempt to detect scaffolds with coding sequences, we followed a previously described strategy.
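The selection of male-derived scaffolds amounts to a simple filter over the library of origin of each assembled fragment. The following Python sketch illustrates the idea under stated assumptions: the input format (a mapping from scaffold identifier to the sex label of the source library for each fragment), the function name, and the toy identifiers are all hypothetical and do not reflect the actual files used in the study.

```python
from typing import Dict, Iterable, List


def male_only_scaffolds(fragment_sources: Dict[str, Iterable[str]]) -> List[str]:
    """Return scaffold ids whose fragments all come from male libraries.

    fragment_sources maps a scaffold id to the library sex label
    ("male" or "female") of every fragment assembled into that scaffold.
    Scaffolds with no fragments listed are not selected.
    """
    selected = []
    for scaffold_id, sexes in fragment_sources.items():
        if set(sexes) == {"male"}:
            selected.append(scaffold_id)
    return selected


# Proportion of unmapped scaffolds retained, from the counts in the text.
MALE_ONLY_FRACTION = 975 / 8845

if __name__ == "__main__":
    toy = {
        "scaffold_A": ["male", "male", "male"],
        "scaffold_B": ["male", "female"],
        "scaffold_C": ["female"],
    }
    print(male_only_scaffolds(toy))
    print(f"{MALE_ONLY_FRACTION:.1%} of unmapped scaffolds")
```

By the counts given, roughly 11% of the unmapped scaffolds pass this filter. A scaffold assembled from both male and female fragments is excluded even if it is Y linked, which is a limitation the Discussion returns to.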
Characterization of the AAAB01008227 scaffold:
The AAAB01008227 scaffold, assembled from 441 fragments, has a total predicted length of 48,385 bp encompassing 43,438 bp of determined sequence and two sequence gaps. A primer pair 128125A (Table 1) designed in the middle of the scaffold amplified a PCR product exclusively in males. To evaluate the distribution of male-specific sequences within the scaffold, to test for the scaffold's integrity and develop new potential Y-specific markers, additional primer pairs (Table 1) were designed along the scaffold (Fig 2). At least one primer from each pair was located outside of repetitive regions according to BLASTN searches against the A. gambiae genome. Six of those primer pairs yielded male-specific products in all specimens tested, demonstrating Y linkage of the scaffold sequence and Y specificity of the markers (Fig 3). However, three other primer pairs, including primers flanking two microsatellite regions, yielded no PCR products. Neither adjusting amplification conditions nor redesigning primers resolved the problem, suggesting incorrect assembly of those genomic regions. Indeed, examination of the fragments (single sequence reads corresponding to a clone end sequence) used for the scaffold's assembly (retrieved from A. gambiae Trace Archive at http://www.ncbi.nlm.nih.gov/blast/mmtrace.html) revealed that none of the fragments spanned the microsatellite regions in their entirety; i.e., the fragments invariably ended within the microsatellites. Further support for the notion of incorrect assembly of the scaffold comes from the mate pair information obtained from Trace Archive (mate pairs are sequences originating from the opposite ends of a single clone). We examined five regions of the scaffold that contained Y marker sequences and compared orientation and distance of mate pairs predicted from the size of the source clones vs. the predictions based on the scaffold assembly. 
Only in one case did the mate pair position on the scaffold match the expectation based on the source clone. In four other cases, mate pairs were oriented on the scaffold in the same direction and in two such instances were separated by distances twice the expected length. Local misassembly problems regarding the A. gambiae genome sequences were reported earlier.
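The mate-pair test described above can be expressed as two checks: reads from opposite ends of one clone should lie on opposite strands of the assembly, and their separation should be close to the size of the source clone. The Python sketch below is an illustration only; the data structure, the 25% distance tolerance, and the example coordinates are assumptions introduced here, not values from the study.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MatePair:
    """Placement of the two end reads of one clone on a scaffold."""
    pos_a: int
    strand_a: str           # "+" or "-"
    pos_b: int
    strand_b: str           # "+" or "-"
    expected_distance: int  # from the size of the source clone


def assembly_problems(pair: MatePair, tolerance: float = 0.25) -> List[str]:
    """List inconsistencies between a mate pair's placement and its source clone.

    tolerance is the allowed fractional deviation from the expected distance
    (an arbitrary illustrative threshold).
    """
    problems = []
    if pair.strand_a == pair.strand_b:
        problems.append("same orientation")
    observed = abs(pair.pos_b - pair.pos_a)
    if abs(observed - pair.expected_distance) > tolerance * pair.expected_distance:
        problems.append("unexpected distance")
    return problems


if __name__ == "__main__":
    # Hypothetical coordinates for illustration.
    print(assembly_problems(MatePair(1000, "+", 3000, "-", 2000)))
    print(assembly_problems(MatePair(1000, "+", 3100, "+", 2000)))
    print(assembly_problems(MatePair(1000, "+", 5000, "+", 2000)))
```

The three toy cases mirror the three outcomes reported: a consistent pair, a pair placed in the same orientation, and a same-orientation pair that is also separated by about twice the expected distance.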
Although the AAAB01008227 scaffold is evidently chimeric, parts of the scaffold have been experimentally demonstrated to be linked to the Y chromosome and, as such, were analyzed further. A BLASTX search against the GenBank nr database revealed the presence of sequences highly similar to fragments of putative retrotransposons from other parts of the A. gambiae genome (Table 2). BLASTN searches against A. gambiae genome revealed other highly repetitive sequences scattered along the scaffold, with homologs on autosomes and the X chromosome; three fragments, >500 bp long, are repeated on the scaffold itself, two directly and one in inverted orientation (data not shown). The evidence suggests that the scaffold also contains low-copy-number or, possibly, single-copy sequences interspersed among the highly repetitive ones.
The scaffold sequences contain 28 ORFs detected using the Artemis annotation tool, release 4.
In an attempt to isolate BAC clones containing portions of the AAAB01008227 sequence, all seven markers designed from the scaffold were used to screen the ND-TAM BAC library. Surprisingly, contrary to screening with other markers, no clone was hit. Lack of hits in the BAC library reinforces the notion that Y markers derived from this scaffold represent sequences that are present in low copy number on the Y. However, the unexpected absence of BAC clones containing any of these markers may also be a consequence of biased library construction or instability of clones containing these sequences.
Screening field-derived specimens using Y chromosome markers:
The developed markers were evaluated for their utility in population genetics studies. To maximize the chance of finding variation we sampled natural populations from West Africa (Senegal and Burkina Faso) and from East Africa (Kenya), locales separated by up to 6000 kilometers. Following amplification and purification, PCR products were directly sequenced from at least 10 specimens from both West and East Africa. All sequences were identical at mag-mdg1, S23, and S291 loci. Some variation between individuals existed in the remaining markers, although without clear differentiation among populations. Furthermore, in those cases intraindividual variation confounding sequencing results was also observed. The two markers containing microsatellites with di- and pentanucleotide repeats (104C2SP6 and 128125D, respectively) were amplified from more than one locus per individual, each with a different number of repeats, resulting in ambiguous sequences within and downstream from the microsatellite region.
| DISCUSSION |
|---|
Prior to this study, sequence information on the Y chromosome of A. gambiae was limited to a short fragment harboring a mtanga transposable element.
Complex, low-copy or single-copy sequences are the most illuminating markers in studies of both population genetics and sex chromosome evolution. Initially, in our search of Y chromosome sequences we implemented a differential hybridization strategy, in which labeled total genomic DNA from males and females was used in turn to screen for clones preferentially hybridizing to the male probe in a dual-sex genomic library. The hybridization kinetics of a total genomic probe dictates that only highly or moderately repeated DNA should hybridize. Thus, this method targets male-specific repetitive DNA or sequences that are highly amplified on the Y chromosome but may be present elsewhere in the genome. The premise of this approach was that single-copy- or low-copy-number sequences could be found among repetitive fragments. However, the A. gambiae Y appears to be highly degenerated, making application of this approach to finding more complex sequences laborious and impractical. Neither in the Y-linked phage inserts isolated by differential hybridization nor among sequences derived from Y-linked BAC clones subsequently identified have such complex sequences been detected. All fragments were repetitive and had very high similarity to sequences on other chromosomes. These results suggest that more complex sequences may constitute a few small, isolated islands scattered in an ocean of repetitive DNA, in agreement with the entirely heterochromatic state of the Y chromosome in A. gambiae. Most of the identified sequences belong to transposable elements, consistent with the Y chromosome serving as a trap for retrotransposons and with the expected tendency of the Y chromosome to degenerate during evolution.
Ubiquitous repetitive sequences found in nearly all characterized fragments made development of population genetic markers very difficult. Although the BLASTN searches suggested otherwise, BAC library screening and direct sequencing of PCR products amplified from individual genomic templates showed that none of the markers appear to be present as single-copy sequences within the genome. Even primers flanking microsatellite-containing markers, found to be the most variable in this study, amplified multiple products, each with a different number of microsatellite repeats. Although their sequence data are difficult to analyze, such multicopy PCR products could potentially be utilized in fragment analysis by treating them as compound haplotypes.
The highly repetitive nature of a few ubiquitous transposable elements and possibly of other sequences present on the A. gambiae Y chromosome prevented the assembly of individual sequence reads generated during shotgun genome sequencing into larger scaffolds. Similar problems were encountered in the D. melanogaster genome assembly after whole-genome shotgun sequencing: only a single 15-kb scaffold containing a portion of the kl-5 gene, previously found to be Y specific, [...] ~260 million years has elapsed since the last common ancestor of both organisms.
The degree of Y degeneration and the number of functional genes on the Y in A. gambiae remain unknown. There is evidence that at least a single locus corresponding to a male-sex-determining factor is still present there.
Fragments of other Y chromosome genes are likely present among the unmapped scaffolds, as was the case with the Drosophila Y genes. We limited our search to a small subset of the unmapped scaffolds, only those derived from male-only libraries, which may have resulted in a failure to recover more coding sequences. It remains to be seen if our more exhaustive ongoing search, encompassing all unmapped scaffolds, including those assembled from a mixture of male and female sequence fragments, will be more successful.
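The restriction to scaffolds derived from male-only libraries can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the procedure used in the study: the data structures (a scaffold-to-reads map and a read-to-library map), the function name, and all identifiers are hypothetical.

```python
def male_only_scaffolds(scaffold_reads, read_library, male_libraries):
    """Return the sorted IDs of scaffolds whose reads all come from
    libraries built exclusively from male DNA.

    scaffold_reads : dict mapping scaffold ID -> list of read IDs
    read_library   : dict mapping read ID -> library ID
    male_libraries : set of library IDs known to be male-only
    (All three inputs are hypothetical structures for this sketch.)
    """
    selected = []
    for scaffold_id, reads in scaffold_reads.items():
        # Skip empty scaffolds; require every read to be of male-only origin.
        if reads and all(read_library.get(read) in male_libraries for read in reads):
            selected.append(scaffold_id)
    return sorted(selected)

# Invented example: scaf_B contains one read from a mixed-sex library.
scaffold_reads = {
    "scaf_A": ["r1", "r2"],
    "scaf_B": ["r3", "r4"],
    "scaf_C": [],
}
read_library = {"r1": "lib_male", "r2": "lib_male", "r3": "lib_male", "r4": "lib_mixed"}
candidates = male_only_scaffolds(scaffold_reads, read_library, {"lib_male"})
```

A filter this strict discards any Y-derived scaffold that happens to include reads from a mixed-sex library, which is the limitation the broader search over all unmapped scaffolds is meant to address.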
Our study has yielded an overview of the structure and organization of the Y in A. gambiae, but clearly shows that assembly of the whole-genome shotgun fragments alone contributes little, if anything, to the elucidation of the Y chromosome sequence. Without a significant amount of additional evidence generated with approaches that take into account the peculiarity of Y chromosome organization, shotgun sequence data from the Y are unrecoverable or remain in the form of small isolated scaffolds that cannot be assembled into larger sequences by any available method.
| FOOTNOTES |
|---|
Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under accession nos. CG865069, CG865070, CG865071, CG865072, CG865073, CG865074, CG865075, CG865076, CG865077, CG865078, CG865079, CG865080, CG865081, CG865082, CG865083, CG865084, CG865085, CG865086, CG865087, CG865088, CG865089, CG865090, CG865091, CG865092, CG865093, CG865094, CG865095, CG865096, CG865097, CG865098, CG865099, CG865100, CG865101, CG865102, CG865103, CG865104, CG865105, CG865106, CG865107, CG865108, CG865109, CG865110.
| ACKNOWLEDGMENTS |
|---|
We thank Patricia Romans for a critical reading of the manuscript. Mathew Chrystal provided excellent computer support. This work was supported by grant AI44003 from the National Institutes of Health.
Manuscript received August 4, 2003; Accepted for publication December 3, 2003.
| LITERATURE CITED |
|---|
ALTSCHUL, S. F., T. L. MADDEN, A. A. SCHAFFER, J. ZHANG, and Z. ZHANG et al., 1997 Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.
ANLEITNER, J. E. and D. S. HAYMER, 1992 Y enriched and Y specific DNA sequences from the genome of the Mediterranean fruit fly, Ceratitis capitata. Chromosoma 101:271-278.
BACHTROG, D., 2003a Accumulation of spock and worf, two novel non-LTR retrotransposons, on the neo-Y chromosome of Drosophila miranda. Mol. Biol. Evol. 20:173-181.
BACHTROG, D., 2003b Adaptation shapes patterns of genome evolution on sexual and asexual chromosomes in Drosophila. Nat. Genet. 34:215-219.
BESANSKY, N. J., T. LEHMANN, G. T. FAHEY, D. FONTENILLE, and L. E. BRAACK et al., 1997 Patterns of mitochondrial variation within and between African malaria vectors, Anopheles gambiae and An. arabiensis, suggest extensive gene flow. Genetics 147:1817-1828.
BIESSMANN, H., F. KOBESKI, M. F. WALTER, A. KASRAVI, and C. W. ROTH, 1998 DNA organization and length polymorphism at the 2L telomeric region of Anopheles gambiae. Insect Mol. Biol. 7:83-93.
BONACCORSI, S., G. SANTINI, M. GATTI, S. PIMPINELLI, and M. COLUZZI, 1980 Intraspecific polymorphism of sex chromosome heterochromatin in two species of the Anopheles gambiae complex. Chromosoma 76:57-64.
CARVALHO, A. B., B. P. LAZZARO, and A. G. CLARK, 2000 Y chromosomal fertility factors kl-2 and kl-3 of Drosophila melanogaster encode dynein heavy chain polypeptides. Proc. Natl. Acad. Sci. USA 97:13239-13244.
CARVALHO, A. B., B. A. DOBO, M. D. VIBRANOVSKI, and A. G. CLARK, 2001 Identification of five new genes on the Y chromosome of Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 98:13225-13230.
CARVALHO, A. B., M. D. VIBRANOVSKI, J. W. CARLSON, S. E. CELNIKER, and R. A. HOSKINS et al., 2003 Y chromosome and other heterochromatic sequences of the Drosophila melanogaster genome: How far can we go? Genetica 117:227-237.
CHARLESWORTH, B. and D. CHARLESWORTH, 2000 The degeneration of Y chromosomes. Philos. Trans. R. Soc. Lond. B Biol. Sci. 355:1563-1572.
CLEMENTS, A. N., 1992 The Biology of Mosquitoes. Chapman & Hall, London.
COLLINS, F. H., M. A. MENDEZ, M. O. RASMUSSEN, P. C. MEHAFFEY, and N. J. BESANSKY et al., 1987 A ribosomal RNA gene probe differentiates member species of the Anopheles gambiae complex. Am. J. Trop. Med. Hyg. 37:37-41.
DONNELLY, M. J., M. C. LICHT, and T. LEHMANN, 2001 Evidence for recent population expansion in the evolutionary history of the malaria vectors Anopheles arabiensis and Anopheles gambiae. Mol. Biol. Evol. 18:1353-1364.
DONNISON, I. S., J. SIROKY, B. VYSKOT, H. SAEDLER, and S. R. GRANT, 1996 Isolation of Y chromosome-specific sequences from Silene latifolia and mapping of male sex-determining genes using representational difference analysis. Genetics 144:1893-1901.
EWIS, A. A., J. LEE, Y. KUROKI, T. SHINKA, and Y. NAKAHORI, 2002 Yfm1, a multicopy marker specific for the Y chromosome and beneficial for forensic, population, genetic, and spermatogenesis-related studies. J. Hum. Genet. 47:523-528.
FISHER, R. A., 1931 The evolution of dominance. Biol. Rev. 6:345-368.
GAREL, A., P. NONY, and J. C. PRUDHOMME, 1994 Structural features of mag, a gypsy-like retrotransposon of Bombyx mori, with unusual short terminal repeats. Genetica 93:125-137.
GAUNT, M. W. and M. A. MILES, 2002 An insect molecular clock dates the origin of the insects and accords with palaeontological and biogeographic landmarks. Mol. Biol. Evol. 19:748-761.
GEPNER, J. and T. S. HAYS, 1993 A fertility region on the Y chromosome of Drosophila melanogaster encodes a dynein microtubule motor. Proc. Natl. Acad. Sci. USA 90:11132-11136.
HAMMER, M. F. and S. L. ZEGURA, 1996 The role of the Y chromosome in human evolutionary studies. Evol. Anthropol. 5:116-134.
HOLT, R. A., G. M. SUBRAMANIAN, A. HALPERN, G. G. SUTTON, and R. CHARLAB et al., 2002 The genome sequence of the malaria mosquito Anopheles gambiae. Science 298:129-149.
HONG, Y. S., J. R. HOGAN, X. WANG, A. SARKAR, and C. SIM et al., 2003 Construction of a BAC library and generation of BAC end sequence-tagged connectors for genome sequencing of the African malaria mosquito Anopheles gambiae. Mol. Genet. Genomics 268:720-728.
HURLES, M. E. and M. A. JOBLING, 2001 Haploid chromosomes in molecular ecology: lessons from the human Y. Mol. Ecol. 10:1599-1613.
JUNAKOVIC, N., A. TERRINONI, C. DI FRANCO, C. VIEIRA, and C. LOEVENBRUCK, 1998 Accumulation of transposable elements in the heterochromatin and on the Y chromosome of Drosophila simulans and Drosophila melanogaster. J. Mol. Evol. 46:661-668.
KAMAU, L., W. R. MUKABANA, W. A. HAWLEY, T. LEHMANN, and L. W. IRUNGU et al., 1999 Analysis of genetic variability in Anopheles arabiensis and Anopheles gambiae using microsatellite loci. Insect Mol. Biol. 8:287-297.
LAHN, B. T. and D. C. PAGE, 1999 Four evolutionary strata on the human X chromosome. Science 286:964-967.
LEHMANN, T., W. A. HAWLEY, L. KAMAU, D. FONTENILLE, and F. SIMARD et al., 1996 Genetic differentiation of Anopheles gambiae populations from East and West Africa: comparison of microsatellite and allozyme loci. Heredity 77:192-200.
MCLAIN, D. K., F. H. COLLINS, A. D. BRANDLING-BENNETT, and J. B. WERE, 1989 Microgeographic variation in rDNA intergenic spacers of Anopheles gambiae in western Kenya. Heredity 62:257-264.
MULLER, H. J., 1932 Some genetic aspects of sex. Am. Nat. 66:118-138.
OLIVIER, M. and G. LUST, 1998 Two DNA sequences specific for the canine Y chromosome. Anim. Genet. 29:146-149.
OOSTHUIZEN, C. J., J. S. HERBERT, L. K. VERMAAK, J. BRUSNICKY, and J. FRICKE et al., 1990 Deletion mapping of 39 random isolated Y-chromosome DNA fragments. Hum. Genet. 85:205-210.
RICE, W. R., 1994 Degeneration of a nonrecombining chromosome. Science 263:230-232.
ROHR, C. J., H. RANSON, X. WANG, and N. J. BESANSKY, 2002 Structure and evolution of mtanga, a retrotransposon actively expressed on the Y chromosome of the African malaria vector Anopheles gambiae. Mol. Biol. Evol. 19:149-162.
RUTHERFORD, K., J. PARKHILL, J. CROOK, T. HORSNELL, and P. RICE et al., 2000 Artemis: sequence visualization and annotation. Bioinformatics 16:944-945.
SALAZAR, C. E., D. M. HAMM, D. M. WESSON, C. B. BEARD, and V. KUMAR et al., 1994 A cytoskeletal actin gene in the mosquito Anopheles gambiae. Insect Mol. Biol. 3:1-13.
SAMBROOK, J., E. F. FRITSCH and T. MANIATIS, 1998 Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
SEVERSON, D. W., 1997 RFLP analysis of insect genomes, pp. 309-320 in The Molecular Biology of Insect Disease Vectors, edited by J. M. CRAMPTON, C. B. BEARD and C. LOUIS. Chapman & Hall, London.
SHIBATA, F., M. HIZUME, and Y. KUROKI, 1999 Chromosome painting of Y chromosomes and isolation of a Y chromosome-specific repetitive sequence in the dioecious plant Rumex acetosa. Chromosoma 108:266-270.
SKALETSKY, H., T. KURODA-KAWAGUCHI, P. J. MINX, H. S. CORDUM, and L. HILLIER et al., 2003 The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature 423:825-837.
SONG, J., F. DONG, J. W. LILLY, R. M. STUPAR, and J. JIANG, 2001 Instability of bacterial artificial chromosome (BAC) clones containing tandemly repeated DNA sequences. Genome 44:463-469.
STEINEMANN, M. and S. STEINEMANN, 1992 Degenerating Y chromosome of Drosophila miranda: a trap for retrotransposons. Proc. Natl. Acad. Sci. USA 89:7591-7595.
STEINEMANN, M. and S. STEINEMANN, 1998 Enigma of Y chromosome degeneration: neo-Y and neo-X chromosomes of Drosophila miranda a model for sex chromosome evolution. Genetica 102(103):409-420.
STEINEMANN, M. and S. STEINEMANN, 2000 Common mechanisms of Y chromosome evolution. Genetica 109:105-111.