An intensive linkage map of the yellow fever mosquito, Aedes aegypti, was constructed using single-strand conformation polymorphism (SSCP) analysis of cDNA markers to identify single nucleotide polymorphisms (SNPs). A total of 94 A. aegypti cDNAs were downloaded from GenBank and primers were designed to amplify fragments <500 bp in size. These primer pairs amplified 94 loci, 57 (61%) of which segregated in a single F1 intercross family among 83 F2 progeny. This allowed us to produce a dense linkage map of one marker every 2 cM distributed over a total length of 134 cM. Many A. aegypti cDNAs were highly similar to genes in the Drosophila melanogaster genome project. Comparative linkage analysis revealed areas of synteny between the two species. SNPs are abundant in A. aegypti genes and should prove useful in both population genetics and mapping studies.
The mosquito Aedes aegypti has been the subject of extensive genetic research due to its medical importance and the ease with which it can be manipulated in the laboratory. On a worldwide basis, A. aegypti is the most common vector of yellow fever and dengue fever flaviviruses (Miller et al. 1989; Monath 1991; Gubler and Meltzer 1999). Beginning in the early 1960s, an abundance of visible genetic markers were identified during isolation of isofemale lines from field A. aegypti populations (Craig et al. 1961). These 87 spontaneous mutants were associated with a wide array of eye color markers, cuticular scale patterns and colors, distortions of the legs and palps, homeotic mutants, loci with recessive lethal alleles, loci affecting sex ratio, and insecticide resistance (Craig et al. 1961; Craig and Hickey 1967). Somatic and germ cell cytogenetics are well characterized in A. aegypti (Rai 1963, 1966; Mescher and Rai 1966), and chromosomal translocations and inversions have been induced with gamma radiation (McGivern and Rai 1972; Rai et al. 1973).
Allozymes constituted the next generation of genetic markers (Munstermann and Craig 1979) and provided many additional loci on the A. aegypti linkage map. The first intensive map of A. aegypti was obtained in the early 1990s through restriction fragment length polymorphism (RFLP) analysis of cDNA clones and there are currently >100 cDNA loci mapped (Severson et al. 1993, 1994, 1995a,b). Soon after, Antolin et al. (1996) demonstrated that single-strand conformation polymorphism (SSCP) analysis of randomly amplified polymorphic DNA (RAPD) markers could be used to rapidly construct a linkage map from a single F1 intercross family. However, subsequent analysis indicated that RAPD loci that were polymorphic within one A. aegypti family were fixed for dominant or recessive alleles in other families (Bosio et al. 2000), precluding their use for comparisons across families or populations. This problem led us to explore several different types of markers.
Microsatellites are abundant in the genome of the mosquito Anopheles gambiae (Zheng et al. 1991, 1993, 1996; Lanzaro et al. 1995; Wang et al. 1999). However, isolation and analysis of microsatellites in A. aegypti yielded curious results (Fagerberg et al. 2000). Various di- and trinucleotide repeats were tested but none were abundant in the A. aegypti genome. Furthermore, most of the microsatellite loci that were obtained were either not variable when analyzed in several A. aegypti families or alleles at polymorphic loci segregated as band-absent (recessive) or band-present (dominant) markers. Sequence analysis indicated that loci, not alleles, varied in the number of microsatellite repeats and that some amplified loci had no microsatellite repeats at all.
We subsequently explored a variety of techniques for identification of single nucleotide polymorphisms (SNPs) in PCR products. These included RFLP analysis, SSCP analysis (Orita et al. 1989), heteroduplex analysis (White et al. 1992), denaturing gradient gel electrophoresis (Myers et al. 1987), and allele-specific oligonucleotide hybridization (Saiki et al. 1986). In our hands SSCP analysis was the most reproducible and sensitive of these techniques and also the most rapid and least expensive (Black and Duteau 1997). SSCP is based on the principle that both size and primary sequence influence the impedance of single-strand DNA molecules in nondenaturing gels. Impedance is a function of primary sequence because several stable shapes or conformations are formed when secondary base pairing occurs among nucleotides on a single DNA strand. The length, location, and number of intrastrand base pairs determine secondary and tertiary structure of a conformation. Point mutations that affect intrastrand interactions may therefore change the shapes of molecules and alter their mobility during electrophoresis. The SSCP technique is reported to detect ≥99% of point mutations in DNA molecules 100-300 bp in length and ≥89% of mutations in molecules 300-450 bp in length (Orita et al. 1989; Hayashi 1991).
Here we report on the large diversity of A. aegypti cDNA genes that are currently available in GenBank and demonstrate that SSCP analysis of these reveals extensive polymorphisms that can be used to develop an intensive linkage map in a single F1 intercross family. We also compare the locations of these genes to their physical locations in the Drosophila melanogaster genome (Adams et al. 2000) to examine the degree of synteny between the two species.
MATERIALS AND METHODS
Mosquito breeding and processing: A single F1 intercross family consisting of 83 F2 individuals was used to estimate recombination frequencies among cDNA loci. The P1 individuals of this family originated from two laboratory colonies derived from field collections of eggs. The P1 female belonged to the subspecies A. aegypti formosus collected from Ibo village, Nigeria. Fifth and sixth generation mosquitoes were used. The P1 male belonged to the subspecies A. aegypti aegypti and was collected in San Juan, Puerto Rico. First and second generation mosquitoes were used. F1 offspring from this cross were collected and intercrossed. The resulting F2 offspring were reared to adults. All family members were frozen and stored at -70° to await processing.
DNA was extracted from individual mosquitoes (Black and Munstermann 1996) and resuspended in 500 μl TE buffer (50 mM Tris-HCl, 5 mM EDTA, pH 8.0). A 50-μl aliquot of this DNA was overlaid with sterile mineral oil and stored at 4° for daily use in polymerase chain reaction (PCR). The remainder was stored in plastic screw-top vials at -70°.
Annotation of Aedes aegypti anonymous cDNAs: The database of expressed sequence tags (dbEST) in GenBank currently contains most of the ∼1630 A. aegypti genetic markers. These were individually downloaded from GenBank and a BLASTX search was performed against the Drosophila genome project (Adams et al. 2000). Those without a significant match (>e-15) were subjected to a BLASTX search against the nonredundant (NR) database. Remaining unmatched cDNAs were subjected to a BLASTN search against the Drosophila genome, NR, and dbEST databases. The physical locations of matches in the Drosophila genome were recorded along with the name or, for Drosophila genes of unknown function, accession number.
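The tiered search described above can be sketched as a simple cascade. This is only an illustration under stated assumptions: the stage names, the `annotate` function, and the precomputed E values are hypothetical, no BLAST program is actually invoked, and applying the same e-15 cutoff at every stage is an assumption (the text states it only for the first search).

```python
# Hypothetical sketch of the annotation cascade; no real BLAST calls are made.
E_CUTOFF = 1e-15  # significance cutoff quoted in the text for the first search

# Order in which the searches are tried (names are illustrative only).
STAGES = ("blastx_drosophila", "blastx_nr", "blastn_any")

def annotate(best_evalues, stages=STAGES, cutoff=E_CUTOFF):
    """Return the first stage whose best hit meets the cutoff, else None.

    best_evalues maps a stage name to the smallest E value found there;
    a missing key means the search returned no hit at all.
    """
    for stage in stages:
        evalue = best_evalues.get(stage)
        if evalue is not None and evalue <= cutoff:
            return stage
    return None

# Made-up E values for two cDNAs and one cDNA with no hits.
print(annotate({"blastx_drosophila": 3e-27}))
print(annotate({"blastx_drosophila": 1e-5, "blastx_nr": 1e-40}))
print(annotate({}))
```

A cDNA that returns `None` here corresponds to one that remained unmatched after all three searches.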
Primer design: A subset of 94 cDNA sequences of identified function was selected for further analysis. Primers were designed directly from the cDNA sequence using Primer Premier v4.11 (Premier Biosoft International, Palo Alto, CA). Search parameters were set to a primer length of 20 nucleotides, a 100-pM template concentration, a 50-mM monovalent ion concentration, a 1.5-mM free Mg2+ concentration, a 250-mM total Na+ equivalent, and 25° for free energy calculations. Primers were designed to amplify a 200- to 500-bp region of the gene, an amount deemed optimal for SSCP analysis. These primers were optimized for annealing temperatures using a Mastercycler gradient thermal cycler (Eppendorf, Madison, WI) and template DNA mass isolated from ∼500 Puerto Rican larvae. Annealing temperatures (Ta) that yielded single bands with strong amplification were considered optimal.
PCR was completed in thin-walled polycarbonate 96-well plates (Fisher Scientific, Pittsburgh, PA). Each plate contained an entire family, including all four P1 and F1 parents, the 83 F2 offspring, and a negative control (no template DNA added). The remainder of the PCR and SSCP analyses followed Black and Duteau (1997) and Bosio et al. (2000). Cloning and sequencing of bands followed Bosio et al. (2000).
Linkage mapping: Genotypes at each putative locus were scored and entered in the JoinMap 2.0 (Stam and Van Ooijen 1995) data file format for a cross pollinator cross. These were tested for conformity to Mendelian ratios with a χ² goodness-of-fit analysis using the JMSLA procedure in JoinMap. Loci at which Mendelian genotype ratios were observed were separated into individual linkage groups using the JMGRP and JMSPL procedures with a starting LOD threshold of 0.0 that was increased to 8.0 in increments of 0.1. Pairwise distances (Kosambi 1944) were estimated among loci in each of the three linkage groups using JMREC, and the maximum-likelihood map was estimated using JMMAP. The linkage map was plotted using DrawMap 1.1 (Van Ooijen 1994).
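Two of the calculations named above, the goodness-of-fit test against Mendelian ratios and the Kosambi conversion of recombination fraction to map distance, can be illustrated with a minimal sketch. It does not reproduce JoinMap's procedures; the function names and the genotype counts are hypothetical, and the 1:2:1 ratio is the standard expectation for a codominant locus in an F2 intercross.

```python
import math

def chi_square_gof(observed, expected_ratio):
    """Pearson chi-square statistic for genotype counts against a ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def kosambi_cm(r):
    """Kosambi map distance in cM for a recombination fraction 0 <= r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))

# Made-up counts for 83 F2 offspring at one codominant locus (AA, AB, BB).
stat = chi_square_gof([20, 43, 20], (1, 2, 1))
# With 2 degrees of freedom the 5% critical value is 5.991.
print(f"chi-square = {stat:.3f}, fits 1:2:1 = {stat < 5.991}")

# A 10% recombination fraction converted to Kosambi centimorgans.
print(f"{kosambi_cm(0.10):.2f} cM")
```

For small recombination fractions the Kosambi distance is close to 100 × r and grows faster as r approaches 0.5, which is how the function accounts for undetected double crossovers.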
Other markers: Microsatellite loci amplified by the TAG66 primers (Fagerberg et al. 2001) were mapped as were sequence-tagged amplified RAPD (STAR) loci (Bosio et al. 2000) and LF markers (Severson et al. 1993) to orient the map derived in our study relative to maps from earlier studies (Severson et al. 1993, 1994, 1995a,b; Antolin et al. 1996; Bosio et al. 2000). Alleles at the TAG66 loci segregate as dominant markers. STAR loci were developed by cloning and sequencing RAPD markers and then designing primers that contained the original RAPD primer at the 5′ end and the next 10 nucleotides in the sequence to the 3′ end. STARs are amplified by targeted PCR and alleles at STAR loci often segregate as codominant markers (Bosio et al. 2000).
RESULTS
Ninety-four primer sets were designed from genes of identified function in A. aegypti or from a collection of the ∼1530 A. aegypti expressed sequence tags (ESTs) in GenBank (indicated with AI in the accession numbers). Primers were designed only from ESTs that had high similarity in a BLASTX search to genes of known function in GenBank. Primers were tested on family DNA and 88 of them amplified products of the anticipated size to yield a total of 94 loci (Table 1). Three primer sets amplified more than a single locus [allatotropin (5 loci), ADPATPtl (2 loci), and Feilai405 (2 loci)]. Fifty-seven (61%) of these loci were polymorphic and alleles segregated as codominant markers at 53 (93%) of the polymorphic loci. Alleles at the 4 (7%) remaining loci segregated as dominant (band-present) and recessive (band-absent) markers.
The inheritance of genotypes was fully informative at 18 loci and was partially informative at the remaining 39 loci. Examples of genotypes segregating among the P1 and F1 parents and the first nine F2 offspring appear for 10 loci in Figure 1. Alleles at Fxa, Hexam2, Peroxnc, and RNAhelic segregated as codominant markers whose genotypes were fully informative in this F1 intercross family. At Fxa all four P1 and F1 parents had unique genotypes that were recovered in the F2 offspring and the P1 male appeared to be homozygous for a null allele. At Hexam2, Peroxnc, and RNAhelic, the P1 parents had unique genotypes and F1 parents were heterozygous. All three genotypes were recovered in the F2 offspring. Alleles at ADPATPtla segregated as a dominant marker arising from the P1 mother and a recessive marker in the P1 father. Genotypes were only partially informative for mapping because the P1 mother and her F1 daughter shared the same genotype. Alleles at the ADPATPtlb, Dynein, Gpd-1, Rf5, TrypB, and TrypEarl loci segregated as codominant markers but a P1 parent and at least one of its F1 offspring shared the same genotype and were thus only partially informative for mapping.
Genotype frequencies at all loci fit expected Mendelian ratios except LF138, LF90, Glusyn, and Apyr1 and these were excluded from mapping. The remaining 53 cDNA-SSCP markers were mapped among the 83 F2 individuals. In addition, 9 TAG66 microsatellite markers (Fagerberg et al. 2000), 6 STAR markers (Bosio et al. 2000), and the Sex locus were used (Figure 2). The total map spans 134 cM (58 + 39 + 37), with an average density of one marker per 1.9 cM. Three linkage groups were detected at a LOD of 2.9. The LOD was increased in 0.1 increments and the three linkage groups remained intact until a LOD of 3.5, when B20.390 and Rf1 formed a separate linkage group. The three linkage groups then remained intact until a LOD of 4.7, when AbdA separated from chromosome 1.
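The map length and marker spacing follow directly from the counts given in this paragraph. The short sketch below is only an arithmetic check; the variable names are illustrative, and the marker total is simply the sum of the marker classes listed above.

```python
# Lengths of the three linkage groups in cM, as listed in the text.
linkage_group_cm = [58, 39, 37]

# Marker classes listed in the text: cDNA-SSCP, TAG66, STAR, and the Sex locus.
marker_counts = {"cDNA-SSCP": 53, "TAG66": 9, "STAR": 6, "Sex": 1}

total_cm = sum(linkage_group_cm)
total_markers = sum(marker_counts.values())

cm_per_marker = total_cm / total_markers
markers_per_cm = total_markers / total_cm

print(f"total length: {total_cm} cM")
print(f"total markers: {total_markers}")
print(f"spacing: {cm_per_marker:.1f} cM per marker")
print(f"density: {markers_per_cm:.2f} markers per cM")
```

The spacing of roughly 1.9 cM per marker is the same figure the abstract rounds to one marker every 2 cM.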
Products of various primers were sequenced to determine if they amplified the predicted product. For A. aegypti genes of known function these included Apyr, CarboxA, D7, DefA1, Fxa, all of the LF markers (Severson et al. 1993), Malt, Sialokin1, TrypLate, and TrypB. In every case BLASTN recovered the predicted sequences from the NR database. AbdA, Gpd-1, and Hsp70 were designed from ESTs. Sequences amplified from these primers were subjected to a BLASTN search, which in every case recovered the original EST, and to a BLASTX search, which recovered the D. melanogaster Abd-A (3e-27; CG10325), glycerol-3-phosphate dehydrogenase (7e-22; CG8256), and Hsc70-4 (4e-19; CG4264) genes.
We also sequenced any products that appeared as multiple independently segregating alleles. The ADP-ATPtl primers were designed from an EST (AI657540) and amplified two independently segregating bands (Figure 1). A BLASTP search indicated that both were highly similar (6e-49) to a clone of A. gambiae ADP/ATP carrier protein (L11617). Sequence analysis suggested that ADPATPtla is a pseudogene with a premature stop at codon 45 while ADPATPtlb may encode a functional mRNA (Figure 3). Interestingly, two A. aegypti ESTs, AI650113 and AI650176, that were similar in sequence to AI657540 contained insertions between codons 33 and 34 and at codons 41, 57, and 66. These may represent other ADPATPtl pseudogenes.
The primers designed to amplify a single allatotropin locus from A. aegypti (U65314) instead amplified five loci albeit at a low annealing temperature of 43°. All five amplicons were mapped and sequenced but none were similar to the allatotropin gene in A. aegypti or to any other sequences in GenBank and were thus assigned labels Rf1-Rf5. The primers that were predicted to amplify actin loci amplified two separate loci. One had no similarity to any sequences in GenBank and was thus labeled Rf6. The other amplicon was similar to an A. aegypti repetitive element Feilai 405 (AF107667).
An initial BLASTX search with AI650010 suggested similarity to a region of the Antennapedia complex in D. melanogaster. Anticipating that AI650010 would map at an ∼10-cM distance from AbdA, as in D. melanogaster, we added this marker to our map. It mapped to chromosome 1 at a distance of ∼20 cM from AbdA. However, while a BLASTN analysis of the amplified fragment recovered AI650010, a subsequent search of the Drosophila genome database with AI650010 identified it as being more similar (2e-20) to a gene of unknown function (CG18355 on the right arm of chromosome 2).
The GenBank sequences of all mapped loci were subjected to BLAST searches against the Drosophila genome to compare Aedes linkage locations to Drosophila physical locations (Table 1). Several genes on A. aegypti chromosome 1 also mapped to D. melanogaster chromosome 1 (Figure 2). Hemepoly, LF198, Erudi, Cathbp, and Transfer all mapped to the first 14 cM of chromosome 1 in A. aegypti and to the first 29 Mb of Drosophila chromosome 1, albeit not in identical order. In addition, Aamy2, BMIOP, and Chitan1 all mapped between 32 and 33 cM on A. aegypti chromosome 1 and were located within 39-46 Mb of Drosophila chromosome 2. The remainder of the genes appeared to be located on different linkage groups in the two species.
DISCUSSION
The A. aegypti genome contains from 750 to 842 Mbp and 40% of this consists of repetitive elements distributed as short repeats (Warren and Crampton 1991). On the basis of a range of estimated linkage sizes of 134 cM (this report) to 228 cM (Munstermann and Craig 1979), the relationship between physical and recombination distance is between 3.3 and 6.3 Mbp/cM. However, comparison of physical and recombination distances (D. W. Severson, personal communication) suggests that, as with D. melanogaster (Adams et al. 2000), a large proportion of the repetitive elements are clustered in centromeres or along whole arms such that the resolution among coding sequences may be as low as 1 Mb/cM.
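The 3.3 to 6.3 Mbp/cM range quoted above comes from pairing the smallest genome estimate with the longest map and the largest genome estimate with the shortest map. A minimal sketch of that arithmetic, with an illustrative function name:

```python
def mbp_per_cm(genome_mbp, map_cm):
    """Average physical distance (Mbp) per unit of recombination distance (cM)."""
    return genome_mbp / map_cm

# Genome size estimates (Mbp) and map length estimates (cM) quoted in the text.
genome_min, genome_max = 750, 842
map_min, map_max = 134, 228

lower = mbp_per_cm(genome_min, map_max)  # smallest genome over longest map
upper = mbp_per_cm(genome_max, map_min)  # largest genome over shortest map

print(f"{lower:.1f} to {upper:.1f} Mbp/cM")
```

These are genome-wide averages; if repetitive DNA is concentrated in particular regions, the local ratio among coding sequences would be lower than either bound.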
This relatively low resolution and lack of well-resolved polytene chromosomes predict that A. aegypti genetic studies will continue to rely heavily on linkage mapping and eventually map-based positional cloning to identify genes of interest. Positional cloning depends critically on having a high density of genetic markers. RFLP analysis provides abundant codominant loci (Severson et al. 1993, 1994, 1995b; Severson and Zhang 1996) but is limited by the amount of data that can be gleaned from an individual mosquito, since an extraction from one mosquito yields ≤10 μg of genomic DNA (Severson et al. 1993). Also, sequence variation outside of restriction sites is undetected. Alternatively, use of PCR-based analyses increases the amount of data that can be acquired from a mosquito such that saturated linkage maps can be constructed with DNA from only a single family.
We demonstrated that detection of SNPs in cDNA loci by SSCP analysis provides an abundance of codominant markers for construction of saturated linkage maps in A. aegypti. SSCP analysis detected allelic sequence variation at 61% of the loci examined in a single family. This underestimates the amount of natural variation at these loci: analysis of additional mosquitoes from natural populations identified variation at Apyr, CarboxA, D7, DefA1, Fxa, Gpd-1, Hsp70, Malt, Sialokin1, TrypLate, and TrypB loci. Furthermore, markers that could be mapped in our 83-member family could also be consistently amplified and mapped in a reciprocal cross (Bosio et al. 2000) and in the original RAPD family (Antolin et al. 1996).
The linkage map derived in our study is shorter than the earlier maps constructed using RAPD markers [52.3 + 58.2 + 57 = 168 cM in Antolin et al. (1996); 61 + 52 + 99 = 212 cM in Bosio et al. (2000)]. This may in part be due to fewer markers used in our study (68 as compared to 98 and 83). However, if repetitive DNA is clustered rather than dispersed, then our map may fail to include estimates of recombination among noncoding repetitive sequences. Combined use of cDNA and additional STAR markers may result in a map of more accurate length.
Jennifer Holmes, Amy Fagerberg, and Heather Stevenson assisted in the laboratory. Dr. Norma Gorrochotegui-Escalante provided preliminary sequence results from analysis of some cDNA genes in A. aegypti populations. Dr. Chris Bosio constructed the A. aegypti family used in this study. Drs. Barry Beaty and Boris Kondratieff served on R.F.’s graduate committee. This research was supported in part by the MacArthur Foundation for the Network on the Biology of Parasite Vectors and by National Institutes of Health grants AI 41436 and AI 45430.
Communicating editor: G. A. Churchill
- Received July 24, 2000.
- Accepted March 13, 2001.
- Copyright © 2001 by the Genetics Society of America