Originally published as Genetics Published Articles Ahead of Print on February 2, 2009.
Genetics, Vol. 181, 1673-1677, April 2009, Copyright © 2009
doi:10.1534/genetics.108.100065
De Novo Identification of Single Nucleotide Mutations in Caenorhabditis elegans Using Array Comparative Genomic Hybridization
Jason S. Maydan*, H. Mark Okada, Stephane Flibotte, Mark L. Edgley and Donald G. Moerman*,1
* Department of Zoology, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada,
Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia V5Z 4S6, Canada and
Michael Smith Laboratories, University of British Columbia V6T 1Z4, Vancouver, British Columbia, Canada
1 Corresponding author: University of British Columbia, 6270 University Blvd., Vancouver, BC V6T 1Z4, Canada.
E-mail: moerman{at}zoology.ubc
Array comparative genomic hybridization (aCGH) has been used primarily to detect copy-number variants between two genomes. Here we report using aCGH to detect single nucleotide mutations on oligonucleotide microarrays with overlapping 50-mer probes. This technique represents a powerful method for rapidly detecting novel homozygous single nucleotide mutations in any organism with a sequenced reference genome.
A major roadblock in genetic research lies in the molecular identification of mutations responsible for an observed phenotype. Traditional positional cloning techniques are laborious, time-consuming, and sometimes impractical for mapping mutations to regions smaller than a few mega-base pairs, particularly in regions with low recombination frequencies such as the centers of Caenorhabditis elegans chromosomes (BARNES et al. 1995). Sequencing such a large region still remains impractical for most laboratories, and as a result many mutations remain uncharacterized. Recently, array comparative genomic hybridization (aCGH) has been used to detect single nucleotide variation in the 12.5-Mb yeast genome using short 25-mer probes (GRESHAM et al. 2006). Here we demonstrate the use of 50-mer probes to detect single nucleotide mutations in the 100-Mb C. elegans genome.
aCGH has been used to detect many types of genome diversity in a variety of organisms (GRESHAM et al. 2008). We have been using aCGH with exon-centric tiling arrays of 50-mer oligonucleotide probes to screen for deletions in the C. elegans genome following mutagenesis with trimethylpsoralen (TMP) and ultraviolet (UV) irradiation (MAYDAN et al. 2007). In one set of experiments utilizing a microarray with probes targeting primarily exons on C. elegans chromosome II, we screened individuals homozygous for a mutagenized chromosome II. In these experiments we identified three statistically significant putative mutations (P-values ranged from 2.7 × 10⁻⁵ to 1.8 × 10⁻¹⁴ according to one-sample t-tests). These putative mutations affected just a few adjacent overlapping probes and produced modest signals comparable to those normally observed for heterozygous deletions. We hypothesized that very small homozygous mutations (much shorter than the length of a probe) could produce signals of this magnitude. The mutations would have to be very small to target only a few overlapping probes and permit some hybridization of complementary sequence to the array. Mutations of this size would not have produced statistically significant signals on our whole-genome tiling arrays because each mutation would affect only one or two probes.
Our hypothesis was confirmed when PCR and DNA sequencing identified single nucleotide mutations in all three mutants. The strain VC10078 carries gk802, an A→T transversion allele of syd-1 at II: 7586645 (see Figure 1), causing a nonconservative amino acid substitution [I(887)→K]; VC10079 contains allele gk803, an A→G transition at nucleotide II: 10825740, which results in a synonymous base-pair substitution in mix-1 at the third position of a codon for leucine (CUA→CUG); and VC10077 carries gk801, an allele with two closely linked mutations in Y46E12BL.2: a G→A transition at II: 15240024, causing a conservative amino acid substitution [V(714)→I], and an A→G transition at II: 15240052, resulting in a nonconservative amino acid substitution [Y(723)→C].
Dense tiling with oligonucleotides is necessary to obtain sufficient statistical power to detect single nucleotide alterations. In a previous study (FLIBOTTE et al. 2009) we showed that a window of ~20 bases contains a strong log2 ratio signal (see Figure 1 in FLIBOTTE et al. 2009), and since we require about four probes to target the mutated site, this allows a maximum probe spacing of ~5 bases. The plot in that figure also shows that it would be useful to target both strands and use the small shift in the peak position on opposite strands to help distinguish single nucleotide polymorphisms (SNPs) from artifacts.
Utilizing these probe spacing guidelines, we conducted an additional 13 aCGH experiments comparing homozygous mutants to their parental strains, using 50-mer oligonucleotide microarrays probing regions from 0.65 to 2.60 Mb in length that are known to include unidentified mutations based on prior mapping experiments. The probe spacing, i.e., the distance between the 5'-ends of consecutive probes, on these arrays ranged from 1 to 5 bp, and all known repeats were excluded from the array designs. Unlike our previous exon-centric arrays, no other constraints were applied to the oligonucleotides. Note that, while probes for both strands are desirable, we were not able to include them for the majority of the 13 experiments, because the interval to be tested was too large to allow probes for both strands. All microarrays were manufactured by Roche NimbleGen with oligonucleotides synthesized at random positions on the arrays.
Mutant strains were generated by standard ethyl methanesulfonate (EMS) mutagenesis, which yields approximately one single nucleotide mutation every 100–400 kb (ANDERSON 1995; CUPPEN et al. 2007), and then were serially backcrossed with their parental strains. From these experiments we selected 58 candidate single nucleotide mutations on the basis of a visual inspection of the data and identification using a segmentation algorithm (MAYDAN et al. 2007) or a sliding-window technique. We then performed PCR and DNA sequencing to gauge the accuracy of our mutation predictions.
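The spacing guideline and the EMS mutation density quoted above reduce to simple arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, the inputs are the approximate figures from the text, and the mutation estimate ignores the reduction in background mutations produced by backcrossing.

```python
def max_probe_spacing(signal_window_bp, probes_required):
    """Largest spacing (bp) that still places `probes_required` probes
    inside the window that shows a strong log2 ratio signal."""
    return signal_window_bp // probes_required


def expected_ems_mutations(region_bp, kb_per_mutation):
    """Expected number of EMS-induced mutations in a region, assuming a
    uniform density of one mutation per `kb_per_mutation` kb.
    Backcrossing, which removes some mutations, is NOT modeled."""
    return region_bp / (kb_per_mutation * 1000)


# ~20-base signal window and about four probes per mutated site
spacing = max_probe_spacing(20, 4)

# Range implied by the tiled regions (0.65 to 2.60 Mb) and the quoted
# EMS density (one mutation every 100 to 400 kb)
fewest = expected_ems_mutations(650_000, 400)
most = expected_ems_mutations(2_600_000, 100)

print(spacing, fewest, most)
```

Under these assumptions a tiled interval could carry anywhere from about 2 to about 26 induced mutations before backcrossing, which gives a sense of how many true signals a single array might contain.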
For each candidate mutation, we calculated a SNP score by averaging the log2 fluorescence ratios (mutant/wild-type) in a small window containing probes putatively affected by the mutation and renormalizing by subtracting from that the average log2 ratio in the immediate flanking regions. This renormalization is necessary to account for local bias, which varies both among and within experiments and makes the detection of SNPs more difficult, since artifacts associated with a strong local bias in the log2 ratio could easily be confused with the signature expected for a SNP. In contrast to previous observations that mutations near the centers of 25-mer probes are most inhibitory to efficient hybridization (SHARP et al. 2007), we observed that mutations located away from the glass slide and freely floating in the solution closer to the 5'-ends of our 50-mer probes produced a larger perturbation to the hybridization process, with a maximum perturbation at 7 bases in from the 5'-end (probably due to steric effects; again see Figure 1 in FLIBOTTE et al. 2009). The location of the window used to calculate the score reflects this observation. This sensitivity to mutations at the 5'-end of NimbleGen probes has also been observed by WEI et al. (2008). The sequencing results (summarized in Figure 2A) confirmed the presence of a single nucleotide mutation in 16 of the 58 candidates, for an overall success rate, or specificity, of 28%. All mutations were either C-to-T or G-to-A transitions, as expected from EMS mutagenesis. The locations of the mutations were usually predicted to within 10 bp of their true positions and to within 1 bp in one case.
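The SNP score described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function name, the index-based window, and the `flank` parameter (number of probes taken on each side) are hypothetical, and the placement of the window relative to the 5'-end sensitivity peak is left to the caller.

```python
from statistics import mean


def snp_score(log2_ratios, start, end, flank):
    """Mean log2 ratio in the candidate window [start, end) minus the mean
    log2 ratio of up to `flank` probes on each side of the window.

    `log2_ratios` is a list of mutant/wild-type log2 ratios ordered by
    probe position. Subtracting the flanking mean removes local bias.
    """
    window = log2_ratios[start:end]
    flanking = log2_ratios[max(0, start - flank):start] + log2_ratios[end:end + flank]
    if not window or not flanking:
        raise ValueError("window and flanking regions must both contain probes")
    return mean(window) - mean(flanking)


# Made-up ratios: three probes dip where a mutation weakens hybridization
ratios = [0.1, 0.1, 0.1, -0.5, -0.7, -0.6, 0.1, 0.1, 0.1]
score = snp_score(ratios, start=3, end=6, flank=3)

# The same profile with a uniform local bias added gives the same score
biased = [r + 0.3 for r in ratios]
biased_score = snp_score(biased, start=3, end=6, flank=3)
```

In this toy profile the score is about -0.7 with or without the added offset, which is the point of the renormalization: a uniform local shift in the log2 ratio does not masquerade as a SNP signal.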
To estimate the sensitivity of our single nucleotide mutation detection technique, we performed aCGH experiments to test our ability to detect 2639 known SNPs in the CB4856 strain isolated in Hawaii (see Figure 2 legend for details of the array design). Examples of all possible transitions and transversions were detected. The SNP detection sensitivity is shown in Figure 2B for various thresholds in the SNP score described above. At the reasonable threshold of –0.45, the specificity (the percentage of predicted SNPs that are real) would be 31% with a sensitivity (the percentage of real SNPs that are successfully detected) of 37%. In other words, with the current SNP detection technique we could expect to detect roughly one of every three SNPs present in the targeted region and would have to sequence roughly three candidates to detect a real SNP. As expected, the SNP detection sensitivity of the current technique depends on the type of transition or transversion being investigated, and, as can be seen in Figure 2C, the sensitivity reaches ~50% for the most commonly induced EMS-generated mutations (C to T and G to A). The optimal probe length for single nucleotide mutation detection by CGH is unclear and likely depends on the hybridization conditions. Single nucleotide mutations should have a greater impact on hybridization to shorter oligonucleotides, but longer oligonucleotides allow a greater number of overlapping probes to target a given single nucleotide mutation, and arrays with longer oligonucleotides tend to have better standard deviations in log2 ratios (SHARP et al. 2007). Further experiments are needed to determine the optimal probe length to achieve the greatest sensitivity and specificity as a function of the size of the targeted region; such an optimal length will probably vary with the complexity of the genome being studied.
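The two performance measures used here can be computed from a list of scored candidates, as in the Python sketch below. It follows the definitions given in the text (specificity as the fraction of predicted SNPs that are real, sensitivity as the fraction of real SNPs that are detected); the function name and the toy candidate list are hypothetical and are not data from this study.

```python
def detection_metrics(candidates, n_true_snps, threshold):
    """Return (sensitivity, specificity) as defined in the text.

    `candidates` is a list of (snp_score, is_real) pairs. A candidate is
    called when its score is at or below `threshold` (scores are negative
    for reduced hybridization). `n_true_snps` counts all real SNPs in the
    region, including any that never produced a candidate.
    """
    called = [is_real for score, is_real in candidates if score <= threshold]
    true_positives = sum(called)
    specificity = true_positives / len(called) if called else 0.0
    sensitivity = true_positives / n_true_snps if n_true_snps else 0.0
    return sensitivity, specificity


# Made-up example: six scored candidates, four real SNPs in the region
toy = [(-0.9, True), (-0.6, True), (-0.5, False),
       (-0.46, False), (-0.3, True), (-0.2, False)]
sens, spec = detection_metrics(toy, n_true_snps=4, threshold=-0.45)

# Figures quoted in the text: 16 of 58 candidates confirmed, and the
# number of candidates to sequence per real SNP at 31% specificity
confirmed_pct = round(100 * 16 / 58)
candidates_per_real_snp = 1 / 0.31
```

Lowering the threshold (making it more negative) calls fewer candidates, which generally trades sensitivity for specificity.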
Although this technique is particularly well suited to detecting SNPs generated by EMS mutagenesis, some single nucleotide mutations may not be detectable by aCGH even with higher probe densities than we have used here. We suspected that some of the Hawaiian SNPs that we failed to detect might have been missed because they were found in regions with significant homology to other regions of the genome. In these cases, multiple regions of the genome could have hybridized to our probes, making it difficult for the effect of a SNP on the log2 ratios to be detectable. However, filtering the oligonucleotide properties according to our best practices and standard microarray design recommendations (FLIBOTTE and MOERMAN 2008) failed to improve the SNP detection sensitivity, which makes this possibility unlikely. It is also possible that SNPs are more difficult to detect with aCGH when present in the background of the Hawaiian genome because this genome has significant structural variation relative to the N2 reference genome (MAYDAN et al. 2007); consequently, for a more typical SNP detection experiment the sensitivity of the technique might be slightly better than what we have reported here. However, limiting the analysis to SNPs that are located far away from other known polymorphisms did not improve the SNP detection sensitivity, which makes this possible source of interference also unlikely. Finally, we have not yet attempted to detect heterozygous single nucleotide mutations using this technique, but this would be nearly impossible to accomplish with current microarrays.
The ability of aCGH to detect homozygous single nucleotide mutations in addition to deletions and duplications makes it possible to quickly and affordably identify mutations mapped by traditional positional cloning approaches. A clear example of the feasibility of this technique is demonstrated in an accompanying article in this issue, O'MEARA et al. (2009), where two single base lesions were mapped to the promoter of the gene cog-1 using aCGH. We recommend a maximum probe spacing of ~5 bp to have a reasonable chance at successful SNP detection with this technique. This probe spacing corresponds to ~2 Mbp of genomic sequence on a microarray with 380,000 probes, the oligo capacity of the chips that we used in this study. We prefer to apply this SNP detection technique only in situations where the mutation is mapped to a maximum of a 1-Mbp region, as this provides denser coverage of the mutation site and allows us to target both strands. Targeting both strands should result in fewer false positives. Further reducing the size of the candidate region should improve the likelihood of successful base-change detection as more probes target any specific base. If any sequences in the mapped region can be excluded (such as noncoding DNA, repeat elements, or genes that can be ruled out as candidate genes), the probe density can be further increased in specific regions of interest. It is of course possible to use more than one microarray to probe the candidate region if the region is too large to achieve the desired probe density on a single array. Also, when the search region is small enough to allow very high density tiling, one can take advantage of the fact that the effect of a SNP on hybridization is dependent on its position in the probe by including probes that target both strands and then by pursuing primarily candidates showing a small shift between the plus and minus strand log2 ratio profiles.
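The relationship between array capacity, probe spacing, and the size of the region that can be tiled is straightforward to check. The Python sketch below uses hypothetical function names and assumes that every probe position is usable (no repeats or excluded sequence) and that targeting both strands simply halves the number of probes available per strand.

```python
import math


def tiled_region_bp(n_probes, spacing_bp, strands=1):
    """Length of sequence (bp) that one array can tile at the given
    spacing, splitting the probes evenly between `strands` strands."""
    return (n_probes // strands) * spacing_bp


def arrays_needed(region_bp, n_probes, spacing_bp, strands=1):
    """Number of arrays required to tile `region_bp` at the given spacing."""
    return math.ceil(region_bp / tiled_region_bp(n_probes, spacing_bp, strands))


one_strand = tiled_region_bp(380_000, 5)               # 1.9 Mbp, i.e. ~2 Mbp
both_strands = tiled_region_bp(380_000, 5, strands=2)  # 0.95 Mbp, i.e. ~1 Mbp
n_arrays = arrays_needed(2_600_000, 380_000, 5)        # largest region tiled in this study
```

With 380,000 probes at 5-bp spacing, one strand of about 1.9 Mbp or both strands of about 0.95 Mbp fit on a single array, which is consistent with the preference for intervals of at most 1 Mbp when both strands are targeted.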
To make the current SNP detection technique more accessible, we have mounted a web application to design oligonucleotide microarrays. The application can be found at http://hokkaido.bcgsc.ca/SNPdetection/. Downstream analysis tools to calculate and normalize the log2 ratios are also available on the same web site. Given the criteria set by the user, such as the probe target region and strand(s), the oligonucleotides are selected so as to distribute the probes evenly across the selected region. The placement of these probes is chosen to avoid repeat regions, noncoding regions (optional), and specific probe sequences that cannot be synthesized due to the cycle number constraint in NimbleGen's manufacturing process. Once the criteria have been selected, the file is sent to the user in a format ready for submission to NimbleGen. We recommend that users start with the constraints that we describe in this article. Currently, the probe selection application has been set to support the C. elegans and Drosophila melanogaster genomes, but genomes from other species will be added upon request.
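The general idea of evenly spaced tiling with repeat avoidance can be sketched as follows. This Python example is not the web application's algorithm, which is not described here; the function name, the half-open coordinate convention, and the repeat intervals are assumptions made for illustration, and the NimbleGen synthesis-cycle constraint is not modeled.

```python
def tile_probes(region_start, region_end, spacing, probe_len=50, repeats=()):
    """Return 5'-start coordinates of probes tiled every `spacing` bp across
    the half-open interval [region_start, region_end), skipping any probe
    that overlaps a masked interval in `repeats` (also half-open).
    """
    starts = []
    for start in range(region_start, region_end - probe_len + 1, spacing):
        end = start + probe_len
        overlaps_repeat = any(start < r_end and end > r_start
                              for r_start, r_end in repeats)
        if not overlaps_repeat:
            starts.append(start)
    return starts


# Toy 100-bp region tiled with 50-mers every 5 bp
unmasked = tile_probes(0, 100, spacing=5)
# The same region with a hypothetical repeat at positions 60 to 69
masked = tile_probes(0, 100, spacing=5, repeats=[(60, 70)])
```

Masking a short repeat removes every probe that touches it, so coverage is lost over a stretch roughly one probe length wider than the repeat itself; this is one reason excluded sequence lowers the effective probe density near repeats.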
With the advent of whole-genome sequencing using new high-throughput sequencing machines (HILLIER et al. 2008; SARIN et al. 2008) it might be asked whether SNP detection on microarrays is a reasonable technique for mutation detection. Deep sequencing is certainly a powerful method, but for now our method is easier to perform, as we have provided the web site for oligo design and data analysis. Mapping short reads and calling variants is still challenging using deep sequencing, but programs are coming online to make this much easier (see, for example, MAQ in LI et al. 2008). A CGH experiment can be done rapidly and involves less labor and, if desired, DNA labeling and hybridization can be outsourced to NimbleGen. This advantage will certainly be short lived as more and more sequencing machines become available and their use becomes more transparent. Our CGH method is also less expensive, but this situation too will no doubt change in the future as deep sequencing becomes commonplace. At present it is difficult to compare the two methods for accuracy of mutation detection. We have measured a false-positive and false-negative rate for CGH in this article, but at present there is no comparable measure for deep sequencing. We suspect that, with several short reads across an interval containing a mutation and with improvements in alignment programs such as MAQ, deep sequencing will become highly accurate. With either method one cannot avoid genetic mapping. For our SNP detection method one needs to do initial mapping to limit the mutation of interest to a small region of the genome. For deep sequencing one can sequence first, but one then has to determine which of several hundred changes in the genome is the causative change (HILLIER et al. 2008 and our unpublished results). A more effective approach using deep sequencing is illustrated in SARIN et al. (2008), where the gene of interest was first mapped to a 4-Mb interval.
ANDERSON, P., 1995 Mutagenesis. Methods Cell Biol. 48: 31–58.
BARNES, T. M., Y. KOHARA, A. COULSON and S. HEKIMI, 1995 Meiotic recombination, noncoding DNA and genomic organization in Caenorhabditis elegans. Genetics 141: 159–179.
CUPPEN, E., E. GORT, E. HAZENDONK, J. MUDDE, J. VAN DE BELT et al., 2007 Efficient target-selected mutagenesis in Caenorhabditis elegans: toward a knockout for every gene. Genome Res. 17: 649–658.
EDGLEY, M. L., and D. L. RIDDLE, 2001 LG II balancer chromosomes in Caenorhabditis elegans: mT1(II;III) and the mIn1 set of dominantly and recessively marked inversions. Mol. Genet. Genomics 266: 385–395.
FLIBOTTE, S., and D. G. MOERMAN, 2008 Experimental analysis of oligonucleotide microarray design criteria to detect deletions by comparative genomic hybridization. BMC Genomics 9: 497.
FLIBOTTE, S., M. L. EDGLEY, J. MAYDAN, J. TAYLOR, R. ZAPF et al., 2009 Rapid high resolution single nucleotide polymorphism-comparative genome hybridization mapping in Caenorhabditis elegans. Genetics 181: 33–37.
GRESHAM, D., D. M. RUDERFER, S. C. PRATT, J. SCHACHERER, M. J. DUNHAM et al., 2006 Genome-wide detection of polymorphisms at nucleotide resolution with a single DNA microarray. Science 311: 1932–1936.
GRESHAM, D., M. J. DUNHAM and D. BOTSTEIN, 2008 Comparing whole genomes using DNA microarrays. Nat. Rev. Genet. 9: 291–302.
HILLIER, L. W., G. T. MARTH, A. R. QUINLAN, D. DOOLING, G. FEWELL et al., 2008 Whole-genome sequencing and variant discovery in C. elegans. Nat. Methods 5: 183–188.
LI, H., J. RUAN and R. DURBIN, 2008 Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 18: 1851–1858.
MAYDAN, J. S., S. FLIBOTTE, M. L. EDGLEY, J. LAU, R. R. SELZER et al., 2007 Efficient high-resolution deletion discovery in Caenorhabditis elegans by array comparative genomic hybridization. Genome Res. 17: 337–347.
O'MEARA, M. M., H. BIGELOW, S. FLIBOTTE, J. F. ETCHBERGER, D. G. MOERMAN et al., 2009 Cis-regulatory mutations in the Caenorhabditis elegans homeobox gene locus cog-1 affect neuronal development. Genetics 181: 1679–1686.
SARIN, S., S. PRABHU, M. M. O'MEARA, I. PE'ER and O. HOBERT, 2008 Caenorhabditis elegans mutant allele identification by whole-genome sequencing. Nat. Methods 5: 865–867.
SELZER, R. R., T. A. RICHMOND, N. J. POFAHL, R. D. GREEN, P. S. EIS et al., 2005 Analysis of chromosome breakpoints in neuroblastoma at sub-kilobase resolution using fine-tiling oligonucleotide array CGH. Genes Chromosomes Cancer 44: 305–319.
SHARP, A. J., A. ITSARA, Z. CHENG, C. ALKAN, S. SCHWARTZ et al., 2007 Optimal design of oligonucleotide microarrays for measurement of DNA copy-number. Hum. Mol. Genet. 16: 2770–2779.
WEI, H., P. F. KUAN, S. TIAN, C. YANG, J. NIE et al., 2008 A study of the relationships between oligonucleotide properties and hybridization signal intensities from NimbleGen microarray datasets. Nucleic Acids Res. 36: 2926–2938.
Communicating editor: K. KEMPHUES

1666. A mixed-stage population of VC1415 {unc-4(e120)/mIn1[mIs14 dpy-10(e128)] II} was subjected to mutagenesis with TMP at 10 µg/ml for 1 hr followed by UV irradiation for 90 sec at 340 µW/cm² and then placed on food at 20°. Both unc-4 and dpy-10 mutations are recessive, and the mIn1 inversion suppresses recombination along the middle of chromosome II from lin-31 to rol-1 (
