No Evidence for an Association Between Common Nonsynonymous Polymorphisms in Delta and Bristle Number Variation in Natural and Laboratory Populations of Drosophila melanogaster
Anne Genissel (a), Tomi Pastinen (b), Andrea Dowell (a), Trudy F. C. Mackay (c), and Anthony D. Long (a)
a Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697-2525
b Montreal Genome Centre, McGill University Health Centre, Montreal, Quebec H3G 1A4, Canada
c Department of Genetics, North Carolina State University, Raleigh, North Carolina 27695
Corresponding author: Anne Genissel, Laboratoire d'Ecologie, UMR7625 Université Pierre et Marie Curie, cc 237, Bâtiment A, 7e étage, 7 quai Saint Bernard, 75252 Paris Cedex 05, France. E-mail: anne.genissel{at}snv.jussieu.fr
Communicating editor: M. AGUADÉ
| ABSTRACT |
|---|
We test the hypothesis that naturally occurring nonsynonymous variants in the Delta ligand of the Notch signaling pathway contribute to standing variation in sternopleural and/or abdominal bristle number in Drosophila melanogaster, for both a large cohort of wild-caught flies and previously described laboratory lines. We sequenced the transcribed region of Delta for 16 naturally occurring chromosomes and observed 65 SNPs, including 7 nonsynonymous SNPs (nsSNPs). The identified nsSNPs and 6 additional common SNPs, all located in exon 6 and the 3' UTR, were genotyped in 2060 wild-caught flies using an OLA-based methodology and in 38 additional natural chromosomes via DNA sequencing. None of the genotyped nsSNPs were significantly associated with natural variation in bristle number as assessed by a permutation test. A 95% upper bound on the additive genetic variance attributable to each genotyped SNP in the large natural cohort is <2% of the total phenotypic variation. Results suggest that two previously detected genotype/phenotype associations between bristle number and variants in the introns of Delta cannot be explained by linkage disequilibrium between these variants and nearby nonsynonymous variants. Unidentified regulatory variants more parsimoniously explain previous observations.
IT has been long debated whether the actual variants contributing to variation in quantitative traits and short-term evolutionary change are largely regulatory (![]()
Drosophila melanogaster sternopleural and abdominal bristle number is an excellent model system for dissecting the genetic basis of continuous variation (![]()
This work tests the hypothesis that nsSNPs in the Delta gene account for previously observed associations between other naturally occurring SNPs in the second and fifth introns of Delta and abdominal and sternopleural bristle number variation in D. melanogaster. We test every common nsSNP in Delta; thus if we fail to find associations we can exclude common nsSNPs as a class as contributors to bristle number variation. Although it would be desirable to examine additional SNPs, including the significant polymorphisms observed in ![]()
A second goal of this study is to compare the nature of associations between coding and noncoding SNPs in Delta and variation in sternopleural and abdominal bristle number in three different genetic backgrounds: (1) a large natural cohort of D. melanogaster (1031 females and 1029 males), (2) a set of lines consisting of 47 different natural homozygous third chromosomes in an otherwise isogenic background, and (3) a set of lines consisting of 46 lines differing only in a small natural introgressed fragment including Dl. We test whether allelic effects estimated in the natural population and set of laboratory lines differ. The answer to this question has important implications for theories attempting to explain the maintenance of quantitative genetic variation in terms of the frequencies and effects of the alleles at the underlying QTL (![]()
We employ a two-tiered approach to allow large-scale association studies to be efficiently carried out (cf. ![]()
| MATERIALS AND METHODS |
|---|
Bristle counts and genomic DNA preparation:
A sample of 2060 D. melanogaster adults (1029 males and 1031 females) was collected from openly fermenting Pinot Noir grapes and discarded pulp at an organic winery (Kaz Vineyard and Winery, Sonoma Valley, CA) in 1998. Of 2085 collected flies, 11 males and 14 females were discarded as they were determined to be D. simulans by use of a PCR assay that distinguishes D. melanogaster and D. simulans by amplicon size (data not shown). Sternopleural bristle number is taken to be the total number of macrochaetae and microchaetae on the left and right sternopleural plates, and abdominal bristle number to be the number of bristles on the fifth sternite for males or the sixth sternite for females. After bristles were counted for each fly, DNA was extracted from whole flies using the Puregene DNA isolation from cell and tissue kit (Gentra Systems, Research Triangle Park, NC) following the manufacturer's instructions.
The laboratory strains used were derived from a population sample collected in North Carolina and are described elsewhere (![]()
SNP identification:
Sequencing of the transcribed and small exon-flanking regions (25 bp on either side of each exon) of the Delta locus was performed on 16 isogenic lines of D. melanogaster (Fig 2; GenBank accession nos. AY437140–AY437235) and on one D. simulans inbred line (GenBank accession nos. AY438142–AY438146). All the nucleotide positions referred to throughout this work are from a fragment of the Drosophila genome stretching from ~14 kb upstream of the start of the 5' UTR of Delta to the end of the 3' UTR (DDBJ/EMBL/GenBank accession no. TPA: BK004004). Sequence data consist of six fragments for each of 16 strains, with the fragment including the 5' UTR and exon 1 extending from 14,784 to 15,542, exon 2 from 20,900 to 21,261, exon 3 from 24,975 to 25,091, exon 4 from 29,000 to 29,298, exon 5 from 32,348 to 32,452, and exon 6 and the 3' UTR from 34,552 to 38,460. Templates consisted of long overlapping PCR amplicons of ~5 kb in size covering the regions of interest (see ![]()
Additional sequencing and genotyping assay validation:
We sequenced a 1545-bp DNA fragment spanning the last 1320 bases of exon 6 and the first 225 bases of the 3' UTR (positions 35,016–36,560) for 38 additional chromosomes described in ![]()
SNP typing:
The SNPs were typed by two different assays: an assay based on the oligonucleotide ligation assay (OLA) and an allele-specific oligonucleotide (ASO) hybridization assay. The ASO assay was used to assess the error rate of the new OLA-based genotyping assay we developed.
Oligonucleotide ligation assay (OLA):
We performed 12-plex ligation-dependent probe amplifications on the basis of a method similar to one initially described by ![]()
PCR: A 2136-bp fragment covering the end of exon 6 and the beginning of the 3' UTR was PCR amplified in a 10-µl reaction from 20–50 ng of genomic (g)DNA template, with 0.27 units of Extaq polymerase in its recommended buffer (Panvera), 0.5 µM of each primer 5'-AAACCCTGTCATCAGGGAATCT and 5'-ATGAGGAGGTTTCTTTCAATCGT, and 200 µM of each dNTP. This amplicon contains all nsSNPs identified from the sequencing of 16 isogenic lines. A second round of PCR was performed to amplify a 1957-bp amplicon from 1 µl of first-round PCR product, diluted 100 times in water, using the same concentration of PCR reagents but internal primers 5'-CCTGTCATCAGGGAATCTGC and 5'-GTCGTCCAGTGGTTCTTGGT. Cycling conditions were 4 min at 95° followed by 30 cycles of 45 sec at 94°, 45 sec at 60°, and 90 sec at 72° for the first PCR; and 5 min at 95° followed by 25 cycles of 45 sec at 95°, 45 sec at 64°, and 45 sec at 72° for the second PCR.
Ligation and amplification:
Second-round PCR products were treated with Proteinase K (Fisher Scientific, Pittsburgh) at 0.5 µg µl⁻¹ for 30 min at 37° and 15 min at 80°. One microliter of the proteinase K-treated amplicon was added to a 3-µl ligation mix, containing 1.6 units of Taq DNA ligase (New England Biolabs, Beverly, MA), 7.5 nM of each oligonucleotide (for a total of 12 sets of three oligonucleotides), 50 mM Tris pH 8.5, 7.5 mM MgCl₂, 1 mM nicotinamide-adenine-dinucleotide, 50 mM KCl, and 2.5 mM dithiothreitol. The Taq DNA ligase catalyzes the formation of a phosphodiester bond between juxtaposed 5' phosphate and 3' hydroxyl of the two adjacent oligonucleotides that hybridize to the complementary template DNA. Ligation occurs with ~1000 times greater affinity if the oligonucleotide is perfectly paired to the complementary target DNA compared to a mismatched 3'-most base (![]()
Membrane preparation: Dry nylon membranes (Immobilon-N, Millipore) were gridded with the products of the OLA reactions using a 96-pin tool, for a total of 12 replicate filter sets at a density of 768 reactions per filter. DNA was denatured for 10 min in 0.5 M NaOH, 1.5 M NaCl and UV cross-linked to the nylon membrane at 50 mJ. Filters were then neutralized in 0.4 M Tris-HCl, 0.3 M NaCl, 0.03 M sodium citrate for 1 hr and stored at 4° in the same buffer prior to hybridization.
Probe preparation:
Radioactive labeling of each probe was carried out in a 10-µl reaction, 1 µM with respect to probe and using 5 units of T4 polynucleotide kinase (Fisher Scientific) in its own buffer and 2 µl of [γ-³³P]ATP [~20 µCi (~6.7 nmol); Dupont New England Nuclear], for 45 min at 37° and 15 min at 65°. Unincorporated ATP was removed using MicroSpin G-25 columns (Amersham, Arlington Heights, IL), following the manufacturer's instructions.
Prehybridization, hybridization, and washing:
A prehybridization step was performed for 3 hr at 42° in 1 mM EDTA pH 7.2, 0.5 M NaPi pH 7.2, 7% SDS, 1% bovine serum albumin (Sigma, St. Louis), and herring sperm DNA (Promega, Madison, WI) diluted to 1 µg µl⁻¹ and previously denatured at 96° for 5 min. Hybridization was performed for 3 hr, with radiolabeled probe diluted to ~2 nM in the same buffer as for prehybridization except that a nonspecific 18-mer oligonucleotide diluted to 40 nM was substituted for the herring sperm DNA. Washing consisted of 37° for 1 min, 37° for 30 min, and room temperature for 1 min. Filters were exposed against a phosphor screen for 6 days prior to scanning using a PhosphorImager 445 SI (Molecular Dynamics, Sunnyvale, CA). Each membrane was stripped at 80° for 10 min in 0.1% SDS and reprobed, for typing alternate alleles at a SNP, and membranes were reused for different SNPs. Hybridization steps were performed in duplicate on independent filters for each SNP.
ASO:
We performed ASO assays on a subset of the SNPs typed using OLA to provide a comparison between two largely independent assays and aid in estimating the accuracy of OLA.
Probe design:
Template-specific oligonucleotide probes of ~15 nucleotides (see Table S at http://www.genetics.org/supplemental/) having a melting temperature over 55° and containing the query SNP in the center were designed. The lengths of probes and washing temperatures were empirically adjusted to maximally discriminate between alternative alleles at each SNP (data not shown).
Membrane preparation: Dry nylon transfer membranes (Osmonics, Minnetonka, MN) were printed with the second-round PCR product amplicons described above. DNA was denatured in 0.4 M NaOH, 1.5 M NaCl for 10 min and UV cross-linked at 150 mJ to the nylon membrane.
Prehybridization, hybridization, and washing:
Prehybridization was performed at 37° for 3 hr in 5x Denhardt's buffer, 5x SSPE buffer (![]()
Washing was performed in 0.1% SDS, 5x SSPE (![]()
Image analysis and genotype calling:
Data were extracted from scanned filters using ArrayVision (version 6.0; Imaging Research, St. Catherine, Ontario, Canada). All genotype calling was carried out using custom routines written in the "R" statistical language (version 1.5.1, the R project for statistical computing, http://www.r-project.org/), after background subtraction. All signal intensity measurements were natural log transformed and the signals associated with alternative alleles/bar codes were plotted against one another. On the basis of a visual examination of the resulting X-Y plot we defined sets of points belonging to each of three visually apparent clusters corresponding to AA, Aa, and aa genotypes. For each cluster, a mean centroid ($M$) and variance/covariance matrix ($S$) are calculated. As clusters are based on a large number of observations, repeated independent definitions of the clusters by the user had little effect on resulting estimates of $M$ and $S$ (i.e., processing of the data subsequent to cluster definition seemed very resistant to deviations in how those clusters were initially defined). Based on the estimates of $S$ and $M$ for each cluster, and assuming bivariate normality, the likelihood of individual $i$ belonging to cluster $j$ (shown here for $j = AA$) equals $L_{AA} = L(y_i; M_{j=AA}, S_{j=AA}) / \sum_j L(y_i; M_j, S_j)$, where $L(y_i; M, S)$ is the bivariate normal probability distribution function and $y_i$ is a pair of log transformed intensity measures corresponding to the two alleles at that SNP. We assigned a provisional genotype to an individual if the relative likelihood of that individual belonging to one of the three clusters was >95% [e.g., $L_{AA}/(L_{AA} + L_{Aa} + L_{aa}) > 0.95$] and the likelihood of it belonging to that cluster was >0.0005 (i.e., a likelihood of 0.0005 corresponds to a point ~3 standard deviations from the cluster center). Data from the two replicate experiments for each SNP were then compared. We assigned an individual a genotype if either the provisional genotypes were the same in both replicates or one of the replicates had a provisional genotype and the other none (occurring with proportions $A$ and $B$, respectively). We did not assign an individual a genotype if the two provisional calls conflicted or we were unable to assign a provisional genotype for either replicate (occurring with proportions $C$ and $D$, respectively). We define call rate as the rate at which we assign an individual a genotype, which is equal to $A + B$. Furthermore, if we assume that the probability of a correct call in a single replicate is $x$, then we expect $C/(A + C) = 2x(1 - x)$ if replicates are independently assigned calls. We can solve for $x$ and estimate our miscall rate as $((1 - x)^2 A + (1 - x)B)/(A + B)$. The miscall rate is the proportion of called genotypes that we expect to be incorrect. We can also estimate the assay failure rate, or the rate at which the OLA assay gives a consistent yet incorrect call, through a comparison of two largely independent assays (i.e., OLA and ASO), with discrepancies being scored by DNA sequencing. We note that the assay failure rate is low enough (<<1%) that directly estimating this rate by DNA sequencing would require thousands of sequencing reactions. We estimate the assay failure rate for OLA as the fraction of discordant calls between ASO and OLA, for individuals called using both assays, multiplied by the fraction of these discordant calls for which the OLA assay is incorrect as assessed by DNA sequencing. This rate was estimated by sequencing a subset of 44 individuals covering 100 discrepancies between OLA and ASO (we chose individuals discrepant for multiple SNPs to minimize sequencing effort).
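The calling rule and the miscall-rate estimate described above can be sketched in a few lines. This is a minimal illustration, not the authors' R routines: the function names, the example cluster centroids and covariances, and the choice to apply the 0.0005 cutoff to the raw bivariate normal density are all assumptions made for the example. The larger root of the quadratic is taken for $x$ on the assumption that single-replicate calls are correct more often than not.

```python
import math

def bivariate_normal_pdf(y, mean, cov):
    """Density of a 2-D normal at y; cov is ((sxx, sxy), (sxy, syy))."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = y[0] - mean[0], y[1] - mean[1]
    # Squared Mahalanobis distance via the closed-form 2x2 inverse.
    m2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * m2) / (2.0 * math.pi * math.sqrt(det))

def provisional_call(y, clusters, rel_min=0.95, abs_min=0.0005):
    """Return the genotype whose cluster passes both thresholds, else None."""
    lik = {g: bivariate_normal_pdf(y, m, s) for g, (m, s) in clusters.items()}
    total = sum(lik.values())
    if total == 0.0:
        return None
    best = max(lik, key=lik.get)
    if lik[best] / total > rel_min and lik[best] > abs_min:
        return best
    return None

def single_replicate_accuracy(a, c):
    """Solve C/(A + C) = 2x(1 - x) for x, taking the root with x >= 0.5."""
    r = c / (a + c)
    if r > 0.5:
        raise ValueError("conflict fraction above 0.5 has no real solution")
    return 0.5 * (1.0 + math.sqrt(1.0 - 2.0 * r))

def miscall_rate(a, b, c):
    """Expected fraction of assigned genotypes that are wrong."""
    x = single_replicate_accuracy(a, c)
    return ((1.0 - x) ** 2 * a + (1.0 - x) * b) / (a + b)

# Hypothetical cluster definitions on log-intensity axes (not from the paper).
example_cov = ((0.1, 0.0), (0.0, 0.1))
example_clusters = {
    "AA": ((8.0, 4.0), example_cov),
    "Aa": ((7.0, 7.0), example_cov),
    "aa": ((4.0, 8.0), example_cov),
}
```

A point close to one centroid is called, whereas a point midway between two clusters fails the 95% relative-likelihood rule and is left uncalled; the replicate comparison then operates on these provisional calls.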
Population genetics data analyses:
The average per site heterozygosity π, the nucleotide diversity θ per site estimated from the number of segregating sites (![]()
χ² test on R² (![]()
Quantitative genetic analyses:
ANOVA tests for association between genotype and phenotype were performed for both additive and arbitrary dominant effect models (![]()
From the effects and associated errors estimated from the above models upper bounds can be placed on the magnitude of any true undetected underlying effects in the population, on the basis of the central limit theorem (![]()
and the total fraction of phenotypic variance attributable to the site as VAmax/VP, where VP is the total phenotypic variation. We calculate similar statistics for the laboratory lines as
and VAmax(lab)/VPlab, where amax(lab) is the estimate of the upper bound on ahap obtained from the laboratory lines and VPlab is the variance among laboratory lines. Estimates for the laboratory lines are for comparison only, as small sample sizes result in larger standard errors associated with parameter estimates.
| RESULTS |
|---|
Bristle variation in the natural D. melanogaster population:
Histograms showing the distributions of abdominal and sternopleural bristle number in wild-caught male and female D. melanogaster are shown in Fig 1. Bristle number appears normally distributed, with a mean ± SD of 16.9 ± 2.1 for sternopleural bristle number in males (SBM), 17.1 ± 2.1 for sternopleural bristle number in females (SBF), 16.6 ± 2.4 for abdominal bristle number in males (ABM), and 18.2 ± 2.7 for abdominal bristle number in females (ABF). A significant sexual dimorphism was observed for both traits (Student's t-tests: SBM vs. SBF, t = -3.00, P = 0.002; ABM vs. ABF, t = -14.2, P < 0.001). The total phenotypic variation within the population is close to the phenotypic variation for single third chromosome isogenic lines observed in previously published studies. For example, the ratios (Vnature/VW;lab) of observed phenotypic variance in the wild and a previously described set of isogenic third chromosomes are 1.240, 1.143, 0.661, and 0.807 for SBM, ABM, SBF, and ABF, respectively (![]()
~40% of the Drosophila genome, bristle variation should thus not be thought of as strictly additive over chromosomes.
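The sexual dimorphism test reported above can be approximately re-derived from the published summary statistics alone. The sketch below assumes a pooled-variance (Student's) two-sample t statistic and uses the rounded means, standard deviations, and sample sizes given in the text (1029 males, 1031 females); the function name is illustrative. Because the inputs are rounded, the result can only be expected to match the reported statistic approximately.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample Student's t statistic computed from summary statistics."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se

# Abdominal bristle number: males (ABM) vs. females (ABF), values from the text.
t_abdominal = pooled_t(16.6, 2.4, 1029, 18.2, 2.7, 1031)
```

With these inputs the abdominal comparison comes out close to the reported t = -14.2. The sternopleural comparison is far more sensitive to rounding, since the two means differ by only about 0.2 bristles.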
SNP identification and molecular population genetics of Dl:
The regions sequenced for 16 isogenic chromosomes consist of a portion of the Delta gene limited to the 5' and 3' UTRs (2801 bp) and the six exons (for a total of 2493 bp), each flanked by 25 bp of intronic sequence. All observed segregating sites (N = 110) are listed in Fig 2. A total of 65 SNPs and 19 insertion/deletion (indel) polymorphisms were identified. Indels were present in only noncoding regions and in all cases were <10 nucleotides. Of the 65 SNPs, 38% were in 5' and 3' UTRs (utrSNPs), 5% were in flanking intronic regions, and 57% were in coding regions (cSNPs). Of the 37 cSNPs, 7 were nonsynonymous (nsSNPs) and were all located within the large sixth exon. Despite all the nonsynonymous SNPs being in the sixth exon, the ratio of synonymous to nonsynonymous SNPs in the sixth exon was not significantly different from the same ratio over the other five exons, as assessed by a contingency table analysis (P = 0.173).
The average per site heterozygosity, π, was estimated to be 0.00394 for the surveyed region (Table 1). The estimate of nucleotide diversity θ from the number of segregating sites was 0.00356. Estimates of π and θ are slightly smaller than previous estimates of nucleotide variation described for a restriction survey of the Delta region that included introns and flanking regions (![]()
π and θ are the same, and departure from neutrality can be detected as the normalized difference between π and θ, a statistic referred to as Tajima's D (![]()
compared to noncoding regions, suggesting that 5' and 3' UTRs are under greater functional constraints than regions coding for amino acids. The parameter R = 4Nc, which corresponds to four times the effective population size multiplied by the recombination rate per generation between the most distant sites in the survey (5.5 kb apart), is estimated to be 84.4. The estimate of 15.4 R kb⁻¹ is similar to the previous estimate of 14.9 R kb⁻¹ for the Delta region on the basis of a six-cutter restriction enzyme survey (![]()
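For readers who want to see how the diversity statistics relate, the sketch below computes the segregating-sites estimator of nucleotide diversity and Tajima's D using the standard textbook formulas. It is an illustration rather than the software used in the study; the function names are assumptions, and the check values come from a commonly quoted worked example (10 sequences, 16 segregating sites, mean pairwise difference 3.888889), not from the Delta data.

```python
import math

def harmonic_sums(n):
    """Return a1 = sum 1/i and a2 = sum 1/i^2 for i = 1 .. n-1."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    return a1, a2

def theta_per_site(n, seg_sites, n_sites):
    """Diversity per site estimated from the number of segregating sites."""
    a1, _ = harmonic_sums(n)
    return seg_sites / (a1 * n_sites)

def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from sample size, segregating sites, and pairwise differences."""
    if n < 4 or seg_sites < 1:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1, a2 = harmonic_sums(n)
    b1 = (n + 1.0) / (3.0 * (n - 1.0))
    b2 = 2.0 * (n * n + n + 3.0) / (9.0 * n * (n - 1.0))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2.0) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # Numerator: difference between the two estimators on a per-locus scale.
    diff = mean_pairwise_diff - seg_sites / a1
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1.0)
    return diff / math.sqrt(variance)
```

A negative D indicates an excess of rare variants relative to neutral expectation and a positive D an excess of intermediate-frequency variants; values near zero, such as the D = -0.124 reported for the 54-sequence sample, give no evidence against neutrality.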
Sequencing of 38 additional isogenic chromosomes for three overlapping amplicons (total of 1.6 kb) that include all identified coding SNPs from the survey of 16 chromosomes resulted in the detection of seven additional biallelic SNPs and one triallelic SNP. Among the newly identified cSNPs, we observed four synonymous and one nonsynonymous biallelic singletons and a triallelic nsSNP (with minor alleles C and A observed once and twice, respectively). We also observed two additional SNPs in the 3' UTR, one a singleton and the other observed twice. The average minor-allele frequency among the eight newly identified SNPs was 2.21 ± 0.87%, providing empirical support for the assertion that the sequencing of 16 alleles identified the bulk of common variants. The estimate of Tajima's D from the total of 54 DNA sequences for a smaller region including only most of exon 6 and part of the 3' UTR still did not provide evidence for departure from neutrality (D = -0.124, P > 0.10).
A Hudson-Kreitman-Aguadé test was carried out to compare polymorphism to divergence in Dl UTRs vs. coding DNA (![]()
χ² = 0.012, P = 0.9134). A McDonald and Kreitman test was carried out using the set of 54 sequenced lines for much of exon 6 and the D. simulans sequence to determine if coding regions of Dl have evolved in a nonneutral manner (![]()
Genotyping a large natural cohort:
On the basis of the sequencing survey of 16 isogenic chromosomes we chose a subset of SNPs to type in a large sample of wild-caught flies. We genotyped 6 nsSNPs, but did not type the seventh (nsSNP 35,820) as it is in complete LD with typed nsSNP 35,791 in both the samples of 16 (Fig 2) and 54 (Fig 3) chromosomes. We genotyped 6 additional common SNPs (synonymous or 3' UTR) located in the same 2136-bp amplicon that includes the 1358 most 3' bases of exon 6 and the 778 adjacent bases of the 3' UTR. Of the 12 SNPs genotyped, 2 nsSNPs (positions 35,039 and 36,195) are monomorphic in the large sample of 2060 flies. These two sites were also singletons in both the initial sequencing survey of 16 and the combined set of 54 chromosomes. The observation of a singleton in a small sample of chromosomes that is at a much lower frequency in the population from which the sample is drawn is not inconsistent with the Wright-Fisher sampling distribution (![]()
We examined pairwise linkage disequilibrium between SNPs for both the small haploid and large diploid data sets. The haploid data consisted of 24 SNPs with minor-allele frequencies >5% (N = 54; Fig 3A). Although a few polymorphic sites were in strong linkage disequilibrium, most pairs of sites did not show significant departures from equilibrium (significant LD was observed for 24% of the comparisons at 0.05, for 0.2% at 0.01, and for 0.5% at 0.001). These results are consistent with the relatively high level of recombination within the Delta region suggested by previous surveys (![]()
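As a concrete illustration of a pairwise linkage disequilibrium test on haploid data, the sketch below computes the squared allelic correlation between two biallelic SNPs and the associated chi-square statistic (sample size times the squared correlation, 1 d.f.). This is one common formulation and is assumed here for illustration; the 0/1 allele coding and the function name are not from the paper.

```python
def ld_r2(hap_a, hap_b):
    """Return (r2, chi2) for two biallelic SNPs scored 0/1 on the same chromosomes."""
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype vectors must be non-empty and equal in length")
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    # Frequency of chromosomes carrying allele 1 at both sites.
    p_ab = sum(1 for x, y in zip(hap_a, hap_b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("both sites must be polymorphic in the sample")
    r2 = d * d / denom
    return r2, n * r2
```

The chi-square value would be compared with the 1-d.f. critical value (3.84 at the 0.05 level) for each pair of sites.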
Data quality from the high-throughput OLA genotyping assay:
As we present a relatively novel SNP typing assay, it is worthwhile to evaluate the quality (and quantity) of data obtained using this method. We summarize the efficiency of our approach using three statistics. The average call rate per SNP is the proportion of individuals we assay to which we assign a genotype. In this experiment we observe an average call rate of 90.7 ± 2.8% (mean ± standard deviation; Table 2). A large fraction of the non-called individuals are likely due to PCR failures, which are estimated to be ~5% on the basis of running agarose gels (N = 384; data not shown). We further define a statistic called the miscall rate, which is the proportion of individuals that we estimate are assigned the incorrect genotype. We estimate our miscall rate to be 0.087 ± 0.06% (mean ± standard deviation; Table 2). Thus our method performs at close to the 99.9% accuracy attributable to many SNP typing methodologies, although here quality control is assessed directly from the SNP typing experiment itself.
Both the call and miscall rates are calculated on the basis of measures of internal consistency over replicates. It is possible that an assay is internally very consistent (hence having a low miscall rate) and yet biased in some manner so that it consistently makes the incorrect call. We refer to this source of error as the assay failure rate. Such failures are often attributed to uncontrolled sources of variation such as polymorphisms in primer-binding sites. To assess assay failure rate we need to estimate error rates using an independent assay method. We used ASO assays successfully developed for four of the SNPs typed by OLA (SNPs 35,639, 35,791, 36,116, and 36,271) to identify individuals for which the two methods gave discordant calls (121, 180, 176, and 142 individuals, respectively). We then used DNA sequencing for a subset of the discordant individuals to determine which method is correct. DNA sequencing and OLA were in agreement for 12/12, 12/12, 43/44, and 32/32 sequences for SNPs 35,639, 35,791, 36,116, 36,271, respectively. Thus we estimate the OLA failure rate as 0.12 ± 0.04% standard deviation over SNPs (Table 2). We conclude that OLA has an acceptable failure rate given our assay conditions, but note that the assay failure rate is likely to vary on a SNP-by-SNP basis and is generally difficult to measure.
Associations between bristle number variation and polymorphic SNPs in Dl:
Fig 4 plots F-statistics of tests for association between SNPs and bristle number variation in both wild-caught and laboratory lines. Fig 4A and B show additive and arbitrary dominance models fitted to the wild population, whereas Fig 4C and D show a haploid model fitted to 47 isogenic lines homozygous for the entire third chromosome or 46 lines homozygous for an introgressed fragment including Dl, respectively (![]()
In addition to analyzing marker/phenotype associations separated by sex, we also fitted models in which the two sexes were combined and we tested for marker, sex, and marker-by-sex interactions. Results from these analyses were in qualitative agreement with those obtained when the sexes were considered separately. The one exception was for SNP 36,271, where we detected an abdominal bristle-number-by-sex interaction (F = 6.00, P = 0.014), despite not detecting a significant marker effect or significant effects on bristle number when the sexes were analyzed separately. We also tested all possible two-way epistatic interactions between SNPs marginally associated with bristle number variation (regardless of the background, sex, or bristle trait for which we observed significance). For the laboratory lines we fitted the additive effect at each marker and a two-way interaction, for the large natural cohort we fitted an additive and arbitrary dominance effect at each marker and an additional additive-by-additive interaction term with data being analyzed separately by sex. None of these pairwise comparisons were found significant.
Although empirical thresholds were different for the haploid and diploid models as the number of markers and individuals varied over experiments, empirical thresholds were generally very similar within the haploid and diploid models over characters and sexes. The one exception to this rule is the W background for abdominal bristle number (Fig 4C, solid and shaded lines for ABM and ABF, respectively). These departures from a more typical threshold of ~11 were due mainly to two phenotypically atypical lines; one line had a count of 26.7 abdominal bristles in males and the other a count of 15.1 abdominal bristles in females. Once these two lines were removed and the permutation testing was repeated, the threshold for experimentwise significance at P < 0.05 returned to more typical values of 10.9 and 11.1 for ABM and ABF, respectively.
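An empirical experimentwise threshold of this kind can be obtained by permuting phenotypes against fixed genotypes and recording the largest test statistic over all markers in each permutation. The sketch below is a generic version of that idea under stated assumptions: a one-way ANOVA F statistic per marker, no missing genotypes, and hypothetical function names. It is not the authors' implementation.

```python
import random

def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups (lists of floats)."""
    groups = [g for g in groups if g]
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        return 0.0
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    if ss_within == 0.0:
        return 0.0 if ss_between == 0.0 else float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def max_f(phenotypes, genotypes_by_marker):
    """Largest F statistic over all markers for one phenotype vector."""
    best = 0.0
    for genotypes in genotypes_by_marker:
        groups = {}
        for g, y in zip(genotypes, phenotypes):
            groups.setdefault(g, []).append(y)
        best = max(best, anova_f(list(groups.values())))
    return best

def permutation_threshold(phenotypes, genotypes_by_marker,
                          n_perm=1000, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the permutation distribution of max F."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype/phenotype association
        null_max.append(max_f(shuffled, genotypes_by_marker))
    null_max.sort()
    index = min(len(null_max) - 1, int((1.0 - alpha) * len(null_max)))
    return null_max[index]
```

Because the threshold is driven by the upper tail of the permuted statistics, a few extreme phenotypic values in a small set of lines can shift it noticeably, which is in keeping with the effect of the two atypical lines described above.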
Small effects can be detected:
Despite the lack of significance via the permutation-testing framework, the maximum-likelihood estimates of the phenotypic effects associated with each SNP in the large natural cohort suggest that the failure to uncover associations is not due to lack of power. Table 3 lists effects associated with each of the 10 SNPs typed in the large natural cohort (with sign relative to the major allele) under additive and arbitrary dominance models for diploid data and under a haploid model for the laboratory lines. Standard deviations associated with phenotypic effects are generally larger in the laboratory than in natural populations, with average standard deviations of 0.89, 0.47, and 0.16 for the W background, B background, and wild cohort under an additive model, respectively. Thus, with large sample sizes phenotypic effects associated with markers can be accurately measured even without controlling for environmental and segregating genetic variation.
On the basis of accurate genotypic frequency and allelic-effect estimates in the large natural cohort it is possible to ask how much of the total phenotypic variation could possibly be associated with each of the SNPs examined. To do this we use the central limit theorem to place a 95% upper bound on the maximum-likelihood estimate of the phenotypic effect associated with each marker (amax). This 95% upper bound on a is conceptually closely related to a power analysis, as the expected value of amax is the point where we have roughly 80% power to reject the null hypothesis (see DISCUSSION). We then substitute our estimated amax into the formula for the additive genetic variance under an additive model to calculate VAmax. We calculate a similar statistic for haploid laboratory lines primarily for comparison, although estimates of laboratory parameters are generally less accurate and results are more difficult to interpret. Fig 5A (male data) and Fig 5B (female data) plot the maximum proportion of total variance attributable to each SNP by sex and trait, using colors to identify the different genetic backgrounds. Fig 5 shows that the estimates of phenotypic effects associated with each marker in the large natural cohort are inconsistent with these markers contributing to a meaningful proportion of the total phenotypic variation. In fact, the largest fraction of total phenotypic variation attributable to any of the nonsynonymous SNPs, for either of the bristle characters in either males or females in the wild cohort, is only 1.7%.
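The upper-bound calculation can be sketched as follows. Several pieces are assumptions made for illustration, since the exact formulas are not recoverable from the text: the bound is taken as the absolute effect estimate plus 1.645 standard errors (a one-sided 95% normal bound), and the additive variance at a biallelic site is taken as 2pq a², the standard expression for a purely additive locus with allele frequencies p and q. The input values in the example are hypothetical, apart from the standard error of 0.16 and the phenotypic standard deviation of 2.1, which echo magnitudes quoted in the text.

```python
def max_variance_fraction(a_hat, se, p, v_p, z=1.645):
    """Upper bound on the fraction of phenotypic variance explained by one site.

    a_hat: estimated additive effect; se: its standard error;
    p: allele frequency; v_p: total phenotypic variance;
    z: normal quantile for the bound (1.645 assumed, one-sided 95%).
    """
    a_max = abs(a_hat) + z * se
    v_a_max = 2.0 * p * (1.0 - p) * a_max ** 2
    return a_max, v_a_max, v_a_max / v_p

# Hypothetical site: effect 0.1 bristles, SE 0.16, allele frequency 0.3,
# phenotypic variance 2.1 ** 2 (male sternopleural SD from the text).
a_max, v_a_max, fraction = max_variance_fraction(0.1, 0.16, 0.3, 2.1 ** 2)
```

Under these illustrative inputs the bound is a little over 1% of the phenotypic variance, the same order of magnitude as the <2% bounds reported for the natural cohort.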
An objective of this work was to compare phenotypic effects between the natural populations and the laboratory lines. We estimated correlation between effects measured in different backgrounds over SNPs, characters, and sexes (i.e., each correlation is over 10 SNPs × 2 characters × 2 sexes). We observed that effects in the W background and the large natural cohort were not correlated (t = 1.93, r = 0.30, P > 0.5), although effects measured in the B background were significantly correlated with those measured in the large natural cohort (t = 3.99, r = 0.53, P < 0.01). We do not have a good explanation for why this correlation exists, as effects measured in two arbitrary halves of the natural cohort are uncorrelated (t = 1.40, r = 0.22, P > 0.5). The correlation of effects over arbitrary halves of the natural cohort would seem to define an upper bound on what the correlation should be between the natural cohort and any other background; thus it is likely that the observed correlation between effects in the B background and the natural cohort is an unexplained artifact. One possibility is that estimates of allelic effects in the B background are accurate estimates of true parameters in comparison to the W background (since more non-Dl segregating and environmental variation is controlled for in the B background), and we confirm these measures only when we have the power of the full natural cohort.
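The t statistics quoted for these correlations can be related to the correlation coefficients with the usual transformation t = r·sqrt(n - 2)/sqrt(1 - r²). The sketch below assumes n = 40 paired effect estimates (10 SNPs × 2 characters × 2 sexes); that sample size is inferred from the text rather than stated as such, and the function name is illustrative.

```python
import math

def correlation_t(r, n):
    """t statistic (n - 2 d.f.) for testing a Pearson correlation against zero."""
    if n < 3 or abs(r) >= 1.0:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

t_w_background = correlation_t(0.30, 40)   # W background vs. natural cohort
t_split_halves = correlation_t(0.22, 40)   # arbitrary halves of the cohort
```

With n = 40 the two smaller coefficients closely recover the quoted t values of 1.93 and 1.40, which supports the assumed sample size; small differences are to be expected because the published coefficients are rounded to two decimals.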
| DISCUSSION |
|---|
A novel method for high-throughput SNP genotyping:
The high-throughput genotyping methodology we developed is based on previous methods investigated by ![]()
Predicting functional SNPs on the basis of sequence analysis:
The molecular population genetic analysis of Delta does not suggest it has been a target of recent natural selection. We did not observe an excess of rare variants in DNA sequence data compared with what is expected under a neutral model (![]()
Figure legend (fragment): … 0.50 for each site; the 1957-bp DNA fragment examined in the large natural population is underlined. The first column indicates the isogenic line numbers corresponding to the numbers described in …