TABLE 1

Parameters of nucleotide diversity in the Drosophila EGFR

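| Region | Length (bp) | Start (a) | Het (b) | θ (c) | Segregating polymorphisms: Stot (d) | Segregating polymorphisms: Scomm (d) | Segregating polymorphisms: Repl (rare) (e) | Segregating polymorphisms: Indel (f) | cf. D. simulans: Silent | cf. D. simulans: Repl | cf. D. simulans: Indel |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 5′ exon 1 | 605 | 5402 | 0.015 | 0.013 | 46 | 23 | | 7 | 13 | | 7 |
| Exon 1 | 154 | 6016 | 0.009 | 0.008 | 7 | 5 | 6 (2) | 1 | 0 | 6 | 1 |
| Intron 1 | 492 | 6170 | 0.007 | 0.008 | 21 | 9 | | 7 | ND | | ND |
| 5′ exon 2 | 416 | 30120 | 0.006 | 0.010 | 23 | 6 | | 9 | 13 | | 1 |
| Exon 2 | 295 | 30518 | 0.008 | 0.013 | 20 | 6 | 3 (3) | 0 | 5 | 0 | 0 |
| 3′ exon 2 | 1,384 | 30819 | 0.012 | 0.011 | 82 | 42 | | 16 | 27 | | 8 |
| Intron 2 | 2,425 | 35340 | 0.004 | 0.008 | 112 | 27 | | 13 | 25 | | 2 |
| Exon 3 | 223 | 37757 | 0.015 | 0.010 | 13 | 10 | 0 (0) | 0 | 6 | 0 | 0 |
| Intron 3 | 170 | 37980 | 0.052 | 0.037 | 30 | 20 | | 5 | 10 | | 1 |
| Exon 4 | 1,175 | 38116 | 0.010 | 0.008 | 54 | 32 | 1 (1) | 0 | 22 | 0 | 0 |
| Intron 4 | 66 | 39291 | 0.012 | 0.015 | 6 | 2 | | 3 | 6 | | 0 |
| Exon 5 | 133 | 39358 | 0.017 | 0.016 | 12 | 5 | 2 (1) | 0 | 0 | 2 | 0 |
| Intron 5 | 74 | 39491 | 0.009 | 0.012 | 5 | 1 | | 1 | 7 | | 1 |
| Exon 6 | 2,446 | 39561 | 0.008 | 0.007 | 95 | 56 | 8 (8) | 1 | 44 | 2 | 0 |
| 3′ UTR | 347 | 42010 | 0.010 | 0.010 | 20 | 11 | | 2 | 6 | | 1 |
| Intergenic | 455 | 42355 | 0.003 | 0.005 | 13 | 4 | | 0 | 9 | | 0 |
| Total | 10,863 | | 0.009 | 0.009 | 546 | 259 | 20 (15) | 65 | 193 | 10 | 22 |
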
  • a Position of the first base of the region in GenBank Drosophila accession NG_000184 (17571116). Segment lengths do not necessarily match our sequence lengths exactly because of indel variation among sequences.

  • b Het (heterozygosity) estimated as the average expected nucleotide heterozygosity at Hardy-Weinberg proportions.

  • c θ per nucleotide, estimated as Stot divided by the sum from i = 1 to n − 1 of 1/i, where n is the average number of alleles in the region, and then divided by the number of nucleotides in the sequence segment.

  • d Stot is the total number of segregating SNPs, and Scomm is the number of common SNPs (frequency of the less common allele >5%).

  • e Repl (rare) is the number of replacement polymorphisms, with the number of rare replacement polymorphisms in parentheses.

  • f Indel information is for the North American sample only.
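
The estimators described in footnotes b and c can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, θ is assumed to be Watterson's estimator with the harmonic sum running from i = 1 to n − 1, heterozygosity is assumed to be 2p(1 − p) summed over biallelic sites and divided by segment length, and the inputs in the usage lines are made-up values because the table does not report n or per-site allele frequencies.

```python
def watterson_theta_per_site(s_tot, n_alleles, length_bp):
    """Per-nucleotide theta from the number of segregating sites.

    Assumes Watterson's estimator: S / a_n / L, where
    a_n = sum_{i=1}^{n-1} 1/i and n is the number of sampled alleles.
    """
    if n_alleles < 2 or length_bp <= 0:
        raise ValueError("need at least 2 alleles and a positive length")
    a_n = sum(1.0 / i for i in range(1, n_alleles))
    return s_tot / a_n / length_bp


def expected_heterozygosity_per_site(allele_freqs, length_bp):
    """Average expected heterozygosity per nucleotide.

    Assumes biallelic sites at Hardy-Weinberg proportions, so each
    segregating site contributes 2p(1 - p); monomorphic sites contribute 0.
    """
    if length_bp <= 0:
        raise ValueError("length must be positive")
    return sum(2.0 * p * (1.0 - p) for p in allele_freqs) / length_bp


# Illustrative inputs only (not taken from the table):
# 11 segregating sites, 4 alleles, 100 bp segment.
theta = watterson_theta_per_site(11, 4, 100)
# Two segregating sites with allele frequencies 0.5 and 0.1 in a 10 bp segment.
het = expected_heterozygosity_per_site([0.5, 0.1], 10)
print(theta, het)
```

Dividing by segment length at the end keeps both quantities on the same per-nucleotide scale as the Het and θ columns, so values for regions of very different length can be compared directly.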