Statistical method for testing the neutral mutation hypothesis by DNA polymorphism.
F Tajima

Abstract

The relationship between two estimates of genetic variation at the DNA level, namely the number of segregating sites and the average number of nucleotide differences estimated from pairwise comparison, is investigated. It is found that the correlation between these two estimates is large when the sample size is small, and decreases slowly as the sample size increases. Using the relationship obtained, a statistical method for testing the neutral mutation hypothesis is developed. This method requires only DNA polymorphism data, namely the genetic variation within a population at the DNA level. A simple method of computer simulation, which was used to obtain the distribution of the newly developed statistic, is also presented. Applying this statistical method to five regions of DNA sequences in Drosophila melanogaster, it is found that large insertions/deletions (greater than 100 bp) are deleterious. It is suggested that natural selection against large insertions/deletions is so weak that a large amount of variation is maintained in a population.
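The abstract names the two estimates but gives no formulas. As an illustration only, the following Python sketch computes both estimates from a set of aligned sequences and combines them using the form of the test statistic commonly cited as Tajima's D. The function names are hypothetical, the constants follow the usual published definitions rather than anything stated in this abstract, and every alignment column (including gaps or ambiguous bases) is treated naively as an ordinary character.

```python
from collections import Counter
from math import sqrt, nan


def segregating_sites(seqs):
    """Count alignment columns in which more than one state is observed (S)."""
    return sum(1 for col in zip(*seqs) if len(set(col)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs of sequences (pi)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    total = 0.0
    for col in zip(*seqs):
        counts = Counter(col).values()
        # Number of differing pairs in this column: (n^2 - sum of squared counts) / 2
        total += (n * n - sum(c * c for c in counts)) / 2
    return total / (n * (n - 1) / 2)


def tajimas_d(n, S, pi):
    """Standardized difference between pi and S / a1 (assumed standard form)."""
    if n < 2 or S == 0:
        return nan  # undefined without polymorphism
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * S + e2 * S * (S - 1)
    if variance <= 0:
        return nan  # e.g. n = 2, where the variance term vanishes
    return (pi - S / a1) / sqrt(variance)


# Toy alignment, invented purely for illustration (not data from the paper)
sample = ["AAAA", "AAAT", "AATT", "ATTT"]
S = segregating_sites(sample)
pi = mean_pairwise_differences(sample)
print(S, pi, tajimas_d(len(sample), S, pi))
```

For example, with a sample of 10 sequences, 16 segregating sites and an average of about 3.889 pairwise differences, this sketch gives a value of about -1.45. Whether such a value is significant depends on the distribution of the statistic, which the paper obtains by computer simulation and which is not reproduced here.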