Abstract

We consider the distribution of pairwise sequence differences of mitochondrial DNA or of other nonrecombining portions of the genome in a population that has been of constant size and in a population that has been growing in size exponentially for a long time. We show that, in a population of constant size, the sample distribution of pairwise differences will typically deviate substantially from the geometric distribution expected, because the history of coalescent events in a single sample of genes imposes a substantial correlation on pairwise differences. Consequently, a goodness-of-fit test of observed pairwise differences to the geometric distribution, which assumes that each pairwise comparison is independent, is not a valid test of the hypothesis that the genes were sampled from a panmictic population of constant size. In an exponentially growing population in which the product of the current population size and the growth rate is substantially larger than one, our analytical and simulation results show that most coalescent events occur relatively early and in a restricted range of times. Hence, the "gene tree" will be nearly a "star phylogeny" and the distribution of pairwise differences will be nearly a Poisson distribution. In that case, it is possible to estimate r, the population growth rate, if the mutation rate, μ, and current population size, N0, are assumed known. The estimate of r is the solution to ri/μ = ln(N0r) − γ, where i is the average pairwise difference and γ ≈ 0.577 is Euler's constant.
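The estimating equation for r has no closed-form solution, so in practice it would be solved numerically. The following is a minimal sketch of one way to do that by bisection, not the authors' own procedure. The function name `estimate_growth_rate`, its parameters, and the numerical values in the usage lines are hypothetical and chosen only for illustration. The equation generally has two roots; the sketch returns the larger one, on the assumption that this is the root consistent with the stated condition that the product of N0 and r is substantially larger than one.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, gamma in the abstract


def estimate_growth_rate(mean_diff, mu, n0, rel_tol=1e-12, max_iter=200):
    """Solve r * mean_diff / mu = ln(n0 * r) - gamma for r by bisection.

    mean_diff : average pairwise difference (i in the abstract)
    mu        : mutation rate, assumed known
    n0        : current population size, assumed known
    Returns the larger of the two roots (an assumption of this sketch).
    """
    if mean_diff <= 0 or mu <= 0 or n0 <= 0:
        raise ValueError("mean_diff, mu and n0 must all be positive")

    a = mean_diff / mu

    def f(r):
        # f(r) = 0 is the estimating equation rearranged to one side.
        return a * r - math.log(n0 * r) + EULER_GAMMA

    # f is convex in r with its minimum at r = 1/a; a root exists only
    # if f is negative there.
    lo = 1.0 / a
    if f(lo) >= 0:
        raise ValueError("no solution with N0 * r substantially above one")

    # Expand the upper bracket until f changes sign.
    hi = 2.0 * lo
    while f(hi) <= 0:
        hi *= 2.0

    # Bisection on [lo, hi], where f(lo) < 0 < f(hi).
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
        if hi - lo <= rel_tol * hi:
            break
    return 0.5 * (lo + hi)


# Round-trip check with hypothetical values: generate the mean pairwise
# difference implied by a chosen r, then recover r from it.
r_true, mu, n0 = 0.01, 1e-3, 1e6
mean_diff = mu * (math.log(n0 * r_true) - EULER_GAMMA) / r_true
r_hat = estimate_growth_rate(mean_diff, mu, n0)
```

Bisection is used here because the rearranged equation is convex in r, so a sign change between r = i/μ inverted (that is, μ/i) and a sufficiently large upper bound brackets the larger root without needing derivatives or an external library.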

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)