A fundamental concept in population genetics is the representation of the evolutionary history of a sample at a single locus as a tree. The theory describing such trees is called “coalescence theory,” and the primary mathematical model used is called “the coalescent,” or sometimes “Kingman’s coalescent,” as it was first discovered by the British mathematician Kingman (1982a,b). Coalescence theory is used to understand the statistical properties of a sample from a population, and it underlies almost all the computational methods used for analysis of population-level DNA sequence data.

The discovery of Kingman’s coalescent is arguably one of the most important theoretical discoveries in all of biology over the past 50 years. It was the culmination of decades of work on population genetic theory by Ewens (1972), Watterson (1975), Gladstein (1978), Griffiths (1980), and others. The basic idea of representing the history of a sample as a tree had been percolating for a while. For example, Gladstein (1978) described a process, akin to Kingman’s coalescent, of loss of evolutionary lineages in the population over time. Griffiths (1980) derived mathematical properties of the tree structure of a sample, but used a more complicated, and less general, construction than the one eventually discovered by Kingman (1982a,b). Today, we celebrate the seminal contributions of Kingman in the development of the coalescent by appropriately naming the process after him. However, by the early 1980s the field had matured to such a degree that coalescence theory, in one form or another, was being developed independently by several researchers, including two graduate students: Hudson (1983a,b) working with John Gillespie at the University of California, Davis, and Tajima (1983) working with Masatoshi Nei at the University of Texas at Houston, both of whom would become central in the development of modern population genetic theory.

Tajima was trained in both phylogenetics and population genetics and was therefore well positioned to make inroads into problems regarding tree representations of the genealogical structure of a sample in a population. In his 1983 paper in *GENETICS* (Tajima 1983), he developed many of the most important results in coalescence theory, such as means and variances of the time to most recent common ancestor of the sample, and he illustrated how many classical population genetic results could be easily rederived using coalescence theory. He did so apparently independently of the work of Kingman, of which he was unaware at the time. In addition, he studied coalescence trees in models with two diverging populations and derived the probabilities of different tree topologies in this context. Probabilities of tree topologies in models with multiple populations (or species) were also an important part of the contemporaneous paper by Hudson (1983b). Together, these papers provided the first mathematical descriptions of tree structures caused by what would later become known as “incomplete lineage sorting” (ILS), a very important concept in our understanding of phylogenetic trees. They initiated decades of research on the interface between phylogenetics and population genetics and on understanding ILS and its consequences. However, this is not the main reason Tajima’s (1983) paper became so highly cited. In a later section of the paper, Tajima provided the first derivation of the variance of the average number of pairwise differences (π) under the infinite sites model, and argued in favor of using π as an estimator of the mutation-scaled effective population size (θ). This estimator came into common use and is now often referred to as “Tajima’s estimator.” It remains one of the standard statistical methods for analyzing population genetic data.
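To make the estimator concrete, the following is a minimal sketch of how π might be computed from a small set of aligned sequences: count the differences between every pair of sequences and average over all pairs. Under the infinite sites model the expected value of π equals θ, which is why it serves as an estimator. The function names (`pairwise_differences`, `tajima_pi`) and the toy sample are illustrative assumptions, not taken from Tajima (1983), and the sketch assumes equal-length sequences with no gaps or missing data.

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Count the sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(seq_a, seq_b))


def tajima_pi(sequences):
    """Average number of pairwise differences over all pairs in the sample.

    Hypothetical helper for illustration: returns pi for the whole locus,
    not per site (divide by the sequence length for a per-site value).
    """
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    total = sum(
        pairwise_differences(a, b) for a, b in combinations(sequences, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total / n_pairs


# Toy sample of four aligned sequences (made up for illustration).
sample = ["AATGC", "AATGC", "ACTGC", "ACTGA"]
pi_hat = tajima_pi(sample)
print(f"pi = {pi_hat:.4f}")
```

For the toy sample there are six pairs with a total of seven differences, so the sketch gives π = 7/6. Real data sets are usually handled with established population genetics software, which also deals with missing data and per-site normalization.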

Tajima (1983) was one of the founding papers of modern population genetics and was arguably the first paper that truly demonstrated the tremendous power of the coalescent for deriving statistical properties of a sample of DNA sequences. It also introduced the problem of incomplete lineage sorting in biology. It remains one of the pillars of modern population genetics and should be required reading for any graduate student entering the field.

## Footnotes

*Communicating editor: M. Turelli*

**ORIGINAL CITATION**

Evolutionary Relationship of DNA Sequences in Finite Populations

Fumio Tajima

*GENETICS* October 1, 1983 **105:** 437–460

Image courtesy of Fumio Tajima and the University of Tokyo.

- Copyright © 2016 by the Genetics Society of America