Table 1  Behavior of f- and D-statistics for simulated scenarios of admixture
| Scenario | Fst(C, B) | Fst(O, B) | D(A, B; C, O) | D(A, X; C, O) | f3(B; A, C) | f3(X; A, C) | f4-ratio |
|---|---|---|---|---|---|---|---|
| Baseline | 0.10 | 0.14 | 0.00 | −0.08 | 0.002 | −0.005 | 0.47 |
| Vary sample size | | | | | | | |
| n = 2 from each population | 0.10 | 0.14 | 0.00 | −0.08 | 0.002 | −0.005 | 0.47 |
| Vary SNP ascertainment | | | | | | | |
| Use all sites (full sequencing data) | 0.10 | 0.13 | 0.00 | −0.11 | 0.001 | −0.002 | 0.47 |
| Polymorphic in a single B individual | 0.10 | 0.16 | −0.01 | −0.06 | 0.003 | −0.006 | 0.47 |
| Polymorphic in a single C individual | 0.10 | 0.16 | 0.00 | −0.13 | 0.003 | −0.007 | 0.46 |
| Polymorphic in a single X individual | 0.11 | 0.16 | 0.00 | −0.11 | 0.003 | −0.007 | 0.49 |
| Polymorphic in two individuals: B and O | 0.10 | 0.16 | −0.01 | −0.08 | 0.002 | −0.005 | 0.46 |
| Vary demography | | | | | | | |
| NA = 2,000 (vs. 50,000) pop A bottleneck | 0.10 | 0.14 | 0.00 | −0.08 | 0.002 | −0.005 | 0.48 |
| NB = 2,000 (vs. 12,000) pop B bottleneck | 0.14 | 0.17 | 0.00 | −0.08 | 0.011 | −0.004 | 0.48 |
| NC = 1,000 (vs. 25,000) pop C bottleneck | 0.16 | 0.14 | 0.00 | −0.08 | 0.002 | −0.005 | 0.46 |
| NX = 500 (vs. 10,000) pop X bottleneck | 0.10 | 0.14 | 0.00 | −0.08 | 0.002 | 0.004 | 0.47 |
| NABB′ = 3,000 (vs. 7,000) ABB′ bottleneck | 0.14 | 0.17 | 0.00 | −0.09 | 0.002 | −0.007 | 0.47 |
  • We carried out simulations for populations related according to Figure 4 using ms (Hudson 2002) with the command: ./ms 110 1000000 -t 1 -I 5 22 22 22 22 22 -n 1 8.0 -n 2 2.5 -n 3 5.0 -n 4 1.2 -n 5 1.0 -es 0.001 5 0.47 -en 0.001001 6 1.0 -ej 0.0060 5 4 -ej 0.007 6 2 -en 0.007001 2 0.33 -ej 0.01 4 3 -en 0.01001 3 0.7 -ej 0.03 3 2 -en 0.030001 2 0.25 -ej 0.06 2 1 -en 0.060001 1 1.0. We chose parameters to produce pairwise FST similar to that for A = Adygei, B = French, X = Uygur, C = Han, and O = Yoruba. The baseline simulations correspond to n = 20 samples from each population, SNPs ascertained as heterozygous in a single individual from the outgroup O, and a mixture proportion of α = 0.47. Times are in generations, with the subscript indicating the populations derived from the split: tadmix = 40, tBB′ = 240, tCC′ = 280, tABB′ = 400, tABB′CC′ = 1,200, tO = 2,400. The diploid population sizes are indicated by a subscript corresponding to the population to which they are ancestral in Figure 4: NA = 50,000, NB = 12,000, NB′ = 10,000, NBB′ = 12,000, NC = 25,000, NX = NC′ = 10,000, NCC′ = 3,300, NO = 80,000, NABB′ = 7,000, NABB′CC′ = 2,500, NABB′CC′O = 10,000. All simulations involved 10^6 replicates, except for the run involving 2 samples (a single heterozygous individual) from each population, where we increased this to 10^7 replicates to accommodate the noisier results.
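The footnote gives times in generations and sizes as diploid counts, whereas the ms command uses scaled units. The sketch below checks that the two agree. It assumes a reference size of N0 = 10,000 diploid individuals, which is inferred from the command (for example, -n 5 1.0 alongside a size of 10,000) rather than stated in the footnote. The helper names ms_time and ms_size are hypothetical.

```python
import math

# Assumption: reference population size N0 = 10,000 diploid individuals,
# inferred from the ms command, not stated explicitly in the footnote.
N0 = 10_000

def ms_time(generations, n0=N0):
    """Convert a time in generations to ms units of 4*N0 generations."""
    return generations / (4 * n0)

def ms_size(diploid_size, n0=N0):
    """Convert a diploid population size to an ms size multiplier (N / N0)."""
    return diploid_size / n0

# (value from the footnote, value in the ms command), keyed by ms event flag.
events = {
    "-es 0.001 5 0.47": (40, 0.001),   # admixture
    "-ej 0.0060 5 4":   (240, 0.006),
    "-ej 0.007 6 2":    (280, 0.007),
    "-ej 0.01 4 3":     (400, 0.01),
    "-ej 0.03 3 2":     (1200, 0.03),
    "-ej 0.06 2 1":     (2400, 0.06),
}
sizes = {
    "-n 1 8.0": (80_000, 8.0),
    "-n 2 2.5": (25_000, 2.5),
    "-n 3 5.0": (50_000, 5.0),
    "-n 4 1.2": (12_000, 1.2),
    "-n 5 1.0": (10_000, 1.0),
    "-en 0.007001 2 0.33": (3_300, 0.33),
    "-en 0.01001 3 0.7":   (7_000, 0.7),
    "-en 0.030001 2 0.25": (2_500, 0.25),
}

def all_consistent():
    """Return True if every footnote value matches its ms-scaled counterpart."""
    times_ok = all(math.isclose(ms_time(g), t) for g, t in events.values())
    sizes_ok = all(math.isclose(ms_size(n), s) for n, s in sizes.values())
    return times_ok and sizes_ok

if __name__ == "__main__":
    for flag, (gens, scaled) in events.items():
        print(f"{flag:20s} {gens:5d} generations -> {ms_time(gens):g} (command: {scaled:g})")
    print("consistent:", all_consistent())
```

Under this assumed N0, the admixture time of 40 generations corresponds to 40 / (4 × 10,000) = 0.001, matching the -es argument in the command.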