## Abstract

In this article we apply some graph-theoretic results to the study of coalescence in a structured population with migration. The graph is the pattern of migration among subpopulations, or demes, and we use the theory of random walks on graphs to characterize the ease with which ancestral lineages can traverse the habitat in a series of migration events. We identify conditions under which the coalescent process in populations with restricted migration, such that individuals cannot traverse the habitat freely in a single migration event, nonetheless becomes identical to the coalescent process in the island migration model in the limit as the number of demes tends to infinity. Specifically, we first note that a sequence of symmetric graphs with Diaconis–Stroock constant bounded above has an unstructured Kingman-type coalescent in the limit for a sample of size two from two different demes. We then show that circular and toroidal models with long-range but restricted migration have an upper bound on this constant and so have an unstructured-migration coalescent in the limit. We investigate the rate of convergence to this limit using simulations.

A classical dichotomy exists in population genetics between measurements of gene flow based on the observed movement of individuals and those based on patterns of genetic variation (Slatkin 1985). These direct and indirect measurements often disagree, and the possible explanations for this have been debated extensively. Here we are concerned with cases in which direct measurements of individual movement, or intuition based on the mobility of organisms, seem to predict that a correlation between genetic distance and geographic distance should be observed but indirect measurements of gene flow show no such correlation. We offer a new explanation for this, based on a surprising mathematical result. It contrasts with the prevailing notion, which was articulated recently by Ouborg *et al*. (1999, p. 559) in a review of genetic studies of dispersal in plants: “If no isolation by distance is detected, then either dispersal is not distance dependent (which is, apart from very small spatial scales, unlikely in plants) or no equilibrium exists.” The statement that no equilibrium exists means that changes in population size or structure must have occurred in the recent past.

Observations of the kind we consider do not abound in the literature, but there are some. Bohonak (1999) found significant structure among demes of water mites in the genus Arrenurus in North America, but found no correlation between genetic distance and geographic distance. While genetic diversity in these species is likely to be strongly influenced by postglacial population expansion, one of them (*Arrenurus birgei*) was suggested to be at or near migration-drift equilibrium. Durand *et al*. (2000) found low but significant population structure with no detectable isolation by distance in the widely distributed and long-term demographically stable African savanna grass species *Hyparrhenia diplandra*, whose dispersal ability is apparently low. Vandewoestijne *et al*. (1999) observed a high level of genetic diversity but a low level of structure, and no correlation between geographic distance and genetic distance, in the butterfly *Aglais urticae* for which very long distance dispersal is considered unlikely. A fourth example is Strand *et al*. (1996) who found high diversity and a high degree of population structure among local populations of plants in the genus Aquilegia in the southwestern United States and adjacent parts of Mexico and used the lack of a correlation between genetic distance and geographic distance to argue for a history of isolation from a common ancestral population without subsequent gene flow.

We do not claim that historical events such as population expansion or the splitting of populations are not important factors. On the contrary, for many species these may be the most significant determinants of current patterns of diversity. We point out only that not all patterns of restricted migration are expected to produce a pattern of isolation by distance.

The more typical situation, in which there is some correlation between genetic distance and geographic distance, is easily explained. The movement of individuals or the dispersal of gametes is usually local, and the birth and death of individuals in a population of this sort establish a pattern where relatedness tends to decrease with the distance between organisms. When DNA sequences or other genetic data are sampled from many individuals separated by different distances, a pattern of isolation by distance will be observed (Wright 1943). The models most commonly invoked to explain the correlation between genetic distance and geographic distance are the one-dimensional and two-dimensional stepping-stone models (Kimura 1953; Kimura and Weiss 1964; Weiss and Kimura 1965; Maruyama 1970, 1971; Sawyer 1976). In these models, the demes that make up the population are arrayed either on a line or on a two-dimensional lattice, and migration occurs only between neighboring demes.

Wright's island model and its variants (Wright 1931; Latter 1973; Maruyama 1974) are the only equilibrium migration models that have been invoked to explain a lack of correlation between genetic distance and geographic distance. In the island model, a single migration event is equally likely to move the individual or gamete to any other deme in the population. At equilibrium, a sample taken from a single deme will show less genetic variation than a sample taken from more than one deme, but there will be no isolation by distance. Although the assumption of island-model migration is unrealistic for most species, estimates of gene flow are typically made under this assumption. Recent work has shown that the coalescent process, which is the retrospective process by which contemporary samples of genetic data trace back through ancestral lineages to reach common ancestors, is relatively simple in the island model when the number of demes is very large (Wakeley 1998). It is characterized by a slow process for between-deme samples, which is a Kingman-type coalescent process (Kingman 1982), and a fast process for within-deme samples, during which they either coalesce or end up in different demes (and then enter the slow process).

Charlesworth *et al*. (2003) suggested that a similar two-phase coalescent process, in which the recent ancestry of a sample depends on the sample locations but the memory of these locations fades quickly as lineages trace their genealogy farther into the past, should be observed in the two-dimensional stepping-stone model. Some recent theoretical work supports this. Wilkins (2004) identified this sort of behavior in an analysis of coalescence in a continuous two-dimensional habitat with symmetric Gaussian dispersal, although in that case the ancestral process is not necessarily an unstructured coalescent. The assumptions of the model we describe below are similar in spirit to those of Wilkins (2004) in that both allow individuals to migrate to a potentially very large number of locations. However, they differ in that our model is a discrete-deme model and does not assume a particular form for the distribution of dispersal distances.

Following an earlier article by Cox and Durrett (2002) and making use of some special properties of random walks in two dimensions, Zähle *et al*. (2005) found a two-phase coalescent process in a two-dimensional stepping-stone model with direct migration possible between demes separated by *K* steps or fewer along the lattice. While on the surface their model and results appear similar to ours, there are important differences. We seek an ancestral limit process that approximates the behavior of a population composed of very many demes. In comparison to Zähle *et al*. (2005), who consider *K* to be a fixed, relatively small number compared to the total number of demes in the population, we assume that *K* is very large, on the order of the number of demes. The consequence of this is that the behavior of our model converges to that of the island model while theirs continues to predict a pattern of isolation by distance even in the limit as the number of demes tends to infinity.

## MODEL AND RESULTS

Our model consists of *D* demes, each containing *N* haploid individuals. We make use of the framework of graph theory, so we represent each deme as a node on a graph, where the edges are potential single-step migration paths. Any discrete-deme model with migration can be represented in this way. We restrict ourselves to the case of vertex-transitive graphs, which we simply call symmetric graphs. These graphs are homogeneous in the sense that they look the same from every node. Migration patterns of this sort have been called “isotropic” in the population genetics literature (Strobeck 1987). An example is given in Figure 1. Clearly, the circular and toroidal stepping-stone models are symmetric graphs. Trivially, the island model, which is represented by the complete graph, is also symmetric. Each node of a symmetric graph contacts the same number of edges. The number of edges that a single node contacts is called the degree of the symmetric graph and is denoted *d*. Because each edge connects two nodes, the total number of edges in the graph is equal to *dD*/2.

We assume a continuous-time Moran model of reproduction in which all variation is selectively neutral. Each of the *ND* individuals in the population dies at some rate λ per unit time. When an individual is chosen to die, it is replaced by the offspring of an individual chosen uniformly at random either from the same deme, with probability 1 − *m*, or from one of the *d* demes it is connected to in the graph, with probability *m*. In the first case, the same individual can be chosen to reproduce and to die. Consider the ancestry of a sample of size two in this model. When viewed backward in time, the forward-time process of migration and reproduction becomes a coalescing random walk of ancestral lineages on the graph, with the caveat that the two ancestral lineages can occupy the same node without coalescing. The assumption of Moran-type reproduction simplifies the analysis somewhat because only one lineage can move at a time. Later, we consider the case of Wright–Fisher reproduction and the possibility that a sample of size greater than two is taken from the population.

Let the random variable *X _{ij}*,

*i*,

*j*∈ {1, 2, . . . ,

*D*}, be the time back to the most recent common ancestor for a pair of samples taken from deme

*i*and deme

*j*. We first consider the expected value

*E*[

*X*]. Looking back in time, each lineage encounters its birth with rate λ and is either a migrant or a nonmigrant, with probabilities

_{ij}*m*and 1 −

*m*, respectively. By conditioning on the first event in the continuous-time Markov process that describes the ancestry of a sample, we have(1)in which Ω

_{i}is the set of labels of the

*d*demes accessible by a single migration event from deme

*i*. The terms on the right in Equation 1 are, from left to right: the waiting time for one or the other lineage to be the offspring in a reproduction event; the probability that the event is neither a migration event nor a coalescent event times the expected time given this; and the probability that the event is a migration event to a particular deme

*j*times the expected time given this, summed over all

*j*.

Because the graph is symmetric and every deme has the same size, *E*[*X _{ii}*] =

*E*[

*X*] for all

_{jj}*i*and

*j*. The main result we present below is that, for certain types of graphs, when

*D*is large, the time for a pair of lineages sampled in distinct demes to enter the same deme does not depend on the initial choice of demes. It follows that the distribution of

*X*,

_{ij}*i*≠

*j*, does not depend on

*i*and

*j*in the limit as

*D*tends to infinity. Now, let us define τ

_{0}to be the average time for the pair of lineages to enter the same deme, thus giving them a chance to coalesce. For the expected value of

*X*, we can write(2)This says that the expected coalescence time for a sample of size two from two different demes is equal to the time τ

_{ij}_{0}plus the probability they do not immediately coalesce when they enter the same deme times the expected coalescence time given that they are now in the same deme. Slatkin (1987) and Strobeck (1987) showed that the expected within-deme pairwise coalescence time under symmetric migration is the same as the expected pairwise coalescence time in a single, panmictic population of the same total size. In our model, this is

*E*[

*X*] =

_{ii}*ND*/(2λ). Substituting Equation 2 into Equation 1 and using

*E*[

*X*] =

_{ii}*ND*/(2λ) to then solve for τ

_{0}gives τ

_{0}≈

*D*/(2λ

*m*), where the approximation is valid if

*D*is large. Using Equation 2 again, we obtain(3)which is the same as the result for the island model with a large number of demes and Moran-type reproduction;

*e.g.*, see Slade and Wakeley (2005).

We use the Diaconis–Stroock bound (Diaconis and Stroock 1991) to prove that the limit (*D* → ∞) distribution of *X _{ij}*,

*i*≠

*j*, does not depend on

*i*and

*j*for certain types of graphs. The crucial fact of the analysis is that it is possible for a random walk on a graph to have a finite “mixing time” even when the number of nodes of the graph goes to infinity. The mixing time, denoted τ

_{2}, quantifies the amount of time it takes for a Markov chain to near the stationary distribution. If the mixing time is short compared to the amount of time before a coalescent event, then the Markov chain acts close to as if it had been started with the stationary distribution. In particular, the initial location of the samples is irrelevant.

A bound for the mixing time is provided by the Diaconis–Stroock bound. To apply this theory, one chooses a “distinguished set of paths” Γ connecting every ordered pair of nodes in the state space. For this Γ one finds *L*, which is the length of the longest path, and *B*, the maximum number of paths going through a given edge, to gain the bound for our symmetric graph:This equation is a special case of Corollary 1 of Diaconis and Stroock (1991). In the cases of interest to us, we will be able to find a set of paths that have maximal path length *L* bounded above and the maximal number of paths going through an edge *B* will be bounded above. The degree *d* is always less than or equal to the number of nodes *D*. Thus, τ_{2} is bounded above by a constant; we call such a graph “fast mixing.” In biological terms we might think of this as saying that spatial structure is unimportant in the limit if, first, it is possible to get across the space in a small number of migration events, and, second, there are no narrow channels through which migration must occur. We also note that this bound is the simplest example of the paths technique; further development can be found in Saloff-Coste (1997).

Using this mixing-time bound we can show using a theorem of Aldous (1989) that the distribution of the time for any pair of lineages to enter the same deme converges to an exponential distribution, with mean equal to one when time is rescaled appropriately. The scaling is by the average time τ_{0}, introduced above, and part of the proof is to show that τ_{0} converges to *D*/(2λ*m*) as *D* increases. The details of the proof are given in the appendix. Briefly, we note that the random walk of two lineages on a symmetric graph can be treated by fixing the position of one of them and letting the other move with twice the rate, in this case 2λ*m*. We then label the nodes of the graph so that deme one is the fixed position of the first lineage and define *Y _{i}* to be the time for the moving lineage to reach deme one given that it starts in deme

*i*. We use

*H*to denote the expected value

_{i}*E*[

*Y*]. The overall average is . The interesting result is that, on a fast-mixing graph, the distribution of

_{i}*Y*/τ

_{i}_{0}converges to an exponential distribution with mean equal to one in the limit as

*D*tends to infinity.

Figure 2 illustrates the behavior of the scaled expected times *H _{i}*/τ

_{0}for a series of graphs like the one in Figure 1. To produce Figure 2, we calculated

*H*analytically using the spectral formula for hitting times (Equation 9) and the properties of circulant matrices (Davis 1979). All

_{i}*H*/τ

_{i}_{0}should be equal to one in the limit we consider. Figure 2 shows

*H*/τ

_{i}_{0}for a series of circular graphs with increasing

*D*, where each node is connected to all nodes that are within one-tenth of the graph away from it. This corresponds to a species in which an individual can move to any deme that is within 10% of the total distance around its circular habitat. When the number of demes is small there is a substantial difference between samples that are close together and those that are far apart. However, when the number of demes becomes large this difference decreases. As might be expected, there is also a visible jump at , so that samples within migration range of each other have shorter times than more distant samples, but the magnitude of this jump becomes negligible in the limit as

*D*tends to infinity.

The graphs in Figures 1 and 2 can be thought of as extensions of the familiar circular stepping-stone model (Kimura and Weiss 1964), but where we have added some extra branches representing longer-range migration. We also consider a toroidal model, which is the corresponding extension of the two-dimensional stepping-stone model. Conditions under which the toroidal model will have mixing time bounded above, and will thus fall under our convergence result, are identified in the appendix. In both the circular case and the toroidal case, it is sufficient that a migrant can migrate freely within a neighborhood of nonvanishing size (measured as a fraction of the total number of demes) as *D* tends to infinity. The neighborhood need not be large, so migration is restricted even in the limit model. For example, in Figures 1 and 2 it will take at least 10 migration steps to move once around the habitat.

We can now return to the coalescence time, *X _{ij}*, for two samples starting in demes

*i*and

*j*on a fast-mixing graph. In the limit as

*D*tends to infinity, we have shown that the waiting time for the two lineages to enter the same deme does not depend on

*i*and

*j*and is exponentially distributed with mean equal to one when time is rescaled by the factor

*D*/(2λ

*m*). When they first enter the same deme, there is a chance 1/

*N*that the two lineages coalesce. If they do not coalesce immediately (probability 1 − 1/

*N*), then after some number of reproduction events either they will coalesce, with probability(4)or one of the lineages will migrate out of the deme. The time it takes for one or the other of these events to occur will have mean equal to

*N*/(2λ(

*Nm*+ 1 −

*m*)), and when

*D*is large this will be much less than the average time

*D*/(2λ

*m*) for the pair to meet in the same deme.

Thus, we can appeal to the “separation of timescales” between this within-deme process and the process of migration movement of lineages across the population, treated above and in the appendix. We can show that the distribution of *X _{ij}* in the limit is also exponential with mean equal to one when it is rescaled appropriately. Overall, the probability of coalescence given that the two lineages enter the same deme is equal to(5)Again, if they do not coalesce, then one or the other lineage will migrate to a different deme. The number of times the two lineages will have to repeat this process of entering the same deme and having a chance to coalesce before they finally do coalesce will be geometrically distributed with parameters equal to Equation 5. Möhle (1998) has developed a formal method for treating Markov processes with two timescales, which applies here, but also admits more generality.

Our result that the time for a pair of lineages to enter the same deme is exponentially distributed requires that time is rescaled by τ_{0}, which we recall converges to *D*/(2λ*m*) as *D* tends to infinity. On this timescale and as *D* → ∞, the durations of the periods when the lineages are together in the same deme become negligible. Therefore the distribution of *X _{ij}*2λ

*m*/

*D*is given by the sum of a geometric number of exponential random variables that can be shown,

*e.g.*, as in Wakeley (1999), to be exponential with mean equal to

*Nm*+ 1 −

*m*. If we rescale time again, by this new factor, so that our new unit of time is(6)of the original units, then the limit distribution of the scaled coalescence time

*T*=

_{ij}*X*/

_{ij}*N*

_{e}will be exponential with mean equal to one. Note that our “effective population size”

*N*

_{e}is identical to the expression for

*E*[

*X*] given in Equation 3.

_{ij}We have followed Wakeley (1999) in defining *N*_{e} so that the time-rescaled coalescent process for a sample of lineages from *different* demes is given by Kingman's coalescent. Samples from the same deme will undergo an instantaneous process of migration and coalescence, called the “scattering phase” in Wakeley (1999). A pair of lineages sampled (without replacement) from a single deme will coalesce with probability given by Equation 4. If they do not coalesce, then one or the other will migrate and they will enter the Kingman coalescent, or “collecting phase.” The difference in the distributions of coalescence times for within-deme *vs.* between deme samples can be seen in their respective cumulative distribution functions (CDFs):(7)(8)In the limit model, single-deme samples of size two have a probability mass of (1 − *m*)/(*Nm* + 1 − *m*) at *t* = 0, followed by the usual exponential decay for *t* > 0.

We used simulations to assess the convergence of the rescaled coalescence time *X _{ij}*/

*N*

_{e}to the exponential distribution. The source code of a program that simulates the exact model is available from the authors upon request. Some results are shown in Figure 3, which compares the CDF of

*X*/

_{ij}*N*

_{e}in simulations to the limit results given in Equations 7 and 8, for a series of increasing

*D*. Figure 3 presents results for samples of size two: (a) from the same deme, (b) from adjacent demes, and (c) from maximally distant demes. The demes were arrayed on an

*l*×

*l*torus and we allowed direct migration to any deme within

*l*/16 steps from the current deme in either dimension. When

*D*=

*l*

^{2}is large, this corresponds to a species in which individuals can migrate to any deme in an area equal to one sixty-fourth, or ∼1.56%, of the total, two-dimensional, species range. We set λ = 1,

*N*= 50, and

*m*= 0.02 and considered

*l*= 16, 32, 62, and 128. As the number of demes increases, the CDFs for adjacent and maximally distant samples converge to that of the same exponential distribution, with mean equal to one and given by Equation 8. The CDF for a single-deme sample converges to Equation 7. The theory presented above and in the appendix assures convergence in the limit, but Figure 3 illustrates that the limit result can be approximately true even when

*D*is only moderately large.

## DISCUSSION

We have shown that the distribution of coalescence times for a sample of size two from a subdivided population with restricted migration may not depend on the distance between sampling locations. This result holds in the limit as the number of demes tends to infinity and with some restrictions on the pattern of migration. Samples from the same deme differ from samples from different demes—on average having shorter coalescence times—but in the limit this will be the only evidence of subdivision. Thus, although migration may be fairly restricted, the usual pattern of isolation by distance may not be observed. We suspect that this will be important for only a limited number of species, but we hope that the result adds something to the ongoing debate about the role of gene flow in structuring genetic variation.

Our result is similar to other recent results (Wakeley and Aliacar 2001) and can be understood with reference to the strong-migration limit of Nagylaki (1980), which was taken up in a genealogical setting by Notohara (1993). In particular, when the number of demes is large in our model, the memory of the original sampling locations of lineages is lost quickly in comparison to the rate at which their common ancestor is reached. For this reason, it seems reasonable to speculate that the same kind of result will hold for populations with asymmetric migration patterns (*e.g.*, populations with edges), populations with different migration rates between different pairs of demes, and populations with other kinds of reproduction (*e.g.*, Wright–Fisher reproduction), as well as for samples of size greater than two.

In fact, we have verified the last two of these predictions using simulations, although we do not present the results. However, we note that the rate of convergence to Kingman's result (as *D* increases) for the CDF of time to the first coalescent in the sample decreases as the sample size increases. This is to be expected since the validity of Kingman's coalescent depends roughly on the square of the sample size being much smaller than the effective size of the population (Kingman 1982). Of course, studies of isolation by distance are nearly always based on pairwise comparisons between samples, and our results for samples of size two hold marginally for pairs in samples of any size.

## APPENDIX

Here we present and prove the technical results for this article. For simplicity we consider random walks with rate one rather than walks with rate 2λ*m* as in the main text. The expectations of the time to enter the same deme (the hitting time) scale appropriately; for example, if a given hitting time is *H* for the rate one case, the corresponding hitting time would be *H*/(2λ*m*) in the previously considered case. Also, to conform with the large and well-established literature on graph theory and random processes on graphs, we use *n* to denote the number of nodes of a graph rather than *D*, which is used in the main text.

First we recall some basic facts about random walks on graphs. Let *M* denote the Markov transition matrix for the random walk on our symmetric graph. *M* is symmetric and thus can be diagonalized with an orthogonal matrix *V* with entries *v _{ij}*. Let us order the eigenvalues 1 = λ

_{1}> λ

_{2}≥ ⋯ ≥ λ

_{n}≥ –1. The eigenvalues are bounded above by one in absolute value because large powers of the Markov matrix are bounded.

There is a nice formula for expected hitting times in terms of these eigenvalues and vectors:(9)Using the orthogonality of the matrix *V* gives(10)where τ_{0} is again the average hitting time. For proofs see, for example, Lovasz (1993), section 3. The aforementioned mixing time is defined in terms of the second largest eigenvalue: τ_{2} = (1 − λ_{2})^{−1}. As mentioned above, the mixing time quantifies the amount of time the chain will take to near the stationary distribution. Under our hypotheses the mixing time is bounded above, which is the crucial fact that allows our conclusions.

The Diaconis–Stroock bound is found in the literature as Proposition 1 of (Diaconis and Stroock 1991):

#### Theorem 1.

(11)where κ is a constant that contains information about a chosen set of “distinguished allowable paths” Γ connecting every pair of states in the Markov chain. In the case of a random walk on a symmetric graph, Γ is simply a chosen set of paths on the graph connecting every pair of nodes. Furthermore, as a special case of Corollary 1 of Diaconis and Stroock (1991), κ can be bounded above by *LBd*/*n*, where *L* is the longest path in Γ and *B* is the maximal number of paths that traverse a given edge *e*. As before, *d* is the degree and *n* is the number of nodes in the graph.

This method gives an upper bound for λ_{2} that is independent of *n* so we can apply the following proposition, proven below.

Proposition 1. *Let G*_{1}, *G*_{2}, … , *be a sequence of symmetric graphs where G _{n} has at least n nodes. Assume that*(12)

*for some*0 < Δ < 1.

*Assume that i*(

*n*)

*is a sequence of nodes such that i*(1)

*is a node of G*

_{1},

*i*(2)

*is a node of G*

_{2},

*and so on. Then the distribution of Y*

_{i}_{(n)}/τ

_{0}

*converges to the exponential distribution with mean 1 in the limit of n going to infinity.*

Putting these two together, an upper bound for κ that is independent of *n* will give the exponential convergence result. The following proposition gives such a bound for the torus with uniform migration across a given fraction of the habitat:

Proposition 2. *Define G*_{θ,l} *to be an l* × *l toroidal lattice with an additional edge between any two nodes whose horizontal and vertical distance is less* ≤⌈θ*l*⌉ + 1. *A set of distinguished paths* Γ *on G*_{θ,l} *exists such that* κ *is bounded above by a number that depends only on* θ.

Note that the hypotheses of the proposition are certainly satisfied for large *l* × *l* tori that have connections between any two nodes that are within ρ*l* steps of each other along the lattice, where ρ is some fixed fraction and is greater than zero. We also note that an upper bound can be proven for the one-dimensional cyclic population model with edges connecting any two nodes within a fixed fraction of the circle; this version is easier to prove and we omit its proof.

Now we prove the propositions, starting with Proposition 1. By the ordering of the eigenvalues the hypothesis implies that for any *k* > 1Then we recall that for vertex-transitive chains (see Proposition 5 of Aldous 1989)for *i* ≠ 1. ThusWe apply this fact as follows,where the last step is via the definition of an orthogonal matrix. The entry *v*_{11} is the first coordinate of the first eigenvector, which is scaled to have norm one. As the first eigenvector corresponds to the stationary distribution, which is in this case the uniform distribution, *v*_{11} = *n*^{−1/2}. Therefore this upper bound goes to zero.

The following fact is a case of Proposition 8 of Aldous (1989):

Theorem 2. *Consider a sequence of vertex-transitive graphs such that*

*The number of nodes goes to infinity.*.

*for any sequence of i's*.

*Then the distribution of Y _{i}*/τ

_{0}converges to the exponential distribution with mean one for any i.Under the hypotheses of our Proposition 1, λ_{2} is bounded away from 1 and therefore τ_{2} is bounded above. Clearly τ_{0} goes to infinity and by the above calculation *H _{i}*/τ

_{0}converges to one; therefore we can apply Aldous' theorem to prove Proposition 1.

Now we prove Proposition 2. Recall that we need to find a set of distinguished paths Γ such that *L* and *B* are bounded by fixed constants, where again *L* is the maximum length of any path in Γ, and *B* is the maximal number of paths in Γ going through a given edge.

We set *S* = ⌈θ^{−1}⌉. It is easy to see that we can travel from any node to any other node along existing edges in at most *S* steps, asTherefore we can choose Γ to contain paths of maximal length *S*, which will make *L* bounded.

In fact, we choose paths in Γ to be paths of exactly length *S* as often as possible. To specify the class of paths, we pick a node *p* and then choose a set of paths Γ_{p} from *p* to all of the points of the torus. We then translate this set of paths around the torus to get a complete set of paths Γ.

To describe Γ_{p}, set up coordinates on the torus such that *p* is at the point (⌈*l*/2⌉, ⌈*l*/2⌉). Now, for any point *q* on the torus, integers *a* and *b* exist such thatWe note that by hypothesis an edge will exist between any two nodes that are |*a*| + 1 or less apart in the horizontal direction and |*b*| + 1 apart or less in the vertical direction. For example, when *l* is sufficiently large,Therefore we choose a path from *p* to *q* using edges going only *a* or *a* + 1 in the horizontal direction and only *b* or *b* + 1 in the vertical direction. This path will be at most *S* steps long. We repeat this process for all *q* to construct Γ_{p} and then translate to construct the whole of Γ.

Now we need to show that *B* is bounded above by a constant independent of *l*. Pick an edge *e* traversed by a path γ and assume that it goes *a* steps in the horizontal direction and *b* steps in the vertical direction. By the above construction the start *p* and the terminus of the path *q* will satisfy the inequalitiesNote that in a given Γ_{p} there can be at most 4*S*^{2} possible terminal points in this region. Therefore there are at most 4*S*^{2} paths in Γ_{p} that contain the edge *e*.

Now, how many translates of paths in Γ_{p} contain *e*? Let us denote by *E*(Γ_{p′}) the union of edges traversed by paths in Γ_{p′}. Note that if *e* ∈ *E*(Γ_{p′}), then we have made a choice of *e*′ ∈ *E*(Γ_{p}) to map onto *e*. This choice determines the base of translation *p*′ uniquely. Because each path is at most *S* edges long, and there are at most 4*S*^{2} paths, there are at most 4*S*^{3} choices of such an *e*′ and therefore at most 4*S*^{3} possible translations. Each choice of translation has at most 4*S*^{2} paths traversing *e*; therefore, *B* is at most 16*S*^{5}. This proves Proposition 2.

## Acknowledgments

We thank David Aldous, Rick Durrett, and an anonymous reviewer for discussion and comments. F.A.M. was supported by a Graduate Research Fellowship from the National Science Foundation. J.W. was supported by a Presidential Early Career Award for Scientists and Engineers (DEB-0133760) from the National Science Foundation and by a Fellowship from the Radcliffe Institute for Advanced Study.

## Footnotes

Communicating editor: M. S. McPeek

- Received June 30, 2005.
- Accepted October 11, 2005.

- Copyright © 2006 by the Genetics Society of America