help button home button Genetics eBMJ
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS

Genetics, Vol. 167, 2097-2109, August 2004, Copyright © 2004
doi:10.1534/genetics.103.021535

This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Takebayashi, N.
Right arrow Articles by Uyenoyama, M. K.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Takebayashi, N.
Right arrow Articles by Uyenoyama, M. K.

Maximum-Likelihood Estimation of Rates of Recombination Within Mating-Type Regions

Naoki Takebayashi*,1, Ed Newbigin{dagger} and Marcy K. Uyenoyama*,2

* Department of Biology, Duke University, Durham, North Carolina 27708-0338
{dagger} School of Botany, University of Melbourne, Victoria 3010, Australia

2 Corresponding author: Department of Biology, Box 90338, Duke University, Durham, NC 27708-0338.
E-mail: marcy{at}duke.edu

Manuscript received August 21, 2003. Accepted for publication April 30, 2004.


    ABSTRACT
 TOP
 ABSTRACT
 BALANCING SELECTION
 TWO-LOCUS LIKELIHOODS
 APPLICATION
 DISCUSSION
 ACKNOWLEDGEMENTS
 LITERATURE CITED
 
Features common to many mating-type regions include recombination suppression over large genomic tracts and cosegregation of genes of various functions, not necessarily related to reproduction. Model systems for homomorphic self-incompatibility (SI) in flowering plants share these characteristics. We introduce a method for the exact computation of the joint probability of numbers of neutral mutations segregating at the determinant of mating type and at a linked marker locus. The underlying Markov model incorporates strong balancing selection into a two-locus coalescent. We apply the method to obtain a maximum-likelihood estimate of the rate of recombination between a marker locus, 48A, and S-RNase, the determinant of SI specificity in pistils of Nicotiana alata. Even though the sampled haplotypes show complete allelic linkage disequilibrium and recombinants have never been detected, a highly significant deficiency of synonymous substitutions at 48A compared to S-RNase suggests a history of recombination. Our maximum-likelihood estimate indicates a rate of recombination of perhaps 3 orders of magnitude greater than the rate of synonymous mutation. This approach may facilitate the construction of genetic maps of regions tightly linked to targets of strong balancing selection.


IN many organisms, reproduction occurs only between individuals or gametes that express different sexes or mating types. Many such systems show high structural and sequence diversity among distinct mating types and greatly reduced recombination across extensive genomic tracts surrounding the determinants of mating type (FERRIS et al. 1997; HISCOCK and KüES 1999). Suppression of crossing-over between the human X and Y chromosomes appears to have occurred in a stepwise fashion, through a series of chromosomal rearrangements, with the nonrecombining euchromatic region of the Y now comprising some 23 Mb (LAHN and PAGE 1999; IWASE et al. 2003; SKALETSKY et al. 2003). The majority of functional protein-coding genes in the male-specific (non-pseudoautosomal) region show testis-specific expression, occur in palindromes, and have their most closely related homologs dispersed throughout the genome (ROZEN et al. 2003; SKALETSKY et al. 2003). Most of the remaining protein-coding genes show ubiquitous expression, perform various metabolic functions, and have X-linked homologs (SKALETSKY et al. 2003).

Two mating types exist in the unicellular green alga Chlamydomonas reinhardtii and in some fungi, both ascomycete (Neurospora species) and basiodiomycete (Cryptococcus neoformans and Ustilago hordei). In C. reinhardtii, recombination suppression over a 1-Mb tract surrounding the mating-type locus appears to derive from multiple inversions and translocations within the central 200-kb domain (R) in which mating-type-specific genes reside (FERRIS et al. 2002). Not all genes embedded in this region contribute directly to reproduction: many perform general ("housekeeping") functions or show maximal expression in vegetative stages of the life cycle (FERRIS et al. 2002). In the smut fungus U. hordei, extensive rearrangements across a 500-kb region containing the determinants of mating type (transcription factors, pheromones, and pheromone receptors) appear to inhibit recombination between mating types (LEE et al. 1999). In the self-fertile Neurospora tetrasperma, the chromosomal segment in which the highly rearranged alternative mating-type alleles lie fails to pair during meiosis (GALLEGOS et al. 2000). In the human pathogen C. neoformans, alternative mating-type alleles also show extensive rearrangements, with the region of reduced recombination spanning >100 kb (LENGELER et al. 2002). As in Chlamydomonas, this region contains many genes that perform a variety of primary functions, not restricted to reproduction (LENGELER et al. 2002).

Homomorphic self-incompatibility (SI) in flowering plants prevents fertilization by pollen that express mating specificities in common with the pistil. In the best-known model systems, a single, large nonrecombining region (S-locus) encodes both pistil and pollen mating specificity. Genetic analysis of S-locus regions may afford insight into many issues of general relevance to sexual reproduction, including the genetic and evolutionary mechanisms that permit genes to maintain function in the face of recombination suppression and enforced heterozygosity.

Brassica oleracea and its relatives represent the model system for sporophytic self-incompatibility (SSI), in which the specificity expressed by an individual pollen grain is determined not by its own S-locus genotype but by the genotype of the plant that produced it. The S-locus region comprises several open reading frames bearing sequence similarity to genes of known function, some with no apparent relationship to SI or reproduction (SUZUKI et al. 1999; CASSELMAN et al. 2000). S-alleles segregating within Brassica populations differ greatly with respect to the physical distance between the determinants of pollen and pistil specificity and to the orientation of those genes relative to one another and to other genes embedded in the region (NASRALLAH 2000).

In gametophytic SI (GSI) systems, the S-locus genotype of each pollen grain or tube determines its own specificity. In many species of the Solanaceae, a functional ribonuclease (S-RNase), expressed in pistil, exhibits highly specific inhibition of pollen tubes that express incompatible specificities (LEE et al. 1994; MURFETT et al. 1994). Homologs of the S-RNase gene appear to mediate GSI in the distantly related plant families Rosaceae and Scrophulariaceae as well (IGIC and KOHN 2001; STEINBACHS and HOLSINGER 2002). While the Brassica S-locus spans some hundreds of kilobases (BOYES et al. 1997), the domain of reduced recombination surrounding S-RNase comprises at least 1 Mb (MCCUBBIN and KAO 1999) and likely >4 Mb (WANG et al. 2003). To date, dozens of open reading frames have been identified for which linkage to S-RNase appears to be virtually complete (LI et al. 2000; MCCUBBIN et al. 2000; LAI et al. 2002; ENTANI et al. 2003; USHIJIMA et al. 2003). If the S-locus region of S-RNase-based GSI shares with Brassica the feature of high gene density (one per 5.4 kb; SUZUKI et al. 1999), a considerable number of expressed genes may cosegregate with S-RNase.

The advent of genomic sequence information has raised the prospect of using the pattern of variability across sites of known location to identify candidate targets of selection (e.g., NORDBORG 1997; KIM and STEPHAN 2002). While neutral substitution at a given site proceeds at the rate of neutral mutation, independent of linkage to targets of selection (BIRKY and WALSH 1988), selection can truncate or expand the time since the most recent common ancestor (MRCA) at the target and regions tightly linked to it (HUDSON and KAPLAN 1988; KAPLAN et al. 1989; TAKAHATA 1990; CHARLESWORTH et al. 1993). Implementation of this method of recognizing candidate targets requires an understanding of the joint effects of multiple kinds of selection operating at nearby sites (KIM and STEPHAN 2000; KELLY and WADE 2002) and demographic history, including population-wide bottlenecks (BARTON 1998; GALTIER et al. 2000).

Our objective is the obverse: we seek to use the relative magnitudes of neutral diversity at markers tightly linked to the mating-type locus to infer their relative order. Other approaches for the detection of recombination recognize incongruence among gene genealogies as prima facie evidence of recombination (see CRANDALL and TEMPLETON 1999; WIUF et al. 2001).

TAKEBAYASHI et al. (2003) analyzed the pattern of variation at 48A, a pollen-expressed, protein-encoding gene tightly linked to S-RNase in Nicotiana alata. Screens of hundreds of plants have yielded no recombinants (LI et al. 2000) and 48A exhibits complete allelic linkage disequilibrium with S-RNase (BREWER 1998). However, the great deficiency in the number of segregating neutral mutations at 48A relative to S-RNase departs highly significantly from expectation under the hypothesis of complete linkage (TAKEBAYASHI et al. 2003).

Here, we present a method for the maximum-likelihood (ML) estimation of the rate of recombination between a marker locus and a target of strong balancing selection. It explicitly incorporates symmetric, multiallelic balancing selection into a two-locus coalescent. We base our ML estimate of the recombination rate on computed (rather than simulated) probabilities of arbitrary numbers of neutral mutations observed segregating in a sample of haplotypes comprising two loci: one showing absolute linkage and the other partial linkage to the target of selection. Our method assumes the absence of intragenic recombination and that neutral mutations arise at each locus at identical per-site rates or at known relative per-locus rates. Application to the S-RNase-based system of GSI indicates a rate of recombination between S-RNase and 48A of perhaps 3 orders of magnitude greater than the rate of synonymous mutation. Estimation of very low recombination rates, practicably undetectable by direct observation, may facilitate construction of genetic maps within mating-type regions. It also applies to other systems of symmetric balancing selection, perhaps including the vertebrate major histocompatibility complex region.


    BALANCING SELECTION
 TOP
 ABSTRACT
 BALANCING SELECTION
 TWO-LOCUS LIKELIHOODS
 APPLICATION
 DISCUSSION
 ACKNOWLEDGEMENTS
 LITERATURE CITED
 
Ancestral selection graphs provide a means of incorporating selection into genealogical reconstruction (KRONE and NEUHAUSER 1997; NEUHAUSER and KRONE 1997). NEUHAUSER (1999) represented GSI as the strong-selection (lethal homozygosis) limit of symmetric overdominance, but concluded that the ancestral selection graph approach cannot be applied to this system because the diffusive limit on which it relies does not exist under selection pressures of this intensity. To accommodate GSI, we adopt a model that includes both slow diffusion and instantaneous jumps.

Overdominance in viability:
Under symmetric overdominance (SOD), individuals that bear any specificity in homozygous form survive to reproduction at rate (1 – s) relative to heterozygotes. Coefficients describing the drift (mean change) in frequency of a given specificity (x) and the diffusion (variation around the mean change) generated by stochastic forces correspond to

(1a)

(1b)

(see, for example, TAKAHATA 1990), for u, the rate of mutation to alleles not presently segregating in common frequencies, and F, the homozygosity (inverse of the effective number of alleles; KIMURA and CROW 1964). Under strong SOD, common alleles segregate in frequencies near the deterministic expectation of 1/n, for n the number of common alleles maintained at approximate steady state; in this case, F corresponds closely to 1/n. For s and u of order 1/2N, the coefficients in (1) describe a valid diffusion process. Under lethal homozygosis (s -> 1), the change due to selection (1a) becomes infinite in the diffusion limit (N -> {infty}).

Gametophytic self-incompatibility:
GSI differs from the strong-selection limit (s -> 1) of SOD. WRIGHT's (1939) diffusion approximation for GSI has endured various criticisms (e.g., FISHER 1930; MORAN 1962; EWENS 1964; MAYO 1966), but nevertheless appears to provide a remarkably accurate description of GSI evolution (YOKOYAMA and NEI 1979; VEKEMANS and SLATKIN 1994). After a succession of modifications, the canonical drift and diffusion coefficients for GSI have become

(2a)

(2b)
for u (~O(1/2N)), the rate of mutation to new S-alleles (WRIGHT 1969, pp. 405–406). The difference between the diffusion coefficients for SOD (1b) and GSI (2b) reflects that GSI interactions between diploid zygotes (pistils) and haploid gametes (pollen) cannot be reduced to interactions among gametes alone, as is the case for SOD. Even so, GSI shares with strong SOD (s > O(1/2N)) the property that in the diffusion limit (N -> {infty}), the mean change of the focal allele (2a) remains finite in only two frequency domains: very close to zero (x ~ O(1/2N)) and very close to the deterministic equilibrium ((Fx) ~ O(1/2N)).

Our model of GSI evolution adopts (2) within these two domains, together with jumps of the process between them (see SASAKI 1989, 1992; GILLESPIE 1983, 1994). A jump corresponds to the rise of an allele to common frequencies that is virtually instantaneous relative to the diffusion process. We restrict consideration to haplotypes under linkage sufficiently tight to justify ignoring recombination during jumps (contrast KAPLAN et al. 1989 with WIEHE and STEPHAN 1993). Under these assumptions, jumps can induce multifurcating nodes in the gene genealogy, reflecting the simultaneous coalescence of multiple haplotypes expressing the same specificity into the single ancestral haplotype in which the specificity-determining mutation first arose.

Invasion of new specificities:
Upon its introduction into the population in very low frequency, a new specificity wanders in a "boundary layer" (GILLESPIE 1983), primarily under the influence of drift, until it either declines to extinction or rises to sufficiently high frequency to permit selection to dominate the evolutionary process. In the latter case, strong selection then mediates a jump to common frequency (1/n). Let {lambda} represent the rate at which new specificities rise to common frequencies. Under SOD (1), {lambda} corresponds to

(3)

(SASAKI 1989, 1992), and under GSI (2),

(4)

(UYENOYAMA 2003).

Approximate stationarity of the number of common alleles (n) entails a balance between invasion of new alleles and extinction of common alleles. This invasion/extinction process sets the timescale of coalescence among allelic classes (TAKAHATA 1990). Because balancing selection promotes the increase of any rare allele, allelic lineages persist in the population for longer intervals than expected under neutrality. TAKAHATA (1990) compared the process of coalescence of symmetrically overdominant alleles in a population of size 2N to that among neutral alleles in a population of size 2Nf, for f, a scaling factor that represents the expansion of the timescale of the allelic genealogy caused by balancing selection. Coalescence among j neutral lineages occurs at rate

(5)
and among j overdominant lineages at rate

(6)

The latter expression reflects the invasion of a new specificity ({lambda}), without the extinction of its immediate parent specificity (1 – 1/n), and also the inclusion in the sample of descendants of both the offspring and parent lineages (see HEY 1992; TAKAHATA et al. 1992). Equating (5) and (6) determines the scaling factor

(7)
for {lambda} given by (3) for SOD and by (4) for GSI.


    TWO-LOCUS LIKELIHOODS
 TOP
 ABSTRACT
 BALANCING SELECTION
 TWO-LOCUS LIKELIHOODS
 APPLICATION
 DISCUSSION
 ACKNOWLEDGEMENTS
 LITERATURE CITED
 
We describe a method for determining the probability of any pattern of neutral segregating mutations observed in a sample of haplotypes. Each haplotype comprises two loci: A shows absolute linkage and B partial linkage to the specificity-determining region. We assume no intragenic recombination and an r-cM separation between A and B. Neutral mutations arise within A and B according to the infinite-sites model at the per-locus, per-generation rates vA and vB, respectively. The population comprises N diploid individuals, among which n distinct common mating specificities segregate in equal frequencies.

We first summarize the derivation of probability-generating functions of numbers of segregating sites and the recursive computation of likelihoods. We then describe the Markov transition matrices specific to the evolutionary process under consideration.

Probability-generating functions:
Transitions in state:
As an alternative to the analysis of full ancestral recombination graphs (GRIFFITHS 1991; GRIFFITHS and MARJORAM 1996), we study correlations between the marginal genealogies using joint probability distributions. We represent the ancestry of the sample as two distinct but concurrent genealogies: level (lA, lB) corresponds to the segment of the concurrent genealogies in which lA locus A lineages and lB locus B lineages remain (Figure 1).



View larger version (13K):
[in this window]
[in a new window]
[Download PPT slide]
 
FIGURE 1.— Joint levels of concurrent marginal genealogies.

 
A type 0 haplotype bears an ancestral lineage at neither locus, type 1 only at A, type 2 only at B, and type 3 at both A and B (compare KAPLAN and HUDSON 1985; GRIFFITHS 1991). At any point in the genealogy, ni,j,k of the n common specificity classes in the population contain i type 1, j type 2, and k type 3 haplotypes (0 ≤ i, j, k ≤ L), with

(8)

The state of the process corresponds to the configuration of lineages within alleles. We represent the state within level (lA, lB) by SlA,lB = (n0,0,0, n0,0,1, ... , ni,j,k, ... ), for all i, j, and k. Transition matrix PlA,lB describes rates of transition of the process between states within level (lA, lB) and QlA,lB describes rates of transition from level (lA, lB) to lower levels (fewer lineages); UlA,lB and VlA,lB give conditional transition rates, given that a transition has occurred,

(9a)

(9b)
for ClA,lB a diagonal matrix, with the mth diagonal element (ClA,lB (m)) corresponding to the total rate of transition from state m to any other state.

Recursive derivation:
For state m on level (lA, lB), the probability that a transition in state has occurred more recently than a mutational event in any of the lineages is

The joint probability-generating function (pgf) of the number of new mutations accumulated in any of the lineages since the most recent transition in state corresponds to

(10)

This expression reflects that the distribution of the total number of mutational events since the most recent transition is geometric with parameter plA,lB, and the number of mutations in A (or B) lineages given the total number is binomial.

Within a sample of any given configuration SlA,lB, the segregating mutations arose either before or after the most recent transition. Because the numbers of mutations in these disjunct periods are independent, the pgf for the total number of segregating sites corresponds to the product of the pgfs for the two periods:

(11a)

(11b)
in which glA,lB represents the vector of pgfs for sample configurations on level (lA, lB), hlA,lB is the vector of pgfs on lower levels, and FlA,lB is a diagonal matrix with the mth diagonal element given by flA,lB (10). Equivalently,

(12a)

(12b)
for DlA,lB, a diagonal matrix of the denominators of FlA,lB. For states (1, 2) or (2, 1) (two lineages remaining in one marginal genealogy), the only lower level (1, 1) corresponds to the joint MRCA of the sample. Because any mutations in this state do not contribute to the number of segregating sites,

(13)

Beginning at this lowest level, the pgfs for samples of arbitrary sizes and configurations may be obtained recursively from (11) or (12).

Determination of the joint probability distribution of segregating sites requires only the derivatives of the pgfs rather than the pgfs themselves. We obtain the joint probability of observing kA segregating sites at locus A and kB at B by taking derivatives of the vector of pgfs,

(14)
in which the dummy variables of the pgf and the subscripts indicating level are suppressed for simplicity, and the ckA,jA,kB,jB are determined from either of

(15a)

(15b)
with initial conditions c1,0,0,0 = c1,1,0,0 = c0,0,1,0 = c0,0,1,1 = 1 and for any jA > kA or jB > kB. Beginning with initial levels (2, 1) and (1, 2) (13), we obtain derivatives of arbitrary orders for all sample sizes and configurations by successive application of (14), a double recursion in sample size and mutation numbers.

Mating-type regions:
Application of this method to the estimation of the rate of recombination between the determinant of mating type and a neutral marker locus entails only specification of one-step Markov matrices (PlA,lB and QlA,lB). Figure 2 illustrates the three processes that mediate changes in state: coalescence of haplotypes within specificity class through genetic drift, transfer of marker lineages among specificity classes by recombination, and turnover (invasion and extinction) of specificity classes. Drift and turnover contribute to between-level transitions (QlA,lB) and recombination contributes to within-level transitions (PlA,lB).



View larger version (12K):
[in this window]
[in a new window]
[Download PPT slide]
 
FIGURE 2.— Mechanisms of transitions in state. Genetic drift may effect the coalescence of a pair of lineages; recombination may effect the transfer of one locus B lineage between specificity classes; and turnover of specificity classes may effect the simultaneous coalescence of multiple lineages into the single ancestor in which the new specificity arose.

 
Genetic drift:
We assume that at approximate steady state, each of the n common classes comprises 2N/n haplotypes, any pair of which may coalesce through genetic drift. Within any of the ni,j,k specificity classes of type (i, j, k) (containing i type 1, j type 2, and k type 3 haplotypes), coalescence between two type 1 haplotypes occurs at rate i(i – 1)n/4N and moves the process from level (lA, lB) to (lA – 1, lB). All other state transitions due to drift and their rates are similarly determined.

Recombination:
Because locus A shows absolute linkage to the specificity-determining region, only B locus regions can transfer between specificity classes by recombination. In any generation, a fraction r/2 of haplotypes are newly formed recombinants. We treat as negligible the probability of transmission of more than one haplotype derived from the single meiosis that generated a particular recombinant; consequently, its sister recombinant could not have borne an ancestral lineage (type 0). If the focal recombinant bears no ancestral B lineage (type 0 or 1), recombination would have induced no transition in state. At rate rjni,j,k/2, one of the type 2 haplotypes in a specificity class presently of type (i, j, k) represents a newly formed recombinant; rkni,j,k/2 is the rate for a type 3 recombinant. If the new recombinant is type 2, the state of the recipient class immediately before the recombination event was (i, j – 1, k); if type 3, it was (i + 1, j, k – 1). The B lineage of the recombinant was derived from a specificity class presently of type (x, y, z) with probability nx,y,z/(n – 1) for (x, y, z) != (i, j, k), and probability (ni,j,k – 1)/(n – 1) for (x, y, z) = (i, j, k). Our assumption that the sister to the focal recombinant was not transmitted entails that the former state of the donor class was (x, y + 1, z), irrespective of whether the newly formed recombinant is type 2 or 3.

Turnover:
At rate {lambda}, one of the n common specificities segregating in the population represents a class that expanded in the immediately preceding generation. We assume that once initiated, the rise to common frequencies of an invading class occurs virtually instantaneously relative to mutation, recombination, and drift. Consequently, all lineages within the new class descend immediately and without mutation from the single ancestor in which the new specificity arose. Only the ns specificities that contain a single type of lineage can have undergone expansion in the immediately preceding generation:

(16)

At approximate steady state for the number of common alleles (n), the extinction of an existing common allele accompanies the invasion of a new allele. At rate {lambda}/n, a new specificity class (O) invades and replaces its immediate parent (P). For example, O presently has type (i, 0, 0) with probability ni,0,0/ns, having expanded from a type (1, 0, 0) ancestor. This event implies a transition in state from level (lA, lB) to (lAi + 1, lB).

At rate {lambda}(1 – 1/n), O invades and coexists with P. The probability that O currently has type (i, j, k), in which only one of i, j, and k is positive, and P currently has type (x, y, z) corresponds to ni,j,knx,y,z/ns(n – 1) for (x, y, z) != (i, j, k), and to ni,j,k(ni,j,k – 1)/ns(n 1) for (x, y, z) = (i, j, k). For example, if O presently has type (0, 0, k), then immediately before the turnover event, P had type (x, y, z + 1) and the process was at level (lA k + 1, lBk + 1). Similar considerations provide the rates of all possible state transitions due to turnover.

Relative rates of transition:
For state m on level (lA, lB), the joint pgf of the numbers of mutations accumulated at locus A and locus B before the first transition in state corresponds to

(17)
for CD, CR, and CT rates of change due to genetic drift within specificity class, recombination, and turnover of specificity classes:

(18a)

(18b, 18c)

We consider a rapid-drift approximation (compare TAKAHATA 1991 with NOTOHARA 2000), which entails that coalescence within specificity class occurs virtually instantaneously relative to recombination or coalescence between classes (n/2N >> r/2, {lambda}). Lineages ancestral to a present-day sample in any configuration quickly come to reside in separate classes. Upon the formation, by recombination or turnover, of a class that contains more than one lineage-bearing haplotype, genetic drift dominates the evolutionary process (CD >> CR, CT), ensuring immediate transition to a lower level. Between such formations, CD = 0 and the rates of transition through turnover or recombination correspond to

(19a)

(19b)

[from (18)].


    APPLICATION
 TOP
 ABSTRACT
 BALANCING SELECTION
 TWO-LOCUS LIKELIHOODS
 APPLICATION
 DISCUSSION
 ACKNOWLEDGEMENTS
 LITERATURE CITED
 
We apply our method to obtain a maximum-likelihood estimate (MLE) of the rate of recombination between marker locus 48A and S-RNase, the determinant of pistil specificity in S-RNase-based systems of GSI. We assume that substitution of neutral mutations occurs at the per-site rate v in both loci, with the per-locus rates proportional to sequence length (vA = vXA, vb = vXB, for XA and XB numbers of sites). The parameters of our model include the number of common S-alleles segregating in the population (n), scaled per-site rate of neutral substitution ({theta} = 4Nv), rate of invasion of new common S-alleles [{lambda} (4), or its inverse function f (7)], and scaled rate of recombination (R = 4Nr).

HUDSON and KAPLAN's (1988) analysis of symmetric biallelic overdominance without allelic turnover suggested that the rate of recombination between the target of balancing selection and a linked neutral site might be inferred from the level of variation at the neutral site alone. In the presence of allelic turnover, however, this information is insufficient because balancing selection can decrease as well as increase the age of the MRCA at linked sites (UYENOYAMA and TAKEBAYASHI 2004). Simultaneous consideration of variation at sites absolutely linked to the target of selection improves inference about the rate of recombination between the target and partially linked sites.

In this section, we summarize our previous estimates of the number of mutation events at S-RNase, assumed absolutely linked to the S-locus, and at the partially linked marker 48A, and our conclusion that the pattern of variation at the marker locus alone provides an insufficient basis for the estimation of the rate of recombination between the loci. We then describe analyses of variation at each locus separately, comparing estimates obtained from the method-of-moments and likelihood approaches. Finally, we present results of our ML analysis of the joint pattern of variation at S-RNase and 48A.

Numbers of segregating sites:
TAKEBAYASHI et al. (2003) used the approach of YANG et al. (2000) to assign each codon within S-RNase and 48A to either a positive selection partition or a conservative evolution partition. Because our primary goal was to infer phylogenetic history from neutral variation at the subset of sites from which all targets of positive selection had been excluded, we adopted a very liberal criterion for positive selection. Table 1 presents our estimates, corresponding to the total lengths of ML trees, of the numbers of neutral mutations segregating in the conservative evolution partition in a sample of five two-locus haplotypes, each representing a distinct S-allele specificity. The estimated number of synonymous mutations at S-RNase exceeds the number of synonymous sites, indicating that many sites have accumulated multiple hits within the span of the very deep S-allele genealogy. Even under this underestimation of the number of mutations in S-RNase, the >15-fold deficiency in 48A departs highly significantly from expectation under the hypothesis of absolute linkage (TAKEBAYASHI et al. 2003).


View this table:
[in this window]
[in a new window]

 
TABLE 1 Segregating neutral mutations

 
Underestimation of the total number of mutations accumulated at S-RNase over the span of the S-allele genealogy may have caused underestimation of both genealogical depth (f{theta}) and recombination rate (fR). The net effect of these biases on our estimate of the rate of recombination relative to the rate of synonymous substitution (R/{theta}) remains unexplored.

Insufficiency of information at marker locus alone:
A key determinant of the rate of coalescence among marker lineages is the rate of formation of S-allele classes that contain multiple lineages. Because allelic turnover as well as recombination can generate such classes, the number of segregating sites at the marker locus alone provides an insufficient basis for the estimation of the recombination rate.

Figure 3, from UYENOYAMA and TAKEBAYASHI (2004), shows the likelihood surface generated by our method from the estimated number of segregating sites at 48A under the assignment n = 20 and f = 10. While information on the number (n) of S-alleles segregating in ancestral natural populations of N. alata is unavailable, the extensive body of empirical and theoretical work on gametophytic SI systems suggests the segregation of dozens of S-alleles (LAWRENCE 2000). To an order of magnitude, the assigned value of the scaling factor (f = 10) is comparable to rough estimates for major histocompatibility loci (TAKAHATA et al. 1992) and fungal mating-type genes (MAY et al. 1999).



View larger version (87K):
[in this window]
[in a new window]
[Download PPT slide]
 
FIGURE 3.— Log-likelihoods of parameters R/{theta} and f{theta} determined from estimated number of segregating sites at 48A under the assignment n = 20 and f = 10. The thick line indicates the ridge of highest likelihood, with contour lines spaced at intervals of 0.5 log-likelihood units.

 
The existence of a ridge of nearly equal likelihoods (Figure 3) indicates that the low level of neutral variation at 48A relative to S-RNase is consistent with a continuum of values of R/{theta} and f{theta}. For example, low variation might reflect tight linkage to the S-locus (small R/{theta}) and frequent rapid expansions of newly arisen alleles (high turnover rate {lambda}, or small f{theta}). Alternatively, loose linkage (large R/{theta}) would moderate the expected increase in age of the MRCA at sites linked to a target of balancing selection that is subject to infrequent turnover (large f{theta}). Consideration of additional marker loci would introduce a new recombination rate for each locus; even so, analysis of the joint numbers of segregating sites would improve the estimation if linkage relationships among the markers were known.

Analysis of variation at each locus separately:
Under the rapid-drift approximation (19), the pgf of the number of segregating sites in samples of size L reduces, under a rescaling of time, to WATTERSON's (1975) classic result for populations without specificity classes,

(20)
in which

(21)

(UYENOYAMA and TAKEBAYASHI 2004). The rescaling corresponds to the expected waiting time for formation of a specificity class that contains more than one lineage (1/[r/2(n 1) + {lambda}/n2]). That this pgf has only one parameter suggests that the number of segregating sites at the marker locus provides information about only one composite parameter, precluding separation of R/{theta} and f{theta}.

For comparison to our ML analysis, we outline a method-of-moments approach using the one-locus, rapid-drift pgf (20). This pgf provides familiar expressions for the first two moments of the number of segregating sites (S),

(22a)

(22b)
in which and , which suggest an unbiased moment estimator of {theta}R of S/aL, with variance S/aL + S2bL/a4L (WATTERSON 1975).

Table 2 presents estimates of {theta}R (21) based on the number of segregating sites at each locus separately. Shown are both method-of-moments estimates, with approximate standard deviations, and MLEs, with 2-unit support ranges (two log-likelihood units; EDWARDS 1972). In the absence of recombination between 48A and S-RNase (R = 0), composite parameters f{theta} and {theta}R (21) would be identical. In accord with the significant rejection of absolute linkage by an exact test (TAKEBAYASHI et al. 2003), the 2-unit support ranges for the MLEs of these parameters do not overlap. In contrast, the moments approach fails to distinguish between the estimates of f{theta} and {theta}R, or even between either estimate and zero. Both the asymmetry of the 2-unit support ranges of the MLEs and the magnitude of the confidence intervals obtained from the moments approach suggest that the variance estimates provide little indication of dispersion about the expectation.


View this table:
[in this window]
[in a new window]

 
TABLE 2 Single-locus analyses of S-RNase and 48A

 
Two-locus likelihoods:
Considering variation at both loci, we obtained maximum-likelihood estimates of R/{theta} and f{theta} under the rapid-drift approximation (19). The joint likelihood represents the probability of the observed pair of numbers of segregating sites at the two loci, and the composite likelihood the product of the probabilities determined separately for each locus. Maximization of the composite likelihood would treat the numbers of segregating sites at S-RNase and 48A as independent. Correlations between loci reflect periods of shared history between lineages while residing in the same haplotype (KAPLAN and HUDSON 1985; WIUF and HEIN 1999). Recombination tends to break up any type 3 haplotypes in the initial sample into types 1 and 2.

To assess the effect of shared history, we compared the MLEs determined by maximizing the joint likelihood to those determined by maximizing the composite likelihood obtained from the two one-locus models (20). Table 3 shows very close correspondence between the composite and joint MLEs across a range spanning realistic values of the number of S-alleles in the population (n). This agreement suggests that recombination has occurred at a sufficiently high rate to justify ignoring the correlation due to the initial sampling of type 3 haplotypes.


View this table:
[in this window]
[in a new window]

 
TABLE 3 Maximum-likelihood estimates of rates of recombination and synonymous substitution

 
Assuming negligible correlation between the loci, we studied the shape of the composite-likelihood surface. Figure 4 shows composite likelihoods of R/{theta} and f{theta} under the assignment n = 20. The steep decline in the likelihood of small values of R/{theta} supports our earlier rejection of absolute linkage (TAKEBAYASHI et al. 2003). This analysis provides an MLE of R/{theta} of 841.56 (Figure 4, X).



View larger version (101K):
[in this window]
[in a new window]
[Download PPT slide]
 
FIGURE 4.— Log-likelihoods of parameters R/{theta} and f{theta} determined from composite probabilities under the assignment n = 20. Contour lines surrounding the the MLE point (X) are spaced at intervals of 0.5 log-likelihood units, with the dashed line indicating the 2-unit support range. The horizontal line (23), indicating the expected relationship based on the first moments, provides an excellent indication of the region of high likelihood.

 
Rearrangement of (21) suggests another means of estimating R/{theta},

(23)
for determined from the number of segregating sites at 48A and from S-RNase. Assignment of n = 20 and and to the moments estimates given in Table 2 gives = 776.52 (Figure 4, horizontal line), somewhat less than the MLE.


    DISCUSSION
 TOP
 ABSTRACT
 BALANCING SELECTION
 TWO-LOCUS LIKELIHOODS
 APPLICATION
 DISCUSSION
 ACKNOWLEDGEMENTS
 LITERATURE CITED
 
Maximum-likelihood estimation of very low recombination rates:
We have described a method for the recursive computation of the exact sampling distribution of the numbers of neutral mutations segregating at a marker locus and a target of balancing selection in a sample of arbitrary size and configuration. A >15-fold deficiency in variation at marker locus 48A relative to S-RNase, the determinant of SI specificity in pistil (Table 1; TAKEBAYASHI et al. 2003), corresponds to an MLE of the interlocus recombination rate of perhaps 3 orders of magnitude greater than the per-site rate of synonymous substitution (Figure 4).

To an order of magnitude, rates of synonymous substitution in plants (Arabidopsis thaliana; ZHANG et al. 2002) as well as mammals (LI et al. 1985; KUMAR and SUBRAMANIAN 2002) fall in the range 10–8–10–9/site/year. Under the assumption of an annual life cycle, our estimates suggest a rate of recombination between 48A and S-RNase on the order of 10–3 or 10–4 cM. LI et al. (2000) detected no recombination between the S-locus and three marker loci, including 48A, in screens of hundreds of plants (Table 4). The minimum detectable rate of recombination (~1 cM) exceeds by several orders of magnitude our MLE, which reflects recombination events that occurred within the span of the S-allele genealogy. Segregating S-allele classes appear to have diverged >30 million years (MY) ago in both the solanaceous S-RNase-based GSI system (IOERGER et al. 1990) and the Brassica SSI system (UYENOYAMA 1995). Adopting this figure for the average pairwise divergence time suggests an expected total length of a five-allele genealogy of 120 MY. Over intervals of this magnitude, recombination at even the very low rate suggested by our analysis appears to have been sufficient to cause a significant reduction in the number of synonymous mutations segregating at 48A relative to S-RNase.


View this table:
[in this window]
[in a new window]

 
TABLE 4 Minimum detectable interval between the S-locus and marker loci

 
Countervailing effects of strong balancing selection:
Most studies of the effects of balancing selection on neutral variability in linked regions restrict consideration to the biallelic case (e.g., STROBECK 1980; KAPLAN et al. 1988; BARTON and NAVARRO 2002). In contrast, homomorphic SI systems maintain dozens or even hundreds of S-alleles (LAWRENCE 2000). A phenomenon that arises in multiallelic but not biallelic systems held in polymorphism by strong balancing selection is the rapid increase in frequency of novel or rare alleles. Such abrupt expansions tend to decrease neutral variation within specificity class at hitchhiking sites. However, strong balancing selection can also increase divergence times and neutral diversity at linked sites by expanding the timescale of coalescence among allelic classes (TAKAHATA 1990; SASAKI 1992; VEKEMANS and SLATKIN 1994; UYENOYAMA 2003). Recombination and allelic turnover jointly influence the overall pattern of nucleotide variation. In contrast, under the assumptions of constant allele frequencies and no allelic turnover, the observed decline with looser linkage in interallelic pairwise differences in a sliding-window analysis of the Adh locus of Drosophila melanogaster was deemed sufficient to identify the nucleotide site subject to balancing selection (HUDSON and KAPLAN 1988).

Distinguishing between an ancient episode of purifying selection at a tightly linked site and a recent episode at a loosely linked site presents a general challenge to methods that seek to infer the location of a target of selection from the pattern of neutral variation (KIM and STEPHAN 2002; NAVARRO and BARTON 2002). The likelihood surface obtained from consideration of variation at 48A alone (Figure 3) reflects these alternative explanations for low variation. In our study, the problem was exacerbated by the nature of the sample, which comprised only one representative of each S-allele. The magnitude of variation within mating-type alleles can provide information about the age of a given specificity and consequently the rate of specificity turnover (SLATKIN and RANNALA 1997; MAY et al. 1999).

Evolution under tight linkage and enforced heterozygosity:
In addition to the intense balancing selection engendered by GSI, "Muller's ratchet" (FELSENSTEIN 1974) and related degenerative processes (CHARLESWORTH and CHARLESWORTH 2000) likely operate within mating-type regions. ROZEN et al. (2003) have proposed that gene conversion within the extensive palindromes of the human Y chromosome has permitted preservation of many essential male-specific genes. Degeneration appears to be the fate of Y-linked genes that lack recourse to such special mechanisms (MARSHALL GRAVES 2002).

In regions under complete linkage to the S-locus, enforcement of S-allele heterozygosity postpones formation of zygotes bearing a mutation in homozygous form until after the diversification in mating type of its carriers. Ancient (>30 MY) divergence among S-alleles and tight linkage to the S-locus of genomic tracts of considerable size (BOYES et al. 1997; MCCUBBIN and KAO 1999; WANG et al. 2003) together imply the accumulation of mutations over very long periods and at a great many sites prior to their selective screening in homozygous form.

Small population size promotes the substitution of deleterious mutations (NEI 1970). Subdivision of the gamete pool into many mating types exacerbates degeneration at sites tightly linked to mating-type regions. In populations with equal frequencies of the sexes, Y-linked genes experience an effective population size fourfold lower than that of autosomal genes. Linkage to the S-locus reduces effective population size by n-fold, for n the typically large number of common S-alleles segregating in natural populations (LAWRENCE 2000). As a consequence, linkage disequilibrium between deleterious mutations and the S-locus (S-allele-specific load) may accumulate.

S-allele-specific load:
An analysis of the shape of S-RNase genealogies indicated significantly long terminal branches (UYENOYAMA 1997). This method has been used to detect similar distortions in a larger sample of solanaceous S-alleles (RICHMAN and KOHN 1999), among Brassica SSI S-alleles (SCHIERUP et al. 1998), among fungal mating specificities (MAY et al. 1999), and at a major histocompatibility complex class II locus (RICHMAN and KOHN 1999). Even the Adh polymorphism in D. melanogaster, an exemplar of balancing selection, shows this pattern (HUDSON and KAPLAN 1986).

UYENOYAMA (1997)(2000) proposed that progressive intensification of allele-specific load may contribute to this kind of genealogical distortion. Analysis of an extension of WRIGHT's (1939) classical diffusion model of GSI indicated that allele-specific load reduces the expected number of common alleles (n) maintained in populations and the rate of S-allele diversification ({lambda} {propto} 1/f) for a given effective population size and rate of origin of new S-alleles (UYENOYAMA 2003). Although S-allele-specific load is expected to change the nature of the dependence of f and n on population size and rate of origin of new alleles, our analysis treats f and n as phenomenological parameters. Consequently, we conjecture that the presence of S-allele-specific load may not severely compromise our estimate of the relative rate of recombination (R/{theta}).

Method of moments:
Many workers have used backward recursions to determine the moments of divergence times or numbers of pairwise differences (e.g., KAPLAN et al. 1988; NOTOHARA 1990; TAKAHATA 1991; WAKELEY 1998). In particular, KAPLAN and HUDSON (1985) derived the mean, variance, and covariance of numbers of segregating sites at multiple loci in samples of arbitrary size. While the expectation corresponds well to the high-likelihood region (Figure 4), the confidence intervals appear uninformative (Table 2; UYENOYAMA and TAKEBAYASHI 2004). For the system at hand, likelihood methods may provide a more reliable basis for statistical inference.

Maximum-likelihood methods:
FELSENSTEIN et al. (1999) and STEPHENS and DONNELLY (2000) have provided lucid accounts of two powerful ML approaches using Monte Carlo computational methods. Felsenstein and colleagues have addressed the estimation of population parameters, using Metropolis-Hastings (MH) sampling of genealogies to average over the missing datum of the genealogy of the sampled genes. In the Felsenstein approach, a genealogy comprises topology and branch lengths in units of mutations. Alternatively, GRIFFITHS and TAVARé (1994) characterize the probability distribution of entire histories as solutions of systems of linear equations. In the Griffiths/Tavaré approach, a history comprises topology together with assignment of mutations to branches. Because the central linear recursions are defined over the effectively infinite-dimensioned state space of history, importance sampling (IS) is used to approximate the underlying Markov chain.

The joint distribution of numbers of segregating sites at two neutral loci (GRIFFITHS 1981) or of the ages of the most recent common ancestors of sampled genes at multiple neutral loci (GRIFFITHS and MARJORAM 1996; GRIFFITHS 1999; SLADE 2002) provides information about rates of recombination among the loci. Implementations of MH (KUHNER et al. 2000) and IS (FEARNHEAD and DONNELLY 2001, 2002) methods for the estimation of recombination rates have been developed. Incorporation of recombination entails analysis of the even larger state space of ancestral recombination graphs (GRIFFITHS 1991; GRIFFITHS and MARJORAM 1996).

Our primary objective is the estimation of population parameters rather than genealogy reconstruction. Our method is much simpler than other ML approaches because we base our estimate of the rate of recombination on the summary statistic of number of segregating sites. Computation of the likelihoods, which incorporate averaging over all possible histories of the sampled haplotypes, entails recursive solution of linear systems of equations, not simulation of a posterior distribution of genealogies. The variables of our recursion system correspond to probability-generating functions of numbers of segregating sites, and the state space of our Markov chain comprises all possible configurations of the sample (assignment of lineages to S-allele classes). Mutations are not part of the definition of state, as in the Griffiths/Tavaré approach, but are tracked separately by the pgfs. The enormous reduction in the dimension of the state space permits direct solution of the systems of linear equations without resort to IS computational methods. Further, the simplicity of the analytical description [e.g., (11) and (14)] serves to clarify the evolutionary process.

Applications:
Our approach may facilitate the determination of the relative order of markers in regions containing targets of various forms of strong balancing selection (see Introduction). Sampling strategies most useful for inferring gene order from the relative numbers of segregating sites include obtaining haplotypes that comprise the target of balancing selection as well as the marker loci, multiple haplotypes corresponding to the same specificity class, and multiple distinct specificity classes.

In principle, our method can accommodate additional summary statistics (e.g., number of distinct haplotypes, frequency spectrum of mutations) and general biological processes or population structures. Customization to the ecology and genetics of the particular system under study entails redefinition of the state space and derivation of one-step Markov transition matrices (9). In our present analysis, these tasks were straightforward, even while accommodating a two-locus coalescent, intense balancing selection and gene genealogies with multifurcating nodes. In practice, the range of applications would be limited primarily by the size of the central system of linear recursions (14). Efficient IS methods have been described for some arbitrarily large systems (GRIFFITHS and TAVARé 1994; STEPHENS and DONNELLY 2000).


    ACKNOWLEDGEMENTS
 TOP
 ABSTRACT
 BALANCING SELECTION
 TWO-LOCUS LIKELIHOODS
 APPLICATION
 DISCUSSION
 ACKNOWLEDGEMENTS
 LITERATURE CITED
 
We thank two anonymous reviewers for constructive comments. E.N. receives funding from the Australian Research Council. U.S. Public Health Service grant GM 37841 to M.K.U. provided support for this study.


    FOOTNOTES
 
1 Present address: Department of Biology and Wildlife, University of Alaska, Fairbanks, AK 99775. Back


    LITERATURE CITED
 TOP
 ABSTRACT
 BALANCING SELECTION
 TWO-LOCUS LIKELIHOODS
 APPLICATION
 DISCUSSION
 ACKNOWLEDGEMENTS
 LITERATURE CITED
 

BARTON, N. H., 1998 The effect of hitchhiking on neutral genealogies. Genet. Res. 72: 123–133.[CrossRef]

BARTON, N. H., and A. NAVARRO, 2002 Extending the coalescent to multilocus systems: the case of balancing selection. Genet. Res. 79: 129–139.[CrossRef][Medline]

BIRKY, C. W., and J. B. WALSH, 1988 Effects of linkage on rates of molecular evolution. Proc. Natl. Acad. Sci. USA 85: 6414–6418.[Abstract/Free Full Text]

BOYES, D. C., M. E. NASRALLAH, J. VREBALOV and J. B. NASRALLAH, 1997 The self-incompatibility (S) haplotypes of brassica contain highly divergent and rearranged sequences of ancient origin. Plant Cell 9: 237–247.[Abstract]

BREWER, P., 1998 The statistical analysis of a gene tightly linked to the self-incompatibility locus. BSc Honours Thesis, University of Melbourne, Melbourne.

CASSELMAN, A. L., J. VREBALOV, J. A. CONNER, A. SINGHAL, J. GIOVANNONI et al., 2000 Determining the physical limits of the brassica S locus by recombinational analysis. Plant Cell 12: 23–33.[Abstract/Free Full Text]

CHARLESWORTH, B., and D. CHARLESWORTH, 2000 The degeneration of Y chromosomes. Philos. Trans. R. Soc. Lond. B 355: 1563–1572.[CrossRef][Medline]

CHARLESWORTH, B., M. T. MORGAN and D. CHARLESWORTH, 1993 The effect of deleterious mutations on neutral molecular variation. Genetics 134: 1289–1303.[Abstract]

CRANDALL, K. A., and A. R. TEMPLETON, 1999 Statistical approaches to detecting recombination, pp. 153–176 in The Evolution of HIV, edited by K. A. CRANDALL. Johns Hopkins University Press, Baltimore.

EDWARDS, A. W. F., 1972 Likelihood. Cambridge University Press, Cambridge, UK.

ENTANI, T., M. IWANO, H. SHIBA, F.-S. CHE, A. ISOGAI et al., 2003 Comparative analysis of the self-incompatibility (S-) locus region of Prunus mume: identification of a pollen-expressed F-box gene with allelic diversity. Genes Cells 8: 203–213.[Abstract]

EWENS, W. J., 1964 On the problem of self-sterility alleles. Genetics 50: 1433–1438.[Free Full Text]

FEARNHEAD, P., and P. DONNELLY, 2001 Estimating recombination rates from population genetic data. Genetics 159: 1299–1318.[Abstract/Free Full Text]

FEARNHEAD, P., and P. DONNELLY, 2002 Approximate likelihood methods for estimating local recombination rates. J. R. Stat. Soc. B 64: 657–680.[CrossRef]

FELSENSTEIN, J., 1974 The evolutionary advantage of recombination. Genetics 78: 737–756.[Abstract/Free Full Text]

FELSENSTEIN, J., M. K. KUHNER, J. YAMATO and P. BEERLI, 1999 Likelihoods on coalescents: a Monte Carlo sampling approach to inferring parameters from population samples of molecular data, pp. 163–185 in Statistics in Molecular Biology and Genetics, edited by F. SEILLIER-MOISEIWITSCH. Institute of Mathematical Statistics and American Mathematics Society, Haywood, CA.

FERRIS, P. J., C. PAVLOVIC, S. FABRY and U. W. GOODENOUGH, 1997 Rapid evolution of sex-related genes in Chlamydomonas. Proc. Natl. Acad. Sci. USA 94: 8634–8639.[Abstract/Free Full Text]

FERRIS, P. J., E. V. ARMBRUST and U. W. GOODENOUGH, 2002 Genetic structure of the mating-type locus of Chlamydomonas reinhardtii. Genetics 160: 181–200.[Abstract/Free Full Text]

FISHER, R. A., 1930 The Genetical Theory of Natural Selection, Ed. 1. Oxford University Press, Oxford.

GALLEGOS, A., D. J. JACOBSON, N. B. RAJU, M. P. SKUPSKI and D. O. NATVIG, 2000 Suppressed recombination and a pairing anomaly on the mating-type chromosome of Neurospora tetrasperma. Genetics 154: 623–633.[Abstract/Free Full Text]

GALTIER, N., F. DEPAULIS and N. H. BARTON, 2000 Detecting bottlenecks and selective sweeps from DNA sequence polymorphism. Genetics 155: 981–987.[Abstract/Free Full Text]

GILLESPIE, J., 1983 Some properties of finite populations experiencing strong selection and weak mutation. Am. Nat. 121: 691–708.[CrossRef]

GILLESPIE, J., 1994 Substitution processes in molecular evolution. ii. Exchangeable models from population genetics. Evolution 48: 1101–1113.[CrossRef]

GRIFFITHS, R. C., 1981 Neutral two-locus multiple allele models with recombination. Theor. Popul. Biol. 19: 169–186.[CrossRef]

GRIFFITHS, R. C., 1991 The two-locus ancestral graph, pp. 100–117 in Selected Proceedings of the Sheffield Symposium on Applied Probability, Vol. 18, edited by I. V. BASAWA and R. L. TAYLOR. Institute of Mathematical Statistics, Hayward, CA.

GRIFFITHS, R. C., 1999 The time to the ancestor along sequences with recombination. Theor. Popul. Biol. 55: 137–144.[CrossRef][Medline]

GRIFFITHS, R. C., and P. MARJORAM, 1996 Ancestral inference from