Genetics, Vol. 160, 1231-1241, March 2002, Copyright © 2002

A Coalescent-Based Method for Detecting and Estimating Recombination From Gene Sequences

Gil McVean(a), Philip Awadalla(b), and Paul Fearnhead(a)
(a) Department of Statistics, University of Oxford, Oxford OX1 3TG, United Kingdom
(b) Section of Evolution and Ecology, University of California, Davis, California 95616

Corresponding author: Gil McVean, 1 S. Parks Rd., Oxford OX1 3TG, United Kingdom. E-mail: mcvean@stats.ox.ac.uk

Communicating editor: J. HEY


*  ABSTRACT

Determining the amount of recombination in the genealogical history of a sample of genes is important to both evolutionary biology and medical population genetics. However, recurrent mutation can produce patterns of genetic diversity similar to those generated by recombination and can bias estimates of the population recombination rate. HUDSON (2001) has suggested an approximate-likelihood method based on coalescent theory to estimate the population recombination rate, 4Ner, under an infinite-sites model of sequence evolution. Here we extend the method to the estimation of the recombination rate in genomes, such as those of many viruses and bacteria, where the rate of recurrent mutation is high. In addition, we develop a permutation-based method for detecting recombination that is both more powerful than other permutation-based methods and robust to misspecification of the model of sequence evolution. We apply the method to sequence data from viruses, bacteria, and human mitochondrial DNA. The extremely high level of recombination detected in both HIV1 and HIV2 sequences demonstrates that recombination cannot be ignored in the analysis of viral population genetic data.


RECOMBINATION breaks down the correlation in genealogical history between different regions of a genome and shuffles genetic diversity among chromosomes. In evolutionary biology, recombination is important because it generates novel gene combinations, which allows the spread of multiple beneficial mutations (FISHER 1932; MULLER 1932) and prevents the accumulation of deleterious ones (MULLER 1964). In medical genetics, associations between disease phenotypes and genetic markers that build up through genetic drift and are broken down by recombination are central to the mapping of disease-associated mutations (PRITCHARD and PRZEWORSKI 2001).

The occurrence of recombination also has practical implications for evolutionary inference. For population geneticists, recombination reduces the effects of evolutionary stochasticity, averaging out genealogical histories over a genome. In contrast, traditional methods of phylogenetic inference typically assume the absence of recombination. If the assumption is incorrect, inferences about the evolutionary history of gene sequences may be misleading (SCHIERUP and HEIN 2000). Recombination is therefore a critical issue for analyses of within-species variation.

A variety of nonparametric methods have been developed to detect recombination from gene sequences, without estimating the rate at which it occurs. Some use phylogenetic methods to ask whether different regions of a gene have different histories (GRASSLY and HOLMES 1997; MCGUIRE et al. 2000); these are targeted at identifying rare recombinant genotypes. Other methods are aimed at inferring the presence of recurrent recombination, such as occurs among the genes of most eukaryote species. Among these methods, some consider summary statistics that are sensitive to recombination, such as the relationship between physical distance and measures or indicators of linkage disequilibrium (LEWONTIN 1964; MAYNARD SMITH 1999). Other methods consider properties of phylogenetic trees inferred under the assumption of no recombination (MAYNARD SMITH and SMITH 1998; WOROBEY 2001). The methods vary in their ability to statistically detect recombination under different conditions and in their sensitivity to an accurate characterization of the underlying model of sequence evolution (MAYNARD SMITH 1999; MEUNIER and EYRE-WALKER 2001).

The inability of such methods to estimate the rate at which recombination occurs is a serious limitation. Characterizing the rate of recombination is important for analyzing the power of association studies, assessing the reliability of phylogenetic methods, and predicting the rate at which advantageous mutations, such as those conferring drug resistance, can spread between genetic backgrounds. Some nonparametric methods for detecting recombination, such as the homoplasy test (MAYNARD SMITH and SMITH 1998) and derivatives (WOROBEY 2001), provide a characterization of how far the data are from the extremes of free recombination and complete clonality. But there is no straightforward relationship between such a property and the parameters of any underlying evolutionary model. As a result, comparison between genes or species is problematic, and there is little or no way of statistically testing whether data sets have different levels of recombination. Model-based estimation of the rate of recombination does rely on an underlying model that is almost certainly a simplification of reality. However, the benefits gained are the ease of comparison between different data sets, the ability to make predictions about the question of interest, and the potential to test whether the model of evolution is an adequate characterization of the underlying processes. In addition, parametric models can be used to test for the presence of recombination by comparing the likelihood of the data under models with and without recombination (BROWN et al. 2001).

What evolutionary model is appropriate for describing the effects of recombination on gene sequences? Coalescent theory provides a statistical description of the genealogical history of sequences sampled from large, Fisher-Wright populations with nonoverlapping generations, constant population size, and no selection or migration (KINGMAN 1982; HUDSON 1991). Within this framework, the effects of recombination on sample history are a function not of the absolute recombination rate, but of the product of the per gene per generation rate of crossing over (genetic map length), r, and the effective population size, Ne (GRIFFITHS and MARJORAM 1996b). Without prior information about one of these parameters, it is possible only to estimate the product of these parameters, often written as 4Ner (equivalently, one can estimate the ratio of the recombination rate and the mutation rate, r/µ, and the population mutation rate, θ). The coalescent can readily be extended to include time-varying population size, migration, and some forms of selection (HUDSON and KAPLAN 1994; BRAVERMAN et al. 1995). Under these more complex situations, the effects of recombination on gene samples also depend on other parameters. In general, however, the product of the current effective population size and the absolute recombination rate is the key determinant of the impact of recombination on patterns of genetic diversity.
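The equivalence noted in parentheses follows from a one-line cancellation. Writing θ for the population mutation rate and taking θ = 4Neµ (the usual diploid scaling, which is an assumption here), the effective population size drops out of the ratio:

```latex
\frac{4N_e r}{\theta} \;=\; \frac{4N_e r}{4N_e \mu} \;=\; \frac{r}{\mu}
```

so the pair (4Ner, θ) carries the same information as the pair (r/µ, θ).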

Within the framework of the coalescent, several methods have been proposed as estimators of the population recombination rate. HUDSON (1987) derived a moment estimator on the basis of the variance in pairwise differences. HEY and WAKELEY (1997) developed a method on the basis of combining analytically derived likelihoods for all pairs of sites and sets of four sequences. WALL (2000) proposed to find the value of 4Ner that maximizes the likelihood of observing the number of haplotypes and inferred minimum number of recombination events (HUDSON and KAPLAN 1985). Full-likelihood estimators of the population recombination rate, on the basis of the coalescent, have also been developed. These use computationally intensive Monte Carlo methods; GRIFFITHS and MARJORAM (1996a) described a method on the basis of importance sampling, while KUHNER et al. (2000) developed a Metropolis-Hastings Markov chain Monte Carlo (MCMC) method. Recently, FEARNHEAD and DONNELLY (2001) improved the importance sampling method considerably. Even so, full-likelihood methods are computationally intensive and practically impossible for many data sets.

Recently, HUDSON (2001) suggested an ad hoc method for estimating the population recombination rate on the basis of combining the coalescent likelihoods of all pairwise comparisons of segregating sites. Estimation of 4Ner is rapid, and the method performs well in terms of bias and variance in comparison to Hudson's earlier moment estimator (HUDSON 1987) and other ad hoc approaches (HUDSON 2001). The method does not use all available information in the sequence data and introduces nonindependence in the combination of multiple comparisons, but is flexible and can potentially be expanded to incorporate deviations from the standard coalescent. HUDSON's (2001) estimator of 4Ner has been termed the composite-likelihood estimate (CLE).

In this article we consider a problem of critical importance to the analysis of recombination: the detection and estimation of recombination in genomes, such as those of many viruses and bacteria, where the rate of substitution is sufficiently high that some sites have experienced multiple mutations in the history of the sample. The issue is important because recurrent mutation can generate patterns of genetic variability that resemble the effects of recombination (Fig 1); in particular, the presence of all four haplotypes for a pair of segregating sites. Under the infinite-sites model, any such incompatibilities would be interpreted as evidence for recombination and hence will bias estimates of the recombination rate upward. Similarly, the likelihood-ratio test for the presence of recombination will be sensitive to misspecification of the mutation model, particularly the underestimation of the mutation rate at segregating sites, which can be caused by rate heterogeneity.
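The incompatibility described above (all four haplotypes present at a pair of segregating sites) is simple to check directly. The following is a minimal sketch; the function name and the representation of each site as a string of alleles, one character per sampled sequence, are illustrative assumptions rather than part of the method described here.

```python
def has_all_four_haplotypes(site_a, site_b):
    """Return True if two biallelic sites show all four two-site haplotypes.

    site_a and site_b hold the allele carried by each sampled sequence,
    in the same sequence order (an assumed input format).
    """
    if len(site_a) != len(site_b):
        raise ValueError("sites must be scored on the same sequences")
    if len(set(site_a)) != 2 or len(set(site_b)) != 2:
        raise ValueError("both sites must have exactly two alleles")
    # Each sequence contributes one two-site haplotype; count distinct ones.
    return len(set(zip(site_a, site_b))) == 4


# Four sequences carrying AA, AT, TA, TT: incompatible under infinite sites.
incompatible = has_all_four_haplotypes("AATT", "ATAT")
# Perfectly associated sites: only two haplotypes, so no incompatibility.
compatible = has_all_four_haplotypes("AATT", "AATT")
```

Under the infinite-sites model the first pair can only be explained by recombination, whereas under a finite-sites model a second mutation at either site is an alternative explanation.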



Figure 1. Recurrent mutation (A) and recombination (B) can generate similar patterns of genetic variability. The top shows the genealogies and occurrence of mutations, while the bottom depicts the resulting sampled gene sequences.

To address these problems we have extended Hudson's composite-likelihood method (HUDSON 2001) to allow for finite-sites mutation models. In addition, we propose a permutation-based test (the likelihood permutation test) to test the hypothesis of no recombination. We use a permutation-based approach, rather than estimate confidence intervals from the composite likelihood, as the nonindependence makes interpretation of the composite-likelihood surface problematic, but also because we wish the test to be robust to model misspecification. We find that the composite-likelihood estimator performs well, even when most sites analyzed have experienced multiple mutations, and that the likelihood permutation test is more powerful than previous permutation-based methods for detecting recombination. We also consider the effect of misspecification of the model of sequence evolution on both the test for recombination and estimation of 4Ner. We show that the likelihood permutation test is robust to misspecification, unlike the homoplasy test (MAYNARD SMITH and SMITH 1998) or the informative sites test (WOROBEY 2001), and that estimation of 4Ner is also robust to minor misspecification of the model of sequence evolution. We apply the likelihood permutation test and estimation procedure to several empirical data sets from viruses, bacteria, and human mitochondria.


*  METHODS

Composite-likelihood estimation of 4Ner:
First, we outline our implementation of the approach of HUDSON (2001) for estimating the population recombination rate under the standard Fisher-Wright population model. The central difference between the method of HUDSON (2001) and that presented here is that we allow for models of sequence evolution in which multiple mutations may occur at a site during the history of the sample. Although it is possible to use an arbitrary model of sequence evolution, we make the simplifying assumption that all sites in a sequence conform to a two-allele model with reversible, symmetric mutation, such that the rate of mutation per site per generation is µ and is constant across sites. Consequently, we restrict analysis to sites at which there are no more than two alleles segregating. The extension of the method to more complex models of sequence evolution is left to future research; however, it is worth noting that the method appears to perform well, even when the true model of sequence evolution is considerably more complex than that assumed (see below).

The estimation procedure has four stages. The initial step is to estimate the population mutation rate per site, θ, from an approximate finite-sites version of the Watterson estimate

(1)

where S is the number of segregating sites, L is the total length of sequence analyzed, and n is the number of sampled gene sequences. The second stage is to consider every pair of segregating sites in the data (excluding sites with more than two alleles) and classify them into equivalent sets. For example, under the assumed mutation model, if one pair had the ordered data {AA, AT, TA, TA, AA} and another {GG, CC, CG, GG, CG}, these are equivalent to the unordered sequence {00, 00, 10, 10, 01}, where 1 represents the rare allele at each site. The number of types (hence the execution time of the program) depends on the number of sequences, the level of diversity, and the complexity of the assumed mutation model.
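The first two stages can be sketched in a few lines. The estimator below uses one form consistent with the description of an approximate finite-sites Watterson estimate: if the number of mutations at a site is treated as Poisson with mean θ·a_n, where a_n = 1 + 1/2 + ... + 1/(n-1), then the expected fraction of nonsegregating sites is exp(-θ·a_n), which gives θ = ln(L/(L - S))/a_n. That form is an assumption, as are the function names and the tie-breaking rule used when both alleles at a site are equally frequent.

```python
from collections import Counter
from math import log


def watterson_finite_sites(S, L, n):
    """Per-site theta from S segregating sites in L sites and n sequences.

    Assumed form: theta = ln(L / (L - S)) / sum_{k=1}^{n-1} 1/k.
    """
    a_n = sum(1.0 / k for k in range(1, n))
    return log(L / (L - S)) / a_n


def canonical_pair(haplotypes):
    """Recode a pair of biallelic sites so equivalent pairs compare equal.

    haplotypes: list of two-character strings, one per sequence.
    The common allele at each site becomes '0' and the rare allele '1';
    ties are broken alphabetically (an arbitrary, assumed convention).
    The result is sorted because the order of sequences is irrelevant.
    """
    recoders = []
    for pos in (0, 1):
        counts = Counter(h[pos] for h in haplotypes)
        if len(counts) != 2:
            raise ValueError("each site must have exactly two alleles")
        (common, _), (rare, _) = sorted(
            counts.items(), key=lambda item: (-item[1], item[0])
        )
        recoders.append({common: "0", rare: "1"})
    return tuple(sorted(recoders[0][h[0]] + recoders[1][h[1]] for h in haplotypes))


theta_hat = watterson_finite_sites(S=20, L=500, n=30)
pair_1 = canonical_pair(["AA", "AT", "TA", "TA", "AA"])
pair_2 = canonical_pair(["GG", "CC", "CG", "GG", "CG"])
```

Both example pairs from the text reduce to the same recoded set, so the likelihood needs to be evaluated only once for the two of them.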

The third stage is to estimate the likelihood of each equivalent set under the estimated value of θ, the symmetric, reversible mutation model, and a range of recombination rates (typically 0 ≤ 4Ner ≤ 100), using the importance sampling method of FEARNHEAD and DONNELLY (2001). We also used a simple Monte Carlo scheme for estimating the likelihood, similar to that implemented in HUDSON (2001), to check the accuracy of likelihoods estimated by the importance sampling method (results not shown).

In the final stage, an estimate of the population recombination rate for the entire sequence (4Ner) is obtained by combining the likelihoods from all pairwise comparisons. The composite likelihood is given by

(2)

where ℓ(Xij | 4Nerij) is the log likelihood of the data for segregating sites i and j given

(3)

where dij is the physical distance (in nucleotides) separating sites i and j and L is the total length of the sequence (i.e., we assume a constant rate of recombination over the gene). The estimate of 4Ner is taken as the value that has the highest composite log likelihood.
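The final stage amounts to a sum over pairs followed by a grid search. The sketch below assumes that the pairwise recombination rate scales linearly with distance, 4Nerij = 4Ner × dij/L, as the constant-rate assumption implies, and that the pairwise log likelihoods are available through a caller-supplied function. The names `pair_loglik`, `rho` (shorthand for 4Ner), and the toy likelihood used in the demonstration are all hypothetical.

```python
from itertools import combinations


def composite_log_likelihood(rho, positions, pair_loglik, seq_length):
    """Sum pairwise log likelihoods over all pairs of segregating sites.

    rho: candidate value of 4Ner for the whole sequence.
    positions: nucleotide position of each segregating site.
    pair_loglik(i, j, rho_ij): log likelihood of the data at sites i and j
        given their pairwise rate (hypothetical lookup, e.g. a table
        precomputed for each equivalent set).
    """
    total = 0.0
    for i, j in combinations(range(len(positions)), 2):
        d_ij = abs(positions[j] - positions[i])
        rho_ij = rho * d_ij / seq_length  # constant rate along the gene
        total += pair_loglik(i, j, rho_ij)
    return total


def composite_likelihood_estimate(rho_grid, positions, pair_loglik, seq_length):
    """Return the grid value with the highest composite log likelihood."""
    return max(
        rho_grid,
        key=lambda rho: composite_log_likelihood(
            rho, positions, pair_loglik, seq_length
        ),
    )


# Toy demonstration: a made-up pairwise surface that peaks when 4Ner = 10.
positions = [0, 50, 100]


def toy_pair_loglik(i, j, rho_ij):
    target = 10.0 * abs(positions[j] - positions[i]) / 100
    return -((rho_ij - target) ** 2)


estimate = composite_likelihood_estimate(
    [0, 5, 10, 20], positions, toy_pair_loglik, seq_length=100
)
```

Because every pair shares sites with other pairs, the terms in the sum are not independent, which is why the curvature of the resulting surface cannot be read as a confidence interval.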

For genomes, such as those of viruses and bacteria, in which a gene-conversion model for recombination is more appropriate than a crossing-over model, the relationship between physical distance and recombination rate is modeled as

(4)

where c is the per base rate of initiation of gene conversion and the second parameter is the average gene conversion tract length (assuming an exponential distribution; FRISSE et al. 2001). This type of model can also be applied to circular genomes, such as that of mitochondria, where dij is the minimum distance between two points on the circle (WIUF 2001). While it is possible to coestimate both the rate of gene conversion and the average tract length, in practice we fix the average tract length and estimate the compound parameter

(5)

which can be thought of as the population rate of recombination between two distantly linked loci caused by gene conversion.
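To illustrate how a gene-conversion model changes the dependence of recombination on distance, the sketch below uses a form that is common in the gene-conversion literature for exponentially distributed tract lengths, rij = 2·c·t·(1 - exp(-dij/t)), with t the mean tract length. Whether this matches the expression used here term by term (for example, the leading factor of 2) is an assumption, and the function names are illustrative.

```python
from math import exp


def circular_distance(pos_i, pos_j, genome_length):
    """Minimum distance between two points on a circular genome."""
    d = abs(pos_i - pos_j) % genome_length
    return min(d, genome_length - d)


def conversion_rate_between(d_ij, c, mean_tract):
    """Rate at which conversion events separate two sites d_ij apart.

    Assumed form: 2 * c * t * (1 - exp(-d_ij / t)), where c is the per
    base initiation rate and t the mean (exponential) tract length.
    """
    return 2.0 * c * mean_tract * (1.0 - exp(-d_ij / mean_tract))


near = conversion_rate_between(10, c=1e-3, mean_tract=100)
far = conversion_rate_between(1_000_000, c=1e-3, mean_tract=100)
wrapped = circular_distance(10, 990, genome_length=1000)
```

Unlike the crossing-over model, the rate here does not keep growing with distance: once two sites are much farther apart than a typical tract, it levels off, which is why a single compound parameter for distantly linked loci is a natural quantity to estimate.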

For simple data sets and low values of 4Ner, it is possible to compare the composite-likelihood surface with the full-likelihood surface estimated by the method of FEARNHEAD and DONNELLY (2001). Fig 2 shows a comparison of the two surfaces for a single case and the joint distribution of the maximum-likelihood estimator (MLE) and CLE point estimates of 4Ner for 100 simulated data sets with and . For the single example (Fig 2A), the composite-likelihood curve has a very similar point estimate to the ML estimate, but is more highly curved because of the nonindependence introduced by multiple comparisons. Statistics for the two estimators of 4Ner (full-likelihood/composite-likelihood) are median, 2.4/3.8; variance, 9.1/15.6; proportion within a factor of two from the true value, 0.50/0.52. The correlation between the composite- and maximum-likelihood estimates is 0.78 (Fig 2B).



Figure 2. (A) The composite (CLR) and full (LR) relative likelihood surface for a single simulated data set. (B) The joint distribution of the maximum-likelihood estimate (MLE) of 4Ner and the composite-likelihood estimate (CLE). Likelihoods were calculated with per site.

HUDSON (2001) characterized the composite-likelihood estimator for the case where data conform to the infinite-sites model. In terms of bias and variance, the CLE is one of the better ad hoc methods for estimating the population recombination rate, although the estimator has considerable variance. This is also true of the MLE (Fig 2) and, to a large extent, reflects inherent stochasticity in the genealogical process. However, while full likelihood provides an estimate of the relative likelihood of different values, there is no easily interpretable meaning of the composite-likelihood curve. Confidence intervals for the estimate of 4Ner can be obtained only by extensive simulation (HUDSON 2001).

The likelihood permutation test:
We propose a simple test for the presence of recombination. Under a model of no recombination, and assuming a uniform mutation rate, sites are exchangeable (this is also true if there is free recombination). That is, the likelihood of observing the data is independent of the order in which sites occur. If there is some recombination, sites are no longer exchangeable, because closely linked sites have correlated genealogies. Consequently, the likelihood of observing the data is dependent on the order of sites. The likelihood permutation test for recombination is based on this property; we find the maximum composite likelihood for a data set (estimating 4Ner in the process), then permute segregating sites by location, and for each permutation find the maximum composite likelihood (and the corresponding value of 4Ner). The proportion of permuted data sets with a composite likelihood equal to or greater than that of the original data is calculated. If this proportion is lower than a chosen significance level, we conclude that there is evidence for recombination.
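The permutation procedure can be sketched as follows. The maximized composite likelihood is treated as a black box: `max_composite_loglik` is a hypothetical caller-supplied function that takes the list of locations assigned to the segregating sites (site k sits at `positions[k]`) and returns the maximum composite log likelihood over the grid of 4Ner values. The statistic used in the demonstration is a toy stand-in, not a likelihood.

```python
import random


def likelihood_permutation_test(positions, max_composite_loglik,
                                n_permutations=1000, rng=None):
    """Proportion of permuted data sets scoring at least the observed value.

    positions: location of each segregating site, in site order.
    max_composite_loglik: hypothetical function mapping an assignment of
        locations to sites onto the maximized composite log likelihood.
    """
    rng = rng or random.Random()
    observed = max_composite_loglik(positions)
    shuffled = list(positions)
    n_at_least = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # reassign locations to sites at random
        if max_composite_loglik(shuffled) >= observed:
            n_at_least += 1
    return n_at_least / n_permutations


# Toy statistic that is largest when locations are in their original order.
def toy_statistic(locations):
    return sum(k * loc for k, loc in enumerate(locations))


p_ordered = likelihood_permutation_test(
    list(range(8)), toy_statistic, n_permutations=200, rng=random.Random(1)
)
# A statistic that ignores site order can never reject exchangeability.
p_flat = likelihood_permutation_test(
    list(range(8)), lambda locations: 0.0, n_permutations=200,
    rng=random.Random(1)
)
```

Only the locations are shuffled; the allelic states at each site, and hence the estimate of θ and the pairwise likelihood tables, are unchanged across permutations, so each permutation costs only a new sum over pairs.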

There are several methods for detecting recombination on the basis of the permutation of segregating sites. Permutation tests for recombination aimed at detecting a decay of a summary statistic of linkage disequilibrium (r2 or |D'|) with distance have been used to suggest the presence of recombination in human mitochondria (AWADALLA et al. 2000) and Plasmodium falciparum (CONWAY et al. 1999) and regions of low recombination in the Drosophila melanogaster genome (MIYASHITA and LANGLEY 1988). Another permutation test (referred to as G4) has been suggested by MEUNIER and EYRE-WALKER (2001), which compares the sum of distances between all pairs of sites that have all four possible haplotypes to the distribution in permuted data sets. We compared the power of the likelihood permutation test with these other permutation-based tests.
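For reference, the two linkage disequilibrium summaries used by these tests can be computed from a pair of biallelic sites as below. This is a sketch using the standard definitions of D, r2, and |D'|; the input format (two-character strings over '0' and '1', one per sequence) and the function name are assumptions.

```python
def ld_statistics(haplotypes):
    """Return (r2, |D'|) for a pair of biallelic sites.

    haplotypes: list of two-character strings over '0' and '1'.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == "1" for h in haplotypes) / n
    p_b = sum(h[1] == "1" for h in haplotypes) / n
    if p_a in (0.0, 1.0) or p_b in (0.0, 1.0):
        raise ValueError("both sites must be segregating")
    p_ab = sum(h == "11" for h in haplotypes) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    # D' rescales D by its maximum possible magnitude given the frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return r2, abs(d) / d_max


two_haplotypes = ld_statistics(["00", "00", "11", "11"])
three_haplotypes = ld_statistics(["00", "00", "01", "11"])
four_haplotypes = ld_statistics(["00", "01", "10", "11"])
```

|D'| stays at 1 whenever fewer than four haplotypes are present, so it responds only to the appearance of the fourth haplotype, whereas r2 also changes with haplotype frequencies.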

Models of sequence evolution:
We characterize both the composite-likelihood estimator and likelihood permutation test under a range of models of sequence evolution that reflect genomes experiencing high mutation rates at some or all sites. We have chosen four caricature models to represent the diversity of possible situations:

  • Infinite sites: All sites have the same low mutation rate and conform to the two-allele symmetric, reversible mutation model used in the likelihood estimation stage. This represents the best-case scenario (effectively infinite sites), as might be assumed for nuclear loci in humans (excluding hypermutable CpG dinucleotides).

  • Hypermutable: Most sites (99.5%) effectively conform to the infinite-sites model, but a fraction (0.5%) have a 100-fold higher mutation rate. All sites conform to the symmetric, reversible mutation model. This is chosen to reflect extreme rate variation, as occurs when hypermutable CpG dinucleotides are included in an analysis or in the mitochondrial genome of mammals.

  • Complex: This is characterized by strong base composition variation and mutation rate variation. Specifically, this is an HKY (Hasegawa, Kishino, Yano) mutation model (HASEGAWA et al. 1985), with base frequencies , a transition-transversion ratio of 2, and an exponential distribution of mutation rates with a base-averaged mutation rate of , where

    (6)

and µij is the average per generation mutation rate from base i to base j (from the exponential distribution). This model is chosen to reflect the complexity of sequence evolution in prokaryote genomes with strong base composition bias.

  • Finite sites: All sites have the same, high mutation rate and conform to the two-allele symmetric, reversible mutation model. In this case, each segregating site experiences, on average, 2.6 mutations in the history of the sample. This model represents the extreme levels of polymorphism as occur at synonymous sites in retroviruses such as human immunodeficiency virus (HIV).
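As an illustration of the kind of rate matrix behind the "Complex" model, the sketch below builds HKY rates and rescales them to a chosen base-averaged rate. Several assumptions are made: the base frequencies are hypothetical values chosen only to show strong composition bias, the transition-transversion ratio of 2 is interpreted as the HKY rate-ratio parameter kappa, and the base-averaged rate is taken to be the frequency-weighted sum of rates away from each base.

```python
BASES = "ACGT"
PURINES = {"A", "G"}


def hky_rate_matrix(base_freqs, kappa, mean_rate):
    """Return {(i, j): rate} for i != j under an HKY model.

    The rate from i to j is proportional to the frequency of j, multiplied
    by kappa for transitions (A<->G, C<->T). Rates are rescaled so that
    sum_i freq_i * sum_{j != i} rate_ij equals mean_rate (assumed
    definition of the base-averaged mutation rate).
    """
    raw = {}
    for i in BASES:
        for j in BASES:
            if i != j:
                is_transition = (i in PURINES) == (j in PURINES)
                raw[i, j] = base_freqs[j] * (kappa if is_transition else 1.0)
    average = sum(base_freqs[i] * rate for (i, j), rate in raw.items())
    scale = mean_rate / average
    return {pair: rate * scale for pair, rate in raw.items()}


# Hypothetical AT-rich composition; not the values used in the simulations.
freqs = {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}
rates = hky_rate_matrix(freqs, kappa=2.0, mean_rate=0.01)
```

Rate variation across sites could then be layered on top by multiplying the whole matrix at each site by a draw from an exponential distribution with mean 1.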

Data are simulated under the null and , for and the length of sequence chosen such that the average number of segregating sites is in the range 40–50. Ideally, for each simulated data set the likelihoods should be calculated for the value of θ estimated from the data. However, for the large number of replicates required to provide an accurate characterization of the estimator's properties, calculating the likelihoods for each data set is practically infeasible. Instead, we have estimated likelihoods under three different values of θ, 0.01, 0.1, and 0.5, and present the results for each, along with the mean and standard deviation of the values of θ estimated from the simulated data. One advantage of this approach is that it allows us to characterize the severity of model misspecification on the detection and estimation of recombination.

Empirical data:
We applied both the likelihood permutation test and estimation of the population recombination rate to a series of empirical data sets from viruses, bacteria, and human mtDNA. Previous analyses (SUERBAUM et al. 1998; AWADALLA et al. 1999; WOROBEY et al. 1999; INGMAN et al. 2000; WOROBEY 2001) of these data sets revealed a range of levels of recombination, from effectively clonal in hepatitis C virus (HCV) and mtDNA (INGMAN et al. 2000; WOROBEY 2001) to freely recombining in Helicobacter pylori (SUERBAUM et al. 1998). While none of these data sets represent random samples from Fisher-Wright populations, as is supposed by the coalescent methods of analysis, the results are likely to be indicative of the situation in more appropriate samples.

Viral genomes: Data sets were the following: HCV, 6 complete genome sequences (WOROBEY 2001; worldwide sample); measles, 50 sequences of the Hemagglutinin gene (WOELK et al. 2001; worldwide sample); dengue DEN-1 virus, 7 sets of concatenated capsid C, premembrane/membrane prM/M, and E genes (WOROBEY et al. 1999; worldwide); HIV2 subtype A, 21 sequences of the env gene (KUIKEN et al. 2000; worldwide); and HIV1 subtype B, 93 sequences of the env gene (KUIKEN et al. 2000; worldwide).

Bacterial genomes: H. pylori data sets were 33 sequences of the flaA gene (worldwide; SUERBAUM et al. 1998).

Mitochondrial genomes: Data sets were 45 partial genome sequences from the analysis of AWADALLA et al. (1999; worldwide) and 53 complete genome sequences from the analysis of INGMAN et al. (2000).


*  RESULTS

Estimating 4Ner with recurrent mutation:
To date, estimators of the population recombination rate have typically been characterized under the infinite-sites assumption that each segregating site is the result of a single mutation. In many biologically realistic situations this assumption cannot be justified, even though the infinite-sites model is superficially plausible. For example, if 20 mutations occur in a genealogy of 500 linked sites (the expected number for and ), the probability that at least one site experiences recurrent mutation is >30% and will be higher if there is recombination or any variation between sites in the mutation rate. In organisms with high mutation rates, such as many viruses and bacteria, a large proportion of sites may have experienced multiple mutations.
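The figure of more than 30% can be checked with a birthday-problem calculation. The sketch below assumes that the 20 mutations fall independently and uniformly on the 500 sites, which ignores rate variation and so gives a lower bound of the kind described; the function name is illustrative.

```python
def prob_recurrent_mutation(n_mutations, n_sites):
    """Probability that at least one site is hit by two or more mutations."""
    p_all_distinct = 1.0
    for k in range(n_mutations):
        # The (k+1)-th mutation must avoid the k sites already hit.
        p_all_distinct *= 1.0 - k / n_sites
    return 1.0 - p_all_distinct


p_recurrent = prob_recurrent_mutation(20, 500)  # about 0.32
```

Even though only 4% of sites are expected to carry a mutation, roughly one data set in three contains at least one site that violates the infinite-sites assumption.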

Because recurrent mutation can create patterns of genetic variability that resemble the effects of recombination (Fig 1), it is important to develop methods for estimating the recombination rate that can account for finite-sites models of sequence evolution. We have extended HUDSON's (2001) composite-likelihood method for estimating the population recombination rate, 4Ner, within a coalescent framework, to incorporate models in which sites may experience multiple mutations in the history of the sample. Our approach is to use the simplest possible model of finite-sites evolution (two-allele system with symmetric reversible mutation and a constant mutation rate across sites) and to investigate how the method performs under a variety of caricature models of sequence evolution chosen to reflect biological diversity.

Fig 3 shows the distribution of point estimates for 4Ner for data simulated under the four caricature models ( and ) and likelihoods estimated under three different values of θ: 0.01, 0.1, and 0.5. In Table 1 we also present the median and proportion of estimates that are within a factor of two from the true value, along with the mean and standard deviation of estimates of θ obtained from Equation 1.



Figure 3. The distribution of CLEs of the population recombination rate simulated and analyzed under different models of sequence evolution. Each chart represents the results from 1000 data sets simulated with . The model of sequence evolution used to simulate data is on the left and the value of θ used to calculate likelihoods under the two-allele symmetric reversible model is at the top of the columns.


 

 
Table 1. Statistical properties of the composite-likelihood estimator

As expected, when there is a considerable discrepancy between the true value of θ and that used to estimate likelihoods, estimates of 4Ner are strongly biased. When the true value of θ is lower than the value used to estimate likelihoods, estimates of 4Ner are downwardly biased. In contrast, when the true value of θ is greater than the value used to estimate likelihoods, estimates of 4Ner are biased upward. However, it is encouraging to find that when likelihoods are estimated under the correct value of θ, the estimator performs almost as well when the mutation rate is very high as it does when the mutation rate is low (Fig 3, bottom right vs. top left).

The middle two rows of Fig 3 and Table 1 show the effects of applying the simplistic mutation model to data simulated under models representing some degree of biological complexity. For both the hypermutable and complex models there is strong rate variation across sites, yet the estimator properties are hardly worse than under the best-case scenario, and the estimates of θ are well within the range that leads to sensible estimates of 4Ner. In short, the composite-likelihood estimator of the population recombination rate is robust to minor misspecification of the underlying mutation model. This conclusion is of great importance as it provides a justification of the use of the CLE on real data sets.

Detecting recombination:
The results presented above may give us some confidence that the value of 4Ner estimated by the composite-likelihood method is meaningful, even in genomes where the rate of recurrent mutation is high. However, one important question that is difficult to address within the CLE framework is whether one can reject the hypothesis that 4Ner = 0. Direct experimental evidence for recombination may be difficult to obtain for many genomes (particularly if genetic exchange is very rare); thus it is important to have indirect, population genetic-based methods for detecting recombination. It is equally important that such methods should not create false positives through misspecification of the model of sequence evolution.

We have proposed the likelihood permutation test as a means of testing for the presence of recombination. Table 2 shows the results of the power analysis carried out on the same four caricatures of sequence evolution, again estimating likelihoods under the three values of θ. We also compare the power of the likelihood permutation test to other permutation-based tests for recombination that consider summaries of the data sensitive to the presence of recombination.


 

 
Table 2. Power analysis of permutation tests for detecting recombination

The key result is that the likelihood permutation test is consistently the most powerful permutation-based method for detecting recombination from population genetic data. In the case of infinite-sites data, recombination is detected in almost 96% of cases, compared to ~80% for the other tests. Even when the model used to estimate likelihoods is very different from the true model, the power of the test is considerable. For example, with data generated by the finite-sites model with , recombination is detected in 83% of cases when the correct value of θ is used to calculate likelihoods, compared to 82% of cases when is used to estimate likelihoods. In contrast, those methods that rely heavily on the distribution of pairs at which all four gametes are present (|D'| and G4) have greatly reduced power under such high levels of mutation (51 and 39%, respectively). The one situation where the likelihood permutation test has reduced power is when the true value of θ is much lower than that used to estimate likelihoods; however, such a situation is unlikely to occur for empirical data. It is also worth noting that the power to detect recombination using the correlation between r2 and physical distance is consistently greater than with either |D'| or G4 for the biologically plausible models of sequence evolution.


*  DISCUSSION AND APPLICATION

The composite-likelihood method and likelihood permutation test together present a powerful approach for assessing the influence of recombination on patterns of genetic variability. Even when the mutational and substitutional processes affecting gene sequence evolution are complex and unlikely to be fully characterized by any simple model, the use of simple models provides a remarkably robust way of detecting recombination and estimating the population recombination rate. To investigate how the new approach performs on real data, we have applied the methods to samples of gene sequences from the viruses HIV1, HIV2, hepatitis C, dengue-1, and measles, the bacterium H. pylori, and human mitochondrial DNA. We also discuss possible limitations of the approach, in particular misspecification of the population model used to estimate the likelihoods.

Empirical data:
The empirical data sets were chosen to reflect a diversity of levels of recombination, as had been estimated from previous studies (MAYNARD SMITH et al. 1993; SUERBAUM et al. 1998; AWADALLA et al. 1999; WOROBEY et al. 1999; INGMAN et al. 2000; WOROBEY 2001). For the HIV data sets, we analyzed third position sites in the coding region separately from the first two positions, to investigate whether different results were obtained from using data with different levels of diversity. In addition, we analyzed two human mtDNA data sets that have been used to provide evidence for (AWADALLA et al. 1999) and against (INGMAN et al. 2000) recombination. In all cases, a gene-conversion model for recombination is more appropriate than a crossing-over model, and we have fixed the average tract length of gene conversion to 100 bp for the viral and bacterial data sets and 500 bp for the mtDNA data sets. These numbers are arbitrary, although in the microbial and viral data sets, the composite likelihood increases for small tract lengths (data not shown). In one of the few cases in eukaryotes where gene conversion tract lengths have been estimated, the best fit to the data was a geometric distribution with mean tract length of 352 bp (HILLIKER et al. 1994).

Table 3 presents the results of these analyses and the estimate of the population recombination rate, {gamma}, under a gene conversion type model; see Equation 5. In addition, we carried out the same analyses, but filtering out single nucleotide polymorphisms (SNPs) for which the minor allele was at a frequency <0.1; the results are presented in Table 4. For the HCV and dengue virus data sets the results from the filtered analysis are identical to those in Table 3, because the sample sizes are <10 and so no segregating site can have a minor allele frequency below 0.1. We also omitted the results for the test of MEUNIER and EYRE-WALKER (2001) as it behaves in an almost identical fashion to |D'|.
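The frequency filter described above can be sketched as follows. This is an illustrative sketch only, not the code used for the analyses: the function name is hypothetical, and it assumes aligned sequences of equal length and keeps only biallelic sites.

```python
from collections import Counter


def sites_passing_maf_filter(seqs, min_maf=0.1):
    """Return indices of biallelic sites whose minor allele frequency
    is at least min_maf (assumption: sites with more than two alleles
    are simply skipped)."""
    n = len(seqs)
    kept = []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs)
        if len(counts) != 2:
            continue  # monomorphic or multi-allelic site
        if min(counts.values()) / n >= min_maf:
            kept.append(i)
    return kept
```

With fewer than 10 sequences, even a singleton has frequency 1/n > 0.1, so the filter removes no segregating sites. With 20 sequences, a singleton has frequency 0.05 and is dropped.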


 
Table 3. Detecting recombination in empirical data


 
Table 4. Detecting recombination with mutations at intermediate frequencies

From Table 3 and, more noticeably, from Table 4, we find evidence for recombination in almost all data sets and levels of recombination that range from in HCV to > 100 in HIV1 ( was chosen as a cutoff as it is the limit for which likelihoods were estimated). In HCV, only the correlation of r2 with distance shows a significant negative relationship, but with six sequences, there is little power in the likelihood permutation test. For the measles data set, only r2 is significant when all data are used, but all tests are either significant, or marginally significant, for the filtered data. The other data sets show evidence for much higher levels of recombination. The estimate of {gamma} is >40 for H. pylori and 60 for dengue. The ratio /W gives an indication of the relative likelihood of a nucleotide experiencing a recombination event relative to mutation. Within the data sets for which there is strong support for recombination, the ratio varies from ~35 in measles to ~1000 in dengue and H. pylori and is potentially much higher in HIV1.

The effect of filtering out rare variants is worth noting. Rare variants are largely uninformative about recombination (though not entirely; MCVEAN 2001), and hence their inclusion may obscure the signal of recombination, particularly if there is an excess of rare mutations in the data. Removal of rare variants from the data has little effect on estimates of the population recombination rate in both the empirical (compare estimates of {gamma} from Table 3 and Table 4) and simulated data. For example, under the finite-sites model, the median of estimates of {gamma} was 9.8 when all sites were used (and analyzed under the correct mutation model) and 10.2 when the analysis was restricted to sites for which the minor allele frequency was at least 0.1. In the simulated data, no increase in the power of the likelihood permutation test was found when the analysis was restricted to intermediate frequency variants. However, the simulated data sets have no excess of rare variants, unlike the empirical data.

Very high levels of recombination in HIV:
The results concerning recombination in HIV1 subtype B and HIV2 subtype A sequences are particularly notable. Although recombination between different subtypes is occasionally observed (KUIKEN et al. 2000), recombination within subtypes has largely been ignored in phylogenetic analysis of genetic diversity (NIELSEN and YANG 1998; RAMBAUT et al. 2001). The results presented here do not support ignoring it. Using the likelihood permutation test, we find evidence for recombination in both HIV2 and HIV1, though only when SNPs are filtered for the case of HIV1. For HIV1 the estimate of {gamma} is beyond the range for which likelihoods were estimated.

Levels of genetic diversity are extremely high in HIV1 and HIV2 (estimates of {theta} per site at first/second codon positions of 0.144 and 0.102, respectively). Because recurrent mutation can cause patterns of genetic diversity similar to that caused by recombination, one might be cautious of concluding that recombination is present. However, the estimation of a low level of recombination in HCV, which has an even higher level of diversity , and in measles, which has a comparable level of sequence diversity , indicates that high levels of sequence diversity do not necessarily lead to high estimates of the population recombination rate.

The implications of such a high level of recombination in HIV1 are considerable. Not only does it question the validity of conclusions about the age and timings of events in the history of the virus that have been made assuming an absence of recombination (NIELSEN and YANG 1998; RAMBAUT et al. 2001), but it has practical implications for predicting how fast mutations (such as drug resistance) may spread across different genetic backgrounds. Analysis of genetic data from appropriate samples taken at different population scales will be essential for inferring the extent and consequences of recombination.

Recombination in human mtDNA?
Another issue of considerable importance is whether there is evidence for recombination in human mtDNA. The data set of AWADALLA et al. (1999) clearly shows evidence for recombination when all data are used, irrespective of the test employed (for r2 and the likelihood permutation test this is also true for >90% of random subsets of 35 of the 45 sequences). In direct contrast, the data of INGMAN et al. (2000) show no evidence for recombination, irrespective of the test used. When the frequency filter is applied, only one statistic, r2, still shows evidence for recombination in the first data set (and this is sensitive to the removal of a single segregating site). These results are in direct contrast to those from the viral and bacterial sequences, where the frequency filter increases the power of almost all tests. Taken together, the results suggest a lack of evidence for recombination in human mtDNA.

Why should low frequency variants create the impression of recombination? HEY (2000) suggested that sequencing protocols might lead to the propagation of correlated errors. Such an effect may be enhanced by the combination of sequences from multiple laboratories (because recurrent errors will be strongly correlated), and for this reason, the data collected and sequenced by INGMAN et al. (2000) are preferable. Given that sequencing errors tend to be at low frequency, this may explain why three of the four tests are significant only if all the data are analyzed, but it does not explain (beyond chance) why r2 still shows a significant relationship with distance when only high frequency variants are used. MCVEAN (2001) suggested that bouts of local adaptive evolution might lead to correlated mutations and a relationship between physical distance and linkage disequilibrium as measured by r2. How adaptive evolution influences patterns of linkage disequilibrium and the measurement and detection of recombination is an important problem.

Misspecification of the population model:
While the properties of the composite-likelihood estimator of the population recombination rate have been examined across a variety of models of sequence evolution, no mention has been made so far as to how robust the methods described here may be to deviations from the population model. Coalescent estimation of likelihoods assumes that a random sample has been taken from a population of constant size, with random mating, no migration to or from different populations, and no natural selection. In reality, none of these assumptions are tenable, although several deviations from the standard neutral model (such as fluctuating population size) can be approximated as having an effect on the effective population size, Ne.

Population growth, strong geographical structuring, and nonrandom representation of gene sequences in the databases are potentially important concerns for the use of coalescent methods. Sampling of sequences specifically for population genetic analysis will overcome the problems of nonrandom database representation; however, inadequacies in the demographic model are more problematic. Population growth tends to decrease linkage disequilibrium while population structure tends to increase linkage disequilibrium (e.g., PRITCHARD and PRZEWORSKI 2001). Consequently, one might expect estimates of the population recombination rate (and the ability to detect recombination) to be sensitive to the demographic history of the population.

While no exhaustive attempt is made here to characterize the behavior of the CLE under misspecified population models, it is possible to ask whether the data sets analyzed show evidence for deviation from the neutral model in terms of the allele frequency spectrum. This can most simply be assessed through the use of Tajima's D statistic, which compares estimates of the population mutation rate derived from the number of segregating sites and the average pairwise differences. A negative value of the statistic indicates an excess of rare variants and the possibility of population growth, and a positive value suggests population structure may be important.
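For readers unfamiliar with the statistic, the sketch below computes Tajima's D from a set of aligned sequences using the standard constants from Tajima's 1989 formulation. It is a minimal illustration, not the code used for Table 3: the function name is hypothetical, sequences are assumed to be aligned strings of equal length with no missing data, and the number of segregating sites is used directly without any correction for multiple hits.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned sequences (requires at least 4 sequences)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("Tajima's D is undefined for fewer than 4 sequences")
    length = len(seqs[0])

    # S: number of segregating sites.
    S = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    if S == 0:
        raise ValueError("no segregating sites")

    # pi: average number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants that depend only on the sample size.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Difference between the two estimates of the population mutation
    # rate, scaled by its estimated standard deviation.
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))
```

A sample in which every variant is a singleton gives a negative value, while a sample split evenly between two divergent haplotypes gives a positive value, matching the interpretation given above.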

Table 3 includes the value of Tajima's D statistic for the data sets analyzed, and indicates the significance level estimated assuming no recombination. While the statistic is negative for all data sets, it is only significantly so for measles, HIV1, and the two mtDNA data sets. However, the variance of the statistic is reduced by recombination (so reducing the confidence limits under the null model). Other data sets (particularly the HIV2 data) may therefore also reflect significant deviations from the standard neutral model. However, those data sets that show evidence for a departure from the standard neutral model also reflect the full diversity of estimated recombination rates. In short, while departure from the assumed demographic model may have some influence on the estimate of the population recombination rate, it is unlikely to be confused with the signal of recombination.


*  ACKNOWLEDGMENTS

We thank Michael Worobey for the generous supply of empirical data sets and important insights. In addition, we thank Dick Hudson, Molly Przeworski, and two reviewers for discussion and comments on the manuscript. G.M. is funded by the Royal Society and P.A. is funded by the Wellcome Trust. The programs pairwise and permute used to estimate the population recombination rate and test for recombination are available within the LDhat package, which can be downloaded from http://www.stats.ox.ac.uk/~mcvean.

Manuscript received October 2, 2001; Accepted for publication January 7, 2002.


*  LITERATURE CITED

AWADALLA, P., A. EYRE-WALKER, and J. MAYNARD SMITH, 1999 Linkage disequilibrium and recombination in hominid mitochondrial DNA. Science 286:2524-2525.

AWADALLA, P., A. EYRE-WALKER and J. MAYNARD SMITH, 2000 Questioning evidence for recombination in human mitochondrial DNA—reply. Science 288: 1931a.

BRAVERMAN, J. M., R. R. HUDSON, N. L. KAPLAN, C. H. LANGLEY, and W. STEPHAN, 1995 The hitchhiking effect on the site-frequency spectrum of DNA polymorphisms. Genetics 140:783-796.

BROWN, C. J., E. C. GARNER, A. K. DUNKER, and P. JOYCE, 2001 The power to detect recombination using the coalescent. Mol. Biol. Evol. 18:1421-1424.

CONWAY, D. J., C. ROPER, A. M. J. ODUOLA, D. E. ARNOT, and P. G. KREMSNER et al., 1999 High recombination rate in natural populations of Plasmodium falciparum. Proc. Natl. Acad. Sci. USA 96:4506-4511.

FEARNHEAD, P., and P. J. DONNELLY, 2001 Estimating recombination rates from population genetic data. Genetics 159:1299-1318.

FISHER, R. A., 1932 The Genetical Theory of Natural Selection. Oxford University Press, London.

FRISSE, L., R. R. HUDSON, A. BARTOSZEWICZ, J. D. WALL, and J. DONFACK et al., 2001 Gene conversion and different population histories may explain the contrast between polymorphism and linkage disequilibrium levels. Am. J. Hum. Genet. 69:831-843.

GRASSLY, N. C., and E. C. HOLMES, 1997 A likelihood method for the detection of selection and recombination using nucleotide sequences. Mol. Biol. Evol. 14:239-247.

GRIFFITHS, R. C., and P. MARJORAM, 1996a Ancestral inferences from samples of DNA sequences with recombination. J. Comput. Biol. 3:479-502.

GRIFFITHS, R. C., and P. MARJORAM, 1996b An ancestral recombination graph, pp. 257–270 in IMA Volume on Mathematical Population Genetics, edited by P. J. DONNELLY and S. TAVARÉ. Springer-Verlag, Berlin.

HASEGAWA, M., H. KISHINO, and T. A. YANO, 1985 Dating the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22:160-174.

HEY, J., 2000 Human mitochondrial DNA recombination: can it be true? Trends Ecol. Evol. 15:181-182.

HEY, J., and J. WAKELEY, 1997 A coalescent estimator of the population recombination rate. Genetics 145:833-846.

HILLIKER, A. J., G. HARAUZ, A. G. REAUME, M. GRAY, and S. H. CLARK et al., 1994 Meiotic gene conversion tract length distribution within the rosy locus of Drosophila melanogaster. Genetics 137:1019-1026.

HUDSON, R. R., 1987 Estimating the recombination parameter of a finite population model without selection. Genet. Res. 50:245-250.

HUDSON, R. R., 1991 Gene genealogies and the coalescent process, pp. 1–44 in Oxford Surveys in Evolutionary Biology, Vol. 7, edited by D. FUTUYMA and J. ANTONOVICS. Oxford University Press, London.

HUDSON, R. R., 2001 Two-locus sampling distributions and their application. Genetics 159:1805-1817.

HUDSON, R. R., and N. KAPLAN, 1985 Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147-164.

HUDSON, R. R., and N. L. KAPLAN, 1994 Gene trees with background selection, pp. 140–153 in Non-Neutral Evolution: Theories and Molecular Data, edited by G. B. GOLDING. Chapman & Hall, New York.

INGMAN, M., H. KAESSMANN, S. PÄÄBO, and U. GYLLENSTEN, 2000 Mitochondrial genome variation and the origin of modern humans. Nature 408:708-713.

KINGMAN, J. F. C., 1982 The coalescent. Stoch. Proc. Appl. 13:235-248.

KUHNER, M. K., J. YAMATO, and J. FELSENSTEIN, 2000 Maximum likelihood estimation of recombination rates from population data. Genetics 156:1393-1401.

KUIKEN, C., B. FOLEY, B. HAHN, P. MARX, F. MCCUTCHAN et al. (Editors), 2000 HIV Sequence Compendium 2000. Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, NM.

LEWONTIN, R. C., 1964 The interaction of selection and linkage. I. general considerations; heterotic models. Genetics 49:49-67.

MAYNARD SMITH, J., 1999 The detection and measurement of recombination from sequence data. Genetics 153:1021-1027.

MAYNARD SMITH, J., and N. H. SMITH, 1998 Detecting recombination from gene trees. Mol. Biol. Evol. 15:590-599.

MAYNARD SMITH, J., N. H. SMITH, M. O'ROURKE, and B. G. SPRATT, 1993 How clonal are bacteria? Proc. Natl. Acad. Sci. USA 90:4383-4388.

MCGUIRE, G., F. WRIGHT, and M. J. PRENTICE, 2000 A Bayesian model for detecting past recombination in DNA multiple alignments. J. Comput. Biol. 7:159-170.

MCVEAN, G. A. T., 2001 What do patterns of genetic variability reveal about mitochondrial recombination? Heredity 87:613-620.

MEUNIER, J., and A. EYRE-WALKER, 2001 The correlation between linkage disequilibrium and distance. Implications for recombination in Hominid mitochondria. Mol. Biol. Evol. 18:2132-2135.

MIYASHITA, N., and C. H. LANGLEY, 1988 Molecular and phenotypic variation of the white locus region in Drosophila melanogaster. Genetics 120:199-212.

MULLER, H. J., 1932 Some genetic aspects of sex. Am. Nat. 66:118-138.

MULLER, H. J., 1964 The relation of recombination to mutational advance. Mutat. Res. 1:2-9.

NIELSEN, R., and Z. YANG, 1998 Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148:929-936.

PRITCHARD, J., and M. PRZEWORSKI, 2001 Linkage disequilibrium in humans: models and data. Am. J. Hum. Genet. 69:1-14.

RAMBAUT, A., D. L. ROBERTSON, O. G. PYBUS, M. PEETERS, and E. C. HOLMES, 2001 Human immunodeficiency viruses. Phylogeny and origin of HIV-1. Nature 410:1047-1048.

SCHIERUP, M. H., and J. HEIN, 2000 Consequences of recombination on traditional phylogenetic analysis. Genetics 156:879-891.

SUERBAUM, S., J. MAYNARD SMITH, K. BAPUMIA, G. MORELLI, and N. H. SMITH et al., 1998 Free recombination within Helicobacter pylori. Proc. Natl. Acad. Sci. USA 95:12619-12624.

WALL, J. D., 2000 A comparison of estimators of the population recombination rate. Mol. Biol. Evol. 17:156-163.

WIUF, C., 2001 Recombination in human mitochondrial DNA? Genetics 159:749-756.

WOELK, C. H., J. LI, E. C. HOLMES, and D. W. G. BROWN, 2001 Immune and artificial selection in the hemagglutinin (h) glycoprotein of measles virus. J. Gen. Virol. 82:2463-2474.

WOROBEY, M., 2001 A novel approach to detecting and measuring recombination: new insights into evolution in viruses, bacteria and mitochondria. Mol. Biol. Evol. 18:1425-1434.

WOROBEY, M., A. RAMBAUT, and E. C. HOLMES, 1999 Widespread intraserotype recombination in natural populations of dengue virus. Proc. Natl. Acad. Sci. USA 96:7352-7357.




Automated Phylogenetic Detection of Recombination Using a Genetic Algorithm
Mol. Biol. Evol., October 1, 2006; 23(10): 1891 - 1901.
[Abstract] [Full Text] [PDF]


Home page
Mol Biol EvolHome page
B. C. Verrelli, S. A. Tishkoff, A. C. Stone, and J. W. Touchman
Contrasting Histories of G6PD Molecular Evolution and Malarial Resistance in Humans and Chimpanzees
Mol. Biol. Evol., August 1, 2006; 23(8): 1592 - 1601.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
P. L. Morrell, D. M. Toleno, K. E. Lundy, and M. T. Clegg
Estimating the Contribution of Mutation, Recombination and Gene Conversion in the Generation of Haplotypic Diversity
Genetics, July 1, 2006; 173(3): 1705 - 1723.
[Abstract] [Full Text] [PDF]


Home page
Biol. Bull.Home page
S. B. Johnson, C. R. Young, W. J. Jones, A. Waren, and R. C. Vrijenhoek
Migration, Isolation, and Speciation of Hydrothermal Vent Limpets (Gastropoda; Lepetodrilidae) Across the Blanco Transform Fault
Biol. Bull., April 1, 2006; 210(2): 140 - 157.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
T. C. Bruen, H. Philippe, and D. Bryant
A Simple and Robust Statistical Test for Detecting the Presence of Recombination
Genetics, April 1, 2006; 172(4): 2665 - 2681.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
A. P. Rooney, J. L. Swezey, R. Friedman, D. W. Hecht, and C. W. Maddox
Analysis of Core Housekeeping and Virulence Genes Reveals Cryptic Lineages of Clostridium perfringens That Are Associated With Distinct Disease Presentations
Genetics, April 1, 2006; 172(4): 2081 - 2092.
[Abstract] [Full Text] [PDF]


Home page
Mol Biol EvolHome page
A. Carvajal-Rodriguez, K. A. Crandall, and D. Posada
Recombination Estimation Under Complex Evolutionary Models with the Coalescent Composite-Likelihood Method
Mol. Biol. Evol., April 1, 2006; 23(4): 817 - 827.
[Abstract] [Full Text] [PDF]


Home page
J. Virol.Home page
C. Charpentier, T. Nora, O. Tenaillon, F. Clavel, and A. J. Hance
Extensive Recombination among Human Immunodeficiency Virus Type 1 Quasispecies Makes an Important Contribution to Viral Diversity in Individual Patients
J. Virol., March 1, 2006; 80(5): 2472 - 2482.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
D. J. Wilson and G. McVean
Estimating Diversifying Selection and Functional Constraint in the Presence of Recombination
Genetics, March 1, 2006; 172(3): 1411 - 1425.
[Abstract] [Full Text] [PDF]


Home page
Mol Biol EvolHome page
D. H. Bos and B. Waldman
Evolution by Recombination and Transspecies Polymorphism in the MHC Class I Gene of Xenopus laevis
Mol. Biol. Evol., January 1, 2006; 23(1): 137 - 143.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
N. G. C. Smith and P. Fearnhead
A Comparison of Three Estimators of the Population-Scaled Recombination Rate: Accuracy and Robustness
Genetics, December 1, 2005; 171(4): 2051 - 2062.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
G. M. Clarke and L. R. Cardon
Disentangling Linkage Disequilibrium and Linkage From Dense Single-Nucleotide Polymorphism Trio Data
Genetics, December 1, 2005; 171(4): 2085 - 2095.
[Abstract] [Full Text] [PDF]


Home page
Mol Biol EvolHome page
R. J. Whitaker, D. W. Grogan, and J. W. Taylor
Recombination Shapes the Natural Population Structure of the Hyperthermophilic Archaeon Sulfolobus islandicus
Mol. Biol. Evol., December 1, 2005; 22(12): 2354 - 2361.
[Abstract] [Full Text] [PDF]


Home page
Proc R Soc BHome page
J. Slate
Molecular evolution of the sheep prion protein gene
Proc R Soc B, November 22, 2005; 272(1579): 2371 - 2377.
[Abstract] [Full Text] [PDF]


Home page
Proc. Natl. Acad. Sci. USAHome page
J. Sikorski and E. Nevo
Adaptation and incipient sympatric speciation of Bacillus simplex under microclimatic contrast at "Evolution Canyons" I and II, Israel
PNAS, November 1, 2005; 102(44): 15924 - 15929.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
K. Roselius, W. Stephan, and T. Stadler
The Relationship of Nucleotide Polymorphism, Recombination Rate and Selection in Wild Tomato Species
Genetics, October 1, 2005; 171(2): 753 - 763.
[Abstract] [Full Text] [PDF]


Home page
Phil Trans R Soc BHome page
R. W Lawrence, D. M Evans, and L. R Cardon
Prospects and pitfalls in whole genome association studies
Phil Trans R Soc B, August 29, 2005; 360(1460): 1589 - 1595.
[Abstract] [Full Text] [PDF]


Home page
Phil Trans R Soc BHome page
M. De Iorio, E. de Silva, and M. P.H Stumpf
Recombination hotspots as a point process
Phil Trans R Soc B, August 29, 2005; 360(1460): 1597 - 1603.
[Abstract] [Full Text] [PDF]


Home page
Mol Biol EvolHome page
J. Mu, D. A. Joy, J. Duan, Y. Huang, J. Carlton, J. Walker, J. Barnwell, P. Beerli, M. A. Charleston, O. G. Pybus, et al.
Host Switch Leads to Emergence of Plasmodium vivax Malaria in Humans
Mol. Biol. Evol., August 1, 2005; 22(8): 1686 - 1693.
[Abstract] [Full Text] [PDF]


Home page
Phil Trans R Soc BHome page
G. A.T McVean and N. J Cardin
Approximating the coalescent with recombination
Phil Trans R Soc B, July 29, 2005; 360(1459): 1387 - 1393.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
L. Zhu and C. D. Bustamante
A Composite-Likelihood Approach for Detecting Directional Selection From DNA Sequence Data
Genetics, July 1, 2005; 170(3): 1411 - 1421.
[Abstract] [Full Text] [PDF]


Home page
Mol Biol EvolHome page
A. D. Tsaousis, D. P. Martin, E. D. Ladoukakis, D. Posada, and E. Zouros
Widespread Recombination in Published Animal mtDNA Sequences
Mol. Biol. Evol., April 1, 2005; 22(4): 925 - 933.
[Abstract] [Full Text] [PDF]


Home page
Mol Biol EvolHome page
S. Consuegra, H.-J. Megens, H. Schaschl, K. Leon, R. J. M. Stet, and W. C. Jordan
Rapid Evolution of the MH Class I Locus Results in Different Allelic Compositions in Recently Diverged Populations of Atlantic Salmon
Mol. Biol. Evol., April 1, 2005; 22(4): 1095 - 1106.
[Abstract] [Full Text] [PDF]


Home page
Mol Biol EvolHome page
K. A. Jolley, D. J. Wilson, P. Kriz, G. Mcvean, and M. C. J. Maiden
The Influence of Mutation, Recombination, Population History, and Selection on Patterns of Genetic Diversity in Neisseria meningitidis
Mol. Biol. Evol., March 1, 2005; 22(3): 562 - 569.
[Abstract] [Full Text] [PDF]


Home page
Proc. Natl. Acad. Sci. USAHome page
C. Fraser, W. P. Hanage, and B. G. Spratt
Neutral microepidemic evolution of bacterial pathogens
PNAS, February 8, 2005; 102(6): 1968 - 1973.
[Abstract] [Full Text] [PDF]


Home page
ScienceHome page
K. K. Shimizu, J. M. Cork, A. L. Caicedo, C. A. Mays, R. C. Moore, K. M. Olsen, S. Ruzsa, G. Coop, C. D. Bustamante, P. Awadalla, et al.
Darwinian Selection on a Selfing Locus
Science, December 17, 2004; 306(5704): 2081 - 2084.
[Abstract] [Full Text] [PDF]


Home page
Mol Biol EvolHome page
G. Piganeau, M. Gardner, and A. Eyre-Walker
A Broad Survey of Recombination in Animal Mitochondria
Mol. Biol. Evol., December 1, 2004; 21(12): 2319 - 2325.
[Abstract] [Full Text] [PDF]


Home page
Am. J. Bot.Home page
C. R. Linder and L. H. Rieseberg
Reconstructing patterns of reticulate evolution in plants.
Am. J. Botany, October 1, 2004; 91: 1700 - 1708.
[Abstract] [Full Text] [PDF]


Home page
Mol Biol EvolHome page
S. R. Bordenstein and J. J. Wernegreen
Bacteriophage Flux in Endosymbionts (Wolbachia): Infection Frequency, Lateral Transfer, and Recombination Rates
Mol. Biol. Evol., October 1, 2004; 21(10): 1981 - 1991.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
D. Shriner, A. G. Rodrigo, D. C. Nickle, and J. I. Mullins
Pervasive Genomic Recombination of HIV-1 in Vivo
Genetics, August 1, 2004; 167(4): 1573 - 1583.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
P. Fearnhead, R. M. Harding, J. A. Schneider, S. Myers, and P. Donnelly
Application of Coalescent Methods to Reveal Fine-Scale Rate Variation and Recombination Hotspots
Genetics, August 1, 2004; 167(4): 2067 - 2081.
[Abstract] [Full Text] [PDF]


Home page
J. Virol.Home page
J. M. Burrows, L. Bromham, M. Woolfit, G. Piganeau, J. Tellam, G. Connolly, N. Webb, L. Poulsen, L. Cooper, S. R. Burrows, et al.
Selection Pressure-Driven Evolution of the Epstein-Barr Virus-Encoded Oncogene LMP1 in Virus Isolates from Southeast Asia
J. Virol., July 1, 2004; 78(13): 7131 - 7137.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
P. Lemey, O. G. Pybus, A. Rambaut, A. J. Drummond, D. L. Robertson, P. Roques, M. Worobey, and A.-M. Vandamme
The Molecular Population Genetics of HIV-1 Group O
Genetics, July 1, 2004; 167(3): 1059 - 1068.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
J. D. Wall
Estimating Recombination Rates Using Three-Site Likelihoods
Genetics, July 1, 2004; 167(3): 1461 - 1473.
[Abstract] [Full Text] [PDF]


Home page
J. Gen. Virol.Home page
D. T. Haydon, A. D. S. Bastos, and P. Awadalla
Low linkage disequilibrium indicative of recombination in foot-and-mouth disease virus gene sequence alignments
J. Gen. Virol., May 1, 2004; 85(5): 1095 - 1100.
[Abstract] [Full Text] [PDF]


Home page
Appl. Environ. Microbiol.Home page
S. F. Sarkar and D. S. Guttman
Evolution of the Core Genome of Pseudomonas syringae, a Highly Clonal, Endemic Plant Pathogen
Appl. Envir. Microbiol., April 1, 2004; 70(4): 1999 - 2012.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
E. S. Balakirev and F. J. Ayala
Nucleotide Variation in the tinman and bagpipe Homeobox Genes of Drosophila melanogaster
Genetics, April 1, 2004; 166(4): 1845 - 1856.
[Abstract] [Full Text] [PDF]


Home page
Hum Mol GenetHome page
X. Ke, S. Hunt, W. Tapper, R. Lawrence, G. Stavrides, J. Ghori, P. Whittaker, A. Collins, A. P. Morris, D. Bentley, et al.
The impact of SNP density on fine-scale patterns of linkage disequilibrium
Hum. Mol. Genet., March 15, 2004; 13(6): 577 - 588.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
E. S. Balakirev and F. J. Ayala
Nucleotide Variation of the Est-6 Gene Region in Natural Populations of Drosophila melanogaster
Genetics, December 1, 2003; 165(4): 1901 - 1914.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
N. Li and M. Stephens
Modeling Linkage Disequilibrium and Identifying Recombination Hotspots Using Single-Nucleotide Polymorphism Data
Genetics, December 1, 2003; 165(4): 2213 - 2233.
[Abstract] [Full Text] [PDF]


Home page
Mol Biol EvolHome page
D. Charlesworth, C. Bartolome, M. H. Schierup, and B. K. Mable
Haplotype Structure of the Stigmatic Self-Incompatibility Gene in Natural Populations of Arabidopsis lyrata
Mol. Biol. Evol., November 1, 2003; 20(11): 1741 - 1753.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
S. D. Polley, W. Chokejindachai, and D. J. Conway
Allele Frequency-Based Analyses Robustly Map Sequence Sites Under Balancing Selection in a Malaria Vaccine Candidate Antigen
Genetics, October 1, 2003; 165(2): 555 - 561.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
D. Charlesworth, B. K. Mable, M. H. Schierup, C. Bartolome, and P. Awadalla
Diversity and Linkage of Genes in the Self-Incompatibility Gene Family in Arabidopsis lyrata
Genetics, August 1, 2003; 164(4): 1519 - 1535.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
M. Anisimova, R. Nielsen, and Z. Yang
Effect of Recombination on the Accuracy of the Likelihood Method for Detecting Positive Selection at Amino Acid Sites
Genetics, July 1, 2003; 164(3): 1229 - 1236.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
E. S. Balakirev, V. R. Chechetkin, V. V. Lobzin, and F. J. Ayala
DNA Polymorphism in the {beta}-Esterase Gene Cluster of Drosophila melanogaster
Genetics, June 1, 2003; 164(2): 533 - 544.
[Abstract] [Full Text] [PDF]