The Problem of Counting Sites in the Estimation of the Synonymous and Nonsynonymous Substitution Rates: Implications for the Correlation Between the Synonymous Substitution Rate and Codon Usage Bias
Nicolas Bierne and Adam Eyre-Walker
Centre for the Study of Evolution and School of Biological Sciences, University of Sussex, Brighton BN1 9QG, United Kingdom
Corresponding author: Adam Eyre-Walker, University of Sussex, Brighton BN1 9QG, United Kingdom. E-mail: a.c.eyre-walker{at}sussex.ac.uk
| ABSTRACT |
|---|
Most methods for estimating the rate of synonymous and nonsynonymous substitution per site define a site as a mutational opportunity: the proportion of sites that are synonymous is equal to the proportion of mutations that would be synonymous under the model of evolution being considered. Here we demonstrate that this definition of a site can give misleading results and that a physical definition of site should be used in some circumstances. We illustrate our point by reexamining the relationship between codon usage bias and the synonymous substitution rate. It has recently been shown that the rate of synonymous substitution, calculated using the Goldman-Yang method, which encapsulates the mutational-opportunity definition of a site at a high level of sophistication, is either positively correlated or uncorrelated to synonymous codon bias in Drosophila. Using other methods, which account for synonymous codon bias but define a site physically, we show that there is a negative correlation between the synonymous substitution rate and codon bias and that the lack of a negative correlation using the Goldman-Yang method is due to the way in which the number of synonymous sites is counted. We also show that there is a positive correlation between the synonymous substitution rate and third position GC content in mammals, but that the relationship is considerably weaker than that obtained using the Goldman-Yang method. We argue that the Goldman-Yang method is misleading in this context and conclude that methods that rely on a mutational-opportunity definition of a site should be used with caution.
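THERE are many different methods designed to estimate the rate of synonymous and nonsynonymous substitution (Appendix A). Most of these methods define a site as a mutational opportunity: the proportion of sites that are synonymous is equal to the proportion of mutations that would be synonymous under the model of evolution being considered.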
An alternative way to proceed is to define sites "physically" and to estimate the rates of substitution at sites of different degeneracy separately. Thus we estimate rates of synonymous substitution at twofold and fourfold sites independently with the number of sites, in each case, being the actual number of sites that are twofold and fourfold degenerate. One could also estimate the synonymous substitution rate at threefold degenerate sites but there are usually too few of them to warrant consideration. For nonsynonymous sites it is usual to estimate the rate per codon (Appendix B).
The aim of this article is to compare these two ways in which we can define a site: as a mutational opportunity or as a physical position. Counting sites as mutational opportunities seems a sensible way to proceed: if the ts/tv ratio is very high, most mutations at a twofold degenerate site are synonymous and the site should therefore be treated as largely synonymous. However, this definition of a site can give anomalous and misleading results. To illustrate the problem let us consider a simple model. For clarity and simplicity we assume that synonymous mutations are neutral and that nonsynonymous mutations are either neutral or deleterious. Let us assume that all codons are twofold degenerate, that each transversion mutation occurs at rate x per nucleotide site, and that the ts/tv ratio is $\kappa$; i.e., if $\kappa$ = 1, each transition (e.g., C → T) occurs at the same rate as each transversion (e.g., C → A). Under this model the nonsynonymous and synonymous mutation rates per gene are, respectively,

$$M_n = \frac{2L}{3}(2+\kappa)x + \frac{L}{3}2x = \frac{2L}{3}(3+\kappa)x, \qquad M_s = \frac{L}{3}\kappa x \tag{1}$$

where L is the length of the gene in nucleotides. In the methods that define a site as a mutational opportunity (Appendix A), the proportion of sites that are synonymous is the proportion of mutations that are synonymous:

$$\rho_s = \frac{M_s}{M_s + M_n} = \frac{\kappa}{3(2+\kappa)} \tag{2}$$

This gives the expected results under the philosophy of counting sites as mutational opportunities; if transitions and transversions are equally frequent, then $\rho_s$ = 1/9 (the third position is one-third synonymous), and if transitions greatly outnumber transversions, then $\rho_s$ approaches 1/3 (the third position is completely synonymous). The numbers of synonymous and nonsynonymous sites are

$$L_s = \rho_s L, \qquad L_n = (1-\rho_s)L \tag{3}$$

If the proportion of nonsynonymous mutations that are neutral is $\phi$, then the rates of synonymous and nonsynonymous substitution per gene are

$$D_s = M_s, \qquad D_n = \phi M_n \tag{4}$$

Thus the rates per site are

$$d_s = \frac{D_s}{L_s} = (2+\kappa)x, \qquad d_n = \frac{D_n}{L_n} = \phi(2+\kappa)x \tag{5}$$

As expected under this definition of site, the nonsynonymous substitution rate per site equals the synonymous rate (i.e., dn = ds) when $\phi$ = 1. However, this definition can give misleading results. Consider two genes; imagine that they both have similar rates of transversion mutation, but that the ts/tv ratio is 1 in the first and 5 in the second. Under this model, the rate of synonymous substitution is 5 times greater in the second gene than in the first, because all synonymous mutations are transitions, and transitions occur 5 times more frequently in the second gene. However, the estimate of synonymous substitution rate per site, ds, is 3x in the first gene and 7x in the second; i.e., the synonymous substitution rate per site in the second gene is estimated to be only 2.3 times that in the first, whereas in reality it is 5 times higher. The definition of a site as a mutational opportunity is misleading in this context; it does not reflect the true biology. The reason for the discrepancy is that, while the number of synonymous substitutions is 5 times higher in the second gene, the proportion of sites that are synonymous is also higher: it is 0.11 in the first gene and 0.24 in the second. However, the physical number of twofold degenerate sites is the same in the two genes, and if we had counted just the number of substitutions per physical site we would have gotten the answer we expected.
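The two-gene comparison can be checked numerically. The following is a minimal sketch, assuming the twofold-degenerate model described above with each transversion at rate x and the transition at rate kappa * x; the function name `simple_model` and the symbols `kappa`, `phi`, and `rho_s` are labels chosen here for illustration, not notation confirmed by the source.

```python
from fractions import Fraction


def simple_model(kappa, x=Fraction(1), L=Fraction(300), phi=Fraction(1)):
    """Toy model in which every codon is twofold degenerate.

    Assumptions (illustrative): each transversion occurs at rate x, the
    transition at rate kappa * x, L is the gene length in nucleotides, and
    phi is the proportion of nonsynonymous mutations that are neutral.
    """
    kappa = Fraction(kappa)
    # Synonymous mutations are the transitions at third positions.
    m_s = L / 3 * kappa * x
    # Nonsynonymous: everything at positions 1 and 2, plus third-position
    # transversions.
    m_n = 2 * L / 3 * (2 + kappa) * x + L / 3 * 2 * x
    rho_s = m_s / (m_s + m_n)                    # proportion of synonymous "sites"
    ds_opportunity = m_s / (rho_s * L)           # rate per mutational-opportunity site
    dn_opportunity = phi * m_n / ((1 - rho_s) * L)
    ds_physical = m_s / (L / 3)                  # rate per physical twofold site
    return rho_s, ds_opportunity, dn_opportunity, ds_physical


if __name__ == "__main__":
    for kappa in (1, 5):
        rho_s, ds_opp, dn_opp, ds_phys = simple_model(kappa)
        print(f"kappa={kappa}: rho_s={float(rho_s):.2f}, "
              f"ds per opportunity={ds_opp}x, ds per physical site={ds_phys}x")
```

Under these assumptions the per-opportunity rate rises from 3x to 7x between the two genes, while the rate per physical twofold site rises fivefold, which is the discrepancy described in the text.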
Unfortunately, the definition of a site can be critical to our understanding of a problem. To illustrate this we reconsider the relationship between the rate of synonymous substitution and codon usage bias in Drosophila and mammals. Until recently it was generally accepted that the synonymous substitution rate was negatively correlated to the level of synonymous codon bias in enteric bacteria.
A similar revision has taken place in mammals. It was originally thought that the relationship between codon usage bias, measured as third-position GC content (GC3), and the synonymous substitution rate was a negative quadratic, with the maximum substitution rate being obtained at a GC3 value of approximately 60%.
The lack of a negative correlation between synonymous codon bias and the synonymous substitution rate is puzzling because there is a negative correlation between the nonsynonymous substitution rate and codon usage bias in Drosophila.
As we show here, the discrepancy between the relationships we see with the nonsynonymous and the synonymous substitution rates and codon usage bias, in Drosophila, is due to the definition of a site. If we use a physical definition of a site there is a negative correlation between codon usage bias and both the synonymous and nonsynonymous substitution rates in Drosophila; however, the correlation disappears if we use a mutational-opportunity definition of a site. Which of these definitions is more informative is a question we return to in the DISCUSSION.
| MATERIALS AND METHODS |
|---|
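Materials:
We analyzed two Drosophila data sets, D. melanogaster-D. pseudoobscura and D. simulans-D. yakuba, and a primate-artiodactyl data set for mammals (see RESULTS).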
Methods:
There are potentially a number of different ways in which we can estimate the synonymous substitution rate under a physical-sites model (see Appendix B and DISCUSSION). Here we use a simple method. We estimate the rate of synonymous substitution at twofold and fourfold degenerate sites separately. We restrict our analysis to those codons that code for the same amino acid in the two species being considered and we consider only synonymous changes at the third codon position. In restricting our analysis to codons that have no nonsynonymous differences we are assuming that the codon has undergone no amino acid substitution; this is a reasonable assumption given the level of amino acid divergence in the data sets we analyze. We use nucleotide-based methods that take into account the major feature of the codon usage bias in Drosophila and mammals; i.e., the bias toward G- and C-ending codons. For fourfold degenerate sites we used the method of Tamura, which gives the estimate $d^{T}_{s4}$; for twofold degenerate sites we used the method of Bulmer, which gives

$$d^{B}_{s2} = -b \ln\left(1 - \frac{p_2}{b}\right) \tag{6}$$

where $b = 2 f_2 (1 - f_2)$, p2 is the proportion of twofold sites that show a synonymous difference, and f2 is the frequency of GC at those sites. In theory we could estimate the rate of substitution for CT and AG twofolds separately, but this is unnecessary because combining them gives accurate estimates (see below). Bulmer's method corrects for GC content. We estimate the total number of synonymous substitutions per codon, for the codons analyzed, as
$$D^{BT}_{cs} = \frac{n_2 d^{B}_{s2} + n_4 d^{T}_{s4}}{n_2 + n_4} \tag{7}$$

where n2 and n4 are the numbers of twofold and fourfold degenerate sites used in the calculation of dBs2 and dTs4, respectively. We refer to these collectively as Bulmer and Tamura (BT) methods.
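The BT calculation can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the twofold correction is assumed to take the standard two-state form d = -b ln(1 - p/b) with b = 2f(1 - f), the fourfold correction uses the published Tamura (1992) distance with transitional (P) and transversional (Q) differences, and all function and argument names are hypothetical.

```python
import math


def bulmer_twofold(p2, f2):
    """Two-state multiple-hit correction for twofold sites (assumed form).

    p2: proportion of twofold sites showing a synonymous difference.
    f2: GC frequency at those sites.
    """
    b = 2.0 * f2 * (1.0 - f2)
    return -b * math.log(1.0 - p2 / b)


def tamura_fourfold(P, Q, gc):
    """Tamura (1992) distance for fourfold sites.

    P, Q: proportions of transitional and transversional differences.
    gc:   GC frequency at the sites analyzed.
    """
    h = 2.0 * gc * (1.0 - gc)
    return (-h * math.log(1.0 - P / h - Q)
            - 0.5 * (1.0 - h) * math.log(1.0 - 2.0 * Q))


def per_codon_rate(d2, n2, d4, n4):
    """Site-weighted mean of the twofold and fourfold rates.

    Each analyzed codon contributes one third-position site, so the
    number of codons is n2 + n4.
    """
    return (n2 * d2 + n4 * d4) / (n2 + n4)


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    d2 = bulmer_twofold(p2=0.10, f2=0.70)
    d4 = tamura_fourfold(P=0.10, Q=0.05, gc=0.70)
    print(d2, d4, per_codon_rate(d2, 120, d4, 80))
```

With a GC frequency of 0.5 both corrections reduce to their equal-frequency counterparts (the two-state Jukes-Cantor analogue and the Kimura two-parameter distance), which is a convenient sanity check on the formulas as written here.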
The original GY maximum-likelihood estimates of divergences were kindly provided by Katherine Dunn and Joe Bielawski; these were the number of synonymous (dGYs) and nonsynonymous (dGYn) substitutions per site and the estimated numbers of synonymous (LGYs) and nonsynonymous (LGYn) sites. In each case the substitution rates were estimated using the nucleotide frequencies at each codon position (F3x4 model). Because the number of codons is $(L^{GY}_s + L^{GY}_n)/3$, the GY estimate of the number of synonymous substitutions per codon is

$$D^{GY}_{cs} = \frac{3\, d^{GY}_s L^{GY}_s}{L^{GY}_s + L^{GY}_n} \tag{8}$$
For purposes of comparison with previous studies, codon usage bias is measured here as the effective number of codons (ENC) in Drosophila and as GC3 in mammals.
| RESULTS |
|---|
Drosophila:
Using the BT methods we find that the rate of synonymous substitution at both twofold and fourfold degenerate codons is positively correlated to ENC for both the D. melanogaster-D. pseudoobscura and D. simulans-D. yakuba data sets; i.e., the synonymous substitution rate per physical site is negatively correlated to codon usage bias. In contrast, the GY estimate of the synonymous substitution rate is not correlated to codon bias in either data set (Fig 1).
The discrepancy between the methods is not due to problems with the correction for multiple hits because both methods give similar estimates for the number of synonymous substitutions per codon that occur in each gene, if we restrict the analysis to those codons considered by the BT methods presented here (i.e., twofold and fourfold codons with no apparent amino acid substitution; Fig 2). Furthermore, the rate of synonymous substitution per codon is significantly correlated to ENC for both the GY (Fig 3) and BT methods (results not shown). So the correlation between the synonymous substitution rate and codon bias vanishes for the GY method only when the rate is calculated per site; hence the difference between the GY and BT estimates is due to the definition of a site.
The GY method uses the mutational-opportunity definition of a site; however, it takes into account not only the ts/tv ratio but also codon usage bias in its estimate of the number of sites. As a consequence, the proportion of sites that are synonymous ($\rho_s$) is correlated to codon bias (Fig 4): as codon bias increases (i.e., ENC decreases), the proportion of sites that are synonymous decreases, which cancels out the decrease in the synonymous substitution rate per codon, to yield a synonymous substitution rate per site that is independent of codon bias.
Mammals:
The estimate of the synonymous substitution rate per site is positively correlated to codon bias using both the GY and the BT methods (Fig 5). However, the nature of the relationship is very different: the gradient is much greater for dGYs than for dBs2 or dTs4 (ANCOVA test for different slopes significant at P < 0.0001 in each case) and in fact the relationship between dGYs and GC3 is significantly nonlinear (a model including a quadratic term provides a significantly better fit to the data). Interestingly the slopes for dBs2 and dTs4 are also significantly different (P < 0.05), but neither is significantly nonlinear. As in Drosophila, the difference in the patterns seen with the GY and BT methods is due to the way in which the GY and BT methods count sites: the BT and GY methods give very similar estimates for the number of synonymous substitutions per codon if we restrict the analysis to those sites analyzed by the BT method (mean of DcGYs across genes is 4% greater than mean DcBTs). As in Drosophila the proportion of sites that are synonymous, under the GY method, decreases as codon usage bias increases (i.e., increasing GC3). This is the case even if we restrict the analysis to fourfold degenerate codons that have not undergone any amino acid substitution (Fig 6). The proportion of sites that are synonymous, among these fourfold degenerate codons, estimated by the GY method varies from 0.10 to 0.46 in the primate-artiodactyl data set (Fig 6) and yet the proportion of sites that are physically synonymous is one-third (assuming that the fourfold sites being considered have been fourfold throughout the divergence of primates and artiodactyls, which seems reasonable given that there has been no apparent amino acid substitution in the codons considered and the overall level of amino acid divergence is low).
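To see how much the site count alone can move a per-site estimate, consider the reported range of 0.10 to 0.46 against the physical value of one-third. The sketch below is a small worked calculation, assuming the same number of synonymous substitutions is divided either by the mutational-opportunity site count or by the physical fourfold site count; the function name `per_site_ratio` is hypothetical.

```python
import math


def per_site_ratio(rho_s, physical_fraction=1.0 / 3.0):
    """Ratio of the per-opportunity rate to the per-physical-site rate.

    For S substitutions in a gene of L nucleotides, the mutational-opportunity
    rate is S / (rho_s * L) and the physical-site rate is
    S / (physical_fraction * L), so S and L cancel in the ratio.
    """
    return physical_fraction / rho_s


if __name__ == "__main__":
    for rho_s in (0.10, 1.0 / 3.0, 0.46):
        print(f"rho_s = {rho_s:.2f}: per-opportunity rate is "
              f"{per_site_ratio(rho_s):.2f} times the physical-site rate")
```

Under this assumption, two genes with identical numbers of synonymous substitutions per fourfold codon could differ more than fourfold in their per-opportunity rates at the extremes of the reported range.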
| DISCUSSION |
|---|
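The nature of the relationship between codon usage bias and the synonymous substitution rate depends upon the definition of a site used to estimate the substitution rate. If a mutational-opportunity definition is used, as encapsulated in the GY method, the synonymous substitution rate per site is uncorrelated to codon bias in Drosophila and strongly positively correlated to GC3 in mammals. If a physical definition is used, the synonymous substitution rate is negatively correlated to codon bias in Drosophila, and the positive correlation with GC3 in mammals is considerably weaker.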
The crucial question is which definition of a site is more informative in the context of substitution rates and codon bias, and which definition is more informative in other contexts; both definitions of a site are "correct" since one can define a site however one wants. We would argue that the mutational-opportunity definition of a site is likely to be misleading in some contexts simply because the definition of site is abstract and likely to depend on many factors that are not immediately obvious. For example, the proportion of sites that are synonymous is dependent upon the level of codon bias (Fig 4 and Fig 6).
The fact that the synonymous substitution rate per codon and per physical site is negatively correlated to codon bias (positively correlated to ENC) in Drosophila suggests that there is a biological phenomenon that needs to be explained, a phenomenon that is either obscured or in the wrong direction when a mutational-opportunity definition is employed. Furthermore, under the physical definition of site, it is relatively easy to develop models to explain the pattern. For example, we might hypothesize that the correlation is generated by directional selection; in the development of such a model a site is most easily defined physically (one could define the site as a mutational opportunity and include this in the model, but this would add complications). Alternatively we might hypothesize that the relationship is generated by a correlation between the mutation rate and gene expression, as appears to be the case in Escherichia coli.
General considerations:
Rates of synonymous and nonsynonymous substitution have been used in many contexts including (i) the estimation of phylogeny, (ii) the estimation of absolute rates of evolution, (iii) the comparison of substitution rates between genes, (iv) the testing of models of evolution, and (v) the investigation of adaptive evolution. Which definition of a site should we use in these different contexts?
- It is probably not particularly important whether we define a site as a mutational opportunity or a physical site in the reconstruction of phylogeny; the most important quality of our metric is that it reflects evolutionary divergence.
- Whether we should use a physical or mutational-opportunity definition of a site to measure absolute rates of substitution depends on what we wish to use our estimate for. Under the assumption that synonymous mutations are neutral, ds, the synonymous substitution rate per site, under the mutational-opportunity definition of site, is the average mutation rate across the three codon positions (Z. YANG, personal communication; see Equation 5), and dsLn is the amino acid mutation rate per gene. Both of these quantities may be useful. However, in other instances the physical definition of site may be more useful; for example, if we wanted to estimate the effective population size of a species, we could estimate nucleotide diversity and the synonymous substitution rate at fourfold degenerate sites.
- As we have shown above, both in the simple model used in the Introduction and in the analysis of the relationship between codon bias and the synonymous substitution rate, the mutational-opportunity definition of a site can give misleading results when genes are compared unless the proportion of sites that are synonymous and nonsynonymous is the same in all the genes in the comparison.
- Furthermore, if we are seeking to test a model of evolution, for example, to test whether a correlation between synonymous codon bias and the synonymous substitution rate is due to selection, then we can use either definition of a site by building the definition of a site into the model itself. However, this will generally be much easier for the physical definition of a site.
- The one arena in which the definition of a site as a mutational opportunity is clearly superior to the physical definition of site is in the detection of adaptive evolution. Adaptive evolution can be detected in a comparison of the nonsynonymous (dn) and synonymous (ds) substitution rates. Let us assume that synonymous mutations are neutral; then if we can define dn and ds such that dn = ds when all nonsynonymous mutations are neutral, adaptive evolution can be inferred if dn > ds. Estimating substitution rates as the number of substitutions per mutational opportunity is clearly appropriate in this context: if all nonsynonymous mutations are neutral, then the substitution rate per mutation will equal that at synonymous sites (see Equation 5). Inferring the action of adaptive evolution using the physical definition of a site is much more complex. These considerations are summarized in Table 1.
Table 1. Which definition of a site to use
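The point about adaptive evolution can be illustrated with the twofold-degenerate model of the Introduction. The sketch below assumes that model (each transversion at rate x, the transition at rate kappa * x, a proportion phi of nonsynonymous mutations neutral) and, as a purely illustrative choice, measures the physical nonsynonymous rate per nondegenerate site and the physical synonymous rate per twofold third position; the function name and symbols are hypothetical.

```python
from fractions import Fraction


def dn_ds_ratios(kappa, phi, x=Fraction(1)):
    """Return (dn/ds per mutational opportunity, dn/ds per physical site)."""
    kappa, phi = Fraction(kappa), Fraction(phi)
    # Rates per mutational-opportunity site in the toy model.
    ds_opp = (2 + kappa) * x
    dn_opp = phi * (2 + kappa) * x
    # Physical counting (illustrative choice): nonsynonymous rate per
    # nondegenerate site, synonymous rate per twofold third position.
    dn_phys = phi * (2 + kappa) * x
    ds_phys = kappa * x
    return dn_opp / ds_opp, dn_phys / ds_phys


if __name__ == "__main__":
    for kappa in (1, 5):
        opp, phys = dn_ds_ratios(kappa, phi=1)
        print(f"kappa={kappa}, all mutations neutral: "
              f"dn/ds = {opp} (opportunity), {float(phys):.2f} (physical)")
```

With all nonsynonymous mutations neutral, the per-opportunity ratio is 1 whatever the ts/tv ratio, so dn/ds > 1 is a clean criterion. The physical-site ratio depends on kappa even under neutrality, so the threshold for inferring adaptive evolution would have to be worked out case by case.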
Estimating the rate per physical site:
We can estimate the rate of substitution per physical site in a number of different ways (Appendix B). We can choose to estimate the substitution rates per codon or per nucleotide site. The former has the advantage that the method yields a single estimate of the synonymous and nonsynonymous substitution rates, but it has the disadvantage that the substitution rate will depend to some extent on the degeneracy of the codons in the gene. This may be important in the estimation of the synonymous substitution rate; if the rate of synonymous substitution is higher at fourfold than at twofold degenerate sites, as we would expect given that all mutations at a fourfold degenerate site are synonymous, then genes with a high proportion of fourfold sites will have higher rates of synonymous substitution per codon than genes with a low number of fourfold sites. This may not be satisfactory. However, this sort of bias is likely to be less important for nonsynonymous substitutions since the majority of mutations in a gene are nonsynonymous and the relative proportion of twofold and fourfold degenerate codons does not greatly affect this.
The alternative to calculating rates per codon is to calculate rates per nucleotide site as we have done in our BT method. The BT method is useful for calculating the rate of synonymous substitution per physical site when codon usage can be easily summarized in terms of base composition. However, this is often not the case; for example, E. coli has strong synonymous codon bias, which is not a simple function of base composition. For data of this sort, it is preferable to use a codon-based model to estimate the number of substitutions and then to express these values per physical site. Z. YANG (personal communication) has recently suggested a measure, d4, which can be derived from the GY method. The method estimates the number of synonymous substitutions that have occurred between fourfold degenerate codons and then divides this by the current number of sites that are physically fourfold degenerate. It would be possible to derive a similar estimate for the rate at twofold degenerate sites. For estimating the rate of nonsynonymous substitution we could estimate rates at zero-fold and twofold degenerate sites.
Codon bias and the number of sites:
Under the GY method the proportion of sites that are synonymous is correlated to the level of codon usage bias (Fig 4 and Fig 6). This is due to the fact that the GY method takes into account not only the ts/tv ratio but also codon bias itself in calculating the number of sites that are synonymous. The reason codon bias affects the number of sites that are synonymous is as follows. Imagine a gene in which all codons are fourfold degenerate and in which there is strong bias in favor of G- and C-ending codons. Let us assume for simplicity that this codon bias is mutational in origin (the GY method implicitly assumes this). A strong bias in favor of GC tells us that the mutation rate from AT to GC is stronger than the rate from GC to AT. Since nonsynonymous sites have lower GC content than synonymous sites, because they are subject to functional constraints, they will have a higher mutation rate (because they have more AT sites, which have a high mutation rate). The proportion of mutations that are nonsynonymous will therefore be relatively large, which will be reflected in a large value of LGYn and a small value of LGYs. Genes with high synonymous codon bias therefore have a lower proportion of synonymous sites because a smaller proportion of mutations are synonymous. As with the ts/tv ratio this can lead to anomalous results. Imagine two genes that have the same number of twofold and fourfold sites and the same synonymous codon bias and have undergone exactly the same number of synonymous substitutions. They have the same synonymous substitution rate per physical site, but if their nonsynonymous sites differ in composition, then the estimates of the number of synonymous substitutions per site, under the GY method, will be different because the proportion of mutations, and hence sites, that are synonymous will differ between the genes.
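The dependence of the synonymous proportion on codon bias can be reproduced with a small calculation. The sketch below is not the GY implementation: it assumes a simplified codon model in which the rate from codon i to codon j is proportional to the frequency of j (multiplied by the ts/tv ratio for transitions), with the nonsynonymous/synonymous rate ratio set to 1, the standard genetic code, uniform base frequencies at the first two codon positions, and a hypothetical third-position GC content `gc3` split evenly between G and C.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code in TCAG order; '*' marks stop codons.
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SENSE = [c for c, aa in CODE.items() if aa != "*"]
TRANSITIONS = {("T", "C"), ("C", "T"), ("A", "G"), ("G", "A")}


def codon_freqs(gc3):
    """Codon frequencies from position-specific base frequencies
    (uniform at positions 1 and 2; G = C = gc3 / 2 at position 3)."""
    f3 = {"G": gc3 / 2, "C": gc3 / 2, "A": (1 - gc3) / 2, "T": (1 - gc3) / 2}
    raw = {c: 0.25 * 0.25 * f3[c[2]] for c in SENSE}
    total = sum(raw.values())
    return {c: v / total for c, v in raw.items()}


def synonymous_proportion(gc3, kappa=1.0):
    """Proportion of mutational opportunities that are synonymous."""
    pi = codon_freqs(gc3)
    syn = nonsyn = 0.0
    for i in SENSE:
        for pos in range(3):
            for b in BASES:
                if b == i[pos]:
                    continue
                j = i[:pos] + b + i[pos + 1:]
                if CODE[j] == "*":
                    continue  # changes to stop codons are not counted
                rate = pi[j] * (kappa if (i[pos], b) in TRANSITIONS else 1.0)
                if CODE[i] == CODE[j]:
                    syn += pi[i] * rate
                else:
                    nonsyn += pi[i] * rate
    return syn / (syn + nonsyn)


if __name__ == "__main__":
    for gc3 in (0.5, 0.7, 0.9):
        print(f"GC3 = {gc3:.1f}: proportion synonymous = "
              f"{synonymous_proportion(gc3):.3f}")
```

Under these assumptions the synonymous proportion falls as the third position becomes more biased, even though the physical number of degenerate sites is unchanged, which mirrors the behavior described for Fig 4 and Fig 6.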
Other issues with the GY method:
The synonymous substitution rate estimated by the GY method can be used to detect positive selection at nonsynonymous sites: i.e., adaptive evolution can be inferred when dGYn/dGYs > 1. However, selection acts upon synonymous mutations in many organisms, and a nonsynonymous mutation also changes which synonymous codon is used. For example, a nonsynonymous mutation to C at the first codon position that creates the codon CUU may be selected against because CUU is a less optimal codon, in terms of translational accuracy, for instance. So if selection on protein structure tends to be strongly positive or negative, or neutral (i.e., no slightly deleterious and advantageous effects on protein structure), and a nonsynonymous mutation is as likely to change a preferred codon to an unpreferred codon, or vice versa, as a synonymous mutation, then the GY method will remain valid. However, these conditions are unlikely to be met in many organisms; for example, because most preferred codons are G or C in Drosophila, most nonsynonymous mutations may have weak synonymous effects, because they usually do not change a preferred codon to an unpreferred codon, or vice versa. So some caution should be used in using any test for adaptive evolution that relies on the dn/ds ratio; however, it should be remembered that the test is very conservative.
Other results:
The GY method has been used to examine the relationship between the synonymous substitution rate and codon usage bias in three other groups, including enteric bacteria.
Conclusions:
We have shown that the basic philosophy underlying the counting of sites in many methods for estimating substitution rates (i.e., the mutational-opportunity concept) is inappropriate in some contexts. In particular, it is inappropriate for comparing rates between genes. The GY method encapsulates this basic philosophy better than most other methods since it takes into account both the transition/transversion ratio and synonymous codon bias. Ironically, it is the sophistication of the GY method that has made the problem of counting sites apparent.
| ACKNOWLEDGMENTS |
|---|
We are very grateful to Andrea Betancourt, Katherine Dunn, Joe Bielawski, Junko Kusumi, and Nick Smith for sharing their data and results and to Nicolas Galtier and Laurence Hurst for useful discussions and comments on an earlier draft. The authors are supported by the Biotechnology and Biological Sciences Research Council and the Royal Society.
Manuscript received March 3, 2003; Accepted for publication July 25, 2003.
| APPENDIX A |
|---|
MUTATIONAL-OPPORTUNITY METHODS
Here we describe the major methods that are used to estimate rates of synonymous and nonsynonymous substitution.
Nei and Gojobori suggested two methods, which differ in the way they compute the number of synonymous and nonsynonymous changes between two codons that differ at more than one site. Their method I appears to be the only one used currently. In this method the different pathways between two codons, which differ at more than one site, are weighted equally. The correction of multiple hits is achieved using the Jukes-Cantor model (JUKES and CANTOR 1969).
In the methods of Li (1993) and Pamilo and Bianchi (1993) the rates of synonymous and nonsynonymous substitution are estimated as

$$K_s = \frac{L_2 A_2 + L_4 A_4}{L_2 + L_4} + B_4, \qquad K_a = A_0 + \frac{L_0 B_0 + L_2 B_2}{L_0 + L_2} \tag{A1}$$

where L0, L2, and L4 are the numbers of zero-, two-, and fourfold degenerate sites, respectively (the methods treat isoleucine as twofold degenerate), and $A_i$ and $B_i$ are the rates of transition and transversion substitution at i-fold degenerate sites, respectively. Note that we use the symbols Ka and Ks rather than dn and ds to maintain consistency with the original articles.
In essence the methods are attempting to estimate the rate of substitution at zerofold and fourfold degenerate sites taking into account rates of evolution at twofold degenerate sites. These are mutational-opportunity methods, but this is not obvious. To demonstrate this let us assume that a fraction $\theta$ of codons are fourfold degenerate with a fraction (1 - $\theta$) being twofold degenerate; for simplicity we assume that there are no threefold and sixfold degenerate codons. As in the simple model above we assume that each transversion occurs at rate x, that the ts/tv ratio is $\kappa$, and that a proportion $\phi$ of nonsynonymous mutations are neutral. Under this model we can write Equation A1 as

$$K_s = \frac{(1-\theta)L\kappa x + \theta L\kappa x}{(1-\theta)L + \theta L} + 2x, \qquad K_a = \phi\kappa x + \frac{2L \cdot 2\phi x + (1-\theta)L \cdot 2\phi x}{2L + (1-\theta)L} \tag{A2}$$

where L is the length of the gene in codons (we could define it in nucleotides but the logic is a little clearer in codons). These equations simplify to

$$K_s = (2+\kappa)x, \qquad K_a = \phi(2+\kappa)x \tag{A3}$$

as expected. Note, however, that these equations are exactly those given by the mutational-opportunity model in the Introduction (Equation 5). Although in the simple model we assumed that all codons were twofold degenerate, the conclusions remain unchanged if we include fourfold degenerate codons (results not shown).
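The simplification can be verified with exact arithmetic. The sketch below assumes that Equation A1 takes the form published by Li (1993) and Pamilo and Bianchi (1993), Ks = (L2 A2 + L4 A4)/(L2 + L4) + B4 and Ka = A0 + (L0 B0 + L2 B2)/(L0 + L2), and plugs in the expected rates under the toy model; the symbols `theta`, `kappa`, and `phi` and the function name are labels chosen here for illustration.

```python
from fractions import Fraction


def li_rates(theta, kappa, phi, x=Fraction(1), n_codons=Fraction(100)):
    """Ks and Ka under the toy model with twofold and fourfold codons only.

    theta: fraction of codons that are fourfold degenerate.
    kappa: ts/tv ratio; each transversion has rate x, the transition kappa * x.
    phi:   proportion of nonsynonymous mutations that are neutral.
    """
    theta, kappa, phi = Fraction(theta), Fraction(kappa), Fraction(phi)
    L0 = 2 * n_codons              # first and second codon positions
    L2 = (1 - theta) * n_codons    # third positions of twofold codons
    L4 = theta * n_codons          # third positions of fourfold codons
    A0, B0 = phi * kappa * x, phi * 2 * x   # all changes nonsynonymous
    A2, B2 = kappa * x, phi * 2 * x         # transitions synonymous only
    A4, B4 = kappa * x, 2 * x               # all changes synonymous
    Ks = (L2 * A2 + L4 * A4) / (L2 + L4) + B4
    Ka = A0 + (L0 * B0 + L2 * B2) / (L0 + L2)
    return Ks, Ka


if __name__ == "__main__":
    for theta in (Fraction(0), Fraction(1, 4), Fraction(3, 4)):
        print(theta, li_rates(theta, kappa=5, phi=Fraction(1, 2)))
```

For any mix of twofold and fourfold codons the result is (2 + kappa)x and phi(2 + kappa)x, the same per-site rates as in the simple mutational-opportunity model.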
The method of Comeron is essentially that of Li (1993) and Pamilo and Bianchi (1993); it is therefore also a mutational-opportunity method.
Ina suggests two methods. In each of his methods the ts/tv ratio is estimated and this is used to compute the number of synonymous and nonsynonymous sites; i.e., the method is a mutational-opportunity method. The two methods differ in how the ts/tv ratio is estimated. In the first approximate method the ts/tv ratio estimated at the third codon position is used to calculate the number of sites; however, this will tend to bias the ts/tv ratio upward because some of the third codon-position sites are twofold degenerate. The second method uses an iterative procedure to estimate the ts/tv ratio. Pathways between codons with multiple substitutions are weighted equally.
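The method of Goldman and Yang (GOLDMAN and YANG 1994) is a maximum-likelihood method based on a model of codon substitution in which the instantaneous rate of substitution from codon i to codon j (i ≠ j) is

$$q_{ij} = \begin{cases} 0 & \text{if } i \text{ and } j \text{ differ at more than one position} \\ \mu\pi_j & \text{for a synonymous transversion} \\ \mu k\pi_j & \text{for a synonymous transition} \\ \mu\omega\pi_j & \text{for a nonsynonymous transversion} \\ \mu k\omega\pi_j & \text{for a nonsynonymous transition} \end{cases}$$

where $\pi_j$ is the equilibrium frequency of codon j, µ is the nucleotide substitution rate per codon, k is the ts/tv ratio, and $\omega$ is the nonsynonymous to synonymous substitution rate ratio (dn/ds). The method finds the values of µ, k, and $\omega$ that maximize the likelihood of observing the data. The proportion of sites that are synonymous is estimated by using the maximum-likelihood value of k, setting $\omega$ = 1 and evaluating the expressions

$$M_s = \sum_i \pi_i \sum_{j \in \mathrm{syn}(i)} q_{ij}, \qquad M_n = \sum_i \pi_i \sum_{j \in \mathrm{non}(i)} q_{ij} \tag{A4}$$

where syn(i) and non(i) are the sets of codons that differ from codon i by a single synonymous or nonsynonymous change, respectively. The proportion of sites that are synonymous is then

$$\rho_s = \frac{M_s}{M_s + M_n} \tag{A5}$$

This is a mutational-opportunity method that takes into account not only the ts/tv ratio, but also codon usage bias in its estimate of the proportion of sites that are synonymous.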
| APPENDIX B |
|---|
PHYSICAL-SITES APPROACH
Nucleotide site methods:
Physical-site methods can be divided into two categories: those that estimate rates per nucleotide site and those that estimate rates per codon. Methods to estimate rates per nucleotide site have been largely concentrated on estimating the rate of synonymous substitution at fourfold degenerate sites, a measure usually given the symbol K4 or d4. The approach taken is the one we have used above; i.e., restricting the analysis to fourfold degenerate sites in codons that have not undergone any apparent amino acid substitution, and then using one of the many nucleotide substitution models to correct for multiple hits. Widely used models, in order of complexity (i.e., number of parameters), include those of Jukes and Cantor (1969), Kimura (1980), and Hasegawa et al. (1985).
Codon methods:
We are not aware of any method aimed at estimating the rate of nonsynonymous substitution per physical nucleotide site, but there are physical-site methods that estimate the rate per codon. In these methods the nonsynonymous, or amino acid, substitution rate per codon is estimated by calculating the proportion of amino acid sites that differ between two sequences, p, and then using a correction for multiple hits. The simplest correction is
$$d = -\ln(1 - p) \tag{B1}$$

An empirical alternative is the correction of Kimura (1983),

$$d = -\ln\left(1 - p - \tfrac{1}{5}p^2\right) \tag{B2}$$
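The two kinds of multiple-hit correction for amino acid differences can be compared directly. The sketch below assumes the simplest correction is the Poisson form d = -ln(1 - p) and uses Kimura's (1983) empirical formula d = -ln(1 - p - p²/5) as a second example; which corrections the original equations showed is an assumption here, and the function names are hypothetical.

```python
import math


def poisson_correction(p):
    """Poisson correction: substitutions per codon from the proportion p
    of amino acid sites that differ between two sequences."""
    return -math.log(1.0 - p)


def kimura_correction(p):
    """Kimura's (1983) empirical correction for amino acid differences."""
    return -math.log(1.0 - p - 0.2 * p * p)


if __name__ == "__main__":
    for p in (0.05, 0.10, 0.30):
        print(f"p = {p:.2f}: Poisson = {poisson_correction(p):.4f}, "
              f"Kimura = {kimura_correction(p):.4f}")
```

Both corrections are close to p at low divergence and diverge as p grows, with the empirical formula giving the larger estimate.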
One physical-site method is designed to estimate both the synonymous and nonsynonymous substitution rates per codon. This is the method of Muse and Gaut (1994), in which the synonymous and nonsynonymous substitution rates per codon, α and β, are estimated by maximum likelihood.
| LITERATURE CITED |
|---|
AKASHI, H., 1994 Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 136:927-935.[Abstract]
BERG, O. G. and M. MARTELIUS, 1995 Synonymous substitution-rate constants in Escherichia coli and Salmonella typhimurium and their relationship to gene expression and selection pressure. J. Mol. Evol. 41:449-456.[Medline]
BERNARDI, G., D. MOUCHIROUD, and C. GAUTIER, 1993 Silent substitutions in mammalian genomes and their evolutionary implications. J. Mol. Evol. 37:583-589.[Medline]
BETANCOURT, A. J. and D. C. PRESGRAVES, 2002 Linkage limits the power of natural selection in Drosophila. Proc. Natl. Acad. Sci. USA 99:13616-13620.
BIELAWSKI, J. P., K. A. DUNN, and Z. YANG, 2000 Rates of nucleotide substitution and mammalian nuclear gene evolution: approximate and maximum-likelihood methods lead to different conclusions. Genetics 156:1299-1308.
BULMER, M., 1991 Use of the method of generalized least squares in reconstructing phylogenies from sequence data. Mol. Biol. Evol. 8:868-883.
BULMER, M., K. H. WOLFE, and P. M. SHARP, 1991 Synonymous substitution rates in mammalian genes: implications for the molecular clock and the relationships of mammalian orders. Proc. Natl. Acad. Sci. USA 88:5974-5978.
COMERON, J., 1995 A method for estimating the numbers of synonymous and nonsynonymous substitutions per site. J. Mol. Evol. 41:1152-1159.
DUNN, K. A., J. P. BIELAWSKI, and Z. YANG, 2001 Substitution rates in Drosophila nuclear genes: implications for translational selection. Genetics 157:295-305.
EYRE-WALKER, A. and M. BULMER, 1995 Synonymous substitution rates in enterobacteria. Genetics 140:1407-1412.
GOLDMAN, N. and Z. YANG, 1994 A codon-based model of nucleotide substitution for protein-coding sequences. Mol. Biol. Evol. 11:725-736.
HASEGAWA, M., H. KISHINO, and T. YANO, 1985 Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22:160-174.
INA, Y., 1995 New methods for estimating the numbers of synonymous and non-synonymous substitutions. J. Mol. Evol. 40:190-226.
JUKES, T. H., and C. R. CANTOR, 1969 Evolution of protein molecules, pp. 121-123 in Mammalian Protein Metabolism, edited by N. H. MUNRO. Academic Press, New York.
KIMURA, M., 1980 A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
KIMURA, M., 1983 The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, UK.
KUSUMI, J., Y. TSUMURA, H. YOSHIMARU, and H. TACHIDA, 2002 Molecular evolution of nuclear genes in Cupressacea, a group of conifers. Mol. Biol. Evol. 19:736-747.
LI, W.-H., 1993 Unbiased estimation of the rates of synonymous and non-synonymous substitution. J. Mol. Evol. 36:96-99.
LI, W.-H., C.-I WU, and C.-C. LUO, 1985 A new method of estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol. 2:150-174.
MIYATA, T. and T. YASUNAGA, 1980 Molecular evolution of mRNA: a method for estimating evolutionary rates of synonymous and amino acid substitution from homologous sequences and its application. J. Mol. Evol. 16:23-26.
MORIYAMA, E. N. and D. L. HARTL, 1993 Codon usage bias and base composition of nuclear genes of Drosophila. Genetics 134:847-858.
MUSE, S. V., 1996 Estimating the synonymous and nonsynonymous substitution rates. Mol. Biol. Evol. 13:105-114.
MUSE, S. V. and B. S. GAUT, 1994 A likelihood approach for comparing synonymous and nonsynonymous nucleotide rates, with application to the chloroplast genome. Mol. Biol. Evol. 11:715-724.
NEI, M. and T. GOJOBORI, 1986 Simple methods for estimating the number of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418-426.
PAMILO, P. and N. O. BIANCHI, 1993 Evolution of Zfx and Zfy genes: rates and interdependence between the genes. Mol. Biol. Evol. 10:271-281.
PERLER, R., A. EFSTRATIADIS, P. LOMEDICO, W. GILBERT, and R. KLODNER et al., 1980 The evolution of genes: the chicken preproinsulin gene. Cell 20:555-566.
SHARP, P. M. and W.-H. LI, 1987 The rate of synonymous substitution in enterobacterial genes is inversely related to codon usage bias. Mol. Biol. Evol. 4:222-230.
SHARP, P. M. and W.-H. LI, 1989 On the rate of DNA sequence evolution in Drosophila. J. Mol. Evol. 28:398-402.
SHARP, P. M., C. J. BURGESS, A. T. LLOYD and K. J. MITCHELL, 1992 Selective use of termination and variation in codon choice, pp. 397-425 in Transfer RNA in Protein Synthesis, edited by D. L. HATFIELD, B. J. LEE and R. M. PIRTLE. CRC Press, Boca Raton, FL.
SMITH, N. G. C. and L. D. HURST, 1999 The effect of tandem substitutions on the correlation between synonymous and nonsynonymous rates in rodents. Genetics 153:1395-1402.
SMITH, N. G. C. and A. EYRE-WALKER, 2001 Nucleotide substitution rate estimation in enterobacteria: approximate and maximum-likelihood methods lead to similar conclusions. Mol. Biol. Evol. 18:2124-2126.
TAJIMA, F. and M. NEI, 1984 Estimation of evolutionary distances between nucleotide sequences. Mol. Biol. Evol. 1:269-285.
TAMURA, K., 1992 Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G+C-content biases. Mol. Biol. Evol. 9:678-687.
TAMURA, K. and M. NEI, 1993 Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512-526.
WOLFE, K. H., P. M. SHARP, and W.-H. LI, 1989 Mutation rates differ among regions of the mammalian genome. Nature 337:283-285.
WRIGHT, F., 1990 The effective number of codons used in a gene. Gene 87:23-29.
YANG, Z., 1997 PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13:555-556.
ZUCKERKANDL, E., and L. PAULING, 1965 Evolutionary divergence and convergence in proteins, pp. 97-166 in Evolving Genes and Proteins, edited by V. BRYSON and H. J. VOGEL. Academic Press, New York.