Negative Covariance Suggests Mutation Bias in a Two-Locus Microsatellite System in the Fish Sparus aurata
Emmanouil T. Dermitzakis (a,b), Andrew G. Clark (b), Costas Batargias (a,c), Antonios Magoulas (c), and Eleftherios Zouros (a,c)
(a) Department of Biology, University of Crete, 711 10 Iraklion, Crete, Greece
(b) Institute of Molecular Evolutionary Genetics, Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802
(c) Genetics Department, Institute of Marine Biology of Crete, 710 03 Iraklion, Greece
Corresponding author: Emmanouil T. Dermitzakis, 208 Mueller Lab, Department of Biology, Pennsylvania State University, University Park, PA 16802. E-mail: exd158{at}psu.edu
Communicating editor: M. W. FELDMAN
| ABSTRACT |
|---|
Constraints on microsatellite length appear to vary in a species-specific manner. We know very little about the nature of these constraints and why they should vary among species. While surveying microsatellite variation in the Mediterranean gilthead sea bream, Sparus aurata, we discovered an unusual pattern of covariation between two closely linked microsatellite loci. One- and two-locus haplotypes were scored from PCR amplification products of each locus separately and both loci together. In a sample of 211 fish, there was a strong negative covariance in repeat number between the two loci, which suggests a mechanism that maintains the combined length below a constrained size. In addition, there were two clusters of the same combined haplotype length, one consisting of a long repeat array at one locus and a short array at the other and vice versa. We demonstrate that several models of biased mutation or natural selection, in theory, could generate this pattern of covariance. The common feature of all the models is the idea that tightly linked microsatellites do not evolve in complete independence, and that whatever size dependence there is to the process, it appears to "read" the combined size of the two loci.
THE utility of microsatellite repeats for making inferences about population history hinges on an understanding of the mutational processes that generate their remarkable diversity in populations. Several models have been proposed for the evolution of microsatellites.
In the process of studying the population structure of the Mediterranean sea bream, Sparus aurata, we isolated two arrays of tandemly repeated GT dinucleotides that are separated by only 75 nucleotides of unique sequence, each of which shows extensive allelic variation. We could amplify DNA from each locus separately and from the two together from each chromosome of an individual fish. This allowed quantification of one-locus and two-locus allele associations at the gametic and the zygotic levels. The patterns of association observed provide useful insights into the processes that may constrain microsatellite length variation in a chromosomal region.
| MATERIALS AND METHODS |
|---|
Sampling procedure:
Five geographically distinct samples of Mediterranean sea bream (S. aurata) were obtained:
- IMBC-1: wild animals collected in 1993 from several regions of the Greek seas (n = 32). These animals are maintained at the Institute of Marine Biology of Crete (IMBC) for experimental purposes.
- G2: a sample collected in 1996 from the Mesolongi lagoon, Greece (n = 40).
- I2: a sample collected in 1996 from the northern Adriatic, Italy (n = 40).
- S4: a sample collected in 1996 from Alicante on the Mediterranean coast of Spain (n = 51).
- S3: a sample collected from Cadiz in 1996, on the Atlantic side of Gibraltar (n = 48).
Samples were frozen as soon as possible after collection and transferred to the laboratory for tissue excision. DNA was extracted from frozen livers or from muscle preserved in 70% alcohol. In both cases, the same previously published DNA extraction protocol was used.
PCR amplification:
A battery of microsatellite markers for S. aurata has been developed by Batargias and colleagues (C. BATARGIAS, E. T. DERMITZAKIS, A. MAGOULAS and E. ZOUROS, unpublished results). The two microsatellite markers used here are (GT)n repeats that are separated by a 75-bp unique sequence. They are designated SA41a and SA41b, and the combined region is called SA41. Primers were designed to amplify each locus separately and both together (Figure 1; EMBL accession numbers Y17262 and Y17263). For the amplification of the SA41a locus, primers pSA41Fa and pSA41Ra were used; for the SA41b locus, primers pSA41Fb and pSA41Rb; and for both loci (composite PCR product), primers pSA41Rb and pSA41Fa. For the visualization of the PCR products, one primer for each amplification was end-labeled with [γ-32P]ATP. Primers pSA41Ra and pSA41Rb were labeled for loci SA41a and SA41b, respectively, and pSA41Fa was labeled for the combined two-locus product (SA41).
All reactions were performed in 0.2-ml PCR tubes in 10-µl volumes consisting of 1× PCR buffer (GIBCO BRL, Gaithersburg, MD), 0.6 µM of each of the two primers, 0.2 mM of each dNTP, 1 mM MgCl2, 0.04 µM of the labeled primer, 0.25 units of Taq polymerase (GIBCO BRL), and about 10 ng of total genomic DNA. The conditions for each amplification were: 1 cycle of 95° for 2 min (hot start); 35 cycles of 95° for 45 sec, 52° for 30 sec, and 72° for 30 sec; and a final extension at 72° for 10 min.
Gel electrophoresis and size scoring:
PCR products were resolved in 6% polyacrylamide denaturing gels. The sequence of the phage plasmid M13 was used as a size marker to determine the genotype of certain individuals, which were subsequently used as size markers. To assure accuracy in sizing, the markers covered the full size range of the sampled PCR products.
Size scoring of PCR products from autoradiographs was performed at least three times for each case. Associated pairs of allele sizes (327) of the two loci (haplotypes) were inferred by the presence of their composite PCR product for most of the individuals, which was always equal to the sum of the single locus products minus the 45 nucleotides of overlap. In some cases (~10% of the chromosomes analyzed) one of the two haplotypes of an individual was inferred by deduction, when only one of the composite PCR products was visualized (we assumed that the other pair was associated, although we could not see the composite PCR product). This method was used only in cases where the genotyping of the two loci was unambiguous. We inferred 157 genotypes (that correspond to 314 haplotypes) due to the fact that only one of the two composite products was scorable for 13 individuals.
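The size bookkeeping behind this haplotype inference can be illustrated with a minimal sketch. It assumes, hypothetically, that allele labels such as SA41a99 and SA41b182 denote single-locus product sizes in base pairs; the function names are illustrative and are not part of the original analysis.

```python
OVERLAP_BP = 45  # overlap between the two single-locus products, as stated in the text


def composite_size(size_a, size_b, overlap=OVERLAP_BP):
    """Expected composite (SA41) product size for one SA41a/SA41b allele pair."""
    return size_a + size_b - overlap


def phase_haplotypes(alleles_a, alleles_b, composites, overlap=OVERLAP_BP):
    """Return the allele pairings consistent with the observed composite sizes.

    alleles_a, alleles_b: the two allele sizes scored at each locus in one fish.
    composites: one or two observed composite product sizes.
    When only one composite is visible, the other pair is taken by deduction.
    """
    a1, a2 = alleles_a
    b1, b2 = alleles_b
    candidates = [((a1, b1), (a2, b2)), ((a1, b2), (a2, b1))]
    observed = sorted(composites)
    consistent = []
    for phase in candidates:
        predicted = sorted(composite_size(a, b, overlap) for a, b in phase)
        if len(observed) == 2:
            ok = predicted == observed
        else:
            ok = observed[0] in predicted
        if ok:
            consistent.append(phase)
    return consistent


# Illustrative double heterozygote built from allele labels named in the text.
print(phase_haplotypes((99, 133), (152, 182), [236, 240]))
```

Under the stated assumption, the two common allele combinations named in the DISCUSSION (SA41a99-SA41b182 and SA41a133-SA41b152) correspond to composite sizes of 236 and 240.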
We were able to verify the association of all the pairs of alleles obtained by PCR-scoring for the IMBC-1 sample (60 haplotypes: 18.3% of the total number of haplotypes analyzed), by scoring the genotypes of the offspring of experimental crosses between these individuals, and by observing the cosegregation of the alleles of the two loci to the next generation. In all 60 cases the associated pair observed in the offspring coincided with the associated pair inferred from the PCR assay.
Permutation tests:
Several permutation tests were applied to assess the statistical significance of properties of the observed microsatellite variation. Initial inspection of the data suggested that the variance in the size of the composite PCR product was lower than that expected from random associations of the alleles of the two loci. This motivated a test designed to compare the observed variance of SA41 allele size, which also represents the variance in repeat number, with the variance of random draws of pairs of allele sizes of loci SA41a and SA41b. Specifically, we generated 1000 samples of 327 pairs of allele sizes by shuffling the observed allele sizes of one locus against those of the other. We also applied the same permutation test to compare the observed covariance of the dinucleotide repeat number in the two loci with that expected by chance.
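A shuffling test of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' program: the function names, the one-sided (lower-tail) P value, and the toy data are all hypothetical.

```python
import random


def sample_variance(xs):
    """Unbiased sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def sample_covariance(xs, ys):
    """Unbiased sample covariance of paired values."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def permutation_test(a_sizes, b_sizes, n_perm=1000, seed=0):
    """Shuffle locus-b sizes against locus-a sizes and count permutations
    whose statistic is at least as low as the observed one."""
    rng = random.Random(seed)
    obs_cov = sample_covariance(a_sizes, b_sizes)
    obs_var = sample_variance([a + b for a, b in zip(a_sizes, b_sizes)])
    shuffled = list(b_sizes)
    n_cov = n_var = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if sample_covariance(a_sizes, shuffled) <= obs_cov:
            n_cov += 1
        if sample_variance([a + b for a, b in zip(a_sizes, shuffled)]) <= obs_var:
            n_var += 1
    return {
        "obs_cov": obs_cov,
        "p_cov": n_cov / n_perm,
        "obs_var_sum": obs_var,
        "p_var_sum": n_var / n_perm,
    }


# Toy data (not from the study): long arrays at one locus paired with short ones at the other.
a = [90 + 2 * i for i in range(20)]
b = [190 - 2 * i for i in range(20)]
result = permutation_test(a, b)
print(result)
```

Because shuffling leaves the variance at each locus unchanged, Var(a + b) = Var(a) + Var(b) + 2 Cov(a, b) implies that the variance of the summed size and the covariance rank the permuted samples in the same order, so the two tests carry essentially the same information.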
| RESULTS |
|---|
Allele and genotype frequencies:
Mediterranean populations of S. aurata show no significant genetic heterogeneity with regard to allozyme and mitochondrial DNA variation (A. MAGOULAS, unpublished data).
Linkage disequilibrium:
Although a direct estimate of recombination rate between the two loci has not been obtained, the spacing of just 75 nucleotides means that recombination must be very rare. An indication of the low rate of recombination is that we did not observe a single instance of the haplotype composed of the two most frequent alleles (SA41a99 and SA41b152). Given this very tight linkage, linkage disequilibrium between alleles at two loci would be eroded mainly through mutation, which could generate multiple combinations of the same length at each of the two loci, yet these combinations would not be identical by descent.
Overall linkage disequilibrium among alleles at the two loci was found to be highly significant by the chi-square test.
Permutation test for random association of SA41a and SA41b alleles:
The negative correlation of individual allele length motivated a test that would compare the variance and the covariance of the observed data with 1000 replicates generated by random shuffling of the alleles of the one locus against the other, as described in the MATERIALS AND METHODS. The distribution of variance in fragment size and the covariance (Figure 4) between allele size for the two loci obtained from the 1000 permutations in both cases had no overlap with the observed values, indicating that the probability of getting such an extreme value by chance was <0.001. The significant negative covariance could be generated by a number of distinct mechanisms, some of which were explored with computer simulations described below.
Simulations of models of tandem microsatellite evolution:
We considered three models that might a priori be expected to generate negative covariance: recovery from a population bottleneck, mutation bias, and natural selection. We do not test the previously proposed hypothesis of gene conversion.
Simulations of the models make use of the coalescent process for generation of gene genealogies without recombination.
The first model assumed a population bottleneck early in the coalescent tree. The rationale behind this model is that negative covariance can be generated by chance only if a small number of haplotypes survive, whose allele sizes at the two loci are negatively correlated. We reduced the effective population size (N) to a fraction B (the "bottleneck factor") of its former size and kept the population size low for an arbitrary period from node 40 to node 320 (where the earliest node is indexed as node 1). After the bottleneck the population returned to its initial size. These changes in population size are modeled as changes in the intensity of coalescence, which alter the branch lengths of the tree. Figure 5A illustrates the correlation between the two arrays of repeats for different severities of bottleneck. The striking observation is that there is no correlation in allele sizes even when the population is reduced to 1% of its current size. If the population is reduced to the point that only two lineages survive, then by chance those two could yield a negative covariance, but this degree of bottleneck is highly unlikely for S. aurata based on available data for mtDNA and allozymes.
In the second model, we assume that there is a bias in the mutation mechanism (during replication) such that mutations increase the size of the array with probability µ[1/2 - β(L - T)], where µ is the mutation rate, β is the degree of bias, L is the sum of lengths of the alleles at the two loci, and T is the threshold size (Figure 6A). Under this model, when the combined alleles have a length <T, both repeats tend to mutate to a larger size, and when the combined allele size is >T, both repeats tend to mutate to a smaller size. Another form of mutation bias might occur after DNA replication. This model assumes that there is a postreplication scanning mechanism, which truncates one or the other locus when the summed size gets large. If the combined length of the two repeats is greater than threshold T, then with probability s (the "stringency" of the truncation), one or the other array is shortened by one repeat. As the stringency increases, the observed correlation in allele sizes becomes more negative (Figure 6B).
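The two mutation-bias rules can be sketched as single-step update functions. This is a hedged illustration, not the simulation code used in the study: the names `bias`, `p_increase`, `mutate_biased`, and `truncate` are hypothetical, lengths are taken to be repeat counts, the locus that changes is chosen with equal probability, probabilities are clipped to [0, 1], and arrays are kept at one repeat or more. All of these are assumptions.

```python
import random


def p_increase(L, T, bias):
    """Given that a mutation occurs, probability that it adds a repeat:
    1/2 - bias * (L - T), clipped to [0, 1] (clipping is an assumption)."""
    return min(1.0, max(0.0, 0.5 - bias * (L - T)))


def mutate_biased(a, b, T, bias, rng):
    """One single-repeat mutation under the replication-bias rule.
    The direction depends on the combined length a + b, not on either locus alone."""
    step = 1 if rng.random() < p_increase(a + b, T, bias) else -1
    if rng.random() < 0.5:  # assumption: either locus is equally likely to mutate
        a = max(1, a + step)
    else:
        b = max(1, b + step)
    return a, b


def truncate(a, b, T, s, rng):
    """Postreplication scanning: if a + b exceeds T, then with probability s
    (the stringency) one array or the other loses one repeat."""
    if a + b > T and rng.random() < s:
        if rng.random() < 0.5:
            a = max(1, a - 1)
        else:
            b = max(1, b - 1)
    return a, b


rng = random.Random(1)
a, b = 60, 60
for _ in range(2000):
    a, b = mutate_biased(a, b, T=100, bias=0.01, rng=rng)
print(a, b, a + b)  # combined length drifts toward the threshold T
```

In both rules the update depends only on the summed length, which is what couples the two otherwise independent arrays.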
The third model assumes that natural selection acts on summed array size (with long arrays having low fitness), together with a mutation bias that favors the increase of the repeat array. In all the simulations of this model, a bias of 2% was used, such that 52% of the mutations increased the array length and 48% decreased it. The fitness associated with each allele was w = 1 - (L - Opt)s, where L is the summed length of the alleles at the two loci, Opt is the optimal size (which is the original size of the common ancestor), and s is the selection coefficient. Selection was modeled as a haploid process, which is equivalent to additive fitness effects in a diploid model. An approximation to haploid selection is made by having each node in the tree generate an array of descendants, each having fitness w. Descendants are drawn from the array with probability equal to their fitness. Again we see that this model can generate negative correlation in allele sizes (Figure 6C).
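The selection step can be sketched in the same way. The fitness function follows the formula given above; everything else is an assumption made for illustration: fitness is clipped to [0, 1] (so arrays shorter than the optimum are treated as having fitness 1), descendants are accepted by rejection sampling with probability w, and the function names are hypothetical.

```python
import random

P_UP = 0.52  # upward mutation bias stated in the text (52% increases, 48% decreases)


def fitness(L, opt, s):
    """w = 1 - (L - Opt) * s, clipped to [0, 1] (clipping is an assumption)."""
    return min(1.0, max(0.0, 1.0 - (L - opt) * s))


def mutate_upward(a, b, rng, p_up=P_UP):
    """One single-repeat mutation with a slight bias toward expansion."""
    step = 1 if rng.random() < p_up else -1
    if rng.random() < 0.5:  # assumption: either locus is equally likely to mutate
        a = max(1, a + step)
    else:
        b = max(1, b + step)
    return a, b


def draw_survivor(descendants, opt, s, rng, max_tries=10000):
    """Pick a descendant haplotype (a, b), accepting it with probability
    equal to the fitness of its summed length."""
    for _ in range(max_tries):
        a, b = rng.choice(descendants)
        if rng.random() < fitness(a + b, opt, s):
            return a, b
    raise RuntimeError("no descendant accepted; all fitnesses may be zero")


rng = random.Random(2)
node = (50, 50)
for _ in range(500):
    brood = [mutate_upward(*node, rng) for _ in range(10)]
    node = draw_survivor(brood, opt=100, s=0.05, rng=rng)
print(node, sum(node))
```

The upward mutation pressure and the penalty on long summed lengths oppose each other, so the combined length is held near the optimum while the two arrays remain free to trade repeats against one another.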
In sum, Figure 6 illustrates that models with mutational bias (during or after replication) or natural selection can, with plausible parameter values, produce patterns of negative correlation similar to what was observed. Further empirical work, such as direct scoring of mutations, would be needed to discriminate among these mechanisms, but the common feature of all models is an interdependence of the changes at the two linked repeat arrays.
Permutation test for Hardy-Weinberg genotype frequencies:
One test that may identify a marked variation in fitness of different SA41 genotypes is to determine whether the genotype frequencies correspond to those expected under random union of gametes. In particular, we want to know whether allele sizes that compose diploid genotypes are drawn at random. We addressed this question with two different tests.
The negative covariance seen between alleles on a chromosome may extend to a nonrandom association of haplotypes in genotypes. Such a nonrandom association may occur if the genotypic fitness were affected by allele sizes in some way. A permutation test was used to test for departures of this type. We calculated the variance of the sum of the two haplotypes (the sum of the composite lengths SA41 of the two chromosomes) and the variance of their difference for all 157 genotypes whose allelic composition was unambiguous (see MATERIALS AND METHODS). Then we drew random pairs of haplotypes by shuffling the observed haplotypes to produce 1000 sets of 157 genotypes, and we calculated the variance of the sum and difference for each set. We then generated the distribution of the variances and compared it with the observed. In both cases, the observed value of variance was placed in the core of the distribution (P > 0.3), indicating that the association of the haplotypes into diploid genotypes was random.
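The genotype-level shuffling described above can be sketched as follows. As before, this is an illustration under assumptions rather than the authors' program: the function names are hypothetical, the difference is taken as signed in the order the two haplotypes are listed, and the function reports the fraction of permuted values at or below the observed value rather than a formal P value.

```python
import random


def sample_variance(xs):
    """Unbiased sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def genotype_permutation(genotypes, n_perm=1000, seed=0):
    """genotypes: list of (h1, h2) composite haplotype lengths, one pair per fish.
    Haplotypes are pooled, shuffled, and re-paired at random n_perm times."""
    rng = random.Random(seed)
    obs_sum = sample_variance([h1 + h2 for h1, h2 in genotypes])
    obs_diff = sample_variance([h1 - h2 for h1, h2 in genotypes])
    pool = [h for pair in genotypes for h in pair]
    le_sum = le_diff = 0
    for _ in range(n_perm):
        rng.shuffle(pool)
        pairs = list(zip(pool[0::2], pool[1::2]))
        if sample_variance([x + y for x, y in pairs]) <= obs_sum:
            le_sum += 1
        if sample_variance([x - y for x, y in pairs]) <= obs_diff:
            le_diff += 1
    return {"frac_sum_le_obs": le_sum / n_perm, "frac_diff_le_obs": le_diff / n_perm}


# Toy genotypes (not from the study), drawn by pairing haplotype lengths at random.
toy_rng = random.Random(7)
lengths = [toy_rng.choice([232, 234, 236, 238, 240]) for _ in range(60)]
toy = list(zip(lengths[0::2], lengths[1::2]))
print(genotype_permutation(toy))
```

An observed value that falls in the core of the permuted distribution, as reported above, corresponds to fractions that are far from both 0 and 1.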
We also performed an exact test for Hardy-Weinberg equilibrium.
| DISCUSSION |
|---|
This study shows that two closely linked microsatellite arrays, whose repeat numbers might be expected to evolve independently of each other, in fact behave as if there were a "preferred" intermediate combined length. Permutation tests first established that there is a highly significant negative covariance between the repeat lengths for loci SA41a and SA41b. We consider three competing explanations for this pattern of variation: population history (e.g., a bottleneck), mutation bias, and natural selection.
In the first model (population history) we assume that the two most common combinations of alleles (SA41a99-SA41b182 and SA41a133-SA41b152) represent two ancestral haplotypes at the SA41 locus. Their predominance in present-day populations of gilthead sea bream would have arisen either because these were the haplotypes in the original population that evolved into the species S. aurata, or because at some later time the species as a whole experienced a severe bottleneck through which these haplotypes were the ones to survive. One difficulty with this explanation is that the initial preponderance of the two alternative major haplotypes is assumed to have arisen by chance. While this is formally possible, our simulations show that it is very unlikely unless the bottleneck was much more severe than other evidence allows.
The other two models (natural selection and mutation bias) share the prediction that we should have observed more haplotypes with the same combined length but intermediate to the two clusters observed. Selective pressure on repetitive sequences has been proposed previously.
The mutation bias model (during or after replication) assumes that when the combined length passes a threshold number of repeats (which in this case may be a haplotype length close to 236), either the replication mechanism favors a decrease in the number of repeats, or another mechanism truncates repeats from one or the other locus after replication. The difficulty with this explanation is that it does not account for the fact that mainly two clusters are observed. Under this model we should instead have observed many different haplotypes with the same combined length.
However, based on the available data we can propose a possible model on how this clustering was generated. It is proposed, and supported by empirical data (![]()
Although natural selection cannot be rejected as a possible explanation for our results, mutation bias seems more plausible because it better fits the assumptions currently accepted for the evolution of microsatellite repeats. A mutation bias hypothesis is consistent with several studies suggesting constraints on the length of repeat arrays.
Whether our observation is a result of natural selection or mutation bias remains to be resolved in future studies of additional tightly linked microsatellite loci. The striking negative correlation in allele sizes of linked microsatellite repeats in S. aurata argues that the two loci are not evolving independently, and that either mutation processes or natural selection is driving the pattern of interlocus disequilibrium.
| ACKNOWLEDGMENTS |
|---|
We thank Drs. G. Kotoulas, C. Saavedra, and A. Argyrokastritis for helpful discussions and ideas during this study and all the members of Dr. Zouros's lab in Crete and Dr. Clark's lab for their support. We also thank Drs. M. Kentouri, T. Patarnello, M. C. Alvarez, and J. P. Andrande for providing samples. We are also grateful to Dr. A. Civetta and B. Lazzaro for critically reading earlier versions of this manuscript and the two anonymous reviewers for their helpful comments. E.T.D. was supported by the Greek Foundation of State Scholarships. The project was supported by AIR3 (AIR CT 94 1926, funded by the European Union) to E.Z. and A.M.
Manuscript received March 10, 1998; Accepted for publication September 8, 1998.
| LITERATURE CITED |
|---|
AMOS, W., S. J. SAWCER, R. W. FEAKES, and D. C. RUBINSZTEIN, 1996 Microsatellites show mutational bias and heterozygote instability. Nat. Genet. 13:390-391.
CHARLESWORTH, B., C. H. LANGLEY, and W. STEPHAN, 1986 The evolution of restricted recombination and the accumulation of repeated DNA sequences. Genetics 112:947-962.
CHARLESWORTH, B., P. SNIEGOWSKI, and W. STEPHAN, 1994 The evolutionary dynamics of repetitive DNA in eukaryotes. Nature 371:215-220.
DI RIENZO, A., A. C. PETERSON, J. C. GARZA, A. M. VALDES, and M. SLATKIN et al., 1994 Mutational processes of simple-sequence repeat loci in human populations. Proc. Natl. Acad. Sci. USA 91:3166-3170.
ESTOUP, A., L. GARNERY, M. SOLIGNAC, and J. M. CORNUET, 1995 Microsatellite variation in honey bee (Apis mellifera) populations: hierarchical genetic structure and test of the infinite allele and stepwise mutation models. Genetics 140:679-695.
FELDMAN, M. W., A. BERGMAN, D. D. POLLOCK, and D. B. GOLDSTEIN, 1997 Microsatellite genetic distances with range constraints: analytic description and problems of estimation. Genetics 145:207-216.
FIELD, D. and C. WILLS, 1998 Abundant microsatellite polymorphism in Saccharomyces cerevisiae, and the different distributions of microsatellites in eight prokaryotes and S. cerevisiae, result from strong mutation pressures and a variety of selective forces. Proc. Natl. Acad. Sci. USA 95:1647-1652.
GARZA, J. C., M. SLATKIN, and N. B. FREIMER, 1995 Microsatellite allele frequencies in humans and chimpanzees, with implications for constraints on allele size. Mol. Biol. Evol. 12:594-603.
GOLDSTEIN, D. B. and A. G. CLARK, 1995 Microsatellite variation in North American populations of Drosophila melanogaster. Nucleic Acids Res. 23:3882-3886.
GOLDSTEIN, D. B. and D. D. POLLOCK, 1997 Launching microsatellites: a review of mutation processes and methods of phylogenetic inference. J. Hered. 88:335-342.
GOLDSTEIN, D. B., A. RUIZ LINARES, L. L. CAVALLI-SFORZA, and M. W. FELDMAN, 1995 Genetic absolute dating based on microsatellites and the origin of modern humans. Proc. Natl. Acad. Sci. USA 92:6723-6727.
HUDSON, R. R., 1990 Gene genealogies and the coalescent process. Oxf. Surv. Evol. Biol. 7:1-44.
MAGOULAS, A., K. SOPHRONIDES, T. PATARNELLO, E. HATZILARIS, and E. ZOUROS, 1995 Mitochondrial DNA variation in an experimental stock of gilthead sea bream (Sparus aurata). Mol. Mar. Biol. Biotechnol. 4:110-116.
MITTON, J. B. and R. K. KOEHN, 1973 Population genetics of marine pelecypods. 3. Epistasis between functionally related isoenzymes of Mytilus edulis. Genetics 73:493-496 (Appendix by T. PROUT).
POGSON, G. H. and E. ZOUROS, 1994 Allozyme and RFLP heterozygosities as correlates of growth rate in the scallop Placopecten magellanicus: a test of the associative overdominance hypothesis. Genetics 137:221-231.
PRITCHARD, J. K. and M. W. FELDMAN, 1996 Statistics for microsatellite variation based on coalescence. Theor. Popul. Biol. 50:325-344.
RAYMOND, M. and F. ROUSSET, 1995 GENEPOP (version 1.2): population genetics software for exact tests and ecumenism. J. Hered. 86:248-249.
RUBINSZTEIN, D. C., W. AMOS, J. LEGGO, S. GOODBURN, and S. JAIN et al., 1995 Microsatellite evolution: evidence for directionality and variation in rate between species. Nat. Genet. 10:337-343.
SCHLOETTERER, C., C. VOGL, and D. TAUTZ, 1997 Polymorphism and locus-specific effects on polymorphism at microsatellite loci in natural Drosophila melanogaster populations. Genetics 146:309-320.
SCHUG, M. D., T. F. C. MACKAY, and C. F. AQUADRO, 1997 Low mutation rates of microsatellite loci in Drosophila melanogaster. Nat. Genet. 15:99-102.
SCHUG, M. D., K. A. WETTERSTRAND, M. S. GAUDETTE, R. H. LIM, and C. M. HUTTER et al., 1998 The distribution and frequency of microsatellite loci in Drosophila melanogaster. Mol. Ecol. 7:57-69.
SLATKIN, M., 1995 A measure of population subdivision based on microsatellite allele frequencies. Genetics 139:457-462.
STEPHAN, W., 1989 Tandem-repetitive noncoding DNA: forms and forces. Mol. Biol. Evol. 6:198-212.
TAKEZAKI, N. and M. NEI, 1996 Genetic distances and reconstruction of phylogenetic trees from microsatellite DNA. Genetics 144:389-399.
VALDES, A. M., M. SLATKIN, and N. B. FREIMER, 1993 Allele frequencies at microsatellite loci: the stepwise mutation model revisited. Genetics 133:737-749.
WEBER, J. L. and C. WONG, 1993 Mutation of human short tandem repeats. Hum. Mol. Genet. 2:1123-1128.
WEIR, B. S., 1979 Inferences about linkage disequilibrium. Biometrics 35:235-254.
WIERDL, M., M. DOMINSKA, and T. PETES, 1997 Microsatellite instability in yeast: dependence on the length of the microsatellite. Genetics 146:769-779.
YEH, F. C., R-C. YANG, T. B. J. BOYLE, Z-H. YE, and J. X. MAO, 1997 POPGENE: the user-friendly shareware for population genetic analysis. Molecular Biology and Biotechnology Center, University of Alberta, Canada.
ZHIVOTOVSKY, L. A. and M. W. FELDMAN, 1995 Microsatellite variability and genetic distances. Proc. Natl. Acad. Sci. USA 92:11549-11552.
ZHIVOTOVSKY, L. A., M. W. FELDMAN, and S. A. GRISHECHKIN, 1997 Biased mutations and microsatellite variation. Mol. Biol. Evol. 14:926-933.