Abstract
Constraints on microsatellite length appear to vary in a speciesspecific manner. We know very little about the nature of these constraints and why they should vary among species. While surveying microsatellite variation in the Mediterranean gilthead sea bream, Sparus aurata, we discovered an unusual pattern of covariation between two closely linked microsatellite loci. One and twolocus haplotypes were scored from PCR amplification products of each locus separately and both loci together. In a sample of 211 fish, there was a strong negative covariance in repeat number between the two loci, which suggests a mechanism that maintains the combined length below a constrained size. In addition, there were two clusters of the same combined haplotype length, one consisting of a long repeat array at one locus and a short array at the other and vice versa. We demonstrate that several models of biased mutation or natural selection, in theory, could generate this pattern of covariance. The common feature of all the models is the idea that tightly linked microsatellites do not evolve in complete independence, and that whatever size dependence there is to the process, it appears to “read” the combined size of the two loci.
THE utility of microsatellite repeats for making inferences about population history hinges on an understanding of the mutational processes that generate their remarkable diversity in populations. Several models have been proposed for the evolution of microsatellites (Goldsteinet al. 1995; Zhivotovsky and Feldman 1995; Pritchard and Feldman 1996; Zhivotovskyet al. 1997), but comparatively few experimental studies have quantified the evolutionary dynamics of these sequences (Estoupet al. 1995; Garzaet al. 1995; Schloettereret al. 1997; Schuget al. 1997; Field and Wills 1998). Two mutation processes that have been considered are the stepwise mutation model and the infinite alleles model. Both models succeed in explaining some features of these sequences, but the most likely process entails primarily singlerepeat changes, with occasional mutations of multiplerepeat steps (Valdeset al. 1993; Di Rienzoet al. 1994). In addition, several studies have suggested a biased tendency of microsatellite arrays to increase rather than decrease in size (Weber and Wong 1993; Rubinszteinet al. 1995; Amoset al. 1996), although this point remains in question.
Garza et al. (1995) studied the mutation and evolution of microsatellites by comparing the range and variability of the same loci in humans and chimpanzees. They concluded that there must be some constraints on these sequences based on the fact that the difference in average allele size between the two species was too small for their evolutionary distance, given the high mutation rate of microsatellites. They proposed that the possible explanations for their results were a biased mutation process, gene conversion, or natural selection. Slatkin (1995) and Goldstein et al. (1995) proposed metrics for computing distances between populations, but Takezaki and Nei (1996) and Feldman et al. (1997) showed that these metrics are informative only for closely related populations. Constraints on allele size lead to a plateau for sufficiently distantly related populations. Although it is possible to estimate distances in the face of such constraints, the species and locusspecificity of the constraints makes it exceedingly cumbersome to obtain prior empirical data on the limits of repeat numbers in each context.
In the process of studying the population structure of the Mediterranean sea bream, Sparus aurata, we isolated two arrays of tandemly repeated GT dinucleotides that are separated by only 75 nucleotides of unique sequence, each of which shows extensive allelic variation. We could amplify DNA from each gene separately and from the two together from each chromosome of an individual fish. This allowed quantification of onelocus and twolocus allele associations at the gametic and the zygotic levels. The patterns of associations observed provide useful insights into the processes that may constrain microsatellite length variation in a chromosomal region.
MATERIALS AND METHODS
Sampling procedure: Five geographically distinct samples of Mediterranean sea bream (S. aurata) were obtained:
IMBC1: wild animals collected in 1993 from several regions of the Greek seas (n = 32). These animals are maintained at the Institute of Marine Biology of Crete (IMBC) for experimental purposes.
G2: a sample collected in 1996 from the Mesolongi lagoon, Greece (n = 40).
I2: a sample collected in 1996 from the northern Adriatic, Italy (n = 40).
S4: a sample collected in 1996 from Alicante on the Mediterranean coast of Spain (n = 51).
S3: a sample collected from Cadiz in 1996, on the Atlantic side of Gibraltar (n = 48).
Samples were frozen as soon as possible after collection and transferred to the laboratory for tissue excision. DNA was extracted from frozen livers or from muscle preserved in 70% alcohol. In both cases, the DNA extraction protocol of Pogson and Zouros (1994) was followed.
PCR amplification: A battery of microsatellite markers for S. aurata has been developed by Batargias and colleagues (C. Batargias, E. T. Dermitzakis, A. Magoulas and E. Zouros, unpublished results). The two microsatellite markers used here are (GT)_{n} repeats that are separated by a 75bp unique sequence. They are designated as SA41a and SA41b, and the combined region is called SA41. Primers were designed to amplify each locus separately and both together (Figure 1; EMBL accession numbers Y17262 and Y17263). For the amplification of SA41a locus, primers pSA41Fa and pSA41Ra were used; for SA41b locus, primers pSA41Fb and pSA41b; and for both loci (composite PCR product), primers pSA41Rb and pSA41Fa. For the visualization of the PCR products, one primer for each amplification was endlabeled with [γ^{32}P] ATP. Primers pSA41Ra and pSA41Rb were labeled for loci SA41a and SA41b, respectively, and pSA41Fa was labeled for the combined twolocus product (SA41).
All reactions were performed in 0.2 ml PCR tubes, 10 μl reactions consisting of 1× PCR buffer (GIBCO BRL, Gaithersburg, MD), 0.6 μm of each of the two primers, 0.2 mm of each dNTP, 1 mm of MgCl_{2}, and 0.04 μm of the labeled primer, 0.25 units of Taq polymerase (GIBCO BRL), and about 10 ng of total genomic DNA. The conditions for each amplification were: 95° for 2 min (hot start) for 1 cycle, and 95° for 45 sec, 52° for 30 sec, 72° for 30 sec for 35 cycles, and 72° for 10 min at the end.
Gel electrophoresis and size scoring: PCR products were resolved in 6% polyacrylamide denaturing gels. The sequence of the phage plasmid M13 was used as a size marker to determine the genotype of certain individuals, which were subsequently used as size markers. To assure accuracy in sizing, the markers covered the full size range of the sampled PCR products.
Size scoring of PCR products from autoradiographs was performed at least three times for each case. Associated pairs of allele sizes (327) of the two loci (haplotypes) were inferred by the presence of their composite PCR product for most of the individuals, which was always equal to the sum of the single locus products minus the 45 nucleotides of overlap. In some cases (∼10% of the chromosomes analyzed) one of the two haplotypes of an individual was inferred by deduction, when only one of the composite PCR products was visualized (we assumed that the other pair was associated, although we could not see the composite PCR product). This method was used only in cases where the genotyping of the two loci was unambiguous. We inferred 157 genotypes (that correspond to 314 haplotypes) due to the fact that only one of the two composite products was scorable for 13 individuals.
We were able to verify the association of all the pairs of alleles obtained by PCRscoring for the IMBC1 sample (60 haplotypes: 18.3% of the total number of haplotypes analyzed), by scoring the genotypes of the offspring of experimental crosses between these individuals, and by observing the cosegregation of the alleles of the two loci to the next generation. In all 60 cases the associated pair observed in the offspring coincided with the associated pair inferred from the PCR assay.
Permutation tests: Several permutation tests were applied to assess the statistical significance of properties of the observed microsatellite variation. Initial inspection of the data suggested that the variance in the size of the composite PCR product was lower than that expected from random associations of the alleles of the two loci. This motivated a test designed to compare the observed variance of SA41 allele size, which also represents the variance in repeat number, with the variance of random draws of pairs of allele sizes of loci SA41a and SA41b. Specifically, we generated 1000 samples of 327 pairs of allele sizes by shuffling the observed allele sizes of one locus vs. the other. We also applied the same permutation test to compare the observed covariance of the dinucleotide repeat number in the two loci with that expected by chance, as suggested by Pritchard and Feldman (1996).
RESULTS
Allele and genotype frequencies: Mediterranean populations of S. aurata show no significant genetic heterogeneity with regard to allozyme and mitochondrial DNA variation (A. Magoulas, unpublished data; Magoulaset al. 1995, respectively). The null hypothesis of allele homogeneity for the five samples [using the GENEPOP software of Raymond and Rousset (1995)] was not rejected for locus SA41b (P = 0.248) but was rejected for SA41a (P = 0.0065). The large number of alleles at this locus (n = 43) and the relatively small sample sizes may have contributed to this heterogeneity. This heterogeneity has no effect on our analysis of interlocus associations, because spurious linkage disequilibrium may result from the pooling of samples only if the samples are heterogeneous in allele frequencies at both loci (Appendix by Prout in Mitton and Koehn 1973). We have therefore combined the five samples into one (see Appendix for raw data). The allele frequency distributions in this combined sample are given in Figure 2. In the combined sample, genotype frequencies do not deviate from HardyWeinberg equilibrium at either locus [P = 0.232 and P = 0.0564 for SA41a and SA41b, respectively, as tested by Fisher’s exact test of GENEPOP software, Raymond and Rousset (1995)].
Linkage disequilibrium: Although a direct estimate of recombination rate between the two loci has not been obtained, the spacing of just 75 nucleotides means that recombination must be very rare. An indication of the low rate of recombination is that we did not observe a single instance of the haplotype composed of the two most frequent alleles (SA41a^{99} and SA41b^{152}). Given this very tight linkage, linkage disequilibrium between alleles at two loci would be eroded mainly through mutation, which could generate multiple combinations of the same length at each of the two loci, yet these combinations would not be identical by descent.
Overall linkage disequilibrium among alleles at the two loci was found to be highly significant by the chisquare test (Raymond and Rousset 1995). The same result was obtained employing the chisquare test proposed by Weir [1979, using the POPGENE software by Yeh et al. (1997)]. With the above method (Weir 1979), we were able to identify haplotypes at the two loci that were either in excess or deficiency. Given the tight linkage, the observation of a high degree of linkage disequilibrium was not unexpected. However, we have further observed that there was a strong correlation between the combined length of alleles at the two loci and the probability that their combination will be in excess or deficiency. The highest values of the twoallele linkage disequilibrium (defined as D = P_{ij}  P_{i}P_{j}, where P_{i} is the frequency of allele i of locus SA41a, P_{j} is the frequency of allele j of locus SA41b, and P_{ij} is the frequency of the haplotype with allele i of SA41a and j of SA41b) were found to cluster in two areas of the plane defined by the allelic size of the two loci. More importantly, the two clusters have about the same combined length (Figure 3). Indeed, the two combinations that were in the highest excess were SA41a^{99}SA41b^{182} (combined length 236 nucleotides) and SA41a^{133}SA41b^{152} (combined length 240 nucleotides). Overall there was a strong negative correlation between the sizes of the alleles of the two loci (correlation coefficient r = 0.229, P < 0.001).
Permutation test for random association of SA41a and SA41b alleles: The negative correlation of individual allele length motivated a test that would compare the variance and the covariance of the observed data with 1000 replicates generated by random shuffling of the alleles of the one locus against the other, as described in the materials and methods. The distribution of variance in fragment size and the covariance (Figure 4) between allele size for the two loci obtained from the 1000 permutations in both cases had no overlap with the observed values, indicating that the probability of getting such an extreme value by chance was <0.001. The significant negative covariance could be generated by a number of distinct mechanisms, some of which were explored with computer simulations described below.
Simulations of models of tandem microsatellite evolution: We considered three models that might a priori be expected to generate negative covariance: recovery from a population bottleneck, mutation bias, and natural selection. We do not test the hypothesis of gene conversion proposed by Garza et al. (1995) because the same size was generated by different combinations of alleles of the two loci, thus giving a different sequence for each case. We simulated all three models as described below, and we compared them with the null model that assumes none of the above mechanisms.
Simulations of the models make use of the coalescent process for generation of gene genealogies without recombination (Hudson 1990; Valdeset al. 1993). Onto these neutral gene genealogies mutations are placed following a Poisson distribution with mean number of mutations μt for each branch of length t. If strictly stepwise mutations occur at both loci, and there are no range constraints, the simulations correctly produce zero (0) correlation (as indicated by the rightmost point in Figure 5A and the leftmost point in Figure 6). All three models employ the observed sample size of 327 pairs of associated alleles and use a mutation parameter of 4Nμ = 30 for both loci in the nonbottlenecked population. This value was chosen because it generated a distribution with mean variance equal to the observed variance under the null model (no bottleneck, mutation bias, or selection). If the mutation rate is 10^{4}, then 4Nμ = 30 corresponds to a population size of 75,000, which is at least plausible. We also tested some other values of 4Nμ (5, 10, and 60) for the mutation bias, the truncation stringency, and the natural selection models, and the qualitative results were robust over this range. A value of 4Nμ outside this range is not likely because of the observed level of variation of the microsatellites under study and the available estimates on S. aurata population size.
The first model assumed a population bottleneck early in the coalescent tree. The rationale behind this model is that negative covariance can be generated by chance only if a small number of haplotypes survive, whose allele sizes of the two loci are negatively correlated. We reduced the effective population size (N) to a certain fraction B, (“bottleneck factor”) of the former size, and kept the population size low for an arbitrary period from node 40 to node 320 (where the earliest node is indexed as node 1). After the bottleneck the population stepped back to the initial size. These changes in population sizes are modeled as changes in the intensity of the coalescence, which alters the branch lengths of the tree. In Figure 5A the correlation between the two arrays of repeats is illustrated for different severities of bottleneck. The striking observation is that there is no correlation in allele sizes even when the population is reduced to 1% of its current size. If the population is reduced to the point that only two lineages survive, then by chance those two could yield a negative covariance, but this degree of bottleneck is highly unlikely for S. aurata based on available data for mtDNA and allozymes (Magoulaset al. 1995; A. Magoulas, unpublished data). The distribution of covariance of allele size across replicate simulations is remarkably similar whether the population is bottlenecked or not (Figure 5B). The same result is obtained if the bottleneck is applied from node 3 to node 16 or from node 4 to node 32. In sum, the bottleneck model fails to explain the observed data.
In the second model, we assume that there is a bias to the mutation mechanism (during replication) such that mutations increase the size of the array with probability μ[^{1}/_{2}  α(L  T)], where μ is the mutation rate, α is the degree of bias, L is the sum of lengths of the alleles at the two loci, and T is the threshold size (Figure 6A). Under this model, when the combined alleles have a length <T, both repeats tend to mutate to a larger size, and when the combined allele size is >T, both repeats tend to mutate to a smaller size. Another form of mutation bias might occur after DNA replication. This model assumes that there is a postreplication scanning mechanism, which truncates one or the other locus when the summed size gets large. If the combined length of the two repeats is greater than threshold T, then with probability s (the “stringency” of the truncation), one or the other array is shortened by one repeat. As the stringency increases, the observed correlation in allele sizes becomes more negative (Figure 6B). The third model assumes that natural selection acts on summed array size (with long arrays having low fitness), together with a mutation bias that favors the increase of the repeat array. In all the simulations of this model, a bias of 2% was used, such that 52% of the mutations increased the array length and 48% decreased it. The fitness associated with each allele was w = 1  (L  Opt)s, where L is the summed length of the alleles at the two loci, Opt is the optimal size (which is the original size of the common ancestor), and s is the selection coefficient. Selection was modeled as a haploid process, which is equivalent to additive fitness effects in a diploid model. An approximation to haploid selection is made by having each node in the tree generate an array of descendants, each having fitness w. Descendants are drawn from the array with probability equal to their fitness. Again we see that this model can generate negative correlation in allele sizes (Figure 6C).
In sum, Figure 6 illustrates that the plausible parameters of models with mutational bias (during or after replication) or natural selection can produce patterns of negative correlation similar to what was observed.
Further empirical work, such as direct scoring of mutations, would be needed to discriminate among these mechanisms, but the common feature of all models is an interdependence of the changes at the two linked repeat arrays.
Permutation test for HardyWeinberg genotype frequencies: One test that may identify a marked variation in fitness of different SA41 genotypes is to determine whether the genotype frequencies correspond to those expected under random union of gametes. In particular, we want to know whether allele sizes that compose diploid genotypes are drawn at random. We addressed this question with two different tests. The negative covariance seen between alleles on a chromosome may extend to a nonrandom association of haplotypes in genotypes. Such a nonrandom association may occur if the genotypic fitness were affected by allele sizes in some way. A permutation test was used to test for departures of this type. We calculated the variance of the sum of the two haplotypes (the sum of the composite lengths SA41 of the two chromosomes) and the variance of their difference for all 157 genotypes whose allelic composition was unambiguous (see materials and methods). Then we drew random pairs of haplotypes by shuffling the observed haplotypes to produce 1000 sets of 157 genotypes, and we calculated the variance of the sum and difference for each set. We then generated the distribution of the variances and compared it with the observed. In both cases, the observed value of variance was placed in the core of the distribution (P >
0.3), indicating that the association of the haplotypes into diploid genotypes was random.
We also performed an exact test for HardyWeinberg equilibrium (Raymond and Rousset 1995) of the association of composite lengths, and equilibrium was not rejected (P > 0.05), suggesting a random association of haplotypes in genotypes.
DISCUSSION This study shows that two closely linked microsatellite arrays, whose repeat numbers might be expected to evolve independently from each other, do in fact behave in such a way that there is a “preferred” intermediate combined length. Permutation tests first established that there is a highly significant negative covariance between the repeat lengths for loci SA41a and SA41b. We consider three competing explanations for this pattern of variation, including population history (e.g., bottleneck), mutation bias, and natural selection.
In the first model (population history) we assume that the two most common combinations of alleles (SA41a^{99}SA41b^{182} and SA41a^{133}SA41b^{152}) represent two ancestral haplotypes at the SA41 locus whose predominance in the presentday populations of gilthead sea bream occur either because these were the haplotypes in the original population that evolved into S. aurata species, or because at some later time the species, as a whole, experienced a severe bottleneck through which these haplotypes were the ones to survive. One difficulty with this explanation is that the initial preponderance of the two alternative major haplotypes is assumed to have arisen by chance. While this is formally possible, our simulations show that it is very unlikely unless the bottleneck is much smaller than other evidence allows (Magoulaset al. 1995; A. Magoulas, unpublished data). Another difficulty with the assumption that the two haplotypes were the only ones present in an ancestral S. aurata population is that the mutation rate at microsatellite loci is large enough to have caused a decay of the temporarily dominant haplotypes. At the same time, the only way a bottleneck could explain the phenomenon is if it were very severe (reduction of the population to <1% for many generations), because only then would it be likely to have just two major haplotypes survive.
The other two models (natural selection and mutation bias) share the prediction that we should have observed more haplotypes with the same combined length but intermediate to the two clusters observed. Selective pressure on repetitive sequences was proposed by Charlesworth et al. (1986, 1994) and by Stephan (1989). Our selection model assumes that the individuals carrying alleles >236 have a selective disadvantage compared to smaller alleles. This is based on the fact that large stretches of repetitive DNA generally result in the instability of the region of DNA carrying these (e.g., FragileX in humans). The observation that the sampled zygotes are random combinations of haplotype lengths argues against this hypothesis, although deviations from the null hypothesis of no selection are expected to be seen only when there are strong viability differences among genotypes. Selection might be much weaker than could be detected by this HardyWeinberg test. However, by simulating a model with selection we concluded that, to obtain the negative correlation we observed, the selection coefficient s must be large. Therefore, unless SA41 has special properties, something not indicated by the random association of combined haplotype lengths in the genotypes, it seems unlikely that such strong selection is acting on this sequence. Thus, the selection hypothesis is less plausible as an explanation for the observed data, but it cannot be formally rejected.
The mutation bias model (before or after replication) assumes that when the combined length passes a threshold of repeats (which in this case may be a haplotype length close to 236), either the replication mechanism favors the decrease of the number of repeats, or there is another mechanism that truncates repeats after replication from one or the other locus. The only problem with this explanation is that it cannot explain the fact that only two clusters are mainly observed. On the contrary, we should have observed many different haplotypes with the same combined length.
However, based on the available data we can propose a possible model on how this clustering was generated. It is proposed, and supported by empirical data (Weber and Wong 1993; Rubinszteinet al. 1995; Amoset al. 1996), that mutations in microsatellite repeats are biased with a tendency to make the array grow larger. At the same time one can assume that if replication slippage is the main mechanism for this expansion, a long repeat allows more replication slippage events than a short one. This is indicated by the fact that repeat length is usually positively correlated with the repeat variance (Goldstein and Clark 1995). Starting with a haplotype composed of two short repeats, say of equal size, then by chance some of them will have an increase in SA41a and others in SA41b. These haplotypes would mutate with the larger one mutating faster than the smaller. When the lengthconstraint mechanism (mutation bias or natural selection) starts to act, it “eliminates” repeats randomly from one of the two arrays with equal probability. Therefore, the long array will lose repeats with the same rate as the short, but it will gain repeats faster than the short array. If this mechanism is allowed to act for long, it will generate two clusters of about the same size but at the two extremes (Figure 7).
Although natural selection cannot be rejected as a possible explanation for our results, mutation bias seems more plausible because it fits better the assumptions currently accepted for the evolution of microsatellite repeats. A mutation bias hypothesis is consistent with several studies suggesting constraints on the length of repeat arrays (Garzaet al. 1995; Feldmanet al. 1997; Goldstein and Pollock 1997; Zhivotovskyet al. 1997). Wierdl et al. (1997) observed several events of elimination of repeats from GT arrays in yeast when those were very long. In a recent study (Schuget al. 1998) the constraints of microsatellite length in different species were compared, and it appears that these constraints are speciesspecific or at least specific to some taxonomic groups. The very low observed microsatellite mutation rate in Drosophila melanogaster (Schuget al. 1997) shows that the mechanisms responsible for the mutation and evolution of tandem repeats are not completely random.
Whether our observation is a result of natural selection or mutation bias remains to be resolved in future studies of additional tightly linked microsatellite loci. The striking negative correlation in allele sizes of linked microsatellite repeats in S. aurata argues that the two loci are not evolving independently, and that either mutation processes or natural selection are driving the pattern of interlocus disequilibrium.
Acknowledgments
We thank Drs. G. Kotoulas, C. Saavedra, and A. Argyrokastritis for helpful discussions and ideas during this study and all the members of Dr. Zouros’s lab in Crete and Dr. Clark’s lab for their support. We also thank Drs. M. Kentouri, T. Patarnello, M. C. Alvarez, and J. P. Andrande for providing samples. We are also grateful to Dr. A. Civetta and B. Lazzaro for critically reading earlier versions of this manuscript and the two anonymous reviewers for their helpful comments. E.T.D. was supported by the Greek Foundation of State Scholarships. The project was supported by AIR3 (AIR CT 94 1926, funded by the European Union) to E.Z. and A.M.
Footnotes

Communicating editor: M. W. Feldman
 Received March 10, 1998.
 Accepted September 8, 1998.
 Copyright © 1998 by the Genetics Society of America