Diversity and Linkage of Genes in the Self-Incompatibility Gene Family in Arabidopsis lyrata
Deborah Charlesworth^a, Barbara K. Mable^2,a, Mikkel H. Schierup^b, Carolina Bartolomé^a, and Philip Awadalla^3,a
a Institute of Cell, Animal and Population Biology, University of Edinburgh, Ashworth Laboratories, Edinburgh EH9 3JT, United Kingdom
b Department of Ecology and Genetics, University of Aarhus, DK-8000 Aarhus C, Denmark
Corresponding author: Deborah Charlesworth, Animal and Population Biology, University of Edinburgh, Ashworth Laboratories, King's Bldgs., West Mains Rd., Edinburgh EH9 3JT, United Kingdom. E-mail: deborah.charlesworth@ed.ac.uk
Communicating editor: M. K. UYENOYAMA
| ABSTRACT |
|---|
We report studies of seven members of the S-domain gene family in Arabidopsis lyrata, a member of the Brassicaceae that has a sporophytic self-incompatibility (SI) system. Orthologs for five loci are identifiable in the self-compatible relative A. thaliana. Like the Brassica stigmatic incompatibility protein locus (SRK), some of these genes have kinase domains. We show that several of these genes are unlinked to the putative A. lyrata SRK, Aly13. These genes have much lower nonsynonymous and synonymous polymorphism than Aly13 in the S-domains within natural populations, and differentiation between populations is higher, consistent with balancing selection at the Aly13 locus. One gene (Aly8) is linked to Aly13 and has high diversity. No departures from neutrality were detected for any of the loci. Comparing different loci within A. lyrata, sites corresponding to hypervariable regions in the Brassica S-loci (SLG and SRK) and in comparable regions of Aly13 have greater replacement site divergence than the rest of the S-domain. This suggests that the high polymorphism in these regions of incompatibility loci is due to balancing selection acting on sites within or near these regions, combined with low selective constraints.
In Brassica, control of pollen-stigma interactions at the stigmatic interface involves highly polymorphic recognition genes of the self-incompatibility (SI) system. It is of interest to understand which regions of the proteins that these genes encode have recognition functions, how this affects the polymorphism in the coding sequence and surrounding genome regions, and how the two recognition genes maintain their coadaptation to produce functional incompatibility types. To understand the evolution of the self-incompatibility loci, it will be helpful to study them in the context of the gene families to which they belong. Doing this allows one to evaluate the possibility of exchanges between loci by gene conversion. It also makes it possible to compare sequence evolution of loci that are involved in incompatibility, and are thus under balancing selection, with similar sequences not under such selection.
The S-locus region contains loci belonging to two distinct gene families. "S-domain genes," members of the plant receptor-like protein-kinase gene family [...]
Further S-domain genes are known in Brassica and related plants [...] 40 members [...]
Studies of sequence diversity of Brassica S-locus genes have until recently concentrated on the SLG gene, but some data from the S-domains and a portion of the kinase domain of SRK of some haplotypes have been published [...]
Here we report results of population genetic studies of several S-domain loci in natural populations of A. lyrata. We characterize diversity at several different S-domain loci for comparison with SRK to establish whether SRK indeed has an unusually polymorphic S-domain, as expected for a gene under balancing selection. Balancing selection is not expected for S-domain genes that are not involved in SI (although they could have experienced other forms of selection, for instance, directional selection subsequent to gene duplication in the evolution of the gene family).
A. lyrata is a self-incompatible, predominantly diploid member of the Brassicaceae, but distantly related to Brassica [...] 0.2 to >1 without Jukes-Cantor correction (reviewed in [...])
As expected for the S-locus, Aly13 sequences are exceptionally polymorphic at both synonymous and replacement sites [...]
Data from other Aly loci also allow us to compare levels of selective constraint in different regions of the S-domain. It is often suggested that the hypervariable (HV) regions in the extracellular S-domain are the most important for recognition [...]
First, in regions of the protein where amino acid variants alter specificities, balancing selection will promote variation, so we expect high nonsynonymous diversity in the genomic sequences, as observed in both Brassica (e.g., [...])
A third important influence on diversity is that selective constraints may differ between different regions of the protein [...]
A final reason for studying other loci is that the analysis of sequence data to infer selection is complicated by population subdivision. To assess the effects of demographic and historical processes that can generate patterns that may be mistaken for evidence of selection, it is necessary to have reference loci that are not under strong balancing selection, but instead are evolving more or less neutrally. For example, genetic differentiation between populations can cause haplotype structure that may be difficult to distinguish from balancing selection unless other data are available to show the true situation (e.g., [...])
Here we describe analyses of diversity and selection at several A. lyrata S-domain loci and assess the implications for our understanding of balancing selection and its effects on sequence diversity within S-loci and in their genomic neighborhood.
| MATERIALS AND METHODS |
|---|
A. lyrata plant material and DNA preparation:
Seeds were collected from four populations of A. lyrata (see details in [...])
Primers, amplification, cloning, and sequencing:
S-domain primers:
Primers were designed on the basis of sequence alignments of Brassica SLG and SRK loci (Table 1) and used to amplify A. lyrata genomic DNA. Because SLG and SRK are members of a gene family, our initial primers were based on the most conserved regions of the Brassica S-domain and should amplify multiple A. lyrata S-domain genes, particularly those most similar to SRK. The S-domains of most Brassica oleracea and B. campestris SLG and SRK alleles have no introns [...]
Kinase domain reverse primers: To test whether each S-domain sequence had a kinase domain downstream from the S-domain, we used specific forward primers for the loci identified in A. lyrata with reverse primers based on either Brassica SRK locus kinase domains (srk4r and srk5r) or an A. lyrata SRK kinase sequence kindly provided by J. B. Nasrallah (srknasr1, srknasr4, and srknasr3; see Table 1).
Cloning and sequencing: Because some primers amplify more than one locus, and also because of the high variability of some of the putative loci (see below), PCR products of the expected size were generally cloned before sequencing [using the Invitrogen (San Diego) TOPO TA cloning kit]. To detect sequence variants and differentiate between loci (see below), the cloned amplification products were digested with four- and six-cutter restriction enzymes and fragments were separated electrophoretically.
Sequences were obtained using standard cycle sequencing protocols for the Applied Biosystems (Foster City, CA) model 377 sequencer with the Big Dye sequencing kit, either from clones using M13 universal primers or by direct sequencing using primers specific to the original amplified product. All Aly3, Aly9, Aly10.1, and Aly14 sequences were obtained by direct sequencing. Sequences of the more variable loci were obtained from cloned PCR products. In most cases, at least two clones were sequenced per individual to check for PCR errors. However, since this was not always done, some diversity values may be slightly overestimated, and an excess of singletons may have been produced; this does not affect our general conclusions. Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under POPSET accession nos. AY186752, AY186753, AY186754, AY186755, AY186756, AY186757, AY186758, AY186759, AY186760, AY186761, AY186762, AY186763, AY186764, AY186765, AY186766, AY186767, AY186768, AY186769, AY186770, AY186771, AY186772, AY186773, AY186774, AY186775, AY186776, AY186777. The full population sequence set can be obtained from the authors by request.
Sequence alignments and analyses:
Sequences were aligned using ClustalX [...]
To estimate nucleotide divergence between sequences, synonymous (Ks) and nonsynonymous substitutions per site (Ka) were calculated using the method of [...]
Nonsynonymous and synonymous diversity values (πa and πs) within species were estimated using a set of putative alleles of the different loci sequenced from a common small sample of individuals from the four populations. (Some individuals did not yield sequences for some loci, sometimes because the DNA sample was used up, so we were unable to obtain exactly the same samples for all loci; Table 5 shows the sample sizes.) The MEGA2 software was used for diversity estimation. Proseq v. 2.9 [...]
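As a rough illustration of what a per-site nucleotide diversity estimate (often written π) measures, the following minimal Python sketch computes the mean proportion of pairwise differences among aligned sequences. It is not the MEGA2 procedure used in this study: it pools all sites rather than separating synonymous from nonsynonymous ones, it simply skips gap-containing columns for each pair, and the function name and toy alignment are invented for the example.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    per_pair = []
    for a, b in combinations(seqs, 2):
        # Compare only columns where neither sequence has a gap.
        compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
        if not compared:
            continue
        diffs = sum(1 for x, y in compared if x != y)
        per_pair.append(diffs / len(compared))

    if not per_pair:
        raise ValueError("no comparable sites in any pair")
    return sum(per_pair) / len(per_pair)


# Hypothetical four-sequence alignment, 10 sites long.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
pi_toy = nucleotide_diversity(toy_alignment)
print(f"pi = {pi_toy:.3f}")
```

Applying the same function separately to synonymous and nonsynonymous sites would require a codon-aware site classification, which is deliberately left out of this sketch.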
Recombination in the Aly loci was tested by two types of analysis. Correlation analyses were done with the r2 program written by M. H. Schierup (http://www.brics.dk/~compbio/r2). Only segregating sites with frequencies >0.1 were included. Where only one or two sequences had gaps, the site was included (although the sequences with gaps were excluded from the analysis); otherwise, gap regions were excluded from all the sequences. Significance of the correlation coefficients of two measures of linkage disequilibrium with distance (r² or D') was determined using 5000 random permutations of the variable sites. The second analysis used the composite likelihood finite sites extension to HUDSON's (2000) method [...] ρ = 0 and ρ = 50. Significance against ρ = 0 was tested by 1000 random permutations. Minimum numbers of recombination events were estimated by HUDSON and KAPLAN's (1985) estimator using DNAsp v. 3.5 [...]
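The correlation analysis described above can be sketched as follows. This is a minimal Python illustration, not the r2 program itself: it assumes phased biallelic sites coded as 0/1, uses a Pearson correlation between r² and physical distance, permutes the site positions among the sites, and reports a one-sided P value for a decline of linkage disequilibrium with distance. All function names and the toy data are hypothetical.

```python
import random
from itertools import combinations


def r_squared(site_a, site_b):
    """r^2 between two segregating biallelic sites coded as lists of 0/1."""
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(1 for x, y in zip(site_a, site_b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def ld_distance_correlation(positions, sites):
    """Correlation between pairwise r^2 and pairwise distance."""
    pairs = list(combinations(range(len(sites)), 2))
    dist = [abs(positions[i] - positions[j]) for i, j in pairs]
    r2 = [r_squared(sites[i], sites[j]) for i, j in pairs]
    return pearson(dist, r2)


def permutation_p_value(positions, sites, n_perm=5000, seed=1):
    """One-sided P value: how often shuffled positions give a correlation
    at least as negative as the observed one."""
    rng = random.Random(seed)
    observed = ld_distance_correlation(positions, sites)
    shuffled = list(positions)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if ld_distance_correlation(shuffled, sites) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: 4 segregating sites scored in 8 haplotypes.
toy_positions = [10, 20, 300, 600]
toy_sites = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 1, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
obs, p_val = permutation_p_value(toy_positions, toy_sites)
print(f"correlation = {obs:.3f}, one-sided P = {p_val:.4f}")
```

A negative correlation that is rarely matched by the permuted data sets is what one would expect if recombination has broken down associations between distant sites.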
To test whether the parts of the S-domain sequences that are hypervariable in the Aly13 sequences also evolve unusually in the other A. lyrata S-domain genes, we compared the levels of divergence and polymorphism in different regions of Aly13 with diversity in the sequences of each putative locus. Following [...]
To test whether more changes have occurred in any of the A. lyrata Aly genes, compared with their orthologs in A. thaliana genes since their divergence from outgroup sequences, relative rates were evaluated using Tajima's one-parameter test [...]
Tests for linkage between the A. lyrata self-incompatibility locus and the S-domain loci:
To test Aly S-domain loci for linkage to the self-incompatibility locus, we used full-sib families in which the incompatibility groups of progeny plants had been determined by hand-pollinations between individuals or in which Aly13 genotypes had been determined, so that it was known that one or both parent plants were Aly13 heterozygotes (see [...])
| RESULTS |
|---|
Amplification of S-domain sequences:
Several combinations of primers designed to match conserved regions within the first exon (the S-domain) of the Brassica S-gene family (see MATERIALS AND METHODS) were used in the initial screening of A. lyrata genomic DNA for S-domain sequences. Amplifications from a single individual from Scotland yielded PCR products of the size predicted from Brassica S-gene sequences. Five different sequence types were initially identified using the six-cutter restriction enzymes EcoRI, HindIII, and BamHI, and these were sequenced (Aly7, Aly8, Aly9, Aly10, and Aly13; [...])
BLAST searches of these sequences showed homology to Brassica and A. thaliana S-domain loci (see below), and they could all readily be aligned with Brassica S-allele sequences and with members of the S-domain gene family in A. thaliana. The portions of the S-domains sequenced (see Fig 2) in all the Aly loci identified, with the exception of Aly7 and some of the Aly10.1 alleles (see below), are open reading frames. There are no stop codons in any of the S-domains, and all indels, including the few that are polymorphic within loci (see Fig 2), are multiples of three nucleotides. No introns were found in any of the A. lyrata S-domain sequences.
Design of specific primers for S-domain sequence types and evidence that they represent different loci:
Given the very high diversity of the A. lyrata Aly13 sequences, which we ascribe to a single incompatibility locus, as explained above, it is important to investigate in detail the extent and nature of the S-domain gene family to check that the Aly13 sequences truly come from a single locus. We therefore used sequence variants to help classify the S-domain sequences into sets belonging to different loci.
Specific primer pairs were designed for individual sequences [...]
A. thaliana orthologs of the Aly genes and structure of the loci:
Comparing the Aly sequences with A. thaliana S-domain receptor kinase genes (Table 2), we can identify probable orthologs for five genes (Fig 1). For three loci (Aly3, Aly7, or Aly9), we could not identify kinase domains by PCR (see MATERIALS AND METHODS); for brevity, we refer to these as "nonkinase domain" sequences, although it is possible that a kinase domain exists but was not detected. No orthologs can be identified for two of these loci, Aly3 and Aly7. For Aly9, AtS1 is a potential ortholog (see Table 2 for accession numbers). This is the probable ortholog of SLR1 in Brassica [...]
We tentatively identify orthologs of Aly8, Aly10.1, and Aly10.2 as the three kinase domain loci Ark3, Ark1, and Ark2, respectively (Table 2). These three A. lyrata genes have quite similar S-domains, which amplify with the same forward primers (see Table 1), and kinase domains were detectable for all three. A possible ortholog of Aly14 (which we have not tested for the presence of a kinase domain) is an anonymous kinase domain sequence, which we denote by At14 (contig accession AL161566.2|ATCHRIV66, position 170101).
For the putative orthologous pairs, silent site divergence from A. thaliana for the S-domain ranges from 18 to 30%, and replacement site divergence is between 3 and 12% (see Table 2). Divergence values in the three kinase domain sequence types were similar (based on the coding sequence of exons 1–7, synonymous and nonsynonymous divergence values between Aly8 and Ark3 were 0.34 and 0.049, respectively; for Aly10.1 vs. Ark1, the values were 0.22 and 0.07, and for Aly10.2 vs. Ark2, 0.19 and 0.065). The values for both the S- and kinase domains are higher than values for most orthologous sequence comparisons between these two species ([...] 30% after Jukes-Cantor correction is unlikely for true orthologs). The high divergence for Aly8 is in part due to its diversity within A. lyrata (discussed in more detail below).
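For readers unfamiliar with the Jukes-Cantor correction mentioned here, a minimal Python sketch of the standard one-parameter formula, d = -(3/4) ln(1 - 4p/3), is given below. The inputs 0.18 and 0.30 are used purely as illustrative proportions of differing sites; whether the divergence values in Table 2 are raw or already corrected is not stated in this section, so the example should not be read as a recalculation of the published figures.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance for an observed proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1 - 4 * p / 3)


# Illustrative inputs only (hypothetical uncorrected proportions).
for raw in (0.18, 0.30):
    print(f"raw {raw:.2f} -> corrected {jukes_cantor(raw):.3f}")
```

The correction grows quickly as p approaches 0.75, which is why heavily saturated synonymous sites give unreliable distance estimates.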
Fig 2 summarizes the structure of the S-domain sequence types identified in A. lyrata and their putative A. thaliana orthologs within the regions that were sequenced for the eight putative A. lyrata loci. The inferred amino acid sequences all share the 12 cysteine residues present in the Brassica SLG, SRK, and SLR S-domain sequences of other Brassicaceae [...]
Aly7 sequences:
As mentioned previously, some Aly7 sequences have a single base-pair insertion (at position 751 within a region of four TA repeats; see Fig 3, "Aly7(+)" sequences). This disrupts the reading frame and creates a downstream stop codon. A 9-bp insertion is also present in this set of sequences, relative to the in-frame, Aly7(-) sequences (Fig 3). Overall, 26% out of a total of 27 Aly7 sequences classified (either by sequencing or by amplifying with primers specific to each haplotype; see Table 1) were of the 7(+) type. These sequences could represent a separate locus or could be allelic to the other Aly7 sequences. The sequences with and without the insertion form two haplotypes. There is significant linkage disequilibrium between the two types, between sites separated by 550 nucleotides, and between closer sites (Fig 3). This might suggest two distinct loci, but there are few pairs of sites for which linkage disequilibrium is complete, so the sequences appear to have recombined. Alternatively, interlocus gene conversion may have occurred, so this does not conclusively rule out the possibility of two loci. Nucleotide similarity is otherwise high between the two sequence types. Removing the insertion from the Aly7(+) sequences, the mean divergence from the other Aly7 sequences is 0.017 for synonymous sites, and slightly higher (0.019) for nonsynonymous sites, suggesting that the sequences are evolving neutrally. However, net divergence is very small, given the diversity within the Aly7(-) sequences. The sequence diversity of the Aly7(+) sequences is lower than that of Aly7(-), and only slightly lower than the divergence between the two, and again the sequences appear to be evolving neutrally (among Aly7(+) sequences, πs = 0.0098 and πa = 0.0107). Tajima's D is significantly negative (D = -1.65, P < 0.05) for the Aly7(+) sequences, but is not significant for the 7(-) sequences. Although relationships within loci were not well resolved (Fig 1), no evidence was found for separation of the Aly7(+) and Aly7(-) sequence types in the gene tree.
If the Aly7 gene is duplicated, both sequences should be detectable in all individuals, but this is not the case. Aly7(+) sequences have been found in only three of the four populations studied (two in the North Carolina population, one in the Indiana population, and three in the Scottish population). This suggests a single locus or else a duplication that is absent from the Iceland population. Consistent with the single-locus hypothesis, we find plants with both sequence types (apparent heterozygotes) as well as apparent homozygotes, and three individuals heterozygous for two different 7(-) sequences had no sign of sequences with the frame-altering 7(+) insertion. Finally, if the two haplotypes represent alleles, the haplotypes should segregate in the progeny of the apparent heterozygotes. Using primers specific for the two different haplotypes to score 11 progeny (family 99E-10) of such a plant (98E17-4), crossed with an Aly7(-) homozygote (98E17-6), 5 were apparent heterozygotes and 6 apparent homozygotes (-/-); i.e., we find the expected 1:1 ratio. We therefore conclude that the Aly7(+) sequences are probably null alleles of the same locus as the Aly7(-) sequences. In the further analyses below, the Aly7(+) sequences containing the frameshift are omitted.
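The fit of the 5:6 progeny counts to a 1:1 ratio can be checked with an exact binomial test. The Python sketch below is one simple way to do this and is not a calculation reported by the authors; the function name is hypothetical, and the two-sided P value is defined here as the total probability of all outcomes no more likely than the observed one.

```python
from math import comb


def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial P value (sum over outcomes no more likely
    than the one observed)."""
    probs = [
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = probs[successes]
    # Small relative tolerance guards against floating-point ties.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# 5 apparent heterozygotes among the 11 progeny scored in family 99E-10.
p_value = binomial_two_sided_p(5, 11)
print(f"two-sided P = {p_value:.3f}")
```

With only 11 progeny the test has little power, so a good fit to 1:1 is compatible with allelism but cannot by itself exclude other explanations.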
Aly10.1 sequences:
Four types of Aly10.1 alleles have been found, and they are shown in Fig 2. Relative to the type "A" sequences, "B1" sequences have a 227-bp deletion beginning 99 bp from the end of the S-domain and leaving only 7 bp of intron 1 and an in-frame stop codon 5 bp before the deletion. "B2" sequences have a further deletion of 25 bp starting at bp 654, which changes the reading frame, while "B3" sequences have a 223-bp deletion starting at bp 446 (which also changes the reading frame). The A allele type, which presumably encodes a functional protein, is the commonest (77% overall, out of 44 alleles sequenced) and is present in all four populations studied, whereas B1 and B2 alleles were seen only in the U.S. populations, and B3 only in the Scottish population. Apart from a single B1/B2 plant, all individuals had at least one allele of type A. No evidence of grouping by alleles was found in the gene tree analysis (Fig 1).
Tests for linkage between the A. lyrata self-incompatibility locus and the S-domain loci:
We tested for linkage of the Aly S-domain loci and the self-incompatibility locus, using families whose parents were heterozygous for one or more Aly loci (see MATERIALS AND METHODS). Linkage between Aly13 variants and the S-locus in both sibships, and in several other families, has already been reported [...]
For the sibship 98E-15 (see [...])
For Aly8, however, some variants showed linkage. Table 4 shows another sibship in which both parents were double heterozygotes and in which the Aly8 variants again cosegregated with Aly13 sequences. Linkage of variants was detected in several other sibships by scoring Aly13 variants known from our previous work to show linkage to the S-locus (Aly13-4, -5, -9, -13, -16, and -22 from several different natural populations). These results for Aly8 are consistent with the fact that Ark3, its A. thaliana ortholog, is linked to the putative SRK ortholog of this species, which is a pseudogene [...]
To examine further the possibility of paralogous loci, we aligned all our Aly8 sequences to test whether they cluster into two sets with fixed differences between them. However, we found no evidence for any such haplotype structure in the complete sequence data set. Moreover, we could not identify variants that characterize the set of linked or the set of recombining sequences from several families. In other words, there are no sites in linkage disequilibrium that allow us to define site states characteristic of the two putative loci and that might allow us to distinguish the loci on the basis of their sequences. This is also clear in Fig 1, in which there is no evident split into two Aly8 types.
Finally, there is linkage disequilibrium between the Aly8 and the Aly13 loci. We studied a set of Aly8 sequences that cosegregate with various incompatibility alleles of independent origins (scored using restriction enzyme digestion of Aly13 PCR products to determine the Aly13 sequence types). Four Aly13 types were represented twice in the sample. There were a total of 38 nucleotide sites polymorphic among the Aly8 sequences in this sample. Between pairs of Aly8 sequences from the four pairs of haplotypes whose Aly13 sequences match, the mean proportion of these variable sites that differ was 7% (range 0–17%). Between Aly8s from haplotypes with different Aly13s, the differences were much greater (the mean proportion of difference in the 21 comparisons is 41%, and the range is 20–71%). The similarities between the most similar Aly8 sequences are underestimated, because some of the differences may be PCR errors (these sequences are from cloned PCR products, as this highly polymorphic locus cannot be directly sequenced).
Within- and between-population variability of the different Aly loci:
Within-population and total diversity:
We estimated sequence variability of seven S-domain Aly loci, at least two of which are not closely linked to the S-locus (see above). Table 5 summarizes the results for each of the four populations studied, as well as for the total sample. The mean within-population synonymous site diversity values (πs) are mostly <1%. πa/πs values, based on the within-population diversity values (Table 5), are mostly rather high for the loci studied here and for the Aly13 locus (although the very high Aly9 value is based on few variable sites). The extremely high diversity for Aly8 (species-wide πs = 7.5% and πa > 1%) is partly, but not entirely, attributable to the fact that this sequence type could represent more than one locus, as just explained. This is discussed further below.
Between-population diversity:
When all sites were used in the analysis, tests for spatial structure [...]
Recombination and linkage disequilibrium:
In an effort to test whether the putative alleles of each of the loci identified from the sequences are truly allelic, we tested for recombination in the S-domain loci other than Aly13. (Aly13 was excluded because its polymorphism levels are high and balancing selection is likely, which violates the assumptions of the analysis.) Except for the Aly9 locus, which has little variability, both tests used suggest recombination (or some other form of exchange) in all the putative loci (Table 6). In the Aly8 sequences, many exchange events are detected using HUDSON and KAPLAN's (1985) estimator, even though these sequences probably come from at least two loci (see above). This suggests the possibility of gene conversion between the different loci.
Tests for selection within A. lyrata:
Tajima's tests:
Tajima's D statistic [...]
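For reference, Tajima's D compares the mean number of pairwise differences with the number of segregating sites scaled by a sample-size constant. The following Python sketch implements the standard formula on a gap-free alignment; it is an illustration with an invented toy data set, not the software used for the tests reported here, and it does not compute a significance level.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned sequences, using only gap-free columns."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    columns = [[s[i] for s in seqs] for i in range(len(seqs[0]))]
    columns = [c for c in columns if "-" not in c]

    seg = sum(1 for c in columns if len(set(c)) > 1)  # segregating sites S
    if seg == 0:
        raise ValueError("no segregating sites")

    # Mean number of pairwise differences (k-hat).
    n_pairs = n * (n - 1) / 2
    total = sum(
        sum(1 for c in columns if c[i] != c[j])
        for i, j in combinations(range(n), 2)
    )
    k_hat = total / n_pairs

    # Sample-size constants from the standard definition of D.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (k_hat - seg / a1) / sqrt(e1 * seg + e2 * seg * (seg - 1))


# Hypothetical sample in which every variant is a singleton.
toy = ["AAAA", "TAAA", "ATAA", "AATA"]
d_toy = tajimas_d(toy)
print(f"Tajima's D = {d_toy:.3f}")
```

An excess of rare variants, as in the toy sample, pulls D below zero, whereas an excess of intermediate-frequency variants (as expected under balancing selection or population subdivision) pushes it above zero.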
Patterns of evolution in the S-domain sequences:
Replacement site polymorphism in the putative A. lyrata self-incompatibility locus, Aly13, is significantly higher in the regions corresponding to the Brassica SRK and SLG hypervariable regions than in the rest of the sequence [...]
However, the low polymorphism at most of these loci (see above) makes it difficult to detect differences in diversity among different sequence regions. We therefore also estimated nonsynonymous divergence among the A. lyrata paralogs and among the three loci with putative orthologs in A. thaliana (Aly8, -10.1, and -10.2; see Table 2). Divergence for regions corresponding to the Brassica SLG and SRK and Aly13 HV regions was compared with divergence elsewhere in the S-domain sequences (see MATERIALS AND METHODS for the positions assigned to these regions). Among A. lyrata paralogs, the regions that correspond to non-HV regions of the S-domain accumulate fewer substitutions per nonsynonymous site than do those corresponding to HV regions (the mean for HV is 60% higher than that for non-HV regions); synonymous divergence is saturated and comparisons are not informative for such sites. Of the 15 comparisons, 14 show HV nonsynonymous divergence greater than non-HV divergence (Fig 4, open and shaded bars, respectively; the difference is significant with P < 0.0005 by a paired sign test, although it must be realized that the tests are not independent). Thus either these regions are under lower selective constraint than the remainder of the sequence or directional selection has caused divergence in these regions, specifically, in the different loci. There is no such clear effect between the Aly loci and their A. thaliana orthologs (Fig 4, solid bars).
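The sign test on the 15 comparisons can be reproduced with a few lines of Python. The sketch below computes the exact tail probability of observing 14 or more comparisons in the same direction under a null probability of 0.5; the function name is hypothetical. The one-tailed value is consistent with the reported bound of P < 0.0005, although whether a one-tailed test was actually used is an inference from that match rather than something stated in the text.

```python
from math import comb


def sign_test_one_sided(n_positive, n_total):
    """P(X >= n_positive) for X ~ Binomial(n_total, 0.5)."""
    tail = sum(comb(n_total, k) for k in range(n_positive, n_total + 1))
    return tail / 2**n_total


# 14 of 15 comparisons show HV divergence greater than non-HV divergence.
p_one_sided = sign_test_one_sided(14, 15)  # (15 + 1) / 2**15
p_two_sided = min(1.0, 2 * p_one_sided)
print(f"one-sided P = {p_one_sided:.6f}, two-sided P = {p_two_sided:.6f}")
```

As the text notes, the comparisons share loci and are therefore not independent, so either P value overstates the strength of the evidence.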
MCDONALD-KREITMAN (1991) tests did not detect evidence of directional selection driving divergence between paralogous loci specifically in the HV regions (although this test indicated a significant excess of nonsynonymous polymorphic sites in the non-HV region of Aly3, when compared with divergence from several of the other loci; the reason for this result is unknown, but there is no evidence for balancing selection as this locus does not have high diversity). The relative rate tests also give no indication of any overall deviation from equal rates of evolution of the orthologous pairs of genes since divergence (data not shown). Overall, we conclude that the fairly high Ka/Ks values in the S-domains and the high nonsynonymous divergence in HV regions are due largely to low selective constraints, rather than to diversifying selection.
| DISCUSSION |
|---|
The S-domain gene family:
The S-domain loci studied here clearly form part of an ancient gene family with members in all other angiosperms tested, including distantly related species such as maize [...]
There is no evidence for major birth and death of members of this gene family between A. lyrata and A. thaliana, since most loci can be identified in both species. However, duplication of the pollen-expressed SCR gene was found in one of the two haplotypes studied by [...]
Pseudogenes:
Pseudogene S-domain genes have been found in the Brassica S-locus region [...]
The Aly10.1 sequences containing deletions may also represent a pseudogene. Our diversity analysis included the different types of alleles of this locus, excluding the deletion regions, and nonsynonymous diversity was low, as was the πa/πs ratio (see Table 5), suggesting that loss of function occurred recently. It therefore seems most likely that the B1, B2, and B3 alleles (see Fig 2) are null alleles.
Other examples of polymorphic null alleles are known, sometimes at frequencies as high as those found for the Aly7 and Aly10.1 sequences (e.g., [...]) [...] πa/πs ratio might suggest a locus that has lost function or is in an early stage of doing so and is evolving neutrally; but this is not certain, since similar or even higher πa/πs values are found for other Aly loci (see Table 5). Loss of function could also explain the higher diversity in Aly7 than in most of the other loci. We are unable to compare divergence of the nonfunctional and potentially functional Aly7 alleles, since no A. thaliana ortholog can be identified. The absence of an ortholog is, however, consistent with this gene being a nonfunctional duplicate in A. lyrata.
Levels and patterns of diversity:
The S-domain loci studied here have a range of nucleotide diversity values, including widely differing silent site diversity. Lack of evidence for balancing selection and only moderate diversity levels are also reported for the Brassica SLR1, SLA, and SLB loci, which are not linked to the incompatibility locus (although sample sizes are very small; [...]) [...] π-value for all nucleotide sites of 0.1%; the total diversity, including three different populations, was 0.38% [...]
The difference between the reference loci studied here and the otherwise similar S-domain Aly13 sequences (the putative A. lyrata S-locus) therefore supports the view that the Aly13 silent and amino acid diversity is unusually high due to the maintenance of the polymorphism of incompatibility alleles. Moreover, variation at the Aly13 locus is similar in all populations, and Kst does not differ significantly from zero. This is as expected for loci experiencing balancing selection [...]

[...] is close to some of the Aly13 alleles. Predicted relationships to putative orthologs from A. thaliana for several other sequence types [...]

