## Abstract

We developed a general model of sporophytic self-incompatibility under negative frequency-dependent selection allowing complex patterns of dominance among alleles. We used this model deterministically to investigate the effects on equilibrium allelic frequencies of the number of dominance classes, the number of alleles per dominance class, the asymmetry in dominance expression between pollen and pistil, and whether selection acts on male fitness only or both on male and on female fitnesses. We show that the so-called “recessive effect” occurs under a wide variety of situations. We found emerging properties of finite population models with several alleles per dominance class such as that higher numbers of alleles are maintained in more dominant classes and that the number of dominance classes can evolve. We also investigated the occurrence of homozygous genotypes and found that substantial proportions of those can occur for the most recessive alleles. We used the model for two species with complex dominance patterns to test whether allelic frequencies in natural populations are in agreement with the distribution predicted by our model. We suggest that the model can be used to test explicitly for additional, allele-specific, selective forces.

THE population genetics of plant species with sporophytic self-incompatibility (SSI) are notoriously difficult to study both empirically and theoretically because of the complex dominance relationships occurring among alleles. Wright (1939) developed a theory for gametophytic self-incompatibility (GSI), a genetic system to avoid self-fertilization involving recognition between a protein expressed in the haploid pollen and two codominantly expressed pistil proteins, and identified negative frequency-dependent selection as the major evolutionary force promoting allelic diversity. According to Wright's theory, selection under GSI is symmetric among alleles, and the only relevant feature of an allele is its current population frequency. In SSI, however, the incompatibility phenotypes of pollen and pistils are determined by the diploid genotypes of the paternal and maternal plants, respectively, and are governed by complex dominance interactions among alleles that introduce asymmetrical selection among alleles (Bateman 1952; Schierup *et al.* 1997). A general understanding of the population genetics of SSI has also been difficult because different authors investigated different, often nonoverlapping model representations of SSI, sometimes with unstated assumptions, so that their outcomes have been difficult to compare (Table 1). Differences among models comprise: (1) frequency-dependent selection acting on male fitness only (corresponding to Wright's assumption) *vs.* selection on both male and female fitnesses, named “fecundity selection” in Vekemans *et al.* (1998); (2) expression of dominance in both pollen and pistils *vs.* codominance in pistil and dominance in pollen; and (3) occurrence of at most one allele per dominance class along a hierarchical ladder of dominance *vs.* allowing several alleles per dominance class.

Previous analyses of deterministic models of SSI have shown the following:

1. Recessive alleles should reach higher equilibrium frequencies than more dominant alleles, the so-called “recessive effect” (Bateman 1952; Sampson 1974). The reason is that negative frequency-dependent selection tends to homogenize the frequencies of the phenotypic classes (the “isoplethy” hypothesis). Because of dominance, recessive alleles can be present in more phenotypic classes than are dominant alleles, and they thus reach higher total frequencies (Cope 1962).

2. Within a given class of dominance, individual allelic frequencies should be inversely related to the number of alleles in that class, the so-called “small number effect” (Sampson 1974). This arises because within a given dominance class, alleles are selectively equivalent and are thus expected to reach identical frequencies at equilibrium (as in completely symmetric models of balancing selection such as GSI). In contrast to expectation 1 above, recessive alleles, if present in higher number, could thus potentially reach lower overall equilibrium frequency than dominant alleles.

3. The recessive effect is substantially less pronounced in models where dominance is acting in pollen (with a single allele per dominance class) and all alleles are codominant in the pistil (domcod model) as compared to models with identical expression of dominance in pollen and pistil (dom model) (Charlesworth 1988; Schierup *et al.* 1997). In the latter, the strength of the recessive effect was found to increase with an increasing number of dominance classes whereas the reverse was true for the domcod model (Schierup *et al.* 1997).

4. The properties of the domcod model denoted in expectation 3 are no longer valid when selection acts on both the male and the female fitnesses (*i.e.*, when the availability of compatible pollen is limited). Under such a selection scenario, the properties of the domcod and dom models are qualitatively very similar (Vekemans *et al.* 1998).

5. A peculiar behavior was observed for alleles in the recessive class in a domcod model with two classes of dominance and no selection on female fitness. In many instances where two alleles were introduced in the recessive class, only a single allele could be maintained deterministically (Uyenoyama 2000). Also, with a single recessive allele, the total frequency of dominant alleles was increasing with the number of alleles, varying from ∼0.5 to 0.8 when increasing from 2 to 50 dominant alleles in the population.

Detailed theoretical investigations of SSI in finite populations have been rather scarce and did not model cases with more than one allele per dominance class. Imrie *et al.* (1972) showed that only a very low number of alleles at the S locus (S alleles) would be maintained in a small population under domcod SSI, but that recurrent migration would lead to a substantial increase in the number of alleles maintained. Schierup *et al.* (1997) compared different models of SSI with or without dominance, with dominance represented as a linear hierarchy with a single allele per dominance level. The total number of alleles maintained in finite populations was found to decrease with increasing expression of dominance and was interpreted as the result of decreasing overall strength of selection in the presence of recessive alleles that are not expressed in heterozygous genotypes (Schierup *et al.* 1997). The domcod model showed idiosyncratic evolutionary dynamics of alleles, with most dominant alleles having a higher rate of successful invasion whereas most recessive alleles had a higher rate of loss, leading to continuous evolution toward an increasing absolute level of dominance within populations (Schierup *et al.* 1997). The introduction of selection on female fitnesses was found to override this running-over process (Vekemans *et al.* 1998).

Although data on patterns of dominance and distribution of allelic frequencies in natural populations are accumulating slowly, application of the theoretical results to actual data has remained difficult because extant patterns of dominance relationships among alleles at SSI loci are typically more complex than the over-simplified dominance relationships assumed in the models (*e.g.*, Stevens and Kay 1989; Kowyama *et al.* 1994; Mehlenbacher 1997). However, testing predictions from models on the distribution of allelic frequencies within and among populations is necessary to assess the occurrence of frequency-dependent selection, to test whether selection is acting on male *vs.* female fitnesses, and to detect the occurrence of additional selective forces, potentially differing among alleles. Additional components of selection could arise, for instance, as a consequence of the association of S-allele lineages to different sets of deleterious alleles at linked loci (the “sheltered load”; Uyenoyama 1997) or of the occurrence of partial overlap among pairs of allelic specificities leading to selection to avoid false rejection of nonself pollen (Richman 2000; Chookajorn *et al.* 2004). Although the sheltered load hypothesis was initially suggested for GSI systems, it is potentially of major interest in SSI, with the difference that its strength is expected to vary among dominance classes as recessive alleles may occur as homozygotes (Bechsgaard *et al.* 2004).

Here, we present a flexible model of SSI that allows deterministic computations for any number of dominance classes with any number of alleles per dominance class and full definition of dominance relationships in pollen and pistil between each pair of alleles. Our model also allows specifying whether selection acts on male fitness only or both on male and on female fitnesses. The model allows the computation of expected equilibrium frequencies as well as single-generation genotypic changes in cases with complex patterns of dominance previously determined from nature. We use the model to explore situations with more than two dominance classes and different numbers of alleles per class, both in deterministic models and in finite populations. In both types of models we estimate the relative frequencies of alleles from the most recessive and the most dominant classes. We also estimate frequencies of homozygote genotypes as they are of special interest in the context of the sheltered load hypothesis. In finite populations we monitor the number of alleles per dominance class and the number of dominance classes maintained. We discuss how our results extend the current knowledge about SSI and permit explicit detection of non-frequency-dependent selection components in species with SSI.

## MODEL AND METHODS

We consider a sporophytic self-incompatibility system determined by *n* specificities in a diploid hermaphrodite species. The phenotype of an individual bearing specificities *i* and *j* depends on the patterns of expression (relative dominance) of their associated alleles at the pollen and pistil self-incompatibility genes. We suppose that a given allele encodes for a single specificity and a given specificity is encoded by a single allele. An *i*-specific pollen protein is produced when allele *S*_{i} is expressed in the anthers, whereas an *i*-specific protein is produced at the pistil surface when the *S*_{i} allele is expressed in the pistil. The pollen and pistil phenotypes of an individual *S*_{i}*S*_{j} can then be interpreted as the relative proportions of proteins *i* and *j* produced in anthers and pistils, respectively. Note that a given plant may express different specificities in pistils and anthers if the dominance relationships between co-occurring alleles are not the same in the reproductive structures. Moreover, an individual may express two different specificities if the alleles it carries are partially codominant in a reproductive structure, possibly at different intensities in the case of partial dominance. A cross is considered incompatible when pollen and pistil proteins bearing the same specificity come into contact, provided there is enough of them in both reproductive structures for recognition to occur.

In the following, we computed the expected genotypic frequencies under any scheme of dominance relationships among *n* alleles and two contrasting models of frequency-dependent selection:

1. Frequency-dependent selection occurring through male and female reproductive structures (FDS_{m/f}), also called fecundity selection by Vekemans *et al.* (1998).

2. Frequency-dependent selection occurring only through male reproductive structures (FDS_{m}), first proposed by Wright (1939). The main hypothesis of this model is that all maternal plants produce the same number of offspring without regard for the quantity of compatible pollen they receive.

#### Dominance relationships:

We define α and ϕ, two square matrices of dimension *n* × *n*, respectively containing the dominance relationships for all pairs of alleles in anthers and pistils, where *n* is the number of different specificities actually present in the population. The elements on row *i* and column *j* in these matrices are α_{ij} = 1 − α_{ji} and ϕ_{ij} = 1 − ϕ_{ji}, respectively the dominance level of allele *S*_{i} over allele *S*_{j} in pollen and pistil, with 0 ≤ α_{ij} ≤ 1 and 0 ≤ ϕ_{ij} ≤ 1 for all {*i*, *j*}. For instance, if α_{ij} = 1, *S*_{i} is fully dominant over *S*_{j} in anthers and if α_{ij} = 1/2, *S*_{i} and *S*_{j} are codominant. More generally, if α_{ij} > 1/2, *S*_{i} is partially dominant over *S*_{j} in anthers.

#### From genotypes to phenotypes:

We denote *A*_{ij} and *P*_{ij}, respectively, the anther and pistil phenotypes of an individual with genotype *S*_{i}*S*_{j}. Phenotypes are defined as vectors of dimension *n* as follows: *A*_{ij} = (*x*_{1}, …, *x*_{n}) with *x*_{u} = 0 for *u* ∉ {*i*, *j*}, *x*_{i} = α_{ij} and *x*_{j} = α_{ji} = 1 − α_{ij} if *i* ≠ *j*, and *x*_{i} = 1 if *i* = *j*. We can define in the same way the phenotype of the pistil: *P*_{ij} = (*x*_{1}, …, *x*_{n}) with *x*_{u} = 0 for *u* ∉ {*i*, *j*}, *x*_{i} = ϕ_{ij} and *x*_{j} = ϕ_{ji} = 1 − ϕ_{ij} if *i* ≠ *j*, and *x*_{i} = 1 if *i* = *j*. Typically, *A*_{ij} and *P*_{ij} have the form (0, …, *x*_{i}, …, *x*_{j}, …, 0), which can be interpreted as the proportion of *u*-specific protein produced in anthers and pistil of an individual *S*_{i}*S*_{j}, for all *u* ∈ {1, …, *n*}.

#### Cross compatibilities:

This definition of the phenotype is convenient because a cross between a pollen from an *S*_{i}*S*_{j} plant and the pistil of an *S*_{k}*S*_{l} plant is compatible if *A*_{ij}*P*_{kl}^{T} ≤ σ, with 0 ≤ σ ≤ 1 a specified threshold value, and the superscript T indicates the transpose of vector *P*_{kl}. We denote *p*_{ijkl} the variable that takes the value 1 if the cross between a pollen from an *S*_{i}*S*_{j} plant and the pistil of an *S*_{k}*S*_{l} plant is compatible and 0 if it is not; in other words, *p*_{ijkl} = 1 if *A*_{ij}*P*_{kl}^{T} ≤ σ or else *p*_{ijkl} = 0.

A cross is compatible in four cases:

1. If *i* ≠ *k*, *i* ≠ *l*, *j* ≠ *k*, and *j* ≠ *l*, then *A*_{ij}*P*_{kl}^{T} = 0 and the cross is compatible.

2. If *i* = *k*, *i* ≠ *l*, *j* ≠ *i*, and *j* ≠ *l*, then *A*_{ij}*P*_{il}^{T} = α_{ij}ϕ_{il}, and the cross is compatible if α_{ij}ϕ_{il} ≤ σ. If we specify σ = 0 then the cross is possible only if *S*_{i} is fully recessive relative to *S*_{j} or *S*_{l} or both.

3. Symmetrically, if *j* = *l*, *i* ≠ *k*, *j* ≠ *i*, and *j* ≠ *k*, then *A*_{ij}*P*_{kj}^{T} = α_{ji}ϕ_{jk}, and the cross is compatible if α_{ji}ϕ_{jk} ≤ σ.

4. If *i* = *k*, *j* = *l*, and *i* ≠ *j*, then *A*_{ij}*P*_{ij}^{T} = α_{ij}ϕ_{ij} + α_{ji}ϕ_{ji}.

Although this last cross involves two plants with the same genotype *S*_{i}*S*_{j}, it may be compatible if dominance is expressed differently in anthers and pistil; for example, if *S*_{i} is fully dominant over *S*_{j} in anthers and fully recessive in pistil, then we have α_{ij} = 1 and ϕ_{ij} = 0, and therefore *A*_{ij}*P*_{ij}^{T} = 0 and the cross is compatible.
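The phenotype vectors and the compatibility rule described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' Mathematica notebook: the function names `phenotype` and `compatible`, the 0-based allele indices, and the example matrices are assumptions introduced here.

```python
import numpy as np

def phenotype(i, j, dom):
    """Phenotype vector of genotype S_i S_j for a dominance matrix `dom`
    (alpha for anthers, phi for pistils). Alleles are 0-indexed here."""
    x = np.zeros(dom.shape[0])
    if i == j:
        x[i] = 1.0               # a homozygote expresses its single specificity
    else:
        x[i] = dom[i, j]         # share of i-specific protein
        x[j] = 1.0 - dom[i, j]   # share of j-specific protein
    return x

def compatible(pollen, pistil, alpha, phi, sigma=0.0):
    """Return 1 if pollen from genotype `pollen` can fertilize `pistil`, else 0."""
    a = phenotype(*pollen, alpha)
    p = phenotype(*pistil, phi)
    return int(a @ p <= sigma)

# Hypothetical example 1: three alleles in a linear hierarchy (allele 2 most dominant),
# expressed identically in anthers and pistils.
alpha_dom = np.array([[0.5, 0.0, 0.0],
                      [1.0, 0.5, 0.0],
                      [1.0, 1.0, 0.5]])

# Hypothetical example 2: S_0 dominant in anthers but recessive in pistils.
alpha_flip = np.array([[0.5, 1.0], [0.0, 0.5]])
phi_flip = np.array([[0.5, 0.0], [1.0, 0.5]])
```

With `alpha_dom` used for both organs, `compatible((0, 1), (0, 1), alpha_dom, alpha_dom)` is 0 because both plants express specificity 1, whereas with the flipped matrices a cross between two plants of genotype (0, 1) is compatible, as in the example given in the text.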

#### Genotypic frequency change:

We denote *f*_{ij} the frequency of genotype *S*_{i}*S*_{j} in the population, *f*′_{ij} the frequency of genotype *S*_{i}*S*_{j} in the next generation, and *F* = [*f*_{ij}] a symmetric matrix. We assume an infinite population of diploid hermaphrodite individuals producing an infinite number of pollen and ovules. We also assume that the probability that a given pistil receives a given type of pollen depends only on the frequency of the latter in the pollen pool. We denote *w*_{ijkl} the fraction of seeds produced by a cross between *S*_{i}*S*_{j} pollen and *S*_{k}*S*_{l} ovules among all seeds. The fraction of seeds produced by a cross between genotypes *S*_{i}*S*_{j} and *S*_{k}*S*_{l} is *w*_{ijkl} + *w*_{klij}. Hence, the frequency of seeds *S*_{i}*S*_{k} produced by a cross between genotypes *S*_{i}*S*_{j} and *S*_{k}*S*_{l} is (*w*_{ijkl} + *w*_{klij})/4 if *k* ≠ *l* and *i* ≠ *j*, (*w*_{ijkl} + *w*_{klij})/2 if *k* = *l* or *i* = *j*, and (*w*_{ijkl} + *w*_{klij}) if *k* = *l* and *i* = *j*. Finally, the frequencies of heterozygotes *S*_{i}*S*_{j} and homozygotes *S*_{i}*S*_{i} after one generation are obtained by summing these contributions over all pairs of parental genotypes (Equation 1). The value of *w*_{ijkl} for any two genotypes depends on their population frequencies, on the compatibility indicators *p*_{ijkl}, and on the chosen regime of frequency-dependent selection: FDS_{m/f} or FDS_{m}.

Under FDS_{m/f}, a pistil *S*_{k}*S*_{l} receives compatible pollen with a probability equal to the overall sum of the frequency of compatible pollen, Σ_{u≤v} *p*_{uvkl}*f*_{uv}. Hence, if compatible pollen is rare (pollen limitation), a given plant may produce fewer seeds and selection may occur through both male and female reproductive functions. Pollination of a pistil *S*_{k}*S*_{l} by a pollen *S*_{i}*S*_{j} occurs with probability *f*_{ij}*f*_{kl} and fertilization with probability *p*_{ijkl}*f*_{ij}*f*_{kl}. Hence, the contribution of a cross between a pollen *S*_{i}*S*_{j} and a pistil *S*_{k}*S*_{l} to the next generation under FDS_{m/f} is

$$w_{ijkl} = \frac{p_{ijkl}\, f_{ij}\, f_{kl}}{\sum_{u \le v} \sum_{r \le s} p_{uvrs}\, f_{uv}\, f_{rs}}. \tag{2}$$

Under FDS_{m}, every plant receives enough compatible pollen such that all plants produce the same quantity of seeds and selection occurs through the male reproductive function only. The total frequency of compatible pollen with an *S*_{k}*S*_{l} pistil is Σ_{u≤v} *p*_{uvkl}*f*_{uv}, such that ovules of an *S*_{k}*S*_{l} plant are fertilized by pollen from an *S*_{i}*S*_{j} plant with probability

$$\frac{p_{ijkl}\, f_{ij}}{\sum_{u \le v} p_{uvkl}\, f_{uv}}. \tag{3}$$

According to FDS_{m}, a cross between a pollen *S*_{i}*S*_{j} and a pistil *S*_{k}*S*_{l} thus contributes to the next generation in proportion

$$w_{ijkl} = f_{kl}\, \frac{p_{ijkl}\, f_{ij}}{\sum_{u \le v} p_{uvkl}\, f_{uv}}. \tag{4}$$

Overall the frequency of allele *S*_{i} in the next generation thus equals *f*′_{i} = *f*′_{ii} + ½ Σ_{j≠i} *f*′_{ij}, where *f*′_{ii} and *f*′_{ij} can be obtained from Equation 1, and where values for *w*_{ijkl} can be obtained from either Equation 2 or Equation 4 under FDS_{m/f} or FDS_{m}, respectively.
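The difference between the two selection regimes can be made concrete with a short numerical sketch. It assumes that genotypes are stored in a flat list, that `p[a, b]` is the compatibility indicator for pollen genotype `a` on pistil genotype `b`, and that every pistil genotype has some compatible pollen (otherwise the FDS_{m} normalization divides by zero). The function name `seed_fractions` and the toy compatibility matrix are hypothetical.

```python
import numpy as np

def seed_fractions(p, f, regime="mf"):
    """w[a, b]: fraction of all seeds sired by pollen genotype a on pistil genotype b.

    p : (G, G) array of 0/1 compatibility indicators (pollen in rows, pistils in columns)
    f : (G,) array of genotype frequencies
    regime : "mf" for selection through both sexes, "m" for male function only
    """
    crosses = p * np.outer(f, f)          # compatible pollinations, before normalization
    if regime == "mf":
        return crosses / crosses.sum()    # pollen-limited pistils set fewer seeds
    compat_pollen = p.T @ f               # compatible pollen available to each pistil genotype
    return crosses / compat_pollen        # every pistil genotype is fully fertilized

# Toy example (not from the paper): each genotype rejects only its own pollen.
p_toy = 1.0 - np.eye(3)
f_toy = np.array([0.5, 0.3, 0.2])
w_mf = seed_fractions(p_toy, f_toy, "mf")
w_m = seed_fractions(p_toy, f_toy, "m")
```

Under the male-only regime the column sums of `w_m` equal the maternal genotype frequencies, which is the "same number of offspring per maternal plant" assumption; under the other regime common maternal genotypes with little compatible pollen contribute less.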

#### Simple cases:

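Three kinds of dominance relationships have typically been used in the literature (Schierup *et al.* 1997; Vekemans *et al.* 1998): cod, where all alleles are codominant in pistil and pollen; dom, where alleles occur in a linear dominance hierarchy, identical in both pistil and pollen (two alleles from the same dominance class are codominant and each allele of a given dominance class is either strictly dominant or strictly recessive relative to the alleles of another dominance class); and domcod, where alleles follow the dom model in pollen and the cod model in pistil. Each of these three models can be conveniently represented with our notations, using the following dominance relationship matrices α and ϕ (here represented for three dominance classes and six alleles, two alleles in each class, with **C** the matrix of codominance and **H** the matrix of hierarchical dominance):

$$\mathbf{C} = \begin{pmatrix} \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 \\ \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 \\ \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 \\ \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 \\ \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 \\ \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 \end{pmatrix}, \qquad \mathbf{H} = \begin{pmatrix} \tfrac12 & \tfrac12 & 0 & 0 & 0 & 0 \\ \tfrac12 & \tfrac12 & 0 & 0 & 0 & 0 \\ 1 & 1 & \tfrac12 & \tfrac12 & 0 & 0 \\ 1 & 1 & \tfrac12 & \tfrac12 & 0 & 0 \\ 1 & 1 & 1 & 1 & \tfrac12 & \tfrac12 \\ 1 & 1 & 1 & 1 & \tfrac12 & \tfrac12 \end{pmatrix},$$

$$\text{cod: } \alpha = \phi = \mathbf{C}; \qquad \text{dom: } \alpha = \phi = \mathbf{H}; \qquad \text{domcod: } \alpha = \mathbf{H},\ \phi = \mathbf{C}. \tag{5}$$

Hence, for instance, allele *S*_{1} in the domcod model is codominant with allele *S*_{2} only (they are in the same dominance class) and recessive relative to all other alleles in pollen, while it is codominant with all alleles in pistil. Here, we perform computations mainly for two kinds of dominance relationships, dom and domcod, for which we state for clarity that class 1 is always the most recessive class. It is, however, possible to numerically compute genotypic frequencies at equilibrium for any dominance relationships between alleles (see below). A Mathematica (Wolfram Research 2004) notebook performing those calculations is available from S. Billiard.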
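For readers who want to experiment with other class structures, the three schemes described above can be generated programmatically. The sketch below is an assumption-laden illustration: the function name `dominance_matrices`, the 0-based class ordering (class 0 most recessive), and the value 1/2 on the diagonal (implied by α_{ii} = 1 − α_{ii}) are choices made here.

```python
import numpy as np

def dominance_matrices(class_sizes, model):
    """Return (alpha, phi) for model in {"cod", "dom", "domcod"}.

    class_sizes lists the number of alleles per dominance class,
    ordered from the most recessive class to the most dominant one.
    """
    # Class label of each allele, e.g. [2, 2, 2] -> [0, 0, 1, 1, 2, 2]
    cls = np.repeat(np.arange(len(class_sizes)), class_sizes)
    n = cls.size
    cod = np.full((n, n), 0.5)
    # 1 where the row allele is in a more dominant class, 0 where it is in a
    # more recessive class, 0.5 within a class
    hierarchy = 0.5 + 0.5 * np.sign(cls[:, None] - cls[None, :])
    if model == "cod":
        return cod, cod.copy()
    if model == "dom":
        return hierarchy, hierarchy.copy()
    if model == "domcod":
        return hierarchy, cod
    raise ValueError(f"unknown model: {model}")

alpha, phi = dominance_matrices([2, 2, 2], "domcod")
```

For three classes of two alleles, `alpha[0, 1]` is 0.5 (same class), `alpha[0, 2]` is 0 (allele 0 is recessive to allele 2), and every entry of `phi` is 0.5.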

#### Deterministic equilibrium frequencies:

According to Equations 1, 2, and 4, the frequency change for a given genotype depends on the genotypic frequencies matrix *F*. We thus computed the frequency of all genotypes after reproduction, during which frequency-dependent selection occurs, using Equations 1 and 2 under FDS_{m/f} or Equations 1 and 4 under FDS_{m}. Deterministic equilibrium genotypic and allelic frequencies were computed by iterating these equations until the frequency change in one generation was <10^{−6} for all genotypes. The initial genotypic frequencies are the same for all genotypes and set as the inverse of the total number of possible genotypes, *n*(*n* + 1)/2.
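The iteration to equilibrium can be sketched end to end as follows. This is a minimal reimplementation under stated assumptions, not the authors' notebook: genotypes are unordered pairs with 0-based allele indices, offspring genotypes follow Mendelian segregation (each parent transmits either allele with probability 1/2), every pistil genotype is assumed to have some compatible pollen under FDS_{m}, and all function names are hypothetical.

```python
import itertools
import numpy as np

def build_model(alpha, phi, sigma=0.0):
    """Precompute genotypes, compatibility indicators p, and segregation tensor T."""
    n = alpha.shape[0]
    genos = list(itertools.combinations_with_replacement(range(n), 2))
    index = {g: a for a, g in enumerate(genos)}

    def phenotypes(dom):
        X = np.zeros((len(genos), n))
        for a, (i, j) in enumerate(genos):
            if i == j:
                X[a, i] = 1.0
            else:
                X[a, i], X[a, j] = dom[i, j], dom[j, i]
        return X

    # p[a, b] = 1 if pollen of genotype a is compatible with a pistil of genotype b
    p = (phenotypes(alpha) @ phenotypes(phi).T <= sigma).astype(float)
    # T[a, b, g] = probability that a cross a x b yields offspring genotype g
    T = np.zeros((len(genos),) * 3)
    for a, b in itertools.product(range(len(genos)), repeat=2):
        for x, y in itertools.product(genos[a], genos[b]):
            T[a, b, index[tuple(sorted((x, y)))]] += 0.25
    return genos, p, T

def next_generation(f, p, T, regime):
    """One generation of frequency-dependent selection on genotype frequencies f."""
    w = p * np.outer(f, f)
    w = w / w.sum() if regime == "mf" else w / (p.T @ f)
    return np.einsum("ab,abg->g", w, T)

def equilibrium(alpha, phi, regime="m", tol=1e-6, max_gen=100_000):
    """Iterate from equal genotype frequencies until all changes are below tol."""
    genos, p, T = build_model(alpha, phi)
    f = np.full(len(genos), 1.0 / len(genos))
    for _ in range(max_gen):
        f_new = next_generation(f, p, T, regime)
        if np.max(np.abs(f_new - f)) < tol:
            break
        f = f_new
    return genos, f_new

def allele_frequencies(genos, f, n):
    """Allele frequencies from genotype frequencies (each copy counts for 1/2)."""
    q = np.zeros(n)
    for (i, j), fg in zip(genos, f):
        q[i] += fg / 2
        q[j] += fg / 2
    return q
```

As a usage example, passing a three-allele linear hierarchy as both `alpha` and `phi` (the dom model) and calling `allele_frequencies` on the result gives a higher frequency for the most recessive allele than for the most dominant one, in line with the recessive effect.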

#### Allele number in the most recessive class:

This analysis was motivated by the unexpected result of Uyenoyama (2000), who showed that, under FDS_{m} and domcod dominance relationships with two dominance classes, two alleles could not coexist in the recessive class except when more than two alleles occur in the recessive class or two or fewer alleles occur in the dominant class, in which case all recessive alleles can be maintained. We determined whether this result still holds for models with more than two dominance classes, for dom and domcod dominance relationships and under both FDS_{m/f} and FDS_{m}. For a given set of parameters, we first ran computations until deterministic equilibrium was reached (see previous paragraph). Since all genotypic frequencies are equal at the beginning of the deterministic computations, no alleles are lost deterministically during this step. We then perturbed the system by randomly changing the frequency of all genotypes (multinomial sampling of genotypic frequencies with 1000 trials) and let the population evolve deterministically for an additional 10,000 generations. The deterministic equilibrium after the perturbation did not depend on the sampling, so we performed the procedure once for each parameter set. We specifically investigated whether (1) allelic frequencies came back to the same equilibrium values as before the perturbation (we considered that the equilibrium frequencies before and after the perturbation were identical if the difference between them was <10^{−6}) or (2) some alleles tended to disappear. We performed computations for 3, 4, 5, 6, and 10 dominance classes, with one, two, or three alleles per dominance class. Only some combinations of these parameters have been examined (see supplemental data at http://www.genetics.org/supplemental/ for the complete list of parameter combinations tested).
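The perturbation step and the before/after comparison can be sketched as below. The helper names `perturb` and `same_equilibrium` are hypothetical, and the deterministic iteration between the two steps is left to whatever recursion routine is in use.

```python
import numpy as np

def perturb(f_eq, trials=1000, rng=None):
    """Multinomial resampling of genotype frequencies (1000 trials in the text)."""
    rng = np.random.default_rng() if rng is None else rng
    counts = rng.multinomial(trials, f_eq)
    return counts / trials

def same_equilibrium(q_before, q_after, tol=1e-6):
    """True if every allelic frequency returned to within tol of its former value."""
    return bool(np.all(np.abs(np.asarray(q_before) - np.asarray(q_after)) < tol))

# Illustrative frequencies only (not results from the paper)
f_eq = np.array([0.5, 0.3, 0.2])
f_perturbed = perturb(f_eq, rng=np.random.default_rng(0))
```

Alleles whose frequency decays toward zero after the perturbation, instead of returning to the previous equilibrium, are the ones that cannot be maintained deterministically.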

#### Selection strength:

Because frequency-dependent selection acts on genotypes rather than alleles and since an infinite array of genotype frequencies is compatible with a given set of allelic frequencies, measuring the strength of selection for a given allele is not an easy task. In general, the strength of selection can be defined as a function of the frequency change in a generation. When a genotype departs from its equilibrium frequency, negative frequency-dependent selection is expected to bring frequencies back to equilibrium. To measure selection strength for allele *i* in a synthetic way, we compared its frequency change in a generation normalized by the genetic variance (*s*_{i}) for a given deviation from its equilibrium frequency (*d*_{i}), where a superscript asterisk refers to equilibrium values. Practically, random deviations were obtained by sampling 500 diploid individuals from a population at deterministic equilibrium for cases with two alleles per dominance class. Values of *d*_{i} and *s*_{i} were computed for every simulated population and the procedure was repeated 100,000 times. The strength of selection for allele *i* was measured as the slope of the linear regression between *d*_{i} and *s*_{i}.

#### Stochastic simulations:

For simulations in finite populations with *N* diploid individuals, we used a three-step life cycle:

1. Gametogenesis, syngamy, and seed production: Frequency-dependent selection occurred during this step and we used Equations 1, 2, and 4 to compute the genotypic frequencies in the seed pool (assuming that individuals produce an infinite number of seeds).

2. Drift: Given the genotypic frequencies in the seed pool after frequency-dependent selection, we randomly sampled *N* seeds to constitute the next generation of adults.

3. Mutation: The number of mutation events occurring in a generation was drawn from a Poisson distribution with mean 2*N*μ, with μ the mutation rate. We used a *K*-allele model (KAM); *i.e.*, we randomly drew for each mutation event one of the 2*N* genes, whose allelic state was equiprobably changed to one of *K* − 1 other possible states, irrespective of dominance classes.
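The drift and mutation steps of this life cycle can be sketched as follows, taking the seed-pool genotype frequencies from the selection step as given. The function names, the representation of adults as an (*N*, 2) integer array, and the parameter values in the example are assumptions made for illustration.

```python
import numpy as np

def drift(genos, seed_freqs, N, rng):
    """Sample N adults from the seed pool; each row holds the two allele indices."""
    idx = rng.choice(len(genos), size=N, p=seed_freqs)
    return np.array(genos)[idx]

def mutate(adults, mu, K, rng):
    """K-allele model: Poisson(2*N*mu) events, each hitting one random gene copy."""
    adults = adults.copy()
    N = adults.shape[0]
    for _ in range(rng.poisson(2 * N * mu)):
        ind, copy = rng.integers(N), rng.integers(2)
        # Adding a shift of 1..K-1 modulo K picks uniformly among the K-1 other states
        adults[ind, copy] = (adults[ind, copy] + rng.integers(1, K)) % K
    return adults

# Illustrative values only (not the parameters used in the paper)
rng = np.random.default_rng(1)
genos = [(0, 0), (0, 1), (1, 1)]
adults = drift(genos, [0.2, 0.5, 0.3], N=50, rng=rng)
mutated = mutate(adults, mu=0.1, K=4, rng=rng)
```

After mutation the two alleles in a row are no longer guaranteed to be sorted, so genotype counts should be taken on sorted pairs before the next selection step.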

To compute the expected number of alleles in finite populations, simulations were performed for dom and domcod dominance relationships, under both selection regimes, with three dominance classes. We fixed the number of possible allelic states to *K* = 18 (6 per class) and *K* = 30 (10 per class) for *N* = 100 and *N* = 500, respectively. Those limits were fixed after some preliminary simulations showing that for these population sizes, a higher number of alleles at a given time was never or very rarely reached. Some alleles could be lost by drift or introduced in the population by mutation, and thus *n* varies, but at any given time *n* ≤ *K*. Simulations were also performed with a higher number of possible allelic states, but the results were quantitatively similar (not shown). Simulations were started with *K*/3 alleles in each class, so that the total number of alleles at the beginning was *n* = *K*, and the initial genotypic frequencies were set at deterministic equilibrium without mutation as described above. We ran simulations for 10,000 generations to reach a drift–mutation–selection equilibrium where the mean total number of alleles over time remained stable. We then recorded the number of alleles in each class every generation for 100,000 generations and computed the probability that a given number of alleles was present in a given class during the whole process. An allele was counted if its frequency was >0.01 after drift. We performed 100 independent replicates of the whole process and we present the mean over replicates. Since the process is stationary, that is to say the mean and the variance of the number of alleles in the population do not change with time, our estimation of the expected number of alleles should not be biased.

## DETERMINISTIC DYNAMICS AND EQUILIBRIUM

#### Model verification:

The deterministic equilibrium frequencies obtained using our model were compared to the results available in the literature for models with FDS_{m} with only one allele in each class (Schierup *et al.* 1997), *n* alleles in two classes under domcod (Uyenoyama 2000), *n* alleles in three classes under domcod (Sampson 1974), and *n* alleles in four classes under dom (Schierup *et al.* 2006) as well as with FDS_{m/f} with 1 allele in each class (Vekemans *et al.* 1998). Results were identical in all cases, thus confirming that the proposed formalization is a correct generalization of all previous SSI models.

#### Deterministic dynamics:

Figure 1 shows the dynamics of allelic frequencies under dom and domcod dominance patterns with three dominance classes and FDS_{m/f} or FDS_{m} selection regimes. Although equilibrium was reached more quickly under FDS_{m/f} than under FDS_{m} for the dom model, equilibrium allelic frequencies were essentially the same under both selection regimes. Under domcod, in contrast, the equilibrium values differed greatly between the two selection regimes: the difference between dominance classes was higher under FDS_{m/f} than under FDS_{m}. Despite the large difference in the frequencies at equilibrium under domcod, the equilibrium values were reached in approximately the same number of generations under both models of selection. This is because selection occurred through male and female fitnesses under FDS_{m/f}; consequently, although the difference between initial and equilibrium frequencies was greater under FDS_{m/f} than under FDS_{m}, selection was also stronger and the approach toward equilibrium frequencies was faster.

#### The recessive effect:

We investigated the recessive effect in models with different numbers of dominance classes and numbers of alleles per class, using both selection regimes. We quantified this effect with either the frequency of a single allele from a given class or the total frequency of all alleles of that class. Figure 2 shows that the equilibrium frequency of a given allele decreases as the dominance level of its class increases, under both dom and domcod dominance patterns and for both selection regimes. The effect was, however, weak under domcod with FDS_{m}. Interestingly, equilibrium allelic frequencies were almost identical between the two selection regimes under the dom model, which generalizes the results obtained by Vekemans *et al.* (1998) in a simpler model with only one allele per class. In the domcod model, in contrast, equilibrium allelic frequencies were sharply different between FDS_{m} and FDS_{m/f}, with a substantially higher difference between the most recessive and the most dominant allelic frequencies under FDS_{m/f}.

Figure 2 also reveals that the equilibrium frequency of an allele of a given class increases together with the total number of classes. This effect arises as a consequence of fixing the total number of allelic states (the small number effect; Sampson 1974). In such a case, the number of alleles per class varies and the total frequency of a class is then equally shared among all alleles of that class. Moreover, the total frequency of a class depends on the total number of alleles considered. For a given number of classes, an increase in the total number of alleles resulted in an increase in the total frequency of the most dominant classes, together with a decrease in the total frequency of the most recessive classes (Table 2).

In spite of the inherent asymmetry among alleles in SSI caused by dominance, the frequency unevenness (FU), as measured by the ratio of the total frequency of the most recessive class to that of the most dominant class, was very low in some cases, especially under FDS_{m} and domcod dominance patterns (Vekemans *et al.* 1998 showed this result for one allele per class). Indeed, in this case FU remained as low as 1.03 when the 20 alleles considered belonged to two dominance classes only. Yet, several factors had an impact on FU. For a given set of parameters in Table 2, FU under FDS_{m/f} was on average 35.5% higher than under FDS_{m}. Moreover, as revealed by Table 2, the number of dominance classes strongly influenced how even allelic frequencies remained among classes. FU generally increased with the number of classes, reaching overall frequencies for the most recessive class 5.03 times higher than the overall frequencies for the most dominant class with 20 alleles evenly distributed among 10 dominance classes. Table 2 also shows that when the number of alleles per class increases, FU decreases, under both dominance patterns and both selection regimes. This is because the higher the number of alleles, the lower the difference in selection strength between alleles of different dominance classes, since the number of compatible genotypes is increased for all genotypes. Hence FU tends toward 1 when the total number of alleles increases. Conversely, when there is only one allele in the most recessive class and there are several alleles in the more dominant classes, the recessive effect may be large (Table 3).
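Computing FU from a vector of equilibrium allelic frequencies is a one-line ratio once alleles are grouped by class. In the sketch below the function name `frequency_unevenness` and the example frequencies are hypothetical; the numbers are illustrative and are not taken from Table 2.

```python
import numpy as np

def frequency_unevenness(allele_freqs, class_sizes):
    """Ratio of the total frequency of the most recessive class to that of the
    most dominant class; class_sizes is ordered from most recessive to most dominant."""
    cls = np.repeat(np.arange(len(class_sizes)), class_sizes)
    totals = np.bincount(cls, weights=allele_freqs, minlength=len(class_sizes))
    return totals[0] / totals[-1]

# Two classes of two alleles each, with made-up frequencies
fu = frequency_unevenness(np.array([0.3, 0.3, 0.2, 0.2]), [2, 2])
```

A value of 1 means that the most recessive and most dominant classes have the same total frequency; larger values indicate a stronger recessive effect at the class level.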

These results confirmed the generality of the recessive effect when there is a linear dominance hierarchy between alleles in pistil and pollen (dom) or in pollen only (domcod), for models with a larger number of dominance classes and higher numbers of alleles per class than had previously been considered, as well as under the two selection regimes considered.

#### Homozygote frequencies:

Because of the dominance relationships among alleles in SSI, homozygotes can be formed for alleles belonging to all classes but the most dominant one. As revealed by Table 3, homozygotes for the most recessive allele can reach high equilibrium frequencies even when the total number of alleles is high, and they may represent an important proportion of the individuals carrying the most recessive allele. The homozygote frequency is higher under dom than under domcod for both selection regimes and is lower under FDS_{m} than under FDS_{m/f}. Table 3 also shows homozygote frequencies for alleles of class 2 when there are three dominance classes. Homozygotes for class 2 alleles reach much lower equilibrium frequencies than homozygotes for the class 1 allele.

## ALLELE NUMBER BY DOMINANCE CLASS

#### Allele number in the most recessive class:

Uyenoyama (2000) showed that under FDS_{m} and domcod dominance patterns with two dominance classes, very restrictive conditions are required for two alleles to be deterministically maintained in the most recessive class. We used our general model to extend this investigation and explore the conditions under which several alleles can be deterministically maintained in the most recessive class when there are more than two dominance classes for both selection regimes, under dom and domcod dominance relationships (all parameter combinations examined and results are available as supplemental data at http://www.genetics.org/supplemental/).

Under FDS_{m/f}, two or more alleles in the most recessive class were always maintained and had identical frequencies at equilibrium under both the dom and the domcod dominance relationships. The same was true for the dom model under FDS_{m}. In contrast, under FDS_{m} in the domcod model, several alleles could be maintained in the most recessive class only if more than two alleles occurred in that class. When exactly two alleles occurred in the most recessive class, they could still be maintained if (1) there was a single allele in each other dominance class and/or (2) there was a single allele in each class but exactly two alleles in the most dominant class (3 classes), in one of the 3 most dominant classes (4, 5, and 6 classes), or in one of the 4 most dominant classes (10 classes). In all other cases, notably with two alleles in the most recessive class and three alleles in at least one of the other classes, the recessive allele with the lowest frequency ultimately disappeared. Altogether, our results thus suggest that the occurrence of a single allele in the most recessive class is a specific property of the FDS_{m} regime when combined with the domcod dominance relationships.

#### Number of alleles maintained in finite populations:

Simulations with three dominance classes, under domcod dominance relationships and for both selection regimes in populations of 100 and 500 diploid individuals, showed that the distributions of the number of alleles for dom under FDS_{m/f} and FDS_{m} were similar to those for domcod under FDS_{m/f} (Figure 3). The average numbers of alleles under FDS_{m/f} and FDS_{m} for dom and domcod are given in Table 4.

Interestingly, the number of alleles maintained in a class increased with dominance. Under FDS_{m/f}, there was a high probability of observing only one allele in the most recessive class, but this probability decreased with increasing population size. When comparing both dominance models under FDS_{m/f}, we found that the number of alleles maintained in each class was higher under domcod than under dom (Table 4).

When the population size was large enough, more than one allele could frequently be maintained in the most recessive class under FDS_{m/f} and domcod (Table 4). Under FDS_{m}, however, at most one allele was expected in the most recessive class, with many cases where all alleles from this class were lost from the population. Under FDS_{m} when *N* = 100, the distributions of the number of alleles were quite different between dom and domcod. Under domcod, the most typical outcome was no allele in the most recessive class and mostly one allele in class 2. Under dom, one allele was always maintained in the most recessive class (Table 4). Under domcod, when the size of the population was increased to *N* = 500 individuals, the probability of loss for the most recessive class was ∼0.33. When class 1 was lost, class 2 became the most recessive class, where a single allele was thus expected to be maintained (*vs*. one to four alleles maintained when class 1 existed in the population). As a consequence, the distribution of the number of class 2 alleles became bimodal (Figure 3). Although the distribution of the number of class 3 alleles was not bimodal, it was nevertheless wider than under FDS_{m/f}, presumably because more alleles were present in class 3 when class 1 was lost.

#### Selection strength:

Although selection strength varied extensively according to the genotypic composition of the population (Figure 4), the expected selection strength over all genotypic compositions closely followed negative frequency-dependent selection. Most interestingly, Figure 4 shows that the relationship between selection strength and relative deviation differed among classes. The slope was high for the most dominant class, intermediate for the intermediate class, and low for the most recessive class. In addition, the regression slope was higher under FDS_{m/f} than under FDS_{m} (Table 4), in line with the fact that equilibrium was attained more quickly (Figure 1). The slopes were also higher under domcod than under dom under both selection regimes, and they were also higher for alleles in higher dominance classes. This strongly suggested that alleles in the most recessive class are subject to weaker selection than alleles in the dominant classes. This analysis provided a mechanistic explanation for the differences in the number of alleles maintained in a class in finite populations. The stronger selection is, the higher the number of alleles maintained in the population is expected to be. Indeed, even if alleles in the most dominant classes are expected to be the least frequent (Table 2), the most dominant class has the highest number of alleles (Table 4) because those alleles are returned to their equilibrium frequency faster than alleles in the recessive classes. Hence, for a given equilibrium frequency and a given population size, the probability for an allele to be lost by drift is negatively correlated with its dominance level.
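As an illustration of how a regression slope of this kind can be estimated, the sketch below fits an ordinary least-squares line to paired values of relative deviation and selection strength. The function name and the data points are hypothetical; the precise definitions of selection strength and relative deviation are those given with Figure 4, not anything defined in this sketch.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical points: relative deviation of an allele from its
# equilibrium frequency (x) and the corresponding response (y).
relative_deviation = [-0.5, 0.0, 0.5]
response = [0.1, 0.0, -0.1]
slope = ols_slope(relative_deviation, response)
```

Under negative frequency-dependent selection an allele above its equilibrium frequency is expected to decrease, so with this sign convention a steeper slope in absolute value corresponds to a faster return toward equilibrium.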

## APPLICATIONS

#### Comparison between expected and observed frequencies in natural populations:

The dominance relationships between alleles have been studied in only a few species with sporophytic self-incompatibility and, at best, only partially. Estimates of allelic frequencies in natural populations are also available in only a few cases. To illustrate the potential use of our model to test the hypothesis of negative frequency-dependent selection, we applied it to two cases for which both data sets are available: *Sinapis arvensis* (Stevens and Kay 1989) and *Ipomoea trifida* (Kowyama *et al.* 1994). Assuming that individuals were sampled in a large population at selection–mutation equilibrium, it is possible to test whether the observed allelic frequencies are significantly different from the expectation under our model. For that purpose, we used a likelihood-ratio test of whether the observed frequencies differ significantly from a multinomial distribution with the expected equilibrium allelic frequencies as parameters. The expected equilibrium allelic frequencies were computed using our general model, with the number of alleles *n* equal to the number of specificities observed in each population: *n* = 33 for the population of *S. arvensis* and, respectively, *n* = 5, *n* = 16, and *n* = 6 for populations M81, M84, and G80 of *I. trifida*. We computed the log-likelihood ratio

$$Q = 2\sum_{j=1}^{n} N_j \ln\left(\frac{N_j}{2N\hat{p}_j}\right) \qquad (6)$$

where *N_{j}* is the number of copies of allele *j* observed in the population, *N* is the total number of sampled individuals, $\hat{p}_j$ is the expected frequency at equilibrium for allele *j*, and *n* is the number of alleles observed in the sample. The observed allelic frequencies are then significantly different from the expectations if *Q* exceeds the critical value of a chi-square distribution with *n* − 1 d.f. We also performed the test on each allele independently, considering all other alleles as a single one, and tested whether its observed frequency differed significantly from its expected frequency under our models (footnote *a* in Tables 5 and 6). For that purpose, we also used a likelihood-ratio test with *n* = 2.
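The likelihood-ratio test described above can be sketched in Python as follows. The sketch assumes the standard G-statistic form, Q = 2 Σ O ln(O/E), where the observed counts O are allele copy numbers and the expected counts E are the total number of sampled copies multiplied by the expected equilibrium frequencies. Function names and the example counts are illustrative assumptions and are not taken from Tables 5 and 6.

```python
import math


def log_likelihood_ratio(counts, expected_freqs):
    """Return (Q, d.f.) for observed allele copy counts against
    expected frequencies, using Q = 2 * sum(O * ln(O / E))."""
    total_copies = sum(counts)
    q = 0.0
    for observed, p_expected in zip(counts, expected_freqs):
        if observed > 0:  # a zero count contributes nothing to the sum
            q += observed * math.log(observed / (total_copies * p_expected))
    return 2.0 * q, len(counts) - 1


def single_allele_test(counts, expected_freqs, j):
    """Test allele j alone by pooling all other alleles into one class
    (the n = 2 case)."""
    total_copies = sum(counts)
    pooled_counts = [counts[j], total_copies - counts[j]]
    pooled_freqs = [expected_freqs[j], 1.0 - expected_freqs[j]]
    return log_likelihood_ratio(pooled_counts, pooled_freqs)


# Hypothetical sample: 20 allele copies, three alleles.
counts = [12, 5, 3]
expected = [0.5, 0.3, 0.2]
q_all, df_all = log_likelihood_ratio(counts, expected)
q_first, df_first = single_allele_test(counts, expected, 0)
```

Q is then compared with the chi-square critical value for the returned degrees of freedom, for example about 3.84 at the 5% level with 1 d.f.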

#### Example 1—*S. arvensis*:

Stevens and Kay (1989) obtained genotype frequencies for 34 individuals from a natural population. They found 35 alleles, belonging to three distinct dominance classes in pollen and two dominance classes in pistil. Interestingly, several dominance relationships were asymmetrical. Allele 1, for instance, was one of the most recessive alleles in pistil but one of the most dominant alleles (together with allele 2) in pollen (Stevens and Kay 1989; appendix a). Table 5 reports the observed frequencies from Stevens and Kay (1989) as well as the expected frequencies obtained under FDS_{m} and FDS_{m/f}. Overall, the observed frequencies were not significantly different from the expected frequencies under FDS_{m/f}, whereas they were significantly different from the expected frequencies under FDS_{m} (Table 5). This discrepancy was mostly due to several large differences between the two selection regimes in the expected frequencies at equilibrium, notably for alleles 23 and 25. Allele 25 is the single most recessive allele in pollen and one of the nine most recessive alleles in pistil. Its observed frequency is significantly different from the expected frequency under FDS_{m}. Surprisingly, although allele 23 was predicted to be one of the least frequent alleles in the population, it was actually one of the most frequent in the sample, suggesting that evolutionary forces other than negative frequency-dependent selection may play a role in the evolution of this allele.

#### Example 2—*I. trifida*:

The genotype of 214 individuals from three populations was determined and 23 alleles were found by Kowyama *et al.* (1994). We give in Table 6 the expected allelic frequencies obtained under the FDS_{m} and FDS_{m/f} models as well as observed frequencies in three populations. The dominance relationships matrix used for the computations is derived from Kowyama *et al.* (1994) and is given in appendix b. The dominance relationships are approximately symmetric in pollen and pistil for all alleles. For computations in each population, we used a reduced matrix restricted to alleles present in each population. We found almost no differences between the predictions under both selection regimes. The observed frequencies were significantly different from the expectations in each population (Table 6). This departure was due to 2 alleles in population M81, 3 in population M84, and 4 in population G80. Allele 3 was recessive to all other alleles in pistil and pollen and it was, as expected, the most frequent in all populations. Allele 10 was recessive to all alleles except allele 3 and, as a consequence, it was expected to be the second most frequent allele after allele 3 in all populations. However, it was actually not the second most frequently observed allele in populations M81 and M84. Indeed, allele 8 in population M81 and allele 16 in population M84 had a much higher frequency than expected.

Although some level of concordance can be found, especially for *S. arvensis*, the two examples above generally enabled us to reject the model's predictions. Although the low number of individuals analyzed relative to the total number of alleles in the population may greatly affect the accuracy of allele frequency estimates, this pattern of discrepancy may actually reveal that additional evolutionary forces are indeed interfering with the strict frequency-dependent selection implemented in our model. Our model may thus be viewed as a “null model,” where frequency-dependent selection is the only evolutionary force taken into account.

## DISCUSSION

#### Generality of the model:

Due to the generality and flexibility of our approach, we were able to investigate a large range of SSI models in a single framework. This enabled us to fill the gaps between previously scattered theoretical investigations (Table 1) as well as to extend previous results to situations with many dominance classes and several alleles per class. This approach has been used to compute equilibrium genotypic and allelic frequencies under a variety of conditions and to compute expected genotypic changes over a single generation on the basis of known genotypic composition from a natural population, as well as to perform exploratory stochastic simulations. These analyses rely on the assumption that the only relevant force acting on S alleles is negative frequency-dependent selection generated by self-incompatibility. In essence, we showed that the higher frequency of recessive alleles is a general feature of SSI models, but also that in finite populations the number of dominance classes will evolve according to the population size, and that dominant classes will contain more alleles than recessive classes. We also showed that fecundity selection (FDS_{m/f}) may have large effects on the dynamics of allelic frequencies as well as on the expected number of alleles, the number of dominance classes, and the number of alleles per class maintained in finite populations. We notably showed that, contrary to FDS_{m}, FDS_{m/f} makes it possible to maintain deterministically more than one allele in the most recessive class under domcod, and consequently also in finite populations.

Because our model uses very general recurrent equations, it can be extended to incorporate direct selection on S alleles, such as, for instance, selection due to the expression of linked deleterious alleles in homozygotes. In line with most other theoretical investigations of self-incompatibility (SI) (but see Charlesworth 1988), we also assumed that self-incompatibility was fully functional (technically, the compatibility threshold σ was set to zero in all our computations). However, partial self-incompatibility is often reported in empirical studies (Nou *et al.* 1991; Reinartz and Les 1994; Good-Avila and Stephenson 2002; Mable *et al.* 2005), and examination of its effect using modifications of our model is straightforward.

#### Generality of the recessive effect:

As was shown by several authors in a number of different specific situations, we observed in deterministic models that alleles from the most recessive class were always more frequent than dominant alleles, and this result was consistent across any number of classes, any number of alleles per class, and both types of selection (Table 2 and Figure 2). However, the strength of this recessive effect varied greatly among models. In the dom model, the effect increased drastically with increasing number of dominance classes, but was little affected by the number of alleles per class. In the domcod model with FDS_{m}, the recessive effect was weak and did not increase much with the number of dominance classes, whereas the effect was stronger under FDS_{m/f}. These results are qualitatively identical to those obtained by Schierup *et al.* (1997) and Vekemans *et al.* (1998) in models with only one allele per class. We showed that the recessive effect is even stronger when the number of alleles differs among classes, especially when there is only one allele in the most recessive class (Table 3). In general agreement with these results, higher frequencies of the most recessive alleles have been observed in most empirical surveys of S alleles in natural populations of species with SSI (Sampson 1967; Stevens and Kay 1989; Kowyama *et al.* 1994; Mable *et al.* 2003; Glémin *et al.* 2005). Using our model, the recessive effect also seemed to hold for the most recessive alleles in a case with very complex overall patterns of dominance, such as allele 25 in *S. arvensis*. In other complex cases, it may be difficult to decide unambiguously which allele is dominant or recessive, but we can still interpret differences in expected frequencies in terms of relative dominance.

#### How many alleles per dominance class in finite populations?

This study is the first to investigate models of SSI that allow multiple alleles per dominance class in finite populations (but see Uyenoyama 2000). We consistently found that the number of alleles maintained per dominance class increases with dominance level. This result is not trivial because deterministic computations in models with an identical number of alleles in each class predicted a lower frequency of more dominant alleles, which could potentially cause a higher rate of loss of dominant alleles due to genetic drift. Empirical studies seem to agree overall with this expectation. In *Arabidopsis lyrata*, where four dominance classes have been recorded, 8 S alleles are known overall in the two most recessive classes, whereas the two most dominant ones comprise 16 S alleles (Prigoda *et al.* 2005). Similarly, in a single population of this species, 3 and 8 S alleles were found in the most recessive and the most dominant classes, respectively (Schierup *et al.* 2006). In *Brassica insularis*, where two dominance classes are known and dominance relationships seem to follow the domcod model, 2 alleles have been identified in the recessive class whereas >18 alleles have been recorded in the dominant class (Glémin *et al.* 2005). Differences in allele numbers between the most recessive and the most dominant classes are increased under FDS_{m/f}, as compared to the FDS_{m} model, and we showed that the overall strength of selection is higher under the former. The higher number of alleles in dominant than in recessive classes could be due either to higher rates of incorporation of new alleles, with dominance conferring a large advantage over recessive alleles when rare (Schierup *et al.* 1997), and/or to lower rates of loss. Our estimates of the strength of selection on alleles from different classes of dominance indicate a clear trend of increasing selection intensity with increasing dominance (Table 4). 
SSI systems thus appear as important examples of asymmetric balancing selection. Because individual allelic frequencies within a given dominance class are inversely related to the number of alleles in that class (Sampson 1974), the recessive effect is amplified by this difference in allele numbers, with substantially higher differences in frequencies between dominant and recessive alleles in finite populations, which could explain the very high differences in allelic frequencies observed for *I. trifida* (Table 6).

An additional mechanism must be taken into account to explain the number of alleles per dominance class under the FDS_{m} model of selection and domcod dominance relationships. Indeed, as first pointed out by Uyenoyama (2000), under deterministic conditions, two alleles cannot coexist in the most recessive class: as soon as a slight perturbation is introduced, the less frequent allele tends to disappear. Although this occurred under a wide range of parameters, we also showed that some cases exist where two alleles can be maintained in the most recessive class for up to 10 dominance classes. It is interesting to note that this result does not extend to FDS_{m/f}, under which two alleles can typically coexist at deterministic equilibrium under domcod. Under FDS_{m/f}, when the size of the population is small, only one allele is observed in the most recessive class (Figure 3). However, when the size of the population increases, more than one allele can coexist in the most recessive class even under domcod dominance relationships, contrary to results under the FDS_{m} model (Table 4 and Figure 3). Glémin *et al.* (2005) found evidence for the occurrence of two alleles in the most recessive class within populations of *B. insularis* and suggested that it could be due to population substructure. We show here that it is possible to maintain two alleles in the most recessive class in a panmictic population when selection occurs through both male and female reproductive structures. Because of the low density in *B. insularis* populations, pollen limitation due to low pollinator activity could indeed generate a female component of frequency-dependent selection through fecundity selection (Vekemans *et al.* 1998), and the FDS_{m/f} model is likely to be relevant.

#### How many dominance classes in finite populations?

In finite populations, the number of dominance classes may evolve. Schierup *et al.* (1997) showed that under the FDS_{m} regime and domcod, the most recessive alleles could not be maintained in finite populations: every time a more dominant allele appeared in the population (in their model the number of dominance classes was infinite), the most recessive allele tended to be lost, resulting in a constant turnover of the most recessive allele. Hence, absolute levels of dominance tend to increase over time. Here we report a phenomenon similar to the results of both Uyenoyama (2000) and Schierup *et al.* (1997). Indeed, as shown in Figure 3 and Table 4, under FDS_{m} and domcod, class 1 may be lost. The probability of loss of the most recessive class depends on the size of the population, suggesting that a simple interaction between demography and selection may cause differences in the number of dominance classes in different species. This could help explain why only two dominance classes are found in the Brassica genus, where domcod patterns of dominance are reported (Thompson and Taylor 1966), whereas at least four dominance classes are known to occur in *A. lyrata*, where patterns of dominance are dom-like (Prigoda *et al.* 2005). The lower number of dominance classes in the Brassica genus could be related, for instance, to the occurrence of an ancient bottleneck, which would be congruent with the observation of lower phylogenetic diversity of alleles in Brassica than in *A. lyrata* (Schierup *et al.* 2001).

#### Measure of selection strength:

The measure we used to quantify selection strength allowed us to better understand the difference in the number of alleles maintained in different dominance classes under different models. Selection strength is a synthetic quantitative measure explaining the different dynamics and expectations at equilibrium in finite populations for both selection regimes and for any kind of dominance relationship. Figure 4 shows that the frequency of an allele is not sufficient to predict its frequency change in a generation under negative frequency-dependent selection because this change depends on the exact genotypic composition of the population at this generation. Indeed, the frequency of an allele can increase in a generation even if it is higher than its expected equilibrium frequency. Sporophytic self-incompatibility systems illustrate how difficult it may be to measure selection under complex selection regimes such as negative frequency-dependent selection, notably since it depends on genotypic rather than allelic frequencies. In finite populations, selection strength reflects the probability of loss for an allele and depends on its dominance class. Within complex dominance patterns, however, it can be difficult to determine the absolute dominance level of an allele; in that case, the selection strength for this allele can be used to estimate its probability of being lost or maintained.

#### Frequency-dependent selection models and pollen limitation:

We observed large differences in the properties of the FDS_{m} and FDS_{m/f} models of selection, in terms of frequencies at equilibrium, dynamics, and expected number of alleles by class. Notably, under FDS_{m/f} several alleles are expected in the most recessive class, contrary to under the FDS_{m} model. The fundamental difference between both models is that under the FDS_{m} model, all plants receive sufficient compatible pollen to fertilize all their ovules, such that all plants produce the same quantity of seeds whatever their S-locus genotype and whatever the genotypes of the other individuals in the population. However, it is known that “pollen limitation” may be frequent in natural populations, *i.e*., that seed output is limited by the availability of compatible pollen (Larson and Barrett 2000). Under the FDS_{m/f} model, not all plants produce the same quantity of seeds when the population departs from equilibrium because the probability that a given ovule will be fertilized is proportional to the amount of compatible pollen and thus varies among genotypes at the S locus. It would be interesting to know which of the two models of selection is closer to natural conditions, and we anticipate that this should heavily depend both on the species considered and on ecological conditions. Indeed, when populations are small, because of a recent bottleneck or a recent colonization event, or highly structured or when pollen dispersal is strongly spatially restricted, we could expect pollen limitation to be important. The occurrence of clonality due to vegetative propagation within species with SSI can also increase the incidence of pollen limitation, due to within-ramets incompatible pollination (DeMauro 1993; Young *et al.* 2002). Under those circumstances, FDS_{m/f} models could be more relevant.
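The contrast between the two selection regimes can be made concrete with a small simulation sketch of one generation of mating. Everything in it is an assumption made for illustration rather than the article's recurrence equations: the function names, the rule that alleles of the same dominance class are codominant, the Wright–Fisher-style sampling of offspring, and the retry cap. Under FDS_{m} only the pollen donor is redrawn after an incompatible cross, so every seed parent has the same expected seed output; under FDS_{m/f} an incompatible cross costs the ovule, so seed output depends on the share of compatible pollen.

```python
import random


def expressed(genotype, dom_class, codominant=False):
    """Set of S specificities expressed by a diploid genotype.
    Alleles of the same class are assumed to be codominant."""
    a, b = genotype
    if codominant or dom_class[a] == dom_class[b]:
        return {a, b}
    return {a} if dom_class[a] > dom_class[b] else {b}


def compatible(mother, father, dom_class, pistil_codominant=False):
    """A cross is compatible when pollen and pistil share no specificity.
    pistil_codominant=True mimics domcod; False mimics dom."""
    pistil = expressed(mother, dom_class, codominant=pistil_codominant)
    pollen = expressed(father, dom_class)
    return pistil.isdisjoint(pollen)


def next_generation(pop, dom_class, fecundity_selection,
                    pistil_codominant=False, rng=random, max_tries=1000):
    """Sample len(pop) offspring genotypes for the next generation."""
    offspring = []
    attempts = 0
    while len(offspring) < len(pop):
        attempts += 1
        if attempts > max_tries * len(pop):
            raise RuntimeError("too few compatible matings in this population")
        mother, father = rng.choice(pop), rng.choice(pop)
        if fecundity_selection:
            # FDS_m/f: an ovule receiving incompatible pollen is lost.
            if not compatible(mother, father, dom_class, pistil_codominant):
                continue
        else:
            # FDS_m: the same ovule receives pollen until one grain is compatible.
            for _ in range(max_tries):
                if compatible(mother, father, dom_class, pistil_codominant):
                    break
                father = rng.choice(pop)
            else:
                continue  # no compatible donor found for this seed parent
        offspring.append((rng.choice(mother), rng.choice(father)))
    return offspring


# Hypothetical population: four alleles in three dominance classes.
dom_class = {"S1": 1, "S2": 2, "S3": 2, "S4": 3}
pop = [("S1", "S2"), ("S3", "S4"), ("S1", "S3"), ("S2", "S4")] * 10
offspring_m = next_generation(pop, dom_class, fecundity_selection=False)
offspring_mf = next_generation(pop, dom_class, fecundity_selection=True)
```

Iterating `next_generation` and counting the alleles that remain in each class would give a rough finite-population analogue of the simulations discussed earlier, though without mutation or any of the article's specific parameter settings.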

#### Causes of the deviation between expected and observed frequencies in natural populations:

Our model enabled us to compute expected frequencies for each allele and each genotype at equilibrium for any kind of dominance relationship. We showed that observed allelic frequencies were significantly different from the expected frequencies under FDS_{m} for *S. arvensis* (Table 5) and that alleles 23 and 25 taken independently had frequencies significantly different from their expectations. For *I. trifida* the observed frequencies were different from their expectations in all three populations. Four kinds of causes could explain these discrepancies between expected and observed allelic frequencies. First, the empirical determination of allelic frequencies and dominance relationships within natural populations may be inaccurate. Indeed, in the articles from which those data were taken, only a fraction of all pairwise dominance relationships between alleles were actually tested, the others being inferred transitively. Also, errors in allele identification and statistical uncertainty due to low sample size relative to the number of alleles could lead to imprecise estimation of allelic frequencies. In particular, if not all alleles are known, the model will overestimate the allelic frequencies at equilibrium, especially for alleles in the same dominance class as the missing alleles. Second, the sampled population may not be at equilibrium because of drift or perturbations of the habitat or because the colonization of the population is recent. Third, population structure can also cause some deviation from deterministic equilibrium because the intrademe frequency distribution can be different from that of a panmictic population for intermediate migration rates (Schierup 1998). Fourth, some important model assumptions could be erroneous. Indeed, we assumed that all S alleles are selectively equivalent, whereas there are reports of differential selection on S alleles (Bechsgaard *et al.* 2004).
Also, we have seen that the frequency of homozygotes is not negligible, especially for alleles in the most recessive dominance classes and when the number of S alleles is low. Hence, selection on S-allele-linked deleterious recessive alleles may play an important role in the dynamics of S alleles in sporophytic systems (Uyenoyama 1997) and could explain the discrepancy between expected and observed frequencies in a natural population. Finally, several mechanisms have been described that could occur in addition to frequency-dependent selection and potentially affect predictions for the number of alleles and their respective frequencies.

## CONCLUSIONS

The model presented here is very general and allows predicting either genotypic frequencies at deterministic equilibrium (and all statistics derived from genotypic frequencies such as *F*_{is}) or the expected change in genotypic frequencies over one generation. We confirmed previously identified characteristics of SSI systems and generalized them to situations with several alleles per dominance class. Moreover, we found emerging properties of such systems in finite populations such as the dynamics of the number of dominance classes and the asymmetry in allele numbers among classes. However, these predictions hold only under the assumption that the sole mechanism acting is negative frequency-dependent selection due to the particular reproductive system. Hence, the results obtained under this model constitute a direct test of this hypothesis, as illustrated with two example applications. Consequently, one can use this model as a tool to detect alternative selective forces that may be allele specific. Furthermore, in this article, the dominance classes were set either as parameters in the deterministic computations or as the result of the maintenance or loss of alleles in a finite population because of drift and frequency-dependent selection. However, the model can also be used to examine the evolution of dominance through the selection of a dominance modifier.

## Acknowledgments

We thank F. Christiansen for sending a Mathematica notebook to verify our derivations in models with four dominance classes. We are indebted to S. Glémin, to an anonymous reviewer, and to the associate editors for their suggestions to improve the manuscript. S. Billiard especially thanks M. Ballatore without whom this work would not exist. The self-incompatibility team at Lille University is supported by an Action Thématique et Incitative sur Programme grant from the Centre National de la Recherche Scientifique, an Action de Recherches Concertées d'Initiative Régionale grant from Région Nord-Pas de Calais, and a Fonds Européen de Développement Régional grant from the European Union.

## Footnotes

Communicating editor: M. K. Uyenoyama

- Received December 23, 2005.
- Accepted December 21, 2006.

- Copyright © 2007 by the Genetics Society of America