## Abstract

Interindividual differences in chromatin states at a locus (epialleles) can result in gene expression changes that are sometimes transmitted across generations. In this way, they can contribute to heritable phenotypic variation in natural and experimental populations independent of DNA sequence. Recent molecular evidence shows that epialleles often display high levels of transgenerational instability. This property gives rise to a dynamic dimension in phenotypic inheritance. To be able to incorporate these non-Mendelian features into quantitative genetic models, it is necessary to study the induction and the transgenerational behavior of epialleles in controlled settings. Here we outline a general experimental approach for achieving this using crosses of epigenomically perturbed isogenic lines in mammalian and plant species. We develop a theoretical description of such crosses and model the relationship between epiallelic instability, recombination, parent-of-origin effects, as well as transgressive segregation and their joint impact on phenotypic variation across generations. In the limiting case of fully stable epialleles our approach reduces to the classical theory of experimental line crosses and thus illustrates a fundamental continuity between genetic and epigenetic inheritance. We consider data from a panel of Arabidopsis epigenetic recombinant inbred lines and explore estimates of the number of quantitative trait loci for plant height that resulted from a manipulation of DNA methylation levels in one of the two isogenic founder strains.

Systematic or stochastic changes in chromatin states, such as gains or losses of DNA or histone methylation, are sometimes transmitted across generations with significant phenotypic effects (Richards 2006). Since different chromatin variants (epialleles) can exist in the same sequence background (*i.e.,* on the same sequence allele), they can produce a dimension of functional variation at the population level that cannot be captured by an analysis based on DNA sequence alone (Johannes *et al.* 2008). How much of this epigenetic variation is routinely missed in linkage or association mapping studies is an open question (Johannes *et al.* 2008; Maher 2008; Manolio *et al.* 2009; Eichler *et al.* 2010), but preliminary estimates in plants suggest that it can account for up to 30% of the variation in commonly studied phenotypes such as height and flowering time (Johannes *et al.* 2009).

Unlike DNA sequence alleles, epialleles can exhibit a high degree of instability across generations (Rakyan *et al.* 2002; Mathieu *et al.* 2007). Because of these dynamic properties the quantitative implications of epigenetic inheritance in the context of human health, evolution, and agriculture have remained largely speculative (Johannes *et al.* 2008; Richards 2008; Bossdorf *et al.* 2008; Petronis 2010; Biemont 2010). To overcome this limitation, it is necessary to obtain a basic inventory of the transgenerational behavior of epialleles in both mammals and plants and to formally incorporate these properties into our current models of quantitative inheritance in natural and experimental populations (Johannes *et al.* 2008). The aim of this article is to outline both experiment and theory to achieve this.

A powerful experimental approach for studying the induction and propagation of epigenetic variation is through crosses of epigenomically perturbed isogenic strains. In the model plant Arabidopsis two groups have recently implemented such an approach by constructing so-called epigenetic recombinant inbred lines (epiRILs) (Johannes *et al.* 2009; Reinders *et al.* 2009). These populations were derived from crosses between individuals with virtually identical DNA sequences but drastically divergent epigenomic profiles. In both cases, the cross was initiated from a wild-type (wt) plant and a plant carrying a single loss-of-function mutation in *ddm1* (Johannes *et al.* 2009) or *met1* (Reinders *et al.* 2009), two genes involved in DNA methylation control. As a result, mutant plants exhibit significant global changes in DNA methylation (Vongs *et al.* 1993; Cokus *et al.* 2008; Lister *et al.* 2008; Reinders *et al.* 2009). Although mobilization of transposable elements also occurs in these epiRILs, they nonetheless provide a unique opportunity to study the transgenerational behavior of induced epigenetic variation against a nearly invariant DNA sequence background (Figure 1A). The experimental setup for constructing such populations represents a general strategy. Similar approaches could be considered in mammals and/or through the use of environmental triggers to initiate the epigenomic changes in the parental generation (Figure 1B).

Molecular profiling of the two Arabidopsis epiRIL populations has shown that only a fraction of induced epialleles remain stable in subsequent generations, the rest being subject to dynamic modifications (Johannes *et al.* 2009; Reinders *et al.* 2009). Two basic patterns are beginning to emerge. The first pattern indicates that a subset of epialleles undergo rapid and stochastic fluctuations over a wide spectrum of chromatin states, many of which are outside of the parental range (Reinders *et al.* 2009). While such alterations can be causative of phenotypes within a given generation, they probably do not contribute to phenotypic inheritance (Slatkin 2009) and can therefore be regarded as noise in the underlying heritable substrate. The second, and more important, pattern is a systematic and gradual reversion of mutant epiallelic states to those of the wt over the course of several generations. This process represents an intrinsic rescue system that is invoked to restore proper genome function and integrity. Loci that meet this pattern tend to correspond to sequences that are continuously targeted by the RNA-directed DNA methylation (RdDM) machinery (Johannes *et al.* 2009; Teixeira *et al.* 2009; Teixeira and Colot 2010).

Epiallelic instabilities, as described above, create a complex source of heritable variation. These properties pose challenges to the way we have to approach quantitative inheritance in the epiRIL or similar populations. The key task is to simultaneously account for two processes: The first involves the meiotic transmission of maternal and paternal DNA sequence haplotypes according to Mendelian laws. The second is a dynamic process that governs continuous changes in the chromatin states (epialleles) harbored by these haplotypes and leads to non-Mendelian patterns of inheritance. Here we develop the necessary theoretical foundation to quantify these processes using epiRILs (Figure 1A) as a model system. We find that epiallelic reversion, recombination, parent-of-origin effects, and transgressive segregation are key parameters in these populations: Their joint effects can produce complex and highly dynamic inheritance patterns that cannot be predicted from strictly Mendelian models. In the limiting case of fully stable epialleles our model reduces to the classical theory of experimental line crosses and thus illustrates a fundamental continuity between genetic and epigenetic inheritance. In what follows we present the first comprehensive attempt to quantify epigenetic inheritance in model organisms.

## THEORY

#### Conceptual basis:

Consider a locus, *L*, extensively involved in genome-wide chromatin control. Genotype C.C at this locus corresponds to proper chromatin maintenance, whereas the mutant genotype *c*.*c* induces global chromatin changes (*e.g.,* modifications of DNA or histone methylation). We start with two inbred parents, a wild-type parent, *P*1 | C.C, and a mutant parent, *P*2 | *c*.*c*. By design, the two parents have identical DNA sequences (except at locus *L* and inevitably at a small number of other loci; Mirouze *et al.* 2009; Tsukahara *et al.* 2009), but drastically divergent chromatin profiles (Figure 1A).

As a result of the epigenomic perturbation induced by the mutation, the two parents will differ in their chromatin states at *N* loci determining a quantitative trait *y*. Suppose that in the *P*1 | C.C parent *N*(1 − τ) of these loci have stable epigenotype Ω.Ω and *N*τ of the loci have stable epigenotype ω.ω (Figure 2A). Here epiallele Ω corresponds to a phenotypically increasing and ω to a decreasing chromatin state. Relative to *P*1 | C.C, the mutant parent, *P*2 | *c*.*c*, will have undergone the following possible epiallelic changes at the *N* loci: Ω → ω, Ω → ω̃, ω → Ω, and ω → Ω̃, where the tilde (∼) signifies an unstable epiallelic state (Figure 2A).

We suppose that a proportion *s* of the newly induced epialleles remain stable in subsequent generations (Ω and ω) (Figure 2A). This is even the case when proper chromatin maintenance function is restored. Since stable epialleles behave like DNA sequence changes at the population level, we make no formal distinction between them. Only direct molecular profiling and sequencing of the two parents and their cross-derivatives will make it possible to uncover the physical basis of such stable induced alterations.

Apart from stable epialleles, we also assume a proportion (1 − *s*) of newly induced epialleles (Ω̃ and ω̃) with the capacity to revert to approximate wild-type states over generations. We quantify this physical process through a function γ(*t*), which describes the progressive changes of epiallelic states in continuous time (Figure 2B). Specifically, Ω̃ and ω̃ correspond to sequences that are targeted by the RdDM machinery (Johannes *et al.* 2009; Reinders *et al.* 2009; Teixeira *et al.* 2009; Teixeira and Colot 2010) or possibly other correction mechanisms. For the cross design shown in Figure 1A, this reversion begins after the C.C genotype has been reintroduced at *L*, that is, in the conditional F2 (*F*2|C.C) or backcross (*BC*|C.C) populations (Figure 2, A and B). Upon further propagation of individual lines from these conditional populations by means of selfing (plants) or sibling mating (mammals), progressive epiallelic reversion continues through recurrent maintenance action at each generation. Including the initial perturbation, the different epiallelic fates outlined above can be summarized as follows: stable induced epialleles persist after the perturbation (Ω → ω and ω → Ω), whereas unstable induced epialleles revert over time toward approximate wt states (Ω → ω̃ → Ω and ω → Ω̃ → ω).

Our goal is to model this process for any generation of inbreeding. This allows us to draw direct connections between the basic properties of epialleles and their impact on heritable variation at the population level.

#### Transgenerational epigenetic dynamics:

We quantify the epigenotype at locus *j* at time *t* using the coding introduced in Table 1.

##### Parental generation:

Since all individuals are assumed homozygous in the parental generation, the total phenotypic variance can be expressed as the sum of a between-parent component and the pooled within-line (environmental) variance (Equation 1). Here δ is henceforth defined as the average contribution of a single locus to the between-parental phenotypic mean difference *D* (Serebrovsky 1928), as given in Equation 2. This expression assumes that δ is equivalent over all *N* causative loci; that is, δ_{1} = δ_{2} = … = δ_{*N*} = δ.

Note in Equation 1 that when the transgression parameter τ = 0.5, the between-parent contribution vanishes. In this limiting case the total phenotypic variance in the parental generation is purely environmental, despite drastic functional divergence between the parents. This situation provides the condition for a maximum gain in transgressive variance in subsequent generations. It follows from the fact that each parent is fixed for both phenotypically decreasing and increasing states so that recombination events can produce offspring with more extreme epigenotypes (Rieseberg *et al.* 1999). We therefore refer to parameter τ as a measure of “transgression potential” in the parental generation, which can become “realized,” in some sense, in subsequent generations. This phenomenon is discussed in detail below.
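The role of τ can be illustrated with a small numerical sketch. It is not the model developed in this article: it assumes, purely for illustration, strictly additive and fully stable loci with a hypothetical per-locus effect `a`, and the function names are likewise hypothetical. It compares the phenotypic gap between the two parents with the gap between the most extreme recombinant lines that could in principle be assembled from them.

```python
def parent_values(n_loci, tau, a=1.0):
    """Toy phenotypic values of the two homozygous parents.

    Assumption (not from the article): each locus in the increasing
    state adds +a and each locus in the decreasing state adds -a.
    P1 carries the increasing state at n_loci * (1 - tau) loci and the
    decreasing state at n_loci * tau loci; P2 carries the opposite
    state at every locus.
    """
    n_up_p1 = n_loci * (1.0 - tau)
    n_down_p1 = n_loci * tau
    p1 = a * (n_up_p1 - n_down_p1)
    p2 = -p1
    return p1, p2


def transgression_summary(n_loci, tau, a=1.0):
    """Return (gap between parents, gap between extreme recombinant lines)."""
    p1, p2 = parent_values(n_loci, tau, a)
    parental_gap = abs(p1 - p2)
    # A line fixed for the increasing state at all loci versus a line
    # fixed for the decreasing state at all loci.
    extreme_gap = 2.0 * a * n_loci
    return parental_gap, extreme_gap


if __name__ == "__main__":
    for tau in (0.0, 0.25, 0.5):
        gap, extreme = transgression_summary(10, tau)
        print(f"tau={tau:.2f}  parental gap={gap:.1f}  extreme gap={extreme:.1f}")
```

In this toy setting the parental gap shrinks to zero as τ approaches 0.5, while the range reachable by recombinants stays the same, which is the sense in which τ measures a potential that is only realized after crossing.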

##### F1 and base-population:

Crossing *P*1|C.C × *P*2|*c*.*c* yields the *F*1 generation. We assume that epiallelic states induced in the *P*2|*c*.*c* parent remain stable (Teixeira *et al.* 2009), but this assumption can be easily relaxed if necessary. As a consequence, the phenotypic variance may or may not (Richards 2009) be equivalent to the environmental variance. As shown in Figure 1A the *F*1|C.c are used to derive the base population for advanced inbreeding generations either through a backcross (*F*1|C.c × *P*1|C.C) or through an intercross (*F*1|C.c × *F*1|C.c). From these crosses, only the C.C progeny (*BC*|C.C or *F*2|C.C) are selected to initiate the inbreeding process through selfing or sibling mating. This permits a detailed study of the time-dependent behavior of parental epialleles independent of the recurrent action of the *c*.*c* genotype. For simplicity we ignore the introgression of wt epigenotypes surrounding locus *L* as a result of this selection procedure.

##### Advanced inbreeding generations:

At any time point *t* of inbreeding, the variance in trait *y* is the sum of an epigenetic component, σ^{2}(η,*t*), and an environmental (error) component, σ^{2}(ε):

$$\sigma^{2}(y,t) = \sigma^{2}(\eta,t) + \sigma^{2}(\varepsilon) \qquad (3)$$

We assume that epigenotypes are uncorrelated with the error term ε, and that the error terms are uncorrelated across generation times. With equal effect sizes across the *N* causative loci, the epigenetic variance can be approximated as

$$\sigma^{2}(\eta,t) \approx N\,\sigma^{2}(\eta_{j} \mid t) + N(N-1)\,\sigma(\eta_{j},\eta_{k} \mid t,\bar{r}) \qquad (4)$$

where σ^{2}(η_{*j*}|*t*) is the variance at a single locus *j* at time *t* and σ(η_{*j*}, η_{*k*}|*t*, r̄) is the covariance between any two loci *j* and *k* separated by an average pairwise recombination fraction r̄ (Franklin 1970). It can be shown (see Appendix A) that Equation 4 has an explicit form, Equation 5, in which we put γ = γ(*t*) to lighten the notation. The parameters *q*_{1} and *q*_{2} in Equation 5 depend on the mode of action (additivity or dominance), the type of inbreeding scheme (selfing or sibling mating), and the base population (backcross or F2-intercross) used to initiate the inbreeding processes. Table 2 provides a summary of the specific form of *q*_{1} and *q*_{2} in each of these cases.
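The decomposition described above, single-locus variances plus pairwise covariances, is the standard identity for the variance of a sum. The following is a minimal sketch of that bookkeeping only, assuming every locus has the same variance and every pair of loci the same average covariance; the function names and the numerical inputs are hypothetical, and the explicit time-dependent form of Equation 5 is not reproduced here.

```python
def epigenetic_variance(n_loci, var_single, cov_pair):
    """Variance of a sum of n_loci exchangeable locus effects.

    n_loci diagonal terms contribute var_single each, and the
    n_loci * (n_loci - 1) ordered off-diagonal pairs contribute
    cov_pair each.
    """
    return n_loci * var_single + n_loci * (n_loci - 1) * cov_pair


def variance_from_matrix(n_loci, var_single, cov_pair):
    """Same quantity obtained by summing every entry of the
    n_loci x n_loci covariance matrix explicitly."""
    total = 0.0
    for j in range(n_loci):
        for k in range(n_loci):
            total += var_single if j == k else cov_pair
    return total


if __name__ == "__main__":
    closed = epigenetic_variance(5, 0.5, 0.1)
    summed = variance_from_matrix(5, 0.5, 0.1)
    print(closed, summed)
```

With zero covariance between loci the second term drops out and the total is simply the number of loci times the single-locus variance.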

An important observation is that dominance (complete dominance in our case) appears as a time-dependent phenomenon, owing not only to the progressive depletion of heterozygote epigenotypes, but also to the reversion of mutant epialleles to wt states. It is therefore necessary to distinguish two types of dominance effects, one being attributable to epialleles inherited from the mutant parent (*P*2|*c*.*c*) and the other being due to epialleles deriving from the wt parent (*P*1|C.C) (see Tables 1 and 2). As we show, this distinction has an effect on the epigenetic variation when inbreeding is carried forward from a backcross base population, as a result of the initial asymmetry of the epigenotype frequencies. In the case of an F2 base population, on the other hand, the epigenetic variance component is equivalent under the two dominance scenarios (what differs is the phenotypic mean of the population).

#### Estimation of the number of induced quantitative trait loci (QTL):

As an extension of previous biometrical approaches (Castle 1921; Serebrovsky 1928; Lande 1981; Zeng 1992), Equation 5 can be used directly to obtain conservative estimates of the number of QTL (*N*) resulting from the initial epigenomic perturbation in the parental generation. To achieve this, we substitute δ from Equation 2 into the equation for the epigenetic variance (Equation 5), and solve for *N* at any generation *t* of inbreeding, which yields Equation 6.

It is interesting to note that in the restrictive case of an F2 base population with additivity, *s* = 1 (fully stable epialleles), τ = 0 (no transgression), an average pairwise recombination fraction of 0.5 (linkage equilibrium among the *N* loci), and *t* = 0, Equation 6 reduces to the well-known Castle–Wright estimator (Castle 1921).
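For reference, the classical Castle–Wright estimator in its standard F2 form divides the squared difference between the parental means by eight times the segregation variance. The sketch below implements only that classical limiting formula, not Equation 6; the function name is hypothetical. The numbers plugged in are the mean difference (*D* = 9.92) and epigenetic variance (11.2) quoted in the Application section, and they are used here solely to illustrate the arithmetic, since those values come from inbred lines derived from a different cross design than an F2.

```python
def castle_wright(mean_diff, segregation_variance):
    """Classical Castle-Wright estimate of the number of loci.

    mean_diff: difference between the two parental line means.
    segregation_variance: heritable (segregation) variance in the F2.
    """
    if segregation_variance <= 0:
        raise ValueError("segregation_variance must be positive")
    return mean_diff ** 2 / (8.0 * segregation_variance)


if __name__ == "__main__":
    # Illustrative inputs only; see the caveat in the text above.
    n_hat = castle_wright(9.92, 11.2)
    print(f"Castle-Wright estimate: {n_hat:.2f}")
```

The classical estimator has no terms for epiallelic stability, transgression, or linkage, which is why it can only serve as the limiting reference case.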

The most important result of this manuscript is Equation 5. It formalizes the relationship between epiallelic reversion (via γ(*t*) and *s*), recombination (via the average pairwise recombination fraction), transgression (via τ), and parent-of-origin effects (by keeping track of epiallelic origins). This relationship jointly determines the epigenetic variation in the population during inbreeding. In the following section we examine more closely the complex and highly dynamic patterns of heritable variation that can arise from it.

## RESULTS

Classical theory of experimental line crosses typically assumes no transgression (τ = 0) and the transmission of fully stable parental alleles (*s* = 1). Sometimes, it is further assumed that all loci are in linkage equilibrium (an average pairwise recombination fraction of 0.5). These constraints represent special cases of the theory developed here. We treat these scenarios as a reference against which to compare the rich spectrum of epigenetic inheritance in epigenomically perturbed line crosses. The phenotypic variance at any time point is simultaneously determined by all of the parameters specified in Equation 5. For clarity we assess their influences systematically by varying them one at a time. Several key population-level phenomena are considered for the case of selfing starting from a backcross base population. The case of selfing from an F2 base population can be found in the supporting information. Throughout, we fix the average recombination rate at 0.44, a value that is based on the Arabidopsis genetic map (Lynch and Walsh 1998).

#### Inheritance of unstable alleles:

We first consider the case of complete epiallelic instability (*s* = 0) and no transgression (τ = 0). In this case all induced mutant epialleles are effectively reverted to the wt state over time. The specific form of the reversion function that governs this process is currently unknown, but should be of substantial interest in future empirical studies (Johannes *et al.* 2008). Although it is reasonable to assume that reversion is locus specific, for the purpose of providing an average description of the system it suffices to consider an average (between-locus) reversion function, γ(*t*). We tentatively posit a parametric form for γ(*t*) in which β is the rate parameter (Figure 2B). We vary β so that we can examine the full range from slow to fast reversion. In contrast to stable Mendelian inheritance (Figure 3, A–C (I), black solid and dashed lines), the reversion function has the effect of eroding heritable variation over time so that as *t* → ∞ the heritable variation in the population is progressively lost (Figure 3, A–C (I), dark gray solid lines).
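To make the idea of a rate-controlled reversion function concrete, the sketch below uses an exponential decay, exp(−β*t*). This functional form is an assumption chosen for illustration; the function name `gamma` and the β values are hypothetical and are not taken from the figures. Any quantity that scales with such a function starts at its full value at *t* = 0 and is eroded toward zero at a speed set by β.

```python
import math


def gamma(t, beta):
    """Illustrative reversion function (assumed exponential form).

    t: generation time (t >= 0); beta: reversion rate (beta >= 0).
    Returns 1 at t = 0 and decays toward 0 as t grows.
    """
    if t < 0 or beta < 0:
        raise ValueError("t and beta must be non-negative")
    return math.exp(-beta * t)


if __name__ == "__main__":
    # Hypothetical slow, moderate, and fast reversion rates.
    for beta in (0.05, 0.5, 2.0):
        values = [round(gamma(t, beta), 3) for t in range(0, 9, 2)]
        print(f"beta={beta}: {values}")
```

Setting β = 0 in this sketch gives a constant function, which corresponds to no reversion at all and therefore to stable inheritance.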

#### Mixed inheritance of stable and unstable epialleles:

Complete instability of epialleles is an extreme case. It is more likely that a proportion of the induced epialleles remain stable and produce Mendelian inheritance patterns (Johannes *et al.* 2009; Reinders *et al.* 2009). Indeed, preliminary molecular profiling of the *ddm*1-derived epiRILs suggests that only up to one-half of the tested epialleles are reversible with the remaining loci being stable for at least eight generations of inbreeding (Johannes *et al.* 2009; Teixeira *et al.* 2009). Moreover, gross perturbations of the epigenome can lead to *de novo* sequence variation through the insertion of remobilized transposable elements or other structural abnormalities. These induced alterations in the parental generation also contribute to the fraction of stable segregating variation. The precise proportions depend on the particular perturbation, organism, and experimental setup and need to be determined on a case-by-case basis.

We explore the effect of different proportions assuming a fixed reversion function and no transgression, τ = 0. Figure 3, A–C (II), illustrates the effect of this type of mixed inheritance with the phenotypic variance being decreased over generation time due to reversion but converging to a stable value as *t* → ∞. The final variance represents the stable heritable substrate that can be gleaned from the initial perturbation. It should be clear that as the proportion of stable epialleles in the genome approaches unity (*s* → 1), our model converges to the familiar case of Mendelian inheritance, which is the basis of the classical theory of experimental line crosses.

#### Effects of imperfect epigenomic resetting:

The reversion of mutant epialleles to the wt state can be an imperfect process (Figure 2B). It is possible that epialleles converge to values outside of the parental range (Reinders *et al.* 2009). For example, the remethylation of *hypo*methylated mutant epialleles could produce *hyper*methylated states in subsequent generations relative to the wt (Reinders *et al.* 2009). We consider this case by letting all unstable epialleles revert to state values that are twice that of the wt parent. This implies that the reversion of mutant states must first pass through wt states before they reach their final stable state. This process leaves an interesting signature at the population level: It leads to an initial depletion of epigenetic variance, before there is an unexpected gain in heritable variance at later generations (Figure 3, A–C (III), light gray solid lines).

A different pattern occurs in situations where epiallelic reversion converges to intermediate parental values (0.5 of wt values). In this case, heritable variation is never completely lost (Figure 3, A–C (III), dark gray solid lines). Note that this latter pattern mimics the situation observed for the case of mixed inheritance (previous section). Distinguishing these two possibilities empirically therefore depends on detailed knowledge of the average reversion function, γ(*t*), and the proportion of stable epialleles, *s*, operating in a given population.
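The two signatures described above can be reproduced qualitatively with a toy calculation. It is a sketch under strong assumptions, not Equation 5: the mutant state is coded as 0 and the wt state as 1, the approach to the final state is assumed to be exponential with rate `beta`, and the squared distance between the reverting state and the wt state is used as a crude stand-in for the contrast that fuels heritable variance. All names and values are hypothetical.

```python
import math


def state_value(t, beta, target, mutant=0.0):
    """Toy state of a reverting epiallele at time t.

    Starts at the mutant value and approaches `target`
    exponentially (assumed form) with rate beta.
    """
    return target + (mutant - target) * math.exp(-beta * t)


def contrast_to_wt(t, beta, target, wt=1.0, mutant=0.0):
    """Squared distance between the reverting state and the wt state."""
    return (state_value(t, beta, target, mutant) - wt) ** 2


if __name__ == "__main__":
    beta = 0.5
    for label, target in (("overshoot to 2 x wt", 2.0),
                          ("intermediate, 0.5 x wt", 0.5)):
        series = [round(contrast_to_wt(t, beta, target), 3) for t in range(0, 11, 2)]
        print(f"{label}: {series}")
```

In the overshoot case the contrast falls to zero when the reverting state crosses the wt value and then rises again, whereas in the intermediate case it declines monotonically to a nonzero floor.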

#### Realized transgression potentials:

It has been shown that transgressive segregation is widespread in experimental line crosses (Rieseberg *et al.* 1999). A major reason for this is that the parents have often not undergone divergent selection prior to crossing and are therefore each fixed for both increasing and decreasing genotypes. In QTL experiments this is reflected in the detection of QTL with opposite signs relative to the parental means.

In perturbation-derived populations such as the ones described here, transgressive segregation is likely to be a key aspect of quantitative inheritance. The removal of methylation, for instance, can lead to active or inactive chromatin states in the mutant parent, which can translate into increasing or decreasing phenotypic values, depending on whether the underlying sequences are involved in inhibitory or facilitating functions in the networks that connect (epi)genotype with phenotype (Figure 2A).

We explore the effect of different transgression potentials (τ) for a fixed reversion function and *s* = 0.5. We vary τ between 0 and 0.35 (τ = 0.5 being the maximum). The effect for large τ is dramatic. Under additivity, and assuming τ = 0.35, we find that heritable variation increases on the order of fivefold relative to the between-parental variance (Figure 3A, IV). This effect is further exaggerated when mutant epialleles act dominantly. In this case we observe a nearly 11-fold increase in heritable variation at early generations (Figure 3B, IV) followed by a gradual decrease due to the progressive loss of heterozygote epigenotypes.

Our treatment of transgression represents the first theoretical approach to highlight the importance of transgressive segregation in generating phenotypic innovation in experimental line crosses.

#### Application:

A recent analysis of the *ddm*1-derived Arabidopsis epiRILs reported large heritable variation for plant height. This work has shown that perturbations of plant methylomes are sufficient to induce lasting phenotypic consequences in commonly studied complex traits. An important question concerns the specific features of the heritable architecture that has been set up in the epiRILs, such as its physical basis (*e.g.,* sequence *vs.* methylation based) or the number and sizes of induced QTL. Definitive answers to this question require genome-wide epigenetic profiling techniques (*e.g.,* ChIP-chip, ChIP-seq, or BS-seq) in combination with QTL mapping methods as well as resequencing of each line (Johannes *et al.* 2009). While such efforts are underway, we here consider using Equation 6 as the basis for deriving the first estimates of the number of QTL (and their average effect sizes) underlying plant height in this population.

Since phenotypic data from only two time points are available (parental generation and generation eight of inbreeding), it is necessary to make informed assumptions about several of the unknown parameters specified in Equation 6. These assumptions rely heavily on previous molecular or phenotypic observations of this population (Johannes *et al.* 2009; Teixeira *et al.* 2009) and are explicitly listed below:

- σ^{2}(η,*t*) = 11.2: estimated from a random-effects model.
- *D* = 9.92: difference in phenotypic means (in centimeters) between the *ddm*1 and wt parents.
- *s* = 0.5: based on an analysis of a random subset of loci.
- Average recombination fraction = 0.44: average for the Arabidopsis genome (Lynch and Walsh 1998).

Values for the transgression potential, τ, remain elusive because this quantity cannot be directly measured using molecular techniques. Keeping this limitation in mind, we show various estimates for the number of QTL (*N*) for different values of τ (Figure 4). They range from as low as 1 (for τ = 0) to as high as 6 (for τ = 0.31). A unique estimate of *N* can be obtained by fixing τ at its theoretical average (see Appendix B). In this case we obtain a single point estimate of *N* (95% CI = ± 0.31) (Figure 4, black bar). This (conservative) estimate suggests the induction of a polygenic heritable architecture, with each QTL explaining ∼14% of the phenotypic variance in plant height. Given these considerable effect sizes, the underlying causative loci should be mappable in future integrative QTL studies, even with relatively small sample sizes.

## DISCUSSION

Epigenetic modifications, such as DNA methylation, are not only widely conserved across species (Zemach *et al.* 2010; Feng *et al.* 2010), but also show substantial interindividual variation within populations (Vaughn *et al.* 2007; Zhang *et al.* 2008; Kaminsky *et al.* 2009). The possibility that this type of epigenetic variation can influence phenotypes independent of DNA sequence variation poses major challenges to our current understanding of complex trait inheritance.

The first problem is that sequence-based mapping approaches (*i.e.,* linkage or association mapping) may be insufficient to fully capture the heritable architecture of complex traits (Johannes *et al.* 2008). A recent study by Kong *et al.* (2009), for example, illustrates that knowledge of the epigenetic status of sequence alleles (in this case, the parent-of-origin of the alleles) is often necessary to establish significant associations with phenotypes. Although the authors found that these effects were sex specific, the finding raises the need to routinely include both levels of variation (DNA sequence and chromatin state) in the analysis.

The second and partially related problem is that the potential temporal instability of epigenetic variation can produce a level of phenotypic dynamics at the population-level that cannot be predicted from strictly Mendelian models of inheritance. Here we have outlined an experimental and theoretical approach to begin to address this issue. The approach relies on the use of a perturbation strategy to induce epigenetic variation in isogenic lines, followed by a transgenerational assessment of derivative populations. This provides an ideal platform for studying the temporal properties of epialleles and permits a theoretical description of these properties in connection with the inheritance of complex traits observed in such populations.

#### Extensions using environmental triggers:

Two experimental examples of this approach have recently been implemented in the model plant Arabidopsis using loss-of-function mutations in *ddm1* and *met1*, two genes involved in genome-wide methylation maintenance, to initiate the perturbation. While this continues to be a valuable resource, future experiments should attempt to produce similar populations using environmental manipulations in place of mutations (Figure 1B). Bossdorf *et al.* (2010), for instance, demonstrated that treatment of Arabidopsis ecotypes with demethylation agents is sufficient to invoke phenotypically relevant methylation changes. Similarly, Verhoeven *et al.* (2010) used environmental stressors, such as chemical induction of herbivore and pathogen defenses, and noted heritable DNA methylation changes in asexual dandelion. These types of interventions may be sufficient to induce and propagate epigenetic variation in the parents and their cross derivatives. A demonstration of this should have important implications for evolutionary theory, which traditionally draws a clear divider between the environment and the heritable material. There is certainly need for modeling approaches that incorporate environmentally induced epigenetic changes into evolutionary theory. More general attempts to assess the role of epigenetic inheritance in the context of selection and adaptation have been undertaken (Csaba 1998; Pál and Miklós 1999; Bonduriansky and Day 2009). The inclusion of epiallelic reversion, as we have formalized it here, would be an appealing extension. It is tempting to speculate that such reversion processes have evolved to facilitate short-term adaptation of populations to rapid environmental changes. These issues are outside of the scope of this work.

#### Considerations for mammalian populations:

Studies of epigenetic inheritance have been primarily pursued in plant species, and much less is known about the transgenerational behavior of epialleles in mammalian populations. The dominant paradigm dictates that the epigenome is completely reset during early mammalian development (Reik 2007; Feng *et al.* 2010), which implies that induced epigenetic effects are not carried to subsequent generations. Formally, this suggests a reversion process that follows a step function, dropping to wt levels at *t* = −1 (first-generation progeny of two parents). However, several well-documented single-locus examples of epiallelic inheritance exist in mammalian systems (Rakyan *et al.* 2003; Blewitt *et al.* 2006; Daxinger and Whitelaw 2010). These findings are probably no exception as more recent genome-wide surveys of mouse gametes show clear instances of transmitted parental DNA methylation profiles (Borgel *et al.* 2010), suggesting that epigenetic inheritance may be much more widespread in mammalian populations than previously acknowledged.

With the help of transgenerational phenotypic data, concrete hypotheses about the extent of epigenomic resetting can be tested on statistical grounds using the experimental framework outlined in this article. This can be achieved by considering alternative reversion functions in a fit to the data. Such a proposal is akin to the approach recently developed by Tal *et al.* (2010), which permits hypothesis tests about the effect of epigenomic resetting events on the covariance between relatives.

We argue that the construction of mammalian crosses between isogenic parents with perturbed and unperturbed epigenomes will be critically important to begin to extrapolate results to humans. The theory outlined in this article, particularly the results for sibling mating, is entirely compatible with an analysis of experimental mammalian populations (*e.g.,* mice or rats). However, the initial construction of such crosses using perturbations may be more complicated than in plants, given that mutations in genes controlling DNA methylation tend to be lethal (Li *et al.* 1992). Instead, one could consider other mutants, partial knockdowns, or any suitable environmental manipulation strategy. Another complication is to distinguish instances of maternal or paternal imprinting. One solution would be to set up reciprocal crosses to delineate these effects (*i.e.,* perturbation of progenitor mother *vs.* father).

#### Conclusion:

Our theory attempts to connect recent observations of the dynamic properties of epialleles (*i.e.,* DNA methylation variants) to a long tradition of quantitative genetics. We have shown that in the case of fully stable epialleles, genetic and epigenetic inheritance are formally indistinguishable. This illustrates that there is actually no dichotomy between these two modes of inheritance. Rather, they should be viewed as different points on a continuum that ranges from stable to unstable inheritance. We therefore hope that our work will help to bridge the gap between the fields of genetics and epigenetics.

## APPENDIX A

This appendix shows the derivation of the epigenetic variance in the population, Equation 5. For details on the values of the parameters presented, we refer the reader to the supporting information. The epigenetic variance can be written in terms of σ^{2}(η_{j}|*t*), the variance at a single locus *j* at time *t*, and the covariance between any two loci *j* and *k* separated by an average pairwise recombination fraction (Franklin 1970).

#### Variance at a single locus:

The 12 different epigenotypes at locus *j* resulting from the initial cross (Figure 2A) can be classified into four different classes of three elements each. Denote these four classes by *s*_{1},…, *s*_{4} with corresponding probability weights *W*_{1},…, *W*_{4} (supporting material in File S1, 1.1). Since each locus can belong to any one of these classes, its variance can be written as the sum of the within-class variances weighted by their probabilities (Equation 7). This expression involves two vectors: a vector of expected single-locus epigenotype probabilities and a vector of expected epigenotypes at locus *j* at time *t* (supporting material in File S1, 1.2). The probability vector is obtained following a Markov chain approach. Below we specify the cases for selfing and sibling mating.

##### Selfing:

For selfing, the initial probability vector is three-dimensional and holds the probabilities of each single-locus epigenotype, determined by the type of base population (F2 or BC) (supporting material in File S1, 1.3.1). The transition matrix for each class is a 3 × 3 matrix of transition probabilities (supporting material in File S1, 1.3.2). Using a Markov chain approach, we calculate the frequency of each epigenotype at time *t* from the initial vector and the transition matrix. Using Equation 7, the variance at a single locus is given by Equation 8.

The value of the parameter *q*_{1} differs for the type of base population considered (F2 or BC) and for the type of epigenotypic effect considered (additivity or dominance); see Table A1.
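The Markov chain calculation for selfing can be illustrated with a minimal sketch in the classical limit of fully stable epialleles (*s* = 1), where the chain reduces to ordinary Mendelian selfing. The matrix below is the textbook selfing matrix, not the class-specific matrices of File S1, and the names `T_SELF`, `genotype_probs`, and `additive_variance`, as well as the genotypic coding (1, 0, −1), are assumptions made for illustration only.

```python
import numpy as np

# Classical (fully stable, s = 1) single-locus selfing chain.
# States are ordered [AA, Aa, aa]; row i gives the offspring genotype
# probabilities when an individual in state i is selfed.
# This is the textbook Mendelian matrix, NOT the matrices of File S1.
T_SELF = np.array([
    [1.00, 0.00, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.00, 1.00],
])

def genotype_probs(k0, t, transition=T_SELF):
    """Return K(t) = K(0) T^t, the genotype probabilities after t generations."""
    k0 = np.asarray(k0, dtype=float)
    return k0 @ np.linalg.matrix_power(transition, t)

def additive_variance(k, values=(1.0, 0.0, -1.0)):
    """Variance of a genotypic value under probabilities k.

    The coding (1, 0, -1) is a hypothetical additive scale.
    """
    v = np.asarray(values, dtype=float)
    mean = k @ v
    return k @ v**2 - mean**2

# F2 base population at t = 0 (assumed Mendelian 1:2:1 proportions).
k_f2 = np.array([0.25, 0.50, 0.25])
for t in (0, 1, 5):
    k_t = genotype_probs(k_f2, t)
    print(t, k_t, additive_variance(k_t))
```

In this limit the heterozygote frequency halves every generation, so the single-locus variance on the assumed scale rises from 0.5 toward 1 as the lines approach fixation.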

##### Sibling mating:

Consider, instead of the probabilities of each single-locus epigenotype, the probabilities of their mating types (Bulmer 1985). There are 16 possible mating types in each class, which can be reduced to 6 by considering basic symmetries between them (supporting material in File S1, 1.4.1). The transition matrix is the collection of the probabilities of going from one mating type to another in one generation of sibling mating; it has dimension 6 × 6 (supporting material in File S1, 1.4.3; for a detailed description of how to construct such a matrix, see Bulmer 1985, Chap. 3). The initial probabilities of each mating class form a 6-dimensional vector given by *Q*_{(l)}(0) = *K*_{(i)}(0)*K*_{(j)}(0), where *i* and *j* are the single-locus epigenotypes involved in mating type *l*. At any generation *t*, the mating-type probabilities follow from this initial vector and the transition matrix. The probabilities for the 4 different single-locus epigenotypes are then obtained by summing the mating-type probabilities over *l*, each weighted by *p*_{il}, the proportion of single-locus epigenotype *i* involved in mating type *l* (supporting material in File S1, 1.4.2). Using Equation 7, the variance at a single locus is given by Equation 9. The value of the parameter *q*_{1} depends on the case considered (Table A2).
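The mating-type chain can be sketched for the classical Mendelian limit (*s* = 1), in the spirit of the construction described by Bulmer (1985). The code below builds the 6 × 6 matrix for unordered mating types of the three genotypes AA, Aa, and aa; it is an illustration under that assumption, not the matrix of File S1, and all function names are hypothetical.

```python
import itertools
import numpy as np

GENOTYPES = ("AA", "Aa", "aa")
# The 6 unordered mating types: AAxAA, AAxAa, AAxaa, AaxAa, Aaxaa, aaxaa.
MATING_TYPES = list(itertools.combinations_with_replacement(GENOTYPES, 2))
INDEX = {m: i for i, m in enumerate(MATING_TYPES)}

def offspring_dist(g1, g2):
    """Mendelian offspring genotype probabilities for the cross g1 x g2."""
    dist = dict.fromkeys(GENOTYPES, 0.0)
    for a, b in itertools.product(g1, g2):  # one allele from each parent
        dist["".join(sorted(a + b))] += 0.25  # sorted() maps 'aA' to 'Aa'
    return dist

def mating_key(x, y):
    """Unordered mating type containing genotypes x and y."""
    return tuple(sorted((x, y), key=GENOTYPES.index))

def sib_transition_matrix():
    """Probability of moving between mating types in one generation of sib mating."""
    T = np.zeros((len(MATING_TYPES), len(MATING_TYPES)))
    for row, (g1, g2) in enumerate(MATING_TYPES):
        d = offspring_dist(g1, g2)
        # Two sibs are drawn independently from the same offspring distribution.
        for x, y in itertools.product(GENOTYPES, repeat=2):
            T[row, INDEX[mating_key(x, y)]] += d[x] * d[y]
    return T

def initial_mating_probs(k0):
    """Q(0) from products of genotype probabilities, pooled over mate order."""
    q0 = np.zeros(len(MATING_TYPES))
    pairs = list(zip(GENOTYPES, k0))
    for (x, px), (y, py) in itertools.product(pairs, repeat=2):
        q0[INDEX[mating_key(x, y)]] += px * py
    return q0

def het_freq(q):
    """Heterozygote frequency among mated individuals (the role played by p_il)."""
    p_het = np.array([((g1 == "Aa") + (g2 == "Aa")) / 2 for g1, g2 in MATING_TYPES])
    return q @ p_het

T = sib_transition_matrix()
q0 = initial_mating_probs([0.25, 0.5, 0.25])  # assumed F2 proportions
for t in (0, 1, 10):
    print(t, het_freq(q0 @ np.linalg.matrix_power(T, t)))
```

Building the matrix from the offspring distributions, rather than typing its entries by hand, keeps the sketch easy to check: every row sums to one, and the two homozygous matings are absorbing states.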

#### Covariance between loci:

To calculate the covariance term between loci *j* and *k* separated by an average recombination fraction, we use a similar classification for the 160 possible two-locus epigenotypes resulting from the initial cross (Figure 2A). They can be assigned to 16 different classes of 10 pairs each, *d*_{1},…, *d*_{16}, with probability *V*_{i}(*η*_{j}, *η*_{k}) (supporting material in File S1, 2.1). Since each two-locus epigenotype can belong to any one of these 16 classes, we can write the covariance in terms of these classes (Equation 10). This expression involves a 10-dimensional vector of two-locus epigenotype probabilities (*i.e.,* parental haplotypes) and the two different expected epigenotypes in a given class *d*_{m} (supporting material in File S1, 2.2). The probability vector is obtained following a Markov chain approach; below we specify the cases for selfing and sibling mating. In the simplified case of purely Mendelian inheritance (*s* = 1), no transgression (τ = 0), and *t* = ∞, the derivation of these two-locus probabilities has received considerable attention as a problem in its own right (Haldane and Waddington 1931; Wright 1933; Kimura 1963; Broman 2005).

##### Selfing:

For each of the 16 classes, the initial probability vector is calculated depending on the type of base population considered (F2 or BC) (supporting material in File S1, 2.3.1). The transition probability matrix is a 10 × 10 matrix of transition probabilities of each two-locus epigenotype crossed with itself (supporting material in File S1, 2.3.2), and the probabilities at time *t* follow from the initial vector and this matrix. Using Equation 10, we obtain Equation 11. The analytical value of the parameter *q*_{2} is shown in Table A3.

##### Sibling mating:

Consider the 55 different mating types between the 10 two-locus epigenotypes in each class (supporting material in File S1, 2.4.1) (Bulmer 1985). Taking into account the symmetries mentioned above, we can reduce them to 22 different mating types for the F2-generated population and to 34 for the BC one. As in the single-locus case, we define the initial probabilities of each mating class, and the probabilities for the 10 two-locus epigenotypes can be extracted from the mating type probabilities (supporting material in File S1, 2.4.2).

Unfortunately, the transition matrix cannot be diagonalized symbolically because its dimension is too large (see Figure S2). For this reason we cannot obtain an analytical expression for the epigenotype probabilities as a function of the relevant parameter. However, we can fix that parameter to a numerical value before calculating the matrix power and thus obtain the exact probability values at any time *t*. Moreover, it is possible to write the probability vector symbolically and, using Equation 10, to write the covariance for sibling mating in the form of Equation 12, where the constant *q*_{2} is calculated exactly for any value of the parameter for an F2- or a BC-based population.

#### Epigenetic variance:

Finally, combining Equations 8 and 9 with Equations 11 and 12, and multiplying by the number of loci and their mean phenotypic effect, *N*δ^{2}, yields Equation 5 in the main text.

## APPENDIX B

In this appendix we show how we estimate the number of QTL in the *ddm1*-derived epiRILs. Consider the equation for *N* in the main text (Equation 6) at *t* = ∞ (*i.e.,* fully inbred), under the assumption of perfect resetting (a condition on γ(*t* = ∞)), and let **Ψ** denote a vector of all the parameters specified on the right-hand side of that equation. Substituting the values for *D*, σ^{2}(η, *t*), *s*, and the remaining parameters provided in the main text yields one equation and two unknowns (*N* and τ). With heritability data from only one generation, it is not possible to find unique solutions, and at least one additional generation of phenotypic measurements is required.

#### Obtaining the average *N*:

In the absence of such additional data, one strategy is to calculate an average *N* by integrating over the theoretical range of τ, where *u* = τ|(*N* = *N*_{max}) is the upper integration limit. To find a value for *u* we solve Equation 5 for τ and evaluate it at the expected maximum number of QTL, *N*_{max}, which can be detected given the particular mating scheme.

#### Obtaining the expected value for *N*_{max}:

In a population of RILs, let *R*_{j} denote the probability of a recombinant type over the entire length of the *j*th chromosome. It is obtained by summing, over the ensemble ℛ of recombinant two-locus epigenotypes, the components *R*^{(i)}(*t*) of the two-locus probability vector (Appendix A) at fixation. Using our Markov chain approach for selfing, the value of *R*_{j} is given by Equation 13, where *r*_{j} is the recombination fraction at meiosis between the beginning and the end of chromosome *j*. Note that the result for *R*_{j}|*F*2 is consistent with Haldane and Waddington (1931). Given a known genetic map, *r*_{j} can be calculated using any map function, as long as its inverse is available (Liu 1998). The probability of a recombinant type implies at least one recombination breakpoint in the interval, thus generating two potential QTL segments flanking the breakpoint. Assuming *s* = 1 (all epialleles are stable), the expected maximum number of QTL occurs in a situation where each generated segment is occupied by a QTL. This expectation can be approximated by an expression involving *C*, the total number of chromosomes, and the odds ratio of a recombination *vs.* no recombination breakpoint on chromosome *j*. As a rule of thumb, it is safe to say that *E*(*N*_{max}|*F*2) ≃ 2*C*, with a corresponding rule of thumb for the BC case. These latter expressions assume linkage equilibrium between the beginning and end of chromosome *j*, that is *r*_{j} = 0.5.
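The quantities entering this calculation can be sketched numerically for the F2 case under selfing. The sketch assumes that *R*_{j}|*F*2 takes the classical Haldane and Waddington (1931) form 2*r*/(1 + 2*r*), as the text indicates, and uses the Haldane map function as one possible choice of map function. The function names and the example map lengths are hypothetical, and the sketch does not reproduce the expression for *E*(*N*_{max}).

```python
import math

def haldane_r(map_length_morgans):
    """Haldane map function: recombination fraction for a map distance in Morgans."""
    return 0.5 * (1.0 - math.exp(-2.0 * map_length_morgans))

def ril_recombinant_prob_selfing(r):
    """Classical Haldane-Waddington recombinant fraction for RILs by selfing."""
    return 2.0 * r / (1.0 + 2.0 * r)

def odds(p):
    """Odds of an event with probability p (requires p < 1)."""
    return p / (1.0 - p)

# Hypothetical chromosome map lengths in Morgans (illustrative values only).
map_lengths = [1.2, 0.8, 0.9, 0.8, 1.0]
for j, length in enumerate(map_lengths, start=1):
    r_j = haldane_r(length)
    R_j = ril_recombinant_prob_selfing(r_j)
    print(f"chromosome {j}: r = {r_j:.3f}, R = {R_j:.3f}, odds = {odds(R_j):.3f}")
```

At *r*_{j} = 0.5 the recombinant probability is 0.5 and the odds equal 1, which is the linkage equilibrium condition assumed by the rule of thumb above.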

#### Bootstrap standard errors:

We obtain standard errors for the average *N* using a nonparametric bootstrap approach. To achieve this we take the following steps:

1. Recalculate *D* on the basis of a random sample of size *n* from each of the two parental phenotypic vectors.
2. Draw a random, stratified bootstrap sample from the epiRILs phenotypic vector, and approximate the epigenetic variance, σ^{2}(η, *t*), using a random intercepts model: *y*_{i,j} = β_{0} + *b*_{i}*z*_{i,j} + ε_{i,j}, where β_{0} is a common fixed intercept, *b*_{i} is the random intercept of the *i*th line, *z*_{i,j} is an index variable, and ε_{i,j} is the error. We assume that *b*_{i} ∼ *N*(0, σ^{2}(η, *t* = ∞)), ε_{i,j} ∼ *N*(0, σ^{2}), and Cov(ε_{i,j}, ε_{i,j′}) = 0.
3. Use the estimates from steps 1 and 2, and determine the upper integration limit (*u*) for τ.
4. Determine the average *N* by carrying out the integration over τ.
5. Repeat steps 1–4 a large number of times. The standard deviation of the resulting bootstrap distribution is an approximation for the standard error of the average *N*.

Note that these sampling errors will be slightly underestimated, because the values for *s* and γ(*t*) are assumed known from molecular analysis, and the variation in *N*_{max} is also neglected for simplicity.
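The resampling logic of the variance step can be sketched as follows. This is an illustration only: it covers the bootstrap of the between-line variance and not the calculation of *D*, *u*, or the average *N*. It also swaps in a one-way ANOVA (method-of-moments) estimator for the random intercepts model, which targets the same between-line variance in a balanced design. The function names, the assumption that strata correspond to lines, and the simulated data are all hypothetical.

```python
import numpy as np

def between_line_variance(y):
    """ANOVA (method-of-moments) estimate of the between-line variance.

    y has shape (n_lines, n_reps): balanced replicate measurements per line.
    Used here in place of a fitted random intercepts model.
    """
    n_lines, n_reps = y.shape
    line_means = y.mean(axis=1)
    msb = n_reps * line_means.var(ddof=1)  # between-line mean square
    msw = ((y - line_means[:, None]) ** 2).sum() / (n_lines * (n_reps - 1))
    return max((msb - msw) / n_reps, 0.0)  # truncate negative estimates at 0

def bootstrap_se(y, n_boot=500, seed=0):
    """Standard deviation of the bootstrap distribution of the variance estimate."""
    rng = np.random.default_rng(seed)
    n_lines, n_reps = y.shape
    rows = np.arange(n_lines)[:, None]
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        # Stratified resampling: replicates drawn with replacement within each line.
        cols = rng.integers(0, n_reps, size=(n_lines, n_reps))
        estimates[b] = between_line_variance(y[rows, cols])
    return estimates.std(ddof=1)

# Simulated phenotypes (hypothetical): 200 lines, 5 replicates each,
# between-line SD of 2 and residual SD of 1.
rng = np.random.default_rng(1)
y_sim = 10 + rng.normal(0, 2.0, size=(200, 1)) + rng.normal(0, 1.0, size=(200, 5))
print(between_line_variance(y_sim), bootstrap_se(y_sim, n_boot=200))
```

In a full analysis, each bootstrap replicate would also recompute *D* and the integration limit before evaluating the average *N*, as described in the steps above.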

## Acknowledgments

We thank several anonymous reviewers for detailed suggestions. We also thank Ritsert C. Jansen and Vincent Colot for helpful comments on earlier versions of this manuscript. F. Johannes was supported by a Horizon Breakthrough grant (Netherlands Organisation for Scientific Research (NWO)), and M. Colomé-Tatché acknowledges support from the Centre for Quantum Engineering and Space-Time Research (QUEST).

## Footnotes

Communicating editor: D. W. Threadgill

- Received January 22, 2011.
- Accepted February 28, 2011.

- Copyright © 2011 by the Genetics Society of America

Available freely online through the author-supported open access option.