## Abstract

High-order epistasis has been observed in many genotype-phenotype maps. These multi-way interactions between mutations may be useful for dissecting complex traits and could have profound implications for evolution. Alternatively, they could be a statistical artifact. High-order epistasis models assume the effects of mutations should add, when they could in fact multiply or combine in some other nonlinear way. A mismatch in the “scale” of the epistasis model and the scale of the underlying map would lead to spurious epistasis. In this article, we develop an approach to estimate the nonlinear scales of arbitrary genotype-phenotype maps. We can then linearize these maps and extract high-order epistasis. We investigated seven experimental genotype-phenotype maps for which high-order epistasis had been reported previously. We find that five of the seven maps exhibited nonlinear scales. Interestingly, even after accounting for nonlinearity, we found statistically significant high-order epistasis in all seven maps. The contributions of high-order epistasis to the total variation ranged from 2.2 to 31.0%, with an average across maps of 12.7%. Our results provide strong evidence for extensive high-order epistasis, even after nonlinear scale is taken into account. Further, we describe a simple method to estimate and account for nonlinearity in genotype-phenotype maps.

RECENT analyses of genotype-phenotype maps have revealed “high-order” epistasis—that is, interactions between three, four, and even more mutations (Ritchie *et al.* 2001; Segrè *et al.* 2005; Xu *et al.* 2005; Tsai *et al.* 2007; Imielinski and Belta 2008; Matsuura *et al.* 2009; da Silva *et al.* 2010; Pettersson *et al.* 2011; Wang *et al.* 2012; Hu *et al.* 2013; Weinreich *et al.* 2013; Sun *et al.* 2014; Anderson *et al.* 2015; Yokoyama *et al.* 2015). The importance of these interactions for understanding biological systems and their evolution is the subject of current debate (Weinreich *et al.* 2013; Poelwijk *et al.* 2016). Can they be interpreted as specific, biological interactions between loci? Or are they misleading statistical correlations?

We set out to tackle one potential source of spurious epistasis: a mismatch between the “scale” of the map and the scale of the model used to dissect epistasis (Fisher 1918; Rothman *et al.* 1980; Frankel and Schork 1996; Cordell 2002; Phillips 2008; Szendro *et al.* 2013). The scale defines how to combine mutational effects. On a linear scale, the effects of individual mutations are added. On a multiplicative scale, the effects of mutations are multiplied. Other, arbitrarily complex scales, are also possible (Rokyta *et al.* 2011; Schenk *et al.* 2013; Blanquart 2014).

Application of a linear model to a nonlinear map will lead to apparent epistasis (Fisher 1918; Rothman *et al.* 1980; Frankel and Schork 1996; Cordell 2002; Phillips 2008; Szendro *et al.* 2013). Consider a map with independent, multiplicative mutations. Analysis with a multiplicative model will give no epistasis. In contrast, analysis with a linear model will give epistatic coefficients to account for the multiplicative nonlinearity (Cordell 2002; Phillips 2008). Epistasis arising from a mismatch in scale is mathematically valid, but obscures a key feature of the map: its scale. It is also not parsimonious, as it uses many coefficients to describe a potentially simple, nonlinear function. Finally, it can be misleading because these epistatic coefficients partition global information about the nonlinear scale into (apparently) specific interactions between mutations.

Most high-order epistasis models assume a linear scale (or a multiplicative scale transformed onto a linear scale) (Heckendorn and Whitley 1999; Szendro *et al.* 2013; Weinreich *et al.* 2013; Poelwijk *et al.* 2016). These models sum the independent effects of mutations to predict multi-mutation phenotypes. Epistatic coefficients account for the difference between the observed phenotypes and the phenotypes predicted by summing mutational effects. The epistatic coefficients that result are, by construction, on the same linear scale (Heckendorn and Whitley 1999; Weinreich *et al.* 2013; Poelwijk *et al.* 2016).

Because the underlying scale of genotype-phenotype maps is not known *a priori*, the interpretation of high-order epistasis extracted on a linear scale is unclear. If a nonlinear scale can be found that removes high-order epistasis, it would suggest that high-order epistasis is spurious: a highly complex description of a simple, nonlinear system. In contrast, if no such scale can be found, high-order epistasis provides a window into the profound complexity of genotype-phenotype maps.

In this article, we set out to estimate the nonlinear scales of experimental genotype-phenotype maps. We then account for these scales in the analysis of high-order epistasis. We took our inspiration from the treatment of multiplicative maps, which can be transformed into linear maps using a log transform. Along these same lines, we set out to transform genotype-phenotype maps with arbitrary, nonlinear scales onto a linear scale for analysis of high-order epistasis. We develop our methodology using simulations and then apply it to experimentally measured genotype-phenotype maps.

## Methods

### Experimental data sets

We collected a set of published genotype-phenotype maps for which high-order epistasis had been reported previously. Measuring an *L*th-order interaction requires knowing the phenotypes of all binary combinations of *L* mutations—that is, genotypes. The data sets we used had exhaustively covered all genotypes for five or six mutations. These data sets cover a broad spectrum of genotypes and phenotypes. Genotypes included point mutations to a single protein (Weinreich *et al.* 2006), point mutations in both members of a protein/DNA complex (Anderson *et al.* 2015), random genomic mutations (de Visser *et al.* 2009; Khan *et al.* 2011), and binary combinations of alleles within a biosynthetic network (Hall *et al.* 2010). Measured phenotypes included selection coefficients (Weinreich *et al.* 2006; de Visser *et al.* 2009; Khan *et al.* 2011), molecular binding affinity (Anderson *et al.* 2015), and yeast growth rate (Hall *et al.* 2010). (For several data sets, the “phenotype” is a selection coefficient. We do not differentiate fitness from other properties for our analyses; therefore, for simplicity, we will refer to all maps as genotype-phenotype maps rather than specifying some as genotype-fitness maps). All data sets had a minimum of three independent measurements of the phenotype for each genotype. All data sets are available in a standardized American Standard Code for Information Interchange (ASCII) text format.

### Nonlinear scale

We described nonlinearity in the genotype-phenotype map by a power transform (see *Results*) (Box and Cox 1964; Carroll and Ruppert 1981). The independent variable for the transformation was the predicted phenotypes of all genotypes assuming linear and additive effects for each mutation. The estimated additive phenotype of genotype *i* is given by(1)where is the average effect of mutation *j* across all backgrounds, is an index that encodes whether or not mutation *j* is present in genotype *i*, and *L* is the number of sites. The dependent variables are the observed phenotypes taken from the experimental genotype-phenotype maps.

We use nonlinear least-squares regression to fit and estimate the power transformation from to where *ε* is a residual and *τ* is a power-transform function. This is given by:where *A* and *B* are translation constants, is the geometric mean of , and *λ* is a scaling parameter. We used standard nonlinear regression techniques to minimize *d*:We then reversed this transformation to linearize using the estimated parameters and We did so by the back-transform:

### High-order epistasis model

We dissected epistasis using a linear, high-order epistasis model. These have been discussed extensively elsewhere (Heckendorn and Whitley 1999; Weinreich *et al.* 2013; Poelwijk *et al.* 2016), so we will only briefly and informally review them here.

A high-order epistasis model is a linear decomposition of a genotype-phenotype map. It yields a set of coefficients that account for all variation in phenotype. The signs and magnitudes of the epistatic coefficients quantify the effect of mutations and interactions between them. A binary map with genotypes requires epistatic coefficients and captures all interactions, up to *L*th-order, between them. This is conveniently described in matrix notation.(3)a vector of phenotypes can be transformed into a vector of epistatic coefficients using a decomposition matrix that encodes which coefficients contribute to which phenotypes. If is invertible, one can determine from a collection of measured phenotypes by(4) can be formulated in a variety of ways (Poelwijk *et al.* 2016). Following others in the genetics literature, we use the form derived from Walsh polynomials (Heckendorn and Whitley 1999; Weinreich *et al.* 2013; Poelwijk *et al.* 2016). In this form, is a Hadamard matrix. Conceptually, the transformation identifies the geometric center of the genotype-phenotype map and then measures the average effects of each mutation and combination of mutations in this “average” genetic background (Figure 1). To achieve this, we encoded each mutation at each site in each genotype as −1 (wild type) or +1 (mutant) (Heckendorn and Whitley 1999; Weinreich *et al.* 2013; Poelwijk *et al.* 2016). This has been called a Fourier analysis,(Neidhart *et al.* 2013; Szendro *et al.* 2013), global epistasis (Poelwijk *et al.* 2016), or a Walsh space (Heckendorn and Whitley 1999; Weinreich *et al.* 2006). Another common approach is to use a single wild-type genotype as a reference and encode mutations as either 0 (wild type) or 1 (mutant) (Poelwijk *et al.* 2016).

One data set (IV, Table 1) has four possible states (A, G, C, and T) at two of the sites. We encoded these using the WYK tetrahedral-encoding scheme (Zhang and Zhang 1991; Anderson *et al.* 2015). Each state is encoded by a three-bit state. The wild-type state is given the bits The remaining states are encoded with bits that form corners of a tetrahedron. For example, the wild type of site 1 is G and encoded as the state. The remaining states are encoded as follows: A is C is and T is

### Experimental uncertainty

We used a bootstrap approach to propagate uncertainty in measured phenotypes into uncertainty in epistatic coefficients. To do so, we: (1) calculated the mean and SD for each phenotype from the published experimental replicates, (2) sampled the uncertainty distribution for each phenotype to generate a pseudoreplicate vector that had one phenotype per genotype, (3) rescaled using a power transform, and (4) determined the epistatic coefficients for We then repeated steps 2–4 until convergence. We determined the mean and variance of each epistatic coefficient after every 50 pseudoreplicates. We defined convergence as the mean and variance of every epistatic coefficient changed by after addition of 50 more pseudoreplicates. On average, convergence required ∼100,000 replicates per genotype-phenotype map. Finally, we used a *z*-score to determine if each epistatic coefficient was significantly different than zero. To account for multiple testing, we applied a Bonferroni correction to all *P*-values (Abdi 2007).

### Computational methods

Our full epistasis software package—written in Python3 extended with Numpy and Scipy (van der Walt *et al.* 2011)—is available for download via github (https://github.com/harmslab/epistasis). We used the Python package scikit-learn for all regressions (Pedregosa *et al.* 2011). Plots were generated using matplotlib and Jupyter Notebooks (Hunter 2007; Perez and Granger 2007).

### Data availability

The data sets and code used in this work are available at https://github.com/harmslab/notebooks-nonlinear-high-order-epistasis. The data sets are available in standard JSON format. The code is available as Jupyter Notebooks.

## Results

### Nonlinear scale induces apparent high-order epistasis

Our first goal was to understand how a nonlinear scale, if present, would affect estimates of high-order epistasis. To probe this question, we constructed a five-site binary genotype-phenotype map on a nonlinear scale, and then extracted epistasis assuming a linear scale. The nonlinear scale we chose was a saturating function:(5)where is the linear phenotype of genotype *g*, is the transformed phenotype of genotype *g*, and *K* is a scaling constant. As the map becomes linear. As *K* increases, mutations have systematically smaller effects when introduced into backgrounds with higher phenotypes.

We calculated for all binary genotypes using the random, additive coefficients shown in Figure 2A. These coefficients included no epistasis. We then transformed onto the nonlinear scale using Equation 5 with the relatively shallow saturation curve shown in Figure 2B. Finally, we applied a linear epistasis model to to extract epistatic coefficients.

We found that nonlinearity in the genotype-phenotype map induced extensive high-order epistasis when the nonlinearity was ignored (Figure 2C). We observed epistasis up to the fourth order, despite building the map with purely additive coefficients. This result is unsurprising: the only mechanism by which a linear model can account for variation in phenotype is through epistatic coefficients (Rothman *et al.* 1980; Frankel and Schork 1996; Cordell 2002). When given a nonlinear map, it partitions the variation arising from nonlinearity into specific interactions between mutations. This high-order epistasis is mathematically valid, but does not capture the major feature of the map—namely, saturation. Indeed, this epistasis is deceptive, as it is naturally interpreted as specific interactions between mutations. For example, this analysis identifies a specific interaction between mutations one, two, four, and five (Figure 2C, purple). But this four-way interaction is an artifact of the nonlinearity in phenotype of the map, rather than a specific interaction.

### Nonlinear scale and specific epistatic interactions induce different patterns of nonadditivity

Our next question was whether we could separate the effects of nonlinear scale and high-order epistasis in binary maps. One useful approach to develop intuition about epistasis is to plot the observed phenotypes against the predicted phenotype of each genotype, assuming linear and additive mutational effects (Rokyta *et al.* 2011; Schenk *et al.* 2013; Szendro *et al.* 2013). In a linear map without epistasis, equals because each mutation would have the same, additive effect in all backgrounds. If epistasis is present, phenotypes will diverge from the line.

We simulated maps including varying amounts of linear, high-order epistasis, placed them onto increasingly nonlinear scales, and then constructed *vs.* plots. We added high-order epistasis by generating random epistatic coefficients and then calculating phenotypes using Equation 3. We introduced nonlinearity by transforming these phenotypes with Equation 5. For each genotype in these simulations, we calculated as the sum of the first-order coefficients used in the generating model. is the observable phenotype, including both high-order epistasis and nonlinear scale.

High-order epistasis and nonlinear scale had qualitatively different effects on *vs.* plots. Figure 3A shows plots of *vs.* for increasing nonlinearity (left to right) and high-order epistasis (bottom to top). As nonlinearity increases, curves systematically relative to This reflects the fact that is on a linear scale and is on a saturating, nonlinear scale. The shape of the curve reflects the map between the linear and saturating scale: the smallest phenotypes are underestimated and the largest phenotypes are overestimated. In contrast, high-order epistasis induces random scatter away from the line. This is because the epistatic coefficients used to generate the map are specific to each genotype, moving observations off the expected line, even if the scaling relationship is taken into account.

### Nonlinearity can be separated from underlying high-order epistasis

The *vs.* plots suggest an approach to disentangle high-order epistasis from nonlinear scale. By fitting a function to the *vs.* curve, we describe a transformation that relates the linear scale to the (possibly nonlinear) scale (Schenk *et al.* 2013; Szendro *et al.* 2013). Once the form of the nonlinearity is known, we can then linearize the phenotypes so they are on an appropriate scale for epistatic analysis. Variation that remains (*i.e.*, scatter) can then be confidently partitioned into epistatic coefficients.

In the absence of knowledge about the source of the nonlinearity, a natural choice is a power transform (Box and Cox 1964; Carroll and Ruppert 1981), which identifies a monotonic, continuous function through *vs.* A key feature of this approach is that power-transformed data are normally distributed around the fit curve and thus appropriately scaled for regression of a linear epistasis model.

We tested this approach using one of our simulated data sets. One complication is that, for an experimental map, we do not know In the analysis above, we determined from the additive coefficients used to generate the space. In a real map, is not known; therefore, we had to estimate We did so by measuring the average effect of each mutation across all backgrounds, and then calculating for each genotype as the sum of these average effects (Equation 1).

We fit the power transform to *vs.* (solid red line, Figure 3B). The curve captures the nonlinearity added in the simulation. We linearized using the fit model (Equation 2), and then extracted epistatic coefficients. The extracted coefficients were highly correlated with the coefficients used to generate the map (Figure 3C). In contrast, applying the linear epistasis model to this map without first accounting for nonlinearity gives much greater scatter between the input and output coefficients (Figure 3D). This occurs because phenotypic variation from nonlinearity is incorrectly partitioned into the linear epistatic coefficients.

### Nonlinearity is a common feature of genotype-phenotype maps

Our next question was whether experimental maps exhibited nonlinear scales. We selected seven genotype-phenotype maps that had previously been reported to exhibit high-order epistasis (Table 1) and fit power transforms to each data set (Figure 4 and Supplemental Material, Figure S1 and File S1). We expected some phenotypes to be multiplicative (*e.g.*, data sets I, II, and IV were relative fitness), while we expected some to be additive (*e.g.*, data set IV is a free energy). Rather than rescaling the multiplicative data sets by taking logarithms of the phenotypes, we allowed our power transform to capture the appropriate scale. The power transform identified nonlinearity in the majority of the data sets. Of the seven data sets, three were less than additive (II, V, VI), two were greater than additive (III, IV), and two were approximately linear (I, VII). All data sets gave random residuals after fitting the power transform (Figure 4 and Figure S1).

### High-order epistasis is a common feature of genotype-phenotype maps

With estimated scales in hand, we linearized the maps using Equation 2 and remeasured epistasis (Figure S2). We used bootstrap sampling of uncertainty in the measured phenotypes to determine the uncertainty of each epistatic coefficient (see *Methods*), and then integrated these distributions to determine whether each coefficient was significantly different than zero. We then applied a Bonferroni correction to each *P*-value to account for multiple testing.

Despite our conservative statistical approach, we found high-order epistasis in every map studied (Figure 5A and Figure S3). Every data set exhibited at least one statistically significant epistatic coefficient of fourth order or higher. We even detected statistically significant fifth-order epistasis (blue bar in Figure 5A, data set II). High-order coefficients were both positive and negative, often with magnitudes equal to or greater than the second-order terms. These results reveal that high-order epistasis is a robust feature of these maps, even when nonlinearity and measurement uncertainty in the genotype-phenotype map is taken into account.

We also dissected the relative contributions of each epistatic order to the remaining variation. To do so, we created truncated epistasis models: an additive model, a model containing additive and pairwise terms, a model containing additive through third-order terms, *etc*. We then measured how well each model accounted for variation in the phenotype using a Pearson’s coefficient between the fit and the data. Finally, we asked how much the Pearson coefficient changed with addition of more epistatic coefficients. For example, to measure the contribution of pairwise epistasis, we took the difference in the correlation coefficient between the additive plus pairwise model and the purely additive model.

The contribution of epistasis to the maps was highly variable (Figure 5B and Figure S3). For data set I, epistatic terms explained 5.9% of the variation in the data. The contributions of epistatic coefficients decayed with increasing order, with fifth-order epistasis only explaining 0.1% of the variation in the data. In contrast, for data set II, epistasis explains 43.3% of the variation in the map. Fifth-order epistasis accounts for 6.3% of the variation in the map. The other data sets had epistatic contributions somewhere between these extremes.

### Accounting for nonlinear genotype-phenotype maps alters epistatic coefficients

Finally, we probed to what extent accounting for nonlinearity in phenotype altered the epistatic coefficients extracted from each space. Figure 6 and Figure S4 show correlation plots between epistatic coefficients extracted both with and without linearization. The first-order coefficients were all highly correlated between the linear and nonlinear analyses for all data sets (Figure S5).

For the epistatic coefficients, the degree of correlation depended on the degree of nonlinearity in the data set. Data set I—which was essentially linear—had identical epistatic coefficients whether the nonlinear scale was taken into account or not. In contrast, the other data sets exhibited scatter off of the line. Data set III was particularly noteworthy. The epistatic coefficients were systematically overestimated when the nonlinear scale was ignored. Two large and favorable pairwise epistatic terms in the linear analysis became essentially zero when nonlinearity was taken into account. These interactions—M182T/g4205a and G283S/g4205a—were both noted as determinants of evolutionary trajectories in the original publication (Weinreich *et al.* 2006); however, our results suggest the interaction is an artifact of applying a linear model to a nonlinear data set. Further ∼20% (six of 27) of the epistatic coefficients flipped sign when nonlinearity was taken into account (Figure 6, III, bottom-right quadrant).

Overall, we found that low-order epistatic coefficients were more robust to the linear assumption than high-order coefficients. Data set IV is a clear example of this behavior. The map exhibited noticeable nonlinearity (Figure 4). The first- and second-order terms were well correlated between the linear and nonlinear analyses (Figure 6, Figure S4, and Figure S5). Higher-order terms, however, exhibited much poorer overall correlation. While the for second-order coefficients was 0.95, the correlation was only 0.43 for third-order coefficients. This suggests that previous analyses of nonlinear genotype-phenotype maps correctly identified the key mutations responsible for variation in the map, but incorrectly estimated the high-order epistatic effects.

## Discussion

Our results reveal that both nonlinear scales and high-order epistasis play important roles in shaping experimental genotype-phenotype maps. Five of the seven data sets we investigated exhibited nonlinear scales, and all of the data sets exhibited high-order epistasis, even after accounting for nonlinearity. This suggests that both should be taken into account in analyses of genotype-phenotype maps.

### Origins of nonlinear scales

We observed two basic forms of nonlinearity in these maps: saturating, less-than-additive maps and exploding, greater-than-additive maps. Many have observed less-than-additive maps in which mutations have lower effects when introduced into more optimal backgrounds (MacLean *et al.* 2010; Chou *et al.* 2011). Such saturation has been proposed to be a key factor shaping evolutionary trajectories (Otto and Feldman 1997; MacLean *et al.* 2010; Chou *et al.* 2011; Tokuriki *et al.* 2012; Kryazhimskiy *et al.* 2014). Further, it is intuitive that optimizing a phenotype becomes more difficult as that phenotype improves. Our nonlinear fits revealed this behavior in three different maps.

The greater-than-additive maps, in contrast, were more surprising: why would mutations have a larger effect when introduced into a more favorable background? For the β-lactamase genotype-phenotype map (III, Figure 4), this may be an artifact of the original analysis used to generate the data set (Weinreich *et al.* 2006). This data set describes the fitness of bacteria expressing variants of an enzyme with activity against β-lactam antibiotics. The original authors measured the minimum-inhibitory concentration (MIC) of the antibiotic against bacteria expressing each enzyme variant. They then converted their MIC values into apparent fitness by sampling from an exponential distribution of fitness values and assigning these fitness values to rank-ordered MIC values (Weinreich *et al.* 2006). Our epistasis model extracts this original exponential distribution (Figure S6). This result demonstrates the effectiveness of our approach in extracting nonlinearity in the genotype-phenotype map.

The origins of the growth in the transcription factor/DNA binding data set are less clear (IV, Figure 4). The data set measures the binding free energy of variants of a transcription factor binding to different DNA response elements. We are aware of no physical reason for mutations to have a larger effect on free energy when introduced into a background with better binding. One possibility is that the genotype-phenotype map reflects multiple features that are simultaneously altered by mutations, giving rise to this nonlinear shape. This is a distinct possibility in this data set, where mutations are known to alter both DNA binding affinity and DNA binding cooperativity (McKeown *et al.* 2014).

### Best practice

Because nonlinearity is a common feature of these maps, linearity should not be assumed in analyses of epistasis. Given a sufficient number of phenotypic observations, however, the appropriate scale can be estimated by construction of a *vs.* plot and regression of a nonlinear scale model. With this scale in hand, one can then transform the genotype-phenotype map onto a linear scale appropriate for analysis using a high-order epistasis model. Our software pipeline automates this process. It takes any genotype-phenotype map in a standard text format, fits for nonlinearity, and then estimates high-order epistasis. It is freely available for download (https://github.com/harmslab/epistasis).

One important question is how to select an appropriate function to describe the nonlinear scale. By visual inspection, all of the data sets we studied were monotonic in and could be readily captured by a power transform. Other maps may be better captured with other functions. For example, inspection of a *vs.* plot could reveal a nonmonotonic scale, leading to a better fit with a polynomial than a power transform. Another possibility is that external biological knowledge motivates scale choice (Schenk *et al.* 2013).

The choice of model determines what fraction of the variation is assigned to scale *vs.* “epistasis.” The more complicated the function chosen, the more variation in the data is shifted from epistasis and into scale. One could, for example, fit a completely uninformative *L*th-order polynomial, which would capture all of the variation as scale and none as epistasis. Scale estimation should be governed by the well-established principles of model regression: find the simplest function that captures the maximum amount of variation in the data set without fitting stochastic noise. Because epistasis is scatter off the scale line (noise), model-selection approaches like the *F*-test, Akaike information criterion, and inspection of fit residuals are a natural strategy for partitioning variation between scale and epistasis.

### Interpretation

Another powerful aspect of this approach is that it allows explicit separation of two distinct origins of nonadditivity in genotype-phenotype maps.

This can be illustrated with a simple, conceptual example. Imagine mutations to an enzyme, expressed in bacteria, that have a less-than-additive effect on bacterial growth rate. To a first approximation, this epistasis could have two origins. The first is at the level of the enzyme: maybe the mutations have specific, negative chemical interactions that alter enzyme rate. The second is at the level of the whole cell: maybe, above a certain activity, the enzyme is fast enough that some other part of the cell starts limiting growth. Mutations continue to improve enzyme activity, but growth rate does not reflect this. These two origins of less-than-additive behavior will have different effects in a *vs.* plot: saturation of growth rate will appear as nonlinearity, and interactions between mutations at the enzyme level will appear as linear epistasis. Our analysis would reveal this pattern and set up further experiments to tease apart these possibilities.

This may also provide important evolutionary insights. One important question is to what extent evolutionary paths are shaped by global constraints *vs.* specific interactions that lead to specific historical contingencies (Harms and Thornton 2014; Kryazhimskiy *et al.* 2014; Shah *et al.* 2015). For example, recent work has shown specific epistatic interactions lead to sequence-level unpredictability, while a globally less-than-additive scale leads to predictable phenotypes in evolution (Kryazhimskiy *et al.* 2014). Our analysis approach naturally distinguishes these origins of nonadditivity, and thus these evolutionary possibilities. Prevailing-magnitude epistasis (de Visser *et al.* 2009), global epistasis (Kryazhimskiy *et al.* 2014), and diminishing-returns epistasis (Otto and Feldman 1997; MacLean *et al.* 2010; Chou *et al.* 2011; Tokuriki *et al.* 2012) will all appear as nonlinear scales. In contrast, specific interactions will appear in specific coefficients in the linear epistasis model. Our detection of nonlinearity and high-order epistasis in most data sets suggests that both forms of nonadditivity will be in play over evolutionary time.

### High-order epistasis

Finally, our work reveals that high-order epistasis is, indeed, a common feature of genotype-phenotype maps. Our study could be viewed as an attempt to “explain away” previously observed high-order epistasis. To do so, we both accounted for nonlinearity in the map and propagated experimental uncertainty to the epistatic coefficients. Surprisingly—to the authors, at least—high-order epistasis was robust to these corrections.

High-order epistasis can make huge contributions to genotype-phenotype maps. In data set II, third-order and higher epistasis accounts for fully 31.0% of the variation in the map. The average contribution, across maps, is 12.7%. We also do not see a consistent decay in the contribution of epistasis with increasing order. In data sets II, V, and VI, third-order epistasis contributes more variation to the map than second-order epistasis. This suggests that epistasis could go to even higher orders in larger genotype-phenotype maps.

The generality of these results across all genotype-phenotype maps is unclear. The maps we analyzed were measured and published because they were “interesting,” either from a mechanistic or evolutionary perspective. Further, most of the maps have a single, maximum phenotype peak. The nonlinearity and high-order epistasis we observed may be common for collections of mutations that, together, optimize a function, but less common in “flatter” or more random genotype-phenotype maps. This can only be determined by characterization of genotype-phenotype maps with different structural features.

The observation of this epistasis also raises important questions: What are the origins of third, fourth, and even fifth-order correlations in these data sets? What, mechanistically, leads to a five-way interaction between mutations? Does neglecting high-order epistasis bias estimates of low-order epistasis (Otwinowski and Plotkin 2014)? What can this epistasis tell us about the biological underpinning of these maps (Lehár *et al.* 2008; Hu *et al.* 2011, 2013; Taylor and Ehrenreich 2015).?

The evolutionary implications are also potentially fascinating. Epistasis creates temporal dependency between mutations: the effect of a mutation depends strongly on specific mutations that fixed earlier in time (Bedau and Packard 2003; Desai 2009; Harms and Thornton 2014; Shah *et al.* 2015). How does this play out for high-order epistasis, which introduces long-range correlations across genotype-phenotype maps? Do these low magnitude interactions matter for evolutionary outcomes or dynamics? These, and questions like them, are challenging and fascinating future avenues for further research.

## Acknowledgments

We thank Patrick Phillips, Jamie Bridgham, and members of the Harms laboratory for helpful discussions and comments. We would also like to thank David Hall for providing the complete data for data sets VI and VII. Work was supported by start-up funds from the University of Oregon (Z.R.S). M.J.H. is a Pew Scholar in the Biomedical Sciences, supported by The Pew Charitable Trusts.

## Footnotes

Supplemental material is available online at www.genetics.org/lookup/suppl/doi:10.1534/genetics.116.195214/-/DC1.

*Communicating editor: C. Kendziorski*

- Received August 29, 2016.
- Accepted January 9, 2017.

- Copyright © 2017 Sailer and Harms

Available freely online through the author-supported open access option.

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.