## Abstract

In an attempt to understand whether it should be expected that some genes tend to be used disproportionately often by natural selection, we investigated two related phenomena: the evolution of flux control among enzymes in a metabolic pathway and properties of adaptive substitutions in pathway enzymes. These two phenomena are related by the principle that adaptive substitutions should occur more frequently in enzymes with greater flux control. Predicting which enzymes will be preferentially involved in adaptive evolution thus requires an evolutionary theory of flux control. We investigated the evolution of enzyme control in metabolic pathways with two models of enzyme kinetics: metabolic control theory (MCT) and Michaelis–Menten saturation kinetics (SK). Our models generate two main predictions for pathways in which reactions are moderately to highly irreversible: (1) flux control will evolve to be highly unequal among enzymes in a pathway and (2) upstream enzymes evolve a greater control coefficient then those downstream. This results in upstream enzymes fixing the majority of beneficial mutations during adaptive evolution. Once the population has reached high fitness, the trend is reversed, with the majority of neutral/slightly deleterious mutations occurring in downstream enzymes. These patterns are the result of three factors (the first of these is unique to the MCT simulations while the other two seem to be general properties of the metabolic pathways): (1) the majority of randomly selected, starting combinations of enzyme kinetic rates generate pathways that possess greater control for the upstream enzymes compared to downstream enzymes; (2) selection against large pools of intermediate substrates tends to prevent majority control by downstream enzymes; and (3) equivalent mutations in enzyme kinetic rates have the greatest effect on flux for enzymes with high levels of flux control, and these enzymes will accumulate adaptive substitutions, strengthening their control. Prediction 1 is well supported by available data on control coefficients. Data for evaluating prediction 2 are sparse but not inconsistent with this prediction.

THEORETICAL research on the process of adaptation has focused primarily on describing the size and number of genetic changes underlying phenotypic change (Fisher 1930; Kimura 1983; Orr 1998, 2002, 2003). By contrast, comparatively little theoretical attention has been given to the question of whether certain genes or types of genes are preferentially involved in the process of adaptation. Yet the current debate over the relative importance of regulatory *vs.* structural genes in morphological evolution (Hoekstra and Coyne 2007; Stern and Orgogozo 2008) clearly indicates that this question is of interest to evolutionary biologists.

One situation in which this question is pertinent is the evolution of characters that are influenced by the concentration of end products of metabolic pathways. Often change in end-product concentration can be achieved by substitutions in any one of several genes in the pathway. One example is the intensity of floral pigmentation. To a first approximation, final pigment concentration, and hence color intensity, can be viewed as being determined by the flux rate down the pigment biosynthetic pathway for a fixed time corresponding to the duration of floral development. More generally, any situation in which flux rate determines phenotype is likely to fall in this category. In such situations, metabolic control theory (MCT) (Kacser and Burns 1973) and similar approaches (Heinrich and Rapoport 1974; Savageau 1976) indicate that changes in flux can be achieved by changing the activity of any enzyme in the pathway. We seek to determine whether and, if so, why enzymes differ in the probability that they contribute to evolutionary change in pathway flux.

It has been suggested as a general principle that enzymes with the greatest control over flux will be disproportionately involved in such evolutionary change (Hartl *et al*. 1985; Eanes 1999; Watt and Dean 2000). This argument is based on the theoretical expectation that the probability of fixation of an advantageous allele is roughly proportional to its selection coefficient (Hedrick 2000). Since mutations equivalent in terms of enzyme kinetic properties will have greater effects on flux, and hence on fitness, in enzymes with greater metabolic control, mutations in those enzymes will be substituted preferentially.

While this argument is likely sound, it simply pushes back the question of which genes evolve preferentially to the question of which enzymes are expected to have greatest control over flux. Although we are unaware of any theoretical attempts to model the evolution of flux control, many authors have speculated about where in pathways control is expected to be highest.

Kacser and Burns (1973) hypothesized that the magnitudes of flux control exerted by different enzymes may be very similar. This hypothesis was based on the result from MCT that in linear pathways, overall flux control can be shared by all enzymes. Since metabolic pathways often consist of many enzymes, each would be expected to have only a limited potential to influence flux. Subsequent theoretical analysis of this hypothesis demonstrated that a given flux is consistent with many different flux-control distributions, including, at one extreme, equal flux control by all enzymes and, at the other extreme, major control by one or a few enzymes and little control for all others (Mazat *et al*. 1996). However, the question of which of these possibilities, if any, are likely to be favored by selection has not been addressed.

Another hypothesis, the epistatic or synergistic principle, predicts that control will be vested in a single enzyme at any given time, but will shift over time among enzymes (Dykhuizen *et al.* 1987; Keightley 1989; Bost *et al.* 2001) According to this hypothesis, starting from equal control among enzymes in the pathway, selection to increase (or decrease) flux will cause the activity of one enzyme to increase (decrease). This change results in a decrease (increase) in flux control for the enzyme that changes and an increase (decrease) in control for the other pathway enzymes, causing control to be unequally shared. While this argument seems plausible, there has been no analysis of whether over time all enzymes have an equal chance of having elevated control.

Finally, Eanes' (1999) review of enzyme polymorphisms found that control is often centered in enzymes at pathway branch points, which constitute the most upstream enzymes of their specific branch. Flowers *et al.* (2007) also demonstrated that branching enzymes tend to exhibit more adaptive substitutions than downstream enzymes as would be expected under the principle that evolutionary change will be concentrated in enzymes with the largest control coefficients. In addition, evolutionary changes in these enzymes may be favored because they allow organisms to modify flux allocation to alternate functions and track environmental fluctuations. This suggestion is supported by the “branch point effect,” a theoretical demonstration that control coefficients can dramatically shift between enzymes depending on the kinetic rates of the two competing enzymes (LaPorte *et al*. 1984). However, this study does not address the question of how the distribution of control is likely to evolve, but describes only which distributions of control are mathematically possible. Thus, Eanes (1999, p. 318) concludes his review, stating “all enzymes in [a] contributing pathway may not be equal; determining the rule[s] for these inequalities should be a major goal in studies of enzyme polymorphism.”

A control coefficient (CC) indicates the degree to which flux through a pathway is altered by a small change in the activity of an enzyme (see appendix; this is equivalent to the sensitivity coefficient of Kacser and Burns 1973). The “rules” governing the distribution of control coefficients are determined by the biological evolution of metabolic systems. While research demonstrates that there are many *possible* distributions of control coefficients, none has examined which of these is most likely to evolve. The optimization of metabolic systems has been explored in detail (Heinrich *et al.* 1991, 1997; Heinrich and Schuster 1998). In these studies, however, the investigators employ as optimization criteria maximizing flux, maximizing transient times, or minimizing metabolic intermediates, criteria whose biological and evolutionary relevance is unclear.

In an effort to understand how control is expected to be shared among enzymes and predict which enzymes are most likely to contribute to adaptive genetic changes, we present two models of the evolution of flux control in a simple linear pathway. The first model employs the framework of MCT. Although there have been many challenges to the MCT framework (Savageau 1976, 1992; Cornish-Bowden 1989; Savageau and Sorribas 1989), it should be made clear that our aim is not to construct a precisely parameterized model of a particular biological system, but to use this generalized framework to address a single, critically ignored question: What are the rules governing how control will evolve to be distributed among enzymes? The use of the MCT framework to address questions in evolutionary genetics is firmly established, with investigation focused on the molecular basis of dominance (Kacser and Burns 1981; Keightley 1996a; Phadnis and Fry 2005; but see Bagheri and Wagner 2004), the relationship between metabolic flux and fitness (Dykhuizen *et al.* 1987; Szathmary 1993), the amount of additive and nonadditive genetic variance in metabolic systems (Keightley 1989), whether this variation can be explained by mutation–selection balance (Clark 1991), and patterns of response of quantitative traits to selection (Keightley 1996b). The second model we examine, saturation kinetics (SK), is based on Michealis–Menten kinetics and enables us to relax one major assumption of MCT: that enzymes are far from saturation.

Here we limit our analysis to linear pathways as an initial attempt to examine these issues. We find that for such pathways control coefficients will generally evolve to be unequal; that the magnitude of this inequality depends on the *thermodynamic* properties, rather than the kinetic properties, of each reaction step; that upstream enzymes tend to evolve higher control coefficients than downstream enzymes; and that upstream enzymes fix advantageous mutations in greater numbers, and those mutations have larger effects than in downstream enzymes.

## GENERAL MODELING APPROACH

We concentrate here on a simple, linear metabolic pathway containing 2, 3, or 10 enzymes. Following Dykhuizen *et al.* (1987), Clark (1991), and Szathmary (1993) we assume that flux through the pathway is the sole component of fitness and flux is subject to stabilizing selection. Such selection would arise, for example, when there is stabilizing selection on a character (*e.g.*, lactose metabolism, floral pigment intensity, or glucosinolate production following herbivory) whose value is determined by flux. This framework also allows us to examine the effects of directional selection that arises when the optimal level of flux changes, as might occur for floral pigment intensity if there were environmental shift in the optimal intensity for attracting pollinators. We assume, as is usual in an MCT framework, that enzymes are at sufficiently high concentration that they are substantially below saturation (Kacser and Burns 1973). This assumption is subsequently relaxed to investigate the effect of enzyme saturation on the distribution of control coefficients.

To model the evolution of flux control, we assume that flux may be changed either by altering the level of expression of individual enzymes in the pathway (increasing expression level generally increases flux) or by altering the kinetic properties of individual enzymes in the pathway (increasing the specific rate of reaction of an enzyme generally increases flux). We subsume both of these effects by focusing on changes in the overall rate of reaction of an enzyme under standard metabolic conditions. Specifically, we represent the forward rate of reaction *i*, *k _{i}*, to be a product of (1) the amount of the enzyme present and (2) the standard per-unit-enzyme forward rate constant. For a given

*k*, the reverse rate of reaction

_{i}*i*,

*k*

_{−i}, is a thermodynamic property of the system; it is determined by the difference in free energy between products and reactants (Nelson and Cox 2000) and is not subject to organic evolution. To determine the control properties of each enzyme at evolutionary equilibrium, we initialize replicate populations described by the enzyme activities (

*k*

_{1},

*k*

_{2}, …,

*k*) of a metabolic pathway determining fitness of organisms in the population. Each replicate population is seeded with random

_{i}*k*values.

_{i}Mutations occur in a randomly chosen enzyme, *i*. We do not allow enzymes to differ in their mutation rates. The effect size of each mutation, Δ*k _{i}*, is drawn from a Gaussian distribution centered at zero. We measure the effect of each mutation on flux and fitness (Δ

*W*) and determine whether the mutation is fixed on the basis of the standard population-genetic fixation probability. Mutations are allowed to occur repeatedly until the flux reaches its optimum. This process produces one evolutionary trajectory, defined as the route across an adaptive landscape from an initial flux toward a new optimal flux. Trajectories are determined for a large number of different initial sets of randomly chosen

*k*values all evolving toward the same optimal flux. In essence, this procedure simulates a situation in which the optimal flux has just shifted, and the starting points represent many different previous optima. We do not assert that all of these starting values are biologically likely, but adopt this approach to explore the entire space of possible activity values. We perform additional simulations to model a situation in which populations repeatedly shift between two different environmental optima. All simulations are repeated assuming a range of

_{i}*k*

_{−i}values to determine how the control evolves under varying thermodynamic environments. Details of the models used are presented in the appendix.

This analysis yields two types of information: a frequency distribution of the final degree of flux control for each enzyme at evolutionary equilibrium and the distribution of number and size of mutations in every evolutionary trajectory. By analyzing two models of metabolic control (MCT and SK) and conducting multiple sets of runs with different values of model parameters (*e.g.*, mutational distribution, reversibility of reactions, initial size of substrate pool, intensity of stabilizing selection; Table 1), we are able to evaluate the robustness of our conclusions.

## RESULTS

#### General behavior of the MCT model:

In our model, *N*, α_{i}, σ_{J}, σ_{k}, *I*, and *J*_{opt} are fixed parameters (Table 1). We first consider in detail the behavior of the model for a population size of *N* = 1000, *I* = 10, an optimal flux of *J*_{opt} = 1, a strength of selection of σ_{J} = 0.5, and a variance in mutation size of σ_{k} = 0.05. Subsequently we discuss how the behavior of the model changes as these parameters are varied. We also assume that there is some maximal enzyme efficiency that is the same for each enzyme, and we scale this maximum to 1 (*i.e.*, we assume that *k _{i}* < 1) by choosing the appropriate unit for time. For simplicity, we consider the case where the enzyme reversibilities, α

_{i}, are the same for all

*i*and equal to α, and we let α = 0.001, 0.05, 0.50, and 0.95, representing, respectively, conditions of very low, low, intermediate, and high reaction reversibility. A total of 200 trials were run with each model for each α-value, each trial beginning from a randomly chosen starting point (

*k*

_{1},

*k*

_{2}).

Simulations usually reached the optimal flux in 500–2000 mutation/fixation cycles, typically having undergone 10–25 substitutions. In most simulations, pathways with 2 steps reached the fitness peak more quickly than models with 3 or 10 steps. The reversibility parameter α also influenced the rate of adaptation: when α = 0.001 or 0.05, pathways reached equilibrium more rapidly than when α = 0.50 or 0.95.

#### Evolution of control under the two-step MCT model:

The 2-, 3-, and 10-step pathways exhibit similar evolutionary patterns, and because these patterns are easier to visualize for 2-step pathways, we begin by examining the evolution of control in 2-step pathways.

For largely irreversible reactions (α = 0.001 and 0.05), populations starting at random (*k*_{1}, *k*_{2}) values evolve highly inequitable control coefficients, with the CC for the first enzyme being on average ∼3–20 times greater than that for the second enzyme (Table 2). As expected (Figure 1A), populations converge to an equilibrium line corresponding to the hyperbola
(1)which is obtained from the two-enzyme equation for flux analogous to (A1). Although different runs converged to different points on this line, the majority of these points lie in the region corresponding to *C*_{1} > *C*_{2} (Figure 1A) above the (dashed) *C*_{1} = *C*_{2} line,(2)derived from Equation A3.

Two factors contribute to this bias in evolutionary outcomes. First, for a randomly chosen starting (*k*_{1}, *k*_{2}) there is an initial bias for the CC of the first enzyme to be greater than that of the second enzyme. This bias arises because the space above the *C*_{1} = *C*_{2} line (corresponding to *C*_{1} > *C*_{2}) is much larger than the space below the line (corresponding to *C*_{1} < *C*_{2}) (Figure 1A). Second, except for starting points very near the *C*_{1} = *C*_{2} line, the average instantaneous trajectories are virtually horizontal; *i.e.*, the average change in *k*_{1} is large while the average change in *k*_{2} is small (Figure 1A). This means that trajectories from most starting points will tend to move populations toward a point on the equilibrium line lying in the *C*_{1} > *C*_{2} region. In fact, this second property is a direct consequence of the first. When *C*_{1} > *C*_{2}, equivalent mutations affecting *k* will have larger effects on flux if they occur in enzyme 1. The selection coefficient favoring a beneficial mutation will thus on average be greater for mutations in enzyme 1. Since the probability of fixation of an advantageous mutation is approximately proportional to its selection coefficient, mutations in enzyme 1 will have a greater chance of fixation than mutations in enzyme 2.

For reactions with intermediate reversibility (α_{i} = 0.5), the evolutionary outcome is still biased toward larger CC in the first enzyme, but this bias is not as large as in the previous case (Table 2). And for largely reversible reactions (α_{i} = 0.95), there is little bias (Table 2). Paralleling this decrease in bias with increasing reversibility is both a decrease in the expected proportion of starting points with *C*_{1} > *C*_{2} and a decrease in the proportion of instantaneous trajectories for which *k*_{1} is substantially larger in absolute value than *k*_{2} (Figure 1B; supporting information, Figure S1).

To assess the generality of these results, we performed analogous simulations for different combinations of the values of the parameters *N*, α_{i}, σ_{J}, σ_{k}, *I*, and *J*_{opt}. Simulations were run in a factorial design varying all six parameters by at least two orders of magnitude, yielding 972 different comparisons, each with 20 replicates. We used an ANOVA (JMP, SAS Institute, 2005) to analyze the effect of altering these parameters on the equilibrium control coefficient for the two enzymes. As expected, varying α significantly affected mean control coefficients, but changes in the other five parameters (*N*, σ_{J}, σ_{k}, *I*, and *J*_{opt}) did not significantly alter the mean control coefficient (Table S1). Many of the interaction terms were statistically significant (Table S1), but these effects are subtle and do not change the fundamental evolutionary trend: the mean control coefficient for the first enzyme is larger then that of the second for every comparison (Table S2).

The previous analyses demonstrate that a shift in the flux optimum will, *when averaged over random starting points*, tend to yield a new equilibrium at which flux control is biased toward the upstream enzyme. However, it is not known whether all possible starting combinations of (*k*_{1}, *k*_{2}) are equally likely. Moreover, the analysis does not necessarily imply that equilibria at which the majority of control is vested in the *second* enzyme are not evolutionarily stable in a shifting environment. It is thus conceivable that the above analysis fails to completely capture the dynamics of CC evolution in an environment in which the optimal flux varies.

To examine this issue, we simulated evolutionary trajectories in which the optimal flux fluctuated between two values, with shifts in the optimum occurring every 30,000 mutation–fixation cycles. Simulations were performed with six pairs of optimal flux (1.0, 0.5; 1.0, 3.0; 1, 5.0; 3.0, 0.5; 3.0, 1.0; and 3.0, 5.0). Simulations were initiated with all replicates at optimal fitness, equally spaced along the high fitness hyperbola, with half of the starting points at *C*_{1} > *C*_{2} and half at *C*_{2} > *C*_{1}. The parameters α and σ_{k} were varied over at least an order of magnitude across replicate simulations to determine the effect on the equilibrium distribution of control. To determine the long-term distribution of genotypes along the equilibrium curve we used our simulations to estimate transition probabilities from one class of (*k*_{1}, *k*_{2}) values to all other classes. Using this information, we constructed a transition matrix, *M*. Each element, *m _{ij}* of this matrix represents the proportion of replicates starting at the midpoint of segment

*j*that ended within segment

*i*. Under the approximation that this transition probability is the same for any starting point within segment

*j*, the long-term, equilibrium probability that a population will be in segment

*k*is just the

*k*th element of the leading eigenvector of

*M*.

In general, modeling evolution in a fluctuating environment produced results similar to those already obtained: many more trials yielded greater control by the upstream enzyme (*C*_{1} > *C*_{2}) than the reverse (Table 3). The few exceptions to this trend occur only when mutations are small and reactions are of intermediate or high reversibility, a condition that appears uncommon in biochemical reactions (see discussion).

#### Evolution of control under the three-step MCT model:

As with the two-step MCT model, a total of 200 trials were run with this model for each α-value, each trial beginning from a randomly chosen starting point (*k*_{1}, *k*_{2}, *k*_{3}). In these simulations, the evolution of the CC in a three-step pathway exhibits similar patterns to those described above for a two-step pathway. All trials converged to an equilibrium hyperboloid represented by Equation 1, with *J* = *J*_{opt} = 1 (Figure 2). As for the two-step pathway, when α = 0.001 and α_{i} = 0.05, there is a strong bias for trajectories to converge to points having a high CC for the first enzyme: the mean *C*_{1} is 20 times greater than the mean *C*_{2} and *C*_{3} for α = 0.001 and >3 times greater for α = 0.05 (Table 2 and Figure 2A). This bias is reduced for α_{i} = 0.5 and essentially absent when α_{i} = 0.95 (Table 2, Figure 2B, and Figure S2). This pattern is not dependent on the values of the fixed parameters in our model. As in the two-step model, we generated simulations for all combinations of parameter values (*N*, α_{i}, σ_{J}, σ_{k}, *I*, and *J*_{opt}), each of which was varied over two orders of magnitude. Once again, we found that the effects of α_{i} on the mean values of the control coefficients for each enzyme were significant, while variation in the other five parameters (*N*, σ_{J}, σ_{k}, *I*, and *J*_{opt}) had little effect on the CC for any of the enzymes (Table S3 and Table S4).

#### Evolution of control under the 10-step MCT model:

As a final examination of the generality of our results, additional simulations were performed using a model of a longer metabolic pathway containing 10 enzymes. Two hundred replicate pathways were evolved for 20,000 mutation/fixation cycles with α = 0.001 and 0.05: CCs decrease rapidly with position in the pathway and on average the degree of control for the first enzyme is approximately eight and three times greater, respectively, than that for the second enzyme (Table S5). With α = 0.5, this bias is less, but the first 2 enzymes share two-thirds of the control, and the first 3 enzymes account for >75% of the control. Finally, as with the shorter pathways, when α = 0.95, control is much more equitably distributed on average, with the CCs of the first and last enzymes differing by only 0.032 (Table S5).

#### Properties of fixed mutations under the MCT model:

We examined three properties of fixed mutations: the number of substitutions (*N*_{μ}) in a trajectory, the change in enzyme activity caused by each mutation (Δ*k _{i}*), and the change in fitness caused by each mutation (Δ

*W*). We compared the value of these properties for different enzyme steps, during different phases of an evolutionary trajectory. Adaptive walks were divided into two phases: the first phase when the population is far from the optimum (

*W*(

**J**) < 0.95) and evolution is dominated by directional selection and a second phase when the population is near the optimum (

*W*(

**J**) > 0.95) and evolution is dominated by stabilizing selection.

For pathways with low reversibility (α_{i} = 0.001 and 0.05), during the first phase of the adaptive walk, two to three times as many substitutions occurred in the first enzyme as in the second or third enzymes (Table 4A). Although the average effects of these substitutions on *k* were similar for the different enzymes, the average effect on fitness was larger for the first enzyme (Table 4A). Moreover, many more substitutions improved fitness in the first enzyme, compared to those downstream (α_{i} = 0.001: 2*E* pathway, 44.0% *vs.* 10.3%; 3*E* pathway, 38.2% *vs.* 7.3% and 2.7%). In the second phase, the same pattern emerges in the distribution of beneficial mutations between up- and downstream enzymes (α_{i} = 0.001: 2*E* pathway, 91.6% *vs.* 26.1%; 3*E* pathway, 93.7% *vs.* 35.1% and 0.5%). There were many more substitutions that had no effect on fitness during the first phase than during the second phase. This unexpected result is caused by mutations that were fixed primarily when populations were far from the fitness optimum and the landscape was essentially flat. The average effect on *k* during the second phase was significantly smaller for the first enzyme than for the more downstream enzymes, presumably reflecting the fact that for enzymes with greater flux control, only mutations with very small effects will either not overshoot the optimum or have small enough fitness effects to be fixed by drift (Table 4A). Nevertheless, the effects on fitness were largest for the first enzyme, although Δ*W* is approximately an order of magnitude lower than in the first phase (Table 4A). These patterns indicate that during the directional selection phase, adaptive substitutions are fixed primarily in the first enzyme and neutral mutations occur in both phases of selection, but are fixed most frequently in downstream enzymes.

Pathways with intermediate to high reversibility (α_{i} = 0.5 or 0.95) yield a very different pattern. In particular, enzymes tend not to differ in either phase for number of substitutions or their effects on enzyme activity or fitness (Table 4). These results demonstrate that a fairly high degree of irreversibility is needed to cause upstream and downstream enzymes to differ in the properties of their associated fixed mutations.

The results for three properties of fixed mutations investigated are robust to the defined threshold between the two phases of adaptive evolution (*W* > 0.95 or *W* > 0.99) and the distribution of mutational effects (results not shown). We analyzed three Gaussian distributions with mutations skewed toward decreasing enzyme function (μ = 0.0, μ = −σ_{k}, and μ = −2σ_{k}) and found no effect on final distributions of *N*_{μ}, Δ*W*, or Δ*k*_{1}.

#### Evolution of control under the SK model:

The saturation kinetics model contains one more parameter than the MCT model. Under saturation kinetics, when downstream enzymes have slower rates of reaction, there is the potential for the buildup of large pools of pathway intermediates. Because high concentrations of intermediates are likely to be deleterious, we follow Clark (1991) in introducing a fitness penalty for high substrate pools, represented by the parameter **T** (see appendix). We first examine an exemplar set of simulations for the two- and three-enzyme models in which *I* = *T* = 10, *N* = 1000, *J*_{opt} = 0.5, σ_{J} = 0.05, and σ_{k} = 0.05 and then consider how the behavior of the model changes as these parameters are varied. For simplicity, we again consider the case where α_{i} are the same for all *i* and equal to α. Two hundred trials were run for the SK model for each α-value, each trial beginning from a randomly chosen starting point (*V*_{1}, *V*_{2}).

The evolution of control in these exemplar simulations of two- and three-enzyme pathways yields patterns that are very similar to those exhibited by the MCT model (Table S6). In this model, Equation A12 with *v* = *J*_{opt} defines a line (two-enzyme pathway) or surface (three-enzyme pathway) of optimal flux corresponding to the set of enzyme maximal velocities, *V*_{1}, *V*_{2}, *V*_{3}, that makes the reaction velocity equal to the optimum (blue curves in Figure 3, A and B, and surfaces in Figure 4). This line or surface is divided into distinct regions corresponding to control by the different enzymes (dashed line in Figure 3, A and B, and blue line in Figure 4A).

For both two- and three-enzyme pathways, our simulations yielded final values of the *V _{i}* that lie on or very close to the line/surface of optimal flux (Figures 3 and 4, Figure S3, and Figure S4). Slight deviations from the surface largely reflect nearly neutral substitutions fixed by drift after a population has evolved very close to the optimum. However, the portion of the optimal line/surface occupied depends on the reversibility of the reaction. This point is clearly illustrated for the two-enzyme model with low reversibilty: all points lie in the

*C*

_{1}>

*C*

_{2}region because of the threshold on the size of substrate pools (Figure 3C), even though the optimal flux hyperbola extends into the

*C*

_{2}>

*C*

_{1}region (blue line, Figure 3A). Enzyme activity values on the optimum line/surface will produce very high intermediate substrate levels when

*V*

_{2}(or also possibly

*V*

_{3}in the three-enzyme pathway) is very close to

*v*, making

*V*

_{2}–

*v*(or

*V*

_{3}–

*v*) very small. Because the concentration of the intermediate,

*B*, is inversely proportional to

*V*

_{2}–

*v*(and, in the three-enzyme pathway, intermediate

*C*concentration is proportional to

*V*

_{3}–

*v*) (Equation A14), the intermediate pool becomes very large and a strong fitness penalty is imposed. Thus, for largely irreversible reactions, fitness penalties associated with large pools of the intermediate constrain the equilibrium to be in the region in which

*C*

_{1}>

*C*

_{2}. If the penalty threshold

*T*were greater, some trials would evolve to equilibria with enzyme 2 (or 3) exhibiting greatest control. However, unless

*T*is unreasonably large, a majority of trials will yield equilibria with enzyme 1 having greatest control.

With moderately to highly reversible reactions (α = 0.5 and 0.95), this constraint is absent (Figure 3D and Figure S3) and consequently the system evolves to equilibria that lie along the entire line/surface of optimal flux (Figures 3B and 4B, Figure S3, and Figure S4). In the two-enzyme pathway, approximately half the equilibria correspond to *C*_{1} > *C*_{2} and half to *C*_{2} > *C*_{1}; in the three-enzyme pathway, approximately one-third of the equilibria correspond to greatest control by each of the enzymes. Our interpretation of this result is as follows: in a reversible reaction, the net forward flux is countered by a backward flux, even if the intermediate pool is small. Consequently, to achieve a given optimal flux, the *V _{i}* must be higher than when reactions are less reversible. This effect displaces the line/surface of optimal flux toward higher values of

*V*

_{1},

*V*

_{2}(Figure 3, A and B: blue lines shifted toward higher

*V*

_{1}and

*V*

_{2}with increasing reversibility) or

*V*

_{1},

*V*

_{2},

*V*

_{3}(Figure 4, A and B: optimum surface shifted toward higher

*V*). Because of this displacement, at all points along the line/surface

_{i}*V*

_{2}(and

*V*

_{3}) is substantially larger than the actual flux

*v*(0.5 in this example). Consequently,

*V*

_{2}–

*v*(and

*V*

_{3}–

*v*) remains relatively large and the intermediate pool remains relatively small, avoiding the fitness penalty associated with high intermediate pools. In these examples, pools corresponding to the line/surface of optimal flux never exceed the threshold value of

*T*(Figures 3D and 4B, Figure S3, and Figure S4). There is thus no constraint preventing equilibria outside the region in which enzyme 1 exerts greatest control.

In summary, a bias favoring dominant control by enzyme 1 can arise in pathways with low reaction reversibility if there is a fitness penalty associated with large pools of intermediates. This penalty does not cause the *V _{i}* to deviate from the surface of optimal flux, but rather restricts the portion of that surface that corresponds to a high-fitness equilibrium. By contrast, this restriction greatly decreases in pathways with moderately or highly reversible reactions because reversibility prevents large intermediate pools from building up. Consequently, all enzymes have a roughly equal chance of exerting predominant control over flux.

To assess the generality of these results, we performed simulations for different combinations of the values of the parameters α, **J**_{opt}, σ_{k}, σ_{J}, *I*, and *T*. Simulations were run in a factorial design varying all six parameters by at least two orders of magnitude, each with 25 replicates. We employed a restricted factorial design to avoid simulations run with unrealistic parameter values and decrease the computer running time of these simulations (for details see Table S7). The data were analyzed with an ANOVA, testing effects of single parameters and two-way interactions on the distribution of control.

Varying two parameters, σ_{J} and *I*, had a significant effect on the distribution of control (Table S7 and Table S8), but these effects are subtle and do not change the fundamental pattern: control coefficients evolve to be larger for upstream enzymes (Table S9 and Table S10). However, there were two instances in the three-enzyme case in which the second enzyme had the greatest control coefficient in the pathway: α = 0.95, σ_{J} = 0.01 and σ_{J} = 0.01, *I* = 100 (Table S10). Additionally, there is a third instance (*I* = 100, *T* = 100) in which control is quite evenly shared between all three enzymes. These three cases demonstrate that, for a few parameter combinations, control can evolve to be centered in a downstream enzyme. Regardless, the fundamental evolutionary pattern for this model, and every other model investigated, is that control evolves to be centered in the most upstream enzyme.

The three properties of mutations examined in this analysis, *N*_{μ}, Δ*k _{i}*, and Δ

*W*, were similar for both the SK and MCT models (Table 4). For reactions with low reversibility, during the directional selection phase, most substitutions occurred in the first enzyme and the fitness effects of these substitutions were larger for the first enzyme, despite the fact that substitutions in downstream enzymes have a larger effect on enzyme activity (Table 4B). During the stabilizing selection phase, the last enzyme in the pathway accumulated the greatest number of mutations, and a substantial portion of these tended to have neutral or detrimental fitness effects (α

_{i}= 0.001: 2

*E*pathway, 10.6%

*vs.*55.0%; 3

*E*pathway, 11.2%

*vs.*49.6% and 60.8%) and were thus fixed by drift. These differences among enzymes were greatly reduced when reactions were largely reversible (Table 4B).

## DISCUSSION

#### Evolution of control coefficients:

The evolution of metabolic control has been the subject of much speculation. A number of different arguments have been put forward: (1) metabolic control is likely to be shared roughly evenly across enzymes in a pathway (Kacser and Burns 1973); (2) control will shift between enzymes in a pathway and no one enzyme is expected to be more likely to have high control (Dykhuizen *et al*. 1987; Keightley 1989; Bost *et al*. 2001); and (3) control will be unequally shared and is likely to be highest for enzymes just below branch points because of fluctuating selection for allocation of flux along different branches (Eanes 1999).

Our simulations lend little support to any of these arguments. For example, except when reactions are largely reversible, control is very unevenly shared among enzymes. The unequal distribution is a result of selection for control to be centered in a single enzyme and is not simply a consequence of the summation property, as hypothesized by Bost *et al*. (2001). Our simulations find no evidence for constant shifts in control between enzymes (Dykhuizen *et al.* 1987; Keightley 1989). This result hinges on the assumption of directional *vs.* stabilizing selection; under directional selection to always increase flux, control will shift between enzymes, while we assumed stabilizing selection because a pathway's flux will eventually be inhibited at some rate. Under stabilizing selection, the strong tendency for the most upstream enzyme to gain predominant control tends to prevent frequent shifts in control among enzymes and means that most enzymes have a greatly reduced chance of exercising major control. Finally, we are cautious about extending our conclusions concerning linear pathways to separate branches of a branching pathway. We believe additional research focused on branching pathways needs to be conducted to fully address this question. Nevertheless, to the extent that this is possible, our results suggest that fluctuating selection on allocation between branches is not necessary for the evolution of major control by the most upstream enzyme in a pathway branch.

Our analysis does reveal, however, two properties of control that are expected to evolve, at least in linear pathways, when stabilizing selection acts on pathway flux and reactions have low reversibility: (1) metabolic control will be concentrated in one or a few enzymes and (2) control tends to be located in upstream enzymes. These are general properties of the system that evolve under a large range of parameter values and regardless of whether enzymes are assumed to be saturated or unsaturated.

In the MCT model, these patterns result from two interacting properties. The first is that a very large portion of the (*k*_{1}, *k*_{2},…,) phenotypic space corresponds to greater control for upstream enzymes (*e.g.*, Figure 1A). The second factor arises when comparing mutations of equivalent magnitude on enzyme kinetic properties in enzymes with small and large control coefficients. A beneficial mutation will more likely fix in the enzyme with greatest control, because it will have a greater effect on flux and hence on fitness. Thus, the initial biased distribution of control in upstream enzymes is maintained and intensified by this biased distribution of beneficial mutations.

These same patterns arise for different reasons under the SK model. In our examples, the regions of the line/surface of optimal flux in which the different enzymes exert dominant control are roughly equal in size. Random starting points thus have a roughly equal probability of corresponding to dominant control of a specific enzyme. Instead, a fitness penalty associated with large intermediate pools differentially penalizes downstream enzymes and prevents them from evolving to exert dominant control.

The concentration of intermediate metabolites may also contribute to favoring major control in upstream enzymes in the MCT model. Although this was not formally included in our simulations, in File S1 it is shown that the concentration of a pathway's intermediate is proportional to the *k _{i}*'s of the reactions preceding the intermediate and inversely proportional to the

*k*'s of the reactions following it. If there is a cost associated with maintaining pools of intermediates, and if that cost increases with the concentration of the intermediate, then there will no longer be an equilibrium

_{i}*surface*given by Equation 1. Instead, there will be an equilibrium point corresponding to a low value of

*k*

_{1}and high values of the remaining

*k*,

_{i}*i.e.*, to a high CC for the most upstream reaction and a low CC for all others. Thus, the inequality of flux control due to selection against large intermediate pools in the SK model is likely to be seen in the MCT model as well.

#### Expected and observed bias in control:

Our simulations indicate that reaction reversibility, α, has the greatest influence on the degree of bias among enzymes in flux control. As shown in the appendix, the value of α for a particular reaction is directly related to the equilibrium constant for the reaction; *i.e.*, α = 1/*K*_{eq}. Because *K*_{eq} is a *thermodynamic* property of the reaction [it is determined by the difference in free energy between products and reactants (Nelson and Cox 2000)], α is not influenced by the *kinetic* properties of the enzyme associated with that reaction. Because, unlike kinetic properties, thermodynamic properties cannot evolve, the extent of bias in control is set *a priori* by intrinsic chemical and physical properties of the reactions.

Whether in general one would expect to see substantial bias in control thus depends on whether metabolic reactions tend to be reversible or irreversible. Equilibrium constants have been measured for many enzymatic reactions and they overwhelmingly tend to be large, indicating irreversibility. For example, the tryptophan biosynthetic pathway has equilibrium constants ranging from 7.6 × 10^{8} to 9.2 × 10^{12}, corresponding to α-values much less than the α = 0.001 used in our simulations (Kishore *et al*. 1998). Two additional studies (Tewari *et al.* 2002a,b) of metabolic pathways found that the majority of *K*_{eq}'s measured are >10^{8}, and only two reactions had smaller equilibrium constants (*e.g.*, *K*_{eq} = 1.7 and 4.6, corresponding to α = 0.59 and 0.22, respectively). Although these examples do not constitute an exhaustive survey, they indicate that for many biochemical pathways, if not most, reactions are largely irreversible, and therefore the bias toward high CC in upstream enzymes predicted by our model should often be found. Given that most estimated values of α are several orders of magnitude smaller than the smallest value (0.001) used in our simulations, we would expect this bias to be substantially stronger than what we found (*e.g.*, Table 2).

Evaluating agreement between the predictions of our model and patterns of flux control in real pathways is complicated by the fact that most pathways are not strictly linear. Instead, they are often branched or cyclical, and it is not clear whether and how predictions from our model carry over to such pathways. Nevertheless, a literature survey of linear and nearly linear metabolic pathways provides strong support for our first conclusion that control is distributed unequally in pathways and marginal support for the second claim that control should be vested in upstream enzymes. In 13 control analysis studies, all but one (Wisniewski *et al.* 1995), demonstrated significant evidence for nonuniform distributions of control (Groen *et al.* 1986; Dykhuizen *et al.* 1987; Albe and Wright 1992; Hill *et al.* 1993; Kashiwaya *et al.* 1994; van der VLAG *et al.* 1995; Roussel *et al.* 1998; Thomas and Fell 1998; Bost *et al.* 2001; Cronwright *et al.* 2002; Pritchard and Kell 2002; Wu *et al.* 2004). These studies are sampled across a broad spectrum of organisms and metabolic pathways. Half of them focused on the glycolysis pathway, but other pathways included the tricarboxylic acid cycle, the oxidative phosophorylation pathway, gluconeogenesis, lactose catabolism, and the succinate pathway.

Our second prediction, that upstream enzymes tend to evolve larger CCs than downstream enzymes, is less well supported. Five studies provided strong support for this prediction: results indicated that the first enzyme had the highest control coefficient (Groen *et al.* 1986; Dykhuizen *et al.* 1987; Roussel *et al.* 1998; Cronwright *et al.* 2002; Pritchard and Kell 2002). Pritchard and Kell's (2002) analysis of the entire glycolysis pathway indicates that the two most upstream enzymes exhibit the majority of control in this pathway. By contrast, five investigations found that an enzyme either in the center of the metabolic pathway (Albe and Wright 1992; Kashiwaya *et al.* 1994; van der VLAG *et al.* 1995; Thomas and Fell 1998) or at the end of the pathway (Wisniewski *et al.* 1995) exerted the majority of control.

It should be noted that our analysis does not predict that downstream enzymes will never exhibit the greatest flux control. Even with low reaction reversibility (α = 0.001 or 0.05), ∼5–25% of our simulations evolved to equilibria at which the highest CC corresponded to an enzyme downstream of the first enzyme in the pathway. Moreover, our simulations of shifting optima revealed that pathways with control centered in the most upstream enzyme can evolve majority control in another enzyme and that there is a low, but not insignificant, probability that at any given time, a downstream enzyme will exert majority control. It is thus not surprising that examples exist in which control is vested in downstream enzymes. The real question is whether this situation is as common as the reverse. Although the slight bias in these studies toward control being exerted in upstream enzymes is consistent with our expectation, this small sample clearly does not provide sufficient evidence to either support or refute the prediction that control should most commonly be vested in upstream enzymes.

#### Which genes participate in bouts of adaptive evolution?

The ultimate goal of our analysis was to determine whether predictions can be made about which genes in a pathway are likely to be involved in adaptive change. In doing so, we aim to extend the general theory of adaptation (Fisher 1930; Kimura 1983; Orr 2005) beyond its current focus on predicting the number and size distribution of mutations involved in bouts of adaptation. Initially, we suspected that a simple principle would govern the choice of genes involved in an adaptive walk: mutations in genes for which flux control is highest would be used preferentially because such mutations are likely to have greater effects on flux and hence on fitness (Hartl *et al.* 1985; Eanes 1999; Watt and Dean 2000).

Our analyses confirm the operation of this principle. In both the MCT and the SK models, when reactions are largely irreversible and the distribution of control is the most inequitable between enzymes, most substitutions, as well as substitutions with the largest effect on fitness, occur in upstream enzymes during adaptive walks toward an optimal flux. However, this pattern reverses, such that the majority of substitutions occur in downstream enzymes, as the population nears the optimum. Moreover, many substitutions fixed in this phase appear to be nearly neutral (Ohta 1973; Kimura 1983). This shift occurs because near the optimum most mutations are deleterious and because substitutions in downstream enzymes have smaller effects on fitness because of their reduced control. Consequently, a mutation with a given effect on enzyme activity will have a smaller detrimental effect, and thus a greater probability of being fixed by drift, when it occurs in a downstream enzyme. Some adaptive substitutions continue to be fixed, however, as compensation for slightly deleterious substitutions, and these occur preferentially in the upstream enzymes. To summarize, the difference in control causes upstream enzymes to be subject to strong directional selection during the first phase of an adaptive walk and to purifying selection during the second.

The distribution of substitutions among genes within metabolic pathways has been investigated in three systems of which we are aware. The anthocyanin pathway is a linear pathway composed of six enzymes that produce pigments involved in flower coloration across angiosperms. Three studies of molecular evolution in the anthocyanin pathway, looking at broad and narrow phylogenetic scales (between three angiosperm families and within the genus Ipomea), found that upstream enzymes had the lowest rates of substitution and the downstream enzymes had the greatest rates of substitution (Rausher *et al.* 1999, 2008; Lu and Rausher 2003). These studies found no evidence of positive selection, suggesting that most substitutions were of neutral or slightly disadvantageous mutations. To the extent that these substitutions were disadvantageous, this pattern is consistent with expectations under our model: because of reduced flux control, mutations that cause deviation from the flux optimum are more likely to have smaller effects on fitness in downstream enzymes and are thus more likely to be fixed by genetic drift (Hedrick 2000). An independent study of four terpenoid biosynthesis pathways obtained similar results: a strong correlation between elevated rates of substitution and downstream enzyme position (Ramsay *et al.* 2009). This study finds that the elevated rates of substitution in downstream enzymes are, at least partly, caused by relaxed selection in the downstream enzymes, a pattern consistent with the expectations of our model. Tests for positive selection at a broad phylogenetic scale (between angiosperm families) found significant effects for multiple genes distributed between up- and downstream positions of the pathways. Ramsay *et al.* (2009) conclude that decreased pleiotropy in downstream genes resulted in relaxed selection on these enzymes; however, our model provides another explanation: the control of downstream enzymes is greatly reduced compared to upstream enzymes and they accumulate many more neutral or nearly neutral mutations.

An additional study by Flowers *et al.* (2007) examined rates of molecular evolution in Drosophila for 17 metabolic enzymes in five pathways (glycolytic, gluconeogenic, glycogenic, trehalose, and pentose shunt) that intersect at the glucose-6-phosphate enzyme. This study found a strong signature of adaptive evolution on the *D. melanogaster* and *D. simulans* lineages. The three enzymes that showed statistically significant elevated rates of adaptive evolution are all upstream enzymes occurring at the branch points between pathways. Flowers *et al.* (2007) interpret this pattern as arising from fluctuating selection on the relative magnitudes of flux on different branches. However, this pattern may also be consistent with our model's prediction that adaptive substitutions will be concentrated in the most upstream enzyme of a terminal linear pathway, if the evolution of a pathway branch behaves similarly to the evolution of the linear pathways described here. This issue will be resolved, however, only by examining the evolution of control in branched pathways.

#### Conclusion:

We present population genetic models of the evolution of metabolic pathway flux to investigate whether predictions can be made about whether different genes in a pathway will be differentially involved in the process of adaptation. Although the models pertain only to linear pathways, they indicate that there are likely to be strong differences among enzymes in the numbers and types of substitutions they accumulate. While the extent to which the specific predictions of our model may be extended to pathways with more complex topology is unclear, the fact that our model generates strong patterns of bias suggests that similarly strong, though perhaps different, patterns may be expected in more complicated pathways. We suggest that the approaches presented here will be useful for examining these more complex situations.

### APPENDIX

We describe the details of the models used in our analysis.

#### Metabolic Control Theory Model

##### Specification of flux control:

Here we provide model details for the three-enzyme pathway. Equations for other pathway lengths are analogous.

MCT demonstrates that the equilibrium flux, **J**, is given by(A1)where *k _{i}* is the rate of the forward reaction of enzyme

*i*and

*k*=

_{i}*V*/

_{i}*M*, the ratio of the enzyme's maximum rate of production and the Michaelis–Menten constant (Kacser and Burns 1973). α

_{i}_{i}is a thermodynamic parameter equal to the reversibility of the catalyzed reaction α

_{i}=

*k*

_{−i}/

*k*= 1/

_{i}*K*

_{eq}, and

*K*

_{eq}is the equilibrium constant of the reaction (Kacser and Burns 1973).

**I**is the concentration of the input substrate and we assume that the input substrate is maintained at a constant level by physiological buffering and that the final product is sequestered as soon as it is produced, so that there is no reverse reaction for the final step (

*i.e.*,

*k*

_{−3}= 0). These are simplifying assumptions of our model, whereas the Kacser and Burns (1973) presentation of MCT explicitly allows for variation in pathway input and final products.

The flux control coefficient for enzyme *i*, *C _{i}*, is defined as the proportional change in flux caused by a unit proportional change in enzyme activity;

*i.e.*,(A2)MCT demonstrates that the flux control coefficient for enzyme

*i*is determined by the kinetic and thermodynamic properties of the enzymes (the

*k*'s and the α's) by the relationship(A3)(Kacser and Burns 1973). Because this parameter has the property that ∑

_{i}

*C*= 1, each coefficient represents the proportional control of the corresponding enzyme over the total flux.

_{i}##### Specification of selection:

To estimate the probability of fixation of any mutation, it is necessary to know the selection coefficient, *s*, associated with that mutation. In our model, we assume that fitness is maximal at some optimal flux, **J**_{opt}, and declines monotonically as flux deviates from that optimum. Specifically, we represent fitness for flux **J** by(A4)where σ_{J} is a scaling constant representing the strength of stabilizing selection.

We represent a homozygote genotype, **G**, by a vector of reaction rates, (*k*_{1}, *k*_{2}, *k*_{3}). The fitness of this genotype, **W**(**G**) is then determined by first determining the flux associated with the vector of reaction rates (using Equation A1) and then determining the fitness associated with that flux (using Equation A4). Any mutation is assumed to alter only one of the rates of reaction. Without loss of generality, consider a mutation that changes *k*_{1} to *k*_{1}′, so that the new homozygous genotype, **G**′, has the vector of reaction rates (*k*_{1}′, *k*_{2}, *k*_{3}). The fitness of this genotype, **W**(**G**′), is calculated in similar fashion. The selection coefficient associated with the mutant is then *s* = [**W**(**G**′) – **W**(**G**)]/**W**(**G)** if the mutation is advantageous and *s* = [**W**(**G**) – **W**(**G**′)]/**W**(**G)** if the mutation is disadvantageous.

##### Probability of substitution:

A bout of evolution begins with an initial genotype, **G**. The initial reaction rates (*k*_{1}, *k*_{2}, *k*_{3}) are independently drawn from a uniform random distribution varying between 0.0 and 1.0. A mutation, corresponding to genotype **G**′, is introduced that modifies *k*_{1}, *k*_{2}, or *k*_{3} with equal probability, with the magnitude of the change in *k* determined by drawing from a normal probability distribution with a mean of 0 and a standard deviation of σ_{k}. This assumes that each enzyme has the same mutational target size. Once the fitnesses of **G** and **G**′ are determined, as described above, the probability of fixation, *u*, of the new mutation with frequency 1/2*N*, is calculated asfor an advantageous allele and(A5)for a disadvantageous allele (Hedrick 2000). These equations assume that there is no dominance, *i.e.*, that the fitness of the heterozygote is the average of the homozygote fitnesses. Because the overall reaction rate in a heterozygote is likely to be the sum of the rates for each individual allele, the assumption of no dominance *at the kinetic level* is probably appropriate. Moreover, unless the effects of mutation are very large, absence of dominance at the kinetic level is not expected to generate dominance in flux (Keightley 1996b). This in turn means that the fitness of the heterozygote will be approximately the average of the two homozygote fitnesses, because fitness changes approximately linearly with flux if mutations are not large. It thus seems that the assumption of no dominance in fitness is reasonable to a first approximation.

The mutation is fixed or not according to the probabilities in Equation A5, completing one mutation–fixation cycle. Another mutation is then introduced and its fitness and probability of fixation are determined as previously. Each evolutionary bout consisted of 50,000 mutation–fixation cycles to ensure flux reached an evolutionary equilibrium. This corresponds to 1.7 million generations for a three-step pathway assuming an effective population size of 1000 and a per gene mutation rate of 10^{−5}. At the end of the bout, CCs are calculated from the final *k _{i}* using Equation A3.

In addition to simulating complete trajectories, we estimated a vector field of “instantaneous” trajectories, for the two-enzyme case, by repeated single mutation–fixation cycles from the same starting combination of *k _{i}*. All combinations of

*k*

_{1}= (0.05, 0.15, 0.25, …, 0.95) and

*k*

_{2}= (0.05, 0.15, 0.25, …, 0.95) were used as starting points, and for each starting point, the fate of 10,000 mutations was determined. The average vector (Δ

*k*

_{1}, Δ

*k*

_{2}), including 0's for mutations that were not fixed, at a starting point was taken to represent the estimated instantaneous trajectory at that point in the (

*k*

_{1},

*k*

_{2}) space. It should be noted that this vector does not represent the actual trajectory any particular population will follow, since a single mutation changes either

*k*

_{1}or

*k*

_{2}, but not both. Rather, it indicates the relative probability that the rates of the first and second enzymes will change and by how much on average.

##### Evolution of control coefficients in a shifting environment:

We modeled the evolution of control for a two-step metabolic pathway in a changing environment. An initial optimal flux was chosen, which yields an equilibrium curve of possible initial *k-*value combinations (see below). Twenty evenly dispersed points along this curve were then chosen to generate initial (*k*_{1}, *k*_{2}) values for simulation. Ten of these points corresponded to a higher CC for the first enzyme, while the other 10 corresponded to a higher CC for the second enzyme. For each replicate, evolution was allowed to proceed for 180,000 mutation–fixation cycles. After 30,000 cycles, the flux optimum was shifted to a second value, and after 60,000 cycles it shifted back to the original value to complete one flux cycle. This flux cycle was then repeated two more times. At the end of the 180,000 mutation cycles we recorded the final values of *k*_{1}, *k*_{2}, *C*_{1}, and *C*_{2}.

#### Saturation Kinetics Model

An irreversible reaction under saturation kinetics can be described asThe equilibrium rate of reaction, *v*, is described by the Michaelis–Menten equation(A6)where *V*_{max} is the maximum rate of reaction, [*S*] is the substrate concentration, and the Michaelis–Menten constant, *K*_{m} = *k*_{2} + *k*_{−1}/*k*_{1} (Segel 1993). A reversible enzyme reaction under saturation kinetics can be described as

Solving for the steady-state equilibrium, the forward rate of reaction is(A7)where(Segel 1993).

To simplify our analysis, we assume that *k*_{1} = *k*_{2}. Defining α as the ratio of the reverse to the forward reaction, α = *k*_{−1}/*k*_{1}, and then for the irreversible reaction *K*_{m} = 1 + α, and for the reversible reaction *K*_{ms} = 1 + α, *K*_{mp} = 1 + α/α. Additionally, we can define the relationship between the forward and reverse rates of reaction as *V*_{maxr} = α · *V*_{maxf}.

In a three-step pathway with the first two steps being reversible and the last step being irreversible,flux through the three parts of the pathway is represented by three equations:(A8)Here we assume that α is the same for the first two steps in the pathway to make our results comparable to the MCT simulations. For the last, irreversible, step in the pathway α = 0. At flux equilibrium, these three velocities are equal; *i.e.*, *v*_{1} = *v*_{2} = *v*_{3} = *v*. Because, as in the MCT analysis, we assume that the concentration of the initial substrate *A* is constant, this is a set of three equations in three unknowns (*B*, *C*, and *v*), which were solved numerically for the unknowns. An analytical solution is also presented below (Equation A12) and there is a close match between the two methods (Figure 2, A–C).

Simulations were initialized with random values of maximum enzyme velocities, *V*_{1}, *V*_{2}, and *V*_{3}. Under this model, very high concentrations of intermediates *B* and *C* can accumulate, depending on the values of the kinetic properties of the enzymes. Because high concentrations of intermediates are likely to be deleterious, we follow Clark (1991) in introducing a fitness penalty for high substrate pools. In particular, we introduce a parameter **T**, which represents the intensity of selection against substrate accumulation. We then model fitness aswhere the function *W*(**J**) is the same as in Equation A4. Under this formulation, when [*B*] + [*C*] is small relative to **T**, fitness is approximately that given by Equation A4. However, when [*B*] + [*C*] is large relative to **T**, fitness approaches 0. Imposing a threshold does not change the optimum fitness value, but only increases the sharpness of the adaptive landscape.

As previously, pathways evolve under this model by a series of mutation–fixation cycles. First, one of the enzymes is chosen randomly and its reaction rate, *V*, is changed by an increment that is normally distributed with a mean of 0 and a standard deviation of σ_{k}. The fitness of each new mutation is calculated, as are the selection coefficient and probability of fixation, in the same manner as described for the MCT model. This mutation–fixation cycle is then repeated 20,000 times to complete one bout of evolution. This corresponds to 680,000 generations for a three-step pathway assuming an effective population size of 1000 and a per gene mutation rate of 10^{−5}.

##### Numerical calculation of control in the saturation kinetics model:

Control in the SK Model was calculated numerically, in a way analogous to the calculation of control in the MCT model (Equation A2). The procedure was as follows: *V _{i}* of enzyme

*i*was perturbed by ±2%, 4%, and 6%, while holding

*V*of the other enzymes constant and calculating the Δ

_{j}**J**for each Δ

*V*. The control coefficient was then estimated as the slope of the regression on the set of seven values for

_{i}**J**on

*V*. Unlike MCT, these regression-based control coefficients do not sum to unity. To make the two numbers comparable, control coefficients under the saturation kinetics model are presented as a percentage of the total control estimated for all enzymes in a pathway.

_{i}###### Relationship between control coefficient and V:

Here we examine the relationship between which enzyme has greatest control over flux and the relative values of enzyme activity, *V.* In Equations A8, at equilibrium *v*_{1} = *v*_{2} = *v*_{3} = *v.* Using this result, the equations can be solved to produce an expression involving only the unknown flux, *v*:(A9)

This equation may be differentiated implicitly to yield the partial derivatives of *v* with respect to the *V _{i}*,(A10)where β is the sum of terms involving ∂

*v*/∂

*V*

_{1}, ∂

*v*/∂

*V*

_{2}, or ∂

*v*/∂

*V*

_{3}(the sum is the same for all three equations).

We ask first about the conditions under which enzyme 1 has greater control over flux than enzyme 2. By the definition of control coefficient (Equation A2), this will occur whenUsing Equations A10, this is equivalent toWhen α is small (reactions largely irreversible), terms including α can be ignored, and this equation can be simplified to yield the approximationWhen α is very small, this reduces to the approximation

Next, consider the criteria for (∂*v*/∂*V*_{1})(*v*/*V*_{1}) > (∂*v*/∂*V*_{3})/(*v*/*V*_{3}). A similar argument leads to the approximate criterionFinally, a similar derivation shows that the criterion for enzyme 2 to have greater control than enzyme 3 isThe conditions under which each enzyme has the greatest control are summarized as follows:

We note that when α is small (largely irreversible reactions), an approximate solution of Equation A9 is(A11a)or(A11b)or(A11c)When enzyme 1, 2, or 3 has control, the flux corresponds to (A11a), (A11b), or (A11c), respectively.

Results for the two-enzyme case can be derived in a similar fashion. The two-enzyme equivalent of Equation A9 is(A12)and the exact criterion for enzyme 1 to have greater flux control is(A13)or, when α is small, approximately

###### Intermediate pool in two-enzyme reaction:

By analogy with Equation A8, the equilibrium flux for the second reaction in a two-enzyme pathway iswhich, upon rearrangement, gives the equilibrium concentration of the intermediate pool:(A14)

###### Intuitive explanation for results:

These results show that, in general, the enzyme with the largest control over flux is the enzyme with the smallest *V*. An intuitive explanation for this result can be given in terms of flow of a fluid through pipes of different diameters. The diameter of a pipe determines the maximum flow that can occur through the pipe and is thus analogous to *V*. If the first pipe has the smallest diameter, there will be a set flow through that pipe. Since the following pipes have larger diameters, the flow from the first pipe will not fill them completely and they will not limit the overall flow. If an intermediate pipe has the smallest diameter, it will determine the maximum flow. For pipes upstream with larger diameters, initially flow will be greater through them. However, once the flow-limiting pipe has reached its capacity, fluid will back up into reservoirs behind it, analogous to the formation of intermediate substrate pools. Pressure from these reservoirs essentially reduces the flow through the upstream pipes to match that of the flow through the flow-limiting pipe.

## Acknowledgments

We thank the Rausher lab group, Frederik Nijhout, Kristi Montooth, and one anonymous reviewer for many insightful comments on the manuscript. This work was supported by National Science Foundation grant DEB-0448889 to M.D.R.

## Footnotes

Supporting information is available online at http://www.genetics.org/cgi/content/full/genetics.109.110411/DC1.

Communicating editor: M. Long

- Received September 29, 2009.
- Accepted November 17, 2009.

- Copyright © 2010 by the Genetics Society of America