Homing endonuclease genes (HEGs) encode proteins that in the heterozygous state cause double-strand breaks in the homologous chromosome at the precise position opposite the HEG. If the double-strand break is repaired using the homologous chromosome, the HEG becomes homozygous, and this represents a powerful genetic drive mechanism that might be used as a tool in managing vector or pest populations. HEGs may be used to decrease population fitness to drive down population densities (possibly causing local extinction) or, in disease vectors, to knock out a gene required for pathogen transmission. The relative advantages of HEGs that target viability or fecundity, that are active in one sex or both, and whose target is expressed before or after homing are explored. The conditions under which escape mutants arise are also analyzed. A different strategy is to place HEGs on the Y chromosome that cause one or more breaks on the X chromosome and so disrupt sex ratio. This strategy can cause severe sex-ratio biases with efficiencies that depend on the details of sperm competition and zygote mortality. This strategy is probably less susceptible to escape mutants, especially when multiple X shredders are used.
THE possibility of controlling man's major pests, pathogens, and disease vectors using genetic manipulation has long been discussed (Hamilton 1967; Curtis 1968) and is of great current interest (Turelli and Hoffmann 1999; Alphey et al. 2002; James 2005; Sinkins and Gould 2006). A broad spectrum of possible strategies has been explored. Organisms can be manipulated to be conditionally sterile or lethal and released into the environment to disrupt mating or to reduce the fecundity of the wild population (Thomas et al. 2000; Atkinson et al. 2007; Phuc et al. 2007). With these inundative techniques the manipulated construct is not required to persist in the environment. A different approach is to introduce a beneficial genetic construct into a wild population with a drive mechanism that causes it to increase in frequency. The construct might impose a fitness load on the population, reducing its density or causing it to go extinct. Alternatively, it may alter the phenotype of the organism with no or minor changes to its fitness. The latter is of particular relevance to disease vectors where it may be possible to reduce or eliminate transmission. Recent advances in molecular genetics have demonstrated that knocking out certain Anopheles mosquito genes, or inserting new constructs, prevents the insect from transmitting Plasmodium, the malaria pathogen (Ito et al. 2002; Moreira et al. 2002), while RNAi techniques have been used to prevent Aedes mosquitoes from transmitting the dengue virus (Franz et al. 2006). Enthusiasm for these control strategies is tempered by the realization that any method involving genetic manipulation will require the highest scrutiny and investigation prior to implementation and that support from the public will be essential for any project to go ahead (Alphey et al. 2002; James 2005; Knols et al. 2006).
A variety of different mechanisms for driving genes through a population have been considered, most of them based on elements with non-Mendelian inheritance that have been discovered in nature (Burt and Trivers 2006). Some genes cause the chromosomes on which they reside to be overrepresented in the gamete pool and thus could be used to increase the frequency of an introduced linked gene (Burt and Trivers 2006). Genetic constructs can be designed that show underdominance (heterozygote inferiority) and hence will increase in frequency once their abundance passes a certain threshold (Davis et al. 2001; Magori and Gould 2006). Elements that jump between chromosomes can be used as vectors for beneficial constructs, and transposable elements in particular have received a lot of attention (Coates et al. 1998). Heterozygote females carrying medea elements modify their eggs such that they survive only if they carry the medea gene or are fertilized by sperm that carry the element. This disadvantages wild-type alleles and allows medea to spread (Wade and Beeman 1994). Artificially engineered medea elements have recently been developed and offer an important new potential drive mechanism (Chen et al. 2007). Certain symbiotic microorganisms with vertical inheritance spread by manipulating host reproduction such that infected individuals produce more daughters than uninfected individuals. Introducing the beneficial gene into the symbiont could then lead to its spread, though with the disadvantage that the gene may not be expressed in the correct tissue. The intracellular bacterium Wolbachia that is present in a very large fraction of insects and that spreads through cytoplasmic incompatibility (noninfected females are at a disadvantage because they cannot use the sperm of infected males) is the most important candidate drive mechanism of this type (Werren 1997; Turelli and Hoffmann 1999).
Finally, a variant of these techniques is to use the drive mechanism to impose a fitness cost on the organism and then to link the beneficial gene to a construct that mitigates the cost and hence is selected to spread (Sinkins and Godfray 2004). In comparing drive mechanisms the most important factors likely to influence success or acceptability include the evolutionary stability of the construct, the degree to which the within- and between-individual spread of the element can be predicted, whether the construct can increase in frequency from rare or if a threshold frequency must be exceeded before spread occurs, and whether it is possible to reverse the manipulation.
An exciting potential drive strategy is to use site-specific selfish genes such as homing endonuclease genes (HEGs) (Burt 2003). A HEG codes for a protein that recognizes and cuts DNA containing a specific 20- to 30-bp sequence (Stoddard 2005). Critically this sequence is found only on chromosomes not containing the HEG and at the precise location where the HEG occurs. After a double-strand break in a heterozygote, the cell's recombinational repair mechanism uses the chromosome carrying the HEG as a template and the HEG is thus copied from one chromosome to the other, converting a heterozygote to a homozygote. If there are no fitness costs to the HEG, it spreads until it reaches fixation. Other elements such as group II introns and certain LINE-like transposable elements have similar strategies for spread, though with a more complicated mechanism involving RNA intermediaries (Burt and Trivers 2006). Below we concentrate just on HEGs that offer the most straightforward site-specific selfish genes for exploitation.
HEGs are found in nature in single-celled fungi, plants, protists, and bacteria, but not in higher animals. They tend to reside in noncoding regions (especially introns) and so have little effect on fitness because they are spliced out prior to translation into protein. Due to their low fitness costs they are expected to spread to fixation, but then decay because once they are fixed there is no selection for their maintenance. Comparative studies have shown that HEGs probably survive by jumping from species to species and that maintenance requires that the rate of species jumps must exceed HEG “death” in a lineage (Goddard and Burt 1999; Burt and Koufopanou 2004). It is likely that it is easier for HEGs to jump among single-cell organisms than among animals with segregated germlines, which may explain their absence from the latter.
The aim of this article is to describe the different ways in which HEGs might be used as part of a genetic control strategy and to develop and analyze the population genetic models that will be required to assess their relative advantages and disadvantages. It builds on the analyses of Burt (2003), who derived the equilibrium frequency and genetic load of HEGs with different homing frequencies that were either lethal or sterile to one or both sexes. He also discussed alternative strategies such as the use of multiple HEGs and their use as “X chromosome shredders” that are analyzed formally here for the first time. We also study the population genetics of mutations that might nullify the action of the HEG.
We first treat “classical” HEGs that spread by copying themselves into homologous chromosomes after double-strand breaks. We derive equations for (i) the spread and equilibria of HEGs that are active after gene expression and (ii) HEGs that are active before. We then explore (iii) the possible advantages of sex-specific expression and (iv) the risk of mutations arising that prevent HEG spread. Second, we study HEGs on the Y chromosome that cause X chromosome breaks—X shredders. We (i) derive equilibrium sex ratios for different numbers of shredders, (ii) analyze the effects of reduced sperm number and competition for zygotes, and (iii) study the evolution of escape mutants.
THEORETICAL RESULTS: DRIVING HEGS
HEG active after gene expression:
Consider an engineered HEG that is introduced into a chromosome opposite a functional gene. Let the homing rate (the probability of a successful gene conversion) be e and the fitness costs of disrupting gene function be s for the homozygote and sh for the heterozygote. We begin by assuming fitness costs are equal in males and females and that homing occurs at meiosis after gene expression (so any costs of being homozygote are not experienced by the individual in which homing occurs). If q and p are the gametic frequencies of the HEG and wild-type alleles, respectively, the recurrence for q is

$$q' = \frac{q^2(1-s) + pq(1-hs)(1+e)}{1 - sq^2 - 2hspq}. \tag{1}$$

The equilibrium frequency for the HEG, q*, is

$$q^* = 0 \quad \text{if } e < \frac{hs}{1-hs} \text{ and } e < \frac{s(1-h)}{1-hs}, \tag{2}$$

$$q^* = 1 \quad \text{if } e > \frac{hs}{1-hs} \text{ and } e > \frac{s(1-h)}{1-hs}. \tag{3}$$

When these inequalities do not hold there is an interior equilibrium

$$q^* = \frac{e(1-hs) - hs}{s(1-2h)}, \tag{4}$$

which is stable if

$$\frac{hs}{1-hs} < e < \frac{s(1-h)}{1-hs} \tag{5}$$

and unstable if

$$\frac{s(1-h)}{1-hs} < e < \frac{hs}{1-hs}. \tag{6}$$

The latter implies low heterozygote fitness, h > 1/2, in which case the HEG either goes extinct or reaches fixation depending on whether the initial frequency is less than or greater than the unstable equilibrium (Figure 1).
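The behavior described above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes random union of gametes, genotype fitnesses 1, 1 − hs, and 1 − s, and that surviving heterozygotes transmit the HEG with probability (1 + e)/2, which gives the recurrence q' = [q²(1 − s) + pq(1 − hs)(1 + e)]/(1 − sq² − 2hspq). The function names are illustrative only.

```python
def heg_step_after(q, e, s, h):
    """One generation for a HEG that homes at meiosis, after gene expression.

    Assumed model: zygotes form by random union of gametes, selection acts
    on genotypes, then heterozygotes home with probability e.
    """
    p = 1.0 - q
    mean_fitness = 1.0 - s * q * q - 2.0 * h * s * p * q
    # Surviving heterozygotes transmit the HEG with probability (1 + e) / 2.
    heg_gametes = q * q * (1.0 - s) + p * q * (1.0 - h * s) * (1.0 + e)
    return heg_gametes / mean_fitness


def iterate_heg(q0, generations, e, s, h):
    """Iterate the recurrence from an initial HEG frequency q0."""
    q = q0
    for _ in range(generations):
        q = heg_step_after(q, e, s, h)
    return q


def heg_load(q, s, h):
    """Load (1 - mean fitness) for a HEG whose knockout affects survival."""
    return s * q * q + 2.0 * h * s * q * (1.0 - q)


if __name__ == "__main__":
    # Recessive lethal: the frequency should approach e and the load e**2.
    q_eq = iterate_heg(0.01, 500, e=0.6, s=1.0, h=0.0)
    print("recessive lethal:", q_eq, heg_load(q_eq, 1.0, 0.0))
    # Dominant cost (h = 1): the outcome depends on the starting frequency.
    print("start 0.4:", iterate_heg(0.4, 500, e=0.5, s=0.5, h=1.0))
    print("start 0.6:", iterate_heg(0.6, 500, e=0.5, s=0.5, h=1.0))
```

With e = 0.5, s = 0.5, and h = 1 the sketch places the unstable equilibrium at q = 0.5, so the two starting frequencies in the last lines fall on either side of it.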
We define the “HEG load” (L) to be the relative reduction in the growth rate of the population in the presence of the HEG. Assume a population with discrete generations and let $\bar{W}_0$ be the rate at which the population increases in the absence of any density-dependent effects (equivalent here to per capita female fecundity). Then define $L = 1 - \bar{W}_H/\bar{W}_0$, where $\bar{W}_H$ is the population growth rate when the HEG is present. HEG load is thus a quantity very similar to genetic load as usually interpreted in classical population genetics, except that it does not include effects on males and can include processes that bias the sex ratio (see below). In using HEGs to drive down vector and pest densities it is important to note that a HEG load of L does not necessarily mean that the population density is reduced by a factor L. The observed reduction will depend on the precise form of density dependence operating in the population, as well as on the relative ordering of homing, target gene expression, and density dependence in the life cycle. This article is concerned only with genetic dynamics and so we cannot predict absolute population reductions. Nevertheless, calculating L provides a useful comparative measure of potential population reductions. In this case, the HEG load experienced by the population at equilibrium is

$$L = s\,q^{*2} + 2hs\,q^*(1-q^*) \tag{7}$$

for 0 < q* < 1, no load when q* = 0, and L = s for q* = 1.
Consider first the special case in which HEGs are employed to knock out an essential gene with the aim of maximizing HEG load and driving a population to extinction. For a fully recessive homozygote lethal (s = 1, h = 0) the equilibrium HEG frequency is e and the load $e^2$. Thus very substantial loads, and hence potential reductions in population numbers, are possible as homing frequencies approach unity. However, the highest equilibrium HEG loads do not occur when the HEG is invariably lethal in the homozygote. For a given value of the homing rate (e), the greatest load occurs when the fitness costs are at the maximum that still allows the HEG to become fixed (s = e), in which case the load equals the selection pressure (L = s) (Figure 1).
If we assume that the heterozygote also has reduced fitness (h > 0), then a greater range of equilibrium behaviors may be observed (Figure 1). The HEG is always fixed when fitness costs (s) are low and homing rates (e) are high, but away from this region of parameter space decreasing heterozygote fitness first sees a reduction in the parameter combinations where the HEG equilibrium frequency is less than one but greater than zero (an internal equilibrium). Then, when heterozygote fitness is closer to the homozygote HEG than to the wild type (h > 1/2), a region of bistability appears (the HEG goes to extinction or fixation depending on its initial frequency) that increases as the HEG becomes fully dominant. In the absence of stochastic effects, for most parameter combinations a HEG with h > 1/2 will not spread from rare. As before, for fixed e, load is generally maximized at the highest value of s that allows fixation.
A slightly different strategy is to engineer a HEG to target a gene that is required for reproduction rather than survival. In the simplest case, if the knockout prevents the individual from participating in mating, then the dynamics and HEG load are exactly the same as described above. But suppose the knockout acts later, such that mating occurs normally but is less productive (for example, the male makes defective sperm that fertilize the eggs normally but result in inviable progeny), so that any mating in which either the male or the female is an affected HEG carrier leads to fewer offspring (a postmating fertility effect). Then while the dynamics of spread and equilibrium would still be as described above, the genetic load would be greater. Only those matings not involving an infertile carrier of the HEG, a fraction $(1 - q^2 s - 2pqhs)^2$, would produce offspring and hence the genetic load would be $1 - (1 - q^2 s - 2pqhs)^2$. The load is thus always greater than or equal to the equivalent load ($q^2 s + 2pqhs$) for a HEG targeting survival. If the knockout mutation is recessive and abolishes reproductive success completely (s = 1, h = 0), the HEG equilibrium frequency is q = e and the load is $L = 1 - (1 - e^2)^2 = e^2(2 - e^2)$ (Figure 2).
The second special case is when a HEG is employed to knock out a gene required for an insect to vector a pathogen. Ideally the gene would have no fitness costs to the host (s = 0), in which case it would always spread and cause no HEG load. A HEG whose fitness effects are manifest only in homozygotes becomes fixed provided e > s and causes a load of L = s when it affects survival or fecundity. When fertility is affected and determined after mating by the genotype of both partners, then L = s(2 − s).
Were a HEG to be used in a vector or pest control program, not only the ultimate outcome but also the rate at which it is attained would be significant. In Figure 3 we plot the number of generations that it takes for a HEG to increase in frequency from 0.05 to 0.9. For those HEGs that can reach fixation, spread is faster for high homing rates and for genes with recessive fitness costs. For much of this parameter space rapid spread occurs within 10–15 generations, which for many insect species is just a couple of years and so is highly relevant to pest and vector control on relatively short timescales.
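The time-to-spread calculation can be sketched as follows. This is an illustrative reimplementation under stated assumptions rather than the code behind Figure 3: it uses the after-expression recurrence q' = [q²(1 − s) + pq(1 − hs)(1 + e)]/(1 − sq² − 2hspq), counts whole generations, and returns `None` when the target frequency is never reached. The function name and the `max_generations` cutoff are hypothetical.

```python
def generations_to_spread(e, s, h, q_start=0.05, q_end=0.9,
                          max_generations=10_000):
    """Generations needed for the HEG to rise from q_start to q_end.

    Returns None if q_end is not reached within max_generations, for
    example because the equilibrium frequency lies below q_end.
    """
    q = q_start
    for generation in range(1, max_generations + 1):
        p = 1.0 - q
        mean_fitness = 1.0 - s * q * q - 2.0 * h * s * p * q
        q = (q * q * (1.0 - s)
             + p * q * (1.0 - h * s) * (1.0 + e)) / mean_fitness
        if q >= q_end:
            return generation
    return None


if __name__ == "__main__":
    for e in (0.5, 0.7, 0.9):
        for s in (0.0, 0.2):
            print(f"e={e}, s={s}, h=0:", generations_to_spread(e, s, 0.0))
```

Scanning a grid of e and s values with this function gives a rough picture of where spread is fast, where it is slow, and where the target frequency is never attained.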
HEG active before gene expression:
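We now assume that homing and gene conversion occur prior to the expression of the gene containing the HEG recognition sequence. Any fitness consequences of disrupting the gene are now experienced both by homozygotes and by the “transformed” heterozygotes. The recurrence for HEG frequency is now

$$q' = \frac{(q^2 + 2epq)(1-s) + pq(1-e)(1-hs)}{p^2 + 2pq(1-e)(1-hs) + (q^2 + 2epq)(1-s)}. \tag{8}$$

The equilibrium frequency for the HEG, q*, is

$$q^* = 0 \quad \text{if } 2e(1-s) + (1-e)(1-hs) < 1 \text{ and } (1-e)(1-hs) > 1-s, \tag{9}$$

$$q^* = 1 \quad \text{if } 2e(1-s) + (1-e)(1-hs) > 1 \text{ and } (1-e)(1-hs) < 1-s. \tag{10}$$

When these inequalities do not hold, there is an interior equilibrium

$$q^* = \frac{e(1-2s) - hs(1-e)}{s(1-2e) - 2hs(1-e)}, \tag{11}$$

which is stable if

$$2e(1-s) + (1-e)(1-hs) > 1 \text{ and } (1-e)(1-hs) > 1-s \tag{12}$$

and unstable if

$$2e(1-s) + (1-e)(1-hs) < 1 \text{ and } (1-e)(1-hs) < 1-s. \tag{13}$$

In the last case the HEG either goes extinct or reaches fixation depending on whether the initial frequency is less than or greater than the unstable equilibrium (Figure 4).
The HEG load is s when the HEG becomes fixed and

$$L = s\left(q^{*2} + 2e\,p^*q^*\right) + 2hs(1-e)\,p^*q^*, \qquad p^* = 1 - q^*, \tag{14}$$

for the interior equilibrium.
A HEG targeting a fully recessive gene with a significant effect on fitness (h = 0, s > 1/2) can invade only if e > s and if the initial HEG frequency exceeds a threshold of $e(2s-1)/[s(2e-1)]$. Thus the strategy of using HEGs to create a recessive lethal (s = 1, h = 0) will not work if homing occurs prior to gene expression. The benefits of gene conversion in the heterozygote are nullified by the fitness costs of creating a homozygote.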
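The threshold behavior for homing before expression can be illustrated numerically. The sketch below is an assumption-laden reconstruction, not the authors' code: a fraction e of heterozygous zygotes is converted to HEG homozygotes before selection, so converted individuals pay the full cost s. For a recessive cost (h = 0) with s > 1/2, the interior fixed point of this map works out to e(2s − 1)/[s(2e − 1)]. Function names are illustrative.

```python
def heg_step_before(q, e, s, h):
    """One generation for a HEG that homes before the target gene is expressed."""
    p = 1.0 - q
    wild = p * p
    het = 2.0 * p * q * (1.0 - e)   # heterozygotes that escaped homing
    hom = q * q + 2.0 * p * q * e   # homozygotes, including converted heterozygotes
    mean_fitness = wild + het * (1.0 - h * s) + hom * (1.0 - s)
    # Remaining heterozygotes segregate normally (half their gametes carry the HEG).
    return (hom * (1.0 - s) + 0.5 * het * (1.0 - h * s)) / mean_fitness


def invasion_threshold_recessive(e, s):
    """Unstable equilibrium for h = 0 and s > 1/2 (needs e > s to be below one)."""
    return e * (2.0 * s - 1.0) / (s * (2.0 * e - 1.0))


def iterate_before(q0, generations, e, s, h):
    q = q0
    for _ in range(generations):
        q = heg_step_before(q, e, s, h)
    return q


if __name__ == "__main__":
    e, s = 0.9, 0.7
    print("threshold:", invasion_threshold_recessive(e, s))
    print("start 0.60:", iterate_before(0.60, 500, e, s, 0.0))
    print("start 0.70:", iterate_before(0.70, 500, e, s, 0.0))
    # A recessive lethal is lost even from a high starting frequency.
    print("lethal, start 0.5:", iterate_before(0.5, 200, e, 1.0, 0.0))
```

Starting frequencies just below and just above the threshold move toward loss and fixation, respectively, while the recessive lethal declines from any starting frequency below one.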
In the limit, when there are no costs to carrying a HEG, it makes no difference when homing occurs and the HEG will always spread. For moderate fitness costs, the condition for the fixation of the HEG is the same irrespective of the order of homing and expression. However, in the regions where both fixation and extinction of the HEG can occur depending on initial conditions, fixation now requires higher gene frequencies compared to the case where the HEG is active after gene expression. Where the HEG is not fixed, its equilibrium frequency and load are always lower when homing occurs prior to expression, and the rate of spread of the gene is also relatively slower (data not shown).
Returning to our original model of the HEG being active after gene expression, we now assume that it targets a gene that has different effects on males and females and/or that the HEG has different rates of homing in the two sexes. Let $q_x$, $e_x$, $s_x$, and $h_x$ have the same meanings as before except now we assume their values may be different in males (x = m) or females (x = f), with $p_x = 1 - q_x$. The dynamics are given by the coupled recurrence equations

$$q_x' = \frac{q_f q_m (1 - s_x) + \tfrac{1}{2}(p_f q_m + q_f p_m)(1 - h_x s_x)(1 + e_x)}{1 - s_x q_f q_m - h_x s_x (p_f q_m + q_f p_m)}, \qquad x \in \{f, m\}. \tag{15}$$
The general solution to these equations is too complex to be helpful and we focus on a few specific cases. First, assume the knockouts are recessive and affect only females ($h_f = 0$, $s_f > 0$, $s_m = 0$). Then

$$q_f^* = \frac{e_f + e_m}{s_f(1 + e_m)}, \tag{16}$$

$$q_m^* = \frac{(e_f + e_m)(1 + e_m)}{s_f(1 - e_m^2) + 2e_m(e_f + e_m)}. \tag{17}$$

The HEG load is $s_f q_m q_f$ or

$$L = \frac{(e_f + e_m)^2}{s_f(1 - e_m^2) + 2e_m(e_f + e_m)} \tag{18}$$

for $q_f^* < 1$ and $L = s_f$ otherwise. If we assume homing frequency is the same in the two sexes ($e_f = e_m = e$) and the knockout is a female-specific recessive lethal or sterile ($s_f = 1$), then the load is $L = 4e^2/(1 + 3e^2)$ (Figure 2). This load is always greater than that when both sexes are killed or removed from the mating pool, for which the load is $L = e^2$. Killing males is counterproductive because it reduces the frequency of the HEG without reducing population productivity. We can also compare the load caused by a female-specific lethal or sterile with that of a HEG that disrupts both male and female fertility so that only zygotes produced by parents neither of whom are homozygous HEG carriers survive ($L = e^2(2 - e^2)$). The female-specific HEG is superior unless homing rates (e) are large (Figure 2). As before, the maximum HEG load, for a given homing rate, occurs for the highest homozygote fitness cost at which the HEG can become fixed. Here this is found when $s_f = (e_f + e_m)/(1 + e_m)$, in which case $L = s_f$. Larger loads, for the same homing rate, are thus possible for sex-specific fitness effects. The rate of spread of sex-specific HEGs is similar to that of nonspecific genes.
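The sex-specific results can be explored by iterating the two-sex model directly. The sketch below is a hypothetical implementation under explicit assumptions: eggs and sperm unite at random, each sex experiences its own fitness costs, and homing occurs at meiosis after expression at a sex-specific rate. Under those assumptions a female-specific recessive lethal with equal homing rates should settle at a load of 4e²/(1 + 3e²), and male-only homing at e²/(1 + e²); the function names are illustrative.

```python
def sex_specific_step(q_f, q_m, e_f, e_m, s_f, s_m, h_f=0.0, h_m=0.0):
    """One generation of the two-sex model (homing after gene expression).

    q_f and q_m are HEG frequencies in eggs and sperm. Returns the new
    (q_f, q_m) and the load on females in the current generation.
    """
    hom = q_f * q_m                              # HEG homozygotes among zygotes
    het = q_f * (1.0 - q_m) + (1.0 - q_f) * q_m  # heterozygotes among zygotes

    def gamete_frequency(e, s, h):
        mean_fitness = 1.0 - s * hom - h * s * het
        return (hom * (1.0 - s)
                + 0.5 * het * (1.0 - h * s) * (1.0 + e)) / mean_fitness

    female_load = s_f * hom + h_f * s_f * het
    return (gamete_frequency(e_f, s_f, h_f),
            gamete_frequency(e_m, s_m, h_m),
            female_load)


def equilibrium_load(e_f, e_m, s_f, s_m, generations=5000, q0=0.05):
    q_f = q_m = q0
    load = 0.0
    for _ in range(generations):
        q_f, q_m, load = sex_specific_step(q_f, q_m, e_f, e_m, s_f, s_m)
    return q_f, q_m, load


if __name__ == "__main__":
    e = 0.5
    print("female-specific lethal:", equilibrium_load(e, e, 1.0, 0.0))
    print("closed form load      :", 4 * e**2 / (1 + 3 * e**2))
    print("male-only homing      :", equilibrium_load(0.0, 0.8, 1.0, 0.0))
```

The load returned here counts only the fitness reduction in females, in line with the definition of HEG load as a reduction in population productivity.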
Second, assume that the fitness effects of the HEG are the same for both males and females, but that homing rates are different ($s_f = s_m$, $h_f = h_m = 0$, $e_f \neq e_m$). Recall that when the homing rate is the same in the two sexes, fixation requires s < e: in the present case s must be less than the average rate of homing in the two sexes, $\bar{e} = (e_f + e_m)/2$. If $s_f = s_m = 1$, then fixation cannot occur and the load is $L = \bar{e}(1 - \rho)/(1 + \rho)$, where $\rho = \sqrt{(1 - e_f)(1 - e_m)/[(1 + e_f)(1 + e_m)]}$. If the average homing rate is kept constant, the load is at a minimum when rates are the same in the two sexes (where it reduces to $e^2$) and increases as the difference gets larger.
Third, consider the case where both homing and costs are sex specific. If the HEG homes only in females and is a female-specific lethal or sterile ($e_m = 0$, $e_f > 0$, $s_m = 0$, $s_f = 1$), then the loads produced are identical to the non-sex-specific case. But if homing is restricted to males rather than females, then a female lethal or sterile HEG ($e_m > 0$, $e_f = 0$, $s_m = 0$, $s_f = 1$) causes substantially lower loads to occur ($L = e_m^2/(1 + e_m^2)$), with the maximum obtainable load (as homing frequency approaches one) being L = 1/2 rather than 1. The reason for this is that when homozygous females are rendered dead or sterile, then heterozygous females make a relatively larger contribution to the next generation, and hence the spread of the HEG is particularly influenced by the homing that occurs in these heterozygotes. The case of non-sex-specific lethality but homing only in a single sex provides an even worse outcome in terms of load than female lethality and male homing.
Consider now using HEGs to knock out a gene essential for vector transmission. Suppose there are mild costs (s ≪ e) to the knockout that may be experienced by males, females, or both sexes equally. The HEG will always go to fixation and there are only minor differences in the speed at which this happens (fastest when only one sex suffers the fitness cost).
We have also explored sex-specific fitness costs when the HEG is active prior to gene expression. Qualitatively the conclusions are very similar to the comparison of the two situations in the non-sex-specific case: HEG activity prior to expression always tends to reduce the rate of spread, genetic load, and equilibrium frequency.
Dynamics of HEG-escape mutants:
When a homing endonuclease cuts a chromosome it is normally repaired using the second chromosome as a template, the mechanism through which the HEG increases in frequency. But it is possible that the chromosome is rejoined in a different way. Possibly an incomplete copy of the HEG is transferred, or alternatively the ends of the chromosome may be ligated without the use of a template. If the wild-type chromosome is precisely reconstituted, then the initial cut leaves no trace and the only dynamic effect is a reduction in the efficiency of homing, a lower value of the parameter e. But if the repair destroys the HEG recognition site, then nonstandard repair can be very important. A critical issue is whether the repaired chromosome, without a HEG recognition site, contains a functioning gene.
To explore this assume there are three classes of allele, the wild type (+), the HEG (H), and a misrepaired allele (M) at frequencies of p, qH, and qM, respectively. In reality there are likely to be several classes of misrepaired allele, although we simplify the situation by allowing just a single type. The genotype frequencies and fitnesses are given in the table, where the subscripted s and h parameters describe the fitness costs of the different alleles and their pattern of dominance. Population fitness, $\bar{W}$, is the average genotype fitness weighted by frequency (and so the HEG load is $1 - \bar{W}$) and γ is the frequency of misrepair [and hence (1 − γ) is the frequency of repair resulting in a functional HEG].
|Genotype||+/+||+/H||+/M||H/H||H/M||M/M|
|Frequency||p²||2pqH||2pqM||qH²||2qHqM||qM²|
|Fitness||1||1 − hHsH||1 − hMsM||1 − sH||1 − sI||1 − sM|
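With these assumptions we can write recurrences for qH and qM,

$$q_H' = \frac{q_H^2(1 - s_H) + q_H q_M(1 - s_I) + p\,q_H(1 - h_H s_H)\left[1 + e(1 - \gamma)\right]}{\bar{W}}, \tag{19}$$

$$q_M' = \frac{q_M^2(1 - s_M) + q_H q_M(1 - s_I) + p\,q_M(1 - h_M s_M) + p\,q_H(1 - h_H s_H)\,e\gamma}{\bar{W}}, \tag{20}$$

where $\bar{W} = p^2 + 2pq_H(1 - h_H s_H) + 2pq_M(1 - h_M s_M) + q_H^2(1 - s_H) + 2q_H q_M(1 - s_I) + q_M^2(1 - s_M)$ is the mean fitness. The general solution of these equations is too complex to be useful so we explore some special cases.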
Assume first that the wild-type allele is fully dominant (hH = hM = 0) and that both the HEG and the misrepair allele cause nonfunctional gene products leading to death in the absence of a wild-type allele (sH = sM = sI = 1). The equilibrium frequency of the two alleles is

$$q_H^* = e(1 - \gamma)^2, \qquad q_M^* = e\gamma(1 - \gamma), \tag{21}$$

and the HEG load is $e^2(1 - \gamma)^2$. Thus if the aim of the program is to reduce population densities by targeting a lethal gene, the effect of misrepair leading to nonfunctional alleles is simply to make homing less efficient.
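The three-allele model can be iterated numerically to check this special case. The sketch below rests on assumptions that should be read as such: genotypes are in Hardy-Weinberg proportions, homing occurs after expression, a fraction e of wild-type chromosomes in +/H heterozygotes is cut, and each cut chromosome becomes H with probability 1 − γ or M with probability γ. Function and parameter names are illustrative.

```python
def escape_step(q_h, q_m, e, gamma, s_h, s_m, s_i, h_h=0.0, h_m=0.0):
    """One generation of the wild type / HEG / misrepair allele model.

    Returns the new (q_h, q_m) and the load (1 - mean fitness) of the
    current generation.
    """
    p = 1.0 - q_h - q_m
    w_ph = 1.0 - h_h * s_h   # +/H
    w_pm = 1.0 - h_m * s_m   # +/M
    w_hh = 1.0 - s_h         # H/H
    w_hm = 1.0 - s_i         # H/M
    w_mm = 1.0 - s_m         # M/M
    mean_fitness = (p * p + 2 * p * q_h * w_ph + 2 * p * q_m * w_pm
                    + q_h * q_h * w_hh + 2 * q_h * q_m * w_hm
                    + q_m * q_m * w_mm)
    # +/H heterozygotes: successful homing adds H, misrepair adds M.
    new_h = (q_h * q_h * w_hh + q_h * q_m * w_hm
             + p * q_h * w_ph * (1.0 + e * (1.0 - gamma))) / mean_fitness
    new_m = (q_m * q_m * w_mm + q_h * q_m * w_hm + p * q_m * w_pm
             + p * q_h * w_ph * e * gamma) / mean_fitness
    return new_h, new_m, 1.0 - mean_fitness


def run_escape(q_h0, generations, **params):
    q_h, q_m, load = q_h0, 0.0, 0.0
    for _ in range(generations):
        q_h, q_m, load = escape_step(q_h, q_m, **params)
    return q_h, q_m, load


if __name__ == "__main__":
    e, gamma = 0.8, 0.1
    print(run_escape(0.01, 1000, e=e, gamma=gamma, s_h=1.0, s_m=1.0, s_i=1.0))
    print("expected:", e * (1 - gamma) ** 2, e * gamma * (1 - gamma),
          (e * (1 - gamma)) ** 2)
```

Setting `s_m` and `s_i` below one in the same function gives a starting point for exploring partially functional escape mutants.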
Now assume that the misrepair allele produces a gene product that is at least partially functional. Specifically let sM = sI = s while the HEG remains homozygous lethal (sH = 1) and all other parameters are as above. In the case of s = 0 the misrepaired allele is completely functional and it can be shown formally that there is no stable equilibrium with the HEG present, no matter how small is the rate with which escape arises (γ). For s > 0 an equilibrium exists with both the HEG and the escape mutant present. As the costs of the escape mutant rise, the equilibrium frequency of the HEG also increases. Higher homing rates (e) and higher probabilities of legitimate repair (1 − γ) lead to both greater frequencies of the HEG and greater load (Figure 5).
Recall that in the absence of misrepair a HEG that has no fitness costs (sH = 0) inexorably becomes fixed. If less than half the time (γ < 1/2) misrepair generates nonfunctional or partially functional alleles (sM = sI > 0), then fixation of a cost-free HEG still occurs, though it may happen more slowly. If functional alleles are produced that have no fitness costs (sM = sI = 0), then both the HEG and the misrepair allele increase in frequency until the wild-type allele disappears. The equilibrium frequency of the HEG, $q_H^* = 1 - \gamma(1 - q_{H0})$, depends on the frequency with which these mutant alleles arise (γ) as well as the initial HEG frequency $q_{H0}$. After this the HEG and misrepair alleles will show neutral dynamics affected only by drift.
Complex dynamics may occur if the HEG targets a gene whose knockout is neither lethal nor of no fitness consequence to the organism (0 < sH < 1). We have not performed a full analysis of all possible scenarios but HEGs are more likely to become fixed if they have high fitness in the homozygote and lost if the homozygote is costly. High homing rates (e) tend to favor HEG fixation and low rates loss; high misrepair rates (γ) also increase the chance of loss, while low relative escape mutant fitness (sI > sH) tends to favor fixation. For intermediate values of sH polymorphisms may occur with the HEG frequency depending on the relative costs of the HEG and escape mutant, as well as the frequency with which the latter arises. Simulations of the dynamics show complex behaviors including dependence on initial conditions and cycles where the frequency of the HEG at times gets so close to zero that in a real population it could be lost due to stochastic effects. We note that similar, complex dynamics have been observed in other meiotic drive systems (Charlesworth and Hartl 1978; Nauta and Hoekstra 1993).
THEORETICAL RESULTS: X CHROMOSOME SHREDDERS
Assume that in a species with heterogametic males k HEGs are inserted into the Y chromosome and each recognizes a specific, different site on the X chromosome causing a break with probability e. The chance that a particular X chromosome survives this assault is thus $(1 - e)^k$ and the fraction of gametes carrying the Y chromosome increases from 1/2 to $1/[1 + (1 - e)^k]$. If there are no fitness costs to reduced sperm volume and the frequency of HEG-bearing Y chromosomes (as a fraction of all Y chromosomes) in the current generation is q, then in the next generation it will be

$$q' = \frac{2q}{2q + (1 - q)\left[1 + (1 - e)^k\right]}. \tag{22}$$

The HEG spreads to fixation on the Y chromosome (Figure 6) and hence the sex ratio (proportion of males) is equal to the fraction of Y-bearing gametes $1/[1 + (1 - e)^k]$. Note that in the special case of a single HEG the equilibrium sex ratio is 1/(2 − e). We define the HEG load when there is a biased sex ratio to be $L = 1 - 2(1 - r)\bar{W}_H/\bar{W}_0$, where r is the sex ratio and $\bar{W}_H$ and $\bar{W}_0$ are the average population growth rates (excluding the sex-ratio effect) in the presence and the absence of the HEG, respectively. In this case, once the HEG has become fixed, $L = 1 - 2(1 - r) = [1 - (1 - e)^k]/[1 + (1 - e)^k]$.
The equilibrium sex ratio as a function of the chromosome break frequency (e) and the number of HEGs is shown in Figure 7. If breaks occur with a high probability, then the insertion of a single HEG can skew the sex ratio so strongly to males that the population is unlikely to persist. If breaks occur with lower probability, then high skew and population extinction can still be achieved using multiple HEGs. The spread of the gene is relatively fast. For example, a single X shredder with a cutting rate of e = 0.9 can increase in frequency from 0.01 to 0.99 in ∼15 generations. The same speed of spread for lower cutting rates can be achieved with multiple X shredders (Figure 6).
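The spread of an X shredder and the resulting sex ratio can be sketched in a few lines. The code below is an illustration under the stated assumptions of no cost to reduced sperm volume and single mating: a HEG-bearing male then sires sons in proportion 1/(1 + u), with u = (1 − e)^k, against 1/2 for a wild-type male, which gives q' = 2q/[2q + (1 − q)(1 + u)]. Function names and the `max_generations` cutoff are hypothetical.

```python
def equilibrium_sex_ratio(e, k=1):
    """Proportion of males once the shredder is fixed on the Y chromosome."""
    return 1.0 / (1.0 + (1.0 - e) ** k)


def shredder_step(q, e, k=1):
    """Next-generation frequency of HEG-bearing Y chromosomes."""
    u = (1.0 - e) ** k          # fraction of X chromosomes left intact
    return 2.0 * q / (2.0 * q + (1.0 - q) * (1.0 + u))


def generations_to_reach(e, k=1, q_start=0.01, q_end=0.99,
                         max_generations=10_000):
    """Whole generations needed to go from q_start to q_end (None if never)."""
    q = q_start
    for generation in range(1, max_generations + 1):
        q = shredder_step(q, e, k)
        if q >= q_end:
            return generation
    return None


if __name__ == "__main__":
    print("sex ratio, e=0.9, k=1:", equilibrium_sex_ratio(0.9))
    print("generations, e=0.9, k=1:", generations_to_reach(0.9))
    print("generations, e=0.5, k=1:", generations_to_reach(0.5, k=1))
    print("generations, e=0.5, k=3:", generations_to_reach(0.5, k=3))
```

Comparing k = 1 with k = 3 at the same cutting rate shows how multiple shredders compensate for a lower per-site break probability.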
Note that the mathematics is unaltered if the parameter k refers not to the number of unique target sites of multiple HEGs, but to the multiple targets of a single HEG. Theoretically, very high values of k might thus be achieved if the HEG is engineered to recognize common sequences of a tandem array of genes or other multiple-gene families. Note that at some point the concentration of the homing endonuclease might become limiting, depending upon how strongly it is expressed.
Costs due to lower sperm numbers:
The arguments above assume that males with reduced sperm production suffer no fitness penalties. The simplest way to relax this assumption is to let the relative cost paid by engineered males be a function, f(x), of relative sperm volume ($x = \tfrac{1}{2}[1 + (1 - e)^k]$), where f(x) decreases from one to zero as x varies over the same range. Intuitively, males are penalized by producing fewer sperm. If the costs are sufficiently high that HEG-bearing sperm fertilize fewer eggs than they would have in the absence of X shredding, then the spread of the HEG can be prevented (f(x) > 1 − x). Below this threshold the HEG still spreads to fixation and the final sex ratio is the same, but the costs can considerably slow the rate at which this is attained.
A more mechanistic model of the costs of reduced sperm production can also be constructed. Multiple mating will reduce the advantage of the Y chromosome bearing the HEG as some of the benefits of the smaller number of X gametes will be shared with wild-type Y chromosomes from other males. To model this, assume that when a female is mated by x males carrying the HEG and y carrying the wild-type allele, her eggs are fertilized randomly by the pool of sperm produced by the x + y males. Writing $u = (1 - e)^k$, the recurrence for the frequency of the HEG becomes

$$q' = \frac{\sum_{x,y} p(x, y)\,\dfrac{x}{x(1 + u) + 2y}}{\sum_{x,y} p(x, y)\,\dfrac{x + y}{x(1 + u) + 2y}}, \tag{23}$$

where p(x, y) is the probability of the combination [x, y]. If females mate only once at random with the two types of males [so p(1, 0) = q, p(0, 1) = 1 − q, and all other p(x, y) = 0], then Equation 23 simplifies to Equation 22. Another limit is the case when all females are mated by a large number of males (x + y → ∞), in which case q′ = q: the HEG allele has exactly the same fitness as the wild-type allele because it gets no special benefit from the reduction in X gametes it brings about.
Figure 8 explores intermediate cases assuming p(x, y) is a compound distribution with the total number of matings (x + y) determined by a geometric distribution [the probability that a female is mated x + y times is $(1 - \varphi)\varphi^{x+y-1}$ with φ < 1] and the numbers of x and y conditional on their sum described by the binomial distribution. Increased frequencies of multiple mating reduce the advantages of X shredding and slow the spread of the HEG, without affecting the final outcome.
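The effect of multiple mating can be sketched numerically. This is a hedged illustration, not the code behind Figure 8: it assumes each HEG male contributes Y-bearing and intact X-bearing sperm in the ratio 1 : u with u = (1 − e)^k, each wild-type male contributes them 1 : 1, and eggs are fertilized at random from the pooled sperm. The geometric sum is truncated at a hypothetical `max_matings`, which is harmless when φ is not too close to one.

```python
from math import comb


def next_heg_y_frequency(q, e, k, phi, max_matings=200):
    """Next-generation frequency of HEG-bearing Y chromosomes with sperm competition.

    The number of mates is geometric, P(n) = (1 - phi) * phi**(n - 1), and the
    number x of HEG-bearing mates among n is binomial with probability q.
    """
    u = (1.0 - e) ** k
    sons_with_heg = 0.0
    all_sons = 0.0
    for n in range(1, max_matings + 1):
        p_n = (1.0 - phi) * phi ** (n - 1)
        for x in range(n + 1):
            y = n - x
            p_xy = p_n * comb(n, x) * q ** x * (1.0 - q) ** y
            # Twice the viable sperm pool: x*(1 + u) from HEG males, 2*y from wild type.
            pool = x * (1.0 + u) + 2.0 * y
            sons_with_heg += p_xy * x / pool
            all_sons += p_xy * n / pool
    return sons_with_heg / all_sons


if __name__ == "__main__":
    q, e, k = 0.1, 0.9, 1
    for phi in (0.0, 0.5, 0.8):
        print(f"phi={phi}:", next_heg_y_frequency(q, e, k, phi))
```

With φ = 0 every female mates once and the single-mating recurrence is recovered; larger φ moves the next-generation frequency back toward q.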
Costs due to lower zygote numbers:
A different type of cost occurs when sperm that carry shredded and hence inviable X chromosomes can pseudofertilize zygotes. The latter die and are not available for fertilization by the HEG-carrying Y. Let $u = (1 - e)^k$ be the fraction of X chromosomes that avoid shredding. Assume that a fraction z of the remaining (1 − u) shredded X chromosomes pseudofertilize a zygote. In the absence of X shredding the Y chromosome could expect to fertilize half the zygotes, and with X shredding but not pseudofertilization this fraction goes up to 1/(1 + u). However, if pseudofertilization occurs the fraction drops to $1/[1 + u + z(1 - u)]$. Clearly the advantage of X shredding disappears if z = 1 while for z < 1 the HEG still spreads to fixation with the final sex ratio unaffected, though the speed of spread declines as z increases (Figure 9).
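A short sketch makes the role of z concrete. It assumes, as an illustration only, that a HEG-bearing male's Y chromosome secures a share 1/[1 + u + z(1 − u)] of the eggs he fertilizes, compared with 1/2 for a wild-type male, and that females mate once. Function names are hypothetical.

```python
def heg_y_share(e, k, z):
    """Fraction of eggs fertilized by the HEG-bearing Y of a shredder male."""
    u = (1.0 - e) ** k                    # X chromosomes that avoid shredding
    return 1.0 / (1.0 + u + z * (1.0 - u))


def shredder_step_pseudofertilization(q, e, k, z):
    """Next-generation frequency of HEG-bearing Y chromosomes."""
    a = heg_y_share(e, k, z)
    return q * a / (q * a + (1.0 - q) * 0.5)


if __name__ == "__main__":
    for z in (0.0, 0.5, 1.0):
        print(f"z={z}:", heg_y_share(0.9, 1, z),
              shredder_step_pseudofertilization(0.1, 0.9, 1, z))
```

At z = 1 the share returns to 1/2 and the HEG frequency no longer changes, matching the statement that the advantage of shredding disappears.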
X-shredder escape mutants:
Breaks in the X chromosome may be repaired by ligation of the two ends (repair using the homologous chromosome as a template is not of course possible in the heterogametic sex). If the repair regenerates the HEG recognition site, then this is simply equivalent to a reduction in the efficiency of shredding. But if the recognition site is lost, and if the repaired chromosome suffers no or mild fitness costs after repair, then an X-shredder escape mutant will have been generated.
To explore this for the case of an X shredder with a single recognition site (k = 1) we model the dynamics of four kinds of chromosome: the wild-type sex chromosomes (Y and X), Y chromosomes bearing a HEG (Yh), and X chromosomes bearing the escape mutant (Xm). Among Y chromosomes the frequency of Yh is q while the frequency of Xm among X chromosomes is ξm in sperm and ξf in eggs. The genotype frequencies and fitnesses are given in the following tables. The subscripted s and h parameters describe the fitness costs of the different alleles and their pattern of dominance. We assume that the costs of bearing a HEG arise directly from the X-shredding process.
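Males:
| Genotype | XY | XmY | XYh | XmYh |
| Frequency | (1 − q)(1 − ξf) | (1 − q)ξf | q(1 − ξf) | qξf |
| Fitness | 1 | 1 − sm | 1 − sH | 1 − sm |

Females:
| Genotype | XX | XXm | XmXm |
| Frequency | (1 − ξm)(1 − ξf) | ξm(1 − ξf) + (1 − ξm)ξf | ξmξf |
| Fitness | 1 | 1 − sf hf | 1 − sf |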
Assuming that escape mutants arise with frequency γ, the recurrences for gamete frequencies are (24)
The general solution is very complex and we focus on some special cases. First, assume that there are no costs to HEG carriage (sH = 0). If the escape mutant also has no costs (sm = sf = 0), then it will always spread to fixation. If an escape mutant suffers fitness costs sM in the homozygote and the hemizygote (sm = sf = sM) and hMsM in the heterozygote, then fixation of the escape mutant generally occurs below a critical threshold given by the two inequalities (25). Fixation of the mutant restores equal sex ratios and the remaining HEG alleles in the population will then have neutral dynamics because they have identical fitness to the wild type. If the costs are above this threshold, then the HEG will spread to fixation with the population remaining polymorphic for the escape mutant (Figure 10). Thus a single HEG is least affected by resistant escape mutants when its target sequence is an essential gene whose function is lost after repair (sM = 1, hM = 0). Similar dynamics are observed when the gene targeted by the HEG is essential only in males or in females.
Now suppose that there is a cost to X shredding (sH > 0), perhaps because sperm volume is reduced or through pseudofertilization (see above), and focus on cases where the target gene cut on the X chromosome may be essential. If HEG costs exceed a threshold (sH > e(1 − γ)/2) that depends on the frequency with which escape mutants arise, then the HEG always goes extinct and the sex ratio returns to equality. If the cost of the HEG is below this threshold, then the HEG becomes fixed and the predicted sex ratio depends on the fitness of the escape mutant. If the escape mutant has wild-type fitness it spreads to fixation and equal sex ratio is restored. But if repair gives rise to an escape mutant that at least in some circumstances is lethal, then a polymorphism may result where the sex ratio shows an intermediate bias toward males. For instance, if the mutant is dominant and lethal in females (sf = 1, hf = 1, sm = 0), the sex ratio evolves toward 1/(2 − e + eγ), and at each generation a proportion eγ/(1 − e + eγ) of the females die before reproducing.
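The closed-form quantities quoted in this paragraph can be evaluated directly. The sketch below simply codes the three expressions given in the text for the case of a mutant that is dominant and lethal in females (sf = 1, hf = 1, sm = 0); the function name and the returned tuple layout are illustrative, and the guard against a zero denominator is an added convenience.

```python
def x_shredder_outcomes(e, gamma):
    """Evaluate the expressions quoted in the text.

    e:     cutting (shredding) frequency
    gamma: frequency with which escape mutants arise
    Returns (cost_threshold, male_fraction, female_death_fraction).
    """
    survivors_and_mutants = 1 - e + e * gamma
    if survivors_and_mutants == 0:
        raise ValueError("no X-bearing sperm remain (e = 1, gamma = 0)")
    cost_threshold = e * (1 - gamma) / 2                  # HEG is lost if sH exceeds this
    male_fraction = 1 / (2 - e + e * gamma)               # sex ratio (proportion male)
    female_deaths = e * gamma / survivors_and_mutants     # females dying before reproducing
    return cost_threshold, male_fraction, female_deaths


if __name__ == "__main__":
    print(x_shredder_outcomes(e=0.9, gamma=0.1))
```

For example, with e = 0.9 and γ = 0.1 the HEG tolerates costs up to sH = 0.405, the sex ratio is about 84% male, and roughly 47% of females die before reproducing.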
This analysis suggests that single X shredders, or X shredders with single recognition sites (k = 1), are likely to fail if cost-free escape mutants occur. However, the probability of escape mutants of this type arising can be markedly reduced by using multiple HEGs or HEGs that cut at multiple targets on the X chromosome. An escape mutant requires that the X chromosome be rejoined with the loss of the HEG recognition site at all k sites before it can avoid further attack. We have not modeled this scenario in detail but the main effect of using a HEG with multiple shredding sites can be captured by reducing γ, the rate at which escape mutants are generated. Though formally for all γ > 0 long-term persistence of the HEG is not possible when escape mutants have no costs, for small γ this may take a very long time. A more detailed model of escape mutant evolution would need to take into account the frequency of X chromosomes that had lost the HEG recognition site at some but not all sites. Also, for HEGs targeting recognition sites in multicopy genes, the possibility of gene conversion and molecular drive would need to be considered.
Parasitic or selfish genetic elements have clear attractions as potential tools for population genetic engineering and control since they are able to spread through a population even if they do not increase the fitness of organisms carrying them—indeed, they may spread even if they cause some harm. Moreover, our ever-increasing understanding of the molecular basis of drive means that it is becoming feasible to synthesize artificial elements ourselves (Han and Boeke 2004; Adelman et al. 2007; Chen et al. 2007). This article is a theoretical investigation of the conditions under which one class of parasitic genetic element, HEGs, might be useful for population genetic engineering and in particular for applications to control disease vectors or pests. The aim of the work presented here is to provide guidance in designing efficient constructs with the lowest likelihood of resistance arising.
Two logically different uses of HEGs are considered. First, we investigated how their intrinsic homing ability might be put to use. A HEG could be designed to target an essential gene and hence to impose a genetic load on a population or in a vector to knock out a gene that is essential for disease transmission but nonessential for the host. Second, we consider a HEG placed on the Y chromosome that recognizes and cuts one or more sites on the X chromosome. By reducing competition from X-carrying sperm the driving Y chromosome spreads, causing a male-biased sex ratio that can severely reduce population growth rate.
Our models show that a HEG targeting an essential gene can potentially cause a substantial reduction in population fitness, while one that targets a gene whose knockout has minor consequences for host fitness quickly goes to fixation. These strategies rely on the break in the chromosome caused by the HEG being repaired by gene conversion with the homologous chromosome as the template. In our models the frequency of homing by this mechanism is described by the parameter e that typically should be as high as possible. But repair by gene conversion is not the only possible pathway, and experiments have shown that the frequency of different types of repair can depend upon the genomic context. For example, if the cleavage site is flanked by direct repeats, then the single-strand annealing (SSA) pathway usually seems to predominate, at least in the premeiotic male germline. In this process all DNA sequence between the repeats is removed: thus the HEG site is lost but a functional gene is unlikely to be reconstituted. Thus SSA reduces the efficiency of homing rather than giving rise to resistant alleles. In one set of experiments with Drosophila the SSA pathway was used two-thirds of the time in the premeiotic male germline with conversion accounting for only 10–20% of repairs (Preston et al. 2006). In other experiments, without direct repeats flanking the cleavage site, conversion repair was observed ∼85% of the time [excluding repair of the sister chromatid, which precisely regenerates the target site, producing a chromosome that can simply be cut again (Rong and Golic 2003)]. These experiments indicate that it will be important to avoid targets that are flanked by direct repeats. The timing of HEG expression can also be important: if cleavage is late in spermatogenesis, then nonhomologous end joining (NHEJ) can predominate (Preston et al. 2006). Here the ends of the chromosome are directly ligated without involvement of the homolog (and, as discussed below, with a high probability of the loss of the HEG recognition sequence), again not the desired result. There may also be differences in the frequency of conversion repair (e), depending upon where the target is located in the genome.
If the aim of the intervention is to impose a load on the population, the models suggest that it is better to target a gene that affects fertility rather than viability and that it is in general better for the target to be recessive (though in some cases maximum load is obtained when heterozygote fitness is somewhat reduced, 0 < h < 0.5). In Drosophila there are thought to be ∼3000 genes essential for viability and 50–100 needed for female fertility, most of which are recessive (Ashburner et al. 1999, 2005). It is also possible to impose a load if one targets male fertility. In Drosophila there are two genes (misfire and sneaky) that, when disrupted, have the effect of impairing embryo development after fertilization (Ohsako et al. 2003; Wilson et al. 2006; Smith and Wakimoto 2007). Misfire is needed for correct sperm head decondensation and sneaky is required for the proper breakdown of the sperm plasma membrane. Homologs of these genes in pest and vector species are potential targets. In principle, if a gene were to affect both male fertility (in this way) and female fertility, then targeting it could be more efficient than knocking out an essential gene, but it is not clear if any such genes exist in insects.
It may also be possible to target genes that have fitness consequences in only one sex or that have different homing frequencies in the two sexes. Targeting only females is advantageous as the reduced overall costs allow the HEG to reach higher equilibrium frequencies (and spread faster) and then impose a greater load on the population. For most human disease vectors it is the female that requires a blood meal prior to oviposition and transmits the pathogen, another advantage for preferentially targeting this sex.
Different control sequences can be used to determine the relative timing of homing and the expression of the gene targeted by the HEG. We find constructs are more invasive if HEG expression is after that of the target gene, so that heterozygotes have more-or-less normal fitness. In many cases HEGs cannot spread if the fitness consequences of gene conversion are experienced by the individuals in which it occurs, and the speed of spread and eventual load of any HEG that has a significant fitness cost is lower if homing occurs before expression of the target gene.
There is one circumstance in which the inability of an early-acting HEG to spread may be a positive advantage. Successful control of several pests has been obtained by the mass release of males that have been sterilized using chemicals or radiation (sterile insect techniques, SIT). A development of this is to engineer condition-dependent dominant lethals that are mass reared in the presence of an artificial repressor that inactivates this gene (Thomas et al. 2000; Atkinson et al. 2007; Phuc et al. 2007). If released into the environment in sufficient numbers they bring about population collapse by competing with wild-type males for mates that then fail to produce offspring (in the absence of the artificial repressor in the field). If the gene is only lethal for females (Thomas et al. 2000), this release of insects with a dominant lethal (RIDL) technique has the additional advantage that the male progeny of released insects can transmit the female lethality to the next generation. Although in the absence of further releases the trait quickly disappears, its temporary persistence in the environment is an advantage of RIDL over SIT. A HEG that could not spread through a population could be combined with a condition-dependent female-lethal construct in a “RIDL-with-drive” strategy, which would be more effective than RIDL by itself (Thomas et al. 2000), though again without long-term persistence.
The models show that for broad regions of parameter space, typically involving HEGs with substantial fitness costs in the heterozygote state, the fate of the construct depends on its initial frequency. Although these HEGs cannot spread from rare, were they to be released in sufficient numbers, the constructs would spread to fixation. The ability of many HEG constructs to spread from rare, even in the presence of costs, distinguishes them from a number of potential drive elements—for example, medea elements, Wolbachia, and underdominant chromosomes—that in the presence of costs require frequencies to exceed a threshold before spread occurs (Sinkins and Gould 2006). If a HEG construct was developed and then found to spread only above a modest threshold frequency, it might still be useful as a drive mechanism. Currently, we see no particular advantages to such constructs and efforts should be directed at developing HEGs that can spread from rare.
For insect-transmitted diseases a different strategy involving homing is to target a gene whose disablement has no or relatively low costs to the vector but that is essential for successful parasite transmission. Such HEGs rapidly rise in frequency to fixation in the population and considerations such as whether the HEG acts before or after target gene expression have little or no effect on whether it spreads. Currently, there is intensive study of mosquito gene products required by Plasmodium and other disease agents (e.g., Dinglasan et al. 2007; Ecker et al. 2007) and measures of the costs to the vector of their knockout would be very interesting.
Any strategy that aims to impose a fitness load on a population inevitably leads to strong selection for resistance. For HEGs, the easiest way for resistance to occur is for an allele to arise that codes for a functioning gene but that does not possess the HEG recognition site. This could arise through point mutation or, most likely, through the chromosome break caused by the HEG not being rejoined by homologous repair but by NHEJ. Here we show that if an escape mutant has normal fitness, then it will spread and the HEG will eventually disappear. If the HEG has low fitness costs, then the spread of the escape mutant will occur slowly and significant short-term benefits in population management may still occur. But for HEGs whose prime purpose is imposing load any escape mutant that arises will quickly destroy any benefit.
It is thus critical to design a HEG where loss of the recognition site in the target gene also implies loss of gene function. Some HEGs that target protein-coding genes have evolved to “ignore” the third base silent sites in the target sequence, so as to broaden their own specificity (Koufopanou et al. 2002; Kurokawa et al. 2005). If this feature can be maintained in the engineered HEGs, then the most obvious sort of resistant mutation can be nullified. Loss of the recognition site that arises from NHEJ normally involves the deletion of a short stretch of DNA, and hence engineering a HEG to recognize the DNA coding for an active site of an enzyme, or equivalent conserved motif, should ensure its loss causes nonfunctionality. Further strategies that might be explored include targeting multiple sites within the same gene simultaneously and targeting multiple genes simultaneously. Akin to multiple-drug resistance, these measures might substantially delay the evolution of resistance, possibly allowing enough time for local extinction to occur.
The second distinct way of using a HEG is to destroy X chromosomes at male meiosis and so bias the sex ratio (Burt 2003), a strategy formally modeled here for the first time. This mechanism does not rely on homing per se but on the HEG providing an extra-Mendelian advantage to its Y chromosome carrier. Naturally occurring driving Y chromosomes are known from two medically important genera of mosquito, Aedes and Culex, which is encouraging for this strategy (Newton et al. 1976; Sweeny and Barr 1978; Wood and Newton 1991; Cha et al. 2006). Nothing is known at the molecular level of how they work, but cytologically drive is associated with X chromosome breaks at male meiosis (Cazemajor et al. 2000; Windbichler et al. 2007). The degree to which breaking the X chromosome favors the Y chromosome depends quite critically on the organism's biology. We show that if there are costs to reduced sperm production, or if multiple mating occurs and fertilization is a lottery, then the spread of the HEG-bearing Y may be slowed. Similarly, if damaged X chromosomes “pseudofertilize” eggs, that is, if they compete with other sperm and create zygotes that then die, the HEG-bearing Y spread may be slowed and in the limit even totally prevented. We lack information on these issues for major vectors and pests and studies on the reproductive biology of a candidate species would need to be done in advance to assess the likely outcome of adopting this approach. For example, if irradiation of sperm (which causes chromosome breaks) leads to death of zygotes and not sperm, as occurs in some species, then this would suggest pseudofertilization might occur.
An escape mutant on the X chromosome that destroys the HEG recognition site and does not dramatically reduce fitness would increase in frequency and ultimately return the sex ratio to equality. We would also expect a suppressor mutant on an autosome to spread in a similar manner and destroy the sex-ratio bias, though we have not modeled this scenario. Again, as with classical HEGs, the probability of resistance occurring can be reduced by designing the HEG to cut multiple sites on the X chromosome. This is also advantageous because, for cutting frequencies (e) less than one, multiple sites increase the equilibrium sex-ratio bias. An example of how this might be implemented is provided by Anopheles gambiae, the most important vector of malaria in sub-Saharan Africa. In this species rDNA repeats are restricted to the X chromosome, and the I-PpoI homing endonuclease of Physarum slime molds recognizes and cuts a sequence within the gene (Windbichler et al. 2007). Given that there are >100 copies of the rDNA gene (Collins and Paskewitz 1996), the evolution of resistance will be a very rare event. Note that if the X-shredding HEG were placed on an autosome, it would still disrupt X sperm but not spread. This strategy could be employed in a noninvasive pest or vector management campaign, although it may be necessary to make HEG expression condition dependent (using a RIDL-like strategy) to enable efficient mass rearing.
Hybrids between the two main strategies—classical homing and biasing the sex ratio—might also be devised. In the major agricultural pest, the medfly (Ceratitis capitata), XX individuals require at least one functional copy of the autosomal gene tra (transformer) to be phenotypic females (Pane et al. 2005). If a HEG was engineered to knock out the tra gene, it would drive the Y chromosome to extinction so that all males would be XX tra− tra− and the sex ratio would be (1 − e)/2, where e is the homing efficiency (detailed modeling not shown).
Throughout this article we have measured the effect of HEGs designed to reduce pest or vector population densities by what we have called “HEG load,” the decrease in population fitness caused by reductions in viability and fertility. The effect of sex-ratio biases on population fitness can be measured using the same metric. It is critically important to stress that how HEG load translates into real population reductions depends on the particular demography and population dynamics of the species involved and also the context in which pest or vector management is sought. For example, if population control is attempted for a seasonally invasive species or one with boom-and-bust population dynamics, then the most important concern may be to reduce maximum population growth rate. In these circumstances HEG load will map directly onto population reduction and be a simple means of comparing different strategies. However, if a population is subject to density-dependent mortality, then it is possible for a substantial HEG load to be present but only modest or even no reductions in pest or vector numbers to be observed. In these circumstances comparing HEG load provides a ranking of the effectiveness of different strategies, though not a quantitative metric. In this article we have deliberately not extended the models developed here beyond frequency into the density domain as we believe analyses based on particular systems and specific biologies are the most effective means of exploring these issues. We are also aware that our models have been purely deterministic and have assumed discrete generations: stochastic models with overlapping generations will be valuable in providing guidance about the numbers of HEG-bearing insects that should be released in an actual management program.
The aim of the work described here is primarily to help in the design of HEGs that might be used in pest and vector management, especially involving insects. The success of a HEG-based strategy will depend on (i) the discovery of suitable recognition sites or the successful engineering of HEGs to recognize new sites, (ii) the ability of HEGs to cut and home in insects, (iii) the design of constructs and implementation strategies to avoid or delay resistance, and (iv) the regulatory acceptance of a pest or vector management strategy that involves genetic manipulation. We have mentioned above one significant recognition site in a malaria vector that can be cut by a naturally occurring HEG, while there has been substantial recent progress on reengineering HEGs to recognize novel sites (Ashworth et al. 2006; Arnould et al. 2007). Preliminary work suggests artificially introduced HEGs can function in mosquito cells (Windbichler et al. 2007). Much remains to be done but we believe HEGs offer exciting prospects for novel approaches to pest and vector management and that consideration of their population genetics will be important in guiding this development.
We are grateful to Michael Ashburner, Andrea Crisanti, Fred Gould, Penny Hancock, Ray Monnat, Samantha O'Loughlin, Steve Russell, Barry Stoddard, and an anonymous referee for helpful discussions. This work was funded by a grant from the Foundation for the National Institutes of Health through the Grand Challenges in Global Health.
Communicating editor: J. B. Walsh
- Received March 7, 2008.
- Accepted April 28, 2008.
- Copyright © 2008 by the Genetics Society of America