The Limits to Parapatric Speciation II: Strengthening a Preexisting Genetic Barrier to Gene Flow in Parapatry

By encompassing the whole continuum between allopatric and sympatric scenarios, parapatric speciation includes many potential scenarios for the evolution of new species. Here, we investigate how a genetic barrier to gene flow that relies on a single postzygotic genetic incompatibility may further evolve under ongoing migration. We consider a continent-island model with three loci involved in pairwise Dobzhansky–Muller incompatibilities (DMIs). Using an analytic approach, we derive the conditions for invasion of a new mutation and its consequences for the strength and stability of the initial genetic barrier. Our results show that the accumulation of genetic incompatibilities in the presence of gene flow is under strong selective constraints. In particular, preexisting incompatibilities do not always facilitate the invasion of further barrier genes. If new mutations do invade, they will often weaken or destroy the barrier rather than strengthen it. We conclude that migration is highly effective at disrupting the so-called “snowball effect”, the accelerated accumulation of DMIs that has been described for allopatric populations en route to reproductive isolation.

on the island. The necessary condition is therefore obtained for $m = 0$ and the sufficient one for $m = m^{b}_{\max}$. If B helps C to invade, then the roles are reversed: the necessary condition is obtained for $m = m^{b}_{\max}$ and the sufficient one for $m = 0$.

between an island adaptation C interacting with either an island adaptation, A, or a continental adaptation B
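We choose the continental haplotype to have a fitness of zero on the island. Then, the two parametrizations are equivalent under the following change of parameters: $\alpha = -\beta$, $\gamma = \gamma + \epsilon_{BC}$ and $\epsilon_{AC} = -\epsilon_{BC}$ (the left-hand sides refer to the parametrization with A, the right-hand sides to the parametrization with B).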
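The equivalence stated above can be checked numerically. The following is only a minimal sketch, under assumptions that are not spelled out in this passage: haplotype fitness is taken to be additive plus a pairwise epistasis term, the continental haplotype is taken to be Bc in the B-C parametrization and ac in the A-C parametrization, and the continental allele B is matched with a and the island allele b with A. All function names are hypothetical.

```python
import math

def bc_fitness(beta, gamma, eps_bc):
    """Haplotype fitnesses at loci B and C, shifted so that the continental
    haplotype (assumed here to be Bc) has fitness zero on the island."""
    raw = {"bc": 0.0, "Bc": beta, "bC": gamma, "BC": beta + gamma + eps_bc}
    shift = raw["Bc"]
    return {hap: w - shift for hap, w in raw.items()}

def ac_fitness(alpha, gamma, eps_ac):
    """Haplotype fitnesses at loci A and C; the continental haplotype
    (assumed here to be ac) already has fitness zero."""
    return {"ac": 0.0, "Ac": alpha, "aC": gamma, "AC": alpha + gamma + eps_ac}

def to_ac_parameters(beta, gamma, eps_bc):
    """Change of parameters quoted in the text."""
    return {"alpha": -beta, "gamma": gamma + eps_bc, "eps_ac": -eps_bc}

# Assumed correspondence: continental allele B <-> a, island allele b <-> A.
CORRESPONDENCE = {"Bc": "ac", "bc": "Ac", "BC": "aC", "bC": "AC"}

def parametrizations_agree(beta, gamma, eps_bc):
    """True if both parametrizations assign the same fitness to matched haplotypes."""
    w_bc = bc_fitness(beta, gamma, eps_bc)
    w_ac = ac_fitness(**to_ac_parameters(beta, gamma, eps_bc))
    return all(
        math.isclose(w_bc[hap], w_ac[CORRESPONDENCE[hap]], abs_tol=1e-12)
        for hap in w_bc
    )
```

Calling `parametrizations_agree(-0.3, 0.5, -0.4)` compares the four matched haplotypes one by one; under the assumptions above the two fitness tables coincide for any choice of $\beta$, $\gamma$ and $\epsilon_{BC}$.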
First, we assume that C interacts with an immigrating B allele. If the barrier is strengthened, [...] too large (saddle-node bifurcation, Fig. C1(b)). Indeed, if C is lost due to its incapacity to withstand migration (transcritical bifurcation), then B is still polymorphic when this happens, i.e. $m^{bC}_{\max} < m^{b}_{\max,0}$. Therefore, strengthening can never happen in this case.

The x-axis corresponds to the epistasis between B and C. The y-axis corresponds to the frequencies of allele B (red) and C (blue) at $m = m^{AC}_{\max}$. The gray vertical line corresponds to the lower limit of $\epsilon_{BC}$ for strengthening of the barrier (the two-locus barrier is stronger than the single-locus barrier between this value and 0).

The x-axis shows the strength of epistasis between A and C. The y-axis shows the selective advantage of the new allele C. The background color indicates the consequence of the invasion of allele C on the genetic barrier at the A locus: gray, the genetic barrier remains unchanged; blue, the genetic barrier is strengthened; orange, the genetic barrier is weakened; red, the polymorphism at locus A is lost. In addition, the solid black line gives the necessary condition for invasion of allele C on the island. Below this bound, invasion is always impossible. The black dashed line gives the sufficient condition for invasion. Above this bound, allele C can always invade, regardless of the migration rate (provided the polymorphism at the A locus still exists).

allele C appears on the continent
The x-axis corresponds to the scaled epistasis between A and C. The y-axis corresponds to the scaled selective advantage of C. The background color indicates the consequence of the invasion of allele C on the genetic barrier at the A locus: red, allele A is lost; orange, the genetic barrier is weakened; both green and light blue indicate that the barrier is strengthened through fixation of allele C on the island; dark blue indicates that the barrier is strengthened even if C remains polymorphic. The distinction between green and light blue concerns the fate of the polymorphism at low migration rates: for green, C can always fix; for light blue, C fixes only if migration is strong enough. While the dark blue area indicates that the genetic barrier is strengthened with both loci remaining polymorphic, one of them will eventually fix its continental allele: above the black line allele C will fix first, below it allele A is lost first.

C 1.1.2 A single-locus barrier is impossible (β > 0).

In this paragraph, we assume that B is advantageous on the island. Therefore, there is no initial single-locus genetic barrier, $m^{b}_{\max,0} = 0$. Following a successful invasion of allele A, the expression of the strength of the genetic barrier is given by equation (C3).
We define $\alpha_{\min}$ as the minimal selective advantage of a new allele A required to form a genetic barrier. From equation (C3), one can define necessary conditions for a new A mutation to form such a barrier.

• If $\beta > \alpha/2$ or $\epsilon_{AB} < -\beta$, then the genetic barrier is lost due to fixing both alleles A and B together (first expression in (C3)). Therefore, we obtain $\alpha_{\min} = \frac{4m\,\epsilon_{AB}}{\epsilon_{AB} + \beta}$ and $\alpha > 4m$ [...] is to $\alpha > 4m + 2\beta$ (blue dashed line on Fig. C4). This last expression is a more precise necessary condition.

In conclusion, $\alpha > 4m$ is always a necessary condition (it is also sufficient for lethal incompatibilities) to observe a genetic barrier against swamping.

In addition, if $\alpha/2 > \beta$ and $\epsilon_{AB} < -\alpha/2$, then $\alpha > 4m + 2\beta$ is also a sufficient condition [...] on $\gamma$ (except the obvious $\gamma > 0$), as the minimum of $m^{Ab}_{\max}$ is 0 in that case.

Figure C5: Genetic barrier to swamping for all loci in loose linkage. The x-axis corresponds to the epistasis between C and its interacting allele (scaled epistasis involving C, $\epsilon_C/(-\beta)$). The y-axis corresponds to $m_{\max}$, the genetic barrier for local stability, i.e. resistance to swamping (scaled migration rate, $m/(-\beta)$). The black line corresponds to the genetic barrier before the appearance of C. To make comparison easier, $m^{b}_{\max,0} = m^{Ab}_{\max,0}$. The blue line corresponds to $m^{b}_{\max}$, and both the orange and magenta lines to $m^{Ab}_{\max}$, with C interacting with allele B (orange) or allele a (magenta).
Fig. C5 illustrates the impact of a third mutation on $m^{b}_{\max}$. We assume that A is also polymorphic (otherwise, the case has already been described in the previous section) and therefore $m^{b}_{\max,0}$ is given by $m^{Ab}_{\max,0}$. To allow for easier (visual) comparison between the different scenarios [...] ($m^{b}_{\max}$) and the "2 → 3 transition" in orange ($m^{Ab}_{\max}$, epistasis with allele B) or magenta ($m^{Ab}_{\max}$, epistasis with allele a).

First, the results obtained for the "2 → 3 transition" are very similar to those for the "1 → 2 transition". Indeed, the genetic barrier is only strengthened for negative epistasis between the new island adaptation and its interacting continental allele. Secondly, $\gamma$ has to be larger than $m^{Ab}_{\max}$, as the C mutation needs to be able to resist gene flow up to $m^{Ab}_{\max,0}$ (otherwise it will be swamped before being able to strengthen anything) plus the hybrid cost.
Figure C6: An island adaptation can contribute to a genetic barrier stronger than its own selective advantage ($m^{b}_{\max} > \alpha$). The x-axis corresponds to the epistasis between C and its interacting allele. The y-axis corresponds to the migration rate. The different lines correspond to $m^{b}_{\max}$, the resistance to swamping at locus B under different scenarios. The initial single-locus and two-locus barriers, $m^{b}_{\max,0} \le m^{Ab}_{\max,0}$, are given in black and gray. The impact of a new allele C, interacting with B, on the single-locus and two-locus genetic barrier is represented by the blue and orange lines, respectively. Lastly, the impact of an allele C interacting with allele a is represented in magenta. Allele B is here advantageous on the island. The thin vertical black line indicates the absence of epistasis. This figure is obtained for $\alpha/\beta = 0.2$, $\gamma/\beta = 1.2$ and $\epsilon_{AB}/\beta = -1.5$.
selective advantage smaller than the migration rate, $\alpha < m^{b}_{\max}$ (magenta line above 0.2). If allele [...] and selection against hybrids. Therefore, any strengthening of the genetic barrier will be through one of these two mechanisms. We will first introduce the strengthening of the genetic barrier based on an increase of selection against migrants. A more detailed analysis for each linkage architecture is available in the next section C 2.

[...] $m^{Ab}_{\max}$, for the different linkage architectures in haploids. The x-axis corresponds to the epistatic interaction between allele C and its interacting allele (either A if C appears on the continent ($\epsilon_{AC}/\alpha$) or B if C appears on the island ($\epsilon_{BC}/\alpha$)). Both positive ($\epsilon_{BC} > 0$, $\epsilon_{AC} > 0$) and negative epistasis ($\epsilon_{BC} < 0$, $\epsilon_{AC} < 0$) are considered. The y-axis represents the maximal migration rate for maintenance, $m^{Ab}_{\max}/\alpha$. If a curve is not visible, it means that $m^{Ab}_{\max}/\alpha = 0$. Each color corresponds to a different linkage architecture; solid lines indicate that C appears on the island, dashed lines on the continent. The black and gray lines serve as a reference for a two-locus genetic barrier between A and B, for tight linkage and loose linkage. Other parameters used are: $\beta/\alpha = 0.75$, $\gamma/\alpha = -0.5$ and $\epsilon_{AB}/\alpha = -2$.

• C appears at a locus in tight linkage with either A and/or B on the island. In that case, the genetic barrier is strengthened if C is advantageous on the island (by reinforcing the selective advantage of the "island alleles", either AC or bC). In addition, it requires that the epistasis between B and C is mainly negative; otherwise the new allele will boost the selective advantage of the "continental allele" B, making it easier for it to fix on the island [...]

• C appears at a locus in tight linkage with either A and/or B on the continent. This requires that C is deleterious on the island. In this case, the genetic barrier is strengthened by making the "continental alleles" (aC or BC) even less fit on the island than they were before (see green, red and blue solid lines in Fig. 6(a), ?? and Fig. C7).

• C appears on the continent and generates positive epistasis with allele A. In this case, allele C can easily fix on the island. By doing so, it will reinforce the selective advantage of allele A on the island ($\alpha \to \alpha + \epsilon_{AC}$) and therefore strengthen the genetic barrier (see [...]

[...] in linkage and C appears on the continent (orange dashed line). This minimum can be explained as follows: if negative epistasis is weak, fixation of C allows A to stay polymorphic. Therefore, the weaker the epistasis, the better for allele A, as it pays a weaker hybrid cost ($m^{Ab}_{\max}$ is increasing).

For larger negative epistasis, it is no longer the case. Only the internal equilibrium is stable.

Therefore, the more negative the epistasis, the longer it will be stable, because C is purged faster (if C increases in frequency, then the frequency of allele A decreases, making swamping easier).

This corresponds to the decreasing part of $m^{Ab}_{\max}$. This can be summarized as follows: if A can prevent introgression of C, the stronger the incompatibility, the better. If not, then the best option for allele A is to not interact with C at all or, even better, to generate positive epistasis.

[...] to fulfill the conditions detailed in equation (C5).
Now, if we assume that both A and B are polymorphic on the island, then the substitution has to fulfill the conditions detailed in equation (C6) to possibly invade.

and A and B are in tight linkage

If all loci are in tight linkage, as soon as there is one locus polymorphic on the island, then the first locus "pays" the price of migration and the new substitution needs only to be advantageous. Of course, it requires that the new mutation C appears in the right background.

For selection against migrants, the cost of migration needs to be paid only once. If C is in [...] indirectly, through the frequency at B, which is reduced by the polymorphism at locus A. For selection against hybrids, the cost of hybrids is therefore shared between the different "island" loci, but the migration cost has to be paid by each locus independently. This is not true when $m$ is close to $\alpha(\epsilon_{AB} - \beta)/(4\epsilon_{AB})$ and $-\epsilon_{AB} > \beta$. In this case, B is deleterious on the island on its own and epistasis creates a hybrid cost that does not help (because selection against migrants is the main force) and reduces the fitness of all genotypes on the island, making it harder for a new mutation to invade.

When A and B are in loose linkage, and C appears in tight linkage with one of the two loci, the results are no longer so clear, since we need to take into account the intra-locus competition between the different alleles. For example, if C appears in tight linkage with A, with C associated with allele A (resp. a), then we have three possible alleles at locus A: ac, Ac and AC (resp. ac, Ac, aC). Invasion of the new allele then also depends on allele Ac (Fig. C8, blue (for allele AC) and cyan (for allele aC) lines). Competition mainly happens if C appears with allele a, since it has to overcome the selective advantage of A on the island before it even has time to recombine into the "optimal" genome.

In tight linkage, given that we assume $r \to 0$, the fate of the new mutation will still be determined by which allele it appears with. To form the next step of the genetic barrier, one needs to wait for either a really good C mutation (that can invade in both backgrounds, and then an unlikely recombination event will form the best haplotype) or for C to appear directly in the "good" background. If C appears at a locus in tight linkage with B, associated with allele b, then the polymorphism at locus A does not matter, and the case where only B is polymorphic has already been explained above.

To complete what has been presented in the previous paragraph, one needs to keep in mind that it can also be difficult to hit the proper background when C is in tight linkage with another locus. This is particularly true when $m \to m^{Ab}_{\max}$, then either $p_A \to 0$ and/or [...]. As a consequence, the new substitution has a low probability to be associated with the "island-adapted" allele (A or [...]

[...] expression. To describe the different equilibria, we will reuse the notation above; for example, $S_{AB}$ stands for an equilibrium with loci A and B polymorphic and C monomorphic.

[...] appearing on the island. The x-axis corresponds to the incompatibility between B and C. The y-axis corresponds to the selective advantage of C on the island. Red indicates that the genetic barrier is strengthened, black that the genetic barrier is unchanged and white that it is weakened or worse. If ($\alpha < \beta$ or $-\epsilon_{AB} < \beta$), the black area indicates that the two-locus barrier is absent and the third locus does not change this fact.
• C appears on the continent

[...] appearing on the continent. The x-axis corresponds to the incompatibility between A and C. The y-axis corresponds to the selective advantage of C on the island. Striped red indicates that the genetic barrier is strengthened, striped black that the genetic barrier is strengthened through fixation of C on the island and white that it is weakened or worse.
[...] A and C is strong enough ($\epsilon_{AC} > \beta - \alpha$) and negative epistasis between A and B is [...] otherwise the hybrid cost is too high and allele C is lost (Fig. C17, panels 1-8). If the effect of C, $\gamma$, is much larger than the strength of the genetic barrier, then strong negative epistasis can help strengthen the genetic barrier by further repressing the migrant haplotype (aBc) through selection against hybrids (Fig. C17, panels 10-12).

- If the two-locus barrier does not exist, the presence of a third locus allows the formation of the genetic barrier as long as C is advantageous on the island and the negative epistasis between B and C is strong enough to prevent the fixation of allele B (Fig. C17, panels 9, 13-16).

• C appears on the continent

- If the two-locus barrier exists, the genetic barrier is strengthened as long as there is positive epistasis between A and C that increases the marginal fitness of A on the island (Fig. C18, panels 1-8, 10-12).

- If the two-locus barrier does not exist, then the genetic barrier is created under the same conditions as for the ABC architecture, i.e. through fixation of C ($-\gamma < \epsilon_{AC}$) and if the epistasis between A and C is strong enough ($\epsilon_{AC} > \beta - \alpha$) and the negative epistasis between A and B is strong enough ($-\epsilon_{AB} > \beta$) (Fig. C18, panels 9, 13-16).

Figure C11: Comparison between the genetic barrier for AB-C and the old one AB, C appearing on the island. The x-axis corresponds to the incompatibility between B and C. The y-axis corresponds to the selective advantage of C on the island. Purple indicates that the genetic barrier is strengthened, black that the genetic barrier is unchanged and white that it is weakened or destroyed.

Figure C12: Comparison between the genetic barrier for AB-C and the old one AB, C appearing on the continent. The x-axis corresponds to the incompatibility between A and C. The y-axis corresponds to the selective advantage of C on the island. Striped purple indicates that the genetic barrier is strengthened, striped black that the genetic barrier is strengthened through fixation of C on the island and white that it is weakened or destroyed.

C appearing on the island
The x-axis corresponds to the incompatibility between B and C. The y-axis corresponds to selective advantage of C on the island. Green indicates that the genetic barrier is strengthened, gray that the genetic barrier is unchanged and white that it is weakened or destroyed.

C appearing on the continent
The x-axis corresponds to the incompatibility between A and C. The y-axis corresponds to the selective advantage of C on the island. Striped green indicates a strengthening of the genetic barrier, striped gray that the genetic barrier is strengthened through the fixation of C on the island and white that it is weakened or worse.
- If the two-locus barrier does not exist, then it is impossible to form a barrier with a third locus.

C appearing on the island
The x-axis corresponds to the incompatibility between B and C. The y-axis corresponds to selective advantage of C on the island. Blue indicates that the genetic barrier is strengthened, gray that the genetic barrier is unchanged and white that it is weakened or destroyed.

C appearing on the continent
The x-axis corresponds to the incompatibility between A and C. The y-axis corresponds to the selective advantage of C on the island. Striped blue indicates a strengthening of the genetic barrier, striped gray that the genetic barrier is strengthened through fixation of C on the island and white that it is weakened or destroyed.
[...] part of the selective advantage of A to get rid of most of the incompatibility. Since B is deleterious, the epistasis hinders more than helps the genetic barrier, and getting rid of it can strengthen the genetic barrier (Fig. C15, panels 1-8, 10-12, 16). This is the same mechanism as in the two-locus, three-allele model.

- If the two-locus barrier does not exist, then the barrier can be strengthened if C is advantageous and has negative epistasis with B. Such a mutation strengthens both selection against migrants and selection against hybrids, the latter being necessary since B is advantageous on the island (Fig. C15, panels 9, 13-15).

[...] or by making the continental allele less fit on the island (Fig. C16, panels 1-8, 10-12, [...]).

- If the two-locus barrier does not exist, then it is impossible to form a new barrier with a third locus.
Figure legend entries: A is lost and C fixes (i.e. Ac is replaced by aC); ac invades the equilibrium; ac invades and $S_{ABC}$ is unstable.

If C appears on the island and interacts with allele B, the barrier is strengthened if C fulfills two conditions. First, its selective advantage has to be strong enough so that the polymorphism at locus C is not the first one lost (the barrier is unchanged otherwise), and second, allele C must repress allele B, i.e. epistasis between B and C is negative. This case has been discussed in the main text.

The values calculated here for $m^{Ab}_{\max}$ (both for C on the island or on the continent) have been checked against the maximal migration generated by the "best" linkage architecture. [...] as $\epsilon_{AB} < 0$, $\epsilon_{BC} < 0$ and $\beta < \gamma$, having all loci in loose linkage will always generate a weaker barrier, $m^{Ab}_{\max}$, than a barrier generated by the same loci in tight linkage.

Next, let us consider $V(p) = 1 - p_B$, then: [...] If the previous equation is true, then $p_B \to 1$.

In addition, we focus on a special case that is of biological interest: the duplication of the A locus. We assume that the copy of the A locus has been transposed somewhere else (in loose linkage). The new copy is called locus C.

The strength of the genetic barrier remains unchanged when selection against migrants is the main component of the genetic barrier ($\beta < 0$ and $\epsilon_{AB}$ not too strong). This situation corresponds to an already strong genetic barrier. Indeed, the genetic barrier is then mainly due to selection against the incoming alleles (a, B, c) and acts (almost) independently at the different loci. Due to the selection pressure, the frequency of B is kept relatively low. Close to $m^{Ab}_{\max}$, the frequencies of both alleles A and C are low. As a consequence, there is almost no epistasis expressed, and A and C have no effect on each other, since they can no longer really affect the frequency of the B allele. For this scenario, $m^{Ab}_{\max}$ corresponds to the simultaneous loss of alleles A and C.

The other possible outcome is a strengthening of the genetic barrier. This strengthening is the most efficient when B is also advantageous on the island and the incompatibility is just strong enough to allow the DMI to persist in the first place ($\epsilon_{AB} \approx -\beta$, Fig. C19, yellow ridge).

In this situation, the frequency of allele B increases quickly with $m$, and the new duplicated [...]

[Piecewise expression; the values of the first two cases are not recoverable:]
[...] if $\beta + \epsilon_{AB} < 0$ and $\beta^2 < 2\epsilon_{AB}^2$ and $\alpha^2 + 4\alpha\epsilon_{AB} + 2\epsilon_{AB}^2 < 0$ or $\alpha + \epsilon_{AB} \le 0$
[...] if $\beta \ge 0$ or $\beta^2 < 2\epsilon_{AB}^2$ or $\alpha + \beta + 2\epsilon_{AB} > 0$ and $\beta + \epsilon_{AB} < 0$ and $\alpha^2 + 4\alpha\epsilon_{AB} + 2\epsilon_{AB}^2 \ge 0$ and $\alpha + \epsilon_{AB} > 0$
$\frac{\alpha\beta}{\beta + 2\epsilon_{AB}}$ if $\beta^2 \ge 2\epsilon_{AB}^2$ and $\beta + \epsilon_{AB} < 0$ and ($\alpha + \beta + 2\epsilon_{AB} < 0$ or $\alpha^2 + 4\alpha\epsilon_{AB} + 2\epsilon_{AB}^2 < 0$ or $\alpha + \epsilon_{AB} \le 0$)

Figure C20: Linkage architecture forming the strongest genetic barrier, following the invasion of C. For each panel, the x-axis corresponds to the epistasis between C and its interacting allele, B for the first row and b for the second row. The y-axis corresponds to the selective advantage of allele C on the island. The different colors indicate the linkage architecture and the location where a C mutation should appear to maximize $m^{b}_{\max}$. A white area means that C never strengthens the barrier, red that C should appear in tight linkage and orange in loose linkage. Fully filled areas indicate that C should appear on the island and striped areas on the continent.

The genetic barrier, described in the previous cases, is therefore always smaller than or equal to $\alpha - \beta + \gamma$ and therefore, as long as B is deleterious, the optimal architecture is given by having all loci in tight linkage.

[...] In this section, we assume that C appears on the continent and generates negative epistasis with allele A. In addition, we assume here that C is maladaptive on the island ($\gamma < 0$).

If $\beta < 0$, then we were able to show, using Mathematica, that $m^{Abc}_{\max} \le \alpha - \beta - \gamma$, for all [...] Therefore, having C appearing on the island or on the continent is not symmetric in this regard.

Haplotype    Fitness
abc          $0$
Abc          $\alpha$
aBc          $\beta$
abC          $\gamma$
ABc          $\alpha + \beta + \epsilon_{AB}$
AbC          $\alpha + \gamma$
aBC          $\beta + \gamma + \epsilon_{BC}$
ABC          $\alpha + \beta + \gamma + \epsilon_{AB}$

[...] concerns the effect of allele C, and more precisely its epistasis. C interacts with allele B but only in the absence of allele A. Therefore, C needs to be in loose linkage with both loci to be able to express its epistasis. This corresponds to the existence of a three-locus interaction term that cancels the effect of the interaction of B and C in the presence of allele A ($\epsilon_{ABC} = -\epsilon_{BC}$).
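The fitness scheme above can be written as additive effects plus interaction terms. The sketch below is illustrative only: the function name and the 0/1 encoding of alleles are assumptions made here, and the only interaction terms included are those named in the text ($\epsilon_{AB}$, $\epsilon_{BC}$ and the three-locus term $\epsilon_{ABC} = -\epsilon_{BC}$).

```python
from itertools import product

def haplotype_fitness(alpha, beta, gamma, eps_ab, eps_bc):
    """Return a dict mapping each of the eight haplotypes to its fitness.

    Alleles are encoded as 0 (lower-case allele) or 1 (upper-case allele).
    The three-locus term is set to -eps_bc, as stated in the text, so that
    the B-C interaction is cancelled whenever allele A is present.
    """
    eps_abc = -eps_bc
    table = {}
    for a, b, c in product((0, 1), repeat=3):
        fitness = (
            a * alpha + b * beta + c * gamma      # direct effects
            + a * b * eps_ab                      # A-B epistasis
            + b * c * eps_bc                      # B-C epistasis
            + a * b * c * eps_abc                 # three-locus correction
        )
        name = ("A" if a else "a") + ("B" if b else "b") + ("C" if c else "c")
        table[name] = fitness
    return table
```

For any parameter values, the entry for ABC reduces to $\alpha + \beta + \gamma + \epsilon_{AB}$ and the entry for aBC to $\beta + \gamma + \epsilon_{BC}$, matching the last two columns of the scheme.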

Still, despite designing a fitness scheme that should favor having all loci in loose linkage, this linkage architecture forms the strongest barrier only over a small range of parameters. Indeed, to observe such behavior, first we have to avoid having AbC as the optimal haplotype on the island, and therefore $\gamma$ (Fig. C23(c)) has to be not too large and $\beta$ quite large (Fig. C23(a)).

Then epistasis between A and B needs to be not too strong (Fig. C23(b)); otherwise allele A efficiently represses allele B, and therefore increasing the marginal fitness of allele A by attaching allele C to it is the best option. Lastly, the epistasis between B and C has to be strong enough (Fig. C23(d)), such that it is more efficient to have C in loose linkage than in tight linkage with allele b, where it reduces the selective advantage of allele B on the island. However, too much epistasis is not optimal either, as $m^{Ab}_{\max}$ converges to [...] (here 0.1375) as the incompatibility becomes lethal.

We represent $m^{Ab}_{\max}$ as a function of all parameters of the system, varying only one at a time and using the following values for the non-varying ones: $\beta/\alpha = 1.5$, $\gamma/\alpha = 0.8$, $\epsilon_{AB}/\alpha = -2$, $\epsilon_{BC}/\alpha = -2.5$. Each color corresponds to a different linkage architecture: black to AB, gray to A-B, red to ABC, purple to AB-C, blue to AC-B, green to A-BC and orange to A-B-C.

We represent $m^{Ab}_{\max}$ as a function of recombination. Each line follows a precise scheme: the color of the symbols corresponds to the linkage architecture for $r = 0$; the dashed line, a guide for the eye, is colored to correspond to the linkage architecture at $r \to \infty$. For example, an orange dashed line with green symbols indicates that we start in the configuration A-BC and end with all loci in loose linkage (A-B-C). Horizontal solid lines correspond to the limiting cases studied previously, with the color scheme indicated in the legend. For the numerical evaluations, we use $(r_{ac}, r_{bc}, r_{ab}) = 500\alpha$ to represent loose linkage.
Here, we observe that all the different architectures converge to the limiting cases, both for $r \to 0$ and $r \to \infty$. In most cases, similar to Bank et al. [2012], $m^{Ab}_{\max}$ is a monotonic function of recombination.

In addition, departure from $r = 0$ is always relatively fast (as long as the genetic barrier exists in tight linkage), making this configuration rare in genomes. The tight linkage behavior is