Effect of DNA Sequence Divergence on Homologous Recombination as Analyzed by a Random-Walk Model
Youhei Fujitani^a and Ichizo Kobayashi^b
a Department of Applied Physics and Physico-Informatics, Faculty of Science and Technology, Keio University, Yokohama 223-8522, Japan
b Department of Molecular Biology, Institute of Medical Science, University of Tokyo, Tokyo 108-8639, Japan
Corresponding author: Youhei Fujitani, Department of Applied Physics and Physico-Informatics, Faculty of Science and Technology, Keio University, Yokohama 223-8522, Japan. E-mail: youhei{at}appi.keio.ac.jp
Communicating editor: N. TAKAHATA
| ABSTRACT |
|---|
A point connecting a pair of homologous regions of DNA duplexes moves along the homology in a reaction intermediate of the homologous recombination. Formulating this movement as a random walk, we were previously successful at explaining the dependence of the recombination frequency on the homology length. Recently, the dependence of the recombination frequency on the DNA sequence divergence in the homologous region was investigated experimentally; if the methyl-directed mismatch repair (MMR) system is active, the logarithm of the recombination frequency decreases very rapidly with an increase of the divergence in a low-divergence regime. Beyond this regime, the logarithm decreases slowly and linearly with the divergence. This "very rapid drop-off" is not observed when the MMR system is defective. In this article, we show that our random-walk model can explain these data in a straightforward way. When a connecting point encounters a diverged base pair, it is assumed to be destroyed with a probability that depends on the level of MMR activity.
MANY experimental studies have analyzed the relationship between the frequency of homologous recombination and the homology length that ranges from some hundreds of base pairs up to ~20 kbp.
![]() |
(1) |
where c is the constant of proportionality. The linear function thus obtained, however, was later found to disagree with the nonlinear dependence of the frequency on the homology length observed in a mammalian gene targeting system.
In contrast with the MEPS theory, our "random-walk model" was shown to explain the data from both systems.
The recombination frequency has been found to decrease as sequence differences are introduced into the homologous region; its logarithm appears to be reduced linearly with an increase of the divergence (the ratio of the number of diverged base pairs to the number of all base pairs in a region of homology between two DNA duplexes) for very long homologous regions (10^6 to 10^7 bp) in bacterial systems.
As described in the next section, these effects of the MMR system have been explained in terms of the MEPS theory, which has already failed to explain the nonlinear dependence of the recombination frequency on the homology length. Here we present an alternative explanation in terms of the random-walk model after a brief review of the original version of the random-walk model. Symbols we use frequently are listed in Table 1.
|
| PREVIOUS MODELS |
|---|
Assuming that a base pair at a particular position in a homologous region will be diverged with a probability equal to the divergence (D, 0 ≤ D ≤ 1), one can calculate the average recombination frequency to compare it with experimental data. We express this average over positions of diverged base pairs by putting the recombination frequency, denoted by Π, between the angle brackets, ⟨ and ⟩, in the following equations. The recombination frequency at D = 0 need not be averaged.
In the MEPS theory, initial enzymes are supposed to work only when they cling to a MEPS devoid of diverged base pairs; the recombination frequency is proportional to the number of ways of picking up a MEPS devoid of diverged base pairs from the homologous region (N bp in total; ![]()
![]()
![]() |
(2) |
where the superscript (M) indicates a result in the framework of the MEPS theory, c is the constant used in Equation 1, and Π^(M)(D = 0, N) is the recombination frequency at D = 0 given by Equation 1. When D << 1, because e^(-D) ≈ 1 - D, we have
![]() |
(3) |
The reaction, thus initiated, may be aborted by the MMR system. The MMR system would attack a mismatch, which is produced at a diverged base pair as the heteroduplex elongates.
![]()
![]() |
(4) |
where the modified MEPS length, M
eps, depends on the level of MMR activity. Equation 4 implies that the logarithm is a linear function of D with the slope dependent on the level of MMR activity. As shown later, ![]()
![]()
e^(-βD), the probability with which the MMR system is triggered is given by 1 - R_0 e^(-βD). They introduced a factor f denoting the probability with which the reaction is aborted after the MMR system is triggered and expressed the averaged recombination frequency as a function of D, N, and f:
![]() |
(5) |
They fitted Equation 5 to their experimental data (N = 350) for the wild-type strains showing the very rapid drop-off to obtain f = 0.97. When f = 0, Equation 5 is equivalent to Equation 3, which can explain the data for the Mmr- strains showing no very rapid drop-off. Equation 5 gives different values to the recombination frequency between identical substrates in the wild-type strains, ⟨Π^(M)(D = 0, N = 350, f = 0.97)⟩, and to that in the Mmr- strains, ⟨Π^(M)(D = 0, N = 350, f = 0)⟩, which agrees with their data. ![]()
We feel that ![]()
| THE RANDOM-WALK MODEL |
|---|
Here we review the original version of the random-walk model (![]()
per site and neglect cases where more than one connecting point is produced in a relatively short identical region (n times this per-site probability << 1). A "randomly walking" connecting point is assumed to be processed somewhere within the region. Here, "being processed" includes "being resolved to a recombinant" and "being destroyed" (i.e., "disappearing without yielding a recombinant"). We write k (0 < k ≤ 1) for the conditional probability of resolution given that a connecting point is processed. A connecting point is assumed to be destroyed whenever it encounters either end of the homology. This is the condition of a totally absorbing boundary (![]()
![]()
![]() |
(6) |
where p_j(t) denotes the probability distribution of a connecting point at a (real) site j (1 ≤ j ≤ n) at time t, and p_*(t) is this probability distribution at an imaginary site * (Figure 3A). This site represents the state at which a homologous recombinant has been formed. The parameter g is the transition probability per unit time (or transition rate) of the random walk; h is the ratio of the probability with which a random walker (a connecting point) is processed per site per unit time to g. The assumption adopted here that g, h, and k are site-independent is appropriate when the homologous region is devoid of sequence divergence. We assume that the recombination frequency is measured after a long enough time in the experiments.
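The displayed form of Equation 6 does not survive in this copy. As a hedged sketch only, a master equation consistent with the definitions of g, h, and k given above could be written as follows. Writing the totally absorbing ends as the auxiliary condition p_0 = p_{n+1} = 0 is an assumption about presentation, not necessarily the authors' typesetting.

```latex
\begin{aligned}
\frac{d p_j(t)}{dt} &= g\,\bigl[\,p_{j+1}(t) + p_{j-1}(t) - (2 + h)\,p_j(t)\,\bigr],
  \qquad 1 \le j \le n, \\
p_0(t) &= p_{n+1}(t) = 0, \\
\frac{d p_*(t)}{dt} &= g\,h\,k \sum_{j=1}^{n} p_j(t).
\end{aligned}
```

In this reading, a walker leaves a site by hopping at total rate 2g or by being processed at rate gh, and a fraction k of the processed walkers reaches the imaginary site *.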
|
|
Suppose first that a connecting point is produced at a real site m, and the initial condition is given by p_j(0) = 0 for j ≠ m and p_m(0) = 1. The solution p_j(t) of Equation 6 depends on m and the number of the sites n; we use a superscript (m;n) to express this dependence. As derived in Appendix A, the recombination frequency after a long enough time is given by
![]() |
(7) |
![]() |
(8) |
where
![]() |
(9) |
Here, sinh and cosh, as well as tanh and coth appearing below, are the hyperbolic functions. Because a connecting point is actually produced with probability
per site, the recombination frequency is given by
![]() |
(10) |
When h << 1, we have
![]() |
(11) |
![]() |
(12) |
as described in Appendix A and in ![]()
. The expression in the lower line of Equation 12 apparently coincides with the linear function given by Equation 1. One can see that the parameter h, named "relative probability of intermediate processing," is a key parameter here, instead of the MEPS length in the MEPS theory. As shown by ![]()
![]()
Expressed in terms of physics [see, e.g., chapters VI and X of ![]()
![]()
![]()
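The way h separates the third-power range from the linear range can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes the per-step probabilities implied by the definitions of g and h: a walker at a site is processed with probability h/(2 + h), otherwise hops to either neighbor with equal probability, and is destroyed when it steps off either end. All function names are hypothetical. The value returned by `relative_frequency` is the recombination frequency divided by k and by the per-site production probability.

```python
import math


def processed_probabilities(n, h):
    """Return q[0..n-1], where q[j] is the probability that a walker started
    at site j + 1 is processed before stepping off either end.

    Solves (2 + h) q_j - q_{j-1} - q_{j+1} = h with q_0 = q_{n+1} = 0
    by the Thomas algorithm for tridiagonal systems.
    """
    diag = [2.0 + h] * n
    rhs = [float(h)] * n
    for j in range(1, n):               # forward elimination
        factor = 1.0 / diag[j - 1]
        diag[j] -= factor               # off-diagonal entries are all -1
        rhs[j] += factor * rhs[j - 1]
    q = [0.0] * n
    q[n - 1] = rhs[n - 1] / diag[n - 1]
    for j in range(n - 2, -1, -1):      # back substitution
        q[j] = (rhs[j] + q[j + 1]) / diag[j]
    return q


def relative_frequency(n, h):
    """Sum of the processing probabilities over all n starting sites."""
    return sum(processed_probabilities(n, h))


def cubic_regime_estimate(n, h):
    """Leading-order value for h * n**2 << 1 (third-power dependence)."""
    return h * n * (n + 1) * (n + 2) / 12.0


def linear_regime_estimate(n, h):
    """Large-n value n - 2 r / (1 - r), where r is the smaller root of
    r**2 - (2 + h) r + 1 = 0; the correction is of order 1 / sqrt(h)."""
    r = (2.0 + h - math.sqrt(h * (4.0 + h))) / 2.0
    return n - 2.0 * r / (1.0 - r)
```

Under these assumptions, the sum grows as the cube of n while h n^2 is small and becomes linear in n once n greatly exceeds a length of order 1/sqrt(h), which is the shift in dependence that the text attributes to h.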
| THEORY FOR THE VERY RAPID DROP-OFF |
|---|
Here we explain why the very rapid drop-off was observed in ![]()
χ^2 = Σ_i (y - y_i)^2 as a measure of the goodness of fit, where y_i is the data value (the natural logarithm of the recombination frequency) for the ith data point and y is the value of a theoretical curve at the point. The results are summarized in Table 2.
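The goodness-of-fit measure described here is an unweighted sum of squared differences taken on the natural-logarithm scale. A minimal sketch follows; the function name `chi_square` and the argument layout are assumptions made for illustration.

```python
import math


def chi_square(observed_frequencies, model_log_values):
    """Sum over data points of (y - y_i)**2, where y_i = ln(observed frequency)
    and y is the value of the theoretical curve at the same divergence."""
    return sum(
        (y - math.log(freq)) ** 2
        for freq, y in zip(observed_frequencies, model_log_values)
    )
```

Because the residuals are not divided by measurement errors, values of this quantity are comparable only between fits to the same data set.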
|
|
|
As in the previous models (![]()
![]()
(1/2)^3 = 1/8 of the frequency for zero divergence. Because 1/27 > (1/8)^2, the frequency drop from no diverged base pairs to one diverged base pair is more "rapid" than that from one diverged base pair to two diverged base pairs. It is probable that the random-walk model thus explains the very rapid drop-off. Actually, the recombination frequencies obtained by ![]()
Let us examine this scenario. Suppose that one connecting point is produced initially at the lth site (say, from the left end) of a homologous region with N sites. This region may be divided into some identical subregions by diverged sites, each of which plays the role of a totally absorbing boundary. Suppose that this lth site is an identical site (i.e., a site of an identical base pair), and we define F_l(m, n) (1 ≤ l ≤ N, 1 ≤ m ≤ n, 1 ≤ n ≤ N) as the probability with which the connecting point is produced at the mth site of an identical subregion with n sites. The identical subregion lies between diverged sites (Figure 4A), lies between a diverged site and either end of the homologous region (Figure 4B), or coincides with the entire homologous region. In the first case, we have F_l(m, n) = D^2(1 - D)^n because n bp are identical with probability (1 - D)^n and 2 bp at both ends are diverged with probability D^2. In the second case, we have F_l(m, n) = D(1 - D)^n because 1 bp at an end need not be diverged. Which case we have is determined by the relationship among l, m, n, and N as shown in Appendix B.
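The case analysis for F_l(m, n) can be checked by brute force on a short region. The sketch below is illustrative only and its function names are hypothetical. It assumes, as in the text, that each site is diverged independently with probability D, and it compares a compact rule (one factor D for each flanking site that exists inside the region) against explicit enumeration of all divergence patterns.

```python
from itertools import product


def f_closed_form(l, m, n, N, D):
    """F_l(m, n): site l is the m-th site of an identical subregion of n sites."""
    left = l - m             # site just left of the subregion (0: none exists)
    right = l - m + n + 1    # site just right of it (N + 1: none exists)
    if not (1 <= m <= n <= N) or left < 0 or right > N + 1:
        return 0.0
    flanking = (left >= 1) + (right <= N)  # flanking sites that must be diverged
    return D ** flanking * (1 - D) ** n


def f_brute_force(l, m, n, N, D):
    """Same probability, summed over all 2**N divergence patterns."""
    total = 0.0
    for pattern in product([False, True], repeat=N):  # True marks a diverged site
        if pattern[l - 1]:
            continue
        start = l
        while start > 1 and not pattern[start - 2]:   # extend the run leftward
            start -= 1
        end = l
        while end < N and not pattern[end]:           # extend the run rightward
            end += 1
        if l - start + 1 == m and end - start + 1 == n:
            diverged = sum(pattern)
            total += D ** diverged * (1 - D) ** (N - diverged)
    return total


def normalization(l, N, D):
    """D plus the sum of F_l(m, n) over all m and n; should equal 1."""
    return D + sum(
        f_closed_form(l, m, n, N, D)
        for n in range(1, N + 1)
        for m in range(1, n + 1)
    )
```

Enumeration is feasible only for small N, but it exercises every case (both flanks diverged, one flank at an end of the region, and the subregion filling the whole region) as well as the normalization.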
Noting that Equation 8 gives the probability of resolution of the connecting point considered above, we can express the averaged recombination frequency in the homologous region by
![]() |
(13) |
where we added the superscript + to indicate that this expression is valid when the MMR system is active enough. Note that the quantity defined by Equation 9 depends only on h. By setting D = 0 in Equation 13, we recover Equation A12 with n replaced by N.
The ratio of ⟨Π^+(D, N)⟩ to the product of k and the per-site production probability is independent of the value of that product. Thus, when we plot ln⟨Π^+(D, N)⟩ against D, we can only shift the curve upward or downward by increasing or decreasing the value of the product, respectively, with the curve shape remaining the same. The parameter h also influences the overall position of the curve because the intercept, i.e., the logarithm at D = 0, is given by the logarithm of Equation 12 with n replaced by N. The curve shape depends not on the product but on h.
We have two fitting parameters in Equation 13: h and the product of k and the per-site production probability. Curve fitting to ![]()
= 3.4 × 10^-9 (χ^2 = 7.3). These values are consistent with ![]()
> 10^-10). The fitted curve can follow the very rapid drop-off shown by the data (Figure 5). We replot ![]()
Π^(M)(D = 0, N, f = 0), M_eps, f, R_0, and β, of which the last four parameters are responsible for the curve shape. Their fit (χ^2 = 1.8) is better than ours.
The homology length (350 bp) is found to be comparable to the characteristic length set by h, 1.8 × 10^2, around which the shift in the dependence should occur as shown by Equation 12. Even taking this into account, the calculated ratio of the frequency for one diverged base pair to that for zero divergence, ⟨Π^+(D = 1/350, N = 350)⟩/Π^+(D = 0, N = 350) = 0.71, appears to be large as compared with the one-eighth mentioned in the second paragraph of this section. The reason is as follows. The one-eighth corresponds with the case where the diverged base pair is at the center of the homologous region in the third-power dependence range. The average ⟨Π^+(D = 1/350, N = 350)⟩ is influenced not only by this case but also by the case where a diverged base pair is introduced near either end of the homologous region to give almost the same recombination frequency as Π^+(D = 0, N = 350).
Thus, the random-walk model can offer a very straightforward explanation for the presence of the very rapid drop-off in the wild-type strains (Mmr+). The same mechanism can explain the map expansion phenomenon, R_ac > R_ab + R_bc, where each term denotes the recombination frequency between the two markers indicated by the subscript letters, and the loci of the markers a, b, and c are arranged in this order.
| THEORY FOR MMR-DEFECTIVE STRAINS |
|---|
Assuming that a connecting point is always destroyed at a diverged site unlike at an identical site, in the preceding section we were successful at explaining the very rapid drop-off. What we assumed is a kind of site dependence in the transition rates. Thus, we expect to explain the absence of the very rapid drop-off in ![]()
msh2
msh3; solid circles in Figure 5) by similarly assuming site dependence in the transition rates. We assume that, when the MMR system is defective, a connecting point is a little more likely to be processed and destroyed at a diverged site than at an identical site; the resolution step could be affected by mismatches themselves (![]()
![]()
![]()
As illustrated in Figure 6A, this model supposes that the potential felt by a random walker has the same "height" at the "hilltops." We assume that there are two kinds of heights of the valley bottoms: one for an identical site and the other for a diverged site (Figure 6A). The latter should be higher than the former because a connecting point is assumed to be a little more unstable at a diverged site. A random walker can reach a neighboring site after "climbing up" a lower "hill," i.e., with larger transition rate, when it starts from a diverged site than when it starts from an identical site [see, e.g., chapter X of ![]()
![]() |
(14) |
where gj, hj, and kj take the values g, h, and k, respectively, at an identical site, and take g', h', and k', respectively, at a diverged site (Figure 6B). Without diverged sites, Equation 14 is reduced to Equation 6 with n replaced by N.
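The displayed form of Equation 14 does not survive in this copy. As a hedged sketch, a random-trap master equation consistent with the site-dependent rates g_j, h_j, and k_j defined above, and with the per-step probabilities quoted in the simulation description, could be written as follows. The auxiliary boundary condition p_0 = p_{N+1} = 0 is an assumption about presentation.

```latex
\begin{aligned}
\frac{d p_j(t)}{dt} &= g_{j+1}\,p_{j+1}(t) + g_{j-1}\,p_{j-1}(t) - (2 + h_j)\,g_j\,p_j(t),
  \qquad 1 \le j \le N, \\
p_0(t) &= p_{N+1}(t) = 0, \\
\frac{d p_*(t)}{dt} &= \sum_{j=1}^{N} g_j\,h_j\,k_j\,p_j(t).
\end{aligned}
```

In this form the rate of every move depends only on the site the walker starts from, which is the defining feature of a random-trap model. Setting g_j = g, h_j = h, and k_j = k at every site recovers the uniform case.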
|
As in Equation 7 and Equation 10, the recombination frequency is given by
![]() |
(15) |
where the superscript (RT) indicates the recombination frequency for a set of transition rates of the random-trap type, and p_*^(m;N)(∞) is given by
![]() |
(16) |
where p_j^(m;N)(t) is the solution of Equation 14 under the initial condition p_j(0) = 0 for j ≠ m and p_m(0) = 1. We have, from Equation 15 and Equation 16,
![]() |
(17) |
As shown later, Π^(RT)(N) is independent of g and g'. Because p_j^(m;N)(t) is a solution of the first three equations of Equation 14 and is independent of k and of the per-site production probability, Π^(RT)(N) is invariant for any set of values of this probability, k, and k' as long as the product of k and this probability and the ratio k'/k remain fixed. This is also the case with its average ⟨Π^(RT)(D, N)⟩; we can therefore regard h, this product, h', and k'/k as the parameters of ⟨Π^(RT)(D, N)⟩. The shape of the curve of ln⟨Π^(RT)(D, N)⟩ depends not on the product but on h, h', and k'/k, just as the shape of the curve of ln⟨Π^+(D, N)⟩ depended not on the product but on h.
We simulate the dynamics described by Equation 14 with a computer (VT-Alpha 433S8/3N, 433 MHz cpu; Visual Technology, Tokyo). Suppose that a random walker is now at an identical site. According to Equation 14, the probability of its jump to either of the neighboring sites in a short time Δt is given by 2gΔt, and the probability of its being processed in this short time is given by ghΔt. Thus, on average, some action (i.e., jump to a next site or being processed) of the random walker at an identical site occurs in a short time Δt = 1/{g(2 + h)}. Similarly, a random walker at a diverged site takes some action in a short time Δt' = 1/{g'(2 + h')} on average. One time step (Monte Carlo step) in our simulation is made to correspond with this time interval Δt or Δt' when the random walker is at an identical site or a diverged site, respectively. Thus, some action occurs at each time step in our simulation. A random walker jumps to one neighboring site with probability g/{g(2 + h)}, jumps to the other with probability g/{g(2 + h)}, and is processed with probability gh/{g(2 + h)} at each time step if it is at an identical site. If it is at a diverged site, the probabilities are g'/{g'(2 + h')}, g'/{g'(2 + h')}, and g'h'/{g'(2 + h')}, respectively. This rule is modified at either end of the homology. Because these probabilities are independent of g and g', we need not specify values of g and g' to calculate the recombination frequency. This point is shown analytically in Appendix C.
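The step rule described above can be sketched in a few lines. This is an illustrative reimplementation, not the authors' program. The function names are hypothetical, and the end rule is an assumption: a walker that steps off either end of the homology is treated as destroyed, following the totally absorbing boundary of the model. Start sites are drawn uniformly, so the returned fraction corresponds to the recombination frequency divided by N times the per-site production probability.

```python
import random


def walker_is_resolved(diverged, start, h, h_div, k, k_div, rng):
    """Follow one connecting point; return True if it is resolved to a recombinant.

    diverged[j] is True where site j (0-based) is a diverged base pair.
    """
    n_sites = len(diverged)
    j = start
    while 0 <= j < n_sites:
        hj, kj = (h_div, k_div) if diverged[j] else (h, k)
        u = rng.random() * (2.0 + hj)
        if u < hj:                        # processed at this site
            return rng.random() < kj      # resolved with probability k_j
        j += 1 if u < hj + 1.0 else -1    # otherwise hop right or left
    return False                          # stepped off an end: destroyed


def estimate_frequency(diverged, h, h_div, k, k_div, trials, seed=0):
    """Fraction of walkers resolved, with uniformly random start sites."""
    rng = random.Random(seed)
    n_sites = len(diverged)
    resolved = sum(
        walker_is_resolved(diverged, rng.randrange(n_sites), h, h_div, k, k_div, rng)
        for _ in range(trials)
    )
    return resolved / trials
```

Only the ratios h, h', k, and k' enter the step probabilities, which mirrors the remark that g and g' need not be specified. Taking h' very large with k' = 0 makes every diverged site act as an absorbing boundary.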
We have introduced a set of transition rates of the random-trap type to analyze the data for the Mmr- strains, but we should also be able to analyze data for the Mmr+ strains with Equations 14 to 17. We first analyze the data of ![]()
and k'/k → 0. This expectation is verified in Figure 5; the cross symbols, which are obtained numerically from Equation 14 with large h'/h and k' = 0, agree with the bottom solid curve obtained in the preceding section. This point is also discussed in the next section.
Let us now analyze ![]()
![]() |
(18) |
where the site-averaged value of h is defined by (1 - D)h + Dh', and the quantity defined by Equation 9 is evaluated with h replaced by this averaged value.
![]()
values as the wild-type strains. Curve fitting to the data for the Mmr- strains results in the fitted values h = 2.2 × 10^-3, 8.4 × 10^-9 for the product of k and the per-site production probability, and h' = 8.1 × 10^-2 with χ^2 = 1.2 × 10 (Figure 5). The fitted k'/k value varies from 10^-7 to 10^-4 depending on the initial condition of curve fitting; the curve shape is insensitive to k'/k so long as it is not too large. This is expected because k'/k appears only in the first term in the first braces of Equation 18, and this term is negligible as compared with the second term when k'/k is not too large. We also obtained simulation results with the same parameter values (Figure 5); the agreement between them and the fitted curve shows the validity of our decoupling approximation.
![]()
χ^2 = 7.1). Their fit is better than ours, judging from the χ^2 value over the divergence range examined (0 ≤ D ≤ 0.26). Our curve is convex (i.e., its second derivative is positive) although the data appear to be concave as a whole; our curve deviates considerably from the data point at D = 0.26. Except for this data point, however, our curve can be fit to the data (χ^2 = 3.8) better than their line (χ^2 = 7.1).
| FOR LONGER SUBSTRATES |
|---|
![]()
⟨Π^(RT)(D, N = 350)⟩, changing the h' value or changing the k'/k value (Figure 7, A and B). Using the same sets of parameter values, we plot the logarithm for N = 3500 in Figure 7, C and D.
|
We find that the curves, which the decoupling approximation yields for h' = 2.0 × 10^-3 and h' = 2.0 × 10^-2 (i.e., the top two dashed curves in Figure 7, A and C), agree well with the corresponding simulation results. This is expected because we then have h' - h << 1 (h = 3.0 × 10^-5). We again find that the simulation results tend to Equation 13 as h'/h → ∞ and k'/k → 0 in each of Figure 7, A-D; the very rapid drop-off appears then.
We find that the corresponding curves for N = 350 and N = 3500 share almost the same shape. The curve shape is thus insensitive to N, probably because the horizontal axis represents the divergence. At the same divergence, the average interval between two neighboring diverged sites is independent of the homology length. This average interval would mainly determine how frequently the connecting point encounters a diverged site and thus would mainly determine how the recombination frequency is reduced from that in the case of zero divergence.
Curve fitting of Equation 18 to ![]()
= 3.1 × 10^-9, and h' = 1.9 × 10^-3 (χ^2 = 6.0 × 10^-1). The fitted k'/k value varies from 10^-7 to 10^-3 depending on the initial condition of curve fitting as in the preceding section. Line fitting to the data for the Mmr- strains gives the fitted intercept -3.6 and the fitted slope -1.7 × 10 (χ^2 = 3.8 × 10^-1). These comparable χ^2 values demonstrate that our fit is as good as ![]()
|
The fitted h value gives a characteristic length of 3.5 × 10^2, which is much smaller than N = 10^7. Unless h changes drastically enough to make this length comparable to or much larger than N, the intercept is still given approximately by N times the product of k and the per-site production probability, as shown by the bottom line of Equation 12 with n replaced by N. The intercepts appear to be the same among the Mmr- strains, the wild-type strains, and the Mmr++ strains in Figure 8. We assume that the same value of this product is shared among the three types of strains; we expect that their h values are not drastically different.
Judging from our analysis of the data of ![]()
![]()
values as obtained for the Mmr- strains (Figure 8). We find that the data point at D = 0.17 is not so far from the curve, but its overall agreement with the data is poor (χ^2 = 2.3 × 10). If we do a line fit as in ![]()
χ^2 = 4.7 × 10^-1 (Figure 8). This fit is much better than ours.
Let us fit Equation 13 to the data for the Mmr++ strains with h being the only fitting parameter. Using the 433 MHz machine to perform the summation over N = 10^7 in Equation 13, we obtain the fitted value h = 1.0 × 10^-6 with χ^2 = 2.5 × 10 (Figure 8). The data for the Mmr++ strains appear to show the very rapid drop-off, which is followed by our curve. Attributing this tendency to saturation of the MMR proteins without its formulation, ![]()
χ^2 = 3.0). In passing, if the extreme data point is included, these values are -5.9 and -7.1 × 10, respectively, with χ^2 = 2.9 × 10.
Our curves for the Mmr- strains and for the Mmr++ strains (the top and the bottom solid curves in Figure 8, respectively) appear to have the same intercept regardless of their different h values as expected. Comparing our curve for the Mmr++ strains with that for the wild-type strains (the middle curve in Figure 8), we find that the slope near D = 0 is steeper, i.e., the very rapid drop-off becomes more prominent, as h decreases. This can be explained qualitatively as follows. As D increases in Equation 13, the whole homologous region is separated by a greater number of totally absorbing boundaries and the average length of an identical subregion becomes shorter. When the characteristic length set by h is larger, even if D is small, more identical subregions can be in the third-power dependence range of Equation 12. This dependence causes the very rapid drop-off as discussed in the second paragraph of THEORY FOR THE VERY RAPID DROP-OFF.
Although the substrates are very long (~10^7 bp), we have used the random-walk model with a single random walker. In other words, we still assumed that N times the per-site production probability is << 1 in this section, as in Equation 6 and Equation 14. This is consistent with the fitted value, 3.1 × 10^-9, of the product of k and this probability obtained above.
| FURTHER DISCUSSION |
|---|
As mentioned in the Introduction, ![]()
, increases with the RecA concentration. As discussed, our curve of either ln⟨Π^+(D, N)⟩ or ln⟨Π^(RT)(D, N)⟩ is then lifted with its shape remaining the same. Thus, the random-walk model can also explain this SOS-induced change of the intercept in a very straightforward way.
Table 2 summarizes the results of the curve fits. The χ^2 values show that the curves in our model cannot be fit to the data better than those in the previous models, except for the Mmr++ strains. However, this by no means implies failure of our model. First, the previous models are based on the MEPS theory, which has failed to explain the nonlinearity between the recombination frequency and the homology length as discussed in the opening section. Second, the previous models cannot explain the very rapid drop-off well; ![]()
![]()
![]()
![]()
, which also determine the dependence of the homologous recombination on the homology length in Equation 11. We have mentioned an agreement between the estimates in Equation 11 and Equation 13 in the paragraph next but one to that containing Equation 13. In particular, how the logarithm drops very rapidly from the intercept is determined by only one parameter h. This parameter, relative probability of intermediate processing, is also the key to the relationship between the recombination frequency and the homology length. This very simple explanation for the very rapid drop-off is our main result. The very rapid drop-off is not observed in ![]()
We also assumed site dependence of the transition rates for the Mmr- strains of ![]()
![]()
Although we find that the very rapid drop-off becomes less prominent as a diverged site obstructs the homologous recombination less severely (Figure 7), our curve cannot be fitted to ![]()
![]()
We supposed that the MMR system, if active enough, detects mismatches to abort the homologous recombination as in ![]()
![]()
![]()
![]()
![]()
To explain all these findings, we may also have to take into account possible influence of the divergence on the initial events in the random-walk model. ![]()
![]()
values. The intercept of our curve for the Mmr- strains (the upper solid curve in Figure 5) is larger by 1.5 than that of our curve for the wild-type strains (the lower solid curve in Figure 5). Of this difference, 0.9 is caused by the difference in the product of k and the per-site production probability, and the rest is caused by the difference in h as calculated with Equation 12. On the contrary, as discussed in the preceding section, both the h value and the value of this product need not remain fixed in explaining (almost) the same intercepts among ![]()
among the three types of strains.
We again emphasize that the random-walk model can explain, in a straightforward way, the linear dependence and the nonlinear dependence of the recombination frequency on the homology length, the presence or the absence of the very rapid drop-off and the SOS-induced change of the intercept in the relationship between the recombination frequency and the sequence divergence, and the map expansion. We therefore believe that the random-walk model helps in understanding essential aspects of the reaction of the homologous recombination.
| ACKNOWLEDGMENTS |
|---|
Y.F. acknowledges the helpful advice of Dr. G. J. M. Koper and Professor K. Kitahara. He also thanks Y. Mizoguchi and J. Kawai, who helped him in some of the curve fits. The work by Y.F. was supported by Keio Gakuji Shinko Shikin. The work by I.K. was supported by grants from the Ministry of Education, Science, Sports and Culture of the Japanese government (Class C, Class B, Repair, Genome), Nagase Science and Technology Foundation, Takeda Science Foundation, Yakult Bio-Science Foundation, and New Energy and Industrial Technology Development Organization (NEDO).
Manuscript received March 8, 1999; Accepted for publication August 11, 1999.
| APPENDIX A |
|---|
Using Equation 5 and B9 of ![]()
![]() |
(A1) |
where
![]() |
(A2) |
However, a simpler expression of p_*^(m;n)(∞), equivalent to the above, saves us computing time.
Using the Laplace transform of pj(t),
![]() |
(A3) |
we obtain from Equation 6
![]() |
(A4) |
where L is an n x n matrix
![]() |
(A5) |
From Equation 7, we thus obtain
![]() |
(A6) |
where L-1 is the inverse of L. Thus, we have from Equation 10
![]() |
(A7) |
Equation A6 is equivalent to Equation 16 of ![]()
= 0." As is well known in the research community of path integrals [see, e.g., Equation 3.41 of ![]()
![]() |
(A8) |
Here,
is defined by Equation 9 and satisfies
![]() |
(A9) |
One can check that substituting Equation A5 and Equation A8 into LL-1 produces the n x n unit matrix. [One way to derive Equation A8 is substituting "x1" and "xN-1" obtained from Equations B8 and B9 into Equations B5 and B7 of ![]()
= 0].
Using Equation A8, we have
![]() |
(A10) |
where we used Equations 1.341.2, 1.314.6, 1.334.1, and 1.313.2 of ![]()
Equation A10 leads to
![]() |
(A11) |
where we used Equations 1.314.6, 1.341.4, and 1.313.2 of ![]()
![]() |
(A12) |
When
<< 1, we have h
4
2 from Equation A9, and Equation A12 produces Equation 11 because coth
1/
.
| APPENDIX B |
|---|
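Suppose (N + 1)/2 ≥ l. Then, the identical subregion can reach neither end of the homologous region if n ≤ l - 1, but it reaches only the left end if l ≤ n ≤ N - l and m = l. Considering it in this way and writing F for F_l(m, n), we have the following cases, where F = 0 for any m not listed.
1. Case of l = 1:
When 1 ≤ n ≤ N - 1, F = D(1 - D)^n for m = 1.
When n = N, F = (1 - D)^N for m = 1.
2. Case of l = 2:
When n = 1, F = D^2(1 - D).
When 2 ≤ n ≤ N - 2, F = D^2(1 - D)^n for m = 1 and F = D(1 - D)^n for m = 2.
When n = N - 1, F = D(1 - D)^(N-1) for m = 1 and for m = 2.
When n = N, F = (1 - D)^N for m = 2.
3. Cases of 3 ≤ l ≤ (N + 1)/2:
When n ≤ l - 1, F = D^2(1 - D)^n.
When l ≤ n ≤ N - l (this case does not exist if l = (N + 1)/2), F = D^2(1 - D)^n for m < l and F = D(1 - D)^n for m = l.
When N - l + 1 ≤ n ≤ N - 2, F = D(1 - D)^n for m = l + n - N and for m = l, and F = D^2(1 - D)^n for l + n - N < m < l.
When n = N - 1, F = D(1 - D)^(N-1) for m = l - 1 and for m = l.
When n = N, F = (1 - D)^N for m = l.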
We can obtain F_l(m, n) for l > (N + 1)/2 by using
$$F_l(m, n) = F_{N+1-l}(n + 1 - m,\, n)$$ |
(B1) |
which comes from the symmetry of the one-dimensional lattice where the random walk occurs. When D = 0, the above F_l(m, n) is reduced to
$$F_l(m, n) = \delta_{n,N}\,\delta_{m,l}$$ |
(B2) |
The lth site of a homologous region is diverged with probability D, and otherwise it is the mth site of an identical subregion with n sites with probability F_l(m, n), where 1 ≤ m ≤ n and 1 ≤ n ≤ N. Thus, the normalization condition is given by
$$D + \sum_{n=1}^{N} \sum_{m=1}^{n} F_l(m, n) = 1$$ |
(B3) |
It is easy to see that this condition is satisfied when D = 0 because of Equation B2. Let us next check this condition when D ≠ 0 and l ≤ (N + 1)/2; we then have
![]() |
(B4) |
Here, the first term does not exist when l = 1, the second term does not exist when l = 1 and when l = (N + 1)/2, the third term does not exist when l = (N + 1)/2, and the fourth term does not exist when l ≤ 2.
2. Using the sum formulas of the geometric series and the arithmetico-geometric series [Equations 0.112 and 0.113 of ![]()
0 and l > (N + 1)/2.
| APPENDIX C |
|---|
Following the derivation of Equation A7, we can obtain from Equations 14 to 16
![]() |
(C1) |
where M and V are N x N matrices,
![]() |
(C2) |
and
![]() |
(C3) |
with
being an arbitrary real number.
We can expand the inverse of the matrix in Equation C1 as
![]() |
(C4) |
where
![]() |
(C5) |
This is a generalization of Equation A8, and
is defined so as to satisfy
![]() |
(C6) |
Introducing an N x N matrix,
![]() |
(C7) |
we obtain from Equations C1 and C4
![]() |
(C8) |
where 
nj
hnj -
. Equation C8 shows that Π^(RT)(N) is independent of g and g'.
Each of the products hn0 kn0 and hn0kn0 (
n1)(
n2) ... (
nq) is put between the angle brackets,
and
, when Equation C8 is averaged over positions of diverged sites. Let us consider the average of the latter product. Suppose that the subscripts n0, n1, ... , nq contain r(: 0
r
q) kinds of numbers, m0(
n0), m1, ... , mr, and that the subscripts n0, n1, ..., nq are composed of N0 pieces of m0, N1 pieces of m1, ... , and Nr pieces of mr. Then, the average of the product is given by
![]() |
(C9) |
However, because all the subscripts n0, n1, ... , nq are different from each other in the overwhelming majority of terms appearing in the summation n0, n1, ..., nq of Equation C8, we can decouple the average of the product approximately as
![]() |
(C10) |
which coincides with the case of r = q and Ni = 1 for any i in Equation C9. This decoupling approximation is valid when both h -
and h' -
are set to be small enough as compared to unity to make terms of higher power with respect to them negligible in Equation C9. Then, Equation C8 reads

(C11)
Expanding the inverse of a matrix
![]() |
(C12) |
as in Equation C4, where E is the N x N unit matrix, we obtain the infinite series in the brackets of Equation C11. The matrix, Equation C12, turns out to be the matrix L with h replaced by
and n replaced by N, where L is defined by Equation A5 and
is defined just below Equation 18. Because replacing as such in Equation A8 gives the inverse of Equation C12, replacing as such in Equation A11 gives the summation in Equation C11. Thus, the decoupling approximation yields Equation 18 irrespective of
.
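Expansions of a matrix inverse into an infinite series, like the ones used in this appendix, are generally of the geometric (Neumann) type. The sketch below illustrates only that generic identity, (E − M)^(-1) = E + M + M^2 + ..., which holds when the powers of M decay; it does not reproduce Equations C4 or C12, and the 2 × 2 matrix `M` and the helper names are arbitrary choices made for illustration.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def identity(n):
    """The n x n unit matrix E."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def neumann_inverse(M, terms):
    """Partial sum E + M + ... + M**(terms - 1), which approximates
    the inverse of (E - M) when the powers of M decay."""
    n = len(M)
    total = identity(n)
    power = identity(n)
    for _ in range(terms - 1):
        power = matmul(power, M)
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
    return total


# Arbitrary example matrix (eigenvalues 0 and 0.2, so the series converges).
M = [[0.1, 0.2], [0.05, 0.1]]
approx = neumann_inverse(M, 60)

# (E - M) times the partial sum should be close to the unit matrix.
E = identity(2)
E_minus_M = [[E[i][j] - M[i][j] for j in range(2)] for i in range(2)]
residual = matmul(E_minus_M, approx)
print(residual)
```

Truncating the series after a few terms is accurate only when the expansion parameter is small, which is the same condition under which the decoupling approximation is stated to be valid.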
| LITERATURE CITED |
|---|
AHN, B., K. J. DORNFELD, T. J. FRAGRELIUS, and D. M. LIVINGSTON, 1988 Effect of limited homology on gene conversion in a Saccharomyces cerevisiae plasmid recombination system. Mol. Cell. Biol. 8:2442-2448
DATTA, A., M. HENDRIX, M. LIPSITCH, and S. JINKS-ROBERTSON, 1997 Dual roles for DNA sequence identity and the mismatch repair system in the regulation of mitotic crossing-over in yeast. Proc. Natl. Acad. Sci. USA 94:9757-9762
DENG, C. and M. R. CAPECCHI, 1992 Reexamination of the gene targeting frequency as a function of the extent of homology between the targeting vector and the target locus. Mol. Cell. Biol. 12:3365-3371
DENTENEER, P. J. H. and M. H. ERNST, 1984 Diffusion in systems with static disorder. Phys. Rev. B 29:1755-1768.
EYRING, H., and E. M. EYRING, 1963 Modern Chemical Kinetics. Reinhold, New York.
FINCHAM, J. R. S. and R. HOLLIDAY, 1970 An explanation of fine structure map expansion in terms of excision repair. Mol. Gen. Genet. 109:309-322.
FUJITANI, Y. and I. KOBAYASHI, 1995 Random-walk model of homologous recombination. Phys. Rev. E 52:6607-6622.
FUJITANI, Y. and I. KOBAYASHI, 1997 Mismatch-stimulated destruction of intermediates as an explanation for map expansion in genetic recombination. J. Theor. Biol. 189:443-447.
FUJITANI, Y., K. YAMAMOTO, and I. KOBAYASHI, 1995 Dependence of frequency of homologous recombination on the homology length. Genetics 140:797-809.
GRADSHTEYN, I. S., and I. M. RYZHIK, 1980 Tables of Integrals, Series, and Products. Academic Press, New York.
HAUS, J. W. and K. W. KEHR, 1987 Diffusion in regular and disordered lattices. Phys. Rep. 150:263-406.
HOLLIDAY, R., 1964 A mechanism for gene conversion in fungi. Genet. Res. 5:282-304.
JINKS-ROBERTSON, S., M. MICHELITCH, and S. RAMCHARAN, 1993 Substrate length requirements for efficient mitotic recombination in Saccharomyces cerevisiae. Mol. Cell. Biol. 13:3937-3950
MAJEWSKI, J. and F. M. COHAN, 1998 The effect of mismatch repair and heteroduplex formation on sexual isolation in Bacillus. Genetics 148:13-18
NEGRITTO, M. T., X. WU, T. KUO, S. CHU, and A. M. BAILIS, 1997 Influence of DNA sequence identity on efficiency of targeted gene replacement. Mol. Cell. Biol. 17:278-286.
PANYUTIN, I. G. and P. HSIEH, 1993 Formation of a single base mismatch impedes spontaneous DNA branch migration. J. Mol. Biol. 230:413-424.
PORTER, G., J. WESTMORELAND, S. PRIEBE, and M. A. RESNICK, 1996 Homologous and homeologous intermolecular gene conversion are not differentially affected by mutations in the DNA damage or the mismatch repair genes RAD1, RAD50, RAD51, RAD52, RAD54, PMS1 and MSH2. Genetics 143:755-767.
ROBERTS, M. S. and F. M. COHAN, 1993 The effect of DNA sequence divergence on sexual isolation. Genetics 134:401-408.
RUBNITZ, J. and S. SUBRAMANI, 1984 The minimum amount of homology required for homologous recombination in mammalian cells. Mol. Cell. Biol. 4:2253-2258
SAKITA, B., and K. KIKKAWA, 1986 Keiro Sekibun ni yoru Taryuushikei no Ryoushirikigaku (Quantum Mechanics of Many-Particle Systems and Path Integrals). Iwanami, Tokyo (in Japanese).
SHEN, P. and H. V. HUANG, 1986 Homologous recombination in Escherichia coli: dependence on substrate length and homology. Genetics 112:441-457
SHEN, P. and H. V. HUANG, 1989 Effect of base pair mismatches on recombination via the RecBCD pathway. Mol. Gen. Genet. 218:358-360.
SINGER, B. S., L. GOLD, P. GAUSS, and D. H. DOHERTY, 1982 Determination of the amount of homology required for recombination in bacteriophage T4. Cell 31:25-33.
SUGAWARA, N. and J. E. HABER, 1992 Characterization of double-strand break-induced recombination: homology requirements and single-stranded DNA formation. Mol. Cell. Biol. 12:563-575
THOMPSON, B. J., M. N. CAMIEN, and R. C. WARNER, 1976 Kinetics of branch migration in double-stranded DNA. Proc. Natl. Acad. Sci. USA 73:2299-2303
VAN KAMPEN, N. G., 1981 Stochastic Processes in Physics and Chemistry. North-Holland, Amsterdam.
VULIĆ, M., F. DIONISIO, F. TADDEI, and M. RADMAN, 1997 Molecular keys to speciation: DNA polymorphism and the control of genetic exchange in enterobacteria. Proc. Natl. Acad. Sci. USA 94:9763-9767
WALDMAN, A. S. and R. M. LISKAY, 1988 Dependence of intrachromosomal recombination in mammalian cells on uninterrupted homology. Mol. Cell. Biol. 8:5350-5357
ZAWADZKI, P., M. S. ROBERTS, and F. M. COHAN, 1995 The log-linear relationship between sexual isolation and sequence divergence in Bacillus transformation is robust. Genetics 140:917-932.