Statistical Analysis of Ordered Tetrads
Hongyu Zhao(a) and Terence P. Speed(b)
(a) Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, Connecticut 06520
(b) Department of Statistics, University of California, Berkeley, California 94720
Corresponding author: Hongyu Zhao, Department of Epidemiology and Public Health, Yale University School of Medicine, 60 College St., New Haven, CT 06520. E-mail: hongyu.zhao{at}yale.edu
Communicating editor: D. BOTSTEIN
| ABSTRACT |
|---|
Ordered tetrad data yield information on chromatid interference, chiasma interference, and centromere locations. In this article, we show that the assumption of no chromatid interference imposes certain constraints on multilocus ordered tetrad probabilities. Assuming no chromatid interference, these constraints can be used to order markers under general chiasma processes. We also derive multilocus tetrad probabilities under a class of chiasma interference models, the chi-square models. Finally, we compare centromere map functions under the chi-square models with map functions proposed in the literature. Results in this article can be applied to order genetic markers and map centromeres using multilocus ordered tetrad data.
GENETIC studies using tetrad data are very valuable in studying the chance mechanisms in meiosis, including: (1) positions of crossovers along the four-strand bundle; (2) nonsister strand pairs involved in each crossover; (3) spindle-centromere attachment at the first meiotic division; and (4) spindle-centromere attachment at the second meiotic division. Deviation from random distributions of crossovers on the four-strand bundle is called chiasma interference. Deviation from random involvement of nonsister chromatid pairs in each crossover is called chromatid interference. Compared with single spore data, where the four products from a single meiosis can only be recovered separately, tetrad data, where four meiotic products can be recovered together, have several advantages. First, chromatid interference and chiasma interference can be distinguished using tetrad data. Second, when chromatid interference is absent, chiasma interference can be detected with only two markers, whereas at least three markers are needed for single spore data. Chiasma interference can even be detected with one marker in some studies. Third, the position of the centromere can be inferred. In some organisms, such as Neurospora crassa, the asci are produced in a linear order corresponding to the meiotic divisions and are called ordered tetrads. In other organisms, such as Saccharomyces cerevisiae, the asci are produced as a group without order and are called unordered tetrads.
Ordered tetrads have been used extensively to study the crossover process during meiosis since ![]()
In this article, ordered tetrad data are studied under different assumptions on the chance mechanisms. For each assumption, a detailed discussion is provided for single marker and two marker data. General results for multiple markers are then presented. Although the number of spores is four in a tetrad and eight in an octad, there is no loss of generality for discussing only tetrads when aberrant segregations can be ignored. Half-tetrad data are another type of genetic data that are closely related to ordered tetrad data and widely used in genetic studies. A detailed study of half-tetrad data is given in the accompanying article (![]()
We adopt the following notation in this article. Markers are denoted by script letters; for example, we use $\mathcal{A}$ and $\mathcal{B}$ to denote markers. Alleles are denoted by italic letters. For example, A and a denote two alleles of marker $\mathcal{A}$. We use [X, Y, Z, W] to denote the observed marker configuration for an ordered tetrad, where X and Y are attached to one centromere and Z and W are attached to the other centromere. For example, [AB, Ab, aB, ab] represents an ordered tetrad with two strands carrying AB and Ab attached to one centromere and with two strands carrying aB and ab attached to the other centromere. The centromere is denoted by CEN. For patterns between a pair of markers, we use P to denote parental ditype where all four strands retain the parental type, T to denote tetratype where two of the four strands show recombination, and N to denote nonparental ditype where all four strands are recombinants.
| METHODS |
|---|
Random spindle-centromere attachment assumption:
Random spindle-centromere attachment (RSCA) assumes that two centromeres have the same chance to go to either pole at the first meiotic division, and the divided centromeres have the same chance to go to either pole at the second meiotic division (![]()
For marker $\mathcal{A}$ with alleles A and a inherited from two parents, there are six distinguishable configuration types, as illustrated in Table 1. Under RSCA, types 1 and 2 ([A, A, a, a] and [a, a, A, A]) have the same probability because of random spindle-centromere attachment at the first meiotic division, whereas types 3 to 6 ([A, a, A, a], [a, A, a, A], [A, a, a, A], and [a, A, A, a]) have the same probability because of random spindle-centromere attachment at the second meiotic division. Types 1 and 2 are called the first division segregation (FDS) pattern, and types 3 to 6 are called the second division segregation (SDS) pattern (![]()).
For the two markers $\mathcal{A}$ and $\mathcal{B}$, there are six distinguishable configurations at $\mathcal{A}$ and six distinguishable configurations at $\mathcal{B}$. Therefore, there are 36 distinguishable patterns jointly at two markers. Under RSCA, if one pattern can be changed to another pattern through one of the eight permutations in Table 2, these two patterns should have the same probability. For example, the following eight types have the same probability: [AB, Ab, aB, ab], [Ab, AB, aB, ab], [AB, Ab, ab, aB], [Ab, AB, ab, aB], [aB, ab, AB, Ab], [ab, aB, AB, Ab], [aB, ab, Ab, AB], and [ab, aB, Ab, AB]. After examining all 36 distinguishable patterns, the number of distinct probabilities is seven under RSCA. These seven groups are shown in Table 3. When $\mathcal{A}$ shows FDS, there are three distinct groups corresponding to whether $\mathcal{A}$ and $\mathcal{B}$ have parental ditype, nonparental ditype, or tetratype, denoted by P1, N1, and T12, respectively, in Table 3. When $\mathcal{A}$ shows SDS, there are four distinct groups. Two groups correspond to $\mathcal{A}$ and $\mathcal{B}$ having parental and nonparental ditypes, denoted by P2 and N2, respectively, in Table 3. When $\mathcal{A}$ and $\mathcal{B}$ have tetratype, $\mathcal{B}$ can show either FDS or SDS. Under RSCA, these two groups may have different probabilities, denoted by T21 and T2. RSCA can be tested by examining whether all distinguishable patterns within each group occur with equal frequency. For example, the eight distinguishable patterns listed above, which correspond to group T12 with $\mathcal{A}$ showing FDS and $\mathcal{A}$ and $\mathcal{B}$ having tetratype, should occur with equal frequency. These and more cases were first studied by ![]()
For n markers, there are $6^n$ distinguishable patterns. Under RSCA, these $6^n$ patterns reduce to $(6^n + 5 \times 2^n)/8$ distinct probabilities. This result was first derived by ![]()
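The class count under RSCA can be checked by brute force. The sketch below is illustrative only: it assumes that the eight permutations of Table 2 (not reproduced here) are those generated by swapping the two strands on either centromere and exchanging the two centromeres, and the names `GENERATORS`, `closure`, and `count_rsca_classes` are hypothetical helpers, not part of the original analysis.

```python
from itertools import permutations, product

# Assumed generators of the eight permutations (0-based positions):
# swap strands on centromere 1, swap strands on centromere 2,
# exchange the two centromeres.
GENERATORS = [(1, 0, 2, 3), (0, 1, 3, 2), (2, 3, 0, 1)]


def compose(p, q):
    """Return the permutation that applies q first, then p."""
    return tuple(p[q[i]] for i in range(4))


def closure(generators):
    """Generate the full permutation group from its generators."""
    group = {(0, 1, 2, 3)}
    frontier = list(group)
    while frontier:
        g = frontier.pop()
        for h in generators:
            new = compose(g, h)
            if new not in group:
                group.add(new)
                frontier.append(new)
    return sorted(group)


GROUP = closure(GENERATORS)


def count_rsca_classes(n):
    """Count classes of n-marker ordered tetrad patterns that RSCA forces to be equiprobable."""
    configs = sorted(set(permutations("AAaa")))  # six configurations per marker
    seen = set()
    classes = 0
    for pattern in product(configs, repeat=n):
        if pattern in seen:
            continue
        classes += 1
        # Mark every pattern reachable by one of the eight permutations.
        for g in GROUP:
            seen.add(tuple(tuple(c[i] for i in g) for c in pattern))
    return classes


for n in (1, 2, 3):
    print(n, count_rsca_classes(n), (6 ** n + 5 * 2 ** n) // 8)
```

Under these assumptions the enumeration agrees with the closed form for small n (2, 7, and 32 classes for one, two, and three markers).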
No chromatid interference (one marker):
For the case of one marker, $\mathcal{A}$, if no chromatid interference (NCI) holds, the four configurations corresponding to the SDS pattern have the same probability before meiotic divisions. Therefore, these four types should occur with the same frequency even if RSCA fails. As a result, only when both NCI and RSCA fail can these four types occur with different probabilities.
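As shown by ![]() under NCI, the conditional probabilities that $\mathcal{A}$ shows FDS and SDS patterns, given k chiasmata between CEN and $\mathcal{A}$, are

$$F_{\mathcal{A}}^{(k)} = \frac{2}{3}\left[\frac{1}{2} + \left(-\frac{1}{2}\right)^{k}\right], \qquad S_{\mathcal{A}}^{(k)} = \frac{2}{3}\left[1 - \left(-\frac{1}{2}\right)^{k}\right]. \tag{1}$$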
As k increases, the probability of SDS tends to 2/3. For one marker, $\mathcal{A}$, NCI imposes no constraints on the probabilities of FDS and SDS, denoted by $F_{\mathcal{A}}$ and $S_{\mathcal{A}}$. For any observed FDS and SDS proportions, we can always construct a chiasma process model that gives rise to the observed FDS and SDS proportions. In fact, the process with probability $F_{\mathcal{A}}$ of having zero chiasmata and probability $S_{\mathcal{A}}$ of having one chiasma is the simplest such model.
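As a small numerical illustration of this paragraph, the sketch below assumes the standard NCI result that the SDS probability given k chiasmata is (2/3)[1 - (-1/2)^k], which is consistent with the limit of 2/3 stated above. The function names are hypothetical.

```python
from fractions import Fraction


def fds_given_k(k):
    """P(FDS | k chiasmata between CEN and the marker), assuming NCI."""
    return Fraction(2, 3) * (Fraction(1, 2) + Fraction(-1, 2) ** k)


def sds_given_k(k):
    """P(SDS | k chiasmata between CEN and the marker), assuming NCI."""
    return Fraction(2, 3) * (1 - Fraction(-1, 2) ** k)


def sds_proportion(chiasma_dist):
    """Overall SDS proportion for a chiasma-count distribution {k: probability}."""
    return sum(c * sds_given_k(k) for k, c in chiasma_dist.items())


# Zero chiasmata always give FDS; exactly one chiasma always gives SDS.
print(fds_given_k(0), sds_given_k(1))

# The "simplest model": zero chiasmata with probability F, one chiasma with probability S,
# reproduces any target SDS proportion S exactly.
target = Fraction(3, 4)
print(sds_proportion({0: 1 - target, 1: target}) == target)
```

Because zero chiasmata give FDS with certainty and one chiasma gives SDS with certainty, any pair of FDS and SDS proportions summing to one is attainable, which is why a single marker imposes no constraint under NCI.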
No chromatid interference (two markers):
Two markers, $\mathcal{A}$ and $\mathcal{B}$, may be (1) on different chromosomes; (2) on the same chromosome but on different sides of the centromere (that is, the order is $\mathcal{A}$ CEN $\mathcal{B}$); or (3) on the same chromosome and on the same side of the centromere (that is, the order is CEN $\mathcal{A}\mathcal{B}$ or CEN $\mathcal{B}\mathcal{A}$). We study these three cases separately. These three cases were first discussed in detail by ![]()
and
.
Two markers on different chromosomes:
Let p and q denote the probabilities of SDS at $\mathcal{A}$ and $\mathcal{B}$. When both markers have FDS, tetrad types between $\mathcal{A}$ and $\mathcal{B}$ can be either parental ditype or nonparental ditype, depending on which pair of allele combinations is separated at the first meiotic division: AB vs. ab, or Ab vs. aB. The probability of either outcome, group P1 or N1, is (1 - p)(1 - q)/2. Similar considerations lead to the probabilities of the seven groups in Table 4. These seven probabilities are determined by two independent parameters: p and q.
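The "similar considerations" can be made concrete by enumeration. The sketch below is not a reproduction of Table 4; it assumes RSCA and independent segregation of the two chromosomes, so that each marker independently takes one of its two FDS configurations (total probability 1 - p or 1 - q) or one of its four SDS configurations (total probability p or q). Upper and lower case stand for the two parental alleles at either marker, and all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

FDS = [("A", "A", "a", "a"), ("a", "a", "A", "A")]
SDS = [("A", "a", "A", "a"), ("a", "A", "a", "A"),
       ("A", "a", "a", "A"), ("a", "A", "A", "a")]


def marker_distribution(sds_prob):
    """(configuration, is_sds, probability) triples for one marker under RSCA."""
    out = [(c, False, (1 - sds_prob) / 2) for c in FDS]
    out += [(c, True, sds_prob / 4) for c in SDS]
    return out


def group_probabilities(p, q):
    """Probabilities of groups P1, N1, T12, P2, N2, T21, T2 for unlinked markers."""
    groups = {g: Fraction(0) for g in ("P1", "N1", "T12", "P2", "N2", "T21", "T2")}
    pairs = product(marker_distribution(p), marker_distribution(q))
    for (ca, sds_a, pa), (cb, sds_b, pb) in pairs:
        # A strand is parental when both markers carry alleles from the same parent.
        parental = sum(x.isupper() == y.isupper() for x, y in zip(ca, cb))
        kind = {4: "P", 2: "T", 0: "N"}[parental]
        if kind == "T":
            name = "T12" if not sds_a else ("T2" if sds_b else "T21")
        else:
            name = kind + ("2" if sds_a else "1")
        groups[name] += pa * pb
    return groups


for name, value in group_probabilities(Fraction(1, 3), Fraction(1, 2)).items():
    print(name, value)
```

With p = 1/3 and q = 1/2, the enumeration gives P1 = N1 = (1 - p)(1 - q)/2 = 1/6, in agreement with the text, and splits the double-SDS class in the ratio 1:2:1 among P2, T2, and N2.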
Two markers on different sides of the centromere ($\mathcal{A}$ CEN $\mathcal{B}$):
We use $p(P_1^{(k,\ell)})$, $p(N_1^{(k,\ell)})$, $p(T_{12}^{(k,\ell)})$, $p(P_2^{(k,\ell)})$, $p(N_2^{(k,\ell)})$, $p(T_{21}^{(k,\ell)})$, and $p(T_2^{(k,\ell)})$ to denote the frequency of ordered tetrads of groups P1, N1, T12, P2, N2, T21, and T2 among meioses with k chiasmata in $\mathcal{A}$ CEN and $\ell$ chiasmata in CEN $\mathcal{B}$. We can easily check that
and
For $k + \ell > 2$,
where $F_{\mathcal{A}}^{(k)}$ and $S_{\mathcal{A}}^{(k)}$ were defined in (1). For a given chiasma process along the four-strand bundle, let $c_{k\ell}$ denote the joint probability of there being k chiasmata between CEN and $\mathcal{A}$ and $\ell$ chiasmata between CEN and $\mathcal{B}$. The above relations can be combined with expressions for $(c_{k\ell})$ and summed, to give our desired frequencies. For example,
and
On the basis of the above results, it can be shown that, for any chiasma process, the seven distinct groups can have at most five different probabilities: the probabilities of P1, N1, T12, T21, and (P2 + T2 + N2) can differ, and we denote them by $\alpha$, $\beta$, $\gamma$, $\delta$, and $\epsilon$. The ratio of the probabilities of P2:T2:N2 is 1:2:1. Therefore, the probabilities of P2, T2, and N2 are $\epsilon/4$, $\epsilon/2$, and $\epsilon/4$, respectively. These are summarized in Table 5.
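The probabilities of these seven groups can be derived by another approach. If we treat the centromere as a marker, the results for unordered tetrads (![]()) apply: there are three types, P, T, and N, between $\mathcal{A}$ and CEN, and three types, P, T, and N, between CEN and $\mathcal{B}$. This would lead to nine distinct probabilities. Let $p_{k0}$, $p_{k1}$, and $p_{k2}$ denote the conditional probabilities of P, T, and N, given k chiasmata between a pair of markers. Under NCI, ![]() showed that, for $k \ge 1$,

$$p_{k0} = p_{k2} = \frac{1}{3}\left[\frac{1}{2} + \left(-\frac{1}{2}\right)^{k}\right], \qquad p_{k1} = \frac{2}{3}\left[1 - \left(-\frac{1}{2}\right)^{k}\right]. \tag{2}$$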
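The conditional tetrad-type probabilities can be cross-checked against a one-chiasma-at-a-time recursion. The sketch below assumes the usual NCI bookkeeping: adding one chiasma turns P or N into T with certainty, and turns T into P, T, or N with probabilities 1/4, 1/2, and 1/4. The closed form coded here is the standard one, (1/3)[1/2 + (-1/2)^k] for P and N and (2/3)[1 - (-1/2)^k] for T when k is at least 1; function names are hypothetical.

```python
from fractions import Fraction


def tetrad_given_k(k):
    """Closed-form (P, T, N) probabilities given k chiasmata, assuming NCI."""
    if k == 0:
        return (Fraction(1), Fraction(0), Fraction(0))
    half_pow = Fraction(-1, 2) ** k
    p = Fraction(1, 3) * (Fraction(1, 2) + half_pow)
    t = Fraction(2, 3) * (1 - half_pow)
    return (p, t, p)


def add_one_chiasma(state):
    """Update (P, T, N) after one more chiasma with a randomly chosen nonsister pair."""
    p, t, n = state
    return (t / 4, p + t / 2 + n, t / 4)


state = (Fraction(1), Fraction(0), Fraction(0))
for k in range(6):
    print(k, state, state == tetrad_given_k(k))
    state = add_one_chiasma(state)
```

For k = 1 the tetrad is always T, and for k = 2 the three types occur in the ratio 1:2:1, which is the source of the 1:2:1 constraints discussed in this section.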
When k = 0, $p_{00} = 1$ and $p_{01} = p_{02} = 0$. Let $p_{ij}$ denote the probability of joint tetrad pattern (ij), where i, j = 0, 1, or 2 corresponding to P, T, and N in each interval; then, with $c_{k\ell}$ the joint probability of k and $\ell$ chiasmata in the two intervals,

$$p_{ij} = \sum_{k=0}^{\infty} \sum_{\ell=0}^{\infty} c_{k\ell}\, p_{ki}\, p_{\ell j},$$

where $p_{ki}$ and $p_{\ell j}$ were defined in (2). Because the two centromeres cannot be distinguished, some of these classes are not distinguishable. For example, (0, 0) and (2, 2) both give rise to FDS at two markers and no recombinations between these two markers. Using the notation in Table 5, we have
$\alpha = p(P_1) = p_{00} + p_{22}$, $\beta = p(N_1) = p_{02} + p_{20}$, $\gamma = p(T_{12}) = p_{01} + p_{21}$, $\delta = p(T_{21}) = p_{10} + p_{12}$, and $\epsilon = 4p(P_2) = 2p(T_2) = 4p(N_2) = p_{11}$.
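It was shown in ![]() that, for a single interval with unordered tetrad probabilities $\mathbf{p} = (p_0, p_1, p_2)'$, there is an underlying chiasma process satisfying NCI compatible with $\mathbf{p}$ if and only if $T_1^{-1}\mathbf{p} \ge \mathbf{0}$, where

$$T_1 = \begin{pmatrix} 1 & 0 & 1/4 \\ 0 & 1 & 1/2 \\ 0 & 0 & 1/4 \end{pmatrix}$$

and $\mathbf{0} = (0, 0, 0)'$. For two markers, write the $p_{ij}$ in lexicographical order as $\mathbf{p}$; there is an underlying chiasma process satisfying NCI compatible with unordered tetrad probabilities $\mathbf{p}$ if and only if $T_2^{-1}\mathbf{p} \ge \mathbf{0}$, where $T_2 = T_1 \otimes T_1$, $T_2^{-1} = T_1^{-1} \otimes T_1^{-1}$, and $\mathbf{0}$ is a column vector with nine 0's, plus equality constraints described in ![]() Here $\otimes$ is the standard tensor product (see, e.g., ![]()). A compatible chiasma distribution $\mathbf{c} = (c_{k\ell})$ is simply $\mathbf{c} = T_2^{-1}\mathbf{p}$. Using the property that, for any underlying chiasma process, there is a compatible chiasma process with at most two chiasmata in each interval, we may focus on the study of chiasma processes with at most two chiasmata in each interval. Using the notation in Table 5, we have the following proposition, whose proof is given in Appendix 1 (Proposition 2): under NCI, for any joint ordered tetrad probabilities with two markers on different sides of the centromere, there is an underlying chiasma process that is compatible with these probabilities if and only if $\alpha \ge \beta$, $\gamma + \delta \ge 2\beta$, and the ratios of p(P2):p(T2):p(N2) are 1:2:1.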
Using $\alpha$, $\beta$, $\gamma$, $\delta$, and $\epsilon$, we may express the probability that $\mathcal{A}$ and $\mathcal{B}$ show P, T, and N, as $p(P) = \alpha + \epsilon/4$, $p(T) = \gamma + \delta + \epsilon/2$, and $p(N) = \beta + \epsilon/4$, respectively. In the unordered tetrad case, the constraints imposed by NCI are $p(P) \ge p(N)$ and $p(T) \ge 2p(N)$ (![]()). Substituting $\alpha$, $\beta$, $\gamma$, $\delta$, and $\epsilon$ in these two inequalities, we get $\alpha \ge \beta$ and $\gamma + \delta \ge 2\beta$. Therefore, for markers on different sides of the centromere, the only extra constraints added by ordered tetrads are the 1:2:1 proportionality constraints among p(P2), p(T2), and p(N2).
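The constraints for markers on opposite sides of the centromere can be sketched as a simple check on the seven group probabilities. The code below is illustrative and rests on two assumptions: that attention can be restricted to chiasma processes with at most two chiasmata per interval, as stated above, and that the (P, T, N) probabilities given 0, 1, and 2 chiasmata are (1, 0, 0), (0, 1, 0), and (1/4, 1/2, 1/4). The names `ordered_groups` and `nci_compatible` are hypothetical, and the exact-equality test applies to probabilities, not to observed counts.

```python
from fractions import Fraction

# (P, T, N) probabilities given k = 0, 1, 2 chiasmata under NCI.
PTN_GIVEN_K = {
    0: (Fraction(1), Fraction(0), Fraction(0)),
    1: (Fraction(0), Fraction(1), Fraction(0)),
    2: (Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)),
}


def ordered_groups(c):
    """Seven group probabilities from a chiasma distribution c[(k, l)], k, l in {0, 1, 2}."""
    p = [[Fraction(0)] * 3 for _ in range(3)]
    for (k, l), prob in c.items():
        for i in range(3):
            for j in range(3):
                p[i][j] += prob * PTN_GIVEN_K[k][i] * PTN_GIVEN_K[l][j]
    eps = p[1][1]  # both markers show SDS
    return {
        "P1": p[0][0] + p[2][2], "N1": p[0][2] + p[2][0],
        "T12": p[0][1] + p[2][1], "T21": p[1][0] + p[1][2],
        "P2": eps / 4, "T2": eps / 2, "N2": eps / 4,
    }


def nci_compatible(g):
    """Check P1 >= N1, T12 + T21 >= 2 N1, and P2:T2:N2 = 1:2:1."""
    return (g["P1"] >= g["N1"]
            and g["T12"] + g["T21"] >= 2 * g["N1"]
            and g["T2"] == 2 * g["P2"] and g["P2"] == g["N2"])


example = {(0, 0): Fraction(1, 4), (1, 2): Fraction(1, 4), (2, 2): Fraction(1, 2)}
groups = ordered_groups(example)
print(groups, nci_compatible(groups))
```

Any chiasma distribution fed through `ordered_groups` should pass the check, whereas a table with more N1 than P1 tetrads cannot arise under NCI.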
Two markers on the same side of the centromere (CEN $\mathcal{A}\mathcal{B}$; the case of CEN $\mathcal{B}\mathcal{A}$ can be discussed similarly):
As in the previous discussion, we use (i, j) to denote the nine distinct groups if the centromere can be treated as a marker. Because the centromere cannot be observed, (0, 0) and (2, 0) cannot be distinguished. Both (0, 0) and (2, 0) show FDS at $\mathcal{A}$ and parental ditype between $\mathcal{A}$ and $\mathcal{B}$. Therefore, we have $p(P_1) = p_{00} + p_{20}$. Similarly, $p(N_1) = p_{02} + p_{22}$, $p(T_{12}) = p_{01} + p_{21}$, $p(P_2) = p_{10}$, $p(N_2) = p_{12}$, and $p(T_{21}) = p(T_2) = p_{11}/2$. Therefore, these seven groups can have at most six different probabilities. Each of these 2 × 3 types can be represented as $(i_1 i_2)$, with $i_1$ = 0 or 1 corresponding to FDS or SDS at $\mathcal{A}$, and $i_2$ = 0, 1, or 2 corresponding to P, T, or N between $\mathcal{A}$ and $\mathcal{B}$. Denote these probabilities by $\alpha = p(P_1)$, $\beta = p(N_1)$, $\gamma = p(T_{12})$, $\delta = p(P_2)$, $\epsilon = p(N_2)$, and $\zeta = 2p(T_{21}) = 2p(T_2)$ (Table 6). It can be shown, as in ![]() that under NCI there is a compatible underlying chiasma process if and only if $\alpha \ge \beta$, $\gamma \ge 2\beta$, $\delta \ge \epsilon$, $\zeta \ge 2\epsilon$, and p(T21) = p(T2).
No chromatid interference (multiple markers):
Here we consider only markers on the same chromosome. Markers on the same side of the centromere and markers on different sides of the centromere are discussed separately.
Markers on the same side of the centromere:
Under the assumption of NCI, there are $2 \times 3^{n-1}$ distinct probabilities for n markers in the order of CEN $\mathcal{A}_1 \mathcal{A}_2 \cdots \mathcal{A}_n$. Each of these $2 \times 3^{n-1}$ classes can be identified as follows: FDS and SDS are distinguished at $\mathcal{A}_1$, and for each pair of consecutive markers, there are three types: P, T, and N. Each of these $2 \times 3^{n-1}$ types can be represented as $(i_1 i_2 \cdots i_n)$, where $i_1$ = 0 or 1 corresponding to FDS or SDS at $\mathcal{A}_1$, and $i_r$ = 0, 1, or 2 corresponding to P, T, or N between $\mathcal{A}_{r-1}$ and $\mathcal{A}_r$, for r = 2, ..., n. For unordered tetrads with n markers (n - 1 intervals), there are $3^{n-1}$ distinct probabilities because FDS and SDS cannot be differentiated at $\mathcal{A}_1$; that is, $i_1$ cannot be determined. Write the probabilities of the observable patterns $(i_2 \ldots i_n)$ from unordered tetrads, denoted by $p^t_{i_2 \ldots i_n}$, in lexicographical order as $\mathbf{p}^t$. It was shown that there is an underlying chiasma process satisfying NCI compatible with unordered tetrad probabilities if and only if $T_{n-1}^{-1}\mathbf{p}^t \ge \mathbf{0}$, where $T_{n-1} = T_1 \otimes \cdots \otimes T_1$ (n - 1 terms), plus equality constraints described in ![]()
In our discussion of multiple marker ordered tetrad data, the $p^t_{0 i_2 \ldots i_n}$ and $p^t_{1 i_2 \ldots i_n}$ are considered separately. Write the $p^t_{0 i_2 \ldots i_n}$ in lexicographical order as $\mathbf{p}^t_0$, and the $p^t_{1 i_2 \ldots i_n}$ in lexicographical order as $\mathbf{p}^t_1$. If for a given $(0 i_2 \ldots i_n)$, there are $k \ge 2$ tetratypes in the n - 1 intervals, these tetratype combinations may be subdivided further into $4^{k-1}$ subcells as follows. First, the strands can be labeled such that strands 1 and 3 always show recombination between the two markers that have the tetratype closest to the centromere. For the other k - 1 intervals showing tetratype, recombinations can occur on four possible pairs of strands. Therefore, there are $4^{k-1}$ subtypes. The probability of each subcell can be denoted by $p_{0 i_2 \ldots i_n}(h_1 \ldots h_{k-1})$, where each $h_j$ is 1, 2, 3, or 4. If for a given $(1 i_2 \ldots i_n)$, there are $k \ge 1$ tetratypes in the n - 1 intervals, these tetratype combinations may be subdivided further into $2 \times 4^{k-1}$ subcells as follows. Suppose the first pair of markers showing tetratype from the centromere is $\mathcal{A}_{r-1}$ and $\mathcal{A}_r$. Marker $\mathcal{A}_{r-1}$ must show SDS, because otherwise there must be a tetratype before $\mathcal{A}_{r-1}$. Marker $\mathcal{A}_r$ can show either FDS or SDS. Two types can thus be distinguished, depending on whether $\mathcal{A}_r$ shows FDS or SDS. The strands can be labeled such that strands 1 and 3 always show recombination between $\mathcal{A}_{r-1}$ and $\mathcal{A}_r$. For the other k - 1 intervals showing tetratype, recombinations can occur on four possible pairs of strands. The probability of each subcell can be denoted by $p_{1 i_2 \ldots i_n}(h_0, h_1 \ldots h_{k-1})$, where $h_0$ is 0 or 1 if $\mathcal{A}_r$ shows FDS or SDS, and each $h_j$ is 1, 2, 3, or 4 for $j \ge 1$. Using arguments similar to those in ![]() it can be shown that there is an underlying chiasma process satisfying NCI compatible with the ordered tetrad probabilities if and only if $T_{n-1}^{-1}\mathbf{p}^t_0 \ge \mathbf{0}$, $T_{n-1}^{-1}\mathbf{p}^t_1 \ge \mathbf{0}$, all the subcell probabilities $p^t_{0 i_2 \ldots i_n}(h_1, \ldots, h_r)$ in a cell $i_2 \ldots i_n$ with $i_r = 1$ for more than one r are equal, and all the subcell probabilities $p^t_{1 i_2 \ldots i_n}(h_0, h_1, \ldots, h_r)$ in a cell $i_2 \ldots i_n$ with $i_r = 1$ for one or more r are equal.
Markers on different sides of the centromere:
Consider markers on one side of the centromere in the order of CEN $\mathcal{A}_1 \mathcal{A}_2 \cdots \mathcal{A}_{n_1}$ and markers on the other side in the order of CEN $\mathcal{B}_1 \mathcal{B}_2 \cdots \mathcal{B}_{n_2}$. If the centromere could be observed, any joint tetrad pattern can be represented by $(i_1 i_2 \ldots i_{n_1}; j_1 j_2 \ldots j_{n_2})$, where $i_r$ = 0, 1, or 2 corresponding to P, T, or N between $\mathcal{A}_{r-1}$ and $\mathcal{A}_r$, $j_s$ = 0, 1, or 2 corresponding to P, T, or N between $\mathcal{B}_{s-1}$ and $\mathcal{B}_s$, and $\mathcal{A}_0$ and $\mathcal{B}_0$ both denote the same centromere. Because the centromere is not observable, and both $(0 i_2 \ldots i_{n_1}; 0 j_2 \ldots j_{n_2})$ and $(2 i_2 \ldots i_{n_1}; 2 j_2 \ldots j_{n_2})$ show FDS at $\mathcal{A}_1$ and $\mathcal{B}_1$ and parental ditype between $\mathcal{A}_1$ and $\mathcal{B}_1$, they are not distinguishable. Similarly, $(0 i_2 \ldots i_{n_1}; 1 j_2 \ldots j_{n_2})$ is not distinguishable from $(2 i_2 \ldots i_{n_1}; 1 j_2 \ldots j_{n_2})$, $(1 i_2 \ldots i_{n_1}; 0 j_2 \ldots j_{n_2})$ is not distinguishable from $(1 i_2 \ldots i_{n_1}; 2 j_2 \ldots j_{n_2})$, and $(0 i_2 \ldots i_{n_1}; 2 j_2 \ldots j_{n_2})$ is not distinguishable from $(2 i_2 \ldots i_{n_1}; 0 j_2 \ldots j_{n_2})$. For SDS at both $\mathcal{A}_1$ and $\mathcal{B}_1$, that is, tetratype in the intervals $\mathcal{A}_1$CEN and CEN$\mathcal{B}_1$, there are three distinguishable types based on the configuration between $\mathcal{A}_1$ and $\mathcal{B}_1$: P, T, or N.
We combine tetrad types having P between $\mathcal{A}_1$ and $\mathcal{B}_1$, namely $(0 i_2 \ldots i_{n_1}; 0 j_2 \ldots j_{n_2})$, $(2 i_2 \ldots i_{n_1}; 2 j_2 \ldots j_{n_2})$, and the one of the three types of $(1 i_2 \ldots i_{n_1}; 1 j_2 \ldots j_{n_2})$ showing P between $\mathcal{A}_1$ and $\mathcal{B}_1$, and denote the grouped type by $(P; i_2 \ldots i_{n_1}; j_2 \ldots j_{n_2})$. Similarly, we obtain new grouped types, $(T; i_2 \ldots i_{n_1}; j_2 \ldots j_{n_2})$ and $(N; i_2 \ldots i_{n_1}; j_2 \ldots j_{n_2})$, where the tetrad types between $\mathcal{A}_1$ and $\mathcal{B}_1$ are T and N. It can be shown that the inequality constraints imposed by NCI on ordered tetrads are the same as the inequality constraints imposed on unordered tetrads applied to the above new grouped types. In the new grouped types, FDS or SDS information is ignored at $\mathcal{A}_1$ and $\mathcal{B}_1$. The equality constraints can be established but are more complex; we omit the details here.
Genetic mapping (one marker):
The probabilities of FDS and SDS at a marker $\mathcal{A}$ can be related to the map distance between CEN and $\mathcal{A}$ if a chiasma process model is specified. We study several chiasma models and compare various map functions derived from these models and map functions proposed in the literature. Note that centromeres can be mapped using other types of data. When markers at the centromere are available, the centromere can be treated as a marker and standard mapping procedures can be used to map centromeres (![]()).
Complete interference model:
If there is at most one chiasma between CEN and $\mathcal{A}$, let $c_0$ and $c_1$ denote the probabilities of having 0 and 1 chiasma; then, $F_{\mathcal{A}} = p(\mathrm{FDS}) = c_0$ and $S_{\mathcal{A}} = p(\mathrm{SDS}) = c_1$. The map distance d between CEN and $\mathcal{A}$ is $c_1/2$. Therefore, $F_{\mathcal{A}} = 1 - 2d$ and $S_{\mathcal{A}} = 2d$.
If more than one chiasma is allowed, map distance d cannot be estimated from $F_{\mathcal{A}}$ and $S_{\mathcal{A}}$ unless the chiasma process is fully specified with the map distance as the only unknown parameter.
Poisson model:
The most widely used chiasma process model is the Poisson process, which imposes no chiasma interference. In this model, the probability of k chiasmata between CEN and $\mathcal{A}$ is $e^{-2d}(2d)^k/k!$. Therefore, from (1),

$$F_{\mathcal{A}}(d) = \frac{1}{3}\left(1 + 2e^{-3d}\right), \qquad S_{\mathcal{A}}(d) = \frac{2}{3}\left(1 - e^{-3d}\right). \tag{3}$$

Under the complete interference model, the SDS proportion is twice the map distance. Under the Poisson model, which imposes no chiasma interference, the SDS proportion will never exceed 2/3. Therefore, for ordered tetrad data, the presence of chiasma interference can be shown with just a single marker if NCI is assumed and the observed SDS proportion is significantly above 2/3. In many organisms, the SDS proportion was observed to be larger than 2/3 for some markers (![]()).
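The Poisson map function can be checked numerically by summing the conditional SDS probabilities over the Poisson chiasma distribution. This is a sketch under the assumption that the conditional SDS probability given k chiasmata is (2/3)[1 - (-1/2)^k]; the function names and the truncation point `kmax` are hypothetical choices.

```python
import math


def sds_given_k(k):
    """P(SDS | k chiasmata), assuming NCI."""
    return (2.0 / 3.0) * (1.0 - (-0.5) ** k)


def sds_poisson_sum(d, kmax=200):
    """SDS proportion as a truncated sum over a Poisson(2d) chiasma count."""
    mean = 2.0 * d
    term = math.exp(-mean)  # Poisson probability of k = 0
    total = 0.0
    for k in range(kmax + 1):
        total += term * sds_given_k(k)
        term *= mean / (k + 1)  # advance to the Poisson probability of k + 1
    return total


def sds_poisson_closed(d):
    """Closed-form SDS proportion under the Poisson model."""
    return (2.0 / 3.0) * (1.0 - math.exp(-3.0 * d))


for d in (0.05, 0.5, 2.0):
    print(d, sds_poisson_sum(d), sds_poisson_closed(d))
```

Both versions stay below 2/3 for every map distance, which is why an observed SDS proportion significantly above 2/3 points to chiasma interference when NCI is assumed.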
There are several proposals in the literature to incorporate chiasma interference in relating the map distance and the SDS proportion. The earliest one appears to be the model proposed by ![]()
1 chiasmata is
![]() |
(4) |
Map distances and SDS proportions can be expressed in terms of x and
. ![]()
in (4). To avoid confusion with other notation in this article,
is used in the following discussion. ![]()
between 0.2 and 0.3 provided good fit to Drosophila and Neurospora data.
After trying out many candidates for simple map functions for SDS proportions, ![]()
=
sin(3d) was in excellent agreement with the empirical data in ![]()
On the basis of a map function relating the map distance d and the recombination fraction
between two markers proposed by ![]()
![]()
= 3
- d.
Here we will compare these map functions with map functions derived from the chi-square chiasma interference models. The chi-square model, first introduced by ![]()
![]()
![]()
![]()
![]()
Let p = m + 1, and define $D_k(y)$ to be the $p \times p$ matrix whose (i, j)th entry is $d_k(ij) = e^{-y} y^{pk+j-i}/(pk+j-i)!$ if $pk + j - i \ge 0$, and $d_k(ij) = 0$ otherwise. Let $\mathbf{1} = (1, 1, \ldots, 1)'$ and $\boldsymbol{\pi} = (1/p)\mathbf{1}'$. For an interval defined by parameter y, the map distance d is y/(2p) because (1) the average number of crossover intermediates between these two markers is y; (2) one out of every p = m + 1 intermediates resolves as a crossover; and (3) each strand has a chance of 1/2 of being involved in each crossover. Therefore, a given strand is involved in a crossover for every 2p crossover intermediates. The probability of having k chiasmata between two markers is $c_k = \boldsymbol{\pi} D_k \mathbf{1}$ (![]()).
Therefore, from (1),
where d is related to y by $d = y/(2p)$ from the above discussion. The expressions of $F_{\mathcal{A}}(y)$ and $S_{\mathcal{A}}(y)$ are more complicated for m > 1. Map functions relating the SDS proportion and the map distance for different m's are plotted in Figure 1. Note that m = 0 corresponds to the no-interference model, that is, the Poisson model. Under the no-interference model, the SDS proportion never goes above 2/3. For m > 0, the SDS proportion rises above 2/3. As m increases, the maximal value of $S_{\mathcal{A}}$ increases, and it is achieved at smaller d. For m > 0, there is no one-to-one correspondence between $S_{\mathcal{A}}$ and d. Therefore, the centromere cannot be uniquely mapped when the SDS proportion is larger than 0.6, and chiasma interference cannot be ruled out.
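The behaviour described here can be explored numerically. The sketch below assumes the matrix form of the chi-square model used in this section: the (i, j) entry of D_k(y) is the Poisson(y) probability of pk + j - i crossover intermediates, the stationary starting vector is uniform over the p = m + 1 states, and d = y/(2p). It also assumes the conditional SDS probability (2/3)[1 - (-1/2)^k]. The function names and the truncation `kmax` are hypothetical.

```python
import math


def chiasma_count_probs(y, m, kmax=60):
    """c_k for k = 0..kmax under the Cx(Co)^m model with parameter y > 0."""
    p = m + 1
    probs = []
    for k in range(kmax + 1):
        total = 0.0
        for i in range(1, p + 1):
            for j in range(1, p + 1):
                n = p * k + j - i  # number of crossover intermediates
                if n >= 0:
                    # Poisson(y) probability of n, computed on the log scale.
                    total += math.exp(-y + n * math.log(y) - math.lgamma(n + 1))
        probs.append(total / p)  # uniform stationary start over the p states
    return probs


def sds_chi_square(d, m, kmax=60):
    """SDS proportion at map distance d > 0 under the chi-square model, assuming NCI."""
    y = 2.0 * (m + 1) * d
    c = chiasma_count_probs(y, m, kmax)
    return sum(ck * (2.0 / 3.0) * (1.0 - (-0.5) ** k) for k, ck in enumerate(c))


for m in (0, 1, 2, 4):
    best = max(sds_chi_square(0.05 * i, m) for i in range(1, 31))
    print(m, round(best, 4))
```

With m = 0 the values reduce to the Poisson map function and never reach 2/3, whereas for m = 2 the SDS proportion at d = 0.5 already exceeds 2/3.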
To compare map functions proposed in the literature, we plot different map functions in Figure 2. The map functions presented are: (1) the map function under the complete-interference model, (2) the map function under the no-interference model, (3) the map function proposed by ![]()
= 0.3, (4) the map function proposed by ![]()
![]()
up to 2/3. Therefore, the map functions proposed in the literature can be well approximated by the map functions under the Cx(Co)2 model. In the context of single-spore data, it was also found that map functions under the chi-square model can approximate most map functions in the literature (![]()
Genetic mapping (two markers):
From two-marker ordered tetrad data, the map distances among two markers and the centromere can be estimated for a given chiasma process model. Here we derive joint ordered tetrad probabilities under the chi-square model. A special case of the chi-square model, the Poisson model, is studied separately, because joint tetrad probabilities can be expressed rather easily under this model. We consider markers on different sides of the centromere and markers on the same side of the centromere in turn.
Markers on different sides of the centromere (Poisson model):
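For a Poisson chiasma process, if the map distance between CEN and $\mathcal{A}$ is $d_1$, and if the centromere could be observed, $p_0(d_1) = \frac{1}{6}(1 + 2e^{-3d_1} + 3e^{-2d_1})$, $p_1(d_1) = \frac{1}{3}(2 - 2e^{-3d_1})$, and $p_2(d_1) = \frac{1}{6}(1 + 2e^{-3d_1} - 3e^{-2d_1})$, where $p_0(d_1)$, $p_1(d_1)$, and $p_2(d_1)$ are the probabilities of P, T, and N between CEN and $\mathcal{A}$, respectively (![]()). If the map distance between CEN and $\mathcal{B}$ is $d_2$, similarly we obtain $p_0(d_2)$, $p_1(d_2)$, and $p_2(d_2)$, the probabilities of P, T, and N between CEN and $\mathcal{B}$. The joint tetrad probability $p_{ij}$ for type (ij), where i, j = 0, 1, or 2, is $p_i(d_1)p_j(d_2)$. Therefore, the five probabilities in Table 5 are:

$$\begin{aligned}
\alpha &= p(P_1) = p_0(d_1)p_0(d_2) + p_2(d_1)p_2(d_2), \\
\beta &= p(N_1) = p_0(d_1)p_2(d_2) + p_2(d_1)p_0(d_2), \\
\gamma &= p(T_{12}) = [p_0(d_1) + p_2(d_1)]\,p_1(d_2), \\
\delta &= p(T_{21}) = p_1(d_1)\,[p_0(d_2) + p_2(d_2)], \\
\epsilon &= p(P_2) + p(T_2) + p(N_2) = p_1(d_1)p_1(d_2).
\end{aligned}$$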
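As a sketch of how these quantities could be evaluated, the code below computes the Poisson tetrad-type probabilities for each arm and combines them into the five group probabilities for markers on opposite sides of the centromere. It assumes independent Poisson chiasma processes on the two arms and the grouping P1 = p00 + p22, N1 = p02 + p20, T12 = p01 + p21, T21 = p10 + p12, with p11 covering P2, T2, and N2 together. Function names are hypothetical.

```python
import math


def poisson_tetrad_probs(d):
    """(P, T, N) probabilities between the centromere and a marker at map distance d."""
    e2, e3 = math.exp(-2.0 * d), math.exp(-3.0 * d)
    return ((1 + 2 * e3 + 3 * e2) / 6, (2 - 2 * e3) / 3, (1 + 2 * e3 - 3 * e2) / 6)


def opposite_side_groups(d1, d2):
    """Five group probabilities for markers at distances d1 and d2 on opposite arms."""
    a = poisson_tetrad_probs(d1)
    b = poisson_tetrad_probs(d2)
    p = [[a[i] * b[j] for j in range(3)] for i in range(3)]
    return {
        "P1": p[0][0] + p[2][2],
        "N1": p[0][2] + p[2][0],
        "T12": p[0][1] + p[2][1],
        "T21": p[1][0] + p[1][2],
        "P2+T2+N2": p[1][1],
    }


print(opposite_side_groups(0.1, 0.3))
```

The five values sum to one, and the outputs satisfy the NCI inequalities P1 >= N1 and T12 + T21 >= 2 N1 for any pair of distances.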
Markers on different sides of the centromere (chi-square model):
The chi-square model $\mathrm{Cx(Co)}^m$ assumes that the chiasma process is stationary. This model has been applied mostly to markers on the same side of the centromere. Because the chiasma interference pattern may be different across the centromere, two chi-square models starting from the centromere toward two different telomeres may be necessary to model the chiasma process. In general, we may model interference across the centromere by relating the two most proximal crossover intermediates on two sides of the centromere. For example, we may assign the most proximal crossover intermediates on both sides of the centromere as the first Co after a Cx, thus inducing a higher chiasma interference in the centromere region than in other regions. Or we may assign these crossover intermediates as the mth Co after a Cx. This will induce a lower chiasma interference in the centromere region than in other regions. For simplicity, in this discussion we assume that starting from the centromere, there are two stationary chiasma processes on the two arms of the chromosome. In this case, there is no chiasma interference between the two arms.
For marker $\mathcal{A}$, if the centromeres from the two parents could be distinguished, the probabilities $p_0$, $p_1$, and $p_2$ of P, T, or N between CEN and $\mathcal{A}$ can be evaluated as follows. Let $D_k(y)$ be as defined above; the probability of having k chiasmata between CEN and $\mathcal{A}$ is $c_k = \boldsymbol{\pi} D_k \mathbf{1}$, with $\boldsymbol{\pi} = (1/p)\mathbf{1}'$. Define $P(y) = \sum_{k=0}^{\infty} p_{k0} D_k(y)$, $T(y) = \sum_{k=0}^{\infty} p_{k1} D_k(y)$, and $N(y) = \sum_{k=0}^{\infty} p_{k2} D_k(y)$, where $p_{k0}$, $p_{k1}$, and $p_{k2}$ were defined in (2). Then $p_0 = \boldsymbol{\pi} P(y) \mathbf{1}$, $p_1 = \boldsymbol{\pi} T(y) \mathbf{1}$, and $p_2 = \boldsymbol{\pi} N(y) \mathbf{1}$. The relation between the map distance d and the parameter y is $d = y/(2p)$. Using these results, $p_0(d_1)$, $p_1(d_1)$, and $p_2(d_1)$ can be obtained. Similarly, the probabilities of P, T, or N between CEN and $\mathcal{B}$, $p_0(d_2)$, $p_1(d_2)$, and $p_2(d_2)$, can be evaluated. The joint tetrad probability $p_{ij}$ is $p_i(d_1)p_j(d_2)$. Therefore, the five probabilities can be obtained as in the Poisson model.
When m = 1, it can be shown that
and
where $d = y/4$. Even for this simple model, the analytical forms are not so simple. No general results for arbitrary m are presented in this article.
Markers on the same side of the centromere (Poisson model):
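For a Poisson chiasma process, if the map distance between CEN and $\mathcal{A}$ is $d_1$, as shown in Equation 3, then $F_{\mathcal{A}}(d_1) = \frac{1}{3}(1 + 2e^{-3d_1})$ and $S_{\mathcal{A}}(d_1) = \frac{2}{3}(1 - e^{-3d_1})$. If the map distance between $\mathcal{A}$ and $\mathcal{B}$ is $d_2$, $p_0(d_2) = \frac{1}{6}(1 + 2e^{-3d_2} + 3e^{-2d_2})$, $p_1(d_2) = \frac{2}{3}(1 - e^{-3d_2})$, and $p_2(d_2) = \frac{1}{6}(1 + 2e^{-3d_2} - 3e^{-2d_2})$, where $p_0(d_2)$, $p_1(d_2)$, and $p_2(d_2)$ are the probabilities of P, T, and N between $\mathcal{A}$ and $\mathcal{B}$, respectively. The six probabilities in Table 6 are

$$\begin{aligned}
p(P_1) &= F_{\mathcal{A}}(d_1)\,p_0(d_2), & p(N_1) &= F_{\mathcal{A}}(d_1)\,p_2(d_2), & p(T_{12}) &= F_{\mathcal{A}}(d_1)\,p_1(d_2), \\
p(P_2) &= S_{\mathcal{A}}(d_1)\,p_0(d_2), & p(N_2) &= S_{\mathcal{A}}(d_1)\,p_2(d_2), & 2p(T_{21}) = 2p(T_2) &= S_{\mathcal{A}}(d_1)\,p_1(d_2).
\end{aligned}$$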
Markers on the same side of the centromere (chi-square model):
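Under the chi-square model, the joint probability $c_{k\ell}$ of having k and $\ell$ chiasmata in the intervals (CEN, $\mathcal{A}$) and ($\mathcal{A}$, $\mathcal{B}$) is $\boldsymbol{\pi} D_k(y_1) D_\ell(y_2) \mathbf{1}$, where $\boldsymbol{\pi} = (1/p)\mathbf{1}'$, $D_k(y)$, and $\mathbf{1}$ were defined above (![]()). Therefore,

$$p_{i_1 i_2} = \sum_{k=0}^{\infty} \sum_{\ell=0}^{\infty} c_{k\ell}\, p_{k i_1}\, p_{\ell i_2},$$

where $p_{k i_1}$ is the conditional probability for FDS ($i_1 = 0$) or SDS ($i_1 = 1$) defined in Equation 1, and $p_{\ell i_2}$ is the conditional tetrad type probability defined in Equation 2. Define $F(y) = \sum_{k=0}^{\infty} \left[\frac{2}{3}\left(\frac{1}{2} + \left(-\frac{1}{2}\right)^k\right)\right] D_k(y)$ and $S(y) = \sum_{k=0}^{\infty} \left[\frac{2}{3}\left(1 - \left(-\frac{1}{2}\right)^k\right)\right] D_k(y)$. For any joint tetrad pattern $(i_1 i_2)$, $p_{i_1 i_2} = \boldsymbol{\pi} M_1(y_1) M_2(y_2) \mathbf{1}$, where $M_1(y_1) = F(y_1)$ or $S(y_1)$ when $i_1$ = 0 or 1, and $M_2(y_2) = P(y_2)$, $T(y_2)$, or $N(y_2)$ when $i_2$ = 0, 1, or 2. The matrices $P(y_2)$, $T(y_2)$, and $N(y_2)$ were defined above. Explicit expressions for F(y), S(y), P(y), T(y), and N(y) were obtained in previous discussion under the CxCo model.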
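The matrix product for two markers on the same side of the centromere can be sketched in plain Python. The code assumes the chi-square construction used in this section (Poisson-weighted D_k matrices, a uniform stationary starting vector over p = m + 1 states, and d = y/(2p)) together with the standard NCI weights for FDS, SDS, P, T, and N given k chiasmata. All names, and the truncation `kmax`, are hypothetical.

```python
import math


def d_matrix(k, y, p):
    """D_k(y): entry (i, j) is the Poisson(y) probability of p*k + j - i intermediates."""
    def entry(i, j):
        n = p * k + j - i
        return math.exp(-y + n * math.log(y) - math.lgamma(n + 1)) if n >= 0 else 0.0
    return [[entry(i, j) for j in range(1, p + 1)] for i in range(1, p + 1)]


# Conditional probabilities given k chiasmata under NCI.
WEIGHTS = {
    "F": lambda k: (2 / 3) * (0.5 + (-0.5) ** k),
    "S": lambda k: (2 / 3) * (1 - (-0.5) ** k),
    "P": lambda k: 1.0 if k == 0 else (1 / 3) * (0.5 + (-0.5) ** k),
    "T": lambda k: (2 / 3) * (1 - (-0.5) ** k),
    "N": lambda k: 0.0 if k == 0 else (1 / 3) * (0.5 + (-0.5) ** k),
}


def weighted_sum(kind, y, p, kmax=60):
    """Truncated sum over k of weight(k) * D_k(y)."""
    total = [[0.0] * p for _ in range(p)]
    for k in range(kmax + 1):
        w = WEIGHTS[kind](k)
        if w == 0:
            continue
        dk = d_matrix(k, y, p)
        for i in range(p):
            for j in range(p):
                total[i][j] += w * dk[i][j]
    return total


def joint_prob(first, second, d1, d2, m):
    """pi * M1(y1) * M2(y2) * 1 for first in {'F','S'} and second in {'P','T','N'}; d1, d2 > 0."""
    p = m + 1
    m1 = weighted_sum(first, 2.0 * p * d1, p)
    m2 = weighted_sum(second, 2.0 * p * d2, p)
    row = [sum(m1[i][j] for i in range(p)) / p for j in range(p)]  # pi * M1
    col = [sum(m2[j]) for j in range(p)]                           # M2 * 1
    return sum(r * c for r, c in zip(row, col))


for a in "FS":
    for b in "PTN":
        print(a, b, round(joint_prob(a, b, 0.2, 0.3, 1), 6))
```

With m = 0 the matrices are 1 × 1 and the product reduces to the Poisson expressions, which gives a convenient sanity check.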
Genetic mapping (multiple markers):
As before, markers on the same side of the centromere and on different sides of the centromere are considered separately.
Markers on the same side of the centromere:
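Consider n markers $\mathcal{A}_1, \mathcal{A}_2, \cdots, \mathcal{A}_n$ in the order of CEN $\mathcal{A}_1 \mathcal{A}_2 \cdots \mathcal{A}_n$. Under NCI, there are $2 \times 3^{n-1}$ different probabilities corresponding to patterns $(i_1 i_2 \ldots i_n)$. These $2 \times 3^{n-1}$ types were mentioned in the discussion of the NCI assumption for the multiple marker case. Denote the map distance between $\mathcal{A}_{r-1}$ and $\mathcal{A}_r$ by $d_r$, where $\mathcal{A}_0$ is the centromere.
For a Poisson chiasma process, from the previous discussion, $F_{\mathcal{A}_1}(d_1) = \frac{1}{3}(1 + 2e^{-3d_1})$, $S_{\mathcal{A}_1}(d_1) = \frac{2}{3}(1 - e^{-3d_1})$, $p_0(d_r) = \frac{1}{6}(1 + 2e^{-3d_r} + 3e^{-2d_r})$, $p_1(d_r) = \frac{2}{3}(1 - e^{-3d_r})$, and $p_2(d_r) = \frac{1}{6}(1 + 2e^{-3d_r} - 3e^{-2d_r})$. The probability of tetrad pattern $(i_1 i_2 \ldots i_n)$ is $f \times \prod_{r=2}^{n} p_{i_r}(d_r)$, where f is $F_{\mathcal{A}_1}(d_1)$ or $S_{\mathcal{A}_1}(d_1)$ when $i_1$ = 0 or 1.
Under the chi-square model, define F(y), S(y), P(y), T(y), and N(y) as above. The probability of tetrad pattern $(i_1 i_2 \ldots i_n)$ is $\boldsymbol{\pi}\left(\prod_{r=1}^{n} M_r\right)\mathbf{1}$, where $\boldsymbol{\pi} = (1/p)\mathbf{1}'$, $M_1 = F(y_1)$ or $S(y_1)$ for $i_1$ = 0 or 1, and $M_r = P(y_r)$, $T(y_r)$, or $N(y_r)$ for $i_r$ = 0, 1, or 2 when $r \ge 2$. The parameter $y_r$ and the map distance $d_r$ are related by $d_r = y_r/(2p)$.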
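For the Poisson case, the product formula for a multilocus pattern on one arm can be sketched directly. The code below assumes the Poisson expressions for the first marker (FDS or SDS) and for each subsequent interval (P, T, or N); `pattern_probability` and its helpers are hypothetical names.

```python
import math
from itertools import product


def first_marker_prob(i1, d1):
    """P(FDS) if i1 == 0, P(SDS) if i1 == 1, for the marker nearest the centromere."""
    e3 = math.exp(-3.0 * d1)
    return (1.0 + 2.0 * e3) / 3.0 if i1 == 0 else 2.0 * (1.0 - e3) / 3.0


def interval_prob(i, d):
    """P(P), P(T), or P(N) for i = 0, 1, 2 in an interval of map length d."""
    e2, e3 = math.exp(-2.0 * d), math.exp(-3.0 * d)
    return [(1 + 2 * e3 + 3 * e2) / 6, 2 * (1 - e3) / 3, (1 + 2 * e3 - 3 * e2) / 6][i]


def pattern_probability(pattern, distances):
    """Probability of pattern (i1, i2, ..., in) given distances (d1, d2, ..., dn)."""
    prob = first_marker_prob(pattern[0], distances[0])
    for i_r, d_r in zip(pattern[1:], distances[1:]):
        prob *= interval_prob(i_r, d_r)
    return prob


distances = [0.1, 0.2, 0.3]
patterns = list(product([0, 1], [0, 1, 2], [0, 1, 2]))
print(len(patterns), sum(pattern_probability(p, distances) for p in patterns))
```

For three markers there are 2 × 3² = 18 patterns, and their probabilities sum to one, as expected for a complete partition of the sample space.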
Markers on different sides of the centromere:
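Consider markers in the order of $\mathcal{B}_{n_2} \cdots \mathcal{B}_1$ CEN $\mathcal{A}_1 \mathcal{A}_2 \cdots \mathcal{A}_{n_1}$. If the two chiasma processes on different sides of the centromere are independent, we may first consider the case in which the centromere could be observed. For tetrad pattern $(i_1 i_2 \ldots i_{n_1})$ on markers CEN, $\mathcal{A}_1$, $\mathcal{A}_2$, $\cdots$, and $\mathcal{A}_{n_1}$, $p(i_1 i_2 \ldots i_{n_1}) = \boldsymbol{\pi}\left(\prod_{r=1}^{n_1} M_r\right)\mathbf{1}$, where $\boldsymbol{\pi} = (1/p)\mathbf{1}'$ and $M_r = P(y_r)$, $T(y_r)$, and $N(y_r)$ for $i_r$ = 0, 1, and 2. The map distance between $\mathcal{A}_{r-1}$ and $\mathcal{A}_r$ is $d_r = y_r/(2p)$. For tetrad pattern $(j_1 j_2 \ldots j_{n_2})$ on markers CEN, $\mathcal{B}_1$, $\mathcal{B}_2$, $\cdots$, and $\mathcal{B}_{n_2}$, $p(j_1 j_2 \ldots j_{n_2}) = \boldsymbol{\pi}\left(\prod_{s=1}^{n_2} M_s\right)\mathbf{1}$, where $M_s = P(y_s)$, $T(y_s)$, and $N(y_s)$ for $j_s$ = 0, 1, and 2. The map distance between $\mathcal{B}_{s-1}$ and $\mathcal{B}_s$ is $d_s = y_s/(2p)$. Because the centromere is not observable, instead of $3^{n_1+n_2}$ probabilities, there are $5 \times 3^{n_1+n_2-2}$ distinct probabilities. These $5 \times 3^{n_1+n_2-2}$ distinct probabilities can be denoted by $(o; i_2 \ldots i_{n_1}; j_2 \ldots j_{n_2})$, where o = 0 corresponds to FDS at both $\mathcal{A}_1$ and $\mathcal{B}_1$ and the tetrad type between $\mathcal{A}_1$ and $\mathcal{B}_1$ being P, o = 1 corresponds to FDS at both $\mathcal{A}_1$ and $\mathcal{B}_1$ and the tetrad type between $\mathcal{A}_1$ and $\mathcal{B}_1$ being N, o = 2 corresponds to FDS at $\mathcal{A}_1$ and SDS at $\mathcal{B}_1$, o = 3 corresponds to SDS at $\mathcal{A}_1$ and FDS at $\mathcal{B}_1$, and o = 4 corresponds to SDS at both $\mathcal{A}_1$ and $\mathcal{B}_1$. Writing $p(i_1 \mathbf{i})$ for $p(i_1 i_2 \ldots i_{n_1})$ and $p(j_1 \mathbf{j})$ for $p(j_1 j_2 \ldots j_{n_2})$, the probability of type $(o; i_2 \ldots i_{n_1}; j_2 \ldots j_{n_2})$ is

$$\begin{cases}
p(0\mathbf{i})\,p(0\mathbf{j}) + p(2\mathbf{i})\,p(2\mathbf{j}), & o = 0, \\
p(0\mathbf{i})\,p(2\mathbf{j}) + p(2\mathbf{i})\,p(0\mathbf{j}), & o = 1, \\
[p(0\mathbf{i}) + p(2\mathbf{i})]\,p(1\mathbf{j}), & o = 2, \\
p(1\mathbf{i})\,[p(0\mathbf{j}) + p(2\mathbf{j})], & o = 3, \\
p(1\mathbf{i})\,p(1\mathbf{j}), & o = 4.
\end{cases}$$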
| RESULTS |
|---|
In this section, the methods developed and described in METHODS are used to find the order of a set of markers and to estimate map distances between the centromere and genetic markers.
Order markers under NCI (two markers):
As discussed above and summarized in Tables 4, 5, and 6, different orders of two markers impose different constraints among the probabilities of the seven groups in Table 3. These constraints can be used to order two markers. Data from ![]()

and 
. The observed numbers of tetrads for the seven groups are shown in Table 7. It is clear that the data satisfy the constraints under the order CEN

but not the constraints under other orders. Thus, the order CEN

can be established. To make the inference more rigorous, the maximum likelihood estimates of the probabilities for the seven groups and the corresponding maximum likelihoods were calculated under the four possible orders: (1) CEN1
, CEN2
, (2) 
CEN
, (3) CEN

, and (4) CEN
. It is straightforward to obtain the maximum likelihood estimates under order (1). To find the maximum likelihood estimates under the linear inequality constraints among the seven probabilities for orders (2), (3), and (4), an expectation maximization (EM) algorithm (![]()
![]()
yielded the largest maximized log-likelihood, thus establishing the order CEN
. The second pair of markers analyzed are
and
. The observations as well as the expected values and maximized log-likelihoods under the four orders are summarized in Table 8. Comparing the maximized log-likelihoods under the four orders leads to the order
CEN
. The last pair of markers studied are
and
(Table 9). The data are consistent with
and
being on different chromosomes. ![]()
Order markers under NCI (three markers):
Consider three markers
,
, and
. Under RSCA, there are 32 distinct probabilities (Appendix 1: Proposition 1). When
,
, and
are on the same chromosome, there are a total of 12 possible orders among them: (1) CEN
, (2) CEN
, (3) CEN
, (4) CEN
, (5) CEN


are on different chromosomes




and
under four possible orders for
under four possible orders for
under four possible orders for