## Abstract

Several solutions have been proposed to extend the transmission disequilibrium test (TDT) to include cases with missing parental genotype. However, completion of the missing parental genotype may bias the test if the underlying missing data mechanism is informative. Furthermore, all these solutions resolve the problem of missing parental genotype, while offspring with missing genotypes are typically ignored. We propose here an extension to the TDT, called *robust TDT* (rTDT), able to handle incomplete genotypes on both parents and children and that does not rest on any assumption about the missing data mechanism. rTDT returns minimum and maximum values of TDT that are consistent with all the possible completions of the missing data. We also show that, in some situations, rTDT can achieve both greater power and greater significance than the popular TDT analysis of incomplete data. rTDT is applied to a database of markers of susceptibility to Crohn's disease and it shows that only 2 of the 11 markers originally associated with the phenotype do not depend on assumptions about the missing data mechanism.

The transmission/disequilibrium test (TDT) is widely used to identify genetic association based upon analysis of parent-proband trios. In its simplest version, the test compares the frequency of transmission of the major and minor alleles from heterozygous parents to affected offspring, assuming complete genotype information on each trio. When genotypic information on either parents or child is missing, it is still common practice to simply discard the entire trio (Weinberg 1999), although several authors (Spielman and Ewens 1996; Curtis 1997) have suggested that disregarding these families may bias the results of the test.

Several approaches have been proposed to enable TDT to handle missing parental genotypic information (Curtis 1997; Spielman and Ewens 1998; Clayton 1999; Knapp 1999; Sun *et al.* 1999). Despite their differences, these approaches reconstruct missing parental genotypes under the assumption that they follow the probability distribution of the fully observed cases. In statistical terms, these approaches assume that parental genotypes are missing completely at random (Little and Rubin 1987) and do not depend on the genotypes themselves. Unfortunately, this assumption may fail in the presence of linkage disequilibrium (Spielman and Ewens 1996), when the probability that a genotype is missing may depend on unobserved alleles. In this situation, data are said to be *informatively missing*, and Allen *et al.* (2003) have recently shown that TDT is prone to bias when missing genotypes follow this pattern. Furthermore, all these approaches are unable to handle trios with missing genotypic information on the proband and these families are typically excluded from the analysis.

This report introduces a robust version of TDT, called *robust TDT* (rTDT), that does not rest on a particular assumption about the missing genotype mechanism and is able to handle incomplete information about probands. This method falls within a novel approach to robust statistical inference from incomplete databases based on probability intervals (Ramoni and Sebastiani 2001a,b; Sebastiani and Ramoni 2001). The main intuition behind our approach is that the available information, albeit incomplete, constrains the space of possible statistics. We can therefore analytically build an envelope for the set of statistics consistent with all the possible completions of the data and draw robust inference on the basis of the envelope bounds. rTDT follows the same intuition to handle missing genotypes on either parents or probands and returns minimum and maximum values of the TDT statistic consistent with all possible completions of the missing data. These minimum and maximum values can then be used to infer linkage/association regardless of the missing genotype mechanism. If the minimum value of rTDT is already sufficient to ensure a particular significance level against the null hypothesis of no linkage/association, we can conclude that the inference is robust with respect to any pattern of missing genotypes on either parents or children. Similarly, if the maximum value of rTDT is below a threshold to detect linkage/association, we can conclude that the lack of evidence against the null hypothesis is independent of the missing genotype mechanism. We also show that, for some patterns of missing data, rTDT is more powerful than TDT.

To find the minimum and maximum values of the TDT statistic, we first study the mathematical properties of the McNemar statistic used to test transmission disequilibrium. Convexity of the test statistic implies that there is a simple solution to finding its minimum and maximum values. We apply rTDT to a database of trios recently used to identify markers of susceptibility to Crohn's disease (Rioux *et al.* 2001). Our analysis shows that only 2 of the 11 markers originally associated with the phenotype do not depend on assumptions about the missing data mechanism.

## ROBUST TRANSMISSION/DISEQUILIBRIUM TEST

Suppose our data set 𝒟 consists of the genotypes of *n* trios, all with affected children. We denote by 1 and 2 the two possible alleles, so that the genotype of each individual will be one of the three unordered pairs (1, 1), (1, 2), and (2, 2). The genotype (1, 2) corresponds to a heterozygous individual, while (1, 1) and (2, 2) are individuals homozygous for the major (1) and minor (2) allele, respectively. We denote the genotype of a trio by a triplet, such as (1, 1), (1, 2), (1, 2), where the first two pairs represent the genotypes of the parents and the last pair represents the genotype of the affected child. Let *b* and *c* denote the numbers of major and minor alleles transmitted by the heterozygous parents. The TDT statistic is the McNemar test statistic

$$t(b, c) = \frac{(b - c)^2}{b + c}$$

and, under the null hypothesis of no association/no linkage, *t* follows a chi-square distribution with 1 d.f.
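As a minimal sketch, assuming the standard McNemar form *t* = (*b* − *c*)^{2}/(*b* + *c*), the statistic and its 1-d.f. chi-square tail probability can be computed as follows. The function names `tdt_statistic` and `chi2_sf_1df`, and the convention of returning 0 when no heterozygous parent transmits an allele, are illustrative assumptions and are not taken from the rTDT program.

```python
import math

def tdt_statistic(b: int, c: int) -> float:
    """McNemar/TDT statistic (b - c)^2 / (b + c).

    Returns 0.0 when b + c == 0 (an assumed convention for the sketch).
    """
    if b + c == 0:
        return 0.0
    return (b - c) ** 2 / (b + c)

def chi2_sf_1df(t: float) -> float:
    """P(X > t) for X chi-square with 1 d.f., using P = erfc(sqrt(t / 2))."""
    return math.erfc(math.sqrt(t / 2.0))

# Illustrative counts: 20 transmissions of allele 1, 10 of allele 2.
t = tdt_statistic(20, 10)
print(f"t = {t:.2f}, P-value = {chi2_sf_1df(t):.3f}")
```

The tail probability relies only on the standard library, because a chi-square variable with 1 d.f. is the square of a standard normal variable.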

Suppose now that the data set 𝒟 consists of *n*_{o} trios with complete genotypes and *n*_{m} trios with incomplete genotype, where a trio genotype is incomplete if the genotype of either one parent or the proband is missing. We denote a missing genotype by the pair (0, 0). We regard the incomplete data set as the result of a deletion process applied to a complete but unknown data set 𝒟_{c}, consisting of *n*_{o} + *n*_{m} trios with known genotypes. We then define an *admissible completion* of 𝒟 to be any complete data set 𝒟_{c} from which 𝒟 can be obtained by some deletion process. We denote by {𝒟_{c}} the set of admissible completions. Because the trio genotypes in any data set 𝒟_{c} have to be consistent with the rules of genetic inheritance, the incomplete data set will consist only of either trios with known genotypes or *admissible incomplete trios*.

Let *b*_{o} and *c*_{o} be the number of major and minor alleles transmitted from heterozygous parents in the *n*_{o} trios with complete genotype, and let *t*_{o} = *t*(*b*_{o}, *c*_{o}) be the value of the TDT statistic. We refer to a pair (*b*, *c*) that can be computed from an admissible completion of 𝒟 as an *admissible* value for (*b*, *c*). We also use the term *admissible increment* to denote one of the differences *b* − *b*_{o}, *c* − *c*_{o}, or *b* + *c* − (*b*_{o} + *c*_{o}), where (*b*, *c*) is admissible. Our objective is to use the information available from the incomplete trios to bound all admissible pairs (*b*, *c*) and identify the minimum and maximum values to the TDT statistic, as shown by Tables 1 and 2 and illustrated by the following example.

Suppose we have a data set 𝒟 of *n* = 100 families, 99 of which present a complete genotype and *b*_{o} = 20, *c*_{o} = 10. The value *t*_{o} of the test statistic in the complete cases is *t*_{o} = (20 − 10)^{2}/30 = 3.33, so that there is no evidence against the null hypothesis of no linkage/association. In the family with incomplete genotype we know that one of the parents has genotype (1, 1), while the genotypes of the other parent and of the child are unknown. Therefore, the trio genotype is the triplet (0, 0), (1, 1), (0, 0) and, given the rules of inheritance, it has only four admissible completions:

(1, 1), (1, 1), (1, 1)

(1, 2), (1, 1), (1, 1)

(1, 2), (1, 1), (1, 2)

(2, 2), (1, 1), (1, 2).

Each admissible completion will yield admissible values for *b* and *c*. Cases 1 and 4 do not change the value of the test statistic because both parents are homozygous. In case 2, the heterozygous parent transmits the allele 1, so that *c* = *c*_{o} and *b* = *b*_{o} + 1 and the value of the test statistic would be (21 − 10)^{2}/31 = 3.90. In case 3, the heterozygous parent transmits the allele 2 so that *c* = *c*_{o} + 1 and *b* = *b*_{o}, yielding (20 − 11)^{2}/31 = 2.61. Therefore, the admissible values of the test statistic lie in the interval [2.61; 3.90]. Note that if the genotype of the incomplete trio was as in case 2, the data would provide some weak evidence against the null hypothesis, while case 3 would weaken the evidence against the null hypothesis even further.
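The enumeration in this example can be reproduced mechanically. The sketch below assumes only Mendelian inheritance (the child receives one allele from each parent) and the coding (0, 0) for a missing genotype; the helper names `completions`, `increments`, and `tdt` are hypothetical.

```python
from itertools import product

GENOTYPES = [(1, 1), (1, 2), (2, 2)]
MISSING = (0, 0)

def completions(pattern):
    """Yield complete trios (parent, parent, child) consistent with `pattern`."""
    options = [GENOTYPES if g == MISSING else [g] for g in pattern]
    for p1, p2, child in product(*options):
        possible = {tuple(sorted((a, b))) for a in p1 for b in p2}
        if child in possible:
            yield p1, p2, child

def increments(p1, p2, child):
    """Return (db, dc): alleles 1 and 2 transmitted by heterozygous parents."""
    n_het = [p1, p2].count((1, 2))
    # Copies of allele 1 in the child not supplied by (1, 1) parents
    # must have been transmitted by heterozygous parents.
    db = child.count(1) - [p1, p2].count((1, 1))
    return db, n_het - db

def tdt(b, c):
    return (b - c) ** 2 / (b + c)

b_o, c_o = 20, 10                       # counts from the 99 complete trios
pattern = (MISSING, (1, 1), MISSING)    # the single incomplete trio
values = [tdt(b_o + db, c_o + dc)
          for db, dc in (increments(*trio) for trio in completions(pattern))]
print(round(min(values), 2), round(max(values), 2))
```

Running the sketch on the pattern above visits the four admissible completions and recovers the bounds 2.61 and 3.90 (printed as 3.9).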

This example shows that including the partial information conveyed by the trios with incomplete genotypes can modify the conclusions based on the complete data alone. In particular, if one is looking for evidence against the null hypothesis, disregarding the incomplete trios may weaken the sample evidence against the null hypothesis with consequent loss of power, or it may lead to inferring an association on the basis of the assumptions on the missing data mechanisms alone, with consequent loss of significance.

## DERIVATION OF rTDT

Figure 1 plots the TDT statistic as a function of *b* and *c*. Note that *t* is a convex function, and it is symmetric in *b* and *c*. Furthermore, *t* is increasing in *b* and decreasing in *c* when *b* > *c*, it is decreasing in *b* and increasing in *c* when *b* < *c*, and *t* = 0 whenever *b* = *c*. By convexity, the function *t* is maximized at one of the extreme points of its domain region (Rockafellar 1970). By monotonicity, the function *t* is also minimized at one of the extreme points of the domain region, unless the domain of the function contains points of the line *b* = *c*, in which case the minimum of the function would be 0. Therefore, to find the extreme points of the TDT statistic *t*, we need to characterize its domain region 𝒞 defined by the admissible values (*b*, *c*).

### Characterization of the region 𝒞:

Simple enumeration shows that there are seven possible trios in which either the genotype of one of the parents is missing or at least one parent is heterozygous. These patterns are listed in column 2 of Table 1. The same table reports, for each parental genotype, the admissible incomplete trios. The complete list of 17 admissible trios is listed in Table 2, together with the admissible increments of *b* and *c* induced by the admissible completions. Tables A1 and A2 in the appendix show the admissible increments for each case. Cases 9 and 11 in Table 2 allow us to increase only *b* (case 9) and *c* (case 11), while the other cases can increase either *b* or *c*, neither of them, or both. We denote by *n*_{k} the frequency of type *k* incomplete cases in the data set, where the index *k* refers to the case *k* in Table 2. We first characterize the domain region 𝒞 of the test statistic *t*.

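The domain 𝒞 of the TDT statistic *t* is defined by

$$b_1 \le b \le b_M, \qquad c_1 \le c \le c_M, \qquad b_1 + c_3 \le b + c \le b_M + c_2,$$

where

$$\begin{aligned}
b_1 &= b_o + n_9,\\
b_2 &= b_o + 2n_2 + n_3 + n_6 + 2n_9 + n_{10} + n_{13},\\
b_3 &= b_o + n_8 + n_9 + n_{10} + n_{15} + 2n_{16} + n_{17},\\
b_M &= b_o + 2(n_1 + n_2 + n_8 + n_9 + n_{16}) + n_3 + n_5 + n_6 + n_{10} + n_{12} + n_{13} + n_{15} + n_{17},\\
c_1 &= c_o + n_{11},\\
c_2 &= c_o + n_3 + 2n_4 + n_7 + n_{10} + 2n_{11} + n_{14},\\
c_3 &= c_o + n_8 + n_{10} + n_{11} + n_{15} + 2n_{16} + n_{17},\\
c_M &= c_o + 2(n_1 + n_4 + n_8 + n_{11} + n_{16}) + n_3 + n_5 + n_7 + n_{10} + n_{12} + n_{14} + n_{15} + n_{17}.
\end{aligned}$$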

This region is drawn in Figure 2 and its vertices are defined by *A* = (*b*_{M}, *c*_{1}), *A*′ = (*b*_{M}, *c*_{2}), *A*″ = (*b*_{3}, *c*_{1}), *B* = (*b*_{1}, *c*_{M}), *B*′ = (*b*_{2}, *c*_{M}), and *B*″ = (*b*_{1}, *c*_{3}). We first show that *b*_{M} is the largest admissible value for *b*. We begin by noting that the incomplete cases *k* = 4, 7, 11, 14 do not admit any completion that would increase *b*, while all cases *k* = 1, 2, 3, 5, 6, 8, 9, 10, 12, 13, 15, 16, 17 can be completed to increase *b* the most without increasing *c*. Using the results summarized in Table 2, it is easy to check that the largest admissible increment of *b* for each of the incomplete cases *k* = 1, 2, 8, 9, 16 is 2, while the largest admissible increment for *b* from each of the incomplete cases *k* = 3, 5, 6, 10, 12, 13, 15, 17 is 1. Therefore, *b*_{M} is the largest admissible value for *b*.

Note that, with the exception of cases *k* = 3, 10, there is a unique completion of each of the incomplete cases *k* = 1, 2, 3, 5, 6, 8, 9, 10, 12, 13, 15, 16, 17 that yields the largest admissible increment for *b* without changing *c*. On the other hand, each of the cases 3 and 10 can be completed to increase *b* by 1 without changing *c* or to increase both *b* and *c* by 1. Furthermore, with the exception of case *k* = 11, in which every completion of an incomplete trio leads to an increase of *c* by 1 or 2, the cases *k* = 4, 7, 14 can be completed so that *c* does not change, or it changes by 1 or 2. Therefore, when *b* = *b*_{M}, the minimum admissible value of *c* is *c*_{1} = *c*_{o} + *n*_{11} and the maximum admissible value of *c* is *c*_{2} = *c*_{o} + *n*_{3} + 2*n*_{4} + *n*_{7} + *n*_{10} + 2*n*_{11} + *n*_{14}. It is also straightforward to show that every value *c*_{1} < *c*′ < *c*_{2} when *b* = *b*_{M} is admissible because a completion of the data that would yield *b* = *b*_{M} and *c* = *c*′ exists.

It can be similarly shown that *c*_{M} is the largest admissible value of *c* and that *b*_{1} ≤ *b* ≤ *b*_{2} when *c* = *c*_{M}.

Consider now the line joining the vertices *A*′ and *B*′ and the line joining the vertices *A*″ and *B*″ in Figure 2. By definition, *b*_{M} − *b*_{2} = *c*_{M} − *c*_{2} and *b*_{3} − *b*_{1} = *c*_{3} − *c*_{1}.

Using these properties, it is easy to show that the segment joining *A*″ to *B*″ is on the line with equation *b* + *c* = *b*_{3} + *c*_{1} = *b*_{1} + *c*_{3}, while the segment joining *A*′ to *B*′ is on the line with equation *b* + *c* = *b*_{2} + *c*_{M} = *b*_{M} + *c*_{2}. Because *c*_{2} is the largest admissible value of *c* when *b* = *b*_{M}, *b*_{M} + *c*_{2} = *b*_{2} + *c*_{M} is the largest admissible value of *b* + *c*. Therefore, every admissible value for *b* and *c* satisfies the inequality *b* + *c* ≤ *b*_{M} + *c*_{2} and lies on the left of the line with equation *c* = −*b* + *b*_{M} + *c*_{2}. A similar argument shows that both vertices *B*″ = (*b*_{1}, *c*_{3}) and *A*″ = (*b*_{3}, *c*_{1}) represent admissible values of (*b*, *c*), as well as every pair (*b*, *c*) with *b* = *b*_{1} and *c*_{3} ≤ *c* ≤ *c*_{M} or *c* = *c*_{1} and *b*_{3} ≤ *b* ≤ *b*_{M}. Furthermore, *b*_{1} + *c*_{3} = *b*_{3} + *c*_{1} is the minimum admissible value of *b* + *c*, so that every admissible pair (*b*, *c*) will satisfy the inequality *b* + *c* ≥ *b*_{1} + *c*_{3} = *b*_{3} + *c*_{1}.
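The bounds that define the region can also be obtained by brute force rather than from closed-form sums. The sketch below assumes that the incomplete trios can be completed independently of one another, so that per-trio extreme increments simply add up. The case numbering follows the 17 incomplete patterns discussed in the text, while `CASES`, `case_increments`, and `domain_bounds` are hypothetical names introduced for illustration.

```python
from itertools import product

GENOTYPES = [(1, 1), (1, 2), (2, 2)]
M = (0, 0)  # missing genotype

# Incomplete patterns (parent, parent, child), numbered 1-17 as in the text.
CASES = {
    1: (M, M, M), 2: (M, M, (1, 1)), 3: (M, M, (1, 2)), 4: (M, M, (2, 2)),
    5: (M, (1, 1), M), 6: (M, (1, 1), (1, 1)), 7: (M, (1, 1), (1, 2)),
    8: (M, (1, 2), M), 9: (M, (1, 2), (1, 1)), 10: (M, (1, 2), (1, 2)),
    11: (M, (1, 2), (2, 2)), 12: (M, (2, 2), M), 13: (M, (2, 2), (1, 2)),
    14: (M, (2, 2), (2, 2)), 15: ((1, 1), (1, 2), M),
    16: ((1, 2), (1, 2), M), 17: ((1, 2), (2, 2), M),
}

def case_increments(pattern):
    """Set of (db, dc) pairs over all Mendelian-consistent completions."""
    options = [GENOTYPES if g == M else [g] for g in pattern]
    found = set()
    for p1, p2, child in product(*options):
        possible = {tuple(sorted((a, b))) for a in p1 for b in p2}
        if child not in possible:
            continue
        n_het = [p1, p2].count((1, 2))
        # Copies of allele 1 not supplied by (1, 1) parents come from heterozygotes.
        db = child.count(1) - [p1, p2].count((1, 1))
        found.add((db, n_het - db))
    return found

def domain_bounds(b_o, c_o, n):
    """n maps case number k to its count n_k; returns the six bounds."""
    lo_b = hi_b = b_o
    lo_c = hi_c = c_o
    lo_s = hi_s = b_o + c_o
    for k, pattern in CASES.items():
        incs = case_increments(pattern)
        n_k = n.get(k, 0)
        lo_b += n_k * min(db for db, _ in incs)
        hi_b += n_k * max(db for db, _ in incs)
        lo_c += n_k * min(dc for _, dc in incs)
        hi_c += n_k * max(dc for _, dc in incs)
        lo_s += n_k * min(db + dc for db, dc in incs)
        hi_s += n_k * max(db + dc for db, dc in incs)
    return {"b_1": lo_b, "b_M": hi_b, "c_1": lo_c, "c_M": hi_c,
            "min_sum": lo_s, "max_sum": hi_s}

# Illustrative counts: two trios of case 9 and three of case 11.
print(domain_bounds(20, 10, {9: 2, 11: 3}))
```

Under these assumptions, `max_sum - b_M` plays the role of *c*_{2} and `min_sum - b_1` the role of *c*_{3}, so the six vertices of the region can be read off the returned values.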

### Extreme points of the TDT statistic:

From the characterization of the domain region 𝒞 we deduce that the vertices *A*, *A*′, *A*″ and *B*, *B*′, *B*″ are the extreme points. As we noted before, by convexity, the function *t* is maximized at one of the extreme points of its domain region. By monotonicity, the function *t* is also minimized at one of the extreme points of the domain region, unless the domain of the function contains points of the line *b* = *c*, in which case the minimum of the function will be 0. The minimum and maximum values of the TDT statistic are as follows:

1. If *b*_{1} ≥ *c*_{M}, then *t*_{min} = *t*(*b*_{1}, *c*_{M}) and *t*_{max} = *t*(*b*_{M}, *c*_{1}).
2. If *b*_{M} ≤ *c*_{1}, then *t*_{min} = *t*(*b*_{M}, *c*_{1}) and *t*_{max} = *t*(*b*_{1}, *c*_{M}).
3. In all other cases, *t*_{min} = 0 and *t*_{max} = max{*t*(*b*_{1}, *c*_{M}), *t*(*b*_{M}, *c*_{1})}.

Consider first condition 1. The inequality *b*_{1} ≥ *c*_{M} implies that the domain 𝒞 is on the right of the line *b* = *c* (see Figure 2) so that the TDT statistic is increasing in *b* and decreasing in *c*. Therefore, the maximum is found when *b* is maximum (*b* = *b*_{M}) and *c* is at its allowed minimum value (*c* = *c*_{1}). This is point *A* in Figure 2. Similarly, the minimum of the function is achieved when *b* is minimum (*b* = *b*_{1}) and *c* is maximum (*c* = *c*_{M}). This is point *B* in Figure 2. The proof of condition 2 is symmetrical to condition 1 and uses the fact that, by *b*_{M} ≤ *c*_{1}, the domain region 𝒞 is on the left of the line *b* = *c* so that the TDT statistic is decreasing in *b* and increasing in *c*. In condition 3, one can reason in a similar manner to show that, by convexity, the maximum of the *t* statistic is the maximum between conditions 1 and 2: *t*_{max} = max{*t*(*b*_{1}, *c*_{M}), *t*(*b*_{M}, *c*_{1})}. Furthermore, the domain 𝒞 contains points of the line *b* = *c*, so that the minimum is 0. Note that, in this situation, a completion of the data set for which *b* = *c* may not exist, so that the value 0 is a lower bound that may not be attained.
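The three conditions translate directly into a short routine. This is a sketch under the assumption that the bounds *b*_{1}, *b*_{M}, *c*_{1}, and *c*_{M} have already been computed; the function name `rtdt_bounds` is illustrative and is not the interface of the rTDT program.

```python
def tdt(b, c):
    """McNemar/TDT statistic, taken as 0 when b + c == 0 (assumed convention)."""
    return 0.0 if b + c == 0 else (b - c) ** 2 / (b + c)

def rtdt_bounds(b_1, b_M, c_1, c_M):
    """Return (t_min, t_max) over the admissible region."""
    if b_1 >= c_M:
        # Region lies on the side b >= c: t increases in b, decreases in c.
        return tdt(b_1, c_M), tdt(b_M, c_1)
    if b_M <= c_1:
        # Region lies on the side b <= c: the roles of the two corners swap.
        return tdt(b_M, c_1), tdt(b_1, c_M)
    # Region straddles the line b = c: 0 is a lower bound.
    return 0.0, max(tdt(b_1, c_M), tdt(b_M, c_1))

# Example from the text: b_o = 20, c_o = 10 and one trio that can add
# at most 1 to b or 1 to c.
t_min, t_max = rtdt_bounds(b_1=20, b_M=21, c_1=10, c_M=11)
print(round(t_min, 2), round(t_max, 2))
```

For the single-trio example discussed earlier, the routine returns the same interval, 2.61 to 3.90, that was found by enumerating the completions.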

### Interpretation:

In practice, we can use rTDT to look for robust evidence against the null hypothesis of no linkage/association by examining the lower bound: a small *P*-value *P*(χ^{2} > *t*_{min}) will provide evidence against the null hypothesis, regardless of the missing data mechanism. If *t*_{min} ≤ *t*_{o}, a significant result with the rTDT analysis implies a significant result with the traditional TDT analysis, but not the other way around. Conversely, rTDT can show that significant TDT results may be due to the assumed missing genotype mechanisms. Furthermore, for all those markers for which *P*(χ^{2} > *t*_{max}) is large, the rTDT analysis will be nonsignificant, regardless of the missing genotype mechanism. In this case, if *t*_{o} ≤ *t*_{max}, a nonsignificant result with rTDT will imply a nonsignificant result with TDT, but not the other way around. All those situations in which the TDT results are significant and the rTDT results are not will cast doubt on the validity of the TDT analysis that may be biased by the missing genotype mechanism.

It is also important to emphasize that rTDT can increase the power of TDT. This property is a consequence of the fact that *t*_{min} may be greater than *t*_{o}. For example, suppose that the incomplete trios are all as case 9 in Table 2, so that, in each incomplete trio, we know that one parent is heterozygous and the child is always homozygous (1, 1). In this situation *n*_{9} = *n*_{m} and, if *b*_{o} > *c*_{o}, *t*_{min} = (*b*_{o} + *n*_{9} − *c*_{o})^{2}/(*b*_{o} + *n*_{9} + *c*_{o}) > *t*_{o}, with consequent increase of power for rTDT. Similarly, it can be shown that if all incomplete trios are as in case 11, and *b*_{o} ≥ *c*_{o} + 2*n*_{11}, then *t*_{max} = *t*(*b*_{o}, *c*_{o} + *n*_{11}) < *t*_{o}. In this case, ignoring missing data can claim false linkage, with increased significance for rTDT.
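A small numerical illustration of the power argument, using made-up counts: with *b*_{o} = 20 and *c*_{o} = 10, every incomplete trio of case 9 adds at least 1 and at most 2 to *b* and nothing to *c*, so both bounds move above *t*_{o}. The counts and the loop values are assumptions chosen for the sketch.

```python
def tdt(b, c):
    return (b - c) ** 2 / (b + c)

b_o, c_o = 20, 10                 # illustrative complete-trio counts
t_o = tdt(b_o, c_o)               # statistic from complete trios only

for n_9 in (1, 3, 5):             # number of incomplete trios of case 9
    t_min = tdt(b_o + n_9, c_o)       # each trio adds the minimum of 1 to b
    t_max = tdt(b_o + 2 * n_9, c_o)   # each trio adds the maximum of 2 to b
    print(n_9, round(t_o, 2), round(t_min, 2), round(t_max, 2))
```

In this illustration *t*_{o} stays below the 5% critical value of 3.84, whereas with five such trios the lower bound already exceeds it.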

## SCREENING FOR MARKERS ASSOCIATED WITH CROHN'S DISEASE

rTDT was applied to 103 SNPs on the IBD5 locus of chromosome 5q31 (available from http://www-genome.wi.mit.edu/mpg/idrg/IBD5/haplodata.html) genotyped in 129 trios with at least one heterozygous parent and one affected child (Daly *et al.* 2001; Rioux *et al.* 2001). Data were collected to identify genetic variants conferring susceptibility to Crohn's disease. Rioux *et al.* (2001) identify 11 SNPs with highly significant TDT results using only those trios with complete genotype, shown in Table 3. An analysis of the available data shows 2 more SNPs not included in the original report. These SNPs are included in Table 3.

We first investigate whether the probability of missing genotype changes with the marker position. In the presence of linkage disequilibrium, a spatial pattern in the proportion of missing genotypes could indicate that the missing data mechanism is informative. Consistent proportions of missing genotype across segments of the reference sequence could then be confused with association to phenotype, and the missing genotype mechanism would be informative (Spielman and Ewens 1996). Figure 3A plots the proportion of missing data relative to the marker positions in the reference sequence and suggests the presence of regions with higher probability of missing genotypes. To verify this hypothesis, we tried to decompose the spatial pattern of missing data into two groups using hidden Markov models (Rabiner 1989). We modeled the state variable as a binary variable and, conditional on the state variable, we modeled the observed proportion of missing data by log-normal distributions. We used the implementation of Gibbs sampling in WinBUGS 1.4 (Thomas *et al.* 1992) to identify the state of the hidden variable and the parameters of the two log-normal distributions. The method identified two different spatial patterns of missing data that are highlighted in Figure 3B in red and black. The plot shows that the markers in the contiguous regions at positions 411–443, 495–577, and 594–718 kb are characterized by a higher probability of missing genotypes (0.19 ± 0.002). We analyzed the data using rTDT to show how missing genotype information can actually change the conclusions drawn by simply disregarding the incomplete trios.

When the incomplete trios are considered in the analysis, 2 of 103 SNPs have *t*_{max} < 3.84 (95% significance), so that missing data would not alter the lack of evidence against the null hypothesis of no association/no linkage. If the threshold is changed to have increasing significance against the null hypothesis, 7 SNPs have *t*_{max} < 6.63 (99% significance), 18 SNPs have *t*_{max} < 10.82 (99.9% significance), and 35 SNPs have *t*_{max} < 15.13 (99.99% significance). When we try to identify those SNPs for which there is evidence of linkage, regardless of the missing data mechanism, 11 of 103 SNPs have *t*_{min} > 3.84, and only 1 has *t*_{min} > 10.82 (99.9% significance), the threshold selected by Rioux *et al.* (2001) to account for multiple comparisons. This is the marker IGR2055a_1 for which we can conclude that the substantial evidence of linkage is independent of any missing data mechanism.
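The thresholds quoted here are upper quantiles of the chi-square distribution with 1 d.f. As a quick check, the sketch below converts each threshold into its tail probability through the identity *P*(χ^{2} > *t*) = erfc(√(*t*/2)); the helper name `chi2_sf_1df` is illustrative.

```python
import math

def chi2_sf_1df(t: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 d.f."""
    return math.erfc(math.sqrt(t / 2.0))

for threshold in (3.84, 6.63, 10.82, 15.13):
    print(threshold, f"{chi2_sf_1df(threshold):.5f}")
```

The printed tail probabilities are close to 0.05, 0.01, 0.001, and 0.0001, matching the 95%, 99%, 99.9%, and 99.99% significance levels.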

Table 4 shows the admissible values *b*_{1}, *b*_{M}, *c*_{1}, *c*_{M} for *b*, *c* and the minimum value of the TDT statistic *t*_{min} for those SNPs that were reported as highly significant in the original analysis. Results are still significant only for marker IGR3096a_1, although IGR2055a_1, with a significance of 99.92%, remains very close to the threshold. For all the other markers, inclusion of the partial information provided by the incomplete genotypes weakens the evidence of linkage/association. In particular, for markers IGR2230a_1 and IGR3097a_1, the minimum value of the TDT statistic is 0 and the large proportion of missing genotypes (26 and 14%, respectively) precludes any conclusions about their role in susceptibility to Crohn's disease.

## CONCLUSIONS

Missing genotypic information is a common problem in family-based studies that limits the applicability of the TDT statistic. Several authors have proposed extensions of the TDT that either disregard incomplete trios or enforce some assumptions about the missing genotype mechanism with potential for serious bias. To avoid both limitations, we have introduced the rTDT. Contrary to other methods, the rTDT analysis is robust with respect to the missing genotype mechanism in both parents and children. We have also shown that, in some situations, rTDT can achieve both greater power and greater significance than the traditional TDT analysis discarding incomplete trios.

One interesting issue is whether the trios with completely missing genotype information should be used in the identification of the minimum and maximum values of the TDT statistic. We argue that, when the analysis concerns one single marker, these trios do not provide any information and coherent analysis should actually disregard them. In this case, only *b*_{M} and *c*_{M} change, as we removed the frequency *n*_{1} in their calculation. However, in those analyses that measure the strength of association between several markers, it seems to be important to maintain a common sample size and the use of rTDT provides a principled way to retain all the incomplete trios.

A program implementing rTDT is available from http://www.ugr.es/~mabad/rTDT/rTDT.html.

| Case *k* | Incomplete parent | Incomplete parent | Incomplete child | Completed parent | Completed parent | Completed child | Increment of *b* | Increment of *c* |
|---|---|---|---|---|---|---|---|---|
| 1 | (0, 0) | (0, 0) | (0, 0) | (1, 1) | (1, 1) | (1, 1) | 0 | 0 |
| | | | | (1, 1) | (1, 2) | (1, 1) | 1 | 0 |
| | | | | (1, 1) | (1, 2) | (1, 2) | 0 | 1 |
| | | | | (1, 1) | (2, 2) | (1, 2) | 0 | 0 |
| | | | | (1, 2) | (1, 2) | (1, 1) | 2 | 0 |
| | | | | (1, 2) | (1, 2) | (1, 2) | 1 | 1 |
| | | | | (1, 2) | (1, 2) | (2, 2) | 0 | 2 |
| | | | | (2, 2) | (1, 2) | (1, 2) | 1 | 0 |
| | | | | (2, 2) | (1, 2) | (2, 2) | 0 | 1 |
| | | | | (2, 2) | (2, 2) | (2, 2) | 0 | 0 |
| 2 | (0, 0) | (0, 0) | (1, 1) | (1, 1) | (1, 1) | (1, 1) | 0 | 0 |
| | | | | (1, 1) | (1, 2) | (1, 1) | 1 | 0 |
| | | | | (1, 2) | (1, 2) | (1, 1) | 2 | 0 |
| 3 | (0, 0) | (0, 0) | (1, 2) | (1, 1) | (1, 2) | (1, 2) | 0 | 1 |
| | | | | (1, 1) | (2, 2) | (1, 2) | 0 | 0 |
| | | | | (1, 2) | (1, 2) | (1, 2) | 1 | 1 |
| | | | | (1, 2) | (2, 2) | (1, 2) | 1 | 0 |
| 4 | (0, 0) | (0, 0) | (2, 2) | (1, 2) | (1, 2) | (2, 2) | 0 | 2 |
| | | | | (1, 2) | (2, 2) | (2, 2) | 0 | 1 |
| | | | | (2, 2) | (2, 2) | (2, 2) | 0 | 0 |
| 5 | (0, 0) | (1, 1) | (0, 0) | (1, 1) | (1, 1) | (1, 1) | 0 | 0 |
| | | | | (1, 2) | (1, 1) | (1, 1) | 1 | 0 |
| | | | | (1, 2) | (1, 1) | (1, 2) | 0 | 1 |
| | | | | (2, 2) | (1, 1) | (1, 2) | 0 | 0 |
| 6 | (0, 0) | (1, 1) | (1, 1) | (1, 1) | (1, 1) | (1, 1) | 0 | 0 |
| | | | | (1, 2) | (1, 1) | (1, 1) | 1 | 0 |
| 7 | (0, 0) | (1, 1) | (1, 2) | (1, 2) | (1, 1) | (1, 2) | 0 | 1 |
| | | | | (2, 2) | (1, 1) | (1, 2) | 0 | 0 |

| Case *k* | Incomplete parent | Incomplete parent | Incomplete child | Completed parent | Completed parent | Completed child | Increment of *b* | Increment of *c* |
|---|---|---|---|---|---|---|---|---|
| 8 | (0, 0) | (1, 2) | (0, 0) | (1, 1) | (1, 2) | (1, 1) | 1 | 0 |
| | | | | (1, 1) | (1, 2) | (1, 2) | 0 | 1 |
| | | | | (1, 2) | (1, 2) | (1, 1) | 2 | 0 |
| | | | | (1, 2) | (1, 2) | (1, 2) | 1 | 1 |
| | | | | (1, 2) | (1, 2) | (2, 2) | 0 | 2 |
| | | | | (2, 2) | (1, 2) | (1, 2) | 1 | 0 |
| | | | | (2, 2) | (1, 2) | (2, 2) | 0 | 1 |
| 9 | (0, 0) | (1, 2) | (1, 1) | (1, 1) | (1, 2) | (1, 1) | 1 | 0 |
| | | | | (1, 2) | (1, 2) | (1, 1) | 2 | 0 |
| 10 | (0, 0) | (1, 2) | (1, 2) | (1, 1) | (1, 2) | (1, 2) | 0 | 1 |
| | | | | (1, 2) | (1, 2) | (1, 2) | 1 | 1 |
| | | | | (2, 2) | (1, 2) | (1, 2) | 1 | 0 |
| 11 | (0, 0) | (1, 2) | (2, 2) | (1, 2) | (1, 2) | (2, 2) | 0 | 2 |
| | | | | (2, 2) | (1, 2) | (2, 2) | 0 | 1 |
| 12 | (0, 0) | (2, 2) | (0, 0) | (1, 1) | (2, 2) | (1, 2) | 0 | 0 |
| | | | | (1, 2) | (2, 2) | (1, 2) | 1 | 0 |
| | | | | (1, 2) | (2, 2) | (2, 2) | 0 | 1 |
| | | | | (2, 2) | (2, 2) | (2, 2) | 0 | 0 |
| 13 | (0, 0) | (2, 2) | (1, 2) | (1, 1) | (2, 2) | (1, 2) | 0 | 0 |
| | | | | (1, 2) | (2, 2) | (1, 2) | 1 | 0 |
| | | | | (2, 2) | (2, 2) | (1, 2) | 0 | 0 |
| 14 | (0, 0) | (2, 2) | (2, 2) | (1, 2) | (2, 2) | (2, 2) | 0 | 1 |
| | | | | (2, 2) | (2, 2) | (2, 2) | 0 | 0 |
| 15 | (1, 1) | (1, 2) | (0, 0) | (1, 1) | (1, 2) | (1, 1) | 1 | 0 |
| | | | | (1, 1) | (1, 2) | (1, 2) | 0 | 1 |
| 16 | (1, 2) | (1, 2) | (0, 0) | (1, 2) | (1, 2) | (1, 1) | 2 | 0 |
| | | | | (1, 2) | (1, 2) | (1, 2) | 1 | 1 |
| | | | | (1, 2) | (1, 2) | (2, 2) | 0 | 2 |
| 17 | (1, 2) | (2, 2) | (0, 0) | (1, 2) | (2, 2) | (1, 2) | 1 | 0 |
| | | | | (1, 2) | (2, 2) | (2, 2) | 0 | 1 |

## Acknowledgments

This research was supported by National Science Foundation grant 0113496, the Spanish State Office of Education and Universities, the European Social Fund, and the Fulbright Program.

## Footnotes

Communicating editor: S. Tavare

- Received December 15, 2003.
- Accepted August 30, 2004.

- Genetics Society of America