Statistical Modeling of Interlocus Interactions in a Complex Disease: Rejection of the Multiplicative Model of Epistasis in Type 1 Diabetes
 Heather J. Cordell*,
 John A. Todd*,
 Natasha J. Hill*,1,
 Christopher J. Lord*,2,
 Paul A. Lyons*,
 Laurence B. Peterson†,
 Linda S. Wicker† and
 David G. Clayton*
 ^{*} Department of Medical Genetics, University of Cambridge, Wellcome Trust Centre for Molecular Mechanisms in Disease, Cambridge, CB2 2XY, United Kingdom
 ^{†} Merck Research Laboratories, Rahway, New Jersey 07065
 Corresponding author: Heather J. Cordell, Department of Medical Genetics, University of Cambridge, Wellcome Trust Centre for Molecular Mechanisms in Disease, Wellcome Trust/MRC Bldg., Addenbrooke’s Hospital, Hills Rd., Cambridge CB2 2XY, United Kingdom. Email: heather.cordell{at}cimr.cam.ac.uk
Abstract
In general, common diseases do not follow a Mendelian inheritance pattern. To identify disease mechanisms and etiology, their genetic dissection may be assisted by evaluation of linkage in mouse models of human disease. Statistical modeling of multiplelocus linkage data from the nonobese diabetic (NOD) mouse model of type 1 diabetes has previously provided evidence for epistasis between alleles of several Idd (insulindependent diabetes) loci. The construction of NOD congenic strains containing selected segments of the diabetesresistant strain genome allows analysis of the joint effects of alleles of different loci in isolation, without the complication of other segregating Idd loci. In this article, we analyze data from congenic strains carrying two chromosome intervals (a double congenic strain) for two pairs of loci: Idd3 and Idd10 and Idd3 and Idd5. The joint action of both pairs is consistent with models of additivity on either the log odds of the penetrance, or the liability scale, rather than with the previously proposed multiplicative model of epistasis. For Idd3 and Idd5 we would also not reject a model of additivity on the penetrance scale, which might indicate a disease model mediated by more than one pathway leading to βcell destruction and development of diabetes. However, there has been confusion between different definitions of interaction or epistasis as used in the biological, statistical, epidemiological, and quantitative and human genetics fields. The degree to which statistical analyses can elucidate underlying biologic mechanisms may be limited and may require prior knowledge of the underlying etiology.
MUCH effort has been invested in the mapping and identification of loci that predispose to common multifactorial diseases, such as type 1 diabetes. However, little information is available as to the nature of interaction between genes. It is hoped that the identification of the mode of gene interaction will facilitate understanding of the pathological mechanisms involved in complex diseases, as well as the further identification of disease susceptibility loci. One major challenge in complex disease genetics is to be able to detect, at a reasonable statistical level, genes that alone have small effects on the disease phenotype. The chances of detecting such small effects may be increased when the interaction of one such gene with another is taken into consideration. For example, in type 1 diabetes (Cordell et al. 1995, 2000), type 2 diabetes (Coxet al. 1999), and inflammatory bowel disease (Choet al. 1998), identification of the most suitable model of interaction between loci provided increased evidence for linkage at one locus when the interaction at another locus was taken into consideration.
The modeling of interlocus effects in human complex disease is still in its infancy (Coxet al. 1999), but there is some doubt that it offers significant advantages except under special conditions (Leal and Ott 2000). It is challenging, owing to the size of the data set required to provide power to distinguish between different genetic models and the fact that other disease loci are segregating in the study population and environmental factors may be operating. Nevertheless, twolocus genetic modeling has been reported, for example, between IDDM1 and IDDM2 and between IDDM1 and IDDM4, using affected sibpair data (Cordellet al. 1995). In that data set, the IDDM1/IDDM2 joint effect was well described by the multiplicative model, while the IDDM1/IDDM4 effect followed a heterogeneity model.
The study of interlocus interactions in complex disease has been confused by differences in definition and terminology between biologists, epidemiologists, and statisticians, and between quantitative and human geneticists. The term “epistatic” was originally introduced by Bateson (1909) to describe a “masking” effect in which a factor at one Mendelian locus prevents another from manifesting its effect. This definition is perhaps closest to the phenomenon of interest to a biologist or biochemist investigating interaction between proteins in a mechanistic sense. In fact, the concept of biological interaction is often not precisely defined (Greenland and Rothman 1998) but usually corresponds to such a situation in which the qualitative nature of the mechanism of action of each of two factors is affected by the presence or absence of the other (Siemiatycki and Thomas 1981).
In quantitative genetics the term epistatic has classically been used to refer to a deviation from additivity in the effects of alleles at different loci with respect to prediction of a quantitative phenotype. This definition is due to Fisher (1918), but note that this is not equivalent to the Bateson (1909) definition, as pointed out by R. C. Punnett in his review of the Fisher (1918) article (Norton and Pearson 1976). Epistasis in the Fisher (1918) sense is similar to the usual concept of statistical interaction: departure from a (specific) linear model describing the relationship between predictive factors (here assumed to be alleles at different genetic loci) and an outcome or phenotype of interest. Note that with this definition, the choice of scale becomes important as factors that are additive with respect to an outcome measured on one scale may exhibit interaction when a different, transformed scale is used (Frankel and Schork 1996; Greenland and Rothman 1998).
In human genetics, three main models of gene interaction for the penetrance (the probability of developing disease given genotype) are commonly considered (Risch 1990): a heterogeneity model (Risch 1990; Neuman and Rice 1992), which is a generalization of a model in which an individual becomes affected through possessing a predisposing genotype at either of two loci; an additive model, which has been shown to approximate the heterogeneity model when modeling familial relative risks (Risch 1990; Cordellet al. 1995); and a multiplicative model (Hodge 1981). The additive and heterogeneity models are usually assumed to represent nonepistatic models and to correspond to a situation in which the biological pathways involved in disease are at some level separate or “independent.” The multiplicative model is usually considered to be an epistatic model in which the loci and pathways involved are not independent; note, however, that a multiplicative model can be considered to be an additive model when transformed to the logarithmic scale. In a statistical sense, therefore, the multiplicative model signifies independent additive effects on a logarithmic scale. A fourth model is one of additivity on a liability or probit scale, where loci contribute to an underlying, unobserved, continuous trait in an additive fashion and development of disease occurs if this trait exceeds a certain threshold (Pearson 1900; Fisher 1930; Wright 1934a,b). The question that inevitably arises is whether there is some “natural” scale (such as the probit scale, or alternatively the raw penetrance scale) on which models have a specific biological or causal interpretation. We postpone further discussion of this question until later.
One approach to the dissection of complex disease genetics (and the modeling of gene interactions in complex diseases) is to exploit a rodent model of a human complex disease, such as the model of human type 1 diabetes, the nonobese diabetic (NOD) mouse. Spontaneous diabetes in this inbred mouse strain has a similar etiology to human type 1 diabetes, both in terms of the major physiological features of the disease and also some shared genetic determinants, notably at the MHC (Makinoet al. 1980; Lyons and Wicker 1999). The advantage that NOD mouse genetic analysis holds is that large backcross and intercross pedigrees can be bred and phenotyped under stable environmental conditions, allowing the identification of disease linkages with convincing statistical support (Lyons and Wicker 1999). It is notable that our previous modeling of multiple locus linkage data from such an NOD cross provided evidence for the multiplicative model of epistasis on the penetrance scale (Rischet al. 1993). This model gave very similar results to the classic polygenic threshold liability model, which is often assumed in the inheritance of complex traits. However, in this previous study, all of the complex genegene interactions were assessed simultaneously and the nature of interactions between specific pairs of loci was not examined in detail.
To define interactions between specific pairs of loci, double congenic mouse strains may be developed. For standard single congenic strains, specific chromosome intervals from one inbred strain [the donor, in this case the diabetesresistant strains C57BL/6 (B6) or C57BL/10 (B10)] are introgressed onto the background of the recipient strain (the diabetessusceptible NOD strain) by backcrossing (Wickeret al. 1994). Conversely, NOD chromosome segments can be introgressed into the diabetesresistant strain to try to convert a resistant strain into a sensitive one (Yuiet al. 1996). Allelically variable markers are used to guide the strain construction and to make sure that after several generations of backcrossing only the desired chromosome segment is of B6 or B10 origin. The influence of this introgressed (“congenic”) segment on disease frequency can then be evaluated by creating the homozygous strain and monitoring disease in a cohort of NOD congenic mice compared with a cohort of NOD parental mice. If an introgressed interval contains an Idd locus then the NOD congenic strain will be protected from diabetes (Wickeret al. 1994; Lyons and Wicker 1999). As an extension of this strategy, double congenic strains have now been developed, where a single strain possesses two welldefined congenic regions derived from two separate single congenic strains. These double congenic strains allow a more specific assessment of the mode of interaction between two chromosome regions (compared to backcross data), as the genetic background is not a variable and sufficient numbers of mice can be bred to achieve enough statistical power to distinguish between models. In the work presented here, we report on the analysis of previously published data from an NOD double congenic strain for Idd3 and Idd10 on chromosome 3 (Wickeret al. 1994), and on data from an NOD double congenic strain for Idd5 and Idd3 on chromosomes 1 and 3, respectively (Hillet al. 2000). The diabetes frequencies in both single and double congenic strains were modeled mathematically on linear, log odds, and liability scales using several statistical models. Given our previous multiplelocus results, we were surprised to find strong evidence against multiplicativity for both pairs of loci, even though models that were additive on a log odds or liability scale fitted the data well.
MATERIALS AND METHODS
The development of the congenic strains analyzed here has been previously described by Wickeret al. 1994, Podolin et al. (1997), and Hill et al. (2000), but no specific modeling of epistasis was undertaken by these authors. Here we model the joint action of two pairs of type 1 diabetes susceptibility loci, Idd3 and Idd10 and Idd3 and Idd5, in causing disease. For Idd3 and Idd10, F_{1} data were used so that animals could be classified into nine categories: homozygous for NOD at both loci, homozygous for NOD at Idd3 and heterozygous for NOD/B6 at Idd10, homozygous for NOD at Idd3 and homozygous for B6 at Idd10, heterozygous at Idd3 and homozygous for NOD at Idd10, heterozygous at both loci, heterozygous at Idd3 and homozygous for B6 at Idd10, homozygous for B6 at Idd3 and homozygous for NOD at Idd10, homozygous for B6 at Idd3 and heterozygous at Idd10, and homozygous for B6 at both loci. Two different strains, NOD.B6Idd10^{R}^{1} and NOD. B6Idd10^{R}^{2}, were used to generate the Idd10 congenics (Podolinet al. 1997). These strains differ in their development of diabetes due to differences in the length of the congenic region around the Idd10 locus, with the NOD.B6Idd10^{R}^{1} congenic region in fact corresponding to two Idd loci, Idd10 and Idd17 (Podolinet al. 1997).
Counts of the number of mice developing diabetes in the nine categories for Idd3 and Idd10 are given in Table 1. Counts for comparable NOD double congenic strains for Idd3 and Idd5 (Hillet al. 2000) are given in Table 2. For Idd3 and Idd5, data were available only in the homozygous categories, and the nonNODderived allele for Idd5 was derived from the B10 rather than the B6 strain. Given data in the form of Table 1 or Table 2, mathematical models for the joint effect of alleles at the two loci were fitted.
Models for penetrance: Consider modeling the data for the nine genotype categories in Table 1. Let p_{ij} be the probability that an animal develops disease given that it has genotype i at locus 1 and j at locus 2, where i and j take values from 0 to 2 corresponding to the number of B (B6 or B10) alleles in the genotype. The “raw” estimate of p_{00} for strain NOD. B6Idd10^{R}^{1} in Table 1 is therefore 63/81 or 0.778, for example. Since p_{ij} is a probability, any realistic model requires that any estimate of p_{ij} be constrained to lie in the interval [0, 1] for all i, j.
Additive model: The additive model as used in human genetics is usually parameterized as p_{ij} = x_{i} + y_{j}, where x_{i} and y_{j} are parameters to be estimated representing the contributions of the different genotypes at loci 1 and 2, respectively (Risch 1990). This model is equivalent to the standard quantitative genetics model
Heterogeneity model: In human genetics, the additive model is often considered to be a good approximation to a model of genetic heterogeneity, in which loci 1 and 2 are considered to be independent causes of disease (Risch 1990). In the heterogeneity model, p_{ij} is written as p_{ij} = x_{i} + y_{j}  x_{i}y_{j} (Neuman and Rice 1992). When x_{i} and y_{j} take values x_{0} = y_{0} = 1, x_{1} = y_{1} = x_{2} = y_{2} = 0, or alternatively x_{0} = y_{0} = x_{1} = y_{1} = 1, x_{2} = y_{2} = 0, this leads to penetrance matrices of the form given in Table 3, which clearly correspond to a “classical” heterogeneity model in which an individual can be affected through possessing a predisposing genotype at either locus. When the penetrances are not constrained to be 0 or 1, however, the biological interpretation of the generalized heterogeneity model p_{ij} = x_{i} + y_{j}  x_{i}y_{j} is less clear. Risch (1990) and Cordell et al. (1995) have shown that the heterogeneity and additive models for the penetrance give very similar results when used to model familial relative risks of disease. This may not be the case, however, when modeling the penetrances directly; e.g., note that the penetrance matrices in Table 3 cannot be achieved using an additive model. Schork et al. (1993) have also considered heterogeneity models that have a parameterization equivalent to the Risch (1990) parameterization if the parameter ϕ in the Schork et al. (1993) formulation is set equal to 1.
The heterogeneity model used in human genetics does not have a direct equivalent in the quantitative genetics literature. Note, however, that the model may be written as log (1  p_{ij}) = log(1  x_{i}) + log(1  y_{j}); i.e., it may be considered to be an additive or nonepistatic model on the log(1  p_{ij}) scale. This illustrates that, like the additive model, it must have only five free parameters.
Multiplicative model: The multiplicative model (Hodge 1981) may be written as p_{ij} = x_{i}y_{j}. This model is usually considered to represent a form of epistasis between loci 1 and 2, but note that on a logarithmic scale it is equivalent to an additive model and, therefore, again has only five free parameters.
General (epistatic) model: Restricted models for the penetrance may be compared to a general epistatic model that corresponds to a saturated model in which we estimate nine parameters: p_{11}, p_{12}, p_{13}, p_{21}, p_{22}, p_{23}, p_{31}, p_{32}, and p_{33}. In the quantitative genetics literature, this model is usually written
Application to data in Table 2: The data in Table 2 can be modeled in a similar way to that of Table 1. Note that the data in Table 2 have less degrees of freedom (d.f.), with the general model having four free parameters and the restricted models having three free parameters. The standard quantitative genetics model (Weberet al. 1999; Zenget al. 2000) for these data would be
Models for the log odds: In epidemiological studies, rather than modeling the penetrance directly, a more common measure is the natural logarithm of the odds, log(p/(1  p)), which has the advantage of not being constrained to the interval [0, 1]. The standard epidemiological procedure is to fit an additive model to the log odds so that we write log(p_{ij}/(1  p_{ij})) = x_{i} + y_{j} or, equivalently, p_{ij} = e^{xi}^{+}^{yj}/(1 + e^{xi}^{+}^{yj}). This model allows us to model the effects at loci 1 and 2 as independent (in a statistical sense) additive effects on the log odds scale; note, however, that it leads to interactive effects (epistasis) on the penetrance scale. We may also consider fitting heterogeneity and multiplicative models to the log odds in the same way as to the penetrances, although it is unclear what biological meaning should be attached to such models. Note that if all the penetrances are small, an additive model for the log odds should be equivalent to a multiplicative model for the penetrance since x_{i} + y_{j} = log(p_{ij}/(1  p_{ij})) ≈ log(p_{ij}) and so p_{ij} ≈ e^{xi}e^{yj} = X_{i}Y_{j}, say.
Liability models: Another model we consider is a liability or probit model similar to that described by Risch et al. (1993). This model corresponds to the classical polygenic threshold model (Pearson 1900; Fisher 1930; Wright 1934a,b) in which each disease susceptibility locus contributes to W (an underlying unobserved continuous variable) in an additive fashion. Around each genotype contribution is a residual variance, which here, as in Risch et al. (1993), is assumed to equal 1.0. A threshold V is defined so that the area beyond the threshold corresponds to the overall frequency of affected individuals. The probabilities p_{ij} can be written p_{ij} = 1  Φ(V, W, 1) = 1  Φ(V, x_{i} + y_{j}, 1), where Φ(V, μ, σ^{2}) is defined to be the cumulative distribution function of a random variable that is normally distributed with mean μ and variance σ^{2}, i.e., the probability that such a variable takes a value less than V.
The liability model used here is slightly more general than the one described by Risch et al. (1993), which was parameterized by a single parameter for each locus, leading to two rather than three free parameters for their data, which were in a similar form to Table 2. [Note also that Risch et al. (1993) model heterozygote effects rather than B10 homozygous effects since they consider data generated from a backcross rather than from a congenic strain.] The Risch et al. (1993) parameterization essentially forces the value of W in category four of Table 2 to equal 1 times the value of W in category one. Our parameterization here is easier to extend to the data for all nine categories in Table 1. In addition it has the following useful property: The probabilities Φ(V, x_{i} + y_{j}, 1) can be written as Φ(x_{i} y_{j}, V, 1) and it can be shown that fitting the model p_{ij} = 1  Φ(x  y_{j}, V, 1) is invariant to the choice of V; i.e., we can use the standard cumulative normal distribution function Φ(x_{i} y_{j}, 0, 1) and estimation of V is not required.
A natural extension to the additive liability model just described would be to consider heterogeneity or multiplicative effects for the liability; e.g., p_{ij} = 1  Φ(V, x_{i} + y_{j}  x_{i}y_{j}, 1) or p_{ij} = 1  Φ(V, x_{i}y_{j}, 1). Unfortunately these formulations do not have the invariant property of the additive liability model and, moreover, do not have an obvious genetic interpretation. We therefore do not present results from these analyses. Note that although the additive liability model is additive on the liability scale, it leads to interactive effects (epistasis) on the penetrance scale.
Fitting the likelihood: Given a model for p_{ij} and data in all relevant genotype categories, the likelihood for the data may be written as
NOD: (x_{0} + y_{0})^{55}(1  x_{0}  y_{0})^{18}
Idd5 congenic strain: (x_{0} + y_{2})^{42}(1  x_{0}  y_{2})^{48}
Idd3 congenic strain: (x_{2} + y_{0})^{12}(1  x_{2}  y_{0})^{47}
Idd3/5 double congenic strain: (x_{2} + y_{2})^{2}(1  x_{2}  y_{2})^{89}.
The overall likelihood may be calculated as the product of the likelihoods for each of the four strains. The likelihood may then be maximized with respect to the parameters to be estimated. Standard statistical theory predicts that twice the difference between the natural logarithms of the maximized likelihoods for nested models should be distributed as a χ^{2} with degrees of freedom equal to the difference in the number of estimated parameters. The maximized loglikelihood for a restricted model (additive, heterogeneity, multiplicative) can therefore be compared to that for the general unrestricted (saturated) model, allowing a test for the goodness of fit of the restricted model to be performed.
Generalized linear models: It is worth noting that the models described here can all be considered as generalized linear models (Nelder and Wedderburn 1972; McCullagh and Nelder 1989). In the generalized linear model framework, the probability p_{ij} is related to a linear predictor via a link function: g(p) = μ + x_{i} + y_{j}. Here the link functions g(p) correspond to the previously described models as follows
Additive model for penetrance: g(p) = p
Multiplicative model for penetrance: g(p) = log(p)
Heterogeneity model for penetrance: g(p) = log(1  p)
Additive model for liability: g(p) = probit(p)
Additive model for log odds: logistic regression: g(p) = logit(p).
These models can be fitted using standard routines in most statistics packages (e.g., PROC GENMOD in SAS; glm in SPlus; GLM in Stata, GLIM, etc.). In the epidemiological literature a number of more general parametric families of link function have been proposed (Thomas 1981; Guerrero and Johnson 1982; Breslow and Storer 1985).
RESULTS
Modeling the joint effects of Idd3 and Idd10: Table 4 shows the results for fitting the models described above to the data from Table 1 using the strain NOD. B6Idd10^{R}^{1}. Table 5 shows the results using the strain NOD.B6Idd10^{R}^{2}. Results are given in terms of fitted values for the penetrances, and differences in 2 ln likelihoods and P values for rejection of the models are compared to the general model. The asymptotic P value assumes the likelihoodratio statistic has a standard χ^{2} distribution that may not be true if cell counts are too small or if maximization is carried out subject to nonstandard constraints, e.g., constraining p_{ij} to be between 0 and 1 for the penetrance models. We therefore also present empirical P values for rejecting each specific restricted model, which were calculated by simulating data under the null penetrances for that restricted model, and we note how often the simulated 2 ln likelihood difference exceeded the observed difference. The fitted values for the penetrances for some of these models are displayed graphically in Figure 1, allowing comparisons to be made between the shapes of the graphs for the restricted models and the general model. In the discussion below, we assume a 5% significance level for acceptance/rejection of models; i.e., models with a P value of <0.05 are rejected, although note that we include exact P values where possible.
From Table 4, using the NOD.B6Idd10^{R}^{1} strain we find no evidence to reject an additive model for either the log odds or the liability (P values are 0.12 and 0.20, respectively). There is evidence against all three models (additive, heterogeneity, and multiplicative)for the penetrance. In addition there is evidence against a heterogeneity model for the log odds and strong evidence (P = 7 × 10^{16}) against a multiplicative model for the log odds. The empirical P values are very close to the asymptotic P values. It is interesting that a multiplicative model for the penetrance is rejected (P = 3 × 10^{5}) even though an additive model for the liability is accepted. This contrasts with the results of Risch et al. (1993), who found that the additive liability model gave predictions reasonably close to a multiplicative model for the penetrances when applied to backcross data in four genotype categories at a variety of Idd loci. These authors also found that a multiplicative model of epistasis (on the penetrance scale) provided a good fit to the backcross data they analyzed. The differences between our results and those of Risch et al. (1993) may result partly from consideration of different susceptibility loci, partly from the fact that we have congenic data as opposed to backcross data, and partly from the fact that we are modeling data from all nine genotype categories as opposed to four categories. These last two considerations in particular might be expected to provide a more powerful procedure for discriminating between and rejecting specific genetic models, since congenic and double congenic data allow us to estimate penetrances directly (as opposed to backcross data, which allows estimation of relative penetrances only), while consideration of all possible genotype categories allows us to model differences in effect between a single and a double dose of the predisposing alleles. Note that, in practice, it would not be possible to analyze data at these two loci using the methods of Risch et al. (1993) with a backcross experiment, since Idd3 and Idd10 are linked and so their segregation is not independent: The methods proposed by Risch et al. (1993) would not be applicable without some modification to take into account the linkage between the loci. This illustrates one advantage of using a congenic approach.
Using the NOD.B6Idd10^{R}^{2} strain (Table 5) we again find no evidence to reject an additive model for either the log odds or the liability. In this strain an additive model for the penetrance is also accepted (P = 0.17), while heterogeneity and multiplicative models for either the penetrance or the log odds are again rejected. The rejection of a heterogeneity model for the penetrance while an additive model is accepted illustrates a difference between these two models when considered on the penetrance scale. The acceptance of the additive model for the penetrance in this strain, which was rejected when using the strain NOD.B6Idd10^{R}^{1}, may result from the fact that for the strain NOD.B6Idd10^{R}^{2} there are two genotype categories missing. From Table 4, these genotype categories contribute to causing the additive model to be rejected when using the NOD.B6Idd10^{R}^{1} strain, since the estimated penetrances in the additive model are quite different from the observed penetrances (those estimated in the general model). However, if these categories are dropped in the analysis of the NOD.B6Idd10^{R}^{1}, we still find evidence against an additive model (P = 0.001). The difference in acceptance/rejection of the additive model is therefore more likely to result from the previously demonstrated difference in diabetes development between the two strains (seen in row 3 of Table 1, 33 vs. 49%) owing to the Idd17 effect.
Table 6 shows the parameter estimates from the general model for strain NOD.B6Idd10^{R}^{1} using the quantitative genetics parameterization given in Equation 1. It is not possible to estimate the parameters of this model for strain NOD.B6Idd10^{R}^{2} since there are nine parameters to estimate but only seven cells containing data. From Table 6 we find that when modeling the penetrance, there are significant additive × additive (i_{aa}) and dominance × additive (i_{da}) interaction effects. When modeling the log odds or liability there are no significant interaction effects, as expected from the fact that the additive model fits well in these cases. It is interesting to note that, regardless of whether the penetrance, log odds, or liability are modeled, the dominance effect d_{2} at locus 2 (Idd10) is not significantly different from 0.
The interpretation of the parameter estimates in Table 6 as true additive/dominance/interaction effects relies on the model being orthogonal so that the parameter estimates do not change according to which other parameters are currently being estimated. This is true for a designed experiment such as a backcross and is found to be approximately true for the congenic data analyzed here (results not shown), although it may not hold in general for such data owing to the fact that genotype frequencies may occur in unbalanced proportions and in addition we are analyzing a binary outcome (i.e., a proportion). For nondesigned experiments such as analysis of data from human studies, this approximation is unlikely to hold and careful attention to fitting models in an appropriate hierarchy will be required; e.g., dominance effects may not be included without the relevant additive effects, or interaction effects without the relevant main effects, etc.
Modeling the joint effects of Idd3 and Idd5: Tables 7 and 8 show the results of fitting models to the data from Table 2 (using the congenic strains for Idd3 and Idd5). We find that the data are well modeled by either an additive model for the penetrance, an additive model for the log odds, or an additive model for the liability: No significant interaction effects are observed on any of the scales examined. This means that we would not reject the hypothesis that these two loci act additively on the penetrance scale (i.e., with no epistasis), but note that a slightly better fit is provided by an additive model on the liability scale, which on the penetrance scale would include epistasis. The additive model on the log odds scale also fits adequately, which is consistent with results presented by Ghosh et al. (1993), who fitted logistic models to an ordinal insulitis phenotype in backcross data, and found no evidence for an interaction between Idd3 and Idd5 on the log odds scale. Note that again the multiplicative penetrance model is rejected (P = 0.005), indicating a difference between the multiplicative model and the additive liability model and also indicating that the multiplicative model does not here provide a good fit to the action of Idd3 and Idd5.
DISCUSSION
In this analysis we have fitted models for the joint action of two pairs of NOD diabetes loci, Idd3 and Idd10 and Idd3 and Idd5. The locus Idd3, which we believe to be the interleukin2 (IL2) locus (Dennyet al. 1997), is common to both. Even though we had the NOD.Idd3 Idd10 double congenic data in 1994 (Wickeret al. 1994), only in this present article have we analyzed the data in a more formal mathematical way. To our knowledge, only four other groups have evaluated interlocus effects in congenic strains, in models of diabetes (Foxet al. 2000), lupus (Mohanet al. 1999), cancer (Fijnemanet al. 1996), and hypertension (Rappet al. 1998). However, none has carried out the type of modeling we describe here. In all of those reports epistasis has either been inferred (Mohanet al. 1999; Foxet al. 2000) or described mathematically (Fijnemanet al. 1996; Rappet al. 1998) but using simpler functions than those described here. Our consideration of the joint effects of Idd loci within the NOD mouse model further validates this strain as a model of human type 1 diabetes. Methods for analysis of interlocus effects in human complex disease are still being developed. Although the genes predisposing to the disease in humans may not be identical to those in the NOD mouse, the hope is that the underlying genetic basis in terms of number of genes involved and their influence on physiological disease processes may present similarities between the two species.
Our results suggest that the joint effects of Idd3 and Idd10 follow a model that is additive on either the log odds or liability scale, but epistatic on the penetrance scale. The action of these two loci does not appear to be multiplicative, in spite of previous suggestions that most loci involved in type 1 diabetes will contribute to disease in a multiplicative manner (Rischet al. 1993). Moreover, a multiplicative model is also rejected for the action of Idd3 and Idd5. Although multiplicative genetic models were strongly rejected in all our analyses, it was not always possible to distinguish between alternative models that fit the data; e.g., for Idd3 and Idd5 the joint action could be well modeled by either an additive model for the penetrance, an additive model for the log odds, or an additive model for the liability.
Having accepted and rejected specific models for the joint action of two loci, the question arises as to the interpretation (biological or otherwise) of these results. The biologist is interested primarily in mechanisms and pathways, but the detection of a statistical interaction does not necessarily imply interaction on the biological or mechanistic level (Witte 1998). Moreover, since the presence of a statistical interaction depends on the scale of measurement (i.e., whether we choose to model the penetrance, the log odds, or the liability), it is unclear what the biological interpretation should be.
Although this issue of interpretation has not previously been discussed in detail in the genetics literature, it has historically received much attention in the epidemiological literature (see Greenland and Rothman 1998). Rothman et al. (1980) define four types of interaction: statistical, biological, public health, and individual decision making, and note that the scale of interest may be different for each type. A statistical interaction, for instance, refers to departure from additivity of effects on any chosen outcome scale and depends on the statistical model postulated. It can be argued that the scale itself has little biological meaning: The preferred statistical procedure would be to choose a scale that removes interaction to generate the most parsimonious model (Elston 1961; Cox 1984; Breslow and Storer 1985). This procedure makes sense when the goal is simple description of observed phenomena or prediction of outcome (Rothmanet al. 1980). For public health purposes, an interaction between two factors relates to the public health consequences of joint exposure to the factors, which may be considered to be directly proportional to the resulting number of cases in the population. It is usually argued that, for public health purposes, the natural scale is the incidence rate, with interaction between risk factors equivalent to a departure from additivity of incidence rate differences (Blott and Day 1979; Rothmanet al. 1980). Similar economic considerations usually underlie the choice of scale in quantitative genetics, where traits such as total yield, milk production, etc. are of interest. In individual (personal) decision making, Rothman et al. (1980) argue that similar reasoning leads to the absolute risk of disease being the relevant outcome scale.
The phenomenon of most interest in the present context is biological interaction and the degree to which it can be elucidated by statistical analysis. Unfortunately, this turns out to be the most complex of the interactions considered. The problem is that any given data pattern can usually be obtained from a number of dissimilar mechanisms or models for disease development (Siemiatycki and Thomas 1981; Thompson 1991). For instance, consider the following five models of disease causation: model 1, a “nohit” model in which those who become diseased are those who fail to experience one or more occurrences of a beneficial event, and two factors act additively in increasing the rate at which beneficial events occur; model 2, a “singlehit” model in which occurrence of a single adverse event is sufficient for development of disease, and two factors show morethanadditive effects (i.e., one factor augments the biologic effect of another) in terms of increasing the rate at which events occur [such effects are sometimes called “synergistic” although Blott and Day (1979) propose that this term be reserved for the public health concept of interaction]; model 3, each factor affects a different stage of a multistage process in which the pathogenic process involves transition from a normal state to stage 1, followed by subsequent transition from stage 1 to disease; model 4, two factors have additive effects on an intervening variable that bears an exponential relationship to incidence of disease; model 5, two factors have lessthanadditive (“antagonistic”) effects on an intervening variable that bears an exponential power relationship to incidence of disease. Thompson (1991) shows that each of these very different causal models results in a multiplicative statistical model for the effects of the two risk factors. Siemiatycki and Thomas (1981) describe additional models in which multiplicative and additive statistical models may arise from either interacting or noninteracting biological models.
The problem of interpretability is compounded by the low statistical power for detecting statistical interactions even when they are present and by the “discretizing” of an underlying continuous variable that can influence the presence of interaction (Rothman and Keller 1972). The fact that we are dealing with multifactorial discrete traits with reduced penetrance also adds to the complexity. For instance, if the penetrances p_{ij} are constrained to be either 0 or 1, the heterogeneity model on the penetrance scale as described earlier has a natural biological interpretation, but for a complex disease this constraint would seem unrealistic. Greenland and Rothman (1998) show that for a discrete trait, departures from additivity in average risk correspond to biological interaction in a general sense (via many possible mechanisms), although no departure from additivity does not necessarily imply absence of biological interaction. These results in terms of absolute risk would support the idea of using the penetrance as the outcome of interest in our analyses; however, Greenland and Rothman (1998) make assumptions that are essentially equivalent to assuming complete penetrance within each genotype category, which is again unrealistic for complex disease (and moreover patently contradicted by our data). Even if the underlying model in terms of all genetic and environmental effects could be expressed in these terms, analysis of any subset of these factors would integrate out the effects of the other factors, leading to reduced penetrance and a violation of the assumptions of Greenland and Rothman (1998).
The overall conclusion from this is that from the numerical data alone it will usually be impossible to discern how or even whether two risk factors interact in any biologically meaningful way. The assumed biological interpretation of models given in Risch (1990) is likely to be unrealistically simplistic. Nevertheless, there may be some value in modeling epistasis. If a prior biological model can be postulated in detail, it may be possible to infer a statistical model and determine how well the data fit the model. For instance, with models of multistage carcinogenesis, specific scenarios for the action of etiological factors and in particular whether they act in the same step or at different steps in the process can be shown to correspond to particular mathematical models (Rothmanet al. 1980; Siemiatycki and Thomas 1981). In reality, however, it may be that many statistical models fit the data equally well. In this case, allowing for different modes of interaction between potential disease loci can lead to increased power to detect any one of the loci. Simulation studies (Cordell et al. 1995, 2000; Leal and Ott 2000) suggest that this increase in power may be relatively modest. Nevertheless, in analysis of real data for type 1 diabetes (Cordell et al. 1995, 2000), type 2 diabetes (Coxet al. 1999), and inflammatory bowel disease (Choet al. 1998), increased evidence for linkage at one locus was seen when the interaction at another locus was taken into consideration. Finally, identification of the most parsimonious statistical model for the joint effects of alleles at several loci provides a means for improved prediction of phenotype, compared to considering the loci in isolation. Understanding the joint action of the loci may also lead to improved targeting of interventions; e.g., for the loci considered here, the fact that the disease frequency in the double congenic strain is so much lower than that in either of the single congenic strains suggests that intervention via relatively few gene products may be of value. The question of true biological interaction, for instance, between alleles at the HLA and insulin genes in human type 1 diabetes, remains of paramount interest. However, further research concerning the interpretation and significance of results will be required if statistical analyses such as those presented here are to be used to elucidate the underlying biologic mechanisms involved in complex binary traits.
Acknowledgments
We thank colleagues at Cambridge and Case Western Reserve Universities for fruitful discussions. This work was supported by grants from the Wellcome Trust, the U.K. Medical Research Council, the Juvenile Diabetes Fund, and Diabetes U.K. Some of the results of this article were obtained by using the program package S.A.G.E., which is supported by a U.S. Public Health Service resource grant (1 P 41 RR03655) from the National Center for Research Resources.
Footnotes

Communicating editor: ZB. Zeng
 Received December 18, 2000.
 Accepted February 5, 2001.
 Copyright © 2001 by the Genetics Society of America