Abstract
Widely used standard expressions for the sampling variance of intraclass correlations and genetic correlation coefficients were reviewed for small and large sample sizes. For the sampling variance of the intraclass correlation, it was shown by simulation that the commonly used expression, derived using a firstorder Taylor series performs better than alternative expressions found in the literature, when the betweensire degrees of freedom were small. The expressions for the sampling variance of the genetic correlation are significantly biased for small sample sizes, in particular when the population values, or their estimates, are close to zero. It was shown, both analytically and by simulation, that this is because the estimate of the sampling variance becomes very large in these cases due to very small values of the denominator of the expressions. It was concluded, therefore, that for small samples, estimates of the heritabilities and genetic correlations should not be used in the expressions for the sampling variance of the genetic correlation. It was shown analytically that in cases where the population values of the heritabilities are known, using the estimated heritabilities rather than their true values to estimate the genetic correlation results in a lower sampling variance for the genetic correlation. Therefore, for large samples, estimates of heritabilities, and not their true values, should be used.
THERE are three classic papers on the topic of sampling variances of estimates of genetic correlations in the 1950s: Reeve (1955), who derived expressions of the sampling variance for parentoffspring designs; Robertson (1959a), who derived general expressions for balanced oneway ANOVA designs with equal population values for heritabilities; and Tallis (1959), who derived general expressions for balanced and unbalanced oneway designs. Although the latter article is, in a sense, the most general, it is not referred to frequently (it does not help that the reference added in the proof of the Robertson article points to the wrong journal). It could be argued that the expressions derived in those articles are no longer relevant, since estimation techniques have moved on from leastsquares methods to likelihoodbased methods [mainly residual maximum likelihood (REML); Patterson and Thompson (1971)]. Using likelihood methods, sampling variances can be approximated from likelihood profiles (Meyer and Hill 1991). However, numerous publications, particularly in the evolutionary genetics literature, still use the expressions derived by Reeve, Robertson, and Tallis. There appears to be some confusion about the use of expressions for the sampling variance of genetic correlations. For example, Koots and Gibson (1996), who performed a metaanalysis of an impressive number (1500) of estimates of heritabilities and genetic correlations for beef cattle traits, argued that using the estimates of heritabilities and genetic correlations in the expressions of Reeve (1955) and Robertson (1959a), rather than their true (population) values, appeared to be closer to the observed empirical sampling variance of the genetic correlation coefficient. This was corroborated by a small simulation study. Furthermore, these authors implied that the equations derived in Reeve (1955) and Robertson (1959a) were inaccurate, because the empirical sampling variance of estimated genetic correlation coefficients was much larger than the expected sampling variance, using both their data and simulations.
Three areas of confusion can be identified:
Should the parameters in the expressions for the sampling variance of the genetic correlation coefficient in Reeve (1955), Robertson (1959a), and Tallis (1959) be the population parameters or the estimates of those parameters?
How good are the expressions derived in the trio of articles?
What is the impact of estimates that are outside the parameter space, i.e., negative heritability estimates and/or estimates of genetic correlations <−1 or >+1?
In this article I review the main expressions of Reeve, Robertson, and Tallis and clarify under what circumstances the expressions should be used. I also review and evaluate the equations for the sampling variance of intraclass correlations as a paradigm, since these are central to the understanding of the assumptions and methods.
MATERIALS AND METHODS
To answer the above questions, we first look at the derivation of the sampling variance of intraclass correlation coefficients because this serves as an appropriate paradigm for the sampling variance of the genetic correlation coefficient. Subsequently, expressions for the sampling variance of the genetic correlation are reviewed and evaluated.
Sampling variance of intraclass correlation coefficients
Population parameters known: Consider a simple, balanced oneway design with s sires and n progeny per sire, and assume that parameters are estimated using leastsquares methods, e.g., ANOVA. Observations are assumed to be normally distributed. The expectations and variances of the between and withinsire mean squares (MS) are
For large n, Equation 1 reduces to
A number of expressions similar to Equation 1, which all differ in the terms relating to the number of sires and progeny per sire, can be found in the literature. They differ in particular in the degrees of freedom relating to the betweensire component of variance. The general expression is
All the above expressions for c_{i} reduce to the one most commonly used (i.e., c_{0} = 1/[n(s − 1)(n − 1)]) for large s and n. However, even for large n, there are discrepancies between the formulas depending on the use of s, (s − 1), or (s − 2) in the denominator. It is likely that the form used by Robertson and Lerner is incorrect and probably stems from the use of Fisher's ztransformation. Fisher showed that
Population values unknown: Except when doing power calculations and/or investigating the design of experiments (Robertson 1959b) or simulation studies, we do not know the population values and hence do not know the exact or approximate sampling variance of the intraclass correlation. The standard practice is to use the formulae derived in the previous section and to substitute
Kempthorne (1957) argued that an adjustment to the degrees of freedom should be made when estimating the sampling variance of an estimate of the intraclass correlation, since
Simulation study: Simulations were performed to compare the empirical standard deviation of heritability estimates, the predicted standard deviation using the true population values (Equation 1), and the average estimated standard deviation (using Equations 1, 4, and 8, with
Since the only difference between the various prediction equations for the sampling variances are functions of s and n, only results for the Taylor series (Osborne and Paterson 1952) are presented. The other equations for the sampling variance differ approximately by factors of (s − 1)/s (using Fisher's formula) and (s − 1)/(s + 1) (using Kempthorne 1957).
Sampling variance of genetic correlation coefficient
Expressions from literature: Tallis (1959) derived a general expression for the approximate sampling variance of the estimated genetic correlation coefficient for a balanced halfsib design. Population values for the intraclass correlations of the two traits are t_{1} and t_{2}, and the genetic and withinsire correlation coefficients are r_{g} and r_{w}, respectively. The general form of Tallis' expression is
Special cases:
A further simplification is if the genetic and withinsire correlation are the same. The expression for the sampling variance may be written as
Finally, Robertson (1959a) suggested a very simple and general expression for the sampling variance of the genetic correlation coefficient by observing the similarity between expressions derived for special cases,
For a large number of progeny per sire, Tallis' equation reduces to a very simple form,
There are difficulties in using the approaches for the sampling variance of the intraclass correlation to determine the sampling variance of estimates of the genetic correlation coefficient: (1) The estimate of r_{g} is unbounded in principle, so that large positive and negative values (outside the range −1 to +1) are possible, and (2) the true heritability, or its estimate, appears in both the numerator and the denominator of the equation for the sampling variance of r_{g} (see Equations 9, 10 and 11). This means that in the vicinity of true or estimated h^{2} being zero, the estimate of the sampling variance can become very large because of a division by a small number. Also, if one or both of the estimated heritabilities is <0, the estimate of the genetic correlation coefficient is an imaginary number (Reeve 1955). Simulation results are less meaningful in these cases, because the estimate of the sampling variance may not have converged (or never will). For example, in the simulation study of parentoffspring regression in Koots and Gibson (1996), with true population parameters of
Heritabilities known: Koots and Gibson (1996) argued that in some (rare) cases, the population heritabilities may be known when the genetic correlation is estimated, for example, through prior information or a metaanalysis of relevant literature values. In that case, one could proceed with only estimating the between and withinsire covariances, along with the phenotypic variances, from the data. If the phenotypic variances are assumed to be estimated accurately, i.e., for large sn, then
Simulations: A simulation study was performed in which the empirical variance of the estimated r_{g} and the average estimated sampling variance using the estimates of heritabilities and correlations was compared to the predicted sampling variance [using equations from Tallis (1959); see above]. Independent 2 × 2 matrices of withinsire mean squares and mean crossproducts (W_{11}, W_{12}, and W_{22}) and betweensire mean squares and crossproducts (B_{11}, B_{12}, and B_{22}) were sampled from a central Wishart distribution with (s(n − 1)) and (s − 1) degrees of freedom, respectively (see Visscher 1995 for more details). The resulting matrices of mean squares and crossproducts were used to calculate leastsquares estimates of the two heritabilities and the genetic and withinsire correlation coefficients in the standard way (see, for example, Tallis 1959). For estimated sire variances that were positive, estimates of genetic correlations could be <−1 or >1. A leastsquares estimate of the genetic correlation coefficient was not possible when one or both of the heritability estimates were negative.
Since most authors (e.g., Fisher 1921; Robertson 1959a; Tallis 1959) have explicitly warned against the use of the “standard” equations when the true heritability, or its estimate, is close to zero, it seems more meaningful to force the estimate of r_{g} to be >−1 and <1. Therefore, if the estimate of the genetic covariance matrix, i.e., (B − W)/n, with B and W the 2 × 2 matrices of between and withinsire mean squares, respectively, was not positivedefinite, REML estimates were calculated using the sampled between and withinsire mean squares and crossproducts. To force the parameters in the parameter space, a form of “bending” the (leastsquares) covariance matrix was applied (e.g., Hayes and Hill 1981; Calvin 1993; Visscher 1995). The form of bending applied was described as attenuating the covariance matrix by Visscher (1995), and estimates of (co)variances are the same if a REML analysis had been carried out on the original data. This was done by calculating heritabilities on a canonical scale and setting negative heritabilities to a small positive value (10^{−6}) and heritabilities larger than one to a value less than one (1–10^{−6}). Following Koots and Gibson (1996), simulated data sets were summarized only if the geometric mean of the estimated heritabilities was >0.01. For each set of parameters, simulation was stopped when 10^{5} replicated samples of the estimated genetic correlation coefficient were obtained.
RESULTS
Validation of expressions for sampling variances of intraclass correlations: Predicted sampling variances of heritability estimates, observed sampling variances from simulations, and the average estimated sampling variance from simulation are presented in Table 1. For the predicted and estimated sampling variances, only the equation of Osborne and Paterson (1952) was used. To obtain other predictions, results for the standard error of the heritability need to be multiplied by, approximately, factors of [(s − 1)/s]^{1/2} (using Fisher's formula) and [(s − 1)/(s + 1)]^{1/2} (using Kempthorne 1957). Results (Table 1) indicate that the approximation of Osborne and Paterson (1952) works very well, in that the predicted variation in estimated heritabilities is close to the observed variation from simulation, in particular for a small value of the heritability. For a large heritability and a small number of sires, it appears that the approximation of Fisher is better. For example, for s = 2, n = 1000, and h^{2} = 0.96, the predicted standard error from Osborne and Paterson (1952) is 1.0353 (Table 1) and from Fisher's equation, is 0.7323 (not shown in tables), while the observed standard error was 0.7192. However, further simulations (not shown) with different values of the heritability indicated that both predictions deviate substantially from the observed standard errors for large heritabilities and a small number of sires. For example, for t = 0.95 (and hence a “heritability” of 3.80), the observed standard error in estimated heritability was 1.2619, whereas the predictions from Osborne and Paterson (1952) and Fisher (1921) were 0.2687 and 0.5183, respectively.
The average estimated standard errors of the heritability estimates are close to the observed standard error from simulation. Clearly, the approximation of Osborne and Paterson (1952), i.e., the substitution of the estimated heritabilities into Equation 1, works very well, and, therefore, using Fisher's or Kempthorne's formula will underestimate the true standard error by factors of [(s − 1)/s]^{1/2} and [(s − 1)/(s + 1)]^{1/2}. It appears that the average estimated standard error is closer to the observed standard error than the prediction using the population values of the heritability.
Sampling variances for genetic correlations: Results are presented in Table 2. Clearly for small n, the empirical standard error of r_{g} is usually larger than that predicted under the unconstrained (leastsquares) model. For example, for s = 100 and n = 10, the empirical standard error is 0.833 for h^{2} = 0.10 and r_{g} = 0.0, whereas the predicted value is 0.509. When the parameters are forced in the parameter space, the maximum empirical standard error is 1.0, when half of the time an estimate of +1 is obtained, and half of the time an estimate of −1. For small s and n, the empirical standard error can then be smaller than that predicted. For example, for s = 100 and n = 2, the empirical standard error from REML was 0.897 for h^{2} = 0.10 and r_{g} = 0, whereas the predicted value was 2.84. For large n (>10), the equations perform well. Substituting the estimated parameters into expressions 8 and 9 is almost always worse, because of the real possibility of obtaining very small estimates of the heritabilities. Only for large designs are the average estimated and predicted standard errors similar. For powerful designs, i.e., for those designs with a small probability of obtaining leastsquares estimates that are out of bounds, the average estimated sampling variance appears to be closer to the observed sampling variance than the predicted values (Table 2).
Relationship between heritability estimates and sampling variance of r_{g}: A more detailed investigation into the relationship between estimated heritabilities, estimated genetic correlation coefficients, and the sampling variance of the genetic correlation estimate was performed for s = 100, n = 1000, h^{2} = 0.50 (both traits), and r_{w} = r_{g} = 0.75. One million replicated populations were simulated, and both the observed standard error and the estimated standard error were summarized as a function of the geometric mean of the heritability estimates (i.e., {ĥ_{1} ×ĥ_{2}).} Simulation results are displayed in Figure 1. The graph also includes a plot of the predicted standard error, assuming that the values of the heritabilities on the xaxis are the population values. Since for a powerful design with many progeny per sire the predicted sampling variance of the genetic correlation coefficient does not depend on the heritabilities (Equations 14 and 16), the corresponding line in Figure 1 appears to be horizontal. The prediction of the standard error using population values, i.e., h_{2} = 0.50 and r_{g} = r_{w} = 0.75, is 0.0443 for this design. The observed standard error of genetic correlation coefficients over all samples, hence also over all possible values of estimated heritabilities, was 0.0443, and the correlation between the estimated genetic correlation and the geometric mean of the heritability estimates was 0.59 (results not shown).
DISCUSSION AND CONCLUSIONS
Population or estimated values: It is clear that all the expressions for the sampling variance of intraclass correlations or genetic correlation coefficients were essentially derived using a firstorder Taylor series about the true population values. Hence, these are the values that should be used to study, for example, the power of various experimental designs, because they give the best prediction of the sampling variance.
Sampling variance of intraclass correlation: The prediction of the sampling variance of the heritability estimate based upon population values is accurate for small heritabilities and/or a large number of sires. However, for a small number of sires and a large heritability, the prediction is very poor. The reason for this is that the distribution of the heritability estimate becomes very skewed in these cases, and the Taylor series in Equation 1 ignores higherorder terms by implicitly assuming that both numerator and denominator in the series are normally distributed. In fact, they are distributed proportionally to χ^{2} distributions, which are known to be highly skewed for a small number of degrees of freedom [the coefficient of skewness of a central χ^{2} distribution with k degrees of freedom is 2^{3/2}/k^{1/2} (Lancaster 1969, p. 20)]. Higherorder Taylor series converged only slowly (results not shown) and do not improve the prediction of the sampling variance substantially. For the extreme case of sires with many progeny it was shown (Equation 2) that the predicted standard error of the estimated heritability is proportional to t(1 − t). However, the observed standard error in this case is a function of t, since (B/
Estimating the sampling variance of intraclass correlations by substituting the estimates of the heritabilities into the standard expressions (Osborne and Paterson 1952) works remarkably well, in that the average estimated standard error is very close to that predicted. For a large value of the heritability (h^{2} = 0.96) and a small number of sires (s = 2 or s = 4), the average estimated standard error is smaller than the predicted standard error and appears to be close to the observed standard error (Table 1). However, further simulations using more extreme values of t showed that this is not a general observation. For example, for t > 0.5 (corresponding to “heritability” >2.0) the prediction using population values is closer to the observed sampling variance than the average estimated sampling variation (results not shown). Part of the reason why the average estimated standard error of the heritability is closer to the observed sampling variance than the prediction based upon t for heritability values in the normal range is that it takes account of the bias in the heritability estimate. Heritability estimates are biased downward, in particular for large values of t and small values of s. For example, for h^{2} = 0.96, s = 4, and n = 1000, the average estimate for the heritability from simulation is 0.86 (results not in tables). When this value is used for the standard prediction equation (Equation 1), the predicted standard error of the heritability is 0.5532, which is closer to the observed standard error (0.5169, Table 1) than that predicted from the population value of the heritability (0.5978, Table 1).
Sampling variances of genetic correlations: The expressions for var(
Koots and Gibson (1996) showed in one of their simulation studies that the empirical sampling variance of the genetic correlation coefficient depended on the estimates of the two heritabilities and concluded that, therefore, the estimates of the heritabilities should be used in, for example, Equation 15, even if the population values were known. Further simulation results using the same population values (K. Koots and J. Gibson, personal communication) showed very clearly that the sampling variance of the genetic correlation coefficient, conditional on the values of the estimated heritabilities, is accurately estimated by substituting the estimated (and not the true) heritabilities into Equation 15, in the case of a parentoffspring design, heritabilities of 0.10, and environmental and genetic correlation coefficients of zero. This appears to contradict the simulation results in Table 2, which show that the average sampling variance is poorly estimated using estimated parameters. However, the results in Table 2 indicate that the estimation is poor only when the leastsquares estimates are likely to be outside the parameter space. In the cases where the leastsquares and REML estimates are identical, both the prediction and average estimated sampling variance are close to the observed one.
The relationship between estimated heritabilities and the empirical sampling variance of the genetic correlation was explored in Figure 1 for a powerful design. From these results we may conclude that (1) the predicted sampling variance accurately predicts the average observed sampling variance (0.0443), but not the observed sampling variance for given values of the achieved estimates of the heritabilities; (2) for a given value of the estimates of the heritabilities, the estimated sampling variance follows a very similar pattern to that of the observed sampling variance (as in Koots and Gibson 1996); (3) for a given value of the estimates of the heritabilities, the estimated sampling variance is larger than the observed sampling variance; and (4) when the estimated heritabilities are close to the true values, the predicted and estimated sampling variances coincide. For other designs, a similar pattern was observed. When the experiment was large and the population values of the correlation coefficients zero, the observed and estimated sampling variances, as a function of the geometric mean of the estimated heritabilities, were very similar. These additional results confirm the results of Koots and Gibson (1996) and reinforce their recommendation that the value estimated heritabilities should be used in calculating the sampling variance of the genetic correlation coefficient, even in the rare cases when the population values of the heritabilities are known. From Table 2 it appears that the average estimated sampling variance of the genetic correlation coefficient is closer to the observed sampling variance than the prediction using population parameters. For one design, s = 500, n = 10, h^{2} = 0.10, r_{g} = r_{w} = 0, this was further explored using the results from 10^{6} replicated samples. Figure 2 shows that for ĥ_{1}ĥ_{2} > 0.07, the observed, estimated, and predicted sampling variance are virtually identical. For each of the geometric mean classes, the mean estimate of the genetic correlation was zero (results not shown), so that for a particular value of the estimated geometric mean of the heritabilities, the sampling variance of the genetic correlation coefficients reflects sampling from a population with the true values of the heritabilities equal to those estimated. This is in contrast with the previous example, in which the correlation between the estimated genetic correlation and geometric mean of the heritabilities was positive (+0.59). For smaller values of the estimated heritabilities, Figure 2 shows that the estimated sampling variance is much larger than either the predicted or observed sampling variance. The estimated sampling variance of the genetic correlation coefficient for ĥ_{1}ĥ_{2} = 0.01 was 9.9. It is not clear from these results why the average estimated sampling variance is closer to the observed sampling variance than the prediction using population values.
van Vleck and Henderson (1961) investigated the behavior of the expression derived by Reeve (1955) for the parentoffspring regression scenario by simulation. They came to the same conclusion as Koots and Gibson (1996); i.e., in the case of a single progeny and one parent, more than 1000 pairs were needed before Reeve's expression was reasonably accurate.
Bias in estimate of sampling variance: It is usually assumed that by substituting the estimates of population parameters in the prediction equations for the sampling variance of the heritability or genetic correlation, unbiased estimates of those sampling variances are obtained. However, this is not generally the case for small sample sizes. Furthermore, when comparing the average sampling variation from simulation, it matters whether results are expressed in the average standard error (as in this study) or in the average sampling variance. This is best illustrated using the example of a halfsib design with large n and a small population value of t. Then
Conclusion: For the design of experimental populations to estimate genetic parameters, the prediction of the sampling variance of heritabilities using Osborne and Paterson (1952) is accurate, unless the population heritability is large and the number of family groups is very small. For analysis of data, the estimate of the standard error of the heritability obtained by substituting the estimated heritability for the true value in the standard prediction formulas is almost unbiased for the range of heritabilities and sample sizes likely to be encountered in practice.
For small experiments, estimates of heritability are biased downward, and estimates of sampling variances are generally not unbiased. Combining results from different experiments by weighting the heritability estimates by the inverse of their estimatessampling variances may result in a severely biased heritability estimate, because the smaller experiments tend to have estimates that are too low, and too much weight is given to these estimates if their sampling variances are biased downward too. A joint analysis of all data is to be preferred.
The predicted sampling variance of the genetic correlation using Reeve (1955) and Tallis (1959) are accurate only if the population heritabilities are not close to zero and if the number of families is large. Even if the population heritabilities are known, the estimated heritabilities should be used in the estimation of the sampling variance of the genetic correlation coefficient.
Acknowledgments
I thank Naomi Wray and Bill Hill for helpful comments and Ken Koots and John Gibson for many constructive discussions and the sharing of additional simulation results.
Footnotes

Communicating editor: R. G. Shaw
 Received February 3, 1997.
 Accepted March 23, 1998.
 Copyright © 1998 by the Genetics Society of America