## Abstract

Maize yield improvement has been strongly linked to improvements in stress tolerance, particularly to increased interplant competition. As a result, modern hybrids are able to produce kernels at high plant population densities. Identification of the genetic factors responsible for density response in maize requires direct testing of interactions between genetic effects and density and evaluation of that response in multiple traits. In this article we take a broad view of the problem and use a general approach based upon mixed models to analyze data from eight segmental inbred lines in a B73 background and their crosses to the unrelated parent Mo17 (hybrids). We directly test for the interaction between treatment effects and genetic effects instead of the commonly used overlaying of results on a common map. Additionally, we demonstrate one way to handle heteroscedasticity of variances common in stress responses. We find that some SILs are consistently different from the recurrent parent regardless of the density, while others differ from the recurrent parent in one density level but not in the other. Thus, we find positive evidence for both main effects and interaction between genetic loci and density in cases where the approach of overlapping results fails to find significant results. Furthermore, our study clearly identifies segments that respond differently to density depending upon the inbreeding level (inbred/hybrid).

MAIZE (*Zea mays* L.) is one of the world's oldest domesticated crops and has been the source of nutrition for humans for thousands of years. It has played a significant role in human culture from the earliest human civilizations (Smith 1995). Maize selection has occurred in the context of food and feed production systems from the earliest days. The advent of modern hybrid maize and modern breeding methods have made possible tremendous gains in grain yield compared to that of older open-pollinated varieties of maize. Concurrently, research on yield potential and stress resistance in maize, combined with extensive phenotypic selection, has allowed for significant genetic gains in rain-fed yield of maize hybrids (Derieux *et al*. 1987; Russell 1991; Tollenaar 1991; Duvick 1992; Eyherabide *et al*. 1994; Sangoi *et al*. 2002). Genetic variation for yield potential in elite maize populations appears to be controlled by many genetic factors, each with relatively small effects (Schön *et al*. 2004). Therefore, the utility of QTL mapping and marker-assisted selection approaches for yield potential in maize is questionable (Holland 2004). However, it should be possible to dissect maize responses to specific biotic and abiotic stresses into less complex component traits. These component traits may be controlled by smaller sets of genes, making them more amenable to QTL mapping and marker-assisted selection (Ribaut *et al*. 2001; Tuberosa and Salvi 2004). This makes stress resistance traits such as tolerance to interplant competition, pests, and drought, along with enhancement of yield stability across environments and production systems, a primary objective for plant breeders.

An additional complication arises in hybrid maize because it displays heterosis for many traits and the physiological response to stress may differ between inbred lines and hybrids. Since maize breeding programs focus on improvement of hybrids, but accomplish this initially through the selection among inbred lines, comparisons of responses observed in inbred and hybrid maize are of critical importance. First, such comparisons will give insight into the gene action affecting stress response mechanisms. Second, such comparisons will guide appropriate breeding strategies for improving hybrid maize, by determining the relative efficiencies of screening inbreds *per se vs*. hybrids created by crossing experimental inbreds to a common but unrelated tester line. Therefore, studies of plant response to stress in the context of inbred/hybrid status are important for understanding the genetic basis of the physiological response.

Concurrent with understanding the interplay between inbred/hybrid responses to stress, one of the more pressing problems is understanding the physiological mechanisms responsible for the response to stress at the whole-plant level (*i.e*., the vegetative as well as reproductive responses to stress levels and how these traits are related to grain yield). We are interested in understanding how stress levels affect multiple aspects of the plant response and how those responses are related to each other. By identifying the relationships among treatment responses for vegetative and reproductive parameters and their relationships with grain yield, we hope to understand the physiological mechanisms that underlie the responses, and, at the same time, the genetic basis of such responses. Multitrait methods for identifying QTL have been developed for a variety of mating designs and statistical approaches (Jiang and Zeng 1995; Korol *et al*. 1995; Knott and Haley 2000; Gilbert and Le Roy 2003; Lund *et al*. 2003; Sorensen *et al*. 2003). In all cases, multivariate approaches have been shown to improve the precision of the QTL effects and location estimates. This advantage has now been extended to scenarios with nonnormal traits (Lange and Whittaker 2001) or a combination of qualitative and quantitative traits (Corander and Sillanpaa 2002). Methods for adding fixed treatment factors of interest during QTL modeling have been recently developed. For instance, Ma *et al*. (2002) provided a theoretical framework for QTL mapping of longitudinal data, and Zhao *et al*. (2004) extended it by adding sex effects.

Previous studies have tried to map QTL for grain yield under certain stresses, often by comparing positions of grain yield QTL with those of QTL for component traits known to play a role in stress tolerances. From such studies we have learned much about plant response to nitrogen deficiency (Bertin and Gallais 2001; Hirel *et al*. 2001), water deficiency (Landi *et al*. 1995; Ribaut *et al*. 1996; Tuberosa *et al*. 1998, 2002b; Sanguineti *et al*. 1999; Sari-Gorla *et al*. 1999; Li *et al*. 2003), low phosphorus (Reiter *et al*. 1991), and cold temperature growing conditions (Hund *et al*. 2004). The question is often approached *indirectly*, by mapping the trait of interest (*i.e*., yield) under two or more conditions and overlaying the resulting scans on the common map, to identify shared and distinct QTL. The QTL identified under one condition but not under the other are then inferred to be responsible for the plant response to the stress. However, such an *indirect* identification of QTL positions for response to stress is unsatisfactory as there is no statistical assessment of the interaction between QTL and stress level.

The ability to simultaneously examine the response to stress in inbreds and hybrids on several traits and discover how the stress directly affects the relationship among traits would improve the genetic analysis of stress response (Phillips *et al*. 2001). Mixed-effects linear models are a natural approach for developing flexible methods to identify QTL in complex situations. One advantage of mixed-effects models is that they can accommodate heterogeneity in the error variances across stress levels, and the variances are used to directly test QTL-by-treatment interactions. In addition, mixed models with heterogeneous error variances can be extended to multivariate situations using a repeated measures approach. Other sources of variation, either fixed or random, can also be included in the model and tested directly. Multilevel models and the response to multiple-factor stresses in maize can also be explored (Vyn and Hooker 2002). While maximum-likelihood methods are more computationally intensive than least squares (Seaton *et al*., 2002), the implementation of new, more efficient algorithms (Gilmour *et al*. 1999; Perez-Enciso and Misztal 2004) along with the increase in computing power makes the implementation of mixed-effects models a reasonable approach for many QTL analyses.

We used the mixed-effect model framework to directly test the interaction between a specific locus and interplant competition. We used a set of eight segmental introgression lines (SILs) derived from the cross B73 × Tx303. We show the importance of testing locus-by-density interaction directly by comparing the results from our analysis with the results from the traditional separate analysis and overlaying of the results. The flexibility to model heterogeneous error covariance matrices in mixed models is an important advantage since the QTL-by-stress level interaction tests account for heteroscedasticity imposed by the density treatment. This complements previous research demonstrating the utility of mixed-model QTL analysis to account for heteroscedasticity and spatial trends of within-environment error variation and multiplicative genotype-by-environment interactions (Piepho 1997, 2005; Smith *et al*. 2001a,b).

## METHODS

#### Maize population:

Entries in this study were eight SILs provided by J. Holland selected from a population of 89 lines developed by C.W. Stuber at North Carolina State University and described by Hostert (2002). Note that in the original publication these lines were referred to as near isogenic lines (NILs). SILs were developed from the cross of the well-studied inbred line B73, as the recurrent parent, and an unrelated subtropical inbred line Tx303, as the donor parent. Tx303 introgressions were located on both arms of every chromosome with the exception of the long arm of chromosome 3, the short arm of chromosome 7, and the short arm of chromosome 8. Each of the 89 SILs was composed of a unique Tx303 introgression event and had on average 2.5% donor parent alleles at nontarget loci (Hostert 2002; J. B. Holland, unpublished data).

Populations of SILs are particularly well suited to elucidate the genetic and physiological bases of the association among developmental and yield component traits that may be related to grain yield. SILs allow for a more accurate assessment of the breeding value of morphophysiological traits than do independent hybrids and, in some cases, provide genetic resources to aid the map-based cloning of the gene underlying the QTL (Tuberosa *et al*. 2002a). The population of SILs studied here has been extensively evaluated for various agronomically important characteristics. The performance of a subset of 56 SILs for days from planting to anthesis and silking, anthesis-silking interval, final plant height, height to ear insertion, number of ears per plant at maturity, and percentage of barren plants was examined at low and high plant population densities in replicated trials in one North Carolina location in 2001 (Hostert 2002) and one Indiana location in 2002. In each of the years 2000 and 2001, testcross progeny derived from crossing each SIL to an inbred line (Mo17) were evaluated for days from planting to anthesis and silking in one North Carolina environment and also for final plant height, height to ear insertion, grain moisture, and grain yield in four evaluation sites in North Carolina.

The eight SILs used in this study were selected on the basis of information on their performances as inbred lines *per se* and as hybrids from the above studies. SILs were selected according to their differences from the recurrent parent B73 principally in time to anthesis, time to silking, anthesis-to-silking interval (ASI), and yield as a hybrid when crossed to Mo17. The subset of selected SILs represented most of the variability present for the traits mentioned above. These SILs included the lines with smaller nontarget introgressions (Table 1) and significant differences from the recurrent parent B73 in at least one trait (Table 2).

To evaluate the introgressed segments in hybrids, the selected SILs were crossed to inbred line Mo17, a member of the Lancaster heterotic group. Mo17 was selected as the tester parent because the B73 × Mo17 hybrid was a very succesful hybrid nearly 3 decades ago and is the most popular example of the cross between lines from the Stiff Stalk Synthetic and Lancaster heterotic groups. Both inbred lines (Mo17, B73) and the B73 × Mo17 hybrid have been extensively studied. As a result, QTL studies for several traits performed on the progeny of this cross have been published (Stuber *et al*. 1992; Kaeppler *et al*. 2000; Mickelson *et al*. 2002; Carson *et al*. 2004). Seed for the eight selected SILs and their crosses to inbred line Mo17 was increased in the winter of 2003 in Puerto Rico.

#### Experimental design and data collection:

Entries were arranged in a split-split-plot design (Figure 1). The study was replicated three times (three blocks) in each of two locations. The whole-plot treatment factor was inbreeding level, the SILs (inbreds) or the cross of the SIL with Mo17 (hybrids). The subplot treatment factor was density with two levels, a high population density planted at 100,000 seeds ha^{−1} (38 seeds row^{−1}) and a low population density planted at 50,000 seeds ha^{−1} (19 seeds row^{−1}). The sub-subplot treatment factor was the genotype, in this case the eight Tx303 introgressions (Table 1). To assure uniform pollen availability for late-flowering genotypes, pollinator rows of different maturity were uniformly distributed across the study. Each sub-subplot consisted of four rows measuring 5 m long with 0.76-m interrow spacing and a 1-m alley at the end of each plot.

The experiment was planted on May 26, 2003, in the Purdue University Agronomy Center for Research and Education near West Lafayette (40 33′ 36″ N, 86 55′ 48″ W), Indiana, on a soil classified as poorly drained Chalmers silty clay loam and on May 27, 2003 in the Throckmorton–Purdue Agricultural Center (TPAC) (40 17′ 52″ N, 86 54′ 10″ W) near Romney, Indiana, on a Raub silt loam soil.

To account for spatial variability in the field, and to enhance the ability to detect introgression effects for inbreds, each sub-subplot was planted with two consecutive rows of the SIL paired to two consecutive rows of B73. Measurements were taken on the central two rows of each sub-subplot, one B73 and the other SIL. To account for border effects due to the alley, data were not collected on the first and last three plants in a row (Figure 1). Data collected were height to the uppermost stretched leaf tip on a weekly basis from week 6 to week 9 after planting, height from the ground to ear insertion, height to the collar of the flag leaf, date of first visible anther, date of first visible stigma, kernel weight for each ear, and number of kernels in each ear. ASI was obtained by subtracting the Julian date of first visible anther from the Julian date of first visible stigma. These traits were selected because they represent different ontogenic stages of the corn plants. In addition, some of these traits have been reported as responsive to environmental stresses while others are unaffected in the presence of similar stresses. ASI has been associated with the status of the plant under water deficiency conditions in the period bracketing flowering (Bolanos and Edmeades 1996). Plants susceptible to drought stress tend to delay silking and have longer ASIs. Anthesis date, on the other hand, is less affected in the presence of several different stresses (Ribaut *et al*. 1996, 1997). Plant heights modify the ability of a genotype to avoid shading by neighboring individuals. Investigations aimed at studying the effect of plant population density and planting rectangularity on plant adaptation to shaded environments have documented changes in plant heights (Maddonni *et al*. 2001; Borras *et al*. 2003). Grain yield per plant and its primary yield component, kernel number (Chapman and Edmeades 1999), were measured to evaluate the fitness of the SILs.

Sub-subplots in the hybrid whole plots were planted and evaluated following the above description for inbred whole plots. The difference was that each sub-subplot contained two consecutive rows of the SIL × Mo17 hybrid paired with two consecutive rows of B73 × Mo17 hybrid (Figure 1), instead of the SIL and B73 inbred lines.

## STATISTICAL ANALYSIS

#### Estimating effects of the introgressions:

To gain precision in the estimation of the effects of the Tx303 introgressions, we first computed estimates of the Tx303 introgression effects for all eight traits at the sub-subplot level. In each sub-subplot, the estimates of the Tx303 introgression effects on the eight traits were obtained from the analyses of observations scored on the two paired rows planted in the center (Figure 1). For simplicity, the statistical models used to estimate the effects of the introgressions are described for inbreds. The estimates of Tx303 introgression effects for height from the ground to ear insertion (HEI), height to the collar of the flag leaf (HCF), ASI, grain yield per plant (GY), and number of kernels per plant (KNP) were obtained from fitting the linear model(1)to all plants scored in a sub-subplot. In Equation 1, *z _{ij}* represents the trait value scored on plant

*j*of genotype

*i*,

*i*= {B73, SIL}, τ

_{i}represents the effect of the genotype

*i*on the trait, and ϵ

_{ij}is a random within-plot error term assumed normally distributed with mean 0 and variance . The Tx303 introgression effects were estimated as τ

_{SIL}− τ

_{B73}.

The height of each plant was measured weekly in centimeters (see above). Tx303 introgression effects on initial and midseason plant growth rates were obtained from fitting a random coefficient model to the plant height data scored from week 6 through week 9 after planting. For each sub-subplot, the model(2)fits a regression line, where *w _{ij}* represents the height of plant

*i*at week

*j*,

*j*= {6, 7, 8, 9} (week 6 was considered as the time point 0, and the remaining weeks were considered as time points 1, 2, and 3);

*x*represents the week on which the height measurement

_{ij}*w*was taken; β

_{ij}_{0}and β

_{1}represent the fixed-effects intercept of the model; β

_{2}and β

_{3}represent the fixed-effects slope of the model;

*a*and

_{i}*b*(

_{i}*x*) represent a random intercept and slope for plant

_{ij}*i*, respectively; ϵ

_{ij}is a random error term; and genotype is an indicator variable with value 1 for the SIL and 0 for B73. Random effects

*a*and

_{i}*b*were assumed to be normally distributed with mean 0 and variances and , respectively. Error terms ϵ

_{i}_{ij}were assumed to be normally distributed with mean 0 and variance σ

^{2}. Variances were estimated using the restricted maximum-likelihood (REML) method. Fixed and random intercepts and slopes were estimated by solving Henderson's mixed-model equations. Two Tx303 introgression effects can be estimated from Equation 2: the effect of the introgression on the initial plant growth rate (HW6) () and the effect of introgression on midseason plant growth rate (HGR) ().

Days from planting to anthesis data are often strongly skewed (Vermerris and McIntyre 1999). Several survival models were evaluated to determine the probability distribution that best fit the data. Models based on log-normal and gamma distributions were equivalent. Since censoring was <3%, and log-normal survival models fit well, estimates of the effects of the introgressed segments on days to anthesis were obtained from fitting linear models similar to Equation 1 to the logarithm of the time to the event. The Tx303 introgression effects for days from planting to anthesis (DTA) were then estimated as τ_{SIL} − τ_{B73}. This approach is equivalent to fitting a log-normal model in survival time analysis with no censored data; it has the advantage that the estimated Tx303 introgression effects are normally distributed (Lawless 1982). The resulting differences are expected to be normally distributed. Similarly, differences between hybrids SIL × Mo17 and B73 × Mo17 were derived from each sub-subplot of the hybrid whole plots. Since the scales of the estimated introgression effects traits varied with the traits, the introgression effects were standardized by multiplying by a constant. All estimated introgression effects derived for one trait were standardized using the same constant. The constants for all eight traits were selected so that the estimated introgression effects had a constant variance of 2. This standardization was used to facilitate the convergence of the algorithms in the multitrait analysis.

#### Correlation among effects of the introgressions:

Simple Pearson correlations among the traits, within the different inbreeding levels and density treatments, were calculated on the introgression effects estimated as described in the previous section. These correlations represent phenotypic correlations among traits. *P*-values for the Pearson correlations are computed by treating *t* = (*n* − 2)^{1/2}[(*r*^{2}/(1 − *r*^{2})]^{1/2} as coming from a *t*-distribution with *n* − 2 d.f., where *r* is the sample correlation and *n* is the sample size. A false discovery rate (FDR) of 2% was used as the threshold to determine significance (Benjamini and Hochberg 1995).

#### Mixed model for joint analysis of the effects of the introgressions:

The estimated introgression effects on the eight traits in each sub-subplot were analyzed jointly, using a multivariate mixed-effects model with repeated measures at the sub-subplot level (*e.g*., the estimates for the eight traits within a sub-subplot) (Wolfinger 1996; McCulloch and Searle 2001). For the vector **Y**_{s},(3)of the introgression effects estimated for all traits in a sub-subplot s of linear mixed-effects models whose elements fit a split-split-plot experimental design for each trait. In Equation 3,for *t* = {HEI, HCF, ASI, GY, KNP, HW6, HGR, DAT}. *y _{ijklm}*

^{(t)}are the effects on trait

*t*of the introgressed segments

*m*estimated for a sub-subplot with density treatment

*l*in block

*k*within location

*i*. , and

*G*

_{m}^{(t)}represent location, inbred/hybrid, block within a location, density treatment, and the introgression effects for trait

*t*, respectively, and represents a random error term. Three and four-way location interactions were tested in preliminary analyses. All

*P*-values were >0.8 on the basis of the corrected (Satterthwaite)

*F*-test. Therefore, these terms were not included in the final model. Block within location and their interactions with the other factors were considered random effects, whereas the remaining terms in the model were considered fixed. The random effects , and were assumed independent from each other and normally distributed with means 0 and variances , and , respectively.

For each of the vectors **Y**_{s} there is a corresponding vector **e**_{s} of residuals. The variance of **e**_{s} was assumed to have an unstructured variance–covariance matrix **R**_{s} with Cov(**e**_{s}, **e**_{s′}) = 0 for all subscripts s and s′. This last equation states that the covariance between any two vectors of residuals **e**_{s} is 0, *i.e*., that errors associated with different experimental units are independent. However, errors of different traits on the same experimental units may be correlated.

Variances were estimated using the REML method. Fixed and random effects were estimated by solving Henderson's mixed-model equations (Henderson 1984), which generates estimates of the fixed effects that are equivalent to generalized least-squares estimators (GLSE) for fixed effects and best linear unbiased prediction (BLUP) of random effects (Searle *et al*. 1992).

#### Hypothesis testing:

One of the advantages of mixed-effects models is the ability to formally test for treatment-dependent heteroscedasticity. Differences in the error variances as well as the covariance among errors for each trait due to the density treatments can be tested using a likelihood-ratio test (LRT). In general, a LRT is a likelihood-based approach to compare the adequacy of two nested models (Bozdogan 1987). The test statistic is constructed by taking twice the difference in the log likelihood of two models of interest. Under the null hypothesis that the two models fit the data equally well, the LRT statistic is asymptotically distributed χ^{2} with degrees of freedom equal to the difference in the number of parameters between the full model and the reduced model. Unfortunately, little theoretical information is available on the behavior of the LRT for selection of the structure for the variance–covariance matrix in a mixed model with reasonable sample sizes in the lowest level (in our case 48). Therefore, we used information criteria along with the LRT to select among models for the variance–covariance matrix. Three models for **R** were compared: (i) models with a single variance–covariance matrix **R**_{s} for all sub-subplots (*i.e*., error variances of the effects of introgressions and the error covariances among them are the same for all the sub-subplots), (ii) models where the variance–covariance matrices **R**_{s} are allowed to vary between inbreds and hybrids, and (iii) models where the variance–covariance matrices **R**_{s} are allowed to vary between inbreds and hybrids and density treatments. The first model has 36 parameters, the second has 72 parameters, and the third model has 144 parameters in the **R** matrix.

Once the variance–covariance model for the residuals was selected, a series of hypothesis tests were carried out (Figure 2). To test the interaction between the effect of introgression *m* on trait *t* and density levels for inbreds and hybrids separately, we tested (Figure 2, tests a). If the effect of the introgression *m* on trait *t* is strictly additive, then we expect and (*i.e*., in the inbred whole plots *j* each plant has two copies of the introgression, whereas in the hybrid whole plots *j*′ each plant has only one copy). The above equalities can be rewritten as , which can be seen as a test for strict additivity in the interaction of the effect of the introgression *m* on trait *t* and density levels. For significant density-by-introgression interactions, we tested for changes in the density responses across locations. None of the reported 14 significant density-by-introgression interaction effects changed significantly across locations.

In addition to the interaction between the effect of the introgression and density, the main effects of the Tx303 introgressed segments *m* were estimated and tested. In cases where the effect of the Tx303 introgressed segment *m* on trait *t* changed with density treatments (*i.e*., significant introgression-by-density interaction), the effects were estimated and tested for each density treatment individually using and (Figure 2, tests b). If the introgression-by-density interaction was not significant, the effects of the introgression *m* on trait *t* were estimated as the average effect across densities and tested using (Figure 2, tests c). In our data analysis, significance levels were corrected for multiple testing using the FDR procedure (Benjamini and Hochberg 1995), a review of which can be found in Verhoeven *et al*. (2005). The FDR procedure correction was implemented to address inflation of type I error rates due to multiple testing. Locus-by-density interaction tests (Figure 2, test a) were adjusted to a FDR level of 20%. Tests for introgression effects (Figure 2, tests b and c) were adjusted to a FDR level of 5%. Differences in levels were used to accommodate difference in power for these tests and in an attempt to balance type I and type II errors. All analyses were conducted in SAS and mixed models were implemented in the Proc Mixed procedure (SAS Institute, Cary, NC). The data and SAS programs are available upon request.

A large amount of research has been directed at estimation using mixed models with repeated measures. Estimation techniques appear to work well in both large and small samples. However, much less effort has been directed toward methods for inference. Except for some simple mixed models, there is not a general form for an exact test for fixed effects. Various asymptotic test statistics work well in large samples, but poorly in small samples producing inflated type I error rates (Barton and Cramer 1989; Schluchter and Elashoff 1990). Some approaches have been proposed to control the inflated type I error rates of these tests. For the likelihood-ratio test, Bartlett's correction may be used (Bartlett 1937). A further refinement is to use an adjusted likelihood. A combination of these two approaches appears to work well with as few as 12 independent subjects (Zucker *et al*. 2000). Unfortunately, these methods involve complex calculations and, except for simple mixed models, they are not readily available and easy to implement.

For the approximate *F*-tests of fixed effects, a Satterthwaite-type adjustment for the degrees of freedom can improve the approximations (McLean and Sanders 1988). However, these corrected *F*-tests still have inflated type I error when the variance–covariance matrix is unstructured and the sample size is small (Catellier and Muller 2000). For example, Catellier and Muller (2000) reported a type I error rate of 0.16 with six repeated measurements, 24 independent subjects, and no missing data. Another approach to inference for fixed effects in mixed models consists of constructing a semiparametric bootstrap by sampling from the best linear unbiased predictor residuals. Although quite popular, it has been shown that such a bootstrap will consistently underestimate the variation in the data in finite samples (Morris 2002).

To examine whether inflation of type I error for the corrected *F*-test of fixed effects was an issue for our particular study, we conducted a simulation. Our study consists of eight repeated measurements with 48 independent subjects for each class of heterogeneity and no missing data. To estimate the coverage level of Satterthwaite-corrected *F*-tests in our analysis, we mirrored the simulation described by Catellier and Muller (2000). We generated 1000 samples from the null distribution, a multivariate normal distribution with mean vector 0 and variance–covariance matrix of data **Σ**. For **Σ**, the estimated variance–covariance matrix **V** from the analysis of our experimental data was used. Data were simulated using the “rmvnorm” function of SPLUS (version 6.1, 2002; Insightful, Seattle), maintaining the structure of the experimental design. For each randomly generated data set, we computed the *P*-value for the Satterthwaite-corrected *F*-test of location effect. We then rejected the null hypothesis if this *P*-value was less than the nominal level of 0.05. The number of times in the 1000 trials that this test was rejected is an estimate of the type I error rate of the test. The estimated type I error rate of this *F*-test for a nominal level of α = 0.05 was 0.1022.

The above examination is for a single test at a nominal level of 0.05. We performed multiple tests and corrected them using a FDR procedure. The FDR correction is generally found to be conservative (Storey and Tibshirani 2003). To determine the approximate expected inflation for a single test after our multiple-test correction, we considered the *P*-value for the nominal level determined by the FDR procedure. The threshold for a SIL-by-density interaction effect to be declared significant after implementing the FDR procedure at the 20% level was a *P*-value of 0.0143. A nominal *P*-value of 0.0143 would result in a type I error rate for an individual test of 0.0412 on the basis of our simulation results. The threshold for the main effect of SIL to be declared significant after implementing the FDR procedure at the 5% level was 0.0240. A nominal *P*-value of 0.0240 results in a nominal type I error rate of 0.0628 on the basis of our simulations. On the basis of these simulations and our use of FDR control we conclude that, in this particular case, inflation of the type I error rate is modest, but worthy of careful consideration when drawing conclusions. In fact, any implementation of *F*-tests in a mixed model requires careful consideration of these issues, as missing data and small sample sizes can result in severe inflation of type I error. Fewer independent subjects for each class of heterogeneity or more traits will produce *F*-tests with higher inflation.

## RESULTS

The unadjusted averages of traits evaluated in this study for the two parental lines (B73, Tx303) of the population and their hybrid progeny to Mo17 are displayed in Table 3. Grain yield per plant and yield components in inbred Tx303 are measured on very few individuals as the majority of plants in this partly subtropical inbred line produced multiple pseudo-ears in each node without kernels. In addition, the few plants without this peculiar manifestation produced few kernels. Density treatments were consistent with works published previously (Modarres *et al*. 1998; Maddonni *et al*. 2001; Borras *et al*. 2003).

Increasing plant density increased anthesis silk interval and decreased grain yield per plant, kernel number per plant, and weight of 1000 kernels for all genotypes (Table 3). The increases in ASI and decreases in grain yield per plant were similar among genotypes. In contrast, B73 × Mo17 responded to increasing density with a much greater decrease of kernel number per plant but with a lower decrease of kernel weight than that of Tx303 × Mo17. These two components of grain yield counteracted each other, but did not entirely cancel each other out: the yield of B73 × Mo17 was more affected by higher density than that of Tx303 × Mo17 (a decrease of 76 *vs*. 63.5 g pl^{−1}). Nevertheless, B73 × Mo17 was higher yielding than Tx303 × Mo17 under both densities.

The response to density treatment can be measured in many different ways. Density treatments may induce changes in the variances or mean responses of a particular trait or in the correlations among traits.

#### Phenotypic correlations among physiological traits:

The phenotypic variances of the introgression effects were smaller in high-density stands regardless of inbreeding levels. Phenotypic variance and covariance estimates obtained from inbreds were larger than estimates obtained from the hybrid progeny.

Introgression effects for grain yield per plant were positively correlated to the effects on kernel number per plant (Table 4), results that agree with previous reports that kernel number is the principal yield component in maize and that its correlation with grain yield is not affected by stress (Ribaut *et al*. 1996, 1997; Chapman and Edmeades 1999; Frova *et al*. 1999). There was no statistically significant phenotypic correlation between ASI and yield components. Both Tx303 inbred and Tx303 × Mo17 hybrid had substantially greater ASI than B73 or B73 × Mo17, respectively. ASI of all parental genotypes increased in a similar fashion under higher density. Therefore, ASI was not likely to be responsible for how the genotypes differed for responses to density measured in terms of yield and its component traits. Previous research has consistently reported that the phenotypic correlation between ASI and yield increases under intermediate to severe drought conditions (Bolanos and Edmeades 1996; Ribaut *et al*. 1997; Betran *et al*. 2003). However, the relationship between ASI and grain yield per plant may be specific to drought stress. Under other types of stresses, ASI is not always affected. For instance, wet, cold growing conditions that reduced grain yield by nearly 20% did not affect ASI and its phenotypic correlations with other traits (Beavis *et al*. 1994; Veldboom and Lee 1996; Ribaut *et al*. 1997). The introgression effects on height of primary ear insertion were negatively correlated to the introgression effects on ASI in the hybrid background, but not in the inbred background.

Introgression effects on days to anthesis were significantly negatively correlated to introgression effects on initial growth (plant height 6 weeks after planting) at both densities (Table 4), indicating that Tx303 segments that induced faster plant development at the beginning of the season also induced earlier anthesis. Introgressions associated with faster midseason plant development rate were also phenotypically correlated with earlier anthesis, but this effect was significant only in the high-density stands. The effects of Tx303 segments on ear insertion height were significantly correlated to the effects of the same introgressed segment on final height. The effects of the introgressed segments on final height were phenotypically correlated to the effects on midseason growth rate in both backgrounds and densities. Introgressions associated with effects on final plant height tended to be associated with faster midseason growth rate. However, although introgression effects on ear height followed the effects on midseason growth rate at low densities, they were lower and not significantly correlated in high densities.

#### Choice of the variance–covariance matrix:

Hypothesis tests can be affected by the choice of the variance–covariance matrix. This matrix is estimated from the data but the structure and complexity must be selected by the researcher. To select the structure for the variance–covariance matrix **R**_{n} a LRT was performed. On the basis of the LRT test, the model with a different variance–covariance matrix for the residuals from each inbreeding level by density combination (model iii) fitted the data significantly better than the reduced models i (LRT = 465.6, *P* < 0.0001) and ii (LRT = 196.8, *P* < 0.0001) (Table 5).

Little information is available on the behavior of the LRT for selection of the structure for the variance–covariance matrix in a mixed model. In some cases, selection between two structures for the variance matrix reduces to testing that a variance component is 0. In these situations, the LRT has low power. This is because the null hypothesis is on the boundary of the parameter space, and the distribution of the LRT statistic is not χ^{2} with 1 d.f. However, the LRTs presented here are for null hypotheses of homogeneity in the variance–covariance matrices across groups. Therefore, in the selection of the variance–covariance matrix we used information criteria as well as the LRT.

The Akaike information criterion (AIC) and Bayes' information criterion (BIC) were consistent with the results of the LRT when comparing models i and ii. When comparing models ii and iii, the AIC was consistent with the results of the LRT. However, unsurprisingly, the BIC indicated that model ii was better than model iii (Table 5). To decide between these two alternatives, we investigated the effect of using one or another structure on the inferences for fixed effects. All 14 significant density-by-introgression interactions detected using model iii were also detected using model ii. In model ii, two more density-by-introgression interactions were declared significant. These correspond to borderline tests using model iii. For the tests that the introgression effects were 0, all 90 tests that were significant using model iii were also significant when model ii was used. No new significant introgression effect was detected. We then decided to use a less parsimonious model for the variance–covariance matrix to allow for the possibility that there was a true difference between the groups and, in this case, to be more conservative in our conclusions.

#### Main effects of the Tx303 introgressions:

The effects of Tx303 segments on the eight traits evaluated in this study are summarized in Table 6. When introgression effects were not altered by density treatments, the effect in Table 6 corresponds to the average effects across densities. When density treatments altered the expression of the introgression, the effects at the two density levels are reported. All but one of the Tx303 introgressions were associated with significant effects for five or more of the eight traits with the exception of UMC076, which demonstrated small effects on grain yield per plant and growth rate. The preselection process may explain the association with more than one of the measured traits for all introgressions.

#### Genotype-by-plant population density interactions:

Significant locus-by-density interaction effects were detected for all traits except initial plant growth (Table 6). In the inbreds, the effects of the introgressions at loci UMC147A, UMC021, and UMC168 were constant, whereas the effects of the remaining five Tx303 introgression events were affected by density treatments for at least one trait (Table 6). In the hybrids the effects of introgressions at loci UMC147A, UMC122, UMC076, UMC168, and UMC021 were affected by density treatments. There were seven significant density responses in the inbreds, five of which were associated with yield per plant (loci UMC107 and both events of UMC122) or kernel number (loci UMC107 and UMC122). There was no significant density response for flowering-related traits (ASI and days to anthesis). In contrast, in the hybrids, five of the seven significant density responses were observed for flowering-related traits (loci UMC122 and UMC076 for days to anthesis and loci UMC147A, UMC122, and UMC168 for ASI), while only one response (locus UMC021) was significantly different for kernel number. The tests for interaction between the effect of the introgression *m* on trait *t* and density within inbred and hybrid states *j* were performed by comparing the effect of the introgression *m* in low and high density: .

In the inbreds, three loci for density response in grain yield were detected. These were detected at loci UMC107, BNL5.46, and UMC122. In all three cases, the Tx303 segments were associated with favorable density responses (Table 6). At low density, Tx303 segments at loci UMC107, BNL5.46, and UMC122 reduced significantly grain yield per plant by 16.80, 29.66, and 27.10 g pl^{−1}, respectively. However, the introgressed Tx303 segment at locus UMC107 did not reduce grain yield at high density, and loci BNL5.46 and UMC122 reduced grain yield by only 13.81 and 9.38 g pl^{−1} in high-density stands. Two loci for density response in kernel number were detected at UMC107 and BNL5.46. These two loci for kernel number followed the same trend as the segments described for grain yield, as expected given the high correlation among these traits. Two loci for density response in ear height and midgrowth rate were detected at loci BNL5.46 and UMC076, respectively. In both cases, the effects of the Tx303 introgressed segments were enhanced under higher densities. BNL5.46 induced a significant increase in ear insertion height of 4.40 cm in low-density stands, but such an effect was more than twice as large (11.92 cm) in high-density stands. Introgression at locus UMC076 had no effect on growth rate in low-density stands, but was associated with a significant increase in growth rate of 1.31 cm week^{−1} in high-density stands.

In the hybrids, 1 locus for kernel number (UMC021) was associated with density response. At low density, plants with the B73 segment set nearly 73 more kernels per plant than plants with the Tx303 allele (Table 6). In contrast, plants with the Tx303 allele, although not statistically significant, set nearly 20 more kernels per plant than plants with the B73 allele in the high-density stand. Three loci for density response in ASI were detected (Table 6). They were detected at loci UMC147A, UMC122, and UMC168. In the first two loci, the Tx303 segments were associated with significantly longer ASIs in low-density stands. However, the effects of these two Tx303 segments were not statistically significant in high-density stands. Conversely, the Tx303 segment at locus UMC168 was associated with significantly greater ASI at both densities, and the effect of the introgression at high density was approximately twice as large as the effect at low density. Two loci were detected for density response in days to anthesis. The response to density of these two Tx303 segments were different. At locus UMC122, Tx303 introgression significantly delayed anthesis by 1.3 days in low-density stands, and this effect was increased to 2.42 days at the higher density. However, the Tx303 segment at locus UMC076 delayed anthesis in the low-density stand only by 0.78 days, with no significant effect in the high-density stand. A locus for density response associated with final height was detected at locus UMC168. The Tx303 segment was associated with plants 8.27 cm taller at the end of the season at low density, but only 1.38 cm taller at high density (Table 6).

None of the density responses detected in inbreds were detected in hybrids and vice versa. These differences in density responses of inbred and hybrid plants may be attributed to genetic background effects (epistasis) or dose effects (*i.e*., in the inbreds there are two copies of the introgression, whereas the hybrids have only one copy), different dominance interactions between Mo17 and Tx303 or B73 alleles, or a combination of these effects. To evaluate the dose hypothesis, we tested whether the density response observed in the hybrid background was half of the response observed in the inbred background: . For five of the seven significant density responses detected in the inbreds, the density responses in the hybrids were consistent with a dose effect (Table 6). The other two density responses detected in the inbreds (locus UMC107 for kernel number and locus UMC122 for grain yield per plant) were not consistent with the dose hypothesis, and, in fact, inbred responses were −5.11 and −1.1 times that of hybrid responses, respectively. In both cases the density responses in the inbreds were in different directions from the density responses in the hybrids.

## DISCUSSION

There has been considerable discussion in the literature as to the definition of a QTL for stress tolerance and how to effectively use this information in marker-assisted selection schemes. Selection of alleles that are favorable under the conditions that are the targeted production conditions is desired. If an allele has a negative effect under a subset of conditions, this does not preclude its utility for other production environments if it interacts with production conditions, such as planting density. For example, segment UMC021 has a significantly negative effect on hybrid kernel number in low density but a positive effect (not statistically significant) on hybrid kernel number in high density. UMC107 and UMC147A display a significantly negative effect on yield and kernel number in inbreds. These segments display a positive effect on the traits in hybrids (not statistically significant). This demonstrates that the allelic effect of a particular segment depends upon the genetic context in which it is measured.

Previous studies have assigned QTL for stress tolerance on the basis of comparisons of the regions associated with the traits in separate analyses, arguing, for example, that a QTL was associated with stress tolerance if the QTL was significant in the stress environment and not significant in the unstressed environment. In our study, 36 locus-by-density interactions would have been identified if separate analyses were compared. We find statistical evidence for 14 interactions between the locus and density response. Of these 14, only 8 were identified using separate analyses and there were 6 additional statistically significant locus-by-density interactions that were not detected by the separate analysis approach. Detection of loci based upon separate analyses fails to detect loci for response to stress when the locus was significant in both individual analyses, but where the size of the effect changed between the low-stress and high-stress environments. On the other hand, many loci with moderate to small constitutive effects can be mistakenly associated with response to stress. In this case, the magnitude of the effect of the locus does not change across stress conditions, but error variance induced by the imposition of the stress does change. As a consequence, the failure to account for variance heterogeneity when comparing parallel results may lead to erroneous conclusions. Modeling locus-by-stress level interaction effects allows for direct comparison of the effects for individual loci detected under contrasting stress levels. Direct modeling of this interaction adds a powerful tool to QTL mapping of stress response. Overall, by taking into account both the size and the direction of the genetic effects relative to the variability present under both stress conditions, estimation of locus-by-stress level interaction provides a *direct* statistical test for detection of QTL responsible for stress tolerance that accounts for heterogeneity of variance. The quantified evidence based upon the statistical test is often different from what the simple visual comparison would indicate as evidenced by our study.

In our study, the segments associated with density response in inbreds and hybrids were not the same. In addition, the effects of density on the phenotypic correlations among traits varied across inbreds and hybrids. This indicates that there are major differences involved in the physiological response to increased interplant competition between inbreeding levels. These results are consistent with the emerging understanding of genetic variation among inbred lines of maize (Brunner *et al*. 2005; Fu and Dooner 2002) and are also consistent with traditional understanding of dominance, epistasis, and/or pleiotropy. The differences between inbreds and hybrids highlight the importance of studying hybrid maize in conjunction with inbreds when developing effective strategies for crop improvement. The inclusion of the SILs in this study permitted us to address the issue of whether all genome regions affecting yield in Tx303 and B73 contributed similarly to these responses or if some genome regions from Tx303 could contribute to improved hybrid yield under density stress, despite the overall yield inferiority of the Tx303 genotype in the midwestern location.

The phenotypic correlation between days to anthesis and midseason growth rate was not significant at low densities, but was significant in high-density stands in both genetic backgrounds. Reductions in midseason growth rate at high density may indicate lack of adaptation to high levels of interplant competition, which results in delayed flowering. Similarly to the study by Beavis *et al*. (1994), our data showed a small, statistically insignificant phenotypic correlation between ASI and kernel number or grain yield in both of the density treatments. Phenotypic correlation between ASI and yield increases under intermediate to severe drought conditions (Bolanos and Edmeades 1996; Ribaut *et al*. 1997; Betran *et al*. 2003). In our study, pollen availability during the entire flowering period might have masked the detrimental effect of longer ASI on grain yield observed in other studies (Bolanos and Edmeades 1996; Ribaut *et al*. 1997; Betran *et al*. 2003).

Grain yield per plant and kernel number were not significantly correlated to other traits across backgrounds and densities. In the inbred background at high denisty, grain yield and kernel number seemed to be consistently, positively more correlated to early to midseason growth rates than to any other traits (Table 4). Furthermore, in inbreds the relationship among these phenotypic correlations changed with density. In low density, the phenotypic correlation between growth rate and kernel number/grain yield is higher in the hybrids than in the inbreds. At low density, both kernel number and grain yield showed higher correlations with initial plant growth, but these traits showed higher phenotypic correlation to final plant height and midseason growth rate as the density increased. These results are not surprising because density treatments establish a competition for light. By minimizing shading effects, introgressions that induced faster growth rates when the competition for light was established tended to yield better. In summary, trait correlations in QTL studies aimed at mapping tolerance to interplant competition seem to be markedly different from those for other types of stresses. Under shading situations, grain yield seems to be more related to genetic factors associated with the ability to intercept light.

The SIL phenotypes reported here were generally, but not always, consistent with the responses observed in the initial studies that were used to select them. For example, effects of the introgressions on anthesis date were consistent between the two studies for all SILs except TBBC3-50 (UMC122, event 1) and as inbred and TBCC3-76 (UMC021) as a hybrid. In each of these cases, the introgressions had significantly positive effects on anthesis date in the present study (Table 6), but nonsignificantly negative effects in the previous study in North Carolina (Table 2). Some of these discrepancies may be due to greater experimental precision in the present study (where fewer genotypes were studied, leading to less within-block heterogeneity) or due to genotype-by-environment interactions arising from differences between the environments in North Carolina and Indiana. More serious was the significantly negative ASI effect of TBCC3-73 (UMC168) over densities in the current study (Table 6) compared to the significantly positive ASI of the same line under low density in the previous studies in North Carolina and across densities in Indiana (Table 2). The environmental factors that cause this difference were not identifiable in this study, but their elucidation is of interest to better understand the three-way interactions of genotype, density, and environment.

The loci identified here can also be compared by map position to loci previously identified in B73 × Mo17 mapping populations. Backcrosses of F_{3} lines from this cross to either parent were studied by Stuber *et al*. (1992), and only one of their yield QTL maps to a region covered by the introgressions studied here. Stuber *et al*. (1992) reported a major yield QTL on chromosome 1, in the same region as the introgression around UMC107, and the Tx303 allele at this introgression conferred a yield increase relative to the B73 × Mo17 hybrid in the initial study of the introgression lines (Table 2). In the current study, the introgression at this region did not significantly increase hybrid yield, but did significantly reduce the inbred yield at low density (Table 6). Furthermore, UMC107 is closely linked (∼5 cM on Pioneer 1999 Composite Map, www.maizegdb.org) to phi120, which was linked to a major QTL for leaf angle in the B73 × Mo17 RIL population studied by Mickelson *et al*. (2002). We did not measure leaf angles, so the physiological relationships between leaf angle, density responses, and inbred–hybrid responses are uncertain, but the concordance of these results suggests that further analysis of this region could reveal a part of the genetic control of these physiologically related traits. The other chromosomal regions covered by the introgression lines in this experiment did not align with other QTL identified by Stuber *et al*. (1992) or Mickelson *et al*. (2002).

We found 14 significant locus-by-density interaction effects (Table 6). These responses were observed for three loci for grain yield per plant, kernel number, and ASI; for two loci for days to anthesis; and for one locus for growth rate, ear height, and final height. No density response was observed for initial growth, which may be the result of the lack of competition for light during the first developmental stages. As is the case in any other complex, quantitative trait in maize, density response was greatly affected by genetic background. There were two loci detected in the inbred background that were not detected in the hybrid background and that did not follow an additive pattern (locus UMC107 for kernel number and locus UMC122 for grain yield per plant). Likewise, none of the seven loci detected for density response in the hybrid background were detected in the inbred background.

Multivariate mixed-model techniques are well suited for mapping genetic loci in complex experimental design scenarios. The main advantage is that the explicit statistical testing of genetic-by-treatment effects can be conducted. Previous approaches rely on the visual comparison between results overlaid on a common map. Overlaying results on a common map does not allow for a statistical assessment of the significance of the visual differences observed. Changes in magnitude are particularly hard to assess and can be affected by changes in variances. The hypothesis-testing framework of a mixed model allows for a statistical test for loci responsible for changes in plant response to stress.

Another advantage of the mixed-model methods in detecting loci contributing to stress response lies in the ability to estimate and test heterogeneity in the variance–covariance matrices of the residuals. This is not possible in least-squares methods, where only one common variance–covariance matrix for the residuals is assumed. The ability to model different variance–covariance matrices for the observations taken on plants under different levels of competition and for other treatments of interest (*e.g*., inbreds *vs*. hybrids) is advantageous. In addition, the ability to fit the particular experimental design used is an advantage to the mixed-model approach. It has been shown previously that stress environments usually result in a reduction of the genetic variance (Rosielle and Hamblin 1981). In addition, pairwise correlation between traits varies depending on the levels of stress, which has been observed in several QTL studies for drought tolerance (Ribaut *et al*. 1996, 1997; Sari-Gorla *et al*. 1999). LRTs indicated that the estimated **R**_{n} matrices significantly differed across densities and inbred/hybrid states. Differences across densities were likely due to the increased number of plants sampled in the high density.

However, mixed models also present serious statistical concerns that should be kept in mind when employing such techniques. One concern is the unknown behavior of the LRT to determine the appropriate variance–covariance structure. Therefore, we also examined the AIC and BIC in our determination of the variance–covariance matrix. While few of our conclusions were affected by this choice, this will not always be the case. Researchers should be aware of the sensitivity of their hypothesis tests to the choice of this matrix and exert care in its selection. Another concern is the type I error of the *F*-tests for fixed effects. Type I error for these tests can be grossly inflated, particularly when multiple traits are considered and correlation among traits is high (Catellier and Muller 2000). We performed a simulation study to determine exactly what the type I error inflation was expected to be for our experimental design. We found evidence for modest inflation of the type I error, and accordingly we are cautious in the interpretation of our results. Some theory has been developed that allows for adjustments to the test statistics that can, in some situations, lead to the improvement of the behavior of the *F*-test (McLean and Sanders 1988; Zucker *et al*. 2000). However, these methods are very specific and have not yet been successfully generalized. Clearly, more work in this area is needed.

We presented an approach for detecting loci responsive to density treatments in a small set of SILs on the basis of mixed-models theory. This method is particularly suited for identifying loci that may be associated with response to stress since it allows for accommodation of stress-induced changes in the variance–covariance matrix among residuals. Furthermore, it accounts for this heterogeneity in the correlation among traits when testing both main effects of the locus and locus-by-density interaction effects. The ability to estimate and directly test locus-by-density interactions in the mapping of loci associated with response to interplant competition is of major importance for marker-assisted selection initiatives aimed at improving performance. Loci with the favorable response to density treatments in addition to loci with a favorable mean may be important in understanding the genetic basis of grain yield response to density. Furthermore, although the approach proposed here was applied to a small set of SILs, it can be extended to recombinant inbred lines, backcrosses, and F_{2} mapping populations by specifying the design matrix to accommodate the marker information (Haley and Knott 1992; Whittaker *et al*. 1996; Coffman *et al*. 2005).

## Acknowledgments

The authors thank the two anonymous reviewers for insightful comments. Data for the initial studies were kindly provided by Brian Hostert. William Foster and Phil Devillez were critical in the planting, harvesting, and execution of this experiment and Wilfred Vermerris assisted with field management. Andrew Linville shelled and counted the 5962 ears in this study and the students Eleni Bachlava, Kyle Becker, and Jason Brewer helped score phenotypes in the field. This work is supported by National Science Foundation grant DBI 98-08026/00-96044 (L.M.M.) and National Institutes of Health grant NIA-AG16996 (L.M.M.) and by the Purdue University agricultural research centers, United States Department of Agriculture–National Research Institute–Competitive Grants Program award no. 2001-35301-10601 (J.B.H.).

## Footnotes

Communicating editor: J. B. Walsh

- Received May 17, 2005.
- Accepted February 13, 2006.

- Copyright © 2006 by the Genetics Society of America