Abstract
Nucleotide variation in a 2.2-kbp region of basic chitinase (ChiB) locus in 17 ecotypes of Arabidopsis thaliana was compared with previously investigated regions to investigate genetic mechanisms acting on DNA polymorphism. In the ChiB region, dimorphic DNA variation was detected, as in the Adh and ChiA regions. Nucleotide diversity (π) of the entire region was 0.0091, which was similar to those of the two other regions. About half of polymorphic sites (37/87) in the ChiB region were observed in only two ecotypes. Tajima's D was negative but not significantly, while Fu and Li's D* was positive. Neither McDonald-Kreitman nor Hudson, Kreitman, Aguadé tests showed a significant result, indicating that these loci were under similar evolutionary mechanisms before and after speciation. Linkage disequilibria were observed within the three regions because of dimorphic polymorphisms. Interlocus linkage disequilibrium was not detected between the Adh and the two chitinase regions, but was observed between the ChiA and ChiB regions. This could be due to epistatic interaction between the two chitinase loci, which are located on different chromosomes.
In Arabidopsis thaliana, DNA polymorphism was first reported in the alcohol dehydrogenase (Adh) region (Hanfstingl et al. 1994; Innan et al. 1996; Miyashita et al. 1996). In the Adh region, dimorphic DNA variation was observed and considered the result of balancing selection (Hanfstingl et al. 1994) and/or fusion of two diverged subpopulations (Innan et al. 1996). Similarly, dimorphism was observed in the acidic chitinase (ChiA) region (Kawabe et al. 1997) and Cauliflower (Cal) region (Purugganan and Suddith 1998), suggesting that the presence of the dimorphic variations could be a characteristic of the nuclear genome of A. thaliana. However, patterns of polymorphism (i.e., the proportion of singleton and replacement polymorphisms) and results of neutrality tests varied among the regions.
To clarify the mechanisms responsible for the different patterns of polymorphism in the regions, one approach is to compare loci that have similar function. The chitinase system in plants provides a good opportunity, since the system consists of multiple genes of similar function but distinct nucleotide (or amino acid) sequences. Chitinases are enzymes that catalyze hydrolysis of chitin and play an important role in plant defense systems against fungal pathogens. Plant chitinases are classified into four classes with respect to their structures. Class I chitinase has four motifs: signal peptide, cysteine-rich (chitin binding) domain, hinge region, and catalytic domain (Shinshi et al. 1990; Collinge et al. 1993; Hamel et al. 1997). Class II and IV chitinases are similar to class I chitinase. Class II chitinase lacks the N-terminal cysteine-rich domain, while class IV chitinase is distinguished from class I chitinase by four deletions (Collinge et al. 1993; Hamel et al. 1997). Class III chitinase is similar to bacterial chitinases and has lysozyme activities (Metraux et al. 1989; Henrissat 1990; Jekel et al. 1991; Watanabe et al. 1992; Beintema and Terwisscha van Scheltinga 1996). Of the four classes of chitinases, only class III chitinase shows no sequence similarity to the other classes and may have an independent evolutionary origin (Meins et al. 1992; Collinge et al. 1993; Hamel et al. 1997). As with other pathogenesis-related (PR) proteins, chitinases are classified as acidic or basic according to their protein constitutions and expression patterns. The acidic and basic chitinases are expressed in the central vacuole and intracellular space, respectively (Mauch and Staehelin 1989; Metraux et al. 1989). The induction mechanisms are also different between acidic and basic PR proteins. The acidic PR proteins are induced by salicylic acid, while basic PR proteins are induced by ethylene (Ohashi and Ohshiro 1992).
In A. thaliana, two chitinases, acidic (ChiA) and basic (ChiB) chitinases, were cloned and their DNA sequences were reported (Samac et al. 1990). The ChiA locus has three exons, encoding a protein 302 amino acids (aa) long, while the ChiB locus has two exons, whose product is 336 aa long. It is not possible to align nucleotide or amino acid sequences of the two chitinases. ChiA and ChiB are located on chromosomes 5 and 3, respectively (Bell and Ecker 1994; Sato et al. 1998). ChiA encodes a class III chitinase and ChiB a class I chitinase. These two chitinases were classified as acidic and basic chitinases, although both chitinases are basic proteins (Beintema and Terwisscha van Scheltinga 1996). This classification is based on their different expression patterns (Samac et al. 1990; Samac and Shah 1991, 1994). ChiB is expressed in the root tissue constitutively and induced by ethylene (Samac et al. 1990; Samac and Shah 1994), as are other basic PR proteins (Ohashi and Ohshiro 1992). ChiA is expressed in root, leaf vascular, hydathode, guard cell, and anther tissues (Samac and Shah 1991) and is induced by salicylic acid (Samac and Shah 1991), as is typical for acidic PR proteins (Ohashi and Ohshiro 1992).
In this study, we report intraspecific nucleotide variation in a 2.2-kbp fragment of the ChiB region in A. thaliana using the same ecotypes (ecological races) analyzed for the ChiA region. The main purpose of this study is to investigate whether or not polymorphic patterns in the ChiA region are related to protein function. If the pattern of polymorphism in a locus is related to its function, similar patterns should be found in loci of similar function.
MATERIALS AND METHODS
Plant materials: Seeds of 16 ecotypes of A. thaliana were obtained from Professor Nobuharu Goto, Sendai Arabidopsis Seed Stock Center, Miyagi University of Education, Sendai, Japan and were grown in soil pots in an incubator under 16-hr light and 8-hr dark conditions. These ecotypes were sampled worldwide. Sixteen of the 17 ecotypes were analyzed for Adh variation (Innan et al. 1996), and the same 17 ecotypes were analyzed for the ChiA region (Kawabe et al. 1997). To estimate divergence, a related species, Arabis gemmifera (sampled at Ashibi, Kyoto prefecture, Japan), was used.
Sequencing: Total DNA was extracted from mature plants by the CTAB method of Weising et al. (1991) and used as template for PCR amplification. Primers for PCR amplification were 5′ CGT CTA TTT TTA TTT TCT CCA 3′ (BCH-1) and 5′ TTG GTT TGA TGT TGG TTT TGT 3′ (BCH-102), which amplify a fragment about 2.5 kbp long encompassing the entire coding region of the ChiB locus. Two primers, BCH-1 and 5′ AAG TTT AGG CTC TGA TTT ATG 3′ (BCH-101), were used for A. gemmifera to amplify a fragment about 2.5 kbp long containing the entire coding region of the ChiB locus. The PCR products were cloned into plasmid pUC 18 as described in Terauchi et al. (1997). Sequencing reactions (ALFexpress Auto Read sequence kit, Pharmacia, Piscataway, NJ) followed the dideoxy chain termination method (Sanger et al. 1977) using cloned plasmid as the template. Sequencing primers were designed at ~350- to 400-bp intervals. The sequence for each ecotype was determined in both strands by a Pharmacia ALFred sequencer. Singleton variants were confirmed by sequencing more than two clones to eliminate PCR artifacts. Newly obtained nucleotide sequences were deposited in the DDBJ database under nos. AB023448–AB023464.
Sequence analyses: Seventeen sequences including a published sequence of the ecotype Columbia (Col-0: Samac et al. 1990) were studied. The analyzed region is located between nucleotide positions 394 and 2640 of the Columbia ecotype (Samac et al. 1990). Nucleotide diversity (π, Nei and Li 1979; Tajima and Nei 1984), 4Nμ (θ, Watterson 1975), the number of recombination events (Rm, Hudson and Kaplan 1985), and 4Nc (C, Hudson 1987) were estimated by using DnaSP program version 2.5 (Rozas and Rozas 1997). The phylogenetic tree was constructed by the neighbor-joining (NJ) method (Saitou and Nei 1987) based on the genetic distances for the entire region calculated by the Jukes and Cantor (1969) method (PHYLIP v. 3.572; Felsenstein 1996). Nucleotide sequences of the Adh and ChiA regions were obtained from Miyashita et al. (1996, 1998), Innan et al. (1996), and Kawabe et al. (1997). Between all non-singleton variants in the three regions, linkage disequilibrium was analyzed by the DnaSP program (Rozas and Rozas 1997) and tested by Fisher's exact test.
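The two diversity estimators named above can be illustrated with a minimal Python sketch. It is not the DnaSP implementation: it assumes a gap-free alignment of equal-length sequences, and the four toy sequences and the function names are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from itertools import combinations

def segregating_sites(seqs):
    """Count alignment columns that show more than one nucleotide state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)

def nucleotide_diversity(seqs):
    """Pi: mean number of pairwise differences, per site (Nei and Li 1979)."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / len(pairs) / len(seqs[0])

def watterson_theta(seqs):
    """Theta per site from the number of segregating sites (Watterson 1975)."""
    n = len(seqs)
    a1 = sum(1 / i for i in range(1, n))  # harmonic number of order n - 1
    return segregating_sites(seqs) / a1 / len(seqs[0])

# Hypothetical toy alignment (not data from this study): 4 sequences, 10 sites.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGAACGTTC"]
pi = nucleotide_diversity(toy)
theta = watterson_theta(toy)
print(f"S = {segregating_sites(toy)}, pi = {pi:.4f}, theta = {theta:.4f}")
```

Under neutrality and constant population size the two estimators have the same expectation, which is why their difference forms the numerator of Tajima's D.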
RESULTS
DNA polymorphism in the ChiB region of A. thaliana: There were 109 variants (83 nucleotide substitutions and 26 length variants) in the entire ChiB region (Figure 1). Of these, 22 nucleotide substitutions and 5 length variants were singletons. Length variation was observed only in noncoding regions. Five repeat number variants [poly(A) at site 500–506, poly(T) at 1335–1338, poly(A) at 1391–1396, poly(T) at 2598–2603, and poly(AT) at 1306–1333] were excluded from the following analyses.
In the coding region, 21 nucleotide polymorphisms were observed, including four replacement substitutions. The proportion of replacement nucleotide sites (19.1%) was lower than those in the ChiA (43.2%) and Adh (31.6%) regions. The high proportion of replacement sites in ChiA could be specific to the locus. Only one of the replacement variants in ChiB was a non-singleton; it caused a nonconservative amino acid change, between negatively charged Asp and positively charged Lys (Miyata et al. 1979). However, the site was located at amino acid position 17, which is included in the N-terminal region of 30 amino acids that is cut off from the mature protein (Samac et al. 1990). Therefore, this nonconservative polymorphic site is not related to chitinase function. The other three amino acid changes, all singletons, were conservative.
Nucleotide diversity (π) of the ChiB region was estimated (Table 1). Nucleotide diversity of the entire region was 0.0091. This value was almost the same as those of the ChiA (0.0104), Adh (0.0080), and Cal (0.0070) regions. Table 1 also summarizes the results of Tajima's (1989) and Fu and Li's (1993) tests. The D test statistics were negative in most regions, while D* was positive in some regions. None of the tests gave a significant result, except for Fu and Li's test on the 3′ flanking region. The ChiB region had an excess of doublet sites (Figure 2), which contrasted with the other three loci. This excess of doublet polymorphisms caused contrary results for Tajima's and Fu and Li's tests. The significantly positive D* value under Fu and Li's test for the 3′ flanking region reflected that only 1 of 21 polymorphic sites in the region was a singleton polymorphism. In the ChiA region, the two test statistics were significantly negative, because of excess singletons (Kawabe et al. 1997). These chitinase loci also differed in the proportion of singleton nucleotide sites.
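Why an excess of doublets can leave Tajima's D negative may be easier to see numerically. The sketch below implements the standard formula of Tajima (1989) and compares the contribution of a single site to the mean number of pairwise differences with the neutral expectation 1/a1 for n = 17 sequences. The example of 20 segregating sites is hypothetical, not the count from Table 1.

```python
from math import comb, sqrt

def tajimas_d(n, s, k):
    """Tajima's D from sample size n, segregating sites s, and the mean
    number of pairwise differences k (per region, not per site)."""
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

def site_contribution(n, j):
    """Pairwise differences contributed by one biallelic site whose rarer
    variant is carried by j of the n sequences."""
    return j * (n - j) / comb(n, 2)

n = 17                                   # sample size used in this study
a1 = sum(1 / i for i in range(1, n))
neutral_per_site = 1 / a1                # expected contribution per site
singleton = site_contribution(n, 1)
doublet = site_contribution(n, 2)

# Hypothetical region with 20 segregating sites, all of them doublets.
d_all_doublets = tajimas_d(n, 20, 20 * doublet)
print(f"neutral {neutral_per_site:.4f}, singleton {singleton:.4f}, "
      f"doublet {doublet:.4f}, D = {d_all_doublets:.3f}")
```

With 17 sequences a doublet site still contributes less than the neutral expectation, so doublets pull D downward, whereas a statistic that counts only singletons, such as Fu and Li's D*, can move in the opposite direction when singletons are scarce.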
Figure 1. Summary of DNA variations detected in the ChiB region of A. thaliana. Vertical bars indicate nucleotide variations, of which replacement changes are shown with solid circles. Open triangles indicate indels. Non-singleton changes are summarized at the bottom, where a dot indicates a nucleotide identical to the consensus, + indicates presence of an indel, and minus indicates absence. Sequences of indels, indicated by lowercase letters, are a, TTA; b, AA; c, ACTGT; d, CTGGATACTAC; e, TCTTGATTAATCTAAACGCAAAT; f, TTAA; g, AATATAATAC; h, AT; i, ATATT; and j, GATTC.
Intragenic recombination in the ChiB region of A. thaliana: As in the Adh, ChiA, and Cal regions, nonsingleton polymorphisms found throughout the ChiB region were dimorphic (Figure 1), confirming previous results that dimorphism was a characteristic of DNA variation in the A. thaliana nuclear genome. It was certain that intragenic recombination had occurred in the ChiB region, especially in exon 2. Two ecotypes, Ci-0 and Pog-0, which have 39 distinct variants, were diverged from the other ecotypes. But these two ecotypes are not obvious parental types of any evidently recombinant ecotypes. Complicated partitioning in the ChiB region and the absence of an informative outgroup sequence made it impossible to use the methods previously used for detecting the number and times of recombination events (Innan et al. 1996; Kawabe et al. 1997). Instead, we estimated the minimum number of recombination events (Rm, Hudson and Kaplan 1985) and 4Nc (C, Hudson 1987) to describe intragenic recombination in the ChiB region, together with the Adh and ChiA regions (Table 2). The values per site for ChiB were intermediate between those of the other two regions. These two chitinase loci again differed in the rate of recombination. The estimated C values for each of the three regions were always smaller than estimates of θ, indicating that recombination has occurred less frequently than nucleotide mutation. Although estimates of 4Nμ (θ) were relatively constant over the three regions, those of C varied widely. This result meant that the recombination rate in A. thaliana was not constant, but strongly depended on region.
Table 1. Summary of polymorphism in the ChiB region of A. thaliana
Figure 2. Frequency spectrum of polymorphic nucleotide sites in the ChiB region. The expected value was obtained according to Tajima (1989).
Genealogical relationship of ecotypes based on nucleotide variations in the ChiB region: An estimate of genealogical relationships among ecotypes was obtained by the NJ method, based on nucleotide variation in the entire ChiB region (Figure 3). Because there have been some recombination events, this NJ tree may not be greatly informative. However, there were clearly three distinct clusters. The first cluster included 13 ecotypes, the second cluster included Mt-0 and Kn-0, and the third cluster consisted of Ci-0 and Pog-0. Nucleotide diversities within clusters II and III were low (0.0005 and 0.0009, respectively), while that of cluster I was 0.0075, which was similar to the overall estimate. This result suggested that sequences in cluster I were relatively older than those in clusters II and III. Average nucleotide distances between clusters were 0.0147, 0.0221, and 0.0223 for I and II, I and III, and II and III, respectively. For the Adh and ChiA regions, average nucleotide distances between the two most divergent sequence types were 0.0127 and 0.0148, respectively. Thus, the ChiB region contained more divergent sequences than the other two regions.
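The between-cluster distances above rest on the Jukes and Cantor (1969) correction, which can be sketched in a few lines of Python. This is an illustration only, not the PHYLIP computation: the sequences are hypothetical, gaps are not handled, and the helper names are ours.

```python
from math import log

def p_distance(s1, s2):
    """Proportion of sites that differ between two aligned sequences."""
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)

def jukes_cantor(p):
    """Jukes-Cantor distance: expected substitutions per site given the
    observed proportion p of differing sites (requires p < 0.75)."""
    return -0.75 * log(1 - 4 * p / 3)

def mean_between(cluster_a, cluster_b):
    """Average Jukes-Cantor distance over all between-cluster pairs."""
    dists = [jukes_cantor(p_distance(x, y))
             for x in cluster_a for y in cluster_b]
    return sum(dists) / len(dists)

# Hypothetical toy clusters (not ecotype sequences from this study).
cluster_a = ["ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGA"]
cluster_b = ["TCGTACGTACGTACGTACGT"]
print(f"mean between-cluster distance = {mean_between(cluster_a, cluster_b):.4f}")
```

At the divergence levels reported here (about 0.01 to 0.02) the correction is very small, so the corrected distances are close to the raw proportions of differing sites.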
Figure 3. Neighbor-joining tree of ecotypes based on the nucleotide variation in the ChiB region. Bootstrap probabilities >50% are shown above branches.
As in the Adh and ChiA regions, no association between phylogenetic clustering and sample locations was observed (Figure 3). In particular, the two ecotypes in each of clusters II and III, which had low nucleotide diversity, came from different continents. These results support the hypothesis that the A. thaliana population has spread over the world recently (King et al. 1993; Price et al. 1994; Innan et al. 1996, 1997).
Table 2. Summary of estimated recombination at the three regions
Table 3. Result of McDonald-Kreitman test for the ChiB locus
Interspecific variation between A. thaliana and A. gemmifera in the ChiB coding region: A McDonald-Kreitman test (McDonald and Kreitman 1991) was conducted to examine the ratio of polymorphic replacement and silent sites in A. thaliana relative to divergence between A. thaliana and A. gemmifera (Table 3). Because only one sequence was determined in A. gemmifera, the divergence could include both fixed sites between A. thaliana and A. gemmifera and some polymorphic sites within A. gemmifera. None of the comparisons gave significant results, although the ratio of replacement to silent changes for the interspecific comparison was higher than that for the intraspecific comparison. This result indicated that the ratio of replacement to synonymous substitutions did not differ significantly between intra- and interspecific variation.
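The logic of the McDonald-Kreitman test can be sketched as a 2 x 2 contingency test. In the example below the polymorphism counts (4 replacement and 17 silent) follow from the 21 coding-region polymorphisms described above, but the divergence counts are hypothetical placeholders, since the values of Table 3 are not reproduced in the text. The two-sided Fisher's exact test is written out with the standard library only.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]:
    sum the probabilities of all tables with the same margins that are
    no more probable than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # The small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Polymorphism within A. thaliana (from the coding-region counts in the text).
poly_replacement, poly_silent = 4, 17
# Divergence from A. gemmifera: HYPOTHETICAL counts, for illustration only.
div_replacement, div_silent = 10, 30

p_value = fisher_exact_two_sided(poly_replacement, poly_silent,
                                 div_replacement, div_silent)
print(f"polymorphism ratio = {poly_replacement / poly_silent:.3f}, "
      f"divergence ratio = {div_replacement / div_silent:.3f}, P = {p_value:.3f}")
```

A nonsignificant result of this kind means only that the two ratios cannot be distinguished with the available counts; it does not show that they are equal.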
To examine the difference in levels of nucleotide variation between polymorphism and divergence among the Adh, ChiA, and ChiB regions, a Hudson, Kreitman, Aguadé test (Hudson et al. 1987) was conducted, comparing the number of polymorphic sites in coding regions and using A. gemmifera as a reference species (Table 4). None of the comparisons were significant, indicating that the level of polymorphism was consistent with divergence among the three regions.
Linkage disequilibrium within and between loci: Because most of the ecotypes studied for the ChiB region were also used for analyses of the Adh and ChiA regions (Innan et al. 1996; Kawabe et al. 1997), it was possible to analyze intra- and interlocus linkage disequilibrium between polymorphic DNA variants in the three regions (Figure 4). Since all three regions showed dimorphism, significant linkage disequilibria were observed within each region. The proportions of linkage disequilibria that were significant at least at the 5% level were 368 of 1035 tests (36%), 173 of 351 (49%), and 1651 of 3403 (49%) for Adh, ChiA, and ChiB, respectively. Sign tests (Lewontin 1995) were applied to test the overall degree of linkage disequilibrium (Table 5). In each region, an excess of coupling (positive) linkage disequilibrium was detected. Between the Adh and either of the two chitinase regions, almost no pairs of sites showed linkage disequilibrium significant at the 5% level [12 of 5060 tests (0.2%)]. On the other hand, between the two chitinase regions, 428 of 2241 tests (19%) were significant at the 5% level. The significant interlocus linkage disequilibria were caused by two divergent ecotypes Ci-0 and Pog-0, which had unique polymorphisms in the two regions. In the ChiB region, 27 nucleotide substitutions and 12 indels were uniquely found in Ci-0 and Pog-0. Almost all of these variants were located in noncoding regions, with the exception of two synonymous substitutions in exon 1. It should be mentioned that because of the enormous number of comparisons, none of the tests were significant when the Bonferroni correction was applied.
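The effect of the Bonferroni correction can be made concrete with a short calculation. The numbers of tests reported above match all pairwise comparisons among 46, 27, and 83 non-singleton variants in Adh, ChiA, and ChiB. These three counts are inferred here from the test totals and are not stated in the text. The sketch also assumes, for illustration, that the correction was applied across all tests jointly and that every comparison used 17 sequences.

```python
from math import comb

# Variant counts per region, INFERRED from the reported numbers of tests.
sites = {"Adh": 46, "ChiA": 27, "ChiB": 83}

# Within-region tests: every unordered pair of variants.
within = {locus: comb(k, 2) for locus, k in sites.items()}

# Between-region tests: every variant of one region against every variant
# of the other region(s).
adh_vs_chitinases = sites["Adh"] * (sites["ChiA"] + sites["ChiB"])
chia_vs_chib = sites["ChiA"] * sites["ChiB"]

total = sum(within.values()) + adh_vs_chitinases + chia_vs_chib
bonferroni_alpha = 0.05 / total

# Smallest two-sided P attainable by Fisher's exact test with 17 sequences:
# two sites in complete association, each splitting the sample 8 versus 9.
min_p = 1 / comb(17, 8)

print(within, adh_vs_chitinases, chia_vs_chib, total)
print(f"Bonferroni threshold = {bonferroni_alpha:.2e}, smallest P = {min_p:.2e}")
```

Under these assumptions the smallest attainable P value exceeds the corrected threshold, so no pair of sites could reach significance regardless of how strongly associated it is. The absence of significant tests after correction is therefore a consequence of sample size as much as of the data.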
DISCUSSION
Patterns of DNA variation among the Adh, ChiA, and ChiB regions: One of the main purposes of this study was to distinguish whether or not the polymorphic patterns in the ChiA region, i.e., excess of rare alleles and higher replacement polymorphism, were related to the function of chitinase protein. None of the ChiA-specific polymorphic patterns were observed in the ChiB region, except for asymmetric partitioning of dimorphism. In both regions, dimorphism was maintained at low frequencies. These results suggested that the patterns of polymorphism in the ChiA region were not related to chitinase function, but might be specific to the locus. Therefore, another approach is necessary to determine the causes of higher replacement polymorphism or high proportion of rarer alleles in the ChiA region.
Table 4. Summary of the results of Hudson, Kreitman, Aguadé test for the ChiB region
Figure 4. Intra- and interlocus linkage disequilibrium between polymorphic DNA variations in the Adh, ChiA, and ChiB regions. Polymorphic variations in each region are arranged from 5′ to 3′. Statistical significance based on Fisher's exact test is shown.
In Drosophila melanogaster, recombination rate is related to level of intraspecific DNA variations (Aguadé et al. 1991; Begun and Aquadro 1992) as a result of hitchhiking and/or background selection (Maynard Smith and Haigh 1974; Charlesworth et al. 1993). A. thaliana is a self-pollinated plant and its outcrossing rate was estimated to be only ~0.3% (Abbott and Gomes 1989). In such highly selfing species, the whole genome may act as a linked locus and the effects of hitchhiking or background selection may be similar for all loci (Charlesworth et al. 1993). Estimated recombination rates were higher in the Adh and ChiB regions than in the ChiA region (Table 2). The reason why the recombination rate in the ChiA region was reduced could not be clarified in this study. However, the rates of synonymous substitution were also related to estimated recombination rates (C) of the three regions in A. thaliana (d.f. = 1, r = 0.99996, P < 0.01), as shown in D. melanogaster (Moriyama and Powell 1996). This relationship indicates that recombination may play an important role in determining levels of intraspecific variation in A. thaliana.
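Because the correlation above rests on only three regions (d.f. = 1), it may help to see how the P value follows from r. The sketch below assumes the usual t test for a correlation coefficient, t = r sqrt(d.f. / (1 - r^2)); with one degree of freedom the t distribution is the standard Cauchy distribution, so the two-sided P value has a closed form. Whether this is the exact procedure used in the study is an assumption.

```python
from math import atan, pi, sqrt

def p_value_df1(r):
    """Two-sided P for a correlation coefficient r estimated from three
    data points (d.f. = 1)."""
    t = abs(r) * sqrt(1 / (1 - r * r))  # t statistic with d.f. = 1
    # The t distribution with 1 d.f. is the standard Cauchy distribution,
    # whose cumulative distribution function is 1/2 + atan(t) / pi.
    return 1 - 2 * atan(t) / pi

p_reported = p_value_df1(0.99996)   # r reported in the text
p_weaker = p_value_df1(0.99)        # hypothetical, slightly weaker correlation
print(f"P(r = 0.99996) = {p_reported:.4f}, P(r = 0.99) = {p_weaker:.4f}")
```

The reported r is consistent with P < 0.01 under this test, but the comparison with r = 0.99 shows how demanding three data points are: even a correlation that high would not be significant at the 5% level.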
In the three regions, C/θ, which reduced to c/μ, ranged from 0.0001 to 0.2197 and the average was 0.0823. In D. melanogaster, this ratio was estimated as 1.6 in the Adh region (Hudson 1987). If the mutation rate has been similar between D. melanogaster and A. thaliana, the lower value of C/θ in A. thaliana suggests less recombination in this plant species and is consistent with the difference in breeding system of these organisms.
Interchromosomal linkage disequilibrium between the two chitinase regions: Although statistical significance was not confirmed for linkage disequilibrium, it is clear that the two ecotypes have a divergent sequence type in the two chitinase loci. To explain this pattern in the chitinase loci, which lie on different chromosomes, two hypotheses could be considered. First, the pattern may represent a consequence of random assortment of the chromosomes on which the two loci are located, after formation of a heterozygote between two divergent sequence types. Although this plant species is highly selfing, previous studies of nucleotide variation in the Adh and ChiA loci (Innan et al. 1996; Kawabe et al. 1997) showed that recombination events had occurred. Because recombination events can be detected only in the presence of heterozygous individuals, some outcrossing must have occurred in the evolutionary history of A. thaliana. Once this combination is formed in the two chitinase loci, it will be maintained by selfing at the species level. The second explanation for the pattern in the chitinase loci could be epistatic interaction between the genes. Since ChiA and ChiB are functionally related, this hypothesis is attractive. At present, we do not have any supporting evidence. To distinguish the possibilities, it may be necessary to study DNA variation in surrounding regions of the chitinase loci and compare chitinase activity between ecotypes.
Table 5. The sign test applied to Adh, ChiA, and ChiB regions of A. thaliana
Acknowledgments
We are grateful to N. Goto, Sendai Arabidopsis Stock Center, Miyagi University of Education, for A. thaliana seeds and to F. Tajima, M. Uyenoyama, and E. Stahl for comments and improving the manuscript. This article is contribution number 556 from the Laboratory of Plant Genetics, Graduate School of Agriculture, Kyoto University.
Footnotes
- Communicating editor: M. K. Uyenoyama
- Received November 3, 1998.
- Accepted July 27, 1999.
- Copyright © 1999 by the Genetics Society of America