Genetics, Vol. 159, 623-633, October 2001, Copyright © 2001

The Relationship Between Third-Codon Position Nucleotide Content, Codon Bias, mRNA Secondary Structure and Gene Expression in the Drosophilid Alcohol Dehydrogenase Genes Adh and Adhr

David B. Carlini (a), Ying Chen (a), and Wolfgang Stephan (b)
(a) Department of Biology, University of Rochester, Rochester, New York 14627
(b) Department of Evolutionary Biology, University of Munich, 80333 Munich, Germany

Corresponding author: Wolfgang Stephan, Department of Evolutionary Biology, University of Munich, 80333 Munich, Germany. E-mail: stephan@zi.biologie.uni-muenchen.de

Communicating editor: J. HEIN


*  ABSTRACT

To gain insights into the relationship between codon bias, mRNA secondary structure, third-codon position nucleotide distribution, and gene expression, we predicted secondary structures in two related drosophilid genes, Adh and Adhr, which differ in degree of codon bias and level of gene expression. Individual structural elements (helices) were inferred using the comparative method. For each gene, four types of randomization simulations were performed to maintain/remove codon bias and/or to maintain or alter third-codon position nucleotide composition (N3). In the weakly expressed, weakly biased gene Adhr, the potential for secondary structure formation was found to be much stronger than in the highly expressed, highly biased gene Adh. This is consistent with the observation of approximately equal G and C percentages in Adhr (~31% across species), whereas in Adh the N3 distribution is shifted toward C (42% across species). Perturbing the N3 distribution to approximately equal amounts of A, G, C, and T increases the potential for secondary structure formation in Adh, but decreases it in Adhr. On the other hand, simulations that reduce codon bias without changing N3 content indicate that codon bias per se has only a weak effect on the formation of secondary structures. These results suggest that, for these two drosophilid genes, secondary structure is a relatively independent, negative regulator of gene expression. Whereas the degree of codon bias is positively correlated with level of gene expression, strong individual secondary structural elements may be selected for to retard mRNA translation and to decrease gene expression.


The highly conserved secondary structures of rRNAs (NOLLER and WOESE 1981), tRNAs (SPRINZL et al. 1987), catalytic RNAs (PACE et al. 1989), and precursor mRNAs (KIRBY et al. 1995) have been closely studied. A variety of methods have been used to predict secondary structures and, in some cases, inferred structures have been experimentally verified. In comparison, the extent to which secondary structures form within protein-coding regions of mature mRNAs has received less attention. The primary constraint on the evolution of protein-coding genes is purifying selection at the amino acid level. Until recently, variation at synonymous sites was thought to be selectively neutral. In the past decade, however, the role of codon bias, i.e., weak selection for preferred synonymous codons, has been recognized as an important evolutionary force.

Several hypotheses have been advanced to account for the positive correlation between the degree of codon bias and level of gene expression. It is thought that this relationship reflects selection for the use of codons specifying abundant tRNA molecules (POST et al. 1979; GRANTHAM et al. 1980; IKEMURA 1981; BENNETZEN and HALL 1982; SHARP and LI 1986). Thus, the evolution of codon bias in highly expressed genes is hypothesized to be due to natural selection for increased protein elongation rates (i.e., translational efficiency; BULMER 1991) and for minimization of errors in translation of the mRNA transcript (i.e., translational accuracy; AKASHI 1994). Another possible interpretation is that codon bias is related to mRNA secondary structure. In this case, the relationship between codon bias and gene expression may in fact be a secondary consequence of the effect of mRNA secondary structure on mRNA translation (WADA and SUYAMA 1986; ANTEZANA and KREITMAN 1999).

If mRNA secondary structure and codon bias are interrelated, then it should be possible to ascertain the relationship between the two factors. In other words, are mRNAs from highly biased genes more (or less) stable than mRNAs from unbiased genes? While the relationship may appear to be relatively straightforward to test, ascertaining the degree of mRNA stability is complicated. Unlike the shorter tRNAs, which can be crystallized and subjected to X-ray diffraction (DOCK et al. 1984), secondary structures for mRNAs cannot be directly observed. They must instead be inferred using two basic techniques, free energy minimization and phylogenetic comparisons.

Algorithms based on free energy minimization (e.g., ZUKER et al. 1991) perform well for short sequences (WALTER et al. 1994) but become unreliable as sequence length increases due to the vast number of possible structures for longer sequences (KONINGS and GUTELL 1995). Also, comparisons between different genes must be standardized to account for differences in sequence length and nucleotide composition. The phylogenetic comparative method (FOX and WOESE 1975; PACE et al. 1989) provides a more reliable means of identifying secondary structures, but it does not provide a quantitative measure of stability. Furthermore, the level of sequence divergence and phylogenetic relationships of the aligned sequences are not considered. An alternate method for detecting RNA secondary structures was developed by MUSE (1995). Using an explicitly defined evolutionary model, likelihood-ratio tests (LRTs) are performed to identify and quantify the relative stability of regions showing constraints for Watson-Crick base pairings. Muse's method was extended by PARSCH et al. (2000) to allow structures to be predicted and LRT scores calculated without a priori knowledge of the paired regions.

In this study we used the method of PARSCH et al. (2000) to predict secondary structures in two related drosophilid genes, Adh and Adhr, which differ in degree of codon bias and level of gene expression. By comparing the LRT scores of individual secondary structures between the two genes, we gain insight into the nature of the relationship between codon bias, third-codon position nucleotide content (N3), mRNA secondary structure, and gene expression. Fig 1 illustrates the hypothesized interactions between these factors. Adh and Adhr were selected for analysis because they met the following criteria:

  1. The two genes exhibit a substantial difference in the extent of codon bias and level of gene expression.



    Figure 1. Hypothesized interactions between mRNA secondary structure, codon bias, N3 content, and gene expression. Solid arrows represent interactions examined in this study.

  2. The method of PARSCH et al. (2000) requires a set of aligned homologous sequences and a "known" phylogeny of the aligned sequences. The method is computer intensive and therefore candidate genes must not be too long (<1 kb).

  3. The genes are similar in length and have similar base compositions.

  4. The two genes are very tightly linked (<1 kb) and thus presumably undergo similar rates of recombination (in the species from which Adhr is sampled in this study). In Drosophila, this was the only gene pair to meet these criteria.

Our specific aims are: (1) to examine whether secondary structure has an effect on gene expression, (2) to test whether codon bias per se influences secondary structure formation, and (3) to determine whether N3 affects secondary structure. The interactions we investigate in this study are shown as solid-line arrows in Fig 1. For each gene, we performed four types of randomization simulations to maintain or reduce codon bias and/or to maintain or alter third-codon position nucleotide content. The potential of forming individual secondary structural elements (helices) is measured in terms of LRT scores. The LRT distributions predicted from the different sets of 100 randomization simulations were compared to each other and to the structures predicted from analyses of the native sequences to address the above three questions. For several reasons (in particular, to avoid problems associated with the alignment of the sequences), we focus here on LRT score distributions of individual pairing regions occurring in exons only.


*  MATERIALS AND METHODS

Sequences and alignments:
Adh sequences from seven drosophilid species were downloaded from GenBank (Table 1) and aligned by eye. Relative to the D. melanogaster Adh sequence (Wa-S allele), a single 6-bp gap was introduced at positions 7–12 in all other sequences, with the exception of Zaprionus, where a single 3-bp gap was introduced at positions 10–12. The Adh alignment comprised 771 positions (including gaps). Adhr sequences from six drosophilid species were downloaded from GenBank (Table 1) and aligned using the CLUSTAL algorithm as implemented in GeneJockey II (TAYLOR 1996) using the default parameters. Both alignments are available at http://troi.cc.rochester.edu/~ying/align.html.


 

 
Table 1. Accession numbers, codon bias, and base composition of sequences used in this study

Secondary structure prediction:
Putative pairing regions were identified using the PIRANAH software program described in PARSCH et al. (2000). To calculate the LRT statistic for a set of aligned sequences, PIRANAH requires a phylogenetic tree for the aligned sequences. Tree topologies were based on the relationships determined from a comprehensive molecular phylogenetic study of the drosophilids (RUSSO et al. 1995). For Adh, the input tree was (((((D. melanogaster, D. pseudoobscura), D. willistoni), (D. hydei, D. immigrans)), Z. tuberculatus), S. lebanonensis). The Adhr input tree topology used was ((((D. melanogaster, D. teissieri), (D. pseudoobscura, D. ambigua)), D. immigrans), S. lebanonensis). The program was modified to allow several data sets to be analyzed sequentially. GU wobble pairs were permitted internally; GU terminal pairs were treated as mismatches. For the native Adh sequences and all randomization simulations of the Adh alignment, potential helices were considered if conserved in six out of the seven sequences (15 out of 21 comparisons). Minimum helix length was set to 4 bp. LRT scores were calculated for all helices meeting the criteria specified above (i.e., no minimum "pairing score"). LRTs were calculated for all helices so that the distribution of LRT scores could be compared among randomizations. PIRANAH settings for Adhr were identical to those for Adh, except that potential helices were considered if conserved in five out of the six sequences (10 out of 15 comparisons).
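
The pairing rules described above can be illustrated with a minimal sketch. This is not PIRANAH code: the function name helix_is_valid and the way the two strands are passed in are assumptions made purely for illustration. The sketch accepts Watson-Crick pairs anywhere, accepts G-U pairs only at internal positions, and enforces the 4-bp minimum length.

```python
from math import comb

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}

def helix_is_valid(strand_5p, strand_3p, min_len=4):
    """Return True if two equal-length RNA strands form an acceptable helix.

    strand_5p is read 5'->3'; strand_3p is its partner read 3'->5', so that
    strand_5p[k] pairs with strand_3p[k]. (Hypothetical interface.)
    """
    if len(strand_5p) != len(strand_3p) or len(strand_5p) < min_len:
        return False
    last = len(strand_5p) - 1
    for k, pair in enumerate(zip(strand_5p, strand_3p)):
        if pair in WATSON_CRICK:
            continue
        # G-U wobble pairs are tolerated only at internal positions;
        # at either end of the helix they count as mismatches.
        if pair in WOBBLE and 0 < k < last:
            continue
        return False
    return True

# Number of pairwise sequence comparisons behind the conservation thresholds:
# 7 sequences give comb(7, 2) = 21 pairs, 6 sequences give comb(6, 2) = 15.
adh_total, adh_required = comb(7, 2), comb(6, 2)
adhr_total, adhr_required = comb(6, 2), comb(5, 2)
```

The comb() values show where the "15 out of 21" and "10 out of 15" figures come from: they are the numbers of sequence pairs among six of seven and five of six sequences, respectively.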

Randomization simulations:
Each native DNA sequence alignment of the several taxa was randomly shuffled 100 times to generate a distribution of randomized sequence alignments to which the native sequence alignment can be compared. Since we used sequence alignments instead of individual sequences, and coding regions instead of noncoding regions, two criteria had to be met to draw meaningful comparisons with the native alignments. First, levels of sequence divergence among the individual sequences in the sequence alignment were to remain unchanged. Second, the encoded amino acid sequence could not be altered.

This was accomplished by randomizing the DNA sequence alignment column by column. Each column was composed of one codon (3 nucleotides) at the same position of every sequence in the alignment. A randomized codon table corresponding to the original codon table was generated, and the codons in each column of the sequence alignment were substituted with the respective codons in the new codon table. Since the same codon conversion table was applied to all codons within one column, sequences that had the same codon in the native alignment would still have the same codon after the randomization, even though now the codon they shared could be different from the one in the native alignment. In this way, the phylogenetic relationships of the native sequence alignment were maintained in the randomized alignments as much as possible. The codon conversion table was generated by shuffling the order of codons within each codon family, so that the encoded amino acid was maintained after the randomization. For "native bias" randomizations, the codon conversion table was generated once at the beginning and used throughout the alignment. For "reduced bias" randomizations, a codon conversion table was generated independently for each column of the alignment before the column was randomized. So a preferred codon may be changed to one codon in one column and to a different codon in another column, thus reducing the codon bias for the entire sequence length.
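
The column-by-column procedure can be sketched as follows. The authors' programs were written in C; this Python version is only an illustration under stated assumptions: it uses a toy set of two codon families, an unweighted shuffle within each family, and the hypothetical names make_conversion_table and randomize_alignment. A single table reused for all columns corresponds to the "native bias" case, and a fresh table per column to the "reduced bias" case.

```python
import random

# Toy codon families (assumption: only two families, for brevity)
FAMILIES = {
    "Phe": ["TTT", "TTC"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}

def make_conversion_table(families, rng):
    """Map each codon to a codon of the same family via a random permutation."""
    table = {}
    for codons in families.values():
        shuffled = codons[:]
        rng.shuffle(shuffled)
        table.update(zip(codons, shuffled))
    return table

def randomize_alignment(columns, families, rng, per_column=False):
    """Randomize an alignment given as a list of columns (one codon per taxon).

    per_column=False: one table for the whole alignment ("native bias").
    per_column=True:  a fresh table for every column ("reduced bias").
    """
    table = make_conversion_table(families, rng)
    randomized = []
    for column in columns:
        if per_column:
            table = make_conversion_table(families, rng)
        # Codons outside the toy families (e.g. Met, Trp) pass through unchanged
        randomized.append([table.get(codon, codon) for codon in column])
    return randomized

columns = [["TTT", "TTT", "TTC"], ["GGA", "GGC", "GGA"], ["ATG", "ATG", "ATG"]]
native_bias = randomize_alignment(columns, FAMILIES, random.Random(1))
reduced_bias = randomize_alignment(columns, FAMILIES, random.Random(1), per_column=True)
```

Because each table is a permutation within a family, taxa that shared a codon in a column still share one afterwards, and taxa that differed still differ, which is how the pattern of identity among sequences is preserved.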

No shuffling was conducted on Met and Trp codons, as there is no codon degeneracy for these two amino acids. Stop codons also remained unaltered in the randomized sequence alignments. For twofold degenerate codons, such as Phe, a codon within the Phe codon family (UUU or UUC) was randomly drawn in the "equal N3" randomizations (see below) to correspond to the UUU codon in the new randomized table. In this way, the original UUU in the native sequence alignment had a 50% probability of remaining unaltered and a 50% probability of changing to UUC. In the "native N3" randomizations, codons were drawn from the weighted probabilities on the basis of the observed frequencies of UUU and UUC in the native alignment. Threefold and fourfold degenerate codon families were shuffled in the same manner as the twofold degenerate codon families. The three sixfold degenerate codon families were each split into two separate codon families of twofold degeneracy and fourfold degeneracy. For example, the Arg codon family was broken into a twofold degenerate codon subfamily of AGR and a fourfold degenerate subfamily of CGN. The two codon subfamilies were not interchangeable: an AGA in the original sequence alignment was never changed to any of the CGN codons, and vice versa, despite the fact that they all translated to Arg. The main reason for doing so was to maintain the level of conservation among sequences in the sequence alignment, since random shuffling among the sixfold degenerate codon family as a whole could incur two simultaneous nucleotide substitutions: one at the first codon position and one at the third position. This would result in an increase in the divergence among simulated sequences relative to the original sequences. Since the level of sequence divergence affects the LRT score of predicted helices, the LRT distributions of helices predicted in randomized alignments would not be strictly comparable to those obtained from analysis of the original alignments.
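
One way to obtain the codon families described above is to group synonymous codons by their first two nucleotides, which automatically splits each sixfold degenerate family into a twofold and a fourfold subfamily. The sketch below assumes the standard genetic code and uses the hypothetical name codon_subfamilies; it is not taken from the authors' programs.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code listed in TCAG order; '*' marks stop codons
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def codon_subfamilies(code=CODE):
    """Group synonymous codons that share their first two nucleotides."""
    groups = defaultdict(list)
    for codon, aa in code.items():
        if aa != "*":  # stop codons are left out because they are never shuffled
            groups[(aa, codon[:2])].append(codon)
    return dict(groups)

subfamilies = codon_subfamilies()
# Families that can actually be shuffled contain more than one codon
shufflable = {key: codons for key, codons in subfamilies.items() if len(codons) > 1}
```

Under this grouping Arg becomes the AGR pair and the CGN quartet, Met and Trp form single-codon groups, and 21 groups contain more than one codon, which appears consistent with the 21 codon families mentioned for the equal N3 randomizations.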

Due to the method of splitting sixfold degenerate codon families into two noninterchangeable subfamilies, four classes of synonymous substitutions were not permitted in the randomizations (Leu: TTA ↔ CTA, TTG ↔ CTG; Arg: AGA ↔ CGA, AGG ↔ CGG). These four classes of substitutions were relatively rare in the original Adh and Adhr alignments: only 16 of 256 Adh codons and 16 of 287 Adhr codons contained these types of substitutions. In each of the 16 cases for both genes, the substitution was usually restricted to one or two of the sequences, so that the incidence of such substitutions was actually much less than simply 16/256 or 16/287. This relative rarity of "noninterchangeable substitutions" in the original alignments justifies our approach of breaking up sixfold degenerate codon families into two separate codon families.

For each gene, the following four types of randomizations were carried out on the native sequence alignment to generate 100 randomized sequence alignments each.

Native bias, equal N3: In this randomization method, the native codon bias was maintained in the randomized alignments, while the N3 was changed to approximately 25% G, 25% C, 25% A, and 25% U. A randomized codon table corresponding to the original codon table was generated once at the beginning of the randomization and was used throughout the sequence alignment until the randomization was complete. In this manner, the ranking of favored codons in each codon family may have been altered, but the relative proportions of each codon usage in each codon family remained the same; i.e., the codon bias of each randomized sequence alignment was identical to that of the native sequence alignment. In each codon family any codon, regardless of its third-position nucleotide, was equally likely to be chosen as the most favored codon. As a result, the overall N3, averaging the effect of random shuffling of 21 codon families, approached 25% per nucleotide in these randomizations (for details see http://troi.cc.rochester.edu/~ying/appendix1.html).

Reduced bias, equal N3: In contrast to the native bias, equal N3 case, the randomized codon table corresponding to the original codon table was generated for each column of codons in the native sequence alignment. Thus, by averaging over 256 (Adh) or 287 (Adhr) columns of codons, the bias of codon usage in one randomization was reduced. The codon usage bias could not be totally eliminated due to the requirement of maintaining the phylogenetic relationships among the DNA sequences in the alignment. The N3 content approaches 25% each due to the same reason given in the native bias, equal N3 randomization method (see http://troi.cc.rochester.edu/~ying/appendix1.html).

Native bias, native N3: The base composition of all sequences in the native sequence alignment was calculated for each codon family, and the frequencies of nucleotide content were used as weights in generating the randomized codon table corresponding to the original codon table. A codon in the original sequence alignment was more likely to be changed to a G- or C-ending codon than to an A- or U-ending codon of the same codon family if the GC3 content of that particular codon family was >50%. For example, in the original Adh alignment, the third-codon position of the Phe codon family averaged 75.4% C and 24.6% U. In this case the more favored codon, UUC, was slightly more than three times as likely to remain unchanged as to change to UUU after the randomization. The N3 contents of each randomization were not identical to those in the native sequence alignment, but remained quite close (see http://troi.cc.rochester.edu/~ying/appendix1.html). The randomized codon table was generated once at the beginning and used throughout the sequence alignment of one randomization to maintain the native codon bias.
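
The weighting in the Phe example can be made concrete with a short sketch. How the authors' C program builds its weighted table is not described in detail here, so the per-draw function draw_weighted_codon below is a hypothetical simplification; only the two Phe frequencies come from the text.

```python
import random

def draw_weighted_codon(freqs, rng):
    """Draw one codon from a family with probability equal to its native frequency."""
    codons = list(freqs)
    return rng.choices(codons, weights=[freqs[c] for c in codons], k=1)[0]

# Native third-position frequencies for the Adh Phe family, as quoted in the text
phe_native = {"UUC": 0.754, "UUU": 0.246}

# Odds that a draw yields UUC rather than UUU: 0.754 / 0.246, just over 3
odds_uuc = phe_native["UUC"] / phe_native["UUU"]

rng = random.Random(0)
draws = [draw_weighted_codon(phe_native, rng) for _ in range(10000)]
share_uuc = draws.count("UUC") / len(draws)
```

Over many draws the share of UUC stays close to the native 75.4%, which is the sense in which these randomizations keep N3 content near, but not identical to, the native values.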

Dinucleotide content has been shown to have a significant influence on the potential for secondary structure formation in mRNAs (WORKMAN and KROGH 1999). For each gene, we calculated the dinucleotide frequencies for each of the 100 reduced bias, native N3 randomizations. Dinucleotide frequencies of the randomized alignments were compared to those of the native alignment to determine if the dinucleotide content of the simulations was significantly different from that of the native alignment. The dinucleotide frequencies of the 100 randomized alignments were not significantly different from those of the native alignment (Adh: χ² = 3.116, NS; Adhr: χ² = 2.614, NS).
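
A dinucleotide comparison of this kind could be set up roughly as follows. The text does not specify how the test was constructed, so this is one conventional approach rather than the authors' procedure: overlapping dinucleotides are counted, and a Pearson chi-square statistic is computed against the native counts. The function names are hypothetical.

```python
from collections import Counter

def dinucleotide_counts(seq):
    """Count overlapping dinucleotides in a sequence, skipping any that span a gap."""
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    return Counter(p for p in pairs if "-" not in p)

def chi_square(observed, expected):
    """Pearson chi-square statistic of observed counts against expected counts.

    Expected counts are rescaled to the observed total; categories with an
    expected count of zero are skipped.
    """
    total_obs = sum(observed.values())
    total_exp = sum(expected.values())
    stat = 0.0
    for key in set(observed) | set(expected):
        e = expected.get(key, 0) * total_obs / total_exp
        if e > 0:
            stat += (observed.get(key, 0) - e) ** 2 / e
    return stat

native = dinucleotide_counts("ATGGCCAAGTTC")
shuffled = dinucleotide_counts("ATGGCTAAATTT")
statistic = chi_square(shuffled, native)
```

The two short sequences are made-up inputs. In practice the counts would be pooled over all sequences of an alignment before the statistic is compared with the appropriate chi-square distribution.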

Reduced bias, native N3: This randomization method was similar to the native bias, native N3 method in maintaining the native N3 content. However, the randomized codon table corresponding to the original codon table was generated for each column of codons in the sequence alignment to reduce codon bias, as explained in the reduced bias, equal N3 method. The randomization programs' source codes (written in C) are available at http://troi.cc.rochester.edu/~ying/randomization.html.

Analysis of LRT score distributions:
PIRANAH generates a list of helices in the sequence alignment that satisfy the criteria specified by the user (see Secondary Structure Prediction). For each helix, the LRT score and position of paired nucleotides are provided. The effects of altering codon bias and/or third-codon base composition were assessed in several ways. First, for each gene in each of the four sets of randomizations, we compiled the average proportion of helices with LRT scores within bins of five LRT units. The proportion of helices in each bin could then be compared across randomizations. The average scores among the 100 replicate randomizations of the best helix (highest LRT) were calculated for each gene in each of the four randomizations. We also calculated the average 5% cutoff of the best helices for each of the randomizations. The total number of helices in each randomization was multiplied by 5% to obtain the rank of the helix (and LRT score) representing the 5% cutoff. We averaged these LRT scores to obtain the average 5% cutoff for each set of randomizations.
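
The summary statistics described above can be sketched in a few lines. The names cutoff_lrt and bin_proportions are hypothetical, and the rounding rule used to turn "5% of the helices" into a rank (rounding down, with a minimum rank of 1) is an assumption, since the text does not state it.

```python
import math

def cutoff_lrt(scores, fraction=0.05):
    """LRT score of the helix whose rank equals fraction * (number of helices)."""
    ranked = sorted(scores, reverse=True)
    rank = max(1, math.floor(len(ranked) * fraction))  # rounding rule is an assumption
    return ranked[rank - 1]

def bin_proportions(scores, width=5):
    """Proportion of helices in each bin [k*width, (k+1)*width), keyed by lower bound."""
    counts = {}
    for s in scores:
        lower = int(s // width) * width
        counts[lower] = counts.get(lower, 0) + 1
    return {lower: n / len(scores) for lower, n in sorted(counts.items())}

def average_cutoff(replicates, fraction=0.05):
    """Mean of the cutoff LRT over a set of randomized data sets."""
    return sum(cutoff_lrt(r, fraction) for r in replicates) / len(replicates)
```

With 234 helices, for example, the assumed rule gives rank 11, so the 5% cutoff would be the 11th highest LRT score in that data set.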


*  RESULTS

Distribution of LRT scores of all predicted structures:
The distribution of LRT scores of individual structures predicted from analysis of the native Adh sequences and the four sets of 100 randomization simulations is illustrated in Fig 2. Scores of predicted helices are grouped into bins of five LRT units on the abscissa. For the native Adh sequences, the proportion of 234 predicted helices with LRT scores in a given range are plotted. For the randomizations, the average proportions (among the 100 simulated data sets) of helices in a given range are plotted. Error bars represent ±1 standard deviation (SD) of the mean within each bin. For all randomizations and for the native sequences, the majority of helices fulfilling the criteria specified in PIRANAH (e.g., degree of conservation, minimum helix length) were in the intermediate range (5–15) of LRT scores. Both of the equal N3 randomizations contained a higher proportion of helices with high LRT scores (>=20) than did the native sequences or either of the native N3 randomizations. For high LRT scores (>=20), results from analysis of native sequences did not differ appreciably from results from native N3 randomizations, irrespective of level of codon bias.



 
Figure 2. Distribution of LRT scores of individual structures predicted from PIRANAH analysis of the native Adh sequences and average proportions of helices falling within the specified LRT ranges for the four sets of 100 Adh randomization simulations. Error bars represent ±1 standard deviation of the mean.

The distribution of LRT scores of individual structures predicted from analysis of the native Adhr sequences and the four Adhr randomization simulations is illustrated in Fig 3. Most helices meeting the criteria specified in PIRANAH were in the intermediate range (5–15) of LRT scores for all randomizations and for the native sequences. In contrast to results from analysis of Adh, the native Adhr sequences contained a higher proportion of helices with high LRT scores (>=20) than all four randomization simulations. Also in contrast to Adh, the equal N3 randomizations contained a lower proportion of helices with high LRT scores than did native N3 randomizations, irrespective of level of codon bias. Differences between the equal N3 and native N3 randomization simulations were slighter than those observed for Adh, due to the smaller difference between native N3 content and equal N3 content in Adhr (Table 1).



 
Figure 3. Distribution of LRT scores of individual structures predicted from PIRANAH analysis of the native Adhr sequences and average proportions of helices falling within the specified LRT ranges for the four sets of 100 Adhr randomization simulations. Error bars represent ±1 standard deviation of the mean.

These results suggest that a reduction of codon bias per se (without changing N3 content) has a relatively weak effect on the pairing potential of the best stems (LRT >=20) in both Adh and Adhr. In contrast, N3 content has a significant effect on pairing. Perturbing the N3 distribution to approximately equal amounts of A, G, C, and T increases the potential for secondary structure formation in Adh, but decreases it in Adhr. This finding, together with the result that the equal N3 simulations produce more helices with LRT >=20 than the native sequences for Adh, but not for Adhr, suggests that the potential for secondary structure formation is much stronger in Adhr than in Adh.

These observations are based on the distribution of helices with LRT >=20. Except for the fact that this is a relatively high value for LRT cutoff scores for secondary structures (PARSCH et al. 2000), there is no guarantee that helices with this or a higher score exist in vivo. To further examine this problem and to check to what extent our conclusions depend on cutoff criteria, we next consider alternate criteria for assessing the potential for secondary structure formation, including the average maximum LRT from each randomization and the average 5% cutoff LRT scores.

Maximum and 5% cutoff LRT scores:
The average (±SD) of the maximum LRT and 5% cutoff LRT from each of the four Adh randomizations are presented in Table 2. A two-way ANOVA was conducted to test for the effects of bias and N3 content on maximum LRT score (Table 3). The effect of altering codon bias was not significant for maximum LRT score (P = 0.567), but reduced bias randomizations had significantly greater 5% cutoff LRTs (P = 0.001). The effect of altering N3 content was highly significant for both maximum and 5% cutoff LRTs (P < 0.01): the equal N3 simulations had higher average maximum and 5% cutoff LRT scores than native N3 simulations. The interaction (codon bias * N3 content) was significant for maximum LRT scores (P = 0.03), due to the fact that bias could not be effectively removed while maintaining N3 content at native levels. The average level of bias in reduced bias, native N3 simulations was significantly greater (average ENC = 45.5 ± 1.0) than the level of bias in reduced bias, equal N3 simulations (average ENC = 54.6 ± 1.2; t = -58.9, P < 0.001). The maximum LRT score of helices predicted from analysis of the native Adh sequences was 26.52. In comparison, 43 of the 100 native bias, native N3 randomized data sets had helices with higher LRT scores.
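
As a rough consistency check on the ENC comparison, the t statistic can be approximated from the reported summary values. The sketch assumes 100 simulations per group and an unpooled two-sample t statistic; the exact test the authors used is not stated, and welch_t is a hypothetical helper name.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with unpooled variances, from summary statistics."""
    return (mean1 - mean2) / math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

# Reduced bias, native N3 (ENC 45.5 +/- 1.0) versus reduced bias, equal N3 (54.6 +/- 1.2)
t_approx = welch_t(45.5, 1.0, 100, 54.6, 1.2, 100)
```

This gives a value near -58, close to the reported t = -58.9; the small difference is presumably due to rounding of the published means and standard deviations.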


 

 
Table 2. Average LRT scores (±SD) of best and upper 5% cutoff of randomized alignments


 

 
Table 3. Results from two-way ANOVAs on maximum LRT scores and 5% cutoff LRT scores of randomized alignments

Results from comparisons of average (±SD) LRT scores of top helices from each of the Adhr simulations are also presented in Table 2. As with Adh, altering level of codon bias had no significant effect on average maximum LRT scores (Table 3), nor did it significantly affect the average 5% cutoff LRT. Altering N3 content resulted in a highly significant effect; the average maximum LRT score from the native N3 simulations was higher than the average from the equal N3 randomizations (P < 0.01). Average 5% cutoff LRTs were also significantly greater for the native N3 randomizations (P < 0.01). The interaction variance was not significant for either maximum LRT or 5% LRT. The top LRT score of the helices predicted from analysis of the native Adhr sequences was 28.72. Of the 100 native bias, native N3 simulations, 49 had structures with higher LRT scores.

In general, the results from analysis of the LRT score distributions presented above and the results from maximum and 5% cutoff LRTs presented in this section are in agreement. That is, altering N3 content exerted the greatest effect on both the LRT score distributions and on the maximum and 5% cutoff LRT scores. One exception is that altering codon bias did result in significant differences among the Adh 5% cutoff LRTs, whereas there was no effect on maximum LRTs or on the LRT distributions. This apparent inconsistency is due in part to the reduced variance in upper 5% LRT scores (SD ≅ 1) compared with maximum LRT scores (SD ≅ 2), such that the critical difference for statistical significance was lower for the 5% cutoff comparisons. To make sure that this pattern held for a range of different cutoff LRTs around the 5% critical value, we also calculated 3 and 10% cutoff LRTs for the Adh simulations. For both the 3 and 10% critical LRTs, the effect of bias remained significant (P < 0.01), as did N3 content (P < 0.0001), whereas the interaction (bias * N3) remained insignificant (P > 0.2). In other words, we found the same pattern for the 3 and 10% critical values as was found for the 5% critical values. Together, these results suggest that we would obtain the same result no matter what cutoff value we use, unless the cutoff values chosen are too low (in which case the results would be very similar to the maximum LRT results) or too high (for which no differences would be observed).

Reading frame pairings:
There are three possible pairing orientations when considering the codon positions of nucleotides on opposite strands of a helix. Third-codon position on one strand can pair with either first- (3-1), second- (3-2), or third- (3-3) codon position nucleotides on the opposite strand. The number and average LRT scores of best helices in the 3-1, 3-2, or 3-3 complementary reading frame orientations are listed in Table 2. For both Adh and Adhr, most of the top helices are in the 3-3 orientation, with Adhr exhibiting an even greater preponderance of 3-3 helices. The 3-2 pairing frame is the least common orientation, in particular for Adhr. The average LRT scores of the three possible orientations do not differ appreciably, with the exception of the Adh reduced bias, equal N3 randomizations, where the average maximum LRT of 3-3 frame helices was significantly greater than that of 3-2 frame helices (Fisher's post hoc pairwise test: P = 0.01). For the native sequences, the 3-3 pairing was the most common in both the Adh and Adhr genes. The maximum LRT helix in both alignments was in the 3-3 frame. Overall, 39% of the 234 helices predicted from analysis of the native Adh sequences were in the 3-1 frame, 9% were in the 3-2 frame, and 51% were in the 3-3 frame. For the 437 helices predicted from analysis of the native Adhr sequences, the corresponding proportions were 40, 14, and 46%. The relative proportions of 3-1, 3-2, and 3-3 frame helices were more skewed for higher LRT helices. For Adh, of the 57 helices with LRTs >=15, the proportions of 3-1, 3-2, and 3-3 frame helices were 23, 5, and 72%, respectively. For Adhr, of the 123 predicted helices with LRTs >=15, the proportions of 3-1, 3-2, and 3-3 frame helices were 29, 10, and 61%, respectively.


*  DISCUSSION

Overview:
Our results from analysis of the distribution of LRT scores of all helices indicate that codon bias per se has only a weak effect on the potential for formation of individual secondary structures. In fact, what appeared to exert the strongest effect on overall potential for structure formation was N3 content (Fig 2 and Fig 3). For Adh, we observed that evening out the N3 content leads to an increase in the proportion of helices with high LRT scores. We attribute this pattern to an increase in pairing potential when N3 content was approximately equally distributed among the four bases to ~25% each.

The skew in base composition at first and second codon positions is less severe than at third-codon positions in Adh, which are skewed toward high C3 (42.1%) and low A3 (7.3%) content (Table 1). Removing that skew at third positions increases the potential for pairings between third-codon position nucleotides and those at first and second positions. The proportions of best helices in the 3-1, 3-2, and 3-3 frames were 21% (i.e., 21 of the 100 maximum LRT helices predicted from analyses of the 100 randomized alignments were in the 3-1 frame), 17%, and 62% for the native bias, native N3 randomizations and 27, 20, and 53% for the native bias, equal N3 randomizations (Table 2). Similarly, the proportions of best helices in the 3-1, 3-2, and 3-3 frames were 26, 10, and 64% for the reduced bias, native N3 randomizations and 38, 16, and 46% for the reduced bias, equal N3 randomizations. In both cases, native bias or reduced bias, the proportions of 3-1 and 3-2 helices were higher in the equal N3 simulations than in the native N3 simulations. Altering N3 content to a more equal distribution among the four bases also enhances the potential for pairing in the 3-3 frame. Because the native Adh sequences are skewed toward C3 rather than toward equal G3 and C3, G-C pairings at third positions are restricted. The same holds for A-T pairings in the native Adh sequences, where the rarity of A3 restricts the number of A-T pairings between third-position nucleotides. This is evidenced by the higher average LRT score of 3-3 frame helices in the equal N3 randomizations than in the native N3 randomizations (Table 2).

Adhr, a gene with less codon bias and lower GC3% (i.e., a more equally distributed N3%) than Adh, actually exhibited a greater potential for formation of high LRT helices (>=20) than did any of the four sets of randomization simulations. The exact opposite pattern was observed when comparing the LRT distribution of helices predicted from wild-type Adh sequences with the distributions from randomization simulations. This comparison of the two patterns provides additional evidence that in genes with high levels of codon bias the formation of strong individual secondary structures is inhibited through the alteration of N3 content.

mRNA secondary structure and gene expression:
In Drosophila, as in other organisms with high levels of codon bias, all codon families tend to be biased for the same individual nucleotides (MORIYAMA and HARTL 1993; POWELL and MORIYAMA 1997). Of the six twofold degenerate codon families encoded by a third-position pyrimidine, all but one (Asp) exhibit a strong preference for C in Drosophila. All three of the twofold degenerate codon families encoded by a third-position purine exhibit a strong preference for G. The threefold degenerate isoleucine codon family strongly prefers C. Of the five fourfold degenerate codon families, four use C as the major preferred codon, and the other (Val) shows a slight preference for G over C. Two of the three sixfold degenerate codon families prefer C, and the other (Leu) prefers G. These patterns hold for Adh, Adhr, and for the Drosophila genome as a whole (NAKAMURA et al. 1998).
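The family counts quoted above (six, three, one, five, and three) can be checked against the standard genetic code. The following is a minimal sketch that tallies codon degeneracy only; it says nothing about which codon is preferred. The names `CODE`, `families`, `degeneracy_class`, and `classes` are illustrative and are not part of the original analysis.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}

# Group sense codons by the amino acid they encode.
families = defaultdict(list)
for codon, aa in CODE.items():
    if aa != "*":
        families[aa].append(codon)


def degeneracy_class(codons):
    """Name the degeneracy class of one codon family."""
    n = len(codons)
    if n == 2:
        thirds = {codon[2] for codon in codons}
        return "twofold-pyrimidine" if thirds == {"T", "C"} else "twofold-purine"
    return {1: "nondegenerate", 3: "threefold", 4: "fourfold", 6: "sixfold"}[n]


# Map each degeneracy class to the sorted one-letter codes of its families.
classes = defaultdict(list)
for aa, codons in families.items():
    classes[degeneracy_class(codons)].append(aa)
classes = {name: sorted(members) for name, members in classes.items()}
```

Under this grouping the twofold pyrimidine-ending families are Cys, Asp, Phe, His, Asn, and Tyr, and the twofold purine-ending families are Glu, Lys, and Gln, which matches the counts used in the text.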

In principle, it is possible to have equally strong codon bias, but without any among-codon family consistency for preferred nucleotides. That is essentially what our native bias, equal N3 randomizations are designed to simulate. If the only factor driving the evolution of codon bias were selection for preferred codons to maximize translational efficiency and/or accuracy, there would be no reason to expect that each codon family shares the same base preference. In other words, each codon family would still tend to prefer a certain base at the third position, but that preferred base would be unique to each codon family, such that the pattern of codon bias in natural sequences would be similar to our native bias, equal N3 randomizations. The observation that such genes are nonexistent is consistent with the hypothesis that the formation of long and stable helices interferes with the process of mRNA translation. This hypothesis is supported by the results from analysis of the distribution of LRT scores of individual helices (Fig 2 and Fig 3). The translational efficiency (BULMER 1991) and translational accuracy (AKASHI 1994) models for the evolution of codon bias do not explain the concordance among different codon families in preference for the same third-position nucleotide. Furthermore, a hypothesis of mutational bias toward C in the nuclear genome of Drosophila is untenable, since the average base composition of nontranscribed regions is actually biased against C (MORIYAMA and HARTL 1993). As the results from analysis of our native bias, equal N3 randomizations demonstrate, when each codon family prefers a unique third-position nucleotide, high levels of codon bias do not inhibit the formation of strong individual secondary structures.

We must therefore consider the possibility that mRNA secondary structure can also affect the rate of mRNA translation. Preference for C3 synonymous codons in Drosophila has two benefits: (1) enhancing translation efficiency and accuracy by matching an abundant tRNA pool and (2) minimizing the formation of highly stable hairpins that would interfere with ribosome movement and consequently reduce the translation rate. The combined effect may in fact be synergistic (nonadditive) and perhaps experimentally measurable. However, our data do not allow us to address the following question: Why C3 instead of G3, A3, or T3? Clearly a consistent preference among all the codon families for any of the four nucleotides at the third-codon position would result in a decreased potential for the formation of individual secondary structures. Perhaps the C3 preference is the result of a "frozen accident" due to genetic drift. Once C3 became the established preferred third-position nucleotide for most codon families, the evolution of alternate preferred N3s would be prevented by the presence of fitness valleys (excepting, of course, those twofold degenerate codon families encoded by a third-position purine).

Reading frame pairings:
Our finding that the 3-2 reading frame pairings were less common than 3-1 reading frame pairings stands in contrast to FITCH (1974), who predicted that pairing should occur between strong amino-acid-determining positions with the most degenerate positions for maximal thermodynamic stability. However, 3-2 pairings would be relatively intolerant of new variants due to the inflexibility of the second-codon position. Such pairings would therefore be incompatible with evolution at the protein level. In contrast, 3-1 pairings would best preserve mRNA secondary structure through compensatory substitutions while allowing for evolution at the amino acid level ("evolutionarily compatible coding"). In a statistical analysis of retroviral mRNA sequences, KONECNY et al. (2000) provided some evidence that evolutionarily compatible coding plays a significant role in mRNAs with strong secondary structures. Our results support this view: the ratio of 3-1 to 3-2 frame helices was greater in Adhr (e.g., 18:3 for the native bias, native N3 randomizations), a gene with stronger individual secondary structures, than in Adh (e.g., 21:17 for the native bias, native N3 simulations; Table 2). Of course 3-3 pairings allow for the most unrestricted compensatory evolution, and our finding that 3-3 pairings are the most common is not surprising. FITCH (1974) and KONECNY et al. (2000) were primarily concerned with the tradeoff between secondary structure and evolution at the amino acid level, and therefore they did not discuss 3-3 pairings, because compensatory changes would not cause amino acid substitutions and such changes would not result in protein evolution.

mRNA secondary structure and codon bias—dual regulators of gene expression?
While there has been considerable work on the relationship between mRNA secondary structure and gene expression, and that between codon bias and gene expression, few studies have explored the relationship between codon usage and mRNA secondary structure. MITA et al. (1988) found unique codon usage patterns in the two distinctive regions in the Bombyx mori silk fibroin gene. Preferred codons were used in the repetitive components but not in the joining component, and the authors argue that this pattern is consistent with the avoidance of highly stable secondary structures in rapidly translated regions. Consistent with this hypothesis is the observation that there are pauses in the fibroin translation process, evidenced by the transient accumulation of discrete size classes of fibroin chains during translation. ZAMA (1990) provided evidence that pauses in the translation of the type I collagen from chicken corresponded to regions of local, highly stable mRNA secondary structures. Furthermore, the regions of high mRNA stability and synthesis pauses corresponded to the occurrence of rare codons. Translational pauses are especially important for eukaryotes, where sequential cotranslational folding of protein domains minimizes errors in protein folding that are more likely to occur in the longer chains of eukaryotes (NETZER and HARTL 1997; ELLIS and HARTL 1999).

These studies, combined with the evidence in this study, suggest that mRNA secondary structure and codon bias may interact to regulate the level of gene expression (Fig 1). In the highly expressed Adh gene, the balance is shifted toward codon bias, a positive regulator of gene expression. In the weakly expressed Adhr gene, the balance is shifted toward mRNA secondary structure, a negative regulator of gene expression. According to our preliminary model, evolutionary shifts in the balance between codon bias and mRNA secondary structure are mediated through N3 content. Natural selection for a particular nucleotide at the third-codon position, consistent across most codon families (e.g., C in Drosophila), results in codon bias without the disruptive effects of stable secondary structures. For weakly expressed genes, there is selection for roughly equal G and C (or A and T) at N3. Neutral drift alone would result in a relatively equal frequency of all four nucleotides at the third-codon position (i.e., 25% A3, 25% C3, 25% G3, and 25% T3). However, stronger secondary structures would form if there were natural selection for high GC3% (50% G3, 50% C3 in the extreme case) or high AT3% (50% A3, 50% T3 in the extreme case) because this would maximize the pairing potential between opposite strands of a helix. Therefore, we conclude that N3 content in weakly expressed genes may also be governed by natural selection. Selection for high, equally distributed GC3% or AT3% could potentially promote the formation of stable secondary structures, resulting in an inhibitory effect on mRNA translation rates.
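The pairing-potential argument for the extreme compositions can be illustrated with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that two third-codon position bases are drawn independently from the N3 composition and that only Watson-Crick pairs (A-T and G-C) count. It ignores G-T wobble pairs, stacking, and sequence context, and it is not the LRT-based measure used in this study. The function name `wc_pair_probability` is hypothetical.

```python
def wc_pair_probability(n3):
    """Chance that two independently drawn N3 bases can form an A-T or G-C pair.

    `n3` maps bases to counts or percentages; missing bases are treated as 0.
    """
    total = sum(n3.values())
    p = {base: n3.get(base, 0.0) / total for base in "ACGT"}
    # Each complementary pair can occur in two orders (e.g., G then C, C then G).
    return 2 * (p["A"] * p["T"] + p["G"] * p["C"])


equal_n3 = wc_pair_probability({"A": 25, "C": 25, "G": 25, "T": 25})
high_gc3 = wc_pair_probability({"G": 50, "C": 50})
single_base = wc_pair_probability({"C": 100})
```

Under these assumptions the probability is 0.25 for equal N3, rises to its maximum of 0.5 for 50% G3 and 50% C3 (or 50% A3 and 50% T3), and falls to 0 when every third position carries the same nucleotide. Intermediate compositions need not order the same way under this toy model, so it should be read only as an illustration of the limiting cases discussed above.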

It should be borne in mind that the results of this study are based on the simplest paired-site model of nucleotide substitution, involving the estimation of only one free parameter after scaling the branch lengths (MUSE 1995). Although general reversible models of paired-site nucleotide substitution have been shown to provide a better fit to the data (SAVILL et al. 2001), they require estimation of up to 26 free parameters. Because calculation of LRT scores for all helices predicted from a single set of aligned sequences using complex models is extremely time consuming, it was not feasible to use them in a large-scale analysis of 800 simulated alignments. Thus, an obvious direction for future work would be to implement more complex models of paired-site substitution in the calculation of LRT scores. Additional gene pairs should be examined to validate or refute the generality of the conclusions presented here, which are based on analysis of only two genes.

Previous studies have compared the global stability of mRNAs vs. various forms of randomized sequences (SEFFENS and DIGBY 1999; WORKMAN and KROGH 1999; RIVAS and EDDY 2000). Although the conclusions are somewhat contradictory, it appears that mRNAs are no more stable than randomized sequences with the same dinucleotide content and base composition. Our results are not in conflict with this general conclusion since these studies did not partition the genes into categories (i.e., high codon bias vs. low codon bias, highly expressed vs. weakly expressed). It is possible that highly expressed genes may be more stable than random sequences, or vice versa, but since both highly expressed and weakly expressed genes are included in the same analysis (e.g., 51 mRNAs in WORKMAN and KROGH 1999), the overall effect is cancelled out. In this case no difference in the global stability of mRNAs and randomized sequences would be observed.

To complement these studies, it would be interesting to examine the effect of altering codon bias and N3 content on the formation of global mRNA secondary structures in genes with different levels of expression. The pattern would not necessarily be the same as that revealed in the present analysis of individual helices. One might predict that, on average, highly expressed genes would exhibit greater global stability than weakly expressed genes. A greater global stability of highly expressed mRNAs would be advantageous because such mRNAs would be more resistant to degradation, resulting in a longer residence time in the cell compared to mRNAs of weakly expressed genes. How could this apparent discrepancy be reconciled? Although highly expressed mRNAs might have greater global stabilities, the individual helices in the global structure would have to be relatively short and weak. Since we found no convincing evidence that any of the individual helices in Adh are stronger than randomized sequences, it may be that there is no particular conserved global structure for a set of related mRNAs. Instead, many alternate global structures of approximately the same stability could form, and the constituent helices of global structures would be relatively weak. To test this theory, the predictions of currently available programs for inferring global mRNA secondary structures on the basis of the comparative method (e.g., KNUDSEN and HEIN 1999; PARSCH et al. 2000) may be used for mRNAs that can be fully aligned (including the 5' and 3' untranslated regions).


*  ACKNOWLEDGMENTS

We thank J. Parsch for providing the source code for Pirandom, a randomization program that shuffles the columns of a set of aligned sequences. We thank J. Braverman for providing the source code to the PIRANAH and GROUPER computer programs. This research was supported by National Institutes of Health grant GM-58405 and by funds from the University of Munich to W.S.

Manuscript received April 18, 2001; Accepted for publication July 10, 2001.


*  LITERATURE CITED

AKASHI, H., 1994 Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 136:927-935.

ANTEZANA, M. A. and M. KREITMAN, 1999 The nonrandom location of synonymous codons suggests that reading frame-independent forces have patterned codon preferences. J. Mol. Evol. 49:36-43.

BENNETZEN, J. L. and B. D. HALL, 1982 Codon selection in yeast. J. Biol. Chem. 257:3026-3031.

BULMER, M., 1991 The selection-mutation-drift theory of synonymous codon usage. Genetics 129:897-907.

DOCK, A. C., B. LORBER, D. MORAS, G. PIXA, and J. C. THIERRY et al., 1984 Crystallization of transfer ribonucleic acids. Biochimie 66:179-201.

ELLIS, R. J. and F. U. HARTL, 1999 Principles of protein folding in the cellular environment. Curr. Opin. Struct. Biol. 9:102-110.

FITCH, W. M., 1974 The large extent of putative secondary nucleic acid structure in random nucleotide sequences or amino acid derived messenger-RNA. J. Mol. Evol. 3:279-291.

FOX, G. E. and C. R. WOESE, 1975 5S rRNA secondary structure. Nature 256:505-507.

GRANTHAM, R., C. GAUTIER, M. GOUY, R. MERCIER, and A. PAVE, 1980 Codon catalog usage and the genome hypothesis. Nucleic Acids Res. 8:49-62.

IKEMURA, T., 1981 Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes. J. Mol. Biol. 146:1-21.

KIRBY, D. A., S. V. MUSE, and W. STEPHAN, 1995 Maintenance of pre-mRNA secondary structure by epistatic selection. Proc. Natl. Acad. Sci. USA 92:9047-9051.

KNUDSEN, B. and J. HEIN, 1999 RNA secondary structure prediction using stochastic context-free grammars and evolutionary history. Bioinformatics 15:446-454.

KONECNY, J., M. SCHÖNIGER, I. HOFACKER, M.-D. WEITZE, and G. L. HOFACKER, 2000 Concurrent neutral evolution of mRNA secondary structures and encoded proteins. J. Mol. Evol. 50:238-242.

KONINGS, D. A. and R. R. GUTELL, 1995 A comparison of thermodynamic foldings with comparatively derived structures of 16S and 16S-like rRNAs. RNA 1:559-574.

MITA, K., S. ICHIMURA, M. ZAMA, and T. C. JAMES, 1988 Specific codon usage pattern and its implications on the secondary structure of silk fibroin mRNA. J. Mol. Biol. 203:917-925.

MORIYAMA, E. N. and D. L. HARTL, 1993 Codon usage bias and base composition of nuclear genes in Drosophila. Genetics 134:847-858.

MUSE, S. V., 1995 Evolutionary analyses of DNA sequences subject to constraints of secondary structure. Genetics 139:1429-1439.

NAKAMURA, Y., T. GOJOBORI, and T. IKEMURA, 1998 Codon usage tabulated from the international DNA sequence databases. Nucleic Acids Res. 26:334.

NETZER, W. J. and F. U. HARTL, 1997 Recombination of protein domains facilitated by co-translational folding in eukaryotes. Nature 388:343-349.

NOLLER, H. F. and C. R. WOESE, 1981 Secondary structure of 16S ribosomal RNA. Science 212:403-411.

PACE, N. R., D. K. SMITH, G. J. OLSEN, and B. D. JAMES, 1989 Phylogenetic comparative analysis and the secondary structure of ribonuclease P RNA—a review. Gene 82:65-75.

PARSCH, J., J. M. BRAVERMAN, and W. STEPHAN, 2000 Comparative sequence analysis and patterns of covariation in RNA secondary structures. Genetics 154:909-921.

POST, L. E., G. D. STRYCHARZ, M. NOMURA, H. LEWIS, and P. P. DENNIS, 1979 Nucleotide sequence of the ribosomal protein gene cluster adjacent to the gene for RNA polymerase subunit β in E. coli. Proc. Natl. Acad. Sci. USA 76:1697-1701.

POWELL, J. R. and E. N. MORIYAMA, 1997 Evolution of codon bias in Drosophila. Proc. Natl. Acad. Sci. USA 94:7784-7790.

RIVAS, E. and S. R. EDDY, 2000 Secondary structure alone is generally not statistically significant for the detection of noncoding RNAs. Bioinformatics 16:583-605.

RUSSO, C. A. M., N. TAKEZAKI, and M. NEI, 1995 Molecular phylogeny and divergence times of Drosophilid species. Mol. Biol. Evol. 12:391-404.

SAVILL, N. J., D. C. HOYLE, and P. HIGGS, 2001 RNA sequence evolution with secondary structure constraints: comparison of substitution rate models using maximum-likelihood methods. Genetics 157:399-411.

SEFFENS, W. and D. DIGBY, 1999 mRNAs have greater negative folding free energies than shuffled or codon choice randomized sequences. Nucleic Acids Res. 27:1578-1584.

SHARP, P. M. and W.-H. LI, 1986 An evolutionary perspective on synonymous codon usage in unicellular organisms. J. Mol. Evol. 24:28-38.

SPRINZL, M., T. HARTMANN, F. MEISSNER, J. MOLL, and T. VORDERWÜLBECKE, 1987 Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res. 15(Suppl.):r53-r188.

TAYLOR, P. L., 1996 GeneJockey II. Biosoft, Cambridge, United Kingdom.

WADA, A. and A. SUYAMA, 1986 Local stability of DNA and RNA secondary structure and its relation to biological functions. Prog. Biophys. Mol. Biol. 47:113-157.

WALTER, A. E., D. H. TURNER, J. KIM, M. H. LYTTLE, and P. MULLER et al., 1994 Coaxial stacking of helixes enhances binding of oligoribonucleotides and improves predictions of RNA folding. Proc. Natl. Acad. Sci. USA 91:9218-9222.

WORKMAN, C. and A. KROGH, 1999 No evidence that mRNAs have lower folding free energies than random sequences with the same dinucleotide distribution. Nucleic Acids Res. 27:4816-4822.

ZAMA, M., 1990 Codon usage pattern in α2(I) chain domain of chicken type I collagen and its implications for the secondary structure of the mRNA and the synthesis pauses of the collagen. Biochem. Biophys. Res. Commun. 167:772-776.

ZUKER, M., J. A. JAEGER, and D. H. TURNER, 1991 A comparison of optimal and suboptimal RNA secondary structures predicted by free energy minimization with structures determined by phylogenetic comparison. Nucleic Acids Res. 19:2707-2714.



