Genetics, Vol. 167, 1047-1058, July 2004, Copyright © 2004
doi:10.1534/genetics.103.018135
Mapping Sites of Positive Selection and Amino Acid Diversification in the HIV Genome
An Alternative Approach to Vaccine Design?
Tulio de Oliveira*, Marco Salemi, Michelle Gordon*, Anne-Mieke Vandamme, Estrelita Janse van Rensburg, Susan Engelbrecht, Hoosen M. Coovadia and Sharon Cassol*,**,1
* HIV Molecular Virology and Bioinformatics Laboratory, Africa Centre for Health and Population Studies, Doris Duke Medical Research Institute, Nelson R. Mandela School of Medicine, University of KwaZulu-Natal, Durban 4013, South Africa
Rega Institute for Medical Research, KULeuven, Leuven B3000, Belgium
University of Stellenbosch and Tygerberg Hospital, Tygerberg 7505, South Africa
Centre for HIV/AIDS Networking, Doris Duke Medical Research Institute, Nelson R. Mandela School of Medicine, University of KwaZulu-Natal, Durban 4013, South Africa
** Nuffield Department of Clinical Medicine, University of Oxford, Oxford OX3 9DU, United Kingdom
1 Corresponding author: Africa Centre for Health and Population Studies, Doris Duke Medical Research Institute, Nelson R. Mandela School of Medicine, 719 Umbilo Rd., Congella 4013, Durban, South Africa.
E-mail: sharon.cassol{at}mrc.ac.za
A safe and effective HIV-1 vaccine is urgently needed to control the worldwide AIDS epidemic. Traditional methods of vaccine development have been frustratingly slow, and it is becoming increasingly apparent that radical new approaches may be required. Computational and mathematical approaches, combined with evolutionary reasoning, may provide new insights for the design of an efficacious AIDS vaccine. Here, we used codon-based substitution models and maximum-likelihood (ML) methods to identify positively selected sites that are likely to be involved in the immune control of HIV-1. Analysis of subtypes B and C revealed widespread adaptive evolution. Positively selected amino acids were detected in all nine HIV-1 proteins, including Env. Of particular interest was the high level of positive selection within the C-terminal regions of the immediate-early regulatory proteins, Tat and Rev. Many of the amino acid replacements were associated with the emergence of novel (or alternative) myristylation and casein kinase II (CKII) phosphorylation sites. The impact of these changes on the conformation and antigenicity of Tat and Rev remains to be established. In rhesus macaques, a single CTL-associated amino acid substitution in Tat has been linked to escape from acute SIV infection. Understanding the relationship between host-driven positive selection and antigenic variation may lead to the development of novel vaccine strategies that preempt the escape process.
DEVELOPMENT of an efficacious acquired immune deficiency syndrome (AIDS) vaccine is a public health priority (CHECK 2003; KLAUSNER et al. 2003). Studies in macaques challenged with simian immunodeficiency virus (SIV) have shown that it is possible to contain and prevent infection (DANIEL et al. 1992; HIRSCH et al. 1994; DUNN et al. 1997). However, these studies have been ambivalent, and the cellular and humoral responses needed to elicit protective immunity have not been well defined. In addition, little is known about the immunogenicity of different subgenomic regions of SIV or about the epitopes that are likely to elicit a potent and sustained immune response. To date, the only successful retroviral vaccine has been one targeted against the transmitted variant of feline leukemia virus (HOOVER et al. 1991).
Many groups have sequenced HIV-1 and defined cytotoxic T-lymphocyte (CTL) epitopes that are expressed during disease progression (BORROW et al. 1997; ROSENBERG et al. 1997; NOVITSKY et al. 2001). However, it has been difficult to obtain information on acute phase viruses and the immune responses they elicit, primarily because most patients are not diagnosed during primary infection. Studies of rhesus macaques, inoculated with a single cloned variant of SIV (ALLEN et al. 2000), indicated that wild-type virus predominated during the first 2 weeks of infection. This was followed by a sharp decline in plasma viremia coincident with the emergence of Tat-specific CTLs. By 4 weeks postinfection, the first escape mutants were detected and, by 8 weeks, wild-type virus was completely replaced with Tat escape variants (ALLEN et al. 2000; O'CONNOR et al. 2001).
Much less is known about the generation of CTL escape mutants in humans exposed to multiple variants of HIV-1 in genital secretions or blood (GOULDER et al. 1997; MCMICHAEL and PHILLIPS 1997; PRICE et al. 1997; DELWART et al. 1998; KARLSSON et al. 1998; MCMICHAEL 1998). Studies of viral kinetics indicate that HIV-1 replicates to high titer during the first week of infection, reaching peak viremia at ~3 weeks (MELLORS et al. 1997). During this time, prior to the induction of CTLs, the dominant forces acting on HIV-1 are likely to be related to viral fitness, replication capacity, and the adaptive potential of the virus in the new host (OVERBAUGH and BANGHAM 2001). In most patients, peak viremia is followed by a rapid decline in plasma HIV-1 RNA 6–8 weeks postinfection and, ultimately, by stabilization at a level referred to as the viral set point (MELLORS et al. 1996).
Evidence suggests that the decline in HIV-1 viremia and the control of persistent infection is mediated by CTLs (BORROW et al. 1997; ROSENBERG et al. 1997). By analogy with SIV (ADDO et al. 2001; O'CONNOR et al. 2001), it would be expected that by 8 weeks postinfection, transmitted variants of HIV-1 would be completely replaced with escape viruses that have evaded the initial CTL response. Identification of early escape variants or, perhaps more importantly, identification of wild-type (pre-escape) variants may provide important new insights for vaccine design. The mechanisms underlying the escape process are not fully understood (DA SILVA and HUGHES 1998; YUSIM et al. 2002), but are likely to be complex and involve a number of different steps. In addition to changes in antigenicity, these steps may include alterations in proteasomal cleavage, TAP-mediated translocation, human histocompatibility system (HLA)-binding, and T-cell receptor recognition (PAMER and CRESSWELL 1998; ABELE and TAMPE 1999; BOCHTLER et al. 1999; WILSON et al. 1999).
In this study, we used codon-substitution models to measure selection pressures along the length of the HIV-1 genome and to search for positively selected amino acids that may play an important role in the escape from host immunity. The application of positive selection models to vaccine design was first suggested in 1998 (NIELSEN and YANG 1998). Using this approach, widespread positive selection in the HIV-1 genome has been detected both at the interpatient level (YANG 2001; YANG et al. 2003) and within the same patient over time (ZANOTTO et al. 1999). Our studies confirm and extend these findings.
HIV-1 sequence data sets:
Analyses were performed on a total of 71 full-length sequences from subtypes B (n = 27) and C (n = 27) and group M (n = 27) viruses, representing subtypes A–K. These representative sequences, which were downloaded from the Los Alamos HIV database (http://hiv-web.lanl.gov), are described in detail elsewhere (DE OLIVEIRA et al. 2003a). To rule out the possibility of intersubtype recombination, only sequences classified as nonrecombinant were included in the analyses (ANISIMOVA et al. 2003; YANG et al. 2003). To avoid the introduction of insertions and deletions, nucleotide sequences representing multiply spliced early regulatory (tat, rev, nef), singly spliced (env, vif, vpr, vpu), and unspliced structural (gag, pol) genes were aligned against their predicted amino acid sequence using a CLUSTAL algorithm implemented in DAMBE (XIA and XIE 2001). Similar alignments were constructed for the translated amino acid sequences. The alignments were manually edited using the Genetic Data Environment for Linux interface (DE OLIVEIRA et al. 2003b).
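The idea of aligning nucleotides against their amino acid alignment can be illustrated with a minimal Python sketch. It is not the DAMBE implementation, which is not described here; the function name `thread_codons` and the toy sequences are assumptions made for illustration. Each gap in the protein alignment becomes a three-nucleotide gap, so that indels never split a codon.

```python
def thread_codons(aligned_protein, cds):
    """Thread an unaligned coding sequence onto one gapped protein alignment row.

    Hypothetical helper: every '-' in the protein row becomes '---', and every
    residue consumes the next codon of the coding sequence.
    """
    cds = cds.upper()
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(cds) != 3 * n_residues:
        raise ValueError("coding sequence must be 3x the number of residues")
    pieces = []
    pos = 0
    for aa in aligned_protein:
        if aa == "-":
            pieces.append("---")          # keep the gap in frame
        else:
            pieces.append(cds[pos:pos + 3])
            pos += 3
    return "".join(pieces)


# Toy example (sequences are illustrative, not taken from the data set).
print(thread_codons("MK-R", "ATGAAACGT"))
```

Because gaps are always inserted as whole codons, the resulting nucleotide alignment can be passed to codon-based models without frame shifts.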
Phylogenetic analysis and tree building:
Separate analyses were performed on each individual gene, including both distance and maximum-likelihood (ML) methods. The best-fitting nucleotide substitution model was evaluated using a hierarchical likelihood-ratio test (LRT) implemented in MODELTEST 3.0 (POSADA and CRANDALL 1998). The ML trees for complete genomes and individual genes were obtained by implementing a heuristic search with tree bisection reconnection branch swapping. Neighbor-joining trees were constructed using the Felsenstein 84 model and used in the codon-selection analysis. Phylogenetic analyses were performed with the PAUP* 4.0b10 program (SWOFFORD 2002).
Analysis of selection pressure:
Positive selection was assessed using four different codon-based ML substitution models (YANG et al. 2000): M0 (one-ratio), M1 (neutral), M2 (selection), and M3 (discrete). All models were implemented in the Codeml program of the PAML software package (YANG 1997). Analyses were performed using the discrete model (M3) with three dn/ds (ω) classes. Such models allow ω to vary among sites by defining a set number of discrete site categories, each with its own ω value. Through maximum-likelihood optimization, it is possible to estimate the value for ω and for p, the fraction of sites in the aligned data set that falls into a given category. Finally, the algorithm calculates the a posteriori probability that each codon belongs to a particular site category. Using the M3 model, sites with a posterior probability exceeding 90% and an ω value >1.0 were designated as being "positive selection sites" (YANG et al. 2000). Since these models are nested, with M3 being the most complex and M0 the least complex, it is possible to evaluate the best-fitting model for the data using the LRT (ANISIMOVA et al. 2001). Comparison of M0 with M3 is a test of site rate variation; comparison of M1 with M2 is a test for positive selection.
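The two calculations described above, the LRT between nested models and the posterior assignment of a codon to a site category, can be sketched in a few lines of Python. This is an illustrative sketch and not the Codeml implementation. The log-likelihoods, class proportions, and per-class site likelihoods below are hypothetical numbers. The degrees of freedom (4 for M0 vs. M3 with three classes, 2 for M1 vs. M2) follow the usual parameter counts for these models and should be treated as an assumption if the models are configured differently.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, P value) for a pair of nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf_even_df(stat, df)


def site_posteriors(proportions, site_likelihoods):
    """Posterior probability of each site class for one codon (Bayes' rule)."""
    weighted = [p * f for p, f in zip(proportions, site_likelihoods)]
    total = sum(weighted)
    return [w / total for w in weighted]


def is_positive_selection_site(omegas, posteriors, cutoff=0.90):
    """True if a class with omega > 1 has posterior probability above the cutoff."""
    return any(w > 1.0 and p > cutoff for w, p in zip(omegas, posteriors))


# Hypothetical values, for illustration only.
stat, p_value = likelihood_ratio_test(lnl_null=-5200.0, lnl_alt=-5100.0, df=4)
post = site_posteriors([0.6, 0.3, 0.1], [1e-7, 2e-7, 5e-5])
print(stat, p_value, post, is_positive_selection_site([0.1, 1.2, 5.0], post))
```

The closed-form tail probability is used only to keep the sketch free of external dependencies; a statistics library would normally supply it.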
Reconstruction of common ancestors:
A rooted tree of n taxa contains n − 1 internal nodes. Ancestral sequences at the internal nodes of each of the nine proteins in the B and C data sets were reconstructed by maximum likelihood using codon models selected by the LRT method. Reconstructed ancestral sequences were saved and translated into their corresponding amino acid sequences. The gp160 envelope glycoprotein was the most difficult to analyze due to the presence of hypervariable regions containing multiple insertions and deletions (indels). To facilitate analysis of gp160, sequences were aligned using glycosylation, myristylation, and protein kinase sites as anchors. To investigate the possibility of alternative coalescence events, other than those depicted by the ML trees constructed in PAML, suboptimal ML trees were also reconstructed using the Bayesian algorithm implemented in MrBayes software (HUELSENBECK and RONQUIST 2001). Ancestral sequences of the trees were also reconstructed and saved for the prediction of escape epitopes.
Identification of escape epitopes:
To search for potential escape epitopes, genomic regions containing a large number of positively selected sites were analyzed together with ancestral sequences. The sequences were aligned, translated, and analyzed for differences between sampled strains and their reconstructed ancestral sequences. To identify new peptide sequences that were not present in the sampled strains, ancestral sequences were analyzed using a 10-amino-acid sliding window incremented one codon at a time. Whenever a reconstructed 10-amino-acid ancestral peptide was not present in the external branches of the tree, the sampled sequence was saved as a possible novel epitope. Novel amino acid peptides were screened against previously identified epitopes using two predictive software programs, SYFPEITHI (RAMMENSEE et al. 1999) and Epimap from the Los Alamos HIV Seq.Db (BRANDER and GOULDER 1999), for amino acid composition and binding properties.
Positive selection and amino acid variability:
ML methods were used to assess amino acid variation and identify targets of positive selection. Significant differences were observed in the number and distribution of positively selected variants, among both different HIV-1 proteins and different regions of the same protein. Table 1 describes the fraction of sites (p1, p2, p3) in each protein that were under positive (diversifying) selection, along with the respective ω (dn/ds) values for each category of the M0, M1, M2, and M3 models. Table 1 also shows the results of the LRT comparing M3 with M0 and M2 with M1. Using LRT, the M3 (discrete) and M2 (positive) models were selected (P < 0.001) for all proteins of subtypes B and C, providing evidence of varying selection pressure at individual sites across the HIV-1 genome. The M3 and M2 models were also accepted for group M viruses. However, when compared to subtypes B and C, the number of positively selected sites in the group M data set was substantially higher. Several sites in the M group mapped to signature sequences identified by the VESPA program (KORBER and MYERS 1992), suggesting that observed variation was due to subtype constraints, rather than to immune selection pressure. This interpretation is consistent with other studies that have shown a decrease in power when applying selection models to highly divergent data sets (RAMBAUT et al. 2004).
One of the most unexpected findings was the high frequency of positively selected variants in the early regulatory proteins, Tat and Rev. Overall, 30.3% of Tat codons in subtype C and 18.4% in subtype B had dn/ds values ≥2.0. The corresponding values for Rev were 18.4% for subtype C and 29.2% for subtype B. Lower levels of variability were observed for Vif, Env, Vpr, Vpu, and Nef, with the proportion of codons having dn/ds ≥2.0 ranging from 1 to 15%. Although Env and Vpu are generally considered to be the most variable HIV-1 proteins, the relative proportion of positively selected sites was greater in Tat and Rev. Many of the sites in Env were localized near regions that contained inserted or deleted codons. To avoid bias, these indels were removed from the analysis, making these regions uninformative. In Vpu, a large proportion of codons (27.3% in subtype B, 24.5% in subtype C) were under positive selection, but the selection intensity was relatively low with 20.8% of B and 23.1% of C viruses having dn/ds values of 1.10 and 1.68, respectively. Only 1.3% of Vpu codons in subtype C and 6.5% in B were strongly selected with dn/ds values of 8.78 and 5.16, respectively. A similar pattern was observed for Nef. The least-variable protein was integrase with no sites in subtype C and 1.9% of sites in subtype B having dn/ds ≥2.0.
Phylogenetic analysis of the ancestral sequences:
Figure 1 is a representative tree constructed from 27 full-length subtype C and 8 reconstructed ancestral sequences. The sequences fell into 8 distinct sublineages, representing strains from India, southern Africa, Ethiopia and Israel, and Brazil. This pattern was supported by high bootstrap values (>75%) and high-score ML trees and by phylogenetic analyses of both nucleotide and deduced amino acid sequences. The nucleotide diversity for the complete alignment was 7.8%. As previously reported by GASCHEN et al. (2002), the average distance between a given contemporary sequence and its most recent common ancestor (MRCA) was approximately one-half the sublineage diversity. The mean divergence among sequences in the same sublineage ranged from 4.3% among Brazilian viruses to 4.4% for Indian, 7.8% for Ethiopian and Israeli, and 8.0% for viruses from southern Africa. The average deviation between a given HIV-1 sequence and its most proximal ancestor was 2.2% for Brazilian, 2.6% for Indian, 3.9% for Ethiopian and Israeli, and 4.7% for African viruses. All of the following analyses focus on the early regulatory proteins Tat and Rev.
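The statement that the distance to the most proximal ancestor is roughly one-half the sublineage diversity can be checked directly against the percentages reported above. The short Python sketch below does only that arithmetic; the dictionary keys are labels chosen here, and "southern Africa" is assumed to correspond to the "African viruses" figure.

```python
# Percentages as reported in the paragraph above.
sublineage_diversity = {
    "Brazil": 4.3,
    "India": 4.4,
    "Ethiopia/Israel": 7.8,
    "southern Africa": 8.0,
}
distance_to_ancestor = {
    "Brazil": 2.2,
    "India": 2.6,
    "Ethiopia/Israel": 3.9,
    "southern Africa": 4.7,  # assumed to match the "African viruses" value
}


def ancestor_ratios(diversity, to_ancestor):
    """Ratio of ancestor distance to within-sublineage diversity, per sublineage."""
    return {name: to_ancestor[name] / diversity[name] for name in diversity}


ratios = ancestor_ratios(sublineage_diversity, distance_to_ancestor)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

All four ratios fall between about 0.5 and 0.6, in line with the approximate one-half relationship.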
Pattern of amino acid variability:
The distribution of amino acid variants in the Tat and Rev proteins of subtypes B and C is shown in Figure 2. For both proteins, positively selected amino acids were concentrated primarily in the C-terminal regions, while neutral and negatively selected codons predominated in the conserved functional domains near the N termini of the proteins. Overall, 30 codons in Tat and 17 in Rev were under positive selection in subtype C viruses. The corresponding values for subtype B were 19 and 32, respectively. The detection of fewer positively selected sites in the Rev protein of subtype C was not unexpected, given the truncated nature of the C protein (POLLARD and MALIM 1998). A total of 21 of the positively selected codons, 12 in Tat and 9 in Rev, were common to both subtypes, suggesting that these amino acids are frequent targets of host selection pressure.
Correlations among highly conserved functional domains, CTL epitopes, and amino acid variability:
Relatively few positively selected variants were detected at the N terminus of Tat between codons 1 and 57, a region that contains the functionally important minimal activation domain, the cysteine-rich disulphide bond region, the nuclear localization signal (NLS), the TAR- and Sp1-binding sites, and the histone acetyltransferase (HAT) domain (JEANG et al. 1999). In addition to being conserved and negatively selected, this region contains several experimentally defined CTL epitopes. Limited variation was tolerated within the disulphide bond region, but not within the essential cysteine residues at codons 22, 25, 27, 30, 34, and 37. Similarly, few positively selected variants were detected in the N-terminal portion of Rev. This region, which overlaps the C terminus of Tat, contains the high-affinity, arginine-rich binding site TRQARRNRRRRWRERQR, which functions as a nuclear import signal, a multimerization domain, and an RRE-binding domain (POLLARD and MALIM 1998). Despite a high density of CTL epitopes, the N terminus of Rev between codons 1 and 49 was relatively invariant, a finding that presumably reflects the structural and functional constraints of this region.
Correlations among CKII phosphorylation domains, amino acid variation, and CTL epitopes:
A striking finding was the high density of phosphorylation motifs at the C terminus of Tat. Four putative casein kinase II (CKII) sites, codons 61–64, 77–80, 82–85, and 95–98, were prevalent in C, but not in B, viruses. Three of these sites (at positions 61–64, 82–85, and 95–98) were highly conserved with prevalence rates ranging from 88.9 to 92.6%. Variation at these conserved sites was restricted primarily to the spacer (x), rather than to the functionally important serine/threonine (S/T) and aspartic/glutamic acid (D/E) residues. In two viruses that lacked a CKII motif at codons 82–85, an alternative CKII site was detected downstream: the first at codons 83–86, the second at codons 87–90. Less-conserved CKII sites were identified at codons 77–80 and 93–96 in 66.6 and 51.0% of C viruses, respectively. At these sites, nonsynonymous mutations were tolerated in the serine/threonine (77T, 93S) and glutamic/aspartic acid (80D, 96E) residues. Interestingly, the RGD (arginine, glycine, aspartic acid) cell attachment site of the Tat protein was embedded within the CKII motif at positions 77–80. Nonsynonymous mutations in the arginine (R) and aspartic acid (D) residues of the CKII motif lead to elimination of the RGD site. The linked nature of these overlapping regions is shown in Figure 3.
CKII phosphorylation sites in Rev were also under positive selection. In subtype B, the most common CKII was located at codons 8–11 near the N terminus of the protein. Despite selection pressure on position 11, most mutations were synonymous or involved the replacement of glutamic acid with aspartic acid, substitutions that preserved the CKII motif. The majority (88.8%) of C viruses lacked this CKII site due to an alanine substitution at codon 11. Instead, the predominant CKII site in C viruses was located within the multimerization domain adjacent to the RRE-binding site at codons 54–57. Mutations (85.2%) at serine-54 in C viruses were often synonymous or involved the substitution of serine with threonine, leading to preservation of this serine-based CKII motif. In contrast, selection pressure on serine-54 and D/E-57 residues of B viruses led to nonsynonymous mutation and disruption of the CKII motif in 60% of isolates. Positive selection was also observed within the leucine-rich nuclear export signal (NES) of Rev (POLLARD and MALIM 1998), especially in C viruses. The frequent replacement of leucine-4 and -9 in the NES sequence LPPLERLTL suggests that C viruses may be interacting with nuclear export proteins that are different from those used by subtype B. Other features of subtype C included the deletion of amino acids 108–116 at the extreme C terminus of Rev and the presence of multiple overlapping myristylation motifs between codons 89–105, immediately upstream from the deletion. Despite positive selection pressure, the 89–105 deletion was not detected in subtype B and only 20% of B viruses carried the extended myristylation motif.
Constraints imposed by overlapping reading frames:
In total, 22 (73.0%) of the positively selected Tat codons in subtype C were situated in a region that overlaps the N terminus of Rev. Of these, 14 (63.6%) were localized in a reading frame that also overlaps with Env. In this region, nucleotide changes in tat would be expected to affect not only Tat but also the overlapping segment of Rev and Env. Conversely, nucleotide substitutions in rev would be expected to have an impact on the overlapping regions of Tat and Env. As an example, the glycine residue at position 33 of Rev was a frequent target of antibody and CTL reactivity (Figure 3). In total, eight substitutions were detected at this position, seven of which were synonymous mutations in the third codon position. These GGG→GGA mutations had no impact on the myristylation site in Rev, but caused a nonsynonymous aspartic acid (GAC) to asparagine (AAC) mutation in Tat, a change that eliminates the CKII site at codons 61–64.
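The effect of a single nucleotide change on two overlapping reading frames can be made concrete with a small Python sketch. Only the codons GGG, GGA, GAC, and AAC come from the example above; the trailing nucleotides (CTT) and the frame offsets 0 and 2 are hypothetical padding chosen so that both frames contain complete codons. The codon table is the standard genetic code.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq, offset):
    """Translate the complete codons of seq, starting at the given offset."""
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Illustrative fragment: frame 0 plays the role of rev, frame 2 the role of tat.
wild_type = "GGGACCTT"
mutant = "GGAACCTT"   # G -> A at the third position of the first frame-0 codon

for name, seq in (("wild type", wild_type), ("mutant", mutant)):
    print(name, "frame 0:", translate(seq, 0), "frame 2:", translate(seq, 2))
```

In this toy fragment the substitution leaves glycine unchanged in frame 0 (GGG to GGA) but converts aspartic acid to asparagine in frame 2 (GAC to AAC), mirroring the Rev and Tat example in the text.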
Prediction of novel peptide sequences:
A sliding window method was used to search for novel peptide sequences that were present in ancestral sequences, but absent from sampled contemporary sequences. A total of 771 peptides (including 589 Env, 111 Nef, 39 Tat, and 32 Rev sequences) in which one or more of the 10 amino acids in the ancestral sequence differed from the sampled sequence were identified. All of these peptides were localized in regions of positive selection. Several Env peptides were located adjacent to areas of insertions or deletions, suggesting that indels may contribute to the creation of new antigenic sites (data not shown). A high proportion (75%) of newly identified epitopes were localized internally at the ancestral nodes of the tree.
As a result of the rapid accumulation of sequence data, combined with advances in codon-based substitutions and ancestral reconstruction methods, it is now possible to begin searching for evolutionary patterns that may be relevant to vaccine design. These methods can be applied to the entire HIV-1 genome and to large numbers of pooled sequences collected from patients with the same, or different, HLA types. Through a process of ancestral reconstruction, it may also be possible to identify wild-type ancestral sequences and to reconstruct CTL escape pathways.
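The sliding-window comparison of ancestral and sampled sequences used to predict novel peptides can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the procedure as implemented in the study: sequences are assumed to be aligned amino acid strings of equal length, windows containing gaps are skipped, and the function name and toy sequences are hypothetical.

```python
def novel_ancestral_peptides(ancestral, sampled, window=10):
    """List (1-based start, peptide) for ancestral windows absent from all samples.

    A window is reported when no sampled sequence carries the identical peptide
    at the same alignment coordinates (an assumption about how "not present in
    the external branches" is evaluated).
    """
    hits = []
    for start in range(len(ancestral) - window + 1):
        peptide = ancestral[start:start + window]
        if "-" in peptide:
            continue  # skip windows that span alignment gaps
        if all(seq[start:start + window] != peptide for seq in sampled):
            hits.append((start + 1, peptide))
    return hits


# Toy aligned sequences, for illustration only.
ancestor = "MEPVDPKLEPWKH"
tips = ["MEPVDPRLEPWKH", "MEPVDPNLEPWKH"]
for position, peptide in novel_ancestral_peptides(ancestor, tips):
    print(position, peptide)
```

Peptides returned by such a search would then be candidates for screening against known epitopes, as described in the methods.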
As an example, we recently analyzed a set of published sequences collected during a time-course study of acute SIV infection in rhesus macaques inoculated with a single cloned variant of SIVMAC 239 (ALLEN et al. 2000; O'CONNOR et al. 2001). In these studies, escape from acute SIV infection was associated with a single-amino-acid substitution in the Tat-specific epitope, SL8. Ancestral analysis of serial SL8 sequences collected from different macaques and from the same macaque sampled over time at 2, 4, 6, and 8 weeks postinfection identified the inoculating variant of SL8, STPESANL, as the MRCA, even when this sequence was no longer present in the postinoculum specimen (data not shown).
In the HIV-1 setting, it may be possible to measure the intensity of the selection pressure exerted on individual amino acid sites and to preselect a small number of epitopes that are strongly selected and warrant further experimental investigation. These sequences could then be used to design a multi-epitope vaccine directed against regions of the virus that are unable to mutate and escape immune recognition. By inducing a mucosal immune response to these epitopes, prior to infection, it may be possible to prevent the initial establishment of infection or to reduce the level of peak viremia.
Our data confirm and extend previously published findings. In agreement with YANG (2001) and YANG et al. (2003), we detected varying levels of adaptive evolution in all nine HIV-1 genes. One of the most striking findings was the distinct pattern and high concentration of positively selected codons in the C termini of Tat and Rev in regions that lack CTL epitopes and essential functional domains such as the TAR- and RRE-binding domains. Our studies also suggest that the selection pressures directed against these C-terminal regions are likely to be complex and to involve both direct and indirect selection pressures exerted through overlapping reading frames. These findings are particularly intriguing, given that Tat and Rev (along with Nef) are the earliest proteins to be expressed in newly infected cells and that these proteins are the primary determinants controlling the complex, temporally regulated expression of HIV-1. Tat plays a major role in the upregulation of HIV-1 gene expression; Rev controls the switch from chronic abortive infection to full-length mRNA expression and productive infection. Since many of the selection pressures exerted on the C terminus of Tat are likely to also impact on Rev (and vice versa), our findings suggest that the coordinated, sequential expression of these two proteins may be regulated by antigenic variation induced by complex interactions with the host immune system and/or by interactions with other proteins and regulatory factors in the intracellular milieu. Flexibility at the C termini of Tat and Rev, as shown by the emergence and relocation of CKII and myristylation, suggests that these regions are able to tolerate a high level of genetic variation while still retaining their biological properties. 
In contrast to highly conserved and functionally constrained domains at the N termini of Tat and Rev (i.e., NLS-, TAR-, and Sp1-binding sites), sequences at the C termini of Tat and Rev would be expected to be more susceptible to host-driven selection pressure.
One could argue that CKII sites [S/T-x(2)-D/E] may not be particularly relevant because of their low complexity and the possibility that these sites may not be phosphorylated in vivo. However, phosphorylation is known to play a major role in the regulation of RNA-binding proteins such as Tat and Rev (HOLMES 1996; MEGGIO et al. 1996; PARADA and ROEDER 1996; YANG et al. 1996; FOUTS et al. 1997; CHUN et al. 1998; MARIN et al. 2000). Studies have shown that CKII-mediated phosphorylation of serine-54 leads to conformational changes in Rev and rapid, efficient RNA-binding (FOUTS et al. 1997). It has also been shown that less pathogenic HIV-2 viruses lack this CKII site. We found that, although most B viruses lacked the serine-54 site, they carried an alternative CKII phosphorylation motif at codons 8–11. In B viruses, phosphorylation of serine-8 has been shown to be important for transactivation of Rev.
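Because the CKII motif is a simple four-residue pattern, candidate sites can be located with a regular expression. The Python sketch below is illustrative only; the function name is hypothetical, and a match indicates a putative site, not evidence of phosphorylation in vivo. The example peptide is the SL8 epitope STPESANL mentioned earlier.

```python
import re

# [S/T]-x(2)-[D/E]; the lookahead lets overlapping motifs be reported.
CKII_PATTERN = re.compile(r"(?=([ST]..[DE]))")


def find_ckii_sites(protein):
    """Return (1-based start position, matched tetrapeptide) for each putative site."""
    return [(m.start() + 1, m.group(1)) for m in CKII_PATTERN.finditer(protein)]


print(find_ckii_sites("STPESANL"))
```

The low complexity of the pattern is apparent here: any serine or threonine followed three residues later by an acidic residue will match, which is why such hits need experimental confirmation.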
Our understanding of Tat phosphorylation is less clear (PARADA and ROEDER 1996; YANG et al. 1996; CHUN et al. 1998). Most studies have shown that Tat enhances the activity of other phosphorylated proteins. However, the detection of five CKII sites (three of which were highly conserved) in a short stretch of amino acids at the C terminus of subtype C suggests that direct phosphorylation (hyperphosphorylation) of Tat may also occur. The conservation of these CKII sites, in the face of strong diversifying pressure, may be due to the fact that many of the substitutions occurred in spacer amino acids (x), although serine→threonine and glutamic→aspartic acid substitutions were also tolerated. At the more variable CKII sites, two in subtype C and one in subtype B, nonsynonymous mutations were also tolerated in the functionally important [S/T] and [D/E] residues. However, the loss of a CKII [or protein kinase C (PKC)] site at one position was frequently associated with the presence of an alternative site at a different location.
The immunological and functional significance of these variations remains to be established. Interestingly, the STPESANL epitope associated with CTL-mediated escape from acute SIV infection (ALLEN et al. 2000; O'CONNOR et al. 2001) contains a serine-based CKII motif, [S]-x(2)-[E]. By 8 weeks postinfection, this CKII site had been eliminated from 35% of the SIV escape mutants. Further studies are needed to determine whether STPESANL and other escape epitopes are phosphorylated in vivo and whether phosphorylation/dephosphorylation alters the conformation and antigenicity of these epitopes, facilitating immune escape. Two additional epitopes have been identified. One contains a PKC site; the other maps to the NES (ADDO et al. 2001). Both sites were under positive selection, especially in C viruses. Again, additional studies are needed to determine whether these patterns are consistent and whether they reflect different biological properties of the two viral subtypes.
Although codon-based evolutionary methods are still in the early stages of development, our studies suggest that a combination of mathematical and experimental methods will lead to an improved understanding of the mechanisms underlying CTL escape and provide new insights for vaccine design. Such studies may also yield new information on the significance of overlapping reading frames and how these regions contribute to the complexity of host-virus interactions and the regulated, ordered expression of the HIV-1 genome. As suggested by OVERBAUGH and BANGHAM (2001), to be informative, evolutionary modeling should be complemented with parallel testing of effector cell populations, including HIV-1-specific CTL and T-helper and B-cell clones to identify which selection processes are most critical to the escape pathway.
Our studies suggest that it may be advantageous to extend the above approach to an analysis of important functional domains. When possible, the analyses should be performed on transmitted variants of HIV-1 collected sequentially during the period of acute seroconversion. Although selection models have proven useful for analyzing HIV-1 from different patients with the same, or different, viral subtypes (YANG 2001; YANG et al. 2003), they are best suited to the analysis of within-host variation. In addition, to avoid problems relating to intersubtype recombination, the analyses should be performed on sequences that are known to be nonrecombinant at the subtype level (ANISIMOVA et al. 2003; YANG et al. 2003). Once important escape patterns have been identified, they can be tested experimentally in the SIV model. Such studies would involve the induction of CTL responses to pre-escape variants, followed by viral challenge. Finally, by using site-directed mutagenesis, it should be possible to elucidate the potential role of phosphorylation in the escape process.
ABELE, R., and R. TAMPE, 1999 Function of the transport complex TAP in cellular immune recognition. Biochim. Biophys. Acta 1461: 405–419.
ADDO, M., M. ALTFELD, E. S. ROSENBERG, R. L. ELDRIDGE, M. N. PHILIPS et al., 2001 The HIV-1 regulatory proteins Tat and Rev are frequently targeted by cytotoxic T lymphocytes derived from HIV-1-infected individuals. Proc. Natl. Acad. Sci. USA 98: 1783–1796.
ALLEN, T. M., D. H. O'CONNOR, P. JING, J. L. DZURIS, B. R. MOTHE et al., 2000 Tat-specific cytotoxic T lymphocytes select for SIV escape variants during resolution of primary viraemia. Nature 407: 386–390.
ANISIMOVA, M., J. P. BIELAWSKI and Z. YANG, 2001 Accuracy and power of the likelihood ratio test in detecting adaptive molecular evolution. Mol. Biol. Evol. 18: 1585–1592.
ANISIMOVA, M., R. NIELSEN and Z. YANG, 2003 Effect of recombination on the accuracy of the likelihood method for detecting positive selection at amino acid sites. Genetics 164: 1229–1236.
BAROUCH, D. H., J. KUNSTMAN, M. J. KURODA, J. E. SCHMITZ, S. SANTRA et al., 2002 AIDS vaccine failure in a rhesus monkey by viral escape from cytotoxic T lymphocytes. Nature 415: 335–339.
BOCHTLER, M., L. DITZEL, M. GROLL, C. HARTMANN and R. HUBER, 1999 The proteasome. Annu. Rev. Biophys. Biomol. Struct. 28: 295–317.
BORROW, P., H. LEWICKI, X. WEI, M. S. HORWITZ, N. PEFFER et al., 1997 Antiviral pressure exerted by HIV-1 specific cytotoxic T lymphocytes (CTLs) during primary infection demonstrated by rapid selection of CTL escape virus. Nat. Med. 3: 205–211.
BRANDER, C., and P. J. R. GOULDER, 1999 Recent advances in HIV-1 CTL epitope characterization, pp. IV-117 in HIV Molecular Immunology Database, edited by T. M. KORBER, C. BRANDER, B. F. HAYNES, R. KOUP, C. KUIKEN et al. Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, NM.
CHECK, E., 2003 AIDS vaccines: back to plan A. Nature 424: 912–914.
CHUN, R. F., O. J. SEMMES, C. NEUVEUT and K. T. JEANG, 1998 Modulation of Sp1 phosphorylation by human immunodeficiency virus type 1 Tat. J. Virol. 72: 2615–2629.
DANIEL, M. D., F. KIRCHHOFF, S. C. CZAJAK, P. K. SEHGAL and R. C. DESROSIERS, 1992 Protective effects of a live attenuated SIV vaccine with a deletion in the nef gene. Science 258: 1938–1941.
DA SILVA, J., and A. L. HUGHES, 1998 Conservation of cytotoxic T lymphocyte (CTL) epitopes as a host strategy to constrain parasite adaptation: evidence from the nef gene of human immunodeficiency virus 1 (HIV-1). Mol. Biol. Evol. 15: 1259–1268.
DELWART, E. L., J. I. MULLINS, P. GUPTA, G. H. LEARN and M. HOLODINY, 1998 Human immunodeficiency virus type 1 populations in blood and semen. J. Virol. 72: 617–623.
DE OLIVEIRA, T., S. ENGELBRECHT, E. J. VAN RENSBURG, M. GORDON, K. BISHOP et al., 2003a Variability at HIV-1 subtype C protease cleavage sites: An indication of viral fitness? J. Virol. 77: 9422–9430.
DE OLIVEIRA, T., R. MILLER, M. TARIN and S. CASSOL, 2003b An integrated genetic data environment (GDE)-based LINUX interface for analysis of HIV-1 and other microbial sequences. Bioinformatics 19: 153–154.
DUNN, C. S., B. HURTREL, C. BEYER, L. GLOECKLER, T. N. LEDGER et al., 1997 Protection of SIV mac-infected macaque monkeys against superinfection by a simian immunodeficiency virus expressing envelope glycoproteins of HIV type 1. AIDS Res. Hum. Retroviruses 13: 913–922.
FOUTS, D. E., H. L. TRUE, K. A. CENGEL and D. CELANDER, 1997 Site-specific phosphorylation of the human immunodeficiency virus type-1 Rev protein accelerates formation of an efficient RNA-binding conformation. Biochemistry 36: 13256–13263.
GASCHEN, B., J. TAYLOR, K. YUSIM, B. FOLEY, F. GAO et al., 2002 Diversity considerations in HIV-1 vaccine selection. Science 296: 2354–2360.
GOULDER, P. J., R. E. PHILLIPS, R. A. COLBERT, S. MCADAM, G. OGG et al., 1997 Late escape from an immunodominant cytotoxic T-lymphocyte response associated with progression to AIDS. Nat. Med. 3: 212–217.
HIRSCH, V. M., S. GOLDSTEIN, N. A. HYNES, W. R. ELKINS, W. T. LONDON et al., 1994 Prolonged clinical latency and survival of macaques given a whole inactivated simian immunodeficiency virus vaccine. J. Infect. Dis. 170: 51–59.
HOLMES, A. M., 1996 In vitro phosphorylation of human immunodeficiency virus type 1 Tat protein by protein kinase C: evidence for the phosphorylation of amino acid residue serine-46. Arch. Biochem. Biophys. 335: 8–12.
HOOVER, E. A., N. A. PERIGO, S. L. QUACKENBUSH, C. K. MATHIASON-DUBARD, J. M. OVERBAUGH et al., 1991 Protection against feline leukemia virus infection by use of an inactivated virus vaccine. J. Am. Vet. Med. Assoc. 199: 1392–1401.
HUELSENBECK, J. P., and F. RONQUIST, 2001 MRBayes: Bayesian inference of phylogenetic trees. Bioinformatics 17: 754–755.
JEANG, K-T., H. XIAO and E. A. RICH, 1999 Multifaceted activities of the HIV-1 transactivator of transcription, Tat. J. Biol. Chem. 274: 28837–28840.
KARLSSON, A. C., S. LINDBACK, H. GAINES and A. SONNERBORG, 1998 Characterization of the viral population during primary HIV-1 infection. AIDS 12: 839–847.
KLAUSNER, R. D., A. S. FAUCI, L. COREY, G. J. NABEL, H. GAYLE et al., 2003 The need for a global HIV vaccine enterprise. Science 300(5628): 2036–2039.
KORBER, B., and G. MYERS, 1992 Signature pattern analysis: a method for assessing viral sequence relatedness. AIDS Res. Hum. Retroviruses 8: 1549–1560.
MARIN, O., S. SARNO, M. BOSCHETTI, M. A. PAGANO, F. MEGGIO et al., 2000 Unique features of HIV-1 Rev protein phosphorylation by protein kinase CK2. FEBS Lett. 481: 63–67.
MCMICHAEL, A. J., 1998 T cell responses and viral escape. Cell 93: 673–676.
MCMICHAEL, A. J., and R. E. PHILLIPS, 1997 Escape of human immunodeficiency virus from immune control. Annu. Rev. Immunol. 15: 271–296.
MEGGIO, F., D. M. D'AGOSTINO, V. CIMINALE, L. CHIEVO-BIANCHI and L. A. PINNA, 1996 Phosphorylation of HIV-1 Rev protein: implication of protein kinase CK2 and pro-directed kinases. Biochem. Biophys. Res. Commun. 226: 547–554.
MELLORS, J. W., C. R. RINALDO, JR., P. GUPTA, R. M. WHITE, J. A. TODD et al., 1996 Prognosis in HIV-1 infection predicted by the quantity of virus in plasma. Science 272: 1167–1170.
MELLORS, J. W., A. MUNOZ, J. V. GIORGI, J. B. MARGOLICK, C. J. TASSONI et al., 1997 Plasma viral load and CD4+ lymphocytes as prognostic markers of HIV-1 infection. Ann. Intern. Med. 126: 946–954.
NIELSEN, R., and Z. YANG, 1998 Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148: 929–936.
NOVITSKY, V., N. RYBAK, M. F. MCLANE, P. GILBERT, P. CHIGWEDERE et al., 2001 Identification of human immunodeficiency virus type I subtype C Gag-, Tat-, Rev-, and Nef-specific Elispot-based cytotoxic T-lymphocyte responses for AIDS vaccine design. J. Virol. 75: 9210–9228.
O'CONNOR, D., T. ALLEN and D. I. WATKINS, 2001 Vaccination with CTL epitopes that escape: An alternative approach to HIV vaccine development? Immunol. Lett. 79: 71–84.
OVERBAUGH, J., and C. R. M. BANGHAM, 2001 Selection forces and constraints on retroviral sequence variation. Science 292: 1106–1109.
PAMER, E., and P. CRESSWELL, 1998 Mechanisms of MHC class I-restricted antigen processing. Annu. Rev. Immunol. 16: 323–358.
PARADA, C. A., and R. G. ROEDER, 1996 Enhanced processivity of RNA polymerase II triggered by Tat-induced phosphorylation of its carboxy-terminal domain. Nature 384: 375–378.
POLLARD, V. W., and M. H. MALIM, 1998 The HIV-1 Rev protein. Annu. Rev. Microbiol. 52: 491–532.
POSADA, D., and K. A. CRANDALL, 1998 MODELTEST: testing the model of DNA substitution. Bioinformatics 14: 817–818.
PRICE, D. A., P. J. GOULDER, P. KLENERMAN, A. K. SEWELL, P. J. EASTERBROOK et al., 1997 Positive selection of HIV-1 cytotoxic T lymphocytes escape variants during primary infection. Proc. Natl. Acad. Sci. USA 94: 1890–1895.
RAMBAUT, A., D. POSADA, K. A. CRANDALL and E. C. HOLMES, 2004 The causes and consequences of HIV evolution. Nat. Rev. Genet. 5: 52–61.
RAMMENSEE, H. G., J. BACHMANN, N. P. EMMERICH, O. A. BACHOR, S. STEVANOVIC et al., 1999 SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics 50: 213–219.
ROSENBERG, E. S., J. M. BILLINGSLEY, A. M. CALIENDO, S. L. BOSWELL, P. E. SAX et al., 1997 Vigorous HIV-1-specific CD4+ T cell responses associated with control of viremia. Science 278: 1447–1450.
SWOFFORD, D. L., 2002 PAUP* 4.0: Phylogenetic Analysis Using Parsimony (and Other Methods), Version 4.0b2a. Sinauer Associates, Sunderland, MA.
WILSON, C., R. C. BROWN, B. T. KORBER, B. M. WILKES, D. J. RUHL et al., 1999 Frequent detection of escape from cytotoxic T-lymphocyte recognition in perinatal human immunodeficiency virus (HIV) type 1 transmission: the Ariel project for the prevention of transmission of HIV from mother to infant. J. Virol. 73: 3975–3985.
XIA, X., and Z. XIE, 2001 DAMBE: software package for data analysis in molecular biology and evolution. J. Hered. 92(4): 371–373.
YANG, W., J. P. BIELAWSKI and Z. YANG, 2003 Widespread adaptive evolution in the human immunodeficiency virus type 1 genome. J. Mol. Evol. 57(2): 212–221.
YANG, X., C. H. HERRMAN and A. P. RICE, 1996 The human immunodeficiency virus Tat proteins specifically associate with TAK in vivo and require the carboxyl-terminal domain of RNA polymerase II for function. J. Virol. 70: 4576–4584.
YANG, Z., 1997 PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13: 555–556.
YANG, Z., 2001 Maximum likelihood analysis of adaptive evolution in HIV-1 gp120 env gene. Pac. Symp. Biocomput. 2001: 226–237.
YANG, Z., R. NIELSEN, N. GOLDMAN and A. M. PEDERSEN, 2000 Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155: 431–449.
YUSIM, K., C. KESMIR, B. GASCHEN, M. M. ADDO, M. ALTFELD et al., 2002 Clustering patterns of cytotoxic T-lymphocyte epitopes in human immunodeficiency virus type 1 (HIV-1) proteins reveal imprints of immune evasion on HIV-1 global variation. J. Virol. 76: 8757–8768.
ZANOTTO, P. M., E. G. KALLAS, R. F. DE SOUZA and E. C. HOLMES, 1999 Genealogical evidence for positive selection in the nef gene of HIV-1. Genetics 153: 1077–1089.



Figure legend (fragment; the symbols keyed to each feature were lost in extraction): NES; cysteine-rich CTL epitopes (CTL-B) disulfide bond region, Cys Rich; Sp1-binding site; HAT (Ø); RGD cell attachment site, cell attch; casein kinase phosphorylation sites, CKII; protein kinase C phosphorylation sites, PKC; myristylation sites, MYRISTYL; and MRCA signature mutations that differ between subtypes B and C, Mut sub.BxC. Reading frames (+1, +2, and +3) of the overlapping regions of Tat, Rev, and Env are shown at the bottom.




