Description

This track displays the results of positive selection analysis of VAX003 subtype AE or subtype B HIV-1 gp120 nucleotide sequences. Each positively selected site is labeled with the corresponding amino acid of the consensus sequence.

Methods

The standard method for detecting adaptive molecular evolution in protein-coding DNA sequences is to compare nonsynonymous (amino acid changing; dN) and synonymous (silent; dS) substitution rates through the dN/dS ratio (ω, or acceptance rate). ω is the ratio of the two rates, estimated under a codon substitution model. If an amino acid substitution is neutral, it will be fixed at the same rate as a synonymous mutation, so ω = 1. If the amino acid change is deleterious, purifying or negative selection (i.e., natural selection against deleterious mutations with negative selection coefficients) will reduce its fixation rate, so ω < 1. Only when the amino acid change offers a selective advantage is it fixed at a higher rate than a synonymous mutation, so ω > 1. Therefore, an ω ratio significantly greater than one is convincing evidence for adaptive or diversifying selection. In the HIV context, this is the principal means of identifying changes, and therefore sites, under antigenic selection, which are in turn the most important for vaccine targets.

Here, dN/dS ratios for each subtype and for the placebo and vaccine samples were compared using methods implemented in PAML v3.14. ω and the proportion of sites (ρ) with ω > 1 were estimated under the site-specific models of Goldman and Yang (1994) and Yang et al. (2000). Tests of positive selection were performed by comparing likelihood scores (likelihood ratio test, LRT) between the nested per-site models M1 (neutral) and M2 (selection), and between M7 (beta) and M8 (beta&ω). The more stringent per-gene selection model in PAML (M0) was also estimated for comparison. If adaptive selection was identified, we then applied the Bayesian test developed by Yang et al. and implemented in PAML to identify the potential sites under diversifying selection, as indicated by a posterior probability > 0.95. Maximum likelihood trees were estimated for each subtype using PhyML under the best-fit substitution models. Each analysis was run twice, once with ω = 1.5 and once with ω = 0.5.
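
The likelihood ratio test between nested models described above can be sketched in Python. This is a minimal illustration, not the PAML implementation. It assumes that the test statistic, twice the difference in log-likelihood, is compared with a chi-square distribution with 2 degrees of freedom; that is a common convention for the M1/M2 and M7/M8 comparisons but is an assumption here, since the degrees of freedom are not stated in this description. With 2 degrees of freedom the upper tail has the closed form exp(-x/2), so no statistics library is needed. The function name lrt_p_value is hypothetical; the example values are the -lnL scores reported for subtype AE in Table 1.

```python
import math


def lrt_p_value(neg_lnl_null, neg_lnl_alt):
    """Return (statistic, p_value) for a likelihood ratio test with 2 df.

    Inputs are negative log-likelihoods (-lnL), as reported in Table 1.
    ASSUMPTION: the statistic follows a chi-square distribution with
    2 degrees of freedom, whose upper tail is exactly exp(-x / 2).
    """
    statistic = 2.0 * (neg_lnl_null - neg_lnl_alt)
    if statistic <= 0:
        # The more complex model does not fit better; nothing to test.
        return statistic, 1.0
    return statistic, math.exp(-statistic / 2.0)


# Subtype AE, -lnL values copied from Table 1.
m1_m2 = lrt_p_value(33555.5, 32999.5)  # M1 (neutral) vs. M2 (selection)
m7_m8 = lrt_p_value(33450.1, 32949.0)  # M7 (beta) vs. M8 (beta&ω)

for label, (stat, p) in (("M1 vs M2", m1_m2), ("M7 vs M8", m7_m8)):
    print(f"{label}: 2*delta(lnL) = {stat:.1f}, significant at 0.001: {p < 0.001}")
```

Under these assumptions both comparisons for subtype AE are significant at the 0.001 level, in line with the statement accompanying Table 1.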

Simulations published by Anisimova et al. (2003) and Shriner et al. (2003) assessed the accuracy and power of the LRT and Bayes test implemented in PAML in the presence of recombination. These analyses indicate that excessive recombination (ρ = 0.01), as is usually observed in HIV sequences, can cause false positives in the Bayes test and makes the LRT unreliable, because it often mistakes recombination for evidence of positive selection. The LRT that compares models M7 and M8 seems to be more robust to recombination, and the detection of sites under positive selection seems to be less affected by it. Nevertheless, a coalescent model has recently been described that estimates the dN/dS ratio in the presence of recombination and hence generates simultaneous estimates of ω and the recombination rate ρ using Bayesian inference (Wilson and McVean 2006). This model is implemented in omegaMap and has been applied to our subtype, vaccine, and placebo samples. We ran omegaMap under a constant model of variation (i.e., all sites are assumed to share a common ω and ρ) with the following parameter settings:

  • N° orders = 10
  • N° iterations = 10^6
  • thinning = 100
  • priors = improper inverse

Results

Selection pressure, as indicated by the dN/dS ratio per gene and per site and by the proportion of sites under selection (ρM2 and ρM8 in Table 1), was high for both subtypes, although subtype B showed higher values than subtype AE for both parameters. The Bayesian approach detected numerous sites under selection (n) in both datasets, although up to two times more positively selected sites were observed in subtype AE than in subtype B. These differences are probably a consequence of the uneven sample sizes of the two datasets (181 and 29 sequences, respectively). Simultaneous estimates of selection and recombination also showed higher dN/dS estimates for subtype B than for subtype AE; the recombination rate (ρ), however, was almost four times higher for subtype AE than for subtype B (Table 1). Subtype AE is of recombinant origin, so one should expect higher recombination rates for this subtype than for subtype B. Recombination could in principle inflate dN/dS estimates, but this does not seem to be the case here, since the dN/dS estimates from PAML are higher for subtype B. This suggests that the high frequency of subtype AE in Thailand is instead the result of a founder event that could have taken place approximately 25 years ago.

Dataset       Ns   ωM0    -lnLM1   -lnLM2   ωM2   ρM2    nM2  -lnLM7   -lnLM8   ωM8   ρM8    nM8  ω (95% HPD)          ρ (95% HPD)
Subtype AE    181  0.561  33555.5  32999.5  3.29  0.107  42   33450.1  32949    2.95  0.108  44   0.404 (0.366-0.443)  15.56 (14.65-16.65)
  Placebo     93   0.561  19183.7  18915.3  3.23  0.105  33   19186.9  18918.2  2.87  0.119  42   0.424 (0.375-0.480)  15.81 (14.02-17.84)
  Vaccine     88   0.536  18145.8  17870.7  3.57  0.096  34   18106.8  17851.6  3.07  0.100  30   0.447 (0.396-0.504)  11.17 (10.07-12.39)
Subtype B     29   0.756  9303     9179.9   3.68  0.127  34   9321.7   9188.2   3.22  0.17   39   0.778 (0.673-0.901)  3.95 (3.45-4.53)
  Placebo     16   0.751  6254.2   6185.1   3.94  0.127  23   6262.9   6187.6   3.79  0.140  27   0.732 (0.611-0.869)  8.79 (6.56-11.76)
  Vaccine     13   0.774  5312.3   5265.4   3.99  0.136  15   5318.1   5267.0   3.89  0.147  21   0.873 (0.722-1.059)  1.9 (1.55-2.34)

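Table 1. Tests of adaptive selection for the Thailand HIV-1 subtypes B and AE from placebo and vaccinated individuals in PAML and omegaMap. Ns is the number of sequences. For the PAML models, ρ is the proportion of sites with ω > 1 and n is the number of sites detected under selection. All model comparisons in PAML were significant (P < 0.001). The last two columns give the omegaMap estimates of ω and of the recombination rate (ρ) under selection; 95% HPD intervals are indicated in parentheses.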
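
The subtype-level comparisons drawn from the omegaMap columns of Table 1 can be checked with a few lines of Python. The sketch below copies the ω and ρ estimates and their 95% HPD intervals for the two subtypes, computes the ratio of recombination rates, and tests whether the ω intervals overlap. The dictionary layout and the helper intervals_overlap are illustrative choices rather than part of the original analysis, and non-overlapping HPD intervals are only an informal indication of a difference, not a formal test.

```python
# Estimates and 95% HPD bounds copied from Table 1 (omegaMap columns),
# stored as (estimate, lower, upper).
omegamap = {
    "AE": {"omega": (0.404, 0.366, 0.443), "rho": (15.56, 14.65, 16.65)},
    "B": {"omega": (0.778, 0.673, 0.901), "rho": (3.95, 3.45, 4.53)},
}


def intervals_overlap(a, b):
    """Return True if two (estimate, lower, upper) intervals share any value."""
    return a[1] <= b[2] and b[1] <= a[2]


# Ratio of the recombination rate estimates, subtype AE over subtype B.
rho_ratio = omegamap["AE"]["rho"][0] / omegamap["B"]["rho"][0]

# Informal check: do the omega HPD intervals of the two subtypes overlap?
omega_overlap = intervals_overlap(omegamap["AE"]["omega"], omegamap["B"]["omega"])

print(f"rho(AE) / rho(B) = {rho_ratio:.2f}")
print(f"omega HPD intervals overlap: {omega_overlap}")
```

The ratio comes out at about 3.94, consistent with the "almost four times higher" recombination rate reported for subtype AE, and the ω intervals of the two subtypes do not overlap.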

Credits

The data for this track were provided by Keith A. Crandall at Genoma LLC.

References

Anisimova M, Nielsen R, Yang Z. Effect of recombination on the accuracy of the likelihood method for detecting positive selection at amino acid sites. Genetics. 2003 Jul;164(3):1229-36.

Crandall KA, Kelsey CR, Imamichi H, Lane HC, Salzman NP. Parallel evolution of drug resistance in HIV: failure of nonsynonymous/synonymous substitution rate ratio to detect selection. Mol Biol Evol. 1999 Mar;16(3):372-82.

Goldman N, Yang Z. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol Biol Evol. 1994 Sep;11(5):725-36.

Miyata T, Yasunaga T. Molecular evolution of mRNA: a method for estimating evolutionary rates of synonymous and amino acid substitutions from homologous nucleotide sequences and its application. J Mol Evol. 1980 Mar;16(1):23-36.

Shriner D, Nickle DC, Jensen MA, Mullins JI. Potential impact of recombination on sitewise approaches for detecting positive natural selection. Genet Res. 2003 Apr;81(2):115-21.

Wilson DJ, McVean G. Estimating diversifying selection and functional constraint in the presence of recombination. Genetics. 2006 Mar;172(3):1411-25.

Yang Z. PAML 4: Phylogenetic Analysis by Maximum Likelihood. Mol Biol Evol. 2007 Aug;24(8):1586-91.

Yang Z, Nielsen R, Goldman N, Pedersen A-MK. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics. 2000 May;155(1):431-49.

Yang Z, Wong WSW, Nielsen R. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005 Apr;22(4):1107-18.