Description

This track displays the results of a positive selection analysis (by race) of VAX004 HIV-1 gp120 nucleotide sequences. Each positively selected site is labeled with the corresponding amino acid of the consensus sequence.

Methods

The standard method for detecting adaptive molecular evolution in protein-coding DNA sequences compares the nonsynonymous (amino acid changing; dN) and synonymous (silent; dS) substitution rates through the dN/dS ratio (ω, or acceptance rate). ω is the ratio of the two rates, estimated under a codon substitution model. If an amino acid substitution is neutral, it will be fixed at the same rate as a synonymous mutation, so ω = 1. If the amino acid change is deleterious, purifying or negative selection (i.e., natural selection against deleterious mutations with negative selection coefficients) will reduce its fixation rate, so ω < 1. Only when the amino acid change offers a selective advantage is it fixed at a higher rate than a synonymous mutation, giving ω > 1. Therefore, an ω ratio significantly greater than one is convincing evidence for adaptive or diversifying selection. In the HIV context, this is the principal means of identifying changes, and therefore sites, under antigenic selection, which are in turn the most important to consider in vaccine targets.

Here, dN/dS ratios in each subtype and in the placebo and vaccine samples were compared using methods implemented in PAML v3.14. ω and the proportion of sites (ρ) with ω > 1 were estimated under the site-specific models of Goldman and Yang (1994) and Yang et al. (2000). Tests of positive selection were performed by comparing likelihood scores (likelihood ratio test) between the nested per-site models M1 (neutral) and M2 (selection), and between M7 (beta) and M8 (beta&ω). The more stringent per-gene selection model in PAML (M0) was also estimated for comparison. If adaptive selection was identified, we then applied the Bayesian test developed by Yang et al. and implemented in PAML to identify potential sites under diversifying selection, as indicated by a posterior probability > 0.95. Maximum likelihood trees were estimated for each subtype using PhyML under the best-fit substitution models. Each analysis was run twice, once under ω = 1.5 and once under ω = 0.5.
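
As a minimal illustration of the decision rules described above (not the PAML implementation), the sketch below classifies an ω estimate by selection regime and keeps the sites whose posterior probability of ω > 1 exceeds 0.95. The function names and the example site list are hypothetical and are not taken from the track data.

```python
def classify_omega(omega):
    """Map a dN/dS ratio to the selection regime it suggests."""
    if omega > 1:
        return "positive"   # amino acid changes fixed faster than silent ones
    if omega < 1:
        return "purifying"  # amino acid changes fixed more slowly
    return "neutral"


def sites_under_selection(site_posteriors, threshold=0.95):
    """Keep (position, residue) pairs with posterior probability > threshold.

    site_posteriors: iterable of (position, consensus_residue, posterior),
    where posterior is the probability that omega > 1 at that site.
    """
    return [(pos, aa) for pos, aa, prob in site_posteriors if prob > threshold]


# Hypothetical per-site posteriors, for illustration only.
example_sites = [(12, "N", 0.991), (87, "T", 0.620), (140, "K", 0.953), (301, "R", 0.950)]
selected = sites_under_selection(example_sites)
print(selected)
```

Note that the comparison is strict, so a site with a posterior probability of exactly 0.95 is not reported.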

Simulations published by Anisimova et al. (2003) and Shriner et al. (2003) assessed the accuracy and power of the likelihood ratio test (LRT) and the Bayes test implemented in PAML in the presence of recombination. The general conclusion from these analyses is that excessive recombination (ρ = 0.01), as is usually observed in HIV sequences, can cause false positives in the Bayes test and makes the LRT unreliable, because it often mistakes recombination for evidence of positive selection. The LRT that compares models M7 and M8 seems to be more robust to recombination, and the detection of sites under positive selection seems to be less affected by it. Nevertheless, a coalescent model has recently been described that estimates the dN/dS ratio in the presence of recombination and hence generates simultaneous estimates of ω and the recombination rate ρ using Bayesian inference (Wilson and McVean 2006). This model is implemented in omegaMap and has been applied to our subtype, vaccine, and placebo samples. We ran omegaMap under a constant model of variation (i.e., all sites are assumed to share a common ω and ρ) with the following parameter settings:

  • N° orders = 10
  • N° iterations = 10^6
  • thinning = 100
  • priors = improper inverse

Results

    Subtype B    Ns  ω(M0)  -lnL(M1)  -lnL(M2)  ω(M2)  ρ(M2)  n(M2)  -lnL(M7)  -lnL(M8)  ω(M8)  ρ(M8)  n(M8)
    Asian         5  0.478    3205.2    3186.5   8.07  0.049      6    3206.3    3186.7   7.90  0.051      9
    Black        12  0.414    5291.7    5268.3   4.01  0.048      5    5299.1    5269.4   3.36  0.062      8
    Hispanic     20  0.426    8437.7    8366.4   3.49  0.081     16    8440.8    8358.1   2.93  0.098     25
    White       279  0.424   90874.5   89282.6   3.18  0.125     41   89524.4   88422.4   2.84  0.095     38
    Other        14  0.455    6578.2    6514.1   4.18  0.077     17    6591.1    6517.9   3.64  0.090     21

    Table 1. Test of adaptive selection for the USA subtype B subgroups in PAML. All model comparisons in PAML were significant (P < 0.001). These estimates were obtained using one clone per individual.
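
The significance statement for Table 1 can be checked with a short sketch. It assumes the usual test statistic, twice the difference in log-likelihood, compared against a χ² distribution with 2 degrees of freedom for both the M1 vs. M2 and the M7 vs. M8 comparisons; the degrees of freedom are an assumption, since they are not stated here. For 2 degrees of freedom the χ² tail probability has the closed form exp(-x/2), so no statistics library is needed. The function name is illustrative.

```python
import math

# (subgroup, -lnL M1, -lnL M2, -lnL M7, -lnL M8), copied from Table 1.
TABLE_1 = [
    ("Asian",    3205.2,  3186.5,  3206.3,  3186.7),
    ("Black",    5291.7,  5268.3,  5299.1,  5269.4),
    ("Hispanic", 8437.7,  8366.4,  8440.8,  8358.1),
    ("White",    90874.5, 89282.6, 89524.4, 88422.4),
    ("Other",    6578.2,  6514.1,  6591.1,  6517.9),
]


def lrt_df2(neg_lnl_null, neg_lnl_alt):
    """Likelihood ratio test with 2 degrees of freedom (assumed).

    Takes negative log-likelihoods of the null and alternative models and
    returns (statistic, p_value). The chi-square survival function for
    df = 2 is exactly exp(-x / 2).
    """
    statistic = 2.0 * (neg_lnl_null - neg_lnl_alt)
    if statistic <= 0:
        return statistic, 1.0
    # Very large statistics underflow to a p-value of 0.0.
    return statistic, math.exp(-statistic / 2.0)


results = {
    group: (lrt_df2(m1, m2), lrt_df2(m7, m8))
    for group, m1, m2, m7, m8 in TABLE_1
}

for group, ((stat_12, p_12), (stat_78, p_78)) in results.items():
    print(f"{group}: M1 vs M2 2dlnL={stat_12:.1f} p={p_12:.2e}; "
          f"M7 vs M8 2dlnL={stat_78:.1f} p={p_78:.2e}")
```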

    HIV-1 Subtype B      θ     ρ  ω(PAML)  ω(SNAP)
    Asian            0.003   5.2     1.35     0.82
    Black            0.005  2.15     1.41     1.24
    Hispanic         0.005  3.61     0.83     0.75
    White            0.004  5.22     0.84      0.7
    Other            0.004  2.57     0.75     0.71

    Table 2. Mean genetic diversity (θ), population recombination rate (ρ), and selection (ω) estimates (as estimated in PAML and SNAP) for the USA subtype B subgroups. These estimates were obtained using all the clones from an individual and then averaging over all individuals in each subgroup.
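
The averaging described for Table 2 can be sketched as follows, assuming each individual contributes a single estimate (computed from all of that individual's clones) and that the subgroup value is the unweighted mean across individuals. The record layout, identifiers, and numbers below are hypothetical and are not the study data.

```python
from collections import defaultdict
from statistics import mean


def subgroup_means(per_individual):
    """Average per-individual estimates within each subgroup.

    per_individual: iterable of (subgroup, individual_id, estimate).
    Every individual counts once, regardless of how many clones
    were used to obtain that individual's estimate.
    """
    by_group = defaultdict(list)
    for subgroup, _individual_id, estimate in per_individual:
        by_group[subgroup].append(estimate)
    return {group: mean(values) for group, values in by_group.items()}


# Hypothetical per-individual omega estimates, for illustration only.
records = [
    ("Asian", "ind01", 1.0),
    ("Asian", "ind02", 0.5),
    ("Black", "ind03", 1.25),
]
means = subgroup_means(records)
print(means)
```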

Credits

    The data for this track were provided by Keith A. Crandall at Genoma LLC.

References

    Anisimova M, Nielsen R, Yang Z. Effect of recombination on the accuracy of the likelihood method for detecting positive selection at amino acid sites. Genetics. 2003 Jul;164(3):1229-36.

    Crandall KA, Kelsey CR, Imamichi H, Lane HC, Salzman NP. Parallel evolution of drug resistance in HIV: failure of nonsynonymous/synonymous substitution rate ratio to detect selection. Mol Biol Evol. 1999 Mar;16(3):372-82.

    Goldman N, Yang Z. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol Biol Evol. 1994 Sep;11(5):725-36.

    Miyata T, Yasunaga T. Molecular evolution of mRNA: a method for estimating evolutionary rates of synonymous and amino acid substitutions from homologous nucleotide sequences and its application. J Mol Evol. 1980 Mar;16(1):23-36.

    Shriner D, Nickle DC, Jensen MA, Mullins JI. Potential impact of recombination on sitewise approaches for detecting positive natural selection. Genet Res. 2003 Apr;81(2):115-21.

    Wilson DJ, McVean G. Estimating diversifying selection and functional constraint in the presence of recombination. Genetics. 2006 Mar;172(3):1411-25.

    Yang Z. PAML 4: Phylogenetic Analysis by Maximum Likelihood. Mol Biol Evol. 2007 Aug;24(8):1586-91.

    Yang Z, Nielsen R, Goldman N, Pedersen A-MK. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics. 2000 May;155(1):431-49.

    Yang Z, Wong WSW, Nielsen R. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005 Apr;22(4):1107-18.