Description

This column contains protein similarity scores assigned by the Rankprop algorithm. Scores range from zero to one, with one being the most significant. Rankprop does not currently report an E-value statistic.

Rankprop detects subtle protein sequence similarities by performing a diffusion across a protein similarity network in which the edges are defined and weighted by PSI-BLAST E-values. The diffusion operation allows the algorithm to exploit non-local network structure when identifying remote protein homologs. Thus, two proteins that do not have a significant pairwise PSI-BLAST E-value may still receive a good score from Rankprop if they are connected by many short paths in the protein similarity network. The Rankprop algorithm is described in Weston J et al., Protein ranking: From local to global structure in the protein similarity network, Proc. Natl. Acad. Sci. USA 101, 6559-6563 (2004). It is based upon the approach of Zhou et al., Ranking on Data Manifolds, Advances in Neural Information Processing Systems 16, MIT Press, Cambridge, MA (2004).
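The diffusion idea can be illustrated with a minimal sketch. It follows the general iteration from Zhou et al. (spread activation along normalized edge weights while repeatedly re-injecting the query) and is not the exact Rankprop update. The function name, the parameters alpha and n_iter, and the toy weight matrix are assumptions made for illustration; in particular, the weights here are arbitrary numbers rather than values derived from PSI-BLAST E-values.

```python
import numpy as np


def diffuse_scores(weights, query, alpha=0.9, n_iter=100):
    """Spread activation from one query node across a weighted graph.

    Illustrative only: a manifold-ranking style iteration, not the
    published Rankprop update. `alpha` and `n_iter` are assumed values.
    """
    w = np.asarray(weights, dtype=float).copy()
    np.fill_diagonal(w, 0.0)            # ignore self-similarity
    degree = w.sum(axis=1)
    degree[degree == 0.0] = 1.0         # isolated nodes: avoid dividing by zero
    inv_sqrt = 1.0 / np.sqrt(degree)
    s = w * inv_sqrt[:, None] * inv_sqrt[None, :]   # symmetric normalization

    y = np.zeros(len(w))
    y[query] = 1.0                      # all initial activation sits on the query
    f = y.copy()
    for _ in range(n_iter):
        # Each node collects activation from its neighbors;
        # the query keeps being re-injected with weight (1 - alpha).
        f = alpha * (s @ f) + (1.0 - alpha) * y
    return f


# Toy chain of four proteins: A - B - C - D (no direct A-D edge).
toy_weights = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])
scores = diffuse_scores(toy_weights, query=0)
```

In this toy network, protein D has no direct edge to the query A, yet it still receives a nonzero score because activation reaches it through B and C. Proteins farther from the query along the chain receive lower scores.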

Methods

PSI-BLAST was run on the $organism proteins from the SwissProt and TrEMBL databases of 15 November 2004 in an all-versus-all fashion using the following flags:

    -j 5 -e 1000 -v 1000 -b 0

The resulting symmetrized PSI-BLAST E-values are available in the PSI-BLAST Gene Sorter track. Next, Rankprop was run on the resulting protein similarity network using a slightly modified algorithm, which will be described fully in a forthcoming manuscript. The primary differences with respect to the published algorithm are as follows:

(i) Activation scores are propagated from multiple sources, not just from the query point itself. The pseudo-queries are chosen by pulling in close homologs using PSI-BLAST E-values.

(ii) The network weights are adjusted adaptively depending on local density in the similarity network.

These two modifications improve performance while preserving the efficiency of the algorithm.
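Modification (i) can be pictured with a short sketch of how pseudo-queries might be selected. The details of the modified algorithm are not given here, so everything in this example is an assumption: the function name, the E-value threshold of 1e-3, and the choice to give every source the same initial activation are hypothetical. The sketch covers only modification (i).

```python
def initial_activation(evalues_from_query, query, threshold=1e-3):
    """Pick the query plus its close homologs as activation sources.

    `evalues_from_query` maps protein IDs to their PSI-BLAST E-value
    against the query. The threshold and the uniform activation of 1.0
    are hypothetical choices, not values taken from Rankprop.
    """
    sources = {query: 1.0}
    for protein, evalue in evalues_from_query.items():
        # A small E-value marks a close homolog; treat it as a pseudo-query.
        if protein != query and evalue <= threshold:
            sources[protein] = 1.0
    return sources


# Hypothetical E-values of three proteins against a query "Q".
example_evalues = {"P1": 1e-30, "P2": 0.5, "P3": 1e-5}
example_sources = initial_activation(example_evalues, "Q")
```

Here P1 and P3 fall below the assumed threshold and would propagate activation alongside the query, while P2 would receive a score only through diffusion.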

Credits

The Rankprop homology Gene Sorter track was created by William Stafford Noble, Jason Weston, and Mark Diekhans.

Jason Weston works for NEC Laboratories America, Princeton (formerly at the Max Planck Institute for Biological Cybernetics, Tuebingen, where part of this work was completed).

This work was funded in part by award EIA-0312706 from the National Science Foundation.