Description

This track shows high-confidence gene annotations on the $organism genome from the Consensus Coding Sequence (CCDS) project. The project is a collaborative effort to identify a core set of $organism protein-coding regions that are consistently annotated and of high quality. Its long-term goal is to support convergence toward a standard set of gene annotations on the $organism genome.

Collaborators include:

Methods

Coding sequence (CDS) annotations of the $organism genome were obtained from two sources: NCBI RefSeq, and the union of the gene annotations from Ensembl and Vega, collectively known as Hinxton.

Genes with identical CDS genomic coordinates in both sets become CCDS candidates. The genes undergo a quality evaluation, which must be approved by all collaborators. The following criteria are currently used to assess each gene:

Each CCDS is assigned a unique CCDS ID, which links together all gene annotations that share the same CDS. CCDS gene annotations are under continuous review, and this track is updated periodically.
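
The candidate-matching step described above can be illustrated with a minimal Python sketch. It is not the CCDS project's actual pipeline: the Transcript record, the cds_key and find_candidates helpers, and the transcript names and coordinates are all hypothetical, chosen only to show what "identical CDS genomic coordinates in both sets" means in practice. The quality evaluation and the assignment of CCDS IDs are not modeled.

```python
from collections import defaultdict
from typing import NamedTuple


class Transcript(NamedTuple):
    """Hypothetical, simplified gene annotation record (not a CCDS format)."""
    name: str
    chrom: str
    strand: str
    cds_exons: tuple  # ((start, end), ...) covering only the coding portions


def cds_key(t):
    """Build a key that is equal only when CDS genomic coordinates are identical."""
    return (t.chrom, t.strand, tuple(sorted(t.cds_exons)))


def find_candidates(refseq, hinxton):
    """Group annotations from the two sources that share exactly the same CDS."""
    by_key = defaultdict(lambda: {"refseq": [], "hinxton": []})
    for t in refseq:
        by_key[cds_key(t)]["refseq"].append(t.name)
    for t in hinxton:
        by_key[cds_key(t)]["hinxton"].append(t.name)
    # Keep only CDSs that are present in both sources.
    return {k: v for k, v in by_key.items() if v["refseq"] and v["hinxton"]}


# Made-up example data: the first pair matches exactly, the second pair
# differs by 10 bases at the end of the CDS and so is not a candidate.
refseq = [
    Transcript("NM_A", "chr1", "+", ((100, 200), (300, 400))),
    Transcript("NM_B", "chr1", "+", ((500, 600),)),
]
hinxton = [
    Transcript("ENST_A", "chr1", "+", ((100, 200), (300, 400))),
    Transcript("ENST_B", "chr1", "+", ((500, 610),)),
]

candidates = find_candidates(refseq, hinxton)
for key, names in candidates.items():
    print(key, names)
```

Keying on chromosome, strand, and the full list of coding exon coordinates means that untranslated regions play no part in the comparison, which mirrors the focus on the CDS described above. Each resulting group lists all annotations sharing one CDS, which is the set a single CCDS ID would link together.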

Credits

This track was produced at UCSC from data downloaded from the CCDS project web site.

References

Pruitt KD, Harrow J, Harte RA, Wallin C, Diekhans M, Maglott DR, Searle S, Farrell CM, Loveland JE, Ruef BJ, et al. The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes. Genome Res. 2009 Jun 4. [Epub ahead of print]

Hubbard T, Barker D, Birney E, Cameron G, Chen Y, Clark L, Cox T, Cuff J, Curwen V, Down T, et al. The Ensembl genome database project. Nucl. Acids Res. 2002 Jan 1;30(1):38-41.

Pruitt KD, Tatusova T, Maglott DR. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucl. Acids Res. 2005 Jan 1;33(Database Issue):D501-D504.