Description

RACEfrags are the products of 5’ RACE reactions performed on GENCODE genes (using the primers displayed in the subtrack "Gencode 5’ RACE primer") in 12 tissues and 3 cell lines (15 subtracks) followed by hybridization on ENCODE tiling arrays. Each RACEfrag is linked to the 5’ RACE primer but no other connectivity information is available from this experiment.

Methods

For a detailed description of the methods and references used, see Denoeud et al., 2007.

A combination of 5’ RACE and high-density tiling microarrays was used to empirically annotate 5’ transcription start sites (TSSs) and internal exons of all 410 annotated protein-coding loci across the 44 ENCODE regions (Oct. 2005 GENCODE freeze; Harrow et al., 2006). Oligonucleotides for the 5’ RACE experiments were chosen to map to a coding exon (the index exon) common to most of the transcripts of each protein-coding gene locus annotated by GENCODE. The 5’ RACE reactions were performed with these oligonucleotides on polyA+ RNA from twelve adult human tissues (brain, heart, kidney, spleen, liver, colon, small intestine, muscle, lung, stomach, testis, placenta) and three cell lines (GM06990 (lymphoblastoid), HL60 (acute promyelocytic leukemia) and HeLaS3 (cervix carcinoma)).

The RACE reactions were then hybridized to 20-nucleotide-resolution Affymetrix tiling arrays covering the non-repeated portions of the 44 ENCODE regions. The resulting "RACEfrags" (array-detected fragments of RACE products) were assessed for novelty by comparing their genomic coordinates to those of GENCODE-annotated exons.
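The coordinate comparison described above can be illustrated with a minimal sketch. It assumes that a RACEfrag counts as novel when it overlaps no annotated exon on the same chromosome, and that coordinates are 0-based and half-open (as in BED files); the exact criteria used in Denoeud et al., 2007 may differ. The names Interval, overlaps and novel_racefrags, and the example coordinates, are hypothetical and not part of the original pipeline.

```python
from typing import Iterable, List, NamedTuple


class Interval(NamedTuple):
    """A genomic interval; 0-based, half-open coordinates (assumption)."""
    chrom: str
    start: int  # inclusive
    end: int    # exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def novel_racefrags(racefrags: Iterable[Interval],
                    exons: Iterable[Interval]) -> List[Interval]:
    """Return the RACEfrags that overlap none of the annotated exons.

    Simple all-against-all comparison; adequate for small illustrative
    inputs, not tuned for genome-scale data.
    """
    exon_list = list(exons)
    return [frag for frag in racefrags
            if not any(overlaps(frag, exon) for exon in exon_list)]


# Made-up coordinates, for illustration only.
exons = [Interval("chr21", 100, 200), Interval("chr21", 500, 650)]
racefrags = [
    Interval("chr21", 150, 220),  # overlaps the first exon: not novel
    Interval("chr21", 300, 360),  # between annotated exons: novel
    Interval("chr22", 120, 180),  # different chromosome: novel
]
novel = novel_racefrags(racefrags, exons)
```

For larger inputs, sorting both sets by position or using an interval tree would avoid the all-against-all comparison.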

Verification

Connectivity between novel RACEfrags and their respective index exon was investigated by RT-PCR using the 5’ RACE primer as one of the primers, followed by hybridization on tiling arrays. A total of 385 RT-PCR reactions corresponding to 199 GENCODE loci were positive after hybridization on tiling arrays (244 RACE reactions). All positive RT-PCR reactions and a subset of those that were negative in the hybridization experiments were further verified by cloning and sequencing of the RT-PCR products. In most cases, eight clones were selected from each set of RT-PCR products for sequencing. To be retained in the dataset, the sequences had to map unambiguously to the correct location, show splicing and pass manual inspection by the HAVANA team. By these criteria, 89 of these RT-PCR reactions (69 GENCODE loci) were positive after cloning and sequencing (see Denoeud et al., 2007 for further details). The resulting cDNA sequences were deposited in GenBank under accession numbers DQ655905-DQ656069 and EF070113-EF070122. See additional information about the sequences here.

Credits

The RACEfrags result from a collaborative effort among the following laboratories:

Lab/Institution: Contributors
Genome Bioinformatics Lab, CRG, Barcelona, Spain: France Denoeud, Julien Lagarde, Tyler Alioto, Sylvain Foissac, Robert Castelo, Roderic Guigó
Department of Genetic Medicine and Development, University of Geneva, Switzerland: Catherine Ucla, Carine Wyss, Caroline Manzano, Colette Rossier, Stylianos E. Antonarakis
Center for Integrative Genomics, University of Lausanne, Switzerland: Jacqueline Chrast, Charlotte N. Henrichsen, Alexandre Reymond
Affymetrix, Inc., Santa Clara, CA, USA: Philipp Kapranov, Jorg Drenkow, Sujit Dike, Jill Cheng, Thomas R. Gingeras
HAVANA annotation group, Wellcome Trust Sanger Institute, Hinxton, UK: Adam Frankish, James Gilbert, Tim Hubbard, Jennifer Harrow

References

Denoeud F, Kapranov P, Ucla C, Frankish A, Castelo R, Drenkow J, Lagarde J, Alioto T, Manzano C, Chrast J et al. Prominent use of distal 5' transcription start sites and discovery of a large number of additional exons in ENCODE regions. Genome Res. 2007 Jun;17(6):746-59.

Harrow J, Denoeud F, Frankish A, Reymond A, Chen CK, Chrast J, Lagarde J, Gilbert JG, Storey R, Swarbreck D et al. GENCODE: producing a reference annotation for ENCODE. Genome Biol. 2006;7 Suppl 1:S4.1-9.

ENCODE Project Consortium, Birney E, Stamatoyannopoulos JA, Dutta A, Guigó R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007 Jun 14;447(7146):799-816.