This track shows 5' cap analysis gene expression (CAGE) tags and clusters in RNA extracts from different subcellular localizations in multiple cell lines. A CAGE cluster is a region of overlapping tags with an assigned value that represents the expression level. The data in this track were produced as part of the ENCODE Transcriptome Project.
This track is a multi-view composite track that contains multiple data types (views). For each view, there are multiple subtracks that display individually on the browser. Instructions for configuring multi-view tracks are here.
To show only selected subtracks, uncheck the boxes next to the tracks that you wish to hide.
This track contains the following views:
Color differences in subtracks are used as a visual cue to distinguish between the different cell types, and between annotations on the plus and minus strands.
Cells were grown according to the approved ENCODE cell culture protocols. RNA molecules longer than 200 nt and present in the RNA population isolated from each subcellular compartment were fractionated into polyA+ and polyA- fractions as described in these protocols. The CAGE tags were sequenced from the 5' ends of cap-trapped cDNAs produced using RIKEN CAGE technology (Kodzius et al. 2006; Valen et al. 2009). To create the tag, a linker was attached to the 5' end of polyA+ or polyA- reverse-transcribed cDNAs which were selected by cap trapping (Carninci et al. 1996). The first 27 bp of the cDNA were cleaved using class II restriction enzymes. A linker was then attached to the 3' end of the cDNA.
After PCR amplification, the tags were sequenced (36 bp single reads) using ABI SOLiD technology (polyA- RNA from the cytosol and nucleus of K562 cells, and from whole cell in prostate cells) or Illumina/Solexa GA (all other data). Tags were mapped to the human genome (NCBI Build 36, hg18) using the program nexalign (T. Lassmann, manuscript in preparation). SOLiD CAGE sequences were mapped with up to 3 mismatches; up to 2 mismatches were allowed for Solexa CAGE. Alignments of sequences mapping to 10 or fewer genomic locations were retained. The expression level of each cluster was computed as the number of reads making up the cluster, divided by the total number of reads sequenced, multiplied by 1 million.
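The clustering and expression calculation described above can be illustrated with a minimal Python sketch. It is not the pipeline used to produce this track: the `Tag` record, the function names `cluster_tags` and `tags_per_million`, and the half-open coordinate convention are all assumptions made for illustration. It assumes each alignment is annotated with the number of genomic locations its read mapped to, and it treats any two same-strand tags that share at least one base as overlapping.

```python
from typing import NamedTuple


class Tag(NamedTuple):
    """One aligned CAGE tag (hypothetical record layout)."""
    chrom: str
    strand: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    hits: int   # number of genomic locations the read aligned to


def cluster_tags(tags, max_hits=10):
    """Merge overlapping same-strand tags into clusters.

    Tags from reads that map to more than max_hits locations are dropped.
    Returns a list of [chrom, strand, start, end, read_count].
    """
    # Sorting NamedTuples orders by chrom, strand, start, end in turn.
    kept = sorted(t for t in tags if t.hits <= max_hits)
    clusters = []
    for t in kept:
        last = clusters[-1] if clusters else None
        if (last and last[0] == t.chrom and last[1] == t.strand
                and t.start < last[3]):
            # Tag overlaps the current cluster: extend it and count the read.
            last[3] = max(last[3], t.end)
            last[4] += 1
        else:
            clusters.append([t.chrom, t.strand, t.start, t.end, 1])
    return clusters


def tags_per_million(read_count, total_reads):
    """Reads in the cluster / total reads sequenced, times 1 million."""
    return read_count * 1_000_000 / total_reads


if __name__ == "__main__":
    example = [
        Tag("chr1", "+", 100, 127, 1),
        Tag("chr1", "+", 110, 137, 2),
        Tag("chr1", "+", 120, 147, 11),  # dropped: maps to more than 10 sites
        Tag("chr1", "+", 500, 527, 1),
        Tag("chr1", "-", 105, 132, 1),
    ]
    total_reads = 4_000_000  # hypothetical library size
    for chrom, strand, start, end, n in cluster_tags(example):
        print(chrom, strand, start, end, n, tags_per_million(n, total_reads))
```

Note that `total_reads` is passed in explicitly because the normalization uses the total number of reads sequenced, not only the reads that survive the mapping filter.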
These data were generated and analyzed by Timo Lassmann, Phil Kapranov, Hazuki Takahashi, Yoshihide Hayashizaki, Carrie Davis, Tom Gingeras, and Piero Carninci.
Contact: Piero Carninci at RIKEN Omics Science Center
Kodzius R, Kojima M, Nishiyori H, Nakamura M, Fukuda S, Tagami M, Sasaki D, Imamura K, Kai C, Harbers M, et al. CAGE: cap analysis of gene expression. Nat Methods. 2006 March 1; 3(3):211-222.
Valen E, Pascarella G, Chalk A, Maeda N, Kojima M, Kawazu C, Murata M, Nishiyori H, Lazarevic D, Motti D, et al. Genome-wide detection and analysis of hippocampus core promoters using DeepCAGE. Genome Res. 2009 February; 19(2):255-265.
Carninci P, Kvam C, Kitamura A, Ohsumi T, Okazaki Y, Itoh M, Kamiya M, Shibata K, Sasaki N, Izawa M, et al. High-efficiency full-length cDNA cloning by biotinylated CAP trapper. Genomics. 1996 November 1; 37(3):327-336.
Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column, above. The full data release policy for ENCODE is available here.