Description

This track shows regions of chromosomes 6, 7, 13, 14, 19, 20, 21, 22, X, and Y that are thought to be transcribed (transfrags), based on transcriptome data from Affymetrix. It is a preview of the Phase Two Affymetrix transcriptome project and is based on hybridization of mRNA from the SK-N-AS cell line, as seen in the Transcription track. Data from seven additional cell lines will be available when the project is completed. Although the genome coverage is much larger and the probe density greater, the data generation is similar to that of the Phase One project carried out on chromosomes 21 and 22 (Kapranov, P. et al. Large-scale transcriptional activity in chromosomes 21 and 22. Science 296(5569), 916-9 (2002)).

Transfrags that have a strong blat hit elsewhere in the genome are displayed in a lighter shade of blue. Those that overlap a pseudogene are colored an even lighter blue and are hidden from display by default; they can be viewed by changing the options on the track configuration page.

The score value is interpreted as follows:

   0 - Passes all filters
   1 - Overlaps a pseudogene
   2 - Overlaps a blat hit
   3 - Overlaps both a pseudogene and a blat hit
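
The four score values behave like a two-bit flag field (1 for a pseudogene overlap, 2 for a blat hit, 3 for both). The following is a minimal sketch of decoding a score under that reading; the constant and function names are hypothetical and are not part of the track's own tooling.

```python
# Hypothetical names for illustration; only the values 0-3 come from the table above.
PSEUDOGENE = 1  # bit 0: transfrag overlaps a pseudogene
BLAT_HIT = 2    # bit 1: transfrag overlaps a blat hit


def describe_score(score):
    """Return a list of labels describing a transfrag score of 0, 1, 2, or 3."""
    if score not in (0, 1, 2, 3):
        raise ValueError(f"unexpected score: {score}")
    labels = []
    if score & PSEUDOGENE:
        labels.append("pseudogene")
    if score & BLAT_HIT:
        labels.append("blat hit")
    # A score of 0 sets neither bit, meaning the transfrag passed all filters.
    return labels or ["passes all filters"]
```

For example, describe_score(3) reports both overlaps, while describe_score(0) reports that the transfrag passes all filters.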

Methods

mRNA was isolated and hybridized to tiling GeneChips that contain a probe every 5 bp over chromosomes 6, 7, 13, 14, 19, 20, 21, 22, X, and Y (over 74 million probes in total). The resulting data were normalized and smoothed using the Hodges-Lehmann estimator associated with the Wilcoxon signed-rank statistic, computed from the perfect match minus mismatch (PM - MM) probe values that lie within a sliding window of ± 30 bp centered at every genomic coordinate. This is very similar to the method used previously on the Phase One data (Kampa, D. et al. Novel RNAs identified from a comprehensive analysis of the transcriptome of human chromosomes 21 and 22. Genome Res. 14, 331-342 (2004)).
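
As an illustration of the smoothing step, the one-sample Hodges-Lehmann estimator associated with the Wilcoxon signed-rank statistic is the median of all pairwise (Walsh) averages of the values in the window. The sketch below assumes the probe data are available as (position, PM - MM) pairs; the function names and the input layout are hypothetical, and normalization is not shown.

```python
from itertools import combinations_with_replacement
from statistics import median


def hodges_lehmann(values):
    """Median of the Walsh averages (x_i + x_j) / 2 over all pairs with i <= j."""
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(values, 2)]
    return median(walsh)


def smoothed_signal(probes, center, half_width=30):
    """Estimate the signal at `center` from probes within +/- half_width bp.

    `probes` is an iterable of (position, pm_minus_mm) pairs (assumed layout).
    Returns None when no probe falls inside the window.
    """
    window = [diff for pos, diff in probes if abs(pos - center) <= half_width]
    return hodges_lehmann(window) if window else None
```

Because the estimator is a median of pairwise averages, a single outlying probe in the window has limited influence on the smoothed value.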

To determine the locations of the transfrags, a threshold of 13.93 (which gives an average false positive rate of ~ 0.05 in bacterial negative controls) was applied to all the signal graphs, and coordinates passing the threshold were defined as transcribed. Transcribed coordinates separated by gaps of 30 bp or less were merged. Resulting regions smaller than 50 bp in length were filtered out. Transfrags were also filtered to remove RepeatMasker repeats and low-complexity repeats, but not pseudogenes. Finally, to center the coordinates over the probes for display, 12 bp were added to all start coordinates and 13 bp to all stop coordinates. (The raw and graph data are left-justified with respect to probes.)
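
The thresholding, merging, length filtering, and coordinate shift described above can be sketched as follows. This is only an illustration under stated assumptions: the input is taken to be (coordinate, signal) pairs, a coordinate passes when its signal is at or above the threshold, a gap is measured as the distance between consecutive passing coordinates, and region length is stop minus start. The function name is hypothetical, and repeat filtering is not shown.

```python
def call_transfrags(signal, threshold=13.93, max_gap=30, min_len=50,
                    start_shift=12, stop_shift=13):
    """Return a list of (start, stop) transfrag coordinates.

    `signal` is an iterable of (coordinate, value) pairs (assumed layout).
    """
    # Step 1: keep coordinates whose signal passes the threshold
    # (">=" is an assumption; the text does not say whether ties pass).
    passing = sorted(pos for pos, value in signal if value >= threshold)

    # Step 2: merge passing coordinates separated by max_gap or less.
    regions = []
    for pos in passing:
        if regions and pos - regions[-1][1] <= max_gap:
            regions[-1][1] = pos          # extend the current region
        else:
            regions.append([pos, pos])    # start a new region

    # Step 3: drop regions shorter than min_len.
    kept = [(start, stop) for start, stop in regions if stop - start >= min_len]

    # Step 4: shift coordinates to center them over the probes for display.
    return [(start + start_shift, stop + stop_shift) for start, stop in kept]
```

With probes passing the threshold every 5 bp from 100 to 160 and again at 185, the 25 bp gap is bridged and a single region is reported with shifted coordinates (112, 198); an isolated passing probe elsewhere is discarded by the length filter.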

Credits

Data generation and analysis: Transcriptome group at Affymetrix - Bekiranov, S., Brubaker, S., Cheng, J., Dike, S., Drenkow, J., Ghosh, S., Gingeras, T., Helt, G., Kampa, D., Kapranov, P., Long, J., Madhavan, G., Manak, J., Patel, S., Piccolboni, A., Sementchenko, V., and Tammana, H.

Data presentation in Genome Browser: Chuck Sugnet.