Description

This track displays the locations of RNA polyadenylation (polyA) sites identified by high-throughput RNA sequencing using the PolyA-seq protocol.

Display Conventions and Configuration

PolyA-seq data is strand-specific, so two tracks are provided for each tissue. Each polyA site is reported as a single base, namely the end of the read alignments immediately upstream of the polyadenylation site. The data provided in this track consists of filtered polyA sites (see Methods). When multiple sites occurred within a 30-nt window on the same strand, their reads were summed and attributed to the site with the most reads. Units are reads per million (RPM) aligned. To obtain read counts, multiply the RPM value by the total number of filtered reads, in millions, for the corresponding experiment:

Species	Sample	Filtered reads
Human	MAQC-UHR1	5057048
	MAQC-UHR2	5030985
	MAQC-Brain1	4086039
	MAQC-Brain2	3921040
	Brain	2980439
	Kidney	4626843
	Liver	5626271
	Muscle	4920121
	Testis	5098780
Rhesus	Brain	2615605
	Ileum	3251495
	Kidney	2666757
	Liver	4299805
	Testis	4836387
Dog	Brain	4309201
	Kidney	5768315
	Testis	5397546
Mouse	Brain	1187654
	Kidney	3921370
	Liver	4189409
	Muscle	5517961
	Testis	2364217
Rat	Brain	5549424
	Testis	7466688
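
The conversion between RPM and read counts can be sketched in Python as follows. This is only an illustration: the dictionary and the function names `rpm_to_reads` and `reads_to_rpm` are hypothetical and not part of the track or of any published tool. The totals are copied from the table above, and the division by one million reflects the definition of RPM as reads per million filtered reads.

```python
# Total filtered reads per experiment, copied from the table above.
FILTERED_READS = {
    ("Human", "MAQC-UHR1"): 5057048,
    ("Human", "MAQC-UHR2"): 5030985,
    ("Human", "MAQC-Brain1"): 4086039,
    ("Human", "MAQC-Brain2"): 3921040,
    ("Human", "Brain"): 2980439,
    ("Human", "Kidney"): 4626843,
    ("Human", "Liver"): 5626271,
    ("Human", "Muscle"): 4920121,
    ("Human", "Testis"): 5098780,
    ("Rhesus", "Brain"): 2615605,
    ("Rhesus", "Ileum"): 3251495,
    ("Rhesus", "Kidney"): 2666757,
    ("Rhesus", "Liver"): 4299805,
    ("Rhesus", "Testis"): 4836387,
    ("Dog", "Brain"): 4309201,
    ("Dog", "Kidney"): 5768315,
    ("Dog", "Testis"): 5397546,
    ("Mouse", "Brain"): 1187654,
    ("Mouse", "Kidney"): 3921370,
    ("Mouse", "Liver"): 4189409,
    ("Mouse", "Muscle"): 5517961,
    ("Mouse", "Testis"): 2364217,
    ("Rat", "Brain"): 5549424,
    ("Rat", "Testis"): 7466688,
}


def rpm_to_reads(rpm, species, sample):
    """Convert an RPM value to an approximate read count for one experiment."""
    total = FILTERED_READS[(species, sample)]
    # RPM = reads / (total / 1e6), so reads = RPM * total / 1e6.
    return rpm * total / 1_000_000


def reads_to_rpm(reads, species, sample):
    """Inverse conversion: read count to reads per million filtered reads."""
    total = FILTERED_READS[(species, sample)]
    return reads * 1_000_000 / total
```

For example, `rpm_to_reads(10.0, "Human", "Liver")` scales an RPM value of 10 by 5.626271 million filtered reads. Because the track stores RPM as a floating-point value, the recovered count may need rounding to the nearest integer.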

Methods

A detailed explanation of the experimental methods is provided at NCBI's Gene Expression Omnibus under accession GSE30198. Briefly, polyA+ RNA was reverse-transcribed using a T(10)VN primer and strand-specific universal adapters, amplified, and sequenced on an Illumina GAIIx sequencer. Reads were reverse-complemented, aligned to the corresponding reference genome and splice junctions, and retained only if they aligned uniquely. The 3' ends of the alignments were taken as polyA sites. Sites were then filtered using downstream base frequency matrices for true- and false-positive sites, determined from a modified experiment based on a T(10) primer (i.e., excluding the 3' VN). When multiple filtered sites occurred within a 30-nt window on the same strand, read counts were summed and attributed to the most abundant peak. For each tissue, read counts were then divided by the total number of reads, in millions, from all filtered sites, giving values in reads per million (RPM).
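
The window-merging and normalization steps described in the methods can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline. The function names and the input layout (tuples of chromosome, strand, position, reads) are hypothetical. The exact clustering rule is not specified in the text, so the sketch assumes a greedy left-to-right pass in which a site joins the current cluster if it lies fewer than 30 nt from the first site of that cluster.

```python
from collections import defaultdict

WINDOW = 30  # window size in nt, as stated in the methods


def _collapse(chrom, strand, cluster):
    """Sum the reads in a cluster and assign them to the most abundant site."""
    total = sum(reads for _, reads in cluster)
    # On ties, max() keeps the first (lowest-position) site; this is an assumption.
    peak_pos = max(cluster, key=lambda site: site[1])[0]
    return (chrom, strand, peak_pos, total)


def cluster_sites(sites, window=WINDOW):
    """Merge filtered polyA sites that fall within `window` nt on the same strand.

    `sites` is an iterable of (chrom, strand, pos, reads) tuples.
    """
    by_strand = defaultdict(list)
    for chrom, strand, pos, reads in sites:
        by_strand[(chrom, strand)].append((pos, reads))

    merged = []
    for (chrom, strand), group in sorted(by_strand.items()):
        group.sort()
        cluster = [group[0]]
        for pos, reads in group[1:]:
            # Assumed rule: compare against the first site of the current cluster.
            if pos - cluster[0][0] < window:
                cluster.append((pos, reads))
            else:
                merged.append(_collapse(chrom, strand, cluster))
                cluster = [(pos, reads)]
        merged.append(_collapse(chrom, strand, cluster))
    return merged


def to_rpm(merged):
    """Divide each site's reads by the total reads, in millions, over all sites."""
    total_millions = sum(site[3] for site in merged) / 1_000_000
    return [(c, s, p, r / total_millions) for c, s, p, r in merged]


# Toy input: four sites on the plus strand and one on the minus strand.
example = [
    ("chr1", "+", 100, 5),
    ("chr1", "+", 110, 20),
    ("chr1", "+", 129, 1),
    ("chr1", "+", 130, 7),
    ("chr1", "-", 105, 3),
]
merged_example = cluster_sites(example)
rpm_example = to_rpm(merged_example)
```

Other clustering rules, such as a sliding window centered on the strongest peak, would also be consistent with the description and could give different results for closely spaced sites.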

Data Release Policy

No restrictions.