Description

This track displays the locations of RNA polyadenylation (polyA) sites identified by high-throughput RNA sequencing with the PolyA-seq protocol.

Display Conventions and Configuration

PolyA-seq data is strand-specific, so two tracks are provided for each tissue. Each polyA site is reported as a single base: the end of the read alignment immediately upstream of the polyadenylation site. The data provided in this track consist of filtered polyA sites (see Methods). When multiple sites occurred within a 30-bp window on the same strand, the sum of the reads was attributed to the site with the most reads. Units are reads per million (RPM) aligned. To obtain read counts, multiply RPM values by the total number of filtered reads, in millions, for the corresponding experiment:

Species   Sample        Filtered reads
Human     MAQC-UHR1     5057048
Human     MAQC-UHR2     5030985
Human     MAQC-Brain1   4086039
Human     MAQC-Brain2   3921040
Human     Brain         2980439
Human     Kidney        4626843
Human     Liver         5626271
Human     Muscle        4920121
Human     Testis        5098780
Rhesus    Brain         2615605
Rhesus    Ileum         3251495
Rhesus    Kidney        2666757
Rhesus    Liver         4299805
Rhesus    Testis        4836387
Dog       Brain         4309201
Dog       Kidney        5768315
Dog       Testis        5397546
Mouse     Brain         1187654
Mouse     Kidney        3921370
Mouse     Liver         4189409
Mouse     Muscle        5517961
Mouse     Testis        2364217
Rat       Brain         5549424
Rat       Testis        7466688
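The conversion between RPM and read counts can be sketched as follows. This is a minimal illustration, not part of the track's tooling: the function names and the dictionary layout are hypothetical, and only two of the filtered-read totals listed above are copied in.

```python
# Filtered-read totals taken from the list above (two samples shown).
# The dictionary layout and function names are illustrative assumptions.
FILTERED_READS = {
    ("Human", "Liver"): 5626271,
    ("Mouse", "Brain"): 1187654,
}


def rpm_to_reads(rpm, species, sample):
    """Convert an RPM value back to an (approximate) read count."""
    total = FILTERED_READS[(species, sample)]
    return rpm * total / 1_000_000


def reads_to_rpm(reads, species, sample):
    """Convert a read count to reads per million filtered reads."""
    total = FILTERED_READS[(species, sample)]
    return reads * 1_000_000 / total
```

For example, `rpm_to_reads(10.0, "Human", "Liver")` gives about 56 reads, because the human liver sample has roughly 5.6 million filtered reads.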

Methods

A detailed explanation of the experimental methods is provided at NCBI's Gene Expression Omnibus under accession GSE30198. Briefly, PolyA+ RNA was reverse-transcribed using a T(10)VN primer and strand-specific universal adapters, amplified, and sequenced on an Illumina GAIIx sequencer. Reads were reverse-complemented, aligned to the corresponding reference genome and splice junctions, and retained only if they aligned uniquely. 3' ends of alignments were considered polyA sites. Sites were then filtered using downstream base frequency matrices for true- and false-positive sites determined from a modified experiment based on a T(10) primer (i.e., excluding the 3' VN). When multiple filtered sites occurred within a 30-nt window on the same strand, read counts were summed and attributed to the most abundant peak. For each tissue, read counts were then divided by the total number of reads, in millions, from all filtered sites.
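The last two steps (merging nearby sites and normalizing to RPM) can be sketched as below. This is a hedged illustration only: the text does not say how the 30-nt windows were placed, so the sketch assumes a simple left-to-right grouping in which a window starts at the first unassigned site on a strand. The function names and the tuple layout `(chrom, strand, pos, reads)` are hypothetical.

```python
from collections import defaultdict


def _collapse(chrom, strand, group):
    """Sum reads in a group and assign them to the position with most reads."""
    total = sum(reads for _, reads in group)
    peak_pos = max(group, key=lambda site: site[1])[0]
    return (chrom, strand, peak_pos, total)


def cluster_sites(sites, window=30):
    """Merge same-strand sites that fall within one `window`-nt span.

    Assumption: windows are anchored at the first site of each group,
    scanning positions in ascending order. The actual procedure may differ.
    """
    by_strand = defaultdict(list)
    for chrom, strand, pos, reads in sites:
        by_strand[(chrom, strand)].append((pos, reads))

    merged = []
    for (chrom, strand), items in sorted(by_strand.items()):
        items.sort()
        group = []
        for pos, reads in items:
            # Start a new group once the site falls outside the window.
            if group and pos - group[0][0] >= window:
                merged.append(_collapse(chrom, strand, group))
                group = []
            group.append((pos, reads))
        if group:
            merged.append(_collapse(chrom, strand, group))
    return merged


def to_rpm(merged):
    """Divide each site's reads by the total filtered reads, in millions."""
    total_millions = sum(reads for *_, reads in merged) / 1_000_000
    return [(c, s, p, reads / total_millions) for c, s, p, reads in merged]
```

Grouping is done per chromosome and strand, so sites on opposite strands are never merged, which matches the strand-specific nature of the data.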

Data Release Policy

No restrictions.