Description

These tracks display the level of sequence uniqueness of the mm9 reference genome. They were generated using several window sizes, and the signal is high in regions where the sequence is unique.

Display Conventions and Configuration

This track contains multiple subtracks representing different window sizes that display individually on the browser. Instructions for configuring tracks with multiple subtracks are here.

These tracks provide a measure of how often the sequence found at a particular location will align within the whole genome. Unlike measures of uniqueness, alignability tolerates up to 2 mismatches. The tracks are signals ranging from 0 to 1 and have several configuration options.

Methods

The CRG Alignability tracks show how uniquely k-mer sequences align to a region of the genome. The tracks were computed with the GEM mapper, allowing up to two mismatches. The method is equivalent to mapping sliding windows of k-mers back to the genome, where k was set to 36, 40, 50, 75 or 100 nucleotides to produce these tracks. For each window, a mapability score S was computed as S = 1/(number of matches found in the genome): S = 1 means one match in the genome, S = 0.5 means two matches, and so on. The CRG Alignability tracks were generated independently of the ENCODE project, in the framework of the GEM (GEnome Multitool) project.
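
The scoring rule above can be illustrated with a small sketch. This is not the GEM mapper and not the pipeline used to build the tracks. It is a brute-force toy that assumes a short genome string, compares windows on the forward strand only, and counts a match whenever two windows differ in at most a given number of positions. The function names (hamming_within, alignability_scores) are hypothetical and chosen for illustration.

```python
def hamming_within(a, b, max_mismatches):
    """Return True if equal-length strings a and b differ in at most max_mismatches positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > max_mismatches:
                return False
    return True


def alignability_scores(genome, k, max_mismatches=2):
    """Score every k-mer window as S = 1 / (number of windows it matches).

    Assumptions for this toy: forward strand only, quadratic brute force,
    and each window always matches itself, so the count is at least 1.
    """
    windows = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    scores = []
    for w in windows:
        hits = sum(1 for other in windows if hamming_within(w, other, max_mismatches))
        scores.append(1.0 / hits)
    return scores


# Toy example with exact matching: "ACGT" occurs twice, so both copies score 0.5.
toy = "ACGTACGT"
print(alignability_scores(toy, k=4, max_mismatches=0))  # [0.5, 1.0, 1.0, 1.0, 0.5]
```

With max_mismatches=2 and a realistic k, windows inside repeats or near-duplicate regions gain extra hits and score lower, which is the behaviour the track signal reflects. A real genome needs an indexed mapper rather than this quadratic comparison.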

Release Notes

This is Release 1 (June 2012) of the ENCODE mapability track. It is a port of the old mapability track into the ENCODE format. There are no new datasets.

Credits

The CRG Alignability track was created by Thomas Derrien and Paolo Ribeca in Roderic Guigo's lab at the Centre for Genomic Regulation (CRG), Barcelona, Spain. TD was supported by funds from NHGRI for the ENCODE project, while PR was funded by a Consolider grant CDS2007-00050 from the Spanish Ministerio de Educación y Ciencia.

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column, above. The full data release policy for ENCODE is available here.