Description

This track shows likely TAF1 binding sites in fibroblastoid (IMR90) cells as assayed by ChIP-chip using a NimbleGen microarray. The two subtracks show known TAF1 binding sites and additional novel sites where, based on the data in the LI TAF1Signal companion track, TAF1 is most likely to bind.

TAF1, a protein found at the start of transcribed genes, is a general transcription factor that is a key part of the pre-initiation complex found on the promoter. It is more fully known as TBP-associated factor 1 of the TFIID complex or by its molecular weight as TAF250.

To survey the entire human genome in an unbiased fashion, a total of 38 high-density oligonucleotide arrays (NimbleGen platform) were fabricated, representing approximately 1.45 billion base pairs of non-repetitive DNA with 50-mer oligonucleotides positioned at every 100 base pairs throughout the human genome (UCSC hg16). Using this set of arrays, genome-wide location analysis of TAF1 was conducted by ChIP-chip on chromatin extracted from primary IMR90 fibroblasts.

Methods

Chromatin from IMR90 cells was cross-linked, precipitated with TAF1 antibody (sc-735, Santa Cruz), sheared, amplified and hybridized to 38 high-density oligonucleotide arrays (NimbleGen). These arrays contain a total of 14,535,659 50-mer oligonucleotides positioned at every 100 base pairs throughout the human genome (UCSC hg16). Using this set of arrays, a total of 9,966 clusters of TFIID binding sites were identified.
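The probe count and spacing quoted above can be checked against the "approximately 1.45 billion base pairs" figure given in the Description. The short calculation below is only a sanity check. It assumes perfectly uniform 100 bp spacing and an even split of probes across the 38 arrays, neither of which is stated in the text.

```python
# Figures quoted in this track description.
PROBE_COUNT = 14_535_659      # 50-mer oligonucleotides across all arrays
PROBE_SPACING_BP = 100        # one probe every 100 bp
ARRAY_COUNT = 38              # arrays in the genome-wide set

# Assumption: uniform spacing, so each probe stands for one 100 bp window.
covered_bp = PROBE_COUNT * PROBE_SPACING_BP

# Assumption: probes are divided evenly among the arrays.
probes_per_array = PROBE_COUNT / ARRAY_COUNT

print(f"approximate sequence represented: {covered_bp / 1e9:.2f} billion bp")
print(f"average probes per array: {probes_per_array:,.0f}")
```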

To verify the binding of TFIID to these sequences, a condensed array was designed containing a total of 379,521 oligonucleotides to represent the 9,966 putative TFIID binding sequences plus 29 control genomic loci at 100 bp resolution. Using these condensed arrays, two independent chromatin immunoprecipitation (ChIP) experiments were performed with antibodies against TAF1, RNA polymerase II, acetylated histone H3 and histone H3 dimethylated at lysine 4 (K4). A total of 8,597 TFIID binding regions, ranging in size from 400 bp to 9.8 Kbp, were confirmed by the TAF1 replicate experiments. The verification data can be viewed in the LI TAF1 Valid track.

To further define the sites of TFIID binding within the identified regions, a model-based peak-finding algorithm was developed that estimates the most likely TFIID binding sites based on the hybridization intensity of probes within each fragment. The signals from a set of consecutive, significantly enriched probes were used collectively to assign the most likely TFIID binding site to the probe with the peak signal. The algorithm predicted a total of 12,150 TFIID binding sites within the 8,597 confirmed TFIID binding fragments.
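The general idea of assigning a binding site to the strongest probe in a run of enriched probes can be illustrated with a much simpler stand-in. The sketch below is not the model-based algorithm used for this track. The function name, the fixed enrichment threshold, and the minimum run length of four probes are all hypothetical choices made for illustration. It also reports only one peak per run, whereas the actual algorithm could place several sites within one fragment.

```python
from typing import List, Tuple

Probe = Tuple[int, float]  # (genomic start position, enrichment signal)


def call_peaks(probes: List[Probe], threshold: float, min_run: int = 4) -> List[int]:
    """Return the position of the strongest probe in each run of
    consecutive probes whose signal is at or above `threshold`.

    `probes` must be sorted by position and come from one chromosome.
    `threshold` and `min_run` are illustrative, not values from the study.
    """
    peaks: List[int] = []
    run: List[Probe] = []
    for probe in probes:
        if probe[1] >= threshold:
            run.append(probe)          # extend the current enriched run
            continue
        if len(run) >= min_run:        # run ended: keep it if long enough
            peaks.append(max(run, key=lambda p: p[1])[0])
        run = []
    if len(run) >= min_run:            # handle a run that reaches the end
        peaks.append(max(run, key=lambda p: p[1])[0])
    return peaks


# Made-up signals at 100 bp spacing: one run of four enriched probes,
# then a single isolated enriched probe that is too short to count.
probes = [
    (1000, 0.2), (1100, 1.5), (1200, 2.8), (1300, 3.9),
    (1400, 2.1), (1500, 0.3), (1600, 1.9), (1700, 0.1),
]
print(call_peaks(probes, threshold=1.0))
```

Requiring several adjacent enriched probes, rather than trusting any single probe, is what makes this kind of call robust to isolated noisy hybridization signals.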

The locations of the 12,150 peaks were compared to the annotated 5' ends of transcripts from RefSeq, GenBank and DBTSS, using a cutoff of 2.5 Kbp. It was found that 10,504 peaks, corresponding to 9,281 non-redundant transcripts, were within 2.5 Kbp of an annotated 5' end. Of the remaining peaks, 47 were within 2.5 Kbp of Ensembl genes, resulting in a total of 9,328 known non-redundant promoters. The peaks still unmatched were further filtered using Acembly annotation and ChIP-chip data for acetylated H3 (H3ac), RNA polymerase II (RNAP) and dimethylated H3K4 (MeH3K4). The total number of novel peaks was 1,239.
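The distance comparison described above can be sketched as a nearest-neighbor lookup against sorted 5' end positions. This is a minimal illustration under stated assumptions, not the pipeline used for the track: the function names and the input layout are hypothetical, strand is ignored, and the 2.5 Kbp cutoff is treated as inclusive.

```python
from bisect import bisect_left
from typing import Dict, List, Optional, Tuple

CUTOFF_BP = 2_500  # 2.5 Kbp cutoff quoted in the text


def nearest_tss_within(peak: int, tss_sorted: List[int],
                       cutoff: int = CUTOFF_BP) -> Optional[int]:
    """Return the closest annotated 5' end within `cutoff` bp of `peak`,
    or None if there is none. `tss_sorted` must be sorted ascending."""
    i = bisect_left(tss_sorted, peak)
    # Only the neighbors on either side of the insertion point can be closest.
    candidates = tss_sorted[max(i - 1, 0): i + 1]
    if not candidates:
        return None
    best = min(candidates, key=lambda t: abs(t - peak))
    return best if abs(best - peak) <= cutoff else None


def classify_peaks(
    peaks: List[Tuple[str, int]],
    tss_by_chrom: Dict[str, List[int]],
) -> Tuple[List[Tuple[str, int]], List[Tuple[str, int]]]:
    """Split (chromosome, position) peaks into those near an annotated
    5' end and those left unmatched."""
    known, unmatched = [], []
    for chrom, pos in peaks:
        hit = nearest_tss_within(pos, tss_by_chrom.get(chrom, []))
        (known if hit is not None else unmatched).append((chrom, pos))
    return known, unmatched


# Made-up coordinates for illustration only.
tss_by_chrom = {"chr1": [10_000, 50_000]}
peaks = [("chr1", 11_200), ("chr1", 30_000), ("chr2", 500)]
known, unmatched = classify_peaks(peaks, tss_by_chrom)
print(known, unmatched)
```

In this sketch, peaks left in `unmatched` would correspond to the candidates passed on to the next annotation source or to the filtering step.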

The raw data are available from GEO under accession GSE2672.

Verification

The peaks from genome scan experiments were verified using condensed arrays, as described in the Methods section. The verification data may be viewed in the LI TAF1 Valid track.

References

Kim TH, Barrera LO, Zheng M, Qu C, Singer MA, Richmond TA, Wu Y, Green RD, Ren B. A high-resolution map of active promoters in the human genome. Nature. 2005 Aug 11;436:876-80.