Description

This track shows regions that co-precipitate with antibodies against each of ten factors in all ENCODE regions, in retinoic acid-stimulated HL-60 cells harvested after 0, 2, 8, and 32 hours. Clustered sites are shown in separate subtracks for each of the ten antibodies:

Brg1 - Brahma-related Gene 1
CEBPe - CCAAT-enhancer binding protein-epsilon
CTCF - CCCTC-binding factor
H3K27me3 (H3K27T) - Histone H3 tri-methylated lysine 27
H4Kac4 (HisH4) - Histone H4 tetra-acetylated lysine
P300 - E1A-binding protein, 300-KD
PU1 - Spleen focus forming virus proviral integration oncogene
Pol2 - RNA Polymerase II (8WG16 antibody against pre-initiation complex form)
RARA (RARecA) - Retinoic Acid Receptor-Alpha
SIRT1 - Sirtuin-1

Retinoic acid-stimulated HL-60 cells were harvested and whole cell extracts (control) were made. An antibody was used to immunoprecipitate bound chromatin fragments (treatment). DNA was purified from these samples and hybridized to Affymetrix ENCODE oligonucleotide tiling arrays, which have 25-mer probes tiled every 22 bp on average in the non-repetitive ENCODE regions.

Display Conventions and Configuration

The subtracks within this composite annotation track may be configured in a variety of ways to highlight different aspects of the displayed data. The graphical configuration options for the subtracks are shown at the top of the track description page, followed by a list of subtracks. For more information about the graphical configuration options, click the Graph configuration help link.

Color differences among the subtracks are arbitrary. They provide a visual cue for finding the same antibody in different timepoint tracks.

Methods

Data from replicate arrays were quantile-normalized (Bolstad et al., 2003), and all arrays were scaled to a median array intensity of 22. Within a sliding 1001 bp window centered on each probe, a signal estimator S = ln[max(PM - MM, 1)] (where PM is the perfect-match and MM the mismatch probe intensity) was computed for the probe pairs of each biological replicate treatment and of all replicate controls. The significance of the enrichment of treatment signal over control signal in each window was estimated, for each biological replicate, by the P-value of a Wilcoxon Rank Sum test comparing that replicate's treatment signal estimates with all control signal estimates in the window. The displayed value is the median of the log-transformed P-values (-10 log₁₀ P) across the processed replicates.
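The per-window computation described above can be sketched in a few lines of Python. This is an illustration only, not the pipeline that produced the track: the function names are hypothetical, the test is taken to be one-sided (treatment greater than control), and the P-value uses the normal approximation to the rank-sum statistic without a tie correction. None of those choices is stated in the text.

```python
import math

def signal(pm, mm):
    """Signal estimator S = ln(max(PM - MM, 1)) for one probe pair."""
    return math.log(max(pm - mm, 1.0))

def window_values(positions, values, center, half_width=500):
    """Values of probes inside the 1001 bp window centered on `center`."""
    return [v for p, v in zip(positions, values) if abs(p - center) <= half_width]

def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_p(treatment, control):
    """One-sided rank-sum P-value (assumption: normal approximation,
    alternative 'treatment > control', no tie correction)."""
    n1, n2 = len(treatment), len(control)
    ranks = average_ranks(list(treatment) + list(control))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def log_score(p):
    """Log-transformed P-value, -10 * log10(P)."""
    return -10.0 * math.log10(p)
```

For one probe, the treatment and control signal estimates selected by window_values would be passed to rank_sum_p, and log_score would convert the result to the displayed scale. On that scale a P-value of 0.01 corresponds to a score of 20.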

Several independent biological replicates (four each for Brg1, CEBPe, CTCF, PU1, and SIRT1; five each for H3K27me3, H4Kac4, P300, Pol2, and RARA) were generated and hybridized to duplicate arrays (two technical replicates). Reproducible enriched regions were generated from the signal by first applying, to each biological replicate, a cutoff of 20 on the log-transformed P-values together with a maxGap of 500 base pairs and a minRun of 0 base pairs. Since each region or site may comprise more than one probe, a median of the log-transformed P-values was computed per site for each replicate. These seed sites were then ranked individually within each replicate. If a site was absent from a replicate, it was assigned the maximum (worst) rank of the distribution.
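A minimal Python sketch of seed-site calling and ranking for one replicate follows. It rests on assumptions the text does not spell out: probes at or above the cutoff are merged when consecutive passing probes lie no more than maxGap apart, minRun is read as a minimum site length in base pairs, higher median scores rank better, and the "worst rank" given to an absent site is the largest rank observed in that replicate. The function names and site identifiers are hypothetical.

```python
from statistics import median

def call_sites(positions, scores, cutoff=20.0, max_gap=500, min_run=0):
    """Group passing probes into sites; return (start, end, median score).

    `positions` must be sorted in ascending order.
    """
    groups, current = [], []
    for pos, score in zip(positions, scores):
        if score < cutoff:
            continue  # probe does not pass the cutoff
        if current and pos - current[-1][0] > max_gap:
            groups.append(current)  # gap too large: close the current site
            current = []
        current.append((pos, score))
    if current:
        groups.append(current)

    sites = []
    for probes in groups:
        start, end = probes[0][0], probes[-1][0]
        if end - start >= min_run:
            sites.append((start, end, median(s for _, s in probes)))
    return sites

def rank_sites(site_scores, all_site_ids):
    """Rank sites within one replicate (1 = highest median score).

    Sites missing from this replicate receive the worst observed rank.
    """
    ordered = sorted(site_scores, key=site_scores.get, reverse=True)
    ranks = {sid: i + 1 for i, sid in enumerate(ordered)}
    worst = len(ordered)
    return {sid: ranks.get(sid, worst) for sid in all_site_ids}
```

For example, with passing probes at positions 100 and 122 followed by the next passing probe at 900, the 778 bp gap exceeds maxGap, so two separate sites are produced.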

The following three values were computed for each site by combining data from all biological replicates:

average of all ranks computed among biological replicates
sum of all pairwise differences in these ranks computed among biological replicates
a combined P-value, using a chi square distribution, across all replicates
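The three per-site values listed above can be sketched as follows. The text says only that the combined P-value uses a chi-square distribution; this sketch assumes Fisher's method (minus two times the sum of ln P, compared against a chi-square distribution with 2k degrees of freedom for k replicates) and takes the pairwise differences as absolute values. Both are assumptions, and the function names are hypothetical.

```python
import math
from itertools import combinations

def mean_rank(ranks):
    """Average of a site's ranks across biological replicates."""
    return sum(ranks) / len(ranks)

def pairwise_rank_spread(ranks):
    """Sum of absolute rank differences over all replicate pairs."""
    return sum(abs(a - b) for a, b in combinations(ranks, 2))

def fisher_combined_p(pvalues):
    """Combine P-values with Fisher's method (assumed here).

    The statistic x = -2 * sum(ln p) follows a chi-square distribution
    with 2k degrees of freedom; for even degrees of freedom the upper
    tail has the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(pvalues)
    half = -sum(math.log(p) for p in pvalues)  # equals x / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total
```

A site ranked 1, 2, and 4 in three replicates has a mean rank of 7/3 and a pairwise spread of 1 + 3 + 2 = 6. Small values of all three quantities indicate a site that scores well and consistently across replicates.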

Final sites were those for which all three of the above metrics were relatively low, where "low" corresponds to the top 25th percentile of the distribution.
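The selection step can be illustrated with a short Python sketch. It assumes that the cutoff for each metric is the value at the 25% point of that metric's sorted distribution (nearest-rank definition) and that a site must be at or below the cutoff on all three metrics. The exact percentile definition is not given in the text, and the names below are hypothetical.

```python
import math

def percentile_cutoff(values, fraction=0.25):
    """Nearest-rank cutoff: the value at the given fraction of the sorted data."""
    ordered = sorted(values)
    k = max(1, math.ceil(fraction * len(ordered)))
    return ordered[k - 1]

def select_sites(metrics, fraction=0.25):
    """Keep sites whose three metrics are all at or below their cutoffs.

    `metrics` maps a site id to (mean rank, rank spread, combined P-value).
    """
    cutoffs = [
        percentile_cutoff([m[i] for m in metrics.values()], fraction)
        for i in range(3)
    ]
    return [
        sid for sid, m in metrics.items()
        if all(m[i] <= cutoffs[i] for i in range(3))
    ]
```

Because the three cutoffs are applied jointly, fewer than 25% of the candidate sites will generally pass.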

Verification

Using the P-values from the biological replicates, all pairwise rank correlation coefficients were computed among biological replicates. Data sets showing both consistent pairwise correlation coefficients and at least weak positive correlation across all pairs were considered reproducible.
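As an illustration of the reproducibility check, the following Python sketch computes a rank correlation for every pair of replicates. The text does not name the coefficient; Spearman's rho is assumed here, and the function names are hypothetical. Each replicate is represented as a list of P-values (or scores) over the same sites, in the same order.

```python
import math
from itertools import combinations

def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def pairwise_spearman(replicates):
    """Map each replicate index pair (i, j), i < j, to its Spearman's rho."""
    return {
        (i, j): spearman(replicates[i], replicates[j])
        for i, j in combinations(range(len(replicates)), 2)
    }
```

The resulting coefficients can then be inspected for consistency and for positive sign across all pairs, as described above.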

Credits

These data were generated and analyzed by the Gingeras/Struhl collaboration between Tom Gingeras's group at Affymetrix and Kevin Struhl's group at Harvard Medical School.

References

Please see the Affymetrix Transcriptome site for a project overview and additional references to Affymetrix tiling array publications.

Bolstad, B. M., Irizarry, R. A., Astrand, M., and Speed, T. P. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19(2), 185-193 (2003).

Cawley, S., Bekiranov, S., Ng, H. H., Kapranov, P., Sekinger, E. A., Kampa, D., Piccolboni, A., Sementchenko, V., Cheng, J., Williams, A. J., et al. Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs. Cell 116(4), 499-509 (2004).