Description

This track shows Human Accelerated Regions (HARs), genomic elements that are conserved in other vertebrates but significantly changed in humans. The HARs were identified in a genome-wide scan in two stages. First, all regions of at least 100 bp that are 96% or more identical between the chimp, mouse, and rat genomes were identified. Then, each of these regions was examined for evidence of substitution rate acceleration on the human lineage compared to 11 other vertebrates (including mammalian, amphibian, bird, and fish species) plus a parsimony-inferred chimp-human ancestor. The 202 HARs displayed in this track all showed statistically significant (FDR-adjusted p<0.1) acceleration in the human lineage. The "score" in this track is the value of the LRT statistic described below. Details of the HARs are available here.
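
The first stage of the scan can be pictured with a minimal Python sketch. It is not the code used to build this track. It assumes the three genomes are given as equal-length aligned strings, counts a column as identical only when all three carry the same non-gap base, and merges overlapping fixed-width windows that reach the identity threshold. The function name conserved_windows and its parameters are hypothetical.

```python
import math
from typing import List, Tuple


def conserved_windows(
    aligned: List[str], window: int = 100, min_identity: float = 0.96
) -> List[Tuple[int, int]]:
    """Return merged [start, end) column intervals covered by at least one
    window of `window` columns whose identity is >= min_identity.

    Assumption: a column matches only if every sequence has the same
    non-gap character there; gaps count as mismatches.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")

    # 1 where all sequences agree on a non-gap base, else 0.
    match = [
        1 if col[0] != "-" and all(b == col[0] for b in col) else 0
        for col in zip(*(s.upper() for s in aligned))
    ]

    # Smallest number of matching columns that satisfies the threshold.
    min_matches = math.ceil(min_identity * window - 1e-9)

    regions: List[Tuple[int, int]] = []
    if length < window:
        return regions

    count = sum(match[:window])
    for start in range(length - window + 1):
        if start > 0:
            # Slide the window one column to the right.
            count += match[start + window - 1] - match[start - 1]
        if count >= min_matches:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


if __name__ == "__main__":
    chimp = "ACGTACGTAC" + "AAAAAAAAAA" + "ACGTACGTAC"
    mouse = "ACGTACGTAC" + "CCCCCCCCCC" + "ACGTACGTAC"
    rat = chimp
    print(conserved_windows([chimp, mouse, rat], window=10, min_identity=1.0))
```

With the defaults (window=100, min_identity=0.96), each reported interval is at least 100 columns long, matching the thresholds quoted above. How the actual screen delimited region boundaries is not specified here.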

Methods

Multiz alignments of the following assemblies were used to generate this track:

Alignment gaps and CpG dinucleotides were excluded from the analysis. The phylogeny used was the same as that employed in the 17-species vertebrate conservation track, described here. Filters were used to remove cases of assembly errors, alignment errors, human pseudogenes, and misaligned paralogous sequences.

Each chimp-rodent conserved region was assessed for evidence of accelerated substitution rate in the human lineage using a Likelihood Ratio Test (LRT). The LRT statistic compares the likelihood of the alignment data under a molecular evolutionary model with a human substitution rate parameter (constrained to be accelerated only) to a model without this parameter. Both models are fit to each alignment by scaling a genome-wide model for conserved sequences. The general time-reversible (REV) single-nucleotide model for molecular evolution was used. Large values of the LRT statistic indicate more evidence for acceleration in the human lineage. Significance was assessed by simulation from the genome-wide (no acceleration) model. P-values were adjusted for multiple comparisons using the false discovery rate controlling procedure of Benjamini & Hochberg (1995).
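
The last two steps (simulation-based significance and false discovery rate adjustment) can be sketched as follows. This is a minimal Python illustration, not the PHAST-based pipeline used for this track. The function names are hypothetical, and the add-one correction in the empirical p-value is an assumption that the track description does not state. The adjustment itself is the standard Benjamini & Hochberg step-up procedure.

```python
from typing import List, Sequence


def empirical_p_value(observed: float, null_stats: Sequence[float]) -> float:
    """One-sided p-value for an observed LRT statistic, given statistics
    simulated under the no-acceleration model.

    Assumption: add-one correction so the p-value is never exactly zero.
    """
    exceed = sum(1 for s in null_stats if s >= observed)
    return (exceed + 1) / (len(null_stats) + 1)


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.5]
    adj = benjamini_hochberg(raw)
    # Regions with adjusted p < 0.1 would be reported, per the Description.
    print([round(a, 4) for a in adj], [a < 0.1 for a in adj])
```

Under this sketch, a region would be called a HAR when its adjusted p-value falls below 0.1, the threshold given in the Description.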

More information about the methods can be found in the two references by Pollard et al. (2006).

Credits

The genome-wide HAR screen was developed and executed at UCSC by Katherine Pollard, using programs from the PHAST library written by Adam Siepel and scripts contributed by Gill Bejerano.

References

K.S. Pollard, S.R. Salama, B. King, A.D. Kern, T. Dreszer, S. Katzman, A. Siepel, J.S. Pedersen, G. Bejerano, R. Baertsch, K.R. Rosenbloom, J. Kent, D. Haussler. Forces shaping the fastest evolving regions in the human genome. PLoS Genetics, early online publication (August 23, 2006).

K.S. Pollard, S.R. Salama, N. Lambert, M.A. Lambot, S. Coppens, J.S. Pedersen, S. Katzman, B. King, C. Onodera, A. Siepel, A.D. Kern, C. Dehay, H. Igel, M. Ares Jr, P. Vanderhaeghen, D. Haussler. An RNA gene expressed during cortical development evolved rapidly in humans. Nature, early online publication (August 16, 2006).

Y. Benjamini, Y. Hochberg. Controlling the False Discovery Rate - a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B-Methodological 57, 289-300 (1995).