Description

This track displays results of a variation of the standard McDonald-Kreitman test (1), referred to here as the MKAR test. The test compares intra-species polymorphism (human; HapMap phase II or dbSNP126) with inter-species divergence (human-chimp and human-macaque) in non-coding DNA, in non-overlapping 10 kb windows tiled across the genome. The MKAR test uses ancestral repeats (AR), that is, repeats ancestral to human and macaque, as the neutral proxy. These are defined as LTR, SINE, LINE, and DNA elements, excluding AluY, L1PA1-L1PA7, L1HS, and others shown to violate neutral expectations (2).

Methods

Non-overlapping 10 kb windows were tiled across the genome. A 2x2 contingency table was generated for each window by apportioning SNP counts and divergence counts within the window into AR and non-AR categories. A SNP count is registered when a SNP is present at an aligned position. A divergence count is registered when aligned bases differ. An AR count is registered when either a SNP or a diverged base falls in an ancestral repeat site within a window. A non-AR count is registered when either a SNP or a diverged base falls outside an AR site within a window. Gaps and regions that do not align in either species pair (human-chimp or human-macaque) were not counted, and windows with a zero count in any cell of the 2x2 contingency table were excluded. The significance of each table's deviation from neutrality was determined by Fisher's exact test as implemented in R (3). A false discovery rate (FDR) correction for multiple testing was applied to the empirical distribution of p-values as implemented in R, using the qvalue library and the bootstrap method of estimating the proportion of true null hypotheses (4).
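The per-window procedure above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the pipeline used to build the track: the function names and the (kind, in_ar) input format are hypothetical, Fisher's exact test is computed directly from the hypergeometric distribution rather than by calling R, and the Benjamini-Hochberg adjustment stands in for the qvalue library's bootstrap procedure (it roughly corresponds to assuming that the proportion of true null hypotheses is 1, so it is the more conservative of the two).

```python
from math import comb


def window_table(sites):
    """Tally [[snp_AR, snp_nonAR], [div_AR, div_nonAR]] for one window.

    `sites` is a hypothetical input format: an iterable of (kind, in_ar)
    pairs, where kind is "snp" or "div" and in_ar is True when the aligned
    site lies in an ancestral repeat.
    """
    table = [[0, 0], [0, 0]]
    for kind, in_ar in sites:
        row = 0 if kind == "snp" else 1
        col = 0 if in_ar else 1
        table[row][col] += 1
    return table


def is_usable(table):
    """Windows with a zero count in any cell are excluded."""
    return all(cell > 0 for row in table for cell in row)


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2x2 table.

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed one.
    """
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so ties with the observed table are included.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (a stand-in for Storey q-values)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q
```

In use, one would build a table per window with window_table, drop the windows for which is_usable returns False, compute a p-value for each remaining window with fisher_exact_two_sided, and pass the full list of p-values to bh_qvalues.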

Credits

This work is a collaborative effort among researchers in Anthropology, Biology and the Center for Comparative Genomics and Bioinformatics at the Pennsylvania State University.

References

  1. McDonald J.H., Kreitman M. Adaptive protein evolution at the Adh locus in Drosophila. Nature. 1991 Jun 20;351(6328):652-4.
  2. Kamal M., Xie X., Lander E.S. A large family of ancient repeat elements in the human genome is under strong selection. Proc Natl Acad Sci U S A. 2006 Feb 21;103(8):2740-5.
  3. R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. 2005.
  4. Storey J.D. A direct approach to false discovery rates. J R Stat Soc Series B Stat Methodol. 2002;64(3):479-98.