This track displays results of a variation of the standard McDonald-Kreitman (1) test to examine intra-species (human, HapMap phase II or dbSNP126) polymorphism and inter-species (human-chimp and human-rhesus) divergence in non-coding DNA in non-overlapping 10 kb windows tiled across the genome. The MKAR test uses repeats ancestral to human-macaque (AR) as the neutral proxy. These are defined as LTR, SINE, LINE, and DNA elements excluding AluY, L1PA1-L1PA7, L1HS, and others shown to violate neutral expectations (2).
Non-overlapping 10 kb windows were tiled across the genome. A 2x2 contingency table was generated for each window by apportioning the SNP and divergence counts within the window into AR and non-AR categories. A SNP count is registered when a SNP is present at an aligned position. A divergence count is registered when aligned bases differ. An AR count is registered when either a SNP or a diverged base falls in an ancestral repeat site within a window. A non-AR count is registered when either a SNP or a diverged base falls outside an AR site within a window. Gaps or regions that do not align in either species pair (human-chimp or human-macaque), and windows with a zero count in any cell of the 2x2 contingency table, were not counted. The significance of the table's deviation from neutrality was determined by Fisher's exact test as implemented in R (3). The FDR multiple-testing correction was applied to the empirical distribution of p-values as implemented in R, using the qvalue library and the bootstrap method of estimating the proportion of true null hypotheses (4).
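The per-window test described above can be sketched in Python as follows. This is a minimal illustration, not the track's actual pipeline, which used R. The function names, the row and column layout of the table, and the example counts are assumptions made here. The multiple-testing step uses the plain Benjamini-Hochberg adjustment as a simpler stand-in for the qvalue bootstrap procedure; it corresponds to fixing the estimated proportion of true null hypotheses at 1, so it is more conservative than the method cited above.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are no more probable than the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def mkar_window_test(snp_ar, snp_non_ar, div_ar, div_non_ar):
    """Test one window; returns None when any cell of the table is zero.

    Assumed layout: rows are AR / non-AR, columns are SNP / divergence.
    """
    if min(snp_ar, snp_non_ar, div_ar, div_non_ar) == 0:
        return None  # windows with a zero cell are not counted
    return fisher_exact_two_sided(snp_ar, div_ar, snp_non_ar, div_non_ar)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (stand-in for qvalue, pi0 = 1)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts for three windows: (snp_ar, snp_non_ar, div_ar, div_non_ar)
windows = [(8, 1, 2, 5), (5, 5, 5, 5), (3, 0, 4, 6)]
pvals = [mkar_window_test(*w) for w in windows]
tested = [p for p in pvals if p is not None]
qvals = benjamini_hochberg(tested)
```

The third hypothetical window has a zero cell, so it yields None and is left out of the adjustment, mirroring the exclusion rule stated in the text.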
This work is a collaborative effort among researchers in Anthropology, Biology and the Center for Comparative Genomics and Bioinformatics at the Pennsylvania State University.