Description

The alignments in this track were generated using PECAN, a consistency-based multiple aligner. The conservation subtracks display conservation scores and conserved elements generated by GERP. GERP scores each column of the alignment by quantifying rejected substitutions. These scores are then used to find regions of the alignment where high scores accumulate.

The multiple alignments are based on a set of 9-way EPO (Enredo-Pecan-Ortheus) alignments for all high-coverage placental mammal genomes. The 22 low-coverage genomes (including guinea pig) were mapped onto these alignments using BLASTZ-net pairwise alignments to the human genome. Insertions in the low-coverage genomes are ignored; that is, sequence specific to a low-coverage genome is removed from the alignments.

The multiple alignments are based on whole-genome assemblies residing at UCSC for the high-coverage genomes and on Ensembl assemblies for the low-coverage genomes. Ensembl adds an extra layer to the assembly in order to accommodate gene models. The species are:

Organism            Species                         Version
Human               Homo sapiens                    UCSC hg18
Chimpanzee          Pan troglodytes                 UCSC panTro2
Orangutan           Pongo abelii                    UCSC ponAbe2
Macaque             Macaca mulatta                  UCSC rheMac2
Mouse               Mus musculus                    UCSC mm9
Rat                 Rattus norvegicus               UCSC rn4
Dog                 Canis familiaris                UCSC canFam2
Cow                 Bos taurus                      UCSC bosTau3
Horse               Equus caballus                  UCSC equCab2
Guinea pig          Cavia porcellus                 Ensembl
Sloth               Choloepus hoffmanni             Ensembl
Armadillo           Dasypus novemcinctus            Ensembl
Kangaroo rat        Dipodomys ordii                 Ensembl
Tenrec              Echinops telfairi               Ensembl
European Hedgehog   Erinaceus europaeus             Ensembl
Cat                 Felis catus                     Ensembl
Gorilla             Gorilla gorilla                 Ensembl
Elephant            Loxodonta africana              Ensembl
Mouse Lemur         Microcebus murinus              Ensembl
Bat (sbbat)         Myotis lucifugus                Ensembl
Pika                Ochotona princeps               Ensembl
Rabbit              Oryctolagus cuniculus           Ensembl
Galago              Otolemur garnettii              Ensembl
Rock hyrax          Procavia capensis               Ensembl
Flying Fox          Pteropus vampyrus               Ensembl
Shrew               Sorex araneus                   Ensembl
Squirrel            Spermophilus tridecemlineatus   Ensembl
Tarsier             Tarsius syrichta                Ensembl
Tree shrew          Tupaia belangeri                Ensembl
Dolphin             Tursiops truncatus              Ensembl
Alpaca              Vicugna pacos                   Ensembl

Methods

PECAN

PECAN (v0.7), a consistency-based multiple aligner, was used to align co-linear segments between the 9 high-coverage genomes, as defined by ENREDO (v0.5). The anchors for ENREDO were extracted from multiple and pairwise alignments using the human, mouse, rat, dog, cow and horse genomes. The sequences from the low-coverage genomes were projected onto the alignments using BLASTZ-net pairwise alignments between each of these genomes and the human genome.
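
As stated in the Description, sequence specific to a low-coverage genome is removed when that genome is mapped onto the alignments through the human genome. The following is a minimal sketch of that idea only, not the Ensembl pipeline code. It assumes a pairwise alignment given as two equal-length gapped strings with "-" as the gap character; the function name project_onto_reference is hypothetical.

```python
def project_onto_reference(ref_aln: str, other_aln: str) -> str:
    """Keep only the alignment columns in which the reference has a base.

    Columns where the reference (here: human) has a gap correspond to
    insertions in the other genome and are dropped. Gaps in the other
    genome are kept, so the result has one character per reference base.
    """
    if len(ref_aln) != len(other_aln):
        raise ValueError("aligned sequences must have the same length")
    return "".join(o for r, o in zip(ref_aln, other_aln) if r != "-")


# Hypothetical example: "TT" and "C" are insertions relative to the reference.
human = "AC--GT-A"
other = "ACTTG-CA"
projected = project_onto_reference(human, other)
```

The projected row has the same length as the ungapped reference sequence, which is what allows it to be placed under the existing reference coordinates without adding columns.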

GERP

We used GERP v2.1b to score each column of the alignment. This is done by comparing the expected number of substitutions with the observed number of substitutions in every column of the alignment. These scores are then used to find regions of the alignment with more high conservation scores than expected by chance.
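
The column score described above can be illustrated as the difference between expected and observed substitutions (the "rejected substitutions"). The sketch below is a simplified illustration, not GERP's implementation: the expected and observed counts are assumed to be given, the function names are hypothetical, and the element finder simply reports runs of positive scores whose sum reaches an arbitrary threshold instead of performing GERP's test against chance.

```python
from typing import List, Tuple


def rejected_substitutions(expected: List[float],
                           observed: List[float]) -> List[float]:
    """Per-column score: expected minus observed substitutions."""
    if len(expected) != len(observed):
        raise ValueError("expected and observed must have the same length")
    return [e - o for e, o in zip(expected, observed)]


def candidate_elements(scores: List[float],
                       min_total: float) -> List[Tuple[int, int, float]]:
    """Return (start, end, total) for each maximal run of positive scores
    whose summed score is at least min_total. The end index is exclusive.
    The threshold is an assumption standing in for a significance test.
    """
    elements = []
    start = None
    total = 0.0
    for i, s in enumerate(scores):
        if s > 0:
            if start is None:
                start, total = i, 0.0
            total += s
        else:
            if start is not None and total >= min_total:
                elements.append((start, i, total))
            start = None
    if start is not None and total >= min_total:
        elements.append((start, len(scores), total))
    return elements


# Hypothetical counts for five alignment columns.
expected = [3.0, 3.0, 3.0, 3.0, 3.0]
observed = [0.5, 1.0, 4.0, 3.0, 0.0]
scores = rejected_substitutions(expected, observed)
elements = candidate_elements(scores, min_total=4.0)
```

A column with fewer substitutions than expected gets a positive score, and a column with more substitutions than expected gets a negative one, so conserved regions show up as stretches where positive scores accumulate.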

Credits

The PECAN multiple alignments were created by Kathryn Beal, Stephen Fitzgerald and Javier Herrero from the Ensembl team.

ENREDO was developed by Javier Herrero (Paten et al. 2008).

PECAN was developed by Benedict Paten (Paten et al. 2008).

GERP v2.1b was written by Eugene Davydov (Cooper et al. 2005).

References

Paten B, Herrero J, Beal K, Fitzgerald S, Birney E. Enredo and Pecan: genome-wide mammalian consistency-based multiple alignment with paralogs. Genome Res. 2008 Nov;18(11):1814-28.

Cooper G, Stone EA, Asimenos G, NISC Comparative Sequencing Program, Green ED, Batzoglou S, Sidow A. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 2005 Jul;15(7):901-13.