Description

This track displays human-centric multiple sequence alignments in the ENCODE regions for the 23 vertebrates in the May 2005 ENCODE MSA freeze, based on comparative sequence data generated for the ENCODE project. The alignments in this track were generated using the Threaded Blockset Aligner (TBA). A complete list of the vertebrates included in the May 2005 freeze may be found at the top of the description page for this track.

The Genome Browser companion tracks, TBA Cons and TBA Elements, display conservation scoring and conserved elements for these alignments based on various conservation methods.

Display Conventions and Configuration

In full display mode, this track shows pairwise alignments of each species to the human genome. The alignments are shown in dense display mode using a gray-scale density gradient. The checkboxes in the track configuration section allow species to be excluded from the pairwise display.

When zoomed in to the base-display level, the track shows the base composition of each alignment. The numbers and symbols on the "human gap" line indicate the lengths of gaps in the human sequence at those alignment positions relative to the longest non-human sequence. If there is sufficient space in the display, the size of the gap is shown. If there is not, a "*" is displayed when the gap size is a multiple of 3, and a "+" is displayed otherwise. To view detailed information about the alignments at a specific position, zoom the display in to 30,000 or fewer bases, then click on the alignment.
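The labeling rule for the "human gap" line can be illustrated with a short Python sketch. This is not the Genome Browser's own code: the function names `human_gap_runs` and `gap_label`, the `has_room` flag, and the use of "-" as the gap character in an aligned human row are all assumptions made for illustration.

```python
from typing import List, Tuple


def human_gap_runs(human_row: str) -> List[Tuple[int, int]]:
    """Find runs of gap characters in an aligned human row.

    Returns (human_bases_before_gap, gap_length) pairs. Assumes "-"
    marks a gap column (hypothetical input convention).
    """
    runs = []
    bases_seen = 0   # human bases consumed so far
    run_len = 0      # length of the gap run currently being scanned
    for ch in human_row:
        if ch == "-":
            run_len += 1
        else:
            if run_len:
                runs.append((bases_seen, run_len))
                run_len = 0
            bases_seen += 1
    if run_len:  # gap run at the very end of the row
        runs.append((bases_seen, run_len))
    return runs


def gap_label(gap_size: int, has_room: bool) -> str:
    """Choose the text for one gap on the "human gap" line.

    has_room is a hypothetical flag standing in for the display's
    check that the number fits at the current zoom level.
    """
    if gap_size <= 0:
        raise ValueError("gap_size must be positive")
    if has_room:
        return str(gap_size)
    # Not enough room: mark multiples of 3 with "*", everything else with "+".
    return "*" if gap_size % 3 == 0 else "+"


if __name__ == "__main__":
    for offset, size in human_gap_runs("ACG---TT-A"):
        print(offset, size, gap_label(size, has_room=False))
```

In this sketch a gap of 3 bases with no room for the number is labeled "*", and a gap of 1 base is labeled "+".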

Methods

The TBA was used to align sequences in the May 2005 ENCODE sequence data freeze. Multiple alignments were seeded from a series of combinatorial pairwise blastz alignments (not referenced to any one species). The specific combinations were determined by the species guide tree. Additionally, a blastz.specs file was used to fine-tune the blastz parameters, based on the evolutionary distance of the species being compared. The resulting multiple alignments were projected onto the human reference sequence.

Credits

The TBA multiple alignments were created by Elliott Margulies of the Green Lab at NHGRI.

The programs Blastz and TBA, which were used to generate the alignments, were provided by Minmei Hou, Scott Schwartz and Webb Miller of the Penn State Bioinformatics Group.

The phylogenetic tree is based on Murphy et al. (2001) and general consensus in the vertebrate phylogeny community.

References

Blanchette, M., Kent, W.J., Riemer, C., Elnitski, L., Smit, A., Roskin, K., Baertsch, R., Rosenbloom, K.R., Clawson, H. et al. Aligning Multiple Genomic Sequences With the Threaded Blockset Aligner. Genome Res 14(4), 708-15 (2004).

Chiaromonte, F., Yap, V.B., and Miller, W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput 2002, 115-26 (2002).

Schwartz, S., Kent, W.J., Smit, A., Zhang, Z., Baertsch, R., Hardison, R., Haussler, D., and Miller, W. Human-Mouse Alignments with BLASTZ. Genome Res 13(1), 103-7 (2003).

Murphy, W.J., et al. Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science 294(5550), 2348-51 (2001).