Description

This track shows the "reciprocal best" human/chimpanzee alignment net. It is useful for finding orthologous regions and for studying genome rearrangement.

In the graphical display, the boxes represent ungapped alignments, while the lines represent gaps. In full display mode, the top-level (Level 1) chains are the largest, highest-scoring chains that span this region. In many cases, gaps exist in the top-level chains. When possible, these are filled in by other chains displayed at Level 2. The gaps in Level 2 chains may be filled by Level 3 chains and so forth. Clicking on a box displays detailed information about the chain as a whole, while clicking on a line shows information on the gap. The detailed information is useful in determining the cause of the gap or, for lower-level chains, the genomic rearrangement.

Individual track features are categorized as one of four types (other than gap):

Methods

These alignments were generated using blastz and blat alignments of chimpanzee genomic sequence from the 13 Nov. 2003 Arachne chimpanzee draft assembly. The initial alignments were generated using blastz on repeatmasked sequence using the following chimp/human scoring matrix:

     A    C    G    T
A   100 -300 -150 -300
C  -300  100 -300 -150
G  -150 -300  100 -300
T  -300 -150 -300  100

K = 4500, L = 3000,  Y=3400, H=2000

The resulting alignments were processed by the axtChain program. AxtChain organizes all the alignments between a single chimp chromosome and a single $organism chromosome into a group and makes a kd-tree out of all the gapless subsections (blocks) of the alignments. The maximally-scoring chains of these blocks were found by running a dynamic program over the kd-tree. Chains scoring below a certain threshold were discarded.

To place additional chimp scaffolds that weren't initially aligned by blastz, a DNA blat of the unmasked sequence was performed. The resulting blat alignments were also chained, and then merged with the blastz-based chains produced in the previous step to produce "all chains".

These chains were sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain.

Due to the draft nature of this initial genome assembly, this net track (and the companion chain track) was generated using a "reciprocal best" strategy. This strategy attempts to minimize paralog fill-in for missing orthologous chimp sequence by filtering from the human net all sequences not found in the chimp side of the net. After generating the human alignment net, the subset of chains in the chimp-reference net was extracted and used for an additional netting step, which was then filtered for non-syntenic sequences smaller than 50 bases.

Credits

The chimp sequence used in this track was obtained from the 13 Nov. 2003 Arachne assembly. We'd like to thank the National Human Genome Research Institute (NHGRI), the Eli & Edythe L. Broad Institute at MIT/Harvard, and Washington University School of Medicine for providing this sequence.

Blastz was developed at Pennsylvania State University by Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison.

Lineage-specific repeats were identified by Arian Smit and his program RepeatMasker.

The browser display and database storage of the nets were made by Robert Baertsch and Jim Kent.

References

Kent, W.J., Baertsch, R., Hinrichs, A., Miller, W., and Haussler, D. Evolution's cauldron: Duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci USA 100(20), 11484-11489 (2003).

Schwartz, S., Kent, W.J., Smit, A., Zhang, Z., Baertsch, R., Hardison, R., Haussler, D., and Miller, W. Human-Mouse Alignments with BLASTZ. Genome Res. 13(1), 103-7 (2003).