Description

This track displays regions of the chimpanzee draft assembly (panTro1) that are deleted in the human genome assembly (hg16). Only regions between 80 and 12000 bases long are included. The name of each deletion consists of a unique identifier for that deletion, followed by an underscore and the deletion's length. A similar track, showing chimp deletions in the human assembly, appears in the human Genome Browser.
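As an illustration of the naming convention, the following Python sketch splits an item name into its identifier and length and checks the length against the track's size range. The example name "del1234_356" and the function names are hypothetical, and treating the 80 and 12000 limits as inclusive is an assumption.

```python
MIN_LENGTH = 80      # lower size limit stated for this track
MAX_LENGTH = 12000   # upper size limit stated for this track


def parse_deletion_name(name):
    """Split an item name of the form <identifier>_<length>.

    The split is made at the last underscore, so identifiers that
    themselves contain underscores are still handled.
    """
    identifier, _, length_text = name.rpartition("_")
    if not identifier or not length_text.isdigit():
        raise ValueError(f"unexpected deletion name: {name!r}")
    return identifier, int(length_text)


def in_track_size_range(length):
    """Return True if the length falls in the assumed inclusive range."""
    return MIN_LENGTH <= length <= MAX_LENGTH


# Hypothetical item name, for illustration only.
identifier, length = parse_deletion_name("del1234_356")
print(identifier, length, in_track_size_range(length))
```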

Methods

The human/chimpanzee alignments were created at UCSC with blastz and blat, using a reciprocal-best strategy with chaining and netting. The initial alignments were generated using blastz on repeat-masked sequence with the following matrix:

       A    C    G    T
 A   100 -300 -150 -300
 C  -300  100 -300 -150
 G  -150 -300  100 -300
 T  -300 -150 -300  100

 O = 400, E = 30, K = 4500, L = 4500, M = 50

The overall score of an alignment is the sum of the matrix scores over all aligned pairs of bases.
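A minimal Python sketch of this sum, using the substitution matrix shown above, is given here. It handles only ungapped, equal-length sequences; the O, E, K, L and M parameters are not modeled, and the function name is illustrative rather than part of blastz.

```python
BASES = "ACGT"

# Substitution scores copied from the matrix above (rows and columns in A, C, G, T order).
MATRIX_ROWS = [
    [100, -300, -150, -300],
    [-300, 100, -300, -150],
    [-150, -300, 100, -300],
    [-300, -150, -300, 100],
]

# Lookup table keyed by (base in first sequence, base in second sequence).
SCORE = {
    (a, b): MATRIX_ROWS[i][j]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
}


def ungapped_score(seq_a, seq_b):
    """Sum the matrix score over every aligned pair of bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(SCORE[(a, b)] for a, b in zip(seq_a.upper(), seq_b.upper()))


print(ungapped_score("ACGT", "ACGT"))  # four matches: 400
print(ungapped_score("ACGT", "GCGT"))  # one A/G mismatch plus three matches: 150
```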

The resulting alignments were processed by the axtChain program. To place additional chimp scaffolds that weren't initially aligned by blastz, a DNA blat of the unmasked sequence was performed. The resulting blat alignments were also chained, and then merged with the blastz-based chains produced in the previous step to produce "all chains", which were further processed by the chainNet and netSyntenic programs. Finally, a "reciprocal best" strategy was employed to minimize paralog fill-in for missing orthologous chimp sequence. Details of the alignment methods can be found in the descriptions of the Chimp Chain and Chimp Net tracks of the human genome browser.
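The general idea behind a reciprocal-best filter can be sketched in Python as follows. This is a simplified illustration only: the actual pipeline operates on chains and nets, whereas the sketch below works on hypothetical (human region, chimp region, score) triples and keeps a pair only when each region is the other's highest-scoring partner.

```python
def reciprocal_best(scored_pairs):
    """Keep pairs in which each side is the other's best-scoring partner.

    scored_pairs is an iterable of (human_region, chimp_region, score)
    triples; the region labels are hypothetical placeholders.
    """
    best_for_human = {}
    best_for_chimp = {}
    for human, chimp, score in scored_pairs:
        # Track the best chimp partner seen so far for each human region.
        if human not in best_for_human or score > best_for_human[human][1]:
            best_for_human[human] = (chimp, score)
        # Track the best human partner seen so far for each chimp region.
        if chimp not in best_for_chimp or score > best_for_chimp[chimp][1]:
            best_for_chimp[chimp] = (human, score)
    return sorted(
        (human, chimp)
        for human, (chimp, _) in best_for_human.items()
        if best_for_chimp[chimp][0] == human
    )


pairs = [
    ("h1", "c1", 900),
    ("h1", "c2", 500),  # weaker, possibly paralogous, hit for h1
    ("h2", "c2", 700),
    ("h3", "c1", 400),  # c1 prefers h1, so this pair is dropped
]
print(reciprocal_best(pairs))
```

In this toy input, h3 has no orthologous partner of its own, and the filter prevents it from being filled in with c1, which already pairs best with h1.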

Human deletions relative to chimp were determined from the collection of indels implied by these alignments. The criteria for inclusion in the list of deletions were: (i) within, not between, scaffolds; (ii) simple gaps only (no opposing, unmatched bases or double gaps); (iii) 80-12000 bp long; and (iv) not a missed overlap or incorrect gap size in the assembly. These criteria aim to include plausible repeat insertions and exclude assembly and alignment artifacts.
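The four criteria can be expressed as a simple filter, sketched below in Python. The Gap record and its field names are hypothetical; how each flag would be derived from the alignments is not described here and is outside the scope of the sketch. The length limits are assumed to be inclusive.

```python
from dataclasses import dataclass


@dataclass
class Gap:
    """Hypothetical summary of one alignment gap (indel candidate)."""
    within_scaffold: bool     # (i) gap lies within a single scaffold
    simple: bool              # (ii) no opposing unmatched bases or double gaps
    length: int               # (iii) gap length in bp
    assembly_artifact: bool   # (iv) missed overlap or incorrect gap size


def is_candidate_deletion(gap, min_len=80, max_len=12000):
    """Apply criteria (i) through (iv) to a single gap."""
    return (
        gap.within_scaffold
        and gap.simple
        and min_len <= gap.length <= max_len
        and not gap.assembly_artifact
    )


gaps = [
    Gap(True, True, 310, False),    # passes all four criteria
    Gap(False, True, 310, False),   # spans two scaffolds
    Gap(True, True, 40, False),     # too short
    Gap(True, True, 5000, True),    # flagged as an assembly artifact
]
kept = [g for g in gaps if is_candidate_deletion(g)]
print(len(kept))
```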

Credits

The chimpanzee sequence used in this track was obtained from the 13 Nov. 2003 Arachne assembly. This sequence was provided by the National Human Genome Research Institute (NHGRI), the Eli & Edythe L. Broad Institute at MIT/Harvard, and Washington University School of Medicine.

The BLASTZ program was created by Webb Miller of the Penn State Bioinformatics Group.

Jim Kent at UCSC wrote the blat program, the chaining and netting programs, and the scripts for displaying the alignments in this browser.

The list of mid-sized (80-12000 bp) human deletions relative to chimpanzee was provided by Tarjei Mikkelsen at MIT. The UCSC alignments of complete chimpanzee scaffolds to the human genome assembly were used to generate this list.

References

Batzoglou, S., Jaffe, D.B., Stanley, K., Butler, J., Gnerre, S., Mauceli, E., Berger, B., Mesirov, J.P. and Lander, E.S. ARACHNE: a whole-genome shotgun assembler. Genome Research 12(1), 177-189 (2002).

Chiaromonte, F., Yap, V.B. and Miller, W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput 2002, 115-126 (2002).

Jaffe, D.B., Butler, J., Gnerre, S., Mauceli, E., Lindblad-Toh, K., Mesirov, J.P., Zody, M.C. and Lander, E.S. Whole-genome sequence assembly for mammalian genomes: Arachne 2. Genome Research 13(1), 91-96 (2003).

Kent, W.J. BLAT - the BLAST-like alignment tool. Genome Research 12(4), 656-664 (2002).

Schwartz, S., Kent, W.J., Smit, A., Zhang, Z., Baertsch, R., Hardison, R., Haussler, D. and Miller, W. Human-mouse alignments with BLASTZ. Genome Research 13(1), 103-107 (2003).