Description

This track shows the chromosome-based version of the November 2003 Arachne $organism assembly (NCBI Build 1 Version 1) from the Chimp Genome Sequencing Consortium.

The Arachne assembly of sequence reads into whole genome shotgun (WGS) contigs and the mapping of the contigs to scaffolds (supercontigs) were performed at the Broad Institute. To build this assembly, a "modified de novo" (MDN) assembly was created first, using the human genome to determine, among other things, that particular inserts were not chimeric. A separate "validated chimp-on-human" (VCH) assembly was created by taking the chimp reads that align uniquely to human, forming them into contigs via this alignment, and then removing those contigs that failed a two-haplotype consistency check. Shared reads were then used to align the two assemblies to one another. In positions where the sequence was consistent, it was transferred from the VCH to the MDN assembly, further enriching the latter. Two global misassemblies were manually removed from the merged assembly. This final assembly constitutes the Build 1 Version 1 draft release.

To create a chromosome-based version of the assembly, the Arachne scaffolds were mapped to chimp chromosomes by LaDeana Hillier at the University of Washington, using human/chimp homology based on the alignments produced at UCSC and the Broad Institute. Centromeres were introduced into the chimp chromosomes at the positions of the centromeres in the corresponding human chromosomes. Nine documented human inversions were introduced into the ordering, as was the break of human chromosome 2 into chimpanzee chromosomes 12 and 13. An additional centromere was introduced in chromosome 13 at the site of the 30 kb alpha-satellite.

The Arachne assembly is composed of 361,782 contigs with an N50 length of 15.7 kb. The contigs total 2.73 Gb of sequence and span 3.02 Gb. The assembly contains 37,849 supercontigs (scaffolds) having an N50 length of 8.6 Mb (not including gaps).
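
For readers unfamiliar with the N50 statistic quoted above, the following is a minimal Python sketch of one common definition: the length L such that contigs (or scaffolds) of length L or longer account for at least half of the total assembled length. The function name n50 and the toy lengths are illustrative assumptions; this is not the code used to produce the figures reported for this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumed definition: sort lengths from longest to shortest and
    return the length at which the running total first reaches
    half of the summed length.
    """
    if not lengths:
        raise ValueError("n50() requires at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical contig lengths, not real assembly data)
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))
```

With the toy lengths above, the total is 20, so the running total first reaches 10 at the second-longest contig. For a scaffold N50 that excludes gaps, as stated above, the input lengths would be the ungapped scaffold lengths.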

In dense mode, this track depicts the path through the scaffolds used to create the assembled sequence. Scaffold boundaries are distinguished by the use of alternating gold and brown coloration. Where gaps exist in the path, spaces are shown between the gold and brown blocks. Relative order and orientation of the scaffolds, as determined from the human/chimp alignments, are implied for the non-random chromosomes.
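
The alternating coloration described above can be sketched in a few lines of Python. This is only an illustration of the display rule, not the browser's actual drawing code: the tuple layout (start, end, scaffold name), the function name color_blocks, and the scaffold names are all assumptions. Gaps are represented implicitly as stretches not covered by any block.

```python
def color_blocks(blocks):
    """Assign alternating colors to sequence blocks, switching at scaffold boundaries.

    blocks: iterable of (start, end, scaffold_name) tuples in path order
            (hypothetical layout chosen for this sketch).
    Returns a list of (start, end, scaffold_name, color) tuples.
    """
    colors = ("gold", "brown")
    colored = []
    scaffold_index = -1
    previous = None
    for start, end, scaffold in blocks:
        if scaffold != previous:
            # A new scaffold begins: switch to the other color.
            scaffold_index += 1
            previous = scaffold
        colored.append((start, end, scaffold, colors[scaffold_index % 2]))
    return colored


# Hypothetical path: two blocks from one scaffold, then two more scaffolds.
path = [
    (0, 100, "scaffold_A"),
    (150, 300, "scaffold_A"),   # gap between 100 and 150 is left blank
    (300, 500, "scaffold_B"),
    (600, 900, "scaffold_C"),
]
for block in color_blocks(path):
    print(block)
```

Blocks from the same scaffold keep one color, so a change of color always marks a scaffold boundary, while uncovered positions appear as spaces in the display.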

References

ARACHNE: A Whole-Genome Shotgun Assembler. Serafim Batzoglou, David B. Jaffe, Ken Stanley, Jonathan Butler, Sante Gnerre, Evan Mauceli, Bonnie Berger, Jill P. Mesirov, and Eric S. Lander. Genome Research 2002 Jan;12(1):177-189.

Whole-Genome Sequence Assembly for Mammalian Genomes: ARACHNE 2. David B. Jaffe, Jonathan Butler, Sante Gnerre, Evan Mauceli, Kerstin Lindblad-Toh, Jill P. Mesirov, Michael C. Zody, and Eric S. Lander. Genome Research 2003 Jan;13(1):91-96.