Description

The Neandertal Sequence Contigs track shows consensus contigs called from overlapping, non-redundant reads that passed mapping and base quality criteria. Duplicate reads from each library were merged before the contigs were called.

Display Conventions and Configuration

The contigs (query sequences) from each of the six samples are contained in separate subtracks. Use the checkboxes to select which samples will be displayed in the browser. Click and drag the sample name to reorder the subtracks. The order in which the subtracks appear in the subtrack list will be the order in which they display in the browser.

The query sequences in the SAM/BAM alignment representation are normalized to the + strand of the reference genome (see the SAM Format Specification for more information on the SAM/BAM file format). If a query sequence was originally the reverse of what has been stored and aligned, it will have the following flag:

(0x10) Read is on '-' strand.
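As a minimal sketch of how this flag can be tested, the snippet below checks the 0x10 bit of a SAM FLAG value. The helper names `is_reverse_strand` and `strand` are illustrative and are not part of the track or of any particular library.

```python
# Bit 0x10 of the SAM FLAG field marks a query aligned to the '-' strand.
REVERSE_STRAND = 0x10


def is_reverse_strand(flag: int) -> bool:
    """Return True if the 0x10 bit is set in a SAM FLAG value."""
    return bool(flag & REVERSE_STRAND)


def strand(flag: int) -> str:
    """Return '-' for reverse-strand queries and '+' otherwise."""
    return "-" if is_reverse_strand(flag) else "+"
```

Because FLAG is a bit field, the test uses a bitwise AND rather than equality, so it still works when other bits are set alongside 0x10.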

BAM/SAM alignment representations also have tags. Some tags are predefined and others (those beginning with X, Y or Z) are defined by the aligner or data submitter. The following tag is associated with this track:

The item labels and display colors of features within this track can be configured through the controls at the top of the track description page.

Methods

All Neandertal sequence reads from each of the six samples were aligned to the $organism ($db) genome using the short read aligner/mapper ANFO.

To reduce the effects of sequencing error, the alignments of Neandertal reads to the human and chimpanzee reference genomes were used to construct human-based and chimpanzee-based consensus "minicontigs". To generate the consensus, uniquely placed, overlapping alignments were selected (ANFO MAPQ ≥ 90) and these were merged into a single multi-sequence alignment using the common reference genome sequence.
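The selection step can be sketched roughly as follows. This is an illustration under stated assumptions, not the pipeline used to build the track: the `Alignment` record, the half-open coordinate convention, and the choice to keep groups that contain a single alignment are all hypothetical. Only the MAPQ threshold of 90 comes from the text above.

```python
from typing import List, NamedTuple


class Alignment(NamedTuple):
    """Hypothetical minimal alignment record (not an ANFO data structure)."""
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # 0-based, exclusive (assumed convention)
    mapq: int


MIN_MAPQ = 90  # threshold stated in the Methods text


def group_overlapping(alignments: List[Alignment],
                      min_mapq: int = MIN_MAPQ) -> List[List[Alignment]]:
    """Keep alignments with MAPQ >= min_mapq and group those that overlap
    on the reference; each group is a candidate minicontig."""
    kept = sorted((a for a in alignments if a.mapq >= min_mapq),
                  key=lambda a: (a.chrom, a.start, a.end))
    groups: List[List[Alignment]] = []
    current_chrom = None
    current_end = -1
    for aln in kept:
        if groups and aln.chrom == current_chrom and aln.start < current_end:
            # Overlaps the running group: extend it.
            groups[-1].append(aln)
            current_end = max(current_end, aln.end)
        else:
            # No overlap (or a new chromosome): start a new group.
            groups.append([aln])
            current_chrom, current_end = aln.chrom, aln.end
    return groups
```

Sorting by reference position first means a single pass is enough to find every overlapping group, since any alignment that overlaps the running group must start before the group's current end.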

At each position in the resulting alignment, for each observed base and for each possible original base, (i) the likelihood of the observation was calculated, (ii) the likely length of single-stranded overhangs was estimated, and (iii) the potential for ancient DNA damage was considered using the Briggs-Johnson model (Briggs et al. 2007). If most observations at a given position showed a gap, the consensus became a gap; otherwise, the base with the highest quality score (calculated by dividing each likelihood by the total likelihood) was used as the consensus.
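The final consensus decision at one alignment position might look like the sketch below. It assumes the per-base likelihoods have already been computed elsewhere; the overhang estimate and the damage model are not implemented here. The function name `call_consensus`, its arguments, and the reading of "most observations" as "more than half" are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

GAP = "-"


def call_consensus(likelihoods: Dict[str, float],
                   n_gap: int,
                   n_obs: int) -> Tuple[str, Optional[float]]:
    """Pick the consensus symbol for one alignment column.

    likelihoods: likelihood of the observed column under each candidate
                 original base (assumed precomputed, e.g. with a damage model)
    n_gap:       number of observations showing a gap at this position
    n_obs:       total number of observations at this position
    """
    # Assumption: "most observations" is read as strictly more than half.
    if n_obs > 0 and n_gap > n_obs / 2:
        return GAP, None

    total = sum(likelihoods.values())
    if total <= 0:
        raise ValueError("likelihoods must sum to a positive value")

    # Quality score: each likelihood divided by the total likelihood.
    best = max(likelihoods, key=likelihoods.get)
    return best, likelihoods[best] / total
```

For example, `call_consensus({"A": 0.9, "C": 0.05, "G": 0.03, "T": 0.02}, n_gap=0, n_obs=4)` selects `A` with a quality score close to 0.9, while the same likelihoods with `n_gap=3` yield a gap.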

At the current coverage, heterozygous sites will appear as low-quality bases, with the second base (not shown) having a likelihood similar to that of the consensus base. Likewise, heterozygous indels are included only by chance or may show up as stretches of low-quality bases.
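One way to picture this is to compare the normalized likelihood of the best base with that of the runner-up. The sketch below is purely illustrative: the function names and the `min_ratio` cutoff are hypothetical and are not values used to build the track.

```python
from typing import Dict, Tuple


def top_two(likelihoods: Dict[str, float]) -> Tuple[str, float, str, float]:
    """Return (best base, its score, second base, its score), where each
    score is the likelihood divided by the total likelihood."""
    if len(likelihoods) < 2:
        raise ValueError("need at least two candidate bases")
    total = sum(likelihoods.values())
    if total <= 0:
        raise ValueError("likelihoods must sum to a positive value")
    ranked = sorted(likelihoods.items(), key=lambda kv: kv[1], reverse=True)
    (b1, l1), (b2, l2) = ranked[0], ranked[1]
    return b1, l1 / total, b2, l2 / total


def looks_heterozygous(likelihoods: Dict[str, float],
                       min_ratio: float = 0.5) -> bool:
    """Flag a site whose runner-up score is close to the consensus score.
    The 0.5 default is an arbitrary illustrative cutoff."""
    _, q1, _, q2 = top_two(likelihoods)
    return q2 / q1 >= min_ratio
```

A site with likelihoods such as A = 0.48 and G = 0.46 would be flagged, and its consensus quality (about 0.48) is low even though the data are consistent with two real alleles.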

Credits

This track was produced at UCSC using data generated by Ed Green.

References

Briggs AW, Good JM, Green RE, Krause J, Maricic T, Stenzel U, Lalueza-Fox C, Rudan P, Brajkovic D, Kucan Z et al. Patterns of damage in genomic DNA sequences from a Neandertal. Proc Natl Acad Sci USA. 2007 Sep 11;104(37):14616-21.

Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, Patterson N, Li H, Zhai W, Fritz MH et al. A draft sequence of the Neandertal genome. Science. 2010 May 7;328(5979):710-22.