This track displays variant base calls from the publicly released genome sequences of several individuals.
Most of these SNP calls have been lifted over from hg18 coordinates. Any SNPs that match the new reference sequence have been removed.
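The removal step described above can be sketched roughly as follows. This is only an illustration, not the pipeline actually used for the track: it assumes that "match" means every called allele equals the reference base at the lifted position, and the record keys (`chrom`, `pos`, `alleles`) and the `reference_bases` lookup are hypothetical.

```python
def drop_reference_matches(snps, reference_bases):
    """Keep only SNPs with at least one allele that differs from the reference.

    snps: list of dicts with hypothetical keys "chrom", "pos", "alleles".
    reference_bases: dict mapping (chrom, pos) to the reference base
                     (a stand-in for a real reference genome lookup).
    """
    kept = []
    for snp in snps:
        ref = reference_bases[(snp["chrom"], snp["pos"])].upper()
        # A call that only repeats the reference base is no longer a variant.
        if any(allele.upper() != ref for allele in snp["alleles"]):
            kept.append(snp)
    return kept
```

A heterozygous call such as C/T at a position where the reference is C is kept, because one allele still differs from the reference.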
Substitutions and indels are displayed as boxes. When read frequency data are available, they are displayed in the mouseover text (e.g. "T:8 G:3" means that 8 reads contained a T and 3 reads contained a G at that base position), and box colors are used to show the proportion of alleles. In the genome browser, when viewing the forward strand of the reference genome (the normal case), the displayed alleles are relative to the forward strand. When viewing the reverse strand of the reference genome ("reverse" button), the displayed alleles are reverse-complemented to match the reverse strand.
On the details page for each variant, the alleles are given for the forward strand of the reference genome. Frequency data are shown when available.
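As an illustration of the display conventions above, the following sketch parses mouseover text such as "T:8 G:3" into read counts, converts the counts to allele proportions, and reverse-complements an allele for the reverse-strand view. The function names are made up for this example and are not part of the browser's code.

```python
def parse_allele_counts(text):
    """Parse mouseover text such as "T:8 G:3" into {allele: read_count}."""
    counts = {}
    for field in text.split():
        allele, count = field.split(":")
        counts[allele] = int(count)
    return counts


def allele_proportions(counts):
    """Return the fraction of reads supporting each allele."""
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(allele):
    """Reverse-complement an allele; non-base characters such as "-" are left as-is."""
    return allele.translate(_COMPLEMENT)[::-1]
```

For "T:8 G:3", the T allele accounts for 8 of 11 reads and the G allele for 3 of 11; on the reverse strand the same two alleles would be shown as A and C.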
KB1, NB1, MD8, TK1, ABT (Penn State)
(Schuster et al.)
SNPs are from the allSNPs.txt file which can be downloaded
from Galaxy. The indels are also
available for download from Galaxy.
CEU trio NA12878, NA12891, NA12892; YRI trio NA19240, NA19238, NA19239
(1000 Genomes Project)
(1000 Genomes)
The variants shown are from the 1000 Genomes Project's March 2010
release.
The CEU variant calls were based on sequence data from the
Wellcome Trust Sanger Institute and the
Broad Institute, using the Illumina/Solexa platform.
The YRI variant calls were based on sequence data from the
Baylor College of Medicine Human Genome Sequencing Center and
Applied Biosystems, using the SOLiD platform.
For more information on the mapping, variant calling, filtering and
validation, see the
pilot 2 README file.
The variant calls are available from the March 2010 release subdirectory at
EBI and at
NCBI.
Complete Genomics 69 genomes (Complete Genomics)
(CG)
There are four sets of data: a Yoruba trio; a Puerto Rican trio; a 17-member, 3-generation pedigree; and a diversity panel representing 9 different populations. The CEPH samples within the pedigree and diversity sets are from the NIGMS Repository and the remainder from the NHGRI Repository, both housed at the Coriell Institute for Medical Research.
George Church, Misha Angrist, Rosalynn Gill, Henry Louis Gates Sr, Henry Louis Gates Jr (Personal Genome Project)
(PGP)
The variants for all but Church are from Trait-o-matic.
The numbers for Angrist are read counts; the number of reads supporting each allele was not given.
The variants for Church are from Complete Genomics.
Craig Venter (JCVI)
(Levy et al.)
An overview is given
here.
This subtrack contains Venter's single-base variants from the file
HuRef.InternalHuRef-NCBI.gff,
filtered to include only Method 1 variants (where each variant was kept in its original
form and not post-processed), and to exclude any variants that had N as an allele.
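The two filtering criteria above can be expressed as a simple predicate. This is a sketch only: it assumes each GFF line has already been parsed into a dict, and the keys `method` and `alleles` and the value "Method1" are hypothetical stand-ins, since the layout of the attributes in the source file is not described here.

```python
def keep_venter_variant(record):
    """Return True for Method 1 variants that have no N allele.

    record: dict with hypothetical keys "method" (str) and "alleles" (list of str).
    """
    is_method1 = record["method"] == "Method1"
    has_n_allele = any("N" in allele.upper() for allele in record["alleles"])
    return is_method1 and not has_n_allele


def filter_venter_variants(records):
    """Apply the predicate to an iterable of parsed records."""
    return [record for record in records if keep_venter_variant(record)]
```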
JCVI hosts a
genome browser.
James Watson
(CSHL)
(Wheeler et al.)
These single-base variants came from the file
watson_snp.gff.gz.
CSHL hosts a
genome browser.
Yoruba NA18507
(Illumina Cambridge/Solexa)
(Bentley et al.)
Illumina released the read sequences to the
NCBI Short Read Archive.
Aakrosh Ratan in the Miller Lab at Penn State University (PSU)
mapped the sequence reads to the reference genome and called
single-base variants using
MAQ.
YH (YanHuang Project)
(Wang et al.)
The YanHuang Project released these single-base variants from the
genome of a Han Chinese individual.
The data are available from the
YH database in the file
yhsnp_add.gff.
The YanHuang Project hosts a
genome browser.
SJK (GUMS/KOBIC)
(Ahn et al.)
Researchers at Gachon University of Medicine and Science (GUMS)
and the Korean Bioinformation Center (KOBIC)
released these single-base variants from the genome of Seong-Jin Kim.
The data are available from
KOBIC
in the file
KOREF-solexa-snp-X30_Q40d4D100.gff.
AK1 (Genomic Medicine Institute) (Kim et al.)
The variants shown are from the AK1_SNP.tar.gz download.
Stephen Quake (Stanford) (Pushkarev et al.)
The variants shown are from the Trait-o-matic
download.
Anonymous Irish male
(Tong et al.)
The SNPs shown are from the Galaxy library,
Irish whole genome.
Marjolein Kriek
(Leiden)
The SNPs shown are called by Belinda Giardine from PSU, from the BAM file
provided by Leiden University Medical Centre. The reads were aligned to the
hg19 build. SNP calls were made using samtools, with a minimum of 4 and a
maximum of 45 reads supporting the variant call. Calls with a quality
score of less than 30 were also filtered out.
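The thresholds described above amount to a filter like the one below. It is a minimal sketch under stated assumptions, not the commands actually run: the function and constant names are invented for illustration, and it takes the supporting read count and quality score as already extracted from the samtools output.

```python
MIN_SUPPORTING_READS = 4   # fewer reads than this: call is dropped
MAX_SUPPORTING_READS = 45  # more reads than this: call is dropped
MIN_QUALITY = 30           # scores below this are filtered out


def passes_filters(supporting_reads, quality):
    """Return True if a SNP call meets the read-count and quality thresholds."""
    depth_ok = MIN_SUPPORTING_READS <= supporting_reads <= MAX_SUPPORTING_READS
    return depth_ok and quality >= MIN_QUALITY
```

An upper bound on read count is a common way to avoid calls in regions where reads pile up, such as collapsed repeats, although the source does not state the reason for the cutoff of 45.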
Gregory Lucier
(Life Technologies)
The SNPs shown are from Nimbus Informatics. Sequencing was done using the Life SOLiD platform.
Palaeo-Eskimo Saqqaq individual (Saqqaq Genome Project) (Rasmussen et al.)
The variants shown are all the SNPs found by the SNPest program; a second
track shows only the
high-confidence SNPs from the first set.
Allele counts are not available for these tracks, but read depth is. The read depth is shown in place of the allele counts to give a measure of
the reliability of the call.
KB1, NB1, MD8, TK1, ABT (Penn State)
Schuster S.C., et al. Complete Khoisan and Bantu genomes from southern Africa. Nature 463, 943-947 (18 February 2010).
Le Roux, W., and White, A. The voices of the San living in Southern Africa today. Cape Town: Kwela Books (2004)
CEU trio NA12878, NA12891, NA12892; YRI trio NA19240, NA19238, NA19239 (1000 Genomes)
Analysis is underway for a manuscript on the pilot project; until publication,
please see
http://1000genomes.org/.
(See also the Science and
Nature Biotechnology
news articles describing the project.)
Complete Genomics 69 genomes
Complete Genomics
George Church, Misha Angrist, Rosalynn Gill, Henry Louis Gates Sr, Henry Louis Gates Jr
Personal Genome Project
Church (Complete Genomics version 1.2.0.14):
R. Drmanac, et al. Science, 5 November 2009
Craig Venter (JCVI)
Levy S, Sutton G, Ng PC, Feuk L, Halpern AL, Walenz BP, Axelrod N,
Huang J, Kirkness EF, Denisov G, et al.
The diploid genome sequence of an individual human.
PLoS Biol. 2007 Sep 4;5(10):e254.
James Watson (CSHL)
Wheeler DA, Srinivasan M, Egholm M, Shen Y, Chen L, McGuire A, He W,
Chen YJ, Makhijani V, Roth GT, et al.
The complete genome of an individual by massively parallel
DNA sequencing.
Nature. 2008 Apr 17;452(7189):872-6.
Yoruba NA18507 (Illumina Cambridge/Solexa)
Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG,
Hall KP, Evers DJ, Barnes CL, Bignell HR, et al.
Accurate whole human genome sequencing using reversible
terminator chemistry.
Nature. 2008 Nov 6;456(7218):53-9.
YH (YanHuang Project)
Wang J, Wang W, Li R, Li Y, Tian G, Goodman L, Fan W, Zhang J, Li J,
Zhang J, et al.
The diploid genome sequence of an Asian individual.
Nature. 2008 Nov 6;456(7218):60-5.
SJK (GUMS/KOBIC)
Ahn SM, Kim TH, Lee S, Kim D, Ghang H, Kim DS, Kim BC, Kim SY, Kim WY, Kim C,
et al.
The first Korean genome sequence and analysis: full genome
sequencing for a socio-ethnic group.
Genome Res. 2009 Sep;19(9):1622-9.
AK1 (Genomic Medicine Institute)
Jong-Il Kim, Young Seok Ju, Hansoo Park, Sheehyun Kim, Seonwook Lee,
Jae-Hyuk Yi, Joann Mudge, Neil A. Miller, Dongwan Hong, Callum J. Bell,
et al.
A highly annotated whole-genome sequence of a Korean individual.
Nature 460, 1011-1015 (20 August 2009).
Stephen Quake
Pushkarev D, Neff NF, Quake SR "Single-molecule Sequencing of an Individual Human Genome" Nature Biotech 27, 847-850 (10 August 2009) doi:10.1038
Anonymous Irish male
Tong et al.
Sequencing and analysis of an Irish human genome.
Genome Biology 2010, 11:R91.
Marjolein Kriek
Not published yet; data provided by Leiden University Medical Centre.
Gregory Lucier
Not published; data provided by Life Technologies and Nimbus Informatics.
Palaeo-Eskimo Saqqaq individual
Rasmussen M, Li Y, Lindgreen S, Pedersen JS, Albrechtsen A, Moltke I, Metspalu M, Metspalu E, Kivisild T, Gupta R,
et al.
Ancient Human Genome Sequence of an Extinct Palaeo-Eskimo.
Nature 463, 757-762 (11 February 2010).