Mouse Inbred Line Genotype Data

(16th April 2005)

This page contains information related to our project to genotype Recombinant Inbred Lines and Inbred Lines across 15360 SNPs.

The genotype data have now arrived. All the genotyping was performed by Illumina, San Diego. There are genotypes for 478 strains and 13377 successful SNP assays (a few more SNPs that are not mapped onto Build 33 of the mouse genome will be added shortly).

They can be downloaded as a series of chromosome-specific compressed text files by following these links:

chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chrX Everything (tarball)

The file format is space-separated text, suitable for viewing in Excel, with one row of data per strain. The first column gives the strain name. The remaining columns are the genotypes in the marker order specified by the SNP names in the first row of the file.
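As an illustration of the layout described above, here is a minimal Python sketch for reading one of these files. It is not part of the original distribution. It assumes that strain names contain no spaces and that the caller has already decompressed the file. Because the text does not say whether the first row carries a label above the strain column, the sketch accepts either layout. The function name read_genotypes and the sample rows are hypothetical.

```python
def read_genotypes(lines):
    """Parse space-separated genotype rows into (snp_names, {strain: calls}).

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    rows = [line.split() for line in lines if line.strip()]
    if not rows:
        return [], {}
    header, data = rows[0], rows[1:]
    # First field of each data row is the strain name, the rest are genotypes.
    genotypes = {fields[0]: fields[1:] for fields in data}
    # Assumption: the header may or may not include a label for the strain
    # column, so keep only as many trailing names as there are genotype calls.
    n_calls = len(data[0]) - 1 if data else len(header)
    snp_names = header[len(header) - n_calls:]
    return snp_names, genotypes


# Made-up rows purely for illustration; not taken from the real files.
example = ["strain rs1 rs2", "A/J A G", "AKR/J C G"]
snp_names, genotypes = read_genotypes(example)
```

Keeping the calls as a list per strain preserves the marker order given in the first row, so the i-th call always belongs to the i-th SNP name.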

We have performed some basic error checking and have not discovered any major problems, but please take note that:

  1. This is a beta release of the data and is subject to revision.
  2. The strand on which genotypes for a given SNP are reported in this dataset may not be the same as that reported for the same SNP in other published data sets, and in particular may be different from that in dbSNP.
  3. We have checked the standard inbred strains provided by the Jackson Laboratory against the published genotypes, and they agree almost perfectly, once strand differences are taken into account.
  4. The haplotype block structure for the Recombinant Inbred Lines in the set shows well-defined blocks, indicating that the genotypes are consistent with each other.
  5. So far we have NOT checked non-Jax strains (including the RIL) against published genotypes, so we can't rule out the possibility of a mix-up of samples. Therefore, if you suspect that anything is wrong, PLEASE LET US KNOW.
  6. If a strain that you expected to be on the list is missing PLEASE LET US KNOW.
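The strand issue in points 2 and 3 can be handled when comparing against other data sets by complementing the alleles. The Python sketch below is an illustration only, not the check we ran. It assumes genotypes are coded as nucleotide letters (A, C, G, T), which is not stated above, and it does not treat missing calls specially. The function names are hypothetical.

```python
# Complement table for single-base alleles; other characters pass through.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def flip_strand(genotype):
    """Return the genotype as it would be reported on the opposite strand."""
    return genotype.translate(COMPLEMENT)


def snp_orientation(ours, published):
    """Classify one SNP from paired calls for the same strains.

    Returns 'same', 'flipped' or 'inconsistent'.
    """
    pairs = list(zip(ours, published))
    if all(o == p for o, p in pairs):
        return "same"
    if all(flip_strand(o) == p for o, p in pairs):
        return "flipped"
    return "inconsistent"
```

Deciding the orientation once per SNP, across all shared strains, is safer than flipping call by call. Note that A/T and C/G SNPs look the same on both strands, so this comparison cannot tell a strand flip from a swapped allele for them.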

Note that the following samples failed: NIH/OLa FVBS/Ant PWK/Pas PWK/Rbrc PWK/Ros MusSpretusOutbred1 MusSpretusOutbred2 strain1050185 BXD82 MUSpaha PWD/PhJ(36186) CZECH11/Ei(35213) JF1/MS(35242) CASA/RkJ CAST/Ei CZECHI/EiJ MOLC/RkJ MOLD/RkJ MOLFEiJ MSM/Ms PANCEVO/EiJ PWK/PhJ SKIVE/EiJ SPRET/EiJ CIM(M.m.musculuswild derived) CTP(M.m.musculuswild derived) MAI(M.m.musculuswild derived) MBT(M.m.musculuswild derived) L6.

Conditions of Use

  1. These data are freely available.
  2. There are no constraints on the use of the data, but if you redistribute them, include a reference to this site, http://www.well.ox.ac.uk/mouse/INBREDS, in your distribution.
  3. If you make use of the data, please reference this web site http://www.well.ox.ac.uk/mouse/INBREDS.
  4. We do not provide any guarantee that the data are correct. CAVEAT EMPTOR.

List of SNPs that produced successful assays. These SNPs have also been typed across 2300 HS mice to fine-map multiple QTL in parallel.

Original list of SNPs submitted for genotyping.

The file is a comma-separated text file in Illumina format. The columns are: SNP_Name,Sequence,Genome_Build_Version,Chr,Coordinate,Source,dbSNP_Version,Ploidy,Species,Customer_Strand. Note that all SNPs have been remapped onto Build 33 of the mouse genome and, where possible, renamed by their dbSNP rs number. The original SNP name is included in the source information.
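A short Python sketch for reading a file with these columns follows. It is an illustration under assumptions: the text does not say whether the file begins with a header row, so the hypothetical has_header flag covers both cases, and the sketch assumes every Coordinate value is an integer. The sample record is invented.

```python
import csv
import io

# Column names as listed in the text above.
COLUMNS = [
    "SNP_Name", "Sequence", "Genome_Build_Version", "Chr", "Coordinate",
    "Source", "dbSNP_Version", "Ploidy", "Species", "Customer_Strand",
]


def read_snp_list(handle, has_header=True):
    """Yield one dict per SNP from an Illumina-style comma-separated file."""
    # Assumption: without a header row, columns appear in the order above.
    fieldnames = None if has_header else COLUMNS
    for row in csv.DictReader(handle, fieldnames=fieldnames):
        row["Coordinate"] = int(row["Coordinate"])  # 1-based position
        yield row


# Made-up record purely for illustration; not taken from the real file.
example = io.StringIO(
    ",".join(COLUMNS) + "\n"
    "rs0000001,ACGT[A/G]TTGA,33,1,1234567,GNF,0,diploid,Mus musculus,TOP\n"
)
snps = list(read_snp_list(example))
```

Each record is then addressable by column name, for example snps[0]["Chr"] and snps[0]["Coordinate"].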

Coordinates start at 1, i.e. they follow the dbSNP convention, which differs from UCSC coordinates, which start at 0.
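For anyone loading these positions into tools that expect UCSC-style intervals, the conversion is a simple offset. The Python sketch below shows it, assuming the usual UCSC representation of a single base as a zero-based, half-open interval. The function names are hypothetical.

```python
def to_ucsc_interval(position):
    """Convert a 1-based SNP position to a 0-based, half-open (start, end)."""
    if position < 1:
        raise ValueError("1-based positions must be >= 1")
    return position - 1, position


def from_ucsc_start(start):
    """Convert a 0-based UCSC start back to a 1-based position."""
    if start < 0:
        raise ValueError("0-based starts must be >= 0")
    return start + 1
```

For example, a SNP listed here at position 100 corresponds to the UCSC interval with start 99 and end 100.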

The SNPs were selected as follows. Where possible we used validated SNPs known to be polymorphic on at least some of the eight strains A/J, AKR, Balb/cJ, DBA2/J, C57BL/6J, LP/J, I, RIII, although in many cases we did not have full strain distribution data. About 7000 SNPs were contributed by GNF (Tim Wiltshire), 7000 by Merck (Eric Schadt), and 1600 by JAX (Petko Petkov). We thinned out SNPs closer than 50kb with identical strain distribution patterns. We then identified all gaps > 500kb and looked for SNPs to fill them, using Celera, Czech and Affy SNPs (provided by Mark Daly and Rob Williams). We only included SNPs that mapped uniquely to Build 33 of the mouse assembly according to BLAT (thanks to Martin Taylor).
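The thinning and gap-finding steps can be sketched in Python as below. This is one possible reading of the procedure, not the code that was actually used. It assumes the SNPs for a single chromosome are sorted by position, that each strain distribution pattern is available as a comparable value such as a string, and that a SNP is compared only with the most recently kept SNP. The function names and the sample values are hypothetical.

```python
def thin_snps(snps, min_spacing=50_000):
    """Drop SNPs within min_spacing of the last kept SNP with the same pattern.

    `snps` is a list of (position, pattern) tuples for one chromosome,
    sorted by position.
    """
    kept = []
    for pos, pattern in snps:
        if kept:
            last_pos, last_pattern = kept[-1]
            if pos - last_pos < min_spacing and pattern == last_pattern:
                continue  # redundant: close by and carries the same pattern
        kept.append((pos, pattern))
    return kept


def find_gaps(positions, min_gap=500_000):
    """Return (left, right) position pairs whose spacing exceeds min_gap."""
    return [(a, b) for a, b in zip(positions, positions[1:]) if b - a > min_gap]


# Made-up positions and patterns purely for illustration.
example = [(0, "AAB"), (10_000, "AAB"), (20_000, "ABA"), (100_000, "AAB")]
thinned = thin_snps(example)
gaps = find_gaps([0, 100_000, 700_000])
```

In this example the SNP at 10,000 is dropped because it repeats the pattern of its neighbour 10kb away, and the only reported gap is the 600kb stretch between 100,000 and 700,000.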

We added a few special SNPs that determine the MHC alleles, the tyrosinase and agouti loci, and the mitochondrion. We included SNPs mapping to unordered chromosomal fragments (like 7_random) because these are likely to become part of the assembly in the future.

The resulting set of SNPs is fairly uniformly distributed on Build 33; see final-9-2-5.space.txt. When the next build comes out no doubt much of this careful work will be undone, and new gaps will appear, but this is the best we could do at the moment. We are confident that most of the SNPs will work and be polymorphic. The exceptions are those SNPs derived from the Czech mouse; we expect fewer than half of these to be polymorphic. Making this selection was surprisingly time-consuming and difficult. In particular, filling the gaps was hard. Whether these gaps really are regions with few SNPs, or are caused by errors in the mouse genome assembly, or are caused by SNP ascertainment problems, remains to be seen.

Spacing of selected SNPs.

Haplotype block structure of Recombinant Inbred Lines inferred from the data.

Many thanks to Tim Wiltshire, Mathew Pletcher (GNF), Eric Schadt (Rosetta/Merck), Petko Petkov (JAX), Mark Daly/Andrew Kirby (MIT/Broad), Rob Williams (Tennessee), Christophe Benoiste (Harvard) for providing SNP information. (The source of each SNP is indicated in the file.)

Final list of lines and strains sent to Illumina for genotyping.

List in csv text format.

Many thanks to the following people for providing Mouse DNA samples: Christophe Benoiste, Chris Ebeling, Beth Bennett, Lu Lu, Daniel Pomp, David Keays, Robert Reis, Grant Morahan, Gudrun Brockmann, Hiroke Nagase, Howard Gershenfeld, Jim Cheverud, Jimmy Spearow, Jonathan Flint, Kathy Hood, Molly Bogue/Susan Deveau, Morley/Haywood, Peter Demant, Petko Petkov, Rob Williams, Simon Horvat, Steve Clapcote, Xavier Montagutelli.

Contact Richard Mott or Jonathan Flint

 