A Multiparent Advanced Generation Inter-Cross to Fine-Map Quantitative Traits in Arabidopsis thaliana. Paula X. Kover, William Valdar, Joseph Trakalo, Nora Scarcelli, Ian M. Ehrenreich, Michael D. Purugganan, Caroline Durrant, Richard Mott. PLoS Genetics 2009. doi:10.1371/journal.pgen.1000551.
Identifying natural allelic variation that underlies quantitative trait variation remains a fundamental problem in genetics. Most studies have employed either simple synthetic populations with restricted allelic variation or performed association mapping on a sample of naturally occurring haplotypes. Both of these approaches have some limitations, therefore alternative resources for the genetic dissection of complex traits continue to be sought. Here we describe one such alternative, the Multiparent Advanced Generation Inter-Cross (MAGIC). This approach is expected to improve the precision with which QTL can be mapped, improving the outlook for QTL cloning. Here, we present the first panel of MAGIC lines developed: a set of 527 recombinant inbred lines (RILs) descended from a heterogeneous stock of 19 intermated accessions of the plant Arabidopsis thaliana. These lines and the 19 founders were genotyped with 1,260 single nucleotide polymorphisms and phenotyped for development-related traits. Analytical methods were developed to fine-map quantitative trait loci (QTL) in the MAGIC lines by reconstructing the genome of each line as a mosaic of the founders. We show by simulation that QTL explaining 10% of the phenotypic variance will be detected in most situations with an average mapping error of about 300 kb, and that if the number of lines were doubled the mapping error would be under 200 kb. We also show how the power to detect a QTL and the mapping accuracy vary, depending on QTL location. We demonstrate the utility of this new mapping population by mapping several known QTL with high precision and by finding novel QTL for germination data and bolting time. Our results provide strong support for similar ongoing efforts to produce MAGIC lines in other organisms.
Create a working directory into which you will download the data.
Download the MAGIC genotypes, formatted for the HAPPY package analysis, into the analysis directory. Unpack them using the commands
% gunzip magic.15012010.tar.gz
% tar xvf magic.15012010.tar
% ls chr*
chr1.MAGIC.alleles  chr2.MAGIC.data     chr3.MAGIC.map      chr5.MAGIC.alleles
chr1.MAGIC.data     chr2.MAGIC.map      chr4.MAGIC.alleles  chr5.MAGIC.data
chr1.MAGIC.map      chr3.MAGIC.alleles  chr4.MAGIC.data     chr5.MAGIC.map
chr2.MAGIC.alleles  chr3.MAGIC.data     chr4.MAGIC.map
These data are genotypes of 703 MAGIC RILs genotyped at 1513 SNPs, formatted for analysis by the HAPPY R package.
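Before going further, it can help to confirm that the archive unpacked completely. The sketch below checks for the fifteen files shown in the listing above (an alleles, data and map file for each of the five chromosomes). The function name check_magic_files is a hypothetical helper introduced here for illustration; it is not part of the MAGIC distribution.

```shell
#!/bin/sh
# check_magic_files: hypothetical helper, not part of the MAGIC distribution.
# Reports which of the 15 expected HAPPY input files are absent from a directory
# and returns a non-zero status if any are missing.
check_magic_files() {
    dir="${1:-.}"
    missing=0
    for chr in 1 2 3 4 5; do
        for ext in alleles data map; do
            f="$dir/chr$chr.MAGIC.$ext"
            if [ ! -f "$f" ]; then
                echo "missing: $f"
                missing=$((missing + 1))
            fi
        done
    done
    echo "$missing file(s) missing"
    [ "$missing" -eq 0 ]
}

# Usage: check the current (analysis) directory.
check_magic_files . || echo "unpack magic.15012010.tar.gz again before continuing"
```

Run from the analysis directory, the function prints one line per absent file followed by a count, so a partial download or an interrupted tar extraction shows up immediately.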
You will need to install the following R packages (make sure the environment variable R_LIBS includes the directory in which these packages are installed).
For example, to install the downloaded happy.hbrem package to the directory R_package_dir, type the command
R CMD INSTALL -l R_package_dir happy.hbrem_2.4.tar.gz
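The install command and the R_LIBS setting can be combined in one short shell session. This is a minimal sketch, assuming a POSIX shell, that the tarball sits in the current directory, and that R_package_dir is created beneath it; the variable names pkg_dir and tarball are introduced here for illustration only. R CMD INSTALL -l normally expects the target library directory to exist already, so the sketch creates it first.

```shell
#!/bin/sh
# pkg_dir and tarball are illustrative names; R_package_dir is the example
# library directory used in the text, assumed here to live in the current directory.
pkg_dir="$PWD/R_package_dir"
tarball="happy.hbrem_2.4.tar.gz"

# Create the library directory before installing into it.
mkdir -p "$pkg_dir"

# Prepend the directory to R_LIBS (colon-separated on Unix-like systems) so that
# R sessions started from this shell can find the installed package.
R_LIBS="$pkg_dir${R_LIBS:+:$R_LIBS}"
export R_LIBS

# Install only if the tarball has actually been downloaded.
if [ -f "$tarball" ]; then
    R CMD INSTALL -l "$pkg_dir" "$tarball"
else
    echo "$tarball not found; download it into this directory first"
fi
```

To make the setting persistent, the R_LIBS assignment can be placed in the shell start-up file; otherwise it applies only to R sessions launched from the same shell.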
You will also need to download the following R scripts into the analysis directory
Test that everything is installed by starting an R session in the analysis directory:
mus [70]% R
WARNING: ignoring environment value of R_HOME

R version 2.9.1 (2009-06-26)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> source("magic.R")
Loading required package: g.data
Loading required package: multicore
Loading required package: splines
>

Now you need to build a database containing the HAPPY descent probability matrices. This step need only be done once. In R type the command
> prepare.database()
which should generate the output
./chr1.MAGIC.data TRUE
./chr1.MAGIC.alleles TRUE
./chr1.MAGIC.map TRUE
mindist: 1e-05
datafile ./chr1.MAGIC.data allelesfile ./chr1.MAGIC.alleles gen 7
genotype phase: unknown
./chr2.MAGIC.data TRUE
./chr2.MAGIC.alleles TRUE
./chr2.MAGIC.map TRUE
mindist: 1e-05
datafile ./chr2.MAGIC.data allelesfile ./chr2.MAGIC.alleles gen 7
genotype phase: unknown
./chr3.MAGIC.data TRUE
./chr3.MAGIC.alleles TRUE
./chr3.MAGIC.map TRUE
mindist: 1e-05
datafile ./chr3.MAGIC.data allelesfile ./chr3.MAGIC.alleles gen 7
genotype phase: unknown
./chr4.MAGIC.data TRUE
./chr4.MAGIC.alleles TRUE
./chr4.MAGIC.map TRUE
mindist: 1e-05
datafile ./chr4.MAGIC.data allelesfile ./chr4.MAGIC.alleles gen 7
genotype phase: unknown
./chr5.MAGIC.data TRUE
./chr5.MAGIC.alleles TRUE
./chr5.MAGIC.map TRUE
mindist: 1e-05
datafile ./chr5.MAGIC.data allelesfile ./chr5.MAGIC.alleles gen 7
genotype phase: unknown
a->markers 211
Reading phenotype and genotype data from ped file ./chr2.MAGIC.data
a->markers 275
Reading phenotype and genotype data from ped file ./chr1.MAGIC.data
a->markers 251
Reading phenotype and genotype data from ped file ./chr3.MAGIC.data
a->markers 230
Reading phenotype and genotype data from ped file ./chr4.MAGIC.data
a->markers 292
Reading phenotype and genotype data from ped file ./chr5.MAGIC.data
Number of individuals: 703
Number of markers: 211
Number of strains: 19
Use Parents: no
Number of subjects with two parents: 0
null model mean nan var nan
assuming haploid(inbred) genotypes
dfile ./chr2.MAGIC.data afile ./chr2.MAGIC.alleles gen 7
Number of individuals: 703
Number of markers: 230
Number of strains: 19
Use Parents: no
Number of subjects with two parents: 0
null model mean nan var nan
assuming haploid(inbred) genotypes
dfile ./chr4.MAGIC.data afile ./chr4.MAGIC.alleles gen 7
Number of individuals: 703
Number of markers: 251
Number of strains: 19
Use Parents: no
Number of subjects with two parents: 0
null model mean nan var nan
assuming haploid(inbred) genotypes
dfile ./chr3.MAGIC.data afile ./chr3.MAGIC.alleles gen 7
Number of individuals: 703
Number of markers: 275
Number of strains: 19
Use Parents: no
Number of subjects with two parents: 0
null model mean nan var nan
assuming haploid(inbred) genotypes
dfile ./chr1.MAGIC.data afile ./chr1.MAGIC.alleles gen 7
Number of individuals: 703
Number of markers: 292
Number of strains: 19
Use Parents: no
Number of subjects with two parents: 0
null model mean nan var nan
assuming haploid(inbred) genotypes
dfile ./chr5.MAGIC.data afile ./chr5.MAGIC.alleles gen 7
>
This will create a subdirectory called CONDENSED which contains the R binary versions of the probability matrices, and is used automatically by the subsequent QTL mapping.
An example phenotype file, MAGIC.phenotype.example.12102015.txt, is provided. Make sure your phenotype data file conforms exactly to these specifications:
The simplest way to perform the QTL mapping is with the R command
scan.phenotypes(phenotypefile)
This performs the following steps:
The start of the screen output should look like this:
> scan.phenotypes("../MAGIC.phenotype.example.12102015.txt")
loading condensed db ./CONDENSED model additive
read 1254 matrices
loading genome summary
loading condensed summary
reading phenotype file ../MAGIC.phenotype.example.04012009.txt
Analysing numeric phenotypes bolt.to.flower days.to.bolt days.to.germ leaves.day.28.given.days.to.germ
plotting histograms of phenotypes to histogram.pdf
scanning bolt.to.flower ~ 1
426 subjects analysed with phenotypes
Analysis of Variance Table

Response: bolt.to.flower
           Df  Sum Sq Mean Sq F value Pr(>F)
Residuals 425 2623.91    6.17
writing parameter estimates to bolt.to.flower.MN4_142943.imputed.txt
writing parameter estimates to bolt.to.flower.FLC_3090.imputed.txt
plotting estimates to bolt.to.flower.accession.estimates.pdf
scanning days.to.bolt ~ 1
426 subjects analysed with phenotypes
Analysis of Variance Table

Response: days.to.bolt
           Df  Sum Sq Mean Sq F value Pr(>F)
Residuals 425 16823.6    39.6
writing parameter estimates to days.to.bolt.MN1_21908389.imputed.txt
writing parameter estimates to days.to.bolt.MASC02069.imputed.txt
writing parameter estimates to days.to.bolt.MASC03765.imputed.txt
writing parameter estimates to days.to.bolt.FRI_2343.imputed.txt
writing parameter estimates to days.to.bolt.MN5_3177504.imputed.txt
plotting estimates to days.to.bolt.accession.estimates.pdf
scanning days.to.germ ~ 1
426 subjects analysed with phenotypes
Analysis of Variance Table

Response: days.to.germ
           Df  Sum Sq Mean Sq F value Pr(>F)
Residuals 425 1376.30    3.24
scanning leaves.day.28.given.days.to.germ ~ 1
426 subjects analysed with phenotypes
Analysis of Variance Table

Response: leaves.day.28.given.days.to.germ
           Df  Sum Sq Mean Sq F value Pr(>F)
Residuals 344 1441.22    4.19
writing parameter estimates to leaves.day.28.given.days.to.germ.MN4_48812OK.imputed.txt
writing parameter estimates to leaves.day.28.given.days.to.germ.FRI_1888.imputed.txt
writing parameter estimates to leaves.day.28.given.days.to.germ.MN4_428535.imputed.txt
plotting estimates to leaves.day.28.given.days.to.germ.accession.estimates.pdf
saving scans to binary file scans.RData
plotting scans to scans.pdf
writing scans to bolt.to.flower.scan.txt
writing qtls to bolt.to.flower.qtls.txt
writing scans to days.to.bolt.scan.txt
writing qtls to days.to.bolt.qtls.txt
writing scans to days.to.germ.scan.txt
writing qtls to days.to.germ.qtls.txt
writing scans to leaves.day.28.given.days.to.germ.scan.txt
writing qtls to leaves.day.28.given.days.to.germ.qtls.txt
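According to the log above, each scanned phenotype ends up with a <phenotype>.scan.txt and a <phenotype>.qtls.txt file in the working directory. A quick way to confirm that a run finished for every phenotype is sketched below. The helper check_scan_outputs is hypothetical (it is not provided by magic.R), and it tests only that the files exist, since a phenotype with no detected QTL may plausibly produce a near-empty qtls file.

```shell
#!/bin/sh
# check_scan_outputs: hypothetical helper, not part of magic.R.
# For each phenotype name given, verify that the two per-phenotype text files
# named in the scan log are present in the current directory.
check_scan_outputs() {
    status=0
    for pheno in "$@"; do
        for suffix in scan.txt qtls.txt; do
            f="$pheno.$suffix"
            if [ -f "$f" ]; then
                echo "ok: $f"
            else
                echo "missing: $f"
                status=1
            fi
        done
    done
    return "$status"
}

# Usage with the four phenotypes of the example file.
check_scan_outputs bolt.to.flower days.to.bolt days.to.germ \
    leaves.day.28.given.days.to.germ || echo "some scan outputs are missing"
```

The same check can be pointed at your own phenotype names; the non-zero return status makes it easy to use in a larger script.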
The options of the scan.phenotypes command are
scan.phenotypes <- function( phenotype.file, phenotypes=NULL, dir="./CONDENSED", threshold=0.1, permute=1000, histogram.pdf="histogram.pdf", save.file="scans.RData", scan.plot.pdf="scans.pdf", mc.cores=5)
where