QTL Mapping using MAGIC Arabidopsis Recombinant Inbred Lines

Identifying natural allelic variation that underlies quantitative trait variation remains a fundamental problem in genetics. Most studies have employed either simple synthetic populations with restricted allelic variation or performed association mapping on a sample of naturally occurring haplotypes. Both of these approaches have some limitations, therefore alternative resources for the genetic dissection of complex traits continue to be sought. Here we describe one such alternative, the Multiparent Advanced Generation Inter-Cross (MAGIC). This approach is expected to improve the precision with which QTL can be mapped, improving the outlook for QTL cloning. Here, we present the first panel of MAGIC lines developed: a set of 527 recombinant inbred lines (RILs) descended from a heterogeneous stock of 19 intermated accessions of the plant Arabidopsis thaliana. These lines and the 19 founders were genotyped with 1,260 single nucleotide polymorphisms and phenotyped for development-related traits. Analytical methods were developed to fine-map quantitative trait loci (QTL) in the MAGIC lines by reconstructing the genome of each line as a mosaic of the founders. We show by simulation that QTL explaining 10% of the phenotypic variance will be detected in most situations with an average mapping error of about 300 kb, and that if the number of lines were doubled the mapping error would be under 200 kb. We also show how the power to detect a QTL and the mapping accuracy vary, depending on QTL location. We demonstrate the utility of this new mapping population by mapping several known QTL with high precision and by finding novel QTL for germination data and bolting time. Our results provide strong support for similar ongoing efforts to produce MAGIC lines in other organisms.


Create a working directory into which you will download the data.


Download the MAGIC genotypes, formatted for the HAPPY package analysis, into the analysis directory. Unpack them using the commands:

% gunzip magic.15012010.tar.gz
% tar xvf magic.15012010.tar
% ls chr*

chr1.MAGIC.alleles	    chr5.MAGIC.alleles	chr4.MAGIC.alleles	    chr3.MAGIC.alleles

These data are the genotypes of 703 MAGIC RILs at 1513 SNPs, formatted for analysis by the HAPPY R package.
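Before moving on, it can help to confirm that the archive unpacked completely. The following is a minimal sketch in R, assuming one chrN.MAGIC.alleles file per chromosome (Arabidopsis thaliana has five chromosomes) and following the file-name pattern in the listing above. The helper names expected.alleles.files and missing.alleles.files are hypothetical and are not part of magic.R.

```r
# Hypothetical helpers, not part of magic.R.
# Assumption: one file per chromosome, named chrN.MAGIC.alleles for N = 1..5.
expected.alleles.files <- function(chromosomes = 1:5) {
  paste0("chr", chromosomes, ".MAGIC.alleles")
}

# Return the expected files that are absent from `dir` (character(0) if none).
missing.alleles.files <- function(dir = ".", chromosomes = 1:5) {
  expected <- expected.alleles.files(chromosomes)
  expected[!file.exists(file.path(dir, expected))]
}

# Check the current working directory (the analysis directory).
absent <- missing.alleles.files(".")
if (length(absent) > 0) {
  message("Missing genotype files: ", paste(absent, collapse = ", "))
} else {
  message("All expected .alleles files are present.")
}
```

Run this from the analysis directory; an empty result means every expected file was found.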

R code

You will need to install the following R packages (make sure the environment variable R_LIBS includes the directory where these packages are installed).

For example, to install the downloaded happy.hbrem package into a directory of your choice (R_package_dir in the command below), type the command
 R CMD INSTALL -l R_package_dir happy.hbrem_2.4.tar.gz 
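To confirm that R can see the installed packages, a quick check along the following lines may be useful. It is only a sketch: the package names are inferred from this page (happy.hbrem from the install command, multicore and splines from the loading messages shown in the next section), and the complete list of required packages may differ.

```r
# Assumption: package names inferred from this page; the full list may differ.
required.packages <- c("happy.hbrem", "multicore", "splines")

# Library search path; directories listed in R_LIBS should appear here.
print(Sys.getenv("R_LIBS"))
print(.libPaths())

# requireNamespace() returns FALSE, rather than raising an error,
# when a package cannot be found on the library search path.
installed <- vapply(required.packages, requireNamespace, logical(1),
                    quietly = TRUE)
print(installed)
```

Any package reported as FALSE is not visible to R, which usually means that it is not installed or that R_LIBS does not point at its library directory.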

You will also need to download the following R scripts into the analysis directory:

magic.R, happy.preCC.R

Mapping QTLs

Test that everything is installed by starting an R session in the analysis directory:

> source("magic.R")
Loading required package:
Loading required package: multicore
Loading required package: splines

Now you need to build a database containing the HAPPY descent probability matrices. This step need only be done once. In R, type the command
> prepare.database()

which should generate the output

This will create a subdirectory called CONDENSED which contains the R binary versions of the probability matrices, and is used automatically by the subsequent QTL mapping.

Phenotype Data

An example phenotype file MAGIC.phenotype.example.12102015.txt is provided. Make sure your phenotype data file conforms exactly with these specifications:

  1. The file is tab-delimited.
  2. The first row contains the column headings.
  3. One column must be labelled SUBJECT.NAME and must contain the names of the MAGIC lines in the format MAGIC.N, where N is an integer (e.g. MAGIC.100). Note that these line designations are the same as those used by the stock centre, if you order seeds.
  4. Missing data are indicated by the symbol NA.
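The four requirements above can be checked before any mapping is attempted. The following is a minimal sketch in base R; the function check.phenotype.file is a hypothetical helper, not part of magic.R, and it tests only the rules listed here.

```r
# Hypothetical helper, not part of magic.R: checks a phenotype file
# against the four formatting rules listed above.
check.phenotype.file <- function(phenotype.file) {
  # Rules 1, 2 and 4: tab-delimited, header in the first row, NA for missing.
  pheno <- read.delim(phenotype.file, header = TRUE, sep = "\t",
                      na.strings = "NA", stringsAsFactors = FALSE,
                      check.names = FALSE)
  problems <- character(0)

  # Rule 3: a SUBJECT.NAME column holding names of the form MAGIC.N.
  if (!"SUBJECT.NAME" %in% colnames(pheno)) {
    problems <- c(problems, "no column labelled SUBJECT.NAME")
  } else {
    bad <- !grepl("^MAGIC\\.[0-9]+$", pheno$SUBJECT.NAME)
    if (any(bad)) {
      shown <- head(unique(pheno$SUBJECT.NAME[bad]), 5)
      problems <- c(problems,
                    paste("line names not of the form MAGIC.N:",
                          paste(shown, collapse = ", ")))
    }
  }

  # Columns that R reads as numeric are the candidates for QTL mapping.
  numeric.columns <- names(pheno)[vapply(pheno, is.numeric, logical(1))]
  list(problems = problems, numeric.columns = numeric.columns)
}

# Usage: check the example file if it is present in the working directory.
example.file <- "MAGIC.phenotype.example.12102015.txt"
if (file.exists(example.file)) print(check.phenotype.file(example.file))
```

A file that is not tab-delimited is read as a single column, so it is reported as lacking a SUBJECT.NAME column.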

The simplest way to perform the QTL mapping is to call the R function scan.phenotypes (its options are listed below) with the name of your phenotype file.

This performs the following steps:
  1. Loads the database of probability matrices from ./CONDENSED
  2. Maps QTLs for each column in the phenotype file that can be interpreted as numeric
    • Performs a genome scan with 1000 permutations (by default) to determine genome-wide thresholds for statistical significance. The summary statistics for the scans are written to text files with names of the form "phenotype.scan.txt". A binary R data object with all the scan information is written to "scans.RData".
    • Finds all QTLs where the logP of genetic association is genome-wide significant, with a permutation P-value < 0.1 (by default). These are written to text files with names of the form phenotype.qtls.txt. Note that at present confidence intervals are not provided; instead the island intervals are given, that is, the segments exceeding the genome-wide significance level.
    • Estimates founder accession effects at each QTL by multiple imputation. These are written to text files named "phenotype.marker.imputed.txt" and summarised graphically in the file "phenotype.accession.estimates.pdf".
  3. Plots histograms of phenotype values, by default to the file histogram.pdf.
  4. Plots genome scans, by default to the file scans.pdf. In the plots, chromosome boundaries are indicated by vertical red lines. The permutation-derived genome-wide thresholds at 50%, 90% and 95% are indicated by the grey horizontal lines. The positions of QTLs are indicated by orange dots.
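To illustrate how permutation-derived genome-wide thresholds work in general, here is a small self-contained sketch on simulated data. It is not the code used by magic.R: it tests simulated biallelic 0/1 markers with a single-marker regression, whereas the MAGIC analysis tests founder descent probabilities. All names and parameter values below are illustrative assumptions.

```r
# Illustrative sketch only; not the magic.R implementation.
set.seed(1)
n.lines   <- 200   # simulated lines (assumption)
n.markers <- 50    # simulated independent markers (assumption)
n.perm    <- 1000  # number of permutations

# Simulated 0/1 genotypes and a phenotype with a QTL at marker 20.
geno  <- matrix(rbinom(n.lines * n.markers, 1, 0.5), n.lines, n.markers)
pheno <- 1.0 * geno[, 20] + rnorm(n.lines)

# logP (-log10 P-value) of a single-marker linear regression at every marker,
# computed from the correlation: t = r * sqrt((n - 2) / (1 - r^2)).
scan.logP <- function(y, g) {
  r <- as.vector(cor(y, g))
  n <- length(y)
  t.stat <- r * sqrt((n - 2) / (1 - r^2))
  -log10(2 * pt(-abs(t.stat), df = n - 2))
}

observed <- scan.logP(pheno, geno)

# Null distribution: the genome-wide maximum logP after shuffling the
# phenotype, which breaks any genotype-phenotype association.
max.null <- replicate(n.perm, max(scan.logP(sample(pheno), geno)))

# Genome-wide thresholds at 50%, 90% and 95%, as drawn on the scan plots.
thresholds <- quantile(max.null, c(0.5, 0.9, 0.95))

# Permutation P-value per marker: the fraction of permuted maxima that
# reach the observed logP. Markers below 0.1 are reported as QTLs.
perm.p <- vapply(observed, function(x) mean(max.null >= x), numeric(1))
qtl <- which(perm.p < 0.1)

print(thresholds)
print(qtl)
```

Taking the maximum over the whole genome in each permutation is what makes the resulting thresholds genome-wide rather than per-marker.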

The command options to scan phenotypes are

 scan.phenotypes <- function( phenotype.file, phenotypes=NULL, dir="./CONDENSED", threshold=0.1, permute=1000, histogram.pdf="histogram.pdf", save.file="scans.RData", scan.plot.pdf= "scans.pdf", mc.cores=5) 
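As an illustration of how these options might be overridden, the sketch below builds an argument list and calls scan.phenotypes only if it has been defined (that is, after source("magic.R")) and the example phenotype file is present. The argument names come from the signature above; the values chosen here (a stricter threshold of 0.05 and 2 cores) are arbitrary examples, and the function's return value is not documented on this page, so it is not used.

```r
# Argument names are taken from the signature above; values are examples only.
scan.args <- list(
  phenotype.file = "MAGIC.phenotype.example.12102015.txt",
  threshold      = 0.05,  # stricter than the default permutation P-value of 0.1
  permute        = 1000,  # number of permutations (the default)
  mc.cores       = 2      # fewer cores than the default of 5
)

# Only run the scan when magic.R has been sourced and the file exists.
if (exists("scan.phenotypes") && file.exists(scan.args$phenotype.file)) {
  do.call(scan.phenotypes, scan.args)
} else {
  message("Run source(\"magic.R\") in the analysis directory first, ",
          "and make sure the phenotype file is present.")
}
```

Arguments left out of the list (dir, phenotypes and the output file names) keep the defaults shown in the signature.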