HAPPY R PACKAGE
We have implemented happy in the R programming language. The package is available for download. Choose from these archives:
The advantages of the R implementation over the original C version of HAPPY are:
- The dependence on the commercial NAG library has been removed, so the use of happy in R is now completely free (it is distributed under the GNU General Public License).
- The dynamic-programming engine in the original C version of happy has been retained as a library, which is linked to R at runtime.
- The range of statistical models that can be fit to the data is now much larger. Within the R package one can fit a wealth of linear and non-linear models. Multivariate trait analysis will be supported shortly.
- Support for multiple QTL, including tests for epistasis, is now included.
- Support for strain merging is included.
- Support for covariates is included.
- Plotting of QTL fits is supported.
- The input file formats are unchanged, although the ped file format is now also accepted.
- Full online documentation is provided, and is also available as PDF for versions 1.1, 2.0.2, 2.0.3, 2.0.4, 2.0.6 and 2.1.
Use of the R happy package is illustrated in this basic tutorial:
- Download the happy package. Install (under Linux) using a command such as
R CMD INSTALL happy_1.1.tar.gz
- Invoke an R session
- The happy package is loaded into R using the command
library(happy)
- The user creates a happy object inside R with a call like this:
h <- happy( 'happy.data', 'happy.alleles', generations=200 )
This will read in the data and alleles files and perform the dynamic-programming step of the analysis. The object h is used as a handle in subsequent model-fitting. It is actually a list of items, including the elements:
- h$markers: list of the marker names
- h$map: the genetic map of the markers, in centimorgan coordinates
- h$subjects: list of the subject names
- h$phenotypes: the phenotypes of the subjects
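The way these elements are reached can be sketched without the package or any data files. The list below is only a stand-in for the object returned by happy(): every value in it is invented, and the assumption that markers pair one-to-one with map positions (and subjects with phenotypes) is ours, not something the package documentation is quoted as guaranteeing.

```r
# Stand-in for the list returned by happy(); all values here are invented.
h <- list(
  markers    = c("m1", "m2", "m3"),
  map        = c(0.0, 1.5, 4.2),           # cM positions, one per marker (assumed)
  subjects   = c("s1", "s2", "s3", "s4"),
  phenotypes = c(10.2, 9.8, 11.5, 10.9)    # one value per subject (assumed)
)

# Elements are reached with the usual $ operator.
stopifnot(length(h$markers) == length(h$map))
stopifnot(length(h$subjects) == length(h$phenotypes))

# Pair each marker with its map position for a quick look.
marker.map <- data.frame(marker = h$markers, cM = h$map)
print(marker.map)
```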
- To fit a simple additive model to all the marker data (i.e. replicate the original C HAPPY analysis) use the command
fit <- hfit( h )
which returns a fit object with the results of the fit for each marker.
- One can fit a full model, allowing for interaction between the alleles within each locus, thus:
fit <- hfit( h, model='full' )
which will not only fit the model but also test, with a partial F-test, whether it is superior to the additive model.
- One can include covariates in the model by specifying an additional design matrix, X:
fit <- hfit( h, model='full', covariatematrix=X )
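The idea of a covariate design matrix and a partial F-test between nested models can be illustrated with base R alone. The sketch below uses lm() and anova() on simulated data; it is not happy's internal code. The variables sex, age, a1, a2 and y are invented, and the interaction term a1:a2 merely stands in for the extra within-locus terms of a full model. The intercept column is dropped from X because lm() adds its own; whether hfit() expects X with or without an intercept is not stated here.

```r
set.seed(1)
n <- 200

# Hypothetical covariates; model.matrix() expands the factor into dummy columns.
covars <- data.frame(sex = factor(sample(c("F", "M"), n, replace = TRUE)),
                     age = rnorm(n, mean = 10, sd = 2))
# Drop the intercept column, since lm() below supplies its own.
X <- model.matrix(~ sex + age, data = covars)[, -1, drop = FALSE]

# Two invented predictors standing in for the locus design matrix.
a1 <- rbinom(n, 2, 0.5)
a2 <- rbinom(n, 2, 0.3)
y  <- 0.5 * a1 + 0.3 * a2 + 0.4 * a1 * a2 + 0.1 * covars$age + rnorm(n)

# The additive model is nested inside the model with an interaction term;
# the covariates in X appear in both.
fit.additive <- lm(y ~ X + a1 + a2)
fit.full     <- lm(y ~ X + a1 + a2 + a1:a2)

# Partial F-test: does the extra term reduce the residual sum of squares
# by more than chance would predict?
cmp <- anova(fit.additive, fit.full)
print(cmp)
```

The second row of the printed table holds the F statistic and its P-value for the extra term.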
- The log-P values of the fit can be extracted using the command
write.table(fit$table)
and the results can be plotted using the command
happyplot(fit)
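Called without a file argument, write.table() prints to the console. To keep the results, one can pass a file name, as in the sketch below. The data frame is an invented stand-in for fit$table, and its column names (marker, cM, logP) are assumptions; the real table may be laid out differently.

```r
# Invented stand-in for fit$table; the real column names may differ.
tab <- data.frame(marker = c("m1", "m2", "m3"),
                  cM     = c(0.0, 1.5, 4.2),
                  logP   = c(0.7, 3.9, 1.2))

# Write a tab-separated file without quotes or row names.
out <- file.path(tempdir(), "happy_logp.txt")
write.table(tab, file = out, quote = FALSE, sep = "\t", row.names = FALSE)

# Read it back and pick the marker with the largest log-P.
back <- read.table(out, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
best <- back$marker[which.max(back$logP)]
print(best)
```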
Version 1.2 Notes (14/09/2004)
- This version now runs under Windows XP and Linux, using R version 1.9
- Limited support for multivariate phenotypes has been included (essentially they are still treated as a series of univariate analyses)
- Permutation tests are now implemented in hfit()
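The general principle of a permutation test in a marker scan can be sketched in base R. This is a generic illustration on invented data, using single-marker regressions with lm(); it does not reproduce how hfit() performs its permutations, and the names G, y and max.F are hypothetical.

```r
set.seed(42)
n <- 100
n.markers <- 20

# Invented data: a genotype-like matrix and a phenotype linked to marker 5.
G <- matrix(rbinom(n * n.markers, 2, 0.5), nrow = n)
y <- 0.8 * G[, 5] + rnorm(n)

# Largest F statistic over all single-marker regressions.
max.F <- function(y, G) {
  max(apply(G, 2, function(g) summary(lm(y ~ g))$fstatistic[1]))
}

observed <- max.F(y, G)

# Shuffling y breaks any genotype-phenotype link while keeping both
# marginal distributions, giving a null distribution for the maximum.
n.perm <- 200
null.max <- replicate(n.perm, max.F(sample(y), G))

# Scan-wide empirical P-value (the +1 avoids reporting exactly zero).
p.emp <- (sum(null.max >= observed) + 1) / (n.perm + 1)
print(p.emp)
```

Taking the maximum over markers within each permutation is what makes the resulting P-value account for testing many markers at once.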
Version 2.0.4 Notes (11/08/2006)
- This version now supports "genome cache" objects. When analysing whole genomes with multiple phenotypes, it is time-consuming to keep re-computing the happy design matrices. Moreover, for some applications involving fitting multiple QTL across the genome, it is very difficult to load multiple happy objects simultaneously because of memory limitations. The genome cache avoids these issues by saving a genome's worth of happy data to disk using the R delayed-data package "g.data". The objects can then be retrieved transparently as if they were in memory, using hdesign(), hfit() etc.
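The idea of keeping per-chromosome data on disk and loading it only on demand can be sketched with base R. Note that this uses saveRDS() and readRDS() as a swapped-in technique; it is not the g.data mechanism used by the package, and the names cache.dir and load.chr, as well as the placeholder matrices, are invented for illustration.

```r
# Disk-backed cache sketched with base R saveRDS()/readRDS();
# this is a swapped-in technique, not the g.data mechanism.
cache.dir <- file.path(tempdir(), "genome_cache")
dir.create(cache.dir, showWarnings = FALSE)

# Save one invented design matrix per chromosome, then drop it from memory.
for (chr in c("chr1", "chr2", "chr3")) {
  design <- matrix(runif(40), nrow = 10)   # placeholder data
  saveRDS(design, file.path(cache.dir, paste0(chr, ".rds")))
}
rm(design)

# Load a single chromosome only when it is needed.
load.chr <- function(chr, dir = cache.dir) {
  readRDS(file.path(dir, paste0(chr, ".rds")))
}

d2 <- load.chr("chr2")
print(dim(d2))
```

Only the chromosome currently being analysed occupies memory, which is the point of the cache.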
Limitations
- Bootstrapping is not yet available.
- The package is still under development and subject to change.
Please send questions, comments, and bug reports to Richard Mott.