The GSCANDB software

This page describes the GSCANDB software:

System Requirements

  1. A Linux server running instances of MySQL and Apache, or their equivalents on another operating system.
  2. A MySQL account with privileges to create, drop and alter databases.
  3. Java (JDK 1.6) and Ant must be installed on your system; they are used to create and populate the gscan database.
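
Before continuing, it can be convenient to confirm that the required command-line tools are reachable. The following is a minimal Python sketch and not part of GSCANDB. It assumes the executables are called java, ant and mysql and are on your PATH, which may not match your installation. It only checks that the tools are present, not their versions.

```python
import shutil

# Assumed executable names; adjust them to match your installation.
REQUIRED_TOOLS = ["java", "ant", "mysql"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```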

Download, configuration and installation

  1. Download and unzip the file: gscandb_v1.zip
  2. Download and unzip the biomart files: biomart.zip
  3. Go to the gscandb_v1 directory
    cd gscandb_v1
  4. Ask your web server administrator to create a web alias for the gscandb_v1 directory.
  5. Configure the database.properties file (located in the etc sub-directory of gscandb_v1) by replacing xxx with your username and password.
  6. Create the database and build the tables by issuing the command
     ant
    from within the gscandb_v1 directory.
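
Because step 5 asks you to replace the xxx placeholders by hand, a quick check before running ant can catch a forgotten edit. The sketch below is only an illustration. The path etc/database.properties and the xxx placeholder come from step 5, but the helper names are hypothetical, and the check will also flag any legitimate value that happens to contain xxx.

```python
from pathlib import Path

PLACEHOLDER = "xxx"  # the placeholder text named in step 5


def lines_with_placeholder(text, placeholder=PLACEHOLDER):
    """Return (line_number, line) pairs that still contain the placeholder."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if placeholder in line
    ]


def check_properties(path="etc/database.properties"):
    """Print lines of the properties file that still need editing.

    Returns True when no placeholder is left. Run it from the gscandb_v1 directory.
    """
    leftovers = lines_with_placeholder(Path(path).read_text())
    for number, line in leftovers:
        print(f"{path}:{number}: still contains '{PLACEHOLDER}': {line}")
    return not leftovers
```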

The Genome Scan Viewer

  1. Check that the web server works by pointing your browser at the URL:
    http://localhost/yourGScanWebAlias/wwwqtl.cgi
  2. This should generate a web page with the header 'Genome Scan Viewer', similar to that on our GSCANDB web site, http://gscan.well.ox.ac.uk/gs/wwwqtl.cgi, except that the scrolling lists and pull-down menus will be empty.
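
The same check can be scripted. The sketch below is an assumption-based illustration, not a GSCANDB tool: it fetches the page and looks for the 'Genome Scan Viewer' header text mentioned above. The function names are hypothetical, and yourGScanWebAlias must be replaced with the alias created during installation.

```python
from urllib.request import urlopen

EXPECTED_HEADER = "Genome Scan Viewer"


def looks_like_viewer(html):
    """Return True if the page text contains the expected header."""
    return EXPECTED_HEADER in html


def check_viewer(url, timeout=10):
    """Fetch `url` and report whether it looks like the Genome Scan Viewer."""
    with urlopen(url, timeout=timeout) as response:
        html = response.read().decode("utf-8", errors="replace")
    return looks_like_viewer(html)


# Example call (needs a running web server, so it is left commented out):
# check_viewer("http://localhost/yourGScanWebAlias/wwwqtl.cgi")
```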

Uploading data

  1. The database will need to be populated with your data. We provide some example data from our mouse QTL mapping experiment in the compressed archive gscandb.examples.zip. Download and unpack it, preferably into a directory other than the gscandb directory. The directory contains comma-separated files, whose format and content are described in the Input files section below.
  2. GSCANDB can be populated using different arguments, depending on whether it is being populated for the first time or whether data is being updated or added to the database.

Input files

    All the input files should be comma separated.
  1. marker.csv
    containing basic marker information
  2. marker_mapping.csv
    containing positions of the markers on genome builds
  3. sample.csv
    containing information about samples (individuals with genotypes)
  4. genotype.csv
    containing the genotypes of the markers on the samples
  5. hapmap.csv
    containing haplotype map information for the markers
  6. files named
    Biochem.ALP.chr*.scan
    containing genome-scan data for one phenotype across 20 chromosomes in a special format described below.
  7. threshold.csv
    containing significance threshold information for genome scans
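
A short script can report which of the files listed above are present before an upload is attempted. This is a minimal sketch under stated assumptions: the CSV names are taken from the list, the pattern *.chr*.scan is an assumed generalisation of Biochem.ALP.chr*.scan to other phenotypes, and whether every file is mandatory is not stated here.

```python
import fnmatch
import os

# File names taken from the list above.
EXPECTED_CSV = [
    "marker.csv",
    "marker_mapping.csv",
    "sample.csv",
    "genotype.csv",
    "hapmap.csv",
    "threshold.csv",
]
SCAN_PATTERN = "*.chr*.scan"  # assumption: scan files for any phenotype


def summarise_input_files(names):
    """Return (missing_csv_files, scan_files_found) for a list of file names."""
    present = set(names)
    missing = [name for name in EXPECTED_CSV if name not in present]
    scans = sorted(fnmatch.filter(names, SCAN_PATTERN))
    return missing, scans


def summarise_input_dir(directory):
    """Apply summarise_input_files to the contents of `directory`."""
    return summarise_input_files(os.listdir(directory))
```

For example, summarise_input_dir(".") run inside the unpacked example directory lists any expected CSV file that is absent, together with the scan files that were found.
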
  • The headers for the csv files are as follows. Null fields should be entered as ",,". Fields in bold cannot be null.
    TABLE NAME        FIELD NAMES
    genome_build      name, date, species, comments, ensembl, ensembldb, ensemblspecies, liftover
    phenotype         name, description, public_name
    population        name, species, size, comments
    marker            name, marker_type, leftseq, rightseq, alias
    marker_mapping    marker, genome_build, chromosome, bp_position, strand, cm
    trait_locus       name, population, genome_scan, subscan_label, phenotype, marker1, marker2, species, chromosome, start_bp, end_bp, threshold, score, peak, label, comment, url
    sample            name, gender, notes
    genotype          marker, sample, genotype
    hapblock          genome_build, chromosome, marker_start, marker_end, info
    chromosome        name, genome_build, length
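
To illustrate the header and null-field conventions, the sketch below reads CSV text for a few of the tables above and maps empty fields (entered as ",,") to None. It is not GSCANDB's own loader. The function name is hypothetical, the example row is invented, and it assumes that whitespace around header names is not significant.

```python
import csv
import io

# Field names copied from the table above for four of the tables.
EXPECTED_FIELDS = {
    "marker": ["name", "marker_type", "leftseq", "rightseq", "alias"],
    "marker_mapping": ["marker", "genome_build", "chromosome", "bp_position", "strand", "cm"],
    "sample": ["name", "gender", "notes"],
    "genotype": ["marker", "sample", "genotype"],
}


def read_table(table, handle):
    """Parse CSV text for `table`, mapping empty (null) fields to None."""
    reader = csv.reader(handle)
    header = [name.strip() for name in next(reader)]
    if header != EXPECTED_FIELDS[table]:
        raise ValueError(f"{table}: unexpected header {header}")
    rows = []
    for values in reader:
        if not values:
            continue  # skip blank lines
        if len(values) != len(header):
            raise ValueError(f"{table}: expected {len(header)} fields, got {len(values)}")
        rows.append({key: (value.strip() or None) for key, value in zip(header, values)})
    return rows


# Hypothetical row: the three trailing fields are null.
example = io.StringIO("name, marker_type, leftseq, rightseq, alias\nm1,SNP,,,\n")
print(read_table("marker", example))
```

Checking the header before reading any rows means a file with reordered or misspelled columns is rejected early rather than loaded into the wrong fields.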

  • Most of the fields in the tables are self-explanatory, but in detail:
  • In GSCANDB a genome scan is associated with a mapping population, a phenotype and a genome build. Each scan contains one or more named subscans. A subscan is a series of quantitative measurements along the genome, where each measurement is associated with a marker or marker interval. The subscan mechanism is useful for storing different analyses of the same underlying data. For example, we analyse all our phenotypes in at least four ways, looking for single-point additive and dominance effects and multipoint additive and dominance effects. Note that the marker order of the data depends on the genome build and is therefore defined by uploading marker_mapping files. Although genome scan files may contain positional information, this is ignored. Genome scan input files have two accepted formats:

    Examples

      Examples will be shown here
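
As a rough illustration of the scan and subscan structure described under Input files (not of the two file formats, and not GSCANDB's actual schema), the sketch below models a scan with named subscans and orders the measurements using marker positions for one genome build. All class names, labels and values are hypothetical, and chromosomes are written as integers only to keep the sorting simple.

```python
from dataclasses import dataclass, field


@dataclass
class Measurement:
    marker: str   # marker name (or first marker of an interval)
    score: float  # quantitative measurement at this marker


@dataclass
class GenomeScan:
    population: str
    phenotype: str
    genome_build: str
    # subscan label -> list of Measurement
    subscans: dict = field(default_factory=dict)


def order_by_build(measurements, mapping):
    """Order measurements by (chromosome, bp_position) taken from `mapping`.

    `mapping` maps marker name -> (chromosome, bp_position) for one genome
    build, standing in for the rows of a marker_mapping file.
    """
    return sorted(measurements, key=lambda m: mapping[m.marker])


# Hypothetical data, for illustration only.
mapping = {"m1": (1, 500), "m2": (1, 100), "m3": (2, 50)}
scan = GenomeScan("example_population", "example_phenotype", "example_build")
scan.subscans["additive"] = [
    Measurement("m1", 2.1),
    Measurement("m3", 0.4),
    Measurement("m2", 3.7),
]
ordered = order_by_build(scan.subscans["additive"], mapping)
print([m.marker for m in ordered])
```

Keeping positions in a separate mapping, rather than on each measurement, mirrors the point that marker order comes from the marker_mapping data for a given build and not from the scan files themselves.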


    Thorhildur Juliusdottir
    Last modified: Tue Oct 13 12:01:24 BST 2009