Usage:
gomo [options] <go-term database> <scoring file>+

Description:
The name GOMO stands for "Gene Ontology for Motifs." The program searches a set of ranked genes for GO terms that are enriched among the high-ranking genes. The genes can be ranked, for example, by applying a motif scoring algorithm to their upstream sequences. The p-value for each GO term is computed empirically by shuffling the gene identifiers in the ranking (ensuring consistency across species) to generate scores from the null hypothesis. Then q-values are derived from these p-values following the method of Benjamini and Hochberg (where "q-value" is defined as the minimal false discovery rate at which a given GO term is deemed significant). The program reports all GO terms that receive q-values smaller than a specified threshold, outputting a GOMO score with empirically calculated p-values and q-values for each.
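
The step from p-values to q-values can be illustrated with a short Python sketch of the standard Benjamini and Hochberg adjustment. This is not GOMO's own code; the function name bh_qvalues is hypothetical, and the sketch assumes the p-values for all tested GO terms are already available as a list.

```python
def bh_qvalues(pvalues):
    """Return Benjamini-Hochberg q-values, in the same order as pvalues.

    Illustrative sketch only; bh_qvalues is a hypothetical helper,
    not part of GOMO.
    """
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each q-value is the
    # minimum of p * m / rank over this rank and all higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues
```

A GO term would then be reported when its q-value falls below the chosen threshold.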

Input:

Output:

GOMO will create a directory, named gomo_out by default. Any existing output files in the directory will be overwritten. The directory will contain:

The default output directory can be overridden using the --o or --oc options, which are described below.

Additionally, the user can suppress the creation of files altogether by specifying the --text option, which writes to standard output in a tab-separated values format:
"Motif Identifier" "GO Term Identifier" "GOMO Score" "p-value" "q-value"
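
As an illustration of consuming this output, the following Python sketch reads the five tab-separated columns into typed records. The names GomoRow and read_gomo_text are hypothetical. Whether GOMO prints a header line is not stated here, so the sketch simply skips any line whose last three fields are not numeric.

```python
import csv
from typing import Iterator, NamedTuple, TextIO


class GomoRow(NamedTuple):
    """One result line; field names are illustrative, not defined by GOMO."""
    motif_id: str
    go_term_id: str
    score: float
    p_value: float
    q_value: float


def read_gomo_text(handle: TextIO) -> Iterator[GomoRow]:
    """Yield one GomoRow per data line of tab-separated GOMO text output."""
    for fields in csv.reader(handle, delimiter="\t"):
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        try:
            numbers = [float(value) for value in fields[2:]]
        except ValueError:
            continue  # assumed to be a header or comment line
        yield GomoRow(fields[0], fields[1], *numbers)
```

Passing sys.stdin as the handle allows the reader to sit at the end of a shell pipeline that runs gomo with --text.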

By default, GOMO calculates the rank-sum statistics on the p-values of each gene given in the CisML input file. The --gs option switches the focus from the p-values to the scores. If any sequence lacks a p-value, GOMO aborts the calculation. The same happens when any gene in the CisML file lacks a score attribute and --gs is in effect.
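
For readers unfamiliar with rank-sum statistics, here is a minimal Python sketch of the general idea: rank all genes by their p-value and sum the ranks of the genes annotated with one GO term. The function name rank_sum and its arguments are hypothetical, and giving tied values their average rank is an assumption of this sketch rather than documented GOMO behavior.

```python
def rank_sum(gene_values, annotated):
    """Sum the ranks (1 = smallest value) of the annotated genes.

    gene_values maps gene identifier -> p-value; annotated is the set of
    genes carrying one GO term. Tied values share their average rank
    (an assumption of this sketch).
    """
    ordered = sorted(gene_values.items(), key=lambda item: item[1])
    ranks = {}
    i = 0
    while i < len(ordered):
        # Find the end of the run of genes tied with position i.
        j = i
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        average_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k][0]] = average_rank
        i = j + 1
    # Genes in the annotation but absent from the ranking are ignored.
    return sum(ranks[gene] for gene in annotated if gene in ranks)
```

With --gs the same computation would be applied to the scores instead; the sort direction appropriate for scores is not specified here and would need to be chosen accordingly.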

Options: