GOMO file format
GOMO outputs an XML file using the following format.
Tag | Child of | Description |
<gomo> | Nothing |
Information about this run of gomo.
- version - the version of gomo that generated the XML file.
- release - the release date of the version that generated the XML file.
|
<program> |
<gomo> |
Information about the state of the program when it ran.
- name - name of the program.
- cmd - the command line passed to the program.
- gene_url - the URL used to look up further information on the gene IDs.
The URL has ampersands (&) converted into &amp; and the place where
the gene ID should go is marked by !!GENEID!! .
- outdir - the output directory that the program wrote to.
- clobber - true if gomo was allowed to overwrite the output directory.
- text_only - true if gomo wrote to stdout. In that case this file would
not exist, so the value recorded here must be false.
- use_e_values - true if gomo used E-values (converted from p-values) as
input scores, false if gomo used gene scores.
- score_e_thresh - if gomo used E-values then this is the threshold above which
gomo assigned the gene the worst E-value (p-value = 1.0) to smooth out noise.
- min_gene_count - the minimum number of genes that a GO term was annotated
with before gomo would calculate a score for it.
- motifs - if present, a space-delimited list of the motifs that gomo
calculated a score for; otherwise gomo scored all motifs.
- shuffle_scores - the number of times gomo generated a shuffled mapping of
gene id to gene id to be used to generate scores from the null model.
- q_threshold - the q-value threshold; gomo filtered the results to only show
those with a better (smaller) q-value.
|
<gomapfile> |
<program> |
Information about the GO mapping file.
- path - the path to the mapping file.
|
<seqscorefile> |
<program> |
Information about the sequence scoring file.
- path - the path to the sequence scoring file.
|
<motif> |
<gomo> |
Information about the motif.
- id - the motif identifier.
- genecount - the number of scored sequences that were used to compute the result.
|
<goterm> |
<motif> |
Information about the GO term.
- id - the GO identifier.
- score - the geometric mean across all species of the rank-sum test p-value.
- pvalue - the empirically calculated p-value.
- qvalue - the empirically calculated q-value.
- annotated - the number of genes annotated with the GO term.
- group - the subgroup that the term belongs to. For the Gene Ontology
b = biological process, c = cellular component and m = molecular function.
- nabove - the number of more general terms that link to this one.
- nbelow - the number of more specific terms that link from this one.
- implied - whether the GO term is implied by other significant GO terms.
Allowed values are 'y', 'n' or 'u' (default) for yes, no or unknown.
- description - the GO term description.
|
<gene> |
<goterm> |
Information about the GO term's annotated genes for the primary species.
- id - the gene identifier.
- rank - the rank of the scored gene.
|
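
The nesting described in the table can be read with any XML parser. The following is a minimal sketch in Python, assuming that the items listed under each tag are stored as XML attributes. The sample document, its values and the example.org URL are invented for illustration and are not real gomo output. The helper names summarize and gene_link are hypothetical as well.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample that follows the tag nesting in the table above.
# Every value here is invented; only a few of the listed attributes are shown.
SAMPLE = """<gomo version="0.0.0" release="unknown">
  <program name="gomo"
           gene_url="http://example.org/gene?db=demo&amp;id=!!GENEID!!"
           q_threshold="0.05">
    <gomapfile path="go_map.csv"/>
    <seqscorefile path="scores.xml"/>
  </program>
  <motif id="motif_1" genecount="100">
    <goterm id="GO:0000001" score="1.0e-05" pvalue="1.0e-04" qvalue="1.0e-02"
            annotated="12" group="b" nabove="3" nbelow="0" implied="n"
            description="example term">
      <gene id="GENE_B" rank="5"/>
      <gene id="GENE_A" rank="1"/>
    </goterm>
  </motif>
</gomo>"""

# Meaning of the one-letter group codes for the Gene Ontology.
GROUPS = {
    "b": "biological process",
    "c": "cellular component",
    "m": "molecular function",
}


def gene_link(url_template, gene_id):
    """Fill the !!GENEID!! placeholder of gene_url with a gene identifier."""
    return url_template.replace("!!GENEID!!", gene_id)


def summarize(xml_text):
    """Return one dict per <goterm>, with its genes ordered by rank."""
    root = ET.fromstring(xml_text)
    # The parser decodes &amp; back to a plain ampersand.
    url_template = root.find("program").get("gene_url")
    rows = []
    for motif in root.findall("motif"):
        for term in motif.findall("goterm"):
            genes = sorted(term.findall("gene"), key=lambda g: int(g.get("rank")))
            rows.append({
                "motif": motif.get("id"),
                "goterm": term.get("id"),
                "qvalue": float(term.get("qvalue")),
                "group": GROUPS.get(term.get("group"), "unknown"),
                "implied": term.get("implied", "u"),  # 'u' is the documented default
                "gene_links": [gene_link(url_template, g.get("id")) for g in genes],
            })
    return rows


if __name__ == "__main__":
    for row in summarize(SAMPLE):
        print(row["motif"], row["goterm"], row["group"], row["qvalue"])
        for link in row["gene_links"]:
            print("  ", link)
```

Because the XML parser already turns &amp; back into &, the placeholder substitution can work on the decoded attribute value directly.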