DREME - a motif discovery tool

Usage:

dreme [options] -p <sequence file> [-n <background file>]

Description:

DREME (Discriminative Regular Expression Motif Elicitation) quickly finds relatively short motifs (up to 8 bases). It can also perform discriminative motif discovery if given a negative set: sequences that are unlikely to contain the motif of interest, which is nevertheless likely to occur in the main ("positive") sequence set. If you do not provide a negative set, the program shuffles the positive set to provide a background that takes the role of the negative set.
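
As a rough illustration of the shuffled background described above, the sketch below shuffles the letters of each positive sequence using a fixed seed. The function name shuffled_background is hypothetical, and the plain per-letter shuffle is an assumption made for illustration; DREME's own shuffling procedure may differ, for example in which sequence statistics it preserves.

```python
import random

def shuffled_background(sequences, seed=1):
    """Return a shuffled copy of each sequence (hypothetical helper).

    Each output sequence has the same letters as its input, in random order.
    """
    rng = random.Random(seed)  # a fixed seed makes the background reproducible
    background = []
    for seq in sequences:
        letters = list(seq)
        rng.shuffle(letters)
        background.append("".join(letters))
    return background

positives = ["ACGTGATAAGC", "TTGATAAGGCA"]
negatives = shuffled_background(positives, seed=1)
```

Because the seed is fixed, running the sketch twice on the same input gives the same background, which is the purpose a seed option such as -s serves.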

The input to DREME is one or two sets of DNA sequences. The program uses Fisher's Exact Test to determine the significance of each motif found in the positive set compared with its representation in the negative set, using a significance threshold that may be set on the command line.
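
To make the enrichment test concrete, here is a minimal sketch of a one-sided Fisher's Exact Test on sequence counts, written with the Python standard library only. The function name and argument names are hypothetical, and counting each sequence at most once is an assumption; this is not DREME's own code.

```python
from math import comb

def fisher_enrichment_pvalue(pos_hits, pos_total, neg_hits, neg_total):
    """One-sided Fisher's Exact Test p-value (hypothetical helper).

    Probability of seeing at least pos_hits motif-containing sequences in the
    positive set if the motif-containing sequences were spread at random over
    the positive and negative sets.
    """
    total = pos_total + neg_total   # all sequences
    hits = pos_hits + neg_hits      # all sequences containing the motif
    denom = comb(total, pos_total)
    upper = min(hits, pos_total)
    tail = sum(
        comb(hits, k) * comb(total - hits, pos_total - k)
        for k in range(pos_hits, upper + 1)
    )
    return tail / denom

# Motif present in 3 of 3 positive sequences and 0 of 3 negative sequences.
p = fisher_enrichment_pvalue(3, 3, 0, 3)  # 1/20 = 0.05
```

A small p-value indicates that the motif occurs in the positive set more often than chance alone would explain.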

DREME achieves its high speed by restricting its search to regular expressions based on the IUPAC alphabet representing bases and ambiguous characters, and by using a heuristic estimate of generalised motifs' statistical significance.
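
The idea of a motif written as a regular expression over the IUPAC alphabet can be sketched as follows. The ambiguity codes are the standard IUPAC nucleotide codes; the helper names are hypothetical, and the sketch scans only the given strand, which is a simplification and not a statement about how DREME handles strands.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

def iupac_to_regex(motif):
    """Compile an IUPAC motif such as 'WGATAA' into a regular expression."""
    return re.compile("".join(IUPAC[ch] for ch in motif.upper()))

def count_matching_sequences(motif, sequences):
    """Count sequences that contain at least one match to the motif."""
    pattern = iupac_to_regex(motif)
    return sum(1 for seq in sequences if pattern.search(seq.upper()))

seqs = ["TTAGATAAGG", "CCTGATAAGC", "GGGGCCCC"]
n_hits = count_matching_sequences("WGATAA", seqs)  # 2: the first two sequences match
```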

Input:

<sequence file> is a collection of sequences in FASTA format.
<background file> (optional) is a collection of sequences in FASTA format.
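
For readers unfamiliar with FASTA, the sketch below parses FASTA-formatted lines into (name, sequence) pairs. The function read_fasta is a hypothetical helper for illustration only; it assumes that a header line starts with ">" and that sequence lines may be wrapped.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            header = line[1:].split()
            name = header[0] if header else ""  # first word of the header
            chunks = []
        elif name is not None:
            chunks.append(line)  # sequence data may span several lines
    if name is not None:
        records.append((name, "".join(chunks)))
    return records

example = """>seq1 first example
ACGTAC
GTTA
>seq2
GGATAA
""".splitlines()
records = read_fasta(example)
```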

Output:

DREME writes an XML file to the output folder and converts it into a minimal MEME-formatted motif file and a human-readable HTML file.

Additionally, DREME can output motif logos if the -png and/or -eps options are specified.

Options:

-o <dir name> - Specifies the output directory. If the directory already exists, the contents will not be overwritten.
-oc <dir name> - Specifies the output directory. If the directory already exists, the contents will be overwritten.
-png - Output images in PNG format.
-eps - Output images in EPS format.
-desc <description> - Specifies a description to be stored in the output.
-dfile <description file> - Specifies a file containing a description to be stored in the output.
-e <ethresh> - stop if motif E-value > <ethresh>; default: 0.05.
-m <m> - stop if <m> motifs have been output; default: only stop at E-value threshold.
-g <ngen> - number of regular expressions (REs) to generalize; default: 100. Hint: increasing <ngen> will make the motif search more thorough at some cost in speed.
-s <seed> - seed for shuffling sequences; ignored if -n <filename> given; default: 1.
-v <verbosity> - 1..5 for varying degrees of extra output; default: 2.
-h - print this usage message.
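
The way the -e and -m stopping rules combine can be sketched as below. The function select_motifs and its input format are hypothetical; the sketch assumes motifs arrive one at a time with an E-value attached and only illustrates the two stopping conditions, not DREME's search itself.

```python
def select_motifs(candidates, ethresh=0.05, max_motifs=None):
    """Keep motifs in discovery order until a stopping rule fires.

    candidates: iterable of (motif, evalue) pairs (hypothetical input format).
    ethresh:    stop at the first motif whose E-value exceeds this (like -e).
    max_motifs: stop once this many motifs are kept (like -m); None = no limit.
    """
    reported = []
    for motif, evalue in candidates:
        if evalue > ethresh:
            break  # E-value threshold reached
        reported.append((motif, evalue))
        if max_motifs is not None and len(reported) >= max_motifs:
            break  # motif count limit reached
    return reported

found = [("WGATAA", 1e-10), ("CACGTG", 1e-4), ("AAAAAA", 0.3), ("CCCC", 0.001)]
by_evalue = select_motifs(found)                # stops at "AAAAAA" (0.3 > 0.05)
by_count = select_motifs(found, max_motifs=1)   # stops after one motif
```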

Setting Core Motif Width:

-mink <mink> - minimum width of core; default: 3.
-maxk <maxk> - maximum width of core; default: 7.
-k <k> - sets mink=maxk=<k>.
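
When DREME is called from a script, the width options can be assembled into a command line as sketched below. The helper build_dreme_command is hypothetical and the file names are placeholders; it uses only the -p, -n, -e, -k, -mink and -maxk options listed in this document and does not run the program.

```python
def build_dreme_command(positives, negatives=None, ethresh=None,
                        k=None, mink=None, maxk=None):
    """Build an argument list following 'dreme [options] -p <file> [-n <file>]'."""
    if k is not None and (mink is not None or maxk is not None):
        # -k already sets both mink and maxk, so mixing them is ambiguous
        raise ValueError("give either k or mink/maxk, not both")
    cmd = ["dreme"]
    if ethresh is not None:
        cmd += ["-e", str(ethresh)]
    if k is not None:
        cmd += ["-k", str(k)]
    if mink is not None:
        cmd += ["-mink", str(mink)]
    if maxk is not None:
        cmd += ["-maxk", str(maxk)]
    cmd += ["-p", positives]
    if negatives is not None:
        cmd += ["-n", negatives]
    return cmd

# Placeholder file names; search only for cores of width 6.
cmd = build_dreme_command("pos.fa", negatives="neg.fa", k=6)
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a system where dreme is installed.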

Experimental below here; enter at your own risk

-l - print list of enrichment of all REs tested.