Meta-MEME v3.3

Sequence formats

The preferred sequence format for Meta-MEME is Fasta format. For example,
>ICYA_MANSE Insecticyanin A form (blue biliprotein)
GDIFYPGYCPDVKPVNDFDLSAFAGAWHEIAKLPLENENQGKCTIAEYKYDGKK
ASVYNSFVSNGVKEYMEGDLEIAPDA
>LACB_BOVIN Beta-lactoglobulin precursor (BETA-LG)
MKCLLLALALTCGAQALIVTQTMKGLDIQKVAGTWYSLAMAASDISLLDAQSAPL
RVYVEELKPTPEGDLEILLQKW
Each sequence starts with a header line followed by one or more sequence lines. The header line begins with the character ">", followed by a unique name containing no spaces, followed by optional descriptive text. The sequence lines come after the header line. Spaces and blank lines are ignored. Sequences may be in uppercase, lowercase, or a mixture of both.
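The rules above can be illustrated with a short Python sketch. This is not part of Meta-MEME and is not how Meta-MEME itself reads sequences; the function name parse_fasta and the choice to raise an error on a missing or duplicate name are assumptions made for illustration only.

```python
def parse_fasta(text):
    """Parse Fasta text into a list of (name, description, sequence) tuples.

    Hypothetical helper for illustration; it follows only the rules stated
    in this document and does not validate the residue alphabet.
    """
    records = []
    seen_names = set()
    name = None
    description = ""
    chunks = []

    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines are ignored

        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, description, "".join(chunks)))
            parts = line[1:].split(None, 1)
            if not parts:
                raise ValueError("header line has no sequence name")
            name = parts[0]  # the name is the text up to the first space
            description = parts[1].strip() if len(parts) > 1 else ""
            if name in seen_names:
                # Assumption: a repeated name is treated as an error here.
                raise ValueError("duplicate sequence name: " + name)
            seen_names.add(name)
            chunks = []
        else:
            if name is None:
                raise ValueError("sequence data found before any header line")
            # Spaces inside sequence lines are ignored; case is preserved.
            chunks.append("".join(line.split()))

    if name is not None:
        records.append((name, description, "".join(chunks)))
    return records


if __name__ == "__main__":
    example = (
        ">LACB_BOVIN Beta-lactoglobulin precursor (BETA-LG)\n"
        "MKCLLLALALTCGAQALIVTQTMKGLDIQKVAGTWYSLAMAASDISLLDAQSAPL\n"
        "RVYVEELKPTPEGDLEILLQKW\n"
    )
    for rec_name, rec_desc, rec_seq in parse_fasta(example):
        print(rec_name, len(rec_seq), rec_desc)
```

The sketch keeps the original letter case, because the text says only that either case is accepted, not that sequences are converted to one case.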
The web version of Meta-MEME also accepts protein and DNA sequences in any of the following formats, which it converts to Fasta format.

IG/Stanford, used by Intelligenetics and others

GenBank/GB, GenBank flatfile format

NBRF format

EMBL, EMBL flatfile format

DNAStrider, format for a common Mac program

Fitch format, limited use

Pearson/Fasta, a common format used by Fasta programs and others

Zuker format, limited use

Olsen, format printed by Olsen VMS sequence editor

Phylip3.2, sequential format for Phylip programs

Phylip, interleaved format for Phylip programs (v3.3, v3.4)

MSF, multiple sequence format used by GCG software

PAUP's multiple sequence (NEXUS) format

PIR/CODATA format used by PIR

ASN.1 format used by NCBI

The Meta-MEME web site uses the ReadSeq program to read in sequences. ReadSeq is copyright 1990 by D. G. Gilbert, Biology Dept., Indiana University.