Sequence formats
The preferred sequence format for Meta-MEME is Fasta format. Sequences start with a header line followed by sequence lines. The header line begins with the character ">", followed by a unique name without any spaces, followed by (optional) descriptive text. After the header line come the actual sequence lines. Spaces and blank lines are ignored. Sequences may be in capital letters, lowercase letters, or both. For example:

>ICYA_MANSE Insecticyanin A form (blue biliprotein)
GDIFYPGYCPDVKPVNDFDLSAFAGAWHEIAKLPLENENQGKCTIAEYKYDGKK
ASVYNSFVSNGVKEYMEGDLEIAPDA
>LACB_BOVIN Beta-lactoglobulin precursor (BETA-LG)
MKCLLLALALTCGAQALIVTQTMKGLDIQKVAGTWYSLAMAASDISLLDAQSAPL
RVYVEELKPTPEGDLEILLQKW

The web version of Meta-MEME also accepts protein and DNA sequences in any of the following formats by converting them to Fasta format.
- IG/Stanford, used by Intelligenetics and others
- GenBank/GB, GenBank flatfile format
- NBRF format
- EMBL, EMBL flatfile format
- DNAStrider, for a common Mac program
- Fitch format, limited use
- Pearson/Fasta, a common format used by Fasta programs and others
- Zuker format, limited use
- Olsen, format printed by Olsen VMS sequence editor
- Phylip3.2, sequential format for Phylip programs
- Phylip, interleaved format for Phylip programs (v3.3, v3.4)
- MSF, multiple sequence format used by GCG software
- PAUP's multiple sequence (NEXUS) format
- PIR/CODATA format used by PIR
- ASN.1 format used by NCBI
The Meta-MEME web site uses the ReadSeq program to read in sequences. ReadSeq is copyright 1990 by D. G. Gilbert, Biology Dept., Indiana University.
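The Fasta rules described at the top of this page (a ">" header with a space-free name and optional description, sequence lines with spaces and blank lines ignored, any letter case) can be illustrated with a small reader. This is only a minimal sketch in Python, not the code used by Meta-MEME or ReadSeq; the function name parse_fasta is hypothetical, and rejecting duplicate names is an assumption based on the requirement that names be unique.

```python
def parse_fasta(text):
    """Parse Fasta text into a list of (name, description, sequence) tuples.

    Illustrative sketch only; not Meta-MEME's or ReadSeq's actual reader.
    """
    records = []
    seen_names = set()
    name = None
    description = ""
    chunks = []

    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines are ignored

        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, description, "".join(chunks)))
            header = line[1:].strip()
            if not header:
                raise ValueError("header line has no sequence name")
            # The name runs up to the first whitespace; the rest is description.
            parts = header.split(None, 1)
            name = parts[0]
            description = parts[1] if len(parts) > 1 else ""
            if name in seen_names:
                # Assumption: duplicate names are treated as an error.
                raise ValueError("duplicate sequence name: " + name)
            seen_names.add(name)
            chunks = []
        else:
            if name is None:
                raise ValueError("sequence data found before any header line")
            # Spaces inside sequence lines are ignored; letter case is kept.
            chunks.append("".join(line.split()))

    if name is not None:
        records.append((name, description, "".join(chunks)))
    return records


example = """>ICYA_MANSE Insecticyanin A form (blue biliprotein)
GDIFYPGYCPDVKPVNDFDLSAFAGAWHEIAKLPLENENQGKCTIAEYKYDGKK
ASVYNSFVSNGVKEYMEGDLEIAPDA

>LACB_BOVIN Beta-lactoglobulin precursor (BETA-LG)
MKCLLLALALTCGAQALIVTQTMKGLDIQKVAGTWYSLAMAASDISLLDAQSAPL
RVYVEELKPTPEGDLEILLQKW
"""

for seq_name, seq_description, sequence in parse_fasta(example):
    print(seq_name, len(sequence), seq_description)
```

The sketch keeps letter case unchanged because both capital and lowercase letters are accepted; a caller that needs a single case can convert the returned sequence itself.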