DNA or protein sequences are accepted, but all sequences in a
dataset must be of the same type.
Protein sequences should use the standard IUPAC
alphabet: ACDEFGHIKLMNPQRSTVWY.
They may also contain the ambiguous letters "BUXZ", which
are converted to "X" and treated as "unknown".
DNA sequences should use the standard DNA alphabet:
ACGT.
They may also contain the ambiguous letters "BDHKMNRSUVWY",
which will be converted to "X" and treated as
"unknown".
Note: If no sequence in your dataset contains any of the
letters "EFILPQXZ", your sequences are assumed to be DNA.
This can misclassify a protein dataset that happens to lack
those letters. To force the sequences to be interpreted as
protein, add an "X" to the beginning or end of one of the
sequences in your dataset.
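
The rules above can be sketched in a few lines of Python. This is only
an illustration of the behaviour described here, not the program's
actual code: the names guess_type and normalize are hypothetical, and
treating input as case-insensitive is an assumption.

```python
# Alphabets as listed in the text above.
PROTEIN_CORE = set("ACDEFGHIKLMNPQRSTVWY")
PROTEIN_AMBIGUOUS = set("BUXZ")
DNA_CORE = set("ACGT")
DNA_AMBIGUOUS = set("BDHKMNRSUVWY")

# Letters whose presence marks a dataset as protein.
PROTEIN_ONLY = set("EFILPQXZ")


def guess_type(sequences):
    """Return "protein" if any sequence contains a protein-only letter."""
    for seq in sequences:
        # Assumption: letters are compared case-insensitively.
        if PROTEIN_ONLY & set(seq.upper()):
            return "protein"
    return "dna"


def normalize(seq, seq_type):
    """Replace ambiguous letters with "X"; reject anything else."""
    if seq_type == "protein":
        core, ambiguous = PROTEIN_CORE, PROTEIN_AMBIGUOUS
    elif seq_type == "dna":
        core, ambiguous = DNA_CORE, DNA_AMBIGUOUS
    else:
        raise ValueError("seq_type must be 'dna' or 'protein'")

    out = []
    for ch in seq.upper():
        if ch in core:
            out.append(ch)
        elif ch in ambiguous:
            out.append("X")
        else:
            raise ValueError(f"letter {ch!r} is not valid for {seq_type}")
    return "".join(out)
```

For example, guess_type(["MKVA", "ACGT"]) reports "dna" because M, K and
V are also valid ambiguous DNA letters, while appending an "X" to the
first sequence makes it report "protein".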