glam2 Reference
Usage
glam2 [options] alphabet my_seqs.fa
The alphabet is p for proteins or n for nucleotides. Any other value is interpreted as the name of an alphabet file.
Options that determine which alignments are possible
- -o
- Specify an output directory; existing files will not be overwritten.
- -O
- Specify an output directory (glam2_out); existing files may be overwritten.
- -2
- Search both strands.
- -z
- Specify the minimum number of sequences that must participate in
the alignment. If this is greater than the number of input sequences,
then all the sequences must participate.
- -a
- Specify the minimum number of key positions (aligned columns)
in the alignment.
- -b
- Specify the maximum number of key positions (aligned columns)
in the alignment.
Options that affect the scoring scheme
These options specify parameters in the formula for calculating
alignment scores. See GLAM2 methods (PDF)
for details.
- -d
- Specify a Dirichlet mixture
file, which describes residues' tendencies to align with one another.
- -D
- Specify the deletion pseudocount.
- -E
- Specify the 'no-deletion' pseudocount.
- -I
- Specify the insertion pseudocount.
- -J
- Specify the 'no-insertion' pseudocount.
- -q
- Specify the weight for generic versus sequence-set-specific
residue abundances. The residue abundances are estimated by counting
the residue types in all the input sequences, and adding pseudocounts.
The total number of pseudocounts is equal to the alphabet size
multiplied by the -q parameter. The allocation of pseudocounts among the
residue types depends on the alphabet.
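The pseudocount arithmetic for -q can be sketched as follows. This is a minimal illustration, not glam2's own code: the function name residue_abundances is hypothetical, and the sketch assumes the pseudocounts are split evenly among the residue types, whereas the real allocation depends on the alphabet.

```python
from collections import Counter

def residue_abundances(sequences, alphabet, q):
    """Estimate residue abundances from observed counts plus pseudocounts.

    Assumption (illustration only): each residue type receives q
    pseudocounts, so the total is len(alphabet) * q.
    """
    # Count every residue in every input sequence, ignoring other symbols.
    counts = Counter(ch for seq in sequences for ch in seq if ch in alphabet)
    total_pseudocounts = len(alphabet) * q
    total = sum(counts[a] for a in alphabet) + total_pseudocounts
    return {a: (counts[a] + q) / total for a in alphabet}

abundances = residue_abundances(["ACGT", "AAGT"], "ACGT", q=1.0)
for residue, value in abundances.items():
    print(residue, round(value, 3))
```

With a larger q, the estimates move towards the generic (here uniform) abundances; with a smaller q, they follow the counts in the input sequences more closely.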
Options that affect the search algorithm
glam2 uses a simulated
annealing algorithm, with a temperature parameter. At high temperatures,
glam2 only slightly favours changes that increase the alignment's score,
and at low temperatures it strongly favours such changes. Thus, at high
temperatures the score will be optimised too slowly, but at low
temperatures the algorithm will get frozen in a local optimum. The
strategy, then, is to start with a high temperature and reduce it as
slowly as possible.
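The effect of temperature described above can be illustrated with a generic temperature-weighted choice between candidate changes. This is a sketch of the general idea only, assuming each candidate has a log-scale score that is divided by the temperature before being turned into a probability; the function name move_probabilities is hypothetical, and the exact formulas glam2 uses are given in GLAM2 methods (PDF).

```python
import math

def move_probabilities(log_scores, temperature):
    """Return selection probabilities proportional to exp(log_score / T)."""
    scaled = [s / temperature for s in log_scores]
    peak = max(scaled)  # subtract the maximum so the largest weight is exactly 1
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

# Two candidate changes; the second has the higher score.
hot = move_probabilities([0.0, 1.0], temperature=10.0)
cold = move_probabilities([0.0, 1.0], temperature=0.1)
print("high temperature:", hot)   # only slightly favours the better change
print("low temperature:", cold)   # almost always picks the better change
```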
- -r
- Specify the number of alignment runs.
- -n
- Specify how many iterations must pass without improving on the
highest-scoring alignment seen so far before each alignment run ends.
- -w
- Specify the initial number of key positions (aligned columns)
in the alignment. If less than the minimum (-a) or greater than the
maximum (-b), it is increased to the minimum or reduced to the maximum.
- -t
- Specify the initial temperature.
- -c
- Specify the cooling factor per n iterations, where n is the number
specified with the -n option. The temperature is multiplied by a
constant factor after each iteration, chosen so that after n iterations
it has dropped by the cooling factor.
- -u
- Specify the minimum temperature. The temperature never drops
below this level, to avoid numerical problems.
- -m
- Specify the rate of column sampling relative to site sampling.
On each iteration, glam2 randomly decides to try either realigning a
sequence or adjusting an aligned column (key position). This parameter
sets the probabilities for this decision.
- -x
- Specify the site sampling algorithm: 0=FAST, 1=SLOW, 2=FFT.
See GLAM2 methods (PDF) for details. In
summary, the FAST algorithm deviates slightly from the strict
definition of simulated annealing, but works well in practice. The SLOW
algorithm implements strict simulated annealing, but is much slower,
especially for longer sequences. The FFT algorithm also implements
strict simulated annealing, and has intermediate speed, but carries
greater risk of numerical roundoff error. To use the FFT algorithm,
install the FFTW library and recompile glam2 with 'make glam2fft'.
- -s
- Specify the seed for pseudo-random number generation. Change
this to avoid getting identical results each time the program is run.
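The interaction of the -t, -c, -n and -u options above can be sketched as a cooling schedule. This is a minimal sketch, not glam2's source: it assumes that 'dropped by this amount' means the temperature is divided by the cooling factor over n iterations, and the function name and the parameter values are illustrative only.

```python
def temperature_schedule(t_init, cool, n, t_min, iterations):
    """Return the temperature used at each iteration.

    Assumption: the per-iteration factor is cool ** (-1 / n), so that
    n iterations divide the temperature by cool.
    """
    per_iteration = cool ** (-1.0 / n)
    temperature = t_init
    temperatures = []
    for _ in range(iterations):
        temperatures.append(temperature)
        # Cool by a constant factor, but never go below the minimum (-u).
        temperature = max(temperature * per_iteration, t_min)
    return temperatures

short_run = temperature_schedule(t_init=1.2, cool=1.44, n=100, t_min=0.1, iterations=101)
print(short_run[0], short_run[100])  # start, and after n = 100 iterations

long_run = temperature_schedule(t_init=1.2, cool=1.44, n=100, t_min=0.1, iterations=1000)
print(long_run[-1])  # held at the minimum temperature
```

Under these assumptions, a larger n makes the temperature fall more slowly as well as lengthening each run, which matches the strategy of cooling as slowly as possible.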
Cosmetic options
- -h
- Show all options and their default settings.
- -o
- Specify an output file name.
- -p
- Print information about the algorithm's state before each
iteration. The following information is printed: the temperature, the
number of key positions in the alignment, the number of sequences in
the alignment, and the alignment's score. In addition, some information
about each move is printed: for site sampling moves, which sequence is
picked, and for column sampling moves, which column is picked, whether
or not it is deleted, and whether the direction is left or right.
- -Q
- Run quietly, suppressing unnecessary messages.
Warnings
glam2 might occasionally issue this warning: 'accuracy loss due
to numeric underflow'. If this happens, its ability to optimise the
alignment may be harmed: see GLAM2
methods (PDF) for details. To fix this, try increasing -u, or possibly
decreasing -b. Increasing -u somewhat should be harmless, as long as it
stays well below 1.
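The following sketch shows, in general terms, why a very low temperature can cause numeric underflow and why raising -u can help. It assumes, for illustration only, that a log-scale score is divided by the temperature and then exponentiated; tempered_weight is a hypothetical name, and the computation glam2 actually performs is described in GLAM2 methods (PDF).

```python
import math

def tempered_weight(log_score, temperature):
    """Weight of a candidate whose log-scale score is scaled by 1 / T."""
    return math.exp(log_score / temperature)

# The same poor score at two temperatures.
print(tempered_weight(-50.0, 0.1))   # very small, but still representable
print(tempered_weight(-50.0, 0.05))  # too small for a double: underflows to 0.0
```

Once a weight underflows to zero, candidates with different scores can no longer be told apart, so a somewhat higher minimum temperature keeps the weights within floating-point range.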