GenBank/RefSeq Error Handling
The following approaches are used to make this process as robust as
possible:
- The output of all scripts is logged; error notification is done
by e-mail.
- Semaphore files are created while a task script is running, which
prevents other tasks from being accidentally run at the same time. If a
task fails, a file is created to indicate that this occurred. Tasks will not
run as long as a failed semaphore file is in place. The condition must be
manually corrected and the semaphores removed.
- There is a defined data flow between each step. A step can be restarted
from the beginning to synchronize it with the results of the previous step
without corrupting data.
- Files are written in an atomic manner where needed, first writing the
file in the same directory under a temporary name, then renaming it. This
is done any time the existence of a file indicates that a step is
complete.
- Data files are not modified after successful creation.
- Various verifications and sanity checks are used.
- Incorrect GenBank entries can be explicitly excluded
(data/ignore.idx files).
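
The semaphore and atomic-write approaches above can be sketched as
follows. This is a minimal illustration in Python, not the code used by
the actual scripts; the function names and the semaphore file names
(running.sem, failed.sem) are hypothetical and chosen only for the
example.

```python
import os
import tempfile
from pathlib import Path


def write_atomically(path, data):
    """Write data under a temporary name in the target directory, then rename.

    Readers see either no file or the complete file, so the existence of
    the file can safely signal that a step is complete.
    """
    path = Path(path)
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent),
                                    prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as out:
            out.write(data)
            out.flush()
            os.fsync(out.fileno())
        # The rename is atomic because both names are in the same directory.
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def run_task(sem_dir, name, task):
    """Run task() guarded by semaphore files (file names are hypothetical)."""
    sem_dir = Path(sem_dir)
    running = sem_dir / "running.sem"
    failed = sem_dir / "failed.sem"
    if failed.exists():
        raise RuntimeError("failed semaphore present; fix the problem and remove it")
    try:
        # O_EXCL makes creation fail if another task already holds the semaphore.
        fd = os.open(str(running), os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError("another task is already running")
    with os.fdopen(fd, "w") as sem:
        sem.write(f"{name} pid={os.getpid()}\n")
    try:
        task()
    except Exception as exc:
        # Leave a marker that blocks later runs until someone intervenes.
        failed.write_text(f"{name}: {exc}\n")
        raise
    finally:
        running.unlink()
```

In this sketch a failed task leaves failed.sem behind, so every later
call to run_task refuses to start until the file is removed by hand,
mirroring the manual-correction rule described above.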