GenBank/RefSeq Update Goals
- Incremental update of mRNAs and ESTs for multiple species and
assemblies based on daily updates from NCBI.
- Automatically run from cron, possibly every night.
- Only require manual intervention on non-recoverable errors or when a
large alignment requires a cluster run.
- Incremental across GenBank releases; don't force a full realignment
every quarter.
- Allow removal of older GenBank full releases (and still not force a
full realignment).
- Avoid corruption of disk files and databases.
- Recover from failure states automatically when possible, and make
manual recovery easy.
- Allow restarting failed steps without restarting the entire
process.
- Don't require the process to be run at defined intervals. Whenever a
run completes, the data files are updated to reflect the current state
of the NCBI repository.
- Include HTS files in the automated download process.
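
Two of the goals above, avoiding corruption of disk files and
restarting failed steps without restarting the entire process, can be
illustrated with a minimal Python sketch. This is not the actual update
code: atomic_write, run_steps, the ".done" marker files, and the state
directory are all hypothetical names chosen for the example. It assumes
a POSIX filesystem, where renaming within one directory is atomic.

```python
import os
import tempfile


def atomic_write(path, text):
    """Write text to path so that readers never see a partial file.

    The data goes to a temporary file in the same directory, which is
    then renamed over the target in a single step.
    """
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())  # push data to disk before renaming
        os.replace(tmp_path, path)
    except BaseException:
        # Leave no temporary debris behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def run_steps(steps, state_dir):
    """Run (name, func) steps in order, skipping steps already done.

    A marker file is written only after a step succeeds, so a rerun
    after a failure resumes at the step that failed. Returns the names
    of the steps executed in this call.
    """
    os.makedirs(state_dir, exist_ok=True)
    ran = []
    for name, func in steps:
        marker = os.path.join(state_dir, name + ".done")
        if os.path.exists(marker):
            continue  # completed in an earlier run
        func()  # an exception here stops the run; no marker is written
        atomic_write(marker, "ok\n")
        ran.append(name)
    return ran
```

With steps such as [("download", download), ("align", align)], a run
that fails in "align" leaves only the "download" marker behind, so the
next invocation (for example, the next cron run) skips the download and
retries the alignment. The markers would need to be cleared when a new
update cycle begins; how that is done is left open here.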