Backing up your Mascot Server
The demise of Tranche last year, with reports that 80% of the raw data ‘safely held’ there has been permanently lost, should be a wake-up call for many laboratories to review their backup strategy.
It’s very hard to get reliable data for the annualised failure rate (AFR) of disk drives, but most estimates suggest around 1.5% of drives will fail in some way during any given year. Of course, not all failures result in data loss, and technologies such as S.M.A.R.T can help by giving advance warning of impending catastrophic failure.
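To put the 1.5% figure in context, the following is a minimal sketch of how the risk grows with the number of drives and the number of years in service. It assumes that drives fail independently and that the AFR is constant over time, neither of which is strictly true in practice; the function name is purely illustrative.

```python
def prob_at_least_one_failure(afr: float, drives: int, years: float = 1.0) -> float:
    """Probability that at least one of `drives` drives fails within `years`.

    Assumes independent failures and a constant annualised failure rate (AFR).
    """
    # Chance that a single drive survives the whole period.
    survive_one = (1.0 - afr) ** years
    # Chance that every drive survives, subtracted from 1.
    return 1.0 - survive_one ** drives


if __name__ == "__main__":
    # Illustrative drive counts only; substitute the numbers for your own server.
    for n in (1, 4, 8):
        p = prob_at_least_one_failure(0.015, n, years=3)
        print(f"{n} drive(s) over 3 years: {p:.1%}")
```

Even with a modest per-drive rate, the chance of seeing at least one failure rises quickly as drives are added or kept in service for longer.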
We suspect that many people don’t back up their Mascot Server regularly because of practical issues. A busy Mascot server can easily have more than 2TB of data and many IT groups won’t have the facility to back this up regularly.
It’s best to decide on your restore strategy before launching into backing up your data. If you need to be up and running again on the same day, you have little choice other than to back up the whole drive and, for a Windows system, probably the system drive too. If you are prepared to accept a little more delay and pain, you might choose to re-install the operating system (maybe from an image) and then re-install Perl and Mascot before restoring the Mascot directories. If you go this route, it isn’t necessary to back up every Mascot directory.
There is no need to back up the following directories if you intend to re-install Mascot from the program DVD. Their total size is approximately 300MB:
- mascot/_install_backup
- mascot/bin
- mascot/cgi
- mascot/cluster
- mascot/htdig
- mascot/html
- mascot/sessions
- mascot/x-cgi
You should consider backing up the following every day:
- mascot/config
- mascot/data
For mascot/data, the ‘daily’ directories need to be backed up, possibly on an incremental basis, but mascot/data/test and mascot/data/cache do not. If it makes backing up easier, the cache directory can be moved to a different location by changing the CacheDirectory setting in the options section of mascot.dat.
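As a rough illustration of the selection described above, here is a minimal Python sketch that lists the files under mascot/data changed since the previous backup, while skipping the test and cache directories. It is not part of Mascot: the function name and the use of file modification times to decide what is ‘new’ are assumptions, and it assumes the cache directory is still in its default location under mascot/data.

```python
import os
from pathlib import Path
from typing import Iterator

# Top-level subdirectories of mascot/data that do not need backing up.
EXCLUDED_SUBDIRS = {"test", "cache"}


def files_to_back_up(data_root: Path, since: float = 0.0) -> Iterator[Path]:
    """Yield files under data_root modified after `since` (a Unix timestamp).

    The top-level test and cache directories are skipped entirely.
    With the default since=0.0, every remaining file is yielded (a full backup).
    """
    for dirpath, dirnames, filenames in os.walk(data_root):
        if Path(dirpath) == data_root:
            # Prune in place so os.walk never descends into the excluded directories.
            dirnames[:] = [d for d in dirnames if d not in EXCLUDED_SUBDIRS]
        for name in filenames:
            path = Path(dirpath) / name
            if path.stat().st_mtime > since:
                yield path
```

Passing the timestamp of the last successful backup as `since` gives a simple incremental selection; the resulting list can then be handed to whatever copy or archive tool you already use.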
By default, sequence database files are saved under the mascot/sequence directory, but other locations may have been chosen. If you keep old versions of database files online or in an archive, you will need to back these up because it isn’t generally possible to get old versions of SwissProt, NCBInr, etc. It isn’t worth backing up the Fasta files for current releases; it is easier to download fresh copies from the original source. If you decide to back up any database files, you can exclude the *.?00 files, which will be recreated by Mascot. If a backed-up database uses taxonomy or UniGene, you may also need to back up these directories:
- mascot/taxonomy
- mascot/unigene
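The *.?00 exclusion can be expressed directly as a filename pattern. The sketch below, with a hypothetical function name, walks a sequence directory and yields every file except those matching that pattern; it assumes you point it at whichever directory actually holds the database versions you want to keep.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import Iterator

# Files matching this pattern are recreated by Mascot and can be skipped.
RECREATED_PATTERN = "*.?00"


def database_files_to_back_up(sequence_root: Path) -> Iterator[Path]:
    """Yield every file under sequence_root except those matching *.?00."""
    for path in sequence_root.rglob("*"):
        if path.is_file() and not fnmatch(path.name, RECREATED_PATTERN):
            yield path
```

Matching on the file name alone keeps the rule independent of where the sequence directory lives.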
Once you have estimated how much data you need to store, you can seek advice on suitable solutions. Taking storage media off-site shouldn’t be necessary if you have space available in a secure fire safe.