HACKER Q&A
📣 atomicnature

How do you deal with data backups in servers?


Recently, we lost some data while migrating our servers because a backup was missing.

We thought we had everything backed up, but that was not really the case.

We have multiple databases and apps, and each often has its own data store.

How do you usually deal with server backups? What has worked for you and what has not?


  👤 Bender Accepted Answer ✓
For me, both professionally and personally, the key is a manifest of all data that is neither part of the OS nor committed to a git repo (code artifacts, for example, are restored by code deployment). That manifest clearly defines what needs to be backed up. It must be tested routinely by restoring only what exists in the role-based manifest, following the role-based procedure, and then doing QA testing on the restored nodes. Procedures will vary by role, but there must be a manifest that defines which directories contain live data, and each role must have its own clearly defined procedure for data restoration. For example, DBAs will be responsible for writing the role-based procedure for primary and secondary databases. Ideally, role-based data should be neatly contained in a company-specific directory structure, meaning that every role could in theory be restored to a single node without overlapping ports, for stand-alone QA testing on a developer laptop.
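
A minimal sketch of the restore check described above, assuming the manifest is a simple mapping from role to directories. The role names, paths, and the `missing_after_restore` helper are all hypothetical, made up for illustration. It only confirms that each listed directory came back non-empty, so it complements QA testing on the restored node rather than replacing it.

```python
from pathlib import Path

# Hypothetical manifest: role name -> directories that hold live data.
# Paths are relative to the root of the restored tree.
MANIFEST = {
    "postgres-primary": ["var/lib/postgresql", "etc/postgresql"],
    "webapp": ["srv/app/uploads"],
}


def missing_after_restore(manifest, restore_root):
    """Return {role: [dirs]} for manifest entries that are absent or empty
    under restore_root after a test restore."""
    root = Path(restore_root)
    missing = {}
    for role, dirs in manifest.items():
        for d in dirs:
            target = root / d
            # A directory that is missing or has no entries was not restored.
            if not target.is_dir() or not any(target.iterdir()):
                missing.setdefault(role, []).append(d)
    return missing
```

Running this against a scratch restore directory and failing the job when the result is non-empty is one way to make the routine test automatic.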

Personally, I also like to keep a local snapshot of live/ephemeral data using rsnapshot, so that I can quickly get a node back in service, assuming the backup volume (accessible only by root) has not been tainted or tampered with. OSSEC is one of many tools that can checksum data and alert on tampering. Anti-tampering is an entire topic by itself.
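
To illustrate the checksum idea in general terms, here is a rough sketch that hashes every file in a tree and compares the result against an earlier baseline. This is not how OSSEC is implemented or configured; the function names are hypothetical, and it assumes the baseline is kept somewhere an attacker on the node cannot rewrite it.

```python
import hashlib
from pathlib import Path


def checksum_tree(root):
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    sums = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            # Reads the whole file into memory; fine for a sketch,
            # large files would need chunked hashing.
            sums[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return sums


def diff_checksums(baseline, current):
    """Return (changed, removed, added) lists of relative paths."""
    changed = sorted(p for p in baseline if p in current and baseline[p] != current[p])
    removed = sorted(p for p in baseline if p not in current)
    added = sorted(p for p in current if p not in baseline)
    return changed, removed, added
```

Any non-empty list from `diff_checksums` on a volume that should be read-only between snapshots is worth an alert.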


👤 codegeek
Some rules for backups that you must follow:

1. Backups must be stored offsite on a separate server (obvious, but surprisingly some people miss this).

2. Backups must be tested frequently. If you cannot test a backup, you don't have a backup.

3. Frequency depends on the criticality of your data, your contract/SLA with your customers, etc. Ideally, you should be able to do a point-in-time restore (PTR) going back a certain number of hours/days/weeks.

4. Make sure to have notifications for backup failures. If a backup fails, you must be notified so that you can correct it manually.

5. Bonus: Have a backup reconciliation script that runs separately and reconciles all backups for a certain period.
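
A reconciliation script along the lines of rule 5 could look roughly like this. It is only a sketch: it assumes one backup per job per day, and that successful runs are recorded somewhere as (job, date) pairs. The `reconcile` function and the job names are hypothetical.

```python
from datetime import date, timedelta


def reconcile(jobs, completed, start, end):
    """Return sorted (job, day) pairs with no successful backup recorded
    between start and end, inclusive.

    jobs      -- iterable of job names that are expected to run daily
    completed -- set of (job, date) pairs for backups known to have succeeded
    """
    missing = []
    day = start
    while day <= end:
        for job in jobs:
            if (job, day) not in completed:
                missing.append((job, day))
        day += timedelta(days=1)
    return sorted(missing)


if __name__ == "__main__":
    done = {
        ("db", date(2024, 1, 1)),
        ("files", date(2024, 1, 1)),
        ("db", date(2024, 1, 2)),
    }
    for job, day in reconcile(["db", "files"], done, date(2024, 1, 1), date(2024, 1, 2)):
        print(f"missing backup: {job} on {day.isoformat()}")
```

Feeding the output into the same notification channel as rule 4 catches backups that silently never ran, not just the ones that failed loudly.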


👤 penis123429
Google for "3-2-1 backup rule".

Should be easy.