There seems to exist, buried underneath the superficial & the common sense, a theory of how to do backups well.
I've found two elements upon which a better theory of rotations & other details (e.g. hash verification scheduling, number of different devices) can be built.
The first is the Tower of Hanoi scheduling scheme, which we will abbreviate TOH.
The second is the Incremental-Differential-Full backups concept, which we will abbreviate IDF.
The best available resource seems to be these illustrated Acronis docs: http://acronis-backup-recovery.helpmax.net/en/understanding-acronis-backup-recovery-10/tower-of-hanoi-backup-scheme/. I request that you in good faith ignore that Acronis is a company selling commercial Windows software; you are free to post better links in the comments if you find better info elsewhere.
We end up with a scheme we can call IDF-TOH. In it we have three types of backups:
- Incremental, at L_0 ("Level A" in the linked resource), the most frequent level, capturing only changes made since the last backup.
- Differential, at each level strictly between L_0 and L_n (that is, L_1 through L_(n-1)), capturing changes made since the last full backup.
- Full, at L_n, the least frequent level, capturing the whole system to be backed up.
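A minimal sketch of how sessions could map to levels and backup types, assuming the usual "ruler sequence" rotation in which level L_a is used on sessions divisible by 2^a but not by 2^(a+1), with everything higher folded into L_n. The names `toh_level`, `backup_type` and `top_level` are made up for illustration, and a real tool would still have to make the very first backup a full one, since an incremental needs a base.

```python
def toh_level(session: int, top_level: int) -> int:
    """Return the level (0 .. top_level) used on a 1-based session number."""
    if session < 1:
        raise ValueError("session numbers start at 1")
    level = 0
    # Count how many times 2 divides the session number, capped at top_level.
    while session % 2 == 0 and level < top_level:
        session //= 2
        level += 1
    return level


def backup_type(level: int, top_level: int) -> str:
    """Map a level to the IDF backup type described above."""
    if level == top_level:
        return "full"
    if level == 0:
        return "incremental"
    return "differential"


if __name__ == "__main__":
    n = 2  # three levels: L_0, L_1, L_2
    for k in range(1, 9):
        lvl = toh_level(k, n)
        print(k, f"L_{lvl}", backup_type(lvl, n))
```

With three levels this gives the repeating pattern L_0, L_1, L_0, L_2, so the full backup recurs every fourth session.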
So now, what can we do? At least the following directions could be taken in further developing a Theory of Backups:
0 (backup scheduling): The frequencies can be chosen in many ways, and I am not sure which one is optimal. In Tower of Hanoi, the interval between backups at level L_a, where a belongs to the closed interval 0 ... n, grows as 2^a. The Frame-Stewart algorithm may or may not be of any use in this.
1 (rotations): IDF-TOH does not address the problems of rotations, i.e. if you make a backup that corrupts your previous data, and then repeat the mistake, you get in trouble quickly. It's also noteworthy that certain media may better fit certain layers in IDF-TOH & future schemes. For example, adrian_b wrote three days before this:
"... Of all the optical discs that have been available commercially, those with the longest archival time were the pressed CD-ROM with gold mirrors, where the only degradation mechanism is the depolymerization of the polycarbonate, which could make them fragile, but when kept at reasonable temperatures and humidities that should require many centuries..."
Consequently these would be the best for the Full backups, while Solid State Drives may work for the Incremental ones.
2 (perfecting IDF): The IDF scheme may not be perfect either & can probably be refined further.
3 (hashing): Verifying the backups matters & should be a part of a complete scheme.
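For direction 3, one way verification could look is a manifest of SHA-256 hashes written at backup time and rechecked later. This is only a sketch under that assumption; `build_manifest` and `verify` are hypothetical helper names, and how often to run the check is exactly the open scheduling question.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or no longer match."""
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return bad
```

The manifest itself would need to be stored apart from the backup it describes, otherwise one corruption event can take out both.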
---
This may not be valuable for all businesses, but most individuals already using rsync or borg would probably prefer to use the best available scheme if it reduces the probability of accidental data loss at minimal effort. The task of translating the best possible scheme into a configuration program with a humane interface is an undertaking of its own.
The other normal backups are usually managed by someone else; most of the time he just does the hardware.
His backups are tested by experience.
I personally use the following backup strategy:
- Set up encrypted ZFS storage on the network (e.g. TrueNAS - in my case it is Proxmox)
- Enable zfs-auto-snapshot for 15-minute snapshots with automatic rotation (keep 24 daily, etc.)
- NEVER (!) type in the passwords of users permitted on the ZFS storage on any client that could be affected by ransomware
- Provide a user-authenticated Samba share to store all important data - try to prevent local storage of data
- Sync the ZFS snapshots to an external USB drive every night (I use a Tasmota Shelly plug and an external USB case to power off the devices when they are not needed)
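    # create current snapshot, recursively (-r) for all child datasets
    zfs snapshot -r "$NEW_POOL_SNAP"
    # first backup: full replication stream (-R); --raw sends encrypted data as stored
    # recv: -F forces a rollback of the target, -d keeps the dataset path, -u skips mounting
    zfs send --raw -R "$SRC_POOL@$NEW_SNAP_NAME" | pv | zfs recv -Fdu "$DST_POOL"
    # incremental backup: -I includes all intermediate snapshots between the two
    zfs send --raw -RI "$BACKUP_FROM_SNAPSHOT" "$BACKUP_UNTIL_SNAPSHOT" | pv | zfs recv -Fdu "$DST_POOL"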
- On Windows and macOS, back up the OS to an external drive
- Use restic to keep an additional copy of the local files and folders somewhere else
- Use a Blu-ray burner to back up the most important stuff as a restic repository or encrypted archive (like very important documents, the best photo collections of your family, KeePass database, etc.) and store it at another location
- If cloud storage is affordable for the amount of data you have, consider using restic to store your stuff in the cloud
- From time to time, try to restore a specific file from the backup and check that it worked, and try to restore a full system (on an additional hard disk).
This may sound like overkill, but ransomware is a pretty bad thing these days, even if you think you are not one of its targets.
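The choice between the first and the incremental `zfs send` in the nightly sync could be sketched like this. The function only assembles the argument lists using the flags quoted in the list above and does not run anything; `zfs_sync_commands` and its parameter names are hypothetical.

```python
def zfs_sync_commands(src_pool, dst_pool, new_snap, prev_snap=None):
    """Build (snapshot, send, recv) argument lists for one sync run.

    prev_snap is the name of the last snapshot already on the destination,
    or None when nothing has been sent yet.
    """
    snapshot = ["zfs", "snapshot", "-r", f"{src_pool}@{new_snap}"]
    if prev_snap is None:
        # First backup: send the complete replication stream.
        send = ["zfs", "send", "--raw", "-R", f"{src_pool}@{new_snap}"]
    else:
        # Later backups: send only what changed since prev_snap.
        send = [
            "zfs", "send", "--raw", "-RI",
            f"{src_pool}@{prev_snap}", f"{src_pool}@{new_snap}",
        ]
    recv = ["zfs", "recv", "-Fdu", dst_pool]
    return snapshot, send, recv
```

Actually executing these, with `send` piped into `recv`, is left out on purpose; error handling around a half-finished transfer is where most of the real work would be.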
Regarding the backup schedule: sometimes companies need frequent backups because of their RPOs and RTOs (recovery point and recovery time objectives), for example if they operate in a highly regulated industry. If someone can tolerate losing at most two hours of data, they need a backup every 2 hours; if the tolerance is 8 hours (a working day), why not back up daily?
Regarding rotations: everything depends on the backup solution. If it provides immutable backups, the entire data set won't be corrupted, and the faster someone notices the mistake, the faster they can restore their copy. IDF helps more with the storage issue, i.e. not overloading it (deduplication and compression are also worth mentioning here).
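The arithmetic behind that is small: the gap between two backups must not exceed the tolerated data loss. A sketch, where `backups_needed` is a made-up name and the window is whatever period the data actually changes in:

```python
import math
from datetime import timedelta


def backups_needed(window: timedelta, rpo: timedelta) -> int:
    """Fewest evenly spaced backups in `window` so no gap exceeds `rpo`."""
    if rpo <= timedelta(0):
        raise ValueError("RPO must be positive")
    return math.ceil(window / rpo)


if __name__ == "__main__":
    # 2 hours of tolerable loss over a full day
    print(backups_needed(timedelta(hours=24), timedelta(hours=2)))
    # 8 hours of tolerable loss over an 8-hour working day
    print(backups_needed(timedelta(hours=8), timedelta(hours=8)))
```

This says nothing about RTO, which depends on how long a restore takes rather than on how often backups run.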
1. How long should you keep backups for - is the content of your backup covered by privacy laws that require you to not have copies of it after a certain period of time? Is there a point where the content of your backup is so old that it's the logical equivalent of not having made a backup in the first place?
2. How much does your backup process cost - if it costs more to back up a system than it would cost you if you lost it, then you've got the backup process wrong (interestingly this can be affected by economies of scale)
3. What do you need to restore a backup - does your system require bespoke hardware that might have been lost in whatever disaster you're trying to recover from?
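For question 1, once a maximum age has been decided (by law or by usefulness), picking the backups that have outlived it is mechanical. A minimal sketch with a hypothetical `expired_backups` helper; the 14-day figure in the example is arbitrary:

```python
from datetime import datetime, timedelta


def expired_backups(taken_at, now, max_age):
    """Return, oldest first, the backup timestamps older than max_age."""
    return sorted(t for t in taken_at if now - t > max_age)


if __name__ == "__main__":
    stamps = [datetime(2024, 1, 1), datetime(2024, 1, 20), datetime(2024, 1, 30)]
    for t in expired_backups(stamps, datetime(2024, 1, 31), timedelta(days=14)):
        print("past retention:", t.date())
```

In an incremental or differential chain, a backup can only be dropped if nothing newer still depends on it, which this sketch ignores.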
…but I never delete because the more copies of the same thing there are, the more likely it will survive. If in fact I need it, the time spent searching is far shorter than a tedious backup procedure.
In addition, if I have to recreate something, version 2 will be better because I keep getting better at the things I do.
But that is me, not you. Good luck.