Backups
Developing a comprehensive, rigorous backup solution.
Issues, criteria
- Availability is what it's all about — having access to my files (and applications) in the face of risks: hardware failure, software bugs, human error… (Most terminology here is from the security domain where risks are dealt with rigorously… So, need to expound analysis, terminology…)
- Tiers (Ramifications from analysis, but needs prominence…)
- Mirroring: up-to-date copy (soft realtime) to protect against hardware failure. I.e., the copy must not be on the same medium. A remote mirror increases latency, and traffic is expensive, so mirroring is typically done locally with RAID. (But reconsider…)
- Versioning: read-only snapshots to protect against "bad" changes (mistaken deletion, broken upgrades, etc).
- Off-site: to protect against entire site failures (typically, catastrophes: theft, fire, etc).
- Encryption: confidentiality plays a special role. Redundancy is the most cost-effective approach to reliability, so we want multiple, distributed backups, especially remote (off-site) copies. Those must obviously be encrypted, and preferably should not require privileged access (root login).
- Encrypt both storage and communications. (GnuPG, KGpg, OpenSSH.)
- Operating system backup
- Mirroring?
- Versioning: take snapshots. (How to clone current OS configuration?)
- Coverage: ensure everything is backed up.
- Exclusions: use a negative approach, i.e., include everything except what's explicitly excluded. This guards against careless omissions.
- (Canonical tree, FS boundaries, symlinks, offline, mtime?)
- Metadata (ownership, permissions, user/group identities, extended attributes…)
- History/versioning (rsnapshot, FUSE…?)
- Automation: Cron (run in the background with nice; when, how often…)
- Dependencies
- Format: something that conforms with other criteria, is reliable, convenient… (Duplicity, Box Backup or EncFS/SSHFS? Rsync, rdiff-backup…?)
- Restoring (recovery, testing)
- Verification (If it's not tested, assume broken…)
- Selective restoration: browsing archives…
- OS snapshots (How to revert OS changes?)
- Detection and monitoring (How to detect "bit rot"? Tripwire, SMART, CRCs/parity?)
- RAID mirroring
- Monitoring automated backups
- Performance
- Efficiency: incremental (saves space) vs. differential (saves time).
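
A minimal sketch of two of the criteria above, exclusion-based coverage and verification against bit rot, in Python. Every name here (is_excluded, iter_files, build_manifest, verify, and the example patterns) is hypothetical and not taken from any of the tools mentioned. It assumes shell-style patterns matched against root-relative paths, and it only detects corruption; repairing it would need parity data or a second copy.

```python
import fnmatch
import hashlib
import os


def is_excluded(rel_path, excludes):
    """True if the relative path matches any shell-style exclusion pattern."""
    return any(fnmatch.fnmatchcase(rel_path, pat) for pat in excludes)


def iter_files(root, excludes):
    """Yield root-relative paths of all regular files that are not excluded."""
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        # Prune excluded directories in place so os.walk skips their contents.
        dirnames[:] = [
            d for d in dirnames
            if not is_excluded(os.path.normpath(os.path.join(rel_dir, d)), excludes)
        ]
        for name in filenames:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            if os.path.isfile(os.path.join(root, rel)) and not is_excluded(rel, excludes):
                yield rel


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root, excludes=()):
    """Map each included relative path to its SHA-256 checksum."""
    return {rel: sha256_of(os.path.join(root, rel)) for rel in iter_files(root, excludes)}


def verify(root, manifest):
    """Return (path, problem) pairs for files that are missing or whose checksum changed."""
    problems = []
    for rel, expected in sorted(manifest.items()):
        path = os.path.join(root, rel)
        if not os.path.isfile(path):
            problems.append((rel, "missing"))
        elif sha256_of(path) != expected:
            problems.append((rel, "checksum mismatch"))
    return problems
```

One way to use it: build the manifest from the source tree at backup time, e.g. build_manifest("/home", excludes=["cache", "*.tmp"]), store it alongside the backup, and run verify against a test restore. An empty result means every listed file came back intact.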
Choice of tools
- Rsync
- Simple to use, but… really falls short on most requirements:
- Mirroring: Rsync isn't continuous, and it is expensive to run even if efficient in bandwidth; useless for mirroring.
- Versioning: yes, there are hard-link-based hacks for it, but… they feel like hacks?
- Off-site: no encryption of the stored copy, so can't. There's Rsyncrypto, which I don't trust; or I could hack something given privileged access to the remote host, e.g., run an encrypted file system there, but that defeats the no-privilege requirement…
- Duplicity (Done; explain…)
- Dar? Others? Why not?
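
For reference, the hard-link approach to versioning mentioned under Rsync can be sketched in a few lines of Python. Each snapshot looks like a full copy, but files unchanged since the previous snapshot are hard links to it and take no extra space. This is the general idea behind rsync's --link-dest option, not a description of how rsync implements it. The function name snapshot and its parameters are hypothetical. The sketch assumes dest is a new directory outside source, compares files by content, and ignores symlinks to directories, ownership and extended attributes.

```python
import filecmp
import os
import shutil


def snapshot(source, dest, previous=None):
    """Copy source to dest, hard-linking files that are unchanged since previous."""
    for dirpath, _dirnames, filenames in os.walk(source):
        rel_dir = os.path.relpath(dirpath, source)
        os.makedirs(os.path.normpath(os.path.join(dest, rel_dir)), exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.normpath(os.path.join(dest, rel_dir, name))
            old = os.path.normpath(os.path.join(previous, rel_dir, name)) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                # Unchanged: share the inode with the previous snapshot.
                os.link(old, dst)
            else:
                # New or changed: make an independent copy (with mtime and mode).
                shutil.copy2(src, dst)
```

Because unchanged files share an inode across snapshots, the snapshots have to be treated as read-only: editing a file in one snapshot would silently change it in the others. That is arguably part of why this feels like a hack.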
Notes
- Merge with Secure remote backup.
Last modified 2009-09-29 10:34:58 +0000