The first time I had to restore a client’s website from backup, I learned the difference between “we have backups” and “we have working, recent, restorable backups”. The former is a checkbox; the latter is what saves your weekend when a SQL injection wipes the customer table at 2am.
Borg is the backup tool I run on every managed server I touch. It’s open-source, it encrypts on the client (you pick the encryption mode when you create the repository), and the deduplication is so effective that nightly backups of multi-terabyte servers don’t fill the storage within a month.
What Borg actually does
Borg is a deduplicating, encrypting, compressing backup tool. From the user’s perspective, the workflow is:
- Initialize a repository (a directory or remote server location).
- Run borg create to back up specific paths into the repository.
- Optionally run borg prune to delete old archives on a retention schedule.
- Run borg extract or borg mount when you need to restore.
What makes Borg interesting is what happens underneath. Every file gets split into chunks. Each chunk gets a hash. If a chunk’s hash already exists in the repository, Borg doesn’t store it again. It stores a reference.
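A toy model makes the mechanism concrete. This is a minimal sketch, not Borg’s implementation: it assumes fixed 4-byte chunks and plain SHA-256, whereas Borg cuts variable-size chunks based on content and uses a keyed hash. The ChunkStore class and its method names are made up for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks, for illustration only


class ChunkStore:
    """Toy content-addressed store: each unique chunk is kept once."""

    def __init__(self):
        self.chunks = {}    # chunk hash -> chunk bytes
        self.archives = {}  # archive name -> ordered list of chunk hashes

    def create(self, name, data):
        refs = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the bytes only if this hash is new; otherwise reuse it.
            self.chunks.setdefault(digest, chunk)
            refs.append(digest)
        self.archives[name] = refs

    def extract(self, name):
        # Rebuild the original data by following the stored references.
        return b"".join(self.chunks[d] for d in self.archives[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = ChunkStore()
monday = b"AAAABBBBCCCCDDDD"
tuesday = b"AAAABBBBXXXXDDDD"  # one 4-byte chunk changed overnight
store.create("monday", monday)
store.create("tuesday", tuesday)

print(len(monday) + len(tuesday))  # 32 bytes backed up in total
print(store.stored_bytes())        # 20 bytes actually stored
```

Both archives restore in full, but the second one only cost the single chunk that changed.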
For a server where most data doesn’t change night-to-night (database tables, system files, large binary uploads), this means:
- The first backup is full-sized.
- Every subsequent backup is essentially a delta.
- A repository holding 60 nightly backups of a 200GB server might be 250GB total, not 12TB.
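The arithmetic behind that last example, as a quick sanity check. The figures are the illustrative ones from the list, not measurements, and compression is ignored:

```python
NIGHTS = 60
FULL_BACKUP_GB = 200
REPO_TOTAL_GB = 250  # illustrative repository size from the example

# Without deduplication, every night costs a full copy.
naive_total_gb = NIGHTS * FULL_BACKUP_GB

# With deduplication, only the first night is full-sized;
# the rest of the repository is new chunks from the other 59 nights.
new_data_gb = REPO_TOTAL_GB - FULL_BACKUP_GB
avg_nightly_delta_gb = new_data_gb / (NIGHTS - 1)

ratio = naive_total_gb / REPO_TOTAL_GB

print(f"naive total: {naive_total_gb / 1000:.0f} TB")       # 12 TB
print(f"avg nightly delta: {avg_nightly_delta_gb:.2f} GB")  # 0.85 GB
print(f"space saving: {ratio:.0f}x")                        # 48x
```

In other words, the example assumes under 1GB of genuinely new data per night, which is plausible for a server whose bulk data rarely changes.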
Add compression on top (Borg supports zstd, lz4, lzma, and zlib), and the repository sizes drop further.
Why encryption matters here
Borg encrypts every chunk on the client, and the plaintext key never leaves your server. The repository (where the backups live) only ever sees ciphertext. This is the part that makes Borg fundamentally different from rsync or tarball backups: even if your backup destination is compromised, the attacker gets nothing but ciphertext.
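In Borg 1.x you choose the encryption mode explicitly when you create the repository, via the --encryption option of borg init. A common choice is repokey-blake2: the key is stored in the repository, encrypted with your passphrase. For paranoid setups, keyfile-blake2 keeps the key on the client only, so the operator of the backup host can’t decrypt your data even with full access to their own infrastructure. The trade-off is that you have to back up that key file yourself.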
Borgmatic, the wrapper that makes Borg manageable
Borg’s command-line interface is precise but verbose. After the third time you write borg create --stats --compression auto,zstd,11 --exclude-caches ..., you’ll want a wrapper.
Borgmatic is exactly that. One YAML file describes everything:
    location:
        source_directories:
            - /etc
            - /var/www
            - /var/lib/mysql-backups
        repositories:
            - ssh://...@borgbase.com/./repo
    retention:
        keep_daily: 7
        keep_weekly: 4
        keep_monthly: 6
    consistency:
        checks:
            - repository
            - archives
A single borgmatic command runs the backup, prunes old archives per the retention rules, runs integrity checks, and exits. Wire it to a systemd timer or a cron job and the rest is automatic.
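To see what the wrapper saves you, here is a rough sketch of the Borg command lines that correspond to the config above. It is an approximation under stated assumptions, not borgmatic’s actual implementation: the repository path is a placeholder, the borg_commands helper is hypothetical, and the real tool also handles things like compaction, hooks, and error reporting.

```python
import shlex

CONFIG = {
    "source_directories": ["/etc", "/var/www", "/var/lib/mysql-backups"],
    "repository": "/path/to/repo",  # placeholder, not a real repository
    "retention": {"keep_daily": 7, "keep_weekly": 4, "keep_monthly": 6},
}


def borg_commands(config):
    """Build (but do not run) the create, prune, and check commands."""
    repo = config["repository"]

    # {hostname} and {now} are placeholders that Borg itself expands.
    archive = f"{repo}::{{hostname}}-{{now}}"
    create = ["borg", "create", "--stats", archive,
              *config["source_directories"]]

    prune = ["borg", "prune"]
    for key, count in config["retention"].items():
        # keep_daily: 7 becomes --keep-daily 7, and so on.
        prune += ["--" + key.replace("_", "-"), str(count)]
    prune.append(repo)

    check = ["borg", "check", repo]
    return [create, prune, check]


for command in borg_commands(CONFIG):
    print(shlex.join(command))
```

Three commands with a dozen arguments between them, versus one YAML file and one command, is the whole argument for the wrapper.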
Where to store the repository
Three patterns I’ve used:
- A second server you control. Cheapest if you already have spare disk. Risk: if both servers live in the same data center, a single facility outage takes both down at once.
- An S3-compatible bucket via rclone or similar. Decent off-site, cheap at terabyte scale, but the workflow is slightly clunky because Borg expects a regular filesystem or an SSH host running Borg, not an object store, so you typically write to a local repository and sync it afterwards.
- A managed Borg host like BorgBase. Built specifically for Borg and Restic repositories, with native protocol support, monitoring, and per-repository key management. This is the path I default to for production hosts.
BorgBase is run by the same team that builds PikaPods, and the privacy ethos shows in both products. They sponsor open-source backup tools (Vorta, the desktop client) directly.
What Borg doesn’t fix
Borg is a backup tool, not a disaster recovery plan:
- It doesn’t decide what to back up. You pick paths. If you forget /etc/letsencrypt, your backup won’t have your TLS certs. Audit your borgmatic config every six months.
- It doesn’t replace application-level backups. A borg create of /var/lib/mysql while MySQL is running gives you a torn database. Use mysqldump (or pg_dumpall, or whatever your stack uses) to a SQL file first, then back the SQL file up.
- It doesn’t manage your encryption passphrases. If you lose the passphrase, the data is gone. Treat the passphrase like the recovery key it is, and store it somewhere you’ll find in a year (a hardware-backed password manager is the right answer).
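The dump-first rule is easy to script. A minimal sketch, with assumptions: the dump lands in /var/lib/mysql-backups (the directory from the borgmatic config earlier), the file name all-databases.sql is arbitrary, and mysqldump finds its credentials in an option file such as ~/.my.cnf. The dump_then_backup helper is hypothetical, not part of Borg or borgmatic.

```python
import subprocess
from pathlib import Path

DUMP_DIR = Path("/var/lib/mysql-backups")  # directory listed in the borgmatic config


def dump_command():
    # --single-transaction takes a consistent snapshot of InnoDB tables
    # without locking them for the duration of the dump.
    return ["mysqldump", "--single-transaction", "--all-databases"]


def dump_then_backup(dump_dir=DUMP_DIR, run=subprocess.run):
    """Write a SQL dump, then let borgmatic back up the dump file."""
    target = dump_dir / "all-databases.sql"
    tmp = dump_dir / "all-databases.sql.tmp"

    # Dump to a temporary file first so a failed dump never
    # overwrites the last good one.
    with open(tmp, "wb") as out:
        run(dump_command(), stdout=out, check=True)
    tmp.replace(target)

    # Only reached if the dump succeeded (check=True raises otherwise).
    run(["borgmatic"], check=True)
    return target
```

Calling dump_then_backup() from the nightly timer instead of calling borgmatic directly means a failed dump stops the run loudly rather than silently backing up yesterday’s SQL file as if it were today’s.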
Closing the loop
Borg is the boring, reliable backup tool that just works. Encrypted on the client, deduplicating to absurd ratios, with a maintainer team that responds to issues fast. The combination of Borg + Borgmatic + BorgBase is the default backup stack for every managed server I run.
If your current backups are “we hope they’re working”, the Cloud Infrastructure Audit & Hardening engagement always includes a backup audit and a tested recovery plan as part of the deliverables. For more on the open-source tooling I run for clients, the open-source solutions category has the rest.