I’m trying to come up with an elegant way of backing up my Docker volumes. I don’t really care about my host and the data on the host, because everything I do happens inside my Docker containers and the mapped volumes. Some containers use mapped paths, but others use straight-up Docker volumes.
I’ve started writing a script that inspects the containers, reads all the mount paths and then spins up another container to tar all the mounted paths.
docker run --rm --volumes-from "$container_name" -v /data/backup:/backup busybox tar cvf "/backup/$container_name.tar" $paths
So far so good, this works and I can write all backups to my storage and later sync them to an offsite backup space.
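For anyone porting that step out of bash, here is a minimal Python sketch of the same idea: read the mounts from `docker inspect` and build the `docker run ... tar` command as an argument list, so quoting is not a concern. The function names `backup_command` and `run_backup` are made up for illustration; the `Name`, `Mounts` and `Destination` fields are the ones `docker inspect` reports for a container.

```python
import json
import subprocess


def backup_command(inspect_entry, backup_dir="/data/backup"):
    """Build the `docker run` argv that tars every mount of one container.

    `inspect_entry` is one element of the JSON list printed by
    `docker inspect <container>`.
    """
    name = inspect_entry["Name"].lstrip("/")  # inspect reports "/name"
    # "Destination" is the path inside the container, for bind mounts
    # and named volumes alike.
    paths = sorted(m["Destination"] for m in inspect_entry.get("Mounts", []))
    if not paths:
        raise ValueError(f"{name} has no mounts to back up")
    return [
        "docker", "run", "--rm",
        "--volumes-from", name,
        "-v", f"{backup_dir}:/backup",
        "busybox", "tar", "cvf", f"/backup/{name}.tar", *paths,
    ]


def run_backup(container):
    """Inspect one container and run the helper container that tars its mounts."""
    raw = subprocess.run(
        ["docker", "inspect", container],
        check=True, capture_output=True, text=True,
    ).stdout
    subprocess.run(backup_command(json.loads(raw)[0]), check=True)
```

Calling `run_backup("sabnzbd")` would then produce `/data/backup/sabnzbd.tar` on the host, the same as the one-liner above.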
But error handling, (nice) logging and notifications using ntfy in case of success / errors / problems are going to suck in a bash script. Local backup file rollover and log file rollover also just suck if I have to do all this by hand. I’m able to use other languages to write this backup util, but I don’t want to start this project if there already is a ready-made solution.
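To give a sense of scale for the hand-rolled route, here is a rough Python sketch of two of those pieces: backup file rollover and an ntfy notification. It assumes the backups are plain files in one directory and that the newest modification time wins. `prune_backups`, `ntfy_request` and the topic name are invented for this example; `Title`, `Priority` and `Tags` are headers that ntfy accepts when publishing.

```python
import urllib.request
from pathlib import Path


def prune_backups(directory, pattern, keep):
    """Delete all but the `keep` newest files matching `pattern`.

    Returns the names of the removed files, newest first.
    """
    if keep < 0:
        raise ValueError("keep must be >= 0")
    files = sorted(
        Path(directory).glob(pattern),
        key=lambda p: (p.stat().st_mtime, p.name),
        reverse=True,  # newest first
    )
    removed = []
    for old in files[keep:]:
        old.unlink()
        removed.append(old.name)
    return removed


def ntfy_request(topic, message, ok, server="https://ntfy.sh"):
    """Build (but do not send) an ntfy publish request for a backup result."""
    req = urllib.request.Request(
        f"{server}/{topic}",
        data=message.encode("utf-8"),
        method="POST",
    )
    req.add_header("Title", "Backup succeeded" if ok else "Backup FAILED")
    req.add_header("Priority", "default" if ok else "high")
    req.add_header("Tags", "white_check_mark" if ok else "warning")
    return req


# Sending is one more call, for example (hypothetical topic name):
# urllib.request.urlopen(ntfy_request("my-backups", "sabnzbd done", ok=True), timeout=10)
```

Building the request separately from sending it keeps the notification logic easy to check without touching the network.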
So the question: is there a utility that can simply schedule arbitrary bits of script, write nicer logs for these script bits, do file rollovers and run another script on success / error?
All the backup programs that I can find are more focused on backing up directories, permissions and so on.
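For comparison, the do-it-yourself version of what the question asks for (run a script step, log it to a file that rolls over, call a hook on success or error) could look roughly like this in Python. This is only a sketch: `make_logger` and `run_step` are invented names, and scheduling is assumed to be left to cron or a systemd timer calling the script.

```python
import logging
import subprocess
from logging.handlers import RotatingFileHandler


def make_logger(path, max_bytes=1_000_000, backups=5):
    """Logger whose file rolls over at `max_bytes`, keeping `backups` old files."""
    logger = logging.getLogger(f"backup:{path}")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def run_step(name, argv, logger, on_success=None, on_error=None):
    """Run one command, log its output, then call the matching hook.

    Hooks receive (name, returncode); returncode is None if the command
    could not be started at all.
    """
    logger.info("starting %s", name)
    try:
        proc = subprocess.run(argv, capture_output=True, text=True)
    except OSError as exc:  # for example, the command does not exist
        logger.error("%s could not be started: %s", name, exc)
        if on_error is not None:
            on_error(name, None)
        return False
    for line in proc.stdout.splitlines():
        logger.info("[%s] %s", name, line)
    for line in proc.stderr.splitlines():
        logger.warning("[%s] %s", name, line)
    ok = proc.returncode == 0
    if ok:
        logger.info("%s finished", name)
    else:
        logger.error("%s failed with exit code %d", name, proc.returncode)
    hook = on_success if ok else on_error
    if hook is not None:
        hook(name, proc.returncode)
    return ok
```

A hook here can be any callable, for example one that publishes to ntfy or runs another script.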
[Docker Volume Backup](https://github.com/offen/docker-volume-backup)?
That seems like the solution I was looking for. Google wasn’t really helpful for me here :).
As I’m not using a Swarm or cluster, I consider Docker volumes volatile and use mounts where I need persistence. All my configuration and other persistent data is under `/opt/docker/<container>/<foldername>`, and `/opt/docker` gets backed up regularly using restic.
./ bind mount everything in your compose file, because Docker’s volume storage is bloody annoying to back up. Then you can just back up the one folder with everything in it.
That’s what I’m doing too. Kind of felt wrong to do that, but if you don’t need the extra features of docker volumes this is the way. It always puzzles me when I read “Volumes are easier to back up or migrate than bind mounts.” in the docs. How are these --volumes-from shenanigans easier than rsyncing some directory off the host?
> Docker’s volume storage is bloody annoying to back up

They’re just normal folders located in `/var/lib/docker/volumes`, so you can back up that directory.
Yeah, but if just one of them is corrupt, you have to fuck around figuring out which UID-named volume is attached to which container that may or may not be running, so now you have to sort through config files to identify it. Restoring a bind mount is the work of seconds and is very straightforward.
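The "which volume belongs to which container" lookup can be scripted rather than done by reading config files. A small Python sketch, assuming you feed it the parsed JSON from `docker inspect $(docker ps -aq)`; the function name `volume_users` is invented here, while `Mounts`, `Type`, `Name` and `Destination` are fields from the inspect output.

```python
from collections import defaultdict


def volume_users(inspect_entries):
    """Map each named volume to the (container, mount point) pairs using it.

    `inspect_entries` is the parsed JSON list from `docker inspect` run
    over all containers, running or stopped.
    """
    users = defaultdict(list)
    for entry in inspect_entries:
        container = entry["Name"].lstrip("/")  # inspect reports "/name"
        for mount in entry.get("Mounts", []):
            if mount.get("Type") == "volume":  # skip bind mounts and tmpfs
                users[mount["Name"]].append((container, mount["Destination"]))
    return dict(users)
```

For a single volume, `docker ps -a --filter volume=<name>` answers the same question from the command line.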
I just bind mount the volumes I want to keep and use Duplicati to back up the contents of my containers folder. Another idea, if you are committed to Docker volumes, is to also mount them into your backup solution.
Backup for docker is no different than backing up anything else.
Volumes are located in `/var/lib/docker/volumes` by default, so just back up that directory with Restic or Borg, or something like Backrest if you want notifications and stuff handled for you.
Or you can do mount points to a directory instead of using volumes and back that up, but it’s really the same process.
I made an ansible role for this:
https://github.com/CSUN-CCDC/ccdc-2024/tree/main/linux/ansible/roles/docker
It was designed for a cybersecurity competition, and can back up containers and volumes. The volume backup works by creating another container and mounting the volume into it; within that container a simple tar backup is run.
I made my own solution since I wasn’t impressed by the projects I had found. There are two parts, the backup image and the restore image.
I use it like so:
services:
  restore_sabnzbd:
    image: untouchedwagons/simple-restore:1.0.5
    container_name: restore_sabnzbd
    restart: no
    environment:
      - BACKUP_APPEND_DIRECTORY=/docker/production/sabnzbd
      - BACKUP_BASE_NAME=sabnzbd
      - FORCE_OWNERSHIP=1000:1000
    volumes:
      - sabnzbd:/data
      - /mnt/tank/Media/Backups:/backups
  sabnzbd:
    image: ghcr.io/onedr0p/sabnzbd:4
    container_name: sabnzbd
    restart: unless-stopped
    user: 1000:1000
    volumes:
      - sabnzbd:/config
      - /mnt/tank/Media/Usenet:/mnt/data/Usenet
    depends_on:
      restore_sabnzbd:
        condition: service_completed_successfully
    networks:
      - traefik_default
  backup_sabnzbd:
    image: untouchedwagons/simple-backup:1.1.0
    container_name: backup_sabnzbd
    restart: unless-stopped
    environment:
      TZ: "America/Toronto"
      BACKUP_APPEND_DIRECTORY: "/docker/production/sabnzbd"
      BACKUP_BASE_NAME: "sabnzbd"
      BACKUP_RETENTION: "24"
      BACKUP_FREQUENCY: "0 0 * * *"
    volumes:
      - sabnzbd:/data:ro
      - /mnt/tank/Media/Backups:/backups
networks:
  traefik_default:
    external: true
volumes:
  sabnzbd:
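The restore side of this setup keys off a RESTORED marker file in the data directory. The following Python sketch shows that kind of restore-once logic. It is not the code inside the `simple-restore` image: the archive naming (`<base_name>-<sortable timestamp>.tar`), the tar format and the function name `restore_if_needed` are all assumptions made for illustration.

```python
import tarfile
from pathlib import Path

MARKER = "RESTORED"


def restore_if_needed(data_dir, backup_dir, base_name):
    """Restore the newest backup into `data_dir` unless it was already done.

    Returns "skipped", "restored" or "nothing to restore".
    """
    data_dir, backup_dir = Path(data_dir), Path(backup_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    marker = data_dir / MARKER
    if marker.exists():
        return "skipped"  # a previous run already handled this volume

    # Assumption: archive names embed a timestamp that sorts lexicographically.
    archives = sorted(backup_dir.glob(f"{base_name}-*.tar"))
    if archives:
        # Use the safer "data" extraction filter where this Python has it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        with tarfile.open(archives[-1]) as tar:
            tar.extractall(data_dir, **kwargs)
        result = "restored"
    else:
        result = "nothing to restore"

    # The marker is written either way, so a fresh volume is not
    # overwritten by a later backup on the next start.
    marker.touch()
    return result
```

Combined with `depends_on` and `condition: service_completed_successfully`, a step like this runs to completion before the application container starts.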
The restore container looks for a file called RESTORED in `/data`, and if one isn’t found it’ll try to restore the latest backup (if available) and then create a RESTORED file. The backup container ignores this file during backup.
I don’t have an answer for this, but I wonder if something for this could be set up with a Docker container. I know LinuxServer.io has a really good image system where you can run a container with “mods”. For example, you might be able to run their base Ubuntu / Fedora / Alpine image and then customize it to run your script as a service / cron job / systemd service.
https://hub.docker.com/r/lsiobase/ubuntu
https://docs.linuxserver.io/general/container-customization/
This is a classic conundrum.
Containers and their volumes are supposed to be ephemeral (right?).
Yet we use them to run little apps where we configure settings etc. in the app which we would like to “keep”, and thus back up. Yes, in a proper setup you would hook your container up to something that is not ephemeral, like a database somewhere, but often we just want an app, see it’s got a self-contained Docker image, and just run it.
Whilst not in the spirit of things… I’ve tried using Borg backup; however, it just fails due to random permissions on the volumes.
I should spend more time looking into it but haven’t the time right now, could be the solution is specific to the app/container but the simplicity of just backing up a /volumes/* directory is soooo tempting…
Edit: upon reflection, what about a sudo crontab to zip the volumes and set useful permissions on the zip? Then Borg can back up the zip. Borg (or at least Vorta) can easily run scripts before/after and pass variables relating to the backup, though.
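A rough Python sketch of that cron step, meant to run as root before Borg picks up the result. One deliberate swap: it writes a gzip tarball instead of a zip, because tar keeps file ownership and modes. The function name `archive_volumes` and the `0o640` default are assumptions for illustration.

```python
import os
import tarfile
from pathlib import Path


def archive_volumes(volumes_dir, out_file, mode=0o640):
    """Pack every volume directory under `volumes_dir` into one tarball.

    Non-directories in `volumes_dir` (such as Docker's own metadata files)
    are skipped. Returns the volume names that were archived.
    """
    volumes_dir, out_file = Path(volumes_dir), Path(out_file)
    names = []
    with tarfile.open(out_file, "w:gz") as tar:
        for volume in sorted(p for p in volumes_dir.iterdir() if p.is_dir()):
            tar.add(volume, arcname=volume.name)  # recursive by default
            names.append(volume.name)
    # Give the archive predictable permissions so the backup user can read it.
    os.chmod(out_file, mode)
    return names
```

Pointing Borg at the single output file sidesteps the mixed ownership inside the volumes, at the cost of weaker deduplication than backing up the raw files.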
I can run the scripts by hand, but then I have to do everything else (logging, rollovers, notifications, etc.) by hand too, and that is what I wanted to avoid.
Permissions are not a big deal inside the containers, and even if there is an issue, this is for my home NAS. I don’t really care if I have to spend a couple of hours fixing broken permissions after restoring something from a backup.
The biggest issue I’ve encountered a couple of times is that for some applications, like GitLab, it’s much better to use the backup process they provide than to try disk copies. I haven’t tried `btrfs send`, but rsync / cp / tar all seem to fail on the special files and extended filesystem attributes.
I implemented Duplicati as a fast, easy and adaptable backup solution at my workplace. It runs as a Docker stack on every node in the swarm. Even migration to new servers is possible, because it can restore file permissions and ownership too. Just give it access to either the volumes path of the host or the whole host file system.
Can be accessed via web interface, password protected. I actually recommend the image by the linuxserver community instead of the “official” for a simple, no-config deployment that starts up in seconds. Just add a data volume and your host bind in your compose file and that’s pretty much it.