File Backup Storage with Btrfs Snapshots

The Linux kernel 3.6 finally added a feature to btrfs which I needed to easily implement a snapshot file backup storage. This feature will be automatically enabled if the backup storage path points to a btrfs file system and btrfs supports cross-sub-volume reflinks.

A backup with snapshots works like this:

  • Every file backup is put into a separate sub-volume, i.e., a file backup with a name like “121102-1010” is not a normal folder any more. It is a sub-volume in the btrfs file system.
  • Btrfs can make snapshots of sub-volumes. On an incremental backup UrBackup creates a snapshot with the usual name like “121102-2210”. The snapshot has the contents of the last file backup. This snapshot operation is very fast, because no files have to be copied or file entries created. Btrfs does not internally create another copy of the file tree. It only creates copies of parts that we change in the future.
  • UrBackup deletes all the files in the snapshot that have changed or have been deleted since the last backup.
  • Then UrBackup loads the files that have changed or are new. If a file was changed UrBackup only writes changed parts of the file causing btrfs to only save the blocks in the file that are actually different.
  • Once the backup has to be removed UrBackup simply deletes the whole sub-volume.

This has several advantages:

  • Snapshot creation is very fast, causing faster incremental backups.
  • Deleting the file backups is way faster because not every file has to be deleted.
  • Only data that has changed in files between incremental backups is stored in the file system. This drastically reduces the storage requirements, e.g, for large database files.

The cross-device reflinks enable UrBackup to store same files which occur on different sub-volumes only once. On other filesystems this is done using hard links, but those only work on the same file system. Because the Linux kernel sees the sub-volume as a different file system the cross-device reflinks have to be used. There is no disadvantage in using them instead of hard links.