Symbolically linking directories during incremental file backups

Current hard linking incremental file backup schema

Currently, files that are unchanged between incremental file backups are hard linked. This means that more than one file path exists for the same data. The advantages are:

  • The operating system takes care of managing the data. If no file entries point to the data anymore, it deletes the data. Removing a backup means removing a folder.
  • Every file backup looks like a full backup on the backup server. This makes restoring and browsing file backups really easy.
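
The behavior described above can be tried out with a minimal Python sketch. It is only an illustration of how hard links behave, not UrBackup code; the folder names `backup_full` and `backup_incr` and the function `hardlink_demo` are made up for this example, and it assumes a filesystem that supports hard links.

```python
import os
import shutil
import tempfile


def hardlink_demo():
    """Show that a hard-linked file survives deletion of the older backup folder."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical folder names standing in for two file backups.
        full = os.path.join(root, "backup_full")
        incr = os.path.join(root, "backup_incr")
        os.mkdir(full)
        os.mkdir(incr)

        original = os.path.join(full, "data.txt")
        with open(original, "w") as f:
            f.write("unchanged content")

        # The unchanged file is hard linked instead of copied.
        linked = os.path.join(incr, "data.txt")
        os.link(original, linked)

        same_data = os.path.samefile(original, linked)
        links_before = os.stat(linked).st_nlink

        # Removing the older backup is just removing its folder.
        shutil.rmtree(full)

        # The data is still reachable through the remaining path.
        links_after = os.stat(linked).st_nlink
        with open(linked) as f:
            content = f.read()
    return same_data, links_before, links_after, content
```

Both paths refer to the same data, the link count drops from 2 to 1 when the older folder is removed, and the content stays readable through the newer backup.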

But there are issues in some cases:

  • On some filesystems (NTFS, for example) and/or slow hard disks, hard link operations are slow and become the bottleneck during backups.
  • Some filesystems only have a limited number of available file entries (ext, for example).

With a lot of unchanged files between incremental backups, the hard linking operation is almost certainly the bottleneck.

To improve this situation, UrBackup server 1.4 will create symlinks to large directories that are completely unchanged during incremental backups. This avoids many of the hard linking operations.

Symlinking large unchanged directories to a pooled directory

If, during an incremental backup, a large unchanged directory is detected, it is moved into a client-specific directory pool. Symbolic links to the pooled directory are then created in both the current and the last backup. If the last backup already contains a link to the pool, only a symlink in the current backup is created, pointing to the same pooled directory. Because the operating system does not count how many symbolic links point to a directory, the number of symlinks per pooled directory is saved in UrBackup's internal database. If a backup that contains symlinks is deleted, the relevant entries in the database are removed. Once there are no entries left for a pooled directory in the database, the pooled directory is removed.
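
The pooling and reference counting described above can be sketched roughly as follows. This is not UrBackup's implementation: the class `DirectoryPool`, its method names, and the in-memory dictionary that stands in for the internal database are all assumptions made for illustration, and the sketch assumes a POSIX-like system where symlinks can be created without special privileges.

```python
import os
import shutil
import uuid


class DirectoryPool:
    """Hypothetical stand-in for a client-specific directory pool."""

    def __init__(self, pool_root):
        self.pool_root = pool_root
        os.makedirs(pool_root, exist_ok=True)
        # Stand-in for the database: pooled directory -> symlink paths.
        self.links = {}

    def link_unchanged(self, last_path, current_path):
        """Make current_path a symlink to the pooled copy of last_path."""
        if os.path.islink(last_path):
            # The last backup already points into the pool: reuse its target.
            pooled = os.readlink(last_path)
        else:
            # Move the real directory into the pool and link it back.
            pooled = os.path.join(self.pool_root, uuid.uuid4().hex)
            shutil.move(last_path, pooled)
            os.symlink(pooled, last_path, target_is_directory=True)
            self.links.setdefault(pooled, set()).add(last_path)
        os.symlink(pooled, current_path, target_is_directory=True)
        self.links.setdefault(pooled, set()).add(current_path)
        return pooled

    def delete_backup(self, backup_root):
        """Drop a backup's link entries, then remove unreferenced pooled dirs."""
        prefix = os.path.join(backup_root, "")
        for pooled in list(self.links):
            remaining = {p for p in self.links[pooled] if not p.startswith(prefix)}
            if remaining:
                self.links[pooled] = remaining
            else:
                # No symlink refers to this pooled directory anymore.
                del self.links[pooled]
                shutil.rmtree(pooled)
        # rmtree removes symlinks inside the backup without following them.
        shutil.rmtree(backup_root)
```

In this sketch a pooled directory stays alive as long as at least one backup still holds a recorded symlink to it, which mirrors the reference counting idea, though a real implementation would persist the entries and handle failures between the individual steps.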

There are some caveats:

  • There are now symbolic links in the backups. Some tools may not be able to handle them, or need to be told how to handle them.
  • If you rename the backup folder all symbolic links are invalidated and won’t work anymore. The UrBackup server has a tool to fix that, but it will be slow.
  • The reference counting of symbolic links has some tricky corner cases and needs some testing.
  • You cannot simply delete file backup folders by hand anymore, because the pooled directories they reference may then never be deleted. The UrBackup server includes a tool to detect that, but it might be slow with an increasing number of directories in the pool.
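
To illustrate why detecting leftover pooled directories gets slower as the pool grows, here is a rough sketch of one way such a check could work. It is not the tool shipped with UrBackup server; the function `find_orphaned_pool_dirs` and the directory layout it expects (one folder holding the backups, one folder holding the pooled directories) are assumptions for this example.

```python
import os


def find_orphaned_pool_dirs(backups_root, pool_root):
    """Return pooled directories that no symlink under backups_root points to."""
    referenced = set()
    # followlinks=False: symlinked directories are listed but not descended into.
    for dirpath, dirnames, _ in os.walk(backups_root, followlinks=False):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                target = os.readlink(path)
                if not os.path.isabs(target):
                    # Relative link targets are relative to the link's folder.
                    target = os.path.join(dirpath, target)
                referenced.add(os.path.normpath(target))

    pooled = {
        os.path.normpath(os.path.join(pool_root, name))
        for name in os.listdir(pool_root)
    }
    return sorted(pooled - referenced)
```

The check has to walk every backup and compare against every pooled directory, so its cost grows with both the number of backups and the size of the pool.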

Symbolic linking of large unchanged directories will be the default behavior in UrBackup server 1.4, though you will be able to deactivate it in the advanced settings. UrBackup server will not symlink directories on btrfs, because snapshotting there provides a superior method for fast backups.

7 thoughts on “Symbolically linking directories during incremental file backups”

  1. What’s happening, I’m new to this. I stumbled upon this and
    have found it absolutely useful; it has helped me out loads.
    I hope to contribute and help other users like it has helped me.
    Good job.

  2. Hi! I’m having trouble understanding this. The OS shows me that my data is duplicated when I run an incremental backup after a full one. For example, I have 10GB in the full backup path and the same 10GB in the incremental path, but just one file has changed (10KB). What am I doing wrong? Can I turn this off and see only the files that really changed in the incremental path?
    Thanks!!

    • UrBackup uses hard links, so it does not really store 10GB twice; the file explorer only displays that much. See https://en.wikipedia.org/wiki/Hard_link .

      The mentioned symlink mode is available in 1.4. Then it will probably show less, though some files may still be hard linked.

  3. Will it be possible to upgrade existing installations?
    I mean the data storage, not the software…

  4. I’ve renamed the backup folder. Which UrBackup tool do I use to correct the symbolic links?

    • remove_unknown.bat or start_urbackup_server --remove_unknown
