Backup

Backup basics

Backups of virtual hosts are only useful if they can be taken while the vhost is running. A backup also has to be consistent: all files have to be saved as of the same moment. If files change during the backup process, the host may not work after a restore, because the restored files do not fit together.

Both requirements can be met by using snapshots: When a snapshot is created, all files of the vhost are frozen at the same moment. An empty snapshot file is created. This takes only fractions of a second because no files are copied or moved. The vhost can be kept running.

When a file is changed inside the vhost, the frozen file is moved to the snapshot file before being changed. This procedure is called 'copy-on-write'. When the snapshot is saved, e.g. as a backup, the files that have changed in the meantime are copied from the snapshot file. Files that do not exist in the snapshot file are read from the original disk image. As a result, the copy of the snapshot contains all files as they were at the moment of the snapshot creation: a consistent image.
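The copy-on-write procedure described above can be illustrated with a small toy model. This is only a sketch: the class name CowSnapshot and its methods are hypothetical, and a Python dictionary stands in for the disk image. Real block-level snapshots such as LVM's work on disk blocks rather than files, but the idea is the same.

```python
class CowSnapshot:
    """Toy model of a copy-on-write snapshot (hypothetical, for illustration)."""

    def __init__(self, disk):
        self.disk = disk   # live disk image: unit name -> content
        self.saved = {}    # snapshot store, empty right after creation

    def write(self, unit, data):
        # Preserve the frozen content once, before the first overwrite.
        if unit not in self.saved:
            self.saved[unit] = self.disk.get(unit)
        self.disk[unit] = data

    def read_frozen(self, unit):
        # Changed units come from the snapshot store, unchanged ones from the disk.
        if unit in self.saved:
            return self.saved[unit]
        return self.disk.get(unit)

    def backup(self):
        # Collect the content as it was when the snapshot was created.
        image = {}
        for unit in set(self.disk) | set(self.saved):
            content = self.read_frozen(unit)
            if content is not None:  # skip units created after the snapshot
                image[unit] = content
        return image


disk = {0: "boot", 1: "config-v1", 2: "data-v1"}
snap = CowSnapshot(disk)        # instant: nothing is copied yet
snap.write(1, "config-v2")      # the vhost keeps running and changes data
snap.write(3, "new")
image = snap.backup()           # still the state at snapshot creation
```

In this model the backup contains units 0, 1 and 2 with their original content, while the live disk already holds the changed and the newly created unit. Deleting the snapshot corresponds to simply discarding the saved store.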

After the backup is finished, the snapshot can simply be deleted, because the vhost disk image still contains all current files. It would also be possible to return to the content of the disk image at the moment of the snapshot creation. However, this requires stopping the vhost, merging the snapshot into the current disk image and restarting the vhost.

Backup scenarios

For KVM there are two backup scenarios, which are described in further detail:

Snapshots using LVM
Snapshots using virsh

The two scenarios cannot be mixed. The best results were obtained with LVM on iSCSI partitions.

LVM based backups

Prerequisites for doing backups with the help of LVM are:

The virtual disk image has to be stored on an LVM virtual partition: either a virtual partition is used directly as the virtual disk, or a virtual disk image file is stored on an LVM virtual partition.
The LVM volume group that contains this virtual partition has sufficient free disk space. It must be possible to create a snapshot large enough to store all files that will be changed over the lifetime of the snapshot.
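As a rough sketch of what an LVM based backup could look like once these prerequisites are met, the following helper only builds the command lines for the three steps (create the snapshot, copy it, remove it) without executing them. The function name, the volume group and partition names, the snapshot size and the target path are all hypothetical; the snapshot size in particular has to be chosen for the actual amount of change expected during the backup.

```python
def lvm_backup_commands(vg, lv, snap_size, target, snap_name=None):
    """Return the commands for a snapshot backup of /dev/<vg>/<lv> (sketch)."""
    snap_name = snap_name or f"{lv}-backup-snap"
    origin = f"/dev/{vg}/{lv}"
    snapshot = f"/dev/{vg}/{snap_name}"
    return [
        # 1. Create the snapshot; snap_size must fit into the free space
        #    of the volume group.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap_name, origin],
        # 2. Copy the frozen content to the backup target.
        ["dd", f"if={snapshot}", f"of={target}", "bs=4M"],
        # 3. Remove the snapshot once the copy has finished.
        ["lvremove", "--force", snapshot],
    ]


# Hypothetical names, for illustration only.
for command in lvm_backup_commands("vg0", "vhost1", "5G",
                                   "/backup/vhost1.img"):
    print(" ".join(command))
```

The commands are returned as lists so that they could be passed to subprocess.run one after another, stopping at the first failure. The snapshot should be removed even if the copy fails, so that it cannot fill up.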

virsh based backups

Prerequisites for doing backups with the help of virsh are:

Snapshot creation using sub-commands of virsh works conveniently and safely only with recent versions of virsh. We suggest using at least Ubuntu 14.04 and virsh version 1.2.12 for virsh snapshots.
Virtual disk images have to be stored in files. virsh snapshots do not work when LVM virtual partitions are used as virtual disks.
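The version requirement can be checked automatically before a backup script relies on virsh snapshots. The sketch below assumes that the version string comes from the output of "virsh --version"; the function names are hypothetical, and the minimum version is the one suggested above.

```python
import re

# Minimum virsh version suggested in the prerequisites above.
MIN_VERSION = (1, 2, 12)


def parse_version(text):
    """Extract the first 'major.minor.patch' number from text as a tuple."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_recent_enough(text, minimum=MIN_VERSION):
    """Return True if the version found in text is at least the minimum."""
    return parse_version(text) >= minimum


# The version string would normally be read from 'virsh --version'.
print(is_recent_enough("1.2.12"))
print(is_recent_enough("1.2.9"))
```

Comparing integer tuples instead of strings matters here: as plain strings, "1.2.9" would sort after "1.2.12" and the check would wrongly pass.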

Snapshot compression

Snapshot backups can be compressed when there is not enough space. Compression is usually done while the copy is written.

Backup images are usually saved on remote hosts or drives for reliability reasons. However, compressed backups should only be written to local disks. The best way is to compress the backup onto a local drive first and to copy or move it to a remote drive or host afterwards.

The reason is that compressing a backup directly to a network drive can be interrupted, e.g. by a temporary network error; even a short timeout is sufficient. This leads to a hanging compression process (bzip or gzip). Such compression processes can only be removed by a reboot.

One or a few hanging processes are usually no problem for a running Linux system. But hanging compression processes can block a snapshot and thus prevent it from being removed. When the snapshot is no longer big enough to save all changed files, first the snapshot becomes useless, then the affected vhost crashes, and eventually the KVM host with all its running vhosts is killed.
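The recommended order, compressing to a local drive first and moving the result to the remote drive afterwards, can be sketched as follows. The function name and the directory parameters are hypothetical, and Python's gzip module stands in for an external gzip or bzip process.

```python
import gzip
import shutil
from pathlib import Path


def backup_compressed(source, local_dir, remote_dir):
    """Compress source into local_dir, then move the result to remote_dir."""
    source = Path(source)
    local_copy = Path(local_dir) / (source.name + ".gz")

    # Step 1: compress while copying, touching local disks only.
    with open(source, "rb") as src, gzip.open(local_copy, "wb") as dst:
        shutil.copyfileobj(src, dst, length=4 * 1024 * 1024)

    # Step 2: move the finished file to the (possibly remote) target.
    # A network error here affects only a plain file copy, not the
    # compression that reads from the snapshot.
    destination = Path(remote_dir) / local_copy.name
    return Path(shutil.move(str(local_copy), str(destination)))
```

With this order the snapshot is only needed during step 1, so it could be removed before the transfer to the remote drive starts. The local drive needs enough free space for the complete compressed backup.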

Synchronize snapshots with their backup

Backup copies can most easily be written as complete copies of the vhost disk image. This can quickly consume a lot of disk space. Much of that space is wasted, because the vhost fills only a part of the virtual disk image and an even smaller part has changed since the last backup.

A seemingly attractive alternative is to open the snapshot and the last backup file, to mount the partitions they contain and to synchronize the content of the snapshot to the backup. As a result, the backup becomes a current copy of the snapshot; an additional backup is not necessary (saves space) and only the changed files are copied (saves time).

Although such a backup process has been shown to work in practice, it has a number of disadvantages:

The partitioning inside the vhost has to be known exactly. Usually it cannot be recognized automatically. The more vhosts exist, possibly with different partition schemes, the more complicated it becomes to automate such a backup for a larger number of vhosts.
The sequence of creating the snapshot, opening the images, recognizing the contained partitions, mounting them and copying files, and then in reverse order unmounting and removing the partitions, closing the images and finally removing the snapshots, is error-prone and puts the stability of both the KVM host and the affected vhost at risk.
This type of backup needs sufficient resources on the KVM host. Insufficient resources lead, for unknown reasons, to read-only partitions inside the vhosts. The backup uses a noticeable share of the KVM host's resources (visible e.g. in monitoring).
