Backups of virtual hosts (vhosts) are only practical if they can be taken while the vhost is running. A backup also has to be consistent: all files have to be saved as of the same moment. If files change during the backup process, the host might not work after a restore because the restored files do not fit together.
Both requirements can be met with snapshots. When a snapshot is created, all files of the vhost are frozen at the same moment and an empty snapshot file is created. This takes only a fraction of a second because no files are copied or moved, and the vhost can keep running.
When a file is changed inside the vhost, the frozen original is first moved to the snapshot file and only then changed. This procedure is called 'copy-on-write'. When the snapshot is saved, e.g. as a backup, changed files are copied from the snapshot file, and files that do not exist in the snapshot file are read from the original disk image. As a result, the copy of the snapshot contains all files as they were at the moment of snapshot creation: a consistent image.
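The copy-on-write behaviour described above can be illustrated with a small toy model. This is only a sketch: the class name CowSnapshot and the representation of the disk as a Python list of blocks are assumptions made for illustration, and real snapshot implementations work on disk blocks inside the storage layer rather than on Python objects.

```python
class CowSnapshot:
    """Toy copy-on-write snapshot over a list of blocks (illustration only)."""

    def __init__(self, disk):
        self.disk = disk   # live disk image; keeps changing while the vhost runs
        self.saved = {}    # block index -> content at snapshot time; starts empty

    def write(self, index, data):
        # Preserve the frozen content once, before the first overwrite.
        if index not in self.saved:
            self.saved[index] = self.disk[index]
        self.disk[index] = data

    def read_snapshot(self, index):
        # Changed blocks come from the snapshot, unchanged ones from the disk.
        if index in self.saved:
            return self.saved[index]
        return self.disk[index]

    def backup(self):
        # A backup of the snapshot sees every block as it was at creation time.
        return [self.read_snapshot(i) for i in range(len(self.disk))]


disk = ["a", "b", "c"]
snap = CowSnapshot(disk)   # creating the snapshot copies nothing
snap.write(1, "B")         # original "b" is saved to the snapshot first
snap.write(1, "BB")        # later writes to the same block save nothing more
print(snap.backup())       # ['a', 'b', 'c']
print(disk)                # ['a', 'BB', 'c']
```

The snapshot only grows with the number of distinct blocks changed after its creation, which is why creating it is fast and why it has to be large enough to hold all changes made during the backup.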
After the backup is finished, the snapshot can simply be deleted because the vhost disk image still contains all current files. It would also be possible to return to the content of the disk image at the moment of snapshot creation, but only by stopping the vhost, merging the snapshot into the current disk image and restarting the vhost.
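For the LVM case, the cycle of creating a snapshot, copying it and deleting it could look roughly like the following sketch. The volume group, volume name, snapshot size and target path are hypothetical placeholders, and the use of dd for the copy is an assumption; the function only prints the commands unless dry_run is switched off.

```python
import subprocess


def snapshot_backup_commands(vg, lv, snap_size, target):
    """Return the three commands of one backup cycle as argument lists."""
    origin = f"/dev/{vg}/{lv}"
    snap_name = f"{lv}_snap"           # hypothetical naming scheme
    snap = f"/dev/{vg}/{snap_name}"
    create = ["lvcreate", "--snapshot", "--size", snap_size,
              "--name", snap_name, origin]
    copy = ["dd", f"if={snap}", f"of={target}", "bs=4M"]
    remove = ["lvremove", "--force", snap]
    return create, copy, remove


def run_backup(vg, lv, snap_size, target, dry_run=True):
    create, copy, remove = snapshot_backup_commands(vg, lv, snap_size, target)
    if dry_run:
        # Only show what would be executed.
        for cmd in (create, copy, remove):
            print(" ".join(cmd))
        return
    subprocess.run(create, check=True)
    try:
        subprocess.run(copy, check=True)
    finally:
        # Always try to remove the snapshot, even if the copy failed,
        # so that it cannot fill up while the vhost keeps running.
        subprocess.run(remove, check=True)


if __name__ == "__main__":
    # Placeholder names; prints the commands without running them.
    run_backup("vg0", "vhost1", "5G", "/var/backups/vhost1.img")
```

Removing the snapshot in a finally block reflects the point made above: the snapshot is only needed for the duration of the backup.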
For KVM there are two backup scenarios, which are described in further detail:
The two scenarios cannot be mixed. The best results were obtained with LVM on iSCSI partitions.
Prerequisites for doing backups with the help of LVM are:
Prerequisites for doing backups with the help of virsh are:
virsh can be used comfortably and securely only in recent versions. We suggest using at least Ubuntu 14.04 and virsh version 1.2.12 for virsh snapshots.
Snapshot backups can be compressed when there is not enough space. Compression is usually done while the copy is written.
Backup images are usually saved on remote hosts or drives for reliability reasons, but compressed backups should only be written to local disks. The best way is to compress and store the backup on a local drive first and to copy or move it to a remote drive or host afterwards.
The compression of a backup to a network drive can be interrupted, e.g. by a temporary network error; a short timeout is sufficient. This leads to a hanging compression process (bzip or gzip). Such processes can only be killed by a reboot.
One or a few hanging processes are usually no problem for a running Linux system. But hanging compression processes can block a snapshot and thus prevent it from being removed. When the snapshot is not big enough to hold all changed files, first the snapshot becomes useless, then the affected vhost crashes, and eventually the KVM host goes down together with all its running vhosts.
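The recommended order, compressing to a local disk first and touching the remote drive only afterwards, can be sketched as follows. The function name compress_then_move and its parameters are hypothetical, gzip is used here simply because it is in the Python standard library, and the remote drive is assumed to be mounted as an ordinary directory.

```python
import gzip
import shutil
from pathlib import Path


def compress_then_move(source, local_dir, remote_dir, chunk_size=1024 * 1024):
    """Compress source into local_dir, then move the result to remote_dir."""
    source = Path(source)
    local_copy = Path(local_dir) / (source.name + ".gz")

    # Step 1: compression writes only to the local disk, so a network
    # error cannot leave the compression hanging on the snapshot.
    with open(source, "rb") as src, gzip.open(local_copy, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)

    # Step 2: the snapshot is no longer being read at this point;
    # only the finished archive is transferred to the remote drive.
    final = Path(remote_dir) / local_copy.name
    shutil.move(str(local_copy), str(final))
    return final
```

With this ordering the snapshot can be removed as soon as step 1 has finished, independent of how long the transfer in step 2 takes.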
Backups are most easily written as complete copies of the vhost disk image. This can quickly consume a lot of disk space. Much of that space is wasted because only part of the virtual disk image is filled by the vhost, and an even smaller part has changed since the last backup.
An apparent alternative is to open the snapshot and the last backup file, mount the partitions they contain, and synchronize the content of the snapshot to the backup. As a result, the backup becomes a current copy of the snapshot; no additional backup file is needed (saves space) and only the changed files are copied (saves time).
Although it has been shown in practice that such a backup process works, it has a number of disadvantages: