Faulty mapping between iSCSI and multipath devices
Error description
In another section the device dependencies for the KVM host 'kvm55' were already displayed:
root@kvm55:~# dmsetup ls --tree
350002ac21d622071 (252:4)
 ├─ (8:32)
 └─ (8:16)
vg-images (252:0)
 └─ (8:4)
vg3_nas-lv_nas5501 (252:5)
 └─350002ac21d632071 (252:2)
    ├─ (8:64)
    └─ (8:48)
vg3_dbserver-lv_dbserver5501 (252:3)
 └─ (8:32)
vg-lv_dbslave5501 (252:1)
 └─ (8:4)
The device dependencies show an anomaly: the device 'vg3_dbserver-lv_dbserver5501' should be bound to the multipath device '350002ac21d622071'. Instead it is bound only to the hard disk (8:32) of the iSCSI device. According to this source, the anomaly affects the stability of the device: if the iSCSI path to which the hard disk (8:32) is bound fails, the LVM device 'vg3_dbserver-lv_dbserver5501' fails as well - no matter whether the 2nd iSCSI path is still connected or not. The redundancy provided by the two iSCSI paths is therefore lost. Data loss due to such a faulty mapping was indeed observed during administration.
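As a side note, the major:minor numbers shown in the tree can be translated into device names, for example with 'lsblk'. A minimal sketch; the mapping corresponds to the 'lsscsi' output shown at the end of this section:
root@kvm55:~# lsblk -d -o NAME,MAJ:MIN /dev/sdb /dev/sdc
NAME MAJ:MIN
sdb    8:16
sdc    8:32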
Failure cause
The error is caused by a misconfiguration of LVM. Devices to which LVM devices must not bind have to be blacklisted in the LVM configuration file 'lvm.conf'. See the solution description below for details.
Solution
The problem can be solved in two steps.
First step: Adapt the LVM configuration
A filter in the LVM configuration file '/etc/lvm/lvm.conf' can prevent the binding of LVM devices to single hard disks. Only the most important lines of the configuration file are shown:
root@kvm58:/etc/lvm# cat lvm.conf
# This is an adapted configuration file for the LVM2 system.
# It contains the default settings that would be used if there was no
# plus special filter for proper connection of lvm to multipath
# distributed by puppet - do not edit locally
devices {
    ...
    # 1st filter: iscsi devices, 2nd,3rd filter: vg on local disks/RAID0
    filter = [ 'a/mapper/', 'a|/dev/sda4$|', 'a|/dev/md|', 'r/.*/' ]
    ...
}
The decisive line of the configuration file is the one beginning with 'filter ='. Together with the final reject rule 'r/.*/', the accept entries restrict LVM to the multipath devices under /dev/mapper, the local partition /dev/sda4 and the software RAID devices under /dev/md, so LVM devices can no longer bind directly to single SCSI disks. For details on the filter syntax see the man page of 'lvm.conf'.
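The filter entries are evaluated from left to right and the first matching pattern decides whether a device is accepted ('a') or rejected ('r'). A minimal sketch (a generic example, not the filter of the hosts above) illustrating why the reject rule has to come last:
# first match wins: multipath maps and the local partition are accepted,
# every other block device is rejected by the final rule
filter = [ 'a|/dev/mapper/|', 'a|/dev/sda4$|', 'r|.*|' ]
# if the reject rule came first, no device at all would be accepted:
# filter = [ 'r|.*|', 'a|/dev/mapper/|' ]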
The filter has to be in place before any LVM devices become active. The easiest way to achieve this is to reboot the host after the reconfiguration. When this is not possible, follow the description of the second step.
Second step: Rescan and reconnect the LVM devices
When the filter is in place, these steps have to be followed to properly reconnect the LVM devices:
- Deactivate all processes accessing the LVM devices
- Set all LVM partitions and volume groups to state 'inactive'
- Rescan all LVM devices
- Set all LVM partitions and volume groups back to state 'active'
- Reactivate all processes accessing the LVM devices
The steps will now be described in more detail:
1. Deactivate processes
Details depend on the running processes. On a KVM host at least all vhosts that access LVM partitions have to be stopped. Often the process libvirt-bin also has to be stopped:
root@kvm55:~# virsh list
 Id Name                 State
----------------------------------------------------
  1 nas                  running
  2 webserver01          running
  3 admin                running
  4 vpn                  running

root@kvm55:~# virsh shutdown nas
root@kvm55:~# virsh shutdown webserver01
root@kvm55:~# virsh shutdown admin
root@kvm55:~# virsh shutdown vpn
root@kvm55:~# /etc/init.d/libvirt-bin stop
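Before continuing it can be helpful to verify that no vhost is running any more. A minimal sketch; the output is illustrative and was not captured on kvm55:
root@kvm55:~# virsh list --all
 Id Name                 State
----------------------------------------------------
  - nas                  shut off
  - webserver01          shut off
  - admin                shut off
  - vpn                  shut off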
The output of the commands is not displayed, because the commands cannot be tested on a running host without really stopping all vhosts.
If possible, prefer a proper shutdown of the vhosts from their own command line over the shutdown with the command 'virsh shutdown' from the command line of the KVM host.
2. Deactivate LVM devices
It is sufficient to deactivate the volume groups. The logical partitions will be deactivated automatically.
root@kvm55:~# vgchange -an
The output of the commands is again not displayed, because the commands cannot be tested on a running host.
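If a verification is wanted, 'lvscan' should now list all logical volumes as 'inactive'. A sketch with illustrative output (sizes omitted):
root@kvm55:~# lvscan
  inactive          '/dev/vg/images' [...]
  inactive          '/dev/vg/lv_dbslave5501' [...]
  inactive          '/dev/vg3_dbserver/lv_dbserver5501' [...]
  inactive          '/dev/vg3_nas/lv_nas5501' [...]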
3. Scan LVM devices
The scanning is done per LVM device type (physical volume, volume group, logical volume):
root@kvm55:~# pvscan
root@kvm55:~# vgscan
root@kvm55:~# lvscan
For the output of the commands, see above.
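The important point of the rescan is that the physical volumes are now detected on the multipath devices under /dev/mapper instead of on the single disks. A sketch of how the 'pvscan' output could roughly look (sizes omitted, output illustrative):
root@kvm55:~# pvscan
  PV /dev/mapper/350002ac21d622071   VG vg3_dbserver   lvm2 [...]
  PV /dev/mapper/350002ac21d632071   VG vg3_nas        lvm2 [...]
  PV /dev/sda4                       VG vg             lvm2 [...]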
4. Activate LVM devices
At first the volume groups are activated. Afterwards the logical partitions can be activated:
root@kvm55:~# vgchange -ay
root@kvm55:~# lvchange -ay /dev/vg3_dbserver/lv_dbserver5501
root@kvm55:~# lvchange -ay /dev/vg/images
root@kvm55:~# lvchange -ay /dev/vg/lv_dbslave5501
For the output of the commands, see above. The explicit activation of the logical volumes is not strictly necessary; they will be activated during the next access anyway.
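A quick way to check the activation state is the attribute column of 'lvs': the fifth character is 'a' for an active logical volume. A sketch with illustrative output:
root@kvm55:~# lvs -o vg_name,lv_name,lv_attr
  VG            LV               Attr
  vg            images           -wi-a-----
  vg            lv_dbslave5501   -wi-a-----
  vg3_dbserver  lv_dbserver5501  -wi-a-----
  vg3_nas       lv_nas5501       -wi-a-----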
5. Activate processes
All processes that access LVM devices can now be started again. On a KVM host the service 'libvirt-bin' can be restarted as well as the virtual hosts:
root@kvm55:~# /etc/init.d/libvirt-bin start
root@kvm55:~# virsh start nas
root@kvm55:~# virsh start webserver01
root@kvm55:~# virsh start admin
root@kvm55:~# virsh start vpn
The depicted restart order does not take possible mutual dependencies among the vhosts into account. For the output of the commands, see above.
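Analogous to the check after the shutdown, 'virsh list' can be used to confirm that the vhosts are running again. A sketch; the output is illustrative and the domain IDs will differ:
root@kvm55:~# virsh list
 Id Name                 State
----------------------------------------------------
  1 nas                  running
  2 webserver01          running
  3 admin                running
  4 vpn                  running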
As a result, the logical partitions on iSCSI devices should now be properly bound to the multipath devices. This means the partitions are bound to both iSCSI paths:
root@kvm55:~# lsscsi --device
[4:2:0:0]    disk    LSI      MR9240-4i        2.13  /dev/sda[8:0]
[5:0:0:0]    disk    3PARdata VV               3112  /dev/sdc[8:32]
[5:0:0:1]    disk    3PARdata VV               3112  /dev/sde[8:64]
[5:0:0:254]  enclosu 3PARdata SES              3112  -
[6:0:0:0]    disk    3PARdata VV               3112  /dev/sdb[8:16]
[6:0:0:1]    disk    3PARdata VV               3112  /dev/sdd[8:48]
[6:0:0:254]  enclosu 3PARdata SES              3112  -

root@kvm55:~# dmsetup ls --tree
vg-images (252:0)
 └─ (8:4)
vg3_nas-lv_nas5501 (252:5)
 └─350002ac21d632071 (252:2)
    ├─ (8:64)
    └─ (8:48)
vg3_dbserver-lv_dbserver5501 (252:3)
 └─350002ac21d622071 (252:4)
    ├─ (8:32)
    └─ (8:16)
vg-lv_dbslave5501 (252:1)
 └─ (8:4)
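In addition, the path state of the multipath devices can be inspected with 'multipath -ll'; with both iSCSI paths connected, each map should list two active paths. A sketch with shortened, illustrative output:
root@kvm55:~# multipath -ll 350002ac21d622071
350002ac21d622071 dm-4 3PARdata,VV
[...]
`-+- policy='round-robin 0' prio=1 status=active
  |- 5:0:0:0 sdc 8:32 active ready running
  `- 6:0:0:0 sdb 8:16 active ready running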