Setting up RAID on encrypted (LUKS) LVM and decrypting at boot

I have used RAID with LVM as well as LVM-on-LUKS for years, and now I wanted to combine the two: set up two encrypted containers in a simple raid1 array. However, there were some unexpected niggles along the way. Here I document how I set up the system on Debian. My starting point was a system whose root filesystem was already a logical volume (LV) on top of an encrypted block device (PV), but you can use the steps here to set things up from scratch.

Assume that the volume group (VG) is called Main, the LV is named root, and there's an extra block device, /dev/sdb2, that we want to use as an encrypted mirror of root. First set up an encrypted block device on top of sdb2: cryptsetup --verbose --verify-passphrase luksFormat /dev/sdb2
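If you want the new container available right away rather than waiting for the reboot mentioned below, you should also be able to open it by hand; a minimal sketch, using the sdb2_crypt name that matches the crypttab entry below:

cryptsetup luksOpen /dev/sdb2 sdb2_crypt   # creates /dev/mapper/sdb2_crypt immediately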

Then add a new entry to /etc/crypttab so this volume gets decrypted at boot, for example:

sdb2_crypt UUID=... none luks,discard,initramfs

You can get the UUID from e.g. ls -l /dev/disk/by-uuid. Using a UUID should protect against hardware or boot order changes where a disk may be remapped to sdc or similar. I use none as the key file because I'm happy to type in the passphrase at boot (note though that you'll need to type in the passphrase for each encrypted volume). I added discard as I use an SSD, but check the security implications in man crypttab. Finally, I added initramfs in an attempt to solve a boot order issue (see below); it did not help there, but it may help with ordering things properly during shutdown.
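The UUID in question is that of the LUKS container on the underlying partition; a couple of ways to look it up, sketched with the example device from above:

cryptsetup luksUUID /dev/sdb2   # prints the UUID of the LUKS header
blkid /dev/sdb2                 # shows the same UUID, with TYPE="crypto_LUKS"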

Don't forget to run update-initramfs -u whenever you update crypttab or change the LVM configuration, such as adding PVs to a VG. The easiest way to get the new device, /dev/mapper/sdb2_crypt, is to reboot now.
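In shell terms that's just the following (passing -k all to cover every installed kernel is a habit of mine rather than strictly required):

update-initramfs -u -k all
reboot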

Then add the new PV to LVM: vgextend Main /dev/mapper/sdb2_crypt
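Depending on your LVM version, vgextend may refuse to use a device that hasn't been initialised as a PV yet; a sketch of the full sequence in that case:

pvcreate /dev/mapper/sdb2_crypt     # initialise the opened LUKS container as a physical volume
vgextend Main /dev/mapper/sdb2_crypt
vgs Main                            # check that the VG now has two PVs and the extra free space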

Finally set up raid1 for the existing LV: lvconvert --type raid1 -m 1 Main/root
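The conversion kicks off an initial sync of the new mirror leg; a sketch for watching its progress (copy_percent is the sync percentage LVM reports for RAID LVs):

lvs -a -o name,copy_percent,devices Main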

Using the raid1 type appears to be a better option than the older mirror type, but YMMV.

A problem here though is that during boot, LVM activation is triggered every time a new block device appears. Once the first LUKS container is opened (but not yet the second one), LVM is happy to activate the root LV in a degraded state, which means that all data is available in the RAID array but some underlying devices are missing. Since the missing device will eventually appear, the LV ends up in a special state, which you can see using lvs -a -o name,health_status: it will report "refresh needed".

If this happens, trigger a refresh using lvchange --refresh Main/root. For me, lvconvert --repair Main/root or lvchange --syncaction repair didn't do anything. These tools are likely useful when the underlying PV is gone for good (e.g. in case of a drive failure).
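Putting the check and the fix together, a minimal sketch of what to run after a boot that came up degraded:

lvs -a -o name,health_status,devices Main   # "refresh needed" in the health column means a leg is stale
lvchange --refresh Main/root                # clears that state and resyncs once the second PV is present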

To prevent this from happening at every boot, update /etc/lvm/lvm.conf and set activation_mode = "complete" instead of "degraded", then run update-initramfs -u. This means that LVM will refuse to activate Main/root (and any other RAID LVs) after just the first LUKS container has been opened, and will wait for the second one to appear. A downside of this approach is that in case of a total drive failure the LVs won't be activated at all during boot, but you can probably activate them manually from the initramfs prompt easily enough.
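As a sketch, the relevant bit of /etc/lvm/lvm.conf ends up looking like this (on my system the setting lives in the activation section, commented out by default), followed by the initramfs rebuild:

activation {
    # refuse to activate RAID LVs unless all of their PVs are present
    activation_mode = "complete"
}

update-initramfs -u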
