unofficial mirror of bug-guix@gnu.org 
* bug#40999: GRUB prevents booting a degraded RAID1 array atop LUKS
@ 2020-05-01 13:56 maxim.cournoyer
  2021-08-07  5:06 ` Maxim Cournoyer
  2022-03-27  4:07 ` Maxim Cournoyer
  0 siblings, 2 replies; 8+ messages in thread
From: maxim.cournoyer @ 2020-05-01 13:56 UTC (permalink / raw)
  To: 40999

On a system where:

1) Each disk comprising the array is fully LUKS encrypted
2) Each mapped disk is made part of a Btrfs RAID1 array

When attempting to boot the system after disabling a drive (in the BIOS
or by pulling its cable) to simulate a complete disk failure, GRUB
hangs: it prompts for the LUKS passphrase of the missing drive and
(unsurprisingly) fails to open it.

This prevents booting a degraded, LUKS-encrypted Btrfs RAID1 array on
Guix System.

Maxim





* bug#40999: GRUB prevents booting a degraded RAID1 array atop LUKS
  2020-05-01 13:56 bug#40999: GRUB prevents booting a degraded RAID1 array atop LUKS maxim.cournoyer
@ 2021-08-07  5:06 ` Maxim Cournoyer
  2021-08-11 14:45   ` Giovanni Biscuolo
  2022-03-27  4:07 ` Maxim Cournoyer
  1 sibling, 1 reply; 8+ messages in thread
From: Maxim Cournoyer @ 2021-08-07  5:06 UTC (permalink / raw)
  To: 40999

Hello,

maxim.cournoyer@gmail.com writes:

> On a system where:
>
> 1) Each disk comprising the array is fully LUKS encrypted
> 2) Each mapped disk is made part of a Btrfs RAID1 array
>
> When attempting to boot the system after pulling out (in BIOS or using
> the cable) the drive to simulate a complete disk failure, GRUB hangs,
> prompting for the LUKS password of the disappeared drive and
> (unsurprisingly) failing to open it.
>
> This prevents booting in a degraded LUKS encrypted, Btrfs RAID1 on Guix
> System.

I retested this today, and the problem still occurs.  Here's a
screenshot from the failed boot (GRUB):

[-- Attachment #2: IMG_20210807_004828_1.jpg --]
[-- Type: image/jpeg, Size: 1155457 bytes --]

Ideally, GRUB (or is it our boot script?) should be smart enough to
realize that oh, that's Btrfs RAID1, it ought to work in degraded mode,
so let's keep going.
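
For context, on the Linux side Btrfs only mounts a file system with a
missing member when the 'degraded' mount option is given.  A minimal
sketch of how a boot script might choose the options follows; the
function name and the device counts passed to it are hypothetical, and
this is not something Guix System is known to do today:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Btrfs mount options for a subvolume,
# adding 'degraded' when fewer devices are present than expected.
btrfs_root_options() {
    local present=$1 expected=$2 subvol=$3
    local opts="subvol=$subvol"
    if (( present < expected )); then
        opts="degraded,$opts"
    fi
    printf '%s\n' "$opts"
}

# Example: a two-disk RAID1 with one disk pulled.
btrfs_root_options 1 2 @root
```

The printed string could then be passed to 'mount -o'.  How the number
of present devices would be determined is left open here.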

Thanks,

Maxim


* bug#40999: GRUB prevents booting a degraded RAID1 array atop LUKS
  2021-08-07  5:06 ` Maxim Cournoyer
@ 2021-08-11 14:45   ` Giovanni Biscuolo
  2021-08-12  2:25     ` Maxim Cournoyer
  0 siblings, 1 reply; 8+ messages in thread
From: Giovanni Biscuolo @ 2021-08-11 14:45 UTC (permalink / raw)
  To: Maxim Cournoyer, 40999

Hello Maxim,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

[...]

>> On a system where:
>>
>> 1) Each disk comprising the array is fully LUKS encrypted
>> 2) Each mapped disk is made part of a Btrfs RAID1 array
>>
>> When attempting to boot the system after pulling out (in BIOS or using
>> the cable) the drive to simulate a complete disk failure, GRUB hangs,
>> prompting for the LUKS password of the disappeared drive and
>> (unsurprisingly) failing to open it.

[...]

> Ideally, GRUB (or is it our boot script?)

Since the end result is that your system entered "grub rescue" mode,
AFAIU it's a GRUB issue.

> should be smart enough to realize that oh, that's Btrfs RAID1, it
> ought to work in degraded mode, so let's keep going.

I (still) don't have a Guix System to test your setup and (try to) patch
things up, so we need more info to debug the situation.

Can you please provide the output of the "ls" command and the "set"
command from the grub rescue shell?

Also, please, what is your /proc/cmdline (when Linux boots correctly)?

Best regards, Gio

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


* bug#40999: GRUB prevents booting a degraded RAID1 array atop LUKS
  2021-08-11 14:45   ` Giovanni Biscuolo
@ 2021-08-12  2:25     ` Maxim Cournoyer
  2021-08-13 15:05       ` Giovanni Biscuolo
  0 siblings, 1 reply; 8+ messages in thread
From: Maxim Cournoyer @ 2021-08-12  2:25 UTC (permalink / raw)
  To: Giovanni Biscuolo; +Cc: 40999

Hello Giovanni,

Giovanni Biscuolo <g@xelera.eu> writes:

> Hello Maxim,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
>
> [...]
>
>>> On a system where:
>>>
>>> 1) Each disk comprising the array is fully LUKS encrypted
>>> 2) Each mapped disk is made part of a Btrfs RAID1 array
>>>
>>> When attempting to boot the system after pulling out (in BIOS or using
>>> the cable) the drive to simulate a complete disk failure, GRUB hangs,
>>> prompting for the LUKS password of the disappeared drive and
>>> (unsurprisingly) failing to open it.
>
> [...]
>
>> Ideally, GRUB (or is it our boot script?)
>
> Since the end result is your system entered "grub rescue" mode AFAIU
> it's a GRUB issue

Yeah, it looks like it.  The grub.cfg file only has basic things in it,
nothing that could explain the failure.

>> should be smart enough to realize that oh, that's Btrfs RAID1, it
>> ought to work in degraded mode, so let's keep going.
>
> I (still) don't have a Guix System to test your setup and (try to) patch
> thing up, so we need more info to debug the situation.

I believe the basic recipe to reproduce is there:

1. Partition two drives like so (GPT with 2MiB BIOS boot):

$ sudo sfdisk -l /dev/sda
Disk /dev/sda: 931.53 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: WDC WD1002FAEX-0
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: B5BB7BA4-23A3-4E7C-87BB-8339B02C5905

Device     Start        End    Sectors   Size Type
/dev/sda1   2048       6143       4096     2M BIOS boot
/dev/sda2   6144 1953523711 1953517568 931.5G Linux filesystem

$ sudo sfdisk -l /dev/sdb
Disk /dev/sdb: 931.53 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: WDC WD1002FAEX-0
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 45C58C18-7B39-A745-B22F-6A2321FB1999

Device     Start        End    Sectors   Size Type
/dev/sdb1   2048       6143       4096     2M BIOS boot
/dev/sdb2   6144 1953523711 1953517568 931.5G Linux filesystem

2. LUKS encrypt the whole 2nd (main) partition of each drive.

3. Format the mapped devices as Btrfs RAID1.

4. Reconfigure a Guix system on top of that (see a config snippet below)

5. Disconnect one of the two drives and reboot.

6. Contemplate the failure to get past GRUB.
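
Steps 2 and 3 above can be sketched as follows.  This is a dry run that
only prints the commands, since actually running them wipes the
partitions.  The mapping names crypt0 and crypt1 are placeholders, not
the names used in the config below, and the exact options used on the
original system are not stated in this report:

```shell
#!/usr/bin/env bash
# Print (do not run) the commands for steps 2 and 3 of the recipe:
# LUKS-format and open each partition, then create a Btrfs RAID1
# file system across the mapped devices.
plan_raid1_on_luks() {
    local -a mapped=()
    local part name i=0
    for part in "$@"; do
        name="crypt$i"                      # placeholder mapping name
        echo "cryptsetup luksFormat $part"
        echo "cryptsetup open $part $name"
        mapped+=("/dev/mapper/$name")
        i=$((i + 1))
    done
    # RAID1 for both metadata (-m) and data (-d).
    echo "mkfs.btrfs -m raid1 -d raid1 ${mapped[*]}"
}

plan_raid1_on_luks /dev/sda2 /dev/sdb2
```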

> Can you please provide the output of the "ls" command and the "set"
> command from the grub rescue shell?

I'll post after rebooting.

> Also, please what is your /proc/cmdline (when Linux correcly boots)?

--8<---------------cut here---------------start------------->8---
BOOT_IMAGE=/@root/gnu/store/1c0dkkkv5vdnyp73gvcl9k1kym5jjm54-linux-libre-5.13.8/bzImage
--root=/dev/mapper/cryptroot
--system=/gnu/store/815481yf1kfacwgkh4aa11rlb3lm6gvi-system
--load=/gnu/store/815481yf1kfacwgkh4aa11rlb3lm6gvi-system/boot quiet
snd_hda_intel.dmic_detect=0 modprobe.blacklist=rtl8187
--8<---------------cut here---------------end--------------->8---

The system config relevant sections are:

--8<---------------cut here---------------start------------->8---
(operating-system
    (host-name "hurd")
    (timezone "America/Montreal")
    (keyboard-layout (keyboard-layout "dvorak"))
    (bootloader (bootloader-configuration
                 (bootloader grub-bootloader)
                 (target "/dev/sda")
                 (terminal-outputs '(console))
                 (keyboard-layout keyboard-layout)))
    (kernel-arguments '("quiet" "snd_hda_intel.dmic_detect=0"
                        "modprobe.blacklist=rtl8187"))
    (mapped-devices
     (list (mapped-device
            (source "/dev/sda2")
            (target "cryptroot")
            (type luks-device-mapping))
           (mapped-device
            (source "/dev/sdb2")
            (target "cryptroot-mirror")
            (type luks-device-mapping))
           (mapped-device
            (source "/dev/sdc2")
            (target "cryptroot-mirror2")
            (type luks-device-mapping))))

    ;; Note: Using any of the LUKS encrypted drives exposed under
    ;; /dev/mapper is enough to reference the Btrfs RAID-1 array,
    ;; since the 'btrfs device scan' command is executed in the init
    ;; RAM disk and takes care of assembling the array.
    (file-systems (cons* (file-system
                           (mount-point "/")
                           (device "/dev/mapper/cryptroot")
                           (type "btrfs")
                           (options (alist->file-system-options
                                     (cons '("subvol" . "@root")
                                           %common-btrfs-options)))
                           (dependencies mapped-devices))
                         (file-system
                           (device "/dev/mapper/cryptroot")
                           (mount-point "/home")
                           (type "btrfs")
                           (options (alist->file-system-options
                                     (cons '("subvol" . "@home")
                                           %common-btrfs-options)))
                           (dependencies mapped-devices))
                         (file-system
                           (device "/dev/mapper/cryptroot")
                           (mount-point "/data")
                           (type "btrfs")
                           (options (alist->file-system-options
                                     (cons '("subvol" . "@data")
                                           %common-btrfs-options)))
                           (dependencies mapped-devices))
                         %base-file-systems))
   [...]
--8<---------------cut here---------------end--------------->8---

Thanks,

Maxim





* bug#40999: GRUB prevents booting a degraded RAID1 array atop LUKS
  2021-08-12  2:25     ` Maxim Cournoyer
@ 2021-08-13 15:05       ` Giovanni Biscuolo
  2021-08-29  6:15         ` Maxim Cournoyer
  0 siblings, 1 reply; 8+ messages in thread
From: Giovanni Biscuolo @ 2021-08-13 15:05 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 40999

Hi Maxim,

I'd "debug" the issue by comparing my Debian system config with yours,
since I'm also using a BTRFS RAID1 filesystem on LUKS.

I still haven't unplugged one of the two disks on mine to simulate a
drive failure; I'd like to test this condition Soon™... but it's a
busy machine so I don't know when.

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

[...]

>>> Ideally, GRUB (or is it our boot script?)
>>
>> Since the end result is your system entered "grub rescue" mode AFAIU
>> it's a GRUB issue
>
> Yeah, it looks like it.  The grub.cfg file only has basic things in it,
> nothing that could explain the failure.

Please could you also provide the result of "lsblk -f"?

This is (part of) my disks layout:

--8<---------------cut here---------------start------------->8---

sdc                                                                                  
├─sdc1                                                                               
├─sdc2 vfat                      F6D8-67E3                             470.8M     1% /boot/efi
├─sdc3 crypto_L                  e554b806-19ac-48b2-b521-b4e89839a756                
│ └─crypt_swap01
│      swap                      a43ce70c-dd35-47d8-a2ef-ef9d3c6d0885                [SWAP]
└─sdc4 crypto_L                  820bfdf7-46f7-46f5-8536-7e1b0f04e70e                
  └─crypt_btrfs01_03
       btrfs    btrfs_pool01     82afe97a-bb97-4b3d-90cb-93a058185b97                
sdd                                                                                  
├─sdd1                                                                               
├─sdd2                                                                               
├─sdd3 crypto_L                  960aa919-182b-4604-a8be-8477c86386cc                
│ └─crypt_swap02
│      swap                      3f8f6974-05a9-4047-993a-c4ccb27eaa1d                [SWAP]
└─sdd4 crypto_L                  c590c62e-6ac8-418c-9ea7-7ae9c79058c8                
  └─crypt_btrfs01_04
       btrfs    btrfs_pool01     82afe97a-bb97-4b3d-90cb-93a058185b97  802.3G    57% /mnt/btrfs

--8<---------------cut here---------------end--------------->8---

btrfs_pool01 is my BTRFS RAID1 filesystem; it includes /boot and /
(root) and is on two encrypted LUKS partitions, as you can see.

Also, please, what's in your grub.cfg?

This is the config of a menuentry of mine:

--8<---------------cut here---------------start------------->8---

menuentry 'Debian GNU/Linux' --class debian --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-82afe97a-bb97-4b3d-90cb-93a058185b97' {
	load_video
	insmod gzio
	if [ x$grub_platform = xxen ]; then insmod xzio; insmod lzopio; fi
	insmod part_gpt
	insmod cryptodisk
	insmod luks
	insmod gcry_rijndael
	insmod gcry_rijndael
	insmod gcry_sha256
	insmod btrfs
	cryptomount -u c590c62e6ac8418c9ea77ae9c79058c8
	set root='cryptouuid/c590c62e6ac8418c9ea77ae9c79058c8'
	if [ x$feature_platform_search_hint = xy ]; then
	  search --no-floppy --fs-uuid --set=root --hint='cryptouuid/c590c62e6ac8418c9ea77ae9c79058c8'  82afe97a-bb97-4b3d-90cb-93a058185b97
	else
	  search --no-floppy --fs-uuid --set=root 82afe97a-bb97-4b3d-90cb-93a058185b97
	fi
	echo	'Loading Linux 5.10.0-0.bpo.3-amd64 ...'
	linux	/debian_root/boot/vmlinuz-5.10.0-0.bpo.3-amd64 root=UUID=82afe97a-bb97-4b3d-90cb-93a058185b97 ro rootflags=subvol=debian_root ip=10.38.2.2::10.38.2.1:255.255.255.0:anemone:eth0:none quiet
	echo	'Loading initial ramdisk ...'
	initrd	/debian_root/boot/initrd.img-5.10.0-0.bpo.3-amd64
}

--8<---------------cut here---------------end--------------->8---

AFAIU this code (from the snippet above):

--8<---------------cut here---------------start------------->8---

	if [ x$feature_platform_search_hint = xy ]; then
	  search --no-floppy --fs-uuid --set=root --hint='cryptouuid/c590c62e6ac8418c9ea77ae9c79058c8'  82afe97a-bb97-4b3d-90cb-93a058185b97
	else
	  search --no-floppy --fs-uuid --set=root 82afe97a-bb97-4b3d-90cb-93a058185b97
	fi

--8<---------------cut here---------------end--------------->8---

sets [1] the root GRUB environment variable to the first device found
containing the UUID 82afe97a-bb97-4b3d-90cb-93a058185b97, which is the
UUID of my BTRFS filesystem.

AFAIU (but still not tested) this means that if the device with UUID
c590c62[...] is missing, the search ensures that GRUB will find the next
device containing the BTRFS filesystem identified by UUID 82afe97a[...]

WDYT?

[1] https://www.gnu.org/software/grub/manual/grub/grub.html#search
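
The "first found device" behaviour described above can be modelled with
a toy function.  This is only an illustration of the lookup, not GRUB
code: the function name and the "device=fs-uuid" input format are made
up, and crypto0/crypto1 stand for unlocked cryptodisks.

```shell
#!/usr/bin/env bash
# Toy model of 'search --fs-uuid --set=root': walk the devices that are
# currently readable (given as "device=fs-uuid" pairs) and print the
# first one whose file system UUID matches.  Returns non-zero when no
# device matches.
first_device_with_uuid() {
    local wanted=$1 pair
    shift
    for pair in "$@"; do
        if [[ ${pair#*=} == "$wanted" ]]; then
            printf '%s\n' "${pair%%=*}"
            return 0
        fi
    done
    return 1
}

fs=82afe97a-bb97-4b3d-90cb-93a058185b97
# Both members unlocked: the first match wins.
first_device_with_uuid "$fs" "crypto0=$fs" "crypto1=$fs"
# Only the second member is available (first disk missing).
first_device_with_uuid "$fs" "crypto1=$fs"
```

The model also makes one limitation visible: the search can only match
devices that are already readable, so an encrypted member has to be
unlocked with cryptomount before it can be found.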

[...]

>> Can you please provide the output of the "ls" command and the "set"
>> command from the grub rescue shell?
>
> I'll post after rebooting.

OK thanks.

>> Also, please what is your /proc/cmdline (when Linux correcly boots)?
>
> --8<---------------cut here---------------start------------->8---
> BOOT_IMAGE=/@root/gnu/store/1c0dkkkv5vdnyp73gvcl9k1kym5jjm54-linux-libre-5.13.8/bzImage
> --root=/dev/mapper/cryptroot
> --system=/gnu/store/815481yf1kfacwgkh4aa11rlb3lm6gvi-system
> --load=/gnu/store/815481yf1kfacwgkh4aa11rlb3lm6gvi-system/boot quiet
> snd_hda_intel.dmic_detect=0 modprobe.blacklist=rtl8187
> --8<---------------cut here---------------end--------------->8---

This is mine (derived from the GRUB menu entry shown above):

--8<---------------cut here---------------start------------->8---

BOOT_IMAGE=/debian_root/boot/vmlinuz-5.10.0-0.bpo.3-amd64 root=UUID=82afe97a-bb97-4b3d-90cb-93a058185b97 ro rootflags=subvol=debian_root ip=10.38.2.2::10.38.2.1:255.255.255.0:anemone:eth0:none quiet

--8<---------------cut here---------------end--------------->8---

AFAIU using "root=UUID=..." is more robust than using the (possibly
missing) device mapper path.
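
To compare the two command lines mechanically, a small helper can pull
out the root parameter.  This is only a sketch: root_param is a made-up
name, and it assumes parameters are separated by whitespace, as in both
examples quoted above.

```shell
#!/usr/bin/env bash
# Print the value of the root parameter from a kernel command line.
# Accepts both the plain 'root=' form and the '--root=' form that
# appears in the Guix command line quoted above.
root_param() {
    local -a words
    local word
    read -ra words <<< "$1"
    for word in "${words[@]}"; do
        case $word in
            root=*|--root=*)
                printf '%s\n' "${word#*root=}"
                return 0
                ;;
        esac
    done
    return 1
}

root_param "BOOT_IMAGE=/vmlinuz root=UUID=82afe97a-bb97-4b3d-90cb-93a058185b97 ro quiet"
root_param "BOOT_IMAGE=/bzImage --root=/dev/mapper/cryptroot quiet"
```

On a running system it could be called as
root_param "$(cat /proc/cmdline)".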

[...]

Hope this helps.

Best regards, Gio'

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


* bug#40999: GRUB prevents booting a degraded RAID1 array atop LUKS
  2021-08-13 15:05       ` Giovanni Biscuolo
@ 2021-08-29  6:15         ` Maxim Cournoyer
  2022-03-05  3:33           ` Maxim Cournoyer
  0 siblings, 1 reply; 8+ messages in thread
From: Maxim Cournoyer @ 2021-08-29  6:15 UTC (permalink / raw)
  To: Giovanni Biscuolo; +Cc: 40999

Hello Giovanni!

I've finally rebooted the machine, so I am sharing the information
requested:

Giovanni Biscuolo <g@xelera.eu> writes:

> Hi Maxim,
>
> I'd "debug" the issue trying to compare my Debian system config with
> yours since I'm also using a BTRFS RAID1 filesystem on LUKS.

Sounds useful!

[...]

> Please could you also provide the result of "lsblk -f"?

--8<---------------cut here---------------start------------->8---
NAME          FSTYPE    FSVER LABEL      UUID                                 FSAVAIL FSUSE% MOUNTPOINT
sda
├─sda1
└─sda2        crypto_LU                  0792432c-78d8-4dcc-87c5-30200c3d02db
  └─cryptroot btrfs           my-root    2e97fbbd-fa4e-4858-948b-b3a89278a39b  201.2G    77% /var/lib/dock
sdb
├─sdb1
└─sdb2        crypto_LU                  a9aead40-9d01-4f7a-bb83-be70dd192b7b
  └─cryptroot-mirror
              btrfs           my-root    2e97fbbd-fa4e-4858-948b-b3a89278a39b
sdc
├─sdc1
└─sdc2        crypto_LU                  f0afd5c9-da70-46a7-9c6f-5d22913638bf
  └─cryptroot-mirror2
              btrfs           my-root    2e97fbbd-fa4e-4858-948b-b3a89278a39b
sdd           crypto_LU                  f04928db-90aa-458c-8908-036a620b74f6
└─luks-f04928db-90aa-458c-8908-036a620b74f6
              btrfs           Seagate2TB 231e9e86-e841-4c97-81f1-013a2b8d99c2    1.6T    12% /media/maxim/
sr0
sr1
zram0         swap                       76423fb7-9d60-47fc-b64c-313f0a7b1f55                [SWAP]
--8<---------------cut here---------------end--------------->8---

The Btrfs file system in my case is labeled 'my-root' and composed of 3
drives in a raid1c3 btrfs array (3 copies).  @root is a subvolume on
which the root file system lives.

> This is (part of) my disks layout:
>
> sdc
> ├─sdc1
> ├─sdc2 vfat                      F6D8-67E3                             470.8M     1% /boot/efi
> ├─sdc3 crypto_L                  e554b806-19ac-48b2-b521-b4e89839a756
> │ └─crypt_swap01
> │      swap                      a43ce70c-dd35-47d8-a2ef-ef9d3c6d0885                [SWAP]
> └─sdc4 crypto_L                  820bfdf7-46f7-46f5-8536-7e1b0f04e70e
>   └─crypt_btrfs01_03
>        btrfs    btrfs_pool01     82afe97a-bb97-4b3d-90cb-93a058185b97
> sdd
> ├─sdd1
> ├─sdd2
> ├─sdd3 crypto_L                  960aa919-182b-4604-a8be-8477c86386cc
> │ └─crypt_swap02
> │      swap                      3f8f6974-05a9-4047-993a-c4ccb27eaa1d                [SWAP]
> └─sdd4 crypto_L                  c590c62e-6ac8-418c-9ea7-7ae9c79058c8
>   └─crypt_btrfs01_04
>        btrfs    btrfs_pool01     82afe97a-bb97-4b3d-90cb-93a058185b97  802.3G    57% /mnt/btrfs
>
> btrfs_pool01 is my BTRFS RAID1 filesystem, it includes /boot and /
> (root) and is on two encrypted LUKS partitions, as you can see.

> Also, please what's your grub.cfg?

Here it is:

--8<---------------cut here---------------start------------->8---
# This file was generated from your Guix configuration.  Any changes
# will be lost upon reconfiguration.

# Set 'root' to the partition that contains /gnu/store.
search --file --set /@root/gnu/store/wlf9ccsl9pmch1dyv5x8c2gdngwn9m5i-grub-image.png


terminal_output console


insmod png
if background_image /@root/gnu/store/wlf9ccsl9pmch1dyv5x8c2gdngwn9m5i-grub-image.png; then
  set color_normal=light-gray/black
  set color_highlight=yellow/black
else
  set menu_color_normal=cyan/blue
  set menu_color_highlight=white/blue
fi
# Localization configuration.
# search --file --set /@root/gnu/store/q1cf63j2az4wlajg0caqy4nbndp0mvpm-grub-locales/en@quot.mo
set locale_dir=/@root/gnu/store/q1cf63j2az4wlajg0caqy4nbndp0mvpm-grub-locales
set lang=en_US
insmod keylayouts
keymap /@root/gnu/store/25s8pbpv2fnidrgir26mn97g0ciq52gz-grub-keymap.dvorak

set default=0
set timeout=5
menuentry "GNU with Linux-Libre 5.13.12" {
  search --file --set /@root/gnu/store/hvmyb8maz32dy6ra5g68gr4wd08pzq3r-linux-libre-5.13.12/bzImage
  linux /@root/gnu/store/hvmyb8maz32dy6ra5g68gr4wd08pzq3r-linux-libre-5.13.12/bzImage --root=/dev/mapper/cryptroot --system=/gnu/store/6qa5ga0pkjbmz8ix8gfrpy65zkl16xi7-system --load=/gnu/store/6qa5ga0pkjbmz8ix8gfrpy65zkl16xi7-system/boot quiet snd_hda_intel.dmic_detect=0 modprobe.blacklist=rtl8187
  initrd /@root/gnu/store/kllyldndnazfxxrhkabgifx5zvgyz82q-raw-initrd/initrd.cpio.gz
}

submenu "GNU system, old configurations..." {
menuentry "GNU with Linux-Libre 5.13.11 (#275, 2021-08-23 23:17)" {
  search --file --set /@root/gnu/store/fznnj7bgs46czizzhn186606jgr52qnp-linux-libre-5.13.11/bzImage
  linux /@root/gnu/store/fznnj7bgs46czizzhn186606jgr52qnp-linux-libre-5.13.11/bzImage --root=/dev/mapper/cryptroot --system=/var/guix/profiles/system-275-link --load=/var/guix/profiles/system-275-link/boot quiet snd_hda_intel.dmic_detect=0 modprobe.blacklist=rtl8187
  initrd /@root/gnu/store/g73vj8qy6kfrgmr8gnmmzh2q59cbnf2w-raw-initrd/initrd.cpio.gz
}

[...]

if [ "${grub_platform}" == efi ]; then
  menuentry "Firmware setup" {
    fwsetup
  }
fi

--8<---------------cut here---------------end--------------->8---

> This is the config of a menuentry of mine:
>
>
> menuentry 'Debian GNU/Linux' --class debian --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-82afe97a-bb97-4b3d-90cb-93a058185b97' {
> 	load_video
> 	insmod gzio
> 	if [ x$grub_platform = xxen ]; then insmod xzio; insmod lzopio; fi
> 	insmod part_gpt
> 	insmod cryptodisk
> 	insmod luks
> 	insmod gcry_rijndael
> 	insmod gcry_rijndael
> 	insmod gcry_sha256
> 	insmod btrfs
> 	cryptomount -u c590c62e6ac8418c9ea77ae9c79058c8
> 	set root='cryptouuid/c590c62e6ac8418c9ea77ae9c79058c8'
> 	if [ x$feature_platform_search_hint = xy ]; then
> 	  search --no-floppy --fs-uuid --set=root --hint='cryptouuid/c590c62e6ac8418c9ea77ae9c79058c8'  82afe97a-bb97-4b3d-90cb-93a058185b97
> 	else
> 	  search --no-floppy --fs-uuid --set=root 82afe97a-bb97-4b3d-90cb-93a058185b97
> 	fi
> 	echo	'Loading Linux 5.10.0-0.bpo.3-amd64 ...'
> 	linux	/debian_root/boot/vmlinuz-5.10.0-0.bpo.3-amd64 root=UUID=82afe97a-bb97-4b3d-90cb-93a058185b97 ro rootflags=subvol=debian_root ip=10.38.2.2::10.38.2.1:255.255.255.0:anemone:eth0:none quiet
> 	echo	'Loading initial ramdisk ...'
> 	initrd	/debian_root/boot/initrd.img-5.10.0-0.bpo.3-amd64
> }
>
>
> AFAIU this code (from the snippet above):
>
>
> 	if [ x$feature_platform_search_hint = xy ]; then
> 	  search --no-floppy --fs-uuid --set=root --hint='cryptouuid/c590c62e6ac8418c9ea77ae9c79058c8'  82afe97a-bb97-4b3d-90cb-93a058185b97
> 	else
> 	  search --no-floppy --fs-uuid --set=root 82afe97a-bb97-4b3d-90cb-93a058185b97
> 	fi
>
>
> sets [1] the root GRUB env variable to the first found device containing
> the UUID 82afe97a-bb97-4b3d-90cb-93a058185b97, that is the UUID of my
> BTRFS filesystem
>
> AFAIU (but still not tested) this means that if the device with UUID
> c590c62[...] is missing the search ensures that GRUB will find the next
> device containing the BTRFS filesystem identified by UUID 82afe97a[...]
>
> WDYT?

[...]


>>> Can you please provide the output of the "ls" command and the "set"
>>> command from the grub rescue shell?

See the attached screenshot of the result:


[-- Attachment #2: IMG_20210829_023531.jpg --]
[-- Type: image/jpeg, Size: 1028986 bytes --]

I was about to mess around in GRUB to edit the prefix, cmdline and
root values and do `insmod normal`, `normal` to proceed to boot, but
then the init RAM disk failed like so:


[-- Attachment #4: IMG_20210829_024344.jpg --]
[-- Type: image/jpeg, Size: 1015956 bytes --]

So there is more than one thing to be adjusted :-).

Thank you, I'll look at the data with a fresh head later, but it seems
to me that we'd need to have GRUB fallback logic for the root devices
when RAID setups are detected.  I'll read on what GRUB has in store for
this kind of thing when I have a chance.

Your Debian GRUB config also has me wondering about the 'btrfs' module,
and the others that Guix System is not using.

Thank you!

Maxim


* bug#40999: GRUB prevents booting a degraded RAID1 array atop LUKS
  2021-08-29  6:15         ` Maxim Cournoyer
@ 2022-03-05  3:33           ` Maxim Cournoyer
  0 siblings, 0 replies; 8+ messages in thread
From: Maxim Cournoyer @ 2022-03-05  3:33 UTC (permalink / raw)
  To: Giovanni Biscuolo; +Cc: 40999

Hi,

I'm writing here because I just found a much easier way to trigger this
than opening the case of my desktop and pulling a drive out, namely
this QEMU script:

--8<---------------cut here---------------start------------->8---
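#!/usr/bin/env bash
# Boot the host's own drives in QEMU to exercise GRUB.
# -snapshot redirects writes to temporary files, so the real drives
# are left untouched.

devices=(sda sdb sdc)   # remove an entry to simulate a failed drive
args=(-enable-kvm -snapshot -m 2G)

i=0
for d in "${devices[@]}"; do
    args+=(-drive "file=/dev/$d,index=$i,media=disk")
    i=$((i + 1))
done

# Extra command-line arguments are passed through to QEMU.
qemu-system-x86_64 "${args[@]}" "$@"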
--8<---------------cut here---------------end--------------->8---

This attempts to boot the drives of the *live* system in QEMU; don't
fret, it's not dangerous, as the '-snapshot' option ensures no actual
writes reach the drives.  It seems to fail at the mount command in our
initrd, but it at least allows testing GRUB easily.

With the above script and my Btrfs raid1c3 array on drives /dev/sda,
/dev/sdb and /dev/sdc, after removing 'sdb' from the devices list for

--8<---------------cut here---------------start------------->8---
Booting from Hard Disk...
GRUB loading...
Welcome to GRUB!

Attempting to decrypt master key...
Enter passphrase for hd0,gpt2 (0792432c78d84dcc87c530200c3d02db):
Slot 0 opened
error: failure reading sector 0x0 from `fd0'.
error: no such cryptodisk found.
Attempting to decrypt master key...
Enter passphrase for hd1,gpt2 (f0afd5c9da7046a79c6f5d22913638bf):
Slot 0 opened
error: failure reading sector 0x80 from `fd0'.
error: failure reading sector 0x80 from `fd0'.
error: failure reading sector 0x80 from `fd0'.
error: failure reading sector 0x80 from `fd0'.
error: failure reading sector 0x80 from `fd0'.
error: failure reading sector 0x80 from `fd0'.
error: failure reading sector 0x80 from `fd0'.
error: failure reading sector 0x80 from `fd0'.
error: failure reading sector 0x80 from `fd0'.
error: failure reading sector 0x80 from `fd0'.
--8<---------------cut here---------------end--------------->8---

Dropping just sdc instead, I get:

--8<---------------cut here---------------start------------->8---
Booting from Hard Disk...
GRUB loading...
Welcome to GRUB!

Attempting to decrypt master key...
Enter passphrase for hd0,gpt2 (0792432c78d84dcc87c530200c3d02db): 
Slot 0 opened
Attempting to decrypt master key...
Enter passphrase for hd1,gpt2 (a9aead409d014f7abb83be70dd192b7b): 
Slot 0 opened
error: failure reading sector 0x0 from `fd0'.
error: no such cryptodisk found.
error: failure reading sector 0x80 from `fd0'.
error: unknown filesystem.
Entering rescue mode...
--8<---------------cut here---------------end--------------->8---

This should make a future fix cheaper to try (but a system test will be
best anyway :-)).
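
A possible refinement, sketched under the assumption that the drives are
still named sda, sdb and sdc: build the -drive arguments in a function
that skips the drives to be "unplugged", so the device list need not be
edited by hand.  The function name and its calling convention are made
up for illustration.

```shell
#!/usr/bin/env bash
# Print QEMU -drive arguments, one per line, for every drive that is
# not in the space-separated drop list given as the first argument.
# Indices are assigned only to the drives that remain.
build_drive_args() {
    local drop=" $1 " d i=0
    shift
    for d in "$@"; do
        # Skip any drive listed in the drop list.
        [[ $drop == *" $d "* ]] && continue
        printf '%s\n' -drive "file=/dev/$d,index=$i,media=disk"
        i=$((i + 1))
    done
}

# Dry run: show the arguments with sdb "unplugged".
build_drive_args "sdb" sda sdb sdc
```

The output can be read into an array with mapfile -t and appended to the
other QEMU arguments (-enable-kvm -snapshot -m 2G).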

Thanks,

Maxim





* bug#40999: GRUB prevents booting a degraded RAID1 array atop LUKS
  2020-05-01 13:56 bug#40999: GRUB prevents booting a degraded RAID1 array atop LUKS maxim.cournoyer
  2021-08-07  5:06 ` Maxim Cournoyer
@ 2022-03-27  4:07 ` Maxim Cournoyer
  1 sibling, 0 replies; 8+ messages in thread
From: Maxim Cournoyer @ 2022-03-27  4:07 UTC (permalink / raw)
  To: 40999

Hi,

maxim.cournoyer@gmail.com writes:

> On a system where:
>
> 1) Each disk comprising the array is fully LUKS encrypted
> 2) Each mapped disk is made part of a Btrfs RAID1 array
>
> When attempting to boot the system after pulling out (in BIOS or using
> the cable) the drive to simulate a complete disk failure, GRUB hangs,
> prompting for the LUKS password of the disappeared drive and
> (unsurprisingly) failing to open it.
>
> This prevents booting in a degraded LUKS encrypted, Btrfs RAID1 on Guix
> System.

It seems this problem is known to other (non-Btrfs) software RAID
implementations as well, such as mdadm.  There was recently a fix for it
in Ubuntu [0].  It can probably provide clues about how to go about
fixing it in Guix System.

[0]  https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1879980






Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git
