* ABI mismatch on boot on arm32 system
From: Christoph Buck @ 2024-10-16 10:11 UTC
To: help-guix

Hi!

Currently I am trying to create a Guix image which will boot on an
embedded imx6 arm32 board. Following the Guix manual, I was able to
create such an image. This involved adding a custom U-Boot version and a
kernel with a custom definition file. If flashed onto an SD card, U-Boot
runs and the kernel boots. However, early on boot (presumably while
executing initrd.cpio.gz), a `record-abi-mismatch-error` is thrown and
a Guix recovery REPL is opened:

> Use 'gnu.repl' for an initrd REPL.
>
> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> Throw to key `record-abi-mismatch-error' with args `(abi-check "~a: record ABI mismatch; recompilation needed" (#<record-type <file-system>>) ())'.
>
> Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
> GNU Guile 3.0.9
> Copyright (C) 1995-2023 Free Software Foundation, Inc.
>
> Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
> This program is free software, and you are welcome to redistribute it
> under certain conditions; type `,show c' for details.

Unfortunately I have absolutely no clue what the problem could be. Could
it be that the image was compiled with a different Guile version than
the one executed on the image? Could this explain the ABI mismatch in
the `file-system` record?

Googling for the error, I found the following post on this mailing list:

> https://lists.gnu.org/archive/html/help-guix/2023-02/msg00147.html

It seems Maxim Cournoyer had the same problem on a board with the same
SoC (imx6). Unfortunately there was no follow-up. (I mailed him privately
in case he came up with a solution. If so, I will document it here, so
that the next unlucky soul running into this error can find the
solution.)

I cross-compile the image on x64 with

> guix build -f custom-board.scm --target=arm-linux-gnueabihf -v3 -c2 -M2 -K --no-grafts

where `custom-board.scm` is my image definition (I can share it if
helpful). The option `--no-grafts` is needed due to

> https://issues.guix.gnu.org/66866

I would be very grateful for tips on how to debug this issue further.
It feels like I am very close to a bootable image.

--
Best regards

Christoph

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: ABI mismatch on boot on arm32 system
From: Richard Sent @ 2024-10-16 20:05 UTC
To: Christoph Buck; +Cc: help-guix

Hi Christoph,

> Currently I am trying to create a Guix image which will boot on an
> embedded imx6 arm32 board. Following the Guix manual, I was able to
> create such an image. This involved adding a custom U-Boot version and a
> kernel with a custom definition file. If flashed onto an SD card, U-Boot
> runs and the kernel boots. However, early on boot (presumably while
> executing initrd.cpio.gz), a `record-abi-mismatch-error` is thrown and
> a Guix recovery REPL is opened:
>
>> Use 'gnu.repl' for an initrd REPL.
>>
>> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
>> Throw to key `record-abi-mismatch-error' with args `(abi-check "~a: record ABI mismatch; recompilation needed" (#<record-type <file-system>>) ())'.

Your issue sounds very similar to the one described in
https://issues.guix.gnu.org/61173#4.

The TL;DR (although I encourage you to read it!) is that you need the
CONFIG_BINFMT_MISC Linux kernel compilation option set, but when you use
a linux-libre-*-generic kernel that option is NOT set. If you're using
the qemu-binfmt-service, you'll fail to boot and the error you posted
will occur before you're dropped into a REPL.

Unfortunately services do not currently have any mechanism to require or
check kernel config options.

Not knowing your operating-system declaration I can't tell for sure if
that is what's going on, but I suspect what I described or something
similar is the case. If you confirm this is in fact the problem, feel
free to leave a comment on the issue!

Best of luck.

--
Take it easy,
Richard Sent
Making my computer weirder one commit at a time.
* Re: ABI mismatch on boot on arm32 system
From: Christoph Buck @ 2024-10-20 15:15 UTC
To: Richard Sent; +Cc: help-guix

Richard Sent <richard@freakingpenguin.com> writes:

> Hi Christoph,

Hi Richard!

> The TL;DR (although I encourage you to read it!) is that you need the
> CONFIG_BINFMT_MISC Linux kernel compilation option set, but when you use
> a linux-libre-*-generic kernel that option is NOT set. If you're using
> the qemu-binfmt-service, you'll fail to boot and the error you posted
> will occur before you're dropped into a REPL.

I use a custom kernel, modified straight from kernel.org, and indeed I
had not enabled the `CONFIG_BINFMT_MISC` setting. Unfortunately,
enabling this option **does not** solve my problem.

--
Best regards

Christoph
* Re: ABI mismatch on boot on arm32 system
From: Denis 'GNUtoo' Carikli @ 2024-10-18 20:58 UTC
To: Christoph Buck; +Cc: help-guix

On Wed, 16 Oct 2024 12:11:30 +0200
Christoph Buck <dev@icepic.de> wrote:

> Hi!

Hi,

> Currently I am trying to create a Guix image which will boot on an
> embedded imx6 arm32 board. Following the Guix manual, I was able to
> create such an image. This involved adding a custom U-Boot version and
> a kernel with a custom definition file. If flashed onto an SD card,
> U-Boot runs and the kernel boots. However, early on boot (presumably
> while executing initrd.cpio.gz), a `record-abi-mismatch-error` is
> thrown and a Guix recovery REPL is opened:
>
> > Use 'gnu.repl' for an initrd REPL.
> >
> > ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> > Throw to key `record-abi-mismatch-error' with args `(abi-check "~a:
> > record ABI mismatch; recompilation needed" (#<record-type
> > <file-system>>) ())'.

There is also the option to try to bisect the issues (there might be
more than one).

We now have a u-boot-qemu-arm package, so you could for instance start
with arm64 (with u-boot-qemu-arm64 and a system definition that you
create or reuse and modify) and manage to boot a system with QEMU.

Then, once you have managed to boot an arm64 system, you could try to
reproduce it for 32-bit arm with an older Guix revision (and possibly a
recent u-boot-qemu-arm that doesn't change) and then start bisecting.

I tried to do that a long time ago, but I don't have fast computers, so
at some point I gave up and I never found the time to get back to it.

Denis.
* Re: ABI mismatch on boot on arm32 system
From: Christoph Buck @ 2024-10-20 15:23 UTC
To: help-guix

Hi!

I played around a little bit more and I can now indeed boot
successfully. Instead of using cross-compilation (CLI option
`--target=arm-linux-gnueabihf`) I created a build using QEMU emulation
(CLI option `--system=armhf-linux`). This takes ages to build, but the
resulting image boots without the ABI error. Unfortunately this is not
a real fix, because it is too slow to be a practical workaround (at
least for me).

I dug a little deeper and this is what I have found out so far. In case
I am running off in a totally wrong direction, someone with more clue
than me should please stop me ;)

I think something goes wrong during cross-compilation of the Guile
modules in the package `module-import-compiled`. The ABI error is thrown
early on boot in the `initrd.cpio.gz` ramdisk. I extracted and
decompressed the ramdisk from both builds (cross-compilation and QEMU);
both contain the `module-import-compiled` Guile modules. I would expect
the *.go files from the `module-import-compiled` package of both
ramdisks to be binary identical, but they have different md5sums. Let's
take for example `file-systems.go`, which causes the ABI error.

--8<---------------cut here---------------start------------->8---
local@host:crosscompilation-initrd/gnu/store/5ffy1h3fgikzdhfz4nkchxnibbri4ain-module-import-compiled/gnu/system$ md5sum file-systems.go
7839e9c7a0c7c6c8d9ea45566ab9f61e  file-systems.go
--8<---------------cut here---------------end--------------->8---

vs

--8<---------------cut here---------------start------------->8---
local@host:qemu-initrd/gnu/store/hvgj80xqf70mvx460pnvwmi87kqqn2bj-module-import-compiled/gnu/system$ md5sum file-systems.go
a43a7e939ae9d0cc1ce30726cb51d6d4  file-systems.go
--8<---------------cut here---------------end--------------->8---

Additionally, it looks like different symbols are exported depending on
whether cross-compilation or QEMU was used. This is at least what
`readelf -s file-systems.go` reports. I naively thought these files
should be identical.

I also saw these strange warnings in the build log during
cross-compilation:

--8<---------------cut here---------------start------------->8---
;;; WARNING: loading compiled file /gnu/store/5ffy1h3fgikzdhfz4nkchxnibbri4ain-module-import-compiled/gnu/build/file-systems.go failed:
;;; In procedure load-thunk-from-memory: ELF file does not have native word size
;;; WARNING: loading compiled file /gnu/store/5ffy1h3fgikzdhfz4nkchxnibbri4ain-module-import-compiled/gnu/system/uuid.go failed:
;;; In procedure load-thunk-from-memory: ELF file does not have native word size
;;; WARNING: loading compiled file /gnu/store/5ffy1h3fgikzdhfz4nkchxnibbri4ain-module-import-compiled/gnu/system/file-systems.go failed:
;;; In procedure load-thunk-from-memory: ELF file does not have native word size
--8<---------------cut here---------------end--------------->8---

This also looks suspicious. These warnings stem from the
`check_elf_header` function in Guile: Guile warns when the class in the
ELF header is 32-bit and the file is loaded in a cross-compilation
context on an x64 system.
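
For readers who want to check this by hand: the warning depends only on the class byte in the ELF identification header. The following is a minimal sketch, not Guile's own `check_elf_header` code, that reads that byte from a `.go` file and compares it with the word size of the running Guile. The names `%elf-magic`, `elf-class-bits` and `native-class?` are made up for this illustration.

```scheme
(use-modules (ice-9 binary-ports)     ; get-bytevector-n
             (rnrs bytevectors)       ; bytevector->u8-list
             (system base target))    ; target-word-size

;; ELF magic bytes: 0x7f 'E' 'L' 'F'.
(define %elf-magic '(#x7f #x45 #x4c #x46))

;; Hypothetical helper: return 32 or 64 according to the EI_CLASS byte
;; (offset 4) of FILE, or #f if FILE is not a recognizable ELF file.
(define (elf-class-bits file)
  (let ((ident (call-with-input-file file
                 (lambda (port) (get-bytevector-n port 5))
                 #:binary #t)))
    (and (bytevector? ident)
         (= (bytevector-length ident) 5)
         (let ((bytes (bytevector->u8-list ident)))
           (and (equal? (list-head bytes 4) %elf-magic)
                (case (list-ref bytes 4)
                  ((1) 32)              ; ELFCLASS32
                  ((2) 64)              ; ELFCLASS64
                  (else #f)))))))

;; Hypothetical helper: #t if FILE has the word size of the running Guile.
;; Outside of 'with-target', (target-word-size) is the native size in bytes.
(define (native-class? file)
  (eqv? (elf-class-bits file) (* 8 (target-word-size))))
```

On a 64-bit host, `(native-class? "file-systems.go")` should return `#f` for any module compiled for arm32, whichever way it was built, since such files are ELF32. That suggests the warning by itself says little about whether the file is correct for the target.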

But so far I could not figure out whether I can ignore these warnings
or whether they might cause a problem.

--
Best regards

Christoph
* Re: ABI mismatch on boot on arm32 system
From: Zack Weinberg @ 2024-10-20 15:39 UTC
To: help-guix

On Sun, Oct 20, 2024, at 11:23 AM, Christoph Buck wrote:
> I think something goes wrong during cross-compilation of the Guile
> modules in the package `module-import-compiled`. The ABI error is thrown
> early on boot in the `initrd.cpio.gz` ramdisk. I extracted and
> decompressed the ramdisk from both builds (cross-compilation and QEMU);
> both contain the `module-import-compiled` Guile modules. I would expect
> the *.go files from the `module-import-compiled` package of both
> ramdisks to be binary identical, but they have different md5sums. Let's
> take for example `file-systems.go`, which causes the ABI error.
[...]

Can you show us the *complete and unedited* output of `readelf -hlSd
file-systems.go` from both the working and the broken ramdisk, please?

zw
* Re: ABI mismatch on boot on arm32 system
From: Christoph Buck @ 2024-10-20 17:24 UTC
To: Zack Weinberg; +Cc: help-guix

Hi Zack!

> Can you show us the *complete and unedited* output of `readelf -hlSd
> file-systems.go` from both the working and the broken ramdisk, please?

Sure. See the attachments of this mail.

But I just saw that I made a mistake and compared the module
`qemu/gnu/build/file-systems.go` to
`cross/gnu/system/file-systems.go`. The md5sums of
`qemu/gnu/system/file-systems.go` and `cross/gnu/system/file-systems.go`
still differ, but the exported symbols are the same (see attachment).
The only differences I can now see are in the `Start of section headers`
(426376 vs 426352) and in the addresses in the subsequent output of
`readelf`. Are these expected to be deterministic/equal?

Sorry for the confusion.

> zw

Greetings
Christoph

--
Best regards

Christoph

[-- Attachment #2: qemu.readelf --]
[-- Type: application/octet-stream, Size: 3734 bytes --]

ELF Header:
  Magic:   7f 45 4c 46 01 01 01 ff 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            <unknown: ff>
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           None
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          52 (bytes into file)
  Start of section headers:          426352 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         3
  Size of section headers:           40 (bytes)
  Number of section headers:         20
  Section header string table index: 17

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .guile.procprops  PROGBITS        00000000 06c790 000020 00      0   0  8
  [ 2] .rodata           PROGBITS        0003fe38 03fe38 0074ac 00   A  0   0  8
  [ 3] .data             PROGBITS        00050000 050000 018170 00  WA  0   0  8
  [ 4] .rtl-text         PROGBITS        00000098 000098 03fda0 00   A  0   0  8
  [ 5] .dynamic          DYNAMIC         000472e8 0472e8 000030 00   A  0   0  8
  [ 6] .strtab           STRTAB          00000000 070b98 000666 00      0   0  8
  [ 7] .symtab           SYMTAB          00000000 06f118 001a80 10      6   0  8
  [ 8] .guile.ariti[...] STRTAB          00000000 071200 00085a 00      0   0  8
  [ 9] .guile.arities    PROGBITS        00000000 068490 004253 00      8   0  8
  [10] .guile.docst[...] STRTAB          00000000 071a60 0008ac 00      0   0  8
  [11] .guile.docstrs    PROGBITS        00000000 06c6e8 0000a8 00     10   0  8
  [12] .debug_info       PROGBITS        00000000 06c7b0 000b6f 00      0   0  8
  [13] .debug_abbrev     PROGBITS        00000000 06d320 000041 00      0   0  8
  [14] .debug_str        PROGBITS        00000000 06d368 000672 00      0   0  8
  [15] .debug_loc        PROGBITS        00000000 06d9e0 000000 00      0   0  8
  [16] .debug_line       PROGBITS        00000000 06d9e0 001733 00      0   0  8
  [17] .shstrtab         STRTAB          00000000 072310 0000d3 00      0   0  8
  [18]                   PROGBITS        00000000 000000 000094 00   A  0   0  8
  [19]                   NULL            00000000 068170 000000 00      0   0  0
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  p (processor specific)

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00000000 0x00000000 0x47318 0x47318 R   0x10000
  LOAD           0x050000 0x00050000 0x00050000 0x18170 0x18170 RW  0x10000
  DYNAMIC        0x0472e8 0x000472e8 0x000472e8 0x00030 0x00030 R   0x8

 Section to Segment mapping:
  Segment Sections...
   00     .rodata .rtl-text .dynamic
   01     .data
   02     .dynamic

Dynamic section at offset 0x472e8 contains 6 entries:
  Tag        Type                         Name/Value
 0x37146003 (<unknown>: 37146003)        0x3000006
 0x37146002 (<unknown>: 37146002)        0x98
 0x37146000 (<unknown>: 37146000)        0x50000
 0x37146001 (<unknown>: 37146001)        0x18170
 0x0000000c (INIT)                       0x1b918
 0x00000000 (NULL)                       0x0

[-- Attachment #3: cross.readelf --]
[-- Type: application/octet-stream, Size: 3735 bytes --]

ELF Header:
  Magic:   7f 45 4c 46 01 01 01 ff 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            <unknown: ff>
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           None
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          52 (bytes into file)
  Start of section headers:          426376 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         3
  Size of section headers:           40 (bytes)
  Number of section headers:         20
  Section header string table index: 17

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .guile.procprops  PROGBITS        00000000 06c7a8 000020 00      0   0  8
  [ 2] .rodata           PROGBITS        0003fe60 03fe60 008b64 00   A  0   0  8
  [ 3] .data             PROGBITS        00050000 050000 018188 00  WA  0   0  8
  [ 4] .rtl-text         PROGBITS        00000098 000098 03fdc4 00   A  0   0  8
  [ 5] .dynamic          DYNAMIC         000489c8 0489c8 000030 00   A  0   0  8
  [ 6] .strtab           STRTAB          00000000 070ba8 000666 00      0   0  8
  [ 7] .symtab           SYMTAB          00000000 06f128 001a80 10      6   0  8
  [ 8] .guile.ariti[...] STRTAB          00000000 071210 000b62 00      0   0  8
  [ 9] .guile.arities    PROGBITS        00000000 0684a8 004255 00      8   0  8
  [10] .guile.docst[...] STRTAB          00000000 071d78 0008ac 00      0   0  8
  [11] .guile.docstrs    PROGBITS        00000000 06c700 0000a8 00     10   0  8
  [12] .debug_info       PROGBITS        00000000 06c7c8 000b67 00      0   0  8
  [13] .debug_abbrev     PROGBITS        00000000 06d330 000041 00      0   0  8
  [14] .debug_str        PROGBITS        00000000 06d378 000672 00      0   0  8
  [15] .debug_loc        PROGBITS        00000000 06d9f0 000000 00      0   0  8
  [16] .debug_line       PROGBITS        00000000 06d9f0 001733 00      0   0  8
  [17] .shstrtab         STRTAB          00000000 072628 0000d3 00      0   0  8
  [18]                   PROGBITS        00000000 000000 000094 00   A  0   0  8
  [19]                   NULL            00000000 068188 000000 00      0   0  0
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  p (processor specific)

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00000000 0x00000000 0x489f8 0x489f8 R   0x10000
  LOAD           0x050000 0x00050000 0x00050000 0x18188 0x18188 RW  0x10000
  DYNAMIC        0x0489c8 0x000489c8 0x000489c8 0x00030 0x00030 R   0x8

 Section to Segment mapping:
  Segment Sections...
   00     .rodata .rtl-text .dynamic
   01     .data
   02     .dynamic

Dynamic section at offset 0x489c8 contains 6 entries:
  Tag        Type                         Name/Value
 0x37146003 (<unknown>: 37146003)        0x3000006
 0x37146002 (<unknown>: 37146002)        0x98
 0x37146000 (<unknown>: 37146000)        0x50000
 0x37146001 (<unknown>: 37146001)        0x18188
 0x0000000c (INIT)                       0x1b918
 0x00000000 (NULL)                       0x0
* Re: ABI mismatch on boot on arm32 system
From: Christoph Buck @ 2024-10-21 9:55 UTC
To: Zack Weinberg; +Cc: help-guix

Hi!

I can now reproduce this error locally. Consider the following file:

--8<---------------cut here---------------start------------->8---
(define-module (abi-error)
  #:use-module (gnu system file-systems)
  #:export (bla
            test))

(define bla (file-system
              (device (file-system-label "my-root"))
              (mount-point "/")
              (type "ext4")))
--8<---------------cut here---------------end--------------->8---

If it is cross-compiled on x64 for arm32 using the following (this is
what `compiled-modules` in gexp.scm does, at least as far as I can tell)

--8<---------------cut here---------------start------------->8---
(use-modules (system base compile))
(use-modules (system base target))

(with-target "arm-linux-gnueabihf"
  (lambda ()
    (compile-file "abi-error.scm" #:output-file "abi-error.go")))
--8<---------------cut here---------------end--------------->8---

then loading the module in an emulated arm32 Guile REPL fails with an
ABI error:

--8<---------------cut here---------------start------------->8---
icepic@G16-Buck:~/guix$ guix shell --container --system=armhf-linux guix guile file bash which coreutils
icepic@G16-Buck ~/guix [env]$ guix repl -L .
GNU Guile 3.0.9
Copyright (C) 1995-2023 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guix-user)> ,use (abi-error)
While executing meta-command:
Throw to key `record-abi-mismatch-error' with args `(abi-check "~a: record ABI mismatch; recompilation needed" (#<record-type <file-system>>) ())'
--8<---------------cut here---------------end--------------->8---

But if it is compiled directly in QEMU on arm32, it works without the
ABI error:

--8<---------------cut here---------------start------------->8---
icepic@G16-Buck:~/guix$ guix shell --container --system=armhf-linux guix guile file bash which coreutils
icepic@G16-Buck ~/guix [env]$ guix repl compile.scm
icepic@G16-Buck ~/guix [env]$ guix repl -L .
GNU Guile 3.0.9
Copyright (C) 1995-2023 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guix-user)> ,use (abi-error)
scheme@(guix-user)> bla
$1 = #<<file-system> device: #<file-system-label "my-root"> mount-point: "/" type: "ext4" flags: () options: #f mount?: #t mount-may-fail?: #f needed-for-boot?: #f check?: #t skip-check-if-clean?: #t repair: preen create-mount-point?: #f dependencies: () shepherd-requirements: () location: ((filename . "abi-error.scm") (line . 4) (column . 12))>
--8<---------------cut here---------------end--------------->8---

where `compile.scm` is simply

--8<---------------cut here---------------start------------->8---
(use-modules (system base compile))
(use-modules (system base target))

(compile-file "abi-error.scm" #:output-file "abi-error.go")
--8<---------------cut here---------------end--------------->8---

This is not what one would expect, is it? Interestingly, it works if
`aarch64-linux-gnu` is used instead of `arm-linux-gnueabihf`.

So... it looks like there is a bug in the cross-compilation support for
arm32 in Guile?

--
Best regards

Christoph
* Re: ABI mismatch on boot on arm32 system
From: Christoph Buck @ 2024-10-29 17:11 UTC
To: Zack Weinberg; +Cc: help-guix

Hi!

In case anybody is reading along here: I dug deeper and found
something rather interesting :P

From my understanding of records.scm in Guix (and please note that I am
a total Scheme newbie), the ABI check works by calculating a string
hash over the record field names and storing that hash as a hidden field
in the record. At run time this string hash is computed again and
compared to the compiled-in hash. If they don't match, the ABI is
considered broken because a field was added or removed.

The hash is calculated in the `compute-abi-cookie` procedure in
records.scm.

I extended the procedure with the following debug output:

--8<---------------cut here---------------start------------->8---
(define (compute-abi-cookie field-specs)
  ;; Compute an "ABI cookie" for the given FIELD-SPECS.  We use
  ;; 'string-hash' because that's a better hash function that 'hash' on a
  ;; list of symbols.
  (let ((hash
         (syntax-case field-specs ()
           (((field get properties ...) ...)
            (string-hash (object->string
                          (syntax->datum #'((field properties ...) ...)))
                         ;; (bla)
                         (cond-expand
                           (guile-3 (target-most-positive-fixnum))
                           (else most-positive-fixnum))
                         ))))
        (fd (syntax-case field-specs ()
              (((field get properties ...) ...)
               (object->string
                (syntax->datum #'((field properties ...) ...)))))))

    (format #t "Compute-abi-cookie: ~a~%" hash)
    (format #t "field-specs: ~a~%" field-specs)
    (format #t "fd: ~a~%" fd)
    (format #t "hashsize ~a~%: " (cond-expand
                                   (guile-3 (target-most-positive-fixnum))
                                   (else most-positive-fixnum)))
    hash))
--8<---------------cut here---------------end--------------->8---

Now, if I compile a simple test record

--8<---------------cut here---------------start------------->8---
(define-record-type* <test-system> test-system
  make-test-system
  test-system?
  (device test-system-device)
  (mount-point test-system-mount-point))

(define test-abi (test-system
                   (device "my-root")
                   (mount-point "/")))
--8<---------------cut here---------------end--------------->8---

on x64 using Guile cross-compilation (in a `guix shell --container guix
guile` environment) with the call

--8<---------------cut here---------------start------------->8---
(with-target "arm-linux-gnueabihf" (lambda () (compile-file "test-abi.scm")))
--8<---------------cut here---------------end--------------->8---

the following output is generated:

--8<---------------cut here---------------start------------->8---
Compute-abi-cookie: 212719825
field-specs: ((#<syntax device> #<syntax test-system-device>) (#<syntax mount-point> #<syntax test-system-mount-point>))
fd: ((device) (mount-point))
hashsize 536870911
--8<---------------cut here---------------end--------------->8---

The ABI cookie is computed by calculating the string hash over
"((device) (mount-point))" while limiting the size of the hash to
536870911. One can check this manually by calling

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (string-hash "((device) (mount-point))" 536870911)
$1 = 212719825
--8<---------------cut here---------------end--------------->8---

in the REPL.
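
As a side note on where the bound 536870911 comes from: it equals 2^29 - 1, which is what Guile's `target-most-positive-fixnum` yields for a target with 4-byte words. The sketch below recomputes it. `fixnum-bound` is a hypothetical helper written for this note, and its formula assumes that two tag bits and one sign bit are taken out of the machine word.

```scheme
(use-modules (system base target))    ; with-target, target-most-positive-fixnum

;; Hypothetical helper: largest fixnum for a given word size in bytes,
;; assuming two tag bits plus one sign bit are taken out of the word.
(define (fixnum-bound word-size)
  (- (expt 2 (- (* 8 word-size) 3)) 1))

(fixnum-bound 4)   ; => 536870911 (2^29 - 1, the "hashsize" printed above)
(fixnum-bound 8)   ; => 2305843009213693951 (2^61 - 1)

;; The same value as seen by the compiler when cross-compiling for arm32.
(with-target "arm-linux-gnueabihf" target-most-positive-fixnum)   ; => 536870911
```

So on an x64 host the bound is only reduced to the arm32 value inside `with-target`; a native x64 compile would use 2^61 - 1 instead.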

Now, if I do the same in a QEMU arm32 environment (using `guix shell
--container guix guile --system=armhf-linux`), a different hash is
printed, even though the hash is calculated over the same string:

--8<---------------cut here---------------start------------->8---
Compute-abi-cookie: 2434018
field-specs: ((#<syntax device> #<syntax test-system-device>) (#<syntax mount-point> #<syntax test-system-mount-point>))
fd: ((device) (mount-point))
hashsize 536870911
--8<---------------cut here---------------end--------------->8---

You can verify this in the REPL as well:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (string-hash "((device) (mount-point))" 536870911)
$1 = 2434018
--8<---------------cut here---------------end--------------->8---

My first intuition after seeing the source of `compute-abi-cookie` was
that maybe `target-most-positive-fixnum` returns a wrong value when
called in a cross-compilation context. But as you can see, this is not
the case. Instead, `string-hash` calculates a different hash even though
the input values are the same.

Now, I am not even sure whether one can expect a hash function running
on different architectures to return the same hash for the same input.
If not, then the implementation in Guix's records.scm would be buggy.
If one does expect `string-hash` to return the same hash for the same
input regardless of the architecture, then this would hint at a bug in
the `string-hash` function in Guile for arm32.

Any input and thoughts regarding this issue would be appreciated.

Greetings
Christoph

--
Best regards

Christoph
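
One way to sidestep the question, sketched here purely as an illustration and not as what Guix's records.scm does: compute the cookie with a hash that is defined on bytes with fixed-width arithmetic, so that the host word size cannot influence the result. The example uses 32-bit FNV-1a, a swapped-in technique that this thread does not mention; `fnv-1a-32` and `portable-abi-cookie` are hypothetical names.

```scheme
(use-modules (rnrs bytevectors))      ; string->utf8, bytevector->u8-list

;; 32-bit FNV-1a over the UTF-8 bytes of STR.  All arithmetic is masked
;; to 32 bits, so the result does not depend on the host's word size.
(define (fnv-1a-32 str)
  (let loop ((bytes (bytevector->u8-list (string->utf8 str)))
             (hash #x811c9dc5))                       ; FNV offset basis
    (if (null? bytes)
        hash
        (loop (cdr bytes)
              (logand (* (logxor hash (car bytes))
                         #x01000193)                  ; FNV prime
                      #xffffffff)))))

;; Hypothetical stand-in for the (string-hash str bound) call:
;; reduce the 32-bit hash into the range [0, BOUND).
(define (portable-abi-cookie str bound)
  (modulo (fnv-1a-32 str) bound))

(portable-abi-cookie "((device) (mount-point))" 536870911)
```

Because every step is ordinary integer arithmetic on the string's bytes, this should give the same number on x64 and on arm32. Whether such a change would be acceptable in Guix, or whether `string-hash` itself is meant to be portable, is a separate question.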
* Re: ABI mismatch on boot on arm32 system
From: Efraim Flashner @ 2024-10-30 7:09 UTC
To: Christoph Buck; +Cc: Zack Weinberg, help-guix

On Tue, Oct 29, 2024 at 06:11:27PM +0100, Christoph Buck wrote:
> Hi!
>
> In case anybody is reading along here: I dug deeper and found
> something rather interesting :P

Thank you! From where I'm sitting it's much easier (for me) to suggest
things than to try and set up your environment.

> From my understanding of records.scm in Guix (and please note that I am
> a total Scheme newbie), the ABI check works by calculating a string
> hash over the record field names and storing that hash as a hidden field
> in the record. At run time this string hash is computed again and
> compared to the compiled-in hash. If they don't match, the ABI is
> considered broken because a field was added or removed.
>
> [...]
>
> My first intuition after seeing the source of `compute-abi-cookie` was
> that maybe `target-most-positive-fixnum` returns a wrong value when
> called in a cross-compilation context. But as you can see, this is not
> the case. Instead, `string-hash` calculates a different hash even though
> the input values are the same.
>
> Now, I am not even sure whether one can expect a hash function running
> on different architectures to return the same hash for the same input.
> If not, then the implementation in Guix's records.scm would be buggy.
> If one does expect `string-hash` to return the same hash for the same
> input regardless of the architecture, then this would hint at a bug in
> the `string-hash` function in Guile for arm32.
>
> Any input and thoughts regarding this issue would be appreciated.
>

Can you run it again, but with i686 -> armhf, and x86_64 -> i686? My
curiosity includes i686 -> x86_64, but I suspect it won't tell us
anything we won't learn from the previous tests.

--
Efraim Flashner   <efraim@flashner.co.il>   אפרים פלשנר
GPG key = A28B F40C 3E55 1372  662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted
* Re: ABI mismatch on boot on arm32 system
  2024-10-30  7:09 ` Efraim Flashner
@ 2024-10-30 13:24 ` Christoph Buck
  2024-11-06 10:25 ` Christoph Buck
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Buck @ 2024-10-30 13:24 UTC
To: Zack Weinberg; +Cc: help-guix

Efraim Flashner <efraim@flashner.co.il> writes:

> Can you run it again, but with i686 -> armhf, and x86_64 -> i686?
>

Hi Efraim!

Sure. No problem. Here we go:

cross compiled x86_64/i686 = 212719825 hash vs qemu i686 = 2434018 hash
cross compiled i686/arm*   = 2434018 hash vs qemu arm = 2434018 hash

* This combination is run in QEMU as well and then cross-compiled using
  `with-target`, because I don't have a native i686 machine.

> My curiosity includes i686 -> x86_64, but I suspect it won't tell us
> anything we won't learn from the previous tests.

Unfortunately this combination crashes:

cross compiled i686/x86_64* =

--8<---------------cut here---------------start------------->8---
icepic@G16-Buck ~/guix/raspberry/touchscreen/abi-error/test/test-abi [env]$ ./compile.sh
Backtrace:
In ice-9/boot-9.scm:
  2595:24 19 (call-with-deferred-observers _)
  3424:24 18 (_)
   222:17 17 (map1 (((test-systems))))
  3327:17 16 (resolve-interface (test-systems) #:select _ #:hide _ # ?)
In ice-9/threads.scm:
    390:8 15 (_ _)
In ice-9/boot-9.scm:
  3253:13 14 (_)
In ice-9/threads.scm:
    390:8 13 (_ _)
In ice-9/boot-9.scm:
  3544:20 12 (_)
   2836:4 11 (save-module-excursion _)
  3564:26 10 (_)
In unknown file:
           9 (primitive-load-path "test-systems" #<procedure ad7a0 a?>)
In ice-9/eval.scm:
   721:20  8 (primitive-eval (define-record-type* <test-system> # # ?))
In ice-9/psyntax.scm:
  1229:36  7 (expand-top-sequence (#<syntax:test-systems.scm:10:0 ?>) ?)
  1121:20  6 (parse _ (("placeholder" placeholder)) ((top) #(# # ?)) ?)
  1342:32  5 (syntax-type (#<syntax define-record-type*> #<synta?> ?) ?)
  1562:32  4 (expand-macro #<procedure bcc50 at ice-9/eval.scm:333:?> ?)
In ice-9/eval.scm:
   293:34  3 (_ #(#(#(#(#(#(#(#(#<directory ?> ?) ?) ?) ?) ?) ?) ?) ?))
   298:34  2 (_ #(#(#<directory (abi-records) 178c80>) ((#<s?> ?) ?)))
In unknown file:
           1 (string-hash "((device) (mount-point))" # #<undefined> #)
In ice-9/boot-9.scm:
  1685:16  0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Value out of range 1 to< 4294967295: 2305843009213693951
icepic@G16-Buck ~/guix/raspberry/touchscreen/abi-error/test/test-abi [env]$
--8<---------------cut here---------------end--------------->8---

Without looking into this further, I would hypothesize that during
cross-compilation to x86_64 on i686, `target-most-positive-fixnum`
returns a number wider than 32 bits, which `string-hash` doesn't handle
correctly when executed on i686.

To recap: to me it looks as if `string-hash` is not implemented in a
platform-independent way, but uses a platform-specific data type size
somewhere in its implementation. As long as the word size (64-bit or
32-bit) is the same during cross-compilation and execution, it works
(cross-compilation on x64 and execution on arm64 is fine, as is
cross-compilation on i686 and execution on arm32, see above). As soon
as the word size differs, the resulting hashes are different or the
execution crashes.

I can debug into Guile's `string-hash` function to find out where the
difference comes from, but I first need to figure out how to set up gdb
correctly. To me it sounds like a sensible bug fix for this issue (and
I think it is a bug/issue ;) ) would be to use a platform-independent
hashing algorithm implemented in plain Guile, without relying on native
C functions. If hashes calculated during compilation are stored in the
compiled module, one must be able to assume that the hashes are always
the same regardless of the architecture. Cross-compilation will not
work reliably as soon as this assumption is broken, because compilation
and execution might happen on different platforms.

--
Best regards

Christoph
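
The out-of-range error in the backtrace can be checked with plain integer arithmetic. The following is a minimal C++ sketch, not Guile code: the helper names `most_positive_fixnum` and `fits_in_word` are hypothetical, and the assumption that three bits of each machine word are unavailable to a fixnum's magnitude is inferred from the two bounds that appear in this thread (536870911 and 2305843009213693951), not taken from the Guile sources.

```cpp
#include <cassert>
#include <cstdint>

// Largest fixnum for a given word size, ASSUMING three bits of the word are
// not available for the magnitude. This assumption reproduces both bounds
// quoted in the thread; it is not copied from Guile.
constexpr std::uint64_t most_positive_fixnum(unsigned word_bits) {
    return (std::uint64_t{1} << (word_bits - 3)) - 1;
}

// True if `size` can be represented by an unsigned integer that is
// `word_bits` wide (a stand-in for `unsigned long` on that platform).
constexpr bool fits_in_word(std::uint64_t size, unsigned word_bits) {
    return word_bits >= 64 || size <= ((std::uint64_t{1} << word_bits) - 1);
}

// The "hashsize" printed when targeting a 32-bit platform.
static_assert(most_positive_fixnum(32) == 536870911ULL,
              "32-bit bound matches the thread");

// The value rejected in the i686 -> x86_64 backtrace.
static_assert(most_positive_fixnum(64) == 2305843009213693951ULL,
              "64-bit bound matches the thread");

// A 64-bit target bound cannot be passed as a 32-bit size argument ...
static_assert(!fits_in_word(most_positive_fixnum(64), 32),
              "64-bit bound overflows a 32-bit word");

// ... while the opposite direction is representable.
static_assert(fits_in_word(most_positive_fixnum(32), 64),
              "32-bit bound fits in a 64-bit word");
```

Under that assumption, the crash is what one would expect when the size argument for a 64-bit target is handed to a hash routine whose size parameter is limited to 4294967295 on the 32-bit host.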
* Re: ABI mismatch on boot on arm32 system
  2024-10-30 13:24 ` Christoph Buck
@ 2024-11-06 10:25 ` Christoph Buck
  2024-11-11  7:47 ` Christoph Buck
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Buck @ 2024-11-06 10:25 UTC
To: Zack Weinberg, Efraim Flashner; +Cc: help-guix

Hi Guix!

So I looked into the Guile source code and, as expected, the `scm_hash`
function (see hash.c in Guile) uses `unsigned long`, which is 8 bytes on
x64 and 4 bytes on arm32/i686. If `string-hash` is called with the size
parameter `n`, the hash value is limited to that size by taking the
hash value modulo `n`, see scm_ihash in hash.c:440, namely

> (unsigned long) scm_raw_ihash (obj, 10) % n

(The `10` can be ignored as far as I can tell.) Since the hash values
differ between platforms, the result of the modulo differs as well.
However, if one steps through the call stack of `string-hash`, one can
see that the actual hash value is calculated by the
`JENKINS_LOOKUP3_HASHWORD2` macro, which contains a rather interesting
comment and a possible workaround for the ABI problem, namely

--8<---------------cut here---------------start------------->8---
    /* Scheme can access symbol-hash, which exposes this value.  For      \
       cross-compilation reasons, we ensure that the high 32 bits of      \
       the hash on a 64-bit system are equal to the hash on a 32-bit      \
       system.  The low 32 bits just add more entropy.  */                \
    if (sizeof (ret) == 8)                                                \
      ret = (((unsigned long) c) << 32) | b;                              \
    else                                                                  \
      ret = c;                                                            \
--8<---------------cut here---------------end--------------->8---

in hash.c:82. This means that, when executed on an x64 platform, the
upper 32 bits of the resulting 64-bit hash are equal to the hash value
on a 32-bit platform.

A simple test case in C++ looks like this:

--8<---------------cut here---------------start------------->8---
// Needs the Guile development headers and must be linked against libguile.
#include <libguile.h>

#include <climits>
#include <iostream>

int main(int args, char** argv)
{
    scm_init_guile();

    auto strToHash = scm_from_locale_string ("((device) (mount-point))");
    auto maxULong = scm_from_ulong(ULONG_MAX);
    auto hashResult = scm_hash(strToHash,maxULong);
    auto hashResultUL = scm_to_ulong(hashResult);
    std::cout << "Max ULONG_MAX: " << ULONG_MAX <<std::endl;
    std::cout << "Original hashResult ulong: " << hashResultUL << std::endl;
    if(sizeof(hashResultUL) == 8)
    {
        // On 64-bit platforms the upper 32 bits carry the 32-bit hash.
        std::cout << "Corrected for 32bit: " << (hashResultUL >> 32) << std::endl;
    }
}
--8<---------------cut here---------------end--------------->8---

which results on x64 in

> Max ULONG_MAX: 18446744073709551615
> Original hashResult ulong: 10454028974864831
> Corrected for 32bit: 2434018

and on arm32 in

> Max ULONG_MAX: 4294967295
> Original hashResult ulong: 2434018

This suggests the following workaround: always limit the hash size to
32 bits, even when executing on a 64-bit platform (or, to be more
specific, a platform where `unsigned long` is 8 bytes wide). Do this by
right-shifting the hash value by 32 bits instead of relying on the size
parameter of the `string-hash` function. In code it could look
something like this

--8<---------------cut here---------------start------------->8---
(define (compute-abi-cookie field-specs)
  ;; Compute an "ABI cookie" for the given FIELD-SPECS.  We use
  ;; 'string-hash' because that's a better hash function than 'hash' on a
  ;; list of symbols.
  (let ((hash
         (syntax-case field-specs ()
           (((field get properties ...) ...)
            (let ((hash-value
                   (string-hash
                    (object->string
                     (syntax->datum #'((field properties ...) ...))))))
              (if (= (native-word-size) 8)
                  (ash hash-value -32)
                  hash-value)))))
        (fd (syntax-case field-specs ()
              (((field get properties ...) ...)
               (object->string
                (syntax->datum #'((field properties ...) ...)))))))

    (format #t "Compute-abi-cookie: ~a~%" hash)
    hash))
--8<---------------cut here---------------end--------------->8---

where `native-word-size` is defined by

--8<---------------cut here---------------start------------->8---
(define (native-word-size)
  ((@ (system foreign) sizeof) '*))
--8<---------------cut here---------------end--------------->8---

(taken from `cross-compilation.test`). There might be a cleaner way to
formulate this, but you get the point.

This seems to work for all combinations on my machine. I tested
x64 -> arm, x64 -> i686, i686 -> x64...

I can only think of two drawbacks:

1) Lost entropy on 64-bit machines.
2) An ABI break, because after recompilation the hash values on 64-bit
   platforms will change.

1) is IMHO irrelevant, because the hash is not cryptographically
important. For 2) I am not sure how important this is.

Any thoughts on this? Might this be something worth fixing and sending
a patch for?

--
Best regards

Christoph
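
The numbers reported in this thread can be cross-checked without Guile. The sketch below is plain C++ arithmetic on the values quoted in the messages; the helper names `high32` and `reduce` and the constant names are hypothetical and exist only for illustration. It checks that the upper 32 bits of the x64 hash equal the arm32 hash, and that reducing the x64 hash modulo the 32-bit fixnum bound yields the cookie seen when cross-compiling from x64.

```cpp
#include <cassert>
#include <cstdint>

// Upper 32 bits of a 64-bit hash. Per the hash.c comment quoted in the
// thread, these are meant to equal the hash computed on a 32-bit system.
constexpr std::uint64_t high32(std::uint64_t hash64) {
    return hash64 >> 32;
}

// Limit a raw hash to [0, size), mirroring the `% n` step in scm_ihash.
constexpr std::uint64_t reduce(std::uint64_t raw, std::uint64_t size) {
    return raw % size;
}

// Values copied from the thread for the string "((device) (mount-point))".
constexpr std::uint64_t kHashX64   = 10454028974864831ULL; // scm_hash on x64
constexpr std::uint64_t kHashArm32 = 2434018ULL;           // scm_hash on arm32
constexpr std::uint64_t kFixnum32  = 536870911ULL;         // printed "hashsize"

// The shift recovers the 32-bit hash from the 64-bit one.
static_assert(high32(kHashX64) == kHashArm32,
              "upper 32 bits equal the 32-bit hash");

// Modulo on the full 64-bit value gives the cross-compiled cookie ...
static_assert(reduce(kHashX64, kFixnum32) == 212719825ULL,
              "matches the cookie computed on x64 for an arm32 target");

// ... while the same modulo on the 32-bit value gives the native cookie.
static_assert(reduce(kHashArm32, kFixnum32) == 2434018ULL,
              "matches the cookie computed natively on arm32");
```

These checks are consistent with the explanation in the message: the size parameter is the same on both sides, but the modulo is applied to raw hashes of different widths, whereas shifting first makes both sides operate on the same 32-bit value.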
* Re: ABI mismatch on boot on arm32 system
  2024-11-06 10:25 ` Christoph Buck
@ 2024-11-11  7:47 ` Christoph Buck
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Buck @ 2024-11-11 7:47 UTC
To: Zack Weinberg; +Cc: Efraim Flashner, help-guix

Hi!

I submitted a patch which fixes the issue. See

> https://issues.guix.gnu.org/74296

Feedback is appreciated!

Christoph

--
Best regards

Christoph
end of thread, other threads:[~2024-11-11 8:07 UTC | newest]

Thread overview: 13+ messages
2024-10-16 10:11 ABI mismatch on boot on arm32 system Christoph Buck
2024-10-16 20:05 ` Richard Sent
2024-10-20 15:15 ` Christoph Buck
2024-10-18 20:58 ` Denis 'GNUtoo' Carikli
2024-10-20 15:23 ` Christoph Buck
2024-10-20 15:39 ` Zack Weinberg
2024-10-20 17:24 ` Christoph Buck
2024-10-21  9:55 ` Christoph Buck
2024-10-29 17:11 ` Christoph Buck
2024-10-30  7:09 ` Efraim Flashner
2024-10-30 13:24 ` Christoph Buck
2024-11-06 10:25 ` Christoph Buck
2024-11-11  7:47 ` Christoph Buck