Hi,

I sometimes want to debug Guix-installed software using GDB. Unfortunately, I've only been successful with trivial programs like GNU Hello. All of my attempts to debug actual problems have failed because I can't seem to get GDB to behave. It's a bit frustrating to bang my head on stuff like this by myself, so I'm hoping somebody with more experience can offer some advice.

Let's start with what might be a real bug. I've noticed that Guix's virsh command (from the libvirt package) emits a suspicious error when you try to list devices:

--8<---------------cut here---------------start------------->8---
$ virsh nodedev-list
error: Failed to count node devices
error: this function is not supported by the connection driver: virNodeNumOfDevices
--8<---------------cut here---------------end--------------->8---

Apparently, because this function is "not supported", it is also not possible to use virt-manager to assign PCI devices to a libvirt domain. That's what I was trying to do when I stumbled across this issue.

Anyway, this virsh problem occurs even when I invoke the command as root, so it probably isn't a permissions issue. I searched the Internet for errors like this, but I didn't find anything helpful. Every guide I've read so far seems to suggest that this invocation should just work. But it doesn't. Why?
If you want to try reproducing this issue on GuixSD, make sure you have a libvirt-service-type service and a virtlog-service-type service in your operating system configuration declaration:

--8<---------------cut here---------------start------------->8---
(service libvirt-service-type
         (libvirt-configuration
          (unix-sock-group "libvirt")))
(service virtlog-service-type)
--8<---------------cut here---------------end--------------->8---

For good measure, make sure your user is in the "libvirt" group, too:

--8<---------------cut here---------------start------------->8---
(user-account
 (name "marusich")
 (comment "Chris Marusich")
 (group "users")
 (supplementary-groups '("wheel" "netdev"
                         "video" "libvirt"))
 (home-directory "/home/marusich"))
--8<---------------cut here---------------end--------------->8---

Reconfigure and restart if necessary. Then run virsh:

--8<---------------cut here---------------start------------->8---
$ virsh nodedev-list
error: Failed to count node devices
error: this function is not supported by the connection driver: virNodeNumOfDevices
--8<---------------cut here---------------end--------------->8---

At this point, there are two possibilities: either everything is fine, and this error is expected, or something is wrong. If somebody knows that this is expected, I'd love to hear about it. However, let's operate on the assumption that something is wrong. How might we debug it?

One way to debug it is to use GDB to investigate precisely why this failure occurred. There are probably other ways to debug the issue, but I want to focus on using GDB because this email is more about the problems I've had with GDB than the virsh issue.
To begin, I create a directory where I'll do my debugging:

--8<---------------cut here---------------start------------->8---
$ mkdir ~/debug
$ cd ~/debug
--8<---------------cut here---------------end--------------->8---

Let's get the virsh source so we can get GDB to tell us where we are in the code as we debug it:

--8<---------------cut here---------------start------------->8---
$ tar -xf $(guix build -S libvirt)
--8<---------------cut here---------------end--------------->8---

For me, this unpacks the source to:

  /home/marusich/debug/libvirt-4.3.0

Note that the function virNodeNumOfDevices is defined in

  /home/marusich/debug/libvirt-4.3.0/src/libvirt-nodedev.c

and called on line 254 of

  /home/marusich/debug/libvirt-4.3.0/tools/virsh-nodedev.c

in the virshNodeDeviceListCollect function.

I'd like to debug the code for virNodeNumOfDevices using GDB to see what's going on. To do this, I'm going to need the debug symbols, but the libvirt package doesn't have a debug output. Let's define a version of it that does.
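For anyone retracing the source locations mentioned above, here is a minimal sketch of how the definition and the call site can be found in the unpacked tree. It assumes a grep that supports -r, -n and -w (GNU grep does). The helper name find_symbol is made up for illustration.

```shell
#!/bin/sh
# find_symbol is a hypothetical helper, not part of libvirt or Guix.
# It prints every whole-word occurrence of SYMBOL under the given
# directories as "file:line:text".
find_symbol() {
    symbol=$1
    shift
    grep -rnw -- "$symbol" "$@"
}

# Example (paths as in this message):
# find_symbol virNodeNumOfDevices \
#     /home/marusich/debug/libvirt-4.3.0/src \
#     /home/marusich/debug/libvirt-4.3.0/tools
```

The output lists headers and other callers as well, so expect more than the two locations named above.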
I put the following package definition into the file /home/marusich/debug/my-libvirt.scm:

--8<---------------cut here---------------start------------->8---
(define-module (my-libvirt)
  #:use-module (guix packages)
  #:use-module (gnu packages virtualization))

(define-public my-libvirt
  (package
    (inherit libvirt)
    (name "my-libvirt")
    (outputs '("out" "debug"))))
--8<---------------cut here---------------end--------------->8---

Let's build it and install both outputs into a new profile:

--8<---------------cut here---------------start------------->8---
$ GUIX_PACKAGE_PATH=/home/marusich/debug guix package -p /home/marusich/debug/profile -i my-libvirt my-libvirt:debug
--8<---------------cut here---------------end--------------->8---

Let's make sure the new virsh still reports the same error:

--8<---------------cut here---------------start------------->8---
$ /home/marusich/debug/profile/bin/virsh nodedev-list
error: Failed to count node devices
error: this function is not supported by the connection driver: virNodeNumOfDevices
--8<---------------cut here---------------end--------------->8---

Great! Let's debug it with GDB. First, make sure your ~/.gdbinit doesn't exist, otherwise your results might be different from mine.
Then let's start GDB:

--8<---------------cut here---------------start------------->8---
$ gdb
--8<---------------cut here---------------end--------------->8---

Tell it where the debug files live:

--8<---------------cut here---------------start------------->8---
(gdb) set debug-file-directory /home/marusich/debug/profile/lib/debug
--8<---------------cut here---------------end--------------->8---

Tell it where the source lives:

--8<---------------cut here---------------start------------->8---
(gdb) directory /home/marusich/debug/libvirt-4.3.0/src
Source directories searched: /home/marusich/debug/libvirt-4.3.0/src:$cdir:$cwd
(gdb) directory /home/marusich/debug/libvirt-4.3.0/tools
Source directories searched: /home/marusich/debug/libvirt-4.3.0/tools:/home/marusich/debug/libvirt-4.3.0/src:$cdir:$cwd
--8<---------------cut here---------------end--------------->8---

Tell it to use the file and read the symbols:

--8<---------------cut here---------------start------------->8---
(gdb) file /home/marusich/debug/profile/bin/virsh
Reading symbols from /home/marusich/debug/profile/bin/virsh...Reading symbols from /home/marusich/debug/profile/lib/debug//gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/bin/virsh.debug...done.
done.
--8<---------------cut here---------------end--------------->8---

Set the program's arguments:

--8<---------------cut here---------------start------------->8---
(gdb) set args nodedev-list
--8<---------------cut here---------------end--------------->8---

Set a breakpoint on the function virNodeNumOfDevices:

--8<---------------cut here---------------start------------->8---
(gdb) break virNodeNumOfDevices
Breakpoint 1 at 0x28610
--8<---------------cut here---------------end--------------->8---

Uh oh. This is our first sign of a problem: The breakpoint is associated with some sort of memory address, rather than a location in a file.
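To make the setup above easier to repeat, here is a rough sketch that writes the same GDB commands to a script file. The helper name gdb_commands and the file name virsh.gdb are made up for illustration. The only addition to the session above is "info breakpoints", which I assume prints the same address-only breakpoint. Running GDB with -nx should also skip any init files, which sidesteps the ~/.gdbinit concern.

```shell
#!/bin/sh
# gdb_commands is a hypothetical helper.  It prints the GDB commands
# used in this message, one per line, for a given profile directory
# and unpacked source directory.
gdb_commands() {
    profile=$1   # e.g. /home/marusich/debug/profile
    src=$2       # e.g. /home/marusich/debug/libvirt-4.3.0
    cat <<EOF
set debug-file-directory $profile/lib/debug
directory $src/src
directory $src/tools
file $profile/bin/virsh
set args nodedev-list
break virNodeNumOfDevices
info breakpoints
EOF
}

# Example (paths as in this message):
# gdb_commands /home/marusich/debug/profile \
#     /home/marusich/debug/libvirt-4.3.0 > virsh.gdb
# gdb -nx -batch -x virsh.gdb
```

Appending "run" and "bt" to the script would cover the next steps as well, assuming batch mode behaves like the interactive session.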
Anyway, let's run the program:

--8<---------------cut here---------------start------------->8---
(gdb) run
Starting program: /gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/bin/virsh nodedev-list
warning: the debug information found in "/home/marusich/debug/profile/lib/debug//gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/lib/libvirt.so.0.4003.0.debug" does not match "/gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/lib/libvirt.so.0" (CRC mismatch).

warning: the debug information found in "/home/marusich/debug/profile/lib/debug//gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/lib/libvirt.so.0.4003.0.debug" does not match "/gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/lib/libvirt.so.0" (CRC mismatch).

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/gnu/store/l4lr0f5cjd0nbsaaf8b5dmcw1a1yypr3-glibc-2.27/lib/libthread_db.so.1".
[New Thread 0x7ffff2219700 (LWP 16097)]

Thread 1 "virsh" hit Breakpoint 1, 0x00007ffff768cdc0 in virNodeNumOfDevices () from /gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/lib/libvirt.so.0
--8<---------------cut here---------------end--------------->8---

We hit the breakpoint - great! However, it seems GDB did not load the debug information for libvirt because of a CRC mismatch.
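One way the mismatch could be checked outside of GDB is sketched below. It rests on two assumptions I have not verified against this build: that the checksum stored in the debug link is the same CRC-32 that gzip records in its trailer, and that the installed readelf supports --debug-dump=links. The helper name file_crc32 is made up for illustration.

```shell
#!/bin/sh
# file_crc32 is a hypothetical helper.  It prints the CRC-32 of FILE as
# eight hex digits, read from the trailer that gzip appends to its
# output (four CRC bytes, least significant first, then the length).
file_crc32() {
    gzip -c < "$1" | tail -c 8 | head -c 4 | od -An -tx1 |
        awk '{ print $4 $3 $2 $1 }'
}

# Example (paths as in this message; needs readelf from binutils):
# readelf --debug-dump=links \
#     /gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/lib/libvirt.so.0
# file_crc32 /home/marusich/debug/profile/lib/debug/gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/lib/libvirt.so.0.4003.0.debug
```

If the CRC value that readelf reports for the library differs from the one computed for the .debug file, that would at least confirm the warning independently of GDB.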
Indeed, the backtrace seems to suggest that GDB knows about some of the source files, but not all of them:

--8<---------------cut here---------------start------------->8---
(gdb) bt
#0  0x00007ffff768cdc0 in virNodeNumOfDevices () from /gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/lib/libvirt.so.0
#1  0x00005555555a816e in virshNodeDeviceListCollect (flags=0, ncapnames=, capnames=0x0, ctl=0x7fffffffb460) at virsh-nodedev.c:254
#2  cmdNodeListDevices (ctl=0x7fffffffb460, cmd=) at virsh-nodedev.c:472
#3  0x00005555555b8911 in vshCommandRun (ctl=0x7fffffffb460, cmd=0x55555583d850) at vsh.c:1318
#4  0x000055555557ea65 in main (argc=2, argv=0x7fffffffb7f8) at virsh.c:932
--8<---------------cut here---------------end--------------->8---

I wanted to see what was happening in the virNodeNumOfDevices function, which came from libvirt.so.0. Unfortunately, that's the library with the CRC mismatch. This means I'm totally blocked from investigating any further using GDB. I could set step-mode to "on" to step through the machine code without debug symbols, but as they say: "That is an exercise left to the reader."

I have seen this CRC mismatch problem twice now when trying to debug issues with Guix-installed software. The other time was while attempting to debug a segfault in vinagre:

https://debbugs.gnu.org/cgi/bugreport.cgi?bug=30591

What is wrong? Am I using GDB wrong? Is there a bug in the part of the gnu-build-system that creates the debug files which might be causing the CRC mismatch?

I'm aware of the fact that the gnu-build-system takes advantage of the .gnu_debuglink stuff ((gdb) Separate Debug Files), but to be honest I haven't done a lot of GDB debugging, so part of me wonders if this is just a case of "user error". If so, please help me understand what I'm doing wrong.

Thank you,

--
Chris