Gábor Boskovits wrote (on Tue, Jul 17, 2018, 17:18):

> Chris Marusich wrote (on Tue, Jul 17, 2018, 8:36):
>
>> Hi,
>>
>> I sometimes want to debug Guix-installed software using GDB.
>> Unfortunately, I've only been successful with trivial programs like GNU
>> Hello. All of my attempts to debug actual problems have failed because
>> I can't seem to get GDB to behave. It's a bit frustrating to bang my
>> head on stuff like this by myself, so I'm hoping somebody with more
>> experience can offer some advice.
>>
>> Let's start with what might be a real bug. I've noticed that Guix's
>> virsh command (from the libvirt package) emits a suspicious error when
>> you try to list devices:
>>
>> --8<---------------cut here---------------start------------->8---
>> $ virsh nodedev-list
>> error: Failed to count node devices
>> error: this function is not supported by the connection driver: virNodeNumOfDevices
>> --8<---------------cut here---------------end--------------->8---
>>
>> Apparently, because this function is "not supported", it is also not
>> possible to use virt-manager to assign PCI devices to a libvirt domain.
>> That's what I was trying to do when I stumbled across this issue.
>>
>> Anyway, this virsh problem occurs even when I invoke the command as
>> root, so it probably isn't a permissions issue. I searched the Internet
>> for errors like this, but I didn't find anything helpful. Every guide
>> I've read so far seems to suggest that this invocation should just work.
>> But it doesn't. Why?
>>
>> If you want to try reproducing this issue on GuixSD, make sure you have
>> a libvirt-service-type service and a virtlog-service-type service in
>> your operating system configuration declaration:
>>
>> --8<---------------cut here---------------start------------->8---
>> (service libvirt-service-type
>>          (libvirt-configuration
>>           (unix-sock-group "libvirt")))
>> (service virtlog-service-type)
>> --8<---------------cut here---------------end--------------->8---
>>
>> For good measure, make sure your user is in the "libvirt" group, too:
>>
>> --8<---------------cut here---------------start------------->8---
>> (user-account
>>  (name "marusich")
>>  (comment "Chris Marusich")
>>  (group "users")
>>  (supplementary-groups '("wheel"
>>                          "netdev"
>>                          "video"
>>                          "libvirt"))
>>  (home-directory "/home/marusich"))
>> --8<---------------cut here---------------end--------------->8---
>>
>> Reconfigure and restart if necessary. Then run virsh:
>>
>> --8<---------------cut here---------------start------------->8---
>> $ virsh nodedev-list
>> error: Failed to count node devices
>> error: this function is not supported by the connection driver: virNodeNumOfDevices
>> --8<---------------cut here---------------end--------------->8---
>>
>> At this point, there are two possibilities: either everything is fine,
>> and this error is expected, or something is wrong. If somebody knows
>> that this is expected, I'd love to hear about it. However, let's
>> operate on the assumption that something is wrong. How might we debug
>> it?
>>
>> One way to debug it is to use GDB to investigate precisely why this
>> failure occurred. There are probably other ways to debug the issue, but
>> I want to focus on using GDB because this email is more about the
>> problems I've had with GDB than the virsh issue.
>>
>> To begin, I create a directory where I'll do my debugging:
>>
>> --8<---------------cut here---------------start------------->8---
>> $ mkdir ~/debug
>> $ cd ~/debug
>> --8<---------------cut here---------------end--------------->8---
>>
>> Let's get the virsh source so we can get GDB to tell us where we are in
>> the code as we debug it:
>>
>> --8<---------------cut here---------------start------------->8---
>> $ tar -xf $(guix build -S libvirt)
>> --8<---------------cut here---------------end--------------->8---
>>
>> For me, this unpacks the source to:
>>
>>     /home/marusich/debug/libvirt-4.3.0
>>
>> Note that the function virNodeNumOfDevices is defined in
>>
>>     /home/marusich/debug/libvirt-4.3.0/src/libvirt-nodedev.c
>>
>> and called on line 254 of
>>
>>     /home/marusich/debug/libvirt-4.3.0/tools/virsh-nodedev.c
>>
>> in the virshNodeDeviceListCollect function.
>>
>> I'd like to debug the code for virNodeNumOfDevices using GDB to see
>> what's going on. To do this, I'm going to need the debug symbols, but
>> the libvirt package doesn't have a debug output. Let's define a version
>> of it that does.
>> I put the following package definition into the file
>> /home/marusich/debug/my-libvirt.scm:
>>
>> --8<---------------cut here---------------start------------->8---
>> (define-module (my-libvirt)
>>   #:use-module (guix packages)
>>   #:use-module (gnu packages virtualization))
>>
>> (define-public my-libvirt
>>   (package
>>     (inherit libvirt)
>>     (name "my-libvirt")
>>     (outputs '("out" "debug"))))
>> --8<---------------cut here---------------end--------------->8---
>>
>> Let's build it and install both outputs into a new profile:
>>
>> --8<---------------cut here---------------start------------->8---
>> $ GUIX_PACKAGE_PATH=/home/marusich/debug guix package -p /home/marusich/debug/profile -i my-libvirt my-libvirt:debug
>> --8<---------------cut here---------------end--------------->8---
>>
>> Let's make sure the new virsh still reports the same error:
>>
>> --8<---------------cut here---------------start------------->8---
>> $ /home/marusich/debug/profile/bin/virsh nodedev-list
>> error: Failed to count node devices
>> error: this function is not supported by the connection driver: virNodeNumOfDevices
>> --8<---------------cut here---------------end--------------->8---
>>
>> Great! Let's debug it with GDB. First, make sure your ~/.gdbinit
>> doesn't exist, otherwise your results might be different from mine.
>> Then let's start GDB:
>>
>> --8<---------------cut here---------------start------------->8---
>> $ gdb
>> --8<---------------cut here---------------end--------------->8---
>>
>> Tell it where the debug files live:
>>
>> --8<---------------cut here---------------start------------->8---
>> (gdb) set debug-file-directory /home/marusich/debug/profile/lib/debug
>> --8<---------------cut here---------------end--------------->8---
>>
>> Tell it where the source lives:
>>
>> --8<---------------cut here---------------start------------->8---
>> (gdb) directory /home/marusich/debug/libvirt-4.3.0/src
>> Source directories searched: /home/marusich/debug/libvirt-4.3.0/src:$cdir:$cwd
>> (gdb) directory /home/marusich/debug/libvirt-4.3.0/tools
>> Source directories searched: /home/marusich/debug/libvirt-4.3.0/tools:/home/marusich/debug/libvirt-4.3.0/src:$cdir:$cwd
>> --8<---------------cut here---------------end--------------->8---
>>
>> Tell it to use the file and read the symbols:
>>
>> --8<---------------cut here---------------start------------->8---
>> (gdb) file /home/marusich/debug/profile/bin/virsh
>> Reading symbols from /home/marusich/debug/profile/bin/virsh...Reading symbols from /home/marusich/debug/profile/lib/debug//gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/bin/virsh.debug...done.
>> done.
>> --8<---------------cut here---------------end--------------->8---
>>
>> Set the program's arguments:
>>
>> --8<---------------cut here---------------start------------->8---
>> (gdb) set args nodedev-list
>> --8<---------------cut here---------------end--------------->8---
>>
>> Set a breakpoint on the function virNodeNumOfDevices:
>>
>> --8<---------------cut here---------------start------------->8---
>> (gdb) break virNodeNumOfDevices
>> Breakpoint 1 at 0x28610
>> --8<---------------cut here---------------end--------------->8---
>>
>> Uh oh.
>> This is our first sign of a problem: The breakpoint is
>> associated with some sort of memory address, rather than a location in a
>> file. Anyway, let's run the program:
>>
>> --8<---------------cut here---------------start------------->8---
>> (gdb) run
>> Starting program: /gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/bin/virsh nodedev-list
>> warning: the debug information found in "/home/marusich/debug/profile/lib/debug//gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/lib/libvirt.so.0.4003.0.debug" does not match "/gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/lib/libvirt.so.0" (CRC mismatch).
>>
>> warning: the debug information found in "/home/marusich/debug/profile/lib/debug//gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/lib/libvirt.so.0.4003.0.debug" does not match "/gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/lib/libvirt.so.0" (CRC mismatch).
>>
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/gnu/store/l4lr0f5cjd0nbsaaf8b5dmcw1a1yypr3-glibc-2.27/lib/libthread_db.so.1".
>> [New Thread 0x7ffff2219700 (LWP 16097)]
>>
>> Thread 1 "virsh" hit Breakpoint 1, 0x00007ffff768cdc0 in virNodeNumOfDevices ()
>>    from /gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/lib/libvirt.so.0
>> --8<---------------cut here---------------end--------------->8---
>>
>> We hit the breakpoint - great! However, it seems GDB did not load the
>> debug information for libvirt because of a CRC mismatch.
>> Indeed, the backtrace seems to suggest that GDB knows about some of the
>> source files, but not all of them:
>>
>> --8<---------------cut here---------------start------------->8---
>> (gdb) bt
>> #0  0x00007ffff768cdc0 in virNodeNumOfDevices ()
>>    from /gnu/store/mx3rmbpg6lhl0yxl9djbx49nfps9lwqi-my-libvirt-4.3.0/lib/libvirt.so.0
>> #1  0x00005555555a816e in virshNodeDeviceListCollect (flags=0, ncapnames=, capnames=0x0, ctl=0x7fffffffb460)
>>     at virsh-nodedev.c:254
>> #2  cmdNodeListDevices (ctl=0x7fffffffb460, cmd=) at virsh-nodedev.c:472
>> #3  0x00005555555b8911 in vshCommandRun (ctl=0x7fffffffb460, cmd=0x55555583d850) at vsh.c:1318
>> #4  0x000055555557ea65 in main (argc=2, argv=0x7fffffffb7f8) at virsh.c:932
>> --8<---------------cut here---------------end--------------->8---
>>
>> I wanted to see what was happening in the virNodeNumOfDevices function,
>> which came from libvirt.so.0. Unfortunately, that's the library with
>> the CRC mismatch. This means I'm totally blocked from investigating any
>> further using GDB. I could set step-mode to "on" to step through the
>> machine code without debug symbols, but as they say: "That is an
>> exercise left to the reader."
>>
>> I have seen this CRC mismatch problem twice now when trying to debug
>> issues with Guix-installed software. The other time was while
>> attempting to debug a segfault in vinagre:
>>
>> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=30591
>>
>> What is wrong? Am I using GDB wrong? Is there a bug in the part of the
>> gnu-build-system that creates the debug files which might be causing the
>> CRC mismatch? I'm aware of the fact that the gnu-build-system takes
>> advantage of the .gnu-debuglink stuff ((gdb) Separate Debug Files), but
>> to be honest I haven't done a lot of GDB debugging, so part of me
>> wonders if this is just a case of "user error". If so, please help me
>> understand what I'm doing wrong.
>>
>>
> Actually this is about the same thing I've found out.
> I'm also suffering CRC mismatches. The workaround I used was defining a
> package where I didn't strip the debugging symbols in the first place. I
> don't know what this is about either, but it is annoying.
>

Hello Chris,

I was thinking about this yesterday. Is it possible that this is related
to grafting? I

>
>
>> Thank you,
>>
>> --
>> Chris
>>
>
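The workaround mentioned above (a package variant whose binaries are never stripped) could look roughly like the sketch below. It is an illustration under assumptions, not the definition that was actually used: the module and package name my-libvirt-unstripped are hypothetical, and it assumes a Guix recent enough that substitute-keyword-arguments accepts a default value as the third element of a clause. Since nothing is split out into a separate "debug" output, GDB would read the symbols straight from the installed binaries and the debuglink CRC check would not be involved.

```scheme
;; Sketch only.  Hypothetical file: /home/marusich/debug/my-libvirt-unstripped.scm
(define-module (my-libvirt-unstripped)
  #:use-module (guix packages)
  #:use-module (guix utils)              ;for substitute-keyword-arguments
  #:use-module (gnu packages virtualization))

(define-public my-libvirt-unstripped
  (package
    (inherit libvirt)
    (name "my-libvirt-unstripped")       ;hypothetical name
    (arguments
     (substitute-keyword-arguments (package-arguments libvirt)
       ;; STRIP? is bound to the previous value, or to #t (assumed default
       ;; form) when libvirt's arguments do not mention #:strip-binaries?.
       ;; Either way the new value is #f, so the build keeps the symbols.
       ((#:strip-binaries? strip? #t)
        #f)))))
```

It would be installed the same way as my-libvirt in the quoted message, with GUIX_PACKAGE_PATH pointing at the directory that holds the file, but without the :debug output.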