bug#58320: Hurd VM fails to boot on AMD EPYC (kvm-amd)

unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / code / Atom feed

From: "Ludovic Courtès" <ludo@gnu.org>
To: 58320@debbugs.gnu.org
Cc: bug-hurd@gnu.org
Subject: bug#58320: Hurd VM fails to boot on AMD EPYC (kvm-amd)
Date: Sun, 09 Oct 2022 18:09:07 +0200	[thread overview]
Message-ID: <8735bx6kt8.fsf@gnu.org> (raw)
In-Reply-To: <87zge671p0.fsf@gnu.org> ("Ludovic Courtès"'s message of "Sat, 08 Oct 2022 17:52:11 +0200")

Hi!

Ludovic Courtès <ludo@gnu.org> skribis:

> $ addr2line -e  /gnu/store/m8afvcgwmrfhvjpd7b0xllk8vv5isd6j-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1 0x1000 0x11627 0x11bb
> ??:0
> /tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/glibc-2.33/elf/dl-misc.c:333
> :?
>
>
> That’s ‘_dl_fatal_printf’ calling ‘_exit’; it’s trying to tell us
> something.
>
> I’ll try and rebuild the system with the debugging patches at
> <https://lists.gnu.org/archive/html/bug-hurd/2011-11/msg00038.html>, to
> get early ld.so output, for lack of a better solution…

I tried adapted the patches above and tried them, but it seems that
‘_dl_sysdep_start’ isn’t even reached.  For example, I set a breakpoint
on ‘mach_task_self’ (called from ‘__mach_init’, called from
‘_dl_sysdep_start’), but that’s never reached (I’m assuming ‘break/tu’
is reliable, is it?).

The user-space backtrace upon trap remains unhelpful:

--8<---------------cut here---------------start------------->8---
start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1]Kernel Breakpoint trap,                                                        
 eip 0xc1030d5b                                                                                                                         
Breakpoint at  task_resume:     pushl   %ebp                                                                                            
db> debug traps /on                                                                                                                     
db> b task_terminate                                                                                                                    
set breakpoint #2                                                                                                                       
db> c                                                                                                                                   
Kernel Debug trap trap, eip 0xc1030d5b                                                                                                  
 execkernel: Page fault (14), code=6                                                                                                    
Stopped at  0x1000:     pushl   0x4(%ebx)                                                                                               
>>>>> user space <<<<<                                                                                                                  
0x1000(bfffff24,0,0,1160b,0)                                                                                                            
0x11627(bfffff9c,0,0,0,2)                                                                                                               
0x11bb()                                                                                                                                
--8<---------------cut here---------------end--------------->8---

… where:

--8<---------------cut here---------------start------------->8---
$ addr2line -e /gnu/store/4p1kab1c4h7h3kvgcm1hbjja4y5k9x4p-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1 0x11627 0x11bb
/tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/glibc-2.33/elf/dl-misc.c:333
:?
$ objdump -S /gnu/store/4p1kab1c4h7h3kvgcm1hbjja4y5k9x4p-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1 --start-address=0x000011b0 |head -40

/gnu/store/4p1kab1c4h7h3kvgcm1hbjja4y5k9x4p-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1:     file format elf32-i386


Disassembly of section .text:

000011b0 <_start>:
    11b0:       89 e0                   mov    %esp,%eax
    11b2:       83 ec 0c                sub    $0xc,%esp
    11b5:       50                      push   %eax
    11b6:       e8 b5 0a 00 00          call   1c70 <_dl_start>
    11bb:       83 c4 10                add    $0x10,%esp
--8<---------------cut here---------------end--------------->8---

So it would seem that ‘_dl_start’ is called and somehow then a tail-call
to ‘_dl_fatal_printf’ is made.

Through a dichotomy I tried to see how far it goes.  The info I have so
far is that ld.so errors out from elf/rtld.c:563 (line 565 is not
reached):

--8<---------------cut here---------------start------------->8---
558:  if (bootstrap_map.l_addr || ! bootstrap_map.l_info[VALIDX(DT_GNU_PRELINKED)])
559:    {
560:      /* Relocate ourselves so we can do normal function calls and
561:         data access using the global offset table.  */
562:
563:      ELF_DYNAMIC_RELOCATE (&bootstrap_map, 0, 0, 0);
564:    }
565:  bootstrap_map.l_relocated = 1;
...
578:  __rtld_malloc_init_stubs ();
--8<---------------cut here---------------end--------------->8---

It’s hard to be more precise because ELF_DYNAMIC_RELOCATE is a macro
that expands to quite a lot of code.

I don’t see the code path that would lead to a ‘_dl_fatal_printf’ call
though.

Ideas?  :-)

Ludo’.

next prev parent reply	other threads:[~2022-10-09 16:10 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-05 21:01 bug#58320: Hurd VM fails to boot on AMD EPYC (kvm-amd) Ludovic Courtès
2022-10-06 13:14 ` Ludovic Courtès
2022-10-06 13:53   ` Samuel Thibault
2022-10-06 22:10     ` Ludovic Courtès
2022-10-06 22:42       ` Samuel Thibault
2022-10-07  8:24         ` Ludovic Courtès
2022-10-07 21:16           ` Samuel Thibault
2022-10-08 15:52             ` Ludovic Courtès
2022-10-09 16:09               ` Ludovic Courtès [this message]
2022-10-09 19:09                 ` Samuel Thibault
2022-10-10 21:14                 ` Ludovic Courtès
2022-10-17 12:51           ` Ludovic Courtès
2022-10-23 13:58       ` Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8735bx6kt8.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=58320@debbugs.gnu.org \
    --cc=bug-hurd@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).