From: "Ludovic Courtès" <ludo@gnu.org>
To: 58320@debbugs.gnu.org
Cc: bug-hurd@gnu.org
Subject: bug#58320: Hurd VM fails to boot on AMD EPYC (kvm-amd)
Date: Mon, 10 Oct 2022 23:14:15 +0200
Message-ID: <87pmezxty0.fsf@gnu.org>
In-Reply-To: <8735bx6kt8.fsf@gnu.org> ("Ludovic Courtès"'s message of "Sun, 09 Oct 2022 18:09:07 +0200")
Ludovic Courtès <ludo@gnu.org> writes:
> Through a dichotomy I tried to see how far it goes. The info I have so
> far is that ld.so errors out from elf/rtld.c:563 (line 565 is not
> reached):
>
> 558: if (bootstrap_map.l_addr || ! bootstrap_map.l_info[VALIDX(DT_GNU_PRELINKED)])
> 559: {
> 560: /* Relocate ourselves so we can do normal function calls and
> 561: data access using the global offset table. */
> 562:
> 563: ELF_DYNAMIC_RELOCATE (&bootstrap_map, 0, 0, 0);
> 564: }
> 565: bootstrap_map.l_relocated = 1;
> ...
> 578: __rtld_malloc_init_stubs ();
Via brute force¹, I found that ‘__assert_fail’ is hit, with its first
argument in $eax being:
--8<---------------cut here---------------start------------->8---
db> x/c 0x28604,80
ELF32_R_TYPE (reloc->r_info) == R_386_RELATIVE\000\000map->l_in
fo[VERSYMIDX (DT_VERSYM)] != NULL\000\000Fatal glibc error: Too
many audit mo
--8<---------------cut here---------------end--------------->8---
This comes from i386/dl-machine.h:
--8<---------------cut here---------------start------------->8---
auto inline void
__attribute ((always_inline))
elf_machine_rel_relative (Elf32_Addr l_addr, const Elf32_Rel *reloc,
                          void *const reloc_addr_arg)
{
  Elf32_Addr *const reloc_addr = reloc_addr_arg;
  assert (ELF32_R_TYPE (reloc->r_info) == R_386_RELATIVE);
  *reloc_addr += l_addr;
}
}
--8<---------------cut here---------------end--------------->8---
How can we get there? Looking at ‘_dl_start’, it could be that
‘elf_machine_load_address’ returns a bogus value and we end up reading
wrong ELF data? Or it could be memory corruption somewhere. Or…?
Thing is, it’s not fully deterministic (happens 9 times out of 10 with
KVM, never happens without KVM).
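To make the "bogus load address" hypothesis concrete, here is a minimal sketch, not glibc code. It models an image as a flat byte buffer and treats 'l_addr' as an offset into it; the helpers 'build_demo_image' and 'scan_table_at' are hypothetical names made up for illustration, and the ELF definitions are local stand-ins for those in <elf.h>. It only shows that reading the relocation table from a slightly wrong base makes the R_386_RELATIVE type check fail on the first entry; it says nothing about what actually goes wrong under KVM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins for the <elf.h> definitions so the sketch builds anywhere. */
typedef uint32_t Elf32_Addr;
typedef uint32_t Elf32_Word;
typedef struct { Elf32_Addr r_offset; Elf32_Word r_info; } Elf32_Rel;
#define ELF32_R_TYPE(info) ((unsigned char) (info))
#define R_386_RELATIVE 8

/* Hypothetical helper: fill a 64-byte image with a two-entry table of
   relative relocations placed at offset 16.  */
static void
build_demo_image (unsigned char image[64])
{
  const Elf32_Rel table[2] = {
    { 0x1004, R_386_RELATIVE },
    { 0x100c, R_386_RELATIVE },
  };
  memset (image, 0, 64);
  memcpy (image + 16, table, sizeof table);
}

/* Hypothetical helper: read COUNT entries at IMAGE + L_ADDR + TABLE_VADDR.
   Return -1 if every entry has type R_386_RELATIVE, the index of the
   first entry that does not, or -2 if the table lies outside the image.  */
static int
scan_table_at (const unsigned char *image, size_t image_size,
               size_t l_addr, size_t table_vaddr, size_t count)
{
  assert (image != NULL);
  size_t start = l_addr + table_vaddr;
  if (start > image_size
      || count > (image_size - start) / sizeof (Elf32_Rel))
    return -2;
  for (size_t i = 0; i < count; i++)
    {
      Elf32_Rel rel;
      /* memcpy avoids unaligned access when the base is wrong.  */
      memcpy (&rel, image + start + i * sizeof rel, sizeof rel);
      if (ELF32_R_TYPE (rel.r_info) != R_386_RELATIVE)
        return (int) i;   /* this is where the assertion would fire */
    }
  return -1;
}
```

With the right base (l_addr = 0, table at 16) the scan finds only relative relocations. With a base that is off by 4, each entry's 'r_info' is really the next entry's 'r_offset', so the type check fails immediately, which is the kind of failure the assertion in 'elf_machine_rel_relative' would report.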
Ideas? :-)
Ludo’.
¹ Building with ‘-fno-optimize-sibling-calls’ didn’t help get nicer
backtraces, but that’s prolly because all that early relocation code
is inlined.