unofficial mirror of bug-guile@gnu.org 
From: Pierre Langlois <pierre.langlois@gmx.com>
To: 39266@debbugs.gnu.org
Subject: bug#39266: Finalization thread hits wrong-type-arg on weak vector (AArch64)
Date: Mon, 09 Mar 2020 22:19:49 +0000
Message-ID: <87wo7tdvcq.fsf@gmx.com>
In-Reply-To: <874kux385m.fsf@gnu.org>

Hi Ludo,

Ludovic Courtès writes:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> While building the “guix-system.drv” derivation on AArch64, I got this
>> crash (not fully deterministic but quite frequent).  Here the
>> finalization thread gets a wrong-type-arg in ‘scm_i_weak_car’ (i.e.,
>> accessing a one-element weak vector):
>
> With 3.0.1, I can reproduce the bug on x86_64.  With rr (thanks, Andy!),
> I found this (starting from the point where the type cell of the weak
> vector is zeroed, and reverse-continuing until it gets its original
> value of 0x10f):
>
> --8<---------------cut here---------------start------------->8---
> (rr) frame 40
> #40 0x00007ffff7f2e66d in scm_i_weak_car (pair=0x7fffe15af690) at ../libguile/pairs.h:190
> 190	  return SCM_CAR (x);
> (rr) down
> #39 0x00007ffff7f2f576 in scm_c_weak_vector_ref (wv=<optimized out>, k=k@entry=0) at weak-vector.c:193
> 193	  SCM_VALIDATE_WEAK_VECTOR (1, wv);
> (rr) 
> #38 0x00007ffff7ea7ba0 in scm_wrong_type_arg_msg (
>     subr=subr@entry=0x7ffff7f56f00 <s_scm_weak_vector_ref> "weak-vector-ref", pos=pos@entry=1, 
>     bad_value=0x7fffec472b90, szMessage=szMessage@entry=0x7ffff7f56e80 "weak vector") at error.c:282
> 282	      scm_error (scm_arg_type_key,
> (rr) p *((void**)0x7fffec472b90)
> $1 = (void *) 0x0
> (rr) watch *((void**)0x7fffec472b90)
> Hardware watchpoint 1: *((void**)0x7fffec472b90)
> (rr) reverse-cont
> Continuing.
>
> Thread 1 received signal SIGCONT, Continued.
> [Switching to Thread 27074.27074]
> __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:101
> 101	../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
> (rr) 
> Continuing.
>
> Thread 1 hit Hardware watchpoint 1: *((void**)0x7fffec472b90)
>
> Old value = (void *) 0x0
> New value = (void *) 0x10f
> __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:259
> 259	../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
> (rr) bt
> #0  __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:259
> #1  0x00007ffff7f1d499 in set_vtable_access_fields (vtable=vtable@entry=0x7fffeb48ee80) at struct.c:143
> #2  0x00007ffff7f1dd8d in scm_i_struct_inherit_vtable_magic (vtable=vtable@entry=0x7ffff4e32fa0, 
>     obj=obj@entry=0x7fffeb48ee80) at struct.c:215
> #3  0x00007ffff7f1dfea in scm_c_make_structv (vtable=0x7ffff4e32fa0, n_tail=<optimized out>, n_init=8, 
>     init=0x7fffffff50d0) at struct.c:364
> #4  0x00007ffff7f1e0b9 in scm_make_struct_no_tail (vtable=0x7ffff4e32fa0, init=0x304) at struct.c:491
> --8<---------------cut here---------------end--------------->8---
>
> Bingo!  There’s a mismatch in struct.c:
>
> --8<---------------cut here---------------start------------->8---
>   bitmask_size = (nfields + 31U) / 32U;
>   unboxed_fields = scm_gc_malloc_pointerless (bitmask_size, "unboxed fields");
>   memset (unboxed_fields, 0, bitmask_size * sizeof(*unboxed_fields));
> --8<---------------cut here---------------end--------------->8---

Oh wow, scary! That was some nice debugging; these types of bugs can be
really hard to get to the bottom of.
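
To spell out the mismatch in the quoted snippet: 'bitmask_size' counts
32-bit words, yet it is passed to the allocator as a byte count, while
the memset multiplies it by sizeof(*unboxed_fields).  Assuming
'unboxed_fields' points to uint32_t, the memset clears four times as
many bytes as were allocated, which would explain how a neighbouring
object's type cell ended up zeroed.  Below is a minimal standalone
sketch of that arithmetic.  The helper names are hypothetical, plain
malloc stands in for scm_gc_malloc_pointerless, and the last function
is only a guess at a consistent variant, not necessarily what the
pushed commit does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Number of 32-bit words needed to hold one bit per field
   (same rounding as the quoted struct.c line).  */
size_t
bitmask_words (size_t nfields)
{
  return (nfields + 31U) / 32U;
}

/* Bytes requested by the quoted allocation: the word count is
   handed to the allocator as if it were a byte count.  */
size_t
bytes_allocated_as_quoted (size_t nfields)
{
  return bitmask_words (nfields);
}

/* Bytes cleared by the quoted memset, assuming the bitmask is an
   array of uint32_t.  */
size_t
bytes_cleared_as_quoted (size_t nfields)
{
  return bitmask_words (nfields) * sizeof (uint32_t);
}

/* Hypothetical consistent variant: allocate and clear the same
   number of bytes.  Returns NULL if malloc fails.  */
uint32_t *
make_unboxed_bitmask (size_t nfields)
{
  size_t nbytes = bitmask_words (nfields) * sizeof (uint32_t);
  uint32_t *mask = malloc (nbytes);

  if (mask != NULL)
    memset (mask, 0, nbytes);
  return mask;
}
```

For a struct with 8 fields, the helpers above report one byte
allocated against sizeof (uint32_t) bytes cleared, so the extra bytes
land in whatever the allocator placed next.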

>
> Pushed a fix as 7c17655cd3d859bf0c5a86d9782a7788205fc05a.
>
> Thanks, rr!  You made my day!  :-)
>
> Now testing Guix builds on x86_64, i686, ARMv7, and AArch64 to see if
> that addresses seemingly related issues.

I've tested it on AArch64 and it's looking good: I'm finally running
Guile 3! I tested by running 'guix pull --branch=wip-guile-3.0.1' on a
rockpro64 running the Guix system, then reconfigured and rebooted, and
it's all good.

Thanks so much for the fix! Hopefully it'll work on every platform and
that can be the end of it :-).

Pierre