From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pierre Langlois
Newsgroups: gmane.lisp.guile.bugs
Subject: bug#39266: Finalization thread hits wrong-type-arg on weak vector (AArch64)
Date: Mon, 09 Mar 2020 22:19:49 +0000
Message-ID: <87wo7tdvcq.fsf@gmx.com>
References: <87tv4kdgyy.fsf@inria.fr> <874kux385m.fsf@gnu.org>
In-reply-to: <874kux385m.fsf@gnu.org>
To: 39266@debbugs.gnu.org

Hi Ludo,

Ludovic Courtès writes:

> Ludovic Courtès writes:
>
>> While building the “guix-system.drv” derivation on AArch64, I got this
>> crash (not fully deterministic but quite frequent). Here the
>> finalization thread gets a wrong-type-arg in ‘scm_i_weak_car’ (i.e.,
>> accessing a one-element weak vector):
>
> With 3.0.1, I can reproduce the bug on x86_64. With rr (thanks, Andy!),
> I found this (starting from the point where the type cell of the weak
> vector is zeroed, and reverse-continuing until it gets its original
> value of 0x10f):
>
> --8<---------------cut here---------------start------------->8---
> (rr) frame 40
> #40 0x00007ffff7f2e66d in scm_i_weak_car (pair=0x7fffe15af690) at ../libguile/pairs.h:190
> 190 return SCM_CAR (x);
> (rr) down
> #39 0x00007ffff7f2f576 in scm_c_weak_vector_ref (wv=, k=k@entry=0) at weak-vector.c:193
> 193 SCM_VALIDATE_WEAK_VECTOR (1, wv);
> (rr)
> #38 0x00007ffff7ea7ba0 in scm_wrong_type_arg_msg (
>     subr=subr@entry=0x7ffff7f56f00 "weak-vector-ref", pos=pos@entry=1,
>     bad_value=0x7fffec472b90, szMessage=szMessage@entry=0x7ffff7f56e80 "weak vector") at error.c:282
> 282 scm_error (scm_arg_type_key,
> (rr) p *((void**)0x7fffec472b90)
> $1 = (void *) 0x0
> (rr) watch *((void**)0x7fffec472b90)
> Hardware watchpoint 1: *((void**)0x7fffec472b90)
> (rr) reverse-cont
> Continuing.
>
> Thread 1 received signal SIGCONT, Continued.
> [Switching to Thread 27074.27074]
> __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:101
> 101 ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
> (rr)
> Continuing.
>
> Thread 1 hit Hardware watchpoint 1: *((void**)0x7fffec472b90)
>
> Old value = (void *) 0x0
> New value = (void *) 0x10f
> __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:259
> 259 ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
> (rr) bt
> #0 __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:259
> #1 0x00007ffff7f1d499 in set_vtable_access_fields (vtable=vtable@entry=0x7fffeb48ee80) at struct.c:143
> #2 0x00007ffff7f1dd8d in scm_i_struct_inherit_vtable_magic (vtable=vtable@entry=0x7ffff4e32fa0,
>     obj=obj@entry=0x7fffeb48ee80) at struct.c:215
> #3 0x00007ffff7f1dfea in scm_c_make_structv (vtable=0x7ffff4e32fa0, n_tail=, n_init=8,
>     init=0x7fffffff50d0) at struct.c:364
> #4 0x00007ffff7f1e0b9 in scm_make_struct_no_tail (vtable=0x7ffff4e32fa0, init=0x304) at struct.c:491
> --8<---------------cut here---------------end--------------->8---
>
> Bingo! There’s a mismatch in struct.c:
>
> --8<---------------cut here---------------start------------->8---
> bitmask_size = (nfields + 31U) / 32U;
> unboxed_fields = scm_gc_malloc_pointerless (bitmask_size, "unboxed fields");
> memset (unboxed_fields, 0, bitmask_size * sizeof(*unboxed_fields));
> --8<---------------cut here---------------end--------------->8---

Oh wow, scary! That was some nice debugging; these types of bugs can be
really hard to get to the bottom of.

>
> Pushed a fix as 7c17655cd3d859bf0c5a86d9782a7788205fc05a.
>
> Thanks, rr! You made my day! :-)
>
> Now testing Guix builds on x86_64, i686, ARMv7, and AArch64 to see if
> that addresses seemingly related issues.

I've tested it on AArch64 and it's looking good: I'm finally running
Guile 3!

I tested by running 'guix pull --branch=wip-guile-3.0.1' on a rockpro64
running the Guix system, then reconfigured and rebooted, and it's all
good.

Thanks so much for the fix! Hopefully it'll work on every platform and
that can be the end of it :-).

Pierre
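
PS: for anyone reading the thread later, here is a rough standalone sketch of the size mismatch in the struct.c snippet quoted above. It is only an illustration under some assumptions: plain malloc stands in for scm_gc_malloc_pointerless, the element type is assumed to be uint32_t, and the helper names (bitmask_words, buggy_alloc_bytes, cleared_bytes, alloc_bitmask) are made up for this example. It may or may not match what the commit actually changes; the point is just that the allocation gets a count of 32-bit words while memset gets a count of bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Number of 32-bit words needed for a bitmask with one bit per field. */
size_t bitmask_words(size_t nfields)
{
  return (nfields + 31U) / 32U;
}

/* Bytes requested by the quoted code: the word count is passed
   directly as if it were a byte count. */
size_t buggy_alloc_bytes(size_t nfields)
{
  return bitmask_words(nfields);
}

/* Bytes cleared by the quoted memset call: word count times the
   element size (assumed here to be sizeof(uint32_t)). */
size_t cleared_bytes(size_t nfields)
{
  return bitmask_words(nfields) * sizeof(uint32_t);
}

/* One way to keep the two consistent: compute the byte count once and
   use it for both the allocation and the memset.  Hypothetical helper,
   not Guile's API.  Returns NULL if the allocation fails. */
uint32_t *alloc_bitmask(size_t nfields)
{
  size_t nbytes = cleared_bytes(nfields);
  uint32_t *mask = malloc(nbytes);

  if (mask != NULL)
    memset(mask, 0, nbytes);
  return mask;
}
```

With 8 fields (an arbitrary value for illustration), buggy_alloc_bytes returns 1 while cleared_bytes returns 4, so the memset would write 3 bytes past the end of the allocation. That would be consistent with a neighbouring object's type cell getting zeroed, as in the watchpoint trace above.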