From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ludovic Courtès
Newsgroups: gmane.lisp.guile.bugs
Subject: bug#39266: Finalization thread hits wrong-type-arg on weak vector (AArch64)
Date: Mon, 09 Mar 2020 15:38:45 +0100
Message-ID: <874kux385m.fsf@gnu.org>
References: <87tv4kdgyy.fsf@inria.fr>
In-Reply-To: <87tv4kdgyy.fsf@inria.fr> (Ludovic Courtès's message of "Fri, 24 Jan 2020 16:14:29 +0100")
To: 39266@debbugs.gnu.org
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
List-Id: "Bug reports for GUILE, GNU's Ubiquitous Extension Language"

Ludovic Courtès writes:

> While building the “guix-system.drv” derivation on AArch64, I got this
> crash (not fully deterministic but quite frequent).  Here the
> finalization thread gets a wrong-type-arg in ‘scm_i_weak_car’ (i.e.,
> accessing a one-element weak vector):

With 3.0.1, I can reproduce the bug on x86_64.  With rr (thanks, Andy!),
I found this (starting from the point where the type cell of the weak
vector is zeroed, and reverse-continuing until it gets its original
value of 0x10f):

--8<---------------cut here---------------start------------->8---
(rr) frame 40
#40 0x00007ffff7f2e66d in scm_i_weak_car (pair=0x7fffe15af690) at ../libguile/pairs.h:190
190       return SCM_CAR (x);
(rr) down
#39 0x00007ffff7f2f576 in scm_c_weak_vector_ref (wv=, k=k@entry=0) at weak-vector.c:193
193       SCM_VALIDATE_WEAK_VECTOR (1, wv);
(rr)
#38 0x00007ffff7ea7ba0 in scm_wrong_type_arg_msg (
    subr=subr@entry=0x7ffff7f56f00 "weak-vector-ref", pos=pos@entry=1,
    bad_value=0x7fffec472b90, szMessage=szMessage@entry=0x7ffff7f56e80 "weak vector") at error.c:282
282       scm_error (scm_arg_type_key,
(rr) p *((void**)0x7fffec472b90)
$1 = (void *) 0x0
(rr) watch *((void**)0x7fffec472b90)
Hardware watchpoint 1: *((void**)0x7fffec472b90)
(rr) reverse-cont
Continuing.

Thread 1 received signal SIGCONT, Continued.
[Switching to Thread 27074.27074]
__lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:101
101     ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
(rr)
Continuing.
Thread 1 hit Hardware watchpoint 1: *((void**)0x7fffec472b90)

Old value = (void *) 0x0
New value = (void *) 0x10f
__memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:259
259     ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
(rr) bt
#0  __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:259
#1  0x00007ffff7f1d499 in set_vtable_access_fields (vtable=vtable@entry=0x7fffeb48ee80) at struct.c:143
#2  0x00007ffff7f1dd8d in scm_i_struct_inherit_vtable_magic (vtable=vtable@entry=0x7ffff4e32fa0,
    obj=obj@entry=0x7fffeb48ee80) at struct.c:215
#3  0x00007ffff7f1dfea in scm_c_make_structv (vtable=0x7ffff4e32fa0, n_tail=, n_init=8,
    init=0x7fffffff50d0) at struct.c:364
#4  0x00007ffff7f1e0b9 in scm_make_struct_no_tail (vtable=0x7ffff4e32fa0, init=0x304) at struct.c:491
--8<---------------cut here---------------end--------------->8---

Bingo!  There’s a mismatch in struct.c:

--8<---------------cut here---------------start------------->8---
  bitmask_size = (nfields + 31U) / 32U;
  unboxed_fields = scm_gc_malloc_pointerless (bitmask_size, "unboxed fields");
  memset (unboxed_fields, 0, bitmask_size * sizeof(*unboxed_fields));
--8<---------------cut here---------------end--------------->8---

Pushed a fix as 7c17655cd3d859bf0c5a86d9782a7788205fc05a.

Thanks, rr!  You made my day!  :-)

Now testing Guix builds on x86_64, i686, ARMv7, and AArch64 to see if
that addresses seemingly related issues.

Ludo’.
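
PS: To make the struct.c mismatch concrete, here is a minimal standalone
sketch.  In the quoted snippet the allocation asks for bitmask_size
bytes, while the memset clears bitmask_size * sizeof (*unboxed_fields)
bytes, so the memset can run past the end of the buffer and zero
whatever object follows it (which would be consistent with the zeroed
type cell seen in the watchpoint).  The sketch makes several
assumptions: the bitmask is taken to be an array of 32-bit words (as
the division by 32 suggests), malloc stands in for
scm_gc_malloc_pointerless, and all function names below are
hypothetical.  The consistent variant is only one plausible shape for a
fix; it is not claimed to match the commit above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Number of 32-bit words needed to hold one bit per field
   (assumption: the bitmask is stored as uint32_t words). */
static size_t
bitmask_words (size_t nfields)
{
  return (nfields + 31U) / 32U;
}

/* Bytes requested by the quoted allocation: the word count is
   passed to the allocator as if it were a byte count. */
static size_t
quoted_alloc_bytes (size_t nfields)
{
  return bitmask_words (nfields);
}

/* Bytes cleared by the quoted memset call. */
static size_t
quoted_clear_bytes (size_t nfields)
{
  return bitmask_words (nfields) * sizeof (uint32_t);
}

/* Consistent variant: allocate and clear the same number of bytes.
   malloc is a stand-in for scm_gc_malloc_pointerless (assumption). */
static uint32_t *
make_unboxed_bitmask (size_t nfields)
{
  size_t nbytes = bitmask_words (nfields) * sizeof (uint32_t);
  uint32_t *bits = malloc (nbytes);

  if (bits != NULL)
    memset (bits, 0, nbytes);
  return bits;
}
```

With these helpers, an illustrative struct of 8 fields needs one word:
quoted_alloc_bytes (8) is 1 while quoted_clear_bytes (8) is 4, so three
bytes beyond the allocation would be zeroed.  The caller of
make_unboxed_bitmask owns the returned buffer and frees it with free.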