From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#33014: 26.1.50; 27.0.50; Fatal error after re-evaluating a thread's function
Date: Sat, 13 Oct 2018 09:23:38 +0300
Message-ID: <83y3b2uzyt.fsf@gnu.org>
References: <87d0sh9hje.fsf@runbox.com> <83murjwplq.fsf@gnu.org> <87zhvjc4r3.fsf@runbox.com>
In-reply-to: <87zhvjc4r3.fsf@runbox.com> (message from Gemini Lasswell on Fri, 12 Oct 2018 13:02:56 -0700)
Cc: 33014@debbugs.gnu.org
To: Gemini Lasswell

> From: Gemini Lasswell
> Cc: 33014@debbugs.gnu.org
> Date: Fri, 12 Oct 2018 13:02:56 -0700
>
> I've tried to do that without success. The bug won't reproduce if I put
> all the code added to thread.el by the patch into its own file and load
> it with C-u M-x byte-compile-file, and it also doesn't work to put the
> resulting .elc on my load-path and load it with require.

Did you try loading it as a .el file?

Anyway, it's too bad that the reproduction is so Heisenbug-like. It
probably won't reproduce on my system anyway.

> I've determined today that having -O2 in CFLAGS is necessary to
> reproduce the bug, and that -O1 or -O0 won't do it.

One more reason why reproduction elsewhere is probably hard.

> The Lisp backtrace is really short:
>
> Thread 7 (Thread 0x7f1cd4dec700 (LWP 21837)):
> "erb--benchmark-monitor-func" (0x158ec58)

If you succeed in reproducing this when this code is loaded
uncompiled, the backtrace might be more helpful.

> >> #2 0x00000000006122b5 in XHASH_TABLE (a=...) at lisp.h:2241
> >
> > and what was its parent object in the calling frame?
>
> Those are both optimized out with -O2. I recompiled bytecode.c with
> "volatile" on the declaration of jmp_table, and got this:
>
> (gdb) up 3
> #3 exec_byte_code (bytestr=..., vector=..., maxdepth=..., args_template=...,
>     nargs=nargs@entry=0, args=,
>     args@entry=0x16eacf8 ) at bytecode.c:1403
> 1403 struct Lisp_Hash_Table *h = XHASH_TABLE (jmp_table);
> (gdb) p jmp_table
> $1 = make_number(514)
> (gdb) p *top
> $3 = XIL(0x42b4d0)
> (gdb) pp *top
> remove

Which one of these is the one that triggers the assertion violation?

> Thread 1 "monitor" hit Hardware watchpoint 7: *(EMACS_INT *) 0x16eac38
>
> Old value = 60897760
> New value = 24075314
> setup_on_free_list (v=v@entry=0x16eac30 ,
>     nbytes=nbytes@entry=272) at alloc.c:3060
> 3060 total_free_vector_slots += nbytes / word_size;
> (gdb) bt 10
> #0 setup_on_free_list (v=v@entry=0x16eac30 ,
>     nbytes=nbytes@entry=272) at alloc.c:3060
> #1 0x00000000005a9a24 in sweep_vectors () at alloc.c:3297
> #2 0x00000000005adb2e in gc_sweep () at alloc.c:6872
> #3 garbage_collect_1 (end=) at alloc.c:5860
> #4 Fgarbage_collect () at alloc.c:5989
> #5 0x00000000005ca478 in maybe_gc () at lisp.h:4804
> #6 Ffuncall (nargs=4, args=args@entry=0x7fff210a3bc8) at eval.c:2838
> #7 0x0000000000611e00 in exec_byte_code (bytestr=..., vector=..., maxdepth=...,
>     args_template=..., nargs=nargs@entry=2, args=,
>     args@entry=0x9bd128 ) at bytecode.c:632
> #8 0x00000000005cdd32 in funcall_lambda (fun=XIL(0x7fff210a3bc8),
>     nargs=nargs@entry=2, arg_vector=0x9bd128 ,
>     arg_vector@entry=0x7fff210a3f00) at eval.c:3057
> #9 0x00000000005ca54b in Ffuncall (nargs=3, args=args@entry=0x7fff210a3ef8)
>     at eval.c:2870
> (More stack frames follow...)

Can you show the Lisp backtrace for the above?

> Note that just as was happening when we were working through bug#32357,
> the thread names which gdb prints are wrong, which I verified with:

Looks like a bug in the pthreads version of sys_thread_create: it calls
prctl with PR_SET_NAME as its first argument, but my reading of the
documentation is that such a call gives the name to the _calling_
thread, which is not the thread just created. We should instead call
pthread_setname_np, I think (but I'm not an expert on pthreads).

> Am I correct that the next step is to figure out why the garbage
> collector is not marking this vector?
> Presumably it's no longer
> attached to the function definition for erb--benchmark-monitor-func by
> the time the garbage collector runs, but it's supposed to be found by
> mark_stack when called from mark_one_thread for Thread 7, right?

Is this vector the byte-code of erb--benchmark-monitor-func? If so,
how come it is no longer attached to the function, as long as the
function does exist? And if this vector isn't the byte-code of
erb--benchmark-monitor-func, then what is it?

IMO, we cannot reason about what GC does or doesn't do until we
understand what data structure it processes, and what is the relation
of that data structure to the symbols in your program and in Emacs.

Thanks.
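
To make the thread-naming point above concrete, here is a minimal sketch
in C, assuming GNU/Linux with glibc. It is not the code in
sys_thread_create: struct name_probe, probe_body and create_named_thread
are hypothetical names made up for illustration. The sketch names a new
thread through its pthread_t handle, so the name should go to the thread
just created and not to the thread doing the creating.

```c
#define _GNU_SOURCE   /* for pthread_setname_np / pthread_getname_np on glibc */
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Hypothetical handshake data; not taken from the Emacs sources.  */
struct name_probe
{
  pthread_mutex_t lock;
  pthread_cond_t named;
  int name_is_set;
  char seen[16];   /* Linux thread names fit in 16 bytes, NUL included */
};

/* Thread body: wait until the creator has named us, then record the
   name the system reports for this thread.  */
static void *
probe_body (void *arg)
{
  struct name_probe *p = arg;

  pthread_mutex_lock (&p->lock);
  while (!p->name_is_set)
    pthread_cond_wait (&p->named, &p->lock);
  pthread_mutex_unlock (&p->lock);

  if (pthread_getname_np (pthread_self (), p->seen, sizeof p->seen) != 0)
    p->seen[0] = '\0';
  return NULL;
}

/* Create a thread, name it NAME through its handle, and copy the name
   the new thread observed into SEEN.  Return 0 on success, otherwise
   a pthreads error number.  */
static int
create_named_thread (const char *name, char seen[16])
{
  struct name_probe p = { PTHREAD_MUTEX_INITIALIZER,
                          PTHREAD_COND_INITIALIZER, 0, "" };
  pthread_t tid;
  int err;

  /* Longer names are rejected by pthread_setname_np.  */
  assert (strlen (name) < sizeof p.seen);

  err = pthread_create (&tid, NULL, probe_body, &p);
  if (err != 0)
    return err;

  /* The handle identifies the thread to be named, so the caller's own
     name is left alone.  */
  err = pthread_setname_np (tid, name);

  /* Release the new thread now that naming has been attempted.  */
  pthread_mutex_lock (&p.lock);
  p.name_is_set = 1;
  pthread_cond_signal (&p.named);
  pthread_mutex_unlock (&p.lock);

  pthread_join (tid, NULL);
  memcpy (seen, p.seen, sizeof p.seen);
  return err;
}
```

Calling create_named_thread ("monitor", buf) from any thread should
leave "monitor" in buf while the caller keeps its own name.
pthread_setname_np is a non-portable GNU extension, so a real change
would presumably need a configure-time check before relying on it.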