From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#36609: 27.0.50; Possible race-condition in threading implementation
Date: Fri, 12 Jul 2019 10:47:37 +0300
Message-ID: <83sgrb3d9i.fsf@gnu.org>
References: <87muhks3b5.fsf@hochschule-trier.de>
In-reply-to: <87muhks3b5.fsf@hochschule-trier.de> (message from Andreas Politz on Thu, 11 Jul 2019 22:51:10 +0200)
To: Andreas Politz
Cc: 36609@debbugs.gnu.org

> From: Andreas Politz
> Date: Thu, 11 Jul 2019 22:51:10 +0200
>
> I think there is a race-condition in the implementation of threads.

Not sure what you mean by "race condition". Since only one Lisp
thread can run at any given time, where could this race happen, and
between what threads?

> I tried to find a minimal test-case, without success. Thus, I've
> attached a lengthy source-file. Loading that file should trigger
> this bug and may freeze your session.

FWIW, it doesn't freeze my Emacs here, either on GNU/Linux or on
MS-Windows (with today's master). Are you getting freezes with 100%
reproducibility? If so, please show all the details of your build, as
collected by report-emacs-bug.

> Indications:
>
> 1. The main-thread has the name of one of created threads (XEmacs in
>    this case), instead of "emacs".

I don't think this is related to the problem. I think we have a bug
in the implementation of the 'name' attribute of threads: we call
prctl(PR_SET_NAME), which AFAIU changes the name of the _calling_
thread, and the calling thread at that point is the main thread, see
sys_thread_create. If I evaluate this:

  (make-thread 'ignore "XEmacs")

and then attach a debugger, I see that the name of the main thread has
changed to "XEmacs".

> 2. Emacs stops processing all keyboard/mouse input while looping in
>    wait_reading_process_output. Sending commands via emacsclient still
>    works.
>
> GDB output:
>
> (gdb) info threads
>   Id   Target Id         Frame
> * 1    Thread 0x7ffff17f5d40 (LWP 26264) "XEmacs" 0x000055555576eac0 in XPNTR (a=XIL(0x7ffff1312533)) at alloc.c:535
>   2    Thread 0x7ffff0ac4700 (LWP 26265) "gmain" 0x00007ffff50d1667 in poll () from /usr/lib/libc.so.6
>   3    Thread 0x7fffebd1a700 (LWP 26266) "gdbus" 0x00007ffff50d1667 in poll () from /usr/lib/libc.so.6
>   4    Thread 0x7fffeb519700 (LWP 26267) "dconf worker" 0x00007ffff50d1667 in poll () from /usr/lib/libc.so.6
>
> (gdb) bt full

This "bt full" is not enough, because it shows the backtrace of one
thread only, the main thread. Please show the backtrace of the other
3 threads by typing "thread apply all bt full" instead.

> #5  0x0000555555802a78 in wait_reading_process_output (time_limit=0, nsecs=0, read_kbd=-1, do_display=true, wait_for_cell=XIL(0), wait_proc=0x0, just_wait_proc=0) at process.c:5423
>         process_skipped = false
>         channel = 6
>         nfds = 1
>         Available = {
>           fds_bits = {32, 0 }
>         }

Is the keyboard descriptor's bit in Available set or not? What are
the contents of the fd_callback_info array at this point?

> ;; -*- lexical-binding: t -*-
>
> (require 'threads)
> (require 'eieio)
> (require 'cl-lib)
> (require 'ring)
>
> (defun debug (fmt &rest args)
>   (princ (apply #'format fmt args) #'external-debugging-output)
>   (terpri #'external-debugging-output))

Please describe what this program tries to accomplish, and how. It's
not easy to reverse-engineer that from 200 lines of non-trivial code.
It's possible that the reason(s) for the freeze will be apparent from
describing your implementation ideas.

Thanks.
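
P.S. To illustrate the prctl(PR_SET_NAME) remark above, here is a
minimal C sketch, assuming Linux with glibc. It is not the Emacs
code: the helper names (struct named_start, get_own_name,
spawn_naming_from_creator, spawn_naming_in_thread) are hypothetical
and exist only for this example. It contrasts calling prctl from the
creating thread, which renames the creator, with calling it from
inside the new thread, which renames only that thread.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>
#include <sys/prctl.h>

/* Linux thread names hold at most 16 bytes, including the NUL.  */
#define THREAD_NAME_LEN 16

/* Hypothetical helper type: the requested name, plus the name the
   new thread observed for itself.  */
struct named_start
{
  const char *name;
  char seen[THREAD_NAME_LEN];
};

/* Copy the calling thread's name into BUF.  */
static void
get_own_name (char buf[THREAD_NAME_LEN])
{
  memset (buf, 0, THREAD_NAME_LEN);
  int rc = prctl (PR_GET_NAME, (unsigned long) buf, 0, 0, 0);
  assert (rc == 0);
  (void) rc;
}

/* Thread body that only records its own name.  */
static void *
start_only (void *arg)
{
  struct named_start *s = arg;
  get_own_name (s->seen);
  return NULL;
}

/* Thread body that names itself first, then records the name.  */
static void *
start_and_name_self (void *arg)
{
  struct named_start *s = arg;
  prctl (PR_SET_NAME, (unsigned long) s->name, 0, 0, 0);
  get_own_name (s->seen);
  return NULL;
}

/* Misplaced call: prctl runs in the creator, so the creator is
   renamed and the new thread keeps the name it inherited.  */
static int
spawn_naming_from_creator (struct named_start *s)
{
  pthread_t t;
  if (pthread_create (&t, NULL, start_only, s) != 0)
    return -1;
  prctl (PR_SET_NAME, (unsigned long) s->name, 0, 0, 0);
  return pthread_join (t, NULL);
}

/* One possible remedy: let the new thread call prctl itself.  */
static int
spawn_naming_in_thread (struct named_start *s)
{
  pthread_t t;
  if (pthread_create (&t, NULL, start_and_name_self, s) != 0)
    return -1;
  return pthread_join (t, NULL);
}
```

With spawn_naming_from_creator, a debugger attached afterwards would
show the new name on the creating thread, which matches what "info
threads" reports for thread 1 in the quoted output. Another option,
not shown here, would be glibc's pthread_setname_np, which takes the
target thread as an argument.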