From: Pip Cet
Newsgroups: gmane.emacs.bugs
Subject: bug#36609: 27.0.50; Possible race-condition in threading implementation
Date: Fri, 12 Jul 2019 12:57:51 +0000
References: <87muhks3b5.fsf@hochschule-trier.de> <83muhj2zmb.fsf@gnu.org>
In-Reply-To: <83muhj2zmb.fsf@gnu.org>
To: Eli Zaretskii
Cc: 36609@debbugs.gnu.org, politza@hochschule-trier.de

On Fri, Jul 12, 2019 at 12:42 PM Eli Zaretskii wrote:
>
> > From: Pip Cet
> > Date: Fri, 12 Jul 2019 09:02:22 +0000
> > Cc: 36609@debbugs.gnu.org
> >
> > On Thu, Jul 11, 2019 at 8:52 PM Andreas Politz wrote:
> > > I think there is a race-condition in the implementation of threads. I
> > > tried to find a minimal test-case, without success. Thus, I've attached
> > > a lengthy source-file. Loading that file should trigger this bug and
> > > may freeze your session.
> >
> > It does here, so I can provide further debugging information if
> > needed.
>
> Thanks, can you provide the info I asked for?

Yes, albeit not right now.

> > On first glance, it appears that xgselect returns abnormally with
> > g_main_context acquired in one thread, and then other threads fail
> > to acquire it and loop endlessly.
>
> If you can describe what causes this to happen, I think we might be
> half-way to a solution.

Here's the backtrace of the abnormal exit I see with the patch attached:

(gdb) bt full
#0  0x00000000006bf987 in release_g_main_context (ptr=0xc1d070) at xgselect.c:36
        context = 0x7fffedf79710
#1  0x0000000000616f03 in do_one_unbind (this_binding=0x7fffedf79770, unwinding=true, bindflag=SET_INTERNAL_UNBIND) at eval.c:3446
#2  0x0000000000617245 in unbind_to (count=0, value=XIL(0)) at eval.c:3567
        this_binding = {
          kind = SPECPDL_UNWIND_PTR,
          unwind = { kind = SPECPDL_UNWIND_PTR, func = 0x6bf97b , arg = XIL(0xc1d070), eval_depth = 0 },
          unwind_array = { kind = SPECPDL_UNWIND_PTR, nelts = 7076219, array = 0xc1d070 },
          unwind_ptr = { kind = SPECPDL_UNWIND_PTR, func = 0x6bf97b , arg = 0xc1d070 },
          unwind_int = { kind = SPECPDL_UNWIND_PTR, func = 0x6bf97b , arg = 12701808 },
          unwind_excursion = { kind = SPECPDL_UNWIND_PTR, marker = XIL(0x6bf97b), window = XIL(0xc1d070) },
          unwind_void = { kind = SPECPDL_UNWIND_PTR, func = 0x6bf97b },
          let = { kind = SPECPDL_UNWIND_PTR, symbol = XIL(0x6bf97b), old_value = XIL(0xc1d070), where = XIL(0), saved_value = XIL(0xef26a0) },
          bt = { kind = SPECPDL_UNWIND_PTR, debug_on_exit = false, function = XIL(0x6bf97b), args = 0xc1d070, nargs = 0 }
        }
        quitf = XIL(0)
#3  0x00000000006116df in unwind_to_catch (catch=0x7fffd8000c50, type=NONLOCAL_EXIT_SIGNAL, value=XIL(0x14d3653)) at eval.c:1162
        last_time = false
#4  0x00000000006126d9 in signal_or_quit (error_symbol=XIL(0x90), data=XIL(0), keyboard_quit=false) at eval.c:1674
        unwind_data = XIL(0x14d3653)
        conditions = XIL(0x7ffff05d676b)
        string = XIL(0x5f5e77)
        real_error_symbol = XIL(0x90)
        clause = XIL(0x30)
        h = 0x7fffd8000c50
#5  0x00000000006122e9 in Fsignal (error_symbol=XIL(0x90), data=XIL(0)) at eval.c:1564
#6  0x0000000000698901 in post_acquire_global_lock (self=0xe09db0) at thread.c:115
        sym = XIL(0x90)
        data = XIL(0)
        prev_thread = 0xa745c0
#7  0x000000000069892b in acquire_global_lock (self=0xe09db0) at thread.c:123
#8  0x0000000000699303 in really_call_select (arg=0x7fffedf79a70) at thread.c:596
        sa = 0x7fffedf79a70
        self = 0xe09db0
        oldset = { __val = {0, 0, 7, 0, 80, 140736817269952, 2031, 2080, 18446744073709550952, 32, 343597383808, 4, 0, 472446402655, 511101108348, 0} }
#9  0x00000000005e5ee0 in flush_stack_call_func (func=0x699239 , arg=0x7fffedf79a70) at alloc.c:4969
        end = 0x7fffedf79a30
        self = 0xe09db0
        sentry = { o = { __max_align_ll = 0, __max_align_ld = } }
#10 0x0000000000699389 in thread_select (func=0x419320 , max_fds=9, rfds=0x7fffedf79fa0, wfds=0x7fffedf79f20, efds=0x0, timeout=0x7fffedf7a260, sigmask=0x0) at thread.c:616
        sa = { func = 0x419320 , max_fds = 9, rfds = 0x7fffedf79fa0, wfds = 0x7fffedf79f20, efds = 0x0, timeout = 0x7fffedf7a260, sigmask = 0x0, result = 1 }
#11 0x00000000006bfef5 in xg_select (fds_lim=9, rfds=0x7fffedf7a300, wfds=0x7fffedf7a280, efds=0x0, timeout=0x7fffedf7a260, sigmask=0x0) at xgselect.c:130
        all_rfds = { fds_bits = {8, 0 } }
        all_wfds = { fds_bits = {0 } }
        tmo = { tv_sec = 0, tv_nsec = 0 }
        tmop = 0x7fffedf7a260
        context = 0xc1d070
        have_wfds = true
        gfds_buf = {{ fd = 5, events = 1, revents = 0 }, { fd = 6, events = 1, revents = 0 }, { fd = 8, events = 1, revents = 0 }, { fd = 0, events = 0, revents = 0 } }
        gfds = 0x7fffedf79b10
        gfds_size = 128
        n_gfds = 3
        retval = 0
        our_fds = 0
        max_fds = 8
        context_acquired = true
        i = 3
        nfds = 0
        tmo_in_millisec = -1
        must_free = 0
        need_to_dispatch = false
        count = 3
#12 0x000000000066b757 in wait_reading_process_output (time_limit=3, nsecs=0, read_kbd=0, do_display=false, wait_for_cell=XIL(0), wait_proc=0x0, just_wait_proc=0) at process.c:5423
        process_skipped = false
        channel = 0
        nfds = 0
        Available = { fds_bits = {8, 0 } }
        Writeok = { fds_bits = {0 } }
        check_write = true
        check_delay = 0
        no_avail = false
        xerrno = 0
        proc = XIL(0x7fffedf7a440)
        timeout = { tv_sec = 3, tv_nsec = 0 }
        end_time = { tv_sec = 1562935633, tv_nsec = 911868453 }
        timer_delay = { tv_sec = 0, tv_nsec = -1 }
        got_output_end_time = { tv_sec = 0, tv_nsec = -1 }
        wait = TIMEOUT
        got_some_output = -1
        prev_wait_proc_nbytes_read = 0
        retry_for_async = false
        count = 2
        now = { tv_sec = 0, tv_nsec = -1 }
#13 0x0000000000429bf6 in Fsleep_for (seconds=make_fixnum(3), milliseconds=XIL(0)) at dispnew.c:5825
        t = { tv_sec = 3, tv_nsec = 0 }
        tend = { tv_sec = 1562935633, tv_nsec = 911868112 }
        duration = 3
#14 0x0000000000613e99 in eval_sub (form=XIL(0xf6df73)) at eval.c:2273
        i = 2
        maxargs = 2
        args_left = XIL(0)
        numargs = 1
        original_fun = XIL(0x7fffefa9fb98)
        original_args = XIL(0xf6df83)
        count = 1
        fun = XIL(0xa756a5)
        val = XIL(0)
        funcar = make_fixnum(35184372085343)
        argvals = {make_fixnum(3), XIL(0), XIL(0), XIL(0), XIL(0), XIL(0), XIL(0), XIL(0)}
#15 0x0000000000610032 in Fprogn (body=XIL(0)) at eval.c:462
        form = XIL(0xf6df73)
        val = XIL(0)
#16 0x0000000000616102 in funcall_lambda (fun=XIL(0xf6da43), nargs=0, arg_vector=0xe09dd8) at eval.c:3065
        val = XIL(0xc0)
        syms_left = XIL(0)
        next = XIL(0x3400000013)
        lexenv = XIL(0)
        count = 1
        i = 0
        optional = false
        rest = false
#17 0x0000000000615542 in Ffuncall (nargs=1, args=0xe09dd0) at eval.c:2813
        fun = XIL(0xf6da43)
        original_fun = XIL(0xf6da43)
        funcar = XIL(0xc0)
        numargs = 0
        val = XIL(0xaf72e0)
        count = 0
#18 0x000000000069956f in invoke_thread_function () at thread.c:702
        count = 0
#19 0x0000000000611d61 in internal_condition_case (bfun=0x69953e , handlers=XIL(0x30), hfun=0x699596 ) at eval.c:1351
        val = make_fixnum(1405386)
        c = 0x7fffd8000c50
#20 0x0000000000699697 in run_thread (state=0xe09db0) at thread.c:741
        stack_pos = { __max_align_ll = 0, __max_align_ld = 0 }
        self = 0xe09db0
        iter = 0x0
        c = 0x7fffd8000b20
#21 0x00007ffff4b38fa3 in start_thread (arg=) at pthread_create.c:486
        ret =
        pd =
        now =
        unwind_buf = { cancel_jmp_buf = {{ jmp_buf = {140737185822464, -1249422724209328276, 140737488341374, 140737488341375, 140737185822464, 0, 1249453444682727276, 1249398985402204012}, mask_was_saved = 0 }}, priv = { pad = {0x0, 0x0, 0x0, 0x0}, data = { prev = 0x0, cleanup = 0x0, canceltype = 0 } } }
        not_first_call =
#22 0x00007ffff49724cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Lisp Backtrace:
"sleep-for" (0xedf7a530)
0xf6da40 Lisp type 3

post_acquire_global_lock () can return abnormally (I didn't know
that), so really_call_select() can, too, so thread_select() can, too.

> > +  ptrdiff_t count = SPECPDL_INDEX ();
>
> I don't think we should do that at this low level.

You're right, it does stick out. I think we're safe because we're
calling Fsignal with the global lock held, but it's not a pretty or
well-documented situation.
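
To make the failure mode easier to follow, here is a toy model in C of
what I think is going on. It is only a sketch and not the Emacs code:
every toy_* name is made up for illustration, a plain bool stands in
for the acquired g_main_context, and setjmp/longjmp stands in for the
non-local exit taken when Fsignal is called from
post_acquire_global_lock (). The "protect" flag mimics the idea behind
the SPECPDL_INDEX line quoted above: record a release handler before
calling into select, so that an abnormal return still releases the
context.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the GLib main context: just a flag.  */
struct toy_context { bool acquired; };

/* Hypothetical unwind stack, loosely modelled on the specpdl idea.  */
typedef void (*toy_unwind_fn) (void *);
struct toy_unwind_entry { toy_unwind_fn func; void *arg; };

static struct toy_unwind_entry toy_stack[16];
static size_t toy_depth;
static jmp_buf toy_catch;

/* Push a cleanup handler to run if we unwind past this point.  */
static void
toy_record_unwind (toy_unwind_fn func, void *arg)
{
  toy_stack[toy_depth].func = func;
  toy_stack[toy_depth].arg = arg;
  toy_depth++;
}

/* Pop and run every handler recorded above COUNT.  */
static void
toy_unbind_to (size_t count)
{
  while (toy_depth > count)
    {
      struct toy_unwind_entry e = toy_stack[--toy_depth];
      e.func (e.arg);
    }
}

static void
toy_release_context (void *ptr)
{
  ((struct toy_context *) ptr)->acquired = false;
}

/* Stand-in for the select call: if FAIL, run the recorded handlers
   and exit non-locally, the way a signalled error would.  */
static void
toy_select (bool fail)
{
  if (fail)
    {
      toy_unbind_to (0);
      longjmp (toy_catch, 1);
    }
}

/* Stand-in for xg_select: acquire, call select, release.  */
static void
toy_xg_select (struct toy_context *ctx, bool fail, bool protect)
{
  size_t count = toy_depth;
  ctx->acquired = true;
  if (protect)
    toy_record_unwind (toy_release_context, ctx);
  toy_select (fail);
  /* Normal return path.  */
  if (protect)
    toy_unbind_to (count);
  else
    ctx->acquired = false;
}

/* Return true if the context is still acquired after the call,
   i.e. if it leaked.  CTX is static so that its value is well
   defined after longjmp.  */
bool
toy_run (bool fail, bool protect)
{
  static struct toy_context ctx;
  ctx.acquired = false;
  toy_depth = 0;
  if (setjmp (toy_catch) == 0)
    toy_xg_select (&ctx, fail, protect);
  return ctx.acquired;
}
```

In this model, toy_run (true, false) leaves the flag set, which
corresponds to the other threads looping because they can never
acquire the context, while toy_run (true, true) releases it during
unwinding.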