From: Elias Mårtenson
Newsgroups: gmane.emacs.bugs
Subject: bug#25172: 26.0.50; Concurrency feature, sit-for doesn't work (crashing and unexpected behaviour)
Date: Tue, 13 Dec 2016 10:38:09 +0800
To: Eli Zaretskii
Cc: 25172@debbugs.gnu.org, Clément Pit--Claudel
References: <838trme4jr.fsf@gnu.org> <838trlcals.fsf@gnu.org>
In-Reply-To: <838trlcals.fsf@gnu.org>
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"

I was about to test this, but I have been unable to reproduce the
problem as of the current version: 8db7b65d66f01e90a05cc9f11c67667233d84ca0

Has a fix for this been explicitly committed, or did the behaviour
change unexpectedly because of some other change?

Regards,
Elias

On 13 December 2016 at 01:37, Eli Zaretskii <eliz@gnu.org> wrote:

> > From: Elias Mårtenson <lokedhs@gmail.com>
> > Date: Mon, 12 Dec 2016 12:50:24 +0800
> > Cc: Eli Zaretskii <eliz@gnu.org>, 25172@debbugs.gnu.org
> >
> > I tried with the latest version (a92a027d58cb4df5bb6c7e3c546a72183a192f45)
> > and I'm still getting the same error.
> >
> > The stack trace is as follows:
> > [...]
> > #34 0x0000000000578a22 in emacs_abort () at sysdep.c:2342
> > #35 0x0000000000564247 in unblock_input_to (level=-1) at keyboard.c:7167
> > #36 0x000000000056425e in unblock_input () at keyboard.c:7183
> > #37 0x000000000069c5e4 in xg_select (fds_lim=15, rfds=0x7fffe59e19a0,
> > wfds=0x7fffe59e1920, efds=0x0, timeout=0x7fffe59e1900, sigmask=0x0) at
> > xgselect.c:162
>
> xg_select uses block_input/unblock_input, something other *select
> implementations used by Emacs don't do (as those others are system
> APIs).  block_input/unblock_input manipulate a global variable that is
> not incremented and decremented atomically, so it's fundamentally
> thread-unsafe.  Moreover, some places in Emacs reset that global
> variable to zero (although I don't believe those places are part of
> your scenario).
>
> The above is especially important because the calls to the *select
> functions are about the only place in Emacs where several threads can
> run in parallel, because they are called by thread_select like this:
>
>   release_global_lock ();
>   sa->result = (sa->func) (sa->max_fds, sa->rfds, sa->wfds, sa->efds,
>                            sa->timeout, sa->sigmask);
>   acquire_global_lock (self);
>
> So between the call to release_global_lock, which allows another
> thread to grab the lock, and the subsequent call to
> acquire_global_lock several threads could run and more or less
> simultaneously call the *select function.  If that function is
> xg_select, these threads might step on each other's toes by calling
> block_input/unblock_input in parallel.  This could easily cause the
> global variable to become negative, which then causes the above abort.
>
> Long story short, could you please try removing the calls to
> block_input/unblock_input from xgselect.c, and see if that solves
> these crashes?  (These calls were introduced to fix a rare and elusive
> bug, but I don't think you will see that bug unless you do what that
> bug's recipe calls for.  And anyway, this removal is just so we see
> whether this is indeed the reason for the problem, I don't really
> suggest to remove them for good.)
>
> Thanks.
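The quoted explanation says a counter that is not updated atomically can go negative when two threads block and unblock input in parallel. The following is a deterministic sketch of one such schedule. It assumes, as a simplification, that blocking is a plain increment and unblocking a plain decrement of one int. It is single-threaded and only replays the load and store steps by hand. toy_blocked_depth, struct sim_thread and the sim_* helpers are hypothetical names, not Emacs code.

```c
#include <assert.h>

/* Hypothetical stand-in for the global depth counter.  */
int toy_blocked_depth = 0;

/* A simulated thread: `loaded` is the value it last read.  */
struct sim_thread { int loaded; };

/* A non-atomic ++ or -- is a load followed by a separate store.  */
void sim_load (struct sim_thread *t) { t->loaded = toy_blocked_depth; }
void sim_store_plus_one (const struct sim_thread *t) { toy_blocked_depth = t->loaded + 1; }
void sim_store_minus_one (const struct sim_thread *t) { toy_blocked_depth = t->loaded - 1; }

/* Interleaved schedule: both threads load before either stores.  */
int
simulate_interleaved (void)
{
  struct sim_thread a = { 0 }, b = { 0 };
  toy_blocked_depth = 0;

  sim_load (&a);                /* A reads 0 */
  sim_load (&b);                /* B reads 0 */
  sim_store_plus_one (&a);      /* counter = 1 */
  sim_store_plus_one (&b);      /* counter = 1: one increment is lost */

  sim_load (&a);
  sim_store_minus_one (&a);     /* counter = 0 */
  sim_load (&b);
  sim_store_minus_one (&b);     /* counter = -1 */

  return toy_blocked_depth;
}

/* Serialized schedule: every update finishes before the next starts.  */
int
simulate_serialized (void)
{
  struct sim_thread a = { 0 }, b = { 0 };
  toy_blocked_depth = 0;

  sim_load (&a); sim_store_plus_one (&a);    /* counter = 1 */
  sim_load (&b); sim_store_plus_one (&b);    /* counter = 2 */
  sim_load (&a); sim_store_minus_one (&a);   /* counter = 1 */
  sim_load (&b); sim_store_minus_one (&b);   /* counter = 0 */

  assert (toy_blocked_depth >= 0);
  return toy_blocked_depth;
}
```

With this schedule simulate_interleaved () returns -1, the same negative level as in frame #35 of the backtrace, while simulate_serialized () returns 0. Real threads would hit the bad interleaving only occasionally, which fits a crash that is hard to reproduce.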