From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Matthias Dahl Newsgroups: gmane.emacs.devel Subject: Re: wait_reading_process_ouput hangs in certain cases (w/ patches) Date: Tue, 13 Mar 2018 10:54:00 +0100 Message-ID: <13b3e003-d12b-33a7-3ebe-c07b017a7cc0@binary-island.eu> References: <03261534-6bf5-1a5d-915f-d3c55aaa35e9@binary-island.eu> <206ebefa-7583-f049-140c-c8fd041b0719@cs.ucla.edu> <709614e8-1937-07c1-f554-b453ed4f3d4a@binary-island.eu> <7550438b-9fd4-d374-e571-8bb16456cad5@cs.ucla.edu> <797d0e16-1bae-50c2-35f8-05489ffce935@binary-island.eu> <83tvugdiu5.fsf@gnu.org> <877er5s0xv.fsf@gmail.com> <4e4c72bb-295d-81e1-e4ed-cad256bca83c@binary-island.eu> <87zi3v9461.fsf@gmail.com> <87k1uy8x68.fsf@gmail.com> <6d1970af-8c5c-20ba-be09-0b9aa757d663@binary-island.eu> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="------------DD97B182BB85904F617999F6" X-Trace: blaine.gmane.org 1520934741 21253 195.159.176.226 (13 Mar 2018 09:52:21 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Tue, 13 Mar 2018 09:52:21 +0000 (UTC) User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 Cc: Eli Zaretskii , =?UTF-8?Q?andr=c3=a9s_ram=c3=adrez?= , emacs-devel@gnu.org, eggert@cs.ucla.edu To: Lars Ingebrigtsen Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Tue Mar 13 10:52:17 2018 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1evgbQ-0005OM-0d for ged-emacs-devel@m.gmane.org; Tue, 13 Mar 2018 10:52:16 +0100 Original-Received: from localhost ([::1]:38625 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1evgdS-0002Ir-Uk for ged-emacs-devel@m.gmane.org; Tue, 13 Mar 2018 05:54:22 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:47246) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1evgdG-0002IV-J8 for emacs-devel@gnu.org; Tue, 13 Mar 2018 05:54:12 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1evgdD-0003ZR-Ce for emacs-devel@gnu.org; Tue, 13 Mar 2018 05:54:10 -0400 Original-Received: from ud19.udmedia.de ([194.117.254.59]:43492 helo=mail.ud19.udmedia.de) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1evgdD-0003Vs-0r for emacs-devel@gnu.org; Tue, 13 Mar 2018 05:54:07 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple; d=binary-island.eu; h= subject:to:cc:references:from:message-id:date:mime-version :in-reply-to:content-type; s=k1; bh=JsKI4DuXqnJWZx63ZBdDgpwDboar kOWLOqennrD8Aq8=; b=WRplMidfbIrZcHzZox0VyHKtbI4rhI7/EmHUnVgLiwht V4APlmVFP+Ev5LVKVXrW9eMY/zxJi6EgRs+iLP27GiErtVHpfqSQ8c/1jm8EsEJE BHrpSdB8u6mXz0Ss6T3oaqVMHETHwNgqJZDRQvOJ9khniqSvHc5teczT+5LcVDU= Original-Received: (qmail 20057 invoked from network); 13 Mar 2018 10:54:03 +0100 Original-Received: from unknown (HELO ?IPv6:2a02:810b:c540:234:36aa:25b9:ca8f:d05f?) (ud19?126p1@2a02:810b:c540:234:36aa:25b9:ca8f:d05f) by mail.ud19.udmedia.de with ESMTPSA (ECDHE-RSA-AES128-GCM-SHA256 encrypted, authenticated); 13 Mar 2018 10:54:03 +0100 Openpgp: id=1E87ADA02EFE759EFC20B2D1042F47D273AA780C In-Reply-To: Content-Language: en-US X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 194.117.254.59 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.org gmane.emacs.devel:223683 Archived-At: This is a multi-part message in MIME format. --------------DD97B182BB85904F617999F6 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Hello all... Sorry for the delay, things took quite a bit longer than I expected. And I really should scratch that "later this week" phrase entirely from my active use. ;-) @Andrés: Sorry for not replying to your off-list mail yet. I was busy with this bug, given the limited time I have I was prioritizing. So, attached you will find patches that fix bugs I have found with the current GnuTLS code. I am pretty confident that it will fix the issue that Lars is seeing. And I hope that the hangs seen by Andrés will be gone with this as well. Lars and Andrés, please test those patches either against Emacs master without any other additional patches or against the current Emacs 26 branch with the wait_reading_process_output patches that have been applied to master but nothing else. Please report back if those patches fix your issues or, if not, how they affect (if at all) the hangs. Thanks again for investing the time! @Eli: Those patches are, imho, clearly Emacs 26 material as well. And along those lines, and I don't mean to be pushy at all :), I would like to bring up the wait_reading_process_output fixes as well. I still think it would be a good idea to get those into Emacs 26. All the fixes up until now have been for hangs that will usually be erratic and unpredictable to the user... thus, those hangs will probably end up as unresolved bug reports on some random package instead of posts on this list... if at all. Keeping my fingers crossed here. ;-) So long, Matthias -- Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu --------------DD97B182BB85904F617999F6 Content-Type: text/x-patch; name="0001-Fix-GnuTLS-error-handling.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="0001-Fix-GnuTLS-error-handling.patch" =46rom 841d0fbe37642bc14c4bcd515cfbc91a25ba179b Mon Sep 17 00:00:00 2001 From: Matthias Dahl Date: Mon, 12 Mar 2018 15:33:45 +0100 Subject: [PATCH 1/2] Fix GnuTLS error handling * src/gnutls.c (emacs_gnutls_read): All error handling should be done in `emacs_gnutls_handle_error', move handling of GNUTLS_E_UNEXPECTED_PACKET_LENGTH accordingly. We always need to set `errno' in case of an error, since later error handling (e.g. `wait_reading_process_output') depends on it and GnuTLS does not set errno by itself. We'll otherwise have random errno values which can cause erratic behavior and hangs. (emacs_gnutls_handle_error): GNUTLS_E_UNEXPECTED_PACKET_LENGTH is only returned for premature connection termination on GnuTLS < 3.0 and is otherwise a real error and should not be gobbled up. --- src/gnutls.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/src/gnutls.c b/src/gnutls.c index 903393fed1..e7d0d3d845 100644 --- a/src/gnutls.c +++ b/src/gnutls.c @@ -708,16 +708,18 @@ emacs_gnutls_read (struct Lisp_Process *proc, char = *buf, ptrdiff_t nbyte) rtnval =3D gnutls_record_recv (state, buf, nbyte); if (rtnval >=3D 0) return rtnval; - else if (rtnval =3D=3D GNUTLS_E_UNEXPECTED_PACKET_LENGTH) - /* The peer closed the connection. */ - return 0; else if (emacs_gnutls_handle_error (state, rtnval)) - /* non-fatal error */ - return -1; - else { - /* a fatal error occurred */ - return 0; - } + { + /* non-fatal error */ + errno =3D EAGAIN; + return -1; + } + else + { + /* a fatal error occurred */ + errno =3D EPROTO; + return 0; + } } =20 static char const * @@ -756,8 +758,10 @@ emacs_gnutls_handle_error (gnutls_session_t session,= int err) connection. */ # ifdef HAVE_GNUTLS3 if (err =3D=3D GNUTLS_E_PREMATURE_TERMINATION) - level =3D 3; +# else + if (err =3D=3D GNUTLS_E_UNEXPECTED_PACKET_LENGTH) # endif + level =3D 3; =20 GNUTLS_LOG2 (level, max_log_level, "fatal error:", str); ret =3D false; --=20 2.16.2 --------------DD97B182BB85904F617999F6 Content-Type: text/x-patch; name="0002-Always-check-GnuTLS-sessions-for-available-data.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename*0="0002-Always-check-GnuTLS-sessions-for-available-data.patch" =46rom f5b979da67595da4d91f7b7a4eb979deae6a7321 Mon Sep 17 00:00:00 2001 From: Matthias Dahl Date: Mon, 12 Mar 2018 16:07:55 +0100 Subject: [PATCH 2/2] Always check GnuTLS sessions for available data * src/process.c (wait_reading_process_output): GnuTLS buffers data internally and as such there is no guarantee that a select() call on the underlying kernel socket will report available data if all data has already been buffered. Prior to GnuTLS < 2.12, lowat mode was the default which left bytes back in the kernel socket, so a select() call would work. With GnuTLS >=3D 2.12 (the now required version for Emacs), that default changed to non-lowat mode (and we don't set it otherwise) and was subsequently completely removed with GnuTLS >=3D 3.0. So, to properly handle GnuTLS sessions, we need to iterate through all channels, check for available data manually and set the concerning fds accordingly. Otherwise we might stall/delay unnecessarily or worse. This also applies to the !just_wait_proc && wait_proc case, which was previously handled improperly (only wait_proc was checked) and could cause problems if sessions did have any dependency on one another through e.g. higher up program logic and waited for one another. --- src/process.c | 77 +++++++++++++++++++----------------------------------= ------ 1 file changed, 24 insertions(+), 53 deletions(-) diff --git a/src/process.c b/src/process.c index 9b9b9f3550..f8fed56d5b 100644 --- a/src/process.c +++ b/src/process.c @@ -5392,60 +5392,31 @@ wait_reading_process_output (intmax_t time_limit,= int nsecs, int read_kbd, #endif /* !HAVE_GLIB */ =20 #ifdef HAVE_GNUTLS - /* GnuTLS buffers data internally. In lowat mode it leaves - some data in the TCP buffers so that select works, but - with custom pull/push functions we need to check if some - data is available in the buffers manually. */ - if (nfds =3D=3D 0) + /* GnuTLS buffers data internally. select() will only report + available data for the underlying kernel sockets API, not + what has been buffered internally. As such, we need to loop= + through all channels and check for available data manually.= */ + if (nfds >=3D 0) { - fd_set tls_available; - int set =3D 0; - - FD_ZERO (&tls_available); - if (! wait_proc) - { - /* We're not waiting on a specific process, so loop - through all the channels and check for data. - This is a workaround needed for some versions of - the gnutls library -- 2.12.14 has been confirmed - to need it. See - http://comments.gmane.org/gmane.emacs.devel/145074 */ - for (channel =3D 0; channel < FD_SETSIZE; ++channel) - if (! NILP (chan_process[channel])) - { - struct Lisp_Process *p =3D - XPROCESS (chan_process[channel]); - if (p && p->gnutls_p && p->gnutls_state - && ((emacs_gnutls_record_check_pending - (p->gnutls_state)) - > 0)) - { - nfds++; - eassert (p->infd =3D=3D channel); - FD_SET (p->infd, &tls_available); - set++; - } - } - } - else - { - /* Check this specific channel. */ - if (wait_proc->gnutls_p /* Check for valid process. */ - && wait_proc->gnutls_state - /* Do we have pending data? */ - && ((emacs_gnutls_record_check_pending - (wait_proc->gnutls_state)) - > 0)) - { - nfds =3D 1; - eassert (0 <=3D wait_proc->infd); - /* Set to Available. */ - FD_SET (wait_proc->infd, &tls_available); - set++; - } - } - if (set) - Available =3D tls_available; + for (channel =3D 0; channel < FD_SETSIZE; ++channel) + if (! NILP (chan_process[channel])) + { + struct Lisp_Process *p =3D + XPROCESS (chan_process[channel]); + + if (just_wait_proc && p !=3D wait_proc) + continue; + + if (p && p->gnutls_p && p->gnutls_state + && ((emacs_gnutls_record_check_pending + (p->gnutls_state)) + > 0)) + { + nfds++; + eassert (p->infd =3D=3D channel); + FD_SET (p->infd, &Available); + } + } } #endif } --=20 2.16.2 --------------DD97B182BB85904F617999F6--