From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Matthias Dahl Newsgroups: gmane.emacs.devel Subject: Re: wait_reading_process_ouput hangs in certain cases (w/ patches) Date: Tue, 13 Feb 2018 15:25:44 +0100 Message-ID: <797d0e16-1bae-50c2-35f8-05489ffce935@binary-island.eu> References: <83inemrqid.fsf@gnu.org> <398f8d17-b727-d5d6-4a31-772448c7ca0d@binary-island.eu> <56e722a6-95a4-0e42-185c-f26845d4f4bf@binary-island.eu> <21237e45-a353-92f9-01ec-7b51640d2031@cs.ucla.edu> <83vaickfu2.fsf@gnu.org> <83tvxwkexg.fsf@gnu.org> <03261534-6bf5-1a5d-915f-d3c55aaa35e9@binary-island.eu> <206ebefa-7583-f049-140c-c8fd041b0719@cs.ucla.edu> <709614e8-1937-07c1-f554-b453ed4f3d4a@binary-island.eu> <7550438b-9fd4-d374-e571-8bb16456cad5@cs.ucla.edu> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="------------BA585ECCCCE508C4B085384C" X-Trace: blaine.gmane.org 1518532058 29826 195.159.176.226 (13 Feb 2018 14:27:38 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Tue, 13 Feb 2018 14:27:38 +0000 (UTC) User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 Cc: emacs-devel@gnu.org To: Paul Eggert , Eli Zaretskii Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Tue Feb 13 15:27:33 2018 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1elbY8-0006GH-QX for ged-emacs-devel@m.gmane.org; Tue, 13 Feb 2018 15:27:13 +0100 Original-Received: from localhost ([::1]:39063 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1elbaA-0006wy-Hc for ged-emacs-devel@m.gmane.org; Tue, 13 Feb 2018 09:29:18 -0500 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:59010) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1elbWq-00055e-01 for emacs-devel@gnu.org; Tue, 13 Feb 2018 09:25:53 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1elbWm-0007AD-Nf for emacs-devel@gnu.org; Tue, 13 Feb 2018 09:25:51 -0500 Original-Received: from ud19.udmedia.de ([194.117.254.59]:34482 helo=mail.ud19.udmedia.de) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1elbWm-00078N-7O for emacs-devel@gnu.org; Tue, 13 Feb 2018 09:25:48 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=simple; d=binary-island.eu; h= subject:to:cc:references:from:message-id:date:mime-version :in-reply-to:content-type; s=k1; bh=WImkvGTjyLAve4aWEHynTp8hpIv2 dF7UFR6QPuJGw/4=; b=YliAgnBh/zUDiHcneBKRygntgUXEXfHloRqk0i0aV/hG WclVsyaqTVlYadwwHQmPIi62k8jLgd39PMdm0oZInVgjaJU7ECx7fkzXeQ9fZxvQ oUcSFUfX5+AOmO63cH0gqA4rY4H+gjr5KhRosR7JnbyZ/6tUEdoXSPM+MYG3oUg= Original-Received: (qmail 5353 invoked from network); 13 Feb 2018 15:25:45 +0100 Original-Received: from unknown (HELO ?IPv6:2a02:810b:c540:234:36aa:25b9:ca8f:d05f?) (ud19?126p1@2a02:810b:c540:234:36aa:25b9:ca8f:d05f) by mail.ud19.udmedia.de with ESMTPSA (ECDHE-RSA-AES128-GCM-SHA256 encrypted, authenticated); 13 Feb 2018 15:25:45 +0100 Openpgp: id=1E87ADA02EFE759EFC20B2D1042F47D273AA780C In-Reply-To: Content-Language: en-US X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 194.117.254.59 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.org gmane.emacs.devel:222710 Archived-At: This is a multi-part message in MIME format. --------------BA585ECCCCE508C4B085384C Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Hello, after some unexpected / unintended prolonged delay due to personal reasons, for which I apologize, attached the v2 of my patches. Basically I have gone a somewhat different route. While working in some of the requested changes, I noticed that there were still some pathological cases that were not covered and fixing that would make things even more convoluted. So, in this version, the return value is calculated (if necessary) strategically right before we return from the function call, thus it cannot be missed and we will always properly signal if data was read from a wait_proc (either directly or indirectly). And instead of messing with got_some_output, we exit the loop when we got some data (directly or indirectly) for our wait_proc if there is no data to be read for this iteration. This leaves the whole function logic alone -- except for this key point. I have addressed the remaining issues, if they still applied. And I have not been able to trigger a single hang with these patches. I appreciate any comments and suggestions. Thanks, again, for all the patience. So long, Matthias -- Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu --------------BA585ECCCCE508C4B085384C Content-Type: text/x-patch; name="0001-Add-process-output-read-accounting.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="0001-Add-process-output-read-accounting.patch" =46rom 94e0dc26f45e1a06881b016dd26446c43d339a4d Mon Sep 17 00:00:00 2001 From: Matthias Dahl Date: Tue, 24 Oct 2017 15:55:53 +0200 Subject: [PATCH 1/2] Add process output read accounting This tracks the bytes read from a process' stdin which is not used anywhere yet but required for follow-up work. * src/process.c (read_process_output): Track bytes read from a process. * src/process.h (struct Lisp_Process): Add nbytes_read to track bytes read from a process. --- src/process.c | 3 +++ src/process.h | 2 ++ 2 files changed, 5 insertions(+) diff --git a/src/process.c b/src/process.c index 2ec10b12ec..17fdf592ec 100644 --- a/src/process.c +++ b/src/process.c @@ -5889,6 +5889,9 @@ read_process_output (Lisp_Object proc, int channel)= return nbytes; coding->mode |=3D CODING_MODE_LAST_BLOCK; } + =20 + /* Ignore carryover, it's been added by a previous iteration already. = */ + p->nbytes_read +=3D nbytes; =20 /* Now set NBYTES how many bytes we must decode. */ nbytes +=3D carryover; diff --git a/src/process.h b/src/process.h index ab468b18c5..6464a8cc61 100644 --- a/src/process.h +++ b/src/process.h @@ -129,6 +129,8 @@ struct Lisp_Process pid_t pid; /* Descriptor by which we read from this process. */ int infd; + /* Byte-count modulo (UINTMAX_MAX + 1) for process output read from = `infd'. */ + uintmax_t nbytes_read; /* Descriptor by which we write to this process. */ int outfd; /* Descriptors that were created for this process and that need --=20 2.16.1 --------------BA585ECCCCE508C4B085384C Content-Type: text/x-patch; name="0002-Fix-wait_reading_process_output-wait_proc-hang.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename*0="0002-Fix-wait_reading_process_output-wait_proc-hang.patch" =46rom b9c05bbfb4559b21deb0ea4e156430dedb60ce41 Mon Sep 17 00:00:00 2001 From: Matthias Dahl Date: Tue, 6 Feb 2018 15:24:15 +0100 Subject: [PATCH 2/2] Fix wait_reading_process_output wait_proc hang * src/process.c (wait_reading_process_output): If called recursively through timers and/or process filters via accept-process-output, it is possible that the output of wait_proc has already been read by one of those recursive calls, leaving the original call hanging forever if no further output arrives through that fd and no timeout has been set. Fix that by using the process read accounting to keep track of how many bytes have been read and use that as a condition to break out of the infinite loop and return to the caller as well as to calculate the proper return value (if a wait_proc is given that is). --- src/process.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/src/process.c b/src/process.c index 17fdf592ec..0abbd5fa8e 100644 --- a/src/process.c +++ b/src/process.c @@ -5006,6 +5006,7 @@ wait_reading_process_output (intmax_t time_limit, i= nt nsecs, int read_kbd, struct timespec got_output_end_time =3D invalid_timespec (); enum { MINIMUM =3D -1, TIMEOUT, INFINITY } wait; int got_some_output =3D -1; + uintmax_t prev_wait_proc_nbytes_read =3D wait_proc ? wait_proc->nbytes= _read : 0; #if defined HAVE_GETADDRINFO_A || defined HAVE_GNUTLS bool retry_for_async; #endif @@ -5460,6 +5461,8 @@ wait_reading_process_output (intmax_t time_limit, i= nt nsecs, int read_kbd, if (nfds =3D=3D 0) { /* Exit the main loop if we've passed the requested timeout, + or have read some bytes from our wait_proc (either directly= + in this call or indirectly through timers / process filters= ), or aren't skipping processes and got some output and haven't lowered our timeout due to timers or SIGIO and have waited a long amount of time due to repeated @@ -5467,7 +5470,9 @@ wait_reading_process_output (intmax_t time_limit, i= nt nsecs, int read_kbd, struct timespec huge_timespec =3D make_timespec (TYPE_MAXIMUM (time_t), 2 * TIMESPEC_RESOLUTION);= struct timespec cmp_time =3D huge_timespec; - if (wait < TIMEOUT) + if (wait < TIMEOUT + || (wait_proc + && wait_proc->nbytes_read !=3D prev_wait_proc_nbytes_r= ead)) break; if (wait =3D=3D TIMEOUT) cmp_time =3D end_time; @@ -5772,6 +5777,15 @@ wait_reading_process_output (intmax_t time_limit, = int nsecs, int read_kbd, maybe_quit (); } =20 + /* Timers and/or process filters that we have run could have themselve= s called + `accept-process-output' (and by that indirectly this function), thu= s + possibly reading some (or all) output of wait_proc without us notic= ing it. + This could potentially lead to an endless wait (dealt with earlier = in the + function) and/or a wrong return value (dealt with here). */ + if (wait_proc && wait_proc->nbytes_read !=3D prev_wait_proc_nbytes_rea= d) + got_some_output =3D min (INT_MAX, (wait_proc->nbytes_read + - prev_wait_proc_nbytes_read)); + return got_some_output; } =0C --=20 2.16.1 --------------BA585ECCCCE508C4B085384C--