unofficial mirror of emacs-devel@gnu.org 
* wait_reading_process_ouput hangs in certain cases (w/ patches)
@ 2017-10-24 18:52 Matthias Dahl
  2017-10-25 14:53 ` Eli Zaretskii
  0 siblings, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2017-10-24 18:52 UTC
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 3195 bytes --]

Hello all,

recursively calling accept-process-output with the first call waiting
for the output of a specific process, can hang Emacs (as-in: it is
waiting forever but can be canceled of course) even though it should
not be the case since data was read.

This is actually not all that uncommon. One example of this is a hang
seen with Magit opening its COMMIT_MSG buffer, reported here [1]. I've
myself run into this problem continuously which is why I started to
debug it in the first place.

The hang with Magit happens in flyspell.el which waits for output from
its spellchecker process through accept-process-output and specifies
that specific process as wait_proc. Now depending on timing (race),
wait_reading_process_output can call the pending timers... which in
turn can call accept-process-output again. This almost always leads
to the spellchecker output being read back in full, so there is no
more data left to be read. Thus the original accept-process-output,
which called wait_reading_process_output, will wait for the data to
become available forever since it has no way to know that those have
already been read.

Naturally one could argue that a timeout should be specified when
calling accept-process-output. But nevertheless I still think this is
a breach of contract. The caller expects accept-process-output to
return as promised when data has been read from the specified process
and it clearly isn't always the case and can hang forever, depending
on timing and the specifics of the data being read.

The attached patches fix this by introducing process output read
accounting, i.e. simply counting the bytes read per process, and using
that count at strategic points in wait_reading_process_output to check
whether data was read while we handed over control to timers and/or
filters.
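
To illustrate the idea, here is a minimal sketch (not the patch itself).
Only the field name infd_num_bytes_read is taken from the patches; the
struct and the function names are made up for this example. A caller
takes a snapshot of the counter before handing over control and compares
against it afterwards.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for struct Lisp_Process.  */
struct toy_process
{
  unsigned long infd_num_bytes_read;  /* total bytes read so far */
};

/* To be called wherever output is actually read from the process.  */
void
toy_account_read (struct toy_process *p, size_t nbytes)
{
  p->infd_num_bytes_read += nbytes;
}

/* True if anybody read output from P since SNAPSHOT was taken.  */
bool
toy_read_since (const struct toy_process *p, unsigned long snapshot)
{
  return p->infd_num_bytes_read != snapshot;
}
```

Since the check only compares for inequality, it does not matter who did
the reading, only that the counter moved.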

I haven't seen any ill side-effects in my tests and it clearly fixes
the problem seen in [1] as well as, I would wager, quite a few other
hangs that users of all kinds of setups have run into, that seemed
unpredictable and mysterious, and that were never debugged.

As a side-note: I decided against an artificial metric and went with
simply counting the bytes read, since this does come in handy when
doing debugging and being able to see how much data was read from a
process during specific time intervals.

Also, this still leaves the possibility that wait_reading_process_output
could wait forever while being called without wait_proc and a timeout
set. This could be mitigated as well by some sort of tick counter that
only increases when no wait_proc was specified and data from processes
was processed. I decided against implementing that for now since imho
the chances of that happening are marginal, if present at all. OTOH,
the semantics in that case are not all that clear and would further add
complexity to an already rather unhealthy function. I am naturally open
to other opinions and implementing this as well if requested. :-)
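
For the record, such a tick counter could look roughly like the sketch
below. This is purely hypothetical and not part of the attached patches;
all names are invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical global counter; not part of the attached patches.  */
unsigned long toy_any_output_tick;

/* Would be called whenever process output has been processed.
   The tick only advances if the active wait has no wait_proc.  */
void
toy_note_output (bool have_wait_proc)
{
  if (!have_wait_proc)
    toy_any_output_tick++;
}

/* True if the tick advanced since SNAPSHOT was taken.  */
bool
toy_tick_changed (unsigned long snapshot)
{
  return toy_any_output_tick != snapshot;
}
```

A wait without wait_proc would take a snapshot on entry and break out
once toy_tick_changed reports a change.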

Any suggestions and/or comments are very welcome, as always.

Thanks for the patience to read this longish post. :-)

So long,
Matthias

[1] https://github.com/magit/magit/issues/2915

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu

[-- Attachment #2: 0001-Add-process-output-read-accounting.patch --]
[-- Type: text/x-patch; name="0001-Add-process-output-read-accounting.patch", Size: 1625 bytes --]

From 6c24b8d7082222df28d2046bfe70ff0e22342f08 Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Tue, 24 Oct 2017 15:55:53 +0200
Subject: [PATCH 1/2] Add process output read accounting

This tracks the bytes read from a process's output, which is not used
anywhere yet but is required for follow-up work.
* src/process.c (read_process_output): Track bytes read from a process.
* src/process.h (struct Lisp_Process): Add infd_num_bytes_read
to track bytes read from a process.
---
 src/process.c | 2 ++
 src/process.h | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/src/process.c b/src/process.c
index fc46e74332..904ca60863 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5900,6 +5900,8 @@ read_process_output (Lisp_Object proc, int channel)
   /* Now set NBYTES how many bytes we must decode.  */
   nbytes += carryover;
 
+  p->infd_num_bytes_read += nbytes;
+
   odeactivate = Vdeactivate_mark;
   /* There's no good reason to let process filters change the current
      buffer, and many callers of accept-process-output, sit-for, and
diff --git a/src/process.h b/src/process.h
index 5a044f669f..f796719a51 100644
--- a/src/process.h
+++ b/src/process.h
@@ -129,6 +129,8 @@ struct Lisp_Process
     pid_t pid;
     /* Descriptor by which we read from this process.  */
     int infd;
+    /* Byte-count for process output read from `infd'.  */
+    unsigned long infd_num_bytes_read;
     /* Descriptor by which we write to this process.  */
     int outfd;
     /* Descriptors that were created for this process and that need
-- 
2.14.3


[-- Attachment #3: 0002-src-process.c-wait_reading_process_output-Fix-wait_p.patch --]
[-- Type: text/x-patch; name="0002-src-process.c-wait_reading_process_output-Fix-wait_p.patch", Size: 3155 bytes --]

From 57e9adc220312681588180aff2bae1eb07925ad5 Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Tue, 24 Oct 2017 15:56:47 +0200
Subject: [PATCH 2/2] * src/process.c (wait_reading_process_output): Fix
 wait_proc hang.

If called recursively (through timers or process filters by the means
of accept-process-output), it is possible that the output of wait_proc
has already been read by one of those recursive calls, leaving the
original call hanging forever if no further output arrives through
that fd and no timeout has been specified. Implement proper checks by
taking advantage of the process output read accounting.
---
 src/process.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/src/process.c b/src/process.c
index 904ca60863..a743aa973e 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5003,6 +5003,8 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
   struct timespec got_output_end_time = invalid_timespec ();
   enum { MINIMUM = -1, TIMEOUT, INFINITY } wait;
   int got_some_output = -1;
+  unsigned long initial_wait_proc_num_bytes_read = (wait_proc) ?
+                                                   wait_proc->infd_num_bytes_read : 0;
 #if defined HAVE_GETADDRINFO_A || defined HAVE_GNUTLS
   bool retry_for_async;
 #endif
@@ -5161,6 +5163,17 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 	      && requeued_events_pending_p ())
 	    break;
 
+          /* Timers could have called `accept-process-output', thus reading the output
+             of wait_proc while we (in the worst case) wait endlessly for it to become
+             available later. So we need to check if data has been read and break out
+             early if that is so since our job has been fulfilled. */
+          if (wait_proc
+              && wait_proc->infd_num_bytes_read != initial_wait_proc_num_bytes_read)
+            {
+              got_some_output = 1;
+              break;
+            }
+
           /* This is so a breakpoint can be put here.  */
           if (!timespec_valid_p (timer_delay))
               wait_reading_process_output_1 ();
@@ -5606,7 +5619,15 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 		 buffered-ahead character if we have one.  */
 
 	      nread = read_process_output (proc, channel);
-	      if ((!wait_proc || wait_proc == XPROCESS (proc))
+
+              /* In case a filter was run that called `accept-process-output', it is
+                 possible that the output from wait_proc was already read, leaving us
+                 waiting for it endlessly (if no timeout was specified). Thus, we need
+                 to check if data was already read. */
+              if (wait_proc
+                  && wait_proc->infd_num_bytes_read != initial_wait_proc_num_bytes_read)
+                got_some_output = 1;
+	      else if ((!wait_proc || wait_proc == XPROCESS (proc))
 		  && got_some_output < nread)
 		got_some_output = nread;
 	      if (nread > 0)
-- 
2.14.3



* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-10-24 18:52 wait_reading_process_ouput hangs in certain cases (w/ patches) Matthias Dahl
@ 2017-10-25 14:53 ` Eli Zaretskii
  2017-10-26 14:07   ` Matthias Dahl
  0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2017-10-25 14:53 UTC
  To: Matthias Dahl; +Cc: emacs-devel

> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Tue, 24 Oct 2017 20:52:20 +0200
> 
> recursively calling accept-process-output with the first call waiting
> for the output of a specific process, can hang Emacs (as-in: it is
> waiting forever but can be canceled of course) even though it should
> not be the case since data was read.
> 
> This is actually not all that uncommon. One example of this is a hang
> seen with Magit opening its COMMIT_MSG buffer, reported here [1]. I've
> myself run into this problem continuously which is why I started to
> debug it in the first place.
> 
> The hang with Magit happens in flyspell.el which waits for output from
> its spellchecker process through accept-process-output and specifies
> that specific process as wait_proc. Now depending on timing (race),
> wait_reading_process_output can call the pending timers... which in
> turn can call accept-process-output again. This almost always leads
> to the spellchecker output being read back in full, so there is no
> more data left to be read. Thus the original accept-process-output,
> which called wait_reading_process_output, will wait for the data to
> become available forever since it has no way to know that those have
> already been read.

I'm not sure I understand the situation where this happens; can you
elaborate?  Are you saying that a Lisp program calls
accept-process-output and waits for a specific process to produce some
output, and meanwhile some timer runs a function that calls
accept-process-output with its 1st argument nil?  What timers do that?
I think this would be a bug in such timers.

Or are you saying that a Lisp program calls accept-process-output with
its 1st arg nil, and while it waits, a timer calls
accept-process-output for a specific process?

Or do you mean something else?

Thanks.




* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-10-25 14:53 ` Eli Zaretskii
@ 2017-10-26 14:07   ` Matthias Dahl
  2017-10-26 16:23     ` Eli Zaretskii
  0 siblings, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2017-10-26 14:07 UTC
  To: Eli Zaretskii; +Cc: emacs-devel

Hello Eli,

Thanks for taking the time to review the issue and patches. :-)

On 25/10/17 16:53, Eli Zaretskii wrote:

> I'm not sure I understand the situation where this happens; can you
> elaborate?

Sure. Let's take the Magit issue [1] as an example:

When committing, Magit prepares a COMMIT_MSG buffer and does some
process magic of its own which is pretty much irrelevant for this.

At some point during that, while we are already in an instance of
wait_reading_process_output (due to sit_for), the post-command-hooks are
run.

And here things get interesting. Eventually flyspell-post-command-hook
is run which executes flyspell-word synchronously. That basically
writes out a word that needs to be checked to the spellchecker process,
waits for the results from its stdout via accept-process-output and goes on.
Of special note here is that it specifies a wait_proc (the spellchecker
process) and no timeout whatsoever.

The output from the spellchecker is usually there instantaneously, so
that is actually unnoticeable, unless wait_reading_process_output, that
was invoked through that specific accept-process-output, decides to run
the timers.

And here comes the catch: At this point, usually the spellchecker output
is already available but not yet read. When the timers run, one of them
calls accept-process-output again which will read the entire available
output of the spellchecker process. Since there will be no more data on
that fd unless some interaction happens with the process, our original
call to accept-process-output/wait_reading_process_output will wait
endlessly for the data to become available (due to wait_proc being set
without a timeout).

Thus, it appears that Magit hangs while in truth, flyspell hangs waiting
for the spellchecker results to return that have already been read back.

The gist of it is: If we have an active wait_reading_process_output call
with a wait_proc set but no timeout that calls out to either timers or
filters, it is entirely possible that those directly or indirectly call
us again recursively, thus reading the output we are waiting for without
us ever noticing it, if no further output becomes available in addition
to what was read unnoticed... like it happens with flyspell.

That is what my patches fix: They simply add a bytes read metric to each
process structure that we can check for change at strategically relevant
points and decide if we got some data back that went unnoticed and break
out from wait_reading_process_output.
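
To make the sequence of events more tangible, here is a toy simulation
of it. It is only a sketch under simplifying assumptions: there is no
real fd or pselect involved, the "timer" is just a flag, and all names
except infd_num_bytes_read are invented.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a process with output waiting on its fd.  */
struct sim_process
{
  size_t pending;                     /* bytes waiting to be read */
  unsigned long infd_num_bytes_read;  /* bytes read so far */
};

/* Read everything that is pending, as a nested accept-process-output
   call from a timer might do.  Returns the number of bytes read.  */
size_t
sim_read_all (struct sim_process *p)
{
  size_t n = p->pending;
  p->pending = 0;
  p->infd_num_bytes_read += n;
  return n;
}

/* One pass of a toy wait loop: run a "timer" first, then look at the
   fd.  Returns true if the wait may finish, false if it would go on
   to block waiting for more output.  */
bool
sim_wait_once (struct sim_process *wait_proc, bool timer_steals,
               bool use_accounting)
{
  unsigned long initial = wait_proc->infd_num_bytes_read;

  if (timer_steals)
    sim_read_all (wait_proc);   /* nested call consumes the output */

  if (use_accounting && wait_proc->infd_num_bytes_read != initial)
    return true;                /* somebody read our data: done */

  return sim_read_all (wait_proc) > 0;
}
```

With timer_steals set and use_accounting unset, the pass returns false,
which corresponds to the hang; with use_accounting set it returns true.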

I know, flyspell should do its business asynchronously and also specify
a timeout since it is being run through hooks. Those are bugs in themselves.
But I also think that wait_reading_process_output violates its contract
and is buggy in this regard as well, since it should properly function
even if it calls out to filters or timers -- and it clearly does not and
I would wager more hangs seen in the wild that weren't debugged, could
be attributed to this very bug.

I hope my rambling speech was somewhat helpful and clear and I could get
the problem described sufficiently without being too confusing. :-)

If there are any question marks left hanging over your head, please
don't hesitate to ask and I will try my best to clear them up -- but it
might end up being another longish mail, so be warned. ;)

Thanks,
Matthias

[1] https://github.com/magit/magit/issues/2915

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-10-26 14:07   ` Matthias Dahl
@ 2017-10-26 16:23     ` Eli Zaretskii
  2017-10-26 18:56       ` Matthias Dahl
  0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2017-10-26 16:23 UTC
  To: Matthias Dahl; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Thu, 26 Oct 2017 16:07:31 +0200
> 
> When committing, Magit prepares a COMMIT_MSG buffer and does some
> process magic of its own which is pretty much irrelevant for this.
> 
> At some point during that, while we are already in an instance of
> wait_reading_process_output (due to sit_for), the post-command-hooks are
> run.

AFAIK, post-command-hooks cannot be run while we are in sit-for, but I
guess this is not relevant to the rest of the description?

> And here things get interesting. Eventually flyspell-post-command-hook
> is run which executes flyspell-word synchronously. That basically
> writes out a word that needs to be checked to the spellchecker process,
> waits for the results from its stdout via accept-process-output and goes on.
> Of special note here is that it specifies a wait_proc (the spellchecker
> process) and no timeout whatsoever.
> 
> The output from the spellchecker is usually there instantaneously, so
> that is actually unnoticeable, unless wait_reading_process_output, that
> was invoked through that specific accept-process-output, decides to run
> the timers.
> 
> And here comes the catch: At this point, usually the spellchecker output
> is already available but not yet read. When the timers run, one of them
> calls accept-process-output again which will read the entire available
> output of the spellchecker process.

I understand that this timer calls accept-process-output with its
argument nil, is that correct?  If so, isn't that a bug for a timer to
do that?  Doing that runs the risk of eating up output from some
subprocess for which the foreground Lisp program is waiting.

So please point out the timer that does this, because I think that
timer needs to be fixed.

> The gist of it is: If we have an active wait_reading_process_output call
> with a wait_proc set but no timeout that calls out to either timers or
> filters, it is entirely possible that those directly or indirectly call
> us again recursively, thus reading the output we are waiting for without
> us ever noticing it, if no further output becomes available in addition
> to what was read unnoticed... like it happens with flyspell.
> 
> That is what my patches fix: They simply add a bytes read metric to each
> process structure that we can check for change at strategically relevant
> points and decide if we got some data back that went unnoticed and break
> out from wait_reading_process_output.

We already record the file descriptors on which we wait for process
output, see compute_non_keyboard_wait_mask.  Once
wait_reading_process_output exits, it clears these records.  So it
should be possible for us to prevent accept-process-output calls
issued by such runaway timers from waiting on the descriptors that are
already "taken": if, when we set the bits in the pselect mask, we find
that some of the descriptors are already watched by the same thread as
the current thread, we could exclude them from the pselect mask we are
setting up.  Wouldn't that be a better solution?  Because AFAIU, your
solution just avoids an infinite wait, but doesn't avoid losing the
process output, because it was read by the wrong Lisp code.  Right?
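
A rough sketch of the mask filtering described above, assuming plain
fd_set masks. toy_exclude_taken and its parameters are hypothetical
names for illustration, not existing functions in process.c:

```c
#include <sys/select.h>

/* Copy WANTED into OUT, leaving out every descriptor that is set in
   TAKEN.  MAX_FD is the highest descriptor number to consider.
   Returns the number of descriptors left in OUT.  */
int
toy_exclude_taken (fd_set *wanted, fd_set *taken, int max_fd, fd_set *out)
{
  int count = 0;

  FD_ZERO (out);
  for (int fd = 0; fd <= max_fd; fd++)
    if (FD_ISSET (fd, wanted) && !FD_ISSET (fd, taken))
      {
        FD_SET (fd, out);
        count++;
      }
  return count;
}
```

A nested call would then pass OUT instead of WANTED to pselect, so it
never reads from a descriptor an outer call is already waiting on.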

> But I also think that wait_reading_process_output violates its contract
> and is buggy in this regard as well, since it should properly function
> even if it calls out to filters or timers -- and it clearly does not and
> I would wager more hangs seen in the wild that weren't debugged, could
> be attributed to this very bug.

I think the basic contract that is violated here is that the output
from a subprocess is "stolen" by an unrelated call to
accept-process-output, and we should prevent that if we can.

> If there are any question marks left hanging over your head, please
> don't hesitate to ask and I will try my best to clear them up -- but it
> might end up being another longish mail, so be warned. ;)

Well, I'd like to eyeball the timer which commits this crime.




* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-10-26 16:23     ` Eli Zaretskii
@ 2017-10-26 18:56       ` Matthias Dahl
  2017-10-28  8:20         ` Matthias Dahl
  2017-10-28  9:28         ` Eli Zaretskii
  0 siblings, 2 replies; 151+ messages in thread
From: Matthias Dahl @ 2017-10-26 18:56 UTC
  To: Eli Zaretskii; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 3384 bytes --]

Hello Eli...

On 26/10/17 18:23, Eli Zaretskii wrote:

> AFAIK, post-command-hooks cannot be run while we are in sit-for, but I
> guess this is not relevant to the rest of the description?

This probably comes from server.el (server-visit-files) because Magit
uses emacsclient for some of its magic.

I have attached a backtrace, taken during the hang. Unfortunately it is
from an optimized build (I would have needed to recompile just now, and I
am a bit in a hurry) but it at least shows the callstack (more or less)
nicely.

> I understand that this timer calls accept-process-output with its
> argument nil, is that correct?  If so, isn't that a bug for a timer to
> do that?  Doing that runs the risk of eating up output from some
> subprocess for which the foreground Lisp program is waiting.

I haven't actually checked which timer it is, to be quite honest since I
didn't think of it as a bug at all.

Correct me if I am wrong, calling accept-process-output w/o arguments
is expected to be quite harmless and can be useful. If you specify a
specific process, you will most definitely wait at least as long as
it takes for that process to produce any output.

Nevertheless: If I am not completely mistaken, there is no data lost at
all. It is read and passed to the filter function which was registered
by the interested party -- otherwise the default filter will simply
append it to the buffer it belongs to.

The only thing that is lost is the knowledge that it was ever read at
all, and thus an endless wait begins.

> So please point out the timer that does this, because I think that
> timer needs to be fixed.

If you still need that, I will do some digging and try to find it.

> We already record the file descriptors on which we wait for process
> output, see compute_non_keyboard_wait_mask.  Once
> wait_reading_process_output exits, it clears these records.  So it
> should be possible for us to prevent accept-process-output calls
> issued by such runaway timers from waiting on the descriptors that are
> already "taken": if, when we set the bits in the pselect mask, we find
> that some of the descriptors are already watched by the same thread as
> the current thread, we could exclude them from the pselect mask we are
> setting up.  Wouldn't that be a better solution?  Because AFAIU, your
> solution just avoids an infinite wait, but doesn't avoid losing the
> process output, because it was read by the wrong Lisp code.  Right?

Hm... at the moment I don't see where data is lost with my solution.
Maybe I am being totally blind and making a fool out of myself but I
honestly don't see it.

What you suggest could be dangerous as well, depending on how it is
implemented and the circumstances. What fds get excluded in recursive
calls? Only wait_proc ones? Or every one that is watched somewhere up
in the callstack? Depending on what we do, we could end up with an
almost empty list that doesn't get "ready" as easily as one would
have expected by a naked accept-process-output call... and it could thus
potentially stall as well... worst-case, I know.
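
To spell out that worst case, here is a small sketch. It assumes plain
fd_set masks, and the helper names are invented for this mail. If
nothing is left in the mask after the exclusion and there is no timeout,
the wait has nothing that could ever wake it up by becoming readable.

```c
#include <stdbool.h>
#include <sys/select.h>

/* True if no descriptor in the range 0..MAX_FD is set in MASK.  */
bool
toy_mask_is_empty (fd_set *mask, int max_fd)
{
  for (int fd = 0; fd <= max_fd; fd++)
    if (FD_ISSET (fd, mask))
      return false;
  return true;
}

/* True if a wait on MASK has neither a descriptor nor a timeout
   that could end it.  */
bool
toy_would_stall (fd_set *mask, int max_fd, bool have_timeout)
{
  return !have_timeout && toy_mask_is_empty (mask, max_fd);
}
```

A nested call could use such a check to return early instead of
waiting on an empty mask.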

Just thinking out loud here. I would really need to check this, those
are just my initial thoughts.

> Well, I'd like to eyeball the timer which commits this crime.

If you still do, let me know and I will try to track it down...

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu

[-- Attachment #2: emacs-bt.txt --]
[-- Type: text/plain, Size: 10206 bytes --]

#0  0x00007ffff2d49ee3 in __pselect (nfds=<optimized out>, readfds=0x7fffffff8410, writefds=0x7fffffff8490, exceptfds=0x0, timeout=<optimized out>, sigmask=<optimized out>)
    at ../sysdeps/unix/sysv/linux/pselect.c:69
#1  0x00000000005c8c6c in really_call_select (arg=0x7fffffff8330) at thread.c:572
#2  0x00000000005c9819 in thread_select (func=<optimized out>, max_fds=max_fds@entry=21, rfds=rfds@entry=0x7fffffff8410, wfds=<optimized out>, efds=efds@entry=0x0, 
    timeout=timeout@entry=0x7fffffff8a40, sigmask=0x0) at thread.c:595
#3  0x00000000005e2c3b in xg_select (fds_lim=21, rfds=rfds@entry=0x7fffffff8b40, wfds=wfds@entry=0x7fffffff8bc0, efds=efds@entry=0x0, timeout=timeout@entry=0x7fffffff8a40, 
    sigmask=sigmask@entry=0x0) at xgselect.c:117
#4  0x00000000005aa35d in wait_reading_process_output (time_limit=<optimized out>, nsecs=<optimized out>, read_kbd=read_kbd@entry=0, do_display=do_display@entry=false, 
    wait_for_cell=..., wait_for_cell@entry=..., wait_proc=0x5874d50, just_wait_proc=0) at process.c:5375
#5  0x00000000005ac2fb in Faccept_process_output (process=..., seconds=..., millisec=..., just_this_one=...) at process.c:4655
#6  0x000000000056aed5 in eval_sub (form=...) at eval.c:2241
#7  0x000000000056b1bd in Fprogn (body=...) at eval.c:455
#8  0x000000000056ae14 in eval_sub (form=..., form@entry=...) at eval.c:2183
#9  0x000000000056bd05 in Fwhile (args=...) at eval.c:985
#10 0x000000000056ae14 in eval_sub (form=...) at eval.c:2183
#11 0x000000000056b1bd in Fprogn (body=...) at eval.c:455
#12 0x000000000056ae14 in eval_sub (form=...) at eval.c:2183
#13 0x000000000056ae14 in eval_sub (form=...) at eval.c:2183
#14 0x000000000056b1bd in Fprogn (body=...) at eval.c:455
#15 0x000000000056b82d in Fcond (args=...) at eval.c:435
#16 0x000000000056ae14 in eval_sub (form=...) at eval.c:2183
#17 0x000000000056b1bd in Fprogn (body=...) at eval.c:455
#18 0x000000000056ae14 in eval_sub (form=...) at eval.c:2183
#19 0x000000000056b1bd in Fprogn (body=...) at eval.c:455
#20 0x000000000056ae14 in eval_sub (form=...) at eval.c:2183
#21 0x000000000056b1bd in Fprogn (body=...) at eval.c:455
#22 0x000000000056c38c in FletX (args=...) at eval.c:900
#23 0x000000000056ae14 in eval_sub (form=...) at eval.c:2183
#24 0x000000000056b1bd in Fprogn (body=..., body@entry=...) at eval.c:455
#25 0x00000000005625e6 in Fsave_excursion (args=...) at editfns.c:1050
#26 0x000000000056ae14 in eval_sub (form=...) at eval.c:2183
#27 0x000000000056b1bd in Fprogn (body=...) at eval.c:455
#28 0x000000000056b512 in funcall_lambda (fun=..., fun@entry=..., nargs=nargs@entry=0, arg_vector=arg_vector@entry=0x7fffffff97c0) at eval.c:3042
#29 0x000000000056b6a3 in apply_lambda (fun=..., args=..., count=count@entry=33) at eval.c:2903
#30 0x000000000056ac1a in eval_sub (form=...) at eval.c:2306
#31 0x000000000056b1bd in Fprogn (body=...) at eval.c:455
#32 0x000000000056ae14 in eval_sub (form=...) at eval.c:2183
#33 0x000000000056ae14 in eval_sub (form=...) at eval.c:2183
#34 0x000000000056b1bd in Fprogn (body=...) at eval.c:455
#35 0x000000000056c60b in Flet (args=...) at eval.c:969
#36 0x000000000056ae14 in eval_sub (form=...) at eval.c:2183
#37 0x000000000056b1bd in Fprogn (body=...) at eval.c:455
#38 0x000000000056c60b in Flet (args=...) at eval.c:969
#39 0x000000000056ae14 in eval_sub (form=..., form@entry=...) at eval.c:2183
#40 0x000000000056c9b9 in internal_lisp_condition_case (var=..., bodyform=..., handlers=...) at eval.c:1303
#41 0x000000000056ae14 in eval_sub (form=...) at eval.c:2183
#42 0x000000000056b1bd in Fprogn (body=...) at eval.c:455
#43 0x000000000056ae14 in eval_sub (form=...) at eval.c:2183
#44 0x000000000056ae14 in eval_sub (form=...) at eval.c:2183
#45 0x000000000056b1bd in Fprogn (body=...) at eval.c:455
#46 0x000000000056b512 in funcall_lambda (fun=..., nargs=nargs@entry=0, arg_vector=arg_vector@entry=0x7fffffffa228) at eval.c:3042
#47 0x0000000000568bbd in Ffuncall (nargs=1, args=0x7fffffffa220) at eval.c:2780
#48 0x0000000000568c69 in funcall_nil (nargs=<optimized out>, args=<optimized out>) at eval.c:2397
#49 0x000000000056823d in run_hook_with_args (nargs=1, args=0x7fffffffa220, funcall=0x568c60 <funcall_nil>) at eval.c:2574
#50 0x00000000005683d7 in run_hook_with_args (funcall=0x568c60 <funcall_nil>, args=0x7fffffffa220, nargs=1) at eval.c:2524
#51 Frun_hook_with_args (args=0x7fffffffa220, nargs=1) at eval.c:2439
#52 run_hook (hook=...) at eval.c:2587
#53 Frun_hooks (nargs=<optimized out>, args=<optimized out>) at eval.c:2421
#54 0x0000000000568c42 in Ffuncall (nargs=2, args=args@entry=0x7fffffffa310) at eval.c:2766
#55 0x000000000059fa88 in exec_byte_code (bytestr=..., vector=..., maxdepth=..., args_template=..., nargs=nargs@entry=140737488331568, args=<optimized out>, 
    args@entry=0x1ffffffa5c8) at bytecode.c:629
#56 0x000000000056b2ae in funcall_lambda (fun=..., nargs=140737488331568, nargs@entry=3, arg_vector=0x1ffffffa5c8, arg_vector@entry=0x7fffffffa5c8) at eval.c:2967
#57 0x0000000000568bbd in Ffuncall (nargs=nargs@entry=4, args=0x7fffffffa5c0) at eval.c:2780
#58 0x000000000056a670 in Fapply (nargs=2, args=0x7fffffffa6d8) at eval.c:2386
#59 0x0000000000568c42 in Ffuncall (nargs=3, args=args@entry=0x7fffffffa6d0) at eval.c:2766
#60 0x000000000059fa88 in exec_byte_code (bytestr=..., vector=..., maxdepth=..., args_template=..., nargs=nargs@entry=140737488332528, args=<optimized out>, 
    args@entry=0x1ffffffa948) at bytecode.c:629
#61 0x000000000056b2ae in funcall_lambda (fun=..., nargs=140737488332528, nargs@entry=3, arg_vector=0x1ffffffa948, arg_vector@entry=0x7fffffffa948) at eval.c:2967
#62 0x0000000000568bbd in Ffuncall (nargs=4, args=args@entry=0x7fffffffa940) at eval.c:2780
#63 0x000000000059fa88 in exec_byte_code (bytestr=..., vector=..., maxdepth=..., args_template=..., nargs=nargs@entry=140737488333160, args=<optimized out>, 
    args@entry=0x1ffffffac30) at bytecode.c:629
#64 0x000000000056b2ae in funcall_lambda (fun=..., nargs=140737488333160, nargs@entry=7, arg_vector=0x1ffffffac30, arg_vector@entry=0x7fffffffac30) at eval.c:2967
#65 0x0000000000568bbd in Ffuncall (nargs=8, args=args@entry=0x7fffffffac28) at eval.c:2780
#66 0x000000000059fa88 in exec_byte_code (bytestr=..., vector=..., maxdepth=..., args_template=..., nargs=nargs@entry=140737488333928, args=<optimized out>, 
    args@entry=0x1ffffffaec0) at bytecode.c:629
#67 0x000000000056b2ae in funcall_lambda (fun=..., nargs=140737488333928, nargs@entry=0, arg_vector=0x1ffffffaec0, arg_vector@entry=0x7fffffffaec0) at eval.c:2967
#68 0x0000000000568bbd in Ffuncall (nargs=1, args=args@entry=0x7fffffffaeb8) at eval.c:2780
#69 0x000000000059fa88 in exec_byte_code (bytestr=..., vector=..., maxdepth=..., args_template=..., nargs=nargs@entry=140737488334552, args=<optimized out>,
    args@entry=0x1ffffffb180) at bytecode.c:629
#70 0x000000000056b2ae in funcall_lambda (fun=..., nargs=140737488334552, nargs@entry=1, arg_vector=0x1ffffffb180, arg_vector@entry=0x7fffffffb180) at eval.c:2967
#71 0x0000000000568bbd in Ffuncall (nargs=2, args=args@entry=0x7fffffffb178) at eval.c:2780
#72 0x000000000059fa88 in exec_byte_code (bytestr=..., vector=..., maxdepth=..., args_template=..., nargs=nargs@entry=140737488335352, args=<optimized out>,
    args@entry=0x5ffffffb7e8) at bytecode.c:629
#73 0x000000000056b2ae in funcall_lambda (fun=..., nargs=140737488335352, nargs@entry=2, arg_vector=0x5ffffffb7e8, arg_vector@entry=0x7fffffffb7e8) at eval.c:2967
#74 0x0000000000568bbd in Ffuncall (nargs=nargs@entry=3, args=0x7fffffffb7e0) at eval.c:2780
#75 0x000000000056a670 in Fapply (nargs=nargs@entry=2, args=args@entry=0x7fffffffb8a0) at eval.c:2386
#76 0x000000000056a95c in apply1 (fn=..., arg=...) at eval.c:2602
#77 0x0000000000567e46 in internal_condition_case_1 (bfun=bfun@entry=0x5a2610 <read_process_output_call>, arg=..., handlers=..., handlers@entry=...,
    hfun=hfun@entry=0x5a2580 <read_process_output_error_handler>) at eval.c:1356
#78 0x00000000005a28b8 in read_and_dispose_of_process_output (coding=<optimized out>, nbytes=239,
    chars=0x7fffffffb940 "-dir /home/matthew/workspace/storage/opensource/llvm-project-unofficial-github-mirror.git/ -current-frame -tty /dev/pts/5 dumb -file /home/matthew/workspace/storage/opensource/llvm-project-unofficial-"..., p=0xbb468f0) at process.c:5998
#79 read_process_output (proc=..., proc@entry=..., channel=channel@entry=20) at process.c:5909
#80 0x00000000005aa5d7 in wait_reading_process_output (time_limit=time_limit@entry=30, nsecs=nsecs@entry=0, read_kbd=read_kbd@entry=-1, do_display=do_display@entry=true,
    wait_for_cell=..., wait_for_cell@entry=..., wait_proc=wait_proc@entry=0x0, just_wait_proc=0) at process.c:5608
#81 0x00000000004226d0 in sit_for (timeout=..., reading=reading@entry=true, display_option=display_option@entry=1) at dispnew.c:5770
#82 0x00000000004ff50c in read_char (commandflag=commandflag@entry=1, map=..., map@entry=..., prev_event=..., used_mouse_menu=used_mouse_menu@entry=0x7fffffffd4bb,
    end_time=end_time@entry=0x0) at keyboard.c:2717
#83 0x000000000050025c in read_key_sequence (keybuf=keybuf@entry=0x7fffffffd5c0, prompt=..., prompt@entry=..., dont_downcase_last=dont_downcase_last@entry=false,
    can_return_switch_frame=can_return_switch_frame@entry=true, fix_current_buffer=fix_current_buffer@entry=true, prevent_redisplay=prevent_redisplay@entry=false,
    bufsize=30) at keyboard.c:9147
#84 0x0000000000501d1e in command_loop_1 () at keyboard.c:1368
#85 0x0000000000567dae in internal_condition_case (bfun=bfun@entry=0x501ae0 <command_loop_1>, handlers=..., handlers@entry=..., hfun=hfun@entry=0x4f8380 <cmd_error>)
    at eval.c:1332
#86 0x00000000004f3554 in command_loop_2 (ignore=..., ignore@entry=...) at keyboard.c:1110
#87 0x0000000000567d1d in internal_catch (tag=..., tag@entry=..., func=func@entry=0x4f3530 <command_loop_2>, arg=..., arg@entry=...) at eval.c:1097
#88 0x00000000004f34eb in command_loop () at keyboard.c:1089
#89 0x00000000004f7f93 in recursive_edit_1 () at keyboard.c:695
#90 0x00000000004f82b3 in Frecursive_edit () at keyboard.c:766
#91 0x0000000000418a72 in main (argc=<optimized out>, argv=0x7fffffffd978) at emacs.c:1713


* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-10-26 18:56       ` Matthias Dahl
@ 2017-10-28  8:20         ` Matthias Dahl
  2017-10-28  9:28         ` Eli Zaretskii
  1 sibling, 0 replies; 151+ messages in thread
From: Matthias Dahl @ 2017-10-28  8:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Hello Eli...

Just to make sure we are not waiting for each other and have a
deadlock here. :-)

Have you had any chance to look into this again? Or is there anything
you are waiting for from my side? If you have just been busy, no problem
at all.

Have a nice weekend... and thanks again!
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-10-26 18:56       ` Matthias Dahl
  2017-10-28  8:20         ` Matthias Dahl
@ 2017-10-28  9:28         ` Eli Zaretskii
  2017-10-30  9:48           ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2017-10-28  9:28 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Thu, 26 Oct 2017 20:56:03 +0200
> 
> > AFAIK, post-command-hooks cannot be run while we are in sit-for, but I
> > guess this is not relevant to the rest of the description?
> 
> This probably comes from server.el (server-visit-files) because Magit
> uses emacsclient for some of its magic.

I see post-command-hook being run from server-visit-files, but I don't
see sit-for.  The sit_for (not sit-for) in your backtrace is called
from read_char, something that happens every time Emacs becomes idle.

> I have attached a backtrace, taken during the hang. Unfortunately it is
> from an optimized build (would have needed to recompile just now, and I
> am a bit in a hurry) but it at least shows the callstack (more or less)
> nicely.

It lacks the Lisp backtrace ("xbacktrace" can produce it), and the
fact that most arguments are either optimized out or complex data
types shown as "..." makes the backtrace much less useful.

> > I understand that this timer calls accept-process-output with its
> > argument nil, is that correct?  If so, isn't that a bug for a timer to
> > do that?  Doing that runs the risk of eating up output from some
> > subprocess for which the foreground Lisp program is waiting.
> 
> I haven't actually checked which timer it is, to be quite honest since I
> didn't think of it as a bug at all.
> 
> Correct me if I am wrong, calling accept-process-output w/o arguments
> is expected to be quite harmless and can be useful. If you specify a
> specific process, you will most definitely wait at least as long as
> it takes for that process to produce any output.
> 
> Nevertheless: If I am not completely mistaken, there is no data lost at
> all. It is read and passed to the filter function which was registered
> by the interested party -- otherwise the default filter will simply
> append it to the buffer it belongs to.
> 
> The only thing that is lost is that it was ever read at all and thus
> an endless wait begins.

But if the wrong call to accept-process-output has read the process
output, it could have also processed it and delivered the results to
the wrong application, no?

> > So please point out the timer that does this, because I think that
> > timer needs to be fixed.
> 
> If you still need that, I will do some digging and try to find it.

I think we need a thorough and detailed understanding of what's going
on in this case, before we can discuss the solutions, yes.  IME,
trying to fix a problem without a good understanding what it is that
we are fixing tends to produce partial solutions at best, and new bugs
at worst.

So please reproduce this in an unoptimized build, and please also show
the Lisp backtrace in this scenario.  Then let's take it from there.

> > We already record the file descriptors on which we wait for process
> > output, see compute_non_keyboard_wait_mask.  Once
> > wait_reading_process_output exits, it clears these records.  So it
> > should be possible for us to prevent accept-process-output calls
> > issued by such runaway timers from waiting on the descriptors that are
> > already "taken": if, when we set the bits in the pselect mask, we find
> > that some of the descriptors are already watched by the same thread as
> > the current thread, we could exclude them from the pselect mask we are
> > setting up.  Wouldn't that be a better solution?  Because AFAIU, your
> > solution just avoids an infinite wait, but doesn't avoid losing the
> > process output, because it was read by the wrong Lisp code.  Right?
> 
> Hm... at the moment I don't see where data is lost with my solution.
> Maybe I am being totally blind and making a fool out of myself but I
> honestly don't see it.

Maybe there is no loss, but I'm not really sure your proposal solves
the root cause, and without a detailed understanding of what exactly is
going on, we have no way of discussing this rationally.
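
For illustration only, the alternative quoted above (keep a nested wait
away from descriptors that an outer wait already watches) comes down to
masking one descriptor set against another before calling pselect. The
sketch below is a toy model and not code from process.c; the function
name toy_exclude_taken and both parameter names are hypothetical.

```c
#include <assert.h>
#include <sys/select.h>

/* Toy model, not Emacs code.  Remove from WANTED every descriptor that
   is already claimed in TAKEN by an outer wait, so that a nested wait
   would not select on it.  MAX_FD is the highest descriptor to check.  */
static void
toy_exclude_taken (fd_set *wanted, fd_set *taken, int max_fd)
{
  assert (max_fd >= 0 && max_fd < FD_SETSIZE);
  for (int fd = 0; fd <= max_fd; fd++)
    if (FD_ISSET (fd, taken))
      FD_CLR (fd, wanted);
}
```

Under this model a nested wait that asks for descriptors 5 and 20 while
an outer wait holds 20 would end up selecting on 5 only.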

> > Well, I'd like to eyeball the timer which commits this crime.
> 
> If you still do, let me know and I will try to track it down...

I do.  I believe we must understand this situation very well before we
reason about its solution.




* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-10-28  9:28         ` Eli Zaretskii
@ 2017-10-30  9:48           ` Matthias Dahl
  2017-11-03  8:52             ` Matthias Dahl
  2017-11-04 12:11             ` Eli Zaretskii
  0 siblings, 2 replies; 151+ messages in thread
From: Matthias Dahl @ 2017-10-30  9:48 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 3047 bytes --]

Hello Eli...

On 28/10/17 11:28, Eli Zaretskii wrote:

> It lacks the Lisp backtrace ("xbacktrace" can produce it), and the
> fact that most arguments are either optimized out or complex data
> types shown as "..." makes the backtrace much less useful.

Sorry, like I said, I was in a hurry and that was all I could provide at
the time. Attached you will find a full backtrace from a proper debug
build w/o optimization. I usually only run those when really debugging
something; otherwise I keep debug symbols around for an optimized build,
which at least gives me a starting point and an idea when something
goes wrong.

> But if the wrong call to accept-process-output has read the process
> output, it could have also processed it and delivered the results to
> the wrong application, no?

Given the fact that a filter is registered for a given process, and
that it is this filter that gets called whenever process output is ready
and needs to be processed, a timer or filter would have to replace that
filter with its own through set-process-filter for this to happen. And
that is something I would clearly consider a bug, since no filter or
timer should do something like that.
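
To make that argument concrete, here is a minimal sketch in C. It is a
toy model and not Emacs code: struct toy_process, struct sink,
sink_filter and toy_deliver are hypothetical names. The point it shows
is that delivery depends only on the filter stored in the process, not
on which caller happened to trigger the read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model, not Emacs code: each process owns exactly one filter.  */
typedef void (*filter_fn) (const char *data, size_t len, void *state);

struct toy_process
{
  filter_fn filter;     /* registered by the package owning the process */
  void *filter_state;   /* that package's private state */
};

/* The owner's private buffer for collected output.  */
struct sink
{
  char buf[64];
  size_t used;
};

/* The owner's filter: append the output to its own buffer.  */
static void
sink_filter (const char *data, size_t len, void *state)
{
  struct sink *s = state;
  assert (s->used + len < sizeof s->buf);
  memcpy (s->buf + s->used, data, len);
  s->used += len;
  s->buf[s->used] = '\0';
}

/* Stand-in for the read path: no matter which wait (outer or nested)
   caused the read, the data goes to the process's own filter.  */
static void
toy_deliver (struct toy_process *p, const char *data, size_t len)
{
  p->filter (data, len, p->filter_state);
}
```

Calling toy_deliver twice, once on behalf of the outer wait and once on
behalf of a nested one, leaves both chunks in the same sink in arrival
order.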

Naturally there is also the weird case where a package calls
accept-process-output from a hook because it expects and needs that
data, but does not consider that a timer could run in the meantime, and
the package itself has a timer set up that also interacts with that very
same process and tries to read data back from it. That will naturally
fail as well, with or without my patch. And again, I would consider this
a bug.

If a package interacts with its own processes, it should be careful and
consider that timers/filters get called while accept-process-output is
called and take that into account.

> I think we need a thorough and detailed understanding of what's going
> on in this case, before we can discuss the solutions, yes.  IME,
> trying to fix a problem without a good understanding what it is that
> we are fixing tends to produce partial solutions at best, and new bugs
> at worst.

I fully agree with your opinion; I just thought I had already properly
explained in my lengthy mails what was going on behind the scenes and
how my (rather simple) solution fixes it.
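
As a reminder of the idea, here is a toy sketch of the read accounting
in C. It is not the patch itself: the field name nbytes_read and the
three helper functions are hypothetical and only illustrate "count the
bytes read per process, then compare against a snapshot".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, not Emacs code.  The field name is hypothetical.  */
struct toy_process
{
  uintmax_t nbytes_read;   /* total bytes ever read from this process */
};

/* Every successful read, nested or not, bumps the counter.  */
static void
toy_record_read (struct toy_process *p, uintmax_t nbytes)
{
  assert (p);
  p->nbytes_read += nbytes;
}

/* Snapshot taken by the outer wait before it runs timers.  */
static uintmax_t
toy_snapshot (const struct toy_process *p)
{
  return p->nbytes_read;
}

/* True if some call read output from P since SNAPSHOT was taken, in
   which case the outer wait can return instead of blocking.  Using !=
   rather than > keeps the check valid if the counter wraps around.  */
static bool
toy_output_arrived_since (const struct toy_process *p, uintmax_t snapshot)
{
  return p->nbytes_read != snapshot;
}
```

The outer wait takes a snapshot for its wait_proc, lets timers run, and
checks toy_output_arrived_since before it selects again.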

> So please reproduce this in an unoptimized build, and please also show
> the Lisp backtrace in this scenario.  Then let's take it from there.

As I said earlier, you will find that attached. But keep in mind that it
will not be as helpful as you might hope, since you will only see where
the hang occurs (I already stated that) but not why. At that point, the
timers have already run.

> I do.  I believe we must understand this situation very well before we
> reason about its solution.

One example is Semantic. Its idle timers use semantic-throw-on-input,
which calls accept-process-output without any arguments.

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu


[-- Attachment #2: emacs-bt-full.txt --]
[-- Type: text/plain, Size: 91338 bytes --]

#0  0x00007ffff2d49ee3 in __pselect (nfds=<optimized out>, readfds=0x7fffffff7d30, writefds=0x7fffffff7db0, exceptfds=0x0, timeout=<optimized out>, sigmask=<optimized out>)
    at ../sysdeps/unix/sysv/linux/pselect.c:69
        resultvar = 18446744073709551102
        sc_cancel_oldtype = 0
        tval = {
          tv_sec = 4, 
          tv_nsec = 126621639
        }
        data = {
          ss = 0, 
          ss_len = 8
        }
#1  0x000000000069a5fb in really_call_select (arg=0x7fffffff7c20) at thread.c:574
        sa = 0x7fffffff7c20
        self = 0xaec100 <main_thread>
        oldset = {
          __val = {0, 750006068578745088, 5627596, 0, 140737488322384, 6683181, 140737488321312, 77506101, 0, 77501859, 0, 0, 2, 0, 4933, 608559125}
        }
#2  0x00000000005eafcc in flush_stack_call_func (func=0x69a53d <really_call_select>, arg=0x7fffffff7c20) at alloc.c:5226
        end = 0x7fffffff7bd0
        self = 0xaec100 <main_thread>
        sentry = {
          o = {
            __max_align_ll = 140737488321488, 
            __max_align_ld = -2.8728035521484346831976196739416242e-2059
          }
        }
#3  0x000000000069a6e4 in thread_select (func=0x7ffff2d49e20 <__pselect>, max_fds=21, rfds=0x7fffffff7d30, wfds=0x7fffffff7db0, efds=0x0, timeout=0x7fffffff8340, 
    sigmask=0x0) at thread.c:597
        sa = {
          func = 0x7ffff2d49e20 <__pselect>, 
          max_fds = 21, 
          rfds = 0x7fffffff7d30, 
          wfds = 0x7fffffff7db0, 
          efds = 0x0, 
          timeout = 0x7fffffff8340, 
          sigmask = 0x0, 
          result = 0
        }
#4  0x00000000006c0315 in xg_select (fds_lim=21, rfds=0x7fffffff83b0, wfds=0x7fffffff8430, efds=0x0, timeout=0x7fffffff8340, sigmask=0x0) at xgselect.c:117
        all_rfds = {
          fds_bits = {1409256, 0 <repeats 15 times>}
        }
        all_wfds = {
          fds_bits = {0 <repeats 16 times>}
        }
        tmo = {
          tv_sec = 10600560, 
          tv_nsec = 30064737696
        }
        tmop = 0x7fffffff8340
        context = 0x2b7f320
        have_wfds = true
        gfds_buf = {{
            fd = 5, 
            events = 1, 
            revents = 0
          }, {
            fd = 6, 
            events = 1, 
            revents = 0
          }, {
            fd = 7, 
            events = 1, 
            revents = 0
          }, {
            fd = 8898464, 
            events = 0, 
            revents = 0
          }, {
            fd = 32, 
            events = 0, 
            revents = 0
          }, {
            fd = -34080, 
            events = 32767, 
            revents = 0
          }, {
            fd = -34064, 
            events = 32767, 
            revents = 0
          }, {
            fd = -33992, 
            events = 32767, 
            revents = 0
          }, {
            fd = -33821, 
            events = 32767, 
            revents = 0
          }, {
            fd = 32, 
            events = 0, 
            revents = 0
          }, {
            fd = 1, 
            events = 0, 
            revents = 0
          }, {
            fd = 1, 
            events = 6, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = 92813968, 
            events = 0, 
            revents = 0
          }, {
            fd = 11304160, 
            events = 0, 
            revents = 0
          }, {
            fd = -1325008128, 
            events = 36491, 
            revents = 2664
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = 56714256, 
            events = 0, 
            revents = 0
          }, {
            fd = -32960, 
            events = 32767, 
            revents = 0
          }, {
            fd = 6235253, 
            events = 0, 
            revents = 0
          }, {
            fd = -33992, 
            events = 2, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = 30048, 
            events = 0, 
            revents = 0
          }, {
            fd = 1, 
            events = 0, 
            revents = 0
          }, {
            fd = -33920, 
            events = 18, 
            revents = 0
          }, {
            fd = 8898376, 
            events = 0, 
            revents = 0
          }, {
            fd = 8898381, 
            events = 0, 
            revents = 0
          }, {
            fd = 11336640, 
            events = 0, 
            revents = 0
          }, {
            fd = 163501968, 
            events = 0, 
            revents = 0
          }, {
            fd = 11304160, 
            events = 0, 
            revents = 0
          }, {
            fd = -32960, 
            events = 32767, 
            revents = 0
          }, {
            fd = 5627596, 
            events = 0, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = -32848, 
            events = 32767, 
            revents = 0
          }, {
            fd = 6237915, 
            events = 0, 
            revents = 0
          }, {
            fd = -32816, 
            events = 32767, 
            revents = 0
          }, {
            fd = 6374324, 
            events = 2, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = 30048, 
            events = 0, 
            revents = 0
          }, {
            fd = -32784, 
            events = 32767, 
            revents = 0
          }, {
            fd = 1030, 
            events = 0, 
            revents = 0
          }, {
            fd = 11336640, 
            events = 0, 
            revents = 0
          }, {
            fd = -32840, 
            events = 32767, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = -32704, 
            events = 32767, 
            revents = 0
          }, {
            fd = 2, 
            events = 0, 
            revents = 0
          }, {
            fd = 2, 
            events = 0, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = 600, 
            events = 0, 
            revents = 0
          }, {
            fd = 600, 
            events = 0, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = -32784, 
            events = 32767, 
            revents = 0
          }, {
            fd = 6290592, 
            events = 0, 
            revents = 0
          }, {
            fd = 11306592, 
            events = 0, 
            revents = 0
          }, {
            fd = 600, 
            events = 0, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = 66144099, 
            events = 0, 
            revents = 0
          }, {
            fd = -32736, 
            events = 32767, 
            revents = 0
          }, {
            fd = 5628266, 
            events = 0, 
            revents = 0
          }, {
            fd = 66144115, 
            events = 0, 
            revents = 0
          }, {
            fd = 66144099, 
            events = 0, 
            revents = 0
          }, {
            fd = -32672, 
            events = 32767, 
            revents = 0
          }, {
            fd = 6191436, 
            events = 0, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = -32544, 
            events = 32767, 
            revents = 0
          }, {
            fd = 2, 
            events = 0, 
            revents = 0
          }, {
            fd = 2, 
            events = 0, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = 600, 
            events = 0, 
            revents = 0
          }, {
            fd = 600, 
            events = 0, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = -32624, 
            events = 32767, 
            revents = 0
          }, {
            fd = 6290592, 
            events = 0, 
            revents = 0
          }, {
            fd = 11306592, 
            events = 0, 
            revents = 0
          }, {
            fd = 600, 
            events = 0, 
            revents = 0
          }, {
            fd = 382374951, 
            events = 0, 
            revents = 0
          }, {
            fd = 520, 
            events = 0, 
            revents = 0
          }, {
            fd = 520, 
            events = 0, 
            revents = 0
          }, {
            fd = 382374951, 
            events = 0, 
            revents = 0
          }, {
            fd = -32496, 
            events = 32767, 
            revents = 0
          }, {
            fd = 7177688, 
            events = 0, 
            revents = 0
          }, {
            fd = 79, 
            events = 0, 
            revents = 0
          }, {
            fd = 617625049, 
            events = 0, 
            revents = 0
          }, {
            fd = 600, 
            events = 0, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = 382374951, 
            events = 51751, 
            revents = 56111
          }, {
            fd = 520, 
            events = 0, 
            revents = 0
          }, {
            fd = 382374951, 
            events = 0, 
            revents = 0
          }, {
            fd = 4, 
            events = 0, 
            revents = 0
          }, {
            fd = 994678767, 
            events = 0, 
            revents = 0
          }, {
            fd = -32256, 
            events = 32767, 
            revents = 0
          }, {
            fd = 5695999, 
            events = 0, 
            revents = 0
          }, {
            fd = 66144339, 
            events = 0, 
            revents = 0
          }, {
            fd = 66144371, 
            events = 0, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = 77506101, 
            events = 0, 
            revents = 0
          }, {
            fd = 77506101, 
            events = 0, 
            revents = 0
          }, {
            fd = 115654309, 
            events = 0, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = 56714256, 
            events = 0, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = -1, 
            events = 65535, 
            revents = 65535
          }, {
            fd = 1509269519, 
            events = 0, 
            revents = 0
          }, {
            fd = 603329916, 
            events = 0, 
            revents = 0
          }, {
            fd = 79, 
            events = 0, 
            revents = 0
          }, {
            fd = 617625049, 
            events = 0, 
            revents = 0
          }, {
            fd = 1509269524, 
            events = 0, 
            revents = 0
          }, {
            fd = 598008683, 
            events = 0, 
            revents = 0
          }, {
            fd = -32224, 
            events = 32767, 
            revents = 0
          }, {
            fd = -220848378, 
            events = 32767, 
            revents = 0
          }, {
            fd = 4, 
            events = 0, 
            revents = 0
          }, {
            fd = -1325008128, 
            events = 36491, 
            revents = 2664
          }, {
            fd = 603337489, 
            events = 0, 
            revents = 0
          }, {
            fd = 1509369519, 
            events = 0, 
            revents = 0
          }, {
            fd = 1509369519, 
            events = 0, 
            revents = 0
          }, {
            fd = 603337489, 
            events = 0, 
            revents = 0
          }, {
            fd = -32192, 
            events = 32767, 
            revents = 0
          }, {
            fd = 7177400, 
            events = 0, 
            revents = 0
          }, {
            fd = 100000, 
            events = 0, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = 1509269519, 
            events = 0, 
            revents = 0
          }, {
            fd = 603337489, 
            events = 0, 
            revents = 0
          }, {
            fd = -1, 
            events = 65535, 
            revents = 65535
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = 0, 
            events = 0, 
            revents = 0
          }, {
            fd = -1, 
            events = 65535, 
            revents = 65535
          }}
        gfds = 0x7fffffff7e30
        gfds_size = 128
        n_gfds = 3
        retval = 0
        our_fds = 0
        max_fds = 20
        context_acquired = true
        i = 3
        nfds = 0
        tmo_in_millisec = -1
        must_free = 0
        need_to_dispatch = true
#5  0x000000000066f4e4 in wait_reading_process_output (time_limit=0, nsecs=0, read_kbd=0, do_display=false, wait_for_cell=XIL(0), wait_proc=0x3616410, just_wait_proc=0)
    at process.c:5375
        process_skipped = false
        channel = 6587175
        nfds = 0
        Available = {
          fds_bits = {1409032, 0 <repeats 15 times>}
        }
        Writeok = {
          fds_bits = {0 <repeats 16 times>}
        }
        check_write = true
        check_delay = 0
        no_avail = false
        xerrno = 4
        proc = XIL(0x898879c96000c783)
        timeout = {
          tv_sec = 4, 
          tv_nsec = 994678767
        }
        end_time = {
          tv_sec = 85334981, 
          tv_nsec = 85359716
        }
        timer_delay = {
          tv_sec = 4, 
          tv_nsec = 994678767
        }
        got_output_end_time = {
          tv_sec = 1509369499, 
          tv_nsec = 807260931
        }
        wait = INFINITY
        got_some_output = -1
        retry_for_async = false
        count = 29
        now = {
          tv_sec = 0, 
          tv_nsec = -1
        }
#6  0x000000000066d6a7 in Faccept_process_output (process=XIL(0x3616415), seconds=XIL(0), millisec=XIL(0), just_this_one=XIL(0)) at process.c:4655
        secs = 0
        nsecs = 0
#7  0x0000000000613dc4 in funcall_subr (subr=0xa58bd0 <Saccept_process_output>, numargs=1, args=0x7fffffff87a8) at eval.c:2849
        internal_argbuf = {XIL(0x3616415), XIL(0), XIL(0), XIL(0), XIL(0x55e61d), XIL(0xaffff86d0), XIL(0xa58bd5), XIL(0x1)}
        internal_args = 0x7fffffff8690
#8  0x0000000000613962 in Ffuncall (nargs=2, args=0x7fffffff87a0) at eval.c:2766
        fun = XIL(0xa58bd5)
        original_fun = XIL(0xb8350)
        funcar = XIL(0xad52f0)
        numargs = 1
        val = XIL(0)
        count = 28
#9  0x000000000065f38f in exec_byte_code (bytestr=XIL(0x51851a4), vector=XIL(0x5183ea5), maxdepth=make_number(14), args_template=make_number(512), nargs=0, 
    args=0x7fffffff8e78) at bytecode.c:629
        op = 1
        type = CATCHER
        targets = {0x662cb6 <exec_byte_code+17943>, 0x662d1b <exec_byte_code+18044>, 0x662d1d <exec_byte_code+18046>, 0x662d1f <exec_byte_code+18048>, 
          0x662d21 <exec_byte_code+18050>, 0x662d21 <exec_byte_code+18050>, 0x662d9e <exec_byte_code+18175>, 0x662e2d <exec_byte_code+18318>, 
          0x65ebc2 <exec_byte_code+1315>, 0x65ebc4 <exec_byte_code+1317>, 0x65ebc6 <exec_byte_code+1319>, 0x65ebc8 <exec_byte_code+1321>, 0x65ebca <exec_byte_code+1323>, 
          0x65ebca <exec_byte_code+1323>, 0x65ebd3 <exec_byte_code+1332>, 0x65eb7f <exec_byte_code+1248>, 0x65f007 <exec_byte_code+2408>, 0x65f009 <exec_byte_code+2410>, 
          0x65f00b <exec_byte_code+2412>, 0x65f00d <exec_byte_code+2414>, 0x65f00f <exec_byte_code+2416>, 0x65f00f <exec_byte_code+2416>, 0x65f059 <exec_byte_code+2490>, 
          0x65f018 <exec_byte_code+2425>, 0x65f267 <exec_byte_code+3016>, 0x65f269 <exec_byte_code+3018>, 0x65f26b <exec_byte_code+3020>, 0x65f26d <exec_byte_code+3022>, 
          0x65f26f <exec_byte_code+3024>, 0x65f26f <exec_byte_code+3024>, 0x65f206 <exec_byte_code+2919>, 0x65f226 <exec_byte_code+2951>, 0x65f34d <exec_byte_code+3246>, 
          0x65f34f <exec_byte_code+3248>, 0x65f351 <exec_byte_code+3250>, 0x65f353 <exec_byte_code+3252>, 0x65f355 <exec_byte_code+3254>, 0x65f355 <exec_byte_code+3254>, 
          0x65f2ec <exec_byte_code+3149>, 0x65f30c <exec_byte_code+3181>, 0x65f43b <exec_byte_code+3484>, 0x65f43d <exec_byte_code+3486>, 0x65f43f <exec_byte_code+3488>, 
          0x65f441 <exec_byte_code+3490>, 0x65f443 <exec_byte_code+3492>, 0x65f443 <exec_byte_code+3492>, 0x65f3da <exec_byte_code+3387>, 0x65f3fa <exec_byte_code+3419>, 
          0x65fe84 <exec_byte_code+6117>, 0x65fd4c <exec_byte_code+5805>, 0x65fd40 <exec_byte_code+5793>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x66011e <exec_byte_code+6783>, 0x66022f <exec_byte_code+7056>, 
          0x6602ae <exec_byte_code+7183>, 0x66032e <exec_byte_code+7311>, 0x6603af <exec_byte_code+7440>, 0x65ee3e <exec_byte_code+1951>, 0x65eede <exec_byte_code+2111>, 
          0x660448 <exec_byte_code+7593>, 0x65ed96 <exec_byte_code+1783>, 0x65ef5e <exec_byte_code+2239>, 0x6604cf <exec_byte_code+7728>, 0x66054f <exec_byte_code+7856>, 
          0x6605a9 <exec_byte_code+7946>, 0x660629 <exec_byte_code+8074>, 0x660690 <exec_byte_code+8177>, 0x66079a <exec_byte_code+8443>, 0x6607f4 <exec_byte_code+8533>, 
          0x660874 <exec_byte_code+8661>, 0x660917 <exec_byte_code+8824>, 0x660971 <exec_byte_code+8914>, 0x6609cb <exec_byte_code+9004>, 0x660a4b <exec_byte_code+9132>, 
          0x660acb <exec_byte_code+9260>, 0x660b4b <exec_byte_code+9388>, 0x660bee <exec_byte_code+9551>, 0x660c55 <exec_byte_code+9654>, 0x660cbc <exec_byte_code+9757>, 
          0x660dc6 <exec_byte_code+10023>, 0x660e5b <exec_byte_code+10172>, 0x660ef0 <exec_byte_code+10321>, 0x6610da <exec_byte_code+10811>, 
          0x66115f <exec_byte_code+10944>, 0x6611e4 <exec_byte_code+11077>, 0x661269 <exec_byte_code+11210>, 0x6612ee <exec_byte_code+11343>, 
          0x661355 <exec_byte_code+11446>, 0x6613ee <exec_byte_code+11599>, 0x661455 <exec_byte_code+11702>, 0x6614bc <exec_byte_code+11805>, 
          0x661523 <exec_byte_code+11908>, 0x661671 <exec_byte_code+12242>, 0x65fb84 <exec_byte_code+5349>, 0x6616e1 <exec_byte_code+12354>, 
          0x66173b <exec_byte_code+12444>, 0x66183d <exec_byte_code+12702>, 0x6618b8 <exec_byte_code+12825>, 0x661928 <exec_byte_code+12937>, 
          0x661982 <exec_byte_code+13027>, 0x6619da <exec_byte_code+13115>, 0x661a32 <exec_byte_code+13203>, 0x661a92 <exec_byte_code+13299>, 
          0x662cb6 <exec_byte_code+17943>, 0x661afc <exec_byte_code+13405>, 0x661b54 <exec_byte_code+13493>, 0x661bac <exec_byte_code+13581>, 
          0x661c04 <exec_byte_code+13669>, 0x661c5c <exec_byte_code+13757>, 0x661cb4 <exec_byte_code+13845>, 0x65fb84 <exec_byte_code+5349>, 
          0x662cb6 <exec_byte_code+17943>, 0x661d0e <exec_byte_code+13935>, 0x661d75 <exec_byte_code+14038>, 0x661dcf <exec_byte_code+14128>, 
          0x661e29 <exec_byte_code+14218>, 0x661ea9 <exec_byte_code+14346>, 0x661f29 <exec_byte_code+14474>, 0x661f83 <exec_byte_code+14564>, 
          0x662098 <exec_byte_code+14841>, 0x662118 <exec_byte_code+14969>, 0x662198 <exec_byte_code+15097>, 0x662218 <exec_byte_code+15225>, 
          0x662270 <exec_byte_code+15313>, 0x662cb6 <exec_byte_code+17943>, 0x65fa85 <exec_byte_code+5094>, 0x65f513 <exec_byte_code+3700>, 0x65ece3 <exec_byte_code+1604>, 
          0x65f602 <exec_byte_code+3939>, 0x65f6a7 <exec_byte_code+4104>, 0x65f749 <exec_byte_code+4266>, 0x65fa27 <exec_byte_code+5000>, 0x65fa3f <exec_byte_code+5024>, 
          0x65f19e <exec_byte_code+2815>, 0x65fb2f <exec_byte_code+5264>, 0x65fbc7 <exec_byte_code+5416>, 0x65fc64 <exec_byte_code+5573>, 0x65fcb9 <exec_byte_code+5658>, 
          0x65fedc <exec_byte_code+6205>, 0x65ff6b <exec_byte_code+6348>, 0x66000e <exec_byte_code+6511>, 0x660083 <exec_byte_code+6628>, 0x65f4b6 <exec_byte_code+3607>, 
          0x6622ca <exec_byte_code+15403>, 0x66236d <exec_byte_code+15566>, 0x6623c7 <exec_byte_code+15656>, 0x662421 <exec_byte_code+15746>, 
          0x66247b <exec_byte_code+15836>, 0x6624d5 <exec_byte_code+15926>, 0x662555 <exec_byte_code+16054>, 0x6625d5 <exec_byte_code+16182>, 
          0x662655 <exec_byte_code+16310>, 0x6626d5 <exec_byte_code+16438>, 0x662840 <exec_byte_code+16801>, 0x6628c0 <exec_byte_code+16929>, 
          0x662940 <exec_byte_code+17057>, 0x66299a <exec_byte_code+17147>, 0x662a1a <exec_byte_code+17275>, 0x662a9a <exec_byte_code+17403>, 
          0x662af4 <exec_byte_code+17493>, 0x662b4e <exec_byte_code+17583>, 0x66158a <exec_byte_code+12011>, 0x6615f1 <exec_byte_code+12114>, 
          0x662bb5 <exec_byte_code+17686>, 0x662c36 <exec_byte_code+17815>, 0x662cb6 <exec_byte_code+17943>, 0x65f7eb <exec_byte_code+4428>, 0x65f811 <exec_byte_code+4466>, 
          0x65f898 <exec_byte_code+4601>, 0x65f91f <exec_byte_code+4736>, 0x65f9a3 <exec_byte_code+4868>, 0x6606f7 <exec_byte_code+8280>, 0x660d23 <exec_byte_code+9860>, 
          0x66179a <exec_byte_code+12539>, 0x662ee7 <exec_byte_code+18504>, 0x662f71 <exec_byte_code+18642>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x663028 <exec_byte_code+18825>, 0x6630d9 <exec_byte_code+19002>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x6632df <exec_byte_code+19520> <repeats 64 times>}
        const_length = 52
        bytestr_length = 568
        vectorp = 0x5183ea8
        quitcounter = 3 '\003'
        stack_items = 15
        sa_avail = 15696
        sa_count = 25
        sa_must_free = false
        stack_base = 0x7fffffff8760
        stack_lim = 0x7fffffff87d8
        top = 0x7fffffff87a0
        void_stack_lim = 0x7fffffff87d8
        bytestr_data = 0x7fffffff87d8 "\306 \210\212\307 \210`\310\003!\030ɉ\211\211\211\031\b\311=\204#"
        pc = 0x7fffffff88f6 "\210\t@ޘ\203\032\001\tA\211\021\204/\001\337\021\t:\203D\001\340\t@!\262\002\202D\001\211\341ɉF\262\002\001\313=\203`\001\313\026/\327\004!\210\002\004V\203\\\001\327\003S!\210˂%\002\001;\203\200\001\016\061\203\200\001\313\026/\327\004!\210\002\004V\203|\001\327\003S!\210˂%\002\001\204\232\001\313\026/\327\004!\210\002\004V\203\226\001\327\003S!\210˂%\002\016\062\342W\203\271\001\212\003b\210\323\001e\")\204\336\001\212\002b\210\343\001d\")\204\336\001\016\062\342V\203\370\001\212\003b\210\323\001\005\016\062Z\")\204\336\001\212\002b\210\343\001\004\016\062\\\")\203\370\001\311\026/\016\061\203\357\001"...
        count = 25
        result = XIL(0xc090)
#10 0x00000000006143b4 in funcall_lambda (fun=XIL(0x5184055), nargs=0, arg_vector=0x7fffffff8e78) at eval.c:2967
        size = 6
        val = XIL(0x55e61d)
        syms_left = make_number(512)
        next = XIL(0x5184050)
        lexenv = XIL(0x7fffffff8dd8)
        count = 25
        i = 81604349392
        optional = false
        rest = false
        previous_optional_or_rest = false
#11 0x00000000006139a6 in Ffuncall (nargs=1, args=0x7fffffff8e70) at eval.c:2768
        fun = XIL(0x5184055)
        original_fun = XIL(0x46bcd10)
        funcar = XIL(0xac8660)
        numargs = 0
        val = XIL(0xc090)
        count = 24
#12 0x000000000065f38f in exec_byte_code (bytestr=XIL(0x5184fe4), vector=XIL(0x5183c25), maxdepth=make_number(7), args_template=make_number(0), nargs=0, args=0x7fffffff9420)
    at bytecode.c:629
        op = 0
        type = CONDITION_CASE
        targets = {0x662cb6 <exec_byte_code+17943>, 0x662d1b <exec_byte_code+18044>, 0x662d1d <exec_byte_code+18046>, 0x662d1f <exec_byte_code+18048>, 
          0x662d21 <exec_byte_code+18050>, 0x662d21 <exec_byte_code+18050>, 0x662d9e <exec_byte_code+18175>, 0x662e2d <exec_byte_code+18318>, 
          0x65ebc2 <exec_byte_code+1315>, 0x65ebc4 <exec_byte_code+1317>, 0x65ebc6 <exec_byte_code+1319>, 0x65ebc8 <exec_byte_code+1321>, 0x65ebca <exec_byte_code+1323>, 
          0x65ebca <exec_byte_code+1323>, 0x65ebd3 <exec_byte_code+1332>, 0x65eb7f <exec_byte_code+1248>, 0x65f007 <exec_byte_code+2408>, 0x65f009 <exec_byte_code+2410>, 
          0x65f00b <exec_byte_code+2412>, 0x65f00d <exec_byte_code+2414>, 0x65f00f <exec_byte_code+2416>, 0x65f00f <exec_byte_code+2416>, 0x65f059 <exec_byte_code+2490>, 
          0x65f018 <exec_byte_code+2425>, 0x65f267 <exec_byte_code+3016>, 0x65f269 <exec_byte_code+3018>, 0x65f26b <exec_byte_code+3020>, 0x65f26d <exec_byte_code+3022>, 
          0x65f26f <exec_byte_code+3024>, 0x65f26f <exec_byte_code+3024>, 0x65f206 <exec_byte_code+2919>, 0x65f226 <exec_byte_code+2951>, 0x65f34d <exec_byte_code+3246>, 
          0x65f34f <exec_byte_code+3248>, 0x65f351 <exec_byte_code+3250>, 0x65f353 <exec_byte_code+3252>, 0x65f355 <exec_byte_code+3254>, 0x65f355 <exec_byte_code+3254>, 
          0x65f2ec <exec_byte_code+3149>, 0x65f30c <exec_byte_code+3181>, 0x65f43b <exec_byte_code+3484>, 0x65f43d <exec_byte_code+3486>, 0x65f43f <exec_byte_code+3488>, 
          0x65f441 <exec_byte_code+3490>, 0x65f443 <exec_byte_code+3492>, 0x65f443 <exec_byte_code+3492>, 0x65f3da <exec_byte_code+3387>, 0x65f3fa <exec_byte_code+3419>, 
          0x65fe84 <exec_byte_code+6117>, 0x65fd4c <exec_byte_code+5805>, 0x65fd40 <exec_byte_code+5793>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x66011e <exec_byte_code+6783>, 0x66022f <exec_byte_code+7056>, 
          0x6602ae <exec_byte_code+7183>, 0x66032e <exec_byte_code+7311>, 0x6603af <exec_byte_code+7440>, 0x65ee3e <exec_byte_code+1951>, 0x65eede <exec_byte_code+2111>, 
          0x660448 <exec_byte_code+7593>, 0x65ed96 <exec_byte_code+1783>, 0x65ef5e <exec_byte_code+2239>, 0x6604cf <exec_byte_code+7728>, 0x66054f <exec_byte_code+7856>, 
          0x6605a9 <exec_byte_code+7946>, 0x660629 <exec_byte_code+8074>, 0x660690 <exec_byte_code+8177>, 0x66079a <exec_byte_code+8443>, 0x6607f4 <exec_byte_code+8533>, 
          0x660874 <exec_byte_code+8661>, 0x660917 <exec_byte_code+8824>, 0x660971 <exec_byte_code+8914>, 0x6609cb <exec_byte_code+9004>, 0x660a4b <exec_byte_code+9132>, 
          0x660acb <exec_byte_code+9260>, 0x660b4b <exec_byte_code+9388>, 0x660bee <exec_byte_code+9551>, 0x660c55 <exec_byte_code+9654>, 0x660cbc <exec_byte_code+9757>, 
          0x660dc6 <exec_byte_code+10023>, 0x660e5b <exec_byte_code+10172>, 0x660ef0 <exec_byte_code+10321>, 0x6610da <exec_byte_code+10811>, 
          0x66115f <exec_byte_code+10944>, 0x6611e4 <exec_byte_code+11077>, 0x661269 <exec_byte_code+11210>, 0x6612ee <exec_byte_code+11343>, 
          0x661355 <exec_byte_code+11446>, 0x6613ee <exec_byte_code+11599>, 0x661455 <exec_byte_code+11702>, 0x6614bc <exec_byte_code+11805>, 
          0x661523 <exec_byte_code+11908>, 0x661671 <exec_byte_code+12242>, 0x65fb84 <exec_byte_code+5349>, 0x6616e1 <exec_byte_code+12354>, 
          0x66173b <exec_byte_code+12444>, 0x66183d <exec_byte_code+12702>, 0x6618b8 <exec_byte_code+12825>, 0x661928 <exec_byte_code+12937>, 
          0x661982 <exec_byte_code+13027>, 0x6619da <exec_byte_code+13115>, 0x661a32 <exec_byte_code+13203>, 0x661a92 <exec_byte_code+13299>, 
          0x662cb6 <exec_byte_code+17943>, 0x661afc <exec_byte_code+13405>, 0x661b54 <exec_byte_code+13493>, 0x661bac <exec_byte_code+13581>, 
          0x661c04 <exec_byte_code+13669>, 0x661c5c <exec_byte_code+13757>, 0x661cb4 <exec_byte_code+13845>, 0x65fb84 <exec_byte_code+5349>, 
          0x662cb6 <exec_byte_code+17943>, 0x661d0e <exec_byte_code+13935>, 0x661d75 <exec_byte_code+14038>, 0x661dcf <exec_byte_code+14128>, 
          0x661e29 <exec_byte_code+14218>, 0x661ea9 <exec_byte_code+14346>, 0x661f29 <exec_byte_code+14474>, 0x661f83 <exec_byte_code+14564>, 
          0x662098 <exec_byte_code+14841>, 0x662118 <exec_byte_code+14969>, 0x662198 <exec_byte_code+15097>, 0x662218 <exec_byte_code+15225>, 
          0x662270 <exec_byte_code+15313>, 0x662cb6 <exec_byte_code+17943>, 0x65fa85 <exec_byte_code+5094>, 0x65f513 <exec_byte_code+3700>, 0x65ece3 <exec_byte_code+1604>, 
          0x65f602 <exec_byte_code+3939>, 0x65f6a7 <exec_byte_code+4104>, 0x65f749 <exec_byte_code+4266>, 0x65fa27 <exec_byte_code+5000>, 0x65fa3f <exec_byte_code+5024>, 
          0x65f19e <exec_byte_code+2815>, 0x65fb2f <exec_byte_code+5264>, 0x65fbc7 <exec_byte_code+5416>, 0x65fc64 <exec_byte_code+5573>, 0x65fcb9 <exec_byte_code+5658>, 
          0x65fedc <exec_byte_code+6205>, 0x65ff6b <exec_byte_code+6348>, 0x66000e <exec_byte_code+6511>, 0x660083 <exec_byte_code+6628>, 0x65f4b6 <exec_byte_code+3607>, 
          0x6622ca <exec_byte_code+15403>, 0x66236d <exec_byte_code+15566>, 0x6623c7 <exec_byte_code+15656>, 0x662421 <exec_byte_code+15746>, 
          0x66247b <exec_byte_code+15836>, 0x6624d5 <exec_byte_code+15926>, 0x662555 <exec_byte_code+16054>, 0x6625d5 <exec_byte_code+16182>, 
          0x662655 <exec_byte_code+16310>, 0x6626d5 <exec_byte_code+16438>, 0x662840 <exec_byte_code+16801>, 0x6628c0 <exec_byte_code+16929>, 
          0x662940 <exec_byte_code+17057>, 0x66299a <exec_byte_code+17147>, 0x662a1a <exec_byte_code+17275>, 0x662a9a <exec_byte_code+17403>, 
          0x662af4 <exec_byte_code+17493>, 0x662b4e <exec_byte_code+17583>, 0x66158a <exec_byte_code+12011>, 0x6615f1 <exec_byte_code+12114>, 
          0x662bb5 <exec_byte_code+17686>, 0x662c36 <exec_byte_code+17815>, 0x662cb6 <exec_byte_code+17943>, 0x65f7eb <exec_byte_code+4428>, 0x65f811 <exec_byte_code+4466>, 
          0x65f898 <exec_byte_code+4601>, 0x65f91f <exec_byte_code+4736>, 0x65f9a3 <exec_byte_code+4868>, 0x6606f7 <exec_byte_code+8280>, 0x660d23 <exec_byte_code+9860>, 
          0x66179a <exec_byte_code+12539>, 0x662ee7 <exec_byte_code+18504>, 0x662f71 <exec_byte_code+18642>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x663028 <exec_byte_code+18825>, 0x6630d9 <exec_byte_code+19002>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x6632df <exec_byte_code+19520> <repeats 64 times>}
        const_length = 25
        bytestr_length = 126
        vectorp = 0x5183c28
        quitcounter = 1 '\001'
        stack_items = 8
        sa_avail = 16194
        sa_count = 22
        sa_must_free = false
        stack_base = 0x7fffffff8e60
        stack_lim = 0x7fffffff8ea0
        top = 0x7fffffff8e70
        void_stack_lim = 0x7fffffff8ea0
        bytestr_data = 0x7fffffff8ea0 "\b\205}"
        pc = 0x7fffffff8ec1 "\210p\025`\026\023\202@"
        count = 22
        result = XIL(0x807f1f1)
#13 0x00000000006143b4 in funcall_lambda (fun=XIL(0x5183cf5), nargs=0, arg_vector=0x7fffffff9420) at eval.c:2967
        size = 6
        val = XIL(0x55e61d)
        syms_left = make_number(0)
        next = XIL(0x5183cf0)
        lexenv = XIL(0x7fffffff92e8)
        count = 22
        i = 81604350688
        optional = false
        rest = false
        previous_optional_or_rest = false
#14 0x00000000006139a6 in Ffuncall (nargs=1, args=0x7fffffff9418) at eval.c:2768
        fun = XIL(0x5183cf5)
        original_fun = XIL(0x46b7c40)
        funcar = XIL(0xac8660)
        numargs = 0
        val = XIL(0x807f1f1)
        count = 21
#15 0x0000000000612cab in funcall_nil (nargs=1, args=0x7fffffff9418) at eval.c:2397
No locals.
#16 0x00000000006130c0 in run_hook_with_args (nargs=1, args=0x7fffffff9418, funcall=0x612c88 <funcall_nil>) at eval.c:2574
        global_vals = XIL(0)
        sym = XIL(0xa410)
        val = XIL(0x63b6a73)
        ret = XIL(0)
#17 0x0000000000612d31 in Frun_hook_with_args (nargs=1, args=0x7fffffff9418) at eval.c:2439
No locals.
#18 0x0000000000613122 in run_hook (hook=XIL(0x46b7c40)) at eval.c:2587
No locals.
#19 0x0000000000612cec in Frun_hooks (nargs=1, args=0x7fffffff95b8) at eval.c:2421
        i = 0
#20 0x0000000000613c68 in funcall_subr (subr=0xa55f58 <Srun_hooks>, numargs=1, args=0x7fffffff95b8) at eval.c:2821
No locals.
#21 0x0000000000613962 in Ffuncall (nargs=2, args=0x7fffffff95b0) at eval.c:2766
        fun = XIL(0xa55f5d)
        original_fun = XIL(0xae440)
        funcar = XIL(0x7fffffff9550)
        numargs = 1
        val = XIL(0)
        count = 20
#22 0x000000000065f38f in exec_byte_code (bytestr=XIL(0x8fdf804), vector=XIL(0x7f7b9a5), maxdepth=make_number(13), args_template=make_number(770), nargs=3, 
    args=0x7fffffff9ae0) at bytecode.c:629
        op = 1
        type = (CONDITION_CASE | CATCHER_ALL | unknown: 32764)
        targets = {0x662cb6 <exec_byte_code+17943>, 0x662d1b <exec_byte_code+18044>, 0x662d1d <exec_byte_code+18046>, 0x662d1f <exec_byte_code+18048>, 
          0x662d21 <exec_byte_code+18050>, 0x662d21 <exec_byte_code+18050>, 0x662d9e <exec_byte_code+18175>, 0x662e2d <exec_byte_code+18318>, 
          0x65ebc2 <exec_byte_code+1315>, 0x65ebc4 <exec_byte_code+1317>, 0x65ebc6 <exec_byte_code+1319>, 0x65ebc8 <exec_byte_code+1321>, 0x65ebca <exec_byte_code+1323>, 
          0x65ebca <exec_byte_code+1323>, 0x65ebd3 <exec_byte_code+1332>, 0x65eb7f <exec_byte_code+1248>, 0x65f007 <exec_byte_code+2408>, 0x65f009 <exec_byte_code+2410>, 
          0x65f00b <exec_byte_code+2412>, 0x65f00d <exec_byte_code+2414>, 0x65f00f <exec_byte_code+2416>, 0x65f00f <exec_byte_code+2416>, 0x65f059 <exec_byte_code+2490>, 
          0x65f018 <exec_byte_code+2425>, 0x65f267 <exec_byte_code+3016>, 0x65f269 <exec_byte_code+3018>, 0x65f26b <exec_byte_code+3020>, 0x65f26d <exec_byte_code+3022>, 
          0x65f26f <exec_byte_code+3024>, 0x65f26f <exec_byte_code+3024>, 0x65f206 <exec_byte_code+2919>, 0x65f226 <exec_byte_code+2951>, 0x65f34d <exec_byte_code+3246>, 
          0x65f34f <exec_byte_code+3248>, 0x65f351 <exec_byte_code+3250>, 0x65f353 <exec_byte_code+3252>, 0x65f355 <exec_byte_code+3254>, 0x65f355 <exec_byte_code+3254>, 
          0x65f2ec <exec_byte_code+3149>, 0x65f30c <exec_byte_code+3181>, 0x65f43b <exec_byte_code+3484>, 0x65f43d <exec_byte_code+3486>, 0x65f43f <exec_byte_code+3488>, 
          0x65f441 <exec_byte_code+3490>, 0x65f443 <exec_byte_code+3492>, 0x65f443 <exec_byte_code+3492>, 0x65f3da <exec_byte_code+3387>, 0x65f3fa <exec_byte_code+3419>, 
          0x65fe84 <exec_byte_code+6117>, 0x65fd4c <exec_byte_code+5805>, 0x65fd40 <exec_byte_code+5793>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x66011e <exec_byte_code+6783>, 0x66022f <exec_byte_code+7056>, 
          0x6602ae <exec_byte_code+7183>, 0x66032e <exec_byte_code+7311>, 0x6603af <exec_byte_code+7440>, 0x65ee3e <exec_byte_code+1951>, 0x65eede <exec_byte_code+2111>, 
          0x660448 <exec_byte_code+7593>, 0x65ed96 <exec_byte_code+1783>, 0x65ef5e <exec_byte_code+2239>, 0x6604cf <exec_byte_code+7728>, 0x66054f <exec_byte_code+7856>, 
          0x6605a9 <exec_byte_code+7946>, 0x660629 <exec_byte_code+8074>, 0x660690 <exec_byte_code+8177>, 0x66079a <exec_byte_code+8443>, 0x6607f4 <exec_byte_code+8533>, 
          0x660874 <exec_byte_code+8661>, 0x660917 <exec_byte_code+8824>, 0x660971 <exec_byte_code+8914>, 0x6609cb <exec_byte_code+9004>, 0x660a4b <exec_byte_code+9132>, 
          0x660acb <exec_byte_code+9260>, 0x660b4b <exec_byte_code+9388>, 0x660bee <exec_byte_code+9551>, 0x660c55 <exec_byte_code+9654>, 0x660cbc <exec_byte_code+9757>, 
          0x660dc6 <exec_byte_code+10023>, 0x660e5b <exec_byte_code+10172>, 0x660ef0 <exec_byte_code+10321>, 0x6610da <exec_byte_code+10811>, 
          0x66115f <exec_byte_code+10944>, 0x6611e4 <exec_byte_code+11077>, 0x661269 <exec_byte_code+11210>, 0x6612ee <exec_byte_code+11343>, 
          0x661355 <exec_byte_code+11446>, 0x6613ee <exec_byte_code+11599>, 0x661455 <exec_byte_code+11702>, 0x6614bc <exec_byte_code+11805>, 
          0x661523 <exec_byte_code+11908>, 0x661671 <exec_byte_code+12242>, 0x65fb84 <exec_byte_code+5349>, 0x6616e1 <exec_byte_code+12354>, 
          0x66173b <exec_byte_code+12444>, 0x66183d <exec_byte_code+12702>, 0x6618b8 <exec_byte_code+12825>, 0x661928 <exec_byte_code+12937>, 
          0x661982 <exec_byte_code+13027>, 0x6619da <exec_byte_code+13115>, 0x661a32 <exec_byte_code+13203>, 0x661a92 <exec_byte_code+13299>, 
          0x662cb6 <exec_byte_code+17943>, 0x661afc <exec_byte_code+13405>, 0x661b54 <exec_byte_code+13493>, 0x661bac <exec_byte_code+13581>, 
          0x661c04 <exec_byte_code+13669>, 0x661c5c <exec_byte_code+13757>, 0x661cb4 <exec_byte_code+13845>, 0x65fb84 <exec_byte_code+5349>, 
          0x662cb6 <exec_byte_code+17943>, 0x661d0e <exec_byte_code+13935>, 0x661d75 <exec_byte_code+14038>, 0x661dcf <exec_byte_code+14128>, 
          0x661e29 <exec_byte_code+14218>, 0x661ea9 <exec_byte_code+14346>, 0x661f29 <exec_byte_code+14474>, 0x661f83 <exec_byte_code+14564>, 
          0x662098 <exec_byte_code+14841>, 0x662118 <exec_byte_code+14969>, 0x662198 <exec_byte_code+15097>, 0x662218 <exec_byte_code+15225>, 
          0x662270 <exec_byte_code+15313>, 0x662cb6 <exec_byte_code+17943>, 0x65fa85 <exec_byte_code+5094>, 0x65f513 <exec_byte_code+3700>, 0x65ece3 <exec_byte_code+1604>, 
          0x65f602 <exec_byte_code+3939>, 0x65f6a7 <exec_byte_code+4104>, 0x65f749 <exec_byte_code+4266>, 0x65fa27 <exec_byte_code+5000>, 0x65fa3f <exec_byte_code+5024>, 
          0x65f19e <exec_byte_code+2815>, 0x65fb2f <exec_byte_code+5264>, 0x65fbc7 <exec_byte_code+5416>, 0x65fc64 <exec_byte_code+5573>, 0x65fcb9 <exec_byte_code+5658>, 
          0x65fedc <exec_byte_code+6205>, 0x65ff6b <exec_byte_code+6348>, 0x66000e <exec_byte_code+6511>, 0x660083 <exec_byte_code+6628>, 0x65f4b6 <exec_byte_code+3607>, 
          0x6622ca <exec_byte_code+15403>, 0x66236d <exec_byte_code+15566>, 0x6623c7 <exec_byte_code+15656>, 0x662421 <exec_byte_code+15746>, 
          0x66247b <exec_byte_code+15836>, 0x6624d5 <exec_byte_code+15926>, 0x662555 <exec_byte_code+16054>, 0x6625d5 <exec_byte_code+16182>, 
          0x662655 <exec_byte_code+16310>, 0x6626d5 <exec_byte_code+16438>, 0x662840 <exec_byte_code+16801>, 0x6628c0 <exec_byte_code+16929>, 
          0x662940 <exec_byte_code+17057>, 0x66299a <exec_byte_code+17147>, 0x662a1a <exec_byte_code+17275>, 0x662a9a <exec_byte_code+17403>, 
          0x662af4 <exec_byte_code+17493>, 0x662b4e <exec_byte_code+17583>, 0x66158a <exec_byte_code+12011>, 0x6615f1 <exec_byte_code+12114>, 
          0x662bb5 <exec_byte_code+17686>, 0x662c36 <exec_byte_code+17815>, 0x662cb6 <exec_byte_code+17943>, 0x65f7eb <exec_byte_code+4428>, 0x65f811 <exec_byte_code+4466>, 
          0x65f898 <exec_byte_code+4601>, 0x65f91f <exec_byte_code+4736>, 0x65f9a3 <exec_byte_code+4868>, 0x6606f7 <exec_byte_code+8280>, 0x660d23 <exec_byte_code+9860>, 
          0x66179a <exec_byte_code+12539>, 0x662ee7 <exec_byte_code+18504>, 0x662f71 <exec_byte_code+18642>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x663028 <exec_byte_code+18825>, 0x6630d9 <exec_byte_code+19002>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x6632df <exec_byte_code+19520> <repeats 64 times>}
        const_length = 29
        bytestr_length = 153
        vectorp = 0x7f7b9a8
        quitcounter = 1 '\001'
        stack_items = 14
        sa_avail = 16119
        sa_count = 17
        sa_must_free = false
        stack_base = 0x7fffffff9560
        stack_lim = 0x7fffffff95d0
        top = 0x7fffffff95b0
        void_stack_lim = 0x7fffffff95d0
        bytestr_data = 0x7fffffff95d0 "\305\306\001\030r\004\211\203\205"
        pc = 0x7fffffff9636 "\210)\266\002\004\204y"
        count = 17
        result = XIL(0)
#23 0x00000000006143b4 in funcall_lambda (fun=XIL(0x7f7ba95), nargs=3, arg_vector=0x7fffffff9ac8) at eval.c:2967
        size = 5
        val = XIL(0x55e61d)
        syms_left = make_number(770)
        next = XIL(0x7f7ba90)
        lexenv = XIL(0x7fffffff9a38)
        count = 17
        i = 81604352560
        optional = false
        rest = false
        previous_optional_or_rest = false
#24 0x00000000006139a6 in Ffuncall (nargs=4, args=0x7fffffff9ac0) at eval.c:2768
        fun = XIL(0x7f7ba95)
        original_fun = XIL(0x7f7ba95)
        funcar = XIL(0xac8660)
        numargs = 3
        val = XIL(0)
        count = 16
#25 0x0000000000612c3b in Fapply (nargs=2, args=0x7fffffff9cb8) at eval.c:2386
        i = 4
        numargs = 3
        funcall_nargs = 4
        funcall_args = 0x7fffffff9ac0
        spread_arg = XIL(0)
        fun = XIL(0x7f7ba95)
        retval = XIL(0x30)
        sa_avail = 16352
        sa_count = 16
        sa_must_free = false
#26 0x0000000000613c68 in funcall_subr (subr=0xa55f28 <Sapply>, numargs=2, args=0x7fffffff9cb8) at eval.c:2821
No locals.
#27 0x0000000000613962 in Ffuncall (nargs=3, args=0x7fffffff9cb0) at eval.c:2766
        fun = XIL(0xa55f2d)
        original_fun = XIL(0x28e0)
        funcar = XIL(0x7fffffff9c80)
        numargs = 2
        val = XIL(0x5e7d69)
        count = 15
#28 0x000000000065f38f in exec_byte_code (bytestr=XIL(0x1278244), vector=XIL(0x7064f95), maxdepth=make_number(5), args_template=make_number(128), nargs=3, 
    args=0x7fffffffa168) at bytecode.c:629
        op = 2
        type = CATCHER
        targets = {0x662cb6 <exec_byte_code+17943>, 0x662d1b <exec_byte_code+18044>, 0x662d1d <exec_byte_code+18046>, 0x662d1f <exec_byte_code+18048>, 
          0x662d21 <exec_byte_code+18050>, 0x662d21 <exec_byte_code+18050>, 0x662d9e <exec_byte_code+18175>, 0x662e2d <exec_byte_code+18318>, 
          0x65ebc2 <exec_byte_code+1315>, 0x65ebc4 <exec_byte_code+1317>, 0x65ebc6 <exec_byte_code+1319>, 0x65ebc8 <exec_byte_code+1321>, 0x65ebca <exec_byte_code+1323>, 
          0x65ebca <exec_byte_code+1323>, 0x65ebd3 <exec_byte_code+1332>, 0x65eb7f <exec_byte_code+1248>, 0x65f007 <exec_byte_code+2408>, 0x65f009 <exec_byte_code+2410>, 
          0x65f00b <exec_byte_code+2412>, 0x65f00d <exec_byte_code+2414>, 0x65f00f <exec_byte_code+2416>, 0x65f00f <exec_byte_code+2416>, 0x65f059 <exec_byte_code+2490>, 
          0x65f018 <exec_byte_code+2425>, 0x65f267 <exec_byte_code+3016>, 0x65f269 <exec_byte_code+3018>, 0x65f26b <exec_byte_code+3020>, 0x65f26d <exec_byte_code+3022>, 
          0x65f26f <exec_byte_code+3024>, 0x65f26f <exec_byte_code+3024>, 0x65f206 <exec_byte_code+2919>, 0x65f226 <exec_byte_code+2951>, 0x65f34d <exec_byte_code+3246>, 
          0x65f34f <exec_byte_code+3248>, 0x65f351 <exec_byte_code+3250>, 0x65f353 <exec_byte_code+3252>, 0x65f355 <exec_byte_code+3254>, 0x65f355 <exec_byte_code+3254>, 
          0x65f2ec <exec_byte_code+3149>, 0x65f30c <exec_byte_code+3181>, 0x65f43b <exec_byte_code+3484>, 0x65f43d <exec_byte_code+3486>, 0x65f43f <exec_byte_code+3488>, 
          0x65f441 <exec_byte_code+3490>, 0x65f443 <exec_byte_code+3492>, 0x65f443 <exec_byte_code+3492>, 0x65f3da <exec_byte_code+3387>, 0x65f3fa <exec_byte_code+3419>, 
          0x65fe84 <exec_byte_code+6117>, 0x65fd4c <exec_byte_code+5805>, 0x65fd40 <exec_byte_code+5793>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x66011e <exec_byte_code+6783>, 0x66022f <exec_byte_code+7056>, 
          0x6602ae <exec_byte_code+7183>, 0x66032e <exec_byte_code+7311>, 0x6603af <exec_byte_code+7440>, 0x65ee3e <exec_byte_code+1951>, 0x65eede <exec_byte_code+2111>, 
          0x660448 <exec_byte_code+7593>, 0x65ed96 <exec_byte_code+1783>, 0x65ef5e <exec_byte_code+2239>, 0x6604cf <exec_byte_code+7728>, 0x66054f <exec_byte_code+7856>, 
          0x6605a9 <exec_byte_code+7946>, 0x660629 <exec_byte_code+8074>, 0x660690 <exec_byte_code+8177>, 0x66079a <exec_byte_code+8443>, 0x6607f4 <exec_byte_code+8533>, 
          0x660874 <exec_byte_code+8661>, 0x660917 <exec_byte_code+8824>, 0x660971 <exec_byte_code+8914>, 0x6609cb <exec_byte_code+9004>, 0x660a4b <exec_byte_code+9132>, 
          0x660acb <exec_byte_code+9260>, 0x660b4b <exec_byte_code+9388>, 0x660bee <exec_byte_code+9551>, 0x660c55 <exec_byte_code+9654>, 0x660cbc <exec_byte_code+9757>, 
          0x660dc6 <exec_byte_code+10023>, 0x660e5b <exec_byte_code+10172>, 0x660ef0 <exec_byte_code+10321>, 0x6610da <exec_byte_code+10811>, 
          0x66115f <exec_byte_code+10944>, 0x6611e4 <exec_byte_code+11077>, 0x661269 <exec_byte_code+11210>, 0x6612ee <exec_byte_code+11343>, 
          0x661355 <exec_byte_code+11446>, 0x6613ee <exec_byte_code+11599>, 0x661455 <exec_byte_code+11702>, 0x6614bc <exec_byte_code+11805>, 
          0x661523 <exec_byte_code+11908>, 0x661671 <exec_byte_code+12242>, 0x65fb84 <exec_byte_code+5349>, 0x6616e1 <exec_byte_code+12354>, 
          0x66173b <exec_byte_code+12444>, 0x66183d <exec_byte_code+12702>, 0x6618b8 <exec_byte_code+12825>, 0x661928 <exec_byte_code+12937>, 
          0x661982 <exec_byte_code+13027>, 0x6619da <exec_byte_code+13115>, 0x661a32 <exec_byte_code+13203>, 0x661a92 <exec_byte_code+13299>, 
          0x662cb6 <exec_byte_code+17943>, 0x661afc <exec_byte_code+13405>, 0x661b54 <exec_byte_code+13493>, 0x661bac <exec_byte_code+13581>, 
          0x661c04 <exec_byte_code+13669>, 0x661c5c <exec_byte_code+13757>, 0x661cb4 <exec_byte_code+13845>, 0x65fb84 <exec_byte_code+5349>, 
          0x662cb6 <exec_byte_code+17943>, 0x661d0e <exec_byte_code+13935>, 0x661d75 <exec_byte_code+14038>, 0x661dcf <exec_byte_code+14128>, 
          0x661e29 <exec_byte_code+14218>, 0x661ea9 <exec_byte_code+14346>, 0x661f29 <exec_byte_code+14474>, 0x661f83 <exec_byte_code+14564>, 
          0x662098 <exec_byte_code+14841>, 0x662118 <exec_byte_code+14969>, 0x662198 <exec_byte_code+15097>, 0x662218 <exec_byte_code+15225>, 
          0x662270 <exec_byte_code+15313>, 0x662cb6 <exec_byte_code+17943>, 0x65fa85 <exec_byte_code+5094>, 0x65f513 <exec_byte_code+3700>, 0x65ece3 <exec_byte_code+1604>, 
          0x65f602 <exec_byte_code+3939>, 0x65f6a7 <exec_byte_code+4104>, 0x65f749 <exec_byte_code+4266>, 0x65fa27 <exec_byte_code+5000>, 0x65fa3f <exec_byte_code+5024>, 
          0x65f19e <exec_byte_code+2815>, 0x65fb2f <exec_byte_code+5264>, 0x65fbc7 <exec_byte_code+5416>, 0x65fc64 <exec_byte_code+5573>, 0x65fcb9 <exec_byte_code+5658>, 
          0x65fedc <exec_byte_code+6205>, 0x65ff6b <exec_byte_code+6348>, 0x66000e <exec_byte_code+6511>, 0x660083 <exec_byte_code+6628>, 0x65f4b6 <exec_byte_code+3607>, 
          0x6622ca <exec_byte_code+15403>, 0x66236d <exec_byte_code+15566>, 0x6623c7 <exec_byte_code+15656>, 0x662421 <exec_byte_code+15746>, 
          0x66247b <exec_byte_code+15836>, 0x6624d5 <exec_byte_code+15926>, 0x662555 <exec_byte_code+16054>, 0x6625d5 <exec_byte_code+16182>, 
          0x662655 <exec_byte_code+16310>, 0x6626d5 <exec_byte_code+16438>, 0x662840 <exec_byte_code+16801>, 0x6628c0 <exec_byte_code+16929>, 
          0x662940 <exec_byte_code+17057>, 0x66299a <exec_byte_code+17147>, 0x662a1a <exec_byte_code+17275>, 0x662a9a <exec_byte_code+17403>, 
          0x662af4 <exec_byte_code+17493>, 0x662b4e <exec_byte_code+17583>, 0x66158a <exec_byte_code+12011>, 0x6615f1 <exec_byte_code+12114>, 
          0x662bb5 <exec_byte_code+17686>, 0x662c36 <exec_byte_code+17815>, 0x662cb6 <exec_byte_code+17943>, 0x65f7eb <exec_byte_code+4428>, 0x65f811 <exec_byte_code+4466>, 
          0x65f898 <exec_byte_code+4601>, 0x65f91f <exec_byte_code+4736>, 0x65f9a3 <exec_byte_code+4868>, 0x6606f7 <exec_byte_code+8280>, 0x660d23 <exec_byte_code+9860>, 
          0x66179a <exec_byte_code+12539>, 0x662ee7 <exec_byte_code+18504>, 0x662f71 <exec_byte_code+18642>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x663028 <exec_byte_code+18825>, 0x6630d9 <exec_byte_code+19002>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x6632df <exec_byte_code+19520> <repeats 64 times>}
        const_length = 4
        bytestr_length = 10
        vectorp = 0x7064f98
        quitcounter = 1 '\001'
        stack_items = 6
        sa_avail = 16326
        sa_count = 15
        sa_must_free = false
        stack_base = 0x7fffffff9ca0
        stack_lim = 0x7fffffff9cd0
        top = 0x7fffffff9cb0
        void_stack_lim = 0x7fffffff9cd0
        bytestr_data = 0x7fffffff9cd0 "\300\302\002\"\300\301\003\"\210\207\006\a"
        pc = 0x7fffffff9cd4 "\300\301\003\"\210\207\006\a"
        count = 15
        result = XIL(0x7400000074)
#29 0x00000000006143b4 in funcall_lambda (fun=XIL(0x7064fc5), nargs=3, arg_vector=0x7fffffffa168) at eval.c:2967
        size = 5
        val = XIL(0x55e61d)
        syms_left = make_number(128)
        next = XIL(0x7064fc0)
        lexenv = XIL(0x7fffffffa098)
        count = 15
        i = 81604354192
        optional = 127
        rest = false
        previous_optional_or_rest = false
#30 0x00000000006139a6 in Ffuncall (nargs=4, args=0x7fffffffa160) at eval.c:2768
        fun = XIL(0x7064fc5)
        original_fun = XIL(0x8516d80)
        funcar = XIL(0x7fffffffa110)
        numargs = 3
        val = XIL(0x610471)
        count = 14
#31 0x000000000065f38f in exec_byte_code (bytestr=XIL(0x8fdf1f4), vector=XIL(0x7f7b765), maxdepth=make_number(12), args_template=make_number(1799), nargs=7, 
    args=0x7fffffffa708) at bytecode.c:629
        op = 3
        type = CONDITION_CASE
        targets = {0x662cb6 <exec_byte_code+17943>, 0x662d1b <exec_byte_code+18044>, 0x662d1d <exec_byte_code+18046>, 0x662d1f <exec_byte_code+18048>, 
          0x662d21 <exec_byte_code+18050>, 0x662d21 <exec_byte_code+18050>, 0x662d9e <exec_byte_code+18175>, 0x662e2d <exec_byte_code+18318>, 
          0x65ebc2 <exec_byte_code+1315>, 0x65ebc4 <exec_byte_code+1317>, 0x65ebc6 <exec_byte_code+1319>, 0x65ebc8 <exec_byte_code+1321>, 0x65ebca <exec_byte_code+1323>, 
          0x65ebca <exec_byte_code+1323>, 0x65ebd3 <exec_byte_code+1332>, 0x65eb7f <exec_byte_code+1248>, 0x65f007 <exec_byte_code+2408>, 0x65f009 <exec_byte_code+2410>, 
          0x65f00b <exec_byte_code+2412>, 0x65f00d <exec_byte_code+2414>, 0x65f00f <exec_byte_code+2416>, 0x65f00f <exec_byte_code+2416>, 0x65f059 <exec_byte_code+2490>, 
          0x65f018 <exec_byte_code+2425>, 0x65f267 <exec_byte_code+3016>, 0x65f269 <exec_byte_code+3018>, 0x65f26b <exec_byte_code+3020>, 0x65f26d <exec_byte_code+3022>, 
          0x65f26f <exec_byte_code+3024>, 0x65f26f <exec_byte_code+3024>, 0x65f206 <exec_byte_code+2919>, 0x65f226 <exec_byte_code+2951>, 0x65f34d <exec_byte_code+3246>, 
          0x65f34f <exec_byte_code+3248>, 0x65f351 <exec_byte_code+3250>, 0x65f353 <exec_byte_code+3252>, 0x65f355 <exec_byte_code+3254>, 0x65f355 <exec_byte_code+3254>, 
          0x65f2ec <exec_byte_code+3149>, 0x65f30c <exec_byte_code+3181>, 0x65f43b <exec_byte_code+3484>, 0x65f43d <exec_byte_code+3486>, 0x65f43f <exec_byte_code+3488>, 
          0x65f441 <exec_byte_code+3490>, 0x65f443 <exec_byte_code+3492>, 0x65f443 <exec_byte_code+3492>, 0x65f3da <exec_byte_code+3387>, 0x65f3fa <exec_byte_code+3419>, 
          0x65fe84 <exec_byte_code+6117>, 0x65fd4c <exec_byte_code+5805>, 0x65fd40 <exec_byte_code+5793>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x66011e <exec_byte_code+6783>, 0x66022f <exec_byte_code+7056>, 
          0x6602ae <exec_byte_code+7183>, 0x66032e <exec_byte_code+7311>, 0x6603af <exec_byte_code+7440>, 0x65ee3e <exec_byte_code+1951>, 0x65eede <exec_byte_code+2111>, 
          0x660448 <exec_byte_code+7593>, 0x65ed96 <exec_byte_code+1783>, 0x65ef5e <exec_byte_code+2239>, 0x6604cf <exec_byte_code+7728>, 0x66054f <exec_byte_code+7856>, 
          0x6605a9 <exec_byte_code+7946>, 0x660629 <exec_byte_code+8074>, 0x660690 <exec_byte_code+8177>, 0x66079a <exec_byte_code+8443>, 0x6607f4 <exec_byte_code+8533>, 
          0x660874 <exec_byte_code+8661>, 0x660917 <exec_byte_code+8824>, 0x660971 <exec_byte_code+8914>, 0x6609cb <exec_byte_code+9004>, 0x660a4b <exec_byte_code+9132>, 
          0x660acb <exec_byte_code+9260>, 0x660b4b <exec_byte_code+9388>, 0x660bee <exec_byte_code+9551>, 0x660c55 <exec_byte_code+9654>, 0x660cbc <exec_byte_code+9757>, 
          0x660dc6 <exec_byte_code+10023>, 0x660e5b <exec_byte_code+10172>, 0x660ef0 <exec_byte_code+10321>, 0x6610da <exec_byte_code+10811>, 
          0x66115f <exec_byte_code+10944>, 0x6611e4 <exec_byte_code+11077>, 0x661269 <exec_byte_code+11210>, 0x6612ee <exec_byte_code+11343>, 
          0x661355 <exec_byte_code+11446>, 0x6613ee <exec_byte_code+11599>, 0x661455 <exec_byte_code+11702>, 0x6614bc <exec_byte_code+11805>, 
          0x661523 <exec_byte_code+11908>, 0x661671 <exec_byte_code+12242>, 0x65fb84 <exec_byte_code+5349>, 0x6616e1 <exec_byte_code+12354>, 
          0x66173b <exec_byte_code+12444>, 0x66183d <exec_byte_code+12702>, 0x6618b8 <exec_byte_code+12825>, 0x661928 <exec_byte_code+12937>, 
          0x661982 <exec_byte_code+13027>, 0x6619da <exec_byte_code+13115>, 0x661a32 <exec_byte_code+13203>, 0x661a92 <exec_byte_code+13299>, 
          0x662cb6 <exec_byte_code+17943>, 0x661afc <exec_byte_code+13405>, 0x661b54 <exec_byte_code+13493>, 0x661bac <exec_byte_code+13581>, 
          0x661c04 <exec_byte_code+13669>, 0x661c5c <exec_byte_code+13757>, 0x661cb4 <exec_byte_code+13845>, 0x65fb84 <exec_byte_code+5349>, 
          0x662cb6 <exec_byte_code+17943>, 0x661d0e <exec_byte_code+13935>, 0x661d75 <exec_byte_code+14038>, 0x661dcf <exec_byte_code+14128>, 
          0x661e29 <exec_byte_code+14218>, 0x661ea9 <exec_byte_code+14346>, 0x661f29 <exec_byte_code+14474>, 0x661f83 <exec_byte_code+14564>, 
          0x662098 <exec_byte_code+14841>, 0x662118 <exec_byte_code+14969>, 0x662198 <exec_byte_code+15097>, 0x662218 <exec_byte_code+15225>, 
          0x662270 <exec_byte_code+15313>, 0x662cb6 <exec_byte_code+17943>, 0x65fa85 <exec_byte_code+5094>, 0x65f513 <exec_byte_code+3700>, 0x65ece3 <exec_byte_code+1604>, 
          0x65f602 <exec_byte_code+3939>, 0x65f6a7 <exec_byte_code+4104>, 0x65f749 <exec_byte_code+4266>, 0x65fa27 <exec_byte_code+5000>, 0x65fa3f <exec_byte_code+5024>, 
          0x65f19e <exec_byte_code+2815>, 0x65fb2f <exec_byte_code+5264>, 0x65fbc7 <exec_byte_code+5416>, 0x65fc64 <exec_byte_code+5573>, 0x65fcb9 <exec_byte_code+5658>, 
          0x65fedc <exec_byte_code+6205>, 0x65ff6b <exec_byte_code+6348>, 0x66000e <exec_byte_code+6511>, 0x660083 <exec_byte_code+6628>, 0x65f4b6 <exec_byte_code+3607>, 
          0x6622ca <exec_byte_code+15403>, 0x66236d <exec_byte_code+15566>, 0x6623c7 <exec_byte_code+15656>, 0x662421 <exec_byte_code+15746>, 
          0x66247b <exec_byte_code+15836>, 0x6624d5 <exec_byte_code+15926>, 0x662555 <exec_byte_code+16054>, 0x6625d5 <exec_byte_code+16182>, 
          0x662655 <exec_byte_code+16310>, 0x6626d5 <exec_byte_code+16438>, 0x662840 <exec_byte_code+16801>, 0x6628c0 <exec_byte_code+16929>, 
          0x662940 <exec_byte_code+17057>, 0x66299a <exec_byte_code+17147>, 0x662a1a <exec_byte_code+17275>, 0x662a9a <exec_byte_code+17403>, 
          0x662af4 <exec_byte_code+17493>, 0x662b4e <exec_byte_code+17583>, 0x66158a <exec_byte_code+12011>, 0x6615f1 <exec_byte_code+12114>, 
          0x662bb5 <exec_byte_code+17686>, 0x662c36 <exec_byte_code+17815>, 0x662cb6 <exec_byte_code+17943>, 0x65f7eb <exec_byte_code+4428>, 0x65f811 <exec_byte_code+4466>, 
          0x65f898 <exec_byte_code+4601>, 0x65f91f <exec_byte_code+4736>, 0x65f9a3 <exec_byte_code+4868>, 0x6606f7 <exec_byte_code+8280>, 0x660d23 <exec_byte_code+9860>, 
          0x66179a <exec_byte_code+12539>, 0x662ee7 <exec_byte_code+18504>, 0x662f71 <exec_byte_code+18642>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x663028 <exec_byte_code+18825>, 0x6630d9 <exec_byte_code+19002>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x6632df <exec_byte_code+19520> <repeats 64 times>}
        const_length = 37
        bytestr_length = 210
        vectorp = 0x7f7b768
        quitcounter = 1 '\001'
        stack_items = 13
        sa_avail = 16070
        sa_count = 13
        sa_must_free = false
        stack_base = 0x7fffffffa120
        stack_lim = 0x7fffffffa188
        top = 0x7fffffffa160
        void_stack_lim = 0x7fffffffa188
        bytestr_data = 0x7fffffffa188 "\304", <incomplete sequence \313>
        pc = 0x7fffffffa19a "\310\311\006\006\237\"\210\006\006\204G"
        count = 13
        result = XIL(0x55ec04)
#32 0x00000000006143b4 in funcall_lambda (fun=XIL(0x7f7b895), nargs=7, arg_vector=0x7fffffffa6d0) at eval.c:2967
        size = 5
        val = XIL(0x55e61d)
        syms_left = make_number(1799)
        next = XIL(0x7f7b890)
        lexenv = XIL(0x7fffffffa628)
        count = 13
        i = 81604355616
        optional = 255
        rest = 255
        previous_optional_or_rest = 255
#33 0x00000000006139a6 in Ffuncall (nargs=8, args=0x7fffffffa6c8) at eval.c:2768
        fun = XIL(0x7f7b895)
        original_fun = XIL(0x8516d20)
        funcar = XIL(0x938fe64)
        numargs = 7
        val = XIL(0x47a0)
        count = 12
#34 0x000000000065f38f in exec_byte_code (bytestr=XIL(0x8fdf194), vector=XIL(0x938e8d5), maxdepth=make_number(8), args_template=make_number(0), nargs=0, args=0x7fffffffaba0)
    at bytecode.c:629
        op = 7
        type = CATCHER
        targets = {0x662cb6 <exec_byte_code+17943>, 0x662d1b <exec_byte_code+18044>, 0x662d1d <exec_byte_code+18046>, 0x662d1f <exec_byte_code+18048>, 
          0x662d21 <exec_byte_code+18050>, 0x662d21 <exec_byte_code+18050>, 0x662d9e <exec_byte_code+18175>, 0x662e2d <exec_byte_code+18318>, 
          0x65ebc2 <exec_byte_code+1315>, 0x65ebc4 <exec_byte_code+1317>, 0x65ebc6 <exec_byte_code+1319>, 0x65ebc8 <exec_byte_code+1321>, 0x65ebca <exec_byte_code+1323>, 
          0x65ebca <exec_byte_code+1323>, 0x65ebd3 <exec_byte_code+1332>, 0x65eb7f <exec_byte_code+1248>, 0x65f007 <exec_byte_code+2408>, 0x65f009 <exec_byte_code+2410>, 
          0x65f00b <exec_byte_code+2412>, 0x65f00d <exec_byte_code+2414>, 0x65f00f <exec_byte_code+2416>, 0x65f00f <exec_byte_code+2416>, 0x65f059 <exec_byte_code+2490>, 
          0x65f018 <exec_byte_code+2425>, 0x65f267 <exec_byte_code+3016>, 0x65f269 <exec_byte_code+3018>, 0x65f26b <exec_byte_code+3020>, 0x65f26d <exec_byte_code+3022>, 
          0x65f26f <exec_byte_code+3024>, 0x65f26f <exec_byte_code+3024>, 0x65f206 <exec_byte_code+2919>, 0x65f226 <exec_byte_code+2951>, 0x65f34d <exec_byte_code+3246>, 
          0x65f34f <exec_byte_code+3248>, 0x65f351 <exec_byte_code+3250>, 0x65f353 <exec_byte_code+3252>, 0x65f355 <exec_byte_code+3254>, 0x65f355 <exec_byte_code+3254>, 
          0x65f2ec <exec_byte_code+3149>, 0x65f30c <exec_byte_code+3181>, 0x65f43b <exec_byte_code+3484>, 0x65f43d <exec_byte_code+3486>, 0x65f43f <exec_byte_code+3488>, 
          0x65f441 <exec_byte_code+3490>, 0x65f443 <exec_byte_code+3492>, 0x65f443 <exec_byte_code+3492>, 0x65f3da <exec_byte_code+3387>, 0x65f3fa <exec_byte_code+3419>, 
          0x65fe84 <exec_byte_code+6117>, 0x65fd4c <exec_byte_code+5805>, 0x65fd40 <exec_byte_code+5793>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x66011e <exec_byte_code+6783>, 0x66022f <exec_byte_code+7056>, 
          0x6602ae <exec_byte_code+7183>, 0x66032e <exec_byte_code+7311>, 0x6603af <exec_byte_code+7440>, 0x65ee3e <exec_byte_code+1951>, 0x65eede <exec_byte_code+2111>, 
          0x660448 <exec_byte_code+7593>, 0x65ed96 <exec_byte_code+1783>, 0x65ef5e <exec_byte_code+2239>, 0x6604cf <exec_byte_code+7728>, 0x66054f <exec_byte_code+7856>, 
          0x6605a9 <exec_byte_code+7946>, 0x660629 <exec_byte_code+8074>, 0x660690 <exec_byte_code+8177>, 0x66079a <exec_byte_code+8443>, 0x6607f4 <exec_byte_code+8533>, 
          0x660874 <exec_byte_code+8661>, 0x660917 <exec_byte_code+8824>, 0x660971 <exec_byte_code+8914>, 0x6609cb <exec_byte_code+9004>, 0x660a4b <exec_byte_code+9132>, 
          0x660acb <exec_byte_code+9260>, 0x660b4b <exec_byte_code+9388>, 0x660bee <exec_byte_code+9551>, 0x660c55 <exec_byte_code+9654>, 0x660cbc <exec_byte_code+9757>, 
          0x660dc6 <exec_byte_code+10023>, 0x660e5b <exec_byte_code+10172>, 0x660ef0 <exec_byte_code+10321>, 0x6610da <exec_byte_code+10811>, 
          0x66115f <exec_byte_code+10944>, 0x6611e4 <exec_byte_code+11077>, 0x661269 <exec_byte_code+11210>, 0x6612ee <exec_byte_code+11343>, 
          0x661355 <exec_byte_code+11446>, 0x6613ee <exec_byte_code+11599>, 0x661455 <exec_byte_code+11702>, 0x6614bc <exec_byte_code+11805>, 
          0x661523 <exec_byte_code+11908>, 0x661671 <exec_byte_code+12242>, 0x65fb84 <exec_byte_code+5349>, 0x6616e1 <exec_byte_code+12354>, 
          0x66173b <exec_byte_code+12444>, 0x66183d <exec_byte_code+12702>, 0x6618b8 <exec_byte_code+12825>, 0x661928 <exec_byte_code+12937>, 
          0x661982 <exec_byte_code+13027>, 0x6619da <exec_byte_code+13115>, 0x661a32 <exec_byte_code+13203>, 0x661a92 <exec_byte_code+13299>, 
          0x662cb6 <exec_byte_code+17943>, 0x661afc <exec_byte_code+13405>, 0x661b54 <exec_byte_code+13493>, 0x661bac <exec_byte_code+13581>, 
          0x661c04 <exec_byte_code+13669>, 0x661c5c <exec_byte_code+13757>, 0x661cb4 <exec_byte_code+13845>, 0x65fb84 <exec_byte_code+5349>, 
          0x662cb6 <exec_byte_code+17943>, 0x661d0e <exec_byte_code+13935>, 0x661d75 <exec_byte_code+14038>, 0x661dcf <exec_byte_code+14128>, 
          0x661e29 <exec_byte_code+14218>, 0x661ea9 <exec_byte_code+14346>, 0x661f29 <exec_byte_code+14474>, 0x661f83 <exec_byte_code+14564>, 
          0x662098 <exec_byte_code+14841>, 0x662118 <exec_byte_code+14969>, 0x662198 <exec_byte_code+15097>, 0x662218 <exec_byte_code+15225>, 
          0x662270 <exec_byte_code+15313>, 0x662cb6 <exec_byte_code+17943>, 0x65fa85 <exec_byte_code+5094>, 0x65f513 <exec_byte_code+3700>, 0x65ece3 <exec_byte_code+1604>, 
          0x65f602 <exec_byte_code+3939>, 0x65f6a7 <exec_byte_code+4104>, 0x65f749 <exec_byte_code+4266>, 0x65fa27 <exec_byte_code+5000>, 0x65fa3f <exec_byte_code+5024>, 
          0x65f19e <exec_byte_code+2815>, 0x65fb2f <exec_byte_code+5264>, 0x65fbc7 <exec_byte_code+5416>, 0x65fc64 <exec_byte_code+5573>, 0x65fcb9 <exec_byte_code+5658>, 
          0x65fedc <exec_byte_code+6205>, 0x65ff6b <exec_byte_code+6348>, 0x66000e <exec_byte_code+6511>, 0x660083 <exec_byte_code+6628>, 0x65f4b6 <exec_byte_code+3607>, 
          0x6622ca <exec_byte_code+15403>, 0x66236d <exec_byte_code+15566>, 0x6623c7 <exec_byte_code+15656>, 0x662421 <exec_byte_code+15746>, 
          0x66247b <exec_byte_code+15836>, 0x6624d5 <exec_byte_code+15926>, 0x662555 <exec_byte_code+16054>, 0x6625d5 <exec_byte_code+16182>, 
          0x662655 <exec_byte_code+16310>, 0x6626d5 <exec_byte_code+16438>, 0x662840 <exec_byte_code+16801>, 0x6628c0 <exec_byte_code+16929>, 
          0x662940 <exec_byte_code+17057>, 0x66299a <exec_byte_code+17147>, 0x662a1a <exec_byte_code+17275>, 0x662a9a <exec_byte_code+17403>, 
          0x662af4 <exec_byte_code+17493>, 0x662b4e <exec_byte_code+17583>, 0x66158a <exec_byte_code+12011>, 0x6615f1 <exec_byte_code+12114>, 
          0x662bb5 <exec_byte_code+17686>, 0x662c36 <exec_byte_code+17815>, 0x662cb6 <exec_byte_code+17943>, 0x65f7eb <exec_byte_code+4428>, 0x65f811 <exec_byte_code+4466>, 
          0x65f898 <exec_byte_code+4601>, 0x65f91f <exec_byte_code+4736>, 0x65f9a3 <exec_byte_code+4868>, 0x6606f7 <exec_byte_code+8280>, 0x660d23 <exec_byte_code+9860>, 
          0x66179a <exec_byte_code+12539>, 0x662ee7 <exec_byte_code+18504>, 0x662f71 <exec_byte_code+18642>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x663028 <exec_byte_code+18825>, 0x6630d9 <exec_byte_code+19002>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x6632df <exec_byte_code+19520> <repeats 64 times>}
        const_length = 13
        bytestr_length = 46
        vectorp = 0x938e8d8
        quitcounter = 1 '\001'
        stack_items = 9
        sa_avail = 16266
        sa_count = 10
        sa_must_free = false
        stack_base = 0x7fffffffa6c0
        stack_lim = 0x7fffffffa708
        top = 0x7fffffffa6c8
        void_stack_lim = 0x7fffffffa708
        bytestr_data = 0x7fffffffa708 "r\310\016\v!q\210Ţ\203\030"
        pc = 0x7fffffffa734 "*\207"
        count = 10
        result = XIL(0x57d4723)
#35 0x00000000006143b4 in funcall_lambda (fun=XIL(0x938e945), nargs=0, arg_vector=0x7fffffffaba0) at eval.c:2967
        size = 4
        val = XIL(0x55e61d)
        syms_left = make_number(0)
        next = XIL(0x938e940)
        lexenv = XIL(0x7fffffffaaf8)
        count = 10
        i = 81604356848
        optional = false
        rest = false
        previous_optional_or_rest = false
#36 0x00000000006139a6 in Ffuncall (nargs=1, args=0x7fffffffab98) at eval.c:2768
        fun = XIL(0x938e945)
        original_fun = XIL(0x938e945)
        funcar = XIL(0x7fffffffab70)
        numargs = 0
        val = XIL(0x610471)
        count = 9
#37 0x000000000065f38f in exec_byte_code (bytestr=XIL(0x8fdec64), vector=XIL(0x7bc16d5), maxdepth=make_number(6), args_template=make_number(257), nargs=1, 
    args=0x7fffffffb0e8) at bytecode.c:629
        op = 0
        type = CONDITION_CASE
        targets = {0x662cb6 <exec_byte_code+17943>, 0x662d1b <exec_byte_code+18044>, 0x662d1d <exec_byte_code+18046>, 0x662d1f <exec_byte_code+18048>, 
          0x662d21 <exec_byte_code+18050>, 0x662d21 <exec_byte_code+18050>, 0x662d9e <exec_byte_code+18175>, 0x662e2d <exec_byte_code+18318>, 
          0x65ebc2 <exec_byte_code+1315>, 0x65ebc4 <exec_byte_code+1317>, 0x65ebc6 <exec_byte_code+1319>, 0x65ebc8 <exec_byte_code+1321>, 0x65ebca <exec_byte_code+1323>, 
          0x65ebca <exec_byte_code+1323>, 0x65ebd3 <exec_byte_code+1332>, 0x65eb7f <exec_byte_code+1248>, 0x65f007 <exec_byte_code+2408>, 0x65f009 <exec_byte_code+2410>, 
          0x65f00b <exec_byte_code+2412>, 0x65f00d <exec_byte_code+2414>, 0x65f00f <exec_byte_code+2416>, 0x65f00f <exec_byte_code+2416>, 0x65f059 <exec_byte_code+2490>, 
          0x65f018 <exec_byte_code+2425>, 0x65f267 <exec_byte_code+3016>, 0x65f269 <exec_byte_code+3018>, 0x65f26b <exec_byte_code+3020>, 0x65f26d <exec_byte_code+3022>, 
          0x65f26f <exec_byte_code+3024>, 0x65f26f <exec_byte_code+3024>, 0x65f206 <exec_byte_code+2919>, 0x65f226 <exec_byte_code+2951>, 0x65f34d <exec_byte_code+3246>, 
          0x65f34f <exec_byte_code+3248>, 0x65f351 <exec_byte_code+3250>, 0x65f353 <exec_byte_code+3252>, 0x65f355 <exec_byte_code+3254>, 0x65f355 <exec_byte_code+3254>, 
          0x65f2ec <exec_byte_code+3149>, 0x65f30c <exec_byte_code+3181>, 0x65f43b <exec_byte_code+3484>, 0x65f43d <exec_byte_code+3486>, 0x65f43f <exec_byte_code+3488>, 
          0x65f441 <exec_byte_code+3490>, 0x65f443 <exec_byte_code+3492>, 0x65f443 <exec_byte_code+3492>, 0x65f3da <exec_byte_code+3387>, 0x65f3fa <exec_byte_code+3419>, 
          0x65fe84 <exec_byte_code+6117>, 0x65fd4c <exec_byte_code+5805>, 0x65fd40 <exec_byte_code+5793>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x66011e <exec_byte_code+6783>, 0x66022f <exec_byte_code+7056>, 
          0x6602ae <exec_byte_code+7183>, 0x66032e <exec_byte_code+7311>, 0x6603af <exec_byte_code+7440>, 0x65ee3e <exec_byte_code+1951>, 0x65eede <exec_byte_code+2111>, 
          0x660448 <exec_byte_code+7593>, 0x65ed96 <exec_byte_code+1783>, 0x65ef5e <exec_byte_code+2239>, 0x6604cf <exec_byte_code+7728>, 0x66054f <exec_byte_code+7856>, 
          0x6605a9 <exec_byte_code+7946>, 0x660629 <exec_byte_code+8074>, 0x660690 <exec_byte_code+8177>, 0x66079a <exec_byte_code+8443>, 0x6607f4 <exec_byte_code+8533>, 
          0x660874 <exec_byte_code+8661>, 0x660917 <exec_byte_code+8824>, 0x660971 <exec_byte_code+8914>, 0x6609cb <exec_byte_code+9004>, 0x660a4b <exec_byte_code+9132>, 
          0x660acb <exec_byte_code+9260>, 0x660b4b <exec_byte_code+9388>, 0x660bee <exec_byte_code+9551>, 0x660c55 <exec_byte_code+9654>, 0x660cbc <exec_byte_code+9757>, 
          0x660dc6 <exec_byte_code+10023>, 0x660e5b <exec_byte_code+10172>, 0x660ef0 <exec_byte_code+10321>, 0x6610da <exec_byte_code+10811>, 
          0x66115f <exec_byte_code+10944>, 0x6611e4 <exec_byte_code+11077>, 0x661269 <exec_byte_code+11210>, 0x6612ee <exec_byte_code+11343>, 
          0x661355 <exec_byte_code+11446>, 0x6613ee <exec_byte_code+11599>, 0x661455 <exec_byte_code+11702>, 0x6614bc <exec_byte_code+11805>, 
          0x661523 <exec_byte_code+11908>, 0x661671 <exec_byte_code+12242>, 0x65fb84 <exec_byte_code+5349>, 0x6616e1 <exec_byte_code+12354>, 
          0x66173b <exec_byte_code+12444>, 0x66183d <exec_byte_code+12702>, 0x6618b8 <exec_byte_code+12825>, 0x661928 <exec_byte_code+12937>, 
          0x661982 <exec_byte_code+13027>, 0x6619da <exec_byte_code+13115>, 0x661a32 <exec_byte_code+13203>, 0x661a92 <exec_byte_code+13299>, 
          0x662cb6 <exec_byte_code+17943>, 0x661afc <exec_byte_code+13405>, 0x661b54 <exec_byte_code+13493>, 0x661bac <exec_byte_code+13581>, 
          0x661c04 <exec_byte_code+13669>, 0x661c5c <exec_byte_code+13757>, 0x661cb4 <exec_byte_code+13845>, 0x65fb84 <exec_byte_code+5349>, 
          0x662cb6 <exec_byte_code+17943>, 0x661d0e <exec_byte_code+13935>, 0x661d75 <exec_byte_code+14038>, 0x661dcf <exec_byte_code+14128>, 
          0x661e29 <exec_byte_code+14218>, 0x661ea9 <exec_byte_code+14346>, 0x661f29 <exec_byte_code+14474>, 0x661f83 <exec_byte_code+14564>, 
          0x662098 <exec_byte_code+14841>, 0x662118 <exec_byte_code+14969>, 0x662198 <exec_byte_code+15097>, 0x662218 <exec_byte_code+15225>, 
          0x662270 <exec_byte_code+15313>, 0x662cb6 <exec_byte_code+17943>, 0x65fa85 <exec_byte_code+5094>, 0x65f513 <exec_byte_code+3700>, 0x65ece3 <exec_byte_code+1604>, 
          0x65f602 <exec_byte_code+3939>, 0x65f6a7 <exec_byte_code+4104>, 0x65f749 <exec_byte_code+4266>, 0x65fa27 <exec_byte_code+5000>, 0x65fa3f <exec_byte_code+5024>, 
          0x65f19e <exec_byte_code+2815>, 0x65fb2f <exec_byte_code+5264>, 0x65fbc7 <exec_byte_code+5416>, 0x65fc64 <exec_byte_code+5573>, 0x65fcb9 <exec_byte_code+5658>, 
          0x65fedc <exec_byte_code+6205>, 0x65ff6b <exec_byte_code+6348>, 0x66000e <exec_byte_code+6511>, 0x660083 <exec_byte_code+6628>, 0x65f4b6 <exec_byte_code+3607>, 
          0x6622ca <exec_byte_code+15403>, 0x66236d <exec_byte_code+15566>, 0x6623c7 <exec_byte_code+15656>, 0x662421 <exec_byte_code+15746>, 
          0x66247b <exec_byte_code+15836>, 0x6624d5 <exec_byte_code+15926>, 0x662555 <exec_byte_code+16054>, 0x6625d5 <exec_byte_code+16182>, 
          0x662655 <exec_byte_code+16310>, 0x6626d5 <exec_byte_code+16438>, 0x662840 <exec_byte_code+16801>, 0x6628c0 <exec_byte_code+16929>, 
          0x662940 <exec_byte_code+17057>, 0x66299a <exec_byte_code+17147>, 0x662a1a <exec_byte_code+17275>, 0x662a9a <exec_byte_code+17403>, 
          0x662af4 <exec_byte_code+17493>, 0x662b4e <exec_byte_code+17583>, 0x66158a <exec_byte_code+12011>, 0x6615f1 <exec_byte_code+12114>, 
          0x662bb5 <exec_byte_code+17686>, 0x662c36 <exec_byte_code+17815>, 0x662cb6 <exec_byte_code+17943>, 0x65f7eb <exec_byte_code+4428>, 0x65f811 <exec_byte_code+4466>, 
          0x65f898 <exec_byte_code+4601>, 0x65f91f <exec_byte_code+4736>, 0x65f9a3 <exec_byte_code+4868>, 0x6606f7 <exec_byte_code+8280>, 0x660d23 <exec_byte_code+9860>, 
          0x66179a <exec_byte_code+12539>, 0x662ee7 <exec_byte_code+18504>, 0x662f71 <exec_byte_code+18642>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x663028 <exec_byte_code+18825>, 0x6630d9 <exec_byte_code+19002>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x6632df <exec_byte_code+19520> <repeats 64 times>}
        const_length = 5
        bytestr_length = 27
        vectorp = 0x7bc16d8
        quitcounter = 1 '\001'
        stack_items = 7
        sa_avail = 16301
        sa_count = 9
        sa_must_free = false
        stack_base = 0x7fffffffab80
        stack_lim = 0x7fffffffabb8
        top = 0x7fffffffab98
        void_stack_lim = 0x7fffffffabb8
        bytestr_data = 0x7fffffffabb8 "\300\001\301\"\302\002\301\303#\210\211\205\032"
        pc = 0x7fffffffabcc "0\202\032"
        count = 9
        result = XIL(0)
#38 0x00000000006143b4 in funcall_lambda (fun=XIL(0x7bc1705), nargs=1, arg_vector=0x7fffffffb0e0) at eval.c:2967
        size = 5
        val = XIL(0x55e61d)
        syms_left = make_number(257)
        next = XIL(0x7bc1700)
        lexenv = XIL(0x7fffffffafa8)
        count = 9
        i = 81604358048
        optional = false
        rest = false
        previous_optional_or_rest = false
#39 0x00000000006139a6 in Ffuncall (nargs=2, args=0x7fffffffb0d8) at eval.c:2768
        fun = XIL(0x7bc1705)
        original_fun = XIL(0x85156b0)
        funcar = XIL(0xac8660)
        numargs = 1
        val = XIL(0)
        count = 8
#40 0x000000000065f38f in exec_byte_code (bytestr=XIL(0x8fdec84), vector=XIL(0x7f7b455), maxdepth=make_number(34), args_template=make_number(514), nargs=2, 
    args=0x7fffffffb998) at bytecode.c:629
        op = 1
        type = CONDITION_CASE
        targets = {0x662cb6 <exec_byte_code+17943>, 0x662d1b <exec_byte_code+18044>, 0x662d1d <exec_byte_code+18046>, 0x662d1f <exec_byte_code+18048>, 
          0x662d21 <exec_byte_code+18050>, 0x662d21 <exec_byte_code+18050>, 0x662d9e <exec_byte_code+18175>, 0x662e2d <exec_byte_code+18318>, 
          0x65ebc2 <exec_byte_code+1315>, 0x65ebc4 <exec_byte_code+1317>, 0x65ebc6 <exec_byte_code+1319>, 0x65ebc8 <exec_byte_code+1321>, 0x65ebca <exec_byte_code+1323>, 
          0x65ebca <exec_byte_code+1323>, 0x65ebd3 <exec_byte_code+1332>, 0x65eb7f <exec_byte_code+1248>, 0x65f007 <exec_byte_code+2408>, 0x65f009 <exec_byte_code+2410>, 
          0x65f00b <exec_byte_code+2412>, 0x65f00d <exec_byte_code+2414>, 0x65f00f <exec_byte_code+2416>, 0x65f00f <exec_byte_code+2416>, 0x65f059 <exec_byte_code+2490>, 
          0x65f018 <exec_byte_code+2425>, 0x65f267 <exec_byte_code+3016>, 0x65f269 <exec_byte_code+3018>, 0x65f26b <exec_byte_code+3020>, 0x65f26d <exec_byte_code+3022>, 
          0x65f26f <exec_byte_code+3024>, 0x65f26f <exec_byte_code+3024>, 0x65f206 <exec_byte_code+2919>, 0x65f226 <exec_byte_code+2951>, 0x65f34d <exec_byte_code+3246>, 
          0x65f34f <exec_byte_code+3248>, 0x65f351 <exec_byte_code+3250>, 0x65f353 <exec_byte_code+3252>, 0x65f355 <exec_byte_code+3254>, 0x65f355 <exec_byte_code+3254>, 
          0x65f2ec <exec_byte_code+3149>, 0x65f30c <exec_byte_code+3181>, 0x65f43b <exec_byte_code+3484>, 0x65f43d <exec_byte_code+3486>, 0x65f43f <exec_byte_code+3488>, 
          0x65f441 <exec_byte_code+3490>, 0x65f443 <exec_byte_code+3492>, 0x65f443 <exec_byte_code+3492>, 0x65f3da <exec_byte_code+3387>, 0x65f3fa <exec_byte_code+3419>, 
          0x65fe84 <exec_byte_code+6117>, 0x65fd4c <exec_byte_code+5805>, 0x65fd40 <exec_byte_code+5793>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x66011e <exec_byte_code+6783>, 0x66022f <exec_byte_code+7056>, 
          0x6602ae <exec_byte_code+7183>, 0x66032e <exec_byte_code+7311>, 0x6603af <exec_byte_code+7440>, 0x65ee3e <exec_byte_code+1951>, 0x65eede <exec_byte_code+2111>, 
          0x660448 <exec_byte_code+7593>, 0x65ed96 <exec_byte_code+1783>, 0x65ef5e <exec_byte_code+2239>, 0x6604cf <exec_byte_code+7728>, 0x66054f <exec_byte_code+7856>, 
          0x6605a9 <exec_byte_code+7946>, 0x660629 <exec_byte_code+8074>, 0x660690 <exec_byte_code+8177>, 0x66079a <exec_byte_code+8443>, 0x6607f4 <exec_byte_code+8533>, 
          0x660874 <exec_byte_code+8661>, 0x660917 <exec_byte_code+8824>, 0x660971 <exec_byte_code+8914>, 0x6609cb <exec_byte_code+9004>, 0x660a4b <exec_byte_code+9132>, 
          0x660acb <exec_byte_code+9260>, 0x660b4b <exec_byte_code+9388>, 0x660bee <exec_byte_code+9551>, 0x660c55 <exec_byte_code+9654>, 0x660cbc <exec_byte_code+9757>, 
          0x660dc6 <exec_byte_code+10023>, 0x660e5b <exec_byte_code+10172>, 0x660ef0 <exec_byte_code+10321>, 0x6610da <exec_byte_code+10811>, 
          0x66115f <exec_byte_code+10944>, 0x6611e4 <exec_byte_code+11077>, 0x661269 <exec_byte_code+11210>, 0x6612ee <exec_byte_code+11343>, 
          0x661355 <exec_byte_code+11446>, 0x6613ee <exec_byte_code+11599>, 0x661455 <exec_byte_code+11702>, 0x6614bc <exec_byte_code+11805>, 
          0x661523 <exec_byte_code+11908>, 0x661671 <exec_byte_code+12242>, 0x65fb84 <exec_byte_code+5349>, 0x6616e1 <exec_byte_code+12354>, 
          0x66173b <exec_byte_code+12444>, 0x66183d <exec_byte_code+12702>, 0x6618b8 <exec_byte_code+12825>, 0x661928 <exec_byte_code+12937>, 
          0x661982 <exec_byte_code+13027>, 0x6619da <exec_byte_code+13115>, 0x661a32 <exec_byte_code+13203>, 0x661a92 <exec_byte_code+13299>, 
          0x662cb6 <exec_byte_code+17943>, 0x661afc <exec_byte_code+13405>, 0x661b54 <exec_byte_code+13493>, 0x661bac <exec_byte_code+13581>, 
          0x661c04 <exec_byte_code+13669>, 0x661c5c <exec_byte_code+13757>, 0x661cb4 <exec_byte_code+13845>, 0x65fb84 <exec_byte_code+5349>, 
          0x662cb6 <exec_byte_code+17943>, 0x661d0e <exec_byte_code+13935>, 0x661d75 <exec_byte_code+14038>, 0x661dcf <exec_byte_code+14128>, 
          0x661e29 <exec_byte_code+14218>, 0x661ea9 <exec_byte_code+14346>, 0x661f29 <exec_byte_code+14474>, 0x661f83 <exec_byte_code+14564>, 
          0x662098 <exec_byte_code+14841>, 0x662118 <exec_byte_code+14969>, 0x662198 <exec_byte_code+15097>, 0x662218 <exec_byte_code+15225>, 
          0x662270 <exec_byte_code+15313>, 0x662cb6 <exec_byte_code+17943>, 0x65fa85 <exec_byte_code+5094>, 0x65f513 <exec_byte_code+3700>, 0x65ece3 <exec_byte_code+1604>, 
          0x65f602 <exec_byte_code+3939>, 0x65f6a7 <exec_byte_code+4104>, 0x65f749 <exec_byte_code+4266>, 0x65fa27 <exec_byte_code+5000>, 0x65fa3f <exec_byte_code+5024>, 
          0x65f19e <exec_byte_code+2815>, 0x65fb2f <exec_byte_code+5264>, 0x65fbc7 <exec_byte_code+5416>, 0x65fc64 <exec_byte_code+5573>, 0x65fcb9 <exec_byte_code+5658>, 
          0x65fedc <exec_byte_code+6205>, 0x65ff6b <exec_byte_code+6348>, 0x66000e <exec_byte_code+6511>, 0x660083 <exec_byte_code+6628>, 0x65f4b6 <exec_byte_code+3607>, 
          0x6622ca <exec_byte_code+15403>, 0x66236d <exec_byte_code+15566>, 0x6623c7 <exec_byte_code+15656>, 0x662421 <exec_byte_code+15746>, 
          0x66247b <exec_byte_code+15836>, 0x6624d5 <exec_byte_code+15926>, 0x662555 <exec_byte_code+16054>, 0x6625d5 <exec_byte_code+16182>, 
          0x662655 <exec_byte_code+16310>, 0x6626d5 <exec_byte_code+16438>, 0x662840 <exec_byte_code+16801>, 0x6628c0 <exec_byte_code+16929>, 
          0x662940 <exec_byte_code+17057>, 0x66299a <exec_byte_code+17147>, 0x662a1a <exec_byte_code+17275>, 0x662a9a <exec_byte_code+17403>, 
          0x662af4 <exec_byte_code+17493>, 0x662b4e <exec_byte_code+17583>, 0x66158a <exec_byte_code+12011>, 0x6615f1 <exec_byte_code+12114>, 
          0x662bb5 <exec_byte_code+17686>, 0x662c36 <exec_byte_code+17815>, 0x662cb6 <exec_byte_code+17943>, 0x65f7eb <exec_byte_code+4428>, 0x65f811 <exec_byte_code+4466>, 
          0x65f898 <exec_byte_code+4601>, 0x65f91f <exec_byte_code+4736>, 0x65f9a3 <exec_byte_code+4868>, 0x6606f7 <exec_byte_code+8280>, 0x660d23 <exec_byte_code+9860>, 
          0x66179a <exec_byte_code+12539>, 0x662ee7 <exec_byte_code+18504>, 0x662f71 <exec_byte_code+18642>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x663028 <exec_byte_code+18825>, 0x6630d9 <exec_byte_code+19002>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 
          0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x662cb6 <exec_byte_code+17943>, 0x6632df <exec_byte_code+19520> <repeats 64 times>}
        const_length = 90
        bytestr_length = 995
        vectorp = 0x7f7b458
        quitcounter = 5 '\005'
        stack_items = 35
        sa_avail = 15109
        sa_count = 8
        sa_must_free = false
        stack_base = 0x7fffffffb040
        stack_lim = 0x7fffffffb158
        top = 0x7fffffffb0d8
        void_stack_lim = 0x7fffffffb158
        bytestr_data = 0x7fffffffb158 "\305\062\342\003\306\307\002P\003\"\210\310\002\311\"\204U"
        pc = 0x7fffffffb52b "\266\220\060\202\341\003\201Y"
        count = 8
        result = XIL(0x7fffffffb7c0)
#41 0x00000000006143b4 in funcall_lambda (fun=XIL(0x7f7b735), nargs=2, arg_vector=0x7fffffffb988) at eval.c:2967
        size = 5
        val = XIL(0x55e61d)
        syms_left = make_number(514)
        next = XIL(0x7f7b730)
        lexenv = XIL(0x7fffffffb8f8)
        count = 8
        i = 81604360432
        optional = false
        rest = false
        previous_optional_or_rest = false
#42 0x00000000006139a6 in Ffuncall (nargs=3, args=0x7fffffffb980) at eval.c:2768
        fun = XIL(0x7f7b735)
        original_fun = XIL(0x8515410)
        funcar = XIL(0xac8660)
        numargs = 2
        val = XIL(0)
        count = 7
#43 0x0000000000612c3b in Fapply (nargs=2, args=0x7fffffffba60) at eval.c:2386
        i = 3
        numargs = 2
        funcall_nargs = 3
        funcall_args = 0x7fffffffb980
        spread_arg = XIL(0)
        fun = XIL(0x7f7b735)
        retval = XIL(0x668e815)
        sa_avail = 16360
        sa_count = 7
        sa_must_free = false
#44 0x00000000006131ef in apply1 (fn=XIL(0x8515410), arg=XIL(0x5814ec3)) at eval.c:2602
No locals.
#45 0x000000000067044f in read_process_output_call (fun_and_args=XIL(0x5814ed3)) at process.c:5790
No locals.
#46 0x00000000006102c8 in internal_condition_case_1 (bfun=0x67041e <read_process_output_call>, arg=XIL(0x5814ed3), handlers=XIL(0x5250), 
    hfun=0x670451 <read_process_output_error_handler>) at eval.c:1356
        val = XIL(0x938e735)
        c = 0x2b418a0
#47 0x0000000000670d2a in read_and_dispose_of_process_output (p=0x938e730, 
    chars=0x7fffffffbbc0 "-dir /home/matthew/workspace/storage/opensource/llvm-project-unofficial-github-mirror.git/ -current-frame -tty /dev/pts/7 dumb -file /home/matthew/workspace/storage/opensource/llvm-project-unofficial-"..., nbytes=239, coding=0x8f07a50) at process.c:5998
        outstream = XIL(0x8515410)
        text = XIL(0x938f8b4)
        outer_running_asynch_code = false
        waiting = -1
#48 0x0000000000670932 in read_process_output (proc=XIL(0x938e735), channel=20) at process.c:5909
        nbytes = 239
        p = 0x938e730
        coding = 0x8f07a50
        carryover = 0
        count = 4
        odeactivate = XIL(0)
        chars = "-dir /home/matthew/workspace/storage/opensource/llvm-project-unofficial-github-mirror.git/ -current-frame -tty /dev/pts/7 dumb -file /home/matthew/workspace/storage/opensource/llvm-project-unofficial-"...
#49 0x000000000066fcf0 in wait_reading_process_output (time_limit=30, nsecs=0, read_kbd=-1, do_display=true, wait_for_cell=XIL(0), wait_proc=0x0, just_wait_proc=0)
    at process.c:5608
        nread = 0
        process_skipped = true
        channel = 20
        nfds = 1
        Available = {
          fds_bits = {1048576, 0 <repeats 15 times>}
        }
        Writeok = {
          fds_bits = {0 <repeats 16 times>}
        }
        check_write = true
        check_delay = 0
        no_avail = false
        xerrno = 11
        proc = XIL(0x938e735)
        timeout = {
          tv_sec = 0, 
          tv_nsec = 20000000
        }
        end_time = {
          tv_sec = 1509269470, 
          tv_nsec = 43418519
        }
        timer_delay = {
          tv_sec = 0, 
          tv_nsec = 67285491
        }
        got_output_end_time = {
          tv_sec = 0, 
          tv_nsec = -1
        }
        wait = TIMEOUT
        got_some_output = -1
        retry_for_async = false
        count = 3
        now = {
          tv_sec = 0, 
          tv_nsec = -1
        }
#50 0x00000000004248df in sit_for (timeout=make_number(30), reading=true, display_option=1) at dispnew.c:5793
        sec = 30
        nsec = 0
        do_display = true
#51 0x000000000056afcd in read_char (commandflag=1, map=XIL(0x5801ad3), prev_event=XIL(0), used_mouse_menu=0x7fffffffd361, end_time=0x0) at keyboard.c:2717
        tem0 = XIL(0x5f2edb)
        timeout = 30
        delay_level = 4
        buffer_size = 10
        c = XIL(0)
        jmpcount = 3
        local_getcjmp = {{
            __jmpbuf = {0, -5784433616548994252, 4291728, 140737488345424, 0, 0, -5784433616513342668, 5784432916335878964}, 
            __mask_was_saved = 0, 
            __saved_mask = {
              __val = {5769419, 11306592, 0, 0, 140737488343520, 5627596, 5805312, 140737488343632, 6379229, 0, 3, 92281523, 0, 140737488343632, 6191436, 11306592}
            }
          }}
        save_jump = {{
            __jmpbuf = {0, 0, 0, 0, 0, 0, 0, 0}, 
            __mask_was_saved = 0, 
            __saved_mask = {
              __val = {0 <repeats 16 times>}
            }
          }}
        tem = XIL(0x668e810)
        save = XIL(0xac7ce0)
        previous_echo_area_message = XIL(0)
        also_record = XIL(0)
        reread = false
        recorded = false
        polling_stopped_here = false
        orig_kboard = 0x2bee2f0
#52 0x0000000000577dc4 in read_key_sequence (keybuf=0x7fffffffd500, bufsize=30, prompt=XIL(0), dont_downcase_last=false, can_return_switch_frame=true, 
    fix_current_buffer=true, prevent_redisplay=false) at keyboard.c:9147
        interrupted_kboard = 0x2bee2f0
        interrupted_frame = 0x12f4c30 <bss_sbrk_buffer+8423664>
        key = XIL(0x668e815)
        used_mouse_menu = false
        echo_local_start = 0
        last_real_key_start = 0
        keys_local_start = 0
        new_binding = XIL(0)
        count = 3
        t = 0
        echo_start = 0
        keys_start = 0
        current_binding = XIL(0x5801ad3)
        first_event = XIL(0)
        first_unbound = 31
        mock_input = 0
        fkey = {
          parent = XIL(0xf4ec23), 
          map = XIL(0xf4ec23), 
          start = 0, 
          end = 0
        }
        keytran = {
          parent = XIL(0xb59a93), 
          map = XIL(0xb59a93), 
          start = 0, 
          end = 0
        }
        indec = {
          parent = XIL(0xf4ec43), 
          map = XIL(0xf4ec43), 
          start = 0, 
          end = 0
        }
        shift_translated = false
        delayed_switch_frame = XIL(0)
        original_uppercase = XIL(0)
        original_uppercase_position = -1
        dummyflag = false
        starting_buffer = 0x668e810
        fake_prefixed_keys = XIL(0)
#53 0x0000000000567961 in command_loop_1 () at keyboard.c:1368
        cmd = XIL(0x67de280)
        keybuf = {make_number(99), make_number(103), make_number(115), XIL(0x7fffffffd580), XIL(0xac8660), XIL(0x7fffffffd540), XIL(0), XIL(0x7fffffffd550), XIL(0x55decc), 
          XIL(0), XIL(0x7fffffffd5c0), XIL(0x6156dd), XIL(0x4cf8163), XIL(0x3), XIL(0xac8660), XIL(0), XIL(0), XIL(0x7fffffffd5a0), XIL(0x55decc), XIL(0xb5c405), 
          XIL(0x7fffffffd5e0), XIL(0x610568), XIL(0x10055decc), XIL(0x5250), XIL(0x7fffffffd600), XIL(0x2b41780), XIL(0), XIL(0), XIL(0x7fffffffd610), XIL(0x610471)}
        i = 1
        prev_modiff = 220
        prev_buffer = 0x6091140
        already_adjusted = false
#54 0x0000000000610221 in internal_condition_case (bfun=0x567535 <command_loop_1>, handlers=XIL(0x5250), hfun=0x566cc4 <cmd_error>) at eval.c:1332
        val = XIL(0x55decc)
        c = 0x2b41780
#55 0x000000000056721b in command_loop_2 (ignore=XIL(0)) at keyboard.c:1110
        val = XIL(0)
#56 0x000000000060fac8 in internal_catch (tag=XIL(0xc6f0), func=0x5671ee <command_loop_2>, arg=XIL(0)) at eval.c:1097
        val = XIL(0x7fffffffd6e0)
        c = 0x2b41660
#57 0x00000000005671b9 in command_loop () at keyboard.c:1089
No locals.
#58 0x0000000000566896 in recursive_edit_1 () at keyboard.c:695
        count = 1
        val = XIL(0x7fffffffd740)
#59 0x0000000000566a17 in Frecursive_edit () at keyboard.c:766
        count = 0
        buffer = XIL(0)
#60 0x00000000005644d0 in main (argc=2, argv=0x7fffffffd958) at emacs.c:1713
        stack_bottom_variable = 0x3e
        do_initial_setlocale = true
        dumping = false
        skip_args = 0
        no_loadup = false
        junk = 0x0
        dname_arg = 0x0
        ch_to_dir = 0x0
        original_pwd = 0x0
        disable_aslr = false
        rlim = {
          rlim_cur = 10022912, 
          rlim_max = 18446744073709551615
        }
        sockfd = -1

Lisp Backtrace:
"accept-process-output" (0xffff87a8)
"flyspell-word" (0xffff8e78)
"flyspell-post-command-hook" (0xffff9420)
"run-hooks" (0xffff95b8)
0x7f7ba90 PVEC_COMPILED
"apply" (0xffff9cb8)
"server-visit-files" (0xffffa168)
"server-execute" (0xffffa6d0)
0x938e940 PVEC_COMPILED
"server-execute-continuation" (0xffffb0e0)
"server-process-filter" (0xffffb988)
quit

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-10-30  9:48           ` Matthias Dahl
@ 2017-11-03  8:52             ` Matthias Dahl
  2017-11-03  9:58               ` Eli Zaretskii
  2017-11-04 12:11             ` Eli Zaretskii
  1 sibling, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2017-11-03  8:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Hello Eli...

Very friendly ping. :-)

Is there any news on this issue, or anything else I can help with or
provide you with?

Thanks,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-03  8:52             ` Matthias Dahl
@ 2017-11-03  9:58               ` Eli Zaretskii
  0 siblings, 0 replies; 151+ messages in thread
From: Eli Zaretskii @ 2017-11-03  9:58 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Fri, 3 Nov 2017 09:52:39 +0100
> 
> Very friendly ping. :-)

Thanks, I didn't forget.  I'll get to that soon.

> Is there any news on this issue, or anything else I can help with or
> provide you with?

I didn't yet have time to read your last message thoroughly, sorry.



* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-10-30  9:48           ` Matthias Dahl
  2017-11-03  8:52             ` Matthias Dahl
@ 2017-11-04 12:11             ` Eli Zaretskii
  2017-11-06 14:15               ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2017-11-04 12:11 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Mon, 30 Oct 2017 10:48:05 +0100
> 
> > But if the wrong call to accept-process-output has read the process
> > output, it could have also processed it and delivered the results to
> > the wrong application, no?
> 
> Given the fact that a filter is registered for a given process, and
> thus it is this filter that gets called whenever process output is ready
> and needs to be processed, a timer or filter would have to replace that
> filter with its own through set-process-filter for this to happen. And
> that is something I would clearly consider a bug, since no filter or
> timer should do something like that.
> 
> Naturally there is also the weird case when accept-process-output was
> called from a hook by some package which expects that data and needs it
> but doesn't consider that a timer could get called during that time and
> the package itself has a timer setup that will also interact with that
> very same process, trying to read data back from some interaction with
> it. That will naturally fail as well... either way, with or without my
> patch. And again, I would consider this a bug.

OK.  I think I'm okay with your patches, but IMO they need a minor
improvement: instead of setting got_some_output to an arbitrary value
of 1, the code should set it to the increment of the bytes read from
the sub-process.  This is what we do when we actually read from the
process, and I think this scenario should behave the same.  WDYT?

With that change, it's okay to commit this to the master branch.

Thanks.



* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-04 12:11             ` Eli Zaretskii
@ 2017-11-06 14:15               ` Matthias Dahl
  2017-11-06 16:34                 ` Eli Zaretskii
  0 siblings, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2017-11-06 14:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Hello Eli...

On 04/11/17 13:11, Eli Zaretskii wrote:

> OK.  I think I'm okay with your patches, but IMO they need a minor
> improvement: instead of setting got_some_output to an arbitrary value
> of 1, the code should set it to the increment of the bytes read from
> the sub-process.  This is what we do when we actually read from the
> process, and I think this scenario should behave the same.  WDYT?

I know what you mean -- and even though this was bugging me as well when
I implemented it, it is actually on purpose.

First of all, I wanted to make as few assumptions as possible about how
processes are used and be on the conservative/defensive side with this
solution, to make sure not to introduce new bugs or corner-cases.

In this particular case, I wanted to minimize any integer overflow
problems. Granted, depending on how many bits unsigned long has, it is
very (!) unlikely to cause any issues, but otoh that is usually how
bugs get introduced in the first place. So if I had calculated the
difference to set got_some_output properly, I would have had to take
the integer overflow problem into account.

The current solution has exactly one corner-case which is, imho,
extremely unlikely: You would have to run wait_reading_process_output
for a full wrap-around to exactly the same byte count that it was
started with in the first place. It is safe to assume that will
practically never happen.

Also, no user of wait_reading_process_output uses the return value
in such a way at all -- which I would also consider a bug since it
is not what is officially documented for the return value in the
first place. It simply says positive, negative or zero... without
stating that it will actually return the bytes read.

So, imho, I would lean toward the current solution since it is as
simple as possible with as few corner-cases as possible. If you
would like to keep the status quo though and to always return how
many bytes were actually read, then I suggest a different approach
altogether. Instead of keeping track how many bytes were read for
the total lifetime of a process, we could track only how many bytes
were read for an in-flight wait. That would require a byte counter
and a counter for how many waits are currently in-flight, so the
last one (or first one) could zero the byte counter again. That
would make things quite a bit more complicated, imho... and without
having a proper gain to show for it.

> With that change, it's okay to commit this to the master branch.

May I suggest, even though it is rather late, to also consider this
for the emacs-26 branch? Since it is not a new feature but a bug fix
to a problem people are actually running into (see Magit), I think it
would be justified. Personally, I have run a patched emacs-26 all the
time here without any problems.

Thanks again for taking the time.

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-06 14:15               ` Matthias Dahl
@ 2017-11-06 16:34                 ` Eli Zaretskii
  2017-11-06 18:24                   ` Paul Eggert
  2017-11-07 14:18                   ` Matthias Dahl
  0 siblings, 2 replies; 151+ messages in thread
From: Eli Zaretskii @ 2017-11-06 16:34 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Mon, 6 Nov 2017 15:15:48 +0100
> 
> In this particular case, I wanted to minimize any integer overflow
> problems. Granted, depending on how many bits unsigned long has, it is
> very (!) unlikely to cause any issues, but otoh that is usually how
> bugs get introduced in the first place. So if I had calculated the
> difference to set got_some_output properly, I would have had to take
> the integer overflow problem into account.

Sorry, I don't understand.  When we call emacs_read in other cases in
that function, the result is an ssize_t, a signed type of the same
width as size_t.  So why would you need to handle overflow, when
reading via emacs_read doesn't?

And even if the result could overflow, we have the likes of
INT_ADD_WRAPV to make handling of that easier and safer.

So I don't see why we should give up in this case.
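
For readers unfamiliar with that family of macros, here is a minimal
sketch of what an overflow-checked addition looks like.  It assumes GCC
or Clang and uses the compiler builtin __builtin_add_overflow as a
stand-in, not gnulib's INT_ADD_WRAPV itself; the function name
add_bytes_checked is made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not Emacs code.  Add NBYTES to *TOTAL.
   Return true if the sum fits in ptrdiff_t; on overflow return
   false and leave *TOTAL unchanged.  */
static bool
add_bytes_checked (ptrdiff_t *total, ptrdiff_t nbytes)
{
  ptrdiff_t sum;
  /* The builtin reports overflow instead of invoking the undefined
     behavior of a plain signed addition.  */
  if (__builtin_add_overflow (*total, nbytes, &sum))
    return false;
  *total = sum;
  return true;
}
```

A caller would check the return value and decide what to report when
the addition does not fit.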

> Also, no user of wait_reading_process_output uses the return value
> in such a way at all -- which I would also consider a bug since it
> is not what is officially documented for the return value in the
> first place. It simply says positive, negative or zero... without
> stating that it will actually return the bytes read.

I don't agree with this line of reasoning.  We are not developing a
library whose users will be familiar only with the formal interface
definition and nothing else.  We are talking about code that is being
read, studied, and used by Emacs hack^H^H^H^Hdevelopers all the time.
Having a non-trivial function whose behavior is hard to describe and
remember accurately makes using it error-prone, especially if the
deviant behavior happens only in some corner use case that is hard to
reproduce.  In many cases, the information about these corner cases is
lost soon after the code is introduced, and since we have no formal
requirements for what functions like this one should do, and no good
coverage by the test suite, future changes are impossible to test
reliably, in order to make sure they don't break in those corner
cases.

So the situation where "the function FOO returns the number of bytes
read from the subprocess, except when this and that happens, in which
case it returns just 1" -- such situation should be avoided at all
costs, IME.  And in this case, the cost is actually quite low, unless
I'm missing something.

> If you would like to keep the status quo though and to always return
> how many bytes were actually read, then I suggest a different
> approach altogether. Instead of keeping track how many bytes were
> read for the total lifetime of a process, we could track only how
> many bytes were read for an in-flight wait. That would require a byte
> counter and a counter for how many waits are currently in-flight, so
> the last one (or first one) could zero the byte counter again.

Sorry, you lost me half-way through this description.  I actually
meant to return only the increment of how many bytes were read since
the last call, but I don't see why you'd need more than one counter
that will be reset to zero once its value is consumed.

> > With that change, it's okay to commit this to the master branch.
> 
> May I suggest, even though it is rather late, to also consider this
> for the emacs-26 branch?

It's a bug that has been there "forever", and the fix is too risky to
have it in the middle of a pretest.  I don't think we understand well
enough all of its implications, and reading from subprocesses is a
very central and very delicate part of Emacs.  Sorry.

Thanks for working on this.



* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-06 16:34                 ` Eli Zaretskii
@ 2017-11-06 18:24                   ` Paul Eggert
  2017-11-06 20:17                     ` Eli Zaretskii
  2017-11-07 14:18                   ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2017-11-06 18:24 UTC (permalink / raw)
  To: Eli Zaretskii, Matthias Dahl; +Cc: emacs-devel

On 11/06/2017 08:34 AM, Eli Zaretskii wrote:
> When we call emacs_read in other cases in
> that function, the result is an ssize_t, a signed type of the same
> width as size_t.

As a minor point, POSIX does not require that ssize_t be the same width 
as size_t, or that ptrdiff_t be the same width as size_t. Emacs has run 
(and as far as I know, still would run) on unusual platforms where 
ssize_t is narrower than size_t.




* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-06 18:24                   ` Paul Eggert
@ 2017-11-06 20:17                     ` Eli Zaretskii
  0 siblings, 0 replies; 151+ messages in thread
From: Eli Zaretskii @ 2017-11-06 20:17 UTC (permalink / raw)
  To: Paul Eggert; +Cc: ml_emacs-lists, emacs-devel

> Cc: emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Mon, 6 Nov 2017 10:24:42 -0800
> 
> On 11/06/2017 08:34 AM, Eli Zaretskii wrote:
> > When we call emacs_read in other cases in
> > that function, the result is an ssize_t, a signed type of the same
> > width as size_t.
> 
> As a minor point, POSIX does not require that ssize_t be the same width 
> as size_t, or that ptrdiff_t be the same width as size_t. Emacs has run 
> (and as far as I know, still would run) on unusual platforms where 
> ssize_t is narrower than size_t.

Yes, I know.  We could replace ssize_t with ptrdiff_t if needed.



* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-06 16:34                 ` Eli Zaretskii
  2017-11-06 18:24                   ` Paul Eggert
@ 2017-11-07 14:18                   ` Matthias Dahl
  2017-11-07 16:40                     ` Eli Zaretskii
  2017-11-07 17:23                     ` Stefan Monnier
  1 sibling, 2 replies; 151+ messages in thread
From: Matthias Dahl @ 2017-11-07 14:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Hello Eli...

On 06/11/17 17:34, Eli Zaretskii wrote:

> Sorry, I don't understand.  When we call emacs_read in other cases in
> that function, the result is an ssize_t, a signed type of the same
> width as size_t.  So why would you need to handle overflow, when
> reading via emacs_read doesn't?
> 
> And even if the result could overflow, we have the likes of
> INT_ADD_WRAPV to make handling of that easier and safer.
> 
> So I don't see why we should give up in this case.

The amount of output read (in total) from a process is tracked via the
following member variable of Lisp_Process:

  /* Byte-count for process output read from `infd'.  */
  unsigned long infd_num_bytes_read;

That gives us at least 4 GiB worth of data that can be read back and
tracked before it wraps around to 0.

When wait_reading_process_output starts, the following happens:

  unsigned long initial_wait_proc_num_bytes_read = (wait_proc) ?
                              wait_proc->infd_num_bytes_read : 0;

We save the current counter and compare it later on to see if data has
been read from that process without us noticing it, e.g.:

 if (wait_proc
     && wait_proc->infd_num_bytes_read !=
        initial_wait_proc_num_bytes_read)
 {
   got_some_output = 1;
   break;
 }

What might have happened now is that wait_proc->infd_num_bytes_read was
already at the edge of the maximum representable value range and wrapped
around during some recursive calls to wait_reading_process_output while
we still have a value in initial_wait_proc_num_bytes_read that has not
wrapped around. E.g.:

  wait_proc->infd_num_bytes_read = (2^32)-10
  initial_wait_proc_num_bytes_read = wait_proc->infd_num_bytes_read

  ... recursive calls happen, output is read ...

  wait_proc->infd_num_bytes_read = 256
  initial_wait_proc_num_bytes_read = (2^32)-10

  // this will be wrong
  bytes_read = wait_proc->infd_num_bytes_read -
               initial_wait_proc_num_bytes_read

That causes a problem -- even though this case is unlikely, it is still
possible and I don't think it should be dismissed.

To get how many bytes have been read for this current call, we have to
calculate the difference of initial_wait_proc_num_bytes_read and
wait_proc->infd_num_bytes_read. That naturally fails if there has been
a wrap around in wait_proc->infd_num_bytes_read.

gnulib's integer overflow helpers won't help in this case, at all.

That is the whole reason why I did not set got_some_output to the proper
byte count but simply to 1, to avoid this integer overflow situation
which, if dealt with properly, would have made things a bit more
complex and difficult.

Naturally you can still do a greater-than comparison and choose the
operands (more or less) properly that way. Since that calculation
happens at two different spots in the code, and since I considered the
bytes_read return value unnecessary and did not really like the whole
idea, I decided against doing this and went with the simple return
value of 1.

But maybe this is the simplest of all solutions, keeping KISS in mind.

> I don't agree with this line of reasoning.  We are not developing a
> library whose users will be familiar only with the formal interface
> definition and nothing else.  We are talking about code that is being
> read, studied, and used by Emacs hack^H^H^H^Hdevelopers all the time.

Even the C interface? It is my impression that the C parts of Emacs are
usually what is being avoided at all costs. Besides that, I treated the
C parts like implementation details and kept strictly to the documented
interfaces, expecting no user to ever touch this.

Nevertheless, I agree, that consistency is important and I should have
made that more of a priority in this case, given what you said.

> Having a non-trivial function whose behavior is hard to describe and
> remember accurately makes using it error-prone, especially if the
> deviant behavior happens only in some corner use case that is hard to
> reproduce.

That is why it is important that the documentation and implementation of
a function actually align. In this case, no user should actually expect
the function to return the number of bytes read, just a positive or
negative number or zero... just like it is stated.

But, again, that might be a matter of perspective. For me, what is
documented is like a contract.

> Sorry, you lost me half-way through this description.  I actually
> meant to return only the increment of how many bytes were read since
> the last call, but I don't see why you'd need more than one counter
> that will be reset to zero once its value is consumed.

My suggestion was to have an in-flight bytes read counter, that will
only track the number of bytes read while /any/ wait for that specific
process is active.

Whether such a wait is active would be conveyed through another
variable that is non-zero if anything is waiting for that process's
output.

Eventually that in-flight bytes read counter needs to be reset to zero
again. This can only happen, if we have no other parties waiting for
this process's output higher up in the call chain as otherwise we would
lose information and make it difficult/impossible for them to detect
that something was actually read.

So, that "other variable" that is non-zero if we have anything waiting
for that process's output needs to actually track the number of parties
that are waiting, so we can actually know when we are the last one and
can safely reset the in-flight bytes read counter.

But like I already stated earlier, maybe simply checking if an overflow
happened through a greater than comparison is the simplest approach,
since this solution introduces even more complexity and corner-cases,
imho.
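
To make the idea more concrete, here is a rough sketch of that
bookkeeping.  None of these names exist in Emacs; the struct and the
three functions are hypothetical and only illustrate the
"in-flight counter plus waiter count" scheme described above.

```c
#include <stddef.h>

/* Hypothetical per-process bookkeeping, not part of any patch.  */
struct inflight_accounting
{
  size_t waiters;     /* Number of waits currently in progress.  */
  size_t bytes_read;  /* Bytes read while at least one wait is active.  */
};

/* Called when a wait for this process starts.  Returns the counter
   value that the caller compares against when its wait ends.  */
static size_t
wait_begin (struct inflight_accounting *a)
{
  a->waiters++;
  return a->bytes_read;
}

/* Called whenever NBYTES of output are read from the process.
   Reads that happen while nobody waits are not counted.  */
static void
note_read (struct inflight_accounting *a, size_t nbytes)
{
  if (a->waiters > 0)
    a->bytes_read += nbytes;
}

/* Called when a wait ends.  Returns the bytes read since the matching
   wait_begin.  Only the last waiter resets the counter, so waits
   higher up in the call chain do not lose information.  */
static size_t
wait_end (struct inflight_accounting *a, size_t snapshot)
{
  size_t increment = a->bytes_read - snapshot;
  if (--a->waiters == 0)
    a->bytes_read = 0;
  return increment;
}
```

With an outer and a nested wait on the same process, both see the
bytes read by the nested one, and the counter only returns to zero
once the outer wait finishes.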

> It's a bug that has been there "forever", and the fix is too risky to
> have it in the middle of a pretest.  I don't think we understand well
> enough all of its implications, and reading from subprocesses is a
> very central and very delicate part of Emacs.  Sorry.

Ack. Even though this means that users running into this right now will
have to wait quite a while for a fix in a stable release, which is a
pity imho.

Thanks for all the feedback,
Matthias

PS. Sorry for the huge wall of text. :-( I just noticed how big this
    mail has gotten... yikes.

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu



* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-07 14:18                   ` Matthias Dahl
@ 2017-11-07 16:40                     ` Eli Zaretskii
  2017-11-10 14:45                       ` Matthias Dahl
  2017-11-07 17:23                     ` Stefan Monnier
  1 sibling, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2017-11-07 16:40 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: emacs-devel

> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Cc: emacs-devel@gnu.org
> Date: Tue, 7 Nov 2017 15:18:02 +0100
> 
> The amount of output read (in total) from a process is tracked via the
> following member variable of Lisp_Process:
> 
>   /* Byte-count for process output read from `infd'.  */
>   unsigned long infd_num_bytes_read;
> 
> That gives us at least 4 GiB worth of data that can be read back and
> tracked before it wraps around to 0.

If we want, we can make it EMACS_INT, which will give us as much as
Emacs can gobble.

> What might have happened now is that wait_proc->infd_num_bytes_read was
> already at the edge of the maximum representable value range and wrapped
> around during some recursive calls to wait_reading_process_output while
> we still have a value in initial_wait_proc_num_bytes_read that has not
> wrapped around. E.g.:
> 
>   wait_proc->infd_num_bytes_read = (2^32)-10
>   initial_wait_proc_num_bytes_read = wait_proc->infd_num_bytes_read
> 
>   ... recursive calls happen, output is read ...
> 
>   wait_proc->infd_num_bytes_read = 256
>   initial_wait_proc_num_bytes_read = (2^32)-10
> 
>   // this will be wrong
>   bytes_read = wait_proc->infd_num_bytes_read -
>                initial_wait_proc_num_bytes_read
> 
> That causes a problem -- even though this case is unlikely, it is still
> possible and I don't think it should be dismissed.

That should be very rare (barring application-level bugs), and in any
case it is no worse than just returning 1.  So I don't think we should
give up better support of 90% or even 99% of use cases due to the
other 10% or 1%.

> gnulib's integer overflow helpers won't help in this case, at all.

Those helpers are meant to avoid the overflow itself, thus preventing
"undefined behavior" when a signed integer overflows.  They cannot,
and shouldn't be expected to, fix the reason for the overflow -- that
cannot be helped when it happens.

> > I don't agree with this line of reasoning.  We are not developing a
> > library whose users will be familiar only with the formal interface
> > definition and nothing else.  We are talking about code that is being
> > read, studied, and used by Emacs hack^H^H^H^Hdevelopers all the time.
> 
> Even the C interface?

Especially in the C interface.

> It is my impression that the C parts of Emacs are usually what is
> being avoided at all costs.

It depends on who we are talking about.  Some of us (yours truly
included) work on the C level quite frequently.  Any relatively
radical new feature needs at least some C infrastructure.  I had my
share of writing code based on what I read in the function commentary
and some general common sense and familiarity with the internals, only
to find out later that they were incomplete or even prone to wrong
interpretation.  I'd like to minimize such occurrences as much as I
can.

> Besides that, I treated the C parts like implementation details and
> kept strictly to the documented interfaces, expecting no user to
> ever touch this.

But that's exactly the problem: we _don't_have_ documented interfaces
on the C level, at least not of the quality we would like to have.  Do
you really think that the commentary at the beginning of
wait_reading_process_output describes what the function does anywhere
close to completeness?  Far from it.

> > Having a non-trivial function whose behavior is hard to describe and
> > remember accurately makes using it error-prone, especially if the
> > deviant behavior happens only in some corner use case that is hard to
> > reproduce.
> 
> That is why it is important that the documentation and implementation of
> a function actually align.

Of course.  But it's hard to keep them in sync, partly because the
commentary is incomplete to begin with, and that makes it hard to
recognize when it needs to change due to some code changes.  And the
result is that I quite frequently find comments that are blatantly
incorrect: mention parameters no longer there or fail to mention new
ones that were added, for example.

> In this case, no user should actually expect the function to return
> the number of bytes read, just a positive or negative number or
> zero... just like it is stated.

I'd actually say that the commentary is incomplete because it doesn't
say what that "positive" value means.  Returning an opaque value from
a function is not useful, unless you can pass it later to the same
function to get it to do something special.

> But, again, that might be a matter of perspective. For me, what is
> documented is like a contract.

IME with Emacs code, we are a long way from interfaces that are
documented on a level that could be considered a contract.  For
starters, we don't even say clearly enough which functions can throw
to top level, without which a contract is not really a contract, don't
you agree?

> > Sorry, you lost me half-way through this description.  I actually
> > meant to return only the increment of how many bytes were read since
> > the last call, but I don't see why you'd need more than one counter
> > that will be reset to zero once its value is consumed.
> 
> My suggestion was to have an in-flight bytes read counter, that will
> only track the number of bytes read while /any/ wait for that specific
> process is active.

Ah, I see.  I think this is an unnecessary complication.  The risk of
overflow, while it isn't zero, is too low to justify such measures and
the resulting complexity, IMO.

> > It's a bug that has been there "forever", and the fix is too risky to
> > have it in the middle of a pretest.  I don't think we understand well
> > enough all of its implications, and reading from subprocesses is a
> > very central and very delicate part of Emacs.  Sorry.
> 
> Ack. Even though this means that users running into this right now will
> have to wait quite a while for a fix in a stable release, which is a
> pity imho.

We can publish the patch here, and then people who really want this
fixed ASAP can simply apply the patch.



* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-07 14:18                   ` Matthias Dahl
  2017-11-07 16:40                     ` Eli Zaretskii
@ 2017-11-07 17:23                     ` Stefan Monnier
  2017-11-10 14:53                       ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Stefan Monnier @ 2017-11-07 17:23 UTC (permalink / raw)
  To: emacs-devel

>   /* Byte-count for process output read from `infd'.  */
>   unsigned long infd_num_bytes_read;

BTW, we could count the number of (non-empty) "chunks" rather than the
number of bytes.

>  {
>    got_some_output = 1;
>    break;
>  }

Please try to use `true' and `false' for boolean values (there's still
a lot of code in src/*.c which uses 0 and 1, admittedly, but this should
slowly disappear over time).


        Stefan




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-07 16:40                     ` Eli Zaretskii
@ 2017-11-10 14:45                       ` Matthias Dahl
  2017-11-10 15:25                         ` Eli Zaretskii
  2017-11-12 21:17                         ` Paul Eggert
  0 siblings, 2 replies; 151+ messages in thread
From: Matthias Dahl @ 2017-11-10 14:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 2936 bytes --]

Hello Eli...

Attached you will find the revised patches which now properly
set the number of bytes read.

Sorry for the delay, but real life intervened.

Turns out, things were _a lot_ easier than I thought since the
C standard actually does have well-defined rules for computation
on unsigned types, so that no overflow happens. And since it is
modular arithmetic, it perfectly fits our use-case. :-) Sometimes
the simplest solution is right in front of your eyes and you are
thinking way too complicated.

So sorry for the trouble. Had I known this earlier, I would not
have skipped that part -- even though I had the best intentions,
obviously.
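
A minimal sketch of the modular arithmetic referred to above, assuming a
plain `unsigned int' counter instead of the EMACS_UINT field from the
patch. The names `bytes_since' and `demo_wraparound' are hypothetical and
exist only for illustration:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not part of the patch: number of bytes read
   since SNAPSHOT was taken.  The result is exact as long as fewer
   than UINT_MAX + 1 bytes arrived in between.  */
static unsigned int
bytes_since (unsigned int current, unsigned int snapshot)
{
  /* Unsigned subtraction is performed modulo UINT_MAX + 1, so this
     is well defined even if CURRENT has wrapped past zero.  */
  return current - snapshot;
}

/* Walk through one wrap-around and return the computed delta.  */
static unsigned int
demo_wraparound (void)
{
  unsigned int snapshot = UINT_MAX - 5; /* shortly before wrapping */
  unsigned int current = snapshot + 10; /* wraps around to 4 */
  assert (current < snapshot);          /* the counter did wrap */
  return bytes_since (current, snapshot);
}
```

Even though the current value is numerically smaller than the snapshot
after the wrap, the difference still comes out as the 10 bytes that were
added.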

On with the rest of your mail...

On 07/11/17 17:40, Eli Zaretskii wrote:

> If we want, we can make it EMACS_INT, which will give us as much as
> Emacs can gobble.

Thanks for the suggestion, I changed it to EMACS_UINT -- we need an
unsigned type.

> I had my
> share of writing code based on what I read in the function commentary
> and some general common sense and familiarity with the internals, only
> to find out later that they were incomplete or even prone to wrong
> interpretation.  I'd like to minimize such occurrences as much as I
> can.

I guess I was overly idealistic. Sorry for that.

Nevertheless, it would be nice to improve the situation and make sure
that future changes to the codebase also take the commentaries into
account... and put those themselves to the same high standard as the
implementation itself. Just my two cents...

> But that's exactly the problem: we _don't_have_ documented interfaces
> on the C level, at least not of the quality we would like to have.  Do
> you really think that the commentary at the beginning of
> wait_reading_process_output describes what the function does anywhere
> close to completeness?  Far from it.

I know. OTOH, wait_reading_process_output might be a bad example as it
really could benefit greatly from a refactor since it is way too big,
complex and unclear. And describing its function properly and concisely
could prove quite difficult.

But generally, I pretty much agree on all that you said in the rest of
your mail.

> I'd actually say that the commentary is incomplete because it doesn't
> say what that "positive" value means.  Returning an opaque value from
> a function is not useful, unless you can pass it later to the same
> function to get it to do something special.

I agree. It also leads the user to make assumptions about the value
that might or might not turn out true for all cases... without properly
reading the code.

Thanks again for all your great feedback and review. I hope we now have
something that can be applied to the master branch. If anything comes up
in terms of bugs or problems that might be related, please poke me if I
miss it on the list and I will (try to) fix it.

Have a nice weekend,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-process-output-read-accounting.patch --]
[-- Type: text/x-patch; name="0001-Add-process-output-read-accounting.patch", Size: 1622 bytes --]

From 76b9697bf53c151849face5f95f2b07c2d0b3511 Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Tue, 24 Oct 2017 15:55:53 +0200
Subject: [PATCH 1/2] Add process output read accounting

This tracks the bytes read from a process's output, which is not used
anywhere yet but is required for follow-up work.
* src/process.c (read_process_output): Track bytes read from a process.
* src/process.h (struct Lisp_Process): Add infd_num_bytes_read
to track bytes read from a process.
---
 src/process.c | 2 ++
 src/process.h | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/src/process.c b/src/process.c
index fc46e74332..904ca60863 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5900,6 +5900,8 @@ read_process_output (Lisp_Object proc, int channel)
   /* Now set NBYTES how many bytes we must decode.  */
   nbytes += carryover;
 
+  p->infd_num_bytes_read += nbytes;
+
   odeactivate = Vdeactivate_mark;
   /* There's no good reason to let process filters change the current
      buffer, and many callers of accept-process-output, sit-for, and
diff --git a/src/process.h b/src/process.h
index 5a044f669f..d49be55f10 100644
--- a/src/process.h
+++ b/src/process.h
@@ -129,6 +129,8 @@ struct Lisp_Process
     pid_t pid;
     /* Descriptor by which we read from this process.  */
     int infd;
+    /* Byte-count for process output read from `infd'.  */
+    EMACS_UINT infd_num_bytes_read;
     /* Descriptor by which we write to this process.  */
     int outfd;
     /* Descriptors that were created for this process and that need
-- 
2.15.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-src-process.c-wait_reading_process_output-Fix-wait_p.patch --]
[-- Type: text/x-patch; name="0002-src-process.c-wait_reading_process_output-Fix-wait_p.patch", Size: 3883 bytes --]

From a309cc76c93826c6ac7d51bba88900738910e504 Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Tue, 24 Oct 2017 15:56:47 +0200
Subject: [PATCH 2/2] * src/process.c (wait_reading_process_output): Fix
 wait_proc hang.

If called recursively (through timers or process filters by the means
of accept-process-output), it is possible that the output of wait_proc
has already been read by one of those recursive calls, leaving the
original call hanging forever if no further output arrives through
that fd and no timeout has been specified. Implement proper checks by
taking advantage of the process output read accounting.
---
 src/process.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/src/process.c b/src/process.c
index 904ca60863..d4e152eb1c 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5003,6 +5003,8 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
   struct timespec got_output_end_time = invalid_timespec ();
   enum { MINIMUM = -1, TIMEOUT, INFINITY } wait;
   int got_some_output = -1;
+  EMACS_UINT initial_wait_proc_num_bytes_read = (wait_proc) ?
+                                                wait_proc->infd_num_bytes_read : 0;
 #if defined HAVE_GETADDRINFO_A || defined HAVE_GNUTLS
   bool retry_for_async;
 #endif
@@ -5161,6 +5163,20 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 	      && requeued_events_pending_p ())
 	    break;
 
+          /* Timers could have called `accept-process-output', thus reading the output
+             of wait_proc while we (in the worst case) wait endlessly for it to become
+             available later. So we need to check if data has been read and break out
+             early if that is so since our job has been fulfilled. */
+          if (wait_proc
+              && wait_proc->infd_num_bytes_read != initial_wait_proc_num_bytes_read)
+            {
+              /* Computations on unsigned types are well defined and won't overflow,
+                 so this is safe even if our initial value > our current value, in
+                 case of a wrap around. (ISO/IEC 9899:1999 §6.2.5/9) */
+              got_some_output = wait_proc->infd_num_bytes_read
+                                - initial_wait_proc_num_bytes_read;
+            }
+
           /* This is so a breakpoint can be put here.  */
           if (!timespec_valid_p (timer_delay))
               wait_reading_process_output_1 ();
@@ -5606,7 +5622,21 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 		 buffered-ahead character if we have one.  */
 
 	      nread = read_process_output (proc, channel);
-	      if ((!wait_proc || wait_proc == XPROCESS (proc))
+
+              /* In case a filter was run that called `accept-process-output', it is
+                 possible that the output from wait_proc was already read, leaving us
+                 waiting for it endlessly (if no timeout was specified). Thus, we need
+                 to check if data was already read. */
+              if (wait_proc
+                  && wait_proc->infd_num_bytes_read != initial_wait_proc_num_bytes_read)
+                {
+                  /* Computations on unsigned types are well defined and won't overflow,
+                     so this is safe even if our initial value > our current value, in
+                     case of a wrap around. (ISO/IEC 9899:1999 §6.2.5/9) */
+                  got_some_output = wait_proc->infd_num_bytes_read
+                                    - initial_wait_proc_num_bytes_read;
+                }
+	      else if ((!wait_proc || wait_proc == XPROCESS (proc))
 		  && got_some_output < nread)
 		got_some_output = nread;
 	      if (nread > 0)
-- 
2.15.0


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-07 17:23                     ` Stefan Monnier
@ 2017-11-10 14:53                       ` Matthias Dahl
  0 siblings, 0 replies; 151+ messages in thread
From: Matthias Dahl @ 2017-11-10 14:53 UTC (permalink / raw)
  To: emacs-devel

Hello Stefan...

On 07/11/17 18:23, Stefan Monnier wrote:

> BTW, we could count the number of (non-empty) "chunks" rather than the
> number of bytes.

Thanks for the suggestion. In this case, I think just counting the bytes
keeps things simple and makes this value easier to understand when doing
a debugging session that benefits from this information.

Also, counting the chunks wouldn't really improve the situation much, if
I am not mistaken? It would just take a bit longer for a wrap-around to
happen.

> 
>>  {
>>    got_some_output = 1;
>>    break;
>>  }
> 
> Please try to use `true' and `false' for boolean values (there's still
> a lot of code in src/*.c which uses 0 and 1, admittedly, but this should
> slowly disappear over time).

That is actually legacy code right there. I did not introduce that
variable nor is it a boolean. Actually it is an integer in disguise that
stores how many bytes have been read. ;-) It is very unfortunate naming,
I agree.

The revised patches don't set it to 1 but do calculate the proper value,
so this is a non-issue now.

Thanks for taking the time and have a nice weekend,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-10 14:45                       ` Matthias Dahl
@ 2017-11-10 15:25                         ` Eli Zaretskii
  2017-11-12 21:17                         ` Paul Eggert
  1 sibling, 0 replies; 151+ messages in thread
From: Eli Zaretskii @ 2017-11-10 15:25 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: emacs-devel

> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Cc: emacs-devel@gnu.org
> Date: Fri, 10 Nov 2017 15:45:30 +0100
> 
> Attached you will find the revised patches which now properly
> set the number of bytes read.

Thanks.

> Sorry for the delay, but real life intervened.

I know what you mean...

> Turns out, things were _a lot_ easier than I thought since the
> C standard actually does have well-defined rules for computation
> on unsigned types, so that no overflow happens.

Right, for unsigned types there's no undefined behavior when it
overflows.

> > I had my
> > share of writing code based on what I read in the function commentary
> > and some general common sense and familiarity with the internals, only
> > to find out later that they were incomplete or even prone to wrong
> > interpretation.  I'd like to minimize such occurrences as much as I
> > can.
> 
> I guess I was overly idealistic. Sorry for that.

No need to be sorry.

Interestingly enough, we just had another case of this; if you missed
that, I suggest reading the discussion of bug#27647, starting here:

  https://debbugs.gnu.org/cgi/bugreport.cgi?bug=27647#116

This is a textbook example of how "tricky" code that "reuses" a
variable for a very different purpose can trip even veteran hackers,
and cause bugs that take many moons (first reported in July, fixed in
November) to diagnose and fix -- if we are lucky, as in this case.

> Nevertheless, it would be nice to improve the situation and make sure
> that future changes to the codebase also take the commentaries into
> account... and put those themselves to the same high standard as the
> implementation itself. Just my two cents...

Agreed.  I try to add and update any commentary I find missing or
outdated, and encourage everyone else to do the same.

> I hope we now have something that can be applied to the master
> branch. If anything comes up in terms of bugs or problems that might
> be related, please poke me if I miss it on the list and I will (try
> to) fix it.

Thanks, the patches LGTM.  I will wait for a few days, to give others
time to comment, and if no objections come up, will push then.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-10 14:45                       ` Matthias Dahl
  2017-11-10 15:25                         ` Eli Zaretskii
@ 2017-11-12 21:17                         ` Paul Eggert
  2017-11-13  3:27                           ` Eli Zaretskii
  2017-11-13 14:13                           ` Matthias Dahl
  1 sibling, 2 replies; 151+ messages in thread
From: Paul Eggert @ 2017-11-12 21:17 UTC (permalink / raw)
  To: Matthias Dahl, Eli Zaretskii; +Cc: emacs-devel

Matthias Dahl wrote:

>    /* Now set NBYTES how many bytes we must decode.  */
>    nbytes += carryover;
>  
> +  p->infd_num_bytes_read += nbytes;

This will include the carryover in the number of bytes read, even though this 
code did not read the carryover bytes. Is that what you intended?
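
A rough sketch of the distinction, assuming a simplified stand-in for
struct Lisp_Process (the struct `fake_process' and the function
`account_read' are hypothetical, not Emacs code). The point is only the
ordering: the counter is bumped before the carryover is added.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for the relevant process state.  */
struct fake_process
{
  size_t carryover;            /* undecoded bytes left from last read */
  unsigned int num_bytes_read; /* running count, wraps on overflow */
};

/* Account for NREAD freshly read bytes and return how many bytes are
   now available for decoding.  */
static size_t
account_read (struct fake_process *p, size_t nread)
{
  assert (p != NULL);
  /* Count only what was actually read this time...  */
  p->num_bytes_read += (unsigned int) nread;
  /* ...and only then add the bytes carried over from the last read.  */
  return nread + p->carryover;
}
```

With a carryover of 3 and a fresh read of 10 bytes, the decoder sees 13
bytes while the counter advances by 10.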

> +    /* Byte-count for process output read from `infd'.  */
> +    EMACS_UINT infd_num_bytes_read;

This is overkill, as the total amount of bytes read by a call to 
read_process_output cannot exceed 4096, so all we need is an unsigned counter 
with more than 12 bits. How about making it 'unsigned int' instead? It could 
even be 'unsigned short', though that might be overkill. Whatever size is 
chosen, the comment should say that the value recorded is the true value modulo 
the word size.

> +          /* Timers could have called `accept-process-output', thus reading the output
> +             of wait_proc while we (in the worst case) wait endlessly for it to become
> +             available later. So we need to check if data has been read and break out
> +             early if that is so since our job has been fulfilled. */
> +          if (wait_proc
> +              && wait_proc->infd_num_bytes_read != initial_wait_proc_num_bytes_read)
> +            {
> +              /* Computations on unsigned types are well defined and won't overflow,
> +                 so this is safe even if our initial value > our current value, in
> +                 case of a wrap around. (ISO/IEC 9899:1999 §6.2.5/9) */
> +              got_some_output = wait_proc->infd_num_bytes_read
> +                                - initial_wait_proc_num_bytes_read;
> +            }
> +
All that matters for got_some_output is whether it is negative, zero, or 
positive. So I suggest replacing the above with the following, as it's a bit 
faster and simpler and doesn't require commentary:

> +          if (wait_proc
> +              && wait_proc->infd_num_bytes_read != initial_wait_proc_num_bytes_read)
> +            got_some_output = 1;

Similarly for the other change that assigns to got_some_output.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-12 21:17                         ` Paul Eggert
@ 2017-11-13  3:27                           ` Eli Zaretskii
  2017-11-13  5:27                             ` Paul Eggert
  2017-11-13 14:13                           ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2017-11-13  3:27 UTC (permalink / raw)
  To: Paul Eggert; +Cc: ml_emacs-lists, emacs-devel

> Cc: emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sun, 12 Nov 2017 13:17:33 -0800
> 
> All that matters for got_some_output is whether it is negative, zero, or 
> positive. So I suggest replacing the above with the following, as it's a bit 
> faster and simpler and doesn't require commentary:
> 
> > +          if (wait_proc
> > +              && wait_proc->infd_num_bytes_read != initial_wait_proc_num_bytes_read)
> > +            got_some_output = 1;
> 
> Similarly for the other change that assigns to got_some_output.

You can read up-thread why I'm firmly against doing that.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-13  3:27                           ` Eli Zaretskii
@ 2017-11-13  5:27                             ` Paul Eggert
  2017-11-13 16:00                               ` Eli Zaretskii
  0 siblings, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2017-11-13  5:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ml_emacs-lists, emacs-devel

Eli Zaretskii wrote:
>>> +          if (wait_proc
>>> +              && wait_proc->infd_num_bytes_read != initial_wait_proc_num_bytes_read)
>>> +            got_some_output = 1;
>> Similarly for the other change that assigns to got_some_output.
> You can read up-thread why I'm firmly against doing that.

It doesn't explain why you're so firm about it since the commentary clearly 
states that 1 is OK, but at any rate one could use this instead:

   if (wait_proc)
     {
       unsigned int diff = (wait_proc->infd_num_bytes_read
                            - initial_wait_proc_num_bytes_read);
       if (diff != 0)
         got_some_output = diff;
     }

which is still a bit simpler than what was proposed. Anyway there's no need to 
refer to ISO/IEC 9899:1999 chapter and verse here, any more than there's a need 
to refer to it in the countless other places that we rely on it.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-12 21:17                         ` Paul Eggert
  2017-11-13  3:27                           ` Eli Zaretskii
@ 2017-11-13 14:13                           ` Matthias Dahl
  2017-11-13 16:10                             ` Eli Zaretskii
  2017-11-13 19:44                             ` Paul Eggert
  1 sibling, 2 replies; 151+ messages in thread
From: Matthias Dahl @ 2017-11-13 14:13 UTC (permalink / raw)
  To: Paul Eggert, Eli Zaretskii; +Cc: emacs-devel

Hello Paul,

Thanks for your feedback!

On 12/11/17 22:17, Paul Eggert wrote:

> This will include the carryover in the number of bytes read, even though
> this code did not read the carryover bytes. Is that what you intended?

Good catch -- not what I intended. Sorry for missing that. I will update
the patches once the rest of your suggestions are discussed / decided.

> This is overkill, as the total amount of bytes read by a call to
> read_process_output cannot exceed 4096, so all we need is an unsigned
> counter with more than 12 bits. How about making it 'unsigned int'
> instead? It could even be 'unsigned short', though that might be
> overkill. Whatever size is chosen, the comment should say that the value
> recorded is the true value modulo the word size.

That would not be enough. The counter is for the entire process lifetime
and not just for a single read back or chain of recursive read backs.
That is why it is EMACS_UINT and not just a smallish data type.

We briefly discussed what you suggest earlier in this thread but decided
against it and went for this simpler solution.

>   if (wait_proc)
>     {
>       unsigned int diff = (wait_proc->infd_num_bytes_read
>                            - initial_wait_proc_num_bytes_read);
>       if (diff != 0)
>         got_some_output = diff;
>     }
>
> which is still a bit simpler than what was proposed. Anyway there's no
> need to refer to ISO/IEC 9899:1999 chapter and verse here, any more
> than there's a need to refer to it in the countless other places that
> we rely on it.

Personally I think the current version is a bit more clear and easier
to understand from a semantic point of view. But that might just be me.
So I would rather go with the current version. Generally, one can argue
for both.

Regarding the ISO/IEC commentary, I thought it was worth mentioning here
since it is an important point to make that not everybody might know.

But if there is consensus, I will remove the commentary and update the
if-statement to your version. Eli, what are your thoughts?

As soon as we have finalized those points, I will send updated patches.
And thanks again for catching the bug!

Have a nice day,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-13  5:27                             ` Paul Eggert
@ 2017-11-13 16:00                               ` Eli Zaretskii
  2017-11-13 19:42                                 ` Paul Eggert
  0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2017-11-13 16:00 UTC (permalink / raw)
  To: Paul Eggert; +Cc: ml_emacs-lists, emacs-devel

> Cc: ml_emacs-lists@binary-island.eu, emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sun, 12 Nov 2017 21:27:05 -0800
> 
> >>> +          if (wait_proc
> >>> +              && wait_proc->infd_num_bytes_read != initial_wait_proc_num_bytes_read)
> >>> +            got_some_output = 1;
> >> Similarly for the other change that assigns to got_some_output.
> > You can read up-thread why I'm firmly against doing that.
> 
> It doesn't explain why you're so firm about it since the commentary clearly 
> states that 1 is OK

I explained that as well.  In a nutshell, I see no reason to consider
our commentary the definitive documentation of what the code does.  It
is much more probable that either the commentary was never accurate,
or it was once, but the code was modified without updating the
comments.  IME with Emacs sources, code reading is a much more
reliable way of figuring out what a function does than relying on its
commentary.

> but at any rate one could use this instead:
> 
>    if (wait_proc)
>      {
>        unsigned int diff = (wait_proc->infd_num_bytes_read
>                             - initial_wait_proc_num_bytes_read);
>        if (diff != 0)
>          got_some_output = diff;
>      }
> 
> which is still a bit simpler than what was proposed.

I'm okay with that, but it looks like a stylistic issue, so I
wouldn't insist on that.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-13 14:13                           ` Matthias Dahl
@ 2017-11-13 16:10                             ` Eli Zaretskii
  2017-11-14 15:05                               ` Matthias Dahl
  2017-11-13 19:44                             ` Paul Eggert
  1 sibling, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2017-11-13 16:10 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: eggert, emacs-devel

> Cc: emacs-devel@gnu.org
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Mon, 13 Nov 2017 15:13:28 +0100
> 
> > This is overkill, as the total amount of bytes read by a call to
> > read_process_output cannot exceed 4096, so all we need is an unsigned
> > counter with more than 12 bits. How about making it 'unsigned int'
> > instead? It could even be 'unsigned short', though that might be
> > overkill. Whatever size is chosen, the comment should say that the value
> > recorded is the true value modulo the word size.
> 
> That would not be enough. The counter is for the entire process lifetime
> and not just for a single read back or chain of recursive read backs.

We could reset the value to zero once it's consumed, in which case a
narrower type would be okay.  The price is a slight complication of
the logic.

> Regarding the ISO/IEC commentary, I thought it was worth mentioning here
> since it is an important point to make that not everybody might know.
> 
> But if there is consensus, I will remove the commentary and update the
> if-statement to your version. Eli, what are your thoughts?

Maybe make the comment shorter by just saying that a wrap-around could
happen there in case of overflow.

Thanks.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-13 16:00                               ` Eli Zaretskii
@ 2017-11-13 19:42                                 ` Paul Eggert
  2017-11-13 20:12                                   ` Eli Zaretskii
  0 siblings, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2017-11-13 19:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ml_emacs-lists, emacs-devel

On 11/13/2017 08:00 AM, Eli Zaretskii wrote:
> I see no reason to consider
> our commentary the definitive documentation of what the code does.  It
> is much more probable that either the commentary was never accurate,
> or it was once, but the code was modified without updating the
> comments.

Hmm, well, since I wrote some of that commentary and code, I can state 
that it was my understanding that the caller should not care about the 
exact value of wait_reading_process_output's result (only whether it is 
negative, zero, or positive), and that my understanding of this API has 
survived until the present day. Partly this was because I did not want 
to change the type of the result if we should ever increase the buffer 
size from 4096 to a value that might not fit in 'int'. In other words, 
the documentation is written the way it's written in order to give the 
implementation some freedom that could be useful in the future.




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-13 14:13                           ` Matthias Dahl
  2017-11-13 16:10                             ` Eli Zaretskii
@ 2017-11-13 19:44                             ` Paul Eggert
  2017-11-14 14:58                               ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2017-11-13 19:44 UTC (permalink / raw)
  To: Matthias Dahl, Eli Zaretskii; +Cc: emacs-devel

On 11/13/2017 06:13 AM, Matthias Dahl wrote:
>> This is overkill, as the total amount of bytes read by a call to
>> read_process_output cannot exceed 4096, so all we need is an unsigned
>> counter with more than 12 bits. How about making it 'unsigned int'
>> instead? It could even be 'unsigned short', though that might be
>> overkill. Whatever size is chosen, the comment should say that the value
>> recorded is the true value modulo the word size.
> That would not be enough. The counter is for the entire process lifetime
> and not just for a single read back or chain of recursive read backs.

It cannot be for the entire process lifetime, since it's an unsigned 
integer that is designed to wrap around on overflow. So it sounds like 
you want a counter modulo 2**N for the entire process lifetime, and 
we're discussing what value of N is large enough for the purposes of 
wait_reading_process_output. Since EMACS_UINT might be only 32 bits, and 
it's eminently reasonable for a process to read more than 2**32 bytes 
during its lifetime, this is not a theoretical issue.

So: why must the counter contain more than a small number of bits (16, 
say)? I'm not seeing the scenario. I'm asking not because I think it's 
important to save a few bits in the counter: I'm asking because I want 
to understand the change and the reasoning behind it, so that I can be 
sure it's fixing the bug.




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-13 19:42                                 ` Paul Eggert
@ 2017-11-13 20:12                                   ` Eli Zaretskii
  0 siblings, 0 replies; 151+ messages in thread
From: Eli Zaretskii @ 2017-11-13 20:12 UTC (permalink / raw)
  To: Paul Eggert; +Cc: ml_emacs-lists, emacs-devel

> Cc: ml_emacs-lists@binary-island.eu, emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Mon, 13 Nov 2017 11:42:50 -0800
> 
> Hmm, well, since I wrote some of that commentary and code, I can state 
> that it was my understanding that the caller should not care about the 
> exact value of wait_reading_process_output's result (only whether it is 
> negative, zero, or positive), and that my understanding of this API has 
> survived until the present day. Partly this was because I did not want 
> to change the type of the result if we should ever increase the buffer 
> size from 4096 to a value that might not fit in 'int'. In other words, 
> the documentation is written the way it's written in order to give the 
> implementation some freedom that could be useful in the future.

Since this is not a library which should not change its API contract,
we don't really need such a freedom.  We can change the meaning of the
return value whenever we want.  So I'd rather we documented the
meaning of the return value.  More importantly, if we sometimes return
the number of bytes and sometimes just 1, we should tell when each one
happens, because this will make it easier for others to use this
function without being privy to its details and history.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-13 19:44                             ` Paul Eggert
@ 2017-11-14 14:58                               ` Matthias Dahl
  2017-11-14 15:24                                 ` Paul Eggert
  0 siblings, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2017-11-14 14:58 UTC (permalink / raw)
  To: Paul Eggert, Eli Zaretskii; +Cc: emacs-devel

Hello Paul...

On 13/11/17 20:44, Paul Eggert wrote:

> It cannot be for the entire process lifetime, since it's an unsigned
> integer that is designed to wrap around on overflow. So it sounds like
> you want a counter modulo 2**N for the entire process lifetime, and
> we're discussing what value of N is large enough for the purposes of
> wait_reading_process_output. Since EMACS_UINT might be only 32 bits, and
> it's eminently reasonable for a process to read more than 2**32 bytes
> during its lifetime, this is not a theoretical issue.
> 
> So: why must the counter contain more than a small number of bits (16,
> say)? I'm not seeing the scenario. I'm asking not because I think it's
> important to save a few bits in the counter: I'm asking because I want
> to understand the change and the reasoning behind it, so that I can be
> sure it's fixing the bug.

I am well aware that EMACS_UINT can be 32 bits wide and that the counter
could and will eventually wrap around, given the right process usage.

The scenario my patch(es) fix is the following: We have a wait_... call
with a waitproc set. Timers and filters run, and these can themselves
call us recursively, triggering a read back from our own waitproc. We
never notice that, and we stall waiting for data if no new data becomes
available.

Thus, with the current simple implementation, there is only one corner
case that will still miss such a read back: If during an active
call to wait_... all recursive calls happen to read exactly 2**32 (or
whatever bit depth EMACS_UINT has) bytes back, then we will miss it
completely and stall.
If we read more than that, our calculation is off and we report fewer
bytes read back through got_some_output, which is pretty much irrelevant.
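
A minimal sketch of this corner case, assuming a 32-bit counter. The
names byte_counter, bytes_since and read_back_detected are hypothetical
and are not taken from process.c; the sketch only illustrates how a
wrapping unsigned counter behaves when two snapshots are subtracted:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit EMACS_UINT byte counter.  */
typedef uint32_t byte_counter;

/* Bytes read between SNAPSHOT and NOW, modulo 2**32.  Unsigned
   subtraction wraps, so the result is still right if the counter
   overflowed once in between.  */
byte_counter
bytes_since (byte_counter snapshot, byte_counter now)
{
  return now - snapshot;
}

/* True if the counter shows that a read back happened.  This is
   false when the counter advanced by an exact multiple of 2**32,
   which is the corner case described above.  */
bool
read_back_detected (byte_counter snapshot, byte_counter now)
{
  return bytes_since (snapshot, now) != 0;
}
```

With a snapshot of 7, a counter that advanced by exactly 2**32 bytes
reads 7 again, so read_back_detected reports false; any other advance
below 2**32 is detected.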

Since I did not want to make any random assumptions about an upper limit
I decided to be on the safe side with >= 32 bits which, yes, is probably
overkill but arbitrarily setting a lower limit which might even wrap
around more often without any real gain, is also not the best thing to
do in my opinion.

Besides that, I also opted for the "bytes read during process lifetime"
metric because I thought it could turn out useful in the future while
debugging. A more synthetic metric like chunks read or in-flight bytes
read, not so much, imho.

What do you say?

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-13 16:10                             ` Eli Zaretskii
@ 2017-11-14 15:05                               ` Matthias Dahl
  0 siblings, 0 replies; 151+ messages in thread
From: Matthias Dahl @ 2017-11-14 15:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel

Hello Eli...

On 13/11/17 17:10, Eli Zaretskii wrote:

> We could reset the value to zero once it's consumed, in which case a
> narrower type would be okay.  The price is a slight complication of
> the logic.

That would necessarily mean we need another variable to track how many
active waits we have since we can only zero the bytes read counter when
no other waits are active since they depend on that value as well. This
brings us back to the in-flight bytes read solution we discussed earlier
in this thread.
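
To make the extra bookkeeping concrete, here is a rough sketch of what
such a reset-on-consume variant might look like. Everything in it
(struct proc_sketch, wait_begin, note_read, wait_end, the 16-bit width)
is a hypothetical illustration and not code from any of the patches:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-process state for a reset-on-consume counter.  */
struct proc_sketch
{
  uint16_t in_flight_bytes;   /* bytes read while any wait is active */
  unsigned active_waits;      /* number of waits currently in progress */
};

/* Start a wait and return the snapshot this wait compares against.  */
uint16_t
wait_begin (struct proc_sketch *p)
{
  p->active_waits++;
  return p->in_flight_bytes;
}

/* Record NBYTES read from the process; wraps modulo 2**16.  */
void
note_read (struct proc_sketch *p, unsigned nbytes)
{
  if (p->active_waits > 0)
    p->in_flight_bytes = (uint16_t) (p->in_flight_bytes + nbytes);
}

/* Finish a wait.  Return the bytes seen since SNAPSHOT (modulo
   2**16) and zero the counter only when no other wait still
   depends on it.  */
unsigned
wait_end (struct proc_sketch *p, uint16_t snapshot)
{
  assert (p->active_waits > 0);
  unsigned seen = (uint16_t) (p->in_flight_bytes - snapshot);
  if (--p->active_waits == 0)
    p->in_flight_bytes = 0;
  return seen;
}
```

Note that the narrow counter can still wrap while waits are nested, so
the sketch does not remove the corner case; it only adds state.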

I don't see any real gain by complicating this further. It will not be
any better in detecting if any read backs have been processed. And the
memory saved is pretty much completely negligible unless there were a
huge number of active processes floating around -- which there usually
are not.

OTOH, it is more complicated and thus more prone to bugs and not as easy
to understand. And the metric is also not as useful...

> Maybe make the comment shorter by just saying that a wrap-around could
> happen there in case of overflow.

Ok. I will update that with the next revision.

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-14 14:58                               ` Matthias Dahl
@ 2017-11-14 15:24                                 ` Paul Eggert
  2017-11-14 16:03                                   ` Eli Zaretskii
  0 siblings, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2017-11-14 15:24 UTC (permalink / raw)
  To: Matthias Dahl, Eli Zaretskii; +Cc: emacs-devel

On 11/14/2017 06:58 AM, Matthias Dahl wrote:
> If during an active
> call to wait_... all recursive calls happen to read exactly 2**32 (or
> whatever bit depth EMACS_UINT has) bytes back, then we will miss it
> completely and stall.

First, this means that the companion idea of subtracting the two 
counters to yield a byte count is also buggy because the byte count will 
be wrong for this call. This would be a bug that could happen whenever a 
successful recursive call occurs, which apparently is quite common. So 
if we stick with the current approach, we definitely should be dropping 
the requirement that Eli was thinking of, which says that a positive 
number returned by wait_reading_process_output indicates the number of 
bytes read for this call.

Second, I don't like leaving a known bug in the code, even if the bug is 
unlikely. Too often, these extreme cases occur anyway (e.g., due to a 
network attack). I'd prefer a slightly-more-complicated solution where 
the bug cannot occur. It can't be that hard to fix.




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-14 15:24                                 ` Paul Eggert
@ 2017-11-14 16:03                                   ` Eli Zaretskii
  2017-11-14 16:23                                     ` Eli Zaretskii
  0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2017-11-14 16:03 UTC (permalink / raw)
  To: Paul Eggert; +Cc: ml_emacs-lists, emacs-devel

> Cc: emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Tue, 14 Nov 2017 07:24:31 -0800
> 
> On 11/14/2017 06:58 AM, Matthias Dahl wrote:
> > If during an active
> > call to wait_... all recursive calls happen to read exactly 2**32 (or
> > whatever bit depth EMACS_UINT has) bytes back, then we will miss it
> > completely and stall.
> 
> First, this means that the companion idea of subtracting the two 
> counters to yield a byte count is also buggy because the byte count will 
> be wrong for this call. This would be a bug that could happen whenever a 
> successful recursive call occurs, which apparently is quite common. So 
> if we stick with the current approach, we definitely should be dropping 
> the requirement that Eli was thinking of, which says that a positive 
> number returned by wait_reading_process_output indicates the number of 
> bytes read for this call.

No, I'm not dropping that requirement.

> Second, I don't like leaving a known bug in the code, even if the bug is 
> unlikely. Too often, these extreme cases occur anyway (e.g., due to a 
> network attack). I'd prefer a slightly-more-complicated solution where 
> the bug cannot occur. It can't be that hard to fix.

Please describe such a solution, or show the code.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-14 16:03                                   ` Eli Zaretskii
@ 2017-11-14 16:23                                     ` Eli Zaretskii
  2017-11-14 21:54                                       ` Paul Eggert
  0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2017-11-14 16:23 UTC (permalink / raw)
  To: eggert; +Cc: ml_emacs-lists, emacs-devel

> Date: Tue, 14 Nov 2017 18:03:33 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: ml_emacs-lists@binary-island.eu, emacs-devel@gnu.org
> 
> > Second, I don't like leaving a known bug in the code, even if the bug is 
> > unlikely. Too often, these extreme cases occur anyway (e.g., due to a 
> > network attack). I'd prefer a slightly-more-complicated solution where 
> > the bug cannot occur. It can't be that hard to fix.
> 
> Please describe such a solution, or show the code.

And also the problem you are talking about, because I'm not sure I
understand it well enough.

Thanks.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-14 16:23                                     ` Eli Zaretskii
@ 2017-11-14 21:54                                       ` Paul Eggert
  2017-11-15 14:03                                         ` Matthias Dahl
  0 siblings, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2017-11-14 21:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ml_emacs-lists, emacs-devel

On 11/14/2017 08:23 AM, Eli Zaretskii wrote:
> And also the problem you are talking about, because I'm not sure I
> understand it well enough.

I doubt whether anyone understands the problem well enough, which is why 
I have been asking questions about the proposed solution - which as far 
as I can see, is more of a band-aid rather than a real fix. To help move 
this forward, I just reread the original bug report here:

https://lists.gnu.org/archive/html/emacs-devel/2017-10/msg00743.html

and I have some further questions that may help understand what's going 
on. The bug report said:

> flyspell.el ... waits for output from its spellchecker process
> through accept-process-output and specifies that specific process as
> wait_proc. Now depending on timing (race),
> wait_reading_process_output can call the pending timers... which in
> turn can call accept-process-output again. This almost always leads
> to the spellchecker output being read back in full, so there is no
> more data left to be read. Thus the original accept-process-output,
> which called wait_reading_process_output, will wait for the data to
> become available forever since it has no way to know that those have
> already been read.

When this happens, it appears that the original accept-process-output 
acted by calling wait_reading_process_output (0, 0, 0, 0, Qnil, PROC, 0) 
where PROC is the ispell-process. First, is that correct? (If not, my 
remaining questions may be a wild goose chase....)

This meant the original wait_reading_process_output did the following:
set wait = INFINITY, run the timers (which apparently call
wait_reading_process_output recursively), then check whether
update_tick != process_tick (line 5182
of process.c in commit 79108894dbcd642121466bb6af6c98c6a56e9233). Is 
update_tick equal to process_tick in the problematic call? I'll assume 
so, but please check this. (If not, my remaining questions may need to 
be changed.)

Next, the original wait_reading_process_output checks whether 
wait_proc->raw_status_new is nonzero (line 5210). Is it nonzero? For 
now, I'll assume it is zero. (If not, my remaining questions may need to 
be changed.)

Next, the original wait_reading_process_output checks whether (! EQ 
(wait_proc->status, Qrun) && ! connecting_status (wait_proc->status)) 
(line 5213). Does this check succeed? For now, I'll assume this check 
returns false. (If not, then we need to understand why.)

Next, the original wait_reading_process_output recomputes the input wait 
masks, sets check_delay = 0, check_write = true, no_avail = 0, timeout = 
timer_delay (line 5355), and so forth. This means it will call select 
with a nonzero timeout, even though we don't want it to do that: we want 
wait_reading_process_output to return 0, because it attempted to receive 
input but got none.

The changes you're proposing essentially kick the code so that it 
pretends that it read some bytes, even though it didn't (because the 
bytes were actually read and processed by a subroutine), causing it to 
exit the loop (and return nonzero instead of zero -- why?). But isn't 
this kick what the update_tick != process_tick (line 5182) check is 
supposed to be doing? And if so, why isn't that check working for your 
case? Is it because the code is forgetting to increment a tick count?

The above sort of reasoning is what needs to be done 
with this sort of change to such an intricate part of the Emacs code.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-14 21:54                                       ` Paul Eggert
@ 2017-11-15 14:03                                         ` Matthias Dahl
  2017-11-16 15:37                                           ` Eli Zaretskii
  2017-11-16 16:46                                           ` Paul Eggert
  0 siblings, 2 replies; 151+ messages in thread
From: Matthias Dahl @ 2017-11-15 14:03 UTC (permalink / raw)
  To: Paul Eggert, Eli Zaretskii; +Cc: emacs-devel

Hello Paul...

On 14/11/17 22:54, Paul Eggert wrote:

> I doubt whether anyone understands the problem well enough, which is why
> I have been asking questions about the proposed solution - which as far
> as I can see, is more of a band-aid rather than a real fix.

What makes you think that, just out of curiosity? I poured quite a lot
of time into debugging this bug, reading up on the relevant C sources
and coming up with a "simple" enough fix that is easy to grasp and review.

This was by no means a 5-minute look and fix... which is nothing I would
ever do, period. If anything, I am usually too thorough for my own good.

> When this happens, it appears that the original accept-process-output
> acted by calling wait_reading_process_output (0, 0, 0, 0, Qnil, PROC, 0) where
> PROC is the ispell-process. First, is that correct? (If not, my
> remaining questions may be a wild goose chase....)

Exactly. You can also have a deeper look at things, as there is a full
and detailed backtrace posted a few messages back (emacs-bt-full.txt).

> This meant the original wait_reading_process_output did the following:
> set wait = INFINITY, run the timers (which apparently call
> wait_reading_process_output recursively), then check whether
> update_tick != process_tick (line 5182
> of process.c in commit 79108894dbcd642121466bb6af6c98c6a56e9233). Is
> update_tick equal to process_tick in the problematic call? I'll assume
> so, but please check this. (If not, my remaining questions may need to
> be changed.)

This is pretty much irrelevant, if I am not missing some huge piece
of the puzzle somewhere.

If the branch update_tick != process_tick is taken, there is nothing in
there that would eventually notice that the data from our wait_proc has
already been read.

If thread_select signaled that there is no more data available, we end
up with status_notify for our wait_proc -- and that will also only try
to read any remaining data, if at all.

The crucial part is: ALL data has been read from our wait_proc while we
were running timers or filters -- and no further data will become
available until there is some interaction again with the process. That
is the case with the ispell process.

wait_reading_... currently has no chance/code to detect such an event
since it solely relies on pselect calls and such -- which will come up
empty handed if no further data is available, and there is nothing to
change that.
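
The situation can be modelled with a toy sketch. This is a deliberately
simplified model and not the real wait loop: struct toy_proc, toy_read
and toy_wait_iteration are hypothetical names, and a plain read stands in
for the pselect call plus the subsequent read:

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a process whose output we wait for.  */
struct toy_proc
{
  unsigned pending;        /* bytes written by the process, not yet read */
  uintmax_t nbytes_read;   /* total bytes read so far (wraps on overflow) */
};

/* Read everything that is pending and return the number of bytes.  */
unsigned
toy_read (struct toy_proc *p)
{
  unsigned n = p->pending;
  p->pending = 0;
  p->nbytes_read += n;
  return n;
}

/* One iteration of a simplified wait loop.  RUN_TIMER simulates a
   timer that recursively accepts output from the same process.
   Return true if the wait can finish, false if it would go on to
   block because no data is left to read.  */
bool
toy_wait_iteration (struct toy_proc *p, bool run_timer, bool use_accounting)
{
  uintmax_t before = p->nbytes_read;
  if (run_timer)
    toy_read (p);                  /* nested call drains the data */
  if (use_accounting && p->nbytes_read != before)
    return true;                   /* read back noticed via the counter */
  return toy_read (p) > 0;         /* stands in for select + read */
}
```

Without the accounting, an iteration in which the timer drained the
data finds nothing left to read and would block; with it, the same
iteration notices that the counter moved and can return.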

[ ... I skipped the remaining questions for now. If you still think
      those are relevant, let me know and I will do my best to answer
      those as well. ...]

> The changes you're proposing essentially kick the code so that it
> pretends that it read some bytes, even though it didn't (because the
> bytes were actually read and processed by a subroutine), causing it to
> exit the loop (and return nonzero instead of zero -- why?). But isn't
> this kick what the update_tick != process_tick (line 5182) check is
> supposed to be doing? And if so, why isn't that check working for your
> case? Is it because the code is forgetting to increment a tick count?

The fix is no band-aid, as you put it earlier. To answer your questions:

1)
Yes, it gives wait_reading_... the possibility to exit the loop and
return >= 1, if the data for our wait_proc has been read while we were
running timers or filters.

2)
Returning >= 1 is semantically correct, imho. We were waiting for data
to become available. That data became available. Granted, it is not us
directly who read it and passed it to the filter, but still, the caller
expects us to signal if data becomes available... and that is exactly
what happened. Returning anything else wouldn't make much sense, imho
and probably (?) cause problems in this particular case...

3)
update_tick != process_tick ... see earlier in this mail for why this
is not helping a bit with this particular case.

By the way, I do have ideas for solutions that have no corner-cases,
which is what you are asking for. My current solution will fail to
detect a read-back if exactly 2**(bit depth of counter) bytes have been
read, which is very (!) unlikely, but still. That's it... otherwise it
will work reliably.

But that requires that we change wait_reading_... to really only return
-1, 0 or 1 and give up returning the real amount of bytes. Otherwise we
either end up with the same corner-case all over again, just in a more
complex solution or with a way too complex solution for this problem.
No user is using the return value in such a way in-tree. And given that
those changes are going only to master, it gives all out-of-tree users
who might use that value differently, ample time to adjust -- or voice
their concerns.

Thanks for your thorough questions and review -- very much appreciated!

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-15 14:03                                         ` Matthias Dahl
@ 2017-11-16 15:37                                           ` Eli Zaretskii
  2017-11-16 16:46                                           ` Paul Eggert
  1 sibling, 0 replies; 151+ messages in thread
From: Eli Zaretskii @ 2017-11-16 15:37 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: eggert, emacs-devel

> Cc: emacs-devel@gnu.org
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Wed, 15 Nov 2017 15:03:27 +0100
> 
> The crucial part is: ALL data has been read from our wait_proc while we
> were running timers or filters -- and no further data will become
> available until there is some interaction again with the process. That
> is the case with the ispell process.
> 
> wait_reading_... currently has no chance/code to detect such an event
> since it solely relies on pselect calls and such -- which will come up
> empty handed if no further data is available, and there is nothing to
> change that.

Indeed, that was also my conclusion.  Paul, if you are saying that
there is some machinery in wait_reading_process_output to handle such
situations, and it just doesn't DTRT in this case for some reason,
then please point out where that code lives.

Thanks.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-15 14:03                                         ` Matthias Dahl
  2017-11-16 15:37                                           ` Eli Zaretskii
@ 2017-11-16 16:46                                           ` Paul Eggert
  2017-11-18 14:24                                             ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2017-11-16 16:46 UTC (permalink / raw)
  To: Matthias Dahl, Eli Zaretskii; +Cc: emacs-devel

On 11/15/2017 06:03 AM, Matthias Dahl wrote:
> The crucial part is: ALL data has been read from our wait_proc while we
> were running timers or filters -- and no further data will become
> available until there is some interaction again with the process.

Sure, but how do we know that the data read while running timers and 
filters was being read on behalf of our caller? Perhaps a timer or 
filter fired off some Elisp function that decided to read data for its 
own purposes, unrelated to our caller. We wouldn't want to count the 
data read by that function as being data of interest to our caller.

In your ispell case we know that the timers and filters are reading on 
ispell's behalf, so the proposed fix should be OK. (If memory serves, 
integer overflow isn't possible for ispell either.) But I don't yet see 
how the fix works in general.




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-16 16:46                                           ` Paul Eggert
@ 2017-11-18 14:24                                             ` Matthias Dahl
  2017-11-18 14:51                                               ` Eli Zaretskii
  2017-11-19  7:07                                               ` Paul Eggert
  0 siblings, 2 replies; 151+ messages in thread
From: Matthias Dahl @ 2017-11-18 14:24 UTC (permalink / raw)
  To: Paul Eggert, Eli Zaretskii; +Cc: emacs-devel

Hello Paul...

On 16/11/17 17:46, Paul Eggert wrote:

> Sure, but how do we know that the data read while running timers and
> filters was being read on behalf of our caller? Perhaps a timer or
> filter fired off some Elisp function that decided to read data for its
> own purposes, unrelated to our caller. We wouldn't want to count the
> data read by that function as being data of interest to our caller.

I had considered that when I debugged the bug but think about it for a
moment. If you treat the process as a shared resource, it is your sole
responsibility to take care of proper management and synchronization of
the process as well.

If a wait_... is in progress for process A which is the response to some
interaction A* (w/ filter F1), then if the timers get processed during
our wait and end up with another interaction B* (w/ filter F2) to
process A that will cause havoc either way. They will probably read the
data that was destined for filter F1 or things get messed up even more
horribly.

Thus, that should not happen. And there is actually nothing Emacs can do
about it from its side. This is solely the responsibility of package
authors and so forth to make sure things like that do not happen through
the usual mechanics and techniques.

And doing the same from a filter... well... everything stated above is
true here as well.

The current situation is without a doubt a bug -- Emacs should detect
that data was read and processed and not hang indefinitely. That we can
agree on, I hope. :-)

We could, by the way, avoid this whole problem and dilemma if we shift
the processing of timers to _AFTER_ we are finished with everything. But
this brings in new problems, like if we have to wait too long for the
data to become available, timers would get delayed quite a bit. And they
would only fire once, no matter how much time has passed. So this is not
ideal either.

Again, I do not see the problem with my solution. We cannot and should
not account for bugs in 3rd party package implementations like the one
stated above.

If I'm wrong here or missing something, please don't hesitate to correct
me. Right now, at least, I am not seeing any problems.

Have a nice weekend,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-18 14:24                                             ` Matthias Dahl
@ 2017-11-18 14:51                                               ` Eli Zaretskii
  2017-11-18 17:14                                                 ` Stefan Monnier
  2017-11-19  7:07                                               ` Paul Eggert
  1 sibling, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2017-11-18 14:51 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: eggert, emacs-devel

> Cc: emacs-devel@gnu.org
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Sat, 18 Nov 2017 15:24:26 +0100
> 
> On 16/11/17 17:46, Paul Eggert wrote:
> 
> > Sure, but how do we know that the data read while running timers and
> > filters was being read on behalf of our caller? Perhaps a timer or
> > filter fired off some Elisp function that decided to read data for its
> > own purposes, unrelated to our caller. We wouldn't want to count the
> > data read by that function as being data of interest to our caller.
> 
> I had considered that when I debugged the bug but think about it for a
> moment. If you treat the process as a shared resource, it is your sole
> responsibility to take care of proper management and synchronization of
> the process as well.
> 
> If a wait_... is in progress for process A which is the response to some
> interaction A* (w/ filter F1), then if the timers get processed during
> our wait and end up with another interaction B* (w/ filter F2) to
> process A that will cause havoc either way. They will probably read the
> data that was destined for filter F1 or things get messed up even more
> horribly.

I think the normal situation is where each process has only one
filter, and therefore even if the output of the process was read by
some unrelated call to wait_reading_process_output, that output was
processed by the correct filter.

IOW, there should be no problems with the actual processing of the
process output, the problem is with the caller of
accept-process-output etc., which must receive an indication that some
output was received and processed.  And that's what the proposed
change is trying to solve -- to prevent that indication from being
lost due to recursive calls to wait_reading_process_output.

> We could, by the way, avoid this whole problem and dilemma if we shift
> the processing of timers to _AFTER_ we are finished with everything. But
> this brings in new problems, like if we have to wait too long for the
> data to become available, timers would get delayed quite a bit. And they
> would only fire once, no matter how much time has passed. So this is not
> ideal either.

No, this will introduce much worse problems.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-18 14:51                                               ` Eli Zaretskii
@ 2017-11-18 17:14                                                 ` Stefan Monnier
  0 siblings, 0 replies; 151+ messages in thread
From: Stefan Monnier @ 2017-11-18 17:14 UTC (permalink / raw)
  To: emacs-devel

> I think the normal situation is where each process has only one
> filter, and therefore even if the output of the process was read by
> some unrelated call to wait_reading_process_output, that output was
> processed by the correct filter.

Even with a single filter, sometimes the precise context in which this
filter is run matters (e.g. it sets a dynamically bound variable), but
in any case, there's nothing we can do about those problems from within
wait_reading_process_output: in those cases where the input needs to be
processed from within a particular call to wait_reading_process_output,
the problem needs to be solved at the Elisp level.


        Stefan




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-18 14:24                                             ` Matthias Dahl
  2017-11-18 14:51                                               ` Eli Zaretskii
@ 2017-11-19  7:07                                               ` Paul Eggert
  2017-11-19 15:42                                                 ` Eli Zaretskii
  2017-11-20 15:29                                                 ` Matthias Dahl
  1 sibling, 2 replies; 151+ messages in thread
From: Paul Eggert @ 2017-11-19  7:07 UTC (permalink / raw)
  To: Matthias Dahl, Eli Zaretskii; +Cc: emacs-devel

Matthias Dahl wrote:
> If you treat the process as a shared resource, it is your sole
> responsibility to take care of proper management and synchronization of
> the process as well.

OK, but this is all news to me. Shouldn't this be documented? As things stand, 
it is not obvious.

So, getting back to the patch proposed in 
<https://lists.gnu.org/r/emacs-devel/2017-11/msg00193.html>, this discussion 
convinced me that the approach will work well enough. I have the following 
suggestions for improvement:

* Fix the bug with carryover that I mentioned in 
<https://lists.gnu.org/r/emacs-devel/2017-11/msg00283.html>.

* Document in the Elisp manual that filters and timers are supposed to do 
"proper management and synchronization", and be clear about how this constrains 
filters and timers. (This is probably the hardest part of the fix....)

* Change the type of infd_num_bytes_read from EMACS_UINT to uintmax_t. This will 
provide an extra margin of safety on some platforms. infd_num_bytes_read has 
nothing to do with Emacs integers, and wider counts are safer.

* Document in its comment that infd_num_bytes_read is actually the count modulo 
UINTMAX_MAX + 1.

* When assigning to got_some_output, ceiling it at INT_MAX to avoid overflow 
problems. Something like the following, say:

    got_some_output = min (INT_MAX, (wait_proc->infd_num_bytes_read
                                     - initial_wait_proc_num_bytes_read));

This removes the need for that long comment about overflow, since this 
assignment cannot overflow.
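
As a self-contained sketch of that clamping, assuming the counters are
uintmax_t and the result must fit in an int (the helper name
clamp_bytes_read is hypothetical and only used for illustration):

```c
#include <limits.h>
#include <stdint.h>

/* Return the number of bytes read between the two counter values,
   capped at INT_MAX.  The subtraction is done in uintmax_t, so it
   is well defined even if CURRENT wrapped around past INITIAL.  */
int
clamp_bytes_read (uintmax_t initial, uintmax_t current)
{
  uintmax_t diff = current - initial;   /* modulo UINTMAX_MAX + 1 */
  return diff < INT_MAX ? (int) diff : INT_MAX;
}
```

The result is never negative, so a caller that only tests whether the
value is positive keeps working regardless of how many bytes were read.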

Thanks.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-19  7:07                                               ` Paul Eggert
@ 2017-11-19 15:42                                                 ` Eli Zaretskii
  2017-11-19 17:06                                                   ` Paul Eggert
  2017-11-20 15:29                                                 ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2017-11-19 15:42 UTC (permalink / raw)
  To: Paul Eggert; +Cc: ml_emacs-lists, emacs-devel

> Cc: emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sat, 18 Nov 2017 23:07:37 -0800
> 
> * Fix the bug with carryover that I mentioned in 
> <https://lists.gnu.org/r/emacs-devel/2017-11/msg00283.html>.

Not sure what bug that is, or how you would propose to fix it.

> * Document in the Elisp manual that filters and timers are supposed to do 
> "proper management and synchronization", and be clear about how this constrains 
> filters and timers. (This is probably the hardest part of the fix....)

I will handle this part.

> * Change the type of infd_num_bytes_read from EMACS_UINT to uintmax_t. This will 
> provide an extra margin of safety on some platforms. infd_num_bytes_read has 
> nothing to do with Emacs integers, and wider counts are safer.
> 
> * Document in its comment that infd_num_bytes_read is actually the count modulo 
> UINTMAX_MAX + 1.
> 
> * When assigning to got_some_output, ceiling it at INT_MAX to avoid overflow 
> problems. Something like the following, say:
> 
>     got_some_output = min (INT_MAX, (wait_proc->infd_num_bytes_read
>                                      - initial_wait_proc_num_bytes_read));

If we are using uintmax_t, why limit this by INT_MAX?  Overflow of
unsigned values should never be a problem, AFAIK.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-19 15:42                                                 ` Eli Zaretskii
@ 2017-11-19 17:06                                                   ` Paul Eggert
  0 siblings, 0 replies; 151+ messages in thread
From: Paul Eggert @ 2017-11-19 17:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ml_emacs-lists, emacs-devel

Eli Zaretskii wrote:
>> Cc: emacs-devel@gnu.org
>> From: Paul Eggert <eggert@cs.ucla.edu>
>> Date: Sat, 18 Nov 2017 23:07:37 -0800
>>
>> * Fix the bug with carryover that I mentioned in
>> <https://lists.gnu.org/r/emacs-devel/2017-11/msg00283.html>.
> 
> Not sure what bug that is, or how you would propose to fix it.

The bug is that with the patch as written, wait_reading_process_output might 
return a positive value even though it didn't read anything (nor did any timers 
or filters).

>>     got_some_output = min (INT_MAX, (wait_proc->infd_num_bytes_read
>>                                      - initial_wait_proc_num_bytes_read));
> If we are using uintmax_t, why limit this by INT_MAX?

Because got_some_output is an int. And it's an int because the function's 
callers (either directly or indirectly) expect int. And the callers don't care 
what the value is, so long as it's positive (this is documented in the comments 
describing wait_reading_process_output etc.).

We could change got_some_output and all its users to employ the type intmax_t 
rather than int, but even then we'd need to limit it to INTMAX_MAX, so why 
bother? It wouldn't fix any bugs.
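
To illustrate the point about the return type, here is a minimal sketch. It assumes a hypothetical helper named clamp_progress, which is not part of Emacs, and shows how an unsigned byte-count difference can be narrowed to the int that the callers expect. Since callers only check whether the value is positive, saturating at INT_MAX loses nothing they rely on.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper (not in Emacs): turn an unsigned byte-count
   difference into the "positive int means progress" convention.  */
static int
clamp_progress (uintmax_t delta)
{
  /* Compare in uintmax_t, so nothing is narrowed before the check.  */
  int result = delta > INT_MAX ? INT_MAX : (int) delta;

  /* The result can never become negative, whatever DELTA was.  */
  assert (result >= 0);
  return result;
}
```

With this, a difference of 5 stays 5, while any difference above INT_MAX (up to UINTMAX_MAX) is reported as INT_MAX rather than being converted to int with an implementation-defined result.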



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-19  7:07                                               ` Paul Eggert
  2017-11-19 15:42                                                 ` Eli Zaretskii
@ 2017-11-20 15:29                                                 ` Matthias Dahl
  2017-11-21 14:44                                                   ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2017-11-20 15:29 UTC (permalink / raw)
  To: Paul Eggert, Eli Zaretskii; +Cc: emacs-devel

Hello Paul...

On 19/11/17 08:07, Paul Eggert wrote:

> * Fix the bug with carryover that I mentioned in
> <https://lists.gnu.org/r/emacs-devel/2017-11/msg00283.html>.

Done already locally, will be in the revised patch.

> * Document in the Elisp manual that filters and timers are supposed to
> do "proper management and synchronization", and be clear about how this
> constrains filters and timers. (This is probably the hardest part of the
> fix....)

I believe Eli said he will take care of this one.

> * Change the type of infd_num_bytes_read from EMACS_UINT to uintmax_t.
> This will provide an extra margin of safety on some platforms.
> infd_num_bytes_read has nothing to do with Emacs integers, and wider
> counts are safer.

Thanks, didn't know about that one. Will do.

> * Document in its comment that infd_num_bytes_read is actually the count
> modulo UINTMAX_MAX + 1.

On my todo, will be on the new patch.

> * When assigning to got_some_output, ceiling it at INT_MAX to avoid
> overflow problems. Something like the following, say:
> 
>    got_some_output = min (INT_MAX, (wait_proc->infd_num_bytes_read
>                                     - initial_wait_proc_num_bytes_read));

Actually, I already spotted this and corrected it locally. I would have
mentioned it in the revised patch. Thanks though for your keen eye. :-)

> This removes the need for that long comment about overflow, since this
> assignment cannot overflow.

Not quite. The long comment explicitly explains why we can always do
the subtraction this way because we could end up in a situation where we
subtract a larger number from a smaller number, e.g. when the initial
value was close to the max and once data was read, we had a wrap around.

The assignment itself was another issue, that went unnoticed in the
first patch. ;-)
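
The wrap-around case described above can be sketched as follows. This is only an illustration under assumed names (bytes_since and wraparound_demo are hypothetical, not taken from the patch): because the counter is kept modulo UINTMAX_MAX + 1, the unsigned subtraction yields the real number of bytes read even when the current value is numerically smaller than the initial one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes read between two samples of a counter
   that is maintained modulo UINTMAX_MAX + 1.  */
static uintmax_t
bytes_since (uintmax_t start, uintmax_t now)
{
  /* Unsigned subtraction is itself modulo UINTMAX_MAX + 1, so a
     single wrap of the counter cancels out.  */
  return now - start;
}

/* Walk through the case where the initial value is close to the
   maximum and reading more data wraps the counter.  */
static uintmax_t
wraparound_demo (void)
{
  uintmax_t start = UINTMAX_MAX - 2;  /* counter just below the maximum */
  uintmax_t now = start + 10;         /* ten more bytes read; wraps to 7 */

  assert (now < start);               /* the raw counter went "backwards" */
  return bytes_since (start, now);    /* still ten bytes of progress */
}
```

The narrowing of this difference to the int return value is a separate concern from the subtraction itself.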

I'll update the patches tomorrow most likely and send them to the list
as I just didn't get around to it today.

Thanks again for all the great feedback.

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-20 15:29                                                 ` Matthias Dahl
@ 2017-11-21 14:44                                                   ` Matthias Dahl
  2017-11-21 21:31                                                     ` Clément Pit-Claudel
  2017-11-22  8:55                                                     ` Paul Eggert
  0 siblings, 2 replies; 151+ messages in thread
From: Matthias Dahl @ 2017-11-21 14:44 UTC (permalink / raw)
  To: Paul Eggert, Eli Zaretskii; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 315 bytes --]

Hello Eli and Paul,

attached are the updated patches, which include all the discussed
changes and fixes.

If there is anything else, please let me know.

Thanks again for all the patience and valuable feedback.

Have a nice day,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu


[-- Attachment #2: 0001-Add-process-output-read-accounting.patch --]
[-- Type: text/x-patch; name="0001-Add-process-output-read-accounting.patch", Size: 1642 bytes --]

From 94cd9ac3867305184f310dbf411729c59897c2c5 Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Tue, 24 Oct 2017 15:55:53 +0200
Subject: [PATCH 1/2] Add process output read accounting

This tracks the bytes read from a process's stdin which is not used
anywhere yet but required for follow-up work.
* src/process.c (read_process_output): Track bytes read from a process.
* src/process.h (struct Lisp_Process): Add infd_num_bytes_read
to track bytes read from a process.
---
 src/process.c | 4 ++++
 src/process.h | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/src/process.c b/src/process.c
index fc46e74332..ab023457bd 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5886,6 +5886,10 @@ read_process_output (Lisp_Object proc, int channel)
       nbytes += buffered && nbytes <= 0;
     }
 
+  /* Don't count carryover as those bytes have already been count by
+     a previous iteration.  */
+  p->infd_num_bytes_read += nbytes;
+
   p->decoding_carryover = 0;
 
   /* At this point, NBYTES holds number of bytes just received
diff --git a/src/process.h b/src/process.h
index 5670f44736..96c19fcf81 100644
--- a/src/process.h
+++ b/src/process.h
@@ -129,6 +129,8 @@ struct Lisp_Process
     pid_t pid;
     /* Descriptor by which we read from this process.  */
     int infd;
+    /* Byte-count modulo (UINTMAX_MAX + 1) for process output read from `infd'.  */
+    uintmax_t infd_num_bytes_read;
     /* Descriptor by which we write to this process.  */
     int outfd;
     /* Descriptors that were created for this process and that need
-- 
2.15.0


[-- Attachment #3: 0002-src-process.c-wait_reading_process_output-Fix-wait_p.patch --]
[-- Type: text/x-patch; name="0002-src-process.c-wait_reading_process_output-Fix-wait_p.patch", Size: 3769 bytes --]

From 1bbe69611bb4db8bd4149d57cfa5be548ee64c9d Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Tue, 24 Oct 2017 15:56:47 +0200
Subject: [PATCH 2/2] * src/process.c (wait_reading_process_output): Fix
 wait_proc hang.

If called recursively (through timers or process filters by the means
of accept-process-output), it is possible that the output of wait_proc
has already been read by one of those recursive calls, leaving the
original call hanging forever if no further output arrives through
that fd and no timeout has been specified. Implement proper checks by
taking advantage of the process output read accounting.
---
 src/process.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/src/process.c b/src/process.c
index ab023457bd..b75ac171a1 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5003,6 +5003,8 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
   struct timespec got_output_end_time = invalid_timespec ();
   enum { MINIMUM = -1, TIMEOUT, INFINITY } wait;
   int got_some_output = -1;
+  uintmax_t initial_wait_proc_num_bytes_read = (wait_proc) ?
+                                               wait_proc->infd_num_bytes_read : 0;
 #if defined HAVE_GETADDRINFO_A || defined HAVE_GNUTLS
   bool retry_for_async;
 #endif
@@ -5161,6 +5163,19 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 	      && requeued_events_pending_p ())
 	    break;
 
+          /* Timers could have called `accept-process-output', thus reading the output
+             of wait_proc while we (in the worst case) wait endlessly for it to become
+             available later. So we need to check if data has been read and break out
+             early if that is so since our job has been fulfilled. */
+          if (wait_proc
+              && wait_proc->infd_num_bytes_read != initial_wait_proc_num_bytes_read)
+            {
+              /* Make sure we don't overflow signed got_some_output.
+                 Calculating bytes read is modulo (UINTMAX_MAX + 1) and won't overflow. */
+              got_some_output = min(INT_MAX, (wait_proc->infd_num_bytes_read
+                                              - initial_wait_proc_num_bytes_read));
+            }
+
           /* This is so a breakpoint can be put here.  */
           if (!timespec_valid_p (timer_delay))
               wait_reading_process_output_1 ();
@@ -5606,7 +5621,20 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 		 buffered-ahead character if we have one.  */
 
 	      nread = read_process_output (proc, channel);
-	      if ((!wait_proc || wait_proc == XPROCESS (proc))
+
+              /* In case a filter was run that called `accept-process-output', it is
+                 possible that the output from wait_proc was already read, leaving us
+                 waiting for it endlessly (if no timeout was specified). Thus, we need
+                 to check if data was already read. */
+              if (wait_proc
+                  && wait_proc->infd_num_bytes_read != initial_wait_proc_num_bytes_read)
+                {
+                  /* Make sure we don't overflow signed got_some_output.
+                     Calculating bytes read is modulo (UINTMAX_MAX + 1) and won't overflow. */
+                  got_some_output = min(INT_MAX, (wait_proc->infd_num_bytes_read
+                                                  - initial_wait_proc_num_bytes_read));
+                }
+	      else if ((!wait_proc || wait_proc == XPROCESS (proc))
 		  && got_some_output < nread)
 		got_some_output = nread;
 	      if (nread > 0)
-- 
2.15.0


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-21 14:44                                                   ` Matthias Dahl
@ 2017-11-21 21:31                                                     ` Clément Pit-Claudel
  2017-11-22 14:14                                                       ` Matthias Dahl
  2017-11-22  8:55                                                     ` Paul Eggert
  1 sibling, 1 reply; 151+ messages in thread
From: Clément Pit-Claudel @ 2017-11-21 21:31 UTC (permalink / raw)
  To: emacs-devel; +Cc: ml_emacs-lists

On 2017-11-21 09:44, Matthias Dahl wrote:
> +  /* Don't count carryover as those bytes have already been count by
Should this be "counted"?

> +  uintmax_t initial_wait_proc_num_bytes_read = (wait_proc) ?
Do you need the parentheses here?

Thanks for fixing that bug, btw! :)

Clément.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-21 14:44                                                   ` Matthias Dahl
  2017-11-21 21:31                                                     ` Clément Pit-Claudel
@ 2017-11-22  8:55                                                     ` Paul Eggert
  2017-11-22 14:33                                                       ` Matthias Dahl
  2017-12-04  9:40                                                       ` Matthias Dahl
  1 sibling, 2 replies; 151+ messages in thread
From: Paul Eggert @ 2017-11-22  8:55 UTC (permalink / raw)
  To: Matthias Dahl, Eli Zaretskii; +Cc: emacs-devel

> +  /* Don't count carryover as those bytes have already been count by
> +     a previous iteration.  */
> +  p->infd_num_bytes_read += nbytes;
> +

This doesn't look right, as nbytes might be negative (indicating an error).
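
A minimal sketch of the concern, using a stand-in struct and a hypothetical account_read helper (neither is the actual Emacs code): adding a negative nbytes to an unsigned counter converts it to a huge unsigned value, so the counter would change even though nothing was read. Guarding on a positive count avoids that.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for struct Lisp_Process; only the counter matters here.  */
struct reader
{
  uintmax_t nbytes_read;
};

/* Hypothetical helper: account for the result of one read attempt.
   NBYTES <= 0 means an error or end of file, not data.  */
static void
account_read (struct reader *r, intmax_t nbytes)
{
  assert (r);

  /* Without this guard, NBYTES == -1 would be converted to
     UINTMAX_MAX and the counter would move although no data
     arrived, which a later "did the counter change?" test would
     mistake for progress.  */
  if (nbytes > 0)
    r->nbytes_read += (uintmax_t) nbytes;
}
```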


> Subject: [PATCH 2/2] * src/process.c (wait_reading_process_output): Fix
>  wait_proc hang.

Please start with a summary line that doesn't contain so much detail. <=50 chars 
is good. No trailing period. "Fix wait_reading_process_output_hang", perhaps.

> +  uintmax_t initial_wait_proc_num_bytes_read = (wait_proc) ?
> +                                               wait_proc->infd_num_bytes_read : 0;

Kind of a long name, no? Perhaps make it shorter, so that you can write 
something like this:

    uintmax_t nbytes_read0 = wait_proc ? wait_proc->infd_num_bytes_read : 0;

Even better, shorten the member name too: "nbytes_read" is easier to read than 
"infd_num_bytes_read".

> +          /* Timers could have called `accept-process-output', thus reading the output

Please limit the program to 80 columns.

> +             of wait_proc while we (in the worst case) wait endlessly for it to become
> +             available later. So we need to check if data has been read and break out
> +             early if that is so since our job has been fulfilled. */
> +          if (wait_proc
> +              && wait_proc->infd_num_bytes_read != initial_wait_proc_num_bytes_read)
> +            {
> +              /* Make sure we don't overflow signed got_some_output.
> +                 Calculating bytes read is modulo (UINTMAX_MAX + 1) and won't overflow. */
> +              got_some_output = min(INT_MAX, (wait_proc->infd_num_bytes_read
> +                                              - initial_wait_proc_num_bytes_read));

Space before "(" in function calls.

> +              if (wait_proc
> +                  && wait_proc->infd_num_bytes_read != initial_wait_proc_num_bytes_read)
> +                {
> +                  /* Make sure we don't overflow signed got_some_output.
> +                     Calculating bytes read is modulo (UINTMAX_MAX + 1) and won't overflow. */
> +                  got_some_output = min(INT_MAX, (wait_proc->infd_num_bytes_read
> +                                                  - initial_wait_proc_num_bytes_read));
> +                }

It's annoying that the code (and the comment!) is duplicated. How about putting 
it into a function? Also, there's nothing unusual about unsigned arithmetic 
wrapping around; to me that (repeated) comment is almost as bad as writing "i++; 
/* Add 1 to i.  */". Perhaps that's just me, but at least the obvious comment 
should not be duplicated.

A function like this, perhaps?

   static int
   input_progress (struct Lisp_Process *wait_proc, uintmax_t nbytes_read0)
   {
     if (wait_proc)
       {
	/* This subtraction might wrap around; that's OK.  */
	uintmax_t progress = wait_proc->nbytes_read - nbytes_read0;
	if (progress != 0)
	  return min (progress, INT_MAX);
       }
     return -1;
   }

and then the above chunks could be turned into something as simple as:

   got_some_output = input_progress (wait_proc, nbytes_read0);

with maybe some other trivial changes to make this all work.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-21 21:31                                                     ` Clément Pit-Claudel
@ 2017-11-22 14:14                                                       ` Matthias Dahl
  0 siblings, 0 replies; 151+ messages in thread
From: Matthias Dahl @ 2017-11-22 14:14 UTC (permalink / raw)
  To: Clément Pit-Claudel, emacs-devel

Hello Clément...

On 21/11/17 22:31, Clément Pit-Claudel wrote:

> Should this be "counted"?

Argh. No matter how often you proofread something, you always miss
a spot. Thanks. :-)

> Thanks for fixing that bug, btw! :)

You are very welcome.

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-22  8:55                                                     ` Paul Eggert
@ 2017-11-22 14:33                                                       ` Matthias Dahl
  2017-11-24  2:31                                                         ` Stefan Monnier
  2017-12-28 17:52                                                         ` Eli Zaretskii
  2017-12-04  9:40                                                       ` Matthias Dahl
  1 sibling, 2 replies; 151+ messages in thread
From: Matthias Dahl @ 2017-11-22 14:33 UTC (permalink / raw)
  To: Paul Eggert, Eli Zaretskii; +Cc: emacs-devel

Hello Paul...

On 22/11/17 09:55, Paul Eggert wrote:

> This doesn't look right, as nbytes might be negative (indicating an error).

You are absolutely right. I initially placed it one line up, just before the
carryover is added, but at the last minute moved it further up for clarity
and somehow missed the conditional below (turning blind!?).

Sorry about that. I will pay more attention in the future...

> Please start with a summary line that doesn't contain so much detail.
> <=50 chars is good. No trailing period. "Fix
> wait_reading_process_output_hang", perhaps.

Ok. Thanks for clearing that up. It was hard to figure out what style is
requested for Emacs.

> Kind of a long name, no? Perhaps make it shorter, so that you can write
> something like this:
> 
>    uintmax_t nbytes_read0 = wait_proc ? wait_proc->infd_num_bytes_read : 0;
> 
> Even better, shorten the member name too: "nbytes_read" is easier to
> read than "infd_num_bytes_read".

Well... I will have to think about that one. I am a firm proponent of
descriptive names that convey exactly what a function or variable is
being used for -- even at the expense of the length of that name.

nbytes_read is, imho, too generic -- especially given the huge context
of wait_..., so a good descriptive name helps.

Like I said, I will think about something that is shorter -- but coming
up with this name was not as easy as it may seem because I naturally
aimed at something shorter as well.

> Please limit the program to 80 columns.

Ok, will do.

> Space before "(" in function calls.

Argh, sorry, missed that again. I am not used to that style.

> It's annoying that the code (and the comment!) is duplicated.

I absolutely agree but quite honestly I was convinced I would get
negative feedback if I did put it in a function of its own, given
how the rest of wait_... looked and in general. So I went with the
duplicated code for exactly that reason.

Quite happy to change that.

I will have updated patches available later this week. Thanks for
your great feedback, again.

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-22 14:33                                                       ` Matthias Dahl
@ 2017-11-24  2:31                                                         ` Stefan Monnier
  2017-12-28 17:52                                                         ` Eli Zaretskii
  1 sibling, 0 replies; 151+ messages in thread
From: Stefan Monnier @ 2017-11-24  2:31 UTC (permalink / raw)
  To: emacs-devel

> I absolutely agree but quite honestly I was convinced I would get
> negative feedback if I did put it in a function of its own, given
> how the rest of wait_... looked and in general. So I went with the
> duplicated code for exactly that reason.

Many Emacs functions (especially in the C code) are much too large.
Don't take this as a hint that your code should aim for functions to be
as long as possible.
On the contrary, making them shorter by moving some of their code to
separate functions is welcome.


        Stefan




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-22  8:55                                                     ` Paul Eggert
  2017-11-22 14:33                                                       ` Matthias Dahl
@ 2017-12-04  9:40                                                       ` Matthias Dahl
  2018-02-13 14:25                                                         ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2017-12-04  9:40 UTC (permalink / raw)
  To: Paul Eggert, Eli Zaretskii; +Cc: emacs-devel

Hello guys...

I am terribly sorry I haven't yet made the changes and posted updated
patches. Real life happened, unfortunately, as always. I have not been
kidnapped by aliens -- I believe.

This is at the top of my list and I will post something later this week if
everything goes as planned.

Thanks for the patience.

Have a nice day,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-11-22 14:33                                                       ` Matthias Dahl
  2017-11-24  2:31                                                         ` Stefan Monnier
@ 2017-12-28 17:52                                                         ` Eli Zaretskii
  1 sibling, 0 replies; 151+ messages in thread
From: Eli Zaretskii @ 2017-12-28 17:52 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: eggert, emacs-devel

> Cc: emacs-devel@gnu.org
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Wed, 22 Nov 2017 15:33:18 +0100
> 
> I will have updated patches available later this week. Thanks for
> your great feedback, again.

Matthias, any news?

Thanks.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2017-12-04  9:40                                                       ` Matthias Dahl
@ 2018-02-13 14:25                                                         ` Matthias Dahl
  2018-02-13 16:56                                                           ` Paul Eggert
  2018-02-16 16:01                                                           ` Eli Zaretskii
  0 siblings, 2 replies; 151+ messages in thread
From: Matthias Dahl @ 2018-02-13 14:25 UTC (permalink / raw)
  To: Paul Eggert, Eli Zaretskii; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1188 bytes --]

Hello,

after an unexpectedly prolonged delay due to personal
reasons, for which I apologize, attached is v2 of my patches.

Basically I have gone a somewhat different route. While working in
some of the requested changes, I noticed that there were still some
pathological cases that were not covered and fixing that would make
things even more convoluted.

So, in this version, the return value is calculated (if necessary)
strategically right before we return from the function call, thus it
cannot be missed and we will always properly signal if data was read
from a wait_proc (either directly or indirectly).

And instead of messing with got_some_output, we exit the loop when we
got some data (directly or indirectly) for our wait_proc if there is
no data to be read for this iteration. This leaves the whole function
logic alone -- except for this key point.
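
The approach described in the two paragraphs above can be modelled roughly like this. It is a heavily simplified sketch, not the real function: struct proc, timer_fn, greedy_timer and wait_sketch are made-up names, the loop bound max_polls stands in for what would otherwise be an endless wait, and every poll is assumed to report that no descriptor is ready.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Stand-in for struct Lisp_Process; only the counter matters here.  */
struct proc
{
  uintmax_t nbytes_read;
};

/* A timer or filter that may read process output behind our back.  */
typedef void (*timer_fn) (struct proc *);

/* Example timer: pretends a nested accept-process-output call
   consumed 42 bytes of the process's output.  */
static void
greedy_timer (struct proc *p)
{
  if (p)
    p->nbytes_read += 42;
}

/* Simplified model of the waiting loop.  Returns -1 if nothing was
   read for WAIT_PROC, otherwise a positive count capped at INT_MAX.  */
static int
wait_sketch (struct proc *wait_proc, timer_fn run_timers, int max_polls)
{
  assert (max_polls > 0);

  int got_some_output = -1;
  uintmax_t prev = wait_proc ? wait_proc->nbytes_read : 0;

  for (int i = 0; i < max_polls; i++)
    {
      /* Timers may consume wait_proc's output indirectly.  */
      if (run_timers)
        run_timers (wait_proc);

      /* Nothing is ready in this iteration: leave the loop if some
         nested call already read data for wait_proc.  */
      if (wait_proc && wait_proc->nbytes_read != prev)
        break;
    }

  /* Compute the return value once, right before returning, so that
     indirectly read data is always reported.  */
  if (wait_proc && wait_proc->nbytes_read != prev)
    {
      uintmax_t delta = wait_proc->nbytes_read - prev;
      got_some_output = delta > INT_MAX ? INT_MAX : (int) delta;
    }
  return got_some_output;
}
```

In this model, a call with greedy_timer leaves the loop after the first poll and reports 42, whereas a call without any timer runs through all its polls and returns -1, which corresponds to the case where a timeout or further output is still needed.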

I have addressed the remaining issues, if they still applied. And I
have not been able to trigger a single hang with these patches.

I appreciate any comments and suggestions.

Thanks, again, for all the patience.

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu


[-- Attachment #2: 0001-Add-process-output-read-accounting.patch --]
[-- Type: text/x-patch; name="0001-Add-process-output-read-accounting.patch", Size: 1597 bytes --]

From 94e0dc26f45e1a06881b016dd26446c43d339a4d Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Tue, 24 Oct 2017 15:55:53 +0200
Subject: [PATCH 1/2] Add process output read accounting

This tracks the bytes read from a process' stdin which is not used
anywhere yet but required for follow-up work.
* src/process.c (read_process_output): Track bytes read from a process.
* src/process.h (struct Lisp_Process): Add nbytes_read to track bytes
read from a process.
---
 src/process.c | 3 +++
 src/process.h | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/src/process.c b/src/process.c
index 2ec10b12ec..17fdf592ec 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5889,6 +5889,9 @@ read_process_output (Lisp_Object proc, int channel)
 	return nbytes;
       coding->mode |= CODING_MODE_LAST_BLOCK;
     }
+  
+  /* Ignore carryover, it's been added by a previous iteration already.  */
+  p->nbytes_read += nbytes;
 
   /* Now set NBYTES how many bytes we must decode.  */
   nbytes += carryover;
diff --git a/src/process.h b/src/process.h
index ab468b18c5..6464a8cc61 100644
--- a/src/process.h
+++ b/src/process.h
@@ -129,6 +129,8 @@ struct Lisp_Process
     pid_t pid;
     /* Descriptor by which we read from this process.  */
     int infd;
+    /* Byte-count modulo (UINTMAX_MAX + 1) for process output read from `infd'.  */
+    uintmax_t nbytes_read;
     /* Descriptor by which we write to this process.  */
     int outfd;
     /* Descriptors that were created for this process and that need
-- 
2.16.1


[-- Attachment #3: 0002-Fix-wait_reading_process_output-wait_proc-hang.patch --]
[-- Type: text/x-patch; name="0002-Fix-wait_reading_process_output-wait_proc-hang.patch", Size: 3307 bytes --]

From b9c05bbfb4559b21deb0ea4e156430dedb60ce41 Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Tue, 6 Feb 2018 15:24:15 +0100
Subject: [PATCH 2/2] Fix wait_reading_process_output wait_proc hang

* src/process.c (wait_reading_process_output): If called recursively
through timers and/or process filters via accept-process-output, it is
possible that the output of wait_proc has already been read by one of
those recursive calls, leaving the original call hanging forever if no
further output arrives through that fd and no timeout has been set.
Fix that by using the process read accounting to keep track of how
many bytes have been read and use that as a condition to break out
of the infinite loop and return to the caller as well as to calculate
the proper return value (if a wait_proc is given that is).
---
 src/process.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/process.c b/src/process.c
index 17fdf592ec..0abbd5fa8e 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5006,6 +5006,7 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
   struct timespec got_output_end_time = invalid_timespec ();
   enum { MINIMUM = -1, TIMEOUT, INFINITY } wait;
   int got_some_output = -1;
+  uintmax_t prev_wait_proc_nbytes_read = wait_proc ? wait_proc->nbytes_read : 0;
 #if defined HAVE_GETADDRINFO_A || defined HAVE_GNUTLS
   bool retry_for_async;
 #endif
@@ -5460,6 +5461,8 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
       if (nfds == 0)
 	{
           /* Exit the main loop if we've passed the requested timeout,
+             or have read some bytes from our wait_proc (either directly
+             in this call or indirectly through timers / process filters),
              or aren't skipping processes and got some output and
              haven't lowered our timeout due to timers or SIGIO and
              have waited a long amount of time due to repeated
@@ -5467,7 +5470,9 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 	  struct timespec huge_timespec
 	    = make_timespec (TYPE_MAXIMUM (time_t), 2 * TIMESPEC_RESOLUTION);
 	  struct timespec cmp_time = huge_timespec;
-	  if (wait < TIMEOUT)
+	  if (wait < TIMEOUT
+              || (wait_proc
+                  && wait_proc->nbytes_read != prev_wait_proc_nbytes_read))
 	    break;
 	  if (wait == TIMEOUT)
 	    cmp_time = end_time;
@@ -5772,6 +5777,15 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
       maybe_quit ();
     }
 
+  /* Timers and/or process filters that we have run could have themselves called
+     `accept-process-output' (and by that indirectly this function), thus
+     possibly reading some (or all) output of wait_proc without us noticing it.
+     This could potentially lead to an endless wait (dealt with earlier in the
+     function) and/or a wrong return value (dealt with here).  */
+  if (wait_proc && wait_proc->nbytes_read != prev_wait_proc_nbytes_read)
+    got_some_output = min (INT_MAX, (wait_proc->nbytes_read
+                                     - prev_wait_proc_nbytes_read));
+
   return got_some_output;
 }
 \f
-- 
2.16.1


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-13 14:25                                                         ` Matthias Dahl
@ 2018-02-13 16:56                                                           ` Paul Eggert
  2018-02-16 16:01                                                           ` Eli Zaretskii
  1 sibling, 0 replies; 151+ messages in thread
From: Paul Eggert @ 2018-02-13 16:56 UTC (permalink / raw)
  To: Matthias Dahl, Eli Zaretskii; +Cc: emacs-devel

Looks good enough to me, thanks.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-13 14:25                                                         ` Matthias Dahl
  2018-02-13 16:56                                                           ` Paul Eggert
@ 2018-02-16 16:01                                                           ` Eli Zaretskii
  2018-02-16 16:09                                                             ` Lars Ingebrigtsen
  1 sibling, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2018-02-16 16:01 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: eggert, emacs-devel

> Cc: emacs-devel@gnu.org
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Tue, 13 Feb 2018 15:25:44 +0100
> 
> after an unexpectedly prolonged delay due to personal
> reasons, for which I apologize, attached is v2 of my patches.
> 
> Basically I have gone a somewhat different route. While working in
> some of the requested changes, I noticed that there were still some
> pathological cases that were not covered and fixing that would make
> things even more convoluted.
> 
> So, in this version, the return value is calculated (if necessary)
> strategically right before we return from the function call, thus it
> cannot be missed and we will always properly signal if data was read
> from a wait_proc (either directly or indirectly).
> 
> And instead of messing with got_some_output, we exit the loop when we
> got some data (directly or indirectly) for our wait_proc if there is
> no data to be read for this iteration. This leaves the whole function
> logic alone -- except for this key point.
> 
> I have addressed the remaining issues, if they still applied. And I
> have not been able to trigger a single hang with these patches.
> 
> I appreciate any comments and suggestions.

Thanks for your work and perseverance.  I've now pushed this to the
master branch.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-16 16:01                                                           ` Eli Zaretskii
@ 2018-02-16 16:09                                                             ` Lars Ingebrigtsen
  2018-02-16 16:54                                                               ` Lars Ingebrigtsen
  2018-02-22 11:45                                                               ` andres.ramirez
  0 siblings, 2 replies; 151+ messages in thread
From: Lars Ingebrigtsen @ 2018-02-16 16:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Matthias Dahl, emacs-devel, eggert

Eli Zaretskii <eliz@gnu.org> writes:

> Thanks for your work and perseverance.  I've now pushed this to the
> master branch.

I've tested this for three minutes in Gnus, and I'm not seeing the
network-related hangs that I've seen earlier, so this looks promising.

But three minutes is a bit short to say anything definite.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-16 16:09                                                             ` Lars Ingebrigtsen
@ 2018-02-16 16:54                                                               ` Lars Ingebrigtsen
  2018-02-22 11:45                                                               ` andres.ramirez
  1 sibling, 0 replies; 151+ messages in thread
From: Lars Ingebrigtsen @ 2018-02-16 16:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, Matthias Dahl, emacs-devel

Lars Ingebrigtsen <larsi@gnus.org> writes:

> But three minutes is a bit short to say anything definite.  :-)

And I just got another hang.

OK, I'll try to chase this hang down now by running Emacs under gdb from
now on...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-16 16:09                                                             ` Lars Ingebrigtsen
  2018-02-16 16:54                                                               ` Lars Ingebrigtsen
@ 2018-02-22 11:45                                                               ` andres.ramirez
  2018-02-26 14:39                                                                 ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: andres.ramirez @ 2018-02-22 11:45 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: eggert, Eli Zaretskii, Matthias Dahl, emacs-devel

Hi.
> I've tested this for three minutes in Gnus, and I'm not seeing the
> network-related hangs that I've seen earlier, so this looks promising.

I have tested this patch for four days.

NNTP news reading has improved. No network hangs so far.

AR



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-22 11:45                                                               ` andres.ramirez
@ 2018-02-26 14:39                                                                 ` Matthias Dahl
  2018-02-26 15:11                                                                   ` andrés ramírez
  0 siblings, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-02-26 14:39 UTC (permalink / raw)
  To: andres.ramirez, Lars Ingebrigtsen; +Cc: Eli Zaretskii, eggert, emacs-devel

Hello Andres,

Thanks for reporting back, it is very much appreciated and quite useful
feedback.

On 22/02/18 12:45, andres.ramirez wrote:

> NNTP news reading has improved. No network hangs so far.

Great to hear. I suspect there are quite a few mysterious Emacs hangs
that have been reported to package maintainers and such but never got
reproduced/diagnosed that can be directly attributed to this bug.

It would be nice to also have this in Emacs 26 since 27 is still very
far away for the majority of users.

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-26 14:39                                                                 ` Matthias Dahl
@ 2018-02-26 15:11                                                                   ` andrés ramírez
  2018-02-26 15:17                                                                     ` Lars Ingebrigtsen
  2018-02-27  9:11                                                                     ` Matthias Dahl
  0 siblings, 2 replies; 151+ messages in thread
From: andrés ramírez @ 2018-02-26 15:11 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: Lars Ingebrigtsen, eggert, Eli Zaretskii, emacs-devel

> Great to hear. I suspect there are quite a few mysterious Emacs hangs
> that have been reported to package maintainers and such but never got
> reproduced/diagnosed that can be directly attributed to this bug.

That's my thinking too.

> It would be nice to also have this in Emacs 26 since 27 is still very
> far away for the majority of users.
But I need to remind everyone of this case:
https://xkcd.com/1172/

Btw, Lars mentioned a hang. I also got a hang a few minutes ago while reading
NNTP news from a specific site (Hacker News NNTP; most of my hangs in the
last 8 years have been on emacs-devel NNTP and also on Hacker News NNTP).
I was reading with my mail client Wanderlust:
--8<---------------cut here---------------start------------->8---
        /last:1000/-gwene.com.ycombinator.news@news.gwene.org "hnews"
--8<---------------cut here---------------end--------------->8---

With the other one I have NOT gotten a hang yet, but I quite probably
will:
--8<---------------cut here---------------start------------->8---
        /last:2000/-gmane.emacs.devel@news.gmane.org	"emacs-devel"
--8<---------------cut here---------------end--------------->8---

I am still testing emacs-27 (master). 




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-26 15:11                                                                   ` andrés ramírez
@ 2018-02-26 15:17                                                                     ` Lars Ingebrigtsen
  2018-02-26 15:29                                                                       ` andrés ramírez
  2018-02-27  9:15                                                                       ` wait_reading_process_ouput hangs in certain cases (w/ patches) Matthias Dahl
  2018-02-27  9:11                                                                     ` Matthias Dahl
  1 sibling, 2 replies; 151+ messages in thread
From: Lars Ingebrigtsen @ 2018-02-26 15:17 UTC (permalink / raw)
  To: andrés ramírez
  Cc: eggert, Eli Zaretskii, Matthias Dahl, emacs-devel

andrés ramírez <rrandresf@gmail.com> writes:

> Btw, Lars mentioned a hang. I also got a hang a few minutes ago while
> reading NNTP news from a specific site (Hacker News NNTP; most of my
> hangs in the last 8 years have been on emacs-devel NNTP and also on
> Hacker News NNTP). I was reading with my mail client Wanderlust.

The fixes to wait_reading_process_output have definitely made a
difference.  I used to see hangs several times a day, but after the fix,
I'm only seeing the hangs once every few days, and it's impossible to
set up a repeatable test case for them.  So my guess is that there might
still be a problem in this area, but that the window for triggering it
is much smaller.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-26 15:17                                                                     ` Lars Ingebrigtsen
@ 2018-02-26 15:29                                                                       ` andrés ramírez
  2018-02-26 16:52                                                                         ` Daniel Colascione
  2018-02-27  9:15                                                                       ` wait_reading_process_ouput hangs in certain cases (w/ patches) Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: andrés ramírez @ 2018-02-26 15:29 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: eggert, Eli Zaretskii, Matthias Dahl, emacs-devel

> The fixes to wait_reading_process_output have definitely made a
> difference.  I used to see hangs several times a day, but after the fix,
> I'm only seeing the hangs once every few days, and it's impossible to
> set up a repeatable test case for them.  So my guess is that there might
> still be a problem in this area, but that the window for triggering it
> is much smaller.

Right. For the last few days I have been thinking about backporting this
change myself, even to emacs-23, which I still use on an embedded
platform (my phone).




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-26 15:29                                                                       ` andrés ramírez
@ 2018-02-26 16:52                                                                         ` Daniel Colascione
  2018-02-26 17:19                                                                           ` andrés ramírez
  0 siblings, 1 reply; 151+ messages in thread
From: Daniel Colascione @ 2018-02-26 16:52 UTC (permalink / raw)
  To: andrés ramírez, Lars Ingebrigtsen
  Cc: Matthias Dahl, Eli Zaretskii, eggert, emacs-devel

On 02/26/2018 07:29 AM, andrés ramírez wrote:
>> The fixes to wait_reading_process_output have definitely made a
>> difference.  I used to see hangs several times a day, but after the fix,
>> I'm only seeing the hangs once every few days, and it's impossible to
>> set up a repeatable test case for them.  So my guess is that there might
>> still be a problem in this area, but that the window for triggering it
>> is much smaller.
> 
> Right. For the last few days I have been thinking about backporting this
> change myself, even to emacs-23, which I still use on an embedded
> platform (my phone).

Why wouldn't the phone run a newer Emacs?



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-26 16:52                                                                         ` Daniel Colascione
@ 2018-02-26 17:19                                                                           ` andrés ramírez
  2018-02-26 17:24                                                                             ` Daniel Colascione
  0 siblings, 1 reply; 151+ messages in thread
From: andrés ramírez @ 2018-02-26 17:19 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Matthias Dahl, Lars Ingebrigtsen, eggert, Eli Zaretskii,
	emacs-devel

> Why wouldn't the phone run a newer Emacs?

info 4.3, which is what is installed there (on kernel 2.6.28), is not supported anymore.
The GTK is 2.24. There is some hope that maemo-leste will bring mainline to the
n900, which is from 2009. Sorry guys, droid does not have Emacs with X, but this
phone has it. If maemo-leste is successful I could migrate to
emacs-master once again. I have been running Emacs on a touch phone; see
my pic:
https://transfer.sh/hOfuv/Screenshot-20180226-121458.png

AR



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re:
  2018-02-26 17:19                                                                           ` andrés ramírez
@ 2018-02-26 17:24                                                                             ` Daniel Colascione
  2018-02-27  1:53                                                                               ` Re: andrés ramírez
  0 siblings, 1 reply; 151+ messages in thread
From: Daniel Colascione @ 2018-02-26 17:24 UTC (permalink / raw)
  To: andrés ramírez
  Cc: Matthias Dahl, Lars Ingebrigtsen, eggert, Eli Zaretskii,
	emacs-devel

On 12/31/1969 04:00 PM,  wrote:
>> Why wouldn't the phone run a newer Emacs?
> 
> info 4.3, which is what is installed there (on kernel 2.6.28), is not supported anymore.

Can't bootstrap a newer makeinfo?

> The GTK is 2.24. There is some hope that maemo-leste will bring mainline to the
> n900, which is from 2009. Sorry guys, droid does not have Emacs with X, but this
> phone has it. If maemo-leste is successful I could migrate to
> emacs-master once again. I have been running Emacs on a touch phone; see

I mean, if we can support Windows 95, we can support this ancient phone.

> my pic:
> https://transfer.sh/hOfuv/Screenshot-20180226-121458.png

Cool. I've wanted better mobile support for Emacs for ages. I've been 
disappointed with all the org-mode mobile client options, and I think 
there's no substitute for the real thing.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re:
  2018-02-26 17:24                                                                             ` Daniel Colascione
@ 2018-02-27  1:53                                                                               ` andrés ramírez
  0 siblings, 0 replies; 151+ messages in thread
From: andrés ramírez @ 2018-02-27  1:53 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Matthias Dahl, Lars Ingebrigtsen, eggert, Eli Zaretskii,
	emacs-devel

On Mon, 26 Feb 2018 11:24:16 -0600,
Daniel Colascione wrote:
> 
> On 12/31/1969 04:00 PM,  wrote:
> >> Why wouldn't the phone run a newer Emacs?
> > 
> > info 4.3, which is what is installed there (on kernel 2.6.28), is not supported anymore.
> 
> Can't bootstrap a newer makeinfo?
I could compile with --without-makeinfo too. I remember compiling the
previous release candidate of Emacs there and finding a bug in eww because
of these old libraries. But emacs-23 also has a small binary, which is
paramount on those devices (see the Nokia N800, which I have not turned
on for almost a year now).
> 
> Cool. I've wanted better mobile support for Emacs for ages. I've been
> disappointed with all the org-mode mobile client options, and I think
> there's no substitute for the real thing.

Yes, this phone is the real Linux phone. I can make phone calls and send
texts from bbdb, and store GPS points in a text file when needed (with
a key combination). Having all my org files with me is really nice. I
need to hildonize the Emacs source code. That means replacing this line
in ~/abs/emacs-27.0.50/src/gtkutil.c:

--8<---------------cut here---------------start------------->8---
wtop = gtk_window_new (GTK_WINDOW_TOPLEVEL);
--8<---------------cut here---------------end--------------->8---

With
--8<---------------cut here---------------start------------->8---
wtop = hildon_window_new();
hildon_gtk_window_set_portrait_flags (GTK_WINDOW (wtop), HILDON_PORTRAIT_MODE_SUPPORT);
--8<---------------cut here---------------end--------------->8---

Then Emacs will support screen rotation (portrait mode) on the
phone.





^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-26 15:11                                                                   ` andrés ramírez
  2018-02-26 15:17                                                                     ` Lars Ingebrigtsen
@ 2018-02-27  9:11                                                                     ` Matthias Dahl
  2018-02-27 11:54                                                                       ` andrés ramírez
  1 sibling, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-02-27  9:11 UTC (permalink / raw)
  To: andrés ramírez
  Cc: Lars Ingebrigtsen, eggert, Eli Zaretskii, emacs-devel

Hello Andrés...

On 26/02/18 16:11, andrés ramírez wrote:

> But I need to remind everyone of this case:
> https://xkcd.com/1172/

I know what you are trying to say but in this particular case I think
the risk is rather low of "breaking" someone's workflow. And if this
change does introduce some kind of valid breakage, it would be nice
to know about this sooner rather than later. I know 26 is already in
beta, but this is essentially a bug fix and the recent patch is also
smaller and more clearly laid-out which makes it easier to assess and
reduces its side-effects.

> Btw, Lars mentioned a hang. I also got a hang a few minutes ago while reading
> NNTP news from a specific site

Do you get those hangs with an Emacs version that had the bug fix? And
have you been able to pinpoint where exactly Emacs hangs? (backtrace)
Also, is this a "cancelable" hang (hitting ctrl+g) or a real hang and
you need to kill Emacs altogether?

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-26 15:17                                                                     ` Lars Ingebrigtsen
  2018-02-26 15:29                                                                       ` andrés ramírez
@ 2018-02-27  9:15                                                                       ` Matthias Dahl
  2018-02-27 12:01                                                                         ` Lars Ingebrigtsen
  1 sibling, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-02-27  9:15 UTC (permalink / raw)
  To: Lars Ingebrigtsen, andrés ramírez
  Cc: Eli Zaretskii, eggert, emacs-devel

Hello Lars...

On 26/02/18 16:17, Lars Ingebrigtsen wrote:

> The fixes to wait_reading_process_output have definitely made a
> difference.  I used to see hangs several times a day, but after the fix,
> I'm only seeing the hangs once every few days, and it's impossible to
> set up a repeatable test case for them.  So my guess is that there might
> still be a problem in this area, but that the window for triggering it
> is much smaller.

These are probably multiple bugs you are seeing... one of them being
fixed now hopefully and the other(s) are still affecting you -- which
don't have to be in the same place either.

Have you been able to pinpoint where Emacs hangs? (backtrace) And when
it hangs, can you still cancel (ctrl+g) or do you have to kill Emacs
altogether?

What are you doing generally when Emacs hangs?

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-27  9:11                                                                     ` Matthias Dahl
@ 2018-02-27 11:54                                                                       ` andrés ramírez
  2018-02-27 15:02                                                                         ` Matthias Dahl
  2018-02-27 15:14                                                                         ` Matthias Dahl
  0 siblings, 2 replies; 151+ messages in thread
From: andrés ramírez @ 2018-02-27 11:54 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: Lars Ingebrigtsen, eggert, Eli Zaretskii, emacs-devel

Hi Matthias

> I know what you are trying to say but in this particular case I think
> the risk is rather low of "breaking" someone's workflow. And if this
> change does introduce some kind of valid breakage, it would be nice
> to know about this sooner rather than later. I know 26 is already in
> beta, but this is essentially a bug fix and the recent patch is also
> smaller and more clearly laid-out which makes it easier to assess and
> reduces its side-effects.

I also think this patch is going to help in several cases, including
bugs that are difficult to fix or reproduce.

> > Btw, Lars mentioned a hang. I also got a hang a few minutes ago while reading
> > NNTP news from a specific site
> 
> Do you get those hangs with an Emacs version that had the bug fix?

Yes.

> have you been able to pinpoint where exactly Emacs hangs? (backtrace)
> Also, is this a "cancelable" hang (hitting ctrl+g) or a real hang and
> you need to kill Emacs altogether?
It is (and was) cancelable. On emacs 26 I used to use this workaround:
--8<---------------cut here---------------start------------->8---
(maphash
 (lambda (con &rest ignored)
   (when (and con (get-process (car con)))
     (delete-process (car con))))
 url-http-open-connections)
--8<---------------cut here---------------end--------------->8---

On master (with the patch included) the variable url-http-open-connections
is not present. With emacs-26, after applying the workaround mentioned
above, I could restart downloading news.

Also, on master I need to restart Wanderlust to continue downloading
news. If I do not restart Wanderlust and keep refreshing the news for
the site, Emacs hangs again and again, needing C-g each time.

Question: I am going to see how to attach gdb to this server session. Btw,
how were you able to debug it? I have not been able to do so in recent
years.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-27  9:15                                                                       ` wait_reading_process_ouput hangs in certain cases (w/ patches) Matthias Dahl
@ 2018-02-27 12:01                                                                         ` Lars Ingebrigtsen
  0 siblings, 0 replies; 151+ messages in thread
From: Lars Ingebrigtsen @ 2018-02-27 12:01 UTC (permalink / raw)
  To: Matthias Dahl
  Cc: Eli Zaretskii, andrés ramírez, emacs-devel, eggert

Matthias Dahl <ml_emacs-lists@binary-island.eu> writes:

> Have you been able to pinpoint where Emacs hangs? (backtrace) And when
> it hangs, can you still cancel (ctrl+g) or do you have to kill Emacs
> altogether?
>
> What are you doing generally when Emacs hangs?

See bug#24201.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-27 11:54                                                                       ` andrés ramírez
@ 2018-02-27 15:02                                                                         ` Matthias Dahl
  2018-02-27 15:13                                                                           ` Lars Ingebrigtsen
  2018-02-27 15:14                                                                         ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-02-27 15:02 UTC (permalink / raw)
  To: andrés ramírez
  Cc: Lars Ingebrigtsen, eggert, Eli Zaretskii, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 871 bytes --]

Hello Andrés and Lars...

Could you both try the attached patch on master? It is just a quick test
to test a theory. I spent most of the afternoon digging in the depths
of the Emacs sources (in and close around wait_reading_...) to try to
find other bugs that could be related to your hangs.

I have a suspicion that what you are seeing is related to the GnuTLS
handling.

Let's see what happens with this patch.

I should note: This is just a test and not a production-ready patch. It
should work fine but might have unforeseen other side-effects since we
exit wait_reading_... now as early as possible if data has been read
from wait_proc. This is just to test a theory. It might very well also
increase the rate of hangs you see... which would be interesting as
well.

Thanks a lot,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: wait_reading_process_output-wait_proc-exit-early.patch --]
[-- Type: text/x-patch; name="wait_reading_process_output-wait_proc-exit-early.patch", Size: 522 bytes --]

diff --git a/src/process.c b/src/process.c
index 6ba27a33f4..5763101cbf 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5458,6 +5458,9 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
       /*  If we woke up due to SIGWINCH, actually change size now.  */
       do_pending_window_change (0);
 
+      if (wait_proc->nbytes_read != prev_wait_proc_nbytes_read)
+        break;
+
       if (nfds == 0)
 	{
           /* Exit the main loop if we've passed the requested timeout,

^ permalink raw reply related	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-27 15:02                                                                         ` Matthias Dahl
@ 2018-02-27 15:13                                                                           ` Lars Ingebrigtsen
  2018-02-27 15:17                                                                             ` Matthias Dahl
  0 siblings, 1 reply; 151+ messages in thread
From: Lars Ingebrigtsen @ 2018-02-27 15:13 UTC (permalink / raw)
  To: Matthias Dahl
  Cc: Eli Zaretskii, andrés ramírez, emacs-devel, eggert

Matthias Dahl <ml_emacs-lists@binary-island.eu> writes:

> Let's see what happens with this patch.

I tried the patch, but Emacs segfaulted immediately.

Thread 1 "emacs" received signal SIGSEGV, Segmentation fault.
0x00000000005a9a4c in wait_reading_process_output (
    time_limit=time_limit@entry=0, nsecs=nsecs@entry=0, 
    read_kbd=read_kbd@entry=-1, do_display=true, 
    wait_for_cell=wait_for_cell@entry=XIL(0), wait_proc=wait_proc@entry=0x0, 
    just_wait_proc=0) at process.c:5461
5461          if (wait_proc->nbytes_read != prev_wait_proc_nbytes_read)
(gdb) bt
#0  0x00000000005a9a4c in wait_reading_process_output (
    time_limit=time_limit@entry=0, nsecs=nsecs@entry=0, 
    read_kbd=read_kbd@entry=-1, do_display=true, 
    wait_for_cell=wait_for_cell@entry=XIL(0), wait_proc=wait_proc@entry=0x0, 
    just_wait_proc=0) at process.c:5461
#1  0x00000000004fdaf4 in kbd_buffer_get_event (end_time=0x0, 
    used_mouse_menu=0x7fffffffe5db, kbp=<synthetic pointer>) at keyboard.c:3840
#2  read_event_from_main_queue (used_mouse_menu=<optimized out>, 
    local_getcjmp=0x7fffffffe260, end_time=0x0) at keyboard.c:2157
#3  read_decoded_event_from_main_queue (used_mouse_menu=<optimized out>, 
    prev_event=<optimized out>, local_getcjmp=<optimized out>, 
    end_time=<optimized out>) at keyboard.c:2220
#4  read_char (commandflag=commandflag@entry=1, map=map@entry=XIL(0x42d5fc3), 
    prev_event=<optimized out>, 


-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-27 11:54                                                                       ` andrés ramírez
  2018-02-27 15:02                                                                         ` Matthias Dahl
@ 2018-02-27 15:14                                                                         ` Matthias Dahl
  2018-02-27 15:17                                                                           ` Lars Ingebrigtsen
  2018-03-01 10:42                                                                           ` Lars Ingebrigtsen
  1 sibling, 2 replies; 151+ messages in thread
From: Matthias Dahl @ 2018-02-27 15:14 UTC (permalink / raw)
  To: andrés ramírez
  Cc: Lars Ingebrigtsen, eggert, Eli Zaretskii, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 243 bytes --]

Hello Andrés and Lars...

Argh, sorry, the last patch was obviously bogus. You will find the
fixed version attached. Sorry for the confusion.
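
The difference between the two versions is the NULL check: as the backtrace posted earlier shows, wait_reading_process_output is also entered with wait_proc == 0x0 (there from kbd_buffer_get_event), so the counter may only be dereferenced when a specific process was requested. A minimal sketch of that condition, assuming a stripped-down stand-in type (`toy_process` and `toy_should_exit_early` are hypothetical names, not part of process.c):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in carrying only the byte counter.  */
struct toy_process
{
  uintmax_t nbytes_read;
};

/* True when a specific process is being waited for and its byte
   counter has moved since PREV_NBYTES_READ was sampled.  A NULL
   WAIT_PROC (waiting for keyboard input, for example) never
   triggers the early exit and is never dereferenced.  */
static bool
toy_should_exit_early (const struct toy_process *wait_proc,
                       uintmax_t prev_nbytes_read)
{
  return wait_proc != NULL && wait_proc->nbytes_read != prev_nbytes_read;
}
```

The short-circuit `&&` is what keeps the second operand from being evaluated when no process was passed in.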

Thanks a lot,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: wait_reading_process_output-wait_proc-exit-early-v2.patch --]
[-- Type: text/x-patch; name="wait_reading_process_output-wait_proc-exit-early-v2.patch", Size: 535 bytes --]

diff --git a/src/process.c b/src/process.c
index 6ba27a33f4..58bccc3bda 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5458,6 +5458,9 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
       /*  If we woke up due to SIGWINCH, actually change size now.  */
       do_pending_window_change (0);
 
+      if (wait_proc && wait_proc->nbytes_read != prev_wait_proc_nbytes_read)
+        break;
+
       if (nfds == 0)
 	{
           /* Exit the main loop if we've passed the requested timeout,

^ permalink raw reply related	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-27 15:13                                                                           ` Lars Ingebrigtsen
@ 2018-02-27 15:17                                                                             ` Matthias Dahl
  2018-02-27 15:19                                                                               ` Lars Ingebrigtsen
  0 siblings, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-02-27 15:17 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: Eli Zaretskii, andrés ramírez, emacs-devel, eggert

Hello Lars...

Yeah, the segfault was to be expected since the patch was
bogus. Sorry you wasted your time on that. :-(

I actually got your mail after I sent the corrected patch. But, wow,
you were seriously fast testing the patch. ;-) I will be more careful in
the future.

Sorry again,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-27 15:14                                                                         ` Matthias Dahl
@ 2018-02-27 15:17                                                                           ` Lars Ingebrigtsen
  2018-03-01 10:42                                                                           ` Lars Ingebrigtsen
  1 sibling, 0 replies; 151+ messages in thread
From: Lars Ingebrigtsen @ 2018-02-27 15:17 UTC (permalink / raw)
  To: Matthias Dahl
  Cc: Eli Zaretskii, andrés ramírez, emacs-devel, eggert

Matthias Dahl <ml_emacs-lists@binary-island.eu> writes:

> Argh, sorry, the last patch was obviously bogus. You will find the
> fixed version attached. Sorry for the confusion.

OK, I'm now running with the new patch, and we'll see what happens...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-27 15:17                                                                             ` Matthias Dahl
@ 2018-02-27 15:19                                                                               ` Lars Ingebrigtsen
  0 siblings, 0 replies; 151+ messages in thread
From: Lars Ingebrigtsen @ 2018-02-27 15:19 UTC (permalink / raw)
  To: Matthias Dahl
  Cc: Eli Zaretskii, andrés ramírez, emacs-devel, eggert

Matthias Dahl <ml_emacs-lists@binary-island.eu> writes:

> I actually got your mail after I sent the corrected patch. But, wow,
> you were seriously fast testing the patch. ;-) I will be more careful in
> the future.

I've got a one key combo to apply patches from emails to Emacs.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-02-27 15:14                                                                         ` Matthias Dahl
  2018-02-27 15:17                                                                           ` Lars Ingebrigtsen
@ 2018-03-01 10:42                                                                           ` Lars Ingebrigtsen
  2018-03-01 14:36                                                                             ` Matthias Dahl
  2018-03-05 14:43                                                                             ` Matthias Dahl
  1 sibling, 2 replies; 151+ messages in thread
From: Lars Ingebrigtsen @ 2018-03-01 10:42 UTC (permalink / raw)
  To: Matthias Dahl
  Cc: Eli Zaretskii, andrés ramírez, emacs-devel, eggert

Matthias Dahl <ml_emacs-lists@binary-island.eu> writes:

> Argh, sorry, the last patch was obviously bogus. You will find the
> fixed version attached. Sorry for the confusion.

I've been running Emacs with your patch for almost two days now, and I
haven't seen a single one of these hangs, so I'm beginning to suspect
that you found the problem.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-01 10:42                                                                           ` Lars Ingebrigtsen
@ 2018-03-01 14:36                                                                             ` Matthias Dahl
  2018-03-01 15:10                                                                               ` andrés ramírez
  2018-03-05 14:43                                                                             ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-03-01 14:36 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: Eli Zaretskii, andrés ramírez, emacs-devel, eggert

Hello Lars and Andrés...

On 01/03/18 11:42, Lars Ingebrigtsen wrote:

> I've been running Emacs with your patch for almost two days now, and I
> haven't seen a single one of these hangs, so I'm beginning to suspect
> that you found the problem.  :-)

Ok, that's reason to be cautiously optimistic so far. Let's wait a few
more days and see if you really see no more hangs. If you report back
that the hangs are gone, I will have some explaining to do on what is
going on behind the scenes, so we can dig deeper and discuss how to fix
it.

In the meantime, Andrés sent me a mail off the list, telling me that he
is actually seeing an increase in hangs with this patch which is rather
surprising, to be honest.

@Andrés: Is that correct? If yes, what Emacs version are you using? What
are you doing when the hangs occur and can those be canceled (ctrl+g)?

Thanks to both of you for taking the time and your patience.

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-01 14:36                                                                             ` Matthias Dahl
@ 2018-03-01 15:10                                                                               ` andrés ramírez
  2018-03-01 16:30                                                                                 ` T.V Raman
  0 siblings, 1 reply; 151+ messages in thread
From: andrés ramírez @ 2018-03-01 15:10 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: Lars Ingebrigtsen, eggert, Eli Zaretskii, emacs-devel

Hi.
> Ok, that's reason to be cautiously optimistic so far. Let's wait a few
> more days and see if you really see no more hangs. If you report back
> that the hangs are gone, I will have some explaining to do on what is
> going on behind the scenes, so we can dig deeper and discuss how to fix
> it.
In my case, I saw more than one hang yesterday. I am going to keep a log.
> 
> In the meantime, Andrés sent me a mail off the list, telling me that he
> is actually seeing an increase in hangs with this patch which is rather
> surprising, to be honest.

I think Lars is testing with Gnus. I am testing with Wanderlust,
which uses url-retrieve-synchronously. The weird thing is that after
quitting and restarting Wanderlust I can continue downloading news.
I am also thinking of involving the Wanderlust mailing list; perhaps
we could get more test cases that way.
> 
> @Andrés: Is that correct? If yes, what Emacs version are you using? What
> are you doing when the hangs occur and can those be canceled (ctrl+g)?

Yes, that's correct.
--8<---------------cut here---------------start------------->8---
emacs-version
GNU Emacs 27.0.50 (build 3, arm-unknown-linux-gnueabihf, X toolkit, Xaw3d scroll bars)
 of 2018-02-28
--8<---------------cut here---------------end--------------->8---

I have Emacs running under gdb. How should I proceed once it hangs?

Yes, the hang can be canceled with C-g.




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-01 15:10                                                                               ` andrés ramírez
@ 2018-03-01 16:30                                                                                 ` T.V Raman
  2018-03-01 16:46                                                                                   ` andrés ramírez
  0 siblings, 1 reply; 151+ messages in thread
From: T.V Raman @ 2018-03-01 16:30 UTC (permalink / raw)
  To: andrés ramírez
  Cc: Lars Ingebrigtsen, Matthias Dahl, emacs-devel, Eli Zaretskii,
	eggert

My NNTP hangs have mostly disappeared. url-retrieve-synchronously has
had numerous bugs over the years and may be unrelated. One thing I have
found over time is that killing hanging web connections gets
url-retrieve back to life -- I even have this function in my setup:

(defun emacspeak-wizards-web-clean-up-processes ()
  "Delete stale Web connections."
  (interactive)
  (cl-declare (special url-http-open-connections))
  (let ((count 0))
    (cl-loop 
     for p being the hash-values of url-http-open-connections
     when p do
     (cl-incf count)
     (delete-process (car p)))
    (message "Deleted %d web  connections" count)))
-- 



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-01 16:30                                                                                 ` T.V Raman
@ 2018-03-01 16:46                                                                                   ` andrés ramírez
  2018-03-01 18:23                                                                                     ` T.V Raman
  2018-03-01 19:13                                                                                     ` Eli Zaretskii
  0 siblings, 2 replies; 151+ messages in thread
From: andrés ramírez @ 2018-03-01 16:46 UTC (permalink / raw)
  To: T.V Raman
  Cc: Lars Ingebrigtsen, Matthias Dahl, emacs-devel, Eli Zaretskii,
	eggert

Hi tv.

On emacs-27 (master) I get:
--8<---------------cut here---------------start------------->8---
(void-variable url-http-open-connections)
--8<---------------cut here---------------end--------------->8---

This happens even though the variable is defined in url-http.el.

AR



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-01 16:46                                                                                   ` andrés ramírez
@ 2018-03-01 18:23                                                                                     ` T.V Raman
  2018-03-01 19:13                                                                                     ` Eli Zaretskii
  1 sibling, 0 replies; 151+ messages in thread
From: T.V Raman @ 2018-03-01 18:23 UTC (permalink / raw)
  To: rrandresf; +Cc: eggert, ml_emacs-lists, emacs-devel, larsi, raman, eliz

Interesting -- I don't get that error under Emacs 27. I suspect I've
either defvar-ed it somewhere or explicitly loaded url-vars.

andrés ramírez writes:
 > Hi tv.
 > 
 > On emacs-27 (master)
 > I have got:
 > --8<---------------cut here---------------start------------->8---
 > (void-variable url-http-open-connections)
 > --8<---------------cut here---------------end--------------->8---
 > 
 > Even the var is defined on url-http.el
 > 
 > AR

-- 

--



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-01 16:46                                                                                   ` andrés ramírez
  2018-03-01 18:23                                                                                     ` T.V Raman
@ 2018-03-01 19:13                                                                                     ` Eli Zaretskii
  2018-03-02 20:21                                                                                       ` andrés ramírez
  1 sibling, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2018-03-01 19:13 UTC (permalink / raw)
  To: andrés ramírez
  Cc: eggert, larsi, ml_emacs-lists, emacs-devel, raman

> From: andrés ramírez <rrandresf@gmail.com>
> Cc: Matthias Dahl <ml_emacs-lists@binary-island.eu>,
> 	Lars Ingebrigtsen <larsi@gnus.org>,
> 	eggert@cs.ucla.edu,
> 	Eli Zaretskii <eliz@gnu.org>,
> 	emacs-devel@gnu.org
> Date: Thu, 01 Mar 2018 10:46:04 -0600
> 
> On emacs-27 (master)
> I have got:
> --8<---------------cut here---------------start------------->8---
> (void-variable url-http-open-connections)
> --8<---------------cut here---------------end--------------->8---
> 
> Even the var is defined on url-http.el

I cannot reproduce this with today's master.

Can you show the steps to reproduce?  I just loaded url-http in
"emacs -Q", and then used "C-h v" to show the value of that
variable.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-01 19:13                                                                                     ` Eli Zaretskii
@ 2018-03-02 20:21                                                                                       ` andrés ramírez
  2018-03-03  7:55                                                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 151+ messages in thread
From: andrés ramírez @ 2018-03-02 20:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, larsi, ml_emacs-lists, emacs-devel, raman

Hi Eli
> > --8<---------------cut here---------------start------------->8---
> > (void-variable url-http-open-connections)
> > --8<---------------cut here---------------end--------------->8---
> > 
> > Even the var is defined on url-http.el
> 
> I cannot reproduce this with today's master.
> 
> Can you show the steps to reproduce?  I just loaded url-http in
> "emacs -Q", and then used "C-h v" to show the value of that
> variable.

I did not get the hang yesterday, and without the hang I cannot
reproduce it reliably.

I have not gotten the hang today either, at least so far.

Btw, today I started using this script (named rnews.sh) for testing.
Perhaps you could try it and see whether you get the hang.

--8<---------------cut here---------------start------------->8---
#!/bin/sh 

# simple recipe for reading news with emacs+wanderlust

# create the alternate home directory, in my case ~/deleteme-emacs-home/27
# EMACS_CCED_HOME is the path where emacs was compiled
# EMACS_CCED_HOME=/usb/bin/emacs
EMACS_TEST_DIR=~/deleteme-emacs-home/27
EMACS_CCED_HOME=~/abs/emacs-27.0.50/src
# quote the path and stop if the directory cannot be entered
mkdir -p "$EMACS_TEST_DIR"
cd "$EMACS_TEST_DIR" || exit 1
# put .wl on the alternate home directory
echo "(setq wl-from \"test.emacs <test.emacs@south.pe>\") ;; just 4 avoiding some warnings" > .wl
# put .folders on the alternate home directory
cat << 'EOF' >> .folders
#
# Folder definition file
# This file is generated automatically by Wanderlust/2.15.9 (Almost Unreal).
#
# If you edit this file by hand, be sure that comment lines
# will be washed out by wl-fldmgr.
#

news{
        gwene {
        /last:1000/-gwene.com.ycombinator.news@news.gwene.org "hnews"
        }
        gmane {
        /last:2000/-gmane.emacs.devel@news.gmane.org "emacs-devel"
        }
}

# petname definition (access group, folder in access group)

# end of file.
EOF

# fire up emacs from the alternate home directory
# NOTE: some errors appear if wl is run immediately after installing the package (without restarting emacs)
echo 'installing wanderlust. you could modify the emacs path for pointing to the folder where emacs was compiled'
HOME=$EMACS_TEST_DIR $EMACS_CCED_HOME/emacs -Q --eval "(progn(require 'cl) (require 'package nil 'nonil4noerror)(add-to-list 'package-archives '(\"melpa\" . \"http://melpa.milkbox.net/packages/\") t)(package-initialize)(defvar required-packages '(wanderlust) \"a list of packages to ensure are installed at launch.\") (package-refresh-contents) (dolist (p required-packages) (when (not (package-installed-p p)) (package-install p)))(kill-emacs))"
echo 'running wanderlust, answer y to each of the three questions'
HOME=$EMACS_TEST_DIR $EMACS_CCED_HOME/emacs -Q --eval  "(progn(require 'cl) (require 'package nil 'nonil4noerror)(add-to-list 'package-archives '(\"melpa\" . \"http://melpa.milkbox.net/packages/\") t)(package-initialize)(wl)(message \"open the hnews or emacs-devel (the rss folders)\"))"

echo 'open the hnews or emacs-devel (the rss folders)'
echo 'do not forget to remove the deleteme-emacs-home directory'

--8<---------------cut here---------------end--------------->8---



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-02 20:21                                                                                       ` andrés ramírez
@ 2018-03-03  7:55                                                                                         ` Eli Zaretskii
  0 siblings, 0 replies; 151+ messages in thread
From: Eli Zaretskii @ 2018-03-03  7:55 UTC (permalink / raw)
  To: andrés ramírez
  Cc: eggert, larsi, ml_emacs-lists, emacs-devel, raman

> From: andrés ramírez <rrandresf@gmail.com>
> Cc: raman@google.com,
> 	ml_emacs-lists@binary-island.eu,
> 	larsi@gnus.org,
> 	eggert@cs.ucla.edu,
> 	emacs-devel@gnu.org
> Date: Fri, 02 Mar 2018 14:21:02 -0600
> 
> Hi Eli
> > > --8<---------------cut here---------------start------------->8---
> > > (void-variable url-http-open-connections)
> > > --8<---------------cut here---------------end--------------->8---
> > > 
> > > Even the var is defined on url-http.el
> > 
> > I cannot reproduce this with today's master.
> > 
> > Can you show the steps to reproduce?  I just loaded url-http in
> > "emacs -Q", and then used "C-h v" to show the value of that
> > variable.
> 
>  Yesterday I have not gotten the hang.
> Without the hang I can not reproduce it reliably.
> 
> Today. I have not gotten the hang also, at least until now.

Is this related to "void-variable url-http-open-connections"?  It
sounds like you are talking about a separate issue.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-01 10:42                                                                           ` Lars Ingebrigtsen
  2018-03-01 14:36                                                                             ` Matthias Dahl
@ 2018-03-05 14:43                                                                             ` Matthias Dahl
  2018-03-05 14:44                                                                               ` Lars Ingebrigtsen
  1 sibling, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-03-05 14:43 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: Eli Zaretskii, andrés ramírez, emacs-devel, eggert

Hello Lars,

hope you had a great weekend. Just wanted to ask how your testing was
going with the test patch and if you had any hangs yet?

Thanks,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-05 14:43                                                                             ` Matthias Dahl
@ 2018-03-05 14:44                                                                               ` Lars Ingebrigtsen
  2018-03-05 14:54                                                                                 ` Matthias Dahl
  2018-03-13  9:54                                                                                 ` Matthias Dahl
  0 siblings, 2 replies; 151+ messages in thread
From: Lars Ingebrigtsen @ 2018-03-05 14:44 UTC (permalink / raw)
  To: Matthias Dahl
  Cc: Eli Zaretskii, andrés ramírez, emacs-devel, eggert

Matthias Dahl <ml_emacs-lists@binary-island.eu> writes:

> Hello Lars,
>
> hope you had a great weekend. Just wanted to ask how your testing was
> going with the test patch and if you had any hangs yet?

Zero hangs still.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-05 14:44                                                                               ` Lars Ingebrigtsen
@ 2018-03-05 14:54                                                                                 ` Matthias Dahl
  2018-03-13  9:54                                                                                 ` Matthias Dahl
  1 sibling, 0 replies; 151+ messages in thread
From: Matthias Dahl @ 2018-03-05 14:54 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: Eli Zaretskii, andrés ramírez, emacs-devel, eggert

Hello Lars...

On 05/03/18 15:44, Lars Ingebrigtsen wrote:

> Zero hangs still.  :-)

Great news, glad to hear it. I will prepare something for the list
later this week, so that we can discuss and fix this properly.

Thanks, again, for testing this. Should you run into another hang
after all, please let me know. (Just as a side-note: The test patch
only "fixes" the case when there is a wait_proc given, otherwise a hang
is still very much possible. But since that was your use-case, it was
the easiest to test... a full fix will naturally cover all cases.)

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-05 14:44                                                                               ` Lars Ingebrigtsen
  2018-03-05 14:54                                                                                 ` Matthias Dahl
@ 2018-03-13  9:54                                                                                 ` Matthias Dahl
  2018-03-13 12:35                                                                                   ` Robert Pluim
  2018-03-13 16:12                                                                                   ` Eli Zaretskii
  1 sibling, 2 replies; 151+ messages in thread
From: Matthias Dahl @ 2018-03-13  9:54 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: Eli Zaretskii, andrés ramírez, emacs-devel, eggert

[-- Attachment #1: Type: text/plain, Size: 1575 bytes --]

Hello all...

Sorry for the delay, things took quite a bit longer than I expected. And
I really should scratch that "later this week" phrase entirely from my
active use. ;-)

@Andrés: Sorry for not replying to your off-list mail yet. I was busy
with this bug, given the limited time I have I was prioritizing.

So, attached you will find patches that fix bugs I have found with the
current GnuTLS code. I am pretty confident that it will fix the issue
that Lars is seeing. And I hope that the hangs seen by Andrés will be
gone with this as well.

Lars and Andrés, please test those patches either against Emacs master
without any other additional patches or against the current Emacs 26
branch with the wait_reading_process_output patches that have been
applied to master but nothing else.

Please report back if those patches fix your issues or, if not, how they
affect (if at all) the hangs. Thanks again for investing the time!

@Eli: Those patches are, imho, clearly Emacs 26 material as well. And
along those lines, and I don't mean to be pushy at all :), I would like
to bring up the wait_reading_process_output fixes as well. I still think
it would be a good idea to get those into Emacs 26. All the fixes up
until now have been for hangs that will usually be erratic and
unpredictable to the user... thus, those hangs will probably end up
as unresolved bug reports on some random package instead of posts on
this list... if at all.

Keeping my fingers crossed here. ;-)

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Fix-GnuTLS-error-handling.patch --]
[-- Type: text/x-patch; name="0001-Fix-GnuTLS-error-handling.patch", Size: 2146 bytes --]

From 841d0fbe37642bc14c4bcd515cfbc91a25ba179b Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Mon, 12 Mar 2018 15:33:45 +0100
Subject: [PATCH 1/2] Fix GnuTLS error handling

* src/gnutls.c (emacs_gnutls_read): All error handling should be done
in `emacs_gnutls_handle_error', move handling of
GNUTLS_E_UNEXPECTED_PACKET_LENGTH accordingly.
We always need to set `errno' in case of an error, since later error
handling (e.g. `wait_reading_process_output') depends on it and GnuTLS
does not set errno by itself. We'll otherwise have random errno values
which can cause erratic behavior and hangs.
(emacs_gnutls_handle_error): GNUTLS_E_UNEXPECTED_PACKET_LENGTH is only
returned for premature connection termination on GnuTLS < 3.0 and is
otherwise a real error and should not be gobbled up.
---
 src/gnutls.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/src/gnutls.c b/src/gnutls.c
index 903393fed1..e7d0d3d845 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -708,16 +708,18 @@ emacs_gnutls_read (struct Lisp_Process *proc, char *buf, ptrdiff_t nbyte)
   rtnval = gnutls_record_recv (state, buf, nbyte);
   if (rtnval >= 0)
     return rtnval;
-  else if (rtnval == GNUTLS_E_UNEXPECTED_PACKET_LENGTH)
-    /* The peer closed the connection. */
-    return 0;
   else if (emacs_gnutls_handle_error (state, rtnval))
-    /* non-fatal error */
-    return -1;
-  else {
-    /* a fatal error occurred */
-    return 0;
-  }
+    {
+      /* non-fatal error */
+      errno = EAGAIN;
+      return -1;
+    }
+  else
+    {
+      /* a fatal error occurred */
+      errno = EPROTO;
+      return 0;
+    }
 }
 
 static char const *
@@ -756,8 +758,10 @@ emacs_gnutls_handle_error (gnutls_session_t session, int err)
 	 connection.  */
 # ifdef HAVE_GNUTLS3
       if (err == GNUTLS_E_PREMATURE_TERMINATION)
-	level = 3;
+# else
+      if (err == GNUTLS_E_UNEXPECTED_PACKET_LENGTH)
 # endif
+	level = 3;
 
       GNUTLS_LOG2 (level, max_log_level, "fatal error:", str);
       ret = false;
-- 
2.16.2
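
[Editorial sketch, not part of the attached patch.] The commit message above argues that a read wrapper around a library which does not set errno must set it itself, because callers such as wait_reading_process_output inspect errno after a failed read. A minimal illustration of that idea follows. The error codes LIB_E_AGAIN and LIB_E_FATAL and both function names are hypothetical stand-ins, not GnuTLS or Emacs identifiers; only the return values and errno choices (EAGAIN with -1, EPROTO with 0) mirror the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical library error codes; real GnuTLS codes differ. */
enum { LIB_E_AGAIN = -28, LIB_E_FATAL = -9 };

/* Hypothetical classifier, standing in for emacs_gnutls_handle_error:
   returns true when the error is non-fatal and the read can be retried. */
static bool
lib_error_is_recoverable (int err)
{
  return err == LIB_E_AGAIN;
}

/* Map a library return value onto the conventions used in the patch.
   The library never touches errno, so every error path sets it
   explicitly before returning.  */
static ptrdiff_t
translate_read_result (ptrdiff_t rtnval)
{
  if (rtnval >= 0)
    return rtnval;              /* bytes read, errno left untouched */

  if (lib_error_is_recoverable ((int) rtnval))
    {
      errno = EAGAIN;           /* non-fatal: caller may retry */
      return -1;
    }

  errno = EPROTO;               /* fatal: report as end of stream */
  return 0;
}
```

Without the two assignments, a caller that checks errno after a -1 return would see whatever value an unrelated earlier call left behind, which matches the "random errno values" the commit message describes.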


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-Always-check-GnuTLS-sessions-for-available-data.patch --]
[-- Type: text/x-patch; name="0002-Always-check-GnuTLS-sessions-for-available-data.patch", Size: 4596 bytes --]

From f5b979da67595da4d91f7b7a4eb979deae6a7321 Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Mon, 12 Mar 2018 16:07:55 +0100
Subject: [PATCH 2/2] Always check GnuTLS sessions for available data

* src/process.c (wait_reading_process_output): GnuTLS buffers data
internally and as such there is no guarantee that a select() call on
the underlying kernel socket will report available data if all data
has already been buffered. Prior to GnuTLS < 2.12, lowat mode was the
default which left bytes back in the kernel socket, so a select() call
would work. With GnuTLS >= 2.12 (the now required version for Emacs),
that default changed to non-lowat mode (and we don't set it otherwise)
and was subsequently completely removed with GnuTLS >= 3.0.
So, to properly handle GnuTLS sessions, we need to iterate through all
channels, check for available data manually and set the concerning fds
accordingly. Otherwise we might stall/delay unnecessarily or worse.
This also applies to the !just_wait_proc && wait_proc case, which was
previously handled improperly (only wait_proc was checked) and could
cause problems if sessions did have any dependency on one another
through e.g. higher up program logic and waited for one another.
---
 src/process.c | 77 +++++++++++++++++++----------------------------------------
 1 file changed, 24 insertions(+), 53 deletions(-)

diff --git a/src/process.c b/src/process.c
index 9b9b9f3550..f8fed56d5b 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5392,60 +5392,31 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 #endif	/* !HAVE_GLIB */
 
 #ifdef HAVE_GNUTLS
-          /* GnuTLS buffers data internally.  In lowat mode it leaves
-             some data in the TCP buffers so that select works, but
-             with custom pull/push functions we need to check if some
-             data is available in the buffers manually.  */
-          if (nfds == 0)
+          /* GnuTLS buffers data internally. select() will only report
+             available data for the underlying kernel sockets API, not
+             what has been buffered internally. As such, we need to loop
+             through all channels and check for available data manually.  */
+          if (nfds >= 0)
 	    {
-	      fd_set tls_available;
-	      int set = 0;
-
-	      FD_ZERO (&tls_available);
-	      if (! wait_proc)
-		{
-		  /* We're not waiting on a specific process, so loop
-		     through all the channels and check for data.
-		     This is a workaround needed for some versions of
-		     the gnutls library -- 2.12.14 has been confirmed
-		     to need it.  See
-		     http://comments.gmane.org/gmane.emacs.devel/145074 */
-		  for (channel = 0; channel < FD_SETSIZE; ++channel)
-		    if (! NILP (chan_process[channel]))
-		      {
-			struct Lisp_Process *p =
-			  XPROCESS (chan_process[channel]);
-			if (p && p->gnutls_p && p->gnutls_state
-			    && ((emacs_gnutls_record_check_pending
-				 (p->gnutls_state))
-				> 0))
-			  {
-			    nfds++;
-			    eassert (p->infd == channel);
-			    FD_SET (p->infd, &tls_available);
-			    set++;
-			  }
-		      }
-		}
-	      else
-		{
-		  /* Check this specific channel.  */
-		  if (wait_proc->gnutls_p /* Check for valid process.  */
-		      && wait_proc->gnutls_state
-		      /* Do we have pending data?  */
-		      && ((emacs_gnutls_record_check_pending
-			   (wait_proc->gnutls_state))
-			  > 0))
-		    {
-		      nfds = 1;
-		      eassert (0 <= wait_proc->infd);
-		      /* Set to Available.  */
-		      FD_SET (wait_proc->infd, &tls_available);
-		      set++;
-		    }
-		}
-	      if (set)
-		Available = tls_available;
+              for (channel = 0; channel < FD_SETSIZE; ++channel)
+                if (! NILP (chan_process[channel]))
+                  {
+                    struct Lisp_Process *p =
+                      XPROCESS (chan_process[channel]);
+
+                    if (just_wait_proc && p != wait_proc)
+                      continue;
+
+                    if (p && p->gnutls_p && p->gnutls_state
+                        && ((emacs_gnutls_record_check_pending
+                             (p->gnutls_state))
+                            > 0))
+                      {
+                        nfds++;
+                        eassert (p->infd == channel);
+                        FD_SET (p->infd, &Available);
+                      }
+                  }
 	    }
 #endif
 	}
-- 
2.16.2


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13  9:54                                                                                 ` Matthias Dahl
@ 2018-03-13 12:35                                                                                   ` Robert Pluim
  2018-03-13 13:40                                                                                     ` Robert Pluim
  2018-03-13 15:10                                                                                     ` Matthias Dahl
  2018-03-13 16:12                                                                                   ` Eli Zaretskii
  1 sibling, 2 replies; 151+ messages in thread
From: Robert Pluim @ 2018-03-13 12:35 UTC (permalink / raw)
  To: Matthias Dahl
  Cc: eggert, Lars Ingebrigtsen, andrés ramírez,
	Eli Zaretskii, emacs-devel

Matthias Dahl <ml_emacs-lists@binary-island.eu> writes:

> -	      if (set)
> -		Available = tls_available;
> +              for (channel = 0; channel < FD_SETSIZE; ++channel)
> +                if (! NILP (chan_process[channel]))
> +                  {
> +                    struct Lisp_Process *p =
> +                      XPROCESS (chan_process[channel]);
> +
> +                    if (just_wait_proc && p != wait_proc)
> +                      continue;
> +
> +                    if (p && p->gnutls_p && p->gnutls_state
> +                        && ((emacs_gnutls_record_check_pending
> +                             (p->gnutls_state))
> +                            > 0))
> +                      {
> +                        nfds++;
> +                        eassert (p->infd == channel);
> +                        FD_SET (p->infd, &Available);
> +                      }
> +                  }
>  	    }
>  #endif
>  	}

Hi Matthias, I apologize if this has already been mentioned, but did
you check that this doesn't undo the fix for Bug#21337? The issue
there as I recall was that FD's were set in Available that didn't
actually have data to read, hence the need to check the TLS FD's
separately using tls_available.
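
[Editorial sketch, not part of the message.] The pattern described here, as it appears in the code removed by the quoted patch, is to collect descriptors with buffered data in a freshly zeroed set and to replace Available only when something was found, so that stale bits in Available are never reported as readable. A minimal illustration under that reading follows; struct conn and collect_buffered are hypothetical names, and the pending field stands in for the result of emacs_gnutls_record_check_pending.

```c
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical per-connection record: descriptor plus the number of
   bytes the TLS layer has buffered for it.  */
struct conn
{
  int fd;
  size_t pending;
};

/* Build a fresh set containing only descriptors with buffered data.
   AVAILABLE is overwritten only if at least one such descriptor
   exists, so leftover bits never leak into the result.  Returns the
   number of descriptors marked.  */
static int
collect_buffered (const struct conn *conns, size_t n, fd_set *available)
{
  fd_set tls_available;
  int set = 0;

  FD_ZERO (&tls_available);
  for (size_t i = 0; i < n; i++)
    if (conns[i].pending > 0)
      {
        FD_SET (conns[i].fd, &tls_available);
        set++;
      }

  if (set)
    *available = tls_available;
  return set;
}
```

Setting bits directly in the existing set, as the quoted hunk does with FD_SET (p->infd, &Available), keeps whatever was already there; whether that is safe depends on how Available was left by the preceding select() call, which is the question raised above.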

Regards

Robert



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13 12:35                                                                                   ` Robert Pluim
@ 2018-03-13 13:40                                                                                     ` Robert Pluim
  2018-03-13 15:10                                                                                     ` Matthias Dahl
  1 sibling, 0 replies; 151+ messages in thread
From: Robert Pluim @ 2018-03-13 13:40 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1519 bytes --]

Robert Pluim <rpluim@gmail.com> writes:

> Matthias Dahl <ml_emacs-lists@binary-island.eu> writes:
>
>> -	      if (set)
>> -		Available = tls_available;
>> +              for (channel = 0; channel < FD_SETSIZE; ++channel)
>> +                if (! NILP (chan_process[channel]))
>> +                  {
>> +                    struct Lisp_Process *p =
>> +                      XPROCESS (chan_process[channel]);
>> +
>> +                    if (just_wait_proc && p != wait_proc)
>> +                      continue;
>> +
>> +                    if (p && p->gnutls_p && p->gnutls_state
>> +                        && ((emacs_gnutls_record_check_pending
>> +                             (p->gnutls_state))
>> +                            > 0))
>> +                      {
>> +                        nfds++;
>> +                        eassert (p->infd == channel);
>> +                        FD_SET (p->infd, &Available);
>> +                      }
>> +                  }
>>  	    }
>>  #endif
>>  	}
>
> Hi Matthias, I apologize if this has already been mentioned, but did
> you check that this doesn't undo the fix for Bug#21337? The issue
> there as I recall was that FD's were set in Available that didn't
> actually have data to read, hence the need to check the TLS FD's
> separately using tls_available.

Answering my own question: your second patch undoes the fix for
Bug#21337. I think it needs to look something like this (I've put some
of the tabs back in to make the diff clearer; adjust according to taste):


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-patch, Size: 3098 bytes --]

diff --git a/src/process.c b/src/process.c
index 11d914aab2..71b638726f 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5392,58 +5392,36 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 #endif	/* !HAVE_GLIB */
 
 #ifdef HAVE_GNUTLS
-          /* GnuTLS buffers data internally.  In lowat mode it leaves
-             some data in the TCP buffers so that select works, but
-             with custom pull/push functions we need to check if some
-             data is available in the buffers manually.  */
-          if (nfds == 0)
+          /* GnuTLS buffers data internally. select() will only report
+             available data for the underlying kernel sockets API, not
+             what has been buffered internally. As such, we need to loop
+             through all channels and check for available data manually.  */
+          if (nfds >= 0)
 	    {
 	      fd_set tls_available;
 	      int set = 0;
 
 	      FD_ZERO (&tls_available);
-	      if (! wait_proc)
-		{
-		  /* We're not waiting on a specific process, so loop
-		     through all the channels and check for data.
-		     This is a workaround needed for some versions of
-		     the gnutls library -- 2.12.14 has been confirmed
-		     to need it.  See
-		     http://comments.gmane.org/gmane.emacs.devel/145074 */
-		  for (channel = 0; channel < FD_SETSIZE; ++channel)
-		    if (! NILP (chan_process[channel]))
-		      {
-			struct Lisp_Process *p =
-			  XPROCESS (chan_process[channel]);
-			if (p && p->gnutls_p && p->gnutls_state
-			    && ((emacs_gnutls_record_check_pending
-				 (p->gnutls_state))
-				> 0))
-			  {
-			    nfds++;
-			    eassert (p->infd == channel);
-			    FD_SET (p->infd, &tls_available);
-			    set++;
-			  }
-		      }
-		}
-	      else
-		{
-		  /* Check this specific channel.  */
-		  if (wait_proc->gnutls_p /* Check for valid process.  */
-		      && wait_proc->gnutls_state
-		      /* Do we have pending data?  */
-		      && ((emacs_gnutls_record_check_pending
-			   (wait_proc->gnutls_state))
-			  > 0))
-		    {
-		      nfds = 1;
-		      eassert (0 <= wait_proc->infd);
-		      /* Set to Available.  */
-		      FD_SET (wait_proc->infd, &tls_available);
-		      set++;
-		    }
-		}
+              for (channel = 0; channel < FD_SETSIZE; ++channel)
+                if (! NILP (chan_process[channel]))
+                  {
+                    struct Lisp_Process *p =
+                      XPROCESS (chan_process[channel]);
+
+                    if (just_wait_proc && p != wait_proc)
+                      continue;
+
+                    if (p && p->gnutls_p && p->gnutls_state
+                        && ((emacs_gnutls_record_check_pending
+                             (p->gnutls_state))
+                            > 0))
+                      {
+                        nfds++;
+                        eassert (p->infd == channel);
+                        FD_SET (p->infd, &tls_available);
+                        set++;
+                      }
+                  }
 	      if (set)
 		Available = tls_available;
 	    }
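
The comment in the patch above says that select() only reports data
held by the kernel, not data a library has already buffered. A minimal
sketch of that effect, assuming a pipe in place of a TLS socket and a
hypothetical `struct buffered_reader' in place of a GnuTLS session (no
name below comes from Emacs or GnuTLS):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical stand-in for a TLS session: it reads ahead from FD
   into its own buffer, the way a TLS library buffers decrypted data.  */
struct buffered_reader
{
  int fd;
  char buf[64];
  size_t pending;		/* Bytes held in BUF, not yet consumed.  */
};

/* Pull whatever the kernel currently holds for R->fd into R->buf.
   Returns 0 on success, -1 if read() failed.  */
static int
reader_fill (struct buffered_reader *r)
{
  ssize_t n = read (r->fd, r->buf, sizeof r->buf);
  if (n < 0)
    return -1;
  r->pending = (size_t) n;
  return 0;
}

/* Ask the kernel, without blocking, whether FD is readable.  */
static int
kernel_has_data (int fd)
{
  assert (0 <= fd && fd < FD_SETSIZE);
  fd_set rfds;
  FD_ZERO (&rfds);
  FD_SET (fd, &rfds);
  struct timeval zero = { 0, 0 };
  return select (fd + 1, &rfds, NULL, NULL, &zero) > 0
	 && FD_ISSET (fd, &rfds);
}

/* The combined check: buffered bytes count as readable too.  */
static int
reader_has_data (const struct buffered_reader *r)
{
  return r->pending > 0 || kernel_has_data (r->fd);
}
```

Once reader_fill() has drained the pipe, kernel_has_data() reports
nothing although reader_has_data() still sees the buffered bytes. That
is the gap the per-channel emacs_gnutls_record_check_pending() loop is
there to close.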

^ permalink raw reply related	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13 12:35                                                                                   ` Robert Pluim
  2018-03-13 13:40                                                                                     ` Robert Pluim
@ 2018-03-13 15:10                                                                                     ` Matthias Dahl
  2018-03-13 15:30                                                                                       ` Robert Pluim
                                                                                                         ` (3 more replies)
  1 sibling, 4 replies; 151+ messages in thread
From: Matthias Dahl @ 2018-03-13 15:10 UTC (permalink / raw)
  To: emacs-devel
  Cc: Lars Ingebrigtsen, andrés ramírez, Eli Zaretskii,
	Robert Pluim, Paul Eggert

[-- Attachment #1: Type: text/plain, Size: 556 bytes --]

Hi Robert,

Thank you very much for pointing this out. You are absolutely correct,
my patches did undo your fix.

I did some digging and attached is a fix for xg_select() which did not
properly imitate pselect() behavior in all cases. That ultimately caused
the bug you were seeing.

@Lars and Andrés: When testing the previously sent patches, please also
include this one; otherwise you will (sooner or later) run into a
different kind of problem.

Thanks again,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0003-Make-xg_select-behave-more-like-pselect.patch --]
[-- Type: text/x-patch; name="0003-Make-xg_select-behave-more-like-pselect.patch", Size: 1322 bytes --]

From 2f44ef364bc320dfd73febc51f0c0da862db49b1 Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Tue, 13 Mar 2018 15:35:16 +0100
Subject: [PATCH 3/3] Make xg_select() behave more like pselect()

* src/xgselect.c (xg_select): If no file descriptors have data ready,
pselect() clears the passed in fd sets whereas xg_select() does not
which caused Bug#21337 for `wait_reading_process_output'.
Clear the passed in sets if no fds are ready but leave them untouched
if pselect() returns an error -- just like pselect() does itself.
---
 src/xgselect.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/xgselect.c b/src/xgselect.c
index fedd3127ef..f68982143e 100644
--- a/src/xgselect.c
+++ b/src/xgselect.c
@@ -143,6 +143,14 @@ xg_select (int fds_lim, fd_set *rfds, fd_set *wfds, fd_set *efds,
             ++retval;
         }
     }
+  else if (nfds == 0)
+    {
+      // pselect() clears the file descriptor sets if no fd is ready (but
+      // not if an error occurred), so should we to be compatible. (Bug#21337)
+      if (rfds) FD_ZERO (rfds);
+      if (wfds) FD_ZERO (wfds);
+      if (efds) FD_ZERO (efds);
+    }
 
   /* If Gtk+ is in use eventually gtk_main_iteration will be called,
      unless retval is zero.  */
-- 
2.16.2
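
The commit message above relies on pselect() clearing the passed-in fd
sets when no descriptor is ready. A minimal sketch that checks this on
a plain pipe, assuming a POSIX system; `probe_pipe' is a name made up
for this illustration and has nothing to do with xg_select() itself:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Poll the read end of a fresh pipe with pselect() and a zero timeout.
   If WRITE_FIRST is nonzero, put one byte into the pipe beforehand.
   Returns 1 if the read end is still marked in the set afterwards,
   0 if pselect() cleared it, and -1 if setup or pselect() failed.  */
static int
probe_pipe (int write_first)
{
  int fds[2];
  if (pipe (fds) != 0)
    return -1;
  assert (fds[0] < FD_SETSIZE);	/* Precondition for FD_SET.  */

  int result = -1;
  if (!write_first || write (fds[1], "x", 1) == 1)
    {
      fd_set rfds;
      FD_ZERO (&rfds);
      FD_SET (fds[0], &rfds);

      struct timespec zero = { 0, 0 };	/* Return at once.  */
      int n = pselect (fds[0] + 1, &rfds, NULL, NULL, &zero, NULL);
      if (n >= 0)
	result = FD_ISSET (fds[0], &rfds) ? 1 : 0;
    }

  close (fds[0]);
  close (fds[1]);
  return result;
}
```

probe_pipe (0) should return 0 (set cleared, nothing ready) and
probe_pipe (1) should return 1. A caller that inspects the sets after a
zero return therefore sees stale bits only if the select replacement
forgets to clear them.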


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13 15:10                                                                                     ` Matthias Dahl
@ 2018-03-13 15:30                                                                                       ` Robert Pluim
  2018-03-13 15:36                                                                                       ` Dmitry Gutov
                                                                                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 151+ messages in thread
From: Robert Pluim @ 2018-03-13 15:30 UTC (permalink / raw)
  To: Matthias Dahl
  Cc: Paul Eggert, Lars Ingebrigtsen, andrés ramírez,
	Eli Zaretskii, emacs-devel

Matthias Dahl <ml_emacs-lists@binary-island.eu> writes:

> Hi Robert,
>
> Thank you very much for pointing this out. You are absolutely correct,
> my patches did undo your fix.
>
> I did some digging and attached is a fix for xg_select() which did not
> properly imitate pselect() behavior in all cases. That ultimately caused
> the bug you were seeing.

Yes, that does look to be the root cause.

> @Lars and Andrés: When testing the previously sent patches, please also
> include this one; otherwise you will (sooner or later) run into a
> different kind of problem.

I've briefly tested with this on top of your previous second patch,
and I haven't seen any occurrences of Bug#21337 so far.

Regards

Robert



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13 15:10                                                                                     ` Matthias Dahl
  2018-03-13 15:30                                                                                       ` Robert Pluim
@ 2018-03-13 15:36                                                                                       ` Dmitry Gutov
  2018-03-13 15:46                                                                                         ` Robert Pluim
  2018-03-14 14:21                                                                                         ` Matthias Dahl
  2018-03-13 16:32                                                                                       ` Lars Ingebrigtsen
  2018-03-31 15:44                                                                                       ` Lars Ingebrigtsen
  3 siblings, 2 replies; 151+ messages in thread
From: Dmitry Gutov @ 2018-03-13 15:36 UTC (permalink / raw)
  To: Matthias Dahl, emacs-devel
  Cc: Paul Eggert, Lars Ingebrigtsen, andrés ramírez,
	Eli Zaretskii, Robert Pluim

Hi Matthias,

On 3/13/18 5:10 PM, Matthias Dahl wrote:
> Hi Robert,
> 
> Thank you very much for pointing this out. You are absolutely correct,
> my patches did undo your fix.

This kind of accident seems pretty bad. Is there any chance to write a 
regression test? Ideally both for Robert's and your fixes.

Fiddly, hard-to-reproduce misbehavior is the ideal target for regression 
tests, IMO.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13 15:36                                                                                       ` Dmitry Gutov
@ 2018-03-13 15:46                                                                                         ` Robert Pluim
  2018-03-13 15:56                                                                                           ` Dmitry Gutov
  2018-03-14 14:21                                                                                         ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Robert Pluim @ 2018-03-13 15:46 UTC (permalink / raw)
  To: Dmitry Gutov
  Cc: Paul Eggert, Matthias Dahl, emacs-devel, andrés ramírez,
	Lars Ingebrigtsen, Eli Zaretskii

Dmitry Gutov <dgutov@yandex.ru> writes:

> Hi Matthias,
>
> On 3/13/18 5:10 PM, Matthias Dahl wrote:
>> Hi Robert,
>>
>> Thank you very much for pointing this out. You are absolutely correct,
>> my patches did undo your fix.
>
> This kind of accident seems pretty bad.

It got caught before it got committed, so not that bad.

> Is there any chance to write a
> regression test? Ideally both for Robert's and your fixes.
>
> Fiddly, hard-to-reproduce misbehavior is the ideal target for
> regression tests, IMO.

My current test-case for 21337 is 'visit a bunch of files, make sure
global-auto-revert is turned on and auto-revert-use-notify is t, run
Gnus (or anything else that makes TLS connections), wait for errors to
be signalled by inotify'. I don't know how to write a regression test
for that.

Robert



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13 15:46                                                                                         ` Robert Pluim
@ 2018-03-13 15:56                                                                                           ` Dmitry Gutov
  2018-03-13 16:57                                                                                             ` Robert Pluim
  0 siblings, 1 reply; 151+ messages in thread
From: Dmitry Gutov @ 2018-03-13 15:56 UTC (permalink / raw)
  To: emacs-devel

On 3/13/18 5:46 PM, Robert Pluim wrote:

> It got caught before it got committed, so not that bad.

I mean the general possibility of having somebody who comes later 
reverse the fix.

For some code the odds of this happening are fairly small (via 
architectural decisions, or segregating a fix into a separate unit of 
code, or maybe just commenting profusely). Not so in this case, apparently.

>> Fiddly, hard-to-reproduce misbehavior is the ideal target for
>> regression tests, IMO.
> 
> My current test-case for 21337 is 'visit a bunch of files, make sure
> global-auto-revert is turned on and auto-revert-use-notify is t, run
> Gnus (or anything else that makes TLS connections), wait for errors to
> be signalled by inotify'. I don't know how to write a regression test
> for that.

Maybe there's a way to simulate the critical conditions more directly?



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13  9:54                                                                                 ` Matthias Dahl
  2018-03-13 12:35                                                                                   ` Robert Pluim
@ 2018-03-13 16:12                                                                                   ` Eli Zaretskii
  2018-03-14  4:16                                                                                     ` Leo Liu
  2018-03-14  9:56                                                                                     ` Matthias Dahl
  1 sibling, 2 replies; 151+ messages in thread
From: Eli Zaretskii @ 2018-03-13 16:12 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: larsi, rrandresf, emacs-devel, eggert

> Cc: andrés ramírez <rrandresf@gmail.com>,
>  eggert@cs.ucla.edu, Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Tue, 13 Mar 2018 10:54:00 +0100
> 
> @Eli: Those patches are, imho, clearly Emacs 26 material as well. And
> along those lines, and I don't mean to be pushy at all :), I would like
> to bring up the wait_reading_process_output fixes as well. I still think
> it would be a good idea to get those into Emacs 26.

I don't think it makes sense to put this on the emacs-26 branch,
sorry.  I even have evidence: the fact that one of the hunks
re-introduced a previously solved bug, as pointed out by Robert.

This code is very delicate and very central to Emacs operation.  It's
very easy to break Emacs subtly by seemingly-innocent changes in that
area.  We really need to test it much more than a few weeks before we
are confident enough it is bug-free.  The master branch is where such
changes should be tested.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13 15:10                                                                                     ` Matthias Dahl
  2018-03-13 15:30                                                                                       ` Robert Pluim
  2018-03-13 15:36                                                                                       ` Dmitry Gutov
@ 2018-03-13 16:32                                                                                       ` Lars Ingebrigtsen
  2018-03-14  9:32                                                                                         ` Matthias Dahl
  2018-03-31 15:44                                                                                       ` Lars Ingebrigtsen
  3 siblings, 1 reply; 151+ messages in thread
From: Lars Ingebrigtsen @ 2018-03-13 16:32 UTC (permalink / raw)
  To: Matthias Dahl
  Cc: Paul Eggert, Eli Zaretskii, andrés ramírez,
	Robert Pluim, emacs-devel

Matthias Dahl <ml_emacs-lists@binary-island.eu> writes:

> @Lars and Andrés: When testing the previously sent patches, please also
> include this one; otherwise you will (sooner or later) run into a
> different kind of problem.

Can you resend as a single patch set?  I'm not 100% sure what patches to
apply...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13 15:56                                                                                           ` Dmitry Gutov
@ 2018-03-13 16:57                                                                                             ` Robert Pluim
  2018-03-13 18:03                                                                                               ` Eli Zaretskii
  0 siblings, 1 reply; 151+ messages in thread
From: Robert Pluim @ 2018-03-13 16:57 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

> On 3/13/18 5:46 PM, Robert Pluim wrote:
>
>> It got caught before it got committed, so not that bad.
>
> I mean the general possibility of having somebody who comes later
> reverse the fix.
>
> For some code the odds of this happening are fairly small (via
> architectural decisions, or segregating a fix into a separate unit of
> code, or maybe just commenting profusely). Not so in this case,
> apparently.

As Eli has pointed out, it's a very fiddly area to work in.

>>> Fiddly, hard-to-reproduce misbehavior is the ideal target for
>>> regression tests, IMO.
>>
>> My current test-case for 21337 is 'visit a bunch of files, make sure
>> global-auto-revert is turned on and auto-revert-use-notify is t, run
>> Gnus (or anything else that makes TLS connections), wait for errors to
>> be signalled by inotify'. I don't know how to write a regression test
>> for that.
>
> Maybe there's a way to simulate the critical conditions more directly?

It has to involve TLS and reading from TLS connections. I don't think
we want emacs' regression tests to be making connections to servers on
the internet. I've never tried to use 'make-network-process' to create
a TLS server in emacs, but perhaps that could be used instead.

Robert



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13 16:57                                                                                             ` Robert Pluim
@ 2018-03-13 18:03                                                                                               ` Eli Zaretskii
  2018-03-13 20:12                                                                                                 ` Robert Pluim
  0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2018-03-13 18:03 UTC (permalink / raw)
  To: Robert Pluim; +Cc: emacs-devel

> From: Robert Pluim <rpluim@gmail.com>
> Date: Tue, 13 Mar 2018 17:57:17 +0100
> Cc: emacs-devel@gnu.org
> 
> > Maybe there's a way to simulate the critical conditions more directly?
> 
> It has to involve TLS and reading from TLS connections. I don't think
> we want emacs' regression tests to be making connections to servers on
> the internet. I've never tried to use 'make-network-process' to create
> a TLS server in emacs, but perhaps that could be used instead.

Please take a look at gnutls-tests.el, it already does something like
that.  We could use the same technique for this issue.  Of course, the
inotify aspect might mean that the test can only be run on GNU/Linux.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13 18:03                                                                                               ` Eli Zaretskii
@ 2018-03-13 20:12                                                                                                 ` Robert Pluim
  0 siblings, 0 replies; 151+ messages in thread
From: Robert Pluim @ 2018-03-13 20:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> It has to involve TLS and reading from TLS connections. I don't think
>> we want emacs' regression tests to be making connections to servers on
>> the internet. I've never tried to use 'make-network-process' to create
>> a TLS server in emacs, but perhaps that could be used instead.
>
> Please take a look at gnutls-tests.el, it already does something like
> that.  We could use the same technique for this issue.  Of course, the
> inotify aspect might mean that the test can only be run on GNU/Linux.

network-stream-tests.el uses the gnutls-serv executable to test TLS
connections. It would be preferable if we didn't have to rely on the
gnutls-cli package being installed (I don't install it normally).

Quick testing shows that 'make-network-process' doesn't mind being
asked to create a TLS server, but openssl s_client fails to connect to
it. I'll investigate further, as using the builtin TLS is the way
forward.

Robert



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13 16:12                                                                                   ` Eli Zaretskii
@ 2018-03-14  4:16                                                                                     ` Leo Liu
  2018-03-14  9:22                                                                                       ` Robert Pluim
                                                                                                         ` (3 more replies)
  2018-03-14  9:56                                                                                     ` Matthias Dahl
  1 sibling, 4 replies; 151+ messages in thread
From: Leo Liu @ 2018-03-14  4:16 UTC (permalink / raw)
  To: emacs-devel

On 2018-03-13 18:12 +0200, Eli Zaretskii wrote:
> I don't think it makes sense to put this on the emacs-26 branch,
> sorry.  I even have evidence: the fact that one of the hunks
> re-introduced a previously solved bug, as pointed out by Robert.
>
> This code is very delicate and very central to Emacs operation.  It's
> very easy to break Emacs subtly by seemingly-innocent changes in that
> area.  We really need to test it much more than a few weeks before we
> are confident enough it is bug-free.  The master branch is where such
> changes should be tested.

Is there a possibility of including it in the 26.x series? I mean,
waiting until 27.1 is just too long for such a critical fix.

In the last week I have experienced 2-3 hangs (Emacs 25.3) that
consumed ~0% CPU, with Emacs not responding to anything (C-g, SIGUSR2,
etc.).

Thanks.

Leo




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-14  4:16                                                                                     ` Leo Liu
@ 2018-03-14  9:22                                                                                       ` Robert Pluim
  2018-03-15  0:37                                                                                         ` Leo Liu
  2018-03-14 15:09                                                                                       ` andrés ramírez
                                                                                                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 151+ messages in thread
From: Robert Pluim @ 2018-03-14  9:22 UTC (permalink / raw)
  To: Leo Liu; +Cc: emacs-devel

Leo Liu <sdl.web@gmail.com> writes:

> On 2018-03-13 18:12 +0200, Eli Zaretskii wrote:
>> I don't think it makes sense to put this on the emacs-26 branch,
>> sorry.  I even have evidence: the fact that one of the hunks
>> re-introduced a previously solved bug, as pointed out by Robert.
>>
>> This code is very delicate and very central to Emacs operation.  It's
>> very easy to break Emacs subtly by seemingly-innocent changes in that
>> area.  We really need to test it much more than a few weeks before we
>> are confident enough it is bug-free.  The master branch is where such
>> changes should be tested.
>
> Is there a possibility of including it in the 26.x series? I mean,
> waiting until 27.1 is just too long for such a critical fix.
>

Is it that critical, though? I see no hangs with emacs-26 (nor master
for that matter, but I use that less).

> In the last week I have experienced 2-3 hangs (Emacs 25.3) that
> consumed ~0% CPU, with Emacs not responding to anything (C-g, SIGUSR2,
> etc.).

Do you get the same hangs in emacs-26?

Robert



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13 16:32                                                                                       ` Lars Ingebrigtsen
@ 2018-03-14  9:32                                                                                         ` Matthias Dahl
  2018-03-14 14:55                                                                                           ` Lars Ingebrigtsen
  0 siblings, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-03-14  9:32 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: Paul Eggert, Eli Zaretskii, andrés ramírez,
	Robert Pluim, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 934 bytes --]

Hello Lars...

On 13/03/18 17:32, Lars Ingebrigtsen wrote:

> Can you resend as a single patch set?  I'm not 100% sure what patches to
> apply...

Sure. Attached you will find a combined patch for the master as well as
the current emacs-26 branch. It contains all relevant fixes up until
now.

FYI, the relevant patches (also attached for completeness' sake):

Only Emacs-26 branch:

0001-Add-process-output-read-accounting.patch
0002-Fix-wait_reading_process_output-wait_proc-hang.patch

This got applied to the master branch as one:
4ba32858d61eee16f17b51aca01c15211a0912f8

Emacs master and Emacs-26 branch:

0001-Fix-GnuTLS-error-handling.patch
0002-Always-check-GnuTLS-sessions-for-available-data.patch
0003-Make-xg_select-behave-more-like-pselect.patch

Hope that helps. Thanks for testing. Looking forward to your results! :)

Have a nice day,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-process-output-read-accounting.patch --]
[-- Type: text/x-patch; name="0001-Add-process-output-read-accounting.patch", Size: 1597 bytes --]

From 94e0dc26f45e1a06881b016dd26446c43d339a4d Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Tue, 24 Oct 2017 15:55:53 +0200
Subject: [PATCH 1/2] Add process output read accounting

This tracks the bytes read from a process's output. The count is not
used anywhere yet but is required for follow-up work.
* src/process.c (read_process_output): Track bytes read from a process.
* src/process.h (struct Lisp_Process): Add nbytes_read to track bytes
read from a process.
---
 src/process.c | 3 +++
 src/process.h | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/src/process.c b/src/process.c
index 2ec10b12ec..17fdf592ec 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5889,6 +5889,9 @@ read_process_output (Lisp_Object proc, int channel)
 	return nbytes;
       coding->mode |= CODING_MODE_LAST_BLOCK;
     }
+  
+  /* Ignore carryover, it's been added by a previous iteration already.  */
+  p->nbytes_read += nbytes;
 
   /* Now set NBYTES how many bytes we must decode.  */
   nbytes += carryover;
diff --git a/src/process.h b/src/process.h
index ab468b18c5..6464a8cc61 100644
--- a/src/process.h
+++ b/src/process.h
@@ -129,6 +129,8 @@ struct Lisp_Process
     pid_t pid;
     /* Descriptor by which we read from this process.  */
     int infd;
+    /* Byte-count modulo (UINTMAX_MAX + 1) for process output read from `infd'.  */
+    uintmax_t nbytes_read;
     /* Descriptor by which we write to this process.  */
     int outfd;
     /* Descriptors that were created for this process and that need
-- 
2.16.1
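
A minimal sketch of how such a wrapping byte counter can tell a waiter
that output arrived in the meantime. It assumes the follow-up work
compares the counter against a snapshot taken before waiting;
`struct proc' and both functions are hypothetical stand-ins, not the
Emacs code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, pared-down stand-in for struct Lisp_Process.  */
struct proc
{
  uintmax_t nbytes_read;	/* Wraps modulo UINTMAX_MAX + 1.  */
};

/* Record that N more bytes were read from P.  */
static void
account_read (struct proc *p, uintmax_t n)
{
  assert (p);
  p->nbytes_read += n;		/* Unsigned overflow wraps; that is fine.  */
}

/* Nonzero if P's counter moved since SNAPSHOT was taken.  */
static int
output_read_since (const struct proc *p, uintmax_t snapshot)
{
  assert (p);
  return p->nbytes_read != snapshot;
}
```

Because only inequality is tested, wraparound does no harm: a counter
at UINTMAX_MAX - 1 that accounts for 5 more bytes ends up at 3, which
still differs from the snapshot. The check would only be fooled if an
exact multiple of UINTMAX_MAX + 1 bytes arrived between snapshot and
check.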


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0001-Fix-GnuTLS-error-handling.patch --]
[-- Type: text/x-patch; name="0001-Fix-GnuTLS-error-handling.patch", Size: 2146 bytes --]

From 841d0fbe37642bc14c4bcd515cfbc91a25ba179b Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Mon, 12 Mar 2018 15:33:45 +0100
Subject: [PATCH 1/3] Fix GnuTLS error handling

* src/gnutls.c (emacs_gnutls_read): All error handling should be done
in `emacs_gnutls_handle_error', move handling of
GNUTLS_E_UNEXPECTED_PACKET_LENGTH accordingly.
We always need to set `errno' in case of an error, since later error
handling (e.g. `wait_reading_process_output') depends on it and GnuTLS
does not set errno by itself. We'll otherwise have random errno values
which can cause erratic behavior and hangs.
(emacs_gnutls_handle_error): GNUTLS_E_UNEXPECTED_PACKET_LENGTH is only
returned for premature connection termination on GnuTLS < 3.0 and is
otherwise a real error and should not be gobbled up.
---
 src/gnutls.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/src/gnutls.c b/src/gnutls.c
index 903393fed1..e7d0d3d845 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -708,16 +708,18 @@ emacs_gnutls_read (struct Lisp_Process *proc, char *buf, ptrdiff_t nbyte)
   rtnval = gnutls_record_recv (state, buf, nbyte);
   if (rtnval >= 0)
     return rtnval;
-  else if (rtnval == GNUTLS_E_UNEXPECTED_PACKET_LENGTH)
-    /* The peer closed the connection. */
-    return 0;
   else if (emacs_gnutls_handle_error (state, rtnval))
-    /* non-fatal error */
-    return -1;
-  else {
-    /* a fatal error occurred */
-    return 0;
-  }
+    {
+      /* non-fatal error */
+      errno = EAGAIN;
+      return -1;
+    }
+  else
+    {
+      /* a fatal error occurred */
+      errno = EPROTO;
+      return 0;
+    }
 }
 
 static char const *
@@ -756,8 +758,10 @@ emacs_gnutls_handle_error (gnutls_session_t session, int err)
 	 connection.  */
 # ifdef HAVE_GNUTLS3
       if (err == GNUTLS_E_PREMATURE_TERMINATION)
-	level = 3;
+# else
+      if (err == GNUTLS_E_UNEXPECTED_PACKET_LENGTH)
 # endif
+	level = 3;
 
       GNUTLS_LOG2 (level, max_log_level, "fatal error:", str);
       ret = false;
-- 
2.16.2
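
To illustrate why the commit message insists on setting errno, here is
a minimal sketch of a read wrapper and a caller that dispatches on
errno. The error codes, `sketch_read_result' and `classify' are all
invented for this example; they only mirror the return convention shown
in the patch (-1 for a non-fatal error, 0 for a fatal one):

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical error codes standing in for a TLS library's.  */
enum { SKETCH_E_AGAIN = -28, SKETCH_E_FATAL = -9 };

/* Translate a library return value into the read()-style convention:
   byte count on success, -1 with errno = EAGAIN for a retryable
   condition, 0 with errno = EPROTO for a fatal one.  */
static long
sketch_read_result (long rtnval)
{
  if (rtnval >= 0)
    return rtnval;
  if (rtnval == SKETCH_E_AGAIN)
    {
      errno = EAGAIN;
      return -1;
    }
  errno = EPROTO;
  return 0;
}

/* What a caller in the style of a select loop might do with it.  */
enum action { DELIVER, RETRY, GIVE_UP };

static enum action
classify (long nread)
{
  assert (nread >= -1);
  if (nread > 0)
    return DELIVER;
  if (nread == -1 && errno == EAGAIN)
    return RETRY;
  return GIVE_UP;
}
```

If the wrapper left errno alone, classify() would act on whatever value
an unrelated earlier call happened to leave behind (EIO, say), and a
retryable condition could be treated as a failure or the other way
round.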


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: 0002-Always-check-GnuTLS-sessions-for-available-data.patch --]
[-- Type: text/x-patch; name="0002-Always-check-GnuTLS-sessions-for-available-data.patch", Size: 4596 bytes --]

From f5b979da67595da4d91f7b7a4eb979deae6a7321 Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Mon, 12 Mar 2018 16:07:55 +0100
Subject: [PATCH 2/3] Always check GnuTLS sessions for available data

* src/process.c (wait_reading_process_output): GnuTLS buffers data
internally and as such there is no guarantee that a select() call on
the underlying kernel socket will report available data if all data
has already been buffered. Prior to GnuTLS 2.12, lowat mode was the
default which left bytes back in the kernel socket, so a select() call
would work. With GnuTLS >= 2.12 (the now required version for Emacs),
that default changed to non-lowat mode (and we don't set it otherwise)
and was subsequently completely removed with GnuTLS >= 3.0.
So, to properly handle GnuTLS sessions, we need to iterate through all
channels, check for available data manually and set the concerning fds
accordingly. Otherwise we might stall/delay unnecessarily or worse.
This also applies to the !just_wait_proc && wait_proc case, which was
previously handled improperly (only wait_proc was checked) and could
cause problems if sessions did have any dependency on one another
through e.g. higher up program logic and waited for one another.
---
 src/process.c | 77 +++++++++++++++++++----------------------------------------
 1 file changed, 24 insertions(+), 53 deletions(-)

diff --git a/src/process.c b/src/process.c
index 9b9b9f3550..f8fed56d5b 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5392,60 +5392,31 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 #endif	/* !HAVE_GLIB */
 
 #ifdef HAVE_GNUTLS
-          /* GnuTLS buffers data internally.  In lowat mode it leaves
-             some data in the TCP buffers so that select works, but
-             with custom pull/push functions we need to check if some
-             data is available in the buffers manually.  */
-          if (nfds == 0)
+          /* GnuTLS buffers data internally. select() will only report
+             available data for the underlying kernel sockets API, not
+             what has been buffered internally. As such, we need to loop
+             through all channels and check for available data manually.  */
+          if (nfds >= 0)
 	    {
-	      fd_set tls_available;
-	      int set = 0;
-
-	      FD_ZERO (&tls_available);
-	      if (! wait_proc)
-		{
-		  /* We're not waiting on a specific process, so loop
-		     through all the channels and check for data.
-		     This is a workaround needed for some versions of
-		     the gnutls library -- 2.12.14 has been confirmed
-		     to need it.  See
-		     http://comments.gmane.org/gmane.emacs.devel/145074 */
-		  for (channel = 0; channel < FD_SETSIZE; ++channel)
-		    if (! NILP (chan_process[channel]))
-		      {
-			struct Lisp_Process *p =
-			  XPROCESS (chan_process[channel]);
-			if (p && p->gnutls_p && p->gnutls_state
-			    && ((emacs_gnutls_record_check_pending
-				 (p->gnutls_state))
-				> 0))
-			  {
-			    nfds++;
-			    eassert (p->infd == channel);
-			    FD_SET (p->infd, &tls_available);
-			    set++;
-			  }
-		      }
-		}
-	      else
-		{
-		  /* Check this specific channel.  */
-		  if (wait_proc->gnutls_p /* Check for valid process.  */
-		      && wait_proc->gnutls_state
-		      /* Do we have pending data?  */
-		      && ((emacs_gnutls_record_check_pending
-			   (wait_proc->gnutls_state))
-			  > 0))
-		    {
-		      nfds = 1;
-		      eassert (0 <= wait_proc->infd);
-		      /* Set to Available.  */
-		      FD_SET (wait_proc->infd, &tls_available);
-		      set++;
-		    }
-		}
-	      if (set)
-		Available = tls_available;
+              for (channel = 0; channel < FD_SETSIZE; ++channel)
+                if (! NILP (chan_process[channel]))
+                  {
+                    struct Lisp_Process *p =
+                      XPROCESS (chan_process[channel]);
+
+                    if (just_wait_proc && p != wait_proc)
+                      continue;
+
+                    if (p && p->gnutls_p && p->gnutls_state
+                        && ((emacs_gnutls_record_check_pending
+                             (p->gnutls_state))
+                            > 0))
+                      {
+                        nfds++;
+                        eassert (p->infd == channel);
+                        FD_SET (p->infd, &Available);
+                      }
+                  }
 	    }
 #endif
 	}
-- 
2.16.2
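
The buffering problem described in the commit message above can be reproduced without GnuTLS. The following is a minimal sketch, assuming a toy user-space reader that drains a pipe into its own buffer, much as a TLS library pulls a whole record. The names `toy_session`, `toy_pull`, `toy_check_pending`, `kernel_reports_readable` and `toy_demo` are hypothetical and are not GnuTLS or Emacs APIs. Once the bytes have moved into user space, select() no longer reports the descriptor as readable, even though data is still pending for the application.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Toy stand-in for a TLS session that buffers data in user space.  */
struct toy_session
{
  int fd;
  char buf[64];
  size_t pending;
};

/* Ask the kernel, without blocking, whether FD is readable.  */
static int
kernel_reports_readable (int fd)
{
  fd_set rfds;
  struct timeval no_wait = { 0, 0 };

  assert (fd >= 0 && fd < FD_SETSIZE);
  FD_ZERO (&rfds);
  FD_SET (fd, &rfds);
  return select (fd + 1, &rfds, NULL, NULL, &no_wait) > 0;
}

/* Move whatever the kernel currently holds into the session buffer.  */
static void
toy_pull (struct toy_session *s)
{
  ssize_t n = read (s->fd, s->buf, sizeof s->buf);
  s->pending = n > 0 ? (size_t) n : 0;
}

/* Analogue of asking the library how many bytes it has buffered.  */
static size_t
toy_check_pending (const struct toy_session *s)
{
  return s->pending;
}

/* Write 5 bytes into a pipe, let the toy session buffer them, and
   report what the kernel and the session each believe afterwards.
   Returns 0 on success, -1 if the pipe could not be set up.  */
static int
toy_demo (int *readable_before, int *readable_after, size_t *pending)
{
  int fds[2];
  struct toy_session s = { .fd = -1, .pending = 0 };

  if (pipe (fds) != 0)
    return -1;
  if (write (fds[1], "hello", 5) != 5)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }
  s.fd = fds[0];

  *readable_before = kernel_reports_readable (s.fd);
  toy_pull (&s);
  *readable_after = kernel_reports_readable (s.fd);
  *pending = toy_check_pending (&s);

  close (fds[0]);
  close (fds[1]);
  return 0;
}
```

After `toy_pull`, a loop that relies on select() alone would go back to sleep with unread data sitting in the session buffer. This is the situation the patch addresses by consulting the per-session pending count for every channel.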


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #5: 0002-Fix-wait_reading_process_output-wait_proc-hang.patch --]
[-- Type: text/x-patch; name="0002-Fix-wait_reading_process_output-wait_proc-hang.patch", Size: 3307 bytes --]

From b9c05bbfb4559b21deb0ea4e156430dedb60ce41 Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Tue, 6 Feb 2018 15:24:15 +0100
Subject: [PATCH 2/2] Fix wait_reading_process_output wait_proc hang

* src/process.c (wait_reading_process_output): If called recursively
through timers and/or process filters via accept-process-output, it is
possible that the output of wait_proc has already been read by one of
those recursive calls, leaving the original call hanging forever if no
further output arrives through that fd and no timeout has been set.
Fix that by using the process read accounting to keep track of how
many bytes have been read and use that as a condition to break out
of the infinite loop and return to the caller as well as to calculate
the proper return value (if a wait_proc is given that is).
---
 src/process.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/process.c b/src/process.c
index 17fdf592ec..0abbd5fa8e 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5006,6 +5006,7 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
   struct timespec got_output_end_time = invalid_timespec ();
   enum { MINIMUM = -1, TIMEOUT, INFINITY } wait;
   int got_some_output = -1;
+  uintmax_t prev_wait_proc_nbytes_read = wait_proc ? wait_proc->nbytes_read : 0;
 #if defined HAVE_GETADDRINFO_A || defined HAVE_GNUTLS
   bool retry_for_async;
 #endif
@@ -5460,6 +5461,8 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
       if (nfds == 0)
 	{
           /* Exit the main loop if we've passed the requested timeout,
+             or have read some bytes from our wait_proc (either directly
+             in this call or indirectly through timers / process filters),
              or aren't skipping processes and got some output and
              haven't lowered our timeout due to timers or SIGIO and
              have waited a long amount of time due to repeated
@@ -5467,7 +5470,9 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 	  struct timespec huge_timespec
 	    = make_timespec (TYPE_MAXIMUM (time_t), 2 * TIMESPEC_RESOLUTION);
 	  struct timespec cmp_time = huge_timespec;
-	  if (wait < TIMEOUT)
+	  if (wait < TIMEOUT
+              || (wait_proc
+                  && wait_proc->nbytes_read != prev_wait_proc_nbytes_read))
 	    break;
 	  if (wait == TIMEOUT)
 	    cmp_time = end_time;
@@ -5772,6 +5777,15 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
       maybe_quit ();
     }
 
+  /* Timers and/or process filters that we have run could have themselves called
+     `accept-process-output' (and by that indirectly this function), thus
+     possibly reading some (or all) output of wait_proc without us noticing it.
+     This could potentially lead to an endless wait (dealt with earlier in the
+     function) and/or a wrong return value (dealt with here).  */
+  if (wait_proc && wait_proc->nbytes_read != prev_wait_proc_nbytes_read)
+    got_some_output = min (INT_MAX, (wait_proc->nbytes_read
+                                     - prev_wait_proc_nbytes_read));
+
   return got_some_output;
 }
 \f
-- 
2.16.1
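
The read-accounting idea used by the patch above can be looked at in isolation. This is only a sketch under stated assumptions: `toy_process`, `toy_account_read`, `toy_bytes_since` and `toy_nested_read_scenario` are hypothetical names, and the struct models nothing but the `nbytes_read` counter. It shows why comparing a snapshot with the current counter detects reads done by a nested call, and why the unsigned subtraction still yields the right delta when the counter wraps around.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the relevant part of struct Lisp_Process.  */
struct toy_process
{
  uintmax_t nbytes_read;   /* byte count modulo (UINTMAX_MAX + 1) */
};

/* Called wherever output is read, no matter how deeply nested.  */
static void
toy_account_read (struct toy_process *p, uintmax_t nbytes)
{
  assert (p != NULL);
  p->nbytes_read += nbytes;   /* unsigned arithmetic wraps around */
}

/* Bytes read since SNAPSHOT was taken, clamped to INT_MAX.  The
   subtraction is done modulo (UINTMAX_MAX + 1), so it stays correct
   even if the counter wrapped in between.  */
static int
toy_bytes_since (const struct toy_process *p, uintmax_t snapshot)
{
  uintmax_t delta = p->nbytes_read - snapshot;
  return delta > INT_MAX ? INT_MAX : (int) delta;
}

/* Hypothetical scenario: the outer wait takes a snapshot, a nested
   call reads NESTED_BYTES, and the outer wait then checks whether
   anything was read behind its back.  */
static int
toy_nested_read_scenario (uintmax_t start, uintmax_t nested_bytes)
{
  struct toy_process p = { start };
  uintmax_t snapshot = p.nbytes_read;      /* outer call starts waiting */

  toy_account_read (&p, nested_bytes);     /* nested call reads the data */
  return toy_bytes_since (&p, snapshot);   /* outer call notices progress */
}
```

A nonzero result corresponds to the condition `wait_proc->nbytes_read != prev_wait_proc_nbytes_read` in the patch, which is what lets the outer call leave its loop instead of waiting for data that has already been consumed.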


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #6: 0003-Make-xg_select-behave-more-like-pselect.patch --]
[-- Type: text/x-patch; name="0003-Make-xg_select-behave-more-like-pselect.patch", Size: 1322 bytes --]

From 2f44ef364bc320dfd73febc51f0c0da862db49b1 Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Tue, 13 Mar 2018 15:35:16 +0100
Subject: [PATCH 3/3] Make xg_select() behave more like pselect()

* src/xgselect.c (xg_select): If no file descriptors have data ready,
pselect() clears the passed in fd sets whereas xg_select() does not
which caused Bug#21337 for `wait_reading_process_output'.
Clear the passed in sets if no fds are ready but leave them untouched
if pselect() returns an error -- just like pselect() does itself.
---
 src/xgselect.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/xgselect.c b/src/xgselect.c
index fedd3127ef..f68982143e 100644
--- a/src/xgselect.c
+++ b/src/xgselect.c
@@ -143,6 +143,14 @@ xg_select (int fds_lim, fd_set *rfds, fd_set *wfds, fd_set *efds,
             ++retval;
         }
     }
+  else if (nfds == 0)
+    {
+      // pselect() clears the file descriptor sets if no fd is ready (but
+      // not if an error occurred), so should we to be compatible. (Bug#21337)
+      if (rfds) FD_ZERO (rfds);
+      if (wfds) FD_ZERO (wfds);
+      if (efds) FD_ZERO (efds);
+    }
 
   /* If Gtk+ is in use eventually gtk_main_iteration will be called,
      unless retval is zero.  */
-- 
2.16.2
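
The pselect() behavior that the commit message above relies on can be checked with a small standalone program. The following is a minimal sketch using an idle pipe. The helper names `fd_marked_after_wait` and `probe_pipe` are hypothetical, and the 10 ms timeout is an arbitrary choice. When the call times out with nothing ready, the descriptor that was put into the read set is no longer marked on return. When data is available, it stays marked.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Wait up to 10 ms for FD to become readable.  Returns 1 if FD is
   still marked in the read set afterwards, 0 if it is not, and -1 if
   pselect() itself failed.  */
static int
fd_marked_after_wait (int fd)
{
  fd_set rfds;
  struct timespec timeout = { 0, 10 * 1000 * 1000 };

  assert (fd >= 0 && fd < FD_SETSIZE);
  FD_ZERO (&rfds);
  FD_SET (fd, &rfds);
  if (pselect (fd + 1, &rfds, NULL, NULL, &timeout, NULL) < 0)
    return -1;
  return FD_ISSET (fd, &rfds) ? 1 : 0;
}

/* Probe the read end of a fresh pipe.  If DATA_READY is nonzero, one
   byte is written first so that the read end is readable.  */
static int
probe_pipe (int data_ready)
{
  int fds[2];
  int marked;

  if (pipe (fds) != 0)
    return -1;
  if (data_ready && write (fds[1], "x", 1) != 1)
    marked = -1;
  else
    marked = fd_marked_after_wait (fds[0]);
  close (fds[0]);
  close (fds[1]);
  return marked;
}
```

A caller that inspects the sets after a zero return therefore sees them empty with pselect(). A wrapper that leaves the sets untouched in that case hands such a caller stale "ready" bits, which is the mismatch the patch removes.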


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #7: emacs-emacs26-combined.patch --]
[-- Type: text/x-patch; name="emacs-emacs26-combined.patch", Size: 8083 bytes --]

diff --git a/src/gnutls.c b/src/gnutls.c
index 903393fed1..e7d0d3d845 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -708,16 +708,18 @@ emacs_gnutls_read (struct Lisp_Process *proc, char *buf, ptrdiff_t nbyte)
   rtnval = gnutls_record_recv (state, buf, nbyte);
   if (rtnval >= 0)
     return rtnval;
-  else if (rtnval == GNUTLS_E_UNEXPECTED_PACKET_LENGTH)
-    /* The peer closed the connection. */
-    return 0;
   else if (emacs_gnutls_handle_error (state, rtnval))
-    /* non-fatal error */
-    return -1;
-  else {
-    /* a fatal error occurred */
-    return 0;
-  }
+    {
+      /* non-fatal error */
+      errno = EAGAIN;
+      return -1;
+    }
+  else
+    {
+      /* a fatal error occurred */
+      errno = EPROTO;
+      return 0;
+    }
 }
 
 static char const *
@@ -756,8 +758,10 @@ emacs_gnutls_handle_error (gnutls_session_t session, int err)
 	 connection.  */
 # ifdef HAVE_GNUTLS3
       if (err == GNUTLS_E_PREMATURE_TERMINATION)
-	level = 3;
+# else
+      if (err == GNUTLS_E_UNEXPECTED_PACKET_LENGTH)
 # endif
+	level = 3;
 
       GNUTLS_LOG2 (level, max_log_level, "fatal error:", str);
       ret = false;
diff --git a/src/process.c b/src/process.c
index b201e9b6ac..c1116c1deb 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5005,6 +5005,7 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
   struct timespec got_output_end_time = invalid_timespec ();
   enum { MINIMUM = -1, TIMEOUT, INFINITY } wait;
   int got_some_output = -1;
+  uintmax_t prev_wait_proc_nbytes_read = wait_proc ? wait_proc->nbytes_read : 0;
 #if defined HAVE_GETADDRINFO_A || defined HAVE_GNUTLS
   bool retry_for_async;
 #endif
@@ -5390,60 +5391,31 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 #endif	/* !HAVE_GLIB */
 
 #ifdef HAVE_GNUTLS
-          /* GnuTLS buffers data internally.  In lowat mode it leaves
-             some data in the TCP buffers so that select works, but
-             with custom pull/push functions we need to check if some
-             data is available in the buffers manually.  */
-          if (nfds == 0)
+          /* GnuTLS buffers data internally. select() will only report
+             available data for the underlying kernel sockets API, not
+             what has been buffered internally. As such, we need to loop
+             through all channels and check for available data manually.  */
+          if (nfds >= 0)
 	    {
-	      fd_set tls_available;
-	      int set = 0;
-
-	      FD_ZERO (&tls_available);
-	      if (! wait_proc)
-		{
-		  /* We're not waiting on a specific process, so loop
-		     through all the channels and check for data.
-		     This is a workaround needed for some versions of
-		     the gnutls library -- 2.12.14 has been confirmed
-		     to need it.  See
-		     http://comments.gmane.org/gmane.emacs.devel/145074 */
-		  for (channel = 0; channel < FD_SETSIZE; ++channel)
-		    if (! NILP (chan_process[channel]))
-		      {
-			struct Lisp_Process *p =
-			  XPROCESS (chan_process[channel]);
-			if (p && p->gnutls_p && p->gnutls_state
-			    && ((emacs_gnutls_record_check_pending
-				 (p->gnutls_state))
-				> 0))
-			  {
-			    nfds++;
-			    eassert (p->infd == channel);
-			    FD_SET (p->infd, &tls_available);
-			    set++;
-			  }
-		      }
-		}
-	      else
-		{
-		  /* Check this specific channel.  */
-		  if (wait_proc->gnutls_p /* Check for valid process.  */
-		      && wait_proc->gnutls_state
-		      /* Do we have pending data?  */
-		      && ((emacs_gnutls_record_check_pending
-			   (wait_proc->gnutls_state))
-			  > 0))
-		    {
-		      nfds = 1;
-		      eassert (0 <= wait_proc->infd);
-		      /* Set to Available.  */
-		      FD_SET (wait_proc->infd, &tls_available);
-		      set++;
-		    }
-		}
-	      if (set)
-		Available = tls_available;
+              for (channel = 0; channel < FD_SETSIZE; ++channel)
+                if (! NILP (chan_process[channel]))
+                  {
+                    struct Lisp_Process *p =
+                      XPROCESS (chan_process[channel]);
+
+                    if (just_wait_proc && p != wait_proc)
+                      continue;
+
+                    if (p && p->gnutls_p && p->gnutls_state
+                        && ((emacs_gnutls_record_check_pending
+                             (p->gnutls_state))
+                            > 0))
+                      {
+                        nfds++;
+                        eassert (p->infd == channel);
+                        FD_SET (p->infd, &Available);
+                      }
+                  }
 	    }
 #endif
 	}
@@ -5459,6 +5431,8 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
       if (nfds == 0)
 	{
           /* Exit the main loop if we've passed the requested timeout,
+             or have read some bytes from our wait_proc (either directly
+             in this call or indirectly through timers / process filters),
              or aren't skipping processes and got some output and
              haven't lowered our timeout due to timers or SIGIO and
              have waited a long amount of time due to repeated
@@ -5466,7 +5440,9 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 	  struct timespec huge_timespec
 	    = make_timespec (TYPE_MAXIMUM (time_t), 2 * TIMESPEC_RESOLUTION);
 	  struct timespec cmp_time = huge_timespec;
-	  if (wait < TIMEOUT)
+	  if (wait < TIMEOUT
+              || (wait_proc
+                  && wait_proc->nbytes_read != prev_wait_proc_nbytes_read))
 	    break;
 	  if (wait == TIMEOUT)
 	    cmp_time = end_time;
@@ -5781,6 +5757,15 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
       maybe_quit ();
     }
 
+  /* Timers and/or process filters that we have run could have themselves called
+     `accept-process-output' (and by that indirectly this function), thus
+     possibly reading some (or all) output of wait_proc without us noticing it.
+     This could potentially lead to an endless wait (dealt with earlier in the
+     function) and/or a wrong return value (dealt with here).  */
+  if (wait_proc && wait_proc->nbytes_read != prev_wait_proc_nbytes_read)
+    got_some_output = min (INT_MAX, (wait_proc->nbytes_read
+                                     - prev_wait_proc_nbytes_read));
+
   return got_some_output;
 }
 \f
@@ -5899,6 +5884,9 @@ read_process_output (Lisp_Object proc, int channel)
       coding->mode |= CODING_MODE_LAST_BLOCK;
     }
 
+  /* Ignore carryover, it's been added by a previous iteration already.  */
+  p->nbytes_read += nbytes;
+
   /* Now set NBYTES how many bytes we must decode.  */
   nbytes += carryover;
 
diff --git a/src/process.h b/src/process.h
index ab468b18c5..6464a8cc61 100644
--- a/src/process.h
+++ b/src/process.h
@@ -129,6 +129,8 @@ struct Lisp_Process
     pid_t pid;
     /* Descriptor by which we read from this process.  */
     int infd;
+    /* Byte-count modulo (UINTMAX_MAX + 1) for process output read from `infd'.  */
+    uintmax_t nbytes_read;
     /* Descriptor by which we write to this process.  */
     int outfd;
     /* Descriptors that were created for this process and that need
diff --git a/src/xgselect.c b/src/xgselect.c
index fedd3127ef..f68982143e 100644
--- a/src/xgselect.c
+++ b/src/xgselect.c
@@ -143,6 +143,14 @@ xg_select (int fds_lim, fd_set *rfds, fd_set *wfds, fd_set *efds,
             ++retval;
         }
     }
+  else if (nfds == 0)
+    {
+      // pselect() clears the file descriptor sets if no fd is ready (but
+      // not if an error occurred), so should we to be compatible. (Bug#21337)
+      if (rfds) FD_ZERO (rfds);
+      if (wfds) FD_ZERO (wfds);
+      if (efds) FD_ZERO (efds);
+    }
 
   /* If Gtk+ is in use eventually gtk_main_iteration will be called,
      unless retval is zero.  */

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #8: emacs-master-combined.patch --]
[-- Type: text/x-patch; name="emacs-master-combined.patch", Size: 5017 bytes --]

diff --git a/src/gnutls.c b/src/gnutls.c
index 903393fed1..e7d0d3d845 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -708,16 +708,18 @@ emacs_gnutls_read (struct Lisp_Process *proc, char *buf, ptrdiff_t nbyte)
   rtnval = gnutls_record_recv (state, buf, nbyte);
   if (rtnval >= 0)
     return rtnval;
-  else if (rtnval == GNUTLS_E_UNEXPECTED_PACKET_LENGTH)
-    /* The peer closed the connection. */
-    return 0;
   else if (emacs_gnutls_handle_error (state, rtnval))
-    /* non-fatal error */
-    return -1;
-  else {
-    /* a fatal error occurred */
-    return 0;
-  }
+    {
+      /* non-fatal error */
+      errno = EAGAIN;
+      return -1;
+    }
+  else
+    {
+      /* a fatal error occurred */
+      errno = EPROTO;
+      return 0;
+    }
 }
 
 static char const *
@@ -756,8 +758,10 @@ emacs_gnutls_handle_error (gnutls_session_t session, int err)
 	 connection.  */
 # ifdef HAVE_GNUTLS3
       if (err == GNUTLS_E_PREMATURE_TERMINATION)
-	level = 3;
+# else
+      if (err == GNUTLS_E_UNEXPECTED_PACKET_LENGTH)
 # endif
+	level = 3;
 
       GNUTLS_LOG2 (level, max_log_level, "fatal error:", str);
       ret = false;
diff --git a/src/process.c b/src/process.c
index 9b9b9f3550..f8fed56d5b 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5392,60 +5392,31 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 #endif	/* !HAVE_GLIB */
 
 #ifdef HAVE_GNUTLS
-          /* GnuTLS buffers data internally.  In lowat mode it leaves
-             some data in the TCP buffers so that select works, but
-             with custom pull/push functions we need to check if some
-             data is available in the buffers manually.  */
-          if (nfds == 0)
+          /* GnuTLS buffers data internally. select() will only report
+             available data for the underlying kernel sockets API, not
+             what has been buffered internally. As such, we need to loop
+             through all channels and check for available data manually.  */
+          if (nfds >= 0)
 	    {
-	      fd_set tls_available;
-	      int set = 0;
-
-	      FD_ZERO (&tls_available);
-	      if (! wait_proc)
-		{
-		  /* We're not waiting on a specific process, so loop
-		     through all the channels and check for data.
-		     This is a workaround needed for some versions of
-		     the gnutls library -- 2.12.14 has been confirmed
-		     to need it.  See
-		     http://comments.gmane.org/gmane.emacs.devel/145074 */
-		  for (channel = 0; channel < FD_SETSIZE; ++channel)
-		    if (! NILP (chan_process[channel]))
-		      {
-			struct Lisp_Process *p =
-			  XPROCESS (chan_process[channel]);
-			if (p && p->gnutls_p && p->gnutls_state
-			    && ((emacs_gnutls_record_check_pending
-				 (p->gnutls_state))
-				> 0))
-			  {
-			    nfds++;
-			    eassert (p->infd == channel);
-			    FD_SET (p->infd, &tls_available);
-			    set++;
-			  }
-		      }
-		}
-	      else
-		{
-		  /* Check this specific channel.  */
-		  if (wait_proc->gnutls_p /* Check for valid process.  */
-		      && wait_proc->gnutls_state
-		      /* Do we have pending data?  */
-		      && ((emacs_gnutls_record_check_pending
-			   (wait_proc->gnutls_state))
-			  > 0))
-		    {
-		      nfds = 1;
-		      eassert (0 <= wait_proc->infd);
-		      /* Set to Available.  */
-		      FD_SET (wait_proc->infd, &tls_available);
-		      set++;
-		    }
-		}
-	      if (set)
-		Available = tls_available;
+              for (channel = 0; channel < FD_SETSIZE; ++channel)
+                if (! NILP (chan_process[channel]))
+                  {
+                    struct Lisp_Process *p =
+                      XPROCESS (chan_process[channel]);
+
+                    if (just_wait_proc && p != wait_proc)
+                      continue;
+
+                    if (p && p->gnutls_p && p->gnutls_state
+                        && ((emacs_gnutls_record_check_pending
+                             (p->gnutls_state))
+                            > 0))
+                      {
+                        nfds++;
+                        eassert (p->infd == channel);
+                        FD_SET (p->infd, &Available);
+                      }
+                  }
 	    }
 #endif
 	}
diff --git a/src/xgselect.c b/src/xgselect.c
index fedd3127ef..f68982143e 100644
--- a/src/xgselect.c
+++ b/src/xgselect.c
@@ -143,6 +143,14 @@ xg_select (int fds_lim, fd_set *rfds, fd_set *wfds, fd_set *efds,
             ++retval;
         }
     }
+  else if (nfds == 0)
+    {
+      // pselect() clears the file descriptor sets if no fd is ready (but
+      // not if an error occurred), so should we to be compatible. (Bug#21337)
+      if (rfds) FD_ZERO (rfds);
+      if (wfds) FD_ZERO (wfds);
+      if (efds) FD_ZERO (efds);
+    }
 
   /* If Gtk+ is in use eventually gtk_main_iteration will be called,
      unless retval is zero.  */


* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13 16:12                                                                                   ` Eli Zaretskii
  2018-03-14  4:16                                                                                     ` Leo Liu
@ 2018-03-14  9:56                                                                                     ` Matthias Dahl
  2018-03-14 12:24                                                                                       ` Stefan Monnier
  2018-03-14 16:43                                                                                       ` Eli Zaretskii
  1 sibling, 2 replies; 151+ messages in thread
From: Matthias Dahl @ 2018-03-14  9:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, rrandresf, emacs-devel, eggert

Hello Eli...

On 13/03/18 17:12, Eli Zaretskii wrote:

> I don't think it makes sense to put this on the emacs-26 branch,
> sorry.  I even have evidence: the fact that one of the hunks
> re-introduced a previously solved bug, as pointed out by Robert.
> 
> This code is very delicate and very central to Emacs operation.  It's
> very easy to break Emacs subtly by seemingly-innocent changes in that
> area.  We really need to test it much more than a few weeks before we
> are confident enough it is bug-free.  The master branch is where such
> changes should be tested.

Normally I would fully agree. But we are not talking about new features
here but about fixes to bugs that cause sporadic, hard-to-pinpoint
erratic behavior and hangs, possibly all over the place. Whether you run
into them several times a day or never at all is down to pure chance and
the packages you have installed.

What I am trying to say: Is it really better to keep those bugs around
for another year or more until master becomes the next stable release
just based on the pure chance that we might (or might not) introduce
breakage? Or should we commit the fixes now (and fix current/real bugs
this way) while we're still in the beta cycle and deal with any fall-out
now (which might not even be needed after all)?

Imho, the latter is the right thing to do (famous last words :P). We can
fix the fall-out, if any should happen. That's what beta releases are for
and it is better to fix that now instead of in a year's time or so and
have users deal with those bugs until then.

Those bugs are, as you put it, at the heart of Emacs. They should be
fixed and dealt with.

Regarding the regression pointed out by Robert: I'm sorry that happened.
There were no comments or anything. :( Unfortunately, that fix was also
merely a workaround that fixed only half of the problem. Even though it
dealt with the read fds, the other fd sets were kept as-is which surely
could have caused some hiccups later down in wait_reading_... . The fix
posted yesterday fixes the actual root cause, so on the bright side, it
was a happy accident after all. ;-)

Thanks for your patience with my arguing on this subject. I want to
point out that I only have the users' best interests at heart and
whatever you decide is fine by me. No offense meant or whatever. :)

Have a nice day,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-14  9:56                                                                                     ` Matthias Dahl
@ 2018-03-14 12:24                                                                                       ` Stefan Monnier
  2018-03-14 14:34                                                                                         ` Matthias Dahl
  2018-03-14 16:43                                                                                       ` Eli Zaretskii
  1 sibling, 1 reply; 151+ messages in thread
From: Stefan Monnier @ 2018-03-14 12:24 UTC (permalink / raw)
  To: emacs-devel

> Normally I would fully agree. But we are not talking about new features
> here but about fixes to bugs that cause sporadic, hard-to-pinpoint
> erratic behavior and hangs, possibly all over the place. Whether you run
> into them several times a day or never at all is down to pure chance and
> the packages you have installed.

FWIW, I'm with Eli on this: I don't think this is a really new problem,
so it's not super-urgent to fix.

It can wait for 26.2.


        Stefan





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13 15:36                                                                                       ` Dmitry Gutov
  2018-03-13 15:46                                                                                         ` Robert Pluim
@ 2018-03-14 14:21                                                                                         ` Matthias Dahl
  1 sibling, 0 replies; 151+ messages in thread
From: Matthias Dahl @ 2018-03-14 14:21 UTC (permalink / raw)
  To: Dmitry Gutov, emacs-devel
  Cc: Paul Eggert, Lars Ingebrigtsen, andrés ramírez,
	Eli Zaretskii, Robert Pluim

Hello Dmitry...

On 13/03/18 16:36, Dmitry Gutov wrote:

> Is there any chance to write a regression test? Ideally both for Robert's
> and your fixes.

Much to my embarrassment, I have to admit that I did not know Emacs
actually had a test suite at all.

Thanks for pointing that out. I am a huge proponent of testing and
test-driven development in general, so I will check that out once I get
the time.

In this particular case, writing tests could turn out quite challenging
due to the "random" nature / timing / dependencies of the bugs.

Thanks again,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-14 12:24                                                                                       ` Stefan Monnier
@ 2018-03-14 14:34                                                                                         ` Matthias Dahl
  2018-03-14 22:52                                                                                           ` Stefan Monnier
  0 siblings, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-03-14 14:34 UTC (permalink / raw)
  To: emacs-devel

Hello Stefan...

On 14/03/18 13:24, Stefan Monnier wrote:

> FWIW, I'm with Eli on this: I don't think this is a really new problem,
> so it's not super-urgent to fix.

Please excuse my bluntness (no offense intended) but that argument does
not really hold up. Imagine we were working on a filesystem and found a
longstanding corruption bug that could randomly hit anyone and silently
corrupt data every once in a while (or never). It gets discovered and
fixes are ready, but they won't get committed because the bug has been
around for ages, so it can wait a bit longer until the fixes have been
through another lengthy major release cycle.

Ok, my examples are usually really bad. :) But you get my point. Those
are real bugs... granted, some of them have been around for a long time,
but people will still run into them and get weird behavior or hangs...
all over the place.

I still think fixing actual bugs is more important than holding those
fixes back for longer due to the off-chance of introducing new ones.

And again: Isn't that what beta releases are for? Finding bugs and fixing
them before the actual stable release? I'm not arguing for rushing those
fixes out in a 25.3.x release or whatever. But the 26 beta cycle seems
more than appropriate to me as it's a new major release... just my 2ct.

And I promise I will shut up now about this if nobody agrees with me. :)

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-14  9:32                                                                                         ` Matthias Dahl
@ 2018-03-14 14:55                                                                                           ` Lars Ingebrigtsen
  0 siblings, 0 replies; 151+ messages in thread
From: Lars Ingebrigtsen @ 2018-03-14 14:55 UTC (permalink / raw)
  To: Matthias Dahl
  Cc: Paul Eggert, Eli Zaretskii, andrés ramírez,
	Robert Pluim, emacs-devel

Matthias Dahl <ml_emacs-lists@binary-island.eu> writes:

> Sure. Attached you will find a combined patch for the master as well as
> the current emacs-26 branch. It contains all relevant fixes up until
> now.

Thanks; I'm now running Emacs with the patch, and we'll see whether
I get any hangs...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-14  4:16                                                                                     ` Leo Liu
  2018-03-14  9:22                                                                                       ` Robert Pluim
@ 2018-03-14 15:09                                                                                       ` andrés ramírez
  2018-03-14 16:45                                                                                       ` Eli Zaretskii
  2018-03-14 22:54                                                                                       ` Stefan Monnier
  3 siblings, 0 replies; 151+ messages in thread
From: andrés ramírez @ 2018-03-14 15:09 UTC (permalink / raw)
  To: Leo Liu; +Cc: emacs-devel

> In the last week I have experienced 2-3 hangs (emacs 25.3) that
> consumed ~0% CPU while emacs was not responding to anything (C-g,
> SIGUSR2 etc).

In the meantime you could try emacs-emacs26-combined.patch on the latest
release candidate. I am testing with it now.

AR




* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-14  9:56                                                                                     ` Matthias Dahl
  2018-03-14 12:24                                                                                       ` Stefan Monnier
@ 2018-03-14 16:43                                                                                       ` Eli Zaretskii
  2018-03-15 14:59                                                                                         ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2018-03-14 16:43 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: larsi, rrandresf, emacs-devel, eggert

> Cc: larsi@gnus.org, rrandresf@gmail.com, eggert@cs.ucla.edu,
>  emacs-devel@gnu.org
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Wed, 14 Mar 2018 10:56:33 +0100
> 
> Normally I would fully agree. But we are not talking about new features
> here but fixes to bugs that cause sporadic and hard to pinpoint erratic
> behavior and hangs possibly all over the place. It is pure chance and
> the packages you have installed, if you run into those several times a
> day or really never.

Fixing bugs runs a certain risk of introducing new bugs.  The risk
could be small or not so small, depending on the code where the fix is
made, the complexity of the fix, and our ability to anticipate the
consequences.  This is why, as pretest progresses, and the code base
becomes more and more stable, we progressively raise the bar and allow
only fixes that are less and less risky.

> What I am trying to say: Is it really better to keep those bugs around
> for another year or more until master becomes the next stable release
> just based on the pure chance that we might (or might not) introduce
> breakage? Or should we commit the fixes now (and fix current/real bugs
> this way) while we're still in the beta cycle and deal with any fall-out
> now (which might not even be needed after all).

Dealing with fallout in this case means delaying the release, because
it takes time for issues in this kind of code to surface.  OTOH, the
problems fixed by the proposed changes are (a) relatively rare, and
(b) have been with us for many years.

> Imho, the latter is the right thing to do (famous last words :P). We can
> fix the fall-out, if any should happen. That's what beta releases are for
> and it is better to fix that now instead of in a year's time or so and
> have users deal with those bugs until then.

The current beta is supposed to be the last one.  Installing these
changes means significant additional delays in releasing Emacs 26.1.
We have been pretesting Emacs 26 since September: how many more moons
are we prepared to wait in order to have one more bug fixed?  That's
the balance we should all be considering.

> Regarding the regression pointed out by Robert: I'm sorry that happened.
> There were no comments or anything. :(

I don't blame you.  Such regressions in tricky code such as this one
are almost inevitable.  The question is what do we learn from such
experiences regarding the probability of introducing other similar
bugs due to these changes, ones that we won't be so lucky to find as
quickly as this one.  (If you are familiar with the "error seeding"
technique for estimating the amount of unknown bugs, it does something
similar.)
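
For readers unfamiliar with it, the usual error-seeding estimate can be
sketched as follows.  This is only an illustration of the arithmetic,
not something used in Emacs development: if S bugs are planted
deliberately, s of them are rediscovered, and n genuine bugs turn up in
the same period, the total number of genuine bugs is estimated as
n * S / s.  The function name and the -1 convention are hypothetical.

```c
/* Error-seeding estimate (illustrative sketch, hypothetical names).
   seeded        number of bugs planted deliberately (S)
   seeded_found  how many planted bugs were rediscovered (s)
   real_found    genuine bugs found in the same period (n)
   Returns the estimated total of genuine bugs, n * S / s, or -1.0
   when no planted bug was found and no estimate is possible.  */
double
estimate_total_defects (int seeded, int seeded_found, int real_found)
{
  if (seeded_found <= 0)
    return -1.0;
  return (double) real_found * seeded / seeded_found;
}
```

With 20 planted bugs of which 10 were rediscovered, finding 7 genuine
bugs would suggest about 14 genuine bugs in total, i.e. roughly 7 still
unknown.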

> I only have the user's best interests at heart and what ever you
> decide is fine by me. No offense meant or whatever. :)

Same here, obviously.




* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-14  4:16                                                                                     ` Leo Liu
  2018-03-14  9:22                                                                                       ` Robert Pluim
  2018-03-14 15:09                                                                                       ` andrés ramírez
@ 2018-03-14 16:45                                                                                       ` Eli Zaretskii
  2018-03-15  1:03                                                                                         ` Leo Liu
  2018-03-14 22:54                                                                                       ` Stefan Monnier
  3 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2018-03-14 16:45 UTC (permalink / raw)
  To: Leo Liu; +Cc: emacs-devel

> From: Leo Liu <sdl.web@gmail.com>
> Date: Wed, 14 Mar 2018 12:16:14 +0800
> 
> Is there a possibility of including it in the 26.x series? I mean, waiting
> until 27.1 is just too long for such a critical fix.

It's possible.  I think whether it will happen depends on two main
factors:

  . whether there will be v26.2 and what will be its goals (only very
    safe bug-fixes or more than that)
  . our experience with these fixes on the master branch (how many
    further changes will be needed to fix any fallout)

Of course, if there will be no Emacs 26.2, then 27.1 will be the very
next release, which should shorten the time until these fixes are
available in official versions.

> In the last week I have experienced 2-3 hangs (emacs 25.3) that
> consumed ~0% CPU but emacs was not responding to anything (C-g, SIGUSR2
> etc).

Do the changes discussed here fix those hangs?




* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-14 14:34                                                                                         ` Matthias Dahl
@ 2018-03-14 22:52                                                                                           ` Stefan Monnier
  2018-03-15 15:17                                                                                             ` Matthias Dahl
  0 siblings, 1 reply; 151+ messages in thread
From: Stefan Monnier @ 2018-03-14 22:52 UTC (permalink / raw)
  To: emacs-devel

> Please excuse my bluntness (not meant as an offense or anything alike)
> but that argument does not really hold up. Imagine we were working on a
> filesystem, found a longstanding corruption bug that could randomly hit
> everyone and silently corrupt data every once in a while (or never). It
> gets discovered, fixes are ready but those won't get committed since
> the bug has been around for ages, so it can wait a bit longer until the
> fixes have been through another timely major release cycle.

AFAIK we're talking about hangs (and hangs you can interrupt with C-g
AFAICT), not about data corruption, so the comparison is unfair.

> I still think fixing actual bugs is more important than holding those
> fixes back for longer due to the off-chance of introducing new ones.

> And again: Isn't that what beta releases are for? Finding bugs and fixing
> them before the actual stable release?

They're for finding&fixing *regressions*, and I think this bug doesn't
fall into that category.

> I'm not arguing for rushing those fixes out in a 25.3.x release or
> whatever. But the 26 beta cycle seems more than appropriate to me as
> it's a new major release... just my 2ct.

My recommendation is based on the premise that 26.1 is due Real
Soon Now (with at most one more pretest before the release-candidate).

> And I promise I will shut up now about this if nobody agrees with me. :)

I don't fundamentally disagree: it's a balancing act.


        Stefan





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-14  4:16                                                                                     ` Leo Liu
                                                                                                         ` (2 preceding siblings ...)
  2018-03-14 16:45                                                                                       ` Eli Zaretskii
@ 2018-03-14 22:54                                                                                       ` Stefan Monnier
  2018-03-15  1:06                                                                                         ` Leo Liu
  3 siblings, 1 reply; 151+ messages in thread
From: Stefan Monnier @ 2018-03-14 22:54 UTC (permalink / raw)
  To: emacs-devel

> Is there a possibility of including it in the 26.x series?

I assume the patch will be backported to the emacs-26 branch right after
the 26.1 release.


        Stefan





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-14  9:22                                                                                       ` Robert Pluim
@ 2018-03-15  0:37                                                                                         ` Leo Liu
  0 siblings, 0 replies; 151+ messages in thread
From: Leo Liu @ 2018-03-15  0:37 UTC (permalink / raw)
  To: emacs-devel

On 2018-03-14 10:22 +0100, Robert Pluim wrote:
> Is it that critical, though? I see no hangs with emacs-26 (nor master
> for that matter, but I use that less).

That feeling when you have to force-kill a session in the middle of
work. Emacs supports GNU/Linux better so it is likely to happen less
often there.

> Do you get the same hangs in emacs-26?

My daily servant is 25.3. I am on Mac and using the solid MacPort Emacs.
I have a version of emacs-26 on centos but only for brief use/test. I am
looking forward to improved life in Emacs 26.

Leo





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-14 16:45                                                                                       ` Eli Zaretskii
@ 2018-03-15  1:03                                                                                         ` Leo Liu
  2018-03-15  7:55                                                                                           ` Robert Pluim
  0 siblings, 1 reply; 151+ messages in thread
From: Leo Liu @ 2018-03-15  1:03 UTC (permalink / raw)
  To: emacs-devel

On 2018-03-14 18:45 +0200, Eli Zaretskii wrote:
> Do the changes discussed here fix those hangs?

Good question. I suspect they might, because I am constantly using Emacs +
Erlang, which involves that code path. I think one of the reasons these
bugs weren't fixed earlier is that they are hard to reproduce, hard to
pinpoint and require a level of expertise most people don't have.

Leo





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-14 22:54                                                                                       ` Stefan Monnier
@ 2018-03-15  1:06                                                                                         ` Leo Liu
  0 siblings, 0 replies; 151+ messages in thread
From: Leo Liu @ 2018-03-15  1:06 UTC (permalink / raw)
  To: emacs-devel

On 2018-03-14 18:54 -0400, Stefan Monnier wrote:
> I assume the patch will be backported to the emacs-26 branch right after
> the 26.1 release.

Hopefully ;)

Leo





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-15  1:03                                                                                         ` Leo Liu
@ 2018-03-15  7:55                                                                                           ` Robert Pluim
  0 siblings, 0 replies; 151+ messages in thread
From: Robert Pluim @ 2018-03-15  7:55 UTC (permalink / raw)
  To: Leo Liu; +Cc: emacs-devel

Leo Liu <sdl.web@gmail.com> writes:

> On 2018-03-14 18:45 +0200, Eli Zaretskii wrote:
>> Do the changes discussed here fix those hangs?
>
> Good question. I suspect they might, because I am constantly using Emacs +
> Erlang, which involves that code path. I think one of the reasons these
> bugs weren't fixed earlier is that they are hard to reproduce, hard to
> pinpoint and require a level of expertise most people don't have.

More importantly: does emacs-26 have the same hangs? It's entirely
possible (though not that likely), that fixing your issue doesn't
require these changes at all.

Robert




* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-14 16:43                                                                                       ` Eli Zaretskii
@ 2018-03-15 14:59                                                                                         ` Matthias Dahl
  2018-06-26 13:36                                                                                           ` Matthias Dahl
  0 siblings, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-03-15 14:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, larsi, rrandresf, emacs-devel

Hello Eli...

On 14/03/18 17:43, Eli Zaretskii wrote:

> Fixing bugs runs a certain risk of introducing new bugs.  The risk
> could be small or not so small, depending on the code where the fix is
> made, the complexity of the fix, and our ability to anticipate the
> consequences.  This is why, as pretest progresses, and the code base
> becomes more and more stable, we progressively raise the bar and allow
> only fixes that are less and less risky.

I understand and mostly agree with your reasoning... with the exception
that I think known bugs should be fixed before a stable release is made
if they cross a specific severity level.

And I guess that is the point where we somewhat disagree: The severity
level of the bugs in question. :) If we were both on the same page about
that, I strongly believe we would also agree on the actions without any
doubt.

> OTOH, the
> problems fixed by the proposed changes are (a) relatively rare, and
> (b) have been with us for many years.

And I am honestly not so sure about (a). Just because we don't see many
reports on the list, does not mean those issues don't surface themselves
in day-to-day usage for the users in strange and unpredictable ways that
get blamed on packages or whatnot. It is hard to quantify this...

> The current beta is supposed to be the last one.  Installing these
> changes means significant additional delays in releasing Emacs 26.1.
> We have been pretesting Emacs 26 since September: how many more moons
> are we prepared to wait in order to have one more bug fixed?  That's
> the balance we should all be considering.

And I am glad I am in no position to have to make that balancing act.
But I fully get where you are coming from. And your more conservative
approach is probably for the better. And like I said, if we agreed on
the severity of the bugs discussed, we probably wouldn't even be having
this discussion at all since I think we mostly have the same opinion.

> The question is what do we learn from such
> experiences regarding the probability of introducing other similar
> bugs due to these changes, ones that we won't be so lucky to find as
> quickly as this one.

Dmitry was kind enough to point me towards Emacs' test suite which I'm
now having a close look at. I really did not know that even existed. I
will be trying to come up with tests for the bugs and as usual, also do
the necessary commenting on the code to make things clear that are not
as obvious from the code itself (like bug references, ...).

In general, maybe, it would be a good idea to actually put a stronger
emphasis on getting a test case for each bug fix, where possible. That
would increase test coverage considerably over time and help prevent
regressions like this one.

> Same here, obviously.

I never doubted that for a picosecond. :-)

To make one more thing clear: We really don't have to discuss this any
further. You probably have better things to do, and I don't want
to steal your time. And like I said, no matter what decision is made, it
is really fine by me. I don't want to cause any trouble...

Thanks again for your patience,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-14 22:52                                                                                           ` Stefan Monnier
@ 2018-03-15 15:17                                                                                             ` Matthias Dahl
  0 siblings, 0 replies; 151+ messages in thread
From: Matthias Dahl @ 2018-03-15 15:17 UTC (permalink / raw)
  To: Stefan Monnier, emacs-devel

Hello Stefan...

On 14/03/18 23:52, Stefan Monnier wrote:

> AFAIK we're talking about hangs (and hangs you can interrupt with C-g
> AFAICT), not about data corruption, so the comparison is unfair.

You are absolutely right, I am sorry. But I warned you, I am not very
good when it comes to examples / metaphors or arguing. So... ;-)

> I don't fundamentally disagree: it's a balancing act.

Like with Eli, I do believe we are all generally on the same page and
just disagree on the severity of the bugs. And it's only human to
have a difference of opinion. :)

I rest my case. Like I said in my previous mail, I don't want to waste
anyone's time nor did I want to cause a stir at all. I just have, like
everyone else here, the best intentions...

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-13 15:10                                                                                     ` Matthias Dahl
                                                                                                         ` (2 preceding siblings ...)
  2018-03-13 16:32                                                                                       ` Lars Ingebrigtsen
@ 2018-03-31 15:44                                                                                       ` Lars Ingebrigtsen
  2018-04-01  2:05                                                                                         ` andrés ramírez
  2018-06-08  5:11                                                                                         ` Leo Liu
  3 siblings, 2 replies; 151+ messages in thread
From: Lars Ingebrigtsen @ 2018-03-31 15:44 UTC (permalink / raw)
  To: Matthias Dahl
  Cc: Paul Eggert, Eli Zaretskii, andrés ramírez,
	Robert Pluim, emacs-devel

After not seeing a single hang since applying the patch (that's two
weeks ago?), I got one hang yesterday and one hang today.  I started
wondering whether I had somehow lost the patch, but no, this Emacs is
still the patched one...

The symptoms were the same as before: When negotiating a TLS connection
from a timer, Emacs hung and I had to hit `C-g' three times to make
Emacs progress.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-31 15:44                                                                                       ` Lars Ingebrigtsen
@ 2018-04-01  2:05                                                                                         ` andrés ramírez
  2018-06-08  5:11                                                                                         ` Leo Liu
  1 sibling, 0 replies; 151+ messages in thread
From: andrés ramírez @ 2018-04-01  2:05 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: Paul Eggert, Eli Zaretskii, Matthias Dahl, Robert Pluim,
	emacs-devel

Hi There.
> After not seeing a single hang since applying the patch (that's two
> weeks ago?), I got one hang yesterday and one hang today.  I started
> wondering whether I had somehow lost the patch, but no, this Emacs is
> still the patched one...

In my case, no hangs after applying the patch. But my setup is
different: I use mbsync as MTA for downloading the email, and wl only
as MUA. I do use wl for retrieving news from news.gwene.org and
news.gmane.org, though.

> The symptoms were the same as before: When negotiating a TLS connection
> from a timer, Emacs hung and I had to hit `C-g' three times to make
> Emacs progress.
So no TLS negotiation in my case.





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-31 15:44                                                                                       ` Lars Ingebrigtsen
  2018-04-01  2:05                                                                                         ` andrés ramírez
@ 2018-06-08  5:11                                                                                         ` Leo Liu
  2018-06-08  6:57                                                                                           ` Eli Zaretskii
  1 sibling, 1 reply; 151+ messages in thread
From: Leo Liu @ 2018-06-08  5:11 UTC (permalink / raw)
  To: emacs-devel

On 2018-03-31 17:44 +0200, Lars Ingebrigtsen wrote:
> After not seeing a single hang since applying the patch (that's two
> weeks ago?), I got one hang yesterday and one hang today.  I started
> wondering whether I had somehow lost the patch, but no, this Emacs is
> still the patched one...
>
> The symptoms were the same as before: When negotiating a TLS connection
> from a timer, Emacs hung and I had to hit `C-g' three times to make
> Emacs progress.

I am finally able to use 26.1 (which I don't think has the patches from
Matthias) on macOS and already have to force-kill emacs twice. I am
posting this 16 minutes after the second hang.

This is probably the worst kind of hang. In the middle of normal
editing Emacs suddenly stopped responding to any key strokes. No amount
of C-g or kill SIGUSR2 could get it back. I also checked that the emacs
process was consuming nearly 0 CPU before kill -9.

The good work that Matthias Dahl is doing has the potential to stabilise
emacs in this area, and it would be nice to have it for 26.2.

Leo





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-06-08  5:11                                                                                         ` Leo Liu
@ 2018-06-08  6:57                                                                                           ` Eli Zaretskii
  2018-06-08  9:07                                                                                             ` Leo Liu
  0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2018-06-08  6:57 UTC (permalink / raw)
  To: Leo Liu; +Cc: emacs-devel

> From: Leo Liu <sdl.web@gmail.com>
> Date: Fri, 08 Jun 2018 13:11:34 +0800
> 
> I am finally able to use 26.1 (which I don't think has the patches from
> Matthias)

Indeed, it does not.

> The good work that Matthias Dahl is doing has the potential to stabilise
> emacs in this area, and it would be nice to have it for 26.2.

It's already on the emacs-26 branch, in anticipation of Emacs 26.2, so
I suggest trying a build from that branch to see if your problems are
solved.




* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-06-08  6:57                                                                                           ` Eli Zaretskii
@ 2018-06-08  9:07                                                                                             ` Leo Liu
  0 siblings, 0 replies; 151+ messages in thread
From: Leo Liu @ 2018-06-08  9:07 UTC (permalink / raw)
  To: emacs-devel

On 2018-06-08 09:57 +0300, Eli Zaretskii wrote:
> It's already on the emacs-26 branch, in anticipation of Emacs 26.2, so
> I suggest trying a build from that branch to see if your problems are
> solved.

Excellent. Thanks, Eli.

Leo





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-03-15 14:59                                                                                         ` Matthias Dahl
@ 2018-06-26 13:36                                                                                           ` Matthias Dahl
  2018-06-26 14:09                                                                                             ` andrés ramírez
  2018-07-21  9:52                                                                                             ` Eli Zaretskii
  0 siblings, 2 replies; 151+ messages in thread
From: Matthias Dahl @ 2018-06-26 13:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, larsi, rrandresf, Stefan Monnier, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 619 bytes --]

Hello Eli,

sorry that it has been a while since my last sign of life but 2018 has
been an especially bad year thus far health-wise, so I am struggling
to juggle "everything".

I just wanted to bring the attached fixes back into discussion to get
them into master.

As I understand it, they don't completely fix the problems Andres and
Lars have been seeing, but they still fix real bugs that can cause
random erratic / buggy behavior and/or freezes.

What do you (and all the others in Cc :P) say?

Thanks a lot for taking the time,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Fix-GnuTLS-error-handling.patch --]
[-- Type: text/x-patch; name="0001-Fix-GnuTLS-error-handling.patch", Size: 2146 bytes --]

From b6372518bb174493da482fcfb388aeb8954640dc Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Mon, 12 Mar 2018 15:33:45 +0100
Subject: [PATCH 1/3] Fix GnuTLS error handling

* src/gnutls.c (emacs_gnutls_read): All error handling should be done
in `emacs_gnutls_handle_error', move handling of
GNUTLS_E_UNEXPECTED_PACKET_LENGTH accordingly.
We always need to set `errno' in case of an error, since later error
handling (e.g. `wait_reading_process_output') depends on it and GnuTLS
does not set errno by itself. We'll otherwise have random errno values
which can cause erratic behavior and hangs.
(emacs_gnutls_handle_error): GNUTLS_E_UNEXPECTED_PACKET_LENGTH is only
returned for premature connection termination on GnuTLS < 3.0 and is
otherwise a real error and should not be gobbled up.
---
 src/gnutls.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/src/gnutls.c b/src/gnutls.c
index d22d5d267c..5bf5ee0e5c 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -708,16 +708,18 @@ emacs_gnutls_read (struct Lisp_Process *proc, char *buf, ptrdiff_t nbyte)
   rtnval = gnutls_record_recv (state, buf, nbyte);
   if (rtnval >= 0)
     return rtnval;
-  else if (rtnval == GNUTLS_E_UNEXPECTED_PACKET_LENGTH)
-    /* The peer closed the connection. */
-    return 0;
   else if (emacs_gnutls_handle_error (state, rtnval))
-    /* non-fatal error */
-    return -1;
-  else {
-    /* a fatal error occurred */
-    return 0;
-  }
+    {
+      /* non-fatal error */
+      errno = EAGAIN;
+      return -1;
+    }
+  else
+    {
+      /* a fatal error occurred */
+      errno = EPROTO;
+      return 0;
+    }
 }
 
 static char const *
@@ -756,8 +758,10 @@ emacs_gnutls_handle_error (gnutls_session_t session, int err)
 	 connection.  */
 # ifdef HAVE_GNUTLS3
       if (err == GNUTLS_E_PREMATURE_TERMINATION)
-	level = 3;
+# else
+      if (err == GNUTLS_E_UNEXPECTED_PACKET_LENGTH)
 # endif
+	level = 3;
 
       GNUTLS_LOG2 (level, max_log_level, "fatal error:", str);
       ret = false;
-- 
2.18.0
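
To illustrate the errno contract described in the commit message above,
here is a minimal, self-contained sketch.  It is not the Emacs code:
lib_recv, lib_error_is_transient and the LIB_E_* codes are hypothetical
stand-ins for a library that, like GnuTLS, reports failures through
negative return values and never sets errno itself.  The wrapper
translates those codes so that a caller inspecting errno sees a
well-defined value instead of whatever an earlier call left behind.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical library error codes (not real GnuTLS values).  */
enum { LIB_E_AGAIN = -28, LIB_E_FATAL = -9 };

/* Hypothetical library read: returns a byte count on success or a
   negative library-specific code on failure.  It never touches errno.
   The result is passed in so that the sketch is easy to exercise.  */
ssize_t
lib_recv (int simulated_result)
{
  return simulated_result;
}

/* Hypothetical classifier: true if the error is transient and the
   caller may simply retry later.  */
bool
lib_error_is_transient (ssize_t err)
{
  return err == LIB_E_AGAIN;
}

/* Wrapper that always leaves errno in a defined state on failure.  */
ssize_t
wrapped_read (int simulated_result)
{
  ssize_t rtnval = lib_recv (simulated_result);

  if (rtnval >= 0)
    return rtnval;

  if (lib_error_is_transient (rtnval))
    {
      /* Non-fatal: tell the caller to try again.  */
      errno = EAGAIN;
      return -1;
    }

  /* Fatal: report end of stream, but with a meaningful errno.  */
  errno = EPROTO;
  return 0;
}
```

Without the two assignments, a caller that checks errno after a -1
return would be reading a value set by some unrelated earlier call,
which is the kind of erratic behavior the patch describes.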


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-Always-check-GnuTLS-sessions-for-available-data.patch --]
[-- Type: text/x-patch; name="0002-Always-check-GnuTLS-sessions-for-available-data.patch", Size: 4588 bytes --]

From c52252a862c39b9877ad50640ba6deb352affd11 Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Mon, 12 Mar 2018 16:07:55 +0100
Subject: [PATCH 2/3] Always check GnuTLS sessions for available data

* src/process.c (wait_reading_process_output): GnuTLS buffers data
internally and as such there is no guarantee that a select() call on
the underlying kernel socket will report available data if all data
has already been buffered. Prior to GnuTLS 2.12, lowat mode was the
default which left bytes back in the kernel socket, so a select() call
would work. With GnuTLS >= 2.12 (the now required version for Emacs),
that default changed to non-lowat mode (and we don't set it otherwise)
and was subsequently completely removed with GnuTLS >= 3.0.
So, to properly handle GnuTLS sessions, we need to iterate through all
channels, check for available data manually and set the concerning fds
accordingly. Otherwise we might stall/delay unnecessarily or worse.
This also applies to the !just_wait_proc && wait_proc case, which was
previously handled improperly (only wait_proc was checked) and could
cause problems if sessions did have any dependency on one another
through e.g. higher up program logic and waited for one another.
---
 src/process.c | 77 ++++++++++++++++-----------------------------------
 1 file changed, 24 insertions(+), 53 deletions(-)

diff --git a/src/process.c b/src/process.c
index 6dba218c90..5702408985 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5397,60 +5397,31 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 #endif	/* !HAVE_GLIB */
 
 #ifdef HAVE_GNUTLS
-          /* GnuTLS buffers data internally.  In lowat mode it leaves
-             some data in the TCP buffers so that select works, but
-             with custom pull/push functions we need to check if some
-             data is available in the buffers manually.  */
-          if (nfds == 0)
+          /* GnuTLS buffers data internally. select() will only report
+             available data for the underlying kernel sockets API, not
+             what has been buffered internally. As such, we need to loop
+             through all channels and check for available data manually.  */
+          if (nfds >= 0)
 	    {
-	      fd_set tls_available;
-	      int set = 0;
-
-	      FD_ZERO (&tls_available);
-	      if (! wait_proc)
-		{
-		  /* We're not waiting on a specific process, so loop
-		     through all the channels and check for data.
-		     This is a workaround needed for some versions of
-		     the gnutls library -- 2.12.14 has been confirmed
-		     to need it.  See
-		     http://comments.gmane.org/gmane.emacs.devel/145074 */
-		  for (channel = 0; channel < FD_SETSIZE; ++channel)
-		    if (! NILP (chan_process[channel]))
-		      {
-			struct Lisp_Process *p =
-			  XPROCESS (chan_process[channel]);
-			if (p && p->gnutls_p && p->gnutls_state
-			    && ((emacs_gnutls_record_check_pending
-				 (p->gnutls_state))
-				> 0))
-			  {
-			    nfds++;
-			    eassert (p->infd == channel);
-			    FD_SET (p->infd, &tls_available);
-			    set++;
-			  }
-		      }
-		}
-	      else
-		{
-		  /* Check this specific channel.  */
-		  if (wait_proc->gnutls_p /* Check for valid process.  */
-		      && wait_proc->gnutls_state
-		      /* Do we have pending data?  */
-		      && ((emacs_gnutls_record_check_pending
-			   (wait_proc->gnutls_state))
-			  > 0))
-		    {
-		      nfds = 1;
-		      eassert (0 <= wait_proc->infd);
-		      /* Set to Available.  */
-		      FD_SET (wait_proc->infd, &tls_available);
-		      set++;
-		    }
-		}
-	      if (set)
-		Available = tls_available;
+              for (channel = 0; channel < FD_SETSIZE; ++channel)
+                if (! NILP (chan_process[channel]))
+                  {
+                    struct Lisp_Process *p =
+                      XPROCESS (chan_process[channel]);
+
+                    if (just_wait_proc && p != wait_proc)
+                      continue;
+
+                    if (p && p->gnutls_p && p->gnutls_state
+                        && ((emacs_gnutls_record_check_pending
+                             (p->gnutls_state))
+                            > 0))
+                      {
+                        nfds++;
+                        eassert (p->infd == channel);
+                        FD_SET (p->infd, &Available);
+                      }
+                  }
 	    }
 #endif
 	}
-- 
2.18.0
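
For readers following the GnuTLS hunk above: the point of the new comment is that select() only sees bytes still held by the kernel, not bytes a library has already pulled into its own buffer. The following is a minimal sketch of that effect using a pipe and a hypothetical `buffered_reader` in place of a TLS session; none of these names come from Emacs or GnuTLS, and `reader_pending` is only a loose analogue of emacs_gnutls_record_check_pending.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical stand-in for a TLS session: bytes already pulled from
   the descriptor but not yet handed to the caller.  */
struct buffered_reader
{
  int fd;
  char buf[64];
  size_t len;			/* bytes currently held in buf */
};

/* Return 1 if select() reports FD readable right now, else 0.  */
static int
kernel_has_data (int fd)
{
  fd_set rfds;
  struct timeval zero = { 0, 0 };

  FD_ZERO (&rfds);
  FD_SET (fd, &rfds);
  return (select (fd + 1, &rfds, NULL, NULL, &zero) > 0
	  && FD_ISSET (fd, &rfds));
}

/* Number of bytes held in user space (assumed analogue of a
   "record check pending" call).  */
static size_t
reader_pending (const struct buffered_reader *r)
{
  return r->len;
}

/* Drain the descriptor into the user-space buffer, the way a library
   may read ahead of its caller.  */
static void
reader_fill (struct buffered_reader *r)
{
  ssize_t n = read (r->fd, r->buf, sizeof r->buf);
  r->len = n > 0 ? (size_t) n : 0;
}

/* Return 1 if data is pending in user space while select() reports
   nothing, 0 otherwise, -1 on setup failure.  */
static int
demo_hidden_data (void)
{
  int fds[2];
  struct buffered_reader r;

  if (pipe (fds) != 0)
    return -1;
  r.fd = fds[0];
  r.len = 0;
  if (write (fds[1], "hello", 5) != 5)
    return -1;

  int before = kernel_has_data (r.fd);	/* data still in the kernel */
  reader_fill (&r);
  int after = kernel_has_data (r.fd);	/* kernel buffer now empty */
  int hidden = before && !after && reader_pending (&r) == 5;

  close (fds[0]);
  close (fds[1]);
  return hidden;
}
```

Calling demo_hidden_data () from a small main () is enough to try it. A wait loop that trusts select() alone would sleep here even though five bytes are ready to be consumed, which is why the patch polls every channel for pending data.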


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: 0003-Make-xg_select-behave-more-like-pselect.patch --]
[-- Type: text/x-patch; name="0003-Make-xg_select-behave-more-like-pselect.patch", Size: 1322 bytes --]

From 63c4c10b9ef4fb7b40f008db5c9969357312ceae Mon Sep 17 00:00:00 2001
From: Matthias Dahl <matthias.dahl@binary-island.eu>
Date: Tue, 13 Mar 2018 15:35:16 +0100
Subject: [PATCH 3/3] Make xg_select() behave more like pselect()

* src/xgselect.c (xg_select): If no file descriptors have data ready,
pselect() clears the passed in fd sets whereas xg_select() does not
which caused Bug#21337 for `wait_reading_process_output'.
Clear the passed in sets if no fds are ready but leave them untouched
if pselect() returns an error -- just like pselect() does itself.
---
 src/xgselect.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/xgselect.c b/src/xgselect.c
index fedd3127ef..f68982143e 100644
--- a/src/xgselect.c
+++ b/src/xgselect.c
@@ -143,6 +143,14 @@ xg_select (int fds_lim, fd_set *rfds, fd_set *wfds, fd_set *efds,
             ++retval;
         }
     }
+  else if (nfds == 0)
+    {
+      /* pselect() clears the fd sets if no fd is ready (but not if an
+         error occurred), so we do the same to be compatible (Bug#21337).  */
+      if (rfds) FD_ZERO (rfds);
+      if (wfds) FD_ZERO (wfds);
+      if (efds) FD_ZERO (efds);
+    }
 
   /* If Gtk+ is in use eventually gtk_main_iteration will be called,
      unless retval is zero.  */
-- 
2.18.0
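
The pselect() behaviour that the commit message relies on can be checked in isolation. This is only a sketch under the assumption of a POSIX system; the function name is made up for illustration and nothing here is taken from xgselect.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Return 1 if pselect() timed out (returned 0) and cleared the read
   set, 0 otherwise, -1 on setup failure.  The name is hypothetical.  */
static int
pselect_clears_set_on_timeout (void)
{
  int fds[2];
  fd_set rfds;
  struct timespec zero = { 0, 0 };

  if (pipe (fds) != 0)
    return -1;

  /* Ask about the read end of an empty pipe: nothing is ready.  */
  FD_ZERO (&rfds);
  FD_SET (fds[0], &rfds);

  int n = pselect (fds[0] + 1, &rfds, NULL, NULL, &zero, NULL);

  /* On timeout the set comes back with all bits cleared, so a caller
     that inspects it afterwards sees no ready descriptors.  */
  int cleared = (n == 0 && !FD_ISSET (fds[0], &rfds));

  close (fds[0]);
  close (fds[1]);
  return cleared;
}
```

A replacement such as xg_select() that returns 0 but leaves the caller's bits set would make the same FD_ISSET check succeed, and the caller would then try to read from a descriptor that has nothing to offer.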


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-06-26 13:36                                                                                           ` Matthias Dahl
@ 2018-06-26 14:09                                                                                             ` andrés ramírez
  2018-06-27 13:10                                                                                               ` Matthias Dahl
  2018-07-21  9:52                                                                                             ` Eli Zaretskii
  1 sibling, 1 reply; 151+ messages in thread
From: andrés ramírez @ 2018-06-26 14:09 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: Eli Zaretskii, eggert, larsi, Stefan Monnier, emacs-devel

Hi Matthias.
> sorry that it has been a while since my last sign of life but 2018 has
> been an especially bad year thus far health-wise, so I am struggling
> to juggle "everything".
> 
> I just wanted to bring the attached fixes back into discussion to get
> them into master.
Eli said this is included in 26.2. I suppose it is on master also.
> As I understand it, they don't completely fix the problems Andres and
> Lars have been seeing, but they still fix real bugs that can cause
> random erratic / buggy behavior and/or freezes.

But it is so much better than the previous behaviour.
> What do you (and all the others in cc' :P) say?

I had not had any issues until a month ago, when I moved to 26.1 and
started having the issue again because I forgot to apply the patch. I am
probably going to compile 26.2 soon.

Best Regards. And Thanks for the work



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-06-26 14:09                                                                                             ` andrés ramírez
@ 2018-06-27 13:10                                                                                               ` Matthias Dahl
  2018-06-27 15:18                                                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-06-27 13:10 UTC (permalink / raw)
  To: andrés ramírez
  Cc: Eli Zaretskii, eggert, emacs-devel, larsi, Stefan Monnier

Hello Andrés...

On 26/06/18 16:09, andrés ramírez wrote:

> Eli said this is included in 26.2. I suppose it is on master also.

Thus far, those patches are neither on master, nor on the 26 branch.

> But it is so much better than the previous behaviour.

Glad to hear that. Really wish we could fix it completely for you guys.

> And Thanks for the work

You are very welcome.

Have a nice day,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-06-27 13:10                                                                                               ` Matthias Dahl
@ 2018-06-27 15:18                                                                                                 ` Eli Zaretskii
  2018-06-28  8:01                                                                                                   ` Matthias Dahl
  0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2018-06-27 15:18 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: larsi, rrandresf, emacs-devel, monnier, eggert

> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Wed, 27 Jun 2018 15:10:31 +0200
> Cc: Eli Zaretskii <eliz@gnu.org>, eggert@cs.ucla.edu, emacs-devel@gnu.org,
> 	larsi@gnus.org, Stefan Monnier <monnier@iro.umontreal.ca>
> 
> Hello Andrés...
> 
> On 26/06/18 16:09, andrés ramírez wrote:
> 
> > Eli said this is included in 26.2. I suppose it is on master also.
> 
> Thus far, those patches are neither on master, nor on the 26 branch.

Maybe it was my misunderstanding, but I thought Andrés was asking
about the changes we pushed to master before Emacs 26.1 was released,
and which are now backported to emacs-26.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-06-27 15:18                                                                                                 ` Eli Zaretskii
@ 2018-06-28  8:01                                                                                                   ` Matthias Dahl
  2018-06-28 13:04                                                                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-06-28  8:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, rrandresf, emacs-devel, monnier, eggert

Hello Eli...

On 27/06/18 17:18, Eli Zaretskii wrote:

> Maybe it was my misunderstanding, but I thought Andrés was asking
> about the changes we pushed to master before Emacs 26.1 was released,
> and which are now backported to emacs-26.

Ok, now we are both confused. :^) I guess Andrés asked you off-list
because I did not see anything on the list, so I don't quite know what
you are referring to.

No matter what. The patches in question which I sent this week, have
not been applied to either master or the emacs-26 branch. Could you
please have a look and get them merged, if there are no objections?
Both Lars and Andrés tested them earlier this year, when I initially
sent them... and I have been running them locally as well.

Like I said, they fix real bugs that can lead to all sorts of things.
The commit msgs should be self-explanatory.

Thanks again,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-06-28  8:01                                                                                                   ` Matthias Dahl
@ 2018-06-28 13:04                                                                                                     ` Eli Zaretskii
  2018-06-28 13:25                                                                                                       ` Matthias Dahl
  2018-07-03 13:34                                                                                                       ` Matthias Dahl
  0 siblings, 2 replies; 151+ messages in thread
From: Eli Zaretskii @ 2018-06-28 13:04 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: larsi, rrandresf, emacs-devel, monnier, eggert

> Cc: rrandresf@gmail.com, eggert@cs.ucla.edu, emacs-devel@gnu.org,
>  larsi@gnus.org, monnier@iro.umontreal.ca
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Thu, 28 Jun 2018 10:01:58 +0200
> 
> On 27/06/18 17:18, Eli Zaretskii wrote:
> 
> > Maybe it was my misunderstanding, but I thought Andrés was asking
> > about the changes we pushed to master before Emacs 26.1 was released,
> > and which are now backported to emacs-26.
> 
> Ok, now we are both confused. :^) I guess Andrés asked you off-list
> because I did not see anything on the list, so I don't quite know what
> you are referring to.

It wasn't Andrés, it was Leo.  See

  http://lists.gnu.org/archive/html/emacs-devel/2018-06/msg00231.html

Andrés just assumed I was talking about some other patch, or maybe I
assumed he was talking about the same patch.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-06-28 13:04                                                                                                     ` Eli Zaretskii
@ 2018-06-28 13:25                                                                                                       ` Matthias Dahl
  2018-06-28 16:33                                                                                                         ` Leo Liu
  2018-07-03 13:34                                                                                                       ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-06-28 13:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel, rrandresf, monnier, larsi, Leo Liu

Hello...

On 28/06/18 15:04, Eli Zaretskii wrote:

> It wasn't Andrés, it was Leo.  See

Ah... okay. I missed that when I went through my local archives. Sorry.

@Leo Have any of those patches fixed the hangs for you? Or did you make
any progress in figuring out what was causing the hangs?

> Andrés just assumed I was talking about some other patch, or maybe I
> assumed he was talking about the same patch.

I get it now and understand the confusion. :-) Andrés probably assumed
you were talking about _all_ patches (those from last year, as well as
those from earlier this year that I re-sent this week) and Leo, well...
I've no clue if he tried the new ones as well.

It is really unfortunate that all this went into one single loooooooong
thread, which is also mostly my fault. That made things rather unclear.
Once those current patches are dealt with, we'll finally put this thread
to its well deserved rest.

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-06-28 13:25                                                                                                       ` Matthias Dahl
@ 2018-06-28 16:33                                                                                                         ` Leo Liu
  2018-06-28 18:31                                                                                                           ` T.V Raman
  2018-07-02 13:24                                                                                                           ` Matthias Dahl
  0 siblings, 2 replies; 151+ messages in thread
From: Leo Liu @ 2018-06-28 16:33 UTC (permalink / raw)
  To: Matthias Dahl
  Cc: eggert, emacs-devel, rrandresf, monnier, larsi, Eli Zaretskii

On 2018-06-28 15:25 +0200, Matthias Dahl wrote:
> @Leo Have any of those patches fixed the hangs for you? Or did you make
> any progress in figuring out what was causing the hangs?

No, I can't find a way to reproduce it. The hang only happens
intermittently and it can be no crash for weeks. So I am basically just
waiting for it to happen. I am on 26.1 for 2-3 weeks and it seems
reasonably stable.

Leo



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-06-28 16:33                                                                                                         ` Leo Liu
@ 2018-06-28 18:31                                                                                                           ` T.V Raman
  2018-07-02 13:27                                                                                                             ` Matthias Dahl
  2018-07-02 13:24                                                                                                           ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: T.V Raman @ 2018-06-28 18:31 UTC (permalink / raw)
  To: Leo Liu
  Cc: eggert, Matthias Dahl, emacs-devel, rrandresf, monnier, larsi,
	Eli Zaretskii

I still get hangs when opening GMail labels; the hangs happen both when
opening and exiting the summary buffer if the label has a large number
(O(100K)) of messages
-- 
Id: kg:/m/0285kf1 



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-06-28 16:33                                                                                                         ` Leo Liu
  2018-06-28 18:31                                                                                                           ` T.V Raman
@ 2018-07-02 13:24                                                                                                           ` Matthias Dahl
  2018-07-14  2:27                                                                                                             ` Leo Liu
  1 sibling, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-07-02 13:24 UTC (permalink / raw)
  To: Leo Liu; +Cc: eggert, emacs-devel, rrandresf, monnier, larsi, Eli Zaretskii

Hello Leo...

On 28/06/18 18:33, Leo Liu wrote:

> No, I can't find a way to reproduce it. The hang only happens
> intermittently and it can be no crash for weeks. So I am basically just
> waiting for it to happen. I am on 26.1 for 2-3 weeks and it seems
> reasonably stable.

Sorry to hear that. My tip (but you might already have heard that
countless times): Have emacs built with debugging info and attach gdb to
the process once it hangs. Produce a suitable backtrace (w/ the elisp
part) and post that to the list. Otherwise there is no way to know where
to look. :-(

Last but not least: And it is a real hang, so you cannot cancel it and
you need to really kill emacs? Could you give the master branch a try
and maybe also the patches I re-sent last week?

Good luck,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-06-28 18:31                                                                                                           ` T.V Raman
@ 2018-07-02 13:27                                                                                                             ` Matthias Dahl
  2018-07-02 14:35                                                                                                               ` T.V Raman
  0 siblings, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-07-02 13:27 UTC (permalink / raw)
  To: T.V Raman, Leo Liu
  Cc: eggert, emacs-devel, rrandresf, monnier, larsi, Eli Zaretskii

Hi...

On 28/06/18 20:31, T.V Raman wrote:

> I still get hangs when opening GMail labels; the hangs happen both when
> opening and exiting the summary buffer if the label has a large number
> (O(100K)) of messages

Do those hangs happen all the time or randomly? And: What emacs version
are you on? Have you tried any of the patches posted in this thread?

Thanks,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-07-02 13:27                                                                                                             ` Matthias Dahl
@ 2018-07-02 14:35                                                                                                               ` T.V Raman
  2018-07-03 13:27                                                                                                                 ` Matthias Dahl
  0 siblings, 1 reply; 151+ messages in thread
From: T.V Raman @ 2018-07-02 14:35 UTC (permalink / raw)
  To: ml_emacs-lists
  Cc: eggert, emacs-devel, rrandresf, eliz, monnier, larsi, raman,
	sdl.web

Happens reliably. Am running emacs build from GitHub and rebuild about
once a week 

Matthias Dahl writes:
 > Hi...
 > 
 > On 28/06/18 20:31, T.V Raman wrote:
 > 
 > > I still get hangs when opening GMail labels; the hangs happen both when
 > > opening and exiting the summary buffer if the label has a large number
 > > (O(100K)) of messages
 > 
 > Do those hangs happen all the time or randomly? And: What emacs version
 > are you on? Have you tried any of the patches posted in this thread?
 > 
 > Thanks,
 > Matthias
 > 
 > -- 
 > Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu

-- 
Id: kg:/m/0285kf1 




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-07-02 14:35                                                                                                               ` T.V Raman
@ 2018-07-03 13:27                                                                                                                 ` Matthias Dahl
  2018-07-03 13:52                                                                                                                   ` T. V. Raman
  0 siblings, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-07-03 13:27 UTC (permalink / raw)
  To: T.V Raman; +Cc: eggert, emacs-devel, rrandresf, eliz, monnier, larsi, sdl.web

Hi...

On 02/07/18 16:35, T.V Raman wrote:

> Happens reliably. Am running emacs build from GitHub and rebuild about
> once a week 

Perfect, then you shouldn't have a problem getting a decent backtrace
through gdb when emacs hangs (don't forget the lisp part). If you send
that to the list, I am sure that will get you a (helpful) response.

Otherwise, the cause could be pretty much anything and everywhere. No
way to know where to look...

Thanks,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-06-28 13:04                                                                                                     ` Eli Zaretskii
  2018-06-28 13:25                                                                                                       ` Matthias Dahl
@ 2018-07-03 13:34                                                                                                       ` Matthias Dahl
  2018-07-03 18:57                                                                                                         ` Eli Zaretskii
  1 sibling, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-07-03 13:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel, rrandresf, monnier, larsi, Leo Liu

Hello Eli...

Since I haven't gotten any clear response on those patches at all, is
there even interest in those or am I doing something wrong/suboptimal
that is making things difficult somehow?

Just asking and trying to make sure that the subject is not lost and
that I am not making any stupid mistakes that I should be aware of.

Thanks for your patience,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-07-03 13:27                                                                                                                 ` Matthias Dahl
@ 2018-07-03 13:52                                                                                                                   ` T. V. Raman
  2018-07-03 15:03                                                                                                                     ` Stefan Monnier
  0 siblings, 1 reply; 151+ messages in thread
From: T. V. Raman @ 2018-07-03 13:52 UTC (permalink / raw)
  To: Matthias Dahl
  Cc: eggert, emacs-devel, rrandresf, eliz, monnier, larsi, sdl.web

It's impossible for me to run gdb and get a trace; the only thing that
is giving me spoken feedback is emacs -- so once I come out of it I
can't do anything. Hopefully someone else will be able to isolate the
issue 
-- 



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-07-03 13:52                                                                                                                   ` T. V. Raman
@ 2018-07-03 15:03                                                                                                                     ` Stefan Monnier
  0 siblings, 0 replies; 151+ messages in thread
From: Stefan Monnier @ 2018-07-03 15:03 UTC (permalink / raw)
  To: emacs-devel

> It's impossible for me to run gdb and get a trace; the only thing that
> is giving me spoken feedback is emacs -- so once I come out of it I
> can't do anything. Hopefully someone else will be able to isolate the
> issue 

FWIW, My normal "login" rigmarole involves starting Emacs, then M-x
gud-gdb to run the "real" Emacs.


        Stefan




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-07-03 13:34                                                                                                       ` Matthias Dahl
@ 2018-07-03 18:57                                                                                                         ` Eli Zaretskii
  2018-07-04  7:35                                                                                                           ` Matthias Dahl
  0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2018-07-03 18:57 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: eggert, emacs-devel, rrandresf, monnier, larsi, sdl.web

> Cc: larsi@gnus.org, rrandresf@gmail.com, emacs-devel@gnu.org,
>  monnier@iro.umontreal.ca, eggert@cs.ucla.edu, Leo Liu <sdl.web@gmail.com>
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Tue, 3 Jul 2018 15:34:40 +0200
> 
> Hello Eli...
> 
> Since I haven't gotten any clear response on those patches at all

You didn't wait long enough ;-)

> Just asking and trying to make sure that the subject is not lost and
> that I am not making any stupid mistakes that I should be aware of.

Please have some patience, I don't have enough time lately to live up
to your expectations, sorry.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-07-03 18:57                                                                                                         ` Eli Zaretskii
@ 2018-07-04  7:35                                                                                                           ` Matthias Dahl
  2018-07-04 15:11                                                                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 151+ messages in thread
From: Matthias Dahl @ 2018-07-04  7:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel, rrandresf, monnier, larsi, sdl.web

Hello Eli...

On 03/07/18 20:57, Eli Zaretskii wrote:

> You didn't wait long enough ;-)

Ah, sorry about that. :-) I was just a bit worried because I did get
one or two responses from you but there was nothing about the patches
in there... not even a "taking a look when I get the time". So I thought
that maybe I was missing something important.

> Please have some patience, I don't have enough time lately to live up
> to your expectations, sorry.

I honestly apologize if you felt that I had any expectations or that I
was being impatient. I was doubting _myself_ if something was expected
of me that I did not do and so forth. Like I said, just wanted to make
sure... that's all.

Take all the time you need... and thanks for even taking the time to
check the patches in the first place.

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-07-04  7:35                                                                                                           ` Matthias Dahl
@ 2018-07-04 15:11                                                                                                             ` Eli Zaretskii
  0 siblings, 0 replies; 151+ messages in thread
From: Eli Zaretskii @ 2018-07-04 15:11 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: eggert, emacs-devel, rrandresf, monnier, larsi, sdl.web

> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Wed, 4 Jul 2018 09:35:29 +0200
> Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org, rrandresf@gmail.com,
> 	monnier@iro.umontreal.ca, larsi@gnus.org, sdl.web@gmail.com
> 
> I honestly apologize if you felt that I had any expectations or that I
> was being impatient. I was doubting _myself_ if something was expected
> of me that I did not do and so forth. Like I said, just wanted to make
> sure... that's all.

No need to apologize, you had no way of knowing that the issue just
waits in my queue.



^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-07-02 13:24                                                                                                           ` Matthias Dahl
@ 2018-07-14  2:27                                                                                                             ` Leo Liu
  0 siblings, 0 replies; 151+ messages in thread
From: Leo Liu @ 2018-07-14  2:27 UTC (permalink / raw)
  To: emacs-devel

On 2018-07-02 15:24 +0200, Matthias Dahl wrote:
> Sorry to hear that. My tip (but you might already have heard that
> countless times): Have emacs built with debugging info and attach gdb to
> the process once it hangs. Produce a suitable backtrace (w/ the elisp
> part) and post that to the list. Otherwise there is no way to know where
> to look.

26.1 seems pretty stable, or else I haven't used it long enough. No hang
or anything fatal. It's becoming harder (if not impossible) to use gdb
on macOS, but I always build emacs with debug info just in case. It is
hard to get any further in my situation because there is no known way to
make emacs hang other than waiting for the hang to show itself.

Thanks for the efforts in improving Emacs in this area.

Leo




^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-06-26 13:36                                                                                           ` Matthias Dahl
  2018-06-26 14:09                                                                                             ` andrés ramírez
@ 2018-07-21  9:52                                                                                             ` Eli Zaretskii
  2018-08-07  8:38                                                                                               ` Matthias Dahl
  1 sibling, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2018-07-21  9:52 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: eggert, larsi, rrandresf, monnier, emacs-devel

> Cc: larsi@gnus.org, rrandresf@gmail.com, emacs-devel@gnu.org,
>  eggert@cs.ucla.edu, Stefan Monnier <monnier@iro.umontreal.ca>
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Tue, 26 Jun 2018 15:36:19 +0200
> 
> Hello Eli,
> 
> sorry that it has been a while since my last sign of life but 2018 has
> been an especially bad year thus far health-wise, so I am struggling
> to juggle "everything".
> 
> I just wanted to bring the attached fixes back into discussion to get
> them into master.
> 
> As I understand it, they don't completely fix the problems Andres and
> Lars have been seeing, but they still fix real bugs that can cause
> random erratic / buggy behavior and/or freezes.
> 
> What do you (and all the others in cc' :P) say?

Sadly, none of "the others" chimed in, and I'm definitely not an
expert on this stuff.  So just some comments, hopefully they will make
sense.

> diff --git a/src/gnutls.c b/src/gnutls.c
> index d22d5d267c..5bf5ee0e5c 100644
> --- a/src/gnutls.c
> +++ b/src/gnutls.c
> @@ -708,16 +708,18 @@ emacs_gnutls_read (struct Lisp_Process *proc, char *buf, ptrdiff_t nbyte)
>    rtnval = gnutls_record_recv (state, buf, nbyte);
>    if (rtnval >= 0)
>      return rtnval;
> -  else if (rtnval == GNUTLS_E_UNEXPECTED_PACKET_LENGTH)
> -    /* The peer closed the connection. */
> -    return 0;

Why is this hunk being deleted?  I see that you intend to handle it in
emacs_gnutls_handle_error, but I'm not sure I understand the net
result.  The current code simply ignores this situation; what will
happen with the new code, except for the problem being logged?

>    else if (emacs_gnutls_handle_error (state, rtnval))
> -    /* non-fatal error */
> -    return -1;
> -  else {
> -    /* a fatal error occurred */
> -    return 0;
> -  }
> +    {
> +      /* non-fatal error */
> +      errno = EAGAIN;
> +      return -1;
> +    }
> +  else
> +    {
> +      /* a fatal error occurred */
> +      errno = EPROTO;
> +      return 0;
> +    }

EPROTO is not universally available, AFAIK.  We have Gnulib's errno.h
that defines a value for it, but that just gets us through
compilation.  What do you expect this value to cause when it is
encountered by some code which needs to interpret it?  We need to make
sure the results will be sensible.
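
To make the question concrete, here is a rough sketch of how a typical read loop might interpret the (return value, errno) pairs the patch produces. It is not the Emacs code: `classify_read`, `mock_read` and the `READ_*` names are hypothetical, and the EPROTO fallback is an assumption made only so the sketch compiles where the constant is missing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EPROTO
# define EPROTO EIO		/* assumption for this sketch only */
#endif

enum read_outcome { READ_DATA, READ_RETRY, READ_EOF, READ_ERROR };

/* Hypothetical caller-side classification of a read result.  */
static enum read_outcome
classify_read (long rtnval, int err)
{
  if (rtnval > 0)
    return READ_DATA;
  if (rtnval == 0)
    return READ_EOF;		/* errno is normally not consulted here */
  if (err == EAGAIN || err == EWOULDBLOCK || err == EINTR)
    return READ_RETRY;
  return READ_ERROR;
}

/* Hypothetical mock of the patched convention: a non-fatal error
   returns -1 with EAGAIN, a fatal error returns 0 with EPROTO.  */
static long
mock_read (int fatal)
{
  errno = fatal ? EPROTO : EAGAIN;
  return fatal ? 0 : -1;
}

/* Call the mock first, then read errno, so the order is well defined.  */
static enum read_outcome
demo (int fatal)
{
  long n = mock_read (fatal);
  return classify_read (n, errno);
}
```

With a caller shaped like this, demo (0) is treated as "try again", while demo (1) is indistinguishable from an orderly end of file, because errno is only examined for negative return values. Whether the real callers behave this way is exactly what needs checking.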

> diff --git a/src/process.c b/src/process.c
> index 6dba218c90..5702408985 100644
> --- a/src/process.c
> +++ b/src/process.c
> @@ -5397,60 +5397,31 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
>  #endif	/* !HAVE_GLIB */
>  
>  #ifdef HAVE_GNUTLS
> -          /* GnuTLS buffers data internally.  In lowat mode it leaves
> -             some data in the TCP buffers so that select works, but
> -             with custom pull/push functions we need to check if some
> -             data is available in the buffers manually.  */
> -          if (nfds == 0)
> +          /* GnuTLS buffers data internally. select() will only report
> +             available data for the underlying kernel sockets API, not
> +             what has been buffered internally. As such, we need to loop
> +             through all channels and check for available data manually.  */
> +          if (nfds >= 0)
>  	    {
> -	      fd_set tls_available;
> -	      int set = 0;
> -
> -	      FD_ZERO (&tls_available);
> -	      if (! wait_proc)
> -		{
> -		  /* We're not waiting on a specific process, so loop
> -		     through all the channels and check for data.
> -		     This is a workaround needed for some versions of
> -		     the gnutls library -- 2.12.14 has been confirmed
> -		     to need it.  See
> -		     http://comments.gmane.org/gmane.emacs.devel/145074 */
> -		  for (channel = 0; channel < FD_SETSIZE; ++channel)
> -		    if (! NILP (chan_process[channel]))
> -		      {
> -			struct Lisp_Process *p =
> -			  XPROCESS (chan_process[channel]);
> -			if (p && p->gnutls_p && p->gnutls_state
> -			    && ((emacs_gnutls_record_check_pending
> -				 (p->gnutls_state))
> -				> 0))
> -			  {
> -			    nfds++;
> -			    eassert (p->infd == channel);
> -			    FD_SET (p->infd, &tls_available);
> -			    set++;
> -			  }
> -		      }
> -		}
> -	      else
> -		{
> -		  /* Check this specific channel.  */
> -		  if (wait_proc->gnutls_p /* Check for valid process.  */
> -		      && wait_proc->gnutls_state
> -		      /* Do we have pending data?  */
> -		      && ((emacs_gnutls_record_check_pending
> -			   (wait_proc->gnutls_state))
> -			  > 0))
> -		    {
> -		      nfds = 1;
> -		      eassert (0 <= wait_proc->infd);
> -		      /* Set to Available.  */
> -		      FD_SET (wait_proc->infd, &tls_available);
> -		      set++;
> -		    }
> -		}
> -	      if (set)
> -		Available = tls_available;
> +              for (channel = 0; channel < FD_SETSIZE; ++channel)
> +                if (! NILP (chan_process[channel]))
> +                  {
> +                    struct Lisp_Process *p =
> +                      XPROCESS (chan_process[channel]);
> +
> +                    if (just_wait_proc && p != wait_proc)
> +                      continue;
> +
> +                    if (p && p->gnutls_p && p->gnutls_state
> +                        && ((emacs_gnutls_record_check_pending
> +                             (p->gnutls_state))
> +                            > 0))
> +                      {
> +                        nfds++;
> +                        eassert (p->infd == channel);
> +                        FD_SET (p->infd, &Available);
> +                      }
> +                  }

This change is hard to read.  The original code already called
emacs_gnutls_record_check_pending, and there are no calls to pselect in
the hunk, so I'm unsure what exactly we are changing here, in terms of
the details.  The overview in the commit log just gives the general
idea, but it's hard for me to connect that to the actual code changes.
Plus, you lose some of the comments, for some reason, even though the
same code is still present.

Bottom line, I'd appreciate more details for this part.

> diff --git a/src/xgselect.c b/src/xgselect.c
> index fedd3127ef..f68982143e 100644
> --- a/src/xgselect.c
> +++ b/src/xgselect.c
> @@ -143,6 +143,14 @@ xg_select (int fds_lim, fd_set *rfds, fd_set *wfds, fd_set *efds,
>              ++retval;
>          }
>      }
> +  else if (nfds == 0)
> +    {
> +      // pselect() clears the file descriptor sets if no fd is ready (but
> +      // not if an error occurred), so should we to be compatible. (Bug#21337)
> +      if (rfds) FD_ZERO (rfds);
> +      if (wfds) FD_ZERO (wfds);
> +      if (efds) FD_ZERO (efds);
> +    }

This is OK, but please don't use C++ style comments, it's not our
style.  Also, please be sure to test this when waiting on pselect is
interrupted by C-g, we had some problems in that area which had roots
in xg_select.

Thanks.




* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-07-21  9:52                                                                                             ` Eli Zaretskii
@ 2018-08-07  8:38                                                                                               ` Matthias Dahl
  2018-08-07 17:10                                                                                                 ` Paul Eggert
  2018-09-10  8:21                                                                                                 ` Eli Zaretskii
  0 siblings, 2 replies; 151+ messages in thread
From: Matthias Dahl @ 2018-08-07  8:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, larsi, rrandresf, monnier, emacs-devel

Hello Eli...

On 21/07/2018 11:52, Eli Zaretskii wrote:

>> diff --git a/src/gnutls.c b/src/gnutls.c
>> index d22d5d267c..5bf5ee0e5c 100644
>> --- a/src/gnutls.c
>> +++ b/src/gnutls.c
>> @@ -708,16 +708,18 @@ emacs_gnutls_read (struct Lisp_Process *proc, char *buf, ptrdiff_t nbyte)
>>    rtnval = gnutls_record_recv (state, buf, nbyte);
>>    if (rtnval >= 0)
>>      return rtnval;
>> -  else if (rtnval == GNUTLS_E_UNEXPECTED_PACKET_LENGTH)
>> -    /* The peer closed the connection. */
>> -    return 0;
> 
> Why is this hunk being deleted?  I see that you intend to handle it in
> emacs_gnutls_handle_error, but I'm not sure I understand the net
> result.  The current code simply ignores this situation; what will
> happen with the new code, except for the problem being logged?

The rationale behind this is:
On GnuTLS < 3, GNUTLS_E_UNEXPECTED_PACKET_LENGTH was returned for
premature connection termination. On >= 3, it is a real error on its
own and should be treated that way.

I moved the error handling over to emacs_gnutls_handle_error, so it's
at a central point and not scattered all across. Also makes the code
clearer to read, imho.

>>    else if (emacs_gnutls_handle_error (state, rtnval))
>> -    /* non-fatal error */
>> -    return -1;
>> -  else {
>> -    /* a fatal error occurred */
>> -    return 0;
>> -  }
>> +    {
>> +      /* non-fatal error */
>> +      errno = EAGAIN;
>> +      return -1;
>> +    }
>> +  else
>> +    {
>> +      /* a fatal error occurred */
>> +      errno = EPROTO;
>> +      return 0;
>> +    }
> 
> EPROTO is not universally available, AFAIK.  We have Gnulib's errno.h
> that defines a value for it, but that just gets us through
> compilation.  What do you expect this value to cause when it is
> encountered by some code which needs to interpret it?  We need to make
> sure the results will be sensible.

Sorry, I missed that. Right now, no code checks specifically for EPROTO,
so any other fatal value would do as well. But EPROTO is fitting rather
well in what is going on.

Any code that will get an errno = EPROTO will treat it as a fatal error
and that is exactly what it should do. So there is no problem with that.

The only problem that might arise is if one of the platforms that does
not currently define EPROTO starts to define and use it. Emacs would
then need to be recompiled to work properly on that platform.

I don't know if Emacs uses other non-universally available errno values
as well. If it does, then, imho, it really does not matter here. If it
does not, I will change it to something else, if requested to avoid the
situation altogether. What do you think?

>> diff --git a/src/process.c b/src/process.c
>> index 6dba218c90..5702408985 100644
>> --- a/src/process.c
>> +++ b/src/process.c
>> @@ -5397,60 +5397,31 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
>>  #endif	/* !HAVE_GLIB */
>>  
>>  #ifdef HAVE_GNUTLS
>> -          /* GnuTLS buffers data internally.  In lowat mode it leaves
>> -             some data in the TCP buffers so that select works, but
>> -             with custom pull/push functions we need to check if some
>> -             data is available in the buffers manually.  */
>> -          if (nfds == 0)
>> +          /* GnuTLS buffers data internally. select() will only report
>> +             available data for the underlying kernel sockets API, not
>> +             what has been buffered internally. As such, we need to loop
>> +             through all channels and check for available data manually.  */
>> +          if (nfds >= 0)
>>  	    {
>> -	      fd_set tls_available;
>> -	      int set = 0;
>> -
>> -	      FD_ZERO (&tls_available);
>> -	      if (! wait_proc)
>> -		{
>> -		  /* We're not waiting on a specific process, so loop
>> -		     through all the channels and check for data.
>> -		     This is a workaround needed for some versions of
>> -		     the gnutls library -- 2.12.14 has been confirmed
>> -		     to need it.  See
>> -		     http://comments.gmane.org/gmane.emacs.devel/145074 */
>> -		  for (channel = 0; channel < FD_SETSIZE; ++channel)
>> -		    if (! NILP (chan_process[channel]))
>> -		      {
>> -			struct Lisp_Process *p =
>> -			  XPROCESS (chan_process[channel]);
>> -			if (p && p->gnutls_p && p->gnutls_state
>> -			    && ((emacs_gnutls_record_check_pending
>> -				 (p->gnutls_state))
>> -				> 0))
>> -			  {
>> -			    nfds++;
>> -			    eassert (p->infd == channel);
>> -			    FD_SET (p->infd, &tls_available);
>> -			    set++;
>> -			  }
>> -		      }
>> -		}
>> -	      else
>> -		{
>> -		  /* Check this specific channel.  */
>> -		  if (wait_proc->gnutls_p /* Check for valid process.  */
>> -		      && wait_proc->gnutls_state
>> -		      /* Do we have pending data?  */
>> -		      && ((emacs_gnutls_record_check_pending
>> -			   (wait_proc->gnutls_state))
>> -			  > 0))
>> -		    {
>> -		      nfds = 1;
>> -		      eassert (0 <= wait_proc->infd);
>> -		      /* Set to Available.  */
>> -		      FD_SET (wait_proc->infd, &tls_available);
>> -		      set++;
>> -		    }
>> -		}
>> -	      if (set)
>> -		Available = tls_available;
>> +              for (channel = 0; channel < FD_SETSIZE; ++channel)
>> +                if (! NILP (chan_process[channel]))
>> +                  {
>> +                    struct Lisp_Process *p =
>> +                      XPROCESS (chan_process[channel]);
>> +
>> +                    if (just_wait_proc && p != wait_proc)
>> +                      continue;
>> +
>> +                    if (p && p->gnutls_p && p->gnutls_state
>> +                        && ((emacs_gnutls_record_check_pending
>> +                             (p->gnutls_state))
>> +                            > 0))
>> +                      {
>> +                        nfds++;
>> +                        eassert (p->infd == channel);
>> +                        FD_SET (p->infd, &Available);
>> +                      }
>> +                  }
> 
> This change is hard to read.  The original code already called
> emacs_gnutls_record_check_pending, and there are no calls to pselect in
> the hunk, so I'm unsure what exactly we are changing here, in terms of
> the details.  The overview in the commit log just gives the general
> idea, but it's hard for me to connect that to the actual code changes.
> Plus, you lose some of the comments, for some reason, even though the
> same code is still present.
> 
> Bottom line, I'd appreciate more details for this part.

With Emacs 25.2, the required GnuTLS version was set to 2.12.2 (from
the "2.6.6 or later" previous default).

GnuTLS 2.12 changed its default from lowat mode to non-lowat mode (and
removed lowat mode with GnuTLS 3 completely). What this means is, there
is no data left intentionally in the kernel buffers for a select to work
properly. So we usually do not get any of the fds that belong to GnuTLS
set as available when we do a select() call.

The old Emacs code only checked _all_ channels if there was no wait_proc
and if _no_ fds (nfds == 0) were set as available by our previous
select() call. That was wrong and missed a few important cases and could
lead to unnecessary waits or stalls.

Previously, the old code treated a wait_proc as if just_wait_proc was
set as well and only checked wait_proc and ignored the others. That was
a bug as well.

The new code always (if nfds >= 0) checks all (if !just_wait_proc)
channels and sets the available fds in addition to the ones reported by
select().

I hope that clears it up. The new code is basically a lot simpler and it
tries to do the right thing (tm). :-)

>> diff --git a/src/xgselect.c b/src/xgselect.c
>> index fedd3127ef..f68982143e 100644
>> --- a/src/xgselect.c
>> +++ b/src/xgselect.c
>> @@ -143,6 +143,14 @@ xg_select (int fds_lim, fd_set *rfds, fd_set *wfds, fd_set *efds,
>>              ++retval;
>>          }
>>      }
>> +  else if (nfds == 0)
>> +    {
>> +      // pselect() clears the file descriptor sets if no fd is ready (but
>> +      // not if an error occurred), so should we to be compatible. (Bug#21337)
>> +      if (rfds) FD_ZERO (rfds);
>> +      if (wfds) FD_ZERO (wfds);
>> +      if (efds) FD_ZERO (efds);
>> +    }
> 
> This is OK, but please don't use C++ style comments, it's not our
> style.  Also, please be sure to test this when waiting on pselect is
> interrupted by C-g, we had some problems in that area which had roots
> in xg_select.

I will change that with the next version of the patches after I get your
comments back. Regarding the problems you mentioned: Can you point me to
a bug report and more information about that? If anything breaks with
the new behavior, then that broken code is clearly doing something wrong
and needs fixing. Personally, I have been running Emacs with the patches
since I posted them and I have not run into a single issue.

Thanks for taking the time, Eli. It is very much appreciated!

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu





* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-08-07  8:38                                                                                               ` Matthias Dahl
@ 2018-08-07 17:10                                                                                                 ` Paul Eggert
  2018-09-10  8:21                                                                                                 ` Eli Zaretskii
  1 sibling, 0 replies; 151+ messages in thread
From: Paul Eggert @ 2018-08-07 17:10 UTC (permalink / raw)
  To: Matthias Dahl, Eli Zaretskii; +Cc: larsi, rrandresf, monnier, emacs-devel

Matthias Dahl wrote:
> I don't know if Emacs uses other non-universally available errno values
> as well. If it does, then, imho, it really does not matter here.

Emacs uses EOVERFLOW, ENOTSUP, EINPROGRESS, ENOTCONN, EMSGSIZE, ELOOP,
EWOULDBLOCK, even though they are not available on many platforms.

By the way, why does conf_post.h conditionally define EOVERFLOW, why does 
filelock.c have "#ifndef ELOOP", why does process.c have "#ifndef EWOULDBLOCK" 
and "defined EMSGSIZ", and why does sysdep.c have "#ifdef EOVERFLOW"? Do these 
all predate Gnulib errno.h? It sounds like we should remove them.




* Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
  2018-08-07  8:38                                                                                               ` Matthias Dahl
  2018-08-07 17:10                                                                                                 ` Paul Eggert
@ 2018-09-10  8:21                                                                                                 ` Eli Zaretskii
  1 sibling, 0 replies; 151+ messages in thread
From: Eli Zaretskii @ 2018-09-10  8:21 UTC (permalink / raw)
  To: Matthias Dahl; +Cc: eggert, larsi, rrandresf, monnier, emacs-devel

> Cc: larsi@gnus.org, rrandresf@gmail.com, emacs-devel@gnu.org,
>  eggert@cs.ucla.edu, monnier@iro.umontreal.ca
> From: Matthias Dahl <ml_emacs-lists@binary-island.eu>
> Date: Tue, 7 Aug 2018 10:38:02 +0200

Sorry for such a long delay in responding to your previous message.

> >> --- a/src/gnutls.c
> >> +++ b/src/gnutls.c
> >> @@ -708,16 +708,18 @@ emacs_gnutls_read (struct Lisp_Process *proc, char *buf, ptrdiff_t nbyte)
> >>    rtnval = gnutls_record_recv (state, buf, nbyte);
> >>    if (rtnval >= 0)
> >>      return rtnval;
> >> -  else if (rtnval == GNUTLS_E_UNEXPECTED_PACKET_LENGTH)
> >> -    /* The peer closed the connection. */
> >> -    return 0;
> > 
> > Why is this hunk being deleted?  I see that you intend to handle it in
> > emacs_gnutls_handle_error, but I'm not sure I understand the net
> > result.  The current code simply ignores this situation; what will
> > happen with the new code, except for the problem being logged?
> 
> The rationale behind this is:
> On GnuTLS < 3, GNUTLS_E_UNEXPECTED_PACKET_LENGTH was returned for
> premature connection termination. On >= 3, it is a real error on its
> own and should be treated that way.
> 
> I moved the error handling over to emacs_gnutls_handle_error, so it's
> at a central point and not scattered all across. Also makes the code
> clearer to read, imho.

OK.

> >>    else if (emacs_gnutls_handle_error (state, rtnval))
> >> -    /* non-fatal error */
> >> -    return -1;
> >> -  else {
> >> -    /* a fatal error occurred */
> >> -    return 0;
> >> -  }
> >> +    {
> >> +      /* non-fatal error */
> >> +      errno = EAGAIN;
> >> +      return -1;
> >> +    }
> >> +  else
> >> +    {
> >> +      /* a fatal error occurred */
> >> +      errno = EPROTO;
> >> +      return 0;
> >> +    }
> > 
> > EPROTO is not universally available, AFAIK.  We have Gnulib's errno.h
> > that defines a value for it, but that just gets us through
> > compilation.  What do you expect this value to cause when it is
> > encountered by some code which needs to interpret it?  We need to make
> > sure the results will be sensible.
> 
> Sorry, I missed that. Right now, no code checks specifically for EPROTO,
> so any other fatal value would do as well. But EPROTO is fitting rather
> well in what is going on.
> 
> Any code that will get an errno = EPROTO will treat it as a fatal error
> and that is exactly what it should do. So there is no problem with that.

The problem that got me worried is whether some human-readable text
should be emitted when this error code is returned.  If that does
happen, we need to be sure the corresponding text is defined somewhere
in Emacs, because otherwise we will get "Unknown error" or some such,
which is a bug.

> I don't know if Emacs uses other non-universally available errno values
> as well. If it does, then, imho, it really does not matter here. If it
> does not, I will change it to something else, if requested to avoid the
> situation altogether. What do you think?

What alternatives could we use here, specifically?  I don't think I
understand the details of that, can you elaborate?

> GnuTLS 2.12 changed its default from lowat mode to non-lowat mode (and
> removed lowat mode with GnuTLS 3 completely). What this means is, there
> is no data left intentionally in the kernel buffers for a select to work
> properly. So we usually do not get any of the fds that belong to GnuTLS
> set as available when we do a select() call.
> 
> The old Emacs code only checked _all_ channels if there was no wait_proc
> and if _no_ fds (nfds == 0) were set as available by our previous
> select() call. That was wrong and missed a few important cases and could
> lead to unnecessary waits or stalls.
> 
> Previously, the old code treated a wait_proc as if just_wait_proc was
> set as well and only checked wait_proc and ignored the others. That was
> a bug as well.
> 
> The new code always (if nfds >= 0) checks all (if !just_wait_proc)
> channels and sets the available fds in addition to the ones reported by
> select().
> 
> I hope that clears it up. The new code is basically a lot simpler and it
> tries to do the right thing (tm). :-)

OK.  I get the idea, but we will have to see if this change causes any
harm in some subtle situation.

> >> +      // pselect() clears the file descriptor sets if no fd is ready (but
> >> +      // not if an error occurred), so should we to be compatible. (Bug#21337)
> >> +      if (rfds) FD_ZERO (rfds);
> >> +      if (wfds) FD_ZERO (wfds);
> >> +      if (efds) FD_ZERO (efds);
> >> +    }
> > 
> > This is OK, but please don't use C++ style comments, it's not our
> > style.  Also, please be sure to test this when waiting on pselect is
> > interrupted by C-g, we had some problems in that area which had roots
> > in xg_select.
> 
> I will change that with the next version of the patches after I get your
> comments back.

Thanks.

> Regarding the problems you mentioned: Can you point me to a bug
> report and more information about that?

See bug#28630 and commit ea39d47.  Bug#25172 might also be relevant.

> If anything breaks with the new behavior, then that broken code is
> clearly doing something wrong and needs fixing. Personally, I have
> been running Emacs with the patches since I posted them and I have
> not run into a single issue.

Did you try the new code on a TTY frame?  C-g produces a SIGINT there.




end of thread, 2018-09-10  8:21 UTC

Thread overview: 151+ messages
2017-10-24 18:52 wait_reading_process_ouput hangs in certain cases (w/ patches) Matthias Dahl
2017-10-25 14:53 ` Eli Zaretskii
2017-10-26 14:07   ` Matthias Dahl
2017-10-26 16:23     ` Eli Zaretskii
2017-10-26 18:56       ` Matthias Dahl
2017-10-28  8:20         ` Matthias Dahl
2017-10-28  9:28         ` Eli Zaretskii
2017-10-30  9:48           ` Matthias Dahl
2017-11-03  8:52             ` Matthias Dahl
2017-11-03  9:58               ` Eli Zaretskii
2017-11-04 12:11             ` Eli Zaretskii
2017-11-06 14:15               ` Matthias Dahl
2017-11-06 16:34                 ` Eli Zaretskii
2017-11-06 18:24                   ` Paul Eggert
2017-11-06 20:17                     ` Eli Zaretskii
2017-11-07 14:18                   ` Matthias Dahl
2017-11-07 16:40                     ` Eli Zaretskii
2017-11-10 14:45                       ` Matthias Dahl
2017-11-10 15:25                         ` Eli Zaretskii
2017-11-12 21:17                         ` Paul Eggert
2017-11-13  3:27                           ` Eli Zaretskii
2017-11-13  5:27                             ` Paul Eggert
2017-11-13 16:00                               ` Eli Zaretskii
2017-11-13 19:42                                 ` Paul Eggert
2017-11-13 20:12                                   ` Eli Zaretskii
2017-11-13 14:13                           ` Matthias Dahl
2017-11-13 16:10                             ` Eli Zaretskii
2017-11-14 15:05                               ` Matthias Dahl
2017-11-13 19:44                             ` Paul Eggert
2017-11-14 14:58                               ` Matthias Dahl
2017-11-14 15:24                                 ` Paul Eggert
2017-11-14 16:03                                   ` Eli Zaretskii
2017-11-14 16:23                                     ` Eli Zaretskii
2017-11-14 21:54                                       ` Paul Eggert
2017-11-15 14:03                                         ` Matthias Dahl
2017-11-16 15:37                                           ` Eli Zaretskii
2017-11-16 16:46                                           ` Paul Eggert
2017-11-18 14:24                                             ` Matthias Dahl
2017-11-18 14:51                                               ` Eli Zaretskii
2017-11-18 17:14                                                 ` Stefan Monnier
2017-11-19  7:07                                               ` Paul Eggert
2017-11-19 15:42                                                 ` Eli Zaretskii
2017-11-19 17:06                                                   ` Paul Eggert
2017-11-20 15:29                                                 ` Matthias Dahl
2017-11-21 14:44                                                   ` Matthias Dahl
2017-11-21 21:31                                                     ` Clément Pit-Claudel
2017-11-22 14:14                                                       ` Matthias Dahl
2017-11-22  8:55                                                     ` Paul Eggert
2017-11-22 14:33                                                       ` Matthias Dahl
2017-11-24  2:31                                                         ` Stefan Monnier
2017-12-28 17:52                                                         ` Eli Zaretskii
2017-12-04  9:40                                                       ` Matthias Dahl
2018-02-13 14:25                                                         ` Matthias Dahl
2018-02-13 16:56                                                           ` Paul Eggert
2018-02-16 16:01                                                           ` Eli Zaretskii
2018-02-16 16:09                                                             ` Lars Ingebrigtsen
2018-02-16 16:54                                                               ` Lars Ingebrigtsen
2018-02-22 11:45                                                               ` andres.ramirez
2018-02-26 14:39                                                                 ` Matthias Dahl
2018-02-26 15:11                                                                   ` andrés ramírez
2018-02-26 15:17                                                                     ` Lars Ingebrigtsen
2018-02-26 15:29                                                                       ` andrés ramírez
2018-02-26 16:52                                                                         ` Daniel Colascione
2018-02-26 17:19                                                                           ` andrés ramírez
2018-02-26 17:24                                                                             ` Daniel Colascione
2018-02-27  1:53                                                                               ` Re: andrés ramírez
2018-02-27  9:15                                                                       ` wait_reading_process_ouput hangs in certain cases (w/ patches) Matthias Dahl
2018-02-27 12:01                                                                         ` Lars Ingebrigtsen
2018-02-27  9:11                                                                     ` Matthias Dahl
2018-02-27 11:54                                                                       ` andrés ramírez
2018-02-27 15:02                                                                         ` Matthias Dahl
2018-02-27 15:13                                                                           ` Lars Ingebrigtsen
2018-02-27 15:17                                                                             ` Matthias Dahl
2018-02-27 15:19                                                                               ` Lars Ingebrigtsen
2018-02-27 15:14                                                                         ` Matthias Dahl
2018-02-27 15:17                                                                           ` Lars Ingebrigtsen
2018-03-01 10:42                                                                           ` Lars Ingebrigtsen
2018-03-01 14:36                                                                             ` Matthias Dahl
2018-03-01 15:10                                                                               ` andrés ramírez
2018-03-01 16:30                                                                                 ` T.V Raman
2018-03-01 16:46                                                                                   ` andrés ramírez
2018-03-01 18:23                                                                                     ` T.V Raman
2018-03-01 19:13                                                                                     ` Eli Zaretskii
2018-03-02 20:21                                                                                       ` andrés ramírez
2018-03-03  7:55                                                                                         ` Eli Zaretskii
2018-03-05 14:43                                                                             ` Matthias Dahl
2018-03-05 14:44                                                                               ` Lars Ingebrigtsen
2018-03-05 14:54                                                                                 ` Matthias Dahl
2018-03-13  9:54                                                                                 ` Matthias Dahl
2018-03-13 12:35                                                                                   ` Robert Pluim
2018-03-13 13:40                                                                                     ` Robert Pluim
2018-03-13 15:10                                                                                     ` Matthias Dahl
2018-03-13 15:30                                                                                       ` Robert Pluim
2018-03-13 15:36                                                                                       ` Dmitry Gutov
2018-03-13 15:46                                                                                         ` Robert Pluim
2018-03-13 15:56                                                                                           ` Dmitry Gutov
2018-03-13 16:57                                                                                             ` Robert Pluim
2018-03-13 18:03                                                                                               ` Eli Zaretskii
2018-03-13 20:12                                                                                                 ` Robert Pluim
2018-03-14 14:21                                                                                         ` Matthias Dahl
2018-03-13 16:32                                                                                       ` Lars Ingebrigtsen
2018-03-14  9:32                                                                                         ` Matthias Dahl
2018-03-14 14:55                                                                                           ` Lars Ingebrigtsen
2018-03-31 15:44                                                                                       ` Lars Ingebrigtsen
2018-04-01  2:05                                                                                         ` andrés ramírez
2018-06-08  5:11                                                                                         ` Leo Liu
2018-06-08  6:57                                                                                           ` Eli Zaretskii
2018-06-08  9:07                                                                                             ` Leo Liu
2018-03-13 16:12                                                                                   ` Eli Zaretskii
2018-03-14  4:16                                                                                     ` Leo Liu
2018-03-14  9:22                                                                                       ` Robert Pluim
2018-03-15  0:37                                                                                         ` Leo Liu
2018-03-14 15:09                                                                                       ` andrés ramírez
2018-03-14 16:45                                                                                       ` Eli Zaretskii
2018-03-15  1:03                                                                                         ` Leo Liu
2018-03-15  7:55                                                                                           ` Robert Pluim
2018-03-14 22:54                                                                                       ` Stefan Monnier
2018-03-15  1:06                                                                                         ` Leo Liu
2018-03-14  9:56                                                                                     ` Matthias Dahl
2018-03-14 12:24                                                                                       ` Stefan Monnier
2018-03-14 14:34                                                                                         ` Matthias Dahl
2018-03-14 22:52                                                                                           ` Stefan Monnier
2018-03-15 15:17                                                                                             ` Matthias Dahl
2018-03-14 16:43                                                                                       ` Eli Zaretskii
2018-03-15 14:59                                                                                         ` Matthias Dahl
2018-06-26 13:36                                                                                           ` Matthias Dahl
2018-06-26 14:09                                                                                             ` andrés ramírez
2018-06-27 13:10                                                                                               ` Matthias Dahl
2018-06-27 15:18                                                                                                 ` Eli Zaretskii
2018-06-28  8:01                                                                                                   ` Matthias Dahl
2018-06-28 13:04                                                                                                     ` Eli Zaretskii
2018-06-28 13:25                                                                                                       ` Matthias Dahl
2018-06-28 16:33                                                                                                         ` Leo Liu
2018-06-28 18:31                                                                                                           ` T.V Raman
2018-07-02 13:27                                                                                                             ` Matthias Dahl
2018-07-02 14:35                                                                                                               ` T.V Raman
2018-07-03 13:27                                                                                                                 ` Matthias Dahl
2018-07-03 13:52                                                                                                                   ` T. V. Raman
2018-07-03 15:03                                                                                                                     ` Stefan Monnier
2018-07-02 13:24                                                                                                           ` Matthias Dahl
2018-07-14  2:27                                                                                                             ` Leo Liu
2018-07-03 13:34                                                                                                       ` Matthias Dahl
2018-07-03 18:57                                                                                                         ` Eli Zaretskii
2018-07-04  7:35                                                                                                           ` Matthias Dahl
2018-07-04 15:11                                                                                                             ` Eli Zaretskii
2018-07-21  9:52                                                                                             ` Eli Zaretskii
2018-08-07  8:38                                                                                               ` Matthias Dahl
2018-08-07 17:10                                                                                                 ` Paul Eggert
2018-09-10  8:21                                                                                                 ` Eli Zaretskii
2017-11-07 17:23                     ` Stefan Monnier
2017-11-10 14:53                       ` Matthias Dahl

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
