unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#66186: "make lisp/eshell/esh-proc-tests" fails intermittently since 7e50861ca7ed3f620fe62ac6572f6e88b3600ece
@ 2023-09-24 21:35 Jens Schmidt via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-09-24 23:02 ` Jim Porter
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Schmidt via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-09-24 21:35 UTC
  To: 66186; +Cc: jim porter

[-- Attachment #1: Type: text/plain, Size: 22612 bytes --]

X-Debbugs-CC: Jim Porter <jporterbugs@gmail.com>

First reported here:

https://yhetil.org/emacs-devel/ea8d365a-f014-d4d7-14d0-60ccdfe7974e@vodafonemail.de/

The rest of this mail is structured by Outline mode conventions.

I managed to get a GDB backtrace of a SIGPIPE; please see the last section.

* Original Text of Above Mail

Not sure whether anybody has seen or reported this already:  Approx. 1 in 5
executions of "make lisp/eshell/esh-proc-tests" fails for me like this:

------------------------- snip -------------------------
make[1]: Entering directory '/home/jschmidt/work/emacs-master/test'
  GEN      lisp/eshell/esh-proc-tests.log
Running 23 tests (2023-09-24 20:32:11+0200, selector `(not (tag :unstable))')
Loading em-alias...
Loading em-banner...
Loading em-basic...
Loading em-cmpl...
Loading em-extpipe...
Loading em-glob...
Loading em-hist...
Loading em-ls...
Loading em-pred...
Loading em-prompt...
Loading em-script...
Loading em-term...
Loading em-unix...
   passed   1/23  esh-proc-test/exit-status/failure (0.117111 sec)
   passed   2/23  esh-proc-test/exit-status/success (0.105469 sec)
   passed   3/23  esh-proc-test/exit-status/with-stderr-pipe (0.105925 sec)
   passed   4/23  esh-proc-test/kill-pipeline (0.108324 sec)
   passed   5/23  esh-proc-test/kill-pipeline-head (0.108148 sec)
   passed   6/23  esh-proc-test/kill-process/background-prompt (0.005315 sec)
[sleep]+ Done (/usr/bin/sleep 100)
   passed   7/23  esh-proc-test/kill-process/foreground-only (0.207743 sec)
   passed   8/23  esh-proc-test/kill-process/redirect-message (0.004864 sec)
Tramp: Sending command `exec sh -i'
Tramp: Found remote shell prompt on `sappc2'
Tramp: Sending command `exec sh -i'
Tramp: Found remote shell prompt on `sappc2'
   passed   9/23  esh-proc-test/output/remote-redirect (0.157058 sec)
   passed  10/23  esh-proc-test/output/stderr-to-buffer (0.106075 sec)
   passed  11/23  esh-proc-test/output/stdout-and-stderr-to-buffer (0.105911 sec)
   passed  12/23  esh-proc-test/output/stdout-to-buffer (0.105907 sec)
   passed  13/23  esh-proc-test/output/to-screen (0.105792 sec)
   passed  14/23  esh-proc-test/pipeline-connection-type/first (0.055700 sec)
   passed  15/23  esh-proc-test/pipeline-connection-type/last (0.056159 sec)
make[1]: *** [Makefile:181: lisp/eshell/esh-proc-tests.log] Broken pipe
make[1]: Leaving directory '/home/jschmidt/work/emacs-master/test'
make: *** [Makefile:247: lisp/eshell/esh-proc-tests] Error 2
------------------------- snip -------------------------

I bisected with

------------------------- snip -------------------------
#!/bin/bash

make FAST=true -j8 bootstrap || exit 1
for (( i = 0; i < 30; i++ )); do
  ( cd test && make lisp/eshell/esh-proc-tests ) || exit 1
done

exit 0
------------------------- snip -------------------------

to

------------------------- snip -------------------------
7e50861ca7ed3f620fe62ac6572f6e88b3600ece is the first bad commit
commit 7e50861ca7ed3f620fe62ac6572f6e88b3600ece
Author: Jim Porter <jporterbugs@gmail.com>
Date:   Thu Sep 14 17:51:16 2023 -0700

    ; Simplify how to use 'eshell-debug-command'

    Now, 'eshell-debug-command' works more like 'format-message', which is
    how we usually use it.

    * lisp/eshell/esh-util.el (eshell-always-debug-command): New function.
    (eshell-debug-command): Simplify.  Update callers.

 lisp/eshell/esh-arg.el  |  6 +++---
 lisp/eshell/esh-cmd.el  | 10 ++++++----
 lisp/eshell/esh-proc.el | 41 +++++++++++++++++------------------------
 lisp/eshell/esh-util.el | 26 +++++++++++++++++---------
 4 files changed, 43 insertions(+), 40 deletions(-)
bisect run success
------------------------- snip -------------------------
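
A note on why the script repeats the test 30 times: since the failure
shows up in only about 1 in 5 runs, a single passing run says little about
a commit.  Below is a minimal sketch of the arithmetic, assuming that the
runs are independent and that the 1-in-5 rate holds at every bad commit
(both are assumptions, not measurements).  The helper name
`run_repeatedly` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper generalizing the loop in the bisect script above:
# run a command up to N times and stop at the first failure.
run_repeatedly() {
  n=$1; shift
  i=0
  while [ "$i" -lt "$n" ]; do
    "$@" || return 1
    i=$((i + 1))
  done
  return 0
}

# Chance that a bad commit passes all 30 runs anyway, assuming an
# independent failure probability of 0.2 per run: 0.8^30.
miss_probability=$(LC_ALL=C awk 'BEGIN { printf "%.4f", 0.8 ^ 30 }')
echo "P(30 passes on a bad commit) = $miss_probability"
```

Under those assumptions the chance of misclassifying a bad commit as good
is roughly 0.1%, which is why 30 repetitions per bisect step are enough
here.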

Please let me know whether I should open a separate bug for this.

* Modified Test Case

On master, commit 947409d408ed763a9fc35f9f7df97fec28a16837, I took

  lisp/eshell/esh-proc-tests.el

and stripped out everything but the following tests

   passed   1/12  esh-proc-test/pipeline-connection-type/first (0.067548 sec)
   passed   2/12  esh-proc-test/pipeline-connection-type/first0 (0.057414 sec)
   passed   3/12  esh-proc-test/pipeline-connection-type/first1 (0.057129 sec)
   passed   4/12  esh-proc-test/pipeline-connection-type/first2 (0.057843 sec)
   passed   5/12  esh-proc-test/pipeline-connection-type/last (0.055670 sec)
   passed   6/12  esh-proc-test/pipeline-connection-type/last0 (0.055894 sec)
   passed   7/12  esh-proc-test/pipeline-connection-type/last1 (0.056194 sec)
   passed   8/12  esh-proc-test/pipeline-connection-type/last2 (0.056234 sec)
   passed   9/12  esh-proc-test/pipeline-connection-type/middle (0.058843 sec)
   passed  10/12  esh-proc-test/pipeline-connection-type/middle0 (0.077003 sec)
   passed  11/12  esh-proc-test/pipeline-connection-type/middle1 (0.057962 sec)
   passed  12/12  esh-proc-test/pipeline-connection-type/middle2 (0.058520 sec)

where the <test>N are just copies of <test>.  The modified test file and
a good test log are attached for reference.

* Broken Pipe (Rare)

[test]$ make lisp/eshell/esh-proc-tests
make[1]: Entering directory '/home/jschmidt/work/emacs-master/test'
  GEN      lisp/eshell/esh-proc-tests.log
Running 12 tests (2023-09-24 23:27:38+0200, selector `(not (tag :unstable))')
Loading em-alias...
Loading em-banner...
Loading em-basic...
Loading em-cmpl...
Loading em-extpipe...
Loading em-glob...
Loading em-hist...
Loading em-ls...
Loading em-pred...
Loading em-prompt...
Loading em-script...
Loading em-term...
Loading em-unix...
   passed   1/12  esh-proc-test/pipeline-connection-type/first (0.067822 sec)
   passed   2/12  esh-proc-test/pipeline-connection-type/first0 (0.057025 sec)
   passed   3/12  esh-proc-test/pipeline-connection-type/first1 (0.057386 sec)
   passed   4/12  esh-proc-test/pipeline-connection-type/first2 (0.057817 sec)
   passed   5/12  esh-proc-test/pipeline-connection-type/last (0.055977 sec)
   passed   6/12  esh-proc-test/pipeline-connection-type/last0 (0.055745 sec)
make[1]: *** [Makefile:181: lisp/eshell/esh-proc-tests.log] Broken pipe
make[1]: Leaving directory '/home/jschmidt/work/emacs-master/test'
make: *** [Makefile:247: lisp/eshell/esh-proc-tests] Error 2

* Test Aborted with Elisp Stacktrace (Even Rarer)

This run was executed under GDB, but the failure also happens without GDB.

[test]$ HOME=/nonexistent LANG=C EMACS_TEST_DIRECTORY=/home/jschmidt/work/emacs-master/test gdb -q -batch -ex run -ex backtrace --args "../src/emacs" --module-assertions --no-init-file --no-site-file --no-site-lisp -L ":."    -l ert  -l lisp/eshell/esh-proc-tests.el   --batch --eval '(ert-run-tests-batch-and-exit (quote (not (tag :unstable))))' 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffee036700 (LWP 7917)]
[Detaching after vfork from child process 7918]
[Detaching after vfork from child process 7919]
[Detaching after vfork from child process 7920]
[Detaching after vfork from child process 7921]
Running 12 tests (2023-09-24 23:15:37+0200, selector `(not (tag :unstable))')
Loading em-alias...
Loading em-banner...
Loading em-basic...
Loading em-cmpl...
Loading em-extpipe...
Loading em-glob...
Loading em-hist...
Loading em-ls...
Loading em-pred...
Loading em-prompt...
Loading em-script...
Loading em-term...
Loading em-unix...
[Detaching after vfork from child process 7922]
[Detaching after vfork from child process 7923]
   passed   1/12  esh-proc-test/pipeline-connection-type/first (0.076640 sec)
[Detaching after vfork from child process 7924]
[Detaching after vfork from child process 7925]
   passed   2/12  esh-proc-test/pipeline-connection-type/first0 (0.058361 sec)
[Detaching after vfork from child process 7926]
[Detaching after vfork from child process 7927]
   passed   3/12  esh-proc-test/pipeline-connection-type/first1 (0.058868 sec)
[Detaching after vfork from child process 7928]
[Detaching after vfork from child process 7929]
   passed   4/12  esh-proc-test/pipeline-connection-type/first2 (0.059533 sec)
[Detaching after vfork from child process 7930]
   passed   5/12  esh-proc-test/pipeline-connection-type/last (0.056367 sec)
[Detaching after vfork from child process 7931]
   passed   6/12  esh-proc-test/pipeline-connection-type/last0 (0.056656 sec)
[Detaching after vfork from child process 7932]
   passed   7/12  esh-proc-test/pipeline-connection-type/last1 (0.056970 sec)
[Detaching after vfork from child process 7933]
   passed   8/12  esh-proc-test/pipeline-connection-type/last2 (0.056878 sec)
[Detaching after vfork from child process 7934]
[Detaching after vfork from child process 7935]
   passed   9/12  esh-proc-test/pipeline-connection-type/middle (0.060267 sec)
[Detaching after vfork from child process 7936]
[Detaching after vfork from child process 7937]
   passed  10/12  esh-proc-test/pipeline-connection-type/middle0 (0.081130 sec)
[Detaching after vfork from child process 7938]
[Detaching after vfork from child process 7939]
Test esh-proc-test/pipeline-connection-type/middle1 aborted with non-local exit
[Detaching after vfork from child process 7940]
[Detaching after vfork from child process 7941]
[Detaching after vfork from child process 7942]
[Detaching after vfork from child process 7953]
[Detaching after vfork from child process 7954]
  ABORTED  11/12  esh-proc-test/pipeline-connection-type/middle1 (0.008848 sec) at lisp/eshell/esh-proc-tests.el:116

Aborted: Ran 12 tests, 10 results as expected, 0 unexpected (2023-09-24 23:15:38+0200, 0.653765 sec)

Error running tests
  backtrace()
  #f(compiled-function () #<bytecode -0x18983e1ec48120ca>)()
  ert-run-tests-batch-and-exit((not (tag :unstable)))
  command-line-1(("-L" ":." "-l" "ert" "-l" "lisp/eshell/esh-proc-tests.el" "--eval" "(ert-run-tests-batch-and-exit (quote (not (tag :unstable))))"))
  command-line()
  normal-top-level()
[Thread 0x7ffff0543400 (LWP 7913) exited]
[Inferior 1 (process 7913) exited with code 02]

* Broken Pipe with gdb Stack Trace

[test]$ HOME=/nonexistent LANG=C EMACS_TEST_DIRECTORY=/home/jschmidt/work/emacs-master/test gdb -q -batch -ex run -ex backtrace --args "../src/emacs" --module-assertions --no-init-file --no-site-file --no-site-lisp -L ":."    -l ert  -l lisp/eshell/esh-proc-tests.el   --batch --eval '(ert-run-tests-batch-and-exit (quote (not (tag :unstable))))'
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffee036700 (LWP 7969)]
[Detaching after vfork from child process 7970]
[Detaching after vfork from child process 7971]
[Detaching after vfork from child process 7972]
[Detaching after vfork from child process 7973]
Running 12 tests (2023-09-24 23:15:39+0200, selector `(not (tag :unstable))')
Loading em-alias...
Loading em-banner...
Loading em-basic...
Loading em-cmpl...
Loading em-extpipe...
Loading em-glob...
Loading em-hist...
Loading em-ls...
Loading em-pred...
Loading em-prompt...
Loading em-script...
Loading em-term...
Loading em-unix...
[Detaching after vfork from child process 7974]
[Detaching after vfork from child process 7975]
   passed   1/12  esh-proc-test/pipeline-connection-type/first (0.076930 sec)
[Detaching after vfork from child process 7976]
[Detaching after vfork from child process 7977]
   passed   2/12  esh-proc-test/pipeline-connection-type/first0 (0.057770 sec)
[Detaching after vfork from child process 7978]
[Detaching after vfork from child process 7979]
   passed   3/12  esh-proc-test/pipeline-connection-type/first1 (0.058679 sec)
[Detaching after vfork from child process 7980]
[Detaching after vfork from child process 7981]
   passed   4/12  esh-proc-test/pipeline-connection-type/first2 (0.058402 sec)
[Detaching after vfork from child process 7982]
   passed   5/12  esh-proc-test/pipeline-connection-type/last (0.056096 sec)
[Detaching after vfork from child process 7983]
   passed   6/12  esh-proc-test/pipeline-connection-type/last0 (0.056120 sec)
[Detaching after vfork from child process 7984]
   passed   7/12  esh-proc-test/pipeline-connection-type/last1 (0.056147 sec)
[Detaching after vfork from child process 7985]
   passed   8/12  esh-proc-test/pipeline-connection-type/last2 (0.056252 sec)
[Detaching after vfork from child process 7986]
[Detaching after vfork from child process 7987]

Thread 1 "emacs" received signal SIGPIPE, Broken pipe.
0x00007ffff57bffef in write () from /lib/x86_64-linux-gnu/libpthread.so.0
#0  0x00007ffff57bffef in write () at /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00005555556f2f08 in emacs_full_write (fd=19, buf=0x5555565918b8 "hi", nbyte=2, interruptible=-1) at sysdep.c:2812
#2  0x00005555557c1ef8 in send_process (proc=<optimized out>, buf=<optimized out>, len=<optimized out>, object=<optimized out>) at process.c:6670
#3  0x00005555557c2318 in Fprocess_send_string (process=<optimized out>, string=0x5555565862d4) at lisp.h:779
#4  0x00005555557a7fb6 in exec_byte_code (fun=<optimized out>, args_template=<optimized out>, nargs=<optimized out>, args=<optimized out>) at lisp.h:779
#5  0x0000555555760883 in Ffuncall (nargs=3, args=0x7fffee067160) at eval.c:3008
#6  0x0000555555760d69 in Fapply (nargs=4, args=0x7fffee067160) at eval.c:2632
#7  0x00005555557a7fb6 in exec_byte_code (fun=<optimized out>, args_template=<optimized out>, nargs=<optimized out>, args=<optimized out>) at lisp.h:779
#8  0x0000555555765ec7 in apply_lambda (fun=0x55555631d4dd, args=<optimized out>, count=...) at eval.c:3116
#9  0x0000555555764008 in eval_sub (form=<optimized out>) at eval.c:2601
#10 0x0000555555766e80 in Feval (form=0x7fffed39a1b3, lexical=<optimized out>) at eval.c:2375
#11 0x00005555557a7fb6 in exec_byte_code (fun=<optimized out>, args_template=<optimized out>, nargs=<optimized out>, args=<optimized out>) at lisp.h:779
#12 0x0000555555760883 in Ffuncall (nargs=1, args=0x7fffffffb5d0) at eval.c:3008
#13 0x0000555555764503 in eval_sub (form=<optimized out>) at lisp.h:779
#14 0x00005555557662e1 in Fprogn (body=<optimized out>) at eval.c:436
#15 Flet (args=0x7fffed389d23) at eval.c:1038
#16 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#17 0x0000555555766e80 in Feval (form=0x7fffed389d33, lexical=<optimized out>) at eval.c:2375
#18 0x00005555557a7fb6 in exec_byte_code (fun=<optimized out>, args_template=<optimized out>, nargs=<optimized out>, args=<optimized out>) at lisp.h:779
#19 0x0000555555760883 in Ffuncall (nargs=1, args=0x7fffffffb8b0) at eval.c:3008
#20 0x0000555555764503 in eval_sub (form=<optimized out>) at lisp.h:779
#21 0x00005555557662e1 in Fprogn (body=<optimized out>) at eval.c:436
#22 Flet (args=0x7fffed39a463) at eval.c:1038
#23 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#24 0x0000555555766e80 in Feval (form=0x7fffed39a473, lexical=<optimized out>) at eval.c:2375
#25 0x00005555557a7fb6 in exec_byte_code (fun=<optimized out>, args_template=<optimized out>, nargs=<optimized out>, args=<optimized out>) at lisp.h:779
#26 0x0000555555765ec7 in apply_lambda (fun=0x55555631aead, args=<optimized out>, count=...) at eval.c:3116
#27 0x0000555555764008 in eval_sub (form=<optimized out>) at eval.c:2601
#28 0x0000555555766c81 in internal_lisp_condition_case (var=0x1db700, bodyform=0x7fffed3999e3, handlers=<optimized out>) at eval.c:1440
#29 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#30 0x0000555555766e80 in Feval (form=0x7fffed398ba3, lexical=<optimized out>) at eval.c:2375
#31 0x00005555557a7fb6 in exec_byte_code (fun=<optimized out>, args_template=<optimized out>, nargs=<optimized out>, args=<optimized out>) at lisp.h:779
#32 0x0000555555760883 in Ffuncall (nargs=1, args=0x7fffffffbeb0) at eval.c:3008
#33 0x0000555555764503 in eval_sub (form=<optimized out>) at lisp.h:779
#34 0x00005555557662e1 in Fprogn (body=<optimized out>) at eval.c:436
#35 Flet (args=0x7fffed398fb3) at eval.c:1038
#36 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#37 0x0000555555766e80 in Feval (form=0x7fffed398fc3, lexical=<optimized out>) at eval.c:2375
#38 0x00005555557a7fb6 in exec_byte_code (fun=<optimized out>, args_template=<optimized out>, nargs=<optimized out>, args=<optimized out>) at lisp.h:779
#39 0x0000555555760883 in Ffuncall (nargs=1, args=0x7fffffffc180) at eval.c:3008
#40 0x0000555555764503 in eval_sub (form=<optimized out>) at lisp.h:779
#41 0x00005555557662e1 in Fprogn (body=<optimized out>) at eval.c:436
#42 Flet (args=0x7fffed398a73) at eval.c:1038
#43 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#44 0x0000555555766e80 in Feval (form=0x7fffed398a83, lexical=<optimized out>) at eval.c:2375
#45 0x00005555557a7fb6 in exec_byte_code (fun=<optimized out>, args_template=<optimized out>, nargs=<optimized out>, args=<optimized out>) at lisp.h:779
#46 0x0000555555760883 in Ffuncall (nargs=1, args=0x7fffffffc450) at eval.c:3008
#47 0x0000555555764503 in eval_sub (form=<optimized out>) at lisp.h:779
#48 0x00005555557662e1 in Fprogn (body=<optimized out>) at eval.c:436
#49 Flet (args=0x7fffed3985b3) at eval.c:1038
#50 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#51 0x0000555555766e80 in Feval (form=0x7fffed3985c3, lexical=<optimized out>) at eval.c:2375
#52 0x00005555557a7fb6 in exec_byte_code (fun=<optimized out>, args_template=<optimized out>, nargs=<optimized out>, args=<optimized out>) at lisp.h:779
#53 0x0000555555760883 in Ffuncall (nargs=1, args=0x7fffffffc730) at eval.c:3008
#54 0x0000555555764503 in eval_sub (form=<optimized out>) at lisp.h:779
#55 0x00005555557662e1 in Fprogn (body=<optimized out>) at eval.c:436
#56 Flet (args=0x7fffed3a7e73) at eval.c:1038
#57 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#58 0x0000555555766e80 in Feval (form=0x7fffed3a7e83, lexical=<optimized out>) at eval.c:2375
#59 0x00005555557a7fb6 in exec_byte_code (fun=<optimized out>, args_template=<optimized out>, nargs=<optimized out>, args=<optimized out>) at lisp.h:779
#60 0x0000555555765ec7 in apply_lambda (fun=0x555556361115, args=<optimized out>, count=...) at eval.c:3116
#61 0x0000555555764008 in eval_sub (form=<optimized out>) at eval.c:2601
#62 0x00005555557662e1 in Fprogn (body=<optimized out>) at eval.c:436
#63 Flet (args=0x7fffed55f013) at eval.c:1038
#64 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#65 0x0000555555764a21 in Fprogn (body=<optimized out>) at eval.c:436
#66 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#67 0x00005555557668b7 in Funwind_protect (args=0x7fffed55f813) at lisp.h:779
#68 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#69 0x0000555555766709 in Fprogn (body=<optimized out>) at eval.c:436
#70 FletX (args=0x7fffed55f963) at eval.c:970
#71 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#72 0x0000555555765901 in Fprogn (body=<optimized out>) at eval.c:436
#73 funcall_lambda (fun=0x7fffed55fa33, nargs=1, arg_vector=0x7fffffffcfa0) at eval.c:3246
#74 0x0000555555765ec7 in apply_lambda (fun=0x7fffed55fa43, args=<optimized out>, count=...) at eval.c:3116
#75 0x0000555555764008 in eval_sub (form=<optimized out>) at eval.c:2601
#76 0x00005555557644c4 in eval_sub (form=<optimized out>) at eval.c:2478
#77 0x00005555557662e1 in Fprogn (body=<optimized out>) at eval.c:436
#78 Flet (args=0x7fffed550f73) at eval.c:1038
#79 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#80 0x0000555555766c81 in internal_lisp_condition_case (var=0x1db700, bodyform=0x7fffed550f83, handlers=<optimized out>) at eval.c:1440
#81 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#82 0x000055555576653c in FletX (args=0x7fffed5522f3) at lisp.h:779
#83 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#84 0x00005555557662e1 in Fprogn (body=<optimized out>) at eval.c:436
#85 Flet (args=0x7fffed552323) at eval.c:1038
#86 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#87 0x0000555555765901 in Fprogn (body=<optimized out>) at eval.c:436
#88 funcall_lambda (fun=0x7fffed5523f3, nargs=2, arg_vector=0x7fffffffd700) at eval.c:3246
#89 0x0000555555765ec7 in apply_lambda (fun=0x7fffed552403, args=<optimized out>, count=...) at eval.c:3116
#90 0x0000555555764008 in eval_sub (form=<optimized out>) at eval.c:2601
#91 0x00005555557662e1 in Fprogn (body=<optimized out>) at eval.c:436
#92 Flet (args=0x7fffed54f113) at eval.c:1038
#93 0x00005555557646b0 in eval_sub (form=<optimized out>) at lisp.h:779
#94 0x0000555555765901 in Fprogn (body=<optimized out>) at eval.c:436
#95 funcall_lambda (fun=0x7fffed541f43, nargs=0, arg_vector=0x7fffee066270) at eval.c:3246
#96 0x00005555557a7d4e in exec_byte_code (fun=<optimized out>, args_template=<optimized out>, nargs=<optimized out>, args=<optimized out>) at bytecode.c:817
#97 0x0000555555765ec7 in apply_lambda (fun=0x5555560f81fd, args=<optimized out>, count=...) at eval.c:3116
#98 0x0000555555764008 in eval_sub (form=<optimized out>) at eval.c:2601
#99 0x0000555555766e80 in Feval (form=0x7fffed52ef63, lexical=<optimized out>) at eval.c:2375
#100 0x00007fffef012849 in F636f6d6d616e642d6c696e652d31_command_line_1_0 () at /home/jschmidt/work/emacs-master/src/../native-lisp/30.0.50-88254aaa/preloaded/startup-bbc6ea72-b64c9391.eln
#101 0x0000555555760883 in Ffuncall (nargs=2, args=0x7fffffffdff0) at eval.c:3008
#102 0x00007fffef00a268 in F636f6d6d616e642d6c696e65_command_line_0 () at /home/jschmidt/work/emacs-master/src/../native-lisp/30.0.50-88254aaa/preloaded/startup-bbc6ea72-b64c9391.eln
#103 0x0000555555760883 in Ffuncall (nargs=1, args=0x7fffffffe0c8) at eval.c:3008
#104 0x00007fffef005bdf in F6e6f726d616c2d746f702d6c6576656c_normal_top_level_0 () at /home/jschmidt/work/emacs-master/src/../native-lisp/30.0.50-88254aaa/preloaded/startup-bbc6ea72-b64c9391.eln
#105 0x0000555555764969 in eval_sub (form=<optimized out>) at lisp.h:779
#106 0x0000555555766e80 in Feval (form=0x7fffefc999bb, lexical=<optimized out>) at eval.c:2375
#107 0x000055555575ef67 in internal_condition_case (bfun=bfun@entry=0x5555556d1c30 <top_level_2>, handlers=handlers@entry=0x90, hfun=hfun@entry=0x5555556d9a50 <cmd_error>) at eval.c:1486
#108 0x00005555556d25c6 in top_level_1 (ignore=ignore@entry=0x0) at keyboard.c:1174
#109 0x000055555575eec1 in internal_catch (tag=tag@entry=0x107d0, func=func@entry=0x5555556d25a0 <top_level_1>, arg=arg@entry=0x0) at eval.c:1209
#110 0x00005555556d1ba8 in command_loop () at lisp.h:1173
#111 0x00005555556d95e3 in recursive_edit_1 () at keyboard.c:744
#112 0x00005555556d9980 in Frecursive_edit () at keyboard.c:827
#113 0x00005555555a9de6 in main (argc=<optimized out>, argv=<optimized out>) at emacs.c:2625

[-- Attachment #2: esh-proc-tests.el --]
[-- Type: text/x-emacs-lisp, Size: 6784 bytes --]

;;; esh-proc-tests.el --- esh-proc test suite  -*- lexical-binding:t -*-

;; Copyright (C) 2022-2023 Free Software Foundation, Inc.

;; This file is part of GNU Emacs.

;; GNU Emacs is free software: you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.

;; GNU Emacs is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.

;;; Code:

(require 'tramp)
(require 'ert)
(require 'esh-mode)
(require 'eshell)

(require 'eshell-tests-helpers
         (expand-file-name "eshell-tests-helpers"
                           (file-name-directory (or load-file-name
                                                    default-directory))))

(defvar esh-proc-test--output-cmd
  (concat "sh -c '"
          "echo stdout; "
          "echo stderr >&2"
          "'")
  "A shell command that prints to both stdout and stderr.")

(defvar esh-proc-test--detect-pty-cmd
  (concat "sh -c '"
          "if [ -t 0 ]; then echo stdin; fi; "
          "if [ -t 1 ]; then echo stdout; fi; "
          "if [ -t 2 ]; then echo stderr; fi"
          "'")
  "A shell command that prints the standard streams connected as TTYs.")

;;; Tests:

\f
;; Output and redirection


\f
;; Pipelines

(ert-deftest esh-proc-test/pipeline-connection-type/first ()
  "Test that only stdin is a PTY when a command starts a pipeline."
  (skip-unless (and (executable-find "sh")
                    (executable-find "cat")))
  (eshell-command-result-equal
   (concat esh-proc-test--detect-pty-cmd " | cat")
   (unless (eq system-type 'windows-nt)
     "stdin\n")))

(ert-deftest esh-proc-test/pipeline-connection-type/first0 ()
  "Test that only stdin is a PTY when a command starts a pipeline."
  (skip-unless (and (executable-find "sh")
                    (executable-find "cat")))
  (eshell-command-result-equal
   (concat esh-proc-test--detect-pty-cmd " | cat")
   (unless (eq system-type 'windows-nt)
     "stdin\n")))

(ert-deftest esh-proc-test/pipeline-connection-type/first1 ()
  "Test that only stdin is a PTY when a command starts a pipeline."
  (skip-unless (and (executable-find "sh")
                    (executable-find "cat")))
  (eshell-command-result-equal
   (concat esh-proc-test--detect-pty-cmd " | cat")
   (unless (eq system-type 'windows-nt)
     "stdin\n")))

(ert-deftest esh-proc-test/pipeline-connection-type/first2 ()
  "Test that only stdin is a PTY when a command starts a pipeline."
  (skip-unless (and (executable-find "sh")
                    (executable-find "cat")))
  (eshell-command-result-equal
   (concat esh-proc-test--detect-pty-cmd " | cat")
   (unless (eq system-type 'windows-nt)
     "stdin\n")))

(ert-deftest esh-proc-test/pipeline-connection-type/middle ()
  "Test that all streams are pipes when a command is in the middle of a
pipeline."
  (skip-unless (and (executable-find "sh")
                    (executable-find "cat")))
  ;; An `eshell-pipe-broken' signal might occur internally; let Eshell
  ;; handle it!
  (let ((debug-on-error nil))
    (eshell-command-result-equal
     (concat "echo hi | " esh-proc-test--detect-pty-cmd " | cat")
     nil)))

(ert-deftest esh-proc-test/pipeline-connection-type/middle0 ()
  "Test that all streams are pipes when a command is in the middle of a
pipeline."
  (skip-unless (and (executable-find "sh")
                    (executable-find "cat")))
  ;; An `eshell-pipe-broken' signal might occur internally; let Eshell
  ;; handle it!
  (let ((debug-on-error nil))
    (eshell-command-result-equal
     (concat "echo hi | " esh-proc-test--detect-pty-cmd " | cat")
     nil)))

(ert-deftest esh-proc-test/pipeline-connection-type/middle1 ()
  "Test that all streams are pipes when a command is in the middle of a
pipeline."
  (skip-unless (and (executable-find "sh")
                    (executable-find "cat")))
  ;; An `eshell-pipe-broken' signal might occur internally; let Eshell
  ;; handle it!
  (let ((debug-on-error nil))
    (eshell-command-result-equal
     (concat "echo hi | " esh-proc-test--detect-pty-cmd " | cat")
     nil)))

(ert-deftest esh-proc-test/pipeline-connection-type/middle2 ()
  "Test that all streams are pipes when a command is in the middle of a
pipeline."
  (skip-unless (and (executable-find "sh")
                    (executable-find "cat")))
  ;; An `eshell-pipe-broken' signal might occur internally; let Eshell
  ;; handle it!
  (let ((debug-on-error nil))
    (eshell-command-result-equal
     (concat "echo hi | " esh-proc-test--detect-pty-cmd " | cat")
     nil)))

(ert-deftest esh-proc-test/pipeline-connection-type/last ()
  "Test that only output streams are PTYs when a command ends a pipeline."
  (skip-unless (executable-find "sh"))
  ;; An `eshell-pipe-broken' signal might occur internally; let Eshell
  ;; handle it!
  (let ((debug-on-error nil))
    (eshell-command-result-equal
     (concat "echo hi | " esh-proc-test--detect-pty-cmd)
     (unless (eq system-type 'windows-nt)
       "stdout\nstderr\n"))))

(ert-deftest esh-proc-test/pipeline-connection-type/last0 ()
  "Test that only output streams are PTYs when a command ends a pipeline."
  (skip-unless (executable-find "sh"))
  ;; An `eshell-pipe-broken' signal might occur internally; let Eshell
  ;; handle it!
  (let ((debug-on-error nil))
    (eshell-command-result-equal
     (concat "echo hi | " esh-proc-test--detect-pty-cmd)
     (unless (eq system-type 'windows-nt)
       "stdout\nstderr\n"))))

(ert-deftest esh-proc-test/pipeline-connection-type/last1 ()
  "Test that only output streams are PTYs when a command ends a pipeline."
  (skip-unless (executable-find "sh"))
  ;; An `eshell-pipe-broken' signal might occur internally; let Eshell
  ;; handle it!
  (let ((debug-on-error nil))
    (eshell-command-result-equal
     (concat "echo hi | " esh-proc-test--detect-pty-cmd)
     (unless (eq system-type 'windows-nt)
       "stdout\nstderr\n"))))

(ert-deftest esh-proc-test/pipeline-connection-type/last2 ()
  "Test that only output streams are PTYs when a command ends a pipeline."
  (skip-unless (executable-find "sh"))
  ;; An `eshell-pipe-broken' signal might occur internally; let Eshell
  ;; handle it!
  (let ((debug-on-error nil))
    (eshell-command-result-equal
     (concat "echo hi | " esh-proc-test--detect-pty-cmd)
     (unless (eq system-type 'windows-nt)
       "stdout\nstderr\n"))))

;;; esh-proc-tests.el ends here

[-- Attachment #3: esh-proc-tests.log --]
[-- Type: text/x-log, Size: 1373 bytes --]

Running 12 tests (2023-09-24 23:30:40+0200, selector `(not (tag :unstable))')
Loading em-alias...
Loading em-banner...
Loading em-basic...
Loading em-cmpl...
Loading em-extpipe...
Loading em-glob...
Loading em-hist...
Loading em-ls...
Loading em-pred...
Loading em-prompt...
Loading em-script...
Loading em-term...
Loading em-unix...
   passed   1/12  esh-proc-test/pipeline-connection-type/first (0.065851 sec)
   passed   2/12  esh-proc-test/pipeline-connection-type/first0 (0.057020 sec)
   passed   3/12  esh-proc-test/pipeline-connection-type/first1 (0.057370 sec)
   passed   4/12  esh-proc-test/pipeline-connection-type/first2 (0.057298 sec)
   passed   5/12  esh-proc-test/pipeline-connection-type/last (0.054581 sec)
   passed   6/12  esh-proc-test/pipeline-connection-type/last0 (0.055588 sec)
   passed   7/12  esh-proc-test/pipeline-connection-type/last1 (0.055823 sec)
   passed   8/12  esh-proc-test/pipeline-connection-type/last2 (0.054410 sec)
   passed   9/12  esh-proc-test/pipeline-connection-type/middle (0.057858 sec)
   passed  10/12  esh-proc-test/pipeline-connection-type/middle0 (0.076824 sec)
   passed  11/12  esh-proc-test/pipeline-connection-type/middle1 (0.057628 sec)
   passed  12/12  esh-proc-test/pipeline-connection-type/middle2 (0.057657 sec)

Ran 12 tests, 12 results as expected, 0 unexpected (2023-09-24 23:30:41+0200, 0.709969 sec)



* bug#66186: "make lisp/eshell/esh-proc-tests" fails intermittently since 7e50861ca7ed3f620fe62ac6572f6e88b3600ece
  2023-09-24 21:35 bug#66186: "make lisp/eshell/esh-proc-tests" fails intermittently since 7e50861ca7ed3f620fe62ac6572f6e88b3600ece Jens Schmidt via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-09-24 23:02 ` Jim Porter
  2023-09-25  4:52   ` Eli Zaretskii
  0 siblings, 1 reply; 12+ messages in thread
From: Jim Porter @ 2023-09-24 23:02 UTC (permalink / raw)
  To: Jens Schmidt, 66186

On 9/24/2023 2:35 PM, Jens Schmidt via Bug reports for GNU Emacs, the 
Swiss army knife of text editors wrote:
> * Broken Pipe with gdb Stack Trace
> 
> [test]$ HOME=/nonexistent LANG=C EMACS_TEST_DIRECTORY=/home/jschmidt/work/emacs-master/test gdb -q -batch -ex run -ex backtrace --args "../src/emacs" --module-assertions --no-init-file --no-site-file --no-site-lisp -L ":."    -l ert  -l lisp/eshell/esh-proc-tests.el   --batch --eval '(ert-run-tests-batch-and-exit (quote (not (tag :unstable))))'
[snip]
>     passed   8/12  esh-proc-test/pipeline-connection-type/last2 (0.056252 sec)
> [Detaching after vfork from child process 7986]
> [Detaching after vfork from child process 7987]
> 
> Thread 1 "emacs" received signal SIGPIPE, Broken pipe.
> 0x00007ffff57bffef in write () from /lib/x86_64-linux-gnu/libpthread.so.0
> #0  0x00007ffff57bffef in write () at /lib/x86_64-linux-gnu/libpthread.so.0
> #1  0x00005555556f2f08 in emacs_full_write (fd=19, buf=0x5555565918b8 "hi", nbyte=2, interruptible=-1) at sysdep.c:2812
[snip]

Thanks. This looks like it's caused when Eshell runs a command something 
like this:

   echo hi | sh -c 'if [ -t 0 ]; then echo stdin; fi; ...'

(Note that the pipe above is handled entirely by Eshell, using 
'process-send-string' in this case.) I'm guessing that sometimes, the 
'sh' process has exited by the time Eshell calls '(process-send-string 
PROC "hi")'. Presumably, the commit you identified (which just changed 
some debug logging) altered the timings by just enough to trigger this 
race condition for you.

However, I don't understand why this would cause an abort;
normally, 'process-send-string' should just signal an Elisp error (which 
Eshell then catches and does the right thing with it). Maybe there's a 
bug somewhere in process.c where it's not correctly handling the (real) 
SIGPIPE signal and converting it to an Elisp signal?

I'm somewhat familiar with process.c, so I can take a look at this, but 
it'll probably be a week or two until I have time to really dig in. In 
the meantime, if anyone else wants to work on a fix, feel free.






* bug#66186: "make lisp/eshell/esh-proc-tests" fails intermittently since 7e50861ca7ed3f620fe62ac6572f6e88b3600ece
  2023-09-24 23:02 ` Jim Porter
@ 2023-09-25  4:52   ` Eli Zaretskii
  2023-09-25  5:34     ` Jim Porter
  0 siblings, 1 reply; 12+ messages in thread
From: Eli Zaretskii @ 2023-09-25  4:52 UTC (permalink / raw)
  To: Jim Porter; +Cc: 66186, jschmidt4gnu

> Date: Sun, 24 Sep 2023 16:02:03 -0700
> From: Jim Porter <jporterbugs@gmail.com>
> 
> > Thread 1 "emacs" received signal SIGPIPE, Broken pipe.
> > 0x00007ffff57bffef in write () from /lib/x86_64-linux-gnu/libpthread.so.0
> > #0  0x00007ffff57bffef in write () at /lib/x86_64-linux-gnu/libpthread.so.0
> > #1  0x00005555556f2f08 in emacs_full_write (fd=19, buf=0x5555565918b8 "hi", nbyte=2, interruptible=-1) at sysdep.c:2812
> [snip]
> 
> Thanks. This looks like it's caused when Eshell runs a command something 
> like this:
> 
>    echo hi | sh -c 'if [ -t 0 ]; then echo stdin; fi; ...'
> 
> (Note that the pipe above is handled entirely by Eshell, using 
> 'process-send-string' in this case.) I'm guessing that sometimes, the 
> 'sh' process has exited by the time Eshell calls '(process-send-string 
> PROC "hi")'. Presumably, the commit you identified (which just changed 
> some debug logging) altered the timings by just enough to trigger this 
> race condition for you.
> 
> However, I don't understand why this would cause an abort;
> normally, 'process-send-string' should just signal an Elisp error (which 
> Eshell then catches and does the right thing with it). Maybe there's a 
> bug somewhere in process.c where it's not correctly handling the (real) 
> SIGPIPE signal and converting it to an Elisp signal?

In batch mode, SIGPIPE is not ignored by Emacs, see init_signals.
This was changed 11 years ago, see commit 4d7e6e51dd.
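
As a minimal C sketch of the difference (illustrative only, not code
from Emacs; the name write_to_closed_pipe is made up for this example):
once SIGPIPE is ignored, a write to a pipe with no reader fails with
EPIPE, which the caller can turn into an ordinary error.  With the
default disposition, which is what batch mode keeps, the same write
would terminate the process instead.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper, not part of Emacs.  Write "hi" to a pipe whose
   read end is already closed while SIGPIPE is ignored.  Return the
   errno set by write, 0 if the write unexpectedly succeeded, or -1 if
   the pipe could not be created.  */
int
write_to_closed_pipe (void)
{
  int fds[2];
  if (pipe (fds) != 0)
    return -1;

  /* Close the read end: the reader is gone, much like an 'sh'
     process that has already exited.  */
  close (fds[0]);

  /* Ignore SIGPIPE for the duration of the write, remembering the
     previous disposition so it can be restored afterwards.  */
  void (*old_handler) (int) = signal (SIGPIPE, SIG_IGN);

  ssize_t n = write (fds[1], "hi", 2);
  int err = n < 0 ? errno : 0;

  if (old_handler != SIG_ERR)
    signal (SIGPIPE, old_handler);
  close (fds[1]);
  return err;
}
```

On a POSIX system, calling write_to_closed_pipe () should return EPIPE.
Without the SIG_IGN line, the process would receive SIGPIPE at the
write, as in the backtrace above.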

Perhaps Eshell should check that the process is still alive before
calling process-send-string?






* bug#66186: "make lisp/eshell/esh-proc-tests" fails intermittently since 7e50861ca7ed3f620fe62ac6572f6e88b3600ece
  2023-09-25  4:52   ` Eli Zaretskii
@ 2023-09-25  5:34     ` Jim Porter
  2023-09-25  5:47       ` Jim Porter
  2023-09-25  9:01       ` Jens Schmidt via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 12+ messages in thread
From: Jim Porter @ 2023-09-25  5:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: jschmidt4gnu, 66186


On 9/24/2023 9:52 PM, Eli Zaretskii wrote:
>> Date: Sun, 24 Sep 2023 16:02:03 -0700
>> From: Jim Porter <jporterbugs@gmail.com>
>>
>> However, I don't understand why this would cause an abort;
>> normally, 'process-send-string' should just signal an Elisp error (which
>> Eshell then catches and does the right thing with it). Maybe there's a
>> bug somewhere in process.c where it's not correctly handling the (real)
>> SIGPIPE signal and converting it to an Elisp signal?
> 
> In batch mode, SIGPIPE is not ignored by Emacs, see init_signals.
> This was changed 11 years ago, see commit 4d7e6e51dd.

Thanks, I didn't realize that.

> Perhaps Eshell should check that the process is still alive before
> calling process-send-string?

Ok, how about this? Jens, could you try this patch out to see if it 
fixes things for you?

[-- Attachment #2: 0001-Check-for-process-liveness-before-calling-process-se.patch --]
[-- Type: text/plain, Size: 1423 bytes --]

From e9d961f0b1debed82fc004d6631ffe6adff7c19f Mon Sep 17 00:00:00 2001
From: Jim Porter <jporterbugs@gmail.com>
Date: Sun, 24 Sep 2023 22:30:34 -0700
Subject: [PATCH] Check for process liveness before calling
 'process-send-string' in Eshell

In other words, seek permission instead of asking for forgiveness
(bug#66186).

* lisp/eshell/esh-io.el (eshell-output-object-to-target): Check
'process-live-p' first.
---
 lisp/eshell/esh-io.el | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/lisp/eshell/esh-io.el b/lisp/eshell/esh-io.el
index cd0cee6e21d..668ff13b825 100644
--- a/lisp/eshell/esh-io.el
+++ b/lisp/eshell/esh-io.el
@@ -644,15 +644,10 @@ eshell-output-object-to-target
   "Output OBJECT to the process TARGET."
   (unless (stringp object)
     (setq object (eshell-stringify object)))
-  (condition-case err
+  (if (process-live-p target)
       (process-send-string target object)
-    (error
-     ;; If `process-send-string' raises an error and the process has
-     ;; finished, treat it as a broken pipe.  Otherwise, just
-     ;; re-throw the signal.
-     (if (process-live-p target)
-         (signal (car err) (cdr err))
-       (signal 'eshell-pipe-broken (list target)))))
+    ;; If the process is already dead, treat that as a broken pipe.
+    (signal 'eshell-pipe-broken (list target)))
   object)
 
 (cl-defmethod eshell-output-object-to-target (object
-- 
2.25.1



* bug#66186: "make lisp/eshell/esh-proc-tests" fails intermittently since 7e50861ca7ed3f620fe62ac6572f6e88b3600ece
  2023-09-25  5:34     ` Jim Porter
@ 2023-09-25  5:47       ` Jim Porter
  2023-09-25  6:47         ` Eli Zaretskii
  2023-09-25  9:01       ` Jens Schmidt via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 12+ messages in thread
From: Jim Porter @ 2023-09-25  5:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 66186, jschmidt4gnu

On 9/24/2023 10:34 PM, Jim Porter wrote:
> On 9/24/2023 9:52 PM, Eli Zaretskii wrote:
>> In batch mode, SIGPIPE is not ignored by Emacs, see init_signals.
>> This was changed 11 years ago, see commit 4d7e6e51dd.
> 
> Thanks, I didn't realize that.
> 
>> Perhaps Eshell should check that the process is still alive before
>> calling process-send-string?
> 
> Ok, how about this? Jens, could you try this patch out to see if it 
> fixes things for you?

I forgot to add: Is there potential for a race condition here? I think 
I'd written it the other way because there's a chance that the process 
exits in between checking 'process-live-p' and calling 
'process-send-string'. I guess we could check liveness both before *and* 
after 'process-send-string'. That would probably still leave a small 
chance of the regression tests crashing though, which isn't great.

I could probably also write the test to avoid this race condition 
entirely, since it's not actually trying to trigger a SIGPIPE (though in 
general, Eshell should do the right thing in response to SIGPIPE). That 
would make the regression tests happy.






* bug#66186: "make lisp/eshell/esh-proc-tests" fails intermittently since 7e50861ca7ed3f620fe62ac6572f6e88b3600ece
  2023-09-25  5:47       ` Jim Porter
@ 2023-09-25  6:47         ` Eli Zaretskii
  2023-09-25  7:18           ` Paul Eggert
  2023-09-25 19:12           ` Jim Porter
  0 siblings, 2 replies; 12+ messages in thread
From: Eli Zaretskii @ 2023-09-25  6:47 UTC (permalink / raw)
  To: Jim Porter, Paul Eggert; +Cc: 66186, jschmidt4gnu

> Date: Sun, 24 Sep 2023 22:47:58 -0700
> From: Jim Porter <jporterbugs@gmail.com>
> Cc: jschmidt4gnu@vodafonemail.de, 66186@debbugs.gnu.org
> 
> >> Perhaps Eshell should check that the process is still alive before
> >> calling process-send-string?
> > 
> > Ok, how about this? Jens, could you try this patch out to see if it 
> > fixes things for you?
> 
> I forgot to add: Is there potential for a race condition here? I think 
> I'd written it the other way because there's a chance that the process 
> exits in between checking 'process-live-p' and calling 
> 'process-send-string'.

Yes, and therefore I think you should also keep the old code that
wrapped the call in condition-case.

> I guess we could check liveness both before *and* 
> after 'process-send-string'. That would probably still leave a small 
> chance of the regression tests crashing though, which isn't great.

Perhaps process-send-string should install a temporary SIGPIPE
handler, at least optionally?  Paul, WDYT?

> I could probably also write the test to avoid this race condition 
> entirely, since it's not actually trying to trigger a SIGPIPE (though in 
> general, Eshell should do the right thing in response to SIGPIPE). That 
> would make the regression tests happy.

That's always a good thing, thanks.






* bug#66186: "make lisp/eshell/esh-proc-tests" fails intermittently since 7e50861ca7ed3f620fe62ac6572f6e88b3600ece
  2023-09-25  6:47         ` Eli Zaretskii
@ 2023-09-25  7:18           ` Paul Eggert
  2023-09-25  7:43             ` Eli Zaretskii
  2023-09-25 19:12           ` Jim Porter
  1 sibling, 1 reply; 12+ messages in thread
From: Paul Eggert @ 2023-09-25  7:18 UTC (permalink / raw)
  To: Eli Zaretskii, Jim Porter; +Cc: 66186, jschmidt4gnu

On 2023-09-24 23:47, Eli Zaretskii wrote:
> Perhaps process-send-string should install a temporary SIGPIPE
> handler, at least optionally?  Paul, WDYT?

Sounds like a recipe for bad race conditions.

I'm not following the problem closely. However, the usual way to handle 
this is to use sendto's MSG_NOSIGNAL option (GNU/Linux) or use 
setsockopt with SO_NOSIGPIPE (the BSDs and macOS). This should prevent 
those SIGPIPEs from occurring.

Alternatively, but this would be a bigger lift, you can arrange for a 
SIGPIPE signal handler to be enabled all the time, even in batch mode. 
But then you'll need to resurrect the batch-mode code that used to deal 
with this sort of thing (and I've forgotten what it is and as I vaguely 
recall it was a bit buggy but you can look at the change history). The 
basic idea is that in batch mode, if you ignore SIGPIPE then Emacs 
should always check for write errors and exit whenever they happen.






* bug#66186: "make lisp/eshell/esh-proc-tests" fails intermittently since 7e50861ca7ed3f620fe62ac6572f6e88b3600ece
  2023-09-25  7:18           ` Paul Eggert
@ 2023-09-25  7:43             ` Eli Zaretskii
  0 siblings, 0 replies; 12+ messages in thread
From: Eli Zaretskii @ 2023-09-25  7:43 UTC (permalink / raw)
  To: Paul Eggert; +Cc: jporterbugs, 66186, jschmidt4gnu

> Date: Mon, 25 Sep 2023 00:18:02 -0700
> Cc: jschmidt4gnu@vodafonemail.de, 66186@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> 
> On 2023-09-24 23:47, Eli Zaretskii wrote:
> > Perhaps process-send-string should install a temporary SIGPIPE
> > handler, at least optionally?  Paul, WDYT?
> 
> Sounds like a recipe for bad race conditions.
> 
> I'm not following the problem closely. However, the usual way to handle 
> this is to use sendto's MSG_NOSIGNAL option (GNU/Linux) or use 
> setsockopt with SO_NOSIGPIPE (the BSDs and macOS). This should prevent 
> those SIGPIPEs from occurring.

I don't think this is about a network subprocess.  It's about a real
subprocess which runs programs.
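
To make the distinction concrete, here is a minimal C sketch (GNU/Linux
assumed; the name send_to_closed_socket is invented for illustration and
is not Emacs code).  MSG_NOSIGNAL is a flag for send/sendto, which
accept only sockets; calling send on a pipe descriptor fails with
ENOTSOCK, so the flag cannot be applied to the pipe of a real
subprocess.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of Emacs.  Send "hi" over a socket
   whose peer has been closed, passing MSG_NOSIGNAL so that no SIGPIPE
   is raised.  Return the errno set by send, 0 if the send unexpectedly
   succeeded, or -1 if the socket pair could not be created.  */
int
send_to_closed_socket (void)
{
  int sv[2];
  if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) != 0)
    return -1;

  /* Close one end so the other end has no peer left to read.  */
  close (sv[1]);

  /* MSG_NOSIGNAL suppresses SIGPIPE for this call only; the failure
     is reported through the return value and errno instead.  */
  ssize_t n = send (sv[0], "hi", 2, MSG_NOSIGNAL);
  int err = n < 0 ? errno : 0;

  close (sv[0]);
  return err;
}
```

On GNU/Linux this should fail with EPIPE and leave the process running,
even with the default SIGPIPE disposition.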

> Alternatively, but this would be a bigger lift, you can arrange for a 
> SIGPIPE signal handler to be enabled all the time, even in batch mode. 
> But then you'll need to resurrect the batch-mode code that used to deal 
> with this sort of thing (and I've forgotten what it is and as I vaguely 
> recall it was a bit buggy but you can look at the change history). The 
> basic idea is that in batch mode, if you ignore SIGPIPE then Emacs 
> should always check for write errors and exit whenever they happen.

Hmm...






* bug#66186: "make lisp/eshell/esh-proc-tests" fails intermittently since 7e50861ca7ed3f620fe62ac6572f6e88b3600ece
  2023-09-25  5:34     ` Jim Porter
  2023-09-25  5:47       ` Jim Porter
@ 2023-09-25  9:01       ` Jens Schmidt via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 12+ messages in thread
From: Jens Schmidt via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-09-25  9:01 UTC (permalink / raw)
  To: Jim Porter; +Cc: Eli Zaretskii, 66186

Jim Porter <jporterbugs@gmail.com> writes:

> On 9/24/2023 9:52 PM, Eli Zaretskii wrote:
>>> Date: Sun, 24 Sep 2023 16:02:03 -0700
>>> From: Jim Porter <jporterbugs@gmail.com>

>> Perhaps Eshell should check that the process is still alive before
>> calling process-send-string?
>
> Ok, how about this? Jens, could you try this patch out to see if
> it fixes things for you?

I managed to reproduce the SIGPIPE twice, but this time over 20-30 test
runs.  So the patch does not really fix things (as you suspected as well),
but it definitely improves the situation.

It's not that I feel badly affected by this bug, it's more of a
nuisance.  So I leave it to you how to continue here.






* bug#66186: "make lisp/eshell/esh-proc-tests" fails intermittently since 7e50861ca7ed3f620fe62ac6572f6e88b3600ece
  2023-09-25  6:47         ` Eli Zaretskii
  2023-09-25  7:18           ` Paul Eggert
@ 2023-09-25 19:12           ` Jim Porter
  2023-09-28 20:33             ` Jens Schmidt via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 12+ messages in thread
From: Jim Porter @ 2023-09-25 19:12 UTC (permalink / raw)
  To: Eli Zaretskii, Paul Eggert; +Cc: jschmidt4gnu, 66186


On 9/24/2023 11:47 PM, Eli Zaretskii wrote:
>> Date: Sun, 24 Sep 2023 22:47:58 -0700
>> From: Jim Porter <jporterbugs@gmail.com>
>> Cc: jschmidt4gnu@vodafonemail.de, 66186@debbugs.gnu.org
>>
>> I forgot to add: Is there potential for a race condition here? I think
>> I'd written it the other way because there's a chance that the process
>> exits in between checking 'process-live-p' and calling
>> 'process-send-string'.
> 
> Yes, and therefore I think you should also keep the old code that
> wrapped the call in condition-case.

Ok, so I've rewritten the patch. Now there are no non-test code changes, 
so Eshell works as it did before, for better or worse. Jumping through 
hoops to reduce, but not eliminate, the chance of a crash didn't seem 
like the right direction to me.

However, I also added a comment in 'eshell-output-object-to-target' 
pointing to this bug, in case anyone finds this SIGPIPE behavior to be 
an actual problem (it might be an issue for people who want to write 
shell scripts in Eshell, but I don't think that's very common anyway). 
And then...

>> I could probably also write the test to avoid this race condition
>> entirely, since it's not actually trying to trigger a SIGPIPE (though in
>> general, Eshell should do the right thing in response to SIGPIPE). That
>> would make the regression tests happy.
> 
> That's always a good thing, thanks.

... I've also done this. Now the regression tests should just avoid the 
possibility of a SIGPIPE, which will hopefully resolve this bug.

Jens, could you try this version out to make sure the tests pass 
reliably for you?

[-- Attachment #2: 0001-Adjust-Eshell-regression-tests-to-avoid-SIGPIPE.patch --]
[-- Type: text/plain, Size: 3019 bytes --]

From 2feac3f3c0a6630aadb4746c3fdcc167bda2e253 Mon Sep 17 00:00:00 2001
From: Jim Porter <jporterbugs@gmail.com>
Date: Sun, 24 Sep 2023 22:30:34 -0700
Subject: [PATCH] ; Adjust Eshell regression tests to avoid SIGPIPE

In batch mode, SIGPIPEs can cause Emacs to abort (bug#66186).

* lisp/eshell/esh-io.el (eshell-output-object-to-target): Update
comment.

* test/lisp/eshell/esh-proc-tests.el
(esh-proc-test/pipeline-connection-type/middle)
(esh-proc-test/pipeline-connection-type/last): Use 'printnl', since
that causes no output when called with no arguments, thus avoiding a
risky 'process-send-string'.
---
 lisp/eshell/esh-io.el              |  7 +++++--
 test/lisp/eshell/esh-proc-tests.el | 20 +++++++-------------
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/lisp/eshell/esh-io.el b/lisp/eshell/esh-io.el
index cd0cee6e21d..d0f1e04e925 100644
--- a/lisp/eshell/esh-io.el
+++ b/lisp/eshell/esh-io.el
@@ -648,8 +648,11 @@ eshell-output-object-to-target
       (process-send-string target object)
     (error
      ;; If `process-send-string' raises an error and the process has
-     ;; finished, treat it as a broken pipe.  Otherwise, just
-     ;; re-throw the signal.
+     ;; finished, treat it as a broken pipe.  Otherwise, just re-raise
+     ;; the signal.  NOTE: When running Emacs in batch mode
+     ;; (e.g. during regression tests), Emacs can abort due to SIGPIPE
+     ;; here.  Maybe `process-send-string' should handle SIGPIPE even
+     ;; in batch mode (bug#66186).
      (if (process-live-p target)
          (signal (car err) (cdr err))
        (signal 'eshell-pipe-broken (list target)))))
diff --git a/test/lisp/eshell/esh-proc-tests.el b/test/lisp/eshell/esh-proc-tests.el
index d58764ac29f..2f03c07b35e 100644
--- a/test/lisp/eshell/esh-proc-tests.el
+++ b/test/lisp/eshell/esh-proc-tests.el
@@ -174,23 +174,17 @@ esh-proc-test/pipeline-connection-type/middle
 pipeline."
   (skip-unless (and (executable-find "sh")
                     (executable-find "cat")))
-  ;; An `eshell-pipe-broken' signal might occur internally; let Eshell
-  ;; handle it!
-  (let ((debug-on-error nil))
-    (eshell-command-result-equal
-     (concat "echo hi | " esh-proc-test--detect-pty-cmd " | cat")
-     nil)))
+  (eshell-command-result-equal
+   (concat "printnl | " esh-proc-test--detect-pty-cmd " | cat")
+   nil))
 
 (ert-deftest esh-proc-test/pipeline-connection-type/last ()
   "Test that only output streams are PTYs when a command ends a pipeline."
   (skip-unless (executable-find "sh"))
-  ;; An `eshell-pipe-broken' signal might occur internally; let Eshell
-  ;; handle it!
-  (let ((debug-on-error nil))
-    (eshell-command-result-equal
-     (concat "echo hi | " esh-proc-test--detect-pty-cmd)
-     (unless (eq system-type 'windows-nt)
-       "stdout\nstderr\n"))))
+  (eshell-command-result-equal
+   (concat "printnl | " esh-proc-test--detect-pty-cmd)
+   (unless (eq system-type 'windows-nt)
+     "stdout\nstderr\n")))
 
 \f
 ;; Synchronous processes
-- 
2.25.1



* bug#66186: "make lisp/eshell/esh-proc-tests" fails intermittently since 7e50861ca7ed3f620fe62ac6572f6e88b3600ece
  2023-09-25 19:12           ` Jim Porter
@ 2023-09-28 20:33             ` Jens Schmidt via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-10-01 20:13               ` Jim Porter
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Schmidt via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-09-28 20:33 UTC (permalink / raw)
  To: Jim Porter; +Cc: Eli Zaretskii, Paul Eggert, 66186

Jim Porter <jporterbugs@gmail.com> writes:

> On 9/24/2023 11:47 PM, Eli Zaretskii wrote:
>
>> That's always a good thing, thanks.
>
> ... I've also done this. Now the regression tests should just
> avoid the possibility of a SIGPIPE, which will hopefully resolve
> this bug.
>
> Jens, could you try this version out to make sure the tests pass
> reliably for you?

They do pass reliably now, thanks.

TBH, I initially didn't read your commit message and, hence, failed to
understand that `printnl' without parameters prints nothing - I thought
it would print at least a newline, which seemed to me like pushing the
race condition just further down the line.

So how about using something that more explicitly does not print
anything?  Like, for example `(ignore)', which also seems to generate no
output?

(Actually, I also tested a variant where that shell statement simply
slurps its stdin, as generated by this function:

(defun esh-proc-test--detect-pty-cmd (&optional read-input)
  "Generate a shell command that prints the standard stream status.
The generated shell command prints the standard streams which are
connected as TTYs.  If READ-INPUT is present and non-nil and
Emacs is in batch mode the generated command gobbles up stdin to
avoid SIGPIPE errors."
  (concat "sh -c '"
          "if [ -t 0 ]; then echo stdin; fi; "
          "if [ -t 1 ]; then echo stdout; fi; "
          "if [ -t 2 ]; then echo stderr; fi; "
          (when (and read-input noninteractive)
            ;; Read stdin using only shell built-ins.
            "while read dummy; do :; done; ")
          "'"))

But simply not printing to the pipe is of course, well, simpler.)






* bug#66186: "make lisp/eshell/esh-proc-tests" fails intermittently since 7e50861ca7ed3f620fe62ac6572f6e88b3600ece
  2023-09-28 20:33             ` Jens Schmidt via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-10-01 20:13               ` Jim Porter
  0 siblings, 0 replies; 12+ messages in thread
From: Jim Porter @ 2023-10-01 20:13 UTC (permalink / raw)
  To: Jens Schmidt; +Cc: Eli Zaretskii, Paul Eggert, 66186-done

Version: 30.1

On 9/28/2023 1:33 PM, Jens Schmidt via Bug reports for GNU Emacs, the 
Swiss army knife of text editors wrote:
> They do pass reliably now, thanks.

Thanks for checking.

> So how about using something that more explicitly does not print
> anything?  Like, for example `(ignore)', which also seems to generate no
> output?

Now merged to master as 862e5effbf9 with the change from 'printnl' to 
'(ignore)', so closing this.




