unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Paul Eggert <eggert@cs.ucla.edu>
To: Emacs development discussions <emacs-devel@gnu.org>
Cc: Michael Albinus <michael.albinus@gmx.de>
Subject: some accept-process-output races fixed; Tramp FIXMEs
Date: Tue, 15 Jan 2019 10:33:43 -0800	[thread overview]
Message-ID: <46ce72ef-c4cb-1cd1-566a-c305ca92e68c@cs.ucla.edu> (raw)

[-- Attachment #1: Type: text/plain, Size: 981 bytes --]

I've been annoyed for months that test/src/process-tests.el occasionally 
failed on my work desktop, and finally got annoyed enough to track it 
down. It was a race condition bug, fixed in the attached patch. I looked 
for similar instances of the bug elsewhere in Emacs, and found what 
appear to be some instances in Tramp code, along with some mysterious 
"0.1"s that may be inadequate attempts to work around the bug. I 
installed the attached patch to try to document and fix this mess, 
including some FIXMEs about those "0.1"s; I hope Michael or someone else 
with Tramp expertise (I don't use Tramp) can take a look at those FIXMEs 
and at my other Tramp changes, and perhaps at all other Tramp uses of 
accept-process-output and of tramp-accept-process-output.

Now that I'm thinking about it, I should also mention that a single 
(accept-process-output P) is not guaranteed to slurp all the data from 
process P even if P has already exited. Just a word to the wise....


[-- Attachment #2: 0001-Fix-accept-process-output-process-live-p-confusion.patch --]
[-- Type: text/x-patch, Size: 7610 bytes --]

From 5a144e547d6bd6a718e76499705d81ef5934776d Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 15 Jan 2019 10:18:45 -0800
Subject: [PATCH] Fix accept-process-output/process-live-p confusion

* doc/lispref/processes.texi (Accepting Output):
Document the issue.
* lisp/net/tramp-adb.el (tramp-adb-parse-device-names):
* lisp/net/tramp-rclone.el (tramp-rclone-parse-device-names):
* lisp/net/tramp-smb.el (tramp-smb-wait-for-output):
* lisp/net/tramp.el (tramp-interrupt-process):
* test/src/process-tests.el (make-process/mix-stderr):
Fix code that uses accept-process-output and process-live-p.
Add FIXME comments as necessary.
* lisp/net/tramp-sudoedit.el (tramp-sudoedit-action-sudo):
* lisp/net/tramp.el (tramp-action-out-of-band):
Add FIXME comments as necessary.
---
 doc/lispref/processes.texi | 20 ++++++++++++++++++++
 lisp/net/tramp-adb.el      |  6 +++---
 lisp/net/tramp-rclone.el   |  6 +++---
 lisp/net/tramp-smb.el      | 19 +++++++++++--------
 lisp/net/tramp-sudoedit.el |  2 ++
 lisp/net/tramp.el          |  9 ++++++---
 test/src/process-tests.el  |  4 ++--
 7 files changed, 47 insertions(+), 19 deletions(-)

diff --git a/doc/lispref/processes.texi b/doc/lispref/processes.texi
index 72b164c5d4..afda8aede8 100644
--- a/doc/lispref/processes.texi
+++ b/doc/lispref/processes.texi
@@ -1859,6 +1859,26 @@ Accepting Output
 arrived.
 @end defun
 
+If a connection from a process contains buffered data,
+@code{accept-process-output} can return non-@code{nil} even after the
+process has exited.  Therefore, although the following loop:
+
+@example
+;; This loop contains a bug.
+(while (process-live-p process)
+  (accept-process-output process))
+@end example
+
+@noindent
+will often work, it has a race condition and can miss some output if
+@code{process-live-p} returns @code{nil} while the connection still
+contains data.  Better is to write the loop like this:
+
+@example
+(while (or (accept-process-output process)
+           (process-live-p process)))
+@end example
+
 @node Processes and Threads
 @subsection Processes and Threads
 @cindex processes, threads
diff --git a/lisp/net/tramp-adb.el b/lisp/net/tramp-adb.el
index e2275bee2a..ca47601e4b 100644
--- a/lisp/net/tramp-adb.el
+++ b/lisp/net/tramp-adb.el
@@ -206,9 +206,9 @@ tramp-adb-parse-device-names
 	(tramp-message v 6 "%s" (mapconcat 'identity (process-command p) " "))
 	(process-put p 'adjust-window-size-function 'ignore)
 	(set-process-query-on-exit-flag p nil)
-	(while (process-live-p p)
-	  (accept-process-output p 0.1))
-	(accept-process-output p 0.1)
+	;; FIXME: Either remove " 0.1", or comment why it's needed.
+	(while (or (accept-process-output p 0.1)
+		   (process-live-p p)))
 	(tramp-message v 6 "\n%s" (buffer-string))
 	(goto-char (point-min))
 	(while (search-forward-regexp "^\\(\\S-+\\)[[:space:]]+device$" nil t)
diff --git a/lisp/net/tramp-rclone.el b/lisp/net/tramp-rclone.el
index 0aa110d9a4..7366057296 100644
--- a/lisp/net/tramp-rclone.el
+++ b/lisp/net/tramp-rclone.el
@@ -183,9 +183,9 @@ tramp-rclone-parse-device-names
 	  (tramp-message v 6 "%s" (mapconcat 'identity (process-command p) " "))
 	  (process-put p 'adjust-window-size-function 'ignore)
 	  (set-process-query-on-exit-flag p nil)
-	  (while (process-live-p p)
-	    (accept-process-output p 0.1))
-	  (accept-process-output p 0.1)
+	  ;; FIXME: Either remove " 0.1", or comment why it's needed.
+	  (while (or (accept-process-output p 0.1)
+	             (process-live-p p)))
 	  (tramp-message v 6 "\n%s" (buffer-string))
 	  (goto-char (point-min))
 	  (while (search-forward-regexp "^\\(\\S-+\\):$" nil t)
diff --git a/lisp/net/tramp-smb.el b/lisp/net/tramp-smb.el
index 0c42a5d1a5..abf3248a35 100644
--- a/lisp/net/tramp-smb.el
+++ b/lisp/net/tramp-smb.el
@@ -2038,10 +2038,10 @@ tramp-smb-wait-for-output
       ;; Algorithm: get waiting output.  See if last line contains
       ;; `tramp-smb-prompt' sentinel or `tramp-smb-errors' strings.
       ;; If not, wait a bit and again get waiting output.
-      (while (and (not found) (not err) (process-live-p p))
-
-	;; Accept pending output.
-	(tramp-accept-process-output p 0.1)
+      ;; FIXME: Either remove " 0.1", or comment why it's needed.
+      (while (and (not found) (not err)
+		  (or (tramp-accept-process-output p 0.1)
+		      (process-live-p p)))
 
 	;; Search for prompt.
 	(goto-char (point-min))
@@ -2052,10 +2052,13 @@ tramp-smb-wait-for-output
 	(setq err (re-search-forward tramp-smb-errors nil t)))
 
       ;; When the process is still alive, read pending output.
-      (while (and (not found) (process-live-p p))
-
-	;; Accept pending output.
-	(tramp-accept-process-output p 0.1)
+      ;; FIXME: This loop should be folded into the previous loop.
+      ;; Also, ERR should be set just once, after the combined
+      ;; loop has finished.
+      ;; FIXME: Either remove " 0.1", or comment why it's needed.
+      (while (and (not found)
+		  (or (tramp-accept-process-output p 0.1)
+		      (process-live-p p)))
 
 	;; Search for prompt.
 	(goto-char (point-min))
diff --git a/lisp/net/tramp-sudoedit.el b/lisp/net/tramp-sudoedit.el
index cab9b8d835..e1e5ab091a 100644
--- a/lisp/net/tramp-sudoedit.el
+++ b/lisp/net/tramp-sudoedit.el
@@ -746,6 +746,8 @@ tramp-sudoedit-handle-write-region
 (defun tramp-sudoedit-action-sudo (proc vec)
   "Check, whether a sudo process copy has finished."
   ;; There might be pending output for the exit status.
+  ;; FIXME: Either remove " 0.1", or comment why it's needed.
+  ;; FIXME: There's a race here.  Shouldn't the next two lines be interchanged?
   (tramp-accept-process-output proc 0.1)
   (when (not (process-live-p proc))
     ;; Delete narrowed region, it would be in the way reading a Lisp form.
diff --git a/lisp/net/tramp.el b/lisp/net/tramp.el
index 437e2d19b9..7632d656a0 100644
--- a/lisp/net/tramp.el
+++ b/lisp/net/tramp.el
@@ -3977,6 +3977,8 @@ tramp-action-process-alive
 (defun tramp-action-out-of-band (proc vec)
   "Check, whether an out-of-band copy has finished."
   ;; There might be pending output for the exit status.
+  ;; FIXME: Either remove " 0.1", or comment why it's needed.
+  ;; FIXME: Shouldn't the following line be wrapped inside (while ...)?
   (tramp-accept-process-output proc 0.1)
   (cond ((and (not (process-live-p proc))
 	      (zerop (process-exit-status proc)))
@@ -4821,9 +4823,10 @@ tramp-interrupt-process
 	;; Wait, until the process has disappeared.  If it doesn't,
 	;; fall back to the default implementation.
 	(with-timeout (1 (ignore))
-	  (while (process-live-p proc)
-	    ;; We cannot run `tramp-accept-process-output', it blocks timers.
-	    (accept-process-output proc 0.1))
+	  ;; We cannot run `tramp-accept-process-output', it blocks timers.
+	  ;; FIXME: Either remove " 0.1", or comment why it's needed.
+	  (while (or (accept-process-output proc 0.1)
+		     (process-live-p proc)))
 	  ;; Report success.
 	  proc)))))
 
diff --git a/test/src/process-tests.el b/test/src/process-tests.el
index 514bd04da4..5dbf441e8c 100644
--- a/test/src/process-tests.el
+++ b/test/src/process-tests.el
@@ -207,8 +207,8 @@ process-tests--mixable
                     :sentinel #'ignore
                     :noquery t
                     :connection-type 'pipe)))
-      (while (process-live-p process)
-        (accept-process-output process))
+      (while (or (accept-process-output process)
+		 (process-live-p process)))
       (should (eq (process-status process) 'exit))
       (should (eq (process-exit-status process) 0))
       (should (process-tests--mixable (string-to-list (buffer-string))
-- 
2.20.1


             reply	other threads:[~2019-01-15 18:33 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-15 18:33 Paul Eggert [this message]
2019-01-16  8:32 ` some accept-process-output races fixed; Tramp FIXMEs Michael Albinus
2019-01-16  8:38   ` Paul Eggert
2019-01-16  8:49     ` Michael Albinus
2019-01-16 13:03   ` Michael Albinus
2019-01-16 13:25 ` Stefan Monnier
2019-01-16 18:39   ` Paul Eggert
2019-01-16 19:07     ` Stefan Monnier
2019-01-17  0:15       ` Paul Eggert
2019-01-20  9:57     ` Michael Albinus
2019-01-21 23:02       ` Daniel Colascione
2019-01-22 16:54         ` Eli Zaretskii
2019-01-22  7:09       ` Paul Eggert
2019-01-22 16:56       ` Eli Zaretskii
2019-01-22 18:58         ` Michael Albinus
2019-01-22 22:06       ` Stefan Monnier
2019-01-22 22:45         ` Michael Albinus
2019-01-23 16:18           ` Eli Zaretskii
2019-01-23 18:35             ` Michael Albinus
2019-01-23 22:19               ` Daniel Pittman
2019-01-23 16:47           ` Stefan Monnier
2019-01-23 17:36             ` Paul Eggert
2019-01-28 15:37               ` Michael Albinus

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=46ce72ef-c4cb-1cd1-566a-c305ca92e68c@cs.ucla.edu \
    --to=eggert@cs.ucla.edu \
    --cc=emacs-devel@gnu.org \
    --cc=michael.albinus@gmx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).