unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#17647: 24.4.50; accept-process-output with process nil can miss output
@ 2014-05-30 22:24 Jorgen Schaefer
  2014-06-02 23:21 ` Paul Eggert
  2014-06-25 19:14 ` Glenn Morris
  0 siblings, 2 replies; 8+ messages in thread
From: Jorgen Schaefer @ 2014-05-30 22:24 UTC (permalink / raw)
  To: 17647

Hello!
I'm witnessing some weird behavior which I assume is some sort of race
condition. The following piece of code reliably hangs my running Emacs
for two seconds while `accept-process-output' is waiting for output,
while a freshly-start Emacs (same binary) returns immediately, as
the process indeed has emitted some output.

(with-temp-buffer
  (let ((proc (start-process "test" (current-buffer) "bash" "-c"
                             "echo foo ; sleep 5")))
    (set-process-query-on-exit-flag proc nil)
    (accept-process-output nil 2)))

This is with a recent trunk version of Emacs, but I have had someone
else with an Emacs 24.3 reproduce the behavior. It only happens in
Emacsen that have been running for a while, but I haven't been able to
trigger it deliberately.

It's interesting that passing the specific process with output to
`accept-process-output' does not exhibit this bug. The following piece
of code always returns immediately.

(with-temp-buffer
  (let ((proc (start-process "test" (current-buffer) "bash" "-c"
                             "echo foo ; sleep 5")))
    (set-process-query-on-exit-flag proc nil)
    (accept-process-output proc 2)))

This is the Emacs process exhibiting the problem:

00:04:34.316560 pselect6(16, [3 4 5 6 8 11 13 15], [], NULL, {0, 192669865}, {NULL, 8}) = 0 (Timeout) <0.192904>
00:04:34.509752 pselect6(16, [3 4 5 6 8 11 13 15], [], NULL, {0, 47323707}, {NULL, 8}) = 0 (Timeout) <0.047388>
00:04:34.557354 pselect6(16, [3 4 5 6 8 11 13 15], [], NULL, {0, 15370771}, {NULL, 8}) = 0 (Timeout) <0.015441>
00:04:34.572965 pselect6(16, [3 4 5 6 8 11 13 15], [], NULL, {0, 41203200}, {NULL, 8}) = 0 (Timeout) <0.041293>
00:04:34.614484 pselect6(16, [3 4 5 6 8 11 13 15], [], NULL, {0, 308212924}, {NULL, 8}) = 0 (Timeout) <0.308560>
00:04:34.923287 pselect6(16, [3 4 5 6 8 11 13 15], [], NULL, {0, 585891448}, {NULL, 8}) = 1 (in [3], left {0, 578913641}) <0.006996>
00:04:34.930384 --- SIGIO (I/O possible) @ 0 (0) ---
00:04:34.930428 rt_sigreturn(0x1d)      = 1 <0.000011>
00:04:34.930476 ioctl(3, FIONREAD, [1]) = 0 <0.000009>
00:04:34.930530 read(3, "\30", 1)       = 1 <0.000011>
00:04:34.930580 ioctl(3, FIONREAD, [0]) = 0 <0.000008>
00:04:34.930618 ioctl(3, FIONREAD, [0]) = 0 <0.000007>
00:04:34.930823 ioctl(3, FIONREAD, [0]) = 0 <0.000010>
00:04:34.930874 pselect6(16, [3 4 5 6 8 11 13 15], [], NULL, {0, 499903433}, {NULL, 8}) = 1 (in [3], left {0, 139494667}) <0.360435>
00:04:35.291412 --- SIGIO (I/O possible) @ 0 (0) ---
00:04:35.291451 rt_sigreturn(0x1d)      = 1 <0.000023>
00:04:35.291514 ioctl(3, FIONREAD, [1]) = 0 <0.000012>
00:04:35.291627 read(3, "\5", 1)        = 1 <0.000011>
00:04:35.291673 ioctl(3, FIONREAD, [0]) = 0 <0.000009>
00:04:35.291711 ioctl(3, FIONREAD, [0]) = 0 <0.000008>

That was the C-x C-e to execute the code fragment.

00:04:35.292167 faccessat(AT_FDCWD, "/home/forcer/.", F_OK) = 0 <0.000017>
00:04:35.292258 faccessat(AT_FDCWD, "/home/forcer/.virtualenvs/elpy/bin/bash", X_OK) = -1 ENOENT (No such file or directory) <0.000021>
00:04:35.292340 faccessat(AT_FDCWD, "/home/forcer/bin/bash", X_OK) = -1 ENOENT (No such file or directory) <0.000012>
00:04:35.292397 faccessat(AT_FDCWD, "/home/forcer/bin/bash", X_OK) = -1 ENOENT (No such file or directory) <0.000011>
00:04:35.292450 faccessat(AT_FDCWD, "/usr/local/bin/bash", X_OK) = -1 ENOENT (No such file or directory) <0.000014>
00:04:35.292506 faccessat(AT_FDCWD, "/usr/bin/bash", X_OK) = -1 ENOENT (No such file or directory) <0.000012>
00:04:35.292571 faccessat(AT_FDCWD, "/bin/bash", X_OK) = 0 <0.000012>
00:04:35.292614 stat("/bin/bash", {st_mode=S_IFREG|0755, st_size=975488, ...}) = 0 <0.000011>
00:04:35.292700 open("/dev/ptmx", O_RDWR|O_NOCTTY|O_CLOEXEC) = 9 <0.000080>
00:04:35.292816 fcntl(9, F_SETFD, FD_CLOEXEC) = 0 <0.000009>
00:04:35.292867 rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0 <0.000009>
00:04:35.292914 ioctl(9, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0 <0.000009>
00:04:35.292956 ioctl(9, TIOCGPTN, [3]) = 0 <0.000008>
00:04:35.292992 stat("/dev/pts/3", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0 <0.000012>
00:04:35.293053 getuid()                = 1000 <0.000008>
00:04:35.293088 ioctl(9, TIOCSPTLCK, [0]) = 0 <0.000007>
00:04:35.293136 ioctl(9, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0 <0.000009>
00:04:35.293183 ioctl(9, TIOCGPTN, [3]) = 0 <0.000011>
00:04:35.293234 stat("/dev/pts/3", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0 <0.000010>
00:04:35.293298 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0 <0.000008>
00:04:35.293340 faccessat(AT_FDCWD, "/dev/pts/3", R_OK|W_OK) = 0 <0.000012>
00:04:35.293382 ioctl(9, FIONBIO, [1])  = 0 <0.000008>
00:04:35.293420 pipe2([16, 17], O_CLOEXEC) = 0 <0.000017>
00:04:35.293471 fcntl(9, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000008>
00:04:35.293507 fcntl(9, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000007>
00:04:35.293544 rt_sigprocmask(SIG_BLOCK, [INT CHLD], [], 8) = 0 <0.000008>
00:04:35.293586 vfork()                 = 19552 <0.000219>
00:04:35.293841 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000009>
00:04:35.293884 close(17)               = 0 <0.000010>
00:04:35.293922 read(16, "", 1)         = 0 <0.000027>
00:04:35.293979 close(16)               = 0 <0.000015>

Here, the process output is read:

00:04:35.294032 pselect6(16, [4 5 6 8 9 11 13 15], [], NULL, {0, 215142592}, {NULL, 8}) = 1 (in [9], left {0, 212580423}) <0.002577>
00:04:35.296688 read(9, "foo\n", 4096)  = 4 <0.000013>
00:04:35.296775 pselect6(16, [4 5 6 8 11 13 15], [], NULL, {0, 20000000}, {NULL, 8}) = 0 (Timeout) <0.020087>

So far, so good. But now Emacs waits for the two seconds, even though
input was available.

00:04:35.316940 pselect6(16, [4 5 6 8 9 11 13 15], [], NULL, {0, 192233874}, {NULL, 8}) = 0 (Timeout) <0.192464>
00:04:35.509574 pselect6(16, [4 5 6 8 9 11 13 15], [], NULL, {0, 999602693}, {NULL, 8}) = ? ERESTARTNOHAND (To be restarted) <0.165729>

This is me hitting "cursor up" to see when it stops hanging.

00:04:35.675387 --- SIGIO (I/O possible) @ 0 (0) ---
00:04:35.675431 rt_sigreturn(0x1d)      = -1 EINTR (Interrupted system call) <0.000016>
00:04:35.675495 ioctl(3, FIONREAD, [3]) = 0 <0.000015>
00:04:35.675558 read(3, "\33OA", 3)     = 3 <0.000017>
00:04:35.675621 ioctl(3, FIONREAD, [0]) = 0 <0.000013>
00:04:35.675676 ioctl(3, FIONREAD, [0]) = 0 <0.000013>
00:04:35.675845 pselect6(16, [4 5 6 8 9 11 13 15], [], NULL, {0, 881234617}, {NULL, 8}) = 0 (Timeout) <0.882159>
00:04:36.558240 pselect6(16, [4 5 6 8 9 11 13 15], [], NULL, {0, 55923632}, {NULL, 8}) = 0 (Timeout) <0.056018>
00:04:36.614482 pselect6(16, [4 5 6 8 9 11 13 15], [], NULL, {0, 679593160}, {NULL, 8}) = 0 (Timeout) <0.680313>

Two seconds up. Buffer is destroyed, so the child is killed. The rest
isn't too interesting.

00:04:37.294951 rt_sigprocmask(SIG_BLOCK, [INT CHLD], [], 8) = 0 <0.000009>
00:04:37.295010 kill(4294947744, SIGHUP) = 0 <0.000191>
00:04:37.295232 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000010>
00:04:37.295278 --- SIGCHLD (Child exited) @ 0 (0) ---
00:04:37.295308 wait4(19552, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGHUP}], WNOHANG|WSTOPPED|WCONTINUED, NULL) = 19552 <0.000018>
00:04:37.295359 wait4(14261, 0x7fff19905b2c, WNOHANG|WSTOPPED|WCONTINUED, NULL) = 0 <0.000009>
00:04:37.295393 wait4(11218, 0x7fff19905b2c, WNOHANG|WSTOPPED|WCONTINUED, NULL) = 0 <0.000009>
00:04:37.295426 wait4(7041, 0x7fff19905b2c, WNOHANG|WSTOPPED|WCONTINUED, NULL) = 0 <0.000009>
00:04:37.295460 wait4(7035, 0x7fff19905b2c, WNOHANG|WSTOPPED|WCONTINUED, NULL) = 0 <0.000008>
00:04:37.295491 rt_sigreturn(0x11)      = 0 <0.000009>
00:04:37.296854 rt_sigprocmask(SIG_BLOCK, [WINCH IO], NULL, 8) = 0 <0.000011>
00:04:37.297034 write(3, "\33[50;1H\33[?25lt\33[4;1H\33[34h\33[?25h\33"..., 36) = 36 <0.000016>
00:04:37.297093 rt_sigprocmask(SIG_UNBLOCK, [WINCH IO], NULL, 8) = 0 <0.000009>
00:04:37.297375 --- SIGIO (I/O possible) @ 0 (0) ---
00:04:37.297398 rt_sigreturn(0x1d)      = 5 <0.000010>
00:04:37.297469 rt_sigprocmask(SIG_BLOCK, [WINCH IO], NULL, 8) = 0 <0.000009>
00:04:37.297564 ioctl(3, FIONREAD, [0]) = 0 <0.000011>
00:04:37.297609 rt_sigprocmask(SIG_UNBLOCK, [WINCH IO], NULL, 8) = 0 <0.000009>
00:04:37.297713 rt_sigprocmask(SIG_BLOCK, [WINCH IO], NULL, 8) = 0 <0.000009>
00:04:37.297813 write(3, "\33[50;1H\33[K\33[4;1H", 16) = 16 <0.000011>
00:04:37.297858 rt_sigprocmask(SIG_UNBLOCK, [WINCH IO], NULL, 8) = 0 <0.000007>
00:04:37.298406 --- SIGIO (I/O possible) @ 0 (0) ---
00:04:37.298429 rt_sigreturn(0x1d)      = 17450765 <0.000010>
00:04:37.298476 ioctl(3, FIONREAD, [0]) = 0 <0.000009>
00:04:37.298879 rt_sigprocmask(SIG_BLOCK, [WINCH IO], NULL, 8) = 0 <0.000009>
00:04:37.298987 write(3, "\33[9;35H\33[?25l\33[3m7\33[0m\33[39;49m\33["..., 56) = 56 <0.000011>
00:04:37.299035 rt_sigprocmask(SIG_UNBLOCK, [WINCH IO], NULL, 8) = 0 <0.000008>
00:04:37.299108 ioctl(3, FIONREAD, [0]) = 0 <0.000009>
00:04:37.299163 pselect6(16, [3 4 5 6 8 11 13 15], NULL, NULL, {0, 0}, {NULL, 8}) = 0 (Timeout) <0.000013>
00:04:37.299213 read(9, 0x7fff19905540, 4096) = -1 EAGAIN (Resource temporarily unavailable) <0.000009>
00:04:37.299362 --- SIGIO (I/O possible) @ 0 (0) ---
00:04:37.299384 rt_sigreturn(0x1d)      = 58958800 <0.000009>
00:04:37.299439 close(9)                = 0 <0.000027>
00:04:37.299498 ioctl(3, FIONREAD, [0]) = 0 <0.000009>
00:04:37.300337 rt_sigprocmask(SIG_BLOCK, [WINCH IO], NULL, 8) = 0 <0.000010>
00:04:37.300474 rt_sigprocmask(SIG_UNBLOCK, [WINCH IO], NULL, 8) = 0 <0.000009>
00:04:37.300556 ioctl(3, FIONREAD, [0]) = 0 <0.000009>
00:04:37.300600 pselect6(16, [3 4 5 6 8 11 13 15], [], NULL, {0, 208572483}, {NULL, 8}) = 0 (Timeout) <0.208822>
00:04:37.509648 pselect6(16, [3 4 5 6 8 11 13 15], [], NULL, {0, 47428499}, {NULL, 8}) = 0 (Timeout) <0.047519>
00:04:37.557429 pselect6(16, [3 4 5 6 8 11 13 15], [], NULL, {0, 15296896}, {NULL, 8}) = 0 (Timeout) <0.015382>
00:04:37.573017 pselect6(16, [3 4 5 6 8 11 13 15], [], NULL, {0, 41145931}, {NULL, 8}) = 0 (Timeout) <0.041237>
00:04:37.614468 pselect6(16, [3 4 5 6 8 11 13 15], [], NULL, {0, 184612087}, {NULL, 8}) = 0 (Timeout) <0.184836>
00:04:37.800025 pselect6(16, [3 4 5 6 8 11 13 15], [], NULL, {0, 122673194}, {NULL, 8}) = 0 (Timeout) <0.122837>
00:04:37.923084 pselect6(16, [3 4 5 6 8 11 13 15], [], NULL, {0, 586093728}, {NULL, 8} <unfinished ...>


Now, to a different Emacs, where it returns immediately:

00:05:21.788956 pselect6(4, [3], [], NULL, {99712, 668876746}, {NULL, 8}) = 1 (in [3], left {99711, 566613909}) <1.102290>
00:05:22.891441 --- SIGIO (I/O possible) @ 0 (0) ---
00:05:22.891493 rt_sigreturn(0x1d)      = 1 <0.000010>
00:05:22.891545 ioctl(3, FIONREAD, [1]) = 0 <0.000010>
00:05:22.891601 read(3, "\30", 1)       = 1 <0.000011>
00:05:22.891653 ioctl(3, FIONREAD, [0]) = 0 <0.000008>
00:05:22.891693 ioctl(3, FIONREAD, [0]) = 0 <0.000009>
00:05:22.891895 rt_sigprocmask(SIG_BLOCK, [WINCH IO], NULL, 8) = 0 <0.000010>
00:05:22.892066 write(3, "\33[50;1H\33[K\33[11;1H", 17) = 17 <0.000013>
00:05:22.892122 rt_sigprocmask(SIG_UNBLOCK, [WINCH IO], NULL, 8) = 0 <0.000008>
00:05:22.892182 ioctl(3, FIONREAD, [0]) = 0 <0.000011>
00:05:22.892232 pselect6(4, [3], [], NULL, {0, 499940522}, {NULL, 8}) = ? ERESTARTNOHAND (To be restarted) <0.000099>
00:05:22.892370 --- SIGIO (I/O possible) @ 0 (0) ---
00:05:22.892391 rt_sigreturn(0x1d)      = -1 EINTR (Interrupted system call) <0.000010>
00:05:22.892439 ioctl(3, FIONREAD, [0]) = 0 <0.000009>
00:05:22.892484 pselect6(4, [3], [], NULL, {0, 499686972}, {NULL, 8}) = 1 (in [3], left {0, 131907968}) <0.367804>
00:05:23.260373 --- SIGIO (I/O possible) @ 0 (0) ---
00:05:23.260424 rt_sigreturn(0x1d)      = 1 <0.000020>
00:05:23.260839 ioctl(3, FIONREAD, [1]) = 0 <0.000009>
00:05:23.260882 read(3, "\5", 1)        = 1 <0.000009>
00:05:23.260931 ioctl(3, FIONREAD, [0]) = 0 <0.000017>
00:05:23.260997 ioctl(3, FIONREAD, [0]) = 0 <0.000007>

Again, the C-x C-e above.

00:05:23.261331 faccessat(AT_FDCWD, "/home/forcer/.", F_OK) = 0 <0.000015>
00:05:23.261406 faccessat(AT_FDCWD, "/home/forcer/bin/bash", X_OK) = -1 ENOENT (No such file or directory) <0.000011>
00:05:23.261460 faccessat(AT_FDCWD, "/home/forcer/bin/bash", X_OK) = -1 ENOENT (No such file or directory) <0.000009>
00:05:23.261504 faccessat(AT_FDCWD, "/usr/local/bin/bash", X_OK) = -1 ENOENT (No such file or directory) <0.000011>
00:05:23.261550 faccessat(AT_FDCWD, "/usr/bin/bash", X_OK) = -1 ENOENT (No such file or directory) <0.000011>
00:05:23.261604 faccessat(AT_FDCWD, "/bin/bash", X_OK) = 0 <0.000010>
00:05:23.261640 stat("/bin/bash", {st_mode=S_IFREG|0755, st_size=975488, ...}) = 0 <0.000008>
00:05:23.261709 open("/dev/ptmx", O_RDWR|O_NOCTTY|O_CLOEXEC) = 4 <0.000047>
00:05:23.261787 fcntl(4, F_SETFD, FD_CLOEXEC) = 0 <0.000007>
00:05:23.261818 rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0 <0.000006>
00:05:23.261852 ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0 <0.000008>
00:05:23.261887 ioctl(4, TIOCGPTN, [3]) = 0 <0.000006>
00:05:23.261917 stat("/dev/pts/3", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0 <0.000011>
00:05:23.261966 getuid()                = 1000 <0.000006>
00:05:23.261994 ioctl(4, TIOCSPTLCK, [0]) = 0 <0.000006>
00:05:23.262023 ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0 <0.000007>
00:05:23.262055 ioctl(4, TIOCGPTN, [3]) = 0 <0.000007>
00:05:23.262084 stat("/dev/pts/3", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0 <0.000008>
00:05:23.262130 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0 <0.000011>
00:05:23.262174 faccessat(AT_FDCWD, "/dev/pts/3", R_OK|W_OK) = 0 <0.000011>
00:05:23.262211 ioctl(4, FIONBIO, [1])  = 0 <0.000006>
00:05:23.262243 pipe2([5, 6], O_CLOEXEC) = 0 <0.000017>
00:05:23.262290 fcntl(4, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000007>
00:05:23.262319 fcntl(4, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000007>
00:05:23.262348 rt_sigprocmask(SIG_BLOCK, [INT CHLD], [], 8) = 0 <0.000007>
00:05:23.262383 vfork()                 = 19579 <0.000176>
00:05:23.262587 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000008>
00:05:23.262621 close(6)                = 0 <0.000007>
00:05:23.262652 read(5, "", 1)          = 0 <0.000019>
00:05:23.262711 close(5)                = 0 <0.000014>

Now, Emacs reads the output, just as above:

00:05:23.262767 pselect6(5, [4], [], NULL, {1, 999999432}, {NULL, 8}) = 1 (in [4], left {1, 997510494}) <0.002500>
00:05:23.265315 read(4, "foo\n", 4096)  = 4 <0.000010>
00:05:23.265375 pselect6(5, [], [], NULL, {0, 20000000}, {NULL, 8}) = 0 (Timeout) <0.020067>

The first pselect after the close() call now has a 2s delay as per the
accept-process-output call. Above, that pselect had a much shorter
delay. And here, Emacs does not go back to waiting another 2s, but just
destroys the buffer. Again, the rest isn't too interesting.

00:05:23.285512 rt_sigprocmask(SIG_BLOCK, [INT CHLD], [], 8) = 0 <0.000009>
00:05:23.285569 kill(4294947717, SIGHUP) = 0 <0.000130>
00:05:23.285728 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000009>
00:05:23.285762 --- SIGCHLD (Child exited) @ 0 (0) ---
00:05:23.285785 wait4(19579, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGHUP}], WNOHANG|WSTOPPED|WCONTINUED, NULL) = 19579 <0.000015>
00:05:23.285830 rt_sigreturn(0x11)      = 0 <0.000009>
00:05:23.286237 rt_sigprocmask(SIG_BLOCK, [WINCH IO], NULL, 8) = 0 <0.000016>
00:05:23.286418 write(3, "\33[50;1H\33[?25lt\33[11;1H\33[34h\33[?25h"..., 37) = 37 <0.000013>
00:05:23.286477 rt_sigprocmask(SIG_UNBLOCK, [WINCH IO], NULL, 8) = 0 <0.000008>
00:05:23.286525 ioctl(3, FIONREAD, [0]) = 0 <0.000009>
00:05:23.286585 pselect6(5, [3], NULL, NULL, {0, 0}, {NULL, 8}) = 0 (Timeout) <0.000010>
00:05:23.286626 read(4, 0x7fffe6202c90, 4096) = -1 EAGAIN (Resource temporarily unavailable) <0.000009>
00:05:23.286686 close(4)                = 0 <0.000019>
00:05:23.286949 rt_sigprocmask(SIG_BLOCK, [WINCH IO], NULL, 8) = 0 <0.000010>
00:05:23.287081 rt_sigprocmask(SIG_UNBLOCK, [WINCH IO], NULL, 8) = 0 <0.000009>
00:05:23.287169 rt_sigprocmask(SIG_BLOCK, [WINCH IO], NULL, 8) = 0 <0.000009>
00:05:23.287276 rt_sigprocmask(SIG_UNBLOCK, [WINCH IO], NULL, 8) = 0 <0.000012>
00:05:23.287332 ioctl(3, FIONREAD, [0]) = 0 <0.000009>
00:05:23.287372 pselect6(4, [3], [], NULL, {0, 499144117}, {NULL, 8}) = ? ERESTARTNOHAND (To be restarted) <0.000134>
00:05:23.287537 --- SIGIO (I/O possible) @ 0 (0) ---
00:05:23.287556 rt_sigreturn(0x1d)      = -1 EINTR (Interrupted system call) <0.000009>
00:05:23.287593 ioctl(3, FIONREAD, [0]) = 0 <0.000008>
00:05:23.287631 pselect6(4, [3], [], NULL, {0, 498884056}, {NULL, 8}) = 0 (Timeout) <0.499429>
00:05:23.787255 rt_sigprocmask(SIG_BLOCK, [WINCH IO], NULL, 8) = 0 <0.000010>
00:05:23.787413 rt_sigprocmask(SIG_UNBLOCK, [WINCH IO], NULL, 8) = 0 <0.000009>
00:05:23.787462 pselect6(4, [3], [], NULL, {29, 499910845}, {NULL, 8}) = 1 (in [3], left {28, 748227541}) <0.751721>
00:05:24.539262 --- SIGIO (I/O possible) @ 0 (0) ---
00:05:24.539300 rt_sigreturn(0x1d)      = 1 <0.000011>
00:05:24.539345 ioctl(3, FIONREAD, [3]) = 0 <0.000009>
00:05:24.539384 read(3, "\33OA", 3)     = 3 <0.000010>
00:05:24.539422 ioctl(3, FIONREAD, [0]) = 0 <0.000009>
00:05:24.539456 ioctl(3, FIONREAD, [0]) = 0 <0.000008>
00:05:24.540507 rt_sigprocmask(SIG_BLOCK, [WINCH IO], NULL, 8) = 0 <0.000011>
00:05:24.540670 write(3, "\33[50;1H\33[K\33[49;34H\33[?25l\33[3m9 \33["..., 70) = 70 <0.000013>
00:05:24.540722 rt_sigprocmask(SIG_UNBLOCK, [WINCH IO], NULL, 8) = 0 <0.000008>
00:05:24.540768 ioctl(3, FIONREAD, [0]) = 0 <0.000009>
00:05:24.540809 pselect6(4, [3], [], NULL, {0, 499953398}, {NULL, 8}) = ? ERESTARTNOHAND (To be restarted) <0.000393>
00:05:24.541237 --- SIGIO (I/O possible) @ 0 (0) ---
00:05:24.541256 rt_sigreturn(0x1d)      = -1 EINTR (Interrupted system call) <0.000009>
00:05:24.541293 ioctl(3, FIONREAD, [0]) = 0 <0.000008>
00:05:24.541331 pselect6(4, [3], [], NULL, {0, 499431017}, {NULL, 8}) = 0 (Timeout) <0.499831>
00:05:25.041328 pselect6(4, [3], [], NULL, {29, 499485945}, {NULL, 8} <unfinished ...>

Regards,
Jorgen





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#17647: 24.4.50; accept-process-output with process nil can miss output
  2014-05-30 22:24 bug#17647: 24.4.50; accept-process-output with process nil can miss output Jorgen Schaefer
@ 2014-06-02 23:21 ` Paul Eggert
  2014-06-03 19:46   ` Jorgen Schaefer
  2014-06-25 19:14 ` Glenn Morris
  1 sibling, 1 reply; 8+ messages in thread
From: Paul Eggert @ 2014-06-02 23:21 UTC (permalink / raw)
  To: Jorgen Schaefer; +Cc: 17647

Are you using gnutls?

What does M-x list-processes say, in a buggy Emacs?





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#17647: 24.4.50; accept-process-output with process nil can miss output
  2014-06-02 23:21 ` Paul Eggert
@ 2014-06-03 19:46   ` Jorgen Schaefer
  2014-06-10  3:40     ` Paul Eggert
  0 siblings, 1 reply; 8+ messages in thread
From: Jorgen Schaefer @ 2014-06-03 19:46 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 17647

On Mon, 02 Jun 2014 16:21:56 -0700
Paul Eggert <eggert@cs.ucla.edu> wrote:

> Are you using gnutls?

Not that I know. ldd `which emacs` does not list any gnutls links.

> What does M-x list-processes say, in a buggy Emacs?

As this blocks input, I can't actually run M-x list-processes while
reproducing the problem. Running list-processes right after
set-process-query-on-exit and before accept-process-output in my
example says:

test            run      *temp*-447149  /dev/pts/5   bash -c echo foo ; sleep 5



I should add that process filters are called at the start of
`accept-process-output', but it still hangs. This is different from
what I thought was the case initially, but that was a debugging mistake
due to output not being updated until after `accept-process-output'
returns.

That is, with time stamps:

(with-temp-buffer
  (let ((proc (start-process "test" (current-buffer) "bash" "-c"
                             "echo foo ; sleep 5")))
    (set-process-query-on-exit-flag proc nil)
    (set-process-filter proc (lambda (proc str)
                               (message "Filter: %s" (float-time))))
    (message "Before: %s" (float-time))
    (accept-process-output nil 2) 
    (message "After.: %s" (float-time))))

Before: 1401824504.1458359
Filter: 1401824504.148198
After.: 1401824506.146296

Regards,
Jorgen





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#17647: 24.4.50; accept-process-output with process nil can miss output
  2014-06-03 19:46   ` Jorgen Schaefer
@ 2014-06-10  3:40     ` Paul Eggert
  2014-06-13 15:48       ` Jorgen Schaefer
  0 siblings, 1 reply; 8+ messages in thread
From: Paul Eggert @ 2014-06-10  3:40 UTC (permalink / raw)
  To: Jorgen Schaefer; +Cc: 17647

[-- Attachment #1: Type: text/plain, Size: 218 bytes --]

I took a look at the nearby code and found some glitches that I can fix. 
  Could you please try the attached patch (relative to trunk bzr 
117299) and see whether it helps you, or at least does not hurt you? 
Thanks.

[-- Attachment #2: apo.patch --]
[-- Type: text/plain, Size: 10494 bytes --]

=== modified file 'src/ChangeLog'
--- src/ChangeLog	2014-06-10 03:13:41 +0000
+++ src/ChangeLog	2014-06-10 03:34:20 +0000
@@ -1,3 +1,13 @@
+2014-06-10  Paul Eggert  <eggert@cs.ucla.edu>
+
+	Avoid hangs in accept-process-output (Bug#17647).
+	* lisp.h, process.c (wait_reading_process_input): Return int, not bool.
+	All uses changed.
+	* process.c (SELECT_CANT_DO_WRITE_MASK): Remove macro, replacing with ...
+	(SELECT_CAN_DO_WRITE_MASK): ... new constant, with inverted sense.
+	All uses changed.
+	(status_notify): New arg WAIT_PROC.  Return int, not void.  All uses changed.
+
 2014-06-10  Dmitry Antipov  <dmantipov@yandex.ru>
 
 	* dispextern.h (PREPARE_FACE_FOR_DISPLAY): Remove as a duplicate of ...

=== modified file 'src/lisp.h'
--- src/lisp.h	2014-06-08 18:27:22 +0000
+++ src/lisp.h	2014-06-10 03:34:20 +0000
@@ -4178,10 +4178,8 @@
 /* Defined in process.c.  */
 extern Lisp_Object QCtype, Qlocal;
 extern void kill_buffer_processes (Lisp_Object);
-extern bool wait_reading_process_output (intmax_t, int, int, bool,
-					 Lisp_Object,
-					 struct Lisp_Process *,
-					 int);
+extern int wait_reading_process_output (intmax_t, int, int, bool, Lisp_Object,
+					struct Lisp_Process *, int);
 /* Max value for the first argument of wait_reading_process_output.  */
 #if __GNUC__ == 3 || (__GNUC__ == 4 && __GNUC_MINOR__ <= 5)
 /* Work around a bug in GCC 3.4.2, known to be fixed in GCC 4.6.3.

=== modified file 'src/process.c'
--- src/process.c	2014-06-09 20:31:06 +0000
+++ src/process.c	2014-06-10 03:34:20 +0000
@@ -224,8 +224,9 @@
 /* Only W32 has this, it really means that select can't take write mask.  */
 #ifdef BROKEN_NON_BLOCKING_CONNECT
 #undef NON_BLOCKING_CONNECT
-#define SELECT_CANT_DO_WRITE_MASK
+enum { SELECT_CAN_DO_WRITE_MASK = false };
 #else
+enum { SELECT_CAN_DO_WRITE_MASK = true };
 #ifndef NON_BLOCKING_CONNECT
 #ifdef HAVE_SELECT
 #if defined (HAVE_GETPEERNAME) || defined (GNU_LINUX)
@@ -281,7 +282,7 @@
 static bool keyboard_bit_set (fd_set *);
 #endif
 static void deactivate_process (Lisp_Object);
-static void status_notify (struct Lisp_Process *);
+static int status_notify (struct Lisp_Process *, struct Lisp_Process *);
 static int read_process_output (Lisp_Object, int);
 static void handle_child_signal (int);
 static void create_pty (Lisp_Object);
@@ -860,7 +861,7 @@
     {
       pset_status (p, list2 (Qexit, make_number (0)));
       p->tick = ++process_tick;
-      status_notify (p);
+      status_notify (p, NULL);
       redisplay_preserve_echo_area (13);
     }
   else
@@ -880,7 +881,7 @@
 	    pset_status (p, list2 (Qsignal, make_number (SIGKILL)));
 
 	  p->tick = ++process_tick;
-	  status_notify (p);
+	  status_notify (p, NULL);
 	  redisplay_preserve_echo_area (13);
 	}
     }
@@ -3986,12 +3987,13 @@
     nsecs = 0;
 
   return
-    (wait_reading_process_output (secs, nsecs, 0, 0,
+    ((wait_reading_process_output (secs, nsecs, 0, 0,
 				  Qnil,
 				  !NILP (process) ? XPROCESS (process) : NULL,
 				  NILP (just_this_one) ? 0 :
 				  !INTEGERP (just_this_one) ? 1 : -1)
-     ? Qt : Qnil);
+      <= 0)
+     ? Qnil : Qt);
 }
 
 /* Accept a connection for server process SERVER on CHANNEL.  */
@@ -4262,10 +4264,11 @@
      (suspending output from other processes).  A negative value
      means don't run any timers either.
 
-   Return true if we received input from WAIT_PROC, or from any
-   process if WAIT_PROC is null.  */
+   Return positive if we received input from WAIT_PROC (or from any
+   process if WAIT_PROC is null), zero if we attempted to receive
+   input but got none, and negative if we didn't even try.  */
 
-bool
+int
 wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 			     bool do_display,
 			     Lisp_Object wait_for_cell,
@@ -4280,8 +4283,7 @@
   int xerrno;
   Lisp_Object proc;
   struct timespec timeout, end_time;
-  int wait_channel = -1;
-  bool got_some_input = 0;
+  int got_some_input = -1;
   ptrdiff_t count = SPECPDL_INDEX ();
 
   FD_ZERO (&Available);
@@ -4292,10 +4294,6 @@
 	   && EQ (XCAR (wait_proc->status), Qexit)))
     message1 ("Blocking call to accept-process-output with quit inhibited!!");
 
-  /* If wait_proc is a process to watch, set wait_channel accordingly.  */
-  if (wait_proc != NULL)
-    wait_channel = wait_proc->infd;
-
   record_unwind_protect_int (wait_reading_process_output_unwind,
 			     waiting_for_user_input_p);
   waiting_for_user_input_p = read_kbd;
@@ -4332,6 +4330,10 @@
       if (! NILP (wait_for_cell) && ! NILP (XCAR (wait_for_cell)))
 	break;
 
+      /* After reading input, vacuum up any leftovers without waiting.  */
+      if (0 <= got_some_input)
+	nsecs = -1;
+
       /* Compute time from now till when time limit is up.  */
       /* Exit if already run out.  */
       if (nsecs < 0)
@@ -4450,7 +4452,7 @@
 	      /* It's okay for us to do this and then continue with
 		 the loop, since timeout has already been zeroed out.  */
 	      clear_waiting_for_input ();
-	      status_notify (NULL);
+	      got_some_input = status_notify (NULL, wait_proc);
 	      if (do_display) redisplay_preserve_echo_area (13);
 	    }
 	}
@@ -4472,18 +4474,23 @@
 	  while (wait_proc->infd >= 0)
 	    {
 	      int nread = read_process_output (proc, wait_proc->infd);
-
-	      if (nread == 0)
-		break;
-
-	      if (nread > 0)
-		got_some_input = read_some_bytes = 1;
-	      else if (nread == -1 && (errno == EIO || errno == EAGAIN))
-		break;
+	      if (nread < 0)
+		{
+		  if (errno == EIO || errno == EAGAIN)
+		    break;
 #ifdef EWOULDBLOCK
-	      else if (nread == -1 && EWOULDBLOCK == errno)
-		break;
+		  if (errno == EWOULDBLOCK)
+		    break;
 #endif
+		}
+	      else
+		{
+		  if (got_some_input < nread)
+		    got_some_input = nread;
+		  if (nread == 0)
+		    break;
+		  read_some_bytes = true;
+		}
 	    }
 	  if (read_some_bytes && do_display)
 	    redisplay_preserve_echo_area (10);
@@ -4514,12 +4521,8 @@
 	  else
 	    Available = input_wait_mask;
           Writeok = write_mask;
-#ifdef SELECT_CANT_DO_WRITE_MASK
-          check_write = 0;
-#else
-          check_write = 1;
-#endif
- 	  check_delay = wait_channel >= 0 ? 0 : process_output_delay_count;
+ 	  check_delay = wait_proc ? 0 : process_output_delay_count;
+	  check_write = SELECT_CAN_DO_WRITE_MASK;
 	}
 
       /* If frame size has changed or the window is newly mapped,
@@ -4545,6 +4548,7 @@
 	{
 	  nfds = read_kbd ? 0 : 1;
 	  no_avail = 1;
+	  FD_ZERO (&Available);
 	}
 
       if (!no_avail)
@@ -4554,7 +4558,7 @@
 	  /* Set the timeout for adaptive read buffering if any
 	     process has non-zero read_output_skip and non-zero
 	     read_output_delay, and we are not reading output for a
-	     specific wait_channel.  It is not executed if
+	     specific process.  It is not executed if
 	     Vprocess_adaptive_read_buffering is nil.  */
 	  if (process_output_skip && check_delay > 0)
 	    {
@@ -4667,12 +4671,6 @@
 	    report_file_errno ("Failed select", Qnil, xerrno);
 	}
 
-      if (no_avail)
-	{
-	  FD_ZERO (&Available);
-	  check_write = 0;
-	}
-
       /* Check for keyboard input */
       /* If there is any, return immediately
 	 to give it higher priority than subprocesses */
@@ -4739,9 +4737,6 @@
 	handle_input_available_signal (SIGIO);
 #endif
 
-      if (! wait_proc)
-	got_some_input |= nfds > 0;
-
       /* If checking input just got us a size-change event from X,
 	 obey it now if we should.  */
       if (read_kbd || ! NILP (wait_for_cell))
@@ -4773,12 +4768,6 @@
 	      /* If waiting for this channel, arrange to return as
 		 soon as no more input to be processed.  No more
 		 waiting.  */
-	      if (wait_channel == channel)
-		{
-		  wait_channel = -1;
-		  nsecs = -1;
-		  got_some_input = 1;
-		}
 	      proc = chan_process[channel];
 	      if (NILP (proc))
 		continue;
@@ -4794,6 +4783,8 @@
 		 buffered-ahead character if we have one.  */
 
 	      nread = read_process_output (proc, channel);
+	      if ((!wait_proc || wait_proc == XPROCESS (proc)) && got_some_input < nread)
+		got_some_input = nread;
 	      if (nread > 0)
 		{
 		  /* Since read_process_output can run a filter,
@@ -5814,7 +5805,7 @@
       p->tick = ++process_tick;
       if (!nomsg)
 	{
-	  status_notify (NULL);
+	  status_notify (NULL, NULL);
 	  redisplay_preserve_echo_area (13);
 	}
     }
@@ -6337,14 +6328,20 @@
 /* Report all recent events of a change in process status
    (either run the sentinel or output a message).
    This is usually done while Emacs is waiting for keyboard input
-   but can be done at other times.  */
-
-static void
-status_notify (struct Lisp_Process *deleting_process)
+   but can be done at other times.
+
+   Return positive if any input was received from WAIT_PROC (or from
+   any process if WAIT_PROC is null), zero if input was attempted but
+   none received, and negative if we didn't even try.  */
+
+static int
+status_notify (struct Lisp_Process *deleting_process,
+	       struct Lisp_Process *wait_proc)
 {
-  register Lisp_Object proc;
+  Lisp_Object proc;
   Lisp_Object tail, msg;
   struct gcpro gcpro1, gcpro2;
+  int got_some_input = -1;
 
   tail = Qnil;
   msg = Qnil;
@@ -6374,8 +6371,14 @@
 		 /* Network or serial process not stopped:  */
 		 && ! EQ (p->command, Qt)
 		 && p->infd >= 0
-		 && p != deleting_process
-		 && read_process_output (proc, p->infd) > 0);
+		 && p != deleting_process)
+	    {
+	      int nread = read_process_output (proc, p->infd);
+	      if (got_some_input < nread)
+		got_some_input = nread;
+	      if (nread <= 0)
+		break;
+	    }
 
 	  /* Get the text to use for the message.  */
 	  if (p->raw_status_new)
@@ -6407,6 +6410,7 @@
 
   update_mode_lines = 24;  /* In case buffers use %s in mode-line-format.  */
   UNGCPRO;
+  return got_some_input;
 }
 
 DEFUN ("internal-default-process-sentinel", Finternal_default_process_sentinel,
@@ -6618,9 +6622,11 @@
    DO_DISPLAY means redisplay should be done to show subprocess
    output that arrives.
 
-   Return true if we received input from any process.  */
+   Return positive if we received input from WAIT_PROC (or from any
+   process if WAIT_PROC is null), zero if we attempted to receive
+   input but got none, and negative if we didn't even try.  */
 
-bool
+int
 wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 			     bool do_display,
 			     Lisp_Object wait_for_cell,
@@ -6808,7 +6814,7 @@
 
   start_polling ();
 
-  return 0;
+  return -1;
 }
 
 #endif	/* not subprocesses */


^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#17647: 24.4.50; accept-process-output with process nil can miss output
  2014-06-10  3:40     ` Paul Eggert
@ 2014-06-13 15:48       ` Jorgen Schaefer
  2014-06-13 15:55         ` Paul Eggert
  0 siblings, 1 reply; 8+ messages in thread
From: Jorgen Schaefer @ 2014-06-13 15:48 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 17647

On Mon, 09 Jun 2014 20:40:12 -0700
Paul Eggert <eggert@cs.ucla.edu> wrote:

> I took a look at the nearby code and found some glitches that I can
> fix. Could you please try the attached patch (relative to trunk bzr 
> 117299) and see whether it helps you, or at least does not hurt you? 
> Thanks.

I've been running the patched version for three days now. No problems,
and the bug did not reoccur. It seems you caught it.

Thank you!

Jorgen





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#17647: 24.4.50; accept-process-output with process nil can miss output
  2014-06-13 15:48       ` Jorgen Schaefer
@ 2014-06-13 15:55         ` Paul Eggert
  0 siblings, 0 replies; 8+ messages in thread
From: Paul Eggert @ 2014-06-13 15:55 UTC (permalink / raw)
  To: Jorgen Schaefer; +Cc: 17647-done

Thanks for checking it.  I installed that fix into the trunk.





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#17647: 24.4.50; accept-process-output with process nil can miss output
  2014-05-30 22:24 bug#17647: 24.4.50; accept-process-output with process nil can miss output Jorgen Schaefer
  2014-06-02 23:21 ` Paul Eggert
@ 2014-06-25 19:14 ` Glenn Morris
  2014-06-25 22:43   ` Glenn Morris
  1 sibling, 1 reply; 8+ messages in thread
From: Glenn Morris @ 2014-06-25 19:14 UTC (permalink / raw)
  To: 17647; +Cc: eggert, forcer


It's possible this change did... something.

If I run `make check' in the "current" (before 117400) trunk, I get:

2 unexpected results:
   FAILED  warning-predicate-function-gcc
   FAILED  warning-predicate-rx-gcc

You can see the same failures on hydra.
They seem to start at

http://hydra.nixos.org/build/11864228

which corresponds to

http://lists.gnu.org/archive/html/emacs-diffs/2014-06/msg00165.html


We did not notice these failures because for some reason Emacs still
exits with status 0, despite (kill-emacs 1) being called in
ert-run-tests-batch-and-exit (???).


To reproduce the issue, do (prior to my latest change to flymake-tests.el):

cd test/automated
../../src/emacs -Q -batch -l ./flymake-tests.el \
   -f ert-run-tests-batch-and-exit < /dev/null

It returns far too quickly, as if it did not sleep at all.
I found that (sleep-for 1.0) did nothing, but 
(sleep-for 0.5) (sleep-for 0.5) worked.


Frankly I have no idea what is going on, with the sleeping nor with the
(kill-emacs 1). But it seems to have something to do with processes, and
your change is related to processes, so... over to you! ;)





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#17647: 24.4.50; accept-process-output with process nil can miss output
  2014-06-25 19:14 ` Glenn Morris
@ 2014-06-25 22:43   ` Glenn Morris
  0 siblings, 0 replies; 8+ messages in thread
From: Glenn Morris @ 2014-06-25 22:43 UTC (permalink / raw)
  To: 17647; +Cc: eggert, forcer


Glenn Morris wrote:

> It returns far too quickly, as if it did not sleep at all.
> I found that (sleep-for 1.0) did nothing, but 
> (sleep-for 0.5) (sleep-for 0.5) worked.

This is http://debbugs.gnu.org/15990, so probably unrelated to recent changes.

> Frankly I have no idea what is going on,

I felt like repeating that...





^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-06-25 22:43 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-05-30 22:24 bug#17647: 24.4.50; accept-process-output with process nil can miss output Jorgen Schaefer
2014-06-02 23:21 ` Paul Eggert
2014-06-03 19:46   ` Jorgen Schaefer
2014-06-10  3:40     ` Paul Eggert
2014-06-13 15:48       ` Jorgen Schaefer
2014-06-13 15:55         ` Paul Eggert
2014-06-25 19:14 ` Glenn Morris
2014-06-25 22:43   ` Glenn Morris

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).