* Re: master 64f2c96 2/2: Make a process test faster.
[not found] ` <20210102125536.C3FED2094D@vcs0.savannah.gnu.org>
@ 2021-01-03 17:41 ` Glenn Morris
2021-01-04 21:24 ` Philipp Stephani
0 siblings, 1 reply; 10+ messages in thread
From: Glenn Morris @ 2021-01-03 17:41 UTC (permalink / raw)
To: emacs-devel; +Cc: Philipp Stephani
> branch: master
> commit 64f2c96cbe3ba803c4026c976c425771911e29e3
> Make a process test faster.
Hi - this makes the test fail for me on CentOS 8.3, and also for
https://hydra.nixos.org/build/134145457
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: master 64f2c96 2/2: Make a process test faster.
2021-01-03 17:41 ` master 64f2c96 2/2: Make a process test faster Glenn Morris
@ 2021-01-04 21:24 ` Philipp Stephani
2021-01-05 14:27 ` Philipp Stephani
0 siblings, 1 reply; 10+ messages in thread
From: Philipp Stephani @ 2021-01-04 21:24 UTC (permalink / raw)
To: Glenn Morris; +Cc: Philipp Stephani, Emacs developers
Am So., 3. Jan. 2021 um 18:42 Uhr schrieb Glenn Morris <rgm@gnu.org>:
>
>
> > branch: master
> > commit 64f2c96cbe3ba803c4026c976c425771911e29e3
>
> > Make a process test faster.
>
> Hi - this makes the test fail for me on CentOS 8.3, and also for
> https://hydra.nixos.org/build/134145457
I've now tried to make this test more robust with commit
57e872ac757d7a003f4a7f132a08798c3a1a6e97.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: master 64f2c96 2/2: Make a process test faster.
2021-01-04 21:24 ` Philipp Stephani
@ 2021-01-05 14:27 ` Philipp Stephani
2021-01-08 1:38 ` Glenn Morris
0 siblings, 1 reply; 10+ messages in thread
From: Philipp Stephani @ 2021-01-05 14:27 UTC (permalink / raw)
To: Philipp Stephani; +Cc: Glenn Morris, Emacs developers
Am Mo., 4. Jan. 2021 um 22:24 Uhr schrieb Philipp Stephani
<p.stephani2@gmail.com>:
>
> Am So., 3. Jan. 2021 um 18:42 Uhr schrieb Glenn Morris <rgm@gnu.org>:
> >
> >
> > > branch: master
> > > commit 64f2c96cbe3ba803c4026c976c425771911e29e3
> >
> > > Make a process test faster.
> >
> > Hi - this makes the test fail for me on CentOS 8.3, and also for
> > https://hydra.nixos.org/build/134145457
>
> I've now tried to make this test more robust with commit
> 57e872ac757d7a003f4a7f132a08798c3a1a6e97.
Looks like that helped: https://hydra.nixos.org/build/134431253
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: master 64f2c96 2/2: Make a process test faster.
2021-01-05 14:27 ` Philipp Stephani
@ 2021-01-08 1:38 ` Glenn Morris
2021-01-08 9:01 ` Michael Albinus
2021-01-09 18:12 ` Philipp Stephani
0 siblings, 2 replies; 10+ messages in thread
From: Glenn Morris @ 2021-01-08 1:38 UTC (permalink / raw)
To: Philipp Stephani; +Cc: Philipp Stephani, Emacs developers
It is now unstable, ie mostly passes but sometimes fails.
Ref eg
https://hydra.nixos.org/build/134561612
https://hydra.nixos.org/build/134556213
Same seems to apply to at least:
tramp-test31-interrupt-process
https://hydra.nixos.org/build/134553525
mml-secure-en-decrypt-2
https://hydra.nixos.org/build/134547492
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: master 64f2c96 2/2: Make a process test faster.
2021-01-08 1:38 ` Glenn Morris
@ 2021-01-08 9:01 ` Michael Albinus
2021-01-09 18:12 ` Philipp Stephani
1 sibling, 0 replies; 10+ messages in thread
From: Michael Albinus @ 2021-01-08 9:01 UTC (permalink / raw)
To: Glenn Morris; +Cc: Philipp Stephani, Philipp Stephani, Emacs developers
Glenn Morris <rgm@gnu.org> writes:
> Same seems to apply to at least:
> tramp-test31-interrupt-process
> https://hydra.nixos.org/build/134553525
This was is already marked as unstable on emba. So I've done it also for hydra.
Best regards, Michael.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: master 64f2c96 2/2: Make a process test faster.
2021-01-08 1:38 ` Glenn Morris
2021-01-08 9:01 ` Michael Albinus
@ 2021-01-09 18:12 ` Philipp Stephani
2021-01-09 21:00 ` Philipp Stephani
1 sibling, 1 reply; 10+ messages in thread
From: Philipp Stephani @ 2021-01-09 18:12 UTC (permalink / raw)
To: Glenn Morris; +Cc: Philipp Stephani, Emacs developers
Am Fr., 8. Jan. 2021 um 02:38 Uhr schrieb Glenn Morris <rgm@gnu.org>:
>
>
> It is now unstable, ie mostly passes but sometimes fails.
>
> Ref eg
> https://hydra.nixos.org/build/134561612
> https://hydra.nixos.org/build/134556213
Interesting. I can't see why it would deadlock. There are only 4 possible cases:
1. vfork and exec both succeed. Then the code will immediately close
stdin for cat, causing it to terminate, and accept-process-output
should return immediately after one or two iterations.
2. vfork and exec both fail due to running out of file descriptors.
Then the process object doesn't exist in the first place, and we don't
try to read from it.
3. Some unexpected error (out of memory, executable suddenly vanishes,
...). That should result in a test failure, but no deadlock.
4. vfork succeeds, child setup before exec fails. That's the
interesting case, which I think is happening here: the child tries to
reopen the TTY file, but that fails because of too many open files. In
this case, since make-process waits for exec/exit, the process object
exists, but isn't alive any more. However, in that case
accept-process-output should also return immediately.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: master 64f2c96 2/2: Make a process test faster.
2021-01-09 18:12 ` Philipp Stephani
@ 2021-01-09 21:00 ` Philipp Stephani
2021-01-10 16:39 ` Eli Zaretskii
0 siblings, 1 reply; 10+ messages in thread
From: Philipp Stephani @ 2021-01-09 21:00 UTC (permalink / raw)
To: Glenn Morris; +Cc: Philipp Stephani, Emacs developers
Am Sa., 9. Jan. 2021 um 19:12 Uhr schrieb Philipp Stephani
<p.stephani2@gmail.com>:
>
> Am Fr., 8. Jan. 2021 um 02:38 Uhr schrieb Glenn Morris <rgm@gnu.org>:
> >
> >
> > It is now unstable, ie mostly passes but sometimes fails.
> >
> > Ref eg
> > https://hydra.nixos.org/build/134561612
> > https://hydra.nixos.org/build/134556213
>
> Interesting. I can't see why it would deadlock.
OK, I think I've found 2 bugs.
1. I've introduced one myself when writing emacs_spawn - we need to
mark processes as live before unblocking SIGCHLD, otherwise the
SIGCHLD handler would never call waitpid for them.
2. The other one happens if SIGCHLD is signaled during
wait_reading_process_output. Then wait_reading_process_output will
wait forever, since the stdout FD never gets closed and it doesn't see
the process status update in time. Not sure how to fix this - the most
elegant way would be pidfd/pdfork, but that's not available on all
systems.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: master 64f2c96 2/2: Make a process test faster.
2021-01-09 21:00 ` Philipp Stephani
@ 2021-01-10 16:39 ` Eli Zaretskii
2021-01-10 17:17 ` Philipp Stephani
0 siblings, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2021-01-10 16:39 UTC (permalink / raw)
To: Philipp Stephani; +Cc: phst, rgm, emacs-devel
> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Sat, 9 Jan 2021 22:00:45 +0100
> Cc: Philipp Stephani <phst@google.com>, Emacs developers <emacs-devel@gnu.org>
> 2. The other one happens if SIGCHLD is signaled during
> wait_reading_process_output. Then wait_reading_process_output will
> wait forever, since the stdout FD never gets closed and it doesn't see
> the process status update in time.
Do you mean that wait_reading_process_output has this problem in
general, or just in this particular scenario? If the former, I'm
surprised, as we are using this code for a very long time. If the
latter, can you elaborate on the situation, and what does SIGCHLD have
to do with closing stdout?
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: master 64f2c96 2/2: Make a process test faster.
2021-01-10 16:39 ` Eli Zaretskii
@ 2021-01-10 17:17 ` Philipp Stephani
2021-01-16 19:19 ` Philipp Stephani
0 siblings, 1 reply; 10+ messages in thread
From: Philipp Stephani @ 2021-01-10 17:17 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Philipp Stephani, Glenn Morris, Emacs developers
Am So., 10. Jan. 2021 um 17:39 Uhr schrieb Eli Zaretskii <eliz@gnu.org>:
>
> > From: Philipp Stephani <p.stephani2@gmail.com>
> > Date: Sat, 9 Jan 2021 22:00:45 +0100
> > Cc: Philipp Stephani <phst@google.com>, Emacs developers <emacs-devel@gnu.org>
> > 2. The other one happens if SIGCHLD is signaled during
> > wait_reading_process_output. Then wait_reading_process_output will
> > wait forever, since the stdout FD never gets closed and it doesn't see
> > the process status update in time.
>
> Do you mean that wait_reading_process_output has this problem in
> general, or just in this particular scenario? If the former, I'm
> surprised, as we are using this code for a very long time. If the
> latter, can you elaborate on the situation, and what does SIGCHLD have
> to do with closing stdout?
I believe is a rather general problem. Anecdotally I've heard
occasionally about problems with accept-process-output deadlocking,
which might be related. See e.g. some of the other tests in
process-tests.el. AIUI, the sequence of events is as follows:
1. (accept-process-output PROC)
2. Here the process isn't finished yet. accept-process-output waits
for something to become available on stdout.
3. If the process doesn't write anything to stdout,
accept-process-output will block.
4. The process exits without having written anything.
5. Stdout is closed.
6. pselect returns, but since the process hasn't written anything,
wait_reading_process_output doesn't return.
7. Emacs receives SIGCHLD.
8. Emacs tries to notify accept-process-output, but it's too late - we
are already within the pselect call, which now hangs forever.
To test that, I added some printf statements, and indeed I saw that
wait_reading_process_output was entered, then SIGCHLD was received,
but wait_reading_process_output continued to hang.
I remedied this on the scratch/sigchild-fd branch by having the
SIGCHLD handler signal a pipe that pselect watches. That fixes the
deadlock entirely for me on GNU/Linux and macOS.
(Going forward we might want to use pidfds if available, they seem
simpler and less error-prone than signals.)
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: master 64f2c96 2/2: Make a process test faster.
2021-01-10 17:17 ` Philipp Stephani
@ 2021-01-16 19:19 ` Philipp Stephani
0 siblings, 0 replies; 10+ messages in thread
From: Philipp Stephani @ 2021-01-16 19:19 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Philipp Stephani, Glenn Morris, Emacs developers
Am So., 10. Jan. 2021 um 18:17 Uhr schrieb Philipp Stephani
<p.stephani2@gmail.com>:
> I remedied this on the scratch/sigchild-fd branch by having the
> SIGCHLD handler signal a pipe that pselect watches. That fixes the
> deadlock entirely for me on GNU/Linux and macOS.
After checking that this branch doesn't break the build on either
Windows, macOS, or GNU/Linux, I've pushed it to master.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2021-01-16 19:19 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20210102125533.3832.43853@vcs0.savannah.gnu.org>
[not found] ` <20210102125536.C3FED2094D@vcs0.savannah.gnu.org>
2021-01-03 17:41 ` master 64f2c96 2/2: Make a process test faster Glenn Morris
2021-01-04 21:24 ` Philipp Stephani
2021-01-05 14:27 ` Philipp Stephani
2021-01-08 1:38 ` Glenn Morris
2021-01-08 9:01 ` Michael Albinus
2021-01-09 18:12 ` Philipp Stephani
2021-01-09 21:00 ` Philipp Stephani
2021-01-10 16:39 ` Eli Zaretskii
2021-01-10 17:17 ` Philipp Stephani
2021-01-16 19:19 ` Philipp Stephani
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).