unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
@ 2021-01-23 19:49 Eli Zaretskii
  2021-01-23 20:27 ` Eli Zaretskii
  2021-01-23 20:34 ` Philipp Stephani
  0 siblings, 2 replies; 17+ messages in thread
From: Eli Zaretskii @ 2021-01-23 19:49 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: Philipp Stephani, emacs-devel

This additional test fails on MS-Windows:

  Test process-tests/multiple-threads-waiting backtrace:
    signal(ert-test-failed (((should (eql (process-exit-status process)
    ert-fail(((should (eql (process-exit-status process) 0)) :form (eql
    (if (unwind-protect (setq value-344 (apply fn-342 args-343)) (setq f
    (let (form-description-346) (if (unwind-protect (setq value-344 (app
    (let ((value-344 'ert-form-evaluation-aborted-345)) (let (form-descr
    (let* ((fn-342 #'eql) (args-343 (condition-case err (let ((signal-ho
    (while (and (consp --cl-var--) (consp --cl-var--)) (progn (setq proc
    (let ((--cl-var-- processes) (process nil) (--cl-var-- threads) (thr
    (let ((threads nil) (cat (executable-find "cat"))) (let ((value-335
    (unwind-protect (let ((threads nil) (cat (executable-find "cat"))) (
    (let ((processes nil)) (unwind-protect (let ((threads nil) (cat (exe
    (progn (let ((processes nil)) (unwind-protect (let ((threads nil) (c
    (unwind-protect (progn (let ((processes nil)) (unwind-protect (let (
    (let* ((-with-timeout-timer- (run-with-timer 60 nil #'(lambda nil (t
    (catch 'timeout (let* ((-with-timeout-timer- (run-with-timer 60 nil
    (let ((-with-timeout-value- (catch 'timeout (let* ((-with-timeout-ti
    (closure (t) nil (let* ((fn-330 #'fboundp) (args-331 (condition-case
    ert--run-test-internal(#s(ert--test-execution-info :test #s(ert-test
    ert-run-test(#s(ert-test :name process-tests/multiple-threads-waitin
    ert-run-or-rerun-test(#s(ert--stats :selector ... :tests ... :test-m
    ert-run-tests((not (tag :unstable)) #f(compiled-function (event-type
    ert-run-tests-batch((not (tag :unstable)))
    ert-run-tests-batch-and-exit((not (tag :unstable)))
    eval((ert-run-tests-batch-and-exit '(not (tag :unstable))) t)
    command-line-1(("-L" ";." "-l" "ert" "-l" "src/process-tests.el" "--
    command-line()
    normal-top-level()
  Test process-tests/multiple-threads-waiting condition:
      (ert-test-failed
       ((should
	 (eql
	  (process-exit-status process)
	  0))
	:form
	(eql 1 0)
	:value nil))
     FAILED  21/26  process-tests/multiple-threads-waiting (8.859375 sec)

I did verify that process-send-eof does cause 'cat' to exit, so the
problem is likely in some issue related to threads and what the test
assumes regarding what should happen here.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-01-23 19:49 master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes Eli Zaretskii
@ 2021-01-23 20:27 ` Eli Zaretskii
  2021-01-23 20:33   ` Philipp Stephani
  2021-01-23 20:34 ` Philipp Stephani
  1 sibling, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2021-01-23 20:27 UTC (permalink / raw)
  To: p.stephani2; +Cc: phst, emacs-devel

> Date: Sat, 23 Jan 2021 21:49:18 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: Philipp Stephani <phst@google.com>, emacs-devel@gnu.org
> 
>   Test process-tests/multiple-threads-waiting condition:
>       (ert-test-failed
>        ((should
> 	 (eql
> 	  (process-exit-status process)
> 	  0))
> 	:form
> 	(eql 1 0)
> 	:value nil))
>      FAILED  21/26  process-tests/multiple-threads-waiting (8.859375 sec)

And if I run just this one test (using SELECTOR=...), it hangs in the
first call to thread-join.

Can you describe the idea of the test, and in particular why did you
expect the threads to start running?  IOW, which part of this test is
supposed to force the main thread to yield so that one of the other
threads starts running?



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-01-23 20:27 ` Eli Zaretskii
@ 2021-01-23 20:33   ` Philipp Stephani
  2021-01-24  5:26     ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Philipp Stephani @ 2021-01-23 20:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Philipp Stephani, Emacs developers

Am Sa., 23. Jan. 2021 um 21:27 Uhr schrieb Eli Zaretskii <eliz@gnu.org>:
>
> > Date: Sat, 23 Jan 2021 21:49:18 +0200
> > From: Eli Zaretskii <eliz@gnu.org>
> > Cc: Philipp Stephani <phst@google.com>, emacs-devel@gnu.org
> >
> >   Test process-tests/multiple-threads-waiting condition:
> >       (ert-test-failed
> >        ((should
> >        (eql
> >         (process-exit-status process)
> >         0))
> >       :form
> >       (eql 1 0)
> >       :value nil))
> >      FAILED  21/26  process-tests/multiple-threads-waiting (8.859375 sec)
>
> And if I run just this one test (using SELECTOR=...), it hangs in the
> first call to thread-join.
>
> Can you describe the idea of the test,

This test essentially implements your request to verify the
interactions between the self-pipe for SIGCHLD and threads. Seems it
discovered some issues on Windows instead! The test works fine on
GNU/Linux and macOS.

> and in particular why did you
> expect the threads to start running?  IOW, which part of this test is
> supposed to force the main thread to yield so that one of the other
> threads starts running?

thread-join should yield.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-01-23 19:49 master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes Eli Zaretskii
  2021-01-23 20:27 ` Eli Zaretskii
@ 2021-01-23 20:34 ` Philipp Stephani
  2021-01-24  3:36   ` Eli Zaretskii
  1 sibling, 1 reply; 17+ messages in thread
From: Philipp Stephani @ 2021-01-23 20:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Philipp Stephani, Emacs developers

Am Sa., 23. Jan. 2021 um 20:49 Uhr schrieb Eli Zaretskii <eliz@gnu.org>:
>
> This additional test fails on MS-Windows:
>
>   Test process-tests/multiple-threads-waiting backtrace:
>     signal(ert-test-failed (((should (eql (process-exit-status process)
>     ert-fail(((should (eql (process-exit-status process) 0)) :form (eql
>     (if (unwind-protect (setq value-344 (apply fn-342 args-343)) (setq f
>     (let (form-description-346) (if (unwind-protect (setq value-344 (app
>     (let ((value-344 'ert-form-evaluation-aborted-345)) (let (form-descr
>     (let* ((fn-342 #'eql) (args-343 (condition-case err (let ((signal-ho
>     (while (and (consp --cl-var--) (consp --cl-var--)) (progn (setq proc
>     (let ((--cl-var-- processes) (process nil) (--cl-var-- threads) (thr
>     (let ((threads nil) (cat (executable-find "cat"))) (let ((value-335
>     (unwind-protect (let ((threads nil) (cat (executable-find "cat"))) (
>     (let ((processes nil)) (unwind-protect (let ((threads nil) (cat (exe
>     (progn (let ((processes nil)) (unwind-protect (let ((threads nil) (c
>     (unwind-protect (progn (let ((processes nil)) (unwind-protect (let (
>     (let* ((-with-timeout-timer- (run-with-timer 60 nil #'(lambda nil (t
>     (catch 'timeout (let* ((-with-timeout-timer- (run-with-timer 60 nil
>     (let ((-with-timeout-value- (catch 'timeout (let* ((-with-timeout-ti
>     (closure (t) nil (let* ((fn-330 #'fboundp) (args-331 (condition-case
>     ert--run-test-internal(#s(ert--test-execution-info :test #s(ert-test
>     ert-run-test(#s(ert-test :name process-tests/multiple-threads-waitin
>     ert-run-or-rerun-test(#s(ert--stats :selector ... :tests ... :test-m
>     ert-run-tests((not (tag :unstable)) #f(compiled-function (event-type
>     ert-run-tests-batch((not (tag :unstable)))
>     ert-run-tests-batch-and-exit((not (tag :unstable)))
>     eval((ert-run-tests-batch-and-exit '(not (tag :unstable))) t)
>     command-line-1(("-L" ";." "-l" "ert" "-l" "src/process-tests.el" "--
>     command-line()
>     normal-top-level()
>   Test process-tests/multiple-threads-waiting condition:
>       (ert-test-failed
>        ((should
>          (eql
>           (process-exit-status process)
>           0))
>         :form
>         (eql 1 0)
>         :value nil))
>      FAILED  21/26  process-tests/multiple-threads-waiting (8.859375 sec)
>
> I did verify that process-send-eof does cause 'cat' to exit, so the
> problem is likely in some issue related to threads and what the test
> assumes regarding what should happen here.

What's surprising here is that "cat" exited with an exit status of 1,
shouldn't that be normally impossible?



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-01-23 20:34 ` Philipp Stephani
@ 2021-01-24  3:36   ` Eli Zaretskii
  2021-01-24 14:43     ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2021-01-24  3:36 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: phst, emacs-devel

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Sat, 23 Jan 2021 21:34:28 +0100
> Cc: Emacs developers <emacs-devel@gnu.org>, Philipp Stephani <phst@google.com>
> > I did verify that process-send-eof does cause 'cat' to exit, so the
> > problem is likely in some issue related to threads and what the test
> > assumes regarding what should happen here.
> 
> What's surprising here is that "cat" exited with an exit status of 1,
> shouldn't that be normally impossible?

I'm not sure this 1 exit status is not a red herring: when I manually
perform the steps for a single process and without any threads, 'cat'
exits with status zero.  And when I run that single test, it hangs.
So I think the exit code of 1 reaped by some thread is just some side
effect of running previous tests in process-tests.el.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-01-23 20:33   ` Philipp Stephani
@ 2021-01-24  5:26     ` Eli Zaretskii
  2021-01-24 13:22       ` Philipp Stephani
  0 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2021-01-24  5:26 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: phst, emacs-devel

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Sat, 23 Jan 2021 21:33:29 +0100
> Cc: Philipp Stephani <phst@google.com>, Emacs developers <emacs-devel@gnu.org>
> 
> > Can you describe the idea of the test,
> 
> This test essentially implements your request to verify the
> interactions between the self-pipe for SIGCHLD and threads. Seems it
> discovered some issues on Windows instead! The test works fine on
> GNU/Linux and macOS.

Thanks, but this doesn't really answer my question.  I think I
understand what the code does, but I don't know what was the intent in
writing it.  There are some things the code does whose rationale I
don't understand.  For example, why the call to set-process-thread:
what's the purpose?  Or why no-conversion in make-process?  More
generally, what is the basic idea of the test and the expectation
from each thread.

> > and in particular why did you
> > expect the threads to start running?  IOW, which part of this test is
> > supposed to force the main thread to yield so that one of the other
> > threads starts running?
> 
> thread-join should yield.

Isn't that too late?  The processes have exited already, so what does
that test?




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-01-24  5:26     ` Eli Zaretskii
@ 2021-01-24 13:22       ` Philipp Stephani
  2021-01-24 15:30         ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Philipp Stephani @ 2021-01-24 13:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Philipp Stephani, Emacs developers

Am So., 24. Jan. 2021 um 06:26 Uhr schrieb Eli Zaretskii <eliz@gnu.org>:
>
> > From: Philipp Stephani <p.stephani2@gmail.com>
> > Date: Sat, 23 Jan 2021 21:33:29 +0100
> > Cc: Philipp Stephani <phst@google.com>, Emacs developers <emacs-devel@gnu.org>
> >
> > > Can you describe the idea of the test,
> >
> > This test essentially implements your request to verify the
> > interactions between the self-pipe for SIGCHLD and threads. Seems it
> > discovered some issues on Windows instead! The test works fine on
> > GNU/Linux and macOS.
>
> Thanks, but this doesn't really answer my question.  I think I
> understand what the code does, but I don't know what was the intent in
> writing it.

Right now we don't have that many good tests that test the interaction
between threads and processes, so the intention of this test is to
improve coverage in this area.

>  There are some things the code does whose rationale I
> don't understand.  For example, why the call to set-process-thread:
> what's the purpose?

The test wouldn't work without it, because it calls
accept-process-output from a thread different from the thread in which
the process was created.

>  Or why no-conversion in make-process?

Unit tests should typically test only one aspect, or a small number of
related aspects of the system under test. The goal of this test is not
to test the encoding procedures, so I turn off encoding here.

>  More
> generally, what is the basic idea of the test and the expectation
> from each thread.

I don't understand this question. The idea is to have a test that
calls accept-process-output from multiple threads to increase coverage
in this area.

>
> > > and in particular why did you
> > > expect the threads to start running?  IOW, which part of this test is
> > > supposed to force the main thread to yield so that one of the other
> > > threads starts running?
> >
> > thread-join should yield.
>
> Isn't that too late?  The processes have exited already, so what does
> that test?
>

Again, I don't understand this question. Each unit test is a runnable
example: by writing it, we document some aspect about how Emacs Lisp
should behave, and by running it, we can find bugs (mismatches between
the intended and the actual behavior) and regressions.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-01-24  3:36   ` Eli Zaretskii
@ 2021-01-24 14:43     ` Eli Zaretskii
  0 siblings, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2021-01-24 14:43 UTC (permalink / raw)
  To: p.stephani2; +Cc: phst, emacs-devel

> Date: Sun, 24 Jan 2021 05:36:43 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: phst@google.com, emacs-devel@gnu.org
> 
> > From: Philipp Stephani <p.stephani2@gmail.com>
> > Date: Sat, 23 Jan 2021 21:34:28 +0100
> > Cc: Emacs developers <emacs-devel@gnu.org>, Philipp Stephani <phst@google.com>
> > > I did verify that process-send-eof does cause 'cat' to exit, so the
> > > problem is likely in some issue related to threads and what the test
> > > assumes regarding what should happen here.
> > 
> > What's surprising here is that "cat" exited with an exit status of 1,
> > shouldn't that be normally impossible?
> 
> I'm not sure this 1 exit status is not a red herring: when I manually
> perform the steps for a single process and without any threads, 'cat'
> exits with status zero.  And when I run that single test, it hangs.

Actually, "hangs" is inaccurate: it terminates with an error after 60
sec.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-01-24 13:22       ` Philipp Stephani
@ 2021-01-24 15:30         ` Eli Zaretskii
  2021-02-28 18:30           ` Philipp
  0 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2021-01-24 15:30 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: phst, emacs-devel

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Sun, 24 Jan 2021 14:22:57 +0100
> Cc: Philipp Stephani <phst@google.com>, Emacs developers <emacs-devel@gnu.org>
> Right now we don't have that many good tests that test the interaction
> between threads and processes, so the intention of this test is to
> improve coverage in this area.

The general intent is clear and welcome; it's the details that I'm
interested in, because I need to know which, if any, parts of the
Windows emulation of the sub-process support need to be improved or
fixed.

> >  There are some things the code does whose rationale I
> > don't understand.  For example, why the call to set-process-thread:
> > what's the purpose?
> 
> The test wouldn't work without it, because it calls
> accept-process-output from a thread different from the thread in which
> the process was created.

But set-process-thread accepts a thread argument, so you could bind
the process to a particular thread, couldn't you?

More generally, my reading of the code is that the first thread which
calls accept-process-output will start listening on all the file
descriptors of all the sub-processes which were started until then,
and all the other threads will have no descriptors left to listen on.
Is this the situation you intended to create?  And if so, is this an
interesting practical use case, or are we just testing implementation
details?

> >  Or why no-conversion in make-process?
> 
> Unit tests should typically test only one aspect, or a small number of
> related aspects of the system under test. The goal of this test is not
> to test the encoding procedures, so I turn off encoding here.

Thanks, so I now understand that the encoding stuff is not important
(since the subprocess doesn't produce any output, it cannot have any
effect, AFAIU).

> >  More
> > generally, what is the basic idea of the test and the expectation
> > from each thread.
> 
> I don't understand this question. The idea is to have a test that
> calls accept-process-output from multiple threads to increase coverage
> in this area.

Why is it important to test this situation, and what do we expect to
happen in it?  Is it reasonable to have more than one thread wait for
output from a subprocess?

> > > thread-join should yield.
> >
> > Isn't that too late?  The processes have exited already, so what does
> > that test?
> 
> Again, I don't understand this question. Each unit test is a runnable
> example: by writing it, we document some aspect about how Emacs Lisp
> should behave, and by running it, we can find bugs (mismatches between
> the intended and the actual behavior) and regressions.

The question I ask myself is what to do with the deviant behavior on
MS-Windows.  If it behaves differently because what we test here are
the peculiarities of the semi-broken Unix mechanism known as "signals"
and/or some details of our implementation of subprocesses on Posix
platforms, then maybe there's no need to change anything in the
Windows code, and just either skip this test on MS-Windows or write a
different test which will work there.  IOW, in such a case there's no
bug in the Windows port, it's just that it isn't bug-for-bug
compatible with Unix.

But if what happens in this test is important in practical use cases,
then perhaps our Windows code does need some changes in this area.

This is why I'm trying to understand the semantics of this test -- I'd
like to figure out which implementation details are important, and
thus should be emulated on Windows, and which are just details that
can be different (and portable Lisp programs will have to take those
differences into account).

For example, the test verifies that each process exited with zero
status, but it doesn't care which thread detected that -- is this
important?  Or what about delivery of SIGCHLD -- is this important
which thread gets delivered the signal, or how many threads should the
signal wake up?  Or is it important when and due to what reason(s)
does each thread exit?

Can you please help me understand these and other related issues?



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-01-24 15:30         ` Eli Zaretskii
@ 2021-02-28 18:30           ` Philipp
  2021-02-28 18:36             ` Stefan Monnier
  2021-02-28 20:19             ` Eli Zaretskii
  0 siblings, 2 replies; 17+ messages in thread
From: Philipp @ 2021-02-28 18:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Philipp Stephani, emacs-devel



> Am 24.01.2021 um 16:30 schrieb Eli Zaretskii <eliz@gnu.org>:
> 
>> From: Philipp Stephani <p.stephani2@gmail.com>
>> Date: Sun, 24 Jan 2021 14:22:57 +0100
>> Cc: Philipp Stephani <phst@google.com>, Emacs developers <emacs-devel@gnu.org>
>> Right now we don't have that many good tests that test the interaction
>> between threads and processes, so the intention of this test is to
>> improve coverage in this area.
> 
> The general intent is clear and welcome; it's the details that I'm
> interested in, because I need to know which, if any, parts of the
> Windows emulation of the sub-process support need to be improved or
> fixed.

The specific scenario here is: Start a number of process and a number of threads, each of which waits for one of the processes; then, cause the processes to exit (since they run „cat“, that works by sending EOF); then, join all the threads.  This should work without hangs or errors.

> 
>>> There are some things the code does whose rationale I
>>> don't understand.  For example, why the call to set-process-thread:
>>> what's the purpose?
>> 
>> The test wouldn't work without it, because it calls
>> accept-process-output from a thread different from the thread in which
>> the process was created.
> 
> But set-process-thread accepts a thread argument, so you could bind
> the process to a particular thread, couldn't you?

Yes, but the threads that call `accept-process-output’ and `process-send-eof’ are different threads; binding the process object to one of the threads would cause the other call to fail.

> 
> More generally, my reading of the code is that the first thread which
> calls accept-process-output will start listening on all the file
> descriptors of all the sub-processes which were started until then,
> and all the other threads will have no descriptors left to listen on.
> Is this the situation you intended to create?

No, but I’m not convinced that that’s actually how the implementation behaves.  Wouldn’t with the current implementation each thread wait for the file descriptors of one process?

>  And if so, is this an
> interesting practical use case, or are we just testing implementation
> details?

The goal here is definitely to test the observable behavior; the implementation should follow from that.  My interpretation of the Elisp manual is that the test should succeed without errors (but unfortunately the manual is still a bit vague on these matters).

> 
>>> Or why no-conversion in make-process?
>> 
>> Unit tests should typically test only one aspect, or a small number of
>> related aspects of the system under test. The goal of this test is not
>> to test the encoding procedures, so I turn off encoding here.
> 
> Thanks, so I now understand that the encoding stuff is not important
> (since the subprocess doesn't produce any output, it cannot have any
> effect, AFAIU).

Yes.  I’d be fine removing the :coding argument, but `no-conversion’ seems to be the simplest choice.

> 
>>> More
>>> generally, what is the basic idea of the test and the expectation
>>> from each thread.
>> 
>> I don't understand this question. The idea is to have a test that
>> calls accept-process-output from multiple threads to increase coverage
>> in this area.
> 
> Why is it important to test this situation, and what do we expect to
> happen in it?  Is it reasonable to have more than one thread wait for
> output from a subprocess?

The tests shouldn’t ask what’s reasonable, but what’s documented or expected.  Surely users can write code like this, and I don’t see why that shouldn’t be supported.

> 
>>>> thread-join should yield.
>>> 
>>> Isn't that too late?  The processes have exited already, so what does
>>> that test?
>> 
>> Again, I don't understand this question. Each unit test is a runnable
>> example: by writing it, we document some aspect about how Emacs Lisp
>> should behave, and by running it, we can find bugs (mismatches between
>> the intended and the actual behavior) and regressions.
> 
> The question I ask myself is what to do with the deviant behavior on
> MS-Windows.  If it behaves differently because what we test here are
> the peculiarities of the semi-broken Unix mechanism known as "signals"
> and/or some details of our implementation of subprocesses on Posix
> platforms, then maybe there's no need to change anything in the
> Windows code, and just either skip this test on MS-Windows or write a
> different test which will work there.  IOW, in such a case there's no
> bug in the Windows port, it's just that it isn't bug-for-bug
> compatible with Unix.

I don’t think so.  Again, I wrote the test based on my understanding of the expected and promised behavior of the process and thread functions, not the Posix implementation.

> 
> But if what happens in this test is important in practical use cases,
> then perhaps our Windows code does need some changes in this area.
> 
> This is why I'm trying to understand the semantics of this test -- I'd
> like to figure out which implementation details are important, and
> thus should be emulated on Windows, and which are just details that
> can be different (and portable Lisp programs will have to take those
> differences into account).

The way I see it, all aspects of the test are important — their behavior should be guaranteed by the implementation.

> 
> For example, the test verifies that each process exited with zero
> status, but it doesn't care which thread detected that -- is this
> important?

IIUC threads don’t really „detect“ process exit.  What I’d assume is that the `thread-join’ calls are synchronization points — after calling `thread-join’ the thread in question is guaranteed to have finished, and since the thread has been waiting for a process to finish (using the accept-process-output loop), the process is also guaranteed to have finished, and this sequence of event is guaranteed exactly in that order.  At least that’s roughly the meaning of such synchronization points in other languages.  Since the process is guaranteed to have finished, it will necessarily need to have the right exit status stored in its process object, and since „cat“ finishes with a zero exit status in case its stdin gets closed, that must be zero.

>  Or what about delivery of SIGCHLD -- is this important
> which thread gets delivered the signal, or how many threads should the
> signal wake up?

Not as long as the observable behavior stays the same (and isn’t buggy).

>  Or is it important when and due to what reason(s)
> does each thread exit?


It’s important that each thread exits normally, i.e., due to its thread function body exiting normally.  It’s not important when exactly the thread finishes, but the observable order of events matters: first process-send-eof, then process exit, then thread exit, then thread-join returns.




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-02-28 18:30           ` Philipp
@ 2021-02-28 18:36             ` Stefan Monnier
  2021-02-28 18:52               ` Philipp
  2021-02-28 20:19             ` Eli Zaretskii
  1 sibling, 1 reply; 17+ messages in thread
From: Stefan Monnier @ 2021-02-28 18:36 UTC (permalink / raw)
  To: Philipp; +Cc: Philipp Stephani, Eli Zaretskii, emacs-devel

>> But set-process-thread accepts a thread argument, so you could bind
>> the process to a particular thread, couldn't you?
> Yes, but the threads that call `accept-process-output’ and
> `process-send-eof’ are different threads; binding the process object to one
> of the threads would cause the other call to fail.

IIRC `set-process-thread` determines the thread in which the process
filters and sentinels are run.  You can still call
`accept-process-output’ and `process-send-eof’ from other threads.


        Stefan




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-02-28 18:36             ` Stefan Monnier
@ 2021-02-28 18:52               ` Philipp
  0 siblings, 0 replies; 17+ messages in thread
From: Philipp @ 2021-02-28 18:52 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Philipp Stephani, Eli Zaretskii, emacs-devel



> Am 28.02.2021 um 19:36 schrieb Stefan Monnier <monnier@iro.umontreal.ca>:
> 
>>> But set-process-thread accepts a thread argument, so you could bind
>>> the process to a particular thread, couldn't you?
>> Yes, but the threads that call `accept-process-output’ and
>> `process-send-eof’ are different threads; binding the process object to one
>> of the threads would cause the other call to fail.
> 
> IIRC `set-process-thread` determines the thread in which the process
> filters and sentinels are run.  You can still call
> `accept-process-output’ and `process-send-eof’ from other threads.

Partially yes.  It's not possible to call `accept-process-output' from other threads than the locked one, if there's one.
So we could change the test to lock each process to the thread that calls `accept-process-output'.  I'm fine either way, it shouldn't change the behavior of the test.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-02-28 18:30           ` Philipp
  2021-02-28 18:36             ` Stefan Monnier
@ 2021-02-28 20:19             ` Eli Zaretskii
  2021-03-18 12:51               ` Philipp
  1 sibling, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2021-02-28 20:19 UTC (permalink / raw)
  To: Philipp; +Cc: phst, emacs-devel

> From: Philipp <p.stephani2@gmail.com>
> Date: Sun, 28 Feb 2021 19:30:46 +0100
> Cc: Philipp Stephani <phst@google.com>,
>  emacs-devel@gnu.org
> 
> The specific scenario here is: Start a number of process and a number of threads, each of which waits for one of the processes

See below: I don't see how you ensure in this test that each thread
waits for just one process.

> then, cause the processes to exit (since they run „cat“, that works by sending EOF); then, join all the threads.  This should work without hangs or errors.

There are two "inputs" from each subprocess that Emacs receives: the
self-pipe output and the SIGCHLD signal.  Depending on the timing and
on which thread receives which input, the behavior can be different.
Do you have a clear idea what should happen and what does happen wrt
these "inputs"?  In particular, which thread(s) do you expect to see
SIGCHLD and/or read the self-pipe as result of that, and do you think
it's important whether a thread gets awoken by the signal or by
reading the self-pipe?

> >> The test wouldn't work without it, because it calls
> >> accept-process-output from a thread different from the thread in which
> >> the process was created.
> > 
> > But set-process-thread accepts a thread argument, so you could bind
> > the process to a particular thread, couldn't you?
> 
> Yes, but the threads that call `accept-process-output’ and `process-send-eof’ are different threads; binding the process object to one of the threads would cause the other call to fail.

As Stefan points out, process-send-eof could be sent from any thread.

> > More generally, my reading of the code is that the first thread which
> > calls accept-process-output will start listening on all the file
> > descriptors of all the sub-processes which were started until then,
> > and all the other threads will have no descriptors left to listen on.
> > Is this the situation you intended to create?
> 
> No, but I’m not convinced that that’s actually how the implementation behaves.  Wouldn’t with the current implementation each thread wait for the file descriptors of one process?

How would that happen, in this particular test?  The test first starts
10 processes, then creates 10 threads, then calls thread-join for each
of the threads.  The threads that the test create don't run, because
the first thing each thread does is attempt to take the global lock,
and becomes blocked because the main thread still runs.  The main
thread yields the first time it calls thread-join; at that time, one
of the 10 waiting threads (not necessarily the one passed as argument
to thread-join) will take the global lock.  This thread will call
accept-process-output, which calls wait_reading_process_output, which
computes the mask for pselect to wait on.  Now look at the code in
compute_non_keyboard_wait_mask and its ilk: the mask will include
every descriptor that is not already watched by some other thread.
Since this is the first thread that calls wait_reading_process_output,
it will find all the descriptors up for grabs, and will add all of
them to the mask.  Then it will release the global lock and call
pselect.  Releasing the global lock will let some other thread of the
9 remaining threads run, that thread will also call
accept-process-output, which will call wait_reading_process_output --
but now all the descriptors are taken by the first thread, so there's
nothing left for the second one, and the same for all the other 8
threads.

So what we have now is one thread waiting for output from all the 10
subprocess, and the other 9 waiting only on the self-pipe.  What
happens then, when the processes exit and we get 10 SIGCHLD signals,
is left as an exercise.  And don't forget that order in which we
receive the signal and the byte from self-pipe is not entirely
deterministic, either.

Am I missing something, or is this what happens the way the test is
written?  And if I'm not missing anything, is this the situation you
wanted to create and test?

> > Why is it important to test this situation, and what do we expect to
> > happen in it?  Is it reasonable to have more than one thread wait for
> > output from a subprocess?
> 
> The tests shouldn’t ask what’s reasonable, but what’s documented or expected.  Surely users can write code like this, and I don’t see why that shouldn’t be supported.

I didn't say anything about unsupporting this situation, I asked what
should happen in it, and I meant the details, not merely that there
are no errors.  E.g., is it important that each threads gets output
from exactly one subprocess? is it important whether a thread exits
because it read output or because it received SIGCHLD and read the
self-pipe? is it important whether all the threads exit one after
another, or is it okay if they all wait until one thread exits and
then exit immediately (because all the processes have already died)?
etc.

I instrumented the code that gets executed by this test and saw all
kinds of strange phenomena, and I'm asking myself whether you looked
at those details and whether you are satisfied by what you saw.

> > The question I ask myself is what to do with the deviant behavior on
> > MS-Windows.  If it behaves differently because what we test here are
> > the peculiarities of the semi-broken Unix mechanism known as "signals"
> > and/or some details of our implementation of subprocesses on Posix
> > platforms, then maybe there's no need to change anything in the
> > Windows code, and just either skip this test on MS-Windows or write a
> > different test which will work there.  IOW, in such a case there's no
> > bug in the Windows port, it's just that it isn't bug-for-bug
> > compatible with Unix.
> 
> I don’t think so.  Again, I wrote the test based on my understanding of the expected and promised behavior of the process and thread functions, not the Posix implementation.

Where and how did you obtain that understanding?  The documentation is
(intentionally or unintentionally) very vague regarding these details.
Perhaps you only cared about whether the code should or shouldn't err,
but that's clearly not the only interesting or important aspect of
this exercise, is it?

> > But if what happens in this test is important in practical use cases,
> > then perhaps our Windows code does need some changes in this area.
> > 
> > This is why I'm trying to understand the semantics of this test -- I'd
> > like to figure out which implementation details are important, and
> > thus should be emulated on Windows, and which are just details that
> > can be different (and portable Lisp programs will have to take those
> > differences into account).
> 
> The way I see it, all aspects of the test are important — their behavior should be guaranteed by the implementation.

Which aspects of those I mentioned above do you think the
implementation should guarantee?  E.g., should it guarantee how many
threads get the actual output from the processes in this case, and how
many get the signal?

> > For example, the test verifies that each process exited with zero
> > status, but it doesn't care which thread detected that -- is this
> > important?
> 
> IIUC threads don’t really „detect“ process exit.

??? When a subprocess exits, Emacs gets a SIGCHLD, and then we access
the exit code in order to pass it to the sentinel function.  This is
what I call "detecting process exit".

> What I’d assume is that the `thread-join’ calls are synchronization points — after calling `thread-join’ the thread in question is guaranteed to have finished, and since the thread has been waiting for a process to finish (using the accept-process-output loop), the process is also guaranteed to have finished, and this sequence of event is guaranteed exactly in that order.

I don't see how you reach those conclusions.  The first part is true:
when thread-join exits, the thread it was waiting for is guaranteed to
have exited.  But how it exited, and what caused it to exit -- these
are open questions, in a situation where 10 subprocesses generate
output and produce 10 signals.

> >  Or what about delivery of SIGCHLD -- is this important
> > which thread gets delivered the signal, or how many threads should the
> > signal wake up?
> 
> Not as long as the observable behavior stays the same (and isn’t buggy).

And the "observable behavior" is what? that processes exited
successfully and no Lisp errors happened?  Or is there something else
we expect in this case?

> >  Or is it important when and due to what reason(s)
> > does each thread exit?
> 
> It’s important that each thread exits normally, i.e., due to its thread function body exiting normally.

So it is NOT important what event caused the thread function to run,
and which event caused it to exit accept-process-output and finish?

> the observable order of events matters: first process-send-eof, then process exit, then thread exit, then thread-join returns.

The order of the first two is guaranteed because a single thread
arranges for that sequentially.  But do you care if the sequence is
like this:

  . all the 10 processes exit
  . one of the threads exit because the processes exited
  . the other 9 threads get awoken one by one, exit
    accept-process-output because of SIGCHLD, end exit



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-02-28 20:19             ` Eli Zaretskii
@ 2021-03-18 12:51               ` Philipp
  2021-03-18 15:39                 ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Philipp @ 2021-03-18 12:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Philipp Stephani, emacs-devel



> Am 28.02.2021 um 21:19 schrieb Eli Zaretskii <eliz@gnu.org>:
> 
>> From: Philipp <p.stephani2@gmail.com>
>> Date: Sun, 28 Feb 2021 19:30:46 +0100
>> Cc: Philipp Stephani <phst@google.com>,
>> emacs-devel@gnu.org
>> 
>> The specific scenario here is: Start a number of process and a number of threads, each of which waits for one of the processes
> 
> See below: I don't see how you ensure in this test that each thread
> waits for just one process.

Shouldn't providing a process argument to accept-process-output ensure this, modulo implementation details that are not observable?  The only nontrivial guarantee that the test requires is that after (accept-process-output PROC) returns nil, the PROC process has finished, independent of other processes or threads, and that multiple concurrent calls of this form don't deadlock.

> 
>> then, cause the processes to exit (since they run „cat“, that works by sending EOF); then, join all the threads.  This should work without hangs or errors.
> 
> There are two "inputs" from each subprocess that Emacs receives: the
> self-pipe output and the SIGCHLD signal.  Depending on the timing and
> on which thread receives which input, the behavior can be different.
> Do you have a clear idea what should happen and what does happen wrt
> these "inputs"?  In particular, which thread(s) do you expect to see
> SIGCHLD and/or read the self-pipe as result of that, and do you think
> it's important whether a thread gets awoken by the signal or by
> reading the self-pipe?

All of those are implementation details that should be entirely invisible to the thread interface.  As such, none of the threads should "see" any of those.  All that matters is the observable behavior of accept-process-output: it may return nil only if and when its process argument has finished, independent of threads or signal disposition.

> 
>>>> The test wouldn't work without it, because it calls
>>>> accept-process-output from a thread different from the thread in which
>>>> the process was created.
>>> 
>>> But set-process-thread accepts a thread argument, so you could bind
>>> the process to a particular thread, couldn't you?
>> 
>> Yes, but the threads that call `accept-process-output’ and `process-send-eof’ are different threads; binding the process object to one of the threads would cause the other call to fail.
> 
> As Stefan points out, process-send-eof could be sent from any thread.

Correct, though this shouldn't make a difference: the test should pass whether the process objects have thread affinity or not (and that's the case for me).

> 
>>> More generally, my reading of the code is that the first thread which
>>> calls accept-process-output will start listening on all the file
>>> descriptors of all the sub-processes which were started until then,
>>> and all the other threads will have no descriptors left to listen on.
>>> Is this the situation you intended to create?
>> 
>> No, but I’m not convinced that that’s actually how the implementation behaves.  Wouldn’t with the current implementation each thread wait for the file descriptors of one process?
> 
> How would that happen, in this particular test?  The test first starts
> 10 processes, then creates 10 threads, then calls thread-join for each
> of the threads.  The threads that the test create don't run, because
> the first thing each thread does is attempt to take the global lock,
> and becomes blocked because the main thread still runs.  The main
> thread yields the first time it calls thread-join; at that time, one
> of the 10 waiting threads (not necessarily the one passed as argument
> to thread-join) will take the global lock.  This thread will call
> accept-process-output, which calls wait_reading_process_output, which
> computes the mask for pselect to wait on.  Now look at the code in
> compute_non_keyboard_wait_mask and its ilk: the mask will include
> every descriptor that is not already watched by some other thread.
> Since this is the first thread that calls wait_reading_process_output,
> it will find all the descriptors up for grabs, and will add all of
> them to the mask.  Then it will release the global lock and call
> pselect.  Releasing the global lock will let some other thread of the
> 9 remaining threads run, that thread will also call
> accept-process-output, which will call wait_reading_process_output --
> but now all the descriptors are taken by the first thread, so there's
> nothing left for the second one, and the same for all the other 8
> threads.
> 
> So what we have now is one thread waiting for output from all the 10
> subprocess, and the other 9 waiting only on the self-pipe.  What
> happens then, when the processes exit and we get 10 SIGCHLD signals,
> is left as an exercise.  And don't forget that order in which we
> receive the signal and the byte from self-pipe is not entirely
> deterministic, either.
> 
> Am I missing something, or is this what happens the way the test is
> written?  And if I'm not missing anything, is this the situation you
> wanted to create and test?

This might be how the implementation behaves (I haven't checked, but I have no reason to doubt your explanation).  However, the test doesn't test the implementation, but the observable behavior.  That is, I can't formally prove that the implementation leads to the observable behavior that the test expects, but I think it should, and I'd consider it a bug (in Emacs, not the test code) if it doesn't.

> 
>>> Why is it important to test this situation, and what do we expect to
>>> happen in it?  Is it reasonable to have more than one thread wait for
>>> output from a subprocess?
>> 
>> The tests shouldn’t ask what’s reasonable, but what’s documented or expected.  Surely users can write code like this, and I don’t see why that shouldn’t be supported.
> 
> I didn't say anything about unsupporting this situation, I asked what
> should happen in it, and I meant the details, not merely that there
> are no errors.  E.g., is it important that each threads gets output
> from exactly one subprocess?

How does a thread "get output"?  Do you mean that accept-process-output returns non-nil?  What I'd expect from accept-process-output with a process argument is that directly after it returns nil, the process represented by the process argument has finished, all output has been passed to buffers/filters, and the sentinel has run.  That's independent of the thread in which accept-process-output is called.

> is it important whether a thread exits
> because it read output or because it received SIGCHLD and read the
> self-pipe?

No, these are implementation details: they might as well not exist as far as the observable behavior of accept-process-output is concerned.

> is it important whether all the threads exit one after
> another, or is it okay if they all wait until one thread exits and
> then exit immediately (because all the processes have already died)?

I don't think that's observable/documented behavior: we only know that a thread has exited after thread-join returns.

> etc.
> 
> I instrumented the code that gets executed by this test and saw all
> kinds of strange phenomena, and I'm asking myself whether you looked
> at those details and whether you are satisfied by what you saw.

No, that's not the way I approach topics like this.  I wrote the test based on my reading of the documentation (having in mind the complexity of the implementation, but not being guided by it).  I then ran the test and it passed, which gave me a bit of confidence in the correctness of the implementation (which the test can't prove, of course).
So the questions I'd ask here: Is the test code correct, based on documented (or at least implied) guarantees?  If it's correct, but the test fails on some system, the implementation is buggy.  What specific sequence of events in the implementation code leads to the buggy behavior is then only the next step, not the starting point of the discussion.

> 
>>> The question I ask myself is what to do with the deviant behavior on
>>> MS-Windows.  If it behaves differently because what we test here are
>>> the peculiarities of the semi-broken Unix mechanism known as "signals"
>>> and/or some details of our implementation of subprocesses on Posix
>>> platforms, then maybe there's no need to change anything in the
>>> Windows code, and just either skip this test on MS-Windows or write a
>>> different test which will work there.  IOW, in such a case there's no
>>> bug in the Windows port, it's just that it isn't bug-for-bug
>>> compatible with Unix.
>> 
>> I don’t think so.  Again, I wrote the test based on my understanding of the expected and promised behavior of the process and thread functions, not the Posix implementation.
> 
> Where and how did you obtain that understanding?  The documentation is
> (intentionally or unintentionally) very vague regarding these details.

Yes, so I had to fill in some aspects that I hope are reasonable.  In particular, the documentation doesn't state explicitly that waiting for a process in a different thread than the original one is supported, but given set-process-thread it implicitly has to be, because what would otherwise be the purpose of set-process-thread?
The documentation definitely needs to be significantly expanded, as this discussion shows.
So maybe asking more directly:
- Should this test pass?
- If so, how can this be deduced from the documentation?
- If not, how can *that* be deduced from the documentation, and what changes would be necessary to guarantee that it passes?
- Is there anything missing from the documentation that prevents deducing whether the test should pass or not?

> Perhaps you only cared about whether the code should or shouldn't err,
> but that's clearly not the only interesting or important aspect of
> this exercise, is it?

It's at least one important aspect, but finding deadlocks (hangs) is probably more important.

> 
>>> But if what happens in this test is important in practical use cases,
>>> then perhaps our Windows code does need some changes in this area.
>>> 
>>> This is why I'm trying to understand the semantics of this test -- I'd
>>> like to figure out which implementation details are important, and
>>> thus should be emulated on Windows, and which are just details that
>>> can be different (and portable Lisp programs will have to take those
>>> differences into account).
>> 
>> The way I see it, all aspects of the test are important — their behavior should be guaranteed by the implementation.
> 
> Which aspects of those I mentioned above do you think the
> implementation should guarantee?  E.g., should it guarantee how many
> threads get the actual output from the processes in this case,

That depends what "get the actual output" means.  What I expect is the behavior of accept-process-output, as explained above. Specifically: accept-process-output will return nil quickly (within milliseconds, assuming the calling thread is unblocked) once the given process has exited.  Once it returns, all process output has been given to process filters/buffers, and the sentinel has run.

> and how
> many get the signal?

None, as the signal is an implementation detail and not part of the interface.

> 
>>> For example, the test verifies that each process exited with zero
>>> status, but it doesn't care which thread detected that -- is this
>>> important?
>> 
>> IIUC threads don’t really „detect“ process exit.
> 
> ??? When a subprocess exits, Emacs gets a SIGCHLD, and then we access
> the exit code in order to pass it to the sentinel function.  This is
> what I call "detecting process exit".

Sure, but is that really related to threads?  I'd rather say that *Emacs* detects process exits.
It's also true that there's some thread that contains the `accept' call, but that's again an implementation detail and not related to Lisp threads.  Several other implementations would be possible: Calling `accept' in a single thread not accessible from Lisp at all; implementing Lisp threads as "green threads" without pthread support; etc.

> 
>> What I’d assume is that the `thread-join’ calls are synchronization points — after calling `thread-join’ the thread in question is guaranteed to have finished, and since the thread has been waiting for a process to finish (using the accept-process-output loop), the process is also guaranteed to have finished, and this sequence of event is guaranteed exactly in that order.
> 
> I don't see how you reach those conclusions.  The first part is true:
> when thread-join exits, the thread it was waiting for is guaranteed to
> have exited.  But how it exited, and what caused it to exit -- these
> are open questions, in a situation where 10 subprocesses generate
> output and produce 10 signals.

There's only one way for a thread to exit: when its thread function returns.  At least that's what the manual says.  And the thread function can only return when `accept-process-output' returns nil, so we're back at the discussion what the semantics of `accept-process-output' are.

> 
>>> Or what about delivery of SIGCHLD -- is this important
>>> which thread gets delivered the signal, or how many threads should the
>>> signal wake up?
>> 
>> Not as long as the observable behavior stays the same (and isn’t buggy).
> 
> And the "observable behavior" is what? that processes exited
> successfully and no Lisp errors happened?  Or is there something else
> we expect in this case?

That the code as a whole doesn't hang/deadlock.  It's harder to make precise guarantees for that, but the 60s timeout should be long enough for the test to finish even on a heavily-loaded system.

> 
>>> Or is it important when and due to what reason(s)
>>> does each thread exit?
>> 
>> It’s important that each thread exits normally, i.e., due to its thread function body exiting normally.
> 
> So it is NOT important what event caused the thread function to run,
> and which event caused it to exit accept-process-output and finish?

No.  AFAIK these are unobservable, and the manual doesn't make any guarantees in this respect.

> 
>> the observable order of events matters: first process-send-eof, then process exit, then thread exit, then thread-join returns.
> 
> The order of the first two is guaranteed because a single thread
> arranges for that sequentially.  But do you care if the sequence is
> like this:
> 
>  . all the 10 processes exit
>  . one of the threads exit because the processes exited
>  . the other 9 threads get awoken one by one, exit
>    accept-process-output because of SIGCHLD, end exit


From what I can see, that should be fine.  The test only checks that the threads finish eventually and don't deadlock, but doesn't care about the order of events across threads (which is not guaranteed unless there are explicit synchronization points like mutexes or condition variables).




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-03-18 12:51               ` Philipp
@ 2021-03-18 15:39                 ` Eli Zaretskii
  2021-03-21 12:00                   ` Philipp
  0 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2021-03-18 15:39 UTC (permalink / raw)
  To: Philipp; +Cc: phst, emacs-devel

> From: Philipp <p.stephani2@gmail.com>
> Date: Thu, 18 Mar 2021 13:51:10 +0100
> Cc: Philipp Stephani <phst@google.com>,
>  emacs-devel@gnu.org
> 
> > See below: I don't see how you ensure in this test that each thread
> > waits for just one process.
> 
> Shouldn't providing a process argument to accept-process-output ensure this, 

No.  All it ensures is that accept-process-output will not return
until it heard from that process.  It doesn't say what would happen
with output from other subprocesses that might be running.

> the test should pass whether the process objects have thread affinity or not (and that's the case for me).

But the test has no control of the affinity, so it just tests some
case, not the other(s).  We have no idea what happens in those cases
it doesn't test.

> This might be how the implementation behaves (I haven't checked, but I have no reason to doubt your explanation).  However, the test doesn't test the implementation, but the observable behavior.  That is, I can't formally prove that the implementation leads to the observable behavior that the test expects, but I think it should, and I'd consider it a bug (in Emacs, not the test code) if it doesn't.

It sounds like the "observable behavior" you care about is a very
small part of what's going on in that case.  As one of many tests in
this area, that is entirely fine.  But since it's the _only_ test, it
sounds like we are testing a very small part of the functionality, and
leave the rest entirely untested.

> > is it important whether a thread exits
> > because it read output or because it received SIGCHLD and read the
> > self-pipe?
> 
> No, these are implementation details: they might as well not exist as far as the observable behavior of accept-process-output is concerned.
> 
> > is it important whether all the threads exit one after
> > another, or is it okay if they all wait until one thread exits and
> > then exit immediately (because all the processes have already died)?
> 
> I don't think that's observable/documented behavior: we only know that a thread has exited after thread-join returns.

Threads and their interaction with subprocesses were implemented to
provide some useful behavior which is supposed to support practical
use cases and applications.  If all we care about is that 10 threads
don't deadlock and all the processes exited, we didn't even scratch
the surface of those use cases and applications.  In particular, a
practical application of these facilities may very well care whether
the threads exit one by one or not.  It sounds very strange to me to
hear that we shouldn't care.

> >> I wrote the test based on my understanding of the expected and promised behavior of the process and thread functions, not the Posix implementation.
> > 
> > Where and how did you obtain that understanding?  The documentation is
> > (intentionally or unintentionally) very vague regarding these details.
> 
> Yes, so I had to fill in some aspects that I hope are reasonable.  In particular, the documentation doesn't state explicitly that waiting for a process in a different thread than the original one is supported, but given set-process-thread it implicitly has to be, because what would otherwise be the purpose of set-process-thread?

If you aren't interested in implementation details, there's no useful
answer to such questions.  For example, maybe whoever implemented
set-process-thread had some strange ideas or simply made a design
mistake?

> > Perhaps you only cared about whether the code should or shouldn't err,
> > but that's clearly not the only interesting or important aspect of
> > this exercise, is it?
> 
> It's at least one important aspect, but finding deadlocks (hangs) is probably more important.

I think there are aspects of this much more important than deadlocks
(or lack thereof).

> What I expect is the behavior of accept-process-output, as explained above. Specifically: accept-process-output will return nil quickly (within milliseconds, assuming the calling thread is unblocked) once the given process has exited.  Once it returns, all process output has been given to process filters/buffers, and the sentinel has run.

When the process exits, some thread which waits on its descriptor, or
on the self-pipe descriptor, will return from the 'pselect' call, and
will attempt to take the global lock.  If it cannot take the global
lock, it will block, in which case neither the process output will be
delivered nor the sentinel will run, until the thread becomes
unblocked.  So, as you see, even your simple expectations aren't
really guaranteed.  And your test still succeeds, regardless.  What
does that tell you about the usefulness of the test?

> >> What I’d assume is that the `thread-join’ calls are synchronization points — after calling `thread-join’ the thread in question is guaranteed to have finished, and since the thread has been waiting for a process to finish (using the accept-process-output loop), the process is also guaranteed to have finished, and this sequence of event is guaranteed exactly in that order.
> > 
> > I don't see how you reach those conclusions.  The first part is true:
> > when thread-join exits, the thread it was waiting for is guaranteed to
> > have exited.  But how it exited, and what caused it to exit -- these
> > are open questions, in a situation where 10 subprocesses generate
> > output and produce 10 signals.
> 
> There's only one way for a thread to exit: when its thread function returns. 

No, a thread can also exit if it signals an error.

> > The order of the first two is guaranteed because a single thread
> > arranges for that sequentially.  But do you care if the sequence is
> > like this:
> > 
> >  . all the 10 processes exit
> >  . one of the threads exit because the processes exited
> >  . the other 9 threads get awoken one by one, exit
> >    accept-process-output because of SIGCHLD, end exit
> 
> 
> From what I can see, that should be fine.  The test only checks that the threads finish eventually and don't deadlock, but doesn't care about the order of events across threads (which is not guaranteed unless there are explicit synchronization points like mutexes or condition variables).

OK, but then I think a test that is only interested in these
superficial aspects is not very useful for making sure this part of
Emacs is really useful in Real Life.  Unless we add other tests which
do examine what you consider to be "implementation details".



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-03-18 15:39                 ` Eli Zaretskii
@ 2021-03-21 12:00                   ` Philipp
  2021-03-21 12:23                     ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Philipp @ 2021-03-21 12:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Philipp Stephani, emacs-devel



> Am 18.03.2021 um 16:39 schrieb Eli Zaretskii <eliz@gnu.org>:
> 
>> From: Philipp <p.stephani2@gmail.com>
>> Date: Thu, 18 Mar 2021 13:51:10 +0100
>> Cc: Philipp Stephani <phst@google.com>,
>> emacs-devel@gnu.org
>> 
>>> See below: I don't see how you ensure in this test that each thread
>>> waits for just one process.
>> 
>> Shouldn't providing a process argument to accept-process-output ensure this, 
> 
> No.  All it ensures is that accept-process-output will not return
> until it heard from that process.  It doesn't say what would happen
> with output from other subprocesses that might be running.

Yeah, but why should that matter?  If multiple processes are started in parallel, surely people shouldn't assume that there's some deterministic order of output arrival across processes.

> 
>> the test should pass whether the process objects have thread affinity or not (and that's the case for me).
> 
> But the test has no control of the affinity, so it just tests some
> case, not the other(s).  We have no idea what happens in those cases
> it doesn't test.

Sure, that's the nature of unit tests: they only test a single scenario.

> 
>> This might be how the implementation behaves (I haven't checked, but I have no reason to doubt your explanation).  However, the test doesn't test the implementation, but the observable behavior.  That is, I can't formally prove that the implementation leads to the observable behavior that the test expects, but I think it should, and I'd consider it a bug (in Emacs, not the test code) if it doesn't.
> 
> It sounds like the "observable behavior" you care about is a very
> small part of what's going on in that case.  As one of many tests in
> this area, that is entirely fine.  But since it's the _only_ test, it
> sounds like we are testing a very small part of the functionality, and
> leave the rest entirely untested.

Again, every unit tests behaves like this, unless you strive for 100% path coverage (which is generally infeasible for a very complex function like accept-process-output).

I certainly won't stop anyone from adding further tests testing different scenarios.

> 
>>> is it important whether a thread exits
>>> because it read output or because it received SIGCHLD and read the
>>> self-pipe?
>> 
>> No, these are implementation details: they might as well not exist as far as the observable behavior of accept-process-output is concerned.
>> 
>>> is it important whether all the threads exit one after
>>> another, or is it okay if they all wait until one thread exits and
>>> then exit immediately (because all the processes have already died)?
>> 
>> I don't think that's observable/documented behavior: we only know that a thread has exited after thread-join returns.
> 
> Threads and their interaction with subprocesses were implemented to
> provide some useful behavior which is supposed to support practical
> use cases and applications.  If all we care about is that 10 threads
> don't deadlock and all the processes exited, we didn't even scratch
> the surface of those use cases and applications.  In particular, a
> practical application of these facilities may very well care whether
> the threads exit one by one or not.  It sounds very strange to me to
> hear that we shouldn't care.

The test in question tests what I've described, which is obviously not the only thing we should care about.

> 
>>>> I wrote the test based on my understanding of the expected and promised behavior of the process and thread functions, not the Posix implementation.
>>> 
>>> Where and how did you obtain that understanding?  The documentation is
>>> (intentionally or unintentionally) very vague regarding these details.
>> 
>> Yes, so I had to fill in some aspects that I hope are reasonable.  In particular, the documentation doesn't state explicitly that waiting for a process in a different thread than the original one is supported, but given set-process-thread it implicitly has to be, because what would otherwise be the purpose of set-process-thread?
> 
> If you aren't interested in implementation details, there's no useful
> answer to such questions.  For example, maybe whoever implemented
> set-process-thread had some strange ideas or simply made a design
> mistake?

Or we can check the manual?

"Because threads were a relatively late addition to Emacs Lisp, and due
to the way dynamic binding was sometimes used in conjunction with
‘accept-process-output’, by default a process is locked to the thread
that created it."

That's the given explanation: to prevent bugs when using dynamic binding.  However, the test in question doesn't use dynamic binding.  We have to call set-process-thread in the test because it would fail otherwise, but apart from that, the question of thread locking (of affinity) should be unrelated.

> 
>>> Perhaps you only cared about whether the code should or shouldn't err,
>>> but that's clearly not the only interesting or important aspect of
>>> this exercise, is it?
>> 
>> It's at least one important aspect, but finding deadlocks (hangs) is probably more important.
> 
> I think there are aspects of this much more important than deadlocks
> (or lack thereof).

And you're welcome to add tests for those aspects.

> 
>> What I expect is the behavior of accept-process-output, as explained above. Specifically: accept-process-output will return nil quickly (within milliseconds, assuming the calling thread is unblocked) once the given process has exited.  Once it returns, all process output has been given to process filters/buffers, and the sentinel has run.
> 
> When the process exits, some thread which waits on its descriptor, or
> on the self-pipe descriptor, will return from the 'pselect' call, and
> will attempt to take the global lock.  If it cannot take the global
> lock, it will block, in which case neither the process output will be
> delivered nor the sentinel will run, until the thread becomes
> unblocked.  So, as you see, even your simple expectations aren't
> really guaranteed.  And your test still succeeds, regardless.  What
> does that tell you about the usefulness of the test?

Nothing.  You seem to have very different ideas about the purpose of unit tests than I, so I don't think we should belabor this point any further.

> 
>>>> What I’d assume is that the `thread-join’ calls are synchronization points — after calling `thread-join’ the thread in question is guaranteed to have finished, and since the thread has been waiting for a process to finish (using the accept-process-output loop), the process is also guaranteed to have finished, and this sequence of event is guaranteed exactly in that order.
>>> 
>>> I don't see how you reach those conclusions.  The first part is true:
>>> when thread-join exits, the thread it was waiting for is guaranteed to
>>> have exited.  But how it exited, and what caused it to exit -- these
>>> are open questions, in a situation where 10 subprocesses generate
>>> output and produce 10 signals.
>> 
>> There's only one way for a thread to exit: when its thread function returns. 
> 
> No, a thread can also exit if it signals an error.

Of course, like any function.  I was just being a bit sloppy.  The unit test verifies that the threads don't signal errors.

> 
>>> The order of the first two is guaranteed because a single thread
>>> arranges for that sequentially.  But do you care if the sequence is
>>> like this:
>>> 
>>> . all the 10 processes exit
>>> . one of the threads exit because the processes exited
>>> . the other 9 threads get awoken one by one, exit
>>>   accept-process-output because of SIGCHLD, end exit
>> 
>> 
>> From what I can see, that should be fine.  The test only checks that the threads finish eventually and don't deadlock, but doesn't care about the order of events across threads (which is not guaranteed unless there are explicit synchronization points like mutexes or condition variables).
> 
> OK, but then I think a test that is only interested in these
> superficial aspects is not very useful for making sure this part of
> Emacs is really useful in Real Life.  Unless we add other tests which
> do examine what you consider to be "implementation details".

See above.

Since this discussion doesn't seem to lead anywhere, maybe we can just agree to disagree and move on?




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes.
  2021-03-21 12:00                   ` Philipp
@ 2021-03-21 12:23                     ` Eli Zaretskii
  0 siblings, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2021-03-21 12:23 UTC (permalink / raw)
  To: Philipp; +Cc: phst, emacs-devel

> From: Philipp <p.stephani2@gmail.com>
> Date: Sun, 21 Mar 2021 13:00:36 +0100
> Cc: Philipp Stephani <phst@google.com>,
>  emacs-devel@gnu.org
> 
> >>> See below: I don't see how you ensure in this test that each thread
> >>> waits for just one process.
> >> 
> >> Shouldn't providing a process argument to accept-process-output ensure this, 
> > 
> > No.  All it ensures is that accept-process-output will not return
> > until it heard from that process.  It doesn't say what would happen
> > with output from other subprocesses that might be running.
> 
> Yeah, but why should that matter?  If multiple processes are started in parallel, surely people shouldn't assume that there's some deterministic order of output arrival across processes.

You've changed the subject: the original question was whether each
thread waits for one process.

> >> the test should pass whether the process objects have thread affinity or not (and that's the case for me).
> > 
> > But the test has no control of the affinity, so it just tests some
> > case, not the other(s).  We have no idea what happens in those cases
> > it doesn't test.
> 
> Sure, that's the nature of unit tests: they only test a single scenario.

Sure, but that assumes you have other unit tests that test the other
scenarios.  By contrast, here we have a single test, and you are
saying that it's enough to test this part of the Emacs functionality,
because you've made significant changes in the implementation, and
used just this one test to verify the changes didn't break anything.

> I certainly won't stop anyone from adding further tests testing different scenarios.

I'm wondering why _you_ considered this to be enough.

> > If you aren't interested in implementation details, there's no useful
> > answer to such questions.  For example, maybe whoever implemented
> > set-process-thread had some strange ideas or simply made a design
> > mistake?
> 
> Or we can check the manual?
> 
> "Because threads were a relatively late addition to Emacs Lisp, and due
> to the way dynamic binding was sometimes used in conjunction with
> ‘accept-process-output’, by default a process is locked to the thread
> that created it."

That's highly inaccurate, as I described up-thread.  At least if the
"locked" part is understood in the way you want to interpret it.

> > When the process exits, some thread which waits on its descriptor, or
> > on the self-pipe descriptor, will return from the 'pselect' call, and
> > will attempt to take the global lock.  If it cannot take the global
> > lock, it will block, in which case neither the process output will be
> > delivered nor the sentinel will run, until the thread becomes
> > unblocked.  So, as you see, even your simple expectations aren't
> > really guaranteed.  And your test still succeeds, regardless.  What
> > does that tell you about the usefulness of the test?
> 
> Nothing.  You seem to have very different ideas about the purpose of unit tests than I, so I don't think we should belabor this point any further.

Well, let me say then that I think your unit test as written is
useless, and any results one gets from them can be considered valid,
even if the test hangs.  Because the test creates a situation that is
entirely uninteresting for any practical use of Emacs, and thus the
test is basically tuned to a single platform where it was originally
written.

That's not how tests for our test suite should be designed and
implemented.

> > No, a thread can also exit if it signals an error.
> 
> Of course, like any function.  I was just being a bit sloppy.  The unit test verifies that the threads don't signal errors.

No, because it doesn't verify that none of the threads signaled.



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-03-21 12:23 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-23 19:49 master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes Eli Zaretskii
2021-01-23 20:27 ` Eli Zaretskii
2021-01-23 20:33   ` Philipp Stephani
2021-01-24  5:26     ` Eli Zaretskii
2021-01-24 13:22       ` Philipp Stephani
2021-01-24 15:30         ` Eli Zaretskii
2021-02-28 18:30           ` Philipp
2021-02-28 18:36             ` Stefan Monnier
2021-02-28 18:52               ` Philipp
2021-02-28 20:19             ` Eli Zaretskii
2021-03-18 12:51               ` Philipp
2021-03-18 15:39                 ` Eli Zaretskii
2021-03-21 12:00                   ` Philipp
2021-03-21 12:23                     ` Eli Zaretskii
2021-01-23 20:34 ` Philipp Stephani
2021-01-24  3:36   ` Eli Zaretskii
2021-01-24 14:43     ` Eli Zaretskii

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).