From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Philipp
Newsgroups: gmane.emacs.devel
Subject: Re: master 8dcb19f 4/4: Add a unit test testing interaction between
threads and processes.
Date: Sun, 21 Mar 2021 13:00:36 +0100
Message-ID: <50388E17-1D84-4823-94D9-923E792DC3D1@gmail.com>
References: <83k0s34eo1.fsf@gnu.org> <83h7n74cwe.fsf@gnu.org>
<838s8i4ak8.fsf@gnu.org> <6C5E64C7-19A6-4D46-AC30-33C29FA5DEBB@gmail.com>
<83mtvo7xno.fsf@gnu.org> <25EEFE6F-9532-4A5A-8824-A3F8F41012F3@gmail.com>
<83y2ekfp3e.fsf@gnu.org>
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.60.0.2.21\))
Content-Type: text/plain;
charset=utf-8
Content-Transfer-Encoding: quoted-printable
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214";
logging-data="34283"; mail-complaints-to="usenet@ciao.gmane.io"
Cc: Philipp Stephani , emacs-devel@gnu.org
To: Eli Zaretskii
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Sun Mar 21 13:03:01 2021
Return-path:
Envelope-to: ged-emacs-devel@m.gmane-mx.org
Original-Received: from lists.gnu.org ([209.51.188.17])
by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.92)
(envelope-from )
id 1lNwnN-0008ml-38
for ged-emacs-devel@m.gmane-mx.org; Sun, 21 Mar 2021 13:03:01 +0100
Original-Received: from localhost ([::1]:46280 helo=lists1p.gnu.org)
by lists.gnu.org with esmtp (Exim 4.90_1)
(envelope-from )
id 1lNwnM-00080y-21
for ged-emacs-devel@m.gmane-mx.org; Sun, 21 Mar 2021 08:03:00 -0400
Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:37300)
by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.90_1) (envelope-from )
id 1lNwlH-0007HR-MD
for emacs-devel@gnu.org; Sun, 21 Mar 2021 08:00:54 -0400
Original-Received: from mail-ed1-x52a.google.com ([2a00:1450:4864:20::52a]:46607)
by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
(Exim 4.90_1) (envelope-from )
id 1lNwl7-0004ka-VT; Sun, 21 Mar 2021 08:00:51 -0400
Original-Received: by mail-ed1-x52a.google.com with SMTP id h10so15964799edt.13;
Sun, 21 Mar 2021 05:00:40 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
h=mime-version:subject:from:in-reply-to:date:cc
:content-transfer-encoding:message-id:references:to;
bh=csYfkEriRJRGIEH8dGaBV4rVoIXky4Cazt+olZW6lZs=;
b=Ac4HyFHsKZQk+Fal1kDbViE+BwhT5sa0YZdqFOT/DXhbgmVvR+kd9LgJQb80tsuDnB
MiYRL5u48uVXMt11QwDuFfNYtyCYvvxIx6FOYO5A3/N/3Yi+csfsZkK6NcrSN17zN7lB
Ex3NEVu5f9uR+s+B6NLReuveXwzTDG7hlDy68QL4xCrGW2thNG/j6RfUk72gOGkETDr0
ry1nXJBmtt375kyJ+SQhdZ1IPEHzenlxgeFQTy+hqOSsIbjR+XAixkZxOxp1ySD8xxM/
NDhCeT00iZ+4fSjYNxXifmk9uSAY/tQ5ECNpi1SOtSPExUA74YEY3Wvw0C7X4MVNLT8O
n+xg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=1e100.net; s=20161025;
h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc
:content-transfer-encoding:message-id:references:to;
bh=csYfkEriRJRGIEH8dGaBV4rVoIXky4Cazt+olZW6lZs=;
b=Cx24K0nJmyWYwRrxcwsYRIQpahc6g8iQ0nnv+EnYvGDa+ejQ+OH4uex+ZbQEa8GLY6
2D4DXy5OrdV8e6hYKg0YQmFX+kq9ropx+PCliB0xUlVgtC1unvXGCkoLx6rR9syAltDk
fdFkmUy1rWJI0USTd/3yNAeUPbpK8aky313BT3zviK2GmReUU09VjYmeux2E2qNRIBIe
VUIOgzP6CtBD7nC2ZdfUb1suBt8ub1hzkTVLsmo8c3v/AKfSDLESPsiDyMr0qyuD+9MB
/IBlZCvMYlAC4xrouagtYiR7sVvCqs3Oz4uUA9giujHK2pWS2GdU8Gd+qNbMx1hD8xLg
ciCA==
X-Gm-Message-State: AOAM530lvhstUPvn2SZOMoptiOVF55kldNXAPJV9O3FNhQd4g5TW17NA
+IHHuwSGYGlEm/zb0DN0jgh4SGejhXG9+Q==
X-Google-Smtp-Source: ABdhPJy5U3wvaXiNWSPs2J/wdnAcGEZvRYH3RftrXxkhVGBLU52yMqPhuFnnymxUklzEJMG9wdrbHg==
X-Received: by 2002:aa7:c850:: with SMTP id g16mr19949722edt.324.1616328038567;
Sun, 21 Mar 2021 05:00:38 -0700 (PDT)
Original-Received: from philipps-macbook-pro.fritz.box (p57aaf893.dip0.t-ipconnect.de.
[87.170.248.147])
by smtp.gmail.com with ESMTPSA id e16sm6919357ejc.63.2021.03.21.05.00.37
(version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
Sun, 21 Mar 2021 05:00:37 -0700 (PDT)
In-Reply-To: <83y2ekfp3e.fsf@gnu.org>
X-Mailer: Apple Mail (2.3654.60.0.2.21)
Received-SPF: pass client-ip=2a00:1450:4864:20::52a;
envelope-from=p.stephani2@gmail.com; helo=mail-ed1-x52a.google.com
X-Spam_score_int: -17
X-Spam_score: -1.8
X-Spam_bar: -
X-Spam_report: (-1.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1,
FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001,
RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001,
SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "Emacs development discussions."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org
Original-Sender: "Emacs-devel"
Xref: news.gmane.io gmane.emacs.devel:266699
Archived-At:
> Am 18.03.2021 um 16:39 schrieb Eli Zaretskii :
>=20
>> From: Philipp
>> Date: Thu, 18 Mar 2021 13:51:10 +0100
>> Cc: Philipp Stephani ,
>> emacs-devel@gnu.org
>>=20
>>> See below: I don't see how you ensure in this test that each thread
>>> waits for just one process.
>>=20
>> Shouldn't providing a process argument to accept-process-output =
ensure this,=20
>=20
> No. All it ensures is that accept-process-output will not return
> until it heard from that process. It doesn't say what would happen
> with output from other subprocesses that might be running.
Yeah, but why should that matter? If multiple processes are started in =
parallel, surely people shouldn't assume that there's some deterministic =
order of output arrival across processes.
>=20
>> the test should pass whether the process objects have thread affinity =
or not (and that's the case for me).
>=20
> But the test has no control of the affinity, so it just tests some
> case, not the other(s). We have no idea what happens in those cases
> it doesn't test.
Sure, that's the nature of unit tests: they only test a single scenario.
>=20
>> This might be how the implementation behaves (I haven't checked, but =
I have no reason to doubt your explanation). However, the test doesn't =
test the implementation, but the observable behavior. That is, I can't =
formally prove that the implementation leads to the observable behavior =
that the test expects, but I think it should, and I'd consider it a bug =
(in Emacs, not the test code) if it doesn't.
>=20
> It sounds like the "observable behavior" you care about is a very
> small part of what's going on in that case. As one of many tests in
> this area, that is entirely fine. But since it's the _only_ test, it
> sounds like we are testing a very small part of the functionality, and
> leave the rest entirely untested.
Again, every unit tests behaves like this, unless you strive for 100% =
path coverage (which is generally infeasible for a very complex function =
like accept-process-output).
I certainly won't stop anyone from adding further tests testing =
different scenarios.
>=20
>>> is it important whether a thread exits
>>> because it read output or because it received SIGCHLD and read the
>>> self-pipe?
>>=20
>> No, these are implementation details: they might as well not exist as =
far as the observable behavior of accept-process-output is concerned.
>>=20
>>> is it important whether all the threads exit one after
>>> another, or is it okay if they all wait until one thread exits and
>>> then exit immediately (because all the processes have already died)?
>>=20
>> I don't think that's observable/documented behavior: we only know =
that a thread has exited after thread-join returns.
>=20
> Threads and their interaction with subprocesses were implemented to
> provide some useful behavior which is supposed to support practical
> use cases and applications. If all we care about is that 10 threads
> don't deadlock and all the processes exited, we didn't even scratch
> the surface of those use cases and applications. In particular, a
> practical application of these facilities may very well care whether
> the threads exit one by one or not. It sounds very strange to me to
> hear that we shouldn't care.
The test in question tests what I've described, which is obviously not =
the only thing we should care about.
>=20
>>>> I wrote the test based on my understanding of the expected and =
promised behavior of the process and thread functions, not the Posix =
implementation.
>>>=20
>>> Where and how did you obtain that understanding? The documentation =
is
>>> (intentionally or unintentionally) very vague regarding these =
details.
>>=20
>> Yes, so I had to fill in some aspects that I hope are reasonable. In =
particular, the documentation doesn't state explicitly that waiting for =
a process in a different thread than the original one is supported, but =
given set-process-thread it implicitly has to be, because what would =
otherwise be the purpose of set-process-thread?
>=20
> If you aren't interested in implementation details, there's no useful
> answer to such questions. For example, maybe whoever implemented
> set-process-thread had some strange ideas or simply made a design
> mistake?
Or we can check the manual?
"Because threads were a relatively late addition to Emacs Lisp, and due
to the way dynamic binding was sometimes used in conjunction with
=E2=80=98accept-process-output=E2=80=99, by default a process is locked =
to the thread
that created it."
That's the given explanation: to prevent bugs when using dynamic =
binding. However, the test in question doesn't use dynamic binding. We =
have to call set-process-thread in the test because it would fail =
otherwise, but apart from that, the question of thread locking (of =
affinity) should be unrelated.
>=20
>>> Perhaps you only cared about whether the code should or shouldn't =
err,
>>> but that's clearly not the only interesting or important aspect of
>>> this exercise, is it?
>>=20
>> It's at least one important aspect, but finding deadlocks (hangs) is =
probably more important.
>=20
> I think there are aspects of this much more important than deadlocks
> (or lack thereof).
And you're welcome to add tests for those aspects.
>=20
>> What I expect is the behavior of accept-process-output, as explained =
above. Specifically: accept-process-output will return nil quickly =
(within milliseconds, assuming the calling thread is unblocked) once the =
given process has exited. Once it returns, all process output has been =
given to process filters/buffers, and the sentinel has run.
>=20
> When the process exits, some thread which waits on its descriptor, or
> on the self-pipe descriptor, will return from the 'pselect' call, and
> will attempt to take the global lock. If it cannot take the global
> lock, it will block, in which case neither the process output will be
> delivered nor the sentinel will run, until the thread becomes
> unblocked. So, as you see, even your simple expectations aren't
> really guaranteed. And your test still succeeds, regardless. What
> does that tell you about the usefulness of the test?
Nothing. You seem to have very different ideas about the purpose of =
unit tests than I, so I don't think we should belabor this point any =
further.
>=20
>>>> What I=E2=80=99d assume is that the `thread-join=E2=80=99 calls are =
synchronization points =E2=80=94 after calling `thread-join=E2=80=99 the =
thread in question is guaranteed to have finished, and since the thread =
has been waiting for a process to finish (using the =
accept-process-output loop), the process is also guaranteed to have =
finished, and this sequence of event is guaranteed exactly in that =
order.
>>>=20
>>> I don't see how you reach those conclusions. The first part is =
true:
>>> when thread-join exits, the thread it was waiting for is guaranteed =
to
>>> have exited. But how it exited, and what caused it to exit -- these
>>> are open questions, in a situation where 10 subprocesses generate
>>> output and produce 10 signals.
>>=20
>> There's only one way for a thread to exit: when its thread function =
returns.=20
>=20
> No, a thread can also exit if it signals an error.
Of course, like any function. I was just being a bit sloppy. The unit =
test verifies that the threads don't signal errors.
>=20
>>> The order of the first two is guaranteed because a single thread
>>> arranges for that sequentially. But do you care if the sequence is
>>> like this:
>>>=20
>>> . all the 10 processes exit
>>> . one of the threads exit because the processes exited
>>> . the other 9 threads get awoken one by one, exit
>>> accept-process-output because of SIGCHLD, end exit
>>=20
>>=20
>> =46rom what I can see, that should be fine. The test only checks =
that the threads finish eventually and don't deadlock, but doesn't care =
about the order of events across threads (which is not guaranteed unless =
there are explicit synchronization points like mutexes or condition =
variables).
>=20
> OK, but then I think a test that is only interested in these
> superficial aspects is not very useful for making sure this part of
> Emacs is really useful in Real Life. Unless we add other tests which
> do examine what you consider to be "implementation details".
See above.
Since this discussion doesn't seem to lead anywhere, maybe we can just =
agree to disagree and move on?