From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Philipp Newsgroups: gmane.emacs.devel Subject: Re: master 8dcb19f 4/4: Add a unit test testing interaction between threads and processes. Date: Sun, 21 Mar 2021 13:00:36 +0100 Message-ID: <50388E17-1D84-4823-94D9-923E792DC3D1@gmail.com> References: <83k0s34eo1.fsf@gnu.org> <83h7n74cwe.fsf@gnu.org> <838s8i4ak8.fsf@gnu.org> <6C5E64C7-19A6-4D46-AC30-33C29FA5DEBB@gmail.com> <83mtvo7xno.fsf@gnu.org> <25EEFE6F-9532-4A5A-8824-A3F8F41012F3@gmail.com> <83y2ekfp3e.fsf@gnu.org> Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.60.0.2.21\)) Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="34283"; mail-complaints-to="usenet@ciao.gmane.io" Cc: Philipp Stephani , emacs-devel@gnu.org To: Eli Zaretskii Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Sun Mar 21 13:03:01 2021 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1lNwnN-0008ml-38 for ged-emacs-devel@m.gmane-mx.org; Sun, 21 Mar 2021 13:03:01 +0100 Original-Received: from localhost ([::1]:46280 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lNwnM-00080y-21 for ged-emacs-devel@m.gmane-mx.org; Sun, 21 Mar 2021 08:03:00 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:37300) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lNwlH-0007HR-MD for emacs-devel@gnu.org; Sun, 21 Mar 2021 08:00:54 -0400 Original-Received: from mail-ed1-x52a.google.com ([2a00:1450:4864:20::52a]:46607) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lNwl7-0004ka-VT; Sun, 21 Mar 2021 08:00:51 -0400 Original-Received: by mail-ed1-x52a.google.com with SMTP id h10so15964799edt.13; Sun, 21 Mar 2021 05:00:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=csYfkEriRJRGIEH8dGaBV4rVoIXky4Cazt+olZW6lZs=; b=Ac4HyFHsKZQk+Fal1kDbViE+BwhT5sa0YZdqFOT/DXhbgmVvR+kd9LgJQb80tsuDnB MiYRL5u48uVXMt11QwDuFfNYtyCYvvxIx6FOYO5A3/N/3Yi+csfsZkK6NcrSN17zN7lB Ex3NEVu5f9uR+s+B6NLReuveXwzTDG7hlDy68QL4xCrGW2thNG/j6RfUk72gOGkETDr0 ry1nXJBmtt375kyJ+SQhdZ1IPEHzenlxgeFQTy+hqOSsIbjR+XAixkZxOxp1ySD8xxM/ NDhCeT00iZ+4fSjYNxXifmk9uSAY/tQ5ECNpi1SOtSPExUA74YEY3Wvw0C7X4MVNLT8O n+xg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=csYfkEriRJRGIEH8dGaBV4rVoIXky4Cazt+olZW6lZs=; b=Cx24K0nJmyWYwRrxcwsYRIQpahc6g8iQ0nnv+EnYvGDa+ejQ+OH4uex+ZbQEa8GLY6 2D4DXy5OrdV8e6hYKg0YQmFX+kq9ropx+PCliB0xUlVgtC1unvXGCkoLx6rR9syAltDk fdFkmUy1rWJI0USTd/3yNAeUPbpK8aky313BT3zviK2GmReUU09VjYmeux2E2qNRIBIe VUIOgzP6CtBD7nC2ZdfUb1suBt8ub1hzkTVLsmo8c3v/AKfSDLESPsiDyMr0qyuD+9MB /IBlZCvMYlAC4xrouagtYiR7sVvCqs3Oz4uUA9giujHK2pWS2GdU8Gd+qNbMx1hD8xLg ciCA== X-Gm-Message-State: AOAM530lvhstUPvn2SZOMoptiOVF55kldNXAPJV9O3FNhQd4g5TW17NA +IHHuwSGYGlEm/zb0DN0jgh4SGejhXG9+Q== X-Google-Smtp-Source: ABdhPJy5U3wvaXiNWSPs2J/wdnAcGEZvRYH3RftrXxkhVGBLU52yMqPhuFnnymxUklzEJMG9wdrbHg== X-Received: by 2002:aa7:c850:: with SMTP id g16mr19949722edt.324.1616328038567; Sun, 21 Mar 2021 05:00:38 -0700 (PDT) Original-Received: from philipps-macbook-pro.fritz.box (p57aaf893.dip0.t-ipconnect.de. [87.170.248.147]) by smtp.gmail.com with ESMTPSA id e16sm6919357ejc.63.2021.03.21.05.00.37 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sun, 21 Mar 2021 05:00:37 -0700 (PDT) In-Reply-To: <83y2ekfp3e.fsf@gnu.org> X-Mailer: Apple Mail (2.3654.60.0.2.21) Received-SPF: pass client-ip=2a00:1450:4864:20::52a; envelope-from=p.stephani2@gmail.com; helo=mail-ed1-x52a.google.com X-Spam_score_int: -17 X-Spam_score: -1.8 X-Spam_bar: - X-Spam_report: (-1.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:266699 Archived-At: > Am 18.03.2021 um 16:39 schrieb Eli Zaretskii : >=20 >> From: Philipp >> Date: Thu, 18 Mar 2021 13:51:10 +0100 >> Cc: Philipp Stephani , >> emacs-devel@gnu.org >>=20 >>> See below: I don't see how you ensure in this test that each thread >>> waits for just one process. >>=20 >> Shouldn't providing a process argument to accept-process-output = ensure this,=20 >=20 > No. All it ensures is that accept-process-output will not return > until it heard from that process. It doesn't say what would happen > with output from other subprocesses that might be running. Yeah, but why should that matter? If multiple processes are started in = parallel, surely people shouldn't assume that there's some deterministic = order of output arrival across processes. >=20 >> the test should pass whether the process objects have thread affinity = or not (and that's the case for me). >=20 > But the test has no control of the affinity, so it just tests some > case, not the other(s). We have no idea what happens in those cases > it doesn't test. Sure, that's the nature of unit tests: they only test a single scenario. >=20 >> This might be how the implementation behaves (I haven't checked, but = I have no reason to doubt your explanation). However, the test doesn't = test the implementation, but the observable behavior. That is, I can't = formally prove that the implementation leads to the observable behavior = that the test expects, but I think it should, and I'd consider it a bug = (in Emacs, not the test code) if it doesn't. >=20 > It sounds like the "observable behavior" you care about is a very > small part of what's going on in that case. As one of many tests in > this area, that is entirely fine. But since it's the _only_ test, it > sounds like we are testing a very small part of the functionality, and > leave the rest entirely untested. Again, every unit tests behaves like this, unless you strive for 100% = path coverage (which is generally infeasible for a very complex function = like accept-process-output). I certainly won't stop anyone from adding further tests testing = different scenarios. >=20 >>> is it important whether a thread exits >>> because it read output or because it received SIGCHLD and read the >>> self-pipe? >>=20 >> No, these are implementation details: they might as well not exist as = far as the observable behavior of accept-process-output is concerned. >>=20 >>> is it important whether all the threads exit one after >>> another, or is it okay if they all wait until one thread exits and >>> then exit immediately (because all the processes have already died)? >>=20 >> I don't think that's observable/documented behavior: we only know = that a thread has exited after thread-join returns. >=20 > Threads and their interaction with subprocesses were implemented to > provide some useful behavior which is supposed to support practical > use cases and applications. If all we care about is that 10 threads > don't deadlock and all the processes exited, we didn't even scratch > the surface of those use cases and applications. In particular, a > practical application of these facilities may very well care whether > the threads exit one by one or not. It sounds very strange to me to > hear that we shouldn't care. The test in question tests what I've described, which is obviously not = the only thing we should care about. >=20 >>>> I wrote the test based on my understanding of the expected and = promised behavior of the process and thread functions, not the Posix = implementation. >>>=20 >>> Where and how did you obtain that understanding? The documentation = is >>> (intentionally or unintentionally) very vague regarding these = details. >>=20 >> Yes, so I had to fill in some aspects that I hope are reasonable. In = particular, the documentation doesn't state explicitly that waiting for = a process in a different thread than the original one is supported, but = given set-process-thread it implicitly has to be, because what would = otherwise be the purpose of set-process-thread? >=20 > If you aren't interested in implementation details, there's no useful > answer to such questions. For example, maybe whoever implemented > set-process-thread had some strange ideas or simply made a design > mistake? Or we can check the manual? "Because threads were a relatively late addition to Emacs Lisp, and due to the way dynamic binding was sometimes used in conjunction with =E2=80=98accept-process-output=E2=80=99, by default a process is locked = to the thread that created it." That's the given explanation: to prevent bugs when using dynamic = binding. However, the test in question doesn't use dynamic binding. We = have to call set-process-thread in the test because it would fail = otherwise, but apart from that, the question of thread locking (of = affinity) should be unrelated. >=20 >>> Perhaps you only cared about whether the code should or shouldn't = err, >>> but that's clearly not the only interesting or important aspect of >>> this exercise, is it? >>=20 >> It's at least one important aspect, but finding deadlocks (hangs) is = probably more important. >=20 > I think there are aspects of this much more important than deadlocks > (or lack thereof). And you're welcome to add tests for those aspects. >=20 >> What I expect is the behavior of accept-process-output, as explained = above. Specifically: accept-process-output will return nil quickly = (within milliseconds, assuming the calling thread is unblocked) once the = given process has exited. Once it returns, all process output has been = given to process filters/buffers, and the sentinel has run. >=20 > When the process exits, some thread which waits on its descriptor, or > on the self-pipe descriptor, will return from the 'pselect' call, and > will attempt to take the global lock. If it cannot take the global > lock, it will block, in which case neither the process output will be > delivered nor the sentinel will run, until the thread becomes > unblocked. So, as you see, even your simple expectations aren't > really guaranteed. And your test still succeeds, regardless. What > does that tell you about the usefulness of the test? Nothing. You seem to have very different ideas about the purpose of = unit tests than I, so I don't think we should belabor this point any = further. >=20 >>>> What I=E2=80=99d assume is that the `thread-join=E2=80=99 calls are = synchronization points =E2=80=94 after calling `thread-join=E2=80=99 the = thread in question is guaranteed to have finished, and since the thread = has been waiting for a process to finish (using the = accept-process-output loop), the process is also guaranteed to have = finished, and this sequence of event is guaranteed exactly in that = order. >>>=20 >>> I don't see how you reach those conclusions. The first part is = true: >>> when thread-join exits, the thread it was waiting for is guaranteed = to >>> have exited. But how it exited, and what caused it to exit -- these >>> are open questions, in a situation where 10 subprocesses generate >>> output and produce 10 signals. >>=20 >> There's only one way for a thread to exit: when its thread function = returns.=20 >=20 > No, a thread can also exit if it signals an error. Of course, like any function. I was just being a bit sloppy. The unit = test verifies that the threads don't signal errors. >=20 >>> The order of the first two is guaranteed because a single thread >>> arranges for that sequentially. But do you care if the sequence is >>> like this: >>>=20 >>> . all the 10 processes exit >>> . one of the threads exit because the processes exited >>> . the other 9 threads get awoken one by one, exit >>> accept-process-output because of SIGCHLD, end exit >>=20 >>=20 >> =46rom what I can see, that should be fine. The test only checks = that the threads finish eventually and don't deadlock, but doesn't care = about the order of events across threads (which is not guaranteed unless = there are explicit synchronization points like mutexes or condition = variables). >=20 > OK, but then I think a test that is only interested in these > superficial aspects is not very useful for making sure this part of > Emacs is really useful in Real Life. Unless we add other tests which > do examine what you consider to be "implementation details". See above. Since this discussion doesn't seem to lead anywhere, maybe we can just = agree to disagree and move on?