From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Pip Cet Newsgroups: gmane.emacs.bugs Subject: bug#43100: 28.0.50; pcase not binding variables conditionally Date: Wed, 2 Sep 2020 08:38:22 +0000 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="12774"; mail-complaints-to="usenet@ciao.gmane.io" Cc: Philipp Stephani , 43100@debbugs.gnu.org To: Stefan Monnier Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Wed Sep 02 10:41:03 2020 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1kDOKD-00036R-Je for geb-bug-gnu-emacs@m.gmane-mx.org; Wed, 02 Sep 2020 10:41:01 +0200 Original-Received: from localhost ([::1]:53140 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kDOKC-00085g-JL for geb-bug-gnu-emacs@m.gmane-mx.org; Wed, 02 Sep 2020 04:41:00 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:51674) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kDOJG-0006UB-He for bug-gnu-emacs@gnu.org; Wed, 02 Sep 2020 04:40:02 -0400 Original-Received: from debbugs.gnu.org ([209.51.188.43]:46699) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kDOJF-0006AF-Nr for bug-gnu-emacs@gnu.org; Wed, 02 Sep 2020 04:40:02 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1kDOJF-0005Q4-Kd for bug-gnu-emacs@gnu.org; Wed, 02 Sep 2020 04:40:01 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Pip Cet Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Wed, 02 Sep 2020 08:40:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 43100 X-GNU-PR-Package: emacs Original-Received: via spool by 43100-submit@debbugs.gnu.org id=B43100.159903594820762 (code B ref 43100); Wed, 02 Sep 2020 08:40:01 +0000 Original-Received: (at 43100) by debbugs.gnu.org; 2 Sep 2020 08:39:08 +0000 Original-Received: from localhost ([127.0.0.1]:58245 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kDOIN-0005On-HF for submit@debbugs.gnu.org; Wed, 02 Sep 2020 04:39:08 -0400 Original-Received: from mail-oo1-f45.google.com ([209.85.161.45]:38681) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kDOIK-0005OH-P0 for 43100@debbugs.gnu.org; Wed, 02 Sep 2020 04:39:05 -0400 Original-Received: by mail-oo1-f45.google.com with SMTP id z11so974069oon.5 for <43100@debbugs.gnu.org>; Wed, 02 Sep 2020 01:39:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=b470FM/TBZoOmtwxQOTr4Sv5vX3xvvw1o/uOBk+MqDY=; b=Xc8vsFJApCe4Z4R3xM22kFvceS3PDjozl5Z6YXppvjS0zJVOsaEg3Isjuq8AM8EzUd S0ufSvpZWNcMXZ/KhD9ctcSbxtx42GSWtvUfbdSMExi8KDFKTm21NtqWF8Hs5s0mpmxR dm5jDnL0qP4kHlBEFFkTe2KpS7B+0Kxkg/ehtpDNnRc6hayIVCyGHUM79OSrE+N21SEB q4bQiWntp0g3gUbasSvEZ7oqkYUhUt+WhI0YjG5Sr9WU6Ch7JLxdNWtH7uJSYuQsB1VG 33r2GwM7w5pWOn7ZSuOzfJ+RF8Oq6ktOE/gykODwe8+Z8dq5aiE4DhxSaPknVp9ICHq3 j1gQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=b470FM/TBZoOmtwxQOTr4Sv5vX3xvvw1o/uOBk+MqDY=; b=fqXUC+FfU4rgscBbvXYv3IW83++9lMDKh6CwkNbxcVcJMrhthp1sscLNRKacgGIGeO aP6qqhRI2CTkiRBTBDB99A4w0BklW5/o1oIXkbTV41ooSaOSz2zqhyGoFGeKdWsb7c8r c7k1pvAmSlAr1B0gaUzkSPjDkamzfq4V5FZ8QCMviqIRjGvroU/MXVe/q6fInc/+CCw0 UiBJ+uYeIFaXkYv5Sbna6UqJ85i7h7b5hEOA9lFTIb8/PiBbxxYzDwWg5NDdtDoTEalf uBQLS65XAZagF+2tJ6QjlQidDIZHVndK11G7PX3cVTswwqZQpRwN77Nxanh5zdwiLdDO s7mA== X-Gm-Message-State: AOAM5312FgcdX+kcrhvvlZ2OYWY8yK/6gu2cyqG7ODJdi6RgxhgUhhIc /4gi+lbYo1+mWUeQ5ebUDptT7gWY2Gz7oLrlhPA= X-Google-Smtp-Source: ABdhPJyJzo3fdjuArdVP8I+XYafYsJBKdIfYdH0ix98wvlwArDsIsJqOR6VyGebcVNoKmHoVAT1cA4Ru6HRdMoZXiPM= X-Received: by 2002:a4a:5a06:: with SMTP id v6mr4645821ooa.22.1599035938733; Wed, 02 Sep 2020 01:38:58 -0700 (PDT) In-Reply-To: X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.io gmane.emacs.bugs:186915 Archived-At: On Tue, Sep 1, 2020 at 3:12 AM Stefan Monnier wr= ote: > > So, as I half-expected, the reaction to "pcase isn't powerful enough" > > is "let's make it less powerful" :-) > Your words, not mine. I'm glad you said that! > BTW, you can get (more or less) the behavior you want with: > > (pcase V > ((or (and (pred symbolp) (let name name)) name) > (let ((foo 'bar)) name))) That's a good alternative. > [ The "more or less" is because when V is a symbol, the `name` within > the body of the branch will not refer to the presumed surrounding > binding of `name` but to a new binding also called `name` which > temporarily hides the outer binding but is initialized to the same > value. The difference only kicks in if one of those bindings is > mutated. ] > > > Seriously, I get the impression you strongly feel pcase shouldn't be > > (more) powerful, it should instead make non-explicit but fairly strong > > complexity promises. > > I just want the complexity to be predictable without having to guess > which optimization will be applicable. So it's better to have a predictably-bad `append' rather than a sometimes-bad one? But how does that affect ,@? We could make that predictably bad, too! > >> More specifically, I'd rather not choose a semantics that imposes > >> duplicating the branch body, since we have no control over its size an= d > >> that can hence lead to potential code size explosion. > > You're right, and it's a good thing that the duplication of the branch > > body is a fixable implementation detail rather than something imposed > > by the semantics. > > It's only fixable if you disregard the "more or less" above. > I find it to be a pretty bad wrinkle in the semantics. So setq the outer binding if it changed? Note that people also might expect (pcase l (`(,a . ,b) (setq b a))) to modify l... > >> The design of `pcase` assumes you want to optimize away the tests that > >> are common to the various patterns. That can't be done with dynamic > >> patterns. > > Or it's a bit more difficult, at least... > I think it's more than "a bit more difficult", because deciding how to > optimize the tests will almost always take significantly more time than > just doing the tests. So in order to do it dynamically and be useful, > you still need to have 2 stages, where you optimize at one stage and > then use the result several times later (to recoup the cost of the > optimization). Wait, I think there was a misunderstanding here. I don't mean that the pcase pattern should depend structurally on let-bound variables appearing in it. That does sound impossible to me (except with dynamic scoping). > > The difficult part, in fact, is deciding that we want the arglist to > > be part of the exposed function API: given an "arglist" function, the > > rest of the implementation seems unproblematic, > > We have `help-function-arglist`, so it's not that hard. > But it's not guaranteed to work. Oh, thanks for pointing that out. It's not very good, though, is it? > > though some workarounds for lexical binding are required (if nothing > > else, this is an interesting exercise in how painful lexical binding > > can be to work with). > > It's more of a philosophical issue. Deciding whether to expose arglists for functions is, yes. That's what I said above. > Lexical scoping fundamentally means > that variables don't have names: (lambda (x) x) and (lambda (y) y) > should be indistinguishable (that's what =CE=B1-renaming says, anyway). But they're not. (equal (lambda (x) x) (lambda (y) y)) returns nil. I don't see why pcase-call should be in the "funcall" category of being unable to distinguish the two, rather than in the "equal" category of being able to. > >> So it's more like my option of returning nil, except it would return > >> the value of a surrounding `name` variable? That could be done, but I= 'm > >> not convinced it'd be more often useful. > > > > I started out with a fairly explicit practical problem: parsing GCC > > machine descriptions, which are (essentially) sexps but have made the > > mistake of having "optional" non-final parts, and I think it would be > > great to express that in a pcase pattern, both for the obvious reasons > > of legibility and for some non-obvious reasons of my own. > > I'm sorry, I don't understand what those sexps (and their =C2=ABoptional" > non-final parts=C2=BB) have to do with pcase's handling of `or` patterns > where the different branches don't bind the same set of variables. Well, maybe I'm just missing an obvious way of doing it. > >> > disallowing the modification of name in X. > >> That's rather hard to do (and I don't see what would be the benefit he= re). > > I meant adding a cautionary note about it in the documentation, not > > actively preventing it. > > So we're replacing the current "don't do that" with another "don't do > that", neither of which is detected by the byte-compiler? Yes. Emacs Lisp isn't free of "don't do that"s, and reducing the "don't do that" space drastically is better than pointing out the tiny fraction of it which remains as evidence that we shouldn't do the reduction in the first place. > I'd rather fix it "right" with something with a clean and > simple semantics where the things you shouldn't do get properly flagged > during compilation. If you want clean and simple (lexical scoping) semantics and execute user-provided code, you should probably call a user-provided lambda rather than evaluating an expression in the first place? > >> The current implementation amounts to "we should signal an error but w= e > >> don't bother doing so and just warn against it in the manual". > >> Patch welcome ;-) > > You mean a patch that would make pcase less powerful by making what I > > want to do impossible rather than merely difficult? > > The way I see it, it would make what you want to do possible because > what you suggest would merely give meaning to programs previously invalid= . > In contrast, with the current behavior your proposal implies breaking > backward compatibility. I think what you mean is you'd rather break backward compatibility in two steps rather than one, by first causing errors, then redefining the new errors not to happen? Because if so, I totally agree. > >> >> I don't know of a simple implementation. > >> > Here's my better-than-nothing attempt. I don't think that's complex= ; > >> > if anything, it's too trivial. > >> So you give it a search-based semantics. > > I don't think the semantics are at all unclear, > > You misunderstood: I definitely didn't mean "unclear" when wrote "search-= based". > The semantics are definitely clear. Saying I give it search-based semantics implies there are other semantics I could have chosen. Semantically, all I've done is decide, probably incorrectly, that append should be biased towards shy matching of the fist arguments (and that it only work on proper lists). > >> The problem with it for me is that if we turn > >> > >> `(,a ,@b) > >> > >> into > >> > >> (append `(,a) b) > > > > List-final ,@ is too special, IMHO, to be turned into an (append) > > pattern at all. > > So you want some ,@ to be turned into `append` and some not? As an implementation detail, yes. Just like ` does, by the way: `(,@l) doesn't call `append' (it doesn't even check l is a list!), so why should it behave differently when used as a pcase pattern? IOW, I'd like ` (as a pcase macro) to behave like ` already does: constant-time `(,@l), linear-time `(,@l 3). > That's exactly what I don't want, since it makes the performance > unpredictable unless you know the details of the optimization. I'm not opposed to the idea that there should be a form that means "use pcase to evaluate this, but throw an error if I messed up and complexity is worse than O(f(n))". I just don't think it should be (pcase ...), leaving the f to be determined by a human who's allowed to know the expected complexity of pcase patterns (currently not mentioned in the documentation), but isn't allowed to "know the details of the optimization". And, again, performance of ` is unpredictable unless you know the details of the optimization. Should we get rid of ` next? > >> the pcase match will take a lot more time than the equivalent > >> `(,a . ,b) > >> Of course, you can try and handle these "easy" cases more efficiently, > >> but then your ,@ will sometimes be very cheap and sometimes very > >> expensive (depending on when an optimization can be applied), which > >> I think is a misfeature (it's for this same reason that I dislike CL > >> keyword arguments for functions). > > I think it's an implementation detail. > > I don't want implementation details to choose between O(N) and O(1). > This is the kind of "detail" that should be chosen by the author. All powerful matching patterns I'm aware of have horrible worst-time complexity which is reduced to acceptable complexity for most cases, in practice. Take `equal', for example. (let (r s) (dotimes (i 100) (setq r (cons r r)) (setq s (cons s s)) (benchmark 1 `(equal ',r ',s)))) What you're asking sounds to me like the equivalent of "I want a compression algorithm to be as good as possible, but the size of the decompressed data has to be predictable, even if I perform recursive decompression of the output". > > Some reasoning about the minimum and maximum length of sequences > > matched by pcase patterns could help ordinary pcases, too, though: > > Potentially, of course. There are many options when it comes to the > actual code generated by `pcase`. Precisely, including ones that reduce complexity. Somewhat unpredictably. Which is why predictable performance is a bad thing to demand of pcase patterns. > > (pcase '(a b c d) > > (`(,a ,b ,c ,d) (list a b c d))) > > > > could call (pcase--check-length EXPVAL 4 4) rather than calling consp > > four times, potentially descending into expensive predicates that are > > unnecessary. > > But that presumes a C implementation of `pcase--check-length` since > (< (length X) 4) can be a very bad idea when X is a long list. Even a Lisp implementation that isn't a wrapper around (length X) would avoid unnecessary predicates. That is the reason I wrote pcase--check-length rather than (<=3D 4 (length EXPVAL) 4)... > > It's strange to read quotes that yo > ...? Well, it's strange to read quotes that you only half-deleted from your email, Pip! (Sorry). > > Again, I think that's a fundamental difference between us when it > > comes to the philosophy behind pcase. If I understand you correctly, > > you deliberately want to limit pcase, moving away from the intuitive > > definition of it that I gave above, because there might be a situation > > in which people expect better performance than our limited > > implementation can give them. Is that correct? > > [ The intuitive definition you gave doesn't work for most of the core > patterns in `pcase` (e.g. `pred`, `app`, `let`, `guard`), so while > it's fine as a guiding intuition, it can't be used to really define > the intended semantics. ] You mean because "pred" isn't defined as a function, and "let" is defined differently? Because if I'm allowed to defmacro pred, it works perfectly well: (defmacro pred (p) ``(pred ,p)) (defun equalish (a b) (... (pcase a (`(pred ,p) (funcall p b))) with extra hygiene being required, of course, but there's no problem here. > No, the problem is not really one of performance, but rather one of > defining which variables are in scope within a branch such that a given > branch always has the same set of bindings within its scope, no matter > how the pattern was matched, which I think gives it a cleaner and > simpler semantics. Yes: calling lambdas gives cleaner and simpler semantics than evaluating forms, which gives terser code and implies precisely the unclean (but useful) semantics I want. pcase does the latter, implying semantics which we can implement (more easily and cleanly using dynamic bindings, for some reason), but choose not to. I'd like to understand why! > [ It is also linked to a potential problem of code size explosion, > admittedly. ] Let's be clear, though, that we're not talking about an explosion in the number of conses used to express the pcase. It's only when the code is byte compiled that an explosion potentially happens (and it's easily fixable with dynamic binding, but not with lexical binding, as far as I can see). (I think it's interesting that you can't evaluate "proper" pcase at run time; at least, I don't see how you'd implement a pcase-dyn function that evaluates its second argument etc. before interpreting it. It's easy to do with dynamic binding. I think that's because our implementation of lexical binding is too limited, but I'm not sure.) > > I think that's a weak reason for a strong limitation, but of course > > those are subjective questions. > > I don't see what strong limitation you're referring to. Having `name` > get the value of a surrounding `name` instead of nil seems like a very > minor issue and definitely not a "strong limitation". The strong limitation is "you can only add new pcase patterns if they have predictable complexity (in code size, run time, or memory usage, presumably)". Taken at face value, that means no ,@, nothing equal-based, no efficient joining of two pcase cases into one... and no conditional binding of variables. (IMHO, (pcase X (A B) (C D)) should be equivalent to (pcase X ((or (and (let which-alternative 0) A) (and (let which-alternative 1) C)) (case which-alternative (0 B) (1 D)))), assuming "which-alternative" is free in A, B, C, D.) You're right, though, that it'll usually be possible to get the right behavior using the "bind everything to nil" idea. I hadn't understood you to be seriously proposing we implement that, but if you do I'll prepare a patch. > > For example, I don't expect (pcase 9 ((* x x) x)) to work, > > With an appropriate (pcase-defmacro * ...) you could definitely make it > work. By solving polynomial roots? I think it's better to use (pcase-defmacro * ...) for a Kleene star macro...which it turns out isn't precisely trivial to implement with lexical scoping. Pip