all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Pip Cet <pipcet@gmail.com>
Cc: Philipp Stephani <p.stephani2@gmail.com>, 43100@debbugs.gnu.org
Subject: bug#43100: 28.0.50; pcase not binding variables conditionally
Date: Mon, 31 Aug 2020 23:12:02 -0400	[thread overview]
Message-ID: <jwva6yatfzo.fsf-monnier+emacs@gnu.org> (raw)
In-Reply-To: <CAOqdjBf8OVdK-JeQjCDjTsOjjO6kuwo+mjEeJ3-JB1d4hN1KYA@mail.gmail.com> (Pip Cet's message of "Mon, 31 Aug 2020 19:32:43 +0000")

>> >> IIUC you want
>> >>
>> >>     (pcase V
>> >>       ((or (pred symbolp) name)
>> >>        (let ((foo 'bar)) name)))
>> >>
>> >> to behave like
>> >>
>> >>     (cond
>> >>      ((symbolp V) (let ((foo 'bar)) name))
>> >>      (t (let ((name V)) (let ((foo 'bar)) name))))
>> >>
>> >> ?
>> >
>> > Yes, that's correct. It's also how (pcase V ((or (pred symbolp) name)
>> > name) behaves...
>>
>> Indeed, but that's an accident.  Ideally it should either signal an
>> error at macro-expansion time, or return nil when V is a symbol.
> So, as I half-expected, the reaction to "pcase isn't powerful enough"
> is "let's make it less powerful" :-)

Your words, not mine.
BTW, you can get (more or less) the behavior you want with:

    (pcase V
      ((or (and (pred symbolp) (let name name)) name)
       (let ((foo 'bar)) name)))

[ The "more or less" is because when V is a symbol, the `name` within
  the body of the branch will not refer to the presumed surrounding
  binding of `name` but to a new binding also called `name` which
  temporarily hides the outer binding but is initialized to the same
  value.  The difference only kicks in if one of those bindings is
  mutated.  ]

> Seriously, I get the impression you strongly feel pcase shouldn't be
> (more) powerful, it should instead make non-explicit but fairly strong
> complexity promises.

I just want the complexity to be predictable without having to guess
which optimization will be applicable.  It's sometimes hard to satisfy
this desire, admittedly.  But note that it doesn't require the code to
be "efficient": your `append` very much satisfies my desire since its
complexity is very much predictable.

> I disagree: in practice, complexity promises and optimization based on
> them are often unnecessary. In fact, there's a Lisp tradition of using
> assq and memq rather than building ad-hoc hash tables, even though
> that often means run time is theoretically O(n^2) rather than O(n
> log(n)).

The complexity of `assq` is arguably bad, but it is very much predictable.

>> More specifically, I'd rather not choose a semantics that imposes
>> duplicating the branch body, since we have no control over its size and
>> that can hence lead to potential code size explosion.
> You're right, and it's a good thing that the duplication of the branch
> body is a fixable implementation detail rather than something imposed
> by the semantics.

It's only fixable if you disregard the "more or less" above.
I find it to be a pretty bad wrinkle in the semantics.

>> The design of `pcase` assumes you want to optimize away the tests that
>> are common to the various patterns.  That can't be done with dynamic
>> patterns.
> Or it's a bit more difficult, at least...

I think it's more than "a bit more difficult", because deciding how to
optimize the tests will almost always take significantly more time than
just doing the tests.  So in order to do it dynamically and be useful,
you still need to have 2 stages, where you optimize at one stage and
then use the result several times later (to recoup the cost of the
optimization).

> The difficult part, in fact, is deciding that we want the arglist to
> be part of the exposed function API: given an "arglist" function, the
> rest of the implementation seems unproblematic,

We have `help-function-arglist`, so it's not that hard.
But it's not guaranteed to work.

> though some workarounds for lexical binding are required (if nothing
> else, this is an interesting exercise in how painful lexical binding
> can be to work with).

It's more of a philosophical issue.  Lexical scoping fundamentally means
that variables don't have names: (lambda (x) x) and (lambda (y) y)
should be indistinguishable (that's what α-renaming says, anyway).

Obviously, `help-function-arglist` begs to differ, but I like lexical
scoping so I'd rather we keep such exceptions to a minimum.

>> So it's more like my option of returning nil, except it would return
>> the value of a surrounding `name` variable?  That could be done, but I'm
>> not convinced it'd be more often useful.
>
> I started out with a fairly explicit practical problem: parsing GCC
> machine descriptions, which are (essentially) sexps but have made the
> mistake of having "optional" non-final parts, and I think it would be
> great to express that in a pcase pattern, both for the obvious reasons
> of legibility and for some non-obvious reasons of my own.

I'm sorry, I don't understand what those sexps (and their «optional"
non-final parts») have to do with pcase's handling of `or` patterns
where the different branches don't bind the same set of variables.

>> > disallowing the modification of name in X.
>> That's rather hard to do (and I don't see what would be the benefit here).
> I meant adding a cautionary note about it in the documentation, not
> actively preventing it.

So we're replacing the current "don't do that" with another "don't do
that", neither of which is detected by the byte-compiler?

I'd rather fix it "right" with something with a clean and
simple semantics where the things you shouldn't do get properly flagged
during compilation.

> If we had read-only bindings, pcase would probably use them,
> but we don't.

We could add them, tho (I think it would be technically fairly easy,
it's more a question of surface-language design, figuring out the right
color of the bikeshed and deciding whether it's really worth adding this
extra complexity into the language).  I love immutable variables as much
as the next guy, yet I have resisted the urge to add them to Elisp
so far.

>> The current implementation amounts to "we should signal an error but we
>> don't bother doing so and just warn against it in the manual".
>> Patch welcome ;-)
> You mean a patch that would make pcase less powerful by making what I
> want to do impossible rather than merely difficult?

The way I see it, it would make what you want to do possible because
what you suggest would merely give meaning to programs previously invalid.
In contrast, with the current behavior your proposal implies breaking
backward compatibility.

>> >> I don't know of a simple implementation.
>> > Here's my better-than-nothing attempt.  I don't think that's complex;
>> > if anything, it's too trivial.
>> So you give it a search-based semantics.
> I don't think the semantics are at all unclear,

You misunderstood: I definitely didn't mean "unclear" when wrote "search-based".
The semantics are definitely clear.

>> The problem with it for me is that if we turn
>>
>>     `(,a ,@b)
>>
>> into
>>
>>     (append `(,a) b)
>
> List-final ,@ is too special, IMHO, to be turned into an (append)
> pattern at all.

So you want some ,@ to be turned into `append` and some not?
That's exactly what I don't want, since it makes the performance
unpredictable unless you know the details of the optimization.

>> the pcase match will take a lot more time than the equivalent
>>     `(,a . ,b)
>> Of course, you can try and handle these "easy" cases more efficiently,
>> but then your ,@ will sometimes be very cheap and sometimes very
>> expensive (depending on when an optimization can be applied), which
>> I think is a misfeature (it's for this same reason that I dislike CL
>> keyword arguments for functions).
> I think it's an implementation detail.

I don't want implementation details to choose between O(N) and O(1).
This is the kind of "detail" that should be chosen by the author.

> Some reasoning about the minimum and maximum length of sequences
> matched by pcase patterns could help ordinary pcases, too, though:

Potentially, of course.  There are many options when it comes to the
actual code generated by `pcase`.

> (pcase '(a b c d)
>   (`(,a ,b ,c ,d) (list a b c d)))
>
> could call (pcase--check-length EXPVAL 4 4) rather than calling consp
> four times, potentially descending into expensive predicates that are
> unnecessary.

But that presumes a C implementation of `pcase--check-length` since
(< (length X) 4) can be a very bad idea when X is a long list.

> It's strange to read quotes that yo

...?

> Again, I think that's a fundamental difference between us when it
> comes to the philosophy behind pcase.  If I understand you correctly,
> you deliberately want to limit pcase, moving away from the intuitive
> definition of it that I gave above, because there might be a situation
> in which people expect better performance than our limited
> implementation can give them. Is that correct?

[ The intuitive definition you gave doesn't work for most of the core
  patterns in `pcase` (e.g. `pred`, `app`, `let`, `guard`), so while
  it's fine as a guiding intuition, it can't be used to really define
  the intended semantics.  ]

No, the problem is not really one of performance, but rather one of
defining which variables are in scope within a branch such that a given
branch always has the same set of bindings within its scope, no matter
how the pattern was matched, which I think gives it a cleaner and
simpler semantics.

[ It is also linked to a potential problem of code size explosion,
  admittedly.  ]

> I think that's a weak reason for a strong limitation, but of course
> those are subjective questions.

I don't see what strong limitation you're referring to.  Having `name`
get the value of a surrounding `name` instead of nil seems like a very
minor issue and definitely not a "strong limitation".

> For example, I don't expect (pcase 9 ((* x x) x)) to work,

With an appropriate (pcase-defmacro * ...) you could definitely make it
work.  Not sure how useful it would be, but I'd have no problem with it:
just like `append` it doesn't impact the rest of `pcase` (and I presume
that it would come with a "predictably costly" complexity).


        Stefan






  reply	other threads:[~2020-09-01  3:12 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-29  9:41 bug#43100: 28.0.50; pcase not binding variables conditionally Pip Cet
2020-08-29 12:01 ` Philipp Stephani
2020-08-29 14:27   ` Pip Cet
2020-08-29 16:06     ` Stefan Monnier
2020-08-30 16:21       ` Pip Cet
2020-08-30 18:07         ` Stefan Monnier
2020-08-31 19:32           ` Pip Cet
2020-09-01  3:12             ` Stefan Monnier [this message]
2020-09-02  8:38               ` Pip Cet
2020-09-02 14:16                 ` Stefan Monnier
2020-09-05 14:36                   ` Pip Cet
2020-09-05 16:52                     ` Stefan Monnier
2021-03-02 13:38 ` Pip Cet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=jwva6yatfzo.fsf-monnier+emacs@gnu.org \
    --to=monnier@iro.umontreal.ca \
    --cc=43100@debbugs.gnu.org \
    --cc=p.stephani2@gmail.com \
    --cc=pipcet@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.