bug#43100: 28.0.50; pcase not binding variables conditionally

unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed

* bug#43100: 28.0.50; pcase not binding variables conditionally
@ 2020-08-29  9:41 Pip Cet
  2020-08-29 12:01 ` Philipp Stephani
  2021-03-02 13:38 ` Pip Cet
  0 siblings, 2 replies; 13+ messages in thread
From: Pip Cet @ 2020-08-29  9:41 UTC (permalink / raw)
  To: 43100, monnier

[-- Attachment #1: Type: text/plain, Size: 1154 bytes --]

I'm having trouble with pcase's behavior.

(pcase "a"
  ((or (pred symbolp) name)
   (let ((foo 'bar)) name)))

throws an error. It shouldn't. (Note that the dummy "let" is necessary
to force the pcase code generation to use a function call).

I believe the culprit is the code around this comment in pcase.el

;; If some of `vars' were not found in `prevvars', that's
;; OK it just means those vars aren't present in all
;; branches, so they can be used within the pattern
;; (e.g. by a `guard/let/pred') but not in the branch.

I believe that's incorrect: using the variable in a condition-case
should work, as should conditional shadowing of an existing binding,
as in this case:

(let ((name "default"))
  (pcase "a"
    ((or (pred symbolp) name)
     name)))

(which works), or this case:

(let ((name "default"))
  (pcase "a"
    ((or (pred symbolp) name)
     (let ((foo 'bar)) name))))

(which doesn't).

I believe the right fix is not to share code for the same branch if it
uses different variables, as in the attached patch. It's possible this
increases codegen complexity in some construed cases, but in practice
that shouldn't be a problem.

[-- Attachment #2: 0001-Allow-variable-bindings-to-differ-across-pcase-alter.patch --]
[-- Type: text/x-patch, Size: 1135 bytes --]

From ec7e4a92ab08adf4eb036ea50f7d70bade63719b Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@gmail.com>
Date: Sat, 29 Aug 2020 09:35:41 +0000
Subject: [PATCH] Allow variable bindings to differ across pcase alternatives.

This fixes pcase cases like ((or (pred stringp) name) name), which
should use the pcase-local binding only if EXPVAL isn't a string.

* lisp/emacs-lisp/pcase.el (pcase--expand): Do not share code if
variables differ.
---
 lisp/emacs-lisp/pcase.el | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lisp/emacs-lisp/pcase.el b/lisp/emacs-lisp/pcase.el
index a8ce23284c..72916296cb 100644
--- a/lisp/emacs-lisp/pcase.el
+++ b/lisp/emacs-lisp/pcase.el
@@ -346,7 +346,8 @@ pcase--expand
             (lambda (code vars)
               (let ((vars (pcase--fgrep vars code))
                     (prev (assq code seen)))
-                (if (not prev)
+                (if (or (not prev)
+                        (not (equal (cadr prev) vars)))
                     (let ((res (pcase-codegen code vars)))
                       (push (list code vars res) seen)
                       res)
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* bug#43100: 28.0.50; pcase not binding variables conditionally
  2020-08-29  9:41 bug#43100: 28.0.50; pcase not binding variables conditionally Pip Cet
@ 2020-08-29 12:01 ` Philipp Stephani
  2020-08-29 14:27   ` Pip Cet
  2021-03-02 13:38 ` Pip Cet
  1 sibling, 1 reply; 13+ messages in thread
From: Philipp Stephani @ 2020-08-29 12:01 UTC (permalink / raw)
  To: Pip Cet; +Cc: 43100, Stefan Monnier

Am Sa., 29. Aug. 2020 um 11:42 Uhr schrieb Pip Cet <pipcet@gmail.com>:
>
> I'm having trouble with pcase's behavior.
>
> (pcase "a"
>   ((or (pred symbolp) name)
>    (let ((foo 'bar)) name)))
>
> throws an error. It shouldn't.

Isn't this case documented in the manual? The last section of
https://www.gnu.org/software/emacs/manual/html_node/elisp/pcase-Macro.html
states:
"It makes no sense for each sub-pattern [in an `or' sequence] to
let-bind a different set of symbols because the body forms have no way
to distinguish which sub-pattern matched and choose among the
different sets."

^ permalink raw reply	[flat|nested] 13+ messages in thread

* bug#43100: 28.0.50; pcase not binding variables conditionally
  2020-08-29 12:01 ` Philipp Stephani
@ 2020-08-29 14:27   ` Pip Cet
  2020-08-29 16:06     ` Stefan Monnier
  0 siblings, 1 reply; 13+ messages in thread
From: Pip Cet @ 2020-08-29 14:27 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: 43100, Stefan Monnier

[-- Attachment #1: Type: text/plain, Size: 1739 bytes --]

 On Sat, Aug 29, 2020 at 12:01 PM Philipp Stephani
<p.stephani2@gmail.com> wrote:
> Am Sa., 29. Aug. 2020 um 11:42 Uhr schrieb Pip Cet <pipcet@gmail.com>:
> > I'm having trouble with pcase's behavior.
> >
> > (pcase "a"
> >   ((or (pred symbolp) name)
> >    (let ((foo 'bar)) name)))
> >
> > throws an error. It shouldn't.
>
> Isn't this case documented in the manual? The last section of
> https://www.gnu.org/software/emacs/manual/html_node/elisp/pcase-Macro.html
> states:
> "It makes no sense for each sub-pattern [in an `or' sequence] to
> let-bind a different set of symbols because the body forms have no way
> to distinguish which sub-pattern matched and choose among the
> different sets."

Thanks for pointing this out.

I disagree with what the documentation says there: it does make
perfect sense, to me, to conditionally shadow a (lexical) let binding
with a pcase-let one, and body forms will have no trouble in practice
distinguishing between the outer and the inner let-binding. The code
obviously attempts to handle this case, too, it just doesn't get it
perfectly right.

I agree and accept that pcase is very limited, and it probably always
will be, but I think those limitations shouldn't be entirely
arbitrary, and hitting them shouldn't cause silently malfunctioning
code but error messages.

Things like the behavior of

(pcase-let ((`(,a ,@b) (list 3 4)))
  a)

just seem puzzling to me. There are good reasons not to implement
sublist matching (though I don't think those reasons are sufficient
not to have a simple implementation anyway), so an error message would
be acceptable, but the current implementation treats \,@ as an
ordinary symbol, which doesn't help anyone.

Sorry for complaining. Here's a patch.

[-- Attachment #2: 0001-Complain-about-in-backquote-pcase-patterns.patch --]
[-- Type: text/x-patch, Size: 810 bytes --]

From c52708d92dec4a6d7810d634dd4ff23a2cbca1ec Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@gmail.com>
Date: Sat, 29 Aug 2020 14:21:29 +0000
Subject: [PATCH] Complain about ,@ in backquote pcase patterns

* lisp/emacs-lisp/pcase.el (\`): Complain upon encountering \,@ in
backquote pcase patterns.
---
 lisp/emacs-lisp/pcase.el | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lisp/emacs-lisp/pcase.el b/lisp/emacs-lisp/pcase.el
index a8ce23284c..642ff0f73a 100644
--- a/lisp/emacs-lisp/pcase.el
+++ b/lisp/emacs-lisp/pcase.el
@@ -971,6 +971,7 @@ \`
   (declare (debug (pcase-QPAT)))
   (cond
    ((eq (car-safe qpat) '\,) (cadr qpat))
+   ((eq (car-safe qpat) '\,@) (error "sublist matching not supported"))
    ((vectorp qpat)
     `(and (pred vectorp)
           (app length ,(length qpat))
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* bug#43100: 28.0.50; pcase not binding variables conditionally
  2020-08-29 14:27   ` Pip Cet
@ 2020-08-29 16:06     ` Stefan Monnier
  2020-08-30 16:21       ` Pip Cet
  0 siblings, 1 reply; 13+ messages in thread
From: Stefan Monnier @ 2020-08-29 16:06 UTC (permalink / raw)
  To: Pip Cet; +Cc: Philipp Stephani, 43100

>> > I'm having trouble with pcase's behavior.
>> >
>> > (pcase "a"
>> >   ((or (pred symbolp) name)
>> >    (let ((foo 'bar)) name)))
>> >
>> > throws an error. It shouldn't.
>>
>> Isn't this case documented in the manual? The last section of
>> https://www.gnu.org/software/emacs/manual/html_node/elisp/pcase-Macro.html
>> states:
>> "It makes no sense for each sub-pattern [in an `or' sequence] to
>> let-bind a different set of symbols because the body forms have no way
>> to distinguish which sub-pattern matched and choose among the
>> different sets."
>
> Thanks for pointing this out.
>
> I disagree with what the documentation says there: it does make
> perfect sense, to me, to conditionally shadow a (lexical) let binding
> with a pcase-let one, and body forms will have no trouble in practice
> distinguishing between the outer and the inner let-binding.

How do you expect them to distinguish?

IIUC you want 

    (pcase V
      ((or (pred symbolp) name)
       (let ((foo 'bar)) name)))

to behave like

    (cond
     ((symbolp V) (let ((foo 'bar)) name))
     (t (let ((name V)) (let ((foo 'bar)) name))))

?

I'd rather not go there since it means that the single occurrence of the
`name` identifier in the branch's body refers to two different variable
bindings depending on which pattern was matched.  It smells of dynamic
scoping, tho it is admittedly compatible with lexical-scoping but only at
the cost of having to duplicate the branch's body.

The "intended" behavior instead would be to behave like

    (cond
     ((symbolp V) (let ((name nil)) (let ((foo 'bar)) name)))
     (t (let ((name V)) (let ((foo 'bar)) name))))

That's already the behavior you get if you switch the two:

    (macroexpand '(pcase V
                    ((or (and (pred foo) name) (pred symbolp))
                     (let ((foo 'bar)) name))))
    =>
    (let* ((pcase-0 (lambda (name) (let ((foo 'bar)) name))))
      (cond ((foo V) (funcall pcase-0 V))
            ((symbolp V) (funcall pcase-0 nil))
            (t nil)))

the fact that the behavior depends on the order of elements in `or` is
an undesirable side effect of the implementation technique.

> Things like the behavior of
>
> (pcase-let ((`(,a ,@b) (list 3 4)))
>   a)
>
> just seem puzzling to me. There are good reasons not to implement
> sublist matching (though I don't think those reasons are sufficient
> not to have a simple implementation anyway),

I don't know of a simple implementation.

> so an error message would be acceptable,

You're probably right.

> Sorry for complaining. Here's a patch.

LGTM,


        Stefan






^ permalink raw reply	[flat|nested] 13+ messages in thread

* bug#43100: 28.0.50; pcase not binding variables conditionally
  2020-08-29 16:06     ` Stefan Monnier
@ 2020-08-30 16:21       ` Pip Cet
  2020-08-30 18:07         ` Stefan Monnier
  0 siblings, 1 reply; 13+ messages in thread
From: Pip Cet @ 2020-08-30 16:21 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Philipp Stephani, 43100

On Sat, Aug 29, 2020 at 4:06 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> >> > I'm having trouble with pcase's behavior.
> >> >
> >> > (pcase "a"
> >> >   ((or (pred symbolp) name)
> >> >    (let ((foo 'bar)) name)))
> >> >
> >> > throws an error. It shouldn't.
> >>
> >> Isn't this case documented in the manual? The last section of
> >> https://www.gnu.org/software/emacs/manual/html_node/elisp/pcase-Macro.html
> >> states:
> >> "It makes no sense for each sub-pattern [in an `or' sequence] to
> >> let-bind a different set of symbols because the body forms have no way
> >> to distinguish which sub-pattern matched and choose among the
> >> different sets."
> >
> > Thanks for pointing this out.
> >
> > I disagree with what the documentation says there: it does make
> > perfect sense, to me, to conditionally shadow a (lexical) let binding
> > with a pcase-let one, and body forms will have no trouble in practice
> > distinguishing between the outer and the inner let-binding.
>
> How do you expect them to distinguish?

By repeating the pcase pattern's predicate, usually, or by relying on
other knowledge about the previous value. That's in practice, in
theory you can use gensym or an inaccessible cons.

> IIUC you want
>
>     (pcase V
>       ((or (pred symbolp) name)
>        (let ((foo 'bar)) name)))
>
> to behave like
>
>     (cond
>      ((symbolp V) (let ((foo 'bar)) name))
>      (t (let ((name V)) (let ((foo 'bar)) name))))
>
> ?

Yes, that's correct. It's also how (pcase V ((or (pred symbolp) name)
name) behaves...

> I'd rather not go there

You'd rather have the behavior of (pcase V ((or pred symbolp) name)
EXPR) depend on the complexity of EXPR?

> since it means that the single occurrence of the
> `name` identifier in the branch's body refers to two different variable
> bindings depending on which pattern was matched.  It smells of dynamic
> scoping, tho it is admittedly compatible with lexical-scoping but only at
> the cost of having to duplicate the branch's body.

Well, I'm afraid pcase does smell of dynamic scoping :-)

More precisely, to me, what pcase simulates is building a lexical
environment, then evaluating the BODY argument with the environment as
second argument. (pcase EXP (PAT BODY)) is very roughly:

(for-each-possible-environment ENV (if (equalish (eval PAT ENV) EXP)
(return (eval BODY ENV)))

where for-each-possible-environment magically finds the smallest (or
"best") ENV first.

I think it would be nice to have a lexical three-argument version of
pcase which specifies which variables are output values, treating the
remaining ones as input values, to make it easier to build
non-constant patterns. But that would be very different from what
pcase does do today, which is closer to what I hinted at above.

IOW, if you want to call a function with arguments determined by
pcase-like patterns, why not introduce pcase-call so something like
the following would work:

(defun f (hello world) (cons hello world))

(let ((space " ") (hw "hello world"))
  (pcase-call 'f ((concat hello space world) hw)))

(that would introduce named function arguments, which I think would be
a nice bonus)?

But it's not what pcase is! There's no function call or argument list
in there, just expressions involved in environments.

As for duplicating the body, that is an implementation detail. You can
easily avoid it by producing

(let ((name name))
  (cond ((symbolp V) X)
    (progn (setq name V) X)))

disallowing the modification of name in X.

> The "intended" behavior instead would be to behave like
>
>     (cond
>      ((symbolp V) (let ((name nil)) (let ((foo 'bar)) name)))
>      (t (let ((name V)) (let ((foo 'bar)) name))))
>
> That's already the behavior you get if you switch the two:
>
>     (macroexpand '(pcase V
>                     ((or (and (pred foo) name) (pred symbolp))
>                      (let ((foo 'bar)) name))))
>     =>
>     (let* ((pcase-0 (lambda (name) (let ((foo 'bar)) name))))
>       (cond ((foo V) (funcall pcase-0 V))
>             ((symbolp V) (funcall pcase-0 nil))
>             (t nil)))

I don't see where the nil comes from, or why it's a useful choice for
a default value.

> the fact that the behavior depends on the order of elements in `or` is
> an undesirable side effect of the implementation technique.

It also depends on the complexity of the branch.

It seems to me there are at least three consistent ways of behaving
(throw an error, bind name to nil, bind name to name), with an
inconsistent fourth way being what's currently implemented.

> > Things like the behavior of
> >
> > (pcase-let ((`(,a ,@b) (list 3 4)))
> >   a)
> >
> > just seem puzzling to me. There are good reasons not to implement
> > sublist matching (though I don't think those reasons are sufficient
> > not to have a simple implementation anyway),
>
> I don't know of a simple implementation.

Here's my better-than-nothing attempt. I don't think that's complex;
if anything, it's too trivial.

(pcase-defmacro append (&rest patterns)
  (if patterns
      (let* ((r1 (gensym))
         (r2 (gensym))
         (pat (list '\` (cons (list '\, (list 'and r1 (car patterns)))
                  (list '\, (list 'and r2 (cons 'append
                          (cdr patterns)))))))
         (f (eval `(lambda (l)
             (catch 'pcase--append
               (dotimes (i (1+ (length l)))
                 (let ((l1 (seq-subseq l 0 i))
                   (l2 (seq-subseq l i)))
                   (pcase (cons l1 l2)
                 (,pat (throw 'pcase--append
                          (cons ,r1 ,r2))))))))
              t)))
    `(app ,f ,pat))
    ''nil))



>
> > so an error message would be acceptable,
>
> You're probably right.
>
> > Sorry for complaining. Here's a patch.
>
> LGTM,
>
>
>         Stefan
>





^ permalink raw reply	[flat|nested] 13+ messages in thread

* bug#43100: 28.0.50; pcase not binding variables conditionally
  2020-08-30 16:21       ` Pip Cet
@ 2020-08-30 18:07         ` Stefan Monnier
  2020-08-31 19:32           ` Pip Cet
  0 siblings, 1 reply; 13+ messages in thread
From: Stefan Monnier @ 2020-08-30 18:07 UTC (permalink / raw)
  To: Pip Cet; +Cc: Philipp Stephani, 43100

>> IIUC you want
>>
>>     (pcase V
>>       ((or (pred symbolp) name)
>>        (let ((foo 'bar)) name)))
>>
>> to behave like
>>
>>     (cond
>>      ((symbolp V) (let ((foo 'bar)) name))
>>      (t (let ((name V)) (let ((foo 'bar)) name))))
>>
>> ?
>
> Yes, that's correct. It's also how (pcase V ((or (pred symbolp) name)
> name) behaves...

Indeed, but that's an accident.  Ideally it should either signal an
error at macro-expansion time, or return nil when V is a symbol.
Since the current implementation doesn't go to the effort of doing
either of those, we instead limit ourselves to recommend against using
such patterns (IOW, "use at your own risks").

>> I'd rather not go there
> You'd rather have the behavior of (pcase V ((or pred symbolp) name)
> EXPR) depend on the complexity of EXPR?

More specifically, I'd rather not choose a semantics that imposes
duplicating the branch body, since we have no control over its size and
that can hence lead to potential code size explosion.

A code size explosion due to a particular implementation choice is
undesirable, but a code size explosion imposed by the semantics is much
more problematic.

> I think it would be nice to have a lexical three-argument version of
> pcase which specifies which variables are output values, treating the
> remaining ones as input values, to make it easier to build
> non-constant patterns.

The design of `pcase` assumes you want to optimize away the tests that
are common to the various patterns.  That can't be done with dynamic
patterns.

> IOW, if you want to call a function with arguments determined by
> pcase-like patterns, why not introduce pcase-call so something like
> the following would work:
>
> (defun f (hello world) (cons hello world))
>
> (let ((space " ") (hw "hello world"))
>   (pcase-call 'f ((concat hello space world) hw)))

How do you intend to implement this?

> As for duplicating the body, that is an implementation detail. You can
> easily avoid it by producing
>
> (let ((name name))
>   (cond ((symbolp V) X)
>     (progn (setq name V) X)))

So it's more like my option of returning nil, except it would return
the value of a surrounding `name` variable?  That could be done, but I'm
not convinced it'd be more often useful.

> disallowing the modification of name in X.

That's rather hard to do (and I don't see what would be the benefit here).

>> The "intended" behavior instead would be to behave like
>>
>>     (cond
>>      ((symbolp V) (let ((name nil)) (let ((foo 'bar)) name)))
>>      (t (let ((name V)) (let ((foo 'bar)) name))))
>>
>> That's already the behavior you get if you switch the two:
>>
>>     (macroexpand '(pcase V
>>                     ((or (and (pred foo) name) (pred symbolp))
>>                      (let ((foo 'bar)) name))))
>>     =>
>>     (let* ((pcase-0 (lambda (name) (let ((foo 'bar)) name))))
>>       (cond ((foo V) (funcall pcase-0 V))
>>             ((symbolp V) (funcall pcase-0 nil))
>>             (t nil)))
>
> I don't see where the nil comes from, or why it's a useful choice for
> a default value.

It comes from the absence of a binding for `name` and was chosen because
nil is the standard default value in Elisp.

It comes from this code in pcase.el:

                    (let ((args (mapcar (lambda (pa)
                                          (let ((v (assq (car pa) vars)))
                                            (setq vars (delq v vars))
                                            (cdr v)))
                                        prevvars)))
                      ;; If some of `vars' were not found in `prevvars', that's
                      ;; OK it just means those vars aren't present in all
                      ;; branches, so they can be used within the pattern
                      ;; (e.g. by a `guard/let/pred') but not in the branch.
                      ;; FIXME: But if some of `prevvars' are not in `vars' we
                      ;; should remove them from `prevvars'!
                      `(funcall ,res ,@args)))))))

The computation of `args` searches in `vars` for the bindings expected
by the branch (stored in `prevvars` the first time we encountered that
branch).  The assq+cdr will return nil if a var from `prevvars` isn't
found in `vars`.

>> the fact that the behavior depends on the order of elements in `or` is
>> an undesirable side effect of the implementation technique.
> It also depends on the complexity of the branch.
> It seems to me there are at least three consistent ways of behaving
> (throw an error, bind name to nil, bind name to name), with an
> inconsistent fourth way being what's currently implemented.

The current implementation amounts to "we should signal an error but we
don't bother doing so and just warn against it in the manual".
Patch welcome ;-)

>> I don't know of a simple implementation.
> Here's my better-than-nothing attempt.  I don't think that's complex;
> if anything, it's too trivial.

So you give it a search-based semantics.
The problem with it for me is that if we turn

    `(,a ,@b)

into

    (append `(,a) b)

the pcase match will take a lot more time than the equivalent

    `(,a . ,b)

Of course, you can try and handle these "easy" cases more efficiently,
but then your ,@ will sometimes be very cheap and sometimes very
expensive (depending on when an optimization can be applied), which
I think is a misfeature (it's for this same reason that I dislike CL
keyword arguments for functions).

I think it's fine to have such a search-based `append` (because it's
"reliably expensive") but I'd rather not automatically use it for ,@ 

[ BTW, you don't need (nor want) `eval` in your definition.  ]

        Stefan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* bug#43100: 28.0.50; pcase not binding variables conditionally
  2020-08-30 18:07         ` Stefan Monnier
@ 2020-08-31 19:32           ` Pip Cet
  2020-09-01  3:12             ` Stefan Monnier
  0 siblings, 1 reply; 13+ messages in thread
From: Pip Cet @ 2020-08-31 19:32 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Philipp Stephani, 43100

[-- Attachment #1: Type: text/plain, Size: 9920 bytes --]

Hello Stefan,

On Sun, Aug 30, 2020 at 6:07 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>
> >> IIUC you want
> >>
> >>     (pcase V
> >>       ((or (pred symbolp) name)
> >>        (let ((foo 'bar)) name)))
> >>
> >> to behave like
> >>
> >>     (cond
> >>      ((symbolp V) (let ((foo 'bar)) name))
> >>      (t (let ((name V)) (let ((foo 'bar)) name))))
> >>
> >> ?
> >
> > Yes, that's correct. It's also how (pcase V ((or (pred symbolp) name)
> > name) behaves...
>
> Indeed, but that's an accident.  Ideally it should either signal an
> error at macro-expansion time, or return nil when V is a symbol.

So, as I half-expected, the reaction to "pcase isn't powerful enough"
is "let's make it less powerful" :-)

Seriously, I get the impression you strongly feel pcase shouldn't be
(more) powerful, it should instead make non-explicit but fairly strong
complexity promises.

I disagree: in practice, complexity promises and optimization based on
them are often unnecessary. In fact, there's a Lisp tradition of using
assq and memq rather than building ad-hoc hash tables, even though
that often means run time is theoretically O(n^2) rather than O(n
log(n)).

> Since the current implementation doesn't go to the effort of doing
> either of those, we instead limit ourselves to recommend against using
> such patterns (IOW, "use at your own risks").

> >> I'd rather not go there
> > You'd rather have the behavior of (pcase V ((or pred symbolp) name)
> > EXPR) depend on the complexity of EXPR?
>
> More specifically, I'd rather not choose a semantics that imposes
> duplicating the branch body, since we have no control over its size and
> that can hence lead to potential code size explosion.

You're right, and it's a good thing that the duplication of the branch
body is a fixable implementation detail rather than something imposed
by the semantics.

> A code size explosion due to a particular implementation choice is
> undesirable, but a code size explosion imposed by the semantics is much
> more problematic.

Again, I don't think that's the case here.

> > I think it would be nice to have a lexical three-argument version of
> > pcase which specifies which variables are output values, treating the
> > remaining ones as input values, to make it easier to build
> > non-constant patterns.
>
> The design of `pcase` assumes you want to optimize away the tests that
> are common to the various patterns.  That can't be done with dynamic
> patterns.

Or it's a bit more difficult, at least...

> > IOW, if you want to call a function with arguments determined by
> > pcase-like patterns, why not introduce pcase-call so something like
> > the following would work:
> >
> > (defun f (hello world) (cons hello world))
> >
> > (let ((space " ") (hw "hello world"))
> >   (pcase-call 'f ((concat hello space world) hw)))
>
> How do you intend to implement this?

Proof-of-concept attached; I'm no longer sure I want to be able to say
(concat hello space world) rather than (concat hello (pred (equal
space)) world); it's inconsistent to use `equal' here rather than `eq'
for ordinary symbols (can't we at least use `eql'?), and I'm too
worried about adding an optional argument called `space' and changing
the interpretation of pcase-calls far away.

The difficult part, in fact, is deciding that we want the arglist to
be part of the exposed function API: given an "arglist" function, the
rest of the implementation seems unproblematic, though some
workarounds for lexical binding are required (if nothing else, this is
an interesting exercise in how painful lexical binding can be to work
with).

> > As for duplicating the body, that is an implementation detail. You can
> > easily avoid it by producing
> >
> > (let ((name name))
> >   (cond ((symbolp V) X)
> >     (progn (setq name V) X)))
>
> So it's more like my option of returning nil, except it would return
> the value of a surrounding `name` variable?  That could be done, but I'm
> not convinced it'd be more often useful.

I started out with a fairly explicit practical problem: parsing GCC
machine descriptions, which are (essentially) sexps but have made the
mistake of having "optional" non-final parts, and I think it would be
great to express that in a pcase pattern, both for the obvious reasons
of legibility and for some non-obvious reasons of my own.

> > disallowing the modification of name in X.
>
> That's rather hard to do (and I don't see what would be the benefit here).

I meant adding a cautionary note about it in the documentation, not
actively preventing it. If we had read-only bindings, pcase would
probably use them, but we don't.

> >> The "intended" behavior instead would be to behave like
> >>
> >>     (cond
> >>      ((symbolp V) (let ((name nil)) (let ((foo 'bar)) name)))
> >>      (t (let ((name V)) (let ((foo 'bar)) name))))
> >>
> >> That's already the behavior you get if you switch the two:
> >>
> >>     (macroexpand '(pcase V
> >>                     ((or (and (pred foo) name) (pred symbolp))
> >>                      (let ((foo 'bar)) name))))
> >>     =>
> >>     (let* ((pcase-0 (lambda (name) (let ((foo 'bar)) name))))
> >>       (cond ((foo V) (funcall pcase-0 V))
> >>             ((symbolp V) (funcall pcase-0 nil))
> >>             (t nil)))
> >
> > I don't see where the nil comes from, or why it's a useful choice for
> > a default value.
>
> It comes from the absence of a binding for `name` and was chosen because
> nil is the standard default value in Elisp.

Sorry, I meant I don't see anything in the pcase input that justifies
our using a nil value.

> It comes from this code in pcase.el:
>
>                     (let ((args (mapcar (lambda (pa)
>                                           (let ((v (assq (car pa) vars)))
>                                             (setq vars (delq v vars))
>                                             (cdr v)))
>                                         prevvars)))
>                       ;; If some of `vars' were not found in `prevvars', that's
>                       ;; OK it just means those vars aren't present in all
>                       ;; branches, so they can be used within the pattern
>                       ;; (e.g. by a `guard/let/pred') but not in the branch.
>                       ;; FIXME: But if some of `prevvars' are not in `vars' we
>                       ;; should remove them from `prevvars'!
>                       `(funcall ,res ,@args)))))))
>
> The computation of `args` searches in `vars` for the bindings expected
> by the branch (stored in `prevvars` the first time we encountered that
> branch).  The assq+cdr will return nil if a var from `prevvars` isn't
> found in `vars`.

Yes, it's the precise code I want to change.

> >> the fact that the behavior depends on the order of elements in `or` is
> >> an undesirable side effect of the implementation technique.
> > It also depends on the complexity of the branch.
> > It seems to me there are at least three consistent ways of behaving
> > (throw an error, bind name to nil, bind name to name), with an
> > inconsistent fourth way being what's currently implemented.
>
> The current implementation amounts to "we should signal an error but we
> don't bother doing so and just warn against it in the manual".
> Patch welcome ;-)

You mean a patch that would make pcase less powerful by making what I
want to do impossible rather than merely difficult?

> >> I don't know of a simple implementation.
> > Here's my better-than-nothing attempt.  I don't think that's complex;
> > if anything, it's too trivial.
>
> So you give it a search-based semantics.

I don't think the semantics are at all unclear, except for the greedy
vs shy question. The implementation could be very different, reasoning
about the length of sequences matched by pcase subpatterns, of course.

> The problem with it for me is that if we turn
>
>     `(,a ,@b)
>
> into
>
>     (append `(,a) b)

List-final ,@ is too special, IMHO, to be turned into an (append)
pattern at all.

> the pcase match will take a lot more time than the equivalent
>
>     `(,a . ,b)
>
> Of course, you can try and handle these "easy" cases more efficiently,
> but then your ,@ will sometimes be very cheap and sometimes very
> expensive (depending on when an optimization can be applied), which
> I think is a misfeature (it's for this same reason that I dislike CL
> keyword arguments for functions).

I think it's an implementation detail. Some reasoning about the
minimum and maximum length of sequences matched by pcase patterns
could help ordinary pcases, too, though:

(pcase '(a b c d)
  (`(,a ,b ,c ,d) (list a b c d)))

could call (pcase--check-length EXPVAL 4 4) rather than calling consp
four times, potentially descending into expensive predicates that are
unnecessary.
It's strange to read quotes that yo
In general, of course, multiple ,@s in the same list will be slow
because it's a difficult problem.

> I think it's fine to have such a search-based `append` (because it's
> "reliably expensive") but I'd rather not automatically use it for ,@

Again, I think that's a fundamental difference between us when it
comes to the philosophy behind pcase. If I understand you correctly,
you deliberately want to limit pcase, moving away from the intuitive
definition of it that I gave above, because there might be a situation
in which people expect better performance than our limited
implementation can give them. Is that correct?

I think that's a weak reason for a strong limitation, but of course
those are subjective questions. For example, I don't expect (pcase 9
((* x x) x)) to work, and the intuitive try-everything oracle would
work for it. In any case, if there is such a fundamental difference of
opinion, pcase simply isn't what I should be looking at.

> [ BTW, you don't need (nor want) `eval` in your definition.  ]

Thank you! Premature "optimization"...

Thanks again!

[-- Attachment #2: pcall.el --]
[-- Type: text/x-emacs-lisp, Size: 19830 bytes --]

;; -*- lexical-binding: t; -*-

(defun f (hello world)
  (cons hello world))

(defun pcall-collect-symbols-1 (expr pusher)
  (cond
   ((consp expr)
    (pcall-collect-symbols-1 (car expr) pusher)
    (pcall-collect-symbols-1 (cdr expr) pusher))
   ((symbolp expr)
    (funcall pusher expr))))

(defun pcall-collect-symbols (bindings)
  (let (list)
    (dolist (binding bindings)
      (pcall-collect-symbols-1 binding
                               (lambda (x)
                                 (unless (memq x list)
                                   (push x list)))))
    list))

(defun pcall-make-environment (syms)
  (let (env)
    (dolist (sym syms)
      (push (list 'cons
                  (list 'quote sym)
                  `(condition-case error
                       ,sym
                     (void-variable (push ',sym unbound-syms))))
            env))
    (cons 'list (nreverse env))))

(defun pcaller (func pats)
  (let* ((arglist (arglist func))
         (arg-symbols (pcall-collect-symbols arglist)))
    `(lambda (env vals)
       (dolist (sym ',arglist)
         ;; (if (assq sym env)
         ;;     (warn "shadowing variable binding for %S"
         ;;           sym))
         (setq env (assq-delete-all sym env)))
       (eval '(filtered-pcase vals
                (lambda (x)
                  (assq x env))
                (,(list '\` (mapcar (lambda (x) (list '\, x))
                                    pats))
                 (funcall ',func ,@arglist)))
             env))))

(defun arglist (func)
  (while (and (symbolp func)
	      (setq func (symbol-function func))))
  (pcase func
    ((or `(lambda ,arglist . ,body)
	 `(closure ,lexenv ,arglist . ,body))
     arglist)
    (_ (cdr (read
	     (downcase
	      (car (help-split-fundoc (documentation func t) func t))))))))

(defmacro pcall (func &rest bindings)
  (let* ((syms (pcall-collect-symbols bindings))
         (env (pcall-make-environment syms))
         (pats (mapcar #'car bindings))
         (vals (mapcar #'cadr bindings)))
  `(let ((func ',func))
     (while (and (symbolp func)
	         (setq func (symbol-function func))))
     (let ((pcaller (funcall #'pcaller func ',pats)))
       (let* ((unbound-syms (list nil))
              (env ,env)
	      (pcase--env env))
         (dolist (sym unbound-syms)
           (setq env (assq-delete-all sym env)))
         (funcall pcaller env (list ,@vals)))))))

(defun pcase--expand (exp cases filter)
  ;; (message "pid=%S (pcase--expand %S ...hash=%S)"
  ;;          (emacs-pid) exp (sxhash cases))
  (macroexp-let2 macroexp-copyable-p val exp
    (let* ((defs ())
           (seen '())
           (codegen
            (lambda (code vars)
              (let ((vars (pcase--fgrep vars code))
                    (prev (assq code seen)))
                (if (not prev)
                    (let ((res (pcase-codegen code vars)))
                      (push (list code vars res) seen)
                      res)
                  ;; Since we use a tree-based pattern matching
                  ;; technique, the leaves (the places that contain the
                  ;; code to run once a pattern is matched) can get
                  ;; copied a very large number of times, so to avoid
                  ;; code explosion, we need to keep track of how many
                  ;; times we've used each leaf and move it
                  ;; to a separate function if that number is too high.
                  ;;
                  ;; We've already used this branch.  So it is shared.
                  (let* ((code (car prev))         (cdrprev (cdr prev))
                         (prevvars (car cdrprev))  (cddrprev (cdr cdrprev))
                         (res (car cddrprev)))
                    (unless (symbolp res)
                      ;; This is the first repeat, so we have to move
                      ;; the branch to a separate function.
                      (let ((bsym
                             (make-symbol (format "pcase-%d" (length defs)))))
                        (push `(,bsym (lambda ,(mapcar #'car prevvars) ,@code))
                              defs)
                        (setcar res 'funcall)
                        (setcdr res (cons bsym (mapcar #'cdr prevvars)))
                        (setcar (cddr prev) bsym)
                        (setq res bsym)))
                    (setq vars (copy-sequence vars))
                    (let ((args (mapcar (lambda (pa)
                                          (let ((v (assq (car pa) vars)))
                                            (setq vars (delq v vars))
                                            (cdr v)))
                                        prevvars)))
                      ;; If some of `vars' were not found in `prevvars', that's
                      ;; OK it just means those vars aren't present in all
                      ;; branches, so they can be used within the pattern
                      ;; (e.g. by a `guard/let/pred') but not in the branch.
                      ;; FIXME: But if some of `prevvars' are not in `vars' we
                      ;; should remove them from `prevvars'!
                      `(funcall ,res ,@args)))))))
           (used-cases ())
           (main
            (pcase--u
             (mapcar (lambda (case)
                       `(,(pcase--match val (pcase--macroexpand (car case)))
                         ,(lambda (vars)
                            (unless (memq case used-cases)
                              ;; Keep track of the cases that are used.
                              (push case used-cases))
                            (funcall
                             (if (pcase--small-branch-p (cdr case))
                                 ;; Don't bother sharing multiple
                                 ;; occurrences of this leaf since it's small.
                                 (lambda (code vars)
                                   (pcase-codegen code
                                                  (pcase--fgrep vars code)))
                               codegen)
                             (cdr case)
                             vars))))
                     cases)
             filter)))
      (dolist (case cases)
        (unless (or (memq case used-cases)
                    (memq (car case) pcase--dontwarn-upats))
          (message "Redundant pcase pattern: %S" (car case))))
      (macroexp-let* defs main))))

(defvar pcase--env nil)

(pcase-defmacro concat (&rest patterns)
  (if patterns
      (let* ((pat (list '\` (cons (list '\, (car patterns))
				  (list '\, (cons 'concat
						  (cdr patterns))))))
	     (f `(lambda (l)
		   (catch 'pcase--call
		     (dotimes (i (1+ (length l)))
		       (let* ((lc (cons (seq-subseq l 0 i)
					(seq-subseq l i))))
			 (filtered-pcase lc
			   (lambda (x)
			     (assq x pcase--env))
			   (,pat (throw 'pcase--call lc)))))))))
	`(app ,f ,pat))
    `(pred seq-empty-p)))

(defmacro filtered-pcase (exp filter &rest cases)
  (declare (indent 1) (debug (form &rest (pcase-PAT body))))
  (pcase--expand exp cases filter))

(defmacro pcase (exp &rest cases)
  "Evaluate EXP to get EXPVAL; try passing control to one of CASES.
CASES is a list of elements of the form (PATTERN CODE...).
For the first CASE whose PATTERN \"matches\" EXPVAL,
evaluate its CODE..., and return the value of the last form.
If no CASE has a PATTERN that matches, return nil.

Each PATTERN expands, in essence, to a predicate to call
on EXPVAL.  When the return value of that call is non-nil,
PATTERN matches.  PATTERN can take one of the forms:

  _                matches anything.
  \\='VAL             matches if EXPVAL is `equal' to VAL.
  KEYWORD          shorthand for \\='KEYWORD
  INTEGER          shorthand for \\='INTEGER
  STRING           shorthand for \\='STRING
  SYMBOL           matches anything and binds it to SYMBOL.
                   If a SYMBOL is used twice in the same pattern
                   the second occurrence becomes an `eq'uality test.
  (pred FUN)       matches if FUN called on EXPVAL returns non-nil.
  (app FUN PAT)    matches if FUN called on EXPVAL matches PAT.
  (guard BOOLEXP)  matches if BOOLEXP evaluates to non-nil.
  (let PAT EXPR)   matches if EXPR matches PAT.
  (and PAT...)     matches if all the patterns match.
  (or PAT...)      matches if any of the patterns matches.

FUN in `pred' and `app' can take one of the forms:
  SYMBOL  or  (lambda ARGS BODY)
     call it with one argument
  (F ARG1 .. ARGn)
     call F with ARG1..ARGn and EXPVAL as n+1'th argument

FUN, BOOLEXP, EXPR, and subsequent PAT can refer to variables
bound earlier in the pattern by a SYMBOL pattern.

Additional patterns can be defined using `pcase-defmacro'.

See Info node `(elisp) Pattern-Matching Conditional' in the
Emacs Lisp manual for more information and examples."
  (declare (indent 1) (debug (form &rest (pcase-PAT body))))
  ;; We want to use a weak hash table as a cache, but the key will unavoidably
  ;; be based on `exp' and `cases', yet `cases' is a fresh new list each time
  ;; we're called so it'll be immediately GC'd.  So we use (car cases) as key
  ;; which does come straight from the source code and should hence not be GC'd
  ;; so easily.
  (let ((data (gethash (car cases) pcase--memoize))
        (filter nil))
    ;; data = (EXP CASES . EXPANSION)
    (if (and (equal exp (car data)) (equal cases (cadr data)))
        ;; We have the right expansion.
        (cddr data)
      ;; (when (gethash (car cases) pcase--memoize-1)
      ;;   (message "pcase-memoize failed because of weak key!!"))
      ;; (when (gethash (car cases) pcase--memoize-2)
      ;;   (message "pcase-memoize failed because of eq test on %S"
      ;;            (car cases)))
      ;; (when data
      ;;   (message "pcase-memoize: equal first branch, yet different"))
      (let ((expansion (pcase--expand exp cases filter)))
        (puthash (car cases) `(,exp ,cases ,@expansion) pcase--memoize)
        ;; (puthash (car cases) `(,exp ,cases ,@expansion) pcase--memoize-1)
        ;; (puthash (car cases) `(,exp ,cases ,@expansion) pcase--memoize-2)
        expansion))))

(defun pcase--u (branches filter)
  "Expand matcher for rules BRANCHES.
Each BRANCH has the form (MATCH CODE . VARS) where
CODE is the code generator for that branch.
VARS is the set of vars already bound by earlier matches.
MATCH is the pattern that needs to be matched, of the form:
  (match VAR . PAT)
  (and MATCH ...)
  (or MATCH ...)"
  (when (setq branches (delq nil branches))
    (let* ((carbranch (car branches))
           (match (car carbranch)) (cdarbranch (cdr carbranch))
           (code (car cdarbranch))
           (vars (cdr cdarbranch)))
      (pcase--u1 (list match) code vars (cdr branches) filter))))

(defun pcase--u1 (matches code vars rest filter)
  "Return code that runs CODE (with VARS) if MATCHES match.
Otherwise, it defers to REST which is a list of branches of the form
\(ELSE-MATCH ELSE-CODE . ELSE-VARS)."
  ;; Depending on the order in which we choose to check each of the MATCHES,
  ;; the resulting tree may be smaller or bigger.  So in general, we'd want
  ;; to be careful to chose the "optimal" order.  But predicate
  ;; patterns make this harder because they create dependencies
  ;; between matches.  So we don't bother trying to reorder anything.
  (cond
   ((null matches) (funcall code vars))
   ((eq :pcase--fail (car matches)) (pcase--u rest filter))
   ((eq :pcase--succeed (car matches))
    (pcase--u1 (cdr matches) code vars rest filter))
   ((eq 'and (caar matches))
    (pcase--u1 (append (cdar matches) (cdr matches)) code vars rest
               filter))
   ((eq 'or (caar matches))
    (let* ((alts (cdar matches))
           (var (if (eq (caar alts) 'match) (cadr (car alts))))
           (simples '()) (others '()) (mem-fun 'memq))
      (when var
        (dolist (alt alts)
          (if (and (eq (car alt) 'match) (eq var (cadr alt))
                   (let ((upat (cddr alt)))
                     (eq (car-safe upat) 'quote)))
              (let ((val (cadr (cddr alt))))
                (cond ((integerp val)
                       (when (eq mem-fun 'memq)
                         (setq mem-fun 'memql)))
                      ((not (symbolp val))
                       (setq mem-fun 'member)))
                (push val simples))
            (push alt others))))
      (cond
       ((null alts) (error "Please avoid it") (pcase--u rest filter))
       ;; Yes, we can use `memql' (or `member')!
       ((> (length simples) 1)
        (pcase--u1 (cons `(match ,var
                                 . (pred (pcase--flip ,mem-fun ',simples)))
                         (cdr matches))
                   code vars
                   (if (null others) rest
                     (cons (cons
                            (pcase--and (if (cdr others)
                                            (cons 'or (nreverse others))
                                          (car others))
                                        (cdr matches))
                            (cons code vars))
                           rest))
                   filter))
       (t
        (pcase--u1 (cons (pop alts) (cdr matches)) code vars
                   (if (null alts) (progn (error "Please avoid it") rest)
                     (cons (cons
                            (pcase--and (if (cdr alts)
                                            (cons 'or alts) (car alts))
                                        (cdr matches))
                            (cons code vars))
                           rest))
                   filter)))))
   ((eq 'match (caar matches))
    (let* ((popmatches (pop matches))
           (_op (car popmatches))      (cdrpopmatches (cdr popmatches))
           (sym (car cdrpopmatches))
           (upat (cdr cdrpopmatches)))
      (cond
       ((memq upat '(t _))
        (let ((code (pcase--u1 matches code vars rest filter)))
          (if (eq upat '_) code
            (macroexp--warn-and-return
             "Pattern t is deprecated.  Use `_' instead"
             code))))
       ((eq upat 'pcase--dontcare) :pcase--dontcare)
       ((memq (car-safe upat) '(guard pred))
        (if (eq (car upat) 'pred) (pcase--mark-used sym))
        (let* ((splitrest
                (pcase--split-rest
                 sym (lambda (pat) (pcase--split-pred vars upat pat)) rest))
               (then-rest (car splitrest))
               (else-rest (cdr splitrest)))
          (pcase--if (if (eq (car upat) 'pred)
                         (pcase--funcall (cadr upat) sym vars)
                       (pcase--eval (cadr upat) vars))
                     (pcase--u1 matches code vars then-rest filter)
                     (pcase--u else-rest filter))))
       ((and (symbolp upat) upat)
        (pcase--mark-used sym)
        (cond
         ((and filter (funcall filter upat))
          (pcase--u1 (cons `(match ,sym . (pred (lambda (x) (equal ,(cdr (funcall filter upat)) x))))
                           matches)
                     code vars rest filter))
         ((not (assq upat vars))
          (pcase--u1 matches code (cons (cons upat sym) vars) rest filter))
         (;; Non-linear pattern.  Turn it into an `eq' test.
          (pcase--u1 (cons `(match ,sym . (pred (eq ,(cdr (assq upat vars)))))
                           matches)
                     code vars rest filter))))
       ((eq (car-safe upat) 'let)
        ;; A upat of the form (let VAR EXP).
        ;; (pcase--u1 matches code
        ;;            (cons (cons (nth 1 upat) (nth 2 upat)) vars) rest)
        (macroexp-let2
            macroexp-copyable-p sym
            (pcase--eval (nth 2 upat) vars)
          (pcase--u1 (cons (pcase--match sym (nth 1 upat)) matches)
                     code vars rest filter)))
       ((eq (car-safe upat) 'app)
        ;; A upat of the form (app FUN PAT)
        (pcase--mark-used sym)
        (let* ((fun (nth 1 upat))
               (nsym (gensym "x"))
               (body
                ;; We don't change `matches' to reuse the newly computed value,
                ;; because we assume there shouldn't be such redundancy in there.
                (pcase--u1 (cons (pcase--match nsym (nth 2 upat)) matches)
                           code vars
                           (pcase--app-subst-rest rest sym fun nsym)
                           (lambda (x) (and (not (eq x nsym))) (and filter (funcall filter x))))))
          (if (not (get nsym 'pcase-used))
              body
            (macroexp-let*
             `((,nsym ,(pcase--funcall fun sym vars)))
             body))))
       ((eq (car-safe upat) 'quote)
        (pcase--mark-used sym)
        (let* ((val (cadr upat))
               (splitrest (pcase--split-rest
                           sym (lambda (pat) (pcase--split-equal val pat)) rest))
               (then-rest (car splitrest))
               (else-rest (cdr splitrest)))
          (pcase--if (cond
                      ((null val) `(null ,sym))
                      ((integerp val) `(eql ,sym ,val))
                      ((symbolp val)
                       (if (pcase--self-quoting-p val)
                           `(eq ,sym ,val)
                         `(eq ,sym ',val)))
                      (t `(equal ,sym ',val)))
                     (pcase--u1 matches code vars then-rest filter)
                     (pcase--u else-rest filter))))
       ((eq (car-safe upat) 'not)
        ;; FIXME: The implementation below is naive and results in
        ;; inefficient code.
        ;; To make it work right, we would need to turn pcase--u1's
        ;; `code' and `vars' into a single argument of the same form as
        ;; `rest'.  We would also need to split this new `then-rest' argument
        ;; for every test (currently we don't bother to do it since
        ;; it's only useful for odd patterns like (and `(PAT1 . PAT2)
        ;; `(PAT3 . PAT4)) which the programmer can easily rewrite
        ;; to the more efficient `(,(and PAT1 PAT3) . ,(and PAT2 PAT4))).
        (pcase--u1 `((match ,sym . ,(cadr upat)))
                   ;; FIXME: This codegen is not careful to share its
                   ;; code if used several times: code blow up is likely.
                   (lambda (_vars)
                     ;; `vars' will likely contain bindings which are
                     ;; not always available in other paths to
                     ;; `rest', so there' no point trying to pass
                     ;; them down.
                     (pcase--u rest filter))
                   vars
                   (list `((and . ,matches) ,code . ,vars))
                   filter))
       (t (error "Unknown pattern `%S'" upat)))))
   (t (error "Incorrect MATCH %S" (car matches)))))

(let ((space " "))
  (pcall f ((concat hello space world) "hello world")))

(defun pcase--length (pattern)
    (setq pattern (pcase--macroexpand pattern))
    (pcase pattern
      (`(pred null) (cons 0 0))
      (`'nil (cons 0 0))
      (`(pred consp) (cons 1 1.0e+INF))
      (`(app cdr ,pattern)
       (let ((length (pcase--length pattern)))
	 (if (> (car length) 0)
	     (cons (1+ (car length)) (1+ (cdr length)))
	   (cons 0 (1+ (cdr length))))))
      (`(or . ,patterns)
       (let ((inf 0)
	     (sup 1.0e+INF))
	 (dolist (pattern patterns)
	   (let* ((is (pcase--length pattern))
		  (i (car is))
		  (s (cdr is)))
	     (setq inf (min inf i))
	     (setq sup (max sup s))))
	 (cons inf sup)))
      (`(and . ,patterns)
       (let ((inf 0)
	     (sup 1.0e+INF))
	 (dolist (pattern patterns)
	   (let* ((is (pcase--length pattern))
		  (i (car is))
		  (s (cdr is)))
	     (setq inf (max inf i))
	     (setq sup (min sup s))))
	 (cons inf sup)))
      (_ (cons 0 1.0e+INF))))

(pcase-defmacro not (pat)
  `(app (lambda (expval) (not (pcase expval (,pat t))))
        (pred identity)))

^ permalink raw reply	[flat|nested] 13+ messages in thread

* bug#43100: 28.0.50; pcase not binding variables conditionally
  2020-08-31 19:32           ` Pip Cet
@ 2020-09-01  3:12             ` Stefan Monnier
  2020-09-02  8:38               ` Pip Cet
  0 siblings, 1 reply; 13+ messages in thread
From: Stefan Monnier @ 2020-09-01  3:12 UTC (permalink / raw)
  To: Pip Cet; +Cc: Philipp Stephani, 43100

>> >> IIUC you want
>> >>
>> >>     (pcase V
>> >>       ((or (pred symbolp) name)
>> >>        (let ((foo 'bar)) name)))
>> >>
>> >> to behave like
>> >>
>> >>     (cond
>> >>      ((symbolp V) (let ((foo 'bar)) name))
>> >>      (t (let ((name V)) (let ((foo 'bar)) name))))
>> >>
>> >> ?
>> >
>> > Yes, that's correct. It's also how (pcase V ((or (pred symbolp) name)
>> > name) behaves...
>>
>> Indeed, but that's an accident.  Ideally it should either signal an
>> error at macro-expansion time, or return nil when V is a symbol.
> So, as I half-expected, the reaction to "pcase isn't powerful enough"
> is "let's make it less powerful" :-)

Your words, not mine.
BTW, you can get (more or less) the behavior you want with:

    (pcase V
      ((or (and (pred symbolp) (let name name)) name)
       (let ((foo 'bar)) name)))

[ The "more or less" is because when V is a symbol, the `name` within
  the body of the branch will not refer to the presumed surrounding
  binding of `name` but to a new binding also called `name` which
  temporarily hides the outer binding but is initialized to the same
  value.  The difference only kicks in if one of those bindings is
  mutated.  ]

> Seriously, I get the impression you strongly feel pcase shouldn't be
> (more) powerful, it should instead make non-explicit but fairly strong
> complexity promises.

I just want the complexity to be predictable without having to guess
which optimization will be applicable.  It's sometimes hard to satisfy
this desire, admittedly.  But note that it doesn't require the code to
be "efficient": your `append` very much satisfies my desire since its
complexity is very much predictable.

> I disagree: in practice, complexity promises and optimization based on
> them are often unnecessary. In fact, there's a Lisp tradition of using
> assq and memq rather than building ad-hoc hash tables, even though
> that often means run time is theoretically O(n^2) rather than O(n
> log(n)).

The complexity of `assq` is arguably bad, but it is very much predictable.

>> More specifically, I'd rather not choose a semantics that imposes
>> duplicating the branch body, since we have no control over its size and
>> that can hence lead to potential code size explosion.
> You're right, and it's a good thing that the duplication of the branch
> body is a fixable implementation detail rather than something imposed
> by the semantics.

It's only fixable if you disregard the "more or less" above.
I find it to be a pretty bad wrinkle in the semantics.

>> The design of `pcase` assumes you want to optimize away the tests that
>> are common to the various patterns.  That can't be done with dynamic
>> patterns.
> Or it's a bit more difficult, at least...

I think it's more than "a bit more difficult", because deciding how to
optimize the tests will almost always take significantly more time than
just doing the tests.  So in order to do it dynamically and be useful,
you still need to have 2 stages, where you optimize at one stage and
then use the result several times later (to recoup the cost of the
optimization).

> The difficult part, in fact, is deciding that we want the arglist to
> be part of the exposed function API: given an "arglist" function, the
> rest of the implementation seems unproblematic,

We have `help-function-arglist`, so it's not that hard.
But it's not guaranteed to work.

> though some workarounds for lexical binding are required (if nothing
> else, this is an interesting exercise in how painful lexical binding
> can be to work with).

It's more of a philosophical issue.  Lexical scoping fundamentally means
that variables don't have names: (lambda (x) x) and (lambda (y) y)
should be indistinguishable (that's what α-renaming says, anyway).

Obviously, `help-function-arglist` begs to differ, but I like lexical
scoping so I'd rather we keep such exceptions to a minimum.

>> So it's more like my option of returning nil, except it would return
>> the value of a surrounding `name` variable?  That could be done, but I'm
>> not convinced it'd be more often useful.
>
> I started out with a fairly explicit practical problem: parsing GCC
> machine descriptions, which are (essentially) sexps but have made the
> mistake of having "optional" non-final parts, and I think it would be
> great to express that in a pcase pattern, both for the obvious reasons
> of legibility and for some non-obvious reasons of my own.

I'm sorry, I don't understand what those sexps (and their «optional"
non-final parts») have to do with pcase's handling of `or` patterns
where the different branches don't bind the same set of variables.

>> > disallowing the modification of name in X.
>> That's rather hard to do (and I don't see what would be the benefit here).
> I meant adding a cautionary note about it in the documentation, not
> actively preventing it.

So we're replacing the current "don't do that" with another "don't do
that", neither of which is detected by the byte-compiler?

I'd rather fix it "right" with something with a clean and
simple semantics where the things you shouldn't do get properly flagged
during compilation.

> If we had read-only bindings, pcase would probably use them,
> but we don't.

We could add them, tho (I think it would be technically fairly easy,
it's more a question of surface-language design, figuring out the right
color of the bikeshed and deciding whether it's really worth adding this
extra complexity into the language).  I love immutable variables as much
as the next guy, yet I have resisted the urge to add them to Elisp
so far.

>> The current implementation amounts to "we should signal an error but we
>> don't bother doing so and just warn against it in the manual".
>> Patch welcome ;-)
> You mean a patch that would make pcase less powerful by making what I
> want to do impossible rather than merely difficult?

The way I see it, it would make what you want to do possible because
what you suggest would merely give meaning to programs previously invalid.
In contrast, with the current behavior your proposal implies breaking
backward compatibility.

>> >> I don't know of a simple implementation.
>> > Here's my better-than-nothing attempt.  I don't think that's complex;
>> > if anything, it's too trivial.
>> So you give it a search-based semantics.
> I don't think the semantics are at all unclear,

You misunderstood: I definitely didn't mean "unclear" when wrote "search-based".
The semantics are definitely clear.

>> The problem with it for me is that if we turn
>>
>>     `(,a ,@b)
>>
>> into
>>
>>     (append `(,a) b)
>
> List-final ,@ is too special, IMHO, to be turned into an (append)
> pattern at all.

So you want some ,@ to be turned into `append` and some not?
That's exactly what I don't want, since it makes the performance
unpredictable unless you know the details of the optimization.

>> the pcase match will take a lot more time than the equivalent
>>     `(,a . ,b)
>> Of course, you can try and handle these "easy" cases more efficiently,
>> but then your ,@ will sometimes be very cheap and sometimes very
>> expensive (depending on when an optimization can be applied), which
>> I think is a misfeature (it's for this same reason that I dislike CL
>> keyword arguments for functions).
> I think it's an implementation detail.

I don't want implementation details to choose between O(N) and O(1).
This is the kind of "detail" that should be chosen by the author.

> Some reasoning about the minimum and maximum length of sequences
> matched by pcase patterns could help ordinary pcases, too, though:

Potentially, of course.  There are many options when it comes to the
actual code generated by `pcase`.

> (pcase '(a b c d)
>   (`(,a ,b ,c ,d) (list a b c d)))
>
> could call (pcase--check-length EXPVAL 4 4) rather than calling consp
> four times, potentially descending into expensive predicates that are
> unnecessary.

But that presumes a C implementation of `pcase--check-length` since
(< (length X) 4) can be a very bad idea when X is a long list.

> It's strange to read quotes that yo

...?

> Again, I think that's a fundamental difference between us when it
> comes to the philosophy behind pcase.  If I understand you correctly,
> you deliberately want to limit pcase, moving away from the intuitive
> definition of it that I gave above, because there might be a situation
> in which people expect better performance than our limited
> implementation can give them. Is that correct?

[ The intuitive definition you gave doesn't work for most of the core
  patterns in `pcase` (e.g. `pred`, `app`, `let`, `guard`), so while
  it's fine as a guiding intuition, it can't be used to really define
  the intended semantics.  ]

No, the problem is not really one of performance, but rather one of
defining which variables are in scope within a branch such that a given
branch always has the same set of bindings within its scope, no matter
how the pattern was matched, which I think gives it a cleaner and
simpler semantics.

[ It is also linked to a potential problem of code size explosion,
  admittedly.  ]

> I think that's a weak reason for a strong limitation, but of course
> those are subjective questions.

I don't see what strong limitation you're referring to.  Having `name`
get the value of a surrounding `name` instead of nil seems like a very
minor issue and definitely not a "strong limitation".

> For example, I don't expect (pcase 9 ((* x x) x)) to work,

With an appropriate (pcase-defmacro * ...) you could definitely make it
work.  Not sure how useful it would be, but I'd have no problem with it:
just like `append` it doesn't impact the rest of `pcase` (and I presume
that it would come with a "predictably costly" complexity).

        Stefan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* bug#43100: 28.0.50; pcase not binding variables conditionally
  2020-09-01  3:12             ` Stefan Monnier
@ 2020-09-02  8:38               ` Pip Cet
  2020-09-02 14:16                 ` Stefan Monnier
  0 siblings, 1 reply; 13+ messages in thread
From: Pip Cet @ 2020-09-02  8:38 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Philipp Stephani, 43100

On Tue, Sep 1, 2020 at 3:12 AM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> > So, as I half-expected, the reaction to "pcase isn't powerful enough"
> > is "let's make it less powerful" :-)
> Your words, not mine.

I'm glad you said that!

> BTW, you can get (more or less) the behavior you want with:
>
>     (pcase V
>       ((or (and (pred symbolp) (let name name)) name)
>        (let ((foo 'bar)) name)))

That's a good alternative.

> [ The "more or less" is because when V is a symbol, the `name` within
>   the body of the branch will not refer to the presumed surrounding
>   binding of `name` but to a new binding also called `name` which
>   temporarily hides the outer binding but is initialized to the same
>   value.  The difference only kicks in if one of those bindings is
>   mutated.  ]
>
> > Seriously, I get the impression you strongly feel pcase shouldn't be
> > (more) powerful, it should instead make non-explicit but fairly strong
> > complexity promises.
>
> I just want the complexity to be predictable without having to guess
> which optimization will be applicable.

So it's better to have a predictably-bad `append' rather than a
sometimes-bad one? But how does that affect ,@? We could make that
predictably bad, too!

> >> More specifically, I'd rather not choose a semantics that imposes
> >> duplicating the branch body, since we have no control over its size and
> >> that can hence lead to potential code size explosion.
> > You're right, and it's a good thing that the duplication of the branch
> > body is a fixable implementation detail rather than something imposed
> > by the semantics.
>
> It's only fixable if you disregard the "more or less" above.
> I find it to be a pretty bad wrinkle in the semantics.

So setq the outer binding if it changed? Note that people also might expect

(pcase l (`(,a . ,b) (setq b a)))

to modify l...

> >> The design of `pcase` assumes you want to optimize away the tests that
> >> are common to the various patterns.  That can't be done with dynamic
> >> patterns.
> > Or it's a bit more difficult, at least...
> I think it's more than "a bit more difficult", because deciding how to
> optimize the tests will almost always take significantly more time than
> just doing the tests.  So in order to do it dynamically and be useful,
> you still need to have 2 stages, where you optimize at one stage and
> then use the result several times later (to recoup the cost of the
> optimization).

Wait, I think there was a misunderstanding here. I don't mean that the
pcase pattern should depend structurally on let-bound variables
appearing in it. That does sound impossible to me (except with dynamic
scoping).

> > The difficult part, in fact, is deciding that we want the arglist to
> > be part of the exposed function API: given an "arglist" function, the
> > rest of the implementation seems unproblematic,
>
> We have `help-function-arglist`, so it's not that hard.
> But it's not guaranteed to work.

Oh, thanks for pointing that out. It's not very good, though, is it?

> > though some workarounds for lexical binding are required (if nothing
> > else, this is an interesting exercise in how painful lexical binding
> > can be to work with).
>
> It's more of a philosophical issue.

Deciding whether to expose arglists for functions is, yes. That's what
I said above.

> Lexical scoping fundamentally means
> that variables don't have names: (lambda (x) x) and (lambda (y) y)
> should be indistinguishable (that's what α-renaming says, anyway).

But they're not. (equal (lambda (x) x) (lambda (y) y)) returns nil. I
don't see why pcase-call should be in the "funcall" category of being
unable to distinguish the two, rather than in the "equal" category of
being able to.

> >> So it's more like my option of returning nil, except it would return
> >> the value of a surrounding `name` variable?  That could be done, but I'm
> >> not convinced it'd be more often useful.
> >
> > I started out with a fairly explicit practical problem: parsing GCC
> > machine descriptions, which are (essentially) sexps but have made the
> > mistake of having "optional" non-final parts, and I think it would be
> > great to express that in a pcase pattern, both for the obvious reasons
> > of legibility and for some non-obvious reasons of my own.
>
> I'm sorry, I don't understand what those sexps (and their «optional"
> non-final parts») have to do with pcase's handling of `or` patterns
> where the different branches don't bind the same set of variables.

Well, maybe I'm just missing an obvious way of doing it.

> >> > disallowing the modification of name in X.
> >> That's rather hard to do (and I don't see what would be the benefit here).
> > I meant adding a cautionary note about it in the documentation, not
> > actively preventing it.
>
> So we're replacing the current "don't do that" with another "don't do
> that", neither of which is detected by the byte-compiler?

Yes. Emacs Lisp isn't free of "don't do that"s, and reducing the
"don't do that" space drastically is better than pointing out the tiny
fraction of it which remains as evidence that we shouldn't do the
reduction in the first place.

> I'd rather fix it "right" with something with a clean and
> simple semantics where the things you shouldn't do get properly flagged
> during compilation.

If you want clean and simple (lexical scoping) semantics and execute
user-provided code, you should probably call a user-provided lambda
rather than evaluating an expression in the first place?

> >> The current implementation amounts to "we should signal an error but we
> >> don't bother doing so and just warn against it in the manual".
> >> Patch welcome ;-)
> > You mean a patch that would make pcase less powerful by making what I
> > want to do impossible rather than merely difficult?
>
> The way I see it, it would make what you want to do possible because
> what you suggest would merely give meaning to programs previously invalid.
> In contrast, with the current behavior your proposal implies breaking
> backward compatibility.

I think what you mean is you'd rather break backward compatibility in
two steps rather than one, by first causing errors, then redefining
the new errors not to happen? Because if so, I totally agree.

> >> >> I don't know of a simple implementation.
> >> > Here's my better-than-nothing attempt.  I don't think that's complex;
> >> > if anything, it's too trivial.
> >> So you give it a search-based semantics.
> > I don't think the semantics are at all unclear,
>
> You misunderstood: I definitely didn't mean "unclear" when wrote "search-based".
> The semantics are definitely clear.

Saying I give it search-based semantics implies there are other
semantics I could have chosen. Semantically, all I've done is decide,
probably incorrectly, that append should be biased towards shy
matching of the fist arguments (and that it only work on proper
lists).

> >> The problem with it for me is that if we turn
> >>
> >>     `(,a ,@b)
> >>
> >> into
> >>
> >>     (append `(,a) b)
> >
> > List-final ,@ is too special, IMHO, to be turned into an (append)
> > pattern at all.
>
> So you want some ,@ to be turned into `append` and some not?

As an implementation detail, yes. Just like ` does, by the way: `(,@l)
doesn't call `append' (it doesn't even check l is a list!), so why
should it behave differently when used as a pcase pattern?

IOW, I'd like ` (as a pcase macro) to behave like ` already does:
constant-time `(,@l), linear-time `(,@l 3).

> That's exactly what I don't want, since it makes the performance
> unpredictable unless you know the details of the optimization.

I'm not opposed to the idea that there should be a form that means
"use pcase to evaluate this, but throw an error if I messed up and
complexity is worse than O(f(n))". I just don't think it should be
(pcase ...), leaving the f to be determined by a human who's allowed
to know the expected complexity of pcase patterns (currently not
mentioned in the documentation), but isn't allowed to "know the
details of the optimization".

And, again, performance of ` is unpredictable unless you know the
details of the optimization. Should we get rid of ` next?

> >> the pcase match will take a lot more time than the equivalent
> >>     `(,a . ,b)
> >> Of course, you can try and handle these "easy" cases more efficiently,
> >> but then your ,@ will sometimes be very cheap and sometimes very
> >> expensive (depending on when an optimization can be applied), which
> >> I think is a misfeature (it's for this same reason that I dislike CL
> >> keyword arguments for functions).
> > I think it's an implementation detail.
>
> I don't want implementation details to choose between O(N) and O(1).
> This is the kind of "detail" that should be chosen by the author.

All powerful matching patterns I'm aware of have horrible worst-time
complexity which is reduced to acceptable complexity for most cases,
in practice. Take `equal', for example.

(let (r s)
  (dotimes (i 100)
    (setq r (cons r r))
    (setq s (cons s s))
    (benchmark 1 `(equal ',r ',s))))

What you're asking sounds to me like the equivalent of "I want a
compression algorithm to be as good as possible, but the size of the
decompressed data has to be predictable, even if I perform recursive
decompression of the output".

> > Some reasoning about the minimum and maximum length of sequences
> > matched by pcase patterns could help ordinary pcases, too, though:
>
> Potentially, of course.  There are many options when it comes to the
> actual code generated by `pcase`.

Precisely, including ones that reduce complexity. Somewhat
unpredictably. Which is why predictable performance is a bad thing to
demand of pcase patterns.

> > (pcase '(a b c d)
> >   (`(,a ,b ,c ,d) (list a b c d)))
> >
> > could call (pcase--check-length EXPVAL 4 4) rather than calling consp
> > four times, potentially descending into expensive predicates that are
> > unnecessary.
>
> But that presumes a C implementation of `pcase--check-length` since
> (< (length X) 4) can be a very bad idea when X is a long list.

Even a Lisp implementation that isn't a wrapper around (length X)
would avoid unnecessary predicates. That is the reason I wrote
pcase--check-length rather than (<= 4 (length EXPVAL) 4)...

> > It's strange to read quotes that yo
> ...?

Well, it's strange to read quotes that you only half-deleted from your
email, Pip! (Sorry).

> > Again, I think that's a fundamental difference between us when it
> > comes to the philosophy behind pcase.  If I understand you correctly,
> > you deliberately want to limit pcase, moving away from the intuitive
> > definition of it that I gave above, because there might be a situation
> > in which people expect better performance than our limited
> > implementation can give them. Is that correct?
>
> [ The intuitive definition you gave doesn't work for most of the core
>   patterns in `pcase` (e.g. `pred`, `app`, `let`, `guard`), so while
>   it's fine as a guiding intuition, it can't be used to really define
>   the intended semantics.  ]

You mean because "pred" isn't defined as a function, and "let" is
defined differently?

Because if I'm allowed to defmacro pred, it works perfectly well:

(defmacro pred (p)
  ``(pred ,p))

(defun equalish (a b)
  (... (pcase a (`(pred ,p) (funcall p b)))

with extra hygiene being required, of course, but there's no problem here.

> No, the problem is not really one of performance, but rather one of
> defining which variables are in scope within a branch such that a given
> branch always has the same set of bindings within its scope, no matter
> how the pattern was matched, which I think gives it a cleaner and
> simpler semantics.

Yes: calling lambdas gives cleaner and simpler semantics than
evaluating forms, which gives terser code and implies precisely the
unclean (but useful) semantics I want. pcase does the latter, implying
semantics which we can implement (more easily and cleanly using
dynamic bindings, for some reason), but choose not to. I'd like to
understand why!

> [ It is also linked to a potential problem of code size explosion,
>   admittedly.  ]

Let's be clear, though, that we're not talking about an explosion in
the number of conses used to express the pcase. It's only when the
code is byte compiled that an explosion potentially happens (and it's
easily fixable with dynamic binding, but not with lexical binding, as
far as I can see).

(I think it's interesting that you can't evaluate "proper" pcase at
run time; at least, I don't see how you'd implement a pcase-dyn
function that evaluates its second argument etc. before interpreting
it. It's easy to do with dynamic binding. I think that's because our
implementation of lexical binding is too limited, but I'm not sure.)

> > I think that's a weak reason for a strong limitation, but of course
> > those are subjective questions.
>
> I don't see what strong limitation you're referring to.  Having `name`
> get the value of a surrounding `name` instead of nil seems like a very
> minor issue and definitely not a "strong limitation".

The strong limitation is "you can only add new pcase patterns if they
have predictable complexity (in code size, run time, or memory usage,
presumably)". Taken at face value, that means no ,@, nothing
equal-based, no efficient joining of two pcase cases into one... and
no conditional binding of variables.

(IMHO, (pcase X (A B) (C D)) should be equivalent to (pcase X ((or
(and (let which-alternative 0) A) (and (let which-alternative 1) C))
(case which-alternative (0 B) (1 D)))), assuming "which-alternative"
is free in A, B, C, D.)

You're right, though, that it'll usually be possible to get the right
behavior using the "bind everything to nil" idea. I hadn't understood
you to be seriously proposing we implement that, but if you do I'll
prepare a patch.

> > For example, I don't expect (pcase 9 ((* x x) x)) to work,
>
> With an appropriate (pcase-defmacro * ...) you could definitely make it
> work.

By solving polynomial roots? I think it's better to use
(pcase-defmacro * ...) for a Kleene star macro...which it turns out
isn't precisely trivial to implement with lexical scoping.

Pip

^ permalink raw reply	[flat|nested] 13+ messages in thread

* bug#43100: 28.0.50; pcase not binding variables conditionally
  2020-09-02  8:38               ` Pip Cet
@ 2020-09-02 14:16                 ` Stefan Monnier
  2020-09-05 14:36                   ` Pip Cet
  0 siblings, 1 reply; 13+ messages in thread
From: Stefan Monnier @ 2020-09-02 14:16 UTC (permalink / raw)
  To: Pip Cet; +Cc: Philipp Stephani, 43100

>> I just want the complexity to be predictable without having to guess
>> which optimization will be applicable.
> So it's better to have a predictably-bad `append' rather than a
> sometimes-bad one?

In my book, yes.

> But how does that affect ,@? We could make that predictably bad, too!

I think the expectation (coming from the use of ,@ in non-patterns)
would be that it is fairly cheap, but yes, I guess it's an option.
Could be coupled with a warning when ,@ can be replaced by an efficient
`. ,`.

>> >> More specifically, I'd rather not choose a semantics that imposes
>> >> duplicating the branch body, since we have no control over its size and
>> >> that can hence lead to potential code size explosion.
>> > You're right, and it's a good thing that the duplication of the branch
>> > body is a fixable implementation detail rather than something imposed
>> > by the semantics.
>> It's only fixable if you disregard the "more or less" above.
>> I find it to be a pretty bad wrinkle in the semantics.
> So setq the outer binding if it changed?

Oh, yuck!  That'd be adding injury to insult.

> Note that people also might expect
>
>     (pcase l (`(,a . ,b) (setq b a)))
>
> to modify l...

They might indeed.  But if so, they'll be disappointed ;-)

>> >> The design of `pcase` assumes you want to optimize away the tests that
>> >> are common to the various patterns.  That can't be done with dynamic
>> >> patterns.
>> > Or it's a bit more difficult, at least...
>> I think it's more than "a bit more difficult", because deciding how to
>> optimize the tests will almost always take significantly more time than
>> just doing the tests.  So in order to do it dynamically and be useful,
>> you still need to have 2 stages, where you optimize at one stage and
>> then use the result several times later (to recoup the cost of the
>> optimization).
> Wait, I think there was a misunderstanding here. I don't mean that the
> pcase pattern should depend structurally on let-bound variables
> appearing in it. That does sound impossible to me (except with dynamic
> scoping).

To me "dynamic patterns" means patterns which are only know at run time
because the pattern itself depends on a runtime value, as in something like:

    (let ((pat (if FOO '(pred symbolp) '1)))
      ...
      (pcase V
        ...
        ((match pat) EXP)
        ...))

Not sure what you meant by it, then.

>> > The difficult part, in fact, is deciding that we want the arglist to
>> > be part of the exposed function API: given an "arglist" function, the
>> > rest of the implementation seems unproblematic,
>> We have `help-function-arglist`, so it's not that hard.
>> But it's not guaranteed to work.
> Oh, thanks for pointing that out.  It's not very good, though, is it?

It's good enough for `C-h o`.

>> Lexical scoping fundamentally means that variables don't have names:
>> (lambda (x) x) and (lambda (y) y) should be indistinguishable (that's
>> what α-renaming says, anyway).
> But they're not.

;-)

> I don't see why pcase-call should be in the "funcall" category of
> being unable to distinguish the two, rather than in the "equal"
> category of being able to.

The notion of equality between functions is pretty delicate (and this
fundamental problem was the original motivation for the development of
type classes, BTW).

>> I'm sorry, I don't understand what those sexps (and their «optional"
>> non-final parts») have to do with pcase's handling of `or` patterns
>> where the different branches don't bind the same set of variables.
> Well, maybe I'm just missing an obvious way of doing it.

Could be, but I have no idea what "it" is.

>> >> > disallowing the modification of name in X.
>> >> That's rather hard to do (and I don't see what would be the benefit here).
>> > I meant adding a cautionary note about it in the documentation, not
>> > actively preventing it.
>> So we're replacing the current "don't do that" with another "don't do
>> that", neither of which is detected by the byte-compiler?
> Yes. Emacs Lisp isn't free of "don't do that"s, and reducing the
> "don't do that" space drastically is better than pointing out the tiny
> fraction of it which remains as evidence that we shouldn't do the
> reduction in the first place.

I don't see your proposal as reducing the "don't do that" space.

>> >> The current implementation amounts to "we should signal an error but we
>> >> don't bother doing so and just warn against it in the manual".
>> >> Patch welcome ;-)
>> > You mean a patch that would make pcase less powerful by making what I
>> > want to do impossible rather than merely difficult?
>> The way I see it, it would make what you want to do possible because
>> what you suggest would merely give meaning to programs previously invalid.
>> In contrast, with the current behavior your proposal implies breaking
>> backward compatibility.
> I think what you mean is you'd rather break backward compatibility in
> two steps rather than one, by first causing errors, then redefining
> the new errors not to happen? Because if so, I totally agree.

That's right.  The first (breaking) step would be "my fault" and would
hence free you to do the second step without introducing
incompatibilities ;-)

> IOW, I'd like ` (as a pcase macro) to behave like ` already does:
> constant-time `(,@l), linear-time `(,@l 3).

I suggest you start with a `pip-backquote` pcase-macro and experiment
with it, to see how useful it is.  You'll then presumably have more
concrete examples to argue for the replacement of the current ` 

> And, again, performance of ` is unpredictable unless you know the
> details of the optimization. Should we get rid of ` next?

AFAIK performance is O(N) where N is the number of cons cells in the
output (minus those cons-cells which are preserved as-is, I guess but
that doesn't make much of a difference).  That seems quite different
from going from O(1) to O(N²).

>> But that presumes a C implementation of `pcase--check-length` since
>> (< (length X) 4) can be a very bad idea when X is a long list.
> Even a Lisp implementation that isn't a wrapper around (length X)
> would avoid unnecessary predicates.

Indeed.  I won't be the one writing the code which tries to guess when
that's more efficient and when it's less efficient.

>> > It's strange to read quotes that yo
>> ...?
> Well, it's strange to read quotes that you only half-deleted from your
> email, Pip! (Sorry).

;-)

> (I think it's interesting that you can't evaluate "proper" pcase at
> run time; at least, I don't see how you'd implement a pcase-dyn
> function that evaluates its second argument etc. before interpreting
> it. It's easy to do with dynamic binding. I think that's because our
> implementation of lexical binding is too limited, but I'm not sure.)

I don't understand what your `pcase-dyn` would be expected to do, so
I don't know how dynamic scoping might coming into the picture.

> The strong limitation is "you can only add new pcase patterns if they
> have predictable complexity (in code size, run time, or memory usage,
> presumably)".

Not at all.  You can define any pcase pattern you like with
`pcase-defmacro`, just like you can define any function or macro you
like with `defun` and `defmacro`.

> By solving polynomial roots? I think it's better to use
> (pcase-defmacro * ...) for a Kleene star macro...which it turns out
> isn't precisely trivial to implement with lexical scoping.

Why would lexical scoping make a difference?


        Stefan






^ permalink raw reply	[flat|nested] 13+ messages in thread

* bug#43100: 28.0.50; pcase not binding variables conditionally
  2020-09-02 14:16                 ` Stefan Monnier
@ 2020-09-05 14:36                   ` Pip Cet
  2020-09-05 16:52                     ` Stefan Monnier
  0 siblings, 1 reply; 13+ messages in thread
From: Pip Cet @ 2020-09-05 14:36 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Philipp Stephani, 43100

On Wed, Sep 2, 2020 at 2:16 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> >> I just want the complexity to be predictable without having to guess
> >> which optimization will be applicable.
> > So it's better to have a predictably-bad `append' rather than a
> > sometimes-bad one?
> In my book, yes.

That's a very unusual book! This isn't real-time programming or
crypto, so I fail to see the damage "faster is better" might cause
here.

> > But how does that affect ,@? We could make that predictably bad, too!
>
> I think the expectation (coming from the use of ,@ in non-patterns)
> would be that it is fairly cheap, but yes, I guess it's an option.
> Could be coupled with a warning when ,@ can be replaced by an efficient
> `. ,`.

I don't think we should warn about that; we should simply do the best
we can do, and warn the user when that's surprisingly bad (i.e. when
we have to iterate over the list building sublists).

> >> >> More specifically, I'd rather not choose a semantics that imposes
> >> >> duplicating the branch body, since we have no control over its size and
> >> >> that can hence lead to potential code size explosion.
> >> > You're right, and it's a good thing that the duplication of the branch
> >> > body is a fixable implementation detail rather than something imposed
> >> > by the semantics.
> >> It's only fixable if you disregard the "more or less" above.
> >> I find it to be a pretty bad wrinkle in the semantics.
> > So setq the outer binding if it changed?
> Oh, yuck!  That'd be adding injury to insult.

Would work, though!

> >> >> The design of `pcase` assumes you want to optimize away the tests that
> >> >> are common to the various patterns.  That can't be done with dynamic
> >> >> patterns.
> >> > Or it's a bit more difficult, at least...
> >> I think it's more than "a bit more difficult", because deciding how to
> >> optimize the tests will almost always take significantly more time than
> >> just doing the tests.  So in order to do it dynamically and be useful,
> >> you still need to have 2 stages, where you optimize at one stage and
> >> then use the result several times later (to recoup the cost of the
> >> optimization).
> > Wait, I think there was a misunderstanding here. I don't mean that the
> > pcase pattern should depend structurally on let-bound variables
> > appearing in it. That does sound impossible to me (except with dynamic
> > scoping).
>
> To me "dynamic patterns" means patterns which are only know at run time
> because the pattern itself depends on a runtime value, as in something like:
>
>     (let ((pat (if FOO '(pred symbolp) '1)))
>       ...
>       (pcase V
>         ...
>         ((match pat) EXP)
>         ...))
>
> Not sure what you meant by it, then.

I didn't mean anything by "dynamic patterns" because I didn't use the term :-)

What I meant was using SYMBOL as short-hand for (pred (equal SYMBOL))
when it's not in a white-list of symbols to be bound instead of
compared. That was the intention of my example. I think dynamic
patterns, as you use the term, should be possible (IIUC, they're not,
with lexical binding), but punitively slow (I don't think we have to
worry about that last part...), but that's an entirely different
discussion.

> >> We have `help-function-arglist`, so it's not that hard.
> >> But it's not guaranteed to work.
> > Oh, thanks for pointing that out.  It's not very good, though, is it?
>
> It's good enough for `C-h o`.

C-h o says

(+ &rest NUMBERS-OR-MARKERS) and (1+ NUMBER)

help-function-arglist says

(&rest rest) and (arg1)

The latter is much less useful, IMHO.

> >> I'm sorry, I don't understand what those sexps (and their «optional"
> >> non-final parts») have to do with pcase's handling of `or` patterns
> >> where the different branches don't bind the same set of variables.
> > Well, maybe I'm just missing an obvious way of doing it.
> Could be, but I have no idea what "it" is.

Matching something like Emacs defuns, which may have a docstring or
not, I'd like to do

`(defun ,symbol ,@(or `(,(and (pred stringp) docstring))) `())  . ,body)

It seems wasteful to have two almost-identical patterns to match the
variants, and I'd like docstring to have a useful default value.

> >> >> > disallowing the modification of name in X.
> >> >> That's rather hard to do (and I don't see what would be the benefit here).
> >> > I meant adding a cautionary note about it in the documentation, not
> >> > actively preventing it.
> >> So we're replacing the current "don't do that" with another "don't do
> >> that", neither of which is detected by the byte-compiler?
> > Yes. Emacs Lisp isn't free of "don't do that"s, and reducing the
> > "don't do that" space drastically is better than pointing out the tiny
> > fraction of it which remains as evidence that we shouldn't do the
> > reduction in the first place.
>
> I don't see your proposal as reducing the "don't do that" space.

Currently, it's "don't bind variables only in some branches of a pcase
or". With my proposal, it's "feel free to bind variables only in some
branches of a pcase or, but don't modify them". It's a strict subset
relationship.

> >> >> The current implementation amounts to "we should signal an error but we
> >> >> don't bother doing so and just warn against it in the manual".
> >> >> Patch welcome ;-)
> >> > You mean a patch that would make pcase less powerful by making what I
> >> > want to do impossible rather than merely difficult?
> >> The way I see it, it would make what you want to do possible because
> >> what you suggest would merely give meaning to programs previously invalid.
> >> In contrast, with the current behavior your proposal implies breaking
> >> backward compatibility.
> > I think what you mean is you'd rather break backward compatibility in
> > two steps rather than one, by first causing errors, then redefining
> > the new errors not to happen? Because if so, I totally agree.
>
> That's right.  The first (breaking) step would be "my fault" and would
> hence free you to do the second step without introducing
> incompatibilities ;-)

Hooray! Let's do that, then.

> > IOW, I'd like ` (as a pcase macro) to behave like ` already does:
> > constant-time `(,@l), linear-time `(,@l 3).
>
> I suggest you start with a `pip-backquote` pcase-macro and experiment
> with it, to see how useful it is.  You'll then presumably have more
> concrete examples to argue for the replacement of the current `
>
> > And, again, performance of ` is unpredictable unless you know the
> > details of the optimization. Should we get rid of ` next?

> AFAIK performance is O(N) where N is the number of cons cells in the
> output (minus those cons-cells which are preserved as-is, I guess but
> that doesn't make much of a difference).

It means (,@l) is O(1), but (,@l 3) is O(N).

> That seems quite different
> from going from O(1) to O(N²).

So going from O(1) to O(N) is okay, but O(N^2) is not? I think it
would be okay to throw a "use append!" warning, or even an error, if
we encounter a situation in which there is more than one ,@ without a
maximum length (i.e. situations in which we fail to be O(N), given
constant-time predicates).

> >> But that presumes a C implementation of `pcase--check-length` since
> >> (< (length X) 4) can be a very bad idea when X is a long list.
> > Even a Lisp implementation that isn't a wrapper around (length X)
> > would avoid unnecessary predicates.
>
> Indeed.  I won't be the one writing the code which tries to guess when
> that's more efficient and when it's less efficient.

Hmm. After running some benchmarks, it seems like a C implementation
of check-length is slightly faster, a Lisp implementation of
check-length is slightly slower, there's less code (in terms of cons
cells) generated with check-length, but the bytecode looks better
without it. All that is without any predicates and in the case of a
matching pattern, so I'd say it's a wash then. In theory, of course,
it's easy to come up with an algorithm that uses both approaches to
achieve minimum complexity, but that's difficult to implement.

So my tentative proposal would be to use `append' or ,@ if you want
length reasoning, plain ` and , if you don't. It might be simpler
always to use it, but I don't want (pcase x (a b) (cons a b)) to emit
a function call.

> > (I think it's interesting that you can't evaluate "proper" pcase at
> > run time; at least, I don't see how you'd implement a pcase-dyn
> > function that evaluates its second argument etc. before interpreting
> > it. It's easy to do with dynamic binding. I think that's because our
> > implementation of lexical binding is too limited, but I'm not sure.)
>
> I don't understand what your `pcase-dyn` would be expected to do, so
> I don't know how dynamic scoping might coming into the picture.

It's what you called dynamic patterns above.

The problem is you cannot eval a pcase form and achieve lexical
binding of the variables used in the bindings. There's no
lexical-scoping equivalent of

(eval `(pcase ',exp ,bindings))

I expect you'll object that that's usually not what you want, anyway,
which is correct but leaves the unusual cases where you know that the
pcase will have been memoized and thus be fairly cheap.

> > The strong limitation is "you can only add new pcase patterns if they
> > have predictable complexity (in code size, run time, or memory usage,
> > presumably)".
>
> Not at all.  You can define any pcase pattern you like with
> `pcase-defmacro`, just like you can define any function or macro you
> like with `defun` and `defmacro`.

Well, I wish :-)

pcase-defmacro is limited to non-recursive patterns.

Thanks for clarifying that the kingdom of predictable complexity does
not extend to pcase-defmacro.

> > By solving polynomial roots? I think it's better to use
> > (pcase-defmacro * ...) for a Kleene star macro...which it turns out
> > isn't precisely trivial to implement with lexical scoping.
>
> Why would lexical scoping make a difference?

Because pcase patterns cannot be recursive (or can they?), but you can
recurse, given dynamic scoping, by eval'ing a pcase form.

By the way, I'm also quite confused by this comment:

;; - implement (not PAT).  This might require a significant redesign.

(pcase-defmacro not (pat)
  `(app (lambda (expval) (not (pcase expval (,pat t))))
        (pred identity)))

would work, wouldn't it? Or did I totally misunderstand what a pcase
not would be expected to do?





^ permalink raw reply	[flat|nested] 13+ messages in thread

* bug#43100: 28.0.50; pcase not binding variables conditionally
  2020-09-05 14:36                   ` Pip Cet
@ 2020-09-05 16:52                     ` Stefan Monnier
  0 siblings, 0 replies; 13+ messages in thread
From: Stefan Monnier @ 2020-09-05 16:52 UTC (permalink / raw)
  To: Pip Cet; +Cc: Philipp Stephani, 43100

>> >> I just want the complexity to be predictable without having to guess
>> >> which optimization will be applicable.
>> > So it's better to have a predictably-bad `append' rather than a
>> > sometimes-bad one?
>> In my book, yes.
> That's a very unusual book!

I wouldn't know, it's the only one I have.

> This isn't real-time programming or crypto, so I fail to see the
> damage "faster is better" might cause here.

Because the problem goes in the other direction: the programmer gets
used to seeing it work fast on some uses and then assumes it's
always so, and then gets bitten on other uses.
I think a good API should avoid such surprises.

[ This said, I no doubt introduced several APIs which don't obey this rule.
  But then I plead not guilty because of temporary insanity.  ]

>> > But how does that affect ,@? We could make that predictably bad, too!
>> I think the expectation (coming from the use of ,@ in non-patterns)
>> would be that it is fairly cheap, but yes, I guess it's an option.
>> Could be coupled with a warning when ,@ can be replaced by an efficient
>> `. ,`.
> I don't think we should warn about that; we should simply do the best
> we can do, and warn the user when that's surprisingly bad (i.e. when
> we have to iterate over the list building sublists).

It's important to try and make sure that warnings can be silenced by
improving the code.  When the coders really need the slow search, how
would they silence the warning (other than `with-suppressed-warnings`,
obviously)?

In contrast if we warn when the ,@ can be replaced by `, .` then the
warning can always be silenced without having to resort to
`with-suppressed-warnings`.

>> >> >> More specifically, I'd rather not choose a semantics that imposes
>> >> >> duplicating the branch body, since we have no control over its size and
>> >> >> that can hence lead to potential code size explosion.
>> >> > You're right, and it's a good thing that the duplication of the branch
>> >> > body is a fixable implementation detail rather than something imposed
>> >> > by the semantics.
>> >> It's only fixable if you disregard the "more or less" above.
>> >> I find it to be a pretty bad wrinkle in the semantics.
>> > So setq the outer binding if it changed?
>> Oh, yuck!  That'd be adding injury to insult.
> Would work, though!

And would result in hideous semantics in corner cases.

> What I meant was using SYMBOL as short-hand for (pred (equal SYMBOL))
> when it's not in a white-list of symbols to be bound instead of
> compared.  That was the intention of my example.

Ah, sorry, I completely misunderstood ;-)

There's also a case to be made for SYMBOL to be an accepted short form
for `(pred SYMBOL)` in some cases.  I don't think using
whitelists/blacklists to decide how to treat a given SYMBOL pattern will
scale (nor be sufficiently clear for the casual reader), so I think we
should aim for a slightly less concise syntax.  Given the limits of the syntax
accepted by the Elisp reader, we may have to resort to using

    (= EXP)

as shorthand for

    (pred (equal EXP))

or

    =SYMBOL

as shorthand for

    (pred (equal SYMBOL))

AFAIK There is currently no good hook in `pcase.el` to implement the
latter without actually changing `pcase.el`.

> I think dynamic patterns, as you use the term, should be possible

They very much are possible, of course.  E.g.:

    (defun pcase-match-p (PATTERN OBJECT)
      "Return non-nil if OBJECT matches PATTERN."
      (eval `(pcase ',OBJECT (,PATTERN t)) t))

and then use (pred (pcase-match-p EXP))

If you want to extract data from OBJECT, then you'll presumably want to
define a `pcase-match` function which instead of a boolean returns
a list of values (corresponding to the value of the variables that were
bound in the PATTERN) and that will take more work than the above
3 lines, but should still be easy to do.

> but punitively slow

Yup.

>> >> We have `help-function-arglist`, so it's not that hard.
>> >> But it's not guaranteed to work.
>> > Oh, thanks for pointing that out.  It's not very good, though, is it?
>> It's good enough for `C-h o`.
>
> C-h o says
>
> (+ &rest NUMBERS-OR-MARKERS) and (1+ NUMBER)

Indeed and that info comes straight from `help-function-arglist`.

> help-function-arglist says
>
> (&rest rest) and (arg1)

I suggest you start with `C-h o help-function-arglist` ;-)

>> >> I'm sorry, I don't understand what those sexps (and their «optional"
>> >> non-final parts») have to do with pcase's handling of `or` patterns
>> >> where the different branches don't bind the same set of variables.
>> > Well, maybe I'm just missing an obvious way of doing it.
>> Could be, but I have no idea what "it" is.
>
> Matching something like Emacs defuns, which may have a docstring or
> not, I'd like to do
>
> `(defun ,symbol ,@(or `(,(and (pred stringp) docstring))) `())  . ,body)
>
> It seems wasteful to have two almost-identical patterns to match the
> variants, and I'd like docstring to have a useful default value.

Currently `pcase` should give nil as value to `docstring` if the above
`or` matches via the second alternative, so at least the "useful
default" part is already covered ;-)

But yes, currently, you have to write it like

    `(defun ,symbol ,args
      . ,(or `(,(and (pred stringp) docstring)  . ,body)
             body))

or if you want a more explicit default for `docstring`:

    `(defun ,symbol ,args
      . ,(or `(,(and (pred stringp) docstring)  . ,body)
             (and (let docstring "") body)))

In this particular case it's not too terrible, but it gets much worse if
you have to add more optional stuff to it, like `declare` and
`interactive` (the pattern's size is exponential in the number of
optional elements).

Your `append` handles it more elegantly in the source code (and less
efficiently at run-time).  In theory we could get our cake and eat it
too, but that would involve adding something similar to a parsing
algorithm and would likely require a significant amount of work.

>> > And, again, performance of ` is unpredictable unless you know the
>> > details of the optimization. Should we get rid of ` next?
>> AFAIK performance is O(N) where N is the number of cons cells in the
>> output (minus those cons-cells which are preserved as-is, I guess but
>> that doesn't make much of a difference).
> It means (,@l) is O(1), but (,@l 3) is O(N).

Indeed.  It's not as drastic as the O(1) of `. ,` vs O(N²) of `append`,
but point taken.

I think it's time you

    (pcase-defmacro pip-\` ...)

and experiment with it.

> By the way, I'm also quite confused by this comment:
>
> ;; - implement (not PAT).  This might require a significant redesign.
>
> (pcase-defmacro not (pat)
>   `(app (lambda (expval) (not (pcase expval (,pat t))))
>         (pred identity)))
>
> would work, wouldn't it?

[ AKA
    (pcase-defmacro not (pat)
      `(pred (lambda (expval) (not (pcase expval (,pat t))))))
]

Yes, it works but it doesn't give me the efficiency and semantics that
a "true" `not` pattern should give.  If you look at Racket's `match`
(from which I took a lot of inspiration in the Emacs-25 refresh of
pcase), you'll see that their negation pattern is trivial to implement
(it's basically just swapping the "things to do upon success" with the
"things to do upon failure") but gives the right semantics and
efficiency.  E.g. `(not (not PAT))` behaves just like PAT (including
variable bindings and such).

        Stefan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* bug#43100: 28.0.50; pcase not binding variables conditionally
  2020-08-29  9:41 bug#43100: 28.0.50; pcase not binding variables conditionally Pip Cet
  2020-08-29 12:01 ` Philipp Stephani
@ 2021-03-02 13:38 ` Pip Cet
  1 sibling, 0 replies; 13+ messages in thread
From: Pip Cet @ 2021-03-02 13:38 UTC (permalink / raw)
  To: 43100-done, Stefan Monnier

On Sat, Aug 29, 2020 at 9:42 AM Pip Cet <pipcet@gmail.com> wrote:
>
> I'm having trouble with pcase's behavior.
>
> (pcase "a"
>   ((or (pred symbolp) name)
>    (let ((foo 'bar)) name)))

Stefan fixed this in 165353674e5fe7109ba9cbf526de0333902b7851, so I'm
closing this bug.

Pip





^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-03-02 13:38 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-29  9:41 bug#43100: 28.0.50; pcase not binding variables conditionally Pip Cet
2020-08-29 12:01 ` Philipp Stephani
2020-08-29 14:27   ` Pip Cet
2020-08-29 16:06     ` Stefan Monnier
2020-08-30 16:21       ` Pip Cet
2020-08-30 18:07         ` Stefan Monnier
2020-08-31 19:32           ` Pip Cet
2020-09-01  3:12             ` Stefan Monnier
2020-09-02  8:38               ` Pip Cet
2020-09-02 14:16                 ` Stefan Monnier
2020-09-05 14:36                   ` Pip Cet
2020-09-05 16:52                     ` Stefan Monnier
2021-03-02 13:38 ` Pip Cet

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).