unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#59887: pcase vs. pcase-let: Underscore in backquote-style patterns
@ 2022-12-07 16:28 hokomo
  2022-12-09  2:57 ` Michael Heerdegen
  2022-12-12  2:50 ` Michael Heerdegen
  0 siblings, 2 replies; 8+ messages in thread
From: hokomo @ 2022-12-07 16:28 UTC (permalink / raw)
  To: 59887


Hello,

How exactly is the underscore symbol treated in pcase's 
backquote-style patterns? Seems like at least pcase and pcase-let 
treat it inconsistently (I haven't checked the other pcase 
operators).

pcase treats the underscore as a literal symbol to match, hence 
this fails:

(pcase '(1 2 3)
  (`(1 _ ,x)
   x))

;; => nil

Adding the missing comma in front of the underscore gives us the 
expected behavior:

(pcase '(1 2 3)
  (`(1 ,_ ,x)
   x))

;; => 3

However, pcase-let is less strict about this, producing the same 
result with or without the comma:

(pcase-let ((`(1 _ ,x) '(1 2 3)))
  x)

;; => 3

(pcase-let ((`(1 ,_ ,x) '(1 2 3)))
  x)

;; => 3

Additionally, I would think one would still be able to match a 
literal underscore symbol even with pcase-let, but the following 
still ends up matching:

(pcase-let ((`(1 ,'_ ,x) '(1 2 3)))
  x)

;; => 3

I think that matching a literal underscore symbol is rare enough 
that the ideal behavior would probably be for an underscore within 
a backquote template to be treated as a wildcard whenever it 
appears literally (e.g., `(1 _)) or unquoted (e.g., `(1 ,_)). 
However, as soon as explicitly quoted (e.g., `(1 ,'_)), it should 
be treated as a match for a literal underscore symbol. In other 
words, I would expect the following would be different from the 
above:

(pcase '(1 2 3)
  (`(1 _ ,x)
   x))

;; => 3 (instead of nil)

(pcase-let ((`(1 ,'_ ,x) '(1 2 3)))
  x)

;; => nil (instead of 3)

I'm not 100% sure if these requirements would cause any 
backwards-incompatible changes or inconsistencies with the other 
pcase operators though. I'm also assuming that `(1 _) and `(1 ,'_) 
can be distinguished, but maybe this is not true?

hokomo






* bug#59887: pcase vs. pcase-let: Underscore in backquote-style patterns
  2022-12-07 16:28 bug#59887: pcase vs. pcase-let: Underscore in backquote-style patterns hokomo
@ 2022-12-09  2:57 ` Michael Heerdegen
  2022-12-12  2:50 ` Michael Heerdegen
  1 sibling, 0 replies; 8+ messages in thread
From: Michael Heerdegen @ 2022-12-09  2:57 UTC (permalink / raw)
  To: hokomo; +Cc: 59887

hokomo <hokomo@airmail.cc> writes:

> However, pcase-let is less strict about this, producing the same
> result with or without the comma:
>
> (pcase-let ((`(1 _ ,x) '(1 2 3)))
>  x)
>
> ;; => 3
>
> (pcase-let ((`(1 ,_ ,x) '(1 2 3)))
>  x)
>
> ;; => 3

Note this part of the `pcase-let' documentation string:

| Each EXP should match (i.e. be of compatible structure) to its
| respective PATTERN; a mismatch may signal an error or may go
| undetected, binding variables to arbitrary values, such as nil.

Your first case is invalid because the pattern doesn't match the value.
Here it goes undetected and bindings get established.

This behavior is not perfect, but AFAIR it has been preferred over the
less efficient code that better checks would mean.  So it's the
programmer's task to use only matching patterns.  This is not really a
restriction because `pcase-let' is intended to create bindings, not for
testing whether a pattern matches some value.
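A minimal sketch of that difference, using the example from the
report.  `pcase-exhaustive' is swapped in here only to illustrate a
form that does test the pattern; the result shown for `pcase-let' is
the one reported in this thread and is formally undefined.

```elisp
;; `pcase-let' only destructures: the literal `_' in the pattern does
;; not match 2, but the mismatch goes undetected and x is bound anyway.
(pcase-let ((`(1 _ ,x) '(1 2 3)))
  x)
;; => 3 (as reported above; not guaranteed by the documentation)

;; `pcase-exhaustive' tests the pattern and signals an error when no
;; clause matches, so the same mismatch is caught here.
(condition-case nil
    (pcase-exhaustive '(1 2 3)
      (`(1 _ ,x) x))
  (error 'no-match))
;; => no-match
```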


> I think that matching a literal underscore symbol is rare enough that
> the ideal behavior would probably be for an underscore within a
> backquote template to be treated as a wildcard whenever it appears
> literally (e.g., `(1 _)) or unquoted (e.g., `(1 ,_)). However, as soon
> as explicitly quoted (e.g., `(1 ,'_)), it should be treated as a match
> for a literal underscore symbol.

This idea had been discussed in the past.  It had some votes but it had
been decided not to implement such a feature because it would not really
fit into the existing semantics, just for the sake of leaving out one
",".  So I'm afraid I don't think we will change this.


Michael.






* bug#59887: pcase vs. pcase-let: Underscore in backquote-style patterns
  2022-12-07 16:28 bug#59887: pcase vs. pcase-let: Underscore in backquote-style patterns hokomo
  2022-12-09  2:57 ` Michael Heerdegen
@ 2022-12-12  2:50 ` Michael Heerdegen
  2022-12-12 18:26   ` hokomo
  1 sibling, 1 reply; 8+ messages in thread
From: Michael Heerdegen @ 2022-12-12  2:50 UTC (permalink / raw)
  To: hokomo; +Cc: 59887-done

hokomo <hokomo@airmail.cc> writes:

> How exactly is the underscore symbol treated in pcase's
> backquote-style patterns?

I think the current behavior can be understood and explained from the
documentation quite well.  If you can point to something concrete
missing, please elaborate, and we can reopen this report.

For now I'm closing it: everything works as documented, and we had
decided not to complicate the semantics of `_`, so as far as I see it
nothing is to be done here.


Thanks,

Michael.






* bug#59887: pcase vs. pcase-let: Underscore in backquote-style patterns
  2022-12-12  2:50 ` Michael Heerdegen
@ 2022-12-12 18:26   ` hokomo
  2022-12-13  1:17     ` Michael Heerdegen
  0 siblings, 1 reply; 8+ messages in thread
From: hokomo @ 2022-12-12 18:26 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: 59887-done


> Note this part of the `pcase-let' documentation string:
>
> | Each EXP should match (i.e. be of compatible structure) to its
> | respective PATTERN; a mismatch may signal an error or may go
> | undetected, binding variables to arbitrary values, such as nil.
>
> Your first case is invalid because the pattern doesn't match the value.
> Here it goes undetected and bindings get established.

I see! That clarifies the behavior I was seeing.

> This behavior is not perfect, but AFAIR it has been preferred over the
> less efficient code that better checks would mean.  So it's the
> programmer's task to use only matching patterns.  This is not really a
> restriction because `pcase-let' is intended to create bindings, not for
> testing whether a pattern matches some value.

I suppose that's a fair tradeoff.

> This idea had been discussed in the past.  It had some votes but it had
> been decided not to implement such a feature because it would not really
> fit into the existing semantics, just for the sake of leaving out one
> ",".  So I'm afraid I don't think we will change this.

Right, it doesn't seem like such a huge win now that I understand 
that the behavior of pcase-let was according to its specification 
and there was no inconsistency to begin with. It would maybe make 
the code a little bit easier to read in certain cases, but I can 
see your point.

> I think the current behavior can be understood and explained from the
> documentation quite well.  If you can point to something concrete
> missing, please elaborate, and we can reopen this report.

Your quote above made everything clear, but I completely missed it 
since I was reading the Emacs Lisp manual's explanation [1] rather 
than pcase-let's docstring. Maybe it would be beneficial to 
include the above quote in the manual as well.

[1] <https://www.gnu.org/software/emacs/manual/html_node/elisp/Destructuring-with-pcase-Patterns.html>

Thanks!

hokomo






* bug#59887: pcase vs. pcase-let: Underscore in backquote-style patterns
  2022-12-12 18:26   ` hokomo
@ 2022-12-13  1:17     ` Michael Heerdegen
  2022-12-13  1:19       ` hokomo
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Heerdegen @ 2022-12-13  1:17 UTC (permalink / raw)
  To: hokomo; +Cc: 59887-done

hokomo <hokomo@airmail.cc> writes:

> Your quote above made everything clear, but I completely missed it
> since I was reading the Emacs Lisp manual's explanation [1] rather
> than pcase-let's docstring. Maybe it would be beneficial to include
> the above quote in the manual as well.

> <https://www.gnu.org/software/emacs/manual/html_node/elisp/Destructuring-with-pcase-Patterns.html>

That says:

|    The macros described in this section use ‘pcase’ patterns to perform
| destructuring binding.  The condition of the object to be of compatible
| structure means that the object must match the pattern, because only
| then the object’s subfields can be extracted.  For example:
| 
|        (pcase-let ((`(add ,x ,y) my-list))
|          (message "Contains %S and %S" x y))
| 
| does the same as the previous example, except that it directly tries to
| extract ‘x’ and ‘y’ from ‘my-list’ without first verifying if ‘my-list’
| is a list which has the right number of elements and has ‘add’ as its
| first element.  The precise behavior when the object does not actually
| match the pattern is undefined, although the body will not be silently
| skipped: either an error is signaled or the body is run with some of the
| variables potentially bound to arbitrary values like ‘nil’.

That explains the same thing quite broadly.  Maybe you did not notice
the implications when you first read it?  I dunno, I'm not that good in
writing documentation, but I can't find something to add from what we
had discussed that would not be redundant.
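To make the manual's contrast concrete, here is a small sketch.  The
helper name `my-add-operands' is hypothetical and used only for
illustration; the patterns are the ones from the manual excerpt.

```elisp
(defun my-add-operands (form)
  "Hypothetical helper: return (X Y) if FORM is (add X Y), else nil.
`pcase' verifies the structure of FORM before binding anything."
  (pcase form
    (`(add ,x ,y) (list x y))))

(my-add-operands '(add 1 2))
;; => (1 2)
(my-add-operands '(sub 1 2))
;; => nil
(my-add-operands '(add 1))
;; => nil

;; `pcase-let' skips that verification, so it should only be given
;; values already known to have the right shape.
(pcase-let ((`(add ,x ,y) '(add 1 2)))
  (list x y))
;; => (1 2)
```

Checking with `pcase' first and destructuring afterwards keeps the
behavior within what the manual actually specifies.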

Or should we maybe just warn about the possible pitfall a bit more
offensively?

Michael.






* bug#59887: pcase vs. pcase-let: Underscore in backquote-style patterns
  2022-12-13  1:17     ` Michael Heerdegen
@ 2022-12-13  1:19       ` hokomo
  2022-12-13  2:21         ` Michael Heerdegen
  0 siblings, 1 reply; 8+ messages in thread
From: hokomo @ 2022-12-13  1:19 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: 59887-done


> That says:
>
> |    The macros described in this section use ‘pcase’ patterns to perform
> | destructuring binding.  The condition of the object to be of compatible
> | structure means that the object must match the pattern, because only
> | then the object’s subfields can be extracted.  For example:
> |
> |        (pcase-let ((`(add ,x ,y) my-list))
> |          (message "Contains %S and %S" x y))
> |
> | does the same as the previous example, except that it directly tries to
> | extract ‘x’ and ‘y’ from ‘my-list’ without first verifying if ‘my-list’
> | is a list which has the right number of elements and has ‘add’ as its
> | first element.  The precise behavior when the object does not actually
> | match the pattern is undefined, although the body will not be silently
> | skipped: either an error is signaled or the body is run with some of the
> | variables potentially bound to arbitrary values like ‘nil’.
>
> That explains the same thing quite broadly.  Maybe you did not notice
> the implications when you first read it?  I dunno, I'm not that good in
> writing documentation, but I can't find something to add from what we
> had discussed that would not be redundant.

That indeed describes it nicely. Somehow I managed to miss that 
whole paragraph and instead skipped directly to the documentation 
string of pcase-let. My bad... :-)

> Or should we maybe just warn about the possible pitfall a bit more
> offensively?

Hmm, I understand the concern about being redundant, especially 
since all four of the listed functions have the same behavior, and 
documenting it for one would mean documenting it for each one.

Perhaps including a variation of the phrase "Each EXP should match 
(i.e. be of compatible structure)" in each of the four 
descriptions would hint at this behavior while not being overly 
verbose? From that point the user can search for "compatible" on 
the same page and immediately find a match in the text at the top 
that explains the constraints.

hokomo






* bug#59887: pcase vs. pcase-let: Underscore in backquote-style patterns
  2022-12-13  1:19       ` hokomo
@ 2022-12-13  2:21         ` Michael Heerdegen
  2022-12-13  2:26           ` hokomo
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Heerdegen @ 2022-12-13  2:21 UTC (permalink / raw)
  To: hokomo; +Cc: 59887-done

hokomo <hokomo@airmail.cc> writes:

> That indeed describes it nicely. Somehow I managed to miss that whole
> paragraph and instead skipped directly to the documentation string of
> pcase-let. My bad... :-)
>
> > Or should we maybe just warn about the possible pitfall a bit more
> > offensively?

> Perhaps including a variation of the phrase "Each EXP should match
> (i.e. be of compatible structure)" in each of the four descriptions
> would hint at this behavior while not being overly verbose? From that
> point the user can search for "compatible" on the same page and
> immediately find a match in the text at the top that explains the
> constraints.

My question is that when we make the text even longer, would that help
people that don't read carefully (because we don't need to address
others) at all?

My second question is if that would have helped you at all, because your
crucial misunderstanding was about the meaning of `_`.  Using patterns
in `pcase-let' that don't match generally doesn't make much sense, it's
totally unclear what would happen in this case.  That's another reason
why I don't want to over-emphasize this case.

Maybe saying that `_` is not special when used as a QPAT would make
sense, in (info "(elisp) Backquote Patterns").  I mean in this
paragraph:

| ‘SYMBOL’
| ‘KEYWORD’
| ‘NUMBER’
| ‘STRING’
|      Matches if the corresponding element of EXPVAL is ‘equal’ to the
|      specified literal object.

We could add that `_` is not special (no symbol is special as a qpat,
actually).  Would that give a useful hint?  It seems that some people
seem to expect that `_` is special everywhere in pcase.
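A short sketch of what "not special as a QPAT" means in practice,
using only `pcase' itself.  The first value, '(1 _ 3), is an
illustrative input chosen to contain a literal `_' symbol.

```elisp
;; As a QPAT, `_' is an ordinary symbol and matches only a literal `_'.
(pcase '(1 _ 3)
  (`(1 _ ,x) x))
;; => 3

(pcase '(1 2 3)
  (`(1 _ ,x) x))
;; => nil

;; Unquoted with a comma, `_' is the wildcard pattern and matches
;; anything.
(pcase '(1 2 3)
  (`(1 ,_ ,x) x))
;; => 3

;; `,'_' is a quoted-value pattern, so it also matches only the
;; literal symbol.
(pcase '(1 2 3)
  (`(1 ,'_ ,x) x))
;; => nil
```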

Michael.






* bug#59887: pcase vs. pcase-let: Underscore in backquote-style patterns
  2022-12-13  2:21         ` Michael Heerdegen
@ 2022-12-13  2:26           ` hokomo
  0 siblings, 0 replies; 8+ messages in thread
From: hokomo @ 2022-12-13  2:26 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: 59887-done


> My question is that when we make the text even longer, would that help
> people that don't read carefully (because we don't need to address
> others) at all?

I believe it would. Even though I should've been more careful with 
reading the whole page, one's first instinct (at least mine) when 
reading a reference manual is to jump directly to the operator in 
question and expect to find all of the necessary and essential 
information there, whether it is a detailed explanation or just a 
hint or short remark mentioning some concepts that were introduced 
more thoroughly earlier in the manual.

As an example, the beginning of the Handling Errors page [1] 
describes, among other things, the meaning of the `debug' symbol 
within a condition-case handler's condition list. However, the 
description of condition-case specifically also includes the short 
remark "which can include debug to allow the debugger to run 
before the handler" which is useful to point the reader to the 
description at the beginning (all it takes is searching for 
"debug" on the same page after reading the remark).

[1] <https://www.gnu.org/software/emacs/manual/html_node/elisp/Handling-Errors.html>

> My second question is if that would have helped you at all, because your
> crucial misunderstanding was about the meaning of `_`.  Using patterns
> in `pcase-let' that don't match generally doesn't make much sense, it's
> totally unclear what would happen in this case.  That's another reason
> why I don't want to over-emphasize this case.
>
> Maybe saying that `_` is not special when used as a QPAT would make
> sense, in (info "(elisp) Backquote Patterns").  I mean in this
> paragraph:
>
> | ‘SYMBOL’
> | ‘KEYWORD’
> | ‘NUMBER’
> | ‘STRING’
> |      Matches if the corresponding element of EXPVAL is ‘equal’ to the
> |      specified literal object.
>
> We could add that `_` is not special (no symbol is special as a qpat,
> actually).  Would that give a useful hint?  It seems that some people
> seem to expect that `_` is special everywhere in pcase.

That is indeed the core of the issue and I definitely think it 
would be a good idea to have an explicit statement that the 
underscore symbol is not special as a QPAT. You can sort of infer 
it from the specification, but given the unspecified behavior of 
pcase-let in the case of a non-match, making it explicit would be 
nice.

I think I would've ended up poking around pcase-let in any case 
after being puzzled about its behavior, just out of curiosity. 
Having a short remark about "structural compatibility" in the 
documentation of the specific operator would then help me quickly 
narrow down to what I need.

hokomo





