unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#27659: 26.0.50; Add string-matched-text: string-match + match-string
@ 2017-07-12  6:13 Tino Calancha
  2017-07-20  0:54 ` Drew Adams
  2017-07-20 19:51 ` Philipp Stephani
  0 siblings, 2 replies; 14+ messages in thread
From: Tino Calancha @ 2017-07-12  6:13 UTC (permalink / raw)
  To: 27659

Severity: wishlist

Just wondering if the following is of any interest:

(defun string-matched-text (regexp string num &optional start)
  ""
  (when (string-match regexp string start)
    (match-string num string)))
    
Then,

(let ((str "foo-123"))
  (when (string-match "[[:alpha:]]+-\\([0-9]+\\)" str)
    (match-string 1 str)))
=> "123"

is equivalent to:
(string-matched-text "[[:alpha:]]+-\\([0-9]+\\)" "foo-123" 1)
=> "123"
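
A minimal sketch of how the proposed helper behaves at the edges, assuming
the definition above is evaluated as-is (the docstring below is filled in
only for illustration).  It exercises the optional START argument and the
nil result when REGEXP does not match:

```elisp
;; Same definition as proposed above; only the docstring is added here.
(defun string-matched-text (regexp string num &optional start)
  "Return the text of group NUM after matching REGEXP in STRING, or nil."
  (when (string-match regexp string start)
    (match-string num string)))

;; Group 0 is the whole match.
(string-matched-text "[0-9]+" "a1 b22" 0)    ; => "1"
;; START = 2 skips the first run of digits.
(string-matched-text "[0-9]+" "a1 b22" 0 2)  ; => "22"
;; No match at all: the result is nil, not an error.
(string-matched-text "[0-9]+" "abc" 0)       ; => nil
```

Note that, like `string-match' itself, the helper modifies the global
match data as a side effect.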

--8<-----------------------------cut here---------------start------------->8---
commit 65741b74d8999beaacd5093e128030a8635aff05
Author: Tino Calancha <tino.calancha@gmail.com>
Date:   Wed Jul 12 14:29:08 2017 +0900

    string-matched-text: New function
    
    * lisp/subr.el (string-matched-text): New defun.
    * doc/lispref/searching.texi (Simple Match Data): Update manual.
    * etc/NEWS: Announce it.

diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index 67d4c22464..81bcad1740 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -1457,6 +1457,16 @@ Simple Match Data
 repetition that repeated zero times.
 @end defun
 
+@defun string-matched-text regexp string count &optional start
+This function returns, as a string, the text matched by @var{regexp}
+in @var{string}, or @code{nil} if there is no match.  The meaning
+of @var{count} is same as in @code{match-string}.  If @var{start}
+is non-@code{nil}, the search starts at that index in @var{string}.
+The behavior of this function is equivalent to
+@w{@code{(and (string-match @var{regexp} @var{string} @var{start})
+(match-string @var{count} @var{string}))}}.
+@end defun
+
 @defun match-string-no-properties count &optional in-string
 This function is like @code{match-string} except that the result
 has no text properties.
diff --git a/etc/NEWS b/etc/NEWS
index 68ebdb3c15..794edef9cd 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1105,6 +1105,8 @@ break.
 \f
 * Lisp Changes in Emacs 26.1
 
+** New function 'string-matched-text'.
+
 ** New function 'seq-set-equal-p' to check if SEQUENCE1 and SEQUENCE2
 contain the same elements, regardless of the order.
 
diff --git a/lisp/subr.el b/lisp/subr.el
index a9edff6166..1482787842 100644
--- a/lisp/subr.el
+++ b/lisp/subr.el
@@ -3635,6 +3635,15 @@ match-string
 	  (substring string (match-beginning num) (match-end num))
 	(buffer-substring (match-beginning num) (match-end num)))))
 
+(defun string-matched-text (regexp string num &optional start)
+  "Return string of text matched by REGEXP in STRING.
+NUM specifies which parenthesized expression in REGEXP.
+  Value is nil if NUMth pair didn’t match, or there were less than NUM pairs.
+Zero means the entire text matched by the whole regexp or whole string.
+If optional arg START is non-nil, then start search at that index in STRING."
+  (when (string-match regexp string start)
+    (match-string num string)))
+
 (defun match-string-no-properties (num &optional string)
   "Return string of text matched by last search, without text properties.
 NUM specifies which parenthesized expression in the last regexp.
--8<-----------------------------cut here---------------end--------------->8---

In GNU Emacs 26.0.50 (build 4, x86_64-pc-linux-gnu, GTK+ Version 3.22.11)
 of 2017-07-11
Repository revision: d014a5e15c1110af77e7a96f06ccd0f0cafb099f






* bug#27659: 26.0.50; Add string-matched-text: string-match + match-string
  2017-07-12  6:13 bug#27659: 26.0.50; Add string-matched-text: string-match + match-string Tino Calancha
@ 2017-07-20  0:54 ` Drew Adams
  2017-07-20  1:19   ` Tino Calancha
  2017-07-20 19:51 ` Philipp Stephani
  1 sibling, 1 reply; 14+ messages in thread
From: Drew Adams @ 2017-07-20  0:54 UTC (permalink / raw)
  To: Tino Calancha, 27659

> (when (string-match "[[:alpha:]]+-\\([0-9]+\\)" str)
>   (match-string 1 str))
> is equivalent to:
> (string-matched-text "[[:alpha:]]+-\\([0-9]+\\)" str 1)

My own opinion, FWIW, is that the former is much clearer
and is sufficiently succinct.  YAGNI.

[FWIW2:
I would write the former as
(and (string-match ...)  (match-string...)), not as
(when (string-match ...) (match-string...)).
I follow the Common Lisp convention of using `when' to
indicate (to a human) that the return value is not
important (the function is used for side effects only).
I don't expect a human reader to look for a `nil' return
value from `when', for instance.]
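
To make the stylistic point concrete, here is a small sketch: both forms
return the same value, and both yield nil on a failed match; the
difference is only in what they signal to a human reader.  The function
names `demo-extract-and' and `demo-extract-when' are made up for this
example:

```elisp
(defun demo-extract-and (str)
  "Return the digits after the dash in STR, or nil (written with `and')."
  (and (string-match "[[:alpha:]]+-\\([0-9]+\\)" str)
       (match-string 1 str)))

(defun demo-extract-when (str)
  "Return the digits after the dash in STR, or nil (written with `when')."
  (when (string-match "[[:alpha:]]+-\\([0-9]+\\)" str)
    (match-string 1 str)))

(demo-extract-and "foo-123")   ; => "123"
(demo-extract-when "foo-123")  ; => "123"
(demo-extract-and "foo")       ; => nil
(demo-extract-when "foo")      ; => nil
```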






* bug#27659: 26.0.50; Add string-matched-text: string-match + match-string
  2017-07-20  0:54 ` Drew Adams
@ 2017-07-20  1:19   ` Tino Calancha
  0 siblings, 0 replies; 14+ messages in thread
From: Tino Calancha @ 2017-07-20  1:19 UTC (permalink / raw)
  To: 27659-done

Drew Adams <drew.adams@oracle.com> writes:

>> (when (string-match "[[:alpha:]]+-\\([0-9]+\\)" str)
>>   (match-string 1 str))
>> is equivalent to:
>> (string-matched-text "[[:alpha:]]+-\\([0-9]+\\)" str 1)
>
> My own opinion, FWIW, is that the former is much clearer
> and is sufficiently succinct.  YAGNI.
Thank you for your feedback and tips.

I am closing the report since it seems this proposal
doesn't bring us actual value.






* bug#27659: 26.0.50; Add string-matched-text: string-match + match-string
  2017-07-12  6:13 bug#27659: 26.0.50; Add string-matched-text: string-match + match-string Tino Calancha
  2017-07-20  0:54 ` Drew Adams
@ 2017-07-20 19:51 ` Philipp Stephani
  2017-07-21 12:29   ` Tino Calancha
  2017-07-22  1:46   ` Michael Heerdegen
  1 sibling, 2 replies; 14+ messages in thread
From: Philipp Stephani @ 2017-07-20 19:51 UTC (permalink / raw)
  To: Tino Calancha, 27659



Tino Calancha <tino.calancha@gmail.com> wrote on Wed, 12 July 2017 at
08:16:

> Severity: wishlist
>
> Just wondering if the following is of any interest:
>
> (defun string-matched-text (regexp string num &optional start)
>   ""
>   (when (string-match regexp string start)
>     (match-string num string)))
>
> Then,
>
> (let ((str "foo-123"))
>   (when (string-match "[[:alpha:]]+-\\([0-9]+\\)" str)
>     (match-string 1 str)))
> => "123"
>
> is equivalent to:
> (string-matched-text "[[:alpha:]]+-\\([0-9]+\\)" "foo-123" 1)
> => "123"
>

This looks useful, but I think it would be even better to add it as a pcase
macro to be composable (see attached patch).

[-- Attachment #2: 0001-Add-rx-pattern-for-pcase.txt --]
[-- Type: text/plain, Size: 4322 bytes --]

From b95f7477887a283134a19021b8d21ee466d457c3 Mon Sep 17 00:00:00 2001
From: Philipp Stephani <phst@google.com>
Date: Thu, 20 Jul 2017 21:36:18 +0200
Subject: [PATCH] Add 'rx' pattern for pcase.

* lisp/emacs-lisp/pcase.el (rx): New pcase macro.
* test/lisp/emacs-lisp/pcase-tests.el (pcase-tests-rx): Add unit test.
---
 etc/NEWS                            |  3 +++
 lisp/emacs-lisp/pcase.el            | 47 +++++++++++++++++++++++++++++++++++++
 test/lisp/emacs-lisp/pcase-tests.el |  9 +++++++
 3 files changed, 59 insertions(+)

diff --git a/etc/NEWS b/etc/NEWS
index 954fe0d547..a16db7f4e0 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1521,6 +1521,9 @@ manual.
 ** 'tcl-auto-fill-mode' is now declared obsolete.  Its functionality
 can be replicated simply by setting 'comment-auto-fill-only-comments'.
 
+** New pcase pattern 'rx' to match against a rx-style regular
+expression.
+
 \f
 * Changes in Emacs 26.1 on Non-Free Operating Systems
 
diff --git a/lisp/emacs-lisp/pcase.el b/lisp/emacs-lisp/pcase.el
index 4a06ab25d3..2273840916 100644
--- a/lisp/emacs-lisp/pcase.el
+++ b/lisp/emacs-lisp/pcase.el
@@ -54,6 +54,7 @@
 ;;; Code:
 
 (require 'macroexp)
+(require 'rx)
 
 ;; Macro-expansion of pcase is reasonably fast, so it's not a problem
 ;; when byte-compiling a file, but when interpreting the code, if the pcase
@@ -930,6 +931,52 @@ pcase--u1
    ((or (stringp qpat) (integerp qpat) (symbolp qpat)) `',qpat)
    (t (error "Unknown QPAT: %S" qpat))))
 
+(pcase-defmacro rx (&rest regexps)
+  "Build a `pcase' pattern matching `rx' regexps.
+The REGEXPS are interpreted as by `rx'.  The pattern matches if
+the regular expression so constructed matches the object, as if
+by `string-match'.
+
+Within the case code, the match data is bound as usual, but you
+still have to pass the correct string as argument to
+`match-string'.
+
+In addition to the usual `rx' constructs, REGEXPS can contain the
+following constructs:
+
+  (let VAR FORM...)  creates a new explicitly numbered submatch
+                     that matches FORM and binds the match to
+                     VAR.
+  (backref-var VAR)  creates a backreference to the submatch
+                     introduced by a previous (let VAR ...)
+                     construct.
+
+The VARs are associated with explicitly numbered submatches
+starting from 1.  Multiple occurrences of the same VAR refer to
+the same submatch."
+  (let* ((vars ())
+         (rx-constituents
+          `((let ,(lambda (form)
+                    (rx-check form)
+                    (let ((var (cadr form)))
+                      (cl-check-type var symbol)
+                      (let ((i (or (cl-position var vars :test #'eq)
+                                   (prog1 (length vars)
+                                     (setq vars `(,@vars ,var))))))
+                        (rx-form `(submatch-n ,(1+ i) ,@(cddr form))))))
+                 1 nil)
+            (backref-var ,(lambda (form)
+                            (rx-check form)
+                            (rx-form
+                             `(backref ,(1+ (cl-position (cadr form) vars
+                                                         :test #'eq)))))
+                         1 1 ,(lambda (var) (memq var vars)))
+            ,@rx-constituents))
+         (regexp (rx-to-string `(seq ,@regexps) :no-group)))
+    `(and (pred (string-match ,regexp))
+          ,@(cl-loop for i from 1
+                     for var in vars
+                     collect `(app (match-string ,i) ,var)))))
 
 (provide 'pcase)
 ;;; pcase.el ends here
diff --git a/test/lisp/emacs-lisp/pcase-tests.el b/test/lisp/emacs-lisp/pcase-tests.el
index ef0b2f6b24..a887b460b1 100644
--- a/test/lisp/emacs-lisp/pcase-tests.el
+++ b/test/lisp/emacs-lisp/pcase-tests.el
@@ -67,6 +67,15 @@ pcase-tests-grep
 (ert-deftest pcase-tests-vectors ()
   (should (equal (pcase [1 2] (`[,x] 1) (`[,x ,y] (+ x y))) 3)))
 
+(ert-deftest pcase-tests-rx ()
+  (should (equal (pcase "a 1 2 3 1 b"
+                   ((rx (let u (+ digit)) space
+                        (let v (+ digit)) space
+                        (let v (+ digit)) space
+                        (backref-var u))
+                    (list u v)))
+                 '("1" "3"))))
+
 ;; Local Variables:
 ;; no-byte-compile: t
 ;; End:
-- 
2.14.0.rc0.284.gd933b75aa4-goog



* bug#27659: 26.0.50; Add string-matched-text: string-match + match-string
  2017-07-20 19:51 ` Philipp Stephani
@ 2017-07-21 12:29   ` Tino Calancha
  2017-07-21 13:34     ` Stefan Monnier
  2017-07-22  1:46   ` Michael Heerdegen
  1 sibling, 1 reply; 14+ messages in thread
From: Tino Calancha @ 2017-07-21 12:29 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: 27659, Michael Heerdegen, Stefan Monnier, Tino Calancha




On Thu, 20 Jul 2017, Philipp Stephani wrote:

> 
> 
> Tino Calancha <tino.calancha@gmail.com> wrote on Wed, 12 July 2017 at 08:16:
>       Severity: wishlist
>
>       Just wondering if the following is of any interest:
>
>       (defun string-matched-text (regexp string num &optional start)
>         ""
>         (when (string-match regexp string start)
>           (match-string num string)))
>
>       Then,
>
>       (let ((str "foo-123"))
>         (when (string-match "[[:alpha:]]+-\\([0-9]+\\)" str)
>           (match-string 1 str)))
>       => "123"
>
>       is equivalent to:
>       (string-matched-text "[[:alpha:]]+-\\([0-9]+\\)" "foo-123" 1)
>       => "123"
> 
> 
> This looks useful, but I think it would be even better to add it as a pcase macro to be composable (see attached patch). 
Thank you!
Although I must admit I am not fluent in `rx' syntax, I find your idea
very nice.

Tino


* bug#27659: 26.0.50; Add string-matched-text: string-match + match-string
  2017-07-21 12:29   ` Tino Calancha
@ 2017-07-21 13:34     ` Stefan Monnier
  2017-07-21 14:08       ` Tino Calancha
                         ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Stefan Monnier @ 2017-07-21 13:34 UTC (permalink / raw)
  To: Tino Calancha; +Cc: 27659, Michael Heerdegen, Philipp Stephani

>> This looks useful, but I think it would be even better to add it
>> as a pcase macro to be composable (see attached patch). 

Hey, very nice.  Please add it to rx.el.
[ But please change `backref-var` to just `backref` (you can distinguish
  the two based on the type of the argument, I think).  I guess one
  could also argue that you could similarly rename the `let` to
  `group-n`.  ]

> Although I must admit I am not fluent in `rx' syntax, I find your idea
> very nice.

If you prefer the standard/cryptic regexp syntax, I posted a similar
thingy in the past (see below).

This lets you do

    (pcase "foo-123"
      ((re-match "[[:alpha:]]+-\\(?num:[0-9]+\\)")
       num))
    => "123"

Maybe I should install it in pcase.el?


        Stefan


(pcase-defmacro re-match (re)
  "Matches a string if that string matches RE.
RE should be a regular expression (a string).
It can use the special syntax \\(?VAR: to bind a sub-match
to variable VAR.  All other subgroups are treated as shy.

Multiple uses of this macro in a single `pcase' are not optimized
together, so don't expect lex-like performance.  But in order for
such optimization to be possible in some distant future, back-references
are not supported."
  (let ((start 0)
        (last 0)
        (new-re '())
        (vars '())
        (gn 0))
    (while (string-match "\\\\(\\(?:\\?\\([-[:alnum:]]*\\):\\)?" re start)
      (setq start (match-end 0))
      (let ((beg (match-beginning 0))
            (name (match-string 1 re)))
        ;; Skip false positives, either backslash-escaped or within [...].
        (when (subregexp-context-p re start last)
          (cond
           ((null name)
            (push (concat (substring re last beg) "\\(?:") new-re))
           ((string-match "\\`[0-9]" name)
            (error "Variable can't start with a digit: %S" name))
           (t
            (let* ((var (intern name))
                   (id (cdr (assq var vars))))
              (unless id
                (setq gn (1+ gn))
                (setq id gn)
                (push (cons var gn) vars))
              (push (concat (substring re last beg) (format "\\(?%d:" id))
                    new-re))))
          (setq last start))))
    (push (substring re last) new-re)
    (setq new-re (mapconcat #'identity (nreverse new-re) ""))
    `(and (pred stringp)
          (app (lambda (s)
                 (save-match-data
                   (when (string-match ,new-re s)
                     (vector ,@(mapcar (lambda (x) `(match-string ,(cdr x) s))
                                       vars)))))
               (,'\` [,@(mapcar (lambda (x) (list '\, (car x))) vars)])))))
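
As a rough illustration of what the macro above generates: a named group
such as \\(?num:...\\) is rewritten into an explicitly numbered group
\\(?1:...\\), which plain `string-match' already understands, and the
variable is then bound to the corresponding `match-string'.  The helper
name below is hypothetical and only mimics the expansion by hand:

```elisp
(defun demo-numbered-group (s)
  "Return what the explicitly numbered group 1 matched in S, or nil.
Hand-written stand-in for the code that `re-match' would generate for
the regexp \"[[:alpha:]]+-\\\\(?num:[0-9]+\\\\)\"."
  (save-match-data
    ;; \\(?1: ... \\) is an explicitly numbered group in Emacs regexps.
    (when (string-match "[[:alpha:]]+-\\(?1:[0-9]+\\)" s)
      (match-string 1 s))))

(demo-numbered-group "foo-123")  ; => "123"
(demo-numbered-group "foo")      ; => nil
```

Wrapping the match in `save-match-data', as the macro does, keeps the
caller's match data untouched.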






* bug#27659: 26.0.50; Add string-matched-text: string-match + match-string
  2017-07-21 13:34     ` Stefan Monnier
@ 2017-07-21 14:08       ` Tino Calancha
  2017-07-21 23:28       ` John Mastro
  2017-07-23 20:41       ` Philipp Stephani
  2 siblings, 0 replies; 14+ messages in thread
From: Tino Calancha @ 2017-07-21 14:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 27659, Michael Heerdegen, Philipp Stephani, Tino Calancha



On Fri, 21 Jul 2017, Stefan Monnier wrote:
>> Although I must admit I am not fluent in `rx' syntax, I find your idea
>> very nice.
>
> If you prefer the standard/cryptic regexp syntax, I posted a similar
> thingy in the past (see below).
>
> This lets you do
>
>    (pcase "foo-123"
>      ((re-match "[[:alpha:]]+-\\(?num:[0-9]+\\)")
>       num))
>    => "123"
>
> Maybe I should install it in pcase.el?
This is nice too, and it looks familiar to me.  It's a pity that it doesn't
accept back references, though to be honest I rarely use them.

For me it's useful to have it so that I can study the code and learn how
these things are implemented, but I understand that this is not
a strong argument for installing it.






* bug#27659: 26.0.50; Add string-matched-text: string-match + match-string
  2017-07-21 13:34     ` Stefan Monnier
  2017-07-21 14:08       ` Tino Calancha
@ 2017-07-21 23:28       ` John Mastro
  2017-07-22  2:02         ` Michael Heerdegen
  2017-07-23 20:41       ` Philipp Stephani
  2 siblings, 1 reply; 14+ messages in thread
From: John Mastro @ 2017-07-21 23:28 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 27659, Michael Heerdegen, Philipp Stephani, Tino Calancha

Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> If you prefer the standard/cryptic regexp syntax, I posted a similar
> thingy in the past (see below).
>
> This lets you do
>
>     (pcase "foo-123"
>       ((re-match "[[:alpha:]]+-\\(?num:[0-9]+\\)")
>        num))
>     => "123"
>
> Maybe I should install it in pcase.el?

I hope you do - both this and the rx pattern would be nice to have.

        John






* bug#27659: 26.0.50; Add string-matched-text: string-match + match-string
  2017-07-20 19:51 ` Philipp Stephani
  2017-07-21 12:29   ` Tino Calancha
@ 2017-07-22  1:46   ` Michael Heerdegen
  2017-07-23 20:45     ` Philipp Stephani
  1 sibling, 1 reply; 14+ messages in thread
From: Michael Heerdegen @ 2017-07-22  1:46 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: 27659, Tino Calancha

Hi Philipp,

nice idea!  I have some questions:


+(pcase-defmacro rx (&rest regexps)
+  "Build a `pcase' pattern matching `rx' regexps.
+The REGEXPS are interpreted as by `rx'.

Should we tell what the semantics of multiple REGEXPS is?  I guess they
are implicitly wrapped inside rx-`and' (but FWIW, the doc of `rx' also
fails to tell that).


+Within the case code, the match data is bound as usual, but you

This makes it sound like match data is bound pcase-branch-locally.
This isn't the case, right?


+In addition to the usual `rx' constructs, REGEXPS can contain the
+following constructs:
+
+  (let VAR FORM...)  creates a new explicitly numbered submatch
+                     that matches FORM and binds the match to
+                     VAR.

This made me wonder what FORM should be.  I think it means any rx
symbolic expression, so the name FORM seems misleading.


+(ert-deftest pcase-tests-rx ()
+  (should (equal (pcase "a 1 2 3 1 b"
+                   ((rx (let u (+ digit)) space
+                        (let v (+ digit)) space
+                        (let v (+ digit)) space
+                        (backref-var u))
+                    (list u v)))
+                 '("1" "3"))))
+

I don't understand the example (or test).  Is v first bound to 2, and
after that rebound to 3?  This seems at least surprising, since let
behaves differently in pcase, e.g.

(pcase 'foo
  ((and (let x 1) (let x 2)) x))
==> nil

Hmm, in general I see the risk of confusing this `let' with `pcase' let.
It seems to be something very different.  Maybe you could just pick a
different name, `bind' maybe?


Michael.






* bug#27659: 26.0.50; Add string-matched-text: string-match + match-string
  2017-07-21 23:28       ` John Mastro
@ 2017-07-22  2:02         ` Michael Heerdegen
  0 siblings, 0 replies; 14+ messages in thread
From: Michael Heerdegen @ 2017-07-22  2:02 UTC (permalink / raw)
  To: John Mastro; +Cc: 27659, Philipp Stephani, Stefan Monnier, Tino Calancha

John Mastro <john.b.mastro@gmail.com> writes:

> I hope you do - both this and the rx pattern would be nice to have.

I think we should make it so that they have corresponding semantics.
Ideally, without variable binding, Philipp's macro should be equivalent
to calling rx on the pattern args and then calling Stefan's version.  On
top of that, both should add variables and back-references in the same
way.  It would be very confusing to add the same thing twice with only
very subtle differences (modulo regexp syntax used).


Michael.






* bug#27659: 26.0.50; Add string-matched-text: string-match + match-string
  2017-07-21 13:34     ` Stefan Monnier
  2017-07-21 14:08       ` Tino Calancha
  2017-07-21 23:28       ` John Mastro
@ 2017-07-23 20:41       ` Philipp Stephani
  2017-07-24 14:30         ` Stefan Monnier
  2 siblings, 1 reply; 14+ messages in thread
From: Philipp Stephani @ 2017-07-23 20:41 UTC (permalink / raw)
  To: Stefan Monnier, Tino Calancha; +Cc: 27659, Michael Heerdegen


Stefan Monnier <monnier@iro.umontreal.ca> wrote on Fri, 21 July 2017 at
15:34:

> >> This looks useful, but I think it would be even better to add it
> >> as a pcase macro to be composable (see attached patch).
>
> Hey, very nice.  Please add it to rx.el.
> [ But please change `backref-var` to just `backref` (you can distinguish
>   the two based on the type of the argument, I think).  I guess one
>   could also argue that you could similarly rename the `let` to
>   `group-n`.  ]
>


Pushed as ad4eff3b905dbc32e2d38bfec1e4f93eceec288d. I've renamed
backref-var to backref as you suggested, but left `let' because I think
that feature is important enough to deserve a short, common name.
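
A minimal usage sketch of the pattern with the renamed `backref',
assuming an Emacs that ships this `rx' pcase pattern in rx.el.  The
function name `demo-rx-pair' is invented for the example:

```elisp
(require 'rx)

(defun demo-rx-pair (s)
  "Match \"U V U\" (digit runs separated by whitespace) somewhere in S.
Return (U V) as strings, or nil if S does not match."
  (pcase s
    ((rx (let u (+ digit)) space
         (let v (+ digit)) space
         (backref u))          ; must repeat whatever U matched
     (list u v))))

(demo-rx-pair "a 1 2 1 b")  ; => ("1" "2")
(demo-rx-pair "a 1 2 3 b")  ; => nil, the third number differs from U
```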


>
> > Although I must admit I am not fluent in `rx' syntax, I find your idea
> > very nice.
>
> If you prefer the standard/cryptic regexp syntax, I posted a similar
> thingy in the past (see below).
>
> This lets you do
>
>     (pcase "foo-123"
>       ((re-match "[[:alpha:]]+-\\(?num:[0-9]+\\)")
>        num))
>     => "123"
>
> Maybe I should install it in pcase.el?
>
>
>
Sure! I'd suggest changing the syntax to be compatible with other
languages:

\(?<abc>[0-9]+\) or \(?'abc'[0-9]+\) (Perl and .NET)
\(?P<abc>[0-9]+\) (Python)

These languages also have syntax for named backreferences, though that's
less important.


* bug#27659: 26.0.50; Add string-matched-text: string-match + match-string
  2017-07-22  1:46   ` Michael Heerdegen
@ 2017-07-23 20:45     ` Philipp Stephani
  2017-07-23 21:39       ` Michael Heerdegen
  0 siblings, 1 reply; 14+ messages in thread
From: Philipp Stephani @ 2017-07-23 20:45 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: 27659, Tino Calancha


Michael Heerdegen <michael_heerdegen@web.de> wrote on Sat, 22 July 2017
at 03:46:

> Hi Philipp,
>
> nice idea!  I have some questions:
>
>
> +(pcase-defmacro rx (&rest regexps)
> +  "Build a `pcase' pattern matching `rx' regexps.
> +The REGEXPS are interpreted as by `rx'.
>
> Should we tell what the semantics of multiple REGEXPS is?  I guess they
> are implicitly wrapped inside rx-`and' (but FWIW, the doc of `rx' also
> fails to tell that).
>


Yes, probably something should be added to the docstring of `rx'.


>
>
> +Within the case code, the match data is bound as usual, but you
>
> This makes it sound like match data is bound pcase-branch-locally.
> This isn't the case, right?
>

Yes, I've reworded it to sound less like a variable binding.


>
>
> +In addition to the usual `rx' constructs, REGEXPS can contain the
> +following constructs:
> +
> +  (let VAR FORM...)  creates a new explicitly numbered submatch
> +                     that matches FORM and binds the match to
> +                     VAR.
>
> This made me wonder what FORM should be.  I think it means any rx
> symbolic expression, so the name FORM seems misleading.
>

I think the words `form' and `sexp' are used mostly interchangeably
nowadays, and the docstring of `rx' speaks of "forms" several times.


>
>
> +(ert-deftest pcase-tests-rx ()
> +  (should (equal (pcase "a 1 2 3 1 b"
> +                   ((rx (let u (+ digit)) space
> +                        (let v (+ digit)) space
> +                        (let v (+ digit)) space
> +                        (backref-var u))
> +                    (list u v)))
> +                 '("1" "3"))))
> +
>
> I don't understand the example (or test).  Is v first bound to 2, and
> after that rebound to 3?  This seems at least surprising, since let
> behaves differently in pcase, e.g.
>
> (pcase 'foo
>   ((and (let x 1) (let x 2)) x))
> ==> nil
>
> Hmm, in general I see the risk of confusing this `let' with `pcase' let.
> It seems to be something very different.  Maybe you could just pick a
> different name, `bind' maybe?
>
>
>
I'd rather stick with the short and common `let'. That there's a name clash
is unfortunate, but there are lots of similar cases (e.g. pcase `let' vs.
Lisp `let', rx `and' and `or').


* bug#27659: 26.0.50; Add string-matched-text: string-match + match-string
  2017-07-23 20:45     ` Philipp Stephani
@ 2017-07-23 21:39       ` Michael Heerdegen
  0 siblings, 0 replies; 14+ messages in thread
From: Michael Heerdegen @ 2017-07-23 21:39 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: 27659, Tino Calancha

Philipp Stephani <p.stephani2@gmail.com> writes:

> I think the words `form' and `sexp' are used mostly interchangeably
> nowadays, and the docstring of `rx' speaks of "forms" several times.

If you read carefully, then you'll see that the word "form" is only used
in descriptive language.  The only place where something is named "form"
is the argument of `eval', which actually must be a form.

AFAIC, we use the convention that anything called "a form" is normal
Lisp code to be evaluated, especially if an argument of a macro or
special form is named "FORM".  We describe this convention in the
manual.

If you don't follow it, it will confuse others.  It confused me; I
didn't understand what your `let' does until I had a look at the tests.
This is just not necessary.  Enough people already have problems
understanding and accepting pcase stuff, so I think it's extra important
to use unambiguous language.

And BTW, if we choose to give up that naming convention, shouldn't we
first discuss it?


Michael.






* bug#27659: 26.0.50; Add string-matched-text: string-match + match-string
  2017-07-23 20:41       ` Philipp Stephani
@ 2017-07-24 14:30         ` Stefan Monnier
  0 siblings, 0 replies; 14+ messages in thread
From: Stefan Monnier @ 2017-07-24 14:30 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: 27659, Michael Heerdegen, Tino Calancha

> Pushed as ad4eff3b905dbc32e2d38bfec1e4f93eceec288d.  I've renamed
> backref-var to backref as you suggested, but left `let' because I think
> that feature is important enough to deserve a short, common name.

Thanks.

> Sure! I'd suggest changing the syntax to be compatible with other
> languages:
>
> \(?<abc>[0-9]+\) or \(?'abc'[0-9]+\) (Perl and .NET)
> \(?P<abc>[0-9]+\) (Python)

Fine by me.  I'm surprised they didn't use the (?<var>:...) syntax,
since it seemed such an obvious choice, but maybe they wanted to allow :
in <var>.

Feel free to adapt my patch for that.  BTW, the main reason why I didn't
push my patch originally is because of its inability to extract the
match-beginning/end of the named subgroup: it felt like
a serious shortcoming.
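
For comparison, a short sketch of what the plain `string-match' idiom
gives access to and a value-only binding does not: the positions of the
subgroup as well as its text.  The helper name is hypothetical:

```elisp
(defun demo-group-span (s)
  "Return (BEG END TEXT) for the digits after the dash in S, or nil."
  (save-match-data
    (when (string-match "[[:alpha:]]+-\\([0-9]+\\)" s)
      (list (match-beginning 1)      ; index where group 1 starts
            (match-end 1)            ; index just past group 1
            (match-string 1 s)))))   ; the matched text itself

(demo-group-span "foo-123")  ; => (4 7 "123")
```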


        Stefan






Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
