unofficial mirror of emacs-devel@gnu.org 
* Removing no-back-reference restriction from syntax-propertize-rules
From: Tassilo Horn @ 2020-05-16  8:39 UTC
  To: emacs-devel

Hi all,

right now, the docstring of `syntax-propertize-rules' states that
back-references aren't supported (which is true).  I don't see why that
has to be the case.  It already shifts numbered groups as needed, so why
can't it simply shift back-references, too?

The following patch does that:

--8<---------------cut here---------------start------------->8---
modified   lisp/emacs-lisp/syntax.el
@@ -139,14 +139,16 @@ syntax-propertize-multiline
 		  (point-max))))
   (cons beg end))
 
-(defun syntax-propertize--shift-groups (re n)
-  (replace-regexp-in-string
-   "\\\\(\\?\\([0-9]+\\):"
-   (lambda (s)
-     (replace-match
-      (number-to-string (+ n (string-to-number (match-string 1 s))))
-      t t s 1))
-   re t t))
+(defun syntax-propertize--shift-groups-and-backrefs (re n)
+  (let ((incr (lambda (s)
+                (replace-match
+                 (number-to-string
+                  (+ n (string-to-number (match-string 1 s))))
+                 t t s 1))))
+    (replace-regexp-in-string
+     "[^\\]\\\\\\([0-9]+\\)" incr
+     (replace-regexp-in-string "\\\\(\\?\\([0-9]+\\):" incr re t t)
+     t t)))
 
 (defmacro syntax-propertize-precompile-rules (&rest rules)
   "Return a precompiled form of RULES to pass to `syntax-propertize-rules'.
@@ -188,9 +190,7 @@ syntax-propertize-rules
 The SYNTAX expression is responsible to save the `match-data' if needed
 for subsequent HIGHLIGHTs.
 Also SYNTAX is free to move point, in which case RULES may not be applied to
-some parts of the text or may be applied several times to other parts.
-
-Note: back-references in REGEXPs do not work."
+some parts of the text or may be applied several times to other parts."
   (declare (debug (&rest &or symbolp    ;FIXME: edebug this eval step.
                          (form &rest
                                (numberp
@@ -219,7 +219,7 @@ syntax-propertize-rules
                  ;; tell when *this* match 0 has succeeded.
                  (cl-incf offset)
                  (setq re (concat "\\(" re "\\)")))
-               (setq re (syntax-propertize--shift-groups re offset))
+               (setq re (syntax-propertize--shift-groups-and-backrefs re offset))
                (let ((code '())
                      (condition
                       (cond
--8<---------------cut here---------------end--------------->8---

I've tested it with some simple rules, e.g.,

--8<---------------cut here---------------start------------->8---
(defun test-syntax-propertize-with-backrefs ()
  (interactive)
  (setq-local syntax-propertize-function
              (syntax-propertize-rules
               ("\\(one\\)\\(two\\)\\(\\1\\)" (1 "|") (2 "_") (3 "|"))
               ("\\(three\\)\\(four\\)\\(\\1\\)" (1 "|") (2 "_") (3 "|"))))
  (setq-local syntax-propertize--done -1)
  (syntax-propertize (point-max)))
--8<---------------cut here---------------end--------------->8---

and the properties are applied correctly and the code of the generated
function looks correct, i.e., the second back-reference is rewritten to
\\4 which is the right group \\(three\\) in the combined regexp.

Am I thinking too naively?  Is there something I'm missing out?

Well, I also found a non-working case:

--8<---------------cut here---------------start------------->8---
(defun test-syntax-propertize-with-backrefs ()
  (interactive)
  (setq-local syntax-propertize-function
              (syntax-propertize-rules
               ("\\(one\\)\\(two\\)\\(\\1\\)" (1 "|") (2 "_") (3 "|"))
               ("\\(three\\)\\(four\\)\\(\\1\\)" (1 "|") (2 "_") (3 "|"))
               ("\\(?10:five\\)\\(six\\)\\(\\10\\)" (10 "|") (2 "_") (3 "|"))))
  (setq-local syntax-propertize--done -1)
  (syntax-propertize (point-max)))
--8<---------------cut here---------------end--------------->8---

Syntactically, this seems to do the right thing.  The numbered group
becomes \\(?16:five\\) with back-reference \\(\\16\\).  However, it will
never match.  With a buffer with contents

--8<---------------cut here---------------start------------->8---
onetwoone test bla bla threefourthree bla quux fivesixfive threefourthree.
--8<---------------cut here---------------end--------------->8---

firing up re-builder with the constructed regexp

  "\\(one\\)\\(two\\)\\(\\1\\)\\|\\(three\\)\\(four\\)\\(\\4\\)\\|\\(?16:five\\)\\(six\\)\\(\\16\\)"

will not highlight fivesixfive, and re-search-forward doesn't stop at
it.  So is it true that back-references to explicitly numbered groups
don't work at all?

Bye,
Tassilo



* Re: Removing no-back-reference restriction from syntax-propertize-rules
From: Stefan Monnier @ 2020-05-16 13:17 UTC
  To: emacs-devel

> right now, the docstring of `syntax-propertize-rules' states that
> back-references aren't supported (which is true).  I don't see why that
> has to be the case.  It already shifts numbered groups as needed, so why
> can't it simply shift back-references, too?

The original reason was that I want to compile those to DFAs.

But a secondary reason was that every case where backrefs were used in
`font-lock-syntactic-keywords` was very easy to rewrite to not use
backrefs.  Do you have an example where backrefs are really important in
`syntax-propertize-rules`?


        Stefan




* Re: Removing no-back-reference restriction from syntax-propertize-rules
From: Tassilo Horn @ 2020-05-16 13:56 UTC
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> right now, the docstring of `syntax-propertize-rules' states that
>> back-references aren't supported (which is true).  I don't see why
>> that has to be the case.  It already shifts numbered groups as
>> needed, so why can't it simply shift back-references, too?
>
> The original reason was that I want to compile those to DFAs.
>
> But a secondary reason was that every case where backrefs were used in
> `font-lock-syntactic-keywords` was very easy to rewrite to not use
> backrefs.  Do you have an example where backrefs are really important
> in `syntax-propertize-rules`?

In AUCTeX we use them to match macros with choosable delimiter, e.g.,
\verb|foo| which could as well be written as \verb-foo- or \verb/foo/.

  http://git.savannah.gnu.org/cgit/auctex.git/tree/font-latex.el?h=font-latex-update&id=e6076a4f8cf2f06e956ad1a60728ca3b2eb11a83#n1023

Another example is in the minted.el style for things like
\mintinline[lisp]|(setq foo 1)| and there might be more occurrences in
user-defined styles.

Bye,
Tassilo



* Re: Removing no-back-reference restriction from syntax-propertize-rules
From: Stefan Monnier @ 2020-05-17  2:41 UTC
  To: emacs-devel

> In AUCTeX we use them to match macros with choosable delimiter, e.g.,
> \verb|foo| which could as well be written as \verb-foo- or \verb/foo/.

Hmm... in tex-mode.el we do:

    ("\\\\verb\\**\\([^a-z@*]\\)"
      (1 (prog1 "\""
           (tex-font-lock-verb
            (match-beginning 0) (char-after (match-beginning 1))))))))

and indeed, `tex-font-lock-verb` then restricts it to only match on
a single line, so a regexp with a backref would be simpler (I think
I evolved that code from that of perl-mode.el where regexp operations
can take similarly delimited args, but those can span several lines so
a regexp was not appropriate).

> Another example is in the minted.el style for things like
> \mintinline[lisp]|(setq foo 1)| and there might be more occurrences in
> user-defined styles.

\lstinline (from `listings` package) is probably another example, yes.


        Stefan




* Re: Removing no-back-reference restriction from syntax-propertize-rules
From: Stefan Monnier @ 2020-05-17 23:57 UTC
  To: emacs-devel

> -(defun syntax-propertize--shift-groups (re n)
> -  (replace-regexp-in-string
> -   "\\\\(\\?\\([0-9]+\\):"
> -   (lambda (s)
> -     (replace-match
> -      (number-to-string (+ n (string-to-number (match-string 1 s))))
> -      t t s 1))
> -   re t t))
> +(defun syntax-propertize--shift-groups-and-backrefs (re n)
> +  (let ((incr (lambda (s)
> +                (replace-match
> +                 (number-to-string
> +                  (+ n (string-to-number (match-string 1 s))))
> +                 t t s 1))))
> +    (replace-regexp-in-string
> +     "[^\\]\\\\\\([0-9]+\\)" incr
> +     (replace-regexp-in-string "\\\\(\\?\\([0-9]+\\):" incr re t t)
> +     t t)))

I think it's OK, but I think the risk of false positives for `\N` is
sufficiently high (compared to that for `\(?N:`) that I think we need to
be more careful and use `subregexp-context-p`.


        Stefan




* Re: Removing no-back-reference restriction from syntax-propertize-rules
From: Tassilo Horn @ 2020-05-18 18:20 UTC
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> -(defun syntax-propertize--shift-groups (re n)
>> -  (replace-regexp-in-string
>> -   "\\\\(\\?\\([0-9]+\\):"
>> -   (lambda (s)
>> -     (replace-match
>> -      (number-to-string (+ n (string-to-number (match-string 1 s))))
>> -      t t s 1))
>> -   re t t))
>> +(defun syntax-propertize--shift-groups-and-backrefs (re n)
>> +  (let ((incr (lambda (s)
>> +                (replace-match
>> +                 (number-to-string
>> +                  (+ n (string-to-number (match-string 1 s))))
>> +                 t t s 1))))
>> +    (replace-regexp-in-string
>> +     "[^\\]\\\\\\([0-9]+\\)" incr
>> +     (replace-regexp-in-string "\\\\(\\?\\([0-9]+\\):" incr re t t)
>> +     t t)))
>
> I think it's OK, but I think the risk of false positives for `\N` is
> sufficiently high (compared to that for `\(?N:`)

Can you give an example regexp where \N preceded by a non-\ is no
back-reference (and still valid)?

BTW, do I read the docs right in that there are at most nine
back-references, i.e., \10 cannot exist?  In that case, we'd have the
restriction that at most 9 back-references may appear in all syntax
rules.  I guess in that case we should signal an error, no?

> that I think we need to be more careful and use `subregexp-context-p`.

I'm not sure how to use that.  Do you mean something like this?

--8<---------------cut here---------------start------------->8---
(defun syntax-propertize--shift-groups-and-backrefs (re n)
  (let ((new-re (replace-regexp-in-string
                 "\\\\(\\?\\([0-9]+\\):"
                 (lambda (s)
                   (replace-match
                    (number-to-string
                     (+ n (string-to-number (match-string 1 s))))
                    t t s 1))
                 re t t))
        (pos 0))
    (while (string-match "\\\\\\([0-9]+\\)" new-re pos)
      (setq pos (+ 1 (match-beginning 1)))
      (when (save-match-data
              ;; With \N, the \ must be in a subregexp context and the
              ;; N must not be in a subregexp context.
              (and (subregexp-context-p new-re (match-beginning 0))
                   (not (subregexp-context-p new-re (match-beginning 1)))))
        (setq new-re (replace-match
                      (number-to-string
                       (+ n (string-to-number (match-string 1 new-re))))
                      t t new-re 1))))
    new-re))
--8<---------------cut here---------------end--------------->8---

That doesn't shift/renumber things like [\N] or fo\{\N\} but are those
valid regexps anyway?

Bye,
Tassilo



* Re: Removing no-back-reference restriction from syntax-propertize-rules
From: Stefan Monnier @ 2020-05-18 19:30 UTC
  To: emacs-devel

>> I think it's OK, but I think the risk of false positives for `\N` is
>> sufficiently high (compared to that for `\(?N:`)
>
> Can you give an example regexp where \N preceded by a non-\ is no
> back-reference (and still valid)?

Of course: "bar\\(foo\\)[\\1-9]".
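A minimal sketch of why that example trips the first patch, assuming
the stock `subregexp-context-p' from subr.el.  The variable name
`demo-re' is only illustrative and is not part of the patch:

```elisp
;; The regexp from the example above: "\1" appears only inside a
;; character class, so it is NOT a back-reference.
(defvar demo-re "bar\\(foo\\)[\\1-9]")

;; The pattern used by the first patch still matches it, because the
;; character before the backslash is "[" and not another backslash.
(string-match "[^\\]\\\\\\([0-9]+\\)" demo-re)   ; => 10

;; `subregexp-context-p' tells the two situations apart.
;; Index 3 is the backslash of "\(", i.e. ordinary regexp context.
(subregexp-context-p demo-re 3)                  ; => t
;; Index 11 is the backslash inside "[\1-9]", i.e. a character class.
(subregexp-context-p demo-re 11)                 ; => nil
```

So a shifting function that consults `subregexp-context-p' at the
position of the backslash would leave this regexp alone.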

> BTW, do I read the docs right in that there are at most nine
> back-references, i.e., \10 cannot exist?  In that case, we'd have the
> restriction that at most 9 back-references may appear in all syntax
> rules.

Apparently, yes:

    (string-match "\\(?5:[ab]\\)-\\5" "a-a")
    0 (#o0, #x0, ?\C-@)
    ELISP> (string-match "\\(?15:[ab]\\)-\\15" "a-a")
    nil

[ I guess that's another reason to stay away from backreferences.  ]

> I guess in that case we should signal an error, no?

Indeed.

>       (when (save-match-data
>               ;; With \N, the \ must be in a subregexp context and the
>               ;; N must not be in a subregexp context.
>               (and (subregexp-context-p new-re (match-beginning 0))
>                    (not (subregexp-context-p new-re (match-beginning 1)))))

You don't need/want to test (subregexp-context-p new-re (match-beginning 1)).

> That doesn't shift/renumber things like [\N] or fo\{\N\} but are those
> valid regexps anyway?

[\N] is, \{\N\} isn't.


        Stefan




* Re: Removing no-back-reference restriction from syntax-propertize-rules
From: Tassilo Horn @ 2020-05-18 21:30 UTC
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Can you give an example regexp where \N preceded by a non-\ is no
>> back-reference (and still valid)?
>
> Of course: "bar\\(foo\\)[\\1-9]".

Oh, right.

>> BTW, do I read the docs right in that there are at most nine
>> back-references, i.e., \10 cannot exist?  In that case, we'd have the
>> restriction that at most 9 back-references may appear in all syntax
>> rules.
>
> Apparently, yes:
>
>     (string-match "\\(?5:[ab]\\)-\\5" "a-a")
>     0 (#o0, #x0, ?\C-@)
>     ELISP> (string-match "\\(?15:[ab]\\)-\\15" "a-a")
>     nil
>
> [ I guess that's another reason to stay away from backreferences.  ]

Ah, so my "back-refs to explicitly numbered groups don't work at all"
issue was actually that I've used a bigger number than 9.

>> I guess in that case we should signal an error, no?
>
> Indeed.

Ok, will do.

>>       (when (save-match-data
>>               ;; With \N, the \ must be in a subregexp context and the
>>               ;; N must not be in a subregexp context.
>>               (and (subregexp-context-p new-re (match-beginning 0))
>>                    (not (subregexp-context-p new-re (match-beginning 1)))))
>
> You don't need/want to test (subregexp-context-p new-re (match-beginning 1)).

Ok.

So all in all, this should give the following patch:

--8<---------------cut here---------------start------------->8---
scratch/syntax-propertize-rules-with-backrefs ba3eee275640d453ffee9f6d9768be1ebd73d51b
Author:     Tassilo Horn <tsdh@gnu.org>
AuthorDate: Sat May 16 10:05:12 2020 +0200
Commit:     Tassilo Horn <tsdh@gnu.org>
CommitDate: Mon May 18 23:14:49 2020 +0200

Parent:     ca7224d5db Add test for recent buffer-local-variables change
Merged:     emacs-27 feature/browse-url-browser-kind master scratch/syntax-propertize-rules-with-backrefs
Contained:  scratch/syntax-propertize-rules-with-backrefs
Follows:    emacs-27.0.91 (945)

Allow back-references in syntax-propertize-rules.

* lisp/emacs-lisp/syntax.el (syntax-propertize--shift-groups-and-backrefs):
Renamed from syntax-propertize--shift-groups, and also shift
back-references.
(syntax-propertize-rules): Adapt docstring and use renamed function.

1 file changed, 25 insertions(+), 10 deletions(-)
lisp/emacs-lisp/syntax.el | 35 +++++++++++++++++++++++++----------

modified   lisp/emacs-lisp/syntax.el
@@ -139,14 +139,28 @@ syntax-propertize-multiline
 		  (point-max))))
   (cons beg end))
 
-(defun syntax-propertize--shift-groups (re n)
-  (replace-regexp-in-string
-   "\\\\(\\?\\([0-9]+\\):"
-   (lambda (s)
-     (replace-match
-      (number-to-string (+ n (string-to-number (match-string 1 s))))
-      t t s 1))
-   re t t))
+(defun syntax-propertize--shift-groups-and-backrefs (re n)
+  (let ((new-re (replace-regexp-in-string
+                 "\\\\(\\?\\([0-9]+\\):"
+                 (lambda (s)
+                   (replace-match
+                    (number-to-string
+                     (+ n (string-to-number (match-string 1 s))))
+                    t t s 1))
+                 re t t))
+        (pos 0))
+    (while (string-match "\\\\\\([0-9]+\\)" new-re pos)
+      (setq pos (+ 1 (match-beginning 1)))
+      (when (save-match-data
+              ;; With \N, the \ must be in a subregexp context, i.e.,
+              ;; not in a character class or in a \{\} repetition.
+              (subregexp-context-p new-re (match-beginning 0)))
+        (let ((shifted (+ n (string-to-number (match-string 1 new-re)))))
+          (when (> shifted 9)
+            (error "There may be at most nine back-references"))
+          (setq new-re (replace-match (number-to-string shifted)
+                                      t t new-re 1)))))
+    new-re))
 
 (defmacro syntax-propertize-precompile-rules (&rest rules)
   "Return a precompiled form of RULES to pass to `syntax-propertize-rules'.
@@ -190,7 +204,8 @@ syntax-propertize-rules
 Also SYNTAX is free to move point, in which case RULES may not be applied to
 some parts of the text or may be applied several times to other parts.
 
-Note: back-references in REGEXPs do not work."
+Note: There may be at most nine back-references in the REGEXPs of
+all RULES in total."
   (declare (debug (&rest &or symbolp    ;FIXME: edebug this eval step.
                          (form &rest
                                (numberp
@@ -219,7 +234,7 @@ syntax-propertize-rules
                  ;; tell when *this* match 0 has succeeded.
                  (cl-incf offset)
                  (setq re (concat "\\(" re "\\)")))
-               (setq re (syntax-propertize--shift-groups re offset))
+               (setq re (syntax-propertize--shift-groups-and-backrefs re offset))
                (let ((code '())
                      (condition
                       (cond
--8<---------------cut here---------------end--------------->8---

Seems to work fine and errors as soon as a back-reference needs to be
renumbered to \10 or more.
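
A few sample calls, as a sketch.  `demo-shift-groups-and-backrefs' is a
standalone copy of the function from the patch above under a made-up
name, so that it can be evaluated without patching syntax.el; the
input regexps are illustrative only:

```elisp
(defun demo-shift-groups-and-backrefs (re n)
  "Shift explicitly numbered groups and back-references in RE by N.
Illustrative copy of the function from the patch."
  (let ((new-re (replace-regexp-in-string
                 "\\\\(\\?\\([0-9]+\\):"
                 (lambda (s)
                   (replace-match
                    (number-to-string
                     (+ n (string-to-number (match-string 1 s))))
                    t t s 1))
                 re t t))
        (pos 0))
    (while (string-match "\\\\\\([0-9]+\\)" new-re pos)
      (setq pos (+ 1 (match-beginning 1)))
      ;; Only touch \N when the backslash is in a subregexp context.
      (when (save-match-data
              (subregexp-context-p new-re (match-beginning 0)))
        (let ((shifted (+ n (string-to-number (match-string 1 new-re)))))
          (when (> shifted 9)
            (error "There may be at most nine back-references"))
          (setq new-re (replace-match (number-to-string shifted)
                                      t t new-re 1)))))
    new-re))

;; Numbered group and back-reference are shifted together.
(demo-shift-groups-and-backrefs "\\(?2:[abc]+\\)foo\\(\\2\\)" 2)
;; => "\\(?4:[abc]+\\)foo\\(\\4\\)"

;; "\1" inside a character class is left untouched.
(demo-shift-groups-and-backrefs "\\(a\\)[\\1-2]" 2)
;; => "\\(a\\)[\\1-2]"

;; Shifting \3 by 7 would give \10, so an error is signaled.
(condition-case err
    (demo-shift-groups-and-backrefs "\\(a\\)\\3" 7)
  (error (error-message-string err)))
;; => "There may be at most nine back-references"
```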

Good to go?

Bye,
Tassilo



* Re: Removing no-back-reference restriction from syntax-propertize-rules
From: Stefan Monnier @ 2020-05-19  2:58 UTC
  To: emacs-devel

Looks good.  IIRC you had some tests to go along with it.
If you could install them at the same time, that would be great.


        Stefan






* Re: Removing no-back-reference restriction from syntax-propertize-rules
From: Tassilo Horn @ 2020-05-19 13:28 UTC
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Looks good.  IIRC you had some tests to go along with it.
> If you could install them at the same time, that would be great.

Of course.  I guess it is ok to only test the shifting function?  Or
something more elaborate that puts text in a temp buffer, makes a
syntax-propertize-function with syntax-propertize-rules and checks if
the 'syntax-table properties get applied as expected?
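
The more elaborate variant could look roughly like this.  It is only a
sketch and assumes an Emacs whose `syntax-propertize-rules' already
shifts back-references, i.e. one with the patch from this thread
applied.  The rules are the ones from the first message; the names
`demo-syntax-at' and `demo-props' and the buffer text are made up for
the illustration:

```elisp
(require 'syntax)

(defun demo-syntax-at (pos)
  "Return the `syntax-table' text property at POS (illustrative helper)."
  (get-text-property pos 'syntax-table))

(defvar demo-props
  (with-temp-buffer
    (insert "onetwoone bla threefourthree")
    (setq-local syntax-propertize-function
                (syntax-propertize-rules
                 ("\\(one\\)\\(two\\)\\(\\1\\)" (1 "|") (2 "_") (3 "|"))
                 ("\\(three\\)\\(four\\)\\(\\1\\)" (1 "|") (2 "_") (3 "|"))))
    (syntax-propertize (point-max))
    ;; Look at the start of each group matched by the SECOND rule:
    ;; "three" starts at 15, "four" at 20, the repeated "three" at 24.
    ;; These only get properties if \1 was rewritten to \4.
    (mapcar #'demo-syntax-at '(15 20 24))))

;; With back-reference shifting in place, `demo-props' should be equal to
;; (list (string-to-syntax "|") (string-to-syntax "_") (string-to-syntax "|")).
```

Checking the second rule rather than the first is deliberate: the
first rule has offset 0, so it would match even without any shifting.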

Bye,
Tassilo



* Re: Removing no-back-reference restriction from syntax-propertize-rules
From: Stefan Monnier @ 2020-05-19 15:06 UTC
  To: emacs-devel

>> Looks good.  IIRC you had some tests to go along with it.
>> If you could install them at the same time, that would be great.
> Of course.  I guess it is ok to only test the shifting function?

Yes.

> Or something more elaborate that puts text in a temp buffer, makes
> a syntax-propertize-function with syntax-propertize-rules and checks
> if the 'syntax-table properties get applied as expected?

Any test is better than no test at this point, so ... take your pick.


        Stefan




* Re: Removing no-back-reference restriction from syntax-propertize-rules
From: Tassilo Horn @ 2020-05-19 18:54 UTC
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> Looks good.  IIRC you had some tests to go along with it.  If you could
>>> install them at the same time, that would be great.
>> Of course.  I guess it is ok to only test the shifting function?
>
> Yes.

Done so.

>> Or something more elaborate that puts text in a temp buffer, makes
>> a syntax-propertize-function with syntax-propertize-rules and checks
>> if the 'syntax-table properties get applied as expected?
>
> Any test is better than no test at this point, so ... take your pick.

Right.  Maybe I'll add something like the above to the test later on.
Now I gotta put the kids in the shower. :-)

Bye,
Tassilo



* Re: Removing no-back-reference restriction from syntax-propertize-rules
From: Stefan Monnier @ 2020-05-19 18:55 UTC
  To: emacs-devel

>>>> Looks good.  IIRC you had some tests to go along with it.  If you could
>>>> install them at the same time, that would be great.
>>> Of course.  I guess it is ok to only test the shifting function?
>> Yes.
> Done so.

Thanks,


        Stefan



