all messages for Emacs-related lists mirrored at yhetil.org
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: emacs-devel@gnu.org
Subject: Re: Removing no-back-reference restriction from syntax-propertize-rules
Date: Mon, 18 May 2020 22:58:14 -0400
Message-ID: <jwv5zcs3a49.fsf-monnier+emacs@gnu.org>
In-Reply-To: <87r1vh2ao7.fsf@gnu.org> (Tassilo Horn's message of "Mon, 18 May 2020 23:30:32 +0200")

Looks good.  IIRC you had some tests to go along with it.
If you could install them at the same time, that would be great.


        Stefan


Tassilo Horn [2020-05-18 23:30:32] wrote:

> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>>> Can you give an example regexp where \N preceded by a non-\ is not a
>>> back-reference (and still valid)?
>>
>> Of course: "bar\\(foo\\)[\\1-9]".
>
> Oh, right.
>
>>> BTW, do I read the docs right in that there are at most nine
>>> back-references, i.e., \10 cannot exist?  In that case, we'd have the
>>> restriction that at most 9 back-references may appear in all syntax
>>> rules.
>>
>> Apparently, yes:
>>
>>     ELISP> (string-match "\\(?5:[ab]\\)-\\5" "a-a")
>>     0 (#o0, #x0, ?\C-@)
>>     ELISP> (string-match "\\(?15:[ab]\\)-\\15" "a-a")
>>     nil
>>
>> [ I guess that's another reason to stay away from backreferences.  ]
>
> Ah, so my "back-refs to explicitly numbered groups don't work at all"
> issue was actually that I had used a number bigger than 9.
>
>>> I guess in that case we should signal an error, no?
>>
>> Indeed.
>
> Ok, will do.
>
>>>       (when (save-match-data
>>>               ;; With \N, the \ must be in a subregexp context and the
>>>               ;; N must not be in a subregexp context.
>>>               (and (subregexp-context-p new-re (match-beginning 0))
>>>                    (not (subregexp-context-p new-re (match-beginning 1)))))
>>
>> You don't need/want to test (subregexp-context-p new-re (match-beginning 1)).
>
> Ok.
>
> So all in all, this should give the following patch:
>
> --8<---------------cut here---------------start------------->8---
> scratch/syntax-propertize-rules-with-backrefs ba3eee275640d453ffee9f6d9768be1ebd73d51b
> Author:     Tassilo Horn <tsdh@gnu.org>
> AuthorDate: Sat May 16 10:05:12 2020 +0200
> Commit:     Tassilo Horn <tsdh@gnu.org>
> CommitDate: Mon May 18 23:14:49 2020 +0200
>
> Parent:     ca7224d5db Add test for recent buffer-local-variables change
> Merged:     emacs-27 feature/browse-url-browser-kind master scratch/syntax-propertize-rules-with-backrefs
> Contained:  scratch/syntax-propertize-rules-with-backrefs
> Follows:    emacs-27.0.91 (945)
>
> Allow back-references in syntax-propertize-rules.
>
> * lisp/emacs-lisp/syntax.el (syntax-propertize--shift-groups-and-backrefs):
> Renamed from syntax-propertize--shift-groups, and also shift
> back-references.
> (syntax-propertize-rules): Adapt docstring and use renamed function.
>
> 1 file changed, 25 insertions(+), 10 deletions(-)
> lisp/emacs-lisp/syntax.el | 35 +++++++++++++++++++++++++----------
>
> modified   lisp/emacs-lisp/syntax.el
> @@ -139,14 +139,28 @@ syntax-propertize-multiline
>  		  (point-max))))
>    (cons beg end))
>  
> -(defun syntax-propertize--shift-groups (re n)
> -  (replace-regexp-in-string
> -   "\\\\(\\?\\([0-9]+\\):"
> -   (lambda (s)
> -     (replace-match
> -      (number-to-string (+ n (string-to-number (match-string 1 s))))
> -      t t s 1))
> -   re t t))
> +(defun syntax-propertize--shift-groups-and-backrefs (re n)
> +  (let ((new-re (replace-regexp-in-string
> +                 "\\\\(\\?\\([0-9]+\\):"
> +                 (lambda (s)
> +                   (replace-match
> +                    (number-to-string
> +                     (+ n (string-to-number (match-string 1 s))))
> +                    t t s 1))
> +                 re t t))
> +        (pos 0))
> +    (while (string-match "\\\\\\([0-9]+\\)" new-re pos)
> +      (setq pos (+ 1 (match-beginning 1)))
> +      (when (save-match-data
> +              ;; With \N, the \ must be in a subregexp context, i.e.,
> +              ;; not in a character class or in a \{\} repetition.
> +              (subregexp-context-p new-re (match-beginning 0)))
> +        (let ((shifted (+ n (string-to-number (match-string 1 new-re)))))
> +          (when (> shifted 9)
> +            (error "There may be at most nine back-references"))
> +          (setq new-re (replace-match (number-to-string shifted)
> +                                      t t new-re 1)))))
> +    new-re))
>  
>  (defmacro syntax-propertize-precompile-rules (&rest rules)
>    "Return a precompiled form of RULES to pass to `syntax-propertize-rules'.
> @@ -190,7 +204,8 @@ syntax-propertize-rules
>  Also SYNTAX is free to move point, in which case RULES may not be applied to
>  some parts of the text or may be applied several times to other parts.
>  
> -Note: back-references in REGEXPs do not work."
> +Note: There may be at most nine back-references in the REGEXPs of
> +all RULES in total."
>    (declare (debug (&rest &or symbolp    ;FIXME: edebug this eval step.
>                           (form &rest
>                                 (numberp
> @@ -219,7 +234,7 @@ syntax-propertize-rules
>                   ;; tell when *this* match 0 has succeeded.
>                   (cl-incf offset)
>                   (setq re (concat "\\(" re "\\)")))
> -               (setq re (syntax-propertize--shift-groups re offset))
> +               (setq re (syntax-propertize--shift-groups-and-backrefs re offset))
>                 (let ((code '())
>                       (condition
>                        (cond
> --8<---------------cut here---------------end--------------->8---
>
> Seems to work fine and errors as soon as a back-reference needs to be
> renumbered to \10 or more.
>
> Good to go?
>
> Bye,
> Tassilo
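
The behaviour discussed above can be sketched in a few forms.  This is
only an illustration, not the tests mentioned in the thread:
`my/backslash-digit-is-backref-p' is a hypothetical helper that is not
part of the patch, and the second half assumes the patch is installed,
so it is skipped when the renamed function is not defined.

```elisp
;; Illustration only; `my/backslash-digit-is-backref-p' is a
;; hypothetical helper, not part of the patch.
(defun my/backslash-digit-is-backref-p (re)
  "Return non-nil if the first backslash-digit pair in RE is a back-reference.
That is the case when the backslash sits in a subregexp context."
  (let ((pos (string-match "\\\\[0-9]" re)))
    (and pos (subregexp-context-p re pos))))

(my/backslash-digit-is-backref-p "\\(foo\\)-\\1")        ; non-nil
(my/backslash-digit-is-backref-p "bar\\(foo\\)[\\1-9]")  ; nil: inside [...]

;; Assumes the patch is installed; skipped otherwise.
(when (fboundp 'syntax-propertize--shift-groups-and-backrefs)
  (list
   ;; Numbered group and back-reference are both shifted by 2:
   ;; => "\\(?3:a\\)-\\3"
   (syntax-propertize--shift-groups-and-backrefs "\\(?1:a\\)-\\1" 2)
   ;; A backslash-digit pair in a character class is left alone:
   ;; => "bar\\(foo\\)[\\1-9]"
   (syntax-propertize--shift-groups-and-backrefs "bar\\(foo\\)[\\1-9]" 1)
   ;; Shifting \1 by 9 would need \10, so an error is signaled:
   ;; => "There may be at most nine back-references"
   (condition-case err
       (syntax-propertize--shift-groups-and-backrefs "\\(a\\)\\1" 9)
     (error (error-message-string err)))))
```

The check only looks at the backslash position, in line with the
remark above that the digit itself does not need to be tested.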




