From: Stefan Monnier <monnier@iro.umontreal.ca>
To: emacs-devel@gnu.org
Subject: Re: Removing no-back-reference restriction from syntax-propertize-rules
Date: Mon, 18 May 2020 22:58:14 -0400
Message-ID: <jwv5zcs3a49.fsf-monnier+emacs@gnu.org>
In-Reply-To: <87r1vh2ao7.fsf@gnu.org> (Tassilo Horn's message of "Mon, 18 May 2020 23:30:32 +0200")
Looks good.  IIRC you had some tests to go along with it.
If you could install them at the same time, that would be great.
Stefan
Tassilo Horn [2020-05-18 23:30:32] wrote:
> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>>> Can you give an example regexp where \N preceded by a non-\ is no
>>> back-reference (and still valid)?
>>
>> Of course: "bar\\(foo\\)[\\1-9]".
>
> Oh, right.
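
A minimal sketch of why that regexp is a counterexample, assuming a stock
Emacs: inside a bracket expression the backslash is an ordinary character,
so [\1-9] matches a backslash or a digit from 1 to 9, and
`subregexp-context-p' reports that this backslash is not in a subregexp
context.  The names `demo-re' and `demo-results' are hypothetical and exist
only for illustration.

```elisp
;; Illustrative only; `demo-re' and `demo-results' are hypothetical names.
(defvar demo-re "bar\\(foo\\)[\\1-9]")

(defvar demo-results
  (list
   ;; The bracket expression accepts a digit between 1 and 9 ...
   (string-match demo-re "barfoo5")
   ;; ... or a literal backslash ...
   (string-match demo-re "barfoo\\")
   ;; ... but it is not a back-reference to group 1.
   (string-match demo-re "barfoofoo")
   ;; The backslash at index 11 sits inside [...], so it is not in a
   ;; subregexp context and must not be renumbered.
   (subregexp-context-p demo-re 11)))

demo-results
;; => (0 0 nil nil)
```
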
>
>>> BTW, do I read the docs right in that there are at most nine
>>> back-references, i.e., \10 cannot exist? In that case, we'd have the
>>> restriction that at most 9 back-references may appear in all syntax
>>> rules.
>>
>> Apparently, yes:
>>
>> ELISP> (string-match "\\(?5:[ab]\\)-\\5" "a-a")
>> 0 (#o0, #x0, ?\C-@)
>> ELISP> (string-match "\\(?15:[ab]\\)-\\15" "a-a")
>> nil
>>
>> [ I guess that's another reason to stay away from backreferences. ]
>
> Ah, so my "back-refs to explicitly numbered groups don't work at all"
> issue was actually that I've used a bigger number than 9.
>
>>> I guess in that case we should signal an error, no?
>>
>> Indeed.
>
> Ok, will do.
>
>>>   (when (save-match-data
>>>           ;; With \N, the \ must be in a subregexp context and the
>>>           ;; N must not be in a subregexp context.
>>>           (and (subregexp-context-p new-re (match-beginning 0))
>>>                (not (subregexp-context-p new-re (match-beginning 1)))))
>>
>> You don't need/want to test (subregexp-context-p new-re (match-beginning 1)).
>
> Ok.
>
> So all in all, this should give the following patch:
>
> --8<---------------cut here---------------start------------->8---
> scratch/syntax-propertize-rules-with-backrefs ba3eee275640d453ffee9f6d9768be1ebd73d51b
> Author: Tassilo Horn <tsdh@gnu.org>
> AuthorDate: Sat May 16 10:05:12 2020 +0200
> Commit: Tassilo Horn <tsdh@gnu.org>
> CommitDate: Mon May 18 23:14:49 2020 +0200
>
> Parent: ca7224d5db Add test for recent buffer-local-variables change
> Merged: emacs-27 feature/browse-url-browser-kind master scratch/syntax-propertize-rules-with-backrefs
> Contained: scratch/syntax-propertize-rules-with-backrefs
> Follows: emacs-27.0.91 (945)
>
> Allow back-references in syntax-propertize-rules.
>
> * lisp/emacs-lisp/syntax.el (syntax-propertize--shift-groups-and-backrefs):
> Renamed from syntax-propertize--shift-groups, and also shift
> back-references.
> (syntax-propertize-rules): Adapt docstring and use renamed function.
>
> 1 file changed, 25 insertions(+), 10 deletions(-)
> lisp/emacs-lisp/syntax.el | 35 +++++++++++++++++++++++++----------
>
> modified lisp/emacs-lisp/syntax.el
> @@ -139,14 +139,28 @@ syntax-propertize-multiline
> (point-max))))
> (cons beg end))
>
> -(defun syntax-propertize--shift-groups (re n)
> -  (replace-regexp-in-string
> -   "\\\\(\\?\\([0-9]+\\):"
> -   (lambda (s)
> -     (replace-match
> -      (number-to-string (+ n (string-to-number (match-string 1 s))))
> -      t t s 1))
> -   re t t))
> +(defun syntax-propertize--shift-groups-and-backrefs (re n)
> +  (let ((new-re (replace-regexp-in-string
> +                 "\\\\(\\?\\([0-9]+\\):"
> +                 (lambda (s)
> +                   (replace-match
> +                    (number-to-string
> +                     (+ n (string-to-number (match-string 1 s))))
> +                    t t s 1))
> +                 re t t))
> +        (pos 0))
> +    (while (string-match "\\\\\\([0-9]+\\)" new-re pos)
> +      (setq pos (+ 1 (match-beginning 1)))
> +      (when (save-match-data
> +              ;; With \N, the \ must be in a subregexp context, i.e.,
> +              ;; not in a character class or in a \{\} repetition.
> +              (subregexp-context-p new-re (match-beginning 0)))
> +        (let ((shifted (+ n (string-to-number (match-string 1 new-re)))))
> +          (when (> shifted 9)
> +            (error "There may be at most nine back-references"))
> +          (setq new-re (replace-match (number-to-string shifted)
> +                                      t t new-re 1)))))
> +    new-re))
>
> (defmacro syntax-propertize-precompile-rules (&rest rules)
> "Return a precompiled form of RULES to pass to `syntax-propertize-rules'.
> @@ -190,7 +204,8 @@ syntax-propertize-rules
> Also SYNTAX is free to move point, in which case RULES may not be applied to
> some parts of the text or may be applied several times to other parts.
>
> -Note: back-references in REGEXPs do not work."
> +Note: There may be at most nine back-references in the REGEXPs of
> +all RULES in total."
> (declare (debug (&rest &or symbolp ;FIXME: edebug this eval step.
> (form &rest
> (numberp
> @@ -219,7 +234,7 @@ syntax-propertize-rules
> ;; tell when *this* match 0 has succeeded.
> (cl-incf offset)
> (setq re (concat "\\(" re "\\)")))
> - (setq re (syntax-propertize--shift-groups re offset))
> + (setq re (syntax-propertize--shift-groups-and-backrefs re offset))
> (let ((code '())
> (condition
> (cond
> --8<---------------cut here---------------end--------------->8---
>
> Seems to work fine and errors as soon as a back-reference needs to be
> renumbered to \10 or more.
>
> Good to go?
>
> Bye,
> Tassilo
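
To see concretely what the renumbering does, the function from the quoted
patch can be evaluated on its own.  The sketch below is a standalone copy
under the hypothetical name `demo-shift-groups-and-backrefs', so that it
does not touch syntax.el; the sample regexps and the offsets 2 and 9 are
assumptions chosen for illustration, not cases taken from the tests
mentioned above.

```elisp
;; Standalone copy of the function from the patch, renamed so it can be
;; evaluated without modifying syntax.el.  The `demo-' name is hypothetical.
(defun demo-shift-groups-and-backrefs (re n)
  "Return RE with explicit group numbers and back-references shifted by N."
  (let ((new-re (replace-regexp-in-string
                 "\\\\(\\?\\([0-9]+\\):"
                 (lambda (s)
                   (replace-match
                    (number-to-string
                     (+ n (string-to-number (match-string 1 s))))
                    t t s 1))
                 re t t))
        (pos 0))
    (while (string-match "\\\\\\([0-9]+\\)" new-re pos)
      (setq pos (+ 1 (match-beginning 1)))
      (when (save-match-data
              ;; Only a \ in a subregexp context starts a back-reference.
              (subregexp-context-p new-re (match-beginning 0)))
        (let ((shifted (+ n (string-to-number (match-string 1 new-re)))))
          (when (> shifted 9)
            (error "There may be at most nine back-references"))
          (setq new-re (replace-match (number-to-string shifted)
                                      t t new-re 1)))))
    new-re))

;; Group 1 and its back-reference both move to 3.
(demo-shift-groups-and-backrefs "\\(?1:a\\)\\1" 2)
;; => "\\(?3:a\\)\\3"

;; The \1 inside the bracket expression is left alone.
(demo-shift-groups-and-backrefs "\\(foo\\)[\\1-9]" 2)
;; => "\\(foo\\)[\\1-9]"

;; Shifting \1 by 9 would require \10, so an error is signaled.
(condition-case err
    (demo-shift-groups-and-backrefs "\\(a\\)\\1" 9)
  (error (error-message-string err)))
;; => "There may be at most nine back-references"
```

The last call shows the practical consequence of the nine-back-reference
limit: a rule that uses \1 cannot be placed after nine or more groups from
earlier rules.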