From: Tassilo Horn <tsdh@gnu.org>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: emacs-devel@gnu.org
Subject: Re: Removing no-back-reference restriction from syntax-propertize-rules
Date: Mon, 18 May 2020 23:30:32 +0200
Message-ID: <87r1vh2ao7.fsf@gnu.org>
In-Reply-To: <jwvd07159ob.fsf-monnier+emacs@gnu.org> (Stefan Monnier's message of "Mon, 18 May 2020 15:30:32 -0400")
Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> Can you give an example regexp where \N preceded by a non-\ is not a
>> back-reference (and still valid)?
>
> Of course: "bar\\(foo\\)[\\1-9]".
Oh, right.
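To spell out why that counterexample matters, here is a minimal sketch. Inside the bracket expression, `\1` is just the character `\` followed by the start of the range `1-9`, and `subregexp-context-p` reports that position as not being a subregexp context. The helper name `my/backref-1-in-subregexp-context-p` is made up for illustration and is not part of the patch.

```elisp
;; Hypothetical helper: return t if the first "\1" found in RE sits
;; in a subregexp context (i.e., it could be a real back-reference).
(defun my/backref-1-in-subregexp-context-p (re)
  (let ((pos (string-match "\\\\1" re)))
    (and pos (subregexp-context-p re pos))))

;; "\1" inside a character class: not a back-reference.
(my/backref-1-in-subregexp-context-p "bar\\(foo\\)[\\1-9]") ; => nil

;; "\1" after a group, outside any brackets: a real back-reference.
(my/backref-1-in-subregexp-context-p "\\(foo\\)-\\1")       ; => t

;; The first regexp is nevertheless valid; the class matches "5".
(string-match "bar\\(foo\\)[\\1-9]" "barfoo5")              ; => 0
```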
>> BTW, do I read the docs right in that there are at most nine
>> back-references, i.e., \10 cannot exist? In that case, we'd have the
>> restriction that at most 9 back-references may appear in all syntax
>> rules.
>
> Apparently, yes:
>
> ELISP> (string-match "\\(?5:[ab]\\)-\\5" "a-a")
> 0 (#o0, #x0, ?\C-@)
> ELISP> (string-match "\\(?15:[ab]\\)-\\15" "a-a")
> nil
>
> [ I guess that's another reason to stay away from backreferences. ]
Ah, so my "back-refs to explicitly numbered groups don't work at all"
issue was actually that I used a number bigger than 9.
>> I guess in that case we should signal an error, no?
>
> Indeed.
Ok, will do.
>> (when (save-match-data
>> ;; With \N, the \ must be in a subregexp context and the
>> ;; N must not be in a subregexp context.
>> (and (subregexp-context-p new-re (match-beginning 0))
>> (not (subregexp-context-p new-re (match-beginning 1)))))
>
> You don't need/want to test (subregexp-context-p new-re (match-beginning 1)).
Ok.
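A likely reason the second test is unnecessary (offered as an assumption, since the thread does not spell it out): the position of N always comes directly after a backslash, and `subregexp-context-p` documents "right after a `\`" as a non-subregexp context. The `(not ...)` form would therefore always be true. A small sketch, with the hypothetical helper `my/digit-in-subregexp-context-p`:

```elisp
;; Hypothetical helper: report the context of the digit that follows
;; the first backslash+digits pair in RE.
(defun my/digit-in-subregexp-context-p (re)
  (when (string-match "\\\\\\([0-9]+\\)" re)
    ;; Grab the position before `subregexp-context-p' clobbers the
    ;; match data.
    (let ((digit-pos (match-beginning 1)))
      (subregexp-context-p re digit-pos))))

;; Real back-reference: the digit still follows a backslash.
(my/digit-in-subregexp-context-p "\\(foo\\)-\\1")       ; => nil

;; Inside a character class: also not a subregexp context.
(my/digit-in-subregexp-context-p "bar\\(foo\\)[\\1-9]") ; => nil
```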
So all in all, this should give the following patch:
--8<---------------cut here---------------start------------->8---
scratch/syntax-propertize-rules-with-backrefs ba3eee275640d453ffee9f6d9768be1ebd73d51b
Author: Tassilo Horn <tsdh@gnu.org>
AuthorDate: Sat May 16 10:05:12 2020 +0200
Commit: Tassilo Horn <tsdh@gnu.org>
CommitDate: Mon May 18 23:14:49 2020 +0200
Parent: ca7224d5db Add test for recent buffer-local-variables change
Merged: emacs-27 feature/browse-url-browser-kind master scratch/syntax-propertize-rules-with-backrefs
Contained: scratch/syntax-propertize-rules-with-backrefs
Follows: emacs-27.0.91 (945)
Allow back-references in syntax-propertize-rules.
* lisp/emacs-lisp/syntax.el (syntax-propertize--shift-groups-and-backrefs):
Renamed from syntax-propertize--shift-groups, and also shift
back-references.
(syntax-propertize-rules): Adapt docstring and use renamed function.
1 file changed, 25 insertions(+), 10 deletions(-)
lisp/emacs-lisp/syntax.el | 35 +++++++++++++++++++++++++----------
modified lisp/emacs-lisp/syntax.el
@@ -139,14 +139,28 @@ syntax-propertize-multiline
(point-max))))
(cons beg end))
-(defun syntax-propertize--shift-groups (re n)
- (replace-regexp-in-string
- "\\\\(\\?\\([0-9]+\\):"
- (lambda (s)
- (replace-match
- (number-to-string (+ n (string-to-number (match-string 1 s))))
- t t s 1))
- re t t))
+(defun syntax-propertize--shift-groups-and-backrefs (re n)
+ (let ((new-re (replace-regexp-in-string
+ "\\\\(\\?\\([0-9]+\\):"
+ (lambda (s)
+ (replace-match
+ (number-to-string
+ (+ n (string-to-number (match-string 1 s))))
+ t t s 1))
+ re t t))
+ (pos 0))
+ (while (string-match "\\\\\\([0-9]+\\)" new-re pos)
+ (setq pos (+ 1 (match-beginning 1)))
+ (when (save-match-data
+ ;; With \N, the \ must be in a subregexp context, i.e.,
+ ;; not in a character class or in a \{\} repetition.
+ (subregexp-context-p new-re (match-beginning 0)))
+ (let ((shifted (+ n (string-to-number (match-string 1 new-re)))))
+ (when (> shifted 9)
+ (error "There may be at most nine back-references"))
+ (setq new-re (replace-match (number-to-string shifted)
+ t t new-re 1)))))
+ new-re))
(defmacro syntax-propertize-precompile-rules (&rest rules)
"Return a precompiled form of RULES to pass to `syntax-propertize-rules'.
@@ -190,7 +204,8 @@ syntax-propertize-rules
Also SYNTAX is free to move point, in which case RULES may not be applied to
some parts of the text or may be applied several times to other parts.
-Note: back-references in REGEXPs do not work."
+Note: There may be at most nine back-references in the REGEXPs of
+all RULES in total."
(declare (debug (&rest &or symbolp ;FIXME: edebug this eval step.
(form &rest
(numberp
@@ -219,7 +234,7 @@ syntax-propertize-rules
;; tell when *this* match 0 has succeeded.
(cl-incf offset)
(setq re (concat "\\(" re "\\)")))
- (setq re (syntax-propertize--shift-groups re offset))
+ (setq re (syntax-propertize--shift-groups-and-backrefs re offset))
(let ((code '())
(condition
(cond
--8<---------------cut here---------------end--------------->8---
This seems to work fine, and it signals an error as soon as a
back-reference would need to be renumbered to \10 or higher.
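A minimal way to try this outside of syntax.el is sketched below. The function body is copied from the patch, but the name `my/shift-groups-and-backrefs` is hypothetical, chosen only so the copy does not clash with the real definition.

```elisp
;; Stand-alone copy of the patched function under a hypothetical name.
(defun my/shift-groups-and-backrefs (re n)
  (let ((new-re (replace-regexp-in-string
                 "\\\\(\\?\\([0-9]+\\):"
                 (lambda (s)
                   ;; Shift the number of an explicitly numbered group.
                   (replace-match
                    (number-to-string
                     (+ n (string-to-number (match-string 1 s))))
                    t t s 1))
                 re t t))
        (pos 0))
    (while (string-match "\\\\\\([0-9]+\\)" new-re pos)
      (setq pos (+ 1 (match-beginning 1)))
      ;; Only shift \N when the backslash is in a subregexp context.
      (when (save-match-data
              (subregexp-context-p new-re (match-beginning 0)))
        (let ((shifted (+ n (string-to-number (match-string 1 new-re)))))
          (when (> shifted 9)
            (error "There may be at most nine back-references"))
          (setq new-re (replace-match (number-to-string shifted)
                                      t t new-re 1)))))
    new-re))

;; Group and back-reference are shifted together.
(my/shift-groups-and-backrefs "\\(?1:[ab]\\)-\\1" 2)
;; => "\\(?3:[ab]\\)-\\3"

;; "\1" inside a character class is left alone.
(my/shift-groups-and-backrefs "\\(?1:a\\)[\\1-9]" 2)
;; => "\\(?3:a\\)[\\1-9]"

;; Shifting past 9 signals an error; catch it to show the message.
(condition-case err
    (my/shift-groups-and-backrefs "\\(?1:a\\)\\1" 9)
  (error (error-message-string err)))
;; => "There may be at most nine back-references"
```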
Good to go?
Bye,
Tassilo