From: Alan Mackenzie <acm@muc.de>
Subject: Re: (regexp-opt-depth "[\\(]") => 1 :-(
Date: Wed, 23 Apr 2003 11:26:30 +0000 (GMT)
Message-ID: <Pine.LNX.3.96.1030423105138.178A-100000@acm.acm>
In-Reply-To: <Pine.LNX.3.96.1030420220746.178A-100000@acm.acm>
On Sun, 20 Apr 2003, Alan Mackenzie wrote:
>There Is No Alternative: regexp-opt-depth MUST analyse its argument
>properly. The following rewrite of regexp-opt-depth does just that.
>[Well, OK, it would be as well for somebody else to check the formulation
>of regexp-opt-not-groupie*-re. ;-]
Many thanks to RMS for doing just that, and for telling me:
# This is a good idea, but it fails on "[[:alpha:]\\(]".
# I think the value for `class' needs to be more sophisticated.
>The patch below passes the following test cases:
>(regexp-opt-depth "(asdf)") => 0
>(regexp-opt-depth "\\(asdf\\)") => 1
>(regexp-opt-depth "\\(\\(asdf\\)\\)") => 2
>(regexp-opt-depth "\\(?:asdf\\)") => 0
>(regexp-opt-depth "[\\(]") => 0
>(regexp-opt-depth "[a]\\(]asd\\)") => 1
>(regexp-opt-depth "[^a]\\(]asd\\)") => 1
>(regexp-opt-depth "[]\\(]asd)") => 0
>(regexp-opt-depth "[^]\\(]asd)") => 0
>(regexp-opt-depth "\\(? \\)") signals "invalid regexp".
Here is the amended patch with that more sophisticated regexp for class.
In addition to the above test cases, the newer version passes these:
(regexp-opt-depth "[[:alpha:]\\(]") => 0
(regexp-opt-depth "[[:alpha]\\(") signals "invalid regexp"
(regexp-opt-depth "[[:alpha]\\(\\)") => 1
(regexp-opt-depth "[[:alp$ha:]\\(\\)") signals "Invalid regexp"
(regexp-opt-depth "[[alpha:]\\(]\\)") => 1
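For anyone who wants to re-run these cases, here is a minimal sketch of a harness. The helper name `my-regexp-depth-or-error' is made up for this illustration and is not part of the patch or of regexp-opt.el. It assumes a regexp-opt.el defining regexp-opt-depth can be loaded, and the values it prints depend on which version of that function is in use.

```elisp
;; Illustrative harness only; `my-regexp-depth-or-error' is a hypothetical
;; helper, not part of regexp-opt.el or of the patch.
(require 'regexp-opt)

(defun my-regexp-depth-or-error (regexp)
  "Return the depth of REGEXP, or the symbol `invalid-regexp'.
The symbol is returned when `regexp-opt-depth' signals that error."
  (condition-case nil
      (regexp-opt-depth regexp)
    (invalid-regexp 'invalid-regexp)))

;; Print one line per case, for comparison by eye with the list above.
(dolist (re '("[[:alpha:]\\(]"
              "[[:alpha]\\("
              "[[:alpha]\\(\\)"
              "[[:alp$ha:]\\(\\)"
              "[[alpha:]\\(]\\)"))
  (message "%S => %S" re (my-regexp-depth-or-error re)))
```

Catching only `invalid-regexp' lets any other error still propagate, so an unexpected failure is not mistaken for one of the "signals" cases.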
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
2003-04-23 Alan Mackenzie <acm@muc.de>
* regexp-opt.el: In regexp-opt-depth, don't count a "\\(" which appears
inside a [character set]. New constant regexp-opt-not-groupie*-re.
*** regexp-opt.1.24.el Fri Apr 18 18:34:34 2003
--- regexp-opt.acm.1.24.el Tue Apr 22 20:52:53 2003
***************
*** 110,115 ****
--- 110,133 ----
             (re (regexp-opt-group sorted-strings open)))
        (if words (concat "\\<" re "\\>") re))))
  
+ (defconst regexp-opt-not-groupie*-re
+   (let* ((harmless-ch "[^\\\\[]")
+          (esc-pair-not-lp "\\\\[^(]")
+          (class-harmless-ch "[^][]")
+          (class-lb-harmless "[^]:]")
+          (class-lb-colon-maybe-charclass ":\\([a-z]+:]\\)?")
+          (class-lb (concat "\\[\\(" class-lb-harmless
+                            "\\|" class-lb-colon-maybe-charclass "\\)"))
+          (class
+           (concat "\\[^?]?"
+                   "\\(" class-harmless-ch
+                   "\\|" class-lb "\\)*"
+                   "\\[?]"))   ; special handling for bare [ at end of re
+          (shy-lp "\\\\(\\?:"))
+     (concat "\\(" harmless-ch "\\|" esc-pair-not-lp
+             "\\|" class "\\|" shy-lp "\\)*"))
+   "Matches any part of a regular expression EXCEPT for non-shy \"\\\\(\"s")
+
  ;;;###autoload
  (defun regexp-opt-depth (regexp)
    "Return the depth of REGEXP.
***************
*** 120,130 ****
      (string-match regexp "")
      ;; Count the number of open parentheses in REGEXP.
      (let ((count 0) start)
!       (while (string-match "\\(\\`\\|[^\\]\\)\\\\\\(\\\\\\\\\\)*([^?]"
!                            regexp start)
!         (setq count (1+ count)
!               ;; Go back 2 chars (one for [^?] and one for [^\\]).
!               start (- (match-end 0) 2)))
        count)))
\f
;;; Workhorse functions.
--- 138,149 ----
      (string-match regexp "")
      ;; Count the number of open parentheses in REGEXP.
      (let ((count 0) start)
!       (while
!           (progn
!             (string-match regexp-opt-not-groupie*-re regexp start)
!             (setq start (+ (match-end 0) 2)) ; +2 for "\\(" after match-end.
!             (<= start (length regexp)))
!         (setq count (1+ count)))
        count)))
\f
;;; Workhorse functions.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
--
Alan Mackenzie (Munich, Germany)