From: Alan Mackenzie
Newsgroups: gmane.emacs.bugs
Subject: Re: (regexp-opt-depth "[\\(]") => 1 :-(
Date: Wed, 23 Apr 2003 11:26:30 +0000 (GMT)
Reply-To: Alan Mackenzie
Original-To: bug-gnu-emacs@gnu.org
List-Id: Bug reports for GNU Emacs, the Swiss army knife of text editors
Xref: main.gmane.org gmane.emacs.bugs:4877

On Sun, 20 Apr 2003, Alan Mackenzie wrote:

>There Is No Alternative:  regexp-opt-depth MUST analyse its argument
>properly.  The following rewrite of regexp-opt-depth does just that.
>[Well, OK, it would be as well for somebody else to check the formulation
>of regexp-opt-not-groupie*-re. ;-]

Many thanks to RMS for doing just that and telling me that

# This is a good idea, but it fails on "[[:alpha:]\\(]".
# I think the value for `class' needs to be more sophisticated.

>The patch below passes the following test cases:

>(regexp-opt-depth "(asdf)") => 0
>(regexp-opt-depth "\\(asdf\\)") => 1
>(regexp-opt-depth "\\(\\(asdf\\)\\)") => 2
>(regexp-opt-depth "\\(?:asdf\\)") => 0
>(regexp-opt-depth "[\\(]") => 0
>(regexp-opt-depth "[a]\\(]asd\\)") => 1
>(regexp-opt-depth "[^a]\\(]asd\\)") => 1
>(regexp-opt-depth "[]\\(]asd)") => 0
>(regexp-opt-depth "[^]\\(]asd)") => 0
>(regexp-opt-depth "\\(? \\)") signals "invalid regexp".

Here is the amended patch with that more sophisticated regexp for class.
In addition to the above test cases, the newer version passes these:

(regexp-opt-depth "[[:alpha:]\\(]") => 0
(regexp-opt-depth "[[:alpha]\\(") signals "invalid regexp"
(regexp-opt-depth "[[:alpha]\\(\\)") => 1
(regexp-opt-depth "[[:alp$ha:]\\(\\)") signals "invalid regexp"
(regexp-opt-depth "[[alpha:]\\(]\\)") => 1

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
2003-04-23  Alan Mackenzie

	* regexp-opt.el: In regexp-opt-depth, don't count a "\\(" which
	appears inside a character set.  New constant
	regexp-opt-not-groupie*-re.

*** regexp-opt.1.24.el	Fri Apr 18 18:34:34 2003
--- regexp-opt.acm.1.24.el	Tue Apr 22 20:52:53 2003
***************
*** 110,115 ****
--- 110,133 ----
             (re (regexp-opt-group sorted-strings open)))
        (if words (concat "\\<" re "\\>") re))))
  
+ (defconst regexp-opt-not-groupie*-re
+   (let* ((harmless-ch "[^\\\\[]")
+          (esc-pair-not-lp "\\\\[^(]")
+          (class-harmless-ch "[^][]")
+          (class-lb-harmless "[^]:]")
+          (class-lb-colon-maybe-charclass ":\\([a-z]+:]\\)?")
+          (class-lb (concat "\\[\\(" class-lb-harmless
+                            "\\|" class-lb-colon-maybe-charclass "\\)"))
+          (class
+           (concat "\\[^?]?"
+                   "\\(" class-harmless-ch
+                   "\\|" class-lb "\\)*"
+                   "\\[?]"))   ; special handling for bare [ at end of re
+          (shy-lp "\\\\(\\?:"))
+     (concat "\\(" harmless-ch "\\|" esc-pair-not-lp
+             "\\|" class "\\|" shy-lp "\\)*"))
+   "Matches any part of a regular expression EXCEPT for non-shy \"\\\\(\"s")
+ 
  ;;;###autoload
  (defun regexp-opt-depth (regexp)
    "Return the depth of REGEXP.
***************
*** 120,130 ****
      (string-match regexp "")
      ;; Count the number of open parentheses in REGEXP.
      (let ((count 0) start)
!       (while (string-match "\\(\\`\\|[^\\]\\)\\\\\\(\\\\\\\\\\)*([^?]"
!                            regexp start)
!         (setq count (1+ count)
!               ;; Go back 2 chars (one for [^?] and one for [^\\]).
!               start (- (match-end 0) 2)))
        count)))
  
  ;;; Workhorse functions.
--- 138,149 ----
      (string-match regexp "")
      ;; Count the number of open parentheses in REGEXP.
      (let ((count 0) start)
!       (while
!           (progn
!             (string-match regexp-opt-not-groupie*-re regexp start)
!             (setq start (+ (match-end 0) 2))  ; +2 for "\\(" after match-end.
!             (<= start (length regexp)))
!         (setq count (1+ count)))
        count)))
  
  ;;; Workhorse functions.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

-- 
Alan Mackenzie (Munich, Germany)
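
The test cases listed in the message can be collected into a small table and checked mechanically. The following is a minimal sketch, assuming a regexp-opt.el with the character-set fix is loaded. The `depth-check-` names are hypothetical helpers invented for illustration and are not part of regexp-opt.el. The expected values are copied from the message, not measured independently.

```elisp
;; Sketch only.  The `depth-check-' names are hypothetical helpers for
;; illustration; they are not part of regexp-opt.el.
(require 'regexp-opt)

(defconst depth-check-cases
  '(("(asdf)"                . 0)
    ("\\(asdf\\)"            . 1)
    ("\\(\\(asdf\\)\\)"      . 2)
    ("\\(?:asdf\\)"          . 0)
    ("[\\(]"                 . 0)
    ("[a]\\(]asd\\)"         . 1)
    ("[^a]\\(]asd\\)"        . 1)
    ("[]\\(]asd)"            . 0)
    ("[^]\\(]asd)"           . 0)
    ("\\(? \\)"              . invalid)
    ("[[:alpha:]\\(]"        . 0)
    ("[[:alpha]\\("          . invalid)
    ("[[:alpha]\\(\\)"       . 1)
    ("[[:alp$ha:]\\(\\)"     . invalid)
    ("[[alpha:]\\(]\\)"      . 1))
  "Alist of (REGEXP . EXPECTED) taken from the test cases in the message.
EXPECTED is a depth, or the symbol `invalid' where an error is expected.")

(defun depth-check-result (regexp)
  "Return the depth of REGEXP, or `invalid' if `invalid-regexp' is signalled."
  (condition-case nil
      (regexp-opt-depth regexp)
    (invalid-regexp 'invalid)))

(defun depth-check-failures ()
  "Return a list of (REGEXP EXPECTED GOT) for every case that disagrees."
  (let (failures)
    (dolist (entry depth-check-cases (nreverse failures))
      (let ((got (depth-check-result (car entry))))
        (unless (equal got (cdr entry))
          (push (list (car entry) (cdr entry) got) failures))))))
```

Evaluating `(depth-check-failures)` returns an empty list when every case agrees with the table. Any entry it does return names the regexp, the expected value from the message, and the value actually obtained, which makes it easy to see whether a given regexp-opt.el still miscounts a "\\(" inside a character set.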