From mboxrd@z Thu Jan 1 00:00:00 1970
From: Luc Teirlinck
Newsgroups: gmane.emacs.devel
Subject: Re: Unquoted special characters in regexps
Date: Sun, 26 Feb 2006 10:41:47 -0600 (CST)
Message-ID: <200602261641.k1QGfl104925@raven.dms.auburn.edu>
References: <4400AD8E.5050001@gmx.at> <4400BBB1.2050800@gmx.at> <200602252213.k1PMDBP24413@raven.dms.auburn.edu> <4401A98D.3070809@gmx.at>
Cc: rudalics@gmx.at, emacs-devel@gnu.org
Original-To: schwab@suse.de
In-reply-to: (message from Andreas Schwab on Sun, 26 Feb 2006 14:50:38 +0100)
Xref: news.gmane.org gmane.emacs.devel:51087

Andreas Schwab wrote:

   > According to the Elisp manual all these exhibit "poor practice" since
   > you didn't quote the second `]'s.

   It's a bug in the manual.

I propose the following patch to lispref/searching.texi, which I can
install if desired.  I will wait till more people, in particular
Richard, have had an opportunity to see it.

Note that the current version already clearly states elsewhere that
`]' is special _inside_ character alternatives:

   Note that the usual regexp special characters are not special inside a
   character alternative.  A completely different set of characters is
   special inside character alternatives: `]', `-' and `^'.

Apart from correcting the bug we are discussing, it also corrects
another misstatement:

   For example, a string with unbalanced square brackets is invalid
   (with a few exceptions, such as `[]]'),

That is incorrect as the examples below show.

ELISP> (string-match "]]]]" "]]]]")
0
ELISP> (string-match "[[]" "[")
0

One correct way to restate it would be that a string whose square
brackets _with special meaning in the context in which they are used_
do not balance is invalid.  This would be (unless I overlook
something) without exceptions: in `[]]' the square brackets with
special meaning do balance.  In the patch below I formulated it
differently.

None of my previous mails to emacs-{devel,pretest-bug} in the last few
days have appeared on the list, so I wonder whether this one will.

===File ~/searching.texi-diff===============================
*** searching.texi  06 Feb 2006 16:02:08 -0600  1.68
--- searching.texi  26 Feb 2006 10:25:06 -0600
***************
*** 237,243 ****
  special constructs and the rest are @dfn{ordinary}.  An ordinary
  character is a simple regular expression that matches that character and
  nothing else.  The special characters are @samp{.}, @samp{*}, @samp{+},
! @samp{?}, @samp{[}, @samp{]}, @samp{^}, @samp{$}, and @samp{\}; no new
  special characters will be defined in the future.  Any other character
  appearing in a regular expression is ordinary, unless a @samp{\}
  precedes it.
--- 237,243 ----
  special constructs and the rest are @dfn{ordinary}.  An ordinary
  character is a simple regular expression that matches that character and
  nothing else.  The special characters are @samp{.}, @samp{*}, @samp{+},
! @samp{?}, @samp{[}, @samp{^}, @samp{$}, and @samp{\}; no new
  special characters will be defined in the future.  Any other character
  appearing in a regular expression is ordinary, unless a @samp{\}
  precedes it.
***************
*** 740,747 ****
  @kindex invalid-regexp
  Not every string is a valid regular expression.  For example, a string
! with unbalanced square brackets is invalid (with a few exceptions, such
! as @samp{[]]}), and so is a string that ends with a single @samp{\}.  If
  an invalid regular expression is passed to any of the search functions,
  an @code{invalid-regexp} error is signaled.
--- 740,747 ----
  @kindex invalid-regexp
  Not every string is a valid regular expression.  For example, a string
! that ends inside a character alternative without terminating @samp{]}
! is invalid, and so is a string that ends with a single @samp{\}.  If
  an invalid regular expression is passed to any of the search functions,
  an @code{invalid-regexp} error is signaled.
============================================================
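
The cases discussed in this message can be checked directly in Emacs
Lisp.  What follows is only a minimal sketch: the helper name
`my-regexp-valid-p' is hypothetical (it is not part of Emacs or of the
patch), and it assumes that attempting a match is an acceptable way to
find out whether a regexp compiles.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-regexp-valid-p (regexp)
  "Return t if REGEXP is a valid regexp, nil if it signals `invalid-regexp'.
Note that this clobbers the match data as a side effect."
  (condition-case nil
      (progn (string-match regexp "") t)
    (invalid-regexp nil)))

;; Outside a character alternative, an unquoted `]' is ordinary.
(string-match "]]]]" "]]]]")   ; => 0

;; Inside a character alternative, `[' is ordinary.
(string-match "[[]" "[")       ; => 0

;; `[]]' is valid: the brackets with special meaning balance.
(my-regexp-valid-p "[]]")      ; => t

;; Ends inside a character alternative without terminating `]'.
(my-regexp-valid-p "[a")       ; => nil

;; Ends with a single `\' (written "\\" in Lisp string syntax).
(my-regexp-valid-p "\\")       ; => nil
```

The last two forms correspond to the two kinds of invalid regexp named
in the proposed wording of the second hunk.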