From: "St/n_P/rm/n"
Newsgroups: gmane.emacs.devel
Subject: regexp-quote missing escapes in grouping constructs - Bug?
Date: Thu, 12 Jun 2008 19:39:12 -0400
To: emacs-devel@gnu.org

  (regexp-quote "[0-9]\{2,4\}\(-\|/\)[0-9]?+\(-\|/\)[0-9]\{2,4\}")
  ---> "\\[0-9]{2,4}(-|/)\\[0-9]\\?\\+(-|/)\\[0-9]{2,4}"

Am I misunderstanding something? Shouldn't passing that string to regexp-quote give back something more like this:

  ---> "[0-9]\\{2,4\\}\\(-\\|/\\)[0-9]?+(-\\|/)[0-9]\\{2,4\\}"

or *flinches at the thought*

  ---> "[0-9]\\\\{2,4\\\\}\\\\(-\\\\|/\\\\)[0-9]?+(-\\\\|/)[0-9]\\\\{2,4\\\\}"

-
It's possible that I am misunderstanding the function, and I'd rather not file this as a bug report because I am using Lennart's recent W32 patched...

  GNU Emacs 23.0.60.1 (i386-mingw-nt5.1.2600) of 2008-05-12 on LENNART-69DE564 (patched)

However, I am currently building a derived mode, and regexp-opt and regexp-quote are kinda required, esp. as there doesn't seem to be a clean way to avoid passing everything around through multiple instances of defconst, defvar, defcustom, etc. just to "cache" keyword regexes for font-lock.

---
I find the following two cases most relevant to the matter at hand.

case a) We get the requisite Lisp reader 4x \\\\ for the group construct, but the function not only fails to escape the interior alternative, it omits the backslash altogether, e.g.

  (regexp-quote "\\(123\|567\\)")
  ---> "\\\\(123|567\\\\)"

case b) In contrast, when we give it the double escape "\\" inside the group, it DOES catch the escape and gives us 4x the \

  (regexp-quote "\\(123\\|567\\)")
  ---> "\\\\(123\\\\|567\\\\)"

---
This doesn't seem like consistent behavior, esp. as regexp-quote is feeding regexp-opt elsewhere.

---
These other examples do not strike me as edge cases when 'manually optimizing' a regex for font-lock:

  (regexp-quote "\(123\|567\)")
  ---> "(123|567)"

  (regexp-quote `(,"\(123\|567\)"))
  ---> ("(123|567)")

  (regexp-quote '("\(123|567\)"))
  ---> ("(123|567)")

  (regexp-quote "(123|567)")
  ---> "(123|567)"

  (regexp-quote '"(123|567)")
  ---> "(123|567)"

---
Again, maybe I am missing something, but my head hurts... despite having really come to appreciate Emacs regexps :)
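
One possible reading of the results above, offered as a sketch and not a definitive diagnosis: the Lisp reader processes backslashes in a string literal before regexp-quote ever sees the string. A single backslash before a character such as ( | ) { } is simply dropped by the reader, while "\\" yields one literal backslash. On that assumption, regexp-quote is consistently quoting the characters it actually receives. The strings below are taken from the examples above; evaluating the forms one at a time (e.g. with C-x C-e or in M-: / ielm) should show the noted values.

```elisp
;; 1. What the reader hands over: a lone backslash before ( | ) is dropped.
(string= "\(123\|567\)" "(123|567)")        ; => t
(length "\\(")                               ; => 2  (one backslash, one paren)

;; 2. regexp-quote only escapes characters that are special in a regexp.
;;    A bare ( | ) is not special in Emacs regexps, so nothing changes.
(regexp-quote "(123|567)")                   ; => "(123|567)"

;; 3. With "\\" the string really contains backslashes, and each one is
;;    quoted, which prints as four backslashes in Lisp read syntax.
(regexp-quote "\\(123\\|567\\)")             ; => "\\\\(123\\\\|567\\\\)"

;; 4. The quoted regexp matches the original text literally,
;;    not the alternatives 123 / 567.
(string-match-p (regexp-quote "\\(123\\|567\\)") "\\(123\\|567\\)")  ; => 0
(string-match-p (regexp-quote "\\(123\\|567\\)") "567")              ; => nil

;; 5. The unquoted regexp is the one that matches an alternative.
(string-match-p "\\(123\\|567\\)" "567")     ; => 0
```

If this reading is right, regexp-quote is the tool for matching a string literally, and a regexp that is already hand-written (groups, alternation) would be used as-is, with every regexp backslash doubled in the Lisp source.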