From: "Stefan Monnier"
Newsgroups: gmane.emacs.devel
Subject: Re: regex and case-fold-search problem
Date: Thu, 29 Aug 2002 12:00:23 -0400
Message-ID: <200208291600.g7TG0NZ11087@rum.cs.yale.edu>
References: <200208230625.PAA23426@etlken.m17n.org> <200208262151.g7QLpfA12782@wijiji.santafe.edu> <200208290853.RAA03185@etlken.m17n.org> <5x8z2pj13t.fsf@kfs2.cua.dk> <200208291338.WAA03607@etlken.m17n.org>
Original-To: Kenichi Handa
Cc: storm@cua.dk, rms@gnu.org, emacs-devel@gnu.org

> In article <5x8z2pj13t.fsf@kfs2.cua.dk>, storm@cua.dk (Kim F. Storm) writes:
> > IMO, it is wrong to handle case-fold-search for regexp ranges by
> > trying to modify the interpretation of the regex range.
> >
> > Instead, the regex matcher should try to upcase and lowercase each
> > character in the string and see if either of these characters is
> > within the given range.
>
> I also reached that idea.  It makes regexp compiling
> simpler and faster but makes regexp matching a little bit
> slower.  I don't know if that slowdown is tolerable or
> not, but it's worth trying.

Two things:
- Neither `upper(lower(x)) = x' nor `lower(upper(x)) = x' is guaranteed.
- The regexp matcher right now only has access to one of the two tables
  (I believe it's the `lower' one, but I'm not even sure), and so two chars
  are deemed to match if translate(a) = translate(b).

The first might be a non-issue, I don't know.
The second is more serious because it means that if we want to use
`upper' we'll need to somehow pass that table as well, which requires
changing the interface to the reg-matching functions.


	Stefan
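
To make the two points concrete, here is a minimal sketch in Python. It is
not the Emacs C code: the dict tables `LOWER' and `UPPER' and the helpers
`chars_match', `in_range_one_table' and `in_range_both_tables' are all
hypothetical names made up for illustration, and the sketch assumes the
matcher's single table is `lower', which is only a guess.

```python
import string

# Toy case tables (assumption: plain dicts standing in for real case tables).
LOWER = dict(zip(string.ascii_uppercase, string.ascii_lowercase))
UPPER = dict(zip(string.ascii_lowercase, string.ascii_uppercase))
LOWER["\u212a"] = "k"   # KELVIN SIGN lowercases to ASCII k
UPPER["\u0131"] = "I"   # LATIN SMALL LETTER DOTLESS I uppercases to ASCII I


def lower(c):
    """Look up c in the toy `lower' table; unmapped chars map to themselves."""
    return LOWER.get(c, c)


def upper(c):
    """Look up c in the toy `upper' table; unmapped chars map to themselves."""
    return UPPER.get(c, c)


def chars_match(a, b, translate=lower):
    """Single-table scheme: two chars match if they translate to the same char."""
    return translate(a) == translate(b)


def in_range_one_table(c, lo, hi, translate=lower):
    """Range test when only one translate table is available to the matcher."""
    return lo <= translate(c) <= hi


def in_range_both_tables(c, lo, hi):
    """Range test that tries the char itself, its lowercase and its uppercase."""
    return any(lo <= x <= hi for x in (c, lower(c), upper(c)))


if __name__ == "__main__":
    # Point 1: the round trips are not identities for every char.
    print(lower(upper("\u0131")) == "\u0131")
    print(upper(lower("\u212a")) == "\u212a")
    # Point 2: with only `lower', "a" is not found in the range A-Z.
    print(in_range_one_table("a", "A", "Z"))
    print(in_range_both_tables("a", "A", "Z"))
```

With only the `lower' table, a range written in upper case misses lowercase
input, which is why the second approach needs `upper' passed in as well.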