From: "Stefan Monnier"
Original-To: Kenichi Handa
Cc: monnier+gnu/emacs@rum.cs.yale.edu, emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: regex and case-fold-search problem
Date: Sun, 25 Aug 2002 14:52:41 -0400
Message-ID: <200208251852.g7PIqf121329@rum.cs.yale.edu>
References: <200208230625.PAA23426@etlken.m17n.org> <200208231736.g7NHafW02174@rum.cs.yale.edu> <200208240116.KAA24680@etlken.m17n.org>

> In article <200208231736.g7NHafW02174@rum.cs.yale.edu>, "Stefan Monnier" writes:
> > But I think that if it works with (case-fold-search nil) it should
> > also work with (case-fold-search t).  The current behavior is really
> > counter-intuitive.
>
> I agree.
>
> >> But, anyway, we have to decide what to do.
> >>
> >> (1) Regard the above case as a bug, and fix it completely.
> >>     As we don't support a range striding over different
> >>     charsets by the current Emacs, I think the fix is
> >>     difficult but not that much.  But, in emacs-unicode, we
> >>     can't have such a restriction, and thus the fix is very
> >>     difficult.
>
> > For ASCII it's pretty easy to fix.  But for other charsets, it's
> > indeed more tricky.  Maybe we can simply use the smallest contiguous
> > range of chars that includes all the chars we should match,
> > so the behavior is indeed "implementation-defined" (in the sense
> > that it's not necessarily obvious to the user what happens) but
> > it's at least less confusing (in the sense that (case-fold-search t)
> > matches at least as much as (case-fold-search nil)).
>
> Ideally, the range "[A-_]" must be converted to "[a-z[-_]".

Indeed and the (new) current code does just that for ASCII.

> But, it seems that your idea is to convert "[A-_]" to
> "[_-z]", correct?  I agree that it results in less
> counter-intuitive behaviour.

Not quite: [_-z] would not include [ \ ] and ^.
So instead it's [[-z] which includes all of [a-z[-_] as well as `
(in this particular case).

> > How about the patch below ?
> [...]
> ?? It seems that the patch handles only non-ASCII chars.

Well, that's because the code for ASCII was already there (just didn't work
right because we did PATFETCH instead of PATFETCH_RAW).


	Stefan
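
To make the ranges discussed above concrete, here is a minimal sketch in Python (not the C code in regex.c, and not the patch itself). It assumes that case folding amounts to lowercasing each ASCII character in the range; the helper names `folded_chars`, `contiguous_runs`, `covering_range` and `extra_chars` are hypothetical and exist only for this illustration.

```python
def folded_chars(lo, hi):
    """Characters the range lo-hi should match once case is folded.

    Assumption: folding means mapping every character to lowercase,
    and only ASCII is considered.
    """
    return {chr(code).lower() for code in range(ord(lo), ord(hi) + 1)}


def contiguous_runs(chars):
    """Collapse a set of characters into sorted (first, last) runs."""
    runs = []
    for code in sorted(ord(c) for c in chars):
        if runs and code == runs[-1][1] + 1:
            runs[-1][1] = code          # extend the current run
        else:
            runs.append([code, code])   # start a new run
    return [(chr(first), chr(last)) for first, last in runs]


def covering_range(chars):
    """Smallest single contiguous range that contains every character."""
    return min(chars), max(chars)


def extra_chars(chars):
    """Characters the covering range matches that the exact set does not."""
    lo, hi = covering_range(chars)
    full = {chr(code) for code in range(ord(lo), ord(hi) + 1)}
    return sorted(full - chars)


wanted = folded_chars("A", "_")
print(contiguous_runs(wanted))  # the exact set, i.e. [a-z[-_]
print(covering_range(wanted))   # the single covering range, i.e. [[-z]
print(extra_chars(wanted))      # what the covering range matches in addition
```

For "[A-_]" the exact set splits into the two runs [-_ and a-z, while the single covering range runs from [ to z and picks up only the backquote in addition. A range starting at _ would miss [ \ ] and ^, since those sort before _.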