From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: regex and case-fold-search problem
Date: Sat, 24 Aug 2002 10:16:10 +0900 (JST)
Message-ID: <200208240116.KAA24680@etlken.m17n.org>
References: <200208230625.PAA23426@etlken.m17n.org> <200208231736.g7NHafW02174@rum.cs.yale.edu>
Cc: emacs-devel@gnu.org

In article <200208231736.g7NHafW02174@rum.cs.yale.edu>, "Stefan Monnier" writes:
> But I think that if it works with (case-fold-search nil) it should
> also work with (case-fold-search t).  The current behavior is really
> counter-intuitive.

I agree.

>> But, anyway, we have to decide what to do.
>>
>> (1) Regard the above case as a bug, and fix it completely.
>>     As we don't support a range striding over different
>>     charsets by the current Emacs, I think the fix is
>>     difficult but not that much.  But, in emacs-unicode, we
>>     can't have such a restriction, and thus the fix is very
>>     difficult.

> For ASCII it's pretty easy to fix.  But for other charsets, it's
> indeed more tricky.  Maybe we can simply use the smallest contiguous
> range of chars that includes all the chars we should match,
> so the behavior is indeed "implementation-defined" (in the sense
> that it's not necessarily obvious to the user what happens) but
> it's at least less confusing (in the sense that (case-fold-search t)
> matches at least as much as (case-fold-search nil)).
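
To make the two options concrete for an ASCII range such as "[A-_]", here is a rough sketch in Python (not the Emacs regex code). It assumes that case folding can be modelled as downcasing every member of the range, and the helper names fold_range, to_ranges and covering_span are made up for this illustration. It computes the exact set of folded characters, grouped into runs, and the smallest contiguous span that covers them.

```python
def fold_range(lo, hi):
    """Sorted code points of [lo-hi] after downcasing each member (ASCII only)."""
    return sorted({ord(chr(c).lower()) for c in range(ord(lo), ord(hi) + 1)})


def to_ranges(points):
    """Group sorted code points into inclusive (start, end) runs."""
    runs = []
    for p in points:
        if runs and p == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], p)  # extend the current run
        else:
            runs.append((p, p))          # start a new run
    return runs


def covering_span(points):
    """Smallest single contiguous range that contains every code point."""
    return (points[0], points[-1])


folded = fold_range("A", "_")
exact = [(chr(a), chr(b)) for a, b in to_ranges(folded)]
span = tuple(chr(c) for c in covering_span(folded))

print(exact)  # [('[', '_'), ('a', 'z')]
print(span)   # ('[', 'z')
```

Under this model the exact result is the two runs '[' to '_' and 'a' to 'z', while the single covering span runs from '[' to 'z' and so additionally admits only '`' (0x60). For charsets other than ASCII the downcasing step is not this simple, so the sketch says nothing about that case.
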

Ideally, the range "[A-_]" must be converted to "[a-z[-_]".
But, it seems that your idea is to convert "[A-_]" to
"[_-z]", correct?  I agree that it results in less
counter-intuitive behaviour.

> How about the patch below ?
[...]

??  It seems that the patch handles only non-ASCII chars.

---
Ken'ichi HANDA
handa@etl.go.jp