From: "Eli Zaretskii"
Newsgroups: gmane.emacs.devel
Subject: Re: regex and case-fold-search problem
Date: Sat, 31 Aug 2002 09:14:21 +0300
Message-ID: <1659-Sat31Aug2002091421+0300-eliz@is.elta.co.il>
References: <200208230625.PAA23426@etlken.m17n.org>
	<200208262151.g7QLpfA12782@wijiji.santafe.edu>
	<200208290853.RAA03185@etlken.m17n.org>
Original-To: rms@gnu.org
Cc: emacs-devel@gnu.org
In-Reply-To: (message from Richard Stallman on Fri, 30 Aug 2002 15:19:14 -0400)

> From: Richard Stallman
> Date: Fri, 30 Aug 2002 15:19:14 -0400
>
>     I think we all know that is the right behaviour, and at
>     least for ASCII, the latest code works as that.  Perhaps, we
>     should make Emacs work correctly also for Latin-1 chars,
>     because in emacs-unicode also, they have the same code
>     order.
>
> What about for Latin-2 characters?  Will those regexp ranges
> change their meaning in emacs-unicode?

Yes.  Latin-2 characters are ordered differently in Unicode than in
ISO 8859-2.  Those characters which are common to Latin-2 and Latin-1
are in the same order, but those which aren't are placed elsewhere.
The same goes for all the other Latin-N character sets where N != 1.

We could have some code to map a range specified by a Lisp program
into ranges of internal character codepoints (in Unicode Emacs, the
latter would be Unicode codepoints).  We could make this code depend
on some user variable that states the external ordering meant by the
application.  For example, Cyrillic users could tell Emacs that [A-Z]
was intended to work as in KOI8-R or as in ISO 8859-5.

Would something like that work?
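
To make the idea concrete, here is a rough sketch of such a mapping, written in Python rather than Emacs Lisp only because its standard codecs already carry the KOI8-R and ISO 8859-5 tables.  The names `expand_range` and `to_ranges` are hypothetical, the `charset` argument stands in for the proposed user variable, and the sketch assumes a single-byte external charset.  It is an illustration of the approach, not a proposal for the actual implementation.

```python
def expand_range(first, last, charset):
    """Return the sorted Unicode code points covered by the range
    first..last when the range is read in the byte order of `charset`.

    Hypothetical helper: handles single-byte charsets only.
    """
    lo = first.encode(charset)
    hi = last.encode(charset)
    if len(lo) != 1 or len(hi) != 1:
        raise ValueError("this sketch handles single-byte charsets only")
    points = []
    for byte in range(lo[0], hi[0] + 1):
        try:
            ch = bytes([byte]).decode(charset)
        except UnicodeDecodeError:
            continue  # byte is unassigned in this charset
        points.append(ord(ch))
    return sorted(points)


def to_ranges(points):
    """Collapse sorted code points into inclusive (start, end) ranges."""
    ranges = []
    for p in points:
        if ranges and p == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], p)
        else:
            ranges.append((p, p))
    return ranges


# Cyrillic capital A (U+0410) through capital YA (U+042F).
A, YA = "\u0410", "\u042f"

# In ISO 8859-5 order the range is one contiguous Unicode block.
print(to_ranges(expand_range(A, YA, "iso8859_5")))

# In KOI8-R order the same endpoints cover fewer letters, and they
# are scattered over several separate Unicode ranges.
print(to_ranges(expand_range(A, YA, "koi8_r")))
```

With ISO 8859-5 as the external ordering, the range covers all 32 capital letters as a single Unicode range.  With KOI8-R it covers only the 17 letters that lie between those two bytes, split into five Unicode ranges, so the regexp compiler would have to accept a set of ranges rather than a single one.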