From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Case mapping of sharp s
Date: Mon, 16 Nov 2009 21:12:59 +0200
Message-ID: <83lji6mgg4.fsf@gnu.org>
References: <19200.4158.380820.761685@a1i15.kph.uni-mainz.de>
To: Kenichi Handa
Cc: ulm@gentoo.org, emacs-devel@gnu.org

> From: Kenichi Handa
> Date: Mon, 16 Nov 2009 21:06:38 +0900
> Cc: emacs-devel@gnu.org
>
> In article <19200.4158.380820.761685@a1i15.kph.uni-mainz.de>, Ulrich Mueller writes:
>
> > In Unicode since version 5.1.0 the U+1E9E code point is assigned to
> > "LATIN CAPITAL LETTER SHARP S". Would it be possible to add a mapping
> > from this to the lower case ß, as in the patch below?
>
> > However, I've noticed that similar mappings for Turkish ı (dotless i)
> > and İ (I with dot) were commented out [1]. Is it still so that such a
> > change would "make searches slow", as stated in the comment?
>
> That kind of setting surely makes the searching of ß and ẞ
> slow because we can't use BM search when case-fold-search is
> non-nil. BM search is possible only when all
> case-equivalent characters are represented by the same byte
> length, and differ only in the last byte.

I think we need to solve this limitation anyway, if we want decent
support for Unicode. There are many more pairs of characters that
should normally be considered equal in search.
Wouldn't the technique described in UTS 18 (http://www.unicode.org/reports/tr18/) help here?
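
To make the quoted BM restriction concrete, here is a minimal sketch that
tests the stated condition (same byte length, differing only in the last
byte) on a few case pairs. It is not Emacs code: it assumes that the
internal representation of these characters coincides with their UTF-8
encoding, and the helper name bm_compatible is hypothetical.

```python
def bm_compatible(chars, encoding="utf-8"):
    """Return True if all characters in `chars` encode to the same number
    of bytes and their encodings differ at most in the last byte.

    Assumption: UTF-8 stands in for the editor's internal representation.
    """
    encoded = [c.encode(encoding) for c in chars]
    if not encoded:
        return True
    first = encoded[0]
    return all(
        len(e) == len(first) and e[:-1] == first[:-1]
        for e in encoded
    )


# Case pairs mentioned in the thread, plus two that satisfy the condition.
pairs = {
    "a / A": ("a", "A"),
    "e-acute / E-acute": ("\u00e9", "\u00c9"),
    "sharp s / capital sharp s": ("\u00df", "\u1e9e"),
    "i / I with dot": ("i", "\u0130"),
    "dotless i / I": ("\u0131", "I"),
}

for label, pair in pairs.items():
    hex_bytes = [c.encode("utf-8").hex() for c in pair]
    print(f"{label}: {hex_bytes} -> {bm_compatible(pair)}")
```

Under these assumptions the sketch reports a/A and é/É as compatible and
the other three pairs as not: ß encodes to two bytes (c3 9f) while ẞ
encodes to three (e1 ba 9e), and each Turkish pair mixes a one-byte
ASCII letter with a two-byte character.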