From mboxrd@z Thu Jan 1 00:00:00 1970
From: Elias Oltmanns
Newsgroups: gmane.emacs.devel
Subject: Re: New buffer-case-table makes search_buffer painfully slow
Date: Fri, 12 May 2006 16:16:20 +0200
Message-ID: <87hd3vs4zv.fsf@denkblock.local>
References: <87y7xhq4wy.fsf@denkblock.local> <87fyjnkm0f.fsf@denkblock.local>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Original-To: emacs-devel@gnu.org
User-Agent: Gnus/5.110004 (No Gnus v0.4)
List-Id: "Emacs development discussions."
Xref: news.gmane.org gmane.emacs.devel:54337

Richard Stallman wrote:

> Emacs 22's EQUIVALENCES table relates i, and thus I as well, to
> two more characters with character codes 331857 and 331856. On
> www.unicode.org the character look up engine couldn't find a
> match for U+51051 or U+51050 saying that most likely those codes
> weren't assigned to any characters yet.
>
> I think this has to do with the special characters for Turkish,
> lower-case i without dot and upper-case I with dot. In Turkish,
> upcasing and downcasing preserve the dot, or the absence of the dot.
>
> I think these lines in characters.el are the cause of the problem.
>
> (set-downcase-syntax ?? ?i tbl)
> (set-upcase-syntax ?I ?? tbl)
>
> They set up only half of what Turkish needs. They make dotless-i
> upcase into I, and they make I-with-dot downcase into i. They can't
> do vice versa because that would break things for other languages.
> So they are not really useful. We could simply delete them.
>
> We could also add a minor mode to set up the case table all the way
> for Turkish.

Now that I think about it, I'm not quite sure I understand what
exactly you have in mind with the minor mode option.
Unfortunately, I don't know anything about Turkish at all, but I'd
imagine that while you're editing pure Turkish text, you'd like two
matching pairs: dotted i with dotted I, and dotless i with dotless I.
That way up- and downcasing work properly, and a case-insensitive
search for an i would not match the dotless versions, as expected, I
suppose.

If you're editing mixed text, for instance Turkish and English, the
current behaviour with i matching all four characters might be more
convenient; the same applies if you switch between Turkish and other
languages rather frequently.

The third option, which from my very biased point of view should be
the default, is that ASCII i should only match its ASCII upcase
counterpart.

How would you accommodate all these needs?

Regards,

Elias
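
The three matching behaviours described in this message (Turkish pairing, the current behaviour where i matches all four characters, and ASCII-only matching) can be pictured as different sets of equivalence classes. The following is a toy model in plain Python, not Emacs code. The policy names and the helper functions `equivalents` and `search` are hypothetical and exist only for this illustration.

```python
# Toy model of the three matching policies discussed in this message.
# Plain Python, not Emacs code; all names here are hypothetical.

DOTTED_CAPITAL_I = "\u0130"   # capital I with dot above
DOTLESS_SMALL_I = "\u0131"    # small i without dot

# Each policy is a list of equivalence classes: characters in the same
# class match each other in a case-insensitive search.
POLICIES = {
    # Pure Turkish text: dotted pairs with dotted, dotless with dotless.
    "turkish": [{"i", DOTTED_CAPITAL_I}, {DOTLESS_SMALL_I, "I"}],
    # Behaviour reported in the thread: i matches all four characters.
    "merged": [{"i", "I", DOTTED_CAPITAL_I, DOTLESS_SMALL_I}],
    # ASCII i matches only its ASCII upcase counterpart.
    "ascii": [{"i", "I"}, {DOTTED_CAPITAL_I}, {DOTLESS_SMALL_I}],
}


def equivalents(policy, char):
    """Return the set of characters that char matches under policy."""
    for cls in POLICIES[policy]:
        if char in cls:
            return set(cls)
    return {char}  # characters outside the tables match only themselves


def search(policy, needle, text):
    """Return the indices in text where the single character needle matches."""
    eq = equivalents(policy, needle)
    return [pos for pos, ch in enumerate(text) if ch in eq]


if __name__ == "__main__":
    sample = "iI" + DOTTED_CAPITAL_I + DOTLESS_SMALL_I
    for name in POLICIES:
        print(name, search(name, "i", sample))
```

In this model, switching policy amounts to swapping one table for another, which is roughly what a per-buffer minor mode would have to do with the case table. How Emacs itself should expose that choice remains the open question asked above.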