From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: idn.el and confusables.txt
Date: Sat, 14 May 2011 19:42:39 +0300
Message-ID: <834o4xfd34.fsf@gnu.org>
References: <874o5uie42.fsf@lifelogs.com> <87y635dll9.fsf@lifelogs.com>
 <87r58vbj7o.fsf@lifelogs.com> <87fwpba03q.fsf@lifelogs.com>
 <874o5rqr5z.fsf@lifelogs.com> <87mxjjpal4.fsf@lifelogs.com>
 <87vcy6nzan.fsf@lifelogs.com> <87tydl4sjj.fsf_-_@lifelogs.com>
 <87r58pghh7.fsf_-_@lifelogs.com> <83iptdg0yr.fsf@gnu.org>
 <87y629ien3.fsf@lifelogs.com> <83aaepfiuk.fsf@gnu.org>
 <87aaepi9k2.fsf@lifelogs.com>
In-reply-to: <87aaepi9k2.fsf@lifelogs.com>
Reply-To: Eli Zaretskii
To: Ted Zlatanov
Cc: emacs-devel@gnu.org
Xref: news.gmane.org gmane.emacs.devel:139400

> From: Ted Zlatanov
> Date: Sat, 14 May 2011 10:30:37 -0500
>
> It wouldn't be ideal, surely, but most glyphs are not confusable so the
> lookup would fail.

For some value of "most": there are 20K entries in confusables.txt.

> I might write some of it in C if performance was an issue

C won't help, if you need to access the same char-table and compare
with half a dozen possible symbols.

> or try to inline the conditions with macros, or cache the lookups.

Isn't it better to design the table for efficient use to begin with?

> But I don't know if markchars.el needs to be terribly fast.

I hope we are not introducing another character property for a single
use.  Some use, some day might need to do it fast.

> It runs at the font-lock level and IIUC that's opportunistic and not
> time-critical like the display code.  For instance, unmodified text is
> not rechecked, right?

No, you cannot count on that.
E.g., fontification-functions are always called with a region that
starts at the beginning of a line, even if part of that line is
already fontified.

> Two char-tables would be enough: one small table for the confusable ->
> target mapping, and one even smaller for the reverse target ->
> (confusable list) mapping.  The reverse lookup table could be stored in
> an extra slot of the primary lookup table.

Doesn't confusables.txt include both mappings already?  If so, you
don't need the reverse table.
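
To make the table-design question concrete, here is a minimal sketch of
a single forward table (confusable -> target) that can answer "could
these two characters be confused?" without a separate reverse table.
It is written in Python rather than Emacs Lisp, and a plain dict stands
in for a char-table.  Everything in it is an assumption made for
illustration: the three sample entries, the "source ; target ; type #
comment" line layout, and the names parse_confusables, prototype and
confusable_p are hypothetical, not taken from idn.el or markchars.el.

```python
# Hypothetical sample in a "source ; target ; type # comment" layout.
# The three entries are illustrative only, not copied from the real file.
SAMPLE = """\
0430 ; 0061 ; MA # CYRILLIC SMALL LETTER A -> LATIN SMALL LETTER A
043E ; 006F ; MA # CYRILLIC SMALL LETTER O -> LATIN SMALL LETTER O
03BF ; 006F ; MA # GREEK SMALL LETTER OMICRON -> LATIN SMALL LETTER O
"""


def parse_confusables(text):
    """Return a dict mapping each source character to its target string."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        source = chr(int(fields[0], 16))
        # A target may be a sequence of several code points.
        target = "".join(chr(int(cp, 16)) for cp in fields[1].split())
        table[source] = target
    return table


def prototype(table, ch):
    """Characters missing from the table are their own prototype."""
    return table.get(ch, ch)


def confusable_p(table, a, b):
    """Two distinct characters are confusable if they share a prototype."""
    return a != b and prototype(table, a) == prototype(table, b)


TABLE = parse_confusables(SAMPLE)
```

With this layout, a check costs two lookups in the same table: for
example, confusable_p(TABLE, "\u043e", "\u03bf") compares the
prototypes of Cyrillic o and Greek omicron.  Whether that is enough
depends on whether any caller needs the full list of confusables for a
given target; only that case would call for a reverse mapping.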