From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: idn.el and confusables.txt
Date: Sat, 14 May 2011 11:06:52 +0300
Message-ID: <83iptdg0yr.fsf@gnu.org>
References: <878vv7imqp.fsf@lifelogs.com> <87k4erh6q3.fsf@lifelogs.com>
 <874o5uie42.fsf@lifelogs.com> <87y635dll9.fsf@lifelogs.com>
 <87r58vbj7o.fsf@lifelogs.com> <87fwpba03q.fsf@lifelogs.com>
 <874o5rqr5z.fsf@lifelogs.com> <87mxjjpal4.fsf@lifelogs.com>
 <87vcy6nzan.fsf@lifelogs.com> <87tydl4sjj.fsf_-_@lifelogs.com>
 <87r58pghh7.fsf_-_@lifelogs.com>
To: Stefan Monnier
Cc: tzz@lifelogs.com, emacs-devel@gnu.org

> From: Stefan Monnier
> Date: Fri, 13 May 2011 16:42:57 -0300
> Cc: emacs-devel@gnu.org
>
> > IMHO idn.el and confusables.txt should go into the Emacs trunk so they
> > can generate first-class character properties for `C-u x ='.  Stefan,
> > Chong, what do you think?
>
> I don't know enough about the way we handle Unicode tables to know.

We create char-tables from them.  But I'm not sure I understand the
question, so maybe my answer is not helpful.

> It does sound like confusables.txt could be turned into
> a lisp/international/uni-confusables.el, but I don't know whether there
> is a large benefit from having it part of Emacs as opposed to having it
> in GNU ELPA.  As for idn.el, I haven't seen the file, and don't know
> what uses it, so I can't judge.

What is idn.el?  Where can I see it?  And how and where would we like
to use it?  I searched the relevant threads (which were all spin-offs
of other threads, which didn't help searching for the info), but
didn't find any pointers.  Apologies if I missed something.

You see, the uni-*.el files we create out of the Unicode DB are not
used anywhere in application code, AFAIK.  We use them to display
character properties in the likes of "C-u C-x =", and that's it.  I'm
not even sure they are organized in a way that makes them useful.
E.g., when I needed to use the Unicode bidirectional properties for
bidi reordering, I eventually was forced to create my own tables (see
src/biditype.h and src/bidimirror.h, and the corresponding Awk scripts
in admin/unidata/) which lend themselves well to using them in
real-life code.

So I'd really like to avoid introducing yet another huge table whose
only effects are to show one more property in "C-u C-x =" and bloat
the ELisp manual some more.

Can we please have some preliminary ideas and design for using the
"confusables" information and the IDNA protocol in Emacs, before we
decide whether and how to include them?
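
To make the "how would we use it" question concrete, here is a rough
sketch, in Python rather than Emacs Lisp, of one possible use of the
confusables data: map each character to its prototype and compare the
resulting "skeletons" of two strings.  Nothing here comes from idn.el
or from Emacs; the function names (parse_confusables, skeleton,
confusable) and the three sample lines are made up for illustration,
and the sample only imitates the "source ; target ; type # comment"
layout of confusables.txt.

```python
import unicodedata

# Hypothetical sample data imitating the confusables.txt layout:
#   source code point ; target code point(s) ; type # comment
SAMPLE = """\
# comment-only lines and blank lines are skipped
0441 ;	0063 ;	MA	# CYRILLIC SMALL LETTER ES -> LATIN SMALL LETTER C
0430 ;	0061 ;	MA	# CYRILLIC SMALL LETTER A -> LATIN SMALL LETTER A
0435 ;	0065 ;	MA	# CYRILLIC SMALL LETTER IE -> LATIN SMALL LETTER E
"""


def parse_confusables(text):
    """Return a dict mapping each source character to its prototype string."""
    table = {}
    for line in text.splitlines():
        # Drop the trailing comment, then skip lines with no data left.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) < 2:
            continue
        source = "".join(chr(int(cp, 16)) for cp in fields[0].split())
        target = "".join(chr(int(cp, 16)) for cp in fields[1].split())
        table[source] = target
    return table


def skeleton(string, table):
    """Normalize to NFD, replace each character by its prototype, renormalize."""
    decomposed = unicodedata.normalize("NFD", string)
    mapped = "".join(table.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFD", mapped)


def confusable(a, b, table):
    """True if A and B are different strings with identical skeletons."""
    return a != b and skeleton(a, table) == skeleton(b, table)
```

With a table built from the sample, confusable("\u0441\u0430\u0435",
"cae", table) compares an all-Cyrillic string with its Latin
look-alike.  A lookup of this shape is what application code would
need; whether a char-table in uni-confusables.el could serve it is
exactly the design question.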