From: Juri Linkov
Newsgroups: gmane.emacs.devel
Subject: Re: inputting characters by hexadigit
Date: Sun, 20 Jul 2008 03:29:14 +0300
Organization: JURTA
Message-ID: <87sku5if8t.fsf_-_@jurta.org>
To: Kenichi Handa
Cc: tzz@lifelogs.com, emacs-devel@gnu.org

>> + (defun read-char-by-name (prompt)
>> +   "Read a character by its Unicode name or hex number string.
>> + Display PROMPT and read a string that represent a character
>> + by its Unicode property `name' or `old-name'.  It also accepts
>> + a hexadecimal number of Unicode code point.  Returns a character
>> + as a number."
>> +   (let (name names)
>> +     (dotimes (c #x10FFFF)
>> +       (if (setq name (get-char-code-property c 'name))
>> +           (setq names (cons (cons name c) names)))
>> +       (if (setq name (get-char-code-property c 'old-name))
>> +           (setq names (cons (cons name c) names))))
>> +     (or (cdr (assoc (setq name (completing-read prompt names)) names))
>> +         (string-to-number name 16))))
>> +
>
> I think it is better to skip these ranges:
>   #x3400..#x4dbf -- CJK Ideograph Extension A
>   #x4e00..#x9fff -- CJK Ideograph
>   #xd800..#xfaFF -- surrogate-pair, private use, CJK COMPATIBILITY IDEOGRAPH
>   #x20000..#x2ffff -- CJK Ideograph Extension B
> and end the loop at #xeffff (#xf0000.. are for private use)

Actually, there are no Unicode names in these ranges in UnicodeData.txt.
It has lines only for the first and the last character of each range:

3400;;Lo;0;L;;;;;N;;;;;
4DB5;;Lo;0;L;;;;;N;;;;;
4E00;;Lo;0;L;;;;;N;;;;;
9FC3;;Lo;0;L;;;;;N;;;;;
D800;;Cs;0;L;;;;;N;;;;;
DB7F;;Cs;0;L;;;;;N;;;;;
DB80;;Cs;0;L;;;;;N;;;;;
DBFF;;Cs;0;L;;;;;N;;;;;
DC00;;Cs;0;L;;;;;N;;;;;
DFFF;;Cs;0;L;;;;;N;;;;;
E000;;Co;0;L;;;;;N;;;;;
F8FF;;Co;0;L;;;;;N;;;;;
20000;;Lo;0;L;;;;;N;;;;;
2A6D6;;Lo;0;L;;;;;N;;;;;
F0000;;Co;0;L;;;;;N;;;;;
FFFFD;;Co;0;L;;;;;N;;;;;
100000;;Co;0;L;;;;;N;;;;;
10FFFD;;Co;0;L;;;;;N;;;;;

If it were possible to loop over the names themselves, instead of looping
over all characters and checking each one for a name, this code would be
faster, but I don't see how to loop over all the names defined in
UnicodeData.txt.  If that is not possible, we could optimize the loop over
all characters in the char-table to skip these useless ranges.

-- 
Juri Linkov
http://www.jurta.org/emacs/
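
For illustration, the range-skipping loop discussed above might look roughly like the following.  This is only a sketch under stated assumptions, not the code that was proposed or installed: `my-unicode-skip-ranges' and `my-unicode-name-alist' are hypothetical names, the ranges are simply the ones suggested in the quoted text (with the end of the loop at #xeffff expressed as one more skipped range), and the optional FROM and TO arguments exist only so that a small part of the table can be built and inspected.

```elisp
;; Sketch only: both names below are hypothetical, not part of the
;; quoted patch.  The ranges are those suggested in the quoted text.
(defvar my-unicode-skip-ranges
  '((#x3400 . #x4DBF)     ; CJK Ideograph Extension A
    (#x4E00 . #x9FFF)     ; CJK Ideograph
    (#xD800 . #xFAFF)     ; surrogates, private use, CJK compatibility
    (#x20000 . #x2FFFF)   ; CJK Ideograph Extension B
    (#xF0000 . #x10FFFF)) ; planes 15 and 16, private use
  "Ascending, non-overlapping code point ranges to skip.")

(defun my-unicode-name-alist (&optional from to)
  "Return an alist of (NAME . CHAR) for code points FROM..TO inclusive.
FROM defaults to 0 and TO to #x10FFFF.  Ranges listed in
`my-unicode-skip-ranges' are jumped over without any property lookup."
  (let ((c (or from 0))
        (end (or to #x10FFFF))
        (skip my-unicode-skip-ranges)
        names name)
    (while (<= c end)
      ;; Drop ranges that lie entirely below C.
      (while (and skip (> c (cdr (car skip))))
        (setq skip (cdr skip)))
      (if (and skip (>= c (car (car skip))))
          ;; C is inside a skipped range: jump just past its end.
          (setq c (1+ (cdr (car skip))))
        ;; Otherwise record both the current and the old name, if any.
        (dolist (prop '(name old-name))
          (setq name (get-char-code-property c prop))
          (if (and (stringp name) (> (length name) 0))
              (push (cons name c) names)))
        (setq c (1+ c))))
    names))
```

The resulting alist could be passed to `completing-read' in the same way as `names' in the quoted function.  Whether this saves a noticeable amount of time depends on how expensive `get-char-code-property' is for code points that have no name.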