From mboxrd@z Thu Jan 1 00:00:00 1970
From: Juri Linkov
Newsgroups: gmane.emacs.devel
Subject: Re: inputting characters by hexadigit
Date: Thu, 24 Jul 2008 01:35:38 +0300
Organization: JURTA
Message-ID: <878wvsut39.fsf@jurta.org>
References: <868ww3vydn.fsf@lifelogs.com> <87myki6fqp.fsf@jurta.org>
 <87mykhz6tf.fsf@jurta.org> <87tzeokrku.fsf@jurta.org>
 <87od4wgg8p.fsf@catnip.gol.com> <86od4vmi5i.fsf@lifelogs.com>
 <873am6n21q.fsf@jurta.org> <87sku5if8t.fsf_-_@jurta.org>
 <87od4sti4g.fsf@jurta.org> <867ibcekf3.fsf@lifelogs.com>
Cc: Ted Zlatanov , emacs-devel@gnu.org
To: Stefan Monnier
In-Reply-To: (Stefan Monnier's message of "Wed, 23 Jul 2008 15:31:36 -0400")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.60 (x86_64-pc-linux-gnu)
Xref: news.gmane.org gmane.emacs.devel:101342

>> Can completions be cascaded somehow?  The first tier would show all the
>> common first words, e.g. ... AEGEAN APL GREEK ... and then selecting
>> something from the first tier would cascade down to the second tier.
>
> The slow display should only happen when the list is really long,
> i.e. basically if you hit TAB with an empty minibuffer.  So we could
> indeed easily use a different completion behavior in this case.

When I type TAB in the empty minibuffer, I really want to see all
completions even when the list is really long, to be able to use
isearch to find a completion candidate etc.  Everyone who wants to
narrow the completion list can type the first word like "latin TAB",
so I think the current completion behavior is satisfactory.

I'm now trying to improve the performance of Unicode name completion.
To cache a list of (CHAR-NAME . CHAR-CODE) pairs I created a new
variable `ucs-names' and a function that returns its value or creates
a new list (time-consuming operation).  The intention is to fill this
list only when the user tries to complete or enters a string that
doesn't look like a hex number.

I also tried to use `lazy-completion-table' but it seems slower than
giving a ready alist in the `collection' arg of `completing-read'
though I didn't make measurements.

The current patch is below.  Can you see any problems with it?

Index: lisp/international/mule-cmds.el
===================================================================
RCS file: /sources/emacs/emacs/lisp/international/mule-cmds.el,v
retrieving revision 1.333
diff -c -r1.333 mule-cmds.el
*** lisp/international/mule-cmds.el  15 Jul 2008 18:15:03 -0000  1.333
--- lisp/international/mule-cmds.el  23 Jul 2008 22:34:03 -0000
***************
*** 2846,2855 ****
  (defvar nonascii-insert-offset 0 "This variable is obsolete.")
  (defvar nonascii-translation-table nil "This variable is obsolete.")
  
  (defun ucs-insert (arg)
    "Insert a character of the given Unicode code point.
  Interactively, prompts for a hex string giving the code."
!   (interactive "sUnicode (hex): ")
    (or (integerp arg)
        (setq arg (string-to-number arg 16)))
    (if (or (< arg 0) (> arg #x10FFFF))
--- 2849,2894 ----
  (defvar nonascii-insert-offset 0 "This variable is obsolete.")
  (defvar nonascii-translation-table nil "This variable is obsolete.")
  
+ (defvar ucs-names nil
+   "Alist of cached (CHAR-NAME . CHAR-CODE) pairs.")
+ 
+ (defun ucs-names ()
+   "Return alist of (CHAR-NAME . CHAR-CODE) pairs cached in `ucs-names'."
+   (or ucs-names
+       (setq ucs-names
+             (let (name names)
+               (dotimes (c #xEFFFF)
+                 (unless (or
+                          (and (>= c #x3400 ) (<= c #x4dbf )) ; CJK Ideograph Extension A
+                          (and (>= c #x4e00 ) (<= c #x9fff )) ; CJK Ideograph
+                          (and (>= c #xd800 ) (<= c #xfaff )) ; Private/Surrogate
+                          (and (>= c #x20000) (<= c #x2ffff)) ; CJK Ideograph Extension B
+                          )
+                   (if (setq name (get-char-code-property c 'name))
+                       (setq names (cons (cons name c) names)))
+                   (if (setq name (get-char-code-property c 'old-name))
+                       (setq names (cons (cons name c) names)))))
+               names))))
+ 
+ (defvar ucs-completions (lazy-completion-table ucs-completions ucs-names)
+   "Lazy completion table for completing on Unicode character names.")
+ 
+ (defun read-char-by-name (prompt)
+   "Read a character by its Unicode name or hex number string.
+ Display PROMPT and read a string that represents a character
+ by its Unicode property `name' or `old-name'.  It also accepts
+ a hexadecimal number of Unicode code point.  Returns a character
+ as a number."
+   (let* ((completion-ignore-case t)
+          (input (completing-read prompt ucs-completions)))
+     (or (and (string-match "^[0-9a-fA-F]+$" input)
+              (string-to-number input 16))
+         (cdr (assoc input (ucs-names))))))
+ 
  (defun ucs-insert (arg)
    "Insert a character of the given Unicode code point.
  Interactively, prompts for a hex string giving the code."
!   (interactive (list (read-char-by-name "Unicode (hex or name): ")))
    (or (integerp arg)
        (setq arg (string-to-number arg 16)))
    (if (or (< arg 0) (> arg #x10FFFF))

-- 
Juri Linkov
http://www.jurta.org/emacs/