From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Mark Oteiza Newsgroups: gmane.emacs.bugs Subject: bug#28302: 26.0.50; [PATCH] Make ucs-names a hash table Date: Thu, 31 Aug 2017 01:04:15 -0400 Message-ID: <87r2vsqqeo.fsf@holos> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: text/plain X-Trace: blaine.gmane.org 1504155938 30472 195.159.176.226 (31 Aug 2017 05:05:38 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Thu, 31 Aug 2017 05:05:38 +0000 (UTC) To: 28302@debbugs.gnu.org Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Thu Aug 31 07:05:27 2017 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dnHf8-0006Uk-Nl for geb-bug-gnu-emacs@m.gmane.org; Thu, 31 Aug 2017 07:05:06 +0200 Original-Received: from localhost ([::1]:53939 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dnHfF-0005GR-Ms for geb-bug-gnu-emacs@m.gmane.org; Thu, 31 Aug 2017 01:05:13 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:37746) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dnHf7-0005Cw-Ia for bug-gnu-emacs@gnu.org; Thu, 31 Aug 2017 01:05:06 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dnHf4-0005CJ-EA for bug-gnu-emacs@gnu.org; Thu, 31 Aug 2017 01:05:05 -0400 Original-Received: from debbugs.gnu.org ([208.118.235.43]:55371) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1dnHf4-0005CD-B1 for bug-gnu-emacs@gnu.org; Thu, 31 Aug 2017 01:05:02 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1dnHf4-0008OT-31 for bug-gnu-emacs@gnu.org; Thu, 31 Aug 2017 01:05:02 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Mark Oteiza Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Thu, 31 Aug 2017 05:05:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 28302 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Original-Received: via spool by submit@debbugs.gnu.org id=B.150415587132222 (code B ref -1); Thu, 31 Aug 2017 05:05:01 +0000 Original-Received: (at submit) by debbugs.gnu.org; 31 Aug 2017 05:04:31 +0000 Original-Received: from localhost ([127.0.0.1]:35819 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dnHeZ-0008Nd-9G for submit@debbugs.gnu.org; Thu, 31 Aug 2017 01:04:31 -0400 Original-Received: from eggs.gnu.org ([208.118.235.92]:46734) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dnHeX-0008NO-2M for submit@debbugs.gnu.org; Thu, 31 Aug 2017 01:04:29 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dnHeQ-00055v-7h for submit@debbugs.gnu.org; Thu, 31 Aug 2017 01:04:23 -0400 Original-Received: from lists.gnu.org ([2001:4830:134:3::11]:53014) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1dnHeQ-00055r-3v for submit@debbugs.gnu.org; Thu, 31 Aug 2017 01:04:22 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:37705) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dnHeO-0004ia-To for bug-gnu-emacs@gnu.org; Thu, 31 Aug 2017 01:04:21 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dnHeL-00055C-QD for bug-gnu-emacs@gnu.org; Thu, 31 Aug 2017 01:04:20 -0400 Original-Received: from mail-qt0-x22a.google.com ([2607:f8b0:400d:c0d::22a]:37047) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1dnHeL-000554-Lg for bug-gnu-emacs@gnu.org; Thu, 31 Aug 2017 01:04:17 -0400 Original-Received: by mail-qt0-x22a.google.com with SMTP id h15so20402542qta.4 for ; Wed, 30 Aug 2017 22:04:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=udel-edu.20150623.gappssmtp.com; s=20150623; h=from:to:subject:date:message-id:mime-version; bh=A+NTuddJ8/QEZw9oGXnGLch7vZkNSQ6MIhm6slrrfZg=; b=XJZ5c5xXkGuDqYNZfk03I+IBz4qk5f8HNehxYcsdXz8q4elWjyQFUG8W5bcoXbOalJ MMmCmBH4TcCWOIkWt26l5RyO5ugqfz86P7sxdTK/XR32SYVl9gOxS85JIgtBUeURdPBG eNG1bo181uiGaxceTT6qV9+ega8AMYME4kJl84K2+JNROXP5aFcpJ1VvjIkNPqFBOX29 29afFgP9p6KGIvrXu8uuBsHdJeFcgeL46Vp0omQOJwT8LVz8umTlITiJFVWZCpO/pFao hB1KAYxXPEyUkU0v1wZVxMxKOlA8kFknIO9U35vDjMxggytlLr8qP9enzyJ2IzbYPNah ziww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:mime-version; bh=A+NTuddJ8/QEZw9oGXnGLch7vZkNSQ6MIhm6slrrfZg=; b=SQrJUqPUdrkJAGhvx/W8OEmPUDqiESmPTvHuaZpPYQKkv1qCfFZNG85fETvZSK7vlK rzhk05H81Ajtlf1g9z9P2kseyfU/sDPxFfHI6npmM+zAs14vfBKZfzBIUzKg6wh35s6U JlhLfNR/5OG91Ybkq9KkvDGFhNQbiEQywqXv2X0kBn4TfkCJlsst1TtdYe1pL0ANx4Re qmonh0CZVxrNe9KX9MmE4lZjVcNiELne30Aq1qItX2llLxLEzx1zivZEzq4olOu72eUt 9Fj9+wsDN/eZetdfg/UQhzNifmFsKtumswIGVxyB0kaVXfzURBgxlytGr/lb22+DVw5b Dwww== X-Gm-Message-State: AHYfb5hNvcw6Cc1pgEkEyp51UiYYJDcNWC/3unzHcLLN1eN7JsRyYKsW Wkp0C4d5JZwwp1OYM80JZg== X-Received: by 10.200.41.15 with SMTP id y15mr5248921qty.82.1504155856681; Wed, 30 Aug 2017 22:04:16 -0700 (PDT) Original-Received: from holos.localdomain (pool-173-64-88-95.bltmmd.fios.verizon.net. [173.64.88.95]) by smtp.gmail.com with ESMTPSA id g54sm5023284qtk.60.2017.08.30.22.04.15 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 Aug 2017 22:04:15 -0700 (PDT) Original-Received: by holos.localdomain (Postfix, from userid 1000) id 2FEC96877C; Thu, 31 Aug 2017 01:04:15 -0400 (EDT) X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 208.118.235.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.org gmane.emacs.bugs:136401 Archived-At: Hi, I seem to remember there having been complaints about ucs-names preview being slow. I was curious about how much of that time was spent assoc'ing every element of a roughly n = 42k element long alist, and so tried making it a hash table instead. The result is a drastic speedup of C-x 8 RET TAB, presumably this makes the operation O(n) vs O(n^2). diff --git a/lisp/international/mule-cmds.el b/lisp/international/mule-cmds.el index 338ca6a6e3..8c5fcf319b 100644 --- a/lisp/international/mule-cmds.el +++ b/lisp/international/mule-cmds.el @@ -2923,10 +2923,10 @@ nonascii-translation-table (make-obsolete-variable 'nonascii-translation-table "do not use it." "23.1") (defvar ucs-names nil - "Alist of cached (CHAR-NAME . CHAR-CODE) pairs.") + "Hash table of cached (CHAR-NAME . CHAR-CODE) pairs.") (defun ucs-names () - "Return alist of (CHAR-NAME . CHAR-CODE) pairs cached in `ucs-names'." + "Return hash table of (CHAR-NAME . CHAR-CODE) pairs cached in `ucs-names'." (or ucs-names (let ((ranges '((#x0000 . #x33FF) @@ -2954,7 +2954,7 @@ ucs-names ;; (#x20000 . #xDFFFF) CJK Ideograph Extension A, B, etc, unused (#xE0000 . #xE01FF))) (gc-cons-threshold 10000000) - names) + (names (make-hash-table :size 42943 :test #'equal))) (dolist (range ranges) (let ((c (car range)) (end (cdr range))) @@ -2965,27 +2965,28 @@ ucs-names ;; shadows a "new-name" but in practice every time an ;; `old-name' conflicts with a `new-name', the newer one has a ;; higher code, so it gets pushed later! - (if new-name (push (cons new-name c) names)) - (if old-name (push (cons old-name c) names)) + (if new-name (puthash new-name c names)) + (if old-name (puthash old-name c names)) (setq c (1+ c)))))) ;; Special case for "BELL" which is apparently the only char which ;; doesn't have a new name and whose old-name is shadowed by a newer ;; char with that name. - (setq ucs-names `(("BELL (BEL)" . 7) ,@names))))) + (puthash "BELL (BEL)" ?\a names) + (setq ucs-names names)))) (defun mule--ucs-names-annotation (name) ;; FIXME: It would be much better to add this annotation before rather than ;; after the char name, so the annotations are aligned. ;; FIXME: The default behavior of displaying annotations in italics ;; doesn't work well here. - (let ((char (assoc name ucs-names))) - (when char (format " (%c)" (cdr char))))) + (let ((char (gethash name ucs-names))) + (when char (format " (%c)" char)))) (defun char-from-name (string &optional ignore-case) "Return a character as a number from its Unicode name STRING. If optional IGNORE-CASE is non-nil, ignore case in STRING. Return nil if STRING does not name a character." - (or (cdr (assoc-string string (ucs-names) ignore-case)) + (or (gethash (if ignore-case (upcase string) string) (ucs-names)) (let ((minus (string-match-p "-[0-9A-F]+\\'" string))) (when minus ;; Parse names like "VARIATION SELECTOR-17" and "CJK