From: Juri Linkov
Newsgroups: gmane.emacs.devel
Subject: Re: faster unicode character name completion
Date: Mon, 07 Dec 2009 22:28:44 +0200
Organization: JURTA
Message-ID: <87ws0yttoj.fsf@mail.jurta.org>
References: <87einfbxdw.fsf@red-bean.com> <87fx7r68s4.fsf@stupidchicken.com>
To: Stefan Monnier
Cc: emacs-devel@gnu.org, Kenichi Handa
In-Reply-To: (Stefan Monnier's message of "Mon, 07 Dec 2009 09:57:46 -0500")

> But I have a better idea: most of the time is not spent building the
> completion table, but rather just weeding out all the "chars" that don't
> have names, or should I say, looking for the few rare chars that do
> have a name.
>
> So the patch below seems to be a good compromise: it uses up just about
> 1000K cons cells (i.e. 16KB on 64bit systems) to keep the precomputed
> set of ~34K chars that do have a name, so that building the completion
> table takes only a couple seconds.

Before this change, building the completion table took 10s; now it takes
only 2s.  The size was 88,203; it is now 91,595.  I think that's a
reasonable price for such a speedup.

BTW, a related problem: it would be better to hide old obsolete Unicode
names, so as not to advertise them, while still allowing completion on
them.  For instance, duplicate names such as

  name:     LATIN CAPITAL LETTER A WITH ACUTE
  old-name: LATIN CAPITAL LETTER A ACUTE

add too much noise.  Maybe we could use the same approach as
`completion-ignored-extensions', i.e. ignore old names, but don't ignore
them when every possible completion is an old name.

-- 
Juri Linkov
http://www.jurta.org/emacs/
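
To make the `completion-ignored-extensions'-style rule above concrete, here is a minimal sketch of the filtering logic, written in Python only for illustration.  It is not the Emacs implementation: the names `visible_completions' and `complete', and the representation of old names as a plain set, are assumptions made for this example.

```python
def visible_completions(matches, obsolete):
    """Hide obsolete names, unless every match is obsolete.

    `matches` is the list of names matching the user's input;
    `obsolete` is a set of old names (a hypothetical representation).
    """
    current = [name for name in matches if name not in obsolete]
    # If nothing but old names matched, show them rather than nothing,
    # so that completion on old names keeps working.
    return current if current else list(matches)


def complete(prefix, names, obsolete):
    """Prefix completion over `names` with obsolete names hidden."""
    matches = [name for name in names if name.startswith(prefix)]
    return visible_completions(matches, obsolete)


NAMES = [
    "LATIN CAPITAL LETTER A WITH ACUTE",
    "LATIN CAPITAL LETTER A ACUTE",
]
OLD_NAMES = {"LATIN CAPITAL LETTER A ACUTE"}

# The old name is hidden while a current name also matches...
shown = complete("LATIN CAPITAL LETTER A", NAMES, OLD_NAMES)
# ...but it is still offered once the input matches only old names.
old_only = complete("LATIN CAPITAL LETTER A AC", NAMES, OLD_NAMES)
```

With this rule the old name never shows up next to its current equivalent, yet typing enough of the old name to exclude the current one still completes.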