From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: OKURI-NASI
Date: Mon, 30 May 2022 09:06:48 -0400
To: Lars Ingebrigtsen
Cc: Stefan Kangas, Emacs developers
In-Reply-To: <87ee0bxs6f.fsf@gnus.org> (Lars Ingebrigtsen's message of "Mon, 30 May 2022 11:53:28 +0200")

> Doing some very light profiling here, a lot of the time is taken up by
> skkdic-get-entry, which is just lookup-nested-alist.

Odd: `skkdic-get-entry` didn't even appear in the profile I got (and
`lookup-nested-alist` was dwarfed by other things):

    5412  79% - normal-top-level
    5346  78%  - command-line
    5345  78%   - command-line-1
    5343  78%    - skkdic-convert
    2770  40%     - skkdic-convert-okuri-nasi
    2751  40%      - skkdic-reduced-candidates
    2699  39%       - skkdic-breakup-string
     707  10%        - skkdic-breakup-string
       7   0%         - skkdic-breakup-string
       2   0%          - skkdic-breakup-string
       1   0%             skkdic-breakup-string
    1915  28%     - skkdic-collect-okuri-nasi
      81   1%        skkdic-get-candidate-list
       3   0%        lookup-nested-alist
      47   0%       skkdic-convert-okuri-ari
      36   0%       skkdic-convert-prefix
      34   0%     - save-buffer
      34   0%      - basic-save-buffer
      33   0%       - basic-save-buffer-1
      33   0%        - basic-save-buffer-2
      26   0%         - write-region
      26   0%          - select-safe-coding-system
       1   0%             find-auto-coding
       1   0%           - find-coding-systems-region
       1   0%            - sort-coding-systems
       1   0%               #
       1   0%         - vc-before-save
       1   0%          - vc-backend
       1   0%           - vc-registered
       1   0%            - mapc
       1   0%             - #
       1   0%              - vc-call-backend
       1   0%               - vc-svn-registered
       1   0%                - let
       1   0%                 - if
       1   0%                  - vc-find-root
       1   0%                     locate-dominating-file
      31   0%     - set-visited-file-name
      26   0%      - set-auto-mode
      25   0%       - set-auto-mode--apply-alist
      25   0%        - set-auto-mode-0
      25   0%         - emacs-lisp-mode
      25   0%          - run-mode-hooks
      19   0%           - hack-local-variables
      19   0%            - hack-local-variables-apply
      19   0%             - hack-one-local-variable
      19   0%              - bug-reference-prog-mode
      18   0%               - jit-lock-register
      18   0%                  jit-lock-mode
       1   0%                 defalias
       6   0%           - run-hooks
       6   0%            - global-font-lock-mode-enable-in-buffers
       6   0%             - turn-on-font-lock-if-desired
       6   0%              - turn-on-font-lock
       6   0%               - font-lock-mode
       6   0%                - font-lock-default-function
       6   0%                 - font-lock-mode-internal
       6   0%                  - font-lock-turn-on-thing-lock
       6   0%                   - jit-lock-register
       6   0%                      jit-lock-mode
       1   0%         hack-local-variables
       5   0%      - hack-local-variables
       5   0%       - hack-local-variables-apply
       4   0%        - hack-one-local-variable
       4   0%         - bug-reference-prog-mode
       4   0%          - jit-lock-register
       4   0%             jit-lock-mode
      10   0%       skkdic-convert-postfix
      65   0%  - startup--honor-delayed-native-compilations
      61   0%   - startup--require-comp-safely
      44   0%    - byte-code
      42   0%     - require
      30   0%      - do-after-load-evaluation
      29   0%       - elisp--font-lock-flush-elisp-buffers
      29   0%          font-lock-flush
       6   0%      - byte-code
       6   0%       - require
       5   0%        - do-after-load-evaluation
       4   0%         - elisp--font-lock-flush-elisp-buffers
       4   0%            font-lock-flush
       1   0%          defalias
       6   0%      - do-after-load-evaluation
       6   0%       - elisp--font-lock-flush-elisp-buffers
       6   0%          font-lock-flush
       1   0%   - native--compile-async
       1   0%    - comp-run-async-workers
       1   0%     - write-region
       1   0%      - select-safe-coding-system
       1   0%         find-auto-coding
       1   0%    #
    1364  20% - ...
    1364  20%    Automatic GC
       8   0% + redisplay_internal (C function)
       6   0% + command-execute

Of that profile I mostly see:

    2770  40% - skkdic-convert-okuri-nasi
    1915  28% - skkdic-collect-okuri-nasi
    1364  20%   Automatic GC

and AFAICT there's not much that can be optimized in
`skkdic-collect-okuri-nasi` (assuming my profile is mostly accurate)
since it spends most of its time just converting the source file into
a usable Lisp data structure.

Also, I suspect that most of the GC time comes from the "convert" part
(based on the mem profile which shows it allocates about 70% vs 30%),
so if we factor GC time into it, it's probably more like

    3770 - skkdic-convert-okuri-nasi
    2279 - skkdic-collect-okuri-nasi

> My guess is that if somebody took a look ja-dic-cnv.el, this algorithm
> could be made substantially more efficient by using other data
> structures than an extremely long nested alist.

I believe those (nested) alists shouldn't be that long (IIUC it's
a trie-like data structure, a bit like keymaps, so even with many
entries in total, the total depth of the tree should be fairly short
and the length of each list (i.e. the out-degree of each node) shouldn't
be very large either).


        Stefan
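
To make the trie remark above concrete, here is a minimal sketch in Python of a trie whose nodes keep their children in an association list. It is only a model of the idea, not the Emacs `lookup-nested-alist` code: the names `Node`, `insert`, `lookup` and `max_out_degree` and the sample keys are all made up for illustration. It shows that the cost of a lookup is bounded by (key length) x (largest out-degree), not by the total number of entries.

```python
# Python model of a trie whose nodes store children in an association list
# (a list of (key, child) pairs), loosely mirroring a nested alist.
# Illustration only; all names here are hypothetical.

class Node:
    def __init__(self):
        self.value = None      # payload stored at this node, if any
        self.children = []     # association list: [(char, Node), ...]


def insert(root, key, value):
    """Store value under key, creating intermediate nodes as needed."""
    node = root
    for ch in key:
        for k, child in node.children:   # linear scan of this node's alist
            if k == ch:
                node = child
                break
        else:
            child = Node()
            node.children.append((ch, child))
            node = child
    node.value = value


def lookup(root, key):
    """Return (value, comparisons); value is None if key is absent."""
    node, comparisons = root, 0
    for ch in key:
        for k, child in node.children:
            comparisons += 1
            if k == ch:
                node = child
                break
        else:
            return None, comparisons
    return node.value, comparisons


def max_out_degree(node):
    """Largest number of children held by any single node."""
    return max([len(node.children)] +
               [max_out_degree(c) for _, c in node.children])


root = Node()
words = ["ab", "ac", "ad", "ba", "bb", "ca"]   # made-up sample keys
for i, w in enumerate(words):
    insert(root, w, i)

value, comparisons = lookup(root, "ad")
print(value, comparisons, max_out_degree(root))
```

With these six sample keys, no node has more than three children, so looking up a two-character key never needs more than six comparisons, however many keys are stored elsewhere in the tree.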