From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: native compilation units
Date: Sat, 04 Jun 2022 10:32:14 -0400
To: Lynn Winebarger
Cc: Andrea Corallo, emacs-devel@gnu.org
In-Reply-To: (Lynn Winebarger's message of "Fri, 3 Jun 2022 22:43:33 -0400")

>> Performance issues with read access to directories containing less than
>> 10K files seems like something that was solved last century, so
>> I wouldn't worry very much about it.

> Per my response to Eli, I see (network) directories become almost unusable
> somewhere around 1000 files,

I don't doubt there are still (in the current century) cases where
largish directories get slow, but what I meant is that it's now
considered as a problem that should be solved by making those
directories fast rather than by avoiding making them so large.

>> [ But that doesn't mean we shouldn't try to compile several ELisp files
>>   into a single ELN file, especially since the size of ELN files seems
>>   to be proportionally larger for small ELisp files than for large
>>   ones.  ]
>
> Since I learned of the native compiler in 28.1, I decided to try it out and
> also "throw the spaghetti at the wall" with a bunch of packages that
> provide features similar to those found in more "modern" IDEs.  In terms of
> startup time, the normal package system does not deal well with hundreds of
> directories on the load path, regardless of AOR native compilation, so I'm
> transforming the packages to install in the version-specific load path, and
> compiling that ahead of time.  At least for the ones amenable to such
> treatment.

There are two load-paths at play (`load-path` and
`native-comp-eln-load-path`) and I'm not sure which one you're talking
about.  OT1H `native-comp-eln-load-path` should not grow with the number
of packages so it typically contains exactly 2 entries, and definitely
not hundreds.  OTOH `load-path` is unrelated to native compilation.

I also don't understand what you mean by "version-specific load path".

Also, what kind of startup time are you talking about?
E.g., are you using `package-quickstart`?

> Given I'm compiling all the files AOT for use in a common installation
> (this is on Linux, not Windows), the natural question for me is whether
> larger compilation units would be more efficient, particularly at startup.

It all depends where the slowdown comes from :-)

E.g.
`package-quickstart` follows a similar idea to the one you propose by
collecting all the `-autoloads.el` into one big file, which saves us
from having to load separately all those little files.  It also saves us
from having to look for them through those hundreds of directories.

I suspect a long `load-path` can itself be a source of slowdown,
especially during startup, but I haven't bumped into that yet.
There are ways we could speed it up, if needed:

- create "meta packages" (or just one containing all your packages),
  which would bring together in a single directory the files of several
  packages (and presumably also bring together their
  `-autoloads.el` into a larger combined one).  Under GNU/Linux we could
  have this metapackage be made of symlinks, making it fairly efficient
  and non-obtrusive (e.g. `C-h o` could still get you to the actual file
  rather than its metapackage-copy).
- Manage a cache of where our ELisp files are (i.e. a hash table
  mapping relative ELisp file names to the absolute file name returned
  by looking for them in `load-path`).  This way we can usually avoid
  scanning those hundreds of directories to find the .elc file we need,
  and go straight to it.

> I posed the question to the list mostly to see if the approach (or similar)
> had already been tested for viability or effectiveness, so I can avoid
> unnecessary experimentation if the answer is already well-understood.

I don't think it has been tried, no.

> I don't know enough about modern library loading to know whether you'd
> expect N distinct but interdependent dynamic libraries to be loaded in as
> compact a memory region as a single dynamic library formed from the same
> underlying object code.

I think you're right here, but I'd expect the effect to be fairly small
except when the .elc/.eln files are themselves small.

> It's not clear to me whether those points are limited to call
> sites or not.

I believe it is: the optimization is to replace a call via `Ffuncall` to
a "symbol" (which looks up the value stored in the `symbol-function`
cell), with a direct call to the actual C function contained in the
"subr" object itself (expected to be) contained in the
`symbol-function` cell.

Andrea would know if there are other semantic-non-preserving
optimizations in the level 3 of the optimizations, but IIUC this is very
much the main one.

>> IIUC the current native-compiler will actually leave those
>> locally-defined functions in their byte-code form :-(
> That's not what I understood from
> https://akrl.sdf.org/gccemacs.html#org0f21a5b
> As you deduce below, I come from a Scheme background - cl-flet is the form
> I should have referenced, not let.

Indeed you're right that those functions can be native compiled, tho only if
they're closed (i.e. if they don't refer to surrounding lexical
variables).
[ I always forget that little detail :-( ]

> It seems more straightforward than trying to link the eln
> files into larger units after compilation.

That seems like too much trouble, indeed.


        Stefan
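
[ To make the `load-path` cache idea above a bit more concrete, here is
  a minimal sketch.  The names `my-load-path-cache` and
  `my-locate-library-cached` are hypothetical (nothing in Emacs is
  called that), and the sketch assumes `load-path` is not modified after
  an entry has been cached: it only notices when a cached file has
  disappeared, not when a new directory would shadow it.  ]

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical cache: library name (e.g. "subr-x") -> absolute file
;; name previously found by scanning `load-path'.
(defvar my-load-path-cache (make-hash-table :test #'equal)
  "Map library names to the absolute file name found in `load-path'.")

(defun my-locate-library-cached (library)
  "Return the absolute file name of LIBRARY, consulting the cache first.
On a miss (or if the cached file no longer exists), fall back to
`locate-library', which scans `load-path', and remember the result."
  (let ((hit (gethash library my-load-path-cache)))
    (if (and hit (file-exists-p hit))
        ;; One stat call instead of a scan over every directory.
        hit
      (let ((file (locate-library library)))
        (when file
          (puthash library file my-load-path-cache))
        file))))
```

Calling `(my-locate-library-cached "subr-x")` twice scans `load-path`
only the first time.  A real implementation would also have to decide
when to invalidate entries (e.g. when `load-path` changes) and whether
to persist the table across sessions.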