unofficial mirror of emacs-devel@gnu.org 
From: Lynn Winebarger <owinebar@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: Andrea Corallo <akrl@sdf.org>, emacs-devel@gnu.org
Subject: Re: native compilation units
Date: Sun, 5 Jun 2022 10:08:35 -0400
Message-ID: <CAM=F=bBj=qUFg1+-UbOxifY24wVpjj87Svq7hqRoEyn0hPExHg@mail.gmail.com>
In-Reply-To: <CAM=F=bDzv-=r-zZR+ti708aeg7_iXZMnqf-o_5a-kaXB=VY-pw@mail.gmail.com>

On Sun, Jun 5, 2022 at 8:16 AM Lynn Winebarger <owinebar@gmail.com> wrote:

> On Sat, Jun 4, 2022, 10:32 AM Stefan Monnier <monnier@iro.umontreal.ca>
> wrote:
>
>> >> [ But that doesn't mean we shouldn't try to compile several ELisp files
>> >>   into a single ELN file, especially since the size of ELN files seems
>> >>   to be proportionally larger for small ELisp files than for large
>> >>   ones.  ]
>> >
>> > Since I learned of the native compiler in 28.1, I decided to try it out and
>> > also "throw the spaghetti at the wall" with a bunch of packages that
>> > provide features similar to those found in more "modern" IDEs.  In terms of
>> > startup time, the normal package system does not deal well with hundreds of
>> > directories on the load path, regardless of AOT native compilation, so I'm
>> > transforming the packages to install in the version-specific load path, and
>> > compiling that ahead of time.  At least for the ones amenable to such
>> > treatment.
>>
>> There are two load-paths at play (`load-path` and
>> `native-comp-eln-load-path`) and I'm not sure which one you're talking
>> about.  OT1H `native-comp-eln-load-path` should not grow with the number
>> of packages so it typically contains exactly 2 entries, and definitely
>> not hundreds.  OTOH `load-path` is unrelated to native compilation.
>>
>
> Not entirely - as I understand it, the load system first finds the source
> file and computes a hash before determining if there is an ELN file
> corresponding to it.
> Although I do wonder if there is some optimization for ELN files in the
> system directory as opposed to the user's cache.  I have one build where I
> native compiled (but not byte compiled) all the el files in the lisp
> directory, and another where I byte compiled and then native compiled the
> same set of files.  In both cases I used the flag to batch-native-compile
> to put the ELN file in the system cache.  In the first case a number of
> files failed to compile, and in the second, they all compiled.  I've also
> observed another situation where a file will only (byte or native) compile
> if one of its required files has been byte compiled ahead of time - but
> only native compiling that dependency resulted in the same behavior as not
> compiling it at all.  I planned to send a separate mail to the list asking
> whether it was intended behavior once I had reduced it to a simple case, or
> if it should be submitted as a bug.
>

Unrelated, but the one type of file I don't seem to be able to produce AOT
in the system directory (because I have no way to specify them) is the
subr/trampoline files.  Any hints on how to build those AOT in the system
directory?


>
>> Also, what kind of startup time are you talking about?
>> E.g., are you using `package-quickstart`?
>>
> That was the first alternative I tried.  With 1250 packages, it did not
> work.  First, the file consisted of a series of "let" forms corresponding
> to the package directories, and apparently the autoload forms are ignored
> if they appear anywhere below top-level.  At least I got a number of
> warnings to that effect.
> The other problem was that I got a "bytecode overflow error".  I only got
> the first error after chopping off the file approximately after the first
> 10k lines.  Oddly enough, when I put all the files in the site-lisp
> directory, and collect all the autoloads for that directory in a single
> file, it has no problem with the 80k line file that results.
>

Also, I should have responded to the first question: startup takes "minutes"
on recent server-grade hardware with 24 cores and >100GB of RAM.  That was
with 1193 enabled packages in my .emacs file.




On Sun, Jun 5, 2022 at 8:16 AM Lynn Winebarger <owinebar@gmail.com> wrote:

> On Sat, Jun 4, 2022, 10:32 AM Stefan Monnier <monnier@iro.umontreal.ca>
> wrote:
>
>> >> Performance issues with read access to directories containing fewer than
>> >> 10K files seem like something that was solved last century, so
>> >> I wouldn't worry very much about it.
>> > Per my response to Eli, I see (network) directories become almost unusable
>> > somewhere around 1000 files,
>>
>> I don't doubt there are still (in the current century) cases where
>> largish directories get slow, but what I meant is that it's now
>> considered as a problem that should be solved by making those
>> directories fast rather than by avoiding making them so large.
>>
> Unfortunately sometimes we have to cope with the environment we use.  And for
> all I know some of the performance penalties may be inherent in the
> (security related) infrastructure requirements in a highly regulated
> industry.
> Not that that should be a primary concern for the development team, but it
> is something a local packager might be stuck with.
>
>
>> >> [ But that doesn't mean we shouldn't try to compile several ELisp files
>> >>   into a single ELN file, especially since the size of ELN files seems
>> >>   to be proportionally larger for small ELisp files than for large
>> >>   ones.  ]
>> >
>> > Since I learned of the native compiler in 28.1, I decided to try it out and
>> > also "throw the spaghetti at the wall" with a bunch of packages that
>> > provide features similar to those found in more "modern" IDEs.  In terms of
>> > startup time, the normal package system does not deal well with hundreds of
>> > directories on the load path, regardless of AOT native compilation, so I'm
>> > transforming the packages to install in the version-specific load path, and
>> > compiling that ahead of time.  At least for the ones amenable to such
>> > treatment.
>>
>> There are two load-paths at play (`load-path` and
>> `native-comp-eln-load-path`) and I'm not sure which one you're talking
>> about.  OT1H `native-comp-eln-load-path` should not grow with the number
>> of packages so it typically contains exactly 2 entries, and definitely
>> not hundreds.  OTOH `load-path` is unrelated to native compilation.
>>
>
> Not entirely - as I understand it, the load system first finds the source
> file and computes a hash before determining if there is an ELN file
> corresponding to it.
> Although I do wonder if there is some optimization for ELN files in the
> system directory as opposed to the user's cache.  I have one build where I
> native compiled (but not byte compiled) all the el files in the lisp
> directory, and another where I byte compiled and then native compiled the
> same set of files.  In both cases I used the flag to batch-native-compile
> to put the ELN file in the system cache.  In the first case a number of
> files failed to compile, and in the second, they all compiled.  I've also
> observed another situation where a file will only (byte or native) compile
> if one of its required files has been byte compiled ahead of time - but
> only native compiling that dependency resulted in the same behavior as not
> compiling it at all.  I planned to send a separate mail to the list asking
> whether it was intended behavior once I had reduced it to a simple case, or
> if it should be submitted as a bug.
> In any case, I noticed that the "browse customization groups" buffer is
> noticeably faster in the second case.  I need to try it again to confirm
> that it wasn't just waiting on the relevant source files to compile in the
> first case.
>
>> I also don't understand what you mean by "version-specific load path".
>
> In the usual unix installation, there will be a "site-lisp" one directory
> above the version specific installation directory, and another site-lisp in
> the version-specific installation directory.  I'm referring to installing
> the source (ultimately) in ..../emacs/28.1/site-lisp.  During the build
> it's just in the site-lisp subdirectory of the source root path.
>
>
>> Also, what kind of startup time are you talking about?
>> E.g., are you using `package-quickstart`?
>>
> That was the first alternative I tried.  With 1250 packages, it did not
> work.  First, the file consisted of a series of "let" forms corresponding
> to the package directories, and apparently the autoload forms are ignored
> if they appear anywhere below top-level.  At least I got a number of
> warnings to that effect.
> The other problem was that I got a "bytecode overflow error".  I only got
> the first error after chopping off the file approximately after the first
> 10k lines.  Oddly enough, when I put all the files in the site-lisp
> directory, and collect all the autoloads for that directory in a single
> file, it has no problem with the 80k line file that results.
>
>
>> > Given I'm compiling all the files AOT for use in a common installation
>> > (this is on Linux, not Windows), the natural question for me is whether
>> > larger compilation units would be more efficient, particularly at startup.
>>
>> It all depends where the slowdown comes from :-)
>>
>> E.g. `package-quickstart` follows a similar idea to the one you propose
>> by collecting all the `<pkg>-autoloads.el` into one big file, which
>> saves us from having to load separately all those little files.  It also
>> saves us from having to look for them through those hundreds
>> of directories.
>>
>> I suspect a long `load-path` can itself be a source of slow down
>> especially during startup, but I haven't bumped into that yet.
>> There are ways we could speed it up, if needed:
>>
>> - create "meta packages" (or just one containing all your packages),
>>   which would bring together in a single directory the files of several
>>   packages (and presumably also bring together their
>>   `<pkg>-autoloads.el` into a larger combined one).  Under GNU/Linux we
>>   could have this metapackage be made of symlinks, making it fairly
>>   efficient and non-obtrusive (e.g. `C-h o` could still get you to the
>>   actual file rather than its metapackage-copy).
>> - Manage a cache of where are our ELisp files (i.e. a hash table
>>   mapping relative ELisp file names to the absolute file name returned
>>   by looking for them in `load-path`).  This way we can usually avoid
>>   scanning those hundred directories to find the .elc file we need, and
>>   go straight to it.
>>
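
The cache idea in the second bullet above can be illustrated outside Emacs.
What follows is only a minimal sketch in Python, not how `load` is actually
implemented: the names `locate_by_scan` and `build_cache` are hypothetical,
the suffix list is simplified to `.elc` and `.el`, and cache invalidation is
ignored entirely.  It contrasts probing every directory for every lookup with
building a name-to-file table once.

```python
import os

# Simplified suffix list, most-preferred first (an assumption for this sketch).
SUFFIXES = (".elc", ".el")


def locate_by_scan(name, load_path):
    """Probe every directory in order; return (path_or_None, probe_count)."""
    probes = 0
    for directory in load_path:
        for suffix in SUFFIXES:
            probes += 1
            candidate = os.path.join(directory, name + suffix)
            if os.path.isfile(candidate):
                return candidate, probes
    return None, probes


def build_cache(load_path):
    """Scan each directory once and map library names to the file found."""
    cache = {}
    for directory in load_path:
        try:
            entries = os.listdir(directory)
        except OSError:
            continue  # Skip directories that are missing or unreadable.
        for suffix in SUFFIXES:
            for entry in entries:
                if entry.endswith(suffix):
                    name = entry[: -len(suffix)]
                    # setdefault keeps the first hit, so earlier directories
                    # and preferred suffixes shadow later ones, as in the scan.
                    cache.setdefault(name, os.path.join(directory, entry))
    return cache
```

With hundreds of directories, a library near the end of the path costs roughly
two probes per preceding directory in the scanning version, while the cached
version pays for one directory listing per entry up front and a single
dictionary lookup afterwards.
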
> I'm pretty sure the load-path is an issue with 1250 packages, even if half
> of them consist of single files.
>
> Since I'm preparing this for a custom installation that will be accessible
> for multiple users, I decided to try putting everything in site-lisp and
> native compile everything AOT.  Most of the other potential users are not
> experienced Unix users, which is why I'm trying to make everything work
> smoothly up front and have features they would find familiar from other
> editors.
>
> One issue with this approach is that the package selection mechanism
> doesn't recognize the modules as being installed, or provide any assistance
> in selectively activating modules.
>
> Other places where there is a noticeable slowdown with large numbers of
> packages:
>   * Browsing customization groups - just unfolding a single group can take
> minutes (this is on fast server hardware with a lot of free memory)
>   * Browsing custom themes with many theme packages installed
> I haven't gotten to the point that I can test the same situation by
> explicitly loading the same modules from the site-lisp directory that had
> been activated as packages.  Installing the themes in the system directory
> does skip the "suspicious files" check that occurs when loading them from
> the user configuration.
>
>
>> > I posed the question to the list mostly to see if the approach (or similar)
>> > had already been tested for viability or effectiveness, so I can avoid
>> > unnecessary experimentation if the answer is already well-understood.
>>
>> I don't think it has been tried, no.
>>
>> > I don't know enough about modern library loading to know whether you'd
>> > expect N distinct but interdependent dynamic libraries to be loaded in as
>> > compact a memory region as a single dynamic library formed from the same
>> > underlying object code.
>>
>> I think you're right here, but I'd expect the effect to be fairly small
>> except when the .elc/.eln files are themselves small.
>>
>
> There are a lot of packages that have fairly small source files, just
> because they've factored their code the same way it would be in languages
> where the shared libraries are not in 1-1 correspondence with source files.
>
>>
>> > It's not clear to me whether those points are limited to call
>> > sites or not.
>>
>> I believe it is: the optimization is to replace a call via `Ffuncall` to
>> a "symbol" (which looks up the value stored in the `symbol-function`
>> cell), with a direct call to the actual C function contained in the
>> "subr" object itself (expected to be) contained in the
>> `symbol-function` cell.
>>
>> Andrea would know if there are other semantic-non-preserving
>> optimizations in the level 3 of the optimizations, but IIUC this is very
>> much the main one.
>>
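
Why replacing a call through the symbol with a direct call does not preserve
semantics can be shown with a small analogy.  This is a Python sketch under
stated assumptions, not Emacs internals: `function_cells`, `defun`, `funcall`
and `compile_direct_call` are hypothetical stand-ins for the
`symbol-function` cell, a definition, a late-bound call, and a call site that
has been bound directly to the function it saw at compile time.

```python
# Hypothetical stand-in for the per-symbol function cells.
function_cells = {}


def defun(name, fn):
    """Store (or replace) the function in the cell for NAME."""
    function_cells[name] = fn


def funcall(name, *args):
    """Late-bound call: look the function up in its cell on every call."""
    return function_cells[name](*args)


def compile_direct_call(name):
    """Early-bound call: capture the current function once, skip the lookup."""
    target = function_cells[name]

    def call(*args):
        return target(*args)

    return call


defun("greet", lambda: "original")
direct = compile_direct_call("greet")   # bound to the definition seen now
defun("greet", lambda: "redefined")     # later redefinition (e.g. by advice)

late_result = funcall("greet")          # follows the redefinition
direct_result = direct()                # still runs the captured definition
```

The direct call is cheaper because it skips the lookup, but it no longer
notices a later redefinition, which is the trade-off being discussed.
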
>> >> IIUC the current native-compiler will actually leave those
>> >> locally-defined functions in their byte-code form :-(
>> > That's not what I understood from
>> > https://akrl.sdf.org/gccemacs.html#org0f21a5b
>> > As you deduce below, I come from a Scheme background - cl-flet is the form
>> > I should have referenced, not let.
>>
>> Indeed you're right that those functions can be native compiled, tho only if
>> they're closed (i.e. if they don't refer to surrounding lexical variables).
>> [ I always forget that little detail :-(  ]
>>
>
> I would expect this would apply to most top-level defuns in elisp
> packages/modules.  From my cursory review, it looks like the ability to
> redefine these defuns is mostly useful when developing the packages
> themselves, and "sealing" them for use would be appropriate.
> I'm not clear on whether this optimization is limited to the case of
> calling functions defined in the compilation unit, or applied more broadly.
>
> Thanks,
> Lynn
>
>
>>
