From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Lynn Winebarger Newsgroups: gmane.emacs.devel Subject: Re: native compilation units Date: Fri, 3 Jun 2022 22:43:33 -0400 Message-ID: References: Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="00000000000007060f05e0963524" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="20151"; mail-complaints-to="usenet@ciao.gmane.io" Cc: Andrea Corallo , emacs-devel@gnu.org To: Stefan Monnier Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Sat Jun 04 07:48:26 2022 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1nxMe9-00054m-ST for ged-emacs-devel@m.gmane-mx.org; Sat, 04 Jun 2022 07:48:25 +0200 Original-Received: from localhost ([::1]:41552 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nxMe8-0001xA-4z for ged-emacs-devel@m.gmane-mx.org; Sat, 04 Jun 2022 01:48:24 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:55566) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nxJlT-0006tk-W3 for emacs-devel@gnu.org; Fri, 03 Jun 2022 22:43:49 -0400 Original-Received: from mail-lj1-x22a.google.com ([2a00:1450:4864:20::22a]:39702) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nxJlR-0003Hw-J6 for emacs-devel@gnu.org; Fri, 03 Jun 2022 22:43:47 -0400 Original-Received: by mail-lj1-x22a.google.com with SMTP id t13so10306719ljd.6 for ; Fri, 03 Jun 2022 19:43:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=pWMf1UDP+/TvD9J+zYOQC49Hk9dNzxZa9EsvaOc1xkM=; 
b=JoyRbGpxYOXpZAd5GsmErPkXf01yEuaBNWsaVMZ7NGV6Rw6xp34N2Xe+BoYCqn8LmL D2uFiGQe/AFMdLaNrPmPdi8JpEZzMrWRfWPCrYbXhCljf/245dSepONV4b2rmyQ4fPY/ 6IwORsW/s6tTaBBldgrHG1H698sUeZCC0PiUvoF4dMbia7ewyV5EQM3PJy0/wrfy9Yrw KtnpK0XiqJ8uGCwBOTO7vYs1MmBRxry79e4PG1HbwPwMr02uwuWu/JEqf/qPemrve/Ra WA5zA7MEG44r5mAqnYfKOkKKJ2AZ8GGLMPiKxmXfBmcNtM0MgajouNfjsbw0/OUzG4wd EBPg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=pWMf1UDP+/TvD9J+zYOQC49Hk9dNzxZa9EsvaOc1xkM=; b=zAekWPcCfnyZLPdUJDGBOM/Ma0fMfgAyxVe8pku0E9UkQxvlstoGaQkgb/FOjlalKP qizIKuGLAaqOJyDE7XMqVPatUCbetAOSp92U/Z/jtikt7iio5rJxOuysNJjNDcjqWvF/ dXSJjSCMiDO0ywplWwfcr5JfZv/Ab/I98e8QSczHrbi4mKudUEOGoEeJ4bODHzTEpV00 ufIS3kcnNNYAZWhFtLQXTh0SSCcgAcIHFY8H1Kn3GVkbSVCQx5jAA8Ocou5COipZzccD v9D+7vsPKfzGIuFHtb+qJ7GwzWEympOiH9xVt4cGcUbjmJ1+UIoKg/QrOk98Gg5jDmFP sy5Q== X-Gm-Message-State: AOAM530GnXIzqHVNN8dwBR4+dr8VSKUv3BLQPJFchdoj8pLw9+G6LtEl xSxJaCUBlUApGkDtQrBUk3B0rPXTMnYJuWI2ubs= X-Google-Smtp-Source: ABdhPJwEXWvUA1ui26Wbu4HIWFLpCJOemEeXgM5+Qf8p72sBZRll1lmfTD5MYtberprA6jb8Q61G1tVk0pZWlzx4JnY= X-Received: by 2002:a2e:9c0c:0:b0:24e:e2e0:f61e with SMTP id s12-20020a2e9c0c000000b0024ee2e0f61emr46293412lji.75.1654310622920; Fri, 03 Jun 2022 19:43:42 -0700 (PDT) In-Reply-To: Received-SPF: pass client-ip=2a00:1450:4864:20::22a; envelope-from=owinebar@gmail.com; helo=mail-lj1-x22a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-Mailman-Approved-At: Sat, 04 Jun 2022 01:44:58 -0400 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development 
discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:290640 Archived-At:

--00000000000007060f05e0963524
Content-Type: text/plain; charset="UTF-8"

On Fri, Jun 3, 2022 at 2:15 PM Stefan Monnier wrote:

> > There was a thread in January starting at
> > https://lists.gnu.org/archive/html/emacs-devel/2022-01/msg01005.html that
> > gets at one scenario. At least in pre-10 versions in my experience,
> > Windows has not dealt well with large numbers of files in a single
> > directory, at least if it's on a network drive.
>
> Hmm... I count a bit over 6K ELisp files in Emacs + (Non)GNU ELPA, so
> the ELN cache should presumably not go much past 10K files.
>
> Performance issues with read access to directories containing less than
> 10K files seems like something that was solved last century, so
> I wouldn't worry very much about it.

Per my response to Eli, I see (network) directories become almost unusable somewhere around 1000 files, but it seems that's a consequence of the network and/or security configuration.

> [ But that doesn't mean we shouldn't try to compile several ELisp files
>   into a single ELN file, especially since the size of ELN files seems
>   to be proportionally larger for small ELisp files than for large
>   ones. ]

Since I learned of the native compiler in 28.1, I decided to try it out and also "throw the spaghetti at the wall" with a bunch of packages that provide features similar to those found in more "modern" IDEs. In terms of startup time, the normal package system does not deal well with hundreds of directories on the load path, regardless of AOT native compilation, so I'm transforming the packages to install in the version-specific load path and compiling that ahead of time, at least for the ones amenable to such treatment.
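To make that concrete, here is a rough sketch of the kind of setup I mean. The directory name and the variable `my-site-lisp-dir' are made up for illustration, and this is only an approximation rather than my actual configuration; where the resulting .eln files end up depends on `native-comp-eln-load-path'.

```elisp
;; Sketch only: `my-site-lisp-dir' and its value are hypothetical.
(defvar my-site-lisp-dir "/usr/local/share/emacs/28.1/site-lisp/"
  "Hypothetical version-specific directory holding the flattened packages.")

;; A single load-path entry replaces one entry per installed package.
(add-to-list 'load-path my-site-lisp-dir)

;; Queue every .el file under the directory for native compilation,
;; but only when this Emacs was built with native compilation support.
(when (and (fboundp 'native-comp-available-p)
           (native-comp-available-p))
  (native-compile-async my-site-lisp-dir 'recursively))
```

Each source file still becomes its own .eln file here, which is exactly why I'm asking about larger compilation units.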

Given I'm compiling all the files AOT for use in a common installation (this is on Linux, not Windows), the natural question for me is whether larger compilation units would be more efficient, particularly at startup. Would there be advantages comparable to including packages in the dump file, for example?

I posed the question to the list mostly to see if the approach (or similar) had already been tested for viability or effectiveness, so I can avoid unnecessary experimentation if the answer is already well-understood.

> > Aside from explicit interprocedural optimization, is it possible libgccjit
> > would lay out the code in a more optimal way in terms of memory locality?
>
> Could be, but I doubt it because I don't think GCC gets enough info to
> make such a decision. For lazily-compiled ELN files I could imagine
> collecting some amount of profiling info to generate better code, but
> our code generation is definitely not that sophisticated.

I don't know enough about modern library loading to know whether you'd expect N distinct but interdependent dynamic libraries to be loaded in as compact a memory region as a single dynamic library formed from the same underlying object code.

> > If the only concern for semantic safety with -O3 is the redefinability of
> > all symbols, that's already the case for emacs lisp primitives implemented
> > in C.
>
> Not really:
> - Most ELisp primitives implemented in C can be redefined just fine.
>   The problem is about *calls* to those primitives, where the
>   redefinition may fail to apply to those calls that are made from C.
> - While the problem is similar the scope is very different.

From Andrea's description, this would be the primary "unsafe" aspect of intraprocedural optimizations applied to one of these aggregated compilation units.
That is, redefining a function symbol would not take effect at points in the code where the compiler had optimized on the assumption that the function definitions were constant. It's not clear to me whether those points are limited to call sites or not.

> > It should be similar to putting the code into a let block with all
> > defined functions bound in the block, then setting the global
> > definitions to the locally defined versions, except for any variations
> > in forms with semantics that depend on whether they appear at
> > top-level or in a lexical scope.
>
> IIUC the current native-compiler will actually leave those
> locally-defined functions in their byte-code form :-(

That's not what I understood from https://akrl.sdf.org/gccemacs.html#org0f21a5b

As you deduce below, I come from a Scheme background - cl-flet is the form I should have referenced, not let.

> IOW, there are lower-hanging fruits to pick first.

This is mainly of interest if a simple transformation of the sort I originally suggested can provide benefits, either by reducing startup time for large sets of preloaded packages or by enabling additional optimizations. Primarily the former for me, but the latter would be interesting. It seems more straightforward than trying to link the eln files into larger units after compilation.

> > It might be interesting to extend the language with a form that
> > makes the unsafe optimizations safe with respect to the compilation unit.
>
> Yes, in the context of Scheme I think this is called "sealing".
>
>
>         Stefan

No

--00000000000007060f05e0963524
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--00000000000007060f05e0963524--