From mboxrd@z Thu Jan 1 00:00:00 1970
From: Lynn Winebarger
Newsgroups: gmane.emacs.devel
Subject: Re: native compilation units
Date: Sun, 19 Jun 2022 13:52:35 -0400
To: Stefan Monnier
Cc: Andrea Corallo, emacs-devel@gnu.org
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000cc7d7905e1d0a740"
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:291435

--000000000000cc7d7905e1d0a740
Content-Type: text/plain; charset="UTF-8"

On Wed, Jun 15, 2022 at 8:23 AM Stefan Monnier wrote:

> > There is one kind of expression where Andrea isn't quite correct, and
> > that is with respect to (eval-when-compile ...).
>
> You don't need `eval-when-compile`.  It's already "not quite correct"
> for lambda expressions.  What he meant is that the function associated
> with a symbol can be changed in every moment.  But if you call
> a function without going through such a globally-mutable indirection the
> problem vanishes.

I'm not sure what the point here is.  If all programs were written with
every variable and function name lexically bound, then there wouldn't be
an issue.

After Andrea's response to my original question, I was curious if the
kind of semantic object that an ELF shared-object file *is* can be
captured (directly) in the semantic model of emacs lisp, including the
fact that some symbols in ELF are bound to truly immutable constants at
runtime by the loader.  Also, if someone were to rewrite some of the
primitives now in C in Lisp and rely on the compiler for their use,
would there be a way to write them with the same semantics they have now
(not referencing the run-time bindings of other primitives).
Based on what I've observed in this thread, I think the answer is either
yes or almost yes.

The one sticking point is that there is no construct for retaining the
compile-time environment.  If I "link" files by concatenating the source
together, it's not an issue, but I can't replicate that with the results
the byte-compiler currently produces.
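To make the "globally-mutable indirection" point concrete, here is a minimal sketch, not anything taken from the compiler itself. The `my-demo-*` names are hypothetical, and it assumes the file is loaded with lexical binding enabled (the cookie on the first line). A call written as `(my-demo-helper)` looks up the symbol's function cell each time, so a later `fset` changes what runs; a function object captured in a lexical variable is unaffected.

```elisp
;;; -*- lexical-binding: t -*-
;; Hypothetical names throughout; assumes lexical binding is in effect.

(defun my-demo-helper ()
  "Return a marker showing which definition ran."
  'original)

(defun my-demo-via-symbol ()
  "Call the helper through the symbol's function cell."
  (my-demo-helper))

(defvar my-demo-via-closure
  ;; Capture the function object itself in a lexical variable, so the
  ;; closure no longer depends on the symbol's global binding.
  (let ((helper (symbol-function 'my-demo-helper)))
    (lambda () (funcall helper)))
  "Closure that calls the helper definition captured at load time.")

;; Rebind the symbol's function cell after both callers exist.
(fset 'my-demo-helper (lambda () 'redefined))

(my-demo-via-symbol)            ; => redefined
(funcall my-demo-via-closure)   ; => original
```

The second call is the case where "the problem vanishes": nothing the rest of the session does to the symbol can change which code the closure reaches.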
What would also be useful is some analogue to Common Lisp's package
construct, but extended so that symbols could be imported from
compile-time environments as immutable bindings.  Now, that would be a
change in the current semantics of symbols, unquestionably, but not one
that would break the existing code base.  It would only come into play
compiling a file as a library, with semantics along the lines of:

(eval-when-compile
  (namespace <name of library obstack>)
  <library code> ...
  (export <symbol> ...)
)

Currently compiling a top-level expression wrapped in eval-when-compile
by itself leaves no residue in the compiled output, but I would want to
make the above evaluate to an object at run-time where the exported
symbols in the obstack are immutable.  Since no existing code uses the
above constructs - because they are not currently defined - it would
only be an extension.

I don't want to restart the namespace debates - I'm not suggesting
anything to do with the reader parsing symbol namespaces from prefixes
in the symbol name.

> >> It's also "modulo enough work on the compiler (and potentially some
> >> primitive functions) to make the code fast".
> > Absolutely, it just doesn't look to me like a very big lift compared
> > to, say, what Andrea did.
>
> It very depends on the specifics, but it's definitely not obviously true.
> ELisp like Python has grown around a "slow language" so its code is
> structured in such a way that most of the time the majority of the code
> that's executed is actually not ELisp but C, over which the native
> compiler has no impact.

That's why I said "look[s] to me", and inquired here before proceeding.
Having looked more closely, it appears the most obvious safe approach,
one that doesn't require any ability to manipulate the C call stack, is
to introduce another manually managed call stack as is done for the
specpdl stack, but malloced (I haven't reviewed that implementation
closely enough to tell if it is stack or heap allocated).  That does
complicate matters.
That part would be for allowing calls to (and returns from) arbitrary
points in byte-code (or native-code) instruction arrays.  This would in
turn enable implementing proper tail recursion as "goto with arguments".

These changes would be one way to address the items in the TODO file for
28.1, starting at line 173:

> * Important features
> ** Speed up Elisp execution [...]
> *** Speed up function calls [..]
> ** Add an "indirect goto" byte-code [...]
> *** Compile efficiently local recursive functions [...]

As for the other elements - introducing additional registers to
facilitate efficient lexical closures and namespaces - it still doesn't
look like a huge lift to introduce them into the bytecode interpreter,
although there is still the work to make effective use of them in the
output of the compilers.

I have been thinking that some additional reader syntax for what might
be called "meta-evaluation quasiquotation" (better name welcome) could
be useful.  I haven't worked out the details yet, though.  I would make
#, and #,@ effectively be shorthand for eval-when-compile.  Using #`
inside eval-when-compile should produce an expression that, after
compilation, would provide the meta-quoted expression with the semantics
it would have outside an eval-when-compile form.

> > Does this mean the native compiled code can only produce closures in
> > byte-code form?
>
> Not directly, no.  But currently that's the case, yes.
>
> > below with shared structure (the '(5)], but I don't see anything in
> > the printed text to indicate it if read back in.
>
> You need to print with `print-circle` bound to t, like the compiler does
> when writing to a `.elc` file.

I feel silly again.  I've *used* emacs for years, but have (mostly)
avoided using emacs lisp for programming because of the default dynamic
scoping and the implications that has for the efficiency of lexical
closures.

> > I'm sure you're correct in terms of the current code base.
> > But isn't the history of these kinds of improvements in compilers for
> > functional languages that coding styles that had been avoided in the
> > past can be adopted and produce faster code than the original?
>
> Right, but it's usually a slow co-evolution.

I don't think I've suggested anything else.  I don't think my proposed
changes to the byte-code VM would change the semantics of Emacs Lisp,
just the semantics of the byte-code VM, which you've already stated does
not dictate the semantics of Emacs Lisp.

> > In this case, it would be enabling the pervasive use of recursion and
> > less reliance on side-effects.
>
> Not everyone would agree that "pervasive use of recursion" is an
> improvement.

True, but it's still a lisp - no one is required to write code in any
particular style.  It would be peculiar (these days, anyway) to expect a
lisp compiler to optimize imperative-style code more effectively than
code employing recursion.

> > Improvements in the gc wouldn't hurt, either.
>
> Actually, nowadays lots of benchmarks are already bumping into the GC as
> the main bottleneck.

I'm not familiar with emacs's profiling facilities.  Is it possible to
tell how much of the allocated space/time spent in gc is due to the
constant vectors of lexical closures?  In particular, how much of the
constant vectors are copied elements independent of the lexical
environment?  That would provide some measure of any gc-related benefit
that *might* be gained from using an explicit environment register for
closures, instead of embedding it in the byte-code vector.

Lynn

--000000000000cc7d7905e1d0a740
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

--000000000000cc7d7905e1d0a740--