unofficial mirror of emacs-devel@gnu.org 
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Lynn Winebarger <owinebar@gmail.com>
Cc: Andrea Corallo <akrl@sdf.org>,  emacs-devel@gnu.org
Subject: Re: native compilation units
Date: Mon, 13 Jun 2022 13:15:21 -0400
Message-ID: <jwv7d5ky08j.fsf-monnier+emacs@gnu.org>
In-Reply-To: <CAM=F=bBx93uGpk5giYBaPtcu3jJ9Q7wvFBi__72qR3LtOidoVw@mail.gmail.com> (Lynn Winebarger's message of "Mon, 13 Jun 2022 12:33:19 -0400")

> To be clear, I'm trying to first understand what Andrea means by "safe".
> I'm assuming it means the result agrees with whatever the byte
> compiler and VM would produce for the same code.

Not directly.  It means that it agrees with the intended semantics.
That semantics is sometimes accidentally defined by the actual
implementation in the Lisp interpreter or the bytecode compiler, but
that's secondary.

The semantic issue is that if you call

    (foo bar baz)

it normally (when `foo` is a global function) means you're calling the
function contained in the `symbol-function` of the `foo` symbol *at the
time of the function call*.  So compiling this to jump directly to the
code that happens to be contained there during compilation (or the code
which the compiler expects to be there at that point) is unsafe in
the sense that you don't know whether that symbol's `symbol-function`
will really have that value when we get to executing that function call.
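
A minimal sketch of that late binding, using made-up names (`my-greet`,
`my-caller`, `my-before`, `my-after`) that are not from this thread.  It
assumes the code is run by the interpreter or the bytecode VM, where the
call goes through the symbol's function cell each time:

```elisp
;; Hypothetical names, for illustration only.
(defun my-greet () 'old)
(defun my-caller () (my-greet))   ; `my-greet' is looked up at call time

(defvar my-before (my-caller))    ; runs the original definition

;; Replace the global definition after `my-caller' was defined.
(fset 'my-greet (lambda () 'new))

(defvar my-after (my-caller))     ; same caller, now runs the replacement
```

Here `my-before` ends up as `old` and `my-after` as `new`, even though
`my-caller` itself was never touched.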

The use of `cl-flet` (or `cl-labels`) circumvents this problem since the
call to `foo` is now to a lexically-scoped function `foo`, so the
compiler knows that the code that is called is always that same one
(there is no way to modify it between the compilation time and the
runtime).
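
As a rough illustration of the lexically-scoped case (a sketch only;
`my-helper`, `my-local-call` and `my-local-result` are hypothetical
names), redefining the global function does not change what the
`cl-flet` body calls:

```elisp
(require 'cl-lib)

;; Hypothetical names, for illustration only.
(defun my-helper (x) (* 2 x))      ; a global function with the same name

(defun my-local-call (n)
  ;; Inside this body, `my-helper' names the function bound by `cl-flet',
  ;; not whatever is stored in the symbol's function cell.
  (cl-flet ((my-helper (x) (+ x 1)))
    (my-helper n)))

;; Changing the global `my-helper' has no effect on the call above.
(fset 'my-helper (lambda (_x) 'changed))

(defvar my-local-result (my-local-call 10))   ; uses the local (+ x 1)
```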

> I doubt I'm bringing up topics or ideas that are new to you.  But if
> I do make use of semantic/wisent, I'd like to know the result can be
> fast (modulo garbage collection, anyway).

It's also "modulo enough work on the compiler (and potentially some
primitive functions) to make the code fast".

> I've been operating under the assumption that
>
>    - Compiled code objects should be first class in the sense that
>    they can be serialized just by using print and read.  That seems to
>    have been important historically, and was true for byte-code
>    vectors for dynamically scoped functions.  It's still true for
>    byte-code vectors of top-level functions, but is not true for
>    byte-code vectors for closures (and hasn't been for at least
>    a decade, apparently).

It's also true for byte-compiled closures, although, inevitably, this
holds only for closures that capture only serializable values.
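
A small sketch of such a round trip, under the assumption that the
captured value is a plain integer.  All `my-` names are hypothetical,
and `lexical-binding` is let-bound around `byte-compile` only so that
the snippet does not depend on how it is itself evaluated:

```elisp
;; Hypothetical names, for illustration only.
(defvar my-make-adder
  (let ((lexical-binding t))
    ;; Compile a function that returns a closure over `n'.
    (byte-compile '(lambda (n) (lambda (x) (+ x n))))))

;; A byte-code closure that captured the integer 5.
(defvar my-add5 (funcall my-make-adder 5))

;; Serialize with the printer, then rebuild with the reader.
(defvar my-add5-text (prin1-to-string my-add5))
(defvar my-add5-copy (read my-add5-text))

;; 6 if the round trip preserved the closure and its captured value.
(defvar my-roundtrip-result (funcall my-add5-copy 1))
```

A closure that captured, say, a buffer or a process would not survive
this, since those values have no read syntax.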

> But I see that closures are being implemented by calling an ordinary
> function that side-effects the "constants" vector.

I don't think that's the case.  Where do you see that?
The constants vector is implemented as a normal vector, so strictly
speaking it is mutable, but the compiler will never generate code that
mutates it, AFAIK, so you'd have to write ad-hoc code that digs inside
a byte-code closure and mutates the constants vector for that to happen
(and I don't know of such code out in the wild).

> OTOH, prior to commit
> https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=d0c47652e527397cae96444c881bf60455c763c1
> it looks like the closures were constructed at compile time rather than by
> side-effect,

No, this commit only changes the *way* they're constructed, not the
*when*.  Both before and after, the result is constants vectors which
are not side-effected (every byte-code closure gets its own fresh
constants vector).
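
One way to check the "fresh vector" part by hand, as a sketch with
hypothetical `my-` names.  It relies on `aref` accepting byte-code
objects, with slot 2 being the constants vector:

```elisp
;; Hypothetical names, for illustration only.
(defvar my-make-reader
  (let ((lexical-binding t))
    ;; Each call returns a new closure over its own `n'.
    (byte-compile '(lambda (n) (lambda () n)))))

(defvar my-c1 (funcall my-make-reader 1))
(defvar my-c2 (funcall my-make-reader 2))

;; Compare the two closures' constants vectors by identity.
;; nil means each closure got its own vector.
(defvar my-shared-p (eq (aref my-c1 2) (aref my-c2 2)))
```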

> Wedging closures into the byte-code format that works for dynamic scoping
> could be made to work with shared structures, but you'd need to modify
> print to always capture shared structure (at least for byte-code vectors),
> not just when there's a cycle.

It already does.

> The approach that's been implemented only works at run-time when
> there's shared state between closures, at least as far as I can tell.

There can be problems if two *toplevel* definitions are serialized and
they share common objects, indeed.  The byte-compiler may fail to
preserve the shared structure in that case, IIRC.  I have some vague
recollection of someone bumping into that limitation at some point, but
it should be easy to circumvent.

> Then I think the current approach is suboptimal.  The current
> byte-code representation is analogous to the a.out format.
> Because the .elc files run code on load you can put an arbitrary
> amount of infrastructure in there to support an implementation of
> compilation units with exported compile-time symbols, but it puts
> a lot more burden on the compiler and linker/loader writers than just
> being explicit would.

I think the practical performance issues with ELisp code are very far
removed from these problems.  Maybe some day we'll have to face them,
but we still have a long way to go.

>> You explicitly write `(require 'cl-lib)` but I don't see any
>>
>>     -*- lexical-binding:t -*-
>>
>> anywhere, so I suspect you forgot to add those cookies that are needed
>> to get proper lexical scoping.
>
> Ok, wow, I really misread the NEWS for 28.1 where it said
> The 'lexical-binding' local variable is always enabled.

Are you sure?  How do you do that?
Some of the errors you showed seem to point very squarely towards the
code being compiled as dyn-bound ELisp.
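
For reference, a sketch of what a file with the cookie looks like.  The
file name `my-example.el` and the function are hypothetical; the
assumption is the usual rule that the cookie has to sit on the first
line of the file:

```elisp
;;; my-example.el --- Hypothetical file for illustration  -*- lexical-binding: t -*-

;; With the cookie above, the inner lambda is a real closure over `n'.
;; Without it (in the Emacs versions discussed in this thread), the file
;; is treated as dynamically scoped and `n' is looked up when the lambda
;; is called, which typically fails with a void-variable error.
(defun my-example-make-adder (n)
  (lambda (x) (+ x n)))
```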


        Stefan