unofficial mirror of emacs-devel@gnu.org 
From: Lynn Winebarger <owinebar@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: Andrea Corallo <akrl@sdf.org>, emacs-devel@gnu.org
Subject: Re: native compilation units
Date: Mon, 13 Jun 2022 12:33:19 -0400
Message-ID: <CAM=F=bBx93uGpk5giYBaPtcu3jJ9Q7wvFBi__72qR3LtOidoVw@mail.gmail.com>
In-Reply-To: <jwvo7yxafng.fsf-monnier+emacs@gnu.org>

On Sun, Jun 12, 2022 at 2:47 PM Stefan Monnier <monnier@iro.umontreal.ca>
wrote:

> >> >> In which sense would it be different from:
> >> >>
> >> >>     (cl-flet
> >> >>         ...
> >> >>       (defun ...)
> >> >>       (defun ...)
> >> >>       ...)
> >> >>
> > I'm trying to determine if there's a set of expressions for which it
> > is semantically sound to perform the intraprocedural optimizations
>
> The cl-flet above is such an example, AFAIK.  Or maybe I don't
> understand what you mean.
>

To be clear, I'm trying to first understand what Andrea means by "safe".
I'm assuming it means the result agrees with whatever the byte compiler
and VM would produce for the same code.  I doubt I'm bringing up topics or
ideas that are new to you.  But if I do make use of semantic/wisent, I'd
like to know the result can be fast (modulo garbage collection, anyway).
I've been operating under the assumption that

   - Compiled code objects should be first class in the sense that they can
   be serialized just by using print and read.  That seems to have been
   important historically, and was true for byte-code vectors for
   dynamically scoped functions.  It's still true for byte-code vectors of
   top-level functions, but is not true for byte-code vectors for closures
   (and hasn't been for at least a decade, apparently).
   - It's still worthwhile to have a class of code objects that are
   immutable in the VM semantics, but now because there are compiler passes
   implemented that can make use of that as an invariant.
   - cl-flet doesn't allow mutual recursion, and there is no shared state
   above, so there's nothing to optimize intraprocedurally.
   - cl-labels is implemented with closures, so (as I understand it) the
   native compiler would not be able to produce code if you asked it to
   compile the closure returned by a form like
   (cl-labels ((f ..) (g...) ...) f)
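
The cl-flet / cl-labels distinction in the last two items can be made
concrete with a minimal sketch.  The names my-parity, my-evenp and my-oddp
are hypothetical and chosen only for illustration; the sketch assumes the
file is loaded with lexical-binding enabled.  Under cl-labels the two local
functions can see each other; under cl-flet the body of my-evenp would look
for a global my-oddp instead.

```elisp
;;; -*- lexical-binding: t -*-
(require 'cl-lib)

;; Hypothetical example: only `cl-labels' and `cl-lib' are real names.
(defun my-parity (n)
  "Return the symbol `even' or `odd' for the non-negative integer N."
  ;; Each local function may call the other because `cl-labels'
  ;; puts every binding in scope inside every function body.
  (cl-labels ((my-evenp (k) (if (zerop k) t (my-oddp (1- k))))
              (my-oddp (k) (if (zerop k) nil (my-evenp (1- k)))))
    (if (my-evenp n) 'even 'odd)))
```

Replacing cl-labels with cl-flet in this sketch would be expected to signal
a void-function error at the first call to my-oddp, unless a global
function of that name happens to exist.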

I also mistakenly thought byte-code vectors of the sort saved in ".elc"
files would not be able to represent closures without being consed, as the
components (at least the first 4) are nominally constant.  But I see that
closures are being implemented by calling an ordinary function that
side-effects the "constants" vector.  That's unfortunate because it means
the optimizer cannot assume byte-vectors are constants that can be freely
propagated.  OTOH, prior to commit
https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=d0c47652e527397cae96444c881bf60455c763c1
it looks like the closures were constructed at compile time rather than by
side-effect, which would mean the VM would be expected to treat them as
immutable, at least.

Wedging closures into the byte-code format that works for dynamic scoping
could be made to work with shared structures, but you'd need to modify
print to always capture shared structure (at least for byte-code vectors),
not just when there's a cycle.  The approach that's been implemented only
works at run-time when there's shared state between closures, at least as
far as I can tell.
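
As a minimal sketch of what "capturing shared structure" means for print
and read, the following uses plain lists rather than byte-code vectors.
The helper name my-roundtrip-shares-p is hypothetical; print-circle,
prin1-to-string and read are the standard facilities.  The sketch only
shows that sharing survives a round trip when print-circle is non-nil and
is lost otherwise; it says nothing about how .elc files are written.

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical helper, for illustration only.
(defun my-roundtrip-shares-p (circle)
  "Print and re-read a list holding the same cons twice.
CIRCLE is the value bound to `print-circle' while printing.
Return non-nil if the two elements are still `eq' after reading."
  (let* ((shared (list 'state 0))
         (pair (list shared shared))
         ;; `print-circle' is a special variable, so this `let'
         ;; binds it dynamically for the duration of the print.
         (text (let ((print-circle circle))
                 (prin1-to-string pair)))
         (copy (read text)))
    (eq (car copy) (cadr copy))))

(my-roundtrip-shares-p nil) ; two independent copies after reading
(my-roundtrip-shares-p t)   ; sharing preserved via #1= / #1# labels
```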

However, it's a hack that will never really correspond closely to the
semantics of shared objects with explicit tracking and load-time linking
of compile-time symbols, because the relocations are already performed and
there's no way to recover where they occurred from the value itself.  If a
goal is to have a semantics in which you can

   1. unambiguously specify that at load/run time a function or variable
   name is resolved in the compile-time environment provided by a separate
   compilation unit as an immutable constant at run-time
   2. serialize compiled closures as compilation units that provide a
   well-defined compile-time environment for linking
   3. reduce the headaches of the compiler writer by making it easy to
   produce code that is eligible for their optimizations

then I think the current approach is suboptimal.  The current byte-code
representation is analogous to the a.out format.  Because the .elc files
run code on load, you can put an arbitrary amount of infrastructure in
there to support an implementation of compilation units with exported
compile-time symbols, but it puts a lot more burden on the compiler and
linker/loader writers than just being explicit would.

And I'm not sure what the payoff is.  When there wasn't a native compiler
(and associated optimization passes), I suppose there was no pressing
reason to upend backward compatibility.  Then again, I've never been
responsible for maintaining a 3-4 decade old application whose installed
user base is of a size I can't even guess at, running on everything from
chips in "smart" electric switches to (I assume) the biggest of "big
iron", whatever that means these days.


> > I'm trying to capture a function as a first class value.
>
> Functions are first class values and they can be trivially captured via
> things like (setq foo (lambda ...)), (defalias 'foo (lambda ...)) and
> a lot more, so I there's some additional constraint you're expecting but
> I don't know what that is.
>

Yes, I thought byte-code would be treated as constant.  I still think it
makes a lot of sense to make it so.


>
> > This was not expected with lexical scope.
>
> You explicitly write `(require 'cl-lib)` but I don't see any
>
>     -*- lexical-binding:t -*-
>
> anywhere, so I suspect you forgot to add those cookies that are needed
> to get proper lexical scoping.
>

Ok, wow, I really misread the NEWS for 28.1, where it says

    The 'lexical-binding' local variable is always enabled.

I read that as meaning "always set".  My fault.

> > With the current byte-codes, there's just no way to express a call to
> > an offset in the current byte-vector.
>
> Indeed, but you can call a byte-code object instead.
>

Creating the byte code with shared structure was what I meant by one of
the solutions being to "patch compile-time constants" at load, i.e.
perform the relocations directly.  The current implementation effectively
inlines copies of the constants (byte-code objects), which is fine for
shared code but not for shared variables.  That is, the values that are
assigned to my-global-oddp and my-global-evenp (for test2, after
correcting the lexical-binding setting) do not reference each other.  Each
is created with an independent copy of the other.
