unofficial mirror of emacs-devel@gnu.org 
From: Lynn Winebarger <owinebar@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: Andrea Corallo <akrl@sdf.org>, emacs-devel@gnu.org
Subject: Re: native compilation units
Date: Sun, 19 Jun 2022 21:39:28 -0400
Message-ID: <CAM=F=bBO7nSPELaVv6P0JmkTGzRmp7+x3BVPGsRngsTRZPQhoQ@mail.gmail.com>
In-Reply-To: <jwvwndcuvik.fsf-monnier+emacs@gnu.org>

On Sun, Jun 19, 2022 at 7:02 PM Stefan Monnier <monnier@iro.umontreal.ca>
wrote:

> > Currently compiling a top-level expression wrapped in
> > eval-when-compile by itself leaves no residue in the compiled output,
>
> `eval-when-compile` has 2 effects:
>
> 1- Run the code within the compiler's process.
>    E.g.  (eval-when-compile (require 'cl-lib)).
>    This is somewhat comparable to loading a gcc plugin during
>    a compilation: it affects the GCC process itself, rather than the
>    code it emits.
>
> 2- It replaces the (eval-when-compile ...) thingy with the value
>    returned by the evaluation of this code.  So you can do (defvar
>    my-str (eval-when-compile (concat "foo" "bar"))) and you know that
>    the concatenation will be done during compilation.
>
> > but I would want to make the above evaluate to an object at run-time
> > where the exported symbols in the obstack are immutable.
>
> Then it wouldn't be called `eval-when-compile` because it would do
> something quite different from what `eval-when-compile` does :-)
>
>
The informal semantics of "eval-when-compile" from the elisp info file are
that

     This form marks BODY to be evaluated at compile time but not when
     the compiled program is loaded.  The result of evaluation by the
     compiler becomes a constant which appears in the compiled program.
     If you load the source file, rather than compiling it, BODY is
     evaluated normally.

I'm not sure what I have proposed that would be inconsistent with "the
result of evaluation by the compiler becomes a constant which appears in
the compiled program".  The exact form of that appearance in the compiled
program is not specified.  For example, the byte-compile of
(eval-when-compile (cl-labels ((f...) (g ...))) currently produces a
byte-code vector in which f and g are byte-code vectors with shared
structure.  However, that representation is only one choice.
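
To make the "constant which appears in the compiled program" reading
concrete, here is a minimal sketch.  The names `my-eval-count` and
`my-compiled-fn` are hypothetical, chosen only for illustration; the
sketch assumes the stock byte compiler (`byte-compile` from bytecomp.el)
and says nothing about the proposed extensions.

```elisp
;; Illustrative sketch only; `my-eval-count' and `my-compiled-fn' are
;; hypothetical names, not part of Emacs.
(require 'bytecomp)

(defvar my-eval-count 0
  "Counts how many times the `eval-when-compile' body below has run.")

(defvar my-compiled-fn
  (byte-compile
   '(lambda ()
      ;; The body runs once, inside the compiler.  Only its value is
      ;; embedded in the resulting byte-code function as a constant.
      (eval-when-compile
        (setq my-eval-count (1+ my-eval-count))))))

;; Calling the compiled function does not re-run the body: both calls
;; return the same embedded constant and the counter is not advanced.
(list (funcall my-compiled-fn)
      (funcall my-compiled-fn)
      my-eval-count)
```

Both calls should return the same number, and `my-eval-count` should not
change between them, which is the sense in which only the result of the
evaluation, not the evaluation itself, survives into the compiled code.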

It is inconsistent with the semantics of *symbols* as they currently stand,
as I have already admitted.  Even there, you could advance a model where it
is not inconsistent.  For example, if you view the binding of a symbol to a
value as having two components, the binding itself and the cell holding the
mutable value during the extent of the symbol as a global/dynamically
scoped variable, then binding the symbol to the final value held by the
cell just before the variable's dynamic extent terminates would be
consistent.  That's not how it's currently implemented, because with the
current semantics there is no way to express the final compile-time
environment as a value after compilation has completed.

The part that's incompatible with the current semantics of symbols is
importing that symbol as an immutable symbolic reference: not really a
"variable" reference, but a binding of a symbol to a value in the run-time
namespace (or package in CL terminology, although CL did not allow any way
to specify what I'm suggesting either, as far as I know).

However, that would capture the semantics of ELF shared objects, whose
text and ro_data segments are loaded into memory that is in fact immutable
for a userspace program.


> > byte-code (or native-code) instruction arrays.  This would in turn enable
> > implementing proper tail recursion as "goto with arguments".
>
> Proper tail recursion elimination would require changing the *normal*
> function call protocol.  I suspect you're thinking of a smaller-scale
> version of it specifically tailored to self-recursion, kind of like
> what `named-let` provides.  Note that such ad-hoc TCO tends to hit the same
> semantic issues as the -O3 optimization of the native compiler.
> E.g. in code like the following:
>
>     (defun vc-foo-register (file)
>       (when (some-hint-is-true)
>         (load "vc-foo")
>         (vc-foo-register file)))
>
> the final call to `vc-foo-register` is in tail position but is not
> a self call because loading `vc-foo` is expected to redefine
> `vc-foo-register` with the real implementation.
>
I'm only talking about the steps that are required to allow the compiler
to produce code that implements proper tail recursion.  With the abstract
machine currently implemented by the byte-code VM, the "call[n]"
instructions will always be needed to call out according to the C calling
conventions.  The call[-absolute/relative] or [goto-absolute] instructions
I suggested *would be* used in the "normal" function-call protocol in place
of the current funcall dispatch, at least for calls to functions defined in
lisp.  This is necessary but not sufficient for proper tail recursion.

To actually get proper tail recursion requires the compiler to use the
instructions for implementing the appropriate function call protocol,
especially if "goto-absolute" is the instruction provided for changing the
PC register.  Other instructions would have to be issued to manage the
stack frame explicitly if that were the route taken.  Or, a more CISCish
call-absolute type of instruction could be used that would perform that
stack frame management implicitly.

Either way, it's the compiler that has to determine whether a return
instruction following a control transfer can be safely eliminated or not.
If the "goto-absolute" instruction were used, the compiler would have to
decide whether the address following the "goto-absolute" should be pushed
in a new frame, or if it can be "pre-emptively garbage collected" at
compile time because it's a tail call.
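
As a source-level analogy for "goto with arguments", the following sketch
shows a self call in tail position next to a hand-written version in which
the call is replaced by overwriting the parameters and jumping back to the
entry point.  The function names are hypothetical, and this only mimics in
Lisp what the suggested instructions would do inside the VM; it is not
byte-code and not how the current compiler behaves.

```elisp
;; Hypothetical names throughout; a source-level sketch, not VM code.
(require 'cl-lib)

(defun my-sum-recursive (n acc)
  "Add the integers 1..N to ACC using a self call in tail position."
  (if (= n 0)
      acc
    ;; Tail call: nothing remains to be done after it returns, so the
    ;; return address pushed for it is never really needed.
    (my-sum-recursive (1- n) (+ acc n))))

(defun my-sum-goto (n acc)
  "Same computation with the tail call turned into rebind-and-jump."
  (catch 'return
    (while t                       ; the loop head is the jump target
      (if (= n 0)
          (throw 'return acc)
        ;; "goto with arguments": overwrite both parameters in
        ;; parallel, then fall through to the loop head.  No new frame
        ;; is pushed, so stack depth stays constant.
        (cl-psetq n (1- n)
                  acc (+ acc n))))))

(my-sum-goto 100 0)
```

The rewrite is only valid when the name in tail position is known to still
refer to the same definition at call time, which is exactly the caveat
raised by the `vc-foo-register` example quoted above.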


> > I'm not familiar with emacs's profiling facilities.  Is it possible to
> > tell how much of the allocated space/time spent in gc is due to the
> > constant vectors of lexical closures?  In particular, how much of the
> > constant vectors are copied elements independent of the lexical
> > environment?  That would provide some measure of any gc-related
> > benefit that *might* be gained from using an explicit environment
> > register for closures, instead of embedding it in the
> > byte-code vector.
>
> No, I can't think of any profiling tool we currently have that can help
> with that, sorry :-(
>
> Note that when support for native closures is added to the native
> compiler, it will hopefully not be using this clunky representation
> where capture vars are mixed in with the vector of constants, so that
> might be a more promising direction (may be able to skip the step where
> we need to change the bytecode).
>
>
The trick is to make the implementations of the abstract machine by each
of the compilers have enough in common to support calling one from the
other.  The extensions I've suggested for the byte-code VM and lisp
semantics are intended to support that interoperation, so the semantics of
the byte-code implementation won't unnecessarily constrain the semantics of
the native-code implementation.

Lynn
