On Sun, Jun 19, 2022 at 7:02 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> Currently compiling a top-level expression wrapped in
>> eval-when-compile by itself leaves no residue in the compiled output,

> `eval-when-compile` has 2 effects:
>
> 1- Run the code within the compiler's process.
>    E.g.  (eval-when-compile (require 'cl-lib)).
>    This is somewhat comparable to loading a gcc plugin during
>    a compilation: it affects the GCC process itself, rather than the
>    code it emits.
>
> 2- It replaces the (eval-when-compile ...) thingy with the value
>    returned by the evaluation of this code.  So you can do (defvar
>    my-str (eval-when-compile (concat "foo" "bar"))) and you know that
>    the concatenation will be done during compilation.
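
For instance (a sketch of my own, not from the manual), disassembling a
function wrapping such a form shows only the folded constant:

    ;; Illustrative only: the eval-when-compile body is evaluated by the
    ;; byte compiler and its value is embedded as a constant.
    (disassemble
     (byte-compile (lambda () (eval-when-compile (concat "foo" "bar")))))
    ;; The resulting listing should contain just:
    ;;   constant  "foobar"
    ;;   return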

>> but I would want to make the above evaluate to an object at run-time
>> where the exported symbols in the obstack are immutable.

> Then it wouldn't be called `eval-when-compile` because it would do
> something quite different from what `eval-when-compile` does :-)


The informal semantics of "eval-when-compile" from the elisp info file are:
     This form marks BODY to be evaluated at compile time but not when
     the compiled program is loaded.  The result of evaluation by the
     compiler becomes a constant which appears in the compiled program.
     If you load the source file, rather than compiling it, BODY is
     evaluated normally.
I'm not sure what I have proposed that would be inconsistent with "the result of evaluation 
by the compiler becomes a constant which appears in the compiled program".
The exact form of that appearance in the compiled program is not specified.
For example, byte-compiling (eval-when-compile (cl-labels ((f ...) (g ...)) ...))
currently produces a byte-code vector in which f and g are byte-code vectors with
shared structure.  However, that representation is only one choice.

It is inconsistent with the semantics of *symbols* as they currently stand, as I have already admitted.
Even there, you could advance a model in which it is not inconsistent.  For example, if you
view the association of a symbol with a value as having two components (the binding itself,
and the cell holding the mutable value during the symbol's extent as a global/dynamically
scoped variable), then binding the symbol to the final value the cell holds before that
dynamic extent terminates would be consistent.  That's not how it's currently implemented,
because under the current semantics there is no way to express the final compile-time
environment as a value after compilation has completed.
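
To make that concrete (this is a sketch of the *model*, not of current behavior;
the variable name is invented for illustration):

    ;; Today, compiling this leaves no residue in the output (the value is
    ;; discarded), and `my-counter' exists only in the compiler's process.
    (eval-when-compile
      (defvar my-counter 0)                 ; cell is mutable during compilation
      (setq my-counter (1+ my-counter)))
    ;; Under the two-component view above, the run-time `my-counter' would
    ;; instead end up bound, immutably, to the final compile-time value of
    ;; the cell (here 1).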

The part that's incompatible with the current semantics of symbols is importing that symbol as
an immutable symbolic reference.  Not really a "variable" reference, but a binding
of a symbol to a value in the run-time namespace (or package in CL terminology, although
CL did not provide any way to specify what I'm suggesting either, as far as I know).

However, that would capture the semantics of ELF shared objects, whose text and .rodata
segments are loaded into memory that is in fact immutable for a userspace program.
 
>> byte-code (or native-code) instruction arrays.  This would in turn enable
>> implementing proper tail recursion as "goto with arguments".

> Proper tail recursion elimination would require changing the *normal*
> function call protocol.  I suspect you're thinking of a smaller-scale
> version of it specifically tailored to self-recursion, kind of like
> what `named-let` provides.  Note that such ad-hoc TCO tends to hit the same
> semantic issues as the -O3 optimization of the native compiler.
> E.g. in code like the following:
>
>     (defun vc-foo-register (file)
>       (when (some-hint-is-true)
>         (load "vc-foo")
>         (vc-foo-register file)))
>
> the final call to `vc-foo-register` is in tail position but is not
> a self call because loading `vc-foo` is expected to redefine
> `vc-foo-register` with the real implementation.
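
(For concreteness, the self-recursion form `named-let` provides looks like
this; just a sketch, and `named-let` needs Emacs 28 or newer:)

    ;; Recursive calls to `loop' in tail position are rewritten into
    ;; iteration rather than nested function calls.
    (named-let loop ((n 1000000) (acc 0))
      (if (= n 0)
          acc
        (loop (1- n) (+ acc n))))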

I'm only talking about the steps that are required to allow the compiler to 
produce code that implements proper tail recursion.
With the abstract machine currently implemented by the byte-code VM,
the "call[n]" instructions will always be needed to call out according to
the C calling conventions.
The call-absolute/call-relative or goto-absolute instructions I suggested
*would be* used in the "normal" function-call protocol in place of the current
funcall dispatch, at least for calls to functions defined in lisp.
This is necessary but not sufficient for proper tail recursion.
To actually get proper tail recursion requires the compiler to use the instructions
for implementing the appropriate function call protocol, especially if
"goto-absolute" is the instruction provided for changing the PC register.
Other instructions would have to be issued to manage the stack frame
explicitly if that were the route taken.  Or, a more CISCish call-absolute
type of instruction could be used that would perform that stack frame
management implicitly.
Either way, it's the compiler that has to determine whether a return
instruction following a control transfer can be safely eliminated or not.
If the "goto-absolute" instruction were used, the compiler would
have to decide whether the address following the "goto-absolute"
should be pushed in a new frame, or whether it can be "pre-emptively
garbage collected" at compile time because it's a tail call.
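
As a purely hypothetical illustration of that decision (goto-absolute and
call-absolute are my proposed names, not existing bytecodes):

    (defun my-sum (n acc)                ; function invented for illustration
      (if (= n 0)
          acc
        (my-sum (1- n) (+ acc n))))      ; tail position: the compiler could
                                         ; emit goto-absolute, reuse the
                                         ; current frame, and drop the return
                                         ; address at compile time
    ;; By contrast, a call in non-tail position such as (1+ (my-sum n acc))
    ;; still needs a return address, so call-absolute (or explicit
    ;; frame-management instructions around goto-absolute) would be used.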
 
>> I'm not familiar with emacs's profiling facilities.  Is it possible to
>> tell how much of the allocated space/time spent in gc is due to the
>> constant vectors of lexical closures?  In particular, how much of the
>> constant vectors are copied elements independent of the lexical
>> environment?  That would provide some measure of any gc-related
>> benefit that *might* be gained from using an explicit environment
>> register for closures, instead of embedding it in the
>> byte-code vector.

> No, I can't think of any profiling tool we currently have that can help
> with that, sorry :-(
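
Something like the following at least gives a crude tally of the constants
vectors of the byte-code functions currently loaded (a sketch; it cannot
distinguish captured variables from true constants, so it's only a rough
upper bound rather than a GC profile):

    (let ((total 0))
      (mapatoms
       (lambda (sym)
         (let ((fn (and (fboundp sym) (symbol-function sym))))
           (when (byte-code-function-p fn)
             ;; slot 2 of a byte-code object is its constants vector
             (setq total (+ total (length (aref fn 2))))))))
      total)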

> Note that when support for native closures is added to the native
> compiler, it will hopefully not be using this clunky representation
> where capture vars are mixed in with the vector of constants, so that
> might be a more promising direction (may be able to skip the step where
> we need to change the bytecode).


The trick is to make the two compilers' implementations of the abstract machine have
enough in common to support calling code produced by one from code produced by the other.
The extensions I've suggested for the byte-code VM and lisp semantics
are intended to support that interoperation, so the semantics of the byte-code
implementation won't unnecessarily constrain the semantics of the native-code
implementation.

Lynn