On Sun, Jun 19, 2022, 9:39 PM Lynn Winebarger <owinebar@gmail.com> wrote:

On Sun, Jun 19, 2022 at 7:02 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> Currently compiling a top-level expression wrapped in
> eval-when-compile by itself leaves no residue in the compiled output,

`eval-when-compile` has 2 effects:

1- Run the code within the compiler's process.
E.g. (eval-when-compile (require 'cl-lib)).
This is somewhat comparable to loading a gcc plugin during
a compilation: it affects the GCC process itself, rather than the
code it emits.

2- It replaces the (eval-when-compile ...) thingy with the value
returned by the evaluation of this code. So you can do (defvar
my-str (eval-when-compile (concat "foo" "bar"))) and you know that
the concatenation will be done during compilation.
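
A minimal sketch of the two effects, for illustration only. The use of
`macroexpand` to inspect the result is an assumption of this example, not
something taken from the discussion; the byte compiler handles the form
through its own macro environment, which should yield the same quoted
constant.

```elisp
;; Effect 1: the `require' runs inside the compiler's process, so the
;; cl-lib macros are available while compiling; no `require' call is
;; left in the compiled output.
(eval-when-compile (require 'cl-lib))

;; Effect 2: the whole form is replaced by the value it evaluated to,
;; so the concatenation happens once, at compile time.
(defvar my-str (eval-when-compile (concat "foo" "bar")))

;; The same replacement is visible in the macro expansion:
(macroexpand '(eval-when-compile (concat "foo" "bar")))
;; ⇒ (quote "foobar")
```

When the file is loaded as source instead of compiled, the body is simply
evaluated in place, so `my-str` ends up with the same value either way.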

> but I would want to make the above evaluate to an object at run-time
> where the exported symbols in the obstack are immutable.

Then it wouldn't be called `eval-when-compile` because it would do
something quite different from what `eval-when-compile` does :-)

The informal semantics of "eval-when-compile" from the elisp info file are:

    This form marks BODY to be evaluated at compile time but not when
    the compiled program is loaded. The result of evaluation by the
    compiler becomes a constant which appears in the compiled program.
    If you load the source file, rather than compiling it, BODY is
    evaluated normally.

I'm not sure what I have proposed that would be inconsistent with "the result of evaluation
by the compiler becomes a constant which appears in the compiled program".
The exact form of that appearance in the compiled program is not specified.
For example, the byte-compile of (eval-when-compile (cl-labels ((f...) (g ...))))
currently produces a byte-code vector in which f and g are byte-code vectors with
shared structure. However, that representation is only one choice.

It is inconsistent with the semantics of *symbols* as they currently stand, as I have already admitted.
Even there, you could advance a model where it is not inconsistent. For example,
if you view the binding of a symbol to a value as having two components (the binding itself and the cell
holding the mutable value during the symbol's extent as a global/dynamically scoped variable),
then binding the symbol to the final value of the cell, taken just before the variable's dynamic extent
terminates, would be consistent. That's not how it's currently implemented, because the current
semantics provide no way to express the final compile-time environment as a value after
compilation has completed.
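
A rough way to picture the "final value of the cell" model with today's
tools is sketched below. The constant name `my-squares` is hypothetical,
and the sketch uses a let-bound local instead of a global/dynamic
variable, so it only approximates the idea: the cell is mutated freely
while the compiler evaluates the form, and only its final value survives
into the compiled program.

```elisp
;; `my-squares' is a hypothetical name used only for illustration.
(defconst my-squares
  (eval-when-compile
    ;; `acc' is the mutable cell; it exists only while this form is
    ;; being evaluated by the compiler.
    (let ((acc nil))
      (dotimes (i 3)
        (push (* i i) acc))
      ;; The final value of the cell becomes the constant.
      (nreverse acc))))
;; my-squares ⇒ (0 1 4)
```

What this cannot express is the other half of the proposal: nothing stops
run-time code from rebinding `my-squares` afterwards.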

The part that's incompatible with the current semantics of symbols is importing that symbol as
an immutable symbolic reference. It is not really a "variable" reference, but a binding
of a symbol to a value in the run-time namespace (or package in CL terminology, although
CL did not allow any way to specify what I'm suggesting either, as far as I know).

However, that would capture the semantics of ELF shared objects, with the text and read-only data
(.rodata) loaded into memory that is in fact immutable for a userspace program.

It looks to me like the portable dump code/format could be adapted to serve the purpose I have in mind here. What needs to be added is a way to limit the scope of the dump so that only the appropriate set of objects is captured.

There would probably also need to be a separate load-path for these libraries, similar to the approach employed for native-compiled files.

It could be neat if all Lisp code and constants eventually lived in larger associated compilation units (scope-limited pdmp files), so that a residual dump of the remaining live objects, most of them corresponding to the space of global/dynamic variables, could be taken at any time. That could in turn be used for local debugging or in actual bug reporting.

Lynn