* thoughts on top-level and cross-module inlining
@ 2016-03-15 21:54 Andy Wingo
2016-04-14 13:49 ` Ludovic Courtès
0 siblings, 1 reply; 2+ messages in thread
From: Andy Wingo @ 2016-03-15 21:54 UTC (permalink / raw)
To: guile-devel
Hi!
I was thinking about top-level and cross-module inlining earlier today,
and here is a dump of my thoughts.
Firstly, top-level and cross-module inlining are related but not quite
the same. You can consider top-level inlining to be a special case.
Anyway, consider:
(define (make-box x) (list x))
(define (box-ref box) (car box))
...
(define (do-something box)
(box-ref box))
Ideally you'd like for the "box-ref" in "do-something" to be inlined to
"car". We have a pass that can do inlining, the peval pass. Why
doesn't it work?
Well, because in Guile top-level bindings are all treated as mutable
variables. Who's to say if later someone comes along and redefines
box-ref. And this is a cool thing and it provides a lot of value and
pleasant live-rehackability and it's swell and cool.
It's also unfortunately a cost imposed on abstraction. It's not an
/essential/ cost -- if we had adaptive optimization, we could eliminate
this cost, more or less -- but it's a current cost. It would be nice if
we could use procedural and other abstractions and trust the compiler to
do an OK job on it, like we do for lexically bound procedures.
Likewise, we'd like to be able to have "make-box" and "box-ref" live in
another module, without causing the module boundary to be an
optimization barrier too.
So, we have two questions, then: firstly, what semantic change can we
make to the language to permit top-level and cross-module inlining?
Secondly, how do we implement it?
On the semantic side, I propose to add a field in modules indicating
whether the module permits its top-level references to be inlined. In
that way, the person writing a piece of code can control the lateness or
earliness of their bindings. Note that this is exactly the opposite of
define-inlinable -- there, the module being used says "always bind me
early". With cross-module inlining, a module can say "allow my
top-level references to definitions in this module and to definitions in
other modules to be early-bound". Note also that it can't require early
binding, as not all definitions will be available for this optimization.
Whether we default this new parameter to be "early" or "late" for most
modules is another question. At least for (guile-user) it should be
late because people expect late-binding at the top-level.
For me I would like to preserve the property that a program behaves the
same with or without inlining, provided that the top-level bindings used
by a program are used conventionally. To define this, let's first break
bindings into two categories: "declarative" or "assigned". A binding is
declarative if it is "define"'d once and only once within its
compilation unit and is never "set!" within its compilation unit.
Otherwise the binding is "assigned".
Let us assume that assigned bindings are not available for inlining, but
that declarative bindings can be inlined. Then, a program with inlining
will be the same as a program without inlining if declarative bindings
are never assigned from outside their compilation unit. That is what is
meant by "used conventionally", and I think it reflects standard Guile
usage, especially where modules are mostly compiled entirely within one
compilation unit.
(We should make an enhancement to the compiler to emit a warning when a
declarative binding is used unconventionally. Perhaps we should emit a
warning whenever an imported binding is "set!", perhaps with a view
towards enforcing conventional usage; but I don't know, we always want
to have a back-door I think to allow Guile hackers to heal broken
programs while they are running.)
So, that's the big-picture side of the semantics, mostly from the
perspective of a module A which uses a module B. However from the side
of the module being used, there are also some considerations.
If we assume that every top-level binding in the source program should
be present in its module, whether or not it is exported from the module
-- and given the ability of a syntax-case macro to produce new
references to arbitrary definitions within a module, I think we have to
assume this -- then any inlining of a top-level binding in a module will
be a copy. Sadly. Sadly as well, this rules out the possibility of
contification within a module, I think. But this leads to the
restriction that a definition is available for inlining only if it does
not have identity. In practice this limits inlining to constant
literals (i.e. '(foo) but not (list 'foo)) and procedures without free
variables.
I think we probably also have to be careful about definition order as
well, and here I don't fully understand the implications. Consider:
(define foo (lambda () qux))
(define bar (foo)) ;; error: qux unbound
(define qux 42)
Bindings at the top-level are similar to lambda* bindings though not
quite the same. Here we should get an error that qux is unbound. But,
if we compile the whole thing as a unit, it could be that inlining would
treat all definitions as simultaneously available, and it it could cause
this program to compile (erroneously). It's something we have to be
careful of, and is marginally related to "fixing letrec, reloaded".
I think that's all the semantic points I can think of. There are many
implementation considerations as well, but it's getting late so I figure
I bring this mail to a close. For what it's worth, I don't have this on
my schedule; if someone wants to pick up this project, it could be a
nice one. Anyway, semantics first: thoughts welcome :)
Happy hacking,
Andy
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: thoughts on top-level and cross-module inlining
2016-03-15 21:54 thoughts on top-level and cross-module inlining Andy Wingo
@ 2016-04-14 13:49 ` Ludovic Courtès
0 siblings, 0 replies; 2+ messages in thread
From: Ludovic Courtès @ 2016-04-14 13:49 UTC (permalink / raw)
To: Andy Wingo; +Cc: guile-devel
Hello! :-)
Andy Wingo <wingo@pobox.com> skribis:
> So, we have two questions, then: firstly, what semantic change can we
> make to the language to permit top-level and cross-module inlining?
> Secondly, how do we implement it?
As I view it, the first step would be to allow intra-module inlining
(inlining of unassigned private top-level bindings within the module.)
It seems to me that this would have little impact on semantics; there is
still the occasional, but not so rare, uses of @@ to bypass module
boundaries that we want to allow, though.
IMO a guiding principle should be that top-level inlining can always be
disabled. This “backdoor” as you write is important for many current
uses (Geiser, monkey-patching of third-party modules, etc.)
> On the semantic side, I propose to add a field in modules indicating
> whether the module permits its top-level references to be inlined.
That sounds like a good idea for cross-module inlining!
I wonder if a separate way to annotate candidates to intra-module (but
not necessarily cross-module) inlining would be needed. Maybe something
like the ‘declare’ forms found in CL and in some Schemes?
> Let us assume that assigned bindings are not available for inlining, but
> that declarative bindings can be inlined. Then, a program with inlining
> will be the same as a program without inlining if declarative bindings
> are never assigned from outside their compilation unit. That is what is
> meant by "used conventionally", and I think it reflects standard Guile
> usage, especially where modules are mostly compiled entirely within one
> compilation unit.
>
> (We should make an enhancement to the compiler to emit a warning when a
> declarative binding is used unconventionally. Perhaps we should emit a
> warning whenever an imported binding is "set!", perhaps with a view
> towards enforcing conventional usage; but I don't know, we always want
> to have a back-door I think to allow Guile hackers to heal broken
> programs while they are running.)
I think we want to preserve the ability to ‘set!’ third-party bindings,
indeed, in those cases where that is considered more important than
inlining. But I agree that being able to flag such unconventional usage
with compiler warnings would be helpful.
My 2¢,
Ludo’.
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2016-04-14 13:49 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-03-15 21:54 thoughts on top-level and cross-module inlining Andy Wingo
2016-04-14 13:49 ` Ludovic Courtès
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).