unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
* thoughts on targetting the web
@ 2021-06-19 20:20 Andy Wingo
  2021-06-19 22:24 ` Chris Lemmer-Webber
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Andy Wingo @ 2021-06-19 20:20 UTC (permalink / raw)
  To: guile-devel

Hi :)

A brain-dump tonight.  I was thinking about what it would mean for Guile
to target the web.  The goal would be for Guile to be a useful language
for programming client-side interaction on web pages.  Now, there are a
number of takes for Schemes of various stripes to target JS.  I haven't
made a full survey but my starting point is that in a Guile context, we
basically want the Guile Scheme language, which is to say:

 1. Tail calls.  The solution must include proper tail calls.

 2. Multiple values.  The solution should have the same semantics that
    Guile has for multiple-value returns, including implicit
    truncation.

 3. Varargs.  The solution should allow for rest arguments, keyword
    arguments, and the like.

 4. Delimited continuations.  The solution should allow non-local exit
    from any function, and should allow for capturing stack slices to
    partial continuations, and should allow those continuations to be
    instantiated multiple times if desired.  We should be able to target
    fibers to the web.  Otherwise, why bother?

 5. Garbage collection.  *We should re-use the host GC*.  Although it
    would be possible to manage a heap in linear memory, that has
    retention problems due to cycles between the Guile heap and the JS
    heap.

 6. Numbers.  We should support Scheme's unbounded integers.  Requires
    bignums in the limit case, but thankfully JS has bignums now.  We
    will still be able to unbox in many cases though.

 7. Target existing web.  We shouldn't depend on some future WebAssembly
    or JS proposal -- the solution should work in the here and now and
    only get better as features are added to the web platform.

From a UX perspective, I would expect we would generally want
whole-program compilation with aggressive DCE / tree-shaking, producing
a single JS or WebAssembly artifact at build-time.  But that's a later
issue.

I have thought about this off and on over the years but in the end was
flummoxed about how to meet all requirements.  However recently I think
I have come up with a solution for most of these:

 (1) In JS, tail calls are part of ES2015, but not implemented in all
     browsers.  In WebAssembly, they are a future plan, but hard for
     various reasons.  So in practice the solution for today's web is to
     use a universal trampoline and make all calls tail calls --
     i.e. call all functions via:

       while true:
         call(pop())
         
     Whenever we can target proper tail calls, this will only get
     faster.

 (2) Neither JS nor WebAssembly have the truncation semantics we want.
     Therefore we will return values via an explicit array, and then the
     continuation will be responsible for consuming those values and
     binding any needed variables.

 (3) Although JS does have varargs support, WebAssembly does not.  But
     we can actually use the same solution here as we use for (1) and
     (2) -- pass arguments on an explicit array + nvalues, and relying
     on the function to parse them appropriately.  In this way we can
     get Guile keyword arguments.  This also makes the type of Scheme
     functions uniform, which is important in WebAssembly contexts.

 (4) This is the interesting bit!  As hinted in (1), we will transform
     the program such that all calls are tail calls.  This is a form of
     minimal CPS conversion -- minimal in the sense that functions are
     broken up into the minimum number of pieces to ensure the
     all-tail-calls property.  Non-tail calls transform to tail calls by
     saving their continuation's state on a stack, which is the same as
     stack allocation for these continuations.  The continuation of the
     call pops the saved state from the stack.  Because the stack is
     explicit, we can manipulate it as data: slice it to capture a
     delimited continuation, drop values to make a non-local exit, splat
     a saved slice back on to compose a captured delimited continuation
     with the current continuation, and so on.  Therefore we have the
     necessary primitives to implement delimited continuations as a
     library.

 (5) Scheme needs a representation that can store any value.  In JS this
     is easy because JS is untyped.  For WebAssembly, I think I would
     lean on externref for this purpose, which effectively boxes all
     values.  There are no fixnums in the current WebAssembly spec, so
     this is suboptimal, and we have to call out to JS to evaluate type
     predicates and access fields.  But, support for structured
     GC-managed types is coming to WebAssembly, which will speed up
     object access.

 (6) The easy solution here is to make JS numbers, which are doubles at
     heart, represent flonums, and use JS bignums for Scheme integers.
     Fine.

 (7) This principle has been taken into account in (1)-(6).

Now, a note on the transformation described in (4), which I call
"tailification".

The first step of tailification computes the set of "tails" in a
function.  The function entry starts a tail, as does each return point
from non-tail calls.  Join points between different tails also start
tails.

In the residual program, there are four ways that a continuation exits:

  - Tail calls in the source program are tail calls in the residual
    program; no change.

  - For non-tail calls in the source program, the caller saves the state
    of the continuation (the live variables flowing into the
    continuation) on an explicit stack, and saves the label of the
    continuation.  The return continuation will be converted into a
    arity-checking function entry, to handle multi-value returns; when
    it is invoked, it will pop its incoming live variables from the
    continuation stack.

  - Terms that continue to a join continuation are converted to label
    calls in tail position, passing the state of the continuation as
    arguments.

  - Returning values from a continuation pops the return label from the
    stack and does an indirect tail label call on that label, with the
    given return values.

I expect that a tailified program will probably be slower than a
non-tailified program.  However a tailified program has a few
interesting properties: the stack is packed and only contains live data;
the stack can be traversed in a portable way, allowing for
implementation of prompts on systems that don't support them natively;
and as all calls are tail calls, the whole system can be implemented
naturally with a driver trampoline on targets that don't support tail
calls (e.g. JavaScript and WebAssembly).

This algorithm is implemented in the wip-tailify branch in Guile's git.

I have this little driver program:

    (use-modules (system base compile)
                 (system base language)
                 (language cps tailify)
                 (language cps verify)
                 (language cps renumber)
                 (language cps dce)
                 (language cps simplify)
                 (language cps dump))

    (define (tailify* form)
      (let ((from 'scheme)
            (make-lower (language-lowerer (lookup-language 'cps)))
            (optimization-level 2)
            (warning-level 2)
            (opts '()))
        (define cps
          ((make-lower optimization-level opts)
           (compile form #:from 'scheme #:to 'cps #:opts opts
                    #:optimization-level optimization-level
                    #:warning-level warning-level
                    #:env (current-module))
           (current-module)))
        (format #t "Original:\n")
        (dump cps)
        (format #t "\n\nTailified:\n")
        (let* ((cps (tailify cps))
               (cps (verify cps))
               (cps (eliminate-dead-code cps))
               (cps (simplify cps))
               (cps (renumber cps)))
          (dump cps))))

If I run this on the following lambda:

  (lambda (x)
    (pk x
        (if x
            (pk 12)
            (pk 42))))

Then first it prints the dump of the optimized CPS:

  Original:
  L0:
    v0 := self
    L1(...)
  L1:
    receive()
    v1 := current-module()                      ; module 
    cache-set![0](v1)
    v2 := const-fun L7                          ; _ 
    return v2

  L7:
    v3 := self
    L8(...)
  L8:
    v4 := receive(x)                            ; x 
    v5 := cache-ref[(0 . pk)]()                 ; cached 
    heap-object?(v5) ? L16() : L11()
  L16():
    L17(v5)
  L11():
    v6 := cache-ref[0]()                        ; mod 
    v7 := const pk                              ; name 
    v8 := lookup-bound(v6, v7)                  ; var 
    cache-set![(0 . pk)](v8)
    L17(v8)
  L17(v9):                                      ; box 
    v10 := scm-ref/immediate[(box . 1)](v9)     ; arg 
    false?(v4) ? L22() : L19()
  L22():
    v12 := const 42                             ; arg 
    call v10(v12)
    L25(receive(arg, rest...))
  L19():
    v11 := const 12                             ; arg 
    call v10(v11)
    L25(receive(arg, rest...))
  L25(v13, v14):                                ; tmp, tmp 
    tail call v10(v4, v13)

Then it prints the tailified version:

  L0:
    v0 := self
    L1(...)
  L1:
    receive()
    v1 := current-module()                      ; module 
    cache-set![0](v1)
    v2 := const-fun L8                          ; _ 
    v3 := restore1[ptr]()                       ; ret 
    tail calli v3(v2)

  L8:
    v4 := self
    L9(...)
  L9:
    v5 := receive(x)                            ; x 
    v6 := cache-ref[(0 . pk)]()                 ; cached 
    heap-object?(v6) ? L17() : L12()
  L17():
    L18(v6)
  L12():
    v7 := cache-ref[0]()                        ; mod 
    v8 := const pk                              ; name 
    v9 := lookup-bound(v7, v8)                  ; var 
    cache-set![(0 . pk)](v9)
    L18(v9)
  L18(v10):                                     ; box 
    v11 := scm-ref/immediate[(box . 1)](v10)    ; arg 
    false?(v5) ? L24() : L20()
  L24():
    v14 := const 42                             ; arg 
    v15 := code L37                             ; cont 
    save[(scm scm ptr)](v5, v11, v15)
    tail call v11(v14)
  L20():
    v12 := const 12                             ; arg 
    v13 := code L29                             ; cont 
    save[(scm scm ptr)](v5, v11, v13)
    tail call v11(v12)

  L29:
    L30(...)
  L30:
    v16, v17 := receive(arg, rest...)           ; tmp, tmp 
    v18, v19 := restore[(scm scm)]()            ; restored, restored 
    tail callk L34(_, v18, v19, v16)

  L34:
    meta: arg-representations: (scm scm scm)
    L35(...)
  L35(v20, v21, v22):                           ; _, _, _ 
    tail call v21(v20, v22)

  L37:
    L38(...)
  L38:
    v23, v24 := receive(arg, rest...)           ; tmp, tmp 
    v25, v26 := restore[(scm scm)]()            ; restored, restored 
    tail callk L34(_, v25, v26, v23)

I think I would like to implement the "save" and "restore" primcalls on
stock Guile, to be able to adequately test tailification.  But then I'd
look at targetting WebAssembly (just to make it interesting; run-time
performance would be faster no doubt if I targetted JS, due to GC
concerns).

Anyway, thoughts welcome.  Happy hacking :)

Andy



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: thoughts on targetting the web
  2021-06-19 20:20 thoughts on targetting the web Andy Wingo
@ 2021-06-19 22:24 ` Chris Lemmer-Webber
  2021-10-06 14:53   ` Christine Lemmer-Webber
  2021-06-20 10:02 ` Maxime Devos
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Chris Lemmer-Webber @ 2021-06-19 22:24 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

O M G!  So excited to read this email.  Yes!

Andy Wingo writes:

> Hi :)
>
> A brain-dump tonight.  I was thinking about what it would mean for Guile
> to target the web.  The goal would be for Guile to be a useful language
> for programming client-side interaction on web pages.  Now, there are a
> number of takes for Schemes of various stripes to target JS.  I haven't
> made a full survey but my starting point is that in a Guile context, we
> basically want the Guile Scheme language, which is to say:
>
>  1. Tail calls.  The solution must include proper tail calls.
>
>  2. Multiple values.  The solution should have the same semantics that
>     Guile has for multiple-value returns, including implicit
>     truncation.
>
>  3. Varargs.  The solution should allow for rest arguments, keyword
>     arguments, and the like.
>
>  4. Delimited continuations.  The solution should allow non-local exit
>     from any function, and should allow for capturing stack slices to
>     partial continuations, and should allow those continuations to be
>     instantiated multiple times if desired.  We should be able to target
>     fibers to the web.  Otherwise, why bother?
>
>  5. Garbage collection.  *We should re-use the host GC*.  Although it
>     would be possible to manage a heap in linear memory, that has
>     retention problems due to cycles between the Guile heap and the JS
>     heap.

It sounds like from when we spoke not long ago that progress towards the
WASM-GC proposal has been happening I think?  So this seems feasible?

However, I do think that a "proof of concept" GC would still be an
advancement.  Like, even lifting the dead-simple version from SICP as a
stopgap.  We'll all know it's not ideal, but it'll still be progress.
Getting somethign going and showing it is helpful.

That said, I also know how meh such a proposed approach can feel. ;)

>  6. Numbers.  We should support Scheme's unbounded integers.  Requires
>     bignums in the limit case, but thankfully JS has bignums now.  We
>     will still be able to unbox in many cases though.
>
>  7. Target existing web.  We shouldn't depend on some future WebAssembly
>     or JS proposal -- the solution should work in the here and now and
>     only get better as features are added to the web platform.

Yes.  Though, to jump back again to when you and I talked offline, I
think what I recommended was that as an early proof of concept, a good
idea might be to just have a small number of APIs, or even bet, make a
mini "fantasy console".  Here's an interesting example (and the one we
talked about, off-list):

  https://euhmeuh.github.io/wasm-adventure/

What's interesting about this first is that it's just a Racket program
that generates quasiquoted s-expression representation WASM.  Now,
obviously your compiler will do more interesting things than that.  But
the second interesting thing is the "fantasy console" component... a set
of custom interfaces to render a simple 8-bit aesthetic console.

My argument for this is: supporting "the existing web" is a big step.
Targeting a much smaller number of APIs, and being able to show the
world a demo that Guile's WASM compiler can do something cool... that's
very motivating, I think.

But an even simpler hello world would be to simply have access to
console.log, and write a scheme program that does ye olde Fibonacci
sequence and spits out the results on the page.  Not as exciting, but
that's the reasonable (and classic scheme) first step I think.

But yes, target the existing web... eventually!  Just don't get so hung
up on worrying about those interfaces that it distracts from making
something that can generate some WASM, is my thoughts.

> From a UX perspective, I would expect we would generally want
> whole-program compilation with aggressive DCE / tree-shaking, producing
> a single JS or WebAssembly artifact at build-time.  But that's a later
> issue.

Yes, focus on things working first... major optimizations can come later!

> I have thought about this off and on over the years but in the end was
> flummoxed about how to meet all requirements.  However recently I think
> I have come up with a solution for most of these:
>
>  (1) In JS, tail calls are part of ES2015, but not implemented in all
>      browsers.  In WebAssembly, they are a future plan, but hard for
>      various reasons.  So in practice the solution for today's web is to
>      use a universal trampoline and make all calls tail calls --
>      i.e. call all functions via:
>
>        while true:
>          call(pop())
>          
>      Whenever we can target proper tail calls, this will only get
>      faster.
>
>  (2) Neither JS nor WebAssembly have the truncation semantics we want.
>      Therefore we will return values via an explicit array, and then the
>      continuation will be responsible for consuming those values and
>      binding any needed variables.
>
>  (3) Although JS does have varargs support, WebAssembly does not.  But
>      we can actually use the same solution here as we use for (1) and
>      (2) -- pass arguments on an explicit array + nvalues, and relying
>      on the function to parse them appropriately.  In this way we can
>      get Guile keyword arguments.  This also makes the type of Scheme
>      functions uniform, which is important in WebAssembly contexts.
>
>  (4) This is the interesting bit!  As hinted in (1), we will transform
>      the program such that all calls are tail calls.  This is a form of
>      minimal CPS conversion -- minimal in the sense that functions are
>      broken up into the minimum number of pieces to ensure the
>      all-tail-calls property.  Non-tail calls transform to tail calls by
>      saving their continuation's state on a stack, which is the same as
>      stack allocation for these continuations.  The continuation of the
>      call pops the saved state from the stack.  Because the stack is
>      explicit, we can manipulate it as data: slice it to capture a
>      delimited continuation, drop values to make a non-local exit, splat
>      a saved slice back on to compose a captured delimited continuation
>      with the current continuation, and so on.  Therefore we have the
>      necessary primitives to implement delimited continuations as a
>      library.

I think I understand 15% of this, but the amount I could absorb in the
limited time I spent stewing on it sounds cool anyway.  Neat!  Most
importantly, you have a good idea that you think will work, so yay!

>  (5) Scheme needs a representation that can store any value.  In JS this
>      is easy because JS is untyped.  For WebAssembly, I think I would
>      lean on externref for this purpose, which effectively boxes all
>      values.  There are no fixnums in the current WebAssembly spec, so
>      this is suboptimal, and we have to call out to JS to evaluate type
>      predicates and access fields.  But, support for structured
>      GC-managed types is coming to WebAssembly, which will speed up
>      object access.
>
>  (6) The easy solution here is to make JS numbers, which are doubles at
>      heart, represent flonums, and use JS bignums for Scheme integers.
>      Fine.
>
>  (7) This principle has been taken into account in (1)-(6).
>
> Now, a note on the transformation described in (4), which I call
> "tailification".
>
> The first step of tailification computes the set of "tails" in a
> function.  The function entry starts a tail, as does each return point
> from non-tail calls.  Join points between different tails also start
> tails.
>
> In the residual program, there are four ways that a continuation exits:
>
>   - Tail calls in the source program are tail calls in the residual
>     program; no change.
>
>   - For non-tail calls in the source program, the caller saves the state
>     of the continuation (the live variables flowing into the
>     continuation) on an explicit stack, and saves the label of the
>     continuation.  The return continuation will be converted into a
>     arity-checking function entry, to handle multi-value returns; when
>     it is invoked, it will pop its incoming live variables from the
>     continuation stack.
>
>   - Terms that continue to a join continuation are converted to label
>     calls in tail position, passing the state of the continuation as
>     arguments.
>
>   - Returning values from a continuation pops the return label from the
>     stack and does an indirect tail label call on that label, with the
>     given return values.
>
> I expect that a tailified program will probably be slower than a
> non-tailified program.  However a tailified program has a few
> interesting properties: the stack is packed and only contains live data;
> the stack can be traversed in a portable way, allowing for
> implementation of prompts on systems that don't support them natively;
> and as all calls are tail calls, the whole system can be implemented
> naturally with a driver trampoline on targets that don't support tail
> calls (e.g. JavaScript and WebAssembly).
>
> This algorithm is implemented in the wip-tailify branch in Guile's git.
>
> I have this little driver program:
>
>     (use-modules (system base compile)
>                  (system base language)
>                  (language cps tailify)
>                  (language cps verify)
>                  (language cps renumber)
>                  (language cps dce)
>                  (language cps simplify)
>                  (language cps dump))
>
>     (define (tailify* form)
>       (let ((from 'scheme)
>             (make-lower (language-lowerer (lookup-language 'cps)))
>             (optimization-level 2)
>             (warning-level 2)
>             (opts '()))
>         (define cps
>           ((make-lower optimization-level opts)
>            (compile form #:from 'scheme #:to 'cps #:opts opts
>                     #:optimization-level optimization-level
>                     #:warning-level warning-level
>                     #:env (current-module))
>            (current-module)))
>         (format #t "Original:\n")
>         (dump cps)
>         (format #t "\n\nTailified:\n")
>         (let* ((cps (tailify cps))
>                (cps (verify cps))
>                (cps (eliminate-dead-code cps))
>                (cps (simplify cps))
>                (cps (renumber cps)))
>           (dump cps))))
>
> If I run this on the following lambda:
>
>   (lambda (x)
>     (pk x
>         (if x
>             (pk 12)
>             (pk 42))))
>
> Then first it prints the dump of the optimized CPS:
>
>   Original:
>   L0:
>     v0 := self
>     L1(...)
>   L1:
>     receive()
>     v1 := current-module()                      ; module 
>     cache-set![0](v1)
>     v2 := const-fun L7                          ; _ 
>     return v2
>
>   L7:
>     v3 := self
>     L8(...)
>   L8:
>     v4 := receive(x)                            ; x 
>     v5 := cache-ref[(0 . pk)]()                 ; cached 
>     heap-object?(v5) ? L16() : L11()
>   L16():
>     L17(v5)
>   L11():
>     v6 := cache-ref[0]()                        ; mod 
>     v7 := const pk                              ; name 
>     v8 := lookup-bound(v6, v7)                  ; var 
>     cache-set![(0 . pk)](v8)
>     L17(v8)
>   L17(v9):                                      ; box 
>     v10 := scm-ref/immediate[(box . 1)](v9)     ; arg 
>     false?(v4) ? L22() : L19()
>   L22():
>     v12 := const 42                             ; arg 
>     call v10(v12)
>     L25(receive(arg, rest...))
>   L19():
>     v11 := const 12                             ; arg 
>     call v10(v11)
>     L25(receive(arg, rest...))
>   L25(v13, v14):                                ; tmp, tmp 
>     tail call v10(v4, v13)
>
> Then it prints the tailified version:
>
>   L0:
>     v0 := self
>     L1(...)
>   L1:
>     receive()
>     v1 := current-module()                      ; module 
>     cache-set![0](v1)
>     v2 := const-fun L8                          ; _ 
>     v3 := restore1[ptr]()                       ; ret 
>     tail calli v3(v2)
>
>   L8:
>     v4 := self
>     L9(...)
>   L9:
>     v5 := receive(x)                            ; x 
>     v6 := cache-ref[(0 . pk)]()                 ; cached 
>     heap-object?(v6) ? L17() : L12()
>   L17():
>     L18(v6)
>   L12():
>     v7 := cache-ref[0]()                        ; mod 
>     v8 := const pk                              ; name 
>     v9 := lookup-bound(v7, v8)                  ; var 
>     cache-set![(0 . pk)](v9)
>     L18(v9)
>   L18(v10):                                     ; box 
>     v11 := scm-ref/immediate[(box . 1)](v10)    ; arg 
>     false?(v5) ? L24() : L20()
>   L24():
>     v14 := const 42                             ; arg 
>     v15 := code L37                             ; cont 
>     save[(scm scm ptr)](v5, v11, v15)
>     tail call v11(v14)
>   L20():
>     v12 := const 12                             ; arg 
>     v13 := code L29                             ; cont 
>     save[(scm scm ptr)](v5, v11, v13)
>     tail call v11(v12)
>
>   L29:
>     L30(...)
>   L30:
>     v16, v17 := receive(arg, rest...)           ; tmp, tmp 
>     v18, v19 := restore[(scm scm)]()            ; restored, restored 
>     tail callk L34(_, v18, v19, v16)
>
>   L34:
>     meta: arg-representations: (scm scm scm)
>     L35(...)
>   L35(v20, v21, v22):                           ; _, _, _ 
>     tail call v21(v20, v22)
>
>   L37:
>     L38(...)
>   L38:
>     v23, v24 := receive(arg, rest...)           ; tmp, tmp 
>     v25, v26 := restore[(scm scm)]()            ; restored, restored 
>     tail callk L34(_, v25, v26, v23)
>
> I think I would like to implement the "save" and "restore" primcalls on
> stock Guile, to be able to adequately test tailification.  But then I'd
> look at targetting WebAssembly (just to make it interesting; run-time
> performance would be faster no doubt if I targetted JS, due to GC
> concerns).
>
> Anyway, thoughts welcome.  Happy hacking :)
>
> Andy

This is all great.  I think getting something that works well enough to
write some proof-of-concept demos in is the real goal.  If you can make
something that can compile just enough to WASM where me adding the
fantasy console stuff would help (I might just lift wasm-adventure's),
then I think we could make some pretty cool demos for the world.  Maybe
we could even have a special guile-fantasy-console gamejam? ;)

Anyway exciting, exciting, exciting!  My main advice in all of it: don't
let great be the enemy of good-enough, and all that.  Getting to the
point where we're able to do *something* is still useful, if the hooks
are in place to iterate from good-enough to great over time.

Yay!

 - Chris



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: thoughts on targetting the web
  2021-06-19 20:20 thoughts on targetting the web Andy Wingo
  2021-06-19 22:24 ` Chris Lemmer-Webber
@ 2021-06-20 10:02 ` Maxime Devos
  2021-06-20 14:14   ` Chris Lemmer-Webber
  2021-10-06 23:34 ` Dr. Arne Babenhauserheide
  2021-10-07  1:37 ` Nala Ginrut
  3 siblings, 1 reply; 8+ messages in thread
From: Maxime Devos @ 2021-06-20 10:02 UTC (permalink / raw)
  To: Andy Wingo, guile-devel

[-- Attachment #1: Type: text/plain, Size: 758 bytes --]

Andy Wingo schreef op za 19-06-2021 om 22:20 [+0200]:
>  5. Garbage collection.  *We should re-use the host GC*.  Although it
>     would be possible to manage a heap in linear memory, that has
>     retention problems due to cycles between the Guile heap and the JS
>     heap.

I could be mistaken (and I haven't written any ECMAScript in a long time),
but I believe ECMAScript doesn't have guardians, gc hooks, weak vectors and
(key, value, key-value) weak hash tables. So, if we re-use the host GC,
that would mean those GC things cannot be used right?

In that case, it may be a good idea to raise an error at compile time
if some code tries to use these anyways. (I've been using guardians
and weak vectors lately.)

Greetings,
Maxime

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: thoughts on targetting the web
  2021-06-20 10:02 ` Maxime Devos
@ 2021-06-20 14:14   ` Chris Lemmer-Webber
  2021-06-20 20:55     ` Maxime Devos
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Lemmer-Webber @ 2021-06-20 14:14 UTC (permalink / raw)
  To: Maxime Devos; +Cc: Andy Wingo, guile-devel

Maxime Devos writes:

> Andy Wingo schreef op za 19-06-2021 om 22:20 [+0200]:
>>  5. Garbage collection.  *We should re-use the host GC*.  Although it
>>     would be possible to manage a heap in linear memory, that has
>>     retention problems due to cycles between the Guile heap and the JS
>>     heap.
>
> I could be mistaken (and I haven't written any ECMAScript in a long time),
> but I believe ECMAScript doesn't have guardians, gc hooks, weak vectors and
> (key, value, key-value) weak hash tables. So, if we re-use the host GC,
> that would mean those GC things cannot be used right?
>
> In that case, it may be a good idea to raise an error at compile time
> if some code tries to use these anyways. (I've been using guardians
> and weak vectors lately.)
>
> Greetings,
> Maxime

Weakmaps are a thing these days I think:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/WeakMap

Weak vectors, not so sure.  (Never used them myself anyway.)



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: thoughts on targetting the web
  2021-06-20 14:14   ` Chris Lemmer-Webber
@ 2021-06-20 20:55     ` Maxime Devos
  0 siblings, 0 replies; 8+ messages in thread
From: Maxime Devos @ 2021-06-20 20:55 UTC (permalink / raw)
  To: Chris Lemmer-Webber; +Cc: Andy Wingo, guile-devel

[-- Attachment #1: Type: text/plain, Size: 670 bytes --]

Chris Lemmer-Webber schreef op zo 20-06-2021 om 10:14 [-0400]:
> Weakmaps are a thing these days I think:
> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/WeakMap

Nice, though it doesn't allow enumerating keys
(which is reasonable enough in practice)
and it doesn't support weak-value maps (where the reference
to the value is weak) and doubly-weak maps.

(I never used weak-value or doubly-weak maps though)

Also (unrelated), I have patch for ephemeral-key hash tables in Guile,
but it's buggy (the values sometimes become random heap objects
in some situations??) so I'll try to debug it first.

Greetings,
Maxime.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: thoughts on targetting the web
  2021-06-19 22:24 ` Chris Lemmer-Webber
@ 2021-10-06 14:53   ` Christine Lemmer-Webber
  0 siblings, 0 replies; 8+ messages in thread
From: Christine Lemmer-Webber @ 2021-10-06 14:53 UTC (permalink / raw)
  Cc: Andy Wingo, guile-devel

Chris Lemmer-Webber <cwebber@dustycloud.org> writes:

> O M G!  So excited to read this email.  Yes!

So, re-poiking this thread.  I'm still just as excited as I was when I
sent this some time ago!

Any updates?  How can we help?  Can we test things?  Poke at something?
Contribute something?'

I'm excited for this future!

> Andy Wingo writes:
>
>> Hi :)
>>
>> A brain-dump tonight.  I was thinking about what it would mean for Guile
>> to target the web.  The goal would be for Guile to be a useful language
>> for programming client-side interaction on web pages.  Now, there are a
>> number of takes for Schemes of various stripes to target JS.  I haven't
>> made a full survey but my starting point is that in a Guile context, we
>> basically want the Guile Scheme language, which is to say:
>>
>>  1. Tail calls.  The solution must include proper tail calls.
>>
>>  2. Multiple values.  The solution should have the same semantics that
>>     Guile has for multiple-value returns, including implicit
>>     truncation.
>>
>>  3. Varargs.  The solution should allow for rest arguments, keyword
>>     arguments, and the like.
>>
>>  4. Delimited continuations.  The solution should allow non-local exit
>>     from any function, and should allow for capturing stack slices to
>>     partial continuations, and should allow those continuations to be
>>     instantiated multiple times if desired.  We should be able to target
>>     fibers to the web.  Otherwise, why bother?
>>
>>  5. Garbage collection.  *We should re-use the host GC*.  Although it
>>     would be possible to manage a heap in linear memory, that has
>>     retention problems due to cycles between the Guile heap and the JS
>>     heap.
>
> It sounds like from when we spoke not long ago that progress towards the
> WASM-GC proposal has been happening I think?  So this seems feasible?
>
> However, I do think that a "proof of concept" GC would still be an
> advancement.  Like, even lifting the dead-simple version from SICP as a
> stopgap.  We'll all know it's not ideal, but it'll still be progress.
> Getting somethign going and showing it is helpful.
>
> That said, I also know how meh such a proposed approach can feel. ;)
>
>>  6. Numbers.  We should support Scheme's unbounded integers.  Requires
>>     bignums in the limit case, but thankfully JS has bignums now.  We
>>     will still be able to unbox in many cases though.
>>
>>  7. Target existing web.  We shouldn't depend on some future WebAssembly
>>     or JS proposal -- the solution should work in the here and now and
>>     only get better as features are added to the web platform.
>
> Yes.  Though, to jump back again to when you and I talked offline, I
> think what I recommended was that as an early proof of concept, a good
> idea might be to just have a small number of APIs, or even bet, make a
> mini "fantasy console".  Here's an interesting example (and the one we
> talked about, off-list):
>
>   https://euhmeuh.github.io/wasm-adventure/
>
> What's interesting about this first is that it's just a Racket program
> that generates quasiquoted s-expression representation WASM.  Now,
> obviously your compiler will do more interesting things than that.  But
> the second interesting thing is the "fantasy console" component... a set
> of custom interfaces to render a simple 8-bit aesthetic console.
>
> My argument for this is: supporting "the existing web" is a big step.
> Targeting a much smaller number of APIs, and being able to show the
> world a demo that Guile's WASM compiler can do something cool... that's
> very motivating, I think.
>
> But an even simpler hello world would be to simply have access to
> console.log, and write a scheme program that does ye olde Fibonacci
> sequence and spits out the results on the page.  Not as exciting, but
> that's the reasonable (and classic scheme) first step I think.
>
> But yes, target the existing web... eventually!  Just don't get so hung
> up on worrying about those interfaces that it distracts from making
> something that can generate some WASM, is my thoughts.
>
>> From a UX perspective, I would expect we would generally want
>> whole-program compilation with aggressive DCE / tree-shaking, producing
>> a single JS or WebAssembly artifact at build-time.  But that's a later
>> issue.
>
> Yes, focus on things working first... major optimizations can come later!
>
>> I have thought about this off and on over the years but in the end was
>> flummoxed about how to meet all requirements.  However recently I think
>> I have come up with a solution for most of these:
>>
>>  (1) In JS, tail calls are part of ES2015, but not implemented in all
>>      browsers.  In WebAssembly, they are a future plan, but hard for
>>      various reasons.  So in practice the solution for today's web is to
>>      use a universal trampoline and make all calls tail calls --
>>      i.e. call all functions via:
>>
>>        while true:
>>          call(pop())
>>          
>>      Whenever we can target proper tail calls, this will only get
>>      faster.
>>
>>  (2) Neither JS nor WebAssembly have the truncation semantics we want.
>>      Therefore we will return values via an explicit array, and then the
>>      continuation will be responsible for consuming those values and
>>      binding any needed variables.
>>
>>  (3) Although JS does have varargs support, WebAssembly does not.  But
>>      we can actually use the same solution here as we use for (1) and
>>      (2) -- pass arguments on an explicit array + nvalues, and relying
>>      on the function to parse them appropriately.  In this way we can
>>      get Guile keyword arguments.  This also makes the type of Scheme
>>      functions uniform, which is important in WebAssembly contexts.
>>
>>  (4) This is the interesting bit!  As hinted in (1), we will transform
>>      the program such that all calls are tail calls.  This is a form of
>>      minimal CPS conversion -- minimal in the sense that functions are
>>      broken up into the minimum number of pieces to ensure the
>>      all-tail-calls property.  Non-tail calls transform to tail calls by
>>      saving their continuation's state on a stack, which is the same as
>>      stack allocation for these continuations.  The continuation of the
>>      call pops the saved state from the stack.  Because the stack is
>>      explicit, we can manipulate it as data: slice it to capture a
>>      delimited continuation, drop values to make a non-local exit, splat
>>      a saved slice back on to compose a captured delimited continuation
>>      with the current continuation, and so on.  Therefore we have the
>>      necessary primitives to implement delimited continuations as a
>>      library.
>
> I think I understand 15% of this, but the amount I could absorb in the
> limited time I spent stewing on it sounds cool anyway.  Neat!  Most
> importantly, you have a good idea that you think will work, so yay!
>
>>  (5) Scheme needs a representation that can store any value.  In JS this
>>      is easy because JS is untyped.  For WebAssembly, I think I would
>>      lean on externref for this purpose, which effectively boxes all
>>      values.  There are no fixnums in the current WebAssembly spec, so
>>      this is suboptimal, and we have to call out to JS to evaluate type
>>      predicates and access fields.  But, support for structured
>>      GC-managed types is coming to WebAssembly, which will speed up
>>      object access.
>>
>>  (6) The easy solution here is to make JS numbers, which are doubles at
>>      heart, represent flonums, and use JS bignums for Scheme integers.
>>      Fine.
>>
>>  (7) This principle has been taken into account in (1)-(6).
>>
>> Now, a note on the transformation described in (4), which I call
>> "tailification".
>>
>> The first step of tailification computes the set of "tails" in a
>> function.  The function entry starts a tail, as does each return point
>> from non-tail calls.  Join points between different tails also start
>> tails.
>>
>> In the residual program, there are four ways that a continuation exits:
>>
>>   - Tail calls in the source program are tail calls in the residual
>>     program; no change.
>>
>>   - For non-tail calls in the source program, the caller saves the state
>>     of the continuation (the live variables flowing into the
>>     continuation) on an explicit stack, and saves the label of the
>>     continuation.  The return continuation will be converted into a
>>     arity-checking function entry, to handle multi-value returns; when
>>     it is invoked, it will pop its incoming live variables from the
>>     continuation stack.
>>
>>   - Terms that continue to a join continuation are converted to label
>>     calls in tail position, passing the state of the continuation as
>>     arguments.
>>
>>   - Returning values from a continuation pops the return label from the
>>     stack and does an indirect tail label call on that label, with the
>>     given return values.
>>
>> I expect that a tailified program will probably be slower than a
>> non-tailified program.  However a tailified program has a few
>> interesting properties: the stack is packed and only contains live data;
>> the stack can be traversed in a portable way, allowing for
>> implementation of prompts on systems that don't support them natively;
>> and as all calls are tail calls, the whole system can be implemented
>> naturally with a driver trampoline on targets that don't support tail
>> calls (e.g. JavaScript and WebAssembly).
>>
>> This algorithm is implemented in the wip-tailify branch in Guile's git.
>>
>> I have this little driver program:
>>
>>     (use-modules (system base compile)
>>                  (system base language)
>>                  (language cps tailify)
>>                  (language cps verify)
>>                  (language cps renumber)
>>                  (language cps dce)
>>                  (language cps simplify)
>>                  (language cps dump))
>>
>>     (define (tailify* form)
>>       (let ((from 'scheme)
>>             (make-lower (language-lowerer (lookup-language 'cps)))
>>             (optimization-level 2)
>>             (warning-level 2)
>>             (opts '()))
>>         (define cps
>>           ((make-lower optimization-level opts)
>>            (compile form #:from 'scheme #:to 'cps #:opts opts
>>                     #:optimization-level optimization-level
>>                     #:warning-level warning-level
>>                     #:env (current-module))
>>            (current-module)))
>>         (format #t "Original:\n")
>>         (dump cps)
>>         (format #t "\n\nTailified:\n")
>>         (let* ((cps (tailify cps))
>>                (cps (verify cps))
>>                (cps (eliminate-dead-code cps))
>>                (cps (simplify cps))
>>                (cps (renumber cps)))
>>           (dump cps))))
>>
>> If I run this on the following lambda:
>>
>>   (lambda (x)
>>     (pk x
>>         (if x
>>             (pk 12)
>>             (pk 42))))
>>
>> Then first it prints the dump of the optimized CPS:
>>
>>   Original:
>>   L0:
>>     v0 := self
>>     L1(...)
>>   L1:
>>     receive()
>>     v1 := current-module()                      ; module 
>>     cache-set![0](v1)
>>     v2 := const-fun L7                          ; _ 
>>     return v2
>>
>>   L7:
>>     v3 := self
>>     L8(...)
>>   L8:
>>     v4 := receive(x)                            ; x 
>>     v5 := cache-ref[(0 . pk)]()                 ; cached 
>>     heap-object?(v5) ? L16() : L11()
>>   L16():
>>     L17(v5)
>>   L11():
>>     v6 := cache-ref[0]()                        ; mod 
>>     v7 := const pk                              ; name 
>>     v8 := lookup-bound(v6, v7)                  ; var 
>>     cache-set![(0 . pk)](v8)
>>     L17(v8)
>>   L17(v9):                                      ; box 
>>     v10 := scm-ref/immediate[(box . 1)](v9)     ; arg 
>>     false?(v4) ? L22() : L19()
>>   L22():
>>     v12 := const 42                             ; arg 
>>     call v10(v12)
>>     L25(receive(arg, rest...))
>>   L19():
>>     v11 := const 12                             ; arg 
>>     call v10(v11)
>>     L25(receive(arg, rest...))
>>   L25(v13, v14):                                ; tmp, tmp 
>>     tail call v10(v4, v13)
>>
>> Then it prints the tailified version:
>>
>>   L0:
>>     v0 := self
>>     L1(...)
>>   L1:
>>     receive()
>>     v1 := current-module()                      ; module 
>>     cache-set![0](v1)
>>     v2 := const-fun L8                          ; _ 
>>     v3 := restore1[ptr]()                       ; ret 
>>     tail calli v3(v2)
>>
>>   L8:
>>     v4 := self
>>     L9(...)
>>   L9:
>>     v5 := receive(x)                            ; x 
>>     v6 := cache-ref[(0 . pk)]()                 ; cached 
>>     heap-object?(v6) ? L17() : L12()
>>   L17():
>>     L18(v6)
>>   L12():
>>     v7 := cache-ref[0]()                        ; mod 
>>     v8 := const pk                              ; name 
>>     v9 := lookup-bound(v7, v8)                  ; var 
>>     cache-set![(0 . pk)](v9)
>>     L18(v9)
>>   L18(v10):                                     ; box 
>>     v11 := scm-ref/immediate[(box . 1)](v10)    ; arg 
>>     false?(v5) ? L24() : L20()
>>   L24():
>>     v14 := const 42                             ; arg 
>>     v15 := code L37                             ; cont 
>>     save[(scm scm ptr)](v5, v11, v15)
>>     tail call v11(v14)
>>   L20():
>>     v12 := const 12                             ; arg 
>>     v13 := code L29                             ; cont 
>>     save[(scm scm ptr)](v5, v11, v13)
>>     tail call v11(v12)
>>
>>   L29:
>>     L30(...)
>>   L30:
>>     v16, v17 := receive(arg, rest...)           ; tmp, tmp 
>>     v18, v19 := restore[(scm scm)]()            ; restored, restored 
>>     tail callk L34(_, v18, v19, v16)
>>
>>   L34:
>>     meta: arg-representations: (scm scm scm)
>>     L35(...)
>>   L35(v20, v21, v22):                           ; _, _, _ 
>>     tail call v21(v20, v22)
>>
>>   L37:
>>     L38(...)
>>   L38:
>>     v23, v24 := receive(arg, rest...)           ; tmp, tmp 
>>     v25, v26 := restore[(scm scm)]()            ; restored, restored 
>>     tail callk L34(_, v25, v26, v23)
>>
>> I think I would like to implement the "save" and "restore" primcalls on
>> stock Guile, to be able to adequately test tailification.  But then I'd
>> look at targetting WebAssembly (just to make it interesting; run-time
>> performance would be faster no doubt if I targetted JS, due to GC
>> concerns).
>>
>> Anyway, thoughts welcome.  Happy hacking :)
>>
>> Andy
>
> This is all great.  I think getting something that works well enough to
> write some proof-of-concept demos in is the real goal.  If you can make
> something that can compile just enough to WASM where me adding the
> fantasy console stuff would help (I might just lift wasm-adventure's),
> then I think we could make some pretty cool demos for the world.  Maybe
> we could even have a special guile-fantasy-console gamejam? ;)
>
> Anyway exciting, exciting, exciting!  My main advice in all of it: don't
> let great be the enemy of good-enough, and all that.  Getting to the
> point where we're able to do *something* is still useful, if the hooks
> are in place to iterate from good-enough to great over time.
>
> Yay!
>
>  - Chris




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: thoughts on targetting the web
  2021-06-19 20:20 thoughts on targetting the web Andy Wingo
  2021-06-19 22:24 ` Chris Lemmer-Webber
  2021-06-20 10:02 ` Maxime Devos
@ 2021-10-06 23:34 ` Dr. Arne Babenhauserheide
  2021-10-07  1:37 ` Nala Ginrut
  3 siblings, 0 replies; 8+ messages in thread
From: Dr. Arne Babenhauserheide @ 2021-10-06 23:34 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 1045 bytes --]


Andy Wingo <wingo@pobox.com> writes:

> Anyway, thoughts welcome.  Happy hacking :)

This sounds great — thank you!

I personally would already be happy if I could start by replacing my
small biwascheme utility by guile-based webassembly:

https://www.draketo.de/software/matrikellanguage.scm

Currently used via 

<script src="biwascheme.js">
  (load "matrikellanguage.scm")
  (let loop ()
      (wait-for "#matrikelnummer" "input")
      (set-content! "#result" (matrikel->pair (get-content "#matrikelnummer")))
      (console-log "ok.")
      (loop))
</script>

on https://www.draketo.de/software/vorlesung-netztechnik#nummer-zu-sprache


Doing this in biwascheme looks good for this small utility, but when I
tried to do anything bigger, I found that biwascheme does not provide
the namespacing of Scheme, so I broke variables. It looks like scheme
and is pretty nice to use, but it is not really Scheme.


Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein,
ohne es zu merken.
draketo.de

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 1125 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: thoughts on targetting the web
  2021-06-19 20:20 thoughts on targetting the web Andy Wingo
                   ` (2 preceding siblings ...)
  2021-10-06 23:34 ` Dr. Arne Babenhauserheide
@ 2021-10-07  1:37 ` Nala Ginrut
  3 siblings, 0 replies; 8+ messages in thread
From: Nala Ginrut @ 2021-10-07  1:37 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 11913 bytes --]

I'm looking forward to it, and I'm pretty sure this can bring new
possibilities to Artanis!

Best regards.

On Sun, Jun 20, 2021, 04:21 Andy Wingo <wingo@pobox.com> wrote:

> Hi :)
>
> A brain-dump tonight.  I was thinking about what it would mean for Guile
> to target the web.  The goal would be for Guile to be a useful language
> for programming client-side interaction on web pages.  Now, there are a
> number of takes for Schemes of various stripes to target JS.  I haven't
> made a full survey but my starting point is that in a Guile context, we
> basically want the Guile Scheme language, which is to say:
>
>  1. Tail calls.  The solution must include proper tail calls.
>
>  2. Multiple values.  The solution should have the same semantics that
>     Guile has for multiple-value returns, including implicit
>     truncation.
>
>  3. Varargs.  The solution should allow for rest arguments, keyword
>     arguments, and the like.
>
>  4. Delimited continuations.  The solution should allow non-local exit
>     from any function, and should allow for capturing stack slices to
>     partial continuations, and should allow those continuations to be
>     instantiated multiple times if desired.  We should be able to target
>     fibers to the web.  Otherwise, why bother?
>
>  5. Garbage collection.  *We should re-use the host GC*.  Although it
>     would be possible to manage a heap in linear memory, that has
>     retention problems due to cycles between the Guile heap and the JS
>     heap.
>
>  6. Numbers.  We should support Scheme's unbounded integers.  Requires
>     bignums in the limit case, but thankfully JS has bignums now.  We
>     will still be able to unbox in many cases though.
>
>  7. Target existing web.  We shouldn't depend on some future WebAssembly
>     or JS proposal -- the solution should work in the here and now and
>     only get better as features are added to the web platform.
>
> From a UX perspective, I would expect we would generally want
> whole-program compilation with aggressive DCE / tree-shaking, producing
> a single JS or WebAssembly artifact at build-time.  But that's a later
> issue.
>
> I have thought about this off and on over the years but in the end was
> flummoxed about how to meet all requirements.  However recently I think
> I have come up with a solution for most of these:
>
>  (1) In JS, tail calls are part of ES2015, but not implemented in all
>      browsers.  In WebAssembly, they are a future plan, but hard for
>      various reasons.  So in practice the solution for today's web is to
>      use a universal trampoline and make all calls tail calls --
>      i.e. call all functions via:
>
>        while true:
>          call(pop())
>
>      Whenever we can target proper tail calls, this will only get
>      faster.
>
>  (2) Neither JS nor WebAssembly have the truncation semantics we want.
>      Therefore we will return values via an explicit array, and then the
>      continuation will be responsible for consuming those values and
>      binding any needed variables.
>
>  (3) Although JS does have varargs support, WebAssembly does not.  But
>      we can actually use the same solution here as we use for (1) and
>      (2) -- pass arguments on an explicit array + nvalues, and relying
>      on the function to parse them appropriately.  In this way we can
>      get Guile keyword arguments.  This also makes the type of Scheme
>      functions uniform, which is important in WebAssembly contexts.
>
>  (4) This is the interesting bit!  As hinted in (1), we will transform
>      the program such that all calls are tail calls.  This is a form of
>      minimal CPS conversion -- minimal in the sense that functions are
>      broken up into the minimum number of pieces to ensure the
>      all-tail-calls property.  Non-tail calls transform to tail calls by
>      saving their continuation's state on a stack, which is the same as
>      stack allocation for these continuations.  The continuation of the
>      call pops the saved state from the stack.  Because the stack is
>      explicit, we can manipulate it as data: slice it to capture a
>      delimited continuation, drop values to make a non-local exit, splat
>      a saved slice back on to compose a captured delimited continuation
>      with the current continuation, and so on.  Therefore we have the
>      necessary primitives to implement delimited continuations as a
>      library.
>
>  (5) Scheme needs a representation that can store any value.  In JS this
>      is easy because JS is untyped.  For WebAssembly, I think I would
>      lean on externref for this purpose, which effectively boxes all
>      values.  There are no fixnums in the current WebAssembly spec, so
>      this is suboptimal, and we have to call out to JS to evaluate type
>      predicates and access fields.  But, support for structured
>      GC-managed types is coming to WebAssembly, which will speed up
>      object access.
>
>  (6) The easy solution here is to make JS numbers, which are doubles at
>      heart, represent flonums, and use JS bignums for Scheme integers.
>      Fine.
>
>  (7) This principle has been taken into account in (1)-(6).
>
> Now, a note on the transformation described in (4), which I call
> "tailification".
>
> The first step of tailification computes the set of "tails" in a
> function.  The function entry starts a tail, as does each return point
> from non-tail calls.  Join points between different tails also start
> tails.
>
> In the residual program, there are four ways that a continuation exits:
>
>   - Tail calls in the source program are tail calls in the residual
>     program; no change.
>
>   - For non-tail calls in the source program, the caller saves the state
>     of the continuation (the live variables flowing into the
>     continuation) on an explicit stack, and saves the label of the
>     continuation.  The return continuation will be converted into a
>     arity-checking function entry, to handle multi-value returns; when
>     it is invoked, it will pop its incoming live variables from the
>     continuation stack.
>
>   - Terms that continue to a join continuation are converted to label
>     calls in tail position, passing the state of the continuation as
>     arguments.
>
>   - Returning values from a continuation pops the return label from the
>     stack and does an indirect tail label call on that label, with the
>     given return values.
>
> I expect that a tailified program will probably be slower than a
> non-tailified program.  However a tailified program has a few
> interesting properties: the stack is packed and only contains live data;
> the stack can be traversed in a portable way, allowing for
> implementation of prompts on systems that don't support them natively;
> and as all calls are tail calls, the whole system can be implemented
> naturally with a driver trampoline on targets that don't support tail
> calls (e.g. JavaScript and WebAssembly).
>
> This algorithm is implemented in the wip-tailify branch in Guile's git.
>
> I have this little driver program:
>
>     (use-modules (system base compile)
>                  (system base language)
>                  (language cps tailify)
>                  (language cps verify)
>                  (language cps renumber)
>                  (language cps dce)
>                  (language cps simplify)
>                  (language cps dump))
>
>     (define (tailify* form)
>       (let ((from 'scheme)
>             (make-lower (language-lowerer (lookup-language 'cps)))
>             (optimization-level 2)
>             (warning-level 2)
>             (opts '()))
>         (define cps
>           ((make-lower optimization-level opts)
>            (compile form #:from 'scheme #:to 'cps #:opts opts
>                     #:optimization-level optimization-level
>                     #:warning-level warning-level
>                     #:env (current-module))
>            (current-module)))
>         (format #t "Original:\n")
>         (dump cps)
>         (format #t "\n\nTailified:\n")
>         (let* ((cps (tailify cps))
>                (cps (verify cps))
>                (cps (eliminate-dead-code cps))
>                (cps (simplify cps))
>                (cps (renumber cps)))
>           (dump cps))))
>
> If I run this on the following lambda:
>
>   (lambda (x)
>     (pk x
>         (if x
>             (pk 12)
>             (pk 42))))
>
> Then first it prints the dump of the optimized CPS:
>
>   Original:
>   L0:
>     v0 := self
>     L1(...)
>   L1:
>     receive()
>     v1 := current-module()                      ; module
>     cache-set![0](v1)
>     v2 := const-fun L7                          ; _
>     return v2
>
>   L7:
>     v3 := self
>     L8(...)
>   L8:
>     v4 := receive(x)                            ; x
>     v5 := cache-ref[(0 . pk)]()                 ; cached
>     heap-object?(v5) ? L16() : L11()
>   L16():
>     L17(v5)
>   L11():
>     v6 := cache-ref[0]()                        ; mod
>     v7 := const pk                              ; name
>     v8 := lookup-bound(v6, v7)                  ; var
>     cache-set![(0 . pk)](v8)
>     L17(v8)
>   L17(v9):                                      ; box
>     v10 := scm-ref/immediate[(box . 1)](v9)     ; arg
>     false?(v4) ? L22() : L19()
>   L22():
>     v12 := const 42                             ; arg
>     call v10(v12)
>     L25(receive(arg, rest...))
>   L19():
>     v11 := const 12                             ; arg
>     call v10(v11)
>     L25(receive(arg, rest...))
>   L25(v13, v14):                                ; tmp, tmp
>     tail call v10(v4, v13)
>
> Then it prints the tailified version:
>
>   L0:
>     v0 := self
>     L1(...)
>   L1:
>     receive()
>     v1 := current-module()                      ; module
>     cache-set![0](v1)
>     v2 := const-fun L8                          ; _
>     v3 := restore1[ptr]()                       ; ret
>     tail calli v3(v2)
>
>   L8:
>     v4 := self
>     L9(...)
>   L9:
>     v5 := receive(x)                            ; x
>     v6 := cache-ref[(0 . pk)]()                 ; cached
>     heap-object?(v6) ? L17() : L12()
>   L17():
>     L18(v6)
>   L12():
>     v7 := cache-ref[0]()                        ; mod
>     v8 := const pk                              ; name
>     v9 := lookup-bound(v7, v8)                  ; var
>     cache-set![(0 . pk)](v9)
>     L18(v9)
>   L18(v10):                                     ; box
>     v11 := scm-ref/immediate[(box . 1)](v10)    ; arg
>     false?(v5) ? L24() : L20()
>   L24():
>     v14 := const 42                             ; arg
>     v15 := code L37                             ; cont
>     save[(scm scm ptr)](v5, v11, v15)
>     tail call v11(v14)
>   L20():
>     v12 := const 12                             ; arg
>     v13 := code L29                             ; cont
>     save[(scm scm ptr)](v5, v11, v13)
>     tail call v11(v12)
>
>   L29:
>     L30(...)
>   L30:
>     v16, v17 := receive(arg, rest...)           ; tmp, tmp
>     v18, v19 := restore[(scm scm)]()            ; restored, restored
>     tail callk L34(_, v18, v19, v16)
>
>   L34:
>     meta: arg-representations: (scm scm scm)
>     L35(...)
>   L35(v20, v21, v22):                           ; _, _, _
>     tail call v21(v20, v22)
>
>   L37:
>     L38(...)
>   L38:
>     v23, v24 := receive(arg, rest...)           ; tmp, tmp
>     v25, v26 := restore[(scm scm)]()            ; restored, restored
>     tail callk L34(_, v25, v26, v23)
>
> I think I would like to implement the "save" and "restore" primcalls on
> stock Guile, to be able to adequately test tailification.  But then I'd
> look at targetting WebAssembly (just to make it interesting; run-time
> performance would be faster no doubt if I targetted JS, due to GC
> concerns).
>
> Anyway, thoughts welcome.  Happy hacking :)
>
> Andy
>
>

[-- Attachment #2: Type: text/html, Size: 14273 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-10-07  1:37 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-19 20:20 thoughts on targetting the web Andy Wingo
2021-06-19 22:24 ` Chris Lemmer-Webber
2021-10-06 14:53   ` Christine Lemmer-Webber
2021-06-20 10:02 ` Maxime Devos
2021-06-20 14:14   ` Chris Lemmer-Webber
2021-06-20 20:55     ` Maxime Devos
2021-10-06 23:34 ` Dr. Arne Babenhauserheide
2021-10-07  1:37 ` Nala Ginrut

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).