From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christine Lemmer-Webber
Newsgroups: gmane.lisp.guile.devel
Subject: Re: thoughts on targetting the web
Date: Wed, 06 Oct 2021 10:53:21 -0400
Message-ID: <87zgrmfbcc.fsf@dustycloud.org>
References: <87h7htfvy3.fsf@pobox.com> <87sg1debng.fsf@dustycloud.org>
In-reply-to: <87sg1debng.fsf@dustycloud.org>
Cc: Andy Wingo, guile-devel@gnu.org
User-Agent: mu4e 1.6.6; emacs 27.2
Mime-Version: 1.0
Content-Type: text/plain

Chris Lemmer-Webber writes:

> O M G! So excited to read this email.

Yes! So, re-poking this thread. I'm still just as excited as I was when
I sent this some time ago!

Any updates? How can we help? Can we test things? Poke at something?
Contribute something?

I'm excited for this future!

> Andy Wingo writes:
>
>> Hi :)
>>
>> A brain-dump tonight. I was thinking about what it would mean for Guile
>> to target the web. The goal would be for Guile to be a useful language
>> for programming client-side interaction on web pages. Now, there are a
>> number of takes for Schemes of various stripes to target JS. I haven't
>> made a full survey but my starting point is that in a Guile context, we
>> basically want the Guile Scheme language, which is to say:
>>
>> 1. Tail calls. The solution must include proper tail calls.
>>
>> 2. Multiple values. The solution should have the same semantics that
>>    Guile has for multiple-value returns, including implicit
>>    truncation.
>>
>> 3. Varargs. The solution should allow for rest arguments, keyword
>>    arguments, and the like.
>>
>> 4. Delimited continuations. The solution should allow non-local exit
>>    from any function, and should allow for capturing stack slices to
>>    partial continuations, and should allow those continuations to be
>>    instantiated multiple times if desired. We should be able to target
>>    fibers to the web. Otherwise, why bother?
>>
>> 5. Garbage collection. *We should re-use the host GC*. Although it
>>    would be possible to manage a heap in linear memory, that has
>>    retention problems due to cycles between the Guile heap and the JS
>>    heap.
>
> From when we spoke not long ago, it sounds like progress towards the
> WASM-GC proposal has been happening, I think? So this seems feasible?
>
> However, I do think that a "proof of concept" GC would still be an
> advancement. Like, even lifting the dead-simple version from SICP as a
> stopgap. We'll all know it's not ideal, but it'll still be progress.
> Getting something going and showing it is helpful.
>
> That said, I also know how meh such a proposed approach can feel. ;)
>
>> 6. Numbers. We should support Scheme's unbounded integers. Requires
>>    bignums in the limit case, but thankfully JS has bignums now. We
>>    will still be able to unbox in many cases though.
>>
>> 7. Target existing web. We shouldn't depend on some future WebAssembly
>>    or JS proposal -- the solution should work in the here and now and
>>    only get better as features are added to the web platform.
>
> Yes. Though, to jump back again to when you and I talked offline, I
> think what I recommended was that as an early proof of concept, a good
> idea might be to just have a small number of APIs, or even better, make
> a mini "fantasy console". Here's an interesting example (and the one we
> talked about, off-list):
>
> https://euhmeuh.github.io/wasm-adventure/
>
> What's interesting about this first is that it's just a Racket program
> that generates a quasiquoted s-expression representation of WASM. Now,
> obviously your compiler will do more interesting things than that. But
> the second interesting thing is the "fantasy console" component... a set
> of custom interfaces to render a simple 8-bit-aesthetic console.
>
> My argument for this is: supporting "the existing web" is a big step.
> Targeting a much smaller number of APIs, and being able to show the
> world a demo that Guile's WASM compiler can do something cool... that's
> very motivating, I think.
>
> But an even simpler hello world would be to simply have access to
> console.log, and write a Scheme program that does ye olde Fibonacci
> sequence and spits out the results on the page. Not as exciting, but
> that's the reasonable (and classic Scheme) first step, I think.
>
> But yes, target the existing web... eventually! Just don't get so hung
> up on worrying about those interfaces that it distracts from making
> something that can generate some WASM, is my thinking.
>
>> From a UX perspective, I would expect we would generally want
>> whole-program compilation with aggressive DCE / tree-shaking, producing
>> a single JS or WebAssembly artifact at build-time. But that's a later
>> issue.
>
> Yes, focus on things working first... major optimizations can come later!
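
(Coming back to the console.log-plus-Fibonacci idea from my earlier
mail: here's roughly the kind of artifact I have in mind, written by
hand as a quasiquoted s-expression in the wasm-adventure style. This is
only my sketch -- the module and all the names in it are made up, it is
not output from any compiler, and it assumes an external assembler like
wat2wasm plus a JS host that supplies console.log for the import.)

  ;; fib.scm -- emit a tiny WAT module as a plain s-expression.
  (define fib-module
    `(module
       ;; The embedding page is expected to provide console.log here.
       (import "console" "log" (func $log (param i32)))
       (func $fib (param $n i32) (result i32)
         (if (result i32) (i32.lt_s (local.get $n) (i32.const 2))
             (then (local.get $n))
             (else (i32.add
                    (call $fib (i32.sub (local.get $n) (i32.const 1)))
                    (call $fib (i32.sub (local.get $n) (i32.const 2)))))))
       ;; Log (fib 0) through (fib 10) when the module is instantiated.
       (func $main
         (local $i i32)
         (block $done
           (loop $next
             (br_if $done (i32.gt_s (local.get $i) (i32.const 10)))
             (call $log (call $fib (local.get $i)))
             (local.set $i (i32.add (local.get $i) (i32.const 1)))
             (br $next))))
       (start $main)))

  ;; WAT is itself s-expression syntax, so writing the datum out gives a
  ;; .wat file that wat2wasm can assemble into fib.wasm.
  (call-with-output-file "fib.wat"
    (lambda (port) (write fib-module port)))

Not much, but it's the kind of end-to-end "numbers show up on the page"
demo that would make the compiler work feel real.
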
>> I have thought about this off and on over the years but in the end was
>> flummoxed about how to meet all requirements. However recently I think
>> I have come up with a solution for most of these:
>>
>> (1) In JS, tail calls are part of ES2015, but not implemented in all
>>     browsers. In WebAssembly, they are a future plan, but hard for
>>     various reasons. So in practice the solution for today's web is to
>>     use a universal trampoline and make all calls tail calls --
>>     i.e. call all functions via:
>>
>>       while true:
>>         call(pop())
>>
>>     Whenever we can target proper tail calls, this will only get
>>     faster.
>>
>> (2) Neither JS nor WebAssembly has the truncation semantics we want.
>>     Therefore we will return values via an explicit array, and then the
>>     continuation will be responsible for consuming those values and
>>     binding any needed variables.
>>
>> (3) Although JS does have varargs support, WebAssembly does not. But
>>     we can actually use the same solution here as we use for (1) and
>>     (2) -- pass arguments on an explicit array + nvalues, relying
>>     on the function to parse them appropriately. In this way we can
>>     get Guile keyword arguments. This also makes the type of Scheme
>>     functions uniform, which is important in WebAssembly contexts.
>>
>> (4) This is the interesting bit! As hinted in (1), we will transform
>>     the program such that all calls are tail calls. This is a form of
>>     minimal CPS conversion -- minimal in the sense that functions are
>>     broken up into the minimum number of pieces to ensure the
>>     all-tail-calls property. Non-tail calls transform to tail calls by
>>     saving their continuation's state on a stack, which is the same as
>>     stack allocation for these continuations. The continuation of the
>>     call pops the saved state from the stack. Because the stack is
>>     explicit, we can manipulate it as data: slice it to capture a
>>     delimited continuation, drop values to make a non-local exit, splat
>>     a saved slice back on to compose a captured delimited continuation
>>     with the current continuation, and so on. Therefore we have the
>>     necessary primitives to implement delimited continuations as a
>>     library.
>
> I think I understand 15% of this, but the amount I could absorb in the
> limited time I spent stewing on it sounds cool anyway. Neat! Most
> importantly, you have a good idea that you think will work, so yay!
>
>> (5) Scheme needs a representation that can store any value. In JS this
>>     is easy because JS is untyped. For WebAssembly, I think I would
>>     lean on externref for this purpose, which effectively boxes all
>>     values. There are no fixnums in the current WebAssembly spec, so
>>     this is suboptimal, and we have to call out to JS to evaluate type
>>     predicates and access fields. But, support for structured
>>     GC-managed types is coming to WebAssembly, which will speed up
>>     object access.
>>
>> (6) The easy solution here is to make JS numbers, which are doubles at
>>     heart, represent flonums, and use JS bignums for Scheme integers.
>>     Fine.
>>
>> (7) This principle has been taken into account in (1)-(6).
>>
>> Now, a note on the transformation described in (4), which I call
>> "tailification".
>>
>> The first step of tailification computes the set of "tails" in a
>> function. The function entry starts a tail, as does each return point
>> from non-tail calls. Join points between different tails also start
>> tails.
>>
>> In the residual program, there are four ways that a continuation exits:
>>
>> - Tail calls in the source program are tail calls in the residual
>>   program; no change.
>>
>> - For non-tail calls in the source program, the caller saves the state
>>   of the continuation (the live variables flowing into the
>>   continuation) on an explicit stack, and saves the label of the
>>   continuation. The return continuation will be converted into an
>>   arity-checking function entry, to handle multi-value returns; when
>>   it is invoked, it will pop its incoming live variables from the
>>   continuation stack.
>>
>> - Terms that continue to a join continuation are converted to label
>>   calls in tail position, passing the state of the continuation as
>>   arguments.
>>
>> - Returning values from a continuation pops the return label from the
>>   stack and does an indirect tail label call on that label, with the
>>   given return values.
>>
>> I expect that a tailified program will probably be slower than a
>> non-tailified program. However a tailified program has a few
>> interesting properties: the stack is packed and only contains live data;
>> the stack can be traversed in a portable way, allowing for
>> implementation of prompts on systems that don't support them natively;
>> and as all calls are tail calls, the whole system can be implemented
>> naturally with a driver trampoline on targets that don't support tail
>> calls (e.g. JavaScript and WebAssembly).
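
(Trying to check my own understanding of (1), (2), and (4) before the
dumps below: here's a toy, stock-Scheme model of "all calls are tail
calls, plus an explicit continuation stack". The names and the frame
representation are mine, nothing from the actual wip-tailify runtime --
just how I picture the pieces fitting together.)

  ;; A frame is (continuation . live-values); *stack* is the explicit stack.
  (define *stack* '())

  (define (push-frame! k . live)
    (set! *stack* (cons (cons k live) *stack*)))

  ;; Returning pops a frame and resumes its continuation, passing the
  ;; returned values as extra arguments -- or finishes if the stack is empty.
  (define (return . vals)
    (if (null? *stack*)
        (cons 'done vals)
        (let ((frame (car *stack*)))
          (set! *stack* (cdr *stack*))
          (lambda () (apply (car frame) (append (cdr frame) vals))))))

  ;; The universal trampoline from (1): keep calling thunks until done.
  (define (trampoline thunk)
    (let loop ((x thunk))
      (if (procedure? x) (loop (x)) (cdr x))))

  ;; fib, with each non-tail call split at its return point as in (4):
  ;; the caller saves its live state, then everything is a tail call.
  (define (fib n)
    (if (< n 2)
        (return n)
        (begin (push-frame! fib-k1 n)
               (lambda () (fib (- n 1))))))
  (define (fib-k1 n v1)                 ; continuation of the first call
    (push-frame! fib-k2 v1)
    (lambda () (fib (- n 2))))
  (define (fib-k2 v1 v2)                ; continuation of the second call
    (return (+ v1 v2)))

  ;; (trampoline (lambda () (fib 10))) => (55)

And since *stack* is plain data, slicing it really is capturing a
delimited continuation. Did I get the gist right?
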
>> This algorithm is implemented in the wip-tailify branch in Guile's git.
>>
>> I have this little driver program:
>>
>> (use-modules (system base compile)
>>              (system base language)
>>              (language cps tailify)
>>              (language cps verify)
>>              (language cps renumber)
>>              (language cps dce)
>>              (language cps simplify)
>>              (language cps dump))
>>
>> (define (tailify* form)
>>   (let ((from 'scheme)
>>         (make-lower (language-lowerer (lookup-language 'cps)))
>>         (optimization-level 2)
>>         (warning-level 2)
>>         (opts '()))
>>     (define cps
>>       ((make-lower optimization-level opts)
>>        (compile form #:from 'scheme #:to 'cps #:opts opts
>>                 #:optimization-level optimization-level
>>                 #:warning-level warning-level
>>                 #:env (current-module))
>>        (current-module)))
>>     (format #t "Original:\n")
>>     (dump cps)
>>     (format #t "\n\nTailified:\n")
>>     (let* ((cps (tailify cps))
>>            (cps (verify cps))
>>            (cps (eliminate-dead-code cps))
>>            (cps (simplify cps))
>>            (cps (renumber cps)))
>>       (dump cps))))
>>
>> If I run this on the following lambda:
>>
>> (lambda (x)
>>   (pk x
>>       (if x
>>           (pk 12)
>>           (pk 42))))
>>
>> Then first it prints the dump of the optimized CPS:
>>
>> Original:
>> L0:
>>   v0 := self
>>   L1(...)
>> L1:
>>   receive()
>>   v1 := current-module()                   ; module
>>   cache-set![0](v1)
>>   v2 := const-fun L7                       ; _
>>   return v2
>>
>> L7:
>>   v3 := self
>>   L8(...)
>> L8:
>>   v4 := receive(x)                         ; x
>>   v5 := cache-ref[(0 . pk)]()              ; cached
>>   heap-object?(v5) ? L16() : L11()
>> L16():
>>   L17(v5)
>> L11():
>>   v6 := cache-ref[0]()                     ; mod
>>   v7 := const pk                           ; name
>>   v8 := lookup-bound(v6, v7)               ; var
>>   cache-set![(0 . pk)](v8)
>>   L17(v8)
>> L17(v9):                                   ; box
>>   v10 := scm-ref/immediate[(box . 1)](v9)  ; arg
>>   false?(v4) ? L22() : L19()
>> L22():
>>   v12 := const 42                          ; arg
>>   call v10(v12)
>>   L25(receive(arg, rest...))
>> L19():
>>   v11 := const 12                          ; arg
>>   call v10(v11)
>>   L25(receive(arg, rest...))
>> L25(v13, v14):                             ; tmp, tmp
>>   tail call v10(v4, v13)
>>
>> Then it prints the tailified version:
>>
>> Tailified:
>> L0:
>>   v0 := self
>>   L1(...)
>> L1:
>>   receive()
>>   v1 := current-module()                   ; module
>>   cache-set![0](v1)
>>   v2 := const-fun L8                       ; _
>>   v3 := restore1[ptr]()                    ; ret
>>   tail calli v3(v2)
>>
>> L8:
>>   v4 := self
>>   L9(...)
>> L9:
>>   v5 := receive(x)                         ; x
>>   v6 := cache-ref[(0 . pk)]()              ; cached
>>   heap-object?(v6) ? L17() : L12()
>> L17():
>>   L18(v6)
>> L12():
>>   v7 := cache-ref[0]()                     ; mod
>>   v8 := const pk                           ; name
>>   v9 := lookup-bound(v7, v8)               ; var
>>   cache-set![(0 . pk)](v9)
>>   L18(v9)
>> L18(v10):                                  ; box
>>   v11 := scm-ref/immediate[(box . 1)](v10) ; arg
>>   false?(v5) ? L24() : L20()
>> L24():
>>   v14 := const 42                          ; arg
>>   v15 := code L37                          ; cont
>>   save[(scm scm ptr)](v5, v11, v15)
>>   tail call v11(v14)
>> L20():
>>   v12 := const 12                          ; arg
>>   v13 := code L29                          ; cont
>>   save[(scm scm ptr)](v5, v11, v13)
>>   tail call v11(v12)
>>
>> L29:
>>   L30(...)
>> L30:
>>   v16, v17 := receive(arg, rest...)        ; tmp, tmp
>>   v18, v19 := restore[(scm scm)]()         ; restored, restored
>>   tail callk L34(_, v18, v19, v16)
>>
>> L34:
>>   meta: arg-representations: (scm scm scm)
>>   L35(...)
>> L35(v20, v21, v22):                        ; _, _, _
>>   tail call v21(v20, v22)
>>
>> L37:
>>   L38(...)
>> L38:
>>   v23, v24 := receive(arg, rest...)        ; tmp, tmp
>>   v25, v26 := restore[(scm scm)]()         ; restored, restored
>>   tail callk L34(_, v25, v26, v23)
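
(Reading the save[...]/restore[...] pairs in that dump is what made (4)
click for me: the continuation stack is just data. If it were a plain
Scheme list of frames, prompts and delimited capture look like simple
list surgery -- again only my toy sketch with made-up names, not
anything from the branch:)

  ;; Prompts are ordinary markers pushed onto the same explicit stack.
  (define *stack* '())

  (define (push-prompt! tag)
    (set! *stack* (cons (list 'prompt tag) *stack*)))

  ;; Capturing a delimited continuation = slicing off the frames down to
  ;; the nearest matching prompt (removing them from the stack).
  (define (capture-to-prompt! tag)
    (let loop ((frames *stack*) (sliced '()))
      (cond ((null? frames)
             (error "no prompt with tag" tag))
            ((equal? (car frames) (list 'prompt tag))
             (set! *stack* (cdr frames))
             (reverse sliced))
            (else
             (loop (cdr frames) (cons (car frames) sliced))))))

  ;; Composing a captured continuation back on = splatting the slice
  ;; onto the current stack, possibly more than once.
  (define (reinstate! slice)
    (set! *stack* (append slice *stack*)))

A non-local exit is then just capture-and-drop, which I think matches
what you said about getting delimited continuations as a library.
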
>> I think I would like to implement the "save" and "restore" primcalls on
>> stock Guile, to be able to adequately test tailification. But then I'd
>> look at targeting WebAssembly (just to make it interesting; run-time
>> performance would be faster no doubt if I targeted JS, due to GC
>> concerns).
>>
>> Anyway, thoughts welcome. Happy hacking :)
>>
>> Andy
>
> This is all great. I think getting something that works well enough to
> write some proof-of-concept demos in is the real goal. If you can make
> something that can compile just enough to WASM where my adding the
> fantasy console stuff would help (I might just lift wasm-adventure's),
> then I think we could make some pretty cool demos for the world. Maybe
> we could even have a special guile-fantasy-console gamejam? ;)
>
> Anyway exciting, exciting, exciting! My main advice in all of it: don't
> let great be the enemy of good-enough, and all that. Getting to the
> point where we're able to do *something* is still useful, if the hooks
> are in place to iterate from good-enough to great over time.
>
> Yay!
>
> - Chris