From: Andy Wingo <wingo@pobox.com>
To: guile-devel@gnu.org
Subject: thoughts on targeting the web
Date: Sat, 19 Jun 2021 22:20:20 +0200
Message-ID: <87h7htfvy3.fsf@pobox.com>

Hi :)

A brain-dump tonight. I was thinking about what it would mean for Guile to target the web. The goal would be for Guile to be a useful language for programming client-side interaction on web pages.

Now, there are a number of takes for Schemes of various stripes to target JS. I haven't made a full survey, but my starting point is that in a Guile context we basically want the Guile Scheme language, which is to say:

 1. Tail calls.
    The solution must include proper tail calls.

 2. Multiple values.
    The solution should have the same semantics that Guile has for multiple-value returns, including implicit truncation.

 3. Varargs.
    The solution should allow for rest arguments, keyword arguments, and the like.

 4. Delimited continuations.
    The solution should allow non-local exit from any function, should allow capturing stack slices as partial continuations, and should allow those continuations to be instantiated multiple times if desired. We should be able to target fibers to the web. Otherwise, why bother?

 5. Garbage collection.
    *We should re-use the host GC*. Although it would be possible to manage a heap in linear memory, that has retention problems due to cycles between the Guile heap and the JS heap.

 6. Numbers.
    We should support Scheme's unbounded integers. That requires bignums in the limit case, but thankfully JS has bignums now. We will still be able to unbox in many cases, though.

 7. Target the existing web.
    We shouldn't depend on some future WebAssembly or JS proposal -- the solution should work in the here and now and only get better as features are added to the web platform.

From a UX perspective, I would expect we would generally want whole-program compilation with aggressive DCE / tree-shaking, producing a single JS or WebAssembly artifact at build time. But that's a later issue.

I have thought about this off and on over the years but in the end was flummoxed about how to meet all the requirements. Recently, however, I think I have come up with a solution for most of them:

(1) In JS, tail calls are part of ES2015, but they are not implemented in all browsers. In WebAssembly, they are a future plan, but hard for various reasons. So in practice the solution for today's web is to use a universal trampoline and make all calls tail calls -- i.e., call all functions via:

  while true:
    call(pop())

Whenever we can target proper tail calls, this will only get faster. (A toy sketch of such a trampoline follows below, after (7).)

(2) Neither JS nor WebAssembly has the truncation semantics we want. Therefore we will return values via an explicit array, and then the continuation will be responsible for consuming those values and binding any needed variables.

(3) Although JS does have varargs support, WebAssembly does not. But we can actually use the same solution here as for (1) and (2): pass arguments as an explicit array plus nvalues, and rely on the function to parse them appropriately. In this way we can get Guile keyword arguments. This also makes the type of Scheme functions uniform, which is important in WebAssembly contexts. (This convention is also sketched below.)

(4) This is the interesting bit! As hinted in (1), we will transform the program so that all calls are tail calls. This is a form of minimal CPS conversion -- minimal in the sense that functions are broken up into the minimum number of pieces needed to ensure the all-tail-calls property. A non-tail call becomes a tail call by saving its continuation's state on a stack, which amounts to stack allocation for that continuation. The continuation of the call pops the saved state from the stack. Because the stack is explicit, we can manipulate it as data: slice it to capture a delimited continuation, drop values to make a non-local exit, splat a saved slice back on to compose a captured delimited continuation with the current continuation, and so on. Therefore we have the necessary primitives to implement delimited continuations as a library.

(5) Scheme needs a representation that can store any value. In JS this is easy because JS is untyped. For WebAssembly, I think I would lean on externref for this purpose, which effectively boxes all values. There are no fixnums in the current WebAssembly spec, so this is suboptimal, and we have to call out to JS to evaluate type predicates and access fields. But support for structured GC-managed types is coming to WebAssembly, which will speed up object access.

(6) The easy solution here is to make JS numbers, which are doubles at heart, represent flonums, and to use JS bignums for Scheme integers. Fine.

(7) This principle has been taken into account in (1)-(6).
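To make (1) concrete, here is a toy trampoline, hand-written in Scheme rather than emitted by any compiler; the names (trampoline, halt, even?*, odd?*) are made up for illustration and are not part of an actual calling convention. Instead of pushing the next call on a stack as in the pseudo-code above, each function simply returns the next call as a thunk, but the effect is the same: the driver performs every call, so the host stack stays flat even without native tail calls.

;; Drive the computation: keep invoking the returned thunks until a
;; function signals completion by returning a (done . value) pair.
(define (trampoline thunk)
  (let loop ((next (thunk)))
    (if (and (pair? next) (eq? (car next) 'done))
        (cdr next)
        (loop (next)))))

;; The "final continuation": wrap the result so the driver stops.
(define (halt x) (cons 'done x))

;; Mutually recursive functions, written so that every call is a
;; returned thunk rather than a direct, stack-growing call.
(define (even?* n k)
  (lambda () (if (zero? n) (k #t) (odd?* (- n 1) k))))
(define (odd?* n k)
  (lambda () (if (zero? n) (k #f) (even?* (- n 1) k))))

(trampoline (lambda () (even?* 1000000 halt)))  ;; => #t, in constant stack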
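Likewise, a sketch of the uniform calling convention from (2) and (3): every compiled procedure takes one explicit argument array and returns its values as an array, with the continuation consuming only as many values as it needs. The helper names here (scheme-values, bind-1) are invented for the example; keyword and rest arguments would be parsed from the same array by the callee.

;; "Return" any number of values as one explicit vector.
(define (scheme-values . vals)
  (list->vector vals))

;; A continuation expecting a single value: implicit truncation just
;; means ignoring the extra slots in the vector.
(define (bind-1 vals k)
  (k (vector-ref vals 0)))

;; (define (f) (values 1 2 3)) would compile to something like:
(define (f args)
  (scheme-values 1 2 3))

;; (+ (f) 10): call f with an empty argument vector, then truncate
;; the returned values to one before continuing.
(bind-1 (f (vector)) (lambda (x) (+ x 10)))  ;; => 11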
Now, a note on the transformation described in (4), which I call "tailification". The first step of tailification computes the set of "tails" in a function. The function entry starts a tail, as does each return point from a non-tail call. Join points between different tails also start tails.

In the residual program, there are four ways that a continuation exits:

  - Tail calls in the source program are tail calls in the residual program; no change.

  - For non-tail calls in the source program, the caller saves the state of the continuation (the live variables flowing into the continuation) on an explicit stack, along with the label of the continuation. The return continuation is converted into an arity-checking function entry, to handle multiple-value returns; when invoked, it pops its incoming live variables from the continuation stack.

  - Terms that continue to a join continuation are converted to label calls in tail position, passing the state of the continuation as arguments.

  - Returning values from a continuation pops the return label from the stack and does an indirect tail label call on that label, with the given return values.

I expect that a tailified program will probably be slower than a non-tailified program. However, a tailified program has a few interesting properties: the stack is packed and contains only live data; the stack can be traversed in a portable way, allowing for an implementation of prompts on systems that don't support them natively; and as all calls are tail calls, the whole system can be implemented naturally with a driver trampoline on targets that don't support tail calls (e.g. JavaScript and WebAssembly).
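Just to make the stack-as-data idea concrete before the dumps below, here is a toy model of the operations tailified code relies on. The names (*stack*, save!, restore!, push-prompt!, capture-to-prompt!, reinstate!) are invented for this sketch; they are not the actual primcalls, and a real implementation would store typed slots rather than lists. The point is only that once the continuation stack is ordinary data, prompts are a library:

;; The explicit continuation stack, as a list of saved frames.
(define *stack* '())

;; Push a frame of saved live variables.
(define (save! . vals)
  (set! *stack* (cons vals *stack*)))

;; Pop the most recently saved frame.
(define (restore!)
  (let ((frame (car *stack*)))
    (set! *stack* (cdr *stack*))
    frame))

;; Mark a prompt boundary with a distinguished frame.
(define (push-prompt! tag)
  (save! 'prompt tag))

;; Pop frames down to the nearest prompt with this tag; the popped
;; slice *is* the delimited continuation, as ordinary data.
(define (capture-to-prompt! tag)
  (let loop ((slice '()))
    (let ((frame (restore!)))
      (if (equal? frame (list 'prompt tag))
          (reverse slice)  ; frames listed top-of-stack first
          (loop (cons frame slice))))))

;; Splat a captured slice back onto the stack, bottom-most frame
;; first, composing it with the current continuation.
(define (reinstate! slice)
  (for-each (lambda (frame) (apply save! frame)) (reverse slice)))

;; For example:
(push-prompt! 'p)
(save! 'frame-a)
(save! 'frame-b)
(capture-to-prompt! 'p)  ;; => ((frame-b) (frame-a)), stack is now empty

A non-local exit is then just popping frames down to the prompt without reinstating them, and invoking a captured continuation a second time is just reinstating the same slice again.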
This algorithm is implemented in the wip-tailify branch in Guile's git. I have this little driver program:

(use-modules (system base compile)
             (system base language)
             (language cps tailify)
             (language cps verify)
             (language cps renumber)
             (language cps dce)
             (language cps simplify)
             (language cps dump))

(define (tailify* form)
  (let ((from 'scheme)
        (make-lower (language-lowerer (lookup-language 'cps)))
        (optimization-level 2)
        (warning-level 2)
        (opts '()))
    (define cps
      ((make-lower optimization-level opts)
       (compile form #:from 'scheme #:to 'cps #:opts opts
                #:optimization-level optimization-level
                #:warning-level warning-level
                #:env (current-module))
       (current-module)))
    (format #t "Original:\n")
    (dump cps)
    (format #t "\n\nTailified:\n")
    (let* ((cps (tailify cps))
           (cps (verify cps))
           (cps (eliminate-dead-code cps))
           (cps (simplify cps))
           (cps (renumber cps)))
      (dump cps))))

If I run this on the following lambda:

(lambda (x) (pk x (if x (pk 12) (pk 42))))

Then first it prints the dump of the optimized CPS:

Original:
L0:
  v0 := self
  L1(...)
L1:
  receive()
  v1 := current-module()  ; module
  cache-set![0](v1)
  v2 := const-fun L7  ; _
  return v2

L7:
  v3 := self
  L8(...)
L8:
  v4 := receive(x)  ; x
  v5 := cache-ref[(0 . pk)]()  ; cached
  heap-object?(v5) ? L16() : L11()
L16():
  L17(v5)
L11():
  v6 := cache-ref[0]()  ; mod
  v7 := const pk  ; name
  v8 := lookup-bound(v6, v7)  ; var
  cache-set![(0 . pk)](v8)
  L17(v8)
L17(v9):  ; box
  v10 := scm-ref/immediate[(box . 1)](v9)  ; arg
  false?(v4) ? L22() : L19()
L22():
  v12 := const 42  ; arg
  call v10(v12)
  L25(receive(arg, rest...))
L19():
  v11 := const 12  ; arg
  call v10(v11)
  L25(receive(arg, rest...))
L25(v13, v14):  ; tmp, tmp
  tail call v10(v4, v13)

Then it prints the tailified version:

L0:
  v0 := self
  L1(...)
L1:
  receive()
  v1 := current-module()  ; module
  cache-set![0](v1)
  v2 := const-fun L8  ; _
  v3 := restore1[ptr]()  ; ret
  tail calli v3(v2)

L8:
  v4 := self
  L9(...)
L9:
  v5 := receive(x)  ; x
  v6 := cache-ref[(0 . pk)]()  ; cached
  heap-object?(v6) ? L17() : L12()
L17():
  L18(v6)
L12():
  v7 := cache-ref[0]()  ; mod
  v8 := const pk  ; name
  v9 := lookup-bound(v7, v8)  ; var
  cache-set![(0 . pk)](v9)
  L18(v9)
L18(v10):  ; box
  v11 := scm-ref/immediate[(box . 1)](v10)  ; arg
  false?(v5) ? L24() : L20()
L24():
  v14 := const 42  ; arg
  v15 := code L37  ; cont
  save[(scm scm ptr)](v5, v11, v15)
  tail call v11(v14)
L20():
  v12 := const 12  ; arg
  v13 := code L29  ; cont
  save[(scm scm ptr)](v5, v11, v13)
  tail call v11(v12)

L29:
  L30(...)
L30:
  v16, v17 := receive(arg, rest...)  ; tmp, tmp
  v18, v19 := restore[(scm scm)]()  ; restored, restored
  tail callk L34(_, v18, v19, v16)

L34:
  meta: arg-representations: (scm scm scm)
  L35(...)
L35(v20, v21, v22):  ; _, _, _
  tail call v21(v20, v22)

L37:
  L38(...)
L38:
  v23, v24 := receive(arg, rest...)  ; tmp, tmp
  v25, v26 := restore[(scm scm)]()  ; restored, restored
  tail callk L34(_, v25, v26, v23)

I think I would like to implement the "save" and "restore" primcalls on stock Guile, to be able to adequately test tailification. But then I'd look at targeting WebAssembly (just to make it interesting; run-time performance would no doubt be faster if I targeted JS, due to GC concerns).

Anyway, thoughts welcome. Happy hacking :)

Andy