From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Stefan Israelsson Tampe Newsgroups: gmane.lisp.guile.devel Subject: Re: Pausable continuations Date: Fri, 11 Feb 2022 13:10:49 +0100 Message-ID: References: Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="000000000000c90ec505d7bcf566" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="38984"; mail-complaints-to="usenet@ciao.gmane.io" To: guile-devel Original-X-From: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org Fri Feb 11 13:17:51 2022 Return-path: Envelope-to: guile-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1nIUs3-0009sv-I1 for guile-devel@m.gmane-mx.org; Fri, 11 Feb 2022 13:17:51 +0100 Original-Received: from localhost ([::1]:58384 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nIUs2-0001D5-6u for guile-devel@m.gmane-mx.org; Fri, 11 Feb 2022 07:17:50 -0500 Original-Received: from eggs.gnu.org ([209.51.188.92]:39048) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nIUlV-0000Hb-Ay for guile-devel@gnu.org; Fri, 11 Feb 2022 07:11:05 -0500 Original-Received: from [2607:f8b0:4864:20::632] (port=43929 helo=mail-pl1-x632.google.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nIUlS-00035Y-RK for guile-devel@gnu.org; Fri, 11 Feb 2022 07:11:04 -0500 Original-Received: by mail-pl1-x632.google.com with SMTP id p6so4329491plf.10 for ; Fri, 11 Feb 2022 04:11:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=ajE51iizs9fIgtXeYz0VGEYGTpgzgJtNNP5pW5zCb1I=; b=EjzqUAOPFrz3wjqwSY6TcTFfWoHRIkPSO6FNQwDT7WTKhwtQ39rtDWBXLmvcwzr3U7 Ee/fpmqWs57zqRKFjDRP4RZWPhmxB1vKiI8VLy7eYWDubVq4cutwGwPeabJlDwX1qlOF uvdTEaBpunJDLrCuo9U9Y6Doi2B4mPmIoWQSFKQIFlCrwzm7q2A4lkfgVXev+apA71/r kKigH0r7+7QHg+7GtDpCdtZJ/ThNWfbpKzIeNNtxhQXXrbgd+TcUkPaMQWyZKwEY80vP rQKc1692M0QeSsPHrlSljmndW9pW00skjC3lDpRlVqhlWo1jL0tG6ETuwVgiBgEAz7Pq jPDA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=ajE51iizs9fIgtXeYz0VGEYGTpgzgJtNNP5pW5zCb1I=; b=Sfm9BlHGuSKbWTl7K/qvdDTkgJfYs0i4j6pEfPCSUJxQ8AaDvxhN3/4SvEc6KLUMT+ bMrg1s0fPLMYU7uumX2joN6UMNi9TSMiaZeXgLwechuMJdlGZFdorJbkWqCWuBce5D3I wZfjLJZizttqx14zd4IA3l8c6r63AFSDby4MUZKmK0eTxLsEAQ6GbVcilrKV90r21WmA y96cPRns2UDklf0HdAnahd15ywBwV3IazBJcPFaaCoemd0yX3Bfo7qEnqixBVJ22/Axf L3N80rDJS6Jnq4u42AzhY1+8AYChW4738irlpnm4nZocpjmaC1e8zLJH5MnfquDbZfsz Rsrg== X-Gm-Message-State: AOAM530eUrTSOAj5QPn0mgZj5tYeDgS45kqmJKT4Fi8kK01d/RyQVdu/ Y6ee/zkoExTkNKM4JamdmG4yfBNF0A3jOhXi65HXoKn2y+s= X-Google-Smtp-Source: ABdhPJzlkIcyGLnVbtoBF9EZ+rr6Cu5eNyBFMS8E3Rgc++ORMMfJSddBCu0RafVdi+5Y+H1mMglklRD+QjJAfFfYaC4= X-Received: by 2002:a17:903:234f:: with SMTP id c15mr1322116plh.50.1644581460969; Fri, 11 Feb 2022 04:11:00 -0800 (PST) In-Reply-To: X-Host-Lookup-Failed: Reverse DNS lookup failed for 2607:f8b0:4864:20::632 (failed) Received-SPF: pass client-ip=2607:f8b0:4864:20::632; envelope-from=stefan.itampe@gmail.com; helo=mail-pl1-x632.google.com X-Spam_score_int: -12 X-Spam_score: -1.3 X-Spam_bar: - X-Spam_report: (-1.3 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, PDS_HP_HELO_NORDNS=0.001, RCVD_IN_DNSWL_NONE=-0.0001, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org Original-Sender: "guile-devel" Xref: news.gmane.io gmane.lisp.guile.devel:21093 Archived-At: --000000000000c90ec505d7bcf566 Content-Type: text/plain; charset="UTF-8" Hmm, I can improve the delimited continuation speed slightly by doing the below code (define prompt (list 1)) (define (f2) (let lp ((i 0)) (when (< i 20000000) (begin (abort-to-prompt prompt) (lp (+ i 1))))) #f) ; 5.906402s real time, 12.297234s run time. 8.894807s spent in GC. So we are actually around 12X faster. (define (test2) (let lp ((k f2)) (let ((k (call-with-prompt prompt k (lambda (k) k)))) (if k (lp k) #f)))) On Fri, Feb 11, 2022 at 1:06 PM Stefan Israelsson Tampe < stefan.itampe@gmail.com> wrote: > I managed to make jitted code work for an example, speeds up the code up > 2x. So in 1s ther is 40M ops per s > overhead in the generator construct, that's essentially 4x slower the > fastest it can do in a very simple loop. And matches > pythons generators and are 15x faster than the example code I have above. > > On Thu, Feb 10, 2022 at 4:19 PM Stefan Israelsson Tampe < > stefan.itampe@gmail.com> wrote: > >> I did some benchmark, consider this code below. Let's turn off the jit. >> Then >> a 20M loop using normal delimited continuations yields, >> >> ;; 7.866898s real time, 14.809225s run time. 9.652291s spent in GC >> >> With a pausing continuation or generator we end up with, >> ;; 0.965947s real time, 0.965588s run time. 0.000000s spent in GC. >> >> python 3's similar generator example is executing at 0.5s for the same >> looop. >> so using delimited continuations to model pythons generators we have an >> overhead of around 15X. >> >> With jit, >> ;; 6.678504s real time, 13.589789s run time. 9.560317s spent in GC. >> >> So we can't really get any speedup help from guile's jit here. The paused >> jit version is not available as I have not figured out how to do this jet. >> >> (define prompt (list 1)) >> (define (f) >> (let lp ((i 0)) >> (when (< i 20000000) >> (begin >> (abort-to-prompt prompt) >> (lp (+ i 1)))))) >> >> (define (test2) >> (let lp ((k f)) >> (call-with-prompt prompt k lp))) >> >> >> >> On Thu, Feb 10, 2022 at 2:07 PM Stefan Israelsson Tampe < >> stefan.itampe@gmail.com> wrote: >> >>> Consider a memory barrier idiom constructed from >>> 0, (mk-stack) >>> 1. (enter x) >>> 2. (pause x) >>> 3. (leave x) >>> >>> The idea is that we create a separate stack object and when entering it, >>> we will swap the current stack with the one in the argument saving the >>> current stack in x and be in the 'child' state and move to a paused >>> position in case of a pause, when pausing stack x, we will return to where >>> after where entered saving the current position in stack and ip, and be in >>> state 'pause' and when we leave we will be in the state 'leave and move >>> to the old stack, using the current >>> ip. At first encounter the function stack frame is copied over hence >>> there will be a fork limited to the function only. >>> >>> This means that we essentially can define a generator as >>> (define (g x) >>> (let lp ((n 0)) >>> (if (< n 10) >>> (begin >>> (pause x) >>> (lp (+ n 1)))))) >>> >>> And use it as >>> (define (test) >>> (let ((x (mk-stack))) >>> (let lp () >>> (case (enter x) >>> ((pause) >>> (pk 'pause) >>> (lp)) >>> ((child) >>> (g x) >>> (leave x)))))))) >>> >>> A paused or leaved stack cannot be paused, an entered stack cannot be >>> entered and one cannot leave a paused stack, but enter a leaved stack. >>> >>> Anyhow this idea is modeled like a fork command instead of functional >>> and have the benefit over delimited continuations that one does not need to >>> copy the whole stack and potentially speed up generator like constructs. >>> But not only this, writing efficient prolog code is possible as well. We >>> could simplify a lot of the generation of prolog code, speed it up and also >>> improve compiler speed of prolog code significantly. >>> >>> How would we approach the prolog code. The simplest system is to use >>> return the >>> alternate pause stack when succeeding things becomes very simple, >>> >>> x = stack to pause to in case of failure >>> cc = the continuation >>> >>> ( (x cc) goal1 goal2) >>> :: (cc (goal1 (goal2 x)) >>> >>> ( (x cc) goal1 goal2) >>> :: (let ((xx (mkstack))) >>> (case (enter xx) >>> ((child) >>> (cc (goal2 xx))) >>> >>> ((pause) >>> (cc (goal2 x))))) >>> >>> Very elegant, and we also can use some heuristics to store already made >>> stacks when >>> leaving a stack and reuse at the next enter which is a common theme in >>> prolog, >>> >>> Anyhow we have an issue, consider the case where everythings >>> succeds forever. Then we will blow the stack . There is no concept of tail >>> calls here. So what you can do is the following for an , >>> >>> (let ((xx (mk-stack))) >>> (case (enter xx) >>> ((child) >>> (goal1 x (lambda (xxx) (pause xx xxx))) >>> >>> ((pause xxx) >>> (goal2 xxx cc)))) >>> >>> This enable cuts so that a cutted and (and!) in kanren lingo will use >>> (goal2 x cc) >>> >>> And we have tail calls! >>> >>> >>> I have a non jitted version guile working as a proof of concept. >>> >>> The drawback with this is if a function uses a lot of stack, it will be >>> a memory hog. >>> >>> WDYT? >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> . >>> >>> >>> >> --000000000000c90ec505d7bcf566 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hmm, I can improve the delimited continuation speed slight= ly by doing the below code


(define prompt (list= 1))
(define (f2)
=C2=A0 (let lp ((i 0))
=C2=A0 =C2=A0 (when (<= i 20000000)
=C2=A0 =C2=A0 =C2=A0 (begin
=C2=A0 =C2=A0 =C2=A0 =C2=A0 = (abort-to-prompt prompt)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 (lp (+ i 1)))))
= =C2=A0 #f)

=C2=A0; 5.906402s real time, 12.297234s run time. = =C2=A08.894807s spent in GC.

So we are actually = around 12X faster.

(define (test2)
=C2=A0 (= let lp ((k f2))
=C2=A0 =C2=A0 (let ((k (call-with-prompt prompt k (lambd= a (k) k))))
=C2=A0 =C2=A0 =C2=A0 (if k (lp k) #f))))

<= /div>

On Fri, Feb 11, 2022 at 1:06 PM Stefan Israelsson Tampe <stefan.itampe@gmail.com> wrot= e:
I managed to make jitted code work for an example, speeds up the code u= p 2x. So in 1s ther is 40M ops per s
overhead in the generator construc= t, that's essentially 4x slower the fastest it can do in a very simple = loop. And matches=C2=A0
pythons generators and are 15x faster tha= n the example code I have above.

=
On Thu, Feb 10, 2022 at 4:19 PM Stefa= n Israelsson Tampe <stefan.itampe@gmail.com> wrote:
I did some benchmark, con= sider this code below. Let's turn off the jit. Then
a 20M loop usin= g normal delimited=C2=A0continuations yields,

;; 7.866898s real time, 14.809225s run time. =C2=A09.652291s spent in GC<= /span>

With a pausing continuatio= n or generator we end up with,
;; 0.965947s real time, = 0.965588s run time. =C2=A00.000000s spent in GC.
python 3's similar generator example is exec= uting at 0.5s for the same looop.
so using delimited continuation= s to model pythons generators we have an overhead of around 15X.
=
With jit,
;; 6.678504s real time, 13.589789s run time.= =C2=A09.560317s spent in GC.

So we can't really get any speedup help from guile's jit here. = The paused jit version is not available as I have not figured out how to do= this jet.

(define prompt (list 1))
(de= fine (f)
=C2=A0 (let lp ((i 0))
=C2=A0 =C2=A0 (when (< i 20000000)=
=C2=A0 =C2=A0 =C2=A0 (begin
=C2=A0 =C2=A0 =C2=A0 =C2=A0 (abort-to-pr= ompt prompt)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 (lp (+ i 1))))))
=C2=A0
= (define (test2)
=C2=A0 (let lp ((k f))
=C2=A0 =C2=A0 (call-with-promp= t prompt k lp)))



On Thu, Feb 1= 0, 2022 at 2:07 PM Stefan Israelsson Tampe <stefan.itampe@gmail.com> wrote:
=
Co= nsider a memory barrier=C2=A0idiom constructed from=C2=A0
0, (mk-stack)=
1. (enter x)
2. (pause x)
3. (leave x)

The idea is that we create a separate stack object = and when entering it, we will swap the current stack with the one in the ar= gument saving the current stack in x=C2=A0 and be in the 'child' st= ate and move to a paused position in case of a pause, when pausing stack x,= we=C2=A0will return to where after where entered saving the current positi= on in stack and ip, and be in state 'pause' and when we leave we wi= ll be in the state=C2=A0'leave and move to=C2=A0the old stack, using th= e current
ip. At first encounter the function=C2=A0stack frame is= copied over hence there will be a fork limited to the function=C2=A0only.<= /div>

This means that we essentially can define a genera= tor as
(define (g x)
=C2=A0 (let lp ((n 0))
= =C2=A0 =C2=A0 (if (< n 10)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 (begin<= /div>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0(pause x)
=C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0(lp (+ n 1))))))

=
And use it as
(define (test)
=C2=A0 =C2=A0 (let ((= x (mk-stack)))
=C2=A0 =C2=A0 =C2=A0 =C2=A0 (let lp ()
= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0(case (enter x)
=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0((pause)
=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0(pk 'paus= e)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0(lp))
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 ((child)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0(g x)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0(leave x))))))))

A paused or leaved s= tack cannot be paused, an entered=C2=A0stack cannot be entered and one cann= ot leave a paused stack, but enter a leaved stack.

Anyhow this idea is modeled like a fork command instead=C2=A0of functional= and have the benefit over delimited continuations that one does not need t= o copy the whole stack and potentially speed up generator=C2=A0like constru= cts. But not only this,=C2=A0writing efficient prolog code is possible as w= ell. We could simplify a lot of the generation of prolog code, speed it up = and also improve compiler speed of prolog code significantly.
How would we approach the=C2=A0 prolog code.=C2=A0The simplest = system is to use return the=C2=A0
alternate pause stack when succ= eeding things becomes very simple,

x=C2=A0 =C2=A0= =3D stack to pause to in case of failure
cc =3D the continuation<= /div>

(<and> (x cc)=C2=A0 goal1 goal2)=C2=A0=C2=A0=
=C2=A0 =C2=A0 =C2=A0:: (cc (goal1 (goal2 x))

(<or >=C2=A0 =C2=A0(x cc)=C2=A0 goal1 goal2)=C2=A0=C2=A0
<= div>=C2=A0 =C2=A0 ::=C2=A0 (let ((xx (mkstack)))
=C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0(case (enter xx)
=C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0((child)
=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 (cc (goal2 xx)))

=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 ((pause)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0(cc (goal2 x)))))

Very elegant, and we a= lso can use some heuristics to store already made stacks when=C2=A0
leaving a stack and reuse at the next enter which is a common theme in p= rolog,

Anyhow we have an issue, consider the case = where everythings succeds=C2=A0forever. Then we will blow the stack . There= is no concept=C2=A0of tail calls here. So what you can do is the following= for an <and>,

(let ((xx (mk-stack)))
<= div>=C2=A0 =C2=A0 (case (enter xx)
=C2=A0 =C2=A0 =C2=A0 ((child)<= /div>
=C2=A0 =C2=A0 =C2=A0 =C2=A0(goal1 x (lambda (xxx) (pause xx xxx))= )
=C2=A0 =C2=A0 =C2=A0 =C2=A0
=C2=A0 =C2=A0 =C2=A0 ((pa= use xxx)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0(goal2 xxx cc))))

This enable cuts so that a cutted and (and!) in kanren= lingo will use
(goal2 x cc)

And we have= tail calls!


I have a non jitted ve= rsion guile working as a proof of concept.=C2=A0

T= he drawback with this is if a function uses a lot of stack, it will be a me= mory hog.

WDYT?











.

=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=C2=A0
--000000000000c90ec505d7bcf566--