The case with A simple loop of 20M operations are now down to 0.3 s that's almost 20X improvements over the best delimited continuation example (6s). Cpython takes 0.5s! On Fri, Feb 11, 2022 at 1:10 PM Stefan Israelsson Tampe < stefan.itampe@gmail.com> wrote: > Hmm, I can improve the delimited continuation speed slightly by doing the > below code > > > (define prompt (list 1)) > (define (f2) > (let lp ((i 0)) > (when (< i 20000000) > (begin > (abort-to-prompt prompt) > (lp (+ i 1))))) > #f) > > ; 5.906402s real time, 12.297234s run time. 8.894807s spent in GC. > > So we are actually around 12X faster. > > (define (test2) > (let lp ((k f2)) > (let ((k (call-with-prompt prompt k (lambda (k) k)))) > (if k (lp k) #f)))) > > > On Fri, Feb 11, 2022 at 1:06 PM Stefan Israelsson Tampe < > stefan.itampe@gmail.com> wrote: > >> I managed to make jitted code work for an example, speeds up the code up >> 2x. So in 1s ther is 40M ops per s >> overhead in the generator construct, that's essentially 4x slower the >> fastest it can do in a very simple loop. And matches >> pythons generators and are 15x faster than the example code I have above. >> >> On Thu, Feb 10, 2022 at 4:19 PM Stefan Israelsson Tampe < >> stefan.itampe@gmail.com> wrote: >> >>> I did some benchmark, consider this code below. Let's turn off the jit. >>> Then >>> a 20M loop using normal delimited continuations yields, >>> >>> ;; 7.866898s real time, 14.809225s run time. 9.652291s spent in GC >>> >>> With a pausing continuation or generator we end up with, >>> ;; 0.965947s real time, 0.965588s run time. 0.000000s spent in GC. >>> >>> python 3's similar generator example is executing at 0.5s for the same >>> looop. >>> so using delimited continuations to model pythons generators we have an >>> overhead of around 15X. >>> >>> With jit, >>> ;; 6.678504s real time, 13.589789s run time. 9.560317s spent in GC. >>> >>> So we can't really get any speedup help from guile's jit here. The >>> paused jit version is not available as I have not figured out how to do >>> this jet. >>> >>> (define prompt (list 1)) >>> (define (f) >>> (let lp ((i 0)) >>> (when (< i 20000000) >>> (begin >>> (abort-to-prompt prompt) >>> (lp (+ i 1)))))) >>> >>> (define (test2) >>> (let lp ((k f)) >>> (call-with-prompt prompt k lp))) >>> >>> >>> >>> On Thu, Feb 10, 2022 at 2:07 PM Stefan Israelsson Tampe < >>> stefan.itampe@gmail.com> wrote: >>> >>>> Consider a memory barrier idiom constructed from >>>> 0, (mk-stack) >>>> 1. (enter x) >>>> 2. (pause x) >>>> 3. (leave x) >>>> >>>> The idea is that we create a separate stack object and when entering >>>> it, we will swap the current stack with the one in the argument saving the >>>> current stack in x and be in the 'child' state and move to a paused >>>> position in case of a pause, when pausing stack x, we will return to where >>>> after where entered saving the current position in stack and ip, and be in >>>> state 'pause' and when we leave we will be in the state 'leave and move >>>> to the old stack, using the current >>>> ip. At first encounter the function stack frame is copied over hence >>>> there will be a fork limited to the function only. >>>> >>>> This means that we essentially can define a generator as >>>> (define (g x) >>>> (let lp ((n 0)) >>>> (if (< n 10) >>>> (begin >>>> (pause x) >>>> (lp (+ n 1)))))) >>>> >>>> And use it as >>>> (define (test) >>>> (let ((x (mk-stack))) >>>> (let lp () >>>> (case (enter x) >>>> ((pause) >>>> (pk 'pause) >>>> (lp)) >>>> ((child) >>>> (g x) >>>> (leave x)))))))) >>>> >>>> A paused or leaved stack cannot be paused, an entered stack cannot be >>>> entered and one cannot leave a paused stack, but enter a leaved stack. >>>> >>>> Anyhow this idea is modeled like a fork command instead of functional >>>> and have the benefit over delimited continuations that one does not need to >>>> copy the whole stack and potentially speed up generator like constructs. >>>> But not only this, writing efficient prolog code is possible as well. We >>>> could simplify a lot of the generation of prolog code, speed it up and also >>>> improve compiler speed of prolog code significantly. >>>> >>>> How would we approach the prolog code. The simplest system is to use >>>> return the >>>> alternate pause stack when succeeding things becomes very simple, >>>> >>>> x = stack to pause to in case of failure >>>> cc = the continuation >>>> >>>> ( (x cc) goal1 goal2) >>>> :: (cc (goal1 (goal2 x)) >>>> >>>> ( (x cc) goal1 goal2) >>>> :: (let ((xx (mkstack))) >>>> (case (enter xx) >>>> ((child) >>>> (cc (goal2 xx))) >>>> >>>> ((pause) >>>> (cc (goal2 x))))) >>>> >>>> Very elegant, and we also can use some heuristics to store already made >>>> stacks when >>>> leaving a stack and reuse at the next enter which is a common theme in >>>> prolog, >>>> >>>> Anyhow we have an issue, consider the case where everythings >>>> succeds forever. Then we will blow the stack . There is no concept of tail >>>> calls here. So what you can do is the following for an , >>>> >>>> (let ((xx (mk-stack))) >>>> (case (enter xx) >>>> ((child) >>>> (goal1 x (lambda (xxx) (pause xx xxx))) >>>> >>>> ((pause xxx) >>>> (goal2 xxx cc)))) >>>> >>>> This enable cuts so that a cutted and (and!) in kanren lingo will use >>>> (goal2 x cc) >>>> >>>> And we have tail calls! >>>> >>>> >>>> I have a non jitted version guile working as a proof of concept. >>>> >>>> The drawback with this is if a function uses a lot of stack, it will be >>>> a memory hog. >>>> >>>> WDYT? >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> . >>>> >>>> >>>> >>>