unofficial mirror of guile-user@gnu.org 
 help / color / mirror / Atom feed
* Custom streams 10x faster than srfi-41
@ 2016-09-05 17:24 Amirouche Boubekki
  2016-09-05 19:14 ` Jeremy Steward
  0 siblings, 1 reply; 4+ messages in thread
From: Amirouche Boubekki @ 2016-09-05 17:24 UTC (permalink / raw)
  To: Guile User

[-- Attachment #1: Type: text/plain, Size: 803 bytes --]

Yesterday I discovered that my custom stream library is 10 times
faster than srfi-41 on guile 2.1.3

Basically, this custom stream thing is based on delayed lambda
evaluation or something like that. Since code is better than
a hundred word, here is a lazy version of `iota':

(define (lazy-iota count)
   (let loop ((i 0))
     (lambda ()
       (unless (eq? i count)
	(cons i (loop (1+ i)))))))

(define iota3 (lazy-iota 3))

(pk (iota3))

The first time `lazy-iota' is called it returns a procedure
that is bound to `count'. Then you can call that procedure
to get a pair with the first value and a procedure which will
allow to retrieve the rest.

I attached a file that mimicks Tinkerpop's Gremlin DSL and
benchmark it against srfi 41.


Happy hacking!

-- 
Amirouche ~ amz3 ~ http://www.hyperdev.fr

[-- Attachment #2: traversi-benchmark.scm --]
[-- Type: text/plain, Size: 1771 bytes --]

(use-modules (srfi srfi-41))
(use-modules (ice-9 match))
(use-modules (srfi srfi-9))  ;; records
(use-modules (srfi srfi-9 gnu))  ;; set-record-type-printer! and set-field


(define (list->traversi lst)
  (let loop ((lst lst))
    (lambda ()
      (if (null? lst)
          '()
          (cons (cons (car lst) '()) (loop (cdr lst)))))))

(define (traversi->list traversi)
  (let loop ((traversi traversi)
             (out '()))
    (match (traversi)
      ('() (reverse out))
      ((item . next) (loop next (cons (car item) out))))))

(define (traversi-map proc traversi)
  (let loop ((traversi traversi))
    (lambda ()
      (match (traversi)
        ('() '())
        ((item . next) (cons (cons (proc (car item)) item) (loop next)))))))

(define (traversi-filter proc traversi)
  (let loop1 ((traversi traversi))
    (lambda ()
      (let loop2 ((traversi traversi))
        (match (traversi)
          ('() '())
          ((item . next) (if (proc (car item))
                             (cons item (loop1 next))
                             (loop2 next))))))))

(define (traversi-parent traversi)
  (let loop ((traversi traversi))
    (lambda ()
      (match (traversi)
        ('() '())
        ((item . next)
         (let ((parents (cdr item)))
           (if (null? parents)
               (throw 'traversi "item has no parent")
               (cons parents (loop next)))))))))

(define-syntax-rule (timeit body ...)
  (let ((start (current-time)))
    (begin body ...)
    (- (current-time) start)))

(define count 10000000)

(pk (timeit (traversi->list (traversi-filter odd? (traversi-map (lambda (x) (* 3 x)) (list->traversi (iota count)))))))

(pk (timeit (stream->list (stream-filter odd? (stream-map (lambda (x) (* 3 x)) (list->stream (iota count)))))))

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Custom streams 10x faster than srfi-41
  2016-09-05 17:24 Custom streams 10x faster than srfi-41 Amirouche Boubekki
@ 2016-09-05 19:14 ` Jeremy Steward
  2016-09-05 20:48   ` Amirouche Boubekki
  0 siblings, 1 reply; 4+ messages in thread
From: Jeremy Steward @ 2016-09-05 19:14 UTC (permalink / raw)
  To: guile-user

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hey Amirouche,

| (define (lazy-iota count) (let loop ((i 0)) (lambda () (unless (eq?
| i count) (cons i (loop (1+ i)))))))
|
| (define iota3 (lazy-iota 3))
|
| (pk (iota3))
|
| The first time `lazy-iota' is called it returns a procedure that is
| bound to `count'. Then you can call that procedure to get a pair
| with the first value and a procedure which will allow to retrieve
| the rest.

This seems very similar to SRFI 121 (generators) and SRFI 127 (Lazy
Sequences). Is there something particularly different about this
implementation?

I've seen similar performance improvements in CHICKEN Scheme using
SRFI 121 and 127 over SRFI 41 (streams).

Cheers,
- -- 
Jeremy Steward

-----BEGIN PGP SIGNATURE-----

iQIcBAEBCAAGBQJXzcQUAAoJEAJ1qwSKG/f2/mkQALgocsUxgupz2zikBRDul8PX
/w2gUquy+uWlea+BwiVwv7/lTSSymq/1JE79ZMqR4lIpzC9cQXEMIskiz0Dge8hY
ltRbi98qkEqLO+RToe5rXNlpk6aBXjfViJPSYFBqynwuu3Ni2oSdMQrv7cXzL/OG
sAxuMIJdQo73BzGWdYZ/YugOVzkOU55/Vi0qAP6JNTTB9wFqpgYvk9HWALJoMJvP
URhyWIL7CA7CgFzU+EO3782tcxD+Qc63yjfUeoAWqSwxs/qW3ji4mbXbwDGTCpeZ
rq9TgpfryW4G/ZWCpGjffJHvCPbUF8K0i8LdWzRYO5Nzmk5mkEja6SomD36uJH++
i7ccJjDn7FPLZ2iGzjPsXgk+PT4ZGB/rVfr/aXLh8xTgpEPtK4+h0n6IYGgNgZhB
/9PCaI71i3y+xie00p2PMy4v/AZckO+ZTjwUsOotymQAs66tfMrtvLioLJkYpHqx
4zvVMKF/AmAbosSYD/Z065Tg88Si2ywfki1fQ0ljg0XmTGdWAiJ/+HCeJH7Xt+GL
MSSMY8XhMm8r4Gqc7hauaELmZ0Dea1s3FKL3ICzM3txNMSHpk77e1fZJcA3fQeZF
dAARhTbp5o6dwrL7MuSueGAUHP7qk8Lj5YtKOTIwX3oqYTp631ZKL+xxl0php5gY
dodNm50/1v3SHUx/euqA
=7wYq
-----END PGP SIGNATURE-----



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Custom streams 10x faster than srfi-41
  2016-09-05 19:14 ` Jeremy Steward
@ 2016-09-05 20:48   ` Amirouche Boubekki
  2016-09-06 17:59     ` Amirouche Boubekki
  0 siblings, 1 reply; 4+ messages in thread
From: Amirouche Boubekki @ 2016-09-05 20:48 UTC (permalink / raw)
  To: jeremy; +Cc: guile-user, guile-user

On 2016-09-05 21:14, Jeremy Steward wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
> 
> Hey Amirouche,
> 
> | (define (lazy-iota count) (let loop ((i 0)) (lambda () (unless (eq?
> | i count) (cons i (loop (1+ i)))))))
> |
> | (define iota3 (lazy-iota 3))
> |
> | (pk (iota3))
> |
> | The first time `lazy-iota' is called it returns a procedure that is
> | bound to `count'. Then you can call that procedure to get a pair
> | with the first value and a procedure which will allow to retrieve
> | the rest.
> 
> This seems very similar to SRFI 121 (generators) and SRFI 127 (Lazy
> Sequences).

Thanks for the pointer.

> Is there something particularly different about this implementation?

Values generated by traversi-map are chained which means what actually
travels in the stream is the history list of values with the first
value being the last created value. It allows to backtrack to previous
values. (This is not the case of lazy-iota)

I can't find it anymore in Tinkerpop documentation [0] so maybe it's
not a useful feature.

[0] http://tinkerpop.apache.org/docs/current/reference/#traversal

I'll explain how I use it nonetheless and the reasons I used to do so
in the context of graph traversal.

Usually what happens in graph traversal is that you start with one or
several vertex do some traversal/navigation or other map operation and
filter based on some heuristic and want to find out the original
vertex that lead to the current result you have, that's where
"backtracking" happens. The thing is that you could do the traversal
in reverse order to retrieve the original vertex that's why it might
not be as useful as I was thinking.

The thing is that my graph database library never fully load vertices
and edges into RAM when it traverse a graph. The current value in the
traversi is never more than a proper list. So it can be a string, an
integer or a list of.

For instance if one can use traversi to implement item-item
recommendations.

Draw a graph like the following:

u1 --[likes@1]--> m1

u2 --[likes@1]--> m1
u2 --[likes@1]--> m2
u2 --[likes@0]--> m3

u3 --[likes@1]--> m1
u3 --[likes@1]--> m4

u4 --[likes@0]--> m1
u4 --[likes@1]--> m5

What does the item-item recommendation do, is search for user that
rated high some movie and look what other movie they rated high.

For instance, we can do the following traversal starting with movie
m1:

m1 <--[likes@1]-- u1,u2,u3 --[likes@1]--> m1,m2,m4

I won't explain how everything happens it's better explained
elsewhere [1] (in french here [2])

[1] 
https://markorodriguez.com/2011/09/22/a-graph-based-movie-recommender-engine/
[2] http://hyperdev.fr/notes/recommendations-basees-sur-les-graphes.html

Basically the traversal starts at m1, ask for incomings edges that
are likes which have a score of 1. If edges were fully loaded into
memory I could simply use a filter proc like

(lambda (edge)
   (and (eqv? (assoc-ref edge 'label) 'likes)
        (eq? (assoc-ref edge 'score) 1)))

But instead I have a procedure which fetch a single field.

So the flow of the traversi program is:

- m1
- fetch incomings edges [*]
- fetch 'label for current value
- filter: if value == 'likes
- backtrack to [*]
- fetch 'score for current value
- filter: if value == 1
- backtrack to [*]

Like I said it's an optimisation trick, maybe it's not worthwhile,
maybe it's evil.

> I've seen similar performance improvements in CHICKEN Scheme using
> SRFI 121 and 127 over SRFI 41 (streams).
> 
> Cheers,
> - -- Jeremy Steward
> 


Thanks for giving me the opportunity to re-think it.

-- 
Amirouche ~ amz3 ~ http://www.hyperdev.fr




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Custom streams 10x faster than srfi-41
  2016-09-05 20:48   ` Amirouche Boubekki
@ 2016-09-06 17:59     ` Amirouche Boubekki
  0 siblings, 0 replies; 4+ messages in thread
From: Amirouche Boubekki @ 2016-09-06 17:59 UTC (permalink / raw)
  To: jeremy; +Cc: guile-user, guile-user

On 2016-09-05 22:48, Amirouche Boubekki wrote:
> On 2016-09-05 21:14, Jeremy Steward wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA256
>> 
>> Hey Amirouche,
>> 
>> | (define (lazy-iota count) (let loop ((i 0)) (lambda () (unless (eq?
>> | i count) (cons i (loop (1+ i)))))))
>> |
>> | (define iota3 (lazy-iota 3))
>> |
>> | (pk (iota3))
>> |
>> | The first time `lazy-iota' is called it returns a procedure that is
>> | bound to `count'. Then you can call that procedure to get a pair
>> | with the first value and a procedure which will allow to retrieve
>> | the rest.
>> 
>> This seems very similar to SRFI 121 (generators) and SRFI 127 (Lazy
>> Sequences).
> 
> Thanks for the pointer.
> 
>> Is there something particularly different about this implementation?
> 
> Values generated by traversi-map are chained which means what actually
> travels in the stream is the history list of values with the first
> value being the last created value. It allows to backtrack to previous
> values. (This is not the case of lazy-iota)
> 
> I can't find it anymore in Tinkerpop documentation [0] so maybe it's
> not a useful feature.
> 
> [0] http://tinkerpop.apache.org/docs/current/reference/#traversal
> 

As of today, `back` step in gremlin was superseed by `select` step [1].

[1] http://tinkerpop.apache.org/docs/current/reference/#select-step

I'd like to add that even if given a traversal that is not a random
walk, you can always go back using a reverse traversal so it seems
like useless to have a `back` or `backtrack` or `select` step that
said it's much easier to use `backtrack` as it's more readable.
Right now my implementation doesn't allow to create labels inside
the computation chain to backtrack using a name (unlike gremlin).


Best regards,

amz3



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-09-06 17:59 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-09-05 17:24 Custom streams 10x faster than srfi-41 Amirouche Boubekki
2016-09-05 19:14 ` Jeremy Steward
2016-09-05 20:48   ` Amirouche Boubekki
2016-09-06 17:59     ` Amirouche Boubekki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).