unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
* Guile's time execution issues
@ 2020-04-21 22:03 Aleix Conchillo Flaqué
  2020-04-22 13:47 ` Linus Björnstam
  2020-04-26 17:16 ` Ludovic Courtès
  0 siblings, 2 replies; 12+ messages in thread
From: Aleix Conchillo Flaqué @ 2020-04-21 22:03 UTC (permalink / raw)
  To: guile-user, guile-devel

Hi,

I was trying to get some guile-json performance times loading large JSON
file. However, I'm getting increasing numbers at each run, so I'm wondering
if I'm doing something wrong. Below you can see how the first run took
19.95s and then running the same command kept increasing.

I'm running Guile 2.2.7 on macOS Catalina 10.15.3.

scheme@(guile-user)> (use-modules (json))
scheme@(guile-user)> ,t (define a (call-with-input-file
"/Users/aleix/Downloads/large-file.json" (lambda (port) (json->scm port))))
;; 19.956429s real time, 87.100982s run time.  75.270202s spent in GC.
;; 26.173179s real time, 143.645265s run time.  131.022631s spent in GC.
;; 28.193926s real time, 154.758375s run time.  141.697236s spent in GC.
;; 29.044218s real time, 160.745984s run time.  147.449073s spent in GC.
;; 30.480873s real time, 170.855527s run time.  157.332793s spent in GC.
;; 30.555700s real time, 172.938278s run time.  159.468737s spent in GC.
;; 32.190478s real time, 172.807551s run time.  158.905645s spent in GC.

The good news is that I have applied a suggestion from Linus Björnstam that
reduces times by half (considering the first run above). The suggestion was
to use string ports instead of string-append. And interestingly this time,
time executions remain around the same value:

scheme@(guile-user)> ,t (define a (call-with-input-file
"/Users/aleix/Downloads/large-file.json" (lambda (port) (json->scm port))))
;; 9.099597s real time, 17.239176s run time.  9.309819s spent in GC.
;; 9.927868s real time, 20.855859s run time.  12.528727s spent in GC.
;; 8.981248s real time, 15.043991s run time.  7.050269s spent in GC.
;; 9.055893s real time, 15.400981s run time.  7.383187s spent in GC.
;; 9.055850s real time, 15.261824s run time.  7.239188s spent in GC.
;; 9.315239s real time, 15.231889s run time.  7.005570s spent in GC.
;; 9.106168s real time, 14.987678s run time.  6.915028s spent in GC.
;; 9.175142s real time, 14.471324s run time.  6.302749s spent in GC.
;; 8.966537s real time, 14.400318s run time.  6.412858s spent in GC.

JSON test data obtained from:

https://github.com/json-iterator/test-data/blob/master/large-file.json

Best,

Aleix


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Guile's time execution issues
  2020-04-21 22:03 Guile's time execution issues Aleix Conchillo Flaqué
@ 2020-04-22 13:47 ` Linus Björnstam
  2020-04-26 17:16 ` Ludovic Courtès
  1 sibling, 0 replies; 12+ messages in thread
From: Linus Björnstam @ 2020-04-22 13:47 UTC (permalink / raw)
  To: Aleix Conchillo Flaqué, guile-user, guile-devel

Hi Aleix

Try throwing the result of json->SCM away: (let ((a (call-with...))) 1). Then you can run a (gc) between runs. If the GC time increases then, maybe guile cannot release the memory?? I don't know, but there is some shared substring thing going on in guile, even though I think string-append doesn't do that.

Nice to see that my suggestions worked!
-- 
  Linus Björnstam

On Wed, 22 Apr 2020, at 00:03, Aleix Conchillo Flaqué wrote:
> Hi,
> 
> I was trying to get some guile-json performance times loading large 
> JSON file. However, I'm getting increasing numbers at each run, so I'm 
> wondering if I'm doing something wrong. Below you can see how the first 
> run took 19.95s and then running the same command kept increasing.
> 
> I'm running Guile 2.2.7 on macOS Catalina 10.15.3.
> 
> scheme@(guile-user)> (use-modules (json))
> scheme@(guile-user)> ,t (define a (call-with-input-file 
> "/Users/aleix/Downloads/large-file.json" (lambda (port) (json->scm 
> port))))
> ;; 19.956429s real time, 87.100982s run time. 75.270202s spent in GC.
> ;; 26.173179s real time, 143.645265s run time. 131.022631s spent in GC.
> ;; 28.193926s real time, 154.758375s run time. 141.697236s spent in GC.
> ;; 29.044218s real time, 160.745984s run time. 147.449073s spent in GC.
> ;; 30.480873s real time, 170.855527s run time. 157.332793s spent in GC.
> ;; 30.555700s real time, 172.938278s run time. 159.468737s spent in GC.
> ;; 32.190478s real time, 172.807551s run time. 158.905645s spent in GC.
> 
> The good news is that I have applied a suggestion from Linus Björnstam 
> that reduces times by half (considering the first run above). The 
> suggestion was to use string ports instead of string-append. And 
> interestingly this time, time executions remain around the same value:
> 
> scheme@(guile-user)> ,t (define a (call-with-input-file 
> "/Users/aleix/Downloads/large-file.json" (lambda (port) (json->scm 
> port))))
> ;; 9.099597s real time, 17.239176s run time. 9.309819s spent in GC.
> ;; 9.927868s real time, 20.855859s run time. 12.528727s spent in GC.
> ;; 8.981248s real time, 15.043991s run time. 7.050269s spent in GC.
> ;; 9.055893s real time, 15.400981s run time. 7.383187s spent in GC.
> ;; 9.055850s real time, 15.261824s run time. 7.239188s spent in GC.
> ;; 9.315239s real time, 15.231889s run time. 7.005570s spent in GC.
> ;; 9.106168s real time, 14.987678s run time. 6.915028s spent in GC.
> ;; 9.175142s real time, 14.471324s run time. 6.302749s spent in GC.
> ;; 8.966537s real time, 14.400318s run time. 6.412858s spent in GC.
> 
> JSON test data obtained from:
> 
> https://github.com/json-iterator/test-data/blob/master/large-file.json
> 
> Best,
> 
> Aleix



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Guile's time execution issues
  2020-04-21 22:03 Guile's time execution issues Aleix Conchillo Flaqué
  2020-04-22 13:47 ` Linus Björnstam
@ 2020-04-26 17:16 ` Ludovic Courtès
  2020-04-26 23:14   ` Aleix Conchillo Flaqué
  1 sibling, 1 reply; 12+ messages in thread
From: Ludovic Courtès @ 2020-04-26 17:16 UTC (permalink / raw)
  To: Aleix Conchillo Flaqué; +Cc: guile-user, guile-devel

Bon dia!

Aleix Conchillo Flaqué <aconchillo@gmail.com> skribis:

> I was trying to get some guile-json performance times loading large JSON
> file. However, I'm getting increasing numbers at each run, so I'm wondering
> if I'm doing something wrong. Below you can see how the first run took
> 19.95s and then running the same command kept increasing.
>
> I'm running Guile 2.2.7 on macOS Catalina 10.15.3.
>
> scheme@(guile-user)> (use-modules (json))
> scheme@(guile-user)> ,t (define a (call-with-input-file
> "/Users/aleix/Downloads/large-file.json" (lambda (port) (json->scm port))))
> ;; 19.956429s real time, 87.100982s run time.  75.270202s spent in GC.
> ;; 26.173179s real time, 143.645265s run time.  131.022631s spent in GC.
> ;; 28.193926s real time, 154.758375s run time.  141.697236s spent in GC.
> ;; 29.044218s real time, 160.745984s run time.  147.449073s spent in GC.
> ;; 30.480873s real time, 170.855527s run time.  157.332793s spent in GC.
> ;; 30.555700s real time, 172.938278s run time.  159.468737s spent in GC.
> ;; 32.190478s real time, 172.807551s run time.  158.905645s spent in GC.

Could this have to do with <https://issues.guix.gnu.org/issue/40194>?

Could you check if that happens with 3.0.2?  (Or suggest a
‘large-file.json’ to use.  :-))

Thanks in advance!

Ludo’.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Guile's time execution issues
  2020-04-26 17:16 ` Ludovic Courtès
@ 2020-04-26 23:14   ` Aleix Conchillo Flaqué
  2020-05-02 14:11     ` Ludovic Courtès
  0 siblings, 1 reply; 12+ messages in thread
From: Aleix Conchillo Flaqué @ 2020-04-26 23:14 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-user, guile-devel

On Sun, Apr 26, 2020 at 10:16 AM Ludovic Courtès <ludo@gnu.org> wrote:

> Bon dia!
>
>
Bon dia! And bon jour! :-D

> Aleix Conchillo Flaqué <aconchillo@gmail.com> skribis:
>
> > I was trying to get some guile-json performance times loading large JSON
> > file. However, I'm getting increasing numbers at each run, so I'm
> wondering
> > if I'm doing something wrong. Below you can see how the first run took
> > 19.95s and then running the same command kept increasing.
> >
> > I'm running Guile 2.2.7 on macOS Catalina 10.15.3.
> >
> > scheme@(guile-user)> (use-modules (json))
> > scheme@(guile-user)> ,t (define a (call-with-input-file
> > "/Users/aleix/Downloads/large-file.json" (lambda (port) (json->scm
> port))))
> > ;; 19.956429s real time, 87.100982s run time.  75.270202s spent in GC.
> > ;; 26.173179s real time, 143.645265s run time.  131.022631s spent in GC.
> > ;; 28.193926s real time, 154.758375s run time.  141.697236s spent in GC.
> > ;; 29.044218s real time, 160.745984s run time.  147.449073s spent in GC.
> > ;; 30.480873s real time, 170.855527s run time.  157.332793s spent in GC.
> > ;; 30.555700s real time, 172.938278s run time.  159.468737s spent in GC.
> > ;; 32.190478s real time, 172.807551s run time.  158.905645s spent in GC.
>
> Could this have to do with <https://issues.guix.gnu.org/issue/40194>?
>
>
Honestly, I have no idea... :-(

Could you check if that happens with 3.0.2?  (Or suggest a
> ‘large-file.json’ to use.  :-))
>
>
Sure! I suggested a JSON, it was hidden at the end of my first message ;-).
This is the one:

   https://github.com/json-iterator/test-data/blob/master/large-file.json

So, bad and good news for Guile 3.0.2. I'm trying it inside Docker using
this image https://hub.docker.com/r/schemers/guile.

On guile-json 3.5.0 (still using (string-append)) the first execution time
goes from 19 seconds to 42 seconds. Then, the times keep increasing as in
version 2.2.7 but numbers are much bigger:

# ./env guile
GNU Guile 3.0.2
Copyright (C) 1995-2020 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (use-modules (json))
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 42.015887s real time, 42.059462s run time.  32.838460s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 56.228837s real time, 56.176989s run time.  46.112349s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 63.472869s real time, 63.383118s run time.  52.642226s spent in GC.

The good news is that on master branch I don't use (string-append) anymore
and it goes from 19 seconds (in 2.2.7) to ~6 seconds (both in 2.2.7 and
3.0.2).

❯ ./env guile
GNU Guile 2.2.7
Copyright (C) 1995-2019 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (use-modules (json))
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 6.635275s real time, 11.126605s run time.  5.119203s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 6.856995s real time, 12.605237s run time.  6.476064s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 6.556502s real time, 10.542702s run time.  4.550531s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 6.396581s real time, 9.638931s run time.  3.707881s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 6.293497s real time, 8.944718s run time.  3.055733s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 6.249073s real time, 8.938264s run time.  3.099037s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 6.168000s real time, 8.557252s run time.  2.772902s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 6.206124s real time, 7.920464s run time.  2.022655s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 6.404214s real time, 8.412361s run time.  2.402077s spent in GC.

And actually Guile 3.0.2 seems a bit slower at the beginning but then for
some reason times go down until they seem to stabilize (I have tried it a
couple of times and it does the same) and then it seems a bit faster than
2.2.7:

# ./env guile
GNU Guile 3.0.2
Copyright (C) 1995-2020 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (use-modules (json))
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 6.961477s real time, 6.973492s run time.  2.294030s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 7.450862s real time, 7.422345s run time.  2.718710s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 6.438989s real time, 6.399829s run time.  1.896063s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 6.097962s real time, 6.070055s run time.  1.540690s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 5.650008s real time, 5.668240s run time.  1.272017s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 5.754515s real time, 5.748386s run time.  1.275179s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 5.760096s real time, 5.755418s run time.  1.310558s spent in GC.
scheme@(guile-user)> ,t (define a (call-with-input-file "large-file.json"
(lambda (port) (json->scm port))))
;; 5.745274s real time, 5.739386s run time.  1.356917s spent in GC.

Let me know if you want me to test anything else.



> Thanks in advance!
>
>
Thank you!

Aleix


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Guile's time execution issues
  2020-04-26 23:14   ` Aleix Conchillo Flaqué
@ 2020-05-02 14:11     ` Ludovic Courtès
  2020-05-04  0:32       ` Aleix Conchillo Flaqué
  0 siblings, 1 reply; 12+ messages in thread
From: Ludovic Courtès @ 2020-05-02 14:11 UTC (permalink / raw)
  To: Aleix Conchillo Flaqué; +Cc: guile-user, guile-devel

Hola!

Aleix Conchillo Flaqué <aconchillo@gmail.com> skribis:

> On guile-json 3.5.0 (still using (string-append)) the first execution time
> goes from 19 seconds to 42 seconds. Then, the times keep increasing as in
> version 2.2.7 but numbers are much bigger:

With Guile 3.0.2 and Guile-JSON 3.5.0, I get:

--8<---------------cut here---------------start------------->8---
$ guile
GNU Guile 3.0.2
Copyright (C) 1995-2020 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> ,use(json)
scheme@(guile-user)> ,t (->bool (call-with-input-file "/tmp/large-file.json" json->scm))
$1 = #t
;; 55.142128s real time, 79.806656s run time.  65.418070s spent in GC.
scheme@(guile-user)> ,t (->bool (call-with-input-file "/tmp/large-file.json" json->scm))
$2 = #t
;; 47.416645s real time, 75.274219s run time.  62.421108s spent in GC.
scheme@(guile-user)> ,t (->bool (call-with-input-file "/tmp/large-file.json" json->scm))
$3 = #t
;; 41.292368s real time, 79.053120s run time.  67.266710s spent in GC.
--8<---------------cut here---------------end--------------->8---

So I think the time increase was just due to the fact that previous
parse results were kept around, somehow.

2.2.7 performs comparably for me:

--8<---------------cut here---------------start------------->8---
$ guix environment -C --ad-hoc guile-json guile@2.2 --share=/tmp -- guile 
guix environment: warning: plursenca pak-specifigo 'guile@2.2'
guix environment: warning: choosing guile@2.2.7 from gnu/packages/guile.scm:256:2
GNU Guile 2.2.7
Copyright (C) 1995-2019 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> ,use(json)
scheme@(guile-user)> ,t (->bool (call-with-input-file "/tmp/large-file.json" json->scm))
$1 = #t
;; 44.963180s real time, 90.606198s run time.  71.529811s spent in GC.
scheme@(guile-user)> ,t (->bool (call-with-input-file "/tmp/large-file.json" json->scm))
$2 = #t
;; 44.147740s real time, 87.796937s run time.  69.818018s spent in GC.
scheme@(guile-user)> ,t (->bool (call-with-input-file "/tmp/large-file.json" json->scm))
$3 = #t
;; 45.057761s real time, 89.689930s run time.  71.370764s spent in GC.
--8<---------------cut here---------------end--------------->8---

So to me, Guile is behaving correctly here.

Now, it would be good to profile ‘json->scm’ to see if there’s anything
that could be improved on the Guile side, or if it’s just a normal
profile for GC-intensive code.

Thanks,
Ludo’.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Guile's time execution issues
  2020-05-02 14:11     ` Ludovic Courtès
@ 2020-05-04  0:32       ` Aleix Conchillo Flaqué
  2020-05-04  9:36         ` Ludovic Courtès
  0 siblings, 1 reply; 12+ messages in thread
From: Aleix Conchillo Flaqué @ 2020-05-04  0:32 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-user, guile-devel

On Sat, May 2, 2020 at 7:11 AM Ludovic Courtès <ludo@gnu.org> wrote:

> Hola!
>
>
Hola!

So weird I'm getting different numbers on 2.2.7. Not sure how I'm getting
those initial ~20s and you are getting consistent ~ 45s. It shouldn't have
nothing to do with it, but could it be I'm running it on macOS?

Now, it would be good to profile ‘json->scm’ to see if there’s anything
> that could be improved on the Guile side, or if it’s just a normal
> profile for GC-intensive code.
>
>
Good news is that I have been working on performance improvements and
json->scm is going down from my ~19 seconds to ~3 seconds on the same
sample file. Linus Björnstam was the one to bring up performance issues so
we've been back and forth trying to make it fast.

One thing I found is that `match` is slow. The code looked nicer but had to
change it back to lets and conds as the performance increase was ~2 seconds.

Thanks for looking into this,

Aleix


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Guile's time execution issues
  2020-05-04  0:32       ` Aleix Conchillo Flaqué
@ 2020-05-04  9:36         ` Ludovic Courtès
  2020-05-04 11:19           ` Linus Björnstam
  2020-05-04 18:47           ` Linus Björnstam
  0 siblings, 2 replies; 12+ messages in thread
From: Ludovic Courtès @ 2020-05-04  9:36 UTC (permalink / raw)
  To: Aleix Conchillo Flaqué; +Cc: guile-user, guile-devel

Hey!

Aleix Conchillo Flaqué <aconchillo@gmail.com> skribis:

> So weird I'm getting different numbers on 2.2.7. Not sure how I'm getting those initial ~20s and you are getting consistent ~ 45s. It
> shouldn't have nothing to do with it, but could it be I'm running it on macOS?

Did you add this ‘->bool’ call to ensure the resulting alist is not kept
in memory?

>  Now, it would be good to profile ‘json->scm’ to see if there’s anything
>  that could be improved on the Guile side, or if it’s just a normal
>  profile for GC-intensive code.
>
> Good news is that I have been working on performance improvements and json->scm is going down from my ~19 seconds to ~3
> seconds on the same sample file. Linus Björnstam was the one to bring up performance issues so we've been back and forth trying to
> make it fast.

Nice!

> One thing I found is that `match` is slow. The code looked nicer but had to change it back to lets and conds as the performance
> increase was ~2 seconds.

Oh, in which case exactly?  And are you sure your hand-written code is
equivalent to the ‘match’ code (it’s common for hand-written code to be
more lax than ‘match’)?

One thing to pay attention to is the use of ‘list?’, which is O(N), and
is implied by ellipses in ‘match’.  If you want to use ‘match’ in a way
that avoids ‘list?’, write patterns such as (a . b) instead of (a b ...).
It doesn’t have the same meaning, but often the end result is the same,
for instance because you’ll later match on ‘b’ anyway.

(I wish we can one day have a proper list type disjoint from pairs…)

Thanks,
Ludo’.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Guile's time execution issues
  2020-05-04  9:36         ` Ludovic Courtès
@ 2020-05-04 11:19           ` Linus Björnstam
  2020-05-04 20:09             ` Ludovic Courtès
  2020-05-04 18:47           ` Linus Björnstam
  1 sibling, 1 reply; 12+ messages in thread
From: Linus Björnstam @ 2020-05-04 11:19 UTC (permalink / raw)
  To: Ludovic Courtès, Aleix Conchillo Flaqué; +Cc: guile-user, guile-devel


On Mon, 4 May 2020, at 11:36, Ludovic Courtès wrote:
 
> > One thing I found is that `match` is slow. The code looked nicer but had to change it back to lets and conds as the performance
> > increase was ~2 seconds.
> 
> Oh, in which case exactly?  And are you sure your hand-written code is
> equivalent to the ‘match’ code (it’s common for hand-written code to be
> more lax than ‘match’)?
> 
> One thing to pay attention to is the use of ‘list?’, which is O(N), and
> is implied by ellipses in ‘match’.  If you want to use ‘match’ in a way
> that avoids ‘list?’, write patterns such as (a . b) instead of (a b ...).
> It doesn’t have the same meaning, but often the end result is the same,
> for instance because you’ll later match on ‘b’ anyway.
> 
> (I wish we can one day have a proper list type disjoint from pairs…)

The change is here: he is only matching against chars and predicates: https://github.com/aconchillo/guile-json/commit/ad4b06d86e4822466983d00f55474c8f664b538d




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Guile's time execution issues
  2020-05-04  9:36         ` Ludovic Courtès
  2020-05-04 11:19           ` Linus Björnstam
@ 2020-05-04 18:47           ` Linus Björnstam
  1 sibling, 0 replies; 12+ messages in thread
From: Linus Björnstam @ 2020-05-04 18:47 UTC (permalink / raw)
  To: Ludovic Courtès, Aleix Conchillo Flaqué; +Cc: guile-user, guile-devel

Sorry, sent a premature reply. The problem is that some of those match blocks expand to using equal? which is a lot slower than using eqv? If we are doing it on every char in a 24mb file we are getting some serious constant factors. 

match is a syntax-rules macro, so distinguishing literals are not possible. Concerning "the macro writer's bill of rights" I could maybe think this it would be a rather nice thing to turn equal? to eqv? when one argument is a char literal :D

-- 
  Linus Björnstam

On Mon, 4 May 2020, at 11:36, Ludovic Courtès wrote:
> Hey!
> 
> Aleix Conchillo Flaqué <aconchillo@gmail.com> skribis:
> 
> > So weird I'm getting different numbers on 2.2.7. Not sure how I'm getting those initial ~20s and you are getting consistent ~ 45s. It
> > shouldn't have nothing to do with it, but could it be I'm running it on macOS?
> 
> Did you add this ‘->bool’ call to ensure the resulting alist is not kept
> in memory?
> 
> >  Now, it would be good to profile ‘json->scm’ to see if there’s anything
> >  that could be improved on the Guile side, or if it’s just a normal
> >  profile for GC-intensive code.
> >
> > Good news is that I have been working on performance improvements and json->scm is going down from my ~19 seconds to ~3
> > seconds on the same sample file. Linus Björnstam was the one to bring up performance issues so we've been back and forth trying to
> > make it fast.
> 
> Nice!
> 
> > One thing I found is that `match` is slow. The code looked nicer but had to change it back to lets and conds as the performance
> > increase was ~2 seconds.
> 
> Oh, in which case exactly?  And are you sure your hand-written code is
> equivalent to the ‘match’ code (it’s common for hand-written code to be
> more lax than ‘match’)?
> 
> One thing to pay attention to is the use of ‘list?’, which is O(N), and
> is implied by ellipses in ‘match’.  If you want to use ‘match’ in a way
> that avoids ‘list?’, write patterns such as (a . b) instead of (a b ...).
> It doesn’t have the same meaning, but often the end result is the same,
> for instance because you’ll later match on ‘b’ anyway.
> 
> (I wish we can one day have a proper list type disjoint from pairs…)
> 
> Thanks,
> Ludo’.
> 
>



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Guile's time execution issues
  2020-05-04 11:19           ` Linus Björnstam
@ 2020-05-04 20:09             ` Ludovic Courtès
  2020-05-04 20:50               ` Linus Björnstam
  0 siblings, 1 reply; 12+ messages in thread
From: Ludovic Courtès @ 2020-05-04 20:09 UTC (permalink / raw)
  To: Linus Björnstam; +Cc: guile-user, guile-devel

Hi,

Linus Björnstam <linus.bjornstam@veryfast.biz> skribis:

> On Mon, 4 May 2020, at 11:36, Ludovic Courtès wrote:
>  
>> > One thing I found is that `match` is slow. The code looked nicer but had to change it back to lets and conds as the performance
>> > increase was ~2 seconds.
>> 
>> Oh, in which case exactly?  And are you sure your hand-written code is
>> equivalent to the ‘match’ code (it’s common for hand-written code to be
>> more lax than ‘match’)?
>> 
>> One thing to pay attention to is the use of ‘list?’, which is O(N), and
>> is implied by ellipses in ‘match’.  If you want to use ‘match’ in a way
>> that avoids ‘list?’, write patterns such as (a . b) instead of (a b ...).
>> It doesn’t have the same meaning, but often the end result is the same,
>> for instance because you’ll later match on ‘b’ anyway.
>> 
>> (I wish we can one day have a proper list type disjoint from pairs…)
>
> The change is here: he is only matching against chars and predicates: https://github.com/aconchillo/guile-json/commit/ad4b06d86e4822466983d00f55474c8f664b538d

It would be nice if you could pinpoint which one of these changes causes
a difference, because:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> ,optimize (match (peek-char port) ((? eof-object?) x) ((? whitespace?) w) (_ e))
$84 = (let ((v (peek-char port)))
  (cond ((eof-object? v) x)
        ((whitespace? v) w)
        (else e)))
--8<---------------cut here---------------end--------------->8---

What might make a difference is the code bloat when using ‘or’:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> ,optimize (match (peek-char port) ((or #\a #\b #\c #\d) x))
$86 = (let ((v (peek-char port)))
  (cond ((equal? v #\a) x)
        ((equal? v #\b) x)
        ((equal? v #\c) x)
        ((equal? v #\d) x)
        (else
         ((@@ (ice-9 match) error)
          'match
          "no matching pattern"
          v)
         #f)))
--8<---------------cut here---------------end--------------->8---

but even that sounds unlikely.

You’re compiling with -O2, right?

Thanks,
Ludo’.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Guile's time execution issues
  2020-05-04 20:09             ` Ludovic Courtès
@ 2020-05-04 20:50               ` Linus Björnstam
  2020-05-08 11:31                 ` Linus Björnstam
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Björnstam @ 2020-05-04 20:50 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Aleix Conchillo Flaqué, guile-user, guile-devel

You didn't see my other reply. The matching code isn't suboptimal. The equality predicate is  The problem is that match compares using equal? even for literal chars (where eqv? is a lot faster). It would be a rather trivial optimization to do, either to match.scm (meaning: breaking with upstream and use syntax-case) or to the guile compiler in general (changing equal? to eqv, when there are character literals), which seems ok-ish for this use-case but at very little benefit in general.

A long-term goal of mine is to write a pattern matcher with the optimisations that the racket matcher does (among other things: some serious list matching reordering!). That is a daunting task though.

-- 
  Linus Björnstam

On Mon, 4 May 2020, at 22:09, Ludovic Courtès wrote:
> Hi,
> 
> Linus Björnstam <linus.bjornstam@veryfast.biz> skribis:
> 
> > On Mon, 4 May 2020, at 11:36, Ludovic Courtès wrote:
> >  
> >> > One thing I found is that `match` is slow. The code looked nicer but had to change it back to lets and conds as the performance
> >> > increase was ~2 seconds.
> >> 
> >> Oh, in which case exactly?  And are you sure your hand-written code is
> >> equivalent to the ‘match’ code (it’s common for hand-written code to be
> >> more lax than ‘match’)?
> >> 
> >> One thing to pay attention to is the use of ‘list?’, which is O(N), and
> >> is implied by ellipses in ‘match’.  If you want to use ‘match’ in a way
> >> that avoids ‘list?’, write patterns such as (a . b) instead of (a b ...).
> >> It doesn’t have the same meaning, but often the end result is the same,
> >> for instance because you’ll later match on ‘b’ anyway.
> >> 
> >> (I wish we can one day have a proper list type disjoint from pairs…)
> >
> > The change is here: he is only matching against chars and predicates: https://github.com/aconchillo/guile-json/commit/ad4b06d86e4822466983d00f55474c8f664b538d
> 
> It would be nice if you could pinpoint which one of these changes causes
> a difference, because:
> 
> --8<---------------cut here---------------start------------->8---
> scheme@(guile-user)> ,optimize (match (peek-char port) ((? eof-object?) 
> x) ((? whitespace?) w) (_ e))
> $84 = (let ((v (peek-char port)))
>   (cond ((eof-object? v) x)
>         ((whitespace? v) w)
>         (else e)))
> --8<---------------cut here---------------end--------------->8---
> 
> What might make a difference is the code bloat when using ‘or’:
> 
> --8<---------------cut here---------------start------------->8---
> scheme@(guile-user)> ,optimize (match (peek-char port) ((or #\a #\b #\c #\d) x))
> $86 = (let ((v (peek-char port)))
>   (cond ((equal? v #\a) x)
>         ((equal? v #\b) x)
>         ((equal? v #\c) x)
>         ((equal? v #\d) x)
>         (else
>          ((@@ (ice-9 match) error)
>           'match
>           "no matching pattern"
>           v)
>          #f)))
> --8<---------------cut here---------------end--------------->8---
> 
> but even that sounds unlikely.
> 
> You’re compiling with -O2, right?
> 
> Thanks,
> Ludo’.
>



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Guile's time execution issues
  2020-05-04 20:50               ` Linus Björnstam
@ 2020-05-08 11:31                 ` Linus Björnstam
  0 siblings, 0 replies; 12+ messages in thread
From: Linus Björnstam @ 2020-05-08 11:31 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-user, guile-devel

Another option would be to just overload equal? in match.scm to transform into eqv? when there are char literals or numbers, eq? on symbols, booleans,  the empty list and keywords and (@@ (guile) equal?) for everything else.

Considering that it in this case contributed a 25% overhead to code that was performance critical I think it would be a pretty valid thing to do. If you, ludo, an Andy thinks that would be a good idea I can make such a patch for match.scm. That would have the benefit of not changing the upstream code (which is (include ...)d in match.scm), nor fiddling around with guile optimisations.

-- 
  Linus Björnstam

On Mon, 4 May 2020, at 22:50, Linus Björnstam wrote:
> You didn't see my other reply. The matching code isn't suboptimal. The 
> equality predicate is  The problem is that match compares using equal? 
> even for literal chars (where eqv? is a lot faster). It would be a 
> rather trivial optimization to do, either to match.scm (meaning: 
> breaking with upstream and use syntax-case) or to the guile compiler in 
> general (changing equal? to eqv, when there are character literals), 
> which seems ok-ish for this use-case but at very little benefit in 
> general.
> 
> A long-term goal of mine is to write a pattern matcher with the 
> optimisations that the racket matcher does (among other things: some 
> serious list matching reordering!). That is a daunting task though.
> 
> -- 
>   Linus Björnstam
> 
> On Mon, 4 May 2020, at 22:09, Ludovic Courtès wrote:
> > Hi,
> > 
> > Linus Björnstam <linus.bjornstam@veryfast.biz> skribis:
> > 
> > > On Mon, 4 May 2020, at 11:36, Ludovic Courtès wrote:
> > >  
> > >> > One thing I found is that `match` is slow. The code looked nicer but had to change it back to lets and conds as the performance
> > >> > increase was ~2 seconds.
> > >> 
> > >> Oh, in which case exactly?  And are you sure your hand-written code is
> > >> equivalent to the ‘match’ code (it’s common for hand-written code to be
> > >> more lax than ‘match’)?
> > >> 
> > >> One thing to pay attention to is the use of ‘list?’, which is O(N), and
> > >> is implied by ellipses in ‘match’.  If you want to use ‘match’ in a way
> > >> that avoids ‘list?’, write patterns such as (a . b) instead of (a b ...).
> > >> It doesn’t have the same meaning, but often the end result is the same,
> > >> for instance because you’ll later match on ‘b’ anyway.
> > >> 
> > >> (I wish we can one day have a proper list type disjoint from pairs…)
> > >
> > > The change is here: he is only matching against chars and predicates: https://github.com/aconchillo/guile-json/commit/ad4b06d86e4822466983d00f55474c8f664b538d
> > 
> > It would be nice if you could pinpoint which one of these changes causes
> > a difference, because:
> > 
> > --8<---------------cut here---------------start------------->8---
> > scheme@(guile-user)> ,optimize (match (peek-char port) ((? eof-object?) 
> > x) ((? whitespace?) w) (_ e))
> > $84 = (let ((v (peek-char port)))
> >   (cond ((eof-object? v) x)
> >         ((whitespace? v) w)
> >         (else e)))
> > --8<---------------cut here---------------end--------------->8---
> > 
> > What might make a difference is the code bloat when using ‘or’:
> > 
> > --8<---------------cut here---------------start------------->8---
> > scheme@(guile-user)> ,optimize (match (peek-char port) ((or #\a #\b #\c #\d) x))
> > $86 = (let ((v (peek-char port)))
> >   (cond ((equal? v #\a) x)
> >         ((equal? v #\b) x)
> >         ((equal? v #\c) x)
> >         ((equal? v #\d) x)
> >         (else
> >          ((@@ (ice-9 match) error)
> >           'match
> >           "no matching pattern"
> >           v)
> >          #f)))
> > --8<---------------cut here---------------end--------------->8---
> > 
> > but even that sounds unlikely.
> > 
> > You’re compiling with -O2, right?
> > 
> > Thanks,
> > Ludo’.
> >
> 
>



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-05-08 11:31 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-21 22:03 Guile's time execution issues Aleix Conchillo Flaqué
2020-04-22 13:47 ` Linus Björnstam
2020-04-26 17:16 ` Ludovic Courtès
2020-04-26 23:14   ` Aleix Conchillo Flaqué
2020-05-02 14:11     ` Ludovic Courtès
2020-05-04  0:32       ` Aleix Conchillo Flaqué
2020-05-04  9:36         ` Ludovic Courtès
2020-05-04 11:19           ` Linus Björnstam
2020-05-04 20:09             ` Ludovic Courtès
2020-05-04 20:50               ` Linus Björnstam
2020-05-08 11:31                 ` Linus Björnstam
2020-05-04 18:47           ` Linus Björnstam

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).