From: "Linus Björnstam" <linus.internet@fastmail.se>
To: "Ludovic Courtès" <ludo@gnu.org>, guile-user@gnu.org
Subject: Re: string-for-each vs. for-each+string->list performance
Date: Sat, 13 Jun 2020 08:41:17 +0200 [thread overview]
Message-ID: <db2035dc-befd-4510-bb2f-23eb4afb1fca@www.fastmail.com> (raw)
In-Reply-To: <87k10c2gah.fsf@gnu.org>
Thanks for clearing that up.
I have an old implementation of large parts of srfi-13 if that would be of interest. I don't know how much you want to change. The licence situation of the reference implementation is weird iirc.
A beginning could be to replace all higher order functions since that would minimize the kind of performance problems discussed here.
--
Linus Björnstam
On Fri, 12 Jun 2020, at 22:13, Ludovic Courtès wrote:
> Hi,
>
> Linus Björnstam <linus.internet@fastmail.se> skribis:
>
> > You can cut another 15-ish % from that loop by making an inline loop, btw
> >
> > (let loop ((pos 0))
> > (when (< pos (string-length str))
> > ...
> > (loop (1+ pos)))
> >
> > I have been looking at the disassembly, even for simpler cases, but I haven't been able to understand enough of it.
> >
> > BTW: string-for-each is in the default environment, and is probably the same as the srfi-13 C implementation.
>
> ‘string-for-each’ in C (the default) is slower than its Scheme counterpart:
>
> --8<---------------cut here---------------start------------->8---
> scheme@(guile-user)> (define (sfe proc str)
> (define len (string-length str))
> (let loop ((i 0))
> (unless (= i len)
> (proc (string-ref str i))
> (loop (+ 1 i)))))
> scheme@(guile-user)> (define str (make-string 15000000))
> scheme@(guile-user)> ,t (sfe identity str)
> ;; 0.263725s real time, 0.263722s run time. 0.000000s spent in GC.
> scheme@(guile-user)> ,t (sfe identity str)
> ;; 0.259538s real time, 0.259529s run time. 0.000000s spent in GC.
> scheme@(guile-user)> ,t (string-for-each identity str)
> ;; 0.841632s real time, 0.841624s run time. 0.000000s spent in GC.
> scheme@(guile-user)> (version)
> $2 = "3.0.2"
> --8<---------------cut here---------------end--------------->8---
>
> In general we seem to pay a high price for leaving (calling a subr) and
> re-entering (via ‘scm_call_n’) the VM. This is especially acute here
> because there’s almost nothing happening in C, so we keep bouncing
> between Scheme and C.
>
> That’s another reason to start rewriting such primitives in Scheme and
> have the C functions just call out to Scheme.
>
> If we do:
>
> perf record guile -c '(string-for-each identity (make-string 15000000))'
>
> we get this profile:
>
> --8<---------------cut here---------------start------------->8---
> Overhead Command Shared Object Symbol
> 31.10% guile libguile-3.0.so.1.1.1 [.] vm_regular_engine
> 27.48% guile libguile-3.0.so.1.1.1 [.] scm_call_n
> 14.34% guile libguile-3.0.so.1.1.1 [.] scm_jit_enter_mcode
> 3.55% guile libguile-3.0.so.1.1.1 [.] scm_i_string_ref
> 3.37% guile libguile-3.0.so.1.1.1 [.] get_callee_vcode
> 2.34% guile libguile-3.0.so.1.1.1 [.] scm_call_1
> 2.31% guile libguile-3.0.so.1.1.1 [.] scm_string_for_each
> --8<---------------cut here---------------end--------------->8---
>
> Indeed, we get better performance when turning off JIT:
>
> --8<---------------cut here---------------start------------->8---
> $ GUILE_JIT_THRESHOLD=-1 time guile -c '(string-for-each identity
> (make-string 15000000))'
> 0.47user 0.00system 0:00.47elapsed 100%CPU (0avgtext+0avgdata
> 26396maxresident)k
> 0inputs+0outputs (0major+1583minor)pagefaults 0swaps
> $ GUILE_JIT_THRESHOLD=100 time guile -c '(string-for-each identity
> (make-string 15000000))'
> 0.83user 0.00system 0:00.83elapsed 100%CPU (0avgtext+0avgdata
> 26948maxresident)k
> 0inputs+0outputs (0major+1748minor)pagefaults 0swaps
> $ GUILE_JIT_THRESHOLD=0 time guile -c '(string-for-each identity
> (make-string 15000000))'
> 0.84user 0.00system 0:00.85elapsed 100%CPU (0avgtext+0avgdata
> 27324maxresident)k
> 0inputs+0outputs (0major+2548minor)pagefaults 0swaps
> --8<---------------cut here---------------end--------------->8---
>
> So it seems that we just keep firing the JIT machinery on every
> ‘scm_call_n’ for no benefit.
>
> That’s probably also the reason why ‘%after-gc-hunk’, ‘reap-pipes’, &
> co. always show high in statprof:
>
> https://lists.gnu.org/archive/html/guile-devel/2020-05/msg00019.html
>
> Thanks,
> Ludo’.
>
>
>
prev parent reply other threads:[~2020-06-13 6:41 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-07 6:27 string-for-each vs. for-each+string->list performance Aleix Conchillo Flaqué
2020-06-07 6:33 ` Aleix Conchillo Flaqué
2020-06-07 11:20 ` Linus Björnstam
2020-06-07 13:50 ` Linus Björnstam
2020-06-12 20:13 ` Ludovic Courtès
2020-06-13 6:41 ` Linus Björnstam [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/guile/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=db2035dc-befd-4510-bb2f-23eb4afb1fca@www.fastmail.com \
--to=linus.internet@fastmail.se \
--cc=guile-user@gnu.org \
--cc=ludo@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).