From: Nala Ginrut <nalaginrut@gmail.com>
To: Mark H Weaver <mhw@netris.org>
Cc: guile-devel@gnu.org
Subject: Re: Extremly slow for format & string-join
Date: Mon, 01 Apr 2013 17:52:10 +0800
Message-ID: <1364809930.4639.17.camel@Renee-desktop.suse>
In-Reply-To: <87obdy3aw9.fsf@tines.lan>

On Mon, 2013-04-01 at 04:36 -0400, Mark H Weaver wrote:
> Nala Ginrut <nalaginrut@gmail.com> writes:
>
> > I've tried to implement a function to mimic string multiply like Python:
> > "asdf" * 10
> >
> > --------------code----------------
> > (define (str* str n)
> >   (format #f "~{~a~}" (make-list n str)))
> >
> > or
> >
> > (define (str* str n)
> >   (string-join (make-list n str) ""))
> > --------------end-----------------
> >
> >
> > Both are very slow when N is large (> 1000000).
>
> Indeed, the implementation of 'string-join' was very bad: about O(n^2)
> in the length of the list (assuming that the strings are roughly the
> same length). Thanks for bringing this to my attention. The problem
> was that it called 'string-append' repeatedly, adding one component at a
> time to the result string. Since each call to 'string-append' copied
> the source strings into a fresh new string, this resulted in a lot of
> unnecessary copying and allocation.
>
> I just pushed a much faster O(n) implementation to stable-2.0, which
> instead constructs a list of strings, and then calls 'string-append'
> only once.
>
> http://git.savannah.gnu.org/gitweb/?p=guile.git;a=commit;h=786ab4258fbf605f46287da5e7550d3ab4b68589
>
> On my system, this makes (string-join (make-list 100000 "test") "-")
> over 3000 times faster (about 28.5 milliseconds vs about 98 seconds).
> I expect that the same test with 1,000,000 elements would be about
> 30,000 times faster (roughly 2.7 hours vs 0.3 seconds), but I didn't
> have the patience to wait 2.7 hours to verify this :)
>

Thanks, Mark!
string-join is commonly used in text processing (including web
development). However, our powerful 'format' is still not very
efficient. I do think we need to spend some time on it.

> Before:
>
> scheme@(guile-user)> ,time (define s (string-join (make-list 10000 "test") "-"))
> ;; 0.998800s real time, 0.996677s run time. 0.984885s spent in GC.
> scheme@(guile-user)> ,time (define s (string-join (make-list 100000 "test") "-"))
> ;; 98.006569s real time, 97.817077s run time. 97.795970s spent in GC.
>
> After:
>
> scheme@(guile-user)> ,time (define s (string-join (make-list 10000 "test") "-"))
> ;; 0.006362s real time, 0.006351s run time. 0.000000s spent in GC.
> scheme@(guile-user)> ,time (define s (string-join (make-list 100000 "test") "-"))
> ;; 0.028513s real time, 0.028457s run time. 0.022235s spent in GC.
> scheme@(guile-user)> ,time (define s (string-join (make-list 1000000 "test") "-"))
> ;; 0.303098s real time, 0.302543s run time. 0.289639s spent in GC.
> scheme@(guile-user)> ,time (define s (string-join (make-list 10000000 "test") "-"))
> ;; 3.288105s real time, 3.281922s run time. 3.174460s spent in GC.
>
> Format is still slow for large numbers of elements, but I'm not
> sufficiently motivated to dive into that swamp right now.
>
> Thanks,
> Mark
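
For reference, the approach Mark describes above (collect the pieces
in a list, then concatenate once) might look roughly like the sketch
below. This is not the code from the stable-2.0 commit; 'join-once'
is a made-up name, 'str*' is redefined here only for illustration,
and the use of 'string-concatenate' is my own assumption about a
reasonable way to do the single final concatenation.

```scheme
;; Sketch only: `join-once' and this `str*' are illustrative names,
;; not the implementation pushed to stable-2.0.

;; Join STRINGS with DELIMITER by collecting every piece into one
;; list first and concatenating a single time, instead of calling
;; string-append once per element.
(define (join-once strings delimiter)
  (if (null? strings)
      ""
      ;; Walk the tail in reverse so the piece list ends up in the
      ;; right order while the loop stays tail-recursive.
      (let loop ((rest (reverse (cdr strings)))
                 (pieces '()))
        (if (null? rest)
            (string-concatenate (cons (car strings) pieces))
            (loop (cdr rest)
                  (cons delimiter (cons (car rest) pieces)))))))

;; Python-style "asdf" * 10: there is no delimiter to interleave,
;; so the repeated list can be concatenated directly.
(define (str* str n)
  (string-concatenate (make-list n str)))
```

With these definitions, (join-once '("a" "b" "c") "-") evaluates to
"a-b-c" and (str* "ab" 3) to "ababab". Each input string is copied
once into the result, which is where the O(n) behaviour comes from.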