unofficial mirror of guile-devel@gnu.org 
From: Nala Ginrut <nalaginrut@gmail.com>
To: Mark H Weaver <mhw@netris.org>
Cc: guile-devel@gnu.org
Subject: Re: Extremly slow for format & string-join
Date: Mon, 01 Apr 2013 17:52:10 +0800
Message-ID: <1364809930.4639.17.camel@Renee-desktop.suse>
In-Reply-To: <87obdy3aw9.fsf@tines.lan>

On Mon, 2013-04-01 at 04:36 -0400, Mark H Weaver wrote:
> Nala Ginrut <nalaginrut@gmail.com> writes:
> 
> > I've tried to implement a function to mimic string multiply like Python:
> > "asdf" * 10
> >
> > --------------code----------------
> > (define (str* str n)
> >   (format #f "~{~a~}" (make-list n str)))
> >
> > or
> >
> > (define (str* str n)
> >   (string-join (make-list n str) ""))
> > --------------end-----------------
> >
> >
> > Both are very slow when N is large (> 1000000).
> 
> Indeed, the implementation of 'string-join' was very bad: about O(n^2)
> in the length of the list (assuming that the strings are roughly the
> same length).  Thanks for bringing this to my attention.  The problem
> was that it called 'string-append' repeatedly, adding one component at a
> time to the result string.  Since each call to 'string-append' copied
> the source strings into a fresh new string, this resulted in a lot of
> unnecessary copying and allocation.
> 
> I just pushed a much faster O(n) implementation to stable-2.0, which
> instead constructs a list of strings, and then calls 'string-append'
> only once.
> 
> http://git.savannah.gnu.org/gitweb/?p=guile.git;a=commit;h=786ab4258fbf605f46287da5e7550d3ab4b68589
> 
> On my system, this makes (string-join (make-list 100000 "test") "-")
> over 3000 times faster (about 28.5 milliseconds vs about 98 seconds).
> I expect that the same test with 1,000,000 elements would be about
> 30,000 times faster (roughly 2.7 hours vs 0.3 seconds), but I didn't
> have the patience to wait 2.7 hours to verify this :)
> 
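To make the difference concrete, here is a rough sketch of the two
strategies described above. The names join-quadratic and join-linear are
made up for illustration, and neither is the actual Guile code (the real
fix is in the commit linked above). I use string-concatenate (SRFI-13,
available by default in Guile) as the "append everything once" step,
since it takes a list directly.

```scheme
;; Hypothetical helpers for illustration; NOT Guile's real string-join.

;; Quadratic: each step copies the whole accumulated result into a
;; fresh string, so the total copying grows roughly as O(n^2).
(define (join-quadratic strings delimiter)
  (if (null? strings)
      ""
      (let loop ((result (car strings)) (rest (cdr strings)))
        (if (null? rest)
            result
            (loop (string-append result delimiter (car rest))
                  (cdr rest))))))

;; Linear: build a list with the delimiter interleaved, then
;; concatenate the whole list in a single pass.
(define (join-linear strings delimiter)
  (if (null? strings)
      ""
      (let loop ((rest (reverse (cdr strings))) (acc '()))
        (if (null? rest)
            (string-concatenate (cons (car strings) acc))
            (loop (cdr rest)
                  (cons delimiter (cons (car rest) acc)))))))

(join-quadratic '("a" "b" "c") "-")   ; => "a-b-c"
(join-linear '("a" "b" "c") "-")      ; => "a-b-c"
```

Both return the same string; only the amount of copying and allocation
differs.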

Thanks Mark!
string-join is commonly used in text processing (including web
development). However, our powerful 'format' is still not very
efficient. I do think we need to spend some time on it.
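
In the meantime, for the str* case from my original post, a possible
workaround is to skip both 'format' and 'string-join' and concatenate
the list directly. This is only a sketch, assuming string-concatenate
(SRFI-13, available by default in Guile) is acceptable here; I have not
benchmarked it.

```scheme
;; Sketch only: str* is the helper name from the original post.
(define (str* str n)
  ;; string-concatenate appends a whole list of strings at once,
  ;; so no delimiter handling or format directive parsing is needed.
  (string-concatenate (make-list n str)))

(str* "asdf" 3)   ; => "asdfasdfasdf"
```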

> Before:
> 
> scheme@(guile-user)> ,time (define s (string-join (make-list 10000 "test") "-"))
> ;; 0.998800s real time, 0.996677s run time.  0.984885s spent in GC.
> scheme@(guile-user)> ,time (define s (string-join (make-list 100000 "test") "-"))
> ;; 98.006569s real time, 97.817077s run time.  97.795970s spent in GC.
> 
> After:
> 
> scheme@(guile-user)> ,time (define s (string-join (make-list 10000 "test") "-"))
> ;; 0.006362s real time, 0.006351s run time.  0.000000s spent in GC.
> scheme@(guile-user)> ,time (define s (string-join (make-list 100000 "test") "-"))
> ;; 0.028513s real time, 0.028457s run time.  0.022235s spent in GC.
> scheme@(guile-user)> ,time (define s (string-join (make-list 1000000 "test") "-"))
> ;; 0.303098s real time, 0.302543s run time.  0.289639s spent in GC.
> scheme@(guile-user)> ,time (define s (string-join (make-list 10000000 "test") "-"))
> ;; 3.288105s real time, 3.281922s run time.  3.174460s spent in GC.
> 
> Format is still slow for large numbers of elements, but I'm not
> sufficiently motivated to dive into that swamp right now.
> 
>      Thanks,
>        Mark
