From: Mark H Weaver <mhw@netris.org>
To: arne_bab@web.de
Cc: guile-devel@gnu.org
Subject: Re: efficient implementation of string-replace-substring / string-replace-all
Date: Fri, 13 Sep 2013 15:41:16 -0400
Message-ID: <87wqmkjyqr.fsf@tines.lan>
In-Reply-To: <87y570pzbm.wl%arne_bab@web.de> (Arne Babenhauserheide's message of "Fri, 13 Sep 2013 16:32:13 +0200")

Hi Arne,

Arne Babenhauserheide <arne_bab@web.de> writes:

> For wisp I created an efficient implementation of substring replacement and thought it might be useful for guile in general.
>
> I optimized it a bit to get rid of (likely) quadratic behaviour:
>
>
> ; ,time (string-replace-substring (xsubstring "abcdefghijkl" 0 99999) "def" "abc")
> ; 1.140127s real time, 1.139714s run time.  0.958733s spent in GC.
> ; 0.885618s real time, 0.885350s run time.  0.742805s spent in GC.
> ; second number after multiple runs

Here, you're including (xsubstring "abcdefghijkl" 0 99999) in the
benchmark.  Better to (define big (xsubstring "abcdefghijkl" 0 99999))
first, and then: ,time (string-replace-substring big "def" "abc")
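
Outside the REPL, the same separation of setup and measurement can be
sketched roughly as follows. The helper name time-thunk is hypothetical
(it is not a Guile built-in), and string-contains only stands in for the
real call so that the snippet runs on its own:

```scheme
;; Build the test input once, outside the timed region.
(define big (xsubstring "abcdefghijkl" 0 99999))

;; Hypothetical helper (not a Guile built-in): call THUNK and return a
;; pair (result . elapsed-seconds), measured in wall-clock time.
(define (time-thunk thunk)
  (let* ((t0 (get-internal-real-time))
         (result (thunk))
         (t1 (get-internal-real-time)))
    (cons result
          (exact->inexact (/ (- t1 t0) internal-time-units-per-second)))))

;; Only the call inside the thunk is measured.  string-contains is a
;; stand-in; swap in (string-replace-substring big "def" "abc") once
;; that procedure is defined.
(define timed (time-thunk (lambda () (string-contains big "def"))))
(display (cdr timed))
(newline)
```

Unlike ,time this reports only elapsed real time, not the time spent in
GC, so it is a rough check rather than a replacement for the REPL command.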

> (define (string-replace-substring s substring replacement)
>        "Replace every instance of substring in s by replacement."
>        (let ((sublen (string-length substring)))
>            (let replacer
>                ((newstring s)
>                  (index (string-contains s substring)))
>                (if (not (equal? index #f))
>                   (let ((replaced (string-replace newstring replacement index (+ index sublen))))
>                     (replacer replaced (string-contains replaced substring index)))
>                   newstring))))

Here's an implementation that runs this benchmark about 80 times faster
on my machine (20 milliseconds vs. 1.69 seconds):

--8<---------------cut here---------------start------------->8---
(define* (string-replace-substring s substr replacement
                                   #:optional
                                   (start 0)
                                   (end (string-length s)))
  (let ((substr-length (string-length substr)))
    (if (zero? substr-length)
        (error "string-replace-substring: empty substr")
        (let loop ((start start)
                   (pieces (list (substring s 0 start))))
          (let ((idx (string-contains s substr start end)))
            (if idx
                (loop (+ idx substr-length)
                      (cons* replacement
                             (substring s start idx)
                             pieces))
                (string-concatenate-reverse (cons (substring s start)
                                                  pieces))))))))
--8<---------------cut here---------------end--------------->8---

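This is so much faster because it avoids generating needless
intermediate strings: your version copies the entire string on every
replacement, whereas this one collects the pieces in a list and
concatenates them once at the end.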
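
To illustrate the final step, here is a small sketch of the piece list as
I would expect the loop to build it when replacing "def" with "X" in
"abcdefghidefjk". The list is written out by hand for illustration, not
captured from a real run:

```scheme
;; Pieces are consed onto the front of the list, so they end up in
;; reverse order.  The trailing "" is (substring s 0 start) with the
;; default start of 0.
(define pieces '("jk" "X" "ghi" "X" "abc" ""))

;; A single pass reverses the list and allocates the result string once.
(define joined (string-concatenate-reverse pieces))
(display joined)
(newline)
```

No string longer than one piece is allocated before this final call.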

     Regards,
       Mark




