unofficial mirror of guile-devel@gnu.org 
From: Mark H Weaver <mhw@netris.org>
To: Andy Wingo <wingo@pobox.com>
Cc: arne_bab@web.de, guile-devel@gnu.org
Subject: Re: efficient implementation of string-replace-substring / string-replace-all
Date: Mon, 24 Mar 2014 01:19:12 -0400
Message-ID: <87mwggkwbz.fsf@yeeloong.lan>
In-Reply-To: <87a9cgsksy.fsf@pobox.com> (Andy Wingo's message of "Sun, 23 Mar 2014 21:48:45 +0100")

Andy Wingo <wingo@pobox.com> writes:

> On Fri 13 Sep 2013 21:41, Mark H Weaver <mhw@netris.org> writes:
>
>> Here's an implementation that does this benchmark about 80 times faster
>> on my machine: (20 milliseconds vs 1.69 seconds)
>>
>> (define* (string-replace-substring s substr replacement
>>                                    #:optional
>>                                    (start 0)
>>                                    (end (string-length s)))
>>   (let ((substr-length (string-length substr)))
>>     (if (zero? substr-length)
>>         (error "string-replace-substring: empty substr")
>>         (let loop ((start start)
>>                    (pieces (list (substring s 0 start))))
>>           (let ((idx (string-contains s substr start end)))
>>             (if idx
>>                 (loop (+ idx substr-length)
>>                       (cons* replacement
>>                              (substring s start idx)
>>                              pieces))
>>                 (string-concatenate-reverse (cons (substring s start)
>>                                                   pieces))))))))
>
> Inspired to code-golf a bit, here's one that's even faster :)
>
> (define (string-replace-substring s substring replacement)
>   "Replace every instance of substring in s by replacement."
>   (let ((sublen (string-length substring)))
>     (with-output-to-string
>       (lambda ()
>         (let lp ((start 0))
>           (cond
>            ((string-contains s substring start)
>             => (lambda (end)
>                  (display (substring/shared s start end))
>                  (display replacement)
>                  (lp (+ end sublen))))
>            (else
>             (display (substring/shared s start)))))))))
>
> Just marginally so, though.

Nice!  I confess that I find this very surprising.  I would have
expected that the overhead in creating the string port, repeatedly
expanding the string buffer, doing UTF-8 encoding in 'display', and
decoding the UTF-8 when retrieving the result string, would add up to
something slower than what I had.  But experiment trumps theory, and I
guess I'll take your word on it that you did some reasonable benchmarks
and determined that my intuitions were wrong :)

One warning though: in Guile 2.0, string ports only support characters
representable in the %default-port-encoding, which defaults to the
encoding of the current locale.  Importing (srfi srfi-6) fixes this for
open-input-string and open-output-string, but with-output-to-string
remains limited.  Therefore, I recommend using the code above only in
Guile master, where string ports are proper.
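
For anyone who needs this on Guile 2.0, here is a minimal sketch of the
workaround described above: the same loop, but writing to an explicit
port from open-output-string after importing (srfi srfi-6), instead of
going through with-output-to-string.  The name
string-replace-substring/port and the empty-substring check are
additions for illustration, not part of the code quoted above; without
that check an empty search string would make the loop run forever.  It
has not been benchmarked against either version in this thread.

```scheme
(use-modules (srfi srfi-6))   ; open-output-string, get-output-string

;; Hypothetical name; a variant of the port-based version quoted above.
(define (string-replace-substring/port s substr replacement)
  "Replace every instance of SUBSTR in S by REPLACEMENT."
  (let ((sublen (string-length substr))
        (port (open-output-string)))
    ;; Added guard: an empty SUBSTR matches at every index without
    ;; advancing, so reject it up front.
    (if (zero? sublen)
        (error "string-replace-substring/port: empty substr"))
    (let lp ((start 0))
      (cond
       ((string-contains s substr start)
        => (lambda (end)
             ;; Copy the text before the match, then the replacement.
             (display (substring/shared s start end) port)
             (display replacement port)
             (lp (+ end sublen))))
       (else
        ;; No more matches: copy the remainder of S.
        (display (substring/shared s start) port))))
    (get-output-string port)))
```

For example, (string-replace-substring/port "a-b-c" "-" "+") returns
"a+b+c".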

    Regards,
      Mark




Thread overview: 5+ messages
2013-09-13 14:32 efficient implementation of string-replace-substring / string-replace-all Arne Babenhauserheide
2013-09-13 19:41 ` Mark H Weaver
2014-03-23 20:48   ` Andy Wingo
2014-03-24  5:19     ` Mark H Weaver [this message]
2014-03-26 20:14       ` Arne Babenhauserheide
