From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Arne Babenhauserheide Newsgroups: gmane.lisp.guile.devel Subject: efficient implementation of string-replace-substring / string-replace-all Date: Fri, 13 Sep 2013 16:32:13 +0200 Message-ID: <87y570pzbm.wl%arne_bab@web.de> Reply-To: arne_bab@web.de NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 (generated by SEMI 1.14.6 - "Maruoka") Content-Type: text/plain; charset=US-ASCII X-Trace: ger.gmane.org 1379086970 11160 80.91.229.3 (13 Sep 2013 15:42:50 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Fri, 13 Sep 2013 15:42:50 +0000 (UTC) To: guile-devel@gnu.org Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Fri Sep 13 17:42:51 2013 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1VKVW8-0005h6-DU for guile-devel@m.gmane.org; Fri, 13 Sep 2013 17:42:44 +0200 Original-Received: from localhost ([::1]:48362 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VKVW7-000697-W7 for guile-devel@m.gmane.org; Fri, 13 Sep 2013 11:42:43 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:51559) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VKUQ5-0005oQ-0H for guile-devel@gnu.org; Fri, 13 Sep 2013 10:32:32 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VKUPx-0000Cz-BG for guile-devel@gnu.org; Fri, 13 Sep 2013 10:32:24 -0400 Original-Received: from mout.web.de ([212.227.15.4]:53641) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VKUPx-0000A2-1S for guile-devel@gnu.org; Fri, 13 Sep 2013 10:32:17 -0400 Original-Received: from kaverne.draketo.de ([212.255.235.232]) by smtp.web.de (mrweb103) with ESMTPA (Nemesis) id 0M7bQ3-1W7ODB1QYx-00xGDE for ; Fri, 13 Sep 2013 16:32:14 +0200 User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI/1.14.6 (Maruoka) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 Emacs/24.3 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) X-Provags-ID: V03:K0:qfGP/WiqXIJOAoe84yUedz2gOy9uKvrbuQHvZrEHSAjGjZXncAx SHgh2koQOvbsYZ2srO9Zne4ibpXQp5JwHAU6nSiNTzWGt+ift23Axl6dPmYXeAMEND5tV6E xUKlY1v3rLAifeDQ0B2/CUTilBYKbcRSQm0tBxtJLV6sHXvL8gmKGvp3Ng/eYLAEoYHE2pN btFQVNG1px328EYpRQJRw== X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 212.227.15.4 X-Mailman-Approved-At: Fri, 13 Sep 2013 11:42:25 -0400 X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:16631 Archived-At: Hi, For wisp I created an efficient implementation of substring replacement and thought it might be useful for guile in general. I optimized it a bit to get rid of (likely) quadratic behaviour: ; ,time (string-replace-substring (xsubstring "abcdefghijkl" 0 99999) "def" "abc") ; 1.140127s real time, 1.139714s run time. 0.958733s spent in GC. ; 0.885618s real time, 0.885350s run time. 0.742805s spent in GC. ; second number after multiple runs (define (string-replace-substring s substring replacement) "Replace every instance of substring in s by replacement." (let ((sublen (string-length substring))) (let replacer ((newstring s) (index (string-contains s substring))) (if (not (equal? index #f)) (let ((replaced (string-replace newstring replacement index (+ index sublen)))) (replacer replaced (string-contains replaced substring index))) newstring)))) This fast and elegant approach is thanks to ijp. I tested a two alternative approaches. The following uses explicit readonly substrings and is fast, the one afterwards always searches from the beginning and shows drastic slowdown on long strings. ; efficient version from rev 12266d455bb0 of the wisp hg repo ; ,time (string-replace-substring-fast-but-long (xsubstring "abcdefghijkl" 0 99999) "def" "abc") ; 1.316090s real time, 1.315626s run time. 1.129857s spent in GC. ; 0.668212s real time, 0.667948s run time. 0.482193s spent in GC. ; 0.868419s real time, 0.868131s run time. 0.685472s spent in GC ; Second and third are after multiple runs (define (string-replace-substring-fast-but-long s substring replacement) "Replace every instance of substring in s by replacement." (let ((sublen (string-length substring))) (let replacer ((newstring s) (startindex 0) (addindex (string-contains s substring))) (if (not (equal? addindex #f)) (let* ((index (+ startindex addindex)) (replaced (string-replace newstring replacement index (+ index sublen))) (newaddindex (string-contains (substring/read-only replaced index) substring))) (replacer replaced index newaddindex)) newstring)))) ; less efficient version from rev a887aeb0dfe2 ; ,time (string-replace-substring-slow (xsubstring "abcdefghijkl" 0 99999) "def" "abc") ; 5.242456s real time, 5.240834s run time. 0.358750s spent in GC. ; 5.727069s real time, 5.724973s run time. 0.835445s spent in GC. ; switches between fast and slow version (define (string-replace-substring-slow s substring replacement) "Replace every instance of substring in s by replacement." (let ((sublen (string-length substring))) (let replacer ((newstring s) (index (string-contains s substring))) (if (not (equal? index #f)) (let ((replaced (string-replace newstring replacement index (+ index sublen)))) (replacer replaced (string-contains replaced substring))) newstring)))) ijp suggested renaming to (string-replace-all, but I do not know enough about conventions in guile to judge that). Best wishes, Arne