From mboxrd@z Thu Jan 1 00:00:00 1970
From: Vijay Marupudi
Newsgroups: gmane.lisp.guile.devel
Subject: [PATCH] Add string-split-substring
Date: Sat, 12 Feb 2022 22:05:39 -0500
Message-ID: <87tud3qxdo.fsf@vijaymarupudi.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
To: guile-devel@gnu.org
List-Id: "Developers list for Guile, the GNU extensibility library"
Xref: news.gmane.io gmane.lisp.guile.devel:21105

--=-=-=
Content-Type: text/plain

Hello all,

I have added a function named `string-split-substring' to the (ice-9
string-fun) module.
It acts like `string-split', but takes a substring as the delimiter
instead of a character. It works like this:

(string-split-substring "item-1::item-2::item-3::item-4" "::")
=> ("item-1" "item-2" "item-3" "item-4")

The tests include all the edge cases in the tests for string-split, and
the behavior matches it exactly. Documentation is also included in the
patch.

I have found myself making and using this function numerous times, and
judging by IRC, others find it useful as well.

The patch is attached.

~ Vijay

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=0001-Add-string-split-substring.patch

>From 44ba1874d32e188fdd999e113781548ab2b128fa Mon Sep 17 00:00:00 2001
From: Vijay Marupudi
Date: Sat, 12 Feb 2022 22:00:57 -0500
Subject: [PATCH] Add string-split-substring

* doc/ref/api-data.texi: Added documentation
* module/ice-9/string-fun.scm: Added implementation
* test-suite/tests/strings.test: Added tests
---
 doc/ref/api-data.texi         | 10 ++++++++++
 module/ice-9/string-fun.scm   | 23 ++++++++++++++++++++++-
 test-suite/tests/strings.test | 22 +++++++++++++++++++++-
 3 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/doc/ref/api-data.texi b/doc/ref/api-data.texi
index 8658b9785..a5fcc47b1 100644
--- a/doc/ref/api-data.texi
+++ b/doc/ref/api-data.texi
@@ -4245,6 +4245,16 @@ Return a new string where every instance of @var{substring} in string
 @end lisp
 @end deffn
 
+@deffn {Scheme Procedure} string-split-substring str substring
+Split the string @var{str} into a list of substrings delimited by the
+appearance of substring @var{substring}.  For example:
+
+@lisp
+(string-split-substring "item-1::item-2::item-3::item-4" "::")
+@result{} ("item-1" "item-2" "item-3" "item-4")
+@end lisp
+@end deffn
+
 @node Representing Strings as Bytes
 @subsubsection Representing Strings as Bytes
 
diff --git a/module/ice-9/string-fun.scm b/module/ice-9/string-fun.scm
index 03e0238fa..a1d4c0366 100644
--- a/module/ice-9/string-fun.scm
+++ b/module/ice-9/string-fun.scm
@@ -26,7 +26,7 @@
            separate-fields-before-char string-prefix-predicate string-prefix=?
            sans-surrounding-whitespace sans-trailing-whitespace
            sans-leading-whitespace sans-final-newline has-trailing-newline?
-           string-replace-substring))
+           string-replace-substring string-split-substring))
 
 ;;;;
 ;;;
@@ -313,3 +313,24 @@
            (else
             (display (substring/shared str start)))))))))
 
+(define (string-split-substring str substr)
+  "Split the string @var{str} into a list of substrings delimited by the
+substring @var{substr}."
+
+  (define substrlen (string-length substr))
+  (define strlen (string-length str))
+
+  (define (loop index start)
+    (cond
+     ((>= start strlen) (list ""))
+     ((not index) (list (substring str start)))
+     (else
+      (cons (substring str start index)
+            (let ((new-start (+ index substrlen)))
+              (loop (string-contains str substr new-start)
+                    new-start))))))
+
+  (cond
+   ((string-contains str substr) => (lambda (idx) (loop idx 0)))
+   (else (list str))))
+
diff --git a/test-suite/tests/strings.test b/test-suite/tests/strings.test
index 7393bc8ec..0c09e97c8 100644
--- a/test-suite/tests/strings.test
+++ b/test-suite/tests/strings.test
@@ -699,4 +699,24 @@
 
   (pass-if "string-replace-substring"
     (string=? (string-replace-substring "a ring of strings" "ring" "rut")
-              "a rut of struts")))
+              "a rut of struts"))
+
+  (pass-if "string-split-substring - empty string"
+    (equal? (string-split-substring "" "foo")
+            '("")))
+
+  (pass-if "string-split-substring - non-empty, no delimiters"
+    (equal? (string-split-substring "testing" "foo")
+            '("testing")))
+
+  (pass-if "string-split-substring - non-empty, delimiters"
+    (equal? (string-split-substring "testingfoobar" "foo")
+            '("testing" "bar")))
+
+  (pass-if "string-split-substring - non-empty, leading delimiters"
+    (equal? (string-split-substring "foobar" "foo")
+            '("" "bar")))
+
+  (pass-if "string-split-substring - non-empty, trailing delimiters"
+    (equal? (string-split-substring "barfoo" "foo")
+            '("bar" ""))))
-- 
2.35.1

--=-=-=--
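
To try the splitting behavior described in this message without applying the patch, here is a minimal standalone sketch for stock Guile. It is not the patch's code: the name `split-on-substring' is hypothetical, and rejecting an empty delimiter is an assumption of this sketch, since the message and its tests do not cover that case. It relies only on `string-contains', `substring', and `string-null?', which Guile provides by default.

```scheme
;; Hypothetical helper, not part of the patch: split STR on every
;; occurrence of the substring DELIM and return the pieces as a list.
(define (split-on-substring str delim)
  ;; Assumption of this sketch: an empty delimiter matches at every
  ;; index, so the scan below would never advance.  Reject it up front.
  (when (string-null? delim)
    (error "split-on-substring: empty delimiter"))
  (let ((dlen (string-length delim)))
    (let loop ((start 0) (pieces '()))
      (let ((idx (string-contains str delim start)))
        (if idx
            ;; Found a delimiter: keep the text before it and resume
            ;; scanning just past it.
            (loop (+ idx dlen) (cons (substring str start idx) pieces))
            ;; No delimiter left: the remainder is the final piece.
            (reverse (cons (substring str start) pieces)))))))

(split-on-substring "item-1::item-2::item-3::item-4" "::")
;; => ("item-1" "item-2" "item-3" "item-4")
(split-on-substring "foobar" "foo")   ; leading delimiter  => ("" "bar")
(split-on-substring "barfoo" "foo")   ; trailing delimiter => ("bar" "")
(split-on-substring "" "foo")         ; empty input        => ("")
```

The accumulate-and-reverse loop keeps the recursion in tail position, which may matter for inputs containing many delimiters.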