From: William James
Newsgroups: gmane.lisp.guile.user
Subject: Re: regex-split for Guile
Date: Mon, 14 Mar 2011 07:54:39 -0700 (PDT)
Message-ID: <579266.37025.qm@web112612.mail.gq1.yahoo.com>
To: guile-user@gnu.org

Neil Jerram wrote:

> Thanks for posting that!  For fun/interest, here's an alternative
> implementation that occurred to me.
>
>      Neil

Thanks for the feedback.

>
>
> (use-modules (ice-9 regex)
>              (ice-9 string-fun))
>
> (define (regex-split regex str . opts)
>   (let* ((unique-char #\@)
>          (unique-char-string (string unique-char)))
>     (let ((splits (separate-fields-discarding-char
>                    unique-char
>                    (regexp-substitute/global #f
>                                              regex
>                                              str
>                                              'pre
>                                              unique-char-string
>                                              0
>                                              unique-char-string
>                                              'post)
>                    list)))

This is an approach that I used some years ago in Awk.
ASCII code 1 is used as the unique character:

# Produces array of nonmatching and matching
# substrings.  The size of the array will
# always be an odd number.  The first and the
# last item will always be nonmatching.
function shatter( s, shards, regexp )
{ gsub( regexp, "\1&\1", s )
  return split( s, shards, "\1" )
}

>       (cond ((memq 'keep opts)
>              splits)
>             (else
>              (let ((non-matches (map (lambda (i)
>                                        (list-ref splits (* i 2)))
>                                      (iota (floor (/ (1+ (length
> splits))
>                                                      2))))))
>                (if (memq 'trim opts)
>                    (filter (lambda (s)
>                              (not (zero? (string-length s))))
>                            non-matches)
>                    non-matches)))))))

The way that I want 'trim to work is to remove just the
leading and trailing empty strings.

In Ruby, trailing null strings are removed by default:

",foo,,,bar,".split( "," )
    ==>["", "foo", "", "", "bar"]
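
To make the intended 'trim behavior concrete, here is a minimal sketch in Ruby. The helper name trim_empty_ends is hypothetical (it is not part of Ruby or of either implementation above). It assumes the split has already produced every field, which in Ruby means passing a negative limit so that the trailing empty strings are kept:

```ruby
# Hypothetical helper: drop only the leading and trailing empty
# strings from a list of fields, leaving interior empties alone.
def trim_empty_ends(fields)
  without_leading = fields.drop_while(&:empty?)
  without_leading.reverse.drop_while(&:empty?).reverse
end

# A negative limit tells String#split to keep trailing empty fields.
all_fields = ",foo,,,bar,".split(",", -1)

p all_fields                   # ["", "foo", "", "", "bar", ""]
p trim_empty_ends(all_fields)  # ["foo", "", "", "bar"]
```

Note that this differs from the filter in the quoted code, which would also discard the two empty strings between "foo" and "bar".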