From: William James <w_a_x_man@yahoo.com>
To: guile-user@gnu.org
Subject: Re: regex-split for Guile
Date: Mon, 14 Mar 2011 07:54:39 -0700 (PDT)
Message-ID: <579266.37025.qm@web112612.mail.gq1.yahoo.com>

Neil Jerram wrote:

> Thanks for posting that!  For fun/interest, here's an alternative
> implementation that occurred to me.
> 
>        Neil

Thanks for the feedback.

> 
> 
> (use-modules (ice-9 regex)
>              (ice-9 string-fun))
> 
> (define (regex-split regex str . opts)
>   (let* ((unique-char #\@)
>          (unique-char-string (string unique-char)))
>     (let ((splits (separate-fields-discarding-char
>                    unique-char
>                    (regexp-substitute/global #f
>                                              regex
>                                              str
>                                              'pre
>                                              unique-char-string
>                                              0
>                                              unique-char-string
>                                              'post)
>                    list)))

I used the same approach some years ago in Awk, with
ASCII code 1 ("\1") as the unique character:

# Produces array of nonmatching and matching
# substrings. The size of the array will
# always be an odd number. The first and the
# last item will always be nonmatching.
function shatter( s, shards, regexp )
{ gsub( regexp, "\1&\1", s  )
  return split( s, shards, "\1" )
}


>       (cond ((memq 'keep opts)
>              splits)
>             (else
>              (let ((non-matches (map (lambda (i)
>                                        (list-ref splits (* i 2)))
>                                      (iota (floor (/ (1+ (length splits)) 2))))))
>                (if (memq 'trim opts)
>                    (filter (lambda (s)
>                              (not (zero? (string-length s))))
>                            non-matches)
>                    non-matches)))))))

I want 'trim to remove only the leading and trailing empty
strings and to leave interior ones in place.  Ruby's split
removes trailing empty strings by default, but keeps the
leading and interior ones:

",foo,,,bar,".split( "," )
    ==>["", "foo", "", "", "bar"]
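For comparison, here is a minimal Ruby sketch of the trimming
described above.  The helper name trim_empty_ends is hypothetical
(it is not part of Ruby or Guile).  It relies on the fact that a
negative limit makes String#split keep trailing empty strings, so
both ends can then be trimmed explicitly:

```ruby
# Hypothetical helper: drop empty strings from both ends of a list
# of fields, leaving interior empty strings untouched.
def trim_empty_ends(fields)
  first = fields.index { |s| !s.empty? }
  return [] if first.nil?          # every field was empty
  last = fields.rindex { |s| !s.empty? }
  fields[first..last]
end

# A negative limit tells split to keep trailing empty strings.
all_fields = ",foo,,,bar,".split(",", -1)
p all_fields                   # ["", "foo", "", "", "bar", ""]
p trim_empty_ends(all_fields)  # ["foo", "", "", "bar"]
```

The same idea should carry over to the Guile version: compute the
full list of non-matching pieces first, then strip empty strings
from each end only.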




      


