From: "Chris K. Jester-Young" <cky944@gmail.com>
To: guile-devel@gnu.org
Subject: Re: regexp-split for Guile
Date: Sun, 21 Oct 2012 12:08:02 -0400
Message-ID: <20121021160802.GB25831@yarrow>
In-Reply-To: <87pq4djkj2.fsf@tines.lan>

On Sat, Oct 20, 2012 at 10:16:49AM -0400, Mark H Weaver wrote:
> Sorry, that last example is wrong of course, but both of these examples
> raise an interesting question about how #:limit and #:trim should
> interact.  To my mind, the top example above is correct.  I think the
> last result should be "baz", not "baz  ".
[...]
> Honestly, this question makes me wonder if the proposed 'regexp-split'
> is too complicated.  If you want to trim whitespace, how about using
> 'string-trim-right' or 'string-trim-both' before splitting?  It seems
> more likely to do what I would expect.

Thanks so much for your feedback, Mark! I appreciate it.

Yeah, I think given the left-to-right nature of regex matching, the
only kind of trimming that makes sense is a right trim. And then once
you do that, people start asking for left trim, and mayhem begins. ;-)

I do want to consider the string pre-trimming approach, since it makes
it clearer what's going on, and is less "magical" (where "magic" is a
plus in the Perl world, and not so much of a plus in other languages).

Thankfully, the string-trim{,-right,-both} functions you mentioned use
substring behind the scenes, which uses copy-on-write. So that solves
one of my potential concerns, which is that a pre-trim would require
copying most of the string.
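
To make the pre-trim approach concrete, here is a minimal sketch. Note
that simple-regexp-split is a hypothetical stand-in written only for
this example (it is not the proposed regexp-split and has no #:limit or
#:trim handling); it assumes list-matches, match:start and match:end
from (ice-9 regex).

```scheme
(use-modules (ice-9 regex))

;; Hypothetical helper for illustration only: split STR at every
;; match of the regex PAT, keeping empty fields.
(define (simple-regexp-split pat str)
  (let loop ((matches (list-matches pat str))
             (start 0)
             (acc '()))
    (if (null? matches)
        ;; No more delimiters: the rest of the string is the last field.
        (reverse (cons (substring str start) acc))
        (let ((m (car matches)))
          (loop (cdr matches)
                (match:end m)
                (cons (substring str start (match:start m)) acc))))))

;; Without a pre-trim, the trailing delimiter leaves an empty field.
(simple-regexp-split " +" "foo bar  baz  ")
;; => ("foo" "bar" "baz" "")

;; Trimming first means the splitter needs no trim option at all.
(simple-regexp-split " +" (string-trim-right "foo bar  baz  "))
;; => ("foo" "bar" "baz")
```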

			*	*	*

Granted, if you want trimming with a complicated regex delimiter, and
not just whitespace, then your best bet is to trim the output list.
This is slightly more complicated. My original code builds the output
list in reverse, so it can simply call drop-while on it before
reversing it for return. The caller only ever sees the list in forward
order, so they either have to reverse+trim+reverse (yuck), or we have
to implement drop-right-while (like you mentioned previously).
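
For reference, the reverse+trim+reverse workaround on the caller's
side would look roughly like the sketch below. The name
trim-trailing-empties is made up for this example; drop-while comes
from SRFI-1.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical caller-side helper: drop trailing empty strings by
;; reversing, trimming from the front, and reversing back.
(define (trim-trailing-empties parts)
  (reverse (drop-while string-null? (reverse parts))))

(trim-trailing-empties '("foo" "bar" "" ""))
;; => ("foo" "bar")
```

It works, but it walks the list twice and allocates two fresh copies
just to trim the end.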

In that regard, here's one implementation of drop-right-while (that I
just wrote on the spot):

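    ;; Return LST without its longest suffix of elements satisfying PRED.
    ;; Not tail-recursive: uses stack proportional to the length of LST.
    (define (drop-right-while pred lst)
      (let recur ((lst lst))
        (if (null? lst) '()
            (let ((elem (car lst))
                  ;; NEXT is the already-trimmed tail.
                  (next (recur (cdr lst))))
              ;; An empty NEXT means nothing after ELEM survived, so
              ;; ELEM is dropped too if it satisfies PRED.
              (if (and (null? next) (pred elem)) '()
                  (cons elem next))))))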

One could theoretically write drop-right-while! also (I can think of
two different implementation strategies) but it sounds like it's more
work than it's worth.
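
For the curious, one possible strategy for a destructive variant is
sketched below. This is only an illustration of the idea (remember the
last pair worth keeping, then cut the list there), not a proposal for
inclusion, and it assumes a proper, mutable list.

```scheme
;; Sketch of a destructive drop-right-while!: walk LST once, tracking
;; the last pair whose element does NOT satisfy PRED, then truncate
;; the list just after that pair.
(define (drop-right-while! pred lst)
  (let loop ((pair lst) (last-keep #f))
    (cond ((pair? pair)
           (loop (cdr pair)
                 (if (pred (car pair)) last-keep pair)))
          (last-keep
           (set-cdr! last-keep '())
           lst)
          ;; Every element satisfied PRED (or LST was empty).
          (else '()))))

;; Use a freshly allocated list, since literal constants should not
;; be mutated.
(drop-right-while! even? (list 1 2 3 4 6))
;; => (1 2 3)
```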

So, that's our last hurdle: we "just" have to get drop-right-while
integrated into Guile, then we can separate out the splitting and
trimming processes. And everybody will be happy. :-)
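
As a rough sketch of what that separation could look like: split
first, then trim the resulting list as a second step. Two assumptions
here: the core string-split (single-character delimiter) stands in for
the proposed regexp-split, and drop-right-while is written as a
fold-right from SRFI-1, which should behave the same as the recursive
definition above.

```scheme
(use-modules (srfi srfi-1))

;; Equivalent fold-right formulation: NEXT is the already-trimmed
;; tail, so an element is dropped only if nothing after it survived.
(define (drop-right-while pred lst)
  (fold-right (lambda (elem next)
                (if (and (null? next) (pred elem))
                    '()
                    (cons elem next)))
              '()
              lst))

;; Step 1: split (string-split is only a stand-in for regexp-split).
;; Step 2: trim trailing empty fields from the result.
(drop-right-while string-null? (string-split "a,b,,c,," #\,))
;; => ("a" "b" "" "c")
```

Interior empty fields are left alone; only the trailing run is
removed.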

Comments welcome,
Chris.


