From: "Stephen J. Turnbull" <stephen@xemacs.org>
Cc: "Stephen J. Turnbull" <stephen@xemacs.org>,
emacs-devel@gnu.org, xemacs-design@xemacs.org
Subject: Re: Rationale for split-string?
Date: Fri, 18 Apr 2003 20:50:42 +0900
Message-ID: <87n0io2fe5.fsf@tleepslib.sk.tsukuba.ac.jp>
In-Reply-To: <200304171744.h3HHiJCx009215@rum.cs.yale.edu> ("Stefan Monnier"'s message of "Thu, 17 Apr 2003 13:44:18 -0400")
>>>>> "Stefan" == Stefan Monnier <monnier+gnu/emacs@rum.cs.yale.edu> writes:
>> What is the rationale for the specification of `split-string'?
Stefan> I think the reason is for the default case. In XEmacs we
Stefan> get:
ELISP> (split-string " a b ")
("" "a" "b" "")
Stefan> What is usually desired here is to eliminate all empty
Stefan> parts.
I tend to agree, but remember Larry Wall does not. That concerns me;
Larry is nothing if not remarkably good at intuiting what works. And
the (delete "" (split-string ...)) idiom is hardly an exercise in
perversion or a brainteaser.
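To make the idiom concrete, here is a minimal sketch. It assumes a `split-string' that keeps empty strings when SEPARATORS is passed explicitly (current GNU Emacs behaves this way when SEPARATORS is non-nil); the helper name `my-split-no-nulls' is hypothetical and only serves the illustration.

```elisp
;; Assumption: `split-string' keeps empty strings when SEPARATORS is
;; given explicitly.  `my-split-no-nulls' is a made-up name.
(defun my-split-no-nulls (string separators)
  "Split STRING at SEPARATORS and drop the empty pieces."
  ;; `delete' compares with `equal', so every "" element is removed.
  (delete "" (split-string string separators)))

(my-split-no-nulls " a b " " ")   ; => ("a" "b")
```

Without the `delete', the same call returns ("" "a" "b" ""), which is the result Stefan quotes above.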
Stefan> A gross hack is to test if the last char of the regexp is
Stefan> ?+ and if so get rid of empty strings at start and end.
Stefan> It should take care of 99% of the cases.
That's an implementation, not a specification. Using that means we'll
be having this discussion again, sooner or later. Think about someone
who writes a smart SEPARATORS to get rid of whitespace or leaders
around the elements. I really don't like the idea of iterating a spec
every time somebody finds a plausible use for the function that some
"less gross than the last time hack" rules out. If you want a
specific common case optimized, test for that.
E.g., how about one of
(defun split-string-sanely (string &optional separators)
  (cond ((eq separators t) (gnu-emacs-split-string string))
        (t (xemacs-split-string string separators))))

(defun split-string-sanely-too (string &optional separators)
  (let ((result (xemacs-split-string string separators)))
    (cond ((stringp separators) result)
          ((eq separators 'omit-nulls) (delete "" result))
          (t (error 'invalid-argument
                    "SEPARATORS must be a string or 'omit-nulls"
                    separators)))))

(defun split-string-flexibly (string &optional separators thunk)
  (let ((result (xemacs-split-string string separators)))
    (cond ((functionp thunk) (delete-if thunk result))
          ((eq thunk 'omit-nulls) (delete "" result))
          ((null thunk) result)
          (t (error 'invalid-argument
                    "THUNK must be nil, 'omit-nulls, or a function"
                    thunk)))))
These can be easily generalized to further useful special cases
(deleting blank strings or non-numbers, anyone?) without ever screwing
up old code or ruling out uses of a given SEPARATORS regexp.
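As an illustration of that kind of generalization, here is a sketch of the predicate-based variant written against current GNU Emacs. The assumptions: `split-string' with an explicit SEPARATORS keeps empty strings, `cl-remove-if' from cl-lib stands in for `delete-if', and the name `my-split-string-flexibly' and its FILTER argument are hypothetical, not an existing API.

```elisp
(require 'cl-lib)

;; Hypothetical helper; SEPARATORS is required here so that the
;; underlying `split-string' call keeps empty strings.
(defun my-split-string-flexibly (string separators &optional filter)
  "Split STRING at SEPARATORS, then drop the pieces selected by FILTER.
FILTER is nil (keep everything), the symbol `omit-nulls', or a
predicate; pieces for which the predicate returns non-nil are dropped."
  (let ((pieces (split-string string separators)))
    (cond ((null filter) pieces)
          ((eq filter 'omit-nulls) (delete "" pieces))
          ((functionp filter) (cl-remove-if filter pieces))
          (t (error "FILTER must be nil, `omit-nulls', or a function: %S"
                    filter)))))

;; Keep every piece, including the empty one.
(my-split-string-flexibly "1,,x,22" ",")
;; => ("1" "" "x" "22")

;; Drop only the empty strings.
(my-split-string-flexibly "1,,x,22" "," 'omit-nulls)
;; => ("1" "x" "22")

;; Drop everything that is not a run of digits.
(my-split-string-flexibly
 "1,,x,22" ","
 (lambda (s) (not (string-match-p "\\`[0-9]+\\'" s))))
;; => ("1" "22")
```

The splitting step is identical in all three calls; only the post-filter changes, so no SEPARATORS regexp is ruled out by the filtering rules.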
In fact, my preference would be to implement and name more or less as
above, in which case I would default differently (e.g., if SEPARATORS
is nil, use the omit-nulls behavior). Then the internal function
could be named `split-string' and have the simple, consistent
behavior. Both APIs would be considered public.
--
Institute of Policy and Planning Sciences http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business;
ask what your business can "do for" free software.