unofficial mirror of emacs-devel@gnu.org 
From: "Stephen J. Turnbull" <stephen@xemacs.org>
Cc: xemacs-design@xemacs.org
Subject: Rationale for split-string?
Date: Thu, 17 Apr 2003 18:06:17 +0900
Message-ID: <87brz57at2.fsf@tleepslib.sk.tsukuba.ac.jp>

What is the rationale for the specification of `split-string'?

That is, in GNU Emacs

  ;; an often convenient abbreviation
  (split-string "  data  ")
=> ("data")

  ;; weird
  (split-string "  data  " " ")
=> ("" "data" "")

  ;; urk (think "gnumeric just-say-no.xls" "save as" "csv")
  (split-string ",,data,," ",")
=> ("" "data" "")

emacs-version
"21.2.2"

In XEmacs currently we get

  ;; usually (delete "" (split-string "  data  ")) should do the
  ;; trick if you don't like this
  (split-string "  data  ")
=> ("" "data" "")

  ;; no less useful than what GNU Emacs returns
  (split-string "  data  " " ")
=> ("" "" "data" "" "")

  ;; I can't imagine wanting anything else
  (split-string ",,data,," ",")
=> ("" "" "data" "" "")

For comparison, Python's `split' function, when given an explicit
separator, behaves like XEmacs's `split-string'.  Perl's `split'
function by default removes all trailing null fields while preserving
all leading null fields, but when invoked as
"split (/pattern/, string, -1)" it behaves like XEmacs's `split-string'.
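
To make the Python comparison concrete, here is a minimal sketch using
the same sample strings as the examples above.  The variable names are
only illustrative.  Note that the XEmacs-like behavior applies when a
separator is passed explicitly; with no argument, Python's `split'
collapses runs of whitespace, much like the GNU Emacs default.

```python
# Illustrative sample inputs mirroring the examples in this message.
padded = "  data  "
csv_row = ",,data,,"

# Explicit separator: every null field is kept, as in XEmacs.
print(padded.split(" "))    # ['', '', 'data', '', '']
print(csv_row.split(","))   # ['', '', 'data', '', '']

# No separator: whitespace runs are collapsed and no null fields appear.
print(padded.split())       # ['data']

# Rough equivalent of the (delete "" (split-string ...)) idiom.
non_null = [field for field in csv_row.split(",") if field]
print(non_null)             # ['data']
```

The last form shows why "keep all nulls" is a convenient primitive:
dropping them afterwards is a one-liner, while recovering fields that
have already been trimmed is not possible.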

I think it makes sense for GNU Emacs to adopt (return to?) the
simpler, more consistent behavior, rather than have XEmacs sync to GNU
Emacs.  In particular, I think it's really unfortunate to force people
who want to parse csv data and the like to write their own functions,
while the `(delete "" (split-string ...))' idiom not only seems very
natural to me, but it handles the second example better than GNU Emacs
currently does.  And while I'm sure there exist applications where
trimming null fields at the ends but leaving them when surrounded by
non-null ones makes sense, I can't come up with one offhand.  I suspect
they're less common than either "remove all nulls" or "keep all nulls".

I believe that (at least for third-party maintainers) this change
should cause no problems, because we have had no complaints about the
behavior from anyone.  (We discovered the difference only when Ben
started a sync, and the regression test sent up flares and alarums.)


-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.


