unofficial mirror of emacs-devel@gnu.org 
* Rationale for split-string?
From: Stephen J. Turnbull @ 2003-04-17  9:06 UTC
  Cc: xemacs-design

What is the rationale for the specification of `split-string'?

That is, in GNU Emacs

  ;; an often convenient abbreviation
  (split-string "  data  ")
=> ("data")

  ;; weird
  (split-string "  data  " " ")
=> ("" "data" "")

  ;; urk (think "gnumeric just-say-no.xls" "save as" "csv")
  (split-string ",,data,," ",")
=> ("" "data" "")

emacs-version
"21.2.2"

In XEmacs currently we get

  ;; usually (delete "" (split-string "  data  ")) should do the
  ;; trick if you don't like this
  (split-string "  data  ")
=> ("" "data" "")

  ;; no less useful than what GNU Emacs returns
  (split-string "  data  " " ")
=> ("" "" "data" "" "")

  ;; I can't imagine wanting anything else
  (split-string ",,data,," ",")
=> ("" "" "data" "" "")

For comparison, Python's `split' function, when given an explicit
separator, behaves like XEmacs's `split-string'.  Perl's `split' function
by default removes all trailing null fields while preserving all leading
null fields, but when invoked as "split (/pattern/, string, -1)" it
behaves like XEmacs's `split-string'.
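
To make the Python comparison concrete, here is a small sketch using the
same sample strings as above.  It assumes Python 3 and the built-in
`str.split' method; the variable names are only illustrative.  Note that
the match with XEmacs holds when a separator is passed explicitly; with
no argument, Python collapses runs of whitespace and returns no null
fields.

```python
# With an explicit separator, str.split keeps every null field,
# which matches the XEmacs results shown above.
comma_fields = ",,data,,".split(",")
space_fields = "  data  ".split(" ")

# With no separator, runs of whitespace are treated as one delimiter
# and leading/trailing whitespace produces no null fields.
default_fields = "  data  ".split()

print(comma_fields)    # ['', '', 'data', '', '']
print(space_fields)    # ['', '', 'data', '', '']
print(default_fields)  # ['data']
```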

I think it makes sense for GNU Emacs to adopt (return to?) the
simpler, more consistent behavior, rather than have XEmacs sync to GNU
Emacs.  In particular, I think it's really unfortunate to force people
who want to parse csv data and the like to write their own functions,
while the `(delete "" (split-string ...))' idiom not only seems very
natural to me, but it handles the second example better than GNU Emacs
currently does.  And while I'm sure there exist applications where
trimming null fields at the ends but leaving them when surrounded by
non-null ones makes sense, I can't come up with one offhand.  I suspect
they're less common than either "remove all nulls" or "keep all nulls".
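
The three policies discussed in this paragraph can be sketched side by
side.  This is only an illustration in Python, not the Emacs or XEmacs
implementation; the function names `keep_all_nulls', `remove_all_nulls'
and `trim_end_nulls' are hypothetical and exist only for this example.
`remove_all_nulls' plays the role of the `(delete "" (split-string ...))'
idiom.

```python
def keep_all_nulls(string, sep):
    """Keep every empty field (the "keep all nulls" policy)."""
    return string.split(sep)


def remove_all_nulls(string, sep):
    """Drop every empty field (the "remove all nulls" policy)."""
    return [field for field in string.split(sep) if field != ""]


def trim_end_nulls(string, sep):
    """Drop empty fields at both ends, keep interior ones.

    Hypothetical helper illustrating the "trim at the ends only" policy.
    """
    fields = string.split(sep)
    start, end = 0, len(fields)
    while start < end and fields[start] == "":
        start += 1
    while end > start and fields[end - 1] == "":
        end -= 1
    return fields[start:end]


sample = ",,a,,b,,"
print(keep_all_nulls(sample, ","))    # ['', '', 'a', '', 'b', '', '']
print(remove_all_nulls(sample, ","))  # ['a', 'b']
print(trim_end_nulls(sample, ","))    # ['a', '', 'b']
```

Only the first variant preserves column positions, which is what parsing
csv-like data needs.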

I believe that (at least for third-party maintainers) this change
should cause no problems, because we have had no complaints about the
behavior from anyone.  (We discovered the difference only when Ben
started a sync, and the regression test sent up flares and alarums.)


-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.



* Re: Rationale for split-string?
From: Bill Wohler @ 2003-05-20  3:11 UTC
  Cc: emacs-devel

"Stephen J. Turnbull" <stephen@xemacs.org> writes:

> A few I couldn't tell at all without doing a much deeper analysis of
> the code than I have time for right now:
>
> ./lisp/calendar/todo-mode.el:869:  needs checking
> ./lisp/eshell/em-pred.el:601:  needs checking
> ./lisp/mh-e/mh-utils.el:1606:  needs checking

Thanks very much for checking. I believe that Satyaki has already fixed
this in CVS MH-E so that it would be compatible with present and future
versions of Emacs as well as XEmacs. Given our recent history, this
should find its way into CVS Emacs in a few weeks (in MH-E 7.4).

> ./lisp/mh-e/mh-alias.el:156:  want OMIT-NULLS t
> ./lisp/mh-e/mh-alias.el:289:  OK
> ./lisp/mh-e/mh-alias.el:469:  OK
> ./lisp/mh-e/mh-comp.el:374:  OK, double default
> ./lisp/mh-e/mh-e.el:2164:  OK, double default
> ./lisp/mh-e/mh-index.el:475:  OK, double default
> ./lisp/mh-e/mh-seq.el:966:  OK, double default
> ./lisp/mh-e/mh-utils.el:1606:  needs checking

--
Bill Wohler <wohler@newt.com>  http://www.newt.com/wohler/  GnuPG ID:610BD9AD
Maintainer of comp.mail.mh FAQ and MH-E. Vote Libertarian!
If you're passed on the right, you're in the wrong lane.


Thread overview: 35+ messages
2003-04-17  9:06 Rationale for split-string? Stephen J. Turnbull
2003-04-17 11:30 ` Stefan Reichör
2003-04-18  1:54   ` Richard Stallman
2003-04-18  2:59     ` Steve Youngs
2003-04-17 17:44 ` Stefan Monnier
2003-04-17 19:32   ` Luc Teirlinck
2003-04-18 11:50   ` Stephen J. Turnbull
2003-04-18 14:17     ` Stefan Monnier
2003-04-19  8:18       ` Stephen J. Turnbull
2003-04-19 13:35     ` Richard Stallman
2003-04-19  4:14   ` Richard Stallman
2003-04-19  8:55     ` Stephen J. Turnbull
2003-04-21  0:59       ` Richard Stallman
2003-04-21  1:55         ` Luc Teirlinck
2003-04-21 10:58         ` Stephen J. Turnbull
2003-04-21 21:11           ` Luc Teirlinck
2003-04-21 23:43             ` Miles Bader
2003-04-22  3:26               ` Luc Teirlinck
2003-04-22  4:09                 ` Jerry James
2003-04-22  8:15                   ` Eli Zaretskii
2003-04-22 13:22                     ` Stephen J. Turnbull
2003-04-22 14:38                       ` Jerry James
2003-04-22 12:56                   ` Luc Teirlinck
2003-04-22 14:56                     ` Jerry James
2003-04-22 15:27                       ` Luc Teirlinck
2003-04-22 13:19                 ` Stephen J. Turnbull
2003-04-22 13:39                   ` Miles Bader
2003-04-22 13:51                   ` Luc Teirlinck
2003-04-22 16:26                   ` Luc Teirlinck
2003-04-23  1:00           ` Richard Stallman
2003-04-23  4:09             ` Stephen J. Turnbull
2003-04-24 23:12               ` Richard Stallman
2003-05-20  1:55               ` Stephen J. Turnbull
2003-05-22 15:00                 ` Kai Großjohann
  -- strict thread matches above, loose matches on Subject: below --
2003-05-20  3:11 Bill Wohler
