From: Luc Teirlinck <teirllm@dms.auburn.edu>
Cc: emacs-devel@gnu.org
Subject: Re: Rationale for split-string?
Date: Mon, 21 Apr 2003 16:11:21 -0500 (CDT)
Message-ID: <200304212111.h3LLBLK11879@eel.dms.auburn.edu>
In-Reply-To: <87ist8yv4n.fsf@tleepslib.sk.tsukuba.ac.jp> (stephen@xemacs.org)

Stephen Turnbull wrote:

   How about:

   ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
   ;;
   ;; one function, three arguments

   (defun split-string (string &optional separators omit-nulls)

     "Splits STRING into substrings bounded by matches for SEPARATORS.

   The beginning and end of STRING, and each match for SEPARATORS, are
   splitting points.  The substrings between the splitting points are
   collected in a list, which is returned.  (The substrings matching
   SEPARATORS are removed.)

   If SEPARATORS is nil, it defaults to \"[ \f\t\n\r\v]+\".

   If OMIT-NULLs is t, zero-length substrings are omitted from the list
   (so that for the default value of SEPARATORS leading and trailing
   whitespace are trimmed).  If nil, all zero-length substrings are
   retained, which correctly parses CSV format, for example."

     ;; implementation
     )

There are two problems with this.  First of all, it would break tons
of existing Emacs code.  Secondly, the defaults for SEPARATORS and for
OMIT-NULLS do not match.  Thus, the most routine call,
(split-string string), would produce nonsensical results (empty
strings at the ends of the returned list) in the case of leading or
trailing whitespace.
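
To make the second objection concrete, here is a minimal sketch.  It
assumes a split-string that accepts the three arguments of the quoted
proposal, as split-string in current GNU Emacs does, and it passes the
default separator regexp explicitly so that the OMIT-NULLS argument is
what decides the outcome.  The sample string is made up for
illustration.

```elisp
;; Assumption: `split-string' takes (STRING &optional SEPARATORS OMIT-NULLS),
;; as in the quoted proposal and in current GNU Emacs.

;; OMIT-NULLS nil: the zero-length substrings before the leading
;; whitespace and after the trailing whitespace are retained.
(split-string "  two words " "[ \f\t\n\r\v]+" nil)
;; => ("" "two" "words" "")

;; OMIT-NULLS t: zero-length substrings are dropped, which is what a
;; routine whitespace split usually wants.
(split-string "  two words " "[ \f\t\n\r\v]+" t)
;; => ("two" "words")
```

Under the quoted proposal, a plain (split-string string) would behave
like the first call.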

Something like

(split-string string &optional separators keep-nulls)

that is, the same as your proposal but with the roles of nil and t
reversed, would take care of the second objection and also break less
existing Emacs code (but probably still enough to worry about).  Of
course the reduction in broken Emacs code would probably come at the
expense of breaking existing XEmacs code.

With your proposal, we would have to replace plenty of occurrences of
(split-string string) in Emacs with (split-string string nil t).  To
do that automatically, we would have to change all of them.  There is
plenty of Elisp code that is not included in either the Emacs or
XEmacs distributions, but that might still be important to plenty of
people.  We cannot change that code.  Code compatible between
different Emacs versions would have to become more complex.  The
reverse version of your proposal would eliminate this part of the
problem, but probably produce a similar problem for XEmacs.  With the
reverse proposal above, we would not have to worry about Emacs calls
to split-string with the default value for SEPARATORS, but one still
would have to go through all occurrences of split-string with
non-default values of SEPARATORS, at the very least in all .el files
in the Lisp directory and all its subdirectories, and very carefully
check which ones the change would break and fix all those.
(Personally I do not have the time to do that.)  Even if somebody
finds the time to do all of this, we cannot check and fix Elisp code
not included in the Emacs or XEmacs distributions.

The point of my proposal (possible values "all", "none" and "edges" for
omit-nulls, with nil being equivalent to "edges" in Emacs and to
"none" in XEmacs) was to avoid breaking any existing Emacs or XEmacs
code while still making it trivial to use split-string in a way that
works identically in Emacs and XEmacs.  Again, in that proposal, only
"edges" as an additional value for omit-nulls is necessary to avoid
breaking existing Emacs code.  I only mentioned "beginning" and "end"
as luxury possibilities.  I know of software packages that use the
"end" version and the "end" version actually does make a lot of sense
in plenty of situations, like splitting a file or buffer into lines,
where a leading newline does represent an empty line, but a trailing
one does not represent an additional empty line following it.  The
"end" (as well as the "beginning") behavior is, however, trivial to
obtain from the "none" behavior, so that it would be a luxury.  ("end"
would be a nice luxury, "beginning" would probably be a "luxury
luxury" for symmetry with "end".)
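
The modes discussed above can be sketched as a thin wrapper over the
"none" behavior.  The name my-split-string-mode and the use of symbols
for the mode are hypothetical, chosen only for illustration; the
sketch also assumes a split-string that keeps all zero-length
substrings when given explicit separators and a nil third argument, as
current GNU Emacs does.

```elisp
;; Hypothetical helper; not part of Emacs or XEmacs.
(defun my-split-string-mode (string separators mode)
  "Split STRING at SEPARATORS, dropping empty substrings according to MODE.
MODE is one of the symbols `none', `all', `edges', `beginning' or `end'."
  ;; Start from the \"none\" behavior: every zero-length substring is kept.
  (let ((parts (split-string string separators nil)))
    (if (eq mode 'all)
        ;; \"all\": drop every empty substring.
        (delete "" parts)
      ;; \"edges\" and \"beginning\": drop one empty substring at the front.
      (when (and (memq mode '(edges beginning))
                 (equal (car parts) ""))
        (setq parts (cdr parts)))
      ;; \"edges\" and \"end\": drop one empty substring at the back.
      (when (and (memq mode '(edges end))
                 (equal (car (last parts)) ""))
        (setq parts (butlast parts)))
      parts)))

;; Splitting text into lines: a leading newline is an empty line,
;; a trailing newline is not.
(my-split-string-mode "\na\n\nb\n" "\n" 'end)
;; => ("" "a" "" "b")
```

With the same input, the mode `none' keeps the trailing empty string,
`edges' drops both the leading and the trailing one, and `all' also
drops the empty line in the middle.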

Sincerely,

Luc.

