From: Stefan Reichör
Newsgroups: gmane.emacs.xemacs.design,gmane.emacs.devel
Subject: Re: Rationale for split-string?
Date: Thu, 17 Apr 2003 13:30:01 +0200
To: "Stephen J. Turnbull"
Cc: emacs-devel@gnu.org, xemacs-design@xemacs.org
References: <87brz57at2.fsf@tleepslib.sk.tsukuba.ac.jp>

On Thu, 17 Apr 2003, Stephen J. Turnbull said:

> What is the rationale for the specification of `split-string'?
>
> That is, in GNU Emacs
>
>     ;; an often convenient abbreviation
>     (split-string "  data  ")
>     => ("data")
>
>     ;; weird
>     (split-string "  data  " " ")
>     => ("" "data" "")
>
>     ;; urk (think "gnumeric just-say-no.xls" "save as" "csv")
>     (split-string ",,data,," ",")
>     => ("" "data" "")
>
>     emacs-version
>     "21.2.2"
>
> In XEmacs currently we get
>
>     ;; usually (delete "" (split-string "  data  ")) should do the
>     ;; trick if you don't like this
>     (split-string "  data  ")
>     => ("" "data" "")
>
>     ;; no less useful than what GNU Emacs returns
>     (split-string "  data  " " ")
>     => ("" "" "data" "" "")
>
>     ;; I can't imagine wanting anything else
>     (split-string ",,data,," ",")
>     => ("" "" "data" "" "")
>
> For comparison, Python's `split' function behaves like XEmacs's
> `split-string'.  Perl's `split' function by default removes all
> trailing null fields while preserving all leading null fields, but
> when invoked "split (/pattern/, string, -1)" behaves like XEmacs's
> `split-string'.
>
> I think it makes sense for GNU Emacs to adopt (return to?) the
> simpler, more consistent behavior, rather than have XEmacs sync to
> GNU Emacs.
> In particular, I think it's really unfortunate to force
> people who want to parse csv data and the like to write their own
> functions, while the `(delete "" (split-string ...))' idiom not
> only seems very natural to me, but it handles the second example
> better than GNU Emacs currently does.  And while I'm sure there
> exist applications where trimming null fields at the ends but
> leaving them when surrounded by non-null ones make sense, I can't
> come up with one offhand.  I suspect they're less common than either
> "remove all nulls" or "keep all nulls".
>
> I believe that (at least for third-party maintainers) this change
> should cause no problems, because we have had no complaints about
> the behavior from anyone.  (We discovered the difference only when
> Ben started a sync, and the regression test sent up flares and
> alarums.)

I noticed the different behavior of the split-string function because
I need to parse csv output from subversion.  Now I need different code
for the two platforms.

I would welcome it if GNU Emacs and XEmacs had the same split-string
implementation.

Stefan.
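
For illustration only, and not taken from the message above: a minimal
sketch of a splitter that keeps every null field whichever
`split-string' semantics the running Emacs happens to provide.  It
avoids `split-string' entirely and relies only on `string-match' and
`substring'.  The name `my-split-keep-nulls' is hypothetical, and the
sketch assumes SEPARATOR is a regexp that never matches the empty
string.

```elisp
;; Hypothetical helper; not part of GNU Emacs or XEmacs.
(defun my-split-keep-nulls (string separator)
  "Split STRING at each match of the regexp SEPARATOR.
Empty fields are kept, including leading and trailing ones.
SEPARATOR must not match the empty string, or this loops forever."
  (let ((start 0)
        (fields nil))
    ;; Collect the text before each separator match, in reverse order.
    (while (string-match separator string start)
      (setq fields (cons (substring string start (match-beginning 0))
                         fields))
      (setq start (match-end 0)))
    ;; Whatever follows the last separator is the final field.
    (nreverse (cons (substring string start) fields))))

;; "Keep all nulls":
(my-split-keep-nulls ",,data,," ",")
;; => ("" "" "data" "" "")

;; "Remove all nulls", using the idiom quoted above:
(delete "" (my-split-keep-nulls ",,data,," ","))
;; => ("data")
```

With a helper like this, the csv case needs only one code path on both
platforms, and callers who want no nulls at all can still wrap the
call in `delete'.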