From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Stephen J. Turnbull"
Newsgroups: gmane.emacs.xemacs.design,gmane.emacs.devel
Subject: Re: Rationale for split-string?
Date: Fri, 18 Apr 2003 20:50:42 +0900
Organization: The XEmacs Project
Message-ID: <87n0io2fe5.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <87brz57at2.fsf@tleepslib.sk.tsukuba.ac.jp> <200304171744.h3HHiJCx009215@rum.cs.yale.edu>
In-Reply-To: <200304171744.h3HHiJCx009215@rum.cs.yale.edu> ("Stefan Monnier"'s message of "Thu, 17 Apr 2003 13:44:18 -0400")
Original-To: "Stefan Monnier"
Cc: "Stephen J. Turnbull", emacs-devel@gnu.org, xemacs-design@xemacs.org
User-Agent: Gnus/5.090016 (Oort Gnus v0.16) XEmacs/21.5 (cabbage)

>>>>> "Stefan" == Stefan Monnier writes:

    >> What is the rationale for the specification of `split-string'?

    Stefan> I think the reason is for the default case. In XEmacs we
    Stefan> get:

    ELISP> (split-string " a b ")
    ("" "a" "b" "")

    Stefan> What is usually desired here is to eliminate all empty
    Stefan> parts.

I tend to agree, but remember Larry Wall does not. That concerns me;
Larry is nothing if not remarkably good at intuiting what works. And
the (delete "" (split-string ...)) idiom is hardly an exercise in
perversion or a brainteaser.

    Stefan> A gross hack is to test if the last char of the regexp is
    Stefan> ?+ and if so get rid of empty strings at start and end.
    Stefan> It should take care of 99% of the cases.

That's an implementation, not a specification. Using that means we'll
be having this discussion again, sooner or later. Think about someone
who writes a smart SEPARATORS to get rid of whitespace or leaders
around the elements.

I really don't like the idea of iterating a spec every time somebody
finds a plausible use for the function that some "less gross than the
last time hack" rules out. If you want a specific common case
optimized, test for that.
Eg, how about one of

(defun split-string-sanely (string &optional separators)
  (cond ((eq separators t) (gnu-emacs-split-string string))
        (t (xemacs-split-string string separators))))

(defun split-string-sanely-too (string &optional separators)
  (let ((result (xemacs-split-string string separators)))
    (cond ((stringp separators) result)
          ((eq separators 'omit-nulls) (delete "" result))
          (t (error 'invalid-argument
                    "SEPARATORS must be a string or 'omit-nulls"
                    separators)))))

(defun split-string-flexibly (string &optional separators thunk)
  (let ((result (xemacs-split-string string separators)))
    (cond ((functionp thunk) (delete-if thunk result))
          ((eq thunk 'omit-nulls) (delete "" result))
          ((null thunk) result)
          (t (error 'invalid-argument
                    "THUNK must be nil, 'omit-nulls, or a function"
                    thunk)))))

These can be easily generalized to further useful special cases
(deleting blank strings or non-numbers, anyone?) without ever screwing
up old code or ruling out uses of a given SEPARATORS regexp.

In fact, my preference would be to implement and name more or less as
above, in which case I would default differently (e.g., if SEPARATORS
is nil, use the omit-nulls behavior). Then the internal function
could be named `split-string' and have the simple, consistent
behavior. Both APIs would be considered public.

--
Institute of Policy and Planning Sciences    http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                   Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
              Ask not how you can "do" free software business;
             ask what your business can "do for" free software.
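
Editorial sketch, not part of the original message: a version of the
`split-string-flexibly' idea that should run in a current GNU Emacs,
under some loudly stated assumptions. The names `sketch-split-keep-nulls'
and `sketch-split-flexibly' are hypothetical. The message's
`xemacs-split-string' is not available here, so the sketch stands in for
it by calling GNU Emacs `split-string' with an explicit SEPARATORS and no
OMIT-NULLS argument, which keeps empty strings. `cl-remove-if' replaces
`delete-if', and a plain `error' call replaces the XEmacs-style
(error 'invalid-argument ...).

```elisp
(require 'cl-lib)

;; ASSUMPTION: stand-in for the simple splitter the message calls
;; `xemacs-split-string'.  It keeps every empty string, including
;; leading and trailing ones.
(defun sketch-split-keep-nulls (string &optional separators)
  "Split STRING at SEPARATORS (a regexp), keeping empty strings.
If SEPARATORS is nil, use `split-string-default-separators'."
  (split-string string (or separators split-string-default-separators)))

;; Hypothetical wrapper in the spirit of `split-string-flexibly':
;; the splitting rule stays fixed, and all policy lives in FILTER.
(defun sketch-split-flexibly (string &optional separators filter)
  "Split STRING at SEPARATORS, then post-process according to FILTER.
FILTER is nil (keep everything), the symbol `omit-nulls' (drop empty
strings), or a predicate; elements for which the predicate returns
non-nil are dropped."
  (let ((result (sketch-split-keep-nulls string separators)))
    (cond ((null filter) result)
          ((eq filter 'omit-nulls) (delete "" result))
          ((functionp filter) (cl-remove-if filter result))
          (t (error "FILTER must be nil, `omit-nulls', or a function: %S"
                    filter)))))

;; Simple, consistent behavior: nothing is dropped.
(sketch-split-flexibly " a b ")
;; => ("" "a" "b" "")

;; The common case, requested explicitly.
(sketch-split-flexibly " a b " nil 'omit-nulls)
;; => ("a" "b")

;; One of the special cases the message mentions: drop non-numbers.
(sketch-split-flexibly "1,x,,2" ","
                       (lambda (s) (not (string-match-p "\\`[0-9]+\\'" s))))
;; => ("1" "2")
```

Because the filter runs after splitting, no SEPARATORS regexp gets
special treatment, which is the property the message argues for against
the trailing-`+' heuristic.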