* Re: Rationale for split-string?
From: Bill Wohler @ 2003-05-20 3:11 UTC
Cc: emacs-devel

"Stephen J. Turnbull" <stephen@xemacs.org> writes:

> A few I couldn't tell at all without doing a much deeper analysis of
> the code than I have time for right now:
>
> ./lisp/calendar/todo-mode.el:869: needs checking
> ./lisp/eshell/em-pred.el:601: needs checking
> ./lisp/mh-e/mh-utils.el:1606: needs checking

Thanks very much for checking. I believe that Satyaki has already fixed this in CVS MH-E so that it would be compatible with present and future versions of Emacs as well as XEmacs. Given our recent history, this should find its way into CVS Emacs in a few weeks (in MH-E 7.4).

> ./lisp/mh-e/mh-alias.el:156: want OMIT-NULLS t
> ./lisp/mh-e/mh-alias.el:289: OK
> ./lisp/mh-e/mh-alias.el:469: OK
> ./lisp/mh-e/mh-comp.el:374: OK, double default
> ./lisp/mh-e/mh-e.el:2164: OK, double default
> ./lisp/mh-e/mh-index.el:475: OK, double default
> ./lisp/mh-e/mh-seq.el:966: OK, double default
> ./lisp/mh-e/mh-utils.el:1606: needs checking

-- 
Bill Wohler <wohler@newt.com> http://www.newt.com/wohler/ GnuPG ID:610BD9AD
Maintainer of comp.mail.mh FAQ and MH-E. Vote Libertarian!
If you're passed on the right, you're in the wrong lane.

^ permalink raw reply [flat|nested] 35+ messages in thread

* Rationale for split-string?
From: Stephen J. Turnbull @ 2003-04-17 9:06 UTC
Cc: xemacs-design

What is the rationale for the specification of `split-string'?

That is, in GNU Emacs

;; an often convenient abbreviation
(split-string "  data  ")
=> ("data")

;; weird
(split-string "  data  " " ")
=> ("" "data" "")

;; urk (think "gnumeric just-say-no.xls" "save as" "csv")
(split-string ",,data,," ",")
=> ("" "data" "")

emacs-version
"21.2.2"

In XEmacs currently we get

;; usually (delete "" (split-string "  data  ")) should do the
;; trick if you don't like this
(split-string "  data  ")
=> ("" "data" "")

;; no less useful than what GNU Emacs returns
(split-string "  data  " " ")
=> ("" "" "data" "" "")

;; I can't imagine wanting anything else
(split-string ",,data,," ",")
=> ("" "" "data" "" "")

For comparison, Python's `split' function behaves like XEmacs's `split-string'. Perl's `split' function by default removes all trailing null fields while preserving all leading null fields, but when invoked "split (/pattern/, string, -1)" behaves like XEmacs's `split-string'.

I think it makes sense for GNU Emacs to adopt (return to?) the simpler, more consistent behavior, rather than have XEmacs sync to GNU Emacs. In particular, I think it's really unfortunate to force people who want to parse csv data and the like to write their own functions, while the `(delete "" (split-string ...))' idiom not only seems very natural to me, but it handles the second example better than GNU Emacs currently does. And while I'm sure there exist applications where trimming null fields at the ends but leaving them when surrounded by non-null ones makes sense, I can't come up with one offhand. I suspect they're less common than either "remove all nulls" or "keep all nulls".

I believe that (at least for third-party maintainers) this change should cause no problems, because we have had no complaints about the behavior from anyone. (We discovered the difference only when Ben started a sync, and the regression test sent up flares and alarums.)

-- 
Institute of Policy and Planning Sciences http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business; ask what your business can "do for" free software.

* Re: Rationale for split-string?
From: Stefan Reichör @ 2003-04-17 11:30 UTC
Cc: emacs-devel, xemacs-design

On Thu, 17 Apr 2003, Stephen J. Turnbull said:

> What is the rationale for the specification of `split-string'?
>
> That is, in GNU Emacs
>
> ;; an often convenient abbreviation
> (split-string "  data  ")
> => ("data")
>
> ;; weird
> (split-string "  data  " " ")
> => ("" "data" "")
>
> ;; urk (think "gnumeric just-say-no.xls" "save as" "csv")
> (split-string ",,data,," ",")
> => ("" "data" "")
>
> emacs-version
> "21.2.2"
>
> In XEmacs currently we get
>
> ;; usually (delete "" (split-string "  data  ")) should do the
> ;; trick if you don't like this
> (split-string "  data  ")
> => ("" "data" "")
>
> ;; no less useful than what GNU Emacs returns
> (split-string "  data  " " ")
> => ("" "" "data" "" "")
>
> ;; I can't imagine wanting anything else
> (split-string ",,data,," ",")
> => ("" "" "data" "" "")
>
> For comparison, Python's `split' function behaves like XEmacs's
> `split-string'. Perl's `split' function by default removes all
> trailing null fields while preserving all leading null fields, but
> when invoked "split (/pattern/, string, -1)" behaves like XEmacs's
> `split-string'.
>
> I think it makes sense for GNU Emacs to adopt (return to?) the
> simpler, more consistent behavior, rather than have XEmacs sync to
> GNU Emacs. In particular, I think it's really unfortunate to force
> people who want to parse csv data and the like to write their own
> functions, while the `(delete "" (split-string ...))' idiom not
> only seems very natural to me, but it handles the second example
> better than GNU Emacs currently does. And while I'm sure there
> exist applications where trimming null fields at the ends but
> leaving them when surrounded by non-null ones makes sense, I can't
> come up with one offhand. I suspect they're less common than either
> "remove all nulls" or "keep all nulls".
>
> I believe that (at least for third-party maintainers) this change
> should cause no problems, because we have had no complaints about
> the behavior from anyone. (We discovered the difference only when
> Ben started a sync, and the regression test sent up flares and
> alarums.)

I noticed the different behavior of the split-string function because I need to parse csv output from subversion. Now I need different code for the two platforms.

I would welcome it if GNU Emacs and XEmacs had the same split-string implementation.

Stefan.

* Re: Rationale for split-string?
From: Richard Stallman @ 2003-04-18 1:54 UTC
Cc: emacs-devel

    I would welcome it if GNU Emacs and XEmacs had the same
    split-string implementation.

I know of no reason to want them to be different.

* Re: Rationale for split-string?
From: Steve Youngs @ 2003-04-18 2:59 UTC
Cc: Emacs Devel

|--==> "RS" == Richard Stallman <rms@gnu.org> writes:

  RS> I know of no reason to want them to be different.

Fantastic! Steve Turnbull is a thorough guy, so I'm sure that he will send you a patch so you can fix GNU/Emacs' split-string.

-- 
|---<Steve Youngs>---------------<GnuPG KeyID: 10D5C9C5>---|
|       XEmacs - The only _______ you'll ever need.        |
|         Fill in the blank, yes, it's THAT good!          |
|------------------------------------<youngs@xemacs.org>---|

* Re: Rationale for split-string?
From: Stefan Monnier @ 2003-04-17 17:44 UTC
Cc: emacs-devel, xemacs-design

> What is the rationale for the specification of `split-string'?
>
> That is, in GNU Emacs
>
> ;; an often convenient abbreviation
> (split-string "  data  ")
> => ("data")
>
> ;; weird
> (split-string "  data  " " ")
> => ("" "data" "")
>
> ;; urk (think "gnumeric just-say-no.xls" "save as" "csv")
> (split-string ",,data,," ",")
> => ("" "data" "")

I think the reason is for the default case. In XEmacs we get:

ELISP> (split-string " a b ")
("" "a" "b" "")

What is usually desired here is to eliminate all empty parts. The `+' in the default regexp gets rid of the empty parts inside the string, but not at the beginning and at the end, so that's why Emacs gets rid of the empty string at the beginning and at the end.

I agree that when the regexp used is "," or "[ \t]*,[ \t]*", then XEmacs's behavior makes a lot more sense.

A gross hack is to test if the last char of the regexp is ?+ and if so get rid of empty strings at start and end. It should take care of 99% of the cases.

Stefan

* Re: Rationale for split-string?
From: Luc Teirlinck @ 2003-04-17 19:32 UTC
Cc: stephen, emacs-devel, xemacs-design

Stefan Monnier wrote:

    A gross hack is to test if the last char of the regexp is ?+ and
    if so get rid of empty strings at start and end. It should take
    care of 99% of the cases.

If you cannot decide which of the two types of behavior is more useful, would it not be more logical to have the behavior depend on some optional new argument, with the old behavior the default, so that no existing code gets broken? Gross hacks that "should" take care of 99% of the cases usually turn out to take care of something that looks more like 66% or even 50%. Making the behavior depend on the last character of the regexp just looks like a very messy, imprecise heuristic.

Sincerely,

Luc.

* Re: Rationale for split-string?
From: Stephen J. Turnbull @ 2003-04-18 11:50 UTC
Cc: Stephen J. Turnbull, emacs-devel, xemacs-design

>>>>> "Stefan" == Stefan Monnier <monnier+gnu/emacs@rum.cs.yale.edu> writes:

    >> What is the rationale for the specification of `split-string'?

    Stefan> I think the reason is for the default case. In XEmacs we
    Stefan> get:

    ELISP> (split-string " a b ")
    ("" "a" "b" "")

    Stefan> What is usually desired here is to eliminate all empty
    Stefan> parts.

I tend to agree, but remember Larry Wall does not. That concerns me; Larry is nothing if not remarkably good at intuiting what works. And the (delete "" (split-string ...)) idiom is hardly an exercise in perversion or a brainteaser.

    Stefan> A gross hack is to test if the last char of the regexp is
    Stefan> ?+ and if so get rid of empty strings at start and end.
    Stefan> It should take care of 99% of the cases.

That's an implementation, not a specification. Using that means we'll be having this discussion again, sooner or later. Think about someone who writes a smart SEPARATORS to get rid of whitespace or leaders around the elements. I really don't like the idea of iterating a spec every time somebody finds a plausible use for the function that some "less gross than the last time hack" rules out.

If you want a specific common case optimized, test for that. E.g., how about one of

(defun split-string-sanely (string &optional separators)
  (cond ((eq separators t) (gnu-emacs-split-string string))
        (t (xemacs-split-string string separators))))

(defun split-string-sanely-too (string &optional separators)
  (let ((result (xemacs-split-string string separators)))
    (cond ((stringp separators) result)
          ((eq separators 'omit-nulls) (delete "" result))
          (t (error 'invalid-argument
                    "SEPARATORS must be a string or 'omit-nulls"
                    separators)))))

(defun split-string-flexibly (string &optional separators thunk)
  (let ((result (xemacs-split-string string separators)))
    (cond ((functionp thunk) (delete-if thunk result))
          ((eq thunk 'omit-nulls) (delete "" result))
          ((null thunk) result)
          (t (error 'invalid-argument
                    "THUNK must be nil, 'omit-nulls, or a function"
                    thunk)))))

These can be easily generalized to further useful special cases (deleting blank strings or non-numbers, anyone?) without ever screwing up old code or ruling out uses of a given SEPARATORS regexp.

In fact, my preference would be to implement and name more or less as above, in which case I would default differently (e.g., if SEPARATORS is nil, use the omit-nulls behavior). Then the internal function could be named `split-string' and have the simple, consistent behavior. Both APIs would be considered public.

* Re: Rationale for split-string?
From: Stefan Monnier @ 2003-04-18 14:17 UTC
Cc: Stefan Monnier

> I tend to agree, but remember Larry Wall does not. That concerns me;
> Larry is nothing if not remarkably good at intuiting what works. And
> the (delete "" (split-string ...)) idiom is hardly an exercise in
> perversion or a brainteaser.

I don't think it has much to do with intuition. He just had in mind splitting entries in /etc/passwd or tab-separated fields or somesuch, whereas Emacs coders wanted the function to extract a list of words out of a string.

As I said, the XEmacs behavior is more regular and probably preferable.

> Stefan> A gross hack is to test if the last char of the regexp is
> Stefan> ?+ and if so get rid of empty strings at start and end.
> Stefan> It should take care of 99% of the cases.
>
> That's an implementation, not a specification. Using that means we'll
> be having this discussion again, sooner or later. Think about someone

Why do people assume that I'd want gross hacks in Emacs's code?

Stefan

* Re: Rationale for split-string?
From: Stephen J. Turnbull @ 2003-04-19 8:18 UTC
Cc: emacs-devel

>>>>> "Stefan" == Stefan Monnier <monnier+gnu/emacs@rum.cs.yale.edu> writes:

    Stefan> As I said, the XEmacs behavior is more regular and
    Stefan> probably preferable.

Good. How about the convenience function aspect? Do you agree that keying on one or more symbols for less regular, but useful, behavior is a reasonable interface?

I would prefer to _not_ overload `split-string', but have a second function. I'm not wedded to that, though.

    Stefan> Why do people assume that I'd want gross hacks in Emacs's
    Stefan> code ?

It didn't look funny, and I've fallen into the habit of taking what you say seriously. Should I break that habit? :-)

* Re: Rationale for split-string?
From: Richard Stallman @ 2003-04-19 13:35 UTC
Cc: monnier+gnu/emacs, emacs-devel, xemacs-design, stephen

    (defun split-string-sanely-too (string &optional separators)
      (let ((result (xemacs-split-string string separators)))
        (cond ((stringp separators) result)
              ((eq separators 'omit-nulls) (delete "" result))
              (t (error 'invalid-argument
                        "SEPARATORS must be a string or 'omit-nulls"
                        separators)))))

This seems like a good approach, but I would rather use t instead of `omit-nulls'.

* Re: Rationale for split-string?
From: Richard Stallman @ 2003-04-19 4:14 UTC
Cc: emacs-devel

    RS> I know of no reason to want them to be different.

    Fantastic! Steve Turnbull is a thorough guy, so I'm sure that he
    will send you a patch so you can fix GNU/Emacs' split-string.

First we need to figure out what is the right behavior for that function. People are already discussing the question...

    > (split-string ",,data,," ",")
    > => ("" "data" "")

Is that wrong? If so, what result do you think is right? ("" "" "data" "" "") could be argued for, but I am not sure it is better.

    A gross hack is to test if the last char of the regexp is ?+ and
    if so get rid of empty strings at start and end. It should take
    care of 99% of the cases.

That is a kludge. Whatever we do, it should not be that.

* Re: Rationale for split-string?
From: Stephen J. Turnbull @ 2003-04-19 8:55 UTC
Cc: emacs-devel, xemacs-design

    > (split-string ",,data,," ",")
    > => ("" "data" "")

    rms> Is that wrong? If so, what result do you think is right?
    rms> ("" "" "data" "" "") could be argued for, but I am not sure
    rms> it is better.

Well, if you are parsing a comma separated value file (the standard text/plain output format for spreadsheets and some databases, such as subversion), the five-element list is exactly what you want, and the three-element list is a type error (incomplete record).

In what case would the three-element list be desirable? I understand the case for a one-element result, but not three.

I see basically two modes. In one mode you are parsing fields from each of a sequence of records, in which case you want to retain null strings as null values. In the other, you are parsing a (free-form) stream of words, in which case null words (usually) don't exist, so you want to throw away _all_ of the null strings. In fact, all of the whitespace-only strings, too, but those normally won't arise in the common case where SEPARATORS matches contiguous whitespace.

I think we should support both modes, but the token-parser is easy to derive from the field-parser, while it's impossible to do the reverse because the token parser throws away information. I conclude that the field-parser (the XEmacs behavior) is more primitive, and I'd like to call that `split-string', with either more sophisticated behavior implemented by overloading the separators argument to take keywords for special treatment, or (preferably) in a separate function.
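
The point that the token parser can be derived from the field parser, but not the other way round, can be illustrated with a small sketch. Both function names below are hypothetical and written only for this illustration; they are not the `split-string' of GNU Emacs or XEmacs. The sketch assumes SEPARATORS is a regexp and simply stops splitting if the regexp matches the empty string.

```elisp
;; Hypothetical helper for illustration; not part of GNU Emacs or XEmacs.
(defun sketch-field-split (string separators)
  "Split STRING at each match of the regexp SEPARATORS, keeping empty fields."
  (let ((start 0)
        (fields '()))
    ;; Stop at an empty match so a regexp such as "x*" cannot loop forever.
    (while (and (string-match separators string start)
                (< (match-beginning 0) (match-end 0)))
      (push (substring string start (match-beginning 0)) fields)
      (setq start (match-end 0)))
    ;; Whatever follows the last separator is the final field, possibly "".
    (push (substring string start) fields)
    (nreverse fields)))

;; The token parser is one step on top of the field parser.
(defun sketch-token-split (string separators)
  "Split STRING like `sketch-field-split', then drop every empty string."
  (delete "" (sketch-field-split string separators)))

;; Two different records...
(sketch-field-split ",,data,," ",")   ; => ("" "" "data" "" "")
(sketch-field-split "data,,,," ",")   ; => ("data" "" "" "" "")
;; ...collapse to the same token list, so the fields cannot be recovered.
(sketch-token-split ",,data,," ",")   ; => ("data")
(sketch-token-split "data,,,," ",")   ; => ("data")
```

Going from fields to tokens is a single `delete'; nothing can rebuild the field positions from the token list.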

* Re: Rationale for split-string?
From: Richard Stallman @ 2003-04-21 0:59 UTC
Cc: emacs-devel, xemacs-design

    I see basically two modes. In one mode you are parsing fields from
    each of a sequence of records, in which case you want to retain
    null strings as null values. In the other, you are parsing a
    (free-form) stream of words, in which case null words (usually)
    don't exist, so you want to throw away _all_ of the null strings.
    In fact, all of the whitespace-only strings, too, but those
    normally won't arise in the common case where SEPARATORS matches
    contiguous whitespace.

I think that makes sense. Does anyone see a counterargument, or a reason why any other behavior is useful?

    I think we should support both modes, but the token-parser is easy
    to derive from the field-parser, while it's impossible to do the
    reverse because the token parser throws away information. I
    conclude that the field-parser (the XEmacs behavior) is more
    primitive, and I'd like to call that `split-string',

I don't entirely agree. The default case uses strings of whitespace as the separator, and for that case, the only intelligent approach is token-parsing. So the function needs to be able to do token-parsing.

This feature therefore may as well also be available for any separator.

* Re: Rationale for split-string?
From: Luc Teirlinck @ 2003-04-21 1:55 UTC
Cc: stephen, emacs-devel, xemacs-design

I am personally always wary of changing existing documented behavior. Sometimes it is necessary, but even then it is a necessary evil. There is, of course, the possibility of breaking existing code. We do not know "all existing code", so just grepping through stuff does not solve the problem. It also makes life hard on people trying to write packages that are portable between Emacs versions.

My own suggestion would be to add a new optional argument, say delete-null-matches, to split-string. The value could be "all", "none" or "edges" and maybe even "beginning" and "end" (but that would be a luxury). For Emacs a value of nil would be equivalent to "edges" (the current behavior); for XEmacs it would be equivalent to "none", XEmacs' current behavior. No existing Emacs or XEmacs code would get broken, and people worried about Emacs-XEmacs compatibility could always give an explicit non-nil value, which would be interpreted in exactly the same way by Emacs and XEmacs.

Sincerely,

Luc.

* Re: Rationale for split-string?
From: Stephen J. Turnbull @ 2003-04-21 10:58 UTC
Cc: emacs-devel, xemacs-design

>>>>> "rms" == Richard Stallman <rms@gnu.org> writes:

    rms> I don't entirely agree. The default case uses strings of
    rms> whitespace as the separator, and for that case, the only
    rms> intelligent approach is token-parsing. So the function needs
    rms> to be able to do token-parsing.

I was afraid of that (I prefer regular behavior over intelligent behavior if I must make a choice), but I can live with it. I really would prefer a separate `tokenize-string' function, though. (That name is not used in the GNU Emacs or XEmacs cores, or anywhere in the XEmacs packages. Several packages have their own tokenize functions but they're all properly prefixed, and one might fear semantic would use the name, but it doesn't.)

    rms> This feature therefore may as well also be available for any
    rms> separator.

But that's not compatible with a *single* function with *two* arguments. So I suppose you want a simpler version of Luc Teirlinck's suggestion. How about:

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; one function, three arguments

(defun split-string (string &optional separators omit-nulls)
  "Splits STRING into substrings bounded by matches for SEPARATORS.

The beginning and end of STRING, and each match for SEPARATORS, are
splitting points. The substrings between the splitting points are
collected in a list, which is returned. (The substrings matching
SEPARATORS are removed.)

If SEPARATORS is nil, it defaults to \"[ \f\t\n\r\v]+\".

If OMIT-NULLS is t, zero-length substrings are omitted from the list
(so that for the default value of SEPARATORS leading and trailing
whitespace are trimmed). If nil, all zero-length substrings are
retained, which correctly parses CSV format, for example."

  ;; implementation
  )

* Re: Rationale for split-string?
From: Luc Teirlinck @ 2003-04-21 21:11 UTC
Cc: emacs-devel

Stephen Turnbull wrote:

    How about:

    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;
    ;; one function, three arguments

    (defun split-string (string &optional separators omit-nulls)
      "Splits STRING into substrings bounded by matches for SEPARATORS.

    The beginning and end of STRING, and each match for SEPARATORS, are
    splitting points. The substrings between the splitting points are
    collected in a list, which is returned. (The substrings matching
    SEPARATORS are removed.)

    If SEPARATORS is nil, it defaults to \"[ \f\t\n\r\v]+\".

    If OMIT-NULLS is t, zero-length substrings are omitted from the list
    (so that for the default value of SEPARATORS leading and trailing
    whitespace are trimmed). If nil, all zero-length substrings are
    retained, which correctly parses CSV format, for example."

      ;; implementation
      )

There are two problems with this. First of all, it would break tons of existing Emacs code. Secondly, the defaults for SEPARATORS and for OMIT-NULLS do not match. Thus, the most routine call of (split-string string) would produce nonsensical results in the case of leading or trailing whitespace.

Something like (split-string string &optional separators keep-nulls), that is, the same as your proposal but with the roles of nil and t reversed, would take care of the second objection and also break less existing Emacs code (but probably still enough to worry about). Of course the reduction in broken Emacs code would probably come at the expense of breaking existing XEmacs code.

With your proposal, we would have to replace plenty of occurrences of (split-string string) in Emacs with (split-string string nil t). To do that automatically, we would have to change all of them. There is plenty of Elisp code that is not included in either the Emacs or XEmacs distributions, but that might still be important to plenty of people. We cannot change that code. Code compatible between different Emacs versions would have to become more complex. The reverse version of your proposal would eliminate this part of the problem, but probably produce a similar problem for XEmacs.

With the reverse proposal above, we would not have to worry about Emacs calls to split-string with the default value for SEPARATORS, but one still would have to go through all occurrences of split-string with non-default values of SEPARATORS, at the very least in all .el files in the Lisp directory and all its subdirectories, and very carefully check which ones the change would break and fix all those. (Personally I do not have the time to do that.) Even if somebody finds the time to do all of this, we cannot check and fix Elisp code not included in the Emacs or XEmacs distributions.

The point of my proposal (possible values "all", "none" and "edges" for omit-nulls, with nil being equivalent to "edges" in Emacs and to "none" in XEmacs) was to avoid breaking any existing Emacs or XEmacs code while still making it trivial to use split-string in a way that works identically in Emacs and XEmacs.

Again, in that proposal, only "edges" as an additional value for omit-nulls is necessary to avoid breaking existing Emacs code. I only mentioned "beginning" and "end" as luxury possibilities. I know of software packages that use the "end" version, and the "end" version actually does make a lot of sense in plenty of situations, like splitting a file or buffer into lines, where a leading newline does represent an empty line, but a trailing one does not represent an additional empty line following it. The "end" (as well as the "beginning") behavior is, however, trivial to obtain from the "none" behavior, so that it would be a luxury. ("end" would be a nice luxury, "beginning" would probably be a "luxury luxury" for symmetry with "end".)

Sincerely,

Luc.
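
The five modes Luc describes can be sketched on top of a keep-everything splitter. This is only an illustration under stated assumptions: the function names are hypothetical, the mode is passed as a symbol rather than the quoted strings used in the message, and `edges' is taken to mean "drop one empty string at each end", which is what the GNU Emacs 21.2 results quoted earlier in the thread show.

```elisp
;; Hypothetical helper; keeps every substring, including empty ones.
(defun sketch-split-keep-all (string regexp)
  "Split STRING at each non-empty match of REGEXP, keeping empty substrings."
  (let ((start 0)
        (parts '()))
    (while (and (string-match regexp string start)
                (< (match-beginning 0) (match-end 0)))
      (push (substring string start (match-beginning 0)) parts)
      (setq start (match-end 0)))
    (push (substring string start) parts)
    (nreverse parts)))

;; Hypothetical function; MODE names are assumptions modeled on the message.
(defun sketch-split-with-mode (string regexp mode)
  "Split STRING at REGEXP and delete empty substrings according to MODE.
MODE is one of the symbols `all', `none', `edges', `beginning' or `end'."
  (unless (memq mode '(all none edges beginning end))
    (error "Unknown mode: %S" mode))
  (let ((parts (sketch-split-keep-all string regexp)))
    (cond ((eq mode 'none) parts)
          ((eq mode 'all) (delete "" parts))
          (t
           ;; Drop one leading empty string for `edges' and `beginning'.
           (when (and (memq mode '(edges beginning))
                      (equal (car parts) ""))
             (setq parts (cdr parts)))
           ;; Drop one trailing empty string for `edges' and `end'.
           (when (and (memq mode '(edges end))
                      (equal (car (last parts)) ""))
             (setq parts (butlast parts)))
           parts))))

(sketch-split-with-mode ",,data,," "," 'none)   ; => ("" "" "data" "" "")
(sketch-split-with-mode ",,data,," "," 'edges)  ; => ("" "data" "")
(sketch-split-with-mode ",,data,," "," 'all)    ; => ("data")
;; Splitting into lines: a leading newline is an empty line, a trailing
;; newline only terminates the last line.
(sketch-split-with-mode "\nfoo\n" "\n" 'end)    ; => ("" "foo")
```

As the message notes, `beginning' and `end' are thin wrappers over the `none' result, which is why they count as a luxury.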

* Re: Rationale for split-string?
From: Miles Bader @ 2003-04-21 23:43 UTC
Cc: stephen, emacs-devel, xemacs-design, rms

On Mon, Apr 21, 2003 at 04:11:21PM -0500, Luc Teirlinck wrote:
> (defun split-string (string &optional separators omit-nulls)
>
> There are two problems with this. First of all, it would break tons
> of existing Emacs code. Secondly, the defaults for SEPARATORS and for
> OMIT-NULLS do not match. Thus, the most routine call of
> (split-string string) would produce nonsensical results in the case of
> leading or trailing whitespace.

Other than the all-defaults case (where _both_ optional arguments are omitted), I think Stephen's formulation is very natural, in that you usually want OMIT-NULLS to be t if you're splitting on a non-whitespace string.

I think the problem with the all-defaults case could be solved by having OMIT-NULLS default to t when SEPARATORS is not specified. This is what awk does I think (with split), and it's really very natural.

[IOW, at the beginning of the function, put:

  (unless separators (setq omit-nulls t))
]

-Miles
-- 
We are all lying in the gutter, but some of us are looking at the stars.
-Oscar Wilde
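
Miles's one-line tweak, combined with the three-argument signature from Stephen's proposal, might look like the sketch below. The name `sketch-split-string' is hypothetical and the body is an assumption written for illustration, since the proposal itself left the implementation out; it is not the code of either Emacs.

```elisp
;; Hypothetical name; a sketch of the proposed signature, not the function
;; shipped with GNU Emacs or XEmacs.
(defun sketch-split-string (string &optional separators omit-nulls)
  "Split STRING at matches of the regexp SEPARATORS.
When SEPARATORS is nil, split on runs of whitespace and omit empty
substrings regardless of OMIT-NULLS.  Otherwise keep empty substrings
unless OMIT-NULLS is non-nil."
  ;; Miles's tweak: the all-defaults call behaves as a token parser.
  (unless separators (setq omit-nulls t))
  (let ((regexp (or separators "[ \f\t\n\r\v]+"))
        (start 0)
        (parts '()))
    ;; Stop at an empty match so a regexp such as "x*" cannot loop forever.
    (while (and (string-match regexp string start)
                (< (match-beginning 0) (match-end 0)))
      (push (substring string start (match-beginning 0)) parts)
      (setq start (match-end 0)))
    (push (substring string start) parts)
    (setq parts (nreverse parts))
    (if omit-nulls (delete "" parts) parts)))

(sketch-split-string "  data  ")           ; => ("data")
(sketch-split-string ",,data,," ",")       ; => ("" "" "data" "" "")
(sketch-split-string ",,data,," "," t)     ; => ("data")
;; An explicit whitespace regexp is treated like any other separator.
(sketch-split-string "  data  " "[ \t]+")  ; => ("" "data" "")
```

The last call shows the cost of keying the default on SEPARATORS being nil: passing the whitespace regexp explicitly brings the empty edge strings back unless OMIT-NULLS is also given.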
* Re: Rationale for split-string? 2003-04-21 23:43 ` Miles Bader @ 2003-04-22 3:26 ` Luc Teirlinck 2003-04-22 4:09 ` Jerry James 2003-04-22 13:19 ` Stephen J. Turnbull 0 siblings, 2 replies; 35+ messages in thread From: Luc Teirlinck @ 2003-04-22 3:26 UTC (permalink / raw) Cc: stephen, emacs-devel, xemacs-design, rms Miles Bader wrote: I think Stephen's formulation is very natural, in that you usually want OMIT-NULLS to be t if you're splitting on a non-whitespace string. First of all, I am not worried about Stephen's formulation being unnatural (although the original formulation actually would produce unnatural results in the default case), but about it breaking existing code. I believe you are underestimating the level of generality of split-string and the wild heterogeneity of its applications. It is by no means whatsoever true that except in the whitespace case you would want to keep all null matches. If SEPARATORS is a "terminator character", say newline, then a null match at the beginning counts. There is no reason you would start the string with a terminator other than to explicitly terminate an empty string. The empty match at the end does not count, because the terminator at that place just terminates the previous match. This is, for instance, how you would want to split a buffer, or a file, or user input, into lines. The way you implement that with the current split-string is to first check for an initial terminator and, if there is one, prepend an empty string to the split-string output. With the proposed new split-string, you delete the empty match at the end from the split-string output. That is actually easier. However... The "however" is that we are not defining a *new* function but *re*defining an *existing* function, an often used and extremely general existing function. That is all but guaranteed to produce a wild variety of bugs. In fact let us assume, for the sake of argument, that Stephen and you are 100% right. 
That would mean that any correct existing code, using the present Emacs split-string with a non-nil SEPARATORS, checks for empty matches at the beginning and end and adds any such matches to the split-string output to correct the "bug" in the present split-string. After Stephen's change, any empty match at the beginning and end of the string will produce not one, but two empty strings. Sincerely, Luc. ^ permalink raw reply [flat|nested] 35+ messages in thread
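The terminator case Luc describes can be sketched as follows. This assumes a `split-string` that keeps all null fields when SEPARATORS is given explicitly (the semantics proposed in this thread); the function name `sketch-split-lines` is hypothetical.

```elisp
;; Hypothetical helper; assumes `split-string' keeps every null field
;; when an explicit SEPARATORS argument is passed.
(defun sketch-split-lines (text)
  "Split TEXT into lines, treating newline as a terminator.
An empty first line is kept; the empty string produced by a
final newline is dropped, since that newline only terminates
the previous line."
  (let ((fields (split-string text "\n")))
    (if (string= (car (last fields)) "")
        (butlast fields)
      fields)))

(sketch-split-lines "\nfoo\nbar\n")  ; => ("" "foo" "bar")
(sketch-split-lines "foo\nbar")      ; => ("foo" "bar")
```

Only the trailing empty field needs special treatment here, which is the "actually easier" case Luc concedes for the proposed semantics.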
* Re: Rationale for split-string? 2003-04-22 3:26 ` Luc Teirlinck @ 2003-04-22 4:09 ` Jerry James 2003-04-22 8:15 ` Eli Zaretskii 2003-04-22 12:56 ` Luc Teirlinck 2003-04-22 13:19 ` Stephen J. Turnbull 1 sibling, 2 replies; 35+ messages in thread From: Jerry James @ 2003-04-22 4:09 UTC (permalink / raw) Cc: emacs-devel, xemacs-design Luc Teirlinck <teirllm@dms.auburn.edu> wrote: > First of all, I am not worried about Stephen's formulation being > unnatural (although the original formulation actually would produce > unnatural results in the default case), but about it breaking existing > code. [snip] > The "however" is that we are not defining a *new* function but > *re*defining an *existing* function, an often used and extremely > general existing function. That is all but guaranteed to produce a > wild variety of bugs. Speaking of existing code, it's worth making a couple more points. It appears to me that Emacs 21.1 contained a version with the same behavior as XEmacs'; that is, it produced empty strings at the beginning and end in the cases of interest. Emacs 21.4 contained the current version, that discards such empty strings. So did anybody on the Emacs team worry about breaking existing code when 21.4 was released, nearly 4 years ago? If so, what steps were taken to counter such breakage? Did "a wild variety of bugs" appear at the time? Are there any mail archives of emacs-devel available from back then? Furthermore, how much code will just work, whether the empty strings are present or not? After all, Emacs' current implementation can still produce results containing empty strings, and doesn't even live up to its docstring's promise of not having any at the beginning or end, as some of Stephen's examples show, so any split-string clients still have to deal with such strings. How much code uses the delete idiom to throw the empty strings away? That code wouldn't notice the change. 
I did a lot of digging through the XEmacs package code a little while ago while researching this issue. I didn't see any code that conditionalized on the version of split-string (although I did not make a complete tour, either), so I suspect that a lot of code still assumes the semantics of the old version, and just never noticed that some empty strings don't appear any more. In short, is there any reason to believe that it wouldn't break LESS code to revert to the old version and pretend that the last 4 years never happened? -- Jerry James http://www.ittc.ku.edu/~james/ ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Rationale for split-string? 2003-04-22 4:09 ` Jerry James @ 2003-04-22 8:15 ` Eli Zaretskii 2003-04-22 13:22 ` Stephen J. Turnbull 2003-04-22 12:56 ` Luc Teirlinck 1 sibling, 1 reply; 35+ messages in thread
From: Eli Zaretskii @ 2003-04-22 8:15 UTC (permalink / raw)
Cc: emacs-devel

> From: Jerry James <james@xemacs.org>
> Date: 21 Apr 2003 23:09:31 -0500
>
> It appears to me that Emacs 21.1 contained a version with the same behavior
> as XEmacs'; that is, it produced empty strings at the beginning and end
> in the cases of interest. Emacs 21.4 contained the current version,
> that discards such empty strings. So did anybody on the Emacs team
> worry about breaking existing code when 21.4 was released, nearly 4
> years ago?

There's some confusion (or maybe typos) here: Emacs 21.4 is not released yet, certainly not 4 years ago. The latest Emacs version is 21.3, released about 2 weeks ago. Perhaps you got the versions wrong or something.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Rationale for split-string? 2003-04-22 8:15 ` Eli Zaretskii @ 2003-04-22 13:22 ` Stephen J. Turnbull 2003-04-22 14:38 ` Jerry James 0 siblings, 1 reply; 35+ messages in thread
From: Stephen J. Turnbull @ 2003-04-22 13:22 UTC (permalink / raw)
Cc: james, emacs-devel, xemacs-design, teirllm

>>>>> "Eli" == Eli Zaretskii <eliz@elta.co.il> writes:

    > From: Jerry James <james@xemacs.org>
    > Date: 21 Apr 2003 23:09:31 -0500
    >
    > It appears to me that Emacs 21.1 contained a version with the same behavior
    > as XEmacs'

    Eli> There's some confusion (or maybe typos) here: Emacs 21.4 is
    Eli> not released yet, certainly not 4 years ago. The latest
    Eli> Emacs version is 21.3, released about 2 weeks ago. Perhaps
    Eli> you got the versions wrong or something.

It's a typo. Try `cvs diff -r EMACS_20_2 -r EMACS_20_4 subr.el'. Look for the hunk at line 956.

--
Institute of Policy and Planning Sciences http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business; ask what your business can "do for" free software.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Rationale for split-string? 2003-04-22 13:22 ` Stephen J. Turnbull @ 2003-04-22 14:38 ` Jerry James 0 siblings, 0 replies; 35+ messages in thread
From: Jerry James @ 2003-04-22 14:38 UTC (permalink / raw)
Cc: Eli Zaretskii, emacs-devel, xemacs-design, teirllm

"Stephen J. Turnbull" <stephen@xemacs.org>, on Tue, 22 Apr 2003 at 22:22:51 +0900 you wrote:

> >>>>> "Eli" == Eli Zaretskii <eliz@elta.co.il> writes:
>
>     > From: Jerry James <james@xemacs.org>
>     > Date: 21 Apr 2003 23:09:31 -0500
>     >
>     > It appears to me that Emacs 21.1 contained a version with the same behavior
>     > as XEmacs'
>
>     Eli> There's some confusion (or maybe typos) here: Emacs 21.4 is
>     Eli> not released yet, certainly not 4 years ago. The latest
>     Eli> Emacs version is 21.3, released about 2 weeks ago. Perhaps
>     Eli> you got the versions wrong or something.
>
> It's a typo. Try `cvs diff -r EMACS_20_2 -r EMACS_20_4 subr.el'.
> Look for the hunk at line 956.

Right. Sorry. I should know better than to try composing coherent email just before going to bed. I meant 20.1 and 20.4, of course.

--
Jerry James
http://www.ittc.ku.edu/~james/
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Rationale for split-string? 2003-04-22 4:09 ` Jerry James 2003-04-22 8:15 ` Eli Zaretskii @ 2003-04-22 12:56 ` Luc Teirlinck 2003-04-22 14:56 ` Jerry James 1 sibling, 1 reply; 35+ messages in thread From: Luc Teirlinck @ 2003-04-22 12:56 UTC (permalink / raw) Cc: emacs-devel, xemacs-design I am not going to respond to the essence of your statement, since it does not have any. (It is just emotional stuff, it has no rational content.) I just want to point out that I am not an official spokesperson for Emacs. I represent my own opinions, not those of Emacs or "the Emacs developers". Any "Evil Intents" you seem to be attributing to Emacs and the Emacs developers are strictly and completely my own personal Evilness. Sincerely, Luc Teirlinck. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Rationale for split-string? 2003-04-22 12:56 ` Luc Teirlinck @ 2003-04-22 14:56 ` Jerry James 2003-04-22 15:27 ` Luc Teirlinck 0 siblings, 1 reply; 35+ messages in thread
From: Jerry James @ 2003-04-22 14:56 UTC (permalink / raw)
Cc: emacs-devel, xemacs-design

Luc Teirlinck <teirllm@dms.auburn.edu> wrote:
> I am not going to respond to the essence of your statement, since it
> does not have any. (It is just emotional stuff, it has no rational
> content.) I just want to point out that I am not an official
> spokesperson for Emacs. I represent my own opinions, not those of
> Emacs or "the Emacs developers". Any "Evil Intents" you seem to be
> attributing to Emacs and the Emacs developers are strictly and
> completely my own personal Evilness.

You have to realize that I'm an academic, Luc. I asked the questions I asked, not to accuse or belittle anybody, but as an exercise in the Socratic method (perhaps a poor one, but that's another discussion). If you reread my last message with that in mind, I think you will see that the case is the opposite of what you assumed: it is all rational content; none of it is emotional. Let me summarize the main points I wanted to make:

1) Some of the resistance to changing Emacs' split-string function is coming from people who are worried about breaking existing code.

2) XEmacs has not changed the split-string function (except in the development version, which is where we noticed the test breakage that prompted all this).

3) Emacs changed the split-string function, somewhere after version 20.1 was released, and before 20.4 was released.

4) If no code broke at the time, then we have nothing to worry about, because no code at all notices the difference.

5) If some code broke, then knowing which code it is that broke is relevant to this discussion; hence the question about the existence of emacs-devel archives.

The thought of anyone having any kind of evil intent never crossed my mind.

Regards,
--
Jerry James
http://www.ittc.ku.edu/~james/
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Rationale for split-string? 2003-04-22 14:56 ` Jerry James @ 2003-04-22 15:27 ` Luc Teirlinck 0 siblings, 0 replies; 35+ messages in thread From: Luc Teirlinck @ 2003-04-22 15:27 UTC (permalink / raw) Cc: emacs-devel, xemacs-design I interpreted your original message as suggesting that I was a hypocrite (and that other people involved with Emacs were hypocrites) because I was worrying about breaking existing code now, whereas nobody connected with Emacs development worried about breaking existing code four years ago. (The answer to that is that I did not subscribe to Emacs devel four years ago and that I represent my own opinions, not other people's opinions.) I am willing to believe that you did not mean to suggest the above, in which case I overreacted to your message. Sorry. Sincerely, Luc. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Rationale for split-string? 2003-04-22 3:26 ` Luc Teirlinck 2003-04-22 4:09 ` Jerry James @ 2003-04-22 13:19 ` Stephen J. Turnbull 2003-04-22 13:39 ` Miles Bader ` (2 more replies) 1 sibling, 3 replies; 35+ messages in thread From: Stephen J. Turnbull @ 2003-04-22 13:19 UTC (permalink / raw) Cc: miles, emacs-devel, xemacs-design, rms >>>>> "Luc" == Luc Teirlinck <teirllm@dms.auburn.edu> writes: Luc> Miles Bader wrote: mb> I think Stephen's formulation is very natural, in that you mb> usually want OMIT-NULLS to be t if you're splitting on a mb> non-whitespace string. Miles, here you meant OMIT-NULLS to be nil, right? I think Miles's proposal to default the one-argument form of `split-string' to GNU behavior and have the two-argument form as XEmacs's, with the three argument form for precise control, is a good compromise. Add (defconst split-string-default-separators "[ \\f\\t\\n\\r\\v]+" "The default value of separators for `split-string'. A regexp matching strings of whitespace. May be locale-dependent \(as yet unimplemented). Should not match non-breaking spaces.") and the current XEmacs behavior is very naturally available with (split-string string split-string-default-separators) (although the fact that that means something different from `(split-string string)' is definitely a wart). ------------------------------------------------------------------------ Back to our regularly scheduled controversy on principles: Luc> First of all, I am not worried about Stephen's formulation Luc> being unnatural (although the original formulation actually Luc> would produce unnatural results in the default case), but Luc> about it breaking existing code. GNU Emacs made the change (viz. cvs diff -r EMACS_20_2 -r EMACS_20_4 subr.el) without worrying sufficiently about breaking existing code (see Stefan Reichör's post here <uvfxduzt2.fsf@riic.at>, or run XEmacs's regression test suite on XEmacs 21.5). 
I don't see why that should be a barrier to reverting to the old, regular, behavior now. Further, as far as GNU Emacs itself goes, I see your theory and raise you a full-tree patch. I volunteer to revise the code and fix the callers in all GNU Emacs code distributed on the mainline. (I've already requested papers from rms.) Sure, we can't guarantee that third party code won't get broken, but Jerry James has anted an audit of all XEmacs code including the packages, a significant fraction of 3rd party Emacs Lisp code. Nothing there will break, although once we get this settled, many packages can have their local versions of `split-string' either thrown out or turned into trivial defsubsts around the core version. Want to match Jerry's effort with some facts here? Find us some callers, we'll send patches to their maintainers. Luc> I believe you are underestimating the level of generality of Luc> split-string and the wild heterogeneity of its applications. Et tu, Luc. You don't imagine using split-string to parse Makefiles or Python code[1], to detect trailing whitespace (perhaps generated by older auto-fill implementations to mark sentence breaks) that violates coding standards, etc. (Not surprising, since GNU Emacs 21.x can't do those things using `split-string'.) Since generality and heterogeneity are much better served by simple regular interfaces, what you are really arguing is quite the opposite. Ie, that there's only one important application (splitting into tokens separated by non-significant whitespace). And you want the `split-string' API optimized for that and very similar applications by default, even though that means that `split-string's non-default behavior looks totally schizophrenic by comparison. A lot of people agree with you (including rms AFAICT), but others don't. Many XEmacs people disagree strongly. (They prefer regularity.) Luc> It is by no means whatsoever true that except in the Luc> whitespace case you would want to keep all null matches. 
If Luc> SEPARATORS is a "terminator character", say newline, Note that Miles's proposal would actually give the behavior you want in `(split-string string "\n")'. (Admittedly, you'd like `(split-string string "\n" 'end)' even better.) Point for Miles! But you are exactly right: sometimes one wants it one way, and sometimes the other. It is this _irreconcilable_ difference that leads me to strongly prefer separate APIs, one which imposes stream-of-token semantics, and one which merely splits strings. I think `split-string' is a more natural name for the latter. Luc> The "however" is that we are not defining a *new* function Luc> but *re*defining an *existing* function, an often used and Luc> extremely general existing function. That is all but Luc> guaranteed to produce a wild variety of bugs. Please consider the history of the change. You're inaccurate on all counts. We propose _reverting_ what is already a redefinition. Because the redefined function is _less general_ than the original, it's _used less often_ than it could be. (Jerry James's audit of XEmacs and package code demonstrates this.) And it won't "produce" bugs, it will _exchange_ a new set of unknown bugs (which is likely to be small everywhere except in code very specific to GNU Emacs 21) for a set of existing bugs, which everybody agrees need to be fixed. So the question basically boils down to whether it makes sense to have a regular, easily understood definition with exceptions restricted to a few very clear cases with consensus support, or to aggressively make "plausible" exceptions. The last time GNU Emacs did the latter with this function, it clearly screwed up. Luc> In fact let us assume, for the sake of argument, that Stephen Luc> and you are 100% right. 
That would mean that any correct Luc> existing code, using the present Emacs split-string with a Luc> non-nil SEPARATORS, checks for empty matches at the beginning Luc> and end and adds any such matches to the split-string output Luc> to correct the "bug" in the present split-string. After Luc> Stephen's change, any empty match at the beginning and end of Luc> the string will produce not one, but two empty strings. That's silly; what anybody sane would do in the face of GNU Emacs's demonstrated willingness to change semantics of such a fundamental function is to copy the old definition into their own code. It would probably be shorter, and surely simpler and faster, than the gross hack you propose. Footnotes: [1] (defun python-parse-indentation (line) (let ((i 0) (line (split-string line python-single-indentation))) (while (string= (car line) "") (setq i (1+ i)) (setq line (cdr line))) (cons i line))) -- Institute of Policy and Planning Sciences http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. ^ permalink raw reply [flat|nested] 35+ messages in thread
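The footnote's indentation parser can be made self-contained as below. This is a sketch under two assumptions: `split-string` keeps leading null fields for an explicit separator (the XEmacs behavior argued for in the message), and the indentation unit is four spaces. The `sketch-` names and the four-space value are hypothetical; the original message does not define `python-single-indentation`.

```elisp
;; Hypothetical value; the original footnote leaves
;; `python-single-indentation' undefined.
(defvar sketch-python-single-indentation "    "
  "One level of indentation, assumed here to be four spaces.")

(defun sketch-python-parse-indentation (line)
  "Return (DEPTH . FIELDS) for LINE.
DEPTH counts the leading null fields, one per indentation unit.
Relies on `split-string' keeping leading null fields."
  (let ((depth 0)
        (fields (split-string
                 line (regexp-quote sketch-python-single-indentation))))
    ;; Each leading "" corresponds to one indentation unit.
    (while (string= (car fields) "")
      (setq depth (1+ depth))
      (setq fields (cdr fields)))
    (cons depth fields)))

(sketch-python-parse-indentation "        x = 1")  ; => (2 "x = 1")
```

Under the semantics that drop leading null fields, the depth would always come out as 0, which is the point of the footnote.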
* Re: Rationale for split-string? 2003-04-22 13:19 ` Stephen J. Turnbull @ 2003-04-22 13:39 ` Miles Bader 2003-04-22 13:51 ` Luc Teirlinck 2003-04-22 16:26 ` Luc Teirlinck 2 siblings, 0 replies; 35+ messages in thread
From: Miles Bader @ 2003-04-22 13:39 UTC (permalink / raw)
Cc: Luc Teirlinck, emacs-devel, xemacs-design, rms

On Tue, Apr 22, 2003 at 10:19:31PM +0900, Stephen J. Turnbull wrote:
>     mb> I think Stephen's formulation is very natural, in that you
>     mb> usually want OMIT-NULLS to be t if you're splitting on a
>     mb> non-whitespace string.
>
> Miles, here you meant OMIT-NULLS to be nil, right?

Yeah.

-Miles
--
I'd rather be consing.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Rationale for split-string? 2003-04-22 13:19 ` Stephen J. Turnbull 2003-04-22 13:39 ` Miles Bader @ 2003-04-22 13:51 ` Luc Teirlinck 2003-04-22 16:26 ` Luc Teirlinck 2 siblings, 0 replies; 35+ messages in thread
From: Luc Teirlinck @ 2003-04-22 13:51 UTC (permalink / raw)
Cc: miles, emacs-devel, xemacs-design, rms

Stephen Turnbull wrote:

   GNU Emacs made the change (viz. cvs diff -r EMACS_20_2 -r EMACS_20_4 subr.el) without worrying sufficiently about breaking existing code (see Stefan Reichör's post here <uvfxduzt2.fsf@riic.at>, or run XEmacs's regression test suite on XEmacs 21.5). I don't see why that should be a barrier to reverting to the old, regular, behavior now.

I did not know the history of the function. I did not subscribe to this site four years ago. If I did I would probably have opposed the original change back then, which might not have made any difference anyway. I am not part of some "Conspiracy" started four years ago as others seem to suggest.

Anyway, I hope this answers the question of "And where were you four years ago?" which others asked me. I did not subscribe to emacs devel back then.

Sincerely,

Luc.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Rationale for split-string? 2003-04-22 13:19 ` Stephen J. Turnbull 2003-04-22 13:39 ` Miles Bader 2003-04-22 13:51 ` Luc Teirlinck @ 2003-04-22 16:26 ` Luc Teirlinck 2 siblings, 0 replies; 35+ messages in thread
From: Luc Teirlinck @ 2003-04-22 16:26 UTC (permalink / raw)
Cc: miles, xemacs-design, rms, emacs-devel

Stephen Turnbull wrote:

   Note that Miles's proposal would actually give the behavior you want in `(split-string string "\n")'. (Admittedly, you'd like `(split-string string "\n" 'end)' even better.) Point for Miles!

Just to make sure I understand what you are proposing: I could not just do (split-string string "\n"), I would first have to check whether the string ended in a newline and, if so, remove that newline before calling split-string (or do something else). Otherwise split-string would return a "fake" empty line at the end of a newline terminated buffer or file. (Correct?)

Or are you actually suggesting to remove a final empty match, but keep any initial empty match, exactly the behavior I suggested for "end". That is, would

   (split-string "\n" "\n")

return ("" "") or ("") ?

Sincerely,

Luc.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Rationale for split-string? 2003-04-21 10:58 ` Stephen J. Turnbull 2003-04-21 21:11 ` Luc Teirlinck @ 2003-04-23 1:00 ` Richard Stallman 2003-04-23 4:09 ` Stephen J. Turnbull 1 sibling, 1 reply; 35+ messages in thread
From: Richard Stallman @ 2003-04-23 1:00 UTC (permalink / raw)
Cc: emacs-devel, xemacs-design

    So I suppose you want a simpler version of Luc Teirlinck's suggestion.
    How about:

    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;
    ;; one function, three arguments

    (defun split-string (string &optional separators omit-nulls)
      "Splits STRING into substrings bounded by matches for SEPARATORS.

    The beginning and end of STRING, and each match for SEPARATORS, are
    splitting points. The substrings between the splitting points are
    collected in a list, which is returned. (The substrings matching
    SEPARATORS are removed.)

    If SEPARATORS is nil, it defaults to \"[ \f\t\n\r\v]+\".

    If OMIT-NULLs is t, zero-length substrings are omitted from the list
    (so that for the default value of SEPARATORS leading and trailing
    whitespace are trimmed). If nil, all zero-length substrings are
    retained, which correctly parses CSV format, for example."

That seems like the right thing, except I think that if SEPARATORS is nil, OMIT-NULLS should default to t.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Rationale for split-string? 2003-04-23 1:00 ` Richard Stallman @ 2003-04-23 4:09 ` Stephen J. Turnbull 2003-04-24 23:12 ` Richard Stallman 2003-05-20 1:55 ` Stephen J. Turnbull 0 siblings, 2 replies; 35+ messages in thread
From: Stephen J. Turnbull @ 2003-04-23 4:09 UTC (permalink / raw)
Cc: emacs-devel, xemacs-design

>>>>> "rms" == Richard Stallman <rms@gnu.org> writes:

    (defun split-string (string &optional separators omit-nulls)
      "Splits STRING into substrings bounded by matches for SEPARATORS.

    The beginning and end of STRING, and each match for SEPARATORS, are
    splitting points. The substrings between the splitting points are
    collected in a list, which is returned. (The substrings matching
    SEPARATORS are removed.)

    If SEPARATORS is nil, it defaults to \"[ \f\t\n\r\v]+\".

    If OMIT-NULLs is t, zero-length substrings are omitted from the list
    (so that for the default value of SEPARATORS leading and trailing
    whitespace are trimmed). If nil, all zero-length substrings are
    retained, which correctly parses CSV format, for example."

    rms> That seems like the right thing, except I think that if
    rms> SEPARATORS is nil, OMIT-NULLS should default to t.

OK. That is satisfactory for XEmacs, and we'll implement that.

Unless you say you prefer to do it yourself, I will also submit a patch against GNU Emacs CVS head, and audit the Lisp code in CVS head to make sure there are no surprises from callers with non-default SEPARATORS.

--
Institute of Policy and Planning Sciences http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business; ask what your business can "do for" free software.
^ permalink raw reply [flat|nested] 35+ messages in thread
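The CSV case mentioned in the docstring can be sketched like this, assuming a `split-string` that implements the specification agreed in this message (null fields kept for an explicit separator unless OMIT-NULLS is t). The name `sketch-csv-fields` is hypothetical, and the parser is deliberately naive.

```elisp
;; Hypothetical helper; assumes the three-argument `split-string'
;; specified above.  Quoted fields and embedded commas are not handled.
(defun sketch-csv-fields (line)
  "Return the comma-separated fields of LINE, keeping empty ones."
  (split-string line ","))

;; Every column survives, so field positions stay meaningful.
(sketch-csv-fields ",,data,,")    ; => ("" "" "data" "" "")

;; Callers that only want the non-empty values pass OMIT-NULLS.
(split-string ",,data,," "," t)   ; => ("data")
```

Keeping the null fields is what lets a caller address a column by position with `nth`, which is not possible once any of the empty strings are dropped.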
* Re: Rationale for split-string? 2003-04-23 4:09 ` Stephen J. Turnbull @ 2003-04-24 23:12 ` Richard Stallman 2003-05-20 1:55 ` Stephen J. Turnbull 1 sibling, 0 replies; 35+ messages in thread
From: Richard Stallman @ 2003-04-24 23:12 UTC (permalink / raw)
Cc: emacs-devel, xemacs-design

    Unless you say you prefer to do it yourself, I will also submit a
    patch against GNU Emacs CVS head, and audit the Lisp code in CVS head
    to make sure there are no surprises from callers with non-default
    SEPARATORS.

That would be very kind of you. To use the same code can't hurt, and may help.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Rationale for split-string? 2003-04-23 4:09 ` Stephen J. Turnbull 2003-04-24 23:12 ` Richard Stallman @ 2003-05-20 1:55 ` Stephen J. Turnbull 2003-05-22 15:00 ` Kai Großjohann 1 sibling, 1 reply; 35+ messages in thread From: Stephen J. Turnbull @ 2003-05-20 1:55 UTC (permalink / raw) >>>>> "sjt" == Stephen J Turnbull <stephen@xemacs.org> writes: sjt> OK. That is satisfactory for XEmacs, and we'll implement sjt> that. sjt> Unless you say you prefer to do it yourself, I will also sjt> submit a patch against GNU Emacs CVS head, and audit the Lisp sjt> code in CVS head to make sure there are no surprises from sjt> callers with non-default SEPARATORS. Enclosed are patches for lisp/subr.el and lispref/strings.texi to implement the API for split-string discussed earlier. Also enclosed is the result of an audit of uses of split-string in Emacs CVS (as of about three weeks ago). I didn't notice any cases where the changed specification made existing code out-and-out incorrect, so there are no further patches suggested. However, I think a lot of the uses with an explicit SEPARATORS are semantically dubious without using the OMIT-NULLS flag (and most were semantically dubious before the change to split-string, because it's at least theoretically possible for a null string to arise in the interior of the list). Most other uses of split-string are dubious in that either they depend heavily on undocumented implementation details of other utilities (eg, that the fields in /etc/mtab are separated by exactly one space) or are not very robust to bogus input. People who understand the modules in question might want to take a closer look. 
A few I couldn't tell at all without doing a much deeper analysis of the code than I have time for right now: ./lisp/calendar/todo-mode.el:869: needs checking ./lisp/eshell/em-pred.el:601: needs checking ./lisp/mh-e/mh-utils.el:1606: needs checking ./lisp/textmodes/reftex.el:934: needs checking ./lisp/textmodes/reftex.el:2161: needs checking If you set default-directory to the root of the Emacs hierarchy, the following function is useful to jump to the reference. nb. a few of the references have changed since I started the audit. (defun sjt/parse-grep-n2 () "Parse `grep -n -#' output for filename and line number." (interactive) (beginning-of-line) (when (re-search-forward "^\\(\\S-+\\):\\([0-9]+\\):") (cons (match-string 1) (string-to-number (match-string 2))))) (defun sjt/parse-grep-n-and-go () "Jump to place specified by `grep -n' output." (interactive) (let* ((pair (sjt/parse-grep-n2)) (file (car pair)) (line (cdr pair))) (find-file file) (goto-line line))) lisp/ChangeLog 2003-05-16 Stephen J. Turnbull <stephen@xemacs.org> * subr.el (split-string): Implement specification that splitting on explicit separators retains null fields. Add new argument OMIT-NULLS. Special-case (split-string "a string"). lispref/ChangeLog 2003-05-16 Stephen J. Turnbull <stephen@xemacs.org> * strings.texi (Creating Strings): Update split-string specification and examples. Index: lisp/subr.el =================================================================== RCS file: /cvsroot/emacs/emacs/lisp/subr.el,v retrieving revision 1.350 diff -u -r1.350 subr.el --- lisp/subr.el 24 Apr 2003 23:14:12 -0000 1.350 +++ lisp/subr.el 16 May 2003 10:03:58 -0000 @@ -1792,19 +1792,45 @@ (buffer-substring-no-properties (match-beginning num) (match-end num))))) -(defun split-string (string &optional separators) - "Splits STRING into substrings where there are matches for SEPARATORS. -Each match for SEPARATORS is a splitting point. 
-The substrings between the splitting points are made into a list +(defconst split-string-default-separators "[ \f\t\n\r\v]+" + "The default value of separators for `split-string'. + +A regexp matching strings of whitespace. May be locale-dependent +\(as yet unimplemented). Should not match non-breaking spaces. + +Warning: binding this to a different value and using it as default is +likely to have undesired semantics.") + +;; The specification says that if both SEPARATORS and OMIT-NULLS are +;; defaulted, OMIT-NULLS should be treated as t. Simplifying the logical +;; expression leads to the equivalent implementation that if SEPARATORS +;; is defaulted, OMIT-NULLS is treated as t. +(defun split-string (string &optional separators omit-nulls) + "Splits STRING into substrings bounded by matches for SEPARATORS. + +The beginning and end of STRING, and each match for SEPARATORS, are +splitting points. The substrings matching SEPARATORS are removed, and +the substrings between the splitting points are collected as a list, which is returned. -If SEPARATORS is absent, it defaults to \"[ \\f\\t\\n\\r\\v]+\". -If there is match for SEPARATORS at the beginning of STRING, we do not -include a null substring for that. Likewise, if there is a match -at the end of STRING, we don't include a null substring for that. +If SEPARATORS is non-nil, it should be a regular expression matching text +which separates, but is not part of, the substrings. If nil it defaults to +`split-string-default-separators', normally \"[ \\f\\t\\n\\r\\v]+\", and +OMIT-NULLS is forced to t. + +If OMIT-NULLs is t, zero-length substrings are omitted from the list \(so +that for the default value of SEPARATORS leading and trailing whitespace +are effectively trimmed). If nil, all zero-length substrings are retained, +which correctly parses CSV format, for example. + +Note that the effect of `(split-string STRING)' is the same as +`(split-string STRING split-string-default-separators t)'). 
In the rare
+case that you wish to retain zero-length substrings when splitting on
+whitespace, use `(split-string STRING split-string-default-separators)'.
 
 Modifies the match data; use `save-match-data' if necessary."
-  (let ((rexp (or separators "[ \f\t\n\r\v]+"))
+  (let ((keep-nulls (not (if separators omit-nulls t)))
+        (rexp (or separators split-string-default-separators))
         (start 0)
         notfirst
         (list nil))
@@ -1813,16 +1839,14 @@
                                        (= start (match-beginning 0))
                                        (< start (length string)))
                                   (1+ start) start))
-                (< (match-beginning 0) (length string)))
+                (< start (length string)))
       (setq notfirst t)
-      (or (eq (match-beginning 0) 0)
-          (and (eq (match-beginning 0) (match-end 0))
-               (eq (match-beginning 0) start))
+      (if (or keep-nulls (< start (match-beginning 0)))
           (setq list
                 (cons (substring string start (match-beginning 0))
                       list)))
       (setq start (match-end 0)))
-    (or (eq start (length string))
+    (if (or keep-nulls (< start (length string)))
         (setq list
               (cons (substring string start)
                     list)))
Index: lispref/strings.texi
===================================================================
RCS file: /cvsroot/emacs/emacs/lispref/strings.texi,v
retrieving revision 1.23
diff -u -r1.23 strings.texi
--- lispref/strings.texi 4 Feb 2003 14:47:54 -0000 1.23
+++ lispref/strings.texi 16 May 2003 10:03:59 -0000
@@ -259,30 +259,46 @@
 Lists}.
 @end defun
 
-@defun split-string string separators
+@defun split-string string separators omit-nulls
 This function splits @var{string} into substrings at matches for the regular
 expression @var{separators}. Each match for @var{separators} defines a
 splitting point; the substrings between the splitting points are made
-into a list, which is the value returned by @code{split-string}.
+into a list, which is the value returned by @code{split-string}. If
+@var{omit-nulls} is @code{t}, null strings will be removed from the
+result list. Otherwise, null strings are left in the result.
 If @var{separators} is @code{nil} (or omitted),
-the default is @code{"[ \f\t\n\r\v]+"}.
+the default is the value of @code{split-string-default-separators}.
 
-For example,
+@defvar split-string-default-separators
+The default value of @var{separators} for @code{split-string}, initially
+@samp{"[ \f\t\n\r\v]+"}.
+
+As a special case, when @var{separators} is @code{nil} (or omitted),
+null strings are always omitted from the result. Thus:
 
 @example
-(split-string "Soup is good food" "o")
-@result{} ("S" "up is g" "" "d f" "" "d")
-(split-string "Soup is good food" "o+")
-@result{} ("S" "up is g" "d f" "d")
+(split-string " two words ")
+@result{} ("two" "words")
+@end example
+
+The result is not @samp{("" "two" "words" "")}, which would rarely be
+useful. If you need such a result, use an explicit value for
+@var{separators}:
+
+@example
+(split-string " two words " split-string-default-separators)
+@result{} ("" "two" "words" "")
 @end example
 
-When there is a match adjacent to the beginning or end of the string,
-this does not cause a null string to appear at the beginning or end
-of the list:
+More examples:
 
 @example
-(split-string "out to moo" "o+")
-@result{} ("ut t" " m")
+(split-string "Soup is good food" "o")
+@result{} ("S" "up is g" "" "d f" "" "d")
+(split-string "Soup is good food" "o" t)
+@result{} ("S" "up is g" "d f" "d")
+(split-string "Soup is good food" "o+")
+@result{} ("S" "up is g" "d f" "d")
 @end example
 
 Empty matches do count, when not adjacent to another match:

bash-2.05b$ find . -name '*.el' | xargs fgrep -2 -n split-string /dev/null

./lisp/apropos.el:267: want OMIT-NULLS t
./lisp/calendar/todo-mode.el:869: needs checking
./lisp/cvs-status.el:286: new semantics preferred; no error checking
./lisp/diff-mode.el:1047: OK, double default
./lisp/ediff-diff.el:1143: OK
./lisp/emacs-lisp/authors.el:460: double default, OK
./lisp/emacs-lisp/crm.el:419: new semantics preferred; no error checking
./lisp/emacs-lisp/crm.el:605: new semantics preferred; no error checking
./lisp/emacs-lisp/lisp-mnt.el:412: want OMIT-NULLS t
./lisp/emacs-lisp/unsafep.el:111: mentioned in comment, not used
./lisp/eshell/em-cmpl.el:403: new semantics preferred; no error checking
./lisp/eshell/em-ls.el:257: OK, double default
./lisp/eshell/em-pred.el:601: needs checking
./lisp/eshell/esh-util.el:228: want OMIT-NULLS t
./lisp/eshell/esh-util.el:449: new semantics preferred; no error checking
./lisp/eshell/esh-var.el:568: new semantics preferred; no error checking
./lisp/files.el:4254: double default, OK
./lisp/filesets.el:1202: new semantics preferred; no error checking
./lisp/gdb-ui.el:1001: new semantics preferred; no error checking
./lisp/gnus/gnus-art.el:4645: new semantics preferred; no error checking
./lisp/gnus/gnus-group.el:3798: OK
./lisp/gnus/gnus.el:2679: OK
./lisp/gnus/gnus.el:2681: OK
./lisp/gnus/mailcap.el:367: OK, could use OMIT-NULLS t instead
./lisp/gnus/mailcap.el:502: want OMIT-NULLS t
./lisp/gnus/mailcap.el:648: new semantics preferred; no error checking (splitting MIME content type)
./lisp/gnus/mailcap.el:702: new semantics preferred; no error checking (splitting MIME content type)
./lisp/gnus/mailcap.el:870: OK, could use OMIT-NULLS t instead
./lisp/gnus/mailcap.el:940: new semantics preferred; no error checking (splitting MIME content type)
./lisp/gnus/message.el:4701: want OMIT-NULLS t
./lisp/gnus/mm-decode.el:55: new semantics preferred; no error checking (splitting MIME content type)
./lisp/gnus/mm-decode.el:57: new semantics preferred; no error checking (splitting MIME content type)
./lisp/gnus/mm-decode.el:264: new semantics preferred; no error checking (splitting MIME content type)
./lisp/gnus/mm-decode.el:363: OK, double default
./lisp/gnus/mml.el:307: new semantics preferred; no error checking (splitting MIME content type)
./lisp/gnus/mml.el:337: ditto
./lisp/gnus/nnslashdot.el:364: OK, double default
./lisp/gnus/nnslashdot.el:488: OK, could use OMIT-NULLS t instead
./lisp/gnus/nnultimate.el:176: OK, could use OMIT-NULLS t instead
./lisp/gnus/pop3.el:249: want OMIT-NULLS t
./lisp/gnus/pop3.el:346: want OMIT-NULLS t
./lisp/gnus/pop3.el:347: want OMIT-NULLS t
./lisp/gnus/pop3.el:409: want OMIT-NULLS t
./lisp/gnus/rfc2231.el:131: new semantics preferred; no error checking (splitting encoded word into locale info)
./lisp/gud.el:1817: OK
./lisp/gud.el:1847: OK
./lisp/gud.el:2288: OK, double default
./lisp/gud.el:2813: OK
./lisp/hexl.el:635: double default, OK
./lisp/hexl.el:652: double default, OK
./lisp/ido.el:2502: want OMIT-NULLS t
./lisp/ido.el:2868: want OMIT-NULLS t
./lisp/info.el:387: want OMIT-NULLS t
./lisp/info.el:390: want OMIT-NULLS t
./lisp/mail/rfc2368.el:137: OK
./lisp/mail/rfc2368.el:144: new semantics preferred; no error checking
./lisp/mail/smtpmail.el:602: want OMIT-NULLS t
./lisp/mh-e/mh-alias.el:156: want OMIT-NULLS t
./lisp/mh-e/mh-alias.el:289: OK
./lisp/mh-e/mh-alias.el:469: OK
./lisp/mh-e/mh-comp.el:374: OK, double default
./lisp/mh-e/mh-e.el:2164: OK, double default
./lisp/mh-e/mh-index.el:475: OK, double default
./lisp/mh-e/mh-seq.el:966: OK, double default
./lisp/mh-e/mh-utils.el:1606: needs checking
./lisp/net/eudc-export.el:126: OK
./lisp/net/eudc.el:161: Emacs 21 compatible
./lisp/net/eudc.el:419: want OMIT-NULLS t
./lisp/net/eudc.el:442: check this
./lisp/net/eudc.el:833: want OMIT-NULLS t
./lisp/net/eudcb-ldap.el:90: OK
./lisp/net/ldap.el:415: new semantics preferred; no error checking
./lisp/net/ldap.el:420: OK
./lisp/net/tramp.el:5658: check this
./lisp/net/tramp.el:6257: tramp-split-string is not quite emacs compatible
./lisp/pcmpl-cvs.el:175: new semantics preferred; no error checking
./lisp/pcmpl-gnu.el:127: OK, double default
./lisp/pcmpl-linux.el:46: double default, OK
./lisp/pcmpl-linux.el:88: want OMIT-NULLS t
./lisp/pcmpl-linux.el:101: want OMIT-NULLS t
./lisp/pcmpl-rpm.el:39: OK, double default
./lisp/pcmpl-rpm.el:46: OK, double default
./lisp/pcmpl-unix.el:89: new semantics preferred; no error checking
./lisp/pcvs-util.el:227: want OMIT-NULLS t
./lisp/pcvs-util.el:228: want OMIT-NULLS t
./lisp/progmodes/ada-prj.el:590: want OMIT-NULLS t
./lisp/progmodes/ada-xref.el:207: new semantics preferred; no error checking
./lisp/progmodes/fortran.el:267: want OMIT-NULLS t
./lisp/progmodes/idlw-shell.el:1734: could use new split-string with OMIT-NULLS t
./lisp/progmodes/idlwave.el:3702: prior XEmacs-compatible, could use new split-string
./lisp/progmodes/inf-lisp.el:285: double default, OK
./lisp/progmodes/vhdl-mode.el:13030: new semantics preferred; no error checking
./lisp/progmodes/vhdl-mode.el:13171: new semantics preferred; no error checking
./lisp/progmodes/vhdl-mode.el:13698: new semantics preferred; no error checking
./lisp/progmodes/vhdl-mode.el:13701: new semantics preferred; no error checking
./lisp/textmodes/bibtex.el:2665: new semantics preferred; no error checking
./lisp/textmodes/reftex-cite.el:192: Gone?
./lisp/textmodes/reftex-cite.el:373: new semantics preferred; no error checking
./lisp/textmodes/reftex-cite.el:383: new semantics preferred; no error checking
./lisp/textmodes/reftex-cite.el:445: OK
./lisp/textmodes/reftex-cite.el:863: new semantics preferred; no error checking
./lisp/textmodes/reftex-cite.el:961: new semantics preferred; no error checking
./lisp/textmodes/reftex-index.el:1552: new semantics preferred; no error checking
./lisp/textmodes/reftex-index.el:1685: want OMIT-NULLS t
./lisp/textmodes/reftex-index.el:1734: OK, double default
./lisp/textmodes/reftex-index.el:1748: OK, double default
./lisp/textmodes/reftex-index.el:1755: OK, double default
./lisp/textmodes/reftex-index.el:1762: new semantics preferred; no error checking
./lisp/textmodes/reftex-index.el:1818: new semantics preferred; no error checking
./lisp/textmodes/reftex-parse.el:343: new semantics preferred; no error checking
./lisp/textmodes/reftex-parse.el:482: OK, mapconcat used
./lisp/textmodes/reftex-parse.el:990: new semantics preferred; no error checking
./lisp/textmodes/reftex.el:934: needs checking
./lisp/textmodes/reftex.el:1455: OK, double default
./lisp/textmodes/reftex.el:1488: OK, double default
./lisp/textmodes/reftex.el:1556: OK, could use OMIT-NULLS t instead
./lisp/textmodes/reftex.el:2161: needs checking (uses explicit re or explicit ws)
./lisp/vc-cvs.el:789: new semantics preferred; requires rewrite to use
./lisp/xml.el:432: OK
./lisp/xml.el:436: OK

-- 
Institute of Policy and Planning Sciences http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business;
ask what your business can "do for" free software.
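The three cases distinguished in the strings.texi patch and in the audit list above (default separators, explicit separators, and explicit separators with OMIT-NULLS) can be sketched as follows. This is only a minimal sketch, assuming a `split-string' that accepts the optional OMIT-NULLS argument proposed in the patch; the `my-' helper names are hypothetical and appear nowhere in the patch or in the audited files.

```elisp
;; Sketch only: assumes `split-string' takes the optional OMIT-NULLS
;; argument proposed in the patch.  All `my-' names are hypothetical.

(defun my-split-words (text)
  "Split TEXT on whitespace using the default separators.
With SEPARATORS omitted, null strings are dropped (the \"double
default\" case in the audit list)."
  (split-string text))

(defun my-split-csv-line (line)
  "Split LINE on commas, keeping every empty field.
An explicit separator without OMIT-NULLS keeps null strings."
  (split-string line ","))

(defun my-split-path (path)
  "Split PATH on colons, dropping empty components.
This is the \"want OMIT-NULLS t\" case; it replaces the older idiom
\(delete \"\" (split-string path \":\"))."
  (split-string path ":" t))
```

Under the documented semantics, `(my-split-words " two words ")' should give ("two" "words"), `(my-split-csv-line ",,data,,")' should give ("" "" "data" "" ""), and `(my-split-path ":/usr/bin::/bin:")' should give ("/usr/bin" "/bin").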
* Re: Rationale for split-string?
  2003-05-20 1:55 ` Stephen J. Turnbull
@ 2003-05-22 15:00 ` Kai Großjohann
From: Kai Großjohann @ 2003-05-22 15:00 UTC

"Stephen J. Turnbull" <stephen@xemacs.org> writes:

> Enclosed are patches for lisp/subr.el and lispref/strings.texi to
> implement the API for split-string discussed earlier.

I wonder what's going to happen with this? It hasn't been committed,
AFAICS. Does anyone know?

-- 
This line is not blank.