* cl.el sort* efficiency @ 2006-12-22 19:19 Kevin Ryde 2006-12-23 0:29 ` Juanma Barranquero 0 siblings, 1 reply; 8+ messages in thread From: Kevin Ryde @ 2006-12-22 19:19 UTC (permalink / raw) I noticed that cl sort* function calls the given :key function within every comparison, which is not good if key extraction is a big slow calculation. I struck such a case myself recently, and in fact the example in the manual showing (sort* lst 'string-lessp :key 'downcase) is almost another. If possible I don't think you'd want to throw off two new downcased strings in every compare. The code below is an idea to build key results in an alist and sort that. It's not tested properly yet, and really isn't a thing of beauty. There's some trouble taken to rearrange the original cons cells because that's discussed at length in the manual for plain "sort", so maybe someone's relying on that. Ideas for something better or worse would be welcome. (defun sort* (cl-seq cl-pred &rest cl-keys) "Sort the argument SEQ according to PREDICATE. This is a destructive function; it reuses the storage of SEQ if possible. \nKeywords supported: :key \n(fn SEQ PREDICATE [KEYWORD VALUE]...)" (if (nlistp cl-seq) (replace cl-seq (apply 'sort* (append cl-seq nil) cl-pred cl-keys)) (cl-parsing-keywords (:key) () (if (memq cl-key '(nil identity)) (sort cl-seq cl-pred) ;; turn cl-seq into an alist of (KEY-RESULT . ORIGINAL-CONS-CELL) (let (cl-x) (while cl-seq (setq cl-x (acons (funcall cl-key (car cl-seq)) cl-seq cl-x)) (setq cl-seq (cdr cl-seq))) (setq cl-seq (nreverse cl-x))) ;; sort that alist by KEY-RESULT (setq cl-seq (sort cl-seq (if (eq cl-pred '<) 'car-less-than-car (lambda (cl-x cl-y) (funcall cl-pred (car cl-x) (car cl-y)))))) ;; mung cdrs of original cons cells to put them in new order (let (cell prev) (while cl-seq (setq cell (cdar cl-seq)) (setcdr cell prev) (setq prev cell) (setq cl-seq (cdr cl-seq))) (nreverse prev)))))) ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: cl.el sort* efficiency 2006-12-22 19:19 cl.el sort* efficiency Kevin Ryde @ 2006-12-23 0:29 ` Juanma Barranquero 2006-12-24 0:56 ` Kevin Ryde 0 siblings, 1 reply; 8+ messages in thread From: Juanma Barranquero @ 2006-12-23 0:29 UTC (permalink / raw) Cc: emacs-devel On 12/22/06, Kevin Ryde <user42@zip.com.au> wrote: > I noticed that cl sort* function calls the given :key function within > every comparison, which is not good if key extraction is a big slow > calculation. [...] > The code below is an idea to build key results in an alist and sort > that. You're implementing a Schwartzian Transform (http://en.wikipedia.org/wiki/Schwartzian_transform) inside sort*, which is kinda odd. IMHO is the responsibility of the user of sort* to know whether comparisons are going to be expensive and preprocess the data to help in the comparison. You're doing an optimization at the wrong level: perhaps I'm using it with so light a comparison function that your method makes it more expensive... Perhaps you could encapsulate your ST in a function and use it as a wrapper for sort*. /L/e/k/t/u ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: cl.el sort* efficiency 2006-12-23 0:29 ` Juanma Barranquero @ 2006-12-24 0:56 ` Kevin Ryde 2006-12-24 2:08 ` Miles Bader ` (2 more replies) 0 siblings, 3 replies; 8+ messages in thread From: Kevin Ryde @ 2006-12-24 0:56 UTC (permalink / raw) [-- Attachment #1: Type: text/plain, Size: 1222 bytes --] "Juanma Barranquero" <lekktu@gmail.com> writes: > > You're implementing a Schwartzian Transform > (http://en.wikipedia.org/wiki/Schwartzian_transform) inside sort*, > which is kinda odd. The concept would pre-date 1994, wouldn't it? (Not that this is the place for questions of academic priority :-) > IMHO is the responsibility of the user of sort* to know whether > comparisons are going to be expensive and preprocess the data to help > in the comparison. A key func as data accessor seemed to me a reasonably natural way to express that, but maybe that's just me. > You're doing an optimization at the wrong level: > perhaps I'm using it with so light a comparison function that your > method makes it more expensive... I see the common lisp spec specifically doesn't say how much or little the key is used. Dunno what you get from implementations in practice. Is there a CL expert in the house? In any case Richard advises not doing anything to emacs at this time, but I would propose a sentence for the manual, just to describe what's current. 2006-12-24 Kevin Ryde <user42@zip.com.au> * cl.texi (Sorting Sequences): In sort*, add a little cautionary note about the key procedure being used heavily. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: cl.texi.sort-key.diff --] [-- Type: text/x-diff, Size: 1139 bytes --] Index: cl.texi =================================================================== RCS file: /sources/emacs/emacs/man/cl.texi,v retrieving revision 1.30 diff -u -c -r1.30 cl.texi cvs diff: conflicting specifications of output style *** cl.texi 22 Dec 2006 23:27:28 -0000 1.30 --- cl.texi 24 Dec 2006 00:43:47 -0000 *************** *** 4092,4098 **** @noindent sorts @var{data}, a sequence of strings, into increasing alphabetical order without regard to case. A @code{:key} function of @code{car} ! would be useful for sorting association lists. The @code{sort*} function is destructive; it sorts lists by actually rearranging the @code{cdr} pointers in suitable fashion. --- 4092,4100 ---- @noindent sorts @var{data}, a sequence of strings, into increasing alphabetical order without regard to case. A @code{:key} function of @code{car} ! would be useful for sorting association lists. It should only be a ! simple accessor though, it's used heavily in the current ! implementation. The @code{sort*} function is destructive; it sorts lists by actually rearranging the @code{cdr} pointers in suitable fashion. [-- Attachment #3: Type: text/plain, Size: 142 bytes --] _______________________________________________ Emacs-devel mailing list Emacs-devel@gnu.org http://lists.gnu.org/mailman/listinfo/emacs-devel ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: cl.el sort* efficiency 2006-12-24 0:56 ` Kevin Ryde @ 2006-12-24 2:08 ` Miles Bader 2006-12-25 23:10 ` Stefan Monnier 2006-12-26 0:26 ` Kevin Ryde 2006-12-24 17:09 ` Richard Stallman 2006-12-25 1:09 ` Juanma Barranquero 2 siblings, 2 replies; 8+ messages in thread From: Miles Bader @ 2006-12-24 2:08 UTC (permalink / raw) Cc: emacs-devel Kevin Ryde <user42@zip.com.au> writes: > I see the common lisp spec specifically doesn't say how much or little > the key is used. Dunno what you get from implementations in practice. > Is there a CL expert in the house? I don't know how representative I am, but it's pretty obvious to me that adding something like :key to sort might call the given function multiple times. Certainly the alternative of allocating storage to keep around all retrieved key values is pretty heavyweight and in many cases not an optimization at all, so it's not something I would expect an implementation to do. -Miles -- `...the Soviet Union was sliding in to an economic collapse so comprehensive that in the end its factories produced not goods but bads: finished products less valuable than the raw materials they were made from.' [The Economist] ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: cl.el sort* efficiency 2006-12-24 2:08 ` Miles Bader @ 2006-12-25 23:10 ` Stefan Monnier 2006-12-26 0:26 ` Kevin Ryde 1 sibling, 0 replies; 8+ messages in thread From: Stefan Monnier @ 2006-12-25 23:10 UTC (permalink / raw) Cc: Kevin Ryde, emacs-devel > I don't know how representative I am, but it's pretty obvious to me that > adding something like :key to sort might call the given function > multiple times. Certainly the alternative of allocating storage to keep > around all retrieved key values is pretty heavyweight and in many cases > not an optimization at all, so it's not something I would expect an > implementation to do. Actually, I do believe that when I was a CL user I expected that the sort function would call the :key function only O(n) times rather than O(n log n) times. If I didn't want the allocation of an aux table I used something like car-less-then-car for the comparison function. Stefan ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: cl.el sort* efficiency 2006-12-24 2:08 ` Miles Bader 2006-12-25 23:10 ` Stefan Monnier @ 2006-12-26 0:26 ` Kevin Ryde 1 sibling, 0 replies; 8+ messages in thread From: Kevin Ryde @ 2006-12-26 0:26 UTC (permalink / raw) [-- Attachment #1: Type: text/plain, Size: 2712 bytes --] Miles Bader <miles@gnu.org> writes: > > Certainly the alternative of allocating storage to keep > around all retrieved key values is pretty heavyweight and in many cases > not an optimization at all, I see your point, I guess there's two equally good features (an accessor or a preprocessor), it's just a matter of which one is offered. Stefan Monnier <monnier@iro.umontreal.ca> writes: > > Actually, I do believe that when I was a CL user I expected that the sort > function would call the :key function only O(n) times rather than O(n log n) > times. At least I'm not the only one who thought in that direction :) "Juanma Barranquero" <lekktu@gmail.com> writes: > > Of course. I didn't meant that the idea was invented by Randal > Schwartz, or recently; but it's very popular under that name. No, confusion only on my part. Not that the wiki article is fantastically clear on concept versus the idiom (it's all there, but rather laboured). Without wanting to labour this sort* either :), I'd like to propose a further tweak to the documentation. In fact I think it's enough to leave the code alone and just make it a little clearer what one gets. 2006-12-26 Kevin Ryde <user42@zip.com.au> * cl.texi (Sorting Sequences): In sort*, don't say "preprocess" as that suggests O(N) :key use. Show structs as an example, since downcase is more expensive than intended for the :key func. Another tweak to the wording of :key only an accessor. For ease of reading: -- Function: sort* seq predicate &key :key This function sorts SEQ into increasing order as determined by using PREDICATE to compare pairs of elements. PREDICATE should return true (non-`nil') if and only if its first argument is less than (not equal to) its second argument. For example, `<' and `string-lessp' are suitable predicate functions for sorting numbers and strings, respectively; `>' would sort numbers into decreasing rather than increasing order. This function differs from Emacs' built-in `sort' in that it can operate on any type of sequence, not just lists. Also, it accepts a `:key' argument which is used to access part of an element for the PREDICATE function. For example, to sort structures by a particular field (*note Structures::) (defstruct person name age sex) (setq data (sort* data 'string-lessp :key 'person-name)) Or `car' would be useful for sorting association lists. The `:key' function is designed only as a data accessor though, it's used on every predicate call. The `sort*' function is destructive; it sorts lists by actually rearranging the `cdr' pointers in suitable fashion. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: cl.texi.sort-2.diff --] [-- Type: text/x-diff, Size: 1706 bytes --] *** cl.texi.~1.31.~ 2006-12-26 11:20:21.000000000 +1100 --- cl.texi 2006-12-26 11:24:08.000000000 +1100 *************** *** 4082,4100 **** This function differs from Emacs' built-in @code{sort} in that it can operate on any type of sequence, not just lists. Also, it ! accepts a @code{:key} argument which is used to preprocess data ! fed to the @var{predicate} function. For example, @example ! (setq data (sort* data 'string-lessp :key 'downcase)) @end example @noindent ! sorts @var{data}, a sequence of strings, into increasing alphabetical ! order without regard to case. A @code{:key} function of @code{car} ! would be useful for sorting association lists. It should only be a ! simple accessor though, it's used heavily in the current ! implementation. The @code{sort*} function is destructive; it sorts lists by actually rearranging the @code{cdr} pointers in suitable fashion. --- 4082,4101 ---- This function differs from Emacs' built-in @code{sort} in that it can operate on any type of sequence, not just lists. Also, it ! accepts a @code{:key} argument which is used to access part of an ! element for the @var{predicate} function. For example, to sort ! structures by a particular field (@pxref{Structures}) @example ! (defstruct person name age sex) ! ! (setq data (sort* data 'string-lessp :key 'person-name)) @end example @noindent ! Or @code{car} would be useful for sorting association lists. The ! @code{:key} function is designed only as a data accessor though, it's ! used on every predicate call. The @code{sort*} function is destructive; it sorts lists by actually rearranging the @code{cdr} pointers in suitable fashion. [-- Attachment #3: Type: text/plain, Size: 142 bytes --] _______________________________________________ Emacs-devel mailing list Emacs-devel@gnu.org http://lists.gnu.org/mailman/listinfo/emacs-devel ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: cl.el sort* efficiency 2006-12-24 0:56 ` Kevin Ryde 2006-12-24 2:08 ` Miles Bader @ 2006-12-24 17:09 ` Richard Stallman 2006-12-25 1:09 ` Juanma Barranquero 2 siblings, 0 replies; 8+ messages in thread From: Richard Stallman @ 2006-12-24 17:09 UTC (permalink / raw) Cc: emacs-devel 2006-12-24 Kevin Ryde <user42@zip.com.au> * cl.texi (Sorting Sequences): In sort*, add a little cautionary note about the key procedure being used heavily. Would someone please install that change? ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: cl.el sort* efficiency 2006-12-24 0:56 ` Kevin Ryde 2006-12-24 2:08 ` Miles Bader 2006-12-24 17:09 ` Richard Stallman @ 2006-12-25 1:09 ` Juanma Barranquero 2 siblings, 0 replies; 8+ messages in thread From: Juanma Barranquero @ 2006-12-25 1:09 UTC (permalink / raw) Cc: emacs-devel On 12/24/06, Kevin Ryde <user42@zip.com.au> wrote: > The concept would pre-date 1994, wouldn't it? (Not that this is the > place for questions of academic priority :-) Of course. I didn't meant that the idea was invented by Randal Schwartz, or recently; but it's very popular under that name. > but I would propose a sentence for the manual, just to describe what's > current. Committed. /L/e/k/t/u ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2006-12-26 0:26 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-12-22 19:19 cl.el sort* efficiency Kevin Ryde 2006-12-23 0:29 ` Juanma Barranquero 2006-12-24 0:56 ` Kevin Ryde 2006-12-24 2:08 ` Miles Bader 2006-12-25 23:10 ` Stefan Monnier 2006-12-26 0:26 ` Kevin Ryde 2006-12-24 17:09 ` Richard Stallman 2006-12-25 1:09 ` Juanma Barranquero
Code repositories for project(s) associated with this external index https://git.savannah.gnu.org/cgit/emacs.git https://git.savannah.gnu.org/cgit/emacs/org-mode.git This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.