unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* cl.el sort* efficiency
@ 2006-12-22 19:19 Kevin Ryde
  2006-12-23  0:29 ` Juanma Barranquero
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin Ryde @ 2006-12-22 19:19 UTC (permalink / raw)


I noticed that cl sort* function calls the given :key function within
every comparison, which is not good if key extraction is a big slow
calculation.  I struck such a case myself recently, and in fact the
example in the manual showing

	(sort* lst 'string-lessp :key 'downcase)

is almost another.  If possible I don't think you'd want to throw off
two new downcased strings in every compare.

The code below is an idea to build key results in an alist and sort
that.  It's not tested properly yet, and really isn't a thing of
beauty.  There's some trouble taken to rearrange the original cons
cells because that's discussed at length in the manual for plain
"sort", so maybe someone's relying on that.  Ideas for something
better or worse would be welcome.


(defun sort* (cl-seq cl-pred &rest cl-keys)
  "Sort the argument SEQ according to PREDICATE.
This is a destructive function; it reuses the storage of SEQ if possible.
\nKeywords supported:  :key
\n(fn SEQ PREDICATE [KEYWORD VALUE]...)"
  (if (nlistp cl-seq)
      (replace cl-seq (apply 'sort* (append cl-seq nil) cl-pred cl-keys))
    (cl-parsing-keywords (:key) ()
      (if (memq cl-key '(nil identity))
	  (sort cl-seq cl-pred)

        ;; turn cl-seq into an alist of (KEY-RESULT . ORIGINAL-CONS-CELL)
        (let (cl-x)
          (while cl-seq
            (setq cl-x (acons (funcall cl-key (car cl-seq)) cl-seq cl-x))
            (setq cl-seq (cdr cl-seq)))
          (setq cl-seq (nreverse cl-x)))
        ;; sort that alist by KEY-RESULT
        (setq cl-seq (sort cl-seq (if (eq cl-pred '<)
                                      'car-less-than-car
                                    (lambda (cl-x cl-y)
                                      (funcall cl-pred
                                               (car cl-x) (car cl-y))))))
        ;; mung cdrs of original cons cells to put them in new order
        (let (cell prev)
          (while cl-seq
            (setq cell (cdar cl-seq))
            (setcdr cell prev)
            (setq prev cell)
            (setq cl-seq (cdr cl-seq)))
          (nreverse prev))))))

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: cl.el sort* efficiency
  2006-12-22 19:19 cl.el sort* efficiency Kevin Ryde
@ 2006-12-23  0:29 ` Juanma Barranquero
  2006-12-24  0:56   ` Kevin Ryde
  0 siblings, 1 reply; 8+ messages in thread
From: Juanma Barranquero @ 2006-12-23  0:29 UTC (permalink / raw)
  Cc: emacs-devel

On 12/22/06, Kevin Ryde <user42@zip.com.au> wrote:

> I noticed that cl sort* function calls the given :key function within
> every comparison, which is not good if key extraction is a big slow
> calculation.

[...]

> The code below is an idea to build key results in an alist and sort
> that.

You're implementing a Schwartzian Transform
(http://en.wikipedia.org/wiki/Schwartzian_transform) inside sort*,
which is kinda odd.

IMHO is the responsibility of the user of sort* to know whether
comparisons are going to be expensive and preprocess the data to help
in the comparison. You're doing an optimization at the wrong level:
perhaps I'm using it with so light a comparison function that your
method makes it more expensive...

Perhaps you could encapsulate your ST in a function and use it as a
wrapper for sort*.

                    /L/e/k/t/u

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: cl.el sort* efficiency
  2006-12-23  0:29 ` Juanma Barranquero
@ 2006-12-24  0:56   ` Kevin Ryde
  2006-12-24  2:08     ` Miles Bader
                       ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Kevin Ryde @ 2006-12-24  0:56 UTC (permalink / raw)


[-- Attachment #1: Type: text/plain, Size: 1222 bytes --]

"Juanma Barranquero" <lekktu@gmail.com> writes:
>
> You're implementing a Schwartzian Transform
> (http://en.wikipedia.org/wiki/Schwartzian_transform) inside sort*,
> which is kinda odd.

The concept would pre-date 1994, wouldn't it?  (Not that this is the
place for questions of academic priority :-)

> IMHO is the responsibility of the user of sort* to know whether
> comparisons are going to be expensive and preprocess the data to help
> in the comparison.

A key func as data accessor seemed to me a reasonably natural way to
express that, but maybe that's just me.

> You're doing an optimization at the wrong level:
> perhaps I'm using it with so light a comparison function that your
> method makes it more expensive...

I see the common lisp spec specifically doesn't say how much or little
the key is used.  Dunno what you get from implementations in practice.
Is there a CL expert in the house?


In any case Richard advises not doing anything to emacs at this time,
but I would propose a sentence for the manual, just to describe what's
current.

2006-12-24  Kevin Ryde  <user42@zip.com.au>

	* cl.texi (Sorting Sequences): In sort*, add a little cautionary note
	about the key procedure being used heavily.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: cl.texi.sort-key.diff --]
[-- Type: text/x-diff, Size: 1139 bytes --]

Index: cl.texi
===================================================================
RCS file: /sources/emacs/emacs/man/cl.texi,v
retrieving revision 1.30
diff -u -c -r1.30 cl.texi
cvs diff: conflicting specifications of output style
*** cl.texi	22 Dec 2006 23:27:28 -0000	1.30
--- cl.texi	24 Dec 2006 00:43:47 -0000
***************
*** 4092,4098 ****
  @noindent
  sorts @var{data}, a sequence of strings, into increasing alphabetical
  order without regard to case.  A @code{:key} function of @code{car}
! would be useful for sorting association lists.
  
  The @code{sort*} function is destructive; it sorts lists by actually
  rearranging the @code{cdr} pointers in suitable fashion.
--- 4092,4100 ----
  @noindent
  sorts @var{data}, a sequence of strings, into increasing alphabetical
  order without regard to case.  A @code{:key} function of @code{car}
! would be useful for sorting association lists.  It should only be a
! simple accessor though, it's used heavily in the current
! implementation.
  
  The @code{sort*} function is destructive; it sorts lists by actually
  rearranging the @code{cdr} pointers in suitable fashion.

[-- Attachment #3: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: cl.el sort* efficiency
  2006-12-24  0:56   ` Kevin Ryde
@ 2006-12-24  2:08     ` Miles Bader
  2006-12-25 23:10       ` Stefan Monnier
  2006-12-26  0:26       ` Kevin Ryde
  2006-12-24 17:09     ` Richard Stallman
  2006-12-25  1:09     ` Juanma Barranquero
  2 siblings, 2 replies; 8+ messages in thread
From: Miles Bader @ 2006-12-24  2:08 UTC (permalink / raw)
  Cc: emacs-devel

Kevin Ryde <user42@zip.com.au> writes:
> I see the common lisp spec specifically doesn't say how much or little
> the key is used.  Dunno what you get from implementations in practice.
> Is there a CL expert in the house?

I don't know how representative I am, but it's pretty obvious to me that
adding something like :key to sort might call the given function
multiple times.  Certainly the alternative of allocating storage to keep
around all retrieved key values is pretty heavyweight and in many cases
not an optimization at all, so it's not something I would expect an
implementation to do.

-Miles
-- 
`...the Soviet Union was sliding in to an economic collapse so comprehensive
 that in the end its factories produced not goods but bads: finished products
 less valuable than the raw materials they were made from.'  [The Economist]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: cl.el sort* efficiency
  2006-12-24  0:56   ` Kevin Ryde
  2006-12-24  2:08     ` Miles Bader
@ 2006-12-24 17:09     ` Richard Stallman
  2006-12-25  1:09     ` Juanma Barranquero
  2 siblings, 0 replies; 8+ messages in thread
From: Richard Stallman @ 2006-12-24 17:09 UTC (permalink / raw)
  Cc: emacs-devel

    2006-12-24  Kevin Ryde  <user42@zip.com.au>

	    * cl.texi (Sorting Sequences): In sort*, add a little cautionary note
	    about the key procedure being used heavily.

Would someone please install that change?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: cl.el sort* efficiency
  2006-12-24  0:56   ` Kevin Ryde
  2006-12-24  2:08     ` Miles Bader
  2006-12-24 17:09     ` Richard Stallman
@ 2006-12-25  1:09     ` Juanma Barranquero
  2 siblings, 0 replies; 8+ messages in thread
From: Juanma Barranquero @ 2006-12-25  1:09 UTC (permalink / raw)
  Cc: emacs-devel

On 12/24/06, Kevin Ryde <user42@zip.com.au> wrote:

> The concept would pre-date 1994, wouldn't it?  (Not that this is the
> place for questions of academic priority :-)

Of course. I didn't meant that the idea was invented by Randal
Schwartz, or recently; but it's very popular under that name.

> but I would propose a sentence for the manual, just to describe what's
> current.

Committed.

                    /L/e/k/t/u

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: cl.el sort* efficiency
  2006-12-24  2:08     ` Miles Bader
@ 2006-12-25 23:10       ` Stefan Monnier
  2006-12-26  0:26       ` Kevin Ryde
  1 sibling, 0 replies; 8+ messages in thread
From: Stefan Monnier @ 2006-12-25 23:10 UTC (permalink / raw)
  Cc: Kevin Ryde, emacs-devel

> I don't know how representative I am, but it's pretty obvious to me that
> adding something like :key to sort might call the given function
> multiple times.  Certainly the alternative of allocating storage to keep
> around all retrieved key values is pretty heavyweight and in many cases
> not an optimization at all, so it's not something I would expect an
> implementation to do.

Actually, I do believe that when I was a CL user I expected that the sort
function would call the :key function only O(n) times rather than O(n log n)
times.  If I didn't want the allocation of an aux table I used something
like car-less-then-car for the comparison function.


        Stefan

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: cl.el sort* efficiency
  2006-12-24  2:08     ` Miles Bader
  2006-12-25 23:10       ` Stefan Monnier
@ 2006-12-26  0:26       ` Kevin Ryde
  1 sibling, 0 replies; 8+ messages in thread
From: Kevin Ryde @ 2006-12-26  0:26 UTC (permalink / raw)


[-- Attachment #1: Type: text/plain, Size: 2712 bytes --]

Miles Bader <miles@gnu.org> writes:
>
> Certainly the alternative of allocating storage to keep
> around all retrieved key values is pretty heavyweight and in many cases
> not an optimization at all,

I see your point, I guess there's two equally good features (an
accessor or a preprocessor), it's just a matter of which one is
offered.

Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
> Actually, I do believe that when I was a CL user I expected that the sort
> function would call the :key function only O(n) times rather than O(n log n)
> times.

At least I'm not the only one who thought in that direction :)

"Juanma Barranquero" <lekktu@gmail.com> writes:
>
> Of course. I didn't meant that the idea was invented by Randal
> Schwartz, or recently; but it's very popular under that name.

No, confusion only on my part.  Not that the wiki article is
fantastically clear on concept versus the idiom (it's all there, but
rather laboured).


Without wanting to labour this sort* either :), I'd like to propose a
further tweak to the documentation.  In fact I think it's enough to
leave the code alone and just make it a little clearer what one gets.

2006-12-26  Kevin Ryde  <user42@zip.com.au>

	* cl.texi (Sorting Sequences): In sort*, don't say "preprocess" as
	that suggests O(N) :key use.  Show structs as an example, since
	downcase is more expensive than intended for the :key func.  Another
	tweak to the wording of :key only an accessor.

For ease of reading:

 -- Function: sort* seq predicate &key :key
     This function sorts SEQ into increasing order as determined by
     using PREDICATE to compare pairs of elements.  PREDICATE should
     return true (non-`nil') if and only if its first argument is less
     than (not equal to) its second argument.  For example, `<' and
     `string-lessp' are suitable predicate functions for sorting
     numbers and strings, respectively; `>' would sort numbers into
     decreasing rather than increasing order.

     This function differs from Emacs' built-in `sort' in that it can
     operate on any type of sequence, not just lists.  Also, it accepts
     a `:key' argument which is used to access part of an element for
     the PREDICATE function.  For example, to sort structures by a
     particular field (*note Structures::)

          (defstruct person name age sex)

          (setq data (sort* data 'string-lessp :key 'person-name))

     Or `car' would be useful for sorting association lists.  The
     `:key' function is designed only as a data accessor though, it's
     used on every predicate call.

     The `sort*' function is destructive; it sorts lists by actually
     rearranging the `cdr' pointers in suitable fashion.



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: cl.texi.sort-2.diff --]
[-- Type: text/x-diff, Size: 1706 bytes --]

*** cl.texi.~1.31.~	2006-12-26 11:20:21.000000000 +1100
--- cl.texi	2006-12-26 11:24:08.000000000 +1100
***************
*** 4082,4100 ****
  
  This function differs from Emacs' built-in @code{sort} in that it
  can operate on any type of sequence, not just lists.  Also, it
! accepts a @code{:key} argument which is used to preprocess data
! fed to the @var{predicate} function.  For example,
  
  @example
! (setq data (sort* data 'string-lessp :key 'downcase))
  @end example
  
  @noindent
! sorts @var{data}, a sequence of strings, into increasing alphabetical
! order without regard to case.  A @code{:key} function of @code{car}
! would be useful for sorting association lists.  It should only be a
! simple accessor though, it's used heavily in the current
! implementation.
  
  The @code{sort*} function is destructive; it sorts lists by actually
  rearranging the @code{cdr} pointers in suitable fashion.
--- 4082,4101 ----
  
  This function differs from Emacs' built-in @code{sort} in that it
  can operate on any type of sequence, not just lists.  Also, it
! accepts a @code{:key} argument which is used to access part of an
! element for the @var{predicate} function.  For example, to sort
! structures by a particular field (@pxref{Structures})
  
  @example
! (defstruct person name age sex)
! 
! (setq data (sort* data 'string-lessp :key 'person-name))
  @end example
  
  @noindent
! Or @code{car} would be useful for sorting association lists.  The
! @code{:key} function is designed only as a data accessor though, it's
! used on every predicate call.
  
  The @code{sort*} function is destructive; it sorts lists by actually
  rearranging the @code{cdr} pointers in suitable fashion.

[-- Attachment #3: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2006-12-26  0:26 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-12-22 19:19 cl.el sort* efficiency Kevin Ryde
2006-12-23  0:29 ` Juanma Barranquero
2006-12-24  0:56   ` Kevin Ryde
2006-12-24  2:08     ` Miles Bader
2006-12-25 23:10       ` Stefan Monnier
2006-12-26  0:26       ` Kevin Ryde
2006-12-24 17:09     ` Richard Stallman
2006-12-25  1:09     ` Juanma Barranquero

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).