unofficial mirror of guile-user@gnu.org 
 help / color / mirror / Atom feed
* Sort on vectors slow
@ 2004-10-14 12:28 Roland Orre
  2004-10-19 17:52 ` Marius Vollmer
  0 siblings, 1 reply; 5+ messages in thread
From: Roland Orre @ 2004-10-14 12:28 UTC (permalink / raw)


I noticed that the default sort routine on vectors is very
slow. I'm using the restricted-vector-sort! and compared
with a merge sort on vectors instead, the speed difference
is tremendous. In this example foo is a vector with 100000
elements (an often occuring size for me currently), here
with random numbers. Machine is a 2.2 GHz P4. Observe that
the restricted-vector-qsort I'm using takes inclusive endpos
according to the doc (I've mentioned this on guile-devel).

(begin
  (define vlen 1000000)
  (define foo (make-vector vlen 9999999))
  (array-map! foo random foo)
  (define bar (copy-vector bar))
)
(verbose 2)

(restricted-vector-qsort! foo < 0 (1- vlen))
;;; 249150  msec  (71790 msec in gc)

(array-copy! bar foo)

(restricted-vector-msort! foo < 0 (1- vlen))
;;; 1060  msec  (300 msec in gc)

I'm not sure if this huge difference makes sense, or if it indicates
a problem with quicksort, which I took from libc earlier (1998). I've
found in some text though that time complexity of quicksort is O(N^2)
in worst case, only O(N log N) in average, and  merge sort is O(N log N)
in all cases. An advantage is that merge sort is stable also, which
quicksort isn't. Anyone knowing anything about memory usage for the
two algorithms?

/Roland

PS. Code to the restricted-vector-msort! below. I suggest that this
should go into the distribution, alternatively we should replace the
current default quicksort for vectors with merge sort.


SCM_DEFINE (scm_restricted_vector_msort_x,
	    "restricted-vector-msort!", 4, 1, 0, 
            (SCM vec, SCM less, SCM startpos, SCM endpos, SCM tmpv),
	    "Sort the sequence @var{items}, which may be a list or a\n"
	    "vector. @var{less} is used for comparing the sequence elements.\n"
	    "The sorting is destructive, that means that the input sequence\n"
	    "is modified to produce the sorted result.\n"
	    "This is a stable sort.")
#define FUNC_NAME s_scm_restricted_vector_msort_x
{
  const scm_t_trampoline_2 cmp = compare_function (less, 2, FUNC_NAME);
  size_t  vlen, spos, len;

  if (SCM_VECTORP (vec))
    {
      SCM *temp;
      vlen = SCM_VECTOR_LENGTH (vec);

      SCM_VALIDATE_INUM_MIN_COPY (3, startpos, 0, spos);
      SCM_ASSERT_RANGE (3, startpos, spos <= vlen);
      SCM_VALIDATE_INUM_RANGE (4, endpos,0, vlen);
      len = SCM_INUM (endpos) - spos;

      /*
	 the following array does not contain any new references to
	 SCM objects, so we can get away with allocing it on the heap.
      */
      if (SCM_UNBNDP(tmpv))
	temp = scm_malloc (len * sizeof(SCM));
      else
	if (SCM_VECTORP(tmpv) && (SCM_VECTOR_LENGTH(tmpv) >= vlen))
	  {
	    temp = SCM_WRITABLE_VELTS(tmpv);
	  }
	else SCM_WRONG_TYPE_ARG (1, vec);

      scm_merge_vector_step (vec, temp, cmp, less, spos, len);
      if (SCM_UNBNDP(tmpv))
	free(temp);
      return SCM_UNSPECIFIED;
    }
  else
    SCM_WRONG_TYPE_ARG (1, vec);
}
#undef FUNC_NAME




_______________________________________________
Guile-user mailing list
Guile-user@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-user


^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: Sort on vectors slow
@ 2004-10-14 14:22 Roland Orre
  0 siblings, 0 replies; 5+ messages in thread
From: Roland Orre @ 2004-10-14 14:22 UTC (permalink / raw)


On Thu, 2004-10-14 at 14:28, Roland Orre wrote:
> PS. Code to the restricted-vector-msort! below. I suggest that this
> should go into the distribution, alternatively we should replace the
> current default quicksort for vectors with merge sort.

Typical, the code I sent contained a little bug, which I hadn't
noticed as I'm usually using a temporary vector (optional arg)
supplied. So, here is a new version of restricted-vector-msort!
if anyone is interested or maybe consider it should be put into
the distribution.
/Roland

SCM_DEFINE (scm_restricted_vector_msort_x,
	    "restricted-vector-msort!", 4, 1, 0, 
            (SCM vec, SCM less, SCM startpos, SCM endpos, SCM tmpv),
	    "Sort the sequence @var{items}, which may be a list or a\n"
	    "vector. @var{less} is used for comparing the sequence elements.\n"
	    "The sorting is destructive, that means that the input sequence\n"
	    "is modified to produce the sorted result.\n"
	    "This is a stable sort.")
#define FUNC_NAME s_scm_restricted_vector_msort_x
{
  const scm_t_trampoline_2 cmp = compare_function (less, 2, FUNC_NAME);
  size_t  vlen, spos, high;

  if (SCM_VECTORP (vec))
    {
      SCM *temp;
      vlen = SCM_VECTOR_LENGTH (vec);

      SCM_VALIDATE_INUM_MIN_COPY (3, startpos, 0, spos);
      SCM_ASSERT_RANGE (3, startpos, spos <= vlen);
      SCM_VALIDATE_INUM_RANGE (4, endpos,0, vlen);
      high = SCM_INUM (endpos) - spos;

      /*
	 the following array does not contain any new references to
	 SCM objects, so we can get away with allocing it on the heap.
      */
      if (SCM_UNBNDP(tmpv)) {
	temp = scm_malloc (vlen * sizeof(SCM));
      }
      else
	if (SCM_VECTORP(tmpv) && (SCM_VECTOR_LENGTH(tmpv) >= vlen))
	  {
	    temp = SCM_WRITABLE_VELTS(tmpv);
	  }
	else SCM_WRONG_TYPE_ARG (5, tmpv);

      scm_merge_vector_step (vec, temp, cmp, less, spos, high);

      if (SCM_UNBNDP(tmpv))
	free(temp);

      return SCM_UNSPECIFIED;
    }
  else
    SCM_WRONG_TYPE_ARG (1, vec);
}
#undef FUNC_NAME




_______________________________________________
Guile-user mailing list
Guile-user@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-user


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2004-10-19 22:28 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-10-14 12:28 Sort on vectors slow Roland Orre
2004-10-19 17:52 ` Marius Vollmer
2004-10-19 21:47   ` Marius Vollmer
2004-10-19 22:28     ` Marius Vollmer
  -- strict thread matches above, loose matches on Subject: below --
2004-10-14 14:22 Roland Orre

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).