unofficial mirror of help-gnu-emacs@gnu.org
 help / color / mirror / Atom feed
* case-insensitive string<
@ 2009-04-17 13:40 Nikolaj Schumacher
  0 siblings, 0 replies; 8+ messages in thread
From: Nikolaj Schumacher @ 2009-04-17 13:40 UTC (permalink / raw)
  To: help-gnu-emacs

What's the fastest way to sort a large list of strings in a
case-insensitive order?

Calling downcase on every string is rather expensive.  Does anyone know
a better way?

regards,
Nikolaj Schumacher




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: case-insensitive string<
       [not found] <mailman.5527.1239975614.31690.help-gnu-emacs@gnu.org>
@ 2009-04-17 13:58 ` harven
  2009-04-17 15:07   ` thierry.volpiatto
  2009-04-17 17:01 ` Ted Zlatanov
  1 sibling, 1 reply; 8+ messages in thread
From: harven @ 2009-04-17 13:58 UTC (permalink / raw)
  To: help-gnu-emacs

Nikolaj Schumacher <me@nschum.de> writes:

> What's the fastest way to sort a large list of strings in a
> case-insensitive order?
>
> Calling downcase on every string is rather expensive.  Does anyone know
> a better way?

The sort built-in function uses string< as predicate.
You could use compare-strings instead. It has an option IGNORE-CASE.
Hope that helps.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: case-insensitive string<
  2009-04-17 13:58 ` case-insensitive string< harven
@ 2009-04-17 15:07   ` thierry.volpiatto
  2009-04-17 21:04     ` Nikolaj Schumacher
  0 siblings, 1 reply; 8+ messages in thread
From: thierry.volpiatto @ 2009-04-17 15:07 UTC (permalink / raw)
  To: help-gnu-emacs

harven <harven@free.fr> writes:

> Nikolaj Schumacher <me@nschum.de> writes:
>
>> What's the fastest way to sort a large list of strings in a
>> case-insensitive order?
>>
>> Calling downcase on every string is rather expensive.  Does anyone know
>> a better way?
>
> The sort built-in function uses string< as predicate.
> You could use compare-strings instead. It has an option IGNORE-CASE.
> Hope that helps.
>
You can use th cl sort ==> sort*

(sort* (list "abc" "abd" "abCde" "cade" "ABC") 'string< :key 'downcase)
("abc" "ABC" "abCde" "abd" "cade")

But not sure it is faster than downcasing all strings before sorting.
Need testing.

-- 
A + Thierry Volpiatto
Location: Saint-Cyr-Sur-Mer - France





^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: case-insensitive string<
       [not found] <mailman.5527.1239975614.31690.help-gnu-emacs@gnu.org>
  2009-04-17 13:58 ` case-insensitive string< harven
@ 2009-04-17 17:01 ` Ted Zlatanov
  2009-04-17 21:13   ` Nikolaj Schumacher
       [not found]   ` <mailman.5553.1240002838.31690.help-gnu-emacs@gnu.org>
  1 sibling, 2 replies; 8+ messages in thread
From: Ted Zlatanov @ 2009-04-17 17:01 UTC (permalink / raw)
  To: help-gnu-emacs

On Fri, 17 Apr 2009 15:40:03 +0200 Nikolaj Schumacher <me@nschum.de> wrote: 

NS> What's the fastest way to sort a large list of strings in a
NS> case-insensitive order?

NS> Calling downcase on every string is rather expensive.  Does anyone know
NS> a better way?

Perhaps you can use `sort' with a predicate that uses `compare-strings'?
The latter function has a ignore-case parameter, and is implemented in C
so should be quite fast.

btw, BBDB has a string> macro, and subr.el has a string< alias to
string-lessp, which gets a bit confusing :).

Ted


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: case-insensitive string<
  2009-04-17 15:07   ` thierry.volpiatto
@ 2009-04-17 21:04     ` Nikolaj Schumacher
  2009-04-17 21:17       ` thierry.volpiatto
  0 siblings, 1 reply; 8+ messages in thread
From: Nikolaj Schumacher @ 2009-04-17 21:04 UTC (permalink / raw)
  To: thierry.volpiatto; +Cc: help-gnu-emacs

thierry.volpiatto@gmail.com wrote:

> (sort* (list "abc" "abd" "abCde" "cade" "ABC") 'string< :key 'downcase)
> ("abc" "ABC" "abCde" "abd" "cade")
>
> But not sure it is faster than downcasing all strings before sorting.
> Need testing.

It's significantly slower.

It's the same as using:
(lambda (a b) (string< (downcase a) (downcase b)))

so that's O(n*log(n)) calls to downcase, instead of n.


regards,
Nikolaj Schumacher




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: case-insensitive string<
  2009-04-17 17:01 ` Ted Zlatanov
@ 2009-04-17 21:13   ` Nikolaj Schumacher
       [not found]   ` <mailman.5553.1240002838.31690.help-gnu-emacs@gnu.org>
  1 sibling, 0 replies; 8+ messages in thread
From: Nikolaj Schumacher @ 2009-04-17 21:13 UTC (permalink / raw)
  To: Ted Zlatanov; +Cc: help-gnu-emacs

Ted Zlatanov <tzz@lifelogs.com> wrote:

> Perhaps you can use `sort' with a predicate that uses `compare-strings'?
> The latter function has a ignore-case parameter, and is implemented in C
> so should be quite fast.

Thanks.  `company-strings' is indeed fast.  Unfortunately the return
value is odd.  It returns a positiv/negative number if different, but t
(instead of 0) when they are equal.  That means some work to turn it
into a compatible predicate...

(byte-compile
 (lambda (a b)
   (let ((v (compare-strings a nil nil b nil nil t)))
     (or (eq v t) (< v 0)))))

All this fuss eventually makes it slower... :(

regards,
Nikolaj Schumacher




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: case-insensitive string<
  2009-04-17 21:04     ` Nikolaj Schumacher
@ 2009-04-17 21:17       ` thierry.volpiatto
  0 siblings, 0 replies; 8+ messages in thread
From: thierry.volpiatto @ 2009-04-17 21:17 UTC (permalink / raw)
  To: help-gnu-emacs

Nikolaj Schumacher <me@nschum.de> writes:

> thierry.volpiatto@gmail.com wrote:
>
>> (sort* (list "abc" "abd" "abCde" "cade" "ABC") 'string< :key 'downcase)
>> ("abc" "ABC" "abCde" "abd" "cade")
>>
>> But not sure it is faster than downcasing all strings before sorting.
>> Need testing.
>
> It's significantly slower.
>
> It's the same as using:
> (lambda (a b) (string< (downcase a) (downcase b)))
>
> so that's O(n*log(n)) calls to downcase, instead of n.
>
Ok, that's good to know.
Thank you.
-- 
A + Thierry Volpiatto
Location: Saint-Cyr-Sur-Mer - France




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: case-insensitive string<
       [not found]   ` <mailman.5553.1240002838.31690.help-gnu-emacs@gnu.org>
@ 2009-04-20 18:37     ` Ted Zlatanov
  0 siblings, 0 replies; 8+ messages in thread
From: Ted Zlatanov @ 2009-04-20 18:37 UTC (permalink / raw)
  To: help-gnu-emacs

On Fri, 17 Apr 2009 23:13:54 +0200 Nikolaj Schumacher <me@nschum.de> wrote: 

NS> Ted Zlatanov <tzz@lifelogs.com> wrote:
>> Perhaps you can use `sort' with a predicate that uses `compare-strings'?
>> The latter function has a ignore-case parameter, and is implemented in C
>> so should be quite fast.

NS> Thanks.  `company-strings' is indeed fast.  Unfortunately the return

s/company/compare/

NS> value is odd.  It returns a positiv/negative number if different, but t
NS> (instead of 0) when they are equal.  That means some work to turn it
NS> into a compatible predicate...

NS> (byte-compile
NS>  (lambda (a b)
NS>    (let ((v (compare-strings a nil nil b nil nil t)))
NS>      (or (eq v t) (< v 0)))))

NS> All this fuss eventually makes it slower... :(

Modifying the C source code is pretty easy, but I would actually submit
this as a feature request on emacs-devel if you're interested.  The
right way to solve it may be to add a case-fold-compare global variable
similar to case-fold-search, with the same automatic buffer-local
behavior.

Ted


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2009-04-20 18:37 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <mailman.5527.1239975614.31690.help-gnu-emacs@gnu.org>
2009-04-17 13:58 ` case-insensitive string< harven
2009-04-17 15:07   ` thierry.volpiatto
2009-04-17 21:04     ` Nikolaj Schumacher
2009-04-17 21:17       ` thierry.volpiatto
2009-04-17 17:01 ` Ted Zlatanov
2009-04-17 21:13   ` Nikolaj Schumacher
     [not found]   ` <mailman.5553.1240002838.31690.help-gnu-emacs@gnu.org>
2009-04-20 18:37     ` Ted Zlatanov
2009-04-17 13:40 Nikolaj Schumacher

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).