unofficial mirror of emacs-devel@gnu.org 
* What to do for faster `remove-duplicates'?
@ 2015-05-06  7:33 Oleh Krehel
  2015-05-06 12:59 ` Stefan Monnier
  0 siblings, 1 reply; 20+ messages in thread
From: Oleh Krehel @ 2015-05-06  7:33 UTC (permalink / raw)
  To: emacs-devel


Hi all,

Should `cl-remove-duplicates' switch to using a hash table if the
candidates list is large?

I see here a speedup factor of >500 for removing the duplicates of
`load-library' candidates. Helm has a function for this, which looks
very simple (much simpler than the current `cl-remove-duplicates'):

    (cl-defun helm-fast-remove-dups (seq &key (test 'eq))
      "Remove duplicates elements in list SEQ.
    This is same as `remove-duplicates' but with memoisation.
    It is much faster, especially in large lists.
    A test function can be provided with TEST argument key.
    Default is `eq'."
      (cl-loop with cont = (make-hash-table :test test)
               for elm in seq
               unless (gethash elm cont)
               do (puthash elm elm cont)
               finally return
               (cl-loop for i being the hash-values in cont collect i)))

And here are the benchmark tests:

    (setq cands (locate-file-completion-table
                 load-path (get-load-suffixes) "" nil t))
    (length cands)
    5357
    (length (cl-remove-duplicates cands :test 'equal))
    2481
    (benchmark-run (cl-remove-duplicates cands :test 'equal))
    (0.67873101 0 0.0)
    (benchmark-run (helm-fast-remove-dups cands :test 'equal))
    (0.001350054 0 0.0)
    (benchmark-run (seq-uniq cands 'equal))
    (5.270219822 27 2.396615401000002)
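
(For reference, `benchmark-run' returns a three-element list: total
elapsed seconds, the number of garbage collections that ran, and
seconds spent in GC. A minimal reading of the figures above:)

    ;; `benchmark-run' returns (ELAPSED-SECONDS GC-RUNS GC-ELAPSED-SECONDS),
    ;; so (5.270219822 27 2.396615401000002) above means ~5.27 s in total,
    ;; of which ~2.4 s were spent in 27 garbage collections, while the
    ;; hash-table version finished in ~1.4 ms without triggering GC.
    (benchmark-run 10
      (helm-fast-remove-dups cands :test 'equal))
    ;; With an optional repetition count, the form runs 10 times and the
    ;; totals are summed, which smooths out timer noise.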

This is also a request for seq.el to revise its optimization strategy.

So should `cl-remove-duplicates' be a 2-in-1 (lists for small input,
hash table for large input), or should there be
`cl-remove-duplicates-hash', or an optional argument to
`cl-remove-duplicates' to use a hash?

Oleh



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: What to do for faster `remove-duplicates'?
  2015-05-06  7:33 What to do for faster `remove-duplicates'? Oleh Krehel
@ 2015-05-06 12:59 ` Stefan Monnier
  2015-05-06 13:30   ` Oleh Krehel
  0 siblings, 1 reply; 20+ messages in thread
From: Stefan Monnier @ 2015-05-06 12:59 UTC (permalink / raw)
  To: Oleh Krehel; +Cc: emacs-devel

> So should `cl-remove-duplicates' be a 2-in-1 (lists for small input,

Don't know/don't care about cl-remove-duplicates, but for delete-dups, I'd
welcome a patch which switches to a hash-table after the Nth element.


        Stefan


PS: BTW, helm-fast-remove-dups's docstring says:

    This is same as `remove-duplicates' but with memoisation.

  but it doesn't actually use memoization, AFAICT.




* Re: What to do for faster `remove-duplicates'?
  2015-05-06 12:59 ` Stefan Monnier
@ 2015-05-06 13:30   ` Oleh Krehel
  2015-05-06 13:50     ` Stefan Monnier
  2015-05-06 14:04     ` Artur Malabarba
  0 siblings, 2 replies; 20+ messages in thread
From: Oleh Krehel @ 2015-05-06 13:30 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 639 bytes --]

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> So should `cl-remove-duplicates' be a 2-in-1 (lists for small input,
>
> Don't know/don't care about cl-remove-duplicates, but for delete-dups, I'd
> welcome a patch which switches to a hash-table after the Nth element.

I attach the patch. I did a bunch of `benchmark-run' and it seems that
100 elements is the breaking point.

> PS: BTW, helm-fast-remove-dups's docstring says:
>
>     This is same as `remove-duplicates' but with memoisation.
>
>   but it doesn't actually use memoization, AFAICT.

It does in a way: the hash table memoizes what would otherwise be a
repeated "is-element-in-collection" check.

Oleh


[-- Attachment #2: 0001-lisp-subr.el-delete-dups-Use-a-hash-table.patch --]
[-- Type: text/x-diff, Size: 1512 bytes --]

From c845e6384283c17b61c3a0401213e4a1837807c5 Mon Sep 17 00:00:00 2001
From: Oleh Krehel <ohwoeowho@gmail.com>
Date: Wed, 6 May 2015 15:21:23 +0200
Subject: [PATCH] lisp/subr.el (delete-dups): Use a hash table

* lisp/subr.el (delete-dups): When there are more than 100 candidates,
  use a hash table. This can result in ~500 times speed-up for typical
  collections of size 5000, like that of `load-library'.
---
 lisp/subr.el | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/lisp/subr.el b/lisp/subr.el
index 0fec29c..591980d 100644
--- a/lisp/subr.el
+++ b/lisp/subr.el
@@ -417,11 +417,19 @@ If N is omitted or nil, remove the last element."
 Store the result in LIST and return it.  LIST must be a proper list.
 Of several `equal' occurrences of an element in LIST, the first
 one is kept."
-  (let ((tail list))
-    (while tail
-      (setcdr tail (delete (car tail) (cdr tail)))
-      (setq tail (cdr tail))))
-  list)
+  (if (> (length list) 100)
+      (let ((hash (make-hash-table :test #'equal))
+            res)
+        (dolist (elt list)
+          (unless (gethash elt hash)
+            (puthash elt elt hash)
+            (push elt res)))
+        (nreverse res))
+    (let ((tail list))
+      (while tail
+        (setcdr tail (delete (car tail) (cdr tail)))
+        (setq tail (cdr tail))))
+    list))
 
 ;; See http://lists.gnu.org/archive/html/emacs-devel/2013-05/msg00204.html
 (defun delete-consecutive-dups (list &optional circular)
-- 
1.8.4



* Re: What to do for faster `remove-duplicates'?
  2015-05-06 13:30   ` Oleh Krehel
@ 2015-05-06 13:50     ` Stefan Monnier
  2015-05-06 13:51       ` Oleh Krehel
  2015-05-06 17:43       ` Thierry Volpiatto
  2015-05-06 14:04     ` Artur Malabarba
  1 sibling, 2 replies; 20+ messages in thread
From: Stefan Monnier @ 2015-05-06 13:50 UTC (permalink / raw)
  To: Oleh Krehel; +Cc: emacs-devel

> I attach the patch. I did a bunch of `benchmark-run' and it seems that
> 100 elements is the breaking point.

Looks good, please install.  It's hard to believe that you need a whole 100
elements before it pays off ;-)


        Stefan




* Re: What to do for faster `remove-duplicates'?
  2015-05-06 13:50     ` Stefan Monnier
@ 2015-05-06 13:51       ` Oleh Krehel
  2015-05-06 17:43       ` Thierry Volpiatto
  1 sibling, 0 replies; 20+ messages in thread
From: Oleh Krehel @ 2015-05-06 13:51 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> I attach the patch. I did a bunch of `benchmark-run' and it seems that
>> 100 elements is the breaking point.
>
> Looks good, please install.  It's hard to believe that you need a whole 100
> elements before it pays off ;-)

For a different collection whose elements are all unique, the
break-even point is at 30. This is because list-style deletion works
faster when there are a lot of duplicates.
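
(One way to probe how the duplicate ratio affects the timings, as a
sketch; the two test lists here are illustrative extremes:)

    ;; All-unique vs. all-duplicate inputs stress the two strategies
    ;; differently; `copy-sequence' keeps each run from mutating the input.
    (let ((unique (number-sequence 1 99))
          (dups (make-list 99 'x)))
      (list (benchmark-run 1000 (delete-dups (copy-sequence unique)))
            (benchmark-run 1000 (delete-dups (copy-sequence dups)))))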

Oleh





* Re: What to do for faster `remove-duplicates'?
  2015-05-06 13:30   ` Oleh Krehel
  2015-05-06 13:50     ` Stefan Monnier
@ 2015-05-06 14:04     ` Artur Malabarba
  2015-05-06 14:13       ` Oleh Krehel
  1 sibling, 1 reply; 20+ messages in thread
From: Artur Malabarba @ 2015-05-06 14:04 UTC (permalink / raw)
  To: Oleh Krehel; +Cc: Stefan Monnier, emacs-devel

> I attach the patch. I did a bunch of `benchmark-run' and it seems that
> 100 elements is the breaking point.

Small question. How much slower is this patch compared to the current
version on a list of 99 elements? (Due to having to calculate the
length)




* Re: What to do for faster `remove-duplicates'?
  2015-05-06 14:04     ` Artur Malabarba
@ 2015-05-06 14:13       ` Oleh Krehel
  2015-05-06 14:49         ` Artur Malabarba
  0 siblings, 1 reply; 20+ messages in thread
From: Oleh Krehel @ 2015-05-06 14:13 UTC (permalink / raw)
  To: Artur Malabarba; +Cc: Stefan Monnier, emacs-devel

Artur Malabarba <bruce.connor.am@gmail.com> writes:

>> I attach the patch. I did a bunch of `benchmark-run' and it seems that
>> 100 elements is the breaking point.
>
> Small question. How much slower is this patch compared to the current
> version on a list of 99 elements? (Due to having to calculate the
> length)

For 99 unique candidates, the call to `length' takes about 5% of the
time of the call to `delete-dups':

(/ 3.774e-06 6.3028e-05)
0.05987814939392017

It becomes worse for a small number of unique candidates, rising to
30% for 10 candidates, but much of that is the fixed overhead of a
function call.

I don't know if it's worth optimizing further.

Oleh




* Re: What to do for faster `remove-duplicates'?
  2015-05-06 14:13       ` Oleh Krehel
@ 2015-05-06 14:49         ` Artur Malabarba
  0 siblings, 0 replies; 20+ messages in thread
From: Artur Malabarba @ 2015-05-06 14:49 UTC (permalink / raw)
  To: Oleh Krehel; +Cc: Stefan Monnier, emacs-devel

>>> I attach the patch. I did a bunch of `benchmark-run' and it seems that
>>> 100 elements is the breaking point.
>>
>> Small question. How much slower is this patch compared to the current
>> version on a list of 99 elements? (Due to having to calculate the
>> length)
>
> For 99 unique candidates, the call to `length' takes 5% time compared to
> the call to `delete-dups':
>
> (/ 3.774e-06 6.3028e-05)
> 0.05987814939392017
>
> It becomes worse for a small amount of unique candidates, going to 30%
> for 10 candidates, but a lot of that is the standard cost of calling a
> function.
>
> I don't know if it's worth optimizing further.

Nah, I think it's fine, I was just curious.
Doing (nthcdr 100 list) instead of (> (length list) 100) might improve
that a little, but either way is fine.
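
(The point being that `nthcdr' inspects at most 100 cons cells, while
`length' walks the entire list. A sketch, with a hypothetical helper
name:)

    ;; The check `(> (length list) 100)' is O(N) in the list length;
    ;; `nthcdr' stops after at most 100 steps, making the test O(1)
    ;; for long lists.
    (defun longer-than-100-p (list)   ; illustrative name, not in subr.el
      "Return non-nil if LIST has more than 100 elements."
      (consp (nthcdr 100 list)))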




* Re: What to do for faster `remove-duplicates'?
  2015-05-06 13:50     ` Stefan Monnier
  2015-05-06 13:51       ` Oleh Krehel
@ 2015-05-06 17:43       ` Thierry Volpiatto
  2015-05-06 17:54         ` Oleh Krehel
  2015-05-06 18:31         ` Artur Malabarba
  1 sibling, 2 replies; 20+ messages in thread
From: Thierry Volpiatto @ 2015-05-06 17:43 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Looks good, please install.

Not so good, as it is now no longer destructive for sequences longer
than 100 elements.

-- 
Thierry
Get my Gnupg key:
gpg --keyserver pgp.mit.edu --recv-keys 59F29997 





* Re: What to do for faster `remove-duplicates'?
  2015-05-06 17:43       ` Thierry Volpiatto
@ 2015-05-06 17:54         ` Oleh Krehel
  2015-05-06 18:31         ` Artur Malabarba
  1 sibling, 0 replies; 20+ messages in thread
From: Oleh Krehel @ 2015-05-06 17:54 UTC (permalink / raw)
  To: Thierry Volpiatto; +Cc: emacs-devel

Thierry Volpiatto <thierry.volpiatto@gmail.com> writes:

> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>> Looks good, please install.
>
> Not so good as now it is no more destructive for a seq > 100.

Should I make it destructive, like this?

(defun delete-dups (list)
  "Destructively remove `equal' duplicates from LIST.
Store the result in LIST and return it.  LIST must be a proper list.
Of several `equal' occurrences of an element in LIST, the first
one is kept."
  (if (> (length list) 100)
      (let ((hash (make-hash-table :test #'equal))
            res)
        (dolist (elt list)
          (unless (gethash elt hash)
            (puthash elt elt hash)
            (push elt res)))
        (setcdr list (cdr (nreverse res))))
    (let ((tail list))
      (while tail
        (setcdr tail (delete (car tail) (cdr tail)))
        (setq tail (cdr tail)))))
  list)

Oleh




* Re: What to do for faster `remove-duplicates'?
  2015-05-06 17:43       ` Thierry Volpiatto
  2015-05-06 17:54         ` Oleh Krehel
@ 2015-05-06 18:31         ` Artur Malabarba
  2015-05-06 18:33           ` Artur Malabarba
                             ` (2 more replies)
  1 sibling, 3 replies; 20+ messages in thread
From: Artur Malabarba @ 2015-05-06 18:31 UTC (permalink / raw)
  To: Thierry Volpiatto; +Cc: emacs-devel

>> Looks good, please install.
>
> Not so good as now it is no more destructive for a seq > 100.

Just pushed the following:

modified   lisp/subr.el
@@ -424,12 +424,12 @@ one is kept."
           (unless (gethash elt hash)
             (puthash elt elt hash)
             (push elt res)))
-        (nreverse res))
+        (setcdr list (cdr (nreverse res))))
     (let ((tail list))
       (while tail
         (setcdr tail (delete (car tail) (cdr tail)))
-        (setq tail (cdr tail))))
-    list))
+        (setq tail (cdr tail)))))
+  list)




* Re: What to do for faster `remove-duplicates'?
  2015-05-06 18:31         ` Artur Malabarba
@ 2015-05-06 18:33           ` Artur Malabarba
  2015-05-06 18:41             ` Oleh Krehel
  2015-05-06 18:48           ` Thierry Volpiatto
  2015-05-06 20:54           ` Stefan Monnier
  2 siblings, 1 reply; 20+ messages in thread
From: Artur Malabarba @ 2015-05-06 18:33 UTC (permalink / raw)
  To: Thierry Volpiatto; +Cc: emacs-devel

But that nreverse could be optimized out if the first loop followed a
`while'+`setcdr' strategy like the second.

2015-05-06 19:31 GMT+01:00 Artur Malabarba <bruce.connor.am@gmail.com>:
>>> Looks good, please install.
>>
>> Not so good as now it is no more destructive for a seq > 100.
>
> Just pushed the following:
>
> modified   lisp/subr.el
> @@ -424,12 +424,12 @@ one is kept."
>            (unless (gethash elt hash)
>              (puthash elt elt hash)
>              (push elt res)))
> -        (nreverse res))
> +        (setcdr list (cdr (nreverse res))))
>      (let ((tail list))
>        (while tail
>          (setcdr tail (delete (car tail) (cdr tail)))
> -        (setq tail (cdr tail))))
> -    list))
> +        (setq tail (cdr tail)))))
> +  list)




* Re: What to do for faster `remove-duplicates'?
  2015-05-06 18:33           ` Artur Malabarba
@ 2015-05-06 18:41             ` Oleh Krehel
  2015-05-06 19:24               ` Artur Malabarba
  0 siblings, 1 reply; 20+ messages in thread
From: Oleh Krehel @ 2015-05-06 18:41 UTC (permalink / raw)
  To: Artur Malabarba; +Cc: emacs-devel, Thierry Volpiatto

Artur Malabarba <bruce.connor.am@gmail.com> writes:

> But that nreverse could be optimized out if the first loop followed a
> `while'+`setcdr' strategy like the second.

Please check the optimized version:

(defun delete-dups (list)
  "Destructively remove `equal' duplicates from LIST.
Store the result in LIST and return it.  LIST must be a proper list.
Of several `equal' occurrences of an element in LIST, the first
one is kept."
  (if (> (length list) 100)
      (let ((hash (make-hash-table :test #'equal)))
        (let ((tail list)
              elt retail)
          (while (setq retail (cdr tail))
            (setq elt (car retail))
            (if (gethash elt hash)
                (setcdr tail (cdr retail))
              (puthash elt t hash))
            (setq tail retail))))
    (let ((tail list))
      (while tail
        (setcdr tail (delete (car tail) (cdr tail)))
        (setq tail (cdr tail)))))
  list)

Oleh




* Re: What to do for faster `remove-duplicates'?
  2015-05-06 18:31         ` Artur Malabarba
  2015-05-06 18:33           ` Artur Malabarba
@ 2015-05-06 18:48           ` Thierry Volpiatto
  2015-05-06 20:54             ` Stefan Monnier
  2015-05-06 20:54           ` Stefan Monnier
  2 siblings, 1 reply; 20+ messages in thread
From: Thierry Volpiatto @ 2015-05-06 18:48 UTC (permalink / raw)
  To: bruce.connor.am; +Cc: emacs-devel


Artur Malabarba <bruce.connor.am@gmail.com> writes:

>>> Looks good, please install.
>>
>> Not so good as now it is no more destructive for a seq > 100.
>
> Just pushed the following:
>
> modified   lisp/subr.el
> @@ -424,12 +424,12 @@ one is kept."
>            (unless (gethash elt hash)
>              (puthash elt elt hash)
>              (push elt res)))
> -        (nreverse res))
> +        (setcdr list (cdr (nreverse res))))
>      (let ((tail list))
>        (while tail
>          (setcdr tail (delete (car tail) (cdr tail)))
> -        (setq tail (cdr tail))))
> -    list))
> +        (setq tail (cdr tail)))))
> +  list)

Also I am not sure pushing to a list (res) and returning this list at
end is faster than returning the maphash.
At first it looks faster but the time spent consing+gc'ing seems longer
than just returning the maphash.

But I may be wrong, just a thought.

-- 
Thierry
Get my Gnupg key:
gpg --keyserver pgp.mit.edu --recv-keys 59F29997 




* Re: What to do for faster `remove-duplicates'?
  2015-05-06 18:41             ` Oleh Krehel
@ 2015-05-06 19:24               ` Artur Malabarba
  2015-05-06 19:32                 ` Oleh Krehel
  0 siblings, 1 reply; 20+ messages in thread
From: Artur Malabarba @ 2015-05-06 19:24 UTC (permalink / raw)
  To: Oleh Krehel; +Cc: emacs-devel, Thierry Volpiatto

2015-05-06 19:41 GMT+01:00 Oleh Krehel <ohwoeowho@gmail.com>:
> Artur Malabarba <bruce.connor.am@gmail.com> writes:
>
>> But that nreverse could be optimized out if the first loop followed a
>> `while'+`setcdr' strategy like the second.
>
> Please check the optimized version:
>
> (defun delete-dups (list)
>   "Destructively remove `equal' duplicates from LIST.
> Store the result in LIST and return it.  LIST must be a proper list.
> Of several `equal' occurrences of an element in LIST, the first
> one is kept."
>   (if (> (length list) 100)
>       (let ((hash (make-hash-table :test #'equal)))
>         (let ((tail list)
>               elt retail)
>           (while (setq retail (cdr tail))
>             (setq elt (car retail))
>             (if (gethash elt hash)
>                 (setcdr tail (cdr retail))
>               (puthash elt t hash))
>             (setq tail retail))))
>     (let ((tail list))
>       (while tail
>         (setcdr tail (delete (car tail) (cdr tail)))
>         (setq tail (cdr tail)))))
>   list)

I may be wrong, but it looks like it's completely skipping the car of the list.




* Re: What to do for faster `remove-duplicates'?
  2015-05-06 19:24               ` Artur Malabarba
@ 2015-05-06 19:32                 ` Oleh Krehel
  2015-05-06 21:22                   ` Artur Malabarba
  0 siblings, 1 reply; 20+ messages in thread
From: Oleh Krehel @ 2015-05-06 19:32 UTC (permalink / raw)
  To: Artur Malabarba; +Cc: Thierry Volpiatto, emacs-devel

Artur Malabarba <bruce.connor.am@gmail.com> writes:

> 2015-05-06 19:41 GMT+01:00 Oleh Krehel <ohwoeowho@gmail.com>:
>> Artur Malabarba <bruce.connor.am@gmail.com> writes:
>>
>>> But that nreverse could be optimized out if the first loop followed a
>>> `while'+`setcdr' strategy like the second.
>>
>> Please check the optimized version:
>>
>> (defun delete-dups (list)
>>   "Destructively remove `equal' duplicates from LIST.
>> Store the result in LIST and return it.  LIST must be a proper list.
>> Of several `equal' occurrences of an element in LIST, the first
>> one is kept."
>>   (if (> (length list) 100)
>>       (let ((hash (make-hash-table :test #'equal)))
>>         (let ((tail list)
>>               elt retail)
>>           (while (setq retail (cdr tail))
>>             (setq elt (car retail))
>>             (if (gethash elt hash)
>>                 (setcdr tail (cdr retail))
>>               (puthash elt t hash))
>>             (setq tail retail))))
>>     (let ((tail list))
>>       (while tail
>>         (setcdr tail (delete (car tail) (cdr tail)))
>>         (setq tail (cdr tail)))))
>>   list)
>
> I may be wrong, but it looks like it's completely skipping the car of the list.

You're right. I've pushed the corrected version.  I tried testing with
benchmark-run and the same large collection. The time fluctuates too
much to judge which method is better. They're roughly equal, and the
intuition is that not using `nreverse' is better.
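
(For reference, one shape the fix can take is to seed the hash table
with the list's first element, so that the cdr-based loop never needs
to visit it. A sketch, not necessarily the exact committed code:)

    (if (> (length list) 100)
        (let ((hash (make-hash-table :test #'equal))
              (tail list)
              retail)
          ;; The first element is always kept; record it up front so
          ;; the loop below, which starts at (cdr tail), cannot miss it.
          (puthash (car list) t hash)
          (while (setq retail (cdr tail))
            (if (gethash (car retail) hash)
                ;; Duplicate: splice it out and stay on TAIL, so that
                ;; runs of consecutive duplicates are handled too.
                (setcdr tail (cdr retail))
              ;; New element: remember it and advance.
              (puthash (car retail) t hash)
              (setq tail retail))))
      (let ((tail list))
        (while tail
          (setcdr tail (delete (car tail) (cdr tail)))
          (setq tail (cdr tail)))))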

Oleh




* Re: What to do for faster `remove-duplicates'?
  2015-05-06 18:31         ` Artur Malabarba
  2015-05-06 18:33           ` Artur Malabarba
  2015-05-06 18:48           ` Thierry Volpiatto
@ 2015-05-06 20:54           ` Stefan Monnier
  2015-05-06 21:18             ` Artur Malabarba
  2 siblings, 1 reply; 20+ messages in thread
From: Stefan Monnier @ 2015-05-06 20:54 UTC (permalink / raw)
  To: Artur Malabarba; +Cc: emacs-devel, Thierry Volpiatto

>>> Looks good, please install.
>> Not so good as now it is no more destructive for a seq > 100.

Not being destructive doesn't seem like a serious problem to me.
`delete-*' functions are allowed to work destructively (as an
optimization) but if they don't, I wouldn't hold it against them, as
long as they're still fast.

> modified   lisp/subr.el
> @@ -424,12 +424,12 @@ one is kept."
>            (unless (gethash elt hash)
>              (puthash elt elt hash)
>              (push elt res)))
> -        (nreverse res))
> +        (setcdr list (cdr (nreverse res))))

I don't understand what this is intended to do.


        Stefan




* Re: What to do for faster `remove-duplicates'?
  2015-05-06 18:48           ` Thierry Volpiatto
@ 2015-05-06 20:54             ` Stefan Monnier
  0 siblings, 0 replies; 20+ messages in thread
From: Stefan Monnier @ 2015-05-06 20:54 UTC (permalink / raw)
  To: Thierry Volpiatto; +Cc: bruce.connor.am, emacs-devel

> Also I am not sure pushing to a list (res) and returning this list at
> end is faster than returning the maphash.

I don't know what you mean by "returning the maphash".


        Stefan




* Re: What to do for faster `remove-duplicates'?
  2015-05-06 20:54           ` Stefan Monnier
@ 2015-05-06 21:18             ` Artur Malabarba
  0 siblings, 0 replies; 20+ messages in thread
From: Artur Malabarba @ 2015-05-06 21:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel, Thierry Volpiatto

[-- Attachment #1: Type: text/plain, Size: 853 bytes --]

On May 6, 2015 9:54 PM, "Stefan Monnier" <monnier@iro.umontreal.ca> wrote:
>
> >>> Looks good, please install.
> >> Not so good as now it is no more destructive for a seq > 100.
>
> Not being destructive doesn't seem like a serious problem to me.

Usually, yes. But here the docstring specifically documents that
"Destructively remove `equal' duplicates from LIST.
Store the result in LIST and return it."

>
> > modified   lisp/subr.el
> > @@ -424,12 +424,12 @@ one is kept."
> >            (unless (gethash elt hash)
> >              (puthash elt elt hash)
> >              (push elt res)))
> > -        (nreverse res))
> > +        (setcdr list (cdr (nreverse res))))
>
> I don't understand what this is intended to do.

It destructively modifies list to be equal to (nreverse res). The car is
already the same, so only the cdr needs to be modified.
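
(A toy illustration of the splice, where `dedup' stands in for
(nreverse res):)

    ;; The first kept element is always (car list), so replacing only
    ;; the cdr rewrites LIST in place to the deduplicated result.
    (let ((list (list "a" "b" "a" "c"))
          (dedup (list "a" "b" "c")))   ; stands in for (nreverse res)
      (setcdr list (cdr dedup))
      list)
    ;; => ("a" "b" "c")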



* Re: What to do for faster `remove-duplicates'?
  2015-05-06 19:32                 ` Oleh Krehel
@ 2015-05-06 21:22                   ` Artur Malabarba
  0 siblings, 0 replies; 20+ messages in thread
From: Artur Malabarba @ 2015-05-06 21:22 UTC (permalink / raw)
  To: Oleh; +Cc: emacs-devel, Thierry Volpiatto

[-- Attachment #1: Type: text/plain, Size: 450 bytes --]

> You're right. I've pushed the corrected version.  I tried testing with
> benchmark-run and the same large collection. The time fluctuates too
> much to judge which method is better. They're roughly equal, and the
> intuition is that not using `nreverse` is better.

If there were no notable speed differences, I'd say the nreverse version is
shorter and easier to read. But I don't want to start bikeshedding, so feel
free to disregard my opinion.



end of thread  [~2015-05-06 21:22 UTC]

Thread overview: 20+ messages
2015-05-06  7:33 What to do for faster `remove-duplicates'? Oleh Krehel
2015-05-06 12:59 ` Stefan Monnier
2015-05-06 13:30   ` Oleh Krehel
2015-05-06 13:50     ` Stefan Monnier
2015-05-06 13:51       ` Oleh Krehel
2015-05-06 17:43       ` Thierry Volpiatto
2015-05-06 17:54         ` Oleh Krehel
2015-05-06 18:31         ` Artur Malabarba
2015-05-06 18:33           ` Artur Malabarba
2015-05-06 18:41             ` Oleh Krehel
2015-05-06 19:24               ` Artur Malabarba
2015-05-06 19:32                 ` Oleh Krehel
2015-05-06 21:22                   ` Artur Malabarba
2015-05-06 18:48           ` Thierry Volpiatto
2015-05-06 20:54             ` Stefan Monnier
2015-05-06 20:54           ` Stefan Monnier
2015-05-06 21:18             ` Artur Malabarba
2015-05-06 14:04     ` Artur Malabarba
2015-05-06 14:13       ` Oleh Krehel
2015-05-06 14:49         ` Artur Malabarba

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
