From: "Drew Adams" <drew.adams@oracle.com>
To: "'Thierry Volpiatto'" <thierry.volpiatto@gmail.com>,
<10386@debbugs.gnu.org>
Cc: 'Jari Aalto' <jari.aalto@cante.net>
Subject: bug#10386: CODE wishlist: íncludedelete-duplicates.el
Date: Wed, 28 Dec 2011 08:56:48 -0800 [thread overview]
Message-ID: <97F263C286E14A3A9D8656356D98ECD1@us.oracle.com> (raw)
In-Reply-To: <87aa6dvz8l.fsf@gmail.com>
> > http://mwolson.org/static/dist/elisp/delete-duplicates.el
>
> This is already provided in Emacs24+.
`delete-dups' has been in Emacs since Emacs 22, actually.
> What is not provided is a fast version of remove-duplicates.
> http://article.gmane.org/gmane.emacs.devel/139546/match=remove+dups
+1
And the cl-seq.el version of `remove-duplicates' is *particularly* slow. Even a
classic list dups removal algorithm is much faster.
Here's a comparison using `equal' and a list of strings (`elp-results'):
hash-remove-dups 1 0.031 0.031
list-remove-dups 1 5.813 5.813
remove-duplicates 1 122.875 122.875
Where:
(defun list-remove-dups (list)
(let ((tail list)
new)
(while tail
(unless (member (car tail) new) (push (car tail) new))
(pop tail))
(nreverse new)))
(defun* hash-remove-dups (seq &key (test 'equal))
(let ((cont (make-hash-table :test test)))
(loop for elm in seq
unless (gethash elm cont)
do (puthash elm elm cont)
finally return (loop for i being the hash-values
in cont collect i))))
With all 3 functions byte-compiled, using these calls:
(hash-remove-dups B :test 'equal)
(list-remove-dups B) ; uses `equal'
(remove-duplicates B :test 'equal)
And with this list B (initialized anew each time):
(let ((seq (loop for i from 1 to 10000
collect
(format "%s" (random most-positive-fixnum)))))
(append seq seq))
With B 10 times smaller (1000):
hash-remove-dups 1 0.0 0.0
list-remove-dups 1 0.047 0.047
remove-duplicates 1 1.172 1.172
With B 10 times bigger (100000), the difference between hash and classic list is
even greater (fuggedabowt cl-seq's `remove-duplicates' in this case):
hash-remove-dups 1 0.359 0.359
list-remove-dups 1 1209.578 1209.578
remove-duplicates la-la...
next prev parent reply other threads:[~2011-12-28 16:56 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-12-28 7:55 bug#10386: CODE wishlist: ínclude delete-duplicates.el Jari Aalto
2011-12-28 9:09 ` Thierry Volpiatto
2011-12-28 16:56 ` Drew Adams [this message]
2012-01-11 8:47 ` Glenn Morris
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/emacs/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=97F263C286E14A3A9D8656356D98ECD1@us.oracle.com \
--to=drew.adams@oracle.com \
--cc=10386@debbugs.gnu.org \
--cc=jari.aalto@cante.net \
--cc=thierry.volpiatto@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).