unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#50928: remove-dups
From: Tak Kunihiro @ 2021-10-01  3:23 UTC
  To: 50928

I wanted to delete duplicated items from a list non-destructively.
It took me a while to find out how to do so.

(cl-remove-duplicates list :test 'equal)
(delete-dups (copy-sequence list))

I think it would be handy to have something like the following in subr.el.
Or is it too obvious?

#+begin_src emacs-lisp
(defun remove-dups (list)
  "Remove `equal' duplicates from LIST non-destructively.
Note that `delete-dups' deletes duplicates destructively."
  (delete-dups (copy-sequence list)))
#+end_src
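
To illustrate why the copy matters, here is a minimal sketch.  The
variable name `my-items' is made up for the example and is not part of
the proposal.  `delete-dups' rewrites the cons cells of its argument,
so running it on a copy leaves the input list untouched:

```emacs-lisp
;; Hypothetical example data; `my-items' is not from the proposal.
(let* ((my-items (list "a" "b" "a" "c" "b"))
       (unique (delete-dups (copy-sequence my-items))))
  ;; The copy is deduplicated; the original still has all five elements.
  (list unique my-items))
;; => (("a" "b" "c") ("a" "b" "a" "c" "b"))
```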







* bug#50928: remove-dups
From: Lars Ingebrigtsen @ 2021-10-01 12:45 UTC
  To: Tak Kunihiro; +Cc: 50928

Tak Kunihiro <tkk@misasa.okayama-u.ac.jp> writes:

> I wanted to delete duplicated items from a list non-destructively.
> It took me a while to find out how to do so.
>
> (cl-remove-duplicates list :test 'equal)
> (delete-dups (copy-sequence list))
>
> I think it would be handy to have something like the following in subr.el.
> Or is it too obvious?
>
> #+begin_src emacs-lisp
> (defun remove-dups (list)
>   "Remove `equal' duplicates from LIST non-destructively.
> Note that `delete-dups' deletes duplicates destructively."
>   (delete-dups (copy-sequence list)))
> #+end_src

This is basically seq-uniq:

---
seq-uniq is an autoloaded compiled Lisp function in ‘seq.el’.

(seq-uniq SEQUENCE &optional TESTFN)
---

The seq library has a pretty full set of sequence functions, some of
which overlap with the older functions like `delete-dups'.
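
As a rough usage sketch of the signature above (the sample data and
the case-insensitive TESTFN are assumptions chosen for illustration):

```emacs-lisp
(require 'seq)

;; With no TESTFN, elements are compared with `equal'.
;; The input list is not modified.
(seq-uniq '("a" "b" "a" "c"))
;; => ("a" "b" "c")

;; TESTFN receives two elements and says whether they are duplicates.
(seq-uniq '("Foo" "foo" "Bar")
          (lambda (x y) (string= (downcase x) (downcase y))))
;; => ("Foo" "Bar")
```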

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#50928: remove-dups
From: Dmitry Gutov @ 2021-10-01 13:16 UTC
  To: Lars Ingebrigtsen, Tak Kunihiro; +Cc: 50928

On 01.10.2021 15:45, Lars Ingebrigtsen wrote:
> This is basically seq-uniq:
> 
> ---
> seq-uniq is an autoloaded compiled Lisp function in ‘seq.el’.
> 
> (seq-uniq SEQUENCE &optional TESTFN)
> ---
> 
> The seq library has a pretty full set of sequence functions, some of
> which overlap with the older functions like `delete-dups'.

seq-uniq is O(N^2), though, so it's going to be less efficient than

   (delete-dups (copy-sequence list))

I think the idea was to add specialized, faster implementations for
different data types, but that hasn't happened so far.

Also, its signature (accepting an arbitrary testfn) might make the obvious
optimization that delete-dups uses (caching the "known" set in a hash
table) infeasible.  Maybe we could use a hash table for a limited set of
testfns.
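
A minimal sketch of that last idea, under stated assumptions: the
function name `my-seq-uniq-sketch' is hypothetical and this is not how
seq.el is implemented.  It uses a hash table only when the test is one
of the built-in hash-table tests, and otherwise falls back to
`seq-uniq':

```emacs-lisp
(require 'seq)

(defun my-seq-uniq-sketch (sequence &optional testfn)
  "Return a list of the elements of SEQUENCE with duplicates removed.
Hypothetical helper, not part of seq.el.  When TESTFN is nil or one of
the symbols `eq', `eql' or `equal', track seen elements in a hash
table; otherwise fall back to `seq-uniq'."
  (let ((test (or testfn #'equal)))
    (if (memq test '(eq eql equal))
        (let ((seen (make-hash-table :test test))
              (result '()))
          (seq-doseq (elt sequence)
            ;; Store t as the value so that a nil element is also
            ;; recognized as already seen.
            (unless (gethash elt seen)
              (puthash elt t seen)
              (push elt result)))
          (nreverse result))
      ;; Arbitrary predicates cannot be used as hash-table tests.
      (seq-uniq sequence testfn))))
```

For example, (my-seq-uniq-sketch '(1 2 1 3 2)) takes the hash-table
path, while passing a lambda as TESTFN takes the `seq-uniq' path.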






* bug#50928: [External] : bug#50928: remove-dups
From: Drew Adams @ 2021-10-01 17:02 UTC
  To: Dmitry Gutov, Lars Ingebrigtsen, Tak Kunihiro; +Cc: 50928@debbugs.gnu.org

FWIW, I use this.  I don't recall whether I borrowed
it from somewhere or just wrote it from scratch.

(defun my-remove-dups (sequence &optional test)
  "Copy of SEQUENCE with duplicate elements removed.
Optional arg TEST is the test function.  If nil, test with `equal'.
See `make-hash-table' for possible values of TEST."
  (setq test  (or test  #'equal))
  (let ((htable  (make-hash-table :test test)))
    (loop 
     for elt in sequence
     unless (gethash elt htable)
     do     (puthash elt elt htable)
     finally return (loop for i being the hash-values in htable collect i))))


* bug#50928: [External] : bug#50928: remove-dups
From: Thierry Volpiatto @ 2021-10-01 17:31 UTC
  To: Drew Adams
  Cc: Lars Ingebrigtsen, Tak Kunihiro, 50928@debbugs.gnu.org,
	Dmitry Gutov

Drew Adams <drew.adams@oracle.com> writes:

> FWIW, I use this.  I don't recall whether I borrowed
> it from somewhere or just wrote it from scratch.
>
> (defun my-remove-dups (sequence &optional test)
>   "Copy of SEQUENCE with duplicate elements removed.
> Optional arg TEST is the test function.  If nil, test with `equal'.
> See `make-hash-table' for possible values of TEST."
>   (setq test  (or test  #'equal))
>   (let ((htable  (make-hash-table :test test)))
>     (loop 
>      for elt in sequence
>      unless (gethash elt htable)
       collect (puthash elt elt htable))))

Looks like an old version of `helm-fast-remove-dups`.  There is no need to
loop over the hash table again, and using cl-loop is better.
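
A sketch of the single-pass variant with `cl-loop' might look like the
following.  The function name is hypothetical and this is not the
actual helm source.  Unlike the snippet above, it stores t in the
table so that a nil element is also deduplicated:

```emacs-lisp
(require 'cl-lib)

(defun my-remove-dups-one-pass (list &optional test)
  "Return a copy of LIST with duplicates removed, keeping first occurrences.
Hypothetical name.  TEST is a hash-table test and defaults to `equal'."
  (let ((seen (make-hash-table :test (or test #'equal))))
    (cl-loop for elt in list
             ;; Collect ELT only the first time it is encountered.
             unless (gethash elt seen)
             collect (progn (puthash elt t seen) elt))))
```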

-- 
Thierry






* bug#50928: remove-dups
From: Tak Kunihiro @ 2021-10-03 23:42 UTC
  To: larsi; +Cc: thievol, tkk, 50928, dgutov

> Tak Kunihiro <tkk@misasa.okayama-u.ac.jp> writes:
> 
>> I wanted to delete duplicated items from a list non-destructively.
>> It took me a while to find out how to do so.
>>
>> (cl-remove-duplicates list :test 'equal)
>> (delete-dups (copy-sequence list))
>>
>> I think it would be handy to have something like the following in subr.el.
>> Or is it too obvious?
>>
>> (defun remove-dups (list)
>>   "Remove `equal' duplicates from LIST non-destructively.
>> Note that `delete-dups' deletes duplicates destructively."
>>   (delete-dups (copy-sequence list)))
> 
> This is basically seq-uniq:

Thank you for letting me know.  Now I can find it in (info
"(elisp) Sequence Functions").  I wonder how I could have found the
function by myself.

How did you find it?  With (apropos-documentation "duplicate")?






* bug#50928: remove-dups
From: Lars Ingebrigtsen @ 2021-10-04  9:29 UTC
  To: Tak Kunihiro; +Cc: thievol, 50928, dgutov

Tak Kunihiro <tkk@misasa.okayama-u.ac.jp> writes:

> Thank you for letting me know.  Now I can find it in (info
> "(elisp) Sequence Functions").  I wonder how I could have found the
> function by myself.
>
> How did you find it?  With (apropos-documentation "duplicate")?

I just...  knew about seq.el.  The cross-referencing between the older
sequence functions and seq.el is rather lacking -- basically all these
older functions should probably reference something in seq.el in their
doc strings.







* bug#50928: remove-dups
From: Tak Kunihiro @ 2021-10-05  3:03 UTC
  To: larsi; +Cc: thievol, tkk, 50928, dgutov

>> Now I can find it in (info
>> "(elisp) Sequence Functions").  I wonder how I could have found the
>> function by myself.
>>
>> How did you find it?  With (apropos-documentation "duplicate")?
> 
> I just...  knew about seq.el.  The cross-referencing between the older
> sequence functions and seq.el is rather lacking -- basically all these
> older functions should probably reference something in seq.el in their
> doc strings.

How about something like the patch below?

commit xxx
Author: yyy
Date:   zzz

    Add references to a newer function `seq-uniq' in seq.el
    
    * lisp/subr.el (delete-dups):
    * doc/lispref/lists.texi (Sets And Lists):
    * doc/lispref/lists.texi (delete-dups):  Refer to `seq-uniq' (bug#50928).

diff --git a/lisp/subr.el b/lisp/subr.el
index e4819c4b2b..228d2e0c22 100644
--- a/lisp/subr.el
+++ b/lisp/subr.el
@@ -696,7 +696,7 @@ delete-dups
   "Destructively remove `equal' duplicates from LIST.
 Store the result in LIST and return it.  LIST must be a proper list.
 Of several `equal' occurrences of an element in LIST, the first
-one is kept."
+one is kept.  See `seq-uniq' for non-destructive operation."
   (let ((l (length list)))
     (if (> l 100)
         (let ((hash (make-hash-table :test #'equal :size l))

diff --git a/doc/lispref/lists.texi b/doc/lispref/lists.texi
index 75641256b6..66c556ecd0 100644
--- a/doc/lispref/lists.texi
+++ b/doc/lispref/lists.texi
@@ -1227,13 +1227,13 @@ Sets And Lists
 @cindex lists as sets
 @cindex sets
 
-  A list can represent an unordered mathematical set---simply consider a
-value an element of a set if it appears in the list, and ignore the
-order of the list.  To form the union of two sets, use @code{append} (as
-long as you don't mind having duplicate elements).  You can remove
-@code{equal} duplicates using @code{delete-dups}.  Other useful
-functions for sets include @code{memq} and @code{delq}, and their
-@code{equal} versions, @code{member} and @code{delete}.
+  A list can represent an unordered mathematical set---simply consider
+a value an element of a set if it appears in the list, and ignore the
+order of the list.  To form the union of two sets, use @code{append}
+(as long as you don't mind having duplicate elements).  You can remove
+@code{equal} duplicates using @code{delete-dups} or @code{seq-uniq}.
+Other useful functions for sets include @code{memq} and @code{delq},
+and their @code{equal} versions, @code{member} and @code{delete}.
 
 @cindex CL note---lack @code{union}, @code{intersection}
 @quotation
@@ -1489,7 +1489,8 @@ Sets And Lists
 This function destructively removes all @code{equal} duplicates from
 @var{list}, stores the result in @var{list} and returns it.  Of
 several @code{equal} occurrences of an element in @var{list},
-@code{delete-dups} keeps the first one.
+@code{delete-dups} keeps the first one.  See @code{seq-uniq} for
+non-destructive operation.
 @end defun
 
   See also the function @code{add-to-list}, in @ref{List Variables},






* bug#50928: remove-dups
From: Lars Ingebrigtsen @ 2021-10-05  7:11 UTC
  To: Tak Kunihiro; +Cc: thievol, 50928, dgutov

Tak Kunihiro <tkk@misasa.okayama-u.ac.jp> writes:

> How about something like the patch below?

Thanks; applied to Emacs 28.







Thread overview: 9+ messages
2021-10-01  3:23 bug#50928: remove-dups Tak Kunihiro
2021-10-01 12:45 ` Lars Ingebrigtsen
2021-10-01 13:16   ` Dmitry Gutov
2021-10-01 17:02     ` bug#50928: [External] : " Drew Adams
2021-10-01 17:31       ` Thierry Volpiatto
2021-10-03 23:42   ` Tak Kunihiro
2021-10-04  9:29     ` Lars Ingebrigtsen
2021-10-05  3:03       ` Tak Kunihiro
2021-10-05  7:11         ` Lars Ingebrigtsen

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
