From: Thien-Thi Nguyen <ttn@gnu.org>
To: Emacs Development <emacs-devel@gnu.org>
Cc: Michael Heerdegen <michael_heerdegen@web.de>, rswgnu@gmail.com
Subject: Re: [ELPA] New package: find-dups
Date: Thu, 12 Oct 2017 01:28:14 +0200
Message-ID: <87mv4xnu29.fsf@gnuvola.org>
In-Reply-To: <87bmldefg5.fsf@web.de> (Michael Heerdegen's message of "Wed, 11 Oct 2017 19:56:26 +0200")


() Michael Heerdegen <michael_heerdegen@web.de>
() Wed, 11 Oct 2017 19:56:26 +0200

   Robert Weiner <rsw@gnu.org> writes:

   > This seems incredibly complicated.  It would help if you
   > would state the general problem you are trying to solve and
   > the performance characteristics you need.  It certainly is
   > not a generic duplicate removal library.  Why can't you
   > flatten your list and then just apply a sequence of
   > predicate matches as needed or use hashing as mentioned in
   > the commentary?

   I guess the name is misleading, I'll try to find a better one.

How about "multi-pass-dups", then use/document "pass" everywhere
in the code/discussion?  (Currently, you use "stage" in the code,
and "step" in this thread.)

   Look at the example of finding files with equal contents in
   your file system: [...]

Can you think of another use-case?  That exercise will help
highlight the general (factorable) concepts to document well.
Conversely, if you cannot, maybe that's a hint that the
abstraction level is too high; some opportunity exists for
specialization (and thus optimization).

   In a second step, we have less many files.

This is the key motivation for multi-pass...

   Do you need a mathematical formulation of the abstract
   problem that the algorithm solves, and how it works?

...so briefly explaining how iteration mitigates the suffering
due to the (irreducible) N^2 might be good to do early on (in
Commentary), before giving examples.  Leading w/ a small bit of
theory caters to those readers already disposed to that style.
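
For concreteness, a minimal sketch of that idea (not the package's
actual code): each pass only re-examines the groups that survived
the previous one, so the costly characterizations run on ever fewer
elements.  The names ‘my-partition’ and ‘my-multi-pass-dups’ and
the (KEY-FN TEST) pass format are made up for illustration, and
TEST is assumed to be one of the built-in hash-table tests.

#+begin_src emacs-lisp
;;; -*- lexical-binding: t -*-
(require 'cl-lib)

(defun my-partition (group key-fn test)
  "Split GROUP into sub-lists whose KEY-FN values agree under TEST.
TEST must be `eq', `eql' or `equal' (the built-in hash-table tests).
Only sub-lists with two or more elements are returned."
  (let ((table (make-hash-table :test test))
        (result '()))
    ;; Bucket every element under its key.
    (dolist (elt group)
      (push elt (gethash (funcall key-fn elt) table)))
    ;; Keep only the buckets that still hold candidate duplicates.
    (maphash (lambda (_key members)
               (when (cdr members)
                 (push (reverse members) result)))
             table)
    result))

(defun my-multi-pass-dups (items passes)
  "Return the groups of ITEMS that look alike under every pass.
PASSES is a list of (KEY-FN TEST) lists, cheapest pass first."
  (let ((groups (list items)))
    (dolist (pass passes)
      ;; Each pass refines the surviving groups and drops singletons.
      (setq groups
            (cl-mapcan (lambda (group)
                         (my-partition group (car pass) (cadr pass)))
                       groups)))
    groups))

;; Cheap pass first (length), costlier pass second (full content):
(my-multi-pass-dups
 '("ab" "cd" "ab" "xyz" "q")
 (list (list #'length #'eql)
       (list #'identity #'equal)))
;; => (("ab" "ab"))
#+end_src

Here "xyz" and "q" drop out after the first pass, so the second
pass only ever looks at three of the five strings.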

To help the rest of the readers, a common technique is to label
components of the theory (e.g., [1], [2], ...) and refer to
these later, in the concrete examples.  Those readers might
gloss over the theory at first (being indisposed) but the back
references invite them to make connections at their own pace.

In short, "it is wise" to show how "it is wise" and avoid saying
"it is wise" (according to this practiced "wise"ass :-D).

   (find-dups my-sequence-of-file-names
              (list (list (lambda (file) ...)
                          #'eq)
                    (list (lambda (file) ...)
                          #'equal)
                    (list (lambda (file) ...)
                          #'equal)))

IIUC the 2nd level ‘list’ is to associate each characterization
func w/ a comparison func.  I wonder if there is another way.
Too bad Emacs Lisp has no "object properties" like Guile, eh?

OTOH, the 1st level ‘list’ seems like a gratuitous hoop (read:
source of latent PEBKAC/complaints/redesign).  Why not move that
down-chain, so caller need not worry?  Something like:

#+begin_src emacs-lisp
(multi-pass-dups
 MY-SEQUENCE-OF-FILE-NAMES
 (list (lambda (file) ...)
       #'eq)
 (list (lambda (file) ...)
       #'equal)
 (list (lambda (file) ...)
       #'equal))
#+end_src

(I also use the all-caps-for-metavariables convention, here.)
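
Moving the outer ‘list’ down-chain could be as small as a ‘&rest’
parameter.  A sketch, assuming the existing entry point keeps its
(SEQUENCE PASSES) signature; ‘my-find-dups-stub’ is only a
stand-in that echoes its arguments, not the package's function,
and ‘my-multi-pass-dups-rest’ is likewise a made-up name:

#+begin_src emacs-lisp
;;; -*- lexical-binding: t -*-

(defun my-find-dups-stub (sequence passes)
  "Stand-in for the existing two-argument entry point (hypothetical).
It merely returns its arguments so the call shape is visible."
  (list sequence passes))

(defun my-multi-pass-dups-rest (sequence &rest passes)
  "Like `my-find-dups-stub', but take each pass as its own argument.
`&rest' collects the trailing arguments into the list PASSES."
  (my-find-dups-stub sequence passes))

;; The caller writes the passes inline, no outer `list' needed:
(my-multi-pass-dups-rest '("a" "b")
                         (list #'length #'eql)
                         (list #'identity #'equal))
#+end_src

A caller who already holds the passes in a list can still say
(apply #'my-multi-pass-dups-rest SEQUENCE PASSES).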

-- 
Thien-Thi Nguyen -----------------------------------------------
 (defun responsep (query)
   (pcase (context query)
     (`(technical ,ml) (correctp ml))
     ...))                              748E A0E8 1CB8 A748 9BFA
--------------------------------------- 6CE4 6703 2224 4C80 7502



