Hi, On 2023-11-01 12:32, Philip Kaludercic wrote: > You forgot to CC me in the response, I would have almost missed your response. I am very sorry for this. > Harald Judt writes: > >> Hi, >> >> >> On 2023-10-31 13:21, Philip Kaludercic wrote: >> >> [...] >> >>> Sure, I can take care of adding the package to the archive, you'll only >>> have to bump the "Version" header when you want the package to be >>> released (and in the future as well, the commit that touches the >>> "Version" header triggers a new release). >>> In the meantime, here is some quick and superficial feedback on the >>> code: >> >> Great news! Thank you for your feedback, I'll try to incorporate it >> and push the changes to the package git repository soon. Though I have >> a few questions understanding some of this: >> >>> diff --git a/dired-duplicates.el b/dired-duplicates.el >>> index d62af02..4af37a3 100644 >>> --- a/dired-duplicates.el >>> +++ b/dired-duplicates.el >>> @@ -54,7 +54,6 @@ >>> (defcustom dired-duplicates-separate-results >>> t >>> "Boolean value indicating whether to separate results with new-lines." >>> - :group 'dired-duplicates >> >> [...] >> >> Why should I not use :group for customization? I thought this makes it >> easier to explore customization? > > See Michael's response. Basically what is going on is that the defgroup > macro has a side effect during macro expansion. Consider something like: > > (defvar foo) > > (defmacro bar (val) > (ignore (setq foo val))) > > (defmacro baz () > foo) > > then: > > (progn (bar 2) (baz)) expands to (progn nil 2) Thanks for both of you giving an explanation. [...] >>> (defun dired-duplicates--find-and-filter-files (directories) >>> "Search below DIRECTORIES for duplicate files. >>> @@ -140,14 +136,14 @@ duplicate files as values." >>> (append (gethash size same-size-table) (list f))) >>> finally >>> (cl-loop for same-size-files being the hash-values in same-size-table >>> - if (> (length same-size-files) 1) do >>> + if (> (length same-size-files) 1) do ;see length> >>> (cl-loop for f in same-size-files >>> for checksum = (dired-duplicates-checksum-file f) >>> do (setf (gethash checksum checksum-table) >>> (append (gethash checksum checksum-table) (list f))))) >>> (cl-loop for same-files being the hash-value in checksum-table using (hash-key checksum) >>> do >>> - (if (> (length same-files) 1) >>> + (if (cdr same-files) ;(> (length same-files) 1) in O(1) >>> (setf (gethash checksum checksum-table) >>> (cons (file-attribute-size (file-attributes (car same-files))) >>> (sort same-files #'string<))) >> >> Nice to know. Though I find length> more readable, I'll use cdr since >> I also use car ;-) > > Note that that would require a hard dependency on Emacs 28 (unless you > use Compat). I want to keep Emacs-27 as target unless the packages grows that much that it warrants including compat or bumping the Emacs version. >>> @@ -225,7 +220,7 @@ This is the same as `dired-do-flagged-delete', but calls >>> (defvar dired-duplicates-map >>> (let ((map (make-sparse-keymap))) >>> - ;; workaround for Emacs bug #57565 >>> + ;; workaround for Emacs bug#57565 >>> (when (< emacs-major-version 29) >>> (define-key map (kbd "x") 'dired-duplicates--do-flagged-delete) >>> (define-key map (kbd "D") 'dired-duplicates--do-delete)) >>> @@ -248,7 +243,7 @@ The results will be shown in a Dired buffer." >>> default-directory))) >>> (unless directories >>> (user-error "Specify one or more directories to search in")) >>> - (let* ((directories (if (listp directories) directories (list directories))) >>> + (let* ((directories (if (listp directories) directories (list directories))) ;see `ensure-list' >> >> `ensure-list' would bump emacs requirements to 28.1 or require compat >> hence I did not use it. > > But you have used other definitions that would have also raised the > dependency to at least 28.1? You can use package-lint to detect these. No, I have used package-lint already (it has great integration into flymake btw) so this did not happen. > Michael Heerdegen writes: > >> Harald Judt writes: >> >>>> diff --git a/dired-duplicates.el b/dired-duplicates.el >>>> index d62af02..4af37a3 100644 >>>> --- a/dired-duplicates.el >>>> +++ b/dired-duplicates.el >>>> @@ -54,7 +54,6 @@ >>>> (defcustom dired-duplicates-separate-results >>>> t >>>> "Boolean value indicating whether to separate results with new-lines." >>>> - :group 'dired-duplicates >>> >>> [...] >>> >>> Why should I not use :group for customization? I thought this makes it >>> easier to explore customization? >> >> It's redundant - see (info "(elisp) Variable Definitions"): >> >> If a ‘defcustom’ does not specify any ‘:group’, the last group >> defined with ‘defgroup’ in the same file will be used. This way, >> most ‘defcustom’ do not need an explicit ‘:group’. >> >> If :group is specified, it will most of the time mean the defcustom should >> be assigned to some other, not to the default, group. So it's better >> style to not explicitly specify the default. > > One notable exception is if you have two groups defined in one file, and > you want to disambiguate the user options or avoid having to write these > in a specific order. > >> Michael. To be honest, I have looked at how some existing package had implemented customization and did it the same way. I have no problems with removing the :group declarations if the defgroup is the default. > Harald Judt writes: > >> On 2023-10-31 13:21, Philip Kaludercic wrote: >> >> [...] >> >>> @@ -225,7 +220,7 @@ This is the same as `dired-do-flagged-delete', but calls >>> (defvar dired-duplicates-map >>> (let ((map (make-sparse-keymap))) >>> - ;; workaround for Emacs bug #57565 >>> + ;; workaround for Emacs bug#57565 >>> (when (< emacs-major-version 29) >>> (define-key map (kbd "x") 'dired-duplicates--do-flagged-delete) >>> (define-key map (kbd "D") 'dired-duplicates--do-delete)) >> >> BTW: What's the reasoning behind this suggested change? > > I thought that this would be necessary for bug-reference-mode to work, > but apparently it can detect both of these references. Yes, whether space or not, it doesn't matter, at least in Emacs-29.1. But that aside, bug reference mode wants to take me to a page on my project (https://codeberg.org/bug/issues/57565) which does not exist because it is an *Emacs* bug that is referenced in the comment, not a bug in dired-duplicates. > Eli Zaretskii writes: > >>> From: Philip Kaludercic >>> Cc: emacs-devel@gnu.org >>> Date: Tue, 31 Oct 2023 12:21:51 +0000 >>> >>> In the meantime, here is some quick and superficial feedback on the code: >> >> What I would like to ask is whether Harald tried to calculate SHA256 >> using Emacs's own primitives, instead of relying on an external >> program, which may or may not be installed. > > That is a good point, but I can imagine that one reason for using an > external tool is that if the user has a specific program they wish to > use that is not available as an Emacs primitive? I have not tried to calculate SHA256 using Emacs's own primitives, but not because I did not want to. The reasoning behind my decision is the following: dired-duplicates does support local files and *remote* files - via TRAMP. For local files using Emacs's own primitives could be OK performance-wise, I have not tested this. But if one starts to use them for fetching files via TRAMP which means over network, then things could slow down to a crawl pretty fast (it even takes quite a while collecting size stats over the network). So it is not Emacs's SHA256 calculation that would be a problem here, but instead slow transport over network. If the host does the calculation, this should be a lot faster (though that of course depends on whether the files are local files on the host of course and not some mounted network resource). I think it is reasonable to assume that one would *not* want to download large video files over the network to check if they are duplicates. The second thing is customizability of course, and probably necessary because of my reasoning explained in the previous paragraph. The remote host might have md5sum installed, but not sha256sum. So I wanted to provide a way for the user to configure the checksum command. I do not object against implementing a fallback as Eli suggested; I simply have not implemented it because I did not know about Emacs's primitives, and at that same time my reasoning about remote files started and so that is the way it is now. Regards, Harald -- `Experience is the best teacher.' PGP Key ID: 4FFFAB21B8580ABD Fingerprint: E073 6DD8 FF40 9CF2 0665 11D4 4FFF AB21 B858 0ABD