unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#44210: 28.0.50; project.el failed to work after customizing find-program to fd
@ 2020-10-25 11:26 Zhiwei Chen
  2020-10-26 22:37 ` Dmitry Gutov
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Zhiwei Chen @ 2020-10-25 11:26 UTC (permalink / raw)
  To: 44210

The arguments passed to `find-program' in the function
`project--files-in-directory' are hard-coded, which prevents customizing
`find-program' to a tool with a different command-line syntax, such as fd.

`counsel-file-jump' also uses `find-program', but it provides
`counsel-file-jump-args', which I think is a better approach.

-- 
Zhiwei Chen





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#44210: 28.0.50; project.el failed to work after customizing find-program to fd
  2020-10-25 11:26 bug#44210: 28.0.50; project.el failed to work after customizing find-program to fd Zhiwei Chen
@ 2020-10-26 22:37 ` Dmitry Gutov
  2021-01-10  3:31 ` Zhiwei Chen
  2021-01-11 13:04 ` Zhiwei Chen
  2 siblings, 0 replies; 8+ messages in thread
From: Dmitry Gutov @ 2020-10-26 22:37 UTC (permalink / raw)
  To: Zhiwei Chen, 44210

Hi!

On 25.10.2020 13:26, Zhiwei Chen wrote:
> The arguments of `find-program' in function
> `project--files-in-directory' is hard coded, which disallows customizing
> `find-program' in some means.

The arguments are not hardcoded (they are constructed dynamically), but 
their format is: it is the one expected by 'find'.

'fd' uses a different argument format, both for the "globs to search 
for" and for the list of ignores. I wish grep.el had a more flexible 
mechanism that let the user choose the tool used to list files in a 
directory (and the search tool, and so on).

> `counsel-file-jump` uses `find-program' and provides
> `counsel-file-jump-args' which I thought is better.

A variable with a flat list of args won't do here, because we actually 
have to turn two other lists (FILES and IGNORES) into appropriate arguments.

What you could do is write a full :override advice on 
'project--files-in-directory' which would construct a proper command 
line for 'fd' based on these args, then call it and pipe the output 
through 'project--remote-file-names' (like 'project--files-in-directory' 
currently does). Then benchmark both and post the results here.

If the result offers a meaningfully better performance, while honoring 
all ignores, we'll see what we can do to accommodate 'fd'.
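
A minimal sketch of such an advice follows. It is only an illustration under stated assumptions, not project.el code. All `my-' names are hypothetical. It assumes fd's --exclude patterns follow gitignore-like rules (so a pattern rooted with "./" becomes one rooted with "/"), and that fd's --glob accepts brace alternation for several globs. It also passes an argument list to `process-file' instead of building a shell string, which sidesteps shell quoting.

```elisp
;;; -*- lexical-binding: t -*-
(require 'project)

(defun my-fd-ignore-to-exclude (ignore)
  "Translate IGNORE, a `project-ignores'-style glob, for fd's --exclude.
Assumption: fd reads --exclude patterns with gitignore-like rules,
so an entry rooted with \"./\" is rewritten to start with \"/\"."
  (if (string-prefix-p "./" ignore)
      (substring ignore 1)
    ignore))

(defun my-fd-arguments (dir ignores &optional files)
  "Return the argument list for an fd run listing files under DIR.
IGNORES is a list of globs to skip.  FILES is nil or a string of
space-separated globs, as `project--files-in-directory' expects."
  (let ((localdir (file-name-as-directory
                   (file-local-name (expand-file-name dir))))
        (globs (and files (split-string files))))
    (append
     ;; --hidden and --no-ignore make fd visit what find visits, so
     ;; that only IGNORES decides what is skipped.
     (list "--hidden" "--no-ignore" "--type" "f" "--print0")
     (mapcan (lambda (ignore)
               (list "--exclude" (my-fd-ignore-to-exclude ignore)))
             ignores)
     (if globs
         ;; Assumption: one glob with brace alternation covers FILES.
         (list "--glob" (concat "{" (mapconcat #'identity globs ",") "}"))
       ;; "." is a regexp that matches every file name.
       (list "."))
     (list localdir))))

(defun my-project--files-in-directory-fd (dir ignores &optional files)
  "Hypothetical fd-based stand-in for `project--files-in-directory'."
  (let ((default-directory (file-name-as-directory dir)))
    ;; `project--remote-file-names' is the helper named in this thread;
    ;; it may not exist in other project.el versions.
    (project--remote-file-names
     (sort (split-string
            (with-output-to-string
              (with-current-buffer standard-output
                ;; Stdout goes to the buffer, stderr is discarded.
                (apply #'process-file "fd" nil '(t nil) nil
                       (my-fd-arguments dir ignores files))))
            "\0" t)
           #'string<))))

;; Evaluate this form to enable the override:
;; (advice-add 'project--files-in-directory :override
;;             #'my-project--files-in-directory-fd)
```

Keeping the argument construction in its own function means it can be checked without fd installed, for example by evaluating (my-fd-arguments "/tmp/proj" '(".git/") "*.c *.h") and inspecting the list.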






* bug#44210: 28.0.50; project.el failed to work after customizing find-program to fd
  2020-10-25 11:26 bug#44210: 28.0.50; project.el failed to work after customizing find-program to fd Zhiwei Chen
  2020-10-26 22:37 ` Dmitry Gutov
@ 2021-01-10  3:31 ` Zhiwei Chen
  2021-01-10  3:37   ` Zhiwei Chen
  2021-01-11 13:04 ` Zhiwei Chen
  2 siblings, 1 reply; 8+ messages in thread
From: Zhiwei Chen @ 2021-01-10  3:31 UTC (permalink / raw)
  To: 44210@debbugs.gnu.org; +Cc: dgutov


Sorry for the late reply. Here are the benchmark results.

The result is promising: ‘fd’ is about 3x faster than ‘find’.

(benchmark 5 '(project--files-in-directory "~/Workspace/llvm-project" '(".git")))
;;=> "Elapsed time: 9.401258s (0.097027s in 1 GCs)"

(benchmark 5 '(project--files-in-directory-fd "~/Workspace/llvm-project" '(".git")))
;;=> "Elapsed time: 2.759160s (0.105133s in 1 GCs)"

Here `project--files-in-directory' is the original version in project.el, and `project--files-in-directory-fd' is a copy of it modified to use ‘fd’.

The definition of `project--files-in-directory-fd' follows:

(defun project--files-in-directory-fd (dir ignores &optional files)
  (require 'find-dired)
  (require 'xref)
  (defvar find-name-arg)
  (let* ((default-directory dir)
         ;; Make sure ~/ etc. in local directory name is
         ;; expanded and not left for the shell command
         ;; to interpret.
         (localdir (file-local-name (expand-file-name dir)))
         (command (format "%s . %s %s --type f %s --print0"
                          "fd"
                          ;; In case DIR is a symlink.
                          (file-name-as-directory localdir)
                          ""
                          (if files
                              (concat (shell-quote-argument "(")
                                      " " find-name-arg " "
                                      (mapconcat
                                       #'shell-quote-argument
                                       (split-string files)
                                       (concat " -o " find-name-arg " "))
                                      " "
                                      (shell-quote-argument ")"))
                            ""))))
    (message command)
    (project--remote-file-names
     (sort (split-string (shell-command-to-string command) "\0" t)
           #'string<))))

--
Zhiwei Chen




* bug#44210: 28.0.50; project.el failed to work after customizing find-program to fd
  2021-01-10  3:31 ` Zhiwei Chen
@ 2021-01-10  3:37   ` Zhiwei Chen
  2021-01-10 17:48     ` Dmitry Gutov
  0 siblings, 1 reply; 8+ messages in thread
From: Zhiwei Chen @ 2021-01-10  3:37 UTC (permalink / raw)
  To: 44210@debbugs.gnu.org; +Cc: condy0919@gmail.com, Dmitry Gutov


+myself

--
Zhiwei Chen



* bug#44210: 28.0.50; project.el failed to work after customizing find-program to fd
  2021-01-10  3:37   ` Zhiwei Chen
@ 2021-01-10 17:48     ` Dmitry Gutov
  2021-01-18  1:15       ` Zhiwei Chen
  0 siblings, 1 reply; 8+ messages in thread
From: Dmitry Gutov @ 2021-01-10 17:48 UTC (permalink / raw)
  To: Zhiwei Chen, 44210@debbugs.gnu.org; +Cc: condy0919@gmail.com

Hi!

On 10.01.2021 05:37, Zhiwei Chen wrote:
> (defun project--files-in-directory-fd (dir ignores &optional files)
>    (require 'find-dired)
>    (require 'xref)
>    (defvar find-name-arg)
>    (let* ((default-directory dir)
>           ;; Make sure ~/ etc. in local directory name is
>           ;; expanded and not left for the shell command
>           ;; to interpret.
>           (localdir (file-local-name (expand-file-name dir)))
>           (command (format "%s . %s %s --type f %s --print0"
>                            "fd"
>                            ;; In case DIR is a symlink.
>                            (file-name-as-directory localdir)
>                            ""
>                            (if files
>                                (concat (shell-quote-argument "(")
>                                        " " find-name-arg " "
>                                        (mapconcat
>                                         #'shell-quote-argument
>                                         (split-string files)
>                                         (concat " -o " find-name-arg " "))
>                                        " "
>                                        (shell-quote-argument ")"))
>                              ""))))
>      (message command)
>      (project--remote-file-names
>       (sort (split-string (shell-command-to-string command) "\0" t)
>             #'string<))))

That code doesn't seem to handle the IGNORES argument at all, which 
could lead to an imbalanced comparison, though I don't know whether it 
does in this example (with just one ignored dir). You could try passing 
no ignores to both functions.
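
A like-for-like comparison along those lines could be sketched as below. The helper name `my-compare-file-listers' is hypothetical. It calls each lister with no ignores and also reports how many files each one returned, since differing counts would suggest the two are not doing the same work.

```elisp
;;; -*- lexical-binding: t -*-
(require 'benchmark)

(defun my-compare-file-listers (dir listers &optional repetitions)
  "Time each function in LISTERS on DIR with no ignores.
Each function is called as (FN DIR nil), REPETITIONS times (default 5).
Return a list of (FN ELAPSED-SECONDS FILE-COUNT) entries."
  ;; `benchmark-run' only accepts a literal number or a symbol as its
  ;; repetition count, hence the variable N.
  (let ((n (or repetitions 5)))
    (mapcar (lambda (fn)
              (let* ((files nil)
                     (timing (benchmark-run n
                               (setq files (funcall fn dir nil)))))
                (list fn (car timing) (length files))))
            listers)))
```

With the two functions from this thread it would be called as (my-compare-file-listers "~/Workspace/llvm-project" (list #'project--files-in-directory #'project--files-in-directory-fd)).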

It's weird, though. I have just tried both functions, and there was no 
perceptible performance difference (in a different project, though; in 
gecko-dev).

What are the versions of said programs on your machine? Mine:

$ find --version
find (GNU findutils) 4.7.0

$ fdfind --version
fd 7.4.0






* bug#44210: 28.0.50; project.el failed to work after customizing find-program to fd
  2020-10-25 11:26 bug#44210: 28.0.50; project.el failed to work after customizing find-program to fd Zhiwei Chen
  2020-10-26 22:37 ` Dmitry Gutov
  2021-01-10  3:31 ` Zhiwei Chen
@ 2021-01-11 13:04 ` Zhiwei Chen
  2 siblings, 0 replies; 8+ messages in thread
From: Zhiwei Chen @ 2021-01-11 13:04 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 44210@debbugs.gnu.org, condy0919@gmail.com


I benchmarked it again on Linux, with find 4.7.0 and fd 8.2.1.

The page cache was cleared before each benchmark.

> sudo sysctl -w vm.drop_caches=3

> cd llvm-project

> sudo sysctl -w vm.drop_caches=3
> time fd > /tmp/fd_output
1.04s user 4.11s system 522% cpu 0.987 total

> sudo sysctl -w vm.drop_caches=3
> time find > /tmp/find_output
0.06s user 0.20s system 7% cpu 3.354 total

Since ‘fd’ is a multi-threaded program, its CPU usage is above 100%.
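
The arithmetic behind those figures can be checked with a small sketch. The function names are hypothetical; the inputs are the timings printed above. CPU usage is (user + system) / wall time, which gives roughly 522% for fd and just under 8% for find (the shell presumably works from unrounded times), and the wall-clock ratio comes out at about 3.4x.

```elisp
(defun my-cpu-percent (user system wall)
  "Return CPU usage in percent for USER and SYSTEM seconds over WALL seconds."
  (* 100.0 (/ (+ user system) wall)))

(defun my-speedup (slow fast)
  "Return how many times faster FAST seconds is than SLOW seconds."
  (/ (float slow) fast))

(my-cpu-percent 1.04 4.11 0.987)   ; fd: more than one core busy
(my-cpu-percent 0.06 0.20 3.354)   ; find: mostly waiting on I/O
(my-speedup 3.354 0.987)           ; wall-clock ratio, find vs. fd
```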

--
Zhiwei Chen




* bug#44210: 28.0.50; project.el failed to work after customizing find-program to fd
  2021-01-10 17:48     ` Dmitry Gutov
@ 2021-01-18  1:15       ` Zhiwei Chen
  2021-01-18  3:09         ` Dmitry Gutov
  0 siblings, 1 reply; 8+ messages in thread
From: Zhiwei Chen @ 2021-01-18  1:15 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 44210@debbugs.gnu.org, Zhiwei Chen


I think I replied to the wrong thread, so I am forwarding it again.

> I benchmarked it again on Linux, with find 4.7.0 and fd 8.2.1.
> 
> The page cache was cleared before each benchmark.
> 
> > sudo sysctl -w vm.drop_caches=3
> 
> > cd llvm-project
> 
> > sudo sysctl -w vm.drop_caches=3
> > time fd > /tmp/fd_output
> 1.04s user 4.11s system 522% cpu 0.987 total
> 
> > sudo sysctl -w vm.drop_caches=3
> > time find > /tmp/find_output
> 0.06s user 0.20s system 7% cpu 3.354 total
> 
> Since ‘fd’ is a multi-threaded program, its CPU usage is above 100%.

-- 
Zhiwei Chen






* bug#44210: 28.0.50; project.el failed to work after customizing find-program to fd
  2021-01-18  1:15       ` Zhiwei Chen
@ 2021-01-18  3:09         ` Dmitry Gutov
  0 siblings, 0 replies; 8+ messages in thread
From: Dmitry Gutov @ 2021-01-18  3:09 UTC (permalink / raw)
  To: Zhiwei Chen; +Cc: 44210@debbugs.gnu.org, Zhiwei Chen

Hi!

You didn't address my complaint about the ignored IGNORES argument. I 
was going to explain that, but that email got sidetracked, sorry.

In any case, I can't reproduce your results even with the latest fd.

I don't have an LLVM checkout, though, just some other projects like 
the Linux kernel and gecko-dev. And 'find' is consistently 2x as fast here.

Still, I can believe that fd is going to be faster on some 
systems. To make it an "official" option, someone will need to write a 
version of project--files-in-directory that uses fd but honors the 
IGNORES argument, as well as FILES. Preferably with some tests. Then we 
can make the program used switchable.
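
One possible shape for the switchable part, with a test, is sketched below. It is an illustration only: the variables, the dispatcher and the `my-project--files-in-directory-fd' name are all hypothetical, and nothing like them exists in project.el as discussed here.

```elisp
;;; -*- lexical-binding: t -*-
(require 'ert)

(defvar my-project-files-program 'find
  "Hypothetical option naming the tool used to list project files.")

(defvar my-project-files-functions
  '((find . project--files-in-directory)
    (fd . my-project--files-in-directory-fd))
  "Hypothetical alist mapping a program symbol to its lister function.")

(defun my-project-list-files (dir ignores &optional files)
  "List files in DIR with the lister for `my-project-files-program'.
IGNORES and FILES are passed through unchanged, so every lister
has to honor both."
  (let ((lister (alist-get my-project-files-program
                           my-project-files-functions)))
    (unless lister
      (user-error "No file lister for %S" my-project-files-program))
    (funcall lister dir ignores files)))

(ert-deftest my-project-list-files-dispatch ()
  "The selected lister receives DIR, IGNORES and FILES unchanged."
  (let* ((calls nil)
         (my-project-files-functions
          ;; Stub listers that only record how they were called.
          `((find . ,(lambda (&rest args) (push (cons 'find args) calls) nil))
            (fd . ,(lambda (&rest args) (push (cons 'fd args) calls) nil))))
         (my-project-files-program 'fd))
    (my-project-list-files "/tmp/proj" '(".git/") "*.el")
    (should (equal calls '((fd "/tmp/proj" (".git/") "*.el"))))))
```

The test uses stub listers, so it runs without find or fd installed. Tests for a real fd lister would additionally need to compare its output against the find-based one on a fixture directory.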

