unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
@ 2023-07-19 21:16 Spencer Baugh
  2023-07-20  5:00 ` Eli Zaretskii
                   ` (3 more replies)
  0 siblings, 4 replies; 199+ messages in thread
From: Spencer Baugh @ 2023-07-19 21:16 UTC (permalink / raw)
  To: 64735


Several important commands and functions invoke find; for example rgrep
and project-find-regexp.

Most of these add some set of ignores to the find command, pulling from
grep-find-ignored-files in the former case.  So the find command looks
like:

find -H . \( -path \*/SCCS/\* -o -path \*/RCS/\* [...more ignores...] \)
-prune -o -type f -print0

Alas, on my system, using GNU find, these ignores slow down find by
about 15x on a large directory tree, taking it from around .5 seconds to
7.8 seconds.

This is very noticeable overhead; removing the ignores makes rgrep and
other find-invoking commands substantially faster for me.

The overhead is linear in the number of ignores - that is, each
additional ignore adds a small fixed cost.  This suggests that find is
linearly scanning the list of ignores and checking each one, rather than
optimizing them to a single regexp and checking that regexp.
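To illustrate the merge that find could do but doesn't, here is a sketch
in the shell (the tree layout and ignore patterns here are invented, and
-regextype assumes GNU find): a pile of per-pattern glob tests and a
single pre-merged ERE alternation select exactly the same files.

```shell
set -e
dir=$(mktemp -d)
mkdir -p "$dir/src/RCS"
touch "$dir/src/main.c" "$dir/src/main.o" "$dir/src/RCS/main.c,v"
# One glob test per ignore, as the generated command does today:
pergl=$(find "$dir" \( -path '*/RCS/*' -o -name '*.o' \) -prune \
          -o -type f -print | sort)
# The same ignores pre-merged into a single ERE alternation:
merged=$(find "$dir" -regextype posix-extended \
           \( -regex '.*/RCS/.*|.*\.o' \) -prune -o -type f -print | sort)
# Both forms should keep only src/main.c:
[ "$pergl" = "$merged" ] && echo identical-output
rm -rf "$dir"
```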

Obviously, GNU find should be optimizing this.  However, its developers
have previously said they will not; I commented on
https://savannah.gnu.org/bugs/index.php?58197 to request that they
rethink that decision.  Hopefully, as a fellow GNU project, they will be
interested in helping us...

In Emacs alone, there are a few things we could do:
- we could mitigate the find bug by optimizing the regexp before we pass
it to find; this should basically remove all the overhead but makes the
find command uglier and harder to edit
- we could remove rare and likely irrelevant things from
completion-ignored-extensions and vc-ignore-dir-regexp (which are used
to build these lists of ignores)
- we could use our own recursive directory-tree walking implementation
(directory-files-recursively), if we found a nice way to pipe its output
directly to grep etc without going through Lisp.  (This could be nice
for project-files, at least)
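On the third point, the shell plumbing for handing a file listing
straight to grep already exists; a minimal sketch (file names invented)
of the find | xargs | grep shape that rgrep's generated command also
uses, with no Lisp in the loop:

```shell
set -e
dir=$(mktemp -d)
printf 'needle\n' > "$dir/a.txt"
printf 'hay\n'    > "$dir/b.txt"
# NUL-separated listing piped directly into grep:
hits=$(find "$dir" -type f -print0 | xargs -0 grep -l 'needle')
# Only a.txt contains the pattern:
echo "${hits##*/}"
rm -rf "$dir"
```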

Incidentally, I tried a find alternative, "bfs"
(https://github.com/tavianator/bfs), and it doesn't optimize this
either, sadly, so it also has the 15x slowdown.



In GNU Emacs 29.0.92 (build 5, x86_64-pc-linux-gnu, X toolkit, cairo
 version 1.15.12, Xaw scroll bars) of 2023-07-10 built on

Repository revision: dd15432ffacbeff0291381c0109f5b1245060b1d
Repository branch: emacs-29
Windowing system distributor 'The X.Org Foundation', version 11.0.12011000
System Description: Rocky Linux 8.8 (Green Obsidian)

Configured using:
 'configure --config-cache --with-x-toolkit=lucid
 --with-gif=ifavailable'

Configured features:
CAIRO DBUS FREETYPE GLIB GMP GNUTLS GSETTINGS HARFBUZZ JPEG JSON
LIBSELINUX LIBXML2 MODULES NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP SOUND
SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS X11 XDBE XIM XINPUT2 XPM LUCID
ZLIB

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Shell

Memory information:
((conses 16 1939322 193013)
 (symbols 48 76940 49)
 (strings 32 337371 45355)
 (string-bytes 1 12322013)
 (vectors 16 148305)
 (vector-slots 8 3180429 187121)
 (floats 8 889 751)
 (intervals 56 152845 1238)
 (buffers 976 235)
 (heap 1024 978725 465480))





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-19 21:16 bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Spencer Baugh
@ 2023-07-20  5:00 ` Eli Zaretskii
  2023-07-20 12:22   ` sbaugh
  2023-07-20 12:38 ` Dmitry Gutov
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-20  5:00 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: 64735

> From: Spencer Baugh <sbaugh@janestreet.com>
> Date: Wed, 19 Jul 2023 17:16:31 -0400
> 
> 
> Several important commands and functions invoke find; for example rgrep
> and project-find-regexp.
> 
> Most of these add some set of ignores to the find command, pulling from
> grep-find-ignored-files in the former case.  So the find command looks
> like:
> 
> find -H . \( -path \*/SCCS/\* -o -path \*/RCS/\* [...more ignores...] \)
> -prune -o -type f -print0
> 
> Alas, on my system, using GNU find, these ignores slow down find by
> about 15x on a large directory tree, taking it from around .5 seconds to
> 7.8 seconds.
> 
> This is very noticeable overhead; removing the ignores makes rgrep and
> other find-invoking commands substantially faster for me.

grep-find-ignored-files is a customizable user option, so if this
slowdown bothers you, just customize it to avoid that.  And if there
are patterns there that are no longer pertinent or rare, we could
remove them from the default value.

I'm not sure we should bother more than these two simple measures.

> The overhead is linear in the number of ignores - that is, each
> additional ignore adds a small fixed cost.  This suggests that find is
> linearly scanning the list of ignores and checking each one, rather than
> optimizing them to a single regexp and checking that regexp.

If it uses fnmatch, it cannot do it any other way, I think.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20  5:00 ` Eli Zaretskii
@ 2023-07-20 12:22   ` sbaugh
  2023-07-20 12:42     ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: sbaugh @ 2023-07-20 12:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Spencer Baugh, 64735

Eli Zaretskii <eliz@gnu.org> writes:
>> From: Spencer Baugh <sbaugh@janestreet.com>
>> Date: Wed, 19 Jul 2023 17:16:31 -0400
>> 
>> 
>> Several important commands and functions invoke find; for example rgrep
>> and project-find-regexp.
>> 
>> Most of these add some set of ignores to the find command, pulling from
>> grep-find-ignored-files in the former case.  So the find command looks
>> like:
>> 
>> find -H . \( -path \*/SCCS/\* -o -path \*/RCS/\* [...more ignores...] \)
>> -prune -o -type f -print0
>> 
>> Alas, on my system, using GNU find, these ignores slow down find by
>> about 15x on a large directory tree, taking it from around .5 seconds to
>> 7.8 seconds.
>> 
>> This is very noticeable overhead; removing the ignores makes rgrep and
>> other find-invoking commands substantially faster for me.
>
> grep-find-ignored-files is a customizable user option, so if this
> slowdown bothers you, just customize it to avoid that.

I think the fact that the default behavior is very slow is bad.

> And if there are patterns there that are no longer pertinent or rare,
> we could remove them from the default value.

Sure!

So the thing to narrow down would be completion-ignored-extensions,
which is what populates grep-find-ignored-files.  Most things in that
list are irrelevant to most users, but all of them are relevant to some
users.

Most of these are language-specific things - e.g. there are a bunch of
Common Lisp compiled object (or something) extensions.

Perhaps we could modularize this, so that individual packages add things
to completion-ignored-extensions at load time.  Then
completion-ignored-extensions would only include things which are
relevant to a given user, as determined by what packages they load.

> I'm not sure we should bother more than these two simple measures.

Unfortunately those two simple measures help rgrep but they don't help
project-find-regexp (and other project.el commands using
project--files-in-directory such as project-find-file), since those
project commands pull their ignores from the version control system
through vc (not grep-find-ignored-files), and then pass them to find.

>> The overhead is linear in the number of ignores - that is, each
>> additional ignore adds a small fixed cost.  This suggests that find is
>> linearly scanning the list of ignores and checking each one, rather than
>> optimizing them to a single regexp and checking that regexp.
>
> If it uses fnmatch, it cannot do it any other way, I think.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-19 21:16 bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Spencer Baugh
  2023-07-20  5:00 ` Eli Zaretskii
@ 2023-07-20 12:38 ` Dmitry Gutov
  2023-07-20 13:20   ` Ihor Radchenko
  2023-07-21  2:42 ` Richard Stallman
  2023-07-22 10:18 ` Ihor Radchenko
  3 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-20 12:38 UTC (permalink / raw)
  To: Spencer Baugh, 64735

On 20/07/2023 00:16, Spencer Baugh wrote:
> In Emacs alone, there are a few things we could do:
> - we could mitigate the find bug by optimizing the regexp before we pass
> it to find; this should basically remove all the overhead but makes the
> find command uglier and harder to edit
> - we could remove rare and likely irrelevant things from
> completion-ignored-extensions and vc-ignore-dir-regexp (which are used
> to build these lists of ignores)

I like these two approaches.

> - we could use our own recursive directory-tree walking implementation
> (directory-files-recursively), if we found a nice way to pipe its output
> directly to grep etc without going through Lisp.  (This could be nice
> for project-files, at least)

This will probably not work as well. Last I checked, Lisp-native file 
listing was simply slower than 'find'.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 12:22   ` sbaugh
@ 2023-07-20 12:42     ` Dmitry Gutov
  2023-07-20 13:43       ` Spencer Baugh
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-20 12:42 UTC (permalink / raw)
  To: sbaugh, Eli Zaretskii; +Cc: Spencer Baugh, 64735

On 20/07/2023 15:22, sbaugh@catern.com wrote:
>> I'm not sure we should bother more than these two simple measures.
> Unfortunately those two simple measures help rgrep but they don't help
> project-find-regexp (and other project.el commands using
> project--files-in-directory such as project-find-file), since those
> project commands pull their ignores from the version control system
> through vc (not grep-find-ignored-files), and then pass them to find.

That's only a problem when the default file listing logic is used (and 
we usually delegate to something like 'git ls-files' instead, when the 
vc-aware backend is used).

Anyway, some optimization could be useful there too. The extra
difficulty, though, is that the entries in IGNORES can already come as
wildcards. Can we merge several wildcards? Though I suppose if we use a
regexp, we could construct an alternation anyway.

Another question it would be helpful to check is whether the different
versions of 'find' out there work fine with -regex instead of -name, and
don't get slowed down simply by using that feature. The old built-in
'find' on macOS, for example.
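For what it's worth, the dialect flags differ between implementations:
GNU find selects the regex flavor with -regextype, while BSD/macOS find
spells it -E, and in both -regex matches against the whole path. A
sketch (runnable with GNU find only; file names invented):

```shell
set -e
dir=$(mktemp -d)
touch "$dir/a.c" "$dir/a.o"
# GNU spelling; -regextype must appear before the -regex it affects.
kept=$(find "$dir" -regextype posix-extended \
         \( -regex '.*\.(o|elc)' \) -prune -o -type f -print)
# Only a.c survives the prune:
echo "${kept##*/}"
# BSD/macOS spelling would be: find -E "$dir" -regex '.*\.(o|elc)' ...
rm -rf "$dir"
```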






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 12:38 ` Dmitry Gutov
@ 2023-07-20 13:20   ` Ihor Radchenko
  2023-07-20 15:19     ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-20 13:20 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Spencer Baugh, 64735

Dmitry Gutov <dmitry@gutov.dev> writes:

> ... Last I checked, Lisp-native file 
> listing was simply slower than 'find'.

Could it be changed?
In my tests, I was able to improve performance of the built-in
`directory-files-recursively' simply by disabling
`file-name-handler-alist' around its call.

See https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/
(the thread also continues off-list, and it looks like there is a lot of
room for improvement in this area)

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 12:42     ` Dmitry Gutov
@ 2023-07-20 13:43       ` Spencer Baugh
  2023-07-20 18:54         ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Spencer Baugh @ 2023-07-20 13:43 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: sbaugh, Eli Zaretskii, 64735

Dmitry Gutov <dmitry@gutov.dev> writes:
> On 20/07/2023 15:22, sbaugh@catern.com wrote:
>>> I'm not sure we should bother more than these two simple measures.
>> Unfortunately those two simple measures help rgrep but they don't help
>> project-find-regexp (and other project.el commands using
>> project--files-in-directory such as project-find-file), since those
>> project commands pull their ignores from the version control system
>> through vc (not grep-find-ignored-files), and then pass them to find.
>
> That's only a problem when the default file listing logic is used (and
> we usually delegate to something like 'git ls-files' instead, when the
> vc-aware backend is used).

Hm, yes, but things like C-u project-find-regexp will use the default
find-based file listing logic instead of git ls-files, as do a few other
things.

I wonder, could we just go ahead and make a vc function which is
list-files(GLOBS) and returns a list of files?  Both git and hg support
this.  Then we could have C-u project-find-regexp use that instead of
find, by taking the cross product of dirs-to-search and
file-name-patterns-to-search.  (And this would let me delete a big chunk
of my own project backend, so I'd be happy to implement it.)

Fundamentally it seems a little silly for project-ignores to ever be
used for a vc project; if the vcs gives us ignores, we can probably just
ask the vcs to list the files too, and it will have an efficient
implementation of that.

If we do that uniformly, then this find slowness would only affect
transient projects, and transient projects pull their ignores from
grep-find-ignored-files just like rgrep, so improvements will more
easily be applied to both.  (And maybe we could even get rid of
project-ignores entirely, then?)

> Anyway, some optimization could be useful there too. The extra
> difficulty, though, is that the entries in IGNORES already can come as
> wildcards. Can we merge several wildcards? Though I suppose if we use
> a regexp, we could construct an alternation anyway.
>
> Another question it would be helpful to check is whether the
> different versions of 'find' out there work fine with -regex instead
> of -name, and don't get slowed down simply by using that feature. The
> old built-in 'find' on macOS, for example.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 13:20   ` Ihor Radchenko
@ 2023-07-20 15:19     ` Dmitry Gutov
  2023-07-20 15:42       ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-20 15:19 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Spencer Baugh, 64735

On 20/07/2023 16:20, Ihor Radchenko wrote:
> Dmitry Gutov <dmitry@gutov.dev> writes:
> 
>> ... Last I checked, Lisp-native file
>> listing was simply slower than 'find'.
> 
> Could it be changed?
> In my tests, I was able to improve performance of the built-in
> `directory-files-recursively' simply by disabling
> `file-name-handler-alist' around its call.

Then it won't work with Tramp, right? I think it's pretty nifty that 
project-find-regexp and dired-do-find-regexp work over Tramp.

> See https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/
> (the thread also continues off-list, and it looks like there is a lot of
> room for improvement in this area)

Does it get close enough to the performance of 'find' this way?

Also note that processing all matches in Lisp, with many ignore
entries, will incur a proportional overhead in Lisp, which might be
relatively slow as well.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 15:19     ` Dmitry Gutov
@ 2023-07-20 15:42       ` Ihor Radchenko
  2023-07-20 15:57         ` Dmitry Gutov
                           ` (2 more replies)
  0 siblings, 3 replies; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-20 15:42 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Spencer Baugh, 64735

Dmitry Gutov <dmitry@gutov.dev> writes:

>>> ... Last I checked, Lisp-native file
>>> listing was simply slower than 'find'.
>> 
>> Could it be changed?
>> In my tests, I was able to improve performance of the built-in
>> `directory-files-recursively' simply by disabling
>> `file-name-handler-alist' around its call.
>
> Then it won't work with Tramp, right? I think it's pretty nifty that 
> project-find-regexp and dired-do-find-regexp work over Tramp.

Sure. It might also be optimized, without trying to convince the find
devs to do something about regexp handling.

And things are not as horrible as the 15x slowdown in find.

>> See https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/
>> (the thread also continues off-list, and it looks like there is a lot of
>> room for improvement in this area)
>
> Does it get close enough to the performance of 'find' this way?

Comparable:

(ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (directory-files-recursively "/home/yantar92/.data" ""))))
;; Elapsed time: 0.633713s
(ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (let ((file-name-handler-alist)) (directory-files-recursively "/home/yantar92/.data" "")))))
;; Elapsed time: 0.324341s
;; time find /home/yantar92/.data >/dev/null
;; real	0m0.129s
;; user	0m0.017s
;; sys	0m0.111s

> Also note that processing all matches in Lisp, with many ignores 
> entries, will incur the proportional overhead in Lisp. Which might be 
> relatively slow as well.

Not significant.
I tried unwrapping the recursion in `directory-files-recursively' and
played around with regexp matching of the file list itself - no
significant impact compared to `file-name-handler-alist'.

I am pretty sure that Emacs's native file routines can be optimized to
the level of find.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 15:42       ` Ihor Radchenko
@ 2023-07-20 15:57         ` Dmitry Gutov
  2023-07-20 16:03           ` Ihor Radchenko
  2023-07-20 16:33         ` Eli Zaretskii
  2023-07-20 17:08         ` Spencer Baugh
  2 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-20 15:57 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Spencer Baugh, 64735

On 20/07/2023 18:42, Ihor Radchenko wrote:
> Dmitry Gutov <dmitry@gutov.dev> writes:
> 
>>>> ... Last I checked, Lisp-native file
>>>> listing was simply slower than 'find'.
>>>
>>> Could it be changed?
>>> In my tests, I was able to improve performance of the built-in
>>> `directory-files-recursively' simply by disabling
>>> `file-name-handler-alist' around its call.
>>
>> Then it won't work with Tramp, right? I think it's pretty nifty that
>> project-find-regexp and dired-do-find-regexp work over Tramp.
> 
> Sure. It might also be optimized. Without trying to convince find devs
> to do something about regexp handling.
> 
> And things are not as horrible as 15x slowdown in find.

We haven't compared to the "optimized regexps" solution in find, though.

>>> See https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/
>>> (the thread also continues off-list, and it looks like there is a lot of
>>> room for improvement in this area)
>>
>> Does it get close enough to the performance of 'find' this way?
> 
> Comparable:
> 
> (ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (directory-files-recursively "/home/yantar92/.data" ""))))
> ;; Elapsed time: 0.633713s
> (ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (let ((file-name-handler-alist)) (directory-files-recursively "/home/yantar92/.data" "")))))
> ;; Elapsed time: 0.324341s
> ;; time find /home/yantar92/.data >/dev/null
> ;; real	0m0.129s
> ;; user	0m0.017s
> ;; sys	0m0.111s

Still like 2.5x slower, then? That's significant.

>> Also note that processing all matches in Lisp, with many ignores
>> entries, will incur the proportional overhead in Lisp. Which might be
>> relatively slow as well.
> 
> Not significant.
> I tried to unwrap recursion in `directory-files-recursively' and tried
> to play around with regexp matching of the file list itself - no
> significant impact compared to `file-name-handler-alist'.

I suppose that can make sense, if find's slowdown is due to it issuing 
repeated 'stat' calls for every match.

> I am pretty sure that Emacs's native file routines can be optimized to
> the level of find.

I don't know, the GNU tools are often ridiculously optimized, at least
along certain code paths.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 15:57         ` Dmitry Gutov
@ 2023-07-20 16:03           ` Ihor Radchenko
  2023-07-20 18:56             ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-20 16:03 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Spencer Baugh, 64735

Dmitry Gutov <dmitry@gutov.dev> writes:

>> And things are not as horrible as 15x slowdown in find.
>
> We haven't compared to the "optimized regexps" solution in find, though.

Fair point.

> Still like 2.5x slower, then? That's significant.

It is, but it is workable if we try to optimize Emacs'
`directory-files'/`file-name-all-completions' internals.

>> I am pretty sure that Emacs's native file routines can be optimized to
>> the level of find.
>
> I don't know, the GNU tools are often ridiculously optimized. At least 
> certain file paths.

You are likely right.
Then what about applying the regexps manually to the full file list
returned by find?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 15:42       ` Ihor Radchenko
  2023-07-20 15:57         ` Dmitry Gutov
@ 2023-07-20 16:33         ` Eli Zaretskii
  2023-07-20 16:36           ` Ihor Radchenko
  2023-07-20 17:08         ` Spencer Baugh
  2 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-20 16:33 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: dmitry, 64735, sbaugh

> Cc: Spencer Baugh <sbaugh@janestreet.com>, 64735@debbugs.gnu.org
> From: Ihor Radchenko <yantar92@posteo.net>
> Date: Thu, 20 Jul 2023 15:42:17 +0000
> 
> I am pretty sure that Emacs's native file routines can be optimized to
> the level of find.

Maybe it can be improved, but not to the same level as Find, because
consing Lisp strings, something that Find doesn't do, does have its
overhead.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 16:33         ` Eli Zaretskii
@ 2023-07-20 16:36           ` Ihor Radchenko
  2023-07-20 16:45             ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-20 16:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

>> I am pretty sure that Emacs's native file routines can be optimized to
>> the level of find.
>
> Maybe it can be improved, but not to the same level as Find, because
> consing Lisp strings, something that Find doesn't do, does have its
> overhead.

I am not sure if this specific issue is important.
If we want to use find from Emacs, we would need to create Emacs
strings when reading the output of find anyway.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 16:36           ` Ihor Radchenko
@ 2023-07-20 16:45             ` Eli Zaretskii
  2023-07-20 17:23               ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-20 16:45 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: dmitry, 64735, sbaugh

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: dmitry@gutov.dev, sbaugh@janestreet.com, 64735@debbugs.gnu.org
> Date: Thu, 20 Jul 2023 16:36:28 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Maybe it can be improved, but not to the same level as Find, because
> > consing Lisp strings, something that Find doesn't do, does have its
> > overhead.
> 
> I am not sure if this specific issue is important.
> If we want to use find from Emacs, we would need to create Emacs
> strings when reading the output of find anyway.

So how do you explain that using Find is faster than using
find-lisp.el?

I think the answer is that using Find as a subprocess conses less.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 15:42       ` Ihor Radchenko
  2023-07-20 15:57         ` Dmitry Gutov
  2023-07-20 16:33         ` Eli Zaretskii
@ 2023-07-20 17:08         ` Spencer Baugh
  2023-07-20 17:24           ` Eli Zaretskii
                             ` (2 more replies)
  2 siblings, 3 replies; 199+ messages in thread
From: Spencer Baugh @ 2023-07-20 17:08 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Dmitry Gutov, 64735

Ihor Radchenko <yantar92@posteo.net> writes:
> Dmitry Gutov <dmitry@gutov.dev> writes:
>
>>>> ... Last I checked, Lisp-native file
>>>> listing was simply slower than 'find'.
>>> 
>>> Could it be changed?
>>> In my tests, I was able to improve performance of the built-in
>>> `directory-files-recursively' simply by disabling
>>> `file-name-handler-alist' around its call.
>>
>> Then it won't work with Tramp, right? I think it's pretty nifty that 
>> project-find-regexp and dired-do-find-regexp work over Tramp.
>
> Sure. It might also be optimized. Without trying to convince find devs
> to do something about regexp handling.

Not to derail too much, but find as a subprocess has one substantial
advantage over find in Lisp: It can run in parallel with Emacs, so that
we actually use multiple CPU cores.

Between that, and the remote support part, I personally much prefer find
to be a subprocess rather than in Lisp.  I don't think optimizing
directory-files-recursively is a great solution.

(Really it's entirely plausible that Emacs could be improved by
*removing* directory-files-recursively, in favor of invoking find as a
subprocess: faster, parallelized execution, and better remote support.)






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 16:45             ` Eli Zaretskii
@ 2023-07-20 17:23               ` Ihor Radchenko
  2023-07-20 18:24                 ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-20 17:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

>> I am not sure if this specific issue is important.
>> If we want to use find from Emacs, we would need to create Emacs
>> string/strings when reading the output of find anyway.
>
> So how do you explain that using Find is faster than using
> find-lisp.el?
>
> I think the answer is that using Find as a subprocess conses less.

No. The difference comes from the excessive regexp matching Emacs does
in `file-name-handler-alist':

(ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (find-lisp-find-files "/home/yantar92/.data" ""))))
;; Elapsed time: 2.982393s
(ignore (let ((gc-cons-threshold most-positive-fixnum) file-name-handler-alist) (benchmark-progn (find-lisp-find-files "/home/yantar92/.data" ""))))
;; Elapsed time: 0.784461s


    22.83%  emacs         emacs                            [.] Fnconc
    10.01%  emacs         emacs                            [.] Fexpand_file_name
     9.22%  emacs         emacs                            [.] eval_sub
     3.47%  emacs         emacs                            [.] assq_no_quit
     2.68%  emacs         emacs                            [.] getenv_internal_1
     2.50%  emacs         emacs                            [.] mem_insert.isra.0
     2.24%  emacs         emacs                            [.] Fassq
     2.02%  emacs         emacs                            [.] set_buffer_internal_2


(ignore (let ((gc-cons-threshold most-positive-fixnum) file-name-handler-alist) (benchmark-progn (find-lisp-find-files "/home/yantar92/.data" ""))))
;; Elapsed time: 0.624987s

    12.39%  emacs         emacs                                    [.] eval_sub
    12.07%  emacs         emacs                                    [.] Fexpand_file_name
     4.97%  emacs         emacs                                    [.] assq_no_quit
     4.11%  emacs         emacs                                    [.] getenv_internal_1
     2.77%  emacs         emacs                                    [.] set_buffer_internal_2
     2.61%  emacs         emacs                                    [.] mem_insert.isra.0
     2.47%  emacs         emacs                                    [.] make_clear_multibyte_string.part.0

A non-recursive version of `find-lisp-find-files-internal' is below,
though it provides limited improvement.

(defun find-lisp-find-files-internal (directory file-predicate
						directory-predicate)
  "Find files under DIRECTORY which satisfy FILE-PREDICATE.
FILE-PREDICATE is a function which takes two arguments: the file and its
directory.

DIRECTORY-PREDICATE is used to decide whether to descend into directories.
It is a function which takes two arguments, the directory and its parent."
  (setq directory (file-name-as-directory directory))
  (let (results fullname (dirs (list (expand-file-name directory))))
    (while dirs
      (setq directory (pop dirs))
      (dolist (file (directory-files directory nil nil t))
	(setq fullname (concat directory file))
	(when (file-readable-p fullname)
	  ;; If a directory, check it we should descend into it
	  (and (file-directory-p fullname)
               (setq fullname (concat fullname "/"))
               (funcall directory-predicate file directory)
               (push fullname dirs))
	  ;; For all files and directories, call the file predicate
	  (and (funcall file-predicate file directory)
               (push fullname results)))))
    results))

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 17:08         ` Spencer Baugh
@ 2023-07-20 17:24           ` Eli Zaretskii
  2023-07-22  6:35             ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-07-20 17:25           ` Ihor Radchenko
  2023-07-22  6:39           ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-20 17:24 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: dmitry, yantar92, 64735

> Cc: Dmitry Gutov <dmitry@gutov.dev>, 64735@debbugs.gnu.org
> From: Spencer Baugh <sbaugh@janestreet.com>
> Date: Thu, 20 Jul 2023 13:08:24 -0400
> 
> (Really it's entirely plausible that Emacs could be improved by
> *removing* directory-files-recursively, in favor of invoking find as a
> subprocess: faster, parallelized execution, and better remote support.)

No, there's no reason to remove anything that useful from Emacs.  If
this or that API is not the optimal choice for some job, it is easy
enough not to use it.


* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 17:08         ` Spencer Baugh
  2023-07-20 17:24           ` Eli Zaretskii
@ 2023-07-20 17:25           ` Ihor Radchenko
  2023-07-21 19:31             ` Spencer Baugh
  2023-07-22  6:39           ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-20 17:25 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: Dmitry Gutov, 64735

Spencer Baugh <sbaugh@janestreet.com> writes:

>> Sure. It might also be optimized. Without trying to convince find devs
>> to do something about regexp handling.
>
> Not to derail too much, but find as a subprocess has one substantial
> advantage over find in Lisp: It can run in parallel with Emacs, so that
> we actually use multiple CPU cores.

Does find use multiple CPU cores?




* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 17:23               ` Ihor Radchenko
@ 2023-07-20 18:24                 ` Eli Zaretskii
  2023-07-20 18:29                   ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-20 18:24 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: dmitry, 64735, sbaugh

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: dmitry@gutov.dev, sbaugh@janestreet.com, 64735@debbugs.gnu.org
> Date: Thu, 20 Jul 2023 17:23:22 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> I am not sure if this specific issue is important.
> >> If we want to use find from Emacs, we would need to create Emacs
> >> string/strings when reading the output of find anyway.
> >
> > So how do you explain that using Find is faster than using
> > find-lisp.el?
> >
> > I think the answer is that using Find as a subprocess conses less.
> 
> No. It uses less excessive regexp matching Emacs is trying to do in
> file-name-handler-alist.

Where do you see regexp matching in the profiles you provided?


* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 18:24                 ` Eli Zaretskii
@ 2023-07-20 18:29                   ` Ihor Radchenko
  2023-07-20 18:43                     ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-20 18:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

>> No. It uses less excessive regexp matching Emacs is trying to do in
>> file-name-handler-alist.
>
> Where do you see regexp matching in the profiles you provided?

I did the analysis earlier for `directory-files-recursively'. See
https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/

Just to be sure, here is perf data for
(ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (find-lisp-find-files "/home/yantar92/.data" ""))))

    54.89%  emacs    emacs                            [.] re_match_2_internal
    10.19%  emacs    emacs                            [.] re_search_2
     3.35%  emacs    emacs                            [.] unbind_to
     3.02%  emacs    emacs                            [.] compile_pattern
     3.02%  emacs    emacs                            [.] execute_charset
     3.00%  emacs    emacs                            [.] process_mark_stack
     1.59%  emacs    emacs                            [.] plist_get
     1.26%  emacs    emacs                            [.] RE_SETUP_SYNTAX_TABLE_FOR_OBJECT
     1.17%  emacs    emacs                            [.] update_syntax_table
     1.02%  emacs    emacs                            [.] Fexpand_file_name

Disabling `file-name-handler-alist' cuts the time more than 2x.
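
For reference, the comparison can be reproduced along these lines (a
sketch only; the directory is illustrative and the absolute numbers will
vary by machine and tree):

```elisp
(require 'benchmark)
(require 'find-lisp)

;; Time the same traversal with and without file name handlers.
(dolist (handlers (list file-name-handler-alist nil))
  (let ((file-name-handler-alist handlers)
        (gc-cons-threshold most-positive-fixnum))
    (benchmark-progn
      (find-lisp-find-files "~/some/large/tree" ""))))
```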



* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 18:29                   ` Ihor Radchenko
@ 2023-07-20 18:43                     ` Eli Zaretskii
  2023-07-20 18:57                       ` Ihor Radchenko
  2023-07-21  7:45                       ` Michael Albinus
  0 siblings, 2 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-20 18:43 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: dmitry, 64735, sbaugh

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: dmitry@gutov.dev, sbaugh@janestreet.com, 64735@debbugs.gnu.org
> Date: Thu, 20 Jul 2023 18:29:43 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> No. It uses less excessive regexp matching Emacs is trying to do in
> >> file-name-handler-alist.
> >
> > Where do you see regexp matching in the profiles you provided?
> 
> I did the analysis earlier for `directory-files-recursively'. See
> https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/
> 
> Just to be sure, here is perf data for
> (ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (find-lisp-find-files "/home/yantar92/.data" ""))))
> 
>     54.89%  emacs    emacs                            [.] re_match_2_internal
>     10.19%  emacs    emacs                            [.] re_search_2
>      3.35%  emacs    emacs                            [.] unbind_to
>      3.02%  emacs    emacs                            [.] compile_pattern
>      3.02%  emacs    emacs                            [.] execute_charset
>      3.00%  emacs    emacs                            [.] process_mark_stack
>      1.59%  emacs    emacs                            [.] plist_get
>      1.26%  emacs    emacs                            [.] RE_SETUP_SYNTAX_TABLE_FOR_OBJECT
>      1.17%  emacs    emacs                            [.] update_syntax_table
>      1.02%  emacs    emacs                            [.] Fexpand_file_name
> 
> Disabling `file-name-handler-alist' cuts the time more than 2x.

Disabling file-handlers is inconceivable in Emacs.  And I suspect that
find-file-name-handler is mostly called not from directory-files, but
from expand-file-name -- another call that cannot possibly be bypassed
in Emacs, since Emacs behaves as if CWD were different for each
buffer.  And expand-file-name also conses file names.  And then we
have encoding and decoding file names, something that with Find we do
much less.  Etc. etc.


* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 13:43       ` Spencer Baugh
@ 2023-07-20 18:54         ` Dmitry Gutov
  0 siblings, 0 replies; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-20 18:54 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: sbaugh, Eli Zaretskii, 64735

On 20/07/2023 16:43, Spencer Baugh wrote:

>> That's only a problem when the default file listing logic is used (and
>> we usually delegate to something like 'git ls-files' instead, when the
>> vc-aware backend is used).
> 
> Hm, yes, but things like C-u project-find-regexp will use the default
> find-based file listing logic instead of git ls-files, as do a few other
> things.

Right.

> I wonder, could we just go ahead and make a vc function which is
> list-files(GLOBS) and returns a list of files?  Both git and hg support
> this.  Then we could have C-u project-find-regexp use that instead of
> find, by taking the cross product of dirs-to-search and
> file-name-patterns-to-search.  (And this would let me delete a big chunk
> of my own project backend, so I'd be happy to implement it.)

I started out on this inside the branch scratch/project-regen. Didn't 
have time to dedicate to it recently, but the basics are there, take a 
look (the method is called project-files-filtered).

The difficulty with making such changes is that the project protocol 
grows in size; it becomes difficult for a user to understand what is 
mandatory, what is obsolete, and how to use it, especially in the face of 
backward-compatibility requirements.

Take a look; feedback is welcome and should help move this forward. We 
should also transition to returning relative file names when possible, 
for performance (optionally or always).

> Fundamentally it seems a little silly for project-ignores to ever be
> used for a vc project; if the vcs gives us ignores, we can probably just
> ask the vcs to list the files too, and it will have an efficient
> implementation of that.

Possibly, yes. But there will likely remain cases where project-files 
stays useful for callers, e.g. to construct some bigger command line for 
some new feature. Though perhaps we'll be able to drop that need by 
extracting the best possible performance from project-files (using a 
process object or some abstraction), to facilitate low-overhead piping.

> If we do that uniformly, then this find slowness would only affect
> transient projects, and transient projects pull their ignores from
> grep-find-ignored-files just like rgrep, so improvements will more
> easily be applied to both.  (And maybe we could even get rid of
> project-ignores entirely, then?)

Regarding removing it, see above. And it'll take a number of years 
anyway ;-(


* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 16:03           ` Ihor Radchenko
@ 2023-07-20 18:56             ` Dmitry Gutov
  2023-07-21  9:14               ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-20 18:56 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Spencer Baugh, 64735

On 20/07/2023 19:03, Ihor Radchenko wrote:
> Dmitry Gutov<dmitry@gutov.dev>  writes:
> 
>>> And things are not as horrible as 15x slowdown in find.
>> We haven't compared to the "optimized regexps" solution in find, though.
> Fair point.
> 
>> Still like 2.5x slower, then? That's significant.
> It is, but it is workable if we try to optimize Emacs'
> `directory-files'/`file-name-all-completions' internals.
> 
>>> I am pretty sure that Emacs's native file routines can be optimized to
>>> the level of find.
>> I don't know, the GNU tools are often ridiculously optimized. At least
>> certain file paths.

Sorry, I meant "code paths" here.

> You are likely right.
> Then, what about applying regexps manually, on the full file list
> returned by find?

It will almost certainly be slower in cases where several (few) ignore 
entries help drop whole big directories from traversal.


* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 18:43                     ` Eli Zaretskii
@ 2023-07-20 18:57                       ` Ihor Radchenko
  2023-07-21 12:37                         ` Dmitry Gutov
  2023-07-21  7:45                       ` Michael Albinus
  1 sibling, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-20 18:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

>> Disabling `file-name-handler-alist' cuts the time more than 2x.
>
> Disabling file-handlers is inconceivable in Emacs.

Indeed. But we are talking about Emacs find vs. GNU find here.
In the scenarios where GNU find can be used, it is also safe to disable
file handlers, AFAIU.

> ... And I suspect that
> find-file-name-handler is mostly called not from directory-files, but
> from expand-file-name -- another call that cannot possibly be bypassed
> in Emacs, since Emacs behaves as if CWD were different for each
> buffer.  And expand-file-name also conses file names.  And then we
> have encoding and decoding file names, something that with Find we do
> much less.  Etc. etc.

expand-file-name indeed calls Ffind_file_name_handler multiple times.
And what is worse:
(1) `find-lisp-find-files-internal' calls `expand-file-name' on every
    file in Lisp, even when it is already expanded (which it is, for
    every sub-directory);
(2) `directory-files' calls Fexpand_file_name again, on an already
    expanded directory name;
(3) `directory-files' calls Ffind_file_name_handler yet again on top of
    what was already done by Fexpand_file_name;
(4) `directory-files' calls `directory_files_internal', which calls
    `Fdirectory_file_name', which searches `Ffind_file_name_handler' one
    more time.

There is a huge amount of repetitive calling of Ffind_file_name_handler
going on. The results could at least be cached or re-used.
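
To illustrate the caching idea, one could memoize handler lookups per
file name and operation. This is only a sketch of the proposed reuse
(the `my/` names are made up for illustration): the repetitive calls
described above happen at the C level and would not go through a Lisp
shim like this.

```elisp
(defvar my/fnh-cache (make-hash-table :test #'equal)
  "Cache mapping (FILENAME . OPERATION) to a handler, or nil.")

(defun my/cached-find-file-name-handler (filename operation)
  "Like `find-file-name-handler', but cache the result."
  (let ((key (cons filename operation)))
    (pcase (gethash key my/fnh-cache 'miss)
      ;; Distinguish "not cached yet" from a cached nil result.
      ('miss (puthash key (find-file-name-handler filename operation)
                      my/fnh-cache))
      (cached cached))))
```

Such a cache would of course have to be invalidated whenever
`file-name-handler-alist' changes (e.g. on any let-binding), which is
part of why this is not trivial.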

I do not see much of encoding and consing present in perf stats.



* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-19 21:16 bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Spencer Baugh
  2023-07-20  5:00 ` Eli Zaretskii
  2023-07-20 12:38 ` Dmitry Gutov
@ 2023-07-21  2:42 ` Richard Stallman
  2023-07-22  2:39   ` Richard Stallman
  2023-07-22 10:18 ` Ihor Radchenko
  3 siblings, 1 reply; 199+ messages in thread
From: Richard Stallman @ 2023-07-21  2:42 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: 64735

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

I will take a look at this.

In case they are reluctant because of being busy, would anyone like
to help out by writing the code to do the optimization?

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)


* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 18:43                     ` Eli Zaretskii
  2023-07-20 18:57                       ` Ihor Radchenko
@ 2023-07-21  7:45                       ` Michael Albinus
  2023-07-21 10:46                         ` Eli Zaretskii
  1 sibling, 1 reply; 199+ messages in thread
From: Michael Albinus @ 2023-07-21  7:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, Ihor Radchenko, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

Hi Eli,

>> Disabling `file-name-handler-alist' cuts the time more than 2x.
>
> Disabling file-handlers is inconceivable in Emacs.

Agreed.

However, the fattest regexps in file-name-handler-alist are those for
tramp-archive-file-name-handler, tramp-completion-file-name-handler and
tramp-file-name-handler. Somewhere else I have proposed writing a macro,
without-remote-files, and a command, inhibit-remote-files, which disable
Tramp and remove its file name handlers from file-name-handler-alist,
either temporarily or permanently.

Users can call the command if they know for sure that they never use
remote files. Authors could use the macro when they know for sure that
they are working on local files only.
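
A minimal sketch of what the proposed macro might look like (the name
comes from the proposal above; recognizing Tramp's entries by the
handler symbol's name prefix is an assumption of this sketch, not
necessarily how a real implementation would do it):

```elisp
(require 'cl-lib)

(defmacro without-remote-files (&rest body)
  "Evaluate BODY with Tramp's file name handlers disabled."
  `(let ((file-name-handler-alist
          (cl-remove-if (lambda (entry)
                          ;; Assumption: Tramp handlers are symbols
                          ;; whose names start with "tramp-".
                          (and (symbolp (cdr entry))
                               (string-prefix-p
                                "tramp-" (symbol-name (cdr entry)))))
                        file-name-handler-alist))
         (tramp-mode nil))
     ,@body))
```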

WDYT?

Best regards, Michael.


* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 18:56             ` Dmitry Gutov
@ 2023-07-21  9:14               ` Ihor Radchenko
  0 siblings, 0 replies; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21  9:14 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Spencer Baugh, 64735

Dmitry Gutov <dmitry@gutov.dev> writes:

>> You are likely right.
>> Then, what about applying regexps manually, on the full file list
>> returned by find?
>
> It will almost certainly be slower in cases where several (few) ignore 
> entries help drop whole big directories from traversal.

Right.
Then, what about limiting find to -maxdepth 1, filtering the output, and
re-running find on the matching entries?

It gets complicated though, and the extra overheads associated with
invoking a new process may not be worth it.
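
A shell sketch of this two-stage approach, assuming GNU find (where a
depth limit is spelled `-maxdepth 1`); the directory tree and the ignore
list below are purely illustrative:

```shell
set -eu
top=$(mktemp -d)
mkdir -p "$top/src" "$top/.git" "$top/build"
touch "$top/src/a.c" "$top/.git/config" "$top/build/a.o" "$top/top.txt"

# Stage 1: list only the direct children and drop ignored names cheaply
# (the ignore list here is illustrative).  Stage 2: re-run find on each
# surviving directory.
results=$(
  find "$top" -mindepth 1 -maxdepth 1 |
  grep -v -E '/(\.git|build)$' |
  while IFS= read -r entry; do
    if [ -d "$entry" ]; then
      # Recurse only into directories that survived the filter.
      find "$entry" -type f
    else
      printf '%s\n' "$entry"
    fi
  done
)
printf '%s\n' "$results"
```

The extra process spawned per surviving top-level directory is exactly
the overhead mentioned above.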



* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21  7:45                       ` Michael Albinus
@ 2023-07-21 10:46                         ` Eli Zaretskii
  2023-07-21 11:32                           ` Michael Albinus
  2023-07-21 12:38                           ` Dmitry Gutov
  0 siblings, 2 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-21 10:46 UTC (permalink / raw)
  To: Michael Albinus; +Cc: dmitry, yantar92, 64735, sbaugh

> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: Ihor Radchenko <yantar92@posteo.net>,  dmitry@gutov.dev,
>   64735@debbugs.gnu.org,  sbaugh@janestreet.com
> Date: Fri, 21 Jul 2023 09:45:26 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Disabling file-handlers is inconceivable in Emacs.
> 
> Agreed.
> 
> However, the fattest regexps in file-name-handler-alist are those for
> tramp-archive-file-name-handler, tramp-completion-file-name-handler and
> tramp-file-name-handler. Somewhere else I have proposed to write a macro
> without-remote-files and a command inhibit-remote-files, which disable
> Tramp and remove its file name handlers from file-name-handler-alist.
> Either temporarily, or permanent.
> 
> Users can call the command, if they know for sure they don't use remote
> files ever. Authors could use the macro in case they know for sure they
> are working over local files only.
> 
> WDYT?

How is this different from binding file-name-handler-alist to nil?
Tramp is nowadays the main consumer of this feature, and AFAIU your
suggestion above boils down to disabling Tramp.  If so, what is left?


* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 10:46                         ` Eli Zaretskii
@ 2023-07-21 11:32                           ` Michael Albinus
  2023-07-21 11:51                             ` Ihor Radchenko
  2023-07-21 12:39                             ` Eli Zaretskii
  2023-07-21 12:38                           ` Dmitry Gutov
  1 sibling, 2 replies; 199+ messages in thread
From: Michael Albinus @ 2023-07-21 11:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, yantar92, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

Hi Eli,

>> However, the fattest regexps in file-name-handler-alist are those for
>> tramp-archive-file-name-handler, tramp-completion-file-name-handler and
>> tramp-file-name-handler. Somewhere else I have proposed to write a macro
>> without-remote-files and a command inhibit-remote-files, which disable
>> Tramp and remove its file name handlers from file-name-handler-alist.
>> Either temporarily, or permanent.
>>
>> Users can call the command, if they know for sure they don't use remote
>> files ever. Authors could use the macro in case they know for sure they
>> are working over local files only.
>>
>> WDYT?
>
> How is this different from binding file-name-handler-alist to nil?
> Tramp is nowadays the main consumer of this feature, and AFAIU your
> suggestion above boils down to disabling Tramp.  If so, what is left?

jka-compr-handler, epa-file-handler and file-name-non-special are
left. All of them have their reasons.

And there are packages out in the wild which add other handlers, like
jarchive--file-name-handler and sweeprolog-file-name-handler (I've
checked only GNU and NonGNU ELPA). All of them would suffer from the
bind-file-name-handler-alist-to-nil trick. There's a reason we haven't
documented it in the manuals.

And this is just the case to handle it in Lisp code, with
without-remote-files. According to the last Emacs Survey, more than 50%
of Emacs users don't use Tramp, never ever. But they must live with the
useless checks in file-name-handler-alist for Tramp. All of them would
profit, if they add (inhibit-remote-files) in their .emacs file.

Best regards, Michael.


* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 11:32                           ` Michael Albinus
@ 2023-07-21 11:51                             ` Ihor Radchenko
  2023-07-21 12:01                               ` Michael Albinus
  2023-07-21 12:39                             ` Eli Zaretskii
  1 sibling, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 11:51 UTC (permalink / raw)
  To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh

Michael Albinus <michael.albinus@gmx.de> writes:

> And this is just the case to handle it in Lisp code, with
> without-remote-files. According to the last Emacs Survey, more than 50%
> of Emacs users don't use Tramp, never ever. But they must live with the
> useless checks in file-name-handler-alist for Tramp. All of them would
> profit, if they add (inhibit-remote-files) in their .emacs file.

May tramp only set file-name-handler-alist when a tramp command is
actually invoked?

Or, alternatively, may we fence the regexp matches in
`file-name-handler-alist' behind boolean switches?
I examined what the actual handlers do, and I can see
`jka-compr-inhibit', `epa-inhibit', `tramp-archive-enabled',
and `tramp-mode' are used to force-execute the original handler. If we
could make Emacs perform these checks earlier, the whole expensive
regexp matching phase could be bypassed.
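
A sketch of what fencing the regexp match behind such switches could
look like (the three-element entry format and the `my/` names are
hypothetical; only the inhibit variables come from the handlers
discussed above):

```elisp
(require 'pcase)

;; Hypothetical extended entry format: (REGEXP HANDLER INHIBIT-VAR).
(defvar my/handler-specs
  '(("\\.\\(gz\\|bz2\\|xz\\)\\'" jka-compr-handler jka-compr-inhibit)
    ("\\.gpg\\'" epa-file-handler epa-inhibit)))

(defun my/find-handler (filename)
  "Return the first non-inhibited handler matching FILENAME, or nil."
  (catch 'found
    (pcase-dolist (`(,regexp ,handler ,inhibit-var) my/handler-specs)
      ;; Check the cheap boolean before paying for the regexp match.
      (unless (and (boundp inhibit-var) (symbol-value inhibit-var))
        (when (string-match-p regexp filename)
          (throw 'found handler))))))
```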



* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 11:51                             ` Ihor Radchenko
@ 2023-07-21 12:01                               ` Michael Albinus
  2023-07-21 12:20                                 ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Michael Albinus @ 2023-07-21 12:01 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh

Ihor Radchenko <yantar92@posteo.net> writes:

Hi Ihor,

> May tramp only set file-name-handler-alist when a tramp command is
> actually invoked?

It's the other direction: Tramp is only invoked after a check in
file-name-handler-alist.

> Or, alternatively, may we fence the regexp matches in
> `file-name-handler-alist' behind boolean switches?
> I examined what the actual handlers do, and I can see
> `jka-compr-inhibit', `epa-inhibit', `tramp-archive-enabled',
> and `tramp-mode' are used to force-execute the original handler. If we
> could make Emacs perform these checks earlier, the whole expensive
> regexp matching phase could be bypassed.

Hmm, this would mean to extend the file-name-handler-alist spec. Instead
of a regexp to check, we would need to allow a function call or
alike. Don't know whether this pays for optimization.

And there is also the case, that due to inhibit-file-name-handlers and
inhibit-file-name-operation we can allow a remote file name operation
for a given function, and disable it for another function. Tramp uses
this mechanism. The general flag tramp-mode is not sufficient for this
scenario.

Best regards, Michael.


* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 12:01                               ` Michael Albinus
@ 2023-07-21 12:20                                 ` Ihor Radchenko
  2023-07-21 12:25                                   ` Ihor Radchenko
  2023-07-21 12:27                                   ` Michael Albinus
  0 siblings, 2 replies; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 12:20 UTC (permalink / raw)
  To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh

Michael Albinus <michael.albinus@gmx.de> writes:

>> Or, alternatively, may we fence the regexp matches in
>> `file-name-handler-alist' behind boolean switches?
>> I examined what the actual handlers do, and I can see
>> `jka-compr-inhibit', `epa-inhibit', `tramp-archive-enabled',
>> and `tramp-mode' are used to force-execute the original handler. If we
>> could make Emacs perform these checks earlier, the whole expensive
>> regexp matching phase could be bypassed.
>
> Hmm, this would mean to extend the file-name-handler-alist spec. Instead
> of a regexp to check, we would need to allow a function call or
> alike. Don't know whether this pays for optimization.

The question is: what is more costly
(a) matching complex regexp && call function or
(b) call function (lambda (fn) (when (and foo (match-string- ... fn)) ...))

> And there is also the case, that due to inhibit-file-name-handlers and
> inhibit-file-name-operation we can allow a remote file name operation
> for a given function, and disable it for another function. Tramp uses
> this mechanism. The general flag tramp-mode is not sufficient for this
> scenario.

I am not sure if I understand completely, but it does not appear that
this is used often during ordinary file operations that do not involve
tramp.



* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 12:20                                 ` Ihor Radchenko
@ 2023-07-21 12:25                                   ` Ihor Radchenko
  2023-07-21 12:46                                     ` Eli Zaretskii
  2023-07-21 12:27                                   ` Michael Albinus
  1 sibling, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 12:25 UTC (permalink / raw)
  To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh

Ihor Radchenko <yantar92@posteo.net> writes:

>> Hmm, this would mean to extend the file-name-handler-alist spec. Instead
>> of a regexp to check, we would need to allow a function call or
>> alike. Don't know whether this pays for optimization.
>
> The question is: what is more costly
> (a) matching complex regexp && call function or
> (b) call function (lambda (fn) (when (and foo (match-string- ... fn)) ...))

(benchmark-run-compiled 10000000 (string-match-p (caar file-name-handler-alist) "/path/to/very/deep/file"))
;; => (1.495432981 0 0.0)
(benchmark-run-compiled 10000000 (funcall (lambda (fn) (and nil (string-match-p (caar file-name-handler-alist) fn))) "/path/to/very/deep/file"))
;; => (0.42053276500000003 0 0.0)

Looks like even the funcall overhead is not as bad as invoking a regexp search.



* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 12:20                                 ` Ihor Radchenko
  2023-07-21 12:25                                   ` Ihor Radchenko
@ 2023-07-21 12:27                                   ` Michael Albinus
  2023-07-21 12:30                                     ` Ihor Radchenko
  1 sibling, 1 reply; 199+ messages in thread
From: Michael Albinus @ 2023-07-21 12:27 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh

Ihor Radchenko <yantar92@posteo.net> writes:

>> And there is also the case, that due to inhibit-file-name-handlers and
>> inhibit-file-name-operation we can allow a remote file name operation
>> for a given function, and disable it for another function. Tramp uses
>> this mechanism. The general flag tramp-mode is not sufficient for this
>> scenario.
>
> I am not sure if I understand completely, but it does not appear that
> this is used often during ordinary file operations that do not involve
> tramp.

Don't know, but it has been a documented feature for many decades. We
shouldn't destroy it intentionally, especially since we have an
alternative, as I have proposed with without-remote-files. What's wrong
with that? You could use it everywhere you currently let-bind
file-name-handler-alist to nil.

Best regards, Michael.


* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 12:27                                   ` Michael Albinus
@ 2023-07-21 12:30                                     ` Ihor Radchenko
  2023-07-21 13:04                                       ` Michael Albinus
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 12:30 UTC (permalink / raw)
  To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh

Michael Albinus <michael.albinus@gmx.de> writes:

>> I am not sure if I understand completely, but it does not appear that
>> this is used often during ordinary file operations that do not involve
>> tramp.
>
> Don't know, but it is a documented feature for many decades. We
> shouldn't destroy it intentionally. If we have an alternative, as I have
> proposed with without-remote-file-names. What's wrong with this? You
> could use it everywhere, where you let-bind file-name-handler-alist to
> nil these days.

The idea was to make things work faster without modifying third-party
code.

And what do you mean by destroy?



* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 18:57                       ` Ihor Radchenko
@ 2023-07-21 12:37                         ` Dmitry Gutov
  2023-07-21 12:58                           ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-21 12:37 UTC (permalink / raw)
  To: Ihor Radchenko, Eli Zaretskii; +Cc: sbaugh, 64735

On 20/07/2023 21:57, Ihor Radchenko wrote:
> Eli Zaretskii <eliz@gnu.org> writes:
> 
>>> Disabling `file-name-handler-alist' cuts the time more than 2x.
>>
>> Disabling file-handlers is inconceivable in Emacs.
> 
> Indeed. But we are talking about Emacs find vs. GNU find here.
> In the scenarios where GNU find can be used, it is also safe to disable
> file handlers, AFAIU.

GNU find can be used on a remote machine. In all the same cases as when 
it can be used on the local one.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 10:46                         ` Eli Zaretskii
  2023-07-21 11:32                           ` Michael Albinus
@ 2023-07-21 12:38                           ` Dmitry Gutov
  1 sibling, 0 replies; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-21 12:38 UTC (permalink / raw)
  To: Eli Zaretskii, Michael Albinus; +Cc: sbaugh, yantar92, 64735

On 21/07/2023 13:46, Eli Zaretskii wrote:
> How is this different from binding file-name-handler-alist to nil?
> Tramp is nowadays the main consumer of this feature, and AFAIU your
> suggestion above boils down to disabling Tramp.  If so, what is left?

I don't understand this either.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 11:32                           ` Michael Albinus
  2023-07-21 11:51                             ` Ihor Radchenko
@ 2023-07-21 12:39                             ` Eli Zaretskii
  2023-07-21 13:09                               ` Michael Albinus
  1 sibling, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-21 12:39 UTC (permalink / raw)
  To: Michael Albinus; +Cc: dmitry, yantar92, 64735, sbaugh

> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: yantar92@posteo.net,  dmitry@gutov.dev,  64735@debbugs.gnu.org,
>   sbaugh@janestreet.com
> Date: Fri, 21 Jul 2023 13:32:46 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Users can call the command, if they know for sure they don't use remote
> >> files ever. Authors could use the macro in case they know for sure they
> >> are working over local files only.
> >>
> >> WDYT?
> >
> > How is this different from binding file-name-handler-alist to nil?
> > Tramp is nowadays the main consumer of this feature, and AFAIU your
> > suggestion above boils down to disabling Tramp.  If so, what is left?
> 
> jka-compr-handler, epa-file-handler and file-name-non-special are
> left. All of them have their reason.

I know, but when I wrote that disabling file-handlers is
inconceivable, I meant remote files, not those other users of this
facility.

Let me rephrase: running Emacs commands with disabled support for
remote files is inconceivable.

IMO, if tests against file-name-handler-alist are a significant
performance problem, we should look for ways of solving it without
disabling remote files.

In general, disabling general-purpose Emacs features because they
cause slow-down should be the last resort, after we tried and failed
to use smarter solutions.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 12:25                                   ` Ihor Radchenko
@ 2023-07-21 12:46                                     ` Eli Zaretskii
  2023-07-21 13:01                                       ` Michael Albinus
  2023-07-21 13:17                                       ` Ihor Radchenko
  0 siblings, 2 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-21 12:46 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: dmitry, michael.albinus, 64735, sbaugh

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: dmitry@gutov.dev, Eli Zaretskii <eliz@gnu.org>, 64735@debbugs.gnu.org,
>  sbaugh@janestreet.com
> Date: Fri, 21 Jul 2023 12:25:29 +0000
> 
> Ihor Radchenko <yantar92@posteo.net> writes:
> 
> > The question is: what is more costly
> > (a) matching complex regexp && call function or
> > (b) call function (lambda (fn) (when (and foo (match-string- ... fn)) ...))
> 
> (benchmark-run-compiled 10000000 (string-match-p (caar file-name-handler-alist) "/path/to/very/deep/file"))
> ;; => (1.495432981 0 0.0)
> (benchmark-run-compiled 10000000 (funcall (lambda (fn) (and nil (string-match-p (caar file-name-handler-alist) fn))) "/path/to/very/deep/file"))
> ;; => (0.42053276500000003 0 0.0)
> 
> Looks like even funcall overheads are not as bad as invoking regexp search.

But "nil" is not a faithful emulation of the real test which will have
to be put there, is it?





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 12:37                         ` Dmitry Gutov
@ 2023-07-21 12:58                           ` Ihor Radchenko
  2023-07-21 13:00                             ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 12:58 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: sbaugh, Eli Zaretskii, 64735

Dmitry Gutov <dmitry@gutov.dev> writes:

>>> Disabling file-handlers is inconceivable in Emacs.
>> 
>> Indeed. But we are talking about Emacs find vs. GNU find here.
>> In the scenarios where GNU find can be used, it is also safe to disable
>> file handlers, AFAIU.
>
> GNU find can be used on a remote machine. In all the same cases as when 
> it can be used on the local one.

But GNU find does not take into account Emacs' file-handlers for each
directory when traversing directories. 

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 12:58                           ` Ihor Radchenko
@ 2023-07-21 13:00                             ` Dmitry Gutov
  2023-07-21 13:34                               ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-21 13:00 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: sbaugh, Eli Zaretskii, 64735

On 21/07/2023 15:58, Ihor Radchenko wrote:
> Dmitry Gutov<dmitry@gutov.dev>  writes:
> 
>>>> Disabling file-handlers is inconceivable in Emacs.
>>> Indeed. But we are talking about Emacs find vs. GNU find here.
>>> In the scenarios where GNU find can be used, it is also safe to disable
>>> file handlers, AFAIU.
>> GNU find can be used on a remote machine. In all the same cases as when
>> it can be used on the local one.
> But GNU find does not take into account Emacs' file-handlers for each
> directory when traversing directories.

Indeed. Such usage always assumes the initial invocation and each 
visited directory belong to the same remote host. Which is usually a 
correct assumption.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 12:46                                     ` Eli Zaretskii
@ 2023-07-21 13:01                                       ` Michael Albinus
  2023-07-21 13:23                                         ` Ihor Radchenko
  2023-07-21 13:17                                       ` Ihor Radchenko
  1 sibling, 1 reply; 199+ messages in thread
From: Michael Albinus @ 2023-07-21 13:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, Ihor Radchenko, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Ihor Radchenko <yantar92@posteo.net>
>> Cc: dmitry@gutov.dev, Eli Zaretskii <eliz@gnu.org>, 64735@debbugs.gnu.org,
>>  sbaugh@janestreet.com
>> Date: Fri, 21 Jul 2023 12:25:29 +0000
>>
>> Ihor Radchenko <yantar92@posteo.net> writes:
>>
>> > The question is: what is more costly
>> > (a) matching complex regexp && call function or
>> > (b) call function (lambda (fn) (when (and foo (match-string- ... fn)) ...))
>>
>> (benchmark-run-compiled 10000000 (string-match-p (caar file-name-handler-alist) "/path/to/very/deep/file"))
>> ;; => (1.495432981 0 0.0)
>> (benchmark-run-compiled 10000000 (funcall (lambda (fn) (and nil (string-match-p (caar file-name-handler-alist) fn))) "/path/to/very/deep/file"))
>> ;; => (0.42053276500000003 0 0.0)
>>
>> Looks like even funcall overheads are not as bad as invoking regexp search.
>
> But "nil" is not a faithful emulation of the real test which will have
> to be put there, is it?

Here are some other numbers. The definition of inhibit-remote-files and
without-remote-files is below.

--8<---------------cut here---------------start------------->8---
(length (directory-files-recursively "~/src" ""))
146121
--8<---------------cut here---------------end--------------->8---

A sufficiently large directory.

--8<---------------cut here---------------start------------->8---
(benchmark-run-compiled 1 (directory-files-recursively "~/src" ""))
(38.133906724000006 13 0.5019186470000001)
(benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/src" "")))
(32.944982886 13 0.5274874450000002)
--8<---------------cut here---------------end--------------->8---

That is indeed about 5 seconds of overhead just for the file name
handler regexp checks.

--8<---------------cut here---------------start------------->8---
(benchmark-run-compiled 1 (without-remote-files (directory-files-recursively "~/src" "")))
(33.261659676 13 0.5338916200000003)
--8<---------------cut here---------------end--------------->8---

Removing just the Tramp file name handlers comes near to let-binding
file-name-handler-alist.

--8<---------------cut here---------------start------------->8---
(inhibit-remote-files)
nil
(benchmark-run-compiled 1 (directory-files-recursively "~/src" ""))
(34.344226758000005 13 0.5421030509999998)
--8<---------------cut here---------------end--------------->8---

And that's for the innocent users, who aren't aware of the Tramp
overhead and who don't need it. As said, that's ~50% of Emacs users.
Just adding (inhibit-remote-files) to .emacs gives them a performance
boost, w/o touching any other code.

--8<---------------cut here---------------start------------->8---
;;;###autoload
(progn (defun inhibit-remote-files ()
  "Deactivate remote file names."
  (interactive)
  (when (fboundp 'tramp-cleanup-all-connections)
    (funcall 'tramp-cleanup-all-connections))
  (tramp-unload-file-name-handlers)
  (setq tramp-mode nil)))

;;;###autoload
(progn (defmacro without-remote-files (&rest body)
  "Deactivate remote file names temporarily.
Run BODY."
  (declare (indent 0) (debug ((form body) body)))
  `(let ((file-name-handler-alist (copy-tree file-name-handler-alist))
         tramp-mode)
     (tramp-unload-file-name-handlers)
     ,@body)))
--8<---------------cut here---------------end--------------->8---
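A hedged usage sketch for the macro above (the function name and
directory are made up for illustration): callers that might see remote
names can guard it with `file-remote-p', as also suggested elsewhere in
this thread.

```elisp
;; Sketch only, assuming the `without-remote-files' definition above.
;; Skip the Tramp handlers only when DIR is known to be local.
(defun my-fast-listing (dir)
  (if (file-remote-p dir)
      (directory-files-recursively dir "")
    (without-remote-files
      (directory-files-recursively dir ""))))
```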

Best regards, Michael.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 12:30                                     ` Ihor Radchenko
@ 2023-07-21 13:04                                       ` Michael Albinus
  2023-07-21 13:24                                         ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Michael Albinus @ 2023-07-21 13:04 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh

Ihor Radchenko <yantar92@posteo.net> writes:

>>> I am not sure if I understand completely, but it does not appear that
>>> this is used often during ordinary file operations that do not involve
>>> tramp.
>>
>> Don't know, but it is a documented feature for many decades. We
>> shouldn't destroy it intentionally, especially since we have an
>> alternative, as I have proposed with without-remote-file-names. What's
>> wrong with that? You could use it everywhere you let-bind
>> file-name-handler-alist to nil these days.
>
> The idea was to make things work faster without modifying third-party
> code.

People already use (let (file-name-handler-alist) ...). As I have said,
this could have unexpected side effects. I propose to replace it with
(without-remote-files ...).

> And what do you mean by destroy?

"Destroy the feature".

Best regards, Michael.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 12:39                             ` Eli Zaretskii
@ 2023-07-21 13:09                               ` Michael Albinus
  0 siblings, 0 replies; 199+ messages in thread
From: Michael Albinus @ 2023-07-21 13:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, yantar92, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

>> > How is this different from binding file-name-handler-alist to nil?
>> > Tramp is nowadays the main consumer of this feature, and AFAIU your
>> > suggestion above boils down to disabling Tramp.  If so, what is left?
>>
>> jka-compr-handler, epa-file-handler and file-name-non-special are
>> left. All of them have their reason.
>
> I know, but when I wrote that disabling file-handlers is
> inconceivable, I meant remote files, not those other users of this
> facility.
>
> Let me rephrase: running Emacs commands with disabled support for
> remote files is inconceivable.

Agreed. My proposal was to provide a convenience macro to disable Tramp
when it is appropriate. Like

(unless (file-remote-p file) (without-remote-files ...))

Instead of binding file-name-handler-alist to nil, as it is the current
practice.

And the command inhibit-remote-files shall be applied only by users who
aren't interested in remote files at all. Again and again: these are
~50% of our users.

> IMO, if tests against file-name-handler-alist are a significant
> performance problem, we should look for ways of solving it without
> disabling remote files.

Sure. If there are proposals ...

Best regards, Michael.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 12:46                                     ` Eli Zaretskii
  2023-07-21 13:01                                       ` Michael Albinus
@ 2023-07-21 13:17                                       ` Ihor Radchenko
  1 sibling, 0 replies; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 13:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, michael.albinus, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

>> > The question is: what is more costly
>> > (a) matching complex regexp && call function or
>> > (b) call function (lambda (fn) (when (and foo (match-string- ... fn)) ...))
>> 
>> (benchmark-run-compiled 10000000 (string-match-p (caar file-name-handler-alist) "/path/to/very/deep/file"))
>> ;; => (1.495432981 0 0.0)
>> (benchmark-run-compiled 10000000 (funcall (lambda (fn) (and nil (string-match-p (caar file-name-handler-alist) fn))) "/path/to/very/deep/file"))
>> ;; => (0.42053276500000003 0 0.0)
>> 
>> Looks like even funcall overheads are not as bad as invoking regexp search.
>
> But "nil" is not a faithful emulation of the real test which will have
> to be put there, is it?

It is, at least in some cases. In other cases, it is list lookup, which
is also faster:

(benchmark-run-compiled 10000000 (funcall (lambda (fn) (and (get 'foo 'jka-compr) (string-match-p (caar file-name-handler-alist) fn))) "/path/to/very/deep/file"))
;; => (0.5831819149999999 0 0.0)

Let me go through default handlers one by one:

file-name-handler-alist is a variable defined in fileio.c.

Value
(("..." . jka-compr-handler)
 (".." . epa-file-handler)
 ("..." . tramp-archive-file-name-handler)
 ("..." . tramp-completion-file-name-handler)
 ("..." . tramp-file-name-handler)
 ("\\`/:" . file-name-non-special))

---- 1 -----

(defun jka-compr-handler (operation &rest args)
  (save-match-data
    (let ((jka-op (get operation 'jka-compr)))
      (if (and jka-op (not jka-compr-inhibit))
	  (apply jka-op args)
	(jka-compr-run-real-handler operation args)))))

This skips the real work when `get' fails, but it also makes an
unnecessary `save-match-data' call, which would be better placed inside
the `if'.
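A hedged sketch of that reordering (renamed so as not to claim it is the
installed definition), with `save-match-data' wrapping only the branch
that actually runs a jka-compr operation:

```elisp
;; Sketch: same logic as `jka-compr-handler' above, but
;; `save-match-data' now protects only the jka-compr branch,
;; so the fallback path pays no match-data save/restore cost.
(defun my-jka-compr-handler (operation &rest args)
  (let ((jka-op (get operation 'jka-compr)))
    (if (and jka-op (not jka-compr-inhibit))
        (save-match-data (apply jka-op args))
      (jka-compr-run-real-handler operation args))))
```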

---- 2 -----

(defun epa-file-handler (operation &rest args)
  (save-match-data
    (let ((op (get operation 'epa-file)))
      (if (and op (not epa-inhibit))
          (apply op args)
  	(epa-file-run-real-handler operation args)))))

This again checks `get' and also epa-inhibit (and again,
`save-match-data' is only needed for (apply op args)).

Side note: These handlers essentially force double handler lookup
without skipping already processed handlers when they decide that they
need to delegate to defaults.

---- 3 -----

    (if (not tramp-archive-enabled)
        ;; Unregister `tramp-archive-file-name-handler'.
        (progn
          (tramp-register-file-name-handlers)
          (tramp-archive-run-real-handler operation args))
          <...>

Note how this tries to remove itself from the handler list by testing a
boolean variable (nil by default!). However, this "self-removal" will
never happen unless we happen to query a file with a matching regexp. If
no archive file is accessed during the Emacs session (as is the case for
me), this branch of code will never be executed, and I am doomed to have
Emacs checking this handler's regexp forever.

------ 4 ------

(defun tramp-completion-file-name-handler (operation &rest args)
  "Invoke Tramp file name completion handler for OPERATION and ARGS.
Falls back to normal file name handler if no Tramp file name handler exists."
  (if-let
      ((fn (and tramp-mode minibuffer-completing-file-name
		(assoc operation tramp-completion-file-name-handler-alist))))
      (save-match-data (apply (cdr fn) args))
    (tramp-run-real-handler operation args)))

is checking for tramp-mode (t by default) and
minibuffer-completing-file-name (often nil).

-------- 5 --------

(defun tramp-file-name-handler (operation &rest args)
  "Invoke Tramp file name handler for OPERATION and ARGS.
Fall back to normal file name handler if no Tramp file name handler exists."
  (let ((filename (apply #'tramp-file-name-for-operation operation args))
     <...>
    (if (tramp-tramp-file-p filename) ;; <<--- always nil when tramp-mode is nil
    <do stuff>
    ;; When `tramp-mode' is not enabled, or the file name is quoted,
      ;; we don't do anything.
      (tramp-run-real-handler operation args))

this one is more complex, but does nothing when tramp-mode is nil.

--------- 6 -------

file-name-non-special is complex.
The only thing I noticed is that it binds tramp-mode as

(let ((tramp-mode (and tramp-mode (eq method 'local-copy))))

So other handlers that check the tramp-mode variable early would benefit
if they were able to do so as well.
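A hedged sketch of such an early check (the wrapper name is made up; it
is not how Tramp installs its handlers): a cheap guard that bails out on
the boolean before any regexp or alist work is done.

```elisp
;; Sketch: when `tramp-mode' is nil, skip straight to the real
;; handler; only pay the full Tramp dispatch cost when it is on.
(defun my-guarded-tramp-handler (operation &rest args)
  (if (and (boundp 'tramp-mode) tramp-mode)
      (apply #'tramp-file-name-handler operation args)
    (tramp-run-real-handler operation args)))
```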

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 13:01                                       ` Michael Albinus
@ 2023-07-21 13:23                                         ` Ihor Radchenko
  2023-07-21 15:31                                           ` Michael Albinus
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 13:23 UTC (permalink / raw)
  To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh

Michael Albinus <michael.albinus@gmx.de> writes:

> --8<---------------cut here---------------start------------->8---
> (benchmark-run-compiled 1 (directory-files-recursively "~/src" ""))
> (38.133906724000006 13 0.5019186470000001)
> (benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/src" "")))
> (32.944982886 13 0.5274874450000002)
> --8<---------------cut here---------------end--------------->8---

Interesting. Apparently my SSD is skewing the benchmark data on IO:

(length (directory-files-recursively "~/Git" ""))
;; => 113628

(benchmark-run-compiled 1 (directory-files-recursively "~/Git" ""))
;; => (1.756453226 2 0.7181273930000032)
(benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/Git" "")))
;; => (1.202790778 2 0.7401775709999896)

It would be interesting to see profiler and perf data for a more
detailed breakdown of where those 30+ seconds were spent.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 13:04                                       ` Michael Albinus
@ 2023-07-21 13:24                                         ` Ihor Radchenko
  2023-07-21 15:36                                           ` Michael Albinus
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 13:24 UTC (permalink / raw)
  To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh

Michael Albinus <michael.albinus@gmx.de> writes:

>> And what do you mean by destroy?
>
> "Destroy the feature".

I am sorry, but I still do not understand how what I proposed can lead
to any feature regression. Could you please elaborate?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 13:00                             ` Dmitry Gutov
@ 2023-07-21 13:34                               ` Ihor Radchenko
  2023-07-21 13:36                                 ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 13:34 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: sbaugh, Eli Zaretskii, 64735

Dmitry Gutov <dmitry@gutov.dev> writes:

> On 21/07/2023 15:58, Ihor Radchenko wrote:
>> Dmitry Gutov<dmitry@gutov.dev>  writes:
>> 
>>>>> Disabling file-handlers is inconceivable in Emacs.
>>>> Indeed. But we are talking about Emacs find vs. GNU find here.
>>>> In the scenarios where GNU find can be used, it is also safe to disable
>>>> file handlers, AFAIU.

So, we agree here? (I read your reply as a counter-argument to mine.)

>>> GNU find can be used on a remote machine. In all the same cases as when
>>> it can be used on the local one.
>> But GNU find does not take into account Emacs' file-handlers for each
>> directory when traversing directories.
>
> Indeed. Such usage always assumes the initial invocation and each 
> visited directory belong to the same remote host. Which is usually a 
> correct assumption.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 13:34                               ` Ihor Radchenko
@ 2023-07-21 13:36                                 ` Dmitry Gutov
  2023-07-21 13:46                                   ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-21 13:36 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: sbaugh, Eli Zaretskii, 64735

On 21/07/2023 16:34, Ihor Radchenko wrote:
> Dmitry Gutov<dmitry@gutov.dev>  writes:
> 
>> On 21/07/2023 15:58, Ihor Radchenko wrote:
>>> Dmitry Gutov<dmitry@gutov.dev>   writes:
>>>
>>>>>> Disabling file-handlers is inconceivable in Emacs.
>>>>> Indeed. But we are talking about Emacs find vs. GNU find here.
>>>>> In the scenarios where GNU find can be used, it is also safe to disable
>>>>> file handlers, AFAIU.
> So, we agree here? (I've read your reply as counter-argument to mine.)
> 

We don't, IIUC.

To use GNU find on a remote host, you need to have the file handlers 
enabled.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 13:36                                 ` Dmitry Gutov
@ 2023-07-21 13:46                                   ` Ihor Radchenko
  2023-07-21 15:41                                     ` Dmitry Gutov
  2023-07-23  5:40                                     ` Ihor Radchenko
  0 siblings, 2 replies; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 13:46 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: sbaugh, Eli Zaretskii, 64735

Dmitry Gutov <dmitry@gutov.dev> writes:

> On 21/07/2023 16:34, Ihor Radchenko wrote:
>> Dmitry Gutov<dmitry@gutov.dev>  writes:
>> 
>>> On 21/07/2023 15:58, Ihor Radchenko wrote:
>>>> Dmitry Gutov<dmitry@gutov.dev>   writes:
>>>>
>>>>>>> Disabling file-handlers is inconceivable in Emacs.
>>>>>> Indeed. But we are talking about Emacs find vs. GNU find here.
>>>>>> In the scenarios where GNU find can be used, it is also safe to disable
>>>>>> file handlers, AFAIU.
>> So, we agree here? (I've read your reply as counter-argument to mine.)
>> 
>
> We don't, IIUC.
>
> To use GNU find on a remote host, you need to have the file handlers 
> enabled.

Let me clarify then.
I was exploring the possibility to replace GNU find with
`find-lisp-find-files'.

Locally, AFAIU, running `find-lisp-find-files' without
`file-name-handler-alist' is equivalent to running GNU find.
(That was a reply to Eli's message that we cannot disable
`file-name-handler-alist')

On a remote host, I can see that `find-lisp-find-files' must use the
tramp entries in `file-name-handler-alist'. It will likely not be usable
there, though - running GNU find on the remote host is going to be
unbeatable compared to repeated TRAMP queries for file listings.
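The local comparison described above can be sketched as follows (the
directory name is made up; timings will of course vary by machine):

```elisp
;; Sketch: time find-lisp traversal with file name handlers
;; disabled vs. with the default handler list in place.
(require 'find-lisp)
(require 'benchmark)
(let ((dir "~/src"))
  (list
   (benchmark-run 1
     (let (file-name-handler-alist)
       (find-lisp-find-files dir "")))
   (benchmark-run 1
     (find-lisp-find-files dir ""))))
```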

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 13:23                                         ` Ihor Radchenko
@ 2023-07-21 15:31                                           ` Michael Albinus
  2023-07-21 15:38                                             ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Michael Albinus @ 2023-07-21 15:31 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh

Ihor Radchenko <yantar92@posteo.net> writes:

Hi Ihor,

> Interesting. Apparently my SSD is skewing the benchmark data on IO:
>
> (length (directory-files-recursively "~/Git" ""))
> ;; => 113628
>
> (benchmark-run-compiled 1 (directory-files-recursively "~/Git" ""))
> ;; => (1.756453226 2 0.7181273930000032)
> (benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/Git" "")))
> ;; => (1.202790778 2 0.7401775709999896)
>
> Would be interesting to see profiler and perf data for more detailed
> breakdown where those 30+ seconds where spent in.

I have no SSD. And maybe some of the files are NFS-mounted.

--8<---------------cut here---------------start------------->8---
[albinus@gandalf emacs]$ sudo lshw -class disk
  *-disk
       description: ATA Disk
       product: SK hynix SC311 S
       size: 476GiB (512GB)
--8<---------------cut here---------------end--------------->8---

My point was to show the differences in the approaches. Do you also have
numbers using without-remote-files and inhibit-remote-files?

Best regards, Michael.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 13:24                                         ` Ihor Radchenko
@ 2023-07-21 15:36                                           ` Michael Albinus
  2023-07-21 15:44                                             ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Michael Albinus @ 2023-07-21 15:36 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh

Ihor Radchenko <yantar92@posteo.net> writes:

Hi Ihor,

>>> And what do you mean by destroy?
>>
>> "Destroy the feature".
>
> I am sorry, but I still do not understand how what I proposed can lead
> to any feature regression. Could you please elaborate?

When you invoke a file name handler based on the value of a variable
like tramp-mode, either all file operations are enabled, or all are
disabled.

The mechanism with inhibit-file-name-{handlers,operation} allows you to
determine more fine-grained, which operation is allowed, and which is
suppressed.
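That mechanism can be sketched like this (the file name is made up):
suppress a single handler for a single operation, leaving Tramp and all
other handlers in effect.

```elisp
;; Sketch: disable only jka-compr for this one
;; `insert-file-contents' call; every other handler,
;; and every other operation, still goes through the alist.
(let ((inhibit-file-name-handlers
       (cons 'jka-compr-handler inhibit-file-name-handlers))
      (inhibit-file-name-operation 'insert-file-contents))
  (insert-file-contents "/tmp/log.gz"))
```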

Best regards, Michael.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 15:31                                           ` Michael Albinus
@ 2023-07-21 15:38                                             ` Ihor Radchenko
  2023-07-21 15:49                                               ` Michael Albinus
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 15:38 UTC (permalink / raw)
  To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh

Michael Albinus <michael.albinus@gmx.de> writes:

>> It would be interesting to see profiler and perf data for a more
>> detailed breakdown of where those 30+ seconds were spent.
>
> I have no SSD. And maybe some of the files are NFS-mounted.

That's why I asked about profiler data (from Emacs itself).

> My point was to show the differences in the approaches. Do you also have
> numbers using without-remote-files and inhibit-remote-files?

(length (directory-files-recursively "~/Git" ""))
;; => 113628
(benchmark-run-compiled 1 (directory-files-recursively "~/Git" ""))
;; => (1.597328425 1 0.47237324699997885)
(benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/Git" "")))
;; => (1.0012111910000001 1 0.4860752540000135)
(benchmark-run-compiled 1 (without-remote-files (directory-files-recursively "~/Git" "")))
;; => (1.147276594 1 0.48820330999998873)
(inhibit-remote-files)
(benchmark-run-compiled 1 (directory-files-recursively "~/Git" ""))
;; => (1.054041615 1 0.4141427399999884)


-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 13:46                                   ` Ihor Radchenko
@ 2023-07-21 15:41                                     ` Dmitry Gutov
  2023-07-21 15:48                                       ` Ihor Radchenko
  2023-07-23  5:40                                     ` Ihor Radchenko
  1 sibling, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-21 15:41 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: sbaugh, Eli Zaretskii, 64735

On 21/07/2023 16:46, Ihor Radchenko wrote:
> Dmitry Gutov <dmitry@gutov.dev> writes:
> 
>> On 21/07/2023 16:34, Ihor Radchenko wrote:
>>> Dmitry Gutov<dmitry@gutov.dev>  writes:
>>>
>>>> On 21/07/2023 15:58, Ihor Radchenko wrote:
>>>>> Dmitry Gutov<dmitry@gutov.dev>   writes:
>>>>>
>>>>>>>> Disabling file-handlers is inconceivable in Emacs.
>>>>>>> Indeed. But we are talking about Emacs find vs. GNU find here.
>>>>>>> In the scenarios where GNU find can be used, it is also safe to disable
>>>>>>> file handlers, AFAIU.
>>> So, we agree here? (I read your reply as a counter-argument to mine.)
>>>
>>
>> We don't, IIUC.
>>
>> To use GNU find on a remote host, you need to have the file handlers
>> enabled.
> 
> Let me clarify then.
> I was exploring the possibility to replace GNU find with
> `find-lisp-find-files'.
> 
> Locally, AFAIU, running `find-lisp-find-files' without
> `file-name-handler-alist' is equivalent to running GNU find.
> (That was a reply to Eli's message that we cannot disable
> `file-name-handler-alist')

But it's slower! At least 2x, even with file handlers disabled,
according to your own measurements on a modern SSD (not to mention
all of our users with spinning media).

> Although, it will likely not
> be usable then - running GNU find on a remote host is going to be
> unbeatable compared to repetitive TRAMP queries for file listing.

That's right.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 15:36                                           ` Michael Albinus
@ 2023-07-21 15:44                                             ` Ihor Radchenko
  0 siblings, 0 replies; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 15:44 UTC (permalink / raw)
  To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh

Michael Albinus <michael.albinus@gmx.de> writes:

>> I am sorry, but I still do not understand how what I proposed can lead
>> to any feature regression. Could you please elaborate?
>
> When you invoke a file name handler based on the value of a variable
> like tramp-mode, either all file operations are enabled, or all are
> disabled.
>
> The mechanism with inhibit-file-name-{handlers,operation} allows you to
> determine more fine-grained, which operation is allowed, and which is
> suppressed.

I did not mean to remove the existing mechanisms.
I just wanted to allow an additional check _before_ matching the filename
against a regexp. (And I demonstrated that such a check is generally
faster than invoking a regexp search.)

Also, note that `inhibit-file-name-handlers' could then be implemented
without the need to match every single handler against the
`inhibit-file-name-handlers' list. Emacs could instead have a
handler-enabled-p flag that can be trivially let-bound. Checking a flag
is much faster than (memq handler inhibit-file-name-handlers).

Of course, the existing mechanism should be left for backward
compatibility.
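Concretely, the flag idea could look roughly like this (a hypothetical
sketch, not existing Emacs API; the `my-' names are illustrative):

```elisp
;; Sketch: guard the handler lookup behind a cheap boolean check, so
;; disabling handlers costs one `let' binding instead of matching the
;; file name against every regexp in `file-name-handler-alist'.
(defvar my-file-name-handlers-enabled t
  "When nil, skip the `file-name-handler-alist' lookup entirely.")

(defun my-find-file-name-handler (filename operation)
  "Like `find-file-name-handler', but short-circuit on a flag.
Checking a boolean is much cheaper than regexp-matching FILENAME
against every entry in `file-name-handler-alist'."
  (and my-file-name-handlers-enabled
       (find-file-name-handler filename operation)))
```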

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 15:41                                     ` Dmitry Gutov
@ 2023-07-21 15:48                                       ` Ihor Radchenko
  2023-07-21 19:53                                         ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 15:48 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: sbaugh, Eli Zaretskii, 64735

Dmitry Gutov <dmitry@gutov.dev> writes:

>> Locally, AFAIU, running `find-lisp-find-files' without
>> `file-name-handler-alist' is equivalent to running GNU find.
>> (That was a reply to Eli's message that we cannot disable
>> `file-name-handler-alist')
>
> But it's slower! At least 2x, even with file handlers disabled,
> according to your own measurements on a modern SSD (not to mention
> all of our users with spinning media).

Yes, but (1) there is room for optimization; (2) I have a hope that we
can implement better "ignores" when using `find-lisp-find-files', thus
eventually outperforming GNU find (when used with a large number of
ignores).
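One way such better "ignores" could work is to prune ignored directories
during the traversal itself, so their contents are never listed at all
(a rough sketch; the function name is illustrative):

```elisp
;; Sketch: skip ignored directories while walking, instead of
;; filtering the complete file list afterwards.  Pruning a directory
;; avoids listing (and consing) everything underneath it.
(defun my-walk-files (dir ignored-dirs)
  "Recursively list files under DIR, pruning IGNORED-DIRS by name."
  (let (files)
    (dolist (f (directory-files dir t directory-files-no-dot-files-regexp t))
      (if (file-directory-p f)
          (unless (member (file-name-nondirectory f) ignored-dirs)
            (setq files (nconc files (my-walk-files f ignored-dirs))))
        (push f files)))
    files))

;; Example: (my-walk-files "~/Git" '(".git" ".hg" "SCCS" "RCS"))
```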

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 15:38                                             ` Ihor Radchenko
@ 2023-07-21 15:49                                               ` Michael Albinus
  2023-07-21 15:55                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Michael Albinus @ 2023-07-21 15:49 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh

Ihor Radchenko <yantar92@posteo.net> writes:

Hi Ihor,

>> My point was to show the differences in the approaches. Do you also have
>> numbers using without-remote-files and inhibit-remote-files?
>
> (length (directory-files-recursively "~/Git" ""))
> ;; => 113628
> (benchmark-run-compiled 1 (directory-files-recursively "~/Git" ""))
> ;; => (1.597328425 1 0.47237324699997885)
> (benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/Git" "")))
> ;; => (1.0012111910000001 1 0.4860752540000135)
> (benchmark-run-compiled 1 (without-remote-files (directory-files-recursively "~/Git" "")))
> ;; => (1.147276594 1 0.48820330999998873)
> (inhibit-remote-files)
> (benchmark-run-compiled 1 (directory-files-recursively "~/Git" ""))
> ;; => (1.054041615 1 0.4141427399999884)

Thanks a lot! These figures show that both without-remote-files and
inhibit-remote-files are useful. Of course, this shouldn't stop us from
looking for further approaches to performance optimization.

I'll wait a few days to see whether there's opposition before installing
them in master.

Best regards, Michael.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 15:49                                               ` Michael Albinus
@ 2023-07-21 15:55                                                 ` Eli Zaretskii
  2023-07-21 16:08                                                   ` Michael Albinus
  2023-07-21 16:15                                                   ` Ihor Radchenko
  0 siblings, 2 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-21 15:55 UTC (permalink / raw)
  To: Michael Albinus; +Cc: dmitry, yantar92, 64735, sbaugh

> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: Eli Zaretskii <eliz@gnu.org>,  dmitry@gutov.dev,  64735@debbugs.gnu.org,
>   sbaugh@janestreet.com
> Date: Fri, 21 Jul 2023 17:49:14 +0200
> 
> Ihor Radchenko <yantar92@posteo.net> writes:
> 
> Hi Ihor,
> 
> >> My point was to show the differences in the approaches. Do you also have
> >> numbers using without-remote-files and inhibit-remote-files?
> >
> > (length (directory-files-recursively "~/Git" ""))
> > ;; => 113628
> > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" ""))
> > ;; => (1.597328425 1 0.47237324699997885)
> > (benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/Git" "")))
> > ;; => (1.0012111910000001 1 0.4860752540000135)
> > (benchmark-run-compiled 1 (without-remote-files (directory-files-recursively "~/Git" "")))
> > ;; => (1.147276594 1 0.48820330999998873)
> > (inhibit-remote-files)
> > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" ""))
> > ;; => (1.054041615 1 0.4141427399999884)
> 
> Thanks a lot! These figures show that both without-remote-files and
> inhibit-remote-files are useful. Of course, this shouldn't stop us from
> looking for further approaches to performance optimization.
> 
> I'll wait a few days to see whether there's opposition before installing
> them in master.

Can you spell out what you intend to install?

The figures provided in this thread indicate speedups that are modest
at best, so I'm not sure they justify measures which could cause
problems (if that indeed could happen).






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 15:55                                                 ` Eli Zaretskii
@ 2023-07-21 16:08                                                   ` Michael Albinus
  2023-07-21 16:15                                                   ` Ihor Radchenko
  1 sibling, 0 replies; 199+ messages in thread
From: Michael Albinus @ 2023-07-21 16:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, yantar92, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

Hi Eli,

>> > (length (directory-files-recursively "~/Git" ""))
>> > ;; => 113628
>> > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" ""))
>> > ;; => (1.597328425 1 0.47237324699997885)
>> > (benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/Git" "")))
>> > ;; => (1.0012111910000001 1 0.4860752540000135)
>> > (benchmark-run-compiled 1 (without-remote-files (directory-files-recursively "~/Git" "")))
>> > ;; => (1.147276594 1 0.48820330999998873)
>> > (inhibit-remote-files)
>> > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" ""))
>> > ;; => (1.054041615 1 0.4141427399999884)
>>
>> Thanks a lot! These figures show that both without-remote-files and
>> inhibit-remote-files are useful. Of course, this shouldn't stop us from
>> looking for further approaches to performance optimization.
>>
>> I'll wait a few days to see whether there's opposition before installing
>> them in master.
>
> Can you spell out what you intend to install?

I intend to install without-remote-files and inhibit-remote-files, which
I have shown upthread. Plus documentation.

> The figures provided in this thread indicate speedups that are modest
> at best, so I'm not sure they justify measures which could cause
> problems (if that indeed could happen).

>> > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" ""))
>> > ;; => (1.597328425 1 0.47237324699997885)

1.59 seconds.

>> > (benchmark-run-compiled 1 (without-remote-files (directory-files-recursively "~/Git" "")))
>> > ;; => (1.147276594 1 0.48820330999998873)

28% performance boost.

>> > (inhibit-remote-files)
>> > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" ""))
>> > ;; => (1.054041615 1 0.4141427399999884)

34% performance boost.

I believe it is more than a modest speedup. And without-remote-files
mitigates problems that could arise from let-binding file-name-handler-alist.

Best regards, Michael.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 15:55                                                 ` Eli Zaretskii
  2023-07-21 16:08                                                   ` Michael Albinus
@ 2023-07-21 16:15                                                   ` Ihor Radchenko
  2023-07-21 16:38                                                     ` Eli Zaretskii
  1 sibling, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 16:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, Michael Albinus, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

>> > (length (directory-files-recursively "~/Git" ""))
>> > ;; => 113628
>> > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" ""))
>> > ;; => (1.597328425 1 0.47237324699997885)
>> > (benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/Git" "")))
>> > ;; => (1.0012111910000001 1 0.4860752540000135)
> ...
> The figures provided in this thread indicate speedups that are modest
> at best, so I'm not sure they justify measures which could cause
> problems (if that indeed could happen).

Not that modest. Basically, it all depends on how frequently the Emacs file
API is used. If we take `find-lisp-find-files', which triggers more file
handler lookups, the difference becomes more significant:

(benchmark-run-compiled 1 (find-lisp-find-files "/home/yantar92/.data" ""))
;; (3.853305824 4 0.9142656910000007)
(let (file-name-handler-alist) (benchmark-run-compiled 1 (find-lisp-find-files "/home/yantar92/.data" "")))
;; (1.545292093 4 0.9098995830000014)

In particular, `expand-file-name' is commonly used in the wild to ensure
that a given path is absolute. For a single file, it may not add much
overhead, but it is so common that I believe even relatively small
performance improvements would be worth it.

I am pretty sure that file name handlers are checked behind the scenes
by many other common operations.
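The handler-lookup share of such a hot call can be estimated directly
(results are machine-dependent; this only illustrates the measurement):

```elisp
;; Compare `expand-file-name' with and without handler lookup by
;; benchmarking it against a `let'-binding of `file-name-handler-alist'.
(benchmark-run-compiled 100000
  (expand-file-name "foo" "/tmp"))

(let (file-name-handler-alist)
  (benchmark-run-compiled 100000
    (expand-file-name "foo" "/tmp")))
```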

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 16:15                                                   ` Ihor Radchenko
@ 2023-07-21 16:38                                                     ` Eli Zaretskii
  2023-07-21 16:43                                                       ` Ihor Radchenko
  2023-07-21 16:43                                                       ` Michael Albinus
  0 siblings, 2 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-21 16:38 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: dmitry, michael.albinus, 64735, sbaugh

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: Michael Albinus <michael.albinus@gmx.de>, dmitry@gutov.dev,
>  64735@debbugs.gnu.org, sbaugh@janestreet.com
> Date: Fri, 21 Jul 2023 16:15:41 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > The figures provided in this thread indicate speedups that are modest
> > at best, so I'm not sure they justify measures which could cause
> > problems (if that indeed could happen).
> 
> Not that modest. Basically, it all depends on how frequently Emacs file API is
> being used. If we take `find-lisp-find-files', which triggers more file
> handler lookup, the difference becomes more significant:
> 
> (benchmark-run-compiled 1 (find-lisp-find-files "/home/yantar92/.data" ""))
> ;; (3.853305824 4 0.9142656910000007)
> (let (file-name-handler-alist) (benchmark-run-compiled 1 (find-lisp-find-files "/home/yantar92/.data" "")))
> ;; (1.545292093 4 0.9098995830000014)

The above just means that find-lisp is not a good way of emulating
Find in Emacs.  It is no accident that it is not used much.

> In particular, `expand-file-name' is commonly used in the wild to ensure
> that a given path is full. For a single file, it may not add much
> overheads, but it is so common that I believe that it would be worth it
> to make even relatively small improvements in performance.

The Right Way of avoiding unnecessary calls to expand-file-name is to
program dedicated primitives that perform more specialized jobs,
instead of calling existing primitives in some higher-level code.
Then you can avoid these calls altogether once you know that the input
file names are already in absolute form.

IOW, if a specific job, when implemented in Lisp, is not performant
enough, it means implementing it that way is not a good idea.

Disabling file-name-handlers is the wrong way to solve these
performance problems.

> I am pretty sure that file name handlers are checked behind the scenes
> by many other common operations.

I'm pretty sure they aren't.  But every file-related primitive calls
expand-file-name (it must, by virtue of the Emacs paradigm whereby
each buffer "lives" in a different directory), and that's what you
see, by and large.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 16:38                                                     ` Eli Zaretskii
@ 2023-07-21 16:43                                                       ` Ihor Radchenko
  2023-07-21 16:43                                                       ` Michael Albinus
  1 sibling, 0 replies; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 16:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, michael.albinus, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

> Disabling file-name-handlers is the wrong way to solve these
> performance problems.

We are in agreement here.
Note that I am talking about optimization.
And Michael proposed providing a way to disable only the
tramp-related handlers, when appropriate.

>> I am pretty sure that file name handlers are checked behind the scenes
>> by many other common operations.
>
> I'm pretty sure they aren't.  But every file-related primitive calls
> expand-file-name (it must, by virtue of the Emacs paradigm whereby
> each buffer "lives" in a different directory), and that's what you
> see, by and large.

The end result is the same - file handlers are searched very frequently
any time Emacs file API is used.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 16:38                                                     ` Eli Zaretskii
  2023-07-21 16:43                                                       ` Ihor Radchenko
@ 2023-07-21 16:43                                                       ` Michael Albinus
  2023-07-21 17:45                                                         ` Eli Zaretskii
  1 sibling, 1 reply; 199+ messages in thread
From: Michael Albinus @ 2023-07-21 16:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, Ihor Radchenko, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

Hi Eli,

> Disabling file-name-handlers is the wrong way to solve these
> performance problems.

Does this mean you object to installing the two forms I have proposed?
Although not perfect, they are better than the current status quo.

Best regards, Michael.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 16:43                                                       ` Michael Albinus
@ 2023-07-21 17:45                                                         ` Eli Zaretskii
  2023-07-21 17:55                                                           ` Michael Albinus
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-21 17:45 UTC (permalink / raw)
  To: Michael Albinus; +Cc: dmitry, yantar92, 64735, sbaugh

> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: Ihor Radchenko <yantar92@posteo.net>,  dmitry@gutov.dev,
>   64735@debbugs.gnu.org,  sbaugh@janestreet.com
> Date: Fri, 21 Jul 2023 18:43:52 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Disabling file-name-handlers is the wrong way to solve these
> > performance problems.
> 
> Does this mean you object to installing the two forms I have proposed?
> Although not perfect, they are better than the current status quo.

No, I just disagree that those measures should be seen as solutions of
the performance problems mentioned here.  I don't object to installing
the changes, I only hope that work on resolving the performance issues
will not stop because they are installed.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 17:45                                                         ` Eli Zaretskii
@ 2023-07-21 17:55                                                           ` Michael Albinus
  2023-07-21 18:38                                                             ` Eli Zaretskii
  2023-07-22  8:17                                                             ` Michael Albinus
  0 siblings, 2 replies; 199+ messages in thread
From: Michael Albinus @ 2023-07-21 17:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, yantar92, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

Hi Eli,

>> Does this mean you object to installing the two forms I have proposed?
>> Although not perfect, they are better than the current status quo.
>
> No, I just disagree that those measures should be seen as solutions of
> the performance problems mentioned here.  I don't object to installing
> the changes, I only hope that work on resolving the performance issues
> will not stop because they are installed.

Thanks. I'll install tomorrow.

I'm open to any proposal for solving the performance problems. But since
I've been living in the file name handler world for many years, I might
not be the best source for new ideas.

Best regards, Michael.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 17:55                                                           ` Michael Albinus
@ 2023-07-21 18:38                                                             ` Eli Zaretskii
  2023-07-21 19:33                                                               ` Spencer Baugh
  2023-07-22  8:17                                                             ` Michael Albinus
  1 sibling, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-21 18:38 UTC (permalink / raw)
  To: Michael Albinus; +Cc: dmitry, yantar92, 64735, sbaugh

> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: yantar92@posteo.net,  dmitry@gutov.dev,  64735@debbugs.gnu.org,
>   sbaugh@janestreet.com
> Date: Fri, 21 Jul 2023 19:55:22 +0200
> 
> I'm open to any proposal for solving the performance problems. But since
> I've been living in the file name handler world for many years, I might
> not be the best source for new ideas.

The first idea that comes to mind is to reimplement
directory-files-recursively in C, modeled on how Find does that.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 17:25           ` Ihor Radchenko
@ 2023-07-21 19:31             ` Spencer Baugh
  2023-07-21 19:37               ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Spencer Baugh @ 2023-07-21 19:31 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Dmitry Gutov, 64735

Ihor Radchenko <yantar92@posteo.net> writes:
> Spencer Baugh <sbaugh@janestreet.com> writes:
>
>>> Sure. It might also be optimized. Without trying to convince find devs
>>> to do something about regexp handling.
>>
>> Not to derail too much, but find as a subprocess has one substantial
>> advantage over find in Lisp: It can run in parallel with Emacs, so that
>> we actually use multiple CPU cores.
>
> Does find use multiple CPU cores?

Not on its own, but when it's running as a separate subprocess of Emacs,
that subprocess can (and will, on modern core-rich hardware) run on a
different CPU core from Emacs itself.  That's a form of parallelism
which is very achievable for Emacs, and provides a big performance win.
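To illustrate, an asynchronous find invocation with a process sentinel
could look roughly like this (a sketch with an illustrative `my-' name,
not project.el's actual code):

```elisp
;; Sketch: run find as an asynchronous subprocess and hand the file
;; list to CALLBACK from the sentinel, instead of blocking Emacs while
;; find walks the tree on another CPU core.
(defun my-find-files-async (dir callback)
  "Run find under DIR; call CALLBACK with the resulting file list."
  (let ((buf (generate-new-buffer " *find-async*")))
    (make-process
     :name "find-async"
     :buffer buf
     :command (list "find" (expand-file-name dir) "-type" "f" "-print0")
     :sentinel (lambda (proc _event)
                 (when (eq (process-status proc) 'exit)
                   (with-current-buffer (process-buffer proc)
                     ;; Output is NUL-separated because of -print0.
                     (funcall callback
                              (split-string (buffer-string) "\0" t))
                     (kill-buffer)))))))
```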






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 18:38                                                             ` Eli Zaretskii
@ 2023-07-21 19:33                                                               ` Spencer Baugh
  2023-07-22  5:27                                                                 ` Eli Zaretskii
  2023-07-23  2:59                                                                 ` Richard Stallman
  0 siblings, 2 replies; 199+ messages in thread
From: Spencer Baugh @ 2023-07-21 19:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, yantar92, Michael Albinus, Richard Stallman, 64735

Eli Zaretskii <eliz@gnu.org> writes:
>> From: Michael Albinus <michael.albinus@gmx.de>
>> Cc: yantar92@posteo.net,  dmitry@gutov.dev,  64735@debbugs.gnu.org,
>>   sbaugh@janestreet.com
>> Date: Fri, 21 Jul 2023 19:55:22 +0200
>> 
>> I'm open for any proposal in solving the performance problems. But since
>> I'm living in the file name handler world for many years, I might not be
>> the best source for new ideas.
>
> The first idea that comes to mind is to reimplement
> directory-files-recursively in C, modeled on how Find does that.

If someone were thinking of doing that, they would be better off
responding to RMS's earlier request for C programmers to optimize this
behavior in find.

Since, after all, if we do it that way it will benefit remote files as
well.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 19:31             ` Spencer Baugh
@ 2023-07-21 19:37               ` Ihor Radchenko
  2023-07-21 19:56                 ` Dmitry Gutov
  2023-07-21 20:11                 ` Spencer Baugh
  0 siblings, 2 replies; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-21 19:37 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: Dmitry Gutov, 64735

Spencer Baugh <sbaugh@janestreet.com> writes:

>> Does find use multiple CPU cores?
>
> Not on its own, but when it's running as a separate subprocess of Emacs,
> that subprocess can (and will, on modern core-rich hardware) run on a
> different CPU core from Emacs itself.  That's a form of parallelism
> which is very achievable for Emacs, and provides a big performance win.

AFAIU, the way find is called by project.el is synchronous: (1) call
find; (2) wait until it produces all the results; (3) process the
results. In such a scenario, there is no gain from the subprocess.

Is any part of Emacs even using sentinels together with find?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 15:48                                       ` Ihor Radchenko
@ 2023-07-21 19:53                                         ` Dmitry Gutov
  0 siblings, 0 replies; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-21 19:53 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: sbaugh, Eli Zaretskii, 64735

On 21/07/2023 18:48, Ihor Radchenko wrote:
> Dmitry Gutov <dmitry@gutov.dev> writes:
> 
>>> Locally, AFAIU, running `find-lisp-find-files' without
>>> `file-name-handler-alist' is equivalent to running GNU find.
>>> (That was a reply to Eli's message that we cannot disable
>>> `file-name-handler-alist')
>>
>> But it's slower! At least 2x, even with file handlers disabled,
>> according to your own measurements on a modern SSD (not to mention
>> all of our users with spinning media).
> 
> Yes, but (1) there is room for optimization; (2) I have a hope that we
> can implement better "ignores" when using `find-lisp-find-files', thus
> eventually outperforming GNU find (when used with a large number of
> ignores).

There are natural limits to that optimization if the approach is to
generate the full list of files in Lisp and then filter it
programmatically: every file name will need to be allocated. That's a
lot of unnecessary consing.

But you're welcome to try it and report back with results. Tramp is easy 
to disable, so you should be fine in terms of infrastructure.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 19:37               ` Ihor Radchenko
@ 2023-07-21 19:56                 ` Dmitry Gutov
  2023-07-21 20:11                 ` Spencer Baugh
  1 sibling, 0 replies; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-21 19:56 UTC (permalink / raw)
  To: Ihor Radchenko, Spencer Baugh; +Cc: 64735

On 21/07/2023 22:37, Ihor Radchenko wrote:
> AFAIU, the way find is called by project.el is synchronous

For now.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 19:37               ` Ihor Radchenko
  2023-07-21 19:56                 ` Dmitry Gutov
@ 2023-07-21 20:11                 ` Spencer Baugh
  1 sibling, 0 replies; 199+ messages in thread
From: Spencer Baugh @ 2023-07-21 20:11 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Dmitry Gutov, 64735

Ihor Radchenko <yantar92@posteo.net> writes:
> Spencer Baugh <sbaugh@janestreet.com> writes:
>
>>> Does find use multiple CPU cores?
>>
>> Not on its own, but when it's running as a separate subprocess of Emacs,
>> that subprocess can (and will, on modern core-rich hardware) run on a
>> different CPU core from Emacs itself.  That's a form of parallelism
>> which is very achievable for Emacs, and provides a big performance win.
>
> AFAIU, the way find is called by project.el is synchronous: (1) call
> find; (2) wait until it produces all the results; (3) process the
> results. In such a scenario, there is no gain from the subprocess.
>
> Is any part of Emacs even using sentinels together with find?

rgrep.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21  2:42 ` Richard Stallman
@ 2023-07-22  2:39   ` Richard Stallman
  2023-07-22  5:49     ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Richard Stallman @ 2023-07-22  2:39 UTC (permalink / raw)
  To: sbaugh, 64735

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

Since people are making a lot of headway on optimizing this
in Emacs, I won't trouble the Find maintainers for now.

I wonder if it is possible to detect many cases in which
the file-name handlers won't actually do anything, and bind
file-name-handler-alist to nil for those.
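One heuristic along these lines could be (a sketch; the `my-' names are
illustrative, and the regexp only approximates Tramp's remote-name
syntax):

```elisp
;; Sketch: treat a file name as plainly local when it cannot match
;; remote syntax like "/method:user@host:/path", and bind
;; `file-name-handler-alist' to nil around operations on it.
(defun my-plainly-local-p (filename)
  "Return non-nil if FILENAME cannot name a remote file."
  (not (string-match-p "\\`/[^/|:]+:" filename)))

(defmacro my-with-local-file (filename &rest body)
  "Run BODY without file name handlers if FILENAME is plainly local."
  `(let ((file-name-handler-alist
          (and (not (my-plainly-local-p ,filename))
               file-name-handler-alist)))
     ,@body))
```

A false positive here (a local name that merely looks remote) only
costs the normal handler lookup, so the check errs on the safe side.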

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)








* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 19:33                                                               ` Spencer Baugh
@ 2023-07-22  5:27                                                                 ` Eli Zaretskii
  2023-07-22 10:38                                                                   ` sbaugh
  2023-07-23  2:59                                                                 ` Richard Stallman
  1 sibling, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-22  5:27 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: dmitry, yantar92, michael.albinus, rms, 64735

> From: Spencer Baugh <sbaugh@janestreet.com>
> Cc: Michael Albinus <michael.albinus@gmx.de>,  dmitry@gutov.dev,
>    yantar92@posteo.net,  64735@debbugs.gnu.org, Richard Stallman
>   <rms@gnu.org>
> Date: Fri, 21 Jul 2023 15:33:13 -0400
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> > The first idea that comes to mind is to reimplement
> > directory-files-recursively in C, modeled on how Find does that.
> 
> If someone was thinking of doing that, they would be better off
> responding to RMS's earlier request for C programmers to optimize this
> behavior in find.

No, the first step is to use in Emacs what Find does today, because it
will already be a significant speedup.  Optimizing the case of a long
list of omissions should come later, as it is a minor optimization.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22  2:39   ` Richard Stallman
@ 2023-07-22  5:49     ` Eli Zaretskii
  0 siblings, 0 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-22  5:49 UTC (permalink / raw)
  To: rms, Michael Albinus; +Cc: sbaugh, 64735

> From: Richard Stallman <rms@gnu.org>
> Date: Fri, 21 Jul 2023 22:39:41 -0400
> 
> I wonder if it is possible to detect many cases in which
> the file-name handlers won't actually do anything, and bind
> file-name-handler-alist to nil for those.

I think we already do, but perhaps we could try harder.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 17:24           ` Eli Zaretskii
@ 2023-07-22  6:35             ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 199+ messages in thread
From: Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-07-22  6:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Spencer Baugh, yantar92, 64735, dmitry

Eli Zaretskii <eliz@gnu.org> writes:

>> Cc: Dmitry Gutov <dmitry@gutov.dev>, 64735@debbugs.gnu.org
>> From: Spencer Baugh <sbaugh@janestreet.com>
>> Date: Thu, 20 Jul 2023 13:08:24 -0400
>> 
>> (Really it's entirely plausible that Emacs could be improved by
>> *removing* directory-files-recursively, in favor of invoking find as a
>> subprocess: faster, parallelized execution, and better remote support.)
>
> No, there's no reason to remove anything that useful from Emacs.  If
> this or that API is not the optimal choice for some job, it is easy
> enough not to use it.

Indeed.

I would like to add that subprocesses remain unimplemented on MS-DOS,
and the way find is currently invoked from project.el and rgrep makes
both packages fail on traditional (USG) Unix systems, indicating that
correct portable use of find is decidedly non-trivial.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-20 17:08         ` Spencer Baugh
  2023-07-20 17:24           ` Eli Zaretskii
  2023-07-20 17:25           ` Ihor Radchenko
@ 2023-07-22  6:39           ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-07-22 21:01             ` Dmitry Gutov
  2 siblings, 1 reply; 199+ messages in thread
From: Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-07-22  6:39 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: Dmitry Gutov, Ihor Radchenko, 64735

Spencer Baugh <sbaugh@janestreet.com> writes:

> Not to derail too much, but find as a subprocess has one substantial
> advantage over find in Lisp: It can run in parallel with Emacs, so that
> we actually use multiple CPU cores.
>
> Between that, and the remote support part, I personally much prefer find
> to be a subprocess rather than in Lisp.  I don't think optimizing
> directory-files-recursively is a great solution.
>
> (Really it's entirely plausible that Emacs could be improved by
> *removing* directory-files-recursively, in favor of invoking find as a
> subprocess: faster, parallelized execution, and better remote support.)

find is only present in the default installations of Unix-like systems,
so it doesn't work without additional configuration on MS-Windows or
MS-DOS.  project.el and rgrep fail to work on USG Unix because they both
use `-path'.

Programs that use find should fall back to directory-files-recursively
when any of the situations above is detected.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 17:55                                                           ` Michael Albinus
  2023-07-21 18:38                                                             ` Eli Zaretskii
@ 2023-07-22  8:17                                                             ` Michael Albinus
  1 sibling, 0 replies; 199+ messages in thread
From: Michael Albinus @ 2023-07-22  8:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, yantar92, 64735, sbaugh

Michael Albinus <michael.albinus@gmx.de> writes:

>> No, I just disagree that those measures should be seen as solutions of
>> the performance problems mentioned here.  I don't object to installing
>> the changes, I only hope that work on resolving the performance issues
>> will not stop because they are installed.
>
> Thanks. I'll install tomorrow.

Pushed to master.

Best regards, Michael.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-19 21:16 bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Spencer Baugh
                   ` (2 preceding siblings ...)
  2023-07-21  2:42 ` Richard Stallman
@ 2023-07-22 10:18 ` Ihor Radchenko
  2023-07-22 10:42   ` sbaugh
  3 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-22 10:18 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: 64735

Spencer Baugh <sbaugh@janestreet.com> writes:

> - we could use our own recursive directory-tree walking implementation
> (directory-files-recursively), if we found a nice way to pipe its output
> directly to grep etc without going through Lisp.  (This could be nice
> for project-files, at least)

Could you elaborate on this idea?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22  5:27                                                                 ` Eli Zaretskii
@ 2023-07-22 10:38                                                                   ` sbaugh
  2023-07-22 11:58                                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: sbaugh @ 2023-07-22 10:38 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Spencer Baugh, yantar92, rms, dmitry, michael.albinus, 64735

Eli Zaretskii <eliz@gnu.org> writes:
>> From: Spencer Baugh <sbaugh@janestreet.com>
>> Cc: Michael Albinus <michael.albinus@gmx.de>,  dmitry@gutov.dev,
>>    yantar92@posteo.net,  64735@debbugs.gnu.org, Richard Stallman
>>   <rms@gnu.org>
>> Date: Fri, 21 Jul 2023 15:33:13 -0400
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> > The first idea that comes to mind is to reimplement
>> > directory-files-recursively in C, modeled on how Find does that.
>> 
>> If someone was thinking of doing that, they would be better off
>> responding to RMS's earlier request for C programmers to optimize this
>> behavior in find.
>
> No, the first step is to use in Emacs what Find does today, because it
> will already be a significant speedup.

Why bother?  directory-files-recursively is a rarely used API, as you
have mentioned before in this thread.

And there is a way to speed it up that no other approach can match: use
find instead of directory-files-recursively, and operate on files as
find prints them.  Since this runs the directory traversal in parallel
with Emacs, it has a speed advantage that is impossible to match in
directory-files-recursively itself.

We can fall back to directory-files-recursively when find is not
available.

> Optimizing the case of a long
> list of omissions should come later, as it is a minor optimization.

This seems wrong.  directory-files-recursively is rarely used, and rgrep
is a very popular command, and this problem with find makes rgrep around
10x slower by default.  How in any world is that a minor optimization?
Most Emacs users will never realize that they can speed up rgrep
massively by setting grep-find-ignored-files to nil.  Indeed, no-one
realized that until I just pointed it out.  In my experience, they just
stop using rgrep in favor of other third-party packages like ripgrep,
because "grep is slow".
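
(For what it's worth, the speedup is available today to anyone willing
to forgo the ignores, e.g. with this in an init file:)

```elisp
;; Trade ignore filtering for raw find speed in rgrep and friends.
(setq grep-find-ignored-files nil)
```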





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 10:18 ` Ihor Radchenko
@ 2023-07-22 10:42   ` sbaugh
  2023-07-22 12:00     ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: sbaugh @ 2023-07-22 10:42 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Spencer Baugh, 64735

Ihor Radchenko <yantar92@posteo.net> writes:
> Spencer Baugh <sbaugh@janestreet.com> writes:
>
>> - we could use our own recursive directory-tree walking implementation
>> (directory-files-recursively), if we found a nice way to pipe its output
>> directly to grep etc without going through Lisp.  (This could be nice
>> for project-files, at least)
>
> May you elaborate this idea?

One of the reasons directory-files-recursively is slow is that it
allocates memory inside Emacs.  If we piped its output directly to grep,
that overhead would be removed.

On reflection, though, as I've posted elsewhere in this thread: This is
a bad idea and is inherently slower than find, because
directory-files-recursively does not run in parallel with Emacs (and
never will).





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 10:38                                                                   ` sbaugh
@ 2023-07-22 11:58                                                                     ` Eli Zaretskii
  2023-07-22 14:14                                                                       ` Ihor Radchenko
  2023-07-22 17:18                                                                       ` sbaugh
  0 siblings, 2 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-22 11:58 UTC (permalink / raw)
  To: sbaugh; +Cc: sbaugh, yantar92, rms, dmitry, michael.albinus, 64735

> From: sbaugh@catern.com
> Date: Sat, 22 Jul 2023 10:38:37 +0000 (UTC)
> Cc: Spencer Baugh <sbaugh@janestreet.com>, dmitry@gutov.dev,
> 	yantar92@posteo.net, michael.albinus@gmx.de, rms@gnu.org,
> 	64735@debbugs.gnu.org
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> > No, the first step is to use in Emacs what Find does today, because it
> > will already be a significant speedup.
> 
> Why bother?  directory-files-recursively is a rarely used API, as you
> have mentioned before in this thread.

Because we could then use it much more (assuming the result will be
performant enough -- this remains to be seen).

> And there is a way to speed it up which will have a performance boost
> which is unbeatable any other way: Use find instead of
> directory-files-recursively, and operate on files as they find prints
> them.

Not every command can operate on the output sequentially: some need to
see all of the output, others will need to be redesigned and
reimplemented to support such a sequential mode.

Moreover, piping from Find incurs overhead: data is broken into blocks
by the pipe or PTY, reading the data can be slowed down if Emacs is
busy processing something, etc.

So I think a primitive that traverses the tree and produces file names
with or without attributes, and can call some callback if needed,
still has its place.

> Since this runs the directory traversal in parallel with Emacs, it
> has a speed advantage that is impossible to match in
> directory-files-recursively.

See above: you have an optimistic view of what actually happens in the
relevant use cases.

> We can fall back to directory-files-recursively when find is not
> available.

Find is already available today on many platforms, and we are
evidently not happy enough with the results.  That is the trigger for
this discussion, isn't it?  We are talking about ways to improve the
performance, and I think having our own primitive that can do it is
one such way, or at least it is not clear that it cannot be such a
way.

> > Optimizing the case of a long
> > list of omissions should come later, as it is a minor optimization.
> 
> This seems wrong.  directory-files-recursively is rarely used, and rgrep
> is a very popular command, and this problem with find makes rgrep around
> ~10x slower by default.  How in any world is that a minor optimization?
> Most Emacs users will never realize that they can speed up rgrep
> massively by setting grep-find-ignored-files to nil.  Indeed, no-one
> realized that until I just pointed it out.  In my experience, they just
> stop using rgrep in favor of other third-party packages like ripgrep,
> because "grep is slow".

Making grep-find-ignored-files smaller is independent of this
particular issue.  If we can make it shorter, we should.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 10:42   ` sbaugh
@ 2023-07-22 12:00     ` Eli Zaretskii
  0 siblings, 0 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-22 12:00 UTC (permalink / raw)
  To: sbaugh; +Cc: sbaugh, yantar92, 64735

> Cc: Spencer Baugh <sbaugh@janestreet.com>, 64735@debbugs.gnu.org
> From: sbaugh@catern.com
> Date: Sat, 22 Jul 2023 10:42:06 +0000 (UTC)
> 
> Ihor Radchenko <yantar92@posteo.net> writes:
> > Spencer Baugh <sbaugh@janestreet.com> writes:
> >
> >> - we could use our own recursive directory-tree walking implementation
> >> (directory-files-recursively), if we found a nice way to pipe its output
> >> directly to grep etc without going through Lisp.  (This could be nice
> >> for project-files, at least)
> >
> > May you elaborate this idea?
> 
> One of the reasons directory-files-recursively is slow is because it
> allocates memory inside Emacs.  If we piped its output directly to grep,
> that overhead would be removed.

How can you do anything in Emacs without allocating memory?





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 11:58                                                                     ` Eli Zaretskii
@ 2023-07-22 14:14                                                                       ` Ihor Radchenko
  2023-07-22 14:32                                                                         ` Eli Zaretskii
  2023-07-22 17:18                                                                       ` sbaugh
  1 sibling, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-22 14:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

Eli Zaretskii <eliz@gnu.org> writes:

> So I think a primitive that traverses the tree and produces file names
> with or without attributes, and can call some callback if needed,
> still has its place.

Do you mean asynchronous primitive?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 14:14                                                                       ` Ihor Radchenko
@ 2023-07-22 14:32                                                                         ` Eli Zaretskii
  2023-07-22 15:07                                                                           ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-22 14:32 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev,
>  michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org
> Date: Sat, 22 Jul 2023 14:14:25 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > So I think a primitive that traverses the tree and produces file names
> > with or without attributes, and can call some callback if needed,
> > still has its place.
> 
> Do you mean asynchronous primitive?

No, a synchronous one.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 14:32                                                                         ` Eli Zaretskii
@ 2023-07-22 15:07                                                                           ` Ihor Radchenko
  2023-07-22 15:29                                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-22 15:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Ihor Radchenko <yantar92@posteo.net>
>> Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev,
>>  michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org
>> Date: Sat, 22 Jul 2023 14:14:25 +0000
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>> > So I think a primitive that traverses the tree and produces file names
>> > with or without attributes, and can call some callback if needed,
>> > still has its place.
>> 
>> Do you mean asynchronous primitive?
>
> No, a synchronous one.

Then how will the callback be different from
(mapc #'my-function (directory-files-recursively ...))
?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 15:07                                                                           ` Ihor Radchenko
@ 2023-07-22 15:29                                                                             ` Eli Zaretskii
  2023-07-23  7:52                                                                               ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-22 15:29 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev,
>  michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org
> Date: Sat, 22 Jul 2023 15:07:45 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> > So I think a primitive that traverses the tree and produces file names
> >> > with or without attributes, and can call some callback if needed,
> >> > still has its place.
> >> 
> >> Do you mean asynchronous primitive?
> >
> > No, a synchronous one.
> 
> Then how will the callback be different from
> (mapc #'my-function (directory-files-recursively ...))
> ?

It depends on the application.  Applications that want to get all the
data and only after that process it will not use the callback.  But I
can certainly imagine an application that inserts the file names, or
some of their transforms, into a buffer, and from time to time
triggers redisplay to show the partial results.  Or an application
could write the file names to some disk file or external consumer, or
send them to a network process.
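
For instance, with a hypothetical callback-accepting primitive (the
name below is invented for illustration; no such function exists
today), an application could show partial results as the traversal
proceeds:

```elisp
;; Sketch: insert each file name as it arrives and refresh the
;; display every 500 files, instead of waiting for the full list.
;; `directory-files-recursively-with-callback' is hypothetical.
(let ((count 0))
  (directory-files-recursively-with-callback
   default-directory ".*"
   (lambda (file)
     (insert file "\n")
     (when (zerop (mod (cl-incf count) 500))
       (redisplay)))))
```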





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 11:58                                                                     ` Eli Zaretskii
  2023-07-22 14:14                                                                       ` Ihor Radchenko
@ 2023-07-22 17:18                                                                       ` sbaugh
  2023-07-22 17:26                                                                         ` Ihor Radchenko
  2023-07-22 17:46                                                                         ` Eli Zaretskii
  1 sibling, 2 replies; 199+ messages in thread
From: sbaugh @ 2023-07-22 17:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sbaugh, yantar92, rms, dmitry, michael.albinus, 64735

Eli Zaretskii <eliz@gnu.org> writes:

>> From: sbaugh@catern.com
>> Date: Sat, 22 Jul 2023 10:38:37 +0000 (UTC)
>> Cc: Spencer Baugh <sbaugh@janestreet.com>, dmitry@gutov.dev,
>> 	yantar92@posteo.net, michael.albinus@gmx.de, rms@gnu.org,
>> 	64735@debbugs.gnu.org
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> > No, the first step is to use in Emacs what Find does today, because it
>> > will already be a significant speedup.
>> 
>> Why bother?  directory-files-recursively is a rarely used API, as you
>> have mentioned before in this thread.
>
> Because we could then use it much more (assuming the result will be
> performant enough -- this remains to be seen).
>
>> And there is a way to speed it up which will have a performance boost
>> which is unbeatable any other way: Use find instead of
>> directory-files-recursively, and operate on files as they find prints
>> them.
>
> Not every command can operate on the output sequentially: some need to
> see all of the output, others will need to be redesigned and
> reimplemented to support such sequential mode.
>
> Moreover, piping from Find incurs overhead: data is broken into blocks
> by the pipe or PTY, reading the data can be slowed down if Emacs is
> busy processing something, etc.

I went ahead and implemented it, and I get a 2x speedup even *without*
running find in parallel with Emacs.

First my results:

(my-bench 100 "~/public_html" "")
(("built-in" . "Elapsed time: 1.140173s (0.389344s in 5 GCs)")
 ("with-find" . "Elapsed time: 0.643306s (0.305130s in 4 GCs)"))

(my-bench 10 "~/.local/src/linux" "")
(("built-in" . "Elapsed time: 2.402341s (0.937857s in 11 GCs)")
 ("with-find" . "Elapsed time: 1.544024s (0.827364s in 10 GCs)"))

(my-bench 100 "/ssh:catern.com:~/public_html" "")
(("built-in" . "Elapsed time: 36.494233s (6.450840s in 79 GCs)")
 ("with-find" . "Elapsed time: 4.619035s (1.133656s in 14 GCs)"))

2x speedup on local files, and almost a 10x speedup for remote files.

And my implementation *isn't even using the fact that find can run in
parallel with Emacs*.  If I did start using that, I expect even more
speed gains from parallelism, which aren't achievable in Emacs itself.

So can we add something like this (with the appropriate fallbacks to
directory-files-recursively), since it has such a big speedup even
without parallelism?

My implementation and benchmarking:

(defun find-directory-files-recursively (dir regexp &optional include-directories _predicate follow-symlinks)
  (cl-assert (null _predicate) t "find-directory-files-recursively can't accept arbitrary predicates")
  (with-temp-buffer
    (setq case-fold-search nil)
    (cd dir)
    (let* ((command
	    (append
	     (list "find" (file-local-name dir))
	     (if follow-symlinks
		 '("-L")
	       '("!" "(" "-type" "l" "-xtype" "d" ")"))
	     (unless (string-empty-p regexp)
	       (list "-regex" (concat ".*" regexp ".*")))
	     (unless include-directories
	       '("!" "-type" "d"))
	     '("-print0")
	     ))
	   (remote (file-remote-p dir))
	   (proc
	    (if remote
		(let ((proc (apply #'start-file-process
				   "find" (current-buffer) command)))
		  (set-process-sentinel proc (lambda (_proc _state)))
		  (set-process-query-on-exit-flag proc nil)
		  proc)
	      (make-process :name "find" :buffer (current-buffer)
			    :connection-type 'pipe
			    :noquery t
			    :sentinel (lambda (_proc _state))
			    :command command))))
      (while (accept-process-output proc))
      (let ((start (goto-char (point-min))) ret)
	(while (search-forward "\0" nil t)
	  (push (concat remote (buffer-substring-no-properties start (1- (point)))) ret)
	  (setq start (point)))
	ret))))

(defun my-bench (count path regexp)
  (setq path (expand-file-name path))
  (let ((old (directory-files-recursively path regexp))
	(new (find-directory-files-recursively path regexp)))
    (dolist (path old)
      (should (member path new)))
    (dolist (path new)
      (should (member path old))))
  (list
   (cons "built-in" (benchmark count (list 'directory-files-recursively path regexp)))
   (cons "with-find" (benchmark count (list 'find-directory-files-recursively path regexp)))))
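
For reference, here is roughly the command line the code above builds
for a local directory with the default arguments (no FOLLOW-SYMLINKS,
no INCLUDE-DIRECTORIES, empty REGEXP); GNU find is assumed for -xtype:

```shell
# Build a tiny tree and list its non-directories NUL-separated,
# mirroring find-directory-files-recursively's default arguments.
d=$(mktemp -d)
mkdir -p "$d/sub"
touch "$d/a.txt" "$d/sub/b.txt"
find "$d" '!' '(' -type l -xtype d ')' '!' -type d -print0 |
  tr '\0' '\n' | sed "s|^$d/||" | sort
```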





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 17:18                                                                       ` sbaugh
@ 2023-07-22 17:26                                                                         ` Ihor Radchenko
  2023-07-22 17:46                                                                         ` Eli Zaretskii
  1 sibling, 0 replies; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-22 17:26 UTC (permalink / raw)
  To: sbaugh; +Cc: sbaugh, rms, dmitry, michael.albinus, Eli Zaretskii, 64735

sbaugh@catern.com writes:

> I went ahead and implemented it, and I get a 2x speedup even *without*
> running find in parallel with Emacs.
>
> First my results:
>
> (my-bench 100 "~/public_html" "")
> (("built-in" . "Elapsed time: 1.140173s (0.389344s in 5 GCs)")
>  ("with-find" . "Elapsed time: 0.643306s (0.305130s in 4 GCs)"))
>
> (my-bench 10 "~/.local/src/linux" "")
> (("built-in" . "Elapsed time: 2.402341s (0.937857s in 11 GCs)")
>  ("with-find" . "Elapsed time: 1.544024s (0.827364s in 10 GCs)"))

What about without `file-name-handler-alist'?

> (my-bench 100 "/ssh:catern.com:~/public_html" "")
> (("built-in" . "Elapsed time: 36.494233s (6.450840s in 79 GCs)")
>  ("with-find" . "Elapsed time: 4.619035s (1.133656s in 14 GCs)"))

This is indeed expected.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 17:18                                                                       ` sbaugh
  2023-07-22 17:26                                                                         ` Ihor Radchenko
@ 2023-07-22 17:46                                                                         ` Eli Zaretskii
  2023-07-22 18:31                                                                           ` Eli Zaretskii
  2023-07-22 20:53                                                                           ` Spencer Baugh
  1 sibling, 2 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-22 17:46 UTC (permalink / raw)
  To: sbaugh; +Cc: sbaugh, yantar92, rms, dmitry, michael.albinus, 64735

> From: sbaugh@catern.com
> Date: Sat, 22 Jul 2023 17:18:19 +0000 (UTC)
> Cc: sbaugh@janestreet.com, yantar92@posteo.net, rms@gnu.org, dmitry@gutov.dev,
> 	michael.albinus@gmx.de, 64735@debbugs.gnu.org
> 
> First my results:
> 
> (my-bench 100 "~/public_html" "")
> (("built-in" . "Elapsed time: 1.140173s (0.389344s in 5 GCs)")
>  ("with-find" . "Elapsed time: 0.643306s (0.305130s in 4 GCs)"))
> 
> (my-bench 10 "~/.local/src/linux" "")
> (("built-in" . "Elapsed time: 2.402341s (0.937857s in 11 GCs)")
>  ("with-find" . "Elapsed time: 1.544024s (0.827364s in 10 GCs)"))
> 
> (my-bench 100 "/ssh:catern.com:~/public_html" "")
> (("built-in" . "Elapsed time: 36.494233s (6.450840s in 79 GCs)")
>  ("with-find" . "Elapsed time: 4.619035s (1.133656s in 14 GCs)"))
> 
> 2x speedup on local files, and almost a 10x speedup for remote files.

Thanks, that's impressive.  But you omitted some of the features of
directory-files-recursively, see below.

> And my implementation *isn't even using the fact that find can run in
> parallel with Emacs*.  If I did start using that, I expect even more
> speed gains from parallelism, which aren't achievable in Emacs itself.

I'm not sure I understand what you mean by "in parallel" and why it
would be faster.

> So can we add something like this (with the appropriate fallbacks to
> directory-files-recursively), since it has such a big speedup even
> without parallelism?

We can have an alternative implementation, yes.  But it should support
predicate, and it should sort the files in each directory like
directory-files-recursively does, so that it's a drop-in replacement.
Also, I believe that Find does return "." in each directory, and your
implementation doesn't filter them, whereas
directory-files-recursively does AFAIR.

And I see no need for any fallback: that's for the application to do
if it wants.

>   (cl-assert (null _predicate) t "find-directory-files-recursively can't accept arbitrary predicates")

It should.

> 	     (if follow-symlinks
> 		 '("-L")
> 	       '("!" "(" "-type" "l" "-xtype" "d" ")"))
> 	     (unless (string-empty-p regexp)
> 	       "-regex" (concat ".*" regexp ".*"))
> 	     (unless include-directories
> 	       '("!" "-type" "d"))
> 	     '("-print0")

Some of these switches are specific to GNU Find.  Are we going to
support only GNU Find?

> 	     ))
> 	   (remote (file-remote-p dir))
> 	   (proc
> 	    (if remote
> 		(let ((proc (apply #'start-file-process
> 				   "find" (current-buffer) command)))
> 		  (set-process-sentinel proc (lambda (_proc _state)))
> 		  (set-process-query-on-exit-flag proc nil)
> 		  proc)
> 	      (make-process :name "find" :buffer (current-buffer)
> 			    :connection-type 'pipe
> 			    :noquery t
> 			    :sentinel (lambda (_proc _state))
> 			    :command command))))
>       (while (accept-process-output proc))

Why do you call accept-process-output here?  It could interfere with
reading output from async subprocesses running at the same time.  Come
to think of it, why use an async subprocess here and not
call-process?
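
A synchronous sketch might use `process-file', which consults
`default-directory' and thus transparently supports remote directories
via Tramp (the function name is invented; argument handling is
simplified relative to the code above):

```elisp
;; Sketch: run find to completion with `process-file' instead of
;; an async process plus accept-process-output.
(defun my-find-files-sync (dir)
  (let ((default-directory dir))
    (with-temp-buffer
      (process-file "find" nil t nil "." "!" "-type" "d" "-print0")
      (split-string (buffer-string) "\0" t))))
```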





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 17:46                                                                         ` Eli Zaretskii
@ 2023-07-22 18:31                                                                           ` Eli Zaretskii
  2023-07-22 19:06                                                                             ` Eli Zaretskii
  2023-07-22 20:53                                                                           ` Spencer Baugh
  1 sibling, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-22 18:31 UTC (permalink / raw)
  To: sbaugh, sbaugh; +Cc: dmitry, yantar92, michael.albinus, rms, 64735

> Cc: sbaugh@janestreet.com, yantar92@posteo.net, rms@gnu.org, dmitry@gutov.dev,
>  michael.albinus@gmx.de, 64735@debbugs.gnu.org
> Date: Sat, 22 Jul 2023 20:46:01 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> 
> > First my results:
> > 
> > (my-bench 100 "~/public_html" "")
> > (("built-in" . "Elapsed time: 1.140173s (0.389344s in 5 GCs)")
> >  ("with-find" . "Elapsed time: 0.643306s (0.305130s in 4 GCs)"))
> > 
> > (my-bench 10 "~/.local/src/linux" "")
> > (("built-in" . "Elapsed time: 2.402341s (0.937857s in 11 GCs)")
> >  ("with-find" . "Elapsed time: 1.544024s (0.827364s in 10 GCs)"))
> > 
> > (my-bench 100 "/ssh:catern.com:~/public_html" "")
> > (("built-in" . "Elapsed time: 36.494233s (6.450840s in 79 GCs)")
> >  ("with-find" . "Elapsed time: 4.619035s (1.133656s in 14 GCs)"))
> > 
> > 2x speedup on local files, and almost a 10x speedup for remote files.
> 
> Thanks, that's impressive.  But you omitted some of the features of
> directory-files-recursively, see below.

My results on MS-Windows are less encouraging:

  (my-bench 2 "d:/usr/archive" "")
  (("built-in" . "Elapsed time: 1.250000s (0.093750s in 5 GCs)")
   ("with-find" . "Elapsed time: 8.578125s (0.109375s in 7 GCs)"))

D:/usr/archive is a directory with 372 subdirectories and more than
12000 files in all of them.  The disk is SSD, in case it matters, and
I measured this with a warm disk cache.

So I guess whether or not to use this depends on the underlying
system.

Btw, you should not assume that "-type l" will universally work: at
least on MS-Windows some ports of GNU Find will barf when they see it.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 18:31                                                                           ` Eli Zaretskii
@ 2023-07-22 19:06                                                                             ` Eli Zaretskii
  0 siblings, 0 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-22 19:06 UTC (permalink / raw)
  To: sbaugh, sbaugh; +Cc: dmitry, yantar92, michael.albinus, rms, 64735

> Cc: dmitry@gutov.dev, yantar92@posteo.net, michael.albinus@gmx.de, rms@gnu.org,
>  64735@debbugs.gnu.org
> Date: Sat, 22 Jul 2023 21:31:14 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> 
> My results on MS-Windows are less encouraging:
> 
>   (my-bench 2 "d:/usr/archive" "")
>   (("built-in" . "Elapsed time: 1.250000s (0.093750s in 5 GCs)")
>    ("with-find" . "Elapsed time: 8.578125s (0.109375s in 7 GCs)"))

And here's from a GNU/Linux machine, which is probably not very fast:

  (my-bench 10 "/usr/lib" "")
  (("built-in" . "Elapsed time: 4.410613s (2.077311s in 56 GCs)")
   ("with-find" . "Elapsed time: 3.326954s (1.997251s in 54 GCs)"))

Faster, but not by a lot.

On this system /usr/lib has 18000 files in 1860 subdirectories.

Btw, the Find command piped to some other program, like wc, finishes
much faster, 2 to 4 times faster than when it is run from
find-directory-files-recursively.  That's probably the slowdown due to
communicating with async subprocesses in action.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 17:46                                                                         ` Eli Zaretskii
  2023-07-22 18:31                                                                           ` Eli Zaretskii
@ 2023-07-22 20:53                                                                           ` Spencer Baugh
  2023-07-23  6:15                                                                             ` Eli Zaretskii
                                                                                               ` (2 more replies)
  1 sibling, 3 replies; 199+ messages in thread
From: Spencer Baugh @ 2023-07-22 20:53 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: yantar92, rms, sbaugh, dmitry, michael.albinus, 64735

Eli Zaretskii <eliz@gnu.org> writes:

>> From: sbaugh@catern.com
>> Date: Sat, 22 Jul 2023 17:18:19 +0000 (UTC)
>> Cc: sbaugh@janestreet.com, yantar92@posteo.net, rms@gnu.org, dmitry@gutov.dev,
>> 	michael.albinus@gmx.de, 64735@debbugs.gnu.org
>> 
>> First my results:
>> 
>> (my-bench 100 "~/public_html" "")
>> (("built-in" . "Elapsed time: 1.140173s (0.389344s in 5 GCs)")
>>  ("with-find" . "Elapsed time: 0.643306s (0.305130s in 4 GCs)"))
>> 
>> (my-bench 10 "~/.local/src/linux" "")
>> (("built-in" . "Elapsed time: 2.402341s (0.937857s in 11 GCs)")
>>  ("with-find" . "Elapsed time: 1.544024s (0.827364s in 10 GCs)"))
>> 
>> (my-bench 100 "/ssh:catern.com:~/public_html" "")
>> (("built-in" . "Elapsed time: 36.494233s (6.450840s in 79 GCs)")
>>  ("with-find" . "Elapsed time: 4.619035s (1.133656s in 14 GCs)"))
>> 
>> 2x speedup on local files, and almost a 10x speedup for remote files.
>
> Thanks, that's impressive.  But you omitted some of the features of
> directory-files-recursively, see below.
>
>> And my implementation *isn't even using the fact that find can run in
>> parallel with Emacs*.  If I did start using that, I expect even more
>> speed gains from parallelism, which aren't achievable in Emacs itself.
>
> I'm not sure I understand what you mean by "in parallel" and why it
> would be faster.

I mean having Emacs read output from the process and turn it into
strings while find is still running and walking the directory tree.  So
the two parts are running in parallel.  This, specifically:

(defun find-directory-files-recursively (dir regexp &optional include-directories _predicate follow-symlinks)
  (cl-assert (null _predicate) t "find-directory-files-recursively can't accept arbitrary predicates")
  (cl-assert (not (file-remote-p dir)))
  (let* (buffered
         result
         (proc
	  (make-process
           :name "find" :buffer nil
	   :connection-type 'pipe
	   :noquery t
	   :sentinel (lambda (_proc _state))
           :filter (lambda (proc data)
                     (let ((start 0))
                       (when-let (end (string-search "\0" data start))
                         (push (concat buffered (substring data start end)) result)
                         (setq buffered "")
                         (setq start (1+ end))
                         (while-let ((end (string-search "\0" data start)))
                           (push (substring data start end) result)
                           (setq start (1+ end))))
                       (setq buffered (concat buffered (substring data start)))))
	   :command (append
	             (list "find" (file-local-name dir))
	             (if follow-symlinks
		         '("-L")
	               '("!" "(" "-type" "l" "-xtype" "d" ")"))
	             (unless (string-empty-p regexp)
	               (list "-regex" (concat ".*" regexp ".*")))
	             (unless include-directories
	               '("!" "-type" "d"))
	             '("-print0")
	             ))))
    (while (accept-process-output proc))
    result))

Can you try this further change on your Windows (and GNU/Linux) box?  I
just tested on a different box and my original change gets:

(("built-in" . "Elapsed time: 4.506643s (2.276269s in 21 GCs)")
 ("with-find" . "Elapsed time: 4.114531s (2.848497s in 27 GCs)"))

while this parallel implementation gets

(("built-in" . "Elapsed time: 4.479185s (2.236561s in 21 GCs)")
 ("with-find" . "Elapsed time: 2.858452s (1.934647s in 19 GCs)"))

so it might have a favorable impact on Windows and your other GNU/Linux
box.

>> So can we add something like this (with the appropriate fallbacks to
>> directory-files-recursively), since it has such a big speedup even
>> without parallelism?
>
> We can have an alternative implementation, yes.  But it should support
> predicate, and it should sort the files in each directory like
> directory-files-recursively does, so that it's a drop-in replacement.
> Also, I believe that Find does return "." in each directory, and your
> implementation doesn't filter them, whereas
> directory-files-recursively does AFAIR.
>
> And I see no need for any fallback: that's for the application to do
> if it wants.
>
>>   (cl-assert (null _predicate) t "find-directory-files-recursively can't accept arbitrary predicates")
>
> It should.

This is where I think a fallback would be useful - it's basically
impossible to support arbitrary predicates efficiently here, since it
requires us to put Lisp in control of whether find descends into a
directory.  So I'm thinking I would just fall back to running the old
directory-files-recursively whenever there's a predicate.  Or just not
supporting this at all...

>> 	     (if follow-symlinks
>> 		 '("-L")
>> 	       '("!" "(" "-type" "l" "-xtype" "d" ")"))
>> 	     (unless (string-empty-p regexp)
>> 	       "-regex" (concat ".*" regexp ".*"))
>> 	     (unless include-directories
>> 	       '("!" "-type" "d"))
>> 	     '("-print0")
>
> Some of these switches are specific to GNU Find.  Are we going to
> support only GNU Find?

POSIX find doesn't support -regex, so I think we have to.  We could
stick to just POSIX find if we only allowed globs in
find-directory-files-recursively, instead of full regexes.
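
For illustration, a glob-only invocation restricted to POSIX primaries
(`-name` and `-prune` are POSIX, unlike GNU-specific `-regex`) might
look like this sketch; the directory layout is made up for the example:

```shell
# Build a throwaway tree to run against (illustrative only).
dir=$(mktemp -d)
mkdir -p "$dir/sub/SCCS"
touch "$dir/a.el" "$dir/sub/b.el" "$dir/sub/SCCS/c.el"

# POSIX find: prune ignored directories, then match files by glob.
find "$dir" \( -name SCCS -o -name RCS \) -prune \
     -o -type f -name '*.el' -print
```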

>> 	     ))
>> 	   (remote (file-remote-p dir))
>> 	   (proc
>> 	    (if remote
>> 		(let ((proc (apply #'start-file-process
>> 				   "find" (current-buffer) command)))
>> 		  (set-process-sentinel proc (lambda (_proc _state)))
>> 		  (set-process-query-on-exit-flag proc nil)
>> 		  proc)
>> 	      (make-process :name "find" :buffer (current-buffer)
>> 			    :connection-type 'pipe
>> 			    :noquery t
>> 			    :sentinel (lambda (_proc _state))
>> 			    :command command))))
>>       (while (accept-process-output proc))
>
> Why do you call accept-process-output here?  It could interfere with
> reading output from async subprocesses running at the same time.  Come
> to think of it, why use async subprocesses here and not call-process?

See my new iteration which does use the async-ness.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22  6:39           ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-07-22 21:01             ` Dmitry Gutov
  2023-07-23  5:11               ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-22 21:01 UTC (permalink / raw)
  To: Po Lu, Spencer Baugh; +Cc: Ihor Radchenko, 64735

On 22/07/2023 09:39, Po Lu wrote:
> Spencer Baugh <sbaugh@janestreet.com> writes:
> 
>> Not to derail too much, but find as a subprocess has one substantial
>> advantage over find in Lisp: It can run in parallel with Emacs, so that
>> we actually use multiple CPU cores.
>>
>> Between that, and the remote support part, I personally much prefer find
>> to be a subprocess rather than in Lisp.  I don't think optimizing
>> directory-files-recursively is a great solution.
>>
>> (Really it's entirely plausible that Emacs could be improved by
>> *removing* directory-files-recursively, in favor of invoking find as a
>> subprocess: faster, parallelized execution, and better remote support.)
> 
> find is only present in the default installations of Unix-like systems,
> so it doesn't work without additional configuration on MS-Windows or
> MS-DOS.  project.el and rgrep fail to work on USG Unix because they both
> use `-path'.
> 
> Programs that use find should fall back to directory-files-recursively
> when any of the situations above are detected.

Perhaps if someone implements support for IGNORE entries (wildcards) in 
that function, it would be easy enough to do that fallback.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 19:33                                                               ` Spencer Baugh
  2023-07-22  5:27                                                                 ` Eli Zaretskii
@ 2023-07-23  2:59                                                                 ` Richard Stallman
  2023-07-23  5:28                                                                   ` Eli Zaretskii
  1 sibling, 1 reply; 199+ messages in thread
From: Richard Stallman @ 2023-07-23  2:59 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: 64735

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > If someone was thinking of doing that, they would be better off
  > responding to RMS's earlier request for C programmers to optimize this
  > behavior in find.

  > Since, after all, if we do it that way it will benefit remote files as
  > well.

I wonder if some different way of specifying what to ignore might make
a faster implementation possible.  Regexps are general but matching
them tends to be slow.  Maybe some less general pattern matching could
be sufficient for these features while faster.

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)







^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 21:01             ` Dmitry Gutov
@ 2023-07-23  5:11               ` Eli Zaretskii
  2023-07-23 10:46                 ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23  5:11 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Cc: Ihor Radchenko <yantar92@posteo.net>, 64735@debbugs.gnu.org
> Date: Sun, 23 Jul 2023 00:01:28 +0300
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> On 22/07/2023 09:39, Po Lu wrote:
> > 
> > Programs that use find should fall back to directory-files-recursively
> > when any of the situations above are detected.
> 
> Perhaps if someone implements support for IGNORE entries (wildcards) in 
> that function, it would be easy enough to do that fallback.

Shouldn't be hard, since it already filters some of them:

    (dolist (file (sort (file-name-all-completions "" dir)
                        'string<))
      (unless (member file '("./" "../"))  <<<<<<<<<<<<<<<<<<<

Even better: compute completion-regexp-list so that IGNOREs are
filtered by file-name-all-completions in the first place.

Patches welcome.
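
A minimal sketch of the simpler variant (Emacs regexps have no
negation, so this filters after the listing rather than inside it;
`ignore-re' is a hypothetical precomputed regexp built from the IGNORE
wildcards):

;; Sketch: list completions, then drop entries matching the combined
;; ignore regexp.
(seq-remove (lambda (file) (string-match-p ignore-re file))
            (file-name-all-completions "" dir))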





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23  2:59                                                                 ` Richard Stallman
@ 2023-07-23  5:28                                                                   ` Eli Zaretskii
  0 siblings, 0 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23  5:28 UTC (permalink / raw)
  To: rms; +Cc: sbaugh, 64735

> Cc: 64735@debbugs.gnu.org
> From: Richard Stallman <rms@gnu.org>
> Date: Sat, 22 Jul 2023 22:59:02 -0400
> 
>   > If someone was thinking of doing that, they would be better off
>   > responding to RMS's earlier request for C programmers to optimize this
>   > behavior in find.
> 
>   > Since, after all, if we do it that way it will benefit remote files as
>   > well.
> 
> I wonder if some different way of specifying what to ignore might make
> a faster implementation possible.  Regexps are general but matching
> them tends to be slow.  Maybe some less general pattern matching could
> be sufficient for these features while faster.

You are thinking about matching in Find, or about matching in Emacs?

If the former, they can probably use 'fnmatch' or somesuch, to match
against shell wildcards.

If the latter, we don't have any pattern matching capabilities in
Emacs except fixed strings and regexps.
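
There is, though, a standard helper for bridging the two:
`wildcard-to-regexp' (from files.el) translates a shell glob into an
Emacs regexp, so glob-style patterns can still be matched in Lisp:

;; Translate a glob into an Emacs regexp and match a file name with it.
(let ((re (wildcard-to-regexp "*.elc")))
  (string-match-p re "foo.elc"))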





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-21 13:46                                   ` Ihor Radchenko
  2023-07-21 15:41                                     ` Dmitry Gutov
@ 2023-07-23  5:40                                     ` Ihor Radchenko
  2023-07-23 11:50                                       ` Michael Albinus
  1 sibling, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-23  5:40 UTC (permalink / raw)
  To: Dmitry Gutov, Michael Albinus; +Cc: sbaugh, Eli Zaretskii, 64735

Ihor Radchenko <yantar92@posteo.net> writes:

> On remote host, I can see that `find-lisp-find-files' must use
> tramp entries in `file-name-handler-alist'. Although, it will likely not
> be usable then - running GNU find on remote host is going to be
> unbeatable compared to repetitive TRAMP queries for file listing.

That said, Michael, could you please provide some insight into TRAMP
directory listing queries.  Could they be optimized further when we
need to query recursively rather than per directory?

GNU find is faster simply because it runs on the remote machine itself.
But AFAIU, if TRAMP could convert the repeated network requests for each
directory into a single request, it would speed things up significantly.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 20:53                                                                           ` Spencer Baugh
@ 2023-07-23  6:15                                                                             ` Eli Zaretskii
  2023-07-23  7:48                                                                             ` Ihor Radchenko
  2023-07-23 11:44                                                                             ` Michael Albinus
  2 siblings, 0 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23  6:15 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: yantar92, rms, sbaugh, dmitry, michael.albinus, 64735

> From: Spencer Baugh <sbaugh@janestreet.com>
> Cc: sbaugh@catern.com,  yantar92@posteo.net,  rms@gnu.org,
>    dmitry@gutov.dev,  michael.albinus@gmx.de,  64735@debbugs.gnu.org
> Date: Sat, 22 Jul 2023 16:53:05 -0400
> 
> Can you try this further change on your Windows (and GNU/Linux) box?  I
> just tested on a different box and my original change gets:
> 
> (("built-in" . "Elapsed time: 4.506643s (2.276269s in 21 GCs)")
>  ("with-find" . "Elapsed time: 4.114531s (2.848497s in 27 GCs)"))
> 
> while this parallel implementation gets
> 
> (("built-in" . "Elapsed time: 4.479185s (2.236561s in 21 GCs)")
>  ("with-find" . "Elapsed time: 2.858452s (1.934647s in 19 GCs)"))
> 
> so it might have a favorable impact on Windows and your other GNU/Linux
> box.

Almost no effect here on MS-Windows:

  (("built-in" . "Elapsed time: 0.859375s (0.093750s in 4 GCs)")
   ("with-find" . "Elapsed time: 8.437500s (0.078125s in 4 GCs)"))

It was 8.578 sec with the previous version.

(The Lisp version is somewhat faster here because I native-compiled
the code for this test.)

On GNU/Linux:

  (("built-in" . "Elapsed time: 4.244898s (1.934182s in 56 GCs)")
   ("with-find" . "Elapsed time: 3.011574s (1.190498s in 35 GCs)"))

Faster by 10% (previous version yielded 3.327 sec).

Btw, I needed to fix the code: when-let needs 2 open parens after it,
not one.  The original code signals an error from the filter function
in Emacs 29.

> >>   (cl-assert (null _predicate) t "find-directory-files-recursively can't accept arbitrary predicates")
> >
> > It should.
> 
> This is where I think a fallback would be useful - it's basically
> impossible to support arbitrary predicates efficiently here, since it
> requires us to put Lisp in control of whether find descends into a
> directory.

There's nothing wrong with supporting this less efficiently.

And there's no need to control where Find descends: you could just
filter out the files from those directories that need to be ignored.

> So I'm thinking I would just fall back to running the old
> directory-files-recursively whenever there's a predicate.  Or just not
> supporting this at all...

We cannot not support it at all, because then it will not be a
replacement.  Fallback is okay, though I'd prefer a self-contained
function.

> >> 	     (if follow-symlinks
> >> 		 '("-L")
> >> 	       '("!" "(" "-type" "l" "-xtype" "d" ")"))
> >> 	     (unless (string-empty-p regexp)
> >> 	       "-regex" (concat ".*" regexp ".*"))
> >> 	     (unless include-directories
> >> 	       '("!" "-type" "d"))
> >> 	     '("-print0")
> >
> > Some of these switches are specific to GNU Find.  Are we going to
> > support only GNU Find?
> 
> POSIX find doesn't support -regex, so I think we have to.  We could
> stick to just POSIX find if we only allowed globs in
> find-directory-files-recursively, instead of full regexes.

The latter would again be incompatible with
directory-files-recursively, so it isn't TRT, IMO.

One other subtlety is non-ASCII file names: you use the -print0 switch
to Find, which produces null bytes, and those could inhibit decoding of
non-ASCII characters.  So you may need to bind
inhibit-null-byte-detection to a non-nil value to correctly decode the
file names you get from Find.
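
A sketch of that binding around the decoding step (`raw-output' is a
hypothetical string holding the undecoded bytes from Find):

;; Sketch: prevent the NUL separators from forcing no-conversion when
;; the collected output is decoded.
(let ((inhibit-null-byte-detection t))
  (decode-coding-string raw-output 'undecided))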





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 20:53                                                                           ` Spencer Baugh
  2023-07-23  6:15                                                                             ` Eli Zaretskii
@ 2023-07-23  7:48                                                                             ` Ihor Radchenko
  2023-07-23  8:06                                                                               ` Eli Zaretskii
  2023-07-23 11:44                                                                             ` Michael Albinus
  2 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-23  7:48 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: rms, sbaugh, dmitry, michael.albinus, Eli Zaretskii, 64735

Spencer Baugh <sbaugh@janestreet.com> writes:

> Can you try this further change on your Windows (and GNU/Linux) box?  I
> just tested on a different box and my original change gets:

On GNU/Linux, with slight modifications 

(defun my-bench (count path regexp)
  (setq path (expand-file-name path))
  ;; (let ((old (directory-files-recursively path regexp))
  ;; 	(new (find-directory-files-recursively path regexp)))
  ;;   (dolist (path old)
  ;;     (should (member path new)))
  ;;   (dolist (path new)
  ;;     (should (member path old))))
  (list
   (cons "built-in" (benchmark count (list 'directory-files-recursively
					   path regexp)))
   (cons "built-in no handlers"
	 (let (file-name-handler-alist)
	   (benchmark count
		      (list 'directory-files-recursively path
			    regexp))))
   (cons "with-find" (benchmark count (list
				       'find-directory-files-recursively path regexp)))))


(my-bench 10 "/usr/src/linux/" "")

(("built-in" . "Elapsed time: 7.134589s (3.609741s in 10 GCs)")
 ("built-in no handlers" . "Elapsed time: 6.041666s (3.856730s in 11 GCs)")
 ("with-find" . "Elapsed time: 6.300330s (4.248508s in 12 GCs)"))

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 15:29                                                                             ` Eli Zaretskii
@ 2023-07-23  7:52                                                                               ` Ihor Radchenko
  2023-07-23  8:01                                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-23  7:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

Eli Zaretskii <eliz@gnu.org> writes:

>> Then how will the callback be different from
>> (mapc #'my-function (directory-files-recursively ...))
>> ?
>
> It depends on the application.  Applications that want to get all the
> data and only after that process it will not use the callback.  But I
> can certainly imagine an application that inserts the file names, or
> some of their transforms, into a buffer, and from time to time
> triggers redisplay to show the partial results.  Or an application
> could write the file names to some disk file or external consumer, or
> send them to a network process.

But won't the Elisp callback always result in a queue that will
effectively be synchronous?

Also, another idea could be using iterators - the applications can just
request "next" file as needed, without waiting for the full file list.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23  7:52                                                                               ` Ihor Radchenko
@ 2023-07-23  8:01                                                                                 ` Eli Zaretskii
  2023-07-23  8:11                                                                                   ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23  8:01 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev,
>  michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org
> Date: Sun, 23 Jul 2023 07:52:31 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Then how will the callback be different from
> >> (mapc #'my-function (directory-files-recursively ...))
> >> ?
> >
> > It depends on the application.  Applications that want to get all the
> > data and only after that process it will not use the callback.  But I
> > can certainly imagine an application that inserts the file names, or
> > some of their transforms, into a buffer, and from time to time
> > triggers redisplay to show the partial results.  Or an application
> > could write the file names to some disk file or external consumer, or
> > send them to a network process.
> 
> But won't the Elisp callback always result in a queue that will
> effectively be synchronous?

I don't understand the question (what queue?), and understand even
less what you are trying to say here.  Please elaborate.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23  7:48                                                                             ` Ihor Radchenko
@ 2023-07-23  8:06                                                                               ` Eli Zaretskii
  2023-07-23  8:16                                                                                 ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23  8:06 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: Eli Zaretskii <eliz@gnu.org>, sbaugh@catern.com, rms@gnu.org,
>  dmitry@gutov.dev, michael.albinus@gmx.de, 64735@debbugs.gnu.org
> Date: Sun, 23 Jul 2023 07:48:45 +0000
> 
> (my-bench 10 "/usr/src/linux/" "")
> 
> (("built-in" . "Elapsed time: 7.134589s (3.609741s in 10 GCs)")
>  ("built-in no handlers" . "Elapsed time: 6.041666s (3.856730s in 11 GCs)")
>  ("with-find" . "Elapsed time: 6.300330s (4.248508s in 12 GCs)"))

Is this in "emacs -Q"?  Why so much time taken by GC?  It indicates
that temporarily raising the GC thresholds could speed up things by a
factor of 2 or 3.
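
One cheap experiment along those lines (the threshold value is
arbitrary, just for illustration):

;; Sketch: raise the GC threshold only for the duration of the scan,
;; so intermediate consing doesn't trigger dozens of collections.
(let ((gc-cons-threshold (* 100 1024 1024)))  ; 100 MiB, illustrative
  (directory-files-recursively "/usr/lib" ""))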





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23  8:01                                                                                 ` Eli Zaretskii
@ 2023-07-23  8:11                                                                                   ` Ihor Radchenko
  2023-07-23  9:11                                                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-23  8:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

Eli Zaretskii <eliz@gnu.org> writes:

>> >> Then how will the callback be different from
>> >> (mapc #'my-function (directory-files-recursively ...))
>> >> ?
>> >
>> > It depends on the application.  Applications that want to get all the
>> > data and only after that process it will not use the callback.  But I
>> > can certainly imagine an application that inserts the file names, or
>> > some of their transforms, into a buffer, and from time to time
>> > triggers redisplay to show the partial results.  Or an application
>> > could write the file names to some disk file or external consumer, or
>> > send them to a network process.
>> 
>> But won't the Elisp callback always result in a queue that will
>> effectively be synchronous?
>
> I don't understand the question (what queue?), and understand even
> less what you are trying to say here.  Please elaborate.

Consider (async-directory-files-recursively dir regexp callback) with
callback being (lambda (file) (start-process "Copy" nil "cp" file "/tmp/")).

`async-directory-files-recursively' may fire CALLBACK very frequently.
According to the other benchmarks in this thread, a file name may be
retrieved from a directory within 1e-6 s or even less.  Elisp will have
to arrange the callbacks to run immediately one after another (in a
queue), which will not be very different from just running the
callbacks in a synchronous loop.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23  8:06                                                                               ` Eli Zaretskii
@ 2023-07-23  8:16                                                                                 ` Ihor Radchenko
  2023-07-23  9:13                                                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-23  8:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

Eli Zaretskii <eliz@gnu.org> writes:

>> (("built-in" . "Elapsed time: 7.134589s (3.609741s in 10 GCs)")
>>  ("built-in no handlers" . "Elapsed time: 6.041666s (3.856730s in 11 GCs)")
>>  ("with-find" . "Elapsed time: 6.300330s (4.248508s in 12 GCs)"))
>
> Is this in "emacs -Q"?  Why so much time taken by GC?  It indicates
> that temporarily raising the GC thresholds could speed up things by a
> factor of 2 or 3.

With emacs -Q, the results are similar in terms of absolute time spent
doing GC:

(("built-in" . "Elapsed time: 5.706795s (3.332933s in 304 GCs)")
 ("built-in no handlers" . "Elapsed time: 4.535871s (3.161111s in 301 GCs)")
 ("with-find" . "Elapsed time: 4.829426s (3.333890s in 274 GCs)"))

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23  8:11                                                                                   ` Ihor Radchenko
@ 2023-07-23  9:11                                                                                     ` Eli Zaretskii
  2023-07-23  9:34                                                                                       ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23  9:11 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev,
>  michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org
> Date: Sun, 23 Jul 2023 08:11:56 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> But won't the Elisp callback always result in a queue that will
> >> effectively be synchronous?
> >
> > I don't understand the question (what queue?), and understand even
> > less what you are trying to say here.  Please elaborate.
> 
> Consider (async-directory-files-recursively dir regexp callback) with
> callback being (lambda (file) (start-process "Copy" nil "cp" file "/tmp/")).

What is async-directory-files-recursively, and why are we talking
about it?  I was talking about an implementation of
directory-files-recursively as a primitive in C.  That's not async
code.  So I don't understand why we are talking about some
hypothetical async implementation.

> `async-directory-files-recursively' may fire CALLBACK very frequently.
> According to the other benchmarks in this thread, a file from directory
> may be retrieved within 10E-6s or even less. Elisp will have to arrange
> the callbacks to run immediately one after other (in a queue).
> Which will not be very different compared to just running callbacks in a
> synchronous loop.

Regardless of my confusion above, no one said the callback must
necessarily operate on each file as soon as its name was retrieved,
nor even that the callback must be called for each file.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23  8:16                                                                                 ` Ihor Radchenko
@ 2023-07-23  9:13                                                                                   ` Eli Zaretskii
  2023-07-23  9:16                                                                                     ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23  9:13 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: sbaugh@janestreet.com, sbaugh@catern.com, rms@gnu.org, dmitry@gutov.dev,
>  michael.albinus@gmx.de, 64735@debbugs.gnu.org
> Date: Sun, 23 Jul 2023 08:16:05 +0000
> 
> With emacs -Q, the results are similar in terms of absolute time spent
> doing GC:
> 
> (("built-in" . "Elapsed time: 5.706795s (3.332933s in 304 GCs)")
>  ("built-in no handlers" . "Elapsed time: 4.535871s (3.161111s in 301 GCs)")
>  ("with-find" . "Elapsed time: 4.829426s (3.333890s in 274 GCs)"))

Strange.  On my system, GC takes about 8% of the run time.  Maybe it's
a function of how many files are retrieved?





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23  9:13                                                                                   ` Eli Zaretskii
@ 2023-07-23  9:16                                                                                     ` Ihor Radchenko
  0 siblings, 0 replies; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-23  9:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

Eli Zaretskii <eliz@gnu.org> writes:

>> With emacs -Q, the results are similar in terms of absolute time spent
>> doing GC:
>> 
>> (("built-in" . "Elapsed time: 5.706795s (3.332933s in 304 GCs)")
>>  ("built-in no handlers" . "Elapsed time: 4.535871s (3.161111s in 301 GCs)")
>>  ("with-find" . "Elapsed time: 4.829426s (3.333890s in 274 GCs)"))
>
> Strange.  On my system, GC takes about 8% of the run time.  Maybe it's
> a function of how many files are retrieved?

Most likely.
(length (directory-files-recursively "/usr/src/linux/" "")) ; => 145489

My test produces a very long list of files, 10 times for each function
variant.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23  9:11                                                                                     ` Eli Zaretskii
@ 2023-07-23  9:34                                                                                       ` Ihor Radchenko
  2023-07-23  9:39                                                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-23  9:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

Eli Zaretskii <eliz@gnu.org> writes:

>> Consider (async-directory-files-recursively dir regexp callback) with
>> callback being (lambda (file) (start-process "Copy" nil "cp" file "/tmp/")).
>
> What is async-directory-files-recursively, and why are we talking
> about it?  I was talking about an implementation of
> directory-files-recursively as a primitive in C.  That's not async
> code.  So I don't understand why we are talking about some
> hypothetical async implementation.

Then, could you elaborate on how you envision the proposed callback
interface?
I clearly did not understand what you had in mind.

>> `async-directory-files-recursively' may fire CALLBACK very frequently.
>> According to the other benchmarks in this thread, a file from directory
>> may be retrieved within 10E-6s or even less. Elisp will have to arrange
>> the callbacks to run immediately one after other (in a queue).
>> Which will not be very different compared to just running callbacks in a
>> synchronous loop.
>
> Regardless of my confusion above, no one said the callback must
> necessarily operate on each file as soon as its name was retrieved,
> nor even that the callback must be called for each file.

The only callback paradigm I know of in Emacs is something like process
sentinels. Do you have something else in mind?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23  9:34                                                                                       ` Ihor Radchenko
@ 2023-07-23  9:39                                                                                         ` Eli Zaretskii
  2023-07-23  9:42                                                                                           ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23  9:39 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev,
>  michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org
> Date: Sun, 23 Jul 2023 09:34:20 +0000
> 
> The only callback paradigm I know of in Emacs is something like process
> sentinels. Do you have something else in mind?

Think about an API that is passed a function, and calls that function
when appropriate, to perform caller-defined processing of the stuff
generated by the API's implementation.  That function is what I
referred to as "callback".





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23  9:39                                                                                         ` Eli Zaretskii
@ 2023-07-23  9:42                                                                                           ` Ihor Radchenko
  2023-07-23 10:20                                                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-23  9:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

Eli Zaretskii <eliz@gnu.org> writes:

>> The only callback paradigm I know of in Emacs is something like process
>> sentinels. Do you have something else in mind?
>
> Think about an API that is passed a function, and calls that function
> when appropriate, to perform caller-defined processing of the stuff
> generated by the API's implementation.  That function is what I
> referred to as "callback".

But what is the strategy that should be used to call the CALLBACK?
You clearly had something other than "call as soon as we got another
file name" in mind.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23  9:42                                                                                           ` Ihor Radchenko
@ 2023-07-23 10:20                                                                                             ` Eli Zaretskii
  2023-07-23 11:43                                                                                               ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23 10:20 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev,
>  michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org
> Date: Sun, 23 Jul 2023 09:42:45 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Think about an API that is passed a function, and calls that function
> > when appropriate, to perform caller-defined processing of the stuff
> > generated by the API's implementation.  That function is what I
> > referred to as "callback".
> 
> But what is the strategy that should be used to call the CALLBACK?
> You clearly had something other than "call as soon as we got another
> file name" in mind.

It could be "call as soon as we got 100 file names", for example.  The
number can even be a separate parameter passed to the API.
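
Such a batched contract could look roughly like the following Lisp
sketch.  The function name and the BATCH-SIZE parameter are
hypothetical, and a real implementation would be a C primitive rather
than a wrapper around `directory-files-recursively':

```elisp
;; Hypothetical sketch of the batched-callback contract being
;; discussed; a real implementation would live in C.  Here the
;; directory walk itself is simply delegated to the existing
;; `directory-files-recursively'.
(defun directory-files-recursively-with-callback
    (dir regexp callback &optional batch-size)
  "Call CALLBACK with batches of up to BATCH-SIZE file names from DIR."
  (let ((batch-size (or batch-size 100))
        batch)
    (dolist (file (directory-files-recursively dir regexp))
      (push file batch)
      (when (>= (length batch) batch-size)
        (funcall callback (nreverse batch))
        (setq batch nil)))
    ;; Flush the final, possibly short, batch.
    (when batch
      (funcall callback (nreverse batch)))))
```

A caller could then, for example, start processing each batch of 100
names while the walk continues, instead of waiting for the full listing.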






^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23  5:11               ` Eli Zaretskii
@ 2023-07-23 10:46                 ` Dmitry Gutov
  2023-07-23 11:18                   ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-23 10:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 23/07/2023 08:11, Eli Zaretskii wrote:
> Even better: compute completion-regexp-list so that IGNOREs are
> filtered by file-name-all-completions in the first place.

We don't have lookahead in Emacs regexps, so I'm not sure it's possible
to construct a regexp that says "don't match entries A, B and C".





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 10:46                 ` Dmitry Gutov
@ 2023-07-23 11:18                   ` Eli Zaretskii
  2023-07-23 17:46                     ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23 11:18 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Sun, 23 Jul 2023 13:46:30 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> On 23/07/2023 08:11, Eli Zaretskii wrote:
> > Even better: compute completion-regexp-list so that IGNOREs are
> > filtered by file-name-all-completions in the first place.
> 
> We don't have lookahead in Emacs regexps, so I'm not sure it's possible
> to construct a regexp that says "don't match entries A, B and C".

Well, maybe just having a way of telling file-name-all-completions to
negate the sense of completion-regexp-list would be enough to make
that happen?
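
Until something like that exists, the negation has to be done in Lisp
after the fact, roughly like this (a sketch; `ignore-regexps' stands in
for regexps derived from `grep-find-ignored-files'):

```elisp
;; What a caller must do today: fetch all completions, then drop the
;; ignored entries in Lisp.  A negation flag would let
;; `file-name-all-completions' skip them in C, before Lisp strings
;; for them are even created.
(require 'seq)

(defun my-files-without-ignores (dir ignore-regexps)
  "Return files in DIR whose names match none of IGNORE-REGEXPS."
  (seq-remove (lambda (file)
                (seq-some (lambda (re) (string-match-p re file))
                          ignore-regexps))
              (file-name-all-completions "" dir)))
```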





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 10:20                                                                                             ` Eli Zaretskii
@ 2023-07-23 11:43                                                                                               ` Ihor Radchenko
  2023-07-23 12:49                                                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-23 11:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

Eli Zaretskii <eliz@gnu.org> writes:

>> But what is the strategy that should be used to call the CALLBACK?
>> You clearly had something other than "call as soon as we got another
>> file name" in mind.
>
> It could be "call as soon as we got 100 file names", for example.  The
> number can even be a separate parameter passed to the API.

Will consing the filename strings also be delayed until the callback is invoked?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-22 20:53                                                                           ` Spencer Baugh
  2023-07-23  6:15                                                                             ` Eli Zaretskii
  2023-07-23  7:48                                                                             ` Ihor Radchenko
@ 2023-07-23 11:44                                                                             ` Michael Albinus
  2 siblings, 0 replies; 199+ messages in thread
From: Michael Albinus @ 2023-07-23 11:44 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: yantar92, rms, sbaugh, dmitry, Eli Zaretskii, 64735

Spencer Baugh <sbaugh@janestreet.com> writes:

Hi Spencer,

> I mean having Emacs read output from the process and turn it into
> strings while find is still running and walking the directory tree.  So
> the two parts are running in parallel.  This, specifically:

Just as a POC, I have modified your function slightly so that it runs
with both local and remote directories.

--8<---------------cut here---------------start------------->8---
(defun find-directory-files-recursively (dir regexp &optional include-directories _predicate follow-symlinks)
  (let* (buffered
         result
	 (remote (file-remote-p dir))
	 (file-name-handler-alist (and remote file-name-handler-alist))
         (proc
	  (make-process
           :name "find" :buffer nil
	   :connection-type 'pipe
	   :noquery t
	   :sentinel #'ignore
	   :file-handler remote
           :filter (lambda (proc data)
                     (let ((start 0))
		       (when-let ((end (string-search "\0" data start)))
			 (push (concat buffered (substring data start end)) result)
			 (setq buffered "")
			 (setq start (1+ end))
			 (while-let ((end (string-search "\0" data start)))
                           (push (substring data start end) result)
                           (setq start (1+ end))))
                       (setq buffered (concat buffered (substring data start)))))
	   :command (append
	             (list "find" (file-local-name dir))
	             (if follow-symlinks
		         '("-L")
	               '("!" "(" "-type" "l" "-xtype" "d" ")"))
	             (unless (string-empty-p regexp)
	               (list "-regex" (concat ".*" regexp ".*")))
	             (unless include-directories
	               '("!" "-type" "d"))
	             '("-print0")
	             ))))
    (while (accept-process-output proc))
    (if remote (mapcar (lambda (file) (concat remote file)) result) result)))
--8<---------------cut here---------------end--------------->8---

On my laptop, this returns:

--8<---------------cut here---------------start------------->8---
(my-bench 100 "~/src/tramp" "")
(("built-in" . "Elapsed time: 99.177562s (3.403403s in 107 GCs)")
 ("with-find" . "Elapsed time: 83.432360s (2.820053s in 98 GCs)"))

(my-bench 100 "/ssh:remotehost:~/src/tramp" "")
(("built-in" . "Elapsed time: 128.406359s (34.981183s in 1850 GCs)")
 ("with-find" . "Elapsed time: 82.765064s (4.155410s in 163 GCs)"))
--8<---------------cut here---------------end--------------->8---
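
The `my-bench' helper is not shown in this message; a plausible
reconstruction (an assumption on my part, relying on `benchmark'
returning its "Elapsed time: ..." message string) is:

```elisp
(require 'benchmark)

;; Assumed shape of the `my-bench' helper behind the timings above;
;; the exact definition was posted earlier in the thread.  `benchmark'
;; runs FORM COUNT times and returns the "Elapsed time: ..." string.
(defun my-bench (count dir regexp)
  (list
   (cons "built-in"
         (benchmark count `(directory-files-recursively ,dir ,regexp)))
   (cons "with-find"
         (benchmark count `(find-directory-files-recursively ,dir ,regexp)))))
```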

Of course, the other problems still remain.  For example, you cannot
know whether find on a given host (local or remote) supports all the
arguments.  On my NAS, we have:

--8<---------------cut here---------------start------------->8---
[~] # find -h
BusyBox v1.01 (2022.10.27-23:57+0000) multi-call binary

Usage: find [PATH...] [EXPRESSION]

Search for files in a directory hierarchy.  The default PATH is
the current directory; default EXPRESSION is '-print'

EXPRESSION may consist of:
	-follow		Dereference symbolic links.
	-name PATTERN	File name (leading directories removed) matches PATTERN.
	-print		Print (default and assumed).

	-type X		Filetype matches X (where X is one of: f,d,l,b,c,...)
	-perm PERMS	Permissions match any of (+NNN); all of (-NNN);
			or exactly (NNN)
	-mtime TIME	Modified time is greater than (+N); less than (-N);
			or exactly (N) days
--8<---------------cut here---------------end--------------->8---

Best regards, Michael.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23  5:40                                     ` Ihor Radchenko
@ 2023-07-23 11:50                                       ` Michael Albinus
  2023-07-24  7:35                                         ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Michael Albinus @ 2023-07-23 11:50 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Dmitry Gutov, Eli Zaretskii, 64735, sbaugh

Ihor Radchenko <yantar92@posteo.net> writes:

Hi Ihor,

>> On remote host, I can see that `find-lisp-find-files' must use
>> tramp entries in `file-name-handler-alist'. Although, it will likely not
>> be usable then - running GNU find on remote host is going to be
>> unbeatable compared to repetitive TRAMP queries for file listing.
>
> That said, Michael, could you please provide some insight into TRAMP's
> directory-listing queries?  Could they be optimized more when we need
> to query recursively rather than per directory?

Tramp is just a stupid library, without intelligence of its own.  It
offers alternative implementations for the set of primitive operations
listed in (info "(elisp) Magic File Names").

There's no optimization with respect to bundling several operations
into a more suitable remote command.

> GNU find is faster simply because it is running on remote machine itself.
> But AFAIU, if TRAMP could convert repetitive network request for each
> directory into a single request, it would speed things up significantly.

If you want something like this, you must add directory-files-recursively
to that list of primitive operations.
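
If `directory-files-recursively' became a primitive operation, its
entry point would dispatch through the file-name-handler machinery the
way other magic operations do, roughly like this sketch (the wrapper
name is hypothetical; the dispatch pattern mirrors existing primitives):

```elisp
;; Sketch: magic-file-name dispatch for `directory-files-recursively',
;; mirroring what primitives such as `directory-files' do via
;; `find-file-name-handler'.
(defun my-directory-files-recursively (dir regexp &rest args)
  (let ((handler (find-file-name-handler dir 'directory-files-recursively)))
    (if handler
        ;; Tramp's handler could then run a single remote `find'
        ;; instead of one remote query per subdirectory.
        (apply handler 'directory-files-recursively dir regexp args)
      (apply #'directory-files-recursively dir regexp args))))
```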

Best regards, Michael.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 11:43                                                                                               ` Ihor Radchenko
@ 2023-07-23 12:49                                                                                                 ` Eli Zaretskii
  2023-07-23 12:57                                                                                                   ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23 12:49 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev,
>  michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org
> Date: Sun, 23 Jul 2023 11:43:22 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> But what is the strategy that should be used to call the CALLBACK?
> >> You clearly had something other than "call as soon as we got another
> >> file name" in mind.
> >
> > It could be "call as soon as we got 100 file names", for example.  The
> > number can even be a separate parameter passed to the API.
> 
> Will consing the filename strings also be delayed until the callback is invoked?

No.  I don't think it's possible (or desirable).  We could keep them
in some malloc'ed buffer, of course, but what's the point?  This would
only be justified if somehow creation of Lisp strings proved to be a
terrible bottleneck, which would leave me mightily surprised.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 12:49                                                                                                 ` Eli Zaretskii
@ 2023-07-23 12:57                                                                                                   ` Ihor Radchenko
  2023-07-23 13:32                                                                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-23 12:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

Eli Zaretskii <eliz@gnu.org> writes:

>> > It could be "call as soon as we got 100 file names", for example.  The
>> > number can even be a separate parameter passed to the API.
>> 
>> Will consing the filename strings also be delayed until the callback is invoked?
>
> No.  I don't think it's possible (or desirable).  We could keep them
> in some malloc'ed buffer, of course, but what's the point?  This would
> only be justified if somehow creation of Lisp strings proved to be a
> terrible bottleneck, which would leave me mightily surprised.

Thanks for the clarification!
Then, would it make sense to make such a callback API more general (not
just for listing directory files)?

For example, the callbacks might be attached to a list variable that
accumulates the async results.  Then, the callbacks would be called on
that list, similar to how process sentinels are called when a chunk of
output arrives in the process buffer.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 12:57                                                                                                   ` Ihor Radchenko
@ 2023-07-23 13:32                                                                                                     ` Eli Zaretskii
  2023-07-23 13:56                                                                                                       ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23 13:32 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev,
>  michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org
> Date: Sun, 23 Jul 2023 12:57:53 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> > It could be "call as soon as we got 100 file names", for example.  The
> >> > number can even be a separate parameter passed to the API.
> >> 
> >> Will consing the filename strings also be delayed until the callback is invoked?
> >
> > No.  I don't think it's possible (or desirable).  We could keep them
> > in some malloc'ed buffer, of course, but what's the point?  This would
> > only be justified if somehow creation of Lisp strings proved to be a
> > terrible bottleneck, which would leave me mightily surprised.
> 
> Thanks for the clarification!
> Then, would it make sense to have such a callback API more general? (not
> just for listing directory files).
> 
> For example, the callbacks might be attached to a list variable that
> will accumulate the async results. Then, the callbacks will be called on
> that list, similar to how process sentinels are called when a chunk of
> output is arriving to the process buffer.

Anything's possible, but when a function produces text, like file
names, then the natural thing is either to return them as strings or
to insert them into some buffer.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 13:32                                                                                                     ` Eli Zaretskii
@ 2023-07-23 13:56                                                                                                       ` Ihor Radchenko
  2023-07-23 14:32                                                                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-23 13:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

Eli Zaretskii <eliz@gnu.org> writes:

> Anything's possible, but when a function produces text, like file
> names, then the natural thing is either to return them as strings or
> to insert them into some buffer.

Do you mean to re-use the process buffer and process API, but for
internal asynchronous C functions (rather than sub-processes)?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 13:56                                                                                                       ` Ihor Radchenko
@ 2023-07-23 14:32                                                                                                         ` Eli Zaretskii
  0 siblings, 0 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23 14:32 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev,
>  michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org
> Date: Sun, 23 Jul 2023 13:56:35 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Anything's possible, but when a function produces text, like file
> > names, then the natural thing is either to return them as strings or
> > to insert them into some buffer.
> 
> Do you mean to re-use process buffer and process API, but for internal
> asynchronous C functions (rather than sub-processes)?

Not necessarily a process buffer, no.  Just some temporary buffer.  We
already do stuff like that for some C primitives.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 11:18                   ` Eli Zaretskii
@ 2023-07-23 17:46                     ` Dmitry Gutov
  2023-07-23 17:56                       ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-23 17:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 23/07/2023 14:18, Eli Zaretskii wrote:
>> Date: Sun, 23 Jul 2023 13:46:30 +0300
>> Cc:luangruo@yahoo.com,sbaugh@janestreet.com,yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov<dmitry@gutov.dev>
>>
>> On 23/07/2023 08:11, Eli Zaretskii wrote:
>>> Even better: compute completion-regexp-list so that IGNOREs are
>>> filtered by file-name-all-completions in the first place.
>> We don't have lookahead in Emacs regexps, so I'm not sure it's possible
>> to construct a regexp that says "don't match entries A, B and C".
> Well, maybe just having a way of telling file-name-all-completions to
> negate the sense of completion-regexp-list would be enough to make
> that happen?

Some way to do that is certainly possible (e.g. a new option and
corresponding code, though perhaps not even that); it's just that the
person implementing it should consider the performance of the resulting
solution.

And, ideally, do all the relevant benchmarking when proposing the change.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 17:46                     ` Dmitry Gutov
@ 2023-07-23 17:56                       ` Eli Zaretskii
  2023-07-23 17:58                         ` Dmitry Gutov
  2023-07-23 19:27                         ` Dmitry Gutov
  0 siblings, 2 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23 17:56 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Sun, 23 Jul 2023 20:46:19 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> On 23/07/2023 14:18, Eli Zaretskii wrote:
> >> Date: Sun, 23 Jul 2023 13:46:30 +0300
> >> Cc:luangruo@yahoo.com,sbaugh@janestreet.com,yantar92@posteo.net,
> >>   64735@debbugs.gnu.org
> >> From: Dmitry Gutov<dmitry@gutov.dev>
> >>
> >> On 23/07/2023 08:11, Eli Zaretskii wrote:
> >>> Even better: compute completion-regexp-list so that IGNOREs are
> >>> filtered by file-name-all-completions in the first place.
> >> We don't have lookahead in Emacs regexps, so I'm not sure it's possible
> >> to construct a regexp that says "don't match entries A, B and C".
> > Well, maybe just having a way of telling file-name-all-completions to
> > negate the sense of completion-regexp-list would be enough to make
> > that happen?
> 
> Some way to do that is certainly possible (e.g. a new option and 
> corresponding code, maybe; maybe not), it's just that the person 
> implementing it should consider the performance of the resulting solution.

I agree.  However, if we are going to implement filtering of file
names, I don't think it matters where in the pipeline to perform the
filtering.  The advantage of using completion-regexp-list is that the
matching is done in C, so is probably at least a tad faster.

> And, ideally, do all the relevant benchmarking when proposing the change.

Of course.  Although the benchmarks so far already show considerable
variability.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 17:56                       ` Eli Zaretskii
@ 2023-07-23 17:58                         ` Dmitry Gutov
  2023-07-23 18:21                           ` Eli Zaretskii
  2023-07-23 19:27                         ` Dmitry Gutov
  1 sibling, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-23 17:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 23/07/2023 20:56, Eli Zaretskii wrote:
>> Date: Sun, 23 Jul 2023 20:46:19 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>> On 23/07/2023 14:18, Eli Zaretskii wrote:
>>>> Date: Sun, 23 Jul 2023 13:46:30 +0300
>>>> Cc:luangruo@yahoo.com,sbaugh@janestreet.com,yantar92@posteo.net,
>>>>    64735@debbugs.gnu.org
>>>> From: Dmitry Gutov<dmitry@gutov.dev>
>>>>
>>>> On 23/07/2023 08:11, Eli Zaretskii wrote:
>>>>> Even better: compute completion-regexp-list so that IGNOREs are
>>>>> filtered by file-name-all-completions in the first place.
>>>> We don't have lookahead in Emacs regexps, so I'm not sure it's possible
>>>> to construct regexp that says "don't match entries A, B and C".
>>> Well, maybe just having a way of telling file-name-all-completions to
>>> negate the sense of completion-regexp-list would be enough to make
>>> that happen?
>>
>> Some way to do that is certainly possible (e.g. a new option and
>> corresponding code, maybe; maybe not), it's just that the person
>> implementing it should consider the performance of the resulting solution.
> 
> I agree.  However, if we are going to implement filtering of file
> names, I don't think it matters where in the pipeline to perform the
> filtering.

A possible advantage of doing it earlier is that, if the filtering 
happens in C code, you could do it before allocating Lisp strings, 
thereby lowering the resulting GC pressure at the outset.






^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 17:58                         ` Dmitry Gutov
@ 2023-07-23 18:21                           ` Eli Zaretskii
  2023-07-23 19:07                             ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23 18:21 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Sun, 23 Jul 2023 20:58:24 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> A possible advantage of doing it earlier, is that if filtering happens 
> in C code you could do it before allocating Lisp strings

That's not what happens today.  And it isn't easy to do what you
suggest, since the file names we get from the C APIs need to be
decoded, and that is awkward at best with C strings.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 18:21                           ` Eli Zaretskii
@ 2023-07-23 19:07                             ` Dmitry Gutov
  2023-07-23 19:27                               ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-23 19:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 23/07/2023 21:21, Eli Zaretskii wrote:
>> Date: Sun, 23 Jul 2023 20:58:24 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>> A possible advantage of doing it earlier, is that if filtering happens
>> in C code you could do it before allocating Lisp strings
> 
> That's not what happens today.  And it isn't easy to do what you
> suggest, since the file names we get from the C APIs need to be
> decoded, and that is awkward at best with C strings.

It is what happens today when 'find' is used, though.

Far be it from me to insist, but if we indeed reimplemented all 
the good parts of 'find', that would make the new function a suitable 
replacement/improvement, at least on local hosts (instead of it just 
being used as a fallback).





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 17:56                       ` Eli Zaretskii
  2023-07-23 17:58                         ` Dmitry Gutov
@ 2023-07-23 19:27                         ` Dmitry Gutov
  2023-07-24 11:20                           ` Eli Zaretskii
  1 sibling, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-23 19:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 23/07/2023 20:56, Eli Zaretskii wrote:
>> And, ideally, do all the relevant benchmarking when proposing the change.
> Of course.  Although the benchmarks until now already show quite a
> variability.

Speaking of your MS Windows results that are unflattering to 'find', it 
might be worth doing a more varied comparison, to determine the 
OS-specific bottleneck.

Off the top of my head, here are some possibilities:

1. 'find' itself is much slower there. There is room for improvement in 
the port.

2. The process output handling is worse.

3. Something particular to the project being used for the test.

To look into possibility #1, you can try running the same command in 
the terminal with the output to NUL and comparing the runtime to what's 
reported in the benchmark.

I actually remember, from my time on MS Windows about 10 years ago, that 
some older ports of 'find' and/or 'grep' did have performance problems, 
but IIRC ezwinports contained the improved versions.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 19:07                             ` Dmitry Gutov
@ 2023-07-23 19:27                               ` Eli Zaretskii
  2023-07-23 19:44                                 ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-23 19:27 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Sun, 23 Jul 2023 22:07:17 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> On 23/07/2023 21:21, Eli Zaretskii wrote:
> >> Date: Sun, 23 Jul 2023 20:58:24 +0300
> >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
> >>   64735@debbugs.gnu.org
> >> From: Dmitry Gutov <dmitry@gutov.dev>
> >>
> >> A possible advantage of doing it earlier, is that if filtering happens
> >> in C code you could do it before allocating Lisp strings
> > 
> > That's not what happens today.  And it isn't easy to do what you
> > suggest, since the file names we get from the C APIs need to be
> > decoded, and that is awkward at best with C strings.
> 
> It is what happens today when 'find' is used, though.

No, I was talking about what file-name-all-completions does.

> Far be it from me to insist, though, but if we indeed reimplemented all 
> the good parts of 'find', that would make the new function a suitable 
> replacement/improvement, at least on local hosts (instead of it just 
> being used as a fallback).

The basic problem here is this: the regexp or pattern to filter out
ignorables is specified as a Lisp string, which is in the internal
Emacs representation of characters.  So to compare file names we
receive either from Find or from a C API, we need either to decode the
file names we receive (which in practice means they should be Lisp
strings), or encode the regexp and use its C string payload.
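
In Lisp terms, the two options could be sketched like this (here
`raw-name' stands for an undecoded, unibyte file name as it would
arrive from a C API, and `ignore-regexp' for the Lisp-string pattern;
both forms are illustrative, not existing code):

    ;; Option 1: decode each incoming file name, match the Lisp regexp.
    (string-match-p ignore-regexp
                    (decode-coding-string raw-name file-name-coding-system))

    ;; Option 2: encode the regexp once, match the undecoded names.
    (let ((raw-regexp (encode-coding-string ignore-regexp
                                            file-name-coding-system)))
      (string-match-p raw-regexp raw-name))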





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 19:27                               ` Eli Zaretskii
@ 2023-07-23 19:44                                 ` Dmitry Gutov
  0 siblings, 0 replies; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-23 19:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 23/07/2023 22:27, Eli Zaretskii wrote:
>> Far be it from me to insist, though, but if we indeed reimplemented all
>> the good parts of 'find', that would make the new function a suitable
>> replacement/improvement, at least on local hosts (instead of it just
>> being used as a fallback).
> The basic problem here is this: the regexp or pattern to filter out
> ignorables is specified as a Lisp string, which is in the internal
> Emacs representation of characters.  So to compare file names we
> receive either from Find or from a C API, we need either to decode the
> file names we receive (which in practice means they should be Lisp
> strings), or encode the regexp and use its C string payload.

Yes, the latter sounds more fiddly, but it seems to be *the* way to 
reach find's performance levels. But only the benchmarks can tell 
whether the hassle is worthwhile.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 11:50                                       ` Michael Albinus
@ 2023-07-24  7:35                                         ` Ihor Radchenko
  2023-07-24  7:59                                           ` Michael Albinus
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-24  7:35 UTC (permalink / raw)
  To: Michael Albinus; +Cc: Dmitry Gutov, Eli Zaretskii, 64735, sbaugh

Michael Albinus <michael.albinus@gmx.de> writes:

> Tramp is just a stupid library, w/o own intelligence. It offers
> alternative implementations for the set of primitive operations listed in
> (info "(elisp) Magic File Names")

This makes me wonder whether we can simply add a "find" file handler that
will use find as necessary when the GNU find executable is available.

>> GNU find is faster simply because it is running on remote machine itself.
>> But AFAIU, if TRAMP could convert repetitive network request for each
>> directory into a single request, it would speed things up significantly.
>
> If you want something like this, you must add directory-files-recursively
> to that list of primitive operations.

So, it is doable, and not difficult. Good to know.
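
Roughly along the lines of the standard handler pattern from (info
"(elisp) Magic File Names"), something like the sketch below, where
`my-remote-find-files' is a hypothetical helper that would build and run
the remote 'find' command:

    ;; Sketch of a magic file-name handler that intercepts only
    ;; `directory-files-recursively' and falls through for everything
    ;; else, per the pattern in the Elisp manual.
    (defun my-find-file-handler (operation &rest args)
      (if (eq operation 'directory-files-recursively)
          (apply #'my-remote-find-files args)
        ;; Handle any other operation the default way.
        (let ((inhibit-file-name-handlers
               (cons 'my-find-file-handler inhibit-file-name-handlers))
              (inhibit-file-name-operation operation))
          (apply operation args))))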

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-24  7:35                                         ` Ihor Radchenko
@ 2023-07-24  7:59                                           ` Michael Albinus
  2023-07-24  8:22                                             ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Michael Albinus @ 2023-07-24  7:59 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Dmitry Gutov, Eli Zaretskii, 64735, sbaugh

Ihor Radchenko <yantar92@posteo.net> writes:

Hi Ihor,

>> Tramp is just a stupid library, w/o own intelligence. It offers
>> alternative implementations for the set of primitive operations listed in
>> (info "(elisp) Magic File Names")
>
> This makes me wonder we can simply add a "find" file handler that will
> use find as necessary when GNU find executable is available.
>
>>> GNU find is faster simply because it is running on remote machine itself.
>>> But AFAIU, if TRAMP could convert repetitive network request for each
>>> directory into a single request, it would speed things up significantly.
>>
>> If you want something like this, you must add directory-files-recursively
>> to that list of primitive operations.
>
> So, it is doable, and not difficult. Good to know.

Technically it isn't difficult. But don't forget:

- We support already ~80 primitive operations.

- A new primitive operation must be handled by all Tramp backends, which
  could require up to 10 different implementations.

- I'm the only Tramp maintainer, for decades.

Best regards, Michael.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-24  7:59                                           ` Michael Albinus
@ 2023-07-24  8:22                                             ` Ihor Radchenko
  2023-07-24  9:31                                               ` Michael Albinus
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-24  8:22 UTC (permalink / raw)
  To: Michael Albinus; +Cc: Dmitry Gutov, Eli Zaretskii, 64735, sbaugh

Michael Albinus <michael.albinus@gmx.de> writes:

>> So, it is doable, and not difficult. Good to know.
>
> Technically it isn't difficult. But don't forget:
>
> - We support already ~80 primitive operations.
>
> - A new primitive operation must be handled by all Tramp backends, which
>   could require up to 10 different implementations.

Why so? `directory-files-recursively' is already supported by Tramp via
`directory-files'. But at least for some backends
`directory-files-recursively' may be implemented more efficiently. If
other backends do not implement it, `directory-files' will be used.

> - I'm the only Tramp maintainer, for decades.

I hope that the above approach with only some backends implementing such
support will not add too much of maintenance burden.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-24  8:22                                             ` Ihor Radchenko
@ 2023-07-24  9:31                                               ` Michael Albinus
  0 siblings, 0 replies; 199+ messages in thread
From: Michael Albinus @ 2023-07-24  9:31 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Dmitry Gutov, Eli Zaretskii, 64735, sbaugh

Ihor Radchenko <yantar92@posteo.net> writes:

Hi Ihor,

> Why so? `directory-files-recursively' is already supported by Tramp via
> `directory-files'.

It isn't supported by Tramp yet. Tramp has never heard about it.

> But at least for some backends `directory-files-recursively' may be
> implemented more efficiently. If other backends do not implement it,
> `directory-files' will be used.

Of course, and I did propose to add it. I just wanted to avoid an
inflation of proposals for primitive operations to be supported by Tramp.

And yes, not all backends need to implement their own version of
`directory-files-recursively'. But this could happen for other primitive
operations, so we must always think about whether it is worth adding a
primitive to Tramp.

Best regards, Michael.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-23 19:27                         ` Dmitry Gutov
@ 2023-07-24 11:20                           ` Eli Zaretskii
  2023-07-24 12:55                             ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-24 11:20 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Sun, 23 Jul 2023 22:27:26 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> On 23/07/2023 20:56, Eli Zaretskii wrote:
> >> And, ideally, do all the relevant benchmarking when proposing the change.
> > Of course.  Although the benchmarks until now already show quite a
> > variability.
> 
> Speaking of your MS Windows results that are unflattering to 'find', it 
> might be worth it to do a more varied comparison, to determine the 
> OS-specific bottleneck.
> 
> Off the top of my head, here are some possibilities:
> 
> 1. 'find' itself is much slower there. There is room for improvement in 
> the port.

I think it's the filesystem, not the port (which I did myself in this
case).  But I'd welcome similar tests on other Windows systems with
other ports of Find.  Just remember to measure this particular
benchmark, not just Find itself from the shell, as the times are very
different (as I reported up-thread).

> 2. The process output handling is worse.

Not sure what that means.

> 3. Something particular to the project being used for the test.

I don't think I understand this one.

> To look into the possibility #1, you can try running the same command in 
> the terminal with the output to NUL and comparing the runtime to what's 
> reported in the benchmark.

Output to the null device is a bad idea, as (AFAIR) Find is clever
enough to detect that and do nothing.  I run "find | wc" instead, and
already reported that it is much faster.

> I actually remember, from my time on MS Windows about 10 years ago, that 
> some older ports of 'find' and/or 'grep' did have performance problems, 
> but IIRC ezwinports contained the improved versions.

The ezwinports is the version I'm using here.  But maybe someone came
up with a better one: after all, I did my port many years ago (because
the native ports available back then were abysmally slow).





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-24 11:20                           ` Eli Zaretskii
@ 2023-07-24 12:55                             ` Dmitry Gutov
  2023-07-24 13:26                               ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-24 12:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 24/07/2023 14:20, Eli Zaretskii wrote:
>> Date: Sun, 23 Jul 2023 22:27:26 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>> On 23/07/2023 20:56, Eli Zaretskii wrote:
>>>> And, ideally, do all the relevant benchmarking when proposing the change.
>>> Of course.  Although the benchmarks until now already show quite a
>>> variability.
>>
>> Speaking of your MS Windows results that are unflattering to 'find', it
>> might be worth it to do a more varied comparison, to determine the
>> OS-specific bottleneck.
>>
>> Off the top of my head, here are some possibilities:
>>
>> 1. 'find' itself is much slower there. There is room for improvement in
>> the port.
> 
> I think it's the filesystem, not the port (which I did myself in this
> case).

But directory-files-recursively goes through the same filesystem, 
doesn't it?

> But I'd welcome similar tests on other Windows systems with
> other ports of Find.  Just remember to measure this particular
> benchmark, not just Find itself from the shell, as the times are very
> different (as I reported up-thread).

Concur.

>> 2. The process output handling is worse.
> 
> Not sure what that means.

Emacs's ability to process the output of a process on the particular 
platform.

You said:

   Btw, the Find command with pipe to some other program, like wc,
   finishes much faster, like 2 to 4 times faster than when it is run
   from find-directory-files-recursively.  That's probably the slowdown
   due to communications with async subprocesses in action.

One thing to try is changing the -with-find implementation to use a 
synchronous call, to compare (e.g. using 'process-file'). And repeat 
these tests on GNU/Linux too.

That would help us gauge the viability of using an asynchronous process 
to get the file listing. But also, if one was just looking into 
reimplementing directory-files-recursively using 'find' (to create an 
endpoint with swappable implementations, for example), 'process-file' is 
a suitable substitute because the original is also currently synchronous.
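
A minimal sketch of that synchronous shape (the 'find' arguments here
are placeholders, not the full set from the benchmark):

    ;; Run find synchronously via `process-file' (which respects remote
    ;; `default-directory'), collecting NUL-separated output in a buffer.
    (with-temp-buffer
      (process-file "find" nil t nil "." "!" "-type" "d" "-print0")
      (buffer-string))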

>> 3. Something particular to the project being used for the test.
> 
> I don't think I understand this one.

This described the possibility that the disparity between the 
implementations' runtimes was due to something unusual in the project 
structure, e.g. if you tested different projects on Windows and 
GNU/Linux, making direct comparison less useful. It's the least likely 
cause, but still a possibility.

>> To look into the possibility #1, you can try running the same command in
>> the terminal with the output to NUL and comparing the runtime to what's
>> reported in the benchmark.
> 
> Output to the null device is a bad idea, as (AFAIR) Find is clever
> enough to detect that and do nothing.  I run "find | wc" instead, and
> already reported that it is much faster.

Now I see it, thanks.

>> I actually remember, from my time on MS Windows about 10 years ago, that
>> some older ports of 'find' and/or 'grep' did have performance problems,
>> but IIRC ezwinports contained the improved versions.
> 
> The ezwinports is the version I'm using here.  But maybe someone came
> up with a better one: after all, I did my port many years ago (because
> the native ports available back then were abysmally slow).

We should also look at the exact numbers. If you say that "| wc" 
invocation is 2-4x faster than what's reported in the benchmark, then it 
takes about 2-4 seconds. Which is still oddly slower than your reported 
numbers for directory-files-recursively.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-24 12:55                             ` Dmitry Gutov
@ 2023-07-24 13:26                               ` Eli Zaretskii
  2023-07-25  2:41                                 ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-24 13:26 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Mon, 24 Jul 2023 15:55:13 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> >> 1. 'find' itself is much slower there. There is room for improvement in
> >> the port.
> > 
> > I think it's the filesystem, not the port (which I did myself in this
> > case).
> 
> But directory-files-recursively goes through the same filesystem, 
> doesn't it?

It does (more or less; see below).  But I was not trying to explain
why Find is slower than directory-files-recursively, I was trying to
explain why Find on Windows is slower than Find on GNU/Linux.

If you are asking why directory-files-recursively is so much faster on
Windows than Find, then the main factors I can think about are:

  . IPC, at least in how we implement it in Emacs on MS-Windows, via a
    separate thread and OS-level events between them to signal that
    stuff is available for reading, whereas
    directory-files-recursively avoids this overhead completely;
  . Find uses Posix APIs: 'stat', 'chdir', 'readdir' -- which on
    Windows are emulated by wrappers around native APIs.  Moreover,
    Find uses 'char *' for file names, so calling native APIs involves
    transparent conversion to UTF-16 and back, which is what native
    APIs accept and return.  By contrast, Emacs on Windows calls the
    native APIs directly, and converts to UTF-16 from UTF-8, which is
    faster.  (This last point also means that using Find on Windows
    has another grave disadvantage: it cannot fully support non-ASCII
    file names, only those that can be encoded by the current
    single-byte system codepage.)

> >> 2. The process output handling is worse.
> > 
> > Not sure what that means.
> 
> Emacs's ability to process the output of a process on the particular 
> platform.
> 
> You said:
> 
>    Btw, the Find command with pipe to some other program, like wc,
>    finishes much faster, like 2 to 4 times faster than when it is run
>    from find-directory-files-recursively.  That's probably the slowdown
>    due to communications with async subprocesses in action.

I see this slowdown on GNU/Linux as well.

> One thing to try it changing the -with-find implementation to use a 
> synchronous call, to compare (e.g. using 'process-file'). And repeat 
> these tests on GNU/Linux too.

This still uses pipes, albeit without the pselect stuff.

> >> 3. Something particular to the project being used for the test.
> > 
> > I don't think I understand this one.
> 
> This described the possibility where the disparity between the 
> implementations' runtimes was due to something unusual in the project 
> structure, if you tested different projects between Windows and 
> GNU/Linux, making direct comparison less useful. It's the least likely 
> cause, but still sometimes a possibility.

I have on my Windows system a d:/usr/share tree that is very similar
to (albeit somewhat smaller than) a typical /usr/share tree on Posix
systems.  I tried with that as well, and the results were similar.

> > The ezwinports is the version I'm using here.  But maybe someone came
> > up with a better one: after all, I did my port many years ago (because
> > the native ports available back then were abysmally slow).
> 
> We should also look at the exact numbers. If you say that "| wc" 
> invocation is 2-4x faster than what's reported in the benchmark, then it 
> takes about 2-4 seconds. Which is still oddly slower than your reported 
> numbers for directory-files-recursively.

Yes, so there are additional factors at work, at least with this port
of Find.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-24 13:26                               ` Eli Zaretskii
@ 2023-07-25  2:41                                 ` Dmitry Gutov
  2023-07-25  8:22                                   ` Ihor Radchenko
                                                     ` (2 more replies)
  0 siblings, 3 replies; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-25  2:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

[-- Attachment #1: Type: text/plain, Size: 3663 bytes --]

On 24/07/2023 16:26, Eli Zaretskii wrote:
>> Date: Mon, 24 Jul 2023 15:55:13 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>>>> 1. 'find' itself is much slower there. There is room for improvement in
>>>> the port.
>>>
>>> I think it's the filesystem, not the port (which I did myself in this
>>> case).
>>
>> But directory-files-recursively goes through the same filesystem,
>> doesn't it?
> 
> It does (more or less; see below).  But I was not trying to explain
> why Find is slower than directory-files-recursively, I was trying to
> explain why Find on Windows is slower than Find on GNU/Linux.

Understood. But we probably don't need to worry about the differences 
between platforms as much as about choosing the best option for each 
platform (or at least not choosing the worst). So I'm more interested 
in how the find-based solution is more than 4x slower than the 
built-in one on MS Windows.

> If you are asking why directory-files-recursively is so much faster on
> Windows than Find, then the main factors I can think about are:
> 
>    . IPC, at least in how we implement it in Emacs on MS-Windows, via a
>      separate thread and OS-level events between them to signal that
>      stuff is available for reading, whereas
>      directory-files-recursively avoids this overhead completely;
>    . Find uses Posix APIs: 'stat', 'chdir', 'readdir' -- which on
>      Windows are emulated by wrappers around native APIs.  Moreover,
>      Find uses 'char *' for file names, so calling native APIs involves
>      transparent conversion to UTF-16 and back, which is what native
>      APIs accept and return.  By contrast, Emacs on Windows calls the
>      native APIs directly, and converts to UTF-16 from UTF-8, which is
>      faster.  (This last point also means that using Find on Windows
>      has another grave disadvantage: it cannot fully support non-ASCII
>      file names, only those that can be encoded by the current
>      single-byte system codepage.)

I seem to remember that Wine, which also does a similar dance of 
translating library and system calls, is often very close to the native 
performance for many programs. So this could be a problem, but not 
necessarily a significant one.

Although text encoding conversion seems like a prime suspect, if the 
problem is here.

>>>> 2. The process output handling is worse.
>>>
>>> Not sure what that means.
>>
>> Emacs's ability to process the output of a process on the particular
>> platform.
>>
>> You said:
>>
>>     Btw, the Find command with pipe to some other program, like wc,
>>     finishes much faster, like 2 to 4 times faster than when it is run
>>     from find-directory-files-recursively.  That's probably the slowdown
>>     due to communications with async subprocesses in action.
> 
> I see this slowdown on GNU/Linux as well.
> 
>> One thing to try it changing the -with-find implementation to use a
>> synchronous call, to compare (e.g. using 'process-file'). And repeat
>> these tests on GNU/Linux too.
> 
> This still uses pipes, albeit without the pselect stuff.

I'm attaching an extended benchmark, one that includes a "synchronous" 
implementation as well. Please give it a spin.

Here (GNU/Linux) the reported numbers look like this:

 > (my-bench 1 default-directory "")

(("built-in" . "Elapsed time: 1.601649s (0.709108s in 22 GCs)")
  ("with-find" . "Elapsed time: 1.792383s (1.135869s in 38 GCs)")
  ("with-find-p" . "Elapsed time: 1.248543s (0.682827s in 20 GCs)")
  ("with-find-sync" . "Elapsed time: 0.922291s (0.343497s in 10 GCs)"))

[-- Attachment #2: find-bench.el --]
[-- Type: text/x-emacs-lisp, Size: 4648 bytes --]

(defun find-directory-files-recursively (dir regexp &optional include-directories _p follow-symlinks)
  (cl-assert (null _p) t "find-directory-files-recursively can't accept arbitrary predicates")
  (with-temp-buffer
    (setq case-fold-search nil)
    (cd dir)
    (let* ((command
	    (append
	     (list "find" (file-local-name dir))
	     (if follow-symlinks
		 '("-L")
	       '("!" "(" "-type" "l" "-xtype" "d" ")"))
	     (unless (string-empty-p regexp)
	       (list "-regex" (concat ".*" regexp ".*")))
	     (unless include-directories
	       '("!" "-type" "d"))
	     '("-print0")
	     ))
	   (remote (file-remote-p dir))
	   (proc
	    (if remote
		(let ((proc (apply #'start-file-process
				   "find" (current-buffer) command)))
		  (set-process-sentinel proc (lambda (_proc _state)))
		  (set-process-query-on-exit-flag proc nil)
		  proc)
	      (make-process :name "find" :buffer (current-buffer)
			    :connection-type 'pipe
			    :noquery t
			    :sentinel (lambda (_proc _state))
			    :command command))))
      (while (accept-process-output proc))
      (let ((start (goto-char (point-min))) ret)
	(while (search-forward "\0" nil t)
	  (push (concat remote (buffer-substring-no-properties start (1- (point)))) ret)
	  (setq start (point)))
	ret))))

(defun find-directory-files-recursively-2 (dir regexp &optional include-directories _p follow-symlinks)
  (cl-assert (null _p) t "find-directory-files-recursively can't accept arbitrary predicates")
  (cl-assert (not (file-remote-p dir)))
  (let* (buffered
         result
         (proc
	  (make-process
           :name "find" :buffer nil
	   :connection-type 'pipe
	   :noquery t
	   :sentinel (lambda (_proc _state))
           :filter (lambda (proc data)
                     (let ((start 0))
                       (when-let (end (string-search "\0" data start))
                         (push (concat buffered (substring data start end)) result)
                         (setq buffered "")
                         (setq start (1+ end))
                         (while-let ((end (string-search "\0" data start)))
                           (push (substring data start end) result)
                           (setq start (1+ end))))
                       (setq buffered (concat buffered (substring data start)))))
	   :command (append
	             (list "find")
	             ;; -L is an option and must precede the start directory.
	             (and follow-symlinks '("-L"))
	             (list (file-local-name dir))
	             (unless follow-symlinks
	               '("!" "(" "-type" "l" "-xtype" "d" ")"))
	             (unless (string-empty-p regexp)
	               (list "-regex" (concat ".*" regexp ".*")))
	             (unless include-directories
	               '("!" "-type" "d"))
	             '("-print0")
	             ))))
    (while (accept-process-output proc))
    result))

(defun find-directory-files-recursively-3 (dir regexp &optional include-directories _p follow-symlinks)
  (cl-assert (null _p) t "find-directory-files-recursively can't accept arbitrary predicates")
  (cl-assert (not (file-remote-p dir)))
  (let ((args `(;; -L is an option and must precede the start directory.
                ,@(and follow-symlinks '("-L"))
                ,(file-local-name dir)
	        ,@(unless follow-symlinks
	            '("!" "(" "-type" "l" "-xtype" "d" ")"))
	        ,@(unless (string-empty-p regexp)
	            (list "-regex" (concat ".*" regexp ".*")))
	        ,@(unless include-directories
	            '("!" "-type" "d"))
	        "-print0")))
    (with-temp-buffer
      (let ((status (apply #'process-file
                           "find"
                           nil
                           t
                           nil
                           args))
            (pt (point-min))
            res)
        (unless (zerop status)
          (error "Listing failed"))
        (goto-char (point-min))
        (while (search-forward "\0" nil t)
          (push (buffer-substring-no-properties pt (1- (point)))
                res)
          (setq pt (point)))
        res))))

(defun my-bench (count path regexp)
  (setq path (expand-file-name path))
  ;; (let ((old (directory-files-recursively path regexp))
  ;;       (new (find-directory-files-recursively-3 path regexp)))
  ;;   (dolist (path old)
  ;;     (unless (member path new) (error "! %s not in" path)))
  ;;   (dolist (path new)
  ;;     (unless (member path old) (error "!! %s not in" path))))
  (list
   (cons "built-in" (benchmark count (list 'directory-files-recursively path regexp)))
   (cons "with-find" (benchmark count (list 'find-directory-files-recursively path regexp)))
   (cons "with-find-p" (benchmark count (list 'find-directory-files-recursively-2 path regexp)))
   (cons "with-find-sync" (benchmark count (list 'find-directory-files-recursively-3 path regexp)))))

^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-25  2:41                                 ` Dmitry Gutov
@ 2023-07-25  8:22                                   ` Ihor Radchenko
  2023-07-26  1:51                                     ` Dmitry Gutov
  2023-07-25 18:42                                   ` bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Eli Zaretskii
  2023-07-25 19:16                                   ` sbaugh
  2 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-25  8:22 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, Eli Zaretskii, 64735

Dmitry Gutov <dmitry@gutov.dev> writes:

> I'm attaching an extended benchmark, one that includes a "synchronous" 
> implementation as well. Please give it a spin as well.

GNU/Linux SSD

(my-bench 10 "/usr/src/linux/" "")

(("built-in" . "Elapsed time: 7.034326s (3.598539s in 14 GCs)")
 ("built-in no filename handler alist" . "Elapsed time: 5.907194s (3.698456s in 15 GCs)")
 ("with-find" . "Elapsed time: 6.078056s (4.052791s in 16 GCs)")
 ("with-find-p" . "Elapsed time: 4.496762s (2.739565s in 11 GCs)")
 ("with-find-sync" . "Elapsed time: 3.702760s (1.715160s in 7 GCs)"))

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-25  2:41                                 ` Dmitry Gutov
  2023-07-25  8:22                                   ` Ihor Radchenko
@ 2023-07-25 18:42                                   ` Eli Zaretskii
  2023-07-26  1:56                                     ` Dmitry Gutov
  2023-07-25 19:16                                   ` sbaugh
  2 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-25 18:42 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Tue, 25 Jul 2023 05:41:13 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> >> One thing to try is changing the -with-find implementation to use a
> >> synchronous call, to compare (e.g. using 'process-file'). And repeat
> >> these tests on GNU/Linux too.
> > 
> > This still uses pipes, albeit without the pselect stuff.
> 
> I'm attaching an extended benchmark, one that includes a "synchronous" 
> implementation as well. Please give it a spin as well.
> 
> Here (GNU/Linux) the reported numbers look like this:
> 
>  > (my-bench 1 default-directory "")
> 
> (("built-in" . "Elapsed time: 1.601649s (0.709108s in 22 GCs)")
>   ("with-find" . "Elapsed time: 1.792383s (1.135869s in 38 GCs)")
>   ("with-find-p" . "Elapsed time: 1.248543s (0.682827s in 20 GCs)")
>   ("with-find-sync" . "Elapsed time: 0.922291s (0.343497s in 10 GCs)"))

Almost no change on Windows:

  (("built-in" . "Elapsed time: 1.218750s (0.078125s in 5 GCs)")
   ("with-find" . "Elapsed time: 8.984375s (0.109375s in 7 GCs)")
   ("with-find-p" . "Elapsed time: 8.718750s (0.046875s in 3 GCs)")
   ("with-find-sync" . "Elapsed time: 8.921875s (0.046875s in 3 GCs)"))

I'm beginning to suspect the implementation of pipes (and IPC in
general).  How else can such slowdown be explained?





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-25  2:41                                 ` Dmitry Gutov
  2023-07-25  8:22                                   ` Ihor Radchenko
  2023-07-25 18:42                                   ` bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Eli Zaretskii
@ 2023-07-25 19:16                                   ` sbaugh
  2023-07-26  2:28                                     ` Dmitry Gutov
  2 siblings, 1 reply; 199+ messages in thread
From: sbaugh @ 2023-07-25 19:16 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, Eli Zaretskii, yantar92, 64735

Dmitry Gutov <dmitry@gutov.dev> writes:
>> (my-bench 1 default-directory "")
>
> (("built-in" . "Elapsed time: 1.601649s (0.709108s in 22 GCs)")
>  ("with-find" . "Elapsed time: 1.792383s (1.135869s in 38 GCs)")
>  ("with-find-p" . "Elapsed time: 1.248543s (0.682827s in 20 GCs)")
>  ("with-find-sync" . "Elapsed time: 0.922291s (0.343497s in 10 GCs)"))

Tangent, but:

Ugh, wow, call-process really is a lot faster than make-process.  I see
now why people disliked my idea of replacing call-process with something
based on make-process, this is a big difference...

There's zero reason it has to be so slow... maybe I should try to make a
better make-process API and implementation which is actually fast.
(without worrying about being constrained by compatibility with
something that's already dog-slow)
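
For what it's worth, the gap is easy to probe in isolation; a hypothetical
micro-benchmark (assuming a `true' binary on PATH; numbers will vary by
platform, the point is only to isolate process-spawn overhead) might look
like:

```elisp
;; Synchronous spawn: call-process blocks until the process exits.
(benchmark-run 100 (call-process "true"))

;; Asynchronous spawn via make-process, waited on with
;; accept-process-output to make the comparison apples-to-apples.
(benchmark-run 100
  (let ((proc (make-process :name "true" :command '("true")
                            :connection-type 'pipe :noquery t
                            :sentinel #'ignore)))
    (while (accept-process-output proc))))
```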





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-25  8:22                                   ` Ihor Radchenko
@ 2023-07-26  1:51                                     ` Dmitry Gutov
  2023-07-26  9:09                                       ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-26  1:51 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: luangruo, sbaugh, Eli Zaretskii, 64735

On 25/07/2023 11:22, Ihor Radchenko wrote:
> Dmitry Gutov<dmitry@gutov.dev>  writes:
> 
>> I'm attaching an extended benchmark, one that includes a "synchronous"
>> implementation as well. Please give it a spin as well.
> GNU/Linux SSD
> 
> (my-bench 10 "/usr/src/linux/" "")
> 
> (("built-in" . "Elapsed time: 7.034326s (3.598539s in 14 GCs)")
>   ("built-in no filename handler alist" . "Elapsed time: 5.907194s (3.698456s in 15 GCs)")
>   ("with-find" . "Elapsed time: 6.078056s (4.052791s in 16 GCs)")
>   ("with-find-p" . "Elapsed time: 4.496762s (2.739565s in 11 GCs)")
>   ("with-find-sync" . "Elapsed time: 3.702760s (1.715160s in 7 GCs)"))

Thanks for the extra data point in particular. Easy to see how it 
compares to the most efficient use of 'find', right (on GNU/Linux, at 
least)?

It's also something to note that, GC-wise, numbers 1 and 2 are not the 
worst: the time must be spent somewhere else.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-25 18:42                                   ` bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Eli Zaretskii
@ 2023-07-26  1:56                                     ` Dmitry Gutov
  2023-07-26  2:28                                       ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-26  1:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 25/07/2023 21:42, Eli Zaretskii wrote:
> Almost no change on Windows:
> 
>    (("built-in" . "Elapsed time: 1.218750s (0.078125s in 5 GCs)")
>     ("with-find" . "Elapsed time: 8.984375s (0.109375s in 7 GCs)")
>     ("with-find-p" . "Elapsed time: 8.718750s (0.046875s in 3 GCs)")
>     ("with-find-sync" . "Elapsed time: 8.921875s (0.046875s in 3 GCs)"))
> 
> I'm beginning to suspect the implementation of pipes (and IPC in
> general).  How else can such slowdown be explained?

Seems so (I'm not well-versed in the lower-level details, alas).

Your other idea (spending time in text conversion) also sounds 
plausible, but I don't know whether this much overhead can be explained 
by it. And don't we have to convert any process's output to our internal 
encoding anyway, on any platform?





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-25 19:16                                   ` sbaugh
@ 2023-07-26  2:28                                     ` Dmitry Gutov
  0 siblings, 0 replies; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-26  2:28 UTC (permalink / raw)
  To: sbaugh; +Cc: luangruo, sbaugh, Eli Zaretskii, yantar92, 64735

On 25/07/2023 22:16, sbaugh@catern.com wrote:
> Dmitry Gutov <dmitry@gutov.dev> writes:
>>> (my-bench 1 default-directory "")
>>
>> (("built-in" . "Elapsed time: 1.601649s (0.709108s in 22 GCs)")
>>   ("with-find" . "Elapsed time: 1.792383s (1.135869s in 38 GCs)")
>>   ("with-find-p" . "Elapsed time: 1.248543s (0.682827s in 20 GCs)")
>>   ("with-find-sync" . "Elapsed time: 0.922291s (0.343497s in 10 GCs)"))
> 
> Tangent, but:
> 
> Ugh, wow, call-process really is a lot faster than make-process.  I see
> now why people disliked my idea of replacing call-process with something
> based on make-process, this is a big difference...

More like forewarned. Do we want to exchange 25% of performance for 
extra reactivity? We might. But we'd probably put that behind a pref and 
have to maintain two implementations.

> There's zero reason it has to be so slow... maybe I should try to make a
> better make-process API and implementation which is actually fast.
> (without worrying about being constrained by compatibility with
> something that's already dog-slow)

I don't know if the API itself is at fault. The first step should be to 
investigate which part of the current one is actually slow, I think.

But then, of course, if improved performance really requires a change in 
the API, we can switch to some new one too (while having to maintain at 
least two implementations for a number of years).

BTW, looking at the difference between the with-find-* approaches' 
performance, it seems like most of it comes down to GC.

Any chance we're doing extra copying of strings even when we don't have 
to, or some inefficient copying -- compared to the sync implementation? 
E.g. we could use the "fast" approach at least when the :filter is not 
specified (which is the case in the first impl, "with-find"). The manual 
says this:

   The default filter simply outputs directly to the process buffer.

Perhaps it's worth looking at.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-26  1:56                                     ` Dmitry Gutov
@ 2023-07-26  2:28                                       ` Eli Zaretskii
  2023-07-26  2:35                                         ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-26  2:28 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Wed, 26 Jul 2023 04:56:20 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> Your other idea (spending time in text conversion) also sounds 
> plausible, but I don't know whether this much overhead can be explained 
> by it. And don't we have to convert any process's output to our internal 
> encoding anyway, on any platform?

We do, but you-all probably run your tests on a system where the
external encoding is UTF-8, right?  That is much faster.






^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-26  2:28                                       ` Eli Zaretskii
@ 2023-07-26  2:35                                         ` Dmitry Gutov
  0 siblings, 0 replies; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-26  2:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 26/07/2023 05:28, Eli Zaretskii wrote:
>> Date: Wed, 26 Jul 2023 04:56:20 +0300
>> Cc:luangruo@yahoo.com,sbaugh@janestreet.com,yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov<dmitry@gutov.dev>
>>
>> Your other idea (spending time in text conversion) also sounds
>> plausible, but I don't know whether this much overhead can be explained
>> by it. And don't we have to convert any process's output to our internal
>> encoding anyway, on any platform?
> We do, but you-all probably run your tests on a system where the
> external encoding is UTF-8, right?  That is much faster.

I do. I suppose the transcoding uses (or at least can use) a 
short-circuit approach, avoiding extra copying when the memory 
representations match.

It should be possible to measure the encoding's overhead by checking how 
big the output is, testing our code on a smaller string, and 
multiplying. Or, more roughly, by piping it to "iconv -f Windows-1251 -t 
UTF-8" and measuring how long it takes to finish (if our encoding 
takes longer, that could point to an optimization opportunity as well).
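
A rough sketch of that second measurement from a shell (assuming GNU
iconv; a fixed printf stands in here for find's NUL-separated output,
since ASCII bytes are identical in Windows-1251 and UTF-8):

```shell
# Stand-in for `find ... -print0`: convert the byte stream and split
# on NULs, the same shape of work the Emacs decoder would do.
printf 'src/a.c\0src/b.c\0' \
  | iconv -f WINDOWS-1251 -t UTF-8 \
  | tr '\0' '\n'
```

For a real measurement, replace the printf with the actual
`find … -print0` invocation and wrap the pipeline in `time`.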





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-26  1:51                                     ` Dmitry Gutov
@ 2023-07-26  9:09                                       ` Ihor Radchenko
  2023-07-27  0:41                                         ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-26  9:09 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, Eli Zaretskii, 64735

[-- Attachment #1: Type: text/plain, Size: 2592 bytes --]

Dmitry Gutov <dmitry@gutov.dev> writes:

>> (my-bench 10 "/usr/src/linux/" "")
>> 
>> (("built-in" . "Elapsed time: 7.034326s (3.598539s in 14 GCs)")
>>   ("built-in no filename handler alist" . "Elapsed time: 5.907194s (3.698456s in 15 GCs)")
>>   ("with-find" . "Elapsed time: 6.078056s (4.052791s in 16 GCs)")
>>   ("with-find-p" . "Elapsed time: 4.496762s (2.739565s in 11 GCs)")
>>   ("with-find-sync" . "Elapsed time: 3.702760s (1.715160s in 7 GCs)"))
>
> Thanks for the extra data point in particular. Easy to see how it 
> compares to the most efficient use of 'find', right (on GNU/Linux, at 
> least)?
>
> It's also something to note that, GC-wise, numbers 1 and 2 are not the 
> worst: the time must be spent somewhere else.

Indeed. I did more detailed analysis in
https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/

Main contributors in the lisp versions are (in the order from most
significant to less significant) (1) file name handlers; (2) regexp
matching of the file names; (3) nconc calls in the current
`directory-files-recursively' implementation.
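
The third contributor is easy to see in miniature; a sketch (not the
actual patch) of the two accumulation patterns:

```elisp
;; O(N^2): each nconc walks the whole accumulated list to find its tail.
(let (result)
  (dotimes (i 1000)
    (setq result (nconc result (list i))))
  result)

;; O(N): push to the front, reverse once at the end.
(let (result)
  (dotimes (i 1000)
    (push i result))
  (nreverse result))
```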

I have modified `directory-files-recursively' to avoid O(N^2) `nconc'
calls + bypassing regexp matches when REGEXP is nil.

Here are the results (using the attached modified version of your
benchmark file):

(my-bench 10 "/usr/src/linux/" "")
(("built-in" . "Elapsed time: 7.285597s (3.853368s in 6 GCs)")
 ("built-in no filename handler alist" . "Elapsed time: 5.855019s (3.760662s in 6 GCs)")
 ("built-in non-recursive no filename handler alist" . "Elapsed time: 5.817639s (4.326945s in 7 GCs)")
 ("built-in non-recursive no filename handler alist + skip re-match" . "Elapsed time: 2.708306s (1.871665s in 3 GCs)")
 ("with-find" . "Elapsed time: 6.082200s (4.262830s in 7 GCs)")
 ("with-find-p" . "Elapsed time: 4.325503s (3.058647s in 5 GCs)")
 ("with-find-sync" . "Elapsed time: 3.267648s (1.903655s in 3 GCs)"))

 (let ((gc-cons-threshold most-positive-fixnum))
   (my-bench 10 "/usr/src/linux/" ""))
(("built-in" . "Elapsed time: 2.754473s")
 ("built-in no filename handler alist" . "Elapsed time: 1.322443s")
 ("built-in non-recursive no filename handler alist" . "Elapsed time: 1.235044s")
 ("built-in non-recursive no filename handler alist + skip re-match" . "Elapsed time: 0.750275s")
 ("with-find" . "Elapsed time: 1.438510s")
 ("with-find-p" . "Elapsed time: 1.200876s")
 ("with-find-sync" . "Elapsed time: 1.349755s"))

If we forget about GC, the Elisp version can get fairly close to GNU find.
And if we do not perform regexp matching (which makes sense when the
REGEXP is ""), the Elisp version is faster.


[-- Attachment #2: find-bench.el --]
[-- Type: text/plain, Size: 7254 bytes --]

;; -*- lexical-binding: t; -*-

(defun find-directory-files-recursively (dir regexp &optional include-directories _p follow-symlinks)
  (cl-assert (null _p) t "find-directory-files-recursively can't accept arbitrary predicates")
  (with-temp-buffer
    (setq case-fold-search nil)
    (cd dir)
    (let* ((command
	    (append
	     (list "find")
	     ;; -L is an option and must precede the start directory.
	     (and follow-symlinks '("-L"))
	     (list (file-local-name dir))
	     (unless follow-symlinks
	       '("!" "(" "-type" "l" "-xtype" "d" ")"))
	     (unless (string-empty-p regexp)
	       (list "-regex" (concat ".*" regexp ".*")))
	     (unless include-directories
	       '("!" "-type" "d"))
	     '("-print0")
	     ))
	   (remote (file-remote-p dir))
	   (proc
	    (if remote
		(let ((proc (apply #'start-file-process
				   "find" (current-buffer) command)))
		  (set-process-sentinel proc (lambda (_proc _state)))
		  (set-process-query-on-exit-flag proc nil)
		  proc)
	      (make-process :name "find" :buffer (current-buffer)
			    :connection-type 'pipe
			    :noquery t
			    :sentinel (lambda (_proc _state))
			    :command command))))
      (while (accept-process-output proc))
      (let ((start (goto-char (point-min))) ret)
	(while (search-forward "\0" nil t)
	  (push (concat remote (buffer-substring-no-properties start (1- (point)))) ret)
	  (setq start (point)))
	ret))))

(defun find-directory-files-recursively-2 (dir regexp &optional include-directories _p follow-symlinks)
  (cl-assert (null _p) t "find-directory-files-recursively can't accept arbitrary predicates")
  (cl-assert (not (file-remote-p dir)))
  (let* (buffered
         result
         (proc
	  (make-process
           :name "find" :buffer nil
	   :connection-type 'pipe
	   :noquery t
	   :sentinel (lambda (_proc _state))
           :filter (lambda (_proc data)
                     (let ((start 0))
                       (when-let (end (string-search "\0" data start))
                         (push (concat buffered (substring data start end)) result)
                         (setq buffered "")
                         (setq start (1+ end))
                         (while-let ((end (string-search "\0" data start)))
                           (push (substring data start end) result)
                           (setq start (1+ end))))
                       (setq buffered (concat buffered (substring data start)))))
	   :command (append
	             (list "find")
	             ;; -L is an option and must precede the start directory.
	             (and follow-symlinks '("-L"))
	             (list (file-local-name dir))
	             (unless follow-symlinks
	               '("!" "(" "-type" "l" "-xtype" "d" ")"))
	             (unless (string-empty-p regexp)
	               (list "-regex" (concat ".*" regexp ".*")))
	             (unless include-directories
	               '("!" "-type" "d"))
	             '("-print0")
	             ))))
    (while (accept-process-output proc))
    result))

(defun find-directory-files-recursively-3 (dir regexp &optional include-directories _p follow-symlinks)
  (cl-assert (null _p) t "find-directory-files-recursively can't accept arbitrary predicates")
  (cl-assert (not (file-remote-p dir)))
  (let ((args `(;; -L is an option and must precede the start directory.
                ,@(and follow-symlinks '("-L"))
                ,(file-local-name dir)
	        ,@(unless follow-symlinks
	            '("!" "(" "-type" "l" "-xtype" "d" ")"))
	        ,@(unless (string-empty-p regexp)
	            (list "-regex" (concat ".*" regexp ".*")))
	        ,@(unless include-directories
	            '("!" "-type" "d"))
	        "-print0")))
    (with-temp-buffer
      (let ((status (apply #'process-file
                           "find"
                           nil
                           t
                           nil
                           args))
            (pt (point-min))
            res)
        (unless (zerop status)
          (error "Listing failed"))
        (goto-char (point-min))
        (while (search-forward "\0" nil t)
          (push (buffer-substring-no-properties pt (1- (point)))
                res)
          (setq pt (point)))
        res))))

(defun directory-files-recursively-strip-nconc
    (dir regexp
	 &optional include-directories predicate
	 follow-symlinks)
  "Return list of all files under directory DIR whose names match REGEXP.
This function works recursively.  Files are returned in \"depth
first\" order, and files from each directory are sorted in
alphabetical order.  Each file name appears in the returned list
in its absolute form.

By default, the returned list excludes directories, but if
optional argument INCLUDE-DIRECTORIES is non-nil, they are
included.

PREDICATE can be either nil (which means that all subdirectories
of DIR are descended into), t (which means that subdirectories that
can't be read are ignored), or a function (which is called with
the name of each subdirectory, and should return non-nil if the
subdirectory is to be descended into).

If FOLLOW-SYMLINKS is non-nil, symbolic links that point to
directories are followed.  Note that this can lead to infinite
recursion."
  (let* ((result nil)
	 (dirs (list dir))
         (dir (directory-file-name dir))
	 ;; When DIR is "/", remote file names like "/method:" could
	 ;; also be offered.  We shall suppress them.
	 (tramp-mode (and tramp-mode (file-remote-p (expand-file-name dir)))))
    (while (setq dir (pop dirs))
      (dolist (file (file-name-all-completions "" dir))
	(unless (member file '("./" "../"))
	  (if (directory-name-p file)
	      (let* ((leaf (substring file 0 (1- (length file))))
		     (full-file (concat dir "/" leaf)))
		;; Don't follow symlinks to other directories.
		(when (and (or (not (file-symlink-p full-file))
			       follow-symlinks)
			   ;; Allow filtering subdirectories.
			   (or (eq predicate nil)
			       (eq predicate t)
			       (funcall predicate full-file)))
                  (push full-file dirs))
		(when (and include-directories
			   (or (null regexp) (string-match regexp leaf)))
		  (setq result (nconc result (list full-file)))))
	    ;; A nil REGEXP means "match everything", skipping the match.
	    (when (or (null regexp) (string-match regexp file))
	      (push (concat dir "/" file) result))))))
    (sort result #'string<)))

(defun my-bench (count path regexp)
  (setq path (expand-file-name path))
  ;; (let ((old (directory-files-recursively path regexp))
  ;;       (new (find-directory-files-recursively-3 path regexp)))
  ;;   (dolist (path old)
  ;;     (unless (member path new) (error "! %s not in" path)))
  ;;   (dolist (path new)
  ;;     (unless (member path old) (error "!! %s not in" path))))
  (list
   (cons "built-in" (benchmark count (list 'directory-files-recursively path regexp)))
   (cons "built-in no filename handler alist" (let (file-name-handler-alist) (benchmark count (list 'directory-files-recursively path regexp))))
   (cons "built-in non-recursive no filename handler alist" (let (file-name-handler-alist) (benchmark count (list 'directory-files-recursively-strip-nconc path regexp))))
   (cons "built-in non-recursive no filename handler alist + skip re-match" (let (file-name-handler-alist) (benchmark count (list 'directory-files-recursively-strip-nconc path nil))))
   (cons "with-find" (benchmark count (list 'find-directory-files-recursively path regexp)))
   (cons "with-find-p" (benchmark count (list 'find-directory-files-recursively-2 path regexp)))
   (cons "with-find-sync" (benchmark count (list 'find-directory-files-recursively-3 path regexp)))))

(provide 'find-bench)

[-- Attachment #3: Type: text/plain, Size: 224 bytes --]


-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-26  9:09                                       ` Ihor Radchenko
@ 2023-07-27  0:41                                         ` Dmitry Gutov
  2023-07-27  5:22                                           ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-27  0:41 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: luangruo, sbaugh, Eli Zaretskii, 64735

On 26/07/2023 12:09, Ihor Radchenko wrote:
>> It's also something to note that, GC-wise, numbers 1 and 2 are not the
>> worst: the time must be spent somewhere else.
> Indeed. I did more detailed analysis in
> https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/
> 
> Main contributors in the lisp versions are (in the order from most
> significant to less significant) (1) file name handlers; (2) regexp
> matching of the file names; (3) nconc calls in the current
> `directory-files-recursively' implementation.
> 
> I have modified `directory-files-recursively' to avoid O(N^2) `nconc'
> calls + bypassing regexp matches when REGEXP is nil.

Sounds good. I haven't examined the diff closely, but it sounds like an 
improvement that can be applied irrespective of how this discussion ends.

Skipping regexp matching entirely, though, will make this benchmark 
farther removed from real-life usage: this thread started from the need 
to handle multiple ignore entries when listing files (e.g. in a 
project). So any solution for that (whether we use it on all or just 
some platforms) needs to be able to handle those. And it doesn't seem 
like directory-files-recursively has any alternative solution for that 
other than calling string-match on every found file.
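
One sketch of how many ignore entries could still cost a single
`string-match' per file: precompile them into one alternation
(`my-ignores' is a hypothetical list, named here only for illustration):

```elisp
(defvar my-ignores '("*.o" "*.elc" "*~")  ; hypothetical ignore globs
  "Globs to exclude, in the style of `grep-find-ignored-files'.")

;; `wildcard-to-regexp' anchors each glob; joining the results with \|
;; yields one regexp that is tested once per file name.
(let ((ignore-re (mapconcat #'wildcard-to-regexp my-ignores "\\|")))
  (string-match-p ignore-re "foo.elc"))
```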

> Here are the results (using the attached modified version of your
> benchmark file):
> 
> (my-bench 10 "/usr/src/linux/" "")
> (("built-in" . "Elapsed time: 7.285597s (3.853368s in 6 GCs)")
>   ("built-in no filename handler alist" . "Elapsed time: 5.855019s (3.760662s in 6 GCs)")
>   ("built-in non-recursive no filename handler alist" . "Elapsed time: 5.817639s (4.326945s in 7 GCs)")
>   ("built-in non-recursive no filename handler alist + skip re-match" . "Elapsed time: 2.708306s (1.871665s in 3 GCs)")
>   ("with-find" . "Elapsed time: 6.082200s (4.262830s in 7 GCs)")
>   ("with-find-p" . "Elapsed time: 4.325503s (3.058647s in 5 GCs)")
>   ("with-find-sync" . "Elapsed time: 3.267648s (1.903655s in 3 GCs)"))

Nice.

>   (let ((gc-cons-threshold most-positive-fixnum))
>     (my-bench 10 "/usr/src/linux/" ""))
> (("built-in" . "Elapsed time: 2.754473s")
>   ("built-in no filename handler alist" . "Elapsed time: 1.322443s")
>   ("built-in non-recursive no filename handler alist" . "Elapsed time: 1.235044s")
>   ("built-in non-recursive no filename handler alist + skip re-match" . "Elapsed time: 0.750275s")
>   ("with-find" . "Elapsed time: 1.438510s")
>   ("with-find-p" . "Elapsed time: 1.200876s")
>   ("with-find-sync" . "Elapsed time: 1.349755s"))
> 
> If we forget about GC, Elisp version can get fairly close to GNU find.
> And if we do not perform regexp matching (which makes sense when the
> REGEXP is ""), Elisp version is faster.

We can't really forget about GC, though.

But the above numbers make me hopeful about the async-parallel solution, 
implying that the parallelization really can help (and offset whatever 
latency we lose on pselect), as soon as we determine the source of extra 
consing and decide what to do about it.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-27  0:41                                         ` Dmitry Gutov
@ 2023-07-27  5:22                                           ` Eli Zaretskii
  2023-07-27  8:20                                             ` Ihor Radchenko
  2023-07-27 13:30                                             ` Dmitry Gutov
  0 siblings, 2 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-27  5:22 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Thu, 27 Jul 2023 03:41:29 +0300
> Cc: Eli Zaretskii <eliz@gnu.org>, luangruo@yahoo.com, sbaugh@janestreet.com,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> > I have modified `directory-files-recursively' to avoid O(N^2) `nconc'
> > calls + bypassing regexp matches when REGEXP is nil.
> 
> Sounds good. I haven't examined the diff closely, but it sounds like an 
> improvement that can be applied irrespective of how this discussion ends.

That change should be submitted as a separate issue and discussed in
detail before we decide we can make it.

> Skipping regexp matching entirely, though, will make this benchmark 
> farther removed from real-life usage: this thread started from being 
> able to handle multiple ignore entries when listing files (e.g. in a 
> project).

Agreed.  From my POV, that variant's purpose was only to show how much
time is spent in matching file names against some include or exclude
list.

> So any solution for that (whether we use it on all or just 
> some platforms) needs to be able to handle those. And it doesn't seem 
> like directory-files-recursively has any alternative solution for that 
> other than calling string-match on every found file.

There's a possibility of pushing this filtering into
file-name-all-completions, but I'm not sure that will be faster.  We
should try that and measure the results, I think.

> > If we forget about GC, Elisp version can get fairly close to GNU find.
> > And if we do not perform regexp matching (which makes sense when the
> > REGEXP is ""), Elisp version is faster.
> 
> We can't really forget about GC, though.

But we could temporarily lift the threshold while this function runs,
if that leads to significant savings.
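For illustration, the let-binding could look something like this (just a
sketch; `my/files-no-gc' is a made-up name, not existing code):

```elisp
;; Sketch only: run `directory-files-recursively' with GC effectively
;; suspended, so collection happens at most once, after the traversal.
(defun my/files-no-gc (dir regexp)
  (let ((gc-cons-threshold most-positive-fixnum))
    (directory-files-recursively dir regexp)))
```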

> But the above numbers make me hopeful about the async-parallel solution, 
> implying that the parallelization really can help (and offset whatever 
> latency we lose on pselect), as soon as we determine the source of extra 
> consing and decide what to do about it.

Isn't it clear that additional consing comes from the fact that we
first insert the Find's output into a buffer or produce a string from
it, and then chop that into individual file names?
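(With -print0 output, that chopping step is essentially a split on NUL
bytes; a sketch of the pattern, not the actual implementation:)

```elisp
;; Sketch of the buffer-then-chop pattern described above: collect find's
;; NUL-separated output as one big string, then split it into one Lisp
;; string per file name -- each of which the GC later has to manage.
(split-string (shell-command-to-string "find . -type f -print0")
              "\0" t)
```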





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-27  5:22                                           ` Eli Zaretskii
@ 2023-07-27  8:20                                             ` Ihor Radchenko
  2023-07-27  8:47                                               ` Eli Zaretskii
  2023-07-27 13:30                                             ` Dmitry Gutov
  1 sibling, 1 reply; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-27  8:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, Dmitry Gutov, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

>> > I have modified `directory-files-recursively' to avoid O(N^2) `nconc'
>> > calls + bypassing regexp matches when REGEXP is nil.
>> 
>> Sounds good. I haven't examined the diff closely, but it sounds like an 
>> improvement that can be applied irrespective of how this discussion ends.
>
> That change should be submitted as a separate issue and discussed in
> detail before we decide we can make it.

I will look into it. This was mostly a quick and dirty rewrite without
paying too much attention to file order in the result.

>> Skipping regexp matching entirely, though, will make this benchmark 
>> farther removed from real-life usage: this thread started from being 
>> able to handle multiple ignore entries when listing files (e.g. in a 
>> project).
>
> Agreed.  From my POV, that variant's purpose was only to show how much
> time is spent in matching file names against some include or exclude
> list.

Yes and no.

It is not uncommon to query _all_ the files in a directory, and something
as simple as

(when (and (not (member regexp '("" ".*"))) (string-match regexp file))...)

can give considerable speedup.

Might be worth adding such optimization.

>> So any solution for that (whether we use it on all or just 
>> some platforms) needs to be able to handle those. And it doesn't seem 
>> like directory-files-recursively has any alternative solution for that 
>> other than calling string-match on every found file.
>
> There's a possibility of pushing this filtering into
> file-name-all-completions, but I'm not sure that will be faster.  We
> should try that and measure the results, I think.

Isn't `file-name-all-completions' more limited and cannot accept
arbitrary regexp?

>> We can't really forget about GC, though.
>
> But we could temporarily lift the threshold while this function runs,
> if that leads to significant savings.

Yup. Also, GC times and frequencies will vary across different Emacs
sessions. So, we may not want to rely on it when comparing the
benchmarks from different people.

>> But the above numbers make me hopeful about the async-parallel solution, 
>> implying that the parallelization really can help (and offset whatever 
>> latency we lose on pselect), as soon as we determine the source of extra 
>> consing and decide what to do about it.
>
> Isn't it clear that additional consing comes from the fact that we
> first insert the Find's output into a buffer or produce a string from
> it, and then chop that into individual file names?

To add to it, I also tried to implement a version of
`directory-files-recursively' that first inserts all the files in buffer
and then filters them using `re-search-forward' instead of calling
`string-match' on every file name string.
That ended up being slower compared to the current `string-match' approach.
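The shape of that buffer-based variant was roughly as follows (a
reconstruction from the description above, not the actual patch;
`my/filter-via-buffer' is a made-up name):

```elisp
;; Sketch: insert all candidate names into a temp buffer, one per line,
;; then collect the matching lines with `re-search-forward' instead of
;; calling `string-match' once per name.
(defun my/filter-via-buffer (files regexp)
  (with-temp-buffer
    (dolist (f files)
      (insert f "\n"))
    (goto-char (point-min))
    (let (matches)
      (while (re-search-forward (concat "^.*" regexp ".*$") nil t)
        (push (match-string 0) matches))
      (nreverse matches))))
```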

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-27  8:20                                             ` Ihor Radchenko
@ 2023-07-27  8:47                                               ` Eli Zaretskii
  2023-07-27  9:28                                                 ` Ihor Radchenko
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-27  8:47 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: luangruo, dmitry, 64735, sbaugh

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: Dmitry Gutov <dmitry@gutov.dev>, luangruo@yahoo.com,
>  sbaugh@janestreet.com, 64735@debbugs.gnu.org
> Date: Thu, 27 Jul 2023 08:20:55 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Skipping regexp matching entirely, though, will make this benchmark 
> >> farther removed from real-life usage: this thread started from being 
> >> able to handle multiple ignore entries when listing files (e.g. in a 
> >> project).
> >
> > Agreed.  From my POV, that variant's purpose was only to show how much
> > time is spent in matching file names against some include or exclude
> > list.
> 
> Yes and no.
> 
> It is not uncommon to query _all_ the files in a directory, and something
> as simple as
> 
> (when (and (not (member regexp '("" ".*"))) (string-match regexp file))...)
> 
> can give considerable speedup.

I don't understand what you are saying.  The current code already
checks PREDICATE for being nil, and if it is, does nothing about
filtering.

And if this is about testing REGEXP for being a trivial one, adding
such a test to the existing code is trivial, and hardly justifies an
objection to what I wrote.

> > There's a possibility of pushing this filtering into
> > file-name-all-completions, but I'm not sure that will be faster.  We
> > should try that and measure the results, I think.
> 
> Isn't `file-name-all-completions' more limited and cannot accept
> arbitrary regexp?

No, see completion-regexp-list.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-27  8:47                                               ` Eli Zaretskii
@ 2023-07-27  9:28                                                 ` Ihor Radchenko
  0 siblings, 0 replies; 199+ messages in thread
From: Ihor Radchenko @ 2023-07-27  9:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, dmitry, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

> And if this is about testing REGEXP for being a trivial one, adding
> such a test to the existing code is trivial, and hardly justifies an
> objection to what I wrote.

I was replying to your interpretation of why I included the "no-regexp"
test.

I agree that we should not use this test as comparison with GNU find.
But I also wanted to say that adding the trivial REGEXP test will be
useful. Especially because it is easy. Should I prepare a patch?

>> Isn't `file-name-all-completions' more limited and cannot accept
>> arbitrary regexp?
>
> No, see completion-regexp-list.

That would be equivalent to forcing `include-directories' to be t.

In any case, even if we ignore INCLUDE-DIRECTORIES, there is no gain:

(my-bench 10 "/usr/src/linux/" "")

(("built-in non-recursive no filename handler alist" . "Elapsed time: 5.780714s (4.352086s in 6 GCs)")
 ("built-in non-recursive no filename handler alist + completion-regexp-list" . "Elapsed time: 5.739315s (4.359772s in 6 GCs)"))

 (let ((gc-cons-threshold most-positive-fixnum))
   (my-bench 10 "/usr/src/linux/" ""))
   
(("built-in non-recursive no filename handler alist" . "Elapsed time: 1.267828s")
 ("built-in non-recursive no filename handler alist + completion-regexp-list" . "Elapsed time: 1.275656s"))

In the test, I removed all the `string-match' calls and instead let-bound (completion-regexp-list (list regexp)).
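Concretely, the change had roughly this shape (sketch only; `my/list-dir'
is a hypothetical name):

```elisp
;; Sketch: filter inside `file-name-all-completions' by let-binding
;; `completion-regexp-list', instead of a per-file `string-match'.
(defun my/list-dir (dir regexp)
  (let ((completion-regexp-list (list regexp)))
    (file-name-all-completions "" dir)))
```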

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-27  5:22                                           ` Eli Zaretskii
  2023-07-27  8:20                                             ` Ihor Radchenko
@ 2023-07-27 13:30                                             ` Dmitry Gutov
  2023-07-29  0:12                                               ` Dmitry Gutov
  1 sibling, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-27 13:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 27/07/2023 08:22, Eli Zaretskii wrote:
>> Date: Thu, 27 Jul 2023 03:41:29 +0300
>> Cc: Eli Zaretskii <eliz@gnu.org>, luangruo@yahoo.com, sbaugh@janestreet.com,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>>> I have modified `directory-files-recursively' to avoid O(N^2) `nconc'
>>> calls + bypassing regexp matches when REGEXP is nil.
>>
>> Sounds good. I haven't examined the diff closely, but it sounds like an
>> improvement that can be applied irrespective of how this discussion ends.
> 
> That change should be submitted as a separate issue and discussed in
> detail before we decide we can make it.

Sure.

>>> If we forget about GC, Elisp version can get fairly close to GNU find.
>>> And if we do not perform regexp matching (which makes sense when the
>>> REGEXP is ""), Elisp version is faster.
>>
>> We can't really forget about GC, though.
> 
> But we could temporarily lift the threshold while this function runs,
> if that leads to significant savings.

I mean, everything's doable, but if we do this for this function, why 
not others? Most long-running code would see an improvement from that 
kind of change (the 'find'-based solutions too).

IIRC the main drawback is running out of memory in extreme conditions or 
on low-memory platforms/devices. It's not like this feature is 
particularly protected from this.

>> But the above numbers make me hopeful about the async-parallel solution,
>> implying that the parallelization really can help (and offset whatever
>> latency we lose on pselect), as soon as we determine the source of extra
>> consing and decide what to do about it.
> 
> Isn't it clear that additional consing comes from the fact that we
> first insert the Find's output into a buffer or produce a string from
> it, and then chop that into individual file names?

But we do that in all 'find'-based solutions: the synchronous one takes 
buffer text and chops it into strings. The first asynchronous one does 
the same. The other ("with-find-p") works from a process filter, chopping 
up strings that get passed to it.

But the amount of time spent in GC is different, with most of the 
difference in performance attributable to it: if we subtract time spent 
in GC, the runtimes are approximately equal.

I can imagine that the filter-based approach necessarily creates more 
strings (to pass to the filter function). Maybe we could increase those 
strings' size (thus reducing the number) by increasing the read buffer 
size? I haven't found a relevant variable, though.

Or if there was some other callback that runs after the next chunk of 
output arrives from the process, we could parse it from the buffer. But 
the insertion into the buffer would need to be made efficient 
(apparently internal-default-process-filter currently uses the same 
sequence of strings as the other filters for input, with the same amount 
of consing).





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-27 13:30                                             ` Dmitry Gutov
@ 2023-07-29  0:12                                               ` Dmitry Gutov
  2023-07-29  6:15                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-29  0:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 27/07/2023 16:30, Dmitry Gutov wrote:
> I can imagine that the filter-based approach necessarily creates more 
> strings (to pass to the filter function). Maybe we could increase those 
> strings' size (thus reducing the number) by increasing the read buffer 
> size?

To go further along this route, first of all, I verified that the input 
strings are (almost) all the same length: 4096. And they are parsed into 
strings with length 50-100 characters, meaning the number of "junk" 
objects due to the process-filter approach probably shouldn't matter too 
much, given that the number of strings returned is 40-80x more.

But then I ran these tests with different values of 
read-process-output-max, which exactly increased those strings' size, 
proportionally reducing their number. The results were:

 > (my-bench-rpom 1 default-directory "")

=>

(("with-find-p 4096" . "Elapsed time: 0.945478s (0.474680s in 6 GCs)")
 ("with-find-p 40960" . "Elapsed time: 0.760727s (0.395379s in 5 GCs)")
 ("with-find-p 409600" . "Elapsed time: 0.729757s (0.394881s in 5 GCs)"))

where

(defun my-bench-rpom (count path regexp)
  (setq path (expand-file-name path))
  (list
   (cons "with-find-p 4096"
         (let ((read-process-output-max 4096))
           (benchmark count (list 'find-directory-files-recursively-2
                                  path regexp))))
   (cons "with-find-p 40960"
         (let ((read-process-output-max 40960))
           (benchmark count (list 'find-directory-files-recursively-2
                                  path regexp))))
   (cons "with-find-p 409600"
         (let ((read-process-output-max 409600))
           (benchmark count (list 'find-directory-files-recursively-2
                                  path regexp))))))

...with the last iteration showing consistently the same or better 
performance than the "sync" version I benchmarked previously.

What does that mean for us? The number of strings in the heap is 
reduced, but not by much (again, the result is a list with 43x more 
elements). The combined memory taken up by these intermediate strings to 
be garbage-collected is the same.

It seems like per-chunk overhead is non-trivial, and affects GC somehow 
(but not in a way that just any string would).

In this test, by default, the output produces ~6000 strings and passes 
them to the filter function. Meaning, read_and_dispose_of_process_output 
is called about 6000 times, producing the overhead of roughly 0.2s. 
Something in there must be producing extra work for the GC.

This line seems suspect:

        list3 (outstream, make_lisp_proc (p), text),

Creates 3 conses and one Lisp object (tagged pointer). But maybe I'm 
missing something bigger.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-29  0:12                                               ` Dmitry Gutov
@ 2023-07-29  6:15                                                 ` Eli Zaretskii
  2023-07-30  1:35                                                   ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-29  6:15 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Sat, 29 Jul 2023 03:12:34 +0300
> From: Dmitry Gutov <dmitry@gutov.dev>
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> 
> It seems like per-chunk overhead is non-trivial, and affects GC somehow 
> (but not in a way that just any string would).
> 
> In this test, by default, the output produces ~6000 strings and passes 
> them to the filter function. Meaning, read_and_dispose_of_process_output 
> is called about 6000 times, producing the overhead of roughly 0.2s. 
> Something in there must be producing extra work for the GC.
> 
> This line seems suspect:
> 
>         list3 (outstream, make_lisp_proc (p), text),
> 
> Creates 3 conses and one Lisp object (tagged pointer). But maybe I'm 
> missing something bigger.

I don't understand what puzzles you here.  You need to make your
descriptions more clear to allow others to follow your logic.  You use
terms you never explain: "junk objects", "number of strings in the
heap", "per-chunk overhead" (what is "chunk"?), which is a no-no when
explaining complex technical stuff to others.

If I read what you wrote superficially, without delving into the
details (which I can't understand), you are saying that the overall
amount of consing is roughly the same.  This is consistent with the
fact that the GC times change only very little.  So I don't think I
see, on this level, what puzzles you in this picture.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-29  6:15                                                 ` Eli Zaretskii
@ 2023-07-30  1:35                                                   ` Dmitry Gutov
  2023-07-31 11:38                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-07-30  1:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 29/07/2023 09:15, Eli Zaretskii wrote:
>> Date: Sat, 29 Jul 2023 03:12:34 +0300
>> From: Dmitry Gutov <dmitry@gutov.dev>
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>>
>> It seems like per-chunk overhead is non-trivial, and affects GC somehow
>> (but not in a way that just any string would).
>>
>> In this test, by default, the output produces ~6000 strings and passes
>> them to the filter function. Meaning, read_and_dispose_of_process_output
>> is called about 6000 times, producing the overhead of roughly 0.2s.
>> Something in there must be producing extra work for the GC.
>>
>> This line seems suspect:
>>
>>          list3 (outstream, make_lisp_proc (p), text),
>>
>> Creates 3 conses and one Lisp object (tagged pointer). But maybe I'm
>> missing something bigger.
> 
> I don't understand what puzzles you here.  You need to make your
> descriptions more clear to allow others to follow your logic.  You use
> terms you never explain: "junk objects", "number of strings in the
> heap", "per-chunk overhead" (what is "chunk"?), which is a no-no when
> explaining complex technical stuff to others.

In this context, junk objects are objects that will need to be 
collected by the garbage collector very soon because they are just a 
byproduct of a function's execution (but aren't used in the return 
value, for example). The more of them a function creates, the more work 
there will be, supposedly, for the GC.

Heap is perhaps the wrong term (given that C has its own notion of 
heap), but I meant the memory managed by the Lisp runtime.

And chunks are the buffered strings that get passed to the process 
filter. Chunks of the process' output. By default, these chunks are 4096 
characters long, but the comparisons tweak that value by 10x and 100x.

> If I read what you wrote superficially, without delving into the
> details (which I can't understand), you are saying that the overall
> amount of consing is roughly the same.

What is "amount of consing"? Is it just the number of objects? Or does 
their size (e.g. string length) affect GC pressure as well?

> This is consistent with the
> fact that the GC times change only very little.  So I don't think I
> see, on this level, what puzzles you in this picture.

Now that you pointed that out, the picture is just more puzzling. While 
0.1s in GC is not insignificant (it's 10% of the whole runtime), it does 
seem to have been more of a fluke, and on average the fluctuations in GC 
time are smaller.

Here's an extended comparison:

(("with-find 4096" . "Elapsed time: 1.737742s (1.019624s in 28 GCs)")
 ("with-find 40960" . "Elapsed time: 1.515376s (0.942906s in 26 GCs)")
 ("with-find 409600" . "Elapsed time: 1.458987s (0.948857s in 26 GCs)")
 ("with-find 1048576" . "Elapsed time: 1.363882s (0.888599s in 24 GCs)")
 ("with-find-p 4096" . "Elapsed time: 1.202522s (0.745758s in 19 GCs)")
 ("with-find-p 40960" . "Elapsed time: 1.005221s (0.640815s in 16 GCs)")
 ("with-find-p 409600" . "Elapsed time: 0.855483s (0.591208s in 15 GCs)")
 ("with-find-p 1048576" . "Elapsed time: 0.825936s (0.623876s in 16 GCs)")
 ("with-find-sync 4096" . "Elapsed time: 0.848059s (0.272570s in 7 GCs)")
 ("with-find-sync 409600" . "Elapsed time: 0.912932s (0.339230s in 9 GCs)")
 ("with-find-sync 1048576" . "Elapsed time: 0.880479s (0.303047s in 8 GCs)"))

What was puzzling for me, overall, is that if we take "with-find 409600" 
(the fastest among the asynchronous runs without parallelism) and 
"with-find-sync", the difference in GC time (which is repeatable), 
0.66s, almost covers all the difference in performance. And as for 
"with-find-p 409600", it would come out on top! Which it did in Ihor's 
tests when GC was disabled.

But where does the extra GC time come from? Is it from extra consing in 
the asynchronous call's case? If it is, it's not from all the chunked 
strings, apparently, given that increasing max string's size (and 
decreasing their number by 2x-6x, according to my logging) doesn't 
affect the reported GC time much.

Could the extra time spent in GC just come from the fact that it's given 
more opportunities to run, maybe? call_process stays entirely in C, 
whereas make-process, with its asynchronous approach, goes between C and 
Lisp every time it receives input. The report above might indicate so: 
with-find-p has ~20 garbage collection cycles, whereas with-find-sync 
has only ~10. Or could there be some other source of consing, unrelated 
to the process output strings, and how finely they are sliced?

Changing process-adaptive-read-buffering to nil didn't have any effect here.

If we get back to increasing read-process-output-max, which does help 
(apparently due to reducing the number of times we switch between 
reading from the process and doing... whatever else), the sweet spot 
seems to be 1048576, which is my system's maximum value. Anything 
higher, and performance gets worse again; I'm guessing something 
somewhere resets the value to the default? Not sure why it doesn't clip 
to the maximum allowed, though.

Anyway, it would be helpful to be able to decide on as high a value as 
possible without manually reading from /proc/sys/fs/pipe-max-size. And 
what about other OSes?
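(For reference, the manual version looks roughly like this; GNU/Linux
only, and `my/pipe-max-size' is a made-up name:)

```elisp
;; Sketch: derive the largest useful `read-process-output-max' from the
;; kernel's pipe-max-size, falling back to the current default elsewhere.
(defun my/pipe-max-size ()
  (let ((file "/proc/sys/fs/pipe-max-size"))
    (if (file-readable-p file)
        (with-temp-buffer
          (insert-file-contents file)
          (string-to-number (buffer-string)))
      read-process-output-max)))
```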





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-30  1:35                                                   ` Dmitry Gutov
@ 2023-07-31 11:38                                                     ` Eli Zaretskii
  2023-09-08  0:53                                                       ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-07-31 11:38 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Sun, 30 Jul 2023 04:35:49 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> In this context, junk objects are objects that will need to be 
> collected by the garbage collector very soon because they are just a 
> byproduct of a function's execution (but aren't used in the return 
> value, for example). The more of them a function creates, the more work 
> there will be, supposedly, for the GC.
> 
> Heap is perhaps the wrong term (given that C has its own notion of 
> heap), but I meant the memory managed by the Lisp runtime.
> 
> And chunks are the buffered strings that get passed to the process 
> filter. Chunks of the process' output. By default, these chunks are 4096 
> characters long, but the comparisons tweak that value by 10x and 100x.

If the subprocess output is inserted into a buffer, its effect on the
GC will be different.  (Not sure if this is relevant to the issue at
hand, as I lost track of the many variants of the function that were
presented.)

> > If I read what you wrote superficially, without delving into the
> > details (which I can't understand), you are saying that the overall
> > amount of consing is roughly the same.
> 
> What is "amount of consing"? Is it just the number of objects? Or does 
> their size (e.g. string length) affect GC pressure as well?

In general, both, since we have 2 GC thresholds, and GC is actually
done when both are exceeded.  So the effect will also depend on how
much Lisp memory is already allocated in the Emacs process where these
benchmarks are run.

> > This is consistent with the
> > fact that the GC times change only very little.  So I don't think I
> > see, on this level, what puzzles you in this picture.
> 
> Now that you pointed that out, the picture is just more puzzling. While 
> 0.1s in GC is not insignificant (it's 10% of the whole runtime), it does 
> seem to have been more of a fluke, and on average the fluctuations in GC 
> time are smaller.
> 
> Here's an extended comparison:
> 
> (("with-find 4096" . "Elapsed time: 1.737742s (1.019624s in 28 GCs)")
>   ("with-find 40960" . "Elapsed time: 1.515376s (0.942906s in 26 GCs)")
>   ("with-find 409600" . "Elapsed time: 1.458987s (0.948857s in 26 GCs)")
>   ("with-find 1048576" . "Elapsed time: 1.363882s (0.888599s in 24 GCs)")
>   ("with-find-p 4096" . "Elapsed time: 1.202522s (0.745758s in 19 GCs)")
>   ("with-find-p 40960" . "Elapsed time: 1.005221s (0.640815s in 16 GCs)")
> ("with-find-p 409600" . "Elapsed time: 0.855483s (0.591208s in 15 GCs)")
> ("with-find-p 1048576". "Elapsed time: 0.825936s (0.623876s in 16 GCs)")
> ("with-find-sync 4096" . "Elapsed time: 0.848059s (0.272570s in 7 GCs)")
> ("with-find-sync 409600"."Elapsed time: 0.912932s (0.339230s in 9 GCs)")
> ("with-find-sync 1048576"."Elapsed time: 0.880479s (0.303047s in 8 GCs)"
> ))
> 
> What was puzzling for me, overall, is that if we take "with-find 409600" 
> (the fastest among the asynchronous runs without parallelism) and 
> "with-find-sync", the difference in GC time (which is repeatable), 
> 0.66s, almost covers all the difference in performance. And as for 
> "with-find-p 409600", it would come out on top! Which it did in Ihor's 
> tests when GC was disabled.
> 
> But where does the extra GC time come from? Is it from extra consing in 
> the asynchronous call's case? If it is, it's not from all the chunked 
> strings, apparently, given that increasing max string's size (and 
> decreasing their number by 2x-6x, according to my logging) doesn't 
> affect the reported GC time much.
> 
> Could the extra time spent in GC just come from the fact that it's given 
> more opportunities to run, maybe? call_process stays entirely in C, 
> whereas make-process, with its asynchronous approach, goes between C and 
> Lisp every time it receives input. The report above might indicate so: 
> with-find-p has ~20 garbage collection cycles, whereas with-find-sync - 
> only ~10. Or could there be some other source of consing, unrelated to 
> the process output string, and how finely they are sliced?

These questions can only be answered by dumping the values of the 2 GC
thresholds and of consing_until_gc for each GC cycle.  It could be
that we are consing more Lisp memory, or it could be that one of the
implementations provides fewer opportunities for Emacs to call
maybe_gc.  Or it could be some combination of the two.

> If we get back to increasing read-process-output-max, which does help 
> (apparently due to reducing the number we switch between reading from 
> the process and doing... whatever else), the sweet spot seems to be 
> 1048576, which is my system's maximum value. Anything higher - and the 
> perf goes back to worse -- I'm guessing something somewhere resets the 
> value to default? Not sure why it doesn't clip to the maximum allowed, 
> though.
> 
> Anyway, it would be helpful to be able to decide on as high as possible 
> value without manually reading from /proc/sys/fs/pipe-max-size. And what 
> of other OSes?

Is this with pipes or with PTYs?





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-07-31 11:38                                                     ` Eli Zaretskii
@ 2023-09-08  0:53                                                       ` Dmitry Gutov
  2023-09-08  6:35                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-08  0:53 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

Let's try to investigate this some more, if we can.

On 31/07/2023 14:38, Eli Zaretskii wrote:
>> Date: Sun, 30 Jul 2023 04:35:49 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>> In this context, junk objects are objects that will need to be
>> collected by the garbage collector very soon because they are just a
>> byproduct of a function's execution (but aren't used in the return
>> value, for example). The more of them a function creates, the more work
>> there will be, supposedly, for the GC.
>>
>> Heap is perhaps the wrong term (given that C has its own notion of
>> heap), but I meant the memory managed by the Lisp runtime.
>>
>> And chunks are the buffered strings that get passed to the process
>> filter. Chunks of the process' output. By default, these chunks are 4096
>> characters long, but the comparisons tweak that value by 10x and 100x.
> 
> If the subprocess output is inserted into a buffer, its effect on the
> GC will be different.  (Not sure if this is relevant to the issue at
> hand, as I lost track of the many variants of the function that were
> presented.)

Yes, one of the variants inserts into the buffer (one that uses a 
synchronous process call and also, coincidentally, spends less time in 
GC), and the asynchronous work from a process filter.

>>> If I read what you wrote superficially, without delving into the
>>> details (which I can't understand), you are saying that the overall
>>> amount of consing is roughly the same.
>>
>> What is "amount of consing"? Is it just the number of objects? Or does
>> their size (e.g. string length) affect GC pressure as well?
> 
> In general, both, since we have 2 GC thresholds, and GC is actually
> done when both are exceeded.  So the effect will also depend on how
> much Lisp memory is already allocated in the Emacs process where these
> benchmarks are run.

All right.

>>> This is consistent with the
>>> fact that the GC times change only very little.  So I don't think I
>>> see, on this level, what puzzles you in this picture.
>>
>> Now that you pointed that out, the picture is just more puzzling. While
>> 0.1s in GC is not insignificant (it's 10% of the whole runtime), it does
>> seem to have been more of a fluke, and on average the fluctuations in GC
>> time are smaller.
>>
>> Here's an extended comparison:
>>
>> (("with-find 4096" . "Elapsed time: 1.737742s (1.019624s in 28 GCs)")
>>    ("with-find 40960" . "Elapsed time: 1.515376s (0.942906s in 26 GCs)")
>>    ("with-find 409600" . "Elapsed time: 1.458987s (0.948857s in 26 GCs)")
>>    ("with-find 1048576" . "Elapsed time: 1.363882s (0.888599s in 24 GCs)")
>>    ("with-find-p 4096" . "Elapsed time: 1.202522s (0.745758s in 19 GCs)")
>>    ("with-find-p 40960" . "Elapsed time: 1.005221s (0.640815s in 16 GCs)")
>> ("with-find-p 409600" . "Elapsed time: 0.855483s (0.591208s in 15 GCs)")
>> ("with-find-p 1048576". "Elapsed time: 0.825936s (0.623876s in 16 GCs)")
>> ("with-find-sync 4096" . "Elapsed time: 0.848059s (0.272570s in 7 GCs)")
>> ("with-find-sync 409600"."Elapsed time: 0.912932s (0.339230s in 9 GCs)")
>> ("with-find-sync 1048576"."Elapsed time: 0.880479s (0.303047s in 8 GCs)"
>> ))
>>
>> What was puzzling for me, overall, is that if we take "with-find 409600"
>> (the fastest among the asynchronous runs without parallelism) and
>> "with-find-sync", the difference in GC time (which is repeatable),
>> 0.66s, almost covers all the difference in performance. And as for
>> "with-find-p 409600", it would come out on top! Which it did in Ihor's
>> tests when GC was disabled.
>>
>> But where does the extra GC time come from? Is it from extra consing in
>> the asynchronous call's case? If it is, it's not from all the chunked
>> strings, apparently, given that increasing max string's size (and
>> decreasing their number by 2x-6x, according to my logging) doesn't
>> affect the reported GC time much.
>>
>> Could the extra time spent in GC just come from the fact that it's given
>> more opportunities to run, maybe? call_process stays entirely in C,
>> whereas make-process, with its asynchronous approach, goes between C and
>> Lisp every time it receives input. The report above might indicate so:
>> with-find-p has ~20 garbage collection cycles, whereas with-find-sync has
>> only ~10. Or could there be some other source of consing, unrelated to
>> the process output strings and how finely they are sliced?
> 
> These questions can only be answered by dumping the values of the 2 GC
> thresholds and of consing_until_gc for each GC cycle.  It could be
> that we are consing more Lisp memory, or it could be that one of the
> implementations provides fewer opportunities for Emacs to call
> maybe_gc.  Or it could be some combination of the two.

Do you think the outputs of 
https://elpa.gnu.org/packages/emacs-gc-stats.html could help?

Otherwise, I suppose I need to add some fprintf's somewhere. Would the 
beginning of maybe_gc inside lisp.h be a good place for that?

>> If we get back to increasing read-process-output-max, which does help
>> (apparently due to reducing the number of times we switch between reading
>> from the process and doing... whatever else), the sweet spot seems to be
>> 1048576, which is my system's maximum value. Anything higher, and the
>> performance gets worse again; I'm guessing something somewhere resets the
>> value to default? Not sure why it doesn't clip to the maximum allowed,
>> though.
>>
>> Anyway, it would be helpful to be able to decide on as high a value as
>> possible without manually reading from /proc/sys/fs/pipe-max-size. And
>> what of other OSes?
> 
> Is this with pipes or with PTYs?

All examples which use make-process call it with :connection-type 'pipe.

The one that calls process-file (the "synchronous" impl) also probably 
does, but I don't see that in the docstring.
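For what it's worth, on GNU/Linux the pipe ceiling mentioned above could be probed from Lisp rather than read manually. A minimal sketch (the helper name is mine, and /proc/sys/fs/pipe-max-size is Linux-specific, so other OSes would need a different probe):

```elisp
;; Sketch: derive a large `read-process-output-max' from the system's
;; maximum pipe size.  Hypothetical helper; only works where procfs
;; exposes /proc/sys/fs/pipe-max-size (i.e. on Linux).
(defun my/pipe-max-size ()
  "Return the system's maximum pipe buffer size in bytes, or nil if unknown."
  (when (file-readable-p "/proc/sys/fs/pipe-max-size")
    (with-temp-buffer
      (insert-file-contents "/proc/sys/fs/pipe-max-size")
      (string-to-number (buffer-string)))))

;; Keep the default when the file is absent:
(let ((max (my/pipe-max-size)))
  (when max
    (setq read-process-output-max max)))
```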





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-08  0:53                                                       ` Dmitry Gutov
@ 2023-09-08  6:35                                                         ` Eli Zaretskii
  2023-09-10  1:30                                                           ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-08  6:35 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Fri, 8 Sep 2023 03:53:37 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> >> (("with-find 4096" . "Elapsed time: 1.737742s (1.019624s in 28 GCs)")
> >>    ("with-find 40960" . "Elapsed time: 1.515376s (0.942906s in 26 GCs)")
> >>    ("with-find 409600" . "Elapsed time: 1.458987s (0.948857s in 26 GCs)")
> >>    ("with-find 1048576" . "Elapsed time: 1.363882s (0.888599s in 24 GCs)")
> >>    ("with-find-p 4096" . "Elapsed time: 1.202522s (0.745758s in 19 GCs)")
> >>    ("with-find-p 40960" . "Elapsed time: 1.005221s (0.640815s in 16 GCs)")
> >> ("with-find-p 409600" . "Elapsed time: 0.855483s (0.591208s in 15 GCs)")
> >> ("with-find-p 1048576". "Elapsed time: 0.825936s (0.623876s in 16 GCs)")
> >> ("with-find-sync 4096" . "Elapsed time: 0.848059s (0.272570s in 7 GCs)")
> >> ("with-find-sync 409600"."Elapsed time: 0.912932s (0.339230s in 9 GCs)")
> >> ("with-find-sync 1048576"."Elapsed time: 0.880479s (0.303047s in 8 GCs)"
> >> ))
> >>
> >> What was puzzling for me, overall, is that if we take "with-find 409600"
> >> (the fastest among the asynchronous runs without parallelism) and
> >> "with-find-sync", the difference in GC time (which is repeatable),
> >> 0.66s, almost covers all the difference in performance. And as for
> >> "with-find-p 409600", it would come out on top! Which it did in Ihor's
> >> tests when GC was disabled.
> >>
> >> But where does the extra GC time come from? Is it from extra consing in
> >> the asynchronous call's case? If it is, it's not from all the chunked
> >> strings, apparently, given that increasing max string's size (and
> >> decreasing their number by 2x-6x, according to my logging) doesn't
> >> affect the reported GC time much.
> >>
> >> Could the extra time spent in GC just come from the fact that it's given
> >> more opportunities to run, maybe? call_process stays entirely in C,
> >> whereas make-process, with its asynchronous approach, goes between C and
> >> Lisp every time it receives input. The report above might indicate so:
> >> with-find-p has ~20 garbage collection cycles, whereas with-find-sync has
> >> only ~10. Or could there be some other source of consing, unrelated to
> >> the process output strings and how finely they are sliced?
> > 
> > These questions can only be answered by dumping the values of the 2 GC
> > thresholds and of consing_until_gc for each GC cycle.  It could be
> > that we are consing more Lisp memory, or it could be that one of the
> > implementations provides fewer opportunities for Emacs to call
> > maybe_gc.  Or it could be some combination of the two.
> 
> Do you think the outputs of 
> https://elpa.gnu.org/packages/emacs-gc-stats.html could help?

I think you'd need to expose consing_until_gc to Lisp, and then you
can collect the data from Lisp.

> Otherwise, I suppose I need to add some fprintf's somewhere. Would the 
> beginning of maybe_gc inside lisp.h be a good place for that?

I can only recommend the fprintf method if doing this from Lisp is
impossible for some reason.

> >> If we get back to increasing read-process-output-max, which does help
> >> (apparently due to reducing the number of times we switch between reading
> >> from the process and doing... whatever else), the sweet spot seems to be
> >> 1048576, which is my system's maximum value. Anything higher, and the
> >> performance gets worse again; I'm guessing something somewhere resets the
> >> value to default? Not sure why it doesn't clip to the maximum allowed,
> >> though.
> >>
> >> Anyway, it would be helpful to be able to decide on as high a value as
> >> possible without manually reading from /proc/sys/fs/pipe-max-size. And
> >> what of other OSes?
> > 
> > Is this with pipes or with PTYs?
> 
> All examples which use make-process call it with :connection-type 'pipe.
> 
> The one that calls process-file (the "synchronous" impl) also probably 
> does, but I don't see that in the docstring.

Yes, call-process uses pipes.  So finding the optimum boils down to
running various scenarios.  It is also possible that the optimum will
be different on different systems, btw.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-08  6:35                                                         ` Eli Zaretskii
@ 2023-09-10  1:30                                                           ` Dmitry Gutov
  2023-09-10  5:33                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-10  1:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 08/09/2023 09:35, Eli Zaretskii wrote:
>>> These questions can only be answered by dumping the values of the 2 GC
>>> thresholds and of consing_until_gc for each GC cycle.  It could be
>>> that we are consing more Lisp memory, or it could be that one of the
>>> implementations provides fewer opportunities for Emacs to call
>>> maybe_gc.  Or it could be some combination of the two.
>> Do you think the outputs of
>> https://elpa.gnu.org/packages/emacs-gc-stats.html  could help?
> I think you'd need to expose consing_until_gc to Lisp, and then you
> can collect the data from Lisp.

I can expose it to Lisp and print all three from post-gc-hook, but the 
result just looks like this:

gc-pct 0.1 gc-thr 800000 cugc 4611686018427387903

Perhaps I need to add a hook which runs at the beginning of GC? Or of 
maybe_gc even?

Alternatively, (memory-use-counts) seems to retain some counters which 
don't get erased during garbage collection.

>> All examples which use make-process call it with :connection-type 'pipe.
>>
>> The one that calls process-file (the "synchronous" impl) also probably
>> does, but I don't see that in the docstring.
> Yes, call-process uses pipes.  So finding the optimum boils down to
> running various scenarios.  It is also possible that the optimum will
> be different on different systems, btw.

Sure, but I'd like to improve the state of affairs in at least the main one.

And as for MS Windows, IIRC all find-based solutions are currently 
equally slow, so we're unlikely to make things worse there anyway.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-10  1:30                                                           ` Dmitry Gutov
@ 2023-09-10  5:33                                                             ` Eli Zaretskii
  2023-09-11  0:02                                                               ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-10  5:33 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Sun, 10 Sep 2023 04:30:24 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> On 08/09/2023 09:35, Eli Zaretskii wrote:
> > I think you'd need to expose consing_until_gc to Lisp, and then you
> > can collect the data from Lisp.
> 
> I can expose it to Lisp and print all three from post-gc-hook, but the 
> result just looks like this:
> 
> gc-pct 0.1 gc-thr 800000 cugc 4611686018427387903
> 
> Perhaps I need to add a hook which runs at the beginning of GC? Or of 
> maybe_gc even?

You could record its value in a local variable at the entry to
garbage_collect, and then expose that value to Lisp.

> Alternatively, (memory-use-counts) seems to retain some counters which 
> don't get erased during garbage collection.

Maybe using those will be good enough, indeed.

> And as for MS Windows, IIRC all find-based solutions are currently 
> equally slow, so we're unlikely to make things worse there anyway.

I was actually thinking about *BSD and macOS.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-10  5:33                                                             ` Eli Zaretskii
@ 2023-09-11  0:02                                                               ` Dmitry Gutov
  2023-09-11 11:57                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-11  0:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

[-- Attachment #1: Type: text/plain, Size: 3466 bytes --]

On 10/09/2023 08:33, Eli Zaretskii wrote:
>> Date: Sun, 10 Sep 2023 04:30:24 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>> On 08/09/2023 09:35, Eli Zaretskii wrote:
>>> I think you'd need to expose consing_until_gc to Lisp, and then you
>>> can collect the data from Lisp.
>>
>> I can expose it to Lisp and print all three from post-gc-hook, but the
>> result just looks like this:
>>
>> gc-pct 0.1 gc-thr 800000 cugc 4611686018427387903
>>
>> Perhaps I need to add a hook which runs at the beginning of GC? Or of
>> maybe_gc even?
> 
> You could record its value in a local variable at the entry to
> garbage_collect, and then expose that value to Lisp.

That also doesn't seem to give much, given that the condition for 
entering 'maybe_garbage_collect' is (consing_until_gc < 0). I.e. we wait 
until it's down to 0, then garbage-collect. What we could perhaps do is 
add another hook (or a printer) at the beginning of maybe_gc, but either 
would result in lots and lots of output.

>> Alternatively, (memory-use-counts) seems to retain some counters which
>> don't get erased during garbage collection.
> 
> Maybe using those will be good enough, indeed.

I added this instrumentation:

(require 'cl-lib)  ; for cl-map

(defvar last-mem-counts '(0 0 0 0 0 0 0))

(defun gc-record-after ()
  (let* ((counts (memory-use-counts))
         (diff (cl-map 'list
                       (lambda (old new) (- new old))
                       last-mem-counts counts)))
    (setq last-mem-counts counts)
    (message "counts diff %s" diff)))

(add-hook 'post-gc-hook #'gc-record-after)

so that after each garbage collection we print the differences in all 
counters (CONSES FLOATS VECTOR-CELLS SYMBOLS STRING-CHARS INTERVALS 
STRINGS).

And a message call when the process finishes.

And made those recordings during the benchmark runs of two different 
listing methods (one using make-process, another using process-file) to 
list all files in a large directory (there are ~200000 files there).

The make-process one I also ran with a different (large) value of 
read-process-output-max. Results attached.

What's in there? First of all, for find-directory-files-recursively-3, 
there are 0 garbage collections between the beginning of the function 
and when we start parsing the output (no GCs while the process is 
writing to the buffer synchronously). I guess inserting output in a 
buffer doesn't increase consing, so there's nothing to GC?

Next: for find-directory-files-recursively-2, the process only finishes 
at the end, after all the GC cycles have run. I suppose that also means 
we block the process's output while Lisp is running, and that whatever 
GC events occur may coincide with the chunks of output coming from the 
process, however many of them turn out to be in total.

So there is also a second recording for 
find-directory-files-recursively-2 with read-process-output-max=409600. 
It does improve the performance significantly (and reduces the number of 
GC pauses). I guess what I'm still not clear on is whether the number 
of GC pauses is lower because of less consing (the only column that 
looks significantly different is the 3rd: VECTOR-CELLS), or because the 
process finishes faster due to larger buffers, which itself causes fewer 
calls to maybe_gc.

And, of course, what else could be done to reduce the time spent in GC 
in the asynchronous case.
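For reference, the kind of harness behind the "Elapsed time ... in N GCs" lines above can be approximated from Lisp alone, using the built-in `gcs-done' and `gc-elapsed' counters (the function name is mine; find-directory-files-recursively-2 is the variant under test from earlier in the thread):

```elisp
;; Sketch: time FN on DIR under a given `read-process-output-max',
;; reporting total elapsed time and time spent in GC.
(defun my/bench-listing (fn dir output-max)
  (let ((read-process-output-max output-max)
        (gcs-before gcs-done)          ; GC count so far
        (gc-time-before gc-elapsed)    ; cumulative GC time so far
        (start (float-time)))
    (funcall fn dir)
    (message "Elapsed time: %fs (%fs in %d GCs)"
             (- (float-time) start)
             (- gc-elapsed gc-time-before)
             (- gcs-done gcs-before))))

;; e.g. (my/bench-listing #'find-directory-files-recursively-2 "~/src" 409600)
```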

[-- Attachment #2: gcs2.txt --]
[-- Type: text/plain, Size: 2184 bytes --]

find-directory-files-recursively-2

Uses make-process and :filter to parse the output
concurrently with the process.

With (read-process-output-max 4096):

start now
counts diff (75840 13 31177 60 343443 3496 4748)
counts diff (41946 1 460 0 1226494 0 8425)
counts diff (43165 1 450 0 1284214 0 8951)
counts diff (43513 1 364 0 1343316 0 10125)
counts diff (43200 1 384 0 1479048 0 9766)
counts diff (46220 1 428 0 1528863 0 10242)
counts diff (43125 1 462 0 1767068 0 8790)
counts diff (49118 1 458 0 1723271 0 10832)
counts diff (53156 1 572 0 1789919 0 10774)
counts diff (57755 1 548 0 1783286 0 12600)
counts diff (62171 1 554 0 1795216 0 13995)
counts diff (62020 1 550 0 1963255 0 13996)
counts diff (54559 1 616 0 2387308 0 10700)
counts diff (56428 1 634 0 2513219 0 11095)
counts diff (62611 1 658 0 2510756 0 12864)
counts diff (67560 1 708 0 2574312 0 13899)
counts diff (78154 1 928 0 2572273 0 14714)
counts diff (86794 1 976 0 2520915 0 17004)
counts diff (78112 1 874 0 2943548 0 15367)
counts diff (79443 1 894 0 3138948 0 15559)
counts diff (81861 1 984 0 3343764 0 15260)
counts diff (87724 1 1030 0 3430969 0 16650)
counts diff (88532 1 902 0 3591052 0 18487)
counts diff (92083 1 952 0 3769290 0 19065)
<finished\n>
Elapsed time: 1.344422s (0.747126s in 24 GCs)

And here's with (read-process-output-max 409600):

start now
counts diff (57967 1 4040 1 981912 106 7731)
counts diff (32075 1 20 0 1919096 0 10560)
counts diff (43431 1 18 0 2259314 0 14371)
counts diff (46335 1 18 0 2426290 0 15339)
counts diff (31872 1 18 0 2447639 0 10518)
counts diff (46527 1 18 0 2328042 0 15403)
counts diff (42468 1 18 0 2099976 0 14050)
counts diff (48648 1 18 0 2302713 0 16110)
counts diff (50404 1 20 0 3260921 0 16669)
counts diff (40147 1 20 0 3264463 0 13251)
counts diff (48118 1 20 0 3261725 0 15908)
counts diff (60732 1 282 0 2791003 0 16785)
counts diff (71329 1 506 0 2762237 0 17487)
counts diff (61455 1 342 0 3192771 0 16271)
counts diff (49035 1 30 0 3663715 0 16085)
counts diff (58651 1 236 0 3783888 0 16683)
counts diff (57132 1 24 0 4557688 0 18862)
counts diff (71319 1 24 0 4769891 0 23591)
<finished\n>
Elapsed time: 0.890710s (0.546486s in 18 GCs)

[-- Attachment #3: gcs3.txt --]
[-- Type: text/plain, Size: 715 bytes --]

find-directory-files-recursively-3

Uses process-file, parses the buffer
with search-forward at the end.

start now
<process finished, now parsing>
counts diff (62771 5 26629 63 458211 3223 8038)
counts diff (17045 1 12 0 1288153 0 16949)
counts diff (18301 1 12 0 1432165 0 18205)
counts diff (17643 1 12 0 1716294 0 17547)
counts diff (21917 1 12 0 1726462 0 21821)
counts diff (25888 1 12 0 1777371 0 25792)
counts diff (21743 1 12 0 2345143 0 21647)
counts diff (24035 1 12 0 2561491 0 23939)
counts diff (30028 1 12 0 2593069 0 29932)
counts diff (29627 1 12 0 3041307 0 29531)
counts diff (30140 1 12 0 3479209 0 30044)
counts diff (35181 1 12 0 3690480 0 35085)
Elapsed time: 0.943090s (0.351799s in 12 GCs)


* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-11  0:02                                                               ` Dmitry Gutov
@ 2023-09-11 11:57                                                                 ` Eli Zaretskii
  2023-09-11 23:06                                                                   ` Dmitry Gutov
  2023-09-12 14:23                                                                   ` Dmitry Gutov
  0 siblings, 2 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-11 11:57 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Mon, 11 Sep 2023 03:02:55 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> > You could record its value in a local variable at the entry to
> > garbage_collect, and then expose that value to Lisp.
> 
> That also doesn't seem to give much, given that the condition for 
> entering 'maybe_garbage_collect' is (consing_until_gc < 0). I.e. we wait 
> until it's down to 0, then garbage-collect.

No, we don't wait until it's zero, we perform GC on the first
opportunity that we _notice_ that it crossed zero.  So examining how
negative the value of consing_until_gc is when GC is actually
performed could tell us whether we checked the threshold with high
enough frequency, and comparing these values between different runs
could tell us whether the shorter time spent in GC means really less
garbage or less frequent checks for the need to GC.

> What's in there? First of all, for find-directory-files-recursively-3, 
> there are 0 garbage collections between the beginning of the function 
> and when we start parsing the output (no GCs while the process is 
> writing to the buffer synchronously). I guess inserting output in a 
> buffer doesn't increase consing, so there's nothing to GC?

No, we just don't count the increasing size of buffer text in the
"consing since GC" counter.  Basically, buffer text is never "garbage",
except when a buffer is killed.

> Next: for find-directory-files-recursively-2, the process only finishes 
> at the end, when all GC cycles are done for. I suppose that also means 
> we block the process's output while Lisp is running, and also that 
> whatever GC events occur might coincide with the chunks of output coming 
> from the process, and however many of them turn out to be in total.

We don't block the process when GC runs.  We do stop reading from the
process, so if and when the pipe fills, the OS will block the process.

> So there is also a second recording for 
> find-directory-files-recursively-2 with read-process-output-max=409600. 
> It does improve the performance significantly (and reduce the number of 
> GC pauses). I guess what I'm still not clear on, is whether the number 
> of GC pauses is fewer because of less consing (the only column that 
> looks significantly different is the 3rd: VECTOR-CELLS), or because the 
> process finishes faster due to larger buffers, which itself causes fewer 
> calls to maybe_gc.

I think the latter.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-11 11:57                                                                 ` Eli Zaretskii
@ 2023-09-11 23:06                                                                   ` Dmitry Gutov
  2023-09-12 11:39                                                                     ` Eli Zaretskii
  2023-09-12 14:23                                                                   ` Dmitry Gutov
  1 sibling, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-11 23:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

[-- Attachment #1: Type: text/plain, Size: 2757 bytes --]

On 11/09/2023 14:57, Eli Zaretskii wrote:
>> Date: Mon, 11 Sep 2023 03:02:55 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>>> You could record its value in a local variable at the entry to
>>> garbage_collect, and then expose that value to Lisp.
>>
>> That also doesn't seem to give much, given that the condition for
>> entering 'maybe_garbage_collect' is (consing_until_gc < 0). I.e. we wait
>> until it's down to 0, then garbage-collect.
> 
> No, we don't wait until it's zero, we perform GC on the first
> opportunity that we _notice_ that it crossed zero.  So examining how
> negative the value of consing_until_gc is when GC is actually
> performed could tell us whether we checked the threshold with high
> enough frequency, and comparing these values between different runs
> could tell us whether the shorter time spent in GC means really less
> garbage or less frequent checks for the need to GC.

Good point, I'm attaching the same outputs with "last value of 
consing_until_gc" added to every line.

There are some pretty low values in the "read-process-output-max 409600" 
part of the experiment, which probably means the runtime stays in C, 
accumulating the output into the (now larger) buffer? Not sure.

>> What's in there? First of all, for find-directory-files-recursively-3,
>> there are 0 garbage collections between the beginning of the function
>> and when we start parsing the output (no GCs while the process is
>> writing to the buffer synchronously). I guess inserting output in a
>> buffer doesn't increase consing, so there's nothing to GC?
> 
> No, we just don't count increasing size of buffer text in the "consing
> since GC" counter.  Basically, buffer text is never "garbage", except
> when a buffer is killed.

That makes sense. Perhaps it also hints at a faster design for calling 
the process asynchronously (more on another experiment later).

>> Next: for find-directory-files-recursively-2, the process only finishes
>> at the end, when all GC cycles are done for. I suppose that also means
>> we block the process's output while Lisp is running, and also that
>> whatever GC events occur might coincide with the chunks of output coming
>> from the process, and however many of them turn out to be in total.
> 
> We don't block the process when GC runs.  We do stop reading from the
> process, so if and when the pipe fills, the OS will block the process.

Right. But the effect is almost the same, including the potential 
side-effect that (IIUC) when a process is waiting like that, it's just 
suspended and not rushing ahead using the CPU/disk/etc resources to the 
max. That's an orthogonal train of thought, sorry.

[-- Attachment #2: gcs2b.txt --]
[-- Type: text/plain, Size: 3284 bytes --]

find-directory-files-recursively-2

Uses make-process and :filter to parse the output
concurrently with the process.

With (read-process-output-max 4096):

start now
cugc -2560 counts diff (42345 3 14097 2 408583 1966 4479)
cugc -4449 counts diff (24599 1 342 0 883247 0 6266)
cugc -100 counts diff (24070 1 354 0 977387 0 6009)
cugc -116 counts diff (27266 1 278 0 940723 0 7485)
cugc -95 counts diff (27486 1 270 0 1014591 0 7586)
cugc -117 counts diff (27157 1 294 0 1121065 0 7329)
cugc -146 counts diff (28233 1 316 0 1185527 0 7562)
cugc -143 counts diff (30597 1 354 0 1217320 0 8147)
cugc -4807 counts diff (25925 1 380 0 1474618 0 6407)
cugc -127 counts diff (33344 1 368 0 1341453 0 8965)
cugc -177 counts diff (34785 1 478 0 1434432 0 8842)
cugc -2801 counts diff (37069 1 464 0 1477825 0 9675)
cugc -23 counts diff (40817 1 448 0 1478445 0 10999)
cugc -1215 counts diff (44526 1 500 0 1503604 0 11964)
cugc -4189 counts diff (42305 1 468 0 1701989 0 11354)
cugc -4715 counts diff (36644 1 532 0 2036778 0 9082)
cugc -85 counts diff (38234 1 542 0 2131756 0 9535)
cugc -861 counts diff (41632 1 578 0 2188186 0 10474)
cugc -117 counts diff (46029 1 580 0 2211685 0 11921)
cugc -38 counts diff (50353 1 728 0 2280388 0 12568)
cugc -2537 counts diff (57168 1 888 0 2286381 0 13974)
cugc -3676 counts diff (61570 1 924 0 2341402 0 15246)
cugc -174 counts diff (56504 1 924 0 2689300 0 13502)
cugc -1001 counts diff (57066 1 842 0 2855028 0 14098)
cugc -146 counts diff (57716 1 916 0 3063238 0 13891)
cugc -148 counts diff (62868 1 982 0 3139111 0 15244)
cugc -1730 counts diff (64809 1 856 0 3283855 0 16535)
cugc -162 counts diff (69183 1 870 0 3394031 0 17902)
<finished\n>
total chunks 6652
Elapsed time: 1.233016s (0.668819s in 28 GCs)

And here's with (read-process-output-max 409600):

start now
cugc -12 counts diff (59160 5 22547 116 155434 2046 2103)
cugc -154001 counts diff (18671 1 16 0 1034538 0 6172)
cugc -100 counts diff (20250 1 14 0 1003966 0 6708)
cugc -190294 counts diff (19623 1 16 0 1244441 0 6489)
cugc -58 counts diff (26160 1 14 0 1015128 0 8678)
cugc -293067 counts diff (22737 1 16 0 1426874 0 7527)
cugc -92 counts diff (28308 1 14 0 1160213 0 9394)
cugc -25 counts diff (21620 1 16 0 1535686 0 7153)
cugc -21 counts diff (23251 1 16 0 1554720 0 7698)
cugc -143 counts diff (29988 1 16 0 1462639 0 9943)
cugc -117 counts diff (28827 1 16 0 1622562 0 9556)
cugc -26 counts diff (33959 1 16 0 1606815 0 11266)
cugc -17 counts diff (37476 1 16 0 1639853 0 12439)
cugc -250992 counts diff (31345 1 18 0 2081663 0 10383)
cugc -289142 counts diff (29904 1 18 0 2448410 0 9901)
cugc -290227 counts diff (30675 1 18 0 2448156 0 10159)
cugc -264315 counts diff (35418 1 18 0 2446508 0 11741)
cugc -32 counts diff (41741 1 18 0 2343900 0 13847)
cugc -2201 counts diff (44523 1 112 0 2478310 0 14239)
cugc -15673 counts diff (49622 1 170 0 2528221 0 15592)
cugc -40267 counts diff (41990 1 58 0 2972015 0 13693)
cugc -159 counts diff (41010 1 22 0 3177994 0 13580)
cugc -42 counts diff (47602 1 156 0 3259833 0 15009)
cugc -358884 counts diff (43740 1 34 0 3687145 0 14436)
cugc -22 counts diff (55598 1 20 0 3494190 0 18454)
cugc -1270 counts diff (60128 1 190 0 3683461 0 18980)
<finished\n>
total chunks 273
Elapsed time: 0.932625s (0.608713s in 26 GCs)

[-- Attachment #3: gcs3b.txt --]
[-- Type: text/plain, Size: 929 bytes --]

find-directory-files-recursively-3

Uses process-file, parses the buffer
with search-forward at the end.

start now
<process finished, now parsing>
cugc -129 counts diff (17667 1 1565 1 779081 93 10465)
cugc -139 counts diff (12789 1 12 0 912364 0 12696)
cugc -84 counts diff (13496 1 12 0 1028060 0 13403)
cugc -33 counts diff (14112 1 12 0 1168522 0 14019)
cugc -153 counts diff (14354 1 12 0 1347241 0 14261)
cugc -72 counts diff (17005 1 12 0 1401075 0 16912)
cugc -17 counts diff (20810 1 12 0 1403396 0 20717)
cugc -94 counts diff (18516 1 12 0 1792508 0 18423)
cugc -120 counts diff (17981 1 12 0 2108458 0 17888)
cugc -50 counts diff (22090 1 12 0 2169835 0 21997)
cugc -136 counts diff (26749 1 12 0 2231037 0 26656)
cugc -93 counts diff (25300 1 12 0 2687843 0 25207)
cugc -72 counts diff (26165 1 12 0 3046140 0 26072)
cugc -142 counts diff (30968 1 12 0 3205306 0 30875)
Elapsed time: 0.938180s (0.314630s in 14 GCs)


* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-11 23:06                                                                   ` Dmitry Gutov
@ 2023-09-12 11:39                                                                     ` Eli Zaretskii
  2023-09-12 13:11                                                                       ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-12 11:39 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Tue, 12 Sep 2023 02:06:50 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> > No, we don't wait until it's zero, we perform GC on the first
> > opportunity that we _notice_ that it crossed zero.  So examining how
> > negative the value of consing_until_gc is when GC is actually
> > performed could tell us whether we checked the threshold with high
> > enough frequency, and comparing these values between different runs
> > could tell us whether the shorter time spent in GC means really less
> > garbage or less frequent checks for the need to GC.
> 
> Good point, I'm attaching the same outputs with "last value of 
> consing_until_gc" added to every line.
> 
> There are some pretty low values in the "read-process-output-max 409600"
> part of the experiment, which probably means the runtime stays in C,
> accumulating the output into the (now larger) buffer? Not sure.

No, I think this means we really miss some GC opportunities, and we
cons quite a lot more strings between GC cycles due to that.  I guess
this happens because we somehow cons many strings in code that doesn't
call maybe_gc or something.
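A toy model of the dynamic described above (the names consing_until_gc and maybe_gc come from Emacs's C sources, but the logic here is a heavily simplified sketch, not the real implementation):

```c
#include <stddef.h>

/* Simplified model: every allocation decrements a byte budget, but GC
   only runs when some code path calls maybe_gc and notices the budget
   went negative.  C code that conses a lot without calling maybe_gc
   lets the counter sink far below zero before the next check.  */
long consing_until_gc = 800000;  /* bytes left before the next GC */
long lowest_seen;                /* most negative budget observed at a GC */

void maybe_gc (void)
{
  if (consing_until_gc < 0)
    {
      if (consing_until_gc < lowest_seen)
        lowest_seen = consing_until_gc;
      consing_until_gc = 800000;  /* budget restored after the GC */
    }
}

/* Cons TOTAL bytes in CHUNK-sized allocations, calling maybe_gc only
   once every CHECK_EVERY allocations; return the most negative budget
   value seen, i.e. how far past the threshold we overshot.  */
long overshoot (long total, long chunk, int check_every)
{
  consing_until_gc = 800000;
  lowest_seen = 0;
  int step = 0;
  for (long done = 0; done < total; done += chunk)
    {
      consing_until_gc -= chunk;      /* consing without a check */
      if (++step % check_every == 0)
        maybe_gc ();
    }
  return lowest_seen;
}
```

With frequent checks the budget barely dips below zero at each GC; with rare checks it sinks much further, which matches the "pretty low values" in the logs.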





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-12 11:39                                                                     ` Eli Zaretskii
@ 2023-09-12 13:11                                                                       ` Dmitry Gutov
  0 siblings, 0 replies; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-12 13:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 12/09/2023 14:39, Eli Zaretskii wrote:
>> Date: Tue, 12 Sep 2023 02:06:50 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>>> No, we don't wait until it's zero, we perform GC on the first
>>> opportunity that we _notice_ that it crossed zero.  So examining how
>>> negative is the value of consing_until_gc when GC is actually
>>> performed could tell us whether we checked the threshold with high
>>> enough frequency, and comparing these values between different runs
>>> could tell us whether the shorter time spend in GC means really less
>>> garbage or less frequent checks for the need to GC.
>> Good point, I'm attaching the same outputs with "last value of
>> consing_until_gc" added to every line.
>>
>> There are some pretty low values in the "read-process-output-max 409600"
>> part of the experiment, which probably means runtime staying in C
>> accumulating the output into the (now larger) buffer? Not sure.
> No, I think this means we really miss some GC opportunities, and we
> cons quite a lot more strings between GC cycles due to that.

Or possibly same number of strings but longer ones?

> I guess
> this happens because we somehow cons many strings in code that doesn't
> call maybe_gc or something.

Yes, staying in some C code that doesn't call maybe_gc for a while.

I think we're describing the same thing, only I was doing that from the 
positive side (less frequent GCs = better performance in this scenario), 
and you from the negative one (less frequent GCs = more chances for an 
OOM to happen in some related but different scenario).





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-11 11:57                                                                 ` Eli Zaretskii
  2023-09-11 23:06                                                                   ` Dmitry Gutov
@ 2023-09-12 14:23                                                                   ` Dmitry Gutov
  2023-09-12 14:26                                                                     ` Dmitry Gutov
  2023-09-12 16:32                                                                     ` Eli Zaretskii
  1 sibling, 2 replies; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-12 14:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 11/09/2023 14:57, Eli Zaretskii wrote:
>> So there is also a second recording for
>> find-directory-files-recursively-2 with read-process-output-max=409600.
>> It does improve the performance significantly (and reduce the number of
>> GC pauses). I guess what I'm still not clear on, is whether the number
>> of GC pauses is fewer because of less consing (the only column that
>> looks significantly different is the 3rd: VECTOR-CELLS), or because the
>> process finishes faster due to larger buffers, which itself causes fewer
>> calls to maybe_gc.
> I think the latter.

It might be both.

To try to analyze how large the per-chunk overhead might be (CPU- and 
GC-wise combined), I first implemented the same function in yet another 
way that doesn't use :filter (so that the default filter is used), but 
still asynchronously, with parsing happening concurrently with the process:

(defun find-directory-files-recursively-5 (dir regexp &optional include-directories _p follow-symlinks)
  (cl-assert (null _p) t "find-directory-files-recursively can't accept arbitrary predicates")
  (with-temp-buffer
    (setq case-fold-search nil)
    (cd dir)
    (let* ((command
            (append
             (list "find" (file-local-name dir))
             (if follow-symlinks
                 '("-L")
               '("!" "(" "-type" "l" "-xtype" "d" ")"))
             (unless (string-empty-p regexp)
               (list "-regex" (concat ".*" regexp ".*")))
             (unless include-directories
               '("!" "-type" "d"))
             '("-print0")))
           (remote (file-remote-p dir))
           (proc
            (if remote
                (let ((proc (apply #'start-file-process
                                   "find" (current-buffer) command)))
                  (set-process-sentinel proc (lambda (_proc _state)))
                  (set-process-query-on-exit-flag proc nil)
                  proc)
              (make-process :name "find" :buffer (current-buffer)
                            :connection-type 'pipe
                            :noquery t
                            :sentinel (lambda (_proc _state))
                            :command command)))
           start ret)
      (setq start (point-min))
      (while (accept-process-output proc)
        (goto-char start)
        (while (search-forward "\0" nil t)
          (push (buffer-substring-no-properties start (1- (point))) ret)
          (setq start (point))))
      ret)))
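For reference, the chunk-by-chunk parsing that the accept-process-output loop above performs can be sketched in C like this (feed_chunk is a hypothetical helper for illustration, not part of Emacs; it assumes buf has room for len + n bytes):

```c
#include <stddef.h>
#include <string.h>

int names_seen;  /* demo callback state: how many names were emitted */

void count_name (const char *name, size_t namelen)
{
  (void) name; (void) namelen;
  names_seen++;
}

/* Append one CHUNK of process output to BUF (which holds LEN bytes of
   still-unparsed data), emit every complete NUL-terminated name via
   EMIT, and keep the unterminated remainder for the next chunk.
   Returns the new number of leftover bytes in BUF.  */
size_t feed_chunk (char *buf, size_t len, const char *chunk, size_t n,
                   void (*emit) (const char *name, size_t namelen))
{
  memcpy (buf + len, chunk, n);
  len += n;
  size_t start = 0;
  for (size_t i = 0; i < len; i++)
    if (buf[i] == '\0')
      {
        emit (buf + start, i - start);        /* one complete file name */
        start = i + 1;
      }
  memmove (buf, buf + start, len - start);    /* keep the partial name */
  return len - start;
}
```

Because chunks arrive at arbitrary boundaries, a name can be split across two reads; keeping the unterminated tail is what `(setq start (point))` accomplishes in the Elisp version.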

This method already improved the performance somewhat (compared to 
find-directory-files-recursively-2), but not too much. So I tried these 
next two steps:

- Dropping most of the setup in read_and_dispose_of_process_output 
(which creates some consing too) and calling 
Finternal_default_process_filter directly (call_filter_directly.diff), 
when it is the filter to be used anyway.

- Going around that function entirely, skipping the creation of a Lisp 
string (CHARS -> TEXT) and inserting into the buffer directly (when the 
filter is set to the default, of course). Copied and adapted some code 
from 'call_process' for that (read_and_insert_process_output.diff).

Neither are intended as complete proposals, but here are some 
comparisons. Note that either of these patches could only help the 
implementations that don't set up process filter (the naive first one, 
and the new parallel number 5 above).

For testing, I used two different repo checkouts that are large enough 
to not finish too quickly: gecko-dev and torvalds-linux.

master

| Function                                         | gecko-dev | linux |
| find-directory-files-recursively                 |      1.69 |  0.41 |
| find-directory-files-recursively-2               |      1.16 |  0.28 |
| find-directory-files-recursively-3               |      0.92 |  0.23 |
| find-directory-files-recursively-5               |      1.07 |  0.26 |
| find-directory-files-recursively (rpom 409600)   |      1.42 |  0.35 |
| find-directory-files-recursively-2 (rpom 409600) |      0.90 |  0.25 |
| find-directory-files-recursively-5 (rpom 409600) |      0.89 |  0.24 |

call_filter_directly.diff (basically, not much difference)

| Function                                         | gecko-dev | linux |
| find-directory-files-recursively                 |      1.64 |  0.38 |
| find-directory-files-recursively-5               |      1.05 |  0.26 |
| find-directory-files-recursively (rpom 409600)   |      1.42 |  0.36 |
| find-directory-files-recursively-5 (rpom 409600) |      0.91 |  0.25 |

read_and_insert_process_output.diff (noticeable differences)

| Function                                         | gecko-dev | linux |
| find-directory-files-recursively                 |      1.30 |  0.34 |
| find-directory-files-recursively-5               |      1.03 |  0.25 |
| find-directory-files-recursively (rpom 409600)   |      1.20 |  0.35 |
| find-directory-files-recursively-5 (rpom 409600) | (!!) 0.72 |  0.21 |

So it seems like we have at least two potential ways to implement an 
asynchronous file listing routine that is as fast as or faster than the 
synchronous one (if only thanks to starting the parsing in parallel).

Combining the last patch with using the very large value of 
read-process-output-max seems to yield the most benefit, but I'm not 
sure whether it's appropriate to just raise that value in our code.

Thoughts?





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-12 14:23                                                                   ` Dmitry Gutov
@ 2023-09-12 14:26                                                                     ` Dmitry Gutov
  2023-09-12 16:32                                                                     ` Eli Zaretskii
  1 sibling, 0 replies; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-12 14:26 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

[-- Attachment #1: Type: text/plain, Size: 327 bytes --]

On 12/09/2023 17:23, Dmitry Gutov wrote:
> Neither are intended as complete proposals, but here are some 
> comparisons. Note that either of these patches could only help the 
> implementations that don't set up process filter (the naive first one, 
> and the new parallel number 5 above).

Sorry, forgot to attach the patches.

[-- Attachment #2: call_filter_directly.diff --]
[-- Type: text/x-patch, Size: 818 bytes --]

diff --git a/src/process.c b/src/process.c
index 08cb810ec13..bdbe8d96064 100644
--- a/src/process.c
+++ b/src/process.c
@@ -6227,7 +6227,15 @@ read_process_output (Lisp_Object proc, int channel)
      friends don't expect current-buffer to be changed from under them.  */
   record_unwind_current_buffer ();
 
-  read_and_dispose_of_process_output (p, chars, nbytes, coding);
+  if (p->filter == Qinternal_default_process_filter)
+    {
+      Lisp_Object text;
+      decode_coding_c_string (coding, (unsigned char *) chars, nbytes, Qt);
+      text = coding->dst_object;
+      Finternal_default_process_filter (proc, text);
+    }
+  else
+    read_and_dispose_of_process_output (p, chars, nbytes, coding);
 
   /* Handling the process output should not deactivate the mark.  */
   Vdeactivate_mark = odeactivate;

[-- Attachment #3: read_and_insert_process_output.diff --]
[-- Type: text/x-patch, Size: 2836 bytes --]

diff --git a/src/process.c b/src/process.c
index 08cb810ec13..5db56692fe1 100644
--- a/src/process.c
+++ b/src/process.c
@@ -6112,6 +6112,11 @@ read_and_dispose_of_process_output (struct Lisp_Process *p, char *chars,
 				    ssize_t nbytes,
 				    struct coding_system *coding);
 
+static void
+read_and_insert_process_output (struct Lisp_Process *p, char *buf,
+				    ssize_t nread,
+				struct coding_system *process_coding);
+
 /* Read pending output from the process channel,
    starting with our buffered-ahead character if we have one.
    Yield number of decoded characters read,
@@ -6227,7 +6232,10 @@ read_process_output (Lisp_Object proc, int channel)
      friends don't expect current-buffer to be changed from under them.  */
   record_unwind_current_buffer ();
 
-  read_and_dispose_of_process_output (p, chars, nbytes, coding);
+  if (p->filter == Qinternal_default_process_filter)
+    read_and_insert_process_output (p, chars, nbytes, coding);
+  else
+    read_and_dispose_of_process_output (p, chars, nbytes, coding);
 
   /* Handling the process output should not deactivate the mark.  */
   Vdeactivate_mark = odeactivate;
@@ -6236,6 +6244,46 @@ read_process_output (Lisp_Object proc, int channel)
   return nbytes;
 }
 
+static void read_and_insert_process_output (struct Lisp_Process *p, char *buf,
+				    ssize_t nread,
+				    struct coding_system *process_coding)
+{
+  if (!nread || NILP (p->buffer) || !BUFFER_LIVE_P (XBUFFER (p->buffer)))
+    ;
+  else if (NILP (BVAR (XBUFFER(p->buffer), enable_multibyte_characters))
+	   && ! CODING_MAY_REQUIRE_DECODING (process_coding))
+    {
+      insert_1_both (buf, nread, nread, 0, 0, 0);
+      signal_after_change (PT - nread, 0, nread);
+    }
+  else
+    {			/* We have to decode the input.  */
+      Lisp_Object curbuf;
+      int carryover = 0;
+      specpdl_ref count1 = SPECPDL_INDEX ();
+
+      XSETBUFFER (curbuf, current_buffer);
+      /* We cannot allow after-change-functions be run
+	 during decoding, because that might modify the
+	 buffer, while we rely on process_coding.produced to
+	 faithfully reflect inserted text until we
+	 TEMP_SET_PT_BOTH below.  */
+      specbind (Qinhibit_modification_hooks, Qt);
+      decode_coding_c_string (process_coding,
+			      (unsigned char *) buf, nread, curbuf);
+      unbind_to (count1, Qnil);
+
+      TEMP_SET_PT_BOTH (PT + process_coding->produced_char,
+			PT_BYTE + process_coding->produced);
+      signal_after_change (PT - process_coding->produced_char,
+			   0, process_coding->produced_char);
+      carryover = process_coding->carryover_bytes;
+      if (carryover > 0)
+	memcpy (buf, process_coding->carryover,
+		process_coding->carryover_bytes);
+    }
+}
+
 static void
 read_and_dispose_of_process_output (struct Lisp_Process *p, char *chars,
 				    ssize_t nbytes,

^ permalink raw reply related	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-12 14:23                                                                   ` Dmitry Gutov
  2023-09-12 14:26                                                                     ` Dmitry Gutov
@ 2023-09-12 16:32                                                                     ` Eli Zaretskii
  2023-09-12 18:48                                                                       ` Dmitry Gutov
  1 sibling, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-12 16:32 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Tue, 12 Sep 2023 17:23:53 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> - Dropping most of the setup in read_and_dispose_of_process_output 
> (which creates some consing too) and calling 
> Finternal_default_process_filter directly (call_filter_directly.diff), 
> when it is the filter to be used anyway.
> 
> - Going around that function entirely, skipping the creation of a Lisp 
> string (CHARS -> TEXT) and inserting into the buffer directly (when the 
> filter is set to the default, of course). Copied and adapted some code 
> from 'call_process' for that (read_and_insert_process_output.diff).
> 
> Neither are intended as complete proposals, but here are some 
> comparisons. Note that either of these patches could only help the 
> implementations that don't set up process filter (the naive first one, 
> and the new parallel number 5 above).
> 
> For testing, I used two different repo checkouts that are large enough 
> to not finish too quickly: gecko-dev and torvalds-linux.
> 
> master
> 
> | Function                                         | gecko-dev | linux |
> | find-directory-files-recursively                 |      1.69 |  0.41 |
> | find-directory-files-recursively-2               |      1.16 |  0.28 |
> | find-directory-files-recursively-3               |      0.92 |  0.23 |
> | find-directory-files-recursively-5               |      1.07 |  0.26 |
> | find-directory-files-recursively (rpom 409600)   |      1.42 |  0.35 |
> | find-directory-files-recursively-2 (rpom 409600) |      0.90 |  0.25 |
> | find-directory-files-recursively-5 (rpom 409600) |      0.89 |  0.24 |
> 
> call_filter_directly.diff (basically, not much difference)
> 
> | Function                                         | gecko-dev | linux |
> | find-directory-files-recursively                 |      1.64 |  0.38 |
> | find-directory-files-recursively-5               |      1.05 |  0.26 |
> | find-directory-files-recursively (rpom 409600)   |      1.42 |  0.36 |
> | find-directory-files-recursively-5 (rpom 409600) |      0.91 |  0.25 |
> 
> read_and_insert_process_output.diff (noticeable differences)
> 
> | Function                                         | gecko-dev | linux |
> | find-directory-files-recursively                 |      1.30 |  0.34 |
> | find-directory-files-recursively-5               |      1.03 |  0.25 |
> | find-directory-files-recursively (rpom 409600)   |      1.20 |  0.35 |
> | find-directory-files-recursively-5 (rpom 409600) | (!!) 0.72 |  0.21 |
> 
> So it seems like we have at least two potential ways to implement an 
> asynchronous file listing routine that is as fast or faster than the 
> synchronous one (if only thanks to starting the parsing in parallel).
> 
> Combining the last patch together with using the very large value of 
> read-process-output-max seems to yield the most benefit, but I'm not 
> sure if it's appropriate to just raise that value in our code, though.
> 
> Thoughts?

I'm not sure what exactly is here to think about.  Removing portions
of read_and_insert_process_output, or bypassing it entirely, is not
going to fly, because AFAIU it basically means we don't decode text,
which can only work with plain ASCII file names, and/or don't move the
markers in the process buffer, which also cannot be avoided.  If you
want to conclude that inserting the process's output into a buffer
without consing Lisp strings is faster (which I'm not sure, see below,
but it could be true), then we could try extending
internal-default-process-filter (or writing a new filter function
similar to it) so that it inserts the stuff into the gap and then uses
decode_coding_gap, which converts inserted bytes in-place -- that, at
least, will be correct and will avoid consing intermediate temporary
strings from the process output, then decoding them, then inserting
them.  Other than that, the -2 and -3 variants are very close
runners-up of -5, so maybe I'm missing something, but I see no reason
to be too excited here?  I mean, 0.89 vs 0.92? really?

About inserting into the buffer: what we do is insert into the gap,
and when the gap becomes full, we enlarge it.  Enlarging the gap
involves: (a) enlarging the chunk of memory allocated to buffer text
(which might mean we ask the OS for more memory), and (b) moving the
characters after the gap to the right to free space for inserting more
stuff.  This is pretty fast, but still, with a large pipe buffer and a
lot of output, we do this many times, so it could add up to something
pretty tangible.  It's hard for me to tell whether this is
significantly faster than consing strings and inserting them, only
measurements can tell.
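The gap mechanics described above can be sketched as follows (a toy gap buffer, drastically simplified from Emacs's real buffer code, for illustration only):

```c
#include <stdlib.h>
#include <string.h>

/* Toy gap buffer: the text lives in one allocation with a movable
   "gap"; inserting fills the gap, and when the gap runs out we
   (a) enlarge the allocation and (b) shift the tail right -- the two
   costs described above.  */
struct gapbuf
{
  char *text;
  size_t gap_start, gap_end, size;  /* gap occupies [gap_start, gap_end) */
};

void gb_init (struct gapbuf *b, size_t cap)
{
  b->text = malloc (cap);
  b->gap_start = 0;
  b->gap_end = cap;
  b->size = cap;
}

void gb_insert (struct gapbuf *b, const char *s, size_t n)
{
  if (b->gap_end - b->gap_start < n)
    {
      size_t tail = b->size - b->gap_end;     /* text after the gap */
      size_t newsize = b->size * 2 + n;
      b->text = realloc (b->text, newsize);   /* (a) ask for more memory */
      memmove (b->text + newsize - tail,      /* (b) move the tail right */
               b->text + b->gap_end, tail);
      b->gap_end = newsize - tail;
      b->size = newsize;
    }
  memcpy (b->text + b->gap_start, s, n);      /* fill the gap */
  b->gap_start += n;
}
```

Each enlargement is fast in isolation, but with a large pipe buffer and a lot of output the realloc-plus-memmove pair can run many times, which is the overhead being weighed against consing strings.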





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-12 16:32                                                                     ` Eli Zaretskii
@ 2023-09-12 18:48                                                                       ` Dmitry Gutov
  2023-09-12 19:35                                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-12 18:48 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 12/09/2023 19:32, Eli Zaretskii wrote:
>> Date: Tue, 12 Sep 2023 17:23:53 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>> - Dropping most of the setup in read_and_dispose_of_process_output
>> (which creates some consing too) and calling
>> Finternal_default_process_filter directly (call_filter_directly.diff),
>> when it is the filter to be used anyway.
>>
>> - Going around that function entirely, skipping the creation of a Lisp
>> string (CHARS -> TEXT) and inserting into the buffer directly (when the
>> filter is set to the default, of course). Copied and adapted some code
>> from 'call_process' for that (read_and_insert_process_output.diff).
>>
>> Neither are intended as complete proposals, but here are some
>> comparisons. Note that either of these patches could only help the
>> implementations that don't set up process filter (the naive first one,
>> and the new parallel number 5 above).
>>
>> For testing, I used two different repo checkouts that are large enough
>> to not finish too quickly: gecko-dev and torvalds-linux.
>>
>> master
>>
>> | Function                                         | gecko-dev | linux |
>> | find-directory-files-recursively                 |      1.69 |  0.41 |
>> | find-directory-files-recursively-2               |      1.16 |  0.28 |
>> | find-directory-files-recursively-3               |      0.92 |  0.23 |
>> | find-directory-files-recursively-5               |      1.07 |  0.26 |
>> | find-directory-files-recursively (rpom 409600)   |      1.42 |  0.35 |
>> | find-directory-files-recursively-2 (rpom 409600) |      0.90 |  0.25 |
>> | find-directory-files-recursively-5 (rpom 409600) |      0.89 |  0.24 |
>>
>> call_filter_directly.diff (basically, not much difference)
>>
>> | Function                                         | gecko-dev | linux |
>> | find-directory-files-recursively                 |      1.64 |  0.38 |
>> | find-directory-files-recursively-5               |      1.05 |  0.26 |
>> | find-directory-files-recursively (rpom 409600)   |      1.42 |  0.36 |
>> | find-directory-files-recursively-5 (rpom 409600) |      0.91 |  0.25 |
>>
>> read_and_insert_process_output.diff (noticeable differences)
>>
>> | Function                                         | gecko-dev | linux |
>> | find-directory-files-recursively                 |      1.30 |  0.34 |
>> | find-directory-files-recursively-5               |      1.03 |  0.25 |
>> | find-directory-files-recursively (rpom 409600)   |      1.20 |  0.35 |
>> | find-directory-files-recursively-5 (rpom 409600) | (!!) 0.72 |  0.21 |
>>
>> So it seems like we have at least two potential ways to implement an
>> asynchronous file listing routine that is as fast or faster than the
>> synchronous one (if only thanks to starting the parsing in parallel).
>>
>> Combining the last patch together with using the very large value of
>> read-process-output-max seems to yield the most benefit, but I'm not
>> sure if it's appropriate to just raise that value in our code, though.
>>
>> Thoughts?
> 
> I'm not sure what exactly is here to think about.  Removing portions
> of read_and_insert_process_output, or bypassing it entirely, is not
> going to fly, because AFAIU it basically means we don't decode text,
> which can only work with plain ASCII file names, and/or don't move the
> markers in the process buffer, which also cannot be avoided.

That one was really a test to see whether the extra handling added any 
meaningful consing to affect GC. Removing it didn't make a difference 
(see table 2), so no.

> If you
> want to conclude that inserting the process's output into a buffer
> without consing Lisp strings is faster (which I'm not sure, see below,
> but it could be true),

That's what my tests seem to show, see table 3 (the last one).

> then we could try extending
> internal-default-process-filter (or writing a new filter function
> similar to it) so that it inserts the stuff into the gap and then uses
> decode_coding_gap,

Can that work at all? By the time internal-default-process-filter is 
called, we have already turned the string from char* into Lisp_Object 
text, which we then pass to it. So consing has already happened, IIUC.

> which converts inserted bytes in-place -- that, at
> least, will be correct and will avoid consing intermediate temporary
> strings from the process output, then decoding them, then inserting
> them.  Other than that, the -2 and -3 variants are very close
> runners-up of -5, so maybe I'm missing something, but I see no reason
> be too excited here?  I mean, 0.89 vs 0.92? really?

The important part is not 0.89 vs 0.92 (that would be meaningless 
indeed), but that we have an _asynchronous_ implementation of the feature 
that works as fast as the existing synchronous one (or faster! if we 
also bind read-process-output-max to a large value, the time is 0.72).

The possible applications for that range from simple (printing a 
progress bar while the scan is happening) to more advanced (launching a 
concurrent process where we pipe the received file names concurrently to 
'xargs grep'), including visuals (an xref buffer that shows the 
intermediate search results right away, updating them gradually, all 
without blocking the UI).

> About inserting into the buffer: what we do is insert into the gap,
> and when the gap becomes full, we enlarge it.  Enlarging the gap
> involves: (a) enlarging the chunk of memory allocated to buffer text
> (which might mean we ask the OS for more memory), and (b) moving the
> characters after the gap to the right to free space for inserting more
> stuff.  This is pretty fast, but still, with a large pipe buffer and a
> lot of output, we do this many times, so it could add up to something
> pretty tangible.  It's hard to me to tell whether this is
> significantly faster than consing strings and inserting them, only
> measurements can tell.

See the benchmark tables and the POC patch in my previous email. Using a 
better filter function would be ideal, but it seems like that's not 
going to fit the current design. Happy to be proven wrong, though.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-12 18:48                                                                       ` Dmitry Gutov
@ 2023-09-12 19:35                                                                         ` Eli Zaretskii
  2023-09-12 20:27                                                                           ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-12 19:35 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Tue, 12 Sep 2023 21:48:37 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> > then we could try extending
> > internal-default-process-filter (or writing a new filter function
> > similar to it) so that it inserts the stuff into the gap and then uses
> > decode_coding_gap,
> 
> Can that work at all? By the time internal-default-process-filter is 
> called, we have already turned the string from char* into Lisp_Object 
> text, which we then pass to it. So consing has already happened, IIUC.

That's why I said "or writing a new filter function".
read_and_dispose_of_process_output will have to call this new filter
differently, passing it the raw text read from the subprocess, where
read_and_dispose_of_process_output current first decodes the text and
produces a Lisp string from it.  Then the filter would need to do
something similar to what insert-file-contents does: insert the raw
input into the gap, then call decode_coding_gap to decode that
in-place.
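A minimal sketch of that insert-then-decode-in-place flow, with a stand-in "decoding" step (the real code would use decode_coding_gap; the helper name and the CRLF->LF conversion here are illustrative assumptions, not Emacs API):

```c
#include <stddef.h>
#include <string.h>

/* Append NREAD raw process bytes at the end of TEXT (standing in for
   "insert into the gap"), then decode them in place -- here CRLF -> LF
   as a trivial stand-in for charset decoding -- so no intermediate
   Lisp string is ever consed.  Returns the new total length of TEXT.  */
size_t insert_and_decode (char *text, size_t len,
                          const char *raw, size_t nread)
{
  memcpy (text + len, raw, nread);   /* step 1: insert the raw bytes */
  size_t w = len;                    /* step 2: decode in place */
  for (size_t r = 0; r < nread; r++)
    if (raw[r] != '\r')
      text[w++] = raw[r];
  return w;
}
```

The point of the two-step shape is that decoding shrinks (or rewrites) the bytes where they already sit, the way insert-file-contents handles file bytes.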

> > which converts inserted bytes in-place -- that, at
> > least, will be correct and will avoid consing intermediate temporary
> > strings from the process output, then decoding them, then inserting
> > them.  Other than that, the -2 and -3 variants are very close
> > runners-up of -5, so maybe I'm missing something, but I see no reason
> > be too excited here?  I mean, 0.89 vs 0.92? really?
> 
> The important part is not 0.89 vs 0.92 (that would be meaningless 
> indeed), but that we have an _asyncronous_ implementation of the feature 
> that works as fast as the existing synchronous one (or faster! if we 
> also bind read-process-output-max to a large value, the time is 0.72).
> 
> The possible applications for that range from simple (printing progress 
> bar while the scan is happening) to more advanced (launching a 
> concurrent process where we pipe the received file names concurrently to 
> 'xargs grep'), including visuals (xref buffer which shows the 
> intermediate search results right away, updating them gradually, all 
> without blocking the UI).

Hold your horses.  Emacs only reads output from sub-processes when
it's idle.  So printing a progress bar (which makes Emacs not idle)
with the asynchronous implementation is basically the same as having
the synchronous implementation call some callback from time to time
(which will then show the progress).

As for piping to another process, this is best handled by using a
shell pipe, without passing stuff through Emacs.  And even if you do
need to pass it through Emacs, you could do the same with the
synchronous implementation -- only the "xargs" part needs to be
asynchronous, the part that reads file names does not.  Right?

Please note: I'm not saying that the asynchronous implementation is
not interesting.  It might even have advantages in some specific use
cases.  So it is good to have it.  It just isn't a breakthrough,
that's all.  And if we want to use it in production, we should
probably work on adding that special default filter which inserts and
decodes directly into the buffer, because that will probably lower the
GC pressure and thus has hope of being faster.  Or even replace the
default filter implementation with that new one.

> > About inserting into the buffer: what we do is insert into the gap,
> > and when the gap becomes full, we enlarge it.  Enlarging the gap
> > involves: (a) enlarging the chunk of memory allocated to buffer text
> > (which might mean we ask the OS for more memory), and (b) moving the
> > characters after the gap to the right to free space for inserting more
> > stuff.  This is pretty fast, but still, with a large pipe buffer and a
> > lot of output, we do this many times, so it could add up to something
> > pretty tangible.  It's hard to me to tell whether this is
> > significantly faster than consing strings and inserting them, only
> > measurements can tell.
> 
> See the benchmark tables and the POC patch in my previous email. Using a 
> better filter function would be ideal, but it seems like that's not 
> going to fit the current design. Happy to be proven wrong, though.

I see no reason why reading subprocess output couldn't use the same
technique as insert-file-contents does.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-12 19:35                                                                         ` Eli Zaretskii
@ 2023-09-12 20:27                                                                           ` Dmitry Gutov
  2023-09-13 11:38                                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-12 20:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 12/09/2023 22:35, Eli Zaretskii wrote:
>> Date: Tue, 12 Sep 2023 21:48:37 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>>> then we could try extending
>>> internal-default-process-filter (or writing a new filter function
>>> similar to it) so that it inserts the stuff into the gap and then uses
>>> decode_coding_gap,
>>
>> Can that work at all? By the time internal-default-process-filter is
>> called, we have already turned the string from char* into Lisp_Object
>> text, which we then pass to it. So consing has already happened, IIUC.
> 
> That's why I said "or writing a new filter function".
> read_and_dispose_of_process_output will have to call this new filter
> differently, passing it the raw text read from the subprocess, where
> read_and_dispose_of_process_output current first decodes the text and
> produces a Lisp string from it.  Then the filter would need to do
> something similar to what insert-file-contents does: insert the raw
> input into the gap, then call decode_coding_gap to decode that
> in-place.

Does the patch from my last patch-bearing email look similar enough to 
what you're describing?

The one called read_and_insert_process_output.diff

The result there, though, is that a "filter" (in the sense that 
make-process uses that term) is not used at all.

>>> which converts inserted bytes in-place -- that, at
>>> least, will be correct and will avoid consing intermediate temporary
>>> strings from the process output, then decoding them, then inserting
>>> them.  Other than that, the -2 and -3 variants are very close
>>> runners-up of -5, so maybe I'm missing something, but I see no reason
>>> be too excited here?  I mean, 0.89 vs 0.92? really?
>>
>> The important part is not 0.89 vs 0.92 (that would be meaningless
>> indeed), but that we have an _asyncronous_ implementation of the feature
>> that works as fast as the existing synchronous one (or faster! if we
>> also bind read-process-output-max to a large value, the time is 0.72).
>>
>> The possible applications for that range from simple (printing progress
>> bar while the scan is happening) to more advanced (launching a
>> concurrent process where we pipe the received file names concurrently to
>> 'xargs grep'), including visuals (xref buffer which shows the
>> intermediate search results right away, updating them gradually, all
>> without blocking the UI).
> 
> Hold your horses.  Emacs only reads output from sub-processes when
> it's idle.  So printing a progress bar (which makes Emacs not idle)
> with the asynchronous implementation is basically the same as having
> the synchronous implementation call some callback from time to time
> (which will then show the progress).

Obviously there is more work to be done, including further design and 
benchmarking. But unlike before, at least the starting performance 
(before further features are added) is not worse.

Note that variant -5 is somewhat limited since it doesn't use a 
filter: no callbacks are issued while the output is arriving, so if 
it's taken as the base, any refreshes would have to be initiated from 
somewhere else, e.g. from a timer.
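
To illustrate the timer-based alternative, here is a minimal sketch 
(the function name is made up for illustration; nothing like this 
exists in Emacs) of polling the process buffer for newly arrived 
complete lines:

```elisp
;; Illustrative sketch only: poll PROC's buffer on a timer and hand
;; each newly arrived complete line to CALLBACK.  The timer should be
;; cancelled from the process sentinel once PROC exits.
(defun my/watch-process-lines (proc callback)
  (with-current-buffer (process-buffer proc)
    (let ((mark (copy-marker (point-min))))
      (run-with-timer
       0.1 0.1
       (lambda ()
         (when (buffer-live-p (process-buffer proc))
           (with-current-buffer (process-buffer proc)
             (save-excursion
               (goto-char mark)
               ;; Consume only complete lines; partial output stays
               ;; in the buffer until the next tick.
               (while (search-forward "\n" nil t)
                 (funcall callback
                          (buffer-substring mark (1- (point))))
                 (set-marker mark (point)))))))))))
```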

> As for piping to another process, this is best handled by using a
> shell pipe, without passing stuff through Emacs.  And even if you do
> need to pass it through Emacs, you could do the same with the
> synchronous implementation -- only the "xargs" part needs to be
> asynchronous, the part that reads file names does not.  Right?

Yes and no: if both steps are asynchronous, the final output window 
could be displayed right away, rather than waiting for the first step 
(or both) to be finished. That can be a meaningful improvement for some 
users (and is still an upside of 'M-x rgrep').

> Please note: I'm not saying that the asynchronous implementation is
> not interesting.  It might even have advantages in some specific use
> cases.  So it is good to have it.  It just isn't a breakthrough,
> that's all.

Not a breakthrough, of course, just a lower-level insight (hopefully).

I do think it would be meaningful to reduce the runtime of a real-life 
program (which includes other work) by 10-20% solely by reducing GC 
pressure in a generic facility like process output handling.

> And if we want to use it in production, we should
> probably work on adding that special default filter which inserts and
> decodes directly into the buffer, because that will probably lower the
> GC pressure and thus has hope of being faster.  Or even replace the
> default filter implementation with that new one.

But a filter must be a Lisp function, which can't help but accept only 
Lisp strings (not C strings) as its argument. Isn't that right?
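
For context, the current filter contract looks like this (a minimal, 
self-contained example using the existing API):

```elisp
;; A process filter is a Lisp function of (PROC STRING); by the time
;; it runs, the raw subprocess output has already been decoded and
;; consed into the Lisp string STRING.
(make-process
 :name "echo-demo"
 :buffer (generate-new-buffer "*echo-demo*")
 :command '("echo" "hello")
 :filter (lambda (proc string)      ; STRING is already a Lisp string
           (with-current-buffer (process-buffer proc)
             (insert string))))
```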






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-12 20:27                                                                           ` Dmitry Gutov
@ 2023-09-13 11:38                                                                             ` Eli Zaretskii
  2023-09-13 14:27                                                                               ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-13 11:38 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Tue, 12 Sep 2023 23:27:49 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> > That's why I said "or writing a new filter function".
> > read_and_dispose_of_process_output will have to call this new filter
> > differently, passing it the raw text read from the subprocess, where
> > read_and_dispose_of_process_output current first decodes the text and
> > produces a Lisp string from it.  Then the filter would need to do
> > something similar to what insert-file-contents does: insert the raw
> > input into the gap, then call decode_coding_gap to decode that
> > in-place.
> 
> Does the patch from my last patch-bearing email look similar enough to 
> what you're describing?
> 
> The one called read_and_insert_process_output.diff

No, not entirely: it still produces a Lisp string when decoding is
needed, and then inserts that string into the buffer.

Did you look at what insert-file-contents does?  If not, I suggest
having a look, starting from this comment:

  /* Here, we don't do code conversion in the loop.  It is done by
     decode_coding_gap after all data are read into the buffer.  */

and ending here:

  if (CODING_MAY_REQUIRE_DECODING (&coding)
      && (inserted > 0 || CODING_REQUIRE_FLUSHING (&coding)))
    {
      /* Now we have all the new bytes at the beginning of the gap,
         but `decode_coding_gap` can't have them at the beginning of the gap,
         so we need to move them.  */
      memmove (GAP_END_ADDR - inserted, GPT_ADDR, inserted);
      decode_coding_gap (&coding, inserted);
      inserted = coding.produced_char;
      coding_system = CODING_ID_NAME (coding.id);
    }
  else if (inserted > 0)
    {
      /* Make the text read part of the buffer.  */
      eassert (NILP (BVAR (current_buffer, enable_multibyte_characters)));
      insert_from_gap_1 (inserted, inserted, false);

      invalidate_buffer_caches (current_buffer, PT, PT + inserted);
      adjust_after_insert (PT, PT_BYTE, PT + inserted, PT_BYTE + inserted,
			   inserted);
    }

> The result there, though, is that a "filter" (in the sense that 
> make-process uses that term) is not used at all.

Sure, but in this case we don't need any filtering.  It's basically
the same idea as internal-default-process-filter: we just need to
insert the process output into a buffer, and optionally decode it.

> > And if we want to use it in production, we should
> > probably work on adding that special default filter which inserts and
> > decodes directly into the buffer, because that will probably lower the
> > GC pressure and thus has hope of being faster.  Or even replace the
> > default filter implementation with that new one.
> 
> But a filter must be a Lisp function, which can't help but accept only 
> Lisp strings (not C string) as argument. Isn't that right?

We can provide a special filter identified by a symbol.  Such a filter
will not be Lisp-callable, it will exist for the cases where we need
to insert the output into the process buffer.  Any Lisp callback could
then access the process output as the text of that buffer, no Lisp
strings needed.  I thought this was a worthy goal; apologies if I
misunderstood.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-13 11:38                                                                             ` Eli Zaretskii
@ 2023-09-13 14:27                                                                               ` Dmitry Gutov
  2023-09-13 15:07                                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-13 14:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 13/09/2023 14:38, Eli Zaretskii wrote:
>> Date: Tue, 12 Sep 2023 23:27:49 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>>> That's why I said "or writing a new filter function".
>>> read_and_dispose_of_process_output will have to call this new filter
>>> differently, passing it the raw text read from the subprocess, where
>>> read_and_dispose_of_process_output current first decodes the text and
>>> produces a Lisp string from it.  Then the filter would need to do
>>> something similar to what insert-file-contents does: insert the raw
>>> input into the gap, then call decode_coding_gap to decode that
>>> in-place.
>>
>> Does the patch from my last patch-bearing email look similar enough to
>> what you're describing?
>>
>> The one called read_and_insert_process_output.diff
> 
> No, not entirely: it still produces a Lisp string when decoding is
> needed, and then inserts that string into the buffer.

Are you sure? IIUC, the fact that it passes 'curbuf' as the last 
argument to 'decode_coding_c_string' means that decoding happens inside 
the buffer. That has been my explanation for the performance 
improvement, anyway.

If it still generated a Lisp string, I think that would mean we could 
keep the general shape of internal-default-process-filter and just 
improve its implementation for the same measured benefit.

> Did you look at what insert-file-contents does?  If not I suggest to
> have a look, starting from this comment:
> 
>    /* Here, we don't do code conversion in the loop.  It is done by
>       decode_coding_gap after all data are read into the buffer.  */
> 
> and ending here:
> 
>    if (CODING_MAY_REQUIRE_DECODING (&coding)
>        && (inserted > 0 || CODING_REQUIRE_FLUSHING (&coding)))
>      {
>        /* Now we have all the new bytes at the beginning of the gap,
>           but `decode_coding_gap` can't have them at the beginning of the gap,
>           so we need to move them.  */
>        memmove (GAP_END_ADDR - inserted, GPT_ADDR, inserted);
>        decode_coding_gap (&coding, inserted);
>        inserted = coding.produced_char;
>        coding_system = CODING_ID_NAME (coding.id);
>      }
>    else if (inserted > 0)
>      {
>        /* Make the text read part of the buffer.  */
>        eassert (NILP (BVAR (current_buffer, enable_multibyte_characters)));
>        insert_from_gap_1 (inserted, inserted, false);
> 
>        invalidate_buffer_caches (current_buffer, PT, PT + inserted);
>        adjust_after_insert (PT, PT_BYTE, PT + inserted, PT_BYTE + inserted,
> 			   inserted);
>      }

That does look different. I'm not sure how long it would take me to 
adapt this code (if you have an alternative patch to suggest right away, 
please go ahead), but if this method turns out to be faster, it sounds 
like we could improve the performance of 'call_process' the same way. 
That would be a win-win.

>> The result there, though, is that a "filter" (in the sense that
>> make-process uses that term) is not used at all.
> 
> Sure, but in this case we don't need any filtering.  It's basically
> the same idea as internal-default-process-filter: we just need to
> insert the process output into a buffer, and optionally decode it.

Pretty much. But that raises the question of what to do with the 
existing function internal-default-process-filter.

Looking around, it doesn't seem to be used with advice (a good thing: 
the proposed change would break that), but it is called directly in some 
packages like magit-blame, org-assistant, with-editor, wisi, sweeprolog, 
etc. I suppose we'd just keep it around unchanged.

>>> And if we want to use it in production, we should
>>> probably work on adding that special default filter which inserts and
>>> decodes directly into the buffer, because that will probably lower the
>>> GC pressure and thus has hope of being faster.  Or even replace the
>>> default filter implementation with that new one.
>>
>> But a filter must be a Lisp function, which can't help but accept only
>> Lisp strings (not C string) as argument. Isn't that right?
> 
> We can provide a special filter identified by a symbol.  Such a filter
> will not be Lisp-callable, it will exist for the cases where we need
> to insert the output into the process buffer.

That would be the safest alternative. OTOH, this way we'd pass up the 
opportunity to make all existing asynchronous processes without custom 
filters a little bit faster in one fell swoop.

> Any Lisp callback could
> then access the process output as the text of that buffer, no Lisp
> strings needed.  I thought this was a worthy goal; apologies if I
> misunderstood.

Sorry, I was just quibbling about the terminology, to make sure we are 
on the same page about what is being proposed (if the patch and 
evidence look good to people, that is). And I'd like to explore that 
avenue of improvement to the max.

But note that it has limitations as well (e.g. a filter is the only way 
to get in-process callbacks from the process, and avoiding it for best 
performance will require external callbacks such as timers), so if 
someone has better ideas for improving GC time to a comparable extent 
while keeping the design unchanged, those are also welcome.

Should we also discuss increasing the default of 
read-process-output-max? Even increasing it 10x (not necessarily 100x) 
creates a noticeable difference, especially combined with the proposed 
change.
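
For instance, the bump could be scoped to a single process rather than 
applied to the global default. A sketch using the existing variable 
(the 10x figure is just the one discussed above):

```elisp
;; read-process-output-max is consulted when the process is created
;; (on GNU/Linux it also sizes the pipe), so bind it around the
;; make-process call rather than setting it globally.
(let ((read-process-output-max (* 10 4096)))  ; 10x the 4096-byte default
  (make-process :name "find"
                :buffer (generate-new-buffer " *find*")
                :command '("find" "." "-type" "f" "-print0")))
```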






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-13 14:27                                                                               ` Dmitry Gutov
@ 2023-09-13 15:07                                                                                 ` Eli Zaretskii
  2023-09-13 17:27                                                                                   ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-13 15:07 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Wed, 13 Sep 2023 17:27:49 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> >> Does the patch from my last patch-bearing email look similar enough to
> >> what you're describing?
> >>
> >> The one called read_and_insert_process_output.diff
> > 
> > No, not entirely: it still produces a Lisp string when decoding is
> > needed, and then inserts that string into the buffer.
> 
> Are you sure? IIUC the fact that is passes 'curbuf' as the last argument 
> to 'decode_coding_c_string' means that decoding happens inside the 
> buffer. This has been my explanation for the performance improvement anyway.

Yes, you are right, sorry.

> > Sure, but in this case we don't need any filtering.  It's basically
> > the same idea as internal-default-process-filter: we just need to
> > insert the process output into a buffer, and optionally decode it.
> 
> Pretty much. But that raises the question of what to do with the 
> existing function internal-default-process-filter.

Nothing.  It will remain as the default filter.

> Looking around, it doesn't seem to be used with advice (a good thing: 
> the proposed change would break that), but it is called directly in some 
> packages like magit-blame, org-assistant, with-editor, wisi, sweeprolog, 
> etc. I suppose we'd just keep it around unchanged.

Yes.

> > We can provide a special filter identified by a symbol.  Such a filter
> > will not be Lisp-callable, it will exist for the cases where we need
> > to insert the output into the process buffer.
> 
> The would be the safest alternative. OTOH, this way we'd pass up on the 
> opportunity to make all existing asynchronous processes without custom 
> filters, a little bit faster in one fell swoop.

We could change the ones we care about, though.

> Should we also discuss increasing the default of 
> read-process-output-max? Even increasing it 10x (not necessarily 100x) 
> creates a noticeable difference, especially combined with the proposed 
> change.

That should be limited to specific cases where we expect to see a lot
of stuff coming from the subprocess.  We could also discuss changing
the default value, but that would require measurements in as many
cases as we can afford.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-13 15:07                                                                                 ` Eli Zaretskii
@ 2023-09-13 17:27                                                                                   ` Dmitry Gutov
  2023-09-13 19:32                                                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-13 17:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 13/09/2023 18:07, Eli Zaretskii wrote:
>> Date: Wed, 13 Sep 2023 17:27:49 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>>>> Does the patch from my last patch-bearing email look similar enough to
>>>> what you're describing?
>>>>
>>>> The one called read_and_insert_process_output.diff
>>>
>>> No, not entirely: it still produces a Lisp string when decoding is
>>> needed, and then inserts that string into the buffer.
>>
>> Are you sure? IIUC the fact that is passes 'curbuf' as the last argument
>> to 'decode_coding_c_string' means that decoding happens inside the
>> buffer. This has been my explanation for the performance improvement anyway.
> 
> Yes, you are right, sorry.

So we're not going to try the gap-based approach? Okay.

>>> Sure, but in this case we don't need any filtering.  It's basically
>>> the same idea as internal-default-process-filter: we just need to
>>> insert the process output into a buffer, and optionally decode it.
>>
>> Pretty much. But that raises the question of what to do with the
>> existing function internal-default-process-filter.
> 
> Nothing.  It will remain as the default filter.

Okay, if you are sure.

>>> We can provide a special filter identified by a symbol.  Such a filter
>>> will not be Lisp-callable, it will exist for the cases where we need
>>> to insert the output into the process buffer.
>>
>> The would be the safest alternative. OTOH, this way we'd pass up on the
>> opportunity to make all existing asynchronous processes without custom
>> filters, a little bit faster in one fell swoop.
> 
> We could change the ones we care about, though.

Which ones do we care about? I've found a bunch of 'make-process' calls 
without :filter specified (flymake backends, etc.). Do we upgrade them all?

The difference is likely not critical in most of them, but the change 
would likely result in a small reduction of GC pressure in the 
corresponding Emacs sessions.

We'll also need to version-guard the ones that are in ELPA.

We don't touch the implementations of functions like start-file-process, 
right?

What about the callers of functions like 
start-file-process-shell-command who want to take advantage of the 
improvement? Are we okay with them all having to call 
(set-process-filter proc 'buffer) on the returned process value?
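
In other words, something like this hypothetical usage (the 'buffer 
filter value is the proposal under discussion and does not exist yet):

```elisp
(let ((proc (start-file-process-shell-command
             "files" (generate-new-buffer " *files*")
             "find . -type f -print0")))
  ;; 'buffer is the proposed special value, not a function:
  (set-process-filter proc 'buffer)
  proc)
```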

>> Should we also discuss increasing the default of
>> read-process-output-max? Even increasing it 10x (not necessarily 100x)
>> creates a noticeable difference, especially combined with the proposed
>> change.
> 
> That should be limited to specific cases where we expect to see a lot
> of stuff coming from the subprocess.

So it would be okay to bump it in particular functions? Okay.

> We could also discuss changing
> the default value, but that would require measurements in as many
> cases as we can afford.

If you have some particular scenarios in mind, and know what to look 
out for, I could test them out, at least on one platform.

I'm not sure what negatives to test for, though. Raising the limit 10x 
is unlikely to lead to an OOM, but I guess some processes could end up 
with higher latency?..






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-13 17:27                                                                                   ` Dmitry Gutov
@ 2023-09-13 19:32                                                                                     ` Eli Zaretskii
  2023-09-13 20:38                                                                                       ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-13 19:32 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Wed, 13 Sep 2023 20:27:09 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> >> Are you sure? IIUC the fact that is passes 'curbuf' as the last argument
> >> to 'decode_coding_c_string' means that decoding happens inside the
> >> buffer. This has been my explanation for the performance improvement anyway.
> > 
> > Yes, you are right, sorry.
> 
> So we're not going to try the gap-based approach? Okay.

decode_coding_c_string does that internally.

> >> The would be the safest alternative. OTOH, this way we'd pass up on the
> >> opportunity to make all existing asynchronous processes without custom
> >> filters, a little bit faster in one fell swoop.
> > 
> > We could change the ones we care about, though.
> 
> Which ones do we care about? I've found a bunch of 'make-process' calls 
> without :filter specified (flymake backends, ). Do we upgrade them all?
> 
> The difference is likely not critical in most of them, but the change 
> would likely result in small reduction of GC pressure in the 
> corresponding Emacs sessions.
> 
> We'll also need to version-guard the ones that are in ELPA.
> 
> We don't touch the implementations of functions like start-file-process, 
> right?
> 
> What about the callers of functions like 
> start-file-process-shell-command who want to take advantage of the 
> improvement? Are we okay with them all having to call 
> (set-process-filter proc 'buffer) on the returned process value?

I think these questions are slightly premature.  We should first have
the implementation of that filter, and then look for candidates that
could benefit from it.  My tendency is to change only callers which
are in many cases expected to get a lot of stuff from a subprocess, so
shell buffers are probably out.  But we could discuss that later.

> > We could also discuss changing
> > the default value, but that would require measurements in as many
> > cases as we can afford.
> 
> If you have some particular scenarios in mind, and what to look out for, 
> I could test them out at least on one platform.

Didn't think about that enough to have scenarios.

> I'm not sure what negatives to test for, though. Raising the limit 10x 
> is unlikely to lead to an OOM, but I guess some processes could grow 
> higher latency?..

With a large buffer and small subprocess output we will ask the OS for
a large memory increment for no good reason.  Then the following GC
will want to compact the gap, which means it will be slower.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-13 19:32                                                                                     ` Eli Zaretskii
@ 2023-09-13 20:38                                                                                       ` Dmitry Gutov
  2023-09-14  5:41                                                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-13 20:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 13/09/2023 22:32, Eli Zaretskii wrote:

>>> We could change the ones we care about, though.
>>
>> Which ones do we care about? I've found a bunch of 'make-process' calls
>> without :filter specified (flymake backends, ). Do we upgrade them all?
>>
>> The difference is likely not critical in most of them, but the change
>> would likely result in small reduction of GC pressure in the
>> corresponding Emacs sessions.
>>
>> We'll also need to version-guard the ones that are in ELPA.
>>
>> We don't touch the implementations of functions like start-file-process,
>> right?
>>
>> What about the callers of functions like
>> start-file-process-shell-command who want to take advantage of the
>> improvement? Are we okay with them all having to call
>> (set-process-filter proc 'buffer) on the returned process value?
> 
> I think these questions are slightly premature.  We should first have
> the implementation of that filter, and then look for candidates that
> could benefit from it.

The implementation in that patch looks almost complete to me, unless you 
have any further comments. The main difference would be the change in 
the dispatch comparison from

   if (p->filter == Qinternal_default_process_filter)

to

   if (p->filter == Qbuffer)

, I think. Of course I can re-submit the amended patch, if you like.

Regarding documentation, though. How will we describe that new value?

The process filter is described like this in the manual:

    This function gives PROCESS the filter function FILTER.  If FILTER
      is ‘nil’, it gives the process the default filter, which inserts
      the process output into the process buffer.  If FILTER is ‘t’,
      Emacs stops accepting output from the process, unless it’s a
      network server process that listens for incoming connections.

What can we add?

   If FILTER is ‘buffer’, it works like the default one, only a bit faster.

?

> My tendency is to change only callers which
> are in many cases expected to get a lot of stuff from a subprocess, so
> shell buffers are probably out.  But we could discuss that later.

When I'm thinking of start-file-process-shell-command, I have in mind 
project--files-in-directory, which currently uses 
process-file-shell-command. Though I suppose most cases would be more 
easily converted to use make-process (like xref-matches-in-files uses 
process-file for launching a shell pipeline already).

I was also thinking about Flymake backends, because those work in the 
background. The outputs are usually small, but can easily grow in rare 
cases, without any particular limit. And since Flymake runs in the 
background, whatever extra work it has to do (especially the GC 
pressure) affects the delays when editing.

>>> We could also discuss changing
>>> the default value, but that would require measurements in as many
>>> cases as we can afford.
>>
>> If you have some particular scenarios in mind, and what to look out for,
>> I could test them out at least on one platform.
> 
> Didn't think about that enough to have scenarios.
> 
>> I'm not sure what negatives to test for, though. Raising the limit 10x
>> is unlikely to lead to an OOM, but I guess some processes could grow
>> higher latency?..
> 
> With a large buffer and small subprocess output we will ask the OS for
> a large memory increment for no good reason.  Then the following GC
> will want to compact the gap, which means it will be slower.

I wonder what scenario that might become apparent in. Launching many 
small processes at once? I can't think of a realistic test case.

Anyway, if you prefer to put off the discussion about changing the 
default, that's fine by me. Or to split it into a separate bug.






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-13 20:38                                                                                       ` Dmitry Gutov
@ 2023-09-14  5:41                                                                                         ` Eli Zaretskii
  2023-09-16  1:32                                                                                           ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-14  5:41 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Wed, 13 Sep 2023 23:38:29 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> > I think these questions are slightly premature.  We should first have
> > the implementation of that filter, and then look for candidates that
> > could benefit from it.
> 
> The implementation in that patch looks almost complete to me, unless you 
> have any further comments.

Fine, then please post a complete patch with all the bells and
whistles, and let's have it reviewed more widely.  (I suggest a new
bug report, as this one is already prohibitively long to follow,
includes unrelated issues, and I fear some people will ignore patches
posted to it).  I think there are a few subtleties we still need to
figure out.

> The main difference would be the change in 
> the dispatch comparison from
> 
>    if (p->filter == Qinternal_default_process_filter)
> 
> to
> 
>    if (p->filter == Qbuffer)

Btw, both of the above are mistakes: you cannot compare Lisp objects
as if they were simple values.  You must use EQ.

>     This function gives PROCESS the filter function FILTER.  If FILTER
>       is ‘nil’, it gives the process the default filter, which inserts
>       the process output into the process buffer.  If FILTER is ‘t’,
>       Emacs stops accepting output from the process, unless it’s a
>       network server process that listens for incoming connections.
> 
> What can we add?
> 
>    If FILTER is ‘buffer’, it works like the default one, only a bit faster.
> 
> ?

  If FILTER is the symbol ‘buffer’, it works like the default filter,
  but takes some shortcuts to be faster: it doesn't adjust markers and
  the process mark (something else?).

Of course, the real text will depend on what the final patch will look
like: I'm not yet sure I understand which parts of
internal-default-process-filter you want to keep in this alternative
filter.  (If you intend to keep all of them, it might be better to
replace internal-default-process-filter completely, perhaps first with
some variable exposed to Lisp which we could use to see if the new one
causes issues.)

> > My tendency is to change only callers which
> > are in many cases expected to get a lot of stuff from a subprocess, so
> > shell buffers are probably out.  But we could discuss that later.
> 
> When I'm thinking of start-file-process-shell-command, I have in mind 
> project--files-in-directory, which currently uses 
> process-file-shell-command. Though I suppose most cases would be more 
> easily converted to use make-process (like xref-matches-in-files uses 
> process-file for launching a shell pipeline already).
> 
> I was also thinking about Flymake backends because those work in the 
> background. The outputs are usually small, but can easily grow in rare 
> cases, without particular limit. Flymake also runs in the background, 
> meaning whatever extra work it has to do (or especially GC pressure), 
> affects the delays when editing.

I think we will have to address these on a case by case basis.  The
issues and aspects are not trivial and sometimes subtle.  We might
even introduce knobs to allow different pipe sizes if there's no
one-fits-all value for a specific function using these primitives.

> >> I'm not sure what negatives to test for, though. Raising the limit 10x
> >> is unlikely to lead to an OOM, but I guess some processes could grow
> >> higher latency?..
> > 
> > With a large buffer and small subprocess output we will ask the OS for
> > a large memory increment for no good reason.  Then the following GC
> > will want to compact the gap, which means it will be slower.
> 
> I wonder what scenario that might become apparent in. Launching many 
> small processes at once? Can't think of a realistic test case.

One process suffices.  The effect might not be significant, but
slowdowns due to new features are generally considered regressions.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-14  5:41                                                                                         ` Eli Zaretskii
@ 2023-09-16  1:32                                                                                           ` Dmitry Gutov
  2023-09-16  5:37                                                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-16  1:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735

On 14/09/2023 08:41, Eli Zaretskii wrote:
>> Date: Wed, 13 Sep 2023 23:38:29 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>>> I think these questions are slightly premature.  We should first have
>>> the implementation of that filter, and then look for candidates that
>>> could benefit from it.
>>
>> The implementation in that patch looks almost complete to me, unless you
>> have any further comments.
> 
> Fine, then please post a complete patch with all the bells and
> whistles, and let's have it reviewed more widely.  (I suggest a new
> bug report, as this one is already prohibitively long to follow,
> includes unrelated issues, and I fear some people will ignore patches
> posted to it).  I think there are a few subtleties we still need to
> figure out.

Sure, filed bug#66020.

>    If FILTER is the symbol ‘buffer’, it works like the default filter,
>    but makes some shortcuts to be faster: it doesn't adjust markers and
>    the process mark (something else?).
> 
> Of course, the real text will depend on what the final patch will look
> like: I'm not yet sure I understand which parts of
> internal-default-process-filter you want to keep in this alternative
> filter.  (If you intend to keep all of them, it might be better to
> replace internal-default-process-filter completely, perhaps first with
> some variable exposed to Lisp which we could use to see if the new one
> causes issues.)

Very good. And thanks for pointing out the omissions, so I went with 
reusing parts of internal-default-process-filter.

>>>> I'm not sure what negatives to test for, though. Raising the limit 10x
>>>> is unlikely to lead to an OOM, but I guess some processes could grow
>>>> higher latency?..
>>>
>>> With a large buffer and small subprocess output we will ask the OS for
>>> a large memory increment for no good reason.  Then the following GC
>>> will want to compact the gap, which means it will be slower.
>>
>> I wonder what scenario that might become apparent in. Launching many
>> small processes at once? Can't think of a realistic test case.
> 
> One process suffices.  The effect might not be significant, but
> slowdowns due to new features are generally considered regressions.

We'd need some objective way to evaluate this. Otherwise we'd just stop 
at the prospect of slowing down some process somewhere by 9ns (never 
mind speeding others up).






* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
  2023-09-16  1:32                                                                                           ` Dmitry Gutov
@ 2023-09-16  5:37                                                                                             ` Eli Zaretskii
  2023-09-19 19:59                                                                                               ` bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-16  5:37 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735

> Date: Sat, 16 Sep 2023 04:32:26 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> >> I wonder what scenario that might become apparent in. Launching many
> >> small processes at once? Can't think of a realistic test case.
> > 
> > One process suffices.  The effect might not be significant, but
> > slowdowns due to new features are generally considered regressions.
> 
> We'd need some objective way to evaluate this. Otherwise we'd just stop 
> at the prospect of slowing down some process somewhere by 9ns (never 
> mind speeding others up).

That could indeed happen, and did happen in other cases.  My personal
conclusion from similar situations is that it is impossible to tell in
advance what the reaction will be; we need to present the numbers and
see how the chips fall.






* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-16  5:37                                                                                             ` Eli Zaretskii
@ 2023-09-19 19:59                                                                                               ` Dmitry Gutov
  2023-09-20 11:20                                                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-19 19:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 66020

This is another continuation from bug#64735; a subthread in this bug 
seems more fitting, given that I did most of the tests with its patch 
applied.

On 16/09/2023 08:37, Eli Zaretskii wrote:
>> Date: Sat, 16 Sep 2023 04:32:26 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>   64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>>>> I wonder what scenario that might become apparent in. Launching many
>>>> small processes at once? Can't think of a realistic test case.
>>> One process suffices.  The effect might not be significant, but
>>> slowdowns due to new features are generally considered regressions.
>> We'd need some objective way to evaluate this. Otherwise we'd just stop
>> at the prospect of slowing down some process somewhere by 9ns (never
>> mind speeding others up).
> That could indeed happen, and did happen in other cases.  My personal
> conclusion from similar situations is that it is impossible to tell in
> advance what the reaction will be; we need to present the numbers and
> see how the chips fall.

I wrote this test:

(defun test-ls-output ()
   (with-temp-buffer
     (let ((proc
            (make-process :name "ls"
                          :sentinel (lambda (&rest _))
                          :buffer (current-buffer)
                          :stderr (current-buffer)
                          :connection-type 'pipe
                          :command '("ls"))))
       (while (accept-process-output proc))
       (buffer-string))))

And tried to find some case where the difference is the least in favor 
of high buffer length. The one in favor of it we already know (a process 
with lots and lots of output).

But when running 'ls' on a small directory (output 500 chars long), the 
variance in benchmarking is larger than any difference I can see from 
changing read-process-output-max from 4096 to 40960 (or to 40900 even). 
The benchmark is the following:

   (benchmark 1000 '(let ((read-process-output-fast t)
                          (read-process-output-max 4096))
                      (test-ls-output)))

When the directory is a little large (output ~50000 chars), there is 
more nuance. At first, as long as (!) read_and_insert_process_output_v2 
patch is applied and read-process-output-fast is non-nil, the difference 
is negligible:

| read-process-output-max | bench result                        |
|                    4096 | (4.566418994 28 0.8000380139999992) |
|                   40960 | (4.640526664 32 0.8330555910000008) |
|                  409600 | (4.629948652 30 0.7989731299999994) |

For completeness, here are the same results for 
read-process-output-fast=nil (emacs-29 is similar, though all a little 
slower):

| read-process-output-max | bench result                        |
|                    4096 | (4.953397326 52 1.354643750000001)  |
|                   40960 | (6.942334958 75 2.0616055079999995) |
|                  409600 | (7.124765651 76 2.0892871070000005) |

But as the session gets older (and I repeat these and other 
memory-intensive benchmarks), the picture changes, and the larger 
buffers lead to uniformly worse numbers (the below is taken with 
read-process-output-fast=t; with that var set to nil the results were 
even worse):

| read-process-output-max | bench result                        |
|                    4096 | (5.02324481 41 0.8851443580000051)  |
|                   40960 | (5.438721274 61 1.2202541989999958) |
|                  409600 | (6.11188183 77 1.5461468160000038)  |

...which seems odd given that, in general, a buffer length closer to 
the length of the output should be preferable: otherwise the buffer is 
allocated multiple times, and read_process_output is likewise called 
more often. Perhaps longer strings get more difficult to allocate as 
fragmentation increases?

So, the last table is from a session I had running from yesterday, and 
the first table was produced after I restarted Emacs about an hour ago 
(the numbers were stable for 1-2 hours while I was writing this email 
on-and-off, then started degrading again a little bit, though not yet -- 
a couple of hours since -- even halfway to the numbers in the last table).

Where to go from here?

- Maybe we declare the difference insignificant and bump the value of 
read-process-output-max, given that it helps in other cases,
- Or try to find out the cause for degradation,
- Or keep the default the same, but make it easier to use different 
value for different processes (meaning, we resurrect the discussion in 
bug#38561).






* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-19 19:59                                                                                               ` bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max Dmitry Gutov
@ 2023-09-20 11:20                                                                                                 ` Eli Zaretskii
  2023-09-21  0:57                                                                                                   ` Dmitry Gutov
  2023-09-21  8:07                                                                                                   ` Stefan Kangas
  0 siblings, 2 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-20 11:20 UTC (permalink / raw)
  To: Dmitry Gutov, Stefan Kangas, Stefan Monnier; +Cc: 66020

> Date: Tue, 19 Sep 2023 22:59:43 +0300
> Cc: 66020@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> - Maybe we declare the difference insignificant and bump the value of 
> read-process-output-max, given that it helps in other cases,
> - Or try to find out the cause for degradation,
> - Or keep the default the same, but make it easier to use different 
> value for different processes (meaning, we resurrect the discussion in 
> bug#38561).

I'd try the same experiment on other use cases, say "M-x grep" and
"M-x compile" with large outputs, and if you see the same situation
there (i.e. larger buffers are no worse), try increasing the default
value on master.

Stefan & Stefan: any comments or suggestions?






* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-20 11:20                                                                                                 ` Eli Zaretskii
@ 2023-09-21  0:57                                                                                                   ` Dmitry Gutov
  2023-09-21  2:36                                                                                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-09-21  7:42                                                                                                     ` Eli Zaretskii
  2023-09-21  8:07                                                                                                   ` Stefan Kangas
  1 sibling, 2 replies; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-21  0:57 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Kangas, Stefan Monnier; +Cc: 66020

On 20/09/2023 14:20, Eli Zaretskii wrote:
>> Date: Tue, 19 Sep 2023 22:59:43 +0300
>> Cc: 66020@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>> - Maybe we declare the difference insignificant and bump the value of
>> read-process-output-max, given that it helps in other cases,
>> - Or try to find out the cause for degradation,
>> - Or keep the default the same, but make it easier to use different
>> value for different processes (meaning, we resurrect the discussion in
>> bug#38561).
> 
> I'd try the same experiment on other use cases, say "M-x grep" and
> "M-x compile" with large outputs, and if you see the same situation
> there (i.e. larger buffers are no worse), try increasing the default
> value on master.

I've run one particular rgrep search a few times (24340 hits, ~44s when 
the variable's value is either 4096 or 409600). And it makes sense that 
there is no difference: compilation modes do a lot more work than just 
capturing the process output or splitting it into strings.

That leaves the question of what new value to use. 409600 is optimal for 
a large-output process but seems too much as default anyway (even if I 
have very little experimental proof for that hesitance: any help with 
that would be very welcome).

I did some more experimenting, though. At a superficial glance, 
allocating the 'chars' buffer at the beginning of read_process_output is 
problematic because we could instead reuse a buffer for the whole 
duration of the process. I tried that (adding a new field to 
Lisp_Process and setting it in make_process), although I had to use a 
value produced by make_uninit_string: apparently simply storing a char* 
field inside a managed structure creates problems for the GC and early 
segfaults. Anyway, the result was slightly _slower_ than the status quo.

So I read what 'alloca' does, and it looks hard to beat. But it's only 
used (as you of course know) when the value is <= MAX_ALLOCA, which is 
currently 16384. Perhaps an optimal default value shouldn't exceed this, 
even if it's hard to create a benchmark that shows a difference. With 
read-process-output-max set to 16384, my original benchmark gets about 
halfway to the optimal number.

And I think we should make the process "remember" the value at its 
creation either way (something touched on in bug#38561): in bug#55737 we 
added an fcntl call to make the larger values take effect. But this call 
is in create_process: so any subsequent increase to a large value of 
this var won't have effect. Might as well remember it there (in a new 
field), then it'll be easier to use different values of it for different 
processes (set using let-binding at the time of the process' creation).






* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-21  0:57                                                                                                   ` Dmitry Gutov
@ 2023-09-21  2:36                                                                                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
       [not found]                                                                                                       ` <58e9135f-915d-beb9-518a-e814ec2a0c5b@gutov.dev>
  2023-09-21  7:42                                                                                                     ` Eli Zaretskii
  1 sibling, 1 reply; 199+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-09-21  2:36 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, Stefan Kangas, 66020

> make_process), although I had to use a value produced by make_uninit_string:
> apparently simply storing a char* field inside a managed structure creates
> problems for the GC and early segfaults. Anyway, the result was slightly

That should depend on *where* you put that field.  Basically, it has to
come after:

    /* The thread a process is linked to, or nil for any thread.  */
    Lisp_Object thread;
    /* After this point, there are no Lisp_Objects.  */

since all the words up to that point will be traced by the GC (and
assumed to be Lisp_Object fields).  But of course, if you created the
buffer with `make_uninit_string` then it'll be inside the Lisp heap and
so it'll be reclaimed if the GC doesn't find any reference to it.


        Stefan







* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-21  0:57                                                                                                   ` Dmitry Gutov
  2023-09-21  2:36                                                                                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-09-21  7:42                                                                                                     ` Eli Zaretskii
  2023-09-21 14:37                                                                                                       ` Dmitry Gutov
  1 sibling, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-21  7:42 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 66020, stefankangas, monnier

> Date: Thu, 21 Sep 2023 03:57:43 +0300
> Cc: 66020@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> That leaves the question of what new value to use. 409600 is optimal for 
> a large-output process but seems too much as default anyway (even if I 
> have very little experimental proof for that hesitance: any help with 
> that would be very welcome).

How does the throughput depend on this value?  If the dependence curve
plateaus at some lower value, we could use that lower value as a
"good-enough" default.

> I did some more experimenting, though. At a superficial glance, 
> allocating the 'chars' buffer at the beginning of read_process_output is 
> problematic because we could instead reuse a buffer for the whole 
> duration of the process. I tried that (adding a new field to 
> Lisp_Process and setting it in make_process), although I had to use a 
> value produced by make_uninit_string: apparently simply storing a char* 
> field inside a managed structure creates problems for the GC and early 
> segfaults. Anyway, the result was slightly _slower_ than the status quo.
> 
> So I read what 'alloca' does, and it looks hard to beat. But it's only 
> used (as you of course know) when the value is <= MAX_ALLOCA, which is 
> currently 16384. Perhaps an optimal default value shouldn't exceed this, 
> even if it's hard to create a benchmark that shows a difference. With 
> read-process-output-max set to 16384, my original benchmark gets about 
> halfway to the optimal number.

Which I think means we should stop worrying about the overhead of
malloc for this purpose, as it is fast enough, at least on GNU/Linux.

> And I think we should make the process "remember" the value at its 
> creation either way (something touched on in bug#38561): in bug#55737 we 
> added an fcntl call to make the larger values take effect. But this call 
> is in create_process: so any subsequent increase to a large value of 
> this var won't have effect.

Why would the variable change after create_process?  I'm afraid I
don't understand what issue you are trying to deal with here.






* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-20 11:20                                                                                                 ` Eli Zaretskii
  2023-09-21  0:57                                                                                                   ` Dmitry Gutov
@ 2023-09-21  8:07                                                                                                   ` Stefan Kangas
       [not found]                                                                                                     ` <b4f2135b-be9d-2423-02ac-9690de8b5a92@gutov.dev>
  1 sibling, 1 reply; 199+ messages in thread
From: Stefan Kangas @ 2023-09-21  8:07 UTC (permalink / raw)
  To: Eli Zaretskii, Dmitry Gutov, Stefan Monnier; +Cc: 66020

Eli Zaretskii <eliz@gnu.org> writes:

> Stefan & Stefan: any comments or suggestions?

FWIW, I've had the below snippet in my .emacs for the last two years,
and haven't noticed any adverse effects.  I never bothered making any
actual benchmarks though:

    ;; Maybe faster:
    (setq read-process-output-max
          (max read-process-output-max (* 64 1024)))

I added the above after the discussion here:

    https://lists.gnu.org/r/emacs-devel/2021-03/msg01461.html






* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
       [not found]                                                                                                       ` <58e9135f-915d-beb9-518a-e814ec2a0c5b@gutov.dev>
@ 2023-09-21 13:16                                                                                                         ` Eli Zaretskii
  2023-09-21 17:54                                                                                                           ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-21 13:16 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 66020, monnier, stefankangas

> Date: Thu, 21 Sep 2023 15:20:57 +0300
> Cc: Eli Zaretskii <eliz@gnu.org>, Stefan Kangas <stefankangas@gmail.com>,
>  66020@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> On 21/09/2023 05:36, Stefan Monnier wrote:
> >> make_process), although I had to use a value produced by make_uninit_string:
> >> apparently simply storing a char* field inside a managed structure creates
> >> problems for the GC and early segfaults. Anyway, the result was slightly
> > That should depend on *where* you put that field.  Basically, it has to
> > come after:
> > 
> >      /* The thread a process is linked to, or nil for any thread.  */
> >      Lisp_Object thread;
> >      /* After this point, there are no Lisp_Objects.  */
> > 
> > since all the words up to that point will be traced by the GC (and
> > assumed to be Lisp_Object fields).
> 
> Ah, thanks. That calls for another try.
> 
> ...still no improvement, though no statistically significant slowdown 
> either this time.

Why did you expect a significant improvement?  Allocating and freeing
the same-size buffer in quick succession has got to be optimally
handled by modern malloc implementations, so I wouldn't be surprised
by what you discover.  There should be no OS calls, just reuse of a
buffer that was just recently free'd.  The overhead exists, but is
probably very small, so it is lost in the noise.






* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
       [not found]                                                                                                     ` <b4f2135b-be9d-2423-02ac-9690de8b5a92@gutov.dev>
@ 2023-09-21 13:17                                                                                                       ` Eli Zaretskii
  0 siblings, 0 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-21 13:17 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 66020, stefankangas, monnier

> Date: Thu, 21 Sep 2023 15:27:41 +0300
> Cc: 66020@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> >      https://lists.gnu.org/r/emacs-devel/2021-03/msg01461.html
> 
> The archive seems down (so I can't read this), but if you found a 
> tangible improvement from the above setting, you might also want to try 
> out the patch at the top of this bug report.

It is back up now.






* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-21  7:42                                                                                                     ` Eli Zaretskii
@ 2023-09-21 14:37                                                                                                       ` Dmitry Gutov
  2023-09-21 14:59                                                                                                         ` Eli Zaretskii
  2023-09-21 17:33                                                                                                         ` Dmitry Gutov
  0 siblings, 2 replies; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-21 14:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 66020, stefankangas, monnier

On 21/09/2023 10:42, Eli Zaretskii wrote:
>> Date: Thu, 21 Sep 2023 03:57:43 +0300
>> Cc: 66020@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>> That leaves the question of what new value to use. 409600 is optimal for
>> a large-output process but seems too much as default anyway (even if I
>> have very little experimental proof for that hesitance: any help with
>> that would be very welcome).
> 
> How does the throughput depend on this value?  If the dependence curve
> plateaus at some lower value, we could use that lower value as a
> "good-enough" default.

Depends on what we're prepared to call a plateau. Strictly speaking, not 
really. But we have a "sweet spot": for the process in my original 
benchmark ('find' with lots of output) it seems to be around 1009600. 
Here's a table (numbers are different from before because they're 
results of (benchmark 5 ...) divided by 5, meaning GC is amortized):

|    4096 | 0.78 |
|   16368 | 0.69 |
|   40960 | 0.65 |
|  409600 | 0.59 |
| 1009600 | 0.56 |
| 2009600 | 0.64 |
| 4009600 | 0.65 |

The process's output length is 27244567 in this case. Still above the 
largest of the buffers in this example.

Notably, only allocating the buffer once at the start of the process 
(experiment mentioned in the email to Stefan M.) doesn't change the 
dynamics: buffer lengths above ~1009600 make the performance worse.

So there must be some negative factor associated with larger buffers. 
There is an obvious positive one: the longer the buffer, the longer we 
don't switch between processes, so that overhead is lower.

We could look into improving that part specifically: for example, 
reading from the process multiple times into 'chars' right away while 
there is still pending output present (either looping inside 
read_process_output, or calling it in a loop in 
wait_reading_process_output, at least until the process' buffered output 
is exhausted). That could reduce reactivity, however (can we find out 
how much is already buffered in advance, and only loop until we exhaust 
that length?)

>> I did some more experimenting, though. At a superficial glance,
>> allocating the 'chars' buffer at the beginning of read_process_output is
>> problematic because we could instead reuse a buffer for the whole
>> duration of the process. I tried that (adding a new field to
>> Lisp_Process and setting it in make_process), although I had to use a
>> value produced by make_uninit_string: apparently simply storing a char*
>> field inside a managed structure creates problems for the GC and early
>> segfaults. Anyway, the result was slightly _slower_ than the status quo.
>>
>> So I read what 'alloca' does, and it looks hard to beat. But it's only
>> used (as you of course know) when the value is <= MAX_ALLOCA, which is
>> currently 16384. Perhaps an optimal default value shouldn't exceed this,
>> even if it's hard to create a benchmark that shows a difference. With
>> read-process-output-max set to 16384, my original benchmark gets about
>> halfway to the optimal number.
> 
> Which I think means we should stop worrying about the overhead of
> malloc for this purpose, as it is fast enough, at least on GNU/Linux.

Perhaps. If we're not too concerned about memory fragmentation (that's 
the only explanation I have for the table "session gets older" -- last 
one -- in a previous email with test-ls-output timings).

>> And I think we should make the process "remember" the value at its
>> creation either way (something touched on in bug#38561): in bug#55737 we
>> added an fcntl call to make the larger values take effect. But this call
>> is in create_process: so any subsequent increase to a large value of
>> this var won't have effect.
> 
> Why would the variable change after create_process?  I'm afraid I
> don't understand what issue you are trying to deal with here.

Well, what could we lose by saving the value of read-process-output-max 
in create_process? Currently I suppose one could vary its value while a 
process is still running, to implement some adaptive behavior or 
whatnot. But that's already semi-broken because fcntl is called in 
create_process.






* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-21 14:37                                                                                                       ` Dmitry Gutov
@ 2023-09-21 14:59                                                                                                         ` Eli Zaretskii
  2023-09-21 17:40                                                                                                           ` Dmitry Gutov
  2023-09-21 17:33                                                                                                         ` Dmitry Gutov
  1 sibling, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-21 14:59 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 66020, stefankangas, monnier

> Date: Thu, 21 Sep 2023 17:37:23 +0300
> Cc: stefankangas@gmail.com, monnier@iro.umontreal.ca, 66020@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> > How does the throughput depend on this value?  If the dependence curve
> > plateaus at some lower value, we could use that lower value as a
> > "good-enough" default.
> 
> Depends on what we're prepared to call a plateau. Strictly speaking, not 
> really. But we have a "sweet spot": for the process in my original 
> benchmark ('find' with lots of output) it seems to be around 1009600. 
> Here's a table (numbers are different from before because they're 
> results of (benchmark 5 ...) divided by 5, meaning GC is amortized:
> 
> |    4096 | 0.78 |
> |   16368 | 0.69 |
> |   40960 | 0.65 |
> |  409600 | 0.59 |
> | 1009600 | 0.56 |
> | 2009600 | 0.64 |
> | 4009600 | 0.65 |

Not enough data points between 40960 and 409600, IMO.  40960 sounds
like a good spot for the default value.

> >> And I think we should make the process "remember" the value at its
> >> creation either way (something touched on in bug#38561): in bug#55737 we
> >> added an fcntl call to make the larger values take effect. But this call
> >> is in create_process: so any subsequent increase to a large value of
> >> this var won't have effect.
> > 
> > Why would the variable change after create_process?  I'm afraid I
> > don't understand what issue you are trying to deal with here.
> 
> Well, what could we lose by saving the value of read-process-output-max 
> in create_process?

It's already recorded in the size of the pipe, so why would we need to
record it once more?

> Currently I suppose one could vary its value while a process is
> still running, to implement some adaptive behavior or whatnot. But
> that's already semi-broken because fcntl is called in
> create_process.

I see no reason to support such changes during the process run,
indeed.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-21 14:37                                                                                                       ` Dmitry Gutov
  2023-09-21 14:59                                                                                                         ` Eli Zaretskii
@ 2023-09-21 17:33                                                                                                         ` Dmitry Gutov
  2023-09-23 21:51                                                                                                           ` Dmitry Gutov
  1 sibling, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-21 17:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 66020, monnier, stefankangas

[-- Attachment #1: Type: text/plain, Size: 3317 bytes --]

On 21/09/2023 17:37, Dmitry Gutov wrote:
> We could look into improving that part specifically: for example, 
> reading from the process multiple times into 'chars' right away while 
> there is still pending output present (either looping inside 
> read_process_output, or calling it in a loop in 
> wait_reading_process_output, at least until the process' buffered output 
> is exhausted). That could reduce reactivity, however (can we find out 
> how much is already buffered in advance, and only loop until we exhaust 
> that length?)

Hmm, the naive patch below offers some improvement for the value 4096, 
but still not comparable to raising the buffer size: 0.76 -> 0.72.

diff --git a/src/process.c b/src/process.c
index 2376d0f288d..a550e223f78 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5893,7 +5893,7 @@ wait_reading_process_output (intmax_t time_limit, 
int nsecs, int read_kbd,
  	      && ((fd_callback_info[channel].flags & (KEYBOARD_FD | PROCESS_FD))
  		  == PROCESS_FD))
  	    {
-	      int nread;
+	      int nread = 0, nnread;

  	      /* If waiting for this channel, arrange to return as
  		 soon as no more input to be processed.  No more
@@ -5912,7 +5912,13 @@ wait_reading_process_output (intmax_t time_limit, 
int nsecs, int read_kbd,
  	      /* Read data from the process, starting with our
  		 buffered-ahead character if we have one.  */

-	      nread = read_process_output (proc, channel);
+	      do
+		{
+		  nnread = read_process_output (proc, channel);
+		  nread += nnread;
+		}
+	      while (nnread >= 4096);
+
  	      if ((!wait_proc || wait_proc == XPROCESS (proc))
  		  && got_some_output < nread)
  		got_some_output = nread;


And "unlocking" the pipe size on the external process takes the 
performance further up a notch (by default it's much larger): 0.72 -> 0.65.

diff --git a/src/process.c b/src/process.c
index 2376d0f288d..85fc1b4d0c8 100644
--- a/src/process.c
+++ b/src/process.c
@@ -2206,10 +2206,10 @@ create_process (Lisp_Object process, char 
**new_argv, Lisp_Object current_dir)
        inchannel = p->open_fd[READ_FROM_SUBPROCESS];
        forkout = p->open_fd[SUBPROCESS_STDOUT];

-#if (defined (GNU_LINUX) || defined __ANDROID__)	\
-  && defined (F_SETPIPE_SZ)
-      fcntl (inchannel, F_SETPIPE_SZ, read_process_output_max);
-#endif /* (GNU_LINUX || __ANDROID__) && F_SETPIPE_SZ */
+/* #if (defined (GNU_LINUX) || defined __ANDROID__)	\ */
+/*   && defined (F_SETPIPE_SZ) */
+/*       fcntl (inchannel, F_SETPIPE_SZ, read_process_output_max); */
+/* #endif /\* (GNU_LINUX || __ANDROID__) && F_SETPIPE_SZ *\/ */
      }

    if (!NILP (p->stderrproc))

Apparently the patch from bug#55737 also made things a little worse by 
default, by limiting concurrency (the external process has to wait while 
the pipe is blocked, and by default Linux's pipe is larger). Just 
commenting it out makes performance a little better as well, though not 
as much as the two patches together.

Note that both changes above are just PoC (e.g. the hardcoded 4096, and 
probably other details like carryover).

I've tried to make a more nuanced loop inside read_process_output 
instead (as replacement for the first patch above), and so far it 
performs worse than the baseline. If anyone can see what I'm doing wrong 
(see attachment), comments are very welcome.

[-- Attachment #2: read_process_output_nn_inside.diff --]
[-- Type: text/x-patch, Size: 1443 bytes --]

diff --git a/src/process.c b/src/process.c
index 2376d0f288d..91a5c044a8c 100644
--- a/src/process.c
+++ b/src/process.c
@@ -6128,11 +6133,11 @@ read_and_dispose_of_process_output (struct Lisp_Process *p, char *chars,
 static int
 read_process_output (Lisp_Object proc, int channel)
 {
-  ssize_t nbytes;
+  ssize_t nbytes, nnbytes = 0;
   struct Lisp_Process *p = XPROCESS (proc);
   eassert (0 <= channel && channel < FD_SETSIZE);
   struct coding_system *coding = proc_decode_coding_system[channel];
-  int carryover = p->decoding_carryover;
+  int carryover;
   ptrdiff_t readmax = clip_to_bounds (1, read_process_output_max, PTRDIFF_MAX);
   specpdl_ref count = SPECPDL_INDEX ();
   Lisp_Object odeactivate;
@@ -6141,6 +6146,9 @@ read_process_output (Lisp_Object proc, int channel)
   USE_SAFE_ALLOCA;
   chars = SAFE_ALLOCA (sizeof coding->carryover + readmax);
 
+do{
+  carryover = p->decoding_carryover;
+
   if (carryover)
     /* See the comment above.  */
     memcpy (chars, SDATA (p->decoding_buf), carryover);
@@ -6222,3 +6236,3 @@ read_process_output (Lisp_Object proc, int channel)
   /* Now set NBYTES how many bytes we must decode.  */
   nbytes += carryover;
 
@@ -6233,5 +6245,8 @@
   /* Handling the process output should not deactivate the mark.  */
   Vdeactivate_mark = odeactivate;
 
+  nnbytes += nbytes;
+ } while (nbytes >= readmax);
+
   SAFE_FREE_UNBIND_TO (count, Qnil);
-  return nbytes;
+  return nnbytes;
}


^ permalink raw reply related	[flat|nested] 199+ messages in thread

* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-21 14:59                                                                                                         ` Eli Zaretskii
@ 2023-09-21 17:40                                                                                                           ` Dmitry Gutov
  2023-09-21 18:39                                                                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-21 17:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 66020, stefankangas, monnier

On 21/09/2023 17:59, Eli Zaretskii wrote:
>> Date: Thu, 21 Sep 2023 17:37:23 +0300
>> Cc: stefankangas@gmail.com, monnier@iro.umontreal.ca, 66020@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>>> How does the throughput depend on this value?  If the dependence curve
>>> plateaus at some lower value, we could use that lower value as a
>>> "good-enough" default.
>>
>> Depends on what we're prepared to call a plateau. Strictly speaking, not
>> really. But we have a "sweet spot": for the process in my original
>> benchmark ('find' with lots of output) it seems to be around 1009600.
>> Here's a table (numbers are different from before because they're
>> results of (benchmark 5 ...) divided by 5, meaning GC is amortized):
>>
>> |    4096 | 0.78 |
>> |   16368 | 0.69 |
>> |   40960 | 0.65 |
>> |  409600 | 0.59 |
>> | 1009600 | 0.56 |
>> | 2009600 | 0.64 |
>> | 4009600 | 0.65 |
> 
> Not enough data points between 40960 and 409600, IMO.  40960 sounds
> like a good spot for the default value.

Or 32K, from the thread linked to previously (datagram size). And if we 
were to raise MAX_ALLOCA by 2x, we could still use 'alloca'.

Neither would be optimal for my test scenario, though either would still 
be an improvement. But see my other email with experimental patches; 
those bring an improvement even with the default 4096.

>>>> And I think we should make the process "remember" the value at its
>>>> creation either way (something touched on in bug#38561): in bug#55737 we
>>>> added an fcntl call to make the larger values take effect. But this call
>>>> is in create_process: so any subsequent increase to a large value of
>>>> this var won't have effect.
>>>
>>> Why would the variable change after create_process?  I'm afraid I
>>> don't understand what issue you are trying to deal with here.
>>
>> Well, what could we lose by saving the value of read-process-output-max
>> in create_process?
> 
> It's already recorded in the size of the pipe, so why would we need to
> record it once more?

'read_process_output' looks it up once more, to set the value of 
'readmax' and allocate the 'chars' buffer.

Can we get the "recorded" value back from the pipe somehow?





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-21 13:16                                                                                                         ` Eli Zaretskii
@ 2023-09-21 17:54                                                                                                           ` Dmitry Gutov
  0 siblings, 0 replies; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-21 17:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 66020, monnier, stefankangas

On 21/09/2023 16:16, Eli Zaretskii wrote:
>> Date: Thu, 21 Sep 2023 15:20:57 +0300
>> Cc: Eli Zaretskii<eliz@gnu.org>, Stefan Kangas<stefankangas@gmail.com>,
>>   66020@debbugs.gnu.org
>> From: Dmitry Gutov<dmitry@gutov.dev>
>>
>> On 21/09/2023 05:36, Stefan Monnier wrote:
>>>> make_process), although I had to use a value produced by make_uninit_string:
>>>> apparently simply storing a char* field inside a managed structure creates
>>>> problems for the GC and early segfaults. Anyway, the result was slightly
>>> That should depend on*where*  you put that field.  Basically, it has to
>>> come after:
>>>
>>>       /* The thread a process is linked to, or nil for any thread.  */
>>>       Lisp_Object thread;
>>>       /* After this point, there are no Lisp_Objects.  */
>>>
>>> since all the words up to that point will be traced by the GC (and
>>> assumed to be Lisp_Object fields).
>> Ah, thanks. That calls for another try.
>>
>> ...still no improvement, though no statistically significant slowdown
>> either this time.
> Why did you expect a significant improvement?

No need to be surprised; I'm still growing an intuition for what is fast 
and what is slow at this level of abstraction.

> Allocating and freeing
> the same-size buffer in quick succession has got to be optimally
> handled by modern malloc implementations, so I wouldn't be surprised
> by what you discover.  There should be no OS calls, just reuse of a
> buffer that was just recently free'd.  The overhead exists, but is
> probably very small, so it is lost in the noise.

There are context switches after 'read_process_output' exits (control is 
returned to Emacs's event loop, the external process runs again, we wait 
on it with 'select'), so that cheap buffer reuse might not happen later, 
especially outside of the lab situation where we benchmark just a single 
external process. So I don't know.

I'm not majorly concerned, of course, and wouldn't be at all, if not for 
the previously recorded minor degradation with larger buffers in the 
longer-running session (last table in https://debbugs.gnu.org/66020#10).





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-21 17:40                                                                                                           ` Dmitry Gutov
@ 2023-09-21 18:39                                                                                                             ` Eli Zaretskii
  2023-09-21 18:42                                                                                                               ` Dmitry Gutov
  0 siblings, 1 reply; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-21 18:39 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 66020, stefankangas, monnier

> Date: Thu, 21 Sep 2023 20:40:35 +0300
> Cc: stefankangas@gmail.com, monnier@iro.umontreal.ca, 66020@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> Can we get the "recorded" value back from the pipe somehow?

There's F_GETPIPE_SZ command to fcntl, so I think we can.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-21 18:39                                                                                                             ` Eli Zaretskii
@ 2023-09-21 18:42                                                                                                               ` Dmitry Gutov
  2023-09-21 18:49                                                                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-21 18:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 66020, stefankangas, monnier

On 21/09/2023 21:39, Eli Zaretskii wrote:
>> Date: Thu, 21 Sep 2023 20:40:35 +0300
>> Cc:stefankangas@gmail.com,monnier@iro.umontreal.ca,66020@debbugs.gnu.org
>> From: Dmitry Gutov<dmitry@gutov.dev>
>>
>> Can we get the "recorded" value back from the pipe somehow?
> There's F_GETPIPE_SZ command to fcntl, so I think we can.

I'll rephrase: is this a good idea (it means one extra syscall every time 
we read a chunk, and I'm not sure of its performance cost), or should we 
add a new field to Lisp_Process after all?





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-21 18:42                                                                                                               ` Dmitry Gutov
@ 2023-09-21 18:49                                                                                                                 ` Eli Zaretskii
  0 siblings, 0 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-21 18:49 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 66020, stefankangas, monnier

> Date: Thu, 21 Sep 2023 21:42:01 +0300
> Cc: stefankangas@gmail.com, monnier@iro.umontreal.ca, 66020@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> On 21/09/2023 21:39, Eli Zaretskii wrote:
> >> Date: Thu, 21 Sep 2023 20:40:35 +0300
> >> Cc:stefankangas@gmail.com,monnier@iro.umontreal.ca,66020@debbugs.gnu.org
> >> From: Dmitry Gutov<dmitry@gutov.dev>
> >>
> >> Can we get the "recorded" value back from the pipe somehow?
> > There's F_GETPIPE_SZ command to fcntl, so I think we can.
> 
> I'll rephrase: is this a good idea (doing a +1 syscall every time we 
> read a chunk, I'm not sure of its performance anyway), or should we add 
> a new field to Lisp_Process after all?

If you indeed need the size, then do add it to the process object.





^ permalink raw reply	[flat|nested] 199+ messages in thread

* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-21 17:33                                                                                                         ` Dmitry Gutov
@ 2023-09-23 21:51                                                                                                           ` Dmitry Gutov
  2023-09-24  5:29                                                                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 199+ messages in thread
From: Dmitry Gutov @ 2023-09-23 21:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: stefankangas, 66020, monnier

On 21/09/2023 20:33, Dmitry Gutov wrote:
> On 21/09/2023 17:37, Dmitry Gutov wrote:
>> We could look into improving that part specifically: for example, 
>> reading from the process multiple times into 'chars' right away while 
>> there is still pending output present (either looping inside 
>> read_process_output, or calling it in a loop in 
>> wait_reading_process_output, at least until the process' buffered 
>> output is exhausted). That could reduce reactivity, however (can we 
>> find out how much is already buffered in advance, and only loop until 
>> we exhaust that length?)
> 
> Hmm, the naive patch below offers some improvement for the value 4096, 
> but still not comparable to raising the buffer size: 0.76 -> 0.72.
> 
> diff --git a/src/process.c b/src/process.c
> index 2376d0f288d..a550e223f78 100644
> --- a/src/process.c
> +++ b/src/process.c
> @@ -5893,7 +5893,7 @@ wait_reading_process_output (intmax_t time_limit, 
> int nsecs, int read_kbd,
>             && ((fd_callback_info[channel].flags & (KEYBOARD_FD | 
> PROCESS_FD))
>             == PROCESS_FD))
>           {
> -          int nread;
> +          int nread = 0, nnread;
> 
>             /* If waiting for this channel, arrange to return as
>            soon as no more input to be processed.  No more
> @@ -5912,7 +5912,13 @@ wait_reading_process_output (intmax_t time_limit, 
> int nsecs, int read_kbd,
>             /* Read data from the process, starting with our
>            buffered-ahead character if we have one.  */
> 
> -          nread = read_process_output (proc, channel);
> +          do
> +        {
> +          nnread = read_process_output (proc, channel);
> +          nread += nnread;
> +        }
> +          while (nnread >= 4096);
> +
>             if ((!wait_proc || wait_proc == XPROCESS (proc))
>             && got_some_output < nread)
>           got_some_output = nread;
> 
> 
> And "unlocking" the pipe size on the external process takes the 
> performance further up a notch (by default it's much larger): 0.72 -> 0.65.
> 
> diff --git a/src/process.c b/src/process.c
> index 2376d0f288d..85fc1b4d0c8 100644
> --- a/src/process.c
> +++ b/src/process.c
> @@ -2206,10 +2206,10 @@ create_process (Lisp_Object process, char 
> **new_argv, Lisp_Object current_dir)
>         inchannel = p->open_fd[READ_FROM_SUBPROCESS];
>         forkout = p->open_fd[SUBPROCESS_STDOUT];
> 
> -#if (defined (GNU_LINUX) || defined __ANDROID__)    \
> -  && defined (F_SETPIPE_SZ)
> -      fcntl (inchannel, F_SETPIPE_SZ, read_process_output_max);
> -#endif /* (GNU_LINUX || __ANDROID__) && F_SETPIPE_SZ */
> +/* #if (defined (GNU_LINUX) || defined __ANDROID__)    \ */
> +/*   && defined (F_SETPIPE_SZ) */
> +/*       fcntl (inchannel, F_SETPIPE_SZ, read_process_output_max); */
> +/* #endif /\* (GNU_LINUX || __ANDROID__) && F_SETPIPE_SZ *\/ */
>       }
> 
>     if (!NILP (p->stderrproc))
> 
> Apparently the patch from bug#55737 also made things a little worse by 
> default, by limiting concurrency (the external process has to wait while 
> the pipe is blocked, and by default Linux's pipe is larger). Just 
> commenting it out makes performance a little better as well, though not 
> as much as the two patches together.
> 
> Note that both changes above are just PoC (e.g. the hardcoded 4096, and 
> probably other details like carryover).
> 
> I've tried to make a more nuanced loop inside read_process_output 
> instead (as replacement for the first patch above), and so far it 
> performs worse than the baseline. If anyone can see what I'm doing wrong 
> (see attachment), comments are very welcome.

This seems to have been a dead end: while looping does indeed make 
things faster, it doesn't really fit the approach of the 
'adaptive_read_buffering' part that's implemented in read_process_output.

And if the external process is crazy fast (while we, e.g. when using a 
Lisp filter, are not so fast), the result could be much reduced 
interactivity, with this one process keeping us stuck in the loop.

But it seems I've found an answer to one previous question: "can we find 
out how much is already buffered in advance?"

The patch below asks that from the OS (how portable is this? not sure) 
and allocates a larger buffer when more output has been buffered. If we 
keep the OS's default pipe buffer size (64K on Linux and 16K-ish on 
macOS, according to 
https://unix.stackexchange.com/questions/11946/how-big-is-the-pipe-buffer), 
that means auto-scaling the buffer on Emacs's side depending on how much 
the process outputs. The effect on performance is similar to the 
previous (looping) patch (0.70 -> 0.65), and is comparable to bumping 
read-process-output-max to 65536.

So if we do decide to bump the default, I suppose the below should not 
be necessary. And I don't know whether we should be concerned about 
fragmentation: this way buffers do get allocated in different sizes 
(almost always multiples of 4096, but with rare exceptions among larger 
values).

diff --git a/src/process.c b/src/process.c
index 2376d0f288d..13cf6d6c50d 100644
--- a/src/process.c
+++ b/src/process.c
@@ -6137,7 +6145,18 @@
    specpdl_ref count = SPECPDL_INDEX ();
    Lisp_Object odeactivate;
    char *chars;

+#ifdef USABLE_FIONREAD
+#ifdef DATAGRAM_SOCKETS
+  if (!DATAGRAM_CHAN_P (channel))
+#endif
+    {
+      int available_read;
+      ioctl (p->infd, FIONREAD, &available_read);
+      readmax = MAX (readmax, available_read);
+    }
+#endif
+
    USE_SAFE_ALLOCA;
    chars = SAFE_ALLOCA (sizeof coding->carryover + readmax);

What do people think?





^ permalink raw reply related	[flat|nested] 199+ messages in thread

* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
  2023-09-23 21:51                                                                                                           ` Dmitry Gutov
@ 2023-09-24  5:29                                                                                                             ` Eli Zaretskii
  0 siblings, 0 replies; 199+ messages in thread
From: Eli Zaretskii @ 2023-09-24  5:29 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Paul Eggert, stefankangas, 66020, monnier

> Date: Sun, 24 Sep 2023 00:51:28 +0300
> From: Dmitry Gutov <dmitry@gutov.dev>
> Cc: 66020@debbugs.gnu.org, monnier@iro.umontreal.ca, stefankangas@gmail.com
> 
> But it seems I've found an answer to one previous question: "can we find 
> out how much is already buffered in advance?"
> 
> The patch below asks that from the OS (how portable is this? not sure) 
> and allocates a larger buffer when more output has been buffered. If we 
> keep the OS's default pipe buffer size (64K on Linux and 16K-ish on 
> macOS, according to 
> https://unix.stackexchange.com/questions/11946/how-big-is-the-pipe-buffer), 
> that means auto-scaling the buffer on Emacs's side depending on how much 
> the process outputs. The effect on performance is similar to the 
> previous (looping) patch (0.70 -> 0.65), and is comparable to bumping 
> read-process-output-max to 65536.
> 
> So if we do decide to bump the default, I suppose the below should not 
> be necessary. And I don't know whether we should be concerned about 
> fragmentation: this way buffers do get allocated in different sizes 
> (almost always multiples of 4096, but with rare exceptions among larger 
> values).
> 
> diff --git a/src/process.c b/src/process.c
> index 2376d0f288d..13cf6d6c50d 100644
> --- a/src/process.c
> +++ b/src/process.c
> @@ -6137,7 +6145,18 @@
>     specpdl_ref count = SPECPDL_INDEX ();
>     Lisp_Object odeactivate;
>     char *chars;
> 
> +#ifdef USABLE_FIONREAD
> +#ifdef DATAGRAM_SOCKETS
> +  if (!DATAGRAM_CHAN_P (channel))
> +#endif
> +    {
> +      int available_read;
> +      ioctl (p->infd, FIONREAD, &available_read);
> +      readmax = MAX (readmax, available_read);
> +    }
> +#endif
> +
>     USE_SAFE_ALLOCA;
>     chars = SAFE_ALLOCA (sizeof coding->carryover + readmax);
> 
> What do people think?

I think we should increase the default size, and the rest (querying
the system about the pipe size) looks like an unnecessary complication
to me.

I've added Paul Eggert to this discussion, as I'd like to hear his
opinions about this stuff.





^ permalink raw reply	[flat|nested] 199+ messages in thread

end of thread, other threads:[~2023-09-24  5:29 UTC | newest]

Thread overview: 199+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-19 21:16 bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Spencer Baugh
2023-07-20  5:00 ` Eli Zaretskii
2023-07-20 12:22   ` sbaugh
2023-07-20 12:42     ` Dmitry Gutov
2023-07-20 13:43       ` Spencer Baugh
2023-07-20 18:54         ` Dmitry Gutov
2023-07-20 12:38 ` Dmitry Gutov
2023-07-20 13:20   ` Ihor Radchenko
2023-07-20 15:19     ` Dmitry Gutov
2023-07-20 15:42       ` Ihor Radchenko
2023-07-20 15:57         ` Dmitry Gutov
2023-07-20 16:03           ` Ihor Radchenko
2023-07-20 18:56             ` Dmitry Gutov
2023-07-21  9:14               ` Ihor Radchenko
2023-07-20 16:33         ` Eli Zaretskii
2023-07-20 16:36           ` Ihor Radchenko
2023-07-20 16:45             ` Eli Zaretskii
2023-07-20 17:23               ` Ihor Radchenko
2023-07-20 18:24                 ` Eli Zaretskii
2023-07-20 18:29                   ` Ihor Radchenko
2023-07-20 18:43                     ` Eli Zaretskii
2023-07-20 18:57                       ` Ihor Radchenko
2023-07-21 12:37                         ` Dmitry Gutov
2023-07-21 12:58                           ` Ihor Radchenko
2023-07-21 13:00                             ` Dmitry Gutov
2023-07-21 13:34                               ` Ihor Radchenko
2023-07-21 13:36                                 ` Dmitry Gutov
2023-07-21 13:46                                   ` Ihor Radchenko
2023-07-21 15:41                                     ` Dmitry Gutov
2023-07-21 15:48                                       ` Ihor Radchenko
2023-07-21 19:53                                         ` Dmitry Gutov
2023-07-23  5:40                                     ` Ihor Radchenko
2023-07-23 11:50                                       ` Michael Albinus
2023-07-24  7:35                                         ` Ihor Radchenko
2023-07-24  7:59                                           ` Michael Albinus
2023-07-24  8:22                                             ` Ihor Radchenko
2023-07-24  9:31                                               ` Michael Albinus
2023-07-21  7:45                       ` Michael Albinus
2023-07-21 10:46                         ` Eli Zaretskii
2023-07-21 11:32                           ` Michael Albinus
2023-07-21 11:51                             ` Ihor Radchenko
2023-07-21 12:01                               ` Michael Albinus
2023-07-21 12:20                                 ` Ihor Radchenko
2023-07-21 12:25                                   ` Ihor Radchenko
2023-07-21 12:46                                     ` Eli Zaretskii
2023-07-21 13:01                                       ` Michael Albinus
2023-07-21 13:23                                         ` Ihor Radchenko
2023-07-21 15:31                                           ` Michael Albinus
2023-07-21 15:38                                             ` Ihor Radchenko
2023-07-21 15:49                                               ` Michael Albinus
2023-07-21 15:55                                                 ` Eli Zaretskii
2023-07-21 16:08                                                   ` Michael Albinus
2023-07-21 16:15                                                   ` Ihor Radchenko
2023-07-21 16:38                                                     ` Eli Zaretskii
2023-07-21 16:43                                                       ` Ihor Radchenko
2023-07-21 16:43                                                       ` Michael Albinus
2023-07-21 17:45                                                         ` Eli Zaretskii
2023-07-21 17:55                                                           ` Michael Albinus
2023-07-21 18:38                                                             ` Eli Zaretskii
2023-07-21 19:33                                                               ` Spencer Baugh
2023-07-22  5:27                                                                 ` Eli Zaretskii
2023-07-22 10:38                                                                   ` sbaugh
2023-07-22 11:58                                                                     ` Eli Zaretskii
2023-07-22 14:14                                                                       ` Ihor Radchenko
2023-07-22 14:32                                                                         ` Eli Zaretskii
2023-07-22 15:07                                                                           ` Ihor Radchenko
2023-07-22 15:29                                                                             ` Eli Zaretskii
2023-07-23  7:52                                                                               ` Ihor Radchenko
2023-07-23  8:01                                                                                 ` Eli Zaretskii
2023-07-23  8:11                                                                                   ` Ihor Radchenko
2023-07-23  9:11                                                                                     ` Eli Zaretskii
2023-07-23  9:34                                                                                       ` Ihor Radchenko
2023-07-23  9:39                                                                                         ` Eli Zaretskii
2023-07-23  9:42                                                                                           ` Ihor Radchenko
2023-07-23 10:20                                                                                             ` Eli Zaretskii
2023-07-23 11:43                                                                                               ` Ihor Radchenko
2023-07-23 12:49                                                                                                 ` Eli Zaretskii
2023-07-23 12:57                                                                                                   ` Ihor Radchenko
2023-07-23 13:32                                                                                                     ` Eli Zaretskii
2023-07-23 13:56                                                                                                       ` Ihor Radchenko
2023-07-23 14:32                                                                                                         ` Eli Zaretskii
2023-07-22 17:18                                                                       ` sbaugh
2023-07-22 17:26                                                                         ` Ihor Radchenko
2023-07-22 17:46                                                                         ` Eli Zaretskii
2023-07-22 18:31                                                                           ` Eli Zaretskii
2023-07-22 19:06                                                                             ` Eli Zaretskii
2023-07-22 20:53                                                                           ` Spencer Baugh
2023-07-23  6:15                                                                             ` Eli Zaretskii
2023-07-23  7:48                                                                             ` Ihor Radchenko
2023-07-23  8:06                                                                               ` Eli Zaretskii
2023-07-23  8:16                                                                                 ` Ihor Radchenko
2023-07-23  9:13                                                                                   ` Eli Zaretskii
2023-07-23  9:16                                                                                     ` Ihor Radchenko
2023-07-23 11:44                                                                             ` Michael Albinus
2023-07-23  2:59                                                                 ` Richard Stallman
2023-07-23  5:28                                                                   ` Eli Zaretskii
2023-07-22  8:17                                                             ` Michael Albinus
2023-07-21 13:17                                       ` Ihor Radchenko
2023-07-21 12:27                                   ` Michael Albinus
2023-07-21 12:30                                     ` Ihor Radchenko
2023-07-21 13:04                                       ` Michael Albinus
2023-07-21 13:24                                         ` Ihor Radchenko
2023-07-21 15:36                                           ` Michael Albinus
2023-07-21 15:44                                             ` Ihor Radchenko
2023-07-21 12:39                             ` Eli Zaretskii
2023-07-21 13:09                               ` Michael Albinus
2023-07-21 12:38                           ` Dmitry Gutov
2023-07-20 17:08         ` Spencer Baugh
2023-07-20 17:24           ` Eli Zaretskii
2023-07-22  6:35             ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-07-20 17:25           ` Ihor Radchenko
2023-07-21 19:31             ` Spencer Baugh
2023-07-21 19:37               ` Ihor Radchenko
2023-07-21 19:56                 ` Dmitry Gutov
2023-07-21 20:11                 ` Spencer Baugh
2023-07-22  6:39           ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-07-22 21:01             ` Dmitry Gutov
2023-07-23  5:11               ` Eli Zaretskii
2023-07-23 10:46                 ` Dmitry Gutov
2023-07-23 11:18                   ` Eli Zaretskii
2023-07-23 17:46                     ` Dmitry Gutov
2023-07-23 17:56                       ` Eli Zaretskii
2023-07-23 17:58                         ` Dmitry Gutov
2023-07-23 18:21                           ` Eli Zaretskii
2023-07-23 19:07                             ` Dmitry Gutov
2023-07-23 19:27                               ` Eli Zaretskii
2023-07-23 19:44                                 ` Dmitry Gutov
2023-07-23 19:27                         ` Dmitry Gutov
2023-07-24 11:20                           ` Eli Zaretskii
2023-07-24 12:55                             ` Dmitry Gutov
2023-07-24 13:26                               ` Eli Zaretskii
2023-07-25  2:41                                 ` Dmitry Gutov
2023-07-25  8:22                                   ` Ihor Radchenko
2023-07-26  1:51                                     ` Dmitry Gutov
2023-07-26  9:09                                       ` Ihor Radchenko
2023-07-27  0:41                                         ` Dmitry Gutov
2023-07-27  5:22                                           ` Eli Zaretskii
2023-07-27  8:20                                             ` Ihor Radchenko
2023-07-27  8:47                                               ` Eli Zaretskii
2023-07-27  9:28                                                 ` Ihor Radchenko
2023-07-27 13:30                                             ` Dmitry Gutov
2023-07-29  0:12                                               ` Dmitry Gutov
2023-07-29  6:15                                                 ` Eli Zaretskii
2023-07-30  1:35                                                   ` Dmitry Gutov
2023-07-31 11:38                                                     ` Eli Zaretskii
2023-09-08  0:53                                                       ` Dmitry Gutov
2023-09-08  6:35                                                         ` Eli Zaretskii
2023-09-10  1:30                                                           ` Dmitry Gutov
2023-09-10  5:33                                                             ` Eli Zaretskii
2023-09-11  0:02                                                               ` Dmitry Gutov
2023-09-11 11:57                                                                 ` Eli Zaretskii
2023-09-11 23:06                                                                   ` Dmitry Gutov
2023-09-12 11:39                                                                     ` Eli Zaretskii
2023-09-12 13:11                                                                       ` Dmitry Gutov
2023-09-12 14:23                                                                   ` Dmitry Gutov
2023-09-12 14:26                                                                     ` Dmitry Gutov
2023-09-12 16:32                                                                     ` Eli Zaretskii
2023-09-12 18:48                                                                       ` Dmitry Gutov
2023-09-12 19:35                                                                         ` Eli Zaretskii
2023-09-12 20:27                                                                           ` Dmitry Gutov
2023-09-13 11:38                                                                             ` Eli Zaretskii
2023-09-13 14:27                                                                               ` Dmitry Gutov
2023-09-13 15:07                                                                                 ` Eli Zaretskii
2023-09-13 17:27                                                                                   ` Dmitry Gutov
2023-09-13 19:32                                                                                     ` Eli Zaretskii
2023-09-13 20:38                                                                                       ` Dmitry Gutov
2023-09-14  5:41                                                                                         ` Eli Zaretskii
2023-09-16  1:32                                                                                           ` Dmitry Gutov
2023-09-16  5:37                                                                                             ` Eli Zaretskii
2023-09-19 19:59                                                                                               ` bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max Dmitry Gutov
2023-09-20 11:20                                                                                                 ` Eli Zaretskii
2023-09-21  0:57                                                                                                   ` Dmitry Gutov
2023-09-21  2:36                                                                                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
     [not found]                                                                                                       ` <58e9135f-915d-beb9-518a-e814ec2a0c5b@gutov.dev>
2023-09-21 13:16                                                                                                         ` Eli Zaretskii
2023-09-21 17:54                                                                                                           ` Dmitry Gutov
2023-09-21  7:42                                                                                                     ` Eli Zaretskii
2023-09-21 14:37                                                                                                       ` Dmitry Gutov
2023-09-21 14:59                                                                                                         ` Eli Zaretskii
2023-09-21 17:40                                                                                                           ` Dmitry Gutov
2023-09-21 18:39                                                                                                             ` Eli Zaretskii
2023-09-21 18:42                                                                                                               ` Dmitry Gutov
2023-09-21 18:49                                                                                                                 ` Eli Zaretskii
2023-09-21 17:33                                                                                                         ` Dmitry Gutov
2023-09-23 21:51                                                                                                           ` Dmitry Gutov
2023-09-24  5:29                                                                                                             ` Eli Zaretskii
2023-09-21  8:07                                                                                                   ` Stefan Kangas
     [not found]                                                                                                     ` <b4f2135b-be9d-2423-02ac-9690de8b5a92@gutov.dev>
2023-09-21 13:17                                                                                                       ` Eli Zaretskii
2023-07-25 18:42                                   ` bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Eli Zaretskii
2023-07-26  1:56                                     ` Dmitry Gutov
2023-07-26  2:28                                       ` Eli Zaretskii
2023-07-26  2:35                                         ` Dmitry Gutov
2023-07-25 19:16                                   ` sbaugh
2023-07-26  2:28                                     ` Dmitry Gutov
2023-07-21  2:42 ` Richard Stallman
2023-07-22  2:39   ` Richard Stallman
2023-07-22  5:49     ` Eli Zaretskii
2023-07-22 10:18 ` Ihor Radchenko
2023-07-22 10:42   ` sbaugh
2023-07-22 12:00     ` Eli Zaretskii

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
