* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores @ 2023-07-19 21:16 Spencer Baugh 2023-07-20 5:00 ` Eli Zaretskii ` (3 more replies) 0 siblings, 4 replies; 213+ messages in thread From: Spencer Baugh @ 2023-07-19 21:16 UTC (permalink / raw) To: 64735 Several important commands and functions invoke find; for example rgrep and project-find-regexp. Most of these add some set of ignores to the find command, pulling from grep-find-ignored-files in the former case. So the find command looks like: find -H . \( -path \*/SCCS/\* -o -path \*/RCS/\* [...more ignores...] \) -prune -o -type f -print0 Alas, on my system, using GNU find, these ignores slow down find by about 15x on a large directory tree, taking it from around .5 seconds to 7.8 seconds. This is very noticeable overhead; removing the ignores makes rgrep and other find-invoking commands substantially faster for me. The overhead is linear in the number of ignores - that is, each additional ignore adds a small fixed cost. This suggests that find is linearly scanning the list of ignores and checking each one, rather than optimizing them to a single regexp and checking that regexp. Obviously, GNU find should be optimizing this. However they have previously said they will not optimize this; I commented on this bug https://savannah.gnu.org/bugs/index.php?58197 to request they rethink that. Hopefully as a fellow GNU project they will be interested in helping us... 
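The command shape above can be reproduced in a few lines of shell (the tree and patterns here are illustrative, not Emacs' full ignore list). Each additional `-path` clause is one more fnmatch-style test that find runs per directory entry, which is where the linear overhead comes from:

```shell
# Sketch: the kind of command rgrep generates. Ignored subtrees are
# pruned; everything else that is a regular file is printed.
dir=$(mktemp -d)
mkdir -p "$dir/src/SCCS" "$dir/src/RCS"
touch "$dir/src/a.c" "$dir/src/b.c" "$dir/src/SCCS/s.a" "$dir/src/RCS/r.v"

# Every directory entry is tested against each -path pattern in turn,
# so adding patterns adds a fixed per-entry cost.
out=$(find -H "$dir" \( -path '*/SCCS/*' -o -path '*/RCS/*' \) -prune \
        -o -type f -print | sort)
echo "$out"
rm -rf "$dir"
```

Only `a.c` and `b.c` survive; the files under SCCS and RCS are pruned.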
In Emacs alone, there are a few things we could do: - we could mitigate the find bug by optimizing the regexp before we pass it to find; this should basically remove all the overhead but makes the find command uglier and harder to edit - we could remove rare and likely irrelevant things from completion-ignored-extensions and vc-ignore-dir-regexp (which are used to build these lists of ignores) - we could use our own recursive directory-tree walking implementation (directory-files-recursively), if we found a nice way to pipe its output directly to grep etc without going through Lisp. (This could be nice for project-files, at least) Incidentally, I tried a find alternative, "bfs", https://github.com/tavianator/bfs and it doesn't optimize this either, sadly, so it also has the 15x slowdown. In GNU Emacs 29.0.92 (build 5, x86_64-pc-linux-gnu, X toolkit, cairo version 1.15.12, Xaw scroll bars) of 2023-07-10 built on Repository revision: dd15432ffacbeff0291381c0109f5b1245060b1d Repository branch: emacs-29 Windowing system distributor 'The X.Org Foundation', version 11.0.12011000 System Description: Rocky Linux 8.8 (Green Obsidian) Configured using: 'configure --config-cache --with-x-toolkit=lucid --with-gif=ifavailable' Configured features: CAIRO DBUS FREETYPE GLIB GMP GNUTLS GSETTINGS HARFBUZZ JPEG JSON LIBSELINUX LIBXML2 MODULES NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS X11 XDBE XIM XINPUT2 XPM LUCID ZLIB Important settings: value of $LANG: en_US.UTF-8 locale-coding-system: utf-8-unix Major mode: Shell Memory information: ((conses 16 1939322 193013) (symbols 48 76940 49) (strings 32 337371 45355) (string-bytes 1 12322013) (vectors 16 148305) (vector-slots 8 3180429 187121) (floats 8 889 751) (intervals 56 152845 1238) (buffers 976 235) (heap 1024 978725 465480)) ^ permalink raw reply [flat|nested] 213+ messages in thread
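The third option above, piping the file list straight into a consumer without round-tripping through Lisp, is what the classic `find | xargs` pipeline does outside Emacs; a minimal sketch (file names and contents are illustrative):

```shell
# Sketch: feed find's NUL-separated output directly to grep,
# with no intermediate processing of the file list.
dir=$(mktemp -d)
printf 'needle\n' > "$dir/hit.txt"
printf 'straw\n'  > "$dir/miss.txt"

out=$(find "$dir" -type f -print0 | xargs -0 grep -l needle)
echo "$out"
rm -rf "$dir"
```

`-print0`/`xargs -0` keeps the pipeline safe for file names containing spaces or newlines.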
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-19 21:16 bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Spencer Baugh @ 2023-07-20 5:00 ` Eli Zaretskii 2023-07-20 12:22 ` sbaugh 2023-07-20 12:38 ` Dmitry Gutov ` (2 subsequent siblings) 3 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-20 5:00 UTC (permalink / raw) To: Spencer Baugh; +Cc: 64735 > From: Spencer Baugh <sbaugh@janestreet.com> > Date: Wed, 19 Jul 2023 17:16:31 -0400 > > > Several important commands and functions invoke find; for example rgrep > and project-find-regexp. > > Most of these add some set of ignores to the find command, pulling from > grep-find-ignored-files in the former case. So the find command looks > like: > > find -H . \( -path \*/SCCS/\* -o -path \*/RCS/\* [...more ignores...] \) > -prune -o -type f -print0 > > Alas, on my system, using GNU find, these ignores slow down find by > about 15x on a large directory tree, taking it from around .5 seconds to > 7.8 seconds. > > This is very noticeable overhead; removing the ignores makes rgrep and > other find-invoking commands substantially faster for me. grep-find-ignored-files is a customizable user option, so if this slowdown bothers you, just customize it to avoid that. And if there are patterns there that are no longer pertinent or rare, we could remove them from the default value. I'm not sure we should bother more than these two simple measures. > The overhead is linear in the number of ignores - that is, each > additional ignore adds a small fixed cost. This suggests that find is > linearly scanning the list of ignores and checking each one, rather than > optimizing them to a single regexp and checking that regexp. If it uses fnmatch, it cannot do it any other way, I think ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 5:00 ` Eli Zaretskii @ 2023-07-20 12:22 ` sbaugh 2023-07-20 12:42 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: sbaugh @ 2023-07-20 12:22 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Spencer Baugh, 64735 Eli Zaretskii <eliz@gnu.org> writes: >> From: Spencer Baugh <sbaugh@janestreet.com> >> Date: Wed, 19 Jul 2023 17:16:31 -0400 >> >> >> Several important commands and functions invoke find; for example rgrep >> and project-find-regexp. >> >> Most of these add some set of ignores to the find command, pulling from >> grep-find-ignored-files in the former case. So the find command looks >> like: >> >> find -H . \( -path \*/SCCS/\* -o -path \*/RCS/\* [...more ignores...] \) >> -prune -o -type f -print0 >> >> Alas, on my system, using GNU find, these ignores slow down find by >> about 15x on a large directory tree, taking it from around .5 seconds to >> 7.8 seconds. >> >> This is very noticeable overhead; removing the ignores makes rgrep and >> other find-invoking commands substantially faster for me. > > grep-find-ignored-files is a customizable user option, so if this > slowdown bothers you, just customize it to avoid that. I think the fact that the default behavior is very slow, is bad. > And if there are patterns there that are no longer pertinent or rare, > we could remove them from the default value. Sure! So the thing to narrow down would be completion-ignored-extensions, which is what populates grep-find-ignored-files. Most things in that list are irrelevant to most users, but all of them are relevant to some users. Most of these are language-specific things - e.g. there's a bunch of Common Lisp compiled object (or something) extensions. Perhaps we could modularize this, so that individual packages add things to completion-ignored-extensions at load time. 
Then completion-ignored-extensions would only include things which are
relevant to a given user, as determined by what packages they load.

> I'm not sure we should bother more than these two simple measures.

Unfortunately those two simple measures help rgrep but they don't help
project-find-regexp (and other project.el commands using
project--files-in-directory, such as project-find-file), since those
project commands pull their ignores from the version control system
through vc (not grep-find-ignored-files), and then pass them to find.

>> The overhead is linear in the number of ignores - that is, each
>> additional ignore adds a small fixed cost. This suggests that find is
>> linearly scanning the list of ignores and checking each one, rather than
>> optimizing them to a single regexp and checking that regexp.
>
> If it uses fnmatch, it cannot do it any other way, I think

^ permalink raw reply [flat|nested] 213+ messages in thread

* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 12:22 ` sbaugh @ 2023-07-20 12:42 ` Dmitry Gutov 2023-07-20 13:43 ` Spencer Baugh 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-20 12:42 UTC (permalink / raw) To: sbaugh, Eli Zaretskii; +Cc: Spencer Baugh, 64735 On 20/07/2023 15:22, sbaugh@catern.com wrote: >> I'm not sure we should bother more than these two simple measures. > Unfortunately those two simple measures help rgrep but they don't help > project-find-regexp (and others project.el commands using > project--files-in-directory such as project-find-file), since those > project commands pull their ignores from the version control system > through vc (not grep-find-ignored-files), and then pass them to find. That's only a problem when the default file listing logic is used (and we usually delegate to something like 'git ls-files' instead, when the vc-aware backend is used). Anyway, some optimization could be useful there too. The extra difficulty, though, is that the entries in IGNORES already can come as wildcards. Can we merge several wildcards? Though I suppose if we use a regexp, we could construct an alternation anyway. Another question it would be helpful to check, is whether the different versions of 'find' out there work fine with -regex instead of -name, and don't get slowed down simply because of that feature. The old built-in 'find' on macOS, for example. ^ permalink raw reply [flat|nested] 213+ messages in thread
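The alternation idea can be sketched with GNU find's `-regextype posix-extended`: many ignore patterns collapse into one regexp test per entry. Note this is exactly the portability question raised above; BSD/macOS find spells the same thing `find -E ... -regex` instead (patterns below are illustrative):

```shell
# Sketch (GNU find): merge several directory ignores into a single
# alternation, so each entry is matched against one regexp instead of
# being run through fnmatch once per pattern.
dir=$(mktemp -d)
mkdir -p "$dir/p/SCCS" "$dir/p/RCS" "$dir/p/CVS"
touch "$dir/p/keep.el" "$dir/p/SCCS/x" "$dir/p/RCS/y" "$dir/p/CVS/z"

out=$(find "$dir" -regextype posix-extended \
        -regex '.*/(SCCS|RCS|CVS)(/.*)?' -prune -o -type f -print)
echo "$out"
rm -rf "$dir"
```

Because the regexp also matches the ignored directory itself, the whole subtree is pruned without descending into it.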
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 12:42 ` Dmitry Gutov @ 2023-07-20 13:43 ` Spencer Baugh 2023-07-20 18:54 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Spencer Baugh @ 2023-07-20 13:43 UTC (permalink / raw) To: Dmitry Gutov; +Cc: sbaugh, Eli Zaretskii, 64735 Dmitry Gutov <dmitry@gutov.dev> writes: > On 20/07/2023 15:22, sbaugh@catern.com wrote: >>> I'm not sure we should bother more than these two simple measures. >> Unfortunately those two simple measures help rgrep but they don't help >> project-find-regexp (and others project.el commands using >> project--files-in-directory such as project-find-file), since those >> project commands pull their ignores from the version control system >> through vc (not grep-find-ignored-files), and then pass them to find. > > That's only a problem when the default file listing logic is used (and > we usually delegate to something like 'git ls-files' instead, when the > vc-aware backend is used). Hm, yes, but things like C-u project-find-regexp will use the default find-based file listing logic instead of git ls-files, as do a few other things. I wonder, could we just go ahead and make a vc function which is list-files(GLOBS) and returns a list of files? Both git and hg support this. Then we could have C-u project-find-regexp use that instead of find, by taking the cross product of dirs-to-search and file-name-patterns-to-search. (And this would let me delete a big chunk of my own project backend, so I'd be happy to implement it.) Fundamentally it seems a little silly for project-ignores to ever be used for a vc project; if the vcs gives us ignores, we can probably just ask the vcs to list the files too, and it will have an efficient implementation of that. 
If we do that uniformly, then this find slowness would only affect transient projects, and transient projects pull their ignores from grep-find-ignored-files just like rgrep, so improvements will more easily be applied to both. (And maybe we could even get rid of project-ignores entirely, then?) > Anyway, some optimization could be useful there too. The extra > difficulty, though, is that the entries in IGNORES already can come as > wildcards. Can we merge several wildcards? Though I suppose if we use > a regexp, we could construct an alternation anyway. > > Another question it would be helpful to check, is whether the > different versions of 'find' out there work fine with -regex instead > of -name, and don't get slowed down simply because of that > feature. The old built-in 'find' on macOS, for example. ^ permalink raw reply [flat|nested] 213+ messages in thread
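The proposed list-files(GLOBS) has a direct analogue on the git side: `git ls-files` expands pathspec globs itself. A sketch, assuming git is available (the vc function is hypothetical and does not exist yet; this only shows the underlying VCS capability it would wrap):

```shell
# Sketch: let the VCS do the glob matching. git expands the '*.el'
# pathspec itself, honoring its own ignore rules, with no find involved.
dir=$(mktemp -d)
git init -q "$dir"
cd "$dir"
touch a.el b.el c.txt
git add .

out=$(git ls-files -- '*.el' | sort)
echo "$out"
```

Mercurial has the equivalent `hg files 'glob:**.el'`, so a shared vc-level entry point is plausible for both backends.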
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 13:43 ` Spencer Baugh @ 2023-07-20 18:54 ` Dmitry Gutov 0 siblings, 0 replies; 213+ messages in thread From: Dmitry Gutov @ 2023-07-20 18:54 UTC (permalink / raw) To: Spencer Baugh; +Cc: sbaugh, Eli Zaretskii, 64735 On 20/07/2023 16:43, Spencer Baugh wrote: >> That's only a problem when the default file listing logic is used (and >> we usually delegate to something like 'git ls-files' instead, when the >> vc-aware backend is used). > > Hm, yes, but things like C-u project-find-regexp will use the default > find-based file listing logic instead of git ls-files, as do a few other > things. Right. > I wonder, could we just go ahead and make a vc function which is > list-files(GLOBS) and returns a list of files? Both git and hg support > this. Then we could have C-u project-find-regexp use that instead of > find, by taking the cross product of dirs-to-search and > file-name-patterns-to-search. (And this would let me delete a big chunk > of my own project backend, so I'd be happy to implement it.) I started out on this inside the branch scratch/project-regen. Didn't have time to dedicate to it recently, but the basics are there, take a look (the method is called project-files-filtered). The difficulty with making such changes, is the project protocol grows in size, it becomes difficult for a user to understand what is mandatory, what's obsolete, and how to use it, especially in the face of backward compatibility requirements. Take a look, feedback is welcome, it should help move this forward. We should also transition to returning relative file names when possible, for performance (optionally or always). > Fundamentally it seems a little silly for project-ignores to ever be > used for a vc project; if the vcs gives us ignores, we can probably just > ask the vcs to list the files too, and it will have an efficient > implementation of that. Possibly, yes. 
But there will likely remain cases when the project-files could stay useful for callers, to construct some bigger command line for some new feature. Though perhaps we'll be able to drop that need by extracting the theoretically best performance from project-files (using a process object or some abstraction), to facilitate low-overhead piping. > If we do that uniformly, then this find slowness would only affect > transient projects, and transient projects pull their ignores from > grep-find-ignored-files just like rgrep, so improvements will more > easily be applied to both. (And maybe we could even get rid of > project-ignores entirely, then?) Regarding removing it, see above. And it'll take a number of years anyway ;-( ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-19 21:16 bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Spencer Baugh 2023-07-20 5:00 ` Eli Zaretskii @ 2023-07-20 12:38 ` Dmitry Gutov 2023-07-20 13:20 ` Ihor Radchenko 2023-07-21 2:42 ` Richard Stallman 2023-07-22 10:18 ` Ihor Radchenko 3 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-20 12:38 UTC (permalink / raw) To: Spencer Baugh, 64735 On 20/07/2023 00:16, Spencer Baugh wrote: > In Emacs alone, there are a few things we could do: > - we could mitigate the find bug by optimizing the regexp before we pass > it to find; this should basically remove all the overhead but makes the > find command uglier and harder to edit > - we could remove rare and likely irrelevant things from > completion-ignored-extensions and vc-ignore-dir-regexp (which are used > to build these lists of ignores) I like these two approaches. > - we could use our own recursive directory-tree walking implementation > (directory-files-recursively), if we found a nice way to pipe its output > directly to grep etc without going through Lisp. (This could be nice > for project-files, at least) This will probably not work as well. Last I checked, Lisp-native file listing was simply slower than 'find'. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 12:38 ` Dmitry Gutov @ 2023-07-20 13:20 ` Ihor Radchenko 2023-07-20 15:19 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-20 13:20 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Spencer Baugh, 64735 Dmitry Gutov <dmitry@gutov.dev> writes: > ... Last I checked, Lisp-native file > listing was simply slower than 'find'. Could it be changed? In my tests, I was able to improve performance of the built-in `directory-files-recursively' simply by disabling `file-name-handler-alist' around its call. See https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/ (the thread also continues off-list, and it looks like there is a lot of room for improvement in this area) -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 13:20 ` Ihor Radchenko @ 2023-07-20 15:19 ` Dmitry Gutov 2023-07-20 15:42 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-20 15:19 UTC (permalink / raw) To: Ihor Radchenko; +Cc: Spencer Baugh, 64735 On 20/07/2023 16:20, Ihor Radchenko wrote: > Dmitry Gutov <dmitry@gutov.dev> writes: > >> ... Last I checked, Lisp-native file >> listing was simply slower than 'find'. > > Could it be changed? > In my tests, I was able to improve performance of the built-in > `directory-files-recursively' simply by disabling > `file-name-handler-alist' around its call. Then it won't work with Tramp, right? I think it's pretty nifty that project-find-regexp and dired-do-find-regexp work over Tramp. > See https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/ > (the thread also continues off-list, and it looks like there is a lot of > room for improvement in this area) Does it get close enough to the performance of 'find' this way? Also note that processing all matches in Lisp, with many ignores entries, will incur the proportional overhead in Lisp. Which might be relatively slow as well. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 15:19 ` Dmitry Gutov @ 2023-07-20 15:42 ` Ihor Radchenko 2023-07-20 15:57 ` Dmitry Gutov ` (2 more replies) 0 siblings, 3 replies; 213+ messages in thread From: Ihor Radchenko @ 2023-07-20 15:42 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Spencer Baugh, 64735

Dmitry Gutov <dmitry@gutov.dev> writes:

>>> ... Last I checked, Lisp-native file
>>> listing was simply slower than 'find'.
>>
>> Could it be changed?
>> In my tests, I was able to improve performance of the built-in
>> `directory-files-recursively' simply by disabling
>> `file-name-handler-alist' around its call.
>
> Then it won't work with Tramp, right? I think it's pretty nifty that
> project-find-regexp and dired-do-find-regexp work over Tramp.

Sure. But it might also be optimized, without trying to convince the
find devs to do something about regexp handling. And things are not as
horrible as the 15x slowdown in find.

>> See https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/
>> (the thread also continues off-list, and it looks like there is a lot of
>> room for improvement in this area)
>
> Does it get close enough to the performance of 'find' this way?

Comparable:

(ignore (let ((gc-cons-threshold most-positive-fixnum))
          (benchmark-progn
            (directory-files-recursively "/home/yantar92/.data" ""))))
;; Elapsed time: 0.633713s

(ignore (let ((gc-cons-threshold most-positive-fixnum))
          (benchmark-progn
            (let ((file-name-handler-alist))
              (directory-files-recursively "/home/yantar92/.data" "")))))
;; Elapsed time: 0.324341s

;; time find /home/yantar92/.data >/dev/null
;; real 0m0.129s
;; user 0m0.017s
;; sys  0m0.111s

> Also note that processing all matches in Lisp, with many ignores
> entries, will incur the proportional overhead in Lisp. Which might be
> relatively slow as well.

Not significant.
I tried to unwrap recursion in `directory-files-recursively' and tried to play around with regexp matching of the file list itself - no significant impact compared to `file-name-handler-alist'. I am pretty sure that Emacs's native file routines can be optimized to the level of find. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 15:42 ` Ihor Radchenko @ 2023-07-20 15:57 ` Dmitry Gutov 2023-07-20 16:03 ` Ihor Radchenko 2023-07-20 16:33 ` Eli Zaretskii 2023-07-20 17:08 ` Spencer Baugh 2 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-20 15:57 UTC (permalink / raw) To: Ihor Radchenko; +Cc: Spencer Baugh, 64735 On 20/07/2023 18:42, Ihor Radchenko wrote: > Dmitry Gutov <dmitry@gutov.dev> writes: > >>>> ... Last I checked, Lisp-native file >>>> listing was simply slower than 'find'. >>> >>> Could it be changed? >>> In my tests, I was able to improve performance of the built-in >>> `directory-files-recursively' simply by disabling >>> `file-name-handler-alist' around its call. >> >> Then it won't work with Tramp, right? I think it's pretty nifty that >> project-find-regexp and dired-do-find-regexp work over Tramp. > > Sure. It might also be optimized. Without trying to convince find devs > to do something about regexp handling. > > And things are not as horrible as 15x slowdown in find. We haven't compared to the "optimized regexps" solution in find, though. >>> See https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/ >>> (the thread also continues off-list, and it looks like there is a lot of >>> room for improvement in this area) >> >> Does it get close enough to the performance of 'find' this way? > > Comparable: > > (ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (directory-files-recursively "/home/yantar92/.data" "")))) > ;; Elapsed time: 0.633713s > (ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (let ((file-name-handler-alist)) (directory-files-recursively "/home/yantar92/.data" ""))))) > ;; Elapsed time: 0.324341s > ;; time find /home/yantar92/.data >/dev/null > ;; real 0m0.129s > ;; user 0m0.017s > ;; sys 0m0.111s Still like 2.5x slower, then? That's significant. 
>> Also note that processing all matches in Lisp, with many ignores >> entries, will incur the proportional overhead in Lisp. Which might be >> relatively slow as well. > > Not significant. > I tried to unwrap recursion in `directory-files-recursively' and tried > to play around with regexp matching of the file list itself - no > significant impact compared to `file-name-handler-alist'. I suppose that can make sense, if find's slowdown is due to it issuing repeated 'stat' calls for every match. > I am pretty sure that Emacs's native file routines can be optimized to > the level of find. I don't know, the GNU tools are often ridiculously optimized. At least certain file paths. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 15:57 ` Dmitry Gutov @ 2023-07-20 16:03 ` Ihor Radchenko 2023-07-20 18:56 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-20 16:03 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Spencer Baugh, 64735 Dmitry Gutov <dmitry@gutov.dev> writes: >> And things are not as horrible as 15x slowdown in find. > > We haven't compared to the "optimized regexps" solution in find, though. Fair point. > Still like 2.5x slower, then? That's significant. It is, but it is workable if we try to optimize Emacs' `directory-files'/`file-name-all-completions' internals. >> I am pretty sure that Emacs's native file routines can be optimized to >> the level of find. > > I don't know, the GNU tools are often ridiculously optimized. At least > certain file paths. You are likely right. Then, what about applying regexps manually, on the full file list returned by find? -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 16:03 ` Ihor Radchenko @ 2023-07-20 18:56 ` Dmitry Gutov 2023-07-21 9:14 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-20 18:56 UTC (permalink / raw) To: Ihor Radchenko; +Cc: Spencer Baugh, 64735 On 20/07/2023 19:03, Ihor Radchenko wrote: > Dmitry Gutov<dmitry@gutov.dev> writes: > >>> And things are not as horrible as 15x slowdown in find. >> We haven't compared to the "optimized regexps" solution in find, though. > Fair point. > >> Still like 2.5x slower, then? That's significant. > It is, but it is workable if we try to optimize Emacs' > `directory-files'/`file-name-all-completions' internals. > >>> I am pretty sure that Emacs's native file routines can be optimized to >>> the level of find. >> I don't know, the GNU tools are often ridiculously optimized. At least >> certain file paths. Sorry, I meant "code paths" here. > You are likely right. > Then, what about applying regexps manually, on the full file list > returned by find? It will almost certainly be slower in cases where several (few) ignore entries help drop whole big directories from traversal. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 18:56 ` Dmitry Gutov @ 2023-07-21 9:14 ` Ihor Radchenko 0 siblings, 0 replies; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 9:14 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Spencer Baugh, 64735 Dmitry Gutov <dmitry@gutov.dev> writes: >> You are likely right. >> Then, what about applying regexps manually, on the full file list >> returned by find? > > It will almost certainly be slower in cases where several (few) ignore > entries help drop whole big directories from traversal. Right. Then, what about limiting find to -depth 1, filtering the output, and re-running find on matching entries? It gets complicated though, and the extra overheads associated with invoking a new process may not be worth it. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
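The per-level scheme above (one shallow find per directory, with the caller filtering before recursing) looks roughly like this in shell; GNU find spells the depth limit `-mindepth 1 -maxdepth 1`, the ignore names are illustrative, and a real implementation would batch directories to avoid exactly the one-process-per-directory overhead just mentioned:

```shell
# Sketch: list one level at a time, skip ignored directories in the
# caller, recurse into the survivors. Hidden directories are not
# visited by the */ glob; this is an illustration, not a robust walker.
walk() {
  find "$1" -mindepth 1 -maxdepth 1 -type f -print
  for d in "$1"/*/; do
    [ -d "$d" ] || continue
    case "$d" in */SCCS/|*/RCS/) continue ;; esac
    walk "${d%/}"
  done
}

dir=$(mktemp -d)
mkdir -p "$dir/src/SCCS"
touch "$dir/top.c" "$dir/src/mid.c" "$dir/src/SCCS/x"
out=$(walk "$dir" | sort)
echo "$out"
rm -rf "$dir"
```

The pruning decision moves out of find entirely, so the ignore list can be matched however the caller likes, at the cost of extra process spawns.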
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 15:42 ` Ihor Radchenko 2023-07-20 15:57 ` Dmitry Gutov @ 2023-07-20 16:33 ` Eli Zaretskii 2023-07-20 16:36 ` Ihor Radchenko 2023-07-20 17:08 ` Spencer Baugh 2 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-20 16:33 UTC (permalink / raw) To: Ihor Radchenko; +Cc: dmitry, 64735, sbaugh > Cc: Spencer Baugh <sbaugh@janestreet.com>, 64735@debbugs.gnu.org > From: Ihor Radchenko <yantar92@posteo.net> > Date: Thu, 20 Jul 2023 15:42:17 +0000 > > I am pretty sure that Emacs's native file routines can be optimized to > the level of find. Maybe it can be improved, but not to the same level as Find, because consing Lisp strings, something that Find doesn't do, does have its overhead. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 16:33 ` Eli Zaretskii @ 2023-07-20 16:36 ` Ihor Radchenko 2023-07-20 16:45 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-20 16:36 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, 64735, sbaugh Eli Zaretskii <eliz@gnu.org> writes: >> I am pretty sure that Emacs's native file routines can be optimized to >> the level of find. > > Maybe it can be improved, but not to the same level as Find, because > consing Lisp strings, something that Find doesn't do, does have its > overhead. I am not sure if this specific issue is important. If we want to use find from Emacs, we would need to create Emacs string/strings when reading the output of find anyway. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 16:36 ` Ihor Radchenko @ 2023-07-20 16:45 ` Eli Zaretskii 2023-07-20 17:23 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-20 16:45 UTC (permalink / raw) To: Ihor Radchenko; +Cc: dmitry, 64735, sbaugh > From: Ihor Radchenko <yantar92@posteo.net> > Cc: dmitry@gutov.dev, sbaugh@janestreet.com, 64735@debbugs.gnu.org > Date: Thu, 20 Jul 2023 16:36:28 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Maybe it can be improved, but not to the same level as Find, because > > consing Lisp strings, something that Find doesn't do, does have its > > overhead. > > I am not sure if this specific issue is important. > If we want to use find from Emacs, we would need to create Emacs > string/strings when reading the output of find anyway. So how do you explain that using Find is faster than using find-lisp.el? I think the answer is that using Find as a subprocess conses less. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 16:45 ` Eli Zaretskii @ 2023-07-20 17:23 ` Ihor Radchenko 2023-07-20 18:24 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-20 17:23 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

>> I am not sure if this specific issue is important.
>> If we want to use find from Emacs, we would need to create Emacs
>> string/strings when reading the output of find anyway.
>
> So how do you explain that using Find is faster than using
> find-lisp.el?
>
> I think the answer is that using Find as a subprocess conses less.

No. It avoids the excessive regexp matching Emacs otherwise does
against `file-name-handler-alist'.

(ignore (let ((gc-cons-threshold most-positive-fixnum))
          (benchmark-progn
            (find-lisp-find-files "/home/yantar92/.data" ""))))
;; Elapsed time: 2.982393s

(ignore (let ((gc-cons-threshold most-positive-fixnum)
              file-name-handler-alist)
          (benchmark-progn
            (find-lisp-find-files "/home/yantar92/.data" ""))))
;; Elapsed time: 0.784461s

22.83% emacs emacs [.] Fnconc
10.01% emacs emacs [.] Fexpand_file_name
 9.22% emacs emacs [.] eval_sub
 3.47% emacs emacs [.] assq_no_quit
 2.68% emacs emacs [.] getenv_internal_1
 2.50% emacs emacs [.] mem_insert.isra.0
 2.24% emacs emacs [.] Fassq
 2.02% emacs emacs [.] set_buffer_internal_2

(ignore (let ((gc-cons-threshold most-positive-fixnum)
              file-name-handler-alist)
          (benchmark-progn
            (find-lisp-find-files "/home/yantar92/.data" ""))))
;; Elapsed time: 0.624987s

12.39% emacs emacs [.] eval_sub
12.07% emacs emacs [.] Fexpand_file_name
 4.97% emacs emacs [.] assq_no_quit
 4.11% emacs emacs [.] getenv_internal_1
 2.77% emacs emacs [.] set_buffer_internal_2
 2.61% emacs emacs [.] mem_insert.isra.0
 2.47% emacs emacs [.] make_clear_multibyte_string.part.0

A non-recursive version of `find-lisp-find-files-internal' is below,
though it provides limited improvement.
(defun find-lisp-find-files-internal (directory file-predicate directory-predicate)
  "Find files under DIRECTORY which satisfy FILE-PREDICATE.
FILE-PREDICATE is a function which takes two arguments: the file and its
directory.

DIRECTORY-PREDICATE is used to decide whether to descend into directories.
It is a function which takes two arguments, the directory and its parent."
  (setq directory (file-name-as-directory directory))
  (let (results fullname (dirs (list (expand-file-name directory))))
    (while dirs
      (setq directory (pop dirs))
      (dolist (file (directory-files directory nil nil t))
        ;; Skip "." and ".." so we never re-queue the current or parent
        ;; directory (an infinite loop with a permissive predicate).
        (unless (member file '("." ".."))
          (setq fullname (concat directory file))
          (when (file-readable-p fullname)
            ;; If a directory, check if we should descend into it
            (and (file-directory-p fullname)
                 (setq fullname (concat fullname "/"))
                 (funcall directory-predicate file directory)
                 (push fullname dirs))
            ;; For all files and directories, call the file predicate
            (and (funcall file-predicate file directory)
                 (push fullname results))))))
    results))

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
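[Editor's note: a hypothetical usage sketch of the non-recursive version above. The directory path and both predicates are invented for illustration; they are not from the thread.]

```elisp
;; Illustrative only: collect all .el files under a tree, skipping
;; dot-directories.  Path and predicates are example values.
(find-lisp-find-files-internal
 "~/src/emacs/lisp"
 ;; FILE-PREDICATE: keep Emacs Lisp files.
 (lambda (file _dir) (string-match-p "\\.el\\'" file))
 ;; DIRECTORY-PREDICATE: do not descend into dot-directories.
 (lambda (dir _parent) (not (string-prefix-p "." dir))))
```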
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 17:23 ` Ihor Radchenko @ 2023-07-20 18:24 ` Eli Zaretskii 2023-07-20 18:29 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-20 18:24 UTC (permalink / raw) To: Ihor Radchenko; +Cc: dmitry, 64735, sbaugh > From: Ihor Radchenko <yantar92@posteo.net> > Cc: dmitry@gutov.dev, sbaugh@janestreet.com, 64735@debbugs.gnu.org > Date: Thu, 20 Jul 2023 17:23:22 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> I am not sure if this specific issue is important. > >> If we want to use find from Emacs, we would need to create Emacs > >> string/strings when reading the output of find anyway. > > > > So how do you explain that using Find is faster than using > > find-lisp.el? > > > > I think the answer is that using Find as a subprocess conses less. > > No. It uses less excessive regexp matching Emacs is trying to do in > file-name-handler-alist. Where do you see regexp matching in the profiles you provided? ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 18:24 ` Eli Zaretskii @ 2023-07-20 18:29 ` Ihor Radchenko 2023-07-20 18:43 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-20 18:29 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

>> No. It uses less excessive regexp matching Emacs is trying to do in
>> file-name-handler-alist.
>
> Where do you see regexp matching in the profiles you provided?

I did the analysis earlier for `directory-files-recursively'. See
https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/

Just to be sure, here is perf data for

(ignore
 (let ((gc-cons-threshold most-positive-fixnum))
   (benchmark-progn
     (find-lisp-find-files "/home/yantar92/.data" ""))))

  54.89%  emacs  emacs  [.] re_match_2_internal
  10.19%  emacs  emacs  [.] re_search_2
   3.35%  emacs  emacs  [.] unbind_to
   3.02%  emacs  emacs  [.] compile_pattern
   3.02%  emacs  emacs  [.] execute_charset
   3.00%  emacs  emacs  [.] process_mark_stack
   1.59%  emacs  emacs  [.] plist_get
   1.26%  emacs  emacs  [.] RE_SETUP_SYNTAX_TABLE_FOR_OBJECT
   1.17%  emacs  emacs  [.] update_syntax_table
   1.02%  emacs  emacs  [.] Fexpand_file_name

Disabling `file-name-handler-alist' cuts the time more than 2x.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 18:29 ` Ihor Radchenko @ 2023-07-20 18:43 ` Eli Zaretskii 2023-07-20 18:57 ` Ihor Radchenko 2023-07-21 7:45 ` Michael Albinus 0 siblings, 2 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-07-20 18:43 UTC (permalink / raw) To: Ihor Radchenko; +Cc: dmitry, 64735, sbaugh > From: Ihor Radchenko <yantar92@posteo.net> > Cc: dmitry@gutov.dev, sbaugh@janestreet.com, 64735@debbugs.gnu.org > Date: Thu, 20 Jul 2023 18:29:43 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> No. It uses less excessive regexp matching Emacs is trying to do in > >> file-name-handler-alist. > > > > Where do you see regexp matching in the profiles you provided? > > I did the analysis earlier for `directory-files-recursively'. See > https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/ > > Just to be sure, here is perf data for > (ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (find-lisp-find-files "/home/yantar92/.data" "")))) > > 54.89% emacs emacs [.] re_match_2_internal > 10.19% emacs emacs [.] re_search_2 > 3.35% emacs emacs [.] unbind_to > 3.02% emacs emacs [.] compile_pattern > 3.02% emacs emacs [.] execute_charset > 3.00% emacs emacs [.] process_mark_stack > 1.59% emacs emacs [.] plist_get > 1.26% emacs emacs [.] RE_SETUP_SYNTAX_TABLE_FOR_OBJECT > 1.17% emacs emacs [.] update_syntax_table > 1.02% emacs emacs [.] Fexpand_file_name > > Disabling `file-name-handler-alist' cuts the time more than 2x. Disabling file-handlers is inconceivable in Emacs. And I suspect that find-file-name-handler is mostly called not from directory-files, but from expand-file-name -- another call that cannot possibly be bypassed in Emacs, since Emacs behaves as if CWD were different for each buffer. And expand-file-name also conses file names. And then we have encoding and decoding file names, something that with Find we do much less. Etc. etc. 
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 18:43 ` Eli Zaretskii @ 2023-07-20 18:57 ` Ihor Radchenko 2023-07-21 12:37 ` Dmitry Gutov 1 sibling, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-20 18:57 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, 64735, sbaugh

Eli Zaretskii <eliz@gnu.org> writes:

>> Disabling `file-name-handler-alist' cuts the time more than 2x.
>
> Disabling file-handlers is inconceivable in Emacs.

Indeed. But we are talking about Emacs find vs. GNU find here.
In the scenarios where GNU find can be used, it is also safe to disable
file handlers, AFAIU.

> ... And I suspect that
> find-file-name-handler is mostly called not from directory-files, but
> from expand-file-name -- another call that cannot possibly be bypassed
> in Emacs, since Emacs behaves as if CWD were different for each
> buffer. And expand-file-name also conses file names. And then we
> have encoding and decoding file names, something that with Find we do
> much less. Etc. etc.

expand-file-name indeed calls Ffind_file_name_handler multiple times.
And what is worse:

(1) `find-lisp-find-files-internal' calls `expand-file-name' on every
    file in Lisp, even when it is already expanded (which it is, for
    every sub-directory);
(2) `directory-files' calls Fexpand_file_name again, on the already
    expanded directory name;
(3) `directory-files' calls Ffind_file_name_handler yet again on top of
    what was already done by Fexpand_file_name;
(4) `directory-files' calls `directory_files_internal', which calls
    `Fdirectory_file_name', which searches `Ffind_file_name_handler' yet
    one more time.

There are a huge number of repetitive calls to Ffind_file_name_handler
going on. They could at least be cached or re-used.

I do not see much encoding and consing present in the perf stats.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
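[Editor's note: the caching suggested above ("They could at least be cached or re-used") could look roughly like the following. This is a speculative sketch, not anything from Emacs: the `my/…` names are invented, and the cache is only safe while `file-name-handler-alist` is known not to change.]

```elisp
;; Speculative sketch: memoize `find-file-name-handler' lookups so the
;; repeated calls listed above hit a hash table instead of re-running
;; the regexp matches against `file-name-handler-alist'.
(defvar my/fnh-cache (make-hash-table :test #'equal))

(defun my/cached-find-file-name-handler (filename operation)
  "Like `find-file-name-handler', but cache per FILENAME and OPERATION."
  (let* ((key (cons filename operation))
         ;; Use a sentinel so a cached nil (no handler) is also a hit.
         (cached (gethash key my/fnh-cache 'miss)))
    (if (eq cached 'miss)
        (puthash key (find-file-name-handler filename operation)
                 my/fnh-cache)
      cached)))
```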
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 18:57 ` Ihor Radchenko @ 2023-07-21 12:37 ` Dmitry Gutov 2023-07-21 12:58 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-21 12:37 UTC (permalink / raw) To: Ihor Radchenko, Eli Zaretskii; +Cc: sbaugh, 64735 On 20/07/2023 21:57, Ihor Radchenko wrote: > Eli Zaretskii <eliz@gnu.org> writes: > >>> Disabling `file-name-handler-alist' cuts the time more than 2x. >> >> Disabling file-handlers is inconceivable in Emacs. > > Indeed. But we are talking about Emacs find vs. GNU find here. > In the scenarios where GNU find can be used, it is also safe to disable > file handlers, AFAIU. GNU find can be used on a remote machine. In all the same cases as when it can be used on the local one. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 12:37 ` Dmitry Gutov @ 2023-07-21 12:58 ` Ihor Radchenko 2023-07-21 13:00 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 12:58 UTC (permalink / raw) To: Dmitry Gutov; +Cc: sbaugh, Eli Zaretskii, 64735 Dmitry Gutov <dmitry@gutov.dev> writes: >>> Disabling file-handlers is inconceivable in Emacs. >> >> Indeed. But we are talking about Emacs find vs. GNU find here. >> In the scenarios where GNU find can be used, it is also safe to disable >> file handlers, AFAIU. > > GNU find can be used on a remote machine. In all the same cases as when > it can be used on the local one. But GNU find does not take into account Emacs' file-handlers for each directory when traversing directories. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 12:58 ` Ihor Radchenko @ 2023-07-21 13:00 ` Dmitry Gutov 2023-07-21 13:34 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-21 13:00 UTC (permalink / raw) To: Ihor Radchenko; +Cc: sbaugh, Eli Zaretskii, 64735 On 21/07/2023 15:58, Ihor Radchenko wrote: > Dmitry Gutov<dmitry@gutov.dev> writes: > >>>> Disabling file-handlers is inconceivable in Emacs. >>> Indeed. But we are talking about Emacs find vs. GNU find here. >>> In the scenarios where GNU find can be used, it is also safe to disable >>> file handlers, AFAIU. >> GNU find can be used on a remote machine. In all the same cases as when >> it can be used on the local one. > But GNU find does not take into account Emacs' file-handlers for each > directory when traversing directories. Indeed. Such usage always assumes the initial invocation and each visited directory belong to the same remote host. Which is usually a correct assumption. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 13:00 ` Dmitry Gutov @ 2023-07-21 13:34 ` Ihor Radchenko 2023-07-21 13:36 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 13:34 UTC (permalink / raw) To: Dmitry Gutov; +Cc: sbaugh, Eli Zaretskii, 64735 Dmitry Gutov <dmitry@gutov.dev> writes: > On 21/07/2023 15:58, Ihor Radchenko wrote: >> Dmitry Gutov<dmitry@gutov.dev> writes: >> >>>>> Disabling file-handlers is inconceivable in Emacs. >>>> Indeed. But we are talking about Emacs find vs. GNU find here. >>>> In the scenarios where GNU find can be used, it is also safe to disable >>>> file handlers, AFAIU. So, we agree here? (I've read your reply as counter-argument to mine.) >>> GNU find can be used on a remote machine. In all the same cases as when >>> it can be used on the local one. >> But GNU find does not take into account Emacs' file-handlers for each >> directory when traversing directories. > > Indeed. Such usage always assumes the initial invocation and each > visited directory belong to the same remote host. Which is usually a > correct assumption. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 13:34 ` Ihor Radchenko @ 2023-07-21 13:36 ` Dmitry Gutov 2023-07-21 13:46 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-21 13:36 UTC (permalink / raw) To: Ihor Radchenko; +Cc: sbaugh, Eli Zaretskii, 64735 On 21/07/2023 16:34, Ihor Radchenko wrote: > Dmitry Gutov<dmitry@gutov.dev> writes: > >> On 21/07/2023 15:58, Ihor Radchenko wrote: >>> Dmitry Gutov<dmitry@gutov.dev> writes: >>> >>>>>> Disabling file-handlers is inconceivable in Emacs. >>>>> Indeed. But we are talking about Emacs find vs. GNU find here. >>>>> In the scenarios where GNU find can be used, it is also safe to disable >>>>> file handlers, AFAIU. > So, we agree here? (I've read your reply as counter-argument to mine.) > We don't, IIUC. To use GNU find on a remote host, you need to have the file handlers enabled. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 13:36 ` Dmitry Gutov @ 2023-07-21 13:46 ` Ihor Radchenko 2023-07-21 15:41 ` Dmitry Gutov 2023-07-23 5:40 ` Ihor Radchenko 0 siblings, 2 replies; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 13:46 UTC (permalink / raw) To: Dmitry Gutov; +Cc: sbaugh, Eli Zaretskii, 64735 Dmitry Gutov <dmitry@gutov.dev> writes: > On 21/07/2023 16:34, Ihor Radchenko wrote: >> Dmitry Gutov<dmitry@gutov.dev> writes: >> >>> On 21/07/2023 15:58, Ihor Radchenko wrote: >>>> Dmitry Gutov<dmitry@gutov.dev> writes: >>>> >>>>>>> Disabling file-handlers is inconceivable in Emacs. >>>>>> Indeed. But we are talking about Emacs find vs. GNU find here. >>>>>> In the scenarios where GNU find can be used, it is also safe to disable >>>>>> file handlers, AFAIU. >> So, we agree here? (I've read your reply as counter-argument to mine.) >> > > We don't, IIUC. > > To use GNU find on a remote host, you need to have the file handlers > enabled. Let me clarify then. I was exploring the possibility to replace GNU find with `find-lisp-find-files'. Locally, AFAIU, running `find-lisp-find-files' without `file-name-handler-alist' is equivalent to running GNU find. (That was a reply to Eli's message that we cannot disable `file-name-handler-alist') On remote host, I can see that `find-lisp-find-files' must use tramp entries in `file-name-handler-alist'. Although, it will likely not be usable then - running GNU find on remote host is going to be unbeatable compared to repetitive TRAMP queries for file listing. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
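[Editor's note: the compromise sketched in the message above — disable handlers for local trees, keep them for remote names — can be expressed in a few lines. This is an illustration only; the `my/…` name is invented, and it assumes no handler other than Tramp matters for the local tree being walked.]

```elisp
(require 'find-lisp)

;; Sketch: bind `file-name-handler-alist' to nil only when the tree is
;; local; remote names still go through Tramp's handlers.
(defun my/find-lisp-find-files-fast (directory regexp)
  (if (file-remote-p directory)
      ;; Remote: Tramp's file name handlers are required.
      (find-lisp-find-files directory regexp)
    ;; Local: skip all handler lookups during the traversal.
    (let (file-name-handler-alist)
      (find-lisp-find-files directory regexp))))
```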
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 13:46 ` Ihor Radchenko @ 2023-07-21 15:41 ` Dmitry Gutov 2023-07-21 15:48 ` Ihor Radchenko 2023-07-23 5:40 ` Ihor Radchenko 1 sibling, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-21 15:41 UTC (permalink / raw) To: Ihor Radchenko; +Cc: sbaugh, Eli Zaretskii, 64735 On 21/07/2023 16:46, Ihor Radchenko wrote: > Dmitry Gutov <dmitry@gutov.dev> writes: > >> On 21/07/2023 16:34, Ihor Radchenko wrote: >>> Dmitry Gutov<dmitry@gutov.dev> writes: >>> >>>> On 21/07/2023 15:58, Ihor Radchenko wrote: >>>>> Dmitry Gutov<dmitry@gutov.dev> writes: >>>>> >>>>>>>> Disabling file-handlers is inconceivable in Emacs. >>>>>>> Indeed. But we are talking about Emacs find vs. GNU find here. >>>>>>> In the scenarios where GNU find can be used, it is also safe to disable >>>>>>> file handlers, AFAIU. >>> So, we agree here? (I've read your reply as counter-argument to mine.) >>> >> >> We don't, IIUC. >> >> To use GNU find on a remote host, you need to have the file handlers >> enabled. > > Let me clarify then. > I was exploring the possibility to replace GNU find with > `find-lisp-find-files'. > > Locally, AFAIU, running `find-lisp-find-files' without > `file-name-handler-alist' is equivalent to running GNU find. > (That was a reply to Eli's message that we cannot disable > `file-name-handler-alist') But it's slower! At least 2x, even with file handlers disabled. According to your own measurements, with a modern SSD (not to mention all of our users with spinning media). > Although, it will likely not > be usable then - running GNU find on remote host is going to be > unbeatable compared to repetitive TRAMP queries for file listing. That's right. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 15:41 ` Dmitry Gutov @ 2023-07-21 15:48 ` Ihor Radchenko 2023-07-21 19:53 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 15:48 UTC (permalink / raw) To: Dmitry Gutov; +Cc: sbaugh, Eli Zaretskii, 64735 Dmitry Gutov <dmitry@gutov.dev> writes: >> Locally, AFAIU, running `find-lisp-find-files' without >> `file-name-handler-alist' is equivalent to running GNU find. >> (That was a reply to Eli's message that we cannot disable >> `file-name-handler-alist') > > But it's slower! At least 2x, even with file handlers disabled. > According to your own measurements, with a modern SSD (not to mention > all of our users with spinning media). Yes, but (1) there is room for optimization; (2) I have a hope that we can implement better "ignores" when using `find-lisp-find-files', thus eventually outperforming GNU find (when used with large number of ignores). -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
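[Editor's note: one plausible shape for the "better ignores" of point (2) — and the same mitigation the bug report proposes for GNU find itself — is to fold all ignore entries into a single regexp with `regexp-opt`, so each directory name is matched once instead of once per ignore. A sketch with invented names and an example ignore list:]

```elisp
;; Sketch: one combined regexp instead of N separate ignore tests.
(defvar my/ignored-dirs '("SCCS" "RCS" ".git" ".hg" "node_modules"))

(defvar my/ignored-dir-regexp
  (concat "\\`" (regexp-opt my/ignored-dirs) "\\'")
  "Matches a directory base name exactly when it is in `my/ignored-dirs'.")

(defun my/descend-p (dir _parent)
  "DIRECTORY-PREDICATE for `find-lisp-find-files-internal'."
  (not (string-match-p my/ignored-dir-regexp dir)))
```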
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 15:48 ` Ihor Radchenko @ 2023-07-21 19:53 ` Dmitry Gutov 0 siblings, 0 replies; 213+ messages in thread From: Dmitry Gutov @ 2023-07-21 19:53 UTC (permalink / raw) To: Ihor Radchenko; +Cc: sbaugh, Eli Zaretskii, 64735 On 21/07/2023 18:48, Ihor Radchenko wrote: > Dmitry Gutov <dmitry@gutov.dev> writes: > >>> Locally, AFAIU, running `find-lisp-find-files' without >>> `file-name-handler-alist' is equivalent to running GNU find. >>> (That was a reply to Eli's message that we cannot disable >>> `file-name-handler-alist') >> >> But it's slower! At least 2x, even with file handlers disabled. >> According to your own measurements, with a modern SSD (not to mention >> all of our users with spinning media). > > Yes, but (1) there is room for optimization; (2) I have a hope that we > can implement better "ignores" when using `find-lisp-find-files', thus > eventually outperforming GNU find (when used with large number of > ignores). There are natural limits to that optimization, if the approach is to generate the full list of files in Lisp, and then filter it out programmatically: every file name will need to be allocated. That's a lot of unnecessary consing. But you're welcome to try it and report back with results. Tramp is easy to disable, so you should be fine in terms of infrastructure. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 13:46 ` Ihor Radchenko 2023-07-21 15:41 ` Dmitry Gutov @ 2023-07-23 5:40 ` Ihor Radchenko 2023-07-23 11:50 ` Michael Albinus 1 sibling, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-23 5:40 UTC (permalink / raw) To: Dmitry Gutov, Michael Albinus; +Cc: sbaugh, Eli Zaretskii, 64735 Ihor Radchenko <yantar92@posteo.net> writes: > On remote host, I can see that `find-lisp-find-files' must use > tramp entries in `file-name-handler-alist'. Although, it will likely not > be usable then - running GNU find on remote host is going to be > unbeatable compared to repetitive TRAMP queries for file listing. That said, Michael, may you please provide some insight about TRAMP directory listing queries. May they be more optimized when we need to query recursively rather than per directory? GNU find is faster simply because it is running on remote machine itself. But AFAIU, if TRAMP could convert repetitive network request for each directory into a single request, it would speed things up significantly. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 5:40 ` Ihor Radchenko @ 2023-07-23 11:50 ` Michael Albinus 2023-07-24 7:35 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Michael Albinus @ 2023-07-23 11:50 UTC (permalink / raw) To: Ihor Radchenko; +Cc: Dmitry Gutov, Eli Zaretskii, 64735, sbaugh Ihor Radchenko <yantar92@posteo.net> writes: Hi Ihor, >> On remote host, I can see that `find-lisp-find-files' must use >> tramp entries in `file-name-handler-alist'. Although, it will likely not >> be usable then - running GNU find on remote host is going to be >> unbeatable compared to repetitive TRAMP queries for file listing. > > That said, Michael, may you please provide some insight about TRAMP > directory listing queries. May they be more optimized when we need to > query recursively rather than per directory? Tramp is just a stupid library, w/o own intelligence. It offers alternative implementations for the set of primitive operations listed in (info "(elisp) Magic File Names") There's no optimization wrt to bundling several operations into a more suited remote command. > GNU find is faster simply because it is running on remote machine itself. > But AFAIU, if TRAMP could convert repetitive network request for each > directory into a single request, it would speed things up significantly. If you want something like this, you must add directory-files-recursively to that list of primitive operations. Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
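[Editor's note: for readers unfamiliar with the mechanism Michael describes, the dispatch documented in (info "(elisp) Magic File Names") amounts to the pattern below, which every handler-aware primitive performs internally. The wrapper function is only an illustration; its name is invented.]

```elisp
;; Illustration of magic file name dispatch: look up a handler for
;; OPERATION on FILENAME; if one exists, it replaces the default
;; implementation entirely.
(defun my/call-with-handler (operation filename &rest args)
  (let ((handler (find-file-name-handler filename operation)))
    (if handler
        (apply handler operation filename args)
      ;; No handler matched: run the ordinary (local) implementation.
      (apply operation filename args))))
```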
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 11:50 ` Michael Albinus @ 2023-07-24 7:35 ` Ihor Radchenko 2023-07-24 7:59 ` Michael Albinus 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-24 7:35 UTC (permalink / raw) To: Michael Albinus; +Cc: Dmitry Gutov, Eli Zaretskii, 64735, sbaugh Michael Albinus <michael.albinus@gmx.de> writes: > Tramp is just a stupid library, w/o own intelligence. It offers > alternative implementations for the set of primitive operations listed in > (info "(elisp) Magic File Names") This makes me wonder we can simply add a "find" file handler that will use find as necessary when GNU find executable is available. >> GNU find is faster simply because it is running on remote machine itself. >> But AFAIU, if TRAMP could convert repetitive network request for each >> directory into a single request, it would speed things up significantly. > > If you want something like this, you must add directory-files-recursively > to that list of primitive operations. So, it is doable, and not difficult. Good to know. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-24 7:35 ` Ihor Radchenko @ 2023-07-24 7:59 ` Michael Albinus 2023-07-24 8:22 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Michael Albinus @ 2023-07-24 7:59 UTC (permalink / raw) To: Ihor Radchenko; +Cc: Dmitry Gutov, Eli Zaretskii, 64735, sbaugh Ihor Radchenko <yantar92@posteo.net> writes: Hi Ihor, >> Tramp is just a stupid library, w/o own intelligence. It offers >> alternative implementations for the set of primitive operations listed in >> (info "(elisp) Magic File Names") > > This makes me wonder we can simply add a "find" file handler that will > use find as necessary when GNU find executable is available. > >>> GNU find is faster simply because it is running on remote machine itself. >>> But AFAIU, if TRAMP could convert repetitive network request for each >>> directory into a single request, it would speed things up significantly. >> >> If you want something like this, you must add directory-files-recursively >> to that list of primitive operations. > > So, it is doable, and not difficult. Good to know. Technically it isn't difficult. But don't forget: - We support already ~80 primitive operations. - A new primitive operation must be handled by all Tramp backends, which could require up to 10 different implementations. - I'm the only Tramp maintainer, for decades. Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-24 7:59 ` Michael Albinus @ 2023-07-24 8:22 ` Ihor Radchenko 2023-07-24 9:31 ` Michael Albinus 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-24 8:22 UTC (permalink / raw) To: Michael Albinus; +Cc: Dmitry Gutov, Eli Zaretskii, 64735, sbaugh Michael Albinus <michael.albinus@gmx.de> writes: >> So, it is doable, and not difficult. Good to know. > > Technically it isn't difficult. But don't forget: > > - We support already ~80 primitive operations. > > - A new primitive operation must be handled by all Tramp backends, which > could require up to 10 different implementations. Why so? `directory-files-recursively' is already supported by Tramp via `directory-files'. But at least for some backends `directory-files-recursively' may be implemented more efficiently. If other backends do not implement it, `directory-files' will be used. > - I'm the only Tramp maintainer, for decades. I hope that the above approach with only some backends implementing such support will not add too much of maintenance burden. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-24 8:22 ` Ihor Radchenko @ 2023-07-24 9:31 ` Michael Albinus 0 siblings, 0 replies; 213+ messages in thread From: Michael Albinus @ 2023-07-24 9:31 UTC (permalink / raw) To: Ihor Radchenko; +Cc: Dmitry Gutov, Eli Zaretskii, 64735, sbaugh Ihor Radchenko <yantar92@posteo.net> writes: Hi Ihor, > Why so? `directory-files-recursively' is already supported by Tramp via > `directory-files'. It isn't supported by Tramp yet. Tramp hasn't heard ever about. > But at least for some backends `directory-files-recursively' may be > implemented more efficiently. If other backends do not implement it, > `directory-files' will be used. Of course, and I did propose to add it. I just wanted to avoid an inflation of proposals for primitive operations being supported by Tramp. And yes, not all backends need to implement an own version of `directory-files-recursively'. But this could happen for other primitive operations, so we must always think about whether it is worth to add a primitive to Tramp. Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 18:43 ` Eli Zaretskii 2023-07-20 18:57 ` Ihor Radchenko @ 2023-07-21 7:45 ` Michael Albinus 2023-07-21 10:46 ` Eli Zaretskii 1 sibling, 1 reply; 213+ messages in thread From: Michael Albinus @ 2023-07-21 7:45 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, Ihor Radchenko, 64735, sbaugh Eli Zaretskii <eliz@gnu.org> writes: Hi Eli, >> Disabling `file-name-handler-alist' cuts the time more than 2x. > > Disabling file-handlers is inconceivable in Emacs. Agreed. However, the fattest regexps in file-name-handler-alist are those for tramp-archive-file-name-handler, tramp-completion-file-name-handler and tramp-file-name-handler. Somewhere else I have proposed to write a macro without-remote-files and a command inhibit-remote-files, which disable Tramp and remove its file name handlers from file-name-handler-alist. Either temporarily, or permanent. Users can call the command, if they know for sure they don't use remote files ever. Authors could use the macro in case they know for sure they are working over local files only. WDYT? Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 7:45 ` Michael Albinus @ 2023-07-21 10:46 ` Eli Zaretskii 2023-07-21 11:32 ` Michael Albinus 2023-07-21 12:38 ` Dmitry Gutov 0 siblings, 2 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-07-21 10:46 UTC (permalink / raw) To: Michael Albinus; +Cc: dmitry, yantar92, 64735, sbaugh > From: Michael Albinus <michael.albinus@gmx.de> > Cc: Ihor Radchenko <yantar92@posteo.net>, dmitry@gutov.dev, > 64735@debbugs.gnu.org, sbaugh@janestreet.com > Date: Fri, 21 Jul 2023 09:45:26 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Disabling file-handlers is inconceivable in Emacs. > > Agreed. > > However, the fattest regexps in file-name-handler-alist are those for > tramp-archive-file-name-handler, tramp-completion-file-name-handler and > tramp-file-name-handler. Somewhere else I have proposed to write a macro > without-remote-files and a command inhibit-remote-files, which disable > Tramp and remove its file name handlers from file-name-handler-alist. > Either temporarily, or permanent. > > Users can call the command, if they know for sure they don't use remote > files ever. Authors could use the macro in case they know for sure they > are working over local files only. > > WDYT? How is this different from binding file-name-handler-alist to nil? Tramp is nowadays the main consumer of this feature, and AFAIU your suggestion above boils down to disabling Tramp. If so, what is left? ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 10:46 ` Eli Zaretskii @ 2023-07-21 11:32 ` Michael Albinus 2023-07-21 11:51 ` Ihor Radchenko 2023-07-21 12:39 ` Eli Zaretskii 2023-07-21 12:38 ` Dmitry Gutov 1 sibling, 2 replies; 213+ messages in thread From: Michael Albinus @ 2023-07-21 11:32 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, yantar92, 64735, sbaugh Eli Zaretskii <eliz@gnu.org> writes: Hi Eli, >> However, the fattest regexps in file-name-handler-alist are those for >> tramp-archive-file-name-handler, tramp-completion-file-name-handler and >> tramp-file-name-handler. Somewhere else I have proposed to write a macro >> without-remote-files and a command inhibit-remote-files, which disable >> Tramp and remove its file name handlers from file-name-handler-alist. >> Either temporarily, or permanent. >> >> Users can call the command, if they know for sure they don't use remote >> files ever. Authors could use the macro in case they know for sure they >> are working over local files only. >> >> WDYT? > > How is this different from binding file-name-handler-alist to nil? > Tramp is nowadays the main consumer of this feature, and AFAIU your > suggestion above boils down to disabling Tramp. If so, what is left? jka-compr-handler, epa-file-handler and file-name-non-special are left. All of them have their reason. And there are packages out in the wild, which add other handlers. Like jarchive--file-name-handler and sweeprolog-file-name-handler, I've checked only (Non)GNU ELPA. All of them would suffer from the bind-file-name-handler-alist-to-nil trick. There's a reason we haven't documented it in the manuals. And this is just the case to handle it in Lisp code, with without-remote-files. According to the last Emacs Survey, more than 50% of Emacs users don't use Tramp, never ever. But they must live with the useless checks in file-name-handler-alist for Tramp. 
All of them would benefit if they add (inhibit-remote-files) to their .emacs file. Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 11:32 ` Michael Albinus @ 2023-07-21 11:51 ` Ihor Radchenko 2023-07-21 12:01 ` Michael Albinus 2023-07-21 12:39 ` Eli Zaretskii 1 sibling, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 11:51 UTC (permalink / raw) To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh Michael Albinus <michael.albinus@gmx.de> writes: > And this is just the case to handle it in Lisp code, with > without-remote-files. According to the last Emacs Survey, more than 50% > of Emacs users don't use Tramp, never ever. But they must live with the > useless checks in file-name-handler-alist for Tramp. All of them would > profit, if they add (inhibit-remote-files) in their .emacs file. May tramp only set file-name-handler-alist when a tramp command is actually invoked? Or, alternatively, may we fence the regexp matches in `file-name-handler-alist' behind boolean switches? I examined what the actual handlers do, and I can see `jka-compr-inhibit', `epa-inhibit', `tramp-archive-enabled', and `tramp-mode' are used to force-execute the original handler. If we could make Emacs perform these checks earlier, the whole expensive regexp matching phase could be bypassed. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
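Ihor's boolean-switch idea could be prototyped roughly as follows. All names here (`my-handler-guards`, `my-find-handler`) are invented for illustration and are not existing Emacs API; the point is only that a cheap predicate check can precede the regexp match:

```elisp
;; Hypothetical sketch of the proposal -- not existing Emacs API.
;; Each handler is paired with a cheap predicate consulted *before*
;; the regexp match, so disabled handlers cost almost nothing.
(defvar my-handler-guards
  '((tramp-file-name-handler . (lambda () tramp-mode))
    (jka-compr-handler       . (lambda () (not jka-compr-inhibit)))
    (epa-file-handler        . (lambda () (not epa-inhibit))))
  "Alist mapping handler symbols to enable predicates.")

(defun my-find-handler (filename)
  "Return the first enabled handler whose regexp matches FILENAME."
  (catch 'found
    (dolist (entry file-name-handler-alist)
      (let ((guard (alist-get (cdr entry) my-handler-guards)))
        (when (and (or (null guard) (funcall guard))
                   (string-match-p (car entry) filename))
          (throw 'found (cdr entry)))))))
```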
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 11:51 ` Ihor Radchenko @ 2023-07-21 12:01 ` Michael Albinus 2023-07-21 12:20 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Michael Albinus @ 2023-07-21 12:01 UTC (permalink / raw) To: Ihor Radchenko; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh Ihor Radchenko <yantar92@posteo.net> writes: Hi Ihor, > May tramp only set file-name-handler-alist when a tramp command is > actually invoked? It's the other direction: Tramp is only invoked after a check in file-name-handler-alist. > Or, alternatively, may we fence the regexp matches in > `file-name-handler-alist' behind boolean switches? > I examined what the actual handlers do, and I can see > `jka-compr-inhibit', `epa-inhibit', `tramp-archive-enabled', > and `tramp-mode' are used to force-execute the original handler. If we > could make Emacs perform these checks earlier, the whole expensive > regexp matching phase could be bypassed. Hmm, this would mean to extend the file-name-handler-alist spec. Instead of a regexp to check, we would need to allow a function call or alike. Don't know whether this pays for optimization. And there is also the case, that due to inhibit-file-name-handlers and inhibit-file-name-operation we can allow a remote file name operation for a given function, and disable it for another function. Tramp uses this mechanism. The general flag tramp-mode is not sufficient for this scenario. Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
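The inhibit-file-name-handlers mechanism Michael mentions is the standard way for a handler to delegate a single operation back to the primitive implementation. A sketch of the pattern documented in the Emacs Lisp manual (`my-file-name-handler` is a placeholder name; Tramp's tramp-run-real-handler follows this shape):

```elisp
(defun my-run-real-handler (operation args)
  "Run OPERATION with ARGS, bypassing `my-file-name-handler'.
Adds this handler to `inhibit-file-name-handlers' for OPERATION,
so the underlying primitive runs instead of re-entering the handler."
  (let ((inhibit-file-name-handlers
         (cons 'my-file-name-handler
               (and (eq inhibit-file-name-operation operation)
                    inhibit-file-name-handlers)))
        (inhibit-file-name-operation operation))
    (apply operation args)))
```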
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 12:01 ` Michael Albinus @ 2023-07-21 12:20 ` Ihor Radchenko 2023-07-21 12:25 ` Ihor Radchenko 2023-07-21 12:27 ` Michael Albinus 0 siblings, 2 replies; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 12:20 UTC (permalink / raw) To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh Michael Albinus <michael.albinus@gmx.de> writes: >> Or, alternatively, may we fence the regexp matches in >> `file-name-handler-alist' behind boolean switches? >> I examined what the actual handlers do, and I can see >> `jka-compr-inhibit', `epa-inhibit', `tramp-archive-enabled', >> and `tramp-mode' are used to force-execute the original handler. If we >> could make Emacs perform these checks earlier, the whole expensive >> regexp matching phase could be bypassed. > > Hmm, this would mean to extend the file-name-handler-alist spec. Instead > of a regexp to check, we would need to allow a function call or > alike. Don't know whether this pays for optimization. The question is: what is more costly (a) matching complex regexp && call function or (b) call function (lambda (fn) (when (and foo (match-string- ... fn)) ...)) > And there is also the case, that due to inhibit-file-name-handlers and > inhibit-file-name-operation we can allow a remote file name operation > for a given function, and disable it for another function. Tramp uses > this mechanism. The general flag tramp-mode is not sufficient for this > scenario. I am not sure if I understand completely, but it does not appear that this is used often during ordinary file operations that do not involve tramp. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 12:20 ` Ihor Radchenko @ 2023-07-21 12:25 ` Ihor Radchenko 2023-07-21 12:46 ` Eli Zaretskii 2023-07-21 12:27 ` Michael Albinus 1 sibling, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 12:25 UTC (permalink / raw) To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh Ihor Radchenko <yantar92@posteo.net> writes: >> Hmm, this would mean to extend the file-name-handler-alist spec. Instead >> of a regexp to check, we would need to allow a function call or >> alike. Don't know whether this pays for optimization. > > The question is: what is more costly > (a) matching complex regexp && call function or > (b) call function (lambda (fn) (when (and foo (match-string- ... fn)) ...)) (benchmark-run-compiled 10000000 (string-match-p (caar file-name-handler-alist) "/path/to/very/deep/file")) ;; => (1.495432981 0 0.0) (benchmark-run-compiled 10000000 (funcall (lambda (fn) (and nil (string-match-p (caar file-name-handler-alist) fn))) "/path/to/very/deep/file")) ;; => (0.42053276500000003 0 0.0) Looks like even funcall overheads are not as bad as invoking regexp search. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 12:25 ` Ihor Radchenko @ 2023-07-21 12:46 ` Eli Zaretskii 2023-07-21 13:01 ` Michael Albinus 2023-07-21 13:17 ` Ihor Radchenko 0 siblings, 2 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-07-21 12:46 UTC (permalink / raw) To: Ihor Radchenko; +Cc: dmitry, michael.albinus, 64735, sbaugh > From: Ihor Radchenko <yantar92@posteo.net> > Cc: dmitry@gutov.dev, Eli Zaretskii <eliz@gnu.org>, 64735@debbugs.gnu.org, > sbaugh@janestreet.com > Date: Fri, 21 Jul 2023 12:25:29 +0000 > > Ihor Radchenko <yantar92@posteo.net> writes: > > > The question is: what is more costly > > (a) matching complex regexp && call function or > > (b) call function (lambda (fn) (when (and foo (match-string- ... fn)) ...)) > > (benchmark-run-compiled 10000000 (string-match-p (caar file-name-handler-alist) "/path/to/very/deep/file")) > ;; => (1.495432981 0 0.0) > (benchmark-run-compiled 10000000 (funcall (lambda (fn) (and nil (string-match-p (caar file-name-handler-alist) fn))) "/path/to/very/deep/file")) > ;; => (0.42053276500000003 0 0.0) > > Looks like even funcall overheads are not as bad as invoking regexp search. But "nil" is not a faithful emulation of the real test which will have to be put there, is it? ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 12:46 ` Eli Zaretskii @ 2023-07-21 13:01 ` Michael Albinus 2023-07-21 13:23 ` Ihor Radchenko 2023-07-21 13:17 ` Ihor Radchenko 1 sibling, 1 reply; 213+ messages in thread From: Michael Albinus @ 2023-07-21 13:01 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, Ihor Radchenko, 64735, sbaugh Eli Zaretskii <eliz@gnu.org> writes: >> From: Ihor Radchenko <yantar92@posteo.net> >> Cc: dmitry@gutov.dev, Eli Zaretskii <eliz@gnu.org>, 64735@debbugs.gnu.org, >> sbaugh@janestreet.com >> Date: Fri, 21 Jul 2023 12:25:29 +0000 >> >> Ihor Radchenko <yantar92@posteo.net> writes: >> >> > The question is: what is more costly >> > (a) matching complex regexp && call function or >> > (b) call function (lambda (fn) (when (and foo (match-string- ... fn)) ...)) >> >> (benchmark-run-compiled 10000000 (string-match-p (caar file-name-handler-alist) "/path/to/very/deep/file")) >> ;; => (1.495432981 0 0.0) >> (benchmark-run-compiled 10000000 (funcall (lambda (fn) (and nil (string-match-p (caar file-name-handler-alist) fn))) "/path/to/very/deep/file")) >> ;; => (0.42053276500000003 0 0.0) >> >> Looks like even funcall overheads are not as bad as invoking regexp search. > > But "nil" is not a faithful emulation of the real test which will have > to be put there, is it? Here are some other numbers. The definition of inhibit-remote-files and without-remote-files is below. --8<---------------cut here---------------start------------->8--- (length (directory-files-recursively "~/src" "")) 146121 --8<---------------cut here---------------end--------------->8--- A sufficient large directory. 
--8<---------------cut here---------------start------------->8--- (benchmark-run-compiled 1 (directory-files-recursively "~/src" "")) (38.133906724000006 13 0.5019186470000001) (benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/src" ""))) (32.944982886 13 0.5274874450000002) --8<---------------cut here---------------end--------------->8--- There are indeed 5 sec overhead just for file name handler regexp checks. --8<---------------cut here---------------start------------->8--- (benchmark-run-compiled 1 (without-remote-files (directory-files-recursively "~/src" ""))) (33.261659676 13 0.5338916200000003) --8<---------------cut here---------------end--------------->8--- Removing just the Tramp file name handlers comes near to let-binding file-name-handler-alist. --8<---------------cut here---------------start------------->8--- (inhibit-remote-files) nil (benchmark-run-compiled 1 (directory-files-recursively "~/src" "")) (34.344226758000005 13 0.5421030509999998) --8<---------------cut here---------------end--------------->8--- And that's for the innocents, which aren't aware of Tramp overhead, and which don't need it. As said, ~50% of Emacs users. Just adding (inhibit-remote-files) to .emacs gives them a performance boost. W/o touching any other code. --8<---------------cut here---------------start------------->8--- ;;;###autoload (progn (defun inhibit-remote-files () "Deactivate remote file names." (interactive) (when (fboundp 'tramp-cleanup-all-connections) (funcall 'tramp-cleanup-all-connections)) (tramp-unload-file-name-handlers) (setq tramp-mode nil))) ;;;###autoload (progn (defmacro without-remote-files (&rest body) "Deactivate remote file names temporarily. Run BODY." 
(declare (indent 0) (debug ((form body) body))) `(let ((file-name-handler-alist (copy-tree file-name-handler-alist)) tramp-mode) (tramp-unload-file-name-handlers) ,@body))) --8<---------------cut here---------------end--------------->8--- Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
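Assuming the definitions above are loaded, the two entry points are used like this:

```elisp
;; One-shot, for users who never touch remote files:
(inhibit-remote-files)

;; Scoped, for Lisp code known to operate on local files only; the
;; let-bound copy of file-name-handler-alist is discarded on exit,
;; so Tramp's handlers reappear afterwards.
(without-remote-files
  (directory-files-recursively "~/src" ""))
```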
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 13:01 ` Michael Albinus @ 2023-07-21 13:23 ` Ihor Radchenko 2023-07-21 15:31 ` Michael Albinus 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 13:23 UTC (permalink / raw) To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh Michael Albinus <michael.albinus@gmx.de> writes: > --8<---------------cut here---------------start------------->8--- > (benchmark-run-compiled 1 (directory-files-recursively "~/src" "")) > (38.133906724000006 13 0.5019186470000001) > (benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/src" ""))) > (32.944982886 13 0.5274874450000002) > --8<---------------cut here---------------end--------------->8--- Interesting. Apparently my SSD is skewing the benchmark data on IO: (length (directory-files-recursively "~/Git" "")) ;; => 113628 (benchmark-run-compiled 1 (directory-files-recursively "~/Git" "")) ;; => (1.756453226 2 0.7181273930000032) (benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/Git" ""))) ;; => (1.202790778 2 0.7401775709999896) Would be interesting to see profiler and perf data for more detailed breakdown where those 30+ seconds where spent in. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
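The per-function breakdown Ihor asks for can be gathered with the built-in profiler; standard profiler.el usage:

```elisp
(require 'profiler)
(profiler-start 'cpu)          ; sample the CPU while the walk runs
(directory-files-recursively "~/src" "")
(profiler-stop)
(profiler-report)              ; show the call tree with percentages
```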
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 13:23 ` Ihor Radchenko @ 2023-07-21 15:31 ` Michael Albinus 2023-07-21 15:38 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Michael Albinus @ 2023-07-21 15:31 UTC (permalink / raw) To: Ihor Radchenko; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh Ihor Radchenko <yantar92@posteo.net> writes: Hi Ihor, > Interesting. Apparently my SSD is skewing the benchmark data on IO: > > (length (directory-files-recursively "~/Git" "")) > ;; => 113628 > > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" "")) > ;; => (1.756453226 2 0.7181273930000032) > (benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/Git" ""))) > ;; => (1.202790778 2 0.7401775709999896) > > Would be interesting to see profiler and perf data for more detailed > breakdown where those 30+ seconds where spent in. I have no SSD. And maybe some of the files are NFS-mounted. --8<---------------cut here---------------start------------->8--- [albinus@gandalf emacs]$ sudo lshw -class disk *-disk description: ATA Disk product: SK hynix SC311 S size: 476GiB (512GB) --8<---------------cut here---------------end--------------->8--- My point was to show the differences in the approaches. Do you have also numbers using without-remote-files and inhibit-remote-files? Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 15:31 ` Michael Albinus @ 2023-07-21 15:38 ` Ihor Radchenko 2023-07-21 15:49 ` Michael Albinus 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 15:38 UTC (permalink / raw) To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh Michael Albinus <michael.albinus@gmx.de> writes: >> Would be interesting to see profiler and perf data for more detailed >> breakdown where those 30+ seconds where spent in. > > I have no SSD. And maybe some of the files are NFS-mounted. That's why I asked about profile data (on emacs). > My point was to show the differences in the approaches. Do you have also > numbers using without-remote-files and inhibit-remote-files? (length (directory-files-recursively "~/Git" "")) ;; => 113628 (benchmark-run-compiled 1 (directory-files-recursively "~/Git" "")) ;; => (1.597328425 1 0.47237324699997885) (benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/Git" ""))) ;; => (1.0012111910000001 1 0.4860752540000135) (benchmark-run-compiled 1 (without-remote-files (directory-files-recursively "~/Git" ""))) ;; => (1.147276594 1 0.48820330999998873) (inhibit-remote-files) (benchmark-run-compiled 1 (directory-files-recursively "~/Git" "")) ;; => (1.054041615 1 0.4141427399999884) -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 15:38 ` Ihor Radchenko @ 2023-07-21 15:49 ` Michael Albinus 2023-07-21 15:55 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Michael Albinus @ 2023-07-21 15:49 UTC (permalink / raw) To: Ihor Radchenko; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh Ihor Radchenko <yantar92@posteo.net> writes: Hi Ihor, >> My point was to show the differences in the approaches. Do you have also >> numbers using without-remote-files and inhibit-remote-files? > > (length (directory-files-recursively "~/Git" "")) > ;; => 113628 > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" "")) > ;; => (1.597328425 1 0.47237324699997885) > (benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/Git" ""))) > ;; => (1.0012111910000001 1 0.4860752540000135) > (benchmark-run-compiled 1 (without-remote-files (directory-files-recursively "~/Git" ""))) > ;; => (1.147276594 1 0.48820330999998873) > (inhibit-remote-files) > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" "")) > ;; => (1.054041615 1 0.4141427399999884) Thanks a lot! These figures show, that both without-remote-files and inhibit-remote-files are useful. Of course this shouldn't stop us to find further approaches for performance optimizations. I'll wait for some days whether there's opposition, before installing them in master. Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 15:49 ` Michael Albinus @ 2023-07-21 15:55 ` Eli Zaretskii 2023-07-21 16:08 ` Michael Albinus 2023-07-21 16:15 ` Ihor Radchenko 0 siblings, 2 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-07-21 15:55 UTC (permalink / raw) To: Michael Albinus; +Cc: dmitry, yantar92, 64735, sbaugh > From: Michael Albinus <michael.albinus@gmx.de> > Cc: Eli Zaretskii <eliz@gnu.org>, dmitry@gutov.dev, 64735@debbugs.gnu.org, > sbaugh@janestreet.com > Date: Fri, 21 Jul 2023 17:49:14 +0200 > > Ihor Radchenko <yantar92@posteo.net> writes: > > Hi Ihor, > > >> My point was to show the differences in the approaches. Do you have also > >> numbers using without-remote-files and inhibit-remote-files? > > > > (length (directory-files-recursively "~/Git" "")) > > ;; => 113628 > > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" "")) > > ;; => (1.597328425 1 0.47237324699997885) > > (benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/Git" ""))) > > ;; => (1.0012111910000001 1 0.4860752540000135) > > (benchmark-run-compiled 1 (without-remote-files (directory-files-recursively "~/Git" ""))) > > ;; => (1.147276594 1 0.48820330999998873) > > (inhibit-remote-files) > > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" "")) > > ;; => (1.054041615 1 0.4141427399999884) > > Thanks a lot! These figures show, that both without-remote-files and > inhibit-remote-files are useful. Of course this shouldn't stop us to > find further approaches for performance optimizations. > > I'll wait for some days whether there's opposition, before installing > them in master. Can you spell out what you intend to install? The figures provided in this thread indicate speedups that are modest at best, so I'm not sure they justify measures which could cause problems (if that indeed could happen). ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 15:55 ` Eli Zaretskii @ 2023-07-21 16:08 ` Michael Albinus 2023-07-21 16:15 ` Ihor Radchenko 1 sibling, 0 replies; 213+ messages in thread From: Michael Albinus @ 2023-07-21 16:08 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, yantar92, 64735, sbaugh Eli Zaretskii <eliz@gnu.org> writes: Hi Eli, >> > (length (directory-files-recursively "~/Git" "")) >> > ;; => 113628 >> > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" "")) >> > ;; => (1.597328425 1 0.47237324699997885) >> > (benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/Git" ""))) >> > ;; => (1.0012111910000001 1 0.4860752540000135) >> > (benchmark-run-compiled 1 (without-remote-files (directory-files-recursively "~/Git" ""))) >> > ;; => (1.147276594 1 0.48820330999998873) >> > (inhibit-remote-files) >> > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" "")) >> > ;; => (1.054041615 1 0.4141427399999884) >> >> Thanks a lot! These figures show, that both without-remote-files and >> inhibit-remote-files are useful. Of course this shouldn't stop us to >> find further approaches for performance optimizations. >> >> I'll wait for some days whether there's opposition, before installing >> them in master. > > Can you spell out what you intend to install? I intend to install without-remote-files and inhibit-remote-files, which I have shown upthread. Plus documentation. > The figures provided in this thread indicate speedups that are modest > at best, so I'm not sure they justify measures which could cause > problems (if that indeed could happen). >> > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" "")) >> > ;; => (1.597328425 1 0.47237324699997885) 1.59 seconds. >> > (benchmark-run-compiled 1 (without-remote-files (directory-files-recursively "~/Git" ""))) >> > ;; => (1.147276594 1 0.48820330999998873) 28% performance boost. 
>> > (inhibit-remote-files) >> > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" "")) >> > ;; => (1.054041615 1 0.4141427399999884) 34% performance boost. I believe that is more than a modest speedup. And without-remote-files avoids the problems that can arise from let-binding file-name-handler-alist. Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 15:55 ` Eli Zaretskii 2023-07-21 16:08 ` Michael Albinus @ 2023-07-21 16:15 ` Ihor Radchenko 2023-07-21 16:38 ` Eli Zaretskii 1 sibling, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 16:15 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, Michael Albinus, 64735, sbaugh Eli Zaretskii <eliz@gnu.org> writes: >> > (length (directory-files-recursively "~/Git" "")) >> > ;; => 113628 >> > (benchmark-run-compiled 1 (directory-files-recursively "~/Git" "")) >> > ;; => (1.597328425 1 0.47237324699997885) >> > (benchmark-run-compiled 1 (let (file-name-handler-alist) (directory-files-recursively "~/Git" ""))) >> > ;; => (1.0012111910000001 1 0.4860752540000135) > ... > The figures provided in this thread indicate speedups that are modest > at best, so I'm not sure they justify measures which could cause > problems (if that indeed could happen). Not that modest. Basically, it all depends on how frequently Emacs file API is being used. If we take `find-lisp-find-files', which triggers more file handler lookup, the difference becomes more significant: (benchmark-run-compiled 1 (find-lisp-find-files "/home/yantar92/.data" "")) ;; (3.853305824 4 0.9142656910000007) (let (file-name-handler-alist) (benchmark-run-compiled 1 (find-lisp-find-files "/home/yantar92/.data" ""))) ;; (1.545292093 4 0.9098995830000014) In particular, `expand-file-name' is commonly used in the wild to ensure that a given path is full. For a single file, it may not add much overheads, but it is so common that I believe that it would be worth it to make even relatively small improvements in performance. I am pretty sure that file name handlers are checked behind the scenes by many other common operations. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. 
Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 16:15 ` Ihor Radchenko @ 2023-07-21 16:38 ` Eli Zaretskii 2023-07-21 16:43 ` Ihor Radchenko 2023-07-21 16:43 ` Michael Albinus 0 siblings, 2 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-07-21 16:38 UTC (permalink / raw) To: Ihor Radchenko; +Cc: dmitry, michael.albinus, 64735, sbaugh > From: Ihor Radchenko <yantar92@posteo.net> > Cc: Michael Albinus <michael.albinus@gmx.de>, dmitry@gutov.dev, > 64735@debbugs.gnu.org, sbaugh@janestreet.com > Date: Fri, 21 Jul 2023 16:15:41 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > > The figures provided in this thread indicate speedups that are modest > > at best, so I'm not sure they justify measures which could cause > > problems (if that indeed could happen). > > Not that modest. Basically, it all depends on how frequently Emacs file API is > being used. If we take `find-lisp-find-files', which triggers more file > handler lookup, the difference becomes more significant: > > (benchmark-run-compiled 1 (find-lisp-find-files "/home/yantar92/.data" "")) > ;; (3.853305824 4 0.9142656910000007) > (let (file-name-handler-alist) (benchmark-run-compiled 1 (find-lisp-find-files "/home/yantar92/.data" ""))) > ;; (1.545292093 4 0.9098995830000014) The above just means that find-lisp is not a good way of emulating Find in Emacs. It is no accident that it is not used too much. > In particular, `expand-file-name' is commonly used in the wild to ensure > that a given path is full. For a single file, it may not add much > overheads, but it is so common that I believe that it would be worth it > to make even relatively small improvements in performance. The Right Way of avoiding unnecessary calls to expand-file-name is to program dedicated primitives that perform more specialized jobs, instead of calling existing primitives in some higher-level code. 
Then you can avoid these calls altogether once you know that the input file names are already in absolute form. IOW, if a specific job, when implemented in Lisp, is not performant enough, it means implementing it that way is not a good idea. Disabling file-name-handlers is the wrong way to solve these performance problems. > I am pretty sure that file name handlers are checked behind the scenes > by many other common operations. I'm pretty sure they aren't. But every file-related primitive calls expand-file-name (it must, by virtue of the Emacs paradigm whereby each buffer "lives" in a different directory), and that's what you see, by and large. ^ permalink raw reply [flat|nested] 213+ messages in thread
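The per-name lookup Eli describes can be inspected directly with `find-file-name-handler`, the function the primitives consult; the results depend on which handlers are currently installed:

```elisp
;; Which handler, if any, claims this name for this operation?
(find-file-name-handler "/ssh:host:/tmp/foo" 'file-exists-p)
;; => tramp-file-name-handler, if Tramp's handlers are installed
(find-file-name-handler "/tmp/foo" 'file-exists-p)
;; => nil for a plain local name (no handler matches)
```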
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 16:38 ` Eli Zaretskii @ 2023-07-21 16:43 ` Ihor Radchenko 2023-07-21 16:43 ` Michael Albinus 1 sibling, 0 replies; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 16:43 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, michael.albinus, 64735, sbaugh Eli Zaretskii <eliz@gnu.org> writes: > Disabling file-name-handlers is the wrong way to solve these > performance problems. We are in agreement here. Note that I am talking about optimization. And Michael proposed to provide a way for disabling only the tramp-related handlers, when it is appropriate. >> I am pretty sure that file name handlers are checked behind the scenes >> by many other common operations. > > I'm pretty sure they aren't. But every file-related primitive calls > expand-file-name (it must, by virtue of the Emacs paradigm whereby > each buffer "lives" in a different directory), and that's what you > see, by and large. The end result is the same - file handlers are searched very frequently any time Emacs file API is used. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 16:38 ` Eli Zaretskii 2023-07-21 16:43 ` Ihor Radchenko @ 2023-07-21 16:43 ` Michael Albinus 2023-07-21 17:45 ` Eli Zaretskii 1 sibling, 1 reply; 213+ messages in thread From: Michael Albinus @ 2023-07-21 16:43 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, Ihor Radchenko, 64735, sbaugh Eli Zaretskii <eliz@gnu.org> writes: Hi Eli, > Disabling file-name-handlers is the wrong way to solve these > performance problems. Does this mean you disagree to install the two forms I have proposed? Although not perfect, they are better than the current status-quo. Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 16:43 ` Michael Albinus @ 2023-07-21 17:45 ` Eli Zaretskii 2023-07-21 17:55 ` Michael Albinus 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-21 17:45 UTC (permalink / raw) To: Michael Albinus; +Cc: dmitry, yantar92, 64735, sbaugh > From: Michael Albinus <michael.albinus@gmx.de> > Cc: Ihor Radchenko <yantar92@posteo.net>, dmitry@gutov.dev, > 64735@debbugs.gnu.org, sbaugh@janestreet.com > Date: Fri, 21 Jul 2023 18:43:52 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Disabling file-name-handlers is the wrong way to solve these > > performance problems. > > Does this mean you disagree to install the two forms I have proposed? > Although not perfect, they are better than the current status-quo. No, I just disagree that those measures should be seen as solutions of the performance problems mentioned here. I don't object to installing the changes, I only hope that work on resolving the performance issues will not stop because they are installed. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 17:45 ` Eli Zaretskii @ 2023-07-21 17:55 ` Michael Albinus 2023-07-21 18:38 ` Eli Zaretskii 2023-07-22 8:17 ` Michael Albinus 0 siblings, 2 replies; 213+ messages in thread From: Michael Albinus @ 2023-07-21 17:55 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, yantar92, 64735, sbaugh Eli Zaretskii <eliz@gnu.org> writes: Hi Eli, >> Does this mean you disagree to install the two forms I have proposed? >> Although not perfect, they are better than the current status-quo. > > No, I just disagree that those measures should be seen as solutions of > the performance problems mentioned here. I don't object to installing > the changes, I only hope that work on resolving the performance issues > will not stop because they are installed. Thanks. I'll install tomorrow. I'm open for any proposal in solving the performance problems. But since I'm living in the file name handler world for many years, I might not be the best source for new ideas. Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 17:55 ` Michael Albinus @ 2023-07-21 18:38 ` Eli Zaretskii 2023-07-21 19:33 ` Spencer Baugh 2023-07-22 8:17 ` Michael Albinus 1 sibling, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-21 18:38 UTC (permalink / raw) To: Michael Albinus; +Cc: dmitry, yantar92, 64735, sbaugh > From: Michael Albinus <michael.albinus@gmx.de> > Cc: yantar92@posteo.net, dmitry@gutov.dev, 64735@debbugs.gnu.org, > sbaugh@janestreet.com > Date: Fri, 21 Jul 2023 19:55:22 +0200 > > I'm open for any proposal in solving the performance problems. But since > I'm living in the file name handler world for many years, I might not be > the best source for new ideas. The first idea that comes to mind is to reimplement directory-files-recursively in C, modeled on how Find does that. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 18:38 ` Eli Zaretskii @ 2023-07-21 19:33 ` Spencer Baugh 2023-07-22 5:27 ` Eli Zaretskii 2023-07-23 2:59 ` Richard Stallman 0 siblings, 2 replies; 213+ messages in thread From: Spencer Baugh @ 2023-07-21 19:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, yantar92, Michael Albinus, Richard Stallman, 64735 Eli Zaretskii <eliz@gnu.org> writes: >> From: Michael Albinus <michael.albinus@gmx.de> >> Cc: yantar92@posteo.net, dmitry@gutov.dev, 64735@debbugs.gnu.org, >> sbaugh@janestreet.com >> Date: Fri, 21 Jul 2023 19:55:22 +0200 >> >> I'm open for any proposal in solving the performance problems. But since >> I'm living in the file name handler world for many years, I might not be >> the best source for new ideas. > > The first idea that comes to mind is to reimplement > directory-files-recursively in C, modeled on how Find does that. If someone was thinking of doing that, they would be better off responding to RMS's earlier request for C programmers to optimize this behavior in find. Since, after all, if we do it that way it will benefit remote files as well. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 19:33 ` Spencer Baugh @ 2023-07-22 5:27 ` Eli Zaretskii 2023-07-22 10:38 ` sbaugh 2023-07-23 2:59 ` Richard Stallman 1 sibling, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-22 5:27 UTC (permalink / raw) To: Spencer Baugh; +Cc: dmitry, yantar92, michael.albinus, rms, 64735 > From: Spencer Baugh <sbaugh@janestreet.com> > Cc: Michael Albinus <michael.albinus@gmx.de>, dmitry@gutov.dev, > yantar92@posteo.net, 64735@debbugs.gnu.org, Richard Stallman > <rms@gnu.org> > Date: Fri, 21 Jul 2023 15:33:13 -0400 > > Eli Zaretskii <eliz@gnu.org> writes: > > The first idea that comes to mind is to reimplement > > directory-files-recursively in C, modeled on how Find does that. > > If someone was thinking of doing that, they would be better off > responding to RMS's earlier request for C programmers to optimize this > behavior in find. No, the first step is to use in Emacs what Find does today, because it will already be a significant speedup. Optimizing the case of a long list of omissions should come later, as it is a minor optimization. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 5:27 ` Eli Zaretskii @ 2023-07-22 10:38 ` sbaugh 2023-07-22 11:58 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: sbaugh @ 2023-07-22 10:38 UTC (permalink / raw) To: Eli Zaretskii Cc: Spencer Baugh, yantar92, rms, dmitry, michael.albinus, 64735 Eli Zaretskii <eliz@gnu.org> writes: >> From: Spencer Baugh <sbaugh@janestreet.com> >> Cc: Michael Albinus <michael.albinus@gmx.de>, dmitry@gutov.dev, >> yantar92@posteo.net, 64735@debbugs.gnu.org, Richard Stallman >> <rms@gnu.org> >> Date: Fri, 21 Jul 2023 15:33:13 -0400 >> >> Eli Zaretskii <eliz@gnu.org> writes: >> > The first idea that comes to mind is to reimplement >> > directory-files-recursively in C, modeled on how Find does that. >> >> If someone was thinking of doing that, they would be better off >> responding to RMS's earlier request for C programmers to optimize this >> behavior in find. > > No, the first step is to use in Emacs what Find does today, because it > will already be a significant speedup. Why bother? directory-files-recursively is a rarely used API, as you have mentioned before in this thread. And there is a way to speed it up which will have a performance boost which is unbeatable any other way: Use find instead of directory-files-recursively, and operate on files as they find prints them. Since this runs the directory traversal in parallel with Emacs, it has a speed advantage that is impossible to match in directory-files-recursively. We can fall back to directory-files-recursively when find is not available. > Optimizing the case of a long > list of omissions should come later, as it is a minor optimization. This seems wrong. directory-files-recursively is rarely used, and rgrep is a very popular command, and this problem with find makes rgrep around ~10x slower by default. How in any world is that a minor optimization? 
Most Emacs users will never realize that they can speed up rgrep massively by setting grep-find-ignored-files to nil. Indeed, no-one realized that until I just pointed it out. In my experience, they just stop using rgrep in favor of other third-party packages like ripgrep, because "grep is slow". ^ permalink raw reply [flat|nested] 213+ messages in thread
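The streaming approach proposed here (acting on each file name as find prints it, rather than collecting the whole listing first) can be sketched with an asynchronous process and a filter function. This is only an illustrative sketch under assumed names, not code from the thread; a real version would also want a process sentinel to detect when find exits and to flush any remaining output:

```elisp
;; Sketch only (assumes lexical-binding): stream `find' output and call
;; CALLBACK on each NUL-terminated file name as soon as it arrives,
;; instead of waiting for the full listing.
(defun my-stream-find-files (dir callback)
  "Run find on DIR, calling CALLBACK with each file name find prints."
  (let ((partial ""))
    (make-process
     :name "stream-find"
     :command (list "find" (expand-file-name dir) "-type" "f" "-print0")
     :connection-type 'pipe
     :filter (lambda (_proc chunk)
               ;; The last element of the split may be an incomplete
               ;; name; keep it until the next chunk completes it.
               (setq partial (concat partial chunk))
               (let ((parts (split-string partial "\0")))
                 (setq partial (car (last parts)))
                 (dolist (name (butlast parts))
                   (funcall callback name)))))))
```

Because the traversal runs in a subprocess, Emacs can process names while find is still walking the tree, which is the parallelism advantage described above.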
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 10:38 ` sbaugh @ 2023-07-22 11:58 ` Eli Zaretskii 2023-07-22 14:14 ` Ihor Radchenko 2023-07-22 17:18 ` sbaugh 0 siblings, 2 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-07-22 11:58 UTC (permalink / raw) To: sbaugh; +Cc: sbaugh, yantar92, rms, dmitry, michael.albinus, 64735 > From: sbaugh@catern.com > Date: Sat, 22 Jul 2023 10:38:37 +0000 (UTC) > Cc: Spencer Baugh <sbaugh@janestreet.com>, dmitry@gutov.dev, > yantar92@posteo.net, michael.albinus@gmx.de, rms@gnu.org, > 64735@debbugs.gnu.org > > Eli Zaretskii <eliz@gnu.org> writes: > > No, the first step is to use in Emacs what Find does today, because it > > will already be a significant speedup. > > Why bother? directory-files-recursively is a rarely used API, as you > have mentioned before in this thread. Because we could then use it much more (assuming the result will be performant enough -- this remains to be seen). > And there is a way to speed it up which will have a performance boost > which is unbeatable any other way: Use find instead of > directory-files-recursively, and operate on files as they find prints > them. Not every command can operate on the output sequentially: some need to see all of the output, others will need to be redesigned and reimplemented to support such sequential mode. Moreover, piping from Find incurs overhead: data is broken into blocks by the pipe or PTY, reading the data can be slowed down if Emacs is busy processing something, etc. So I think a primitive that traverses the tree and produces file names with or without attributes, and can call some callback if needed, still has its place. > Since this runs the directory traversal in parallel with Emacs, it > has a speed advantage that is impossible to match in > directory-files-recursively. See above: you have an optimistic view of what actually happens in the relevant use cases. 
> We can fall back to directory-files-recursively when find is not > available. Find is already available today on many platforms, and we are evidently not happy enough with the results. That is the trigger for this discussion, isn't it? We are talking about ways to improve the performance, and I think having our own primitive that can do it is one such way, or at least it is not clear that it cannot be such a way. > > Optimizing the case of a long > > list of omissions should come later, as it is a minor optimization. > > This seems wrong. directory-files-recursively is rarely used, and rgrep > is a very popular command, and this problem with find makes rgrep around > ~10x slower by default. How in any world is that a minor optimization? > Most Emacs users will never realize that they can speed up rgrep > massively by setting grep-find-ignored-files to nil. Indeed, no-one > realized that until I just pointed it out. In my experience, they just > stop using rgrep in favor of other third-party packages like ripgrep, > because "grep is slow". Making grep-find-ignored-files smaller is independent of this particular issue. If we can make it shorter, we should. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 11:58 ` Eli Zaretskii @ 2023-07-22 14:14 ` Ihor Radchenko 2023-07-22 14:32 ` Eli Zaretskii 2023-07-22 17:18 ` sbaugh 1 sibling, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-22 14:14 UTC (permalink / raw) To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 Eli Zaretskii <eliz@gnu.org> writes: > So I think a primitive that traverses the tree and produces file names > with or without attributes, and can call some callback if needed, > still has its place. Do you mean asynchronous primitive? -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 14:14 ` Ihor Radchenko @ 2023-07-22 14:32 ` Eli Zaretskii 2023-07-22 15:07 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-22 14:32 UTC (permalink / raw) To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev, > michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org > Date: Sat, 22 Jul 2023 14:14:25 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > > So I think a primitive that traverses the tree and produces file names > > with or without attributes, and can call some callback if needed, > > still has its place. > > Do you mean asynchronous primitive? No, a synchronous one. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 14:32 ` Eli Zaretskii @ 2023-07-22 15:07 ` Ihor Radchenko 2023-07-22 15:29 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-22 15:07 UTC (permalink / raw) To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 Eli Zaretskii <eliz@gnu.org> writes: >> From: Ihor Radchenko <yantar92@posteo.net> >> Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev, >> michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org >> Date: Sat, 22 Jul 2023 14:14:25 +0000 >> >> Eli Zaretskii <eliz@gnu.org> writes: >> >> > So I think a primitive that traverses the tree and produces file names >> > with or without attributes, and can call some callback if needed, >> > still has its place. >> >> Do you mean asynchronous primitive? > > No, a synchronous one. Then how will the callback be different from (mapc #'my-function (directory-files-recursively ...)) ? -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 15:07 ` Ihor Radchenko @ 2023-07-22 15:29 ` Eli Zaretskii 2023-07-23 7:52 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-22 15:29 UTC (permalink / raw) To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev, > michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org > Date: Sat, 22 Jul 2023 15:07:45 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> > So I think a primitive that traverses the tree and produces file names > >> > with or without attributes, and can call some callback if needed, > >> > still has its place. > >> > >> Do you mean asynchronous primitive? > > > > No, a synchronous one. > > Then how will the callback be different from > (mapc #'my-function (directory-files-recursively ...)) > ? It depends on the application. Applications that want to get all the data and only after that process it will not use the callback. But I can certainly imagine an application that inserts the file names, or some of their transforms, into a buffer, and from time to time triggers redisplay to show the partial results. Or an application could write the file names to some disk file or external consumer, or send them to a network process. ^ permalink raw reply [flat|nested] 213+ messages in thread
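The buffer-inserting use case Eli mentions could look roughly like this in Lisp terms (all names here are hypothetical; the point is only that a callback can accumulate partial results and refresh the display periodically):

```elisp
;; Sketch (assumes lexical-binding): build a callback that appends each
;; file name to BUFFER and triggers redisplay every 100 names, so the
;; user sees partial results before the traversal finishes.
(defun my-make-display-callback (buffer)
  (let ((count 0))
    (lambda (file)
      (with-current-buffer buffer
        (goto-char (point-max))
        (insert file "\n"))
      (setq count (1+ count))
      (when (zerop (mod count 100))
        (redisplay)))))
```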
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 15:29 ` Eli Zaretskii @ 2023-07-23 7:52 ` Ihor Radchenko 2023-07-23 8:01 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-23 7:52 UTC (permalink / raw) To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 Eli Zaretskii <eliz@gnu.org> writes: >> Then how will the callback be different from >> (mapc #'my-function (directory-files-recursively ...)) >> ? > > It depends on the application. Applications that want to get all the > data and only after that process it will not use the callback. But I > can certainly imagine an application that inserts the file names, or > some of their transforms, into a buffer, and from time to time > triggers redisplay to show the partial results. Or an application > could write the file names to some disk file or external consumer, or > send them to a network process. But won't the Elisp callback always result in a queue that will effectively be synchronous? Also, another idea could be using iterators - the applications can just request "next" file as needed, without waiting for the full file list. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
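The iterator idea can be sketched on top of the existing generator.el library; the function name here is an assumption, and a C-level implementation would presumably avoid generator.el's continuation-passing overhead:

```elisp
(require 'generator)

;; Sketch: the caller pulls one file name at a time with `iter-next',
;; instead of receiving the complete list up front.
(iter-defun my-directory-files-iter (dir)
  "Yield file names under DIR one at a time, depth-first."
  (dolist (file (directory-files dir t directory-files-no-dot-files-regexp))
    (if (file-directory-p file)
        (iter-yield-from (my-directory-files-iter file))
      (iter-yield file))))

;; Usage: (iter-next (my-directory-files-iter "~/src")) returns the
;; first file found; each further `iter-next' resumes the traversal.
```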
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 7:52 ` Ihor Radchenko @ 2023-07-23 8:01 ` Eli Zaretskii 2023-07-23 8:11 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 8:01 UTC (permalink / raw) To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev, > michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org > Date: Sun, 23 Jul 2023 07:52:31 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> Then how will the callback be different from > >> (mapc #'my-function (directory-files-recursively ...)) > >> ? > > > > It depends on the application. Applications that want to get all the > > data and only after that process it will not use the callback. But I > > can certainly imagine an application that inserts the file names, or > > some of their transforms, into a buffer, and from time to time > > triggers redisplay to show the partial results. Or an application > > could write the file names to some disk file or external consumer, or > > send them to a network process. > > But won't the Elisp callback always result in a queue that will > effectively be synchronous? I don't understand the question (what queue?), and understand even less what you are trying to say here. Please elaborate. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 8:01 ` Eli Zaretskii @ 2023-07-23 8:11 ` Ihor Radchenko 2023-07-23 9:11 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-23 8:11 UTC (permalink / raw) To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 Eli Zaretskii <eliz@gnu.org> writes: >> >> Then how will the callback be different from >> >> (mapc #'my-function (directory-files-recursively ...)) >> >> ? >> > >> > It depends on the application. Applications that want to get all the >> > data and only after that process it will not use the callback. But I >> > can certainly imagine an application that inserts the file names, or >> > some of their transforms, into a buffer, and from time to time >> > triggers redisplay to show the partial results. Or an application >> > could write the file names to some disk file or external consumer, or >> > send them to a network process. >> >> But won't the Elisp callback always result in a queue that will >> effectively be synchronous? > > I don't understand the question (what queue?), and understand even > less what you are trying to say here. Please elaborate. Consider (async-directory-files-recursively dir regexp callback) with callback being (lambda (file) (start-process "Copy" nil "cp" file "/tmp/")). `async-directory-files-recursively' may fire CALLBACK very frequently. According to the other benchmarks in this thread, a file from directory may be retrieved within 10E-6s or even less. Elisp will have to arrange the callbacks to run immediately one after other (in a queue). Which will not be very different compared to just running callbacks in a synchronous loop. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. 
Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 8:11 ` Ihor Radchenko @ 2023-07-23 9:11 ` Eli Zaretskii 2023-07-23 9:34 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 9:11 UTC (permalink / raw) To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev, > michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org > Date: Sun, 23 Jul 2023 08:11:56 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> But won't the Elisp callback always result in a queue that will > >> effectively be synchronous? > > > > I don't understand the question (what queue?), and understand even > > less what you are trying to say here. Please elaborate. > > Consider (async-directory-files-recursively dir regexp callback) with > callback being (lambda (file) (start-process "Copy" nil "cp" file "/tmp/")). What is async-directory-files-recursively, and why are we talking about it? I was talking about an implementation of directory-files-recursively as a primitive in C. That's not async code. So I don't understand why we are talking about some hypothetical async implementation. > `async-directory-files-recursively' may fire CALLBACK very frequently. > According to the other benchmarks in this thread, a file from directory > may be retrieved within 10E-6s or even less. Elisp will have to arrange > the callbacks to run immediately one after other (in a queue). > Which will not be very different compared to just running callbacks in a > synchronous loop. Regardless of my confusion above, no one said the callback must necessarily operate on each file as soon as its name was retrieved, nor even that the callback must be called for each file. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 9:11 ` Eli Zaretskii @ 2023-07-23 9:34 ` Ihor Radchenko 2023-07-23 9:39 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-23 9:34 UTC (permalink / raw) To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 Eli Zaretskii <eliz@gnu.org> writes: >> Consider (async-directory-files-recursively dir regexp callback) with >> callback being (lambda (file) (start-process "Copy" nil "cp" file "/tmp/")). > > What is async-directory-files-recursively, and why are we talking > about it? I was talking about an implementation of > directory-files-recursively as a primitive in C. That's not async > code. So I don't understand why we are talking about some > hypothetical async implementation. Then, could you elaborate on how you imagine the proposed callback interface? I clearly did not understand what you had in mind. >> `async-directory-files-recursively' may fire CALLBACK very frequently. >> According to the other benchmarks in this thread, a file from directory >> may be retrieved within 10E-6s or even less. Elisp will have to arrange >> the callbacks to run immediately one after other (in a queue). >> Which will not be very different compared to just running callbacks in a >> synchronous loop. > > Regardless of my confusion above, no one said the callback must > necessarily operate on each file as soon as its name was retrieved, > nor even that the callback must be called for each file. The only callback paradigm I know of in Emacs is something like process sentinels. Do you have something else in mind? -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 9:34 ` Ihor Radchenko @ 2023-07-23 9:39 ` Eli Zaretskii 2023-07-23 9:42 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 9:39 UTC (permalink / raw) To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev, > michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org > Date: Sun, 23 Jul 2023 09:34:20 +0000 > > The only callback paradigm I know of in Emacs is something like process > sentinels. Do you have something else in mind? Think about an API that is passed a function, and calls that function when appropriate, to perform caller-defined processing of the stuff generated by the API's implementation. That function is what I referred to as "callback". ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 9:39 ` Eli Zaretskii @ 2023-07-23 9:42 ` Ihor Radchenko 2023-07-23 10:20 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-23 9:42 UTC (permalink / raw) To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 Eli Zaretskii <eliz@gnu.org> writes: >> The only callback paradigm I know of in Emacs is something like process >> sentinels. Do you have something else in mind? > > Think about an API that is passed a function, and calls that function > when appropriate, to perform caller-defined processing of the stuff > generated by the API's implementation. That function is what I > referred to as "callback". But what is the strategy that should be used to call the CALLBACK? You clearly had something other than "call as soon as we got another file name" in mind. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 9:42 ` Ihor Radchenko @ 2023-07-23 10:20 ` Eli Zaretskii 2023-07-23 11:43 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 10:20 UTC (permalink / raw) To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev, > michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org > Date: Sun, 23 Jul 2023 09:42:45 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Think about an API that is passed a function, and calls that function > > when appropriate, to perform caller-defined processing of the stuff > > generated by the API's implementation. That function is what I > > referred to as "callback". > > But what is the strategy that should be used to call the CALLBACK? > You clearly had something other than "call as soon as we got another > file name" in mind. It could be "call as soon as we got 100 file names", for example. The number can even be a separate parameter passed to the API. ^ permalink raw reply [flat|nested] 213+ messages in thread
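In Lisp terms, the batching strategy Eli describes could look like the sketch below (the name, signature, and default of 100 are assumptions; an actual implementation would be a C primitive doing its own traversal rather than wrapping directory-files-recursively):

```elisp
;; Sketch of a batched-callback traversal: CALLBACK receives lists of
;; up to BATCH-SIZE file names rather than one call per file.
(defun my-walk-directory-batched (dir regexp callback &optional batch-size)
  (let ((batch ())
        (n (or batch-size 100)))
    (dolist (file (directory-files-recursively dir regexp))
      (push file batch)
      (when (>= (length batch) n)
        (funcall callback (nreverse batch))
        (setq batch ())))
    (when batch                        ; flush the final partial batch
      (funcall callback (nreverse batch)))))
```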
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 10:20 ` Eli Zaretskii @ 2023-07-23 11:43 ` Ihor Radchenko 2023-07-23 12:49 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-23 11:43 UTC (permalink / raw) To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 Eli Zaretskii <eliz@gnu.org> writes: >> But what is the strategy that should be used to call the CALLBACK? >> You clearly had something other than "call as soon as we got another >> file name" in mind. > > It could be "call as soon as we got 100 file names", for example. The > number can even be a separate parameter passed to the API. Will consing the filename strings also be delayed until the callback is invoked? -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 11:43 ` Ihor Radchenko @ 2023-07-23 12:49 ` Eli Zaretskii 2023-07-23 12:57 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 12:49 UTC (permalink / raw) To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev, > michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org > Date: Sun, 23 Jul 2023 11:43:22 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> But what is the strategy that should be used to call the CALLBACK? > >> You clearly had something other than "call as soon as we got another > >> file name" in mind. > > > > It could be "call as soon as we got 100 file names", for example. The > > number can even be a separate parameter passed to the API. > > Will consing the filename strings also be delayed until the callback is invoked? No. I don't think it's possible (or desirable). We could keep them in some malloc'ed buffer, of course, but what's the point? This would only be justified if somehow creation of Lisp strings proved to be a terrible bottleneck, which would leave me mightily surprised. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 12:49 ` Eli Zaretskii @ 2023-07-23 12:57 ` Ihor Radchenko 2023-07-23 13:32 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-23 12:57 UTC (permalink / raw) To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 Eli Zaretskii <eliz@gnu.org> writes: >> > It could be "call as soon as we got 100 file names", for example. The >> > number can even be a separate parameter passed to the API. >> >> Will consing the filename strings also be delayed until the callback is invoked? > > No. I don't think it's possible (or desirable). We could keep them > in some malloc'ed buffer, of course, but what's the point? This would > only be justified if somehow creation of Lisp strings proved to be a > terrible bottleneck, which would leave me mightily surprised. Thanks for the clarification! Then, would it make sense to have such a callback API more general? (not just for listing directory files). For example, the callbacks might be attached to a list variable that will accumulate the async results. Then, the callbacks will be called on that list, similar to how process sentinels are called when a chunk of output is arriving to the process buffer. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 12:57 ` Ihor Radchenko @ 2023-07-23 13:32 ` Eli Zaretskii 2023-07-23 13:56 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 13:32 UTC (permalink / raw) To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev, > michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org > Date: Sun, 23 Jul 2023 12:57:53 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> > It could be "call as soon as we got 100 file names", for example. The > >> > number can even be a separate parameter passed to the API. > >> > >> Will consing the filename strings also be delayed until the callback is invoked? > > > > No. I don't think it's possible (or desirable). We could keep them > > in some malloc'ed buffer, of course, but what's the point? This would > > only be justified if somehow creation of Lisp strings proved to be a > > terrible bottleneck, which would leave me mightily surprised. > > Thanks for the clarification! > Then, would it make sense to have such a callback API more general? (not > just for listing directory files). > > For example, the callbacks might be attached to a list variable that > will accumulate the async results. Then, the callbacks will be called on > that list, similar to how process sentinels are called when a chunk of > output is arriving to the process buffer. Anything's possible, but when a function produces text, like file names, then the natural thing is either to return them as strings or to insert them into some buffer. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 13:32 ` Eli Zaretskii @ 2023-07-23 13:56 ` Ihor Radchenko 2023-07-23 14:32 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-23 13:56 UTC (permalink / raw) To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 Eli Zaretskii <eliz@gnu.org> writes: > Anything's possible, but when a function produces text, like file > names, then the natural thing is either to return them as strings or > to insert them into some buffer. Do you mean to re-use process buffer and process API, but for internal asynchronous C functions (rather than sub-processes)? -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 13:56 ` Ihor Radchenko @ 2023-07-23 14:32 ` Eli Zaretskii 0 siblings, 0 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 14:32 UTC (permalink / raw) To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: sbaugh@catern.com, sbaugh@janestreet.com, dmitry@gutov.dev, > michael.albinus@gmx.de, rms@gnu.org, 64735@debbugs.gnu.org > Date: Sun, 23 Jul 2023 13:56:35 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Anything's possible, but when a function produces text, like file > > names, then the natural thing is either to return them as strings or > > to insert them into some buffer. > > Do you mean to re-use process buffer and process API, but for internal > asynchronous C functions (rather than sub-processes)? Not necessarily a process buffer, no. Just some temporary buffer. We already do stuff like that for some C primitives. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 11:58 ` Eli Zaretskii 2023-07-22 14:14 ` Ihor Radchenko @ 2023-07-22 17:18 ` sbaugh 2023-07-22 17:26 ` Ihor Radchenko 2023-07-22 17:46 ` Eli Zaretskii 1 sibling, 2 replies; 213+ messages in thread From: sbaugh @ 2023-07-22 17:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: sbaugh, yantar92, rms, dmitry, michael.albinus, 64735 Eli Zaretskii <eliz@gnu.org> writes: >> From: sbaugh@catern.com >> Date: Sat, 22 Jul 2023 10:38:37 +0000 (UTC) >> Cc: Spencer Baugh <sbaugh@janestreet.com>, dmitry@gutov.dev, >> yantar92@posteo.net, michael.albinus@gmx.de, rms@gnu.org, >> 64735@debbugs.gnu.org >> >> Eli Zaretskii <eliz@gnu.org> writes: >> > No, the first step is to use in Emacs what Find does today, because it >> > will already be a significant speedup. >> >> Why bother? directory-files-recursively is a rarely used API, as you >> have mentioned before in this thread. > > Because we could then use it much more (assuming the result will be > performant enough -- this remains to be seen). > >> And there is a way to speed it up which will have a performance boost >> which is unbeatable any other way: Use find instead of >> directory-files-recursively, and operate on files as they find prints >> them. > > Not every command can operate on the output sequentially: some need to > see all of the output, others will need to be redesigned and > reimplemented to support such sequential mode. > > Moreover, piping from Find incurs overhead: data is broken into blocks > by the pipe or PTY, reading the data can be slowed down if Emacs is > busy processing something, etc. I went ahead and implemented it, and I get a 2x speedup even *without* running find in parallel with Emacs. First my results: (my-bench 100 "~/public_html" "") (("built-in" . "Elapsed time: 1.140173s (0.389344s in 5 GCs)") ("with-find" . "Elapsed time: 0.643306s (0.305130s in 4 GCs)")) (my-bench 10 "~/.local/src/linux" "") (("built-in" . 
"Elapsed time: 2.402341s (0.937857s in 11 GCs)") ("with-find" . "Elapsed time: 1.544024s (0.827364s in 10 GCs)")) (my-bench 100 "/ssh:catern.com:~/public_html" "") (("built-in" . "Elapsed time: 36.494233s (6.450840s in 79 GCs)") ("with-find" . "Elapsed time: 4.619035s (1.133656s in 14 GCs)")) 2x speedup on local files, and almost a 10x speedup for remote files. And my implementation *isn't even using the fact that find can run in parallel with Emacs*. If I did start using that, I expect even more speed gains from parallelism, which aren't achievable in Emacs itself. So can we add something like this (with the appropriate fallbacks to directory-files-recursively), since it has such a big speedup even without parallelism? My implementation and benchmarking: (defun find-directory-files-recursively (dir regexp &optional include-directories _predicate follow-symlinks) (cl-assert (null _predicate) t "find-directory-files-recursively can't accept arbitrary predicates") (with-temp-buffer (setq case-fold-search nil) (cd dir) (let* ((command (append (list "find" (file-local-name dir)) (if follow-symlinks '("-L") '("!" "(" "-type" "l" "-xtype" "d" ")")) (unless (string-empty-p regexp) "-regex" (concat ".*" regexp ".*")) (unless include-directories '("!" 
"-type" "d")) '("-print0") )) (remote (file-remote-p dir)) (proc (if remote (let ((proc (apply #'start-file-process "find" (current-buffer) command))) (set-process-sentinel proc (lambda (_proc _state))) (set-process-query-on-exit-flag proc nil) proc) (make-process :name "find" :buffer (current-buffer) :connection-type 'pipe :noquery t :sentinel (lambda (_proc _state)) :command command)))) (while (accept-process-output proc)) (let ((start (goto-char (point-min))) ret) (while (search-forward "\0" nil t) (push (concat remote (buffer-substring-no-properties start (1- (point)))) ret) (setq start (point))) ret)))) (defun my-bench (count path regexp) (setq path (expand-file-name path)) (let ((old (directory-files-recursively path regexp)) (new (find-directory-files-recursively path regexp))) (dolist (path old) (should (member path new))) (dolist (path new) (should (member path old)))) (list (cons "built-in" (benchmark count (list 'directory-files-recursively path regexp))) (cons "with-find" (benchmark count (list 'find-directory-files-recursively path regexp))))) ^ permalink raw reply [flat|nested] 213+ messages in thread
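[Editorial sketch, not part of the patch above: the command line that the Lisp assembles can be tried directly from a shell. The tree built here is hypothetical; the predicates mirror the quoted code — keep symlinks unless they resolve to directories, drop directories themselves, NUL-terminate the output as `-print0` does.]

```shell
set -e
# Build a small throwaway tree to run find against.
tmp=$(mktemp -d)
mkdir -p "$tmp/sub"
touch "$tmp/a.txt" "$tmp/sub/b.txt"
ln -s missing "$tmp/dangling"   # dangling symlink, kept by the filter

# Equivalent of the command the Lisp assembles (GNU find):
# skip symlinks that point at directories, drop directories,
# terminate each name with NUL, then split on NUL for display.
find "$tmp" '!' '(' -type l -xtype d ')' '!' -type d -print0 |
  tr '\0' '\n' | sort

rm -rf "$tmp"
```

Parsing on NUL rather than newline is what lets the Lisp filter handle file names containing newlines.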
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 17:18 ` sbaugh @ 2023-07-22 17:26 ` Ihor Radchenko 2023-07-22 17:46 ` Eli Zaretskii 1 sibling, 0 replies; 213+ messages in thread From: Ihor Radchenko @ 2023-07-22 17:26 UTC (permalink / raw) To: sbaugh; +Cc: sbaugh, rms, dmitry, michael.albinus, Eli Zaretskii, 64735 sbaugh@catern.com writes: > I went ahead and implemented it, and I get a 2x speedup even *without* > running find in parallel with Emacs. > > First my results: > > (my-bench 100 "~/public_html" "") > (("built-in" . "Elapsed time: 1.140173s (0.389344s in 5 GCs)") > ("with-find" . "Elapsed time: 0.643306s (0.305130s in 4 GCs)")) > > (my-bench 10 "~/.local/src/linux" "") > (("built-in" . "Elapsed time: 2.402341s (0.937857s in 11 GCs)") > ("with-find" . "Elapsed time: 1.544024s (0.827364s in 10 GCs)")) What about without `file-name-handler-alist'? > (my-bench 100 "/ssh:catern.com:~/public_html" "") > (("built-in" . "Elapsed time: 36.494233s (6.450840s in 79 GCs)") > ("with-find" . "Elapsed time: 4.619035s (1.133656s in 14 GCs)")) This is indeed expected. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 17:18 ` sbaugh 2023-07-22 17:26 ` Ihor Radchenko @ 2023-07-22 17:46 ` Eli Zaretskii 2023-07-22 18:31 ` Eli Zaretskii 2023-07-22 20:53 ` Spencer Baugh 1 sibling, 2 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-07-22 17:46 UTC (permalink / raw) To: sbaugh; +Cc: sbaugh, yantar92, rms, dmitry, michael.albinus, 64735 > From: sbaugh@catern.com > Date: Sat, 22 Jul 2023 17:18:19 +0000 (UTC) > Cc: sbaugh@janestreet.com, yantar92@posteo.net, rms@gnu.org, dmitry@gutov.dev, > michael.albinus@gmx.de, 64735@debbugs.gnu.org > > First my results: > > (my-bench 100 "~/public_html" "") > (("built-in" . "Elapsed time: 1.140173s (0.389344s in 5 GCs)") > ("with-find" . "Elapsed time: 0.643306s (0.305130s in 4 GCs)")) > > (my-bench 10 "~/.local/src/linux" "") > (("built-in" . "Elapsed time: 2.402341s (0.937857s in 11 GCs)") > ("with-find" . "Elapsed time: 1.544024s (0.827364s in 10 GCs)")) > > (my-bench 100 "/ssh:catern.com:~/public_html" "") > (("built-in" . "Elapsed time: 36.494233s (6.450840s in 79 GCs)") > ("with-find" . "Elapsed time: 4.619035s (1.133656s in 14 GCs)")) > > 2x speedup on local files, and almost a 10x speedup for remote files. Thanks, that's impressive. But you omitted some of the features of directory-files-recursively, see below. > And my implementation *isn't even using the fact that find can run in > parallel with Emacs*. If I did start using that, I expect even more > speed gains from parallelism, which aren't achievable in Emacs itself. I'm not sure I understand what you mean by "in parallel" and why it would be faster. > So can we add something like this (with the appropriate fallbacks to > directory-files-recursively), since it has such a big speedup even > without parallelism? We can have an alternative implementation, yes. 
But it should support predicate, and it should sort the files in each directory like directory-files-recursively does, so that it's a drop-in replacement. Also, I believe that Find does return "." in each directory, and your implementation doesn't filter them, whereas directory-files-recursively does AFAIR. And I see no need for any fallback: that's for the application to do if it wants. > (cl-assert (null _predicate) t "find-directory-files-recursively can't accept arbitrary predicates") It should. > (if follow-symlinks > '("-L") > '("!" "(" "-type" "l" "-xtype" "d" ")")) > (unless (string-empty-p regexp) > "-regex" (concat ".*" regexp ".*")) > (unless include-directories > '("!" "-type" "d")) > '("-print0") Some of these switches are specific to GNU Find. Are we going to support only GNU Find? > )) > (remote (file-remote-p dir)) > (proc > (if remote > (let ((proc (apply #'start-file-process > "find" (current-buffer) command))) > (set-process-sentinel proc (lambda (_proc _state))) > (set-process-query-on-exit-flag proc nil) > proc) > (make-process :name "find" :buffer (current-buffer) > :connection-type 'pipe > :noquery t > :sentinel (lambda (_proc _state)) > :command command)))) > (while (accept-process-output proc)) Why do you call accept-process-output here? it could interfere with reading output from async subprocesses running at the same time. To come think of this, why use async subprocesses here and not call-process? ^ permalink raw reply [flat|nested] 213+ messages in thread
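[Editorial sketch of the portability point raised here: `-regex`, `-xtype` and (strictly speaking) `-print0` are GNU extensions. A walk restricted to POSIX predicates has to filter names with `-name` globs and newline-terminated output instead; the tree below is hypothetical.]

```shell
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/sub"
touch "$tmp/a.el" "$tmp/sub/b.el" "$tmp/sub/c.txt"

# POSIX-only find: no -regex, so match names with a glob via -name,
# and use plain -print (newline-terminated) instead of -print0.
find "$tmp" '!' -type d -name '*.el' -print | sort

rm -rf "$tmp"
```

This is why limiting `find-directory-files-recursively` to globs would buy portability at the price of incompatibility with the regexp argument of `directory-files-recursively`.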
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 17:46 ` Eli Zaretskii @ 2023-07-22 18:31 ` Eli Zaretskii 2023-07-22 19:06 ` Eli Zaretskii 2023-07-22 20:53 ` Spencer Baugh 1 sibling, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-22 18:31 UTC (permalink / raw) To: sbaugh, sbaugh; +Cc: dmitry, yantar92, michael.albinus, rms, 64735 > Cc: sbaugh@janestreet.com, yantar92@posteo.net, rms@gnu.org, dmitry@gutov.dev, > michael.albinus@gmx.de, 64735@debbugs.gnu.org > Date: Sat, 22 Jul 2023 20:46:01 +0300 > From: Eli Zaretskii <eliz@gnu.org> > > > First my results: > > > > (my-bench 100 "~/public_html" "") > > (("built-in" . "Elapsed time: 1.140173s (0.389344s in 5 GCs)") > > ("with-find" . "Elapsed time: 0.643306s (0.305130s in 4 GCs)")) > > > > (my-bench 10 "~/.local/src/linux" "") > > (("built-in" . "Elapsed time: 2.402341s (0.937857s in 11 GCs)") > > ("with-find" . "Elapsed time: 1.544024s (0.827364s in 10 GCs)")) > > > > (my-bench 100 "/ssh:catern.com:~/public_html" "") > > (("built-in" . "Elapsed time: 36.494233s (6.450840s in 79 GCs)") > > ("with-find" . "Elapsed time: 4.619035s (1.133656s in 14 GCs)")) > > > > 2x speedup on local files, and almost a 10x speedup for remote files. > > Thanks, that's impressive. But you omitted some of the features of > directory-files-recursively, see below. My results on MS-Windows are less encouraging: (my-bench 2 "d:/usr/archive" "") (("built-in" . "Elapsed time: 1.250000s (0.093750s in 5 GCs)") ("with-find" . "Elapsed time: 8.578125s (0.109375s in 7 GCs)")) D:/usr/archive is a directory with 372 subdirectories and more than 12000 files in all of them. The disk is SSD, in case it matters, and I measured this with a warm disk cache. So I guess whether or not to use this depends on the underlying system. Btw, you should not assume that "-type l" will universally work: at least on MS-Windows some ports of GNU Find will barf when they see it. 
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 18:31 ` Eli Zaretskii @ 2023-07-22 19:06 ` Eli Zaretskii 0 siblings, 0 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-07-22 19:06 UTC (permalink / raw) To: sbaugh, sbaugh; +Cc: dmitry, yantar92, michael.albinus, rms, 64735 > Cc: dmitry@gutov.dev, yantar92@posteo.net, michael.albinus@gmx.de, rms@gnu.org, > 64735@debbugs.gnu.org > Date: Sat, 22 Jul 2023 21:31:14 +0300 > From: Eli Zaretskii <eliz@gnu.org> > > My results on MS-Windows are less encouraging: > > (my-bench 2 "d:/usr/archive" "") > (("built-in" . "Elapsed time: 1.250000s (0.093750s in 5 GCs)") > ("with-find" . "Elapsed time: 8.578125s (0.109375s in 7 GCs)")) And here's from a GNU/Linux machine, which is probably not very fast: (my-bench 10 "/usr/lib" "") (("built-in" . "Elapsed time: 4.410613s (2.077311s in 56 GCs)") ("with-find" . "Elapsed time: 3.326954s (1.997251s in 54 GCs)")) Faster, but not by a lot. On this system /usr/lib has 18000 files in 1860 subdirectories. Btw, the Find command with pipe to some other program, like wc, finishes much faster, like 2 to 4 times faster than when it is run from find-directory-files-recursively. That's probably the slowdown due to communications with async subprocesses in action. ^ permalink raw reply [flat|nested] 213+ messages in thread
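[Editorial sketch of the measurement described above: piping Find into a fast consumer such as `wc` isolates the traversal cost from the Emacs-side cost of reading and converting the output. The directory is hypothetical.]

```shell
# Time the raw traversal with a cheap pipe reader; compare this
# against the elapsed time of find-directory-files-recursively on
# the same tree to estimate the async-subprocess overhead.
time find /usr/lib -print | wc -l
```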
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 17:46 ` Eli Zaretskii 2023-07-22 18:31 ` Eli Zaretskii @ 2023-07-22 20:53 ` Spencer Baugh 2023-07-23 6:15 ` Eli Zaretskii ` (2 more replies) 1 sibling, 3 replies; 213+ messages in thread From: Spencer Baugh @ 2023-07-22 20:53 UTC (permalink / raw) To: Eli Zaretskii; +Cc: yantar92, rms, sbaugh, dmitry, michael.albinus, 64735 Eli Zaretskii <eliz@gnu.org> writes: >> From: sbaugh@catern.com >> Date: Sat, 22 Jul 2023 17:18:19 +0000 (UTC) >> Cc: sbaugh@janestreet.com, yantar92@posteo.net, rms@gnu.org, dmitry@gutov.dev, >> michael.albinus@gmx.de, 64735@debbugs.gnu.org >> >> First my results: >> >> (my-bench 100 "~/public_html" "") >> (("built-in" . "Elapsed time: 1.140173s (0.389344s in 5 GCs)") >> ("with-find" . "Elapsed time: 0.643306s (0.305130s in 4 GCs)")) >> >> (my-bench 10 "~/.local/src/linux" "") >> (("built-in" . "Elapsed time: 2.402341s (0.937857s in 11 GCs)") >> ("with-find" . "Elapsed time: 1.544024s (0.827364s in 10 GCs)")) >> >> (my-bench 100 "/ssh:catern.com:~/public_html" "") >> (("built-in" . "Elapsed time: 36.494233s (6.450840s in 79 GCs)") >> ("with-find" . "Elapsed time: 4.619035s (1.133656s in 14 GCs)")) >> >> 2x speedup on local files, and almost a 10x speedup for remote files. > > Thanks, that's impressive. But you omitted some of the features of > directory-files-recursively, see below. > >> And my implementation *isn't even using the fact that find can run in >> parallel with Emacs*. If I did start using that, I expect even more >> speed gains from parallelism, which aren't achievable in Emacs itself. > > I'm not sure I understand what you mean by "in parallel" and why it > would be faster. I mean having Emacs read output from the process and turn them into strings while find is still running and walking the directory tree. So the two parts are running in parallel. 
This, specifically: (defun find-directory-files-recursively (dir regexp &optional include-directories _predicate follow-symlinks) (cl-assert (null _predicate) t "find-directory-files-recursively can't accept arbitrary predicates") (cl-assert (not (file-remote-p dir))) (let* (buffered result (proc (make-process :name "find" :buffer nil :connection-type 'pipe :noquery t :sentinel (lambda (_proc _state)) :filter (lambda (proc data) (let ((start 0)) (when-let (end (string-search "\0" data start)) (push (concat buffered (substring data start end)) result) (setq buffered "") (setq start (1+ end)) (while-let ((end (string-search "\0" data start))) (push (substring data start end) result) (setq start (1+ end)))) (setq buffered (concat buffered (substring data start))))) :command (append (list "find" (file-local-name dir)) (if follow-symlinks '("-L") '("!" "(" "-type" "l" "-xtype" "d" ")")) (unless (string-empty-p regexp) "-regex" (concat ".*" regexp ".*")) (unless include-directories '("!" "-type" "d")) '("-print0") )))) (while (accept-process-output proc)) result)) Can you try this further change on your Windows (and GNU/Linux) box? I just tested on a different box and my original change gets: (("built-in" . "Elapsed time: 4.506643s (2.276269s in 21 GCs)") ("with-find" . "Elapsed time: 4.114531s (2.848497s in 27 GCs)")) while this parallel implementation gets (("built-in" . "Elapsed time: 4.479185s (2.236561s in 21 GCs)") ("with-find" . "Elapsed time: 2.858452s (1.934647s in 19 GCs)")) so it might have a favorable impact on Windows and your other GNU/Linux box. >> So can we add something like this (with the appropriate fallbacks to >> directory-files-recursively), since it has such a big speedup even >> without parallelism? > > We can have an alternative implementation, yes. But it should support > predicate, and it should sort the files in each directory like > directory-files-recursively does, so that it's a drop-in replacement. 
> Also, I believe that Find does return "." in each directory, and your > implementation doesn't filter them, whereas > directory-files-recursively does AFAIR. > > And I see no need for any fallback: that's for the application to do > if it wants. > >> (cl-assert (null _predicate) t "find-directory-files-recursively can't accept arbitrary predicates") > > It should. This is where I think a fallback would be useful - it's basically impossible to support arbitrary predicates efficiently here, since it requires us to put Lisp in control of whether find descends into a directory. So I'm thinking I would just fall back to running the old directory-files-recursively whenever there's a predicate. Or just not supporting this at all... >> (if follow-symlinks >> '("-L") >> '("!" "(" "-type" "l" "-xtype" "d" ")")) >> (unless (string-empty-p regexp) >> "-regex" (concat ".*" regexp ".*")) >> (unless include-directories >> '("!" "-type" "d")) >> '("-print0") > > Some of these switches are specific to GNU Find. Are we going to > support only GNU Find? POSIX find doesn't support -regex, so I think we have to. We could stick to just POSIX find if we only allowed globs in find-directory-files-recursively, instead of full regexes. >> )) >> (remote (file-remote-p dir)) >> (proc >> (if remote >> (let ((proc (apply #'start-file-process >> "find" (current-buffer) command))) >> (set-process-sentinel proc (lambda (_proc _state))) >> (set-process-query-on-exit-flag proc nil) >> proc) >> (make-process :name "find" :buffer (current-buffer) >> :connection-type 'pipe >> :noquery t >> :sentinel (lambda (_proc _state)) >> :command command)))) >> (while (accept-process-output proc)) > > Why do you call accept-process-output here? it could interfere with > reading output from async subprocesses running at the same time. To > come think of this, why use async subprocesses here and not > call-process? See my new iteration which does use the async-ness. 
^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 20:53 ` Spencer Baugh @ 2023-07-23 6:15 ` Eli Zaretskii 2023-07-23 7:48 ` Ihor Radchenko 2023-07-23 11:44 ` Michael Albinus 2 siblings, 0 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 6:15 UTC (permalink / raw) To: Spencer Baugh; +Cc: yantar92, rms, sbaugh, dmitry, michael.albinus, 64735 > From: Spencer Baugh <sbaugh@janestreet.com> > Cc: sbaugh@catern.com, yantar92@posteo.net, rms@gnu.org, > dmitry@gutov.dev, michael.albinus@gmx.de, 64735@debbugs.gnu.org > Date: Sat, 22 Jul 2023 16:53:05 -0400 > > Can you try this further change on your Windows (and GNU/Linux) box? I > just tested on a different box and my original change gets: > > (("built-in" . "Elapsed time: 4.506643s (2.276269s in 21 GCs)") > ("with-find" . "Elapsed time: 4.114531s (2.848497s in 27 GCs)")) > > while this parallel implementation gets > > (("built-in" . "Elapsed time: 4.479185s (2.236561s in 21 GCs)") > ("with-find" . "Elapsed time: 2.858452s (1.934647s in 19 GCs)")) > > so it might have a favorable impact on Windows and your other GNU/Linux > box. Almost no effect here on MS-Windows: (("built-in" . "Elapsed time: 0.859375s (0.093750s in 4 GCs)") ("with-find" . "Elapsed time: 8.437500s (0.078125s in 4 GCs)")) It was 8.578 sec with the previous version. (The Lisp version is somewhat faster in this test because I native-compiled the code for this test.) On GNU/Linux: (("built-in" . "Elapsed time: 4.244898s (1.934182s in 56 GCs)") ("with-find" . "Elapsed time: 3.011574s (1.190498s in 35 GCs)")) Faster by 10% (previous version yielded 3.327 sec). Btw, I needed to fix the code: when-let needs 2 open parens after it, not one. The original code signals an error from the filter function in Emacs 29. > >> (cl-assert (null _predicate) t "find-directory-files-recursively can't accept arbitrary predicates") > > > > It should. 
> > This is where I think a fallback would be useful - it's basically > impossible to support arbitrary predicates efficiently here, since it > requires us to put Lisp in control of whether find descends into a > directory. There's nothing wrong with supporting this less efficiently. And there's no need to control where Find descends: you could just filter out the files from those directories that need to be ignored. > So I'm thinking I would just fall back to running the old > directory-files-recursively whenever there's a predicate. Or just not > supporting this at all... We cannot not support it at all, because then it will not be a replacement. Fallback is okay, though I'd prefer a self-contained function. > >> (if follow-symlinks > >> '("-L") > >> '("!" "(" "-type" "l" "-xtype" "d" ")")) > >> (unless (string-empty-p regexp) > >> "-regex" (concat ".*" regexp ".*")) > >> (unless include-directories > >> '("!" "-type" "d")) > >> '("-print0") > > > > Some of these switches are specific to GNU Find. Are we going to > > support only GNU Find? > > POSIX find doesn't support -regex, so I think we have to. We could > stick to just POSIX find if we only allowed globs in > find-directory-files-recursively, instead of full regexes. The latter would again be incompatible with directory-files-recursively, so it isn't TRT, IMO. One other subtlety is non-ASCII file names: you use -print0 switch to Find, which produces null bytes, and those could inhibit decoding of non-ASCII characters. So you may need to bind inhibit-null-byte-detection to a non-nil value to get correctly decoded file names you get from Find. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 20:53 ` Spencer Baugh 2023-07-23 6:15 ` Eli Zaretskii @ 2023-07-23 7:48 ` Ihor Radchenko 2023-07-23 8:06 ` Eli Zaretskii 2023-07-23 11:44 ` Michael Albinus 2 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-23 7:48 UTC (permalink / raw) To: Spencer Baugh; +Cc: rms, sbaugh, dmitry, michael.albinus, Eli Zaretskii, 64735 Spencer Baugh <sbaugh@janestreet.com> writes: > Can you try this further change on your Windows (and GNU/Linux) box? I > just tested on a different box and my original change gets: On GNU/Linux, with slight modifications (defun my-bench (count path regexp) (setq path (expand-file-name path)) ;; (let ((old (directory-files-recursively path regexp)) ;; (new (find-directory-files-recursively path regexp))) ;; (dolist (path old) ;; (should (member path new))) ;; (dolist (path new) ;; (should (member path old)))) (list (cons "built-in" (benchmark count (list 'directory-files-recursively path regexp))) (cons "built-in no handlers" (let (file-name-handler-alist) (benchmark count (list 'directory-files-recursively path regexp)))) (cons "with-find" (benchmark count (list 'find-directory-files-recursively path regexp))))) (my-bench 10 "/usr/src/linux/" "") (("built-in" . "Elapsed time: 7.134589s (3.609741s in 10 GCs)") ("built-in no handlers" . "Elapsed time: 6.041666s (3.856730s in 11 GCs)") ("with-find" . "Elapsed time: 6.300330s (4.248508s in 12 GCs)")) -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 7:48 ` Ihor Radchenko @ 2023-07-23 8:06 ` Eli Zaretskii 2023-07-23 8:16 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 8:06 UTC (permalink / raw) To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: Eli Zaretskii <eliz@gnu.org>, sbaugh@catern.com, rms@gnu.org, > dmitry@gutov.dev, michael.albinus@gmx.de, 64735@debbugs.gnu.org > Date: Sun, 23 Jul 2023 07:48:45 +0000 > > (my-bench 10 "/usr/src/linux/" "") > > (("built-in" . "Elapsed time: 7.134589s (3.609741s in 10 GCs)") > ("built-in no handlers" . "Elapsed time: 6.041666s (3.856730s in 11 GCs)") > ("with-find" . "Elapsed time: 6.300330s (4.248508s in 12 GCs)")) Is this in "emacs -Q"? Why so much time taken by GC? It indicates that temporarily raising the GC thresholds could speed up things by a factor of 2 or 3. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 8:06 ` Eli Zaretskii @ 2023-07-23 8:16 ` Ihor Radchenko 2023-07-23 9:13 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-23 8:16 UTC (permalink / raw) To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 Eli Zaretskii <eliz@gnu.org> writes: >> (("built-in" . "Elapsed time: 7.134589s (3.609741s in 10 GCs)") >> ("built-in no handlers" . "Elapsed time: 6.041666s (3.856730s in 11 GCs)") >> ("with-find" . "Elapsed time: 6.300330s (4.248508s in 12 GCs)")) > > Is this in "emacs -Q"? Why so much time taken by GC? It indicates > that temporarily raising the GC thresholds could speed up things by a > factor of 2 or 3. With emacs -Q, the results are similar in terms of absolute time spent doing GC: (("built-in" . "Elapsed time: 5.706795s (3.332933s in 304 GCs)") ("built-in no handlers" . "Elapsed time: 4.535871s (3.161111s in 301 GCs)") ("with-find" . "Elapsed time: 4.829426s (3.333890s in 274 GCs)")) -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 8:16 ` Ihor Radchenko @ 2023-07-23 9:13 ` Eli Zaretskii 2023-07-23 9:16 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 9:13 UTC (permalink / raw) To: Ihor Radchenko; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: sbaugh@janestreet.com, sbaugh@catern.com, rms@gnu.org, dmitry@gutov.dev, > michael.albinus@gmx.de, 64735@debbugs.gnu.org > Date: Sun, 23 Jul 2023 08:16:05 +0000 > > With emacs -Q, the results are similar in terms of absolute time spent > doing GC: > > (("built-in" . "Elapsed time: 5.706795s (3.332933s in 304 GCs)") > ("built-in no handlers" . "Elapsed time: 4.535871s (3.161111s in 301 GCs)") > ("with-find" . "Elapsed time: 4.829426s (3.333890s in 274 GCs)")) Strange. On my system, GC takes about 8% of the run time. Maybe it's a function of how many files are retrieved? ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 9:13 ` Eli Zaretskii @ 2023-07-23 9:16 ` Ihor Radchenko 0 siblings, 0 replies; 213+ messages in thread From: Ihor Radchenko @ 2023-07-23 9:16 UTC (permalink / raw) To: Eli Zaretskii; +Cc: sbaugh, rms, sbaugh, dmitry, michael.albinus, 64735 Eli Zaretskii <eliz@gnu.org> writes: >> With emacs -Q, the results are similar in terms of absolute time spent >> doing GC: >> >> (("built-in" . "Elapsed time: 5.706795s (3.332933s in 304 GCs)") >> ("built-in no handlers" . "Elapsed time: 4.535871s (3.161111s in 301 GCs)") >> ("with-find" . "Elapsed time: 4.829426s (3.333890s in 274 GCs)")) > > Strange. On my system, GC takes about 8% of the run time. Maybe it's > a function of how many files are retrieved? Most likely. (length (directory-files-recursively "/usr/src/linux/" "")) ; => 145489 My test is producing a very long list of files. 10 times for each test and for each function variant. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 20:53 ` Spencer Baugh 2023-07-23 6:15 ` Eli Zaretskii 2023-07-23 7:48 ` Ihor Radchenko @ 2023-07-23 11:44 ` Michael Albinus 2 siblings, 0 replies; 213+ messages in thread From: Michael Albinus @ 2023-07-23 11:44 UTC (permalink / raw) To: Spencer Baugh; +Cc: yantar92, rms, sbaugh, dmitry, Eli Zaretskii, 64735 Spencer Baugh <sbaugh@janestreet.com> writes: Hi Spencer, > I mean having Emacs read output from the process and turn them into > strings while find is still running and walking the directory tree. So > the two parts are running in parallel. This, specifically: Just as POC, I have modified your function slightly that it runs with both local and remote directories. --8<---------------cut here---------------start------------->8--- (defun find-directory-files-recursively (dir regexp &optional include-directories _predicate follow-symlinks) (let* (buffered result (remote (file-remote-p dir)) (file-name-handler-alist (and remote file-name-handler-alist)) (proc (make-process :name "find" :buffer nil :connection-type 'pipe :noquery t :sentinel #'ignore :file-handler remote :filter (lambda (proc data) (let ((start 0)) (when-let ((end (string-search "\0" data start))) (push (concat buffered (substring data start end)) result) (setq buffered "") (setq start (1+ end)) (while-let ((end (string-search "\0" data start))) (push (substring data start end) result) (setq start (1+ end)))) (setq buffered (concat buffered (substring data start))))) :command (append (list "find" (file-local-name dir)) (if follow-symlinks '("-L") '("!" "(" "-type" "l" "-xtype" "d" ")")) (unless (string-empty-p regexp) "-regex" (concat ".*" regexp ".*")) (unless include-directories '("!" 
"-type" "d")) '("-print0") )))) (while (accept-process-output proc)) (if remote (mapcar (lambda (file) (concat remote file)) result) result))) --8<---------------cut here---------------end--------------->8--- This returns on my laptop --8<---------------cut here---------------start------------->8--- (my-bench 100 "~/src/tramp" "") (("built-in" . "Elapsed time: 99.177562s (3.403403s in 107 GCs)") ("with-find" . "Elapsed time: 83.432360s (2.820053s in 98 GCs)")) (my-bench 100 "/ssh:remotehost:~/src/tramp" "") (("built-in" . "Elapsed time: 128.406359s (34.981183s in 1850 GCs)") ("with-find" . "Elapsed time: 82.765064s (4.155410s in 163 GCs)")) --8<---------------cut here---------------end--------------->8--- Of course the other problems still remain. For example, you cannot know whether on a given host (local or remote) find supports all arguments. On my NAS, for example, we have --8<---------------cut here---------------start------------->8--- [~] # find -h BusyBox v1.01 (2022.10.27-23:57+0000) multi-call binary Usage: find [PATH...] [EXPRESSION] Search for files in a directory hierarchy. The default PATH is the current directory; default EXPRESSION is '-print' EXPRESSION may consist of: -follow Dereference symbolic links. -name PATTERN File name (leading directories removed) matches PATTERN. -print Print (default and assumed). -type X Filetype matches X (where X is one of: f,d,l,b,c,...) -perm PERMS Permissions match any of (+NNN); all of (-NNN); or exactly (NNN) -mtime TIME Modified time is greater than (+N); less than (-N); or exactly (N) days --8<---------------cut here---------------end--------------->8--- Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 19:33 ` Spencer Baugh 2023-07-22 5:27 ` Eli Zaretskii @ 2023-07-23 2:59 ` Richard Stallman 2023-07-23 5:28 ` Eli Zaretskii 1 sibling, 1 reply; 213+ messages in thread From: Richard Stallman @ 2023-07-23 2:59 UTC (permalink / raw) To: Spencer Baugh; +Cc: 64735 [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > If someone was thinking of doing that, they would be better off > responding to RMS's earlier request for C programmers to optimize this > behavior in find. > Since, after all, if we do it that way it will benefit remote files as > well. I wonder if some different way of specifying what to ignore might make a faster implementation possible. Regexps are general but matching them tends to be slow. Maybe some less general pattern matching could be sufficient for these features while faster. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 2:59 ` Richard Stallman @ 2023-07-23 5:28 ` Eli Zaretskii 0 siblings, 0 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 5:28 UTC (permalink / raw) To: rms; +Cc: sbaugh, 64735 > Cc: 64735@debbugs.gnu.org > From: Richard Stallman <rms@gnu.org> > Date: Sat, 22 Jul 2023 22:59:02 -0400 > > > If someone was thinking of doing that, they would be better off > > responding to RMS's earlier request for C programmers to optimize this > > behavior in find. > > > Since, after all, if we do it that way it will benefit remote files as > > well. > > I wonder if some different way of specifying what to ignore might make > a faster implementation possible. Regexps are general but matching > them tends to be slow. Maybe some less general pattern matching could > be sufficient for these features while faster. You are thinking about matching in Find, or about matching in Emacs? If the former, they can probably use 'fnmatch' or somesuch, to match against shell wildcards. If the latter, we don't have any pattern matching capabilities in Emacs except fixed strings and regexps. ^ permalink raw reply [flat|nested] 213+ messages in thread
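[Editorial note: for the Emacs side of Eli's question, shell-style wildcards can at least be handled by translating them into anchored regexps with `wildcard-to-regexp' from files.el. A minimal sketch, for illustration only:]

```elisp
;; `wildcard-to-regexp' is a real function from files.el; it anchors
;; the resulting regexp, so the wildcard must match the whole name.
(let ((rx (wildcard-to-regexp "*.elc")))
  (list (string-match-p rx "foo.elc")    ; non-nil: matches
        (string-match-p rx "foo.el")))   ; nil: does not match
```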
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 17:55 ` Michael Albinus 2023-07-21 18:38 ` Eli Zaretskii @ 2023-07-22 8:17 ` Michael Albinus 1 sibling, 0 replies; 213+ messages in thread From: Michael Albinus @ 2023-07-22 8:17 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, yantar92, 64735, sbaugh Michael Albinus <michael.albinus@gmx.de> writes: >> No, I just disagree that those measures should be seen as solutions of >> the performance problems mentioned here. I don't object to installing >> the changes, I only hope that work on resolving the performance issues >> will not stop because they are installed. > > Thanks. I'll install tomorrow. Pushed to master. Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 12:46 ` Eli Zaretskii 2023-07-21 13:01 ` Michael Albinus @ 2023-07-21 13:17 ` Ihor Radchenko 1 sibling, 0 replies; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 13:17 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, michael.albinus, 64735, sbaugh Eli Zaretskii <eliz@gnu.org> writes: >> > The question is: what is more costly >> > (a) matching complex regexp && call function or >> > (b) call function (lambda (fn) (when (and foo (match-string- ... fn)) ...)) >> >> (benchmark-run-compiled 10000000 (string-match-p (caar file-name-handler-alist) "/path/to/very/deep/file")) >> ;; => (1.495432981 0 0.0) >> (benchmark-run-compiled 10000000 (funcall (lambda (fn) (and nil (string-match-p (caar file-name-handler-alist) fn))) "/path/to/very/deep/file")) >> ;; => (0.42053276500000003 0 0.0) >> >> Looks like even funcall overheads are not as bad as invoking regexp search. > > But "nil" is not a faithful emulation of the real test which will have > to be put there, is it? It is, at least in some cases. In other cases, it is list lookup, which is also faster: (benchmark-run-compiled 10000000 (funcall (lambda (fn) (and (get 'foo 'jka-compr) (string-match-p (caar file-name-handler-alist) fn))) "/path/to/very/deep/file")) ;; => (0.5831819149999999 0 0.0) Let me go through default handlers one by one: file-name-handler-alist is a variable defined in fileio.c. Value (("..." . jka-compr-handler) (".." . epa-file-handler) ("..." . tramp-archive-file-name-handler) ("..." . tramp-completion-file-name-handler) ("..." . tramp-file-name-handler) ("\\`/:" . 
file-name-non-special)) ---- 1 ----- (defun jka-compr-handler (operation &rest args) (save-match-data (let ((jka-op (get operation 'jka-compr))) (if (and jka-op (not jka-compr-inhibit)) (apply jka-op args) (jka-compr-run-real-handler operation args))))) skips when `get' fails, but also makes an unnecessary `save-match-data' call, which would be better placed inside the `if'. ---- 2 ----- (defun epa-file-handler (operation &rest args) (save-match-data (let ((op (get operation 'epa-file))) (if (and op (not epa-inhibit)) (apply op args) (epa-file-run-real-handler operation args))))) again checks `get' and also epa-inhibit. (and again, `save-match-data' is only needed for (apply op args)). Side note: These handlers essentially force double handler lookup without skipping already processed handlers when they decide that they need to delegate to defaults. ---- 3 ----- (if (not tramp-archive-enabled) ;; Unregister `tramp-archive-file-name-handler'. (progn (tramp-register-file-name-handlers) (tramp-archive-run-real-handler operation args)) <...> Note how this tries to remove itself from the handler list, by testing a boolean variable (nil by default!). However, this "self-removal" will never happen unless we happen to query a file with a matching regexp. If no archive file is accessed during an Emacs session (as is the case for me), this branch of code will never be executed and I am doomed to have Emacs checking the regexp in front of this handler forever. ------ 4 ------ (defun tramp-completion-file-name-handler (operation &rest args) "Invoke Tramp file name completion handler for OPERATION and ARGS. Falls back to normal file name handler if no Tramp file name handler exists." (if-let ((fn (and tramp-mode minibuffer-completing-file-name (assoc operation tramp-completion-file-name-handler-alist)))) (save-match-data (apply (cdr fn) args)) (tramp-run-real-handler operation args))) is checking for tramp-mode (t by default) and minibuffer-completing-file-name (often nil).
-------- 5 -------- (defun tramp-file-name-handler (operation &rest args) "Invoke Tramp file name handler for OPERATION and ARGS. Fall back to normal file name handler if no Tramp file name handler exists." (let ((filename (apply #'tramp-file-name-for-operation operation args)) <...> (if (tramp-tramp-file-p filename) ;; <<--- always nil when tramp-mode is nil <do stuff> ;; When `tramp-mode' is not enabled, or the file name is quoted, ;; we don't do anything. (tramp-run-real-handler operation args)) this one is more complex, but does nothing when tramp-mode is nil. --------- 6 ------- file-name-non-special is complex. The only thing I noticed is that it binds tramp-mode as (let ((tramp-mode (and tramp-mode (eq method 'local-copy)))) So, other handlers that check the tramp-mode variable early would benefit if they were able to do so. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
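[Editorial note: the pattern criticized in handlers 1 and 2 above could be reshuffled so the cheap checks run before any `save-match-data'. A hypothetical sketch — names mirror jka-compr, but this is not the installed code:]

```elisp
(require 'jka-cmpr-hook)  ; defines `jka-compr-inhibit' and the real-handler helper

(defun my-compr-handler (operation &rest args)
  "Like `jka-compr-handler', but with the cheap checks hoisted.
`save-match-data' is only entered when a handler actually runs."
  (let ((op (get operation 'jka-compr)))
    (if (and op (not jka-compr-inhibit))
        (save-match-data (apply op args))
      (jka-compr-run-real-handler operation args))))
```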
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 12:20 ` Ihor Radchenko 2023-07-21 12:25 ` Ihor Radchenko @ 2023-07-21 12:27 ` Michael Albinus 2023-07-21 12:30 ` Ihor Radchenko 1 sibling, 1 reply; 213+ messages in thread From: Michael Albinus @ 2023-07-21 12:27 UTC (permalink / raw) To: Ihor Radchenko; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh Ihor Radchenko <yantar92@posteo.net> writes: >> And there is also the case, that due to inhibit-file-name-handlers and >> inhibit-file-name-operation we can allow a remote file name operation >> for a given function, and disable it for another function. Tramp uses >> this mechanism. The general flag tramp-mode is not sufficient for this >> scenario. > > I am not sure if I understand completely, but it does not appear that > this is used often during ordinary file operations that do not involve > tramp. Don't know, but it is a documented feature for many decades. We shouldn't destroy it intentionally. If we have an alternative, as I have proposed with without-remote-file-names. What's wrong with this? You could use it everywhere, where you let-bind file-name-handler-alist to nil these days. Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 12:27 ` Michael Albinus @ 2023-07-21 12:30 ` Ihor Radchenko 2023-07-21 13:04 ` Michael Albinus 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 12:30 UTC (permalink / raw) To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh Michael Albinus <michael.albinus@gmx.de> writes: >> I am not sure if I understand completely, but it does not appear that >> this is used often during ordinary file operations that do not involve >> tramp. > > Don't know, but it is a documented feature for many decades. We > shouldn't destroy it intentionally. If we have an alternative, as I have > proposed with without-remote-file-names. What's wrong with this? You > could use it everywhere, where you let-bind file-name-handler-alist to > nil these days. The idea was to make things work faster without modifying third-party code. And what do you mean by destroy? -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 12:30 ` Ihor Radchenko @ 2023-07-21 13:04 ` Michael Albinus 2023-07-21 13:24 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Michael Albinus @ 2023-07-21 13:04 UTC (permalink / raw) To: Ihor Radchenko; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh Ihor Radchenko <yantar92@posteo.net> writes: >>> I am not sure if I understand completely, but it does not appear that >>> this is used often during ordinary file operations that do not involve >>> tramp. >> >> Don't know, but it is a documented feature for many decades. We >> shouldn't destroy it intentionally. If we have an alternative, as I have >> proposed with without-remote-file-names. What's wrong with this? You >> could use it everywhere, where you let-bind file-name-handler-alist to >> nil these days. > > The idea was to make things work faster without modifying third-party > code. People already use (let (file-name-handler-alist) ...). As I have said, this could have unexpected side effects. I propose to replace this by (without-remote-files ...) > And what do you mean by destroy? "Destroy the feature". Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 13:04 ` Michael Albinus @ 2023-07-21 13:24 ` Ihor Radchenko 2023-07-21 15:36 ` Michael Albinus 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 13:24 UTC (permalink / raw) To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh Michael Albinus <michael.albinus@gmx.de> writes: >> And what do you mean by destroy? > > "Destroy the feature". I am sorry, but I still do not understand how what I proposed can lead to any feature regression. May you please elaborate? -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 13:24 ` Ihor Radchenko @ 2023-07-21 15:36 ` Michael Albinus 2023-07-21 15:44 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Michael Albinus @ 2023-07-21 15:36 UTC (permalink / raw) To: Ihor Radchenko; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh Ihor Radchenko <yantar92@posteo.net> writes: Hi Ihor, >>> And what do you mean by destroy? >> >> "Destroy the feature". > > I am sorry, but I still do not understand how what I proposed can lead > to any feature regression. May you please elaborate? When you invoke a file name handler based on the value of a variable like tramp-mode, either all file operations are enabled, or all are disabled. The mechanism with inhibit-file-name-{handlers,operation} allows you to determine more fine-grained, which operation is allowed, and which is suppressed. Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 15:36 ` Michael Albinus @ 2023-07-21 15:44 ` Ihor Radchenko 0 siblings, 0 replies; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 15:44 UTC (permalink / raw) To: Michael Albinus; +Cc: dmitry, Eli Zaretskii, 64735, sbaugh Michael Albinus <michael.albinus@gmx.de> writes: >> I am sorry, but I still do not understand how what I proposed can lead >> to any feature regression. May you please elaborate? > > When you invoke a file name handler based on the value of a variable > like tramp-mode, either all file operations are enabled, or all are > disabled. > > The mechanism with inhibit-file-name-{handlers,operation} allows you to > determine more fine-grained, which operation is allowed, and which is > suppressed. I did not mean to remove the existing mechanisms. Just wanted to allow additional check _before_ matching filename with a regexp. (And I demonstrated that such a check is generally faster compared to invoking regexp search) Also, note that `inhibit-file-name-handlers' could then be implemented without a need to match every single handler against `inhibit-file-name-handlers' list. Emacs could instead have handler-enabled-p flag that can be trivially let-bound. Checking a flag is much faster compared to (memq handler inhibit-file-name-handlers). Of course, the existing mechanism should be left for backward compatibility. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
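[Editorial note: the cost difference described above can be measured directly. A rough sketch, with a hypothetical flag variable standing in for the proposed handler-enabled-p:]

```elisp
(require 'benchmark)

(defvar my-handler-enabled-p t)  ; hypothetical per-handler flag

;; Checking a boolean that can simply be let-bound ...
(benchmark-run-compiled 1000000 (and my-handler-enabled-p t))

;; ... versus scanning `inhibit-file-name-handlers' with `memq',
;; which the current mechanism requires for every handler lookup.
(benchmark-run-compiled 1000000
  (not (memq 'tramp-file-name-handler inhibit-file-name-handlers)))
```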
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 11:32 ` Michael Albinus 2023-07-21 11:51 ` Ihor Radchenko @ 2023-07-21 12:39 ` Eli Zaretskii 2023-07-21 13:09 ` Michael Albinus 1 sibling, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-21 12:39 UTC (permalink / raw) To: Michael Albinus; +Cc: dmitry, yantar92, 64735, sbaugh > From: Michael Albinus <michael.albinus@gmx.de> > Cc: yantar92@posteo.net, dmitry@gutov.dev, 64735@debbugs.gnu.org, > sbaugh@janestreet.com > Date: Fri, 21 Jul 2023 13:32:46 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> Users can call the command, if they know for sure they don't use remote > >> files ever. Authors could use the macro in case they know for sure they > >> are working over local files only. > >> > >> WDYT? > > > > How is this different from binding file-name-handler-alist to nil? > > Tramp is nowadays the main consumer of this feature, and AFAIU your > > suggestion above boils down to disabling Tramp. If so, what is left? > > jka-compr-handler, epa-file-handler and file-name-non-special are > left. All of them have their reason. I know, but when I wrote that disabling file-handlers is inconceivable, I meant remote files, not those other users of this facility. Let me rephrase: running Emacs commands with disabled support for remote files is inconceivable. IMO, if tests against file-name-handler-alist are a significant performance problem, we should look for ways of solving it without disabling remote files. In general, disabling general-purpose Emacs features because they cause slow-down should be the last resort, after we tried and failed to use smarter solutions. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 12:39 ` Eli Zaretskii @ 2023-07-21 13:09 ` Michael Albinus 0 siblings, 0 replies; 213+ messages in thread From: Michael Albinus @ 2023-07-21 13:09 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dmitry, yantar92, 64735, sbaugh Eli Zaretskii <eliz@gnu.org> writes: >> > How is this different from binding file-name-handler-alist to nil? >> > Tramp is nowadays the main consumer of this feature, and AFAIU your >> > suggestion above boils down to disabling Tramp. If so, what is left? >> >> jka-compr-handler, epa-file-handler and file-name-non-special are >> left. All of them have their reason. > > I know, but when I wrote that disabling file-handlers is > inconceivable, I meant remote files, not those other users of this > facility. > > Let me rephrase: running Emacs commands with disabled support for > remote files is inconceivable. Agreed. My proposal was to provide a convenience macro to disable Tramp when it is appropriate. Like (unless (file-remote-p file) (without-remote-files ...)) Instead of binding file-name-handler-alist to nil, as it is the current practice. And the command inhibit-remote-files shall be applied only by users who aren't interested in remote files at all. Again and again: these are ~50% of our users. > IMO, if tests against file-name-handler-alist are a significant > performance problem, we should look for ways of solving it without > disabling remote files. Sure. If there are proposals ... Best regards, Michael. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 10:46 ` Eli Zaretskii 2023-07-21 11:32 ` Michael Albinus @ 2023-07-21 12:38 ` Dmitry Gutov 1 sibling, 0 replies; 213+ messages in thread From: Dmitry Gutov @ 2023-07-21 12:38 UTC (permalink / raw) To: Eli Zaretskii, Michael Albinus; +Cc: sbaugh, yantar92, 64735 On 21/07/2023 13:46, Eli Zaretskii wrote: > How is this different from binding file-name-handler-alist to nil? > Tramp is nowadays the main consumer of this feature, and AFAIU your > suggestion above boils down to disabling Tramp. If so, what is left? I don't understand this either. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 15:42 ` Ihor Radchenko 2023-07-20 15:57 ` Dmitry Gutov 2023-07-20 16:33 ` Eli Zaretskii @ 2023-07-20 17:08 ` Spencer Baugh 2023-07-20 17:24 ` Eli Zaretskii ` (2 more replies) 2 siblings, 3 replies; 213+ messages in thread From: Spencer Baugh @ 2023-07-20 17:08 UTC (permalink / raw) To: Ihor Radchenko; +Cc: Dmitry Gutov, 64735 Ihor Radchenko <yantar92@posteo.net> writes: > Dmitry Gutov <dmitry@gutov.dev> writes: > >>>> ... Last I checked, Lisp-native file >>>> listing was simply slower than 'find'. >>> >>> Could it be changed? >>> In my tests, I was able to improve performance of the built-in >>> `directory-files-recursively' simply by disabling >>> `file-name-handler-alist' around its call. >> >> Then it won't work with Tramp, right? I think it's pretty nifty that >> project-find-regexp and dired-do-find-regexp work over Tramp. > > Sure. It might also be optimized. Without trying to convince find devs > to do something about regexp handling. Not to derail too much, but find as a subprocess has one substantial advantage over find in Lisp: It can run in parallel with Emacs, so that we actually use multiple CPU cores. Between that, and the remote support part, I personally much prefer find to be a subprocess rather than in Lisp. I don't think optimizing directory-files-recursively is a great solution. (Really it's entirely plausible that Emacs could be improved by *removing* directory-files-recursively, in favor of invoking find as a subprocess: faster, parallelized execution, and better remote support.) ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 17:08 ` Spencer Baugh @ 2023-07-20 17:24 ` Eli Zaretskii 2023-07-22 6:35 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-07-20 17:25 ` Ihor Radchenko 2023-07-22 6:39 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors 2 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-20 17:24 UTC (permalink / raw) To: Spencer Baugh; +Cc: dmitry, yantar92, 64735 > Cc: Dmitry Gutov <dmitry@gutov.dev>, 64735@debbugs.gnu.org > From: Spencer Baugh <sbaugh@janestreet.com> > Date: Thu, 20 Jul 2023 13:08:24 -0400 > > (Really it's entirely plausible that Emacs could be improved by > *removing* directory-files-recursively, in favor of invoking find as a > subprocess: faster, parallelized execution, and better remote support.) No, there's no reason to remove anything that useful from Emacs. If this or that API is not the optimal choice for some job, it is easy enough not to use it. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 17:24 ` Eli Zaretskii @ 2023-07-22 6:35 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 0 replies; 213+ messages in thread From: Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-07-22 6:35 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Spencer Baugh, yantar92, 64735, dmitry Eli Zaretskii <eliz@gnu.org> writes: >> Cc: Dmitry Gutov <dmitry@gutov.dev>, 64735@debbugs.gnu.org >> From: Spencer Baugh <sbaugh@janestreet.com> >> Date: Thu, 20 Jul 2023 13:08:24 -0400 >> >> (Really it's entirely plausible that Emacs could be improved by >> *removing* directory-files-recursively, in favor of invoking find as a >> subprocess: faster, parallelized execution, and better remote support.) > > No, there's no reason to remove anything that useful from Emacs. If > this or that API is not the optimal choice for some job, it is easy > enough not to use it. Indeed. I would like to add that subprocesses remain unimplemented on MS-DOS, and the way find is currently invoked from project.el and rgrep makes both packages lose on Unix, indicating that correct portable use of find is decidedly non-trivial. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 17:08 ` Spencer Baugh 2023-07-20 17:24 ` Eli Zaretskii @ 2023-07-20 17:25 ` Ihor Radchenko 2023-07-21 19:31 ` Spencer Baugh 2023-07-22 6:39 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors 2 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-20 17:25 UTC (permalink / raw) To: Spencer Baugh; +Cc: Dmitry Gutov, 64735 Spencer Baugh <sbaugh@janestreet.com> writes: >> Sure. It might also be optimized. Without trying to convince find devs >> to do something about regexp handling. > > Not to derail too much, but find as a subprocess has one substantial > advantage over find in Lisp: It can run in parallel with Emacs, so that > we actually use multiple CPU cores. Does find use multiple CPU cores? -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 17:25 ` Ihor Radchenko @ 2023-07-21 19:31 ` Spencer Baugh 2023-07-21 19:37 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Spencer Baugh @ 2023-07-21 19:31 UTC (permalink / raw) To: Ihor Radchenko; +Cc: Dmitry Gutov, 64735 Ihor Radchenko <yantar92@posteo.net> writes: > Spencer Baugh <sbaugh@janestreet.com> writes: > >>> Sure. It might also be optimized. Without trying to convince find devs >>> to do something about regexp handling. >> >> Not to derail too much, but find as a subprocess has one substantial >> advantage over find in Lisp: It can run in parallel with Emacs, so that >> we actually use multiple CPU cores. > > Does find use multiple CPU cores? Not on its own, but when it's running as a separate subprocess of Emacs, that subprocess can (and will, on modern core-rich hardware) run on a different CPU core from Emacs itself. That's a form of parallelism which is very achievable for Emacs, and provides a big performance win. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 19:31 ` Spencer Baugh @ 2023-07-21 19:37 ` Ihor Radchenko 2023-07-21 19:56 ` Dmitry Gutov 2023-07-21 20:11 ` Spencer Baugh 0 siblings, 2 replies; 213+ messages in thread From: Ihor Radchenko @ 2023-07-21 19:37 UTC (permalink / raw) To: Spencer Baugh; +Cc: Dmitry Gutov, 64735 Spencer Baugh <sbaugh@janestreet.com> writes: >> Does find use multiple CPU cores? > > Not on its own, but when it's running as a separate subprocess of Emacs, > that subprocess can (and will, on modern core-rich hardware) run on a > different CPU core from Emacs itself. That's a form of parallelism > which is very achievable for Emacs, and provides a big performance win. AFAIU, the way find is called by project.el is synchronous: (1) call find; (2) wait until it produces all the results; (3) process the results. In such scenario, there is no gain from subprocess. Is any part of Emacs is even using sentinels together with find? -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 19:37 ` Ihor Radchenko @ 2023-07-21 19:56 ` Dmitry Gutov 2023-07-21 20:11 ` Spencer Baugh 1 sibling, 0 replies; 213+ messages in thread From: Dmitry Gutov @ 2023-07-21 19:56 UTC (permalink / raw) To: Ihor Radchenko, Spencer Baugh; +Cc: 64735 On 21/07/2023 22:37, Ihor Radchenko wrote: > AFAIU, the way find is called by project.el is synchronous For now. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 19:37 ` Ihor Radchenko 2023-07-21 19:56 ` Dmitry Gutov @ 2023-07-21 20:11 ` Spencer Baugh 1 sibling, 0 replies; 213+ messages in thread From: Spencer Baugh @ 2023-07-21 20:11 UTC (permalink / raw) To: Ihor Radchenko; +Cc: Dmitry Gutov, 64735 Ihor Radchenko <yantar92@posteo.net> writes: > Spencer Baugh <sbaugh@janestreet.com> writes: > >>> Does find use multiple CPU cores? >> >> Not on its own, but when it's running as a separate subprocess of Emacs, >> that subprocess can (and will, on modern core-rich hardware) run on a >> different CPU core from Emacs itself. That's a form of parallelism >> which is very achievable for Emacs, and provides a big performance win. > > AFAIU, the way find is called by project.el is synchronous: (1) call > find; (2) wait until it produces all the results; (3) process the > results. In such scenario, there is no gain from subprocess. > > Is any part of Emacs is even using sentinels together with find? rgrep. ^ permalink raw reply [flat|nested] 213+ messages in thread
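[Editorial note: rgrep drives its find/grep pipeline through the compilation-mode machinery; a stripped-down sketch of the same idea with `make-process' and a sentinel — illustrative names, not rgrep's actual code:]

```elisp
(let ((buf (generate-new-buffer "*my-find*")))
  (make-process
   :name "my-find"
   :buffer buf
   :command '("find" "." "-type" "f" "-print0")
   :sentinel (lambda (proc event)
               ;; Called when find exits; Emacs stayed responsive
               ;; (and on another core) while find was running.
               (when (string-match-p "finished" event)
                 (message "find done: %d bytes of output"
                          (buffer-size (process-buffer proc)))))))
```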
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-20 17:08 ` Spencer Baugh 2023-07-20 17:24 ` Eli Zaretskii 2023-07-20 17:25 ` Ihor Radchenko @ 2023-07-22 6:39 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-07-22 21:01 ` Dmitry Gutov 2 siblings, 1 reply; 213+ messages in thread From: Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-07-22 6:39 UTC (permalink / raw) To: Spencer Baugh; +Cc: Dmitry Gutov, Ihor Radchenko, 64735 Spencer Baugh <sbaugh@janestreet.com> writes: > Not to derail too much, but find as a subprocess has one substantial > advantage over find in Lisp: It can run in parallel with Emacs, so that > we actually use multiple CPU cores. > > Between that, and the remote support part, I personally much prefer find > to be a subprocess rather than in Lisp. I don't think optimizing > directory-files-recursively is a great solution. > > (Really it's entirely plausible that Emacs could be improved by > *removing* directory-files-recursively, in favor of invoking find as a > subprocess: faster, parallelized execution, and better remote support.) find is only present in the default installations of Unix-like systems, so it doesn't work without additional configuration on MS-Windows or MS-DOS. project.el and rgrep fail to work on USG Unix because they both use `-path'. Programs that use find should fall back to directory-files-recursively when any of the situations above are detected. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 6:39 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-07-22 21:01 ` Dmitry Gutov 2023-07-23 5:11 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-22 21:01 UTC (permalink / raw) To: Po Lu, Spencer Baugh; +Cc: Ihor Radchenko, 64735 On 22/07/2023 09:39, Po Lu wrote: > Spencer Baugh <sbaugh@janestreet.com> writes: > >> Not to derail too much, but find as a subprocess has one substantial >> advantage over find in Lisp: It can run in parallel with Emacs, so that >> we actually use multiple CPU cores. >> >> Between that, and the remote support part, I personally much prefer find >> to be a subprocess rather than in Lisp. I don't think optimizing >> directory-files-recursively is a great solution. >> >> (Really it's entirely plausible that Emacs could be improved by >> *removing* directory-files-recursively, in favor of invoking find as a >> subprocess: faster, parallelized execution, and better remote support.) > > find is only present in the default installations of Unix-like systems, > so it doesn't work without additional configuration on MS-Windows or > MS-DOS. project.el and rgrep fail to work on USG Unix because they both > use `-path'. > > Programs that use find should fall back to directory-files-recursively > when any of the situations above are detected. Perhaps if someone implements support for IGNORE entries (wildcards) in that function, it would be easy enough to do that fallback. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 21:01 ` Dmitry Gutov @ 2023-07-23 5:11 ` Eli Zaretskii 2023-07-23 10:46 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 5:11 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Cc: Ihor Radchenko <yantar92@posteo.net>, 64735@debbugs.gnu.org > Date: Sun, 23 Jul 2023 00:01:28 +0300 > From: Dmitry Gutov <dmitry@gutov.dev> > > On 22/07/2023 09:39, Po Lu wrote: > > > > Programs that use find should fall back to directory-file-recursively > > when any of the situations above are detected. > > Perhaps if someone implements support for IGNORE entries (wildcards) in > that function, it would be easy enough to do that fallback. Shouldn't be hard, since it already filters some of them: (dolist (file (sort (file-name-all-completions "" dir) 'string<)) (unless (member file '("./" "../")) <<<<<<<<<<<<<<<<<<< Even better: compute completion-regexp-list so that IGNOREs are filtered by file-name-all-completions in the first place. Patches welcome. ^ permalink raw reply [flat|nested] 213+ messages in thread
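[Editorial note: a hedged sketch of the filtering suggested above — skipping IGNORE wildcards in the same pass that already drops "./" and "../". `my-files-sans-ignores' is an illustrative name, not a proposed API:]

```elisp
(require 'seq)

(defun my-files-sans-ignores (dir ignores)
  "Return DIR's entries minus \"./\", \"../\" and names matching IGNORES.
IGNORES is a list of shell wildcards such as \"*.elc\"."
  (seq-remove
   (lambda (file)
     (or (member file '("./" "../"))
         (seq-some (lambda (w)
                     (string-match-p (wildcard-to-regexp w) file))
                   ignores)))
   (file-name-all-completions "" dir)))
```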
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 5:11 ` Eli Zaretskii @ 2023-07-23 10:46 ` Dmitry Gutov 2023-07-23 11:18 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-23 10:46 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 23/07/2023 08:11, Eli Zaretskii wrote: > Even better: compute completion-regexp-list so that IGNOREs are > filtered by file-name-all-completions in the first place. We don't have lookahead in Emacs regexps, so I'm not sure it's possible to construct regexp that says "don't match entries A, B and C". ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 10:46 ` Dmitry Gutov @ 2023-07-23 11:18 ` Eli Zaretskii 2023-07-23 17:46 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 11:18 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Sun, 23 Jul 2023 13:46:30 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > On 23/07/2023 08:11, Eli Zaretskii wrote: > > Even better: compute completion-regexp-list so that IGNOREs are > > filtered by file-name-all-completions in the first place. > > We don't have lookahead in Emacs regexps, so I'm not sure it's possible > to construct regexp that says "don't match entries A, B and C". Well, maybe just having a way of telling file-name-all-completions to negate the sense of completion-regexp-list would be enough to make that happen? ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 11:18 ` Eli Zaretskii @ 2023-07-23 17:46 ` Dmitry Gutov 2023-07-23 17:56 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-23 17:46 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 23/07/2023 14:18, Eli Zaretskii wrote: >> Date: Sun, 23 Jul 2023 13:46:30 +0300 >> Cc:luangruo@yahoo.com,sbaugh@janestreet.com,yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov<dmitry@gutov.dev> >> >> On 23/07/2023 08:11, Eli Zaretskii wrote: >>> Even better: compute completion-regexp-list so that IGNOREs are >>> filtered by file-name-all-completions in the first place. >> We don't have lookahead in Emacs regexps, so I'm not sure it's possible >> to construct regexp that says "don't match entries A, B and C". > Well, maybe just having a way of telling file-name-all-completions to > negate the sense of completion-regexp-list would be enough to make > that happen? Some way to do that is certainly possible (e.g. a new option and corresponding code, maybe; maybe not), it's just that the person implementing it should consider the performance of the resulting solution. And, ideally, do all the relevant benchmarking when proposing the change. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 17:46 ` Dmitry Gutov @ 2023-07-23 17:56 ` Eli Zaretskii 2023-07-23 17:58 ` Dmitry Gutov 2023-07-23 19:27 ` Dmitry Gutov 0 siblings, 2 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 17:56 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Sun, 23 Jul 2023 20:46:19 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > On 23/07/2023 14:18, Eli Zaretskii wrote: > >> Date: Sun, 23 Jul 2023 13:46:30 +0300 > >> Cc:luangruo@yahoo.com,sbaugh@janestreet.com,yantar92@posteo.net, > >> 64735@debbugs.gnu.org > >> From: Dmitry Gutov<dmitry@gutov.dev> > >> > >> On 23/07/2023 08:11, Eli Zaretskii wrote: > >>> Even better: compute completion-regexp-list so that IGNOREs are > >>> filtered by file-name-all-completions in the first place. > >> We don't have lookahead in Emacs regexps, so I'm not sure it's possible > >> to construct regexp that says "don't match entries A, B and C". > > Well, maybe just having a way of telling file-name-all-completions to > > negate the sense of completion-regexp-list would be enough to make > > that happen? > > Some way to do that is certainly possible (e.g. a new option and > corresponding code, maybe; maybe not), it's just that the person > implementing it should consider the performance of the resulting solution. I agree. However, if we are going to implement filtering of file names, I don't think it matters where in the pipeline to perform the filtering. The advantage of using completion-regexp-list is that the matching is done in C, so is probably at least a tad faster. > And, ideally, do all the relevant benchmarking when proposing the change. Of course. Although the benchmarks until now already show quite a lot of variability. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 17:56 ` Eli Zaretskii @ 2023-07-23 17:58 ` Dmitry Gutov 2023-07-23 18:21 ` Eli Zaretskii 2023-07-23 19:27 ` Dmitry Gutov 1 sibling, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-23 17:58 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 23/07/2023 20:56, Eli Zaretskii wrote: >> Date: Sun, 23 Jul 2023 20:46:19 +0300 >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >> On 23/07/2023 14:18, Eli Zaretskii wrote: >>>> Date: Sun, 23 Jul 2023 13:46:30 +0300 >>>> Cc:luangruo@yahoo.com,sbaugh@janestreet.com,yantar92@posteo.net, >>>> 64735@debbugs.gnu.org >>>> From: Dmitry Gutov<dmitry@gutov.dev> >>>> >>>> On 23/07/2023 08:11, Eli Zaretskii wrote: >>>>> Even better: compute completion-regexp-list so that IGNOREs are >>>>> filtered by file-name-all-completions in the first place. >>>> We don't have lookahead in Emacs regexps, so I'm not sure it's possible >>>> to construct regexp that says "don't match entries A, B and C". >>> Well, maybe just having a way of telling file-name-all-completions to >>> negate the sense of completion-regexp-list would be enough to make >>> that happen? >> >> Some way to do that is certainly possible (e.g. a new option and >> corresponding code, maybe; maybe not), it's just that the person >> implementing it should consider the performance of the resulting solution. > > I agree. However, if we are going to implement filtering of file > names, I don't think it matters where in the pipeline to perform the > filtering. A possible advantage of doing it earlier, is that if filtering happens in C code you could do it before allocating Lisp strings, thereby lowering the resulting GC pressure at the outset. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 17:58 ` Dmitry Gutov @ 2023-07-23 18:21 ` Eli Zaretskii 2023-07-23 19:07 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 18:21 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Sun, 23 Jul 2023 20:58:24 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > A possible advantage of doing it earlier, is that if filtering happens > in C code you could do it before allocating Lisp strings That's not what happens today. And it isn't easy to do what you suggest, since the file names we get from the C APIs need to be decoded, and that is awkward at best with C strings. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 18:21 ` Eli Zaretskii @ 2023-07-23 19:07 ` Dmitry Gutov 2023-07-23 19:27 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-23 19:07 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 23/07/2023 21:21, Eli Zaretskii wrote: >> Date: Sun, 23 Jul 2023 20:58:24 +0300 >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >> A possible advantage of doing it earlier, is that if filtering happens >> in C code you could do it before allocating Lisp strings > > That's not what happens today. And it isn't easy to do what you > suggest, since the file names we get from the C APIs need to be > decoded, and that is awkward at best with C strings. It is what happens today when 'find' is used, though. Far be it from me to insist, but if we indeed reimplemented all the good parts of 'find', that would make the new function a suitable replacement/improvement, at least on local hosts (instead of it just being used as a fallback). ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 19:07 ` Dmitry Gutov @ 2023-07-23 19:27 ` Eli Zaretskii 2023-07-23 19:44 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-23 19:27 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Sun, 23 Jul 2023 22:07:17 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > On 23/07/2023 21:21, Eli Zaretskii wrote: > >> Date: Sun, 23 Jul 2023 20:58:24 +0300 > >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > >> 64735@debbugs.gnu.org > >> From: Dmitry Gutov <dmitry@gutov.dev> > >> > >> A possible advantage of doing it earlier, is that if filtering happens > >> in C code you could do it before allocating Lisp strings > > > > That's not what happens today. And it isn't easy to do what you > > suggest, since the file names we get from the C APIs need to be > > decoded, and that is awkward at best with C strings. > > It is what happens today when 'find' is used, though. No, I was talking about what file-name-all-completions does. > Far be it from me to insist, though, but if we indeed reimplemented all > the good parts of 'find', that would make the new function a suitable > replacement/improvement, at least on local hosts (instead of it just > being used as a fallback). The basic problem here is this: the regexp or pattern to filter out ignorables is specified as a Lisp string, which is in the internal Emacs representation of characters. So to compare file names we receive either from Find or from a C API, we need either to decode the file names we receive (which in practice means they should be Lisp strings), or encode the regexp and use its C string payload. ^ permalink raw reply [flat|nested] 213+ messages in thread
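[Editor's note: the "encode the regexp" alternative Eli describes might look like the following minimal sketch. The function name is hypothetical; where the byte-level comparison would then happen (a C walker vs. Lisp) is exactly the open design question in this subthread. `file-name-coding-system`, `default-file-name-coding-system`, and `encode-coding-string` are real Emacs facilities.]

```elisp
;; Sketch: encode the pattern once, into the coding system used for
;; file names, so it can be matched against raw, undecoded names
;; instead of decoding every file name into a Lisp string first.
(defun my/encode-file-name-regexp (regexp)
  "Encode REGEXP for matching against raw file-name bytes (illustrative)."
  (encode-coding-string
   regexp
   (or file-name-coding-system
       default-file-name-coding-system
       'utf-8)))
```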
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 19:27 ` Eli Zaretskii @ 2023-07-23 19:44 ` Dmitry Gutov 0 siblings, 0 replies; 213+ messages in thread From: Dmitry Gutov @ 2023-07-23 19:44 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 23/07/2023 22:27, Eli Zaretskii wrote: >> Far be it from me to insist, though, but if we indeed reimplemented all >> the good parts of 'find', that would make the new function a suitable >> replacement/improvement, at least on local hosts (instead of it just >> being used as a fallback). > The basic problem here is this: the regexp or pattern to filter out > ignorables is specified as a Lisp string, which is in the internal > Emacs representation of characters. So to compare file names we > receive either from Find or from a C API, we need either to decode the > file names we receive (which in practice means they should be Lisp > strings), or encode the regexp and use its C string payload. Yes, the latter sounds more fiddly, but it seems to be *the* way toward find's performance levels. But only the benchmarks can tell whether the hassle is worthwhile. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 17:56 ` Eli Zaretskii 2023-07-23 17:58 ` Dmitry Gutov @ 2023-07-23 19:27 ` Dmitry Gutov 2023-07-24 11:20 ` Eli Zaretskii 1 sibling, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-23 19:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 23/07/2023 20:56, Eli Zaretskii wrote: >> And, ideally, do all the relevant benchmarking when proposing the change. > Of course. Although the benchmarks until now already show quite a > variability. Speaking of your MS Windows results that are unflattering to 'find', it might be worth it to do a more varied comparison, to determine the OS-specific bottleneck. Off the top of my head, here are some possibilities: 1. 'find' itself is much slower there. There is room for improvement in the port. 2. The process output handling is worse. 3. Something particular to the project being used for the test. To look into possibility #1, you can try running the same command in the terminal with the output to NUL and comparing the runtime to what's reported in the benchmark. I actually remember, from my time on MS Windows about 10 years ago, that some older ports of 'find' and/or 'grep' did have performance problems, but IIRC ezwinports contained the improved versions. ^ permalink raw reply [flat|nested] 213+ messages in thread
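[Editor's note: timing find alone, as suggested for possibility #1, can be done from any shell; piping to `wc` forces all output to be produced while keeping Emacs out of the loop. A self-contained toy version with a throwaway tree follows; the paths and the SCCS ignore pattern are illustrative, mirroring the prune clause from the original report.]

```shell
# Build a small throwaway tree, then run find with a prune clause,
# as the Emacs commands do.  Piping to wc isolates find's own cost.
dir=$(mktemp -d)
mkdir -p "$dir/a/SCCS" "$dir/b"
touch "$dir/a/f1" "$dir/a/SCCS/s1" "$dir/b/f2"
time find "$dir" \( -path '*/SCCS/*' \) -prune -o -type f -print | wc -l
# prints 2: f1 and f2; the file under SCCS/ is pruned
rm -rf "$dir"
```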
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-23 19:27 ` Dmitry Gutov @ 2023-07-24 11:20 ` Eli Zaretskii 2023-07-24 12:55 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-24 11:20 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Sun, 23 Jul 2023 22:27:26 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > On 23/07/2023 20:56, Eli Zaretskii wrote: > >> And, ideally, do all the relevant benchmarking when proposing the change. > > Of course. Although the benchmarks until now already show quite a > > variability. > > Speaking of your MS Windows results that are unflattering to 'find', it > might be worth it to do a more varied comparison, to determine the > OS-specific bottleneck. > > Off the top of my head, here are some possibilities: > > 1. 'find' itself is much slower there. There is room for improvement in > the port. I think it's the filesystem, not the port (which I did myself in this case). But I'd welcome similar tests on other Windows systems with other ports of Find. Just remember to measure this particular benchmark, not just Find itself from the shell, as the times are very different (as I reported up-thread). > 2. The process output handling is worse. Not sure what that means. > 3. Something particular to the project being used for the test. I don't think I understand this one. > To look into the possibility #1, you can try running the same command in > the terminal with the output to NUL and comparing the runtime to what's > reported in the benchmark. Output to the null device is a bad idea, as (AFAIR) Find is clever enough to detect that and do nothing. I run "find | wc" instead, and already reported that it is much faster. 
> I actually remember, from my time on MS Windows about 10 years ago, that > some older ports of 'find' and/or 'grep' did have performance problems, > but IIRC ezwinports contained the improved versions. The ezwinports port is the version I'm using here. But maybe someone came up with a better one: after all, I did my port many years ago (because the native ports available back then were abysmally slow). ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-24 11:20 ` Eli Zaretskii @ 2023-07-24 12:55 ` Dmitry Gutov 2023-07-24 13:26 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-24 12:55 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 24/07/2023 14:20, Eli Zaretskii wrote: >> Date: Sun, 23 Jul 2023 22:27:26 +0300 >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >> On 23/07/2023 20:56, Eli Zaretskii wrote: >>>> And, ideally, do all the relevant benchmarking when proposing the change. >>> Of course. Although the benchmarks until now already show quite a >>> variability. >> >> Speaking of your MS Windows results that are unflattering to 'find', it >> might be worth it to do a more varied comparison, to determine the >> OS-specific bottleneck. >> >> Off the top of my head, here are some possibilities: >> >> 1. 'find' itself is much slower there. There is room for improvement in >> the port. > > I think it's the filesystem, not the port (which I did myself in this > case). But directory-files-recursively goes through the same filesystem, doesn't it? > But I'd welcome similar tests on other Windows systems with > other ports of Find. Just remember to measure this particular > benchmark, not just Find itself from the shell, as the times are very > different (as I reported up-thread). Concur. >> 2. The process output handling is worse. > > Not sure what that means. Emacs's ability to process the output of a process on the particular platform. You said: Btw, the Find command with pipe to some other program, like wc, finishes much faster, like 2 to 4 times faster than when it is run from find-directory-files-recursively. That's probably the slowdown due to communications with async subprocesses in action. 
One thing to try is changing the -with-find implementation to use a synchronous call, to compare (e.g. using 'process-file'). And repeat these tests on GNU/Linux too. That would help us gauge the viability of using an asynchronous process to get the file listing. But also, if one was just looking into reimplementing directory-files-recursively using 'find' (to create an endpoint with swappable implementations, for example), 'process-file' is a suitable substitute because the original is also currently synchronous. >> 3. Something particular to the project being used for the test. > > I don't think I understand this one. This described the possibility where the disparity between the implementations' runtimes was due to something unusual in the project structure, if you tested different projects between Windows and GNU/Linux, making direct comparison less useful. It's the least likely cause, but still sometimes a possibility. >> To look into the possibility #1, you can try running the same command in >> the terminal with the output to NUL and comparing the runtime to what's >> reported in the benchmark. > > Output to the null device is a bad idea, as (AFAIR) Find is clever > enough to detect that and do nothing. I run "find | wc" instead, and > already reported that it is much faster. Now I see it, thanks. >> I actually remember, from my time on MS Windows about 10 years ago, that >> some older ports of 'find' and/or 'grep' did have performance problems, >> but IIRC ezwinports contained the improved versions. > > The ezwinports port is the version I'm using here. But maybe someone came > up with a better one: after all, I did my port many years ago (because > the native ports available back then were abysmally slow). We should also look at the exact numbers. If you say that "| wc" invocation is 2-4x faster than what's reported in the benchmark, then it takes about 2-4 seconds. Which is still oddly slower than your reported numbers for directory-files-recursively.
^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-24 12:55 ` Dmitry Gutov @ 2023-07-24 13:26 ` Eli Zaretskii 2023-07-25 2:41 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-24 13:26 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Mon, 24 Jul 2023 15:55:13 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > >> 1. 'find' itself is much slower there. There is room for improvement in > >> the port. > > > > I think it's the filesystem, not the port (which I did myself in this > > case). > > But directory-files-recursively goes through the same filesystem, > doesn't it? It does (more or less; see below). But I was not trying to explain why Find is slower than directory-files-recursively, I was trying to explain why Find on Windows is slower than Find on GNU/Linux. If you are asking why directory-files-recursively is so much faster on Windows than Find, then the main factors I can think about are: . IPC, at least in how we implement it in Emacs on MS-Windows, via a separate thread and OS-level events between them to signal that stuff is available for reading, whereas directory-files-recursively avoids this overhead completely; . Find uses Posix APIs: 'stat', 'chdir', 'readdir' -- which on Windows are emulated by wrappers around native APIs. Moreover, Find uses 'char *' for file names, so calling native APIs involves transparent conversion to UTF-16 and back, which is what native APIs accept and return. By contrast, Emacs on Windows calls the native APIs directly, and converts to UTF-16 from UTF-8, which is faster. (This last point also means that using Find on Windows has another grave disadvantage: it cannot fully support non-ASCII file names, only those that can be encoded by the current single-byte system codepage.) > >> 2. The process output handling is worse. 
> > > > Not sure what that means. > > Emacs's ability to process the output of a process on the particular > platform. > > You said: > > Btw, the Find command with pipe to some other program, like wc, > finishes much faster, like 2 to 4 times faster than when it is run > from find-directory-files-recursively. That's probably the slowdown > due to communications with async subprocesses in action. I see this slowdown on GNU/Linux as well. > One thing to try it changing the -with-find implementation to use a > synchronous call, to compare (e.g. using 'process-file'). And repeat > these tests on GNU/Linux too. This still uses pipes, albeit without the pselect stuff. > >> 3. Something particular to the project being used for the test. > > > > I don't think I understand this one. > > This described the possibility where the disparity between the > implementations' runtimes was due to something unusual in the project > structure, if you tested different projects between Windows and > GNU/Linux, making direct comparison less useful. It's the least likely > cause, but still sometimes a possibility. I have on my Windows system a d:/usr/share tree that is very similar to (albeit somewhat smaller than) a typical /usr/share tree on Posix systems. I tried with that as well, and the results were similar. > > The ezwinports is the version I'm using here. But maybe someone came > > up with a better one: after all, I did my port many years ago (because > > the native ports available back then were abysmally slow). > > We should also look at the exact numbers. If you say that "| wc" > invocation is 2-4x faster than what's reported in the benchmark, then it > takes about 2-4 seconds. Which is still oddly slower than your reported > numbers for directory-files-recursively. Yes, so there are additional factors at work, at least with this port of Find. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-24 13:26 ` Eli Zaretskii @ 2023-07-25 2:41 ` Dmitry Gutov 2023-07-25 8:22 ` Ihor Radchenko ` (2 more replies) 0 siblings, 3 replies; 213+ messages in thread From: Dmitry Gutov @ 2023-07-25 2:41 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 [-- Attachment #1: Type: text/plain, Size: 3663 bytes --] On 24/07/2023 16:26, Eli Zaretskii wrote: >> Date: Mon, 24 Jul 2023 15:55:13 +0300 >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >>>> 1. 'find' itself is much slower there. There is room for improvement in >>>> the port. >>> >>> I think it's the filesystem, not the port (which I did myself in this >>> case). >> >> But directory-files-recursively goes through the same filesystem, >> doesn't it? > > It does (more or less; see below). But I was not trying to explain > why Find is slower than directory-files-recursively, I was trying to > explain why Find on Windows is slower than Find on GNU/Linux. Understood. But we probably don't need to worry about the differences between platforms as much as about choosing the best option for each platform (or not choosing the worst, at least). So I'm more interested in how the find-based solution is more than 4x slower than the built-in one on MS Windows. > If you are asking why directory-files-recursively is so much faster on > Windows than Find, then the main factors I can think about are: > > . IPC, at least in how we implement it in Emacs on MS-Windows, via a > separate thread and OS-level events between them to signal that > stuff is available for reading, whereas > directory-files-recursively avoids this overhead completely; > . Find uses Posix APIs: 'stat', 'chdir', 'readdir' -- which on > Windows are emulated by wrappers around native APIs.
Moreover, > Find uses 'char *' for file names, so calling native APIs involves > transparent conversion to UTF-16 and back, which is what native > APIs accept and return. By contrast, Emacs on Windows calls the > native APIs directly, and converts to UTF-16 from UTF-8, which is > faster. (This last point also means that using Find on Windows > has another grave disadvantage: it cannot fully support non-ASCII > file names, only those that can be encoded by the current > single-byte system codepage.) I seem to remember that Wine, which also does a similar dance of translating library and system calls, is often very close to the native performance for many programs. So this could be a problem, but necessarily a significant one. Although text encoding conversion seems like a prime suspect, if the problem is here. >>>> 2. The process output handling is worse. >>> >>> Not sure what that means. >> >> Emacs's ability to process the output of a process on the particular >> platform. >> >> You said: >> >> Btw, the Find command with pipe to some other program, like wc, >> finishes much faster, like 2 to 4 times faster than when it is run >> from find-directory-files-recursively. That's probably the slowdown >> due to communications with async subprocesses in action. > > I see this slowdown on GNU/Linux as well. > >> One thing to try it changing the -with-find implementation to use a >> synchronous call, to compare (e.g. using 'process-file'). And repeat >> these tests on GNU/Linux too. > > This still uses pipes, albeit without the pselect stuff. I'm attaching an extended benchmark, one that includes a "synchronous" implementation as well. Please give it a spin as well. Here (GNU/Linux) the reported numbers look like this: > (my-bench 1 default-directory "") (("built-in" . "Elapsed time: 1.601649s (0.709108s in 22 GCs)") ("with-find" . "Elapsed time: 1.792383s (1.135869s in 38 GCs)") ("with-find-p" . "Elapsed time: 1.248543s (0.682827s in 20 GCs)") ("with-find-sync" . 
"Elapsed time: 0.922291s (0.343497s in 10 GCs)")) [-- Attachment #2: find-bench.el --] [-- Type: text/x-emacs-lisp, Size: 4648 bytes --] (defun find-directory-files-recursively (dir regexp &optional include-directories _p follow-symlinks) (cl-assert (null _p) t "find-directory-files-recursively can't accept arbitrary predicates") (with-temp-buffer (setq case-fold-search nil) (cd dir) (let* ((command (append (list "find" (file-local-name dir)) (if follow-symlinks '("-L") '("!" "(" "-type" "l" "-xtype" "d" ")")) (unless (string-empty-p regexp) (list "-regex" (concat ".*" regexp ".*"))) (unless include-directories '("!" "-type" "d")) '("-print0") )) (remote (file-remote-p dir)) (proc (if remote (let ((proc (apply #'start-file-process "find" (current-buffer) command))) (set-process-sentinel proc (lambda (_proc _state))) (set-process-query-on-exit-flag proc nil) proc) (make-process :name "find" :buffer (current-buffer) :connection-type 'pipe :noquery t :sentinel (lambda (_proc _state)) :command command)))) (while (accept-process-output proc)) (let ((start (goto-char (point-min))) ret) (while (search-forward "\0" nil t) (push (concat remote (buffer-substring-no-properties start (1- (point)))) ret) (setq start (point))) ret)))) (defun find-directory-files-recursively-2 (dir regexp &optional include-directories _p follow-symlinks) (cl-assert (null _p) t "find-directory-files-recursively can't accept arbitrary predicates") (cl-assert (not (file-remote-p dir))) (let* (buffered result (proc (make-process :name "find" :buffer nil :connection-type 'pipe :noquery t :sentinel (lambda (_proc _state)) :filter (lambda (proc data) (let ((start 0)) (when-let (end (string-search "\0" data start)) (push (concat buffered (substring data start end)) result) (setq buffered "") (setq start (1+ end)) (while-let ((end (string-search "\0" data start))) (push (substring data start end) result) (setq start (1+ end)))) (setq buffered (concat buffered (substring data start))))) :command (append 
(list "find" (file-local-name dir)) (if follow-symlinks '("-L") '("!" "(" "-type" "l" "-xtype" "d" ")")) (unless (string-empty-p regexp) (list "-regex" (concat ".*" regexp ".*"))) (unless include-directories '("!" "-type" "d")) '("-print0") )))) (while (accept-process-output proc)) result)) (defun find-directory-files-recursively-3 (dir regexp &optional include-directories _p follow-symlinks) (cl-assert (null _p) t "find-directory-files-recursively can't accept arbitrary predicates") (cl-assert (not (file-remote-p dir))) (let ((args `(,(file-local-name dir) ,@(if follow-symlinks '("-L") '("!" "(" "-type" "l" "-xtype" "d" ")")) ,@(unless (string-empty-p regexp) (list "-regex" (concat ".*" regexp ".*"))) ,@(unless include-directories '("!" "-type" "d")) "-print0"))) (with-temp-buffer (let ((status (apply #'process-file "find" nil t nil args)) (pt (point-min)) res) (unless (zerop status) (error "Listing failed")) (goto-char (point-min)) (while (search-forward "\0" nil t) (push (buffer-substring-no-properties pt (1- (point))) res) (setq pt (point))) res)))) (defun my-bench (count path regexp) (setq path (expand-file-name path)) ;; (let ((old (directory-files-recursively path regexp)) ;; (new (find-directory-files-recursively-3 path regexp))) ;; (dolist (path old) ;; (unless (member path new) (error "! %s not in" path))) ;; (dolist (path new) ;; (unless (member path old) (error "!! %s not in" path)))) (list (cons "built-in" (benchmark count (list 'directory-files-recursively path regexp))) (cons "with-find" (benchmark count (list 'find-directory-files-recursively path regexp))) (cons "with-find-p" (benchmark count (list 'find-directory-files-recursively-2 path regexp))) (cons "with-find-sync" (benchmark count (list 'find-directory-files-recursively-3 path regexp))))) ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-25 2:41 ` Dmitry Gutov @ 2023-07-25 8:22 ` Ihor Radchenko 2023-07-26 1:51 ` Dmitry Gutov 2023-07-25 18:42 ` bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Eli Zaretskii 2023-07-25 19:16 ` sbaugh 2 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-25 8:22 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, Eli Zaretskii, 64735 Dmitry Gutov <dmitry@gutov.dev> writes: > I'm attaching an extended benchmark, one that includes a "synchronous" > implementation as well. Please give it a spin as well. GNU/Linux SSD (my-bench 10 "/usr/src/linux/" "") (("built-in" . "Elapsed time: 7.034326s (3.598539s in 14 GCs)") ("built-in no filename handler alist" . "Elapsed time: 5.907194s (3.698456s in 15 GCs)") ("with-find" . "Elapsed time: 6.078056s (4.052791s in 16 GCs)") ("with-find-p" . "Elapsed time: 4.496762s (2.739565s in 11 GCs)") ("with-find-sync" . "Elapsed time: 3.702760s (1.715160s in 7 GCs)")) -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-25 8:22 ` Ihor Radchenko @ 2023-07-26 1:51 ` Dmitry Gutov 2023-07-26 9:09 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-26 1:51 UTC (permalink / raw) To: Ihor Radchenko; +Cc: luangruo, sbaugh, Eli Zaretskii, 64735 On 25/07/2023 11:22, Ihor Radchenko wrote: > Dmitry Gutov<dmitry@gutov.dev> writes: > >> I'm attaching an extended benchmark, one that includes a "synchronous" >> implementation as well. Please give it a spin as well. > GNU/Linux SSD > > (my-bench 10 "/usr/src/linux/" "") > > (("built-in" . "Elapsed time: 7.034326s (3.598539s in 14 GCs)") > ("built-in no filename handler alist" . "Elapsed time: 5.907194s (3.698456s in 15 GCs)") > ("with-find" . "Elapsed time: 6.078056s (4.052791s in 16 GCs)") > ("with-find-p" . "Elapsed time: 4.496762s (2.739565s in 11 GCs)") > ("with-find-sync" . "Elapsed time: 3.702760s (1.715160s in 7 GCs)")) Thanks for the extra data point in particular. Easy to see how it compares to the most efficient use of 'find', right (on GNU/Linux, at least)? It's also worth noting that, GC-wise, numbers 1 and 2 are not the worst: the time must be spent somewhere else. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-26 1:51 ` Dmitry Gutov @ 2023-07-26 9:09 ` Ihor Radchenko 2023-07-27 0:41 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-26 9:09 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, Eli Zaretskii, 64735 [-- Attachment #1: Type: text/plain, Size: 2592 bytes --] Dmitry Gutov <dmitry@gutov.dev> writes: >> (my-bench 10 "/usr/src/linux/" "") >> >> (("built-in" . "Elapsed time: 7.034326s (3.598539s in 14 GCs)") >> ("built-in no filename handler alist" . "Elapsed time: 5.907194s (3.698456s in 15 GCs)") >> ("with-find" . "Elapsed time: 6.078056s (4.052791s in 16 GCs)") >> ("with-find-p" . "Elapsed time: 4.496762s (2.739565s in 11 GCs)") >> ("with-find-sync" . "Elapsed time: 3.702760s (1.715160s in 7 GCs)")) > > Thanks, for the extra data point in particular. Easy to see how it > compares to the most efficient use of 'find', right (on GNU/Linix, at > least)? > > It's also something to note that, GC-wise, numbers 1 and 2 are not the > worst: the time must be spent somewhere else. Indeed. I did more detailed analysis in https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/ Main contributors in the lisp versions are (in the order from most significant to less significant) (1) file name handlers; (2) regexp matching of the file names; (3) nconc calls in the current `directory-files-recursively' implementation. I have modified `directory-files-recursively' to avoid O(N^2) `nconc' calls + bypassing regexp matches when REGEXP is nil. Here are the results (using the attached modified version of your benchmark file): (my-bench 10 "/usr/src/linux/" "") (("built-in" . "Elapsed time: 7.285597s (3.853368s in 6 GCs)") ("built-in no filename handler alist" . "Elapsed time: 5.855019s (3.760662s in 6 GCs)") ("built-in non-recursive no filename handler alist" . 
"Elapsed time: 5.817639s (4.326945s in 7 GCs)") ("built-in non-recursive no filename handler alist + skip re-match" . "Elapsed time: 2.708306s (1.871665s in 3 GCs)") ("with-find" . "Elapsed time: 6.082200s (4.262830s in 7 GCs)") ("with-find-p" . "Elapsed time: 4.325503s (3.058647s in 5 GCs)") ("with-find-sync" . "Elapsed time: 3.267648s (1.903655s in 3 GCs)")) (let ((gc-cons-threshold most-positive-fixnum)) (my-bench 10 "/usr/src/linux/" "")) (("built-in" . "Elapsed time: 2.754473s") ("built-in no filename handler alist" . "Elapsed time: 1.322443s") ("built-in non-recursive no filename handler alist" . "Elapsed time: 1.235044s") ("built-in non-recursive no filename handler alist + skip re-match" . "Elapsed time: 0.750275s") ("with-find" . "Elapsed time: 1.438510s") ("with-find-p" . "Elapsed time: 1.200876s") ("with-find-sync" . "Elapsed time: 1.349755s")) If we forget about GC, Elisp version can get fairly close to GNU find. And if we do not perform regexp matching (which makes sense when the REGEXP is ""), Elisp version is faster. [-- Attachment #2: find-bench.el --] [-- Type: text/plain, Size: 7254 bytes --] ;; -*- lexical-binding: t; -*- (defun find-directory-files-recursively (dir regexp &optional include-directories _p follow-symlinks) (cl-assert (null _p) t "find-directory-files-recursively can't accept arbitrary predicates") (with-temp-buffer (setq case-fold-search nil) (cd dir) (let* ((command (append (list "find" (file-local-name dir)) (if follow-symlinks '("-L") '("!" "(" "-type" "l" "-xtype" "d" ")")) (unless (string-empty-p regexp) (list "-regex" (concat ".*" regexp ".*"))) (unless include-directories '("!" 
"-type" "d")) '("-print0") )) (remote (file-remote-p dir)) (proc (if remote (let ((proc (apply #'start-file-process "find" (current-buffer) command))) (set-process-sentinel proc (lambda (_proc _state))) (set-process-query-on-exit-flag proc nil) proc) (make-process :name "find" :buffer (current-buffer) :connection-type 'pipe :noquery t :sentinel (lambda (_proc _state)) :command command)))) (while (accept-process-output proc)) (let ((start (goto-char (point-min))) ret) (while (search-forward "\0" nil t) (push (concat remote (buffer-substring-no-properties start (1- (point)))) ret) (setq start (point))) ret)))) (defun find-directory-files-recursively-2 (dir regexp &optional include-directories _p follow-symlinks) (cl-assert (null _p) t "find-directory-files-recursively can't accept arbitrary predicates") (cl-assert (not (file-remote-p dir))) (let* (buffered result (proc (make-process :name "find" :buffer nil :connection-type 'pipe :noquery t :sentinel (lambda (_proc _state)) :filter (lambda (proc data) (let ((start 0)) (when-let (end (string-search "\0" data start)) (push (concat buffered (substring data start end)) result) (setq buffered "") (setq start (1+ end)) (while-let ((end (string-search "\0" data start))) (push (substring data start end) result) (setq start (1+ end)))) (setq buffered (concat buffered (substring data start))))) :command (append (list "find" (file-local-name dir)) (if follow-symlinks '("-L") '("!" "(" "-type" "l" "-xtype" "d" ")")) (unless (string-empty-p regexp) (list "-regex" (concat ".*" regexp ".*"))) (unless include-directories '("!" "-type" "d")) '("-print0") )))) (while (accept-process-output proc)) result)) (defun find-directory-files-recursively-3 (dir regexp &optional include-directories _p follow-symlinks) (cl-assert (null _p) t "find-directory-files-recursively can't accept arbitrary predicates") (cl-assert (not (file-remote-p dir))) (let ((args `(,(file-local-name dir) ,@(if follow-symlinks '("-L") '("!" 
"(" "-type" "l" "-xtype" "d" ")")) ,@(unless (string-empty-p regexp) (list "-regex" (concat ".*" regexp ".*"))) ,@(unless include-directories '("!" "-type" "d")) "-print0"))) (with-temp-buffer (let ((status (apply #'process-file "find" nil t nil args)) (pt (point-min)) res) (unless (zerop status) (error "Listing failed")) (goto-char (point-min)) (while (search-forward "\0" nil t) (push (buffer-substring-no-properties pt (1- (point))) res) (setq pt (point))) res)))) (defun directory-files-recursively-strip-nconc (dir regexp &optional include-directories predicate follow-symlinks) "Return list of all files under directory DIR whose names match REGEXP. This function works recursively. Files are returned in \"depth first\" order, and files from each directory are sorted in alphabetical order. Each file name appears in the returned list in its absolute form. By default, the returned list excludes directories, but if optional argument INCLUDE-DIRECTORIES is non-nil, they are included. PREDICATE can be either nil (which means that all subdirectories of DIR are descended into), t (which means that subdirectories that can't be read are ignored), or a function (which is called with the name of each subdirectory, and should return non-nil if the subdirectory is to be descended into). If FOLLOW-SYMLINKS is non-nil, symbolic links that point to directories are followed. Note that this can lead to infinite recursion." (let* ((result nil) (dirs (list dir)) (dir (directory-file-name dir)) ;; When DIR is "/", remote file names like "/method:" could ;; also be offered. We shall suppress them. (tramp-mode (and tramp-mode (file-remote-p (expand-file-name dir))))) (while (setq dir (pop dirs)) (dolist (file (file-name-all-completions "" dir)) (unless (member file '("./" "../")) (if (directory-name-p file) (let* ((leaf (substring file 0 (1- (length file)))) (full-file (concat dir "/" leaf))) ;; Don't follow symlinks to other directories. 
(when (and (or (not (file-symlink-p full-file)) follow-symlinks) ;; Allow filtering subdirectories. (or (eq predicate nil) (eq predicate t) (funcall predicate full-file))) (push full-file dirs)) (when (and include-directories (string-match regexp leaf)) (setq result (nconc result (list full-file))))) (when (and regexp (string-match regexp file)) (push (concat dir "/" file) result)))))) (sort result #'string<))) (defun my-bench (count path regexp) (setq path (expand-file-name path)) ;; (let ((old (directory-files-recursively path regexp)) ;; (new (find-directory-files-recursively-3 path regexp))) ;; (dolist (path old) ;; (unless (member path new) (error "! %s not in" path))) ;; (dolist (path new) ;; (unless (member path old) (error "!! %s not in" path)))) (list (cons "built-in" (benchmark count (list 'directory-files-recursively path regexp))) (cons "built-in no filename handler alist" (let (file-name-handler-alist) (benchmark count (list 'directory-files-recursively path regexp)))) (cons "built-in non-recursive no filename handler alist" (let (file-name-handler-alist) (benchmark count (list 'directory-files-recursively-strip-nconc path regexp)))) (cons "built-in non-recursive no filename handler alist + skip re-match" (let (file-name-handler-alist) (benchmark count (list 'directory-files-recursively-strip-nconc path nil)))) (cons "with-find" (benchmark count (list 'find-directory-files-recursively path regexp))) (cons "with-find-p" (benchmark count (list 'find-directory-files-recursively-2 path regexp))) (cons "with-find-sync" (benchmark count (list 'find-directory-files-recursively-3 path regexp))))) (provide 'find-bench) [-- Attachment #3: Type: text/plain, Size: 224 bytes --] -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
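The key change in the `directory-files-recursively-strip-nconc' variant above is the accumulation strategy. A minimal sketch of the contrast (illustrative only, not the actual patch):

```elisp
;; O(N^2): each `nconc' re-traverses the whole RESULT list built so far
;; to find its tail before splicing on one more element.
(let (result)
  (dolist (file '("a" "b" "c"))
    (setq result (nconc result (list file))))
  result)                          ; ("a" "b" "c")

;; O(N): push onto the front in constant time, reverse once at the end.
(let (result)
  (dolist (file '("a" "b" "c"))
    (push file result))
  (nreverse result))               ; ("a" "b" "c")
```

With tens of thousands of files under /usr/src/linux, the repeated tail-walking in the first pattern is what dominates; the rewrite above pushes and sorts once instead.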
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-26 9:09 ` Ihor Radchenko @ 2023-07-27 0:41 ` Dmitry Gutov 2023-07-27 5:22 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-27 0:41 UTC (permalink / raw) To: Ihor Radchenko; +Cc: luangruo, sbaugh, Eli Zaretskii, 64735 On 26/07/2023 12:09, Ihor Radchenko wrote: >> It's also something to note that, GC-wise, numbers 1 and 2 are not the >> worst: the time must be spent somewhere else. > Indeed. I did more detailed analysis in > https://yhetil.org/emacs-devel/87cz0p2xlc.fsf@localhost/ > > Main contributors in the lisp versions are (in the order from most > significant to less significant) (1) file name handlers; (2) regexp > matching of the file names; (3) nconc calls in the current > `directory-files-recursively' implementation. > > I have modified `directory-files-recursively' to avoid O(N^2) `nconc' > calls + bypassing regexp matches when REGEXP is nil. Sounds good. I haven't examined the diff closely, but it sounds like an improvement that can be applied irrespective of how this discussion ends. Skipping regexp matching entirely, though, will make this benchmark farther removed from real-life usage: this thread started from being able to handle multiple ignore entries when listing files (e.g. in a project). So any solution for that (whether we use it on all or just some platforms) needs to be able to handle those. And it doesn't seem like directory-files-recursively has any alternative solution for that other than calling string-match on every found file. > Here are the results (using the attached modified version of your > benchmark file): > > (my-bench 10 "/usr/src/linux/" "") > (("built-in" . "Elapsed time: 7.285597s (3.853368s in 6 GCs)") > ("built-in no filename handler alist" . "Elapsed time: 5.855019s (3.760662s in 6 GCs)") > ("built-in non-recursive no filename handler alist" . 
"Elapsed time: 5.817639s (4.326945s in 7 GCs)") > ("built-in non-recursive no filename handler alist + skip re-match" . "Elapsed time: 2.708306s (1.871665s in 3 GCs)") > ("with-find" . "Elapsed time: 6.082200s (4.262830s in 7 GCs)") > ("with-find-p" . "Elapsed time: 4.325503s (3.058647s in 5 GCs)") > ("with-find-sync" . "Elapsed time: 3.267648s (1.903655s in 3 GCs)")) Nice. > (let ((gc-cons-threshold most-positive-fixnum)) > (my-bench 10 "/usr/src/linux/" "")) > (("built-in" . "Elapsed time: 2.754473s") > ("built-in no filename handler alist" . "Elapsed time: 1.322443s") > ("built-in non-recursive no filename handler alist" . "Elapsed time: 1.235044s") > ("built-in non-recursive no filename handler alist + skip re-match" . "Elapsed time: 0.750275s") > ("with-find" . "Elapsed time: 1.438510s") > ("with-find-p" . "Elapsed time: 1.200876s") > ("with-find-sync" . "Elapsed time: 1.349755s")) > > If we forget about GC, Elisp version can get fairly close to GNU find. > And if we do not perform regexp matching (which makes sense when the > REGEXP is ""), Elisp version is faster. We can't really forget about GC, though. But the above numbers make me hopeful about the async-parallel solution, implying that the parallelization really can help (and offset whatever latency we lose on pselect), as soon as we determine the source of extra consing and decide what to do about it. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-27 0:41 ` Dmitry Gutov @ 2023-07-27 5:22 ` Eli Zaretskii 2023-07-27 8:20 ` Ihor Radchenko 2023-07-27 13:30 ` Dmitry Gutov 0 siblings, 2 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-07-27 5:22 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Thu, 27 Jul 2023 03:41:29 +0300 > Cc: Eli Zaretskii <eliz@gnu.org>, luangruo@yahoo.com, sbaugh@janestreet.com, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > > I have modified `directory-files-recursively' to avoid O(N^2) `nconc' > > calls + bypassing regexp matches when REGEXP is nil. > > Sounds good. I haven't examined the diff closely, but it sounds like an > improvement that can be applied irrespective of how this discussion ends. That change should be submitted as a separate issue and discussed in detail before we decide we can make it. > Skipping regexp matching entirely, though, will make this benchmark > farther removed from real-life usage: this thread started from being > able to handle multiple ignore entries when listing files (e.g. in a > project). Agreed. From my POV, that variant's purpose was only to show how much time is spent in matching file names against some include or exclude list. > So any solution for that (whether we use it on all or just > some platforms) needs to be able to handle those. And it doesn't seem > like directory-files-recursively has any alternative solution for that > other than calling string-match on every found file. There's a possibility of pushing this filtering into file-name-all-completions, but I'm not sure that will be faster. We should try that and measure the results, I think. > > If we forget about GC, Elisp version can get fairly close to GNU find. > > And if we do not perform regexp matching (which makes sense when the > > REGEXP is ""), Elisp version is faster. > > We can't really forget about GC, though. 
But we could temporarily lift the threshold while this function runs, if that leads to significant savings. > But the above numbers make me hopeful about the async-parallel solution, > implying that the parallelization really can help (and offset whatever > latency we lose on pselect), as soon as we determine the source of extra > consing and decide what to do about it. Isn't it clear that additional consing comes from the fact that we first insert the Find's output into a buffer or produce a string from it, and then chop that into individual file names? ^ permalink raw reply [flat|nested] 213+ messages in thread
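Eli's suggestion of temporarily lifting the GC threshold while the function runs would look roughly like this (a sketch; the wrapper name is made up, and binding to `most-positive-fixnum' is one possible choice of "lifted" value):

```elisp
;; Sketch: bind gc-cons-threshold high around the traversal so no GC
;; runs mid-scan; a normal collection can happen after the `let' exits.
(defun my/directory-files-recursively-no-gc (dir regexp)
  (let ((gc-cons-threshold most-positive-fixnum))
    (directory-files-recursively dir regexp)))
```

The trade-off discussed below applies: memory use can balloon while the binding is in effect, which matters on low-memory systems.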
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-27 5:22 ` Eli Zaretskii @ 2023-07-27 8:20 ` Ihor Radchenko 2023-07-27 8:47 ` Eli Zaretskii 2023-07-27 13:30 ` Dmitry Gutov 1 sibling, 2 replies; 213+ messages in thread From: Ihor Radchenko @ 2023-07-27 8:20 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, Dmitry Gutov, 64735, sbaugh Eli Zaretskii <eliz@gnu.org> writes: >> > I have modified `directory-files-recursively' to avoid O(N^2) `nconc' >> > calls + bypassing regexp matches when REGEXP is nil. >> >> Sounds good. I haven't examined the diff closely, but it sounds like an >> improvement that can be applied irrespective of how this discussion ends. > > That change should be submitted as a separate issue and discussed in > detail before we decide we can make it. I will look into it. This was mostly a quick and dirty rewrite without paying too much attention to file order in the result. >> Skipping regexp matching entirely, though, will make this benchmark >> farther removed from real-life usage: this thread started from being >> able to handle multiple ignore entries when listing files (e.g. in a >> project). > > Agreed. From my POV, that variant's purpose was only to show how much > time is spent in matching file names against some include or exclude > list. Yes and no. It is not uncommon to query _all_ the files in a directory, and something as simple as (when (and (not (member regexp '("" ".*"))) (string-match regexp file))...) can give considerable speedup. Might be worth adding such an optimization. >> So any solution for that (whether we use it on all or just >> some platforms) needs to be able to handle those. And it doesn't seem >> like directory-files-recursively has any alternative solution for that >> other than calling string-match on every found file. > > There's a possibility of pushing this filtering into > file-name-all-completions, but I'm not sure that will be faster. 
We > should try that and measure the results, I think. Isn't `file-name-all-completions' more limited and cannot accept arbitrary regexp? >> We can't really forget about GC, though. > > But we could temporarily lift the threshold while this function runs, > if that leads to significant savings. Yup. Also, GC times and frequencies will vary across different Emacs sessions. So, we may not want to rely on it when comparing the benchmarks from different people. >> But the above numbers make me hopeful about the async-parallel solution, >> implying that the parallelization really can help (and offset whatever >> latency we lose on pselect), as soon as we determine the source of extra >> consing and decide what to do about it. > > Isn't it clear that additional consing comes from the fact that we > first insert the Find's output into a buffer or produce a string from > it, and then chop that into individual file names? To add to it, I also tried to implement a version of `directory-files-recursively' that first inserts all the files in buffer and then filters them using `re-search-forward' instead of calling `string-match' on every file name string. That ended up being slower compared to the current `string-match' approach. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
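The buffer-based filtering Ihor mentions trying would have looked roughly like this (a reconstruction of the idea for illustration, not his actual code):

```elisp
;; Sketch: instead of calling string-match on each name string,
;; insert the NUL-separated find output into a buffer and collect
;; matches with re-search-forward.  A real version would anchor the
;; pattern at NUL boundaries to avoid matching mid-name.
(with-temp-buffer
  (insert "foo.el\0bar.txt\0baz.el\0")   ; stand-in for find's output
  (goto-char (point-min))
  (let (matches)
    (while (re-search-forward "[^\0]*\\.el" nil t)
      (push (match-string 0) matches))
    (nreverse matches)))
```

Per his measurement, this ended up slower than the `string-match' approach, presumably because the match data and substring extraction cost the same while the buffer adds bookkeeping.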
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-27 8:20 ` Ihor Radchenko @ 2023-07-27 8:47 ` Eli Zaretskii 2023-07-27 9:28 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-27 8:47 UTC (permalink / raw) To: Ihor Radchenko; +Cc: luangruo, dmitry, 64735, sbaugh > From: Ihor Radchenko <yantar92@posteo.net> > Cc: Dmitry Gutov <dmitry@gutov.dev>, luangruo@yahoo.com, > sbaugh@janestreet.com, 64735@debbugs.gnu.org > Date: Thu, 27 Jul 2023 08:20:55 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> Skipping regexp matching entirely, though, will make this benchmark > >> farther removed from real-life usage: this thread started from being > >> able to handle multiple ignore entries when listing files (e.g. in a > >> project). > > > > Agreed. From my POV, that variant's purpose was only to show how much > > time is spent in matching file names against some include or exclude > > list. > > Yes and no. > > It is not uncommon to query _all_ the files in directory and something > as simple as > > (when (and (not (member regexp '("" ".*"))) (string-match regexp file))...) > > can give considerable speedup. I don't understand what you are saying. The current code already checks PREDICATE for being nil, and if it is, does nothing about filtering. And if this is about testing REGEXP for being a trivial one, adding such a test to the existing code is trivial, and hardly justifies an objection to what I wrote. > > There's a possibility of pushing this filtering into > > file-name-all-completions, but I'm not sure that will be faster. We > > should try that and measure the results, I think. > > Isn't `file-name-all-completions' more limited and cannot accept > arbitrary regexp? No, see completion-regexp-list. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-27 8:47 ` Eli Zaretskii @ 2023-07-27 9:28 ` Ihor Radchenko 0 siblings, 0 replies; 213+ messages in thread From: Ihor Radchenko @ 2023-07-27 9:28 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, dmitry, 64735, sbaugh Eli Zaretskii <eliz@gnu.org> writes: > And if this is about testing REGEXP for being a trivial one, adding > such a test to the existing code is trivial, and hardly justifies an > objection to what I wrote. I was replying to your interpretations on why I included "no-regexp" test. I agree that we should not use this test as comparison with GNU find. But I also wanted to say that adding the trivial REGEXP test will be useful. Especially because it is easy. Should I prepare a patch? >> Isn't `file-name-all-completions' more limited and cannot accept >> arbitrary regexp? > > No, see completion-regexp-list. That would be equivalent to forcing `include-directories' being t. In any case, even if we ignore INCLUDE-DIRECTORIES, there is no gain: (my-bench 10 "/usr/src/linux/" "") (("built-in non-recursive no filename handler alist" . "Elapsed time: 5.780714s (4.352086s in 6 GCs)") ("built-in non-recursive no filename handler alist + completion-regexp-list" . "Elapsed time: 5.739315s (4.359772s in 6 GCs)")) (let ((gc-cons-threshold most-positive-fixnum)) (my-bench 10 "/usr/src/linux/" "")) (("built-in non-recursive no filename handler alist" . "Elapsed time: 1.267828s") ("built-in non-recursive no filename handler alist + completion-regexp-list" . "Elapsed time: 1.275656s")) In the test, I removed all the `string-match' calls and instead let-bound (completion-regexp-list (list regexp)) -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
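Concretely, the variant benchmarked above replaces the per-file `string-match' calls with a let-binding around `file-name-all-completions'. A sketch (the directory path is illustrative):

```elisp
;; Sketch: let the completion machinery do the regexp filtering.
;; `completion-regexp-list' restricts the returned completions to
;; names matching every regexp in the list.  Note that directory
;; entries come back with a trailing "/", so a suffix-anchored regexp
;; like this one also filters directories out.
(let ((completion-regexp-list '("\\.el\\'")))
  (file-name-all-completions "" "/usr/share/emacs/site-lisp/"))
```

As the numbers show, this moves the matching into C but does not make the overall listing measurably faster.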
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-27 5:22 ` Eli Zaretskii 2023-07-27 8:20 ` Ihor Radchenko @ 2023-07-27 13:30 ` Dmitry Gutov 2023-07-29 0:12 ` Dmitry Gutov 1 sibling, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-27 13:30 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 27/07/2023 08:22, Eli Zaretskii wrote: >> Date: Thu, 27 Jul 2023 03:41:29 +0300 >> Cc: Eli Zaretskii <eliz@gnu.org>, luangruo@yahoo.com, sbaugh@janestreet.com, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >>> I have modified `directory-files-recursively' to avoid O(N^2) `nconc' >>> calls + bypassing regexp matches when REGEXP is nil. >> >> Sounds good. I haven't examined the diff closely, but it sounds like an >> improvement that can be applied irrespective of how this discussion ends. > > That change should be submitted as a separate issue and discussed in > detail before we decide we can make it. Sure. >>> If we forget about GC, Elisp version can get fairly close to GNU find. >>> And if we do not perform regexp matching (which makes sense when the >>> REGEXP is ""), Elisp version is faster. >> >> We can't really forget about GC, though. > > But we could temporarily lift the threshold while this function runs, > if that leads to significant savings. I mean, everything's doable, but if we do this for this function, why not others? Most long-running code would see an improvement from that kind of change (the 'find'-based solutions too). IIRC the main drawback is running out of memory in extreme conditions or on low-memory platforms/devices. It's not like this feature is particularly protected from this. >> But the above numbers make me hopeful about the async-parallel solution, >> implying that the parallelization really can help (and offset whatever >> latency we lose on pselect), as soon as we determine the source of extra >> consing and decide what to do about it. 
> > Isn't it clear that additional consing comes from the fact that we > first insert the Find's output into a buffer or produce a string from > it, and then chop that into individual file names? But we do that in all 'find'-based solutions: the synchronous one takes buffer text and chops it into strings. The first asynchronous does the same. The other ("with-find-p") works from a process filter, chopping up strings that get passed to it. But the amount of time spent in GC is different, with most of the difference in performance attributable to it: if we subtract time spent in GC, the runtimes are approximately equal. I can imagine that the filter-based approach necessarily creates more strings (to pass to the filter function). Maybe we could increase those strings' size (thus reducing the number) by increasing the read buffer size? I haven't found a relevant variable, though. Or if there was some other callback that runs after the next chunk of output arrives from the process, we could parse it from the buffer. But the insertion into the buffer would need to be made efficient (apparently internal-default-process-filter currently uses the same sequence of strings as the other filters for input, with the same amount of consing). ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-27 13:30 ` Dmitry Gutov @ 2023-07-29 0:12 ` Dmitry Gutov 2023-07-29 6:15 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-29 0:12 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 27/07/2023 16:30, Dmitry Gutov wrote: > I can imagine that the filter-based approach necessarily creates more > strings (to pass to the filter function). Maybe we could increase those > strings' size (thus reducing the number) by increasing the read buffer > size? To go further along this route, first of all, I verified that the input strings are (almost) all the same length: 4096. And they are parsed into strings with length 50-100 characters, meaning the number of "junk" objects due to the process-filter approach probably shouldn't matter too much, given that the number of strings returned is 40-80x more. But then I ran these tests with different values of read-process-output-max, which exactly increased those strings' size, proportionally reducing their number. The results were: > (my-bench-rpom 1 default-directory "") => (("with-find-p 4096" . "Elapsed time: 0.945478s (0.474680s in 6 GCs)") ("with-find-p 40960" . "Elapsed time: 0.760727s (0.395379s in 5 GCs)") ("with-find-p 409600" . 
"Elapsed time: 0.729757s (0.394881s in 5 GCs)")) where (defun my-bench-rpom (count path regexp) (setq path (expand-file-name path)) (list (cons "with-find-p 4096" (let ((read-process-output-max 4096)) (benchmark count (list 'find-directory-files-recursively-2 path regexp)))) (cons "with-find-p 40960" (let ((read-process-output-max 40960)) (benchmark count (list 'find-directory-files-recursively-2 path regexp)))) (cons "with-find-p 409600" (let ((read-process-output-max 409600)) (benchmark count (list 'find-directory-files-recursively-2 path regexp)))))) ...with the last iteration showing consistently the same or better performance than the "sync" version I benchmarked previously. What does that mean for us? The number of strings in the heap is reduced, but not by much (again, the result is a list with 43x more elements). The combined memory taken up by these intermediate strings to be garbage-collected, is the same. It seems like per-chunk overhead is non-trivial, and affects GC somehow (but not in a way that just any string would). In this test, by default, the output produces ~6000 strings and passes them to the filter function. Meaning, read_and_dispose_of_process_output is called about 6000 times, producing the overhead of roughly 0.2s. Something in there must be producing extra work for the GC. This line seems suspect: list3 (outstream, make_lisp_proc (p), text), Creates 3 conses and one Lisp object (tagged pointer). But maybe I'm missing something bigger. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-29 0:12 ` Dmitry Gutov @ 2023-07-29 6:15 ` Eli Zaretskii 2023-07-30 1:35 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-29 6:15 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Sat, 29 Jul 2023 03:12:34 +0300 > From: Dmitry Gutov <dmitry@gutov.dev> > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > > It seems like per-chunk overhead is non-trivial, and affects GC somehow > (but not in a way that just any string would). > > In this test, by default, the output produces ~6000 strings and passes > them to the filter function. Meaning, read_and_dispose_of_process_output > is called about 6000 times, producing the overhead of roughly 0.2s. > Something in there must be producing extra work for the GC. > > This line seems suspect: > > list3 (outstream, make_lisp_proc (p), text), > > Creates 3 conses and one Lisp object (tagged pointer). But maybe I'm > missing something bigger. I don't understand what puzzles you here. You need to make your descriptions more clear to allow others to follow your logic. You use terms you never explain: "junk objects", "number of strings in the heap", "per-chunk overhead" (what is "chunk"?), which is a no-no when explaining complex technical stuff to others. If I read what you wrote superficially, without delving into the details (which I can't understand), you are saying that the overall amount of consing is roughly the same. This is consistent with the fact that the GC times change only very little. So I don't think I see, on this level, what puzzles you in this picture. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-29 6:15 ` Eli Zaretskii @ 2023-07-30 1:35 ` Dmitry Gutov 2023-07-31 11:38 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-30 1:35 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 29/07/2023 09:15, Eli Zaretskii wrote: >> Date: Sat, 29 Jul 2023 03:12:34 +0300 >> From: Dmitry Gutov <dmitry@gutov.dev> >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, >> 64735@debbugs.gnu.org >> >> It seems like per-chunk overhead is non-trivial, and affects GC somehow >> (but not in a way that just any string would). >> >> In this test, by default, the output produces ~6000 strings and passes >> them to the filter function. Meaning, read_and_dispose_of_process_output >> is called about 6000 times, producing the overhead of roughly 0.2s. >> Something in there must be producing extra work for the GC. >> >> This line seems suspect: >> >> list3 (outstream, make_lisp_proc (p), text), >> >> Creates 3 conses and one Lisp object (tagged pointer). But maybe I'm >> missing something bigger. > > I don't understand what puzzles you here. You need to make your > descriptions more clear to allow others to follow your logic. You use > terms you never explain: "junk objects", "number of strings in the > heap", "per-chunk overhead" (what is "chunk"?), which is a no-no when > explaining complex technical stuff to others. In this context, junk objects are objects that will need to be collected by the garbage collector very soon because they are just a byproduct of a function's execution (but aren't used in the return value, for example). The more of them a function creates, the more work, supposedly, for the GC. Heap is perhaps the wrong term (given that C has its own notion of heap), but I meant the memory managed by the Lisp runtime. And chunks are the buffered strings that get passed to the process filter.
Chunks of the process' output. By default, these chunks are 4096 characters long, but the comparisons tweak that value by 10x and 100x. > If I read what you wrote superficially, without delving into the > details (which I can't understand), you are saying that the overall > amount of consing is roughly the same. What is "amount of consing"? Is it just the number of objects? Or does their size (e.g. string length) affect GC pressure as well? > This is consistent with the > fact that the GC times change only very little. So I don't think I > see, on this level, what puzzles you in this picture. Now that you pointed that out, the picture is just more puzzling. While 0.1s in GC is not insignificant (it's 10% of the whole runtime), it does seem to have been more of a fluke, and on average the fluctuations in GC time are smaller. Here's an extended comparison: (("with-find 4096" . "Elapsed time: 1.737742s (1.019624s in 28 GCs)") ("with-find 40960" . "Elapsed time: 1.515376s (0.942906s in 26 GCs)") ("with-find 409600" . "Elapsed time: 1.458987s (0.948857s in 26 GCs)") ("with-find 1048576" . "Elapsed time: 1.363882s (0.888599s in 24 GCs)") ("with-find-p 4096" . "Elapsed time: 1.202522s (0.745758s in 19 GCs)") ("with-find-p 40960" . "Elapsed time: 1.005221s (0.640815s in 16 GCs)") ("with-find-p 409600" . "Elapsed time: 0.855483s (0.591208s in 15 GCs)") ("with-find-p 1048576" . "Elapsed time: 0.825936s (0.623876s in 16 GCs)") ("with-find-sync 4096" . "Elapsed time: 0.848059s (0.272570s in 7 GCs)") ("with-find-sync 409600" . "Elapsed time: 0.912932s (0.339230s in 9 GCs)") ("with-find-sync 1048576" . "Elapsed time: 0.880479s (0.303047s in 8 GCs)")) What was puzzling for me, overall, is that if we take "with-find 409600" (the fastest among the asynchronous runs without parallelism) and "with-find-sync", the difference in GC time (which is repeatable), 0.66s, almost covers all the difference in performance. And as for "with-find-p 409600", it would come out on top!
Which it did in Ihor's tests when GC was disabled. But where does the extra GC time come from? Is it from extra consing in the asynchronous call's case? If it is, it's not from all the chunked strings, apparently, given that increasing max string's size (and decreasing their number by 2x-6x, according to my logging) doesn't affect the reported GC time much. Could the extra time spent in GC just come from the fact that it's given more opportunities to run, maybe? call_process stays entirely in C, whereas make-process, with its asynchronous approach, goes between C and Lisp every time it receives input. The report above might indicate so: the with-find-p runs have ~20 garbage collection cycles, whereas with-find-sync - only ~10. Or could there be some other source of consing, unrelated to the process output strings and how finely they are sliced? Changing process-adaptive-read-buffering to nil didn't have any effect here. If we get back to increasing read-process-output-max, which does help (apparently due to reducing the number of times we switch between reading from the process and doing... whatever else), the sweet spot seems to be 1048576, which is my system's maximum value. Anything higher, and performance gets worse again -- I'm guessing something somewhere resets the value to the default? Not sure why it doesn't clip to the maximum allowed, though. Anyway, it would be helpful to be able to decide on as high a value as possible without manually reading from /proc/sys/fs/pipe-max-size. And what of other OSes? ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-30 1:35 ` Dmitry Gutov @ 2023-07-31 11:38 ` Eli Zaretskii 2023-09-08 0:53 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-31 11:38 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Sun, 30 Jul 2023 04:35:49 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > In this context, junks objects are objects that will need to be > collected by garbage collector very soon because they are just a > byproduct of a function's execution (but aren't used in the return > value, for example). The more of them a function creates, the more work > it will be, supposedly, for the GC. > > Heap is perhaps the wrong term (given that C has its own notion of > heap), but I meant the memory managed by the Lisp runtime. > > And chunks are the buffered strings that get passed to the process > filter. Chunks of the process' output. By default, these chunks are 4096 > characters long, but the comparisons tweak that value by 10x and 100x. If the subprocess output is inserted into a buffer, its effect on the GC will be different. (Not sure if this is relevant to the issue at hand, as I lost track of the many variants of the function that were presented.) > > If I read what you wrote superficially, without delving into the > > details (which I can't understand), you are saying that the overall > > amount of consing is roughly the same. > > What is "amount of consing"? Is it just the number of objects? Or does > their size (e.g. string length) affect GC pressure as well? In general, both, since we have 2 GC thresholds, and GC is actually done when both are exceeded. So the effect will also depend on how much Lisp memory is already allocated in the Emacs process where these benchmarks are run. 
> > This is consistent with the > > fact that the GC times change only very little. So I don't think I > > see, on this level, what puzzles you in this picture. > > Now that you pointed that out, the picture is just more puzzling. While > 0.1s in GC is not insignificant (it's 10% of the whole runtime), it does > seem to have been more of a fluke, and on average the fluctuations in GC > time are smaller. > > Here's an extended comparison: > > (("with-find 4096" . "Elapsed time: 1.737742s (1.019624s in 28 GCs)") > ("with-find 40960" . "Elapsed time: 1.515376s (0.942906s in 26 GCs)") > ("with-find 409600" . "Elapsed time: 1.458987s (0.948857s in 26 GCs)") > ("with-find 1048576" . "Elapsed time: 1.363882s (0.888599s in 24 GCs)") > ("with-find-p 4096" . "Elapsed time: 1.202522s (0.745758s in 19 GCs)") > ("with-find-p 40960" . "Elapsed time: 1.005221s (0.640815s in 16 GCs)") > ("with-find-p 409600" . "Elapsed time: 0.855483s (0.591208s in 15 GCs)") > ("with-find-p 1048576". "Elapsed time: 0.825936s (0.623876s in 16 GCs)") > ("with-find-sync 4096" . "Elapsed time: 0.848059s (0.272570s in 7 GCs)") > ("with-find-sync 409600"."Elapsed time: 0.912932s (0.339230s in 9 GCs)") > ("with-find-sync 1048576"."Elapsed time: 0.880479s (0.303047s in 8 GCs)" > )) > > What was puzzling for me, overall, is that if we take "with-find 409600" > (the fastest among the asynchronous runs without parallelism) and > "with-find-sync", the difference in GC time (which is repeatable), > 0.66s, almost covers all the difference in performance. And as for > "with-find-p 409600", it would come out on top! Which it did in Ihor's > tests when GC was disabled. > > But where does the extra GC time come from? Is it from extra consing in > the asynchronous call's case? If it is, it's not from all the chunked > strings, apparently, given that increasing max string's size (and > decreasing their number by 2x-6x, according to my logging) doesn't > affect the reported GC time much. 
> > Could the extra time spent in GC just come from the fact that it's given > more opportunities to run, maybe? call_process stays entirely in C, > whereas make-process, with its asynchronous approach, goes between C and > Lisp even time it receives input. The report above might indicate so: > with-find-p have ~20 garbage collection cycles, whereas with-find-sync - > only ~10. Or could there be some other source of consing, unrelated to > the process output string, and how finely they are sliced? These questions can only be answered by dumping the values of the 2 GC thresholds and of consing_until_gc for each GC cycle. It could be that we are consing more Lisp memory, or it could be that one of the implementations provides fewer opportunities for Emacs to call maybe_gc. Or it could be some combination of the two. > If we get back to increasing read-process-output-max, which does help > (apparently due to reducing the number we switch between reading from > the process and doing... whatever else), the sweet spot seems to be > 1048576, which is my system's maximum value. Anything higher - and the > perf goes back to worse -- I'm guessing something somewhere resets the > value to default? Not sure why it doesn't clip to the maximum allowed, > though. > > Anyway, it would be helpful to be able to decide on as high as possible > value without manually reading from /proc/sys/fs/pipe-max-size. And what > of other OSes? Is this with pipes or with PTYs? ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-31 11:38 ` Eli Zaretskii @ 2023-09-08 0:53 ` Dmitry Gutov 2023-09-08 6:35 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-09-08 0:53 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 Let's try to investigate this some more, if we can. On 31/07/2023 14:38, Eli Zaretskii wrote: >> Date: Sun, 30 Jul 2023 04:35:49 +0300 >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >> In this context, junks objects are objects that will need to be >> collected by garbage collector very soon because they are just a >> byproduct of a function's execution (but aren't used in the return >> value, for example). The more of them a function creates, the more work >> it will be, supposedly, for the GC. >> >> Heap is perhaps the wrong term (given that C has its own notion of >> heap), but I meant the memory managed by the Lisp runtime. >> >> And chunks are the buffered strings that get passed to the process >> filter. Chunks of the process' output. By default, these chunks are 4096 >> characters long, but the comparisons tweak that value by 10x and 100x. > > If the subprocess output is inserted into a buffer, its effect on the > GC will be different. (Not sure if this is relevant to the issue at > hand, as I lost track of the many variants of the function that were > presented.) Yes, one of the variants inserts into the buffer (one that uses a synchronous process call and also, coincidentally, spends less time in GC), and the asynchronous ones work from a process filter. >>> If I read what you wrote superficially, without delving into the >>> details (which I can't understand), you are saying that the overall >>> amount of consing is roughly the same. >> >> What is "amount of consing"? Is it just the number of objects? Or does >> their size (e.g. 
string length) affect GC pressure as well? > > In general, both, since we have 2 GC thresholds, and GC is actually > done when both are exceeded. So the effect will also depend on how > much Lisp memory is already allocated in the Emacs process where these > benchmarks are run. All right. >>> This is consistent with the >>> fact that the GC times change only very little. So I don't think I >>> see, on this level, what puzzles you in this picture. >> >> Now that you pointed that out, the picture is just more puzzling. While >> 0.1s in GC is not insignificant (it's 10% of the whole runtime), it does >> seem to have been more of a fluke, and on average the fluctuations in GC >> time are smaller. >> >> Here's an extended comparison: >> >> (("with-find 4096" . "Elapsed time: 1.737742s (1.019624s in 28 GCs)") >> ("with-find 40960" . "Elapsed time: 1.515376s (0.942906s in 26 GCs)") >> ("with-find 409600" . "Elapsed time: 1.458987s (0.948857s in 26 GCs)") >> ("with-find 1048576" . "Elapsed time: 1.363882s (0.888599s in 24 GCs)") >> ("with-find-p 4096" . "Elapsed time: 1.202522s (0.745758s in 19 GCs)") >> ("with-find-p 40960" . "Elapsed time: 1.005221s (0.640815s in 16 GCs)") >> ("with-find-p 409600" . "Elapsed time: 0.855483s (0.591208s in 15 GCs)") >> ("with-find-p 1048576". "Elapsed time: 0.825936s (0.623876s in 16 GCs)") >> ("with-find-sync 4096" . "Elapsed time: 0.848059s (0.272570s in 7 GCs)") >> ("with-find-sync 409600"."Elapsed time: 0.912932s (0.339230s in 9 GCs)") >> ("with-find-sync 1048576"."Elapsed time: 0.880479s (0.303047s in 8 GCs)" >> )) >> >> What was puzzling for me, overall, is that if we take "with-find 409600" >> (the fastest among the asynchronous runs without parallelism) and >> "with-find-sync", the difference in GC time (which is repeatable), >> 0.66s, almost covers all the difference in performance. And as for >> "with-find-p 409600", it would come out on top! Which it did in Ihor's >> tests when GC was disabled. 
>> >> But where does the extra GC time come from? Is it from extra consing in >> the asynchronous call's case? If it is, it's not from all the chunked >> strings, apparently, given that increasing max string's size (and >> decreasing their number by 2x-6x, according to my logging) doesn't >> affect the reported GC time much. >> >> Could the extra time spent in GC just come from the fact that it's given >> more opportunities to run, maybe? call_process stays entirely in C, >> whereas make-process, with its asynchronous approach, goes between C and >> Lisp even time it receives input. The report above might indicate so: >> with-find-p have ~20 garbage collection cycles, whereas with-find-sync - >> only ~10. Or could there be some other source of consing, unrelated to >> the process output string, and how finely they are sliced? > > These questions can only be answered by dumping the values of the 2 GC > thresholds and of consing_until_gc for each GC cycle. It could be > that we are consing more Lisp memory, or it could be that one of the > implementations provides fewer opportunities for Emacs to call > maybe_gc. Or it could be some combination of the two. Do you think the outputs of https://elpa.gnu.org/packages/emacs-gc-stats.html could help? Otherwise, I suppose I need to add some fprintf's somewhere. Would the beginning of maybe_gc inside lisp.h be a good place for that? >> If we get back to increasing read-process-output-max, which does help >> (apparently due to reducing the number we switch between reading from >> the process and doing... whatever else), the sweet spot seems to be >> 1048576, which is my system's maximum value. Anything higher - and the >> perf goes back to worse -- I'm guessing something somewhere resets the >> value to default? Not sure why it doesn't clip to the maximum allowed, >> though. >> >> Anyway, it would be helpful to be able to decide on as high as possible >> value without manually reading from /proc/sys/fs/pipe-max-size. 
And what >> of other OSes? > > Is this with pipes or with PTYs? All examples which use make-process call it with :connection-type 'pipe. The one that calls process-file (the "synchronous" impl) also probably does, but I don't see that in the docstring. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-08 0:53 ` Dmitry Gutov @ 2023-09-08 6:35 ` Eli Zaretskii 2023-09-10 1:30 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-09-08 6:35 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Fri, 8 Sep 2023 03:53:37 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > >> (("with-find 4096" . "Elapsed time: 1.737742s (1.019624s in 28 GCs)") > >> ("with-find 40960" . "Elapsed time: 1.515376s (0.942906s in 26 GCs)") > >> ("with-find 409600" . "Elapsed time: 1.458987s (0.948857s in 26 GCs)") > >> ("with-find 1048576" . "Elapsed time: 1.363882s (0.888599s in 24 GCs)") > >> ("with-find-p 4096" . "Elapsed time: 1.202522s (0.745758s in 19 GCs)") > >> ("with-find-p 40960" . "Elapsed time: 1.005221s (0.640815s in 16 GCs)") > >> ("with-find-p 409600" . "Elapsed time: 0.855483s (0.591208s in 15 GCs)") > >> ("with-find-p 1048576". "Elapsed time: 0.825936s (0.623876s in 16 GCs)") > >> ("with-find-sync 4096" . "Elapsed time: 0.848059s (0.272570s in 7 GCs)") > >> ("with-find-sync 409600"."Elapsed time: 0.912932s (0.339230s in 9 GCs)") > >> ("with-find-sync 1048576"."Elapsed time: 0.880479s (0.303047s in 8 GCs)" > >> )) > >> > >> What was puzzling for me, overall, is that if we take "with-find 409600" > >> (the fastest among the asynchronous runs without parallelism) and > >> "with-find-sync", the difference in GC time (which is repeatable), > >> 0.66s, almost covers all the difference in performance. And as for > >> "with-find-p 409600", it would come out on top! Which it did in Ihor's > >> tests when GC was disabled. > >> > >> But where does the extra GC time come from? Is it from extra consing in > >> the asynchronous call's case? 
If it is, it's not from all the chunked > >> strings, apparently, given that increasing max string's size (and > >> decreasing their number by 2x-6x, according to my logging) doesn't > >> affect the reported GC time much. > >> > >> Could the extra time spent in GC just come from the fact that it's given > >> more opportunities to run, maybe? call_process stays entirely in C, > >> whereas make-process, with its asynchronous approach, goes between C and > >> Lisp even time it receives input. The report above might indicate so: > >> with-find-p have ~20 garbage collection cycles, whereas with-find-sync - > >> only ~10. Or could there be some other source of consing, unrelated to > >> the process output string, and how finely they are sliced? > > > > These questions can only be answered by dumping the values of the 2 GC > > thresholds and of consing_until_gc for each GC cycle. It could be > > that we are consing more Lisp memory, or it could be that one of the > > implementations provides fewer opportunities for Emacs to call > > maybe_gc. Or it could be some combination of the two. > > Do you think the outputs of > https://elpa.gnu.org/packages/emacs-gc-stats.html could help? I think you'd need to expose consing_until_gc to Lisp, and then you can collect the data from Lisp. > Otherwise, I suppose I need to add some fprintf's somewhere. Would the > beginning of maybe_gc inside lisp.h be a good place for that? I can only recommend the fprintf method if doing this from Lisp is impossible for some reason. > >> If we get back to increasing read-process-output-max, which does help > >> (apparently due to reducing the number we switch between reading from > >> the process and doing... whatever else), the sweet spot seems to be > >> 1048576, which is my system's maximum value. Anything higher - and the > >> perf goes back to worse -- I'm guessing something somewhere resets the > >> value to default? Not sure why it doesn't clip to the maximum allowed, > >> though. 
> >> > >> Anyway, it would be helpful to be able to decide on as high as possible > >> value without manually reading from /proc/sys/fs/pipe-max-size. And what > >> of other OSes? > > > > Is this with pipes or with PTYs? > > All examples which use make-process call it with :connection-type 'pipe. > > The one that calls process-file (the "synchronous" impl) also probably > does, but I don't see that in the docstring. Yes, call-process uses pipes. So finding the optimum boils down to running various scenarios. It is also possible that the optimum will be different on different systems, btw. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-08 6:35 ` Eli Zaretskii @ 2023-09-10 1:30 ` Dmitry Gutov 2023-09-10 5:33 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-09-10 1:30 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 08/09/2023 09:35, Eli Zaretskii wrote: >>> These questions can only be answered by dumping the values of the 2 GC >>> thresholds and of consing_until_gc for each GC cycle. It could be >>> that we are consing more Lisp memory, or it could be that one of the >>> implementations provides fewer opportunities for Emacs to call >>> maybe_gc. Or it could be some combination of the two. >> Do you think the outputs of >> https://elpa.gnu.org/packages/emacs-gc-stats.html could help? > I think you'd need to expose consing_until_gc to Lisp, and then you > can collect the data from Lisp. I can expose it to Lisp and print all three from post-gc-hook, but the result just looks like this: gc-pct 0.1 gc-thr 800000 cugc 4611686018427387903 Perhaps I need to add a hook which runs at the beginning of GC? Or of maybe_gc even? Alternatively, (memory-use-counts) seems to retain some counters which don't get erased during garbage collection. >> All examples which use make-process call it with :connection-type 'pipe. >> >> The one that calls process-file (the "synchronous" impl) also probably >> does, but I don't see that in the docstring. > Yes, call-process uses pipes. So finding the optimum boils down to > running various scenarios. It is also possible that the optimum will > be different on different systems, btw. Sure, but I'd like to improve the state of affairs in at least the main one. And as for MS Windows, IIRC all find-based solutions are currently equally slow, so we're unlikely to make things worse there anyway. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-10 1:30 ` Dmitry Gutov @ 2023-09-10 5:33 ` Eli Zaretskii 2023-09-11 0:02 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-09-10 5:33 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Sun, 10 Sep 2023 04:30:24 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > On 08/09/2023 09:35, Eli Zaretskii wrote: > > I think you'd need to expose consing_until_gc to Lisp, and then you > > can collect the data from Lisp. > > I can expose it to Lisp and print all three from post-gc-hook, but the > result just looks like this: > > gc-pct 0.1 gc-thr 800000 cugc 4611686018427387903 > > Perhaps I need to add a hook which runs at the beginning of GC? Or of > maybe_gc even? You could record its value in a local variable at the entry to garbage_collect, and then expose that value to Lisp. > Alternatively, (memory-use-counts) seems to retain some counters which > don't get erased during garbage collection. Maybe using those will be good enough, indeed. > And as for MS Windows, IIRC all find-based solution are currently slow > equally, so we're unlikely to make things worse there anyway. I was actually thinking about *BSD and macOS. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-10 5:33 ` Eli Zaretskii @ 2023-09-11 0:02 ` Dmitry Gutov 2023-09-11 11:57 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-09-11 0:02 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 [-- Attachment #1: Type: text/plain, Size: 3466 bytes --] On 10/09/2023 08:33, Eli Zaretskii wrote: >> Date: Sun, 10 Sep 2023 04:30:24 +0300 >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >> On 08/09/2023 09:35, Eli Zaretskii wrote: >>> I think you'd need to expose consing_until_gc to Lisp, and then you >>> can collect the data from Lisp. >> >> I can expose it to Lisp and print all three from post-gc-hook, but the >> result just looks like this: >> >> gc-pct 0.1 gc-thr 800000 cugc 4611686018427387903 >> >> Perhaps I need to add a hook which runs at the beginning of GC? Or of >> maybe_gc even? > > You could record its value in a local variable at the entry to > garbage_collect, and the expose that value to Lisp. That also doesn't seem to give much, given that the condition for entering 'maybe_garbage_collect' is (consing_until_gc < 0). I.e. we wait until it's down to 0, then garbage-collect. What we could perhaps do is add another hook (or a printer) at the beginning of maybe_gc, but either would result in lots and lots of output. >> Alternatively, (memory-use-counts) seems to retain some counters which >> don't get erased during garbage collection. > > Maybe using those will be good enough, indeed. 
I added this instrumentation: (defvar last-mem-counts '(0 0 0 0 0 0 0)) (defun gc-record-after () (let* ((counts (memory-use-counts)) (diff (cl-map 'list (lambda (old new) (- new old)) last-mem-counts counts))) (setq last-mem-counts counts) (message "counts diff %s" diff))) (add-hook 'post-gc-hook #'gc-record-after) so that after each garbage collection we print the differences in all counters (CONSES FLOATS VECTOR-CELLS SYMBOLS STRING-CHARS INTERVALS STRINGS). And a message call when the process finishes. And made those recordings during the benchmark runs of two different listing methods (one using make-process, another using process-file) to list all files in a large directory (there are ~200000 files there). The make-process one I also ran with a different (large) value of read-process-output-max. Results attached. What's in there? First of all, for find-directory-files-recursively-3, there are 0 garbage collections between the beginning of the function and when we start parsing the output (no GCs while the process is writing to the buffer synchronously). I guess inserting output in a buffer doesn't increase consing, so there's nothing to GC? Next: for find-directory-files-recursively-2, the process only finishes at the end, when all GC cycles are done for. I suppose that also means we block the process's output while Lisp is running, and also that whatever GC events occur might coincide with the chunks of output coming from the process, and however many of them turn out to be in total. So there is also a second recording for find-directory-files-recursively-2 with read-process-output-max=409600. It does improve the performance significantly (and reduce the number of GC pauses). 
I guess what I'm still not clear on, is whether the number of GC pauses is fewer because of less consing (the only column that looks significantly different is the 3rd: VECTOR-CELLS), or because the process finishes faster due to larger buffers, which itself causes fewer calls to maybe_gc. And, of course, what else could be done to reduce the time spent in GC in the asynchronous case. [-- Attachment #2: gcs2.txt --] [-- Type: text/plain, Size: 2184 bytes --] find-directory-files-recursively-2 Uses make-process and :filter to parse the output concurrently with the process. With (read-process-output-max 4096): start now counts diff (75840 13 31177 60 343443 3496 4748) counts diff (41946 1 460 0 1226494 0 8425) counts diff (43165 1 450 0 1284214 0 8951) counts diff (43513 1 364 0 1343316 0 10125) counts diff (43200 1 384 0 1479048 0 9766) counts diff (46220 1 428 0 1528863 0 10242) counts diff (43125 1 462 0 1767068 0 8790) counts diff (49118 1 458 0 1723271 0 10832) counts diff (53156 1 572 0 1789919 0 10774) counts diff (57755 1 548 0 1783286 0 12600) counts diff (62171 1 554 0 1795216 0 13995) counts diff (62020 1 550 0 1963255 0 13996) counts diff (54559 1 616 0 2387308 0 10700) counts diff (56428 1 634 0 2513219 0 11095) counts diff (62611 1 658 0 2510756 0 12864) counts diff (67560 1 708 0 2574312 0 13899) counts diff (78154 1 928 0 2572273 0 14714) counts diff (86794 1 976 0 2520915 0 17004) counts diff (78112 1 874 0 2943548 0 15367) counts diff (79443 1 894 0 3138948 0 15559) counts diff (81861 1 984 0 3343764 0 15260) counts diff (87724 1 1030 0 3430969 0 16650) counts diff (88532 1 902 0 3591052 0 18487) counts diff (92083 1 952 0 3769290 0 19065) <finished\n> Elapsed time: 1.344422s (0.747126s in 24 GCs) And here's with (read-process-output-max 409600): start now counts diff (57967 1 4040 1 981912 106 7731) counts diff (32075 1 20 0 1919096 0 10560) counts diff (43431 1 18 0 2259314 0 14371) counts diff (46335 1 18 0 2426290 0 15339) counts diff (31872 1 
18 0 2447639 0 10518) counts diff (46527 1 18 0 2328042 0 15403) counts diff (42468 1 18 0 2099976 0 14050) counts diff (48648 1 18 0 2302713 0 16110) counts diff (50404 1 20 0 3260921 0 16669) counts diff (40147 1 20 0 3264463 0 13251) counts diff (48118 1 20 0 3261725 0 15908) counts diff (60732 1 282 0 2791003 0 16785) counts diff (71329 1 506 0 2762237 0 17487) counts diff (61455 1 342 0 3192771 0 16271) counts diff (49035 1 30 0 3663715 0 16085) counts diff (58651 1 236 0 3783888 0 16683) counts diff (57132 1 24 0 4557688 0 18862) counts diff (71319 1 24 0 4769891 0 23591) <finished\n> Elapsed time: 0.890710s (0.546486s in 18 GCs) [-- Attachment #3: gcs3.txt --] [-- Type: text/plain, Size: 715 bytes --] find-directory-files-recursively-3 Uses process-file, parses the buffer with search-forward at the end. start now <process finished, now parsing> counts diff (62771 5 26629 63 458211 3223 8038) counts diff (17045 1 12 0 1288153 0 16949) counts diff (18301 1 12 0 1432165 0 18205) counts diff (17643 1 12 0 1716294 0 17547) counts diff (21917 1 12 0 1726462 0 21821) counts diff (25888 1 12 0 1777371 0 25792) counts diff (21743 1 12 0 2345143 0 21647) counts diff (24035 1 12 0 2561491 0 23939) counts diff (30028 1 12 0 2593069 0 29932) counts diff (29627 1 12 0 3041307 0 29531) counts diff (30140 1 12 0 3479209 0 30044) counts diff (35181 1 12 0 3690480 0 35085) Elapsed time: 0.943090s (0.351799s in 12 GCs) ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-11 0:02 ` Dmitry Gutov @ 2023-09-11 11:57 ` Eli Zaretskii 2023-09-11 23:06 ` Dmitry Gutov 2023-09-12 14:23 ` Dmitry Gutov 0 siblings, 2 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-09-11 11:57 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Mon, 11 Sep 2023 03:02:55 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > > You could record its value in a local variable at the entry to > > garbage_collect, and the expose that value to Lisp. > > That also doesn't seem to give much, given that the condition for > entering 'maybe_garbage_collect' is (consing_until_gc < 0). I.e. we wait > until it's down to 0, then garbage-collect. No, we don't wait until it's zero, we perform GC on the first opportunity that we _notice_ that it crossed zero. So examining how negative the value of consing_until_gc is when GC is actually performed could tell us whether we checked the threshold with high enough frequency, and comparing these values between different runs could tell us whether the shorter time spent in GC really means less garbage or less frequent checks for the need to GC. > What's in there? First of all, for find-directory-files-recursively-3, > there are 0 garbage collections between the beginning of the function > and when we start parsing the output (no GCs while the process is > writing to the buffer synchronously). I guess inserting output in a > buffer doesn't increase consing, so there's nothing to GC? No, we just don't count the increasing size of buffer text in the "consing since GC" counter. Basically, buffer text is never "garbage", except when a buffer is killed. > Next: for find-directory-files-recursively-2, the process only finishes > at the end, when all GC cycles are done for. 
I suppose that also means > we block the process's output while Lisp is running, and also that > whatever GC events occur might coincide with the chunks of output coming > from the process, and however many of them turn out to be in total. We don't block the process when GC runs. We do stop reading from the process, so if and when the pipe fills, the OS will block the process. > So there is also a second recording for > find-directory-files-recursively-2 with read-process-output-max=409600. > It does improve the performance significantly (and reduce the number of > GC pauses). I guess what I'm still not clear on, is whether the number > of GC pauses is fewer because of less consing (the only column that > looks significantly different is the 3rd: VECTOR-CELLS), or because the > process finishes faster due to larger buffers, which itself causes fewer > calls to maybe_gc. I think the latter. ^ permalink raw reply [flat|nested] 213+ messages in thread
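[Editor's note: the "how negative did consing_until_gc get" idea above can be made concrete with a toy model. This is plain C, not Emacs's actual alloc.c/process.c logic; the names `budget`, `cons_chunk`, and `min_observed` are invented for illustration. The point it demonstrates: when consing happens in large unchecked bursts, the counter is far below zero by the time the maybe_gc-style check notices.]

#include <assert.h>
#include <stdio.h>

static long budget;   /* stands in for consing_until_gc */

/* "Cons" CHUNK bytes with no check in between, then run the
   maybe_gc-style check.  Returns the counter value observed at check
   time if a collection fired, else 0.  */
static long
cons_chunk (long chunk, long threshold)
{
  budget -= chunk;              /* consing happens unchecked */
  if (budget < 0)               /* the check notices it crossed zero */
    {
      long observed = budget;
      budget = threshold;       /* the simulated GC resets the counter */
      return observed;
    }
  return 0;
}

/* Most negative value the counter reaches while consing TOTAL bytes
   in CHUNK-sized pieces.  */
static long
min_observed (long chunk, long total, long threshold)
{
  long min = 0;
  budget = threshold;
  for (long done = 0; done < total; done += chunk)
    {
      long seen = cons_chunk (chunk, threshold);
      if (seen < min)
        min = seen;
    }
  return min;
}

int
main (void)
{
  long small = min_observed (4096, 1 << 22, 800000);
  long large = min_observed (409600, 1 << 22, 800000);
  /* Small chunks are caught soon after the counter crosses zero;
     large chunks overshoot by up to roughly one chunk size, much like
     the big negative cugc readings in the 409600 logs below.  */
  printf ("4 KB chunks: min %ld; 400 KB chunks: min %ld\n", small, large);
  assert (small > large);
  return 0;
}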
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-11 11:57 ` Eli Zaretskii @ 2023-09-11 23:06 ` Dmitry Gutov 2023-09-12 11:39 ` Eli Zaretskii 2023-09-12 14:23 ` Dmitry Gutov 1 sibling, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-09-11 23:06 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 [-- Attachment #1: Type: text/plain, Size: 2757 bytes --] On 11/09/2023 14:57, Eli Zaretskii wrote: >> Date: Mon, 11 Sep 2023 03:02:55 +0300 >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >>> You could record its value in a local variable at the entry to >>> garbage_collect, and the expose that value to Lisp. >> >> That also doesn't seem to give much, given that the condition for >> entering 'maybe_garbage_collect' is (consing_until_gc < 0). I.e. we wait >> until it's down to 0, then garbage-collect. > > No, we don't wait until it's zero, we perform GC on the first > opportunity that we _notice_ that it crossed zero. So examining how > negative is the value of consing_until_gc when GC is actually > performed could tell us whether we checked the threshold with high > enough frequency, and comparing these values between different runs > could tell us whether the shorter time spend in GC means really less > garbage or less frequent checks for the need to GC. Good point, I'm attaching the same outputs with "last value of consing_until_gc" added to every line. There are some pretty low values in the "read-process-output-max 409600" part of the experiment, which probably means runtime staying in C accumulating the output into the (now larger) buffer? Not sure. >> What's in there? First of all, for find-directory-files-recursively-3, >> there are 0 garbage collections between the beginning of the function >> and when we start parsing the output (no GCs while the process is >> writing to the buffer synchronously). 
I guess inserting output in a >> buffer doesn't increase consing, so there's nothing to GC? > > No, we just don't count increasing size of buffer text in the "consing > since GC" counter. Basically, buffer text is never "garbage", except > when a buffer is killed. That makes sense. Perhaps it hints at a faster design for calling process asynchronously as well (more on another experiment later). >> Next: for find-directory-files-recursively-2, the process only finishes >> at the end, when all GC cycles are done for. I suppose that also means >> we block the process's output while Lisp is running, and also that >> whatever GC events occur might coincide with the chunks of output coming >> from the process, and however many of them turn out to be in total. > > We don't block the process when GC runs. We do stop reading from the > process, so if and when the pipe fills, the OS will block the process. Right. But the effect is almost the same, including the potential side-effect that (IIUC) when a process is waiting like that, it's just suspended and not rushing ahead using the CPU/disk/etc resources to the max. That's an orthogonal train of thought, sorry. [-- Attachment #2: gcs2b.txt --] [-- Type: text/plain, Size: 3284 bytes --] find-directory-files-recursively-2 Uses make-process and :filter to parse the output concurrently with the process. 
With (read-process-output-max 4096): start now cugc -2560 counts diff (42345 3 14097 2 408583 1966 4479) cugc -4449 counts diff (24599 1 342 0 883247 0 6266) cugc -100 counts diff (24070 1 354 0 977387 0 6009) cugc -116 counts diff (27266 1 278 0 940723 0 7485) cugc -95 counts diff (27486 1 270 0 1014591 0 7586) cugc -117 counts diff (27157 1 294 0 1121065 0 7329) cugc -146 counts diff (28233 1 316 0 1185527 0 7562) cugc -143 counts diff (30597 1 354 0 1217320 0 8147) cugc -4807 counts diff (25925 1 380 0 1474618 0 6407) cugc -127 counts diff (33344 1 368 0 1341453 0 8965) cugc -177 counts diff (34785 1 478 0 1434432 0 8842) cugc -2801 counts diff (37069 1 464 0 1477825 0 9675) cugc -23 counts diff (40817 1 448 0 1478445 0 10999) cugc -1215 counts diff (44526 1 500 0 1503604 0 11964) cugc -4189 counts diff (42305 1 468 0 1701989 0 11354) cugc -4715 counts diff (36644 1 532 0 2036778 0 9082) cugc -85 counts diff (38234 1 542 0 2131756 0 9535) cugc -861 counts diff (41632 1 578 0 2188186 0 10474) cugc -117 counts diff (46029 1 580 0 2211685 0 11921) cugc -38 counts diff (50353 1 728 0 2280388 0 12568) cugc -2537 counts diff (57168 1 888 0 2286381 0 13974) cugc -3676 counts diff (61570 1 924 0 2341402 0 15246) cugc -174 counts diff (56504 1 924 0 2689300 0 13502) cugc -1001 counts diff (57066 1 842 0 2855028 0 14098) cugc -146 counts diff (57716 1 916 0 3063238 0 13891) cugc -148 counts diff (62868 1 982 0 3139111 0 15244) cugc -1730 counts diff (64809 1 856 0 3283855 0 16535) cugc -162 counts diff (69183 1 870 0 3394031 0 17902) <finished\n> total chunks 6652 Elapsed time: 1.233016s (0.668819s in 28 GCs) And here's with (read-process-output-max 409600): start now cugc -12 counts diff (59160 5 22547 116 155434 2046 2103) cugc -154001 counts diff (18671 1 16 0 1034538 0 6172) cugc -100 counts diff (20250 1 14 0 1003966 0 6708) cugc -190294 counts diff (19623 1 16 0 1244441 0 6489) cugc -58 counts diff (26160 1 14 0 1015128 0 8678) cugc -293067 counts diff (22737 1 16 0 
1426874 0 7527) cugc -92 counts diff (28308 1 14 0 1160213 0 9394) cugc -25 counts diff (21620 1 16 0 1535686 0 7153) cugc -21 counts diff (23251 1 16 0 1554720 0 7698) cugc -143 counts diff (29988 1 16 0 1462639 0 9943) cugc -117 counts diff (28827 1 16 0 1622562 0 9556) cugc -26 counts diff (33959 1 16 0 1606815 0 11266) cugc -17 counts diff (37476 1 16 0 1639853 0 12439) cugc -250992 counts diff (31345 1 18 0 2081663 0 10383) cugc -289142 counts diff (29904 1 18 0 2448410 0 9901) cugc -290227 counts diff (30675 1 18 0 2448156 0 10159) cugc -264315 counts diff (35418 1 18 0 2446508 0 11741) cugc -32 counts diff (41741 1 18 0 2343900 0 13847) cugc -2201 counts diff (44523 1 112 0 2478310 0 14239) cugc -15673 counts diff (49622 1 170 0 2528221 0 15592) cugc -40267 counts diff (41990 1 58 0 2972015 0 13693) cugc -159 counts diff (41010 1 22 0 3177994 0 13580) cugc -42 counts diff (47602 1 156 0 3259833 0 15009) cugc -358884 counts diff (43740 1 34 0 3687145 0 14436) cugc -22 counts diff (55598 1 20 0 3494190 0 18454) cugc -1270 counts diff (60128 1 190 0 3683461 0 18980) <finished\n> total chunks 273 Elapsed time: 0.932625s (0.608713s in 26 GCs) [-- Attachment #3: gcs3b.txt --] [-- Type: text/plain, Size: 929 bytes --] find-directory-files-recursively-3 Uses process-file, parses the buffer with search-forward at the end. 
start now <process finished, now parsing> cugc -129 counts diff (17667 1 1565 1 779081 93 10465) cugc -139 counts diff (12789 1 12 0 912364 0 12696) cugc -84 counts diff (13496 1 12 0 1028060 0 13403) cugc -33 counts diff (14112 1 12 0 1168522 0 14019) cugc -153 counts diff (14354 1 12 0 1347241 0 14261) cugc -72 counts diff (17005 1 12 0 1401075 0 16912) cugc -17 counts diff (20810 1 12 0 1403396 0 20717) cugc -94 counts diff (18516 1 12 0 1792508 0 18423) cugc -120 counts diff (17981 1 12 0 2108458 0 17888) cugc -50 counts diff (22090 1 12 0 2169835 0 21997) cugc -136 counts diff (26749 1 12 0 2231037 0 26656) cugc -93 counts diff (25300 1 12 0 2687843 0 25207) cugc -72 counts diff (26165 1 12 0 3046140 0 26072) cugc -142 counts diff (30968 1 12 0 3205306 0 30875) Elapsed time: 0.938180s (0.314630s in 14 GCs) ^ permalink raw reply [flat|nested] 213+ messages in thread
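[Editor's note: as a quick sanity check on the logs above, the summary line of gcs3b.txt works out to roughly 22 ms per collection, with GC taking about a third of the elapsed time. A throwaway calculation; the helper names are invented.]

#include <assert.h>
#include <stdio.h>

/* Average GC pause in milliseconds.  */
static double
avg_pause_ms (double secs_in_gc, int gcs)
{
  return secs_in_gc / gcs * 1000.0;
}

/* Share of wall-clock time spent in GC, as a percentage.  */
static double
gc_share_pct (double secs_in_gc, double elapsed)
{
  return secs_in_gc / elapsed * 100.0;
}

int
main (void)
{
  /* Figures from "Elapsed time: 0.938180s (0.314630s in 14 GCs)".  */
  printf ("avg pause %.1f ms, GC share %.1f%%\n",
          avg_pause_ms (0.314630, 14),
          gc_share_pct (0.314630, 0.938180));
  return 0;
}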
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-11 23:06 ` Dmitry Gutov @ 2023-09-12 11:39 ` Eli Zaretskii 2023-09-12 13:11 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-09-12 11:39 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Tue, 12 Sep 2023 02:06:50 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > > No, we don't wait until it's zero, we perform GC on the first > > opportunity that we _notice_ that it crossed zero. So examining how > > negative is the value of consing_until_gc when GC is actually > > performed could tell us whether we checked the threshold with high > > enough frequency, and comparing these values between different runs > > could tell us whether the shorter time spend in GC means really less > > garbage or less frequent checks for the need to GC. > > Good point, I'm attaching the same outputs with "last value of > consing_until_gc" added to every line. > > There are some pretty low values in the "read-process-output-max 409600" > part of the experiment, which probably means runtime staying in C > accumulating the output into the (now larger) buffer? Not sure. No, I think this means we really miss some GC opportunities, and we cons quite a lot more strings between GC cycles due to that. I guess this happens because we somehow cons many strings in code that doesn't call maybe_gc or something. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-12 11:39 ` Eli Zaretskii @ 2023-09-12 13:11 ` Dmitry Gutov 0 siblings, 0 replies; 213+ messages in thread From: Dmitry Gutov @ 2023-09-12 13:11 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 12/09/2023 14:39, Eli Zaretskii wrote: >> Date: Tue, 12 Sep 2023 02:06:50 +0300 >> Cc:luangruo@yahoo.com,sbaugh@janestreet.com,yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov<dmitry@gutov.dev> >> >>> No, we don't wait until it's zero, we perform GC on the first >>> opportunity that we_notice_ that it crossed zero. So examining how >>> negative is the value of consing_until_gc when GC is actually >>> performed could tell us whether we checked the threshold with high >>> enough frequency, and comparing these values between different runs >>> could tell us whether the shorter time spend in GC means really less >>> garbage or less frequent checks for the need to GC. >> Good point, I'm attaching the same outputs with "last value of >> consing_until_gc" added to every line. >> >> There are some pretty low values in the "read-process-output-max 409600" >> part of the experiment, which probably means runtime staying in C >> accumulating the output into the (now larger) buffer? Not sure. > No, I think this means we really miss some GC opportunities, and we > cons quite a lot more strings between GC cycles due to that. Or possibly same number of strings but longer ones? > I guess > this happens because we somehow cons many strings in code that doesn't > call maybe_gc or something. Yes, staying in some C code that doesn't call maybe_gc for a while. I think we're describing the same thing, only I was doing that from the positive side (less frequent GCs = better performance in this scenario), and you from the negative one (less frequent GCs = more chances for an OOM to happen in some related but different scenario). 
^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-11 11:57 ` Eli Zaretskii 2023-09-11 23:06 ` Dmitry Gutov @ 2023-09-12 14:23 ` Dmitry Gutov 2023-09-12 14:26 ` Dmitry Gutov 2023-09-12 16:32 ` Eli Zaretskii 1 sibling, 2 replies; 213+ messages in thread From: Dmitry Gutov @ 2023-09-12 14:23 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 11/09/2023 14:57, Eli Zaretskii wrote: >> So there is also a second recording for >> find-directory-files-recursively-2 with read-process-output-max=409600. >> It does improve the performance significantly (and reduce the number of >> GC pauses). I guess what I'm still not clear on, is whether the number >> of GC pauses is fewer because of less consing (the only column that >> looks significantly different is the 3rd: VECTOR-CELLS), or because the >> process finishes faster due to larger buffers, which itself causes fewer >> calls to maybe_gc. > I think the latter. It might be both.

To analyze how large the per-chunk overhead might be (CPU and GC-wise combined), I first implemented the same function in yet another way that doesn't use :filter (so that the default filter is used), but still asynchronously, with parsing happening concurrently with the process:

(defun find-directory-files-recursively-5 (dir regexp &optional include-directories _p follow-symlinks)
  (cl-assert (null _p) t "find-directory-files-recursively can't accept arbitrary predicates")
  (with-temp-buffer
    (setq case-fold-search nil)
    (cd dir)
    (let* ((command (append (list "find" (file-local-name dir))
                            (if follow-symlinks
                                '("-L")
                              '("!" "(" "-type" "l" "-xtype" "d" ")"))
                            (unless (string-empty-p regexp)
                              (list "-regex" (concat ".*" regexp ".*")))
                            (unless include-directories
                              '("!" "-type" "d"))
                            '("-print0")))
           (remote (file-remote-p dir))
           (proc (if remote
                     (let ((proc (apply #'start-file-process
                                        "find" (current-buffer) command)))
                       (set-process-sentinel proc (lambda (_proc _state)))
                       (set-process-query-on-exit-flag proc nil)
                       proc)
                   (make-process :name "find" :buffer (current-buffer)
                                 :connection-type 'pipe
                                 :noquery t
                                 :sentinel (lambda (_proc _state))
                                 :command command)))
           start ret)
      (setq start (point-min))
      (while (accept-process-output proc)
        (goto-char start)
        (while (search-forward "\0" nil t)
          (push (buffer-substring-no-properties start (1- (point))) ret)
          (setq start (point))))
      ret)))

This method already improved the performance somewhat (compared to find-directory-files-recursively-2), but not too much. So I tried these next two steps:

- Dropping most of the setup in read_and_dispose_of_process_output (which creates some consing too) and calling Finternal_default_process_filter directly (call_filter_directly.diff), when it is the filter to be used anyway.

- Going around that function entirely, skipping the creation of a Lisp string (CHARS -> TEXT) and inserting into the buffer directly (when the filter is set to the default, of course). Copied and adapted some code from 'call_process' for that (read_and_insert_process_output.diff).

Neither is intended as a complete proposal, but here are some comparisons. Note that either of these patches could only help the implementations that don't set up a process filter (the naive first one, and the new parallel number 5 above).

For testing, I used two different repo checkouts that are large enough to not finish too quickly: gecko-dev and torvalds-linux.
master

| Function                                          | gecko-dev | linux |
| find-directory-files-recursively                  |      1.69 |  0.41 |
| find-directory-files-recursively-2                |      1.16 |  0.28 |
| find-directory-files-recursively-3                |      0.92 |  0.23 |
| find-directory-files-recursively-5                |      1.07 |  0.26 |
| find-directory-files-recursively (rpom 409600)    |      1.42 |  0.35 |
| find-directory-files-recursively-2 (rpom 409600)  |      0.90 |  0.25 |
| find-directory-files-recursively-5 (rpom 409600)  |      0.89 |  0.24 |

call_filter_directly.diff (basically, not much difference)

| Function                                          | gecko-dev | linux |
| find-directory-files-recursively                  |      1.64 |  0.38 |
| find-directory-files-recursively-5                |      1.05 |  0.26 |
| find-directory-files-recursively (rpom 409600)    |      1.42 |  0.36 |
| find-directory-files-recursively-5 (rpom 409600)  |      0.91 |  0.25 |

read_and_insert_process_output.diff (noticeable differences)

| Function                                          | gecko-dev | linux |
| find-directory-files-recursively                  |      1.30 |  0.34 |
| find-directory-files-recursively-5                |      1.03 |  0.25 |
| find-directory-files-recursively (rpom 409600)    |      1.20 |  0.35 |
| find-directory-files-recursively-5 (rpom 409600)  | (!!) 0.72 |  0.21 |

So it seems like we have at least two potential ways to implement an asynchronous file-listing routine that is as fast as or faster than the synchronous one (if only thanks to starting the parsing in parallel).

Combining the last patch with the very large value of read-process-output-max seems to yield the most benefit, but I'm not sure it's appropriate to just raise that value in our code.

Thoughts?

^ permalink raw reply [flat|nested] 213+ messages in thread
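[Editor's note: the "total chunks 6652" vs "total chunks 273" figures earlier in the thread are mostly a function of how many bytes each read is allowed to return. The standalone model below (regular C file I/O standing in for the process pipe; `count_reads` is an invented name, not an Emacs function) shows how the chunk count, and with it any fixed per-chunk filter/consing cost, scales inversely with the buffer size.]

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Number of reads needed to drain TOTAL bytes when each read may
   return at most BUFSIZE bytes -- a stand-in for how
   read-process-output-max caps each read from the process pipe.  */
static long
count_reads (size_t total, size_t bufsize)
{
  FILE *f = tmpfile ();
  char *buf = malloc (bufsize);
  assert (f && buf);
  memset (buf, 'x', bufsize);

  for (size_t written = 0; written < total; )
    {
      size_t n = total - written < bufsize ? total - written : bufsize;
      fwrite (buf, 1, n, f);
      written += n;
    }
  rewind (f);

  long reads = 0;
  while (fread (buf, 1, bufsize, f) > 0)
    reads++;

  free (buf);
  fclose (f);
  return reads;
}

int
main (void)
{
  /* 1 MiB of output: 256 deliveries at 4 KB each, but only 3 at
     400 KB, so a fixed per-chunk cost shrinks by two orders of
     magnitude.  */
  printf ("4096: %ld reads; 409600: %ld reads\n",
          count_reads (1 << 20, 4096),
          count_reads (1 << 20, 409600));
  return 0;
}

In the thread's terms: a larger read-process-output-max trades memory (and GC-check overshoot) for far fewer filter invocations per byte of process output.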
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-12 14:23 ` Dmitry Gutov @ 2023-09-12 14:26 ` Dmitry Gutov 2023-09-12 16:32 ` Eli Zaretskii 1 sibling, 0 replies; 213+ messages in thread From: Dmitry Gutov @ 2023-09-12 14:26 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 [-- Attachment #1: Type: text/plain, Size: 327 bytes --] On 12/09/2023 17:23, Dmitry Gutov wrote: > Neither are intended as complete proposals, but here are some > comparisons. Note that either of these patches could only help the > implementations that don't set up process filter (the naive first one, > and the new parallel number 5 above). Sorry, forgot to attach the patches. [-- Attachment #2: call_filter_directly.diff --] [-- Type: text/x-patch, Size: 818 bytes --] diff --git a/src/process.c b/src/process.c index 08cb810ec13..bdbe8d96064 100644 --- a/src/process.c +++ b/src/process.c @@ -6227,7 +6227,15 @@ read_process_output (Lisp_Object proc, int channel) friends don't expect current-buffer to be changed from under them. */ record_unwind_current_buffer (); - read_and_dispose_of_process_output (p, chars, nbytes, coding); + if (p->filter == Qinternal_default_process_filter) + { + Lisp_Object text; + decode_coding_c_string (coding, (unsigned char *) chars, nbytes, Qt); + text = coding->dst_object; + Finternal_default_process_filter (proc, text); + } + else + read_and_dispose_of_process_output (p, chars, nbytes, coding); /* Handling the process output should not deactivate the mark. 
*/ Vdeactivate_mark = odeactivate; [-- Attachment #3: read_and_insert_process_output.diff --] [-- Type: text/x-patch, Size: 2836 bytes --] diff --git a/src/process.c b/src/process.c index 08cb810ec13..5db56692fe1 100644 --- a/src/process.c +++ b/src/process.c @@ -6112,6 +6112,11 @@ read_and_dispose_of_process_output (struct Lisp_Process *p, char *chars, ssize_t nbytes, struct coding_system *coding); +static void +read_and_insert_process_output (struct Lisp_Process *p, char *buf, + ssize_t nread, + struct coding_system *process_coding); + /* Read pending output from the process channel, starting with our buffered-ahead character if we have one. Yield number of decoded characters read, @@ -6227,7 +6232,10 @@ read_process_output (Lisp_Object proc, int channel) friends don't expect current-buffer to be changed from under them. */ record_unwind_current_buffer (); - read_and_dispose_of_process_output (p, chars, nbytes, coding); + if (p->filter == Qinternal_default_process_filter) + read_and_insert_process_output (p, chars, nbytes, coding); + else + read_and_dispose_of_process_output (p, chars, nbytes, coding); /* Handling the process output should not deactivate the mark. */ Vdeactivate_mark = odeactivate; @@ -6236,6 +6244,46 @@ read_process_output (Lisp_Object proc, int channel) return nbytes; } +static void read_and_insert_process_output (struct Lisp_Process *p, char *buf, + ssize_t nread, + struct coding_system *process_coding) +{ + if (!nread || NILP (p->buffer) || !BUFFER_LIVE_P (XBUFFER (p->buffer))) + ; + else if (NILP (BVAR (XBUFFER(p->buffer), enable_multibyte_characters)) + && ! CODING_MAY_REQUIRE_DECODING (process_coding)) + { + insert_1_both (buf, nread, nread, 0, 0, 0); + signal_after_change (PT - nread, 0, nread); + } + else + { /* We have to decode the input. 
*/ + Lisp_Object curbuf; + int carryover = 0; + specpdl_ref count1 = SPECPDL_INDEX (); + + XSETBUFFER (curbuf, current_buffer); + /* We cannot allow after-change-functions be run + during decoding, because that might modify the + buffer, while we rely on process_coding.produced to + faithfully reflect inserted text until we + TEMP_SET_PT_BOTH below. */ + specbind (Qinhibit_modification_hooks, Qt); + decode_coding_c_string (process_coding, + (unsigned char *) buf, nread, curbuf); + unbind_to (count1, Qnil); + + TEMP_SET_PT_BOTH (PT + process_coding->produced_char, + PT_BYTE + process_coding->produced); + signal_after_change (PT - process_coding->produced_char, + 0, process_coding->produced_char); + carryover = process_coding->carryover_bytes; + if (carryover > 0) + memcpy (buf, process_coding->carryover, + process_coding->carryover_bytes); + } +} + static void read_and_dispose_of_process_output (struct Lisp_Process *p, char *chars, ssize_t nbytes, ^ permalink raw reply related [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-12 14:23 ` Dmitry Gutov 2023-09-12 14:26 ` Dmitry Gutov @ 2023-09-12 16:32 ` Eli Zaretskii 2023-09-12 18:48 ` Dmitry Gutov 1 sibling, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-09-12 16:32 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Tue, 12 Sep 2023 17:23:53 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > - Dropping most of the setup in read_and_dispose_of_process_output > (which creates some consing too) and calling > Finternal_default_process_filter directly (call_filter_directly.diff), > when it is the filter to be used anyway. > > - Going around that function entirely, skipping the creation of a Lisp > string (CHARS -> TEXT) and inserting into the buffer directly (when the > filter is set to the default, of course). Copied and adapted some code > from 'call_process' for that (read_and_insert_process_output.diff). > > Neither are intended as complete proposals, but here are some > comparisons. Note that either of these patches could only help the > implementations that don't set up process filter (the naive first one, > and the new parallel number 5 above). > > For testing, I used two different repo checkouts that are large enough > to not finish too quickly: gecko-dev and torvalds-linux. 
> > master > > | Function | gecko-dev | linux | > | find-directory-files-recursively | 1.69 | 0.41 | > | find-directory-files-recursively-2 | 1.16 | 0.28 | > | find-directory-files-recursively-3 | 0.92 | 0.23 | > | find-directory-files-recursively-5 | 1.07 | 0.26 | > | find-directory-files-recursively (rpom 409600) | 1.42 | 0.35 | > | find-directory-files-recursively-2 (rpom 409600) | 0.90 | 0.25 | > | find-directory-files-recursively-5 (rpom 409600) | 0.89 | 0.24 | > > call_filter_directly.diff (basically, not much difference) > > | Function | gecko-dev | linux | > | find-directory-files-recursively | 1.64 | 0.38 | > | find-directory-files-recursively-5 | 1.05 | 0.26 | > | find-directory-files-recursively (rpom 409600) | 1.42 | 0.36 | > | find-directory-files-recursively-5 (rpom 409600) | 0.91 | 0.25 | > > read_and_insert_process_output.diff (noticeable differences) > > | Function | gecko-dev | linux | > | find-directory-files-recursively | 1.30 | 0.34 | > | find-directory-files-recursively-5 | 1.03 | 0.25 | > | find-directory-files-recursively (rpom 409600) | 1.20 | 0.35 | > | find-directory-files-recursively-5 (rpom 409600) | (!!) 0.72 | 0.21 | > > So it seems like we have at least two potential ways to implement an > asynchronous file listing routine that is as fast or faster than the > synchronous one (if only thanks to starting the parsing in parallel). > > Combining the last patch together with using the very large value of > read-process-output-max seems to yield the most benefit, but I'm not > sure if it's appropriate to just raise that value in our code, though. > > Thoughts? I'm not sure what exactly is here to think about. Removing portions of read_and_insert_process_output, or bypassing it entirely, is not going to fly, because AFAIU it basically means we don't decode text, which can only work with plain ASCII file names, and/or don't move the markers in the process buffer, which also cannot be avoided. 
If you want to conclude that inserting the process's output into a buffer without consing Lisp strings is faster (which I'm not sure, see below, but it could be true), then we could try extending internal-default-process-filter (or writing a new filter function similar to it) so that it inserts the stuff into the gap and then uses decode_coding_gap, which converts inserted bytes in-place -- that, at least, will be correct and will avoid consing intermediate temporary strings from the process output, then decoding them, then inserting them. Other than that, the -2 and -3 variants are very close runners-up of -5, so maybe I'm missing something, but I see no reason to be too excited here? I mean, 0.89 vs 0.92? really? About inserting into the buffer: what we do is insert into the gap, and when the gap becomes full, we enlarge it. Enlarging the gap involves: (a) enlarging the chunk of memory allocated to buffer text (which might mean we ask the OS for more memory), and (b) moving the characters after the gap to the right to free space for inserting more stuff. This is pretty fast, but still, with a large pipe buffer and a lot of output, we do this many times, so it could add up to something pretty tangible. It's hard for me to tell whether this is significantly faster than consing strings and inserting them; only measurements can tell. ^ permalink raw reply [flat|nested] 213+ messages in thread
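[Editor's note: the gap-enlargement mechanics described above can be sketched as a standalone toy. Nothing here is the real buffer.c code; `struct gapbuf`, the doubling policy, and `grows_for` are invented for illustration. Each enlargement pays for (a) a bigger allocation and (b) shifting the text after the gap, but doubling keeps the number of enlargements logarithmic in the total output size.]

#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy gap buffer: text, then the gap [gap_start, gap_end), then more
   text.  Inserting at the gap is a memcpy; when the gap is exhausted
   we grow the allocation and move the tail right.  */
struct gapbuf
{
  char *data;
  size_t gap_start, gap_end, size;
  long grows;                   /* number of enlargements */
};

static void
gb_init (struct gapbuf *b, size_t initial)
{
  b->data = malloc (initial);
  assert (b->data);
  b->gap_start = 0;
  b->gap_end = b->size = initial;
  b->grows = 0;
}

static void
gb_insert (struct gapbuf *b, const char *s, size_t n)
{
  while (b->gap_end - b->gap_start < n)
    {
      size_t tail = b->size - b->gap_end;     /* text after the gap */
      size_t newsize = b->size * 2;
      b->data = realloc (b->data, newsize);   /* (a) more memory */
      assert (b->data);
      memmove (b->data + newsize - tail,
               b->data + b->gap_end, tail);   /* (b) shift the tail */
      b->gap_end = newsize - tail;
      b->size = newsize;
      b->grows++;
    }
  memcpy (b->data + b->gap_start, s, n);      /* insert into the gap */
  b->gap_start += n;
}

/* How many enlargements NCHUNKS inserts of CHUNK bytes cost, starting
   from an INITIAL-byte buffer.  */
static long
grows_for (size_t nchunks, size_t chunk, size_t initial)
{
  struct gapbuf b;
  char *s = malloc (chunk);
  assert (s);
  memset (s, 'y', chunk);
  gb_init (&b, initial);
  for (size_t i = 0; i < nchunks; i++)
    gb_insert (&b, s, chunk);
  long g = b.grows;
  free (s);
  free (b.data);
  return g;
}

int
main (void)
{
  /* 100 KB of output into a 128-byte buffer needs only 10
     enlargements under doubling.  */
  assert (grows_for (1000, 100, 128) == 10);
  return 0;
}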
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-12 16:32 ` Eli Zaretskii @ 2023-09-12 18:48 ` Dmitry Gutov 2023-09-12 19:35 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-09-12 18:48 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 12/09/2023 19:32, Eli Zaretskii wrote: >> Date: Tue, 12 Sep 2023 17:23:53 +0300 >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >> - Dropping most of the setup in read_and_dispose_of_process_output >> (which creates some consing too) and calling >> Finternal_default_process_filter directly (call_filter_directly.diff), >> when it is the filter to be used anyway. >> >> - Going around that function entirely, skipping the creation of a Lisp >> string (CHARS -> TEXT) and inserting into the buffer directly (when the >> filter is set to the default, of course). Copied and adapted some code >> from 'call_process' for that (read_and_insert_process_output.diff). >> >> Neither are intended as complete proposals, but here are some >> comparisons. Note that either of these patches could only help the >> implementations that don't set up process filter (the naive first one, >> and the new parallel number 5 above). >> >> For testing, I used two different repo checkouts that are large enough >> to not finish too quickly: gecko-dev and torvalds-linux. 
>> >> master >> >> | Function | gecko-dev | linux | >> | find-directory-files-recursively | 1.69 | 0.41 | >> | find-directory-files-recursively-2 | 1.16 | 0.28 | >> | find-directory-files-recursively-3 | 0.92 | 0.23 | >> | find-directory-files-recursively-5 | 1.07 | 0.26 | >> | find-directory-files-recursively (rpom 409600) | 1.42 | 0.35 | >> | find-directory-files-recursively-2 (rpom 409600) | 0.90 | 0.25 | >> | find-directory-files-recursively-5 (rpom 409600) | 0.89 | 0.24 | >> >> call_filter_directly.diff (basically, not much difference) >> >> | Function | gecko-dev | linux | >> | find-directory-files-recursively | 1.64 | 0.38 | >> | find-directory-files-recursively-5 | 1.05 | 0.26 | >> | find-directory-files-recursively (rpom 409600) | 1.42 | 0.36 | >> | find-directory-files-recursively-5 (rpom 409600) | 0.91 | 0.25 | >> >> read_and_insert_process_output.diff (noticeable differences) >> >> | Function | gecko-dev | linux | >> | find-directory-files-recursively | 1.30 | 0.34 | >> | find-directory-files-recursively-5 | 1.03 | 0.25 | >> | find-directory-files-recursively (rpom 409600) | 1.20 | 0.35 | >> | find-directory-files-recursively-5 (rpom 409600) | (!!) 0.72 | 0.21 | >> >> So it seems like we have at least two potential ways to implement an >> asynchronous file listing routine that is as fast or faster than the >> synchronous one (if only thanks to starting the parsing in parallel). >> >> Combining the last patch together with using the very large value of >> read-process-output-max seems to yield the most benefit, but I'm not >> sure if it's appropriate to just raise that value in our code, though. >> >> Thoughts? > > I'm not sure what exactly is here to think about. Removing portions > of read_and_insert_process_output, or bypassing it entirely, is not > going to fly, because AFAIU it basically means we don't decode text, > which can only work with plain ASCII file names, and/or don't move the > markers in the process buffer, which also cannot be avoided. 
That one was really a test to see whether the extra handling added any meaningful consing to affect GC. Removing it didn't make a difference, table number 2, so no. > If you > want to conclude that inserting the process's output into a buffer > without consing Lisp strings is faster (which I'm not sure, see below, > but it could be true), That's what my tests seem to show, see table 3 (the last one). > then we could try extending > internal-default-process-filter (or writing a new filter function > similar to it) so that it inserts the stuff into the gap and then uses > decode_coding_gap, Can that work at all? By the time internal-default-process-filter is called, we have already turned the string from char* into Lisp_Object text, which we then pass to it. So consing has already happened, IIUC. > which converts inserted bytes in-place -- that, at > least, will be correct and will avoid consing intermediate temporary > strings from the process output, then decoding them, then inserting > them. Other than that, the -2 and -3 variants are very close > runners-up of -5, so maybe I'm missing something, but I see no reason > be too excited here? I mean, 0.89 vs 0.92? really? The important part is not 0.89 vs 0.92 (that would be meaningless indeed), but that we have an _asyncronous_ implementation of the feature that works as fast as the existing synchronous one (or faster! if we also bind read-process-output-max to a large value, the time is 0.72). The possible applications for that range from simple (printing progress bar while the scan is happening) to more advanced (launching a concurrent process where we pipe the received file names concurrently to 'xargs grep'), including visuals (xref buffer which shows the intermediate search results right away, updating them gradually, all without blocking the UI). > About inserting into the buffer: what we do is insert into the gap, > and when the gap becomes full, we enlarge it. 
Enlarging the gap > involves: (a) enlarging the chunk of memory allocated to buffer text > (which might mean we ask the OS for more memory), and (b) moving the > characters after the gap to the right to free space for inserting more > stuff. This is pretty fast, but still, with a large pipe buffer and a > lot of output, we do this many times, so it could add up to something > pretty tangible. It's hard to me to tell whether this is > significantly faster than consing strings and inserting them, only > measurements can tell. See the benchmark tables and the POC patch in my previous email. Using a better filter function would be ideal, but it seems like that's not going to fit the current design. Happy to be proven wrong, though. ^ permalink raw reply [flat|nested] 213+ messages in thread
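[Editor's note: the two data paths being debated above (cons a temporary string per chunk and insert it, vs. insert the raw bytes and decode them where they landed, in the spirit of decode_coding_gap) can be contrasted in a toy sketch. This is not the real process.c/coding.c machinery: "decoding" is faked with toupper, and the `temp_allocs` counter just makes the per-chunk allocation of the first path visible.]

#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

static long temp_allocs;        /* temporary "strings" consed */

/* Path 1: decode each chunk into a freshly allocated temporary, then
   copy it into the buffer (models consing a Lisp string per chunk of
   process output).  */
static void
insert_via_string (char *buf, size_t *len, const char *chunk, size_t n)
{
  char *tmp = malloc (n);
  assert (tmp);
  temp_allocs++;
  for (size_t i = 0; i < n; i++)
    tmp[i] = toupper ((unsigned char) chunk[i]);    /* fake "decode" */
  memcpy (buf + *len, tmp, n);
  *len += n;
  free (tmp);
}

/* Path 2: copy the raw bytes into the buffer first, then decode them
   in place (models the decode-in-the-gap approach).  */
static void
insert_in_place (char *buf, size_t *len, const char *chunk, size_t n)
{
  memcpy (buf + *len, chunk, n);
  for (size_t i = *len; i < *len + n; i++)
    buf[i] = toupper ((unsigned char) buf[i]);
  *len += n;
}

int
main (void)
{
  enum { CHUNKS = 1000 };
  const char chunk[] = "lisp/progmodes/project.el\n";
  size_t n = sizeof chunk - 1;
  char *a = malloc (n * CHUNKS), *b = malloc (n * CHUNKS);
  size_t la = 0, lb = 0;
  assert (a && b);

  temp_allocs = 0;
  for (int i = 0; i < CHUNKS; i++)
    {
      insert_via_string (a, &la, chunk, n);
      insert_in_place (b, &lb, chunk, n);
    }

  /* Identical buffer text either way, but path 1 made one temporary
     allocation per chunk and path 2 made none.  */
  assert (la == lb && memcmp (a, b, la) == 0);
  assert (temp_allocs == CHUNKS);
  free (a);
  free (b);
  return 0;
}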
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-12 18:48 ` Dmitry Gutov @ 2023-09-12 19:35 ` Eli Zaretskii 2023-09-12 20:27 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-09-12 19:35 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Tue, 12 Sep 2023 21:48:37 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > > then we could try extending > > internal-default-process-filter (or writing a new filter function > > similar to it) so that it inserts the stuff into the gap and then uses > > decode_coding_gap, > > Can that work at all? By the time internal-default-process-filter is > called, we have already turned the string from char* into Lisp_Object > text, which we then pass to it. So consing has already happened, IIUC. That's why I said "or writing a new filter function". read_and_dispose_of_process_output will have to call this new filter differently, passing it the raw text read from the subprocess, where read_and_dispose_of_process_output current first decodes the text and produces a Lisp string from it. Then the filter would need to do something similar to what insert-file-contents does: insert the raw input into the gap, then call decode_coding_gap to decode that in-place. > > which converts inserted bytes in-place -- that, at > > least, will be correct and will avoid consing intermediate temporary > > strings from the process output, then decoding them, then inserting > > them. Other than that, the -2 and -3 variants are very close > > runners-up of -5, so maybe I'm missing something, but I see no reason > > be too excited here? I mean, 0.89 vs 0.92? really? 
> > The important part is not 0.89 vs 0.92 (that would be meaningless > indeed), but that we have an _asynchronous_ implementation of the feature > that works as fast as the existing synchronous one (or faster! if we > also bind read-process-output-max to a large value, the time is 0.72). > > The possible applications for that range from simple (printing progress > bar while the scan is happening) to more advanced (launching a > concurrent process where we pipe the received file names concurrently to > 'xargs grep'), including visuals (xref buffer which shows the > intermediate search results right away, updating them gradually, all > without blocking the UI). Hold your horses. Emacs only reads output from sub-processes when it's idle. So printing a progress bar (which makes Emacs not idle) with the asynchronous implementation is basically the same as having the synchronous implementation call some callback from time to time (which will then show the progress). As for piping to another process, this is best handled by using a shell pipe, without passing stuff through Emacs. And even if you do need to pass it through Emacs, you could do the same with the synchronous implementation -- only the "xargs" part needs to be asynchronous, the part that reads file names does not. Right? Please note: I'm not saying that the asynchronous implementation is not interesting. It might even have advantages in some specific use cases. So it is good to have it. It just isn't a breakthrough, that's all. And if we want to use it in production, we should probably work on adding that special default filter which inserts and decodes directly into the buffer, because that will probably lower the GC pressure and thus has hope of being faster. Or even replace the default filter implementation with that new one. > > About inserting into the buffer: what we do is insert into the gap,
Enlarging the gap > > involves: (a) enlarging the chunk of memory allocated to buffer text > > (which might mean we ask the OS for more memory), and (b) moving the > > characters after the gap to the right to free space for inserting more > > stuff. This is pretty fast, but still, with a large pipe buffer and a > > lot of output, we do this many times, so it could add up to something > > pretty tangible. It's hard for me to tell whether this is > > significantly faster than consing strings and inserting them, only > > measurements can tell. > > See the benchmark tables and the POC patch in my previous email. Using a > better filter function would be ideal, but it seems like that's not > going to fit the current design. Happy to be proven wrong, though. I see no reason why reading subprocess output couldn't use the same technique as insert-file-contents does. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-12 19:35 ` Eli Zaretskii @ 2023-09-12 20:27 ` Dmitry Gutov 2023-09-13 11:38 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-09-12 20:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 12/09/2023 22:35, Eli Zaretskii wrote: >> Date: Tue, 12 Sep 2023 21:48:37 +0300 >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >>> then we could try extending >>> internal-default-process-filter (or writing a new filter function >>> similar to it) so that it inserts the stuff into the gap and then uses >>> decode_coding_gap, >> >> Can that work at all? By the time internal-default-process-filter is >> called, we have already turned the string from char* into Lisp_Object >> text, which we then pass to it. So consing has already happened, IIUC. > > That's why I said "or writing a new filter function". > read_and_dispose_of_process_output will have to call this new filter > differently, passing it the raw text read from the subprocess, where > read_and_dispose_of_process_output current first decodes the text and > produces a Lisp string from it. Then the filter would need to do > something similar to what insert-file-contents does: insert the raw > input into the gap, then call decode_coding_gap to decode that > in-place. Does the patch from my last patch-bearing email look similar enough to what you're describing? The one called read_and_insert_process_output.diff The result there, though, is that a "filter" (in the sense that make-process uses that term) is not used at all. >>> which converts inserted bytes in-place -- that, at >>> least, will be correct and will avoid consing intermediate temporary >>> strings from the process output, then decoding them, then inserting >>> them. 
Other than that, the -2 and -3 variants are very close >>> runners-up of -5, so maybe I'm missing something, but I see no reason >>> be too excited here? I mean, 0.89 vs 0.92? really? >> >> The important part is not 0.89 vs 0.92 (that would be meaningless >> indeed), but that we have an _asynchronous_ implementation of the feature >> that works as fast as the existing synchronous one (or faster! if we >> also bind read-process-output-max to a large value, the time is 0.72). >> >> The possible applications for that range from simple (printing progress >> bar while the scan is happening) to more advanced (launching a >> concurrent process where we pipe the received file names concurrently to >> 'xargs grep'), including visuals (xref buffer which shows the >> intermediate search results right away, updating them gradually, all >> without blocking the UI). > > Hold your horses. Emacs only reads output from sub-processes when > it's idle. So printing a progress bar (which makes Emacs not idle) > with the asynchronous implementation is basically the same as having > the synchronous implementation call some callback from time to time > (which will then show the progress). Obviously there is more work to be done, including further design and benchmarking. But unlike before, at least the starting performance (before further features are added) is not worse. Note that the variant -5 is somewhat limited since it doesn't use a filter - that means that no callbacks are issued while the output is arriving, meaning that if it's taken as a base, whatever refreshes would have to be initiated from somewhere else. E.g. from a timer. > As for piping to another process, this is best handled by using a > shell pipe, without passing stuff through Emacs. And even if you do > need to pass it through Emacs, you could do the same with the > synchronous implementation -- only the "xargs" part needs to be > asynchronous, the part that reads file names does not. Right?
Yes and no: if both steps are asynchronous, the final output window could be displayed right away, rather than waiting for the first step (or both) to be finished. Which can be a meaningful improvement for some (and still is an upside of 'M-x rgrep'). > Please note: I'm not saying that the asynchronous implementation is > not interesting. It might even have advantages in some specific use > cases. So it is good to have it. It just isn't a breakthrough, > that's all. Not a breakthrough, of course, just a lower-level insight (hopefully). I do think it would be meaningful to manage to reduce the runtime of a real-life program (which includes other work) by 10-20% solely by reducing GC pressure in a generic facility like process output handling. > And if we want to use it in production, we should > probably work on adding that special default filter which inserts and > decodes directly into the buffer, because that will probably lower the > GC pressure and thus has hope of being faster. Or even replace the > default filter implementation with that new one. But a filter must be a Lisp function, which can't help but accept only Lisp strings (not C string) as argument. Isn't that right? ^ permalink raw reply [flat|nested] 213+ messages in thread
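For reference, the behavior of the default filter can be expressed as an ordinary Lisp filter, adapted from the `ordinary-insertion-filter' example in the Elisp manual. The STRING argument is already a decoded Lisp string, which is exactly the consing being debated here:

```elisp
;; What the default filter does, written as a Lisp filter function
;; (adapted from the Elisp manual's ordinary-insertion-filter):
;; insert the output at the process mark, keeping point in place
;; unless it was at the mark.
(defun my-insertion-filter (proc string)
  (when (buffer-live-p (process-buffer proc))
    (with-current-buffer (process-buffer proc)
      (let ((moving (= (point) (process-mark proc))))
        (save-excursion
          ;; Insert the text, advancing the process marker.
          (goto-char (process-mark proc))
          (insert string)
          (set-marker (process-mark proc) (point)))
        (if moving (goto-char (process-mark proc)))))))
```

Since every filter of this shape receives a Lisp string, no Lisp-level filter can avoid the intermediate string; that is why the discussion turns to doing the insertion and decoding on the C side.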
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-12 20:27 ` Dmitry Gutov @ 2023-09-13 11:38 ` Eli Zaretskii 2023-09-13 14:27 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-09-13 11:38 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Tue, 12 Sep 2023 23:27:49 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > > That's why I said "or writing a new filter function". > > read_and_dispose_of_process_output will have to call this new filter > > differently, passing it the raw text read from the subprocess, where > > read_and_dispose_of_process_output current first decodes the text and > > produces a Lisp string from it. Then the filter would need to do > > something similar to what insert-file-contents does: insert the raw > > input into the gap, then call decode_coding_gap to decode that > > in-place. > > Does the patch from my last patch-bearing email look similar enough to > what you're describing? > > The one called read_and_insert_process_output.diff No, not entirely: it still produces a Lisp string when decoding is needed, and then inserts that string into the buffer. Did you look at what insert-file-contents does? If not I suggest to have a look, starting from this comment: /* Here, we don't do code conversion in the loop. It is done by decode_coding_gap after all data are read into the buffer. */ and ending here: if (CODING_MAY_REQUIRE_DECODING (&coding) && (inserted > 0 || CODING_REQUIRE_FLUSHING (&coding))) { /* Now we have all the new bytes at the beginning of the gap, but `decode_coding_gap` can't have them at the beginning of the gap, so we need to move them. 
*/ memmove (GAP_END_ADDR - inserted, GPT_ADDR, inserted); decode_coding_gap (&coding, inserted); inserted = coding.produced_char; coding_system = CODING_ID_NAME (coding.id); } else if (inserted > 0) { /* Make the text read part of the buffer. */ eassert (NILP (BVAR (current_buffer, enable_multibyte_characters))); insert_from_gap_1 (inserted, inserted, false); invalidate_buffer_caches (current_buffer, PT, PT + inserted); adjust_after_insert (PT, PT_BYTE, PT + inserted, PT_BYTE + inserted, inserted); } > The result there, though, is that a "filter" (in the sense that > make-process uses that term) is not used at all. Sure, but in this case we don't need any filtering. It's basically the same idea as internal-default-process-filter: we just need to insert the process output into a buffer, and optionally decode it. > > And if we want to use it in production, we should > > probably work on adding that special default filter which inserts and > > decodes directly into the buffer, because that will probably lower the > > GC pressure and thus has hope of being faster. Or even replace the > > default filter implementation with that new one. > > But a filter must be a Lisp function, which can't help but accept only > Lisp strings (not C string) as argument. Isn't that right? We can provide a special filter identified by a symbol. Such a filter will not be Lisp-callable, it will exist for the cases where we need to insert the output into the process buffer. Any Lisp callback could then access the process output as the text of that buffer, no Lisp strings needed. I thought this was a worthy goal; apologies if I misunderstood. ^ permalink raw reply [flat|nested] 213+ messages in thread
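Usage of such a symbol-designated filter might look as follows. Note that the 'buffer value is only the proposal under discussion in this thread, not an existing argument of set-process-filter:

```elisp
;; Hypothetical usage of the proposed symbol-designated filter;
;; 'buffer is the name floated in this discussion, not (yet) a
;; value that set-process-filter accepts.
(let ((proc (make-process
             :name "find"
             :command '("find" "." "-type" "f" "-print0")
             :buffer (generate-new-buffer " *file-list*"))))
  ;; Ask Emacs to insert and decode the output directly in the
  ;; process buffer, with no intermediate Lisp strings.
  (set-process-filter proc 'buffer)
  proc)
```

Any Lisp code that wants the results would then read them out of the process buffer, e.g. from a sentinel, rather than from filter arguments.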
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-13 11:38 ` Eli Zaretskii @ 2023-09-13 14:27 ` Dmitry Gutov 2023-09-13 15:07 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-09-13 14:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 13/09/2023 14:38, Eli Zaretskii wrote: >> Date: Tue, 12 Sep 2023 23:27:49 +0300 >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >>> That's why I said "or writing a new filter function". >>> read_and_dispose_of_process_output will have to call this new filter >>> differently, passing it the raw text read from the subprocess, where >>> read_and_dispose_of_process_output current first decodes the text and >>> produces a Lisp string from it. Then the filter would need to do >>> something similar to what insert-file-contents does: insert the raw >>> input into the gap, then call decode_coding_gap to decode that >>> in-place. >> >> Does the patch from my last patch-bearing email look similar enough to >> what you're describing? >> >> The one called read_and_insert_process_output.diff > > No, not entirely: it still produces a Lisp string when decoding is > needed, and then inserts that string into the buffer. Are you sure? IIUC the fact that it passes 'curbuf' as the last argument to 'decode_coding_c_string' means that decoding happens inside the buffer. This has been my explanation for the performance improvement anyway. If it still generated a Lisp string, I think that would mean that we could save the general shape of internal-default-process-filter and just improve its implementation for the same measured benefit. > Did you look at what insert-file-contents does? If not I suggest to > have a look, starting from this comment: > > /* Here, we don't do code conversion in the loop.
It is done by > decode_coding_gap after all data are read into the buffer. */ > > and ending here: > > if (CODING_MAY_REQUIRE_DECODING (&coding) > && (inserted > 0 || CODING_REQUIRE_FLUSHING (&coding))) > { > /* Now we have all the new bytes at the beginning of the gap, > but `decode_coding_gap` can't have them at the beginning of the gap, > so we need to move them. */ > memmove (GAP_END_ADDR - inserted, GPT_ADDR, inserted); > decode_coding_gap (&coding, inserted); > inserted = coding.produced_char; > coding_system = CODING_ID_NAME (coding.id); > } > else if (inserted > 0) > { > /* Make the text read part of the buffer. */ > eassert (NILP (BVAR (current_buffer, enable_multibyte_characters))); > insert_from_gap_1 (inserted, inserted, false); > > invalidate_buffer_caches (current_buffer, PT, PT + inserted); > adjust_after_insert (PT, PT_BYTE, PT + inserted, PT_BYTE + inserted, > inserted); > } That does look different. I'm not sure how long it would take me to adapt this code (if you have an alternative patch to suggest right away, please go ahead), but if this method turns out to be faster, it sounds like we could improve the performance of 'call_process' the same way. That would be a win-win. >> The result there, though, is that a "filter" (in the sense that >> make-process uses that term) is not used at all. > > Sure, but in this case we don't need any filtering. It's basically > the same idea as internal-default-process-filter: we just need to > insert the process output into a buffer, and optionally decode it. Pretty much. But that raises the question of what to do with the existing function internal-default-process-filter. Looking around, it doesn't seem to be used with advice (a good thing: the proposed change would break that), but it is called directly in some packages like magit-blame, org-assistant, with-editor, wisi, sweeprolog, etc. I suppose we'd just keep it around unchanged. 
>>> And if we want to use it in production, we should >>> probably work on adding that special default filter which inserts and >>> decodes directly into the buffer, because that will probably lower the >>> GC pressure and thus has hope of being faster. Or even replace the >>> default filter implementation with that new one. >> >> But a filter must be a Lisp function, which can't help but accept only >> Lisp strings (not C string) as argument. Isn't that right? > > We can provide a special filter identified by a symbol. Such a filter > will not be Lisp-callable, it will exist for the cases where we need > to insert the output into the process buffer. That would be the safest alternative. OTOH, this way we'd pass up on the opportunity to make all existing asynchronous processes without custom filters, a little bit faster in one fell swoop. > Any Lisp callback could > then access the process output as the text of that buffer, no Lisp > strings needed. I thought this was a worthy goal; apologies if I > misunderstood. Sorry, I was just quibbling about the terminology, to make sure we are on the same page on what is being proposed. If the patch and evidence look good to people, that is. And I'd like to explore that improvement venue to the max. But note that it has limitations as well (e.g. filter is the only way to get in-process callbacks from the process, and avoiding it for best performance will require external callbacks such as timers), so if someone has any better ideas how to improve GC time to a comparable extent but keep design unchanged, that's also welcome. Should we also discuss increasing the default of read-process-output-max? Even increasing it 10x (not necessarily 100x) creates a noticeable difference, especially combined with the proposed change. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-13 14:27 ` Dmitry Gutov @ 2023-09-13 15:07 ` Eli Zaretskii 2023-09-13 17:27 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-09-13 15:07 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Wed, 13 Sep 2023 17:27:49 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > >> Does the patch from my last patch-bearing email look similar enough to > >> what you're describing? > >> > >> The one called read_and_insert_process_output.diff > > > > No, not entirely: it still produces a Lisp string when decoding is > > needed, and then inserts that string into the buffer. > > Are you sure? IIUC the fact that is passes 'curbuf' as the last argument > to 'decode_coding_c_string' means that decoding happens inside the > buffer. This has been my explanation for the performance improvement anyway. Yes, you are right, sorry. > > Sure, but in this case we don't need any filtering. It's basically > > the same idea as internal-default-process-filter: we just need to > > insert the process output into a buffer, and optionally decode it. > > Pretty much. But that raises the question of what to do with the > existing function internal-default-process-filter. Nothing. It will remain as the default filter. > Looking around, it doesn't seem to be used with advice (a good thing: > the proposed change would break that), but it is called directly in some > packages like magit-blame, org-assistant, with-editor, wisi, sweeprolog, > etc. I suppose we'd just keep it around unchanged. Yes. > > We can provide a special filter identified by a symbol. Such a filter > > will not be Lisp-callable, it will exist for the cases where we need > > to insert the output into the process buffer. > > The would be the safest alternative. 
OTOH, this way we'd pass up on the > opportunity to make all existing asynchronous processes without custom > filters, a little bit faster in one fell swoop. We could change the ones we care about, though. > Should we also discuss increasing the default of > read-process-output-max? Even increasing it 10x (not necessarily 100x) > creates a noticeable difference, especially combined with the proposed > change. That should be limited to specific cases where we expect to see a lot of stuff coming from the subprocess. We could also discuss changing the default value, but that would require measurements in as many cases as we can afford. ^ permalink raw reply [flat|nested] 213+ messages in thread
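Limiting the larger pipe to a specific case could look like the sketch below. The function name is made up for illustration, and it relies on read-process-output-max being consulted when the process is created:

```elisp
;; Sketch: bind read-process-output-max only around process creation,
;; so the larger pipe applies to this call site and nothing else.
(defun my-list-files-async (dir)
  "Start \"find\" in DIR with a larger pipe buffer; illustration only."
  (let ((default-directory dir)
        ;; The value in effect here, rather than the global default,
        ;; determines the pipe size for this process.
        (read-process-output-max (* 1024 1024)))
    (make-process :name "find"
                  :command '("find" "." "-type" "f")
                  :buffer (generate-new-buffer " *find*"))))
```

Call sites that expect only small outputs would keep the default and avoid the large-allocation downside mentioned later in the thread.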
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-13 15:07 ` Eli Zaretskii @ 2023-09-13 17:27 ` Dmitry Gutov 2023-09-13 19:32 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-09-13 17:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 13/09/2023 18:07, Eli Zaretskii wrote: >> Date: Wed, 13 Sep 2023 17:27:49 +0300 >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >>>> Does the patch from my last patch-bearing email look similar enough to >>>> what you're describing? >>>> >>>> The one called read_and_insert_process_output.diff >>> >>> No, not entirely: it still produces a Lisp string when decoding is >>> needed, and then inserts that string into the buffer. >> >> Are you sure? IIUC the fact that is passes 'curbuf' as the last argument >> to 'decode_coding_c_string' means that decoding happens inside the >> buffer. This has been my explanation for the performance improvement anyway. > > Yes, you are right, sorry. So we're not going to try the gap-based approach? Okay. >>> Sure, but in this case we don't need any filtering. It's basically >>> the same idea as internal-default-process-filter: we just need to >>> insert the process output into a buffer, and optionally decode it. >> >> Pretty much. But that raises the question of what to do with the >> existing function internal-default-process-filter. > > Nothing. It will remain as the default filter. Okay, if you are sure. >>> We can provide a special filter identified by a symbol. Such a filter >>> will not be Lisp-callable, it will exist for the cases where we need >>> to insert the output into the process buffer. >> >> The would be the safest alternative. OTOH, this way we'd pass up on the >> opportunity to make all existing asynchronous processes without custom >> filters, a little bit faster in one fell swoop. 
> > We could change the ones we care about, though. Which ones do we care about? I've found a bunch of 'make-process' calls without :filter specified (flymake backends, ). Do we upgrade them all? The difference is likely not critical in most of them, but the change would likely result in small reduction of GC pressure in the corresponding Emacs sessions. We'll also need to version-guard the ones that are in ELPA. We don't touch the implementations of functions like start-file-process, right? What about the callers of functions like start-file-process-shell-command who want to take advantage of the improvement? Are we okay with them all having to call (set-process-filter proc 'buffer) on the returned process value? >> Should we also discuss increasing the default of >> read-process-output-max? Even increasing it 10x (not necessarily 100x) >> creates a noticeable difference, especially combined with the proposed >> change. > > That should be limited to specific cases where we expect to see a lot > of stuff coming from the subprocess. So it would be okay to bump it in particular functions? Okay. > We could also discuss changing > the default value, but that would require measurements in as many > cases as we can afford. If you have some particular scenarios in mind, and what to look out for, I could test them out at least on one platform. I'm not sure what negatives to test for, though. Raising the limit 10x is unlikely to lead to an OOM, but I guess some processes could grow higher latency?.. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-13 17:27 ` Dmitry Gutov @ 2023-09-13 19:32 ` Eli Zaretskii 2023-09-13 20:38 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-09-13 19:32 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Wed, 13 Sep 2023 20:27:09 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > >> Are you sure? IIUC the fact that is passes 'curbuf' as the last argument > >> to 'decode_coding_c_string' means that decoding happens inside the > >> buffer. This has been my explanation for the performance improvement anyway. > > > > Yes, you are right, sorry. > > So we're not going to try the gap-based approach? Okay. decode_coding_c_string does that internally. > >> The would be the safest alternative. OTOH, this way we'd pass up on the > >> opportunity to make all existing asynchronous processes without custom > >> filters, a little bit faster in one fell swoop. > > > > We could change the ones we care about, though. > > Which ones do we care about? I've found a bunch of 'make-process' calls > without :filter specified (flymake backends, ). Do we upgrade them all? > > The difference is likely not critical in most of them, but the change > would likely result in small reduction of GC pressure in the > corresponding Emacs sessions. > > We'll also need to version-guard the ones that are in ELPA. > > We don't touch the implementations of functions like start-file-process, > right? > > What about the callers of functions like > start-file-process-shell-command who want to take advantage of the > improvement? Are we okay with them all having to call > (set-process-filter proc 'buffer) on the returned process value? I think these questions are slightly premature. 
We should first have the implementation of that filter, and then look for candidates that could benefit from it. My tendency is to change only callers which are in many cases expected to get a lot of stuff from a subprocess, so shell buffers are probably out. But we could discuss that later. > > We could also discuss changing > > the default value, but that would require measurements in as many > > cases as we can afford. > > If you have some particular scenarios in mind, and what to look out for, > I could test them out at least on one platform. Didn't think about that enough to have scenarios. > I'm not sure what negatives to test for, though. Raising the limit 10x > is unlikely to lead to an OOM, but I guess some processes could grow > higher latency?.. With a large buffer and small subprocess output we will ask the OS for a large memory increment for no good reason. Then the following GC will want to compact the gap, which means it will be slower. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-13 19:32 ` Eli Zaretskii @ 2023-09-13 20:38 ` Dmitry Gutov 2023-09-14 5:41 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-09-13 20:38 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 13/09/2023 22:32, Eli Zaretskii wrote: >>> We could change the ones we care about, though. >> >> Which ones do we care about? I've found a bunch of 'make-process' calls >> without :filter specified (flymake backends, ). Do we upgrade them all? >> >> The difference is likely not critical in most of them, but the change >> would likely result in small reduction of GC pressure in the >> corresponding Emacs sessions. >> >> We'll also need to version-guard the ones that are in ELPA. >> >> We don't touch the implementations of functions like start-file-process, >> right? >> >> What about the callers of functions like >> start-file-process-shell-command who want to take advantage of the >> improvement? Are we okay with them all having to call >> (set-process-filter proc 'buffer) on the returned process value? > > I think these questions are slightly premature. We should first have > the implementation of that filter, and then look for candidates that > could benefit from it. The implementation in that patch looks almost complete to me, unless you have any further comments. The main difference would be the change in the dispatch comparison from if (p->filter == Qinternal_default_process_filter) to if (p->filter == Qbuffer) , I think. Of course I can re-submit the amended patch, if you like. Regarding documentation, though. How will we describe that new value? The process filter is described like this in the manual: This function gives PROCESS the filter function FILTER. If FILTER is ‘nil’, it gives the process the default filter, which inserts the process output into the process buffer. 
If FILTER is ‘t’, Emacs stops accepting output from the process, unless it’s a network server process that listens for incoming connections. What can we add? If FILTER is ‘buffer’, it works like the default one, only a bit faster. ? > My tendency is to change only callers which > are in many cases expected to get a lot of stuff from a subprocess, so > shell buffers are probably out. But we could discuss that later. When I'm thinking of start-file-process-shell-command, I have in mind project--files-in-directory, which currently uses process-file-shell-command. Though I suppose most cases would be more easily converted to use make-process (like xref-matches-in-files uses process-file for launching a shell pipeline already). I was also thinking about Flymake backends because those work in the background. The outputs are usually small, but can easily grow in rare cases, without particular limit. Flymake also runs in the background, meaning whatever extra work it has to do (or especially GC pressure) affects the delays when editing. >>> We could also discuss changing >>> the default value, but that would require measurements in as many >>> cases as we can afford. >> >> If you have some particular scenarios in mind, and what to look out for, >> I could test them out at least on one platform. > > Didn't think about that enough to have scenarios. >> I'm not sure what negatives to test for, though. Raising the limit 10x >> is unlikely to lead to an OOM, but I guess some processes could grow >> higher latency?.. > > With a large buffer and small subprocess output we will ask the OS for > a large memory increment for no good reason. Then the following GC > will want to compact the gap, which means it will be slower. I wonder what scenario that might become apparent in. Launching many small processes at once? Can't think of a realistic test case. Anyway, if you prefer to put off the discussion about changing the default, that's fine by me. Or split into a separate bug.
^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-13 20:38 ` Dmitry Gutov @ 2023-09-14 5:41 ` Eli Zaretskii 2023-09-16 1:32 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-09-14 5:41 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Wed, 13 Sep 2023 23:38:29 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > > I think these questions are slightly premature. We should first have > > the implementation of that filter, and then look for candidates that > > could benefit from it. > > The implementation in that patch looks almost complete to me, unless you > have any further comments. Fine, then please post a complete patch with all the bells and whistles, and let's have it reviewed more widely. (I suggest a new bug report, as this one is already prohibitively long to follow, includes unrelated issues, and I fear some people will ignore patches posted to it). I think there are a few subtleties we still need to figure out. > The main difference would be the change in > the dispatch comparison from > > if (p->filter == Qinternal_default_process_filter) > > to > > if (p->filter == Qbuffer) Btw, both of the above are mistakes: you cannot compare Lisp objects as if they were simple values. You must use EQ. > This function gives PROCESS the filter function FILTER. If FILTER > is ‘nil’, it gives the process the default filter, which inserts > the process output into the process buffer. If FILTER is ‘t’, > Emacs stops accepting output from the process, unless it’s a > network server process that listens for incoming connections. > > What can we add? > > If FILTER is ‘buffer’, it works like the default one, only a bit faster. > > ? 
If FILTER is the symbol ‘buffer’, it works like the default filter, but makes some shortcuts to be faster: it doesn't adjust markers and the process mark (something else?). Of course, the real text will depend on what the final patch will look like: I'm not yet sure I understand which parts of internal-default-process-filter you want to keep in this alternative filter. (If you intend to keep all of them, it might be better to replace internal-default-process-filter completely, perhaps first with some variable exposed to Lisp which we could use to see if the new one causes issues.) > > My tendency is to change only callers which > > are in many cases expected to get a lot of stuff from a subprocess, so > > shell buffers are probably out. But we could discuss that later. > > When I'm thinking of start-file-process-shell-command, I have in mind > project--files-in-directory, which currently uses > process-file-shell-command. Though I suppose most cases would be more > easily converted to use make-process (like xref-matches-in-files uses > process-file for launching a shell pipeline already). > > I was also thinking about Flymake backends because those work in the > background. The outputs are usually small, but can easily grow in rare > cases, without particular limit. Flymake also runs in the background, > meaning whatever extra work it has to do (or especially GC pressure), > affects the delays when editing. I think we will have to address these on a case by case basis. The issues and aspects are not trivial and sometimes subtle. We might even introduce knobs to allow different pipe sizes if there's no one-fits-all value for a specific function using these primitives. > >> I'm not sure what negatives to test for, though. Raising the limit 10x > >> is unlikely to lead to an OOM, but I guess some processes could grow > >> higher latency?.. > > > > With a large buffer and small subprocess output we will ask the OS for > > a large memory increment for no good reason. 
Then the following GC > > will want to compact the gap, which means it will be slower. > > I wonder what scenario that might become apparent in. Launching many > small processes at once? Can't think of a realistic test case. One process suffices. The effect might not be significant, but slowdowns due to new features are generally considered regressions. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-14 5:41 ` Eli Zaretskii @ 2023-09-16 1:32 ` Dmitry Gutov 2023-09-16 5:37 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-09-16 1:32 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 14/09/2023 08:41, Eli Zaretskii wrote: >> Date: Wed, 13 Sep 2023 23:38:29 +0300 >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >>> I think these questions are slightly premature. We should first have >>> the implementation of that filter, and then look for candidates that >>> could benefit from it. >> >> The implementation in that patch looks almost complete to me, unless you >> have any further comments. > > Fine, then please post a complete patch with all the bells and > whistles, and let's have it reviewed more widely. (I suggest a new > bug report, as this one is already prohibitively long to follow, > includes unrelated issues, and I fear some people will ignore patches > posted to it). I think there are a few subtleties we still need to > figure out. Sure, filed bug#66020. > If FILTER is the symbol ‘buffer’, it works like the default filter, > but makes some shortcuts to be faster: it doesn't adjust markers and > the process mark (something else?). > > Of course, the real text will depend on what the final patch will look > like: I'm not yet sure I understand which parts of > internal-default-process-filter you want to keep in this alternative > filter. (If you intend to keep all of them, it might be better to > replace internal-default-process-filter completely, perhaps first with > some variable exposed to Lisp which we could use to see if the new one > causes issues.) Very good. And thanks for pointing out the omissions, so I went with reusing parts of internal-default-process-filter. 
>>>> I'm not sure what negatives to test for, though. Raising the limit 10x >>>> is unlikely to lead to an OOM, but I guess some processes could grow >>>> higher latency?.. >>> >>> With a large buffer and small subprocess output we will ask the OS for >>> a large memory increment for no good reason. Then the following GC >>> will want to compact the gap, which means it will be slower. >> >> I wonder what scenario that might become apparent in. Launching many >> small processes at once? Can't think of a realistic test case. > > One process suffices. The effect might not be significant, but > slowdowns due to new features are generally considered regressions. We'd need some objective way to evaluate this. Otherwise we'd just stop at the prospect of slowing down some process somewhere by 9ns (never mind speeding others up). ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-09-16 1:32 ` Dmitry Gutov @ 2023-09-16 5:37 ` Eli Zaretskii 2023-09-19 19:59 ` bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-09-16 5:37 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Sat, 16 Sep 2023 04:32:26 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > >> I wonder what scenario that might become apparent in. Launching many > >> small processes at once? Can't think of a realistic test case. > > > > One process suffices. The effect might not be significant, but > > slowdowns due to new features are generally considered regressions. > > We'd need some objective way to evaluate this. Otherwise we'd just stop > at the prospect of slowing down some process somewhere by 9ns (never > mind speeding others up). That could indeed happen, and did happen in other cases. My personal conclusion from similar situations is that it is impossible to tell in advance what the reaction will be; we need to present the numbers and see how the chips fall. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-16 5:37 ` Eli Zaretskii @ 2023-09-19 19:59 ` Dmitry Gutov 2023-09-20 11:20 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-09-19 19:59 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 66020

This is another continuation from bug#64735; a subthread in this bug seems more fitting, given that I did most of the tests with its patch applied.

On 16/09/2023 08:37, Eli Zaretskii wrote:
>> Date: Sat, 16 Sep 2023 04:32:26 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>> 64735@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>>>> I wonder what scenario that might become apparent in. Launching many
>>>> small processes at once? Can't think of a realistic test case.
>>> One process suffices. The effect might not be significant, but
>>> slowdowns due to new features are generally considered regressions.
>> We'd need some objective way to evaluate this. Otherwise we'd just stop
>> at the prospect of slowing down some process somewhere by 9ns (never
>> mind speeding others up).
> That could indeed happen, and did happen in other cases. My personal
> conclusion from similar situations is that it is impossible to tell in
> advance what the reaction will be; we need to present the numbers and
> see how the chips fall.

I wrote this test:

(defun test-ls-output ()
  (with-temp-buffer
    (let ((proc (make-process :name "ls"
                              :sentinel (lambda (&rest _))
                              :buffer (current-buffer)
                              :stderr (current-buffer)
                              :connection-type 'pipe
                              :command '("ls"))))
      (while (accept-process-output proc))
      (buffer-string))))

And tried to find some case where the difference is the least in favor of a high buffer length. The one in favor of it we already know (a process with lots and lots of output).
But when running 'ls' on a small directory (output 500 chars long), the variance in benchmarking is larger than any difference I can see from changing read-process-output-max from 4096 to 40960 (or to 40900 even). The benchmark is the following:

(benchmark 1000 '(let ((read-process-output-fast t)
                       (read-process-output-max 4096))
                   (test-ls-output)))

When the directory is a little large (output ~50000 chars), there is more nuance. At first, as long as the (!) read_and_insert_process_output_v2 patch is applied and read-process-output-fast is non-nil, the difference is negligible:

| read-process-output-max | bench result                        |
| 4096                    | (4.566418994 28 0.8000380139999992) |
| 40960                   | (4.640526664 32 0.8330555910000008) |
| 409600                  | (4.629948652 30 0.7989731299999994) |

For completeness, here are the same results for read-process-output-fast=nil (emacs-29 is similar, though all a little slower):

| read-process-output-max | bench result                        |
| 4096                    | (4.953397326 52 1.354643750000001)  |
| 40960                   | (6.942334958 75 2.0616055079999995) |
| 409600                  | (7.124765651 76 2.0892871070000005) |

But as the session gets older (and I repeat these and other memory-intensive benchmarks), the picture changes, and the larger buffers lead to uniformly worse numbers (the below is taken with read-process-output-fast=t; with that var set to nil the results were even worse):

| read-process-output-max | bench result                        |
| 4096                    | (5.02324481 41 0.8851443580000051)  |
| 40960                   | (5.438721274 61 1.2202541989999958) |
| 409600                  | (6.11188183 77 1.5461468160000038)  |

...which seems odd, given that in general a buffer length closer to the length of the output should be preferable: otherwise the buffer is allocated multiple times, and read_process_output is likewise called more. Perhaps longer strings get more difficult to allocate as fragmentation increases?
So, the last table is from a session I had running since yesterday, and the first table was produced after I restarted Emacs about an hour ago (the numbers were stable for 1-2 hours while I was writing this email on-and-off, then started degrading again a little bit, though not yet -- a couple of hours since -- even halfway to the numbers in the last table).

Where to go from here?

- Maybe we declare the difference insignificant and bump the value of read-process-output-max, given that it helps in other cases,
- Or try to find out the cause for the degradation,
- Or keep the default the same, but make it easier to use a different value for different processes (meaning, we resurrect the discussion in bug#38561).

^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-19 19:59 ` bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max Dmitry Gutov @ 2023-09-20 11:20 ` Eli Zaretskii 2023-09-21 0:57 ` Dmitry Gutov 2023-09-21 8:07 ` Stefan Kangas 0 siblings, 2 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-09-20 11:20 UTC (permalink / raw) To: Dmitry Gutov, Stefan Kangas, Stefan Monnier; +Cc: 66020 > Date: Tue, 19 Sep 2023 22:59:43 +0300 > Cc: 66020@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > - Maybe we declare the difference insignificant and bump the value of > read-process-output-max, given that it helps in other cases, > - Or try to find out the cause for degradation, > - Or keep the default the same, but make it easier to use different > value for different processes (meaning, we resurrect the discussion in > bug#38561). I'd try the same experiment on other use cases, say "M-x grep" and "M-x compile" with large outputs, and if you see the same situation there (i.e. larger buffers are no worse), try increasing the default value on master. Stefan & Stefan: any comments or suggestions? ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-20 11:20 ` Eli Zaretskii @ 2023-09-21 0:57 ` Dmitry Gutov 2023-09-21 2:36 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-09-21 7:42 ` Eli Zaretskii 2023-09-21 8:07 ` Stefan Kangas 1 sibling, 2 replies; 213+ messages in thread From: Dmitry Gutov @ 2023-09-21 0:57 UTC (permalink / raw) To: Eli Zaretskii, Stefan Kangas, Stefan Monnier; +Cc: 66020 On 20/09/2023 14:20, Eli Zaretskii wrote: >> Date: Tue, 19 Sep 2023 22:59:43 +0300 >> Cc: 66020@debbugs.gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >> - Maybe we declare the difference insignificant and bump the value of >> read-process-output-max, given that it helps in other cases, >> - Or try to find out the cause for degradation, >> - Or keep the default the same, but make it easier to use different >> value for different processes (meaning, we resurrect the discussion in >> bug#38561). > > I'd try the same experiment on other use cases, say "M-x grep" and > "M-x compile" with large outputs, and if you see the same situation > there (i.e. larger buffers are no worse), try increasing the default > value on master. I've run one particular rgrep search a few times (24340 hits, ~44s when the variable's value is either 4096 or 409600). And it makes sense that there is no difference: compilation modes do a lot more work than just capturing the process output or splitting it into strings. That leaves the question of what new value to use. 409600 is optimal for a large-output process but seems too much as default anyway (even if I have very little experimental proof for that hesitance: any help with that would be very welcome). I did some more experimenting, though. At a superficial glance, allocating the 'chars' buffer at the beginning of read_process_output is problematic because we could instead reuse a buffer for the whole duration of the process. 
I tried that (adding a new field to Lisp_Process and setting it in make_process), although I had to use a value produced by make_uninit_string: apparently simply storing a char* field inside a managed structure creates problems for the GC and early segfaults. Anyway, the result was slightly _slower_ than the status quo. So I read what 'alloca' does, and it looks hard to beat. But it's only used (as you of course know) when the value is <= MAX_ALLOCA, which is currently 16384. Perhaps an optimal default value shouldn't exceed this, even if it's hard to create a benchmark that shows a difference. With read-process-output-max set to 16384, my original benchmark gets about halfway to the optimal number. And I think we should make the process "remember" the value at its creation either way (something touched on in bug#38561): in bug#55737 we added an fcntl call to make the larger values take effect. But this call is in create_process: so any subsequent increase to a large value of this var won't have effect. Might as well remember it there (in a new field), then it'll be easier to use different values of it for different processes (set using let-binding at the time of the process' creation). ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-21 0:57 ` Dmitry Gutov @ 2023-09-21 2:36 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors [not found] ` <58e9135f-915d-beb9-518a-e814ec2a0c5b@gutov.dev> 2023-09-21 7:42 ` Eli Zaretskii 1 sibling, 1 reply; 213+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-09-21 2:36 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Eli Zaretskii, Stefan Kangas, 66020 > make_process), although I had to use a value produced by make_uninit_string: > apparently simply storing a char* field inside a managed structure creates > problems for the GC and early segfaults. Anyway, the result was slightly That should depend on *where* you put that field. Basically, it has to come after: /* The thread a process is linked to, or nil for any thread. */ Lisp_Object thread; /* After this point, there are no Lisp_Objects. */ since all the words up to that point will be traced by the GC (and assumed to be Lisp_Object fields). But of course, if you created the buffer with `make_uninit_string` then it'll be inside the Lisp heap and so it'll be reclaimed if the GC doesn't find any reference to it. Stefan ^ permalink raw reply [flat|nested] 213+ messages in thread
[parent not found: <58e9135f-915d-beb9-518a-e814ec2a0c5b@gutov.dev>]
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max [not found] ` <58e9135f-915d-beb9-518a-e814ec2a0c5b@gutov.dev> @ 2023-09-21 13:16 ` Eli Zaretskii 2023-09-21 17:54 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-09-21 13:16 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 66020, monnier, stefankangas > Date: Thu, 21 Sep 2023 15:20:57 +0300 > Cc: Eli Zaretskii <eliz@gnu.org>, Stefan Kangas <stefankangas@gmail.com>, > 66020@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > On 21/09/2023 05:36, Stefan Monnier wrote: > >> make_process), although I had to use a value produced by make_uninit_string: > >> apparently simply storing a char* field inside a managed structure creates > >> problems for the GC and early segfaults. Anyway, the result was slightly > > That should depend on*where* you put that field. Basically, it has to > > come after: > > > > /* The thread a process is linked to, or nil for any thread. */ > > Lisp_Object thread; > > /* After this point, there are no Lisp_Objects. */ > > > > since all the words up to that point will be traced by the GC (and > > assumed to be Lisp_Object fields). > > Ah, thanks. That calls for another try. > > ...still no improvement, though no statistically significant slowdown > either this time. Why did you expect a significant improvement? Allocating and freeing the same-size buffer in quick succession has got to be optimally handled by modern malloc implementations, so I wouldn't be surprised by what you discover. There should be no OS calls, just reuse of a buffer that was just recently free'd. The overhead exists, but is probably very small, so it is lost in the noise. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-21 13:16 ` Eli Zaretskii @ 2023-09-21 17:54 ` Dmitry Gutov 0 siblings, 0 replies; 213+ messages in thread From: Dmitry Gutov @ 2023-09-21 17:54 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 66020, monnier, stefankangas On 21/09/2023 16:16, Eli Zaretskii wrote: >> Date: Thu, 21 Sep 2023 15:20:57 +0300 >> Cc: Eli Zaretskii<eliz@gnu.org>, Stefan Kangas<stefankangas@gmail.com>, >> 66020@debbugs.gnu.org >> From: Dmitry Gutov<dmitry@gutov.dev> >> >> On 21/09/2023 05:36, Stefan Monnier wrote: >>>> make_process), although I had to use a value produced by make_uninit_string: >>>> apparently simply storing a char* field inside a managed structure creates >>>> problems for the GC and early segfaults. Anyway, the result was slightly >>> That should depend on*where* you put that field. Basically, it has to >>> come after: >>> >>> /* The thread a process is linked to, or nil for any thread. */ >>> Lisp_Object thread; >>> /* After this point, there are no Lisp_Objects. */ >>> >>> since all the words up to that point will be traced by the GC (and >>> assumed to be Lisp_Object fields). >> Ah, thanks. That calls for another try. >> >> ...still no improvement, though no statistically significant slowdown >> either this time. > Why did you expect a significant improvement? No need to be surprised, I'm still growing intuition for what is fast and what is slow at this level of abstraction. > Allocating and freeing > the same-size buffer in quick succession has got to be optimally > handled by modern malloc implementations, so I wouldn't be surprised > by what you discover. There should be no OS calls, just reuse of a > buffer that was just recently free'd. The overhead exists, but is > probably very small, so it is lost in the noise. 
There are context switches after 'read_process_output' exits (control is returned to Emacs's event loop, the external process runs again, we wait on it with 'select'), so the freed block might not still be at hand by the next read, especially outside of the lab situation where we benchmark just a single external process. So I don't know. I'm not majorly concerned, of course, and wouldn't be at all, if not for the previously recorded minor degradation with larger buffers in the longer-running session (last table in https://debbugs.gnu.org/66020#10). ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-21 0:57 ` Dmitry Gutov 2023-09-21 2:36 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-09-21 7:42 ` Eli Zaretskii 2023-09-21 14:37 ` Dmitry Gutov 1 sibling, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-09-21 7:42 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 66020, stefankangas, monnier > Date: Thu, 21 Sep 2023 03:57:43 +0300 > Cc: 66020@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > That leaves the question of what new value to use. 409600 is optimal for > a large-output process but seems too much as default anyway (even if I > have very little experimental proof for that hesitance: any help with > that would be very welcome). How does the throughput depend on this value? If the dependence curve plateaus at some lower value, we could use that lower value as a "good-enough" default. > I did some more experimenting, though. At a superficial glance, > allocating the 'chars' buffer at the beginning of read_process_output is > problematic because we could instead reuse a buffer for the whole > duration of the process. I tried that (adding a new field to > Lisp_Process and setting it in make_process), although I had to use a > value produced by make_uninit_string: apparently simply storing a char* > field inside a managed structure creates problems for the GC and early > segfaults. Anyway, the result was slightly _slower_ than the status quo. > > So I read what 'alloca' does, and it looks hard to beat. But it's only > used (as you of course know) when the value is <= MAX_ALLOCA, which is > currently 16384. Perhaps an optimal default value shouldn't exceed this, > even if it's hard to create a benchmark that shows a difference. With > read-process-output-max set to 16384, my original benchmark gets about > halfway to the optimal number. 
Which I think means we should stop worrying about the overhead of malloc for this purpose, as it is fast enough, at least on GNU/Linux. > And I think we should make the process "remember" the value at its > creation either way (something touched on in bug#38561): in bug#55737 we > added an fcntl call to make the larger values take effect. But this call > is in create_process: so any subsequent increase to a large value of > this var won't have effect. Why would the variable change after create_process? I'm afraid I don't understand what issue you are trying to deal with here. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-21 7:42 ` Eli Zaretskii @ 2023-09-21 14:37 ` Dmitry Gutov 2023-09-21 14:59 ` Eli Zaretskii 2023-09-21 17:33 ` Dmitry Gutov 0 siblings, 2 replies; 213+ messages in thread From: Dmitry Gutov @ 2023-09-21 14:37 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 66020, stefankangas, monnier

On 21/09/2023 10:42, Eli Zaretskii wrote:
>> Date: Thu, 21 Sep 2023 03:57:43 +0300
>> Cc: 66020@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>> That leaves the question of what new value to use. 409600 is optimal for
>> a large-output process but seems too much as default anyway (even if I
>> have very little experimental proof for that hesitance: any help with
>> that would be very welcome).
>
> How does the throughput depend on this value? If the dependence curve
> plateaus at some lower value, we could use that lower value as a
> "good-enough" default.

Depends on what we're prepared to call a plateau. Strictly speaking, not really. But we have a "sweet spot": for the process in my original benchmark ('find' with lots of output) it seems to be around 1009600. Here's a table (numbers are different from before because they're results of (benchmark 5 ...) divided by 5, meaning GC is amortized):

| 4096    | 0.78 |
| 16368   | 0.69 |
| 40960   | 0.65 |
| 409600  | 0.59 |
| 1009600 | 0.56 |
| 2009600 | 0.64 |
| 4009600 | 0.65 |

The process's output length is 27244567 in this case. Still above the largest of the buffers in this example.

Notably, allocating the buffer only once at the start of the process (the experiment mentioned in the email to Stefan M.) doesn't change the dynamics: buffer lengths above ~1009600 make the performance worse. So there must be some negative factor associated with larger buffers. There is an obvious positive one: the larger the buffer, the longer we don't switch between processes, so that overhead is lower.
We could look into improving that part specifically: for example, reading from the process multiple times into 'chars' right away while there is still pending output present (either looping inside read_process_output, or calling it in a loop in wait_reading_process_output, at least until the process' buffered output is exhausted). That could reduce reactivity, however (can we find out how much is already buffered in advance, and only loop until we exhaust that length?) >> I did some more experimenting, though. At a superficial glance, >> allocating the 'chars' buffer at the beginning of read_process_output is >> problematic because we could instead reuse a buffer for the whole >> duration of the process. I tried that (adding a new field to >> Lisp_Process and setting it in make_process), although I had to use a >> value produced by make_uninit_string: apparently simply storing a char* >> field inside a managed structure creates problems for the GC and early >> segfaults. Anyway, the result was slightly _slower_ than the status quo. >> >> So I read what 'alloca' does, and it looks hard to beat. But it's only >> used (as you of course know) when the value is <= MAX_ALLOCA, which is >> currently 16384. Perhaps an optimal default value shouldn't exceed this, >> even if it's hard to create a benchmark that shows a difference. With >> read-process-output-max set to 16384, my original benchmark gets about >> halfway to the optimal number. > > Which I think means we should stop worrying about the overhead of > malloc for this purpose, as it is fast enough, at least on GNU/Linux. Perhaps. If we're not too concerned about memory fragmentation (that's the only explanation I have for the table "session gets older" -- last one -- in a previous email with test-ls-output timings). 
>> And I think we should make the process "remember" the value at its >> creation either way (something touched on in bug#38561): in bug#55737 we >> added an fcntl call to make the larger values take effect. But this call >> is in create_process: so any subsequent increase to a large value of >> this var won't have effect. > > Why would the variable change after create_process? I'm afraid I > don't understand what issue you are trying to deal with here. Well, what could we lose by saving the value of read-process-output-max in create_process? Currently I suppose one could vary its value while a process is still running, to implement some adaptive behavior or whatnot. But that's already semi-broken because fcntl is called in create_process. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-21 14:37 ` Dmitry Gutov @ 2023-09-21 14:59 ` Eli Zaretskii 2023-09-21 17:40 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-09-21 14:59 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 66020, stefankangas, monnier

> Date: Thu, 21 Sep 2023 17:37:23 +0300
> Cc: stefankangas@gmail.com, monnier@iro.umontreal.ca, 66020@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
>
> > How does the throughput depend on this value? If the dependence curve
> > plateaus at some lower value, we could use that lower value as a
> > "good-enough" default.
>
> Depends on what we're prepared to call a plateau. Strictly speaking, not
> really. But we have a "sweet spot": for the process in my original
> benchmark ('find' with lots of output) it seems to be around 1009600.
> Here's a table (numbers are different from before because they're
> results of (benchmark 5 ...) divided by 5, meaning GC is amortized):
>
> | 4096    | 0.78 |
> | 16368   | 0.69 |
> | 40960   | 0.65 |
> | 409600  | 0.59 |
> | 1009600 | 0.56 |
> | 2009600 | 0.64 |
> | 4009600 | 0.65 |

Not enough data points between 40960 and 409600, IMO. 40960 sounds like a good spot for the default value.

> >> And I think we should make the process "remember" the value at its
> >> creation either way (something touched on in bug#38561): in bug#55737 we
> >> added an fcntl call to make the larger values take effect. But this call
> >> is in create_process: so any subsequent increase to a large value of
> >> this var won't have effect.
> >
> > Why would the variable change after create_process? I'm afraid I
> > don't understand what issue you are trying to deal with here.
>
> Well, what could we lose by saving the value of read-process-output-max
> in create_process?

It's already recorded in the size of the pipe, so why would we need to record it once more?
> Currently I suppose one could vary its value while a process is > still running, to implement some adaptive behavior or whatnot. But > that's already semi-broken because fcntl is called in > create_process. I see no reason to support such changes during the process run, indeed. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-21 14:59 ` Eli Zaretskii @ 2023-09-21 17:40 ` Dmitry Gutov 2023-09-21 18:39 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-09-21 17:40 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 66020, stefankangas, monnier On 21/09/2023 17:59, Eli Zaretskii wrote: >> Date: Thu, 21 Sep 2023 17:37:23 +0300 >> Cc: stefankangas@gmail.com, monnier@iro.umontreal.ca, 66020@debbugs.gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >>> How does the throughput depend on this value? If the dependence curve >>> plateaus at some lower value, we could use that lower value as a >>> "good-enough" default. >> >> Depends on what we're prepared to call a plateau. Strictly speaking, not >> really. But we have a "sweet spot": for the process in my original >> benchmark ('find' with lots of output) it seems to be around 1009600. >> Here's a table (numbers are different from before because they're >> results of (benchmark 5 ...) divided by 5, meaning GC is amortized: >> >> | 4096 | 0.78 | >> | 16368 | 0.69 | >> | 40960 | 0.65 | >> | 409600 | 0.59 | >> | 1009600 | 0.56 | >> | 2009600 | 0.64 | >> | 4009600 | 0.65 | > > Not enough data points between 40960 and 409600, IMO. 40960 sounds > like a good spot for the default value. Or 32K, from the thread linked to previously (datagram size). And if we were to raise MAX_ALLOCA by 2x, we could still use 'alloca'. Neither would be optimal for my test scenario, though still an improvement. But see my other email with experimental patches; those bring improvement even with the default 4096. >>>> And I think we should make the process "remember" the value at its >>>> creation either way (something touched on in bug#38561): in bug#55737 we >>>> added an fcntl call to make the larger values take effect. But this call >>>> is in create_process: so any subsequent increase to a large value of >>>> this var won't have effect. 
>>> >>> Why would the variable change after create_process? I'm afraid I >>> don't understand what issue you are trying to deal with here. >> >> Well, what could we lose by saving the value of read-process-output-max >> in create_process? > > It's already recorded in the size of the pipe, so why would we need to > record it once more? 'read_process_output' looks it up once more, to set the value of 'readmax' and allocate the char * buffer 'chars'. Can we get the "recorded" value back from the pipe somehow? ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-21 17:40 ` Dmitry Gutov @ 2023-09-21 18:39 ` Eli Zaretskii 2023-09-21 18:42 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-09-21 18:39 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 66020, stefankangas, monnier > Date: Thu, 21 Sep 2023 20:40:35 +0300 > Cc: stefankangas@gmail.com, monnier@iro.umontreal.ca, 66020@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > Can we get the "recorded" value back from the pipe somehow? There's F_GETPIPE_SZ command to fcntl, so I think we can. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-21 18:39 ` Eli Zaretskii @ 2023-09-21 18:42 ` Dmitry Gutov 2023-09-21 18:49 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-09-21 18:42 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 66020, stefankangas, monnier On 21/09/2023 21:39, Eli Zaretskii wrote: >> Date: Thu, 21 Sep 2023 20:40:35 +0300 >> Cc:stefankangas@gmail.com,monnier@iro.umontreal.ca,66020@debbugs.gnu.org >> From: Dmitry Gutov<dmitry@gutov.dev> >> >> Can we get the "recorded" value back from the pipe somehow? > There's F_GETPIPE_SZ command to fcntl, so I think we can. I'll rephrase: is this a good idea (doing a +1 syscall every time we read a chunk, I'm not sure of its performance anyway), or should we add a new field to Lisp_Process after all? ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-21 18:42 ` Dmitry Gutov @ 2023-09-21 18:49 ` Eli Zaretskii 0 siblings, 0 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-09-21 18:49 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 66020, stefankangas, monnier > Date: Thu, 21 Sep 2023 21:42:01 +0300 > Cc: stefankangas@gmail.com, monnier@iro.umontreal.ca, 66020@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > On 21/09/2023 21:39, Eli Zaretskii wrote: > >> Date: Thu, 21 Sep 2023 20:40:35 +0300 > >> Cc:stefankangas@gmail.com,monnier@iro.umontreal.ca,66020@debbugs.gnu.org > >> From: Dmitry Gutov<dmitry@gutov.dev> > >> > >> Can we get the "recorded" value back from the pipe somehow? > > There's F_GETPIPE_SZ command to fcntl, so I think we can. > > I'll rephrase: is this a good idea (doing a +1 syscall every time we > read a chunk, I'm not sure of its performance anyway), or should we add > a new field to Lisp_Process after all? If you indeed need the size, then do add it to the process object. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-21 14:37 ` Dmitry Gutov 2023-09-21 14:59 ` Eli Zaretskii @ 2023-09-21 17:33 ` Dmitry Gutov 2023-09-23 21:51 ` Dmitry Gutov 1 sibling, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-09-21 17:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 66020, monnier, stefankangas [-- Attachment #1: Type: text/plain, Size: 3317 bytes --] On 21/09/2023 17:37, Dmitry Gutov wrote: > We could look into improving that part specifically: for example, > reading from the process multiple times into 'chars' right away while > there is still pending output present (either looping inside > read_process_output, or calling it in a loop in > wait_reading_process_output, at least until the process' buffered output > is exhausted). That could reduce reactivity, however (can we find out > how much is already buffered in advance, and only loop until we exhaust > that length?) Hmm, the naive patch below offers some improvement for the value 4096, but still not comparable to raising the buffer size: 0.76 -> 0.72. diff --git a/src/process.c b/src/process.c index 2376d0f288d..a550e223f78 100644 --- a/src/process.c +++ b/src/process.c @@ -5893,7 +5893,7 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd, && ((fd_callback_info[channel].flags & (KEYBOARD_FD | PROCESS_FD)) == PROCESS_FD)) { - int nread; + int nread = 0, nnread; /* If waiting for this channel, arrange to return as soon as no more input to be processed. No more @@ -5912,7 +5912,13 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd, /* Read data from the process, starting with our buffered-ahead character if we have one. 
*/ - nread = read_process_output (proc, channel); + do + { + nnread = read_process_output (proc, channel); + nread += nnread; + } + while (nnread >= 4096); + if ((!wait_proc || wait_proc == XPROCESS (proc)) && got_some_output < nread) got_some_output = nread; And "unlocking" the pipe size on the external process takes the performance further up a notch (by default it's much larger): 0.72 -> 0.65. diff --git a/src/process.c b/src/process.c index 2376d0f288d..85fc1b4d0c8 100644 --- a/src/process.c +++ b/src/process.c @@ -2206,10 +2206,10 @@ create_process (Lisp_Object process, char **new_argv, Lisp_Object current_dir) inchannel = p->open_fd[READ_FROM_SUBPROCESS]; forkout = p->open_fd[SUBPROCESS_STDOUT]; -#if (defined (GNU_LINUX) || defined __ANDROID__) \ - && defined (F_SETPIPE_SZ) - fcntl (inchannel, F_SETPIPE_SZ, read_process_output_max); -#endif /* (GNU_LINUX || __ANDROID__) && F_SETPIPE_SZ */ +/* #if (defined (GNU_LINUX) || defined __ANDROID__) \ */ +/* && defined (F_SETPIPE_SZ) */ +/* fcntl (inchannel, F_SETPIPE_SZ, read_process_output_max); */ +/* #endif /\* (GNU_LINUX || __ANDROID__) && F_SETPIPE_SZ *\/ */ } if (!NILP (p->stderrproc)) Apparently the patch from bug#55737 also made things a little worse by default, by limiting concurrency (the external process has to wait while the pipe is blocked, and by default Linux's pipe is larger). Just commenting it out makes performance a little better as well, though not as much as the two patches together. Note that both changes above are just PoC (e.g. the hardcoded 4096, and probably other details like carryover). I've tried to make a more nuanced loop inside read_process_output instead (as replacement for the first patch above), and so far it performs worse than the baseline. If anyone can see what I'm doing wrong (see attachment), comments are very welcome. 
[-- Attachment #2: read_process_output_nn_inside.diff --] [-- Type: text/x-patch, Size: 1443 bytes --] diff --git a/src/process.c b/src/process.c index 2376d0f288d..91a5c044a8c 100644 --- a/src/process.c +++ b/src/process.c @@ -6128,11 +6133,11 @@ read_and_dispose_of_process_output (struct Lisp_Process *p, char *chars, static int read_process_output (Lisp_Object proc, int channel) { - ssize_t nbytes; + ssize_t nbytes, nnbytes = 0; struct Lisp_Process *p = XPROCESS (proc); eassert (0 <= channel && channel < FD_SETSIZE); struct coding_system *coding = proc_decode_coding_system[channel]; - int carryover = p->decoding_carryover; + int carryover; ptrdiff_t readmax = clip_to_bounds (1, read_process_output_max, PTRDIFF_MAX); specpdl_ref count = SPECPDL_INDEX (); Lisp_Object odeactivate; @@ -6141,6 +6146,9 @@ read_process_output (Lisp_Object proc, int channel) USE_SAFE_ALLOCA; chars = SAFE_ALLOCA (sizeof coding->carryover + readmax); +do{ + carryover = p->decoding_carryover; + if (carryover) /* See the comment above. */ memcpy (chars, SDATA (p->decoding_buf), carryover); @@ -6222,3 +6236,3 @@ read_process_output (Lisp_Object proc, int channel) /* Now set NBYTES how many bytes we must decode. */ nbytes += carryover; @@ -6233,5 +6245,8 @@ /* Handling the process output should not deactivate the mark. */ Vdeactivate_mark = odeactivate; + nnbytes += nbytes; + } while (nbytes >= readmax); + SAFE_FREE_UNBIND_TO (count, Qnil); - return nbytes; + return nnbytes; } ^ permalink raw reply related [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-21 17:33 ` Dmitry Gutov @ 2023-09-23 21:51 ` Dmitry Gutov 2023-09-24 5:29 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-09-23 21:51 UTC (permalink / raw) To: Eli Zaretskii; +Cc: stefankangas, 66020, monnier On 21/09/2023 20:33, Dmitry Gutov wrote: > On 21/09/2023 17:37, Dmitry Gutov wrote: >> We could look into improving that part specifically: for example, >> reading from the process multiple times into 'chars' right away while >> there is still pending output present (either looping inside >> read_process_output, or calling it in a loop in >> wait_reading_process_output, at least until the process' buffered >> output is exhausted). That could reduce reactivity, however (can we >> find out how much is already buffered in advance, and only loop until >> we exhaust that length?) > > Hmm, the naive patch below offers some improvement for the value 4096, > but still not comparable to raising the buffer size: 0.76 -> 0.72. > > diff --git a/src/process.c b/src/process.c > index 2376d0f288d..a550e223f78 100644 > --- a/src/process.c > +++ b/src/process.c > @@ -5893,7 +5893,7 @@ wait_reading_process_output (intmax_t time_limit, > int nsecs, int read_kbd, > && ((fd_callback_info[channel].flags & (KEYBOARD_FD | > PROCESS_FD)) > == PROCESS_FD)) > { > - int nread; > + int nread = 0, nnread; > > /* If waiting for this channel, arrange to return as > soon as no more input to be processed. No more > @@ -5912,7 +5912,13 @@ wait_reading_process_output (intmax_t time_limit, > int nsecs, int read_kbd, > /* Read data from the process, starting with our > buffered-ahead character if we have one. 
*/ > > - nread = read_process_output (proc, channel); > + do > + { > + nnread = read_process_output (proc, channel); > + nread += nnread; > + } > + while (nnread >= 4096); > + > if ((!wait_proc || wait_proc == XPROCESS (proc)) > && got_some_output < nread) > got_some_output = nread; > > > And "unlocking" the pipe size on the external process takes the > performance further up a notch (by default it's much larger): 0.72 -> 0.65. > > diff --git a/src/process.c b/src/process.c > index 2376d0f288d..85fc1b4d0c8 100644 > --- a/src/process.c > +++ b/src/process.c > @@ -2206,10 +2206,10 @@ create_process (Lisp_Object process, char > **new_argv, Lisp_Object current_dir) > inchannel = p->open_fd[READ_FROM_SUBPROCESS]; > forkout = p->open_fd[SUBPROCESS_STDOUT]; > > -#if (defined (GNU_LINUX) || defined __ANDROID__) \ > - && defined (F_SETPIPE_SZ) > - fcntl (inchannel, F_SETPIPE_SZ, read_process_output_max); > -#endif /* (GNU_LINUX || __ANDROID__) && F_SETPIPE_SZ */ > +/* #if (defined (GNU_LINUX) || defined __ANDROID__) \ */ > +/* && defined (F_SETPIPE_SZ) */ > +/* fcntl (inchannel, F_SETPIPE_SZ, read_process_output_max); */ > +/* #endif /\* (GNU_LINUX || __ANDROID__) && F_SETPIPE_SZ *\/ */ > } > > if (!NILP (p->stderrproc)) > > Apparently the patch from bug#55737 also made things a little worse by > default, by limiting concurrency (the external process has to wait while > the pipe is blocked, and by default Linux's pipe is larger). Just > commenting it out makes performance a little better as well, though not > as much as the two patches together. > > Note that both changes above are just PoC (e.g. the hardcoded 4096, and > probably other details like carryover). > > I've tried to make a more nuanced loop inside read_process_output > instead (as replacement for the first patch above), and so far it > performs worse that the baseline. If anyone can see when I'm doing wrong > (see attachment), comments are very welcome. 
This seems to have been a dead end: while looping does indeed make things faster, it doesn't really fit the approach of the 'adaptive_read_buffering' part that's implemented in read_process_output. And if the external process is crazy fast (while we, e.g. when using a Lisp filter, are not so fast), the result could be much reduced interactivity, with this one process keeping us stuck in the loop. But it seems I've found an answer to one previous question: "can we find out how much is already buffered in advance?" The patch below asks that from the OS (how portable is this? not sure) and allocates a larger buffer when more output has been buffered. If we keep the OS's default value of pipe buffer size (64K on Linux and 16K-ish on macOS, according to https://unix.stackexchange.com/questions/11946/how-big-is-the-pipe-buffer), that means auto-scaling the buffer on Emacs's side depending on how much the process outputs. The effect on performance is similar to the previous (looping) patch (0.70 -> 0.65), and is comparable to bumping read-process-output-max to 65536. So if we do decide to bump the default, I suppose the below should not be necessary. And I don't know whether we should be concerned about fragmentation: this way buffers do get allocated in different sizes (almost always multiples of 4096, but with rare exceptions among larger values). diff --git a/src/process.c b/src/process.c index 2376d0f288d..13cf6d6c50d 100644 --- a/src/process.c +++ b/src/process.c @@ -6137,7 +6145,18 @@ specpdl_ref count = SPECPDL_INDEX (); Lisp_Object odeactivate; char *chars; +#ifdef USABLE_FIONREAD +#ifdef DATAGRAM_SOCKETS + if (!DATAGRAM_CHAN_P (channel)) +#endif + { + int available_read; + ioctl (p->infd, FIONREAD, &available_read); + readmax = MAX (readmax, available_read); + } +#endif + USE_SAFE_ALLOCA; chars = SAFE_ALLOCA (sizeof coding->carryover + readmax); What do people think? ^ permalink raw reply related [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-23 21:51 ` Dmitry Gutov @ 2023-09-24 5:29 ` Eli Zaretskii 2024-05-26 15:20 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-09-24 5:29 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Paul Eggert, stefankangas, 66020, monnier > Date: Sun, 24 Sep 2023 00:51:28 +0300 > From: Dmitry Gutov <dmitry@gutov.dev> > Cc: 66020@debbugs.gnu.org, monnier@iro.umontreal.ca, stefankangas@gmail.com > > But it seems I've found an answer to one previous question: "can we find > out how much is already buffered in advance?" > > The patch below asks that from the OS (how portable is this? not sure) > and allocates a larger buffer when more output has been buffered. If we > keep OS's default value of pipe buffer size (64K on Linux and 16K-ish on > macOS, according to > https://unix.stackexchange.com/questions/11946/how-big-is-the-pipe-buffer), > that means auto-scaling the buffer on Emacs's side depending on how much > the process outputs. The effect on performance is similar to the > previous (looping) patch (0.70 -> 0.65), and is comparable to bumping > read-process-output-max to 65536. > > So if we do decide to bump the default, I suppose the below should not > be necessary. And I don't know whether we should be concerned about > fragmentation: this way buffers do get allocates in different sizes > (almost always multiples of 4096, but with rare exceptions among larger > values). 
> > diff --git a/src/process.c b/src/process.c > index 2376d0f288d..13cf6d6c50d 100644 > --- a/src/process.c > +++ b/src/process.c > @@ -6137,7 +6145,18 @@ > specpdl_ref count = SPECPDL_INDEX (); > Lisp_Object odeactivate; > char *chars; > > +#ifdef USABLE_FIONREAD > +#ifdef DATAGRAM_SOCKETS > + if (!DATAGRAM_CHAN_P (channel)) > +#endif > + { > + int available_read; > + ioctl (p->infd, FIONREAD, &available_read); > + readmax = MAX (readmax, available_read); > + } > +#endif > + > USE_SAFE_ALLOCA; > chars = SAFE_ALLOCA (sizeof coding->carryover + readmax); > > What do people think? I think we should increase the default size, and the rest (querying the system about the pipe size) looks like an unnecessary complication to me. I've added Paul Eggert to this discussion, as I'd like to hear his opinions about this stuff. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-24 5:29 ` Eli Zaretskii @ 2024-05-26 15:20 ` Dmitry Gutov 2024-05-26 16:01 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2024-05-26 15:20 UTC (permalink / raw) To: Eli Zaretskii, Paul Eggert; +Cc: stefankangas, 66020, monnier Hi Paul and Eli, On 24/09/2023 08:29, Eli Zaretskii wrote: >> What do people think? > I think we should increase the default size, and the rest (querying > the system about the pipe size) looks like an unnecessary complication > to me. > > I've added Paul Eggert to this discussion, as I'd like to hear his > opinions about this stuff. Do you think we can get this in before the emacs-30 branch is cut? To summarize: * Patch 1 (reduces consing when the default filter is used by moving it into C - skipping the creation of Lisp strings). * Patch 2 a) It ensures that the dynamic binding of read-process-output-max is saved when the process is created and used for its processing - ensuring that the current dynamic value is not just used when creating the pipe, but also later when reading from it. b) A few lines are outdated: part of the fix went in with bug#66288. * Patch 3 can be replaced with just upping the default value of read-process-output-max to 65536. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2024-05-26 15:20 ` Dmitry Gutov @ 2024-05-26 16:01 ` Eli Zaretskii 2024-05-26 23:27 ` Stefan Kangas 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2024-05-26 16:01 UTC (permalink / raw) To: eggert, Dmitry Gutov; +Cc: stefankangas, 66020, monnier > Date: Sun, 26 May 2024 18:20:15 +0300 > Cc: 66020@debbugs.gnu.org, monnier@iro.umontreal.ca, stefankangas@gmail.com > From: Dmitry Gutov <dmitry@gutov.dev> > > Hi Paul and Eli, > > On 24/09/2023 08:29, Eli Zaretskii wrote: > > >> What do people think? > > I think we should increase the default size, and the rest (querying > > the system about the pipe size) looks like an unnecessary complication > > to me. > > > > I've added Paul Eggert to this discussion, as I'd like to hear his > > opinions about this stuff. > > Do you think we can get this in before the emacs-30 branch is cut? I'll try to recollect the discussion and review the patches one of these days. Paul, your input (as well as that of everybody else on the CC list) will be most welcome. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2024-05-26 16:01 ` Eli Zaretskii @ 2024-05-26 23:27 ` Stefan Kangas 2024-06-08 12:11 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Stefan Kangas @ 2024-05-26 23:27 UTC (permalink / raw) To: Eli Zaretskii, eggert, Dmitry Gutov; +Cc: 66020, monnier Eli Zaretskii <eliz@gnu.org> writes: > I'll try to recollect the discussion and review the patches one of > these days. > > Paul, your input (as well as that of everybody else on the CC list) > will be most welcome. FWIW, I'd be in favor of raising `read-process-output-max' to something like 40960 (as Eli suggested in this thread), or perhaps some power of 2 close to that like 32768 or 65536. This is based on it being seemingly faster in the benchmarks in this thread, and me having used that locally for 2-3 years and noting no adverse effects. See also the discussion here: https://lists.gnu.org/r/emacs-devel/2021-03/msg01461.html (I didn't review patch 1 and 2, so no opinion on those.) ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2024-05-26 23:27 ` Stefan Kangas @ 2024-06-08 12:11 ` Eli Zaretskii 2024-06-09 0:12 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2024-06-08 12:11 UTC (permalink / raw) To: Stefan Kangas, dmitry; +Cc: eggert, 66020, monnier > From: Stefan Kangas <stefankangas@gmail.com> > Date: Sun, 26 May 2024 16:27:19 -0700 > Cc: 66020@debbugs.gnu.org, monnier@iro.umontreal.ca > > Eli Zaretskii <eliz@gnu.org> writes: > > > I'll try to recollect the discussion and review the patches one of > > these days. > > > > Paul, your input (as well as that of everybody else on the CC list) > > will be most welcome. > > FWIW, I'd be in favor of raising `read-process-output-max' to something > like 40960 (as Eli suggested in this thread), or perhaps some power of 2 > close to that like 32768 or 65536. > > This is based on it being seemingly faster in the benchmarks in this > thread, and me having used that locally for 2-3 years and noting no > adverse effects. > > See also the discussion here: > https://lists.gnu.org/r/emacs-devel/2021-03/msg01461.html > > (I didn't review patch 1 and 2, so no opinion on those.) I guess we can install all 3 patches and see if anything breaks. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2024-06-08 12:11 ` Eli Zaretskii @ 2024-06-09 0:12 ` Dmitry Gutov 2024-06-11 3:12 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2024-06-09 0:12 UTC (permalink / raw) To: Eli Zaretskii, Stefan Kangas; +Cc: eggert, monnier, 66020-done On 08/06/2024 15:11, Eli Zaretskii wrote: >> From: Stefan Kangas<stefankangas@gmail.com> >> Date: Sun, 26 May 2024 16:27:19 -0700 >> Cc:66020@debbugs.gnu.org,monnier@iro.umontreal.ca >> >> Eli Zaretskii<eliz@gnu.org> writes: >> >>> I'll try to recollect the discussion and review the patches one of >>> these days. >>> >>> Paul, your input (as well as that of everybody else on the CC list) >>> will be most welcome. >> FWIW, I'd be in favor of raising `read-process-output-max' to something >> like 40960 (as Eli suggested in this thread), or perhaps some power of 2 >> close to that like 32768 or 65536. >> >> This is based on it being seemingly faster in the benchmarks in this >> thread, and me having used that locally for 2-3 years and noting no >> adverse effects. >> >> See also the discussion here: >> https://lists.gnu.org/r/emacs-devel/2021-03/msg01461.html >> >> (I didn't review patch 1 and 2, so no opinion on those.) > I guess we can install all 3 patches and see if anything breaks. Thank you, now pushed to master. For read-process-output-max, I chose the higher of the powers of two, but please feel free to amend. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2024-06-09 0:12 ` Dmitry Gutov @ 2024-06-11 3:12 ` Dmitry Gutov 2024-06-11 6:51 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2024-06-11 3:12 UTC (permalink / raw) To: Eli Zaretskii, Stefan Kangas; +Cc: eggert, monnier, 66020-done [-- Attachment #1: Type: text/plain, Size: 438 bytes --] On 09/06/2024 03:12, Dmitry Gutov wrote: >>> >> I guess we can install all 3 patches and see if anything breaks. > > Thank you, now pushed to master. On the heels of bug#71452, I've pushed three updates. Here's a third one (hopefully last) which I'd like a second opinion on. Process output -- apparently -- needs to be inserted before markers. Is it okay to make adjust_markers_for_insert non-static and call it here? See attached. [-- Attachment #2: read_and_insert_process_output_before_markers.diff --] [-- Type: text/x-patch, Size: 2004 bytes --] diff --git a/src/insdel.c b/src/insdel.c index 3809f8bc060..fbf71e1e595 100644 --- a/src/insdel.c +++ b/src/insdel.c @@ -284,7 +284,7 @@ adjust_markers_for_delete (ptrdiff_t from, ptrdiff_t from_byte, we advance it if either its insertion-type is t or BEFORE_MARKERS is true. 
*/ -static void +void adjust_markers_for_insert (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t to, ptrdiff_t to_byte, bool before_markers) { diff --git a/src/lisp.h b/src/lisp.h index e1911cbb660..21dada59132 100644 --- a/src/lisp.h +++ b/src/lisp.h @@ -4399,6 +4399,8 @@ verify (FLT_RADIX == 2 || FLT_RADIX == 16); ptrdiff_t, ptrdiff_t); extern void adjust_markers_for_delete (ptrdiff_t, ptrdiff_t, ptrdiff_t, ptrdiff_t); +extern void adjust_markers_for_insert (ptrdiff_t, ptrdiff_t, + ptrdiff_t, ptrdiff_t, bool); extern void adjust_markers_bytepos (ptrdiff_t, ptrdiff_t, ptrdiff_t, ptrdiff_t, int); extern void replace_range (ptrdiff_t, ptrdiff_t, Lisp_Object, bool, bool, diff --git a/src/process.c b/src/process.c index b6ec114e2b3..0bd3d068441 100644 --- a/src/process.c +++ b/src/process.c @@ -6406,7 +6406,7 @@ read_and_insert_process_output (struct Lisp_Process *p, char *buf, if (NILP (BVAR (XBUFFER (p->buffer), enable_multibyte_characters)) && ! CODING_MAY_REQUIRE_DECODING (process_coding)) { - insert_1_both (buf, nread, nread, 0, 0, 0); + insert_1_both (buf, nread, nread, 0, 0, 1); signal_after_change (PT - nread, 0, nread); } else @@ -6423,6 +6423,9 @@ read_and_insert_process_output (struct Lisp_Process *p, char *buf, specbind (Qinhibit_modification_hooks, Qt); decode_coding_c_string (process_coding, (unsigned char *) buf, nread, curbuf); + adjust_markers_for_insert (PT, PT_BYTE, + PT + process_coding->produced_char, + PT_BYTE + process_coding->produced, true); unbind_to (count1, Qnil); read_process_output_set_last_coding_system (p, process_coding); ^ permalink raw reply related [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2024-06-11 3:12 ` Dmitry Gutov @ 2024-06-11 6:51 ` Eli Zaretskii 2024-06-11 11:41 ` Dmitry Gutov 2024-06-11 12:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 2 replies; 213+ messages in thread From: Eli Zaretskii @ 2024-06-11 6:51 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 66020-done, eggert, stefankangas, monnier > Date: Tue, 11 Jun 2024 06:12:21 +0300 > From: Dmitry Gutov <dmitry@gutov.dev> > Cc: eggert@cs.ucla.edu, monnier@iro.umontreal.ca, 66020-done@debbugs.gnu.org > > Is it okay to make adjust_markers_for_insert non-static and call it > here? See attached. Technically, this is okay, but I'd like to hear from Stefan about whether it's correct to insert process output before the markers. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2024-06-11 6:51 ` Eli Zaretskii @ 2024-06-11 11:41 ` Dmitry Gutov 2024-06-11 12:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 0 replies; 213+ messages in thread From: Dmitry Gutov @ 2024-06-11 11:41 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 66020-done, eggert, stefankangas, monnier On 11/06/2024 09:51, Eli Zaretskii wrote: >> Date: Tue, 11 Jun 2024 06:12:21 +0300 >> From: Dmitry Gutov<dmitry@gutov.dev> >> Cc:eggert@cs.ucla.edu,monnier@iro.umontreal.ca,66020-done@debbugs.gnu.org >> >> Is it okay to make adjust_markers_for_insert non-static and call it >> here? See attached. > Technically, this is okay, but I'd like to hear from Stefan about > whether it's correct to insert process output before the markers. FWIW, internal-default-process-filter calls insert_from_string_before_markers. My goal is just to maintain compatibility. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2024-06-11 6:51 ` Eli Zaretskii 2024-06-11 11:41 ` Dmitry Gutov @ 2024-06-11 12:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-06-11 13:06 ` Eli Zaretskii 2024-06-11 17:15 ` Ihor Radchenko 1 sibling, 2 replies; 213+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-06-11 12:55 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Dmitry Gutov, eggert, stefankangas, 66020-done > Technically, this is okay, but I'd like to hear from Stefan about > whether it's correct to insert process output before the markers. AFAIK, when the insertion is done by the default process filter, it's indeed done "before the markers". I have no idea why it's done this way and would have preferred it to be different in many cases, but it's been that way forever and changing it would inevitably break some code. Stefan ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2024-06-11 12:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-06-11 13:06 ` Eli Zaretskii 2024-06-11 17:15 ` Ihor Radchenko 1 sibling, 0 replies; 213+ messages in thread From: Eli Zaretskii @ 2024-06-11 13:06 UTC (permalink / raw) To: Stefan Monnier; +Cc: dmitry, eggert, stefankangas, 66020 > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Dmitry Gutov <dmitry@gutov.dev>, stefankangas@gmail.com, > eggert@cs.ucla.edu, 66020-done@debbugs.gnu.org > Date: Tue, 11 Jun 2024 08:55:53 -0400 > > > Technically, this is okay, but I'd like to hear from Stefan about > > whether it's correct to insert process output before the markers. > > AFAIK, when the insertion is done by the default process filter, it's > indeed done "before the markers". I have no idea why it's done this > way and would have preferred it to be different in many cases, but it's > been that way forever and changing it would inevitably break some code. OK, thanks. I guess compatibility wins here. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2024-06-11 12:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-06-11 13:06 ` Eli Zaretskii @ 2024-06-11 17:15 ` Ihor Radchenko 2024-06-11 18:09 ` Dmitry Gutov 1 sibling, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2024-06-11 17:15 UTC (permalink / raw) To: Stefan Monnier Cc: Dmitry Gutov, Eli Zaretskii, eggert, stefankangas, 66020-done Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org> writes: >> Technically, this is okay, but I'd like to hear from Stefan about >> whether it's correct to insert process output before the markers. > > AFAIK, when the insertion is done by the default process filter, it's > indeed done "before the markers". I have no idea why it's done this > way and would have preferred it to be different in many cases, but it's > been that way forever and changing it would inevitably break some code. FYI, I just upgraded to the latest master, and observed some lines in *Async-native-compile-log* inserted not at the end, but on the second line before last. Is it something others also see? -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2024-06-11 17:15 ` Ihor Radchenko @ 2024-06-11 18:09 ` Dmitry Gutov 2024-06-11 19:33 ` Ihor Radchenko 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2024-06-11 18:09 UTC (permalink / raw) To: Ihor Radchenko, Stefan Monnier Cc: Eli Zaretskii, eggert, stefankangas, 66020-done On 11/06/2024 20:15, Ihor Radchenko wrote: > Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of > text editors"<bug-gnu-emacs@gnu.org> writes: > >>> Technically, this is okay, but I'd like to hear from Stefan about >>> whether it's correct to insert process output before the markers. >> AFAIK, when the insertion is done by the default process filter, it's >> indeed done "before the markers". I have no idea why it's done this >> way and would have preferred it to be different in many cases, but it's >> been that way forever and changing it would inevitably break some code. > FYI, I just upgraded to the latest master, and observed some lines in > *Async-native-compile-log* inserted not at the end, but on the second > line before last. > > Is it something others also see? I have now pushed the proposed patch (with "before markers" behavior). Could you rebuild and see whether the async-native-compile scenario improves? ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2024-06-11 18:09 ` Dmitry Gutov @ 2024-06-11 19:33 ` Ihor Radchenko 2024-06-11 20:00 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2024-06-11 19:33 UTC (permalink / raw) To: Dmitry Gutov Cc: 66020-done, Eli Zaretskii, eggert, Stefan Monnier, stefankangas Dmitry Gutov <dmitry@gutov.dev> writes: >> FYI, I just upgraded to the latest master, and observed some lines in >> *Async-native-compile-log* inserted not at the end, but on the second >> line before last. >> >> Is it something others also see? > > I have now pushed the proposed patch (with "before markers" behavior). > > Could you rebuild and see whether the async-native-compile scenario > improves? Back to normal now. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2024-06-11 19:33 ` Ihor Radchenko @ 2024-06-11 20:00 ` Dmitry Gutov 0 siblings, 0 replies; 213+ messages in thread From: Dmitry Gutov @ 2024-06-11 20:00 UTC (permalink / raw) To: Ihor Radchenko Cc: 66020-done, Eli Zaretskii, eggert, Stefan Monnier, stefankangas On 11/06/2024 22:33, Ihor Radchenko wrote: >>> FYI, I just upgraded to the latest master, and observed some lines in >>> *Async-native-compile-log* inserted not at the end, but on the second >>> line before last. >>> >>> Is it something others also see? >> I have now pushed the proposed patch (with "before markers" behavior). >> >> Could you rebuild and see whether the async-native-compile scenario >> improves? > Back to normal now. Perfect, thanks for checking. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max 2023-09-20 11:20 ` Eli Zaretskii 2023-09-21 0:57 ` Dmitry Gutov @ 2023-09-21 8:07 ` Stefan Kangas [not found] ` <b4f2135b-be9d-2423-02ac-9690de8b5a92@gutov.dev> 1 sibling, 1 reply; 213+ messages in thread From: Stefan Kangas @ 2023-09-21 8:07 UTC (permalink / raw) To: Eli Zaretskii, Dmitry Gutov, Stefan Monnier; +Cc: 66020 Eli Zaretskii <eliz@gnu.org> writes: > Stefan & Stefan: any comments or suggestions? FWIW, I've had the below snippet in my .emacs for the last two years, and haven't noticed any adverse effects. I never bothered making any actual benchmarks though: ;; Maybe faster: (setq read-process-output-max (max read-process-output-max (* 64 1024))) I added the above after the discussion here: https://lists.gnu.org/r/emacs-devel/2021-03/msg01461.html ^ permalink raw reply [flat|nested] 213+ messages in thread
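For anyone who would rather benchmark the setting than trust it, a rough sketch (function name mine; assumes GNU/Linux with `head` and `/dev/zero` available):

```elisp
;; Rough throughput test: read ~50MB from a subprocess under a given
;; value of `read-process-output-max'.  `benchmark-run' returns a list
;; (ELAPSED GC-COUNT GC-TIME).
(defun my/time-pipe-read (chunk-max)
  (let ((read-process-output-max chunk-max)
        (buf (generate-new-buffer " *pipe-bench*")))
    (unwind-protect
        (benchmark-run 1
          (let ((proc (make-process
                       :name "zeros" :buffer buf
                       :command '("head" "-c" "50000000" "/dev/zero"))))
            (while (process-live-p proc)
              (accept-process-output proc 1))))
      (kill-buffer buf))))

(list (my/time-pipe-read 4096)           ; the old default
      (my/time-pipe-read (* 64 1024)))   ; Stefan's setting
```

Note that the variable must be in effect when the process is created, since Emacs also uses it to size the pipe; binding it around the whole `make-process`/read loop, as above, covers both.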
[parent not found: <b4f2135b-be9d-2423-02ac-9690de8b5a92@gutov.dev>]
* bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max [not found] ` <b4f2135b-be9d-2423-02ac-9690de8b5a92@gutov.dev> @ 2023-09-21 13:17 ` Eli Zaretskii 0 siblings, 0 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-09-21 13:17 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 66020, stefankangas, monnier > Date: Thu, 21 Sep 2023 15:27:41 +0300 > Cc: 66020@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > > https://lists.gnu.org/r/emacs-devel/2021-03/msg01461.html > > The archive seems down (so I can't read this), but if you found a > tangible improvement from the above setting, you might also want to try > out the patch at the top of this bug report. It is back up now. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-25 2:41 ` Dmitry Gutov 2023-07-25 8:22 ` Ihor Radchenko @ 2023-07-25 18:42 ` Eli Zaretskii 2023-07-26 1:56 ` Dmitry Gutov 2023-07-25 19:16 ` sbaugh 2 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-25 18:42 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Tue, 25 Jul 2023 05:41:13 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > >> One thing to try is changing the -with-find implementation to use a > >> synchronous call, to compare (e.g. using 'process-file'). And repeat > >> these tests on GNU/Linux too. > > > > This still uses pipes, albeit without the pselect stuff. > > I'm attaching an extended benchmark, one that includes a "synchronous" > implementation as well. Please give it a spin as well. > > Here (GNU/Linux) the reported numbers look like this: > > > (my-bench 1 default-directory "") > > (("built-in" . "Elapsed time: 1.601649s (0.709108s in 22 GCs)") > ("with-find" . "Elapsed time: 1.792383s (1.135869s in 38 GCs)") > ("with-find-p" . "Elapsed time: 1.248543s (0.682827s in 20 GCs)") > ("with-find-sync" . "Elapsed time: 0.922291s (0.343497s in 10 GCs)")) Almost no change on Windows: (("built-in" . "Elapsed time: 1.218750s (0.078125s in 5 GCs)") ("with-find" . "Elapsed time: 8.984375s (0.109375s in 7 GCs)") ("with-find-p" . "Elapsed time: 8.718750s (0.046875s in 3 GCs)") ("with-find-sync" . "Elapsed time: 8.921875s (0.046875s in 3 GCs)")) I'm beginning to suspect the implementation of pipes (and IPC in general). How else can such slowdown be explained? ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-25 18:42 ` bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Eli Zaretskii @ 2023-07-26 1:56 ` Dmitry Gutov 2023-07-26 2:28 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Dmitry Gutov @ 2023-07-26 1:56 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 25/07/2023 21:42, Eli Zaretskii wrote: > Almost no change on Windows: > > (("built-in" . "Elapsed time: 1.218750s (0.078125s in 5 GCs)") > ("with-find" . "Elapsed time: 8.984375s (0.109375s in 7 GCs)") > ("with-find-p" . "Elapsed time: 8.718750s (0.046875s in 3 GCs)") > ("with-find-sync" . "Elapsed time: 8.921875s (0.046875s in 3 GCs)")) > > I'm beginning to suspect the implementation of pipes (and IPC in > general). How else can such slowdown be explained? Seems so (I'm not well-versed in the lower level details, alas). Your other idea (spending time in text conversion) also sounds plausible, but I don't know whether this much overhead can be explained by it. And don't we have to convert any process's output to our internal encoding anyway, on any platform? ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-26 1:56 ` Dmitry Gutov @ 2023-07-26 2:28 ` Eli Zaretskii 2023-07-26 2:35 ` Dmitry Gutov 0 siblings, 1 reply; 213+ messages in thread From: Eli Zaretskii @ 2023-07-26 2:28 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, yantar92, 64735 > Date: Wed, 26 Jul 2023 04:56:20 +0300 > Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, > 64735@debbugs.gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > Your other idea (spending time in text conversion) also sounds > plausible, but I don't know whether this much overhead can be explained > by it. And don't we have to convert any process's output to our internal > encoding anyway, on any platform? We do, but you-all probably run your tests on a system where the external encoding is UTF-8, right? That is much faster. ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-26 2:28 ` Eli Zaretskii @ 2023-07-26 2:35 ` Dmitry Gutov 0 siblings, 0 replies; 213+ messages in thread From: Dmitry Gutov @ 2023-07-26 2:35 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, sbaugh, yantar92, 64735 On 26/07/2023 05:28, Eli Zaretskii wrote: >> Date: Wed, 26 Jul 2023 04:56:20 +0300 >> Cc:luangruo@yahoo.com,sbaugh@janestreet.com,yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov<dmitry@gutov.dev> >> >> Your other idea (spending time in text conversion) also sounds >> plausible, but I don't know whether this much overhead can be explained >> by it. And don't we have to convert any process's output to our internal >> encoding anyway, on any platform? > We do, but you-all probably run your tests on a system where the > external encoding is UTF-8, right? That is much faster. I do. I suppose that transcoding can/uses the short-circuit approach, avoiding extra copying when the memory representations match. It should be possible to measure the encoding's overhead by checking how big the output is, testing our code on a smaller string, and multiplying. Or, more roughly, by piping it to "iconv -f Windows-1251 -t UTF-8" and measuring how long it will take to finish (if our encoding takes longer, that could point to an optimization opportunity as well). ^ permalink raw reply [flat|nested] 213+ messages in thread
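The measurement Dmitry suggests can also be approximated entirely in Lisp, without iconv; a sketch (sample size arbitrary):

```elisp
;; Compare decoding cost for UTF-8 (where plain ASCII can take a fast
;; path) versus an 8-bit codepage, on a 10MB ASCII sample.
(let* ((sample  (make-string (* 10 1024 1024) ?a))
       (as-utf8 (encode-coding-string sample 'utf-8))
       (as-1251 (encode-coding-string sample 'windows-1251)))
  (list (benchmark-run 1 (decode-coding-string as-utf8 'utf-8))
        (benchmark-run 1 (decode-coding-string as-1251 'windows-1251))))
```

If the second timing dwarfs the first, that would support the transcoding theory for the non-UTF-8 system.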
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-25 2:41 ` Dmitry Gutov 2023-07-25 8:22 ` Ihor Radchenko 2023-07-25 18:42 ` bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Eli Zaretskii @ 2023-07-25 19:16 ` sbaugh 2023-07-26 2:28 ` Dmitry Gutov 2 siblings, 1 reply; 213+ messages in thread From: sbaugh @ 2023-07-25 19:16 UTC (permalink / raw) To: Dmitry Gutov; +Cc: luangruo, sbaugh, Eli Zaretskii, yantar92, 64735 Dmitry Gutov <dmitry@gutov.dev> writes: >> (my-bench 1 default-directory "") > > (("built-in" . "Elapsed time: 1.601649s (0.709108s in 22 GCs)") > ("with-find" . "Elapsed time: 1.792383s (1.135869s in 38 GCs)") > ("with-find-p" . "Elapsed time: 1.248543s (0.682827s in 20 GCs)") > ("with-find-sync" . "Elapsed time: 0.922291s (0.343497s in 10 GCs)")) Tangent, but: Ugh, wow, call-process really is a lot faster than make-process. I see now why people disliked my idea of replacing call-process with something based on make-process, this is a big difference... There's zero reason it has to be so slow... maybe I should try to make a better make-process API and implementation which is actually fast. (without worrying about being constrained by compatibility with something that's already dog-slow) ^ permalink raw reply [flat|nested] 213+ messages in thread
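The gap under discussion is easy to reproduce; a minimal sketch of the two styles (function names mine, `find` assumed on PATH):

```elisp
;; Same command run synchronously (`call-process') and asynchronously
;; (`make-process' plus a polling loop); both collect the output into
;; a temporary buffer, and `benchmark-run' reports elapsed and GC time.
(defun my/find-sync (dir)
  (with-temp-buffer
    (benchmark-run 1
      (call-process "find" nil t nil dir "-type" "f"))))

(defun my/find-async (dir)
  (with-temp-buffer
    (benchmark-run 1
      (let ((proc (make-process :name "find" :buffer (current-buffer)
                                :command (list "find" dir "-type" "f")
                                :connection-type 'pipe)))
        (while (process-live-p proc)
          (accept-process-output proc 1))))))

;; e.g. (list (my/find-sync "~/src") (my/find-async "~/src"))
```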
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-25 19:16 ` sbaugh @ 2023-07-26 2:28 ` Dmitry Gutov 0 siblings, 0 replies; 213+ messages in thread From: Dmitry Gutov @ 2023-07-26 2:28 UTC (permalink / raw) To: sbaugh; +Cc: luangruo, sbaugh, Eli Zaretskii, yantar92, 64735 On 25/07/2023 22:16, sbaugh@catern.com wrote: > Dmitry Gutov <dmitry@gutov.dev> writes: >>> (my-bench 1 default-directory "") >> >> (("built-in" . "Elapsed time: 1.601649s (0.709108s in 22 GCs)") >> ("with-find" . "Elapsed time: 1.792383s (1.135869s in 38 GCs)") >> ("with-find-p" . "Elapsed time: 1.248543s (0.682827s in 20 GCs)") >> ("with-find-sync" . "Elapsed time: 0.922291s (0.343497s in 10 GCs)")) > > Tangent, but: > > Ugh, wow, call-process really is a lot faster than make-process. I see > now why people disliked my idea of replacing call-process with something > based on make-process, this is a big difference... More like forewarned. Do we want to exchange 25% of performance for extra reactivity? We might. But we'd probably put that behind a pref and have to maintain two implementations. > There's zero reason it has to be so slow... maybe I should try to make a > better make-process API and implementation which is actually fast. > (without worrying about being constrained by compatibility with > something that's already dog-slow) I don't know if the API itself is at fault. The first step should be to investigate which part of the current one is actually slow, I think. But then, of course, if improved performance really requires a change in the API, we can switch to some new one too (while having to maintain at least two implementations for a number of years). BTW, looking at the difference between the with-find-* approaches' performance, it seems like most of it comes down to GC. Any chance we're doing extra copying of strings even when we don't have to, or some inefficient copying -- compared to the sync implementation? E.g. 
we could use the "fast" approach at least when the :filter is not specified (which is the case in the first impl, "with-find"). The manual says this: The default filter simply outputs directly to the process buffer. Perhaps it's worth looking at. ^ permalink raw reply [flat|nested] 213+ messages in thread
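For reference, the "ordinary insertion filter" pattern from the Elisp manual is roughly what an explicit equivalent of the default filter looks like; timing a process with and without such a filter isolates the cost of dispatching into Lisp for every chunk of output (note it uses plain `insert`, not `insert-before-markers`):

```elisp
;; An explicit filter roughly equivalent to the default one, adapted
;; from the pattern in the Elisp manual: insert the chunk at the
;; process mark, advance the mark, and follow it if point was there.
(defun my/insert-filter (proc string)
  (when (buffer-live-p (process-buffer proc))
    (with-current-buffer (process-buffer proc)
      (let ((moving (= (point) (process-mark proc))))
        (save-excursion
          (goto-char (process-mark proc))
          (insert string)
          (set-marker (process-mark proc) (point)))
        (if moving (goto-char (process-mark proc)))))))
```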
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-19 21:16 bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Spencer Baugh 2023-07-20 5:00 ` Eli Zaretskii 2023-07-20 12:38 ` Dmitry Gutov @ 2023-07-21 2:42 ` Richard Stallman 2023-07-22 2:39 ` Richard Stallman 2023-07-22 10:18 ` Ihor Radchenko 3 siblings, 1 reply; 213+ messages in thread From: Richard Stallman @ 2023-07-21 2:42 UTC (permalink / raw) To: Spencer Baugh; +Cc: 64735 [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] I will take a look at this. In case they are reluctant because of being busy, would anyone like to help out by writing the code to do the optimization? -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-21 2:42 ` Richard Stallman @ 2023-07-22 2:39 ` Richard Stallman 2023-07-22 5:49 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: Richard Stallman @ 2023-07-22 2:39 UTC (permalink / raw) To: sbaugh, 64735 [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] Since people are making a lot of headway on optimizing this in Emacs, I won't trouble the Find maintainers for now. I wonder if it is possible to detect many cases in which the file-name handlers won't actually do anything, and bind file-name-handler-alist to nil for those. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 2:39 ` Richard Stallman @ 2023-07-22 5:49 ` Eli Zaretskii 0 siblings, 0 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-07-22 5:49 UTC (permalink / raw) To: rms, Michael Albinus; +Cc: sbaugh, 64735 > From: Richard Stallman <rms@gnu.org> > Date: Fri, 21 Jul 2023 22:39:41 -0400 > > I wonder if it is possible to detect many cases in which > the file-name handlers won't actually do anything, and bind > file-name-hander-list to nil for those. I think we already do, but perhaps we could try harder. ^ permalink raw reply [flat|nested] 213+ messages in thread
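What RMS describes is commonly done ad hoc in user configs; a sketch (macro name mine; this is only safe when no handled names, e.g. remote /ssh: paths or auto-compressed files, can be involved):

```elisp
;; Run a file-heavy operation with all file-name handlers disabled,
;; skipping the handler lookup that is otherwise done per file name.
(defmacro my/without-file-handlers (&rest body)
  `(let ((file-name-handler-alist nil))
     ,@body))

;; Example use, assuming a purely local tree:
;; (my/without-file-handlers
;;  (directory-files-recursively "/tmp" "\\.el\\'"))
```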
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-19 21:16 bug#64735: 29.0.92; find invocations are ~15x slower because of ignores Spencer Baugh ` (2 preceding siblings ...) 2023-07-21 2:42 ` Richard Stallman @ 2023-07-22 10:18 ` Ihor Radchenko 2023-07-22 10:42 ` sbaugh 3 siblings, 1 reply; 213+ messages in thread From: Ihor Radchenko @ 2023-07-22 10:18 UTC (permalink / raw) To: Spencer Baugh; +Cc: 64735 Spencer Baugh <sbaugh@janestreet.com> writes: > - we could use our own recursive directory-tree walking implementation > (directory-files-recursively), if we found a nice way to pipe its output > directly to grep etc without going through Lisp. (This could be nice > for project-files, at least) May you elaborate this idea? -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 10:18 ` Ihor Radchenko @ 2023-07-22 10:42 ` sbaugh 2023-07-22 12:00 ` Eli Zaretskii 0 siblings, 1 reply; 213+ messages in thread From: sbaugh @ 2023-07-22 10:42 UTC (permalink / raw) To: Ihor Radchenko; +Cc: Spencer Baugh, 64735 Ihor Radchenko <yantar92@posteo.net> writes: > Spencer Baugh <sbaugh@janestreet.com> writes: > >> - we could use our own recursive directory-tree walking implementation >> (directory-files-recursively), if we found a nice way to pipe its output >> directly to grep etc without going through Lisp. (This could be nice >> for project-files, at least) > > May you elaborate this idea? One of the reasons directory-files-recursively is slow is because it allocates memory inside Emacs. If we piped its output directly to grep, that overhead would be removed. On reflection, though, as I've posted elsewhere in this thread: This is a bad idea and is inherently slower than find, because directory-files-recursively does not run in parallel with Emacs (and never will). ^ permalink raw reply [flat|nested] 213+ messages in thread
* bug#64735: 29.0.92; find invocations are ~15x slower because of ignores 2023-07-22 10:42 ` sbaugh @ 2023-07-22 12:00 ` Eli Zaretskii 0 siblings, 0 replies; 213+ messages in thread From: Eli Zaretskii @ 2023-07-22 12:00 UTC (permalink / raw) To: sbaugh; +Cc: sbaugh, yantar92, 64735 > Cc: Spencer Baugh <sbaugh@janestreet.com>, 64735@debbugs.gnu.org > From: sbaugh@catern.com > Date: Sat, 22 Jul 2023 10:42:06 +0000 (UTC) > > Ihor Radchenko <yantar92@posteo.net> writes: > > Spencer Baugh <sbaugh@janestreet.com> writes: > > > >> - we could use our own recursive directory-tree walking implementation > >> (directory-files-recursively), if we found a nice way to pipe its output > >> directly to grep etc without going through Lisp. (This could be nice > >> for project-files, at least) > > > > May you elaborate this idea? > > One of the reasons directory-files-recursively is slow is because it > allocates memory inside Emacs. If we piped its output directly to grep, > that overhead would be removed. How can you do anything in Emacs without allocating memory? ^ permalink raw reply [flat|nested] 213+ messages in thread
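For reference, the idea Ihor asked about would look roughly like this (function and buffer names mine); note that the file names still pass through Lisp on their way to grep, which is the limitation Spencer concedes above:

```elisp
;; Walk the tree in Lisp and feed the NUL-separated names to grep's
;; stdin via xargs, instead of having find produce them.
(defun my/grep-files (regexp dir)
  (let ((proc (make-process
               :name "grep" :buffer (get-buffer-create "*my-grep*")
               :command (list "xargs" "-0" "grep" "-nH" "-e" regexp))))
    (dolist (file (directory-files-recursively dir "."))
      (process-send-string proc (concat file "\0")))
    (process-send-eof proc)
    proc))
```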
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).