* bug#74243: [PATCH] Speed up vc-hg-state by treating ignored files as unregistered
@ 2024-11-07 17:41 Spencer Baugh via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-11-17  1:03 ` Sean Whitton
  0 siblings, 1 reply; 9+ messages in thread
From: Spencer Baugh via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-11-07 17:41 UTC (permalink / raw)
  To: 74243

[-- Attachment #1: Type: text/plain, Size: 1562 bytes --]

Tags: patch

The most significant slow component of "hg status" is parsing the
.hgignore file.  If we pass -mardc instead of -A to hg status, hg
doesn't list ignored or untracked files, so it skips parsing the
.hgignore.  On my large repo, this brings "hg status" from 140ms to
20ms.

For vc-hg-state, the distinction doesn't matter: nothing using the
output of vc-hg-state has significantly different behavior for ignored
files vs unregistered files:

- vc-dir-clean-files and vc-dir-recompute-file-state call vc-hg-state,
  but will never see an ignored file anyway since vc-dir shouldn't list
  ignored files for hg.
- vc-next-action checks 'ignored, but it's OK to take the
  'unregistered path instead; it will either fail when calling hg, or
  succeed.
- Other users of vc-state don't differ between 'ignored and
  'unregistered

* lisp/vc/vc-hg.el (vc-hg-state-slow): Treat ignored files as
unregistered.

In GNU Emacs 29.2.50 (build 6, x86_64-pc-linux-gnu, X toolkit, cairo
 version 1.15.12, Xaw scroll bars) of 2024-11-06 built on
 igm-qws-u22796a
Repository revision: 18ed746717c1c80e5cc9d9dc85b6e1f4013a1cec
Repository branch: emacs-29
Windowing system distributor 'The X.Org Foundation', version 11.0.12011000
System Description: Rocky Linux 8.10 (Green Obsidian)

Configured using:
 'configure --with-x-toolkit=lucid --without-gpm --without-gconf
 --without-selinux --without-imagemagick --with-modules --with-gif=no
 --with-tree-sitter --with-native-compilation=aot
 PKG_CONFIG_PATH=/usr/local/home/garnish/libtree-sitter/0.22.6-1/lib/pkgconfig/'

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Speed-up-vc-hg-state-by-treating-ignored-files-as-un.patch --]
[-- Type: text/patch, Size: 2338 bytes --]

From 591dd8d7d5e23fd03696f9a536d7f40e77ea2de4 Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh@janestreet.com>
Date: Thu, 7 Nov 2024 12:28:49 -0500
Subject: [PATCH] Speed up vc-hg-state by treating ignored files as
 unregistered

The most significant slow component of "hg status" is parsing the
.hgignore file.  If we pass -mardc instead of -A to hg status, hg
doesn't list ignored or untracked files, so it skips parsing the
.hgignore.  On my large repo, this brings "hg status" from 140ms to
20ms.

For vc-hg-state, the distinction doesn't matter: nothing using the
output of vc-hg-state has significantly different behavior for ignored
files vs unregistered files:

- vc-dir-clean-files and vc-dir-recompute-file-state call vc-hg-state,
  but will never see an ignored file anyway since vc-dir shouldn't list
  ignored files for hg.
- vc-next-action checks 'ignored, but it's OK to take the
  'unregistered path instead; it will either fail when calling hg, or
  succeed.
- Other users of vc-state don't differ between 'ignored and
  'unregistered

* lisp/vc/vc-hg.el (vc-hg-state-slow): Treat ignored files as
unregistered.
---
 lisp/vc/vc-hg.el | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lisp/vc/vc-hg.el b/lisp/vc/vc-hg.el
index 856bea66a6f..2dbd1285318 100644
--- a/lisp/vc/vc-hg.el
+++ b/lisp/vc/vc-hg.el
@@ -245,7 +245,7 @@ vc-hg-state-slow
                      "--config" "ui.report_untrusted=0"
                      "--config" "alias.status=status"
                      "--config" "defaults.status="
-                     "status" "-A" (file-relative-name file)))
+                     "status" "-mardc" (file-relative-name file)))
               ;; Some problem happened.  E.g. We can't find an `hg'
               ;; executable.
               (error nil)))))))
@@ -260,12 +260,12 @@ vc-hg-state-slow
          ((eq state ?=) 'up-to-date)
          ((eq state ?A) 'added)
          ((eq state ?M) 'edited)
-         ((eq state ?I) 'ignored)
          ((eq state ?R) 'removed)
          ((eq state ?!) 'missing)
-         ((eq state ??) 'unregistered)
          ((eq state ?C) 'up-to-date) ;; Older mercurial versions use this.
-         (t 'up-to-date))))))
+         ;; Ignored or untracked files don't show up; they're both
+         ;; treated as unregistered.
+         (t 'unregistered))))))
 
 (defun vc-hg-working-revision (_file)
   "Hg-specific version of `vc-working-revision'."
-- 
2.39.3

^ permalink raw reply related [flat|nested] 9+ messages in thread
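The second hunk of the patch can be read as a plain mapping from the first character of the "hg status" output to a VC state symbol. The following is a minimal sketch of that mapping in Emacs Lisp, not the code from vc-hg.el. The function name my/hg-status-char-to-state is hypothetical, and passing nil to mean "hg printed nothing for this file" is an assumption of the sketch: the diff does not show how vc-hg-state-slow handles empty output.

```elisp
;; Sketch only: mirrors the `cond' from the patched hunk.
;; `my/hg-status-char-to-state' is a hypothetical name, not part of vc-hg.el.
(defun my/hg-status-char-to-state (char)
  "Map CHAR, the status character printed by \"hg status -mardc\", to a state.
CHAR may be nil, which this sketch assumes to mean that hg printed
nothing for the file."
  (cond
   ((eq char ?=) 'up-to-date)
   ((eq char ?A) 'added)
   ((eq char ?M) 'edited)
   ((eq char ?R) 'removed)
   ((eq char ?!) 'missing)
   ((eq char ?C) 'up-to-date)   ; older Mercurial versions use this
   ;; With -mardc, ignored and untracked files are not listed, so
   ;; anything else falls through to `unregistered'.
   (t 'unregistered)))
```

With this mapping, (my/hg-status-char-to-state ?I) no longer yields 'ignored, which is exactly the loss of information discussed in the replies.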
* bug#74243: [PATCH] Speed up vc-hg-state by treating ignored files as unregistered 2024-11-07 17:41 bug#74243: [PATCH] Speed up vc-hg-state by treating ignored files as unregistered Spencer Baugh via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-11-17 1:03 ` Sean Whitton [not found] ` <ierttc3z71g.fsf@janestreet.com> 0 siblings, 1 reply; 9+ messages in thread From: Sean Whitton @ 2024-11-17 1:03 UTC (permalink / raw) To: 74243; +Cc: Spencer Baugh Hello, On Thu 07 Nov 2024 at 12:41pm -05, Spencer Baugh via "Bug reports for GNU Emacs, the Swiss army knife of text editors" wrote: > The most significant slow component of "hg status" is parsing the > .hgignore file. If we pass -mardc instead of -A to hg status, hg > doesn't list ignored or untracked files, so it skips parsing the > .hgignore. On my large repo, this brings "hg status" from 140ms to > 20ms. Thanks for investigating this. > For vc-hg-state, the distinction doesn't matter: nothing using the > output of vc-hg-state has significantly different behavior for ignored > files vs unregistered files: > - vc-dir-clean-files and vc-dir-recompute-file-state call vc-hg-state, > but will never see an ignored file anyway since vc-dir shouldn't list > ignored files for hg. > - vc-next-action checks 'ignored, but it's OK to take the > 'unregistered path instead; it will either fail when calling hg, or > succeed. > - Other users of vc-state don't differ between 'ignored and > 'unregistered In vc-dir for a git repo, if I type 'G' on an unregistered file and then 'g' to refresh the view, the status label next to the unregistered file changes to "ignored". ISTM this is a nice feature that allows you to confirm that the addition to .gitignore worked. If that doesn't currently work for Hg, someone might want to implement it at some point. vc-state is a public function, so someone might well be relying on it returning 'ignored, for some other purpose. 
So, can we do this with a new optional argument to vc-state?  Like how vc-deduce-fileset can provide more information if STATE-MODEL-ONLY-FILES is non-nil.  It could be an optional argument that means to treat 'ignored and 'unregistered the same.  Or something similar.

It seems well-motivated to add an argument to the general vc-state because it's an operation that can be slow in large repos regardless of the backend.  Though we could start by just adding an optional argument to vc-hg-state.  Or, less invasive would be a vc-hg--status-internal which does it.

-- 
Sean Whitton

^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <ierttc3z71g.fsf@janestreet.com>]
* bug#74243: [PATCH] Speed up vc-hg-state by treating ignored files as unregistered
       [not found]   ` <ierttc3z71g.fsf@janestreet.com>
@ 2024-11-26  7:52     ` Sean Whitton
  2024-11-26 23:26       ` Dmitry Gutov
  0 siblings, 1 reply; 9+ messages in thread
From: Sean Whitton @ 2024-11-26  7:52 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: 74243

Hello,

On Tue 19 Nov 2024 at 08:28am -05, Spencer Baugh wrote:

> The original motivation of optimizing 'state is to speed up
> vc-refresh-state and vc-after-save, since they call vc-hg-state in
> find-file-hook and after-save-hook, which adds very noticeable
> latency.  (I care less about the speed of vc-dir, which uses
> 'dir-status-files not 'state)
>
> So those functions would need to pass this argument.  But they (through
> vc-state-refresh) then store the returned state in 'vc-state in the VC
> per-file properties, so anyone accessing that will also see the effect
> of this argument.

I see what you mean.  Quoting vc-state,

| A return of nil from this function means we have no information on the
| status of this file.
| [...]
| `ignored'          The file showed up in a dir-status listing with a flag
|                    indicating the version-control system is ignoring it,
|                    Note: This property is not set reliably (some VCSes
|                    don't have useful directory-status commands) so assume
|                    that any file with vc-state nil might be ignorable
|                    without VC knowing it.
|
| `unregistered'     The file is not under version control."
|
|   ;; Note: we usually return nil here for unregistered files anyway
|   ;; when called with only one argument.  This doesn't seem to cause
|   ;; any problems.  But if we wanted to change that, we should
|   ;; probably opt for redefining the `registered' command to return
|   ;; non-nil even for unregistered files (maybe also rename it), and
|   ;; then make sure that all `state' implementations handle
|   ;; unregistered file appropriately.

(I think there's a mistake here: an ignored file is not a file "under version control", so `unregistered' should say "not under version control and not ignored". Would you agree?) Thanks for pointing out the involvement of find-file-hook and after-save-hook. The problem you describe is not at all Hg-specific: vc-state gets called in a context where speed matters, but it's also the primary entry point for any code that wants to know the state of a file, some of which might care more about accuracy than speed. To put it another way, the code assumes throughout that finding out the file state will always be fast. But it also assumes the information is accurate if present. This makes me queasy about your original patch. It does not seem wise to return something we don't know to be true only on the basis that it all works out fine for now. The 'nil' return value might provide us with a way out, however. Could we add an optional argument to vc-state that means "just return nil if finding out the state properly might be slow"? Could we make vc-after-save and the relevant find-file-hook entry pass that option through, and do something sensible with a nil return value? If they get nil, they would clear out the saved property, and possibly update the mode line display to "????" or something. Maybe we'd want a user option (that could go in your large repo's .dir-locals.el, so it's set-and-forget) to opt-in to not knowing the file state as often. -- Sean Whitton ^ permalink raw reply [flat|nested] 9+ messages in thread
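The "just return nil if finding out the state properly might be slow" idea above could look roughly like the following sketch. It is written as a wrapper rather than as a new argument to vc-state itself, so it is not the proposed change, only an illustration of the calling convention. The names my/vc-slow-state-backends and my/vc-state-or-nil are hypothetical; only vc-backend and vc-state are existing functions from vc-hooks.el. Which backends count as slow is an assumption the user would have to supply.

```elisp
(require 'vc-hooks)

;; Hypothetical user option: backends whose `state' call is assumed slow.
(defvar my/vc-slow-state-backends '(Hg)
  "Backends for which `my/vc-state-or-nil' skips the state query.
This list is an assumption of the sketch, not an existing VC option.")

(defun my/vc-state-or-nil (file &optional fast-only)
  "Return the VC state of FILE, like `vc-state'.
If FAST-ONLY is non-nil and FILE's backend is listed in
`my/vc-slow-state-backends', return nil instead, meaning \"no
information\", rather than running a possibly slow query."
  (let ((backend (vc-backend file)))
    (cond
     ;; No backend is responsible for FILE: nothing to report.
     ((null backend) nil)
     ;; The caller asked for speed and the backend is assumed slow.
     ((and fast-only (memq backend my/vc-slow-state-backends)) nil)
     ;; Otherwise fall back to the ordinary, accurate query.
     (t (vc-state file backend)))))
```

A hook that cares about latency would call (my/vc-state-or-nil file t) and treat nil as unknown, while code that needs accuracy would omit the second argument.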
* bug#74243: [PATCH] Speed up vc-hg-state by treating ignored files as unregistered 2024-11-26 7:52 ` Sean Whitton @ 2024-11-26 23:26 ` Dmitry Gutov 2024-11-26 23:32 ` Spencer Baugh via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 9+ messages in thread From: Dmitry Gutov @ 2024-11-26 23:26 UTC (permalink / raw) To: Sean Whitton, Spencer Baugh; +Cc: 74243 Hi, just to interject a little. On 26/11/2024 09:52, Sean Whitton wrote: > (I think there's a mistake here: an ignored file is not a file "under > version control", so `unregistered' should say "not under version > control and not ignored". Would you agree?) > > Thanks for pointing out the involvement of find-file-hook and > after-save-hook. The problem you describe is not at all Hg-specific: > vc-state gets called in a context where speed matters, but it's also the > primary entry point for any code that wants to know the state of a file, > some of which might care more about accuracy than speed. > > To put it another way, the code assumes throughout that finding out the > file state will always be fast. But it also assumes the information is > accurate if present. This makes me queasy about your original patch. > It does not seem wise to return something we don't know to be true only > on the basis that it all works out fine for now. This FR reminds me of a similar change in vc-git-state that we ended up installing in bug#11757 (in 2012). Then stayed with it until 2017 when bug#19343 was filed and fixed (provided a recent enough Git is used) - see also the problem scenario described there. So it seems both a reasonable change and ultimately not ideal. Depending on how many users we think might be affected by performance here. > The 'nil' return value might provide us with a way out, however. > Could we add an optional argument to vc-state that means "just return > nil if finding out the state properly might be slow"? 
> Could we make vc-after-save and the relevant find-file-hook entry pass > that option through, and do something sensible with a nil return value? > > If they get nil, they would clear out the saved property, and possibly > update the mode line display to "????" or something. Maybe we'd want a > user option (that could go in your large repo's .dir-locals.el, so it's > set-and-forget) to opt-in to not knowing the file state as often. A user option seems like an easier choice. Solutions that clear cache under some conditions or other tend to be more complex, and slow down at least some combined scenarios (e.g. one of my use cases is saving the buffer and having diff-hl-mode use its vc state from after-save-hook). ^ permalink raw reply [flat|nested] 9+ messages in thread
* bug#74243: [PATCH] Speed up vc-hg-state by treating ignored files as unregistered 2024-11-26 23:26 ` Dmitry Gutov @ 2024-11-26 23:32 ` Spencer Baugh via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-11-27 0:18 ` Dmitry Gutov 0 siblings, 1 reply; 9+ messages in thread From: Spencer Baugh via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-11-26 23:32 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 74243, Sean Whitton Dmitry Gutov <dmitry@gutov.dev> writes: > Hi, just to interject a little. > > On 26/11/2024 09:52, Sean Whitton wrote: >> (I think there's a mistake here: an ignored file is not a file "under >> version control", so `unregistered' should say "not under version >> control and not ignored". Would you agree?) >> Thanks for pointing out the involvement of find-file-hook and >> after-save-hook. The problem you describe is not at all Hg-specific: >> vc-state gets called in a context where speed matters, but it's also the >> primary entry point for any code that wants to know the state of a file, >> some of which might care more about accuracy than speed. >> To put it another way, the code assumes throughout that finding out >> the >> file state will always be fast. But it also assumes the information is >> accurate if present. This makes me queasy about your original patch. >> It does not seem wise to return something we don't know to be true only >> on the basis that it all works out fine for now. > > This FR reminds me of a similar change in vc-git-state that we ended > up installing in bug#11757 (in 2012). Then stayed with it until 2017 > when bug#19343 was filed and fixed (provided a recent enough Git is > used) - see also the problem scenario described there. > > So it seems both a reasonable change and ultimately not > ideal. Depending on how many users we think might be affected by > performance here. > >> The 'nil' return value might provide us with a way out, however. 
>> Could we add an optional argument to vc-state that means "just return >> nil if finding out the state properly might be slow"? >> Could we make vc-after-save and the relevant find-file-hook entry pass >> that option through, and do something sensible with a nil return value? >> If they get nil, they would clear out the saved property, and >> possibly >> update the mode line display to "????" or something. Maybe we'd want a >> user option (that could go in your large repo's .dir-locals.el, so it's >> set-and-forget) to opt-in to not knowing the file state as often. > > A user option seems like an easier choice. > > Solutions that clear cache under some conditions or other tend to be > more complex, and slow down at least some combined scenarios (e.g. one > of my use cases is saving the buffer and having diff-hl-mode use its > vc state from after-save-hook). These are all reasonable concerns. Since posting my original patch, though, I've heard from Hg developers that they plan to eventually implement per-directory ignore files, like in Git. That would remove the original performance problem, so maybe this is not so important. That being said, it's still sad that the vc hooks in find-file-hook and after-save-hook can hurt performance so much. It seems to me that the ideal outcome would be to support asynchronicity in those hooks. That would benefit all vc backends... though this is perhaps quite difficult. Maybe asynchronously populating the saved vc-state property would work? With some clever usage of nil return values as Sean describes... ^ permalink raw reply [flat|nested] 9+ messages in thread
* bug#74243: [PATCH] Speed up vc-hg-state by treating ignored files as unregistered 2024-11-26 23:32 ` Spencer Baugh via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-11-27 0:18 ` Dmitry Gutov 2024-11-29 8:17 ` Sean Whitton 0 siblings, 1 reply; 9+ messages in thread From: Dmitry Gutov @ 2024-11-27 0:18 UTC (permalink / raw) To: Spencer Baugh; +Cc: 74243, Sean Whitton On 27/11/2024 01:32, Spencer Baugh wrote: >> A user option seems like an easier choice. >> >> Solutions that clear cache under some conditions or other tend to be >> more complex, and slow down at least some combined scenarios (e.g. one >> of my use cases is saving the buffer and having diff-hl-mode use its >> vc state from after-save-hook). > > These are all reasonable concerns. > > Since posting my original patch, though, I've heard from Hg developers > that they plan to eventually implement per-directory ignore files, like > in Git. That would remove the original performance problem, so maybe > this is not so important. That's good to hear. > That being said, it's still sad that the vc hooks in find-file-hook and > after-save-hook can hurt performance so much. Understandable, that's what drove me to implement that older change in Git back then. > It seems to me that the ideal outcome would be to support asynchronicity > in those hooks. That would benefit all vc backends... though this is > perhaps quite difficult. The most complex part could be retaining a compatible/synchronous API. > Maybe asynchronously populating the saved vc-state property would work? > With some clever usage of nil return values as Sean describes... Maybe Sean's idea is better, but to spitball different options: - FWIW since not too long ago we've treated a similar issue in diff-hl by using a thread - which calls the same code inside (meaning the current synchronous implementation), but it happens in the background, so the input is unfrozen and the visual update is asynchronous. 
  But keep in mind that threads' error handling is not great, so it
  seems not optimal to keep a lot of implementation code inside a
  thread.  Also, threads are reportedly not good with remote calls yet.

- The mode-line update isn't going to wait asynchronously, though, but
  perhaps an update could be scheduled.  If state updates are not
  synchronous, I suppose this would also need some debouncing/queueing
  mechanism for the callers as well.  That is the route of migrating to
  a different calling convention, though.

- Finally, if the main scenario that we are concerned with is the use
  in vc-dir, we could try switching only its updates to another backend
  call.  E.g. vc-dir-resynch-file would switch to the (possibly) more
  precise - though slower - dir-status-files, just like the code that
  first populates that buffer.  vc-state could then afford shortcuts
  more safely.

^ permalink raw reply [flat|nested] 9+ messages in thread
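The first option above, running the existing synchronous code in a background thread, can be sketched as follows. This is not the diff-hl implementation, which is not shown in this thread. The names my/vc-state--compute and my/vc-state-async are hypothetical; make-thread, apply-partially and vc-state are existing functions, and an Emacs built with thread support is assumed. The caveats from the message still apply: errors are only logged here, and remote files are not considered.

```elisp
(require 'vc-hooks)

(defun my/vc-state--compute (file callback)
  "Compute the VC state of FILE and pass it to CALLBACK.
CALLBACK is called with FILE and the state, or nil if the query
signaled an error.  Hypothetical helper for `my/vc-state-async'."
  (let ((state (condition-case err
                   (vc-state file)
                 ;; Errors inside threads are easy to miss, so log
                 ;; them and report \"no information\" instead.
                 (error
                  (message "vc-state failed for %s: %S" file err)
                  nil))))
    (funcall callback file state)))

(defun my/vc-state-async (file callback)
  "Run the ordinary synchronous `vc-state' for FILE in a new thread.
Return the thread object.  CALLBACK runs in that thread, not in the
main thread, so it should only do cheap bookkeeping."
  ;; `apply-partially' avoids depending on lexical binding in the caller.
  (make-thread (apply-partially #'my/vc-state--compute file callback)
               "my/vc-state"))
```

A caller would pass a callback that records the result and schedules a mode-line refresh; callers that need the value immediately would still have to use the synchronous vc-state, which is the compatibility problem mentioned above.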
* bug#74243: [PATCH] Speed up vc-hg-state by treating ignored files as unregistered 2024-11-27 0:18 ` Dmitry Gutov @ 2024-11-29 8:17 ` Sean Whitton 2024-11-29 12:43 ` Spencer Baugh via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 9+ messages in thread From: Sean Whitton @ 2024-11-29 8:17 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Spencer Baugh, 74243 Hello, On Wed 27 Nov 2024 at 02:18am +02, Dmitry Gutov wrote: > Maybe Sean's idea is better, but to spitball different options: > > - FWIW since not too long ago we've treated a similar issue in diff-hl by > using a thread - which calls the same code inside (meaning the current > synchronous implementation), but it happens in the background, so the input > is unfrozen and the visual update is asynchronous. > > But keeping in mind that threads' error handling is not great, so it seems not > optimal to keep a lot of implementation code inside a thread. Also, threads > are reportedly not good with remote calls yet. > > - The mode-line update isn't going to wait asynchronously, though, but perhaps > an update could be scheduled. If state updates are not synchronous, I > suppose this would also need some debouncing/queueing mechanism for the > callers as well. That is the route of migrating to a different calling > convention, though. Thanks for these ideas. Spencer, do you mind if I close this bug? It's clear that we could be doing something better here, but given the news from Hg upstream, we probably don't want to make changes along the lines of your original patch. > - Finally, if the main scenario that we are concerned is the use in vc-dir, we > could try switching only its updates to another backend > call. E.g. vc-dir-resynch-file would switch to the (possibly) more precise - > though slower - dir-status-files, just like the code that first populates > that buffer. vc-state could then afford shortcuts more safely. 
Just to note, my concern was that vc-state is a public function and we don't know where it's being used, so vc-dir is not the main concern. -- Sean Whitton ^ permalink raw reply [flat|nested] 9+ messages in thread
* bug#74243: [PATCH] Speed up vc-hg-state by treating ignored files as unregistered 2024-11-29 8:17 ` Sean Whitton @ 2024-11-29 12:43 ` Spencer Baugh via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-11-29 23:45 ` Sean Whitton 0 siblings, 1 reply; 9+ messages in thread From: Spencer Baugh via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-11-29 12:43 UTC (permalink / raw) To: Sean Whitton; +Cc: Dmitry Gutov, 74243 [-- Attachment #1: Type: text/plain, Size: 339 bytes --] On Fri, Nov 29, 2024, 3:17 AM Sean Whitton <spwhitton@spwhitton.name> wrote: > Spencer, do you mind if I close this bug? It's clear that we could be > doing something better here, but given the news from Hg upstream, we > probably don't want to make changes along the lines of your original > patch. > Agreed, go ahead. > [-- Attachment #2: Type: text/html, Size: 837 bytes --] ^ permalink raw reply [flat|nested] 9+ messages in thread
* bug#74243: [PATCH] Speed up vc-hg-state by treating ignored files as unregistered 2024-11-29 12:43 ` Spencer Baugh via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-11-29 23:45 ` Sean Whitton 0 siblings, 0 replies; 9+ messages in thread From: Sean Whitton @ 2024-11-29 23:45 UTC (permalink / raw) To: Spencer Baugh; +Cc: 74243-close, Dmitry Gutov Hello, On Fri 29 Nov 2024 at 07:43am -05, Spencer Baugh via "Bug reports for GNU Emacs, the Swiss army knife of text editors" wrote: > On Fri, Nov 29, 2024, 3:17 AM Sean Whitton <spwhitton@spwhitton.name> wrote: > > Spencer, do you mind if I close this bug? It's clear that we could be > doing something better here, but given the news from Hg upstream, we > probably don't want to make changes along the lines of your original > patch. > > Agreed, go ahead. Coolio, thanks. -- Sean Whitton ^ permalink raw reply [flat|nested] 9+ messages in thread