unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#50906: xref-find-references blocks Emacs: asynchronous operation?
@ 2021-09-29 22:49 Stefan Kangas
  2021-09-30  7:44 ` Daniel Martín via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2021-10-05  1:59 ` Dmitry Gutov
  0 siblings, 2 replies; 9+ messages in thread
From: Stefan Kangas @ 2021-09-29 22:49 UTC (permalink / raw)
  To: 50906

Severity: wishlist

`xref-find-references' blocks Emacs while searching for matches.
This can take a long time to complete in large repositories.

It would be nice if it could work asynchronously, like e.g. `M-x rgrep'.






* bug#50906: xref-find-references blocks Emacs: asynchronous operation?
  2021-09-29 22:49 bug#50906: xref-find-references blocks Emacs: asynchronous operation? Stefan Kangas
@ 2021-09-30  7:44 ` Daniel Martín via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2021-10-05  1:59 ` Dmitry Gutov
  1 sibling, 0 replies; 9+ messages in thread
From: Daniel Martín via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2021-09-30  7:44 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: 50906

Stefan Kangas <stefan@marxist.se> writes:

> Severity: wishlist
>
> `xref-find-references' blocks Emacs while searching for matches.
> This can take a long time to complete in large repositories.
>
> It would be nice if it could work asynchronously, like e.g. `M-x rgrep'.

Here's a related bug report:
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=50733






* bug#50906: xref-find-references blocks Emacs: asynchronous operation?
  2021-09-29 22:49 bug#50906: xref-find-references blocks Emacs: asynchronous operation? Stefan Kangas
  2021-09-30  7:44 ` Daniel Martín via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2021-10-05  1:59 ` Dmitry Gutov
  2021-10-05  5:18   ` Arthur Miller
  2021-10-05  6:29   ` Helmut Eller
  1 sibling, 2 replies; 9+ messages in thread
From: Dmitry Gutov @ 2021-10-05  1:59 UTC (permalink / raw)
  To: Stefan Kangas, 50906

On 30.09.2021 01:49, Stefan Kangas wrote:
> Severity: wishlist
> 
> `xref-find-references' blocks Emacs while searching for matches.
> This can take a long time to complete in large repositories.
> 
> It would be nice if it could work asynchronously, like e.g. `M-x rgrep'.

Wishlist indeed!

Daniel's bug report shows a good case for this kind of feature: huge 
projects where the search, even using fast tools (e.g. ripgrep), takes 
multiple seconds. So if results of such searches could be displayed 
incrementally, it would improve the perceived speed and usability.

What can be done here:

- Design an "asynchronous" format for xref-show-xrefs-function to 
consume. FETCHER of a different shape. Not sure how it's going to work 
in the end -- maybe a simple-ish iterator (call a function again for 
more results), but ideally it would look synchronous somehow, and the 
concurrency would be achieved through the use of threads. Not sure if 
that's realistic.

- The new kind of fetcher would need to provide a way to abort the 
search, since 'C-g' would not be available anymore.

- Implement it for the common searches of course.
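
To make the iterator-shaped fetcher from the list above a bit more concrete, here is a minimal sketch. It is written in Python rather than Emacs Lisp purely for illustration; `ChunkedFetcher`, `next_chunk`, `abort` and `collect` are hypothetical names, not part of xref, and the plain iterable of lines is an assumed stand-in for the output of an external search process.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional


class ChunkedFetcher:
    """Hypothetical iterator-style fetcher: call next_chunk() again for more."""

    def __init__(self, lines: Iterable[str], chunk_size: int = 100) -> None:
        self._lines: Iterator[str] = iter(lines)  # stand-in for process output
        self._chunk_size = chunk_size
        self._aborted = False

    def next_chunk(self) -> Optional[List[str]]:
        """Return up to chunk_size results, or None once exhausted or aborted."""
        if self._aborted:
            return None
        chunk = list(islice(self._lines, self._chunk_size))
        return chunk or None

    def abort(self) -> None:
        """Stop the search; later calls to next_chunk() return None."""
        self._aborted = True


def collect(fetcher: ChunkedFetcher, max_chunks: int) -> List[str]:
    """Consume at most max_chunks chunks, then abort the rest of the search."""
    results: List[str] = []
    for _ in range(max_chunks):
        chunk = fetcher.next_chunk()
        if chunk is None:
            break
        results.extend(chunk)
    fetcher.abort()
    return results
```

The point of this shape is that the consumer decides when to ask for more and when to give up, so the display code never has to wait for the whole result set.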

Downsides:

- No way to quickly 'C-g' out of a search, supposedly one would have to 
switch to the results buffer (maybe it will be selected right away) and 
type 'C-c C-c'. And then kill the buffer, I guess?

- The size threshold of a project where the improvement will be 
significant is pretty big -- for instance, searching across the Emacs 
checkout takes about 100-200ms (just the time the external process 
uses). If the search results in many matches (1000s or 10000s) the 
results will take a while to display, but most of the time is taken up 
by processing of results which is implemented in Lisp. We might have 
Emacs which shows the first results soon, but then remains sluggish 
until all search results are processed. This problem could be worked 
around, however, by limiting the displayed number of results and having 
buttons like the ones at the bottom of vc-print-root-log output buffer.

- Search results come in unsorted, and, in the case of ripgrep, sorted 
randomly every time the search is performed (the files, at least). We 
sort them now at the bottom of xref-matches-in-files, but asynchronous 
search results would make that infeasible.

Given all of the above, I've been putting off this work, but thoughts 
and opinions welcome, and POC patches -- doubly so.






* bug#50906: xref-find-references blocks Emacs: asynchronous operation?
  2021-10-05  1:59 ` Dmitry Gutov
@ 2021-10-05  5:18   ` Arthur Miller
  2021-10-05 15:11     ` Dmitry Gutov
  2021-10-05  6:29   ` Helmut Eller
  1 sibling, 1 reply; 9+ messages in thread
From: Arthur Miller @ 2021-10-05  5:18 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Stefan Kangas, 50906

Dmitry Gutov <dgutov@yandex.ru> writes:

> On 30.09.2021 01:49, Stefan Kangas wrote:
>> Severity: wishlist
>> `xref-find-references' blocks Emacs while searching for matches.
>> This can take a long time to complete in large repositories.
>> It would be nice if it could work asynchronously, like e.g. `M-x rgrep'.
>
> Wishlist indeed!
>
> Daniel's bug report shows a good case for this kind of feature: huge projects
> where the search, even using fast tools (e.g. ripgrep), takes multiple
> seconds. So if results of such searches could be displayed incrementally, it
> would improve the perceived speed and usability.
>
> What can be done here:
>
> - Design an "asynchronous" format for xref-show-xrefs-function to
>   consume. FETCHER of a different shape. Not sure how it's going to work in the
>   end -- maybe a simple-ish iterator (call a function again for more results),
>   but ideally it would look synchronous somehow, and the concurrency would be
>   achieved through the use of threads. Not sure if that's realistic.

Built-in threads are not realistic; what you probably want is async processes. I
was myself thinking of something for finding all references for implementing
this asynchronously for help, in style of , but I have not yet come to implement
that. However I have looked at native comp, 'comp-run-async-workers' and how it
processes its queue. I have no idea if it can be somehow adapted/reused, but
something like that at least as an idea.

> - The new kind of fetcher would need to provide a way to abort the search, since
>  'C-g' would not be available anymore.
It depends on how you would use it. If you would scan for references in the
background then you would be working with something else and wouldn't need
C-g. But reading your writing, something tells me that you would like to use it
interactively, which means you would start a *synchronous* operation, which
would use async workers, à la Java's or MFC's thread workers to get responsive
and visible updates in real-time, while workers are still searching. In that
case you would still have C-g available. On C-g you could signal worker
processes to quit.
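
A rough sketch of that interactive variant, in Python purely for illustration: a foreground loop shows results as a background worker finds them, and tells the worker to stop when the user asks to quit. The name `run_search` and the `quit_requested` callback are hypothetical; a Python thread stands in for the worker process, and the callback stands in for C-g.

```python
import queue
import threading
from typing import Callable, Iterable, List


def run_search(lines: Iterable[str],
               quit_requested: Callable[[List[str]], bool]) -> List[str]:
    """Foreground loop that shows results while a background worker searches."""
    results: queue.Queue = queue.Queue()
    stop = threading.Event()
    done = object()  # sentinel marking the end of the worker's output

    def worker() -> None:
        for line in lines:
            if stop.is_set():  # the foreground asked us to quit
                break
            results.put(line)
        results.put(done)

    thread = threading.Thread(target=worker)
    thread.start()
    shown: List[str] = []
    while True:
        item = results.get()
        if item is done:
            break
        shown.append(item)  # stand-in for updating the results buffer
        if quit_requested(shown):  # stand-in for the user pressing C-g
            stop.set()
            break
    thread.join()
    return shown
```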

Perhaps ...? :)






* bug#50906: xref-find-references blocks Emacs: asynchronous operation?
  2021-10-05  1:59 ` Dmitry Gutov
  2021-10-05  5:18   ` Arthur Miller
@ 2021-10-05  6:29   ` Helmut Eller
  2021-10-05 16:38     ` Dmitry Gutov
  1 sibling, 1 reply; 9+ messages in thread
From: Helmut Eller @ 2021-10-05  6:29 UTC (permalink / raw)
  To: 50906

On Tue, Oct 05 2021, Dmitry Gutov wrote:

> What can be done here:
>
> - Design an "asynchronous" format for xref-show-xrefs-function to
>   consume. FETCHER of a different shape. Not sure how it's going to
>   work in the end -- maybe a simple-ish iterator (call a function
>   again for more results), but ideally it would look synchronous
>   somehow, and the concurrency would be achieved through the use of
>   threads. Not sure if that's realistic.
>
> - The new kind of fetcher would need to provide a way to abort the
>   search, since 'C-g' would not be available anymore.
>
> - Implement it for the common searches of course.

I think promises, as used in the Javascript world, would be a good fit
for this kind of problem.  Something like this:
https://github.com/chuntaro/emacs-promise.
        
> Downsides:
>
> - No way to quickly 'C-g' out of a search, supposedly one would have
>   to switch to the results buffer (maybe it will be selected right
>   away) and type 'C-c C-c'. And then kill the buffer, I guess?

Maybe we could have some "promise framework" that solves this problem
more generally, e.g., a list-promises command that works like
list-processes and offers a command to cancel promises.

> - The size threshold of a project where the improvement will be
>   significant is pretty big -- for instance, searching across the
>   Emacs checkout takes about 100-200ms (just the time the external
>   process uses). If the search results in many matches (1000s or
>   10000s) the results will take a while to display, but most of the
>   time is taken up by processing of results which is implemented in
>   Lisp. We might have Emacs which shows the first results soon, but
>   then remains sluggish until all search results are processed. This
>   problem could be worked around, however, by limiting the displayed
>   number of results and having buttons like the ones at the bottom of
>  vc-print-root-log output buffer.
>
> - Search results come in unsorted, and, in the case of ripgrep, sorted
>   randomly every time the search is performed (the files, at
>   least). We sort them now at the bottom of xref-matches-in-files, but
>   asynchronous search results would make that infeasible.

This is a good point and probably quite difficult to solve.  I'm
wondering if it would be possible to build some kind of index, like
search engines do.  So instead of grepping, we'd use the index and maybe
invest more effort in ranking the results?

Helmut







* bug#50906: xref-find-references blocks Emacs: asynchronous operation?
  2021-10-05  5:18   ` Arthur Miller
@ 2021-10-05 15:11     ` Dmitry Gutov
  0 siblings, 0 replies; 9+ messages in thread
From: Dmitry Gutov @ 2021-10-05 15:11 UTC (permalink / raw)
  To: Arthur Miller; +Cc: Stefan Kangas, 50906

On 05.10.2021 08:18, Arthur Miller wrote:

>> What can be done here:
>>
>> - Design an "asynchronous" format for xref-show-xrefs-function to
>>    consume. FETCHER of a different shape. Not sure how it's going to work in the
>>    end -- maybe a simple-ish iterator (call a function again for more results),
>>    but ideally it would look synchronous somehow, and the concurrency would be
>>    achieved through the use of threads. Not sure if that's realistic.
> 
> Built-in threads are not realistic; what you probably want is async processes.

Why not? It should be possible with cooperative multithreading (which we 
have), at least in theory. Under the hood it would be based on async 
processes, of course.

> I
> was myself thinking of something for finding all references for implementing
> this asynchronously for help, in style of , but I have not yet come to implement
> that. However I have looked at native comp, 'comp-run-async-workers' and how it
> processes its queue. I have no idea if it can be somehow adapted/reused, but
> something like that at least as an idea.

Doesn't seem like it really can be reused directly: it launches a queue 
of processes. What we would need is a "queue" of result batches coming 
from one process. And we'd need some abstraction for it, not just 
concrete implementation.

>> - The new kind of fetcher would need to provide a way to abort the search, since
>>   'C-g' would not be available anymore.
> It depends on how you would use it. If you would scan for references in the
> background then you would be working with something else and wouldn't need
> C-g.

Why not? Sometimes the regexp I have typed is wrong (too short, for 
example), and I need to stop the search to correct it. Or even if the 
regexp was right, I might discover it brings too many matches to be useful.

> But reading your writing, something tells me that you would like to use it
> interactively, which means you would start a *synchronous* operation, which
> would use async workers, à la Java's or MFC's thread workers to get responsive
> and visible updates in real-time, while workers are still searching. In that
> case you would still have C-g available. On C-g you could signal worker
> processes to quit.

It's... an option too. And having lived with the current UI, I would 
probably be fine with it.

But I suppose a lot of users might want to be able to interact with the 
first results (that have been already rendered) before the search 
completes. Otherwise we're not really taking full advantage of 
asynchronous searching.






* bug#50906: xref-find-references blocks Emacs: asynchronous operation?
  2021-10-05  6:29   ` Helmut Eller
@ 2021-10-05 16:38     ` Dmitry Gutov
  2021-10-05 18:09       ` Helmut Eller
  0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Gutov @ 2021-10-05 16:38 UTC (permalink / raw)
  To: Helmut Eller, 50906

On 05.10.2021 09:29, Helmut Eller wrote:
> On Tue, Oct 05 2021, Dmitry Gutov wrote:
> 
>> What can be done here:
>>
>> - Design an "asynchronous" format for xref-show-xrefs-function to
>>    consume. FETCHER of a different shape. Not sure how it's going to
>>    work in the end -- maybe a simple-ish iterator (call a function
>>    again for more results), but ideally it would look synchronous
>>    somehow, and the concurrency would be achieved through the use of
>>    threads. Not sure if that's realistic.
>>
>> - The new kind of fetcher would need to provide a way to abort the
>>    search, since 'C-g' would not be available anymore.
>>
>> - Implement it for the common searches of course.
> 
> I think promises, as used in the Javascript world, would be a good fit
> for this kind of problem.  Something like this:
> https://github.com/chuntaro/emacs-promise.

A promise is something that resolves once. We could build on top of this 
concept, but what's really needed is some sort of a lazy sequence 
(Clojure-style), or a sequence of chunks.
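
The difference can be sketched as follows, in Python purely for illustration. `search_as_promise` and `search_as_chunks` are hypothetical names, and Python's `Future` is only an analogue of a promise here: the first shape hands over everything in one resolution, the second produces each chunk only when the consumer asks for it.

```python
from concurrent.futures import Future
from typing import Iterator, List


def search_as_promise(matches: List[str]) -> "Future[List[str]]":
    """A promise-like value: resolved exactly once, with every match."""
    future: "Future[List[str]]" = Future()
    future.set_result(list(matches))
    return future


def search_as_chunks(matches: List[str], size: int) -> Iterator[List[str]]:
    """A lazy sequence of chunks: each chunk is produced only when requested."""
    for start in range(0, len(matches), size):
        yield matches[start:start + size]
```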

>> Downsides:
>>
>> - No way to quickly 'C-g' out of a search, supposedly one would have
>>    to switch to the results buffer (maybe it will be selected right
>>    away) and type 'C-c C-c'. And then kill the buffer, I guess?
> 
> Maybe we could have some "promise framework" that solves this problem
> more generally, e.g., a list-promises command that works like
> list-processes and offers a command to cancel promises.

It would need to be accessible by the code handling the "abort" command, 
not just by some special UI accessible to the user separately.

But some Promise/Future implementations include the "abort" 
functionality, so it can work together.

>> - The size threshold of a project where the improvement will be
>>    significant is pretty big -- for instance, searching across the
>>    Emacs checkout takes about 100-200ms (just the time the external
>>    process uses). If the search results in many matches (1000s or
>>    10000s) the results will take a while to display, but most of the
>>    time is taken up by processing of results which is implemented in
>>    Lisp. We might have Emacs which shows the first results soon, but
>>    then remains sluggish until all search results are processed. This
>>    problem could be worked around, however, by limiting the displayed
>>    number of results and having buttons like the ones at the bottom of
>>   vc-print-root-log output buffer.
>>
>> - Search results come in unsorted, and, in the case of ripgrep, sorted
>>    randomly every time the search is performed (the files, at
>>    least). We sort them now at the bottom of xref-matches-in-files, but
>>    asynchronous search results would make that infeasible.
> 
> This is a good point and probably quite difficult to solve.  I'm
> wondering if it would be possible to build some kind of index, like
> search engines do.  So instead of grepping, we'd use the index and maybe
> invest more effort in ranking the results?

For xref-find-references in particular, you can build an index using 'ID 
Utils' already, and the search will be fast. The downside is you will 
need to update this index manually when the project changes. E.g. when 
you switch to a different repository branch.

And the ripgrep devs are working on something similar: 
https://github.com/BurntSushi/ripgrep/issues/1497

Not sure how far off in the future that is, though.

A really fast searcher solves the biggest part of the problem, but we'd 
still be left with very imprecise searches (many matches) locking up 
Emacs for seconds, since the Lisp overhead processing a match is 
unavoidably larger than the time it takes for a search program to print 
it. Using lazy sequences could allow us some leeway as well -- namely, 
processing only the first N hits initially, and then processing the rest 
only if the user requests that.
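
As an illustration of the "first N hits" idea, here is a small Python sketch (not Emacs Lisp, and not how xref parses output). `Hit`, `parse_hits` and `take` are hypothetical names, and the `file:line:text` record format is an assumption made for the example.

```python
from itertools import islice
from typing import Iterable, Iterator, List, NamedTuple


class Hit(NamedTuple):
    file: str
    line: int
    text: str


def parse_hits(raw_lines: Iterable[str]) -> Iterator[Hit]:
    """Lazily parse 'file:line:text' records; nothing runs until consumed."""
    for raw in raw_lines:
        file, line, text = raw.split(":", 2)
        yield Hit(file, int(line), text)


def take(hits: Iterator[Hit], n: int) -> List[Hit]:
    """Process only the next n hits, leaving the rest unparsed."""
    return list(islice(hits, n))
```

Calling `take` again on the same iterator continues where the previous call stopped, which is roughly what a "show more" button would do.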

If we only target this kind of improvement, the "abort" functionality 
could wait. We'd still need to choose between sorting the results and 
saving on parsing the output buffer eagerly, though.






* bug#50906: xref-find-references blocks Emacs: asynchronous operation?
  2021-10-05 16:38     ` Dmitry Gutov
@ 2021-10-05 18:09       ` Helmut Eller
  2021-10-05 19:24         ` Dmitry Gutov
  0 siblings, 1 reply; 9+ messages in thread
From: Helmut Eller @ 2021-10-05 18:09 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 50906

On Tue, Oct 05 2021, Dmitry Gutov wrote:

> A really fast searcher solves the biggest part of the problem, but
> we'd still be left with very imprecise searches (many matches) locking
> up Emacs for seconds, since the Lisp overhead processing a match is
> unavoidably larger than the time it takes for a search program to
> print it. Using lazy sequences could allow us some leeway as well --
> namely, processing only the first N hits initially, and then
> processing the rest only if the user requests that.
>
> If we only target this kind of improvement, the "abort" functionality
> could wait.

Yes, limiting the time that Emacs is locked up, by limiting the number of
hits that Emacs accepts in one chunk, seems like the way to go.

> We'd still need to choose between sorting the results and
> saving on parsing the output buffer eagerly, though.

Theoretically it should be possible to sort the first chunk and display
it.  Then, when the next chunk arrives, merge it in, à la heap-sort, and
update the display accordingly.  Probably not worth the effort, though.
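
A minimal sketch of that merge step, in Python purely for illustration. It uses a plain sorted merge (`heapq.merge`) rather than a heap sort proper; `merge_chunk` and the `(filename, line)` tuple shape are hypothetical.

```python
import heapq
from typing import List, Tuple

Match = Tuple[str, int]  # (filename, line number)


def merge_chunk(shown: List[Match], chunk: List[Match]) -> List[Match]:
    """Merge a newly arrived chunk into the already sorted, displayed list."""
    return list(heapq.merge(shown, sorted(chunk)))
```

Note that entries from a later chunk can land above ones that are already displayed, so the visible list shifts as chunks arrive.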

Also, I think that the only "sorting" that we actually do, is grouping
by filename.  And that doesn't seem all that important to me.

Helmut






* bug#50906: xref-find-references blocks Emacs: asynchronous operation?
  2021-10-05 18:09       ` Helmut Eller
@ 2021-10-05 19:24         ` Dmitry Gutov
  0 siblings, 0 replies; 9+ messages in thread
From: Dmitry Gutov @ 2021-10-05 19:24 UTC (permalink / raw)
  To: Helmut Eller; +Cc: 50906

On 05.10.2021 21:09, Helmut Eller wrote:

>> We'd still need to choose between sorting the results and
>> saving on parsing the output buffer eagerly, though.
> 
> Theoretically it should be possible to sort the first chunk and display
> it.  Then, when the next chunk arrives, merge it in, à la heap-sort, and
> update the display accordingly.  Probably not worth the effort, though.

This will lead to "jumping" of groups up and down. Not a pleasant UX.

> Also, I think that the only "sorting" that we actually do, is grouping
> by filename.  And that doesn't seem all that important to me.

xref-matches-in-files sorts results by filename alphabetically, because 
ripgrep returns them in random order every time. And the sorting step is 
pretty fast, as long as all results are available.





