bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu

all messages for Emacs-related lists mirrored at yhetil.org

From: Dmitry Gutov <dgutov@yandex.ru>
To: Eli Zaretskii <eliz@gnu.org>
Cc: hariharanrangasamy@gmail.com, 26710@debbugs.gnu.org
Subject: bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
Date: Tue, 2 May 2017 00:46:25 +0300
Message-ID: <b283056e-7b17-86c2-7d59-1f9015146130@yandex.ru>
In-Reply-To: <83inlljb5r.fsf@gnu.org>

On 01.05.2017 10:20, Eli Zaretskii wrote:

> In my testing, find-grep finishes almost instantaneously.  The
> exception is when you have a cold cache, but even then it takes about
> 10% of the total run time, for the Emacs source tree (which yields
> about 100,000 hits in the test case).

This particular example uses a very frequent term. I get 61000 hits or 
so, which is still a lot, and the search never finishes here (probably 
because I have more minor modes and customizations enabled).

I don't think this is the common case, but let's try to remove some 
unnecessary work in Elisp first.

See commit c99a3b9. Please take a look at 
xref--regexp-syntax-dependent-p specifically, and see if any significant 
false negatives come to mind.

With this, project-find-regexp for 'emacs' finally completes in ~10 
seconds on my machine. That's still more than 10 times longer than the 
external process takes, but I'm out of big optimization ideas at this point.

> I thought the request was to allow the user do something in the
> foreground, while this processing runs in the background.  If that's
> not what was requested, then I guess I no longer understand the
> request.

If the project is huge, and there are only a few hits, parallelizing the 
search and processing will allow the user to do whatever they want in 
the foreground, because processing in Elisp, while slow, will still take 
a small fraction of the time.

If the search term returns a lot of hits (compared to the size of the 
project), processing might indeed take a lot of time, and the UI might 
appear sluggish (not sure how sluggish, though, that should depend on 
the scheduling of the main and background threads).

Even if it's sluggish, at least the user will see that the search has 
started, and there is some progress. We could even allow them to stop 
the search midway, and still do something with the first results.

These are some of the advantages 'M-x rgrep' has over project-find-regexp.

>> What we _can_ manage to run in parallel is the find-grep process in the
>> background and the post-processing of the results in Elisp.
> 
> Yes, you can -- if you invoke find-grep asynchronously and move the
> processing of the hits to the filter function.

Yes, these parts are necessary either way. What I was describing would 
go on top of them, as an abstraction.
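
As a rough sketch of those two parts (this is not what xref does today; 
the my-grep-* names are made up for illustration, and the parsing assumes 
'grep -nH' style "FILE:LINE:TEXT" output with no colons in file names):

```elisp
;; Hypothetical helpers: none of these names exist in xref or project.el.

(defun my-grep-split-lines (pending chunk)
  "Join PENDING and CHUNK, then split the result into complete lines.
Return (LINES . REST), where REST is the trailing partial line."
  (let* ((parts (split-string (concat pending chunk) "\n"))
         (rest (car (last parts))))
    (cons (butlast parts) rest)))

(defun my-grep-parse-hit (line)
  "Parse LINE as FILE:LINENO:TEXT.  Return (FILE LINENO TEXT) or nil."
  (when (string-match "\\`\\([^:]+\\):\\([0-9]+\\):\\(.*\\)\\'" line)
    (list (match-string 1 line)
          (string-to-number (match-string 2 line))
          (match-string 3 line))))

(defun my-grep--filter (proc chunk)
  "Process filter: parse complete lines in CHUNK, pass hits to the callback."
  (let ((split (my-grep-split-lines (process-get proc 'pending) chunk)))
    ;; Keep the unfinished last line until the next chunk arrives.
    (process-put proc 'pending (cdr split))
    (let ((hits (delq nil (mapcar #'my-grep-parse-hit (car split)))))
      (when hits
        (funcall (process-get proc 'on-hits) hits)))))

(defun my-grep--sentinel (proc _event)
  "Process sentinel: report the exit status once PROC is no longer live."
  (unless (process-live-p proc)
    (funcall (process-get proc 'on-done) (process-exit-status proc))))

(defun my-grep-async (command on-hits on-done)
  "Run shell COMMAND asynchronously and return the process.
ON-HITS is called with each batch of parsed hits as output arrives.
ON-DONE is called with the exit status when the process finishes."
  (let ((proc (make-process
               :name "my-grep"
               :command (list shell-file-name shell-command-switch command)
               :noquery t
               :connection-type 'pipe
               :filter #'my-grep--filter
               :sentinel #'my-grep--sentinel)))
    (process-put proc 'pending "")
    (process-put proc 'on-hits on-hits)
    (process-put proc 'on-done on-done)
    proc))
```

Something like (my-grep-async "grep -rnH emacs ." #'my-show-hits 
#'ignore), with my-show-hits being whatever renders a batch of hits, 
would then hand over results in batches while the external process is 
still running. The open question is how to expose that through xref.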

> But that doesn't need
> to involve threads, and is being done in many packages/features out
> there, so I'm not sure what did you ask me to do with this.

I imagined that an xref API that allows this kind of asynchronous 
result might look better and be more readable if it's implemented with 
threads underneath.

> IOW, it
> should be "trivial", at least in principle, to make this command work
> in the background, just like, say, "M-x grep".

In Compilation buffers (of which Grep is one example), the sentinel code 
has access to the buffer where the results are displayed. And the 
process outputs to that buffer as well. And 'M-x rgrep' doesn't have to 
abstract over the possible ways to obtain search results.

None of those are the case with the xref API, or the results rendering 
code, which has to work with the values returned by an arbitrary xref 
backend, as documented.

Right now, an xref backend implements several methods that are allowed 
to return the same type of value: "a list of xref items".

Our task, as I see it, is to generalize that return value type for 
asynchronous work, and to do that as sanely as possible.

Threads are not strictly necessary for this (see the last paragraph of 
my previous email), but this case seems like it could be a good, 
limited-in-scope showcase for the threading functionality.

> I'm not sure I understand the need for this complexity, given that
> async subprocesses are available.  I'm probably missing something
> because I know too little about the internals of the involved code.

The main thing to understand is the xref API, not the internals of the 
package.

