bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu

unofficial mirror of bug-gnu-emacs@gnu.org 

* bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
       [not found] <87a86zu3gf.fsf@hari-laptop.i-did-not-set--mail-host-address--so-tickle-me>
@ 2017-04-29  8:55 ` Hariharan Rangasamy
  2017-04-29 17:00   ` Dmitry Gutov
  0 siblings, 1 reply; 15+ messages in thread
From: Hariharan Rangasamy @ 2017-04-29  8:55 UTC (permalink / raw)
  To: 26710

Using project-find-regexp to find text in a folder makes Emacs use 100%
CPU.

Steps to reproduce:
1. emacs -nw -Q
2. M-x project-find-regexp
3. Enter "emacs" as the search term.
4. Choose the downloaded emacs-25.2/ folder as the project.
5. Run the top command and check the CPU usage of the emacs-25.2 process.
6. Emacs uses 100% CPU until the search is over.






In GNU Emacs 25.2.1 (x86_64-unknown-linux-gnu, GTK+ Version 3.18.9)
 of 2017-04-29 built on hari-laptop
System Description:     Ubuntu 16.04.2 LTS

Configured features:
XPM JPEG TIFF GIF PNG SOUND DBUS GSETTINGS NOTIFY GNUTLS FREETYPE XFT
ZLIB TOOLKIT_SCROLL_BARS GTK3 X11

Important settings:
  value of $LANG: en_IN
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: iso-latin-1-unix

Major mode: Fundamental

Minor modes in effect:
  shell-dirtrack-mode: t
  diff-auto-refine-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  line-number-mode: t
  transient-mark-mode: t

Recent messages:
user-error: No matches for: helloworld
Using ’~/Downloads/emacs-25.2/’ as a transient project root
Enriched: decoding document...
Indenting...
Using ’~/Downloads/emacs-25.2/’ as a transient project root
Enriched: decoding document...
Indenting...
Making completion list...
delete-backward-char: Text is read-only
user-error: Beginning of history; no preceding item [2 times]
Quit [2 times]

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message rfc822 mml mml-sec epg
epg-config mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev
gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums mail-utils
autoconf autoconf-mode m4-mode gud nroff-mode texinfo sgml-mode python
tramp-sh tramp tramp-compat auth-source mm-util mail-prsvr
password-cache tramp-loaddefs trampver ucs-normalize advice json map seq
semantic/bovine/grammar semantic/wisent/grammar semantic/bovine
semantic/grammar help-fns semantic/idle semantic/grammar-wy bat-mode
cc-awk enriched cc-mode cc-fonts cc-guess cc-menus cc-cmds cc-styles
cc-align cc-engine cc-vars cc-defs nxml-uchnm rng-xsd xsd-regexp
rng-cmpct rng-nxml rng-valid rng-loc rng-uri rng-parse nxml-parse
rng-match rng-dt rng-util rng-pttrn nxml-ns nxml-mode nxml-outln
nxml-rap nxml-util nxml-glyph nxml-enc xmltok ruby-mode perl-mode
verilog-mode diff conf-mode make-mode tex-mode shell latexenc
org-element org-rmail org-mhe org-irc org-info org-gnus gnus-util
org-docview org-bibtex bibtex org-bbdb org-w3m org org-macro
org-footnote org-pcomplete pcomplete org-list org-faces org-entities
noutline outline org-version ob-emacs-lisp ob ob-tangle ob-ref ob-lob
ob-table ob-exp org-src ob-keys ob-comint ob-core ob-eval org-compat
org-macs org-loaddefs format-spec cal-menu calendar cal-loaddefs
srecode/srt-mode semantic/analyze semantic/sort semantic/scope
semantic/analyze/fcn semantic/db semantic/format ezimage
srecode/template srecode/srt-wy semantic/wisent semantic/wisent/wisent
semantic/ctxt srecode/ctxt semantic/tag-ls semantic/find srecode/compile
srecode/dictionary srecode/table srecode eieio-base semantic/util-modes
semantic/util semantic semantic/tag semantic/lex cedet doc-view subr-x
jka-compr image-mode ps-mode add-log sh-script smie executable
find-dired dired semantic/fw mode-local find-func grep compile comint
ansi-color vc-mtn vc-hg vc-git diff-mode easy-mmode vc-bzr vc-src
vc-sccs vc-svn vc-cvs vc-rcs vc vc-dispatcher thingatpt etags xref
cl-seq ring eieio byte-opt bytecomp byte-compile cl-extra help-mode
easymenu cconv eieio-core cl-macs gv cl-loaddefs pcase cl-lib project
term/xterm xterm time-date disp-table mule-util tooltip eldoc electric
uniquify ediff-hook vc-hooks lisp-float-type mwheel x-win
term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list newcomment elisp-mode lisp-mode prog-mode register page
menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock
syntax facemenu font-core frame cl-generic cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
cp51932 hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese charscript case-table epa-hook jka-cmpr-hook help
simple abbrev minibuffer cl-preloaded nadvice loaddefs button faces
cus-face macroexp files text-properties overlay sha1 md5 base64 format
env code-pages mule custom widget hashtable-print-readable backquote
dbusbind inotify dynamic-setting system-font-setting font-render-setting
move-toolbar gtk x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 4729592 517895)
 (symbols 48 40455 0)
 (miscs 40 112 310)
 (strings 32 292026 313931)
 (string-bytes 1 15565717)
 (vectors 16 275283)
 (vector-slots 8 2028489 240208)
 (floats 8 530 633)
 (intervals 56 977354 6401)
 (buffers 976 23)
 (heap 1024 235047 29774))






* bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
  2017-04-29  8:55 ` bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu Hariharan Rangasamy
@ 2017-04-29 17:00   ` Dmitry Gutov
  2017-04-29 17:37     ` Eli Zaretskii
  0 siblings, 1 reply; 15+ messages in thread
From: Dmitry Gutov @ 2017-04-29 17:00 UTC (permalink / raw)
  To: Hariharan Rangasamy, 26710

Hi,

On 29.04.2017 11:55, Hariharan Rangasamy wrote:
> using project-find-regexp to find text in a folder makes emacs use 100%
> CPU

Why is that a problem? Should it use 10% CPU and take 10 times as long?






* bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
  2017-04-29 17:00   ` Dmitry Gutov
@ 2017-04-29 17:37     ` Eli Zaretskii
  2017-04-30  4:13       ` Hariharan Rangasamy
  0 siblings, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2017-04-29 17:37 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: hariharanrangasamy, 26710

> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Sat, 29 Apr 2017 20:00:35 +0300
> 
> On 29.04.2017 11:55, Hariharan Rangasamy wrote:
> > using project-find-regexp to find text in a folder makes emacs use 100%
> > CPU
> 
> Why is that a problem?

Indeed: CPU-intensive processing will always make one execution unit
100% busy, as long as the processing goes on.  This is normal.






* bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
  2017-04-29 17:37     ` Eli Zaretskii
@ 2017-04-30  4:13       ` Hariharan Rangasamy
  2017-04-30 10:35         ` Dmitry Gutov
  0 siblings, 1 reply; 15+ messages in thread
From: Hariharan Rangasamy @ 2017-04-30  4:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 26710, Dmitry Gutov

When the search is in progress, I'm unable to use emacs.
Emacs doesn't respond to any action until the search is over.

On Sat, Apr 29, 2017 at 11:07 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Dmitry Gutov <dgutov@yandex.ru>
>> Date: Sat, 29 Apr 2017 20:00:35 +0300
>>
>> On 29.04.2017 11:55, Hariharan Rangasamy wrote:
>> > using project-find-regexp to find text in a folder makes emacs use 100%
>> > CPU
>>
>> Why is that a problem?
>
> Indeed: CPU-intensive processing will always make one execution unit
> 100% busy, as long as the processing goes on.  This is normal.






* bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
  2017-04-30  4:13       ` Hariharan Rangasamy
@ 2017-04-30 10:35         ` Dmitry Gutov
  2017-04-30 18:47           ` Eli Zaretskii
  0 siblings, 1 reply; 15+ messages in thread
From: Dmitry Gutov @ 2017-04-30 10:35 UTC (permalink / raw)
  To: Hariharan Rangasamy, Eli Zaretskii; +Cc: control, 26710

retitle 26710 project-find-regexp blocks the UI
stop

On 30.04.2017 7:13, Hariharan Rangasamy wrote:
> When the search is in progress, I'm unable to use emacs.
> Emacs doesn't respond to any action until the search is over.

OK, that's a reasonable complaint, especially when the search term is a 
rarely occurring one (so most of the time is spent in the external process).

It is non-trivial to fix, however, while keeping the xref backend API 
and associated code sane.

I'm hoping that the newly-added concurrency support can help us in that. 
I have not looked into actually using it, though.

If someone (Eli?) would like to try their hand at it, or even to outline
the basic direction such a solution would take, that would be welcome.

It seems we need two parts:

- Generating some sort of lazy sequence from the external process' 
output, so that the return value of an xref backend call is still a 
sequence.

- Being able to hook up in an asynchronous fashion to that sequence in a 
(second?) background thread, to render the search results in the xref 
buffer as soon as they appear.
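
Editor's illustration of the first part only (a lazy sequence over an external process' output), written in Python rather than Elisp so it can be run anywhere. It is a minimal sketch, not xref code: `lazy_hits` and `FAKE_SEARCH` are hypothetical names, and the child process merely stands in for find-grep. Each `next` call still blocks until a line arrives, so the asynchronous second part is not covered here.

```python
import subprocess
import sys
from typing import Iterator, List


def lazy_hits(command: List[str]) -> Iterator[str]:
    """Yield one output line at a time while the child process is still running."""
    with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            yield line.rstrip("\n")


# Stand-in for find-grep: a child process that prints three fake hits.
FAKE_SEARCH = [
    sys.executable,
    "-c",
    "for i in range(3): print(f'file{i}.el:{i + 1}:emacs')",
]

hits = lazy_hits(FAKE_SEARCH)  # nothing has run yet; the generator is lazy
first = next(hits)             # starts the process and returns the first hit
rest = list(hits)              # drains the remaining hits
```

The caller still receives "a sequence", which is the property the backend API is meant to keep; only the moment at which each element is produced changes.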


* bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
  2017-04-30 10:35         ` Dmitry Gutov
@ 2017-04-30 18:47           ` Eli Zaretskii
  2017-05-01  2:42             ` Dmitry Gutov
  0 siblings, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2017-04-30 18:47 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: hariharanrangasamy, control, 26710

> Cc: 26710@debbugs.gnu.org, control@debbugs.gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Sun, 30 Apr 2017 13:35:53 +0300
> 
> > When the search is in progress, I'm unable to use emacs.
> > Emacs doesn't respond to any action until the search is over.
> 
> OK, that's a reasonable complaint, especially when the search term is a 
> rarely occurring one (so most of the time is spent in the external process).
> 
> It is non-trivial to fix, however, while keeping the xref backend API 
> and associated code sane.
> 
> I'm hoping that the newly-added concurrency support can help us in that. 
> I have not looked into actually using it, though.
> 
> If someone (Eli?) would like to try their hand at it, or even to outline
> the basic direction such a solution would take, that would be welcome.

I'll try to look at this.  According to my profiling, the lion's share
of time is taken by xref--collect-matches, so that's the place to try
concurrency.

> - Being able to hook up in an asynchronous fashion to that sequence in a 
> (second?) background thread, to render the search results in the xref 
> buffer as soon as they appear.

For the other thread to be background, it will need to yield from time
to time, otherwise the user experience will be identical to what we
have today, because an un-yielding thread will hold the execution unit
until it does its job completely, and no other thread gets to run
until it does.
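
Editor's illustration of the yielding requirement, as a small Python model and not Emacs thread code. It assumes a purely cooperative scheduler in which a task keeps control until it yields or finishes; `worker` and `run` are hypothetical names. The first run yields after every unit of work, the second never yields.

```python
from collections import deque
from typing import Deque, Generator, List

log: List[str] = []


def worker(name: str, steps: int, yield_every: int) -> Generator[None, None, None]:
    """Do `steps` units of work, giving up control after every `yield_every` units."""
    for i in range(1, steps + 1):
        log.append(f"{name}{i}")
        if i % yield_every == 0:
            yield  # comparable in spirit to a thread voluntarily yielding


def run(tasks: List[Generator[None, None, None]]) -> None:
    """Round-robin scheduler: a task keeps control until it yields or finishes."""
    pending: Deque[Generator[None, None, None]] = deque(tasks)
    while pending:
        task = pending.popleft()
        try:
            next(task)
            pending.append(task)  # task yielded; put it back in line
        except StopIteration:
            pass                  # task finished; drop it


# "search" yields after every unit of work, so "ui" work interleaves with it.
run([worker("search", 4, yield_every=1), worker("ui", 2, yield_every=1)])
interleaved = list(log)

# "search" never yields, so "ui" has to wait until it is completely done.
log.clear()
run([worker("search", 4, yield_every=5), worker("ui", 2, yield_every=1)])
blocked = list(log)
```

In the second run all four `search` entries come before any `ui` entry, which corresponds to the unresponsive behaviour described in the report.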






* bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
  2017-04-30 18:47           ` Eli Zaretskii
@ 2017-05-01  2:42             ` Dmitry Gutov
  2017-05-01  7:20               ` Eli Zaretskii
  0 siblings, 1 reply; 15+ messages in thread
From: Dmitry Gutov @ 2017-05-01  2:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: hariharanrangasamy, control, 26710

On 30.04.2017 21:47, Eli Zaretskii wrote:

> I'll try to look at this.  According to my profiling, the lion's share
> of time is taken by xref--collect-matches, so that's the place to try
> concurrency.

I think that's too late. By the time xref--collect-matches is called 
(and it's called for each hit), we've already spent time synchronously 
waiting for the find-grep invocation to finish.

When there are a lot of matches, xref--collect-matches can take some 
significant time, with opening the buffers and everything. That can be 
optimized, however, as a separate issue, and I don't think there's 
anything to parallelize there, since it all happens in Elisp.

What we _can_ manage to run in parallel is the find-grep process in the
background and the post-processing of the results in Elisp. Depending
on how many matches there are in total, compared to the project size (which
affects how long find-grep will run), the second part will still affect
the responsiveness of the UI to a smaller or larger degree, but
ultimately there's no way around this AFAIK, as long as Elisp threads do
not provide parallelism.

>> - Being able to hook up in an asynchronous fashion to that sequence in a
>> (second?) background thread, to render the search results in the xref
>> buffer as soon as they appear.
> 
> For the other thread to be background, it will need to yield from time
> to time, otherwise the user experience will be identical to what we
> have today, because an un-yielding thread will hold the execution unit
> until it does its job completely, and no other thread gets to run
> until it does.

Here's how I imagine it:

- Main thread creates a thread P which invokes find-grep, somehow
creates a "generator" object, and yields to the main thread.

- The main thread creates the "other thread" which creates a buffer for 
displaying the xref items and marks it as still loading (e.g. with a 
spinner in the mode-line). It then repeatedly calls the generator for
the next element. There are three possible outcomes:

1. An element is returned. Render it into the buffer.
2. No next element available. That should automatically yield from the 
current thread until it becomes available. That kind of logic should 
reside somewhere inside the generator, along with storing the reference 
to the current thread, to resume it later.
3. No more items, the process in P is finished. Mark the xref buffer as 
completed.

The things I'm not clear on are:

- How to best "convert" the process buffer into a generator object.

- Which generator type to use. Not sure if any of the ones we already 
have (generator.el, or iterator.el and stream.el in ELPA) will help.

- Whether thread P is needed at all, or whether we could just use the main
one instead.

- Whether we should forget all that concurrency nonsense (at least for 
this problem), and instead of a generator go with callbacks, similar to 
the API of the dir-status-files VC command. This way, each callback will 
render a new line, and the last one will mark the buffer as completed.
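
Editor's illustration of the consumer loop with its three outcomes, as a Python sketch with ordinary threads and a queue. This is an analogy only: Python threads are scheduled differently from Elisp threads, and `producer`, `render_all` and the `DONE` sentinel are hypothetical names, not part of xref.

```python
import queue
import threading
from typing import List

DONE = object()  # sentinel meaning "the search process has finished"


def producer(results: "queue.Queue[object]") -> None:
    """Stand-in for thread P: pushes hits as the (imaginary) search finds them."""
    for hit in ["a.el:1:emacs", "b.el:7:emacs"]:
        results.put(hit)
    results.put(DONE)


def render_all(results: "queue.Queue[object]") -> List[str]:
    """Stand-in for the rendering thread, handling the three outcomes."""
    buffer_lines: List[str] = ["[searching...]"]
    while True:
        try:
            item = results.get(timeout=0.05)
        except queue.Empty:
            continue                         # outcome 2: nothing yet, wait and retry
        if item is DONE:
            buffer_lines[0] = "[completed]"  # outcome 3: mark the buffer as completed
            return buffer_lines
        buffer_lines.append(str(item))       # outcome 1: render the new element


results: "queue.Queue[object]" = queue.Queue()
threading.Thread(target=producer, args=(results,)).start()
rendered = render_all(results)
```

With callbacks instead of a generator, the bodies of outcomes 1 and 3 would become the per-item and completion callbacks, and outcome 2 would disappear from the consumer's code.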


* bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
  2017-05-01  2:42             ` Dmitry Gutov
@ 2017-05-01  7:20               ` Eli Zaretskii
  2017-05-01 21:46                 ` Dmitry Gutov
  0 siblings, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2017-05-01  7:20 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: hariharanrangasamy, 26710

> Cc: hariharanrangasamy@gmail.com, control@debbugs.gnu.org,
>  26710@debbugs.gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Mon, 1 May 2017 05:42:21 +0300
> 
> On 30.04.2017 21:47, Eli Zaretskii wrote:
> 
> > I'll try to look at this.  According to my profiling, the lion's share
> > of time is taken by xref--collect-matches, so that's the place to try
> > concurrency.
> 
> I think that's too late. By the time xref--collect-matches is called 
> (and it's called for each hit), we've already spent time synchronously 
> waiting for the find-grep invocation to finish.

In my testing, find-grep finishes almost instantaneously.  The
exception is when you have a cold cache, but even then it takes about
10% of the total run time, for the Emacs source tree (which yields
about 100,000 hits in the test case).

> When there are a lot of matches, xref--collect-matches can take some 
> significant time, with opening the buffers and everything. That can be 
> optimized, however, as a separate issue, and I don't think there's 
> anything to parallelize there, since it all happens in Elisp.

I thought the request was to allow the user to do something in the
foreground, while this processing runs in the background.  If that's
not what was requested, then I guess I no longer understand the
request.

> What we _can_ manage to run in parallel is the find-grep process in the
> background and the post-processing of the results in Elisp.

Yes, you can -- if you invoke find-grep asynchronously and move the
processing of the hits to the filter function.  But that doesn't need
to involve threads, and is being done in many packages/features out
there, so I'm not sure what you asked me to do with this.  IOW, it
should be "trivial", at least in principle, to make this command work
in the background, just like, say, "M-x grep".

> Here's how I imagine it:

I'm not sure I understand the need for this complexity, given that
async subprocesses are available.  I'm probably missing something
because I know too little about the internals of the involved code.
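
Editor's illustration of the filter-plus-sentinel pattern, modelled in Python. It is a sketch under assumptions, not Emacs code: `HitFilter`, `feed` and `finish` are hypothetical names playing the roles of a process filter and a process sentinel. The one detail it tries to show is that output arrives in arbitrary chunks, so a partial line has to be carried over to the next call.

```python
from typing import Callable, List


class HitFilter:
    """Collects raw output chunks and reports each complete line through a callback."""

    def __init__(self, on_hit: Callable[[str], None], on_done: Callable[[], None]) -> None:
        self._pending = ""  # partial line carried over between chunks
        self._on_hit = on_hit
        self._on_done = on_done

    def feed(self, chunk: str) -> None:
        """Role of a process filter: called with whatever output has arrived."""
        self._pending += chunk
        *complete, self._pending = self._pending.split("\n")
        for line in complete:
            if line:
                self._on_hit(line)

    def finish(self) -> None:
        """Role of a process sentinel: called once when the process exits."""
        if self._pending:
            self._on_hit(self._pending)
            self._pending = ""
        self._on_done()


shown: List[str] = []
hit_filter = HitFilter(on_hit=shown.append, on_done=lambda: shown.append("<done>"))

# The chunks deliberately split lines in the middle, as pipe reads often do.
for chunk in ["a.el:1:em", "acs\nb.el:7:emacs\nc.e", "l:9:emacs\n"]:
    hit_filter.feed(chunk)
hit_filter.finish()
```

Each `feed` call does a small, bounded amount of work, which is what lets an event loop get back to user input between chunks.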


* bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
  2017-05-01  7:20               ` Eli Zaretskii
@ 2017-05-01 21:46                 ` Dmitry Gutov
  2017-05-02  7:15                   ` Eli Zaretskii
  0 siblings, 1 reply; 15+ messages in thread
From: Dmitry Gutov @ 2017-05-01 21:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: hariharanrangasamy, 26710

On 01.05.2017 10:20, Eli Zaretskii wrote:

> In my testing, find-grep finishes almost instantaneously.  The
> exception is when you have a cold cache, but even then it takes about
> 10% of the total run time, for the Emacs source tree (which yields
> about 100,000 hits in the test case).

This particular example uses a very frequent term. I get 61000 hits or 
so, and it's still a lot, the search never finishes here (probably 
because I have more minor modes and customizations enabled).

I don't think this is the common case, but let's try to remove some 
unnecessary work in Elisp first.

See commit c99a3b9. Please take a look at 
xref--regexp-syntax-dependent-p specifically, and see if any significant 
false negatives come to mind.

With this, project-find-regexp for 'emacs' finally completes in ~10 
seconds on my machine. That's still more than 10 times longer than the 
external process takes, but I'm out of big optimization ideas at this point.

> I thought the request was to allow the user to do something in the
> foreground, while this processing runs in the background.  If that's
> not what was requested, then I guess I no longer understand the
> request.

If the project is huge, and there are only a few hits, parallelizing the 
search and processing will allow the user to do whatever they want in 
the foreground. Because processing in Elisp, while slow, will still take 
a small fraction of the time.

If the search term returns a lot of hits (compared to the size of the 
project), processing might indeed take a lot of time, and the UI might 
appear sluggish (not sure how sluggish, though, that should depend on 
the scheduling of the main and background threads).

Even if it's sluggish, at least the user will see that the search has 
started, and there is some progress. We could even allow them to stop 
the search midway, and still do something with the first results.

These are some of the advantages 'M-x rgrep' has over project-find-regexp.

>> What we _can_ manage to run in parallel is the find-grep process in the
>> background and the post-processing of the results in Elisp.
> 
> Yes, you can -- if you invoke find-grep asynchronously and move the
> processing of the hits to the filter function.

Yes, these parts are necessary either way. What I was describing would 
go on top of them, as an abstraction.

> But that doesn't need
> to involve threads, and is being done in many packages/features out
> there, so I'm not sure what you asked me to do with this.

I imagined that the xref API that allows this kind of asynchronous 
results might look better and more readable if it's implemented with 
threads underneath.

> IOW, it
> should be "trivial", at least in principle, to make this command work
> in the background, just like, say, "M-x grep".

In Compilation buffers (of which Grep is one example), the sentinel code 
has access to the buffer where the results are displayed. And the 
process outputs to that buffer as well. And 'M-x rgrep' doesn't have to 
abstract over possible ways to obtain search results.

None of those are the case with the xref API, or the results rendering 
code, which has to work with the values returned by an arbitrary xref 
backend, as documented.

Right now, an xref backend implements several methods that are allowed 
to return the same type of value: "a list of xref items".

Our task, as I see it, is to generalize that return value type for 
asynchronous work, and to do that as sanely as possible.

Threads are not strictly necessary for this (see the last paragraph of 
my previous email), but this case seems like it could be a good, limited 
in scope, showcase for the threading functionality.

> I'm not sure I understand the need for this complexity, given that
> async subprocesses are available.  I'm probably missing something
> because I know too little about the internals of the involved code.

The main thing to understand is the xref API, not the internals of the 
package.


* bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
  2017-05-01 21:46                 ` Dmitry Gutov
@ 2017-05-02  7:15                   ` Eli Zaretskii
  2017-05-02 10:00                     ` Dmitry Gutov
  0 siblings, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2017-05-02  7:15 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: hariharanrangasamy, 26710

> Cc: hariharanrangasamy@gmail.com, 26710@debbugs.gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Tue, 2 May 2017 00:46:25 +0300
> 
> See commit c99a3b9. Please take a look at 
> xref--regexp-syntax-dependent-p specifically, and see if any significant 
> false negatives come to mind.

Can you explain the significance of xref--regexp-syntax-dependent-p's
tests?  I don't know enough about xref to grasp that just by looking
at the changes.

> With this, project-find-regexp for 'emacs' finally completes in ~10 
> seconds on my machine.

It takes about 15 here (and 45 in an unoptimized build).  I guess this
slowdown is expected, since this is a 32-bit build --with-wide-int, so
should be 30% slower than with native ints.

I don't remember the original timings, but this looks like a good
improvement, thanks.

> >> What we _can_ manage to run in parallel is the find-grep process in the
> >> background and the post-processing of the results in Elisp.
> > 
> > Yes, you can -- if you invoke find-grep asynchronously and move the
> > processing of the hits to the filter function.
> 
> Yes, these parts are necessary either way. What I was describing would 
> go on top of them, as an abstraction.

If the processing is in filter and sentinel functions, I'm not sure we
will need any further speedups, because the UI will remain responsive.

> > But that doesn't need
> > to involve threads, and is being done in many packages/features out
> > there, so I'm not sure what you asked me to do with this.
> 
> I imagined that the xref API that allows this kind of asynchronous 
> results might look better and more readable if it's implemented with 
> threads underneath.

If you need advice for how to implement something like that, I can try
helping with threads.

> The main thing to understand is the xref API, not the internals of the 
> package.

Well, I lack that understanding as well.






* bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
  2017-05-02  7:15                   ` Eli Zaretskii
@ 2017-05-02 10:00                     ` Dmitry Gutov
  2017-05-02 17:26                       ` Eli Zaretskii
  0 siblings, 1 reply; 15+ messages in thread
From: Dmitry Gutov @ 2017-05-02 10:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: hariharanrangasamy, 26710

On 02.05.2017 10:15, Eli Zaretskii wrote:

> Can you explain the significance of xref--regexp-syntax-dependent-p's
> tests?  I don't know enough about xref to grasp that just by looking
> at the changes.

When it returns nil (the regexp is not affected by syntax-table):

If the file containing the hit is not open, we now skip inserting the 
first few lines of that file into the temporary buffer, and calling 
set-auto-mode.

And, whether it's open or not, we skip the syntax-propertize call.
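
Editor's illustration of the kind of test such a predicate might perform, as a Python sketch. This is not the code from the commit mentioned in the thread, and the list of constructs is an assumption: it flags Emacs regexp escapes whose meaning depends on the syntax table (`\b`, `\B`, `\<`, `\>`, `\w`, `\W`, `\_<`, `\_>`, `\sC`, `\SC`) and the `[:space:]` and `[:word:]` character classes. `looks_syntax_dependent` is a hypothetical name.

```python
import re

# A syntax-dependent escape is a backslash followed by one of b B < > w W _ s S,
# where that backslash is itself unescaped, i.e. preceded by an even number of
# backslashes. The second alternative catches [:space:] and [:word:].
_SYNTAX_DEPENDENT = re.compile(
    r"(?<!\\)(?:\\\\)*\\[bB<>wW_sS]|\[:(?:space|word):\]"
)


def looks_syntax_dependent(emacs_regexp: str) -> bool:
    """Return True if the Emacs-style regexp appears to consult the syntax table."""
    return _SYNTAX_DEPENDENT.search(emacs_regexp) is not None


plain = looks_syntax_dependent("emacs")                     # literal text only
bounded = looks_syntax_dependent(r"\bemacs\b")              # word boundaries
escaped = looks_syntax_dependent(r"foo\\bar")               # literal backslash, then "bar"
char_class = looks_syntax_dependent("[[:space:]]+emacs")    # syntax-based class
```

A false negative here would mean skipping the major-mode setup for a regexp that actually needs it, which is why the thread asks for review of exactly that case.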

>> With this, project-find-regexp for 'emacs' finally completes in ~10
>> seconds on my machine.
> 
> It takes about 15 here (and 45 in an unoptimized build).  I guess this
> slowdown is expected, since this is a 32-bit build --with-wide-int, so
> should be 30% slower than with native ints.

Thanks for testing. To be more accurate, it's about 10 seconds in my 
normal session, and about 6 seconds starting with 'emacs -Q'. My laptop 
is most likely faster.

> If the processing is in filter and sentinel functions, I'm not sure we
> will need any further speedups, because the UI will remain responsive.

The filter and sentinel functions are not allowed to have direct access 
to the final output buffer, hence the need for abstraction.

I guess you favor the "one callback per hit" approach, then.

Still, if the filter and sentinel functions take a lot of time
(and/or get called a lot), as they will in this example, the UI can't
be as responsive as usual, can it?

>>> But that doesn't need
>>> to involve threads, and is being done in many packages/features out
>>> there, so I'm not sure what you asked me to do with this.
>>
>> I imagined that the xref API that allows this kind of asynchronous
>> results might look better and more readable if it's implemented with
>> threads underneath.
> 
> If you need advice for how to implement something like that, I can try
> helping with threads.

I'd like some more general advice first. E.g. do we want to go down this road?
The dir-status-files-like scheme should work without threads, too.

It seems a bit brittle, though: if the process filter is supposed to be 
calling the callback for each item, the callback has to be in place 
right away. And the process will be started before that happens.

We'll probably be saved by filters having to wait until the current 
command finishes executing, though.

>> The main thing to understand is the xref API, not the internals of the
>> package.
> 
> Well, I lack that understanding as well.

I'm hoping it's not too hard to obtain even just by reading the 
Commentary section in xref.el. But hey, you don't have to.

The callbacks approach seems viable, too.


* bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
  2017-05-02 10:00                     ` Dmitry Gutov
@ 2017-05-02 17:26                       ` Eli Zaretskii
  2017-05-02 17:41                         ` Eli Zaretskii
  2017-05-03  0:14                         ` Dmitry Gutov
  0 siblings, 2 replies; 15+ messages in thread
From: Eli Zaretskii @ 2017-05-02 17:26 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: hariharanrangasamy, 26710

> Cc: hariharanrangasamy@gmail.com, 26710@debbugs.gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Tue, 2 May 2017 13:00:06 +0300
> 
> On 02.05.2017 10:15, Eli Zaretskii wrote:
> 
> > Can you explain the significance of xref--regexp-syntax-dependent-p's
> > tests?  I don't know enough about xref to grasp that just by looking
> > at the changes.
> 
> When it returns nil (the regexp is not affected by syntax-table):
> 
> If the file containing the hit is not open, we now skip inserting the 
> first few lines of that file into the temporary buffer, and calling 
> set-auto-mode.
> 
> And, whether it's open or not, we skip the syntax-propertize call.

OK, I will look at that function with this in mind.

> Still, if the filter and sentinel functions take a lot of time
> (and/or get called a lot), as they will in this example, the UI can't
> be as responsive as usual, can it?

The sentinel/filter won't be called at all if keyboard/mouse input is
available.  Once they are called, if each call takes a long processing
time, the UI could feel sluggish, yes.  But I don't quite see how
using threads will avoid the same problem, since the mechanism for
thread switch is basically the same as for multiplexing UI with
subprocess output.

> I'd like some more general advice first. E.g. do we want to go down this road?

IMO, we should first explore the async subprocess road.

> It seems a bit brittle, though: if the process filter is supposed to be 
> calling the callback for each item, the callback has to be in place 
> right away. And the process will be started before that happens.

You can countermand that by using make-process with the :stop
attribute, then use 'continue-process' when everything is set up.

> We'll probably be saved by filters having to wait until the current 
> command finishes executing, though.

Not sure I follow you: a filter function is called whenever some
output arrives from the subprocess.  So they don't need to wait for
the subprocess to finish.






* bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
  2017-05-02 17:26                       ` Eli Zaretskii
@ 2017-05-02 17:41                         ` Eli Zaretskii
  2017-05-03  0:14                         ` Dmitry Gutov
  1 sibling, 0 replies; 15+ messages in thread
From: Eli Zaretskii @ 2017-05-02 17:41 UTC (permalink / raw)
  To: dgutov; +Cc: hariharanrangasamy, 26710

> Date: Tue, 02 May 2017 20:26:19 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: hariharanrangasamy@gmail.com, 26710@debbugs.gnu.org
> 
> > It seems a bit brittle, though: if the process filter is supposed to be 
> > calling the callback for each item, the callback has to be in place 
> > right away. And the process will be started before that happens.
> 
> You can countermand that by using make-process with the :stop
> attribute, then use 'continue-process' when everything is set up.

Darn, this won't work on systems without SIGCONT support, like
MS-Windows.

But I don't think this is a real problem anyway: Emacs will not try
sensing for subprocess output until it becomes idle, so as long as the
code which sets up the process's filter and sentinel functions and
their respective callbacks runs, Emacs will not try to call the
filter/sentinel functions.
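
Editor's illustration of this ordering argument, using Python's asyncio event loop as an analogy. asyncio is not the Emacs command loop, and `deliver`, `handler` and `command` are hypothetical names; the sketch only shows that a callback queued early does not run until the code that queued it gives up control.

```python
import asyncio
from typing import Callable, List, Optional

received: List[str] = []
handler: Optional[Callable[[str], None]] = None


def deliver(hit: str) -> None:
    """Stand-in for the filter: forwards a hit to whatever handler is installed."""
    if handler is None:
        received.append("LOST " + hit)
    else:
        handler(hit)


async def command() -> None:
    global handler
    loop = asyncio.get_running_loop()
    # "Start the process": output delivery is queued before any callback exists.
    loop.call_soon(deliver, "a.el:1:emacs")
    # The rest of the "command" still runs first, because the loop only
    # dispatches queued callbacks once this coroutine gives up control.
    handler = received.append
    await asyncio.sleep(0)  # give up control; the loop now runs `deliver`


asyncio.run(command())
```

By the time `deliver` runs, the handler is already in place, so nothing is lost even though delivery was scheduled first.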






* bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
  2017-05-02 17:26                       ` Eli Zaretskii
  2017-05-02 17:41                         ` Eli Zaretskii
@ 2017-05-03  0:14                         ` Dmitry Gutov
  2017-05-03  2:34                           ` Eli Zaretskii
  1 sibling, 1 reply; 15+ messages in thread
From: Dmitry Gutov @ 2017-05-03  0:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: hariharanrangasamy, 26710

On 02.05.2017 20:26, Eli Zaretskii wrote:

> The sentinel/filter won't be called at all if keyboard/mouse input is
> available.  Once they are called, if each call takes a long processing
> time, the UI could feel sluggish, yes.

Hmm, and if we're in a "many calls, each of them fairly fast" situation?

Sounds like the UI might be quite usable (but doing anything with it 
would slow down the processing of search results).

> But I don't quite see how
> using threads will avoid the same problem, since the mechanism for
> thread switch is basically the same as for multiplexing UI with
> subprocess output.

Right, threads would serve only to make the code more readable. With 
filters, we'll have callbacks.

Threads can make this code look sequential, like iterating over a sequence.

> IMO, we should first explore the async subprocess road.

OK.

>> We'll probably be saved by filters having to wait until the current
>> command finishes executing, though.
> 
> Not sure I follow you: a filter function is called whenever some
> output arrives from the subprocess.  So they don't need to wait for
> the subprocess to finish.

The current command, as in "Emacs command loop" command. Anyway, you've 
addressed this issue in the next email.






* bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
  2017-05-03  0:14                         ` Dmitry Gutov
@ 2017-05-03  2:34                           ` Eli Zaretskii
  0 siblings, 0 replies; 15+ messages in thread
From: Eli Zaretskii @ 2017-05-03  2:34 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: hariharanrangasamy, 26710

> Cc: hariharanrangasamy@gmail.com, 26710@debbugs.gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Wed, 3 May 2017 03:14:52 +0300
> 
> On 02.05.2017 20:26, Eli Zaretskii wrote:
> 
> > The sentinel/filter won't be called at all if keyboard/mouse input is
> > available.  Once they are called, if each call takes a long processing
> > time, the UI could feel sluggish, yes.
> 
> Hmm, and if we're in a "many calls, each of them fairly fast" situation?

Then there should be no problem with UI responsiveness.

> Sounds like the UI might be quite usable (but doing anything with it 
> would slow down the processing of search results).

Correct.





