From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dmitry Gutov
Newsgroups: gmane.emacs.bugs
Subject: bug#26710: Fwd: 25.2; project-find-regexp makes emacs use 100% cpu
Date: Tue, 2 May 2017 00:46:25 +0300
References: <87a86zu3gf.fsf@hari-laptop.i-did-not-set--mail-host-address--so-tickle-me> <83vapnktcn.fsf@gnu.org> <3d76a3ac-32ad-412d-349d-5904fc964a2b@yandex.ru> <83ziexka0s.fsf@gnu.org> <77b3a404-adac-fd1c-bd99-ad10e2450338@yandex.ru> <83inlljb5r.fsf@gnu.org>
In-Reply-To: <83inlljb5r.fsf@gnu.org>
To: Eli Zaretskii
Cc: hariharanrangasamy@gmail.com, 26710@debbugs.gnu.org
Xref: news.gmane.org gmane.emacs.bugs:132176

On 01.05.2017 10:20, Eli Zaretskii wrote:

> In my testing, find-grep finishes almost instantaneously.
> The exception is when you have a cold cache, but even then it takes
> about 10% of the total run time, for the Emacs source tree (which
> yields about 100,000 hits in the test case).

This particular example uses a very frequent term. I get about 61,000
hits, which is still a lot; the search never finishes here (probably
because I have more minor modes and customizations enabled).

I don't think this is the common case, but let's try to remove some
unnecessary work in Elisp first. See commit c99a3b9. Please take a look
at xref--regexp-syntax-dependent-p specifically, and see if any
significant false negatives come to mind.

With this, project-find-regexp for 'emacs' finally completes in ~10
seconds on my machine. That's still more than 10 times longer than the
external process takes, but I'm out of big optimization ideas at this
point.

> I thought the request was to allow the user do something in the
> foreground, while this processing runs in the background. If that's
> not what was requested, then I guess I no longer understand the
> request.

If the project is huge and there are only a few hits, parallelizing the
search and the processing will allow the user to do whatever they want
in the foreground, because processing in Elisp, while slow, will still
take only a small fraction of the time.

If the search term returns a lot of hits (compared to the size of the
project), processing might indeed take a lot of time, and the UI might
appear sluggish (I'm not sure how sluggish, though; that should depend
on the scheduling of the main and background threads).

Even if it's sluggish, at least the user will see that the search has
started and that there is some progress. We could even allow them to
stop the search midway and still do something with the first results.

These are some of the advantages 'M-x rgrep' has over
project-find-regexp.

>> What we _can_ manage to run in parallel, in the find-grep process in the
>> background, and the post-processing of the results in Elisp.
>
> Yes, you can -- if you invoke find-grep asynchronously and move the
> processing of the hits to the filter function.

Yes, these parts are necessary either way. What I was describing would
go on top of them, as an abstraction.

> But that doesn't need
> to involve threads, and is being done in many packages/features out
> there, so I'm not sure what did you ask me to do with this.

I imagined that an xref API that allows this kind of asynchronous
result might look better and be more readable if it's implemented with
threads underneath.

> IOW, it
> should be "trivial", at least in principle, to make this command work
> in the background, just like, say, "M-x grep".

In Compilation buffers (of which Grep is one example), the sentinel
code has access to the buffer where the results are displayed, and the
process outputs to that buffer as well. And 'M-x rgrep' doesn't have to
abstract over the possible ways to obtain search results.

None of those is the case with the xref API or with the results
rendering code, which has to work with the values returned by an
arbitrary xref backend, as documented.

Right now, an xref backend implements several methods that are allowed
to return the same type of value: "a list of xref items". Our task, as
I see it, is to generalize that return value type for asynchronous
work, and to do that as sanely as possible.

Threads are not strictly necessary for this (see the last paragraph of
my previous email), but this case seems like it could be a good,
limited-in-scope showcase for the threading functionality.

> I'm not sure I understand the need for this complexity, given that
> async subprocesses are available. I'm probably missing something
> because I know too little about the internals of the involved code.

The main thing to understand is the xref API, not the internals of the
package.
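
For illustration, the async-subprocess approach discussed above (run the search in the background and process hits in the filter function) might be sketched roughly as follows. This is not code from xref.el or project.el: the names `my-async-grep` and `my--parse-grep-line` are hypothetical, the sketch assumes a `grep` that accepts `-rnE`, assumes file names without colons, and hands batches of plain (FILE LINENO TEXT) lists to a callback instead of building real xref items.

```elisp
;;; -*- lexical-binding: t -*-
;; Sketch only.  `my-async-grep' and `my--parse-grep-line' are
;; hypothetical names, not part of xref.el or project.el.
;; Lexical binding is required: the filter closes over `pending'.

(require 'subr-x)                       ; for `string-trim' on older Emacs

(defun my--parse-grep-line (line)
  "Parse LINE of the form FILE:LINENO:TEXT into (FILE LINENO TEXT).
Return nil if LINE does not have that shape.  Assumes FILE has no colon."
  (when (string-match "\\`\\([^:\n]+\\):\\([0-9]+\\):\\(.*\\)\\'" line)
    (list (match-string 1 line)
          (string-to-number (match-string 2 line))
          (match-string 3 line))))

(defun my-async-grep (regexp dir on-hits on-done)
  "Search for REGEXP under DIR with an asynchronous grep process.
Call ON-HITS with each batch of parsed hits as output arrives.
Call ON-DONE with the event string when the process changes state
\(normally when it exits)."
  (let ((pending ""))                   ; partial last line of previous chunk
    (make-process
     :name "my-async-grep"
     :buffer nil
     :command (list "grep" "-rnE" "--" regexp (expand-file-name dir))
     :connection-type 'pipe
     :noquery t
     :filter (lambda (_proc chunk)
               ;; Output arrives in arbitrary chunks, so keep the trailing
               ;; incomplete line around until the next call.
               (let* ((lines (split-string (concat pending chunk) "\n"))
                      (complete (butlast lines)))
                 (setq pending (car (last lines)))
                 (let ((hits (delq nil (mapcar #'my--parse-grep-line
                                               complete))))
                   (when hits (funcall on-hits hits)))))
     :sentinel (lambda (_proc event)
                 (funcall on-done (string-trim event))))))

;; Hypothetical usage: report progress while Emacs stays responsive.
;; (my-async-grep "defun" "~/src/emacs/lisp/"
;;                (lambda (hits) (message "%d more hits" (length hits)))
;;                (lambda (status) (message "search: %s" status)))
```

This covers only the "necessary either way" part. The open question in this thread, how an arbitrary xref backend should hand such incremental results to the rendering code instead of returning a complete "list of xref items", is not addressed by the sketch.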