From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cohen
Newsgroups: gmane.emacs.bugs
Subject: bug#63842: 30.0.50; Slow 'gnus-summary-refer-thread'
Date: Sun, 18 Jun 2023 08:45:41 +0800
Message-ID: <87ttv54myy.fsf@ust.hk>
References: <871qiu6m1f.fsf@ledu-giraud.fr> <87bkhxl9th.fsf@ust.hk> <87ilc44r3u.fsf@ledu-giraud.fr> <831qiduv5f.fsf@gnu.org> <87352rr8vz.fsf@ledu-giraud.fr> <87fs6rashz.fsf@ust.hk> <87352ppwed.fsf@ledu-giraud.fr>
In-Reply-To: <87352ppwed.fsf@ledu-giraud.fr> (Manuel Giraud's message of "Sun, 18 Jun 2023 00:16:26 +0200")
Mime-Version: 1.0
Content-Type: text/plain
User-Agent: Gnus/5.13 (Gnus v5.13)
To: Manuel Giraud
Cc: Andrew Cohen, Eli Zaretskii, 63842@debbugs.gnu.org, cohen@andy.bu.edu

OK, I think I understand the problem. Before the change that Manuel
identified as the culprit of the slowdown, thread referral resulted in
the creation of a new ephemeral group to hold the search results. One
of the major features of the changes was to try to add the articles
from the search to the existing summary buffer rather than create a
new one (this is an important improvement for a variety of reasons;
for example, changes made in the additional ephemeral buffer would be
overridden when exiting the originating buffer).

So what Manuel is seeing: previously the new ephemeral group parsed
only the headers for the articles in the thread, while in the current
implementation, where the newly found articles are simply added to the
existing summary buffer, all the headers in the original buffer are
parsed.

I need to figure out the right way to fix this. I think parsing the
full set of headers is the right thing, but since it is slow that
creates a problem.

In the meantime, Manuel, can you try setting
'gnus-refer-thread-use-search to t and see if this resolves your
problem? (This should effectively restore the old behavior of creating
a new ephemeral group to hold the search result.)
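For concreteness, the workaround above could look like the following.
This is only a minimal sketch: it assumes the setting goes in your Gnus
init file (for example ~/.gnus.el, which is a guess about your setup),
and it changes nothing else.

```elisp
;; Workaround sketch: make thread referral search the whole server.
;; Per the explanation above, this should bring back the old behavior
;; of putting the results in a new ephemeral group, so only the
;; headers of the found articles get parsed.
(setq gnus-refer-thread-use-search t)
```

To try it in the current session only, evaluate the form with M-: and
then run 'gnus-summary-refer-thread again on the same article to
compare how long it takes.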

Best,
Andy

>>>>> "MG" == Manuel Giraud writes:

    MG> Andrew Cohen writes:
    >> Sorry, I have gotten busy with other things at the moment.

    MG> Hi Andrew and you don't need to be sorry for this :-)

    >>>>>>> "MG" == Manuel Giraud writes:
    >>
    MG> Hi, So here is the crux of this issue. When using
    MG> 'gnus-summary-refer-thread' in a nnml group, Emacs ends up
    MG> calling 'gnus-get-newsgroup-headers-xover' (via
    MG> 'gnus-fetch-headers'). AFAIU in this function when
    MG> 'gnus-read-all-available-headers' is t, Emacs will parse *all*
    MG> of the " *nntp*" buffer content. In my case, this buffer is
    MG> quite big (about 50k lines and 23MiB) hence the slowness.
    >>
    >> Thanks for continuing to debug this. I am confused---why is the
    >> nntp buffer so full?

    MG> I think in a nnml group the nntp buffer is populated with the
    MG> content of the ".overview" file. In this particular group, I
    MG> have thousands of messages and I think that explains the size of
    MG> this file.

    >> The search routine should populate the buffer only with the
    >> headers of the articles found in the search (I am assuming that
    >> this list of found articles is not 50K lines long). Maybe the
    >> search is not working properly?

    MG> As we are talking about 'gnus-summary-refer-thread', I guess
    MG> that it is expected that the nntp buffer is filled with this
    MG> content. A regular query (with 'G G') is still fast, so I think
    MG> my search engine is set up properly.

    >> Can you step through gnus-summary-refer-thread and in the
    >> conditional that retrieves the new headers can you tell me which
    >> branch of the conditional is chosen (there are three
    >> possibilities: 'gnus-request-thread, 'gnus-search-thread, and the
    >> clause with the comment "Otherwise just retrieve some headers").

    MG> In my case, Emacs is using the 'gnus-search-thread' branch and
    MG> ends up calling 'gnus-get-newsgroup-headers-xover' which is the
    MG> function that parses all the nntp buffer content.

    MG> BTW, I also have examples where 'gnus-summary-refer-thread'
    MG> gives me some false positives (eg., not the same thread but part
    MG> of the subject matching)
    >>
    >> This is probably by design: in the olden days many mailers were
    >> broken and didn't handle the references header properly (I don't
    >> know if this is still the case). So by default gnus tries to use
    >> information from the subject header to help gather loose threads,
    >> which can result in articles not actually part of the thread
    >> being included. You can check if this is the reason for what you
    >> are seeing by setting
    >>
    >> (setq gnus-summary-thread-gathering-function
    >> 'gnus-gather-threads-by-references)
    >>
    >> and seeing if this makes a difference.

    MG> Ok, thanks for the explanation and FWIW, my
    MG> 'gnus-summary-thread-gathering-function' is already set to
    MG> 'gnus-gather-threads-by-references.

    MG> Best regards, -- Manuel Giraud

-- 
Andrew Cohen
Director, HKUST Jockey Club Institute for Advanced Study
Lam Woo Foundation Professor and Chair Professor of Physics
The Hong Kong University of Science and Technology