From: Eric Abrahamsen <eric@ericabrahamsen.net>
To: Lars Ingebrigtsen <larsi@gnus.org>
Cc: 38011@debbugs.gnu.org
Subject: bug#38011: 27.0.50; [PATCH] WIP on allowing Gnus backends to return header data directly
Date: Sat, 02 Jan 2021 12:49:28 -0800
Message-ID: <87ble7kqsn.fsf@ericabrahamsen.net>
In-Reply-To: <874kjzangc.fsf@gnus.org> (Lars Ingebrigtsen's message of "Sat, 02 Jan 2021 06:59:31 +0100")

Lars Ingebrigtsen <larsi@gnus.org> writes:

> Eric Abrahamsen <eric@ericabrahamsen.net> writes:
>
>> I revisit this every few months, and have to completely relearn all the
>> code each time. With any luck that means that I've looked over these
>> diffs sufficiently to have caught more bugs.
>>
>> At any rate, I think this is finally ready to go.
>
> Congratulations!
>
> That's a big patch, and skimming it, I'm not quite sure I understand it
> all. Could you put this on a branch so that we can get a bit of testing
> before merging it?

Sure thing. It's in girzel/gnus-headers now. I made a few more sneaky
last-minute changes, so yes... testing is in order.

The basic principle is simple: it gives backends the option of parsing
their own article headers, rather than writing text into
`nntp-server-buffer' to get parsed later. In this sense, the diff on
`gnus-fetch-headers' is all there is to it. None of the backends
actually do this yet.
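
As a rough illustration of that principle (not code from the patch), the
sketch below accepts either kind of backend result: a marker saying that
unparsed text was written for later parsing, or a list of headers the
backend parsed itself. Every `my-' name is hypothetical, plists stand in
for real Gnus header objects, and the symbol `nov' is only a marker for
this sketch.

```elisp
;; Hypothetical sketch; none of these `my-' names exist in Gnus.

(defun my-parse-nov-text (text)
  "Parse TEXT, one \"NUMBER<TAB>SUBJECT\" line per article, into plists."
  (mapcar (lambda (line)
            (let ((fields (split-string line "\t")))
              (list :number (string-to-number (car fields))
                    :subject (cadr fields))))
          (split-string text "\n" t)))

(defun my-fetch-headers (backend-result &optional buffer-text)
  "Return a list of parsed headers for BACKEND-RESULT.
If BACKEND-RESULT is the symbol `nov', the backend wrote text that
still has to be parsed (passed here as BUFFER-TEXT).  If it is a
list, the backend has already parsed its own headers."
  (cond ((eq backend-result 'nov) (my-parse-nov-text buffer-text))
        ((listp backend-result) backend-result)
        (t (error "Unexpected backend result: %S" backend-result))))
```

Either way the caller ends up with a list of parsed headers, which is the
property the cache and agent rely on.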

It gets complicated because the cache and the agent need to mix their
saved headers into whatever newly-fetched headers we get from the
server. So instead of having them call `gnus-retrieve-headers' and
mix their cached text into the nntp-server-buffer, they now call
`gnus-fetch-headers' on the server, which actually returns real headers.
That means they also need to be responsible for extracting real headers
from their own cache files (rather than letting that happen further down
the line). In this patch both do that with
`gnus-get-newsgroup-headers-xover' (which efficiently parses only a
subset of [potentially very many] cached headers), then merge/sort those
headers with what came back from the server.
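
The merge/sort step could look roughly like the sketch below. It is not
the code in the patch: `my-merge-headers' is a hypothetical name, headers
are modelled as plists with a :number key, and letting the freshly
fetched header win over a cached one with the same number is an
assumption made only for this example.

```elisp
;; Hypothetical sketch; `my-merge-headers' does not exist in Gnus.

(defun my-merge-headers (cached fetched)
  "Merge CACHED and FETCHED header plists, sorted by :number.
When both lists contain the same article number, the entry from
FETCHED wins; that choice is an assumption of this sketch."
  (let ((seen (make-hash-table :test #'eql))
        merged)
    ;; Walk FETCHED first so its entries are recorded before any
    ;; cached duplicate is considered.
    (dolist (header (append fetched cached))
      (let ((num (plist-get header :number)))
        (unless (gethash num seen)
          (puthash num t seen)
          (push header merged))))
    (sort merged (lambda (a b)
                   (< (plist-get a :number) (plist-get b :number))))))
```

The hash table keeps the merge linear in the number of headers before the
final sort, which matters when the cache holds very many articles.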

A few points of contention:

1. I'm not sure there's a real difference between
   `gnus-agent-fetch-headers' and `gnus-agent-retrieve-headers' anymore.
   Both return actual headers. It would take a quiet afternoon of
   staring at the code to know for sure.

2. The agent and cache are now using `gnus-get-newsgroup-headers-xover'
   to parse their cached headers, which does its own dependency
   building. This means that `gnus-fetch-headers' has to be careful not
   to double-register headers in the dependency table. It also means
   that the agent and cache have to reach waaaaaay back to find a
   reference to the `gnus-dependency-table', which they're both doing
   with a call to `buffer-local-value', which feels gross and fragile.
   In general I would much prefer to build the dependency table in one
   place, preferably after all the headers have been retrieved,
   preferably in `gnus-select-newsgroup'. Another option would be to not
   use the higher-level `gnus-get-newsgroup-headers-xover', but instead
   to scan the cache files for article numbers and use the lower-level
   `nnheader-parse-nov', which isn't concerned with dependencies.

3. In general it took many extra brain cycles (of which I do not have a
   surplus) to follow the code flow. I would love it if
   `gnus-retrieve-headers' -- rather than calling one of
   `gnus-(cache|agent)-retrieve-headers' and expecting them to re-call
   `gnus-retrieve-headers' multiple times with various global variables
   set -- called everything consecutively, dumping the article list
   into each function in turn (cache, agent, server) and filtering the
   list according to which headers we get back. But that's another
   patch for another time.
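
To make the idea in point 3 concrete, here is a minimal sketch of the
"consecutive" flow. It is not part of the patch. The function name, the
plist representation of headers, and the convention that each source is
a function taking a list of article numbers are all assumptions of this
example.

```elisp
;; Hypothetical sketch; `my-retrieve-consecutively' does not exist in Gnus.
(require 'cl-lib)

(defun my-retrieve-consecutively (articles sources)
  "Ask each function in SOURCES for headers of ARTICLES, in order.
Each source takes a list of article numbers and returns a list of
header plists with a :number key.  Articles answered by one source
are removed from the list passed to the next one."
  (let ((remaining articles)
        (headers nil))
    (dolist (source sources)
      (when remaining
        (let* ((found (funcall source remaining))
               (found-nums (mapcar (lambda (h) (plist-get h :number))
                                   found)))
          (setq headers (append headers found))
          ;; Only articles nobody has answered yet go to the next source.
          (setq remaining (cl-set-difference remaining found-nums)))))
    ;; Copy before sorting so the lists returned by sources are untouched.
    (sort (copy-sequence headers)
          (lambda (a b)
            (< (plist-get a :number) (plist-get b :number))))))
```

Called with a list such as (cache agent server), each source only sees
the article numbers that earlier sources could not supply, and no global
variables are needed to coordinate them.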