unofficial mirror of meta@public-inbox.org
From: Eric Wong <e@80x24.org>
To: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Cc: meta@public-inbox.org
Subject: Re: Cheap way to check for new messages in a thread
Date: Tue, 28 Mar 2023 19:45:49 +0000
Message-ID: <20230328194549.M808175@dcvr>
In-Reply-To: <euqfjyjedoywqdcbldq23o3k3joewdmnx5j3tsjcu2dfx253uj@mzrjrpfy5nxv>

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Mon, Mar 27, 2023 at 09:38:49PM +0000, Eric Wong wrote:
> > I thought about that, too; but I'm worried about having one-off
> > stuff that ends up needing to be supported indefinitely.
> > 
> > JMAP for this would take more time, but I'd be more comfortable
> > carrying it long-term.
> > 
> > I don't expect trimming after the first paragraph to be a huge
> > improvement.  Retrieving any part of the message from git and
> > dealing with MIME is expensive, anyways.  I wouldn't expect it
> > to be a big (if any) improvement compared to POST-ing for the
> > mbox.gz (&x=m&t=1) endpoint with rt:$SINCE..
> 
> Hmm... This didn't seem to do the right thing for me. For example, this
> thread:
> 
> https://lore.kernel.org/lkml/20230327080502.GA570847@ziqianlu-desk2
> 
> If I ask for any new messages in that thread since 20230327120000, I get
> nothing:
> 
> curl -Sf -d '' 'https://lore.kernel.org/all/?x=m&t=1&q=mid%3A20230327080502.GA570847@ziqianlu-desk2+AND+dt%3A20230328120000..'

Ugh, that's because the thread expansion (t=1) happens after
Xapian handles dt:/rt:/d:, so the date range excludes the mid:
match itself and there's nothing left to expand.

I don't know if there's a good way to do that entirely within
Xapian via high-level Perl bindings.
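
To illustrate the ordering problem, here is a toy model in plain
Python.  It is not public-inbox or Xapian code; the Msg fields,
Message-IDs and timestamps are all made up for the example.  The
date range is applied together with mid:, so the only candidate is
the thread root, which predates the cutoff:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Msg:
    mid: str     # Message-ID (hypothetical values below)
    thread: int  # thread identifier (hypothetical field)
    dt: int      # timestamp as YYYYmmddHHMMSS


# One thread whose root is older than the cutoff, plus an unrelated message.
MSGS = [
    Msg("root@example", 1, 20230327080502),
    Msg("reply1@example", 1, 20230328130000),
    Msg("reply2@example", 1, 20230328150000),
    Msg("other@example", 2, 20230328140000),
]


def search_then_expand(msgs, mid, since):
    """Model 'mid:X AND dt:SINCE..' running first, with t=1 applied afterwards."""
    hits = [m for m in msgs if m.mid == mid and m.dt >= since]
    threads = {m.thread for m in hits}
    # Thread expansion only sees threads of messages that survived the search.
    return [m for m in msgs if m.thread in threads]


result = search_then_expand(MSGS, "root@example", 20230328120000)
print(len(result))  # prints 0
```

With an earlier cutoff the root matches and the whole thread comes
back, old messages included, which is the other half of the problem.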

Some options:

A) grab MSGID first, look up the THREADID for that MSGID,
   then apply the remaining query within that thread

   The problem is figuring out which parts of the query to
   handle first.  Maybe a solution below...

B) add explicit before= and after= parameters which allow us
   to do filtering ourselves in the thread expansion phase

C) index References:/In-Reply-To: so searching `ref:$MSGID'
   can work.  This doesn't work for some MUAs and deep
   threads, though.

D) Support `thread:{subquery}' like notmuch.
   Thus `thread:{mid:$MSGID} AND dt:$START..' would communicate
   to Xapian what we want for A).

   I'm not sure this is doable unless using Xapian via C++,
   but I've been considering providing the option to use C++
   anyways to support less hacky approxidate query parsing.
   According to notmuch docs, it's expensive, though :<

I think it's possible to support /$INBOX/$MSGID/t.mbox.gz?q=...
for A) without too much difficulty.  I'll have to think
about it a bit...

D) is good for long-term consideration if proper timeouts can
be implemented.
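
A rough sketch of the idea behind A) and B), again as a toy model
in plain Python rather than anything from the public-inbox code
base (the Msg fields, Message-IDs and new_in_thread() are
hypothetical): resolve the Message-ID to its thread without any
date range, then apply the date range while walking the thread.

```python
from typing import NamedTuple


class Msg(NamedTuple):
    mid: str        # Message-ID (hypothetical values below)
    thread_id: int  # thread identifier (hypothetical field)
    dt: int         # timestamp as YYYYmmddHHMMSS


MSGS = [
    Msg("root@example", 1, 20230327080502),
    Msg("reply1@example", 1, 20230328130000),
    Msg("reply2@example", 1, 20230328150000),
    Msg("other@example", 2, 20230328140000),
]


def new_in_thread(msgs, mid, since):
    """Return Message-IDs in mid's thread that are dated at or after `since`."""
    # Step 1: find the thread for the given Message-ID, ignoring dates.
    thread_ids = {m.thread_id for m in msgs if m.mid == mid}
    # Step 2: filter by date during thread expansion.
    return [m.mid for m in msgs if m.thread_id in thread_ids and m.dt >= since]


print(new_in_thread(MSGS, "root@example", 20230328120000))
# prints ['reply1@example', 'reply2@example']
```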

> > The mbox.gz endpoints should be a bit more efficient for the
> > server than Atom feeds; decoding MIME and HTML escaping takes up
> > considerable CPU time.
> 
> Good to know. I'm really looking for a way to ask the remote system "hey, is
> there anything new in this thread?" so that I can quickly ignore threads
> without any updates.

All the mbox.gz endpoints will 404 if there are no results, and
the `-f' flag of curl will ensure nothing's emitted to stdout
in that case.
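
For a client not using curl, the same check might look like the
sketch below.  It only assumes what is stated above (an empty POST
to an mbox.gz endpoint, 404 meaning no results); has_results() and
its `opener' parameter are names invented for this example, and
unlike `curl -f' it re-raises HTTP errors other than 404 instead of
treating them all alike.

```python
import urllib.error
import urllib.request


def has_results(url, opener=urllib.request.urlopen):
    """POST an empty body to `url`; True on success, False on HTTP 404."""
    req = urllib.request.Request(url, data=b"", method="POST")
    try:
        with opener(req):
            # Any successful response means the mbox.gz has at least one message.
            return True
    except urllib.error.HTTPError as e:
        if e.code == 404:
            return False  # no results for this query
        raise  # other failures (5xx, 403, ...) are real errors


# Usage, with whichever mbox.gz query URL you'd otherwise hand to curl:
#   if has_results(url): fetch and process the thread
```

The `opener' argument is only there so the function can be
exercised without network access.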

Thread overview: 17+ messages
2023-03-27 15:08 Cheap way to check for new messages in a thread Konstantin Ryabitsev
2023-03-27 19:10 ` Eric Wong
2023-03-27 20:47   ` Konstantin Ryabitsev
2023-03-27 21:38     ` Eric Wong
2023-03-28 14:04       ` Konstantin Ryabitsev
2023-03-28 19:45         ` Eric Wong [this message]
2023-03-28 20:00           ` Konstantin Ryabitsev
2023-03-28 22:08             ` Eric Wong
2023-03-28 23:30               ` Konstantin Ryabitsev
2023-03-29 21:25                 ` Eric Wong
2023-03-30 11:29                   ` Eric Wong
2023-03-30 16:45                     ` Konstantin Ryabitsev
2023-03-31  1:40                       ` Eric Wong
2023-04-11 11:27                         ` Eric Wong
2023-06-16 19:11                     ` Konstantin Ryabitsev
2023-06-16 23:13                       ` [PATCH] www: use correct threadid for per-thread search Eric Wong
2023-06-21 17:11                         ` Konstantin Ryabitsev
