From: Eric Wong <e@80x24.org>
To: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Cc: meta@public-inbox.org
Subject: [PATCH] www: use correct threadid for per-thread search
Date: Fri, 16 Jun 2023 23:13:01 +0000
Message-ID: <20230616231301.M394415@dcvr>
In-Reply-To: <20230616-rudy-comedy-vision-2b9f92@meerkat>
Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Thu, Mar 30, 2023 at 11:29:51AM +0000, Eric Wong wrote:
> > This implements the mbox.gz retrieval. I didn't want to deal
> > with HTML nor figuring out how to expose more <form> elements,
> > yet; but I figure mbox.gz is the most important.
> >
> > Now deployed on 80x24.org/lore:
> >
> > MSGID=20230327080502.GA570847@ziqianlu-desk2
> > curl -d '' -sSf \
> > https://80x24.org/lore/all/"$MSGID/?x=m&q=rt:2023-03-29.." | \
> > zcat | grep -i ^Message-ID:
>
> Eric:
>
> Reviving this old thread for some clarification. I noticed that this only
> works for /all/, but not for individual inboxes. E.g.:
>
> $ curl -d '' -sSf \
> https://lore.kernel.org/all/"$MSGID/?x=m&q=rt:2023-03-29.." \
> | zgrep -i ^Message-ID:
> Message-ID: <cfcf852c-e9f0-f560-542d-0f72777a85b2@leemhuis.info>
>
> but with /lkml/ I get a 404:
>
> $ curl -d '' -sSf \
> https://lore.kernel.org/lkml/"$MSGID/?x=m&q=rt:2023-03-29.." \
> | zgrep -i ^Message-ID:
> curl: (22) The requested URL returned error: 404
>
> Is that intentionally restricted to just extindex?

It's a bug; the fix is below and is deployed to https://80x24.org/lore/

---------8<---------
Subject: [PATCH] www: use correct threadid for per-thread search

For individual public-inboxes relying on extindex for per-inbox
search, we must use the threadid from the extindex over.sqlite3
rather than the per-inbox over.sqlite3 file.

Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20230616-rudy-comedy-vision-2b9f92@meerkat/
---
lib/PublicInbox/Mbox.pm | 10 +++++++---
t/extindex-psgi.t | 39 +++++++++++++++++++++++++++++++++++++--
2 files changed, 44 insertions(+), 5 deletions(-)
diff --git a/lib/PublicInbox/Mbox.pm b/lib/PublicInbox/Mbox.pm
index e1abf7ec..bf61bb0e 100644
--- a/lib/PublicInbox/Mbox.pm
+++ b/lib/PublicInbox/Mbox.pm
@@ -225,15 +225,19 @@ sub mbox_all {
return mbox_all_ids($ctx) if $q_string !~ /\S/;
my $srch = $ctx->{ibx}->isrch or
return PublicInbox::WWW::need($ctx, 'Search');
- my $over = $ctx->{ibx}->over or
- return PublicInbox::WWW::need($ctx, 'Overview');
my $qopts = $ctx->{qopts} = { relevance => -2 }; # ORDER BY docid DESC
# {threadid} limits results to a given thread
# {threads} collapses results from messages in the same thread,
# allowing us to use ->expand_thread w/o duplicates in our own code
- $qopts->{threadid} = $over->mid2tid($ctx->{mid}) if defined($ctx->{mid});
+ if (defined($ctx->{mid})) {
+ my $over = ($ctx->{ibx}->{isrch} ?
+ $ctx->{ibx}->{isrch}->{es}->over :
+ $ctx->{ibx}->over) or
+ return PublicInbox::WWW::need($ctx, 'Overview');
+ $qopts->{threadid} = $over->mid2tid($ctx->{mid});
+ }
$qopts->{threads} = 1 if $q->{t};
$srch->query_approxidate($ctx->{ibx}->git, $q_string);
my $mset = $srch->mset($q_string, $qopts);
diff --git a/t/extindex-psgi.t b/t/extindex-psgi.t
index 98dc2e48..f10ffbb6 100644
--- a/t/extindex-psgi.t
+++ b/t/extindex-psgi.t
@@ -1,5 +1,5 @@
#!perl -w
-# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
use strict;
use v5.10.1;
@@ -21,7 +21,28 @@ mkdir "$home/.public-inbox" or BAIL_OUT $!;
my $pi_config = "$home/.public-inbox/config";
cp($cfg_path, $pi_config) or BAIL_OUT;
my $env = { HOME => $home };
-run_script([qw(-extindex --all), "$tmpdir/eidx"], $env) or BAIL_OUT;
+my $m2t = create_inbox 'mid2tid', version => 2, indexlevel => 'basic', sub {
+ my ($im, $ibx) = @_;
+ for my $n (1..3) {
+ $im->add(PublicInbox::Eml->new(<<EOM)) or xbail 'add';
+Date: Fri, 02 Oct 1993 00:0$n:00 +0000
+Message-ID: <t\@$n>
+Subject: tid $n
+From: x\@example.com
+References: <a-mid\@b>
+
+$n
+EOM
+ $im->add(PublicInbox::Eml->new(<<EOM)) or xbail 'add';
+Date: Fri, 02 Oct 1993 00:0$n:00 +0000
+Message-ID: <ut\@$n>
+Subject: unrelated tid $n
+From: x\@example.com
+References: <b-mid\@b>
+
+EOM
+ }
+};
{
open my $cfgfh, '>>', $pi_config or BAIL_OUT;
$cfgfh->autoflush(1);
@@ -32,8 +53,14 @@ run_script([qw(-extindex --all), "$tmpdir/eidx"], $env) or BAIL_OUT;
[publicinbox]
wwwlisting = all
grokManifest = all
+[publicinbox "m2t"]
+ inboxdir = $m2t->{inboxdir}
+ address = $m2t->{-primary_address}
EOM
+ close $cfgfh or xbail "close: $!";
}
+
+run_script([qw(-extindex --all), "$tmpdir/eidx"], $env) or BAIL_OUT;
my $www = PublicInbox::WWW->new(PublicInbox::Config->new($pi_config));
my $client = sub {
my ($cb) = @_;
@@ -83,6 +110,14 @@ my $client = sub {
't2 manifest');
is_deeply([ sort keys %{$m->{'/t1'}} ], [ '/t1' ],
't2 manifest');
+
+ # ensure ibx->{isrch}->{es}->over is used instead of ibx->over:
+ $res = $cb->(POST("/m2t/t\@1/?q=dt:19931002000259..&x=m"));
+ is($res->code, 200, 'hit on mid2tid query');
+ $res = $cb->(POST("/m2t/t\@1/?q=dt:19931002000400..&x=m"));
+ is($res->code, 404, '404 on out-of-range mid2tid query');
+ $res = $cb->(POST("/m2t/t\@1/?q=s:unrelated&x=m"));
+ is($res->code, 404, '404 on cross-thread search');
};
test_psgi(sub { $www->call(@_) }, $client);
%$env = (%$env, TMPDIR => $tmpdir, PI_CONFIG => $pi_config);
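
To make the commit message concrete, here is a toy Python model of the
threadid mismatch. It is not public-inbox code: plain dicts stand in for the
two over.sqlite3 files, and every name and number in it (PER_INBOX_OVER,
EXTINDEX_OVER, EXTINDEX_DOCS, thread_search, the IDs 1, 2, 41, 42) is an
assumption made up for illustration.

```python
# Toy model only: plain dicts stand in for the two over.sqlite3 files.
# Every name and number below is hypothetical.

# mid -> threadid as recorded in the inbox's own over.sqlite3
PER_INBOX_OVER = {"t@1": 1, "t@2": 1, "t@3": 1, "ut@1": 2}

# mid -> threadid as recorded in the extindex over.sqlite3; the IDs are
# assumed to differ because the two databases number threads independently
EXTINDEX_OVER = {"t@1": 41, "t@2": 41, "t@3": 41, "ut@1": 42}

# Search documents as indexed via extindex: each carries the extindex threadid
EXTINDEX_DOCS = [{"mid": mid, "threadid": tid} for mid, tid in EXTINDEX_OVER.items()]


def thread_search(mid, over):
    """Return the mids sharing a thread with `mid`, resolving the threadid via `over`."""
    tid = over.get(mid)
    if tid is None:
        return []
    return sorted(d["mid"] for d in EXTINDEX_DOCS if d["threadid"] == tid)


before_fix = thread_search("t@1", PER_INBOX_OVER)  # threadid taken from the wrong DB
after_fix = thread_search("t@1", EXTINDEX_OVER)    # threadid taken from the extindex DB
```

In this model the per-inbox threadid matches nothing in the extindex-backed
search, which corresponds to the 404 reported for /lkml/; the extindex
threadid returns the three messages of the thread.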