From: "Eric Wong (Contractor, The Linux Foundation)" <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 22/34] nntp: use NNTP article numbers for lookups
Date: Tue, 6 Mar 2018 08:42:30 +0000
Message-ID: <20180306084242.19988-23-e@80x24.org>
In-Reply-To: <20180306084242.19988-1-e@80x24.org>
Since Message-IDs are no longer unique within Xapian
(but are within the SQLite Msgmap), favor NNTP article
numbers for internal lookups. This will prevent us
from finding the "wrong" internal Message-ID.
---
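Illustrative note, not part of the patch: a minimal sketch of why a
lookup keyed on the article number is unambiguous where a lookup keyed
on the Message-ID may not be. The hash %by_num and the helpers
nums_for_mid() and toy_lookup_article() are hypothetical in-memory
stand-ins invented for this example; they do not reflect the actual
Xapian or Msgmap code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the index: article number => document.
# Articles 2 and 3 share one Message-ID, as can happen once
# Message-IDs are no longer unique per document.
my %by_num = (
	1 => { mid => 'a@example', subject => 'first' },
	2 => { mid => 'dup@example', subject => 'original' },
	3 => { mid => 'dup@example', subject => 'resend' },
);

# A Message-ID lookup may match several documents.
sub nums_for_mid {
	my ($mid) = @_;
	return sort { $a <=> $b }
		grep { $by_num{$_}->{mid} eq $mid } keys %by_num;
}

# An article number lookup matches at most one document
# (returns undef when the number is unknown).
sub toy_lookup_article {
	my ($num) = @_;
	return $by_num{$num};
}

my @nums = nums_for_mid('dup@example');
print "Message-ID matches: @nums\n";
print "article 3: ", toy_lookup_article(3)->{subject}, "\n";
```

With the Message-ID as the key, the caller has to guess which of the
two matches was meant; with the article number there is nothing to
guess.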
lib/PublicInbox/NNTP.pm | 29 ++++++++++++++---------------
lib/PublicInbox/Search.pm | 21 +++++++++++++++++++++
2 files changed, 35 insertions(+), 15 deletions(-)
diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm
index 56d8e01..895e502 100644
--- a/lib/PublicInbox/NNTP.pm
+++ b/lib/PublicInbox/NNTP.pm
@@ -463,18 +463,16 @@ find_mid:
defined $mid or return $err;
}
found:
- my $bytes;
- my $s = eval { $ng->msg_by_mid($mid, \$bytes) } or return $err;
- $s = Email::Simple->new($s);
- my $lines;
+ my $smsg = $ng->search->lookup_article($n) or return $err;
+ my $msg = $ng->msg_by_smsg($smsg) or return $err;
+ my $s = Email::Simple->new($msg);
if ($set_headers) {
set_nntp_headers($s->header_obj, $ng, $n, $mid);
- $lines = $s->body =~ tr!\n!\n!;
# must be last
$s->body_set('') if ($set_headers == 2);
}
- [ $n, $mid, $s, $bytes, $lines, $ng ];
+ [ $n, $mid, $s, $smsg->bytes, $smsg->lines, $ng ];
}
sub simple_body_write ($$) {
@@ -693,8 +691,8 @@ sub hdr_xref ($$$) { # optimize XHDR Xref [range] for rtin
}
sub search_header_for {
- my ($srch, $mid, $field) = @_;
- my $smsg = $srch->lookup_mail($mid) or return;
+ my ($srch, $num, $field) = @_;
+ my $smsg = $srch->lookup_article($num) or return;
$smsg->$field;
}
@@ -702,8 +700,8 @@ sub hdr_searchmsg ($$$$) {
my ($self, $xhdr, $field, $range) = @_;
if (defined $range && $range =~ /\A<(.+)>\z/) { # Message-ID
my ($ng, $n) = mid_lookup($self, $1);
- return r430 unless $n;
- my $v = search_header_for($ng->search, $range, $field);
+ return r430 unless defined $n;
+ my $v = search_header_for($ng->search, $n, $field);
hdr_mid_response($self, $xhdr, $ng, $n, $range, $v);
} else { # numeric range
$range = $self->{article} unless defined $range;
@@ -803,9 +801,10 @@ sub cmd_xrover ($;$) {
more($self, '224 Overview information follows');
long_response($self, $beg, $end, sub {
my ($i) = @_;
- my $mid = $mm->mid_for($$i) or return;
- my $h = search_header_for($srch, $mid, 'references');
- more($self, "$$i $h");
+ my $num = $$i;
+ my $h = search_header_for($srch, $num, 'references');
+ defined $h or return;
+ more($self, "$num $h");
});
}
@@ -829,8 +828,8 @@ sub cmd_over ($;$) {
my ($self, $range) = @_;
if ($range && $range =~ /\A<(.+)>\z/) {
my ($ng, $n) = mid_lookup($self, $1);
- my $smsg = $ng->search->lookup_mail($range) or
- return '430 No article with that message-id';
+ defined $n or return r430;
+ my $smsg = $ng->search->lookup_article($n) or return r430;
more($self, '224 Overview information follows (multi-line)');
# Only set article number column if it's the current group
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index a1c423c..802984b 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -372,6 +372,27 @@ sub lookup_mail { # no ghosts!
});
}
+sub lookup_article {
+ my ($self, $num) = @_;
+ my $term = 'XNUM'.$num;
+ my $smsg;
+ eval {
+ retry_reopen($self, sub {
+ my $db = $self->{skel} || $self->{xdb};
+ my $head = $db->postlist_begin($term);
+ return if $head == $db->postlist_end($term);
+ my $doc_id = $head->get_docid;
+ return unless defined $doc_id;
+ # raises on error:
+ my $doc = $db->get_document($doc_id);
+ $smsg = PublicInbox::SearchMsg->wrap($doc);
+ $smsg->load_expand;
+ $smsg->{doc_id} = $doc_id;
+ });
+ };
+ $smsg;
+}
+
sub each_smsg_by_mid {
my ($self, $mid, $cb) = @_;
my $xdb = $self->{xdb};
--
EW