From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 825251F9F3
	for ; Tue, 12 Oct 2021 22:45:00 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH 1/5] extindex: flush pending reindex before unref
Date: Tue, 12 Oct 2021 22:44:56 +0000
Message-Id: <20211012224500.2882-2-e@80x24.org>
In-Reply-To: <20211012224500.2882-1-e@80x24.org>
References: <20211012224500.2882-1-e@80x24.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id:

This prevents unnecessary message renumbering and I/O.  Without
this change, there is a small window for long-running WWW
streaming requests to miss a message that was unref-ed before
reindexing.  If we expose an "All Mail" mailbox via IMAP/JMAP,
this will save client traffic.
---
 lib/PublicInbox/ExtSearchIdx.pm | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index c2ab0447e176..40489eab4c66 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -193,6 +193,7 @@ sub do_xpost ($$) {
 		$idx->ipc_do('add_eidx_info', $docid, $eidx_key, $eml);
 		apply_boost($req, $smsg) if $req->{boost_in_use};
 	} else { # 'd' no {xnum}
+		$self->git->async_wait_all;
 		$oid = pack('H*', $oid);
 		_unref_doc($req, $docid, $xibx, undef, $oid, $eml);
 	}
@@ -261,6 +262,7 @@ sub _blob_missing ($$) { # called when a known $smsg->{blob} is gone
 	# xnum and ibx are unknown, we only call this when an entry from
 	# /ei*/over.sqlite3 is bad, not on entries from xap*/over.sqlite3
 	my $oidbin = pack('H*', $smsg->{blob});
+	$req->{self}->git->async_wait_all;
 	_unref_doc($req, $smsg, undef, undef, $oidbin);
 }

@@ -552,6 +554,7 @@ sub _reindex_finalize ($$$) {
 	}

 	return if $nr == 1; # likely, all good
+	$self->git->async_wait_all;
 	warn "W: #$docid split into $nr due to deduplication change\n";
 	my @todo;
 	for my $ary (values %$by_chash) {
@@ -896,6 +899,7 @@ ibx_id = ? AND xnum >= ? AND xnum <= ?
 		}
 		return if $sync->{quit};
 		next unless scalar keys %x3m;
+		$self->git->async_wait_all; # wait for reindex_unseen

 		# eliminate stale/mismatched entries
 		my %mismatch = map { $_->{num} => $_->{blob} } @$msgs;