unofficial mirror of meta@public-inbox.org
* [PATCH 00/11] cleanups, mostly indexing related
@ 2020-09-02 11:04 Eric Wong
  2020-09-02 11:04 ` [PATCH 01/11] msgmap: note how we use ->created_at Eric Wong
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: Eric Wong @ 2020-09-02 11:04 UTC (permalink / raw)
  To: meta

Some cleanups ahead of detached index support.

Found some dead code, too.

Eric Wong (11):
  msgmap: note how we use ->created_at
  disambiguate OverIdx and Over by field name
  use more idiomatic internal API for ->over access
  search: remove special case for blank query
  tests: add "use strict" and declare v5.10.1 compatibility
  search: replace ->query with ->mset
  search: remove {over_ro} field
  imap: drop old, pre-Parse::RecDescent search parser
  wwwaltid: drop unused sqlite3_missing function
  overidx: document column uses
  v2writable: reuse read-only shard counting code

 lib/PublicInbox/ExtMsg.pm     |   4 +-
 lib/PublicInbox/IMAP.pm       |  63 +----------------
 lib/PublicInbox/Inbox.pm      |  11 ++-
 lib/PublicInbox/Mbox.pm       |   6 +-
 lib/PublicInbox/Msgmap.pm     |   1 +
 lib/PublicInbox/OverIdx.pm    |  18 ++---
 lib/PublicInbox/Search.pm     |  32 ++++-----
 lib/PublicInbox/SearchIdx.pm  |  32 ++++-----
 lib/PublicInbox/SearchView.pm |   3 +-
 lib/PublicInbox/SolverGit.pm  |   5 +-
 lib/PublicInbox/V2Writable.pm |  59 ++++++----------
 lib/PublicInbox/WwwAltId.pm   |  16 +----
 scripts/dupe-finder           |   3 +-
 t/altid.t                     |   8 +--
 t/altid_v2.t                  |   7 +-
 t/index-git-times.t           |  17 +++--
 t/indexlevels-mirror.t        |   8 +--
 t/mda_filter_rubylang.t       |   6 +-
 t/replace.t                   |   8 +--
 t/search-thr-index.t          |   8 +--
 t/search.t                    | 126 +++++++++++++++++-----------------
 t/v1reindex.t                 |   4 +-
 t/v2mda.t                     |  16 +++--
 t/v2mirror.t                  |  24 +++----
 t/v2reindex.t                 |   9 +--
 t/v2writable.t                |  14 ++--
 t/watch_filter_rubylang.t     |  12 ++--
 t/watch_maildir_v2.t          |  30 ++++----
 t/xcpdb-reshard.t             |   3 +-
 xt/eml_check_limits.t         |   2 +
 xt/perf-threading.t           |   2 +-
 31 files changed, 232 insertions(+), 325 deletions(-)



* [PATCH 01/11] msgmap: note how we use ->created_at
  2020-09-02 11:04 [PATCH 00/11] cleanups, mostly indexing related Eric Wong
@ 2020-09-02 11:04 ` Eric Wong
  2020-09-02 11:04 ` [PATCH 02/11] disambiguate OverIdx and Over by field name Eric Wong
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2020-09-02 11:04 UTC (permalink / raw)
  To: meta

It'll likely be used in the future for JMAP, detached indices,
and maybe other things.
---
 lib/PublicInbox/Msgmap.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/PublicInbox/Msgmap.pm b/lib/PublicInbox/Msgmap.pm
index d696ce83..f15875e3 100644
--- a/lib/PublicInbox/Msgmap.pm
+++ b/lib/PublicInbox/Msgmap.pm
@@ -91,6 +91,7 @@ sub last_commit_xap {
 	$self->meta_accessor("last_xap$version-$i", $commit);
 }
 
+# this is the UIDVALIDITY for IMAP (cf. RFC 3501 sec 2.3.1.1. item 3)
 sub created_at {
 	my ($self, $second) = @_;
 	$self->meta_accessor('created_at', $second);


* [PATCH 02/11] disambiguate OverIdx and Over by field name
  2020-09-02 11:04 [PATCH 00/11] cleanups, mostly indexing related Eric Wong
  2020-09-02 11:04 ` [PATCH 01/11] msgmap: note how we use ->created_at Eric Wong
@ 2020-09-02 11:04 ` Eric Wong
  2020-09-02 11:04 ` [PATCH 03/11] use more idiomatic internal API for ->over access Eric Wong
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2020-09-02 11:04 UTC (permalink / raw)
  To: meta

We'll use {oidx} as the common field name for the read-write
OverIdx, here, to disambiguate it from the read-only {over}
field.  This hopefully makes it clearer which code paths are
read-only and which are read-write.
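
A minimal sketch of the naming convention, for illustration only.  The
Sketch::OverIdx and Sketch::Over packages below are hypothetical
stand-ins and not the real PublicInbox classes; only the {oidx} and
{over} field names come from this patch.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for illustration; not the real PublicInbox classes.
{
	package Sketch::OverIdx; # read-write handle
	sub new { bless { rows => {} }, shift }
	sub add_overview {
		my ($self, $num, $subject) = @_;
		$self->{rows}->{$num} = $subject;
	}
}
{
	package Sketch::Over; # read-only handle, has no add_overview
	sub new {
		my ($class, $rows) = @_;
		bless { rows => $rows }, $class;
	}
	sub get_art { $_[0]->{rows}->{$_[1]} }
}

# indexing code path: the field name {oidx} signals read-write access
my $writer = { oidx => Sketch::OverIdx->new };
$writer->{oidx}->add_overview(1, 'hello');

# read path: the field name {over} signals read-only access
my $reader = { over => Sketch::Over->new($writer->{oidx}->{rows}) };
my $subject = $reader->{over}->get_art(1);
print "$subject\n";
```

With distinct field names, a grep for {oidx} should list only the
code paths that may write.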
---
 lib/PublicInbox/SearchIdx.pm  | 32 ++++++++++++++-----------------
 lib/PublicInbox/V2Writable.pm | 36 +++++++++++++++++------------------
 t/search-thr-index.t          |  8 ++++----
 t/search.t                    |  2 +-
 4 files changed, 37 insertions(+), 41 deletions(-)

diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index 3f2da6ab..eb620f44 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -69,8 +69,8 @@ sub new {
 	if ($version == 1) {
 		$self->{lock_path} = "$inboxdir/ssoma.lock";
 		my $dir = $self->xdir;
-		$self->{over} = PublicInbox::OverIdx->new("$dir/over.sqlite3");
-		$self->{over}->{-no_fsync} = 1 if $ibx->{-no_fsync};
+		$self->{oidx} = PublicInbox::OverIdx->new("$dir/over.sqlite3");
+		$self->{oidx}->{-no_fsync} = 1 if $ibx->{-no_fsync};
 	} elsif ($version == 2) {
 		defined $shard or die "shard is required for v2\n";
 		# shard is a number
@@ -419,8 +419,8 @@ sub add_message {
 		# of the fields which exist in over.sqlite3.  We may stop
 		# storing doc_data in Xapian sometime after we get multi-inbox
 		# search working.
-		if (my $over = $self->{over}) { # v1 only
-			$over->add_overview($mime, $smsg);
+		if (my $oidx = $self->{oidx}) { # v1 only
+			$oidx->add_overview($mime, $smsg);
 		}
 		if (need_xapian($self)) {
 			add_xapian($self, $mime, $smsg, $mids);
@@ -457,7 +457,7 @@ sub xdb_remove {
 
 sub remove_by_oid {
 	my ($self, $oid, $num) = @_;
-	die "BUG: remove_by_oid is v2-only\n" if $self->{over};
+	die "BUG: remove_by_oid is v2-only\n" if $self->{oidx};
 	$self->begin_txn_lazy;
 	xdb_remove($self, $oid, $num) if need_xapian($self);
 }
@@ -479,13 +479,9 @@ sub unindex_eml {
 	my $nr = 0;
 	my %tmp;
 	for my $mid (@$mids) {
-		my @removed = eval { $self->{over}->remove_oid($oid, $mid) };
-		if ($@) {
-			warn "E: failed to remove <$mid> from overview: $@\n";
-		} else {
-			$nr += scalar @removed;
-			$tmp{$_}++ for @removed;
-		}
+		my @removed = $self->{oidx}->remove_oid($oid, $mid);
+		$nr += scalar @removed;
+		$tmp{$_}++ for @removed;
 	}
 	if (!$nr) {
 		$mids = join('> <', @$mids);
@@ -507,9 +503,9 @@ sub index_mm {
 	my $mids = mids($mime);
 	my $mm = $self->{mm};
 	if ($sync->{reindex}) {
-		my $over = $self->{over};
+		my $oidx = $self->{oidx};
 		for my $mid (@$mids) {
-			my ($num, undef) = $over->num_mid0_for_oid($oid, $mid);
+			my ($num, undef) = $oidx->num_mid0_for_oid($oid, $mid);
 			return $num if defined $num;
 		}
 		$mm->num_for($mids->[0]) // $mm->mid_insert($mids->[0]);
@@ -603,7 +599,7 @@ sub v1_checkpoint ($$;$) {
 		}
 	}
 
-	$self->{over}->rethread_done($sync->{-opt}) if $newest; # all done
+	$self->{oidx}->rethread_done($sync->{-opt}) if $newest; # all done
 	commit_txn_lazy($self);
 	$self->{ibx}->git->cleanup;
 	my $nr = ${$sync->{nr}};
@@ -773,7 +769,7 @@ sub _index_sync {
 	my $pr = $opt->{-progress};
 	my $sync = { reindex => $opt->{reindex}, -opt => $opt };
 	my $xdb = $self->begin_txn_lazy;
-	$self->{over}->rethread_prepare($opt);
+	$self->{oidx}->rethread_prepare($opt);
 	my $mm = _msgmap_init($self);
 	if ($sync->{reindex}) {
 		my $last = $mm->last_commit;
@@ -804,7 +800,7 @@ sub DESTROY {
 sub _begin_txn {
 	my ($self) = @_;
 	my $xdb = $self->{xdb} || idx_acquire($self);
-	$self->{over}->begin_lazy if $self->{over};
+	$self->{oidx}->begin_lazy if $self->{oidx};
 	$xdb->begin_transaction if $xdb;
 	$self->{txn} = 1;
 	$xdb;
@@ -844,7 +840,7 @@ sub _commit_txn {
 		set_metadata_once($self);
 		$xdb->commit_transaction;
 	}
-	$self->{over}->commit_lazy if $self->{over};
+	$self->{oidx}->commit_lazy if $self->{oidx};
 }
 
 sub commit_txn_lazy {
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 553dd839..c8334645 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -114,13 +114,13 @@ sub new {
 		total_bytes => 0,
 		current_info => '',
 		xpfx => $xpfx,
-		over => PublicInbox::OverIdx->new("$xpfx/over.sqlite3"),
+		oidx => PublicInbox::OverIdx->new("$xpfx/over.sqlite3"),
 		lock_path => "$dir/inbox.lock",
 		# limit each git repo (epoch) to 1GB or so
 		rotate_bytes => int((1024 * 1024 * 1024) / $PACKING_FACTOR),
 		last_commit => [], # git epoch -> commit
 	};
-	$self->{over}->{-no_fsync} = 1 if $v2ibx->{-no_fsync};
+	$self->{oidx}->{-no_fsync} = 1 if $v2ibx->{-no_fsync};
 	$self->{shards} = count_shards($self) || nproc_shards($creat);
 	bless $self, $class;
 }
@@ -154,7 +154,7 @@ sub add {
 sub do_idx ($$$$) {
 	my ($self, $msgref, $mime, $smsg) = @_;
 	$smsg->{bytes} = $smsg->{raw_bytes} + crlf_adjust($$msgref);
-	$self->{over}->add_overview($mime, $smsg);
+	$self->{oidx}->add_overview($mime, $smsg);
 	my $idx = idx_shard($self, $smsg->{num} % $self->{shards});
 	$idx->index_raw($msgref, $mime, $smsg);
 	my $n = $self->{transact_bytes} += $smsg->{raw_bytes};
@@ -219,7 +219,7 @@ sub v2_num_for {
 		if ($altid && grep(/:file=msgmap\.sqlite3\z/, @$altid)) {
 			my $num = $self->{mm}->num_for($mid);
 
-			if (defined $num && !$self->{over}->get_art($num)) {
+			if (defined $num && !$self->{oidx}->get_art($num)) {
 				return ($num, $mid);
 			}
 		}
@@ -274,7 +274,7 @@ sub idx_shard {
 sub _idx_init { # with_umask callback
 	my ($self, $opt) = @_;
 	$self->lock_acquire unless $opt && $opt->{-skip_lock};
-	$self->{over}->create;
+	$self->{oidx}->create;
 
 	# xcpdb can change shard count while -watch is idle
 	my $nshards = count_shards($self);
@@ -381,7 +381,7 @@ sub rewrite_internal ($$;$$$) {
 	} else {
 		$im = $self->importer;
 	}
-	my $over = $self->{over};
+	my $oidx = $self->{oidx};
 	my $chashes = content_hashes($old_eml);
 	my $removed = [];
 	my $mids = mids($old_eml);
@@ -395,7 +395,7 @@ sub rewrite_internal ($$;$$$) {
 	foreach my $mid (@$mids) {
 		my %gone; # num => [ smsg, $mime, raw ]
 		my ($id, $prev);
-		while (my $smsg = $over->next_by_mid($mid, \$id, \$prev)) {
+		while (my $smsg = $oidx->next_by_mid($mid, \$id, \$prev)) {
 			my $msg = get_blob($self, $smsg);
 			if (!defined($msg)) {
 				warn "broken smsg for $mid\n";
@@ -623,7 +623,7 @@ sub checkpoint ($;$) {
 		$dbh->commit;
 
 		# SQLite overview is third
-		$self->{over}->commit_lazy;
+		$self->{oidx}->commit_lazy;
 
 		# Now deal with Xapian
 		if ($wait) {
@@ -682,7 +682,7 @@ sub done {
 			$err .= "shard close: $@\n" if $@;
 		}
 	}
-	eval { $self->{over}->dbh_close };
+	eval { $self->{oidx}->dbh_close };
 	$err .= "over close: $@\n" if $@;
 	delete $self->{bnote};
 	my $nbytes = $self->{total_bytes};
@@ -844,10 +844,10 @@ sub get_blob ($$) {
 
 sub content_exists ($$$) {
 	my ($self, $mime, $mid) = @_;
-	my $over = $self->{over};
+	my $oidx = $self->{oidx};
 	my $chashes = content_hashes($mime);
 	my ($id, $prev);
-	while (my $smsg = $over->next_by_mid($mid, \$id, \$prev)) {
+	while (my $smsg = $oidx->next_by_mid($mid, \$id, \$prev)) {
 		my $msg = get_blob($self, $smsg);
 		if (!defined($msg)) {
 			warn "broken smsg for $mid\n";
@@ -917,9 +917,9 @@ sub index_oid { # cat_async callback
 		}
 	}
 	if (!defined($num)) { # reuse if reindexing (or duplicates)
-		my $over = $self->{over};
+		my $oidx = $self->{oidx};
 		for my $mid (@$mids) {
-			($num, $mid0) = $over->num_mid0_for_oid($oid, $mid);
+			($num, $mid0) = $oidx->num_mid0_for_oid($oid, $mid);
 			last if defined $num;
 		}
 	}
@@ -1107,7 +1107,7 @@ sub sync_prepare ($$$) {
 
 sub unindex_oid_remote ($$$) {
 	my ($self, $oid, $mid) = @_;
-	my @removed = $self->{over}->remove_oid($oid, $mid);
+	my @removed = $self->{oidx}->remove_oid($oid, $mid);
 	for my $num (@removed) {
 		my $idx = idx_shard($self, $num % $self->{shards});
 		$idx->shard_remove($oid, $num);
@@ -1121,11 +1121,11 @@ sub unindex_oid ($$;$) { # git->cat_async callback
 	my $mm = $self->{mm};
 	my $mids = mids(PublicInbox::Eml->new($bref));
 	undef $$bref;
-	my $over = $self->{over};
+	my $oidx = $self->{oidx};
 	foreach my $mid (@$mids) {
 		my %gone;
 		my ($id, $prev);
-		while (my $smsg = $over->next_by_mid($mid, \$id, \$prev)) {
+		while (my $smsg = $oidx->next_by_mid($mid, \$id, \$prev)) {
 			$gone{$smsg->{num}} = 1 if $oid eq $smsg->{blob};
 		}
 		my $n = scalar(keys(%gone)) or next;
@@ -1299,7 +1299,7 @@ sub index_sync {
 
 	$self->idx_init($opt); # acquire lock
 	fill_alternates($self, $epoch_max);
-	$self->{over}->rethread_prepare($opt);
+	$self->{oidx}->rethread_prepare($opt);
 	my $sync = {
 		need_checkpoint => \(my $bool = 0),
 		unindex_range => {}, # EPOCH => oid_old..oid_new
@@ -1329,7 +1329,7 @@ sub index_sync {
 	}
 	# work forwards through history
 	index_epoch($self, $sync, $_) for (0..$epoch_max);
-	$self->{over}->rethread_done($opt);
+	$self->{oidx}->rethread_done($opt);
 	$self->done;
 
 	if (my $nr = $sync->{nr}) {
diff --git a/t/search-thr-index.t b/t/search-thr-index.t
index b5a5ff1f..bd663519 100644
--- a/t/search-thr-index.t
+++ b/t/search-thr-index.t
@@ -60,9 +60,9 @@ foreach (reverse split(/\n\n/, $data)) {
 
 my $prev;
 my %tids;
-my $dbh = $rw->{over}->dbh;
+my $dbh = $rw->{oidx}->dbh;
 foreach my $mid (@mids) {
-	my $msgs = $rw->{over}->get_thread($mid);
+	my $msgs = $rw->{oidx}->get_thread($mid);
 	is(3, scalar(@$msgs), "got all messages from $mid");
 	foreach my $m (@$msgs) {
 		my $tid = $dbh->selectrow_array(<<'', undef, $m->{num});
@@ -84,9 +84,9 @@ Message-Id: <1-bw@g>
 From: bw@g
 To: git@vger.kernel.org
 
-	my $dbh = $rw->{over}->dbh;
+	my $dbh = $rw->{oidx}->dbh;
 	my ($id, $prev);
-	my $reidx = $rw->{over}->next_by_mid('1-bw@g', \$id, \$prev);
+	my $reidx = $rw->{oidx}->next_by_mid('1-bw@g', \$id, \$prev);
 	ok(defined $reidx);
 	my $num = $reidx->{num};
 	my $tid0 = $dbh->selectrow_array(<<'', undef, $num);
diff --git a/t/search.t b/t/search.t
index e2290ecd..f026e509 100644
--- a/t/search.t
+++ b/t/search.t
@@ -161,7 +161,7 @@ are real
 EOF
 	my $ghost_id = $rw->add_message($was_ghost);
 	is($ghost_id, int($ghost_id), "ghost_id is an integer: $ghost_id");
-	my $msgs = $rw->{over}->get_thread('ghost-message@s');
+	my $msgs = $rw->{oidx}->get_thread('ghost-message@s');
 	is(scalar(@$msgs), 2, 'got both messages in ghost thread');
 	foreach (qw(sid tid)) {
 		is($msgs->[0]->{$_}, $msgs->[1]->{$_}, "{$_} match");


* [PATCH 03/11] use more idiomatic internal API for ->over access
  2020-09-02 11:04 [PATCH 00/11] cleanups, mostly indexing related Eric Wong
  2020-09-02 11:04 ` [PATCH 01/11] msgmap: note how we use ->created_at Eric Wong
  2020-09-02 11:04 ` [PATCH 02/11] disambiguate OverIdx and Over by field name Eric Wong
@ 2020-09-02 11:04 ` Eric Wong
  2020-09-02 11:04 ` [PATCH 04/11] search: remove special case for blank query Eric Wong
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2020-09-02 11:04 UTC (permalink / raw)
  To: meta

{over_ro} being a part of the Search object is a historical
oddity which will go away soon.  Let's start removing its use in
tests and rarely-used helper scripts.
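
As a rough sketch of why an accessor on the inbox is the more
idiomatic entry point: callers ask the inbox for its overview handle
and never reach into another object's fields.  Sketch::Inbox and the
open_over callback are hypothetical names for illustration; the real
->over accessor may be implemented differently.

```perl
use strict;
use warnings;

{
	package Sketch::Inbox; # hypothetical; not PublicInbox::Inbox
	sub new {
		my ($class, %arg) = @_;
		bless { %arg }, $class;
	}
	# create the read-only handle on first use, then cache it
	sub over {
		my ($self) = @_;
		$self->{over} //= $self->{open_over}->();
	}
}

my $opened = 0; # counts how often the handle is actually opened
my $ibx = Sketch::Inbox->new(
	open_over => sub { ++$opened; +{ name => 'over' } },
);
$ibx->over; # first call opens the handle
$ibx->over; # second call reuses the cached handle
print "opened $opened time(s)\n";
```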
---
 scripts/dupe-finder |  3 +--
 t/search.t          | 14 +++++++-------
 t/v2mirror.t        |  2 +-
 t/v2writable.t      |  4 ++--
 xt/perf-threading.t |  2 +-
 5 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/scripts/dupe-finder b/scripts/dupe-finder
index 04714cbd..7b490cbb 100644
--- a/scripts/dupe-finder
+++ b/scripts/dupe-finder
@@ -21,8 +21,7 @@ if (index($repo, '@') > 0) {
 }
 $ibx or die "No inbox";
 $ibx->search or die "search not available for inbox";
-my $dbh = $ibx->search->{over_ro}->dbh;
-my $over = PublicInbox::Over->new($dbh->sqlite_db_filename);
+my $over = $ibx->over;
 
 sub emit ($) {
 	my ($nums) = @_;
diff --git a/t/search.t b/t/search.t
index f026e509..3124baeb 100644
--- a/t/search.t
+++ b/t/search.t
@@ -25,7 +25,7 @@ $ibx->with_umask(sub {
 	$rw->idx_release;
 });
 $rw = undef;
-my $ro = PublicInbox::Search->new($ibx);
+my $ro = $ibx->search;
 my $rw_commit = sub {
 	$rw->commit_txn_lazy if $rw;
 	$rw = PublicInbox::SearchIdx->new($ibx, 1);
@@ -233,7 +233,7 @@ EOF
 
 	$rw_commit->();
 	$ro->reopen;
-	my $t = $ro->{over_ro}->get_thread('root@s');
+	my $t = $ibx->over->get_thread('root@s');
 	is(scalar(@$t), 4, "got all 4 messages in thread");
 	my @exp = sort($long_reply_mid, 'root@s', 'last@s', $long_mid);
 	@res = filter_mids($t);
@@ -328,7 +328,7 @@ $ibx->with_umask(sub {
 	my $mset = $ro->query('t:list@example.com', {mset => 1});
 	is($mset->size, 9, 'searched To: successfully');
 	foreach my $m ($mset->items) {
-		my $smsg = $ro->{over_ro}->get_art($m->get_docid);
+		my $smsg = $ibx->over->get_art($m->get_docid);
 		like($smsg->{to}, qr/\blist\@example\.com\b/, 'to appears');
 		my $doc = $m->get_document;
 		my $col = PublicInbox::Search::BYTES();
@@ -346,7 +346,7 @@ $ibx->with_umask(sub {
 	$mset = $ro->query('tc:list@example.com', {mset => 1});
 	is($mset->size, 9, 'searched To+Cc: successfully');
 	foreach my $m ($mset->items) {
-		my $smsg = $ro->{over_ro}->get_art($m->get_docid);
+		my $smsg = $ibx->over->get_art($m->get_docid);
 		my $tocc = join("\n", $smsg->{to}, $smsg->{cc});
 		like($tocc, qr/\blist\@example\.com\b/, 'tocc appears');
 	}
@@ -355,7 +355,7 @@ $ibx->with_umask(sub {
 		my $mset = $ro->query($pfx . 'foo@example.com', { mset => 1 });
 		is($mset->items, 1, "searched $pfx successfully for Cc:");
 		foreach my $m ($mset->items) {
-			my $smsg = $ro->{over_ro}->get_art($m->get_docid);
+			my $smsg = $ibx->over->get_art($m->get_docid);
 			like($smsg->{cc}, qr/\bfoo\@example\.com\b/,
 				'cc appears');
 		}
@@ -421,7 +421,7 @@ $ibx->with_umask(sub {
 	if (scalar(@$n) >= 1) {
 		my $mid = $n->[0]->{mid};
 		my ($id, $prev);
-		$art = $ro->{over_ro}->next_by_mid($mid, \$id, \$prev);
+		$art = $ibx->over->next_by_mid($mid, \$id, \$prev);
 		ok($art, 'article exists in OVER DB');
 	}
 	$rw->_msgmap_init;
@@ -429,7 +429,7 @@ $ibx->with_umask(sub {
 	$rw->commit_txn_lazy;
 	SKIP: {
 		skip('$art not defined', 1) unless defined $art;
-		is($ro->{over_ro}->get_art($art->{num}), undef,
+		is($ibx->over->get_art($art->{num}), undef,
 			'gone from OVER DB');
 	};
 });
diff --git a/t/v2mirror.t b/t/v2mirror.t
index a4ac682d..bca43fd5 100644
--- a/t/v2mirror.t
+++ b/t/v2mirror.t
@@ -134,7 +134,7 @@ $mime->header_set('Subject', 'subject = 10');
 
 $v2w->done;
 
-my $msgs = $mibx->search->{over_ro}->get_thread('10@example.com');
+my $msgs = $mibx->over->get_thread('10@example.com');
 my $to_purge = $msgs->[0]->{blob};
 like($to_purge, qr/\A[a-f0-9]{40,}\z/, 'read blob to be purged');
 $mset = $ibx->search->reopen->query('m:10@example.com', {mset => 1});
diff --git a/t/v2writable.t b/t/v2writable.t
index 9e4547ba..217eaf97 100644
--- a/t/v2writable.t
+++ b/t/v2writable.t
@@ -235,7 +235,7 @@ EOF
 	my $mset = $srch->query('m:'.$mid, { mset => 1});
 	is($mset->size, 0, 'no longer found in Xapian');
 	my @log1 = (@log, qw(-1 --pretty=raw --raw -r --no-renames));
-	is($srch->{over_ro}->get_art($num), undef,
+	is($ibx->over->get_art($num), undef,
 		'removal propagated to Over DB');
 
 	my $after = $git0->qx(@log1);
@@ -278,7 +278,7 @@ EOF
 	ok($im->add($mime), 'add excessively long References');
 	$im->barrier;
 
-	my $msgs = $ibx->search->{over_ro}->get_thread('x'x244);
+	my $msgs = $ibx->over->get_thread('x'x244);
 	is(2, scalar(@$msgs), 'got both messages');
 	is($msgs->[0]->{mid}, 'x'x244, 'stored truncated mid');
 	is($msgs->[1]->{references}, '<'.('x'x244).'>', 'stored truncated ref');
diff --git a/xt/perf-threading.t b/xt/perf-threading.t
index ae98a5ba..b27c9cbd 100644
--- a/xt/perf-threading.t
+++ b/xt/perf-threading.t
@@ -18,7 +18,7 @@ require PublicInbox::View;
 
 my $msgs;
 my $elapsed = timeit(1, sub {
-	$msgs = $srch->{over_ro}->recent({limit => 200000});
+	$msgs = $ibx->over->recent({limit => 200000});
 });
 my $n = scalar(@$msgs);
 ok($n, 'got some messages');


* [PATCH 04/11] search: remove special case for blank query
  2020-09-02 11:04 [PATCH 00/11] cleanups, mostly indexing related Eric Wong
                   ` (2 preceding siblings ...)
  2020-09-02 11:04 ` [PATCH 03/11] use more idiomatic internal API for ->over access Eric Wong
@ 2020-09-02 11:04 ` Eric Wong
  2020-09-02 11:04 ` [PATCH 05/11] tests: add "use strict" and declare v5.10.1 compatibility Eric Wong
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2020-09-02 11:04 UTC (permalink / raw)
  To: meta

The special case (if any) belongs at a higher level,
and this is another step towards removing {over_ro}-dependence
in our Search object.
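
A sketch of what handling the blank query at a higher level could
look like.  recent_or_search and the two code refs are hypothetical
and stand in for a recent-message listing and a search call; they are
not part of this patch.

```perl
use strict;
use warnings;

# Hypothetical caller-side helper: decide between "list recent" and
# "run a search" before the search layer is ever involved.
sub recent_or_search {
	my ($over, $srch, $q) = @_;
	return $over->() if $q eq ''; # blank query: no search engine needed
	$srch->($q);
}

my $over = sub { [ 'newest', 'older' ] }; # stand-in for a recent listing
my $srch = sub { [ "match for $_[0]" ] }; # stand-in for a search call
my $blank = recent_or_search($over, $srch, '');
my $found = recent_or_search($over, $srch, 's:hello');
print "blank: @$blank\n";
print "query: @$found\n";
```

This keeps the search method free of any dependency on the overview
DB, which is the direction the tests in this patch take by calling
$ibx->over->recent directly.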
---
 lib/PublicInbox/Search.pm | 13 ++++---------
 t/v2mda.t                 |  6 +++---
 t/watch_maildir_v2.t      | 19 +++++++++----------
 3 files changed, 16 insertions(+), 22 deletions(-)

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index b739faf1..546884a9 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -282,15 +282,10 @@ sub reopen {
 sub query {
 	my ($self, $query_string, $opts) = @_;
 	$opts ||= {};
-	if ($query_string eq '' && !$opts->{mset}) {
-		$self->{over_ro}->recent($opts);
-	} else {
-		my $qp = $self->{qp} //= qparse_new($self);
-		my $qp_flags = $self->{qp_flags};
-		my $query = $qp->parse_query($query_string, $qp_flags);
-		$opts->{relevance} = 1 unless exists $opts->{relevance};
-		_do_enquire($self, $query, $opts);
-	}
+	my $qp = $self->{qp} //= qparse_new($self);
+	my $query = $qp->parse_query($query_string, $self->{qp_flags});
+	$opts->{relevance} = 1 unless exists $opts->{relevance};
+	_do_enquire($self, $query, $opts);
 }
 
 sub retry_reopen {
diff --git a/t/v2mda.t b/t/v2mda.t
index 7666eb2d..2262c3ad 100644
--- a/t/v2mda.t
+++ b/t/v2mda.t
@@ -50,7 +50,7 @@ $ibx = PublicInbox::Inbox->new($ibx);
 if ($V == 1) {
 	ok(run_script([ '-index', "$tmpdir/inbox" ]), 'v1 indexed');
 }
-my $msgs = $ibx->search->query('');
+my $msgs = $ibx->over->recent;
 is(scalar(@$msgs), 1, 'only got one message');
 my $eml = $ibx->smsg_eml($msgs->[0]);
 is($eml->as_string, $mime->as_string, 'injected message');
@@ -64,7 +64,7 @@ is($eml->as_string, $mime->as_string, 'injected message');
 	ok(run_script(['-mda'], undef, $rdr), 'mda did not die on "spam"');
 	@new = glob("$faildir/new/*");
 	is(scalar(@new), 1, 'got a message in faildir');
-	$msgs = $ibx->search->reopen->query('');
+	$msgs = $ibx->over->recent;
 	is(scalar(@$msgs), 1, 'no new message');
 
 	my $config = "$ENV{PI_DIR}/config";
@@ -76,7 +76,7 @@ is($eml->as_string, $mime->as_string, 'injected message');
 	ok(run_script(['-mda'], undef, $rdr), 'mda did not die');
 	my @again = glob("$faildir/new/*");
 	is_deeply(\@again, \@new, 'no new message in faildir');
-	$msgs = $ibx->search->reopen->query('');
+	$msgs = $ibx->over->recent;
 	is(scalar(@$msgs), 2, 'new message added OK');
 }
 
diff --git a/t/watch_maildir_v2.t b/t/watch_maildir_v2.t
index ca1cf965..c2c096ae 100644
--- a/t/watch_maildir_v2.t
+++ b/t/watch_maildir_v2.t
@@ -47,10 +47,9 @@ EOF
 my $config = PublicInbox::Config->new(\$orig);
 my $ibx = $config->lookup_name('test');
 ok($ibx, 'found inbox by name');
-my $srch = $ibx->search;
 
 PublicInbox::Watch->new($config)->scan('full');
-my $total = scalar @{$srch->reopen->query('')};
+my $total = scalar @{$ibx->over->recent};
 is($total, 1, 'got one revision');
 
 # my $git = PublicInbox::Git->new("$inboxdir/git/0.git");
@@ -70,7 +69,7 @@ my $write_spam = sub {
 $write_spam->();
 is(unlink(glob("$maildir/new/*")), 1, 'unlinked old spam');
 PublicInbox::Watch->new($config)->scan('full');
-is_deeply($srch->reopen->query(''), [], 'deleted file');
+is_deeply($ibx->over->recent, [], 'deleted file');
 is(unlink(glob("$spamdir/cur/*")), 1, 'unlinked trained spam');
 
 # check with scrubbing
@@ -81,7 +80,7 @@ the body of a message to majordomo\@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html\n);
 	PublicInbox::Emergency->new($maildir)->prepare(\$msg);
 	PublicInbox::Watch->new($config)->scan('full');
-	my $msgs = $srch->reopen->query('');
+	my $msgs = $ibx->over->recent;
 	is(scalar(@$msgs), 1, 'got one file back');
 	my $mref = $ibx->msg_by_smsg($msgs->[0]);
 	like($$mref, qr/something\n\z/s, 'message scrubbed on import');
@@ -89,7 +88,7 @@ More majordomo info at  http://vger.kernel.org/majordomo-info.html\n);
 	is(unlink(glob("$maildir/new/*")), 1, 'unlinked spam');
 	$write_spam->();
 	PublicInbox::Watch->new($config)->scan('full');
-	$msgs = $srch->reopen->query('');
+	$msgs = $ibx->over->recent;
 	is(scalar(@$msgs), 0, 'inbox is empty again');
 	is(unlink(glob("$spamdir/cur/*")), 1, 'unlinked trained spam');
 }
@@ -105,7 +104,7 @@ More majordomo info at  http://vger.kernel.org/majordomo-info.html\n);
 		local $SIG{__WARN__} = sub {}; # quiet spam check warning
 		PublicInbox::Watch->new($config)->scan('full');
 	}
-	my $msgs = $srch->reopen->query('');
+	my $msgs = $ibx->over->recent;
 	is(scalar(@$msgs), 0, 'inbox is still empty');
 	is(unlink(glob("$maildir/new/*")), 1);
 }
@@ -118,7 +117,7 @@ More majordomo info at  http://vger.kernel.org/majordomo-info.html\n);
 	PublicInbox::Emergency->new($maildir)->prepare(\$msg);
 	$config->{'publicinboxwatch.spamcheck'} = 'spamc';
 	PublicInbox::Watch->new($config)->scan('full');
-	my $msgs = $srch->reopen->query('');
+	my $msgs = $ibx->over->recent;
 	is(scalar(@$msgs), 1, 'inbox has one mail after spamc OK-ed a message');
 	my $mref = $ibx->msg_by_smsg($msgs->[0]);
 	like($$mref, qr/something\n\z/s, 'message scrubbed on import');
@@ -131,10 +130,10 @@ More majordomo info at  http://vger.kernel.org/majordomo-info.html\n);
 	$msg = do { local $/; <$fh> };
 	PublicInbox::Emergency->new($maildir)->prepare(\$msg);
 	PublicInbox::Watch->new($config)->scan('full');
-	my $msgs = $srch->reopen->query('dfpost:6e006fd7');
+	my $msgs = $ibx->search->reopen->query('dfpost:6e006fd7');
 	is(scalar(@$msgs), 1, 'diff postimage found');
 	my $post = $msgs->[0];
-	$msgs = $srch->query('dfpre:090d998b6c2c');
+	$msgs = $ibx->search->query('dfpre:090d998b6c2c');
 	is(scalar(@$msgs), 1, 'diff preimage found');
 	is($post->{blob}, $msgs->[0]->{blob}, 'same message');
 }
@@ -162,7 +161,7 @@ both
 EOF
 	PublicInbox::Emergency->new($maildir)->prepare(\$both);
 	PublicInbox::Watch->new($config)->scan('full');
-	my $msgs = $srch->reopen->query('m:both@b.com');
+	my $msgs = $ibx->search->reopen->query('m:both@b.com');
 	my $v1 = $config->lookup_name('v1');
 	my $msg = $v1->git->cat_file($msgs->[0]->{blob});
 	is($both, $$msg, 'got original message back from v1');


* [PATCH 05/11] tests: add "use strict" and declare v5.10.1 compatibility
  2020-09-02 11:04 [PATCH 00/11] cleanups, mostly indexing related Eric Wong
                   ` (3 preceding siblings ...)
  2020-09-02 11:04 ` [PATCH 04/11] search: remove special case for blank query Eric Wong
@ 2020-09-02 11:04 ` Eric Wong
  2020-09-02 11:04 ` [PATCH 06/11] search: replace ->query with ->mset Eric Wong
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2020-09-02 11:04 UTC (permalink / raw)
  To: meta

strict.pm helped me find a typo in an upcoming change, so
ensure we use it since it does more good than harm.  We'll also
take the opportunity here to declare v5.10.1 compatibility level
to future-proof against Perl incompatibilities.
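
For illustration, a small sketch of the kind of typo strict.pm
catches.  The variable names are made up; the snippet is compiled via
a string eval only so the compile-time error can be captured and
printed instead of aborting the script.

```perl
use strict;
use warnings;

# Compile a snippet containing a misspelled variable; under "use strict"
# the typo is a compile-time error instead of a silent undef.
my $code = <<'EOF';
use strict;
my $total = 1;
$totl + 1;
EOF
eval $code;
my $err = $@;
print "strict caught: $err" if $err;
```

Without "use strict", the misspelled $totl would silently be treated
as an undefined package variable.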
---
 t/index-git-times.t   | 3 +++
 xt/eml_check_limits.t | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/t/index-git-times.t b/t/index-git-times.t
index 8f80c866..73c99e61 100644
--- a/t/index-git-times.t
+++ b/t/index-git-times.t
@@ -1,5 +1,8 @@
+#!perl -w
 # Copyright (C) 2020 all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use v5.10.1;
 use Test::More;
 use PublicInbox::TestCommon;
 use PublicInbox::Import;
diff --git a/xt/eml_check_limits.t b/xt/eml_check_limits.t
index cf780c77..9f821946 100644
--- a/xt/eml_check_limits.t
+++ b/xt/eml_check_limits.t
@@ -1,6 +1,8 @@
 #!perl -w
 # Copyright (C) 2020 all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use v5.10.1;
 use Test::More;
 use PublicInbox::TestCommon;
 use PublicInbox::Eml;


* [PATCH 06/11] search: replace ->query with ->mset
  2020-09-02 11:04 [PATCH 00/11] cleanups, mostly indexing related Eric Wong
                   ` (4 preceding siblings ...)
  2020-09-02 11:04 ` [PATCH 05/11] tests: add "use strict" and declare v5.10.1 compatibility Eric Wong
@ 2020-09-02 11:04 ` Eric Wong
  2020-09-02 11:04 ` [PATCH 07/11] search: remove {over_ro} field Eric Wong
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2020-09-02 11:04 UTC (permalink / raw)
  To: meta

Nearly all of the search uses in the production code rely on
a Xapian mset iterator being returned (instead of an array
of $smsg objects).  So default to returning the mset and move
the burden of smsg array conversion into the test cases.
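
The conversion step can be sketched without Xapian as follows.
order_like_results is a hypothetical helper, and plain article
numbers stand in for mset items; the idea mirrors mset_to_smsg below:
remember each result's rank, fetch the records in any order, then
sort them back into rank order.

```perl
use strict;
use warnings;

# $ranked: article numbers in search-result order (stand-in for mset items)
# $rows: unordered records, e.g. as returned by an overview lookup
sub order_like_results {
	my ($ranked, $rows) = @_;
	my $i = 0;
	my %order = map { $_ => ++$i } @$ranked; # num => rank
	[ sort { $order{$a->{num}} <=> $order{$b->{num}} } @$rows ];
}

my $ranked = [ 30, 10, 20 ];
my $rows = [ { num => 10 }, { num => 20 }, { num => 30 } ];
my $msgs = order_like_results($ranked, $rows);
print join(',', map { $_->{num} } @$msgs), "\n";
```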
---
 lib/PublicInbox/ExtMsg.pm     |   4 +-
 lib/PublicInbox/IMAP.pm       |   2 +-
 lib/PublicInbox/Mbox.pm       |   6 +-
 lib/PublicInbox/Search.pm     |  12 ++--
 lib/PublicInbox/SearchView.pm |   3 +-
 lib/PublicInbox/SolverGit.pm  |   5 +-
 t/altid.t                     |   8 +--
 t/altid_v2.t                  |   7 ++-
 t/index-git-times.t           |  14 +++--
 t/indexlevels-mirror.t        |   8 +--
 t/mda_filter_rubylang.t       |   6 +-
 t/replace.t                   |   8 +--
 t/search.t                    | 112 +++++++++++++++++-----------------
 t/v1reindex.t                 |   4 +-
 t/v2mda.t                     |  10 +--
 t/v2mirror.t                  |  22 +++----
 t/v2reindex.t                 |   9 +--
 t/v2writable.t                |  10 ++-
 t/watch_filter_rubylang.t     |  12 ++--
 t/watch_maildir_v2.t          |  17 +++---
 t/xcpdb-reshard.t             |   3 +-
 21 files changed, 143 insertions(+), 139 deletions(-)

diff --git a/lib/PublicInbox/ExtMsg.pm b/lib/PublicInbox/ExtMsg.pm
index 65892161..5dffc65c 100644
--- a/lib/PublicInbox/ExtMsg.pm
+++ b/lib/PublicInbox/ExtMsg.pm
@@ -65,10 +65,10 @@ sub search_partial ($$) {
 		# has too many results.  $@ can be
 		# Search::Xapian::QueryParserError or even:
 		# "something terrible happened at ../Search/Xapian/Enquire.pm"
-		my $mset = eval { $srch->query($m, $opt) } or next;
+		my $mset = eval { $srch->mset($m, $opt) } or next;
 		my @mids = map {
 			$_->{mid}
-		} @{$ibx->over->get_all(@{$srch->mset_to_artnums($mset)})};
+		} @{$srch->mset_to_smsg($ibx, $mset)};
 		return \@mids if scalar(@mids);
 	}
 }
diff --git a/lib/PublicInbox/IMAP.pm b/lib/PublicInbox/IMAP.pm
index abdb8fec..d540fd0b 100644
--- a/lib/PublicInbox/IMAP.pm
+++ b/lib/PublicInbox/IMAP.pm
@@ -1187,7 +1187,7 @@ sub refill_xap ($$$$) {
 	my ($beg, $end) = @$range_info;
 	my $srch = $self->{ibx}->search;
 	my $opt = { mset => 2, limit => 1000 };
-	my $mset = $srch->query("$q uid:$beg..$end", $opt);
+	my $mset = $srch->mset("$q uid:$beg..$end", $opt);
 	@$uids = @{$srch->mset_to_artnums($mset)};
 	if (@$uids) {
 		$range_info->[0] = $uids->[-1] + 1; # update $beg
diff --git a/lib/PublicInbox/Mbox.pm b/lib/PublicInbox/Mbox.pm
index 0223bead..47025891 100644
--- a/lib/PublicInbox/Mbox.pm
+++ b/lib/PublicInbox/Mbox.pm
@@ -213,7 +213,7 @@ sub results_cb {
 		}
 		# refill result set
 		my $srch = $ctx->{-inbox}->search(undef, $ctx) or return;
-		my $mset = $srch->query($ctx->{query}, $ctx->{qopts});
+		my $mset = $srch->mset($ctx->{query}, $ctx->{qopts});
 		my $size = $mset->size or return;
 		$ctx->{qopts}->{offset} += $size;
 		$ctx->{ids} = $srch->mset_to_artnums($mset);
@@ -235,7 +235,7 @@ sub results_thread_cb {
 
 		# refill result set
 		my $srch = $ctx->{-inbox}->search(undef, $ctx) or return;
-		my $mset = $srch->query($ctx->{query}, $ctx->{qopts});
+		my $mset = $srch->mset($ctx->{query}, $ctx->{qopts});
 		my $size = $mset->size or return;
 		$ctx->{qopts}->{offset} += $size;
 		$ctx->{ids} = $srch->mset_to_artnums($mset);
@@ -254,7 +254,7 @@ sub mbox_all {
 
 	my $qopts = $ctx->{qopts} = { mset => 2 }; # order by docid
 	$qopts->{thread} = 1 if $q->{t};
-	my $mset = $srch->query($q_string, $qopts);
+	my $mset = $srch->mset($q_string, $qopts);
 	$qopts->{offset} = $mset->size or
 			return [404, [qw(Content-Type text/plain)],
 				["No results found\n"]];
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 546884a9..cfa942b2 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -279,7 +279,7 @@ sub reopen {
 }
 
 # read-only
-sub query {
+sub mset {
 	my ($self, $query_string, $opts) = @_;
 	$opts ||= {};
 	my $qp = $self->{qp} //= qparse_new($self);
@@ -346,17 +346,17 @@ sub _enquire_once { # retry_reopen callback
 	if ($opts->{thread} && has_threadid($self)) {
 		$enquire->set_collapse_key(THREADID);
 	}
+	$enquire->get_mset($opts->{offset} || 0, $opts->{limit} || 50);
+}
 
-	my $offset = $opts->{offset} || 0;
-	my $limit = $opts->{limit} || 50;
-	my $mset = $enquire->get_mset($offset, $limit);
-	return $mset if $opts->{mset};
+sub mset_to_smsg {
+	my ($self, $ibx, $mset) = @_;
 	my $nshard = $self->{nshard} // 1;
 	my $i = 0;
 	my %order = map { mdocid($nshard, $_) => ++$i } $mset->items;
 	my @msgs = sort {
 		$order{$a->{num}} <=> $order{$b->{num}}
-	} @{$self->{over_ro}->get_all(keys %order)};
+	} @{$ibx->over->get_all(keys %order)};
 	wantarray ? ($mset->get_matches_estimated, \@msgs) : \@msgs;
 }
 
diff --git a/lib/PublicInbox/SearchView.pm b/lib/PublicInbox/SearchView.pm
index 892e8fda..c482f1c9 100644
--- a/lib/PublicInbox/SearchView.pm
+++ b/lib/PublicInbox/SearchView.pm
@@ -47,7 +47,6 @@ sub sres_top_html {
 	my $opts = {
 		limit => $q->{l},
 		offset => $o,
-		mset => 1,
 		relevance => $q->{r},
 		thread => $q->{t},
 		asc => $asc,
@@ -55,7 +54,7 @@ sub sres_top_html {
 	my ($mset, $total, $err, $html);
 retry:
 	eval {
-		$mset = $srch->query($query, $opts);
+		$mset = $srch->mset($query, $opts);
 		$total = $mset->get_matches_estimated;
 	};
 	$err = $@;
diff --git a/lib/PublicInbox/SolverGit.pm b/lib/PublicInbox/SolverGit.pm
index d0cd59db..dd95f400 100644
--- a/lib/PublicInbox/SolverGit.pm
+++ b/lib/PublicInbox/SolverGit.pm
@@ -228,10 +228,9 @@ sub find_extract_diffs ($$$) {
 		}
 	}
 
-	my $msgs = $srch->query($q, { relevance => 1 });
-
+	my $mset = $srch->mset($q, { relevance => 1 });
 	my $diffs = [];
-	foreach my $smsg (@$msgs) {
+	for my $smsg (@{$srch->mset_to_smsg($ibx, $mset)}) {
 		my $eml = $ibx->smsg_eml($smsg) or next;
 		$eml->each_part(\&extract_diff,
 				[$self, $diffs, $pre, $post, $ibx, $smsg], 1);
diff --git a/t/altid.t b/t/altid.t
index f3c01520..816f5f5b 100644
--- a/t/altid.t
+++ b/t/altid.t
@@ -45,13 +45,13 @@ EOF
 }
 
 {
-	my $ro = PublicInbox::Search->new($ibx);
-	my $msgs = $ro->query("gmane:1234");
+	my $mset = $ibx->search->mset("gmane:1234");
+	my $msgs = $ibx->search->mset_to_smsg($ibx, $mset);
 	$msgs = [ map { $_->{mid} } @$msgs ];
 	is_deeply($msgs, ['a@example.com'], 'got one match');
 
-	$msgs = $ro->query("gmane:666");
-	is_deeply([], $msgs, 'body did NOT match');
+	$mset = $ibx->search->mset('gmane:666');
+	is($mset->size, 0, 'body did NOT match');
 };
 
 {
diff --git a/t/altid_v2.t b/t/altid_v2.t
index 01ed9ed4..f04b547b 100644
--- a/t/altid_v2.t
+++ b/t/altid_v2.t
@@ -41,11 +41,12 @@ hello world gmane:666
 EOF
 $v2w->done;
 
-my $msgs = $ibx->search->reopen->query("gmane:1234");
+my $mset = $ibx->search->reopen->mset('gmane:1234');
+my $msgs = $ibx->search->mset_to_smsg($ibx, $mset);
 $msgs = [ map { $_->{mid} } @$msgs ];
 is_deeply($msgs, ['a@example.com'], 'got one match');
-$msgs = $ibx->search->query("gmane:666");
-is_deeply([], $msgs, 'body did NOT match');
+$mset = $ibx->search->mset('gmane:666');
+is($mset->size, 0, 'body did NOT match');
 
 done_testing();
 
diff --git a/t/index-git-times.t b/t/index-git-times.t
index 73c99e61..f9869cfa 100644
--- a/t/index-git-times.t
+++ b/t/index-git-times.t
@@ -63,10 +63,12 @@ my $smsg;
 	$smsg = $ibx->over->get_art(1);
 	is($smsg->{ds}, 749520000, 'datestamp from git author time');
 	is($smsg->{ts}, 1285977600, 'timestamp from git committer time');
-	my $res = $ibx->search->query("m:$smsg->{mid}");
-	is(scalar @$res, 1, 'got one result for m:');
+	my $mset = $ibx->search->mset("m:$smsg->{mid}");
+	is($mset->size, 1, 'got one result for m:');
+	my $res = $ibx->search->mset_to_smsg($ibx, $mset);
 	is($res->[0]->{ds}, $smsg->{ds}, 'Xapian stored datestamp');
-	$res = $ibx->search->query('d:19931002..19931002');
+	$mset = $ibx->search->mset('d:19931002..19931002');
+	$res = $ibx->search->mset_to_smsg($ibx, $mset);
 	is(scalar @$res, 1, 'got one result for d:');
 	is($res->[0]->{ds}, $smsg->{ds}, 'Xapian search on datestamp');
 }
@@ -87,9 +89,11 @@ SKIP: {
 			'v2 datestamp from git author time');
 		is($v2smsg->{ts}, $smsg->{ts},
 			'v2 timestamp from git committer time');
-		my $res = $ibx->search->query("m:$smsg->{mid}");
+		my $mset = $ibx->search->mset("m:$smsg->{mid}");
+		my $res = $ibx->search->mset_to_smsg($ibx, $mset);
 		is($res->[0]->{ds}, $smsg->{ds}, 'Xapian stored datestamp');
-		$res = $ibx->search->query('d:19931002..19931002');
+		$mset = $ibx->search->mset('d:19931002..19931002');
+		$res = $ibx->search->mset_to_smsg($ibx, $mset);
 		is(scalar @$res, 1, 'got one result for d:');
 		is($res->[0]->{ds}, $smsg->{ds}, 'Xapian search on datestamp');
 	};
diff --git a/t/indexlevels-mirror.t b/t/indexlevels-mirror.t
index 27533546..291e0d2f 100644
--- a/t/indexlevels-mirror.t
+++ b/t/indexlevels-mirror.t
@@ -121,8 +121,8 @@ my $import_index_incremental = sub {
 		is(PublicInbox::Admin::detect_indexlevel($ro_mirror), $level,
 		   'indexlevel detectable by Admin after xcpdb v' .$v.$level);
 		delete $ro_mirror->{$_} for (qw(over search));
-		$msgs = $ro_mirror->search->query('m:m@2');
-		is(scalar(@$msgs), 1, "v$v found m\@2 via Xapian on $level");
+		my $mset = $ro_mirror->search->mset('m:m@2');
+		is($mset->size, 1, "v$v found m\@2 via Xapian on $level");
 	}
 
 	# sync the mirror
@@ -138,8 +138,8 @@ my $import_index_incremental = sub {
 			 'no Xapian shard directories for v2 basic');
 	}
 	if ($level ne 'basic') {
-		$msgs = $ro_mirror->search->reopen->query('m:m@2');
-		is(scalar(@$msgs), 0,
+		my $mset = $ro_mirror->search->reopen->mset('m:m@2');
+		is($mset->size, 0,
 			"v$v m\@2 gone from Xapian in mirror on $level");
 	}
 
diff --git a/t/mda_filter_rubylang.t b/t/mda_filter_rubylang.t
index 5b6bf28b..754d52f7 100644
--- a/t/mda_filter_rubylang.t
+++ b/t/mda_filter_rubylang.t
@@ -48,10 +48,10 @@ EOF
 	my $ibx = $config->lookup_name($v);
 
 	# make sure all serials are searchable:
-	my ($tot, $msgs);
 	for my $i (1..2) {
-		($tot, $msgs) = $ibx->search->query("alerts:$i");
-		is($tot, 1, "got one result for alerts:$i");
+		my $mset = $ibx->search->mset("alerts:$i");
+		is($mset->size, 1, "got one result for alerts:$i");
+		my $msgs = $ibx->search->mset_to_smsg($ibx, $mset);
 		is($msgs->[0]->{mid}, "a.$i\@b.com", "got expected MID for $i");
 	}
 	is_deeply([], \@warn, 'no warnings');
diff --git a/t/replace.t b/t/replace.t
index 490e3b7b..95241adf 100644
--- a/t/replace.t
+++ b/t/replace.t
@@ -106,8 +106,8 @@ EOF
 
 	if (my $srch = $ibx->search) {
 		for my $q ('f:streisand', 's:confidential', 'malibu') {
-			my $msgs = $srch->query($q);
-			is_deeply($msgs, [], "no match for $q");
+			my $mset = $srch->mset($q);
+			is($mset->size, 0, "no match for $q");
 		}
 		my @ok = ('f:redactor', 's:redacted', 'nothing to see');
 		if ($opt->{pre}) {
@@ -119,8 +119,8 @@ EOF
 				's:message3', 's:message4';
 		}
 		for my $q (@ok) {
-			my $msgs = $srch->query($q);
-			ok($msgs->[0], "got match for $q");
+			my $mset = $srch->mset($q);
+			ok($mset->size, "got match for $q");
 		}
 	}
 
diff --git a/t/search.t b/t/search.t
index 3124baeb..8df8a202 100644
--- a/t/search.t
+++ b/t/search.t
@@ -25,12 +25,12 @@ $ibx->with_umask(sub {
 	$rw->idx_release;
 });
 $rw = undef;
-my $ro = $ibx->search;
 my $rw_commit = sub {
 	$rw->commit_txn_lazy if $rw;
 	$rw = PublicInbox::SearchIdx->new($ibx, 1);
 	$rw->{qp_flags} = 0; # quiet a warning
 	$rw->begin_txn_lazy;
+	$ibx->search->reopen;
 };
 
 sub oct_is ($$$) {
@@ -103,29 +103,34 @@ sub filter_mids {
 	sort(map { $_->{mid} } @$msgs);
 }
 
+my $query = sub {
+	my ($query_string, $opt) = @_;
+	my $mset = $ibx->search->mset($query_string, $opt);
+	$ibx->search->mset_to_smsg($ibx, $mset);
+};
+
 {
 	$rw_commit->();
-	$ro->reopen;
-	my $found = $ro->query('m:root@s');
+	my $found = $query->('m:root@s');
 	is(scalar(@$found), 1, "message found");
 	is($found->[0]->{mid}, 'root@s', 'mid set correctly') if @$found;
 
 	my ($res, @res);
 	my @exp = sort qw(root@s last@s);
 
-	$res = $ro->query('s:(Hello world)');
+	$res = $query->('s:(Hello world)');
 	@res = filter_mids($res);
 	is_deeply(\@res, \@exp, 'got expected results for s:() match');
 
-	$res = $ro->query('s:"Hello world"');
+	$res = $query->('s:"Hello world"');
 	@res = filter_mids($res);
 	is_deeply(\@res, \@exp, 'got expected results for s:"" match');
 
-	$res = $ro->query('s:"Hello world"', {limit => 1});
+	$res = $query->('s:"Hello world"', {limit => 1});
 	is(scalar @$res, 1, "limit works");
 	my $first = $res->[0];
 
-	$res = $ro->query('s:"Hello world"', {offset => 1});
+	$res = $query->('s:"Hello world"', {offset => 1});
 	is(scalar @$res, 1, "offset works");
 	my $second = $res->[0];
 
@@ -173,31 +178,29 @@ EOF
 # search thread on ghost
 {
 	$rw_commit->();
-	$ro->reopen;
 
 	# subject
-	my $res = $ro->query('ghost');
+	my $res = $query->('ghost');
 	my @exp = sort qw(ghost-message@s ghost-reply@s);
 	my @res = filter_mids($res);
 	is_deeply(\@res, \@exp, 'got expected results for Subject match');
 
 	# body
-	$res = $ro->query('goodbye');
+	$res = $query->('goodbye');
 	is(scalar(@$res), 1, "goodbye message found");
 	is($res->[0]->{mid}, 'last@s', 'got goodbye message body') if @$res;
 
 	# datestamp
-	$res = $ro->query('dt:20101002000001..20101002000001');
+	$res = $query->('dt:20101002000001..20101002000001');
 	@res = filter_mids($res);
 	is_deeply(\@res, ['ghost-message@s'], 'exact Date: match works');
-	$res = $ro->query('dt:20101002000002..20101002000002');
+	$res = $query->('dt:20101002000002..20101002000002');
 	is_deeply($res, [], 'exact Date: match down to the second');
 }
 
 # long message-id
 $ibx->with_umask(sub {
 	$rw_commit->();
-	$ro->reopen;
 	my $long_mid = 'last' . ('x' x 60). '@s';
 	my $long = PublicInbox::Eml->new(<<EOF);
 Date: Sat, 02 Oct 2010 00:00:00 +0000
@@ -214,7 +217,6 @@ EOF
 	is($long_id, int($long_id), "long_id is an integer: $long_id");
 
 	$rw_commit->();
-	$ro->reopen;
 	my $res;
 	my @res;
 
@@ -232,7 +234,6 @@ EOF
 	ok($rw->add_message($long_reply) > $long_id, "inserted long reply");
 
 	$rw_commit->();
-	$ro->reopen;
 	my $t = $ibx->over->get_thread('root@s');
 	is(scalar(@$t), 4, "got all 4 messages in thread");
 	my @exp = sort($long_reply_mid, 'root@s', 'last@s', $long_mid);
@@ -264,13 +265,13 @@ theatre
 fade
 EOF
 	$rw_commit->();
-	my $res = $ro->reopen->query("theatre");
+	my $res = $query->("theatre");
 	is(scalar(@$res), 2, "got both matches");
 	if (@$res == 2) {
 		is($res->[0]->{mid}, 'nquote@a', 'non-quoted scores higher');
 		is($res->[1]->{mid}, 'quote@a', 'quoted result still returned');
 	}
-	$res = $ro->query("illusions");
+	$res = $query->("illusions");
 	is(scalar(@$res), 1, "got a match for quoted text");
 	is($res->[0]->{mid}, 'quote@a',
 		"quoted result returned if nothing else") if scalar(@$res);
@@ -292,7 +293,7 @@ LOOP!
 EOF
 	ok($doc_id > 0, "doc_id defined with circular reference");
 	$rw_commit->();
-	my $smsg = $ro->reopen->query('m:circle@a', {limit=>1})->[0];
+	my $smsg = $query->('m:circle@a', {limit=>1})->[0];
 	is(defined($smsg), 1, 'found m:circl@a');
 	if (defined $smsg) {
 		is($smsg->{references}, '', "no references created");
@@ -301,11 +302,11 @@ EOF
 });
 
 {
-	my $msgs = $ro->query('d:19931002..20101002');
+	my $msgs = $query->('d:19931002..20101002');
 	ok(scalar(@$msgs) > 0, 'got results within range');
-	$msgs = $ro->query('d:20101003..');
+	$msgs = $query->('d:20101003..');
 	is(scalar(@$msgs), 0, 'nothing after 20101003');
-	$msgs = $ro->query('d:..19931001');
+	$msgs = $query->('d:..19931001');
 	is(scalar(@$msgs), 0, 'nothing before 19931001');
 }
 
@@ -314,8 +315,7 @@ $ibx->with_umask(sub {
 	my $doc_id = $rw->add_message($mime);
 	ok($doc_id > 0, 'message indexed doc_id with UTF-8');
 	$rw_commit->();
-	my $msg = $ro->reopen->
-		query('m:testmessage@example.com', {limit => 1})->[0];
+	my $msg = $query->('m:testmessage@example.com', {limit => 1})->[0];
 	is(defined($msg), 1, 'found testmessage@example.com');
 	if (defined $msg) {
 		is($mime->header('Subject'), $msg->{subject},
@@ -325,7 +325,7 @@ $ibx->with_umask(sub {
 
 # names and addresses
 {
-	my $mset = $ro->query('t:list@example.com', {mset => 1});
+	my $mset = $ibx->search->mset('t:list@example.com');
 	is($mset->size, 9, 'searched To: successfully');
 	foreach my $m ($mset->items) {
 		my $smsg = $ibx->over->get_art($m->get_docid);
@@ -343,7 +343,7 @@ $ibx->with_umask(sub {
 		is($uid, $m->get_docid, 'UID column matches docid');
 	}
 
-	$mset = $ro->query('tc:list@example.com', {mset => 1});
+	$mset = $ibx->search->mset('tc:list@example.com');
 	is($mset->size, 9, 'searched To+Cc: successfully');
 	foreach my $m ($mset->items) {
 		my $smsg = $ibx->over->get_art($m->get_docid);
@@ -352,7 +352,7 @@ $ibx->with_umask(sub {
 	}
 
 	foreach my $pfx ('tcf:', 'c:') {
-		my $mset = $ro->query($pfx . 'foo@example.com', { mset => 1 });
+		my $mset = $ibx->search->mset($pfx . 'foo@example.com');
 		is($mset->items, 1, "searched $pfx successfully for Cc:");
 		foreach my $m ($mset->items) {
 			my $smsg = $ibx->over->get_art($m->get_docid);
@@ -362,7 +362,7 @@ $ibx->with_umask(sub {
 	}
 
 	foreach my $pfx ('', 'tcf:', 'f:') {
-		my $res = $ro->query($pfx . 'Laggy');
+		my $res = $query->($pfx . 'Laggy');
 		is(scalar(@$res), 1,
 			"searched $pfx successfully for From:");
 		foreach my $smsg (@$res) {
@@ -374,25 +374,24 @@ $ibx->with_umask(sub {
 
 {
 	$rw_commit->();
-	$ro->reopen;
-	my $res = $ro->query('b:hello');
+	my $res = $query->('b:hello');
 	is(scalar(@$res), 0, 'no match on body search only');
-	$res = $ro->query('bs:smith');
+	$res = $query->('bs:smith');
 	is(scalar(@$res), 0,
 		'no match on body+subject search for From');
 
-	$res = $ro->query('q:theatre');
+	$res = $query->('q:theatre');
 	is(scalar(@$res), 1, 'only one quoted body');
 	like($res->[0]->{from_name}, qr/\AQuoter/,
 		'got quoted body') if (scalar(@$res));
 
-	$res = $ro->query('nq:theatre');
+	$res = $query->('nq:theatre');
 	is(scalar @$res, 1, 'only one non-quoted body');
 	like($res->[0]->{from_name}, qr/\ANon-Quoter/,
 		'got non-quoted body') if (scalar(@$res));
 
 	foreach my $pfx (qw(b: bs:)) {
-		$res = $ro->query($pfx . 'theatre');
+		$res = $query->($pfx . 'theatre');
 		is(scalar @$res, 2, "searched both bodies for $pfx");
 		like($res->[0]->{from_name}, qr/\ANon-Quoter/,
 			"non-quoter first for $pfx") if scalar(@$res);
@@ -405,14 +404,13 @@ $ibx->with_umask(sub {
 	my $smsg = bless { blob => $oid }, 'PublicInbox::Smsg';
 	ok($rw->add_message($amsg, $smsg), 'added attachment');
 	$rw_commit->();
-	$ro->reopen;
-	my $n = $ro->query('n:attached_fart.txt');
+	my $n = $query->('n:attached_fart.txt');
 	is(scalar @$n, 1, 'got result for n:');
-	my $res = $ro->query('part_deux.txt');
+	my $res = $query->('part_deux.txt');
 	is(scalar @$res, 1, 'got result without n:');
 	is($n->[0]->{mid}, $res->[0]->{mid},
 		'same result with and without') if scalar(@$res);
-	my $txt = $ro->query('"inside another"');
+	my $txt = $query->('"inside another"');
 	is(scalar @$txt, 1, 'found inside another');
 	is($txt->[0]->{mid}, $res->[0]->{mid},
 		'search inside text attachments works') if scalar(@$txt);
@@ -459,8 +457,7 @@ $ibx->with_umask(sub {
 	my $digits = '10010260936330';
 	my $ua = 'Pine.LNX.4.10';
 	my $mid = "$ua.$digits.2460-100000\@penguin.transmeta.com";
-	is($ro->reopen->query("m:$digits", { mset => 1})->size, 0,
-		'no results yet');
+	is($ibx->search->mset("m:$digits")->size, 0, 'no results yet');
 	my $pine = PublicInbox::Eml->new(<<EOF);
 Subject: blah
 Message-ID: <$mid>
@@ -470,44 +467,45 @@ To: list\@example.com
 EOF
 	my $x = $rw->add_message($pine);
 	$rw->commit_txn_lazy;
-	is($ro->reopen->query("m:$digits", { mset => 1})->size, 1,
+	$ibx->search->reopen;
+	is($ibx->search->mset("m:$digits")->size, 1,
 		'searching only digit yielded result');
 
 	my $wild = $digits;
 	for my $i (1..6) {
 		chop($wild);
-		is($ro->query("m:$wild*", { mset => 1})->size, 1,
+		is($ibx->search->mset("m:$wild*")->size, 1,
 			"searching chopped($i) digit yielded result $wild ");
 	}
-	is($ro->query("m:Pine m:LNX m:10010260936330", {mset=>1})->size, 1);
+	is($ibx->search->mset('m:Pine m:LNX m:10010260936330')->size, 1);
 });
 
 { # List-Id searching
-	my $found = $ro->query('lid:i.m.just.bored');
+	my $found = $query->('lid:i.m.just.bored');
 	is_deeply([ filter_mids($found) ], [ 'root@s' ],
 		'got expected mid on exact lid: search');
 
-	$found = $ro->query('lid:just.bored');
+	$found = $query->('lid:just.bored');
 	is_deeply($found, [], 'got nothing on lid: search');
 
-	$found = $ro->query('lid:*.just.bored');
+	$found = $query->('lid:*.just.bored');
 	is_deeply($found, [], 'got nothing on lid: search');
 
-	$found = $ro->query('l:i.m.just.bored');
+	$found = $query->('l:i.m.just.bored');
 	is_deeply([ filter_mids($found) ], [ 'root@s' ],
 		'probabilistic search works on full List-Id contents');
 
-	$found = $ro->query('l:just.bored');
+	$found = $query->('l:just.bored');
 	is_deeply([ filter_mids($found) ], [ 'root@s' ],
 		'probabilistic search works on partial List-Id contents');
 
-	$found = $ro->query('lid:mad');
+	$found = $query->('lid:mad');
 	is_deeply($found, [], 'no match on phrase with lid:');
 
-	$found = $ro->query('lid:bored');
+	$found = $query->('lid:bored');
 	is_deeply($found, [], 'no match on partial List-Id with lid:');
 
-	$found = $ro->query('l:nothing');
+	$found = $query->('l:nothing');
 	is_deeply($found, [], 'matched on phrase with l:');
 }
 
@@ -516,22 +514,22 @@ $ibx->with_umask(sub {
 	my $doc_id = $rw->add_message(eml_load('t/data/message_embed.eml'));
 	ok($doc_id > 0, 'messages within messages');
 	$rw->commit_txn_lazy;
-	$ro->reopen;
-	my $n_test_eml = $ro->query('n:test.eml');
+	$ibx->search->reopen;
+	my $n_test_eml = $query->('n:test.eml');
 	is(scalar(@$n_test_eml), 1, 'got a result');
-	my $n_embed2x_eml = $ro->query('n:embed2x.eml');
+	my $n_embed2x_eml = $query->('n:embed2x.eml');
 	is_deeply($n_test_eml, $n_embed2x_eml, '.eml filenames searchable');
 	for my $m (qw(20200418222508.GA13918@dcvr 20200418222020.GA2745@dcvr
 			20200418214114.7575-1-e@yhbt.net)) {
-		is($ro->query("m:$m")->[0]->{mid},
+		is($query->("m:$m")->[0]->{mid},
 			'20200418222508.GA13918@dcvr', 'probabilistic m:'.$m);
-		is($ro->query("mid:$m")->[0]->{mid},
+		is($query->("mid:$m")->[0]->{mid},
 			'20200418222508.GA13918@dcvr', 'boolean mid:'.$m);
 	}
-	is($ro->query('dfpost:4dc62c50')->[0]->{mid},
+	is($query->('dfpost:4dc62c50')->[0]->{mid},
 		'20200418222508.GA13918@dcvr',
 		'diff search reaches inside message/rfc822');
-	is($ro->query('s:"mail header experiments"')->[0]->{mid},
+	is($query->('s:"mail header experiments"')->[0]->{mid},
 		'20200418222508.GA13918@dcvr',
 		'Subject search reaches inside message/rfc822');
 });
diff --git a/t/v1reindex.t b/t/v1reindex.t
index a5c85ffb..e66d89e5 100644
--- a/t/v1reindex.t
+++ b/t/v1reindex.t
@@ -178,7 +178,7 @@ ok(!-d $xap, 'Xapian directories removed again');
 	delete $ibx->{mm};
 	is_deeply([ $ibx->mm->minmax ], $minmax, 'minmax unchanged');
 	is($ibx->mm->num_highwater, 10, 'num_highwater as expected');
-	my $mset = $ibx->search->query('hello world', {mset=>1});
+	my $mset = $ibx->search->mset('hello world');
 	isnt($mset->size, 0, 'got Xapian search results');
 
 	my ($min, $max) = $ibx->mm->minmax;
@@ -224,7 +224,7 @@ ok(!-d $xap, 'Xapian directories removed again');
 	eval { $rw->index_sync({reindex => 1}) };
 	is($@, '', 'no error from indexing');
 	is_deeply(\@warn, [], 'no warnings');
-	my $mset = $ibx->search->reopen->query('hello world', {mset=>1});
+	my $mset = $ibx->search->reopen->mset('hello world');
 	isnt($mset->size, 0, 'search OK after basic -> medium');
 
 	is($ibx->mm->num_highwater, 10, 'num_highwater as expected');
diff --git a/t/v2mda.t b/t/v2mda.t
index 2262c3ad..abbdc8e4 100644
--- a/t/v2mda.t
+++ b/t/v2mda.t
@@ -85,10 +85,12 @@ is($eml->as_string, $mime->as_string, 'injected message');
 	open my $fh, '<', $patch or die "failed to open $patch: $!\n";
 	$rdr->{0} = \(do { local $/; <$fh> });
 	ok(run_script(['-mda'], undef, $rdr), 'mda delivered a patch');
-	my $post = $ibx->search->reopen->query('dfpost:6e006fd7');
-	is(scalar(@$post), 1, 'got one result for dfpost');
-	my $pre = $ibx->search->query('dfpre:090d998');
-	is(scalar(@$pre), 1, 'got one result for dfpre');
+	my $post = $ibx->search->reopen->mset('dfpost:6e006fd7');
+	is($post->size, 1, 'got one result for dfpost');
+	my $pre = $ibx->search->mset('dfpre:090d998');
+	is($pre->size, 1, 'got one result for dfpre');
+	$pre = $ibx->search->mset_to_smsg($ibx, $pre);
+	$post = $ibx->search->mset_to_smsg($ibx, $post);
 	is($post->[0]->{blob}, $pre->[0]->{blob}, 'same message in both cases');
 }
 
diff --git a/t/v2mirror.t b/t/v2mirror.t
index bca43fd5..81b9544d 100644
--- a/t/v2mirror.t
+++ b/t/v2mirror.t
@@ -112,11 +112,11 @@ my $fetch_each_epoch = sub {
 
 $fetch_each_epoch->();
 
-my $mset = $mibx->search->reopen->query('m:15@example.com', {mset => 1});
+my $mset = $mibx->search->reopen->mset('m:15@example.com');
 is(scalar($mset->items), 0, 'new message not found in mirror, yet');
 ok(run_script([qw(-index -j0), "$tmpdir/m"]), 'index updated');
 is_deeply([$mibx->mm->minmax], [$ibx->mm->minmax], 'index synched minmax');
-$mset = $mibx->search->reopen->query('m:15@example.com', {mset => 1});
+$mset = $mibx->search->reopen->mset('m:15@example.com');
 is(scalar($mset->items), 1, 'found message in mirror');
 
 # purge:
@@ -137,7 +137,7 @@ $v2w->done;
 my $msgs = $mibx->over->get_thread('10@example.com');
 my $to_purge = $msgs->[0]->{blob};
 like($to_purge, qr/\A[a-f0-9]{40,}\z/, 'read blob to be purged');
-$mset = $ibx->search->reopen->query('m:10@example.com', {mset => 1});
+$mset = $ibx->search->reopen->mset('m:10@example.com');
 is(scalar($mset->items), 0, 'purged message gone from origin');
 
 $fetch_each_epoch->();
@@ -153,11 +153,11 @@ $fetch_each_epoch->();
 	unlike($err, qr/fatal/, 'no scary fatal error shown');
 }
 
-$mset = $mibx->search->reopen->query('m:10@example.com', {mset => 1});
+$mset = $mibx->search->reopen->mset('m:10@example.com');
 is(scalar($mset->items), 0, 'purged message not found in mirror');
 is_deeply([$mibx->mm->minmax], [$ibx->mm->minmax], 'minmax still synced');
 for my $i ((1..9),(11..15)) {
-	$mset = $mibx->search->query("m:$i\@example.com", {mset => 1});
+	$mset = $mibx->search->mset("m:$i\@example.com");
 	is(scalar($mset->items), 1, "$i\@example.com remains visible");
 }
 is($mibx->git->check($to_purge), undef, 'unindex+prune successful in mirror');
@@ -171,7 +171,7 @@ is($mibx->git->check($to_purge), undef, 'unindex+prune successful in mirror');
 
 # deletes happen in a different fetch window
 {
-	$mset = $mibx->search->reopen->query('m:1@example.com', {mset => 1});
+	$mset = $mibx->search->reopen->mset('m:1@example.com');
 	is(scalar($mset->items), 1, '1@example.com visible in mirror');
 	$mime->header_set('Message-ID', '<1@example.com>');
 	$mime->header_set('Subject', 'subject = 1');
@@ -186,12 +186,12 @@ is($mibx->git->check($to_purge), undef, 'unindex+prune successful in mirror');
 	my $opt = { 1 => \$out, 2 => \$err };
 	ok(run_script($cmd, undef, $opt), 'index ran');
 	is($err, '', 'no errors reported by index');
-	$mset = $mibx->search->reopen->query('m:1@example.com', {mset => 1});
+	$mset = $mibx->search->reopen->mset('m:1@example.com');
 	is(scalar($mset->items), 0, '1@example.com no longer visible in mirror');
 }
 
 if ('sequential-shard') {
-	$mset = $mibx->search->query('m:15@example.com', {mset => 1});
+	$mset = $mibx->search->mset('m:15@example.com');
 	is(scalar($mset->items), 1, 'large message not indexed');
 	remove_tree(glob("$tmpdir/m/xap*"), glob("$tmpdir/m/msgmap.*"));
 	my $cmd = [ qw(-index -j9 --sequential-shard), "$tmpdir/m" ];
@@ -199,7 +199,7 @@ if ('sequential-shard') {
 	my @shards = glob("$tmpdir/m/xap*/?");
 	is(scalar(@shards), 8, 'got expected shard count');
 	PublicInbox::InboxWritable::cleanup($mibx);
-	$mset = $mibx->search->query('m:15@example.com', {mset => 1});
+	$mset = $mibx->search->mset('m:15@example.com');
 	is(scalar($mset->items), 1, 'search works after --sequential-shard');
 }
 
@@ -216,7 +216,7 @@ if ('max size') {
 	my $opt = { 2 => \(my $err) };
 	ok(run_script($cmd, undef, $opt), 'indexed with --max-size');
 	like($err, qr/skipping [a-f0-9]{40,}/, 'warned about skipping message');
-	$mset = $mibx->search->reopen->query('m:2big@a', {mset =>1});
+	$mset = $mibx->search->reopen->mset('m:2big@a');
 	is(scalar($mset->items), 0, 'large message not indexed');
 
 	{
@@ -230,7 +230,7 @@ EOF
 	$cmd = [ qw(-index -j0 --reindex), "$tmpdir/m" ];
 	ok(run_script($cmd, undef, $opt), 'reindexed w/ indexMaxSize in file');
 	like($err, qr/skipping [a-f0-9]{40,}/, 'warned about skipping message');
-	$mset = $mibx->search->reopen->query('m:2big@a', {mset =>1});
+	$mset = $mibx->search->reopen->mset('m:2big@a');
 	is(scalar($mset->items), 0, 'large message not re-indexed');
 }
 
diff --git a/t/v2reindex.t b/t/v2reindex.t
index a2fc2075..ae1570ed 100644
--- a/t/v2reindex.t
+++ b/t/v2reindex.t
@@ -153,7 +153,7 @@ ok(!-d $xap, 'Xapian directories removed again');
 	delete $ibx->{mm};
 	is_deeply([ $ibx->mm->minmax ], $minmax, 'minmax unchanged');
 	is($ibx->mm->num_highwater, 10, 'num_highwater as expected');
-	my $mset = $ibx->search->query($phrase, {mset=>1});
+	my $mset = $ibx->search->mset($phrase);
 	isnt($mset->size, 0, "phrase search succeeds on indexlevel=full");
 	for (glob("$xap/*/*")) { $sizes{$ibx->{indexlevel}} += -s _ if -f $_ }
 
@@ -184,12 +184,12 @@ ok(!-d $xap, 'Xapian directories removed again');
 		# not sure why, but Xapian seems to fallback to terms and
 		# phrase searches still work
 		delete $ibx->{search};
-		my $mset = $ibx->search->query($phrase, {mset=>1});
+		my $mset = $ibx->search->mset($phrase);
 		is($mset->size, 0, 'phrase search does not work on medium');
 	}
 	my $words = $phrase;
 	$words =~ tr/"'//d;
-	my $mset = $ibx->search->query($words, {mset=>1});
+	my $mset = $ibx->search->mset($words);
 	isnt($mset->size, 0, "normal search works on indexlevel=medium");
 	for (glob("$xap/*/*")) { $sizes{$ibx->{indexlevel}} += -s _ if -f $_ }
 
@@ -531,7 +531,8 @@ EOF
 
 	my %uniq;
 	for my $s (qw(uno dos tres)) {
-		my $msgs = $ibx->search->query("s:$s");
+		my $mset = $ibx->search->mset("s:$s");
+		my $msgs = $ibx->search->mset_to_smsg($ibx, $mset);
 		is(scalar(@$msgs), 1, "only one result for `$s'");
 		$uniq{$msgs->[0]->{num}}++;
 	}
diff --git a/t/v2writable.t b/t/v2writable.t
index 217eaf97..1de8c032 100644
--- a/t/v2writable.t
+++ b/t/v2writable.t
@@ -124,15 +124,14 @@ if ('ensure git configs are correct') {
 SELECT COUNT(*) FROM over WHERE num > 0
 
 	is($ibx->mm->num_highwater, $total, 'got expected highwater value');
-	my $srch = $ibx->search;
-	my $mset1 = $srch->reopen->query('m:abcde@1', { mset => 1 });
+	my $mset1 = $ibx->search->reopen->mset('m:abcde@1');
 	is($mset1->size, 1, 'message found by first MID');
-	my $mset2 = $srch->reopen->query('m:abcde@2', { mset => 1 });
+	my $mset2 = $ibx->search->mset('m:abcde@2');
 	is($mset2->size, 1, 'message found by second MID');
 	is((($mset1->items)[0])->get_docid, (($mset2->items)[0])->get_docid,
 		'same document') if ($mset1->size);
 
-	my $alt = $srch->reopen->query('m:alt-id-for-nntp', { mset => 1 });
+	my $alt = $ibx->search->mset('m:alt-id-for-nntp');
 	is($alt->size, 1, 'message found by alt MID (NNTP)');
 	is((($alt->items)[0])->get_docid, (($mset1->items)[0])->get_docid,
 		'same document') if ($mset1->size);
@@ -231,8 +230,7 @@ EOF
 	my $num = $smsg->{num};
 	like($num, qr/\A\d+\z/, 'numeric number in return message');
 	is($ibx->mm->mid_for($num), undef, 'no longer in Msgmap by num');
-	my $srch = $ibx->search->reopen;
-	my $mset = $srch->query('m:'.$mid, { mset => 1});
+	my $mset = $ibx->search->reopen->mset('m:'.$mid);
 	is($mset->size, 0, 'no longer found in Xapian');
 	my @log1 = (@log, qw(-1 --pretty=raw --raw -r --no-renames));
 	is($ibx->over->get_art($num), undef,
diff --git a/t/watch_filter_rubylang.t b/t/watch_filter_rubylang.t
index 4b72dbae..6513f30b 100644
--- a/t/watch_filter_rubylang.t
+++ b/t/watch_filter_rubylang.t
@@ -82,14 +82,13 @@ EOF
 	}
 
 	# make sure all serials are searchable:
-	my ($tot, $msgs);
 	for my $i (1..15) {
-		($tot, $msgs) = $ibx->search->query("alerts:$i");
-		is($tot, 1, "got one result for alerts:$i");
+		my $mset = $ibx->search->mset("alerts:$i");
+		is($mset->size, 1, "got one result for alerts:$i");
+		my $msgs = $ibx->search->mset_to_smsg($ibx, $mset);
 		is($msgs->[0]->{mid}, "a.$i\@b.com", "got expected MID for $i");
 	}
-	($tot, undef) = $ibx->search->query('b:spam');
-	is($tot, 1, 'got spam message');
+	is($ibx->search->mset('b:spam')->size, 1, 'got spam message');
 
 	my $nr = unlink <$maildir/new/*>;
 	is(16, $nr);
@@ -104,8 +103,7 @@ EOF
 
 	$config = PublicInbox::Config->new(\$orig);
 	$ibx = $config->lookup_name($v);
-	($tot, undef) = $ibx->search->reopen->query('b:spam');
-	is($tot, 0, 'spam removed');
+	is($ibx->search->reopen->mset('b:spam')->size, 0, 'spam removed');
 
 	is_deeply([], \@warn, 'no warnings');
 }
diff --git a/t/watch_maildir_v2.t b/t/watch_maildir_v2.t
index c2c096ae..12546418 100644
--- a/t/watch_maildir_v2.t
+++ b/t/watch_maildir_v2.t
@@ -130,12 +130,14 @@ More majordomo info at  http://vger.kernel.org/majordomo-info.html\n);
 	$msg = do { local $/; <$fh> };
 	PublicInbox::Emergency->new($maildir)->prepare(\$msg);
 	PublicInbox::Watch->new($config)->scan('full');
-	my $msgs = $ibx->search->reopen->query('dfpost:6e006fd7');
-	is(scalar(@$msgs), 1, 'diff postimage found');
-	my $post = $msgs->[0];
-	$msgs = $ibx->search->query('dfpre:090d998b6c2c');
-	is(scalar(@$msgs), 1, 'diff preimage found');
-	is($post->{blob}, $msgs->[0]->{blob}, 'same message');
+	my $post = $ibx->search->reopen->mset('dfpost:6e006fd7');
+	is($post->size, 1, 'diff postimage found');
+	my $pre = $ibx->search->mset('dfpre:090d998b6c2c');
+	is($pre->size, 1, 'diff preimage found');
+	$pre = $ibx->search->mset_to_smsg($ibx, $pre);
+	$post = $ibx->search->mset_to_smsg($ibx, $post);
+	is(scalar(@$pre), 1, 'diff preimage found');
+	is($post->[0]->{blob}, $pre->[0]->{blob}, 'same message');
 }
 
 # multiple inboxes in the same maildir
@@ -161,7 +163,8 @@ both
 EOF
 	PublicInbox::Emergency->new($maildir)->prepare(\$both);
 	PublicInbox::Watch->new($config)->scan('full');
-	my $msgs = $ibx->search->reopen->query('m:both@b.com');
+	my $mset = $ibx->search->reopen->mset('m:both@b.com');
+	my $msgs = $ibx->search->mset_to_smsg($ibx, $mset);
 	my $v1 = $config->lookup_name('v1');
 	my $msg = $v1->git->cat_file($msgs->[0]->{blob});
 	is($both, $$msg, 'got original message back from v1');
diff --git a/t/xcpdb-reshard.t b/t/xcpdb-reshard.t
index 1835fa62..c1af5d9a 100644
--- a/t/xcpdb-reshard.t
+++ b/t/xcpdb-reshard.t
@@ -49,7 +49,8 @@ for my $R (qw(2 4 1 3 3)) {
 	ok(run_script($cmd), "xcpdb -R$R");
 	my @new_shards = grep(m!/\d+\z!, glob("$ibx->{inboxdir}/xap*/*"));
 	is(scalar(@new_shards), $R, 'resharded to two shards');
-	my $msgs = $ibx->search->query('s:this');
+	my $mset = $ibx->search->mset('s:this');
+	my $msgs = $ibx->search->mset_to_smsg($ibx, $mset);
 	is(scalar(@$msgs), $ndoc, 'got expected docs after resharding');
 	my %by_mid = map {; "$_->{mid}" => $_ } @$msgs;
 	ok($by_mid{"m$_\@example.com"}, "$_ exists") for (1..$ndoc);


* [PATCH 07/11] search: remove {over_ro} field
  2020-09-02 11:04 [PATCH 00/11] cleanups, mostly indexing related Eric Wong
                   ` (5 preceding siblings ...)
  2020-09-02 11:04 ` [PATCH 06/11] search: replace ->query with ->mset Eric Wong
@ 2020-09-02 11:04 ` Eric Wong
  2020-09-02 11:04 ` [PATCH 08/11] imap: drop old, pre-Parse::RecDescent search parser Eric Wong
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2020-09-02 11:04 UTC (permalink / raw)
  To: meta

Only the inbox accesses the read-only {over} now, instead of
going through ->search.  This simplifies our object graph and
avoids potentially redundant FDs and DB handles pointing to the
same over.sqlite3 file.
---
 lib/PublicInbox/Inbox.pm  | 11 +++++------
 lib/PublicInbox/Search.pm |  2 --
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 4005954e..b0894a7d 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -206,14 +206,13 @@ EOF
 	};
 }
 
-sub over ($) {
-	my ($self) = @_;
-	my $srch = search($self, 1) or return;
-	$self->{over} //= eval {
-		my $over = $srch->{over_ro};
+sub over {
+	$_[0]->{over} //= eval {
+		my $srch = search($_[0], 1) or return;
+		my $over = PublicInbox::Over->new("$srch->{xpfx}/over.sqlite3");
 		$over->dbh; # may fail
 		$over;
-	}
+	};
 }
 
 sub try_cat {
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index cfa942b2..b07f4ea6 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -265,8 +265,6 @@ sub new {
 		ibx_ver => $ibx->version,
 	}, $class;
 	xpfx_init($self);
-	my $dir = xdir($self, 1);
-	$self->{over_ro} = PublicInbox::Over->new("$dir/over.sqlite3");
 	$self;
 }
 


* [PATCH 08/11] imap: drop old, pre-Parse::RecDescent search parser
  2020-09-02 11:04 [PATCH 00/11] cleanups, mostly indexing related Eric Wong
                   ` (6 preceding siblings ...)
  2020-09-02 11:04 ` [PATCH 07/11] search: remove {over_ro} field Eric Wong
@ 2020-09-02 11:04 ` Eric Wong
  2020-09-02 11:04 ` [PATCH 09/11] wwwaltid: drop unused sqlite3_missing function Eric Wong
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2020-09-02 11:04 UTC (permalink / raw)
  To: meta

We switched to Parse::RecDescent during development and left
some dead code behind.
---
 lib/PublicInbox/IMAP.pm | 61 -----------------------------------------
 1 file changed, 61 deletions(-)

diff --git a/lib/PublicInbox/IMAP.pm b/lib/PublicInbox/IMAP.pm
index d540fd0b..2d0d005e 100644
--- a/lib/PublicInbox/IMAP.pm
+++ b/lib/PublicInbox/IMAP.pm
@@ -1109,67 +1109,6 @@ sub search_uid_range { # long_response
 	1; # more
 }
 
-sub date_search {
-	my ($q, $k, $d) = @_;
-	my $sql = $q->{sql};
-
-	# Date: header
-	if ($k eq 'SENTON') {
-		my $end = $d + 86399; # no leap day...
-		my $da = strftime('%Y%m%d%H%M%S', gmtime($d));
-		my $db = strftime('%Y%m%d%H%M%S', gmtime($end));
-		$q->{xap} .= " dt:$da..$db";
-		$$sql .= " AND ds >= $d AND ds <= $end" if defined($sql);
-	} elsif ($k eq 'SENTBEFORE') {
-		$q->{xap} .= ' d:..'.strftime('%Y%m%d', gmtime($d));
-		$$sql .= " AND ds <= $d" if defined($sql);
-	} elsif ($k eq 'SENTSINCE') {
-		$q->{xap} .= ' d:'.strftime('%Y%m%d', gmtime($d)).'..';
-		$$sql .= " AND ds >= $d" if defined($sql);
-
-	# INTERNALDATE (Received)
-	} elsif ($k eq 'ON') {
-		my $end = $d + 86399; # no leap day...
-		$q->{xap} .= " ts:$d..$end";
-		$$sql .= " AND ts >= $d AND ts <= $end" if defined($sql);
-	} elsif ($k eq 'BEFORE') {
-		$q->{xap} .= " ts:..$d";
-		$$sql .= " AND ts <= $d" if defined($sql);
-	} elsif ($k eq 'SINCE') {
-		$q->{xap} .= " ts:$d..";
-		$$sql .= " AND ts >= $d" if defined($sql);
-	} else {
-		die "BUG: $k not recognized";
-	}
-}
-
-# IMAP to Xapian search key mapping
-my %I2X = (
-	SUBJECT => 's:',
-	BODY => 'b:',
-	FROM => 'f:',
-	TEXT => '', # n.b. does not include all headers
-	TO => 't:',
-	CC => 'c:',
-	# BCC => 'bcc:', # TODO
-	# KEYWORD # TODO ? dfpre,dfpost,...
-);
-
-# IMAP allows searching arbitrary headers via "HEADER $HDR_NAME $HDR_VAL"
-# which gets silly expensive.  We only allow the headers we already index.
-my %H2X = (%I2X, 'MESSAGE-ID' => 'm:', 'LIST-ID' => 'l:');
-
-sub xap_append ($$$$) {
-	my ($q, $rest, $k, $xk) = @_;
-	delete $q->{sql}; # can't use over.sqlite3
-	defined(my $arg = shift @$rest) or return "BAD $k no arg";
-
-	# AFAIK Xapian can't handle [*"] in probabilistic terms
-	$arg =~ tr/*"//d;
-	${$q->{xap}} .= qq[ $xk"$arg"];
-	undef;
-}
-
 sub parse_query ($$) {
 	my ($self, $query) = @_;
 	my $q = PublicInbox::IMAPsearchqp::parse($self, $query);


* [PATCH 09/11] wwwaltid: drop unused sqlite3_missing function
  2020-09-02 11:04 [PATCH 00/11] cleanups, mostly indexing related Eric Wong
                   ` (7 preceding siblings ...)
  2020-09-02 11:04 ` [PATCH 08/11] imap: drop old, pre-Parse::RecDescent search parser Eric Wong
@ 2020-09-02 11:04 ` Eric Wong
  2020-09-02 11:04 ` [PATCH 10/11] overidx: document column uses Eric Wong
  2020-09-02 11:04 ` [PATCH 11/11] v2writable: reuse read-only shard counting code Eric Wong
  10 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2020-09-02 11:04 UTC (permalink / raw)
  To: meta

It's inlined into the main function, which we'll shorten
slightly with the defined-or (`//') operator.  Also noticed
and fixed a mismatched HTML tag.
---
 lib/PublicInbox/WwwAltId.pm | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/lib/PublicInbox/WwwAltId.pm b/lib/PublicInbox/WwwAltId.pm
index e5476d1f..2818400e 100644
--- a/lib/PublicInbox/WwwAltId.pm
+++ b/lib/PublicInbox/WwwAltId.pm
@@ -11,16 +11,6 @@ use PublicInbox::Spawn qw(which);
 use PublicInbox::GzipFilter;
 our $sqlite3 = $ENV{SQLITE3};
 
-sub sqlite3_missing ($) {
-	html_oneshot($_[0], 501, \<<EOF);
-<pre>sqlite3 not available
-
-The administrator needs to install the sqlite3(1) binary
-to support gzipped sqlite3 dumps.</pre>
-</pre>
-EOF
-}
-
 sub check_output {
 	my ($r, $bref, $ctx) = @_;
 	return html_oneshot($ctx, 500) if !defined($r);
@@ -65,16 +55,12 @@ or
 EOF
 	}
 
-	$sqlite3 //= which('sqlite3');
-	if (!defined($sqlite3)) {
-		return html_oneshot($ctx, 501, \<<EOF);
+	$sqlite3 //= which('sqlite3') // return html_oneshot($ctx, 501, \<<EOF);
 <pre>sqlite3 not available
 
 The administrator needs to install the sqlite3(1) binary
 to support gzipped sqlite3 dumps.</pre>
-</pre>
 EOF
-	}
 
 	# setup stdin, POSIX requires writes <= 512 bytes to succeed so
 	# we can close the pipe right away.
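
How the shortened line parses may not be obvious at first glance, so
here is a small standalone sketch.  The names find_tool and
tool_or_error are hypothetical and exist only for this illustration.
Since `//' binds tighter than `//=', the right-hand side groups as
"find_tool(...) // return ...", and the early return only happens when
the lookup yields undef.  Once the cached variable is set, `//='
short-circuits and the lookup is skipped entirely.

```perl
use strict;
use warnings;
use v5.10.1; # for the defined-or operators

my $tool; # cached lookup result, undef until found

# hypothetical lookup: simply echoes its argument (may be undef)
sub find_tool { $_[0] }

sub tool_or_error {
	my ($candidate) = @_;
	# parsed as: $tool //= (find_tool(...) // return ...)
	$tool //= find_tool($candidate) // return '501 not available';
	"200 using $tool";
}
```

A failed lookup returns early and leaves the cache unset, so a later
call with a usable value still succeeds and is then remembered.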


* [PATCH 10/11] overidx: document column uses
  2020-09-02 11:04 [PATCH 00/11] cleanups, mostly indexing related Eric Wong
                   ` (8 preceding siblings ...)
  2020-09-02 11:04 ` [PATCH 09/11] wwwaltid: drop unused sqlite3_missing function Eric Wong
@ 2020-09-02 11:04 ` Eric Wong
  2020-09-02 11:04 ` [PATCH 11/11] v2writable: reuse read-only shard counting code Eric Wong
  10 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2020-09-02 11:04 UTC (permalink / raw)
  To: meta

This may be useful for keeping our heads on straight when
dealing with IMAP, NNTP, JMAP, etc.
---
 lib/PublicInbox/OverIdx.pm | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm
index 6f0477f0..db4b7738 100644
--- a/lib/PublicInbox/OverIdx.pm
+++ b/lib/PublicInbox/OverIdx.pm
@@ -379,12 +379,12 @@ sub create_tables {
 
 	$dbh->do(<<'');
 CREATE TABLE IF NOT EXISTS over (
-	num INTEGER NOT NULL,
-	tid INTEGER NOT NULL,
-	sid INTEGER,
-	ts INTEGER,
-	ds INTEGER,
-	ddd VARBINARY, /* doc-data-deflated */
+	num INTEGER NOT NULL, /* NNTP article number == IMAP UID */
+	tid INTEGER NOT NULL, /* THREADID (IMAP REFERENCES threading, JMAP) */
+	sid INTEGER, /* Subject ID (IMAP ORDEREDSUBJECT "threading") */
+	ts INTEGER, /* IMAP INTERNALDATE (Received: header, git commit time) */
+	ds INTEGER, /* RFC-2822 sent Date: header, git author time */
+	ddd VARBINARY, /* doc-data-deflated (->to_doc_data, ->load_from_data) */
 	UNIQUE (num)
 )
 
@@ -406,13 +406,13 @@ CREATE TABLE IF NOT EXISTS counter (
 	$dbh->do(<<'');
 CREATE TABLE IF NOT EXISTS subject (
 	sid INTEGER PRIMARY KEY AUTOINCREMENT,
-	path VARCHAR(40) NOT NULL,
+	path VARCHAR(40) NOT NULL, /* SHA-1 of normalized subject */
 	UNIQUE (path)
 )
 
 	$dbh->do(<<'');
 CREATE TABLE IF NOT EXISTS id2num (
-	id INTEGER NOT NULL,
+	id INTEGER NOT NULL, /* <=> msgid.id */
 	num INTEGER NOT NULL,
 	UNIQUE (id, num)
 )
@@ -423,7 +423,7 @@ CREATE TABLE IF NOT EXISTS id2num (
 
 	$dbh->do(<<'');
 CREATE TABLE IF NOT EXISTS msgid (
-	id INTEGER PRIMARY KEY AUTOINCREMENT,
+	id INTEGER PRIMARY KEY AUTOINCREMENT, /* <=> id2num.id */
 	mid VARCHAR(244) NOT NULL,
 	UNIQUE (mid)
 )


* [PATCH 11/11] v2writable: reuse read-only shard counting code
  2020-09-02 11:04 [PATCH 00/11] cleanups, mostly indexing related Eric Wong
                   ` (9 preceding siblings ...)
  2020-09-02 11:04 ` [PATCH 10/11] overidx: document column uses Eric Wong
@ 2020-09-02 11:04 ` Eric Wong
  10 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2020-09-02 11:04 UTC (permalink / raw)
  To: meta

We'll also fix the read-only code to ensure we notice missing
Xapian shards, since gaps would throw off our expectation that
Xapian document IDs and NNTP article numbers are interchangeable.
---
 lib/PublicInbox/Search.pm     |  5 ++++-
 lib/PublicInbox/V2Writable.pm | 23 +++--------------------
 2 files changed, 7 insertions(+), 21 deletions(-)

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index b07f4ea6..fb35b747 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -7,6 +7,7 @@ package PublicInbox::Search;
 use strict;
 use parent qw(Exporter);
 our @EXPORT_OK = qw(mdocid);
+use List::Util qw(max);
 
 # values for searching, changing the numeric value breaks
 # compatibility with old indices (so don't change them it)
@@ -203,7 +204,9 @@ sub _xdb ($) {
 
 		# We need numeric sorting so shard[0] is first for reading
 		# Xapian metadata, if needed
-		for (sort { $a <=> $b } grep(/\A[0-9]+\z/, readdir($dh))) {
+		my $last = max(grep(/\A[0-9]+\z/, readdir($dh)));
+		return if !defined($last);
+		for (0..$last) {
 			my $shard_dir = "$dir/$_";
 			if (-d $shard_dir && -r _) {
 				push @xdb, $X{Database}->new($shard_dir);
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index c8334645..a1f6048f 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -65,28 +65,11 @@ sub nproc_shards ($) {
 
 sub count_shards ($) {
 	my ($self) = @_;
-	my $n = 0;
-	my $xpfx = $self->{xpfx};
-
 	# always load existing shards in case core count changes:
 	# Also, shard count may change while -watch is running
-	# due to "xcpdb --reshard"
-	if (-d $xpfx) {
-		my $XapianDatabase;
-		foreach my $shard (<$xpfx/*>) {
-			-d $shard && $shard =~ m!/[0-9]+\z! or next;
-			$XapianDatabase //= do {
-				require PublicInbox::Search;
-				PublicInbox::Search::load_xapian();
-				$PublicInbox::Search::X{Database};
-			};
-			eval {
-				$XapianDatabase->new($shard)->close;
-				$n++;
-			};
-		}
-	}
-	$n;
+	my $srch = $self->{ibx}->search or return 0;
+	delete $self->{ibx}->{search};
+	$srch->{nshard} // 0
 }
 
 sub new {
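
The idea behind walking 0..$last instead of only the entries that
exist can be sketched with plain lists.  The helper missing_shards
below is hypothetical and not part of public-inbox; it takes directory
entry names and reports which shard numbers are absent from the
contiguous range the read-only code expects.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: return the shard numbers in 0..max that do
# not appear among the given directory entry names.
sub missing_shards {
	my (@entries) = @_;
	my @nums = grep(/\A[0-9]+\z/, @entries);
	my $last = max(@nums); # undef when there are no shards
	return () if !defined($last);
	my %have;
	$have{$_ + 0} = 1 for @nums; # normalize to numeric keys
	grep { !$have{$_} } (0..$last);
}
```

With entries such as "0", "1" and "3", the range 0..3 is checked and
shard 2 is reported as missing, whereas sorting only the names that
exist would silently skip the gap.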
