unofficial mirror of meta@public-inbox.org
* Switching to extindex
@ 2024-06-04 15:59 Jonathan Corbet
  2024-06-04 17:07 ` Eric Wong
  0 siblings, 1 reply; 6+ messages in thread
From: Jonathan Corbet @ 2024-06-04 15:59 UTC
  To: meta

The public-inbox installation used by LWN has been working great for
years, with the result that I didn't update it for years.  I've recently
replaced that server, though, and moved to 1.9 in the process.  I'm
trying to switch over to the extindex mode, but could use some help.

I've run public-inbox-extindex --all and waited a few days; once I
figured out that I also need to have it running with --watch, things
*seem* to be working well.

My understanding is that I can now get rid of the per-inbox indexing and
get a bunch of disk space back, and I would like to do that.  This,
though, is the step that I've not been able to figure out.  Which files
can I remove, and how do I tell public-inbox to not recreate them?

Thanks,

jon


* Re: Switching to extindex
  2024-06-04 15:59 Switching to extindex Jonathan Corbet
@ 2024-06-04 17:07 ` Eric Wong
  2024-06-04 17:18   ` Jonathan Corbet
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Wong @ 2024-06-04 17:07 UTC
  To: Jonathan Corbet; +Cc: meta

Jonathan Corbet <corbet@lwn.net> wrote:
> I've run public-inbox-extindex --all and waited a few days; once I
> figured out that I also need to have it running with --watch, things
> *seem* to be working well.
> 
> My understanding is that I can now get rid of the per-inbox indexing and
> get a bunch of disk space back, and I would like to do that.  This,
> though, is the step that I've not been able to figure out.  Which files
> can I remove, and how do I tell public-inbox to not recreate them?

Yes, you can remove the Xapian shard directories (0, 1, 2, ...)
under xap15 and -watch will be able to detect indexlevel=basic
on the next SIGHUP or restart.

Leave the per-inbox *.sqlite3 files, they're still required for
-extindex and don't use too much space compared to Xapian.

For new inboxes you can -init with `-L basic' to avoid per-inbox
Xapian DBs.
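
A minimal sketch of the cleanup described above, assuming the layout
mentioned in this message (numbered shard directories under
INBOX_DIR/xap15, next to the *.sqlite3 files that must stay).  The
function name is made up for illustration, and it only prints
candidates so they can be reviewed before anything is deleted:

```shell
#!/bin/sh
# Dry run: print the numbered Xapian shard directories of one inbox.
# Assumption: shards live in INBOX_DIR/xap15/{0,1,2,...}; anything whose
# name is not purely numeric (e.g. over.sqlite3) is left alone.
list_xapian_shards() {
	for d in "$1"/xap15/*; do
		[ -d "$d" ] || continue
		case ${d##*/} in
		*[!0-9]*) ;;                  # non-numeric name: skip
		*) printf '%s\n' "$d" ;;      # numeric shard directory
		esac
	done
}
```

Once the printed list looks right, each listed directory can be removed
with `rm -r`, followed by a SIGHUP or restart of -watch as noted above.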


* Re: Switching to extindex
  2024-06-04 17:07 ` Eric Wong
@ 2024-06-04 17:18   ` Jonathan Corbet
  2024-06-04 17:24     ` Eric Wong
  0 siblings, 1 reply; 6+ messages in thread
From: Jonathan Corbet @ 2024-06-04 17:18 UTC
  To: Eric Wong; +Cc: meta

Eric Wong <e@80x24.org> writes:

> Jonathan Corbet <corbet@lwn.net> wrote:
>> I've run public-inbox-extindex --all and waited a few days; once I
>> figured out that I also need to have it running with --watch, things
>> *seem* to be working well.
>> 
>> My understanding is that I can now get rid of the per-inbox indexing and
>> get a bunch of disk space back, and I would like to do that.  This,
>> though, is the step that I've not been able to figure out.  Which files
>> can I remove, and how do I tell public-inbox to not recreate them?
>
> Yes, you can remove the Xapian shard directories (0, 1, 2, ...)
> under xap15 and -watch will be able to detect indexlevel=basic
> on the next SIGHUP or restart.

For reasons lost to history, I'm using public-inbox-mda instead; it
seems it does not do that detection?  No worries, I can tweak the config
file easily enough.

This is all good stuff - thanks again!

jon


* Re: Switching to extindex
  2024-06-04 17:18   ` Jonathan Corbet
@ 2024-06-04 17:24     ` Eric Wong
  2024-06-04 17:34       ` Jonathan Corbet
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Wong @ 2024-06-04 17:24 UTC
  To: Jonathan Corbet; +Cc: meta

Jonathan Corbet <corbet@lwn.net> wrote:
> Eric Wong <e@80x24.org> writes:
> 
> > Jonathan Corbet <corbet@lwn.net> wrote:
> >> I've run public-inbox-extindex --all and waited a few days; once I
> >> figured out that I also need to have it running with --watch, things
> >> *seem* to be working well.
> >> 
> >> My understanding is that I can now get rid of the per-inbox indexing and
> >> get a bunch of disk space back, and I would like to do that.  This,
> >> though, is the step that I've not been able to figure out.  Which files
> >> can I remove, and how do I tell public-inbox to not recreate them?
> >
> > Yes, you can remove the Xapian shard directories (0, 1, 2, ...)
> > under xap15 and -watch will be able to detect indexlevel=basic
> > on the next SIGHUP or restart.
> 
> For reasons lost to history, I'm using public-inbox-mda instead; it
> seems it does not do that detection?  No worries, I can tweak the config
> file easily enough.

-mda does the detection, too, just every time.  I haven't used
it much with -extindex but I think there are tests for it, so it works...


* Re: Switching to extindex
  2024-06-04 17:24     ` Eric Wong
@ 2024-06-04 17:34       ` Jonathan Corbet
  2024-06-04 22:25         ` [PATCH] mda: do not auto-create Xapian indices Eric Wong
  0 siblings, 1 reply; 6+ messages in thread
From: Jonathan Corbet @ 2024-06-04 17:34 UTC
  To: Eric Wong; +Cc: meta

Eric Wong <e@80x24.org> writes:

> Jonathan Corbet <corbet@lwn.net> wrote:
>> Eric Wong <e@80x24.org> writes:
>> 
>> > Jonathan Corbet <corbet@lwn.net> wrote:
>> >> I've run public-inbox-extindex --all and waited a few days; once I
>> >> figured out that I also need to have it running with --watch, things
>> >> *seem* to be working well.
>> >> 
>> >> My understanding is that I can now get rid of the per-inbox indexing and
>> >> get a bunch of disk space back, and I would like to do that.  This,
>> >> though, is the step that I've not been able to figure out.  Which files
>> >> can I remove, and how do I tell public-inbox to not recreate them?
>> >
>> > Yes, you can remove the Xapian shard directories (0, 1, 2, ...)
>> > under xap15 and -watch will be able to detect indexlevel=basic
>> > on the next SIGHUP or restart.
>> 
>> For reasons lost to history, I'm using public-inbox-mda instead; it
>> seems it does not do that detection?  No worries, I can tweak the config
>> file easily enough.
>
> -mda does the detection, too, just every time.  I haven't used
> it much with -extindex but I think there are tests for it, so it works...

Interesting...for me it recreates the 0/1/2 directories unless I edit
the config file explicitly.
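
For anyone following along, the explicit setting could be applied
roughly like this.  It is a sketch, not a recommendation: the function
name is invented, NAME stands for whatever the inbox's section is called
in the config, and the path falls back to ~/.public-inbox/config when
PI_CONFIG is unset:

```shell
#!/bin/sh
# Pin one inbox to indexlevel=basic in the public-inbox config file.
# $1 is the inbox's section name, i.e. [publicinbox "NAME"].
set_basic_indexlevel() {
	name=$1
	cfg=${PI_CONFIG:-$HOME/.public-inbox/config}
	git config --file "$cfg" "publicinbox.$name.indexlevel" basic
}
```

Calling `set_basic_indexlevel NAME` writes `indexlevel = basic` into the
matching section, which is the same result as editing the file by hand.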

Thanks,

jon


* [PATCH] mda: do not auto-create Xapian indices
  2024-06-04 17:34       ` Jonathan Corbet
@ 2024-06-04 22:25         ` Eric Wong
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2024-06-04 22:25 UTC
  To: Jonathan Corbet; +Cc: meta

Jonathan Corbet <corbet@lwn.net> wrote:
> Eric Wong <e@80x24.org> writes:
> > -mda does the detection, too, just every time.  I haven't used
> > it much with -extindex but I think there are tests for it, so it works...
> 
> Interesting...for me it recreates the 0/1/2 directories unless I edit
> the config file explicitly.

Oops, yeah, that's a bug.  -learn did it right, but not -mda.
This should fix it:

-------8<------
Subject: [PATCH] mda: do not auto-create Xapian indices

As with -learn, -mda now detects indexlevel=basic without an
explicit config setting for inboxes which only have SQLite
files.  Omitting indexlevel=basic in the config file allows
users to reduce configuration file size (and RAM usage).

We'll also ensure completely unindexed v1 inboxes can stay
unindexed despite the default being indexlevel=full.
---
 script/public-inbox-mda |  1 +
 t/mda.t                 | 30 ++++++++++++++++++++++++++++++
 t/v2mda.t               | 14 ++++++++++++++
 3 files changed, 45 insertions(+)

diff --git a/script/public-inbox-mda b/script/public-inbox-mda
index b463b07b..74202912 100755
--- a/script/public-inbox-mda
+++ b/script/public-inbox-mda
@@ -72,6 +72,7 @@ if (!scalar(@$dests)) {
 my $err;
 @$dests = grep {
 	my $ibx = PublicInbox::InboxWritable->new($_);
+	$ibx->{indexlevel} = $ibx->detect_indexlevel;
 	eval { $ibx->assert_usable_dir };
 	if ($@) {
 		warn $@;
diff --git a/t/mda.t b/t/mda.t
index 1d9e237b..51aae231 100644
--- a/t/mda.t
+++ b/t/mda.t
@@ -396,6 +396,36 @@ EOM
 	@xap = grep(!m!/over\.sqlite3!,
 			glob("$maindir/public-inbox/xapian*/*"));
 	is_deeply(\@xap, [], 'no Xapian files created by -learn');
+
+	$in = <<'EOM';
+From: a@example.com
+To: updated-address@example.com
+Subject: basic message for mda
+Date: Fri, 02 Oct 1993 00:00:00 +0000
+Message-ID: <basic-for-mda@example>
+
+basic
+EOM
+	local $ENV{ORIGINAL_RECIPIENT} = $addr;
+	ok run_script(['-mda'], undef, $rdr), '-mda for basic';
+	@xap = grep(!m!/over\.sqlite3!,
+			glob("$maindir/public-inbox/xapian*/*"));
+	is_deeply \@xap, [], 'no Xapian files created by -mda';
+
+	# try to ensure a completely unindexed v1 inbox stays unindexed
+	remove_tree "$maindir/public-inbox";
+	$in = <<'EOM';
+From: a@example.com
+To: updated-address@example.com
+Subject: unindexed message for mda
+Date: Fri, 02 Oct 1993 00:00:00 +0000
+Message-ID: <unindexed-for-mda@example>
+
+unindexed
+EOM
+
+	ok run_script(['-mda'], undef, $rdr), '-mda for unindexed';
+	ok !-e "$maindir/public-inbox", 'no v1 index created by default';
 };
 
 done_testing();
diff --git a/t/v2mda.t b/t/v2mda.t
index b7d177b2..cecc9722 100644
--- a/t/v2mda.t
+++ b/t/v2mda.t
@@ -119,6 +119,20 @@ EOM
 	ok($smsg, 'ham message learned w/ indexlevel=basic');
 	@shards = grep(m!/[0-9]+\z!, glob("$ibx->{inboxdir}/xap*/*"));
 	is_deeply(\@shards, [], 'not converted to medium/full after learn');
+
+	$rdr->{0} = \<<'EOM';
+From: a@example.com
+To: test@example.com
+Subject: this is a message for -mda to stay basic
+Date: Fri, 02 Oct 1993 00:00:00 +0000
+Message-ID: <mda-stays-basic@example>
+
+yum
+EOM
+	ok run_script(['-mda'], undef, $rdr), '-mda runs on basic'
+		or diag $err;
+	@shards = grep m!/[0-9]+\z!, glob("$ibx->{inboxdir}/xap*/*");
+	is_deeply \@shards, [], 'not converted to medium/full after -mda';
 }
 
 done_testing();
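
The rule in the commit message above (an inbox with only SQLite files
is treated as indexlevel=basic) can be approximated from the shell to
check what a given v2 inbox looks like on disk.  This is a heuristic
sketch built on the same globs as t/v2mda.t, not the logic of
detect_indexlevel itself; the function name and the printed labels are
made up here:

```shell
#!/bin/sh
# Guess the index level of a v2 inbox from what exists on disk.
# Assumptions: SQLite files and numbered Xapian shards both live under
# INBOX_DIR/xap*/, as in the globs used by t/v2mda.t.
guess_indexlevel() {
	inboxdir=$1
	have_sqlite=no
	for f in "$inboxdir"/xap*/*.sqlite3; do
		[ -f "$f" ] && have_sqlite=yes
	done
	for d in "$inboxdir"/xap*/*; do
		case ${d##*/} in
		*[!0-9]*) ;;    # not a numbered shard
		*) [ -d "$d" ] && { echo medium-or-full; return 0; } ;;
		esac
	done
	if [ "$have_sqlite" = yes ]; then
		echo basic
	else
		echo unindexed
	fi
}
```

An inbox that still prints `medium-or-full` after the shard directories
were removed is a sign that something recreated them.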

