unofficial mirror of meta@public-inbox.org
 help / color / mirror / Atom feed
* [PATCH 0/9] various read-only daemon and WWW things
@ 2021-10-12 11:46 Eric Wong
  2021-10-12 11:46 ` [PATCH 1/9] daemon: use v5.10.1, disable local warnings Eric Wong
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Eric Wong @ 2021-10-12 11:46 UTC (permalink / raw)
  To: meta

Read-only daemons close idle descriptors much more aggressively,
now.  This should reduce userspace and kernel memory use during
non-peak times with negligible slowdown under heavy load.

Some fairly big msgmap-related cleanups and WWW now exposes
mm->created_at AKA UIDVALIDITY as the Last-Modified time
of the _/text/config/raw endpoint. (see 9/9)

Eric Wong (9):
  daemon: use v5.10.1, disable local warnings
  daemon: quiet down Eml-related warnings
  search: delete QueryParser along with DB handle
  nntp: use defined-OR from Perl 5.10 for msgid check
  msgmap: use DBI->prepare_cached
  msgmap: share most of check_inodes w/ over
  daemon: unconditionally close Xapian shards on cleanup
  msgmap: ->new_file to supports $ibx arg, drop ->new
  www: _/text/config/raw Last-Modified: is mm->created_at

 lib/PublicInbox/Daemon.pm        |   8 ++-
 lib/PublicInbox/Inbox.pm         |  18 +++---
 lib/PublicInbox/InboxWritable.pm |   2 +-
 lib/PublicInbox/LeiMirror.pm     |  13 ++--
 lib/PublicInbox/Msgmap.pm        | 102 +++++++++++++------------------
 lib/PublicInbox/NNTP.pm          |   6 +-
 lib/PublicInbox/Search.pm        |  14 -----
 lib/PublicInbox/SearchIdx.pm     |   3 +-
 lib/PublicInbox/V2Writable.pm    |   4 +-
 lib/PublicInbox/WwwText.pm       |   3 +
 t/altid.t                        |   4 +-
 t/altid_v2.t                     |   4 +-
 t/extsearch.t                    |  10 +--
 t/filter_rubylang.t              |   2 +-
 t/init.t                         |   3 +-
 t/lei-mirror.t                   |  31 +++++++++-
 t/msgmap.t                       |   5 +-
 t/psgi_v2.t                      |  10 ++-
 18 files changed, 120 insertions(+), 122 deletions(-)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/9] daemon: use v5.10.1, disable local warnings
  2021-10-12 11:46 [PATCH 0/9] various read-only daemon and WWW things Eric Wong
@ 2021-10-12 11:46 ` Eric Wong
  2021-10-12 11:46 ` [PATCH 2/9] daemon: quiet down Eml-related warnings Eric Wong
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong @ 2021-10-12 11:46 UTC (permalink / raw)
  To: meta

We're moving towards relying on "perl -w" for warnings and
v5.12 for strict.
---
 lib/PublicInbox/Daemon.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/Daemon.pm b/lib/PublicInbox/Daemon.pm
index 0acd4ee9..70d0f1c5 100644
--- a/lib/PublicInbox/Daemon.pm
+++ b/lib/PublicInbox/Daemon.pm
@@ -6,8 +6,8 @@
 # and/or lossy connections.
 package PublicInbox::Daemon;
 use strict;
-use warnings;
-use Getopt::Long qw/:config gnu_getopt no_ignore_case auto_abbrev/;
+use v5.10.1;
+use Getopt::Long qw(:config gnu_getopt no_ignore_case auto_abbrev);
 use IO::Handle; # ->autoflush
 use IO::Socket;
 use POSIX qw(WNOHANG :signal_h);
@@ -15,7 +15,7 @@ use Socket qw(IPPROTO_TCP SOL_SOCKET);
 STDOUT->autoflush(1);
 STDERR->autoflush(1);
 use PublicInbox::DS qw(now);
-require PublicInbox::Listener;
+use PublicInbox::Listener;
 use PublicInbox::EOFpipe;
 use PublicInbox::Sigfd;
 use PublicInbox::Git;

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/9] daemon: quiet down Eml-related warnings
  2021-10-12 11:46 [PATCH 0/9] various read-only daemon and WWW things Eric Wong
  2021-10-12 11:46 ` [PATCH 1/9] daemon: use v5.10.1, disable local warnings Eric Wong
@ 2021-10-12 11:46 ` Eric Wong
  2021-10-12 11:46 ` [PATCH 3/9] search: delete QueryParser along with DB handle Eric Wong
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong @ 2021-10-12 11:46 UTC (permalink / raw)
  To: meta

Email::Address::XS is quite noisy and there's nothing we can
really do about messages we're serving from read-only daemons.
---
 lib/PublicInbox/Daemon.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/PublicInbox/Daemon.pm b/lib/PublicInbox/Daemon.pm
index 70d0f1c5..5c64e84c 100644
--- a/lib/PublicInbox/Daemon.pm
+++ b/lib/PublicInbox/Daemon.pm
@@ -20,6 +20,7 @@ use PublicInbox::EOFpipe;
 use PublicInbox::Sigfd;
 use PublicInbox::Git;
 use PublicInbox::GitAsyncCat;
+use PublicInbox::Eml;
 our $SO_ACCEPTFILTER = 0x1000;
 my @CMD;
 my ($set_user, $oldset);
@@ -642,6 +643,7 @@ sub run ($$$;$) {
 	# localize GCF2C for tests:
 	local $PublicInbox::GitAsyncCat::GCF2C;
 	local $PublicInbox::Git::async_warn = 1;
+	local $SIG{__WARN__} = \&PublicInbox::Eml::warn_ignore_cb;
 
 	daemon_loop($refresh, $post_accept, $tlsd, $af_default);
 	PublicInbox::DS->Reset;

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/9] search: delete QueryParser along with DB handle
  2021-10-12 11:46 [PATCH 0/9] various read-only daemon and WWW things Eric Wong
  2021-10-12 11:46 ` [PATCH 1/9] daemon: use v5.10.1, disable local warnings Eric Wong
  2021-10-12 11:46 ` [PATCH 2/9] daemon: quiet down Eml-related warnings Eric Wong
@ 2021-10-12 11:46 ` Eric Wong
  2021-10-12 11:47 ` [PATCH 4/9] nntp: use defined-OR from Perl 5.10 for msgid check Eric Wong
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong @ 2021-10-12 11:46 UTC (permalink / raw)
  To: meta

Xapian::QueryParser is attached to the Xapian::Database,
so holding onto the QueryParser was preventing us from
releasing DB handles if a query was performed.
---
 lib/PublicInbox/Search.pm | 2 +-
 t/extsearch.t             | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 17e202e1..dd6d3710 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -253,7 +253,7 @@ sub cleanup_shards {
 	my $nr = grep(/\A[0-9]+\z/, readdir($dh)) or
 		return warn("$xpfx has no shards\n"); # true
 	return reopen($self) if $nr == ($self->{nshard} // -1);
-	delete($self->{xdb});
+	delete @$self{qw(xdb qp)};
 	undef;
 }
 
diff --git a/t/extsearch.t b/t/extsearch.t
index 1b6235ba..8190de17 100644
--- a/t/extsearch.t
+++ b/t/extsearch.t
@@ -176,7 +176,7 @@ if ('inbox edited') {
 	is($mset->size, 1, 'new message found');
 	$mset = $es->mset('b:"test message"');
 	is($mset->size, 1, 'old message found');
-	delete @$es{qw(git over xdb)}; # fork preparation
+	delete @$es{qw(git over xdb qp)}; # fork preparation
 
 	my $pi_cfg = PublicInbox::Config->new;
 	$pi_cfg->fill_all;

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/9] nntp: use defined-OR from Perl 5.10 for msgid check
  2021-10-12 11:46 [PATCH 0/9] various read-only daemon and WWW things Eric Wong
                   ` (2 preceding siblings ...)
  2021-10-12 11:46 ` [PATCH 3/9] search: delete QueryParser along with DB handle Eric Wong
@ 2021-10-12 11:47 ` Eric Wong
  2021-10-12 11:47 ` [PATCH 5/9] msgmap: use DBI->prepare_cached Eric Wong
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong @ 2021-10-12 11:47 UTC (permalink / raw)
  To: meta

"<0>" could be a valid Message-ID, maybe...
---
 lib/PublicInbox/NNTP.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm
index ea9ce183..aa019368 100644
--- a/lib/PublicInbox/NNTP.pm
+++ b/lib/PublicInbox/NNTP.pm
@@ -9,6 +9,7 @@
 # long_cb: long_response private data
 package PublicInbox::NNTP;
 use strict;
+use v5.10.1;
 use parent qw(PublicInbox::DS);
 use PublicInbox::MID qw(mid_escape $MID_EXTRACT);
 use PublicInbox::Eml;
@@ -381,11 +382,10 @@ sub article_adj ($$) {
 	defined $n or return '420 no current article has been selected';
 
 	$n += $off;
-	my $mid = $ibx->mm(1)->mid_for($n);
-	unless ($mid) {
+	my $mid = $ibx->mm(1)->mid_for($n) // do {
 		$n = $off > 0 ? 'next' : 'previous';
 		return "421 no $n article in this group";
-	}
+	};
 	$self->{article} = $n;
 	"223 $n <$mid> article retrieved - request text separately";
 }

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/9] msgmap: use DBI->prepare_cached
  2021-10-12 11:46 [PATCH 0/9] various read-only daemon and WWW things Eric Wong
                   ` (3 preceding siblings ...)
  2021-10-12 11:47 ` [PATCH 4/9] nntp: use defined-OR from Perl 5.10 for msgid check Eric Wong
@ 2021-10-12 11:47 ` Eric Wong
  2021-10-12 11:47 ` [PATCH 6/9] msgmap: share most of check_inodes w/ over Eric Wong
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong @ 2021-10-12 11:47 UTC (permalink / raw)
  To: meta

msgmap is not performance-critical enough to justify doing our
own prepared statement caching.  Just rely on the functionality
of DBI here so future changes will be easier.

There's also minor style changes to avoid dirtying refcount
cache lines bumping by repeating hash lookups rather than attempting
to store them as locals.
---
 lib/PublicInbox/Msgmap.pm | 54 +++++++++++++++------------------------
 1 file changed, 20 insertions(+), 34 deletions(-)

diff --git a/lib/PublicInbox/Msgmap.pm b/lib/PublicInbox/Msgmap.pm
index 3887a9e6..de9fd989 100644
--- a/lib/PublicInbox/Msgmap.pm
+++ b/lib/PublicInbox/Msgmap.pm
@@ -61,18 +61,15 @@ sub meta_accessor {
 	my ($self, $key, $value) = @_;
 
 	my $sql = 'SELECT val FROM meta WHERE key = ? LIMIT 1';
-	my $dbh = $self->{dbh};
-	my $prev;
-	defined $value or return $dbh->selectrow_array($sql, undef, $key);
-
-	$prev = $dbh->selectrow_array($sql, undef, $key);
+	my $prev = $self->{dbh}->selectrow_array($sql, undef, $key);
+	$value // return $prev;
 
 	if (defined $prev) {
 		$sql = 'UPDATE meta SET val = ? WHERE key = ?';
-		$dbh->do($sql, undef, $value, $key);
+		$self->{dbh}->do($sql, undef, $value, $key);
 	} else {
 		$sql = 'INSERT INTO meta (key,val) VALUES (?,?)';
-		$dbh->do($sql, undef, $key, $value);
+		$self->{dbh}->do($sql, undef, $key, $value);
 	}
 	$prev;
 }
@@ -109,33 +106,30 @@ sub num_highwater {
 
 sub mid_insert {
 	my ($self, $mid) = @_;
-	my $dbh = $self->{dbh};
-	my $sth = $dbh->prepare_cached(<<'');
+	my $sth = $self->{dbh}->prepare_cached(<<'');
 INSERT INTO msgmap (mid) VALUES (?)
 
 	return unless eval { $sth->execute($mid) };
-	my $num = $dbh->last_insert_id(undef, undef, 'msgmap', 'num');
+	my $num = $self->{dbh}->last_insert_id(undef, undef, 'msgmap', 'num');
 	$self->num_highwater($num) if defined($num);
 	$num;
 }
 
 sub mid_for {
 	my ($self, $num) = @_;
-	my $dbh = $self->{dbh};
-	my $sth = $self->{mid_for} ||=
-		$dbh->prepare('SELECT mid FROM msgmap WHERE num = ? LIMIT 1');
-	$sth->bind_param(1, $num);
-	$sth->execute;
+	my $sth = $self->{dbh}->prepare_cached(<<"", undef, 1);
+SELECT mid FROM msgmap WHERE num = ? LIMIT 1
+
+	$sth->execute($num);
 	$sth->fetchrow_array;
 }
 
 sub num_for {
 	my ($self, $mid) = @_;
-	my $dbh = $self->{dbh};
-	my $sth = $self->{num_for} ||=
-		$dbh->prepare('SELECT num FROM msgmap WHERE mid = ? LIMIT 1');
-	$sth->bind_param(1, $mid);
-	$sth->execute;
+	my $sth = $self->{dbh}->prepare_cached(<<"", undef, 1);
+SELECT num FROM msgmap WHERE mid = ? LIMIT 1
+
+	$sth->execute($mid);
 	$sth->fetchrow_array;
 }
 
@@ -157,18 +151,12 @@ sub minmax {
 
 sub mid_delete {
 	my ($self, $mid) = @_;
-	my $dbh = $self->{dbh};
-	my $sth = $dbh->prepare('DELETE FROM msgmap WHERE mid = ?');
-	$sth->bind_param(1, $mid);
-	$sth->execute;
+	$self->{dbh}->do('DELETE FROM msgmap WHERE mid = ?', undef, $mid);
 }
 
 sub num_delete {
 	my ($self, $num) = @_;
-	my $dbh = $self->{dbh};
-	my $sth = $dbh->prepare('DELETE FROM msgmap WHERE num = ?');
-	$sth->bind_param(1, $num);
-	$sth->execute;
+	$self->{dbh}->do('DELETE FROM msgmap WHERE num = ?', undef, $num);
 }
 
 sub create_tables {
@@ -192,9 +180,8 @@ CREATE TABLE IF NOT EXISTS meta (
 sub msg_range {
 	my ($self, $beg, $end, $cols) = @_;
 	$cols //= 'num,mid';
-	my $dbh = $self->{dbh};
 	my $attr = { Columns => [] };
-	my $mids = $dbh->selectall_arrayref(<<"", $attr, $$beg, $end);
+	my $mids = $self->{dbh}->selectall_arrayref(<<"", $attr, $$beg, $end);
 SELECT $cols FROM msgmap WHERE num >= ? AND num <= ?
 ORDER BY num ASC LIMIT 1000
 
@@ -206,10 +193,9 @@ ORDER BY num ASC LIMIT 1000
 # see scripts/xhdr-num2mid or PublicInbox::Filter::RubyLang for usage
 sub mid_set {
 	my ($self, $num, $mid) = @_;
-	my $sth = $self->{mid_set} ||= do {
-		$self->{dbh}->prepare(
-			'INSERT OR IGNORE INTO msgmap (num,mid) VALUES (?,?)');
-	};
+	my $sth = $self->{dbh}->prepare_cached(<<"");
+INSERT OR IGNORE INTO msgmap (num,mid) VALUES (?,?)
+
 	my $result = $sth->execute($num, $mid);
 	$self->num_highwater($num) if (defined($result) && $result == 1);
 	$result;

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 6/9] msgmap: share most of check_inodes w/ over
  2021-10-12 11:46 [PATCH 0/9] various read-only daemon and WWW things Eric Wong
                   ` (4 preceding siblings ...)
  2021-10-12 11:47 ` [PATCH 5/9] msgmap: use DBI->prepare_cached Eric Wong
@ 2021-10-12 11:47 ` Eric Wong
  2021-10-12 11:47 ` [PATCH 7/9] daemon: unconditionally close Xapian shards on cleanup Eric Wong
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong @ 2021-10-12 11:47 UTC (permalink / raw)
  To: meta

We still need to account for msgmap being open all the time
and not having separate read-only vs. read-write packages.
---
 lib/PublicInbox/Msgmap.pm | 19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/lib/PublicInbox/Msgmap.pm b/lib/PublicInbox/Msgmap.pm
index de9fd989..978730e2 100644
--- a/lib/PublicInbox/Msgmap.pm
+++ b/lib/PublicInbox/Msgmap.pm
@@ -257,21 +257,10 @@ sub skip_artnum {
 
 sub check_inodes {
 	my ($self) = @_;
-	# no filename if in-:memory:
-	my $f = $self->{dbh}->sqlite_db_filename // return;
-	if (my @st = stat($f)) { # did st_dev, st_ino change?
-		my $st = pack('dd', $st[0], $st[1]);
-		if ($st ne ($self->{st} // $st)) {
-			my $tmp = eval { ref($self)->new_file($f) };
-			if ($@) {
-				warn "E: DBI->connect($f): $@\n";
-			} else {
-				%$self = %$tmp;
-			}
-		}
-	} else {
-		warn "W: stat $f: $!\n";
-	}
+	$self->{dbh} // return;
+	my $rw = !$self->{dbh}->{ReadOnly};
+	PublicInbox::Over::check_inodes($self);
+	$self->{dbh} //= PublicInbox::Over::dbh_new($self, !$rw);
 }
 
 1;

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 7/9] daemon: unconditionally close Xapian shards on cleanup
  2021-10-12 11:46 [PATCH 0/9] various read-only daemon and WWW things Eric Wong
                   ` (5 preceding siblings ...)
  2021-10-12 11:47 ` [PATCH 6/9] msgmap: share most of check_inodes w/ over Eric Wong
@ 2021-10-12 11:47 ` Eric Wong
  2021-10-12 11:47 ` [PATCH 8/9] msgmap: ->new_file to supports $ibx arg, drop ->new Eric Wong
  2021-10-12 11:47 ` [PATCH 9/9] www: _/text/config/raw Last-Modified: is mm->created_at Eric Wong
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong @ 2021-10-12 11:47 UTC (permalink / raw)
  To: meta

The cost of opening a Xapian DB (even with shards) isn't high,
so save some FDs and just close it.  We hit Xapian far less than
over.sqlite3 and we discard the MSet ASAP even when streaming
large responses.

This simplifies our code a bit and hopefully helps reduce
fragmentation by increasing mortality of late allocations.
---
 lib/PublicInbox/Inbox.pm  | 18 +++++++++---------
 lib/PublicInbox/Search.pm | 14 --------------
 t/extsearch.t             |  8 +-------
 3 files changed, 10 insertions(+), 30 deletions(-)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 724df50a..61d153bf 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -11,8 +11,6 @@ use List::Util qw(max);
 use Carp qw(croak);
 
 # returns true if further checking is required
-sub cleanup_shards { $_[0]->{search} ? $_[0]->{search}->cleanup_shards : undef }
-
 sub check_inodes ($) {
 	for (qw(over mm)) { $_[0]->{$_}->check_inodes if $_[0]->{$_} }
 }
@@ -31,7 +29,8 @@ sub do_cleanup {
 		delete(@$ibx{qw(over mm description cloneurl
 				-imap_url -nntp_url)});
 	}
-	$ibx->cleanup_shards and $live = 1;
+	my $srch = $ibx->{search} // $ibx;
+	delete @$srch{qw(xdb qp)};
 	for my $git (@{$ibx->{-repo_objs} // []}) {
 		$live = 1 if $git->cleanup(1);
 	}
@@ -138,17 +137,18 @@ sub max_git_epoch {
 	$cur;
 }
 
+sub mm_file {
+	my ($self) = @_;
+	my $d = $self->{inboxdir};
+	($self->version >= 2 ? $d : "$d/public-inbox").'/msgmap.sqlite3';
+}
+
 sub mm {
 	my ($self, $req) = @_;
 	$self->{mm} //= eval {
 		require PublicInbox::Msgmap;
 		_cleanup_later($self);
-		my $dir = $self->{inboxdir};
-		if ($self->version >= 2) {
-			PublicInbox::Msgmap->new_file("$dir/msgmap.sqlite3");
-		} else {
-			PublicInbox::Msgmap->new($dir);
-		}
+		PublicInbox::Msgmap->new_file(mm_file($self));
 	} // ($req ? croak("E: $@") : undef);
 }
 
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index dd6d3710..f0e7ed0c 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -243,20 +243,6 @@ sub xdb ($) {
 	};
 }
 
-# returns true if a future rescan is desired
-sub cleanup_shards {
-	my ($self) = @_;
-	return unless exists($self->{xdb});
-	my $xpfx = $self->{xpfx};
-	return reopen($self) if $xpfx =~ m!/xapian[0-9]+\z!; # true
-	opendir(my $dh, $xpfx) or return warn("$xpfx gone: $!\n"); # true
-	my $nr = grep(/\A[0-9]+\z/, readdir($dh)) or
-		return warn("$xpfx has no shards\n"); # true
-	return reopen($self) if $nr == ($self->{nshard} // -1);
-	delete @$self{qw(xdb qp)};
-	undef;
-}
-
 sub new {
 	my ($class, $ibx) = @_;
 	ref $ibx or die "BUG: expected PublicInbox::Inbox object: $ibx";
diff --git a/t/extsearch.t b/t/extsearch.t
index 8190de17..dfc190e2 100644
--- a/t/extsearch.t
+++ b/t/extsearch.t
@@ -447,9 +447,6 @@ SKIP: {
 	ok(ref($es->{xdb}), '{xdb} created');
 	my $nshards1 = $es->{nshard};
 	is($nshards1, 1, 'correct shard count');
-	my $xdb_str = "$es->{xdb}";
-	ok($es->cleanup_shards, 'cleanup_shards noop');
-	is("$es->{xdb}", $xdb_str, '{xdb} unchanged');
 
 	my @ei_dir = glob("$d/ei*/");
 	chmod 0755, $ei_dir[0] or xbail "chmod: $!";
@@ -463,11 +460,8 @@ SKIP: {
 		my $m = sprintf('%04o', 07777 & (stat($dirs[$i]))[2]);
 		is($m, $mode, "shard [$i] mode");
 	}
-	is($es->cleanup_shards, undef, 'cleanup_shards cleaned');
-	ok(!defined($es->{xdb}), 'old {xdb} gone');
-	is($es->cleanup_shards, undef, 'cleanup_shards clean idempotent');
+	delete @$es{qw(xdb qp)};
 	is($es->mset('z:0..')->size, $nresult0, 'new shards, same results');
-	ok($es->cleanup_shards, 'cleanup_shards true after open');
 
 	for my $i (4..5) {
 		is(grep(m!/ei[0-9]+/$i\z!, @dirs), 0, "no shard [$i]");

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 8/9] msgmap: ->new_file to supports $ibx arg, drop ->new
  2021-10-12 11:46 [PATCH 0/9] various read-only daemon and WWW things Eric Wong
                   ` (6 preceding siblings ...)
  2021-10-12 11:47 ` [PATCH 7/9] daemon: unconditionally close Xapian shards on cleanup Eric Wong
@ 2021-10-12 11:47 ` Eric Wong
  2021-10-12 11:47 ` [PATCH 9/9] www: _/text/config/raw Last-Modified: is mm->created_at Eric Wong
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong @ 2021-10-12 11:47 UTC (permalink / raw)
  To: meta

The original Msgmap->new API was v1-specific and not necessary.
The ->new_file API now supports an $ibx object being passed to
it, simplify -no_fsync use.  It will also make an upcoming
change easier...
---
 lib/PublicInbox/Inbox.pm         |  2 +-
 lib/PublicInbox/InboxWritable.pm |  2 +-
 lib/PublicInbox/Msgmap.pm        | 20 +++++++++-----------
 lib/PublicInbox/SearchIdx.pm     |  3 +--
 lib/PublicInbox/V2Writable.pm    |  4 +---
 t/altid.t                        |  4 ++--
 t/altid_v2.t                     |  4 ++--
 t/filter_rubylang.t              |  2 +-
 t/init.t                         |  3 ++-
 t/msgmap.t                       |  5 +++--
 10 files changed, 23 insertions(+), 26 deletions(-)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 61d153bf..74b8a74f 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -148,7 +148,7 @@ sub mm {
 	$self->{mm} //= eval {
 		require PublicInbox::Msgmap;
 		_cleanup_later($self);
-		PublicInbox::Msgmap->new_file(mm_file($self));
+		PublicInbox::Msgmap->new_file($self);
 	} // ($req ? croak("E: $@") : undef);
 }
 
diff --git a/lib/PublicInbox/InboxWritable.pm b/lib/PublicInbox/InboxWritable.pm
index 65539781..17dfbe18 100644
--- a/lib/PublicInbox/InboxWritable.pm
+++ b/lib/PublicInbox/InboxWritable.pm
@@ -47,7 +47,7 @@ sub _init_v1 {
 		require PublicInbox::Msgmap;
 		my $sidx = PublicInbox::SearchIdx->new($self, 1); # just create
 		$sidx->begin_txn_lazy;
-		my $mm = PublicInbox::Msgmap->new($self->{inboxdir}, 1);
+		my $mm = PublicInbox::Msgmap->new_file($self, 1);
 		if (defined $skip_artnum) {
 			$mm->{dbh}->begin_work;
 			$mm->skip_artnum($skip_artnum);
diff --git a/lib/PublicInbox/Msgmap.pm b/lib/PublicInbox/Msgmap.pm
index 978730e2..94a0cbeb 100644
--- a/lib/PublicInbox/Msgmap.pm
+++ b/lib/PublicInbox/Msgmap.pm
@@ -14,19 +14,17 @@ use DBI;
 use DBD::SQLite;
 use PublicInbox::Over;
 use PublicInbox::Spawn;
-
-sub new {
-	my ($class, $git_dir, $writable) = @_;
-	my $d = "$git_dir/public-inbox";
-	if ($writable && !-d $d && !mkdir $d) {
-		my $err = $!;
-		-d $d or die "$d not created: $err";
-	}
-	new_file($class, "$d/msgmap.sqlite3", $writable);
-}
+use Scalar::Util qw(blessed);
 
 sub new_file {
-	my ($class, $f, $rw) = @_;
+	my ($class, $ibx, $rw) = @_;
+	my $f;
+	if (blessed($ibx)) {
+		$f = $ibx->mm_file;
+		$rw = 2 if $rw && $ibx->{-no_fsync};
+	} else {
+		$f = $ibx;
+	}
 	return if !$rw && !-r $f;
 
 	my $self = bless { filename => $f }, $class;
diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index bebe904b..a2ed9499 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -453,8 +453,7 @@ sub _msgmap_init ($) {
 	die "BUG: _msgmap_init is only for v1\n" if $self->{ibx}->version != 1;
 	$self->{mm} //= eval {
 		require PublicInbox::Msgmap;
-		my $rw = $self->{ibx}->{-no_fsync} ? 2 : 1;
-		PublicInbox::Msgmap->new($self->{ibx}->{inboxdir}, $rw);
+		PublicInbox::Msgmap->new_file($self->{ibx}, 1);
 	};
 }
 
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index d04cdda6..efcc1fc2 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -267,9 +267,7 @@ sub _idx_init { # with_umask callback
 
 	# Now that all subprocesses are up, we can open the FDs
 	# for SQLite:
-	my $mm = $self->{mm} = PublicInbox::Msgmap->new_file(
-				"$ibx->{inboxdir}/msgmap.sqlite3",
-				$ibx->{-no_fsync} ? 2 : 1);
+	my $mm = $self->{mm} = PublicInbox::Msgmap->new_file($ibx, 1);
 	$mm->{dbh}->begin_work;
 }
 
diff --git a/t/altid.t b/t/altid.t
index 87635b19..3ce08a6a 100644
--- a/t/altid.t
+++ b/t/altid.t
@@ -15,7 +15,7 @@ my $altid = [ "serial:gmane:file=$alt_file" ];
 my $ibx;
 
 {
-	my $mm = PublicInbox::Msgmap->new_file($alt_file, 1);
+	my $mm = PublicInbox::Msgmap->new_file($alt_file, 2);
 	is($mm->mid_set(1234, 'a@example.com'), 1, 'mid_set once OK');
 	ok(0 == $mm->mid_set(1234, 'a@example.com'), 'mid_set not idempotent');
 	ok(0 == $mm->mid_set(1, 'a@example.com'), 'mid_set fails with dup MID');
@@ -48,7 +48,7 @@ EOF
 };
 
 {
-	my $mm = PublicInbox::Msgmap->new_file($alt_file, 1);
+	my $mm = PublicInbox::Msgmap->new_file($alt_file, 2);
 	my ($min, $max) = $mm->minmax;
 	my $num = $mm->mid_insert('b@example.com');
 	ok($num > $max, 'auto-increment goes beyond mid_set');
diff --git a/t/altid_v2.t b/t/altid_v2.t
index 47ebec85..281a09d5 100644
--- a/t/altid_v2.t
+++ b/t/altid_v2.t
@@ -13,7 +13,7 @@ my $altid = [ "serial:gmane:file=$another" ];
 my $ibx = create_inbox 'v2', version => 2, indexlevel => 'medium',
 			altid => $altid, sub {
 	my ($im, $ibx) = @_;
-	my $mm = PublicInbox::Msgmap->new_file("$ibx->{inboxdir}/$another", 1);
+	my $mm = PublicInbox::Msgmap->new_file("$ibx->{inboxdir}/$another", 2);
 	$mm->mid_set(1234, 'a@example.com') == 1 or BAIL_OUT 'mid_set once';
 	ok(0 == $mm->mid_set(1234, 'a@example.com'), 'mid_set not idempotent');
 	ok(0 == $mm->mid_set(1, 'a@example.com'), 'mid_set fails with dup MID');
@@ -26,7 +26,7 @@ Message-ID: <a@example.com>
 hello world gmane:666
 EOF
 };
-my $mm = PublicInbox::Msgmap->new_file("$ibx->{inboxdir}/$another", 1);
+my $mm = PublicInbox::Msgmap->new_file("$ibx->{inboxdir}/$another", 2);
 ok(0 == $mm->mid_set(1234, 'a@example.com'), 'mid_set not idempotent');
 ok(0 ==  $mm->mid_set(1, 'a@example.com'), 'mid_set fails with dup MID');
 my $mset = $ibx->search->mset('gmane:1234');
diff --git a/t/filter_rubylang.t b/t/filter_rubylang.t
index 81799451..4e9695e1 100644
--- a/t/filter_rubylang.t
+++ b/t/filter_rubylang.t
@@ -44,7 +44,7 @@ EOF
 	$mime = PublicInbox::Eml->new($msg);
 	$ret = $f->delivery($mime);
 	is($ret, $mime, "delivery successful");
-	my $mm = PublicInbox::Msgmap->new($git_dir);
+	my $mm = $ibx->mm;
 	is($mm->num_for('a@b'), 12, 'MM entry created based on X-ML-Count');
 
 	$msg = <<'EOF';
diff --git a/t/init.t b/t/init.t
index 752e5af9..4bec6a2f 100644
--- a/t/init.t
+++ b/t/init.t
@@ -199,7 +199,8 @@ SKIP: {
 	$err = '';
 	ok(run_script([qw(-mda --no-precheck)], $env, $rdr), 'deliver V1');
 	diag "err=$err" if $err;
-	$mm = PublicInbox::Msgmap->new("$tmpdir/skip4");
+	$mm = PublicInbox::Msgmap->new_file(
+			"$tmpdir/skip4/public-inbox/msgmap.sqlite3");
 	$n = $mm->num_for($mid);
 	is($n, 13, 'V1 NNTP article numbers skipped via --skip-artnum');
 }
diff --git a/t/msgmap.t b/t/msgmap.t
index 2d462dfb..a3b7200f 100644
--- a/t/msgmap.t
+++ b/t/msgmap.t
@@ -7,7 +7,8 @@ use PublicInbox::TestCommon;
 require_mods('DBD::SQLite');
 use_ok 'PublicInbox::Msgmap';
 my ($tmpdir, $for_destroy) = tmpdir();
-my $d = PublicInbox::Msgmap->new($tmpdir, 1);
+my $f = "$tmpdir/msgmap.sqlite3";
+my $d = PublicInbox::Msgmap->new_file($f, 1);
 
 my %mid2num;
 my %num2mid;
@@ -50,7 +51,7 @@ is($d->mid_delete('a@b') + 0, 0, 'delete again returns zero');
 is(undef, $d->num_for('a@b'), 'num_for fails on deleted msg');
 $d = undef;
 
-ok($d = PublicInbox::Msgmap->new($tmpdir, 1), 'idempotent DB creation');
+ok($d = PublicInbox::Msgmap->new_file($f, 1), 'idempotent DB creation');
 my ($min, $max) = $d->minmax;
 ok($min > 0, "article min OK");
 ok($max > 0 && $max < 10, "article max OK");

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 9/9] www: _/text/config/raw Last-Modified: is mm->created_at
  2021-10-12 11:46 [PATCH 0/9] various read-only daemon and WWW things Eric Wong
                   ` (7 preceding siblings ...)
  2021-10-12 11:47 ` [PATCH 8/9] msgmap: ->new_file to supports $ibx arg, drop ->new Eric Wong
@ 2021-10-12 11:47 ` Eric Wong
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong @ 2021-10-12 11:47 UTC (permalink / raw)
  To: meta

This allows IMAP mirrors to keep UIDVALIDITY synchronized (and
"LIST ACTIVE.TIMES" in NNTP).  "lei add-external --mirror" will
automatically set it, as will the combination of
public-inbox-clone + public-inbox-index.

This avoids the need for extra endpoints or config entries,
at least...
---
 lib/PublicInbox/LeiMirror.pm | 13 +++++++------
 lib/PublicInbox/Msgmap.pm    |  9 ++++++++-
 lib/PublicInbox/WwwText.pm   |  3 +++
 t/lei-mirror.t               | 31 ++++++++++++++++++++++++++++---
 t/psgi_v2.t                  | 10 +++++++++-
 5 files changed, 55 insertions(+), 11 deletions(-)

diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index 5cfa6fea..4be8f70a 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -96,15 +96,15 @@ sub _get_txt { # non-fatal
 	my $path = $uri->path;
 	chop($path) eq '/' or die "BUG: $uri not canonicalized";
 	$uri->path("$path/$endpoint");
-	my $cmd = $self->{curl}->for_uri($lei, $uri, '--compressed');
-	my $ce = "$self->{dst}/$file";
-	my $ft = File::Temp->new(TEMPLATE => "$file-XXXX",
-				UNLINK => 1, DIR => $self->{dst});
-	my $opt = { 0 => $lei->{0}, 1 => $ft, 2 => $lei->{2} };
+	my $ft = File::Temp->new(TEMPLATE => "$file-XXXX", DIR => $self->{dst});
+	my $f = $ft->filename;
+	my $opt = { 0 => $lei->{0}, 1 => $lei->{1}, 2 => $lei->{2} };
+	my $cmd = $self->{curl}->for_uri($lei, $uri,
+					qw(--compressed -R -o), $f);
 	my $cerr = run_reap($lei, $cmd, $opt);
 	return "$uri missing" if ($cerr >> 8) == 22;
 	return "# @$cmd failed (non-fatal)" if $cerr;
-	my $f = $ft->filename;
+	my $ce = "$self->{dst}/$file";
 	rename($f, $ce) or return "rename($f, $ce): $! (non-fatal)";
 	$ft->unlink_on_destroy(0);
 	undef; # success
@@ -122,6 +122,7 @@ sub _try_config {
 	my $err = _get_txt($self, qw(_/text/config/raw inbox.config.example));
 	return $self->{lei}->err($err) if $err;
 	my $f = "$self->{dst}/inbox.config.example";
+	chmod((stat($f))[2] & 0444, $f) or die "chmod(a-w, $f): $!";
 	my $cfg = PublicInbox::Config->git_config_dump($f, $self->{lei}->{2});
 	my $ibx = $self->{ibx} = {};
 	for my $sec (grep(/\Apublicinbox\./, @{$cfg->{-section_order}})) {
diff --git a/lib/PublicInbox/Msgmap.pm b/lib/PublicInbox/Msgmap.pm
index 94a0cbeb..e71f16f8 100644
--- a/lib/PublicInbox/Msgmap.pm
+++ b/lib/PublicInbox/Msgmap.pm
@@ -32,8 +32,15 @@ sub new_file {
 	if ($rw) {
 		$dbh->begin_work;
 		create_tables($dbh);
-		$self->created_at(time) unless $self->created_at;
+		unless ($self->created_at) {
+			my $t;
 
+			if (blessed($ibx) &&
+				-f "$ibx->{inboxdir}/inbox.config.example") {
+				$t = (stat(_))[9]; # mtime set by "curl -R"
+			}
+			$self->created_at($t // time);
+		}
 		$self->num_highwater(max($self));
 		$dbh->commit;
 	}
diff --git a/lib/PublicInbox/WwwText.pm b/lib/PublicInbox/WwwText.pm
index bb9a0a0f..8b929f74 100644
--- a/lib/PublicInbox/WwwText.pm
+++ b/lib/PublicInbox/WwwText.pm
@@ -8,6 +8,7 @@ use v5.10.1;
 use PublicInbox::Linkify;
 use PublicInbox::WwwStream;
 use PublicInbox::Hval qw(ascii_html prurl);
+use HTTP::Date qw(time2str);
 use URI::Escape qw(uri_escape_utf8);
 use PublicInbox::GzipFilter qw(gzf_maybe);
 our $QP_URL = 'https://xapian.org/docs/queryparser.html';
@@ -171,6 +172,8 @@ sub inbox_config ($$$) {
 	my ($ctx, $hdr, $txt) = @_;
 	my $ibx = $ctx->{ibx};
 	push @$hdr, 'Content-Disposition', 'inline; filename=inbox.config';
+	my $t = eval { $ibx->mm->created_at };
+	push(@$hdr, 'Last-Modified', time2str($t)) if $t;
 	my $name = dq_escape($ibx->{name});
 	my $inboxdir = '/path/to/top-level-inbox';
 	my $base_url = $ibx->base_url($ctx->{env});
diff --git a/t/lei-mirror.t b/t/lei-mirror.t
index de5246b6..b449e0b4 100644
--- a/t/lei-mirror.t
+++ b/t/lei-mirror.t
@@ -3,8 +3,9 @@
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict; use v5.10.1; use PublicInbox::TestCommon;
 use PublicInbox::Inbox;
-require_mods(qw(-httpd lei));
+require_mods(qw(-httpd lei DBD::SQLite));
 require_cmd('curl');
+require PublicInbox::Msgmap;
 my $sock = tcp_server();
 my ($tmpdir, $for_destroy) = tmpdir();
 my $http = 'http://'.tcp_host_port($sock);
@@ -12,25 +13,40 @@ my ($ro_home, $cfg_path) = setup_public_inboxes;
 my $cmd = [ qw(-httpd -W0 ./t/lei-mirror.psgi),
 	"--stdout=$tmpdir/out", "--stderr=$tmpdir/err" ];
 my $td = start_script($cmd, { PI_CONFIG => $cfg_path }, { 3 => $sock });
+my %created;
 test_lei({ tmpdir => $tmpdir }, sub {
 	my $home = $ENV{HOME};
 	my $t1 = "$home/t1-mirror";
+	my $mm_orig = "$ro_home/t1/public-inbox/msgmap.sqlite3";
+	$created{v1} = PublicInbox::Msgmap->new_file($mm_orig)->created_at;
 	lei_ok('add-external', $t1, '--mirror', "$http/t1/", \'--mirror v1');
-	ok(-f "$t1/public-inbox/msgmap.sqlite3", 't1-mirror indexed');
+	my $mm_dup = "$t1/public-inbox/msgmap.sqlite3";
+	ok(-f $mm_dup, 't1-mirror indexed');
 	is(PublicInbox::Inbox::try_cat("$t1/description"),
 		"mirror of $http/t1/\n", 'description set');
 	ok(-f "$t1/Makefile", 'convenience Makefile added (v1)');
+	ok(-f "$t1/inbox.config.example", 'inbox.config.example downloaded');
+	is((stat(_))[9], $created{v1},
+		'inbox.config.example mtime is ->created_at');
+	is((stat(_))[2] & 0222, 0, 'inbox.config.example not writable');
+	my $tb = PublicInbox::Msgmap->new_file($mm_dup)->created_at;
+	is($tb, $created{v1}, 'created_at matched in mirror');
 
 	lei_ok('ls-external');
 	like($lei_out, qr!\Q$t1\E!, 't1 added to ls-externals');
 
 	my $t2 = "$home/t2-mirror";
+	$mm_orig = "$ro_home/t2/msgmap.sqlite3";
+	$created{v2} = PublicInbox::Msgmap->new_file($mm_orig)->created_at;
 	lei_ok('add-external', $t2, '--mirror', "$http/t2/", \'--mirror v2');
-	ok(-f "$t2/msgmap.sqlite3", 't2-mirror indexed');
+	$mm_dup = "$t2/msgmap.sqlite3";
+	ok(-f $mm_dup, 't2-mirror indexed');
 	ok(-f "$t2/description", 't2 description');
 	ok(-f "$t2/Makefile", 'convenience Makefile added (v2)');
 	is(PublicInbox::Inbox::try_cat("$t2/description"),
 		"mirror of $http/t2/\n", 'description set');
+	$tb = PublicInbox::Msgmap->new_file($mm_dup)->created_at;
+	is($tb, $created{v2}, 'created_at matched in v2 mirror');
 
 	lei_ok('ls-external');
 	like($lei_out, qr!\Q$t2\E!, 't2 added to ls-externals');
@@ -150,6 +166,15 @@ SKIP: {
 	ok(unlink("$d/t1/manifest.js.gz"), 'manifest created');
 	my $after = [ glob("$d/t1/*") ];
 	is_deeply($before, $after, 'no new files created');
+
+	ok(run_script([qw(-index -Lbasic), "$d/t1"]), 'index v1');
+	ok(run_script([qw(-index -Lbasic), "$d/t2"]), 'index v2');
+	my $f = "$d/t1/public-inbox/msgmap.sqlite3";
+	my $ca = PublicInbox::Msgmap->new_file($f)->created_at;
+	is($ca, $created{v1}, 'clone + index v1 synced ->created_at');
+	$f = "$d/t2/msgmap.sqlite3";
+	$ca = PublicInbox::Msgmap->new_file($f)->created_at;
+	is($ca, $created{v2}, 'clone + index v1 synced ->created_at');
 }
 
 ok($td->kill, 'killed -httpd');
diff --git a/t/psgi_v2.t b/t/psgi_v2.t
index 1f190708..e0570682 100644
--- a/t/psgi_v2.t
+++ b/t/psgi_v2.t
@@ -9,7 +9,7 @@ use PublicInbox::Eml;
 use PublicInbox::Config;
 use PublicInbox::MID qw(mids);
 require_mods(qw(DBD::SQLite Search::Xapian HTTP::Request::Common Plack::Test
-		URI::Escape Plack::Builder));
+		URI::Escape Plack::Builder HTTP::Date));
 use_ok($_) for (qw(HTTP::Request::Common Plack::Test));
 use_ok 'PublicInbox::WWW';
 my ($tmpdir, $for_destroy) = tmpdir();
@@ -113,6 +113,14 @@ $im->done;
 
 my $client1 = sub {
 	my ($cb) = @_;
+	$res = $cb->(GET('/v2test/_/text/config/raw'));
+	my $lm = $res->header('Last-Modified');
+	ok($lm, 'Last-Modified set w/ ->mm');
+	$lm = HTTP::Date::str2time($lm);
+	is($lm, $ibx->mm->created_at,
+		'Last-Modified for text/config/raw matches ->created_at');
+	delete $ibx->{mm};
+
 	$res = $cb->(GET("/v2test/$third/raw"));
 	$raw = $res->content;
 	like($raw, qr/^hello ghosts$/m, 'got third message');

^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-10-12 11:47 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-12 11:46 [PATCH 0/9] various read-only daemon and WWW things Eric Wong
2021-10-12 11:46 ` [PATCH 1/9] daemon: use v5.10.1, disable local warnings Eric Wong
2021-10-12 11:46 ` [PATCH 2/9] daemon: quiet down Eml-related warnings Eric Wong
2021-10-12 11:46 ` [PATCH 3/9] search: delete QueryParser along with DB handle Eric Wong
2021-10-12 11:47 ` [PATCH 4/9] nntp: use defined-OR from Perl 5.10 for msgid check Eric Wong
2021-10-12 11:47 ` [PATCH 5/9] msgmap: use DBI->prepare_cached Eric Wong
2021-10-12 11:47 ` [PATCH 6/9] msgmap: share most of check_inodes w/ over Eric Wong
2021-10-12 11:47 ` [PATCH 7/9] daemon: unconditionally close Xapian shards on cleanup Eric Wong
2021-10-12 11:47 ` [PATCH 8/9] msgmap: ->new_file to supports $ibx arg, drop ->new Eric Wong
2021-10-12 11:47 ` [PATCH 9/9] www: _/text/config/raw Last-Modified: is mm->created_at Eric Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).