unofficial mirror of meta@public-inbox.org
From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH] lei add-external --mirror: deduce paths for PSGI mount prefixes
Date: Fri, 10 Sep 2021 05:51:00 +0000
Message-ID: <20210910055100.31920-1-e@80x24.org>

The current manifest.js.gz generation in WWW doesn't account for
PSGI mount prefixes (grokmirror 1.x appears to work fine regardless).

In other words, <https://yhbt.net/lore/lkml/manifest.js.gz>
currently has keys like "/lkml/git/0.git" and not
"/lore/lkml/git/0.git" where "/lore" is the PSGI mount prefix.
This works fine with the prefix accounted for in my grokmirror
(1.x) repos.conf like this:

	site = https://yhbt.net/lore/
	manifest = https://yhbt.net/lore/manifest.js.gz

Adding the PSGI mount prefix to the keys in manifest.js.gz is
probably not desirable, since grokmirror would then force the
prefix into its locally cloned paths: every cloned directory
would have the remote PSGI mount prefix prepended at the
toplevel.

So, "lei add-external --mirror" needs to account for PSGI
mount prefixes by deducing the prefix based on available keys
in the manifest.js.gz hash table.
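
As a rough illustration of that deduction, here is a standalone
sketch (not the patch itself).  The %manifest contents and the
deduce_prefix name are hypothetical and only mirror the yhbt.net
example above; the real change is deduce_epochs in the diff below.

```perl
use strict;
use warnings;

# Hypothetical manifest: keys omit the "/lore" PSGI mount prefix, as
# described above.  Values are placeholders, not real manifest data.
my %manifest = (
	'/lkml/git/0.git' => {},
	'/lkml/git/1.git' => {},
	'/meta' => {},
);

# Strip leading path components one at a time until the manifest has
# either an exact (v1) key or v2 epoch keys under $path.
sub deduce_prefix {
	my ($m, $path) = @_;
	my $pfx = '';
	while (1) {
		my $v1 = $m->{$path};
		my @v2 = sort grep(m!\A\Q$path\E/git/[0-9]+\.git\z!, keys %$m);
		return ($pfx, $v1, @v2) if defined($v1) || @v2;
		# move the first component from $path to $pfx, or give up
		$path =~ s!\A(/[^/]+)/!/! or return ($pfx, undef);
		$pfx .= $1;
	}
}

my ($pfx, $v1, @v2) = deduce_prefix(\%manifest, '/lore/lkml');
print "prefix=$pfx epochs=@v2\n";
```

With the hypothetical keys above, "/lore" ends up as the deduced
prefix and both "/lkml/git/N.git" keys are returned as v2 epochs;
the caller then prepends the prefix when building clone URLs.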
---
 MANIFEST                     |  1 +
 lib/PublicInbox/LeiMirror.pm | 28 +++++++++++++++++++++-------
 t/lei-mirror.psgi            |  9 +++++++++
 t/lei-mirror.t               |  6 +++++-
 4 files changed, 36 insertions(+), 8 deletions(-)
 create mode 100644 t/lei-mirror.psgi

diff --git a/MANIFEST b/MANIFEST
index 531f8c46..a22672e7 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -436,6 +436,7 @@ t/lei-import-nntp.t
 t/lei-import.t
 t/lei-index.t
 t/lei-lcat.t
+t/lei-mirror.psgi
 t/lei-mirror.t
 t/lei-p2q.t
 t/lei-q-kw.t
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index 39671f90..fca11ccf 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -200,6 +200,19 @@ failed to extract epoch number from $src
 	index_cloned_inbox($self, 2);
 }
 
+# PSGI mount prefixes and manifest.js.gz prefixes don't always align...
+sub deduce_epochs ($$) {
+	my ($m, $path) = @_;
+	my ($v1_bare, @v2_epochs);
+	my $path_pfx = '';
+	do {
+		$v1_bare = $m->{$path};
+		@v2_epochs = grep(m!\A\Q$path\E/git/[0-9]+\.git\z!, keys %$m);
+	} while (!defined($v1_bare) && !@v2_epochs &&
+		$path =~ s!\A(/[^/]+)/!/! and $path_pfx .= $1);
+	($path_pfx, $v1_bare, @v2_epochs);
+}
+
 sub try_manifest {
 	my ($self) = @_;
 	my $uri = URI->new($self->{src});
@@ -229,8 +242,7 @@ sub try_manifest {
 	die "$uri: error decoding `$js': $@" if $@;
 	ref($m) eq 'HASH' or die "$uri unknown type: ".ref($m);
 
-	my $v1_bare = $m->{$path};
-	my @v2_epochs = grep(m!\A\Q$path\E/git/[0-9]+\.git\z!, keys %$m);
+	my ($path_pfx, $v1_bare, @v2_epochs) = deduce_epochs($m, $path);
 	if (@v2_epochs) {
 		# It may be possible to have v1 + v2 in parallel someday:
 		$lei->err(<<EOM) if defined $v1_bare;
@@ -238,14 +250,16 @@ sub try_manifest {
 # @v2_epochs
 # ignoring $v1_bare (use --inbox-version=1 to force v1 instead)
 EOM
-		@v2_epochs = map { $uri->path($_); $uri->clone } @v2_epochs;
+		@v2_epochs = map {
+			$uri->path($path_pfx.$_);
+			$uri->clone
+		} @v2_epochs;
 		clone_v2($self, \@v2_epochs);
-	} elsif ($v1_bare) {
+	} elsif (defined $v1_bare) {
 		clone_v1($self);
-	} elsif (my @maybe = grep(m!\Q$path\E!, keys %$m)) {
-		die "E: confused by <$uri>, possible matches:\n@maybe";
 	} else {
-		die "E: confused by <$uri>";
+		die "E: confused by <$uri>, possible matches:\n\t",
+			join(', ', sort keys %$m), "\n";
 	}
 }
 
diff --git a/t/lei-mirror.psgi b/t/lei-mirror.psgi
new file mode 100644
index 00000000..6b4bbfec
--- /dev/null
+++ b/t/lei-mirror.psgi
@@ -0,0 +1,9 @@
+use Plack::Builder;
+use PublicInbox::WWW;
+my $www = PublicInbox::WWW->new;
+$www->preload;
+builder {
+	enable 'Head';
+	mount '/pfx' => builder { sub { $www->call(@_) } };
+	mount '/' => builder { sub { $www->call(@_) } };
+};
diff --git a/t/lei-mirror.t b/t/lei-mirror.t
index 80bc6ed5..65b6068c 100644
--- a/t/lei-mirror.t
+++ b/t/lei-mirror.t
@@ -7,7 +7,8 @@ my $sock = tcp_server();
 my ($tmpdir, $for_destroy) = tmpdir();
 my $http = 'http://'.tcp_host_port($sock);
 my ($ro_home, $cfg_path) = setup_public_inboxes;
-my $cmd = [ qw(-httpd -W0), "--stdout=$tmpdir/out", "--stderr=$tmpdir/err" ];
+my $cmd = [ qw(-httpd -W0 ./t/lei-mirror.psgi),
+	"--stdout=$tmpdir/out", "--stderr=$tmpdir/err" ];
 my $td = start_script($cmd, { PI_CONFIG => $cfg_path }, { 3 => $sock });
 test_lei({ tmpdir => $tmpdir }, sub {
 	my $home = $ENV{HOME};
@@ -43,6 +44,9 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	lei_ok('ls-external');
 	unlike($lei_out, qr!\Q$t2-fail\E!, 'not added to ls-external');
 
+	lei_ok('add-external', "$t1-pfx", '--mirror', "$http/pfx/t1/",
+			\'--mirror v1 w/ PSGI prefix');
+
 	my %phail = (
 		HTTPS => 'https://public-inbox.org/' . 'phail',
 		ONION =>
