* [PATCH 0/1] Fix broken clone URLs due to SCRIPT_NAME getting reset
From: edef @ 2019-09-24 4:10 UTC
To: meta; +Cc: hi, edef

We're trying to get public-inbox working with a PSGI file that mounts
it to a subdirectory. This seems like it's intended to be a supported
use case, with stuff paying attention to SCRIPT_NAME and all when
generating URLs.

However, Plack::App::URLMap seems determined to reset SCRIPT_NAME
before getline gets called:

    my $orig_path_info = $env->{PATH_INFO};
    my $orig_script_name = $env->{SCRIPT_NAME};

    $env->{PATH_INFO} = $path;
    $env->{SCRIPT_NAME} = $script_name . $location;
    return $self->response_cb($app->($env), sub {
        $env->{PATH_INFO} = $orig_path_info;
        $env->{SCRIPT_NAME} = $orig_script_name;
    });

I'm not sure whether public-inbox or Plack is in the wrong here, but
the timing works out poorly. By the time
PublicInbox::WwwStream::_html_end gets invoked SCRIPT_NAME is blank,
and the wrong URLs get generated.

Copying env seems to fix it, and that's what the attached patch does.
I'm pretty sure this is the wrong approach, but it seems to work.

edef (1):
  wwwstream: copy $ctx->{env} in new

 lib/PublicInbox/WwwStream.pm | 4 ++++
 1 file changed, 4 insertions(+)

base-commit: 55283284757af5f5d8f63fd17d53340e4dea34fb
-- 
git-series 0.9.1

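The timing problem described in the cover letter can be reproduced without Plack. The following is a minimal sketch, assuming a simplified stand-in for the mount wrapper: `mount_at` and `$inner` are hypothetical names, and the restore happens right after the app returns rather than inside a `response_cb` callback as in the quoted URLMap code. It only illustrates that a body produced lazily sees the restored (blank) SCRIPT_NAME.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a mount wrapper: it sets SCRIPT_NAME for the
# inner app, then restores the original value once the app has returned.
sub mount_at {
    my ($location, $app) = @_;
    return sub {
        my ($env) = @_;
        my $orig = $env->{SCRIPT_NAME};
        $env->{SCRIPT_NAME} = $orig . $location;
        my $res = $app->($env);
        $env->{SCRIPT_NAME} = $orig;    # restored before the body is read
        return $res;
    };
}

# Inner app with a lazy body: the closure runs only when the caller asks
# for the body, which is after mount_at has restored SCRIPT_NAME.
my $inner = sub {
    my ($env) = @_;
    my $eager = 'eager=[' . $env->{SCRIPT_NAME} . ']';
    my $lazy  = sub { 'lazy=[' . $env->{SCRIPT_NAME} . ']' };
    return [ $eager, $lazy ];
};

my $env = { SCRIPT_NAME => '', PATH_INFO => '/a/test/' };
my ($eager, $lazy) = @{ mount_at('/a', $inner)->($env) };
my $late = $lazy->();

print "$eager\n";    # eager=[/a]
print "$late\n";     # lazy=[]
```

Anything computed while the app is still running sees the mounted path; anything deferred to body streaming reads the shared hash after it has been put back.
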
* [PATCH 1/1] wwwstream: copy $ctx->{env} in new
From: edef @ 2019-09-24 4:10 UTC
To: meta; +Cc: hi, edef

Plack::App::URLMap wipes out SCRIPT_NAME after we return, and
_html_end needs it for generating correct URLs
---
 lib/PublicInbox/WwwStream.pm | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/PublicInbox/WwwStream.pm b/lib/PublicInbox/WwwStream.pm
index e0823c8..6bca095 100644
--- a/lib/PublicInbox/WwwStream.pm
+++ b/lib/PublicInbox/WwwStream.pm
@@ -19,6 +19,10 @@ sub close {}
 
 sub new {
 	my ($class, $ctx, $cb) = @_;
+
+	my %env = %{$ctx->{env}}; # full hash copy
+	$ctx->{env} = \%env;
+
 	bless { nr => 0, cb => $cb || *close, ctx => $ctx }, $class;
 }
 
-- 
git-series 0.9.1

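For reference, here is a small standalone sketch of what the hash copy in this patch does and does not isolate. The variable names and the array standing in for a nested env value are illustrative assumptions, not public-inbox code. `%{ ... }` makes a shallow copy: top-level values such as SCRIPT_NAME are frozen at copy time, while nested references remain shared with the original hash.

```perl
use strict;
use warnings;

# Illustrative data only: the array stands in for any nested env value.
my $input = ['request body'];
my $env   = { SCRIPT_NAME => '/a', 'psgi.input' => $input };
my $ctx   = { env => $env };

my %copy = %{ $ctx->{env} };    # copies top-level keys and values only
$ctx->{env} = \%copy;

$env->{SCRIPT_NAME} = '';       # a later reset of the original hash
push @$input, 'more';           # nested data is still shared

my $kept   = $ctx->{env}{SCRIPT_NAME};
my $shared = scalar @{ $ctx->{env}{'psgi.input'} };

print "$kept $shared\n";        # /a 2
```
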
* Re: [PATCH 0/1] Fix broken clone URLs due to SCRIPT_NAME getting reset
From: Eric Wong @ 2019-09-26 3:03 UTC
To: edef; +Cc: meta, hi

edef <edef@edef.eu> wrote:
> We're trying to get public-inbox working with a PSGI file that mounts
> it to a subdirectory. This seems like it's intended to be a supported
> use case, with stuff paying attention to SCRIPT_NAME and all when
> generating URLs.
>
> However, Plack::App::URLMap seems determined to reset SCRIPT_NAME
> before getline gets called:
>
>     my $orig_path_info = $env->{PATH_INFO};
>     my $orig_script_name = $env->{SCRIPT_NAME};
>
>     $env->{PATH_INFO} = $path;
>     $env->{SCRIPT_NAME} = $script_name . $location;
>     return $self->response_cb($app->($env), sub {
>         $env->{PATH_INFO} = $orig_path_info;
>         $env->{SCRIPT_NAME} = $orig_script_name;
>     });

Sounds like a familiar problem to me :x

> I'm not sure whether public-inbox or Plack is in the wrong here, but
> the timing works out poorly. By the time
> PublicInbox::WwwStream::_html_end gets invoked SCRIPT_NAME is blank,
> and the wrong URLs get generated.
>
> Copying env seems to fix it, and that's what the attached patch does.
> I'm pretty sure this is the wrong approach, but it seems to work.

Yeah, it's a big hash and not needed to copy the whole thing.

I gotta run, now, but I think the patch below will work for you
by precalculating base_url up front. Can you confirm? Thanks.

Also, I suspect the mbox Archived-At headers could be wrong
and need a similar change... Maybe Atom feeds, too.

diff --git a/lib/PublicInbox/WwwStream.pm b/lib/PublicInbox/WwwStream.pm
index e0823c8d..b240c071 100644
--- a/lib/PublicInbox/WwwStream.pm
+++ b/lib/PublicInbox/WwwStream.pm
@@ -19,7 +19,17 @@ sub close {}
 
 sub new {
 	my ($class, $ctx, $cb) = @_;
-	bless { nr => 0, cb => $cb || *close, ctx => $ctx }, $class;
+
+	my $env = $ctx->{env};
+	my $ibx = $ctx->{-inbox};
+	my $base_url = $ibx->base_url($env);
+	chop $base_url; # no trailing slash for clone
+	bless {
+		nr => 0,
+		cb => $cb || *close,
+		ctx => $ctx,
+		base_url => $base_url,
+	}, $class;
 }
 
 sub response {
@@ -83,8 +93,7 @@ sub _html_end {
 	my $desc = ascii_html($ibx->description);
 
 	my (%seen, @urls);
-	my $http = $ibx->base_url($ctx->{env});
-	chop $http; # no trailing slash for clone
+	my $http = $self->{base_url};
 	my $max = $ibx->max_git_epoch;
 	my $dir = (split(m!/!, $http))[-1];
 	if (defined($max)) { # v2

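The idea behind this patch, computing the URL in the constructor and reusing the stored string later, can be sketched in isolation. Everything below is a hypothetical illustration: `Demo::Stream`, `demo_base_url`, and `clone_line` are made-up names, and `demo_base_url` is only a guess at the kind of value a base URL helper might build from the PSGI env, not the actual `base_url` method.

```perl
use strict;
use warnings;

package Demo::Stream;

# Hypothetical helper: builds an absolute URL (with trailing slash)
# from a few PSGI env keys. Not the real PublicInbox base_url method.
sub demo_base_url {
    my ($env) = @_;
    return join('', $env->{'psgi.url_scheme'}, '://', $env->{HTTP_HOST},
        $env->{SCRIPT_NAME}, '/test/');
}

sub new {
    my ($class, $env) = @_;
    my $base_url = demo_base_url($env);    # read SCRIPT_NAME while it is still set
    chop $base_url;                        # no trailing slash for clone
    return bless { env => $env, base_url => $base_url }, $class;
}

# Called later, during body streaming; uses only the stored string.
sub clone_line {
    my ($self) = @_;
    return "git clone --mirror $self->{base_url}";
}

package main;

my $env = {
    'psgi.url_scheme' => 'http',
    HTTP_HOST         => 'example.com',
    SCRIPT_NAME       => '/a',
};
my $stream = Demo::Stream->new($env);
$env->{SCRIPT_NAME} = '';    # simulate the reset after the app returns
my $line = $stream->clone_line;

print "$line\n";             # git clone --mirror http://example.com/a/test
```

Compared with copying the whole env hash, this keeps one short string per stream object and leaves the shared env untouched.
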
* [PATCH] www: fix absolute URLs when mounted under a subdir
From: Eric Wong @ 2019-10-01 7:13 UTC
To: edef; +Cc: meta, hi

Eric Wong <e@80x24.org> wrote:
> Also, I suspect the mbox Archived-At headers could be wrong
> and need a similar change... Maybe Atom feeds, too.

Yup, mboxrd code needed changing. Atom feeds already had full
URLs (and tests), so I added some test cases to t/psgi_mount.t
and fixed the remaining cases.

Just pushed this out to master:

---------8<-----------
Subject: [PATCH] www: fix absolute URLs when mounted under a subdir

While we avoid generating absolute URLs in most cases, our
"git clone" instructions and URL headers in mboxrd files
contain full URLs.

So do the same thing we do for WwwAtomStream and pre-generate
the full URL before Plack::App::URLMap changes
$env->{PATH_INFO} and $env->{SCRIPT_NAME} back to their
original values.

Reported-by: edef <edef@edef.eu>
Link: https://public-inbox.org/meta/cover.0f97c47bb88db8b875be7497289d8fedd3b11991.1569296942.git-series.edef@edef.eu/
---
 lib/PublicInbox/Mbox.pm      |  5 ++++-
 lib/PublicInbox/WwwStream.pm | 13 +++++++++---
 t/psgi_mount.t               | 38 ++++++++++++++++++++++++++++++++++--
 3 files changed, 50 insertions(+), 6 deletions(-)

diff --git a/lib/PublicInbox/Mbox.pm b/lib/PublicInbox/Mbox.pm
index 6d902e6c..67b671f5 100644
--- a/lib/PublicInbox/Mbox.pm
+++ b/lib/PublicInbox/Mbox.pm
@@ -60,10 +60,12 @@ sub getline {
 
 sub close {} # noop
 
+# /$INBOX/$MESSAGE_ID/raw
 sub emit_raw {
 	my ($ctx) = @_;
 	my $mid = $ctx->{mid};
 	my $ibx = $ctx->{-inbox};
+	$ctx->{base_url} = $ibx->base_url($ctx->{env});
 	my ($mref, $more, $id, $prev, $next);
 	if (my $over = $ibx->over) {
 		my $smsg = $over->next_by_mid($mid, \$id, \$prev) or return;
@@ -97,7 +99,7 @@ sub msg_hdr ($$;$) {
 		$header_obj->header_set($d);
 	}
 	my $ibx = $ctx->{-inbox};
-	my $base = $ibx->base_url($ctx->{env});
+	my $base = $ctx->{base_url};
 	$mid = $ctx->{mid} unless defined $mid;
 	$mid = mid_escape($mid);
 	my @append = (
@@ -246,6 +248,7 @@ use PublicInbox::Hval qw/to_filename/;
 sub new {
 	my ($class, $ctx, $cb) = @_;
 	my $buf = '';
+	$ctx->{base_url} = $ctx->{-inbox}->base_url($ctx->{env});
 	bless {
 		buf => \$buf,
 		gz => IO::Compress::Gzip->new(\$buf, Time => 0),
diff --git a/lib/PublicInbox/WwwStream.pm b/lib/PublicInbox/WwwStream.pm
index 7399b0ad..f5338c39 100644
--- a/lib/PublicInbox/WwwStream.pm
+++ b/lib/PublicInbox/WwwStream.pm
@@ -19,7 +19,15 @@ sub close {}
 
 sub new {
 	my ($class, $ctx, $cb) = @_;
-	bless { nr => 0, cb => $cb || *close, ctx => $ctx }, $class;
+
+	my $base_url = $ctx->{-inbox}->base_url($ctx->{env});
+	chop $base_url; # no trailing slash for clone
+	bless {
+		nr => 0,
+		cb => $cb || *close,
+		ctx => $ctx,
+		base_url => $base_url,
+	}, $class;
 }
 
 sub response {
@@ -83,8 +91,7 @@ sub _html_end {
 	my $desc = ascii_html($ibx->description);
 
 	my (%seen, @urls);
-	my $http = $ibx->base_url($ctx->{env});
-	chop $http; # no trailing slash for clone
+	my $http = $self->{base_url};
 	my $max = $ibx->max_git_epoch;
 	my $dir = (split(m!/!, $http))[-1];
 	if (defined($max)) { # v2
diff --git a/t/psgi_mount.t b/t/psgi_mount.t
index 05dbd736..8da2bc89 100644
--- a/t/psgi_mount.t
+++ b/t/psgi_mount.t
@@ -60,11 +60,24 @@ test_psgi($app, sub {
 	unlike($res->content, qr!\b\Qhttp://[^/]+/test/\E!,
 		'No URLs which are not mount-aware');
 
-	# redirects
+	$res = $cb->(GET('/a/test/new.html'));
+	like($res->content, qr!git clone --mirror http://[^/]+/a/test\b!,
+		'clone URL in new.html is mount-aware');
+
 	$res = $cb->(GET('/a/test/blah%40example.com/'));
 	is($res->code, 200, 'OK with URLMap mount');
+	like($res->content, qr!git clone --mirror http://[^/]+/a/test\b!,
+		'clone URL in /$INBOX/$MESSAGE_ID/ is mount-aware');
+
 	$res = $cb->(GET('/a/test/blah%40example.com/raw'));
 	is($res->code, 200, 'OK with URLMap mount');
+	like($res->content, qr!^List-Archive: <http://[^/]+/a/test/>!m,
+		'List-Archive set in /raw mboxrd');
+	like($res->content,
+		qr!^Archived-At: <http://[^/]+/a/test/blah\@example\.com/>!m,
+		'Archived-At set in /raw mboxrd');
+
+	# redirects
 	$res = $cb->(GET('/a/test/m/blah%40example.com.html'));
 	is($res->header('Location'),
 		'http://localhost/a/test/blah@example.com/',
@@ -72,7 +85,28 @@ test_psgi($app, sub {
 
 	$res = $cb->(GET('/test/blah%40example.com/'));
 	is($res->code, 404, 'intentional 404 with URLMap mount');
-
 });
 
+SKIP: {
+	my @mods = qw(DBI DBD::SQLite Search::Xapian IO::Uncompress::Gunzip);
+	foreach my $mod (@mods) {
+		eval "require $mod" or skip "$mod not available: $@", 2;
+	}
+	my $ibx = $config->lookup_name('test');
+	PublicInbox::SearchIdx->new($ibx, 1)->index_sync;
+	test_psgi($app, sub {
+		my ($cb) = @_;
+		my $res = $cb->(GET('/a/test/blah@example.com/t.mbox.gz'));
+		my $gz = $res->content;
+		my $raw;
+		IO::Uncompress::Gunzip::gunzip(\$gz => \$raw);
+		like($raw, qr!^List-Archive: <http://[^/]+/a/test/>!m,
+			'List-Archive set in /t.mbox.gz mboxrd');
+		like($raw,
+			qr!^Archived-At:\x20
+				<http://[^/]+/a/test/blah\@example\.com/>!mx,
+			'Archived-At set in /t.mbox.gz mboxrd');
+	});
+}
+
 done_testing();
-- 
EW

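A note on the last test in the patch: its pattern is split across two lines with the /x modifier, which is why the space after "Archived-At:" is written as \x20. The sketch below uses a made-up header string (not output from public-inbox) to show the difference; under /x, a plain space in the pattern is ignored, so only the escaped form matches a header that really contains a space.

```perl
use strict;
use warnings;

# Made-up sample headers, shaped like the ones the test checks for.
my $raw = "List-Archive: <http://localhost/a/test/>\n"
        . "Archived-At: <http://localhost/a/test/blah\@example.com/>\n";

# With /x, whitespace in the pattern is ignored, so a space that must
# match literally is written as \x20 and the pattern can span lines.
my $with_escape = $raw =~ qr!^Archived-At:\x20
	<http://[^/]+/a/test/blah\@example\.com/>!mx;

# Here the plain space is discarded by /x, so the pattern demands '<'
# directly after the colon and does not match the sample.
my $plain_space = $raw =~ qr!^Archived-At: <http://[^/]+/a/test/blah\@example\.com/>!mx;

print $with_escape ? "escaped space: match\n" : "escaped space: no match\n";
print $plain_space ? "plain space: match\n"   : "plain space: no match\n";
```
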
Thread overview: 4+ messages
2019-09-24 4:10 [PATCH 0/1] Fix broken clone URLs due to SCRIPT_NAME getting reset edef
2019-09-24 4:10 ` [PATCH 1/1] wwwstream: copy $ctx->{env} in new edef
2019-09-26 3:03 ` [PATCH 0/1] Fix broken clone URLs due to SCRIPT_NAME getting reset Eric Wong
2019-10-01 7:13   ` [PATCH] www: fix absolute URLs when mounted under a subdir Eric Wong