From: Eric Wong <e@80x24.org>
To: Junio C Hamano <gitster@pobox.com>
Cc: meta@public-inbox.org
Subject: [PATCH] www: strip and redirect on `<' and `>' in MSGID of URL
Date: Mon, 17 Jun 2024 00:01:40 +0000
Message-ID: <20240617000140.M652375@dcvr>
In-Reply-To: <xmqqtthvh4r6.fsf@gitster.g>
Junio C Hamano <gitster@pobox.com> wrote:
> When I have a specific message on a mailing list, and I am
> interested in the discussion around the message, I often go to the
> URL of that message in public-inbox powered mailing list archive.
> For example, I go to
>
> https://public-inbox.org/meta/20240606074416.3900983-1-e@80x24.org/
>
> when I find "Message-ID: <20240606074416.3900983-1-e@80x24.org>"
>
> It would be immensely convenient if cutting and pasting including
> the surrounding <angle-brackets>, i.e.
>
> https://public-inbox.org/meta/<20240606074416.3900983-1-e@80x24.org>/
>
> is silently accepted and redirected to
>
> https://public-inbox.org/meta/20240606074416.3900983-1-e@80x24.org/
>
> instead of the "partial matches found" page.
Seems reasonable; especially since sr.ht uses <> in URLs nowadays
and some users may be conditioned to include them.
I don't see 404s in my logs from this, but I don't keep a lot of logs.
------8<------
Subject: [PATCH] www: strip and redirect on `<' and `>' in MSGID of URL
Some users may needlessly include the `<' and `>' angle brackets in
URLs, so account for this common mistake and redirect users to the
URL without them. The habit could be learned behavior from other
sites (e.g. sr.ht) which include `<' and `>' in URLs.
Reported-by: Junio C Hamano <gitster@pobox.com>
Link: https://public-inbox.org/meta/xmqqtthvh4r6.fsf@gitster.g/
---
lib/PublicInbox/View.pm | 10 +++++++---
t/psgi_search.t | 11 +++++++++++
2 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index dcceb311..cc1ab79a 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -74,9 +74,13 @@ sub msg_page {
my ($id, $prev);
my $next_arg = $ctx->{next_arg} = [ $ctx->{mid}, \$id, \$prev ];
- my $smsg = $ctx->{smsg} = $over->next_by_mid(@$next_arg) or
- return; # undef == 404
-
+ my $smsg = $ctx->{smsg} = $over->next_by_mid(@$next_arg);
+ if (!$smsg && $ctx->{mid} =~ /\A\<(.+)\>\z/ and
+ ($next_arg->[0] = $1) and
+ ($over->next_by_mid(@$next_arg))) {
+ return PublicInbox::WWW::r301($ctx, undef, $next_arg->[0]);
+ }
+ $smsg or return; # undef=404
# allow user to easily browse the range around this message if
# they have ->over
$ctx->{-t_max} = $smsg->{ts};
diff --git a/t/psgi_search.t b/t/psgi_search.t
index 8c981c6c..759dab78 100644
--- a/t/psgi_search.t
+++ b/t/psgi_search.t
@@ -179,6 +179,17 @@ test_psgi(sub { $www->call(@_) }, sub {
$res = $cb->(GET(q{/test/?q=%22s'more%22&x=A}));
is $res->code, 200, 'single quote inside phrase';
+
+ $res = $cb->(GET("/test/<$mid>/"));
+ is $res->code, 301, "redirect for raw `<' and `>' in msgid";
+ like $res->header('location'), qr!/test/\Q$mid\E/\z!,
+ "redirected to URL without raw `<' and `>'";
+
+ $res = $cb->(GET("/test/%3c$mid%3e/"));
+ is $res->code, 301, "redirect for escaped `<' and `>' in msgid";
+ like $res->header('location'), qr!/test/\Q$mid\E/\z!,
+ "redirected to URL without escaped `<' and `>'";
+
# TODO: more tests and odd cases
});
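
For illustration only, here is a minimal standalone sketch of the lookup order used in the View.pm hunk above: try the Message-ID exactly as given, and only fall back to the bracket-stripped form (answering with a 301) when the exact lookup fails. The %known hash and the strip_angle_brackets() and lookup() helpers are hypothetical stand-ins, not part of public-inbox; the real code queries $over->next_by_mid and redirects via PublicInbox::WWW::r301. The sketch also assumes the Message-ID has already been URL-decoded, so `%3c' and `%3e' arrive as `<' and `>'.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the message index; the patch itself
# queries $over->next_by_mid instead of a hash.
my %known = ('20240606074416.3900983-1-e@80x24.org' => 1);

# Returns the Message-ID without the surrounding `<' and `>', or
# undef if the input is not wrapped in angle brackets.
sub strip_angle_brackets {
	my ($mid) = @_;
	return $mid =~ /\A<(.+)>\z/ ? $1 : undef;
}

# Returns (status, mid): 200 for an exact hit, 301 when only the
# stripped form is known, 404 otherwise.
sub lookup {
	my ($known, $mid) = @_;
	return (200, $mid) if $known->{$mid};
	my $stripped = strip_angle_brackets($mid);
	return (301, $stripped) if defined($stripped) && $known->{$stripped};
	return (404, undef);
}

for my $mid ('20240606074416.3900983-1-e@80x24.org',
		'<20240606074416.3900983-1-e@80x24.org>',
		'<no-such-message@example.com>') {
	my ($code, $target) = lookup(\%known, $mid);
	printf "%s => %d%s\n", $mid, $code,
		defined($target) ? " ($target)" : '';
}
```

Trying the exact form first means a Message-ID that legitimately contains `<' or `>' would still resolve directly; the stripped form is only consulted on a miss, and an unknown stripped ID still ends in a 404.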
Thread overview: 3+ messages
2024-06-14 16:24 [feature request] accept and silently ignore <> around message IDs at the end of URL Junio C Hamano
2024-06-17 0:01 ` Eric Wong [this message]
2024-06-17 1:07 ` [PATCH] www: strip and redirect on `<' and `>' in MSGID " Junio C Hamano