unofficial mirror of meta@public-inbox.org
* [feature request] accept and silently ignore <> around message IDs at the end of URL
@ 2024-06-14 16:24 Junio C Hamano
  2024-06-17  0:01 ` [PATCH] www: strip and redirect on `<' and `>' in MSGID " Eric Wong
  0 siblings, 1 reply; 3+ messages in thread
From: Junio C Hamano @ 2024-06-14 16:24 UTC
  To: meta

When I have a specific message on a mailing list, and I am
interested in the discussion around the message, I often go to the
URL of that message in a public-inbox-powered mailing list archive.
For example, I go to

    https://public-inbox.org/meta/20240606074416.3900983-1-e@80x24.org/

when I find "Message-ID: <20240606074416.3900983-1-e@80x24.org>".

It would be immensely convenient if a URL pasted together with
the surrounding <angle-brackets>, i.e.

    https://public-inbox.org/meta/<20240606074416.3900983-1-e@80x24.org>/

were silently accepted and redirected to

    https://public-inbox.org/meta/20240606074416.3900983-1-e@80x24.org/

instead of the "partial matches found" page.
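
A minimal sketch of the requested normalization, written in Python for
illustration rather than in the Perl that public-inbox uses.  The helper
name `strip_angle_brackets` and the assumption that the Message-ID is the
last path component are hypothetical and not taken from the public-inbox
code:

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def strip_angle_brackets(url: str) -> str:
    """Return url with <...> removed from around its last path component.

    Hypothetical helper: accepts raw and percent-encoded (%3C, %3E) brackets.
    """
    parts = urlsplit(url)
    segments = parts.path.rstrip("/").split("/")
    mid = unquote(segments[-1])  # turns %3C / %3E back into < / >
    if len(mid) > 2 and mid.startswith("<") and mid.endswith(">"):
        mid = mid[1:-1]
    # Re-encode, but keep "@" readable as in the URLs shown above.
    segments[-1] = quote(mid, safe="@")
    return urlunsplit(parts._replace(path="/".join(segments) + "/"))
```

Both the raw `<...>` form and the percent-encoded `%3C...%3E` form map to
the same bracket-free URL, which a server could then use as the redirect
target.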

Thanks.


* [PATCH] www: strip and redirect on `<' and `>' in MSGID of URL
  2024-06-14 16:24 [feature request] accept and silently ignore <> around message IDs at the end of URL Junio C Hamano
@ 2024-06-17  0:01 ` Eric Wong
  2024-06-17  1:07   ` Junio C Hamano
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Wong @ 2024-06-17  0:01 UTC
  To: Junio C Hamano; +Cc: meta

Junio C Hamano <gitster@pobox.com> wrote:
> When I have a specific message on a mailing list, and I am
> interested in the discussion around the message, I often go to the
> URL of that message in a public-inbox-powered mailing list archive.
> For example, I go to
> 
>     https://public-inbox.org/meta/20240606074416.3900983-1-e@80x24.org/
> 
> when I find "Message-ID: <20240606074416.3900983-1-e@80x24.org>".
> 
> It would be immensely convenient if a URL pasted together with
> the surrounding <angle-brackets>, i.e.
> 
>     https://public-inbox.org/meta/<20240606074416.3900983-1-e@80x24.org>/
> 
> were silently accepted and redirected to
> 
>     https://public-inbox.org/meta/20240606074416.3900983-1-e@80x24.org/
> 
> instead of the "partial matches found" page.

Seems reasonable; especially since sr.ht uses <> in URLs nowadays
and some users may be conditioned to include them.
I don't see 404s in my logs from this, but I don't keep a lot of logs.
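
The lookup order used by the patch below can be sketched as follows, in
Python for illustration only.  `KNOWN_MIDS` is a hypothetical stand-in
for the real index, `lookup_with_fallback` is a made-up name, and the
returned location is just the bare Message-ID rather than a full URL:

```python
from typing import Optional, Tuple

# Hypothetical stand-in for the message index: a set of known Message-IDs.
KNOWN_MIDS = {"20240606074416.3900983-1-e@80x24.org"}


def lookup_with_fallback(mid: str) -> Tuple[int, Optional[str]]:
    """Return (status, location): exact lookup first, then a bracket-stripped retry."""
    if mid in KNOWN_MIDS:
        return 200, None  # exact match wins, even if the ID contains brackets
    if len(mid) > 2 and mid.startswith("<") and mid.endswith(">"):
        stripped = mid[1:-1]
        if stripped in KNOWN_MIDS:
            return 301, stripped  # redirect to the bracket-free form
    return 404, None  # nothing matched either way
```

Trying the exact Message-ID first means a message whose stored ID really
does include brackets is still served directly; the stripped form is only
consulted after a miss.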

------8<------
Subject: [PATCH] www: strip and redirect on `<' and `>' in MSGID of URL

Some users may needlessly include `<' and `>' angle brackets in
URLs, so account for this common mistake and redirect users to the
bracket-free URL.  This mistake could be learned behavior from
other sites (e.g. sr.ht) which include `<' and `>' in URLs.

Reported-by: Junio C Hamano <gitster@pobox.com>
Link: https://public-inbox.org/meta/xmqqtthvh4r6.fsf@gitster.g/
---
 lib/PublicInbox/View.pm | 10 +++++++---
 t/psgi_search.t         | 11 +++++++++++
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index dcceb311..cc1ab79a 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -74,9 +74,13 @@ sub msg_page {
 	my ($id, $prev);
 	my $next_arg = $ctx->{next_arg} = [ $ctx->{mid}, \$id, \$prev ];
 
-	my $smsg = $ctx->{smsg} = $over->next_by_mid(@$next_arg) or
-		return; # undef == 404
-
+	my $smsg = $ctx->{smsg} = $over->next_by_mid(@$next_arg);
+	if (!$smsg && $ctx->{mid} =~ /\A\<(.+)\>\z/ and
+			($next_arg->[0] = $1) and
+			($over->next_by_mid(@$next_arg))) {
+		return PublicInbox::WWW::r301($ctx, undef, $next_arg->[0]);
+	}
+	$smsg or return; # undef=404
 	# allow user to easily browse the range around this message if
 	# they have ->over
 	$ctx->{-t_max} = $smsg->{ts};
diff --git a/t/psgi_search.t b/t/psgi_search.t
index 8c981c6c..759dab78 100644
--- a/t/psgi_search.t
+++ b/t/psgi_search.t
@@ -179,6 +179,17 @@ test_psgi(sub { $www->call(@_) }, sub {
 
 	$res = $cb->(GET(q{/test/?q=%22s'more%22&x=A}));
 	is $res->code, 200, 'single quote inside phrase';
+
+	$res = $cb->(GET("/test/<$mid>/"));
+	is $res->code, 301, "redirect for raw `<' and `>' in msgid";
+	like $res->header('location'), qr!/test/\Q$mid\E/\z!,
+		"redirected to URL without raw `<' and `>'";
+
+	$res = $cb->(GET("/test/%3c$mid%3e/"));
+	is $res->code, 301, "redirect for escaped `<' and `>' in msgid";
+	like $res->header('location'), qr!/test/\Q$mid\E/\z!,
+		"redirected to URL without escaped `<' and `>'";
+
 	# TODO: more tests and odd cases
 });
 



* Re: [PATCH] www: strip and redirect on `<' and `>' in MSGID of URL
  2024-06-17  0:01 ` [PATCH] www: strip and redirect on `<' and `>' in MSGID " Eric Wong
@ 2024-06-17  1:07   ` Junio C Hamano
  0 siblings, 0 replies; 3+ messages in thread
From: Junio C Hamano @ 2024-06-17  1:07 UTC
  To: Eric Wong; +Cc: meta

Eric Wong <e@80x24.org> writes:

> Seems reasonable; especially since sr.ht uses <> in URLs nowadays
> and some users may be conditioned to include them.
> I don't see 404s in my logs from this, but I don't keep a lot of logs.

Other users may have read RFC 2822 and friends and remember that they
say

    message-id      =       "Message-ID:" msg-id CRLF
    msg-id          =       [CFWS] "<" id-left "@" id-right ">" [CFWS]

i.e. the <angle-brackets> around a msg-id are part of the msg-id.
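
As a rough illustration of that grammar, the Python sketch below pulls
the id-left@id-right part out of a msg-id.  It is deliberately
simplified: it treats CFWS as plain whitespace and does not validate
id-left or id-right against the RFC, and the name `bare_msg_id` is
hypothetical:

```python
import re
from typing import Optional

# Simplified msg-id: optional whitespace, "<", id-left "@" id-right, ">".
# Assumption: CFWS comments and the full id-left/id-right syntax are ignored.
MSG_ID_RE = re.compile(r"\s*<([^<>@\s]+@[^<>@\s]+)>\s*")


def bare_msg_id(value: str) -> Optional[str]:
    """Return id-left@id-right without the brackets, or None if no match."""
    m = MSG_ID_RE.fullmatch(value)
    return m.group(1) if m else None
```

A value without the brackets does not match this pattern at all, which
reflects the point that the brackets belong to the msg-id itself.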

A more practical reason is that double-clicking on a msg-id in a
terminal often highlights the surrounding <> as well, not just the
id-left@id-right part.  Some users may have solved this by configuring
their terminal emulator, but for others, accepting a msg-id pasted
with its <angle-brackets> into the URL is more convenient.

I just visited

  https://public-inbox.org/meta/<20240617000140.M652375@dcvr>

by typing the .../meta/ part myself and copy-pasting the rest from the
Message-ID: header of the message I am responding to.  I am very pleased
to see the request redirected to

  https://public-inbox.org/meta/20240617000140.M652375@dcvr/

Thanks.
