Date: Sun, 21 Aug 2022 23:43:48 +0200
From: Mark Wielaard
To: Overseers mailing list
Cc: Eric Wong, meta@public-inbox.org
Subject: Re: Using plus (+) in list name
References: <20220821205338.M316466@dcvr>
In-Reply-To: <20220821205338.M316466@dcvr>

Hi Eric,

On Sun, Aug 21, 2022 at 08:53:38PM +0000, Eric Wong via Overseers wrote:
> Mark Wielaard wrote:
> > We are setting up a public-inbox instance for cygwin/gcc/sourceware
> > lists at https://inbox.sourceware.org/ and it seems to work pretty
> > nicely. Thanks. Except for lists which have a + in their name like
> > libstdc++.
> >
> > I assume this needs some escaping somewhere, but I cannot figure out
> > where. The .public-inbox/config snippet looks like:
>
> I seem to remember '+' is OK as-is in the path component of HTTP URLs,
> but is escaping for ' ' (SP) in query strings.

Yes, '+' doesn't have a reserved purpose in the path component, but it
does encode a space in the query string. So it doesn't have to be
escaped in the path component and can be used as is (although
percent-encoding is recommended, nobody seems to do it).

> > This seems to work fine for nntp and imap, but not https.
>
> Interesting that NNTP and IMAP work (I wasn't expecting it :x).
>
> I can't remember off the top of my head, but is '+' allowed by
> the relevant NNTP and List-Id RFCs?

I don't know. I just observed that I can see the group name
inbox.gcc.libstdc++ in my nntp and imap readers when pointing at
inbox.sourceware.org.

> > We are using the EPEL public-inbox package public-inbox-1.7.0-2.el8.noarch
>
> Totally untested, but perhaps changing $INBOX_RE in
> PublicInbox/WWW.pm will work:
>
> diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
> index b9b68382..77f463d3 100644
> --- a/lib/PublicInbox/WWW.pm
> +++ b/lib/PublicInbox/WWW.pm
> @@ -23,7 +23,7 @@ use PublicInbox::WwwStatic qw(r path_info_raw);
>  use PublicInbox::Eml;
>
>  # TODO: consider a routing tree now that we have more endpoints:
> -our $INBOX_RE = qr!\A/([\w\-][\w\.\-]*)!;
> +our $INBOX_RE = qr!\A/([\w\-][\w\.\-\+]*)!;
>  our $MID_RE = qr!([^/]+)!;
>  our $END_RE = qr!(T/|t/|t\.mbox(?:\.gz)?|t\.atom|raw|)!;
>  our $ATTACH_RE = qr!([0-9][0-9\.]*)-($PublicInbox::Hval::FN)!;

That works! https://inbox.sourceware.org/libstdc++ looks fully
functional now. Now to figure out how to properly include that patch
before the other sourceware overseers figure out I patched the
packaged code in place.

Thanks,

Mark
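
For readers following along in the archive, here is a minimal sketch of
what the one-character-class change does. It is not public-inbox code:
the two patterns are Python transliterations of the old and new
$INBOX_RE from the diff, inbox_name() is a hypothetical helper, and it
assumes Python's re module treats these patterns the same way Perl does
for ASCII list names. How WWW.pm combines the capture with the rest of
its routes is not covered here. The last two lines illustrate the
path-versus-query point about '+' using Python's urllib.parse.

```python
import re
from urllib.parse import unquote, unquote_plus

# Transliterations of the old and new $INBOX_RE from the diff above.
# Assumption: \A, \w and the character classes behave as in Perl here.
OLD_INBOX_RE = re.compile(r"\A/([\w\-][\w\.\-]*)")
NEW_INBOX_RE = re.compile(r"\A/([\w\-][\w\.\-\+]*)")


def inbox_name(pattern, path):
    """Hypothetical helper: return the captured inbox name, or None."""
    m = pattern.match(path)
    return m.group(1) if m else None


path = "/libstdc++/"

# The old character class has no '+', so the capture stops before it.
old = inbox_name(OLD_INBOX_RE, path)

# The patched class accepts '+' after the first character.
new = inbox_name(NEW_INBOX_RE, path)

# In a path, '+' is a literal plus; percent-decoding leaves it alone.
path_decoded = unquote(path)

# In a form-encoded query string, '+' stands for a space.
query_decoded = unquote_plus("q=a+b")

print(old, new, path_decoded, query_decoded)
```

With the old pattern the captured name is truncated at the first '+',
so it no longer corresponds to the configured inbox; with the patched
pattern the full name is captured.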