Date: Sun, 21 Aug 2022 20:53:38 +0000
From: Eric Wong
To: Mark Wielaard
Cc: meta@public-inbox.org, overseers@sourceware.org
Subject: Re: Using plus (+) in list name
Message-ID: <20220821205338.M316466@dcvr>

Mark Wielaard wrote:
> Hi,
>
> We are setting up a public-inbox instance for cygwin/gcc/sourceware
> lists at https://inbox.sourceware.org/ and it seems to work pretty
> nicely. Thanks. Except for lists which have a + in their name like
> libstdc++.
>
> I assume this needs some escaping somewhere, but I cannot figure out
> where. The .public-inbox/config snippet looks like:

I seem to remember '+' is OK as-is in the path component of HTTP URLs,
but is the escape for ' ' (SP) in query strings.
At least it's OK for a git-config section name:

> [publicinbox "libstdc++"]
> 	address = libstdc++@gcc.gnu.org
> 	url = https://inbox.sourceware.org/libstdc++
> 	inboxdir = /home/inbox/lists/libstdc++
> 	indexlevel = full
> 	newsgroup = inbox.gcc.libstdc++
> 	listid = libstdc++.gcc.gnu.org
>
> This seems to work fine for nntp and imap, but not https.

Interesting that NNTP and IMAP work (I wasn't expecting it :x).
I can't remember off the top of my head, but is '+' allowed by
the relevant NNTP and List-Id RFCs?

Anyways, good to see public-inbox getting more adoption :>

> It does work when replacing the ++ with pp in the list name and
> url. But that looks somewhat odd imho. And the name with ++ can be
> used with e.g. mailman:
> https://gcc.gnu.org/mailman/listinfo/libstdc++
>
> Is there some way to configure public-inbox-http to be able to use ++
> in list names and urls?
>
> We are using the EPEL public-inbox package public-inbox-1.7.0-2.el8.noarch

Totally untested, but perhaps changing $INBOX_RE in
PublicInbox/WWW.pm will work:

diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
index b9b68382..77f463d3 100644
--- a/lib/PublicInbox/WWW.pm
+++ b/lib/PublicInbox/WWW.pm
@@ -23,7 +23,7 @@ use PublicInbox::WwwStatic qw(r path_info_raw);
 use PublicInbox::Eml;
 
 # TODO: consider a routing tree now that we have more endpoints:
-our $INBOX_RE = qr!\A/([\w\-][\w\.\-]*)!;
+our $INBOX_RE = qr!\A/([\w\-][\w\.\-\+]*)!;
 our $MID_RE = qr!([^/]+)!;
 our $END_RE = qr!(T/|t/|t\.mbox(?:\.gz)?|t\.atom|raw|)!;
 our $ATTACH_RE = qr!([0-9][0-9\.]*)-($PublicInbox::Hval::FN)!;
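
A quick way to sanity-check the character class without touching a
running instance is a rough Python sketch like the one below. It is only
an approximation: the two patterns are transliterated from the Perl
$INBOX_RE before and after the diff, `inbox_name` is a hypothetical
helper that does not exist in public-inbox, and the sketch assumes
Python's `\w` behaves like Perl's for plain ASCII list names. It also
illustrates the path vs. query-string difference for '+' mentioned
earlier.

```python
import re
from urllib.parse import parse_qs, unquote

# Python transliterations of the Perl patterns in the diff (assumption:
# \w is close enough between the two engines for ASCII inbox names).
OLD_INBOX_RE = re.compile(r"\A/([\w\-][\w.\-]*)")
NEW_INBOX_RE = re.compile(r"\A/([\w\-][\w.\-+]*)")


def inbox_name(pattern, path):
    """Hypothetical helper: return the inbox name captured from a URL path."""
    m = pattern.match(path)
    return m.group(1) if m else None


path = "/libstdc++/"

# The old class stops at the first '+', so only "libstdc" is captured.
print(inbox_name(OLD_INBOX_RE, path))

# With '+' added to the class, the full "libstdc++" is captured.
print(inbox_name(NEW_INBOX_RE, path))

# '+' is left alone when percent-decoding a path...
print(unquote(path))

# ...but is decoded as SP when parsing a query string.
print(parse_qs("q=a+b"))
```

If the old pattern really does capture only "libstdc", the remaining
"++/" would presumably fail to match whatever the route expects next,
which would be consistent with HTTPS failing while NNTP and IMAP work.
That part is a guess, not something verified against the WWW.pm routes.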