Date: Thu, 16 Feb 2023 21:36:28 +0000
From: Eric Wong
To: Uwe Kleine-König
Cc: meta@public-inbox.org
Subject: Re: Bug related to (maybe?) / in Message-Id
Message-ID: <20230216213628.M187845@dcvr>
References: <20230216210546.eo73kyzvtzaxwxko@pengutronix.de>
In-Reply-To: <20230216210546.eo73kyzvtzaxwxko@pengutronix.de>

Uwe Kleine-König wrote:
> Hello,
>
> The mail by Alexander Dahl that is (currently) the first hit on
> https://lore.ptxdist.org/ptxdist/?q=ptxd_make_world_compile_commands_filter
> results in a 404 when I follow the link.
>
> The original mail has
>
>     Message-ID: <Y+07h0l/zJJAgs9s@falbala.internal.home.lespocky.de>
>
> and the corresponding link is:
>
>     https://lore.ptxdist.org/ptxdist/Y+07h0l%2FzJJAgs9s@falbala.internal.home.lespocky.de/
>
> I noticed this on public-inbox 1.8.0-1~bpo11+1 from Debian, upgrading to
> 1.9.0-1~bpo11+1 didn't help.
>
> Other mails with / in Message-Id are not accessible either, I tested
> with:
>
>     YyHu/412LT8uQTy1@lenoch
>     Y0/5xdFZO3u0952+@lenoch

The TODO file has this:

* use REQUEST_URI properly for CGI / mod_perl2 compatibility
  with Message-IDs which include '%' (done?)

So I guess it's not done...

To deal with '/' in the Message-ID, $env->{REQUEST_URI} really
needs to be the raw, undecoded URI as specified in the PSGI specs[1].
I'm not sure how to go about it with Apache+CGI or mod_perl2.

Fwiw, the recommended configuration is:

    (nginx|haproxy) -> varnish -> public-inbox-{httpd,netd}

Maybe an Apache2 mpm_event reverse proxy can work in lieu of
(nginx|haproxy), but /T/, /t/, /t.mbox.gz requests are a bit
faster on -httpd/-netd since 1.6+ on SMP machines.

> I also wonder why these mails yield the webserver's 404 page and not the
> one provided by the public-inbox cgi?!

This may be due to the small size of public-inbox's 404 page. I don't
know Apache configs well, but I know nginx did something similar.

> Is this a problem in public-inbox, or is the apache configuration
> somehow borked? Any hints welcome.

Do you have access to that server and can show us the configs?
REQUEST_URI really needs to be raw in accordance with the PSGI specs.

The following patch dumps the request $env to stderr and shows us
REQUEST_URI, PATH_INFO, SCRIPT_NAME, and anything else which may
enlighten us:

diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
index 9ffcb879..f67fe8e6 100644
--- a/lib/PublicInbox/WWW.pm
+++ b/lib/PublicInbox/WWW.pm
@@ -52,7 +52,8 @@ sub call {
 		# none of the keys we care about will need escaping
 		($k // '', uri_unescape($v // ''))
 	} split(/[&;]+/, $env->{QUERY_STRING});
-
+	use Data::Dumper; $Data::Dumper::Useqq = 1;
+	warn Dumper($env);
 	my $path_info = path_info_raw($env);
 	my $method = $env->{REQUEST_METHOD};
 

[1] PSGI specs: git clone https://github.com/plack/psgi-specs
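
To illustrate why a raw, undecoded REQUEST_URI matters for a Message-ID containing '/', here is a minimal standalone sketch. It is not public-inbox code: pct_decode, segments_from_raw and segments_from_decoded are hypothetical helpers written for this example, and it only assumes that a frontend may percent-decode the path before the application sees it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from public-inbox): percent-decode a string.
# '+' is left alone since this is a path, not a query string.
sub pct_decode {
	my ($s) = @_;
	$s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
	$s;
}

# Split the raw path on '/' first, then decode each segment.
# An encoded slash (%2F) stays inside its segment.
sub segments_from_raw {
	my ($request_uri) = @_;
	(my $path = $request_uri) =~ s/\?.*\z//s; # drop any query string
	map { pct_decode($_) } grep { length } split(m{/}, $path);
}

# Decode the whole path first, then split: roughly what the app
# ends up with if the frontend only hands over a decoded path.
sub segments_from_decoded {
	my ($request_uri) = @_;
	(my $path = $request_uri) =~ s/\?.*\z//s;
	grep { length } split(m{/}, pct_decode($path));
}

# URL taken from the report above
my $uri = '/ptxdist/Y+07h0l%2FzJJAgs9s@falbala.internal.home.lespocky.de/';
print "raw:     ", join(' | ', segments_from_raw($uri)), "\n";
print "decoded: ", join(' | ', segments_from_decoded($uri)), "\n";
```

With the raw URI the Message-ID survives as one path segment; once the path has been decoded up front, the '/' splits it into two segments and no message can be matched, which would be consistent with the 404 described above.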