From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id C4B321F9F3;
	Sat, 11 Sep 2021 00:19:17 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Cc: Konstantin Ryabitsev
Subject: [PATCH 3/3] lei: normalize whitespace in remote queries
Date: Sat, 11 Sep 2021 00:19:17 +0000
Message-Id: <20210911001917.1310-4-e@80x24.org>
In-Reply-To: <20210911001917.1310-1-e@80x24.org>
References: <20210911001917.1310-1-e@80x24.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id:

Having redundant "+" in URLs is ugly and can hurt cacheability
of queries. Even with "quoted phrase searches", Xapian seems
unaffected by redundant spaces, so just normalize the ASCII
white spaces to ' ' (%20) when fed via STDIN or saved-search
config file.

Reported-by: Konstantin Ryabitsev
Link: https://public-inbox.org/meta/20210910141157.6u5adehpx7wftkor@meerkat.local/
---
 lib/PublicInbox/LeiXSearch.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 709a3b3a..9f7f3885 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -297,7 +297,9 @@ sub query_remote_mboxrd {
 	local $SIG{TERM} = sub { exit(0) }; # for DESTROY (File::Temp, $reap)
 	my $lei = $self->{lei};
 	my $opt = $lei->{opt};
-	my @qform = (q => $lei->{mset_opt}->{qstr}, x => 'm');
+	my $qstr = $lei->{mset_opt}->{qstr};
+	$qstr =~ s/[ \n\t]+/ /sg; # make URLs less ugly
+	my @qform = (q => $qstr, x => 'm');
 	push(@qform, t => 1) if $opt->{threads};
 	my $verbose = $opt->{verbose};
 	my ($reap_tail, $reap_curl);