From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.1
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id C87E91F597;
	Fri, 20 Jul 2018 06:16:12 +0000 (UTC)
Date: Fri, 20 Jul 2018 06:16:12 +0000
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH v2] search: use boolean prefixes for git blob queries
Message-ID: <20180720061612.j4s3gugasle2r4iz@whir>
References: <20180716040734.30104-1-e@80x24.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20180716040734.30104-1-e@80x24.org>
List-Id:

I've hit some cases where probabilistic searches don't work when
using dfpre:/dfpost:/dfblob: search prefixes because stemming in
the query parser interferes.

In any case, our indexing code indexes longer/unabbreviated blob
names down to their 7-character abbreviation, so there should be
no need to do wildcard searches on git blob names.
---
 lib/PublicInbox/Search.pm | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 69eca9f..090d998 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -50,6 +50,9 @@ use constant {
 
 my %bool_pfx_external = (
 	mid => 'Q', # Message-ID (full/exact), this is mostly uniQue
+	dfpre => 'XDFPRE',
+	dfpost => 'XDFPOST',
+	dfblob => 'XDFPRE XDFPOST',
 );
 
 my $non_quoted_body = 'XNQ XDFN XDFA XDFB XDFHH XDFCTX XDFPRE XDFPOST';
@@ -74,9 +77,6 @@ my %prob_prefix = (
 	dfb => 'XDFB',
 	dfhh => 'XDFHH',
 	dfctx => 'XDFCTX',
-	dfpre => 'XDFPRE',
-	dfpost => 'XDFPOST',
-	dfblob => 'XDFPRE XDFPOST',
 	# default:
 	'' => 'XM S A XQUOT XFN ' .
 		$non_quoted_body,
@@ -266,7 +266,7 @@ sub qp {
 		Search::Xapian::NumberValueRangeProcessor->new(DT, 'dt:'));
 
 	while (my ($name, $prefix) = each %bool_pfx_external) {
-		$qp->add_boolean_prefix($name, $prefix);
+		$qp->add_boolean_prefix($name, $_) foreach split(/ /, $prefix);
 	}
 
 	# we do not actually create AltId objects,
-- 
EW