From: David Bremner
To: Olly Betts, James Aylett
Cc: notmuch@notmuchmail.org, xapian-discuss@lists.xapian.org
Subject: Re: xapian parser bug?
In-Reply-To: <87y3bj198a.fsf@tethera.net>
References: <87a7o02bya.fsf@tethera.net> <20180930092039.7imrsrjyctpel2sp@survex.com> <87y3bj198a.fsf@tethera.net>
Date: Sun, 30 Sep 2018 14:49:49 -0300
Message-ID: <87o9cekh8i.fsf@tethera.net>
List-Id: "Use and development of the notmuch mail system."

David Bremner writes:

> Olly Betts writes:
>
>>
>> FWIW, I also couldn't reproduce this (I tried with quest and 1.4.7):
>>
>> $ quest -psubject:S -fdefault,boolean_any_case 'subject:"and"'
>> Parsed Query: Query(Sand@1)
>>
>
> Ah, OK, it must have something to do with the way that notmuch is using
> field processors. And I see now that the following code (from
> lib/regexp-fields.cc) is probably related (at least it explains
> subject:" not" works)
>
>     if (str.find (' ') != std::string::npos)
>         query_str = '"' + str + '"';
>     else
>         query_str = str;
>
>     return parser.parse_query (query_str, NOTMUCH_QUERY_PARSER_FLAGS, term_prefix);

For the record, I have proposed a fix for notmuch (str is known to be
non-empty there). This will phrase quote by default, unless the string
looks like a wildcard query (without spaces).

diff --git a/lib/regexp-fields.cc b/lib/regexp-fields.cc
index 084bc8c0..52f30d82 100644
--- a/lib/regexp-fields.cc
+++ b/lib/regexp-fields.cc
@@ -194,7 +194,7 @@ RegexpFieldProcessor::operator() (const std::string & str)
      * phrase parsing, when possible */
     std::string query_str;
-    if (str.find (' ') != std::string::npos)
+    if (*str.rbegin () != '*' || str.find (' ') != std::string::npos)
         query_str = '"' + str + '"';
     else
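
To make the effect of the new condition easier to check outside of notmuch, here is a minimal standalone sketch of the quoting rule. The function name quote_for_parser is hypothetical and does not exist in notmuch, and the empty-string guard is my own assumption; in the patched code str is known to be non-empty.

```cpp
#include <string>

// Hypothetical helper, not part of notmuch: it only mirrors the condition
// used in the proposed patch to lib/regexp-fields.cc.
std::string
quote_for_parser (const std::string &str)
{
    // Assumption: guard against an empty string before dereferencing
    // rbegin (). The real call site does not need this check.
    if (str.empty ())
        return "\"\"";

    const bool ends_with_star = (*str.rbegin () == '*');
    const bool has_space = (str.find (' ') != std::string::npos);

    // Leave the string unquoted only when it looks like a wildcard query
    // without spaces; everything else is phrase-quoted.
    if (! ends_with_star || has_space)
        return '"' + str + '"';

    return str;
}
```

Under this rule a single word such as and is passed on as "and", a trailing-wildcard term such as foo* is passed through unchanged, and foo bar* is quoted because it contains a space.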