From: David Bremner
To: Olly Betts, James Aylett
Cc: notmuch@notmuchmail.org, xapian-discuss@lists.xapian.org
Subject: Re: xapian parser bug?
In-Reply-To: <20180930092039.7imrsrjyctpel2sp@survex.com>
References: <87a7o02bya.fsf@tethera.net> <20180930092039.7imrsrjyctpel2sp@survex.com>
Date: Sun, 30 Sep 2018 09:05:25 -0300
Message-ID: <87y3bj198a.fsf@tethera.net>

Olly Betts writes:

> FWIW, I also couldn't reproduce this (I tried with quest and 1.4.7):
>
> $ quest -psubject:S -fdefault,boolean_any_case 'subject:"and"'
> Parsed Query: Query(Sand@1)

Ah, OK, it must have something to do with the way that notmuch is using field processors.

And I see now that the following code (from lib/regexp-fields.cc) is probably related (at least it explains why subject:" not" works):

    if (str.find (' ') != std::string::npos)
        query_str = '"' + str + '"';
    else
        query_str = str;

    return parser.parse_query (query_str, NOTMUCH_QUERY_PARSER_FLAGS, term_prefix);

The motivation for not always triggering phrase processing is that it breaks/disables wildcards. In particular, this change was made to fix the query 'subject:foo*'. The difficulty here is that the field processor doesn't know whether its string argument was originally quoted.
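
To make the effect of that heuristic concrete, here is a minimal standalone sketch of just the quoting decision, with Xapian left out entirely. The function name build_query_string is hypothetical (it is not in notmuch); it only mirrors the if/else from the snippet above so the three cases discussed in this thread can be compared side by side.

```cpp
#include <string>

// Hypothetical helper, not part of notmuch: reproduces only the
// quoting decision from lib/regexp-fields.cc. The argument is the
// string as the field processor receives it, i.e. with any original
// quotes already stripped.
std::string
build_query_string (const std::string &str)
{
    // A space is the only hint that the user probably wrote a quoted
    // phrase, so re-quote to get phrase processing.
    if (str.find (' ') != std::string::npos)
        return '"' + str + '"';

    // No space: pass the text through unquoted so that wildcards such
    // as foo* still work. A bare operator word is passed through
    // unquoted as well.
    return str;
}
```

With this sketch, "foo*" is returned unchanged (so the wildcard survives), " not" comes back wrapped in double quotes (so it is treated as a phrase), and "and" is returned unquoted. That last case would be consistent with the inner parse seeing a bare operator word rather than a search term, although I have not verified that this is exactly what happens inside parse_query.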