unofficial mirror of notmuch@notmuchmail.org
From: David Bremner <david@tethera.net>
To: Olly Betts <olly@survex.com>
Cc: notmuch@notmuchmail.org, xapian-discuss@lists.xapian.org
Subject: Re: xapian parser bug?
Date: Sun, 30 Sep 2018 22:25:33 -0300
Message-ID: <87sh1q1mr6.fsf@tethera.net>
In-Reply-To: <20180930204327.a4dwzh6jdqcqvk2e@survex.com>

Olly Betts <olly@survex.com> writes:

> On Sun, Sep 30, 2018 at 09:05:25AM -0300, David Bremner wrote:
>> 	    if (str.find (' ') != std::string::npos)
>> 		query_str = '"' + str + '"';
>> 	    else
>> 		query_str = str;
>> 
>> 	    return parser.parse_query (query_str, NOTMUCH_QUERY_PARSER_FLAGS, term_prefix);
>
> I wouldn't recommend trying to generate strings to feed to QueryParser
> like this code seems to be doing.  QueryParser aims to parse input from
> humans not machines.

str is the parameter to the FieldProcessor operator ().  The field
processor needs a way to approximate the standard probabilistic prefix
parsing in the fallback case.  The quotes are added to force the
generation of a phrase query; otherwise e.g. subject:"christmas party"
doesn't work out well.

I tried using OP_PHRASE as the default operator, but it doesn't
handle some cases I need.

% quest -o phrase 'bob jones <bob@example.com>'
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries

If I don't recursively call parse_query, then I guess I need to generate
terms in a compatible way before turning them into a phrase query. Maybe
that's not as hard as I originally thought, since being in a phrase turns
off the stemmer anyway iiuc.  Is there a Xapian API I can use to extract
"bob", "jones", "bob", "example", "com" from the example above? I guess
I could use a throwaway Xapian::Document and a TermGenerator
(basically aping xapian_core/tests/api_termgen.cc).
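
To make the splitting I have in mind concrete, here is a rough sketch
using only the standard library.  It does not call Xapian at all, so
it only approximates what a TermGenerator would do: it assumes plain
ASCII input, does no stemming, and has no special handling of infix
characters.  The name split_phrase_terms is made up for illustration.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper, not part of Xapian or notmuch: split text into
// lowercased runs of ASCII letters and digits, in order of appearance.
// Every other character is treated as a separator and dropped.
std::vector<std::string>
split_phrase_terms (const std::string &text)
{
    std::vector<std::string> terms;
    std::string current;

    for (unsigned char c : text) {
        if (std::isalnum (c)) {
            // Still inside a word: accumulate the lowercased character.
            current += static_cast<char> (std::tolower (c));
        } else if (! current.empty ()) {
            // Hit a separator: the word collected so far is one term.
            terms.push_back (current);
            current.clear ();
        }
    }

    // Flush the last word if the text did not end with a separator.
    if (! current.empty ())
        terms.push_back (current);

    return terms;
}
```

Called on 'bob jones <bob@example.com>' this yields "bob", "jones",
"bob", "example", "com", in that order.  The resulting terms would
still need the field prefix added before being combined into a phrase
query.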

d
