From: David Bremner <david@tethera.net>
To: notmuch@notmuchmail.org
Subject: v2 sexpr parser
Date: Sat, 17 Jul 2021 23:39:56 -0300
Message-ID: <20210718024021.3850340-1-david@tethera.net>
This is a substantially revised version of the series at [1]. As far
as I know, it now understands (a translation of) most of the queries
handled by the existing query parser. Some remaining limitations/issues:
1) The new query parser is only hooked into the notmuch search
subcommand. It should be fairly rote to hook it into the other
relevant subcommands, but I want to wait until resolving (2) before
proceeding.
2) The command line option --query-syntax={sexp,xapian} is a bit
klunky. Also "xapian" should perhaps be renamed "infix" to match the
'infix' operator in the new parser.
3) There is no documentation. I think notmuch-search-terms(7) is too
long already, so there should probably be a separate manual page. I
don't want to write that until I'm sure we want the new parser.
4) There is still some uncertainty around UTF-8 handling in sfsexp.
5) I'm not too sure about the new API call
notmuch_query_create_sexpr. I guess a more idiomatic thing to do would
be to add a new function with an extra argument, and have the old
function call it.
6) The way that user defined headers are used in the new parser is a
bit different than the existing one. Instead of (List notmuch), you
currently have to write (header List notmuch). I don't know if that's
better or worse. It's a bit more typing, but it is maybe a bit clearer to read.
It would probably not be too hard to switch.
7) Trailing wildcards like "subject:foo*" are not implemented yet.
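For (5), here is a minimal sketch of the "new function with an extra argument, old function calls it" shape. It is written in Python rather than the library's C just to keep it short, and every name in it (QuerySyntax, Query, query_create_with_syntax, query_create) is a hypothetical stand-in, not the notmuch API.

```python
from dataclasses import dataclass
from enum import Enum


class QuerySyntax(Enum):
    """Hypothetical syntax selector; values mirror --query-syntax."""
    XAPIAN = "xapian"
    SEXP = "sexp"


@dataclass(frozen=True)
class Query:
    """Hypothetical stand-in for a query object."""
    text: str
    syntax: QuerySyntax


def query_create_with_syntax(text: str, syntax: QuerySyntax) -> Query:
    """New entry point: the syntax is an explicit extra argument."""
    return Query(text, syntax)


def query_create(text: str) -> Query:
    """Old entry point: same signature as before, delegates to the new one."""
    return query_create_with_syntax(text, QuerySyntax.XAPIAN)
```

Existing callers of the old function keep working unchanged, and only code that wants s-expressions needs to know about the extra argument.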
In [2] Hannu mentioned being unclear on the design goals of the
s-expression query parser, so let me try and articulate the main
design goals a bit better. I think the existing query parser is great
for making "easy things easy". But when things are not easy and/or the
user wants better diagnostics, it is nice to have an alternative.
A) More consistent / predictable syntax.
The notmuch query parser adds several features to the Xapian query
parser. Mainly for implementation reasons, this has resulted in a
somewhat quirky syntax, and often fairly painful escaping. Probably
the most egregious syntax quirk is that '*' (for all messages) cannot
be composed with other queries. In particular, the new parser should
simplify and make more reliable code like "notmuch-search-filter",
which tries to combine an existing query with some user specified
filter.
With the new parser, these 15-20 lines can be replaced by
`(and (infix ,existing) (infix ,new))
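The same composition can be sketched outside of Emacs. The following Python is only an illustration: the helper names are made up, and it assumes that a double-quoted atom needs only backslash and double quote escaped, which I have not checked against sfsexp.

```python
def sexp_string(text: str) -> str:
    """Quote text as a double-quoted s-expression string atom.

    Assumption: backslash and double quote are the only characters
    that need escaping inside a quoted atom.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def combine_infix(existing: str, new: str) -> str:
    """Wrap two infix (Xapian-style) queries in a single 'and' form."""
    return f"(and (infix {sexp_string(existing)}) (infix {sexp_string(new)}))"


print(combine_infix("*", "tag:inbox"))
# prints: (and (infix "*") (infix "tag:inbox"))
```

Because each sub-query is passed through as an opaque string, the caller does not need to special-case '*' or add parentheses around the existing query.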
B) Better error reporting.
Xapian's query parser is designed to be permissive and almost never
rejects a query string. This is not always ideal, particularly when
debugging constructed queries.
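To illustrate the kind of strictness meant here, the sketch below rejects unbalanced input and says where it went wrong. It is a toy check in Python, not how sfsexp or the new parser reports errors; QuerySyntaxError and check_balanced are hypothetical names, and backslash escaping inside strings is an assumption.

```python
class QuerySyntaxError(ValueError):
    """Hypothetical error type for malformed s-expression queries."""


def check_balanced(query: str) -> None:
    """Raise QuerySyntaxError if parentheses or quotes are unbalanced."""
    depth = 0
    in_string = False
    escaped = False
    for pos, ch in enumerate(query):
        if in_string:
            # Inside a quoted atom, parentheses do not count.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise QuerySyntaxError(f"unexpected ')' at offset {pos}")
    if in_string:
        raise QuerySyntaxError("unterminated string")
    if depth > 0:
        raise QuerySyntaxError(f"{depth} unclosed '('")
```

A constructed query such as '(and (subject foo)' fails immediately with a message about the unclosed parenthesis, instead of being silently interpreted as something else.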
C) Extensibility.
The Xapian Query API has functionality that is not (yet) exposed via
the QueryParser. It turns out that some common feature requests are
easy to add [3]. For example, to match messages with a List-Id header,
you can use '(header List :any)'.
[1]: id:20210714000239.804384-1-david@tethera.net
[2]: id:60f190f8.1c69fb81.7e7d2.40d1@mx.google.com
[3]: In fairness, they would probably be fairly easy to add to the
Xapian QueryParser as well. But then we'd need to depend on a
sufficiently recent version.
Thread overview: 26+ messages
2021-07-18 2:39 David Bremner [this message]
2021-07-18 2:39 ` [PATCH 01/25] configure: optional library sfsexp David Bremner
2021-07-18 2:39 ` [PATCH 02/25] lib: split notmuch_query_create David Bremner
2021-07-18 2:39 ` [PATCH 03/25] lib: define notmuch_query_create_sexpr David Bremner
2021-07-18 2:40 ` [PATCH 04/25] CLI/search+address: support sexpr queries David Bremner
2021-07-18 2:40 ` [PATCH 05/25] lib: add new status code for query syntax errors David Bremner
2021-07-18 2:40 ` [PATCH 06/25] lib/parse-sexp: parse 'and', 'not', 'or' David Bremner
2021-07-18 2:40 ` [PATCH 07/25] lib/parse-sexp: parse 'subject' David Bremner
2021-07-18 2:40 ` [PATCH 08/25] lib/parse-sexp: split terms in phrase mode David Bremner
2021-07-18 2:40 ` [PATCH 09/25] lib/parse-sexp: handle most fields David Bremner
2021-07-18 2:40 ` [PATCH 10/25] lib/parse-sexp: handle unprefixed terms David Bremner
2021-07-18 2:40 ` [PATCH 11/25] lib: factor out date to query conversion David Bremner
2021-07-18 2:40 ` [PATCH 12/25] lib/parse-sexp: parse date fields David Bremner
2021-07-18 2:40 ` [PATCH 13/25] lib: factor out expansion of saved queries David Bremner
2021-07-18 2:40 ` [PATCH 14/25] lib/parse-sexp: handle " David Bremner
2021-07-18 2:40 ` [PATCH 15/25] lib/parse-sexp: add keyword arguments for fields David Bremner
2021-07-18 2:40 ` [PATCH 16/25] lib/parse-sexp: initial support for wildcard queries David Bremner
2021-07-18 2:40 ` [PATCH 17/25] lib/query: generalize exclude handling to s-expression queries David Bremner
2021-07-18 2:40 ` [PATCH 18/25] lib: factor out query construction from regexp David Bremner
2021-07-18 2:40 ` [PATCH 19/25] lib/parse-sexp: add support for regexp fields David Bremner
2021-07-18 2:40 ` [PATCH 20/25] lib/thread-fp: factor out query expansion David Bremner
2021-07-18 2:40 ` [PATCH 21/25] lib: define _notmuch_query_from_sexp David Bremner
2021-07-18 2:40 ` [PATCH 22/25] lib: generate actual Xapian query for "*" and "" David Bremner
2021-07-18 2:40 ` [PATCH 23/25] lib/parse-sexp: support thread subqueries David Bremner
2021-07-18 2:40 ` [PATCH 24/25] lib/parse-sexp: support infix subqueries David Bremner
2021-07-18 2:40 ` [PATCH 25/25] lib/parse-sexp: parse user headers David Bremner