From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mp2 ([2001:41d0:2:bcc0::])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
 by ms0.migadu.com with LMTPS id eE+kFg33A2HchQEAgWs5BA
 (envelope-from ) for ; Fri, 30 Jul 2021 14:56:45 +0200
Received: from aspmx1.migadu.com ([2001:41d0:2:bcc0::])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
 by mp2 with LMTPS id cD5JEg33A2FNawAAB5/wlQ
 (envelope-from ) for ; Fri, 30 Jul 2021 12:56:45 +0000
Received: from mail.notmuchmail.org (nmbug.tethera.net [IPv6:2607:5300:201:3100::1657])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by aspmx1.migadu.com (Postfix) with ESMTPS id F12F4D7E4
 for ; Fri, 30 Jul 2021 14:56:44 +0200 (CEST)
Received: from nmbug.tethera.net (localhost [127.0.0.1])
 by mail.notmuchmail.org (Postfix) with ESMTP id 98205291F4;
 Fri, 30 Jul 2021 08:56:26 -0400 (EDT)
Received: from fethera.tethera.net (fethera.tethera.net [198.245.60.197])
 by mail.notmuchmail.org (Postfix) with ESMTP id 9C61E291E8
 for ; Fri, 30 Jul 2021 08:56:15 -0400 (EDT)
Received: by fethera.tethera.net (Postfix, from userid 1001)
 id 971165FD17; Fri, 30 Jul 2021 08:56:15 -0400 (EDT)
Received: (nullmailer pid 2166877 invoked by uid 1000);
 Fri, 30 Jul 2021 12:56:10 -0000
From: David Bremner
To: notmuch@notmuchmail.org
Cc: David Bremner
Subject: [PATCH 06/27] lib/parse-sexp: parse single terms and the empty list.
Date: Fri, 30 Jul 2021 09:55:46 -0300
Message-Id: <20210730125607.2165433-7-david@tethera.net>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20210730125607.2165433-1-david@tethera.net>
References: <20210730125607.2165433-1-david@tethera.net>
MIME-Version: 1.0
Message-ID-Hash: 4EGYTKOFQYK3BWKROCOUY5S3NWTTCOXS
X-Message-ID-Hash: 4EGYTKOFQYK3BWKROCOUY5S3NWTTCOXS
X-MailFrom: bremner@tethera.net
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop;
 banned-address; member-moderation;
 header-match-notmuch.notmuchmail.org-0; nonmember-moderation;
 administrivia; implicit-dest; max-recipients; max-size;
 news-moderation; no-subject; suspicious-header
X-Mailman-Version: 3.2.1
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Help:
List-Post:
List-Subscribe:
List-Unsubscribe:
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Migadu-Flow: FLOW_IN
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=yhetil.org;
 s=key1; t=1627649805;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references:list-id:list-help:
 list-unsubscribe:list-subscribe:list-post;
 bh=t1BMOlY4XAjn/qfcAMbwZUvIZN0c+g05Yrpa/b/vQwI=;
 b=D1e+Vx9xHOoe8+krv2xY5wq+eW2GfxTasvqZfCqs1GkYqvAyByO93p4zYLX7SZEDH8AXvH
 odaCc9C3zm6Y2v0fm4dam4b4kKcgs7osWYMTJWW0/IKKlaUKymf/05A8nOQK9QySu/9jXl
 2LxdGJTyoJYqKfufNNfE1sjHxbPLBUDa/7wS8S9aqEpP0FW7ZdLjer9eP9LkcN+Mo2ZPCs
 AlP0TO8A5tPOmhO//Anx/IWFUVcRS3LgTPeWtnIrkqyAOEDBa3sypb+0HKnFbakQhqbntA
 /7xJjrUgCxtFnEb90Wzxvv1AEjbZLtu+znnkFaSLFX24G8E49cdg+1reXpsx9Q==
ARC-Seal: i=1; s=key1; d=yhetil.org; t=1627649805; a=rsa-sha256; cv=none;
 b=HtZQv7q6HdQpOPYrejYaoIOzJxk//SGs82uruQ27fJRt8MASji4cr8mVUXHCrGpGOkRH2o
 v+30fmgpAM7Mspx7xTu9hmz1AdqMnzAOQRQTvGTwh6I1KyLQ8tWg34yIJERjJLRgXS+/O6
 OJ1lvgNx2WjJ7QUFJrVM/IHMNg/AaLbuBmZUbCrQegRAJhaquJZBOkSzqsyDz78yw4Fp3I
 SFp5t7ShIsn5dk/b4LwySAzlyTB6L6KqqZoLy44Rfi4UaJrf275YaBjKOJpKQ/9baF4EEq
 g21GiCg55X6pOODdeYgr25Vp09HZNtqYhbZptOLQD1ABpmZZJGooJ1Fr3jm9DQ==
ARC-Authentication-Results: i=1; aspmx1.migadu.com; dkim=none; dmarc=none;
 spf=pass (aspmx1.migadu.com: domain of notmuch-bounces@notmuchmail.org
 designates 2607:5300:201:3100::1657 as permitted sender)
 smtp.mailfrom=notmuch-bounces@notmuchmail.org
X-Migadu-Spam-Score: 0.52
Authentication-Results: aspmx1.migadu.com; dkim=none; dmarc=none;
 spf=pass (aspmx1.migadu.com: domain of notmuch-bounces@notmuchmail.org
 designates 2607:5300:201:3100::1657 as permitted sender)
 smtp.mailfrom=notmuch-bounces@notmuchmail.org
X-Migadu-Queue-Id: F12F4D7E4
X-Spam-Score: 0.52
X-Migadu-Scanner: scn1.migadu.com
X-TUID: IZbXhmELa4bg

There is not much of a parser here yet, but it already does some
useful error reporting. Most functionality sketched in the
documentation is not implemented yet; detailed documentation will
follow with the implementation.
---
 doc/conf.py                       |  4 ++
 doc/index.rst                     |  1 +
 doc/man7/notmuch-sexp-queries.rst | 81 +++++++++++++++++++++++++++++++
 lib/Makefile.local                |  3 +-
 lib/database-private.h            |  7 +++
 lib/parse-sexp.cc                 | 54 +++++++++++++++++++++
 lib/query.cc                      |  6 +--
 test/T080-search.sh               |  5 --
 test/T081-sexpr-search.sh         | 65 +++++++++++++++++++++++++
 9 files changed, 216 insertions(+), 10 deletions(-)
 create mode 100644 doc/man7/notmuch-sexp-queries.rst
 create mode 100644 lib/parse-sexp.cc
 create mode 100755 test/T081-sexpr-search.sh

diff --git a/doc/conf.py b/doc/conf.py
index 4a4a3421..53becb00 100644
--- a/doc/conf.py
+++ b/doc/conf.py
@@ -159,6 +159,10 @@ man_pages = [
      u'syntax for notmuch queries',
      [notmuch_authors], 7),
 
+    ('man7/notmuch-sexp-queries', 'notmuch-sexp-queries',
+     u's-expression syntax for notmuch queries',
+     [notmuch_authors], 7),
+
     ('man1/notmuch-show', 'notmuch-show',
      u'show messages matching the given search terms',
      [notmuch_authors], 1),
diff --git a/doc/index.rst b/doc/index.rst
index a3bf3480..fbdcf779 100644
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -24,6 +24,7 @@ Contents:
    man1/notmuch-restore
    man1/notmuch-search
    man7/notmuch-search-terms
+   man7/notmuch-sexp-queries
    man1/notmuch-show
    man1/notmuch-tag
    python-bindings
diff --git a/doc/man7/notmuch-sexp-queries.rst b/doc/man7/notmuch-sexp-queries.rst
new file mode 100644
index 00000000..e530912c
--- /dev/null
+++ b/doc/man7/notmuch-sexp-queries.rst
@@ -0,0 +1,81 @@
+.. _notmuch-sexp-query(7):
+
+====================
+notmuch-sexp-queries
+====================
+
+SYNOPSIS
+========
+
+**notmuch** **search** ``--query-syntax=sexp`` '(and (to santa) (date december))'
+
+DESCRIPTION
+===========
+
+
+S-EXPRESSIONS
+-------------
+
+An *s-expression* is either an atom, or a list of whitespace-delimited
+s-expressions inside parentheses. Atoms are either
+
+*basic value*
+    A basic value is an unquoted string containing no whitespace, double quotes, or
+    parentheses.
+
+*quoted string*
+    Double quotes (") delimit strings possibly containing whitespace
+    or parentheses. These can contain double quote characters by
+    escaping with backslash. E.g. ``"this is a quote \""``.
+
+S-EXPRESSION QUERIES
+--------------------
+
+An s-expression query is either an atom, the empty list, or a
+*compound query* consisting of a prefix atom (first element) defining
+a *field*, *logical operation*, or *modifier*, and 0 or more
+subqueries.
+
+``*``
+``()``
+    The empty list matches all messages
+
+*term*
+    Match all messages containing *term*, possibly after stemming
+    or phrase splitting.
+
+``(`` *field* |q1| |q2| ... |qn| ``)``
+    Restrict the queries |q1| to |qn| to *field*, and combine with *and*
+    (for most fields) or *or*. See :any:`fields` for more information.
+
+``(`` *operator* |q1| |q2| ... |qn| ``)``
+    Combine queries |q1| to |qn|. See :any:`operators` for more information.
+
+``(`` *modifier* |q1| |q2| ... |qn| ``)``
+    Combine queries |q1| to |qn|, and reinterpret the result (e.g. as a regular expression).
+    See :any:`modifiers` for more information.
+
+.. _fields:
+
+FIELDS
+``````
+
+.. _operators:
+
+OPERATORS
+`````````
+
+.. _modifiers:
+
+MODIFIERS
+`````````
+
+EXAMPLES
+========
+
+``Wizard``
+    Match all messages containing the word "wizard", ignoring case.
+
+.. |q1| replace:: :math:`q_1`
+.. |q2| replace:: :math:`q_2`
+.. |qn| replace:: :math:`q_n`
diff --git a/lib/Makefile.local b/lib/Makefile.local
index e2d4b91d..1378a74b 100644
--- a/lib/Makefile.local
+++ b/lib/Makefile.local
@@ -63,7 +63,8 @@ libnotmuch_cxx_srcs = \
     $(dir)/features.cc \
     $(dir)/prefix.cc \
     $(dir)/open.cc \
-    $(dir)/init.cc
+    $(dir)/init.cc \
+    $(dir)/parse-sexp.cc
 
 libnotmuch_modules := $(libnotmuch_c_srcs:.c=.o) $(libnotmuch_cxx_srcs:.cc=.o)
 
diff --git a/lib/database-private.h b/lib/database-private.h
index 9706c17e..f206efaf 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -300,4 +300,11 @@ _notmuch_database_setup_standard_query_fields (notmuch_database_t *notmuch);
 notmuch_status_t
 _notmuch_database_setup_user_query_fields (notmuch_database_t *notmuch);
 
+#if __cplusplus
+/* parse-sexp.cc */
+notmuch_status_t
+_notmuch_sexp_string_to_xapian_query (notmuch_database_t *notmuch, const char *querystr,
+                                      Xapian::Query &output);
+#endif
+
 #endif
diff --git a/lib/parse-sexp.cc b/lib/parse-sexp.cc
new file mode 100644
index 00000000..1ce3c9d4
--- /dev/null
+++ b/lib/parse-sexp.cc
@@ -0,0 +1,54 @@
+#include <xapian.h>
+#include "notmuch-private.h"
+#include "sexp.h"
+
+#if HAVE_SFSEXP
+
+/* _sexp is used for file scope symbols to avoid clashing with
+ * definitions from sexp.h */
+
+/* Here we expect the s-expression to be a proper list, with first
+ * element defining an operation, or as a special case the empty
+ * list */
+
+static notmuch_status_t
+_sexp_to_xapian_query (notmuch_database_t *notmuch, const sexp_t *sx,
+                       Xapian::Query &output)
+{
+
+    if (sx->ty == SEXP_VALUE) {
+        output = Xapian::Query (Xapian::Unicode::tolower (sx->val));
+        return NOTMUCH_STATUS_SUCCESS;
+    }
+
+    /* Empty list */
+    if (! sx->list) {
+        output = Xapian::Query::MatchAll;
+        return NOTMUCH_STATUS_SUCCESS;
+    }
+
+    if (sx->list->ty == SEXP_VALUE)
+        _notmuch_database_log (notmuch, "unknown prefix '%s'\n", sx->list->val);
+    else
+        _notmuch_database_log (notmuch,
+                               "unexpected list in field/operation position\n");
+
+    return NOTMUCH_STATUS_BAD_QUERY_SYNTAX;
+}
+
+notmuch_status_t
+_notmuch_sexp_string_to_xapian_query (notmuch_database_t *notmuch, const char *querystr,
+                                      Xapian::Query &output)
+{
+    const sexp_t *sx = NULL;
+    char *buf = talloc_strdup (notmuch, querystr);
+
+    sx = parse_sexp (buf, strlen (querystr));
+    if (! sx) {
+        _notmuch_database_log (notmuch, "invalid s-expression: '%s'\n", querystr);
+        return NOTMUCH_STATUS_BAD_QUERY_SYNTAX;
+    }
+
+    return _sexp_to_xapian_query (notmuch, sx, output);
+}
+#endif
diff --git a/lib/query.cc b/lib/query.cc
index 12fd9482..435f7229 100644
--- a/lib/query.cc
+++ b/lib/query.cc
@@ -23,8 +23,6 @@
 
 #include <glib.h> /* GHashTable, GPtrArray */
 
-#include "sexp.h"
-
 struct _notmuch_query {
     notmuch_database_t *notmuch;
     const char *query_string;
@@ -208,8 +206,8 @@ _notmuch_query_ensure_parsed_sexpr (notmuch_query_t *query)
     if (query->parsed)
         return NOTMUCH_STATUS_SUCCESS;
 
-    query->xapian_query = Xapian::Query::MatchAll;
-    return NOTMUCH_STATUS_SUCCESS;
+    return _notmuch_sexp_string_to_xapian_query (query->notmuch, query->query_string,
+                                                 query->xapian_query);
 }
 
 static notmuch_status_t
diff --git a/test/T080-search.sh b/test/T080-search.sh
index 966e772a..a3f0dead 100755
--- a/test/T080-search.sh
+++ b/test/T080-search.sh
@@ -189,9 +189,4 @@ test_begin_subtest "parts do not have adjacent term positions"
 output=$(notmuch search id:termpos and '"c x"')
 test_expect_equal "$output" ""
 
-test_begin_subtest "sexpr query: all messages"
-notmuch search '*' > EXPECTED
-notmuch search --query-syntax=sexp '()' > OUTPUT
-test_expect_equal_file EXPECTED OUTPUT
-
 test_done
diff --git a/test/T081-sexpr-search.sh b/test/T081-sexpr-search.sh
new file mode 100755
index 00000000..3ee9f71d
--- /dev/null
+++ b/test/T081-sexpr-search.sh
@@ -0,0 +1,65 @@
+#!/usr/bin/env bash
+test_description='"notmuch search" in several variations'
+. $(dirname "$0")/test-lib.sh || exit 1
+
+if [ $NOTMUCH_HAVE_SFSEXP -ne 1 ]; then
+    printf "Skipping due to missing sfsexp library\n"
+    test_done
+fi
+
+add_email_corpus
+
+test_begin_subtest "all messages: ()"
+notmuch search '*' > EXPECTED
+notmuch search --query-syntax=sexp "()" > OUTPUT
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest "single term in body"
+notmuch search --query-syntax=sexp 'wizard' | notmuch_search_sanitize>OUTPUT
+cat <<EOF > EXPECTED
+thread:XXX   2009-11-18 [1/3] Carl Worth| Jan Janak; [notmuch] What a great idea! (inbox unread)
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest "single term in body (case insensitive)"
+notmuch search --query-syntax=sexp 'Wizard' | notmuch_search_sanitize>OUTPUT
+cat <<EOF > EXPECTED
+thread:XXX   2009-11-18 [1/3] Carl Worth| Jan Janak; [notmuch] What a great idea! (inbox unread)
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest "single term in body, stemmed version"
+test_subtest_known_broken
+notmuch search arriv > EXPECTED
+notmuch search --query-syntax=sexp arriv > OUTPUT
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest "Unbalanced parens"
+# A code 1 indicates the error was handled (a crash will return e.g. 139).
+test_expect_code 1 "notmuch search --query-syntax=sexp '('"
+
+test_begin_subtest "Unbalanced parens, error message"
+notmuch search --query-syntax=sexp '(' >OUTPUT 2>&1
+cat <<EOF > EXPECTED
+notmuch search: Syntax error in query
+invalid s-expression: '('
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest "unknown prefix"
+notmuch search --query-syntax=sexp '(foo)' >OUTPUT 2>&1
+cat <<EOF > EXPECTED
+notmuch search: Syntax error in query
+unknown prefix 'foo'
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest "list as prefix"
+notmuch search --query-syntax=sexp '((foo))' >OUTPUT 2>&1
+cat <<EOF > EXPECTED
+notmuch search: Syntax error in query
+unexpected list in field/operation position
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_done
-- 
2.30.2
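
Note for readers: the four outcomes of _sexp_to_xapian_query in this
patch (lowercased term, match-all for the empty list, "unknown prefix",
and "unexpected list") can be tried without sfsexp or Xapian using the
rough sketch below. Everything in it is a hypothetical stand-in and not
part of the patch: Node replaces sexp_t, Result replaces the
Xapian::Query output plus the database log, the string "<match-all>"
replaces Xapian::Query::MatchAll, and ASCII std::tolower replaces
Xapian::Unicode::tolower.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical stand-in for sfsexp's sexp_t: an atom carries a value,
// a list points at its first element (nullptr for the empty list).
enum class Kind { Atom, List };

struct Node {
    Kind kind;
    std::string val;    // meaningful only for atoms
    const Node *first;  // meaningful only for lists
};

enum class Status { Success, BadQuerySyntax };

// Hypothetical stand-in for the (Xapian::Query, log message) pair.
struct Result {
    Status status;
    std::string query;  // lowercased term, or "<match-all>"
    std::string log;    // error text, empty on success
};

// Mirrors the order of checks in the patch: atom, empty list, then errors.
Result classify (const Node &sx)
{
    if (sx.kind == Kind::Atom) {
        std::string term = sx.val;
        // ASCII-only lowercasing; the patch uses Xapian::Unicode::tolower.
        std::transform (term.begin (), term.end (), term.begin (),
                        [] (unsigned char c) {
                            return static_cast<char> (std::tolower (c));
                        });
        return { Status::Success, term, "" };
    }

    /* Empty list */
    if (! sx.first)
        return { Status::Success, "<match-all>", "" };

    if (sx.first->kind == Kind::Atom)
        return { Status::BadQuerySyntax, "",
                 "unknown prefix '" + sx.first->val + "'\n" };

    return { Status::BadQuerySyntax, "",
             "unexpected list in field/operation position\n" };
}
```

With these stand-ins, an atom "Wizard" yields the term "wizard", and a
list whose first element is the atom "foo" yields the same "unknown
prefix 'foo'" text that T081 expects for '(foo)'.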