From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from localhost (localhost [127.0.0.1])
 by olra.theworths.org (Postfix) with ESMTP id EBA0B431FC0
 for ; Sun, 23 Dec 2012 17:40:10 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
X-Spam-Flag: NO
X-Spam-Score: 0
X-Spam-Level:
X-Spam-Status: No, score=0 tagged_above=-999 required=5 tests=[none]
 autolearn=disabled
Received: from olra.theworths.org ([127.0.0.1])
 by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id WiUA2wl4Vb2W for ; Sun, 23 Dec 2012 17:40:09 -0800 (PST)
Received: from tesseract.cs.unb.ca (tesseract.cs.unb.ca [131.202.240.238])
 (using TLSv1 with cipher AES256-SHA (256/256 bits))
 (No client certificate requested)
 by olra.theworths.org (Postfix) with ESMTPS id 8E3EF431FC3
 for ; Sun, 23 Dec 2012 17:40:02 -0800 (PST)
Received: from fctnnbsc30w-156034082078.dhcp-dynamic.fibreop.nb.bellaliant.net
 ([156.34.82.78] helo=zancas.localnet)
 by tesseract.cs.unb.ca with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16)
 (Exim 4.72) (envelope-from ) id 1Tmx1M-0008Kp-1r;
 Sun, 23 Dec 2012 21:40:00 -0400
Received: from bremner by zancas.localnet with local (Exim 4.80)
 (envelope-from ) id 1Tmx1G-0002ng-HD;
 Sun, 23 Dec 2012 21:39:54 -0400
From: david@tethera.net
To: notmuch@notmuchmail.org
Subject: [Patch v9 06/17] unhex_and_quote: new function to quote hex-decoded queries
Date: Sun, 23 Dec 2012 21:39:32 -0400
Message-Id: <1356313183-9266-7-git-send-email-david@tethera.net>
X-Mailer: git-send-email 1.7.10.4
In-Reply-To: <1356313183-9266-1-git-send-email-david@tethera.net>
References: <1356313183-9266-1-git-send-email-david@tethera.net>
X-Spam_bar: -
Cc: David Bremner
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 24 Dec 2012 01:40:11 -0000

From: David Bremner

Space delimited tokens are hex decoded and then quoted according to
Xapian rules. Prefixes and '*' are passed through unquoted, as is
anything that hex-decoding would not change.
---
 tag-util.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 94 insertions(+)

diff --git a/tag-util.c b/tag-util.c
index 935c8d9..b9b6099 100644
--- a/tag-util.c
+++ b/tag-util.c
@@ -56,6 +56,100 @@ illegal_tag (const char *tag, notmuch_bool_t remove)
     return NULL;
 }
 
+/* Factor out the boilerplate to append a token to the query string.
+ * For use in unhex_and_quote */
+
+static tag_parse_status_t
+append_tok (const char *tok, size_t tok_len,
+            const char *line_for_error, char **query_string)
+{
+
+    *query_string = talloc_strndup_append_buffer (*query_string, tok, tok_len);
+    if (*query_string == NULL)
+        return line_error (TAG_PARSE_OUT_OF_MEMORY, line_for_error, "aborting");
+
+    return TAG_PARSE_SUCCESS;
+}
+
+/* Input is a hex encoded string, presumed to be a query for Xapian.
+ *
+ * Space delimited tokens are decoded and quoted, with '*' and prefixes
+ * of the form "foo:" passed through unquoted.
+ */
+static tag_parse_status_t
+unhex_and_quote (void *ctx, char *encoded, const char *line_for_error,
+                 char **query_string)
+{
+    char *tok = encoded;
+    size_t tok_len = 0;
+    size_t delim_len = 0;
+    char *buf = NULL;
+    size_t buf_len = 0;
+    tag_parse_status_t ret = TAG_PARSE_SUCCESS;
+
+    *query_string = talloc_strdup (ctx, "");
+
+    while ((tok = strtok_len2 (tok + tok_len + delim_len, " ()",
+                               &tok_len, &delim_len)) != NULL) {
+
+        size_t prefix_len;
+        char delim = *(tok + tok_len);
+
+        *(tok + tok_len) = '\0';
+
+        /* The following matches a superset of prefixes currently
+         * used by notmuch */
+        prefix_len = strspn (tok, "abcdefghijklmnopqrstuvwxyz");
+
+        if ((strcmp (tok, "*") == 0) || prefix_len == tok_len) {
+
+            /* pass some things through without quoting or decoding.
+             * Note for '*' this is mandatory.
+             */
+
+            ret = append_tok (tok, tok_len, line_for_error, query_string);
+            if (ret) goto DONE;
+
+        } else {
+            /* potential prefix: one for ':', then something after */
+            if ((tok_len - prefix_len >= 2) && *(tok + prefix_len) == ':') {
+                ret = append_tok (tok, prefix_len + 1,
+                                  line_for_error, query_string);
+                if (ret) goto DONE;
+
+                tok += prefix_len + 1;
+                tok_len -= prefix_len + 1;
+            }
+
+            if (hex_decode_inplace (tok) != HEX_SUCCESS) {
+                ret = line_error (TAG_PARSE_INVALID, line_for_error,
+                                  "hex decoding of token '%s' failed", tok);
+                goto DONE;
+            }
+
+            if (double_quote_str (ctx, tok, &buf, &buf_len)) {
+                ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
+                                  line_for_error, "aborting");
+                goto DONE;
+            }
+
+            ret = append_tok (buf, buf_len, line_for_error, query_string);
+            if (ret) goto DONE;
+        }
+        /* restore the string */
+        *(tok + tok_len) = delim;
+
+        /* copy any delimiters */
+        ret = append_tok (tok + tok_len, delim_len, line_for_error, query_string);
+        if (ret) goto DONE;
+    }
+
+  DONE:
+    if (ret != TAG_PARSE_SUCCESS && *query_string)
+        talloc_free (*query_string);
+    return ret;
+}
+
 tag_parse_status_t
 parse_tag_line (void *ctx, char *line,
                 tag_op_flag_t flags,
-- 
1.7.10.4
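
For readers without the rest of the series at hand, here is a rough,
standalone sketch of what the decode-and-quote step does to a single
token once any "foo:" prefix has been split off. It is not the code
from this patch: sketch_unhex_and_quote is a hypothetical stand-in for
the hex_decode_inplace and double_quote_str helpers, and it assumes
that escapes have the form "%XX" and that quoting means wrapping in
double quotes with any embedded '"' doubled. It writes into a fixed
caller-supplied buffer instead of using talloc, and it does not treat a
decoded NUL byte specially.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in (not notmuch's hex_decode_inplace or
 * double_quote_str): decode "%XX" escapes in tok, then write the
 * result to out wrapped in double quotes, doubling any embedded '"'.
 * Returns 0 on success, -1 on a malformed escape or if out might be
 * too small (the size check is deliberately conservative). */
static int
sketch_unhex_and_quote (const char *tok, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size < 3)   /* room for two quotes and the NUL */
        return -1;
    out[n++] = '"';

    while (*tok != '\0') {
        unsigned char c = (unsigned char) *tok++;

        if (c == '%') {
            char hex[3];

            /* Both hex digits must be present; the short-circuit
             * avoids reading past the terminating NUL. */
            if (! isxdigit ((unsigned char) tok[0]) ||
                ! isxdigit ((unsigned char) tok[1]))
                return -1;
            hex[0] = tok[0];
            hex[1] = tok[1];
            hex[2] = '\0';
            c = (unsigned char) strtoul (hex, NULL, 16);
            tok += 2;
        }

        /* Worst case: a doubled quote, the closing quote, the NUL. */
        if (n + 4 > out_size)
            return -1;
        if (c == '"')
            out[n++] = '"';
        out[n++] = (char) c;
    }

    out[n++] = '"';
    out[n] = '\0';
    return 0;
}
```

Under those assumptions the token hello%20world comes out as
"hello world" (with the quotes), a%22b comes out as "a""b", and a
truncated escape such as bad%2 is rejected.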