unofficial mirror of notmuch@notmuchmail.org
From: Mark Walters <markwalters1009@gmail.com>
To: david@tethera.net, notmuch@notmuchmail.org
Subject: Re: v9 of batch tagging
Date: Mon, 24 Dec 2012 02:34:33 +0000
Message-ID: <8738yw2n5y.fsf@qmul.ac.uk>
In-Reply-To: <1356313183-9266-1-git-send-email-david@tethera.net>


On Mon, 24 Dec 2012, david@tethera.net wrote:
> This obsoletes 
>
>      id:1356095307-22895-1-git-send-email-david@tethera.net
>
> The main changes since v8 are the rebasing against the notmuch-restore
> fixes in master, and the rewrite of the query (pre)-processing
> unhex_and_quote. This incorporates the changes of
>
>       id:1356231570-28232-1-git-send-email-david@tethera.net
>
> and now handles '()' (cf. id:87a9t5p4dz.fsf@qmul.ac.uk)
>
> With respect to 
>
> ,----
> | Finally, I don't know if a query can contain a : without being a
> | prefix query. If it can that could end up being misquoted.
> `----
>
> This is pretty easy to work around by encoding that :. I think unless
> it is a problem in practice I prefer not to keep an explicit list of
> prefixes here; recognizing prefixes should really be a service from
> libnotmuch.

I am quite happy with this.
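
To make the corner case concrete, here is a minimal sketch of the prefix
test as I read it from the interdiff below. The helper name
prefix_with_colon_len is hypothetical and not part of the patch; it
assumes tok is NUL-terminated and tok_len is its length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: return the length of a
 * leading "prefix:" (including the ':'), or 0 if the token does not
 * look like a prefixed term. */
static size_t
prefix_with_colon_len (const char *tok, size_t tok_len)
{
    size_t prefix_len;

    assert (tok != NULL);

    /* Same superset-of-prefixes test as the interdiff: a run of
     * lower case ASCII letters. */
    prefix_len = strspn (tok, "abcdefghijklmnopqrstuvwxyz");

    /* Need room for the ':' plus at least one more character. */
    if (prefix_len < tok_len && tok_len - prefix_len >= 2 &&
        tok[prefix_len] == ':')
        return prefix_len + 1;

    return 0;
}
```

With this rule "tag:inbox" splits after "tag:", but so does a term such
as "http://example.org", which is the misquoting case I had in mind. A
term whose ':' has been encoded no longer matches, which is the
workaround described above.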

> I dropped two patches (strnspn and hex_invariant), but picked up a new
> strtok variation. Probably the name strtok_len2 could be improved
> (and I see there is a typo in the patch subject).
>
>  [Patch v9 05/17] util/string-util: add a new string tokenized
>

Patches 5 and 6 look good to me.
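
For anyone following along without the patches to hand, the idea of a
tokenizer that reports both the token length and the length of the
delimiter run after it can be sketched roughly as below. This is not
the strtok_len2 from patch 5: next_tok and bracket_tokens are
hypothetical names, and allowing an empty token when the input starts
with a delimiter is my own assumption, made so that no delimiter is
ever dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tokenizer, illustrating the idea only.  Starting at
 * str, report a (possibly empty) token of tok_len bytes followed by a
 * run of delim_len delimiter bytes.  Returns NULL at end of string.
 * The input is never modified. */
static const char *
next_tok (const char *str, const char *delims,
          size_t *tok_len, size_t *delim_len)
{
    assert (str != NULL && delims != NULL);

    if (*str == '\0')
        return NULL;

    *tok_len = strcspn (str, delims);
    *delim_len = strspn (str + *tok_len, delims);
    return str;
}

/* Copy src to dst token by token, wrapping each non-empty token in
 * square brackets and copying the delimiters unchanged.  dst must
 * have room for 3 * strlen (src) + 1 bytes. */
static void
bracket_tokens (const char *src, char *dst)
{
    const char *tok = src;
    size_t tok_len = 0, delim_len = 0;

    while ((tok = next_tok (tok + tok_len + delim_len, " ()",
                            &tok_len, &delim_len)) != NULL) {
        if (tok_len > 0) {
            *dst++ = '[';
            memcpy (dst, tok, tok_len);
            dst += tok_len;
            *dst++ = ']';
        }
        /* Preserve the delimiters exactly as they appeared. */
        memcpy (dst, tok + tok_len, delim_len);
        dst += delim_len;
    }
    *dst = '\0';
}
```

The loop has the same shape as the one in unhex_and_quote: advance by
tok_len + delim_len, transform the token, then copy the delimiters
through untouched.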

> Finally I added a test for the new parenthesis handling.

My recollection is that dump prints the messages unsorted. Does this
mean that we could get unstable results for these tests (e.g. with
different Xapian versions)?
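
If the order does turn out to vary, one option would be to compare the
output in a way that ignores it, for example by sorting both sides
first. The sketch below only illustrates that idea in C; the function
names are hypothetical and nothing here comes from the test suite.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings. */
static int
cmp_lines (const void *a, const void *b)
{
    return strcmp (*(const char *const *) a, *(const char *const *) b);
}

/* Hypothetical helper: return 1 if the two arrays hold the same n
 * lines in any order, 0 otherwise.  Both arrays are sorted in place. */
static int
same_lines_any_order (const char **got, const char **expected, size_t n)
{
    size_t i;

    assert (got != NULL && expected != NULL);

    qsort (got, n, sizeof (*got), cmp_lines);
    qsort (expected, n, sizeof (*expected), cmp_lines);

    for (i = 0; i < n; i++)
        if (strcmp (got[i], expected[i]) != 0)
            return 0;

    return 1;
}
```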

Best wishes

Mark

>
> [Patch v9 17/17] test/tagging: add test for handling of parens
>


> Fixup wise, the tests needed to be adjusted a bit for () being delimiters, 
> and the man page as well.
>
> I added the fclose in id:87wqw9hf9a.fsf@oiva.home.nikula.org
>
> And I modified the return value per id:87zk15hi7f.fsf@oiva.home.nikula.org
>
> Here is the interdiff for unhex_and_quote:
>
> commit 67c6aee87db5c7da25529e1c0feb64e422abb4b7
> Author: David Bremner <bremner@unb.ca>
> Date:   Sat Dec 22 22:49:02 2012 -0400
>
>     simplify unhex_and_quote, support parens
>     
>     the overgeneral definition of a prefix can be replaced by lower case
>     alphabetic, and still work fine with current notmuch query syntax.
>     
>     use () as delimiters in unhex_and_quote, preserve delimiters
>
> diff --git a/tag-util.c b/tag-util.c
> index 6f62fe6..91f3603 100644
> --- a/tag-util.c
> +++ b/tag-util.c
> @@ -56,6 +56,21 @@ illegal_tag (const char *tag, notmuch_bool_t remove)
>      return NULL;
>  }
>  
> +/* Factor out the boilerplate to append a token to the query string.
> + * For use in unhex_and_quote */
> +
> +static tag_parse_status_t
> +append_tok (const char *tok, size_t tok_len,
> +	    const char *line_for_error, char **query_string)
> +{
> +
> +    *query_string = talloc_strndup_append_buffer (*query_string, tok, tok_len);
> +    if (*query_string == NULL)
> +	return line_error (TAG_PARSE_OUT_OF_MEMORY, line_for_error, "aborting");
> +
> +    return TAG_PARSE_SUCCESS;
> +}
> +
>  /* Input is a hex encoded string, presumed to be a query for Xapian.
>   *
>   * Space delimited tokens are decoded and quoted, with '*' and prefixes
> @@ -67,45 +82,41 @@ unhex_and_quote (void *ctx, char *encoded, const char *line_for_error,
>  {
>      char *tok = encoded;
>      size_t tok_len = 0;
> +    size_t delim_len = 0;
>      char *buf = NULL;
>      size_t buf_len = 0;
>      tag_parse_status_t ret = TAG_PARSE_SUCCESS;
>  
>      *query_string = talloc_strdup (ctx, "");
>  
> -    while ((tok = strtok_len (tok + tok_len, " ", &tok_len)) != NULL) {
> +    while ((tok = strtok_len2 (tok + tok_len + delim_len, " ()",
> +			       &tok_len, &delim_len)) != NULL) {
>  
>  	size_t prefix_len;
>  	char delim = *(tok + tok_len);
>  
> -	*(tok + tok_len++) = '\0';
> +	*(tok + tok_len) = '\0';
>  
> -	prefix_len = hex_invariant (tok, tok_len);
> +	/* The following matches a superset of prefixes currently
> +	 * used by notmuch */
> +	prefix_len = strspn (tok, "abcdefghijklmnopqrstuvwxyz");
>  
> -	if ((strcmp (tok, "*") == 0) || prefix_len >= tok_len - 1) {
> +	if ((strcmp (tok, "*") == 0) || prefix_len == tok_len) {
>  
>  	    /* pass some things through without quoting or decoding.
>  	     * Note for '*' this is mandatory.
>  	     */
>  
> -	    if (! (*query_string = talloc_asprintf_append_buffer (
> -		       *query_string, "%s%c", tok, delim))) {
> -
> -		ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
> -				  line_for_error, "aborting");
> -		goto DONE;
> -	    }
> +	    ret = append_tok (tok, tok_len, line_for_error, query_string);
> +	    if (ret) goto DONE;
>  
>  	} else {
>  	    /* potential prefix: one for ':', then something after */
> -	    if ((tok_len - prefix_len > 2) && *(tok + prefix_len) == ':') {
> -		if (! (*query_string = talloc_strndup_append (*query_string,
> -							      tok,
> -							      prefix_len + 1))) {
> -		    ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
> -				      line_for_error, "aborting");
> -		    goto DONE;
> -		}
> +	    if ((tok_len - prefix_len >= 2) && *(tok + prefix_len) == ':') {
> +		ret = append_tok (tok, prefix_len + 1,
> +				  line_for_error, query_string);
> +		if (ret) goto DONE;
> +
>  		tok += prefix_len + 1;
>  		tok_len -= prefix_len + 1;
>  	    }
> @@ -122,13 +133,15 @@ unhex_and_quote (void *ctx, char *encoded, const char *line_for_error,
>  		goto DONE;
>  	    }
>  
> -	    if (! (*query_string = talloc_asprintf_append_buffer (
> -		       *query_string, "%s%c", buf, delim))) {
> -		ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
> -				  line_for_error, "aborting");
> -		goto DONE;
> -	    }
> +	    ret = append_tok (buf, buf_len, line_for_error, query_string);
> +	    if (ret) goto DONE;
>  	}
> +	/* restore the string */
> +	*(tok + tok_len) = delim;
> +
> +	/* copy any delimiters */
> +	ret = append_tok (tok + tok_len, delim_len, line_for_error, query_string);
> +	if (ret) goto DONE;
>      }
>  
>    DONE:
>

