unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin"
@ 2012-04-14 12:15 Jani Nikula
  2012-04-14 12:15 ` [PATCH v2 1/6] hex-escape: (en|de)code strings to/from restricted character set Jani Nikula
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Jani Nikula @ 2012-04-14 12:15 UTC (permalink / raw)
  To: notmuch

v2 of id:"cover.1333231401.git.jani@nikula.org"

This series adds support for batch tagging through stdin to "notmuch
tag". This should be useful and efficient for e.g. initial tagging
scripts. Also, this adds locking around a batch of tag changes, which is
also useful for initial tagging scripts. See the test patch for an
example using a "here document".

The idea is that the same format (and parser) would be later used for
notmuch dump/restore, building on David's earlier work at
id:"1324214111-32079-1-git-send-email-david@tethera.net".

The v2 only has non-functional changes per David's comments, and
squashes the first three patches of v1 into one.

BR,
Jani.


David Bremner (3):
  hex-escape: (en|de)code strings to/from restricted character set
  test/hex-xcode: new test binary
  test/hex-escaping: new test for hex escaping routines

Jani Nikula (3):
  cli: add support for batch tagging operations to "notmuch tag"
  test: add test for notmuch tag --stdin option
  man: document notmuch tag --stdin option

 man/man1/notmuch-tag.1 |   44 +++++++++-
 notmuch-tag.c          |  244 ++++++++++++++++++++++++++++++++++++++++++------
 test/.gitignore        |    1 +
 test/Makefile.local    |    6 +-
 test/basic             |    2 +-
 test/hex-escaping      |   26 +++++
 test/hex-xcode.c       |  103 ++++++++++++++++++++
 test/notmuch-test      |    1 +
 test/tagging           |   13 +++
 util/Makefile.local    |    2 +-
 util/hex-escape.c      |  168 +++++++++++++++++++++++++++++++++
 util/hex-escape.h      |   41 ++++++++
 12 files changed, 616 insertions(+), 35 deletions(-)
 create mode 100755 test/hex-escaping
 create mode 100644 test/hex-xcode.c
 create mode 100644 util/hex-escape.c
 create mode 100644 util/hex-escape.h

-- 
1.7.5.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2 1/6] hex-escape: (en|de)code strings to/from restricted character set
  2012-04-14 12:15 [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin" Jani Nikula
@ 2012-04-14 12:15 ` Jani Nikula
  2012-04-14 12:15 ` [PATCH v2 2/6] test/hex-xcode: new test binary Jani Nikula
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Jani Nikula @ 2012-04-14 12:15 UTC (permalink / raw)
  To: notmuch

From: David Bremner <bremner@debian.org>

The character set is chosen to be suitable for pathnames, and the same
as that used by contrib/nmbug

[With additions by Jani Nikula]
---
 util/Makefile.local |    2 +-
 util/hex-escape.c   |  168 +++++++++++++++++++++++++++++++++++++++++++++++++++
 util/hex-escape.h   |   41 ++++++++++++
 3 files changed, 210 insertions(+), 1 deletions(-)
 create mode 100644 util/hex-escape.c
 create mode 100644 util/hex-escape.h

diff --git a/util/Makefile.local b/util/Makefile.local
index c7cae61..3ca623e 100644
--- a/util/Makefile.local
+++ b/util/Makefile.local
@@ -3,7 +3,7 @@
 dir := util
 extra_cflags += -I$(srcdir)/$(dir)
 
-libutil_c_srcs := $(dir)/xutil.c $(dir)/error_util.c
+libutil_c_srcs := $(dir)/xutil.c $(dir)/error_util.c $(dir)/hex-escape.c
 
 libutil_modules := $(libutil_c_srcs:.c=.o)
 
diff --git a/util/hex-escape.c b/util/hex-escape.c
new file mode 100644
index 0000000..d8905d0
--- /dev/null
+++ b/util/hex-escape.c
@@ -0,0 +1,168 @@
+/* hex-escape.c -  Manage encoding and decoding of byte strings into path names
+ *
+ * Copyright (c) 2011 David Bremner
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see http://www.gnu.org/licenses/ .
+ *
+ * Author: David Bremner <david@tethera.net>
+ */
+
+#include <assert.h>
+#include <string.h>
+#include <talloc.h>
+#include <ctype.h>
+#include "error_util.h"
+#include "hex-escape.h"
+
+static const size_t default_buf_size = 1024;
+
+static const char *output_charset =
+    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+-_@=.:,";
+
+static const char escape_char = '%';
+
+static int
+is_output (char c)
+{
+    return (strchr (output_charset, c) != NULL);
+}
+
+static int
+maybe_realloc (void *ctx, size_t needed, char **out, size_t *out_size)
+{
+    if (*out_size < needed) {
+
+	if (*out == NULL)
+	    *out = talloc_size (ctx, needed);
+	else
+	    *out = talloc_realloc (ctx, *out, char, needed);
+
+	if (*out == NULL)
+	    return 0;
+
+	*out_size = needed;
+    }
+    return 1;
+}
+
+hex_status_t
+hex_encode (void *ctx, const char *in, char **out, size_t *out_size)
+{
+
+    const unsigned char *p;
+    char *q;
+
+    size_t escape_count = 0;
+    size_t len = 0;
+    size_t needed;
+
+    assert (ctx); assert (in); assert (out); assert (out_size);
+
+    for (p = (unsigned char *) in; *p; p++) {
+	escape_count += (!is_output (*p));
+	len++;
+    }
+
+    needed = len + escape_count * 2 + 1;
+
+    if (*out == NULL)
+	*out_size = 0;
+
+    if (!maybe_realloc (ctx, needed, out, out_size))
+	return HEX_OUT_OF_MEMORY;
+
+    q = *out;
+    p = (unsigned char *) in;
+
+    while (*p) {
+	if (is_output (*p)) {
+	    *q++ = *p++;
+	} else {
+	    sprintf (q, "%%%02x", *p++);
+	    q += 3;
+	}
+    }
+
+    *q = '\0';
+    return HEX_SUCCESS;
+}
+
+/* Hex decode 'in' to 'out'.
+ *
+ * This must succeed for in == out to support hex_decode_inplace().
+ */
+static hex_status_t
+hex_decode_internal (const char *in, unsigned char *out)
+{
+    char buf[3];
+
+    while (*in) {
+	if (*in == escape_char) {
+	    char *endp;
+
+	    /* This also handles unexpected end-of-string. */
+	    if (!isxdigit ((unsigned char) in[1]) ||
+		!isxdigit ((unsigned char) in[2]))
+		return HEX_SYNTAX_ERROR;
+
+	    buf[0] = in[1];
+	    buf[1] = in[2];
+	    buf[2] = '\0';
+
+	    *out = strtoul (buf, &endp, 16);
+
+	    if (endp != buf + 2)
+		return HEX_SYNTAX_ERROR;
+
+	    in += 3;
+	    out++;
+	} else {
+	    *out++ = *in++;
+	}
+    }
+
+    *out = '\0';
+
+    return HEX_SUCCESS;
+}
+
+hex_status_t
+hex_decode_inplace (char *s)
+{
+    /* A decoded string is never longer than the encoded one, so it is
+     * safe to decode a string onto itself. */
+    return hex_decode_internal (s, (unsigned char *) s);
+}
+
+hex_status_t
+hex_decode (void *ctx, const char *in, char **out, size_t * out_size)
+{
+    const char *p;
+    size_t escape_count = 0;
+    size_t needed = 0;
+
+    assert (ctx); assert (in); assert (out); assert (out_size);
+
+    size_t len = strlen (in);
+
+    for (p = in; *p; p++)
+	escape_count += (*p == escape_char);
+
+    needed = len - escape_count * 2 + 1;
+
+    if (!maybe_realloc (ctx, needed, out, out_size))
+	return HEX_OUT_OF_MEMORY;
+
+    return hex_decode_internal (in, (unsigned char *) *out);
+}
diff --git a/util/hex-escape.h b/util/hex-escape.h
new file mode 100644
index 0000000..5182042
--- /dev/null
+++ b/util/hex-escape.h
@@ -0,0 +1,41 @@
+#ifndef _HEX_ESCAPE_H
+#define _HEX_ESCAPE_H
+
+typedef enum hex_status {
+    HEX_SUCCESS = 0,
+    HEX_SYNTAX_ERROR,
+    HEX_OUT_OF_MEMORY
+} hex_status_t;
+
+/*
+ * The API for hex_encode() and hex_decode() is modelled on that for
+ * getline.
+ *
+ * If 'out' points to a NULL pointer a char array of the appropriate
+ * size is allocated using talloc, and out_size is updated.
+ *
+ * If 'out' points to a non-NULL pointer, it assumed to describe an
+ * existing char array, with the size given in *out_size.  This array
+ * may be resized by talloc_realloc if needed; in this case *out_size
+ * will also be updated.
+ *
+ * Note that it is an error to pass a NULL pointer for any parameter
+ * of these routines.
+ */
+
+hex_status_t
+hex_encode (void *talloc_ctx, const char *in, char **out,
+            size_t *out_size);
+
+hex_status_t
+hex_decode (void *talloc_ctx, const char *in, char **out,
+            size_t *out_size);
+
+/*
+ * Non-allocating hex decode to decode 's' in-place. The length of the
+ * result is always equal to or shorter than the length of the
+ * original.
+ */
+hex_status_t
+hex_decode_inplace (char *s);
+#endif
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 2/6] test/hex-xcode: new test binary
  2012-04-14 12:15 [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin" Jani Nikula
  2012-04-14 12:15 ` [PATCH v2 1/6] hex-escape: (en|de)code strings to/from restricted character set Jani Nikula
@ 2012-04-14 12:15 ` Jani Nikula
  2012-04-14 12:15 ` [PATCH v2 3/6] test/hex-escaping: new test for hex escaping routines Jani Nikula
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Jani Nikula @ 2012-04-14 12:15 UTC (permalink / raw)
  To: notmuch

From: David Bremner <bremner@debian.org>

This program is used both as a test-bed/unit-tester for
../util/hex-escape.c, and also as a utility in future tests of dump
and restore.
---
 test/.gitignore     |    1 +
 test/Makefile.local |    6 ++-
 test/basic          |    2 +-
 test/hex-xcode.c    |  103 +++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 110 insertions(+), 2 deletions(-)
 create mode 100644 test/hex-xcode.c

diff --git a/test/.gitignore b/test/.gitignore
index e63c689..be7ab5e 100644
--- a/test/.gitignore
+++ b/test/.gitignore
@@ -3,4 +3,5 @@ corpus.mail
 smtp-dummy
 symbol-test
 arg-test
+hex-xcode
 tmp.*
diff --git a/test/Makefile.local b/test/Makefile.local
index 4a6a4b1..9cecd28 100644
--- a/test/Makefile.local
+++ b/test/Makefile.local
@@ -13,6 +13,9 @@ smtp_dummy_modules = $(smtp_dummy_srcs:.c=.o)
 $(dir)/arg-test: $(dir)/arg-test.o command-line-arguments.o util/libutil.a
 	$(call quiet,CC) -I. $^ -o $@
 
+$(dir)/hex-xcode: $(dir)/hex-xcode.o command-line-arguments.o util/libutil.a
+	$(call quiet,CC) -I. $^ -o $@ -ltalloc
+
 $(dir)/smtp-dummy: $(smtp_dummy_modules)
 	$(call quiet,CC) $^ -o $@
 
@@ -21,7 +24,8 @@ $(dir)/symbol-test: $(dir)/symbol-test.o
 
 .PHONY: test check
 
-test-binaries: $(dir)/arg-test $(dir)/smtp-dummy $(dir)/symbol-test
+test-binaries: $(dir)/arg-test $(dir)/hex-xcode \
+	 $(dir)/smtp-dummy $(dir)/symbol-test
 
 test:	all test-binaries
 	@${dir}/notmuch-test $(OPTIONS)
diff --git a/test/basic b/test/basic
index d6aed24..af57026 100755
--- a/test/basic
+++ b/test/basic
@@ -54,7 +54,7 @@ test_begin_subtest 'Ensure that all available tests will be run by notmuch-test'
 eval $(sed -n -e '/^TESTS="$/,/^"$/p' $TEST_DIRECTORY/notmuch-test)
 tests_in_suite=$(for i in $TESTS; do echo $i; done | sort)
 available=$(find "$TEST_DIRECTORY" -maxdepth 1 -type f -executable -printf '%f\n' | \
-    sed -r -e "/^(aggregate-results.sh|notmuch-test|smtp-dummy|test-verbose|symbol-test|arg-test)$/d" | \
+    sed -r -e "/^(aggregate-results.sh|notmuch-test|smtp-dummy|test-verbose|symbol-test|arg-test|hex-xcode)$/d" | \
     sort)
 test_expect_equal "$tests_in_suite" "$available"
 
diff --git a/test/hex-xcode.c b/test/hex-xcode.c
new file mode 100644
index 0000000..eec6541
--- /dev/null
+++ b/test/hex-xcode.c
@@ -0,0 +1,103 @@
+/* No, nothing to to with IDE from Apple Inc.
+   testbed for ../util/hex-escape.c.
+
+   usage:
+   hex-xcode [--direction=(encode|decode)] [--omit-newline] < file
+   hex-xcode [--direction=(encode|decode)] [--omit-newline] arg1 arg2 arg3 ...
+
+ */
+
+#include "notmuch-client.h"
+#include "hex-escape.h"
+#include <assert.h>
+
+
+enum direction {
+    ENCODE,
+    DECODE
+};
+
+static int
+xcode (void *ctx, enum direction dir, char *in, char **buf_p, size_t *size_p)
+{
+    hex_status_t status;
+
+    if (dir == ENCODE)
+	status = hex_encode (ctx, in, buf_p, size_p);
+    else
+	status = hex_decode (ctx, in, buf_p, size_p);
+
+    if (status == HEX_SUCCESS)
+	fputs (*buf_p, stdout);
+
+    return status;
+}
+
+
+int
+main (int argc, char **argv)
+{
+
+
+    enum direction dir = DECODE;
+    int omit_newline = FALSE;
+
+    notmuch_opt_desc_t options[] = {
+	{ NOTMUCH_OPT_KEYWORD, &dir, "direction", 'd',
+	  (notmuch_keyword_t []){ { "encode", ENCODE },
+				  { "decode", DECODE },
+				  { 0, 0 } } },
+	{ NOTMUCH_OPT_BOOLEAN, &omit_newline, "omit-newline", 'n', 0 },
+	{ 0, 0, 0, 0, 0 }
+    };
+
+    int opt_index = parse_arguments (argc, argv, options, 1);
+
+    if (opt_index < 0)
+	exit (1);
+
+    void *ctx = talloc_new (NULL);
+
+    char *line = NULL;
+    size_t line_size;
+    ssize_t line_len;
+
+    char *buffer = NULL;
+    size_t buf_size = 0;
+
+    notmuch_bool_t read_stdin = TRUE;
+
+    for (; opt_index < argc; opt_index++) {
+
+	if (xcode (ctx, dir, argv[opt_index],
+		   &buffer, &buf_size) != HEX_SUCCESS)
+	    return 1;
+
+	if (!omit_newline)
+	    putchar ('\n');
+
+	read_stdin = FALSE;
+    }
+
+    if (!read_stdin)
+	return 0;
+
+    while ((line_len = getline (&line, &line_size, stdin)) != -1) {
+
+	chomp_newline (line);
+
+	if (xcode (ctx, dir, line, &buffer, &buf_size) != HEX_SUCCESS)
+	    return 1;
+
+	if (!omit_newline)
+	    putchar ('\n');
+
+    }
+
+    if (line)
+	free (line);
+
+    talloc_free (ctx);
+
+    return 0;
+}
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 3/6] test/hex-escaping: new test for hex escaping routines
  2012-04-14 12:15 [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin" Jani Nikula
  2012-04-14 12:15 ` [PATCH v2 1/6] hex-escape: (en|de)code strings to/from restricted character set Jani Nikula
  2012-04-14 12:15 ` [PATCH v2 2/6] test/hex-xcode: new test binary Jani Nikula
@ 2012-04-14 12:15 ` Jani Nikula
  2012-04-14 12:15 ` [PATCH v2 4/6] cli: add support for batch tagging operations to "notmuch tag" Jani Nikula
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Jani Nikula @ 2012-04-14 12:15 UTC (permalink / raw)
  To: notmuch

From: David Bremner <bremner@debian.org>

These are more like unit tests, to (try to) make sure the library
functionality is working before building more complicated things on
top of it.
---
 test/hex-escaping |   26 ++++++++++++++++++++++++++
 test/notmuch-test |    1 +
 2 files changed, 27 insertions(+), 0 deletions(-)
 create mode 100755 test/hex-escaping

diff --git a/test/hex-escaping b/test/hex-escaping
new file mode 100755
index 0000000..f34cc8c
--- /dev/null
+++ b/test/hex-escaping
@@ -0,0 +1,26 @@
+#!/usr/bin/env bash
+test_description="hex encoding and decoding"
+. ./test-lib.sh
+
+test_begin_subtest "round trip"
+find $TEST_DIRECTORY/corpus -type f -print | sort | xargs cat > EXPECTED
+$TEST_DIRECTORY/hex-xcode --direction=encode < EXPECTED | $TEST_DIRECTORY/hex-xcode --direction=decode > OUTPUT
+test_expect_equal_file OUTPUT EXPECTED
+
+test_begin_subtest "punctuation"
+tag1='comic_swear=$&^%$^%\\//-+$^%$'
+tag_enc1=$($TEST_DIRECTORY/hex-xcode --direction=encode "$tag1")
+test_expect_equal "$tag_enc1" "comic_swear=%24%26%5e%25%24%5e%25%5c%5c%2f%2f-+%24%5e%25%24"
+
+test_begin_subtest "round trip newlines"
+printf 'this\n tag\t has\n spaces\n' > EXPECTED.$test_count
+$TEST_DIRECTORY/hex-xcode --direction=encode  < EXPECTED.$test_count |\
+	$TEST_DIRECTORY/hex-xcode --direction=decode > OUTPUT.$test_count
+test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
+
+test_begin_subtest "round trip 8bit chars"
+echo '%c3%91%c3%a5%c3%b0%c3%a3%c3%a5%c3%a9-%c3%8f%c3%8a' > EXPECTED.$test_count
+$TEST_DIRECTORY/hex-xcode --direction=decode  < EXPECTED.$test_count |\
+	$TEST_DIRECTORY/hex-xcode --direction=encode > OUTPUT.$test_count
+test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
+test_done
diff --git a/test/notmuch-test b/test/notmuch-test
index 801df73..2d0b381 100755
--- a/test/notmuch-test
+++ b/test/notmuch-test
@@ -57,6 +57,7 @@ TESTS="
   emacs-address-cleaning
   emacs-hello
   emacs-show
+  hex-escaping
 "
 TESTS=${NOTMUCH_TESTS:=$TESTS}
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 4/6] cli: add support for batch tagging operations to "notmuch tag"
  2012-04-14 12:15 [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin" Jani Nikula
                   ` (2 preceding siblings ...)
  2012-04-14 12:15 ` [PATCH v2 3/6] test/hex-escaping: new test for hex escaping routines Jani Nikula
@ 2012-04-14 12:15 ` Jani Nikula
  2012-04-14 12:15 ` [PATCH v2 5/6] test: add test for notmuch tag --stdin option Jani Nikula
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Jani Nikula @ 2012-04-14 12:15 UTC (permalink / raw)
  To: notmuch

Add support for batch tagging operations through stdin to "notmuch
tag". This can be enabled with the new --stdin command line option to
"notmuch new". The input must consist of lines of the format:

T +<tag>|-<tag> [...] [--] <search-terms>

Each line is interpreted similarly to "notmuch tag" command line
arguments. The delimiter is one or more spaces ' '. Any characters in
<tag> and <search-terms> MAY be hex encoded with %NN where NN is the
hexadecimal value of the character. Any ' ' and '%' characters in
<tag> and <search-terms> MUST be hex encoded (using %20 and %25,
respectively). Any characters that are not part of <tag> or
<search-terms> MUST NOT be hex encoded.

Leading and trailing space ' ' is ignored. Empty lines and lines
beginning with '#' are ignored.

Signed-off-by: Jani Nikula <jani@nikula.org>
---
 notmuch-tag.c |  244 +++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 files changed, 213 insertions(+), 31 deletions(-)

diff --git a/notmuch-tag.c b/notmuch-tag.c
index 05feed3..e67555f 100644
--- a/notmuch-tag.c
+++ b/notmuch-tag.c
@@ -19,6 +19,7 @@
  */
 
 #include "notmuch-client.h"
+#include "hex-escape.h"
 
 static volatile sig_atomic_t interrupted;
 
@@ -167,17 +168,181 @@ tag_query (void *ctx, notmuch_database_t *notmuch, const char *query_string,
     return interrupted;
 }
 
+/* like strtok(3), but without state, and doesn't modify s. usage pattern:
+ *
+ * const char *tok = input;
+ * const char *delim = " \t";
+ * size_t tok_len = 0;
+ *
+ * while ((tok = strtok_len (tok + tok_len, delim, &tok_len)) != NULL) {
+ *     // do stuff with string tok of length tok_len
+ * }
+ */
+static
+char *strtok_len(char *s, const char *delim, size_t *len)
+{
+    /* skip initial delims */
+    s += strspn (s, delim);
+
+    /* length of token */
+    *len = strcspn (s, delim);
+
+    return *len ? s : NULL;
+}
+
+/* Tag messages according to 'input', which must consist of lines of
+ * the format:
+ *
+ * T +<tag>|-<tag> [...] [--] <search-terms>
+ *
+ * Each line is interpreted similarly to "notmuch tag" command line
+ * arguments. The delimiter is one or more spaces ' '. Any characters
+ * in <tag> and <search-terms> MAY be hex encoded with %NN where NN is
+ * the hexadecimal value of the character. Any ' ' and '%' characters
+ * in <tag> and <search-terms> MUST be hex encoded (using %20 and %25,
+ * respectively). Any characters that are not part of <tag> or
+ * <search-terms> MUST NOT be hex encoded.
+ *
+ * Leading and trailing space ' ' is ignored. Empty lines and lines
+ * beginning with '#' are ignored.
+ */
+static int
+tag_file (void *ctx, notmuch_database_t *notmuch, FILE *input,
+	  notmuch_bool_t synchronize_flags)
+{
+    char *line = NULL;
+    size_t line_size;
+    ssize_t line_len;
+    tag_operation_t *tag_ops;
+    int tag_ops_array_size = 10;
+    int ret = 0;
+
+    /* Array of tagging operations (add or remove), terminated with an
+     * empty element. Size will be increased as necessary. */
+    tag_ops = talloc_array (ctx, tag_operation_t, tag_ops_array_size);
+    if (tag_ops == NULL) {
+	fprintf (stderr, "Out of memory.\n");
+	return 1;
+    }
+
+    while ((line_len = getline (&line, &line_size, input)) != -1 &&
+	   !interrupted) {
+	char *tok;
+	size_t tok_len;
+	int tag_ops_count = 0;
+
+	chomp_newline (line);
+
+	tok = strtok_len (line, " ", &tok_len);
+
+	/* Skip empty and comment lines. */
+	if (tok == NULL || *tok == '#')
+	    continue;
+
+	/* T for tagging is the only recognized action for now. */
+	if (strncmp (tok, "T", tok_len) != 0) {
+	    fprintf (stderr, "Warning: Ignoring invalid input line: %s\n",
+		     line);
+	    continue;
+	}
+
+	/* Parse tags. */
+	while ((tok = strtok_len (tok + tok_len, " ", &tok_len)) != NULL) {
+	    notmuch_bool_t remove;
+	    char *tag;
+
+	    /* Optional explicit end of tags marker. */
+	    if (strncmp (tok, "--", tok_len) == 0) {
+		tok = strtok_len (tok + tok_len, " ", &tok_len);
+		break;
+	    }
+
+	    /* Implicit end of tags. */
+	    if (*tok != '-' && *tok != '+')
+		break;
+
+	    /* If tag is terminated by NUL, there's no query string. */
+	    if (*(tok + tok_len) == '\0') {
+		tok = NULL;
+		break;
+	    }
+
+	    /* Terminate, and start next token after terminator. */
+	    *(tok + tok_len++) = '\0';
+
+	    remove = (*tok == '-');
+	    tag = tok + 1;
+
+	    /* Refuse empty tags. */
+	    if (*tag == '\0') {
+		tok = NULL;
+		break;
+	    }
+
+	    /* Decode tag. */
+	    if (hex_decode_inplace (tag) != HEX_SUCCESS) {
+		tok = NULL;
+		break;
+	    }
+
+	    tag_ops[tag_ops_count].tag = tag;
+	    tag_ops[tag_ops_count].remove = remove;
+	    tag_ops_count++;
+
+	    /* Make room for terminating empty element and potential
+	     * new tags, if necessary. This should be a fairly rare
+	     * case, considering the initial array size. */
+	    if (tag_ops_count == tag_ops_array_size) {
+		tag_ops_array_size *= 2;
+		tag_ops = talloc_realloc (ctx, tag_ops, tag_operation_t,
+					  tag_ops_array_size);
+		if (tag_ops == NULL) {
+		    fprintf (stderr, "Out of memory.\n");
+		    return 1;
+		}
+	    }
+	}
+
+	if (tok == NULL || tag_ops_count == 0) {
+	    /* FIXME: line has been modified! */
+	    fprintf (stderr, "Warning: Ignoring invalid input line: %s\n",
+		     line);
+	    continue;
+	}
+
+	tag_ops[tag_ops_count].tag = NULL;
+
+	/* tok now points to the query string */
+	if (hex_decode_inplace (tok) != HEX_SUCCESS) {
+	    /* FIXME: line has been modified! */
+	    fprintf (stderr, "Warning: Ignoring invalid input line: %s\n",
+		     line);
+	    continue;
+	}
+
+	ret = tag_query (ctx, notmuch, tok, tag_ops, synchronize_flags);
+	if (ret)
+	    break;
+    }
+
+    if (line)
+	free (line);
+
+    return ret || interrupted;
+}
+
 int
 notmuch_tag_command (void *ctx, int argc, char *argv[])
 {
-    tag_operation_t *tag_ops;
+    tag_operation_t *tag_ops = NULL;
     int tag_ops_count = 0;
-    char *query_string;
+    char *query_string = NULL;
     notmuch_config_t *config;
     notmuch_database_t *notmuch;
     struct sigaction action;
     notmuch_bool_t synchronize_flags;
-    int i;
+    notmuch_bool_t use_stdin = FALSE;
+    int i, opt_index;
     int ret;
 
     /* Setup our handler for SIGINT */
@@ -187,42 +352,56 @@ notmuch_tag_command (void *ctx, int argc, char *argv[])
     action.sa_flags = SA_RESTART;
     sigaction (SIGINT, &action, NULL);
 
-    argc--; argv++; /* skip subcommand argument */
+    notmuch_opt_desc_t options[] = {
+	{ NOTMUCH_OPT_BOOLEAN, &use_stdin, "stdin", 0, 0 },
+	{ 0, 0, 0, 0, 0 }
+    };
 
-    /* Array of tagging operations (add or remove), terminated with an
-     * empty element. */
-    tag_ops = talloc_array (ctx, tag_operation_t, argc + 1);
-    if (tag_ops == NULL) {
-	fprintf (stderr, "Out of memory.\n");
+    opt_index = parse_arguments (argc, argv, options, 1);
+    if (opt_index < 0)
 	return 1;
-    }
 
-    for (i = 0; i < argc; i++) {
-	if (strcmp (argv[i], "--") == 0) {
-	    i++;
-	    break;
+    if (use_stdin) {
+	if (opt_index != argc) {
+	    fprintf (stderr, "Can't specify both cmdline and stdin!\n");
+	    return 1;
 	}
-	if (argv[i][0] == '+' || argv[i][0] == '-') {
-	    tag_ops[tag_ops_count].tag = argv[i] + 1;
-	    tag_ops[tag_ops_count].remove = (argv[i][0] == '-');
-	    tag_ops_count++;
-	} else {
-	    break;
+    } else {
+	/* Array of tagging operations (add or remove), terminated with an
+	 * empty element. */
+	tag_ops = talloc_array (ctx, tag_operation_t, argc - opt_index + 1);
+	if (tag_ops == NULL) {
+	    fprintf (stderr, "Out of memory.\n");
+	    return 1;
 	}
-    }
 
-    tag_ops[tag_ops_count].tag = NULL;
+	for (i = opt_index; i < argc; i++) {
+	    if (strcmp (argv[i], "--") == 0) {
+		i++;
+		break;
+	    }
+	    if (argv[i][0] == '+' || argv[i][0] == '-') {
+		tag_ops[tag_ops_count].tag = argv[i] + 1;
+		tag_ops[tag_ops_count].remove = (argv[i][0] == '-');
+		tag_ops_count++;
+	    } else {
+		break;
+	    }
+	}
 
-    if (tag_ops_count == 0) {
-	fprintf (stderr, "Error: 'notmuch tag' requires at least one tag to add or remove.\n");
-	return 1;
-    }
+	tag_ops[tag_ops_count].tag = NULL;
 
-    query_string = query_string_from_args (ctx, argc - i, &argv[i]);
+	if (tag_ops_count == 0) {
+	    fprintf (stderr, "Error: 'notmuch tag' requires at least one tag to add or remove.\n");
+	    return 1;
+	}
 
-    if (*query_string == '\0') {
-	fprintf (stderr, "Error: notmuch tag requires at least one search term.\n");
-	return 1;
+	query_string = query_string_from_args (ctx, argc - i, &argv[i]);
+
+	if (*query_string == '\0') {
+	    fprintf (stderr, "Error: notmuch tag requires at least one search term.\n");
+	    return 1;
+	}
     }
 
     config = notmuch_config_open (ctx, NULL, NULL);
@@ -236,7 +415,10 @@ notmuch_tag_command (void *ctx, int argc, char *argv[])
 
     synchronize_flags = notmuch_config_get_maildir_synchronize_flags (config);
 
-    ret = tag_query (ctx, notmuch, query_string, tag_ops, synchronize_flags);
+    if (use_stdin)
+	ret = tag_file (ctx, notmuch, stdin, synchronize_flags);
+    else
+	ret = tag_query (ctx, notmuch, query_string, tag_ops, synchronize_flags);
 
     notmuch_database_close (notmuch);
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 5/6] test: add test for notmuch tag --stdin option
  2012-04-14 12:15 [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin" Jani Nikula
                   ` (3 preceding siblings ...)
  2012-04-14 12:15 ` [PATCH v2 4/6] cli: add support for batch tagging operations to "notmuch tag" Jani Nikula
@ 2012-04-14 12:15 ` Jani Nikula
  2012-04-14 12:15 ` [PATCH v2 6/6] man: document " Jani Nikula
  2012-04-14 20:27 ` [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin" Jameson Graef Rollins
  6 siblings, 0 replies; 11+ messages in thread
From: Jani Nikula @ 2012-04-14 12:15 UTC (permalink / raw)
  To: notmuch

---
 test/tagging |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/test/tagging b/test/tagging
index e4782ed..15eb42d 100755
--- a/test/tagging
+++ b/test/tagging
@@ -46,4 +46,17 @@ test_expect_equal "$output" "\
 thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; One (:\"  inbox tag1 unread)
 thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; Two (inbox tag1 tag4 unread)"
 
+test_begin_subtest "Batch stdin"
+notmuch tag --stdin <<EOF
+# %20 is a space in tag
+T -:"%20 -tag1 +tag5 +tag6 -- One
+T +tag1 -tag1 -tag4 +tag4 -- Two
+T -tag6 One
+T +tag5 Two
+EOF
+output=$(notmuch search \* | notmuch_search_sanitize)
+test_expect_equal "$output" "\
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; One (inbox tag5 unread)
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; Two (inbox tag4 tag5 unread)"
+
 test_done
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 6/6] man: document notmuch tag --stdin option
  2012-04-14 12:15 [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin" Jani Nikula
                   ` (4 preceding siblings ...)
  2012-04-14 12:15 ` [PATCH v2 5/6] test: add test for notmuch tag --stdin option Jani Nikula
@ 2012-04-14 12:15 ` Jani Nikula
  2012-04-14 20:27 ` [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin" Jameson Graef Rollins
  6 siblings, 0 replies; 11+ messages in thread
From: Jani Nikula @ 2012-04-14 12:15 UTC (permalink / raw)
  To: notmuch

---
 man/man1/notmuch-tag.1 |   44 +++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 43 insertions(+), 1 deletions(-)

diff --git a/man/man1/notmuch-tag.1 b/man/man1/notmuch-tag.1
index aa4546e..30693cf 100644
--- a/man/man1/notmuch-tag.1
+++ b/man/man1/notmuch-tag.1
@@ -4,7 +4,11 @@ notmuch-tag \- Add/remove tags for all messages matching the search terms.
 
 .SH SYNOPSIS
 .B notmuch tag
-.RI  "+<" tag> "|\-<" tag "> [...] [\-\-] <" search-term ">..."
+.RI "+<" tag ">|\-<" tag "> [...] [\-\-] <" search-terms ">"
+
+.B notmuch tag
+.RI "--stdin"
+
 
 .SH DESCRIPTION
 
@@ -29,6 +33,44 @@ updates the maildir flags according to tag changes if the
 configuration option is enabled. See \fBnotmuch-config\fR(1) for
 details.
 
+Supported options for
+.B tag
+include
+.RS 4
+.TP 4
+.BR \-\-stdin
+
+Read batch tagging operations from standard input. This is more
+efficient than repeated
+.B notmuch tag
+invocations. See
+.B TAG FILE FORMAT
+below for the input format. This option is not compatible with
+specifying tagging on the command line.
+.RE
+
+.SH TAG FILE FORMAT
+
+The input must consist of lines of the format:
+
+.RI "T +<" tag ">|\-<" tag "> [...] [\-\-] <" search-terms ">"
+
+Each line is interpreted similarly to
+.B notmuch tag
+command line arguments. The delimiter is one or more spaces ' '. Any
+characters in <tag> and <search-terms>
+.B may
+be hex encoded with %NN where NN is the hexadecimal value of the
+character. Any ' ' and '%' characters in <tag> and <search-terms>
+.B must
+be hex encoded (using %20 and %25, respectively). Any characters that
+are not part of <tag> or <search-terms>
+.B must not
+be hex encoded.
+
+Leading and trailing space ' ' is ignored. Empty lines and lines
+beginning with '#' are ignored.
+
 .SH SEE ALSO
 
 \fBnotmuch\fR(1), \fBnotmuch-config\fR(1), \fBnotmuch-count\fR(1),
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin"
  2012-04-14 12:15 [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin" Jani Nikula
                   ` (5 preceding siblings ...)
  2012-04-14 12:15 ` [PATCH v2 6/6] man: document " Jani Nikula
@ 2012-04-14 20:27 ` Jameson Graef Rollins
  2012-04-14 21:07   ` Jani Nikula
  6 siblings, 1 reply; 11+ messages in thread
From: Jameson Graef Rollins @ 2012-04-14 20:27 UTC (permalink / raw)
  To: Jani Nikula, notmuch

[-- Attachment #1: Type: text/plain, Size: 647 bytes --]

On Sat, Apr 14 2012, Jani Nikula <jani@nikula.org> wrote:
> This series adds support for batch tagging through stdin to "notmuch
> tag". This should be useful and efficient for e.g. initial tagging
> scripts. Also, this adds locking around a batch of tag changes, which is
> also useful for initial tagging scripts. See the test patch for an
> example using a "here document".

My issues from v1 still stand.  I would rather see some unification with
the existing tag file format, rather than introduce a second format,
with it's increased maintenance burden and confusion to users.  Either
that or a more general batch command interface.

jamie.

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin"
  2012-04-14 20:27 ` [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin" Jameson Graef Rollins
@ 2012-04-14 21:07   ` Jani Nikula
  2012-04-15 12:02     ` David Bremner
  0 siblings, 1 reply; 11+ messages in thread
From: Jani Nikula @ 2012-04-14 21:07 UTC (permalink / raw)
  To: Jameson Graef Rollins, notmuch

Jameson Graef Rollins <jrollins@finestructure.net> writes:

> On Sat, Apr 14 2012, Jani Nikula <jani@nikula.org> wrote:
>> This series adds support for batch tagging through stdin to "notmuch
>> tag". This should be useful and efficient for e.g. initial tagging
>> scripts. Also, this adds locking around a batch of tag changes, which is
>> also useful for initial tagging scripts. See the test patch for an
>> example using a "here document".
>
> My issues from v1 still stand.  I would rather see some unification with
> the existing tag file format, rather than introduce a second format,
> with it's increased maintenance burden and confusion to users.  Either
> that or a more general batch command interface.

The existing dump/restore file format is a dead end. It fails
magnificently with tags and message-ids that have spaces or some other
special characters in them. It doesn't support removal of tags. There's
zero room for extensibility. I can't imagine how that could be used for
notmuch tag.

David originally wrote the dump/restore patches
id:"1324214111-32079-1-git-send-email-david@tethera.net" to add a
"notmuch" format (yes, a second format) in addition to the "sup" format
we currently have, to fix the issues with special characters. That too
lacked features to support batch tagging in notmuch new. The format
proposed here would fix the issues and work with notmuch tag, and it
would be obvious to use for anyone who has used notmuch tag.

I see a general batch command interface overkill. Too many problems to
solve for little gain (see the previous thread for details). What other
notmuch commands than tag would really benefit from it? If you want to
do search or show or whatever, the general batch command interface to
use is /bin/sh.

For notmuch tag, particularly in connection with notmuch new, there's a
clear benefit in having a batch mode: if you have "new" in new.tags, and
do initial tagging on the "new" tagged messages (e.g. in post-new hook),
you don't want anyone messing with the database while you're processing
the messages tagged "new". The batch tagging of this series keeps the db
locked for the duration of such initial tagging.

I'm totally fine with modifying the proposed format (e.g. change "T" to
"tag", make things compatible with a future general batch mode), but to
be absolutely clear: I will not implement a general batch command
mode. If that is deemed to be what is wanted, that's okay too. I can
share my WIP patches for dump/restore, and focus on something else.


BR,
Jani.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin"
  2012-04-14 21:07   ` Jani Nikula
@ 2012-04-15 12:02     ` David Bremner
  2012-04-18 18:58       ` Jani Nikula
  0 siblings, 1 reply; 11+ messages in thread
From: David Bremner @ 2012-04-15 12:02 UTC (permalink / raw)
  To: Jani Nikula, Jameson Graef Rollins, notmuch

Jani Nikula <jani@nikula.org> writes:


> I'm totally fine with modifying the proposed format (e.g. change "T" to
> "tag", make things compatible with a future general batch mode), but to
> be absolutely clear: I will not implement a general batch command
> mode. 

I was thinking about the best way of making the interface extensible,
and it might be better have a kind of modal interface, where some well
defined escape at the beginning of the line introduces a mode switch.
This has two apparent advantages: it avoids duplication of redundant
information at the beginning of each line, and for input to a subcommand
it could be optional.

something like

* tag
+foo +bar msg.id@blah
+glub -glog other.msg.id@blog

vs

* restore
foo bar msg.id@blah
glub glog other.msg.id@blog

where a hypothetical general batch interface could take those two files
concatenated together, and the "*" lines would be optional feeding to
tag and restore respectively.

I guess the current proposal is to have the restore format and tag
format a bit closer

* restore
+foo +bar msg.id@blah
+glub +glog other.msg.id@blog

This looks like it would be a 2% space increase for my tags. I guess I
could live with that. Another option would be to have the '+' be
optional for tag as well; I suppose then tags starting with + would be
ambiguous, which is probably a bad idea.

d

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin"
  2012-04-15 12:02     ` David Bremner
@ 2012-04-18 18:58       ` Jani Nikula
  0 siblings, 0 replies; 11+ messages in thread
From: Jani Nikula @ 2012-04-18 18:58 UTC (permalink / raw)
  To: David Bremner, Jameson Graef Rollins, notmuch

On Sun, 15 Apr 2012, David Bremner <david@tethera.net> wrote:
> Jani Nikula <jani@nikula.org> writes:
>
>
>> I'm totally fine with modifying the proposed format (e.g. change "T" to
>> "tag", make things compatible with a future general batch mode), but to
>> be absolutely clear: I will not implement a general batch command
>> mode. 
>
> I was thinking about the best way of making the interface extensible,
> and it might be better have a kind of modal interface, where some well
> defined escape at the beginning of the line introduces a mode switch.
> This has two apparent advantages: it avoids duplication of redundant
> information at the beginning of each line, and for input to a subcommand
> it could be optional.

Does it not worry you that such a modal format isn't much fun to filter
with regular command line tools?

> something like
>
> * tag
> +foo +bar msg.id@blah
> +glub -glog other.msg.id@blog
>
> vs
>
> * restore
> foo bar msg.id@blah
> glub glog other.msg.id@blog
>
> where a hypothetical general batch interface could take those two files
> concatenated together, and the "*" lines would be optional feeding to
> tag and restore respectively.

Restore currently has two modes of operation: with and without
--accumulate option. How would you specify that *outside* the file, if
the restore was done with the hypothetical general batch interface? And
if there was such an interface, we shouldn't have a dedicated restore
command for restoring such dump files, should we...? (It's useful to be
able to use the same dump file with/without --accumulate. See e.g. [1].)

> I guess the current proposal is to have the restore format and tag
> format a bit closer
>
> * restore
> +foo +bar msg.id@blah
> +glub +glog other.msg.id@blog

Something like that... though in my original idea the same dump file
could be fed to either tag or restore. They would interpret the file
using the same routine, but perform the tagging in slightly different
ways (mainly to keep restore very strict as it's a backup/restore
command).

> This looks like it would be a 2% space increase for my tags. I guess I
> could live with that. Another option would be to have the '+' be
> optional for tag as well; I suppose then tags starting with + would be
> ambiguous, which is probably a bad idea.

I'd like to keep the new format unambiguous if at all possible.

BR,
Jani.



[1] http://notmuchmail.org/howto/#index5h2

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2012-04-18 18:58 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-14 12:15 [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin" Jani Nikula
2012-04-14 12:15 ` [PATCH v2 1/6] hex-escape: (en|de)code strings to/from restricted character set Jani Nikula
2012-04-14 12:15 ` [PATCH v2 2/6] test/hex-xcode: new test binary Jani Nikula
2012-04-14 12:15 ` [PATCH v2 3/6] test/hex-escaping: new test for hex escaping routines Jani Nikula
2012-04-14 12:15 ` [PATCH v2 4/6] cli: add support for batch tagging operations to "notmuch tag" Jani Nikula
2012-04-14 12:15 ` [PATCH v2 5/6] test: add test for notmuch tag --stdin option Jani Nikula
2012-04-14 12:15 ` [PATCH v2 6/6] man: document " Jani Nikula
2012-04-14 20:27 ` [PATCH v2 0/6] batch tagging support: "notmuch tag --stdin" Jameson Graef Rollins
2012-04-14 21:07   ` Jani Nikula
2012-04-15 12:02     ` David Bremner
2012-04-18 18:58       ` Jani Nikula

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).