unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* New Dump/Restore Format
@ 2011-12-13 20:27 David Bremner
  2011-12-13 20:27 ` [Alpha PATCH 1/6] util/hex-escape.[ch]: encoding/decoding strings into restricted character set David Bremner
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: David Bremner @ 2011-12-13 20:27 UTC (permalink / raw)
  To: notmuch; +Cc: pere

Hi All;

There are some style/doc issues remaining, but because bugs in dump
and restore really suck, I thought I would ask for early feedback on
functionality.  I'm particularly interested in how the new dump format
works for weird message-ids (spaces and so on). If you have public
messages with tricky message-id's, I'd appreciate adding those
messages to the test suite.

Things to bikeshed now: name(s) of the formats; sup and notmuch are
maybe not ideal.  The format itself? The encoding format? The latter
is chosen for compatibility with nmbug, but we could discussing using
a bigger character set.

Things I know about
      
      - not enough tests
      - no man page, online docs.
      - no API docs for hex_encode/blah.

I think the code in hex-escape.[ch] is otherwise ready for (second)
review; I'll probably do another review of the code in
notmuch-(dump|restore).c myself for clarity, so you might want to wait
for the next round before diving in.

If you prefer pull from git, you can get these patches on branch "new-dump" 
at git://pivot.cs.unb.ca/notmuch.git

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Alpha PATCH 1/6] util/hex-escape.[ch]: encoding/decoding strings into restricted character set
  2011-12-13 20:27 New Dump/Restore Format David Bremner
@ 2011-12-13 20:27 ` David Bremner
  2011-12-13 20:27 ` [Alpha PATCH 2/6] test: add test for hex_(encode|decode) David Bremner
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: David Bremner @ 2011-12-13 20:27 UTC (permalink / raw)
  To: notmuch; +Cc: David Bremner, pere

From: David Bremner <bremner@debian.org>

The character set is chosen to be suitable for pathnames, and the same as that
used by contrib/nmbug
---
 util/Makefile.local |    2 +-
 util/hex-escape.c   |  150 +++++++++++++++++++++++++++++++++++++++++++++++++++
 util/hex-escape.h   |   15 +++++
 3 files changed, 166 insertions(+), 1 deletions(-)
 create mode 100644 util/hex-escape.c
 create mode 100644 util/hex-escape.h

diff --git a/util/Makefile.local b/util/Makefile.local
index 26e4c3f..2e63932 100644
--- a/util/Makefile.local
+++ b/util/Makefile.local
@@ -3,7 +3,7 @@
 dir := util
 extra_cflags += -I$(srcdir)/$(dir)
 
-libutil_c_srcs := $(dir)/xutil.c $(dir)/error_util.c
+libutil_c_srcs := $(dir)/xutil.c $(dir)/error_util.c $(dir)/hex-escape.c
 
 libutil_modules := $(libutil_c_srcs:.c=.o)
 
diff --git a/util/hex-escape.c b/util/hex-escape.c
new file mode 100644
index 0000000..3e08465
--- /dev/null
+++ b/util/hex-escape.c
@@ -0,0 +1,150 @@
+/* pathname.c -  Manage encoding and decoding of byte strings into path names
+ *
+ * Copyright (c) 2011 David Bremner
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see http://www.gnu.org/licenses/ .
+ *
+ * Author: David Bremner <david@tethera.net>
+ */
+
+#include <string.h>
+#include <talloc.h>
+#include "error_util.h"
+#include "hex-escape.h"
+
+static const size_t default_buf_size=1024;
+
+static const char* output_charset=
+    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+-_@=.:,";
+
+static const char escape_char='%';
+
+static int
+is_output (char c) {
+    return (strchr (output_charset, c) != NULL);
+}
+
+
+static int
+maybe_realloc(void *ctx, size_t needed, char **out, size_t *out_size)
+{
+    if (*out_size < needed) {
+
+	if (*out == NULL)
+	    *out = talloc_size(ctx,needed);
+	else
+	    *out = talloc_realloc(ctx,*out,char,needed);
+
+	if (*out == NULL)
+	    return 0;
+
+	*out_size = needed;
+    }
+    return 1;
+}
+
+
+hex_status_t
+hex_encode (void *ctx, const char *in, char **out, size_t *out_size)
+{
+
+    const char *p;
+    char *q;
+
+    int escape_count=0;
+    size_t needed;
+
+    for  (p = in; *p; p++)
+	escape_count += (! is_output (*p));
+
+    needed = strlen (in) + 2*escape_count + 1;
+
+    if (*out == NULL)
+	*out_size=0;
+
+    if (!maybe_realloc (ctx, needed, out, out_size))
+	return HEX_OUT_OF_MEMORY;
+
+    q = *out;
+    p = in;
+
+    while (*p) {
+	if (is_output (*p)) {
+	    *q++ = *p++;
+	} else {
+	    sprintf (q, "%%%02x", *p++);
+	    q += 3;
+	}
+    }
+
+    *q = '\0';
+    return HEX_SUCCESS;
+}
+
+
+hex_status_t
+hex_decode (void *ctx, const char *in, char **out, size_t *out_size) {
+
+    char buf[3];
+
+    const char *p;
+    char *q;
+
+    size_t escape_count = 0;
+    size_t needed = 0;
+
+    size_t len = strlen (in);
+
+    for (p = in; *p; p++)
+	escape_count += (*p == escape_char);
+
+    needed = len - escape_count*2 +1;
+
+    if (!maybe_realloc(ctx, needed, out, out_size))
+	return HEX_OUT_OF_MEMORY;
+
+    p = in;
+    q = *out;
+    buf[2]=0;
+
+
+    while (*p) {
+
+	if (*p == escape_char) {
+
+	    char *endp;
+
+	    if (len < 3)
+		return HEX_SYNTAX_ERROR;
+
+	    buf[0]=p[1];
+	    buf[1]=p[2];
+
+	    *q = strtol(buf, &endp, 16);
+
+	    if (endp != buf+2)
+		return HEX_SYNTAX_ERROR;
+
+	    len -= 3;
+	    p += 3;
+	    q++;
+	} else {
+	    *q++ = *p++;
+	}
+    }
+
+    *q = '\0';
+
+    return HEX_SUCCESS;
+}
diff --git a/util/hex-escape.h b/util/hex-escape.h
new file mode 100644
index 0000000..98ecbe0
--- /dev/null
+++ b/util/hex-escape.h
@@ -0,0 +1,15 @@
+#ifndef _HEX_ESCAPE_H
+#define _HEX_ESCAPE_H
+
+typedef enum hex_status {
+    HEX_SUCCESS = 0,
+    HEX_SYNTAX_ERROR,
+    HEX_OUT_OF_MEMORY
+} hex_status_t;
+
+hex_status_t
+hex_encode (void *talloc_ctx, const char *in, char **out, size_t *out_size);
+
+hex_status_t
+hex_decode (void *talloc_ctx, const char *in, char **out, size_t *out_size);
+#endif
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Alpha PATCH 2/6] test: add test for hex_(encode|decode)
  2011-12-13 20:27 New Dump/Restore Format David Bremner
  2011-12-13 20:27 ` [Alpha PATCH 1/6] util/hex-escape.[ch]: encoding/decoding strings into restricted character set David Bremner
@ 2011-12-13 20:27 ` David Bremner
  2011-12-13 20:27 ` [Alpha PATCH 3/6] notmuch-dump: add --format=(notmuch|sup) David Bremner
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: David Bremner @ 2011-12-13 20:27 UTC (permalink / raw)
  To: notmuch; +Cc: David Bremner, pere

From: David Bremner <bremner@debian.org>

The test binary hex-xcode can also be used to check for memory leaks.
---
 test/.gitignore     |    1 +
 test/Makefile.local |    6 +++-
 test/basic          |    2 +-
 test/hex-escaping   |   20 +++++++++++++
 test/hex-xcode.c    |   76 +++++++++++++++++++++++++++++++++++++++++++++++++++
 test/notmuch-test   |    1 +
 6 files changed, 104 insertions(+), 2 deletions(-)
 create mode 100755 test/hex-escaping
 create mode 100644 test/hex-xcode.c

diff --git a/test/.gitignore b/test/.gitignore
index e63c689..be7ab5e 100644
--- a/test/.gitignore
+++ b/test/.gitignore
@@ -3,4 +3,5 @@ corpus.mail
 smtp-dummy
 symbol-test
 arg-test
+hex-xcode
 tmp.*
diff --git a/test/Makefile.local b/test/Makefile.local
index 6cb6c82..d27826a 100644
--- a/test/Makefile.local
+++ b/test/Makefile.local
@@ -13,6 +13,9 @@ smtp_dummy_modules = $(smtp_dummy_srcs:.c=.o)
 $(dir)/arg-test: $(dir)/arg-test.o command-line-arguments.o util/libutil.a
 	$(call quiet,CC) -I. $^ -o $@
 
+$(dir)/hex-xcode: $(dir)/hex-xcode.o util/libutil.a
+	$(call quiet,CC) -I. $^ -o $@ -ltalloc
+
 $(dir)/smtp-dummy: $(smtp_dummy_modules)
 	$(call quiet,CC) $^ -o $@
 
@@ -21,7 +24,8 @@ $(dir)/symbol-test: $(dir)/symbol-test.o
 
 .PHONY: test check
 
-test-binaries: $(dir)/arg-test $(dir)/smtp-dummy $(dir)/symbol-test
+test-binaries: $(dir)/arg-test $(dir)/hex-xcode \
+	 $(dir)/smtp-dummy $(dir)/symbol-test
 
 test:	all test-binaries
 	@${dir}/notmuch-test $(OPTIONS)
diff --git a/test/basic b/test/basic
index d6aed24..af57026 100755
--- a/test/basic
+++ b/test/basic
@@ -54,7 +54,7 @@ test_begin_subtest 'Ensure that all available tests will be run by notmuch-test'
 eval $(sed -n -e '/^TESTS="$/,/^"$/p' $TEST_DIRECTORY/notmuch-test)
 tests_in_suite=$(for i in $TESTS; do echo $i; done | sort)
 available=$(find "$TEST_DIRECTORY" -maxdepth 1 -type f -executable -printf '%f\n' | \
-    sed -r -e "/^(aggregate-results.sh|notmuch-test|smtp-dummy|test-verbose|symbol-test|arg-test)$/d" | \
+    sed -r -e "/^(aggregate-results.sh|notmuch-test|smtp-dummy|test-verbose|symbol-test|arg-test|hex-xcode)$/d" | \
     sort)
 test_expect_equal "$tests_in_suite" "$available"
 
diff --git a/test/hex-escaping b/test/hex-escaping
new file mode 100755
index 0000000..d0a993e
--- /dev/null
+++ b/test/hex-escaping
@@ -0,0 +1,20 @@
+#!/usr/bin/env bash
+test_description="hex encoding and decoding"
+. ./test-lib.sh
+
+test_begin_subtest "round trip"
+find $TEST_DIRECTORY/corpus -type f -print | sort | xargs cat > EXPECTED
+$TEST_DIRECTORY/hex-xcode e < EXPECTED | $TEST_DIRECTORY/hex-xcode d > OUTPUT
+test_expect_equal_file OUTPUT EXPECTED
+
+test_begin_subtest "punctuation"
+tag1='comic_swear=$&^%$^%\\//-+$^%$'
+tag_enc1=$($TEST_DIRECTORY/hex-xcode e "$tag1")
+test_expect_equal "$tag_enc1" "comic_swear=%24%26%5e%25%24%5e%25%5c%5c%2f%2f-+%24%5e%25%24"
+
+test_begin_subtest "round trip newlines"
+printf 'this\n tag\t has\n spaces\n' > EXPECTED.$test_count
+$TEST_DIRECTORY/hex-xcode e  < EXPECTED.$test_count |\
+	$TEST_DIRECTORY/hex-xcode d > OUTPUT.$test_count
+test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
+test_done
diff --git a/test/hex-xcode.c b/test/hex-xcode.c
new file mode 100644
index 0000000..fff3e01
--- /dev/null
+++ b/test/hex-xcode.c
@@ -0,0 +1,76 @@
+/* No, nothing to to with IDE from Apple Inc.
+   testbed for ../util/hex-escape.c.
+
+   usage: hex-xcode (e|d) < foo
+
+   e is for encode
+   d is for decode
+ */
+
+#include "notmuch-client.h"
+#include "hex-escape.h"
+#include <assert.h>
+
+static int
+xcode(void *ctx, char dir, char *in, char **buf_p, size_t *size_p)
+{
+    hex_status_t status;
+
+    if (dir == 'e')
+	status = hex_encode (ctx, in, buf_p, size_p);
+    else
+	status = hex_decode (ctx, in, buf_p, size_p);
+
+    if (status == HEX_SUCCESS)
+	puts(*buf_p);
+
+    return status;
+}
+
+
+int main (int argc, char **argv){
+
+    assert(argc > 1 && argv[1]);
+
+    char dir=argv[1][0];
+
+    void *ctx = talloc_new(NULL);
+
+    char *line = NULL;
+    size_t line_size;
+    ssize_t line_len;
+
+    char *buffer=NULL;
+    size_t buf_size=0;
+
+    int arg_index=2;
+    notmuch_bool_t read_stdin=TRUE;
+
+    for (arg_index=2; arg_index < argc; arg_index++) {
+
+	if (xcode (ctx, dir, argv[arg_index],
+		   &buffer, &buf_size) != HEX_SUCCESS)
+	    return 1;
+
+	read_stdin=FALSE;
+    }
+
+    if (!read_stdin)
+	return 0;
+
+    while ((line_len = getline (&line, &line_size, stdin)) != -1) {
+
+	chomp_newline (line);
+
+	if (xcode (ctx, dir, line, &buffer, &buf_size) != HEX_SUCCESS)
+	    return 1;
+
+    }
+
+    if (line)
+	free(line);
+
+    talloc_free(ctx);
+
+    return 0;
+}
diff --git a/test/notmuch-test b/test/notmuch-test
index ded79e8..af72c73 100755
--- a/test/notmuch-test
+++ b/test/notmuch-test
@@ -50,6 +50,7 @@ TESTS="
   python
   hooks
   argument-parsing
+  hex-escaping
 "
 TESTS=${NOTMUCH_TESTS:=$TESTS}
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Alpha PATCH 3/6] notmuch-dump: add --format=(notmuch|sup)
  2011-12-13 20:27 New Dump/Restore Format David Bremner
  2011-12-13 20:27 ` [Alpha PATCH 1/6] util/hex-escape.[ch]: encoding/decoding strings into restricted character set David Bremner
  2011-12-13 20:27 ` [Alpha PATCH 2/6] test: add test for hex_(encode|decode) David Bremner
@ 2011-12-13 20:27 ` David Bremner
  2011-12-13 20:27 ` [Alpha PATCH 4/6] test: add test for dump --format=notmuch David Bremner
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: David Bremner @ 2011-12-13 20:27 UTC (permalink / raw)
  To: notmuch; +Cc: David Bremner, pere

From: David Bremner <bremner@debian.org>

sup is the old format, and remains the default.

Each line of the notmuch format is "msg_id tag tag...tag" where each
space seperated token is 'hex-encoded' to remove troubling characters.
In particular this format won't have the same problem with e.g. spaces
in message-ids or tags; they will be round-trip-able.
---
 dump-restore-private.h |   20 ++++++++++++++++++++
 notmuch-dump.c         |   40 ++++++++++++++++++++++++++++++++++------
 2 files changed, 54 insertions(+), 6 deletions(-)
 create mode 100644 dump-restore-private.h

diff --git a/dump-restore-private.h b/dump-restore-private.h
new file mode 100644
index 0000000..eda3219
--- /dev/null
+++ b/dump-restore-private.h
@@ -0,0 +1,20 @@
+#ifndef DUMP_RESTORE_PRIVATE_H
+#define DUMP_RESTORE_PRIVATE_H
+
+#include "hex-escape.h"
+#include "command-line-arguments.h"
+
+typedef enum dump_formats {
+    DUMP_FORMAT_SUP,
+    DUMP_FORMAT_NOTMUCH
+} dump_format_t;
+
+#define FORMAT_DESC(out) \
+     {									\
+	NOTMUCH_OPT_KEYWORD, &out, "format", 'f',			\
+	(notmuch_keyword_t []){ { "sup", DUMP_FORMAT_SUP },		\
+				{ "notmuch", DUMP_FORMAT_NOTMUCH },	\
+				{0, 0} }				\
+     }
+
+#endif
diff --git a/notmuch-dump.c b/notmuch-dump.c
index a735875..5f788f9 100644
--- a/notmuch-dump.c
+++ b/notmuch-dump.c
@@ -19,6 +19,7 @@
  */
 
 #include "notmuch-client.h"
+#include "dump-restore-private.h"
 
 int
 notmuch_dump_command (unused (void *ctx), int argc, char *argv[])
@@ -44,7 +45,10 @@ notmuch_dump_command (unused (void *ctx), int argc, char *argv[])
     char *output_file_name = NULL;
     int opt_index;
 
+    int output_format = DUMP_FORMAT_SUP;
+
     notmuch_opt_desc_t options[] = {
+	FORMAT_DESC (output_format),
 	{ NOTMUCH_OPT_POSITION, &output_file_name, 0, 0, 0  },
 	{ 0, 0, 0, 0, 0 }
     };
@@ -85,29 +89,53 @@ notmuch_dump_command (unused (void *ctx), int argc, char *argv[])
      */
     notmuch_query_set_sort (query, NOTMUCH_SORT_UNSORTED);
 
+    char *buffer=NULL;
+    size_t buffer_size=0;
+
     for (messages = notmuch_query_search_messages (query);
 	 notmuch_messages_valid (messages);
 	 notmuch_messages_move_to_next (messages))
     {
 	int first = 1;
-	message = notmuch_messages_get (messages);
+	const char *message_id;
 
-	fprintf (output,
-		 "%s (", notmuch_message_get_message_id (message));
+	message = notmuch_messages_get (messages);
+	message_id = notmuch_message_get_message_id (message);
+
+	if (output_format == DUMP_FORMAT_SUP) {
+	    fprintf (output, "%s (", message_id);
+	} else {
+	    if (hex_encode (notmuch, message_id,
+			    &buffer, &buffer_size) != HEX_SUCCESS)
+		return 1;
+	    fprintf (output, "%s ", buffer);
+	}
 
 	for (tags = notmuch_message_get_tags (message);
 	     notmuch_tags_valid (tags);
 	     notmuch_tags_move_to_next (tags))
 	{
+	    const char *tag_str = notmuch_tags_get (tags);
+
 	    if (! first)
-		fprintf (output, " ");
+		fputs (" ", output);
 
-	    fprintf (output, "%s", notmuch_tags_get (tags));
+	    if (output_format == DUMP_FORMAT_SUP) {
+		fputs (tag_str, output);
+	    } else {
+		if (hex_encode (notmuch, tag_str,
+				&buffer, &buffer_size) != HEX_SUCCESS)
+		    return 1;
 
+		fputs (buffer, output);
+	    }
 	    first = 0;
 	}
 
-	fprintf (output, ")\n");
+	if (output_format == DUMP_FORMAT_SUP)
+	    fputs (")\n",output);
+	else
+	    fputs ("\n", output);
 
 	notmuch_message_destroy (message);
     }
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Alpha PATCH 4/6] test: add test for dump --format=notmuch
  2011-12-13 20:27 New Dump/Restore Format David Bremner
                   ` (2 preceding siblings ...)
  2011-12-13 20:27 ` [Alpha PATCH 3/6] notmuch-dump: add --format=(notmuch|sup) David Bremner
@ 2011-12-13 20:27 ` David Bremner
  2011-12-13 20:27 ` [Alpha PATCH 5/6] notmuch-restore: add --format=notmuch support David Bremner
  2011-12-13 20:27 ` [Alpha PATCH 6/6] test: second set of dump/restore --format=notmuch tests David Bremner
  5 siblings, 0 replies; 12+ messages in thread
From: David Bremner @ 2011-12-13 20:27 UTC (permalink / raw)
  To: notmuch; +Cc: David Bremner, pere

From: David Bremner <bremner@debian.org>

The first test is really to test our assumptions about the corpus,
namely that a certain set of message-id's is safe (i.e. doesn't change
under hex-escaping).  We then check dump output as best we can without
functionality-to-come in notmuch-restore.
---
 test/dump-restore |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/test/dump-restore b/test/dump-restore
index 439e998..48caf4e 100755
--- a/test/dump-restore
+++ b/test/dump-restore
@@ -82,4 +82,16 @@ test_begin_subtest "dump outfile -- from:cworth"
 notmuch dump dump-outfile-dash-inbox.actual -- from:cworth
 test_expect_equal_file dump-cworth.expected dump-outfile-dash-inbox.actual
 
+test_begin_subtest "Check for a safe set of message-ids"
+notmuch search --output=messages from:cworth > EXPECTED.$test_count
+notmuch search --output=messages from:cworth |\
+	$TEST_DIRECTORY/hex-xcode e > OUTPUT.$test_count
+test_expect_equal_file OUTPUT.$test_count EXPECTED.$test_count
+
+# we have observed that cworth has sane message-ids, and hopefully sane tags.
+test_begin_subtest "dump --format=notmuch -- from:cworth"
+notmuch dump --format=sup -- from:cworth | tr -d \(\) > EXPECTED.$test_count
+notmuch dump --format=notmuch -- from:cworth > OUTPUT.$test_count
+test_expect_equal_file OUTPUT.$test_count EXPECTED.$test_count
+
 test_done
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Alpha PATCH 5/6] notmuch-restore: add --format=notmuch support
  2011-12-13 20:27 New Dump/Restore Format David Bremner
                   ` (3 preceding siblings ...)
  2011-12-13 20:27 ` [Alpha PATCH 4/6] test: add test for dump --format=notmuch David Bremner
@ 2011-12-13 20:27 ` David Bremner
  2011-12-13 20:27 ` [Alpha PATCH 6/6] test: second set of dump/restore --format=notmuch tests David Bremner
  5 siblings, 0 replies; 12+ messages in thread
From: David Bremner @ 2011-12-13 20:27 UTC (permalink / raw)
  To: notmuch; +Cc: David Bremner, pere

From: David Bremner <bremner@debian.org>

This is format is whitespace separated tokens, encoded by
util/hex-escape.c
---
 notmuch-restore.c |   83 ++++++++++++++++++++++++++++++++++++++++------------
 1 files changed, 64 insertions(+), 19 deletions(-)

diff --git a/notmuch-restore.c b/notmuch-restore.c
index 87d9772..f392590 100644
--- a/notmuch-restore.c
+++ b/notmuch-restore.c
@@ -19,6 +19,7 @@
  */
 
 #include "notmuch-client.h"
+#include "dump-restore-private.h"
 
 int
 notmuch_restore_command (unused (void *ctx), int argc, char *argv[])
@@ -35,6 +36,7 @@ notmuch_restore_command (unused (void *ctx), int argc, char *argv[])
     regex_t regex;
     int rerr;
     int opt_index;
+    int input_format = DUMP_FORMAT_SUP;
 
     config = notmuch_config_open (ctx, NULL, NULL);
     if (config == NULL)
@@ -48,6 +50,7 @@ notmuch_restore_command (unused (void *ctx), int argc, char *argv[])
     synchronize_flags = notmuch_config_get_maildir_synchronize_flags (config);
 
     notmuch_opt_desc_t options[] = {
+	FORMAT_DESC (input_format),
 	{ NOTMUCH_OPT_POSITION, &input_file_name, 0, 0, 0 },
 	{ NOTMUCH_OPT_BOOLEAN,  &accumulate, "accumulate", 'a', 0 },
 	{ 0, 0, 0, 0, 0 }
@@ -81,33 +84,63 @@ notmuch_restore_command (unused (void *ctx), int argc, char *argv[])
      * non-space characters for the message-id, then one or more
      * spaces, then a list of space-separated tags as a sequence of
      * characters within literal '(' and ')'. */
-    if ( xregcomp (&regex,
-		   "^([^ ]+) \\(([^)]*)\\)$",
-		   REG_EXTENDED) )
-	INTERNAL_ERROR("compile time constant regex failed.");
+    if (input_format == DUMP_FORMAT_SUP) {
+	if ( xregcomp (&regex,
+		       "^([^ ]+) \\(([^)]*)\\)$",
+		       REG_EXTENDED) )
+	    INTERNAL_ERROR("compile time constant regex failed.");
+    }
+
+
+    /* These are out here to re-use the buffers with hex_decode */
+
+    char *message_id = NULL;
+    size_t message_id_size=0;
+    char *tag = NULL;
+    size_t tag_size=0;
 
     while ((line_len = getline (&line, &line_size, input)) != -1) {
 	regmatch_t match[3];
-	char *message_id, *file_tags, *tag, *next;
+	char  *file_tags, *next;
 	notmuch_message_t *message = NULL;
+
 	notmuch_status_t status;
 	notmuch_tags_t *db_tags;
 	char *db_tags_str;
 
 	chomp_newline (line);
 
-	rerr = xregexec (&regex, line, 3, match, 0);
-	if (rerr == REG_NOMATCH)
-	{
-	    fprintf (stderr, "Warning: Ignoring invalid input line: %s\n",
-		     line);
-	    continue;
+	/* Silently ignore blank lines */
+
+	if (line[0] == '\0') {
+		continue;
+	}
+
+	if (input_format == DUMP_FORMAT_SUP) {
+	    rerr = xregexec (&regex, line, 3, match, 0);
+	    if (rerr == REG_NOMATCH)
+	    {
+		fprintf (stderr, "Warning: Ignoring invalid input line: %s\n",
+			 line);
+		continue;
+	    }
+	    message_id = talloc_strndup (notmuch, line + match[1].rm_so,
+					 match[1].rm_eo - match[1].rm_so);
+	    file_tags = talloc_strndup (notmuch, line + match[2].rm_so,
+					match[2].rm_eo - match[2].rm_so);
+	} else {
+	    char *p = line;
+	    char *raw_mid;
+
+	    raw_mid = strsep (&p, " \t");
+
+	    if (hex_decode (notmuch, raw_mid,
+			    &message_id, &message_id_size) != HEX_SUCCESS)
+		return 1;
+
+	    file_tags = xstrdup (p);
 	}
 
-	message_id = xstrndup (line + match[1].rm_so,
-			       match[1].rm_eo - match[1].rm_so);
-	file_tags = xstrndup (line + match[2].rm_so,
-			      match[2].rm_eo - match[2].rm_so);
 
 	status = notmuch_database_find_message (notmuch, message_id, &message);
 	if (status || message == NULL) {
@@ -153,7 +186,16 @@ notmuch_restore_command (unused (void *ctx), int argc, char *argv[])
 
 	next = file_tags;
 	while (next) {
-	    tag = strsep (&next, " ");
+	    char *raw_tag = strsep (&next, " ");
+
+	    if (input_format == DUMP_FORMAT_NOTMUCH) {
+		if (hex_decode (notmuch, raw_tag,
+			    &tag, &tag_size) != HEX_SUCCESS)
+		    return 1;
+	    } else {
+		tag = talloc_strdup(notmuch, raw_tag);
+	    }
+
 	    if (*tag == '\0')
 		continue;
 	    status = notmuch_message_add_tag (message, tag);
@@ -175,11 +217,14 @@ notmuch_restore_command (unused (void *ctx), int argc, char *argv[])
 	if (message)
 	    notmuch_message_destroy (message);
 	message = NULL;
-	free (message_id);
-	free (file_tags);
+	if (input_format == DUMP_FORMAT_SUP) {
+	    talloc_free (message_id);
+	    talloc_free (file_tags);
+	}
     }
 
-    regfree (&regex);
+    if (input_format == DUMP_FORMAT_SUP)
+	regfree (&regex);
 
     if (line)
 	free (line);
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Alpha PATCH 6/6] test: second set of dump/restore --format=notmuch tests
  2011-12-13 20:27 New Dump/Restore Format David Bremner
                   ` (4 preceding siblings ...)
  2011-12-13 20:27 ` [Alpha PATCH 5/6] notmuch-restore: add --format=notmuch support David Bremner
@ 2011-12-13 20:27 ` David Bremner
  2011-12-14 20:14   ` [Alpha Patch 1/2] test: add (currently broken) test 8 bit characters hex-escape and dump-restore David Bremner
  2011-12-14 20:24   ` [Alpha PATCH 6/6] test: second set of dump/restore --format=notmuch tests Dmitry Kurochkin
  5 siblings, 2 replies; 12+ messages in thread
From: David Bremner @ 2011-12-13 20:27 UTC (permalink / raw)
  To: notmuch; +Cc: David Bremner, pere

From: David Bremner <bremner@debian.org>

These one need the completed functionality in notmuch-restore. Fairly
exotic tags are tested, but no weird message id's.
---
 test/dump-restore |   33 +++++++++++++++++++++++++++++++++
 1 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/test/dump-restore b/test/dump-restore
index 48caf4e..122de5c 100755
--- a/test/dump-restore
+++ b/test/dump-restore
@@ -94,4 +94,37 @@ notmuch dump --format=sup -- from:cworth | tr -d \(\) > EXPECTED.$test_count
 notmuch dump --format=notmuch -- from:cworth > OUTPUT.$test_count
 test_expect_equal_file OUTPUT.$test_count EXPECTED.$test_count
 
+test_begin_subtest "format=notmuch, # round-trip"
+notmuch dump --format=sup | sort > EXPECTED.$test_count
+notmuch dump --format=notmuch | notmuch restore --format=notmuch
+notmuch dump --format=sup | sort > OUTPUT.$test_count
+test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
+
+tag1='comic_swear=$&^%$^%\\//-+$^%$'
+enc1=$($TEST_DIRECTORY/hex-xcode e "$tag1")
+
+tag2=$(printf 'this\n tag\t has\n spaces')
+enc2=$($TEST_DIRECTORY/hex-xcode e "$tag2")
+
+notmuch dump --format=notmuch > BACKUP
+
+notmuch tag +"$tag1" +"$tag2" -inbox -unread "*"
+
+test_begin_subtest 'format=notmuch, round trip with strange tags'
+   notmuch dump --format=notmuch > EXPECTED.$test_count
+   notmuch dump --format=notmuch | notmuch restore --format=notmuch
+   notmuch dump --format=notmuch > OUTPUT.$test_count
+test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
+
+
+test_begin_subtest 'format=notmuch, checking encoded output'
+    cp /dev/null EXPECTED.$test_count
+    notmuch dump --format=notmuch -- from:cworth |\
+	 awk "{ print \$1 \" $enc1 $enc2\" }" > EXPECTED.$test_count
+
+    notmuch dump --format=notmuch -- from:cworth  > OUTPUT.$test_count
+test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
+
+notmuch restore --format=notmuch < BACKUP
+
 test_done
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Alpha Patch 1/2] test: add (currently broken) test 8 bit characters hex-escape and dump-restore
  2011-12-13 20:27 ` [Alpha PATCH 6/6] test: second set of dump/restore --format=notmuch tests David Bremner
@ 2011-12-14 20:14   ` David Bremner
  2011-12-14 20:14     ` [Alpha Patch 2/2] test: update dump-restore tests for 8 bit characters David Bremner
  2011-12-14 20:24   ` [Alpha PATCH 6/6] test: second set of dump/restore --format=notmuch tests Dmitry Kurochkin
  1 sibling, 1 reply; 12+ messages in thread
From: David Bremner @ 2011-12-14 20:14 UTC (permalink / raw)
  To: notmuch; +Cc: David Bremner, pere

From: David Bremner <bremner@debian.org>

The problem is a use of signed chars in places where unsigned
chars (i.e. 0-255) should be used.
---

Well, I did mention more tests were needed ;). I failed to test 8 bit
(>127) stuff and sure enough it was broken.  This sets up some tests
to demonstrate the problem and the next patch fixes it.

 test/dump-restore |    9 +++++++--
 test/hex-escaping |    7 +++++++
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/test/dump-restore b/test/dump-restore
index 122de5c..eee1773 100755
--- a/test/dump-restore
+++ b/test/dump-restore
@@ -106,11 +106,15 @@ enc1=$($TEST_DIRECTORY/hex-xcode e "$tag1")
 tag2=$(printf 'this\n tag\t has\n spaces')
 enc2=$($TEST_DIRECTORY/hex-xcode e "$tag2")
 
+enc3='%c3%91%c3%a5%c3%b0%c3%a3%c3%a5%c3%a9-%c3%8f%c3%8a'
+tag3=$($TEST_DIRECTORY/hex-xcode d $enc3)
+
 notmuch dump --format=notmuch > BACKUP
 
-notmuch tag +"$tag1" +"$tag2" -inbox -unread "*"
+notmuch tag +"$tag1" +"$tag2" +"$tag3" -inbox -unread "*"
 
 test_begin_subtest 'format=notmuch, round trip with strange tags'
+   test_subtest_known_broken
    notmuch dump --format=notmuch > EXPECTED.$test_count
    notmuch dump --format=notmuch | notmuch restore --format=notmuch
    notmuch dump --format=notmuch > OUTPUT.$test_count
@@ -118,9 +122,10 @@ test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
 
 
 test_begin_subtest 'format=notmuch, checking encoded output'
+    test_subtest_known_broken
     cp /dev/null EXPECTED.$test_count
     notmuch dump --format=notmuch -- from:cworth |\
-	 awk "{ print \$1 \" $enc1 $enc2\" }" > EXPECTED.$test_count
+	 awk "{ print \$1 \" $enc1 $enc2 $enc3\" }" > EXPECTED.$test_count
 
     notmuch dump --format=notmuch -- from:cworth  > OUTPUT.$test_count
 test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
diff --git a/test/hex-escaping b/test/hex-escaping
index d0a993e..2053fb0 100755
--- a/test/hex-escaping
+++ b/test/hex-escaping
@@ -17,4 +17,11 @@ printf 'this\n tag\t has\n spaces\n' > EXPECTED.$test_count
 $TEST_DIRECTORY/hex-xcode e  < EXPECTED.$test_count |\
 	$TEST_DIRECTORY/hex-xcode d > OUTPUT.$test_count
 test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
+
+test_begin_subtest "round trip 8bit chars"
+test_subtest_known_broken
+echo '%c3%91%c3%a5%c3%b0%c3%a3%c3%a5%c3%a9-%c3%8f%c3%8a' > EXPECTED.$test_count
+$TEST_DIRECTORY/hex-xcode d  < EXPECTED.$test_count |\
+	$TEST_DIRECTORY/hex-xcode e > OUTPUT.$test_count
+test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
 test_done
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Alpha Patch 2/2] test: update dump-restore tests for 8 bit characters.
  2011-12-14 20:14   ` [Alpha Patch 1/2] test: add (currently broken) test 8 bit characters hex-escape and dump-restore David Bremner
@ 2011-12-14 20:14     ` David Bremner
  2011-12-14 20:36       ` Dmitry Kurochkin
  0 siblings, 1 reply; 12+ messages in thread
From: David Bremner @ 2011-12-14 20:14 UTC (permalink / raw)
  To: notmuch; +Cc: David Bremner, pere

From: David Bremner <bremner@debian.org>

hex-escape: fix for handling of 8 bit chars

The low level problem was passing negative numbers to sprintf(s,"%x");
we fix this and clarify the api for hex_(decode|encode) by making
encode go from (unsigned char *) (i.e. 8bit) to (char *) and decode
vise-versa.
---
 test/dump-restore |    2 --
 test/hex-escaping |    1 -
 util/hex-escape.c |   26 +++++++++++++++-----------
 util/hex-escape.h |    6 ++++--
 4 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/test/dump-restore b/test/dump-restore
index eee1773..c5b2e86 100755
--- a/test/dump-restore
+++ b/test/dump-restore
@@ -114,7 +114,6 @@ notmuch dump --format=notmuch > BACKUP
 notmuch tag +"$tag1" +"$tag2" +"$tag3" -inbox -unread "*"
 
 test_begin_subtest 'format=notmuch, round trip with strange tags'
-   test_subtest_known_broken
    notmuch dump --format=notmuch > EXPECTED.$test_count
    notmuch dump --format=notmuch | notmuch restore --format=notmuch
    notmuch dump --format=notmuch > OUTPUT.$test_count
@@ -122,7 +121,6 @@ test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
 
 
 test_begin_subtest 'format=notmuch, checking encoded output'
-    test_subtest_known_broken
     cp /dev/null EXPECTED.$test_count
     notmuch dump --format=notmuch -- from:cworth |\
 	 awk "{ print \$1 \" $enc1 $enc2 $enc3\" }" > EXPECTED.$test_count
diff --git a/test/hex-escaping b/test/hex-escaping
index 2053fb0..daa6446 100755
--- a/test/hex-escaping
+++ b/test/hex-escaping
@@ -19,7 +19,6 @@ $TEST_DIRECTORY/hex-xcode e  < EXPECTED.$test_count |\
 test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
 
 test_begin_subtest "round trip 8bit chars"
-test_subtest_known_broken
 echo '%c3%91%c3%a5%c3%b0%c3%a3%c3%a5%c3%a9-%c3%8f%c3%8a' > EXPECTED.$test_count
 $TEST_DIRECTORY/hex-xcode d  < EXPECTED.$test_count |\
 	$TEST_DIRECTORY/hex-xcode e > OUTPUT.$test_count
diff --git a/util/hex-escape.c b/util/hex-escape.c
index dcf87cf..565ae99 100644
--- a/util/hex-escape.c
+++ b/util/hex-escape.c
@@ -28,23 +28,24 @@ static const size_t default_buf_size=1024;
 static const char* output_charset=
     "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+-_@=.:,";
 
-static const char escape_char='%';
+static const int escape_char = '%';
 
 static int
 is_output (char c) {
     return (strchr (output_charset, c) != NULL);
 }
 
+typedef unsigned char _octet;
 
 static int
-maybe_realloc(void *ctx, size_t needed, char **out, size_t *out_size)
+maybe_realloc(void *ctx, size_t needed, _octet **out, size_t *out_size)
 {
     if (*out_size < needed) {
 
 	if (*out == NULL)
 	    *out = talloc_size(ctx,needed);
 	else
-	    *out = talloc_realloc(ctx,*out,char,needed);
+	    *out = talloc_realloc(ctx, *out, _octet, needed);
 
 	if (*out == NULL)
 	    return 0;
@@ -56,24 +57,27 @@ maybe_realloc(void *ctx, size_t needed, char **out, size_t *out_size)
 
 
 hex_status_t
-hex_encode (void *ctx, const char *in, char **out, size_t *out_size)
+hex_encode (void *ctx, const _octet *in, char **out, size_t *out_size)
 {
 
-    const char *p;
+    const _octet *p;
     char *q;
 
-    int escape_count=0;
+    size_t escape_count = 0;
+    size_t len = 0;
     size_t needed;
 
-    for  (p = in; *p; p++)
+    for  (p = in; *p; p++) {
 	escape_count += (! is_output (*p));
+	len++;
+    }
 
-    needed = strlen (in) + 2*escape_count + 1;
+    needed = len + 2*escape_count + 1;
 
     if (*out == NULL)
 	*out_size=0;
 
-    if (!maybe_realloc (ctx, needed, out, out_size))
+    if (!maybe_realloc (ctx, needed, (_octet**)out, out_size))
 	return HEX_OUT_OF_MEMORY;
 
     q = *out;
@@ -94,12 +98,12 @@ hex_encode (void *ctx, const char *in, char **out, size_t *out_size)
 
 
 hex_status_t
-hex_decode (void *ctx, const char *in, char **out, size_t *out_size) {
+hex_decode (void *ctx, const char *in, _octet **out, size_t *out_size) {
 
     char buf[3];
 
     const char *p;
-    char *q;
+    _octet *q;
 
     size_t escape_count = 0;
     size_t needed = 0;
diff --git a/util/hex-escape.h b/util/hex-escape.h
index 98ecbe0..e04aff5 100644
--- a/util/hex-escape.h
+++ b/util/hex-escape.h
@@ -8,8 +8,10 @@ typedef enum hex_status {
 } hex_status_t;
 
 hex_status_t
-hex_encode (void *talloc_ctx, const char *in, char **out, size_t *out_size);
+hex_encode (void *talloc_ctx, const unsigned char *in, char **out,
+	    size_t *out_size);
 
 hex_status_t
-hex_decode (void *talloc_ctx, const char *in, char **out, size_t *out_size);
+hex_decode (void *talloc_ctx, const char *in, unsigned char **out,
+	    size_t *out_size);
 #endif
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [Alpha PATCH 6/6] test: second set of dump/restore --format=notmuch tests
  2011-12-13 20:27 ` [Alpha PATCH 6/6] test: second set of dump/restore --format=notmuch tests David Bremner
  2011-12-14 20:14   ` [Alpha Patch 1/2] test: add (currently broken) test 8 bit characters hex-escape and dump-restore David Bremner
@ 2011-12-14 20:24   ` Dmitry Kurochkin
  1 sibling, 0 replies; 12+ messages in thread
From: Dmitry Kurochkin @ 2011-12-14 20:24 UTC (permalink / raw)
  To: David Bremner, notmuch; +Cc: David Bremner, pere

On Tue, 13 Dec 2011 16:27:55 -0400, David Bremner <david@tethera.net> wrote:
> From: David Bremner <bremner@debian.org>
> 
> These one need the completed functionality in notmuch-restore. Fairly
> exotic tags are tested, but no weird message id's.
> ---
>  test/dump-restore |   33 +++++++++++++++++++++++++++++++++
>  1 files changed, 33 insertions(+), 0 deletions(-)
> 
> diff --git a/test/dump-restore b/test/dump-restore
> index 48caf4e..122de5c 100755
> --- a/test/dump-restore
> +++ b/test/dump-restore
> @@ -94,4 +94,37 @@ notmuch dump --format=sup -- from:cworth | tr -d \(\) > EXPECTED.$test_count
>  notmuch dump --format=notmuch -- from:cworth > OUTPUT.$test_count
>  test_expect_equal_file OUTPUT.$test_count EXPECTED.$test_count
>  
> +test_begin_subtest "format=notmuch, # round-trip"
> +notmuch dump --format=sup | sort > EXPECTED.$test_count
> +notmuch dump --format=notmuch | notmuch restore --format=notmuch
> +notmuch dump --format=sup | sort > OUTPUT.$test_count
> +test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
> +
> +tag1='comic_swear=$&^%$^%\\//-+$^%$'
> +enc1=$($TEST_DIRECTORY/hex-xcode e "$tag1")
> +
> +tag2=$(printf 'this\n tag\t has\n spaces')
> +enc2=$($TEST_DIRECTORY/hex-xcode e "$tag2")
> +
> +notmuch dump --format=notmuch > BACKUP
> +
> +notmuch tag +"$tag1" +"$tag2" -inbox -unread "*"
> +
> +test_begin_subtest 'format=notmuch, round trip with strange tags'
> +   notmuch dump --format=notmuch > EXPECTED.$test_count
> +   notmuch dump --format=notmuch | notmuch restore --format=notmuch
> +   notmuch dump --format=notmuch > OUTPUT.$test_count

Noticed this while looking at the 8bit patch.  The indent offset should
be 4.

Regards,
  Dmitry

> +test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
> +
> +
> +test_begin_subtest 'format=notmuch, checking encoded output'
> +    cp /dev/null EXPECTED.$test_count
> +    notmuch dump --format=notmuch -- from:cworth |\
> +	 awk "{ print \$1 \" $enc1 $enc2\" }" > EXPECTED.$test_count
> +
> +    notmuch dump --format=notmuch -- from:cworth  > OUTPUT.$test_count
> +test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
> +
> +notmuch restore --format=notmuch < BACKUP
> +
>  test_done
> -- 
> 1.7.5.4
> 
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Alpha Patch 2/2] test: update dump-restore tests for 8 bit characters.
  2011-12-14 20:14     ` [Alpha Patch 2/2] test: update dump-restore tests for 8 bit characters David Bremner
@ 2011-12-14 20:36       ` Dmitry Kurochkin
  2011-12-14 23:20         ` David Bremner
  0 siblings, 1 reply; 12+ messages in thread
From: Dmitry Kurochkin @ 2011-12-14 20:36 UTC (permalink / raw)
  To: David Bremner, notmuch; +Cc: David Bremner, pere

On Wed, 14 Dec 2011 16:14:01 -0400, David Bremner <david@tethera.net> wrote:
> From: David Bremner <bremner@debian.org>
> 
> hex-escape: fix for handling of 8 bit chars
> 
> The low level problem was passing negative numbers to sprintf(s,"%x");
> we fix this and clarify the api for hex_(decode|encode) by making
> encode go from (unsigned char *) (i.e. 8bit) to (char *) and decode
> vise-versa.

I did not do a proper review.  But I think the encoder and decoder
should accept and return the same type, either char* or unsigned char*.
The decision should be based on what type strings (that would be fed to
the encoder and decoder) have in notmuch code.  I guess it is char*, so
the encoder and decoder should take and return char*.  Internally we
would cast char* to unsigned char*.

Also, I do not like the _octet typedef in hex-escape.c.  Having
different function parameters in header and .c is confusing.  IMO we
should either move the typedef to some header, or just use unsigned
char.

Regards,
  Dmitry

> ---
>  test/dump-restore |    2 --
>  test/hex-escaping |    1 -
>  util/hex-escape.c |   26 +++++++++++++++-----------
>  util/hex-escape.h |    6 ++++--
>  4 files changed, 19 insertions(+), 16 deletions(-)
> 
> diff --git a/test/dump-restore b/test/dump-restore
> index eee1773..c5b2e86 100755
> --- a/test/dump-restore
> +++ b/test/dump-restore
> @@ -114,7 +114,6 @@ notmuch dump --format=notmuch > BACKUP
>  notmuch tag +"$tag1" +"$tag2" +"$tag3" -inbox -unread "*"
>  
>  test_begin_subtest 'format=notmuch, round trip with strange tags'
> -   test_subtest_known_broken
>     notmuch dump --format=notmuch > EXPECTED.$test_count
>     notmuch dump --format=notmuch | notmuch restore --format=notmuch
>     notmuch dump --format=notmuch > OUTPUT.$test_count
> @@ -122,7 +121,6 @@ test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
>  
>  
>  test_begin_subtest 'format=notmuch, checking encoded output'
> -    test_subtest_known_broken
>      cp /dev/null EXPECTED.$test_count
>      notmuch dump --format=notmuch -- from:cworth |\
>  	 awk "{ print \$1 \" $enc1 $enc2 $enc3\" }" > EXPECTED.$test_count
> diff --git a/test/hex-escaping b/test/hex-escaping
> index 2053fb0..daa6446 100755
> --- a/test/hex-escaping
> +++ b/test/hex-escaping
> @@ -19,7 +19,6 @@ $TEST_DIRECTORY/hex-xcode e  < EXPECTED.$test_count |\
>  test_expect_equal_file EXPECTED.$test_count OUTPUT.$test_count
>  
>  test_begin_subtest "round trip 8bit chars"
> -test_subtest_known_broken
>  echo '%c3%91%c3%a5%c3%b0%c3%a3%c3%a5%c3%a9-%c3%8f%c3%8a' > EXPECTED.$test_count
>  $TEST_DIRECTORY/hex-xcode d  < EXPECTED.$test_count |\
>  	$TEST_DIRECTORY/hex-xcode e > OUTPUT.$test_count
> diff --git a/util/hex-escape.c b/util/hex-escape.c
> index dcf87cf..565ae99 100644
> --- a/util/hex-escape.c
> +++ b/util/hex-escape.c
> @@ -28,23 +28,24 @@ static const size_t default_buf_size=1024;
>  static const char* output_charset=
>      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+-_@=.:,";
>  
> -static const char escape_char='%';
> +static const int escape_char = '%';
>  
>  static int
>  is_output (char c) {
>      return (strchr (output_charset, c) != NULL);
>  }
>  
> +typedef unsigned char _octet;
>  
>  static int
> -maybe_realloc(void *ctx, size_t needed, char **out, size_t *out_size)
> +maybe_realloc(void *ctx, size_t needed, _octet **out, size_t *out_size)
>  {
>      if (*out_size < needed) {
>  
>  	if (*out == NULL)
>  	    *out = talloc_size(ctx,needed);
>  	else
> -	    *out = talloc_realloc(ctx,*out,char,needed);
> +	    *out = talloc_realloc(ctx, *out, _octet, needed);
>  
>  	if (*out == NULL)
>  	    return 0;
> @@ -56,24 +57,27 @@ maybe_realloc(void *ctx, size_t needed, char **out, size_t *out_size)
>  
>  
>  hex_status_t
> -hex_encode (void *ctx, const char *in, char **out, size_t *out_size)
> +hex_encode (void *ctx, const _octet *in, char **out, size_t *out_size)
>  {
>  
> -    const char *p;
> +    const _octet *p;
>      char *q;
>  
> -    int escape_count=0;
> +    size_t escape_count = 0;
> +    size_t len = 0;
>      size_t needed;
>  
> -    for  (p = in; *p; p++)
> +    for  (p = in; *p; p++) {
>  	escape_count += (! is_output (*p));
> +	len++;
> +    }
>  
> -    needed = strlen (in) + 2*escape_count + 1;
> +    needed = len + 2*escape_count + 1;
>  
>      if (*out == NULL)
>  	*out_size=0;
>  
> -    if (!maybe_realloc (ctx, needed, out, out_size))
> +    if (!maybe_realloc (ctx, needed, (_octet**)out, out_size))
>  	return HEX_OUT_OF_MEMORY;
>  
>      q = *out;
> @@ -94,12 +98,12 @@ hex_encode (void *ctx, const char *in, char **out, size_t *out_size)
>  
>  
>  hex_status_t
> -hex_decode (void *ctx, const char *in, char **out, size_t *out_size) {
> +hex_decode (void *ctx, const char *in, _octet **out, size_t *out_size) {
>  
>      char buf[3];
>  
>      const char *p;
> -    char *q;
> +    _octet *q;
>  
>      size_t escape_count = 0;
>      size_t needed = 0;
> diff --git a/util/hex-escape.h b/util/hex-escape.h
> index 98ecbe0..e04aff5 100644
> --- a/util/hex-escape.h
> +++ b/util/hex-escape.h
> @@ -8,8 +8,10 @@ typedef enum hex_status {
>  } hex_status_t;
>  
>  hex_status_t
> -hex_encode (void *talloc_ctx, const char *in, char **out, size_t *out_size);
> +hex_encode (void *talloc_ctx, const unsigned char *in, char **out,
> +	    size_t *out_size);
>  
>  hex_status_t
> -hex_decode (void *talloc_ctx, const char *in, char **out, size_t *out_size);
> +hex_decode (void *talloc_ctx, const char *in, unsigned char **out,
> +	    size_t *out_size);
>  #endif
> -- 
> 1.7.7.3
> 
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Alpha Patch 2/2] test: update dump-restore tests for 8 bit characters.
  2011-12-14 20:36       ` Dmitry Kurochkin
@ 2011-12-14 23:20         ` David Bremner
  0 siblings, 0 replies; 12+ messages in thread
From: David Bremner @ 2011-12-14 23:20 UTC (permalink / raw)
  To: Dmitry Kurochkin, notmuch; +Cc: pere

On Thu, 15 Dec 2011 00:36:38 +0400, Dmitry Kurochkin <dmitry.kurochkin@gmail.com> wrote:
> 
> I did not do a proper review.  But I think the encoder and decoder
> should accept and return the same type, either char* or unsigned char*.
> The decision should be based on what type strings (that would be fed to
> the encoder and decoder) have in notmuch code.  I guess it is char*, so
> the encoder and decoder should take and return char*.  Internally we
> would cast char* to unsigned char*.

After staring at the draft C99 standard a bit, I'm inclined to agree. I
think char is the generic, which to my horror is really either unsigned
char or signed char is an implementation dependent way.

The info I was missing was in the description of <string.h>

,----
| 3 For all functions in this subclause, each character shall be interpreted as if it had the type
|  unsigned char (and therefore every possible object representation is valid and has a
| different value).
`----

I'll fix this in git, but I probably won't bother with another round of
patches yet.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2011-12-14 23:21 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-12-13 20:27 New Dump/Restore Format David Bremner
2011-12-13 20:27 ` [Alpha PATCH 1/6] util/hex-escape.[ch]: encoding/decoding strings into restricted character set David Bremner
2011-12-13 20:27 ` [Alpha PATCH 2/6] test: add test for hex_(encode|decode) David Bremner
2011-12-13 20:27 ` [Alpha PATCH 3/6] notmuch-dump: add --format=(notmuch|sup) David Bremner
2011-12-13 20:27 ` [Alpha PATCH 4/6] test: add test for dump --format=notmuch David Bremner
2011-12-13 20:27 ` [Alpha PATCH 5/6] notmuch-restore: add --format=notmuch support David Bremner
2011-12-13 20:27 ` [Alpha PATCH 6/6] test: second set of dump/restore --format=notmuch tests David Bremner
2011-12-14 20:14   ` [Alpha Patch 1/2] test: add (currently broken) test 8 bit characters hex-escape and dump-restore David Bremner
2011-12-14 20:14     ` [Alpha Patch 2/2] test: update dump-restore tests for 8 bit characters David Bremner
2011-12-14 20:36       ` Dmitry Kurochkin
2011-12-14 23:20         ` David Bremner
2011-12-14 20:24   ` [Alpha PATCH 6/6] test: second set of dump/restore --format=notmuch tests Dmitry Kurochkin

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).