unofficial mirror of notmuch@notmuchmail.org
* Xapian Quote tags
@ 2013-01-04 13:55 Mark Walters
From: Mark Walters @ 2013-01-04 13:55 UTC
  To: notmuch


Hello

I would like to suggest that we Xapian-quote the tags for notmuch
dump/restore. The general view on IRC is that we probably want to do
this in the long term, and I think it would be nice if we could avoid
changing the dump format a second time.
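
As a rough illustration of the quoting rule referred to above (this is
not part of the patch), the sketch below wraps a tag in double quotes
and doubles any embedded double quote, which is the inverse of the
un-doubling done on restore. The name quote_tag_sketch is hypothetical,
and the helper always quotes; it is not notmuch's make_boolean_term and
makes no claim about when that function chooses to quote.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not notmuch's make_boolean_term): wrap `in`
 * in double quotes and double every embedded double quote. Returns a
 * malloc'd string the caller must free, or NULL if allocation fails. */
static char *
quote_tag_sketch (const char *in)
{
    size_t len;
    char *out, *p;

    assert (in != NULL);
    len = strlen (in);

    /* Worst case: every byte is a quote and gets doubled, plus the
     * two enclosing quotes and the terminating NUL. */
    out = malloc (2 * len + 3);
    if (out == NULL)
	return NULL;

    p = out;
    *p++ = '"';
    for (; *in; in++) {
	if (*in == '"')
	    *p++ = '"';		/* double the embedded quote */
	*p++ = *in;		/* newlines are copied unchanged */
    }
    *p++ = '"';
    *p = '\0';
    return out;
}
```

Note that a newline inside the tag is copied through unchanged, so the
quoted form can still span more than one line.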

One problem is that our current line-based parsing cannot cope with
Xapian-quoted newlines (which stay newlines). So I propose allowing a
line to start with %, which means "hex-decode the whole line before
passing it to the main parser". The query, tags, etc. therefore still
need to be Xapian-encoded; the hex encoding is an extra layer on top.
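
To make the leading-% idea concrete, here is a minimal standalone
sketch (not the code from the patch). It assumes the hex encoding uses
%XX escapes, with the first % of the line acting only as a marker; the
function names are hypothetical stand-ins, not notmuch's
hex_decode_inplace.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the value of a hex digit, or -1 if c is not one. */
static int
hex_digit_value (char c)
{
    if (c >= '0' && c <= '9')
	return c - '0';
    if (c >= 'a' && c <= 'f')
	return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
	return c - 'A' + 10;
    return -1;
}

/* Hypothetical: decode %XX escapes in place. Returns 0 on success,
 * -1 on a malformed or truncated escape. */
static int
percent_decode_inplace_sketch (char *s)
{
    char *out = s;

    assert (s != NULL);
    while (*s) {
	if (*s == '%') {
	    int hi = hex_digit_value (s[1]);
	    /* Only look at s[2] once s[1] is known to be a digit,
	     * so we never read past the terminating NUL. */
	    int lo = (hi < 0) ? -1 : hex_digit_value (s[2]);
	    if (hi < 0 || lo < 0)
		return -1;
	    *out++ = (char) (hi * 16 + lo);
	    s += 3;
	} else {
	    *out++ = *s++;
	}
    }
    *out = '\0';
    return 0;
}

/* Hypothetical: if the line starts with the % marker, hex-decode the
 * rest and return a pointer to it; otherwise return the line as is.
 * Returns NULL if decoding fails. The result is still Xapian-quoted
 * and must go through the normal tag parser afterwards. */
static char *
decode_dump_line_sketch (char *line)
{
    assert (line != NULL);
    if (*line != '%')
	return line;
    if (percent_decode_inplace_sketch (line + 1) != 0)
	return NULL;
    return line + 1;
}
```

Under these assumptions, a line such as `%+%22a%0ab%22 -- id:x` would
decode to a `+"a<newline>b" -- id:x` line, which the ordinary parser
then handles as a Xapian-quoted tag containing a newline.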

I attach a patch to show roughly what I mean. It is not complete: the
dump routine does not yet hex-encode lines containing newlines; the
tests, man pages, etc. are not updated; and it is significantly
unpolished. Also, there should be some consolidation between
parse_boolean_term and the xapian_decode routine.

Despite the above caveats, it broadly seems to work.

Best wishes

Mark




From d518e2be27ff7243ddc156699c2bfc38dec78b43 Mon Sep 17 00:00:00 2001
From: Mark Walters <markwalters1009@gmail.com>
Date: Fri, 4 Jan 2013 13:37:42 +0000
Subject: [PATCH] notmuch dump: xapian quote tags

---
 notmuch-dump.c |    6 +++---
 tag-util.c     |   53 ++++++++++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 49 insertions(+), 10 deletions(-)

diff --git a/notmuch-dump.c b/notmuch-dump.c
index bf01a39..e94d870 100644
--- a/notmuch-dump.c
+++ b/notmuch-dump.c
@@ -120,9 +120,9 @@ notmuch_dump_command (unused (void *ctx), int argc, char *argv[])
 	    if (output_format == DUMP_FORMAT_SUP) {
 		fputs (tag_str, output);
 	    } else {
-		if (hex_encode (notmuch, tag_str,
-				&buffer, &buffer_size) != HEX_SUCCESS) {
-		    fprintf (stderr, "Error: failed to hex-encode tag %s\n",
+		if (make_boolean_term (notmuch, NULL, tag_str,
+				       &buffer, &buffer_size)) {
+		    fprintf (stderr, "Error: failed to xapian-encode tag %s\n",
 			     tag_str);
 		    return 1;
 		}
diff --git a/tag-util.c b/tag-util.c
index ca12b3b..7384afa 100644
--- a/tag-util.c
+++ b/tag-util.c
@@ -31,6 +31,38 @@ line_error (tag_parse_status_t status,
     return status;
 }
 
+static int
+xapian_decode_tag_inplace (char *str, char* tok, size_t *tok_len)
+{
+    char *pos = str;
+    char *out = str;
+
+    if (*pos == '"') {
+	int closed = 0;
+	/* Skip the opening quote, find the closing quote, and
+	 * un-double doubled internal quotes. */
+	for (++pos; *pos; ) {
+	    if (*pos == '"') {
+		++pos;
+		if (*pos != '"') {
+		    /* Found the closing quote. */
+		    closed = 1;
+		    break;
+		}
+	    }
+	    *out++ = *pos++;
+	}
+	if (! closed || *pos != ' ')
+	    return HEX_SYNTAX_ERROR;
+	*tok_len = pos - tok;
+    } else {
+	out = tok + (*tok_len)++;
+    }
+    /* Terminate token */
+    *out = '\0';
+    return HEX_SUCCESS;
+}
+
 tag_parse_status_t
 parse_tag_line (void *ctx, char *line,
 		tag_op_flag_t flags,
@@ -60,6 +92,15 @@ parse_tag_line (void *ctx, char *line,
 	goto DONE;
     }
 
+    if (*tok == '%') {
+	tok++;
+	if (hex_decode_inplace (tok) != HEX_SUCCESS) {
+	    ret = line_error (TAG_PARSE_INVALID, line_for_error,
+			      "hex decoding of line failed", "");
+	    goto DONE;
+	}
+    }
+
     tag_op_list_reset (tag_ops);
 
     /* Parse tags. */
@@ -89,23 +130,21 @@ parse_tag_line (void *ctx, char *line,
 	    goto DONE;
 	}
 
-	/* Terminate, and start next token after terminator. */
-	*(tok + tok_len++) = '\0';
-
 	remove = (*tok == '-');
 	tag = tok + 1;
 
-	/* Maybe refuse empty tags. */
+	/* Maybe refuse empty tags. Note a quoted empty tag is allowed. */
 	if (! (flags & TAG_FLAG_BE_GENEROUS) && *tag == '\0') {
 	    ret = line_error (TAG_PARSE_INVALID, line_for_error,
 			      "empty tag");
 	    goto DONE;
 	}
 
-	/* Decode tag. */
-	if (hex_decode_inplace (tag) != HEX_SUCCESS) {
+	/* Find real (quoted) end, terminate, and start next token
+	 * after terminator. */
+	if (xapian_decode_tag_inplace (tag, tok, &tok_len) != HEX_SUCCESS) {
 	    ret = line_error (TAG_PARSE_INVALID, line_for_error,
-			      "hex decoding of tag %s failed", tag);
+			      "xapian decoding of tag %s failed", tag);
 	    goto DONE;
 	}
 
-- 
1.7.9.1
