From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <jani@nikula.org>
Received: from localhost (localhost [127.0.0.1])
	by olra.theworths.org (Postfix) with ESMTP id D66A0431FAF
	for <notmuch@notmuchmail.org>; Thu,  3 Jan 2013 09:19:12 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
X-Spam-Flag: NO
X-Spam-Score: -0.7
X-Spam-Level: 
X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5
	tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
Received: from olra.theworths.org ([127.0.0.1])
	by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id GLIaA6Xn33Ie for <notmuch@notmuchmail.org>;
	Thu,  3 Jan 2013 09:19:12 -0800 (PST)
Received: from mail-bk0-f51.google.com (mail-bk0-f51.google.com
	[209.85.214.51]) (using TLSv1 with cipher RC4-SHA (128/128 bits))
	(No client certificate requested)
	by olra.theworths.org (Postfix) with ESMTPS id 33BA6431FAE
	for <notmuch@notmuchmail.org>; Thu,  3 Jan 2013 09:19:12 -0800 (PST)
Received: by mail-bk0-f51.google.com with SMTP id ik5so6861629bkc.38
	for <notmuch@notmuchmail.org>; Thu, 03 Jan 2013 09:19:11 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=google.com; s=20120113;
	h=x-received:from:to:cc:subject:in-reply-to:references:user-agent
	:date:message-id:mime-version:content-type:x-gm-message-state;
	bh=weg9zmvlX5oI7ii9MuCExFpek3nkOZ49dPt7RsXx6cM=;
	b=OHlGToXKfiJJrf2nxkR+mh9yORAKv+N9NLCPz+SHIsz8FAU8agsIvQIn2nWThSc4q3
	TquirsvzIG54BI1gXeZCSP4IAicW3MCm69rj96gJv2qrzr5dDJKNCTYcG4PAtFg888q0
	jM4zZN99uHVtuigO6Mo5GtaO1L+Xv+Vb994cL1Ww/8ZP2l7r9R0SM0YlKR8SVo6Vt9uG
	g5/1uEnoF/JDSEztPOlKfJf2cSSIGRQRDAEyZVKDK7bsEG0eohlR4G5ItAXRYpZLIg2a
	rp4pEMG40nxi83sQjMxBDKfIffEvBSffGQ7VOZBH02AWAG7t1rOm5DHJ0k3mj1WYS2AJ
	dSDQ==
X-Received: by 10.204.148.134 with SMTP id p6mr23785251bkv.75.1357233550728;
	Thu, 03 Jan 2013 09:19:10 -0800 (PST)
Received: from localhost ([2001:4b98:dc0:43:216:3eff:fe1b:25f3])
	by mx.google.com with ESMTPS id u3sm34662405bkw.9.2013.01.03.09.19.08
	(version=SSLv3 cipher=OTHER); Thu, 03 Jan 2013 09:19:09 -0800 (PST)
From: Jani Nikula <jani@nikula.org>
To: Austin Clements <amdragon@MIT.EDU>, notmuch@notmuchmail.org
Subject: Re: [PATCH v4 3/5] dump: Disallow \n in message IDs
In-Reply-To: <1356936162-2589-4-git-send-email-amdragon@mit.edu>
References: <1356936162-2589-1-git-send-email-amdragon@mit.edu>
	<1356936162-2589-4-git-send-email-amdragon@mit.edu>
User-Agent: Notmuch/0.14+235~gdaf492b (http://notmuchmail.org) Emacs/23.2.1
	(x86_64-pc-linux-gnu)
Date: Thu, 03 Jan 2013 18:19:02 +0100
Message-ID: <87sj6igp6h.fsf@nikula.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Gm-Message-State: ALoCoQk7dkZ4EVax45iSmdEH4CpjQoyF79IJK+j57UHP5ztnuKHF52mVUAjLBePfNwxZ+Z4o8c7c
Cc: tomi.ollila@iki.fi
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: "Use and development of the notmuch mail system."
	<notmuch.notmuchmail.org>
List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,
	<mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
List-Archive: <http://notmuchmail.org/pipermail/notmuch>
List-Post: <mailto:notmuch@notmuchmail.org>
List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,
	<mailto:notmuch-request@notmuchmail.org?subject=subscribe>
X-List-Received-Date: Thu, 03 Jan 2013 17:19:13 -0000

On Mon, 31 Dec 2012, Austin Clements <amdragon@MIT.EDU> wrote:
> When we switch to using regular Xapian queries in the dump format, \n
> will cause problems, so we disallow it.  Specifically, while Xapian can
> quote and parse queries containing \n without difficulty, quoted
> queries containing \n still span multiple lines, which breaks the
> line-orientedness of the dump format.  Strictly speaking, we could
> still round-trip these, but it would significantly complicate restore
> as well as scripts that deal with tag dumps.  This complexity would
> come at absolutely no benefit: because of the RFC 2822 unfolding
> rules, no amount of standards negligence can produce a message with a
> message ID containing a line break (not even Outlook can do it!).
>
> Hence, we simply disallow it.
> ---
>  notmuch-dump.c       |    9 +++++++++
>  test/random-corpus.c |    4 +++-
>  2 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/notmuch-dump.c b/notmuch-dump.c
> index d2dad40..29d79da 100644
> --- a/notmuch-dump.c
> +++ b/notmuch-dump.c
> @@ -132,6 +132,15 @@ notmuch_dump_command (unused (void *ctx), int argc, char *argv[])
>  	if (output_format == DUMP_FORMAT_SUP) {
>  	    fputs (")\n", output);
>  	} else {
> +	    if (strchr (message_id, '\n')) {
> +		/* This will produce a line break in the output, which
> +		 * would be difficult to handle in tools.  However,
> +		 * it's also impossible to produce an email containing
> +		 * a line break in a message ID because of unfolding,
> +		 * so we can safely disallow it. */
> +		fprintf (stderr, "Error: cannot dump message id containing line break: %s\n", message_id);
> +		return 1;

How about just skipping the message in the dump, with a warning, instead
of bailing out? If the user urgently needs a backup for whatever reason,
I don't think it's a good idea to require deleting the message from the
db before the dump can succeed. The fs holding the db might have been
remounted read-only, for example, so deleting may not even be possible.

And perhaps the message id in the error message should be wrapped in
quotes, because it will span multiple lines due to having a
newline... ;)
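
To make both suggestions concrete, here is a rough standalone sketch
rather than a patch against notmuch-dump.c. The helper name
dump_ids_skipping_newlines, its signature, and the "id:" output line are
all made up for illustration; the real loop in notmuch_dump_command
looks different.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of notmuch: writes one line per
 * message ID to `output`. IDs containing a line break are skipped
 * with a warning on `errout` instead of aborting the whole dump.
 * Returns the number of skipped IDs. */
static int
dump_ids_skipping_newlines (FILE *output, FILE *errout,
			    const char *const *message_ids, size_t count)
{
    int skipped = 0;

    assert (output != NULL && errout != NULL && message_ids != NULL);

    for (size_t i = 0; i < count; i++) {
	const char *message_id = message_ids[i];

	if (strchr (message_id, '\n')) {
	    /* Quote the ID so the reader can tell where it starts and
	     * ends, since the warning itself spans multiple lines. */
	    fprintf (errout,
		     "Warning: skipping message id containing line break: \"%s\"\n",
		     message_id);
	    skipped++;
	    continue;
	}

	/* Illustrative output format only. */
	fprintf (output, "id:%s\n", message_id);
    }

    return skipped;
}
```

The caller could still use the return value to exit non-zero after a
complete dump, so scripts notice that something was left out.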

Otherwise, LGTM.

Jani.

> +	    }
>  	    if (hex_encode (notmuch, message_id,
>  			    &buffer, &buffer_size) != HEX_SUCCESS) {
>  		    fprintf (stderr, "Error: failed to hex-encode msg-id %s\n",
> diff --git a/test/random-corpus.c b/test/random-corpus.c
> index f354d4b..8b7748e 100644
> --- a/test/random-corpus.c
> +++ b/test/random-corpus.c
> @@ -96,7 +96,9 @@ random_utf8_string (void *ctx, size_t char_count)
>  	    buf = talloc_realloc (ctx, buf, gchar, buf_size);
>  	}
>  
> -	randomchar = random_unichar ();
> +	do {
> +	    randomchar = random_unichar ();
> +	} while (randomchar == '\n');
>  
>  	written = g_unichar_to_utf8 (randomchar, buf + offset);
>  
> -- 
> 1.7.10.4