unofficial mirror of notmuch@notmuchmail.org
From: Tomi Ollila <tomi.ollila@iki.fi>
To: David Bremner <david@tethera.net>, notmuch@notmuchmail.org
Subject: Re: [PATCH 14/15] lib: add _notmuch_message_id_parse_strict
Date: Sat, 01 Sep 2018 23:27:06 +0300
Message-ID: <m2mut1ge05.fsf@guru.guru-group.fi>
In-Reply-To: <20180830112915.11761-15-david@tethera.net>

On Thu, Aug 30 2018, David Bremner wrote:

> The idea is that if a message-id parses with this function, the MUA
> generating it was probably sane, and in particular it's probably safe
> to use the result as a parent from In-Reply-to.
> ---
>  lib/message-id.c        | 32 ++++++++++++++++++
>  lib/notmuch-private.h   | 14 ++++++++
>  test/Makefile.local     |  6 +++-
>  test/T710-message-id.sh | 73 +++++++++++++++++++++++++++++++++++++++++
>  test/message-id-parse.c | 26 +++++++++++++++
>  5 files changed, 150 insertions(+), 1 deletion(-)
>  create mode 100755 test/T710-message-id.sh
>  create mode 100644 test/message-id-parse.c
>
> diff --git a/lib/message-id.c b/lib/message-id.c
> index d7541d50..a1dce9c8 100644
> --- a/lib/message-id.c
> +++ b/lib/message-id.c
> @@ -1,4 +1,5 @@
>  #include "notmuch-private.h"
> +#include "string-util.h"
>  
>  /* Advance 'str' past any whitespace or RFC 822 comments. A comment is
>   * a (potentially nested) parenthesized sequence with '\' used to
> @@ -94,3 +95,34 @@ _notmuch_message_id_parse (void *ctx, const char *message_id, const char **next)
>  
>      return result;
>  }
> +
> +char *
> +_notmuch_message_id_parse_strict (void *ctx, const char *message_id)
> +{
> +    const char *s, *end;
> +    char *result;
> +
> +    if (message_id == NULL || *message_id == '\0')
> +	return NULL;
> +
> +    s = skip_space (message_id);
> +    if (*s == '<')
> +	s++;
> +    else
> +	return NULL;
> +
> +    for (end = s; *end && *end != '>'; end++)
> +	if (isspace (*end))
> +	    return NULL;
> +
> +    if (*end != '>')
> +	return NULL;
> +    else {
> +	const char *last =  skip_space (end+1);

This parser looks good to me, but the spacing in the line above is off:
there are two spaces after the '=', and the spaces around the '+' are
missing.

> +	if (*last != '\0')
> +	    return NULL;
> +    }
> +
> +    result = talloc_strndup (ctx, s, end - s);
> +    return result;
> +}
> diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
> index fd0d251b..5bbaa292 100644
> --- a/lib/notmuch-private.h
> +++ b/lib/notmuch-private.h
> @@ -526,6 +526,20 @@ _notmuch_query_count_documents (notmuch_query_t *query,
>  char *
>  _notmuch_message_id_parse (void *ctx, const char *message_id, const char **next);
>  
> +/* Parse a message-id, discarding leading and trailing whitespace, and
> + * '<' and '>' delimiters.
> + *
> + * Apply a probably-stricter-than RFC definition of what is allowed in
> + * a message-id. In particular, forbid whitespace.
> + *
> + * Returns a newly talloc'ed string belonging to 'ctx'.
> + *
> + * Returns NULL if there is any error parsing the message-id.
> + */
> +
> +char *
> +_notmuch_message_id_parse_strict (void *ctx, const char *message_id);
> +
>  
>  /* message.cc */
>  
> diff --git a/test/Makefile.local b/test/Makefile.local
> index 1a0ab813..1cf09778 100644
> --- a/test/Makefile.local
> +++ b/test/Makefile.local
> @@ -15,6 +15,9 @@ smtp_dummy_modules = $(smtp_dummy_srcs:.c=.o)
>  $(dir)/arg-test: $(dir)/arg-test.o command-line-arguments.o util/libnotmuch_util.a
>  	$(call quiet,CC) $^ -o $@ $(LDFLAGS)
>  
> +$(dir)/message-id-parse: $(dir)/message-id-parse.o lib/libnotmuch.a util/libnotmuch_util.a
> +	$(call quiet,CC) $^ -o $@ $(LDFLAGS) $(TALLOC_LDFLAGS)
> +
>  $(dir)/hex-xcode: $(dir)/hex-xcode.o command-line-arguments.o util/libnotmuch_util.a
>  	$(call quiet,CC) $^ -o $@ $(LDFLAGS) $(TALLOC_LDFLAGS)
>  
> @@ -50,7 +53,8 @@ test_main_srcs=$(dir)/arg-test.c \
>  	      $(dir)/smtp-dummy.c \
>  	      $(dir)/symbol-test.cc \
>  	      $(dir)/make-db-version.cc \
> -	      $(dir)/ghost-report.cc
> +	      $(dir)/ghost-report.cc \
> +	      $(dir)/message-id-parse.c
>  
>  test_srcs=$(test_main_srcs) $(dir)/database-test.c
>  
> diff --git a/test/T710-message-id.sh b/test/T710-message-id.sh
> new file mode 100755
> index 00000000..e73d6ba9
> --- /dev/null
> +++ b/test/T710-message-id.sh
> @@ -0,0 +1,73 @@
> +#!/usr/bin/env bash
> +test_description="message id parsing"
> +
> +. $(dirname "$0")/test-lib.sh || exit 1
> +
> +test_begin_subtest "good message ids"
> +${TEST_DIRECTORY}/message-id-parse <<EOF >OUTPUT
> +<018b1a8f2d1df62e804ce88b65401304832dfbbf.1346614915.git.jani@nikula.org>
> +<1530507300.raoomurnbf.astroid@strange.none>
> +<1258787708-21121-2-git-send-email-keithp@keithp.com>
> +EOF
> +cat <<EOF >EXPECTED
> +GOOD: 018b1a8f2d1df62e804ce88b65401304832dfbbf.1346614915.git.jani@nikula.org
> +GOOD: 1530507300.raoomurnbf.astroid@strange.none
> +GOOD: 1258787708-21121-2-git-send-email-keithp@keithp.com
> +EOF
> +test_expect_equal_file EXPECTED OUTPUT
> +
> +test_begin_subtest "leading and trailing space is OK"
> +${TEST_DIRECTORY}/message-id-parse <<EOF >OUTPUT
> +   <018b1a8f2d1df62e804ce88b65401304832dfbbf.1346614915.git.jani@nikula.org>
> +<1530507300.raoomurnbf.astroid@strange.none>    
> +    <1258787708-21121-2-git-send-email-keithp@keithp.com>
> +EOF
> +cat <<EOF >EXPECTED
> +GOOD: 018b1a8f2d1df62e804ce88b65401304832dfbbf.1346614915.git.jani@nikula.org
> +GOOD: 1530507300.raoomurnbf.astroid@strange.none
> +GOOD: 1258787708-21121-2-git-send-email-keithp@keithp.com
> +EOF
> +test_expect_equal_file EXPECTED OUTPUT
> +
> +test_begin_subtest "<> delimeters are required"
> +${TEST_DIRECTORY}/message-id-parse <<EOF >OUTPUT
> +018b1a8f2d1df62e804ce88b65401304832dfbbf.1346614915.git.jani@nikula.org>
> +<1530507300.raoomurnbf.astroid@strange.none
> +1258787708-21121-2-git-send-email-keithp@keithp.com
> +EOF
> +cat <<EOF >EXPECTED
> +BAD: 018b1a8f2d1df62e804ce88b65401304832dfbbf.1346614915.git.jani@nikula.org>
> +BAD: <1530507300.raoomurnbf.astroid@strange.none
> +BAD: 1258787708-21121-2-git-send-email-keithp@keithp.com
> +EOF
> +test_expect_equal_file EXPECTED OUTPUT
> +
> +test_begin_subtest "embedded whitespace is forbidden"
> +${TEST_DIRECTORY}/message-id-parse <<EOF >OUTPUT
> +<018b1a8f2d1df62e804ce88b65401304832dfbbf.1346614915 .git.jani@nikula.org>
> +<1530507300.raoomurnbf.astroid	@strange.none>
> +<1258787708-21121-\f2-git-send-email-keithp@keithp.com>
> +EOF
> +cat <<EOF >EXPECTED
> +BAD: <018b1a8f2d1df62e804ce88b65401304832dfbbf.1346614915 .git.jani@nikula.org>
> +BAD: <1530507300.raoomurnbf.astroid	@strange.none>
> +BAD: <1258787708-21121-\f2-git-send-email-keithp@keithp.com>
> +EOF
> +test_expect_equal_file EXPECTED OUTPUT
> +
> +
> +test_begin_subtest "folded real life bad In-Reply-To values"
> +${TEST_DIRECTORY}/message-id-parse <<EOF >OUTPUT
> +<22597.31869.380767.339702@chiark.greenend.org.uk> (Ian Jackson's message of "Mon, 5 Dec 2016 14:41:01 +0000")
> +<20170625141242.loaalhis2eodo66n@gaara.hadrons.org>  <149719990964.27883.13021127452105787770.reportbug@seneca.home.org>
> +Your message of Tue, 09 Dec 2014 13:21:11 +0100. <1900758.CgLNVPbY9N@liber>
> +EOF
> +cat <<EOF >EXPECTED
> +BAD: <22597.31869.380767.339702@chiark.greenend.org.uk> (Ian Jackson's message of "Mon, 5 Dec 2016 14:41:01 +0000")
> +BAD: <20170625141242.loaalhis2eodo66n@gaara.hadrons.org>  <149719990964.27883.13021127452105787770.reportbug@seneca.home.org>
> +BAD: Your message of Tue, 09 Dec 2014 13:21:11 +0100. <1900758.CgLNVPbY9N@liber>
> +EOF
> +test_expect_equal_file EXPECTED OUTPUT
> +
> +
> +test_done
> diff --git a/test/message-id-parse.c b/test/message-id-parse.c
> new file mode 100644
> index 00000000..752eb1fd
> --- /dev/null
> +++ b/test/message-id-parse.c
> @@ -0,0 +1,26 @@
> +#include <stdio.h>
> +#include <talloc.h>
> +#include "notmuch-private.h"
> +
> +int
> +main (unused (int argc), unused (char **argv))
> +{
> +    char *line = NULL;
> +    size_t len = 0;
> +    ssize_t nread;
> +    void *local = talloc_new (NULL);
> +
> +    while ((nread = getline (&line, &len, stdin)) != -1) {
> +	int last = strlen (line) - 1;
> +	if (line[last] == '\n')
> +	    line[last] = '\0';
> +
> +	char *mid = _notmuch_message_id_parse_strict (local, line);
> +	if (mid)
> +	    printf ("GOOD: %s\n", mid);
> +	else
> +	    printf ("BAD: %s\n", line);
> +    }
> +
> +    talloc_free (local);
> +}
> -- 
> 2.18.0
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> https://notmuchmail.org/mailman/listinfo/notmuch


Thread overview: 23+ messages
2018-08-30 11:29 threading replies fixes v3 David Bremner
2018-08-30 11:29 ` [PATCH 01/15] util: add DEBUG_PRINTF, rename error_util.h -> debug_print.h David Bremner
2018-09-01 17:06   ` Tomi Ollila
2018-09-03 11:54     ` David Bremner
2018-08-30 11:29 ` [PATCH 02/15] test: start threading test corpus David Bremner
2018-08-30 11:29 ` [PATCH 03/15] test: add known broken tests for "ghost roots" David Bremner
2018-08-30 11:29 ` [PATCH 04/15] lib/thread: sort sibling messages by date David Bremner
2018-09-01 17:11   ` Tomi Ollila
2018-09-03 15:18     ` David Bremner
2018-08-30 11:29 ` [PATCH 05/15] lib: read reference terms into message struct David Bremner
2018-08-30 11:29 ` [PATCH 06/15] lib/thread: refactor in_reply_to test David Bremner
2018-09-01 20:02   ` Tomi Ollila
2018-08-30 11:29 ` [PATCH 07/15] lib/thread: initial use of references as for fallback parenting David Bremner
2018-08-30 11:29 ` [PATCH 08/15] lib: calculate message depth in thread David Bremner
2018-08-30 11:29 ` [PATCH 09/15] lib/thread: rewrite _parent_or_toplevel to use depths David Bremner
2018-08-30 11:29 ` [PATCH 10/15] lib/thread: change _resolve_thread_relationships " David Bremner
2018-08-30 11:29 ` [PATCH 11/15] test: add known broken test for good In-Reply-To / bad References David Bremner
2018-08-30 11:29 ` [PATCH 12/15] test/thread-replies: mangle In-Reply-To's David Bremner
2018-08-30 11:29 ` [PATCH 13/15] util/string-util: export skip_space David Bremner
2018-08-30 11:29 ` [PATCH 14/15] lib: add _notmuch_message_id_parse_strict David Bremner
2018-09-01 20:27   ` Tomi Ollila [this message]
2018-08-30 11:29 ` [PATCH 15/15] lib: change parent strategy to use In-Reply-To if it looks sane David Bremner
2018-09-01 20:29   ` Tomi Ollila
