unofficial mirror of notmuch@notmuchmail.org
From: Jani Nikula <jani@nikula.org>
To: notmuch@notmuchmail.org
Subject: [PATCH v2 7/7] HACK: fix broken messages in the perf test corpus
Date: Sat, 30 Nov 2013 17:33:56 +0200
Message-ID: <d74499f1e462755676edf9aa6ab689ba47fa2471.1385825425.git.jani@nikula.org>
In-Reply-To: <cover.1385825425.git.jani@nikula.org>

The gmime header parser rejects many messages in the perf test
corpus that have this line in the middle of their headers:

Microsoft Mail Internet Headers Version 2.0

Header parsing stops right there. This illustrates a behavior change
in the parser: such a message is clearly broken, but notmuch
previously accepted it anyway.

This patch "fixes" the messages in the perf test corpus so that the
parsers can be compared fairly.

NOT TO BE MERGED, if that isn't obvious. This is just a quick hack.
---
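For reference, a minimal sketch of what the substitution in the diff
does to a single message. The scratch directory and the sample
message (msg1, alice@example.com) are made up for illustration and are
not part of the corpus. It assumes GNU sed, since -i without a suffix
argument behaves differently on BSD sed, and a find/xargs pair that
supports -print0/-0.

```shell
#!/bin/sh
# Hypothetical scratch corpus with one broken message (illustration only).
corpus=$(mktemp -d)
trap 'rm -rf "$corpus"' EXIT

cat > "$corpus/msg1" <<'EOF'
From: alice@example.com
Microsoft Mail Internet Headers Version 2.0
Subject: hello

body
EOF

# Dry run: list the files that contain the offending line.
grep -rl '^Microsoft Mail Internet Headers Version 2\.0' "$corpus"

# Same substitution as in the patch. '&' in the replacement stands for
# the matched text, so the stray line becomes an X-Crap: header field.
fix_corpus () {
    find "$1" -type f -print0 |
        xargs -0 sed -i -e 's/^Microsoft Mail Internet Headers Version 2\.0/X-Crap: &/'
}

fix_corpus "$corpus"
# Running it again changes nothing: the pattern is anchored at the start
# of the line, and the fixed line now starts with "X-Crap: ".
fix_corpus "$corpus"

# Print the rewritten second line of the sample message.
sed -n '2p' "$corpus/msg1"
```

Because the fixed line no longer matches the anchored pattern, the
step should be safe to repeat if add_email_corpus runs more than once
on the same unpacked corpus.
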
 performance-test/perf-test-lib.sh | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/performance-test/perf-test-lib.sh b/performance-test/perf-test-lib.sh
index 9ee7661..caec0d0 100644
--- a/performance-test/perf-test-lib.sh
+++ b/performance-test/perf-test-lib.sh
@@ -84,7 +84,11 @@ add_email_corpus ()
 	    "${args[@]}"
 
 	printf "\n"
+	printf "Fix broken messages in corpus..."
 
+	find "${TEST_DIRECTORY}/corpus" -type f -print0 | xargs -0 sed -i -e 's/^Microsoft Mail Internet Headers Version 2\.0/X-Crap: &/'
+
+	printf "\n"
     fi
 
     cp -lr $TAG_CORPUS $TMP_DIRECTORY/corpus.tags
-- 
1.8.4.2

Thread overview: 15+ messages
2013-11-30 15:33 [PATCH v2 0/7] lib: replace the message header parser with gmime Jani Nikula
2013-11-30 15:33 ` [PATCH v2 1/7] cli: sanitize tabs and newlines to spaces in notmuch search Jani Nikula
2013-11-30 15:33 ` [PATCH v2 2/7] cli: refactor reply from guessing Jani Nikula
2014-02-02 18:21   ` Mark Walters
2013-11-30 15:33 ` [PATCH v2 3/7] util: make sanitize string available in string util for reuse Jani Nikula
2014-02-02 18:24   ` Mark Walters
2013-11-30 15:33 ` [PATCH v2 4/7] cli: sanitize the received header before scanning for replies Jani Nikula
2013-11-30 15:33 ` [PATCH v2 5/7] lib: replace the header parser with gmime Jani Nikula
2013-11-30 15:33 ` [PATCH v2 6/7] lib: parse messages only once Jani Nikula
2013-11-30 15:33 ` Jani Nikula [this message]
2013-11-30 17:48   ` [PATCH v2 7/7] HACK: fix broken messages in the perf test corpus David Bremner
2014-01-15 18:03 ` [PATCH v2 0/7] lib: replace the message header parser with gmime David Bremner
2014-02-02 13:03   ` Jani Nikula
2014-02-02 18:15 ` Mark Walters
2014-02-02 19:32   ` Jani Nikula
