* Duplicate message ids @ 2017-08-27 9:45 Mark Walters 2017-08-27 13:19 ` [PATCH] test: check for subject consistency between search and show David Bremner 0 siblings, 1 reply; 9+ messages in thread From: Mark Walters @ 2017-08-27 9:45 UTC (permalink / raw) To: notmuch [-- Attachment #1: Type: text/plain, Size: 756 bytes --] Hi A concern for notmuch is some form of attack via someone sending a message with a duplicate message id. I think I have seen it asserted that it is not so much of a problem as the first message received by notmuch will have priority. However, I believe that this is not the case. The script below gives a demonstration (on my system at least). I have written it as a "test" so (I hope) it doesn't mess up the system. It should work if you put it in the test directory and execute it. It adds a message, runs notmuch new, adds a message with the same id, runs notmuch new, and then runs notmuch search <id> and notmuch show <id>. The former shows the subject of the first message, and the latter the subject of the second message. Best wishes Mark [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: T710-duplicate-id.sh --] [-- Type: text/x-sh, Size: 602 bytes --] #!/usr/bin/env bash test_description='Do duplicate message ids get shown in arrival order' . ./test-lib.sh || exit 1 find ${MAIL_DIR} generate_message [id]=duplicate-id [subject]=first [body]=first mkdir "${MAIL_DIR}"/b mv "$gen_msg_filename" "${MAIL_DIR}"/b notmuch new generate_message [id]=duplicate-id [subject]=second [body]=second mv "$gen_msg_filename" "${MAIL_DIR}"/a notmuch new find ${MAIL_DIR} echo echo SEARCH: observe \"first\" is the subject notmuch search id:duplicate-id echo echo SHOW: observe \"second\" is the subject and body notmuch show --format=json id:duplicate-id |json_pp ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] test: check for subject consistency between search and show 2017-08-27 9:45 Duplicate message ids Mark Walters @ 2017-08-27 13:19 ` David Bremner 2017-08-27 18:48 ` [PATCH] test: duplicate mids: add an extra broken search test Mark Walters 0 siblings, 1 reply; 9+ messages in thread From: David Bremner @ 2017-08-27 13:19 UTC (permalink / raw) To: Mark Walters, notmuch In [1] Mark showed that the the current code (d7a49e81) is not consistent in it's handling of subjects of messages with duplicate message-ids (or in notmuch-speak, of messages with multiple files). notmuch-search uses indexing order and explicitedly preserves the first. notmuch-show (apparently) uses alphabetical (or at least xapian term order) of filenames. In a perfect world we would probably report all subjects in the json output; at the very least we should be consistent. [1]: id:87378dny3d.fsf@qmul.ac.uk --- test/T670-duplicate-mid.sh | 62 +++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 56 insertions(+), 6 deletions(-) diff --git a/test/T670-duplicate-mid.sh b/test/T670-duplicate-mid.sh index decbc0a4..55ebde38 100755 --- a/test/T670-duplicate-mid.sh +++ b/test/T670-duplicate-mid.sh @@ -12,6 +12,29 @@ EOF notmuch search id:duplicate | notmuch_search_sanitize > OUTPUT test_expect_equal_file EXPECTED OUTPUT +test_begin_subtest 'First subject preserved in notmuch-show (json)' +output=$(notmuch show --body=false --format=json id:duplicate | notmuch_json_show_sanitize) +expected='[[[{ + "id": "XXXXX", + "match": true, + "excluded": false, + "filename": [ + "'"${MAIL_DIR}"/copy1'", + "'"${MAIL_DIR}"/copy2'" + ], + "timestamp": 42, + "date_relative": "2001-01-05", + "tags": ["inbox","unread"], + "headers": { + "Subject": "message 1", + "From": "Notmuch Test Suite <test_suite@notmuchmail.org>", + "To": "Notmuch Test Suite <test_suite@notmuchmail.org>", + "Date": "GENERATED_DATE" + } + }, +[]]]]' +test_expect_equal_json "$output" "$expected" + test_begin_subtest 'Search for second subject' cat <<EOF >EXPECTED MAIL_DIR/copy1 @@ -37,25 +60,52 @@ notmuch reindex '*' notmuch search --output=files "sekrit" | notmuch_dir_sanitize > OUTPUT test_expect_equal_file EXPECTED OUTPUT -rm ${MAIL_DIR}/copy1 +add_message '[id]="duplicate"' '[subject]="reorder file names"' '[filename]=00-copy4' +test_begin_subtest 'First subject preserved in notmuch-show (json), file order' +test_subtest_known_broken +output=$(notmuch show --body=false --format=json id:duplicate | notmuch_json_show_sanitize) +expected='[[[{ + "id": "XXXXX", + "match": true, + "excluded": false, + "filename": [ + "'"${MAIL_DIR}"/00-copy4'", + "'"${MAIL_DIR}"/copy1'", + "'"${MAIL_DIR}"/copy2'" + ], + "timestamp": 42, + "date_relative": "2001-01-05", + "tags": ["inbox","unread"], + "headers": { + "Subject": "message 1", + "From": "Notmuch Test Suite <test_suite@notmuchmail.org>", + "To": "Notmuch Test Suite <test_suite@notmuchmail.org>", + "Date": "GENERATED_DATE" + } + }, +[]]]]' +test_expect_equal_json "$output" "$expected" + +rm ${MAIL_DIR}/00-copy4 test_begin_subtest 'Deleted first duplicate file does not stop notmuch show from working' -output=$(notmuch show --body=false --format=json id:duplicate) +output=$(notmuch show --body=false --format=json id:duplicate | notmuch_json_show_sanitize) expected='[[[{ - "id": "'duplicate'", + "id": "XXXXX", "match": true, "excluded": false, "filename": [ + "'"${MAIL_DIR}"/00-copy4'", "'"${MAIL_DIR}"/copy1'", "'"${MAIL_DIR}"/copy2'" ], - "timestamp": 978709435, + "timestamp": 42, "date_relative": "2001-01-05", "tags": ["inbox","unread"], "headers": { - "Subject": "message 2", + "Subject": "message 1", "From": "Notmuch Test Suite <test_suite@notmuchmail.org>", "To": "Notmuch Test Suite <test_suite@notmuchmail.org>", - "Date": "Fri, 05 Jan 2001 15:43:55 +0000" + "Date": "GENERATED_DATE" } }, []]]]' -- 2.14.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH] test: duplicate mids: add an extra broken search test 2017-08-27 13:19 ` [PATCH] test: check for subject consistency between search and show David Bremner @ 2017-08-27 18:48 ` Mark Walters 2017-08-27 23:58 ` [PATCH 1/5] test: make fallback to duplicate test more robust David Bremner 0 siblings, 1 reply; 9+ messages in thread From: Mark Walters @ 2017-08-27 18:48 UTC (permalink / raw) To: notmuch --- Hi Many thanks to bremner for the parent patch. I thought it might be worth adding a search test after the broken show test demonstrating the distinction between show and search. When I added it I found it actually gives the other file as the answer! This applies on top of the parent patch. Best wishes Mark test/T670-duplicate-mid.sh | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/test/T670-duplicate-mid.sh b/test/T670-duplicate-mid.sh index 55ebde3..0de2bc0 100755 --- a/test/T670-duplicate-mid.sh +++ b/test/T670-duplicate-mid.sh @@ -86,6 +86,14 @@ expected='[[[{ []]]]' test_expect_equal_json "$output" "$expected" +test_begin_subtest 'First subject preserved in notmuch-search file order' +test_subtest_known_broken +notmuch search id:duplicate | notmuch_search_sanitize > OUTPUT +cat <<EOF > EXPECTED +thread:XXX 2001-01-05 [1/1(3)] Notmuch Test Suite; message 1 (inbox unread) +EOF +test_expect_equal_file EXPECTED OUTPUT + rm ${MAIL_DIR}/00-copy4 test_begin_subtest 'Deleted first duplicate file does not stop notmuch show from working' output=$(notmuch show --body=false --format=json id:duplicate | notmuch_json_show_sanitize) -- 2.1.4 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 1/5] test: make fallback to duplicate test more robust. 2017-08-27 18:48 ` [PATCH] test: duplicate mids: add an extra broken search test Mark Walters @ 2017-08-27 23:58 ` David Bremner 2017-08-27 23:58 ` [PATCH 2/5] test/duplicate-mid: clarify index order vs filename order David Bremner ` (4 more replies) 0 siblings, 5 replies; 9+ messages in thread From: David Bremner @ 2017-08-27 23:58 UTC (permalink / raw) To: Mark Walters, notmuch The original intent of this test was to verify that notmuch show was not crashing when the first file (where headers are being read from) was deleted. Run the output through some sanitization so that as we add and delete copies we don't have to update this test. --- test/T670-duplicate-mid.sh | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/test/T670-duplicate-mid.sh b/test/T670-duplicate-mid.sh index decbc0a4..3495e63d 100755 --- a/test/T670-duplicate-mid.sh +++ b/test/T670-duplicate-mid.sh @@ -39,23 +39,24 @@ test_expect_equal_file EXPECTED OUTPUT rm ${MAIL_DIR}/copy1 test_begin_subtest 'Deleted first duplicate file does not stop notmuch show from working' -output=$(notmuch show --body=false --format=json id:duplicate) +output=$(notmuch show --body=false --format=json id:duplicate | + notmuch_json_show_sanitize | sed 's/message [0-9]/A_SUBJECT/') expected='[[[{ - "id": "'duplicate'", + "id": "XXXXX", "match": true, "excluded": false, "filename": [ "'"${MAIL_DIR}"/copy1'", "'"${MAIL_DIR}"/copy2'" ], - "timestamp": 978709435, + "timestamp": 42, "date_relative": "2001-01-05", "tags": ["inbox","unread"], "headers": { - "Subject": "message 2", + "Subject": "A_SUBJECT", "From": "Notmuch Test Suite <test_suite@notmuchmail.org>", "To": "Notmuch Test Suite <test_suite@notmuchmail.org>", - "Date": "Fri, 05 Jan 2001 15:43:55 +0000" + "Date": "GENERATED_DATE" } }, []]]]' -- 2.13.2 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/5] test/duplicate-mid: clarify index order vs filename order 2017-08-27 23:58 ` [PATCH 1/5] test: make fallback to duplicate test more robust David Bremner @ 2017-08-27 23:58 ` David Bremner 2017-08-27 23:58 ` [PATCH 3/5] test: known broken test for subject after reindexing David Bremner ` (3 subsequent siblings) 4 siblings, 0 replies; 9+ messages in thread From: David Bremner @ 2017-08-27 23:58 UTC (permalink / raw) To: Mark Walters, notmuch The existing test for notmuch search had the first in filename order the same as the first indexed, which made it harder to understand what the underlying behaviour is. Add a file with a lexicographically smaller name, but later index time to clarify this. --- test/T670-duplicate-mid.sh | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/test/T670-duplicate-mid.sh b/test/T670-duplicate-mid.sh index 3495e63d..d2f89432 100755 --- a/test/T670-duplicate-mid.sh +++ b/test/T670-duplicate-mid.sh @@ -5,15 +5,17 @@ test_description="duplicate message ids" add_message '[id]="duplicate"' '[subject]="message 1" [filename]=copy1' add_message '[id]="duplicate"' '[subject]="message 2" [filename]=copy2' -test_begin_subtest 'First subject preserved' +add_message '[id]="duplicate"' '[subject]="message 0" [filename]=copy0' +test_begin_subtest 'search: first indexed subject preserved' cat <<EOF > EXPECTED -thread:XXX 2001-01-05 [1/1(2)] Notmuch Test Suite; message 1 (inbox unread) +thread:XXX 2001-01-05 [1/1(3)] Notmuch Test Suite; message 1 (inbox unread) EOF notmuch search id:duplicate | notmuch_search_sanitize > OUTPUT test_expect_equal_file EXPECTED OUTPUT test_begin_subtest 'Search for second subject' cat <<EOF >EXPECTED +MAIL_DIR/copy0 MAIL_DIR/copy1 MAIL_DIR/copy2 EOF @@ -23,6 +25,7 @@ test_expect_equal_file EXPECTED OUTPUT add_message '[id]="duplicate"' '[body]="sekrit" [filename]=copy3' test_begin_subtest 'search for body in duplicate file' cat <<EOF >EXPECTED +MAIL_DIR/copy0 MAIL_DIR/copy1 MAIL_DIR/copy2 MAIL_DIR/copy3 @@ -37,7 +40,7 @@ notmuch reindex '*' notmuch search --output=files "sekrit" | notmuch_dir_sanitize > OUTPUT test_expect_equal_file EXPECTED OUTPUT -rm ${MAIL_DIR}/copy1 +rm ${MAIL_DIR}/copy0 test_begin_subtest 'Deleted first duplicate file does not stop notmuch show from working' output=$(notmuch show --body=false --format=json id:duplicate | notmuch_json_show_sanitize | sed 's/message [0-9]/A_SUBJECT/') @@ -46,6 +49,7 @@ expected='[[[{ "match": true, "excluded": false, "filename": [ + "'"${MAIL_DIR}"/copy0'", "'"${MAIL_DIR}"/copy1'", "'"${MAIL_DIR}"/copy2'" ], -- 2.13.2 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/5] test: known broken test for subject after reindexing 2017-08-27 23:58 ` [PATCH 1/5] test: make fallback to duplicate test more robust David Bremner 2017-08-27 23:58 ` [PATCH 2/5] test/duplicate-mid: clarify index order vs filename order David Bremner @ 2017-08-27 23:58 ` David Bremner 2017-08-27 23:58 ` [PATCH 4/5] lib: enforce that n_message_reindex takes headers from first file David Bremner ` (2 subsequent siblings) 4 siblings, 0 replies; 9+ messages in thread From: David Bremner @ 2017-08-27 23:58 UTC (permalink / raw) To: Mark Walters, notmuch In [1], Mark gave a test that was behaving strangly. This turns out to be specific to reindexing. I suppose one could argue that picking the lexicographically last file name is a defensible choice, but it's almost as easy to take the first, which seems more intuitive. So mark the current situation as broken. [1]: id:1503859703-2973-1-git-send-email-markwalters1009@gmail.com --- test/T670-duplicate-mid.sh | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/test/T670-duplicate-mid.sh b/test/T670-duplicate-mid.sh index d2f89432..d4c1d1c2 100755 --- a/test/T670-duplicate-mid.sh +++ b/test/T670-duplicate-mid.sh @@ -40,6 +40,14 @@ notmuch reindex '*' notmuch search --output=files "sekrit" | notmuch_dir_sanitize > OUTPUT test_expect_equal_file EXPECTED OUTPUT +test_begin_subtest 'reindex choses subject from first filename' +test_subtest_known_broken +cat <<EOF > EXPECTED +thread:XXX 2001-01-05 [1/1(3)] Notmuch Test Suite; message 0 (inbox unread) +EOF +notmuch search id:duplicate | notmuch_search_sanitize > OUTPUT +test_expect_equal_file EXPECTED OUTPUT + rm ${MAIL_DIR}/copy0 test_begin_subtest 'Deleted first duplicate file does not stop notmuch show from working' output=$(notmuch show --body=false --format=json id:duplicate | -- 2.13.2 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4/5] lib: enforce that n_message_reindex takes headers from first file 2017-08-27 23:58 ` [PATCH 1/5] test: make fallback to duplicate test more robust David Bremner 2017-08-27 23:58 ` [PATCH 2/5] test/duplicate-mid: clarify index order vs filename order David Bremner 2017-08-27 23:58 ` [PATCH 3/5] test: known broken test for subject after reindexing David Bremner @ 2017-08-27 23:58 ` David Bremner 2017-08-27 23:58 ` [PATCH 5/5] test/duplicate-mid: check for subject with notmuch-show David Bremner 2017-09-06 1:29 ` [PATCH 1/5] test: make fallback to duplicate test more robust David Bremner 4 siblings, 0 replies; 9+ messages in thread From: David Bremner @ 2017-08-27 23:58 UTC (permalink / raw) To: Mark Walters, notmuch This is still a bit stopgap to be only choosing one set of headers, but this seems like a more defensible set of headers to choose. --- lib/message.cc | 4 +++- test/T670-duplicate-mid.sh | 1 - 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/lib/message.cc b/lib/message.cc index 98a88d3a..051362d4 100644 --- a/lib/message.cc +++ b/lib/message.cc @@ -2012,7 +2012,9 @@ notmuch_message_reindex (notmuch_message_t *message, thread_id = orig_thread_id; _notmuch_message_add_term (message, "thread", thread_id); - _notmuch_message_set_header_values (message, date, from, subject); + /* Take header values only from first filename */ + if (found == 0) + _notmuch_message_set_header_values (message, date, from, subject); ret = _notmuch_message_index_file (message, message_file); diff --git a/test/T670-duplicate-mid.sh b/test/T670-duplicate-mid.sh index d4c1d1c2..ce010cf7 100755 --- a/test/T670-duplicate-mid.sh +++ b/test/T670-duplicate-mid.sh @@ -41,7 +41,6 @@ notmuch search --output=files "sekrit" | notmuch_dir_sanitize > OUTPUT test_expect_equal_file EXPECTED OUTPUT test_begin_subtest 'reindex choses subject from first filename' -test_subtest_known_broken cat <<EOF > EXPECTED thread:XXX 2001-01-05 [1/1(3)] Notmuch Test Suite; message 0 (inbox unread) EOF -- 2.13.2 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 5/5] test/duplicate-mid: check for subject with notmuch-show 2017-08-27 23:58 ` [PATCH 1/5] test: make fallback to duplicate test more robust David Bremner ` (2 preceding siblings ...) 2017-08-27 23:58 ` [PATCH 4/5] lib: enforce that n_message_reindex takes headers from first file David Bremner @ 2017-08-27 23:58 ` David Bremner 2017-09-06 1:29 ` [PATCH 1/5] test: make fallback to duplicate test more robust David Bremner 4 siblings, 0 replies; 9+ messages in thread From: David Bremner @ 2017-08-27 23:58 UTC (permalink / raw) To: Mark Walters, notmuch In [1] Mark showed that the the current code (d7a49e81) is not consistent in it's handling of subjects of messages with duplicate message-ids (or in notmuch-speak, of messages with multiple files). notmuch-search uses indexing order and explicitedly preserves the first. notmuch-show (apparently) uses alphabetical (or at least xapian term order) of filenames. In a perfect world we would probably report all subjects in the json output; at the very least we should be consistent. [1]: id:87378dny3d.fsf@qmul.ac.uk --- test/T670-duplicate-mid.sh | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/test/T670-duplicate-mid.sh b/test/T670-duplicate-mid.sh index ce010cf7..21a9689a 100755 --- a/test/T670-duplicate-mid.sh +++ b/test/T670-duplicate-mid.sh @@ -13,6 +13,31 @@ EOF notmuch search id:duplicate | notmuch_search_sanitize > OUTPUT test_expect_equal_file EXPECTED OUTPUT +test_begin_subtest 'First subject preserved in notmuch-show (json)' +test_subtest_known_broken +output=$(notmuch show --body=false --format=json id:duplicate | notmuch_json_show_sanitize) +expected='[[[{ + "id": "XXXXX", + "match": true, + "excluded": false, + "filename": [ + "'"${MAIL_DIR}"/copy0'", + "'"${MAIL_DIR}"/copy1'", + "'"${MAIL_DIR}"/copy2'" + ], + "timestamp": 42, + "date_relative": "2001-01-05", + "tags": ["inbox","unread"], + "headers": { + "Subject": "message 1", + "From": "Notmuch Test Suite <test_suite@notmuchmail.org>", + "To": "Notmuch Test Suite <test_suite@notmuchmail.org>", + "Date": "GENERATED_DATE" + } + }, +[]]]]' +test_expect_equal_json "$output" "$expected" + test_begin_subtest 'Search for second subject' cat <<EOF >EXPECTED MAIL_DIR/copy0 -- 2.13.2 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/5] test: make fallback to duplicate test more robust. 2017-08-27 23:58 ` [PATCH 1/5] test: make fallback to duplicate test more robust David Bremner ` (3 preceding siblings ...) 2017-08-27 23:58 ` [PATCH 5/5] test/duplicate-mid: check for subject with notmuch-show David Bremner @ 2017-09-06 1:29 ` David Bremner 4 siblings, 0 replies; 9+ messages in thread From: David Bremner @ 2017-09-06 1:29 UTC (permalink / raw) To: Mark Walters, notmuch David Bremner <david@tethera.net> writes: > The original intent of this test was to verify that notmuch show was > not crashing when the first file (where headers are being read from) > was deleted. Run the output through some sanitization so that as we > add and delete copies we don't have to update this test. Series pushed. d ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-09-06 1:29 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-08-27 9:45 Duplicate message ids Mark Walters 2017-08-27 13:19 ` [PATCH] test: check for subject consistency between search and show David Bremner 2017-08-27 18:48 ` [PATCH] test: duplicate mids: add an extra broken search test Mark Walters 2017-08-27 23:58 ` [PATCH 1/5] test: make fallback to duplicate test more robust David Bremner 2017-08-27 23:58 ` [PATCH 2/5] test/duplicate-mid: clarify index order vs filename order David Bremner 2017-08-27 23:58 ` [PATCH 3/5] test: known broken test for subject after reindexing David Bremner 2017-08-27 23:58 ` [PATCH 4/5] lib: enforce that n_message_reindex takes headers from first file David Bremner 2017-08-27 23:58 ` [PATCH 5/5] test/duplicate-mid: check for subject with notmuch-show David Bremner 2017-09-06 1:29 ` [PATCH 1/5] test: make fallback to duplicate test more robust David Bremner
Code repositories for project(s) associated with this public inbox https://yhetil.org/notmuch.git/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).