* [PATCH 1/6] test: start new corpus of test messages for indexing code
2022-02-05 19:52 allow disabling the check for multi-message mboxes David Bremner
@ 2022-02-05 19:52 ` David Bremner
2022-02-05 19:52 ` [PATCH 2/6] test: add known broken test for non-email false positive David Bremner
` (6 subsequent siblings)
7 siblings, 0 replies; 10+ messages in thread
From: David Bremner @ 2022-02-05 19:52 UTC (permalink / raw)
To: notmuch
This particular message is not recognized by notmuch as mail, but is
fine according to e.g. mutt. The trigger for this bad behaviour seems
to be a second "From " ocurring at the beginning of the line but
inside an attachment.
---
test/corpora/indexing/mbox-attachment.eml | 83 +++++++++++++++++++++++
1 file changed, 83 insertions(+)
create mode 100644 test/corpora/indexing/mbox-attachment.eml
diff --git a/test/corpora/indexing/mbox-attachment.eml b/test/corpora/indexing/mbox-attachment.eml
new file mode 100644
index 00000000..98a8fc91
--- /dev/null
+++ b/test/corpora/indexing/mbox-attachment.eml
@@ -0,0 +1,83 @@
+From david@tethera.net Sat Feb 5 09:19:10 2022
+From: David Bremner <david@tethera.net>
+To: David Bremner <david@tethera.net>
+Subject: Re: [RFC PATCH v2 12/12] emacs: whitespace cleanup for keybindings
+Date: Sat, 05 Feb 2022 10:19:09 -0400
+Message-ID: <87k0e9o0pu.fsf@tethera.net>
+MIME-Version: 1.0
+Content-Type: multipart/mixed; boundary="=-=-="
+
+--=-=-=
+Content-Type: text/plain
+Content-Disposition: inline
+
+
+I figured out the race condition in the tests. The previous test was
+still running when the failing test started, the joys of using a shared
+emacs for running all of the tests in one file.
+
+The attached diff is split into the the commits that introduce the tests
+in question in my working series, but you should be able to just apply
+it on top of the posted series if you want.
+
+
+--=-=-=
+Content-Type: text/x-diff
+Content-Disposition: inline; filename=0001-test-fixups.patch
+
+From fc88cba7f1f37b9cf3b296eace2422dd0e173502 Mon Sep 17 00:00:00 2001
+From: David Bremner <david@tethera.net>
+Date: Thu, 3 Feb 2022 21:05:05 -0400
+Subject: [PATCH] test fixups
+
+---
+ test/T315-emacs-tagging.sh | 9 ++++-----
+ 1 file changed, 4 insertions(+), 5 deletions(-)
+
+diff --git a/test/T315-emacs-tagging.sh b/test/T315-emacs-tagging.sh
+index c9e3e53a..c26413ce 100755
+--- a/test/T315-emacs-tagging.sh
++++ b/test/T315-emacs-tagging.sh
+@@ -119,7 +119,8 @@ for mode in search show tree unthreaded; do
+ (notmuch-$mode \"$os_x_darwin_thread\")
+ (notmuch-test-wait)
+ (execute-kbd-macro \"+tag-to-be-undone-$mode\")
+- (notmuch-tag-undo))"
++ (notmuch-tag-undo)
++ (notmuch-test-wait))"
+ count=$(notmuch count "tag:tag-to-be-undone-$mode")
+ test_expect_equal "$count" "0"
+
+@@ -128,9 +129,7 @@ for mode in search show tree unthreaded; do
+ (notmuch-$mode \"$os_x_darwin_thread\")
+ (notmuch-test-wait)
+ (execute-kbd-macro \"+one-$mode\")
+- (notmuch-test-wait)
+ (execute-kbd-macro \"+two-$mode\")
+- (notmuch-test-wait)
+ (notmuch-tag-undo)
+ (notmuch-test-wait)
+ (execute-kbd-macro \"+three-$mode\"))"
+@@ -143,7 +142,6 @@ for mode in search show tree unthreaded; do
+ (notmuch-$mode \"$os_x_darwin_thread\")
+ (notmuch-test-wait)
+ (execute-kbd-macro \"+one-$mode\")
+- (notmuch-test-wait)
+ (execute-kbd-macro \"+two-$mode\")
+ (notmuch-tag-undo)
+ (notmuch-test-wait)
+@@ -159,7 +157,8 @@ for mode in search show tree unthreaded; do
+ (notmuch-$mode \"$os_x_darwin_thread\")
+ (notmuch-test-wait)
+ (execute-kbd-macro \"+tag-to-be-undone-$mode\")
+- (execute-kbd-macro (kbd \"C-x u\")))"
++ (execute-kbd-macro (kbd \"C-x u\"))
++ (notmuch-test-wait))"
+ count=$(notmuch count "tag:tag-to-be-undone-$mode")
+ test_expect_equal "$count" "0"
+ done
+--
+2.30.2
+
+
+--=-=-=--
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/6] test: add known broken test for non-email false positive.
2022-02-05 19:52 allow disabling the check for multi-message mboxes David Bremner
2022-02-05 19:52 ` [PATCH 1/6] test: start new corpus of test messages for indexing code David Bremner
@ 2022-02-05 19:52 ` David Bremner
2022-02-05 19:52 ` [PATCH 3/6] lib/config: add configuration item for controlling check for mbox David Bremner
` (5 subsequent siblings)
7 siblings, 0 replies; 10+ messages in thread
From: David Bremner @ 2022-02-05 19:52 UTC (permalink / raw)
To: notmuch
This message should parse fine, but notmuch heuristics are claiming it
is not email. It differs from a version that parses fine only in the
first line, which triggers the mbox detection code.
---
test/T050-new.sh | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/test/T050-new.sh b/test/T050-new.sh
index 6791f87c..9ef24f18 100755
--- a/test/T050-new.sh
+++ b/test/T050-new.sh
@@ -455,4 +455,12 @@ Date: Fri, 17 Jun 2016 22:14:41 -0400
EOF
test_expect_equal_file EXPECTED OUTPUT
+add_email_corpus indexing
+
+test_begin_subtest "id:87k0e9o0pu.fsf@tethera.net is indexed"
+test_subtest_known_broken
+notmuch new
+output=$(notmuch search --output=messages id:87k0e9o0pu.fsf@tethera.net)
+test_expect_equal "$output" "id:87k0e9o0pu.fsf@tethera.net"
+
test_done
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 3/6] lib/config: add configuration item for controlling check for mbox
2022-02-05 19:52 allow disabling the check for multi-message mboxes David Bremner
2022-02-05 19:52 ` [PATCH 1/6] test: start new corpus of test messages for indexing code David Bremner
2022-02-05 19:52 ` [PATCH 2/6] test: add known broken test for non-email false positive David Bremner
@ 2022-02-05 19:52 ` David Bremner
2022-02-05 19:52 ` [PATCH 4/6] test: add known broken test for index.check_mbox=false David Bremner
` (4 subsequent siblings)
7 siblings, 0 replies; 10+ messages in thread
From: David Bremner @ 2022-02-05 19:52 UTC (permalink / raw)
To: notmuch
Currently we always check for multi-message mboxes, but some people
would prefer to disable this check. Set up infrastructure to disable
check.
---
lib/config.cc | 4 ++++
lib/notmuch.h | 1 +
test/T030-config.sh | 1 +
test/T055-path-config.sh | 1 +
test/T590-libconfig.sh | 4 ++++
5 files changed, 11 insertions(+)
diff --git a/lib/config.cc b/lib/config.cc
index 503a0c8b..296bc3b8 100644
--- a/lib/config.cc
+++ b/lib/config.cc
@@ -581,6 +581,8 @@ _notmuch_config_key_to_string (notmuch_config_key_t key)
return "database.hook_dir";
case NOTMUCH_CONFIG_BACKUP_DIR:
return "database.backup_dir";
+ case NOTMUCH_CONFIG_CHECK_MBOX:
+ return "index.check_mbox";
case NOTMUCH_CONFIG_EXCLUDE_TAGS:
return "search.exclude_tags";
case NOTMUCH_CONFIG_NEW_TAGS:
@@ -628,6 +630,8 @@ _notmuch_config_default (notmuch_database_t *notmuch, notmuch_config_key_t key)
return "unread;inbox";
case NOTMUCH_CONFIG_SYNC_MAILDIR_FLAGS:
return "true";
+ case NOTMUCH_CONFIG_CHECK_MBOX:
+ return "true";
case NOTMUCH_CONFIG_USER_NAME:
name = getenv ("NAME");
if (name)
diff --git a/lib/notmuch.h b/lib/notmuch.h
index 2e6ec2af..f0cf1a09 100644
--- a/lib/notmuch.h
+++ b/lib/notmuch.h
@@ -2551,6 +2551,7 @@ typedef enum {
NOTMUCH_CONFIG_USER_NAME,
NOTMUCH_CONFIG_AUTOCOMMIT,
NOTMUCH_CONFIG_EXTRA_HEADERS,
+ NOTMUCH_CONFIG_CHECK_MBOX,
NOTMUCH_CONFIG_LAST
} notmuch_config_key_t;
diff --git a/test/T030-config.sh b/test/T030-config.sh
index 43bbce31..fe7ac4a6 100755
--- a/test/T030-config.sh
+++ b/test/T030-config.sh
@@ -57,6 +57,7 @@ database.mail_root=MAIL_DIR
database.path=MAIL_DIR
foo.list=this;is another;list value;
foo.string=this is another string value
+index.check_mbox=true
maildir.synchronize_flags=true
new.ignore=
new.tags=unread;inbox
diff --git a/test/T055-path-config.sh b/test/T055-path-config.sh
index 1df240dd..2a518892 100755
--- a/test/T055-path-config.sh
+++ b/test/T055-path-config.sh
@@ -283,6 +283,7 @@ database.backup_dir
database.hook_dir
database.mail_root=MAIL_DIR
database.path
+index.check_mbox=true
maildir.synchronize_flags=true
new.ignore=
new.tags=unread;inbox
diff --git a/test/T590-libconfig.sh b/test/T590-libconfig.sh
index 26a1f033..034ac8e6 100755
--- a/test/T590-libconfig.sh
+++ b/test/T590-libconfig.sh
@@ -440,6 +440,7 @@ cat <<'EOF' >EXPECTED
10: 'USER_FULL_NAME'
11: '8000'
12: 'NULL'
+13: 'true'
== stderr ==
EOF
unset MAILDIR
@@ -751,6 +752,7 @@ cat <<'EOF' >EXPECTED
10: 'Notmuch Test Suite'
11: '8000'
12: 'NULL'
+13: 'true'
== stderr ==
EOF
test_expect_equal_file EXPECTED OUTPUT
@@ -785,6 +787,7 @@ cat <<'EOF' >EXPECTED
10: 'USER_FULL_NAME'
11: '8000'
12: 'NULL'
+13: 'true'
== stderr ==
EOF
test_expect_equal_file EXPECTED OUTPUT.clean
@@ -856,6 +859,7 @@ database.backup_dir MAIL_DIR/.notmuch/backups
database.hook_dir MAIL_DIR/.notmuch/hooks
database.mail_root MAIL_DIR
database.path MAIL_DIR
+index.check_mbox true
key with spaces value, with, spaces!
maildir.synchronize_flags true
new.ignore sekrit_junk
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 4/6] test: add known broken test for index.check_mbox=false
2022-02-05 19:52 allow disabling the check for multi-message mboxes David Bremner
` (2 preceding siblings ...)
2022-02-05 19:52 ` [PATCH 3/6] lib/config: add configuration item for controlling check for mbox David Bremner
@ 2022-02-05 19:52 ` David Bremner
2022-02-05 19:52 ` [PATCH 5/6] lib/message-file: stash pointer to "parent" notmuch database David Bremner
` (3 subsequent siblings)
7 siblings, 0 replies; 10+ messages in thread
From: David Bremner @ 2022-02-05 19:52 UTC (permalink / raw)
To: notmuch
This message should only fail to parse because it looks (rightly or
wrongly) like a 2 message mbox.
---
test/T050-new.sh | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/test/T050-new.sh b/test/T050-new.sh
index 9ef24f18..e779bd2f 100755
--- a/test/T050-new.sh
+++ b/test/T050-new.sh
@@ -463,4 +463,13 @@ notmuch new
output=$(notmuch search --output=messages id:87k0e9o0pu.fsf@tethera.net)
test_expect_equal "$output" "id:87k0e9o0pu.fsf@tethera.net"
+test_begin_subtest "id:87k0e9o0pu.fsf@tethera.net is indexed with index.check_mbox=false"
+test_subtest_known_broken
+cp notmuch-config notmuch-config.old
+notmuch config set index.check_mbox false
+notmuch new
+cp notmuch-config.old notmuch-config
+output=$(notmuch search --output=messages id:87k0e9o0pu.fsf@tethera.net)
+test_expect_equal "$output" "id:87k0e9o0pu.fsf@tethera.net"
+
test_done
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 5/6] lib/message-file: stash pointer to "parent" notmuch database
2022-02-05 19:52 allow disabling the check for multi-message mboxes David Bremner
` (3 preceding siblings ...)
2022-02-05 19:52 ` [PATCH 4/6] test: add known broken test for index.check_mbox=false David Bremner
@ 2022-02-05 19:52 ` David Bremner
2022-02-05 19:52 ` [PATCH 6/6] lib/message-file: allow disabling check for mbox files David Bremner
` (2 subsequent siblings)
7 siblings, 0 replies; 10+ messages in thread
From: David Bremner @ 2022-02-05 19:52 UTC (permalink / raw)
To: notmuch
Since we know the database anyway when creating the
notmuch_message_file struct, keep it to e.g. retrieve configuration
information later.
---
lib/message-file.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/lib/message-file.c b/lib/message-file.c
index 68f646a4..0f356cf1 100644
--- a/lib/message-file.c
+++ b/lib/message-file.c
@@ -27,6 +27,7 @@
#include <glib.h> /* GHashTable */
struct _notmuch_message_file {
+ notmuch_database_t *notmuch;
/* open stream to (possibly gzipped) file */
GMimeStream *stream;
char *filename;
@@ -90,6 +91,8 @@ _notmuch_message_file_open_ctx (notmuch_database_t *notmuch,
if (message->stream == NULL)
goto FAIL;
+ message->notmuch = notmuch;
+
return message;
FAIL:
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 6/6] lib/message-file: allow disabling check for mbox files.
2022-02-05 19:52 allow disabling the check for multi-message mboxes David Bremner
` (4 preceding siblings ...)
2022-02-05 19:52 ` [PATCH 5/6] lib/message-file: stash pointer to "parent" notmuch database David Bremner
@ 2022-02-05 19:52 ` David Bremner
2022-02-06 19:42 ` [PATCH] doc: document index.check_mbox option David Bremner
2022-02-08 20:58 ` allow disabling the check for multi-message mboxes David Bremner
2022-02-20 13:23 ` David Bremner
7 siblings, 1 reply; 10+ messages in thread
From: David Bremner @ 2022-02-05 19:52 UTC (permalink / raw)
To: notmuch
This does not change the default (of enforcing the check) but does
allow users to disable the check if they want.
---
lib/message-file.c | 9 +++++++--
test/T050-new.sh | 1 -
2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/lib/message-file.c b/lib/message-file.c
index 0f356cf1..e2bd406b 100644
--- a/lib/message-file.c
+++ b/lib/message-file.c
@@ -145,14 +145,19 @@ _notmuch_message_file_parse (notmuch_message_file_t *message)
GMimeParser *parser;
notmuch_status_t status = NOTMUCH_STATUS_SUCCESS;
bool is_mbox;
+ notmuch_bool_t check_mbox;
if (message->message)
return NOTMUCH_STATUS_SUCCESS;
- is_mbox = _is_mbox (message->stream);
-
_notmuch_init ();
+ status = notmuch_config_get_bool (message->notmuch, NOTMUCH_CONFIG_CHECK_MBOX, &check_mbox);
+ if (status)
+ return status;
+
+ is_mbox = check_mbox && _is_mbox (message->stream);
+
message->headers = g_hash_table_new_full (strcase_hash, strcase_equal,
free, g_free);
if (! message->headers)
diff --git a/test/T050-new.sh b/test/T050-new.sh
index e779bd2f..aa777235 100755
--- a/test/T050-new.sh
+++ b/test/T050-new.sh
@@ -464,7 +464,6 @@ output=$(notmuch search --output=messages id:87k0e9o0pu.fsf@tethera.net)
test_expect_equal "$output" "id:87k0e9o0pu.fsf@tethera.net"
test_begin_subtest "id:87k0e9o0pu.fsf@tethera.net is indexed with index.check_mbox=false"
-test_subtest_known_broken
cp notmuch-config notmuch-config.old
notmuch config set index.check_mbox false
notmuch new
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH] doc: document index.check_mbox option.
2022-02-05 19:52 ` [PATCH 6/6] lib/message-file: allow disabling check for mbox files David Bremner
@ 2022-02-06 19:42 ` David Bremner
0 siblings, 0 replies; 10+ messages in thread
From: David Bremner @ 2022-02-06 19:42 UTC (permalink / raw)
To: David Bremner, notmuch
---
This needs to be applied on top of the series id:20220206142334.303341-1-david@tethera.net
doc/man1/notmuch-config.rst | 10 ++++++++++
doc/man1/notmuch-insert.rst | 1 +
doc/man1/notmuch-new.rst | 1 +
3 files changed, 12 insertions(+)
diff --git a/doc/man1/notmuch-config.rst b/doc/man1/notmuch-config.rst
index acc5ecec..f24c58a5 100644
--- a/doc/man1/notmuch-config.rst
+++ b/doc/man1/notmuch-config.rst
@@ -159,6 +159,16 @@ index.decrypt
Default: ``auto``.
+.. _index.check_mbox:
+
+index.check\_mbox
+ Avoid indexing files which appear to be :manpage:`mbox` files with
+ more than one message.
+
+ Default: ``true``.
+
+ History: This configuration value was introduced in notmuch 0.36.
+
.. _index.header:
index.header.<prefix>
diff --git a/doc/man1/notmuch-insert.rst b/doc/man1/notmuch-insert.rst
index fe2bf26b..ed29e77f 100644
--- a/doc/man1/notmuch-insert.rst
+++ b/doc/man1/notmuch-insert.rst
@@ -92,6 +92,7 @@ CONFIGURATION
=============
Indexing is influenced by the configuration options
+:ref:`index.check_mbox <index.check_mbox>`,
:ref:`index.decrypt <index.decrypt>` and :ref:`index.header
<index.header>`. Tagging
is controlled by :ref:`new.tags <new.tags>` and
diff --git a/doc/man1/notmuch-new.rst b/doc/man1/notmuch-new.rst
index 398f8813..0b871c56 100644
--- a/doc/man1/notmuch-new.rst
+++ b/doc/man1/notmuch-new.rst
@@ -82,6 +82,7 @@ CONFIGURATION
=============
Indexing is influenced by the configuration options
+:ref:`index.check_mbox <index.check_mbox>`,
:ref:`index.decrypt <index.decrypt>`, :ref:`index.header
<index.header>`, and :ref:`new.ignore <new.ignore>`. Tagging
is controlled by :ref:`new.tags <new.tags>` and
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: allow disabling the check for multi-message mboxes
2022-02-05 19:52 allow disabling the check for multi-message mboxes David Bremner
` (5 preceding siblings ...)
2022-02-05 19:52 ` [PATCH 6/6] lib/message-file: allow disabling check for mbox files David Bremner
@ 2022-02-08 20:58 ` David Bremner
2022-02-20 13:23 ` David Bremner
7 siblings, 0 replies; 10+ messages in thread
From: David Bremner @ 2022-02-08 20:58 UTC (permalink / raw)
To: notmuch
David Bremner <david@tethera.net> writes:
> Notmuch tries to prevent users from shooting themselves in the foot by
> indexing mboxes containing multiple messages. On the other hand some
> MTAs (notably common configurations of postfix) like to deliver mail
> to file the looks like mboxes. This can occasionaly cause problems
> where notmuch thinks a file is an mbox, but it really isn't (see patch
> 2/6). This series makes the corresponding check optional, while
> defaulting to preserving the current behaviour.
As a complementary approach, perhaps notmuch insert should escape the
enveloper header so that notmuch does not deliver messages it cannot
index. I guess the question is what other client software expects the
files to start with "From "
d
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: allow disabling the check for multi-message mboxes
2022-02-05 19:52 allow disabling the check for multi-message mboxes David Bremner
` (6 preceding siblings ...)
2022-02-08 20:58 ` allow disabling the check for multi-message mboxes David Bremner
@ 2022-02-20 13:23 ` David Bremner
7 siblings, 0 replies; 10+ messages in thread
From: David Bremner @ 2022-02-20 13:23 UTC (permalink / raw)
To: notmuch
David Bremner <david@tethera.net> writes:
> Notmuch tries to prevent users from shooting themselves in the foot by
> indexing mboxes containing multiple messages. On the other hand some
> MTAs (notably common configurations of postfix) like to deliver mail
> to file the looks like mboxes. This can occasionaly cause problems
> where notmuch thinks a file is an mbox, but it really isn't (see patch
> 2/6). This series makes the corresponding check optional, while
> defaulting to preserving the current behaviour.
The series I just applied fixes the problem that motivated me to write
this series. On the other hand, that is specific to delivery via
notmuch-insert. For the moment I going to continue to wait for feedback
on this series.
^ permalink raw reply [flat|nested] 10+ messages in thread