unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [PATCH 1/3] test: Test notmuch new for single-message mbox
@ 2012-11-25  6:15 Austin Clements
  2012-11-25  6:16 ` [PATCH 2/3] test: Test for ignoring multi-message mbox Austin Clements
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Austin Clements @ 2012-11-25  6:15 UTC (permalink / raw)
  To: notmuch

We support this for historical reasons.
---
 test/new |   13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/test/new b/test/new
index 587aa11..43d56e4 100755
--- a/test/new
+++ b/test/new
@@ -163,6 +163,19 @@ rm -rf "${MAIL_DIR}"/two
 output=$(NOTMUCH_NEW)
 test_expect_equal "$output" "No new mail. Removed 3 messages."
 
+test_begin_subtest "Support single-message mbox"
+cat > "${MAIL_DIR}"/mbox_file1 <<EOF
+From test_suite@notmuchmail.org Fri Jan  5 15:43:57 2001
+From: Notmuch Test Suite <test_suite@notmuchmail.org>
+To: Notmuch Test Suite <test_suite@notmuchmail.org>
+Subject: Test mbox message 1
+
+Body.
+EOF
+output=$(NOTMUCH_NEW 2>&1)
+test_expect_equal "$output" \
+"Added 1 new message to the database."
+
 # This test requires that notmuch new has been run at least once.
 test_begin_subtest "Skip and report non-mail files"
 generate_message
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] test: Test for ignoring multi-message mbox
  2012-11-25  6:15 [PATCH 1/3] test: Test notmuch new for single-message mbox Austin Clements
@ 2012-11-25  6:16 ` Austin Clements
  2012-11-25  6:16 ` [PATCH 3/3] lib: Reject multi-message mboxes and deprecate single-message mbox Austin Clements
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Austin Clements @ 2012-11-25  6:16 UTC (permalink / raw)
  To: notmuch

This test is currently broken.  Note that its brokenness cascades and
causes the next test to fail as well (because notmuch incorrectly
indexes the mbox file).
---
 test/new |   18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/test/new b/test/new
index 43d56e4..29f9aff 100755
--- a/test/new
+++ b/test/new
@@ -178,16 +178,34 @@ test_expect_equal "$output" \
 
 # This test requires that notmuch new has been run at least once.
 test_begin_subtest "Skip and report non-mail files"
+test_subtest_known_broken
 generate_message
 mkdir -p "${MAIL_DIR}"/.git && touch "${MAIL_DIR}"/.git/config
 touch "${MAIL_DIR}"/ignored_file
 touch "${MAIL_DIR}"/.ignored_hidden_file
+cat > "${MAIL_DIR}"/mbox_file <<EOF
+From test_suite@notmuchmail.org Fri Jan  5 15:43:57 2001
+From: Notmuch Test Suite <test_suite@notmuchmail.org>
+To: Notmuch Test Suite <test_suite@notmuchmail.org>
+Subject: Test mbox message 1
+
+Body.
+
+From test_suite@notmuchmail.org Fri Jan  5 15:43:57 2001
+From: Notmuch Test Suite <test_suite@notmuchmail.org>
+To: Notmuch Test Suite <test_suite@notmuchmail.org>
+Subject: Test mbox message 2
+
+Body 2.
+EOF
 output=$(NOTMUCH_NEW 2>&1)
 test_expect_equal "$output" \
 "Note: Ignoring non-mail file: ${MAIL_DIR}/.git/config
 Note: Ignoring non-mail file: ${MAIL_DIR}/.ignored_hidden_file
 Note: Ignoring non-mail file: ${MAIL_DIR}/ignored_file
+Note: Ignoring non-mail file: ${MAIL_DIR}/mbox_file
 Added 1 new message to the database."
+rm "${MAIL_DIR}"/mbox_file
 
 test_begin_subtest "Ignore files and directories specified in new.ignore"
 generate_message
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] lib: Reject multi-message mboxes and deprecate single-message mbox
  2012-11-25  6:15 [PATCH 1/3] test: Test notmuch new for single-message mbox Austin Clements
  2012-11-25  6:16 ` [PATCH 2/3] test: Test for ignoring multi-message mbox Austin Clements
@ 2012-11-25  6:16 ` Austin Clements
  2012-11-25 13:26   ` Tomi Ollila
  2012-11-25 11:33 ` [PATCH 1/3] test: Test notmuch new for " Mark Walters
  2012-11-27  1:14 ` David Bremner
  3 siblings, 1 reply; 8+ messages in thread
From: Austin Clements @ 2012-11-25  6:16 UTC (permalink / raw)
  To: notmuch

Previously, we would treat multi-message mboxes as one giant email,
which, besides the obvious incorrect indexing, often led to
out-of-memory errors for archival mboxes.  Now we explicitly reject
multi-message mboxes.  For historical reasons, we retain support for
single-message mboxes, but official deprecate this behavior.
---
 lib/database.cc |    4 +++-
 lib/index.cc    |   28 ++++++++++++++++++++++++++++
 test/new        |    8 +++++---
 3 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index 4df3217..91d4329 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -1821,7 +1821,9 @@ notmuch_database_add_message (notmuch_database_t *notmuch,
 	    date = notmuch_message_file_get_header (message_file, "date");
 	    _notmuch_message_set_header_values (message, date, from, subject);
 
-	    _notmuch_message_index_file (message, filename);
+	    ret = _notmuch_message_index_file (message, filename);
+	    if (ret)
+		goto DONE;
 	} else {
 	    ret = NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID;
 	}
diff --git a/lib/index.cc b/lib/index.cc
index e377732..da0e6ce 100644
--- a/lib/index.cc
+++ b/lib/index.cc
@@ -435,6 +435,9 @@ _notmuch_message_index_file (notmuch_message_t *message,
     const char *from, *subject;
     notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS;
     static int initialized = 0;
+    char from_buf[5];
+    bool is_mbox = false;
+    static bool mbox_warning = false;
 
     if (! initialized) {
 	g_mime_init (0);
@@ -448,13 +451,38 @@ _notmuch_message_index_file (notmuch_message_t *message,
 	goto DONE;
     }
 
+    /* Is this mbox? */
+    if (fread (from_buf, sizeof (from_buf), 1, file) == 1 &&
+	strncmp (from_buf, "From ", 5) == 0)
+	is_mbox = true;
+    rewind (file);
+
     /* Evil GMime steals my FILE* here so I won't fclose it. */
     stream = g_mime_stream_file_new (file);
 
     parser = g_mime_parser_new_with_stream (stream);
+    g_mime_parser_set_scan_from (parser, is_mbox);
 
     mime_message = g_mime_parser_construct_message (parser);
 
+    if (is_mbox) {
+	if (!g_mime_parser_eos (parser)) {
+	    /* This is a multi-message mbox. */
+	    ret = NOTMUCH_STATUS_FILE_NOT_EMAIL;
+	    goto DONE;
+	}
+	/* For historical reasons, we support single-message mboxes,
+	 * but this behavior is likely to change in the future, so
+	 * warn. */
+	if (!mbox_warning) {
+	    mbox_warning = true;
+	    fprintf (stderr, "\
+Warning: %s is an mbox containing a single message,\n\
+likely caused by misconfigured mail delivery.  Support for single-message\n\
+mboxes is deprecated and may be removed in the future.\n", filename);
+	}
+    }
+
     from = g_mime_message_get_sender (mime_message);
     addresses = internet_address_list_parse_string (from);
 
diff --git a/test/new b/test/new
index 29f9aff..f562cec 100755
--- a/test/new
+++ b/test/new
@@ -163,7 +163,7 @@ rm -rf "${MAIL_DIR}"/two
 output=$(NOTMUCH_NEW)
 test_expect_equal "$output" "No new mail. Removed 3 messages."
 
-test_begin_subtest "Support single-message mbox"
+test_begin_subtest "Support single-message mbox (deprecated)"
 cat > "${MAIL_DIR}"/mbox_file1 <<EOF
 From test_suite@notmuchmail.org Fri Jan  5 15:43:57 2001
 From: Notmuch Test Suite <test_suite@notmuchmail.org>
@@ -174,11 +174,13 @@ Body.
 EOF
 output=$(NOTMUCH_NEW 2>&1)
 test_expect_equal "$output" \
-"Added 1 new message to the database."
+"Warning: ${MAIL_DIR}/mbox_file1 is an mbox containing a single message,
+likely caused by misconfigured mail delivery.  Support for single-message
+mboxes is deprecated and may be removed in the future.
+Added 1 new message to the database."
 
 # This test requires that notmuch new has been run at least once.
 test_begin_subtest "Skip and report non-mail files"
-test_subtest_known_broken
 generate_message
 mkdir -p "${MAIL_DIR}"/.git && touch "${MAIL_DIR}"/.git/config
 touch "${MAIL_DIR}"/ignored_file
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/3] test: Test notmuch new for single-message mbox
  2012-11-25  6:15 [PATCH 1/3] test: Test notmuch new for single-message mbox Austin Clements
  2012-11-25  6:16 ` [PATCH 2/3] test: Test for ignoring multi-message mbox Austin Clements
  2012-11-25  6:16 ` [PATCH 3/3] lib: Reject multi-message mboxes and deprecate single-message mbox Austin Clements
@ 2012-11-25 11:33 ` Mark Walters
  2012-11-27  1:14 ` David Bremner
  3 siblings, 0 replies; 8+ messages in thread
From: Mark Walters @ 2012-11-25 11:33 UTC (permalink / raw)
  To: Austin Clements, notmuch


This series looks fine to me. +1

Mark

On Sun, 25 Nov 2012, Austin Clements <amdragon@MIT.EDU> wrote:
> We support this for historical reasons.
> ---
>  test/new |   13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/test/new b/test/new
> index 587aa11..43d56e4 100755
> --- a/test/new
> +++ b/test/new
> @@ -163,6 +163,19 @@ rm -rf "${MAIL_DIR}"/two
>  output=$(NOTMUCH_NEW)
>  test_expect_equal "$output" "No new mail. Removed 3 messages."
>  
> +test_begin_subtest "Support single-message mbox"
> +cat > "${MAIL_DIR}"/mbox_file1 <<EOF
> +From test_suite@notmuchmail.org Fri Jan  5 15:43:57 2001
> +From: Notmuch Test Suite <test_suite@notmuchmail.org>
> +To: Notmuch Test Suite <test_suite@notmuchmail.org>
> +Subject: Test mbox message 1
> +
> +Body.
> +EOF
> +output=$(NOTMUCH_NEW 2>&1)
> +test_expect_equal "$output" \
> +"Added 1 new message to the database."
> +
>  # This test requires that notmuch new has been run at least once.
>  test_begin_subtest "Skip and report non-mail files"
>  generate_message
> -- 
> 1.7.10.4
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] lib: Reject multi-message mboxes and deprecate single-message mbox
  2012-11-25  6:16 ` [PATCH 3/3] lib: Reject multi-message mboxes and deprecate single-message mbox Austin Clements
@ 2012-11-25 13:26   ` Tomi Ollila
  2012-11-25 18:05     ` Austin Clements
  0 siblings, 1 reply; 8+ messages in thread
From: Tomi Ollila @ 2012-11-25 13:26 UTC (permalink / raw)
  To: Austin Clements, notmuch

On Sun, Nov 25 2012, Austin Clements <amdragon@MIT.EDU> wrote:

> Previously, we would treat multi-message mboxes as one giant email,
> which, besides the obvious incorrect indexing, often led to
> out-of-memory errors for archival mboxes.  Now we explicitly reject
> multi-message mboxes.  For historical reasons, we retain support for
> single-message mboxes, but official deprecate this behavior.


The series looks good to me -- but I don't know about deprecating
single-message mboxes:

* If we someday support (read-only?) mbox format, then single-message
  mboxes are "normal" again.

* Some naïve mb2md scripts could leave the 'From ' -line intact: for
  example `formail -bz -s head -3 < $MAIL`(*) can be used to demonstrate this

* Some people may have large collection of single-file messages starting
  with 'From ' currently indexed. If those are to be re-indexed later
  without "single-message mbox" support that is somewhat of a burden to
  the users (**)

(*) my "mb2md" wannabe does gnus-like "$formail" -bz -R 'From ' X-From-Line: ...

(**) Something like the following could be used to mangle "single-file mboxes"...
     find . -type f | xargs perl -e 'foreach (@ARGV) { open IO, "+<", $_ or
     next; sysread IO, $buf, 5; if ($buf eq "From ") { sysseek IO, 0, 0;
     syswrite IO, "Fro:"; }}' 
     This breaks the multi-message mbox nicely... >;)


Tomi

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] lib: Reject multi-message mboxes and deprecate single-message mbox
  2012-11-25 13:26   ` Tomi Ollila
@ 2012-11-25 18:05     ` Austin Clements
  2012-11-25 19:57       ` Tomi Ollila
  0 siblings, 1 reply; 8+ messages in thread
From: Austin Clements @ 2012-11-25 18:05 UTC (permalink / raw)
  To: Tomi Ollila; +Cc: notmuch

Quoth Tomi Ollila on Nov 25 at  3:26 pm:
> On Sun, Nov 25 2012, Austin Clements <amdragon@MIT.EDU> wrote:
> 
> > Previously, we would treat multi-message mboxes as one giant email,
> > which, besides the obvious incorrect indexing, often led to
> > out-of-memory errors for archival mboxes.  Now we explicitly reject
> > multi-message mboxes.  For historical reasons, we retain support for
> > single-message mboxes, but official deprecate this behavior.
> 
> 
> The series looks good to me -- but I don't know about deprecating
> single-message mboxes:
> 
> * If we someday support (read-only?) mbox format, then single-message
>   mboxes are "normal" again.

If notmuch does gain mbox support, then its handling of single-message
mboxes will *definitely* change because it will stop doing
maildir-like things to them (flag sync, moving from new to cur, etc),
which people may currently be depending on.  This was one of the
motivations for deprecating the current handling of single-message
mboxes.

> * Some naïve mb2md scripts could leave the 'From ' -line intact: for
>   example `formail -bz -s head -3 < $MAIL`(*) can be used to demonstrate this

I would call that "buggy", rather than "naïve".  ]:--8)

> * Some people may have large collection of single-file messages starting
>   with 'From ' currently indexed. If those are to be re-indexed later
>   without "single-message mbox" support that is somewhat of a burden to
>   the users (**)

That's why this only deprecates them (with a warning) and doesn't drop
support for them.  The idea is to keep the historical handling for a
few releases and then we'll have the flexibility to do what we want
with single-message mboxes (including supporting them as real mbox).

It's probably a good idea to include a script or a wiki pointer for
fixing single-message mboxes in the NEWS.  As long as the file name is
kept the same, notmuch won't reindex it.

> (*) my "mb2md" wannabe does gnus-like "$formail" -bz -R 'From ' X-From-Line: ...
> 
> (**) Something like the following could be used to mangle "single-file mboxes"...
>      find . -type f | xargs perl -e 'foreach (@ARGV) { open IO, "+<", $_ or
>      next; sysread IO, $buf, 5; if ($buf eq "From ") { sysseek IO, 0, 0;
>      syswrite IO, "Fro:"; }}' 
>      This breaks the multi-message mbox nicely... >;)
> 
> 
> Tomi

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] lib: Reject multi-message mboxes and deprecate single-message mbox
  2012-11-25 18:05     ` Austin Clements
@ 2012-11-25 19:57       ` Tomi Ollila
  0 siblings, 0 replies; 8+ messages in thread
From: Tomi Ollila @ 2012-11-25 19:57 UTC (permalink / raw)
  To: Austin Clements; +Cc: notmuch

On Sun, Nov 25 2012, Austin Clements <amdragon@MIT.EDU> wrote:

> Quoth Tomi Ollila on Nov 25 at  3:26 pm:
>> On Sun, Nov 25 2012, Austin Clements <amdragon@MIT.EDU> wrote:
>> 
>> > Previously, we would treat multi-message mboxes as one giant email,
>> > which, besides the obvious incorrect indexing, often led to
>> > out-of-memory errors for archival mboxes.  Now we explicitly reject
>> > multi-message mboxes.  For historical reasons, we retain support for
>> > single-message mboxes, but official deprecate this behavior.
>> 
>> 
>> The series looks good to me -- but I don't know about deprecating
>> single-message mboxes:
>> 
>> * If we someday support (read-only?) mbox format, then single-message
>>   mboxes are "normal" again.
>
> If notmuch does gain mbox support, then its handling of single-message
> mboxes will *definitely* change because it will stop doing
> maildir-like things to them (flag sync, moving from new to cur, etc),
> which people may currently be depending on.  This was one of the
> motivations for deprecating the current handling of single-message
> mboxes.
>
>> * Some naïve mb2md scripts could leave the 'From ' -line intact: for
>>   example `formail -bz -s head -3 < $MAIL`(*) can be used to demonstrate this
>
> I would call that "buggy", rather than "naïve".  ]:--8)
>
>> * Some people may have large collection of single-file messages starting
>>   with 'From ' currently indexed. If those are to be re-indexed later
>>   without "single-message mbox" support that is somewhat of a burden to
>>   the users (**)
>
> That's why this only deprecates them (with a warning) and doesn't drop
> support for them.  The idea is to keep the historical handling for a
> few releases and then we'll have the flexibility to do what we want
> with single-message mboxes (including supporting them as real mbox).
>
> It's probably a good idea to include a script or a wiki pointer for
> fixing single-message mboxes in the NEWS.  As long as the file name is
> kept the same, notmuch won't reindex it.

Ok, I'm convinced. +1

Tomi

>
>> (*) my "mb2md" wannabe does gnus-like "$formail" -bz -R 'From ' X-From-Line: ...
>> 
>> (**) Something like the following could be used to mangle "single-file mboxes"...
>>      find . -type f | xargs perl -e 'foreach (@ARGV) { open IO, "+<", $_ or
>>      next; sysread IO, $buf, 5; if ($buf eq "From ") { sysseek IO, 0, 0;
>>      syswrite IO, "Fro:"; }}' 
>>      This breaks the multi-message mbox nicely... >;)
>> 
>> 
>> Tomi
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/3] test: Test notmuch new for single-message mbox
  2012-11-25  6:15 [PATCH 1/3] test: Test notmuch new for single-message mbox Austin Clements
                   ` (2 preceding siblings ...)
  2012-11-25 11:33 ` [PATCH 1/3] test: Test notmuch new for " Mark Walters
@ 2012-11-27  1:14 ` David Bremner
  3 siblings, 0 replies; 8+ messages in thread
From: David Bremner @ 2012-11-27  1:14 UTC (permalink / raw)
  To: Austin Clements, notmuch

Austin Clements <amdragon@MIT.EDU> writes:

> We support this for historical reasons.

Pushed all 3.

d

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-11-27  1:15 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-25  6:15 [PATCH 1/3] test: Test notmuch new for single-message mbox Austin Clements
2012-11-25  6:16 ` [PATCH 2/3] test: Test for ignoring multi-message mbox Austin Clements
2012-11-25  6:16 ` [PATCH 3/3] lib: Reject multi-message mboxes and deprecate single-message mbox Austin Clements
2012-11-25 13:26   ` Tomi Ollila
2012-11-25 18:05     ` Austin Clements
2012-11-25 19:57       ` Tomi Ollila
2012-11-25 11:33 ` [PATCH 1/3] test: Test notmuch new for " Mark Walters
2012-11-27  1:14 ` David Bremner

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).