unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* Escape mbox "From " header in notmuch-insert
@ 2022-02-13 15:27 David Bremner
  2022-02-13 15:27 ` [PATCH 1/4] test: start new corpus of test messages for indexing code David Bremner
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: David Bremner @ 2022-02-13 15:27 UTC (permalink / raw)
  To: notmuch

This obsoletes the WIP patches at [1].  I've settled on this being a
reasonable thing to do; this series tries to do it in a more
maintainable way.  Compared to the WIP series, it is also more careful
about writing out the inserted head name. It again repeats the patch
adding a test message in mbox form.

[1]: id:20220212025503.1690413-1-david@tethera.net

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/4] test: start new corpus of test messages for indexing code
  2022-02-13 15:27 Escape mbox "From " header in notmuch-insert David Bremner
@ 2022-02-13 15:27 ` David Bremner
  2022-02-13 15:27 ` [PATCH 2/4] test: add known broken test for insert with mbox as input David Bremner
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: David Bremner @ 2022-02-13 15:27 UTC (permalink / raw)
  To: notmuch

This particular message is not recognized by notmuch as mail, but is
fine according to e.g. mutt. The trigger for this bad behaviour seems
to be a second "From " ocurring at the beginning of the line but
inside an attachment.
---
 test/corpora/indexing/mbox-attachment.eml | 83 +++++++++++++++++++++++
 1 file changed, 83 insertions(+)
 create mode 100644 test/corpora/indexing/mbox-attachment.eml

diff --git a/test/corpora/indexing/mbox-attachment.eml b/test/corpora/indexing/mbox-attachment.eml
new file mode 100644
index 00000000..98a8fc91
--- /dev/null
+++ b/test/corpora/indexing/mbox-attachment.eml
@@ -0,0 +1,83 @@
+From david@tethera.net  Sat Feb  5 09:19:10 2022
+From: David Bremner <david@tethera.net>
+To: David Bremner <david@tethera.net>
+Subject: Re: [RFC PATCH v2 12/12] emacs: whitespace cleanup for keybindings
+Date: Sat, 05 Feb 2022 10:19:09 -0400
+Message-ID: <87k0e9o0pu.fsf@tethera.net>
+MIME-Version: 1.0
+Content-Type: multipart/mixed; boundary="=-=-="
+
+--=-=-=
+Content-Type: text/plain
+Content-Disposition: inline
+
+
+I figured out the race condition in the tests. The previous test was
+still running when the failing test started, the joys of using a shared
+emacs for running all of the tests in one file.
+
+The attached diff is split into the the commits that introduce the tests
+in question in my working series, but you should be able to just apply
+it on top of the posted series if you want.
+
+
+--=-=-=
+Content-Type: text/x-diff
+Content-Disposition: inline; filename=0001-test-fixups.patch
+
+From fc88cba7f1f37b9cf3b296eace2422dd0e173502 Mon Sep 17 00:00:00 2001
+From: David Bremner <david@tethera.net>
+Date: Thu, 3 Feb 2022 21:05:05 -0400
+Subject: [PATCH] test fixups
+
+---
+ test/T315-emacs-tagging.sh | 9 ++++-----
+ 1 file changed, 4 insertions(+), 5 deletions(-)
+
+diff --git a/test/T315-emacs-tagging.sh b/test/T315-emacs-tagging.sh
+index c9e3e53a..c26413ce 100755
+--- a/test/T315-emacs-tagging.sh
++++ b/test/T315-emacs-tagging.sh
+@@ -119,7 +119,8 @@ for mode in search show tree unthreaded; do
+       (notmuch-$mode \"$os_x_darwin_thread\")
+       (notmuch-test-wait)
+       (execute-kbd-macro \"+tag-to-be-undone-$mode\")
+-      (notmuch-tag-undo))"
++      (notmuch-tag-undo)
++      (notmuch-test-wait))"
+     count=$(notmuch count "tag:tag-to-be-undone-$mode")
+     test_expect_equal "$count" "0"
+ 
+@@ -128,9 +129,7 @@ for mode in search show tree unthreaded; do
+       (notmuch-$mode \"$os_x_darwin_thread\")
+       (notmuch-test-wait)
+       (execute-kbd-macro \"+one-$mode\")
+-      (notmuch-test-wait)
+       (execute-kbd-macro \"+two-$mode\")
+-      (notmuch-test-wait)
+       (notmuch-tag-undo)
+       (notmuch-test-wait)
+       (execute-kbd-macro \"+three-$mode\"))"
+@@ -143,7 +142,6 @@ for mode in search show tree unthreaded; do
+       (notmuch-$mode \"$os_x_darwin_thread\")
+       (notmuch-test-wait)
+       (execute-kbd-macro \"+one-$mode\")
+-      (notmuch-test-wait)
+       (execute-kbd-macro \"+two-$mode\")
+       (notmuch-tag-undo)
+       (notmuch-test-wait)
+@@ -159,7 +157,8 @@ for mode in search show tree unthreaded; do
+       (notmuch-$mode \"$os_x_darwin_thread\")
+       (notmuch-test-wait)
+       (execute-kbd-macro \"+tag-to-be-undone-$mode\")
+-      (execute-kbd-macro (kbd \"C-x u\")))"
++      (execute-kbd-macro (kbd \"C-x u\"))
++      (notmuch-test-wait))"
+     count=$(notmuch count "tag:tag-to-be-undone-$mode")
+     test_expect_equal "$count" "0"
+ done
+-- 
+2.30.2
+
+
+--=-=-=--
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/4] test: add known broken test for insert with mbox as input
  2022-02-13 15:27 Escape mbox "From " header in notmuch-insert David Bremner
  2022-02-13 15:27 ` [PATCH 1/4] test: start new corpus of test messages for indexing code David Bremner
@ 2022-02-13 15:27 ` David Bremner
  2022-02-13 15:27 ` [PATCH 3/4] CLI/insert: split copy_fd David Bremner
  2022-02-13 15:27 ` [PATCH 4/4] CLI/insert: escape envelope from David Bremner
  3 siblings, 0 replies; 7+ messages in thread
From: David Bremner @ 2022-02-13 15:27 UTC (permalink / raw)
  To: notmuch

It seems reasonable that notmuch should try to avoid delivering
messages in formats it cannot index.
---
 test/T070-insert.sh | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/test/T070-insert.sh b/test/T070-insert.sh
index ec170b30..a297fa73 100755
--- a/test/T070-insert.sh
+++ b/test/T070-insert.sh
@@ -292,4 +292,10 @@ for code in OUT_OF_MEMORY XAPIAN_EXCEPTION ; do
     test_expect_code 0 "notmuch_with_shim shim-$code insert --keep < \"$gen_msg_filename\""
 done
 
+test_begin_subtest "insert converts mboxes on delivery"
+test_subtest_known_broken
+notmuch insert +unmboxed < "${TEST_DIRECTORY}"/corpora/indexing/mbox-attachment.eml
+output=$(notmuch count tag:unmboxed)
+test_expect_equal "${output}" 1
+
 test_done
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/4] CLI/insert: split copy_fd
  2022-02-13 15:27 Escape mbox "From " header in notmuch-insert David Bremner
  2022-02-13 15:27 ` [PATCH 1/4] test: start new corpus of test messages for indexing code David Bremner
  2022-02-13 15:27 ` [PATCH 2/4] test: add known broken test for insert with mbox as input David Bremner
@ 2022-02-13 15:27 ` David Bremner
  2022-02-13 15:27 ` [PATCH 4/4] CLI/insert: escape envelope from David Bremner
  3 siblings, 0 replies; 7+ messages in thread
From: David Bremner @ 2022-02-13 15:27 UTC (permalink / raw)
  To: notmuch

This helps maintainability and enables code-reuse of our home-brewed
buffered-write code.

This commit is mostly code movement.
---
 notmuch-insert.c | 37 +++++++++++++++++++++----------------
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/notmuch-insert.c b/notmuch-insert.c
index 214d4d03..bec25a2a 100644
--- a/notmuch-insert.c
+++ b/notmuch-insert.c
@@ -241,6 +241,24 @@ maildir_mktemp (const void *ctx, const char *maildir, bool world_readable, char
     return fd;
 }
 
+static bool
+write_buf (const char *buf, int fdout, ssize_t remain) {
+    const char *p = buf;
+    do {
+	ssize_t written = write (fdout, p, remain);
+	if (written < 0 && errno == EINTR)
+	    continue;
+	if (written <= 0) {
+	    fprintf (stderr, "Error: writing to temporary file: %s",
+		     strerror (errno));
+	    return false;
+	}
+	p += written;
+	remain -= written;
+    } while (remain > 0);
+    return true;
+}
+
 /*
  * Copy fdin to fdout, return true on success, and false on errors and
  * empty input.
@@ -253,7 +271,6 @@ copy_fd (int fdout, int fdin)
     while (! interrupted) {
 	ssize_t remain;
 	char buf[4096];
-	char *p;
 
 	remain = read (fdin, buf, sizeof (buf));
 	if (remain == 0)
@@ -265,21 +282,9 @@ copy_fd (int fdout, int fdin)
 		     strerror (errno));
 	    return false;
 	}
-
-	p = buf;
-	do {
-	    ssize_t written = write (fdout, p, remain);
-	    if (written < 0 && errno == EINTR)
-		continue;
-	    if (written <= 0) {
-		fprintf (stderr, "Error: writing to temporary file: %s",
-			 strerror (errno));
-		return false;
-	    }
-	    p += written;
-	    remain -= written;
-	    empty = false;
-	} while (remain > 0);
+	if (! write_buf (buf, fdout, remain))
+	    return false;
+	empty = false;
     }
 
     return (! interrupted && ! empty);
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 4/4] CLI/insert: escape envelope from
  2022-02-13 15:27 Escape mbox "From " header in notmuch-insert David Bremner
                   ` (2 preceding siblings ...)
  2022-02-13 15:27 ` [PATCH 3/4] CLI/insert: split copy_fd David Bremner
@ 2022-02-13 15:27 ` David Bremner
  2022-02-19 21:47   ` Tomi Ollila
  3 siblings, 1 reply; 7+ messages in thread
From: David Bremner @ 2022-02-13 15:27 UTC (permalink / raw)
  To: notmuch

The idea is to do as little parsing and modification of the delivered
message as possible. Luckily the position of the "envelope header"
lets us escape it by replacing the first 5 characters of the stream
with a regular header name (with ':').
---
 notmuch-insert.c    | 15 ++++++++++++++-
 test/T070-insert.sh |  1 -
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/notmuch-insert.c b/notmuch-insert.c
index bec25a2a..d1244384 100644
--- a/notmuch-insert.c
+++ b/notmuch-insert.c
@@ -267,10 +267,13 @@ static bool
 copy_fd (int fdout, int fdin)
 {
     bool empty = true;
+    bool first = true;
+    const char *header="X-Envelope-From: ";
 
     while (! interrupted) {
 	ssize_t remain;
 	char buf[4096];
+	const char *p = buf;
 
 	remain = read (fdin, buf, sizeof (buf));
 	if (remain == 0)
@@ -282,7 +285,17 @@ copy_fd (int fdout, int fdin)
 		     strerror (errno));
 	    return false;
 	}
-	if (! write_buf (buf, fdout, remain))
+
+	if (first && remain >= 5 && 0 == strncmp (buf, "From ", 5)) {
+	    if (! write_buf (header, fdout, strlen (header)))
+		return false;
+	    p += 5;
+	    remain -= 5;
+	}
+
+	first = false;
+
+	if (! write_buf (p, fdout, remain))
 	    return false;
 	empty = false;
     }
diff --git a/test/T070-insert.sh b/test/T070-insert.sh
index a297fa73..e1e3b151 100755
--- a/test/T070-insert.sh
+++ b/test/T070-insert.sh
@@ -293,7 +293,6 @@ for code in OUT_OF_MEMORY XAPIAN_EXCEPTION ; do
 done
 
 test_begin_subtest "insert converts mboxes on delivery"
-test_subtest_known_broken
 notmuch insert +unmboxed < "${TEST_DIRECTORY}"/corpora/indexing/mbox-attachment.eml
 output=$(notmuch count tag:unmboxed)
 test_expect_equal "${output}" 1
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 4/4] CLI/insert: escape envelope from
  2022-02-13 15:27 ` [PATCH 4/4] CLI/insert: escape envelope from David Bremner
@ 2022-02-19 21:47   ` Tomi Ollila
  2022-02-20  2:25     ` David Bremner
  0 siblings, 1 reply; 7+ messages in thread
From: Tomi Ollila @ 2022-02-19 21:47 UTC (permalink / raw)
  To: David Bremner, notmuch

On Sun, Feb 13 2022, David Bremner wrote:

> The idea is to do as little parsing and modification of the delivered
> message as possible. Luckily the position of the "envelope header"
> lets us escape it by replacing the first 5 characters of the stream
> with a regular header name (with ':').
> ---
>  notmuch-insert.c    | 15 ++++++++++++++-
>  test/T070-insert.sh |  1 -
>  2 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/notmuch-insert.c b/notmuch-insert.c
> index bec25a2a..d1244384 100644
> --- a/notmuch-insert.c
> +++ b/notmuch-insert.c
> @@ -267,10 +267,13 @@ static bool
>  copy_fd (int fdout, int fdin)
>  {
>      bool empty = true;
> +    bool first = true;
> +    const char *header="X-Envelope-From: ";

spaces

>  
>      while (! interrupted) {
>  	ssize_t remain;
>  	char buf[4096];
> +	const char *p = buf;
>  
>  	remain = read (fdin, buf, sizeof (buf));
>  	if (remain == 0)
> @@ -282,7 +285,17 @@ copy_fd (int fdout, int fdin)
>  		     strerror (errno));
>  	    return false;
>  	}
> -	if (! write_buf (buf, fdout, remain))
> +
> +	if (first && remain >= 5 && 0 == strncmp (buf, "From ", 5)) {
> +	    if (! write_buf (header, fdout, strlen (header)))

Probably optimizing compiler can replace strlen (header) with constant
value -- it does not matter but had to mention :D

Code changes look good to me.

Tomi


> +		return false;
> +	    p += 5;
> +	    remain -= 5;
> +	}
> +
> +	first = false;
> +
> +	if (! write_buf (p, fdout, remain))
>  	    return false;
>  	empty = false;
>      }
> diff --git a/test/T070-insert.sh b/test/T070-insert.sh
> index a297fa73..e1e3b151 100755
> --- a/test/T070-insert.sh
> +++ b/test/T070-insert.sh
> @@ -293,7 +293,6 @@ for code in OUT_OF_MEMORY XAPIAN_EXCEPTION ; do
>  done
>  
>  test_begin_subtest "insert converts mboxes on delivery"
> -test_subtest_known_broken
>  notmuch insert +unmboxed < "${TEST_DIRECTORY}"/corpora/indexing/mbox-attachment.eml
>  output=$(notmuch count tag:unmboxed)
>  test_expect_equal "${output}" 1
> -- 
> 2.34.1

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 4/4] CLI/insert: escape envelope from
  2022-02-19 21:47   ` Tomi Ollila
@ 2022-02-20  2:25     ` David Bremner
  0 siblings, 0 replies; 7+ messages in thread
From: David Bremner @ 2022-02-20  2:25 UTC (permalink / raw)
  To: Tomi Ollila, notmuch

Tomi Ollila <tomi.ollila@iki.fi> writes:

>
> spaces
>

Thanks, I caught one more formatting change in a previous patch, and
applied the series to master.

d


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-02-20  2:25 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-13 15:27 Escape mbox "From " header in notmuch-insert David Bremner
2022-02-13 15:27 ` [PATCH 1/4] test: start new corpus of test messages for indexing code David Bremner
2022-02-13 15:27 ` [PATCH 2/4] test: add known broken test for insert with mbox as input David Bremner
2022-02-13 15:27 ` [PATCH 3/4] CLI/insert: split copy_fd David Bremner
2022-02-13 15:27 ` [PATCH 4/4] CLI/insert: escape envelope from David Bremner
2022-02-19 21:47   ` Tomi Ollila
2022-02-20  2:25     ` David Bremner

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).