unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [PATCH] notmuch-mutt: support for messages that lack Message-ID
@ 2015-01-24  9:11 Stefano Zacchiroli
  2015-01-24  9:11 ` [PATCH 1/3] notmuch-mutt README: use metacpn.org/* as deps homepages Stefano Zacchiroli
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Stefano Zacchiroli @ 2015-01-24  9:11 UTC (permalink / raw)
  To: notmuch; +Cc: Jan N. Klug

Please find attached a patch series that add support for messages that
lack Message-ID to notmuch-mutt, kindly provided by Jan N. Klug. I've
reviewed / adapted it, and I consider it suitable for inclusion into
the next release of notmuch(-mutt).

Please let me know if you've any question.

Cheers.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] notmuch-mutt README: use metacpn.org/* as deps homepages
  2015-01-24  9:11 [PATCH] notmuch-mutt: support for messages that lack Message-ID Stefano Zacchiroli
@ 2015-01-24  9:11 ` Stefano Zacchiroli
  2015-01-24  9:11 ` [PATCH 2/3] notmuch-mutt: support for messages that lack Message-ID headers Stefano Zacchiroli
  2015-01-24  9:11 ` [PATCH 3/3] debian packaging: drop dep notmuch-mutt -> libmailtools-perl Stefano Zacchiroli
  2 siblings, 0 replies; 8+ messages in thread
From: Stefano Zacchiroli @ 2015-01-24  9:11 UTC (permalink / raw)
  To: notmuch; +Cc: Jan N. Klug, Stefano Zacchiroli

---
 contrib/notmuch-mutt/README | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/contrib/notmuch-mutt/README b/contrib/notmuch-mutt/README
index 382ac91..c661447 100644
--- a/contrib/notmuch-mutt/README
+++ b/contrib/notmuch-mutt/README
@@ -33,13 +33,13 @@ Requirements
 
 To *run* notmuch-mutt you will need Perl with the following libraries:
 
-- Mail::Box <http://search.cpan.org/~markov/Mail-Box/>
+- Mail::Box <https://metacpan.org/pod/Mail::Box>
   (Debian package: libmail-box-perl)
-- Mail::Internet <http://search.cpan.org/~markov/MailTools/>
+- Mail::Internet <https://metacpan.org/pod/Mail::Internet>
   (Debian package: libmailtools-perl)
-- String::ShellQuote <http://search.cpan.org/~rosch/String-ShellQuote/ShellQuote.pm>
+- String::ShellQuote <https://metacpan.org/pod/String::ShellQuote>
   (Debian package: libstring-shellquote-perl)
-- Term::ReadLine <http://search.cpan.org/~hayashi/Term-ReadLine-Gnu/>
+- Term::ReadLine::Gnu <https://metacpan.org/pod/Term::ReadLine::Gnu>
   (Debian package: libterm-readline-gnu-perl)
 
 To *build* notmuch-mutt documentation you will need:
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] notmuch-mutt: support for messages that lack Message-ID headers
  2015-01-24  9:11 [PATCH] notmuch-mutt: support for messages that lack Message-ID Stefano Zacchiroli
  2015-01-24  9:11 ` [PATCH 1/3] notmuch-mutt README: use metacpn.org/* as deps homepages Stefano Zacchiroli
@ 2015-01-24  9:11 ` Stefano Zacchiroli
  2015-01-24 14:59   ` Jani Nikula
  2015-01-24  9:11 ` [PATCH 3/3] debian packaging: drop dep notmuch-mutt -> libmailtools-perl Stefano Zacchiroli
  2 siblings, 1 reply; 8+ messages in thread
From: Stefano Zacchiroli @ 2015-01-24  9:11 UTC (permalink / raw)
  To: notmuch; +Cc: Jan N. Klug, Stefano Zacchiroli

From: "Jan N. Klug" <jan.n.klug@rub.de>

For those messages, compute a synthetic Message-ID based on the SHA1
of the whole message, in the same way that notmuch would do. See:
http://git.notmuchmail.org/git/notmuch/blob/HEAD:/lib/sha1.c

To do the above, rewrite get_message_id() to scan the current message
line by line, incrementally computing a SHA1. As a consequence, drop
the dependency on Mail::Internet.

Signed-off-by: Stefano Zacchiroli <zack@upsilon.cc>
---
 contrib/notmuch-mutt/README       |  4 ++--
 contrib/notmuch-mutt/notmuch-mutt | 23 ++++++++++++++++++-----
 2 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/contrib/notmuch-mutt/README b/contrib/notmuch-mutt/README
index c661447..0013ed0 100644
--- a/contrib/notmuch-mutt/README
+++ b/contrib/notmuch-mutt/README
@@ -33,10 +33,10 @@ Requirements
 
 To *run* notmuch-mutt you will need Perl with the following libraries:
 
+- Digest::SHA <https://metacpan.org/release/Digest-SHA>
+  (Debian package: libdigest-sha-perl)
 - Mail::Box <https://metacpan.org/pod/Mail::Box>
   (Debian package: libmail-box-perl)
-- Mail::Internet <https://metacpan.org/pod/Mail::Internet>
-  (Debian package: libmailtools-perl)
 - String::ShellQuote <https://metacpan.org/pod/String::ShellQuote>
   (Debian package: libstring-shellquote-perl)
 - Term::ReadLine::Gnu <https://metacpan.org/pod/Term::ReadLine::Gnu>
diff --git a/contrib/notmuch-mutt/notmuch-mutt b/contrib/notmuch-mutt/notmuch-mutt
index 4969e4b..4d30b0b 100755
--- a/contrib/notmuch-mutt/notmuch-mutt
+++ b/contrib/notmuch-mutt/notmuch-mutt
@@ -13,11 +13,11 @@ use warnings;
 
 use File::Path;
 use Getopt::Long qw(:config no_getopt_compat);
-use Mail::Internet;
 use Mail::Box::Maildir;
 use Pod::Usage;
 use String::ShellQuote;
 use Term::ReadLine;
+use Digest::SHA;
 
 
 my $xdg_cache_dir = "$ENV{HOME}/.cache";
@@ -75,10 +75,23 @@ sub prompt($$) {
 }
 
 sub get_message_id() {
-    my $mail = Mail::Internet->new(\*STDIN);
-    my $mid = $mail->head->get("message-id") or return undef;
-    $mid =~ /^<(.*)>$/;	# get message-id value
-    return $1;
+    my $mid = undef;
+    my $sha = Digest::SHA->new(1);  # SHA1 hashing
+
+    while(<STDIN>) {  # scan message line by line, looking for mid
+        if ($_ =~ /^Message-ID:\s*<(.*)>$/i) {
+	    $mid = $1;
+	    last;  # message-id found, abort scan
+        }
+	$sha->add($_);  # update hash
+    }
+
+    # Generate message-id from hash if none was found, in the same way
+    # that notmuch would do.
+    # See: http://git.notmuchmail.org/git/notmuch/blob/HEAD:/lib/sha1.c
+    $mid ||= "notmuch-sha1-".$sha->hexdigest;
+
+    return $mid;
 }
 
 sub search_action($$$@) {
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] debian packaging: drop dep notmuch-mutt -> libmailtools-perl
  2015-01-24  9:11 [PATCH] notmuch-mutt: support for messages that lack Message-ID Stefano Zacchiroli
  2015-01-24  9:11 ` [PATCH 1/3] notmuch-mutt README: use metacpn.org/* as deps homepages Stefano Zacchiroli
  2015-01-24  9:11 ` [PATCH 2/3] notmuch-mutt: support for messages that lack Message-ID headers Stefano Zacchiroli
@ 2015-01-24  9:11 ` Stefano Zacchiroli
  2 siblings, 0 replies; 8+ messages in thread
From: Stefano Zacchiroli @ 2015-01-24  9:11 UTC (permalink / raw)
  To: notmuch; +Cc: Jan N. Klug, Stefano Zacchiroli

Rationale: the dependency is no longer needed after the rewrite of
get_message_id().

Note that Digest::SHA, needed by the new implementation of
get_message_id(), is shipped by the perl package itself, which is in
an implied dependency of other lib*-perl packages. So there is no need
to list perl explicitly as a dependency.
---
 debian/changelog | 8 ++++++++
 debian/control   | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/debian/changelog b/debian/changelog
index fbd15b8..2fa0f96 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,11 @@
+notmuch (0.19-2) experimental; urgency=medium
+
+  [ Stefano Zacchiroli ]
+  * notmuch-mutt: drop dependency on libmailtools-perl, because we no
+    longer need Mail::Internet
+
+ --
+
 notmuch (0.19-1) experimental; urgency=medium
 
   * New upstream release.
diff --git a/debian/control b/debian/control
index 4bc4cd9..84dbf32 100644
--- a/debian/control
+++ b/debian/control
@@ -144,7 +144,7 @@ Package: notmuch-mutt
 Architecture: all
 Depends:
  notmuch (>= 0.4),
- libmail-box-perl, libmailtools-perl,
+ libmail-box-perl,
  libstring-shellquote-perl, libterm-readline-gnu-perl,
  ${misc:Depends}
 Recommends: mutt
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] notmuch-mutt: support for messages that lack Message-ID headers
  2015-01-24  9:11 ` [PATCH 2/3] notmuch-mutt: support for messages that lack Message-ID headers Stefano Zacchiroli
@ 2015-01-24 14:59   ` Jani Nikula
  2015-01-24 15:21     ` Stefano Zacchiroli
  0 siblings, 1 reply; 8+ messages in thread
From: Jani Nikula @ 2015-01-24 14:59 UTC (permalink / raw)
  To: Stefano Zacchiroli, notmuch; +Cc: Jan N. Klug, Stefano Zacchiroli

On Sat, 24 Jan 2015, Stefano Zacchiroli <zack@upsilon.cc> wrote:
> From: "Jan N. Klug" <jan.n.klug@rub.de>
>
> For those messages, compute a synthetic Message-ID based on the SHA1
> of the whole message, in the same way that notmuch would do. See:
> http://git.notmuchmail.org/git/notmuch/blob/HEAD:/lib/sha1.c

As I said on IRC, I think this is a notmuch implementation detail, and
we don't make any promises about always generating missing message-ids
the same way. That said, I don't see any reason why we'd change this
anytime soon, so the solution is probably good enough for now.

> To do the above, rewrite get_message_id() to scan the current message
> line by line, incrementally computing a SHA1. As a consequence, drop
> the dependency on Mail::Internet.

I am not so sure this is a good idea however, see below.

> Signed-off-by: Stefano Zacchiroli <zack@upsilon.cc>
> ---
>  contrib/notmuch-mutt/README       |  4 ++--
>  contrib/notmuch-mutt/notmuch-mutt | 23 ++++++++++++++++++-----
>  2 files changed, 20 insertions(+), 7 deletions(-)
>
> diff --git a/contrib/notmuch-mutt/README b/contrib/notmuch-mutt/README
> index c661447..0013ed0 100644
> --- a/contrib/notmuch-mutt/README
> +++ b/contrib/notmuch-mutt/README
> @@ -33,10 +33,10 @@ Requirements
>  
>  To *run* notmuch-mutt you will need Perl with the following libraries:
>  
> +- Digest::SHA <https://metacpan.org/release/Digest-SHA>
> +  (Debian package: libdigest-sha-perl)
>  - Mail::Box <https://metacpan.org/pod/Mail::Box>
>    (Debian package: libmail-box-perl)
> -- Mail::Internet <https://metacpan.org/pod/Mail::Internet>
> -  (Debian package: libmailtools-perl)
>  - String::ShellQuote <https://metacpan.org/pod/String::ShellQuote>
>    (Debian package: libstring-shellquote-perl)
>  - Term::ReadLine::Gnu <https://metacpan.org/pod/Term::ReadLine::Gnu>
> diff --git a/contrib/notmuch-mutt/notmuch-mutt b/contrib/notmuch-mutt/notmuch-mutt
> index 4969e4b..4d30b0b 100755
> --- a/contrib/notmuch-mutt/notmuch-mutt
> +++ b/contrib/notmuch-mutt/notmuch-mutt
> @@ -13,11 +13,11 @@ use warnings;
>  
>  use File::Path;
>  use Getopt::Long qw(:config no_getopt_compat);
> -use Mail::Internet;
>  use Mail::Box::Maildir;
>  use Pod::Usage;
>  use String::ShellQuote;
>  use Term::ReadLine;
> +use Digest::SHA;
>  
>  
>  my $xdg_cache_dir = "$ENV{HOME}/.cache";
> @@ -75,10 +75,23 @@ sub prompt($$) {
>  }
>  
>  sub get_message_id() {
> -    my $mail = Mail::Internet->new(\*STDIN);
> -    my $mid = $mail->head->get("message-id") or return undef;
> -    $mid =~ /^<(.*)>$/;	# get message-id value
> -    return $1;
> +    my $mid = undef;
> +    my $sha = Digest::SHA->new(1);  # SHA1 hashing
> +
> +    while(<STDIN>) {  # scan message line by line, looking for mid
> +        if ($_ =~ /^Message-ID:\s*<(.*)>$/i) {

This doesn't take into account header folding or end of headers. There's
probably a bunch of other things to consider in the relevant RFCs. I
think you really should use Mail::Internet for the common case, and only
fallback to generating the message-id when that fails.

BR,
Jani.

> +	    $mid = $1;
> +	    last;  # message-id found, abort scan
> +        }
> +	$sha->add($_);  # update hash
> +    }
> +
> +    # Generate message-id from hash if none was found, in the same way
> +    # that notmuch would do.
> +    # See: http://git.notmuchmail.org/git/notmuch/blob/HEAD:/lib/sha1.c
> +    $mid ||= "notmuch-sha1-".$sha->hexdigest;
> +
> +    return $mid;
>  }
>  
>  sub search_action($$$@) {
> -- 
> 2.1.4
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] notmuch-mutt: support for messages that lack Message-ID headers
  2015-01-24 14:59   ` Jani Nikula
@ 2015-01-24 15:21     ` Stefano Zacchiroli
  2015-01-24 16:56       ` Jan N. Klug
  0 siblings, 1 reply; 8+ messages in thread
From: Stefano Zacchiroli @ 2015-01-24 15:21 UTC (permalink / raw)
  To: notmuch; +Cc: Jan N. Klug

First of all thanks for your feedback, Jani!
(both here, and the other day on IRC, of course)

On Sat, Jan 24, 2015 at 04:59:16PM +0200, Jani Nikula wrote:
> On Sat, 24 Jan 2015, Stefano Zacchiroli <zack@upsilon.cc> wrote:
> >
> > For those messages, compute a synthetic Message-ID based on the SHA1
> > of the whole message, in the same way that notmuch would do. See:
> > http://git.notmuchmail.org/git/notmuch/blob/HEAD:/lib/sha1.c
> 
> As I said on IRC, I think this is a notmuch implementation detail, and
> we don't make any promises about always generating missing message-ids
> the same way. That said, I don't see any reason why we'd change this
> anytime soon, so the solution is probably good enough for now.

ACK. I've noted down the code URL in the patch for reference, and also
as a warning to keep an eye on.

> > To do the above, rewrite get_message_id() to scan the current message
> > line by line, incrementally computing a SHA1. As a consequence, drop
> > the dependency on Mail::Internet.
> 
> I am not so sure this is a good idea however, see below.

So, fun part here is that I had initially made a similar comment to Jan.
Then got convinced to do as the current patch does to avoid loading full
messages (potentially with large attachments and the like) in memory.

But I didn't think of header folding, and you're absolutely correct in
saying that might be a problem.

I've just found out Mail::Header->new which I believe solves both
problems: avoid loading the full message into memory (it will only load
the headers) and proper parsing folded Message-IDs, without having to
use more complex regexp hackery.

Jan: do you agree with using Mail::Header->new and fall back to
line-by-line hasing only in case Message-ID is not found? If so, having
an updated patch based on the one I've posted here would be awesome! If
you cannot do that just let me know and I'll get to it, eventually :).

Cheers.
-- 
Stefano Zacchiroli  . . . . . . .  zack@upsilon.cc . . . . o . . . o . o
Maître de conférences . . . . . http://upsilon.cc/zack . . . o . . . o o
Former Debian Project Leader  . . @zack on identi.ca . . o o o . . . o .
« the first rule of tautology club is the first rule of tautology club »

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] notmuch-mutt: support for messages that lack Message-ID headers
  2015-01-24 15:21     ` Stefano Zacchiroli
@ 2015-01-24 16:56       ` Jan N. Klug
  2015-01-24 20:04         ` Jani Nikula
  0 siblings, 1 reply; 8+ messages in thread
From: Jan N. Klug @ 2015-01-24 16:56 UTC (permalink / raw)
  To: notmuch

On Sat, Jan 24, 2015 at 04:21:06PM +0100, Stefano Zacchiroli wrote:
> On Sat, Jan 24, 2015 at 04:59:16PM +0200, Jani Nikula wrote:
> > On Sat, 24 Jan 2015, Stefano Zacchiroli <zack@upsilon.cc> wrote:
> > > To do the above, rewrite get_message_id() to scan the current message
> > > line by line, incrementally computing a SHA1. As a consequence, drop
> > > the dependency on Mail::Internet.
> > 
> > I am not so sure this is a good idea however, see below.
> [...]
> But I didn't think of header folding, and you're absolutely correct in
> saying that might be a problem.

I disagree. I do not think that header folding is a problem here. RFC
5322 and RFC 2822 state that FWS in the message-id are not allowed (as
opposed to In-Reply: and other headers, where FWS may occur between, but
not inside message-ids).

Folded lines that start with "Message-Id:" might occur, the
regex-pattern used in the patch will not match those lines. 

As far as I have looked in the Mail::Internet sources, further checking of the
correctness of the message-id does not occur, so I do not see any
advantage over the proposed solution. 

> Jan: do you agree with using Mail::Header->new and fall back to
> line-by-line hasing only in case Message-ID is not found? If so, having
> an updated patch based on the one I've posted here would be awesome! If
> you cannot do that just let me know and I'll get to it, eventually :).

I can look into that, but I cannot promise to have it ready in the next
days.

Regards,

Jan
--
Jan N. Klug, Gelsenkirchen

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] notmuch-mutt: support for messages that lack Message-ID headers
  2015-01-24 16:56       ` Jan N. Klug
@ 2015-01-24 20:04         ` Jani Nikula
  0 siblings, 0 replies; 8+ messages in thread
From: Jani Nikula @ 2015-01-24 20:04 UTC (permalink / raw)
  To: Jan N. Klug, notmuch

On Sat, 24 Jan 2015, "Jan N. Klug" <jan.n.klug@rub.de> wrote:
> On Sat, Jan 24, 2015 at 04:21:06PM +0100, Stefano Zacchiroli wrote:
>> On Sat, Jan 24, 2015 at 04:59:16PM +0200, Jani Nikula wrote:
>> > On Sat, 24 Jan 2015, Stefano Zacchiroli <zack@upsilon.cc> wrote:
>> > > To do the above, rewrite get_message_id() to scan the current message
>> > > line by line, incrementally computing a SHA1. As a consequence, drop
>> > > the dependency on Mail::Internet.
>> > 
>> > I am not so sure this is a good idea however, see below.
>> [...]
>> But I didn't think of header folding, and you're absolutely correct in
>> saying that might be a problem.
>
> I disagree. I do not think that header folding is a problem here. RFC
> 5322 and RFC 2822 state that FWS in the message-id are not allowed (as
> opposed to In-Reply: and other headers, where FWS may occur between, but
> not inside message-ids).

Not *inside* message-ids, but as a data point, 0.05% of my email have
FWS *before* the message-id. That's more than the amount of messages
I have without a message-id.

BR,
Jani.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-01-24 20:04 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-24  9:11 [PATCH] notmuch-mutt: support for messages that lack Message-ID Stefano Zacchiroli
2015-01-24  9:11 ` [PATCH 1/3] notmuch-mutt README: use metacpn.org/* as deps homepages Stefano Zacchiroli
2015-01-24  9:11 ` [PATCH 2/3] notmuch-mutt: support for messages that lack Message-ID headers Stefano Zacchiroli
2015-01-24 14:59   ` Jani Nikula
2015-01-24 15:21     ` Stefano Zacchiroli
2015-01-24 16:56       ` Jan N. Klug
2015-01-24 20:04         ` Jani Nikula
2015-01-24  9:11 ` [PATCH 3/3] debian packaging: drop dep notmuch-mutt -> libmailtools-perl Stefano Zacchiroli

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).