hi notmuch folks--

i've been trying to wrap my head around how to get notmuch to support
verifying cryptographically-signed mail.  i'm afraid my current
understanding of the problem space is that it is neither pretty nor
clean.  Sorry for the length of this message.

Scope:
------

I'm focusing initially here only on verifying PGP/MIME cleartext
signatures.  I'm proposing to do the verification in the backend, and to
report on the validity of the signatures to the frontend through
"notmuch show --format=json" (ignoring the other output formats for now).

This mail is only trying to explain how the JSON format might
communicate this information from the backend to the frontend.
(Implementation will follow once the discussion settles, but i don't
want implementation questions to derail this first step.)


Proposal:
---------

No attempt to actually validate the signatures will be made unless the
new --verify flag is passed to "notmuch show".

A signed MIME part will contain a new element "signedby", which is a
list of part numbers identifying signatures that cover this part.

Signature parts (Content-Type: application/pgp-signature) will contain a
new element "signs", which lists the parts this signature covers.  Each
will also contain a "sigstatus" member, which is a list of objects, each
of which contains at least the following element:
 * "verified" -- one of the following values:
     "success" (the sig has been tested and is cryptographically valid)
     "failure" (the sig has been tested and does not match)
     "nokey"   (the sig could not be tested because pubkey is missing)
     "error"   (testing the sig failed for some other reason)
     "unknown" (testing was not tried)
 If "verified" is "success" in a "sigstatus" object, then the following
fields might also be present:
 * "signingkey" -- hexadecimal representation of the 160-bit fingerprint
                   of the signing key
 * "digest" -- the digest algorithm used to make the sig (e.g. "SHA1")
 * "timestamp" -- the time the signature claims to have been made
                  (let me know what format i should represent this in)
 * "pubkeyalgo" -- the signing key's asymmetric algorithm (e.g. "RSA")
 * "expires" -- if the signature has an expiration date, it goes here
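
To make the field list above concrete, here is a rough sketch of how a
frontend might sanity-check one "sigstatus" object.  The function name
check_sigstatus is made up for illustration, and treating the optional
fields as an error on anything other than "success" is my own strict
reading of the proposal, not something the proposal requires:

```python
# Values and field names are taken from the proposal text above.
VERIFIED_VALUES = {"success", "failure", "nokey", "error", "unknown"}
OPTIONAL_FIELDS = {"signingkey", "digest", "timestamp", "pubkeyalgo", "expires"}


def check_sigstatus(entry):
    """Return a list of problems found in one "sigstatus" object."""
    problems = []
    verified = entry.get("verified")
    if verified not in VERIFIED_VALUES:
        problems.append("bad or missing 'verified': %r" % (verified,))

    # Assumption: the extra fields only make sense alongside "success".
    extras = OPTIONAL_FIELDS & set(entry)
    if extras and verified != "success":
        problems.append("fields only expected on success: %s" % sorted(extras))

    # A 160-bit fingerprint is 40 hexadecimal digits.
    key = entry.get("signingkey")
    if key is not None:
        is_hex = all(c in "0123456789abcdefABCDEF" for c in key)
        if len(key) != 40 or not is_hex:
            problems.append("signingkey is not a 160-bit hex fingerprint")
    return problems
```

An empty list means the object matches the shape described above; it
says nothing about whether the signature itself should be trusted.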


Example:
--------

currently, the "body" element of a PGP/MIME signed message looks like
this with --format=json:

---------------------------
 "body": [
     {
         "content": "here is a test message i signed on 2010-11-11.\n\n
 --dkg\n\n",
         "content-type": "text/plain",
         "id": 1
     },
     {
         "content-type": "application/pgp-signature",
         "filename": "signature.asc",
         "id": 2
     }
 ],
---------------------------

It would end up like this (without the --verify flag):

---------------------------
 "body": [
     {
         "content": "here is a test message i signed on 2010-11-11.\n\n
 --dkg\n\n",
         "content-type": "text/plain",
         "id": 1,
         "signedby": [ 2 ]
     },
     {
         "content-type": "application/pgp-signature",
         "filename": "signature.asc",
         "id": 2,
         "signs": [ 1 ],
         "sigstatus": [ {
             "verified": "unknown"
         } ]
     }
 ],
---------------------------

and here it is with the --verify flag:

---------------------------
 "body": [
     {
         "content": "here is a test message i signed on 2010-11-11.\n\n
 --dkg\n\n",
         "content-type": "text/plain",
         "id": 1,
         "signedby": [ 2 ]
     },
     {
         "content-type": "application/pgp-signature",
         "filename": "signature.asc",
         "id": 2,
         "signs": [ 1 ],
         "sigstatus": [ {
             "verified": "success",
             "signingkey": "0EE5BE979282D80B9F7540F1CCD2ED94D21739E9",
             "digest": "SHA512",
             "timestamp": "2010-11-11 22:32:45 -0400",
             "pubkeyalgo": "RSA"
         } ]
     }
 ],
---------------------------
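
As a rough sketch of what a frontend could do with this output, the
following follows each part's "signedby" list to the matching signature
part and collects the "verified" values.  The function name summarize
is hypothetical, and the sample data is a trimmed copy of the --verify
example above (content and filename omitted):

```python
import json


def summarize(body):
    """Map each part id to the "verified" values of the signatures covering it."""
    parts = {part["id"]: part for part in body}
    summary = {}
    for part in body:
        results = []
        for sig_id in part.get("signedby", []):
            sig = parts.get(sig_id, {})
            # One signature part may carry several sigstatus objects.
            for status in sig.get("sigstatus", []):
                results.append(status.get("verified", "unknown"))
        summary[part["id"]] = results
    return summary


example = json.loads("""
[
  {"content-type": "text/plain", "id": 1, "signedby": [2]},
  {"content-type": "application/pgp-signature", "id": 2, "signs": [1],
   "sigstatus": [{"verified": "success",
                  "signingkey": "0EE5BE979282D80B9F7540F1CCD2ED94D21739E9",
                  "digest": "SHA512",
                  "pubkeyalgo": "RSA"}]}
]
""")

print(summarize(example))  # {1: ['success'], 2: []}
```

An empty list for a part just means no signature claims to cover it.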


Observations:
-------------

i'm not covering key->userid bindings in this first pass -- it's already
complicated enough to say "the following key did actually sign this
message part".  I'm still not sure whether the front-end or the backend
should be responsible for resolution of key->userid bindings, but i'm OK
punting on that question for the moment.

Multipart messages can have some parts signed and other parts not
signed: think of mailing lists which tack on a footer to each relayed
mail; the footer isn't signed, though the rest of the message is.

One MIME signature can cover more than one MIME part: Think of a signed
e-mail with an attachment. In this case, the signature is actually over
the aggregate, not the individual parts.  For example, a signed two-part
message that says:
 [ (A) "this is the budget for 2011", and (B) an attached spreadsheet ]
is *not* the same as either (A) or (B) signed independently.

A multipart MIME message can contain more than one distinct signature on
different parts:  Think of a digest of a mailing list discussion between
several participants who each sign their own messages.  Each signature
needs to be bound to the relevant parts (and vice versa); and some
signatures within a message can fail while others succeed.

A single application/pgp-signature part could contain signing material
from multiple signers.  Think of a PGP/MIME-signed key transition document.

MIME is actually a tree structure, and any subtree can be signed.  But
currently, "notmuch show" hides the tree structure and produces what
appears to be a linear set of parts.

Even more perversely, the tree structure means that a single MIME part
could potentially be signed by multiple signatures, each of which
potentially has independent origin and independent validity.

I've attached a moderately nasty e-mail message to this one
demonstrating a confluence of a bunch of these observations.

The structure of the attached e-mail looks like this:

A└┬╴multipart/signed 10936 bytes
B ├┬╴multipart/mixed 7403 bytes
C │├╴text/plain 77 bytes
D │├╴image/jpeg attachment [dkg.jpg] 4753 bytes
E │└┬╴message/rfc822 2072 bytes
F │ └┬╴multipart/signed 1914 bytes
G │  ├╴text/plain 57 bytes
H │  └╴application/pgp-signature attachment [signature.asc] 900 bytes
I └╴application/pgp-signature attachment [signature.asc] 900 bytes

"notmuch show" emits it as 5 parts (omitting A, B, E, and F):

 1: C
 2: D
 3: G
 4: H
 5: I

Note that while C and D are both signed by I, G is actually signed by
both H and I.  yuck.  And since this example message is attached to the
e-mail i'm writing right now (which itself will be signed) it can
certainly get even yuckier.
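
Here is a rough sketch of how the "signs"/"signedby" relationships for
this example could be derived from the tree.  It is not notmuch code:
the node() helper and the dict layout are made up for illustration, and
it assumes that part numbers are simply the leaves in depth-first order
and that a multipart/signed node has exactly two children (payload,
then signature):

```python
def node(label, ctype, children=()):
    """Hypothetical stand-in for a parsed MIME entity."""
    return {"label": label, "type": ctype, "children": list(children)}


# The structure of the attached message, labelled as in the diagram above.
TREE = node("A", "multipart/signed", [
    node("B", "multipart/mixed", [
        node("C", "text/plain"),
        node("D", "image/jpeg"),
        node("E", "message/rfc822", [
            node("F", "multipart/signed", [
                node("G", "text/plain"),
                node("H", "application/pgp-signature"),
            ]),
        ]),
    ]),
    node("I", "application/pgp-signature"),
])


def leaves(n):
    """Return the leaf parts under n in depth-first order."""
    if not n["children"]:
        return [n]
    out = []
    for child in n["children"]:
        out.extend(leaves(child))
    return out


def signature_map(root):
    """Return (signs, signedby), both keyed by flattened part number."""
    number = {leaf["label"]: i for i, leaf in enumerate(leaves(root), 1)}
    signs = {}
    signedby = {n: [] for n in number.values()}

    def walk(n):
        if n["type"] == "multipart/signed" and len(n["children"]) == 2:
            payload, sig = n["children"]
            sig_no = number[sig["label"]]
            # The signature covers every leaf inside the payload subtree.
            covered = [number[leaf["label"]] for leaf in leaves(payload)]
            signs[sig_no] = covered
            for part_no in covered:
                signedby[part_no].append(sig_no)
        for child in n["children"]:
            walk(child)

    walk(root)
    return signs, signedby


signs, signedby = signature_map(TREE)
print(signedby[3])  # [5, 4] -- G is covered by both I and H
```

Taken literally, this walk also reports part 4 (H) as covered by I,
since H sits inside the subtree that I signs.  Whether a frontend should
show that is a separate question.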



Questions:
----------

Am i missing any data or relationships you think we might want?

Is anything broken, unexpected, or dangerous about the choice of JSON
modifications?

I realize i've gone down a bit of a rabbit hole in the corner cases here
(driven mainly by my observations section).  Are there any simplifying
assumptions we can safely make about what kinds of messages are worth
verifying?  That is, are there ways to make this more intelligible that
don't throw away our ability to accurately represent the verified state
of some non-trivial subset of messages?

If this method (or something similar to it) gets put into the notmuch
backend, is this something we can actually represent to a human with a
reasonable frontend?

Would it make more sense to do deeper structural modifications of the
json output (e.g. return the full MIME tree instead of a list of parts)
than to go with the current proposal?

It would be nice to make this kind of reporting structure also work for
S/MIME and maybe other crypto-signature structures like DKIM.  Is that
doable within this framework?  Are there other tweaks we might want to
consider to cover that possibility?



If you actually read this far, you are a champion!  I look forward to
any feedback you have.

OK, off to bed!

	--dkg