From: Alexander Adolf <alexander.adolf@condition-alpha.com>
To: David Bremner <david@tethera.net>, Teemu Likonen <tlikonen@iki.fi>
Cc: Daniel Corbe <daniel@corbe.net>, notmuch@notmuchmail.org
Subject: Re: Fixed Message-ID trouble
Date: Tue, 26 Sep 2023 13:44:00 +0200
Message-ID: <aff94e0305c375fa30aebbcf5240bef7@condition-alpha.com>
In-Reply-To: <875y3xs0nm.fsf@tethera.net>
David Bremner <david@tethera.net> writes:
> Alexander Adolf <alexander.adolf@condition-alpha.com> writes:
>
>> Bearing in mind that re-recognising a message which has arrived
>> multiple times via different routes is a worthwhile feature, it would
>> seem to me that a hash over the invariant part of the message, that is
>> the body, would allow for such detection. In that light, it would seem
>> to me that the tuple (body_hash, message_id) could be a candidate for
>> a “unique enough”(tm) identifier?
>
> I always had the impression that the message body had too much variation
> imposed by different delivery routes for this to be very helpful:
> essentially the hash would be different for every file due to trailers
> added by mailing lists,
Ah, good point. I hadn't thought of mailing list trailers. Could these
perhaps be detected via the signature line separator "-- \n"?
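A rough sketch of that idea, in Python with only the standard library: cut the body at the first line that is exactly "-- " and hash what remains. The names `strip_trailer` and `body_hash` are made up for illustration, and the assumption that a list footer always follows such a separator line will certainly not hold for every list.

```python
import hashlib

SIG_SEPARATOR = "-- "  # dash, dash, space: the conventional signature separator


def strip_trailer(body: str) -> str:
    """Return the body up to (not including) the first separator line."""
    lines = body.splitlines()  # also copes with CRLF line endings
    cut = len(lines)
    for i, line in enumerate(lines):
        if line == SIG_SEPARATOR:
            cut = i
            break
    # Drop trailing blank lines so a footer preceded by an empty line
    # does not change the result.
    return "\n".join(lines[:cut]).rstrip()


def body_hash(body: str) -> str:
    """SHA-256 over the body with signature/trailer removed."""
    return hashlib.sha256(strip_trailer(body).encode("utf-8")).hexdigest()
```

Cutting at the first separator also drops the sender's own signature, which is harmless for comparison purposes. A list that appends its footer without a separator, or a transport that strips the trailing space, would defeat this.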
I guess this also touches on the question of what a consensus definition
of "sameness" could be. If we take the message-id only, it'd be a purely
technical one. If we included the content one way or another (for
instance via a hash over the body), that would rather be an editorial
definition of "sameness".
> re-encoding,
Like...? utf-8 to/from quoted-printable...?
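If that is the kind of re-encoding meant, one could hash the decoded text rather than the bytes on disk. The following is only a sketch under assumptions: it handles single-part messages, `normalized_body_hash` is a made-up name, and it relies on Python's standard `email` package to undo the Content-Transfer-Encoding and the charset.

```python
import hashlib
from email import message_from_bytes


def normalized_body_hash(raw: bytes) -> str:
    """Hash the body after undoing transfer encoding and charset."""
    msg = message_from_bytes(raw)
    # decode=True reverses quoted-printable or base64; it returns None
    # for multipart messages, which this sketch does not handle.
    payload = msg.get_payload(decode=True)
    if payload is None:
        raise ValueError("multipart messages are not handled in this sketch")
    charset = msg.get_content_charset() or "ascii"
    text = payload.decode(charset, errors="replace")
    # Re-encode to one fixed charset so that only the text matters.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

With this, the same text sent as 8bit UTF-8, as quoted-printable UTF-8, or as quoted-printable ISO-8859-1 should give the same digest. Changes to line wrapping or whitespace would still produce a different one.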
> stupid "external message" headers added by malicious^Wcorporate mail
> servers, etc...
Headers would not "muddy the waters" since they are headers. In my mind,
the hash would be over the body only.
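To illustrate what I have in mind, here is a minimal sketch of the (body_hash, message_id) tuple, assuming single-part plain-text messages. The function name `identity_key` is hypothetical, and nothing here reflects how notmuch identifies messages today.

```python
import hashlib
from email import message_from_string


def identity_key(raw: str) -> tuple[str, str]:
    """Return (body_hash, message_id); all other headers are ignored."""
    msg = message_from_string(raw)
    body = msg.get_payload()  # a str for single-part messages
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    message_id = msg.get("Message-ID", "").strip()
    return (digest, message_id)
```

Two copies that differ only in Received: or other added headers map to the same key, while a copy with an altered body keeps the message-id but gets a different hash.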
> I could be wrong, maybe hashing is a useful approach, but I'd need to
> see some numbers to be convinced.
I fully agree that we need to adapt to the realities of how things are
actually used, not how they were intended to be used.
How would I find instances of multiple files for the same message-id in
my database, for example?
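Failing a better answer, a brute-force approach that bypasses the notmuch database altogether might look like the sketch below. It walks a mail directory, reads only the headers of each file, and groups the paths by Message-ID. The function name `files_by_message_id` and the `mail_root` argument are assumptions for illustration; it treats every regular file under that directory as a message.

```python
import os
from collections import defaultdict
from email.parser import BytesHeaderParser


def files_by_message_id(mail_root):
    """Map each Message-ID seen in more than one file to its file paths."""
    parser = BytesHeaderParser()  # parses headers only, skips the body
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(mail_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                msg_id = parser.parse(fh).get("Message-ID")
            if msg_id:
                groups[str(msg_id).strip()].append(path)
    # Keep only the ids that occur in at least two files.
    return {mid: sorted(paths) for mid, paths in groups.items() if len(paths) > 1}
```

Feeding the file groups from this into a body hash would presumably yield the kind of numbers asked for above.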
Cheers,
--alexander