unofficial mirror of notmuch@notmuchmail.org
* Fixed Message-ID trouble
@ 2023-09-25  8:54 Teemu Likonen
  2023-09-25 10:52 ` Teemu Likonen
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Teemu Likonen @ 2023-09-25  8:54 UTC (permalink / raw)
  To: notmuch



Someone on the debian-user mailing list seems to be sending messages
with a fixed Message-ID field: the same ID in different messages. In
Notmuch this causes trouble because it merges unrelated threads into
one. The person has different messages in different threads, but Notmuch
thinks they are the same message because the Message-ID is the same.

This is potentially a "denial of service" for Notmuch. Well, not quite,
but it is harmful nonetheless. How would a Notmuch user fix the mess or
protect himself against it?

-- 
/// Teemu Likonen - .-.. https://www.iki.fi/tlikonen/
// OpenPGP: 6965F03973F0D4CA22B9410F0F2CAE0E07608462


* Re: Fixed Message-ID trouble
  2023-09-25  8:54 Fixed Message-ID trouble Teemu Likonen
@ 2023-09-25 10:52 ` Teemu Likonen
  2023-09-25 10:59   ` Michael J Gruber
  2023-09-25 11:33   ` Daniel Corbe
  2023-09-25 21:53 ` Gregor Zattler
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 18+ messages in thread
From: Teemu Likonen @ 2023-09-25 10:52 UTC (permalink / raw)
  To: notmuch



* 2023-09-25 11:54:07+0300, Teemu Likonen wrote:

> Someone on the debian-user mailing list seems to be sending messages
> with a fixed Message-ID field: the same ID in different messages. In
> Notmuch this causes trouble because it merges unrelated threads into
> one. The person has different messages in different threads, but Notmuch
> thinks they are the same message because the Message-ID is the same.
>
> This is potentially a "denial of service" for Notmuch. Well, not quite,
> but it is harmful nonetheless. How would a Notmuch user fix the mess or
> protect himself against it?

I am no longer sure whether this issue is caused by a fixed "Message-ID"
or by wrong "References" or "In-Reply-To" values. Either way, someone has
created a real mess, because Notmuch combines originally separate threads
now and forever.

-- 
/// Teemu Likonen - .-.. https://www.iki.fi/tlikonen/
// OpenPGP: 6965F03973F0D4CA22B9410F0F2CAE0E07608462


* Re: Fixed Message-ID trouble
  2023-09-25 10:52 ` Teemu Likonen
@ 2023-09-25 10:59   ` Michael J Gruber
  2023-09-25 11:33   ` Daniel Corbe
  1 sibling, 0 replies; 18+ messages in thread
From: Michael J Gruber @ 2023-09-25 10:59 UTC (permalink / raw)
  To: Teemu Likonen; +Cc: notmuch

On Mon, 25 Sep 2023 at 12:53, Teemu Likonen <tlikonen@iki.fi> wrote:
>
> * 2023-09-25 11:54:07+0300, Teemu Likonen wrote:
>
> > Someone on the debian-user mailing list seems to be sending messages
> > with a fixed Message-ID field: the same ID in different messages. In
> > Notmuch this causes trouble because it merges unrelated threads into
> > one. The person has different messages in different threads, but Notmuch
> > thinks they are the same message because the Message-ID is the same.
> >
> > This is potentially a "denial of service" for Notmuch. Well, not quite,
> > but it is harmful nonetheless. How would a Notmuch user fix the mess or
> > protect himself against it?
>
> I am no longer sure whether this issue is caused by a fixed "Message-ID"
> or by wrong "References" or "In-Reply-To" values. Either way, someone has
> created a real mess, because Notmuch combines originally separate threads
> now and forever.

Yes, several sources of different badness ...

Still, if I understand correctly, a new message with a pre-existing
mid ends up being registered by notmuch as a second file for the
"same" message irrespective of differences in the actual files. For
message copies which you receive via different paths (say directly
plus via an ml) this may or may not be what you want. Used
intentionally, it may create harm - how do other mailers handle this?
Show them in parallel in the same thread (but as individual messages)?

Michael


* Re: Fixed Message-ID trouble
  2023-09-25 10:52 ` Teemu Likonen
  2023-09-25 10:59   ` Michael J Gruber
@ 2023-09-25 11:33   ` Daniel Corbe
  2023-09-25 12:00     ` Teemu Likonen
  1 sibling, 1 reply; 18+ messages in thread
From: Daniel Corbe @ 2023-09-25 11:33 UTC (permalink / raw)
  To: Teemu Likonen; +Cc: notmuch




> On Sep 25, 2023, at 06:52, Teemu Likonen <tlikonen@iki.fi> wrote:
> 
>> Someone on the debian-user mailing list seems to be sending messages
>> with a fixed Message-ID field: the same ID in different messages. In
>> Notmuch this causes trouble because it merges unrelated threads into
>> one. The person has different messages in different threads, but Notmuch
>> thinks they are the same message because the Message-ID is the same.
>>
>> This is potentially a "denial of service" for Notmuch. Well, not quite,
>> but it is harmful nonetheless. How would a Notmuch user fix the mess or
>> protect himself against it?
>
> I am no longer sure whether this issue is caused by a fixed "Message-ID"
> or by wrong "References" or "In-Reply-To" values. Either way, someone has
> created a real mess, because Notmuch combines originally separate threads
> now and forever.

Silly question, I know, but have you actually tried reaching out to the user?  No MUA that I’m aware of acts like this and it’s pretty clear from documentation and standards tracks that Message-ID is meant to be globally unique per message.

If the user is knowledgeable enough to have a boutique mail reader, they’re probably also knowledgeable enough to correct the defect too.


* Re: Fixed Message-ID trouble
  2023-09-25 11:33   ` Daniel Corbe
@ 2023-09-25 12:00     ` Teemu Likonen
  2023-09-25 13:10       ` Alexander Adolf
  2023-09-26 10:07       ` David Bremner
  0 siblings, 2 replies; 18+ messages in thread
From: Teemu Likonen @ 2023-09-25 12:00 UTC (permalink / raw)
  To: Daniel Corbe; +Cc: notmuch



* 2023-09-25 07:33:23-0400, Daniel Corbe wrote:

> Silly question, I know, but have you actually tried reaching out to
> the user?

Not silly, but I don't even know who the person is. All I see is the
mess, and everything else is my interpretation of the cause. Notmuch
Emacs tree mode shows messages' relations but they are not accurate if
references are messed up. It's difficult to dig into the Message-ID level of
relations.

Perhaps my wish is that there was an easy way to break threads: mark a
message as the origin of a new thread. Or perhaps I just use my custom
ignore mechanism to mark messed-up threads automatically as read and move
on.

-- 
/// Teemu Likonen - .-.. https://www.iki.fi/tlikonen/
// OpenPGP: 6965F03973F0D4CA22B9410F0F2CAE0E07608462


* Re: Fixed Message-ID trouble
  2023-09-25 12:00     ` Teemu Likonen
@ 2023-09-25 13:10       ` Alexander Adolf
  2023-09-26 10:13         ` David Bremner
  2023-09-26 10:07       ` David Bremner
  1 sibling, 1 reply; 18+ messages in thread
From: Alexander Adolf @ 2023-09-25 13:10 UTC (permalink / raw)
  To: Teemu Likonen; +Cc: Daniel Corbe, notmuch



Hello, 

This sounds like a nasty problem indeed. OTOH, “there’s nothing that couldn’t be” as my granny would have put it. 

Bearing in mind that re-recognising a message which has arrived multiple times via different routes is a worthwhile feature, it would seem to me that a hash over the invariant part of the message, that is the body, would allow for such detection. In that light, it would seem to me that the tuple (body_hash, message_id) could be a candidate for a “unique enough”(tm) identifier?
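A minimal sketch of the (body_hash, message_id) tuple proposed above, in Python. The function names `body_bytes` and `composite_key` are hypothetical and not part of notmuch; hashing only the first text/plain part with SHA-256 is an assumption made for brevity.

```python
import hashlib
from email import message_from_string
from email.message import Message
from typing import Tuple


def body_bytes(msg: Message) -> bytes:
    """Return the decoded payload of the first text/plain part, or b'' if none."""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            return part.get_payload(decode=True) or b""
    return b""


def composite_key(msg: Message) -> Tuple[str, str]:
    """Hypothetical identifier: (SHA-256 of the body, Message-ID header)."""
    digest = hashlib.sha256(body_bytes(msg)).hexdigest()
    message_id = str(msg.get("Message-ID") or "").strip()
    return digest, message_id


# Two different messages that share one Message-ID get different keys.
first = message_from_string("Message-ID: <x@example.org>\n\nfirst body\n")
second = message_from_string("Message-ID: <x@example.org>\n\nsecond body\n")
print(composite_key(first) != composite_key(second))
```

With such a key, a byte-identical copy received via another route maps to the same tuple, while any change to the body yields a new one.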

  --alex

-- 
www.condition-alpha.com / @c_alpha
Sent from my iPhone; apologies for brevity and autocorrect weirdness. 

> On 25. Sep 2023, at 14:00, Teemu Likonen <tlikonen@iki.fi> wrote:
> 
> * 2023-09-25 07:33:23-0400, Daniel Corbe wrote:
> 
>> Silly question, I know, but have you actually tried reaching out to
>> the user?
> 
> Not silly, but I don't even know who the person is. All I see is the
> mess, and everything else is my interpretation of the cause. Notmuch
> Emacs tree mode shows messages' relations but they are not accurate if
> references are messed up. It's difficult to dig into the Message-ID level of
> relations.
> 
> Perhaps my wish is that there was an easy way to break threads: mark a
> message as the origin of a new thread. Or perhaps I just use my custom
> ignore mechanism to mark messed-up threads automatically as read and move
> on.
> 


* Re: Fixed Message-ID trouble
  2023-09-25  8:54 Fixed Message-ID trouble Teemu Likonen
  2023-09-25 10:52 ` Teemu Likonen
@ 2023-09-25 21:53 ` Gregor Zattler
  2023-09-25 23:00   ` Andy Smith
  2023-09-25 22:45 ` Daniel Kahn Gillmor
  2023-09-27 16:48 ` David Bremner
  3 siblings, 1 reply; 18+ messages in thread
From: Gregor Zattler @ 2023-09-25 21:53 UTC (permalink / raw)
  To: Teemu Likonen, notmuch

Hi Teemu, notmuch users,
* Teemu Likonen <tlikonen@iki.fi> [2023-09-25; 11:54 +03]:
> Someone on the debian-user mailing list seems to be sending messages
> with a fixed Message-ID field: the same ID in different messages. In
> Notmuch this causes trouble because it merges unrelated threads into
> one. The person has different messages in different threads, but Notmuch
> thinks they are the same message because the Message-ID is the same.
>
> This is potentially a "denial of service" for Notmuch. Well, not quite,
> but it is harmful nonetheless. How would a Notmuch user fix the mess or
> protect himself against it?

would you please give details of some such posts?  Then
other people are able to investigate.

Ciao; Gregor


* Re: Fixed Message-ID trouble
  2023-09-25  8:54 Fixed Message-ID trouble Teemu Likonen
  2023-09-25 10:52 ` Teemu Likonen
  2023-09-25 21:53 ` Gregor Zattler
@ 2023-09-25 22:45 ` Daniel Kahn Gillmor
  2023-09-27 16:48 ` David Bremner
  3 siblings, 0 replies; 18+ messages in thread
From: Daniel Kahn Gillmor @ 2023-09-25 22:45 UTC (permalink / raw)
  To: Teemu Likonen, notmuch



On Mon 2023-09-25 11:54:07 +0300, Teemu Likonen wrote:
> Someone on the debian-user mailing list seems to be sending messages
> with a fixed Message-ID field: the same ID in different messages. In
> Notmuch this causes trouble because it merges unrelated threads into
> one. The person has different messages in different threads, but Notmuch
> thinks they are the same message because the Message-ID is the same.
>
> This is potentially a "denial of service" for Notmuch. Well, not quite,
> but it is harmful nonetheless. How would a Notmuch user fix the mess or
> protect himself against it?

fwiw, the duplicate message-id attack vector is a long-recognized problem:

  https://nmbug.notmuchmail.org/nmweb/show/87k42vrqve.fsf%40pip.fifthhorseman.net

yikes, over a decade ago ☹

With recent versions of notmuch, if the problem is a message-id
collision, you can at least *see* the different variant forms of a given
message by cycling through the list of duplicates (e.g. via
notmuch-show-choose-duplicate in notmuch-emacs), thanks to excellent
work by David Bremner:

https://nmbug.notmuchmail.org/nmweb/show/20220701214548.461943-1-david%40tethera.net

As for thread splitting/re-joining based on References: and In-Reply-To:
headers, you might be interested in these oldies-but-goodies from the
mailing list archives, which as far as i know we have never managed to
resolve:

https://nmbug.notmuchmail.org/nmweb/show/AANLkTimDjk_-Xjpf6uovGXgyG_3j-ySLWQR%2B0UvdVjjT%40mail.gmail.com
https://nmbug.notmuchmail.org/nmweb/show/87mvp9uwi4.fsf%40alice.fifthhorseman.net

Sorry to only have archival references here and not robust/complete
fixes.

        --dkg


* Re: Fixed Message-ID trouble
  2023-09-25 21:53 ` Gregor Zattler
@ 2023-09-25 23:00   ` Andy Smith
  0 siblings, 0 replies; 18+ messages in thread
From: Andy Smith @ 2023-09-25 23:00 UTC (permalink / raw)
  To: notmuch

Hi,

On Mon, Sep 25, 2023 at 11:53:34PM +0200, Gregor Zattler wrote:
> Hi Teemu, notmuch users,
> * Teemu Likonen <tlikonen@iki.fi> [2023-09-25; 11:54 +03]:
> > Someone on the debian-user mailing list seems to be sending messages
> > with a fixed Message-ID field: the same ID in different messages.

[…]

> would you please give details of some such posts?  Then
> other people are able to investigate.

Here's an explainer for confused people on the debian-user list:

    https://lists.debian.org/debian-user/2023/09/msg00515.html

Here's an mbox of the five messages that dsr sent that have a
different message ID format to their other messages, and show two
duplicate IDs:

    https://strugglers.net/~andy/dsr.mbox

$ grep '^Message-ID' ~/public_html/dsr.mbox
Message-ID: <34dbc5be-529a-4f47-9a51-3b09040197e2@randomstring.org>
Message-ID: <34dbc5be-529a-4f47-9a51-3b09040197e2@randomstring.org>
Message-ID: <34dbc5be-529a-4f47-9a51-3b09040197e2@randomstring.org>
Message-ID: <3a6b048e-84db-4a47-969c-b62c826e0cdd@randomstring.org>
Message-ID: <3a6b048e-84db-4a47-969c-b62c826e0cdd@randomstring.org>

dsr is now aware of the problem and says they have fixed it.

Cheers,
Andy


* Re: Fixed Message-ID trouble
  2023-09-25 12:00     ` Teemu Likonen
  2023-09-25 13:10       ` Alexander Adolf
@ 2023-09-26 10:07       ` David Bremner
  2023-09-26 12:46         ` Teemu Likonen
  1 sibling, 1 reply; 18+ messages in thread
From: David Bremner @ 2023-09-26 10:07 UTC (permalink / raw)
  To: Teemu Likonen, Daniel Corbe; +Cc: notmuch

Teemu Likonen <tlikonen@iki.fi> writes:

> * 2023-09-25 07:33:23-0400, Daniel Corbe wrote:
>
>> Silly question, I know, but have you actually tried reaching out to
>> the user?
>
> Not silly, but I don't even know who the person is. All I see is the
> mess, and everything else is my interpretation of the cause. Notmuch
> Emacs tree mode shows messages' relations but they are not accurate if
> references are messed up. It's difficult to dig into the Message-ID level of
> relations.
>
> Perhaps my wish is that there was an easy way to break threads: mark a
> message as the origin of a new thread. Or perhaps I just use my custom
> ignore mechanism to mark messed-up threads automatically as read and move
> on.

How about if you delete the Message-ID, References, and In-Reply-To
headers from the bad messages and re-index? Notmuch will synthesize a
unique Message-Id if there is none present.
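A minimal sketch of the header removal suggested above, using only Python's standard `email` package. The helper name `strip_threading_headers` is hypothetical; the sketch works on raw bytes and leaves writing the file back and re-indexing to the reader. Note that the standard generator may re-wrap long header lines.

```python
from email import message_from_bytes
from email.generator import BytesGenerator
from io import BytesIO

# Headers that tie a message to a (possibly poisoned) thread.
HEADERS_TO_DROP = ("Message-ID", "References", "In-Reply-To")


def strip_threading_headers(raw: bytes) -> bytes:
    """Return the message with its threading headers removed."""
    msg = message_from_bytes(raw)
    for name in HEADERS_TO_DROP:
        # Deletes every occurrence, case-insensitively; no error if absent.
        del msg[name]
    buf = BytesIO()
    # mangle_from_=False keeps body lines starting with "From " unchanged.
    BytesGenerator(buf, mangle_from_=False).flatten(msg)
    return buf.getvalue()


raw = (b"Message-ID: <dup@example.org>\n"
       b"References: <a@example.org>\n"
       b"Subject: hi\n\nbody\n")
print(strip_threading_headers(raw).decode())
```

It would be prudent to keep a backup of each original file before overwriting it with the stripped version.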


* Re: Fixed Message-ID trouble
  2023-09-25 13:10       ` Alexander Adolf
@ 2023-09-26 10:13         ` David Bremner
  2023-09-26 11:44           ` Alexander Adolf
  0 siblings, 1 reply; 18+ messages in thread
From: David Bremner @ 2023-09-26 10:13 UTC (permalink / raw)
  To: Alexander Adolf, Teemu Likonen; +Cc: Daniel Corbe, notmuch

Alexander Adolf <alexander.adolf@condition-alpha.com> writes:

>
> Bearing in mind that re-recognising a message which has arrived
> multiple times via different routes is a worthwhile feature, it would
> seem to me that a hash over the invariant part of the message, that is
> the body, would allow for such detection. In that light, it would seem
> to me that the tuple (body_hash, message_id) could be a candidate for
> a “unique enough”(tm) identifier?

I always had the impression that the message body had too much variation
imposed by different delivery routes for this to be very helpful:
essentially the hash would be different for every file due to trailers
added by mailing lists, re-encoding, stupid "external message" headers
added by malicious^Wcorporate mail servers, etc...

I could be wrong, maybe hashing is a useful approach, but I'd need to
see some numbers to be convinced.


* Re: Fixed Message-ID trouble
  2023-09-26 10:13         ` David Bremner
@ 2023-09-26 11:44           ` Alexander Adolf
  2023-09-26 12:15             ` Andreas Kähäri
  0 siblings, 1 reply; 18+ messages in thread
From: Alexander Adolf @ 2023-09-26 11:44 UTC (permalink / raw)
  To: David Bremner, Teemu Likonen; +Cc: Daniel Corbe, notmuch

David Bremner <david@tethera.net> writes:

> Alexander Adolf <alexander.adolf@condition-alpha.com> writes:
>
>> Bearing in mind that re-recognising a message which has arrived
>> multiple times via different routes is a worthwhile feature, it would
>> seem to me that a hash over the invariant part of the message, that is
>> the body, would allow for such detection. In that light, it would seem
>> to me that the tuple (body_hash, message_id) could be a candidate for
>> a “unique enough”(tm) identifier?
>
> I always had the impression that the message body had too much variation
> imposed by different delivery routes for this to be very helpful:
> essentially the hash would be different for every file due to trailers
> added by mailing lists,

Ah, good point. I hadn't thought of mailing list trailers. Could these
perhaps be detected via the signature line separator "-- \n"?
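A minimal sketch of that idea in Python: cut the body at the first signature separator line before hashing. The helper names are hypothetical, and the sketch assumes LF line endings and a separator that is exactly "-- " on its own line. It only helps when the trailer comes after the separator.

```python
import hashlib

SIG_SEPARATOR = "-- \n"


def strip_after_signature(body: str) -> str:
    """Drop the first signature separator line and everything after it."""
    lines = body.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line == SIG_SEPARATOR:
            return "".join(lines[:index])
    return body


def body_hash(body: str) -> str:
    """SHA-256 over the body with signature and trailing material removed."""
    return hashlib.sha256(strip_after_signature(body).encode("utf-8")).hexdigest()


direct = "Hello,\n\nsee patch.\n-- \nAlice\n"
via_list = direct + "_______\nlist footer\n"
print(body_hash(direct) == body_hash(via_list))
```

Text inserted above the separator (or a message without any separator) would still change the hash.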

I guess this also touches on the question of what a consensus definition
of "sameness" could be. If we take the message-id only, it'd be a purely
technical one. If we'd include the content one way or another (for
instance via hash over the body), that would rather be an editorial
definition of "sameness".

> re-encoding,

Like...? utf-8 to/from quoted-printable...?

> stupid "external message" headers added by malicious^Wcorporate mail
> servers, etc...

Headers would not "muddy the waters" since they are headers. In my mind,
the hash would be over the body only.

> I could be wrong, maybe hashing is a useful approach, but I'd need to
> see some numbers to be convinced.

I fully agree that we need to adapt to the realities of how things are
actually used, not how they were intended to be used.

How would I find instances of multiple files for the same message-id in
my database for example?
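One way to get such numbers without touching the notmuch database is to scan the mail store directly. The following is a minimal sketch in Python; `files_by_message_id` is a hypothetical helper, and it assumes one message per file (maildir-style) under a single root directory.

```python
from collections import defaultdict
from email.parser import BytesHeaderParser
from pathlib import Path


def files_by_message_id(mail_root):
    """Map each Message-ID that occurs in more than one file to those files."""
    groups = defaultdict(list)
    parser = BytesHeaderParser()  # parses headers only, skips the body
    for path in sorted(Path(mail_root).rglob("*")):
        # Skip directories and the index directory kept by notmuch.
        if not path.is_file() or ".notmuch" in path.parts:
            continue
        with path.open("rb") as handle:
            headers = parser.parse(handle)
        message_id = str(headers.get("Message-ID") or "").strip()
        if message_id:
            groups[message_id].append(path)
    return {mid: paths for mid, paths in groups.items() if len(paths) > 1}
```

Calling `files_by_message_id("/path/to/mail")` (the path is a placeholder) returns a dictionary whose keys are the shared Message-IDs, so each group of files can then be compared by body, size, or any other measure of "sameness".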


Cheers,

  --alexander


* Re: Fixed Message-ID trouble
  2023-09-26 11:44           ` Alexander Adolf
@ 2023-09-26 12:15             ` Andreas Kähäri
  2023-09-26 16:22               ` Alexander Adolf
  0 siblings, 1 reply; 18+ messages in thread
From: Andreas Kähäri @ 2023-09-26 12:15 UTC (permalink / raw)
  To: Alexander Adolf; +Cc: Daniel Corbe, notmuch

On Tue, Sep 26, 2023 at 01:44:00PM +0200, Alexander Adolf wrote:
> David Bremner <david@tethera.net> writes:
> 
> > Alexander Adolf <alexander.adolf@condition-alpha.com> writes:
> >
> >> Bearing in mind that re-recognising a message which has arrived
> >> multiple times via different routes is a worthwhile feature, it would
> >> seem to me that a hash over the invariant part of the message, that is
> >> the body, would allow for such detection. In that light, it would seem
> >> to me that the tuple (body_hash, message_id) could be a candidate for
> >> a “unique enough”(tm) identifier?
> >
> > I always had the impression that the message body had too much variation
> > imposed by different delivery routes for this to be very helpful:
> > essentially the hash would be different for every file due to trailers
> > added by mailing lists,
> 
> Ah, good point. I hadn't thought of mailing list trailers. Could these
> perhaps be detected via the signature line separator "-- \n"?
> 
> I guess this also touches on the question of what a consensus definition
> of "sameness" could be. If we take the message-id only, it'd be a purely
> technical one. If we'd include the content one way or another (for
> instance via hash over the body), that would rather be an editorial
> definition of "sameness".
> 
> > re-encoding,
> 
> Like...? utf-8 to/from quoted-printable...?
> 
> > stupid "external message" headers added by malicious^Wcorporate mail
> > servers, etc...
> 
> Headers would not "muddy the waters" since they are headers. In my mind,
> the hash would be over the body only.

Hi, I'm not really part of the discussion, but I can add a quick thought
and a suggestion.

There are corporate mail servers that add a boilerplate "header" to the
body of outgoing email messages.  The more common practice is to add a
"footer" to the message.  I have seen these footers being added both
before and after the user's signature.  You can not use a hash that
contains the body of the message to identify the message as unique.

Using the earliest Received header (the one furthest down) as a unique
identifier would possibly be a better approach.  Since this likely
contains the identity of the originating mail server, some mail queue
ID, and a timestamp, it should be unique enough to identify the message,
even if the message is received via multiple routes and has a non-unique
Message ID.
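A minimal sketch of picking out that header with Python's standard `email` package. The helper name `earliest_received` is hypothetical, and the sketch relies on the convention that each relay prepends its Received header, so the last one in the file is the earliest. It makes no claim that the value is trustworthy.

```python
from email import message_from_string


def earliest_received(msg):
    """Return the bottom-most Received header, whitespace-normalized, or None."""
    received = msg.get_all("Received") or []
    if not received:
        return None
    # Folded headers contain newlines and indentation; collapse them.
    return " ".join(str(received[-1]).split())


sample = message_from_string(
    "Received: from b.example by c.example; Tue, 26 Sep 2023 12:00:02 +0000\n"
    "Received: from a.example by b.example;\n"
    "\tTue, 26 Sep 2023 12:00:01 +0000\n"
    "Message-ID: <dup@example.org>\n"
    "\n"
    "body\n"
)
print(earliest_received(sample))
```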

> > I could be wrong, maybe hashing is a useful approach, but I'd need to
> > see some numbers to be convinced.
> 
> I fully agree that we need to adapt to the realities of how things are
> actually used, not how they were intended to be used.
> 
> How would I find instances of multiple files for the same message-id in
> my database for example?
> 
> 
> Cheers,
> 
>   --alexander

-- 
Andreas (Kusalananda) Kähäri
Uppsala, Sweden


* Re: Fixed Message-ID trouble
  2023-09-26 10:07       ` David Bremner
@ 2023-09-26 12:46         ` Teemu Likonen
  2023-09-26 17:17           ` David Bremner
  0 siblings, 1 reply; 18+ messages in thread
From: Teemu Likonen @ 2023-09-26 12:46 UTC (permalink / raw)
  To: David Bremner, Daniel Corbe; +Cc: notmuch



* 2023-09-26 07:07:46-0300, David Bremner wrote:

> Teemu Likonen <tlikonen@iki.fi> writes:
>> Perhaps my wish is that there was an easy way to break threads: mark a
>> message as the origin of a new thread.

> How about if you delete the Message-ID, References, and In-Reply-To
> headers from the bad messages and re-index? Notmuch will synthesize a
> unique Message-Id if there is none present.

Will Notmuch also break the thread so that this edited message will
start a new thread? Maybe for the message itself, but its follow-ups need
to be fixed too. Often "References" points to several earlier messages in
the chain. So, detaching a subthread from a bigger thread would need manual
editing of more than one message:

 1. Edit one message and remove its "References" and "In-Reply-To".
    Possibly edit "Message-ID". This would be the origin of a new
    thread.

 2. Check all follow-ups to that message and make them refer to the new
    origin and its (possibly) new "Message-ID". Remove references that
    go beyond the origin.

 3. Reindex.
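
Step 2 above can be sketched in Python as a pure string operation on a follow-up's "References" value. The helper name `trim_references` and its parameters are hypothetical; a follow-up's "In-Reply-To" would need the same substitution if the origin's Message-ID changes.

```python
from typing import Optional


def trim_references(references: str, origin_id: str,
                    new_origin_id: Optional[str] = None) -> str:
    """Drop every reference that comes before the origin of the new thread.

    If new_origin_id is given, the origin's old ID is replaced by it.
    A References value that does not mention the origin is returned unchanged.
    """
    ids = references.split()
    if origin_id in ids:
        ids = ids[ids.index(origin_id):]
        if new_origin_id is not None:
            ids[0] = new_origin_id
    return " ".join(ids)


refs = "<root@example.org> <poison@example.org> <reply1@example.org>"
print(trim_references(refs, "<poison@example.org>", "<fresh@example.org>"))
```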

Or just forget the mess and move on with life. :-)

-- 
/// Teemu Likonen - .-.. https://www.iki.fi/tlikonen/
// OpenPGP: 6965F03973F0D4CA22B9410F0F2CAE0E07608462


* Re: Fixed Message-ID trouble
  2023-09-26 12:15             ` Andreas Kähäri
@ 2023-09-26 16:22               ` Alexander Adolf
  0 siblings, 0 replies; 18+ messages in thread
From: Alexander Adolf @ 2023-09-26 16:22 UTC (permalink / raw)
  To: Andreas Kähäri; +Cc: Daniel Corbe, notmuch

Andreas Kähäri <andreas.kahari@abc.se> writes:

> [...]
>> > stupid "external message" headers added by malicious^Wcorporate mail
>> > servers, etc...
>> 
>> Headers would not "muddy the waters" since they are headers. In my mind,
>> the hash would be over the body only.
>
> Hi, I'm not really part of the discussion, but I can add a quick thought
> and a suggestion.
>
> There are corporate mail servers that add a boilerplate "header" to the
> body of outgoing email messages.  The more common practice is to add a
> "footer" to the message.  I have seen these footers being added both
> before and after the user's signature.  You can not use a hash that
> contains the body of the message to identify the message as unique.

Thanks for pointing that out. You're right, of course; I have seen such
things myself, too.

It thus seems to me that the body hash idea is officially not working. I
rest my case.

> Using the earliest Received header (the one furthest down) as a unique
> identifier would possibly be a better approach.  Since this likely
> contains the identity of the originating mail server, some mail queue
> ID, and a timestamp, it should be unique enough to identify the message,
> even if the message is received via multiple routes and has a non-unique
> Message ID.
> [...]

I would strongly advise against using any "early" Received (or any
other) header for any heuristics. In spam traffic, most headers will all
but certainly be fake. The only one to trust is the very last Received
header added by your own (or your provider's) mail system.

Trying to control your code's behaviour based on maliciously crafted
data would hence mean intentionally exposing an attack surface. Parsing
these data for display to the user (as is the case now) is as far as I
would suggest going with that; but no further.


Cheers,

  --alexander


* Re: Fixed Message-ID trouble
  2023-09-26 12:46         ` Teemu Likonen
@ 2023-09-26 17:17           ` David Bremner
  0 siblings, 0 replies; 18+ messages in thread
From: David Bremner @ 2023-09-26 17:17 UTC (permalink / raw)
  To: Teemu Likonen, Daniel Corbe; +Cc: notmuch

Teemu Likonen <tlikonen@iki.fi> writes:

> Will Notmuch also break the thread so that this edited message will
> start a new thread? Maybe for the message itself, but its follow-ups need
> to be fixed too. Often "References" points to several earlier messages in
> the chain. So, detaching a subthread from a bigger thread would need manual
> editing of more than one message:

Yeah, once people start replying to the broken messages, it becomes more
complicated, as you point out.


* Re: Fixed Message-ID trouble
  2023-09-25  8:54 Fixed Message-ID trouble Teemu Likonen
                   ` (2 preceding siblings ...)
  2023-09-25 22:45 ` Daniel Kahn Gillmor
@ 2023-09-27 16:48 ` David Bremner
  2023-09-28  5:51   ` Teemu Likonen
  3 siblings, 1 reply; 18+ messages in thread
From: David Bremner @ 2023-09-27 16:48 UTC (permalink / raw)
  To: Teemu Likonen, notmuch

Teemu Likonen <tlikonen@iki.fi> writes:

> Someone on the debian-user mailing list seems to be sending messages
> with a fixed Message-ID field: the same ID in different messages. In
> Notmuch this causes trouble because it merges unrelated threads into
> one. The person has different messages in different threads, but Notmuch
> thinks they are the same message because the Message-ID is the same.
>
> This is potentially a "denial of service" for Notmuch. Well, not quite,
> but it is harmful nonetheless. How would a Notmuch user fix the mess or
> protect himself against it?

By the way, if using the emacs front-end did you try the unthreaded view
(U)? That would at least mitigate damage from people replying to the
poisoned messages.

I could imagine a future version of notmuch considering the
identification of files with the same message id as part of "threading",
and allowing an unthreaded view to just show all the files, effectively
ignoring the message-id. The next step would be to do that selectively
for some messages.  This all requires a complete redesign of the
database schema, so I don't know how realistic it is.

d


* Re: Fixed Message-ID trouble
  2023-09-27 16:48 ` David Bremner
@ 2023-09-28  5:51   ` Teemu Likonen
  0 siblings, 0 replies; 18+ messages in thread
From: Teemu Likonen @ 2023-09-28  5:51 UTC (permalink / raw)
  To: David Bremner, notmuch



* 2023-09-27 13:48:50-0300, David Bremner wrote:

> By the way, if using the emacs front-end did you try the unthreaded
> view (U)? That would at least mitigate damage from people replying to
> the poisoned messages.

I didn't. So thanks for the reminder about the unthreaded view. It is a
nice fallback mode when threading is broken or complicated. A plain list
of timestamp-sorted messages helps in this particular case because the
originally different threads (which are now the same thread) appeared at
different times.

-- 
/// Teemu Likonen - .-.. https://www.iki.fi/tlikonen/
// OpenPGP: 6965F03973F0D4CA22B9410F0F2CAE0E07608462
