unofficial mirror of notmuch@notmuchmail.org
* thread ordering based on references and/or in-reply-to
From: Florian Friesdorf @ 2011-10-31 23:07 UTC
  To: notmuch

Hi,

I'm looking into taking the References header into account for thread
ordering. So far only In-Reply-To is used. My C/C++ is rusty at best, so
I'd need some help to get this done.

Carl already tried to clear things up for me on IRC. After reading into
it, I have more questions:

lib/thread.cc/_resolve_thread_relationships adds messages as replies to
a parent.

Currently, we seem to treat In-Reply-To as empty or a single msgid. If I
understand RFC 822 correctly, it can be a list of msgids and/or phrases.
Should we support that?

References is a list of msgids, with the last one being the direct
parent. I don't know how multiple direct parents are handled here.
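
To make the "last one is the direct parent" rule concrete, here is a
minimal Python sketch. It is not notmuch code: `parse_msgids` and
`direct_parent` are hypothetical names, and the regular expression is a
simplification that only picks up tokens in angle brackets, which also
skips any free-text phrases in an In-Reply-To value.

```python
import re

# Simplified assumption: a msgid is anything between angle brackets
# that contains no whitespace or further angle brackets.
MSGID_RE = re.compile(r"<([^<>\s]+)>")


def parse_msgids(header_value):
    """Return the msgids found in a header value, in header order."""
    return MSGID_RE.findall(header_value or "")


def direct_parent(references_value):
    """Return the last msgid in References, or None if there is none."""
    msgids = parse_msgids(references_value)
    return msgids[-1] if msgids else None


refs = "<root@example.org> <mid@example.org>\n <last@example.org>"
print(parse_msgids(refs))
print(direct_parent(refs))
```

Keeping the msgids in a list rather than a hash table is what preserves
the order needed to find the last entry.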

DJB recommends "... readers look for identifiers in In-Reply-To and
append them to References if they are not already included in
References." [1]

In that case, if there are two msgids in In-Reply-To and they are
appended to the References list, then only the last one will be a parent
and the one that used to be the last is not a parent anymore.
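
The effect described above can be shown with a small Python sketch of
the DJB rule as quoted. `merged_references` is a hypothetical helper, not
part of notmuch, and the msgids are made up.

```python
import re

# Simplified assumption: a msgid is anything between angle brackets.
MSGID_RE = re.compile(r"<([^<>\s]+)>")


def merged_references(references_value, in_reply_to_value):
    """Append In-Reply-To msgids to References unless already listed."""
    refs = MSGID_RE.findall(references_value or "")
    for msgid in MSGID_RE.findall(in_reply_to_value or ""):
        if msgid not in refs:
            refs.append(msgid)
    return refs


merged = merged_references("<a@x> <b@x>", "<c@x> <d@x>")
print(merged)      # ['a@x', 'b@x', 'c@x', 'd@x']
print(merged[-1])  # d@x
```

Before merging, `b@x` was the last entry in References; afterwards only
`d@x` is in the "direct parent" position.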

And Carl recommends treating References and In-Reply-To as two separate
sources of information, first using In-Reply-To and then References in
order "to attach to the deepest referenced parent".

I fail to understand that. Am I complicating things?
How do we want to treat the combination of References/In-Reply-To?
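
One possible reading of the "two separate sources" suggestion, sketched
in Python. This is an assumption about what was meant, not notmuch's
actual behaviour: try In-Reply-To first, then fall back to References
walked from last to first, and attach to the first msgid that is
actually present in the thread. `choose_parent` and `known_ids` are
hypothetical names.

```python
import re

# Simplified assumption: a msgid is anything between angle brackets.
MSGID_RE = re.compile(r"<([^<>\s]+)>")


def choose_parent(in_reply_to_value, references_value, known_ids):
    """Pick a parent msgid from the messages we actually have, or None."""
    # Source 1: In-Reply-To, in the order given.
    for msgid in MSGID_RE.findall(in_reply_to_value or ""):
        if msgid in known_ids:
            return msgid
    # Source 2: References, nearest ancestor (last entry) first.
    for msgid in reversed(MSGID_RE.findall(references_value or "")):
        if msgid in known_ids:
            return msgid
    return None


known = {"a@x", "b@x"}
print(choose_parent("<missing@x>", "<a@x> <b@x> <gone@x>", known))
```

Here the In-Reply-To target is unknown and the last References entry is
missing too, so the reply attaches to `b@x`, the nearest ancestor that
exists.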

Do we have code that returns the last msgid listed in References?
database.cc/parse_references seems not to care about order, just
existence. Or is GHashTable ordered?

[1] http://cr.yp.to/immhf/thread.html


florian
-- 
Florian Friesdorf <flo@chaoflow.net>
  GPG FPR: 7A13 5EEE 1421 9FC2 108D  BAAF 38F8 99A3 0C45 F083
Jabber/XMPP: flo@chaoflow.net
IRC: chaoflow on freenode,ircnet,blafasel,OFTC


* Re: thread ordering based on references and/or in-reply-to
From: Austin Clements @ 2011-11-02 14:37 UTC
  To: Florian Friesdorf; +Cc: notmuch

I know this came up on IRC, but have you looked at jwz's threading
algorithm (http://www.jwz.org/doc/threading.html)?  Carl mentioned
that notmuch already implements it (except for subject matching), but
notmuch only implements the subset of it necessary to group messages
into threads without structure.  Much of the algorithm is devoted to
exactly this problem of piecing together the thread structure based on
all of the information in both In-Reply-To and References.  The
algorithm as described combines the issues of grouping and structuring
since it's expecting a giant pile of mail as input, but there's no
reason these can't be teased apart.
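
For readers who have not seen it, the structuring part of that algorithm
can be sketched roughly as follows. This is a simplified Python
illustration that loosely follows the first step of jwz's description
(no subject matching, no pruning of empty containers); it is not the
notmuch or mu implementation, and `Container`, `link`, and `build_tree`
are hypothetical names. As a simplification, the first In-Reply-To msgid
is appended to References only when it is not already listed.

```python
import re

# Simplified assumption: a msgid is anything between angle brackets.
MSGID_RE = re.compile(r"<([^<>\s]+)>")


class Container:
    def __init__(self, msgid):
        self.msgid = msgid
        self.message = None  # stays None for mail that is only referenced
        self.parent = None
        self.children = []


def is_ancestor(candidate, node):
    """True if candidate is node itself or one of node's ancestors."""
    while node is not None:
        if node is candidate:
            return True
        node = node.parent
    return False


def link(parent, child):
    """Make child a child of parent, unless that would create a loop."""
    if is_ancestor(child, parent):
        return
    if child.parent is not None:
        child.parent.children.remove(child)
    child.parent = parent
    parent.children.append(child)


def build_tree(messages):
    """messages: dicts with 'id' and optional 'references', 'in_reply_to'."""
    table = {}

    def container(msgid):
        if msgid not in table:
            table[msgid] = Container(msgid)
        return table[msgid]

    for msg in messages:
        this = container(msg["id"])
        this.message = msg
        refs = MSGID_RE.findall(msg.get("references") or "")
        irt = MSGID_RE.findall(msg.get("in_reply_to") or "")[:1]
        if irt and irt[0] not in refs:
            refs.append(irt[0])
        # Chain the referenced containers together in header order,
        # leaving links that already exist untouched.
        prev = None
        for ref in refs:
            cur = container(ref)
            if prev is not None and cur.parent is None:
                link(prev, cur)
            prev = cur
        # The last reference becomes the parent of this message.
        if prev is not None:
            link(prev, this)
    return table


msgs = [
    {"id": "c", "references": "<a> <b>"},
    {"id": "a"},
    {"id": "b", "in_reply_to": "<a>"},
]
tree = build_tree(msgs)
for msgid in ("a", "b", "c"):
    parent = tree[msgid].parent
    print(msgid, "->", parent.msgid if parent else None)
```

The input order is deliberately scrambled: the reply `c` arrives before
its ancestors, and the containers created from its References header
hold the structure until `a` and `b` show up.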

* Re: Re: thread ordering based on references and/or in-reply-to
From: Dirk-Jan C. Binnema @ 2011-11-04 20:36 UTC
  To: Florian Friesdorf, Austin Clements; +Cc: notmuch



On Wed 02 Nov 2011 04:37:05 PM EET, Austin Clements wrote:

 > I know this came up on IRC, but have you looked at jwz's threading
 > algorithm (http://www.jwz.org/doc/threading.html)?  Carl mentioned
 > that notmuch already implements it (except for subject matching), but
 > notmuch only implements the subset of it necessary to group messages
 > into threads without structure.  Much of the algorithm is devoted to
 > exactly this problem of piecing together the thread structure based on
 > all of the information in both In-Reply-To and References.  The
 > algorithm as described combines the issues of grouping and structuring
 > since it's expecting a giant pile of mail as input, but there's no
 > reason these can't be teased apart.

I've implemented it for mu; maybe some of it can be reused for notmuch.
See mu-threader.[ch] and mu-container.[ch] in

   http://gitorious.org/mu/mu/blobs/master/src/

(starting point is mu_threader_calculate).
   
I didn't implement subject matching yet, but it does build the hierarchy as
per JWZ and "References:".

Best wishes,
Dirk.

-- 
Dirk-Jan C. Binnema                  Helsinki, Finland
e:djcb@djcbsoftware.nl           w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C
