From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mark Walters
To: Gregor Zattler, notmuch
Subject: Re: Bug?: notmuch-search-show-thread shows several threads;
 only one containing matching messages
In-Reply-To: <20120130223416.GA26239@shi.workgroup>
References: <20120126004024.GA13704@shi.workgroup>
 <20120126011903.GA1176@mit.edu>
 <8762fzry7k.fsf@servo.finestructure.net>
 <20120126124450.GB30209@shi.workgroup>
 <87mx9avbc1.fsf@praet.org>
 <20120129234213.GB11460@shi.workgroup>
 <87zkd5655g.fsf@praet.org>
 <20120130190425.GB13521@shi.workgroup>
 <878vkoev95.fsf@qmul.ac.uk>
 <20120130223416.GA26239@shi.workgroup>
User-Agent: Notmuch/0.11+137~g98adc3d (http://notmuchmail.org) Emacs/23.2.1
 (i486-pc-linux-gnu)
Date: Tue, 31 Jan 2012 01:18:55 +0000
Message-ID: <874nvcekjk.fsf@qmul.ac.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
List-Id: "Use and development of the notmuch mail system."

On Mon, 30 Jan 2012 23:34:16 +0100, Gregor Zattler wrote:
> Hi Mark,
> * Mark Walters [30. Jan. 2012]:
> > On Mon, 30 Jan 2012 20:04:25 +0100, Gregor Zattler wrote:
> >> * Pieter Praet [30. Jan. 2012]:
> >>> On Mon, 30 Jan 2012 00:42:14 +0100, Gregor Zattler wrote:
> >>>> * Pieter Praet [26. Jan. 2012]:
> >>>>> Here's another couple of threads squashed into a single one:
> >>>>> - [O] [Use Question] Capture and long lines
> >>>>>   - id:"BANLkTikoF4tXuNLLufRzNSD6k2ZYs7sUcg@mail.gmail.com"
> >>>>> - [O] Worg update
> >>>>>   - id:"m1wrfiz3ch.fsf@tsdye.com"
> >>>>> - [O] Table formula to convert hex to dec
> >>>>>   - id:"20110724080054.GB16388@x201"
> >>>>> - [O] ICS import?
> >>>>>   - id:"20120125173421.GQ3747@x201"
> >>>>>
> >>>>>
> >>>>> AFAICT, none of them share Message-Id's...
> >>>>
> >>>> Do you consider this a bug?
> >>>>
> >>>
> >>> I do. No idea what causes it or how to fix it though... :)
> >>
> >> First I thougt it' not a severe bug since one see's more not less
> >> messages in notmuch show buffer. But later I realised one also
> >> sees less not more threads in notmuch search buffer and might not
> >> read certain notmuch threads because of "wrong" $Subject: in
> >> notmuch search buffer.
> >
> > I think notmuch links two messages into the same thread if they have an
> > in-reply-to or reference header in common: i.e the messages reference a
> > common parent message. (See comment in lib/database.cc "Even before a
> > message is added, it's pre-allocated thread ID is useful so that all
> > descendant messages that reference this common parent can be recognized
> > as belonging to the same thread.")
>
> So in case message a from thread A and message b from B would
> name the same Message c in their In-Reoply-To:/References:
> headers, while c is not (for some reason) in A or B, notmuch
> would assume both threads linked? Makes sense.
>
> > As far as I can see your grep tests haven't checked for that.
>
> True.
>
> > Also, could you email me the mbox you had (I think you said that it was
> > a mailing list so all public) and I will take a look?
>
> Sure, I do so off-list because of the size of the attachment.

Hi

I have looked at this and I think it is not notmuch's fault: I think
an MUA is doing strange things. One of the mails has an In-Reply-To
header which looks like

  In-reply-to: Message from Carsten Dominik <carsten.dominik@gmail.com>
     of "Tue, 15 Mar 2011 12:18:51 BST."
     <17242340-A14F-495A-B144-20C96D52B620@gmail.com>

and I think notmuch is taking carsten.dominik@gmail.com as a
message id. A similar In-Reply-To header appears in the other thread,
so notmuch pairs them up.

According to http://www.jwz.org/doc/threading.html this form of header
is not allowed under RFC2822 but was allowed under the earlier RFC822.

You can see several such messages on the GNU mailing list site, e.g.
ftp://lists.gnu.org/emacs-orgmode/2011-11 (search for
"in-reply-to: M"), but they all appear to be from the same person
(running mh-e 8.3 nmh under Emacs 24).

In my collection from the Linux kernel mailing list I get some
examples of In-Reply-To not just being "In-Reply-To: <message-id>",
but it was only about 200 out of 100,000 messages in the second half
of 2010 (the most recent archives I have).

Best wishes

Mark
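
P.S. To illustrate what I think is going on, here is a rough Python
sketch. It is not notmuch's actual code (that lives in
lib/database.cc and I have not traced the parser in detail): the
regex extractor, the function names referenced_ids and
group_by_shared_reference, and the second header (message "b", with
its date and message id) are all made up for the example. It only
assumes that every <...> token in In-Reply-To is treated as a message
id and that messages sharing a referenced id end up in one thread.

```python
import re
from collections import defaultdict

# Assumption: a naive extractor that treats every <...> token as a message id.
ANGLE_TOKEN = re.compile(r"<([^<>\s]+)>")


def referenced_ids(header_value):
    """Return every angle-bracketed token found in a header value."""
    return ANGLE_TOKEN.findall(header_value)


def group_by_shared_reference(messages):
    """Group message names that share at least one referenced id."""
    parent = {name: name for name in messages}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # referenced id -> first message that mentioned it
    for name, header in messages.items():
        for ref in referenced_ids(header):
            if ref in first_seen:
                parent[find(name)] = find(first_seen[ref])
            else:
                first_seen[ref] = name

    groups = defaultdict(set)
    for name in messages:
        groups[find(name)].add(name)
    return sorted(sorted(g) for g in groups.values())


# "a" mirrors the header quoted above; "b" and "c" are invented.
messages = {
    "a": ('Message from Carsten Dominik <carsten.dominik@gmail.com> '
          'of "Tue, 15 Mar 2011 12:18:51 BST." '
          '<17242340-A14F-495A-B144-20C96D52B620@gmail.com>'),
    "b": ('Message from Carsten Dominik <carsten.dominik@gmail.com> '
          'of "Mon, 01 Aug 2011 09:00:00 BST." '
          '<made-up-id@example.org>'),
    "c": "<unrelated-parent@example.org>",
}

print(referenced_ids(messages["a"]))
print(group_by_shared_reference(messages))
```

With these inputs "a" and "b" land in the same group purely because
the author's address shows up in both headers, which matches the
symptom Gregor and Pieter reported.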