From: Aaron Ecay
To: Jani Nikula, notmuch@notmuchmail.org
Subject: Re: [RFC] [PATCH] lib/database.cc: change how the parent of a message is calculated
In-Reply-To: <871ubzt5gr.fsf@nikula.org>
References: <1361836225-17279-1-git-send-email-aaronecay@gmail.com> <87621cteeb.fsf@nikula.org> <871ubzt5gr.fsf@nikula.org>
Date: Sun, 03 Mar 2013 18:46:18 -0500
Message-ID: <87wqtovygl.fsf@gmail.com>
List-Id: "Use and development of the notmuch mail system."

Hi Jani,

Thanks to you and Austin for the comments.

On 1 March 2013, Jani Nikula wrote:
>> I think the background is that RFC 822 defines In-Reply-To (and
>> References too for that matter) as *(phrase / msg-id), while RFC 2822
>> defines them as 1*msg-id. I'd like something about RFC 822 being
>> mentioned in the commit message.
>>
>> The problem in the gmane message you link to in
>> id:87liaa3luc.fsf@gmail.com is likely related to the FAQ item 05.26
>> "How do I fix a bogus In-Reply-To or missing References field?" in
>> the MH FAQ http://www.newt.com/faq/mh.html.

Likely yes. But I think notmuch should handle these messages, since
they are seen in the wild (and I don’t think you disagree with me on
this point?)

>>
>> As the comment for the function says, we explicitly avoid including
>> self-references. I think I'd err on the safe side and return NULL if
>> the last ref equals message-id.

Done.
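As a rough illustration of the self-reference rule discussed above, here is a minimal standalone sketch. It is not the code from the patch: `extract_angle_ids` and `last_reference` are hypothetical names, the angle-bracket scan is a simplification of real header parsing, and an empty string stands in for the NULL return.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: collect every <...> token from a header value,
// ignoring any free-form text around the tokens (RFC 822 allowed
// phrases to be mixed with msg-ids).
std::vector<std::string>
extract_angle_ids (const std::string &header)
{
    std::vector<std::string> ids;
    std::string::size_type pos = 0;

    while ((pos = header.find ('<', pos)) != std::string::npos) {
        std::string::size_type end = header.find ('>', pos + 1);
        if (end == std::string::npos)
            break; // unterminated token: stop scanning
        ids.push_back (header.substr (pos + 1, end - pos - 1));
        pos = end + 1;
    }
    return ids;
}

// Hypothetical helper: return the last id found in the header, or an
// empty string (standing in for NULL) if there is none or if the last
// id is the message's own id, i.e. a self-reference.
std::string
last_reference (const std::string &header, const std::string &own_id)
{
    std::vector<std::string> ids = extract_angle_ids (header);

    if (ids.empty () || ids.back () == own_id)
        return "";
    return ids.back ();
}
```

With this sketch, a header whose final token is the message's own id yields no parent at all, rather than falling back to an earlier token.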
>>
>> I don't know how you got this non-change hunk here, but please remove
>> it. :)

That’s what I get for setting my editor to delete trailing whitespace
on save (then not reading outgoing patches carefully). Fixed.

>> I wonder if you should reuse your parse_references() change here, so
>> you'd set in_reply_to_message_id to the last message-id in
>> In-Reply-To. This might tackle some of the problematic cases
>> directly, but should still be all right per RFC 2822. I didn't verify
>> how the parser handles an RFC 2822 violating free form header though.
>
> Strike that based on http://www.jwz.org/doc/threading.html:
>
> "If there are multiple things in In-Reply-To that look like
> Message-IDs, only use the first one of them: odds are that the later
> ones are actually email addresses, not IDs."

Hmm. I think it’s a toss-up which of multiple quasi-message-ids is the
real one. In the email message example I linked upthread, it was the
last one that was real. I decided to use the last one, because it
allows the self-reference checking to be pushed entirely into
parse_references. If you feel strongly that we should use the first
one, I can change it back.

> I talked to Austin (CC) about the patch on IRC, and his comment was,
> perceptive as always:
>
> 23:38 amdragon Is the logic in that patch equivalent to always using
> the last message ID in references unless there is no references
> header? Seems like it is, but in a convoluted way.
>
> And that's actually the case, isn't it? To make the code reflect that,
> you should use last_ref_message_id, and if that's NULL, fallback to
> in_reply_to_message_id.

Yes. Fixed.

>
>> I suggest adding an else if branch (or revamp the above if condition)
>> to tackle the missing In-Reply-To header:
>>
>> else if (!in_reply_to_message_id && last_ref_message_id) {
>>     in_reply_to_message_id = last_ref_message_id;
>> }
>
> Strike that, it should be the other way round.
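
The rule Austin describes can be sketched as a tiny standalone function. This is an illustration under stated assumptions rather than the patch itself: `choose_parent` is a hypothetical name, and an empty string stands in for a NULL message-id.

```cpp
#include <string>

// Hypothetical helper: prefer the last message-id from References and
// fall back to the In-Reply-To id only when References gave nothing.
// An empty string means "header absent or unusable".
std::string
choose_parent (const std::string &last_ref_message_id,
               const std::string &in_reply_to_message_id)
{
    if (! last_ref_message_id.empty ())
        return last_ref_message_id;
    return in_reply_to_message_id;
}
```

If both inputs are empty the result is empty too, which a caller would presumably treat as "no known parent".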

Now that the self-reference check is in parse_references, the
conditional is much simpler.

One additional change I made in this version was to factor out 3 calls
to “notmuch_message_get_message_id (message)” into a variable inside the
_notmuch_database_link_message_to_parents function, for a small boost
to readability (and perhaps speed, depending on how clever the compiler
is I guess).

I also added tests – those are the first of two patches that will
follow this email, the second being the code to make them pass.

-- 
Aaron Ecay