From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN: AS51167 193.164.131.0/24
X-Spam-Status: No, score=-1.8 required=3.0 tests=BAYES_00,RCVD_IN_MSPIKE_BL,
	RCVD_IN_MSPIKE_ZBI,RCVD_IN_XBL,RDNS_NONE,SPF_FAIL,SPF_HELO_FAIL
	shortcircuit=no autolearn=no autolearn_force=no version=3.4.0
Received: from 80x24.org (unknown [193.164.131.95])
	by dcvr.yhbt.net (Postfix) with ESMTP id D25DB20281
	for ; Mon, 2 Oct 2017 22:19:20 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH] threading: deal with improperly-terminated References headers
Date: Mon, 2 Oct 2017 22:19:16 +0000
Message-Id: <20171002221916.6849-1-e@80x24.org>
List-Id:

We should not blindly join References and In-Reply-To headers
as a single string, because some messages can have an open
angle brace '<' in References: without a corresponding '>'.
---
 lib/PublicInbox/SearchIdx.pm | 5 ++---
 lib/PublicInbox/View.pm      | 5 ++---
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index 0824db0..cfb9a08 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -414,9 +414,8 @@ sub link_message {
 
 	# last References should be IRT, but some mail clients do things
 	# out of order, so trust IRT over References iff IRT exists
-	my @refs = ($hdr->header_raw('References'),
-			$hdr->header_raw('In-Reply-To'));
-	@refs = ((join(' ', @refs)) =~ /<([^>]+)>/g);
+	my @refs = (($hdr->header_raw('References') || '') =~ /<([^>]+)>/g);
+	push(@refs, (($hdr->header_raw('In-Reply-To') || '') =~ /<([^>]+)>/g));
 
 	my $tid;
 	if (@refs) {
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 7454acb..b39c820 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -104,9 +104,8 @@ EOF
 sub in_reply_to {
 	my ($hdr) = @_;
 	my %mid = map { $_ => 1 } $hdr->header_raw('Message-ID');
-	my @refs = ($hdr->header_raw('References'),
-			$hdr->header_raw('In-Reply-To'));
-	@refs = ((join(' ', @refs)) =~ /<([^>]+)>/g);
+	my @refs = (($hdr->header_raw('References') || '') =~ /<([^>]+)>/g);
+	push(@refs, (($hdr->header_raw('In-Reply-To') || '') =~ /<([^>]+)>/g));
 	while (defined(my $irt = pop @refs)) {
 		next if $mid{"<$irt>"};
 		return $irt;
-- 
EW
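
For illustration only (not part of the patch): a minimal standalone sketch of
the failure mode described above. The header values and the variable names
$references, $in_reply_to, @joined and @separate are made up for this
example; only the regex and the join/push pattern are taken from the diff.

```perl
use strict;
use warnings;

# Hypothetical header values; References is missing its final '>'.
my $references  = '<a@example.com> <b@example.com';
my $in_reply_to = '<c@example.com>';

# Old approach: join both headers into one string, then extract.
# The unterminated '<b@example.com' swallows the start of In-Reply-To,
# because [^>]+ happily matches the space and the following '<'.
my @joined = ((join(' ', $references, $in_reply_to)) =~ /<([^>]+)>/g);

# New approach: extract from each header separately, so a broken
# References header cannot corrupt the In-Reply-To value.
my @separate = ($references =~ /<([^>]+)>/g);
push(@separate, ($in_reply_to =~ /<([^>]+)>/g));

print "joined:   [$_]\n" for @joined;
print "separate: [$_]\n" for @separate;
```

With these inputs, the joined variant yields "a@example.com" and the bogus
"b@example.com <c@example.com", while the separate variant yields
"a@example.com" and "c@example.com"; the unterminated ID is simply dropped.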