From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 5/4] msgtime: avoid obviously out-of-range dates (for now)
Date: Sun, 1 Dec 2019 22:04:25 +0000
Message-ID: <20191201220425.GA30161@dcvr>
In-Reply-To: <20191129122508.7708-5-e@80x24.org>
Wacky dates show up in lore for valid messages. Let's ignore
them and let future generations deal with Y10K and time-travel
problems.
---
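A rough standalone sketch of the two checks this patch adds, for
readers who want to try them outside the tree. It is not the code in
PublicInbox::MsgTime: sketch_epoch and %MON are hypothetical names, the
regex is a simplified version of the one in str2date_zone, and the
timezone offset is ignored entirely. It assumes a perl with 64-bit
time_t so that timegm can return negative values.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical month lookup, only for this sketch
my %MON = (jan => 0, feb => 1, mar => 2, apr => 3, may => 4, jun => 5,
	jul => 6, aug => 7, sep => 8, oct => 9, nov => 10, dec => 11);

# Hypothetical helper: returns seconds since the epoch (timezone
# ignored), or undef if the date is unparseable or out-of-range.
sub sketch_epoch {
	my ($date) = @_;
	# the year is capped at 4 digits, so "71685" cannot match
	$date =~ /([0-9]+)\s+([A-Za-z]+)\s+([0-9]{2,4})\s+
		([0-9]+):([0-9]{2})(?::([0-9]{2}))?/x or return;
	my ($dd, $mon, $yyyy, $hh, $mm, $ss) =
		($1, $MON{lc $2}, $3, $4, $5, $6);
	return unless defined $mon;
	my $ts = timegm($ss // 0, $mm, $hh, $dd, $mon, $yyyy);
	return if $ts < 0; # anything before 1970 is rejected
	$ts;
}

for my $d ('Mon, 22 Jan 2007 13:16:24 -0500',
		'Fri, 1 Jan 1904 10:12:31 +0100',
		'Fri, 9 Mar 71685 18:45:56 +0000') {
	printf "%-34s => %s\n", $d, sketch_epoch($d) // 'undef';
}
```

In this sketch only the 2007 date yields a timestamp; the other two
return undef, which is the point where the real code would fall back to
another header such as Received:.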
lib/PublicInbox/MsgTime.pm | 6 +++++-
t/msgtime.t | 14 ++++++++++++--
2 files changed, 17 insertions(+), 3 deletions(-)
diff --git a/lib/PublicInbox/MsgTime.pm b/lib/PublicInbox/MsgTime.pm
index 479aaa4ecf132..9f4326442dd11 100644
--- a/lib/PublicInbox/MsgTime.pm
+++ b/lib/PublicInbox/MsgTime.pm
@@ -38,7 +38,7 @@ sub str2date_zone ($) {
if ($date =~ /(?:[A-Za-z]+,?\s+)? # day-of-week
([0-9]+),?\s+ # dd
([A-Za-z]+)\s+ # mon
- ([0-9]{2,})\s+ # YYYY or YY (or YYY :P)
+ ([0-9]{2,4})\s+ # YYYY or YY (or YYY :P)
([0-9]+)[:\.] # HH:
((?:[0-9]{2})|(?:\s?[0-9])) # MM
(?:[:\.]((?:[0-9]{2})|(?:\s?[0-9])))? # :SS
@@ -67,6 +67,10 @@ sub str2date_zone ($) {
$ts = timegm($ss // 0, $mm, $hh, $dd, $mon, $yyyy);
+ # 4-digit dates in non-spam from 1900s and 1910s exist in
+ # lore archives
+ return if $ts < 0;
+
# Compute the time offset from [+-]HHMM
$tz //= 0;
my ($tz_hh, $tz_mm);
diff --git a/t/msgtime.t b/t/msgtime.t
index 1452dc97d5b0b..cecad775769e1 100644
--- a/t/msgtime.t
+++ b/t/msgtime.t
@@ -5,7 +5,7 @@ use warnings;
use Test::More;
use PublicInbox::MIME;
use PublicInbox::MsgTime;
-
+our $received_date = 'Mon, 22 Jan 2007 13:16:24 -0500';
sub datestamp ($) {
my ($date) = @_;
local $SIG{__WARN__} = sub {}; # Suppress warnings
@@ -17,7 +17,11 @@ sub datestamp ($) {
Subject => 'this is a subject',
'Message-ID' => '<a@example.com>',
Date => $date,
- 'Received' => '(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S932173AbXAVSQY (ORCPT <rfc822;w@1wt.eu>);\n\tMon, 22 Jan 2007 13:16:24 -0500',
+ 'Received' => <<EOF,
+(majordomo\@vger.kernel.org) by vger.kernel.org via listexpand
+\tid S932173AbXAVSQY (ORCPT <rfc822;w\@1wt.eu>);
+\t$received_date
+EOF
],
body => "hello world\n",
);
@@ -104,4 +108,10 @@ for (qw(UT GMT Z)) {
}
is_datestamp('Fri, 02 Oct 1993 00:00:00 EDT', [ 749534400, '-0400']);
+# fallback to Received: header if Date: is out-of-range:
+is_datestamp('Fri, 1 Jan 1904 10:12:31 +0100',
+ PublicInbox::MsgTime::str2date_zone($received_date));
+is_datestamp('Fri, 9 Mar 71685 18:45:56 +0000', # Y10K is not my problem :P
+ PublicInbox::MsgTime::str2date_zone($received_date));
+
done_testing();