unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
@ 2021-11-30 20:55 Bob Rogers
  2021-12-01  4:17 ` Lars Ingebrigtsen
  2021-12-04 18:58 ` Paul Eggert
  0 siblings, 2 replies; 40+ messages in thread
From: Bob Rogers @ 2021-11-30 20:55 UTC (permalink / raw)
  To: 52209

[-- Attachment #1: message body text --]
[-- Type: text/plain, Size: 1538 bytes --]

   In the emacs-28 branch at 0dd3883def:

   Imagine my surprise when evaluating

	(days-between "2021-10-22" "2020-09-29")

returned zero.  The root cause is that passing any date string without a
time to date-to-time produces the same return value:

	(date-to-time "2021-10-22") => (14445 17280)
	(date-to-time "2020-09-29") => (14445 17280)

But:

	(date-to-time "2020-09-29 23:15") => (24435 63540)

There are really two bugs here (or maybe three, depending on how you
look at it):

   1.  If parsing signals an error that is not an overflow, date-to-time
passes the date through timezone-make-date-arpa-standard to try to fix
some cases that parse-time-string can't handle.  But the condition-case
is also wrapped around the encode-time call, which signals a
wrong-type-argument error when it sees nil values for HH:MM, so the
fallback gets used for something other than a parsing error.

   2.  When timezone-make-date-arpa-standard gets something it can't
handle, it "canonicalizes" the value to "31 Dec 1999 19:00:00 -0500",
which is the source of the constant result.  That may be worth another
bug report, but I'm not sure of its charter; maybe that's correct
behavior in context.

   The attached patch adds decoded-time-set-defaults, moves that and the
encode-time call outside the condition-case, and disables the fallback
to timezone-make-date-arpa-standard if the date appears not to have a
time value.  And I can now tell you there are 388 days between
2020-09-29 and 2021-10-22.
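
   The idea of filling in defaults before encoding can be sketched as
follows.  This is only an illustration, not the attached patch: the
helper name sketch-date-only-to-time is made up, there is no fallback
or error handling, and it assumes Emacs 27 or later, where
parse-time-string leaves unknown fields nil.

```elisp
;; Sketch only; `sketch-date-only-to-time' is a hypothetical helper,
;; not part of time-date.el.
(require 'parse-time)
(require 'time-date)

(defun sketch-date-only-to-time (date)
  "Return a time value for DATE, defaulting a missing time to 00:00:00.
Illustration only; it has none of the fallbacks of `date-to-time'."
  (let ((parsed (parse-time-string date)))
    ;; For a pure date, the SECOND, MINUTE and HOUR slots are nil here,
    ;; which is what makes `encode-time' signal wrong-type-argument.
    ;; `decoded-time-set-defaults' fills the nil slots in place.
    (decoded-time-set-defaults parsed)
    (encode-time parsed)))

;; Day difference between two pure dates, computed the same way
;; `days-between' does it, but on top of the sketch above.
(- (time-to-days (sketch-date-only-to-time "2021-10-22"))
   (time-to-days (sketch-date-only-to-time "2020-09-29")))
```

The last form should agree with the 388 days mentioned above.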

					-- Bob Rogers
					   http://www.rgrjr.com/


[-- Attachment #2: Type: text/x-patch, Size: 1854 bytes --]

diff --git a/lisp/calendar/time-date.el b/lisp/calendar/time-date.el
index 155c34927f..6407138953 100644
--- a/lisp/calendar/time-date.el
+++ b/lisp/calendar/time-date.el
@@ -153,19 +153,25 @@ date-to-time
   "Parse a string DATE that represents a date-time and return a time value.
 DATE should be in one of the forms recognized by `parse-time-string'.
 If DATE lacks timezone information, GMT is assumed."
-  (condition-case err
-      (encode-time (parse-time-string date))
-    (error
-     (let ((overflow-error '(error "Specified time is not representable")))
-       (if (equal err overflow-error)
-	   (signal (car err) (cdr err))
-	 (condition-case err
-	     (encode-time (parse-time-string
-			   (timezone-make-date-arpa-standard date)))
-	   (error
-	    (if (equal err overflow-error)
-		(signal (car err) (cdr err))
-	      (error "Invalid date: %s" date)))))))))
+  ;; Pass the result of parsing through decoded-time-set-defaults
+  ;; because encode-time signals if HH:MM:SS are not filled in.
+  (encode-time
+    (decoded-time-set-defaults
+      (condition-case err
+          (parse-time-string date)
+        (error
+         (let ((overflow-error '(error "Specified time is not representable")))
+           (if (or (equal err overflow-error)
+                   ;; timezone-make-date-arpa-standard misbehaves if
+                   ;; not given at least HH:MM as part of the date.
+                   (not (string-match ":" date)))
+               (signal (car err) (cdr err))
+             (condition-case err
+                 (parse-time-string (timezone-make-date-arpa-standard date))
+               (error
+                (if (equal err overflow-error)
+                    (signal (car err) (cdr err))
+                  (error "Invalid date: %s" date)))))))))))
 
 ;;;###autoload
 (defalias 'time-to-seconds 'float-time)


* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-11-30 20:55 bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates Bob Rogers
@ 2021-12-01  4:17 ` Lars Ingebrigtsen
  2021-12-03  5:19   ` Katsumi Yamaoka
  2021-12-04 18:58 ` Paul Eggert
  1 sibling, 1 reply; 40+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-01  4:17 UTC (permalink / raw)
  To: Bob Rogers; +Cc: 52209

Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

>    In the emacs-28 branch at 0dd3883def:
>
>    Imagine my surprise when evaluating
>
> 	(days-between "2021-10-22" "2020-09-29")
>
> returned zero.

Thanks, applied to Emacs 29.

(These functions were never really intended to support parsing dates
like that -- only strict RFC822 date strings were originally supported,
but it's become more DWIM as time has passed.  Especially since it
wasn't explicitly stated anywhere that time-date.el was an RFC822
library.)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-01  4:17 ` Lars Ingebrigtsen
@ 2021-12-03  5:19   ` Katsumi Yamaoka
  2021-12-03 16:29     ` Lars Ingebrigtsen
  2021-12-03 18:38     ` Michael Heerdegen
  0 siblings, 2 replies; 40+ messages in thread
From: Katsumi Yamaoka @ 2021-12-03  5:19 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Bob Rogers, 52209

[-- Attachment #1: Type: text/plain, Size: 644 bytes --]

On Wed, 01 Dec 2021 05:17:30 +0100, Lars Ingebrigtsen wrote:
> Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:
>>    In the emacs-28 branch at 0dd3883def:
>>    Imagine my surprise when evaluating
>> 	(days-between "2021-10-22" "2020-09-29")
>> returned zero.

> Thanks, applied to Emacs 29.

This change caused another regression.  Please try:

(current-time-string (date-to-time "Fri, 03-Dec-2021 04:59:52 GMT"))

The function needs to check that `parse-time-string' returns valid
data, as the old version did with the help of `encode-time'.
A patch is below (the `(setq time ...)' is there only to silence
the byte compiler).

Thanks.


[-- Attachment #2: Type: text/x-patch, Size: 587 bytes --]

--- time-date.el~	2021-12-01 22:24:35.006052000 +0000
+++ time-date.el	2021-12-03 05:13:22.832443900 +0000
@@ -158,7 +158,10 @@
   (encode-time
     (decoded-time-set-defaults
       (condition-case err
-          (parse-time-string date)
+          (let ((time (parse-time-string date)))
+            (prog1 time
+              ;; Cause an error if data `parse-time-string' returns is invalid.
+              (setq time (encode-time time))))
         (error
          (let ((overflow-error '(error "Specified time is not representable")))
            (if (or (equal err overflow-error)


* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-03  5:19   ` Katsumi Yamaoka
@ 2021-12-03 16:29     ` Lars Ingebrigtsen
  2021-12-03 18:38     ` Michael Heerdegen
  1 sibling, 0 replies; 40+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-03 16:29 UTC (permalink / raw)
  To: Katsumi Yamaoka; +Cc: Bob Rogers, 52209

Katsumi Yamaoka <yamaoka@jpl.org> writes:

> The function needs to check that `parse-time-string' returns valid
> data, as the old version did with the help of `encode-time'.
> A patch is below (the `(setq time ...)' is there only to silence
> the byte compiler).

Thanks; applied to Emacs 29.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-03  5:19   ` Katsumi Yamaoka
  2021-12-03 16:29     ` Lars Ingebrigtsen
@ 2021-12-03 18:38     ` Michael Heerdegen
  1 sibling, 0 replies; 40+ messages in thread
From: Michael Heerdegen @ 2021-12-03 18:38 UTC (permalink / raw)
  To: Katsumi Yamaoka; +Cc: Bob Rogers, Lars Ingebrigtsen, 52209

Katsumi Yamaoka <yamaoka@jpl.org> writes:

> A patch is below (the `(setq time ...)' is there only to silence
> the byte compiler).

AFAIU, `ignore' is the solution we choose most of the time for such
cases; it makes it a bit clearer what happens.

Michael.






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-11-30 20:55 bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates Bob Rogers
  2021-12-01  4:17 ` Lars Ingebrigtsen
@ 2021-12-04 18:58 ` Paul Eggert
  2021-12-19 21:11   ` Bob Rogers
  1 sibling, 1 reply; 40+ messages in thread
From: Paul Eggert @ 2021-12-04 18:58 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Bob Rogers, Katsumi Yamaoka, 52209

[-- Attachment #1: Type: text/plain, Size: 1318 bytes --]

Unfortunately the latest change to time-date.el reintroduced Bug#52209. 
I installed the attached patch to fix this, and to add some test cases 
mentioned in this bug report to help prevent the problem recurring. 
Also, this patch documents the new feature, and avoids 
overenthusiastically guessing the year to be 1970 when the date string 
lacks a year.

> (These functions were never really intended to support parsing dates
> like that -- only strict RFC822 date strings were originally supported,
> but it's become more DWIM as time has passed.

Yes, date-to-time has definitely ... evolved.

My understanding is that date-to-time's RFC822 parsing is present only 
for backward compatibility, and that we shouldn't attempt to enhance it 
(here, the enhancement would be pointless as the RFC822 parsing fills in 
the blanks anyway). So the patch I just installed adds the new feature 
only for the normal path taken, when not doing the RFC822 hack.


PS. Internet RFC 822 has been obsolete since 2001, and the Emacs code 
should be talking about RFC 5322 everywhere except when Emacs is 
explicitly supporting the obsolete standard instead of the current 
standard. And we should rename functions like rfc822-goto-eoh to 
rfc-email-goto-eoh, to help avoid confusion or further function 
renaming. But I digress....

[-- Attachment #2: 0001-Fix-date-to-time-2021-12-04.patch --]
[-- Type: text/x-patch, Size: 5213 bytes --]

From cb0f4f00b328a561e49538bbf0f90050eac1ba20 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 4 Dec 2021 10:33:32 -0800
Subject: [PATCH] Fix (date-to-time "2021-12-04")

This should complete the fix for Bug#52209.
* lisp/calendar/time-date.el (date-to-time): Apply
decoded-time-set-defaults only to the output of (parse-time-string
date), and only when the output has a year (to avoid confusion
when dates lack years).  There is no point applying it after
timezone-make-date-arpa-standard since the latter fills in all the
blanks.  And the former code mistakenly called encode-time on an
already-encoded time.  This goes back to the code a couple of days
ago, except with changed behavior (to fix Bug#52209) only when
timezone-make-date-arpa-standard is not called.
* test/lisp/calendar/time-date-tests.el (test-date-to-time)
(test-days-between): New tests.
---
 doc/lispref/os.texi                   |  3 ++-
 etc/NEWS                              |  4 +++
 lisp/calendar/time-date.el            | 38 +++++++++++----------------
 test/lisp/calendar/time-date-tests.el |  7 +++++
 4 files changed, 29 insertions(+), 23 deletions(-)

diff --git a/doc/lispref/os.texi b/doc/lispref/os.texi
index e420644cd8..b4efc44b03 100644
--- a/doc/lispref/os.texi
+++ b/doc/lispref/os.texi
@@ -1724,7 +1724,8 @@ Time Parsing
 corresponding Lisp timestamp.  The argument @var{string} should represent
 a date-time, and should be in one of the forms recognized by
 @code{parse-time-string} (see below).  This function assumes Universal
-Time if @var{string} lacks explicit time zone information.
+Time if @var{string} lacks explicit time zone information,
+and assumes earliest values if @var{string} lacks month, day, or time.
 The operating system limits the range of time and zone values.
 @end defun
 
diff --git a/etc/NEWS b/etc/NEWS
index ac1787d7f8..2b4eaaf8a1 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1084,6 +1084,10 @@ cookies set by web pages on disk.
 ** New variable 'help-buffer-under-preparation'.
 This variable is bound to t during the preparation of a "*Help*" buffer.
 
++++
+** 'date-to-time' now assumes earliest values if its argument lacks
+month, day, or time.  For example, (date-to-time "2021-12-04") now
+assumes a time of 00:00 instead of signaling an error.
 \f
 * Changes in Emacs 29.1 on Non-Free Operating Systems
 
diff --git a/lisp/calendar/time-date.el b/lisp/calendar/time-date.el
index 8a6ee0f270..37a16d3b98 100644
--- a/lisp/calendar/time-date.el
+++ b/lisp/calendar/time-date.el
@@ -153,28 +153,22 @@ date-to-time
   "Parse a string DATE that represents a date-time and return a time value.
 DATE should be in one of the forms recognized by `parse-time-string'.
 If DATE lacks timezone information, GMT is assumed."
-  ;; Pass the result of parsing through decoded-time-set-defaults
-  ;; because encode-time signals if HH:MM:SS are not filled in.
-  (encode-time
-    (decoded-time-set-defaults
-      (condition-case err
-          (let ((time (parse-time-string date)))
-            (prog1 time
-              ;; Cause an error if data `parse-time-string' returns is invalid.
-              (setq time (encode-time time))))
-        (error
-         (let ((overflow-error '(error "Specified time is not representable")))
-           (if (or (equal err overflow-error)
-                   ;; timezone-make-date-arpa-standard misbehaves if
-                   ;; not given at least HH:MM as part of the date.
-                   (not (string-match ":" date)))
-               (signal (car err) (cdr err))
-             (condition-case err
-                 (parse-time-string (timezone-make-date-arpa-standard date))
-               (error
-                (if (equal err overflow-error)
-                    (signal (car err) (cdr err))
-                  (error "Invalid date: %s" date)))))))))))
+  (condition-case err
+      (let ((parsed (parse-time-string date)))
+	(when (decoded-time-year parsed)
+	  (decoded-time-set-defaults parsed))
+	(encode-time parsed))
+    (error
+     (let ((overflow-error '(error "Specified time is not representable")))
+       (if (equal err overflow-error)
+	   (signal (car err) (cdr err))
+	 (condition-case err
+	     (encode-time (parse-time-string
+			   (timezone-make-date-arpa-standard date)))
+	   (error
+	    (if (equal err overflow-error)
+		(signal (car err) (cdr err))
+	      (error "Invalid date: %s" date)))))))))
 
 ;;;###autoload
 (defalias 'time-to-seconds 'float-time)
diff --git a/test/lisp/calendar/time-date-tests.el b/test/lisp/calendar/time-date-tests.el
index 4568947c0b..d5269804ad 100644
--- a/test/lisp/calendar/time-date-tests.el
+++ b/test/lisp/calendar/time-date-tests.el
@@ -41,6 +41,13 @@ test-obsolete-encode-time-value
                    (encode-time-value 1 2 3 4 3))
                  '(1 2 3 4))))
 
+(ert-deftest test-date-to-time ()
+  (should (equal (format-time-string "%F %T" (date-to-time "2021-12-04"))
+                 "2021-12-04 00:00:00")))
+
+(ert-deftest test-days-between ()
+  (should (equal (days-between "2021-10-22" "2020-09-29") 388)))
+
 (ert-deftest test-leap-year ()
   (should-not (date-leap-year-p 1999))
   (should-not (date-leap-year-p 1900))
-- 
2.32.0



* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-04 18:58 ` Paul Eggert
@ 2021-12-19 21:11   ` Bob Rogers
  2021-12-20 10:08     ` Lars Ingebrigtsen
  0 siblings, 1 reply; 40+ messages in thread
From: Bob Rogers @ 2021-12-19 21:11 UTC (permalink / raw)
  To: Paul Eggert, Lars Ingebrigtsen; +Cc: Katsumi Yamaoka, 52209

[-- Attachment #1: message body text --]
[-- Type: text/plain, Size: 2357 bytes --]

   From: Paul Eggert <eggert@cs.ucla.edu>
   Date: Sat, 4 Dec 2021 10:58:37 -0800

   Unfortunately the latest change to time-date.el reintroduced
   Bug#52209.  I installed the attached patch to fix this . . .

I'm sure none of you will be surprised to learn that parse-time-string
still doesn't recognize single-digit months or days, with the same
fallback-to-the-epoch behavior that threw me for a loop originally.

    (format-time-string "%F %T %z" (date-to-time "2022-1-12"))
	=> "1999-12-31 19:00:00 -0500"

And adding a time makes it work again because it seems that
timezone-make-date-arpa-standard does accept single-digit months and
days.  Go figure.

   The attached patch extends parse-time-string by using regexps instead
of string manipulation of fixed-width fields.  This could possibly break
interface compatibility, especially if you expect anyone to customize
parse-time-rules.  So I will not be surprised if you decline to adopt
it.
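
   As a rough illustration of the regexp approach, here is a sketch of
matching a numeric date whose month and day may have one or two digits.
The helper name sketch-split-iso-date is hypothetical and is not part of
the patch; it anchors with \` and \' where the patch uses ^ and $, and
it does no range checking on the values.

```elisp
;; Sketch only; `sketch-split-iso-date' is a hypothetical helper.
(defun sketch-split-iso-date (token)
  "Return (YEAR MONTH DAY) if TOKEN looks like YYYY-M-D, else nil.
Month and day may each have one or two digits.  No range checks."
  (when (string-match
         "\\`\\([0-9][0-9][0-9][0-9]\\)-\\([0-9][0-9]?\\)-\\([0-9][0-9]?\\)\\'"
         token)
    ;; Each match group is a run of digits, so `string-to-number'
    ;; converts it directly.
    (list (string-to-number (match-string 1 token))
          (string-to-number (match-string 2 token))
          (string-to-number (match-string 3 token)))))

(sketch-split-iso-date "2022-1-12")
```

A token such as "2022/1/12" does not match, so the helper returns nil
for it rather than guessing.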

   > (These functions were never really intended to support parsing dates
   > like that -- only strict RFC822 date strings were originally supported,
   > but it's become more DWIM as time has passed.

   Yes, date-to-time has definitely ... evolved.

   My understanding is that date-to-time's RFC822 parsing is present
   only for backward compatibility, and that we shouldn't attempt to
   enhance it (here, the enhancement would be pointless as the RFC822
   parsing fills in the blanks anyway). So the patch I just installed
   adds the new feature only for the normal path taken, when not doing
   the RFC822 hack.

   PS. Internet RFC 822 has been obsolete since 2001, and the Emacs code
   should be talking about RFC 5322 everywhere except when Emacs is
   explicitly supporting the obsolete standard instead of the current
   standard. And we should rename functions like rfc822-goto-eoh to
   rfc-email-goto-eoh, to help avoid confusion or further function
   renaming. But I digress....

Since Emacs time functions have evolved well beyond email, I would argue
that even "rfc-email-" is too specific a prefix for them.  So if this
patch is not suitable, maybe it's (cough, cough) time for a new time and
date parsing API that supports a broader range of human-generated dates
and times, with better error handling and I18N support.  WDYT?

					-- Bob Rogers
					   http://www.rgrjr.com/


[-- Attachment #2: Type: text/x-patch, Size: 6215 bytes --]

diff --git a/lisp/calendar/parse-time.el b/lisp/calendar/parse-time.el
index 5a3d2706af..4812dcbd1b 100644
--- a/lisp/calendar/parse-time.el
+++ b/lisp/calendar/parse-time.el
@@ -102,45 +102,25 @@ parse-time-rules
     ((3) (1 31))
     ((4) parse-time-months)
     ((5) (100))
-    ((2 1 0)
-     ,(lambda () (and (stringp parse-time-elt)
-                      (= (length parse-time-elt) 8)
-                      (= (aref parse-time-elt 2) ?:)
-                      (= (aref parse-time-elt 5) ?:)))
-     [0 2] [3 5] [6 8])
     ((8 7) parse-time-zoneinfo
      ,(lambda () (car parse-time-val))
      ,(lambda () (cadr parse-time-val)))
     ((8)
+     "^[-+][0-9][0-9][0-9][0-9]$"
      ,(lambda ()
-        (and (stringp parse-time-elt)
-             (= 5 (length parse-time-elt))
-             (or (= (aref parse-time-elt 0) ?+)
-                 (= (aref parse-time-elt 0) ?-))))
-     ,(lambda () (* 60 (+ (cl-parse-integer parse-time-elt :start 3 :end 5)
-                          (* 60 (cl-parse-integer parse-time-elt :start 1 :end 3)))
-                    (if (= (aref parse-time-elt 0) ?-) -1 1))))
+        (* 60
+           (+ (cl-parse-integer parse-time-elt :start 3 :end 5)
+              (* 60 (cl-parse-integer parse-time-elt :start 1 :end 3)))
+           (if (= (aref parse-time-elt 0) ?-) -1 1))))
     ((5 4 3)
-     ,(lambda () (and (stringp parse-time-elt)
-                      (= (length parse-time-elt) 10)
-                      (= (aref parse-time-elt 4) ?-)
-                      (= (aref parse-time-elt 7) ?-)))
-     [0 4] [5 7] [8 10])
-    ((2 1 0)
-     ,(lambda () (and (stringp parse-time-elt)
-                      (= (length parse-time-elt) 5)
-                      (= (aref parse-time-elt 2) ?:)))
-     [0 2] [3 5] ,(lambda () 0))
+     "^\\([0-9][0-9][0-9][0-9]\\)-\\([0-9][0-9]?\\)-\\([0-9][0-9]?\\)$"
+     1 2 3)
     ((2 1 0)
-     ,(lambda () (and (stringp parse-time-elt)
-                      (= (length parse-time-elt) 4)
-                      (= (aref parse-time-elt 1) ?:)))
-     [0 1] [2 4] ,(lambda () 0))
+     "^\\([0-9][0-9]?\\):\\([0-9][0-9]\\)$"
+     1 2 ,(lambda () 0))
     ((2 1 0)
-     ,(lambda () (and (stringp parse-time-elt)
-                      (= (length parse-time-elt) 7)
-                      (= (aref parse-time-elt 1) ?:)))
-     [0 1] [2 4] [5 7])
+     "^\\([0-9][0-9]?\\):\\([0-9][0-9]\\):\\([0-9][0-9]\\)$"
+     1 2 3)
     ((5) (50 110) ,(lambda () (+ 1900 parse-time-elt)))
     ((5) (0 49) ,(lambda () (+ 2000 parse-time-elt))))
   "(slots predicate extractor...)")
@@ -173,7 +153,11 @@ parse-time-string
 		    (parse-time-val))
 	       (when (and (not (nth (car slots) time)) ;not already set
 			  (setq parse-time-val
-				(cond ((and (consp predicate)
+				(cond ((stringp predicate)
+                                        (and (stringp parse-time-elt)
+                                             (string-match predicate
+                                                           parse-time-elt)))
+                                      ((and (consp predicate)
 					    (not (functionp predicate)))
 				       (and (numberp parse-time-elt)
 					    (<= (car predicate) parse-time-elt)
@@ -188,15 +172,15 @@ parse-time-string
 		 (setq exit t)
 		 (while slots
 		   (let ((new-val (if rule
-				      (let ((this (pop rule)))
-					(if (vectorp this)
-					    (cl-parse-integer
-					     parse-time-elt
-					     :start (aref this 0)
-					     :end (aref this 1))
-					  (funcall this)))
+				      (let* ((this (pop rule)))
+                                        (if (integerp this)
+                                            (cl-parse-integer
+                                              (match-string this parse-time-elt))
+                                          (funcall this)))
 				    parse-time-val)))
-		     (setf (nth (pop slots) time) new-val))))))))
+                     (setf (nth (pop slots) time) new-val))))))
+           (unless exit
+             (message "unrecognized token '%s'" parse-time-elt))))
        time))))
 
 (defun parse-iso8601-time-string (date-string)
diff --git a/test/lisp/calendar/parse-time-tests.el b/test/lisp/calendar/parse-time-tests.el
index b706b73570..63b696db1d 100644
--- a/test/lisp/calendar/parse-time-tests.el
+++ b/test/lisp/calendar/parse-time-tests.el
@@ -45,6 +45,36 @@ parse-time-tests
                  '(42 35 19 22 2 2016 1 nil -28800)))
   (should (equal (parse-time-string "Friday, 21 Sep 2018 13:47:58 PDT")
                  '(58 47 13 21 9 2018 5 t -25200)))
+  (should (equal (parse-time-string "Friday, 21 Sep 2018 13:47:58")
+                 '(58 47 13 21 9 2018 5 -1 nil)))
+  (should (equal (parse-time-string "Friday, 21 Sep 2018")
+                 '(nil nil nil 21 9 2018 5 -1 nil)))
+  ;; Date can be numeric if separated by hyphens.
+  (should (equal (parse-time-string "Friday, 2018-09-21")
+                 '(nil nil nil 21 9 2018 5 -1 nil)))
+  ;; Day of week is optional
+  (should (equal (parse-time-string "2018-09-21")
+                 '(nil nil nil 21 9 2018 nil -1 nil)))
+  ;; The order of date, time, etc., does not matter.
+  (should (equal (parse-time-string "13:47:58, +0100, 2018-09-21, Friday")
+                 '(58 47 13 21 9 2018 5 -1 3600)))
+  ;; Month, day, or both, can be a single digit.
+  (should (equal (parse-time-string "Friday, 2018-9-08")
+                 '(nil nil nil 8 9 2018 5 -1 nil)))
+  (should (equal (parse-time-string "Friday, 2018-09-8")
+                 '(nil nil nil 8 9 2018 5 -1 nil)))
+  (should (equal (parse-time-string "Friday, 2018-9-8")
+                 '(nil nil nil 8 9 2018 5 -1 nil)))
+  ;; Time by itself is recognized as such.
+  (should (equal (parse-time-string "03:47:58")
+                 '(58 47 3 nil nil nil nil -1 nil)))
+  ;; A leading zero for hours is optional.
+  (should (equal (parse-time-string "3:47:58")
+                 '(58 47 3 nil nil nil nil -1 nil)))
+  ;; Missing seconds are assumed to be zero.
+  (should (equal (parse-time-string "3:47")
+                 '(0 47 3 nil nil nil nil -1 nil)))
+
   (should (equal (format-time-string
 		  "%Y-%m-%d %H:%M:%S"
 		  (parse-iso8601-time-string "1998-09-12T12:21:54-0200") t)


* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-19 21:11   ` Bob Rogers
@ 2021-12-20 10:08     ` Lars Ingebrigtsen
  2021-12-20 15:57       ` Bob Rogers
  0 siblings, 1 reply; 40+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-20 10:08 UTC (permalink / raw)
  To: Bob Rogers; +Cc: Katsumi Yamaoka, Paul Eggert, 52209

Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

> Since Emacs time functions have evolved well beyond email, I would argue
> that even "rfc-email-" is too specific a prefix for them.  So if this
> patch is not suitable, maybe it's (cough, cough) time for a new time and
> date parsing API that supports a broader range of human-generated dates
> and times, with better error handling and I18N support.  WDYT?

Yes, I think we should stop futzing around with `parse-time-string' and
instead create a new well-defined library with a signature like:

(parse-time "2020-01-15T16:12:21-08:00" 'iso-8601)
(parse-time "1/4-2" 'us-date)
(parse-time "Wed, 15 Jan 2020 16:12:21 -0800" 'rfc822)

etc.  (And yes, I know the latter is a superseded standard, but it's the
one people know.)
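
A signature like that could be sketched as a dispatch on the format
symbol.  This is only an illustration under assumptions: the name
sketch-parse-time is made up, only two formats are wired up, and the
rfc822 branch simply reuses the lenient `parse-time-string', so it is
not a strict RFC 822 parser.  `iso8601-parse' is the existing function
from iso8601.el (Emacs 27 and later).

```elisp
;; Sketch only; `sketch-parse-time' is a hypothetical function.
(require 'iso8601)
(require 'parse-time)

(defun sketch-parse-time (string format)
  "Parse STRING as FORMAT and return a decoded time.
FORMAT is one of the symbols `iso-8601' or `rfc822'.
Any other FORMAT signals an error instead of guessing."
  (pcase format
    ;; `iso8601-parse' itself signals an error on malformed input.
    ('iso-8601 (iso8601-parse string))
    ;; Assumption: `parse-time-string' stands in for an RFC 822 parser.
    ('rfc822 (parse-time-string string))
    (_ (error "Unsupported date format: %S" format))))

(sketch-parse-time "2020-01-15T16:12:21-08:00" 'iso-8601)
(sketch-parse-time "Wed, 15 Jan 2020 16:12:21 -0800" 'rfc822)
```

Both branches return a decoded-time list, so the result can be passed
to `encode-time' like the return value of `parse-time-string'.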

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-20 10:08     ` Lars Ingebrigtsen
@ 2021-12-20 15:57       ` Bob Rogers
  2021-12-20 16:34         ` Bob Rogers
  2021-12-21 11:01         ` Lars Ingebrigtsen
  0 siblings, 2 replies; 40+ messages in thread
From: Bob Rogers @ 2021-12-20 15:57 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Katsumi Yamaoka, Paul Eggert, 52209

   From: Lars Ingebrigtsen <larsi@gnus.org>
   Date: Mon, 20 Dec 2021 11:08:44 +0100

   Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

   > Since Emacs time functions have evolved well beyond email, I would argue
   > that even "rfc-email-" is too specific a prefix for them.  So if this
   > patch is not suitable, maybe it's (cough, cough) time for a new time and
   > date parsing API that supports a broader range of human-generated dates
   > and times, with better error handling and I18N support.  WDYT?

   Yes, I think we should stop futzing around with `parse-time-string' and
   instead create a new well-defined library with a signature like:

   (parse-time "2020-01-15T16:12:21-08:00" 'iso-8601)
   (parse-time "1/4-2" 'us-date)
   (parse-time "Wed, 15 Jan 2020 16:12:21 -0800" 'rfc822)

   etc.  (And yes, I know the latter is a superseded standard, but it's
   the one people know.)

I can see that it's a good idea to have an explicit hint that the date
is in rfc822 format since a two-digit year (which parse-time-string
still supports) might otherwise be misinterpreted as something else.
And perhaps two-digit years this far into the century should otherwise
be disallowed.

   Otherwise, I think the other date formats would be pretty easy to
recognize, with the exception of month-day order in numeric dates, which
ought to be possible to disambiguate via locale.  (Though I admit I have
no experience with locale programming.)

   On the other hand, I can imagine the caller might want to insist that
the passed string must be in a certain format and force an error if
parse-time finds otherwise.

   One question:  Did you have in mind that parse-time should have the
same return value as parse-time-string, in order to feed into the other
Emacs time functions?

					-- Bob






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-20 15:57       ` Bob Rogers
@ 2021-12-20 16:34         ` Bob Rogers
  2021-12-21 11:01         ` Lars Ingebrigtsen
  1 sibling, 0 replies; 40+ messages in thread
From: Bob Rogers @ 2021-12-20 16:34 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Katsumi Yamaoka, Paul Eggert, 52209

   From: Bob Rogers <rogers-emacs@rgrjr.homedns.org>
   Date: Mon, 20 Dec 2021 10:57:33 -0500

   . . . Otherwise, I think the other date formats would be pretty easy
   to recognize, with the exception of month-day order in numeric dates,
   which ought to be possible to disambiguate via locale.  (Though I
   admit I have no experience with locale programming.)

Never mind; through further reading I have realized that the current
locale has no necessary bearing on the locale of the date.  So I'm not
sure what's needed here.

					-- Bob






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-20 15:57       ` Bob Rogers
  2021-12-20 16:34         ` Bob Rogers
@ 2021-12-21 11:01         ` Lars Ingebrigtsen
  2021-12-23 19:48           ` Bob Rogers
  1 sibling, 1 reply; 40+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-21 11:01 UTC (permalink / raw)
  To: Bob Rogers; +Cc: Katsumi Yamaoka, Paul Eggert, 52209

Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

>    On the other hand, I can imagine the caller might want to insist that
> the passed string must be in a certain format and force an error if
> parse-time finds otherwise.

Yup.  That's one good reason to not have a time parsing function guess
at formats, because the input data will be different.

In my previous job, we had a library to parse date/time strings, and I
think we were up to about 80 distinct formats to handle the different
data feeds we were getting.  For instance, "01 02 03" may be three
different dates depending on where you get the date from.

>    One question:  Did you have in mind that parse-time should have the
> same return value as parse-time-string, in order to feed into the other
> Emacs time functions?

Yes, I think so.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-21 11:01         ` Lars Ingebrigtsen
@ 2021-12-23 19:48           ` Bob Rogers
  2021-12-24  9:29             ` Lars Ingebrigtsen
  0 siblings, 1 reply; 40+ messages in thread
From: Bob Rogers @ 2021-12-23 19:48 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 52209

[-- Attachment #1: message body text --]
[-- Type: text/plain, Size: 1699 bytes --]

   From: Lars Ingebrigtsen <larsi@gnus.org>
   Date: Tue, 21 Dec 2021 12:01:07 +0100

   Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

   >    On the other hand, I can imagine the caller might want to insist
   > that the passed string must be in a certain format and force an
   > error if parse-time finds otherwise.

   Yup.  That's one good reason to not have a time parsing function guess
   at formats, because the input data will be different.

OK, I have proceeded along those lines; WIP attached for feedback.  I
changed the name to "parse-date" to avoid confusion; I was otherwise
stuck when trying to come up with a sensible name for the test file,
since parse-time-tests.el was already taken (though I suppose I could
have added to the existing file).  The docstring of parse-date describes
the expected functionality as far as I've planned, with comments in
square brackets to note what's missing.

   In my previous job, we had a library to parse date/time strings, and I
   think we were up to about 80 distinct formats to handle the different
   data feeds we were getting.  For instance, "01 02 03" may be three
   different dates depending on where you get the date from.

Which (additional) formats would you like?  I'm assuming we need iso8601
and rfc822 for compatibility (in which case rfc2822 will be easy to
provide in addition), and us-date and euro-date to disambiguate the
month/day order.  Would the third format correspond to ISO 2001-01-03?
Do we want to support that?

   And come to think of it, I've been using DD-Mon-YY for my own
purposes for so long that I'm not even certain whether Americans use
MM-DD-YY or it's the other way around . . .

					-- Bob


[-- Attachment #2: Type: text/x-patch, Size: 20310 bytes --]

diff --git a/lisp/calendar/parse-date.el b/lisp/calendar/parse-date.el
new file mode 100644
index 0000000000..c4b756cf2e
--- /dev/null
+++ b/lisp/calendar/parse-date.el
@@ -0,0 +1,281 @@
+;;; parse-date.el --- parsing time/date strings -*- lexical-binding: t -*-
+
+;; Copyright (C) 2021 Free Software Foundation, Inc.
+
+;; Author: Bob Rogers <rogers@rgrjr.com>
+;; Keywords: util
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+
+;; 'parse-date' parses a time and/or date in a string and returns a
+;; list of values, just like `decode-time', where unspecified elements
+;; in the string are returned as nil (except unspecified DST is
+;; returned as -1).  `encode-time' may be applied on these values to
+;; obtain an internal time value.  If left to its own devices, it
+;; accepts a wide variety of formats, but can be told to insist on a
+;; particular date/time format.
+
+;; Historically, `parse-time-string' was used for this purpose, but it
+;; was focused on email date formats, and gradually but imperfectly
+;; extended to handle other formats.  'parse-date' is compatible in
+;; that it parses the same input formats and uses the same return
+;; value format, but is stricter in that it signals an error for
+;; tokens that `parse-time-string' would simply ignore.
+
+;;; TODO:
+;;
+;; * Define and signal a date-error for parsing issues.
+;;
+;; * Implement rfc2822 and rfc822 independently of parse-time-string.
+;;
+;; * Add a euro-date format for DD/MM/YYYY ?
+;;
+
+;;; Code:
+
+(require 'cl-lib)
+(require 'iso8601)
+(require 'parse-time)
+
+(defun parse-date-guess-format (time-string)
+  (cond ((string-match "[0-9]T[0-9]" time-string) 'iso8601)
+        (t nil)))
+
+(defun parse-date-ignore-char? (char)
+  (or (eq char ?\ ) (eq char ?,)))
+
+(defun parse-date-tokenize-string (string)
+  "Turn STRING into tokens, separated only by whitespace and commas.
+Multiple commas are ignored.  Pure digit sequences are turned
+into integers."
+  (let ((index 0)
+	(end (length string))
+        (char nil)
+	(list ()))
+    ;; Skip leading ignored characters.
+    (while (and (< index end)
+                (setq char (aref string index))
+                (parse-date-ignore-char? char))
+      (cl-incf index))
+    (while (< index end)
+      (let ((start index)
+            (all-digits (<= ?0 char ?9)))
+        ;; char is valid; look for more valid characters.
+        (while (and (< (cl-incf index) end)
+                    (setq char (aref string index))
+                    (not (parse-date-ignore-char? char)))
+          (unless (<= ?0 char ?9)
+	    (setq all-digits nil)))
+        (when (<= index end)
+	  (push (if all-digits
+                    (cl-parse-integer string :start start :end index)
+		  (substring string start index))
+		list)
+          ;; Skip ignored characters.
+          (while (and (< (cl-incf index) end)
+                      (setq char (aref string index))
+                      (parse-date-ignore-char? char))
+            ())
+          ;; Next token.
+          )))
+    (nreverse list)))
+
+(defconst parse-date-slot-names
+  '(second minute hour day month year weekday dst zone)
+  "Names of return value slots, for better error messages.
+See the decoded-time defstruct.")
+
+(defconst parse-date-slot-ranges
+  '((0 59) (0 59) (0 23) (1 31) (1 12) (1 9999))
+  "Numeric slot ranges, for bounds checking.")
+
+(defun parse-date-default (time-string two-digit-year?)
+  ;; Do the standard parsing thing.  This is mostly free form, in that
+  ;; tokens may appear in any order, but we expect to introduce some
+  ;; state dependence.
+  (let ((tokens (parse-date-tokenize-string (downcase time-string)))
+        (time (list nil nil nil nil nil nil nil -1 nil)))
+    (cl-flet ((set-matched-slot (slot index token)
+                ;; Assign a slot value from match data if index is
+                ;; non-nil, else from token, signalling an error if
+                ;; it's already been assigned or is out of range.
+                (let ((value (if index
+                                 (cl-parse-integer (match-string index token))
+                               token))
+                      (range (nth slot parse-date-slot-ranges)))
+                  (unless (equal (nth slot time)
+                                 (if (= slot 7) -1 nil))
+                    (error "Duplicate %s slot value '%s'"
+                           (nth slot parse-date-slot-names) token))
+                  (when (and range
+                             (not (<= (car range) value (cadr range))))
+                    (error "Value %s is out of range for %s"
+                           token (nth slot parse-date-slot-names)))
+                  (setf (nth slot time) value))))
+      (while tokens
+        (let ((token (pop tokens))
+              (match nil))
+          (cond ((numberp token)
+                  ;; A bare number could be a month, day, or year.
+                  ;; The order of these tests matters greatly.
+                  (cond ((>= token 1000)
+                          (set-matched-slot 5 nil token))
+                        ((and (<= 1 token 31)
+                              (not (nth 3 time)))
+                          ;; Assume days come before months or years.
+                          (set-matched-slot 3 nil token))
+                        ((and (<= 1 token 12)
+                              (not (nth 4 time)))
+                          ;; Assume months come before years.
+                          (set-matched-slot 4 nil token))
+                        ((or (nth 5 time)
+                             (not two-digit-year?)
+                             (> token 100))
+                          (error "Unrecognized numeric value %s" token))
+                        ;; It's a two-digit year.
+                        ((>= token 50)
+                          ;; second half of the 20th century.
+                          (set-matched-slot 5 nil (+ 1900 token)))
+                        (t
+                          ;; first half of the 21st century.
+                          (set-matched-slot 5 nil (+ 2000 token)))))
+                ((setq match (assoc token parse-time-weekdays))
+                  (set-matched-slot 6 nil (cdr match)))
+                ((setq match (assoc token parse-time-months))
+                  (set-matched-slot 4 nil (cdr match)))
+                ((setq match (assoc token parse-time-zoneinfo))
+                  (set-matched-slot 8 nil (cadr match))
+                  (set-matched-slot 7 nil (caddr match)))
+                ((string-match "^[-+][0-9][0-9][0-9][0-9]$" token)
+                  ;; Numeric time zone.
+                  (set-matched-slot
+                    8 nil
+                    (* 60
+                       (+ (cl-parse-integer token :start 3 :end 5)
+                          (* 60 (cl-parse-integer token :start 1 :end 3)))
+                       (if (= (aref token 0) ?-) -1 1))))
+                ((string-match
+                  "^\\([0-9][0-9][0-9][0-9]\\)[-/]\\([0-9][0-9]?\\)[-/]\\([0-9][0-9]?\\)$"
+                  token)
+                  ;; ISO-8601-style date (YYYY-MM-DD).
+                  (set-matched-slot 5 1 token)
+                  (set-matched-slot 4 2 token)
+                  (set-matched-slot 3 3 token))
+                ((string-match
+                  "^\\([0-9][0-9]?\\)[-/]\\([0-9][0-9]?\\)[-/]\\([0-9][0-9][0-9][0-9]\\)$"
+                  token)
+                  ;; US date (MM-DD-YYYY), but we insist on four
+                  ;; digits for the year.
+                  (set-matched-slot 4 1 token)
+                  (set-matched-slot 3 2 token)
+                  (set-matched-slot 5 3 token))
+                ((string-match
+                  "^\\([0-9][0-9]?\\):\\([0-9][0-9]\\):\\([0-9][0-9]\\)$"
+                  token)
+                  (set-matched-slot 2 1 token)
+                  (set-matched-slot 1 2 token)
+                  (set-matched-slot 0 3 token))
+                ((string-match "^\\([0-9][0-9]?\\):\\([0-9][0-9]\\)$" token)
+                  ;; Time without seconds.
+                  (set-matched-slot 2 1 token)
+                  (set-matched-slot 1 2 token)
+                  (set-matched-slot 0 nil 0))
+                ((member token '("am" "pm"))
+                  (unless (nth 2 time)
+                    (error "'AM'/'PM' specified before or without time"))
+                  (unless (<= (nth 2 time) 12)
+                    (error "'AM'/'PM' specified for time already past noon"))
+                  (when (equal token "pm")
+                    (cl-incf (nth 2 time) 12)))
+                (t
+                  (error "Unrecognized time token '%s'" token))))))
+    time))
+
+;;;###autoload
+(defun parse-date (time-string &optional format)
+  "Parse TIME-STRING according to FORMAT, returning a list.
+The FORMAT value is a symbol that may be one of the following:
+
+   iso8601 => parse the string according to the ISO-8601
+standard.  See `parse-iso8601-time-string'.
+
+   iso-8601 => synonym for iso8601.
+
+   rfc822 => parse an RFC822 (old email) date, which allows
+two-digit years and internal '()' comments.  In dates of the form
+'11 Jan 12', the 11 is assumed to be the day, and the 12 is
+assumed to mean 2012.  [not fully implemented.]
+
+   rfc2822 => parse an RFC2822 (new email) date, which allows
+only four-digit years.  [not implemented.]
+
+   us-date => parse a US-style date, of the form MM/DD/YYYY, but
+allowing two-digit years.  In dates of the form '01/11/12', the 1
+is the month, 11 is the day, and the 12 is assumed to mean 2012.
+[not fully implemented.]
+
+   nil => attempt to guess the format, falling back on us-date
+with two-digit years disallowed.
+
+The default is nil, and anything else is assumed to be us-date
+with two-digit years disallowed.
+
+   * For all formats except iso8601, parsing is case-insensitive.
+
+   * Commas and whitespace are ignored.
+
+   * In date specifications, either '/' or '-' may be used to
+separate components, but all three components must be given.
+
+   * A date that starts with four digits is YYYY-MM-DD, ISO-8601
+style, but a date that ends with four digits is MM-DD-YYYY [at
+least in us-date format].
+
+   * Two digit years, when allowed, are in the 1900's when
+between 50 and 99 and in the 2000's when between 0 and 49.
+
+Errors are signalled when time values are duplicated,
+unrecognized, or out of range.  No consistency checks between
+fields are done.  For instance, the weekday is not checked to see
+that it corresponds to the date, and parse-date complains about
+the 32nd of March (or any other month) but blithely accepts the
+29th of February in non-leap years -- or the 31st of February in
+any year.
+
+The result is a list of (SEC MIN HOUR DAY MON YEAR DOW DST TZ),
+which can be accessed as a decoded-time defstruct (q.v.),
+e.g. `decoded-time-year' to extract the year, and turned into an
+Emacs timestamp by `encode-time'.  The values returned are
+identical to those of `decode-time', but any unknown values other
+than DST are returned as nil, and an unknown DST value is
+returned as -1."
+  (cl-case (or format (parse-date-guess-format time-string))
+    ((iso8601 iso-8601)
+      (parse-iso8601-time-string time-string))
+    ((rfc822 rfc2822)
+      ;; [Placeholder; we eventually want something more strict.  --
+      ;; rgr, 20-Dec-21.]
+      (parse-time-string time-string))
+    (us-date
+      (parse-date-default time-string t))
+    (t
+      (parse-date-default time-string nil))))
+
+(provide 'parse-date)
+
+;;; parse-date.el ends here
diff --git a/test/lisp/calendar/parse-date-tests.el b/test/lisp/calendar/parse-date-tests.el
new file mode 100644
index 0000000000..682365e674
--- /dev/null
+++ b/test/lisp/calendar/parse-date-tests.el
@@ -0,0 +1,164 @@
+;;; parse-date-tests.el --- Test suite for parse-date.el  -*- lexical-binding:t -*-
+
+;; Copyright (C) 2016-2021 Free Software Foundation, Inc.
+
+;; Author: Lars Ingebrigtsen <larsi@gnus.org>
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+
+;;; Code:
+
+(require 'ert)
+(require 'parse-date)
+
+(ert-deftest parse-date-tests ()
+  "Test basic parse-date functionality."
+
+  ;; Test tokenization.
+  (should (equal (parse-date-tokenize-string " ") '()))
+  (should (equal (parse-date-tokenize-string " a b") '("a" "b")))
+  (should (equal (parse-date-tokenize-string "a bbc dde") '("a" "bbc" "dde")))
+  (should (equal (parse-date-tokenize-string " , a 27 b,, c 14:32 ")
+                 '("a" 27 "b" "c" "14:32")))
+
+  ;; Start with some RFC822 dates.
+  (dolist (format '(nil rfc822))
+    (should (equal (parse-date "Mon, 22 Feb 2016 19:35:42 +0100" format)
+                   '(42 35 19 22 2 2016 1 -1 3600)))
+    (should (equal (parse-date "22 Feb 2016 19:35:42 +0100" format)
+                   '(42 35 19 22 2 2016 nil -1 3600)))
+    (should (equal (parse-date "22 Feb 2016 +0100" format)
+                   '(nil nil nil 22 2 2016 nil -1 3600)))
+    (should (equal (parse-date "Mon, 22 February 2016 19:35:42 +0100" format)
+                   '(42 35 19 22 2 2016 1 -1 3600)))
+    (should (equal (parse-date "Mon, 22 feb 2016 19:35:42 +0100" format)
+                   '(42 35 19 22 2 2016 1 -1 3600)))
+    (should (equal (parse-date "Monday, 22 february 2016 19:35:42 +0100" format)
+                   '(42 35 19 22 2 2016 1 -1 3600)))
+    (should (equal (parse-date "Monday, 22 february 2016 19:35:42 PST" format)
+                   '(42 35 19 22 2 2016 1 nil -28800)))
+    (should (equal (parse-date "Friday, 21 Sep 2018 13:47:58 PDT" format)
+                   '(58 47 13 21 9 2018 5 t -25200)))
+    (should (equal (parse-date "Friday, 21 Sep 2018 13:47:58" format)
+                   '(58 47 13 21 9 2018 5 -1 nil)))
+    (should (equal (parse-date "Friday, 21 Sep 2018" format)
+                   '(nil nil nil 21 9 2018 5 -1 nil))))
+  ;; These are not allowed by the default format.
+  (should (equal (parse-date "22 Feb 16 19:35:42 +0100" 'rfc822)
+                 '(42 35 19 22 2 2016 nil -1 3600)))
+  (should (equal (parse-date "22 Feb 96 19:35:42 +0100" 'rfc822)
+                 '(42 35 19 22 2 1996 nil -1 3600)))
+
+  ;; Test the default format with both hyphens and slashes in dates.
+  (dolist (case '(;; Month can be numeric if date uses hyphens/slashes.
+                  ("Friday, 2018-09-21" (nil nil nil 21 9 2018 5 -1 nil))
+                  ;; Year can come last if four digits.
+                  ("Friday, 9-21-2018" (nil nil nil 21 9 2018 5 -1 nil))
+                  ;; Day of week is optional
+                  ("2018-09-21" (nil nil nil 21 9 2018 nil -1 nil))
+                  ;; The order of date, time, etc., does not matter.
+                  ("13:47:58, +0100, 2018-09-21, Friday"
+                   (58 47 13 21 9 2018 5 -1 3600))
+                  ;; Month, day, or both, can be a single digit.
+                  ("Friday, 2018-9-08" (nil nil nil 8 9 2018 5 -1 nil))
+                  ("Friday, 2018-09-8" (nil nil nil 8 9 2018 5 -1 nil))
+                  ("Friday, 2018-9-8" (nil nil nil 8 9 2018 5 -1 nil))))
+    (let ((string (car case))
+          (expected (cadr case)))
+      ;; Test with hyphens.
+      (should (equal (parse-date string) expected))
+      (while (string-match "-" string)
+        (setq string (replace-match "/" t t string)))
+      ;; Test with slashes.
+      (should (equal (parse-date string) expected))))
+
+  ;; Time by itself is recognized as such.
+  (should (equal (parse-date "03:47:58")
+                 '(58 47 3 nil nil nil nil -1 nil)))
+  ;; A leading zero for hours is optional.
+  (should (equal (parse-date "3:47:58")
+                 '(58 47 3 nil nil nil nil -1 nil)))
+  ;; Missing seconds are assumed to be zero.
+  (should (equal (parse-date "3:47")
+                 '(0 47 3 nil nil nil nil -1 nil)))
+  ;; AM/PM are understood (in any case combination).
+  (dolist (am '(am AM Am))
+    (should (equal (parse-date (format "3:47 %s" am))
+                   '(0 47 3 nil nil nil nil -1 nil))))
+  (dolist (pm '(pm PM Pm))
+    (should (equal (parse-date (format "3:47 %s" pm))
+                   '(0 47 15 nil nil nil nil -1 nil))))
+
+  ;; Ensure some cases fail.
+  (should-error (parse-date "22 Feb 196" 'us-date))	;; bad year
+  (should-error (parse-date "22 Feb 16 19:35:42"))	;; two-digit year
+  (should-error (parse-date "22 Feb 96 19:35:42"))	;; two-digit year
+  (should-error (parse-date "2 Feb 2021 1996"))	;; duplicate year
+  (should-error (parse-date "2020-1-1 2021"))	;; another duplicate year
+  (should-error (parse-date "2020-1-1 30"))	;; extra 30 (not a day)
+  (should-error (parse-date "2020-1-1 12"))	;; extra 12 (not a month)
+  (should-error (parse-date "15:47 15:15"))	;; duplicate time
+  (should-error (parse-date "2020-1-1 +0800 -0800"))	;; duplicate TZ
+  (should-error (parse-date "15:47 PM"))	;; PM in the afternoon
+  (should-error (parse-date "2020-1-1 PM"))	;; PM without a time
+  ;; Range tests.
+  (should-error (parse-date "2021-12-32"))
+  (should-error (parse-date "2021-12-0"))
+  (should-error (parse-date "2021-13-3"))
+  (should-error (parse-date "0000-12-3"))
+  (should-error (parse-date "20021 Dec 3"))
+  (should-error (parse-date "24:21:14"))
+  (should-error (parse-date "14:60:21"))
+  (should-error (parse-date "14:21:60"))
+
+  ;; Test ISO-8601 dates.
+  (dolist (format '(nil iso8601 iso-8601))
+    (should (equal (format-time-string
+                    "%Y-%m-%d %H:%M:%S"
+                    (parse-date "1998-09-12T12:21:54-0200" format) t)
+                   "1998-09-12 14:21:54"))
+    (should (equal (format-time-string
+                    "%Y-%m-%d %H:%M:%S"
+                    (parse-date "1998-09-12T12:21:54-0230" format) t)
+                   "1998-09-12 14:51:54"))
+    (should (equal (format-time-string
+                    "%Y-%m-%d %H:%M:%S"
+                    (parse-date "1998-09-12T12:21:54-02:00" format) t)
+                   "1998-09-12 14:21:54"))
+    (should (equal (format-time-string
+                    "%Y-%m-%d %H:%M:%S"
+                    (parse-date "1998-09-12T12:21:54-02" format) t)
+                   "1998-09-12 14:21:54"))
+    (should (equal (format-time-string
+                    "%Y-%m-%d %H:%M:%S"
+                    (parse-date "1998-09-12T12:21:54+0230" format) t)
+                   "1998-09-12 09:51:54"))
+    (should (equal (format-time-string
+                    "%Y-%m-%d %H:%M:%S"
+                    (parse-date "1998-09-12T12:21:54+02" format) t)
+                   "1998-09-12 10:21:54"))
+    (should (equal (format-time-string
+                    "%Y-%m-%d %H:%M:%S"
+                    (parse-date "1998-09-12T12:21:54Z" format) t)
+                   "1998-09-12 12:21:54"))
+    (should (equal (parse-date "1998-09-12T12:21:54")
+                   (encode-time 54 21 12 12 9 1998)))))
+
+(provide 'parse-date-tests)
+
+;;; parse-date-tests.el ends here

* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-23 19:48           ` Bob Rogers
@ 2021-12-24  9:29             ` Lars Ingebrigtsen
  2021-12-24 15:58               ` Bob Rogers
  0 siblings, 1 reply; 40+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-24  9:29 UTC (permalink / raw)
  To: Bob Rogers; +Cc: 52209

Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

> OK, I have proceeded along those lines; WIP attached for feedback.  I
> changed the name to "parse-date" to avoid confusion; I was otherwise
> stuck when trying to come up with a sensible name for the test file,
> since parse-time-tests.el was already taken (though I suppose I could
> have added to the existing file).

Sounds good to me.

> Which (additional) formats would you like?  I'm assuming we need iso8601
> and rfc822 for compatibility (in which case rfc2822 will be easy to
> provide in addition), and us-date and euro-date to disambiguate the
> month/day order.  Would the third format correspond to ISO 2001-01-03?
> Do we want to support that?

Probably not -- you mostly see that in Sweden.

> +(defun parse-date (time-string &optional format)

I think it'd be better if this was a cl-defmethod with an eql specifier
for the format.

> +   iso8601 => parse the string according to the ISO-8601
> +standard.  See `parse-iso8601-time-string'.
> +
> +   iso-8601 => synonym for iso8601.

And synonyms aren't necessary -- they just confuse people reading the
code.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-24  9:29             ` Lars Ingebrigtsen
@ 2021-12-24 15:58               ` Bob Rogers
  2021-12-25 11:58                 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 40+ messages in thread
From: Bob Rogers @ 2021-12-24 15:58 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 52209

   From: Lars Ingebrigtsen <larsi@gnus.org>
   Date: Fri, 24 Dec 2021 10:29:29 +0100

   Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

   > Which (additional) formats would you like?  I'm assuming we need iso8601
   > and rfc822 for compatibility (in which case rfc2822 will be easy to
   > provide in addition), and us-date and euro-date to disambiguate the
   > month/day order.  Would the third format correspond to ISO 2001-01-03?
   > Do we want to support that?

   Probably not -- you mostly see that in Sweden.

OK (<phew> ;-}).

   > +(defun parse-date (time-string &optional format)

   I think it'd be better if this was a cl-defmethod with an eql
   specifier for the format.

OK, good; cl-case was easier to start, but I was also beginning to think
in terms of cl-defmethod.
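
For reference, a minimal sketch of what eql dispatch on the format
could look like.  This is not the code from the patch: the names
`my-parse-date' and `my-parse-date-as' are hypothetical, falling back
to rfc822 when no format is given is an assumption, and the quoted
`(eql 'SYMBOL)' form assumes Emacs 28 or later.  The format is made the
first, mandatory argument of the generic function so that dispatch
never lands on an optional argument.

```elisp
(require 'cl-lib)
(require 'iso8601)
(require 'parse-time)

(cl-defgeneric my-parse-date-as (format time-string)
  "Parse TIME-STRING as FORMAT, returning a decoded-time list.
Hypothetical generic function, for illustration only.")

(cl-defmethod my-parse-date-as ((_format (eql 'iso-8601)) time-string)
  ;; Delegate to the existing ISO 8601 parser.
  (iso8601-parse time-string))

(cl-defmethod my-parse-date-as ((_format (eql 'rfc822)) time-string)
  ;; Placeholder: reuse the lenient email-oriented parser.
  (parse-time-string time-string))

(cl-defmethod my-parse-date-as (format _time-string)
  ;; Unspecialized fallback: reached only when no eql method matches.
  (error "Unknown date format: %S" format))

(defun my-parse-date (time-string &optional format)
  "Parse TIME-STRING as FORMAT (assumed default: `rfc822')."
  (my-parse-date-as (or format 'rfc822) time-string))
```

One benefit of this shape is that another package could add a format
with its own cl-defmethod instead of editing a central cl-case.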

   > +   iso8601 => parse the string according to the ISO-8601
   > +standard.  See `parse-iso8601-time-string'.
   > +
   > +   iso-8601 => synonym for iso8601.

   And synonyms aren't necessary -- they just confuse people reading the
   code.

OK.  I added the synonym because RFCs are always spelled without the
hyphen, but I wasn't sure about the convention for ISO standards.  And
it seems that there isn't a well defined precedent in the Emacs sources;
C programmers mostly avoid the hyphen, but Elisp programmers are more
evenly split:

    rogers@orion> find . -name '*.el' | xargs cat | tr A-Z a-z | grep -c 'iso-[0-9]'
    702
    rogers@orion> find . -name '*.el' | xargs cat | tr A-Z a-z | grep -c 'iso[0-9]'
    798
    rogers@orion> find . -name '*.[ch]' | xargs cat | tr A-Z a-z | grep -c 'iso-[0-9]'
    47
    rogers@orion> find . -name '*.[ch]' | xargs cat | tr A-Z a-z | grep -c 'iso[0-9]'
    148
    rogers@orion> 

So which do you prefer?

   I'm also looking at defining a date-parse-error condition with a few
error symbol "subclasses," but I'm wondering about the tradeoff between
having enough error symbols for precision in error reporting
vs. cluttering the code with too many.  Thoughts?

					-- Bob

* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-24 15:58               ` Bob Rogers
@ 2021-12-25 11:58                 ` Lars Ingebrigtsen
  2021-12-25 22:50                   ` Bob Rogers
  0 siblings, 1 reply; 40+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-25 11:58 UTC (permalink / raw)
  To: Bob Rogers; +Cc: 52209

Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

> So which do you prefer?

With a hyphen.

>    I'm also looking at defining a date-parse-error condition with a few
> error symbol "subclasses," but I'm wondering about the tradeoff between
> having enough error symbols for precision in error reporting
> vs. cluttering the code with too many.  Thoughts?

Having a `date-parse-error' would be fine, but I'm unsure about the
utility of having a bunch of sub-errors, but perhaps you have a use case
in mind?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-25 11:58                 ` Lars Ingebrigtsen
@ 2021-12-25 22:50                   ` Bob Rogers
  2021-12-26 11:31                     ` Lars Ingebrigtsen
  0 siblings, 1 reply; 40+ messages in thread
From: Bob Rogers @ 2021-12-25 22:50 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 52209

   From: Lars Ingebrigtsen <larsi@gnus.org>
   Date: Sat, 25 Dec 2021 12:58:14 +0100

   Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

   > I'm also looking at defining a date-parse-error condition with a
   > few error symbol "subclasses," but I'm wondering about the tradeoff
   > between having enough error symbols for precision in error
   > reporting vs. cluttering the code with too many.  Thoughts?

   Having a `date-parse-error' would be fine, but I'm unsure about the
   utility of having a bunch of sub-errors, but perhaps you have a use
   case in mind?

My only motivation is that I think it would make the resulting error
message clearer.  For example, passing a malformed ISO 8601 date to
iso8601-parse just signals wrong-type-argument, which is not very
helpful.  Multiple errors would allow me to specify the problem in
detail, while still classifying them as date/time parsing errors.  Here
are four that I have in mind:

	Unknown date/time token: X
	Illegal date/time value for field: <field>, X
	Duplicate date/time value for field: <field>, X
	Date/time value for field out of range: <field>, X, <min>, <max>

This doesn't quite cover the 14 calls to `error' that are in the current
version of the code, in that they wouldn't be as precise, but they
should be adequate.

   On the other hand, this might be overkill for callers of parse-date,
who, being deep in their own logic, might only care that some date they
have to deal with is invalid.  Which is why I wanted an opinion from
someone with the big picture -- I admit I am biased (and a bit annoyed)
from too often having to study the code to figure out why some perfectly
reasonable date I supply is being misinterpreted.

					-- Bob

* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-25 22:50                   ` Bob Rogers
@ 2021-12-26 11:31                     ` Lars Ingebrigtsen
  2021-12-28 15:52                       ` Bob Rogers
  0 siblings, 1 reply; 40+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-26 11:31 UTC (permalink / raw)
  To: Bob Rogers; +Cc: 52209

Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

>    On the other hand, this might be overkill for callers of parse-date,
> who, being deep in their own logic, might only care that some date they
> have to deal with is invalid.  Which is why I wanted an opinion from
> someone with the big picture -- I admit I am biased (and a bit annoyed)
> from too often having to study the code to figure out why some perfectly
> reasonable date I supply is being misinterpreted.

Better error messages are possible without making many specific error
symbols, though.
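
As a rough sketch of that idea (not taken from any patch here): one
error symbol can carry the specifics in its data list, so handlers
catch a single condition while the message stays precise.  The names
`my-date-parse-error' and `my-check-hour' are hypothetical, as is the
layout of the data list.

```elisp
;; A single condition; its parent defaults to `error'.
(define-error 'my-date-parse-error "Date/time parse error")

(defun my-check-hour (value)
  "Return VALUE if it is a valid hour, else signal `my-date-parse-error'.
Hypothetical helper, for illustration only."
  (unless (and (integerp value) (<= 0 value 23))
    ;; The data list carries: description, field, value, min, max.
    (signal 'my-date-parse-error
            (list "Value out of range for field" 'hour value 0 23)))
  value)

;; A caller that only cares that parsing failed handles one symbol,
;; yet can still report the details.
(condition-case err
    (my-check-hour 24)
  (my-date-parse-error
   (message "%s" (error-message-string err))))
```

A handler for plain `error' catches this as well, since that is the
parent condition.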

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-26 11:31                     ` Lars Ingebrigtsen
@ 2021-12-28 15:52                       ` Bob Rogers
  2021-12-29 15:19                         ` Lars Ingebrigtsen
  0 siblings, 1 reply; 40+ messages in thread
From: Bob Rogers @ 2021-12-28 15:52 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 52209

[-- Attachment #1: message body text --]
[-- Type: text/plain, Size: 947 bytes --]

   From: Lars Ingebrigtsen <larsi@gnus.org>
   Date: Sun, 26 Dec 2021 12:31:07 +0100

   Better error messages are possible without making many specific error
   symbols, though.

OK, I think I have a good solution that uses a single error symbol; let
me know what you think.  (Having never done much with Elisp conditions,
I was still thinking in terms of Common Lisp, so I had to go scratch my
head for a while.)

   I am currently working on broadening what the parser will accept,
though I think it is close to a usable state.  I am using the
documentation for the Perl Date::Parse module to see what it accepts,
and will then look at the corresponding Python and Ruby modules for
further ideas.  I am not planning to adopt everything I see, though; in
particular, I think it's a good idea for new code to stick to insisting
on four-digit years except when the caller has specified a format that
determines the month/day order.
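
For the record, the two-digit-year rule stated in the earlier
parse-date docstring (50 through 99 fall in the 1900s, 0 through 49 in
the 2000s) can be sketched as below.  The function name
`my-expand-two-digit-year' is hypothetical, and this is an
illustration rather than the code in the attached patch.

```elisp
(defun my-expand-two-digit-year (year)
  "Expand two-digit YEAR to four digits using a pivot at 50.
Hypothetical helper, for illustration only."
  (cond ((not (and (integerp year) (<= 0 year 99)))
         (error "Not a two-digit year: %S" year))
        ;; 50..99 => second half of the 20th century.
        ((>= year 50) (+ 1900 year))
        ;; 0..49 => first half of the 21st century.
        (t (+ 2000 year))))

(my-expand-two-digit-year 96) ; => 1996
(my-expand-two-digit-year 16) ; => 2016
```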

					-- Bob


[-- Attachment #2: Type: text/x-patch, Size: 33616 bytes --]

diff --git a/lisp/calendar/parse-date.el b/lisp/calendar/parse-date.el
new file mode 100644
index 0000000000..10bd939e91
--- /dev/null
+++ b/lisp/calendar/parse-date.el
@@ -0,0 +1,472 @@
+;;; parse-date.el --- parsing time/date strings -*- lexical-binding: t -*-
+
+;; Copyright (C) 2021 Free Software Foundation, Inc.
+
+;; Author: Bob Rogers <rogers@rgrjr.com>
+;; Keywords: util
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+
+;; 'parse-date' parses a time and/or date in a string and returns a
+;; list of values, just like `decode-time', where unspecified elements
+;; in the string are returned as nil (except unspecified DST is
+;; returned as -1).  `encode-time' may be applied on these values to
+;; obtain an internal time value.  If left to its own devices, it
+;; accepts a wide variety of formats, but can be told to insist on a
+;; particular date/time format.
+
+;; Historically, `parse-time-string' was used for this purpose, but it
+;; was focused on email date formats, and gradually but imperfectly
+;; extended to handle other formats.  'parse-date' is compatible in
+;; that it parses the same input formats and uses the same return
+;; value format, but is stricter in that it signals an error for
+;; tokens that `parse-time-string' would simply ignore.
+
+;;; TODO:
+;;
+;; * Add a euro-date format for DD/MM/YYYY ?
+;;
+
+;;; Code:
+
+(require 'cl-lib)
+(require 'iso8601)
+(require 'parse-time)
+
+(define-error 'date-parse-error "Date/time parse error" 'error)
+
+(defconst parse-date--ends-with-alpha-tz-re
+  (concat " \\(" (mapconcat #'car parse-time-zoneinfo "\\|") "\\)$")
+  "Recognize an alphanumeric timezone at the end of the string.")
+
+(defun parse-date--guess-rfc822-formats (date-string)
+  (let ((case-fold-search t))
+    (cond ((string-match "(" date-string) 'rfc2822)
+          ((string-match parse-date--ends-with-alpha-tz-re date-string)
+            ;; Alphabetic timezones are legacy syntax.
+            'rfc822)
+          ((string-match " [-+][0-9][0-9][0-9][0-9][ \t\n]*\\($\\|(\\)"
+                         date-string)
+            ;; Note that an ISO-8601 timezone has a colon in the middle
+            ;; and no preceding space.
+            'rfc2822)
+          (t nil))))
+
+(defun parse-date--guess-format (date-string)
+  (cond ((iso8601-valid-p date-string) 'iso-8601)
+        ((parse-date--guess-rfc822-formats date-string))
+        (t nil)))
+
+(defun parse-date--ignore-char? (char)
+  ;; Ignore whitespace and commas.
+  (or (eq char ?\ ) (eq char ?\t) (eq char ?\r) (eq char ?\n) (eq char ?,)))
+
+(defun parse-date--tokenize-string (string &optional strip-fws?)
+  "Turn STRING into tokens, separated only by whitespace and commas.
+Multiple commas are ignored.  Pure digit sequences are turned
+into integers.  If STRIP-FWS? is true, then folding whitespace as
+defined by RFC2822 (strictly, the CFWS production that also
+accepts comments) is stripped out by treating it like whitespace;
+if its value is the symbol `first', we exit when we see the
+first '(' (per RFC2822), else we strip them all (per RFC822)."
+  (let ((index 0)
+	(end (length string))
+        (fws-eof? (eq strip-fws? 'first))
+	(list ()))
+    (when fws-eof?
+      ;; In order to stop on the first "(", we need to see it as
+      ;; non-whitespace.
+      (setq strip-fws? nil))
+    (cl-flet ((skip-ignored ()
+                ;; Skip ignored characters at index (the scan
+                ;; position).  Skip RFC822 comments in matched parens
+                ;; if strip-fws? is true, but do not complain about
+                ;; unterminated comments.
+                (let ((char nil)
+                      (nest 0))
+                  (while (and (< index end)
+                              (setq char (aref string index))
+                              (or (> nest 0)
+                                  (parse-date--ignore-char? char)
+                                  (and strip-fws? (eql char ?\())))
+                    (cl-incf index)
+                    ;; FWS bookkeeping.
+                    (cond ((not strip-fws?))
+                          ((and (eq char ?\\)
+                                (< (1+ index) end))
+	                    ;; Move to the next char but don't check
+	                    ;; it to see if it might be a paren.
+                            (cl-incf index))
+                          ((eq char ?\() (cl-incf nest))
+                          ((eq char ?\)) (cl-decf nest)))))))
+      (skip-ignored)		;; Skip leading whitespace.
+      (while (and (< index end)
+                  (not (and fws-eof?
+                            (eq (aref string index) ?\())))
+        (let* ((start index)
+               (char (aref string index))
+               (all-digits (<= ?0 char ?9)))
+          ;; char is valid; look for more valid characters.
+          (when (and strip-fws?
+                     (eq char ?\\)
+                     (< (1+ index) end))
+            ;; Escaped character, which might be a "(".  If so, we are
+            ;; correct to include it in the token, even though the
+            ;; caller is sure to barf.  If not, we violate RFC2?822 by
+            ;; not removing the backslash, but no characters in valid
+            ;; RFC2?822 dates need escaping anyway, so it shouldn't
+            ;; matter that this is not done strictly correctly.  --
+            ;; rgr, 24-Dec-21.
+            (cl-incf index))
+          (while (and (< (cl-incf index) end)
+                      (setq char (aref string index))
+                      (not (or (parse-date--ignore-char? char)
+                               (and strip-fws?
+                                    (eq char ?\()))))
+            (unless (<= ?0 char ?9)
+              (setq all-digits nil))
+            (when (and strip-fws?
+                       (eq char ?\\)
+                       (< (1+ index) end))
+              ;; Escaped character, see above.
+              (cl-incf index)))
+          (push (if all-digits
+                    (cl-parse-integer string :start start :end index)
+                  (substring string start index))
+                list)
+          (skip-ignored)))
+      (nreverse list))))
+
+(defconst parse-date--slot-names
+  '(second minute hour day month year weekday dst zone)
+  "Names of return value slots, for better error messages.
+See the decoded-time defstruct.")
+
+(defconst parse-date--slot-ranges
+  '((0 60) (0 59) (0 23) (1 31) (1 12) (1 9999))
+  "Numeric slot ranges, for bounds checking.
+Note that RFC2822 explicitly requires that seconds go up to 60,
+to allow for leap seconds (see Mills, D., 'Network Time
+Protocol', STD 12, RFC 1119, September 1989).")
+
+(defun parse-date--x822 (time-string obs-format?)
+  ;; Parse an RFC2822 or (if obs-format? is true) RFC822 date.  The
+  ;; strict syntax for the former is as follows:
+  ;;
+  ;;	[ day-of-week "," ] day FWS month-name FWS year FWS time [CFWS]
+  ;;
+  ;; where "time" is:
+  ;;
+  ;;	2DIGIT ":" 2DIGIT [ ":" 2DIGIT ] FWS ( "+" / "-" ) 4DIGIT
+  ;;
+  ;; RFC822 also accepts comments in random places (which is handled
+  ;; by parse-date--tokenize-string) and two-digit years.  We are
+  ;; somewhat more lax in what we accept (specifically, the hours
+  ;; don't have to be two digits, and the TZ and the comma after the
+  ;; DOW are optional), but we do insist that the items that are
+  ;; present do appear in this order.
+  (let ((tokens (parse-date--tokenize-string (downcase time-string)
+                                            (if obs-format? 'all 'first)))
+        (time (list nil nil nil nil nil nil nil -1 nil)))
+    (cl-labels ((set-matched-slot (slot index token)
+                  ;; Assign a slot value from match data if index is
+                  ;; non-nil, else from token, signalling an error if
+                  ;; it's already been assigned or is out of range.
+                  (let ((value (if index
+                                   (cl-parse-integer (match-string index token))
+                                 token))
+                        (range (nth slot parse-date--slot-ranges)))
+                    (unless (equal (nth slot time)
+                                   (if (= slot 7) -1 nil))
+                      (signal 'date-parse-error
+                              (list "Duplicate slot value"
+                                    (nth slot parse-date--slot-names) token)))
+                    (when (and range
+                               (not (<= (car range) value (cadr range))))
+                      (signal 'date-parse-error
+                              (list "Slot out of range"
+                                    (nth slot parse-date--slot-names)
+                                    token (car range) (cadr range))))
+                    (setf (nth slot time) value)))
+                (set-numeric (slot token)
+                  (unless (natnump token)
+                    (signal 'date-parse-error
+                              (list "Not a number"
+                                    (nth slot parse-date--slot-names) token)))
+                  (set-matched-slot slot nil token)))
+      ;; Check for weekday.
+      (let ((dow (assoc (car tokens) parse-time-weekdays)))
+        (when dow
+          ;; Day of the week.
+          (set-matched-slot 6 nil (cdr dow))
+          (pop tokens)))
+      ;; Day.
+      (set-numeric 3 (pop tokens))
+      ;; Alphabetic month.
+      (let* ((month (pop tokens))
+             (match (assoc month parse-time-months)))
+        (if match
+            (set-matched-slot 4 nil (cdr match))
+          (signal 'date-parse-error
+                  (list "Expected an alphabetic month" month))))
+      ;; Year.
+      (let ((year (pop tokens)))
+        ;; Check the year for the right number of digits.
+        (cond ((>= year 1000)
+                (set-numeric 5 year))
+              ((or (not obs-format?) (>= year 100))
+                (signal 'date-parse-error
+                        (list "Four digit years are required" year)))
+              ((>= year 50)
+                ;; second half of the 20th century.
+                (set-numeric 5 (+ 1900 year)))
+              (t
+                ;; first half of the 21st century.
+                (set-numeric 5 (+ 2000 year)))))
+      ;; Time.
+      (let ((time (pop tokens)))
+        (cond ((or (null time) (natnump time))
+                (signal 'date-parse-error
+                        (list "Expected a time" time)))
+              ((string-match
+                "^\\([0-9][0-9]?\\):\\([0-9][0-9]\\):\\([0-9][0-9]\\)$"
+                time)
+                (set-matched-slot 2 1 time)
+                (set-matched-slot 1 2 time)
+                (set-matched-slot 0 3 time))
+              ((string-match "^\\([0-9][0-9]?\\):\\([0-9][0-9]\\)$" time)
+                ;; Time without seconds.
+                (set-matched-slot 2 1 time)
+                (set-matched-slot 1 2 time)
+                (set-matched-slot 0 nil 0))
+              (t
+                (signal 'date-parse-error
+                        (list "Expected a time" time)))))
+      ;; Timezone.
+      (let* ((zone (pop tokens))
+             (match (assoc zone parse-time-zoneinfo)))
+        (cond (match
+                (set-matched-slot 8 nil (cadr match))
+                (set-matched-slot 7 nil (caddr match)))
+              ((and (stringp zone)
+                    (string-match "^[-+][0-9][0-9][0-9][0-9]$" zone))
+                ;; Numeric time zone.
+                (set-matched-slot
+                  8 nil
+                  (* 60
+                     (+ (cl-parse-integer zone :start 3 :end 5)
+                        (* 60 (cl-parse-integer zone :start 1 :end 3)))
+                     (if (= (aref zone 0) ?-) -1 1))))
+              (zone
+                (signal 'date-parse-error
+                        (list "Expected a timezone" zone)))))
+      (when tokens
+        (signal 'date-parse-error
+                (list "Extra token(s)" (car tokens)))))
+    time))
+
+(defun parse-date--default (time-string two-digit-year?)
+  ;; Do the standard parsing thing.  This is mostly free form, in that
+  ;; tokens may appear in any order, but we expect to introduce some
+  ;; state dependence.
+  (let ((tokens (parse-date--tokenize-string (downcase time-string)))
+        (time (list nil nil nil nil nil nil nil -1 nil)))
+    (cl-flet ((set-matched-slot (slot index token)
+                ;; Assign a slot value from match data if index is
+                ;; non-nil, else from token, signalling an error if
+                ;; it's already been assigned or is out of range.
+                (let ((value (if index
+                                 (cl-parse-integer (match-string index token))
+                               token))
+                      (range (nth slot parse-date--slot-ranges)))
+                  (unless (equal (nth slot time)
+                                 (if (= slot 7) -1 nil))
+                    (signal 'date-parse-error
+                              (list "Duplicate slot value"
+                                    (nth slot parse-date--slot-names) token)))
+                  (when (and range
+                             (not (<= (car range) value (cadr range))))
+                    (signal 'date-parse-error
+                            (list "Slot out of range"
+                                         (nth slot parse-date--slot-names)
+                                         token (car range) (cadr range))))
+                  (setf (nth slot time) value))))
+      (while tokens
+        (let ((token (pop tokens))
+              (match nil))
+          (cond ((numberp token)
+                  ;; A bare number could be a month, day, or year.
+                  ;; The order of these tests matters greatly.
+                  (cond ((>= token 1000)
+                          (set-matched-slot 5 nil token))
+                        ((and (<= 1 token 31)
+                              (not (nth 3 time)))
+                          ;; Assume days come before months or years.
+                          (set-matched-slot 3 nil token))
+                        ((and (<= 1 token 12)
+                              (not (nth 4 time)))
+                          ;; Assume months come before years.
+                          (set-matched-slot 4 nil token))
+                        ((or (nth 5 time)
+                             (not two-digit-year?)
+                             (>= token 100))
+                          (signal 'date-parse-error
+                                  (list "Unrecognized token" token)))
+                        ;; It's a two-digit year.
+                        ((>= token 50)
+                          ;; second half of the 20th century.
+                          (set-matched-slot 5 nil (+ 1900 token)))
+                        (t
+                          ;; first half of the 21st century.
+                          (set-matched-slot 5 nil (+ 2000 token)))))
+                ((setq match (assoc token parse-time-weekdays))
+                  (set-matched-slot 6 nil (cdr match)))
+                ((setq match (assoc token parse-time-months))
+                  (set-matched-slot 4 nil (cdr match)))
+                ((setq match (assoc token parse-time-zoneinfo))
+                  (set-matched-slot 8 nil (cadr match))
+                  (set-matched-slot 7 nil (caddr match)))
+                ((string-match "^[-+][0-9][0-9][0-9][0-9]$" token)
+                  ;; Numeric time zone.
+                  (set-matched-slot
+                    8 nil
+                    (* 60
+                       (+ (cl-parse-integer token :start 3 :end 5)
+                          (* 60 (cl-parse-integer token :start 1 :end 3)))
+                       (if (= (aref token 0) ?-) -1 1))))
+                ((string-match
+                  "^\\([0-9][0-9][0-9][0-9]\\)[-/]\\([0-9][0-9]?\\)[-/]\\([0-9][0-9]?\\)$"
+                  token)
+                  ;; ISO-8601-style date (YYYY-MM-DD).
+                  (set-matched-slot 5 1 token)
+                  (set-matched-slot 4 2 token)
+                  (set-matched-slot 3 3 token))
+                ((string-match
+                  "^\\([0-9][0-9]?\\)[-/]\\([0-9][0-9]?\\)[-/]\\([0-9][0-9][0-9][0-9]\\)$"
+                  token)
+                  ;; US date (MM-DD-YYYY), but we insist on four
+                  ;; digits for the year.
+                  (set-matched-slot 4 1 token)
+                  (set-matched-slot 3 2 token)
+                  (set-matched-slot 5 3 token))
+                ((string-match
+                  "^\\([0-9][0-9]?\\):\\([0-9][0-9]\\):\\([0-9][0-9]\\)$"
+                  token)
+                  (set-matched-slot 2 1 token)
+                  (set-matched-slot 1 2 token)
+                  (set-matched-slot 0 3 token))
+                ((string-match "^\\([0-9][0-9]?\\):\\([0-9][0-9]\\)$" token)
+                  ;; Time without seconds.
+                  (set-matched-slot 2 1 token)
+                  (set-matched-slot 1 2 token)
+                  (set-matched-slot 0 nil 0))
+                ((member token '("am" "pm"))
+                  (unless (nth 2 time)
+                    (signal 'date-parse-error
+                            (list "Missing time" token)))
+                  (unless (<= (nth 2 time) 12)
+                    (signal 'date-parse-error
+                            (list "Time already past noon" token)))
+                  (when (equal token "pm")
+                    (cl-incf (nth 2 time) 12)))
+                (t
+                  (signal 'date-parse-error
+                          (list "Unrecognized token" token)))))))
+    time))
+
+;;;###autoload
+(cl-defgeneric parse-date (time-string &optional format)
+  "Parse TIME-STRING according to FORMAT, returning a list.
+The FORMAT value is a symbol that may be one of the following:
+
+   iso-8601 => parse the string according to the ISO-8601
+standard.  See `parse-iso8601-time-string'.
+
+   rfc822 => parse an RFC822 (old email) date, which allows
+two-digit years and internal '()' comments.  In dates of the form
+'11 Jan 12', the 11 is assumed to be the day, and the 12 is
+assumed to mean 2012.  Be sure you really want this; the format
+is more limited than most human-supplied dates.
+
+   rfc2822 => parse an RFC2822 (new email) date, which allows
+only four-digit years.  Again, this is a fairly restricted
+format, with fields required to be in a specified order and
+representation.
+
+   us-date => parse a US-style date, of the form MM/DD/YYYY, but
+allowing two-digit years.  In dates of the form '01/11/12', the 1
+is the month, 11 is the day, and the 12 is assumed to mean 2012.
+
+   nil => like us-date with two-digit years disallowed.
+
+Anything else is treated as iso-8601 if it looks similar, else
+us-date with two-digit years disallowed.
+
+   * For all formats except iso-8601, parsing is case-insensitive.
+
+   * Commas and whitespace are ignored.
+
+   * In date specifications, either '/' or '-' may be used to
+separate components, but all three components must be given.
+
+   * A date that starts with four digits is YYYY-MM-DD, ISO-8601
+style, but a date that ends with four digits is MM-DD-YYYY [at
+least in us-date format].
+
+   * Two digit years, when allowed, are in the 1900's when
+between 50 and 99 inclusive and in the 2000's when between 0 and
+49 inclusive.
+
+A `date-parse-error' is signalled when time values are duplicated,
+unrecognized, or out of range.  No consistency checks between
+fields are done.  For instance, the weekday is not checked to see
+that it corresponds to the date, and parse-date complains about
+the 32nd of March (or any other month) but blithely accepts the
+29th of February in non-leap years -- or the 31st of February in
+any year.
+
+The result is a list of (SEC MIN HOUR DAY MON YEAR DOW DST TZ),
+which can be accessed as a decoded-time defstruct (q.v.),
+e.g. `decoded-time-year' to extract the year, and turned into an
+Emacs timestamp by `encode-time'.  The values returned are
+identical to those of `decode-time', but any unknown values other
+than DST are returned as nil, and an unknown DST value is
+returned as -1.")
+
+(cl-defmethod parse-date (time-string (_format (eql iso-8601)))
+  (iso8601-parse time-string))
+
+(cl-defmethod parse-date (time-string (_format (eql rfc2822)))
+  (parse-date--x822 time-string nil))
+
+(cl-defmethod parse-date (time-string (_format (eql rfc822)))
+  (parse-date--x822 time-string t))
+
+(cl-defmethod parse-date (time-string (_format (eql us-date)))
+  (parse-date--default time-string t))
+
+(cl-defmethod parse-date (time-string (_format (eql nil)))
+  (parse-date--default time-string nil))
+
+(cl-defmethod parse-date (time-string _format)
+  ;; Re-dispatch after guessing the format.
+  (parse-date time-string (parse-date--guess-format time-string)))
+
+(provide 'parse-date)
+
+;;; parse-date.el ends here
diff --git a/test/lisp/calendar/parse-date-tests.el b/test/lisp/calendar/parse-date-tests.el
new file mode 100644
index 0000000000..bd2b344d71
--- /dev/null
+++ b/test/lisp/calendar/parse-date-tests.el
@@ -0,0 +1,247 @@
+;;; parse-date-tests.el --- Test suite for parse-date.el  -*- lexical-binding:t -*-
+
+;; Copyright (C) 2016-2021 Free Software Foundation, Inc.
+
+;; Author: Lars Ingebrigtsen <larsi@gnus.org>
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+
+;;; Code:
+
+(require 'ert)
+(require 'parse-date)
+
+(ert-deftest parse-date-tests ()
+  "Test basic parse-date functionality."
+
+  ;; Test tokenization.
+  (should (equal (parse-date--tokenize-string " ") '()))
+  (should (equal (parse-date--tokenize-string " a b") '("a" "b")))
+  (should (equal (parse-date--tokenize-string "a bbc dde") '("a" "bbc" "dde")))
+  (should (equal (parse-date--tokenize-string " , a 27 b,, c 14:32 ")
+                 '("a" 27 "b" "c" "14:32")))
+  ;; Some folding whitespace tests.
+  (should (equal (parse-date--tokenize-string " a b (end) c" 'first)
+                 '("a" "b")))
+  (should (equal (parse-date--tokenize-string "(quux)a (foo (bar)) b(baz)" t)
+                 '("a" "b")))
+  (should (equal (parse-date--tokenize-string "a b\\cde" 'all)
+                 ;; Strictly incorrect, but strictly unnecessary syntax.
+                 '("a" "b\\cde")))
+  (should (equal (parse-date--tokenize-string "a b\\ de" 'all)
+                 '("a" "b\\ de")))
+  (should (equal (parse-date--tokenize-string "a \\de \\(f" 'all)
+                 '("a" "\\de" "\\(f")))
+
+  ;; Start with some compatible RFC822 dates.
+  (dolist (format '(nil rfc822 rfc2822))
+    (should (equal (parse-date "Mon, 22 Feb 2016 19:35:42 +0100" format)
+                   '(42 35 19 22 2 2016 1 -1 3600)))
+    (should (equal (parse-date "22 Feb 2016 19:35:42 +0100" format)
+                   '(42 35 19 22 2 2016 nil -1 3600)))
+    (should (equal (parse-date "Mon, 22 February 2016 19:35:42 +0100" format)
+                   '(42 35 19 22 2 2016 1 -1 3600)))
+    (should (equal (parse-date "Mon, 22 feb 2016 19:35:42 +0100" format)
+                   '(42 35 19 22 2 2016 1 -1 3600)))
+    (should (equal (parse-date "Monday, 22 february 2016 19:35:42 +0100" format)
+                   '(42 35 19 22 2 2016 1 -1 3600)))
+    (should (equal (parse-date "Monday, 22 february 2016 19:35:42 PST" format)
+                   '(42 35 19 22 2 2016 1 nil -28800)))
+    (should (equal (parse-date "Friday, 21 Sep 2018 13:47:58 PDT" format)
+                   '(58 47 13 21 9 2018 5 t -25200)))
+    (should (equal (parse-date "Friday, 21 Sep 2018 13:47:58" format)
+                   '(58 47 13 21 9 2018 5 -1 nil))))
+  ;; These are not allowed by the default format.
+  (should (equal (parse-date "22 Feb 16 19:35:42 +0100" 'rfc822)
+                 '(42 35 19 22 2 2016 nil -1 3600)))
+  (should (equal (parse-date "22 Feb 96 19:35:42 +0100" 'rfc822)
+                 '(42 35 19 22 2 1996 nil -1 3600)))
+  ;; Try them again with comments.
+  (should (equal (parse-date "22 Feb (today) 16 19:35:42 +0100" 'rfc822)
+                 '(42 35 19 22 2 2016 nil -1 3600)))
+  (should (equal (parse-date "22 Feb 96 (long ago) 19:35:42 +0100" 'rfc822)
+                 '(42 35 19 22 2 1996 nil -1 3600)))
+  (should (equal (parse-date
+                  "Friday, 21 Sep(comment \\) with \\( parens)18 19:35:42"
+                  'rfc822)
+                 '(42 35 19 21 9 2018 5 -1 nil)))
+  (should (equal (parse-date
+                  "Friday, 21 Sep 18 19:35:42 (unterminated comment"
+                  'rfc822)
+                 '(42 35 19 21 9 2018 5 -1 nil)))
+
+  ;; Test some RFC822 error cases
+  (dolist (test '(("33 1 2022" ("Slot out of range" day 33 1 31))
+                  ("0 1 2022" ("Slot out of range" day 0 1 31))
+                  ("1 1 2020 2021" ("Expected an alphabetic month" 1))
+                  ("1 Jan 2020 2021" ("Expected a time" 2021))
+                  ("1 Jan 2020 20:21 2000" ("Expected a timezone" 2000))
+                  ("1 Jan 2020 20:21 +0200 33" ("Extra token(s)" 33))))
+    (should (equal (condition-case err (parse-date (car test) 'rfc822)
+                     (date-parse-error (cdr err)))
+                   (cadr test))))
+
+  ;; And these are not allowed by rfc822 because of missing time.
+  (should (equal (parse-date "Friday, 21 Sep 2018" nil)
+                 '(nil nil nil 21 9 2018 5 -1 nil)))
+  (should (equal (parse-date "22 Feb 2016 +0100" nil)
+                 '(nil nil nil 22 2 2016 nil -1 3600)))
+
+  ;; Test the default format with both hyphens and slashes in dates.
+  (dolist (case '(;; Month can be numeric if date uses hyphens/slashes.
+                  ("Friday, 2018-09-21" (nil nil nil 21 9 2018 5 -1 nil))
+                  ;; Year can come last if four digits.
+                  ("Friday, 9-21-2018" (nil nil nil 21 9 2018 5 -1 nil))
+                  ;; Day of week is optional
+                  ("2018-09-21" (nil nil nil 21 9 2018 nil -1 nil))
+                  ;; The order of date, time, etc., does not matter.
+                  ("13:47:58, +0100, 2018-09-21, Friday"
+                   (58 47 13 21 9 2018 5 -1 3600))
+                  ;; Month, day, or both, can be a single digit.
+                  ("Friday, 2018-9-08" (nil nil nil 8 9 2018 5 -1 nil))
+                  ("Friday, 2018-09-8" (nil nil nil 8 9 2018 5 -1 nil))
+                  ("Friday, 2018-9-8" (nil nil nil 8 9 2018 5 -1 nil))))
+    (let ((string (car case))
+          (expected (cadr case)))
+      ;; Test with hyphens.
+      (should (equal (parse-date string nil) expected))
+      (while (string-match "-" string)
+        (setq string (replace-match "/" t t string)))
+      ;; Test with slashes.
+      (should (equal (parse-date string nil) expected))))
+
+  ;; Time by itself is recognized as such.
+  (should (equal (parse-date "03:47:58" nil)
+                 '(58 47 3 nil nil nil nil -1 nil)))
+  ;; A leading zero for hours is optional.
+  (should (equal (parse-date "3:47:58" nil)
+                 '(58 47 3 nil nil nil nil -1 nil)))
+  ;; Missing seconds are assumed to be zero.
+  (should (equal (parse-date "3:47" nil)
+                 '(0 47 3 nil nil nil nil -1 nil)))
+  ;; AM/PM are understood (in any case combination).
+  (dolist (am '(am AM Am))
+    (should (equal (parse-date (format "3:47 %s" am) nil)
+                   '(0 47 3 nil nil nil nil -1 nil))))
+  (dolist (pm '(pm PM Pm))
+    (should (equal (parse-date (format "3:47 %s" pm) nil)
+                   '(0 47 15 nil nil nil nil -1 nil))))
+
+  ;; Ensure some cases fail.
+  (should-error (parse-date "22 Feb 196" 'us-date))
+  (should-error (parse-date "22 Feb 16 19:35:42" nil))
+  (should-error (parse-date "22 Feb 96 19:35:42" nil))	;; two-digit year
+  (should-error (parse-date "2 Feb 2021 1996" nil))	;; duplicate year
+
+  (dolist (test '(("22 Feb 196" us-date	;; bad year
+                   ("Unrecognized token" 196))
+                  ("22 Feb 16 19:35:42" nil	;; two-digit year
+                   ("Unrecognized token" 16))
+                  ("22 Feb 96 19:35:42" nil	;; two-digit year
+                   ("Unrecognized token" 96))
+                  ("2 Feb 2021 1996" nil
+                   ("Duplicate slot value" year 1996))
+                  ("2020-1-1 2021" nil
+                   ("Duplicate slot value" year 2021))
+                  ("22 Feb 196" us-date
+                   ("Unrecognized token" 196))
+                  ("22 Feb 16 19:35:42" nil
+                   ("Unrecognized token" 16))
+                  ("22 Feb 96 19:35:42" nil
+                   ("Unrecognized token" 96))
+                  ("2 Feb 2021 1996" nil
+                   ("Duplicate slot value" year 1996))
+                  ("2020-1-1 30" nil
+                   ("Unrecognized token" 30))
+                  ("2020-1-1 12" nil
+                   ("Unrecognized token" 12))
+                  ("15:47 15:15" nil
+                   ("Duplicate slot value" hour "15:15"))
+                  ("2020-1-1 +0800 -0800" nil
+                   ("Duplicate slot value" zone -28800))
+                  ("15:47 PM" nil
+                   ("Time already past noon" "pm"))
+                  ("15:47 AM" nil
+                   ("Time already past noon" "am"))
+                  ("2020-1-1 PM" nil
+                   ("Missing time" "pm"))
+                  ;; Range tests
+                  ("2021-12-32" nil
+                   ("Slot out of range" day "2021-12-32" 1 31))
+                  ("2021-12-0" nil
+                   ("Slot out of range" day "2021-12-0" 1 31))
+                  ("2021-13-3" nil
+                   ("Slot out of range" month "2021-13-3" 1 12))
+                  ("0000-12-3" nil
+                   ("Slot out of range" year "0000-12-3" 1 9999))
+                  ("20021 Dec 3" nil
+                   ("Slot out of range" year 20021 1 9999))
+                  ("24:21:14" nil
+                   ("Slot out of range" hour "24:21:14" 0 23))
+                  ("14:60:21" nil
+                   ("Slot out of range" minute "14:60:21" 0 59))
+                  ("14:21:61" nil
+                   ("Slot out of range" second "14:21:61" 0 60))))
+    (should (equal (condition-case err (parse-date (car test) (cadr test))
+                     (date-parse-error (cdr err)))
+                   (caddr test))))
+  (should (equal (parse-date "14:21:60" nil)	;; a leap second!
+                 '(60 21 14 nil nil nil nil -1 nil)))
+
+  ;; Test ISO-8601 dates.
+  (dolist (format '(t iso-8601))
+    (should (equal (parse-date "1998-09-12T12:21:54-0200" format)
+                   '(54 21 12 12 9 1998 nil nil -7200)))
+    (should (equal (format-time-string
+                    "%Y-%m-%d %H:%M:%S"
+                    (encode-time
+                      (parse-date "1998-09-12T12:21:54-0230" format))
+                    t)
+                   "1998-09-12 14:51:54"))
+    (should (equal (format-time-string
+                    "%Y-%m-%d %H:%M:%S"
+                    (encode-time
+                      (parse-date "1998-09-12T12:21:54-02:00" format))
+                    t)
+                   "1998-09-12 14:21:54"))
+    (should (equal (format-time-string
+                    "%Y-%m-%d %H:%M:%S"
+                    (encode-time
+                      (parse-date "1998-09-12T12:21:54-02" format))
+                    t)
+                   "1998-09-12 14:21:54"))
+    (should (equal (format-time-string
+                    "%Y-%m-%d %H:%M:%S"
+                    (encode-time
+                      (parse-date "1998-09-12T12:21:54+0230" format))
+                    t)
+                   "1998-09-12 09:51:54"))
+    (should (equal (format-time-string
+                    "%Y-%m-%d %H:%M:%S"
+                    (encode-time
+                     (parse-date "1998-09-12T12:21:54+02" format))
+                    t)
+                   "1998-09-12 10:21:54"))
+    (should (equal (parse-date "1998-09-12T12:21:54Z" t)
+                   '(54 21 12 12 9 1998 nil nil 0)))
+    (should (equal (parse-date "1998-09-12T12:21:54" format)
+                   '(54 21 12 12 9 1998 nil -1 nil)))))
+
+(provide 'parse-date-tests)
+
+;;; parse-date-tests.el ends here


* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-28 15:52                       ` Bob Rogers
@ 2021-12-29 15:19                         ` Lars Ingebrigtsen
  2021-12-29 19:29                           ` Paul Eggert
  2021-12-30 21:08                           ` Bob Rogers
  0 siblings, 2 replies; 40+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-29 15:19 UTC (permalink / raw)
  To: Bob Rogers; +Cc: 52209, Paul Eggert

Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

>    I am currently working on broadening what the parser will accept,
> though I think it is close to a usable state.

Makes sense to me.  Perhaps Paul has some comments; added to the CCs.

> +(cl-defmethod parse-date (time-string (_format (eql iso-8601)))

By the way, this should be

(cl-defmethod parse-date (time-string (_format (eql 'iso-8601)))

now -- we're transitioning to eval-ing the eql specifier.
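
A minimal sketch of the quoted specializer in context, assuming Emacs 28
or later (where the form inside `eql' is evaluated).  The generic
function name my-parse-date and its fallback method are hypothetical and
are not taken from the patch:

```elisp
(require 'cl-lib)
(require 'iso8601)
(require 'parse-time)

;; Hypothetical generic function; the name `my-parse-date' is an
;; assumption chosen so as not to clash with the patch under review.
(cl-defgeneric my-parse-date (time-string format)
  "Parse TIME-STRING according to FORMAT and return a decoded time.")

;; Emacs 28 and later evaluate the form inside `eql', so the symbol
;; is quoted here.
(cl-defmethod my-parse-date (time-string (_format (eql 'iso-8601)))
  (iso8601-parse time-string))

;; Fallback for any other FORMAT value.
(cl-defmethod my-parse-date (time-string _format)
  (parse-time-string time-string))
```

Calling (my-parse-date "1998-09-12T12:21:54Z" 'iso-8601) dispatches to
the first method; any other FORMAT value falls through to the second.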

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-29 15:19                         ` Lars Ingebrigtsen
@ 2021-12-29 19:29                           ` Paul Eggert
  2021-12-29 22:01                             ` Bob Rogers
  2021-12-30 21:08                           ` Bob Rogers
  1 sibling, 1 reply; 40+ messages in thread
From: Paul Eggert @ 2021-12-29 19:29 UTC (permalink / raw)
  To: Lars Ingebrigtsen, Bob Rogers; +Cc: 52209

On 12/29/21 07:19, Lars Ingebrigtsen wrote:
> Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:
> 
>>     I am currently working on broadening what the parser will accept,
>> though I think it is close to a usable state.
> 
> Makes sense to me.  Perhaps Paul has some comments; added to the CCs.

My first comment is "be careful what you're getting into" :-). I'm 
trying to retire from date-parsing as its users are never happy and 
rightly so. But here goes. I took a quick look at 
<https://bugs.gnu.org/52209#58> and have a few comments.

* Calling it parse-date is a bit confusing, as it parses both dates and 
times. I suggest calling it parse-timestamp or parse-date-time instead. 
(I know the existing package is called parse-time but we can't fix that.)

* If the package is called X, the error should be called X-error. 
Currently the package is called parse-date and the error is called 
date-parse-error, which is confusing.

* The patch should also modify the comment at the start of parse-time.el 
to indicate parse-date-time as another possibility.

* I suggest preferring the symbol 'rfc-email' for parsing email-related 
dates, for consistency with the --rfc-email option of GNU 'date'. This 
should use the current RFC (5322 now, perhaps updated later). I suppose 
you could also advertise 'rfc-822' for strict RFC 822 conformance, and 
similarly 'rfc2822' for strict 2822 conformance, but I expect these 
alternatives would be less useful in practice.


> +   nil => like us-date with two-digit years disallowed.

This doesn't sound like a good default. For example, it completely 
mishandles dates in Brazil, which use DD/MM/YYYY format.

> +Anything else is treated as iso-8601 if it looks similar, else
> +us-date with two-digit years disallowed.

This might be a better default (for nil), but it should have an explicit 
name other than nil.

> +   * For all formats except iso-8601, parsing is case-insensitive.

It's pretty common for ISO 8601 parsers to be case-insensitive. For 
example, Java's OffsetDateTime.parse(CharSequence) allows both lower and 
upper case T and Z. Perhaps some people need strict ISO 8601 parsers, 
but I imagine a more-generous parser would be more useful. So you could 
have iso-8601 and iso-8601-strict; or you could have a strictness arg; 
or something like that.

> +   * Commas and whitespace are ignored.

This is quite wrong for some formats, if you want to be strict. And even 
if not, commas are part of ISO 8601 format and can't be ignored if I 
understand what you mean by "ignored".


> +   * Two digit years, when allowed, are in the 1900's when
> +between 50 and 99 inclusive and in the 2000's when between 0 and
> +49 inclusive.

This disagrees with the POSIX standard for 'date' (supported by GNU 
'date'), which says 69-99 are treated as 1969-1999 and 00-68 are treated 
as 2000-2068. I suggest going with the POSIX heuristic if you're going 
to use a fixed heuristic for dates at all.
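
The two fixed heuristics differ only in their pivot year, which a short
sketch can make explicit.  The function name and the PIVOT argument are
hypothetical; 69 is the POSIX pivot described above and 50 is the pivot
from the quoted docstring:

```elisp
(defun my-expand-two-digit-year (yy &optional pivot)
  "Expand the two-digit year YY to a four-digit year.
Values at or above PIVOT map to 19YY, values below it to 20YY.
PIVOT defaults to 69, the POSIX rule."
  (let ((pivot (or pivot 69)))
    (unless (and (integerp yy) (<= 0 yy 99))
      (signal 'args-out-of-range (list yy 0 99)))
    (+ yy (if (>= yy pivot) 1900 2000))))

(my-expand-two-digit-year 68)     ; => 2068 (POSIX)
(my-expand-two-digit-year 69)     ; => 1969 (POSIX)
(my-expand-two-digit-year 68 50)  ; => 1968 (pivot of 50)
```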

Better might be to have an optional argument of context specifying the 
default time for incomplete timestamps. You can use that context to 
fill in more-significant parts that are missing. E.g., if the year is 
missing, you take it from the context; if the century is missing, you 
take that from the context. The default context would be empty, i.e., 
missing years or centuries would be an error.

For more formats that need parsing, see:

https://en.wikipedia.org/wiki/Date_format_by_country
https://metacpan.org/search?q=datetime%3A%3Aformat

You don't need to support them all now, but you should take a look at 
what's out there and make sure the API can be extended to handle them.






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-29 19:29                           ` Paul Eggert
@ 2021-12-29 22:01                             ` Bob Rogers
  2021-12-30  5:32                               ` Bob Rogers
  0 siblings, 1 reply; 40+ messages in thread
From: Bob Rogers @ 2021-12-29 22:01 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Lars Ingebrigtsen, 52209

   From: Paul Eggert <eggert@cs.ucla.edu>
   Date: Wed, 29 Dec 2021 11:29:44 -0800

   On 12/29/21 07:19, Lars Ingebrigtsen wrote:
   > Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:
   > 
   >>     I am currently working on broadening what the parser will accept,
   >> though I think it is close to a usable state.
   > 
   > Makes sense to me.  Perhaps Paul has some comments; added to the CCs.

   My first comment is "be careful what you're getting into" :-).  I'm
   trying to retire from date-parsing as its users are never happy and
   rightly so.

No worries; I have spent more of my career than I like to think about
dealing with date/time issues, so I know what a can of worms I am in the
process of opening.

   But here goes.  I took a quick look at
   <https://bugs.gnu.org/52209#58> and have a few comments.

They are greatly appreciated; thank you.

   * Calling it parse-date is a bit confusing, as it parses both dates and 
   times. I suggest calling it parse-timestamp or parse-date-time instead. 
   (I know the existing package is called parse-time but we can't fix that.)

Lars originally suggested parse-time, but there's already a
parse-time-tests.el, so I switched to parse-date so I could use
parse-date-tests.el to correspond.  So the namespace is already crowded.
But I would be OK with either of those alternatives.  Since it will
actually give you either date or time, or both, parse-date-time might
make more sense.

   * If the package is called X, the error should be called X-error. 
   Currently the package is called parse-date and the error is called 
   date-parse-error, which is confusing.

My thought was that for the "parse-date" function, the verb should come
before the noun, and in "date-parse-error", the "date" is an adjective
further modifying "parse error."  But I think I'm way fussier about
these things than anybody I know, so your point is well taken.

   * The patch should also modify the comment at the start of
   parse-time.el to indicate parse-date-time as another possibility.

I took that as a late-stage task, something to do alongside updating
Elisp documentation.  (Which I haven't even begun to look at.)

   * I suggest preferring the symbol 'rfc-email' for parsing
   email-related dates, for consistency with the --rfc-email option of
   GNU 'date'. This should use the current RFC (5322 now, perhaps
   updated later). 

I started with RFC822 and RFC2822 because I had copies of these lying
around; you're right that I should have looked for more recent
standards.  And using rfc-email as a synonym for the latest version is a
good idea.

   I suppose you could also advertise 'rfc-822' for strict RFC 822
   conformance, and similarly 'rfc2822' for strict 2822 conformance, but
   I expect these alternatives would be less useful in practice.

Anyone parsing email headers would need their date parser to support
RFC822 in case they encountered very old emails, but (since later
standards are backward-compatible) it's not clear what supporting
intermediate standards would buy.

   > +   nil => like us-date with two-digit years disallowed.

   This doesn't sound like a good default. For example, it completely 
   mishandles dates in Brazil, which use DD/MM/YYYY format.

I subsequently added a euro-date format for DD/MM (with various lengths
of years).

   > +Anything else is treated as iso-8601 if it looks similar, else
   > +us-date with two-digit years disallowed.

   This might be a better default (for nil), but it should have an explicit 
   name other than nil.

Suggestions?

   > +   * For all formats except iso-8601, parsing is case-insensitive.

   It's pretty common for ISO 8601 parsers to be case-insensitive. For 
   example, Java's OffsetDateTime.parse(CharSequence) allow both lower and 
   upper case T and Z. Perhaps some people need strict ISO 8601 parsers, 
   but I imagine a more-generous parser would be more useful. So you could 
   have iso-8601 and iso-8601-strict; or you could have a strictness arg; 
   or something like that.

Actually, I am handing those off to the existing iso8601-parse code,
which doesn't like lowercase T (at least).

   > +   * Commas and whitespace are ignored.

   This is quite wrong for some formats, if you want to be strict.  And
   even if not, commas are part of ISO 8601 format and can't be ignored
   if I understand what you mean by "ignored".

I see I need to clarify the docstring to state that these other bulleted
comments also do not apply to ISO-8601 dates.

   > +   * Two digit years, when allowed, are in the 1900's when
   > +between 50 and 99 inclusive and in the 2000's when between 0 and
   > +49 inclusive.

   This disagrees with the POSIX standard for 'date' (supported by GNU 
   'date'), which says 69-99 are treated as 1969-1999 and 00-68 are treated 
   as 2000-2068. I suggest going with the POSIX heuristic if you're going 
   to use a fixed heuristic for dates at all.

I was just following the existing parse-time-string heuristic.  So which
do you think should rule:  POSIX or parse-time-string compatibility?

   Better might be to have an optional argument of context specifying the 
   default time for incomplete timestamps. You can use that the context to 
   fill in more-significant parts that are missing. E.g., if the year is 
   missing, you take it from the context; if the century is missing, you 
   take that from the context. The default context would be empty, i.e., 
   missing years or centuries would be an error.

Again, I'm just doing what parse-time-string is doing, namely leaving
everything that is not specified nil, and letting the caller decide how
to apply defaults.  The only exception is when time is specified without
seconds; in that case, the seconds are set to zero (which is also
compatible with parse-time-string).

   And even defaulting from context is not straightforward:  If given a
date without a year that is not today, should that be in the future or
in the past?  There's a can of worms I don't need to touch.  ;-}
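
For what it's worth, a caller who does want context defaults can apply
them on top of the nil slots.  This is only a sketch of that
caller-side step; the function name is hypothetical, and it simply
fills every nil slot rather than trying to decide between past and
future:

```elisp
(require 'cl-lib)

(defun my-fill-from-context (decoded context)
  "Return a copy of DECODED with each nil slot taken from CONTEXT.
Both arguments are decoded-time lists of the same length."
  (cl-mapcar (lambda (value default)
               (if (null value) default value))
             decoded context))

;; A time with no year or zone, completed from a 2021 UTC context.
(my-fill-from-context '(0 15 23 29 9 nil nil -1 nil)
                      '(0 0 0 1 1 2021 nil -1 0))
;; => (0 15 23 29 9 2021 nil -1 0)
```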

   For more formats that need parsing, see:

   https://en.wikipedia.org/wiki/Date_format_by_country
   https://metacpan.org/search?q=datetime%3A%3Aformat

   You don't need to support them all now, but you should take a look at 
   what's out there and make sure the API can be extended to handle them.

Excellent; thank you!  I have been looking at date parsing module
documentation but so far the ones I've seen have not been very clear
about what they actually accept.

					-- Bob Rogers
					   http://www.rgrjr.com/






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-29 22:01                             ` Bob Rogers
@ 2021-12-30  5:32                               ` Bob Rogers
  0 siblings, 0 replies; 40+ messages in thread
From: Bob Rogers @ 2021-12-30  5:32 UTC (permalink / raw)
  To: Paul Eggert, Lars Ingebrigtsen, 52209

[-- Attachment #1: message body text --]
[-- Type: text/plain, Size: 1002 bytes --]

   From: Bob Rogers <rogers-emacs@rgrjr.homedns.org>
   Date: Wed, 29 Dec 2021 17:01:01 -0500

      From: Paul Eggert <eggert@cs.ucla.edu>
      Date: Wed, 29 Dec 2021 11:29:44 -0800

      * I suggest preferring the symbol 'rfc-email' for parsing
      email-related dates, for consistency with the --rfc-email option
      of GNU 'date'. This should use the current RFC (5322 now, perhaps
      updated later).

The only update I saw at https://www.rfc-editor.org (RFC6854) only
affects addressing syntax.

   I started with RFC822 and RFC2822 because I had copies of these lying
   around; you're right that I should have looked for more recent
   standards.  And using rfc-email as a synonym for the latest version is a
   good idea.

FYI, there is no substantial difference between RFC2822 and RFC5322 in
date/time syntax.  They hide the whitespace in different productions,
but the end result is the same.  So I'll change the format name to
rfc5322 and add rfc-email as a synonym.
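
For reference, the zone rule that both texts share (+hhmm means
+(hh * 60 + mm) minutes, with the last two digits in the range 00
through 59) can be sketched as follows.  The function name is
hypothetical, and the result is in seconds because that is what the
zone slot of a decoded time holds:

```elisp
(defun my-rfc5322-zone-to-seconds (zone)
  "Convert ZONE, a string such as \"+0230\", to seconds east of UTC."
  (unless (string-match
           "\\`\\([-+]\\)\\([0-9][0-9]\\)\\([0-5][0-9]\\)\\'" zone)
    (error "Invalid zone: %S" zone))
  (let ((sign (if (equal (match-string 1 zone) "-") -1 1))
        (hh (string-to-number (match-string 2 zone)))
        (mm (string-to-number (match-string 3 zone))))
    ;; (hh * 60 + mm) minutes, converted to seconds.
    (* sign 60 (+ (* hh 60) mm))))

(my-rfc5322-zone-to-seconds "-0200")  ; => -7200
(my-rfc5322-zone-to-seconds "+0230")  ; => 9000
```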

					-- Bob


[-- Attachment #2: Type: text/x-patch, Size: 4132 bytes --]

--- rfc2822-date.text	2021-12-30 00:15:38.588023882 -0500
+++ rfc5322-date.text	2021-12-29 23:41:39.492629354 -0500
@@ -1,15 +1,15 @@
 
 3.3. Date and Time Specification
 
- Date and time occur in several header fields. This section
+ Date and time values occur in several header fields. This section
  specifies the syntax for a full date and time specification. Though
  folding white space is permitted throughout the date-time
  specification, it is RECOMMENDED that a single space be used in each
  place that FWS appears (whether it is required or optional); some older
- implementations may not interpret other occurrences of folding white
+ implementations will not interpret longer sequences of folding white
  space correctly.
 
- date-time = [ day-of-week "," ] date FWS time [CFWS]
+ date-time = [ day-of-week "," ] date time [CFWS]
 
  day-of-week = ([FWS] day-name) / obs-day-of-week
 
@@ -18,17 +18,15 @@
 
  date = day month year
 
- day = ([FWS] 1*2DIGIT) / obs-day
+ day = ([FWS] 1*2DIGIT FWS) / obs-day
 
- month = (FWS month-name FWS) / obs-month
-
- month-name = "Jan" / "Feb" / "Mar" / "Apr" /
+ month = "Jan" / "Feb" / "Mar" / "Apr" /
  "May" / "Jun" / "Jul" / "Aug" /
  "Sep" / "Oct" / "Nov" / "Dec"
 
- year = 4*DIGIT / obs-year
+ year = (FWS 4*DIGIT FWS) / obs-year
 
- time = time-of-day FWS zone
+ time = time-of-day zone
 
  time-of-day = hour ":" minute [ ":" second ]
 
@@ -38,7 +36,7 @@
 
  second = 2DIGIT / obs-second
 
- zone = (( "+" / "-" ) 4DIGIT) / obs-zone
+ zone = (FWS ( "+" / "-" ) 4DIGIT) / obs-zone
 
  The day is the numeric day of the month. The year is any numeric year
  1900 or later.
@@ -54,28 +52,27 @@
  day is ahead of (i.e., east of) or behind (i.e., west of) Universal
  Time. The first two digits indicate the number of hours difference from
  Universal Time, and the last two digits indicate the number of
- minutes difference from Universal Time. (Hence, +hhmm means
+ additional minutes difference from Universal Time. (Hence, +hhmm means
  +(hh * 60 + mm) minutes, and -hhmm means -(hh * 60 + mm) minutes). The
  form "+0000" SHOULD be used to indicate a time zone at Universal
  Time. Though "-0000" also indicates Universal Time, it is used to
  indicate that the time was generated on a system that may be in a local
- time zone other than Universal Time and therefore indicates that the
- date-time contains no
+ time zone other than Universal Time and that the date-time contains no
  information about the local time zone.
 
  A date-time specification MUST be semantically valid. That is, the
- day-of-the-week (if included) MUST be the day implied by the date, the
+ day-of-week (if included) MUST be the day implied by the date, the
  numeric day-of-month MUST be between 1 and the number of days allowed
  for the specified month (in the specified year), the time-of-day MUST
  be in the range 00:00:00 through 23:59:60 (the number of seconds
- allowing for a leap second; see [STD12]), and the zone MUST be within
- the range -9959 through +9959.
+ allowing for a leap second; see [RFC1305]), and the last two digits of
+ the zone MUST be within the range 00 through 59.
 
 4.3. Obsolete Date and Time
 
  The syntax for the obsolete date format allows a 2 digit year in the
- date field and allows for a list of alphabetic time zone specifications
- that were used in earlier versions of this standard. It also
+ date field and allows for a list of alphabetic time zone specifiers
+ that were used in earlier versions of this specification. It also
  permits comments and folding white space between many of the tokens.
 
  obs-day-of-week = [CFWS] day-name [CFWS]
@@ -138,3 +135,8 @@
  and "Z" is equivalent to "+0000". However, because of the error in
  [RFC0822], they SHOULD all be considered equivalent to "-0000" unless
  there is out-of-band information confirming their meaning.
+
+ Other multi-character (usually between 3 and 5) alphabetic time zones
+ have been used in Internet messages. Any such time zone whose meaning
+ is not known SHOULD be considered equivalent to "-0000" unless there is
+ out-of-band information confirming their meaning.


* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-29 15:19                         ` Lars Ingebrigtsen
  2021-12-29 19:29                           ` Paul Eggert
@ 2021-12-30 21:08                           ` Bob Rogers
  2022-01-01 14:47                             ` Lars Ingebrigtsen
  1 sibling, 1 reply; 40+ messages in thread
From: Bob Rogers @ 2021-12-30 21:08 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 52209, Paul Eggert

   From: Lars Ingebrigtsen <larsi@gnus.org>
   Date: Wed, 29 Dec 2021 16:19:03 +0100

   Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

   . . .

   > +(cl-defmethod parse-date (time-string (_format (eql iso-8601)))

   By the way, this should be

   (cl-defmethod parse-date (time-string (_format (eql 'iso-8601)))

   now -- we're transitioning to eval-ing the eql specifier.

Thanks for the heads-up; now done.

					-- Bob






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2021-12-30 21:08                           ` Bob Rogers
@ 2022-01-01 14:47                             ` Lars Ingebrigtsen
  2022-01-01 14:56                               ` Andreas Schwab
  0 siblings, 1 reply; 40+ messages in thread
From: Lars Ingebrigtsen @ 2022-01-01 14:47 UTC (permalink / raw)
  To: Bob Rogers; +Cc: 52209, Paul Eggert

I wonder whether we should look at this another way.  We currently have
two built-in date parsing functions in Emacs: `iso8601-parse' and
`parse-time-string', and both parse strings according to well-defined
standards (ISO8601 and RFC822bis, respectively).  (But the latter's doc
string didn't explicitly say so, so people thought it was a DWIM
parser.)

DWIM date parsing is impossible, though, because there's an infinite
variety of date formats out there, and variants are ambiguous.  And
adding an infinite number of date parsers to Emacs doesn't seem
attractive.

So how about just adding something that makes parsing common date
formats easier, but without being DWIM or being hard-coded.  Like:

(parse-time "%Y/%m/%d" "2021/01/01")
=> (nil nil nil 01 01 2021)

or something.

It could be regexp-ey

(parse-time "%Y.*%m.*%d" "2021  01-01")

and basically accept the same things that format-time-string accepts,
like:

(with-locale-environment "fr_FR"
  (parse-time "%d +%h" "5 août"))
=> (nil nil nil 5 8 nil)

I think that'd be more generally useful.
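
A rough sketch of how such a function could work, under several
assumptions: the name my-parse-time is hypothetical, only the numeric
directives %Y %m %d %H %M %S are handled (no locale-dependent names
such as %h), and everything outside a directive is passed through as
regexp text.

```elisp
(require 'cl-lib)

;; Each entry is (CHAR REGEXP . SLOT), where SLOT is the index in the
;; (SEC MIN HOUR DAY MONTH YEAR) result list.
(defconst my-parse-time-specs
  '((?Y "\\([0-9]\\{4\\}\\)" . 5)
    (?m "\\([0-9]\\{1,2\\}\\)" . 4)
    (?d "\\([0-9]\\{1,2\\}\\)" . 3)
    (?H "\\([0-9]\\{1,2\\}\\)" . 2)
    (?M "\\([0-9]\\{1,2\\}\\)" . 1)
    (?S "\\([0-9]\\{1,2\\}\\)" . 0)))

(defun my-parse-time (format string)
  "Parse STRING according to FORMAT; return (SEC MIN HOUR DAY MONTH YEAR).
Slots that FORMAT does not mention are nil."
  (let ((regexp "") (slots nil) (i 0) (result (make-list 6 nil)))
    ;; Translate FORMAT into a regexp, remembering which slot each
    ;; capture group belongs to.
    (while (< i (length format))
      (let ((ch (aref format i)))
        (if (and (eq ch ?%) (< (1+ i) (length format)))
            (let ((spec (assq (aref format (1+ i)) my-parse-time-specs)))
              (unless spec
                (error "Unsupported directive: %%%c" (aref format (1+ i))))
              (setq regexp (concat regexp (cadr spec)))
              (push (cddr spec) slots)
              (setq i (+ i 2)))
          (setq regexp (concat regexp (char-to-string ch)))
          (setq i (1+ i)))))
    (setq slots (nreverse slots))
    (unless (string-match (concat "\\`" regexp "\\'") string)
      (error "String %S does not match format %S" string format))
    (cl-loop for slot in slots
             for group from 1
             do (setf (nth slot result)
                      (string-to-number (match-string group string))))
    result))

(my-parse-time "%Y/%m/%d" "2021/01/01")
;; => (nil nil nil 1 1 2021)
(my-parse-time "%Y[ /-]+%m[ /-]+%d" "2021  01-01")
;; => (nil nil nil 1 1 2021)
```

The second call uses an explicit separator class rather than ".*"
because a greedy ".*" next to a variable-width digit group can swallow
digits that belong to the following field.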

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no







* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2022-01-01 14:47                             ` Lars Ingebrigtsen
@ 2022-01-01 14:56                               ` Andreas Schwab
  2022-01-02  0:41                                 ` Bob Rogers
  0 siblings, 1 reply; 40+ messages in thread
From: Andreas Schwab @ 2022-01-01 14:56 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Bob Rogers, 52209, Paul Eggert

On Jan 01 2022, Lars Ingebrigtsen wrote:

> So how about just adding something that makes parsing common date
> formats easier, but without being DWIM or being hard-coded.  Like:
>
> (parse-time "%Y/%m/%d" "2021/01/01")
> => (nil nil nil 01 01 2021)

Aka strptime.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2022-01-01 14:56                               ` Andreas Schwab
@ 2022-01-02  0:41                                 ` Bob Rogers
  2022-01-03 11:34                                   ` Lars Ingebrigtsen
  0 siblings, 1 reply; 40+ messages in thread
From: Bob Rogers @ 2022-01-02  0:41 UTC (permalink / raw)
  To: Lars Ingebrigtsen, Andreas Schwab; +Cc: 52209, Paul Eggert

   From: Lars Ingebrigtsen <larsi@gnus.org>
   Date: Sat, 01 Jan 2022 15:47:05 +0100

   I wonder whether we should look at this another way.  We currently have
   two built-in date parsing functions in Emacs: `iso8601-parse' and
   `parse-time-string', and both parse strings according to well-defined
   standards (ISO8601 and RFC822bis, respectively).  (But the latter's doc
   string didn't explicitly say so, so people thought it was a DWIM
   parser.)

   DWIM date parsing is impossible, though, because there's an infinite
   variety of date formats out there, and variants are ambiguous.  And
   adding an infinite number of date parsers to Emacs doesn't seem
   attractive.

After perusing [1], I had started to think in terms of just three basic
formats:  dmy (formerly euro-date), ymd, and mdy (formerly us-date),
plus possibly adding "." as a date separator.  That doesn't cover
everything but ought to broaden the set to make most of the world happy,
especially if I add a few hacks I have in mind to broaden recognition of
four-digit years and alphabetic months.  The rest I think could be left
to "patches welcome."

   And in that context, it may make more sense to say, "Use the original
parse-time-string if you know you have email dates, or iso8601-parse if
you have dates that conform to ISO-8601," rather than having parse-date
handle them itself.

   So how about just adding something that makes parsing common date
   formats easier, but without being DWIM or being hard-coded . . .

   I think that'd be more generally useful.

Perhaps, but I see that as a different problem:  One where you have a
date or set of dates in a precise format and just need to knock them
out.  I was trying to solve the problem where you have date(s) for which
you only know the general origin (e.g. North America) and don't know whether
they are numeric, alphabetic, or how precise, and just want the parser
to do the best it can, and signal a reasonably informative error rather
than return an incorrect result.

   ================
   From: Andreas Schwab <schwab@linux-m68k.org>
   Date: Sat, 01 Jan 2022 15:56:37 +0100

   On Jan 01 2022, Lars Ingebrigtsen wrote:

   > (parse-time "%Y/%m/%d" "2021/01/01")
   > => (nil nil nil 01 01 2021)

   Aka strptime.

Oh, you're talking about the POSIX strptime, not the Perl Date::Parse
strptime, which is free-form.  Not being a C programmer, I was not aware
of the POSIX version.  But now I know where the odd name came from.  ;-}

					-- Bob

[1]  https://en.wikipedia.org/wiki/Date_format_by_country






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2022-01-02  0:41                                 ` Bob Rogers
@ 2022-01-03 11:34                                   ` Lars Ingebrigtsen
  2022-01-04  4:45                                     ` Bob Rogers
  0 siblings, 1 reply; 40+ messages in thread
From: Lars Ingebrigtsen @ 2022-01-03 11:34 UTC (permalink / raw)
  To: Bob Rogers; +Cc: 52209, Andreas Schwab, Paul Eggert

Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

>    And in that context, it may make more sense to say, "Use the original
> parse-time-string if you know you have email dates, or iso8601-parse if
> you have dates that conform to ISO-8601," rather than having parse-date
> handle them itself.

Yeah.  And rename `parse-time-string' to something less confusing.

>    So how about just adding something that makes parsing common date
>    formats easier, but without being DWIM or being hard-coded . . .
>
>    I think that'd be more generally useful.
>
> Perhaps, but I see that as a different problem:  One where you have a
> date or set of dates in a precise format and just need to knock them
> out.  I was trying to solve the problem where you have date(s) that you
> only know the general origin (e.g. North America) and don't know whether
> they are numeric, alphabetic, or how precise, and just want the parser
> to do the best it can, and signal a reasonably informative error rather
> than return an incorrect result.

Yes, I think a function like that would be welcomed by many...  but
would then lead to an endless series of patches as it'd be extended
because it doesn't work correctly on dates from, say, Iceland.  That is,
a DWIM function would never be finished.

>    On Jan 01 2022, Lars Ingebrigtsen wrote:
>
>    > (parse-time "%Y/%m/%d" "2021/01/01")
>    > => (nil nil nil 01 01 2021)
>
>    Aka strptime.
>
> Oh, you're talking about the POSIX strptime, not the Perl Date::Parse
> strptime, which is free-form.  Not being a C programmer, I was not aware
> of the POSIX version.  But now I know where the odd name came from.  ;-}

POSIX strptime isn't very useful, because if you know the format that
precisely, you might as well just write a regexp for it yourself.  But
something like that, but with more sloppiness (i.e., allowing regexp
matching for the non-time bits) might be useful.  (And I think if we had
that, then implementing DWIM-ish parsing of, say, US dates on top of
that would be a matter of writing a series of these strings to match
them.  Probably.)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2022-01-03 11:34                                   ` Lars Ingebrigtsen
@ 2022-01-04  4:45                                     ` Bob Rogers
  2022-01-05 15:46                                       ` Lars Ingebrigtsen
  0 siblings, 1 reply; 40+ messages in thread
From: Bob Rogers @ 2022-01-04  4:45 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 52209, Andreas Schwab, Paul Eggert

   From: Lars Ingebrigtsen <larsi@gnus.org>
   Date: Mon, 03 Jan 2022 12:34:33 +0100

   Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

   >    And in that context, it may make more sense to say, "Use the original
   > parse-time-string if you know you have email dates, or iso8601-parse if
   > you have dates that conform to ISO-8601," rather than having parse-date
   > handle them itself.

   Yeah.  And rename `parse-time-string' to something less confusing.

I would certainly have found that helpful.

   FWIW, I did some grepping of the elisp sources to count the callers
of parse-time-string to see how much trouble it would be to rename
(there are around 60 of them), and found that ietf-drums-parse-date is
just encode-time of parse-time-string.  Since ietf-drums.el declares
itself "Functions for parsing RFC 2822 headers," perhaps
parse-date--x822 should find a new home as ietf-drums-parse-date-string,
and parse-time-string could then be made obsolescent in its favor.
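
The composition in question is short enough to show.  This is a sketch
under a hypothetical name, not the ietf-drums.el source, and it assumes
Emacs 27 or later, where encode-time accepts a single decoded-time list:

```elisp
(require 'parse-time)

(defun my-email-date-to-time (string)
  "Return an Emacs time value for the email-style date STRING."
  (encode-time (parse-time-string string)))

(format-time-string "%Y-%m-%d %H:%M:%S"
                    (my-email-date-to-time
                     "Sat, 12 Sep 1998 12:21:54 -0200")
                    t)
;; => "1998-09-12 14:21:54"
```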

   >    So how about just adding something that makes parsing common date
   >    formats easier, but without being DWIM or being hard-coded . . .
   >
   >    I think that'd be more generally useful.
   >
   > Perhaps, but I see that as a different problem:  One where you have a
   > date or set of dates in a precise format and just need to knock them
   > out.  I was trying to solve the problem where you have date(s) that you
   > only know the general origin (e.g. North America) and don't know whether
   > they are numeric, alphabetic, or how precise, and just want the parser
   > to do the best it can, and signal a reasonably informative error rather
   > than return an incorrect result.

   Yes, I think a function like that would be welcomed by many...  but
   would then lead to an endless series of patches as it'd be extended
   because it doesn't work correctly on dates from, say, Iceland.  That
   is, a DWIM function would never be finished.

But then, as I think someone on the list might have said very recently,
neither is Emacs.  ;-}

   POSIX strptime isn't very useful, because if you know the format that
   precisely, you might as well just write a regexp for it yourself.

Agreed.  And even writing and debugging regexps can often be less than
straightforward.  What you are suggesting is effectively expanding the
set of metacharacters with percent-escapes, which could make it easier,
or could make it worse.

   But something like that, but with more sloppiness (i.e., allowing
   regexp matching for the non-time bits) might be useful.

One thing regexps can't do (at least not without adding a fair bit of
complexity) is allow components to be in different order or omitted.  So
it still just takes one approximate date/time, and the caller is back to
writing regexps to validate before passing it to the "real" parser.

   I was thinking that the next dimension in which to extend parse-date
would be to add keywords to refine what is accepted, on top of the basic
MDY order, e.g.:

	:date-separators "-/"
	:time-separators ":."
	:two-numbers-are :month-year	;; or (e.g.) :day-month
	:timezone :required	;; could be :optional or :forbidden
	:timezone-has-colon t	;; RFC5322 forbids, ISO-8601 requires

Some keywords could even be regexp-valued.  Others could be "umbrella"
keywords that change the defaults for subsets of more specific keywords.
In any case, that should make patching to add new features easier and
(eventually) allow for much more fine tuning by callers.
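
As a sketch of the shape this could take, here is a toy MDY parser
that honors only :date-separators.  The function name is hypothetical
and the other keywords listed above are left out entirely:

```elisp
(require 'cl-lib)

(cl-defun my-parse-mdy (string &key (date-separators "-/"))
  "Parse STRING as month, day, year; return a decoded-time-style list.
Fields are split on any single character in DATE-SEPARATORS."
  (let* ((sep (regexp-opt (mapcar #'char-to-string date-separators)))
         (parts (split-string string sep)))
    (unless (and (= (length parts) 3)
                 (cl-every (lambda (part)
                             (string-match-p "\\`[0-9]+\\'" part))
                           parts))
      (error "Cannot parse %S as month/day/year" string))
    (cl-destructuring-bind (month day year)
        (mapcar #'string-to-number parts)
      ;; Time-of-day, weekday and zone are unknown, so they stay nil.
      (list nil nil nil day month year nil -1 nil))))

(my-parse-mdy "10/22/2021")
;; => (nil nil nil 22 10 2021 nil -1 nil)
(my-parse-mdy "10.22.2021" :date-separators ".")
;; => (nil nil nil 22 10 2021 nil -1 nil)
```

With the default separators, "10.22.2021" signals an error instead of
being guessed at.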

   (And I think if we had that, then implementing DWIM-ish parsing of,
   say, US dates on top of that would be a matter of writing a series of
   these strings to match them.  Probably.)

If I understand you correctly, this parse-date-DWIMishly would go
through the string and recognize (say) that it had come to something
that matches "%M/%d/%Y", concatenate that to a strptime-like format
string it was building, and then call parse-date-strptime-style (or
whatever) with that and the original string.  But it seems to me that if
it could recognize that it had found "%M/%d/%Y" in the string, it would
be much easier to just fill in the month, day, and year right then.

					-- Bob






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2022-01-04  4:45                                     ` Bob Rogers
@ 2022-01-05 15:46                                       ` Lars Ingebrigtsen
  2022-01-05 22:49                                         ` Bob Rogers
  0 siblings, 1 reply; 40+ messages in thread
From: Lars Ingebrigtsen @ 2022-01-05 15:46 UTC (permalink / raw)
  To: Bob Rogers; +Cc: 52209, Andreas Schwab, Paul Eggert

Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

> One thing regexps can't do (at least not without adding a fair bit of
> complexity) is allow components to be in different order or omitted.  So
> it still just takes one approximate date/time, and the caller is back to
> writing regexps to validate before passing it to the "real" parser.

Yes.  But you can't just have a function that you can give any string to
and it'll tell you what the date contained in it is.  "1.2" is a
perfectly normal way to specify "January second" in some countries, but no
amount of general DWIM is going to take us there.  The caller has to say
what they expect the format to be.

>    I was thinking that the next dimension in which to extend parse-date
> would be to add keywords to refine what is accepted, on top of the basic
> MDY order, e.g.:
>
> 	:date-separators "-/"
> 	:time-separators ":."
> 	:two-numbers-are :month-year	;; or (e.g.) :day-month
> 	:timezone :required	;; could be :optional or :forbidden
> 	:timezone-has-colon t	;; RFC5322 forbids, ISO-8601 requires
>
> Some keywords could even be regexp-valued.  Others could be "umbrella"
> keywords that change the defaults for subsets of more specific keywords.
> In any case, that should make patching to add new features easier and
> (eventually) allow for much more fine tuning by callers.

I think it'd be easier to just write a regexp than to use a date-parsing
function like that.  😀

>    (And I think if we had that, then implementing DWIM-ish parsing of,
>    say, US dates on top of that would be a matter of writing a series of
>    these strings to match them.  Probably.)
>
> If I understand you correctly, this parse-date-DWIMishly would go
> through the string and recognize (say) that it had come to something
> that matches "%M/%d/%Y", concatenate that to a strptime-like format
> string it was building, and then call parse-date-strptime-style (or
> whatever) with that and the original string.  But it seems to me that if
> it could recognize that it had found "%M/%d/%Y" in the string, it would
> be much easier to just fill in the month, day, and year right then.

Well, I was thinking more like looping over a common set of formats and
seeing whether we have a match.  For the US, looping over "%M.*%d.*%Y?",
"%M.*%b.*%Y?" and "%M.*%B.*%Y?" would probably cover most of the
American-language dates.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2022-01-05 15:46                                       ` Lars Ingebrigtsen
@ 2022-01-05 22:49                                         ` Bob Rogers
       [not found]                                           ` <25105.33397.961104.269676@orion.rgrjr.com>
  0 siblings, 1 reply; 40+ messages in thread
From: Bob Rogers @ 2022-01-05 22:49 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 52209, Andreas Schwab, Paul Eggert

   From: Lars Ingebrigtsen <larsi@gnus.org>
   Date: Wed, 05 Jan 2022 16:46:01 +0100

   Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

   > One thing regexps can't do (at least not without adding a fair bit of
   > complexity) is allow components to be in different order or omitted.  So
   > it still just takes one approximate date/time, and the caller is back to
   > writing regexps to validate before passing it to the "real" parser.

   Yes.  But you can't just have a function that you can give any string to
   and it'll tell you what the date contained in it is.  "1.2" is a
   perfectly normal way to specify "January second" in some countries, but no
   amount of general DWIM is going to take us there.  The caller has to say
   what they expect the format to be.

Granted, but (as with any API) how much they need to say can be greatly
reduced by suitable defaults.

   >    I was thinking that the next dimension in which to extend parse-date
   > would be to add keywords to refine what is accepted, on top of the basic
   > MDY order, e.g.:
   >
   > 	:date-separators "-/"
   > 	:time-separators ":."
   > 	:two-numbers-are :month-year	;; or (e.g.) :day-month
   > 	:timezone :required	;; could be :optional or :forbidden
   > 	:timezone-has-colon t	;; RFC5322 forbids, ISO-8601 requires
   >
   > Some keywords could even be regexp-valued.  Others could be "umbrella"
   > keywords that change the defaults for subsets of more specific keywords.
   > In any case, that should make patching to add new features easier and
   > (eventually) allow for much more fine tuning by callers.

   I think it'd be easier to just write a regexp than to use a date-parsing
   function like that.  😀

Again, suitable defaults should take care of that.  (And I'm beginning
to suspect you're better at writing regexps than I am.  ;-)

   > . . .

   Well, I was thinking more like looping over a common set of formats
   and seeing whether we have a match.  For the US, looping over
   "%M.*%d.*%Y?", "%M.*%b.*%Y?" and "%M.*%B.*%Y?" would probably cover
   most of the American-language dates.

Except that (according to "man strptime" on my system), "%M" is the
descriptor for minute, which rather makes the point that composing these
is not straightforward.  It also occurs to me that using ".*" could be
dangerous if it matches into the time or timezone fields.
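
Both points can be illustrated from within Emacs.  The sketch below
assumes that format-time-string follows the same percent-escape
conventions as strftime/strptime for %m and %M, and uses a plain Emacs
regexp to stand in for the ".*" in the proposed formats; the function
names are made up for this example.

```elisp
(defun sketch-month-vs-minute ()
  "Format one fixed UTC time with %m and with %M in the first field."
  ;; 2016-02-22 19:35:42 UTC, as (SEC MIN HOUR DAY MON YEAR _ DST ZONE).
  (let ((time (encode-time '(42 35 19 22 2 2016 nil -1 0))))
    (list (format-time-string "%m/%d/%Y" time t)   ; %m is the month
          (format-time-string "%M/%d/%Y" time t)))) ; %M is the minute

(sketch-month-vs-minute)
;; => ("02/22/2016" "35/22/2016")

(defun sketch-greedy-fields (string)
  "Return the three digit groups matched with greedy \".*\" between them."
  (when (string-match "\\([0-9]+\\).*\\([0-9]+\\).*\\([0-9]+\\)" string)
    (list (match-string 1 string)
          (match-string 2 string)
          (match-string 3 string))))

;; The greedy ".*" runs past the date and into the time field:
(sketch-greedy-fields "02/22/2016 19:35:42")
;; => ("02" "4" "2")
```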

   And (also based on my reading of "man strptime") you wouldn't need to
specify "%b" and "%B" separately, as they are treated equivalently, but
if you wanted to be DWIMmy about two-digit years, you'd have to cover
"%y" as well as "%Y":

     %m[-/]%d[-/]%y
     %m[-/]%d[-/]%Y
     %m[-/]%b[-/]%y
     %m[-/]%b[-/]%Y

This does not strike me as an improvement.

   In any case, I would like to bring parse-date.el to completion soon,
so here is what I plan to do:

   1.  Drop ISO-8601 parsing, and point the documentation to
iso8601-parse.

   2.  Drop email date parsing and use the code to create a patch that
updates ietf-drums.el, which could perhaps start the process to replace
parse-time-string.

   3.  Restrict parse-date formats to mdy, dmy, and ymd, with some extra
heuristics for four-digit years and alphanumeric months, then call it a
day.

   If you think the resulting parse-date is worth the trouble, then it
can become part of Emacs; if not, then I will offer it to ELPA.  Either
way, parse-date will be off my plate.  But I don't think I will take up
the mantle of writing a strptime-like date parser, as I don't think it
will be very useful.

					-- Bob






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
       [not found]                                           ` <25105.33397.961104.269676@orion.rgrjr.com>
@ 2022-02-20 12:25                                             ` Lars Ingebrigtsen
  2022-02-20 13:03                                             ` Andreas Schwab
  1 sibling, 0 replies; 40+ messages in thread
From: Lars Ingebrigtsen @ 2022-02-20 12:25 UTC (permalink / raw)
  To: 52209

(Resending because bug report was archived.)

Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

> Here's what I have for this phase of the plan; let me know what you
> think.  It took longer than expected because it became a project unto
> itself, and so my procrastinator kicked in, making it longer still.  :-/

:-)

Have you benchmarked your new implementation versus the current one?
It's important that the parsing is performant, otherwise it'd slow down
many things that parse a large number of date strings.

Some minor comments about the code:

> +(defsubst ietf-drums-date--ignore-char? (char)
> +  ;; Ignore whitespace and commas.
> +  (or (eq char ?\ ) (eq char ?\t) (eq char ?\r) (eq char ?\n) (eq char ?,)))

In Emacs Lisp, we don't use Scheme-style predicate names -- we use -p
instead.

> +(defun ietf-drums-date--tokenize-string (string &optional comment-eof?)

And the same with booleans -- we don't use foo? for those, but just foo.
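
A minimal illustration of both conventions, using hypothetical names
that are not part of the patch: the predicate ends in -p, and the
boolean argument is a plain name that the docstring describes as
"non-nil".

```elisp
(defsubst sketch-date--digit-char-p (char)
  "Return non-nil if CHAR is an ASCII digit."
  (<= ?0 char ?9))

(defun sketch-date--count-digits (string &optional stop-at-space)
  "Count the ASCII digits in STRING.
If STOP-AT-SPACE is non-nil, stop counting at the first space."
  (let ((count 0)
        (index 0)
        (end (length string)))
    (while (and (< index end)
                (not (and stop-at-space
                          (eq (aref string index) ?\s))))
      (when (sketch-date--digit-char-p (aref string index))
        (setq count (1+ count)))
      (setq index (1+ index)))
    count))

(sketch-date--count-digits "22 Feb 2016")     ;; => 6
(sketch-date--count-digits "22 Feb 2016" t)   ;; => 2
```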

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
       [not found]                                           ` <25105.33397.961104.269676@orion.rgrjr.com>
  2022-02-20 12:25                                             ` Lars Ingebrigtsen
@ 2022-02-20 13:03                                             ` Andreas Schwab
       [not found]                                               ` <87ilt9vicd.fsf@gnus.org>
  1 sibling, 1 reply; 40+ messages in thread
From: Andreas Schwab @ 2022-02-20 13:03 UTC (permalink / raw)
  To: Bob Rogers; +Cc: Lars Ingebrigtsen, 52209, Paul Eggert

On Feb 19 2022, Bob Rogers wrote:

> +  (or (eq char ?\ ) (eq char ?\t) (eq char ?\r) (eq char ?\n) (eq char ?,)))

     (memq char '(?\s ?\t ?\r ?\n ?,))
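
The two forms should accept exactly the same characters; a quick way
to check that over the ASCII range is sketched below.  The function
names are made up for the comparison.  One difference that does not
matter for a predicate: memq returns the matching tail of the list
rather than t.

```elisp
(defun sketch-ignore-char-or-p (char)
  "Return non-nil if CHAR is whitespace or a comma (chain of `eq' tests)."
  (or (eq char ?\s) (eq char ?\t) (eq char ?\r) (eq char ?\n) (eq char ?,)))

(defun sketch-ignore-char-memq-p (char)
  "Return non-nil if CHAR is whitespace or a comma (single `memq' test)."
  (memq char '(?\s ?\t ?\r ?\n ?,)))

(defun sketch-ignore-char-forms-agree-p ()
  "Return t if both predicates agree on every ASCII character."
  (let ((agree t))
    (dotimes (char 128)
      ;; Compare truthiness only, since `memq' returns a list tail.
      (unless (eq (not (sketch-ignore-char-or-p char))
                  (not (sketch-ignore-char-memq-p char)))
        (setq agree nil)))
    agree))

(sketch-ignore-char-forms-agree-p)   ;; => t
(sketch-ignore-char-memq-p ?,)       ;; => (44)
```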

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
       [not found]                                               ` <87ilt9vicd.fsf@gnus.org>
@ 2022-02-20 22:14                                                 ` Bob Rogers
  2022-02-23 23:15                                                   ` Bob Rogers
  0 siblings, 1 reply; 40+ messages in thread
From: Bob Rogers @ 2022-02-20 22:14 UTC (permalink / raw)
  To: Lars Ingebrigtsen, Andreas Schwab; +Cc: 52209, Paul Eggert

   From: Lars Ingebrigtsen <larsi@gnus.org>
   Date: Sun, 20 Feb 2022 13:21:54 +0100

   Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

   > Here's what I have for this phase of the plan; let me know what you
   > think.  It took longer than expected because it became a project unto
   > itself, and so my procrastinator kicked in, making it longer still.  :-/

   :-)

   Have you benchmarked your new implementation versus the current one?
   It's important that the parsing is performant, otherwise it'd slow down
   many things that parse a large number of date strings.

No benchmarking; I will do that presently.

   Some minor comments about the code:

   > +(defsubst ietf-drums-date--ignore-char? (char) . . .

   In Emacs Lisp, we don't use Scheme-style predicate names -- we use -p
   instead . . .

   And the same with booleans -- we don't use foo? for those, but just foo.

OK.

   ================
   From: Andreas Schwab <schwab@linux-m68k.org>
   Date: Sun, 20 Feb 2022 14:03:55 +0100

   On Feb 19 2022, Bob Rogers wrote:

   > +  (or (eq char ?\ ) (eq char ?\t) (eq char ?\r) (eq char ?\n) (eq char ?,)))

	(memq char '(?\s ?\t ?\r ?\n ?,))

Good eye; I had been adding to that set incrementally, so I missed the
forest for the trees.

					-- Bob






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2022-02-20 22:14                                                 ` Bob Rogers
@ 2022-02-23 23:15                                                   ` Bob Rogers
  2022-02-24  9:19                                                     ` Lars Ingebrigtsen
  0 siblings, 1 reply; 40+ messages in thread
From: Bob Rogers @ 2022-02-23 23:15 UTC (permalink / raw)
  To: Lars Ingebrigtsen, Andreas Schwab, 52209, Paul Eggert

[-- Attachment #1: message body text --]
[-- Type: text/plain, Size: 1218 bytes --]

   From: Bob Rogers <rogers-emacs@rgrjr.homedns.org>
   Date: Sun, 20 Feb 2022 17:14:36 -0500

      From: Lars Ingebrigtsen <larsi@gnus.org>
      Date: Sun, 20 Feb 2022 13:21:54 +0100

      Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

      > Here's what I have for this phase of the plan; let me know what you
      > think.  It took longer than expected because it became a project unto
      > itself, and so my procrastinator kicked in, making it longer still.  :-/

      :-)

      Have you benchmarked your new implementation versus the current one?
      It's important that the parsing is performant, otherwise it'd slow down
      many things that parse a large number of date strings.

   No benchmarking; I will do that presently.

Benchmarking code and results attached.  I extracted a handful of
non-error cases from the tests as being more representative than any of
the error cases; the resulting numbers make it seem like any difference
between the two implementations is in the noise.

   But this is my first foray into elisp benchmarking, so I may have
overlooked something.  Fortunately, email dates are not that diverse, so
I am hoping this sampling may be broad enough.
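
For reference, a like-for-like variant of the comparison might look
like the sketch below, which runs both parsers through the same macro
with the same repetition count.  The sketch-* names are hypothetical,
the sample list is a subset of the cases in the attached file, and the
repetition count is kept small so that it runs quickly; whether the
parsers themselves are byte-compiled will also affect the numbers.

```elisp
(require 'benchmark)
(require 'parse-time)
(require 'ietf-drums)

(defconst sketch-sample-dates
  '("Mon, 22 Feb 2016 19:35:42 +0100"
    "Monday, 22 february 2016 19:35:42 PST"
    "Friday, 21 Sep 2018 13:47:58 PDT")
  "A few of the sample dates from the timing file.")

(defun sketch-old-parse-date (string)
  "Parse STRING the old way, via `parse-time-string'."
  (encode-time (parse-time-string string)))

(defun sketch-time-parser (parse-fn)
  "Return the elapsed seconds for 100 passes of PARSE-FN over the samples."
  ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
  (car (benchmark-run 100
         (dolist (date sketch-sample-dates)
           (funcall parse-fn date)))))

;; Same macro, same repetition count, same inputs for both parsers.
(list (sketch-time-parser #'ietf-drums-parse-date)
      (sketch-time-parser #'sketch-old-parse-date))
```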

					-- Bob


[-- Attachment #2: ietf-drums-date-timings.el --]
[-- Type: text/x-emacs-lisp, Size: 1938 bytes --]

;;; ietf-drums-date-timings.el --- timing ietf-drums-date.el -*- lexical-binding: t -*-

;; Copyright (C) 2022 Free Software Foundation, Inc.

;; Author: Bob Rogers <rogers@rgrjr.com>

(defun run-timings (parse-fn)
  (dolist (case '(("Mon, 22 Feb 2016 19:35:42 +0100"
                   (42 35 19 22 2 2016 1 -1 3600)
                   (22219 21758))
                  ("22 Feb 2016 19:35:42 +0100"
                   (42 35 19 22 2 2016 nil -1 3600)
                   (22219 21758))
                  ("Mon, 22 February 2016 19:35:42 +0100"
                   (42 35 19 22 2 2016 1 -1 3600)
                   (22219 21758))
                  ("Mon, 22 feb 2016 19:35:42 +0100"
                   (42 35 19 22 2 2016 1 -1 3600)
                   (22219 21758))
                  ("Monday, 22 february 2016 19:35:42 +0100"
                   (42 35 19 22 2 2016 1 -1 3600)
                   (22219 21758))
                  ("Monday, 22 february 2016 19:35:42 PST"
                   (42 35 19 22 2 2016 1 nil -28800)
                   (22219 54158))
                  ("Friday, 21 Sep 2018 13:47:58 PDT"
                   (58 47 13 21 9 2018 5 t -25200)
                   (23461 22782))
                  ("Friday, 21 Sep 2018 13:47:58"
                   (58 47 13 21 9 2018 5 -1 nil)
                   (23461 11982))))
    (funcall parse-fn (car case))))

(benchmark-run-compiled 10000 (run-timings #'ietf-drums-parse-date))
;; (7.220905228 83 3.3420971879999968)
;; (7.24936647 83 3.3321491059999993)
;; (7.3240701370000005 84 3.371737411)
;; (/ (+ 7.221 7.249 7.324) 3) 7.265

(defun ietf-drums-old-parse-date (string)
  "Return an Emacs time spec from STRING."
  (encode-time (parse-time-string string)))

(benchmark-run 10000 (run-timings #'ietf-drums-old-parse-date))
;; (7.249068317 83 3.3251401939999994)
;; (7.317397244 84 3.3750772899999983)
;; (7.268244294 84 3.3820036280000005)
;; (/ (+ 7.249 7.317 7.268) 3) 7.278


* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2022-02-23 23:15                                                   ` Bob Rogers
@ 2022-02-24  9:19                                                     ` Lars Ingebrigtsen
  2022-02-25  0:49                                                       ` Bob Rogers
  0 siblings, 1 reply; 40+ messages in thread
From: Lars Ingebrigtsen @ 2022-02-24  9:19 UTC (permalink / raw)
  To: Bob Rogers; +Cc: 52209, Andreas Schwab, Paul Eggert

Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

> Benchmarking code and results attached.  I extracted a handful of
> non-error cases from the tests as being more representative than any of
> the error cases; the resulting numbers make it seem like any difference
> between the two implementations is in the noise.
>
>    But this is my first foray into elisp benchmarking, so I may have
> overlooked something.  Fortunately, email dates are not that diverse, so
> I am hoping this sampling may be broad enough.

[...]

> (benchmark-run-compiled 10000 (run-timings #'ietf-drums-parse-date))
> ;; (7.220905228 83 3.3420971879999968)
> ;; (7.24936647 83 3.3321491059999993)
> ;; (7.3240701370000005 84 3.371737411)
> ;; (/ (+ 7.221 7.249 7.324) 3) 7.265

[...]

> (benchmark-run 10000 (run-timings #'ietf-drums-old-parse-date))
> ;; (7.249068317 83 3.3251401939999994)
> ;; (7.317397244 84 3.3750772899999983)
> ;; (7.268244294 84 3.3820036280000005)
> ;; (/ (+ 7.249 7.317 7.268) 3) 7.278

Thanks; that looks quite promising.  Can you send a new version of the
patch, and I'll get it pushed?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2022-02-24  9:19                                                     ` Lars Ingebrigtsen
@ 2022-02-25  0:49                                                       ` Bob Rogers
  2022-02-25  2:16                                                         ` Lars Ingebrigtsen
  0 siblings, 1 reply; 40+ messages in thread
From: Bob Rogers @ 2022-02-25  0:49 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 52209, Andreas Schwab, Paul Eggert

[-- Attachment #1: message body text --]
[-- Type: text/plain, Size: 669 bytes --]

   From: Lars Ingebrigtsen <larsi@gnus.org>
   Date: Thu, 24 Feb 2022 10:19:43 +0100

   Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

   > Benchmarking code and results attached.  I extracted a handful of
   > non-error cases from the tests as being more representative than any of
   > the error cases; the resulting numbers make it seem like any difference
   > between the two implementations is in the noise.

   Thanks; that looks quite promising.  Can you send a new version of the
   patch, and I'll get it pushed?

Here it is; there should be no changes from what I last sent other than
from the suggestions you and Andreas made.  Thanks,

					-- Bob


[-- Attachment #2: 0001-Enhanced-date-parsing-for-ietf-drums.el.patch --]
[-- Type: text/x-patch, Size: 23385 bytes --]

From bdf96f4132cb0433dfcd48b862175ef9bbcc41bb Mon Sep 17 00:00:00 2001
From: Bob Rogers <rogers@rgrjr.com>
Date: Tue, 1 Feb 2022 14:36:31 -0500
Subject: [PATCH] Enhanced date parsing for ietf-drums.el

* lisp/mail/ietf-drums-date.el (added):
   + (ietf-drums-parse-date-string):  parse-time-string replacement
     which is compatible but can be made stricter if desired.
* test/lisp/mail/ietf-drums-date-tests.el (added):
   + Add tests for ietf-drums-parse-date-string.
* lisp/mail/ietf-drums.el:
   + (ietf-drums-parse-date):  Use ietf-drums-parse-date-string.
---
 lisp/mail/ietf-drums-date.el            | 274 ++++++++++++++++++++++++
 lisp/mail/ietf-drums.el                 |   6 +-
 test/lisp/mail/ietf-drums-date-tests.el | 176 +++++++++++++++
 3 files changed, 455 insertions(+), 1 deletion(-)
 create mode 100644 lisp/mail/ietf-drums-date.el
 create mode 100644 test/lisp/mail/ietf-drums-date-tests.el

diff --git a/lisp/mail/ietf-drums-date.el b/lisp/mail/ietf-drums-date.el
new file mode 100644
index 0000000000..6f64ae7337
--- /dev/null
+++ b/lisp/mail/ietf-drums-date.el
@@ -0,0 +1,274 @@
+;;; ietf-drums-date.el --- parse time/date for ietf-drums.el -*- lexical-binding: t -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author: Bob Rogers <rogers@rgrjr.com>
+;; Keywords: mail, util
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+
+;; `ietf-drums-parse-date-string' parses a time and/or date in a
+;; string and returns a list of values, just like `decode-time', where
+;; unspecified elements in the string are returned as nil (except
+;; unspecified DST is returned as -1).  `encode-time' may be applied
+;; on these values to obtain an internal time value.
+
+;; Historically, `parse-time-string' was used for this purpose, but it
+;; was gradually but imperfectly extended to handle other date
+;; formats.  `ietf-drums-parse-date-string' is compatible in that it
+;; uses the same return value format and parses the same email date
+;; formats by default, but can be made stricter if desired.
+
+;;; Code:
+
+(require 'cl-lib)
+(require 'parse-time)
+
+(define-error 'date-parse-error "Date/time parse error" 'error)
+
+(defconst ietf-drums-date--slot-names
+  '(second minute hour day month year weekday dst zone)
+  "Names of return value slots, for better error messages.
+See the decoded-time defstruct.")
+
+(defconst ietf-drums-date--slot-ranges
+  '((0 60) (0 59) (0 23) (1 31) (1 12) (1 9999))
+  "Numeric slot ranges, for bounds checking.
+Note that RFC5322 explicitly requires that seconds go up to 60,
+to allow for leap seconds (see Mills, D., 'Network Time
+Protocol', STD 12, RFC 1119, September 1989).")
+
+(defsubst ietf-drums-date--ignore-char-p (char)
+  ;; Ignore whitespace and commas.
+  (memq char '(?\s ?\t ?\r ?\n ?,)))
+
+(defun ietf-drums-date--tokenize-string (string &optional comment-eof)
+  "Turn STRING into tokens, separated only by whitespace and commas.
+Multiple commas are ignored.  Pure digit sequences are turned
+into integers.  If COMMENT-EOF is true, then a comment as
+defined by RFC5322 (strictly, the CFWS production that also
+accepts comments) is treated as an end-of-file, and no further
+tokens are recognized, otherwise we strip out all comments and
+treat them as whitespace (per RFC822)."
+  (let ((index 0)
+	(end (length string))
+	(list ()))
+    (cl-flet ((skip-ignored ()
+                ;; Skip ignored characters at index (the scan
+                ;; position).  Skip RFC822 comments in matched parens,
+                ;; but do not complain about unterminated comments.
+                (let ((char nil)
+                      (nest 0))
+                  (while (and (< index end)
+                              (setq char (aref string index))
+                              (or (> nest 0)
+                                  (ietf-drums-date--ignore-char-p char)
+                                  (and (not comment-eof) (eql char ?\())))
+                    (cl-incf index)
+                    ;; FWS bookkeeping.
+                    (cond ((and (eq char ?\\)
+                                (< (1+ index) end))
+	                    ;; Move to the next char but don't check
+	                    ;; it to see if it might be a paren.
+                            (cl-incf index))
+                          ((eq char ?\() (cl-incf nest))
+                          ((eq char ?\)) (cl-decf nest)))))))
+      (skip-ignored)		;; Skip leading whitespace.
+      (while (and (< index end)
+                  (not (and comment-eof
+                            (eq (aref string index) ?\())))
+        (let* ((start index)
+               (char (aref string index))
+               (all-digits (<= ?0 char ?9)))
+          ;; char is valid; look for more valid characters.
+          (when (and (eq char ?\\)
+                     (< (1+ index) end))
+            ;; Escaped character, which might be a "(".  If so, we are
+            ;; correct to include it in the token, even though the
+            ;; caller is sure to barf.  If not, we violate RFC2?822 by
+            ;; not removing the backslash, but no characters in valid
+            ;; RFC2?822 dates need escaping anyway, so it shouldn't
+            ;; matter that this is not done strictly correctly.  --
+            ;; rgr, 24-Dec-21.
+            (cl-incf index))
+          (while (and (< (cl-incf index) end)
+                      (setq char (aref string index))
+                      (not (or (ietf-drums-date--ignore-char-p char)
+                               (eq char ?\())))
+            (unless (<= ?0 char ?9)
+              (setq all-digits nil))
+            (when (and (eq char ?\\)
+                       (< (1+ index) end))
+              ;; Escaped character, see above.
+              (cl-incf index)))
+          (push (if all-digits
+                    (cl-parse-integer string :start start :end index)
+                  (substring string start index))
+                list)
+          (skip-ignored)))
+      (nreverse list))))
+
+(defun ietf-drums-parse-date-string (time-string &optional error no-822)
+  "Parse an RFC5322 or RFC822 date, passed as TIME-STRING.
+The optional ERROR parameter causes syntax errors to be flagged
+by signalling an instance of the date-parse-error condition.  The
+optional NO-822 parameter disables the more lax RFC822 syntax,
+which is permitted by default.
+
+The result is a list of (SEC MIN HOUR DAY MON YEAR DOW DST TZ),
+which can be accessed as a decoded-time defstruct (q.v.),
+e.g. `decoded-time-year' to extract the year, and turned into an
+Emacs timestamp by `encode-time'.
+
+The strict syntax for RFC5322 is as follows:
+
+   [ day-of-week \",\" ] day FWS month-name FWS year FWS time [CFWS]
+
+where the \"time\" production is:
+
+   2DIGIT \":\" 2DIGIT [ \":\" 2DIGIT ] FWS ( \"+\" / \"-\" ) 4DIGIT
+
+and FWS is \"folding white space,\" and CFWS is \"comments and/or
+folding white space\", where comments are included in nesting
+parentheses and are equivalent to white space.  RFC822 also
+accepts comments in random places (all of which is handled by
+ietf-drums-date--tokenize-string) and two-digit years.  For
+two-digit years, 50 and up are interpreted as 1950 through 1999
+and 00 through 49 as 2000 through 2049.
+
+We are somewhat more lax in what we accept (specifically, the
+hours don't have to be two digits, and the TZ and the comma after
+the DOW are optional), but we do insist that the items that are
+present do appear in this order.  Unspecified/unrecognized
+elements in the string are returned as nil (except unspecified
+DST is returned as -1)."
+  (let ((tokens (ietf-drums-date--tokenize-string (downcase time-string)
+                                                  no-822))
+        (time (list nil nil nil nil nil nil nil -1 nil)))
+    (cl-labels ((set-matched-slot (slot index token)
+                  ;; Assign a slot value from match data if index is
+                  ;; non-nil, else from token, signalling an error if
+                  ;; enabled and it's out of range.
+                  (let ((value (if index
+                                   (cl-parse-integer (match-string index token))
+                                 token)))
+                    (when error
+                      (let ((range (nth slot ietf-drums-date--slot-ranges)))
+                        (when (and range
+                                   (not (<= (car range) value (cadr range))))
+                          (signal 'date-parse-error
+                                  (list "Slot out of range"
+                                        (nth slot ietf-drums-date--slot-names)
+                                        token (car range) (cadr range))))))
+                    (setf (nth slot time) value)))
+                (set-numeric (slot token)
+                  ;; Only assign the slot if the token is a number.
+                  (cond ((natnump token)
+                          (set-matched-slot slot nil token))
+                        (error
+                          (signal 'date-parse-error
+                                  (list "Not a number"
+                                        (nth slot ietf-drums-date--slot-names)
+                                        token))))))
+      ;; Check for weekday.
+      (let ((dow (assoc (car tokens) parse-time-weekdays)))
+        (when dow
+          ;; Day of the week.
+          (set-matched-slot 6 nil (cdr dow))
+          (pop tokens)))
+      ;; Day.
+      (set-numeric 3 (pop tokens))
+      ;; Alphabetic month.
+      (let* ((month (pop tokens))
+             (match (assoc month parse-time-months)))
+        (cond (match
+                (set-matched-slot 4 nil (cdr match)))
+              (error
+                (signal 'date-parse-error
+                        (list "Expected an alphabetic month" month)))
+              (t
+                (push month tokens))))
+      ;; Year.
+      (let ((year (pop tokens)))
+        ;; Check the year for the right number of digits.
+        (cond ((not (natnump year))
+                (when error
+                  (signal 'date-parse-error
+                          (list "Expected a year" year)))
+                (push year tokens))
+              ((>= year 1000)
+                (set-numeric 5 year))
+              ((or no-822
+                   (>= year 100))
+                (when error
+                  (signal 'date-parse-error
+                          (list "Four-digit years are required" year)))
+                (push year tokens))
+              ((>= year 50)
+                ;; second half of the 20th century.
+                (set-numeric 5 (+ 1900 year)))
+              (t
+                ;; first half of the 21st century.
+                (set-numeric 5 (+ 2000 year)))))
+      ;; Time.
+      (let ((time (pop tokens)))
+        (cond ((or (null time) (natnump time))
+                (when error
+                  (signal 'date-parse-error
+                          (list "Expected a time" time)))
+                (push time tokens))
+              ((string-match
+                "^\\([0-9][0-9]?\\):\\([0-9][0-9]\\):\\([0-9][0-9]\\)$"
+                time)
+                (set-matched-slot 2 1 time)
+                (set-matched-slot 1 2 time)
+                (set-matched-slot 0 3 time))
+              ((string-match "^\\([0-9][0-9]?\\):\\([0-9][0-9]\\)$" time)
+                ;; Time without seconds.
+                (set-matched-slot 2 1 time)
+                (set-matched-slot 1 2 time)
+                (set-matched-slot 0 nil 0))
+              (error
+                (signal 'date-parse-error
+                        (list "Expected a time" time)))))
+      ;; Timezone.
+      (let* ((zone (pop tokens))
+             (match (assoc zone parse-time-zoneinfo)))
+        (cond (match
+                (set-matched-slot 8 nil (cadr match))
+                (set-matched-slot 7 nil (caddr match)))
+              ((and (stringp zone)
+                    (string-match "^[-+][0-9][0-9][0-9][0-9]$" zone))
+                ;; Numeric time zone.
+                (set-matched-slot
+                  8 nil
+                  (* 60
+                     (+ (cl-parse-integer zone :start 3 :end 5)
+                        (* 60 (cl-parse-integer zone :start 1 :end 3)))
+                     (if (= (aref zone 0) ?-) -1 1))))
+              ((and zone error)
+                (signal 'date-parse-error
+                        (list "Expected a timezone" zone)))))
+      (when (and tokens error)
+        (signal 'date-parse-error
+                (list "Extra token(s)" (car tokens)))))
+    time))
+
+(provide 'ietf-drums-date)
+
+;;; ietf-drums-date.el ends here
diff --git a/lisp/mail/ietf-drums.el b/lisp/mail/ietf-drums.el
index 85aa27235f..d1ad671b16 100644
--- a/lisp/mail/ietf-drums.el
+++ b/lisp/mail/ietf-drums.el
@@ -294,9 +294,13 @@ ietf-drums-unfold-fws
     (replace-match " " t t))
   (goto-char (point-min)))
 
+(declare-function ietf-drums-parse-date-string "ietf-drums-date"
+                  (time-string &optional error? no-822?))
+
 (defun ietf-drums-parse-date (string)
   "Return an Emacs time spec from STRING."
-  (encode-time (parse-time-string string)))
+  (require 'ietf-drums-date)
+  (encode-time (ietf-drums-parse-date-string string)))
 
 (defun ietf-drums-narrow-to-header ()
   "Narrow to the header section in the current buffer."
diff --git a/test/lisp/mail/ietf-drums-date-tests.el b/test/lisp/mail/ietf-drums-date-tests.el
new file mode 100644
index 0000000000..2d4b39dfae
--- /dev/null
+++ b/test/lisp/mail/ietf-drums-date-tests.el
@@ -0,0 +1,176 @@
+;;; ietf-drums-date-tests.el --- Test suite for ietf-drums-date.el  -*- lexical-binding:t -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author: Bob Rogers <rogers@rgrjr.com>
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+
+;;; Code:
+
+(require 'ert)
+(require 'ietf-drums)
+(require 'ietf-drums-date)
+
+(ert-deftest ietf-drums-date-tests ()
+  "Test basic ietf-drums-parse-date-string functionality."
+
+  ;; Test tokenization.
+  (should (equal (ietf-drums-date--tokenize-string " ") '()))
+  (should (equal (ietf-drums-date--tokenize-string " a b") '("a" "b")))
+  (should (equal (ietf-drums-date--tokenize-string "a bbc dde")
+                 '("a" "bbc" "dde")))
+  (should (equal (ietf-drums-date--tokenize-string " , a 27 b,, c 14:32 ")
+                 '("a" 27 "b" "c" "14:32")))
+  ;; Some folding whitespace tests.
+  (should (equal (ietf-drums-date--tokenize-string " a b (end) c" t)
+                 '("a" "b")))
+  (should (equal (ietf-drums-date--tokenize-string "(quux)a (foo (bar)) b(baz)")
+                 '("a" "b")))
+  (should (equal (ietf-drums-date--tokenize-string "a b\\cde")
+                 ;; Strictly incorrect, but strictly unnecessary syntax.
+                 '("a" "b\\cde")))
+  (should (equal (ietf-drums-date--tokenize-string "a b\\ de")
+                 '("a" "b\\ de")))
+  (should (equal (ietf-drums-date--tokenize-string "a \\de \\(f")
+                 '("a" "\\de" "\\(f")))
+
+  ;; Start with some compatible RFC822 dates.
+  (dolist (case '(("Mon, 22 Feb 2016 19:35:42 +0100"
+                   (42 35 19 22 2 2016 1 -1 3600)
+                   (22219 21758))
+                  ("22 Feb 2016 19:35:42 +0100"
+                   (42 35 19 22 2 2016 nil -1 3600)
+                   (22219 21758))
+                  ("Mon, 22 February 2016 19:35:42 +0100"
+                   (42 35 19 22 2 2016 1 -1 3600)
+                   (22219 21758))
+                  ("Mon, 22 feb 2016 19:35:42 +0100"
+                   (42 35 19 22 2 2016 1 -1 3600)
+                   (22219 21758))
+                  ("Monday, 22 february 2016 19:35:42 +0100"
+                   (42 35 19 22 2 2016 1 -1 3600)
+                   (22219 21758))
+                  ("Monday, 22 february 2016 19:35:42 PST"
+                   (42 35 19 22 2 2016 1 nil -28800)
+                   (22219 54158))
+                  ("Friday, 21 Sep 2018 13:47:58 PDT"
+                   (58 47 13 21 9 2018 5 t -25200)
+                   (23461 22782))
+                  ("Friday, 21 Sep 2018 13:47:58"
+                   (58 47 13 21 9 2018 5 -1 nil)
+                   (23461 11982))))
+            (let* ((input (car case))
+                   (parsed (cadr case))
+                   (encoded (caddr case)))
+              ;; The input should parse the same without RFC822.
+              (should (equal (ietf-drums-parse-date-string input) parsed))
+              (should (equal (ietf-drums-parse-date-string input nil t) parsed))
+              ;; Check the encoded date (the official output, though
+              ;; the decoded-time is easier to debug).
+              (should (equal (ietf-drums-parse-date input) encoded))))
+
+  ;; Two-digit years are not allowed by the "modern" format.
+  (should (equal (ietf-drums-parse-date-string "22 Feb 16 19:35:42 +0100")
+                 '(42 35 19 22 2 2016 nil -1 3600)))
+  (should (equal (ietf-drums-parse-date-string "22 Feb 16 19:35:42 +0100" nil t)
+                 '(nil nil nil 22 2 nil nil -1 nil)))
+  (should (equal (should-error (ietf-drums-parse-date-string
+                                "22 Feb 16 19:35:42 +0100" t t))
+                 '(date-parse-error "Four-digit years are required" 16)))
+  (should (equal (ietf-drums-parse-date-string "22 Feb 96 19:35:42 +0100")
+                 '(42 35 19 22 2 1996 nil -1 3600)))
+  (should (equal (ietf-drums-parse-date-string "22 Feb 96 19:35:42 +0100" nil t)
+                 '(nil nil nil 22 2 nil nil -1 nil)))
+  (should (equal (should-error (ietf-drums-parse-date-string
+                                "22 Feb 96 19:35:42 +0100" t t))
+                 '(date-parse-error "Four-digit years are required" 96)))
+
+  ;; Try some dates with comments.
+  (should (equal (ietf-drums-parse-date-string
+                  "22 Feb (today) 16 19:35:42 +0100")
+                 '(42 35 19 22 2 2016 nil -1 3600)))
+  (should (equal (ietf-drums-parse-date-string
+                  "22 Feb (today) 16 19:35:42 +0100" nil t)
+                 '(nil nil nil 22 2 nil nil -1 nil)))
+  (should (equal (should-error (ietf-drums-parse-date-string
+                                "22 Feb (today) 16 19:35:42 +0100" t t))
+                 '(date-parse-error "Expected a year" nil)))
+  (should (equal (ietf-drums-parse-date-string
+                  "22 Feb 96 (long ago) 19:35:42 +0100")
+                 '(42 35 19 22 2 1996 nil -1 3600)))
+  (should (equal (ietf-drums-parse-date-string
+                  "Friday, 21 Sep(comment \\) with \\( parens)18 19:35:42")
+                 '(42 35 19 21 9 2018 5 -1 nil)))
+  (should (equal (ietf-drums-parse-date-string
+                  "Friday, 21 Sep 18 19:35:42 (unterminated comment")
+                 '(42 35 19 21 9 2018 5 -1 nil)))
+
+  ;; Test some RFC822 error cases
+  (dolist (test '(("33 1 2022" ("Slot out of range" day 33 1 31))
+                  ("0 1 2022" ("Slot out of range" day 0 1 31))
+                  ("1 1 2020 2021" ("Expected an alphabetic month" 1))
+                  ("1 Jan 2020 2021" ("Expected a time" 2021))
+                  ("1 Jan 2020 20:21 2000" ("Expected a timezone" 2000))
+                  ("1 Jan 2020 20:21 +0200 33" ("Extra token(s)" 33))))
+    (should (equal (should-error (ietf-drums-parse-date-string (car test) t))
+                   (cons 'date-parse-error (cadr test)))))
+
+  (dolist (test '(("22 Feb 196" nil		;; bad year
+                   ("Four-digit years are required" 196))
+                  ("22 Feb 16 19:35:24" t	;; two-digit year
+                   ("Four-digit years are required" 16))
+                  ("22 Feb 96 19:35:42" t	;; two-digit year
+                   ("Four-digit years are required" 96))
+                  ("2 Feb 2021 1996" nil
+                   ("Expected a time" 1996))
+                  ("22 Fub 1996" nil
+                   ("Expected an alphabetic month" "fub"))
+                  ("1 Jan 2020 30" nil
+                   ("Expected a time" 30))
+                  ("1 Jan 2020 16:47 15:15" nil
+                   ("Expected a timezone" "15:15"))
+                  ("1 Jan 2020 16:47 +0800 -0800" t
+                   ("Extra token(s)" "-0800"))
+                  ;; Range tests
+                  ("32 Dec 2021" nil
+                   ("Slot out of range" day 32 1 31))
+                  ("0 Dec 2021" nil
+                   ("Slot out of range" day 0 1 31))
+                  ("3 13 2021" nil
+                   ("Expected an alphabetic month" 13))
+                  ("3 Dec 0000" t
+                   ("Four-digit years are required" 0))
+                  ("3 Dec 20021" nil
+                   ("Slot out of range" year 20021 1 9999))
+                  ("1 Jan 2020 24:21:14" nil
+                   ("Slot out of range" hour "24:21:14" 0 23))
+                  ("1 Jan 2020 14:60:21" nil
+                   ("Slot out of range" minute "14:60:21" 0 59))
+                  ("1 Jan 2020 14:21:61" nil
+                   ("Slot out of range" second "14:21:61" 0 60))))
+    (should (equal (should-error
+                    (ietf-drums-parse-date-string (car test) t (cadr test)))
+                   (cons 'date-parse-error (caddr test)))))
+  (should (equal (ietf-drums-parse-date-string
+                  "1 Jan 2020 14:21:60")	;; a leap second!
+                 '(60 21 14 1 1 2020 nil -1 nil))))
+
+(provide 'ietf-drums-date-tests)
+
+;;; ietf-drums-date-tests.el ends here
-- 
2.34.1
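
The numeric-zone and two-digit-year branches in the patch above can be tried in isolation. What follows is only a minimal sketch: the helper names my/zone-offset-seconds and my/expand-two-digit-year are hypothetical and not part of the patch, and the year helper covers only the RFC 822 compatibility path (no-822 unset).

```elisp
(require 'cl-lib)

;; Hypothetical helper, not part of the patch: same arithmetic as the
;; "Numeric time zone" branch, returning seconds east of UTC.
(defun my/zone-offset-seconds (zone)
  "Return the UTC offset in seconds for ZONE, a string like \"+0100\".
Return nil if ZONE is not in that form."
  (when (and (stringp zone)
             (string-match "\\`[-+][0-9][0-9][0-9][0-9]\\'" zone))
    (* 60
       (+ (cl-parse-integer zone :start 3 :end 5)          ; minutes
          (* 60 (cl-parse-integer zone :start 1 :end 3)))  ; hours
       (if (= (aref zone 0) ?-) -1 1))))

;; Hypothetical helper, not part of the patch: the 50-year pivot used
;; when two-digit years are accepted.
(defun my/expand-two-digit-year (year)
  "Expand YEAR the way the patch does when RFC 822 years are allowed.
Return nil for three-digit years, which the patch rejects."
  (cond ((>= year 1000) year)
        ((>= year 100) nil)
        ((>= year 50) (+ 1900 year))   ; second half of the 20th century
        (t (+ 2000 year))))            ; first half of the 21st century

(my/zone-offset-seconds "+0100")   ; => 3600
(my/zone-offset-seconds "-0800")   ; => -28800
(my/expand-two-digit-year 16)      ; => 2016
(my/expand-two-digit-year 96)      ; => 1996
```

These values agree with the zone and year slots expected by the tests in the patch, for example 3600 for "+0100" and 2016 for "22 Feb 16".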



* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2022-02-25  0:49                                                       ` Bob Rogers
@ 2022-02-25  2:16                                                         ` Lars Ingebrigtsen
  2022-02-25  2:32                                                           ` Bob Rogers
  0 siblings, 1 reply; 40+ messages in thread
From: Lars Ingebrigtsen @ 2022-02-25  2:16 UTC (permalink / raw)
  To: Bob Rogers; +Cc: 52209, Andreas Schwab, Paul Eggert

Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

> Here it is; there should be no changes from what I last sent other than
> from the suggestions you and Andreas made.  Thanks,

I got a test error:

Test ietf-drums-date-tests condition:
    (ert-test-failed
     ((should
       (equal
	(ietf-drums-parse-date input)
	encoded))
      :form
      (equal
       (23460 55918)
       (23461 11982))
      :value nil :explanation
      (list-elt 0
		(different-atoms
		 (23460 "#x5ba4" "?室")
		 (23461 "#x5ba5" "?宥")))))
   FAILED  1/1  ietf-drums-date-tests (0.000406 sec) at lisp/mail/ietf-drums-date-tests.el:30


-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2022-02-25  2:16                                                         ` Lars Ingebrigtsen
@ 2022-02-25  2:32                                                           ` Bob Rogers
  2022-02-25  2:58                                                             ` Bob Rogers
  0 siblings, 1 reply; 40+ messages in thread
From: Bob Rogers @ 2022-02-25  2:32 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 52209, Andreas Schwab, Paul Eggert

   From: Lars Ingebrigtsen <larsi@gnus.org>
   Date: Fri, 25 Feb 2022 03:16:56 +0100

   Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

   > Here it is; there should be no changes from what I last sent other than
   > from the suggestions you and Andreas made.  Thanks,

   I got a test error:

Hmm.  I bet we have a timezone issue . . .

					-- Bob
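
The size of the mismatch in the failure report supports the timezone guess. A small sketch, assuming the two values in the ERT output are (HIGH LOW) time lists; my/time-list-to-seconds is a hypothetical helper name, not an Emacs function:

```elisp
;; Hypothetical helper: a (HIGH LOW) time list encodes
;; HIGH * 2^16 + LOW seconds since the epoch.
(defun my/time-list-to-seconds (time)
  "Convert TIME, a (HIGH LOW) list, to integer seconds since the epoch."
  (+ (* (car time) 65536) (cadr time)))

(- (my/time-list-to-seconds '(23461 11982))   ; expected by the test
   (my/time-list-to-seconds '(23460 55918)))  ; observed in the failure
;; => 21600
```

21600 seconds is exactly six hours, which is what one would expect if the zone-less input was encoded in two local zones six hours apart. Which zones the two machines actually used is not stated in the thread, so that part is an inference.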






* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2022-02-25  2:32                                                           ` Bob Rogers
@ 2022-02-25  2:58                                                             ` Bob Rogers
  2022-02-25 12:03                                                               ` Lars Ingebrigtsen
  0 siblings, 1 reply; 40+ messages in thread
From: Bob Rogers @ 2022-02-25  2:58 UTC (permalink / raw)
  To: Lars Ingebrigtsen, Andreas Schwab, 52209, Paul Eggert

[-- Attachment #1: message body text --]
[-- Type: text/plain, Size: 531 bytes --]

   From: Bob Rogers <rogers-emacs@rgrjr.homedns.org>
   Date: Thu, 24 Feb 2022 21:32:27 -0500

      From: Lars Ingebrigtsen <larsi@gnus.org>
      Date: Fri, 25 Feb 2022 03:16:56 +0100

      Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

      > Here it is; there should be no changes from what I last sent other than
      > from the suggestions you and Andreas made.  Thanks,

      I got a test error:

   Hmm.  I bet we have a timezone issue . . .

Yep; here's a fix, to be applied to the previous patch.

					-- Bob


[-- Attachment #2: Type: text/x-patch, Size: 3121 bytes --]

From 93a92360e5ea514236366f978aa5a71e7662ba1a Mon Sep 17 00:00:00 2001
From: Bob Rogers <rogers@rgrjr.com>
Date: Thu, 24 Feb 2022 21:55:30 -0500
Subject: [PATCH] Fix an ietf-drums-parse-date test without TZ

* test/lisp/mail/ietf-drums-date-tests.el (ietf-drums-date-tests):
Bug fix: input to ietf-drums-parse-date must have a timezone;
otherwise the output depends on the TZ of the test environment.
Also add some tests without a timezone, and fix indentation.
---
 test/lisp/mail/ietf-drums-date-tests.el | 36 +++++++++++++++++--------
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/test/lisp/mail/ietf-drums-date-tests.el b/test/lisp/mail/ietf-drums-date-tests.el
index 2d4b39dfae..5b798077ff 100644
--- a/test/lisp/mail/ietf-drums-date-tests.el
+++ b/test/lisp/mail/ietf-drums-date-tests.el
@@ -72,18 +72,32 @@ ietf-drums-date-tests
                   ("Friday, 21 Sep 2018 13:47:58 PDT"
                    (58 47 13 21 9 2018 5 t -25200)
                    (23461 22782))
-                  ("Friday, 21 Sep 2018 13:47:58"
-                   (58 47 13 21 9 2018 5 -1 nil)
+                  ("Friday, 21 Sep 2018 13:47:58 EDT"
+                   (58 47 13 21 9 2018 5 t -14400)
                    (23461 11982))))
-            (let* ((input (car case))
-                   (parsed (cadr case))
-                   (encoded (caddr case)))
-              ;; The input should parse the same without RFC822.
-              (should (equal (ietf-drums-parse-date-string input) parsed))
-              (should (equal (ietf-drums-parse-date-string input nil t) parsed))
-              ;; Check the encoded date (the official output, though
-              ;; the decoded-time is easier to debug).
-              (should (equal (ietf-drums-parse-date input) encoded))))
+    (let* ((input (car case))
+           (parsed (cadr case))
+           (encoded (caddr case)))
+      ;; The input should parse the same without RFC822.
+      (should (equal (ietf-drums-parse-date-string input) parsed))
+      (should (equal (ietf-drums-parse-date-string input nil t) parsed))
+      ;; Check the encoded date (the official output, though the
+      ;; decoded-time is easier to debug).
+      (should (equal (ietf-drums-parse-date input) encoded))))
+
+  ;; Test a few without timezones.
+  (dolist (case '(("Mon, 22 Feb 2016 19:35:42"
+                   (42 35 19 22 2 2016 1 -1 nil))
+                  ("Friday, 21 Sep 2018 13:47:58"
+                   (58 47 13 21 9 2018 5 -1 nil))))
+    (let* ((input (car case))
+           (parsed (cadr case)))
+      ;; The input should parse the same without RFC822.
+      (should (equal (ietf-drums-parse-date-string input) parsed))
+      (should (equal (ietf-drums-parse-date-string input nil t) parsed))
+      ;; We can't check the encoded date here because it will differ
+      ;; depending on the TZ of the test environment.
+      ))
 
   ;; Two-digit years are not allowed by the "modern" format.
   (should (equal (ietf-drums-parse-date-string "22 Feb 16 19:35:42 +0100")
-- 
2.34.1
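
Why an explicit zone makes the encoded value reproducible can be seen with encode-time directly. This is a sketch only, assuming Emacs 27 or later (single-argument encode-time and time-convert); my/epoch-seconds is a hypothetical helper, not part of the patch:

```elisp
;; Hypothetical helper: encode 2018-09-21 13:47:58 with the UTC offset
;; given explicitly in the zone slot, so the local TZ is not consulted.
(defun my/epoch-seconds (zone)
  "Return integer epoch seconds for 2018-09-21 13:47:58 at offset ZONE.
ZONE is a UTC offset in seconds, as in the ninth decoded-time slot."
  (time-convert (encode-time (list 58 47 13 21 9 2018 nil -1 zone))
                'integer))

(my/epoch-seconds -14400)   ; EDT; => 1537552078, i.e. (23461 11982)
(my/epoch-seconds -25200)   ; PDT; 10800 seconds (three hours) later
```

With nil in the zone slot, as in the two zone-less inputs, encode-time falls back to the local time zone, which is why those cases check only the decoded list.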



* bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates
  2022-02-25  2:58                                                             ` Bob Rogers
@ 2022-02-25 12:03                                                               ` Lars Ingebrigtsen
  0 siblings, 0 replies; 40+ messages in thread
From: Lars Ingebrigtsen @ 2022-02-25 12:03 UTC (permalink / raw)
  To: Bob Rogers; +Cc: 52209, Andreas Schwab, Paul Eggert

Bob Rogers <rogers-emacs@rgrjr.homedns.org> writes:

> Yep; here's a fix, to be applied to the previous patch.

That fixes the issue here, so I've now pushed the patches to Emacs 29.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






end of thread, other threads:[~2022-02-25 12:03 UTC | newest]

Thread overview: 40+ messages
2021-11-30 20:55 bug#52209: 28.0.60; [PATCH] date-to-time fails on pure dates Bob Rogers
2021-12-01  4:17 ` Lars Ingebrigtsen
2021-12-03  5:19   ` Katsumi Yamaoka
2021-12-03 16:29     ` Lars Ingebrigtsen
2021-12-03 18:38     ` Michael Heerdegen
2021-12-04 18:58 ` Paul Eggert
2021-12-19 21:11   ` Bob Rogers
2021-12-20 10:08     ` Lars Ingebrigtsen
2021-12-20 15:57       ` Bob Rogers
2021-12-20 16:34         ` Bob Rogers
2021-12-21 11:01         ` Lars Ingebrigtsen
2021-12-23 19:48           ` Bob Rogers
2021-12-24  9:29             ` Lars Ingebrigtsen
2021-12-24 15:58               ` Bob Rogers
2021-12-25 11:58                 ` Lars Ingebrigtsen
2021-12-25 22:50                   ` Bob Rogers
2021-12-26 11:31                     ` Lars Ingebrigtsen
2021-12-28 15:52                       ` Bob Rogers
2021-12-29 15:19                         ` Lars Ingebrigtsen
2021-12-29 19:29                           ` Paul Eggert
2021-12-29 22:01                             ` Bob Rogers
2021-12-30  5:32                               ` Bob Rogers
2021-12-30 21:08                           ` Bob Rogers
2022-01-01 14:47                             ` Lars Ingebrigtsen
2022-01-01 14:56                               ` Andreas Schwab
2022-01-02  0:41                                 ` Bob Rogers
2022-01-03 11:34                                   ` Lars Ingebrigtsen
2022-01-04  4:45                                     ` Bob Rogers
2022-01-05 15:46                                       ` Lars Ingebrigtsen
2022-01-05 22:49                                         ` Bob Rogers
     [not found]                                           ` <25105.33397.961104.269676@orion.rgrjr.com>
2022-02-20 12:25                                             ` Lars Ingebrigtsen
2022-02-20 13:03                                             ` Andreas Schwab
     [not found]                                               ` <87ilt9vicd.fsf@gnus.org>
2022-02-20 22:14                                                 ` Bob Rogers
2022-02-23 23:15                                                   ` Bob Rogers
2022-02-24  9:19                                                     ` Lars Ingebrigtsen
2022-02-25  0:49                                                       ` Bob Rogers
2022-02-25  2:16                                                         ` Lars Ingebrigtsen
2022-02-25  2:32                                                           ` Bob Rogers
2022-02-25  2:58                                                             ` Bob Rogers
2022-02-25 12:03                                                               ` Lars Ingebrigtsen
