unofficial mirror of bug-guile@gnu.org 
* bug#13544: (web http) fails to parse numeric timezones in Date header
@ 2013-01-24 22:13 Ludovic Courtès
  2013-03-07 22:28 ` Andy Wingo
  2013-03-15 14:40 ` Daniel Hartwig
From: Ludovic Courtès @ 2013-01-24 22:13 UTC (permalink / raw)
  To: 13544; +Cc: Cyril Roelandt

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (use-modules(web client)(web uri))
scheme@(guile-user)> (http-get (string->uri "http://www.sqlite.org/"))
web/http.scm:768:6: In procedure parse-asctime-date:
web/http.scm:768:6: Bad Date header: Thu, 24  Jan 2013 21:53:01 +0000
--8<---------------cut here---------------end--------------->8---

RFC 1123 reads:

       There is a strong trend towards the use of numeric timezone
       indicators, and implementations SHOULD use numeric timezones
       instead of timezone names.  However, all implementations MUST
       accept either notation.  If timezone names are used, they MUST
       be exactly as defined in RFC-822.

Here’s a tentative patch to fix it:


diff --git a/module/web/http.scm b/module/web/http.scm
index 216fddd..2ab5bd0 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -1,6 +1,6 @@
 ;;; HTTP messages
 
-;; Copyright (C)  2010, 2011, 2012 Free Software Foundation, Inc.
+;; Copyright (C)  2010, 2011, 2012, 2013 Free Software Foundation, Inc.
 
 ;; This library is free software; you can redistribute it and/or
 ;; modify it under the terms of the GNU Lesser General Public
@@ -732,6 +732,20 @@ as an ordered alist."
                (minute (parse-non-negative-integer str 19 21))
                (second (parse-non-negative-integer str 22 24)))
            (make-date 0 second minute hour date month year 0)))
+        ((string-match? str "aaa, dd aaa dddd dd:dd:dd .0000")
+         (let ((date (parse-non-negative-integer str 5 7))
+               (month (parse-month str 8 11))
+               (year (parse-non-negative-integer str 12 16))
+               (hour (parse-non-negative-integer str 17 19))
+               (minute (parse-non-negative-integer str 20 22))
+               (second (parse-non-negative-integer str 23 25))
+               (tz (parse-non-negative-integer str 27 31))
+               (tz-sign (case (string-ref str 26)
+                          ((#\+) +1)
+                          ((#\-) -1)
+                          (else (bad-header 'date str) #f))))
+           (make-date 0 second minute hour date month year
+                      (* tz-sign tz))))
         (else
          (bad-header 'date str)         ; prevent tail call
          #f)))
@@ -778,7 +792,8 @@ as an ordered alist."
     (make-date 0 second minute hour date month year 0)))
 
 (define (parse-date str)
-  (if (string-suffix? " GMT" str)
+  (if (or (string-suffix? " GMT" str)
+          (string-match "[+-][0-9]{4}$" str))
       (let ((comma (string-index str #\,)))
         (cond ((not comma) (bad-header 'date str))
               ((= comma 3) (parse-rfc-822-date str))


The catch is that this particular example has a second problem: there is
an extra space before the month name.

How is this best addressed?  Should the parser be more tolerant,
possibly using plain regexps?
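
For illustration, a regexp-based parser along those lines might look
like the sketch below.  It is not part of the patch above: the names
`tolerant-date-rx', `month-name->number', `zone->seconds', and
`parse-date/tolerant' are hypothetical, and it assumes that runs of
spaces between fields should be accepted and that a numeric zone should
be converted to seconds east of UTC, which is what SRFI-19's
`make-date' expects.

```scheme
(use-modules (ice-9 regex) (srfi srfi-19))

;; Hypothetical pattern: one or more spaces are tolerated between
;; fields, and the zone is either "GMT" or a numeric offset ("+0100").
(define tolerant-date-rx
  (make-regexp
   (string-append
    "^[A-Za-z]{3}, +([0-9]{1,2}) +([A-Za-z]{3}) +([0-9]{4}) +"
    "([0-9]{2}):([0-9]{2}):([0-9]{2}) +(GMT|[+-][0-9]{4})$")))

(define month-names
  '("Jan" "Feb" "Mar" "Apr" "May" "Jun"
    "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"))

(define (month-name->number name)
  ;; 1-based month index, or #f for an unknown name.
  (let ((tail (member name month-names)))
    (and tail (- 13 (length tail)))))

(define (zone->seconds zone)
  ;; "GMT" -> 0; "+HHMM" or "-HHMM" -> signed offset in seconds.
  (if (string=? zone "GMT")
      0
      (let ((sign (if (char=? (string-ref zone 0) #\-) -1 1))
            (hours (string->number (substring zone 1 3)))
            (minutes (string->number (substring zone 3 5))))
        (* sign (+ (* hours 3600) (* minutes 60))))))

(define (parse-date/tolerant str)
  ;; Return a SRFI-19 date, or #f when STR does not match.
  (let ((m (regexp-exec tolerant-date-rx str)))
    (and m
         (let ((num (lambda (i) (string->number (match:substring m i))))
               (month (month-name->number (match:substring m 2))))
           (and month
                (make-date 0 (num 6) (num 5) (num 4) (num 1) month (num 3)
                           (zone->seconds (match:substring m 7))))))))
```

With these definitions, (parse-date/tolerant "Thu, 24  Jan 2013 21:53:01 +0000")
accepts the header from the transcript above despite the doubled space.
The sketch returns #f rather than calling `bad-header', so a caller in
(web http) would still have to report the error itself.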

Thanks,
Ludo’.

