unofficial mirror of notmuch@notmuchmail.org
* [Patch] tag.py: Bugfix to avoid decode() on a NoneType object
@ 2011-07-22 14:11 Michael Heinrich
  2011-07-22 19:13 ` Patrick Totzke
  2011-08-12  8:14 ` Sebastian Spaeth
  0 siblings, 2 replies; 7+ messages in thread
From: Michael Heinrich @ 2011-07-22 14:11 UTC (permalink / raw)
  To: notmuch

Dear all,

With the current HEAD I get the following error in my Python scripts when I
read the tags of a message:

  File "/home/heinrich/.local/lib/python2.6/site-packages/notmuch/tag.py", line 88, in next
    tag = Tags._get(self._tags).decode('utf-8')
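
For context, the failure can be reproduced without notmuch. The sketch below
assumes only what the comment in tag.py states, namely that the getter returns
None once no valid tag is left. get_raw_tag is a hypothetical stand-in for
Tags._get, not part of the bindings.

```python
def get_raw_tag(exhausted):
    """Hypothetical stand-in for Tags._get: None when no tag is left."""
    return None if exhausted else b'inbox'


def next_tag_unguarded(exhausted):
    # Mirrors the failing line: decode is called before any None check.
    return get_raw_tag(exhausted).decode('utf-8')


try:
    next_tag_unguarded(True)
except AttributeError as err:
    # None has no decode() method, so the call fails here.
    print('decode on None failed: %s' % err)
```

As long as a tag is available the unguarded call works, which is why the
problem only shows up when the iterator reaches the end of the tag list.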


Here is a patch:

diff --git a/bindings/python/notmuch/tag.py b/bindings/python/notmuch/tag.py
index 65a9118..e9049fc 100644
--- a/bindings/python/notmuch/tag.py
+++ b/bindings/python/notmuch/tag.py
@@ -85,10 +85,12 @@ class Tags(object):
             raise NotmuchError(STATUS.NOT_INITIALIZED)
         # No need to call nmlib.notmuch_tags_valid(self._tags);
         # Tags._get safely returns None, if there is no more valid tag.
-        tag = Tags._get(self._tags).decode('utf-8')
+        tag = Tags._get(self._tags)
         if tag is None:
             self._tags = None
             raise StopIteration
+        else:
+            tag = tag.decode('utf-8')
         nmlib.notmuch_tags_move_to_next(self._tags)
         return tag
 

Michael.

* Re: [Patch] tag.py: Bugfix to avoid decode() on a NoneType object
  2011-07-22 14:11 [Patch] tag.py: Bugfix to avoid decode() on a NoneType object Michael Heinrich
@ 2011-07-22 19:13 ` Patrick Totzke
  2011-08-12  8:14 ` Sebastian Spaeth
  1 sibling, 0 replies; 7+ messages in thread
From: Patrick Totzke @ 2011-07-22 19:13 UTC (permalink / raw)
  To: Michael Heinrich; +Cc: notmuch

Hi Michael,

I also stumbled over this a while ago (cf. http://notmuch.198994.n3.nabble.com/Encodings-td3159281.html).
Your patch certainly fixes the immediate error, but there is more to the problem:
tag strings seem to be the only ones stored by notmuch as-is, so unlike
headers, they don't get converted to utf-8. The patch that led to
this .decode('utf-8') was pushed a bit too hastily.
As Uwe mentions in the thread cited above, we could consider enforcing
tags to be utf-8.
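
To illustrate the concern: if a tag was stored as-is in some other encoding, a
strict decode raises UnicodeDecodeError. The following is only a sketch of one
possible way to cope, not what the bindings do. decode_tag is a hypothetical
helper and the Latin-1 byte string is made-up sample data.

```python
def decode_tag(raw, encoding='utf-8'):
    """Hypothetical helper: decode a raw tag without raising on bad bytes."""
    if raw is None:
        return None
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Fall back to substituting U+FFFD for bytes that are not valid UTF-8.
        return raw.decode(encoding, 'replace')


utf8_tag = u'caf\xe9'.encode('utf-8')      # valid UTF-8 bytes
latin1_tag = u'caf\xe9'.encode('latin-1')  # same text, but not valid UTF-8

print(repr(decode_tag(utf8_tag)))
print(repr(decode_tag(latin1_tag)))
```

Replacing undecodable bytes loses information, so enforcing utf-8 when tags are
written, as suggested above, would be the cleaner option.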

best,
/p


On Fri, Jul 22, 2011 at 02:11:41PM +0000, Michael Heinrich wrote:
> Dear all,
> 
> with current head I get following error in my python scripts when I read the
> tags of a message:
> 
>   File "/home/heinrich/.local/lib/python2.6/site-packages/notmuch/tag.py", line 88, in next
>     tag = Tags._get(self._tags).decode('utf-8')
> 
> 
> Here is a patch:
> 
> diff --git a/bindings/python/notmuch/tag.py b/bindings/python/notmuch/tag.py
> index 65a9118..e9049fc 100644
> --- a/bindings/python/notmuch/tag.py
> +++ b/bindings/python/notmuch/tag.py
> @@ -85,10 +85,12 @@ class Tags(object):
>              raise NotmuchError(STATUS.NOT_INITIALIZED)
>          # No need to call nmlib.notmuch_tags_valid(self._tags);
>          # Tags._get safely returns None, if there is no more valid tag.
> -        tag = Tags._get(self._tags).decode('utf-8')
> +        tag = Tags._get(self._tags)
>          if tag is None:
>              self._tags = None
>              raise StopIteration
> +        else:
> +            tag = tag.decode('utf-8')
>          nmlib.notmuch_tags_move_to_next(self._tags)
>          return tag
>  
> 
> Michael.
> 

* Re: [Patch] tag.py: Bugfix to avoid decode() on a NoneType object
  2011-07-22 14:11 [Patch] tag.py: Bugfix to avoid decode() on a NoneType object Michael Heinrich
  2011-07-22 19:13 ` Patrick Totzke
@ 2011-08-12  8:14 ` Sebastian Spaeth
  2011-08-12 13:23   ` [PATCH] [python] decode headers from utf-8 to unicode Patrick Totzke
  1 sibling, 1 reply; 7+ messages in thread
From: Sebastian Spaeth @ 2011-08-12  8:14 UTC (permalink / raw)
  To: Michael Heinrich, notmuch

On Fri, 22 Jul 2011 14:11:41 +0000 (UTC), Michael Heinrich <michael@haas-heinrich.de> wrote:
> Dear all,
> 
> with current head I get following error in my python scripts when I read the
> tags of a message:
> 
>   File "/home/heinrich/.local/lib/python2.6/site-packages/notmuch/tag.py", line 88, in next
>     tag = Tags._get(self._tags).decode('utf-8')

Just for reference, this has been fixed in commit
94c5edd064f856a888ce29f7ac1523006b4b8fd6 on Aug 9th
using a slightly different patch of mine.

Sebastian

* [PATCH] [python] decode headers from utf-8 to unicode
  2011-08-12  8:14 ` Sebastian Spaeth
@ 2011-08-12 13:23   ` Patrick Totzke
  2011-08-15 13:49     ` Sebastian Spaeth
  2011-08-16 21:37     ` [PATCH 2/2] [python] fix unsafe utf-8 decodings Patrick Totzke
  0 siblings, 2 replies; 7+ messages in thread
From: Patrick Totzke @ 2011-08-12 13:23 UTC (permalink / raw)
  To: notmuch; +Cc: patrick

From: patrick <p.totzke@ed.ac.uk>

as mail headers are stored as utf-8 in the index,
it is safe to return them as unicode strings directly
---
 bindings/python/notmuch/message.py |    4 ++--
 bindings/python/notmuch/thread.py  |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/bindings/python/notmuch/message.py b/bindings/python/notmuch/message.py
index 435a05d..ae6ae1b 100644
--- a/bindings/python/notmuch/message.py
+++ b/bindings/python/notmuch/message.py
@@ -395,7 +395,7 @@ class Message(object):
         header = Message._get_header(self._msg, header)
         if header == None:
             raise NotmuchError(STATUS.NULL_POINTER)
-        return header
+        return header.decode('UTF-8')
 
     def get_filename(self):
         """Returns the file path of the message file
@@ -747,7 +747,7 @@ class Message(object):
         """A message() is represented by a 1-line summary"""
         msg = {}
         msg['from'] = self.get_header('from')
-        msg['tags'] = str(self.get_tags())
+        msg['tags'] = self.get_tags()
         msg['date'] = date.fromtimestamp(self.get_date())
         return "%(from)s (%(date)s) (%(tags)s)" % (msg)
 
diff --git a/bindings/python/notmuch/thread.py b/bindings/python/notmuch/thread.py
index 60f6c29..120f925 100644
--- a/bindings/python/notmuch/thread.py
+++ b/bindings/python/notmuch/thread.py
@@ -292,7 +292,7 @@ class Thread(object):
         """
         if self._thread is None:
             raise NotmuchError(STATUS.NOT_INITIALIZED)
-        return Thread._get_authors(self._thread)
+        return Thread._get_authors(self._thread).decode('UTF-8')
 
     def get_subject(self):
         """Returns the Subject of 'thread'
@@ -302,7 +302,7 @@ class Thread(object):
         """
         if self._thread is None:
             raise NotmuchError(STATUS.NOT_INITIALIZED)
-        return Thread._get_subject(self._thread)
+        return Thread._get_subject(self._thread).decode('UTF-8')
 
     def get_newest_date(self):
         """Returns time_t of the newest message date
-- 
1.7.4.1

* Re: [PATCH] [python] decode headers from utf-8 to unicode
  2011-08-12 13:23   ` [PATCH] [python] decode headers from utf-8 to unicode Patrick Totzke
@ 2011-08-15 13:49     ` Sebastian Spaeth
  2011-08-16 21:37     ` [PATCH 2/2] [python] fix unsafe utf-8 decodings Patrick Totzke
  1 sibling, 0 replies; 7+ messages in thread
From: Sebastian Spaeth @ 2011-08-15 13:49 UTC (permalink / raw)
  To: Patrick Totzke, notmuch; +Cc: patrick

On Fri, 12 Aug 2011 14:23:28 +0100, Patrick Totzke <patricktotzke@googlemail.com> wrote:
> From: patrick <p.totzke@ed.ac.uk>
> 
> as mail headers are stored as utf-8 in the index,
> it is safe to return them as unicode strings directly

Applied, thanks

Sebastian

* [PATCH 2/2] [python] fix unsafe utf-8 decodings
  2011-08-12 13:23   ` [PATCH] [python] decode headers from utf-8 to unicode Patrick Totzke
  2011-08-15 13:49     ` Sebastian Spaeth
@ 2011-08-16 21:37     ` Patrick Totzke
  2011-08-17 12:48       ` Sebastian Spaeth
  1 sibling, 1 reply; 7+ messages in thread
From: Patrick Totzke @ 2011-08-16 21:37 UTC (permalink / raw)
  To: notmuch

From: pazz <patricktotzke@gmail.com>

This prevents unsafe calls to decode for return
value None in get_authors/get_subject
---
 bindings/python/notmuch/tag.py    |    4 +++-
 bindings/python/notmuch/thread.py |   10 ++++++++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/bindings/python/notmuch/tag.py b/bindings/python/notmuch/tag.py
index d6abf28..9eb9fe2 100644
--- a/bindings/python/notmuch/tag.py
+++ b/bindings/python/notmuch/tag.py
@@ -86,7 +86,9 @@ class Tags(object):
         if not nmlib.notmuch_tags_valid(self._tags):
             self._tags = None
             raise StopIteration
-        tag = Tags._get(self._tags).decode('utf-8')
+        tag = Tags._get(self._tags)
+        if tag:
+            tag = tag.decode('UTF-8')
         nmlib.notmuch_tags_move_to_next(self._tags)
         return tag
 
diff --git a/bindings/python/notmuch/thread.py b/bindings/python/notmuch/thread.py
index 120f925..2a55bd9 100644
--- a/bindings/python/notmuch/thread.py
+++ b/bindings/python/notmuch/thread.py
@@ -292,7 +292,10 @@ class Thread(object):
         """
         if self._thread is None:
             raise NotmuchError(STATUS.NOT_INITIALIZED)
-        return Thread._get_authors(self._thread).decode('UTF-8')
+        authors = Thread._get_authors(self._thread)
+        if authors:
+            return authors.decode('UTF-8')
+        return None
 
     def get_subject(self):
         """Returns the Subject of 'thread'
@@ -302,7 +305,10 @@ class Thread(object):
         """
         if self._thread is None:
             raise NotmuchError(STATUS.NOT_INITIALIZED)
-        return Thread._get_subject(self._thread).decode('UTF-8')
+        subject = Thread._get_subject(self._thread)
+        if subject:
+            return subject.decode('UTF-8')
+        return None
 
     def get_newest_date(self):
         """Returns time_t of the newest message date
-- 
1.7.4.1

* Re: [PATCH 2/2] [python] fix unsafe utf-8 decodings
  2011-08-16 21:37     ` [PATCH 2/2] [python] fix unsafe utf-8 decodings Patrick Totzke
@ 2011-08-17 12:48       ` Sebastian Spaeth
  0 siblings, 0 replies; 7+ messages in thread
From: Sebastian Spaeth @ 2011-08-17 12:48 UTC (permalink / raw)
  To: Patrick Totzke, notmuch

On Tue, 16 Aug 2011 22:37:47 +0100, Patrick Totzke <patricktotzke@googlemail.com> wrote:
> This prevents unsafe calls to decode for return
> value None in get_authors/get_subject

Thanks for the heads up, I just pushed a modified version of this. Some
comments on the code below.

Sebastian

> -        tag = Tags._get(self._tags).decode('utf-8')
> +        tag = Tags._get(self._tags)
> +        if tag:
> +            tag = tag.decode('UTF-8')

This was already safe, as the preceding check
  if not nmlib.notmuch_tags_valid(self._tags):
makes sure that something useful is returned.

> -        return Thread._get_authors(self._thread).decode('UTF-8')
> +        authors = Thread._get_authors(self._thread)
> +        if authors:
> +            return authors.decode('UTF-8')
> +        return None

> -        return Thread._get_subject(self._thread).decode('UTF-8')
> +        subject = Thread._get_subject(self._thread)
> +        if subject:
> +            return subject.decode('UTF-8')
> +        return None

Modified this to say:

foo = get_foo()
if foo is None:
    return None
return foo.decode('UTF-8')

Otherwise you would fall into a trap when, e.g., the subject is empty and
'' is returned: your code would have returned None, whereas my version
returns ''.
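
The difference between the two checks can be sketched in isolation. Both
function names below are hypothetical and exist only for this comparison; byte
strings stand in for what the C library hands back.

```python
def decode_if_truthy(raw):
    """Check as in the submitted patch: an empty string is treated like None."""
    if raw:
        return raw.decode('UTF-8')
    return None


def decode_if_not_none(raw):
    """Check as in the pushed version: only None is passed through."""
    if raw is None:
        return None
    return raw.decode('UTF-8')


for raw in (None, b'', b'Re: hello'):
    # The two variants disagree only for the empty byte string.
    print(repr(raw), repr(decode_if_truthy(raw)), repr(decode_if_not_none(raw)))
```

Testing against None explicitly keeps "no value" and "empty value" apart, which
matters to callers that want to tell a missing subject from an empty one.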

Thanks!
