unofficial mirror of notmuch@notmuchmail.org
* [python] unicode strings
@ 2011-12-05 21:19 Patrick Totzke
  2011-12-05 21:19 ` [PATCH 1/3] clean up Thread.__str__ Patrick Totzke
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Patrick Totzke @ 2011-12-05 21:19 UTC (permalink / raw)
  To: notmuch


Unicode handling fixes for the bindings:
 - use __unicode__ for string representations, __str__ falls back to this
 - less complicated __str__ for Thread and Message
 - use errors='ignore' parameter for str.decode(). This should fix the
   "UnicodeDecodeError: 'utf8' codec can't decode byte 0xe8 in position"
   exceptions people were seeing.

Best,
/p
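
To make the first point concrete, here is a minimal sketch of the `__unicode__`/`__str__` split. The class `Summary` and the `PY2` flag are invented for this illustration and are not part of the bindings. The Python 3 branch is there only so the snippet runs on current interpreters; the patches themselves target Python 2.

```python
# -*- coding: utf-8 -*-
import sys

PY2 = sys.version_info[0] == 2  # hypothetical helper flag, not in the bindings


class Summary(object):
    """Hypothetical stand-in for Thread, Message or Tags."""

    def __init__(self, subject):
        self.subject = subject  # expected to be a text (unicode) string

    def __unicode__(self):
        # All formatting is done on text strings.
        return u"thread: %s" % self.subject

    def __str__(self):
        text = self.__unicode__()
        # On Python 2, str() has to return bytes, so encode at the boundary.
        # On Python 3, str is already text and is returned unchanged.
        return text.encode('utf-8') if PY2 else text


summary = Summary(u"caf\xe9")
print(repr(summary.__unicode__()))
```

The idea is that the object only ever builds text internally, and the single `encode('utf-8')` call in `__str__` is the one place where bytes are produced.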


* [PATCH 1/3] clean up Thread.__str__
  2011-12-05 21:19 [python] unicode strings Patrick Totzke
@ 2011-12-05 21:19 ` Patrick Totzke
  2011-12-05 21:19 ` [PATCH 2/3] use __unicode__ for string representation Patrick Totzke
  2011-12-05 21:19 ` [PATCH 3/3] errors='ignore' when decode to unicode Patrick Totzke
  2 siblings, 0 replies; 8+ messages in thread
From: Patrick Totzke @ 2011-12-05 21:19 UTC (permalink / raw)
  To: notmuch

---
 bindings/python/notmuch/thread.py |   37 +++++++++++++------------------------
 1 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/bindings/python/notmuch/thread.py b/bindings/python/notmuch/thread.py
index d903c76..3e59b35 100644
--- a/bindings/python/notmuch/thread.py
+++ b/bindings/python/notmuch/thread.py
@@ -393,30 +393,19 @@ class Thread(object):
         return Tags(tags_p, self)
 
     def __str__(self):
-        """A str(Thread()) is represented by a 1-line summary"""
-        thread = {}
-        thread['id'] = self.get_thread_id()
-
-        ###TODO: How do we find out the current sort order of Threads?
-        ###Add a "sort" attribute to the Threads() object?
-        #if (sort == NOTMUCH_SORT_OLDEST_FIRST)
-        #         date = notmuch_thread_get_oldest_date (thread);
-        #else
-        #         date = notmuch_thread_get_newest_date (thread);
-        thread['date'] = date.fromtimestamp(self.get_newest_date())
-        thread['matched'] = self.get_matched_messages()
-        thread['total'] = self.get_total_messages()
-        thread['authors'] = self.get_authors()
-        thread['subject'] = self.get_subject()
-        thread['tags'] = self.get_tags()
-
-        return "thread:%s %12s [%d/%d] %s; %s (%s)" % (thread['id'],
-                                                       thread['date'],
-                                                       thread['matched'],
-                                                       thread['total'],
-                                                       thread['authors'],
-                                                       thread['subject'],
-                                                       thread['tags'])
+        return unicode(self).encode('utf-8')
+
+    def __unicode__(self):
+        frm = "thread:%s %12s [%d/%d] %s; %s (%s)"
+
+        return frm % (self.get_thread_id(),
+                      date.fromtimestamp(self.get_newest_date()),
+                      self.get_matched_messages(),
+                      self.get_total_messages(),
+                      self.get_authors(),
+                      self.get_subject(),
+                      self.get_tags(),
+                     )
 
     _destroy = nmlib.notmuch_thread_destroy
     _destroy.argtypes = [NotmuchThreadP]
-- 
1.7.4.1


* [PATCH 2/3] use __unicode__ for string representation
  2011-12-05 21:19 [python] unicode strings Patrick Totzke
  2011-12-05 21:19 ` [PATCH 1/3] clean up Thread.__str__ Patrick Totzke
@ 2011-12-05 21:19 ` Patrick Totzke
  2011-12-05 22:56   ` [PATCH] " patricktotzke
  2011-12-05 21:19 ` [PATCH 3/3] errors='ignore' when decode to unicode Patrick Totzke
  2 siblings, 1 reply; 8+ messages in thread
From: Patrick Totzke @ 2011-12-05 21:19 UTC (permalink / raw)
  To: notmuch

---
 bindings/python/notmuch/filename.py |    3 +++
 bindings/python/notmuch/globals.py  |   11 +++++++----
 bindings/python/notmuch/message.py  |   14 ++++++++------
 bindings/python/notmuch/tag.py      |    7 +++++--
 4 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/bindings/python/notmuch/filename.py b/bindings/python/notmuch/filename.py
index 077754e..80755ee 100644
--- a/bindings/python/notmuch/filename.py
+++ b/bindings/python/notmuch/filename.py
@@ -99,6 +99,9 @@ class Filenames(object):
         self._files = None
 
     def __str__(self):
+        return unicode(self).encode('utf-8')
+
+    def __unicode__(self):
         """Represent Filenames() as newline-separated list of full paths
 
         .. note:: As this iterates over the filenames, we will not be
diff --git a/bindings/python/notmuch/globals.py b/bindings/python/notmuch/globals.py
index 36354fc..62b2df1 100644
--- a/bindings/python/notmuch/globals.py
+++ b/bindings/python/notmuch/globals.py
@@ -49,11 +49,11 @@ class Status(Enum):
 
     @classmethod
     def status2str(self, status):
-        """Get a string representation of a notmuch_status_t value."""
+        """Get a (unicode) string representation of a notmuch_status_t value."""
         # define strings for custom error messages
         if status == STATUS.NOT_INITIALIZED:
-            return "Operation on uninitialized object impossible."
-        return str(Status._status2str(status))
+            return u"Operation on uninitialized object impossible."
+        return unicode(Status._status2str(status))
 
 STATUS = Status(['SUCCESS',
   'OUT_OF_MEMORY',
@@ -133,12 +133,15 @@ class NotmuchError(Exception):
         self.message = message
 
     def __str__(self):
+        return unicode(self).encode('utf-8')
+
+    def __unicode__(self):
         if self.message is not None:
             return self.message
         elif self.status is not None:
             return STATUS.status2str(self.status)
         else:
-            return 'Unknown error'
+            return u'Unknown error'
 
 # List of Subclassed exceptions that correspond to STATUS values and are
 # subclasses of NotmuchError.
diff --git a/bindings/python/notmuch/message.py b/bindings/python/notmuch/message.py
index e0c7eda..fac575c 100644
--- a/bindings/python/notmuch/message.py
+++ b/bindings/python/notmuch/message.py
@@ -794,12 +794,14 @@ class Message(object):
         return self.__str__()
 
     def __str__(self):
-        """A message() is represented by a 1-line summary"""
-        msg = {}
-        msg['from'] = self.get_header('from')
-        msg['tags'] = self.get_tags()
-        msg['date'] = date.fromtimestamp(self.get_date())
-        return "%(from)s (%(date)s) (%(tags)s)" % (msg)
+        return unicode(self).encode('utf-8')
+
+    def __unicode__(self):
+        format = "%(from)s (%(date)s) (%(tags)s)"
+        return format % (self.get_header('from'),
+                         self.get_tags(),
+                         date.fromtimestamp(self.get_date()),
+                        )
 
     def get_message_parts(self):
         """Output like notmuch show"""
diff --git a/bindings/python/notmuch/tag.py b/bindings/python/notmuch/tag.py
index f3a3d27..36aeeed 100644
--- a/bindings/python/notmuch/tag.py
+++ b/bindings/python/notmuch/tag.py
@@ -95,7 +95,7 @@ class Tags(object):
         if not self._valid(self._tags):
             self._tags = None
             raise StopIteration
-        tag = Tags._get(self._tags).decode('UTF-8')
+        tag = Tags._get(self._tags)
         self._move_to_next(self._tags)
         return tag
 
@@ -111,7 +111,10 @@ class Tags(object):
         return self._valid(self._tags) > 0
 
     def __str__(self):
-        """The str() representation of Tags() is a space separated list of tags
+        return unicode(self).encode('utf-8')
+
+    def __unicode__(self):
+        """string representation of :class:`Tags`: a space separated list of tags
 
         .. note:: As this iterates over the tags, we will not be able
                to iterate over them again (as in retrieve them)! If
-- 
1.7.4.1


* [PATCH 3/3] errors='ignore' when decode to unicode
  2011-12-05 21:19 [python] unicode strings Patrick Totzke
  2011-12-05 21:19 ` [PATCH 1/3] clean up Thread.__str__ Patrick Totzke
  2011-12-05 21:19 ` [PATCH 2/3] use __unicode__ for string representation Patrick Totzke
@ 2011-12-05 21:19 ` Patrick Totzke
  2011-12-06 12:31   ` Sebastian Spaeth
  2 siblings, 1 reply; 8+ messages in thread
From: Patrick Totzke @ 2011-12-05 21:19 UTC (permalink / raw)
  To: notmuch

---
 bindings/python/notmuch/message.py |    2 +-
 bindings/python/notmuch/thread.py  |    4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/bindings/python/notmuch/message.py b/bindings/python/notmuch/message.py
index fac575c..4790663 100644
--- a/bindings/python/notmuch/message.py
+++ b/bindings/python/notmuch/message.py
@@ -425,7 +425,7 @@ class Message(object):
         header = Message._get_header(self._msg, header)
         if header == None:
             raise NotmuchError(STATUS.NULL_POINTER)
-        return header.decode('UTF-8')
+        return header.decode('UTF-8', errors='ignore')
 
     def get_filename(self):
         """Returns the file path of the message file
diff --git a/bindings/python/notmuch/thread.py b/bindings/python/notmuch/thread.py
index 3e59b35..ddefade 100644
--- a/bindings/python/notmuch/thread.py
+++ b/bindings/python/notmuch/thread.py
@@ -326,7 +326,7 @@ class Thread(object):
         authors = Thread._get_authors(self._thread)
         if authors is None:
             return None
-        return authors.decode('UTF-8')
+        return authors.decode('UTF-8', errors='ignore')
 
     def get_subject(self):
         """Returns the Subject of 'thread'
@@ -339,7 +339,7 @@ class Thread(object):
         subject = Thread._get_subject(self._thread)
         if subject is None:
             return None
-        return subject.decode('UTF-8')
+        return subject.decode('UTF-8', errors='ignore')
 
     def get_newest_date(self):
         """Returns time_t of the newest message date
-- 
1.7.4.1
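
For readers unfamiliar with the `errors` argument used in this patch, the following sketch compares the three common error handlers on a byte string similar to the one in the reported exception. The sample bytes are made up. The handler is passed positionally here because, as far as I know, keyword arguments to `str.decode()` are only accepted from Python 2.7 onwards; treat that as something to verify if older interpreters matter.

```python
raw = b"caf\xe8 bar"  # 0xe8 opens a multi-byte sequence that never completes

try:
    raw.decode("utf-8")  # default handler is 'strict'
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

ignored = raw.decode("utf-8", "ignore")    # the offending byte is dropped
replaced = raw.decode("utf-8", "replace")  # the offending byte becomes U+FFFD

print(strict_failed, repr(ignored), repr(replaced))
```

`'ignore'` silently loses the undecodable byte, while `'replace'` leaves a visible marker in the result. Which of the two is preferable for headers is a design choice that the patch settles in favour of `'ignore'`.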


* [PATCH] use __unicode__ for string representation
  2011-12-05 21:19 ` [PATCH 2/3] use __unicode__ for string representation Patrick Totzke
@ 2011-12-05 22:56   ` patricktotzke
  0 siblings, 0 replies; 8+ messages in thread
From: patricktotzke @ 2011-12-05 22:56 UTC (permalink / raw)
  To: notmuch

From: Patrick Totzke <patricktotzke@gmail.com>

---
 bindings/python/notmuch/filename.py |    3 +++
 bindings/python/notmuch/globals.py  |   11 +++++++----
 bindings/python/notmuch/message.py  |   14 ++++++++------
 bindings/python/notmuch/tag.py      |    5 ++++-
 4 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/bindings/python/notmuch/filename.py b/bindings/python/notmuch/filename.py
index 077754e..80755ee 100644
--- a/bindings/python/notmuch/filename.py
+++ b/bindings/python/notmuch/filename.py
@@ -99,6 +99,9 @@ class Filenames(object):
         self._files = None
 
     def __str__(self):
+        return unicode(self).encode('utf-8')
+
+    def __unicode__(self):
         """Represent Filenames() as newline-separated list of full paths
 
         .. note:: As this iterates over the filenames, we will not be
diff --git a/bindings/python/notmuch/globals.py b/bindings/python/notmuch/globals.py
index 36354fc..62b2df1 100644
--- a/bindings/python/notmuch/globals.py
+++ b/bindings/python/notmuch/globals.py
@@ -49,11 +49,11 @@ class Status(Enum):
 
     @classmethod
     def status2str(self, status):
-        """Get a string representation of a notmuch_status_t value."""
+        """Get a (unicode) string representation of a notmuch_status_t value."""
         # define strings for custom error messages
         if status == STATUS.NOT_INITIALIZED:
-            return "Operation on uninitialized object impossible."
-        return str(Status._status2str(status))
+            return u"Operation on uninitialized object impossible."
+        return unicode(Status._status2str(status))
 
 STATUS = Status(['SUCCESS',
   'OUT_OF_MEMORY',
@@ -133,12 +133,15 @@ class NotmuchError(Exception):
         self.message = message
 
     def __str__(self):
+        return unicode(self).encode('utf-8')
+
+    def __unicode__(self):
         if self.message is not None:
             return self.message
         elif self.status is not None:
             return STATUS.status2str(self.status)
         else:
-            return 'Unknown error'
+            return u'Unknown error'
 
 # List of Subclassed exceptions that correspond to STATUS values and are
 # subclasses of NotmuchError.
diff --git a/bindings/python/notmuch/message.py b/bindings/python/notmuch/message.py
index e0c7eda..fac575c 100644
--- a/bindings/python/notmuch/message.py
+++ b/bindings/python/notmuch/message.py
@@ -794,12 +794,14 @@ class Message(object):
         return self.__str__()
 
     def __str__(self):
-        """A message() is represented by a 1-line summary"""
-        msg = {}
-        msg['from'] = self.get_header('from')
-        msg['tags'] = self.get_tags()
-        msg['date'] = date.fromtimestamp(self.get_date())
-        return "%(from)s (%(date)s) (%(tags)s)" % (msg)
+        return unicode(self).encode('utf-8')
+
+    def __unicode__(self):
+        format = "%(from)s (%(date)s) (%(tags)s)"
+        return format % (self.get_header('from'),
+                         self.get_tags(),
+                         date.fromtimestamp(self.get_date()),
+                        )
 
     def get_message_parts(self):
         """Output like notmuch show"""
diff --git a/bindings/python/notmuch/tag.py b/bindings/python/notmuch/tag.py
index f3a3d27..52607ed 100644
--- a/bindings/python/notmuch/tag.py
+++ b/bindings/python/notmuch/tag.py
@@ -111,7 +111,10 @@ class Tags(object):
         return self._valid(self._tags) > 0
 
     def __str__(self):
-        """The str() representation of Tags() is a space separated list of tags
+        return unicode(self).encode('utf-8')
+
+    def __unicode__(self):
+        """string representation of :class:`Tags`: a space separated list of tags
 
         .. note:: As this iterates over the tags, we will not be able
                to iterate over them again (as in retrieve them)! If
-- 
1.7.4.1


* Re: [PATCH 3/3] errors='ignore' when decode to unicode
  2011-12-05 21:19 ` [PATCH 3/3] errors='ignore' when decode to unicode Patrick Totzke
@ 2011-12-06 12:31   ` Sebastian Spaeth
  2011-12-06 20:22     ` fix error introduced in pushed patch Patrick Totzke
  0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Spaeth @ 2011-12-06 12:31 UTC (permalink / raw)
  To: Patrick Totzke, notmuch



Just for reference, all three patches went in.
Perhaps this warrants a NEWS entry such as:

 * python: use unicode more consistently and be more robust against
   unicode errors (credits to Patrick Totzke)



* fix error introduced in pushed patch
  2011-12-06 12:31   ` Sebastian Spaeth
@ 2011-12-06 20:22     ` Patrick Totzke
  2011-12-06 20:22       ` [PATCH] fix format string in Message.__unicode__ Patrick Totzke
  0 siblings, 1 reply; 8+ messages in thread
From: Patrick Totzke @ 2011-12-06 20:22 UTC (permalink / raw)
  To: notmuch

Hi,
A friend of mine just complained about an issue he had with upstream notmuch python
that is due to my recent reformatting of Message.__str__.
This patch fixes it.
Sorry for the inconvenience,
/p


* [PATCH] fix format string in Message.__unicode__
  2011-12-06 20:22     ` fix error introduced in pushed patch Patrick Totzke
@ 2011-12-06 20:22       ` Patrick Totzke
  0 siblings, 0 replies; 8+ messages in thread
From: Patrick Totzke @ 2011-12-06 20:22 UTC (permalink / raw)
  To: notmuch

Since 2b0116119160f2dc83, Message.__str__ no longer builds a
dict of the message data before applying the format string.
This changes the format string to take positional parameters
instead of a dict.
---
 bindings/python/notmuch/message.py |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/bindings/python/notmuch/message.py b/bindings/python/notmuch/message.py
index f95e50e..ce8e718 100644
--- a/bindings/python/notmuch/message.py
+++ b/bindings/python/notmuch/message.py
@@ -799,7 +799,7 @@ class Message(object):
         return unicode(self).encode('utf-8')
 
     def __unicode__(self):
-        format = "%(from)s (%(date)s) (%(tags)s)"
+        format = "%s (%s) (%s)"
         return format % (self.get_header('from'),
                          self.get_tags(),
                          date.fromtimestamp(self.get_date()),
-- 
1.7.4.1
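
The failure this patch addresses can be reproduced in isolation. The sketch below uses made-up field values and applies both format strings from the diff to a plain tuple; it is an illustration of the `%` operator's behaviour, not code from the bindings.

```python
fields = ("alice@example.org", "2011-12-06", "inbox unread")  # made-up values

named = "%(from)s (%(date)s) (%(tags)s)"
positional = "%s (%s) (%s)"

try:
    named % fields  # named placeholders need a mapping, not a tuple
    named_error = None
except TypeError as exc:
    named_error = str(exc)

# Named placeholders work once the values are supplied as a dict.
as_dict = named % {"from": fields[0], "date": fields[1], "tags": fields[2]}

# Positional placeholders consume the tuple in order.
as_tuple = positional % fields

print(named_error)
print(as_tuple)
```

With positional placeholders the order of the tuple determines the output. The tuple in the patch lists the tags before the date, so the rendered order appears to differ from the old named format, which put the date first.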


Thread overview: 8+ messages
2011-12-05 21:19 [python] unicode strings Patrick Totzke
2011-12-05 21:19 ` [PATCH 1/3] clean up Thread.__str__ Patrick Totzke
2011-12-05 21:19 ` [PATCH 2/3] use __unicode__ for string representation Patrick Totzke
2011-12-05 22:56   ` [PATCH] " patricktotzke
2011-12-05 21:19 ` [PATCH 3/3] errors='ignore' when decode to unicode Patrick Totzke
2011-12-06 12:31   ` Sebastian Spaeth
2011-12-06 20:22     ` fix error introduced in pushed patch Patrick Totzke
2011-12-06 20:22       ` [PATCH] fix format string in Message.__unicode__ Patrick Totzke
