From: Florian Klink
To: notmuch@notmuchmail.org
Subject: [PATCH] python: open messages in binary mode
Date: Thu, 24 Aug 2017 23:30:01 +0200
Message-Id: <20170824213001.22353-1-flokli@flokli.de>
X-Mailer: git-send-email 2.14.1

Currently, notmuch's get_message_parts() opens the file in text mode and
passes the file object to email.message_from_file(fp).
If the email contains non-ASCII bytes that cannot be decoded with the
default text encoding (UTF-8 here), reading may fail inside email.parser
with the following exception:

  File "/usr/lib/python3.6/site-packages/notmuch/message.py", line 591, in get_message_parts
    email_msg = email.message_from_binary_file(fp)
  File "/usr/lib/python3.6/email/__init__.py", line 62, in message_from_binary_file
    return BytesParser(*args, **kws).parse(fp)
  File "/usr/lib/python3.6/email/parser.py", line 110, in parse
    return self.parser.parse(fp, headersonly)
  File "/usr/lib/python3.6/email/parser.py", line 54, in parse
    data = fp.read(8192)
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 1865: invalid continuation byte

To fix this, open the file in binary mode and pass it to
email.message_from_binary_file(fp).

Signed-off-by: Florian Klink
---
 bindings/python/notmuch/message.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/bindings/python/notmuch/message.py b/bindings/python/notmuch/message.py
index cce377d0..531b22d0 100644
--- a/bindings/python/notmuch/message.py
+++ b/bindings/python/notmuch/message.py
@@ -587,8 +587,8 @@ class Message(Python3StringMixIn):
 
     def get_message_parts(self):
         """Output like notmuch show"""
-        fp = open(self.get_filename())
-        email_msg = email.message_from_file(fp)
+        fp = open(self.get_filename(), 'rb')
+        email_msg = email.message_from_binary_file(fp)
         fp.close()
 
         out = []
-- 
2.14.1
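
For anyone who wants to reproduce the difference outside of notmuch, here is a
minimal standalone sketch. It is not part of the patch. The sample message, the
helper names (parse_text_mode, parse_binary_mode, demo) and the explicit
encoding="utf-8" argument are assumptions made for illustration; the explicit
encoding only stands in for a UTF-8 locale so the result does not depend on the
machine it runs on.

```python
import email
import os
import tempfile

# Hypothetical sample message: an 8bit Latin-1 body, so the byte 0xe4 ("ä")
# is followed by an ASCII letter and is therefore not valid UTF-8.
RAW = (
    b"From: sender@example.org\n"
    b"Subject: test\n"
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/plain; charset=iso-8859-1\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"K\xe4se\n"
)


def parse_text_mode(path):
    """Old behaviour: decode the whole file as text before parsing."""
    # encoding="utf-8" is an assumption that mimics a UTF-8 locale.
    with open(path, encoding="utf-8") as fp:
        return email.message_from_file(fp)


def parse_binary_mode(path):
    """Patched behaviour: hand raw bytes to the email package."""
    with open(path, "rb") as fp:
        return email.message_from_binary_file(fp)


def demo():
    """Return (text_mode_failed, decoded_body) for the sample message."""
    fd, path = tempfile.mkstemp(suffix=".eml")
    try:
        with os.fdopen(fd, "wb") as fp:
            fp.write(RAW)

        try:
            parse_text_mode(path)
            text_mode_failed = False
        except UnicodeDecodeError:
            text_mode_failed = True

        msg = parse_binary_mode(path)
        # The charset comes from the message's own Content-Type header.
        body = msg.get_payload(decode=True).decode(msg.get_content_charset())
        return text_mode_failed, body
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is that in binary mode the charset is taken from the
MIME headers of each part rather than from the locale of the reading process,
which is why the parser no longer trips over such messages.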