From: Brian Sniffen <bts@evenmere.org>
To: Matthew Lear <matt@bubblegen.co.uk>
Cc: Daniel Kahn Gillmor <dkg@fifthhorseman.net>,
Jani Nikula <jani@nikula.org>,
Vladimir Panteleev <thecybershadow@gmail.com>,
notmuch@notmuchmail.org
Subject: Re: web interface to notmuch
Date: Tue, 31 Oct 2017 15:21:40 -0400
Message-ID: <87h8ufnmwr.fsf@istari.evenmere.org>
In-Reply-To: <CAJFxaw8dDcPpJvQWmpJqjjXeFy=z9chA4bF0pJnw=_f6bxNi-Q@mail.gmail.com>
> just remove it), but along the way of searching and viewing mail, I've
> encountered quite a few occurrences of failing to UnicodeEncode. An example
> backtrace looks like this:
>
> Traceback (most recent call last):
> File "/usr/lib/python2.7/dist-packages/web/application.py", line 239, in
> process
> return self.handle()
> File "/usr/lib/python2.7/dist-packages/web/application.py", line 230, in
> handle
> return self._delegate(fn, self.fvars, args)
> File "/usr/lib/python2.7/dist-packages/web/application.py", line 420, in
> _delegate
> return handle_class(cls)
> File "/usr/lib/python2.7/dist-packages/web/application.py", line 396, in
> handle_class
> return tocall(*args)
> File "/b/git/notmuch-brians.git/contrib/notmuch-web/nmweb.py", line 153,
> in GET
> sprefix=webprefix)
> File "/usr/lib/python2.7/dist-packages/jinja2/environment.py", line 989,
> in render
> return self.environment.handle_exception(exc_info, True)
> File "/usr/lib/python2.7/dist-packages/jinja2/environment.py", line 754,
> in handle_exception
> reraise(exc_type, exc_value, tb)
> File "templates/show.html", line 1, in top-level template code
> {% extends "base.html" %}
> File "templates/base.html", line 32, in top-level template code
> {% block content %}
> File "templates/show.html", line 12, in block "content"
> {% for part in format_message(m.get_filename(),mid): %}{{ part|safe
> }}{% endfor %}
> File "/b/git/notmuch-brians.git/contrib/notmuch-web/nmweb.py", line 245,
> in format_message_walk
> tags=safe_tags).encode(part.get_content_charset('ascii')))
> UnicodeEncodeError: 'latin-1' codec can't encode character u'\u201c' in
> position 1141: ordinal not in range(256)
>
> 127.0.0.1:60968 - - [31/Oct/2017 17:00:02] "HTTP/1.1 GET /show/
> 665d8c5c2b024898ae21951c4b8b4f93@CO2PR05MB747.namprd05.prod.outlook.com" -
> 500 Internal Server Error
>
> I'm no Python expert, but from a quick google it would seem like the cause
> of such an exception is related to not using utf-8.
Neat. So to get there, this has to be a text/html part. It has to have
been decoded, either with the declared content type or with ascii. If a
\u201c (left double quote) showed up, it didn't get decoded as
ascii---and indeed, it looks like the content-type specifies latin-1.
But now when we try to encode back, using the same latin-1, it fails?
That's really neat.
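
One way this can arise, sketched below under an assumption the thread does not confirm: the body bytes were really Windows-1252 (or were decoded by a forgiving fallback), even though the part declares latin-1. The sample bytes and the helper name encode_strict are made up for illustration, and the sketch is written for Python 3 rather than the 2.7 in the traceback.

```python
# Hypothetical body bytes: 0x93/0x94 are curly quotes in cp1252,
# but C1 control characters in latin-1.
raw = b"\x93quoted\x94"

# A forgiving decode (assumed here to be cp1252) yields U+201C / U+201D.
text = raw.decode("cp1252")
assert text[0] == "\u201c"


def encode_strict(s, charset):
    """Return s encoded in charset, or None if charset cannot represent it."""
    try:
        return s.encode(charset)
    except UnicodeEncodeError:
        return None


# latin-1 only covers code points below 256, so U+201C cannot be encoded.
latin1_result = encode_strict(text, "latin-1")

# utf-8 can represent every code point, so the same text encodes fine.
utf8_result = encode_strict(text, "utf-8")
```

Here latin1_result is None while utf8_result holds the encoded bytes, which matches the "ordinal not in range(256)" message in the quoted traceback.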
> Brian - do you think something needs modifying in nmweb.py to cater for
> this type of thing, or is this somehow related my own mailstore (not sure
> why that would be as my messages haven't been modified).
Lots of mail has busted encoding. I've done some defensive work against
that---look at decodeAnyway and shed a tear for purity---but clearly not
enough. Can you send me a message that causes the problem?
In the meantime, I think line 245 ought to be, appropriately indented:

    tags=safe_tags).encode(part.get_content_charset('ascii'),
                           'xmlcharrefreplace'))
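
As a rough illustration of what the 'xmlcharrefreplace' error handler does, here is a standalone sketch. The sample string is made up and the code is Python 3, but the handler is a standard codec error handler and behaves the same way on Python 2.7 unicode objects.

```python
# Text containing characters that latin-1 cannot represent.
text = "\u201cquoted\u201d"

# Instead of raising UnicodeEncodeError, unencodable characters are
# replaced with numeric XML/HTML character references.
encoded = text.encode("latin-1", "xmlcharrefreplace")

# Characters that latin-1 can represent pass through unchanged.
plain = "caf\u00e9".encode("latin-1", "xmlcharrefreplace")
```

With this handler, encoded is b'&#8220;quoted&#8221;'. Since the part is rendered as HTML, a browser should show those references as the original curly quotes.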
Thanks for the report---investigating it showed me that the search box
doesn't tolerate that character either.
-Brian