unofficial mirror of notmuch@notmuchmail.org
From: Brian Sniffen <bts@evenmere.org>
To: Matthew Lear <matt@bubblegen.co.uk>
Cc: Daniel Kahn Gillmor <dkg@fifthhorseman.net>,
	Jani Nikula <jani@nikula.org>,
	Vladimir Panteleev <thecybershadow@gmail.com>,
	notmuch@notmuchmail.org
Subject: Re: web interface to notmuch
Date: Tue, 31 Oct 2017 15:21:40 -0400
Message-ID: <87h8ufnmwr.fsf@istari.evenmere.org>
In-Reply-To: <CAJFxaw8dDcPpJvQWmpJqjjXeFy=z9chA4bF0pJnw=_f6bxNi-Q@mail.gmail.com>

> just remove it), but along the way of searching and viewing mail, I've
> encountered quite a few occurrences of failing to UnicodeEncode. An example
> backtrace looks like this:
>
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/dist-packages/web/application.py", line 239, in process
>     return self.handle()
>   File "/usr/lib/python2.7/dist-packages/web/application.py", line 230, in handle
>     return self._delegate(fn, self.fvars, args)
>   File "/usr/lib/python2.7/dist-packages/web/application.py", line 420, in _delegate
>     return handle_class(cls)
>   File "/usr/lib/python2.7/dist-packages/web/application.py", line 396, in handle_class
>     return tocall(*args)
>   File "/b/git/notmuch-brians.git/contrib/notmuch-web/nmweb.py", line 153, in GET
>     sprefix=webprefix)
>   File "/usr/lib/python2.7/dist-packages/jinja2/environment.py", line 989, in render
>     return self.environment.handle_exception(exc_info, True)
>   File "/usr/lib/python2.7/dist-packages/jinja2/environment.py", line 754, in handle_exception
>     reraise(exc_type, exc_value, tb)
>   File "templates/show.html", line 1, in top-level template code
>     {% extends "base.html" %}
>   File "templates/base.html", line 32, in top-level template code
>     {% block content %}
>   File "templates/show.html", line 12, in block "content"
>     {% for part in format_message(m.get_filename(),mid): %}{{ part|safe }}{% endfor %}
>   File "/b/git/notmuch-brians.git/contrib/notmuch-web/nmweb.py", line 245, in format_message_walk
>     tags=safe_tags).encode(part.get_content_charset('ascii')))
> UnicodeEncodeError: 'latin-1' codec can't encode character u'\u201c' in position 1141: ordinal not in range(256)
>
> 127.0.0.1:60968 - - [31/Oct/2017 17:00:02] "HTTP/1.1 GET /show/665d8c5c2b024898ae21951c4b8b4f93@CO2PR05MB747.namprd05.prod.outlook.com" - 500 Internal Server Error
>
> I'm no Python expert, but from a quick google it would seem like the cause
> of such an exception is related to not using utf-8.

Neat.  So to get there, this has to be a text/html part.  It has to have
been decoded, either with the declared content type or with ascii.  If a
\u201c (left double quote) showed up, it didn't get decoded as
ascii---and indeed, it looks like the content-type specifies latin-1.
But now when we try to encode back, using the same latin-1, it fails?
That's really neat.
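
One way a decode/encode pair can be asymmetric like this, offered only as
an illustration and not as a diagnosis of this message: if the bytes were
actually Windows-1252 and got decoded with that charset (say, by a
fallback path), the curly quote becomes u'\u201c', which latin-1 cannot
represent on the way back out. The sample bytes and variable names below
are made up for the sketch.

```python
# Hypothetical sample: cp1252 bytes for curly-quoted text.
raw = b"\x93quoted\x94"

# Decoding as cp1252 maps 0x93/0x94 to U+201C/U+201D.
text = raw.decode("cp1252")

# Encoding that text as latin-1 fails, because U+201C is above U+00FF.
try:
    text.encode("latin-1")
    round_trips = True
except UnicodeEncodeError:
    round_trips = False

# By contrast, decoding the same bytes as latin-1 never fails and
# always encodes back to the original bytes.
latin_text = raw.decode("latin-1")
same_bytes = latin_text.encode("latin-1") == raw
```

So a strict latin-1 decode followed by a latin-1 encode should always
succeed; the failure suggests the text was decoded some other way first.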

> Brian - do you think something needs modifying in nmweb.py to cater for
> this type of thing, or is this somehow related to my own mailstore (not sure
> why that would be as my messages haven't been modified)?

Lots of mail has busted encoding.  I've done some defensive work against
that---look at decodeAnyway and shed a tear for purity---but clearly not
enough.  Can you send me a message that causes the problem?

In the meantime, I think line 245 ought to be, appropriately indented:

    tags=safe_tags).encode(part.get_content_charset('ascii'),
    'xmlcharrefreplace'))
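
For anyone curious what that error handler does, here is a minimal
standalone sketch. The helper name encode_for_part and the sample string
are invented for the example; only str.encode and the
'xmlcharrefreplace' handler come from Python itself.

```python
# Sample text containing curly quotes that latin-1 cannot represent.
sample = u"He said \u201chello\u201d"

def encode_for_part(text, charset):
    """Encode text in charset, replacing unencodable characters
    with XML numeric character references (e.g. &#8220;)."""
    return text.encode(charset, "xmlcharrefreplace")

# The strict default raises, as in the traceback above.
try:
    sample.encode("latin-1")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# With the handler, encoding succeeds and the quotes become references.
encoded = encode_for_part(sample, "latin-1")
```

Since the output is a text/html part, a browser should render the
numeric references as the original characters.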

Thanks for the report---investigating it showed me that the search box
doesn't tolerate that character either.

-Brian

