From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from localhost (localhost [127.0.0.1])
    by arlo.cworth.org (Postfix) with ESMTP id 41A4A6DE0C66
    for ; Wed, 1 Nov 2017 06:02:20 -0700 (PDT)
X-Virus-Scanned: Debian amavisd-new at cworth.org
X-Spam-Flag: NO
X-Spam-Score: -0.718
X-Spam-Level:
X-Spam-Status: No, score=-0.718 tagged_above=-999 required=5
    tests=[AWL=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7,
    RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01] autolearn=disabled
Received: from arlo.cworth.org ([127.0.0.1])
    by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024)
    with ESMTP id XynM1Oq-cR62 for ; Wed, 1 Nov 2017 06:02:15 -0700 (PDT)
Received: from avasout05.plus.net (avasout05.plus.net [84.93.230.250])
    by arlo.cworth.org (Postfix) with ESMTPS id E56916DE0C3F
    for ; Wed, 1 Nov 2017 06:02:14 -0700 (PDT)
Received: from mail.bubblegen.co.uk ([80.229.236.194])
    by smtp with ESMTP id 9sepeb7nK2du79seqeSazA;
    Wed, 01 Nov 2017 13:02:12 +0000
X-CM-Score: 0.00
X-CNFS-Analysis: v=2.2 cv=a6FAzQaF c=1 sm=1 tr=0
    a=G4bc5lkgapKKm1P+Twxy3Q==:117 a=G4bc5lkgapKKm1P+Twxy3Q==:17
    a=sC3jslCIGhcA:10 a=80hmnl3cAAAA:8 a=J6cydZkJAAAA:8 a=pGLkceISAAAA:8
    a=nU5gdv_OJyNFdElqy-IA:9 a=QEXdDO2ut3YA:10 a=93ifz_K-DzNtId4BFQYA:9
    a=rZsrr8RBufw9ySOk:21 a=enaiOHsCviWFCEVJ:21 a=FQeu1HYxzEMaXP5s:21
    a=G-y4FhfbluXncOafXg4t:22 a=L0coiOcRLHDvGY_RdqVO:22
Received: from mail-lf0-f44.google.com (mail-lf0-f44.google.com [209.85.215.44])
    (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
    (No client certificate requested)
    (Authenticated sender: matt)
    by mail.bubblegen.co.uk (Postfix) with ESMTPSA id DFDB64622017
    for ; Wed, 1 Nov 2017 13:02:10 +0000 (GMT)
Received: by mail-lf0-f44.google.com with SMTP id k40so2445372lfi.4
    for ; Wed, 01 Nov 2017 06:02:10 -0700 (PDT)
X-Gm-Message-State: AMCzsaVNTnOoCqqwludYEwodGanDX0Re3ORWy5QA7dn+smmxMwfHi20m
    6xgIvBBR/tLtIGyem+RWAOmJ6HY97DmMOgBsLhM=
X-Google-Smtp-Source: ABhQp+RP+j/qygkVOnRc7lKqZ52fxdX6xx/yg1mCv56ESmyXEPf/CXd7fT/C264FtPlD5eIJN5gkfQjyQM5gTf2N+fI=
X-Received: by 10.46.64.81 with SMTP id n78mr2710692lja.33.1509541329908;
    Wed, 01 Nov 2017 06:02:09 -0700 (PDT)
MIME-Version: 1.0
References: <87tvyvp4f2.fsf@istari.evenmere.org>
    <87376f13ho.fsf@fifthhorseman.net>
    <87r2tww9tr.fsf@nikula.org>
    <87wp3ow39i.fsf@fifthhorseman.net>
    <27e53def-32b4-45ab-1192-77cc0e837a93@gmail.com>
    <87zi8eopgq.fsf@istari.evenmere.org>
    <877evhy53k.fsf@fifthhorseman.net>
    <87she5nsmy.fsf@istari.evenmere.org>
    <87inf1gm7l.fsf@fifthhorseman.net>
    <87mv4co4vz.fsf@istari.evenmere.org>
    <87h8ufnmwr.fsf@istari.evenmere.org>
In-Reply-To:
From: Matthew Lear
Date: Wed, 01 Nov 2017 13:01:59 +0000
X-Gmail-Original-Message-ID:
Message-ID:
Subject: Re: web interface to notmuch
To: Brian Sniffen
Cc: Daniel Kahn Gillmor , Jani Nikula , Vladimir Panteleev ,
    notmuch@notmuchmail.org
Content-Type: multipart/alternative; boundary="94eb2c1bfd42be3ac0055ceb7afa"
X-CMAE-Envelope: MS4wfEHJ3n1e6AyD4Zuhrzz8nJ0hRlXmhquM2Hda8/x7rH1kltaIBXgIObgbntCeEXpJiMAAk2UhxHsFf842LWhbxDmHFdk1/PBJ9Jy7MinU1dTUvN1CRBG1
    4Bt6n/OrIndYYhl/fbSP3bl25ZrBKvJWeBGix786MDmpFbTf3RzXwKKzZcp4cv/PHX1Iyum7+Ik8EQ==
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 01 Nov 2017 13:02:20 -0000

--94eb2c1bfd42be3ac0055ceb7afa
Content-Type: text/plain; charset="UTF-8"

I compared nmweb with the Haskell-based notmuch-web. notmuch-web is
slightly slower to render a browser page for the same search terms, but
it displays the email that causes nmweb to throw the encoding exception
just fine. I guess something in that implementation handles encoding
differently.

Regards,
  Matt

On Tue, 31 Oct 2017, 21:32 Matthew Lear, wrote:

> On Tue, Oct 31, 2017 at 7:21 PM, Brian Sniffen wrote:
>
>> > I'm no Python expert, but from a quick google it would seem like the
>> > cause of such an exception is related to not using utf-8.
>>
>> Neat.  So to get there, this has to be a text/html part.  It has to have
>> been decoded, either with the declared content type or with ascii.  If a
>> \u201c (left double quote) showed up, it didn't get decoded as
>> ascii---and indeed, it looks like the content-type specifies latin-1.
>> But now when we try to encode back, using the same latin-1, it fails?
>> That's really neat.
>>
>> > Brian - do you think something needs modifying in nmweb.py to cater for
>> > this type of thing, or is this somehow related to my own mailstore (not
>> > sure why that would be as my messages haven't been modified).
>>
>> Lots of mail has busted encoding.  I've done some defensive work against
>> that---look at decodeAnyway and shed a tear for purity---but clearly not
>> enough.  Can you send me a message that causes the problem?
>>
>
> I'll need to fix up the text in the message because it's confidential.
> That should be easy enough to do.
> I'll send it to you once I've done that.
>
> One other thing - it looks like accessing attachments should work, but
> I've seen messages in my local setup here which have attachments shown,
> but I've not been able to retrieve them.
> Not sure what would cause that. Also, some messages which are tagged as
> having attachments don't have them shown by nmweb.
>
> FWIW this link (
> https://nmweb.evenmere.org/show/CACMMjMLecmXopb8AATjE3UuCnNLOO%2B5Nmev5X8K-UostDEUdrQ%40mail.gmail.com)
> has the tag attachment applied to the message, but there is no attachment
> shown. And another (
> https://nmweb.evenmere.org/show/87d31artti.fsf%40inf-8657.int-evry.fr).
>
> Maybe text/plain only emails are the ones which aren't problematic w.r.t.
> having their attachments shown?
> Cheers,
> -- Matt
>
>
>

--94eb2c1bfd42be3ac0055ceb7afa
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

--94eb2c1bfd42be3ac0055ceb7afa--
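
A minimal Python 3 sketch of the failure mode Brian describes in the quoted text: a string containing \u201c cannot be encoded back to latin-1, because latin-1 has no code point for the curly quote. The helper names `encode_for_part` and `decode_leniently` are hypothetical and exist only for illustration. They are not nmweb's `decodeAnyway`, whose implementation does not appear in this thread, and the fallback order (declared charset, then utf-8, then cp1252) is an assumption, not something the thread confirms.

```python
from typing import Tuple


def decode_leniently(raw: bytes, declared_charset: str) -> str:
    """Decode bytes, trying the declared charset first, then common fallbacks.

    Hypothetical helper: the fallback order is an assumption for illustration.
    """
    for charset in (declared_charset, "utf-8", "cp1252"):
        try:
            return raw.decode(charset)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown charset: try the next candidate.
            continue
    # latin-1 maps every byte to a code point, so this cannot fail.
    return raw.decode("latin-1")


def encode_for_part(text: str, declared_charset: str) -> Tuple[bytes, str]:
    """Encode text for output, returning the bytes and the charset actually used.

    Falls back to utf-8 when the declared charset cannot represent the text,
    which is the case for "\u201c" and latin-1.
    """
    try:
        return text.encode(declared_charset), declared_charset
    except (UnicodeEncodeError, LookupError):
        return text.encode("utf-8"), "utf-8"


if __name__ == "__main__":
    text = "\u201cquoted\u201d"
    try:
        text.encode("latin-1")
    except UnicodeEncodeError as exc:
        # This is the kind of exception discussed in the thread.
        print("latin-1 cannot encode the text:", exc.reason)
    payload, charset = encode_for_part(text, "latin-1")
    print(charset, payload)
```

The design choice here is to report the charset that was actually used, so the caller can put a matching charset in the Content-Type it serves instead of repeating the declared one.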