From: Matthew Lear
Date: Tue, 31 Oct 2017 21:32:39 +0000
Subject: Re: web interface to notmuch
To: Brian Sniffen
Cc: Daniel Kahn Gillmor, Jani Nikula, Vladimir Panteleev,
 notmuch@notmuchmail.org
In-Reply-To: <87h8ufnmwr.fsf@istari.evenmere.org>
References: <87tvyvp4f2.fsf@istari.evenmere.org>
 <87376f13ho.fsf@fifthhorseman.net>
 <87r2tww9tr.fsf@nikula.org>
 <87wp3ow39i.fsf@fifthhorseman.net>
 <27e53def-32b4-45ab-1192-77cc0e837a93@gmail.com>
 <87zi8eopgq.fsf@istari.evenmere.org>
 <877evhy53k.fsf@fifthhorseman.net>
 <87she5nsmy.fsf@istari.evenmere.org>
 <87inf1gm7l.fsf@fifthhorseman.net>
 <87mv4co4vz.fsf@istari.evenmere.org>
 <87h8ufnmwr.fsf@istari.evenmere.org>
List-Id: "Use and development of the notmuch mail system."

On Tue, Oct 31, 2017 at 7:21 PM, Brian Sniffen <bts@evenmere.org> wrote:

> > I'm no Python expert, but from a quick google it would seem like the cause
> > of such an exception is related to not using utf-8.
>
> Neat.  So to get there, this has to be a text/html part.  It has to have
> been decoded, either with the declared content type or with ascii.  If a
> \u201c (left double quote) showed up, it didn't get decoded as
> ascii---and indeed, it looks like the content-type specifies latin-1.
> But now when we try to encode back, using the same latin-1, it fails?
> That's really neat.
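
A minimal Python sketch of that round trip, for illustration only. The sample bytes and the cp1252 decode are assumptions about one way a \u201c could appear in a part declared as latin-1; nothing here is taken from nmweb.py.

```python
# Illustration only: one possible way U+201C ends up in "latin-1" text.
raw = b"\x93quoted\x94"  # assumed sample bytes: cp1252 curly quotes

# Decoding as latin-1 never fails and maps each byte to U+0000..U+00FF,
# so it cannot produce U+201C on its own.
as_latin1 = raw.decode("latin-1")
assert "\u201c" not in as_latin1

# Decoding the same bytes as cp1252 does produce U+201C.
as_cp1252 = raw.decode("cp1252")
assert as_cp1252[0] == "\u201c"

# Encoding that string back as latin-1 fails: latin-1 has no U+201C.
try:
    as_cp1252.encode("latin-1")
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True
```

If something like this is what happens, the text was decoded with a different charset than the one later used to encode it.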

> > Brian - do you think something needs modifying in nmweb.py to cater for
> > this type of thing, or is this somehow related my own mailstore (not sure
> > why that would be as my messages haven't been modified).
>
> Lots of mail has busted encoding.  I've done some defensive work against
> that---look at decodeAnyway and shed a tear for purity---but clearly not
> enough.  Can you send me a message that causes the problem?
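
For illustration, a defensive decode along those lines might look roughly like the sketch below. This is not the actual decodeAnyway from nmweb.py; the name decode_leniently and the fallback order are assumptions made up for the example.

```python
def decode_leniently(data: bytes, declared: str = "ascii") -> str:
    """Hypothetical helper: try the declared charset, then common fallbacks."""
    for charset in (declared, "utf-8", "cp1252"):
        try:
            return data.decode(charset)
        except (UnicodeDecodeError, LookupError):
            # LookupError covers unknown or misspelled charset names.
            continue
    # latin-1 maps every byte to a code point, so this cannot fail.
    return data.decode("latin-1")
```

On the output side, encoding as UTF-8 instead of the declared charset would sidestep the encode failure, since UTF-8 can represent \u201c.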

I'll need to fix up the text in the message first because it's
confidential. That should be easy enough to do. I'll send it to you once
I've done that.

One other thing - it looks like accessing attachments should work, but
I've seen messages in my local setup which have attachments shown that
I've not been able to retrieve. Not sure what would cause that. Also,
some messages which are tagged as having attachments don't have them
shown by nmweb.

FWIW, the message at this link
(https://nmweb.evenmere.org/show/CACMMjMLecmXopb8AATjE3UuCnNLOO%2B5Nmev5X8K-UostDEUdrQ%40mail.gmail.com)
has the attachment tag applied, but there is no attachment shown. Here is
another
(https://nmweb.evenmere.org/show/87d31artti.fsf%40inf-8657.int-evry.fr).

Maybe text/plain-only emails are the ones which aren't problematic w.r.t.
having their attachments shown?
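
One way to check that guess locally is to list the MIME parts of a message and see which ones carry a filename. The sketch below uses only Python's standard email package; it says nothing about how nmweb itself finds attachments, and the function name list_parts and the sample message are made up for the example.

```python
from email import message_from_bytes, policy


def list_parts(raw_message: bytes):
    """Return (content_type, filename) for every non-container MIME part."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers such as multipart/mixed hold no payload
        parts.append((part.get_content_type(), part.get_filename()))
    return parts


# Made-up sample: an HTML body plus one PDF attachment.
sample = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="b"\r\n'
    b"\r\n"
    b"--b\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"\r\n"
    b"<p>hello</p>\r\n"
    b"--b\r\n"
    b"Content-Type: application/pdf\r\n"
    b'Content-Disposition: attachment; filename="report.pdf"\r\n'
    b"\r\n"
    b"%PDF-\r\n"
    b"--b--\r\n"
)

for content_type, filename in list_parts(sample):
    print(content_type, filename)
```

Running this over a message that is tagged attachment but shows none in nmweb, and over one that works, should show whether the body's content type is really what differs.
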
Cheers,
-- Matt

