From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 7BF4A1F670;
	Mon, 25 Oct 2021 00:08:24 +0000 (UTC)
Date: Mon, 25 Oct 2021 00:08:24 +0000
From: Eric Wong
To: Thomas =?utf-8?Q?Wei=C3=9Fschuh?=
Cc: meta@public-inbox.org
Subject: Re: [PATCH 3/3] mbox: Specify encoding for raw message display
Message-ID: <20211025000824.GA20307@dcvr>
References: <20211024214337.161779-1-thomas@t-8ch.de>
 <20211024214337.161779-2-thomas@t-8ch.de>
 <20211024214337.161779-3-thomas@t-8ch.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20211024214337.161779-3-thomas@t-8ch.de>
List-Id:

Thomas Weißschuh wrote:
> +++ b/lib/PublicInbox/Mbox.pm
> @@ -58,10 +58,10 @@ sub res_hdr ($$) {
>  	my @hdr = ('Content-Type');
>  	if ($ctx->{ibx}->{obfuscate}) {
>  		# obfuscation is stupid, but maybe scrapers are, too...
> -		push @hdr, 'application/mbox';
> +		push @hdr, 'application/mbox; charset=UTF-8';
>  		$fn .= '.mbox';
>  	} else {
> -		push @hdr, 'text/plain';
> +		push @hdr, 'text/plain; charset=UTF-8';
>  		$fn .= '.txt';

Applied and pushed patches 1 + 2, thanks.

This (3/3) seems incorrect for non-UTF-8-compatible messages.
I should have a better approach for this in the next day or so.

The correct approach would be to use the Content-Type from the
$eml object, but the $eml object isn't likely to be in memory
when res_hdr() is called.

I was actually doing some surgery with the WwwStream / GzipFilter
async response components earlier, so I'll probably get reading
the charset supported soon.

Thanks for bringing this up.
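
To illustrate the Content-Type idea above, here is a rough, standalone
sketch in plain Perl. It is not the public-inbox implementation:
charset_of() and raw_content_type() are hypothetical helpers, and the
sketch assumes the caller already has the message's top-level
Content-Type header value as a string (which, as noted, is not the case
in res_hdr() today). It also ignores multipart messages, where each part
may declare its own charset.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of public-inbox): extract the charset
# parameter from a raw Content-Type header value, if one is declared.
sub charset_of {
	my ($ct) = @_;
	return undef unless defined $ct;
	# matches charset=utf-8, charset="ISO-8859-1", etc.
	return $ct =~ /;\s*charset\s*=\s*"?([^\s";]+)"?/i ? lc($1) : undef;
}

# Hypothetical variant of the header choice in res_hdr(): append a
# charset only when the message itself declares one, instead of
# hard-coding UTF-8.
sub raw_content_type {
	my ($obfuscate, $msg_ct) = @_;
	my $type = $obfuscate ? 'application/mbox' : 'text/plain';
	my $cs = charset_of($msg_ct);
	return defined($cs) ? "$type; charset=$cs" : $type;
}

print raw_content_type(0, 'text/plain; charset="ISO-8859-1"'), "\n";
print raw_content_type(1, 'text/plain'), "\n";
```

When the message declares no charset, the sketch omits the parameter and
leaves the guess to the client rather than asserting UTF-8.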