From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alexandre Duret-Lutz
Newsgroups: gmane.emacs.bugs
Subject: bug#44307: 27.1; UTF-8 parts transferred as 8bit in multipart messages fail to decode
Date: Thu, 07 Jan 2021 17:06:44 +0100
Message-ID: <87wnwo7muj.fsf@lrde.epita.fr>
References: <8735zj6q6h.fsf@goulash.lrde.epita.fr> <8735zf3f46.fsf@gnus.org>
 <87k0srg0hz.fsf@goulash.lrde.epita.fr> <87wnwo3kd2.fsf@gnus.org>
In-Reply-To: <87wnwo3kd2.fsf@gnus.org> (Lars Ingebrigtsen's message of
 "Thu, 07 Jan 2021 15:14:01 +0100")
Mime-Version: 1.0
Content-Type: text/plain
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
To: Lars Ingebrigtsen
Cc: 44307@debbugs.gnu.org

Lars Ingebrigtsen writes:

> I've now committed a fix to mm-with-part that may or may not fix this
> nnmaildir problem.

Question: shouldn't mm-with-part always leave the buffer in unibyte
mode?  The comment at the beginning of the macro seems to suggest that,
but the new "if" does not call (mm-disable-multibyte) after inserting
the part.  Otherwise the fix just pushes the issue further away, to the
next place where the contents of mm-with-part are inserted into a
unibyte buffer.

> Can you try this (in Emacs 28)?  You may have to do a "make bootstrap"
> or at least remove all the lisp/gnus/*.elc files for the change to
> have any effect.

After "make bootstrap", this seems to fix only the rendering of
text/html utf-8 parts (I'm using w3m, if that matters).  However,
text/plain utf-8 parts are still garbled as they were before.

If I tweak the patch as follows:

--- a/lisp/gnus/mm-decode.el
+++ b/lisp/gnus/mm-decode.el
@@ -1271,7 +1271,9 @@ mm-with-part
              ;; multibyte buffer here, but if it's using an 8bit
              ;; Content-Transfer-Encoding, then work around that by
              ;; just ignoring the situation.
-             (insert-buffer-substring (mm-handle-buffer handle))
+             (progn
+               (insert-buffer-substring (mm-handle-buffer handle))
+               (mm-disable-multibyte))
            ;; Do the decoding.
            (mm-disable-multibyte)
            (insert-buffer-substring (mm-handle-buffer handle))

this seems to fix text/plain utf-8 parts as well; however, the rendering
of windows-1252 parts is now broken...

See the following table, where "with patch" refers to commit (23a887e4),
and "disable-mb" to the above tweak.

|--------------+------------+---------------+------------+------------|
| charset      | type       | without patch | with patch | disable-mb |
|--------------+------------+---------------+------------+------------|
| utf-8        | text/html  | garbled       | ok         | ok         |
| windows-1252 | text/html  | ok            | ok         | garbled    |
| utf-8        | text/plain | garbled       | garbled    | ok         |
| windows-1252 | text/plain | ok            | ok         | garbled    |

When looking at windows-1252-encoded mails read by nnmaildir, and
rendered using "C-u g" (where none of the above changes should matter),
it's obvious that the buffer contains utf-8 characters.  My guess is
that when nnmaildir calls nnheader-insert-file-contents to read the
mail, it does so with 'undecided coding.  Emacs then automatically
detects windows-1252 and converts it to utf-8 for its internal
representation.

-- 
Alexandre Duret-Lutz
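
To make the guess above concrete, here is a minimal sketch of the two kinds of garbling in the table. It is written in Python purely as a byte-level illustration and does not use or model any Gnus or Emacs function; the function names are hypothetical, and the assumption that the internal representation behaves like utf-8 bytes is exactly the hypothesis stated above, not a confirmed diagnosis.

```python
# Illustrative only: this models the hypothesis with Python bytes/str,
# not Emacs internals. Both function names are hypothetical.


def decoded_on_read_then_decoded_again(text: str, declared_charset: str) -> str:
    """Model a part that was auto-converted on read, then decoded again.

    Assumption: reading the file already converted the declared charset
    to a utf-8-like internal form, and a later step treats those internal
    bytes as if they were still the original wire bytes.
    """
    wire_bytes = text.encode(declared_charset)      # bytes as stored on disk
    decoded = wire_bytes.decode(declared_charset)   # automatic detection on read
    internal_bytes = decoded.encode("utf-8")        # assumed internal form
    # Decoding the internal bytes with the declared charset garbles
    # every non-ASCII character.
    return internal_bytes.decode(declared_charset)


def raw_utf8_bytes_shown_as_characters(text: str) -> str:
    """Model a utf-8 8bit part displayed one byte per character."""
    return text.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    print(decoded_on_read_then_decoded_again("café", "cp1252"))  # cafÃ©
    print(raw_utf8_bytes_shown_as_characters("café"))            # cafÃ©
```

Under these assumptions both paths produce the same kind of mojibake, which would be consistent with utf-8 parts being garbled when left undecoded and windows-1252 parts being garbled once the buffer is forced to unibyte after an automatic conversion has already taken place.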