From mboxrd@z Thu Jan 1 00:00:00 1970
From: Paul Eggert
Newsgroups: gmane.emacs.bugs
Subject: bug#35507: Gnus mojibakifies UTF-8 text/x-patch attachments from Thunderbird
Date: Wed, 1 May 2019 11:26:35 -0700
Organization: UCLA Computer Science Department
References: <44a26585-7980-378c-9262-a567ddd3e617@cs.ucla.edu> <83d0l2qdw9.fsf@gnu.org>
In-Reply-To: <83d0l2qdw9.fsf@gnu.org>
To: Eli Zaretskii
Cc: 35507@debbugs.gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

On 5/1/19 10:32 AM, Eli Zaretskii wrote:

> Is text/x-patch a "new media type" or not?

It's not a registered media type so strictly speaking the RFCs' SHOULD
statements do not apply (and they are SHOULDs not MUSTs so they could be
disregarded for good reason). That being said, the ordinary and usual
intent is for the x- media types to follow these recommendations and my
bug report was filed under that assumption.

> my reading of the RFC is that we should not define
> or expect any defaults, which means this bug is squarely in
> Thunderbird's yard

Ah, sorry, I see that my bug report misstated a point. This particular
patch clearly identifies its own encoding because its header says
"Content-Type: text/plain; charset=UTF-8". (I think Git-generated
patches always specify an encoding unless it's ASCII.) So in this
particular case the RFC's recommendation seems to be respected by the
sender. Gnus could look for a Content-Type: header in text bodies that
do not specify charsets; this would follow the Internet's robustness
principle better.

> I don't see why we should
> change Gnus in this regard, certainly not unconditionally assuming
> UTF-8.

Gnus is mishandling emails sent from Thunderbird right now, so it would
be a practical benefit for Gnus users if it did a better job of decoding
these admittedly-iffy messages. These days, UTF-8 is by far the most
common encoding specified for non-ASCII text in email and its popularity
is growing, so it's the best choice for a default if Gnus will have
one - certainly better than the confusing behavior that Robert Pluim
observed in his Gnus session. Gnus's current behavior may have been a
good idea in 1996 when RFC 2046 said US-ASCII was the default, but it
stopped being a good idea in 2012 when RFC 6657 came out and said that
UTF-8 should be the default if there is a default.

Another possibility is that Gnus could ask the user which encoding to
use when the email headers don't specify one and when the text is not
ASCII; even that would be better than Gnus's current behavior of forcing
US-ASCII and displaying something like "\xe2\x80\x99" when it encounters
a non-ASCII character.
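
To make the fallback order suggested above concrete, here is a rough
sketch. It is not Gnus code, and it is written in Python only for
brevity. The names guess_charset and decode_part are hypothetical, and
the regular expression is a simplification rather than a full MIME
header parser. The order it assumes is: a charset declared in the MIME
headers wins; pure-ASCII text needs no guess; otherwise look for a
Content-Type: line inside the body itself; otherwise default to UTF-8.

```python
import re
from typing import Optional

# Simplified match for a "Content-Type: ...; charset=X" line inside a body.
# Assumption: the header sits on a single line; folded headers are ignored.
_INBODY_CHARSET = re.compile(
    rb'^Content-Type:[^\r\n]*?charset="?([A-Za-z0-9._-]+)"?',
    re.IGNORECASE | re.MULTILINE,
)


def guess_charset(body: bytes, declared: Optional[str] = None) -> str:
    """Pick a charset for a text part whose MIME headers may omit one."""
    if declared:
        # The MIME headers named a charset: trust them.
        return declared.lower()
    if body.isascii():
        # Nothing to guess; any ASCII-compatible charset decodes this.
        return "us-ascii"
    match = _INBODY_CHARSET.search(body)
    if match:
        # The body (e.g. a Git-generated patch) identifies its own encoding.
        return match.group(1).decode("ascii").lower()
    # No hint anywhere: use UTF-8 as the default of last resort.
    return "utf-8"


def decode_part(body: bytes, declared: Optional[str] = None) -> str:
    """Decode a text part, replacing undecodable bytes instead of failing."""
    charset = guess_charset(body, declared)
    try:
        return body.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown charset name or wrong guess: degrade gracefully.
        return body.decode("utf-8", errors="replace")
```

With a body containing "Content-Type: text/plain; charset=UTF-8" and
the bytes \xe2\x80\x99, decode_part yields a right single quotation
mark rather than three escaped bytes. Asking the user, the other
possibility mentioned above, could replace the final UTF-8 default.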