unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#43351: 27.1; [PATCH] Change ASCII handling in mm-charset-to-coding-system to us-ascii
@ 2020-09-12  7:04 Alex Bochannek
From: Alex Bochannek @ 2020-09-12  7:04 UTC (permalink / raw)
  To: 43351

[-- Attachment #1: Type: text/plain, Size: 1900 bytes --]

Hello!

This is a very small patch, but I am not confident that there aren't
other side effects, so please evaluate it carefully.

In the fix for bug#5458 (2011-06-30), a change was made to
mm-charset-to-coding-system to support "ansi.x3.4*" as an alias for
'ascii. As part of that patch 'us-ascii was also mapped to 'ascii. This
is problematic because decode-coding-string does not recognize 'ascii as
a coding system and signals an "Invalid coding system: ascii" error.

As a result, when using gnus-article-browse-html-article (K H) to
display a text/html message that has charset=us-ascii (or presumably
also charset=ascii), the display will fail iff the header of the message
is not ASCII.
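
A minimal sketch for checking the claim above in a scratch buffer. The
helper name my/try-decode is hypothetical and not part of Gnus; it only
wraps decode-coding-string so that an error is returned as data instead
of aborting evaluation.

```elisp
;; Hypothetical helper (not part of Gnus): decode STRING with
;; CODING-SYSTEM and return the error as data if decoding fails.
(defun my/try-decode (string coding-system)
  "Decode STRING with CODING-SYSTEM, or return (failed . ERR) on error."
  (condition-case err
      (decode-coding-string string coding-system)
    (error (cons 'failed err))))

;; 'us-ascii is a defined coding system, so this decodes normally.
(my/try-decode "hello" 'us-ascii)

;; Per the report, 'ascii is not a coding system, so this should take
;; the error branch and return a cons whose car is `failed'.
(my/try-decode "hello" 'ascii)

;; The predicate that answers the question directly.
(coding-system-p 'us-ascii)
(coding-system-p 'ascii)
```

Evaluating the last two forms shows which of the two symbols the
running Emacs accepts as a coding system.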

Tracing gnus-article-browse-html-parts, the call chain in my test case
looks like this:

(setq hcharset (mm-find-mime-charset-region (point-min)(point-max)))
returns 'utf-8 because of the RFC 2047 encoded words in the From
header. The HTML part has charset=us-ascii and therefore coding and
charset differ. (setq body (mm-charset-to-coding-system charset nil t))
then maps 'us-ascii to 'ascii (see above), and the attempt to transcode
the part into 'utf-8 fails at (encode-coding-string
(decode-coding-string content body) charset). That last piece of code
seems to have gone in on 2016-02-12 when removing XEmacs compat
functions from mm-util.el.
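
The transcoding step described above can be reproduced in isolation.
This is only a sketch: my/transcode is a hypothetical stand-in for the
expression in gnus-article-browse-html-parts, and the sample content
string is made up for illustration.

```elisp
;; Hypothetical stand-in for the transcoding expression quoted above:
;; decode CONTENT from coding system FROM, then re-encode it as TO.
;; Errors are returned as data so both cases can be compared.
(defun my/transcode (content from to)
  "Transcode CONTENT from FROM to TO, or return (failed . ERR) on error."
  (condition-case err
      (encode-coding-string (decode-coding-string content from) to)
    (error (cons 'failed err))))

(let ((content "<html><body>plain ASCII body</body></html>"))
  (list
   ;; With 'us-ascii as the body coding system the round trip works.
   (my/transcode content 'us-ascii 'utf-8)
   ;; With 'ascii, which is what the unpatched mapping produces, the
   ;; report says decoding signals "Invalid coding system: ascii".
   (my/transcode content 'ascii 'utf-8)))
```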

This patch no longer maps 'us-ascii and instead maps 'ascii to
'us-ascii. (The ANSI alias is untouched.) Alternatively, I could modify
gnus-article-browse-html-parts to special-case this, but I don't think
mm-charset-to-coding-system should output 'ascii if it is not a valid
coding system (anymore?). However, I don't know what else that could
possibly break, which is why I want to offer this patch with some
caution.
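
For experimenting outside of Gnus, the patched branch can be sketched as
a standalone function. The name my/ascii-alias-to-coding-system is
hypothetical; the real logic is one clause of the cond inside
mm-charset-to-coding-system, and this sketch ignores every other clause
of that function.

```elisp
;; Hypothetical standalone version of the patched "ascii" clause.
;; Returns 'us-ascii for the aliases, or nil to mean "fall through to
;; the remaining clauses", which this sketch does not model.
(defun my/ascii-alias-to-coding-system (charset)
  "Map the ASCII aliases of CHARSET (a symbol) to `us-ascii', else nil."
  (when (or (eq charset 'ascii)
            (string-match "ansi.x3.4" (symbol-name charset)))
    'us-ascii))

(my/ascii-alias-to-coding-system 'ascii)           ; alias, mapped
(my/ascii-alias-to-coding-system 'ansi_x3.4-1968)  ; ANSI alias, mapped
(my/ascii-alias-to-coding-system 'us-ascii)        ; nil: no longer special-cased
```

With the patch, 'us-ascii is left for the later clauses of the real
function to resolve instead of being rewritten to 'ascii.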

Please let me know if there is anything I can do to help with getting
this change accepted.

Thanks!

-- 
Alex. <abochannek@google.com>

[-- Attachment #2: Change ASCII handling in mm-charset-to-coding-system to us-ascii --]
[-- Type: text/x-patch, Size: 643 bytes --]

diff --git a/lisp/gnus/mm-util.el b/lisp/gnus/mm-util.el
index 282465722d..3dc93e4ad4 100644
--- a/lisp/gnus/mm-util.el
+++ b/lisp/gnus/mm-util.el
@@ -137,9 +137,9 @@ mm-charset-to-coding-system
 	 (let ((cs (cdr (assq charset mm-charset-override-alist))))
 	   (and cs (mm-coding-system-p cs) cs))))
    ;; ascii
-   ((or (eq charset 'us-ascii)
+   ((or (eq charset 'ascii)
 	(string-match "ansi.x3.4" (symbol-name charset)))
-    'ascii)
+    'us-ascii)
    ;; Check to see whether we can handle this charset.  (This depends
    ;; on there being some coding system matching each `mime-charset'
    ;; property defined, as there should be.)


Thread overview: 6+ messages
2020-09-12  7:04 bug#43351: 27.1; [PATCH] Change ASCII handling in mm-charset-to-coding-system to us-ascii Alex Bochannek
2020-09-12 14:13 ` Lars Ingebrigtsen
2020-09-12 14:19   ` Eli Zaretskii
2020-09-13  1:00     ` Alex Bochannek
2020-09-13 12:40       ` Lars Ingebrigtsen
2020-09-13 12:39     ` Lars Ingebrigtsen

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git