unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#63644: 29.0.91; Coding system detection defect in html
@ 2023-05-22 13:59 Ikumi Keita
  2023-05-22 16:04 ` Eli Zaretskii
From: Ikumi Keita @ 2023-05-22 13:59 UTC (permalink / raw)
  To: 63644

The function `sgml-html-meta-auto-coding-function' signals an error for an
HTML file with a legacy encoding specification.

0. Save the following file as /tmp/foo.html with the coding system `euc-jp':
----------------------------------------------------------------------
<!DOCTYPE html>
<html lang="ja">
<head>
<meta charset="EUC-JP">
<title>dummy</title>
</head>
<body>
あいうえお
</body></html>
----------------------------------------------------------------------
1. emacs -Q
2. C-x C-f /tmp/foo.html RET
3. M-: (sgml-html-meta-auto-coding-function 1000) RET
4. Emacs then signals an error with the following backtrace:
Debugger entered--Lisp error: (coding-system-error iso-2022)
  coding-system-plist(iso-2022)
  coding-system-equal(utf-8 iso-2022)
  sgml-html-meta-auto-coding-function(1000)
  eval((sgml-html-meta-auto-coding-function 1000) t)
  eval-expression((sgml-html-meta-auto-coding-function 1000) nil nil 127)
  funcall-interactively(eval-expression (sgml-html-meta-auto-coding-function 1000) nil nil 127)
  call-interactively(eval-expression nil nil)
  command-execute(eval-expression)

This error appears to come from a change to
`sgml-html-meta-auto-coding-function' introduced in Emacs 27. With the
Emacs 26.1 definition of the function, it returns `euc-jp' as expected.
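
The backtrace suggests the failure can be reproduced without visiting a file at all. A minimal sketch, assuming a stock `emacs -Q' session in which `euc-jp' is defined: the type of `euc-jp' is the symbol `iso-2022', which names a coding-system type rather than a coding system, so passing it to `coding-system-equal' signals.

```elisp
;; The type of `euc-jp' is a type symbol, not a coding system.
;; Per the backtrace above, this evaluates to `iso-2022'.
(coding-system-type 'euc-jp)

;; That symbol is not itself a defined coding system.
(coding-system-p 'iso-2022)

;; Comparing it against `utf-8' is what signals `coding-system-error'.
;; `condition-case' is used here only to show the signal without
;; entering the debugger.
(condition-case err
    (coding-system-equal 'utf-8 (coding-system-type 'euc-jp))
  (coding-system-error
   (message "signaled: %S" err)))
```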

Regards,
Ikumi Keita
#StandWithUkraine #StopWarInUkraine

In GNU Emacs 29.0.91 (build 1, x86_64-unknown-freebsd13.2, GTK+ Version
 3.24.34, cairo version 1.17.4) of 2023-05-22 built on freebsd.vmware
Windowing system distributor 'The X.Org Foundation', version 11.0.12101007
System Description: 13.2-RELEASE

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GNUTLS GSETTINGS HARFBUZZ JPEG JSON
LCMS2 LIBXML2 MODULES NOTIFY KQUEUE PDUMPER PNG RSVG SOUND SQLITE3
THREADS TIFF TOOLKIT_SCROLL_BARS WEBP X11 XDBE XIM XINPUT2 XPM GTK3 ZLIB

Important settings:
  value of $EMACSLOADPATH: /home/keita/elisp:
  value of $LANG: ja_JP.UTF-8
  locale-coding-system: utf-8-unix

Major mode: HTML+

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  show-paren-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  line-number-mode: t
  indent-tabs-mode: t
  transient-mark-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t

Load-path shadows:
/home/keita/elisp/reftex-parse hides /home/keita/scr/emacs-29.0.91/lisp/textmodes/reftex-parse

Features:
(shadow sort mail-extr emacsbug message dired dired-loaddefs rfc822 mml
mml-sec epa derived epg rfc6068 epg-config mm-decode mm-bodies mm-encode
mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047
rfc2045 ietf-drums debug backtrace find-func cl-extra pp cl-print
help-fns radix-tree help-mode yank-media mhtml-mode css-mode smie eww
xdg url-queue thingatpt shr pixel-fill kinsoku url-file svg xml
browse-url url url-proxy url-privacy url-expand url-methods url-history
url-cookie generate-lisp-file url-domsuf url-util url-parse auth-source
eieio eieio-core cl-macs password-cache url-vars mailcap puny mm-url
gnus nnheader gnus-util text-property-search time-date mail-utils range
wid-edit mm-util mail-prsvr color js c-ts-common treesit cl-seq json
subr-x map byte-opt gv bytecomp byte-compile imenu cc-mode cc-fonts
cc-guess cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs
sgml-mode facemenu dom cl-loaddefs cl-lib japan-util rmc iso-transl
tooltip cconv eldoc paren electric uniquify ediff-hook vc-hooks
lisp-float-type elisp-mode mwheel term/x-win x-win term/common-win x-dnd
tool-bar dnd fontset image regexp-opt fringe tabulated-list replace
newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar
rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock
font-lock syntax font-core term/tty-colors frame minibuffer nadvice seq
simple cl-generic indonesian philippine cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
cp51932 hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese composite emoji-zwj charscript charprop case-table
epa-hook jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button
loaddefs theme-loaddefs faces cus-face macroexp files window
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget keymap hashtable-print-readable backquote threads dbusbind
kqueue lcms2 dynamic-setting system-font-setting font-render-setting
cairo move-toolbar gtk x-toolkit xinput2 x multi-tty
make-network-process emacs)

Memory information:
((conses 16 122445 10073)
 (symbols 48 12761 0)
 (strings 32 41719 1774)
 (string-bytes 1 1318626)
 (vectors 16 23622)
 (vector-slots 8 397515 14753)
 (floats 8 154 34)
 (intervals 56 350 0)
 (buffers 976 15))


* bug#63644: 29.0.91; Coding system detection defect in html
  2023-05-22 13:59 bug#63644: 29.0.91; Coding system detection defect in html Ikumi Keita
@ 2023-05-22 16:04 ` Eli Zaretskii
  2023-05-22 16:42   ` Ikumi Keita
From: Eli Zaretskii @ 2023-05-22 16:04 UTC (permalink / raw)
  To: Ikumi Keita; +Cc: 63644

> From: Ikumi Keita <ikumi@ikumi.que.jp>
> Date: Mon, 22 May 2023 22:59:23 +0900
> 
> 0. Save the following file as /tmp/foo.html with the coding system `euc-jp':
> ----------------------------------------------------------------------
> <!DOCTYPE html>
> <html lang="ja">
> <head>
> <meta charset="EUC-JP">
> <title>dummy</title>
> </head>
> <body>
> あいうえお
> </body></html>
> ----------------------------------------------------------------------
> 1. emacs -Q
> 2. C-x C-f /tmp/foo.html RET
> 3. M-: (sgml-html-meta-auto-coding-function 1000) RET
> 4. Emacs then signals an error with the following backtrace:
> Debugger entered--Lisp error: (coding-system-error iso-2022)
>   coding-system-plist(iso-2022)
>   coding-system-equal(utf-8 iso-2022)
>   sgml-html-meta-auto-coding-function(1000)
>   eval((sgml-html-meta-auto-coding-function 1000) t)
>   eval-expression((sgml-html-meta-auto-coding-function 1000) nil nil 127)
>   funcall-interactively(eval-expression (sgml-html-meta-auto-coding-function 1000) nil nil 127)
>   call-interactively(eval-expression nil nil)
>   command-execute(eval-expression)

Thanks.  Does the patch below give good results?

diff --git a/lisp/international/mule.el b/lisp/international/mule.el
index 25b90b4..2b44a2e 100644
--- a/lisp/international/mule.el
+++ b/lisp/international/mule.el
@@ -2484,10 +2484,12 @@ sgml-xml-auto-coding-function
                     ;; called as part of visiting a file, as opposed
                     ;; to when saving a buffer to a file.
                     (if (and enable-multibyte-characters
-                             ;; 'charset' will signal an error in
-                             ;; coding-system-equal, since it isn't a
-                             ;; coding-system.  So test that up front.
+                             ;; 'charset' and 'iso-2022' will signal
+                             ;; an error in coding-system-equal, since
+                             ;; they aren't coding-systems.  So test
+                             ;; that up front.
                              (not (equal sym-type 'charset))
+                             (not (equal sym-type 'iso-2022))
                              (coding-system-equal 'utf-8 sym-type)
                              (coding-system-equal 'utf-8 bfcs-type))
                         buffer-file-coding-system
@@ -2540,11 +2542,13 @@ sgml-html-meta-auto-coding-function
                   (bfcs-type
                    (coding-system-type buffer-file-coding-system)))
               (if (and enable-multibyte-characters
-                       ;; 'charset' will signal an error in
-                       ;; coding-system-equal, since it isn't a
-                       ;; coding-system.  So test that up front.
+                       ;; 'charset' and 'iso-2022' will signal an error
+                       ;; in coding-system-equal, since they aren't
+                       ;; coding-systems.  So test that up front.
                        (not (equal sym-type 'charset))
                        (not (equal bfcs-type 'charset))
+                       (not (equal sym-type 'iso-2022))
+                       (not (equal bfcs-type 'iso-2022))
                        (coding-system-equal 'utf-8 sym-type)
                        (coding-system-equal 'utf-8 bfcs-type))
                   buffer-file-coding-system
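
The guards in the patch list the type symbols known to make `coding-system-equal' signal. As an illustration only (this is not what the patch does), the same intent could be sketched by testing `coding-system-p' before comparing. The helper name `my-utf-8-coding-type-p' is hypothetical and does not exist in mule.el.

```elisp
;; Hypothetical helper, not part of mule.el: compare a coding-system
;; type symbol against `utf-8' without risking `coding-system-error'.
(defun my-utf-8-coding-type-p (type)
  "Return non-nil if TYPE names a coding system equal to `utf-8'."
  (and (coding-system-p type)
       (coding-system-equal 'utf-8 type)))

;; The type of `euc-jp' is `iso-2022', which is not a coding system,
;; so the helper returns nil instead of signaling.
(my-utf-8-coding-type-p (coding-system-type 'euc-jp))

;; The type of `utf-8' is `utf-8' itself, so this is non-nil.
(my-utf-8-coding-type-p (coding-system-type 'utf-8))
```

Enumerating the offending symbols, as the patch does, keeps the change small and explicit; a predicate check like the one sketched here would also cover any other type symbol that is not a coding system.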


* bug#63644: 29.0.91; Coding system detection defect in html
  2023-05-22 16:04 ` Eli Zaretskii
@ 2023-05-22 16:42   ` Ikumi Keita
  2023-05-22 18:25     ` Eli Zaretskii
From: Ikumi Keita @ 2023-05-22 16:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63644

>>>>> Eli Zaretskii <eliz@gnu.org> writes:
> Thanks.  Does the patch below give good results?

Yes. It returns `euc-jp' as expected.

Regards,
Ikumi Keita
#StandWithUkraine #StopWarInUkraine


* bug#63644: 29.0.91; Coding system detection defect in html
  2023-05-22 16:42   ` Ikumi Keita
@ 2023-05-22 18:25     ` Eli Zaretskii
From: Eli Zaretskii @ 2023-05-22 18:25 UTC (permalink / raw)
  To: Ikumi Keita; +Cc: 63644-done

> From: Ikumi Keita <ikumi@ikumi.que.jp>
> cc: 63644@debbugs.gnu.org
> Comments: In-reply-to Eli Zaretskii <eliz@gnu.org>
>    message dated "Mon, 22 May 2023 19:04:02 +0300."
> Date: Tue, 23 May 2023 01:42:39 +0900
> 
> >>>>> Eli Zaretskii <eliz@gnu.org> writes:
> > Thanks.  Does the patch below give good results?
> 
> Yes. It returns `euc-jp' as expected.

Thanks, installed on the emacs-29 branch, and closing the bug.

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
