unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#4712: File encoding
@ 2009-10-13  9:12 Elmar Zander
  2009-10-14  9:07 ` Andreas Schwab
  0 siblings, 1 reply; 12+ messages in thread
From: Elmar Zander @ 2009-10-13  9:12 UTC (permalink / raw)
  To: bug-gnu-emacs

Hi!

my source file begins with the following lines

  #!/usr/bin/perl
  # -*- coding: iso-8859-1 -*-
  #

I definitely need the latin-1 encoding here because the script is supposed 
to do some non-standard translations from latin-1 to html entities. However, 
whenever I try to save the file I get the message:

  Selected encoding mule-utf-8-unix disagrees with iso-8859-1-unix specified by file contents.  
  Really save (else edit coding cookies and try again)? (yes or no) 

Setting the buffer file coding system explicitly doesn't help either, e.g.

  M-x set-buffer-file-coding-system latin-1

Checking the variable afterwards, I get

  buffer-file-coding-system is a variable defined in `C source code'.
  Its value is latin-1-unix
  Local in buffer html_translate; global value is mule-utf-8

but after saving (ignoring the warning) again:

  buffer-file-coding-system is a variable defined in `C source code'.
  Its value is mule-utf-8-unix

Hmm, so it completely ignores my settings. It would be nice if Emacs at
least either gave me the choice between latin-1 and utf-8 (instead of just
yes or no) or told me how the hell to "edit the coding cookies" (I have
no clue what those are, and neither an extensive internet search nor reading
the manual turned anything up).

Regards,
Elmar





In GNU Emacs 22.3.1 (i486-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2009-03-31 on raven, modified by Debian
configured using `configure  '--build=i486-linux-gnu' '--host=i486-linux-gnu' '--prefix=/usr' '--sharedstatedir=/var/lib' '--libexecdir=/usr/lib' '--localstatedir=/var/lib' '--infodir=/usr/share/info' '--mandir=/usr/share/man' '--with-pop=yes' '--enable-locallisppath=/etc/emacs22:/etc/emacs:/usr/local/share/emacs/22.3/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/22.3/site-lisp:/usr/share/emacs/site-lisp:/usr/share/emacs/22.3/leim' '--with-x=yes' '--with-x-toolkit=athena' '--with-toolkit-scroll-bars' 'build_alias=i486-linux-gnu' 'host_alias=i486-linux-gnu' 'CFLAGS=-DDEBIAN -g -O2' 'LDFLAGS=-g' 'CPPFLAGS=''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_AU.UTF-8
  locale-coding-system: utf-8
  default-enable-multibyte-characters: t

Major mode: Perl

Minor modes in effect:
  encoded-kbd-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  unify-8859-on-encoding-mode: t
  utf-translate-cjk-mode: t
  auto-compression-mode: t
  line-number-mode: t

Recent input:
A ESC O A ESC O A ESC O A ESC O A ESC O A ESC O A ESC 
O A ESC O A ESC O A ESC O A ESC O A ESC O A ESC O A 
ESC O A ESC O A ESC O A ESC O A ESC O A ESC O A ESC 
O A ESC O A ESC O A ESC O A ESC O A ESC O A ESC O A 
ESC O A ESC O B ESC O B ESC O B ESC O B ESC O B ESC 
O B ESC O B ESC O B ESC O B ESC O B ESC O B ESC O B 
ESC O B ESC O B ESC O B ESC O B ESC O B ESC O B ESC 
O B ESC O B ESC O B ESC O B ESC O B ESC O B ESC O B 
ESC O B ESC O B ESC O B ESC O B ESC O B ESC O B ESC 
O B ESC O B ESC O B ESC O B ESC O B ESC O B ESC O B 
ESC O B ESC O B ESC O B ESC O B ESC O B ESC O B ESC 
O B ESC O B ESC O H ESC [ 5 ~ ESC [ 5 ~ ESC [ 5 ~ ESC 
[ 5 ~ ESC O A ESC O A ESC O A ESC O A ESC O A ESC O 
A ESC O A ESC O A ESC O A ESC O A C-x C-s C-g C-x C-s 
y e s RET C-z ESC O B ESC O A SPC DEL C-x C-s C-g ESC 
x r e p o TAB r TAB RET

Recent messages:
Quit
Making completion list...
Type C-x 4 C-o RET to restore the other window.  
call-interactively: Beginning of buffer [2 times]
Quit
Wrote /globalfs/VALKYRIEHOME/ezander/institut/www/html/wir_htmllib/html_translate
Quit
Auto-saving...
Making completion list...
Loading emacsbug...done







* bug#4712: File encoding
  2009-10-13  9:12 bug#4712: File encoding Elmar Zander
@ 2009-10-14  9:07 ` Andreas Schwab
  2009-10-14 13:27   ` Stefan Monnier
  0 siblings, 1 reply; 12+ messages in thread
From: Andreas Schwab @ 2009-10-14  9:07 UTC (permalink / raw)
  To: Elmar Zander; +Cc: bug-gnu-emacs, 4712

Elmar Zander <ezander@valkyrie.sc.cs.tu-bs.de> writes:

> my source file begins with the following lines
>
>   #!/usr/bin/perl
>   # -*- coding: iso-8859-1 -*-
>   #
>
> I definitely need the latin-1 encoding here because the script is supposed 
> to do some non-standard translations from latin-1 to html entities. However, 
> whenever I try to save the file I get the message:
>
>   Selected encoding mule-utf-8-unix disagrees with iso-8859-1-unix specified by file contents.  
>   Really save (else edit coding cookies and try again)? (yes or no) 

That means that the buffer contains characters that cannot be encoded in
iso-8859-1-unix.
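
To make the diagnosis concrete, here is a minimal sketch, in Python rather
than Emacs Lisp, of how one might locate the characters that latin-1 cannot
represent. The function name and the sample string are hypothetical and are
not taken from the reporter's file; this only illustrates the idea and is
not how Emacs performs the check.

```python
def unencodable_chars(text, encoding="iso-8859-1"):
    """Return (offset, char) pairs for characters `encoding` cannot represent."""
    bad = []
    for offset, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.append((offset, ch))
    return bad


# Hypothetical sample: e-acute fits in latin-1, the euro sign and en dash do not.
sample = "caf\u00e9 costs 5\u20ac \u2013 ok"
print(unencodable_chars(sample))  # [(12, '€'), (14, '–')]
```

Deleting or replacing the characters reported this way should leave a
buffer that the encoding named in the cookie can represent.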

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."






* bug#4712: File encoding
  2009-10-14  9:07 ` Andreas Schwab
@ 2009-10-14 13:27   ` Stefan Monnier
  2009-10-14 14:04     ` Eli Zaretskii
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Monnier @ 2009-10-14 13:27 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: bug-gnu-emacs, Elmar Zander, 4712

>> my source file begins with the following lines
>> 
>> #!/usr/bin/perl
>> # -*- coding: iso-8859-1 -*-
>> #
>> 
>> I definitely need the latin-1 encoding here because the script is supposed 
>> to do some non-standard translations from latin-1 to html entities. However, 
>> whenever I try to save the file I get the message:
>> 
>> Selected encoding mule-utf-8-unix disagrees with iso-8859-1-unix specified by file contents.  
>> Really save (else edit coding cookies and try again)? (yes or no) 

> That means that the buffer contains characters that cannot be encoded in
> iso-8859-1-unix.

Indeed, but the error message we output is completely unhelpful.
Rather than select utf-8 and then complain that the tag doesn't match,
we should say upfront, that the selected latin-1 can't encode all the
chars in the buffer (and that message can come with the usual thingy
that shows the offending chars and their location).


        Stefan








* bug#4712: File encoding
  2009-10-14 13:27   ` Stefan Monnier
@ 2009-10-14 14:04     ` Eli Zaretskii
  2009-10-14 14:33       ` Andreas Schwab
  2009-10-14 15:19       ` Stefan Monnier
  0 siblings, 2 replies; 12+ messages in thread
From: Eli Zaretskii @ 2009-10-14 14:04 UTC (permalink / raw)
  To: Stefan Monnier, 4712; +Cc: ezander, schwab

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Wed, 14 Oct 2009 09:27:33 -0400
> Cc: bug-gnu-emacs@gnu.org, Elmar Zander <ezander@valkyrie.sc.cs.tu-bs.de>,
> 	4712@emacsbugs.donarmstrong.com
> 
> > That means that the buffer contains characters that cannot be encoded in
> > iso-8859-1-unix.
> 
> Indeed, but the error message we output is completely unhelpful.
> Rather than select utf-8 and then complain that the tag doesn't match,
> we should say upfront, that the selected latin-1 can't encode all the
> chars in the buffer (and that message can come with the usual thingy
> that shows the offending chars and their location).

Maybe so, but this part of the OP's report:

  value of $LANG: en_AU.UTF-8
  locale-coding-system: utf-8

indicates that UTF-8 is the "native" encoding on the OP's machine, and
there is an overwhelming user demand for silently and transparently
switch to such a native encoding when we need to select an encoding.






* bug#4712: File encoding
  2009-10-14 14:04     ` Eli Zaretskii
@ 2009-10-14 14:33       ` Andreas Schwab
  2009-10-14 18:11         ` Eli Zaretskii
  2009-10-14 15:19       ` Stefan Monnier
  1 sibling, 1 reply; 12+ messages in thread
From: Andreas Schwab @ 2009-10-14 14:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ezander, 4712

Eli Zaretskii <eliz@gnu.org> writes:

> Maybe so, but this part of the OP's report:
>
>   value of $LANG: en_AU.UTF-8
>   locale-coding-system: utf-8
>
> indicates that UTF-8 is the "native" encoding on the OP's machine, and
> there is an overwhelming user demand for silently and transparently
> switch to such a native encoding when we need to select an encoding.

How can silently producing a broken file be desirable?

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."






* bug#4712: File encoding
  2009-10-14 14:04     ` Eli Zaretskii
  2009-10-14 14:33       ` Andreas Schwab
@ 2009-10-14 15:19       ` Stefan Monnier
  2009-10-14 18:15         ` Eli Zaretskii
  1 sibling, 1 reply; 12+ messages in thread
From: Stefan Monnier @ 2009-10-14 15:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: schwab, ezander, 4712

> Maybe so, but this part of the OP's report:

>   value of $LANG: en_AU.UTF-8
>   locale-coding-system: utf-8

> indicates that UTF-8 is the "native" encoding on the OP's machine, and
> there is an overwhelming user demand for silently and transparently
> switch to such a native encoding when we need to select an encoding.

The `coding' cookie trumps any such setting, since when we read the
file, we will blindly obey the cookie without paying any attention to
the user's locale.
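
For readers who, like the reporter, are unsure what the cookie is: it is the
"-*- coding: ... -*-" comment near the top of the file. The following is a
rough Python sketch of reading it; the regular expression and the function
name are assumptions made for illustration, and the sketch deliberately
ignores the "Local Variables" block that Emacs also accepts at the end of a
file.

```python
import re

# Simplified pattern for a "-*- ... coding: NAME ... -*-" file-variable line.
COOKIE = re.compile(r"-\*-.*?coding:\s*([A-Za-z0-9_.-]+).*?-\*-")


def coding_cookie(lines):
    """Return the coding name found in the first two lines, or None."""
    for line in lines[:2]:
        match = COOKIE.search(line)
        if match:
            return match.group(1)
    return None


# The header from the original report.
header = ["#!/usr/bin/perl", "# -*- coding: iso-8859-1 -*-", "#"]
print(coding_cookie(header))  # iso-8859-1
```

"Editing the coding cookie" then simply means changing the name after
"coding:" in that comment line.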


        Stefan






* bug#4712: File encoding
  2009-10-14 14:33       ` Andreas Schwab
@ 2009-10-14 18:11         ` Eli Zaretskii
  0 siblings, 0 replies; 12+ messages in thread
From: Eli Zaretskii @ 2009-10-14 18:11 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: ezander, 4712

> From: Andreas Schwab <schwab@linux-m68k.org>
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, 4712@emacsbugs.donarmstrong.com,
>         ezander@valkyrie.sc.cs.tu-bs.de
> Date: Wed, 14 Oct 2009 16:33:19 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Maybe so, but this part of the OP's report:
> >
> >   value of $LANG: en_AU.UTF-8
> >   locale-coding-system: utf-8
> >
> > indicates that UTF-8 is the "native" encoding on the OP's machine, and
> > there is an overwhelming user demand for silently and transparently
> > switch to such a native encoding when we need to select an encoding.
> 
> How can silently producing a broken file be desirable?

It isn't, and we don't.  You misunderstood what I was saying.






* bug#4712: File encoding
  2009-10-14 15:19       ` Stefan Monnier
@ 2009-10-14 18:15         ` Eli Zaretskii
  2009-10-15  3:16           ` Stefan Monnier
  0 siblings, 1 reply; 12+ messages in thread
From: Eli Zaretskii @ 2009-10-14 18:15 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: schwab, ezander, 4712

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: 4712@emacsbugs.donarmstrong.com,  schwab@linux-m68k.org,  ezander@valkyrie.sc.cs.tu-bs.de
> Date: Wed, 14 Oct 2009 11:19:47 -0400
> 
> > Maybe so, but this part of the OP's report:
> 
> >   value of $LANG: en_AU.UTF-8
> >   locale-coding-system: utf-8
> 
> > indicates that UTF-8 is the "native" encoding on the OP's machine, and
> > there is an overwhelming user demand for silently and transparently
> > switch to such a native encoding when we need to select an encoding.
> 
> The `coding' cookie trumps any such setting, since when we read the
> file, we will blindly obey the cookie without paying any attention to
> the user's locale.

Right, and we did:

  Selected encoding mule-utf-8-unix disagrees with iso-8859-1-unix specified by file contents.
                                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Except that we probably do it only _after_ defaulting to UTF-8.







* bug#4712: File encoding
  2009-10-14 18:15         ` Eli Zaretskii
@ 2009-10-15  3:16           ` Stefan Monnier
  2009-10-15  7:20             ` Eli Zaretskii
  2009-10-15  9:38             ` Andreas Schwab
  0 siblings, 2 replies; 12+ messages in thread
From: Stefan Monnier @ 2009-10-15  3:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: schwab, ezander, 4712

>> > Maybe so, but this part of the OP's report:
>> 
>> >   value of $LANG: en_AU.UTF-8
>> >   locale-coding-system: utf-8
>> 
>> > indicates that UTF-8 is the "native" encoding on the OP's machine, and
>> > there is an overwhelming user demand for silently and transparently
>> > switch to such a native encoding when we need to select an encoding.
>> 
>> The `coding' cookie trumps any such setting, since when we read the
>> file, we will blindly obey the cookie without paying any attention to
>> the user's locale.

> Right, and we did:

>   Selected encoding mule-utf-8-unix disagrees with iso-8859-1-unix
>   specified by file contents.

The fact that we even consider utf-8 is the bug, it means that the
coding cookie didn't actually "trump" the locale setting.


        Stefan






* bug#4712: File encoding
  2009-10-15  3:16           ` Stefan Monnier
@ 2009-10-15  7:20             ` Eli Zaretskii
  2009-10-15  9:38             ` Andreas Schwab
  1 sibling, 0 replies; 12+ messages in thread
From: Eli Zaretskii @ 2009-10-15  7:20 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: schwab, ezander, 4712

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: 4712@emacsbugs.donarmstrong.com,  schwab@linux-m68k.org,  ezander@valkyrie.sc.cs.tu-bs.de
> Date: Wed, 14 Oct 2009 23:16:32 -0400
> 
> >> > Maybe so, but this part of the OP's report:
> >> 
> >> >   value of $LANG: en_AU.UTF-8
> >> >   locale-coding-system: utf-8
> >> 
> >> > indicates that UTF-8 is the "native" encoding on the OP's machine, and
> >> > there is an overwhelming user demand for silently and transparently
> >> > switch to such a native encoding when we need to select an encoding.
> >> 
> >> The `coding' cookie trumps any such setting, since when we read the
> >> file, we will blindly obey the cookie without paying any attention to
> >> the user's locale.
> 
> > Right, and we did:
> 
> >   Selected encoding mule-utf-8-unix disagrees with iso-8859-1-unix
> >   specified by file contents.
> 
> The fact that we even consider utf-8 is the bug, it means that the
> coding cookie didn't actually "trump" the locale setting.

Sigh...  In case it wasn't clear, I didn't mean to say this is not a
problem.  I tried to explain why it happens, so that someone who cares
could find the proper fix faster and without disrupting other
important features while at that.  But if no one wants to hear, I
guess I'll crawl back under my rock...






* bug#4712: File encoding
  2009-10-15  3:16           ` Stefan Monnier
  2009-10-15  7:20             ` Eli Zaretskii
@ 2009-10-15  9:38             ` Andreas Schwab
  2009-10-17  4:01               ` Stefan Monnier
  1 sibling, 1 reply; 12+ messages in thread
From: Andreas Schwab @ 2009-10-15  9:38 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: ezander, 4712

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> > Maybe so, but this part of the OP's report:
>>> 
>>> >   value of $LANG: en_AU.UTF-8
>>> >   locale-coding-system: utf-8
>>> 
>>> > indicates that UTF-8 is the "native" encoding on the OP's machine, and
>>> > there is an overwhelming user demand for silently and transparently
>>> > switch to such a native encoding when we need to select an encoding.
>>> 
>>> The `coding' cookie trumps any such setting, since when we read the
>>> file, we will blindly obey the cookie without paying any attention to
>>> the user's locale.
>
>> Right, and we did:
>
>>   Selected encoding mule-utf-8-unix disagrees with iso-8859-1-unix
>>   specified by file contents.
>
> The fact that we even consider utf-8 is the bug, it means that the
> coding cookie didn't actually "trump" the locale setting.

Why do you think this is a bug?  The designated coding cannot encode the
buffer, so some other encoding must be selected.  That's the whole point
of the message.  It's the same when you force a coding with C-x C-m C-f,
except that then you get the more detailed message.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."






* bug#4712: File encoding
  2009-10-15  9:38             ` Andreas Schwab
@ 2009-10-17  4:01               ` Stefan Monnier
  0 siblings, 0 replies; 12+ messages in thread
From: Stefan Monnier @ 2009-10-17  4:01 UTC (permalink / raw)
  To: 4712

I believe the patch below fixed the problem.


        Stefan


--- mule-cmds.el.~1.375.~	2009-10-15 18:22:51.000000000 -0400
+++ mule-cmds.el	2009-10-16 23:56:10.000000000 -0400
@@ -889,13 +889,13 @@
 		  default-coding-system))
 
     (if (and auto-cs (not no-other-defaults))
-	;; If the file has a coding cookie, try to use it before anything
-	;; else (i.e. before default-coding-system which will typically come
-	;; from file-coding-system-alist).
+	;; If the file has a coding cookie, use it regardless of any
+	;; other setting.
 	(let ((base (coding-system-base auto-cs)))
 	  (or (memq base '(nil undecided))
-	      (rassq base default-coding-system)
-	      (push (cons auto-cs base) default-coding-system))))
+              (progn
+                (setq default-coding-system (list (cons auto-cs base)))
+                (setq no-other-defaults t)))))
 
     (unless no-other-defaults
       ;; If buffer-file-coding-system is not nil nor undecided, append it




