unofficial mirror of emacs-devel@gnu.org 
* selecting an inapplicable coding-system
@ 2004-11-15 23:25 Stefan Monnier
  2004-11-16  5:00 ` Eli Zaretskii
  2004-11-24  6:31 ` Karl Eichwalder
  0 siblings, 2 replies; 4+ messages in thread
From: Stefan Monnier @ 2004-11-15 23:25 UTC



If I open a new file, insert é and then do the following:

   C-x RET f us-ascii RET
   C-x C-s

the file is saved in latin-1.  This is because when saving
buffer-file-coding-system is just one of several coding-systems used.

Another annoying situation is when you load a utf-8 file containing mostly
latin-1 chars plus a few non-latin-1 chars.  Let's say you don't know that
there are non-latin-1 chars and want to change the file to latin-1.  You do:

   C-x RET f latin-1 RET
   C-x C-s

and both the buffer and the file are back to utf-8 !?!

I think Emacs should give some feedback at some point between the C-x RET f
and the actual file save that the coding-system specified can't be used.
Ideally, it should also show the offending chars as is done when none of the
default coding systems can be used.
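
For illustration only, here is a rough sketch of how the offending chars
could be located from Lisp.  The helper name `my-first-unencodable-char' is
made up, and the sketch assumes an Emacs that provides
`unencodable-char-position' and a coding system named `utf-8' (older
versions call it `mule-utf-8'):

```elisp
;; Hypothetical helper, not part of the patch below or of Emacs itself.
(defun my-first-unencodable-char (coding-system)
  "Return (POSITION . CHAR) for the first char CODING-SYSTEM cannot encode.
Return nil if the whole accessible portion of the buffer can be encoded."
  (let ((pos (unencodable-char-position (point-min) (point-max)
                                        coding-system)))
    (when pos
      (cons pos (char-after pos)))))

;; Demo on a scratch buffer holding "café" (é is character 233).
(with-temp-buffer
  (insert "caf\u00e9")
  (list (my-first-unencodable-char 'us-ascii)   ; é at position 4 is rejected
        (my-first-unencodable-char 'utf-8)))    ; nil: everything is encodable
;; => ((4 . 233) nil)
```

A command could call such a helper right after C-x RET f and move point to
the returned position instead of waiting for the save to go wrong.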

I've been using the patch below for this purpose.

Another problem I've encountered (recently with the iso-2022-7bit ->
utf-8 -> iso-2022-7bit dance in mule-cmds.el) is that iso-2022-7bit cannot
encode eight-bit-control characters, so if you read an iso-2022-7bit file
with invalid sequences in it, you get a buffer that you can't save.
Worse yet, when you try to save it, it might say "selected encoding
mule-utf-8 disagrees with iso-2022-7bit-unix specified by file contents" but
if you look at the buffer's modeline it says "J", not "u", so you wonder
what's up with this utf-8 thing.


        Stefan


Index: lisp/international/mule.el
===================================================================
RCS file: /cvsroot/emacs/emacs/lisp/international/mule.el,v
retrieving revision 1.206
diff -u -r1.206 mule.el
--- lisp/international/mule.el	11 Nov 2004 21:39:41 -0000	1.206
+++ lisp/international/mule.el	15 Nov 2004 23:19:43 -0000
@@ -1151,18 +1151,31 @@
 surely saves the buffer with CODING-SYSTEM.  From a program, if you
 don't want to mark the buffer modified, just set the variable
 `buffer-file-coding-system' directly."
+  ;; FIXME: Use find-coding-systems-region to give a subset of all
+  ;; coding-systems in the completion table.  And provide a useful default
+  ;; (e.g. the one that select-safe-coding-system would have chosen, or the
+  ;; next best one if it's already the current coding system).
   (interactive "zCoding system for saving file (default, nil): \nP")
   (check-coding-system coding-system)
-  (if (and coding-system buffer-file-coding-system (null force))
-      (setq coding-system
-	    (merge-coding-systems coding-system buffer-file-coding-system)))
-  (setq buffer-file-coding-system coding-system)
+  (let ((cs (if (and coding-system buffer-file-coding-system (null force))
+		(merge-coding-systems coding-system buffer-file-coding-system)
+	      coding-system)))
+    (when (interactive-p)
+      ;; Check whether save-buffer would succeed, and if not, jump to the
+      ;; offending char(s) and give the user a chance to change her mind.
+      (let ((css (find-coding-systems-region (point-min) (point-max))))
+	(unless (or (eq (car css) 'undecided)
+		    (memq (coding-system-base cs) css))
+	  (setq cs (select-safe-coding-system-interactively
+		    (point-min) (point-max) css (list cs)
+		    nil coding-system)))))
+    (setq buffer-file-coding-system cs)
   ;; This is in case of an explicit call.  Normally, `normal-mode' and
   ;; `set-buffer-major-mode-hook' take care of setting the table.
   (if (fboundp 'ucs-set-table-for-input) ; don't lose when building
       (ucs-set-table-for-input))
   (set-buffer-modified-p t)
-  (force-mode-line-update))
+    (force-mode-line-update)))
 
 (defun revert-buffer-with-coding-system (coding-system &optional force)
   "Visit the current buffer's file again using coding system CODING-SYSTEM.

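The test the patch performs before prompting can also be read as a
standalone predicate.  A minimal sketch, assuming a reasonably recent Emacs;
the name `my-coding-system-usable-p' is hypothetical, and unlike the patch
this version never prompts the user:

```elisp
;; Hypothetical predicate restating the check from the patch above.
(defun my-coding-system-usable-p (coding-system)
  "Return non-nil if CODING-SYSTEM can encode the accessible buffer text."
  (let ((css (find-coding-systems-region (point-min) (point-max))))
    ;; `find-coding-systems-region' returns (undecided) when the text is
    ;; plain ASCII, meaning any coding system will do.  Otherwise it
    ;; returns the base coding systems that can encode the text.
    (or (eq (car css) 'undecided)
        (memq (coding-system-base coding-system) css))))

;; Demo on a scratch buffer holding "café".
(with-temp-buffer
  (insert "caf\u00e9")
  (list (and (my-coding-system-usable-p 'us-ascii) t)   ; cannot encode é
        (and (my-coding-system-usable-p 'latin-1) t)    ; can encode é
        (and (my-coding-system-usable-p 'utf-8) t)))    ; can encode é
```

Comparing against `coding-system-base' matters because the user may type an
alias or an eol variant such as latin-1 or utf-8-unix, while the list holds
only base names.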

* Re: selecting an inapplicable coding-system
  2004-11-15 23:25 selecting an inapplicable coding-system Stefan Monnier
@ 2004-11-16  5:00 ` Eli Zaretskii
  2004-11-16  5:14   ` Stefan
  2004-11-24  6:31 ` Karl Eichwalder
  1 sibling, 1 reply; 4+ messages in thread
From: Eli Zaretskii @ 2004-11-16  5:00 UTC
  Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Mon, 15 Nov 2004 18:25:12 -0500
> 
> 
> If I open a new file, insert é and then do the following:
> 
>    C-x RET f us-ascii RET
>    C-x C-s
> 
> the file is saved in latin-1.  This is because when saving
> buffer-file-coding-system is just one of several coding-systems used.

Isn't this a side effect of the feature that if the encoding that is
``natural'' in your language environment can encode the text, Emacs
silently uses that encoding?


* Re: selecting an inapplicable coding-system
  2004-11-16  5:00 ` Eli Zaretskii
@ 2004-11-16  5:14   ` Stefan
  0 siblings, 0 replies; 4+ messages in thread
From: Stefan @ 2004-11-16  5:14 UTC
  Cc: emacs-devel

>> C-x RET f us-ascii RET
>> C-x C-s
>> 
>> the file is saved in latin-1.  This is because when saving
>> buffer-file-coding-system is just one of several coding-systems used.

> Isn't this a side effect of the feature that if the encoding that is
> ``natural'' in your language environment can encode the text, Emacs
> silently uses that encoding?

Yes it is, and the "silently" is the problem here, since you just specified
manually that you want *another* encoding.


        Stefan


* Re: selecting an inapplicable coding-system
  2004-11-15 23:25 selecting an inapplicable coding-system Stefan Monnier
  2004-11-16  5:00 ` Eli Zaretskii
@ 2004-11-24  6:31 ` Karl Eichwalder
  1 sibling, 0 replies; 4+ messages in thread
From: Karl Eichwalder @ 2004-11-24  6:31 UTC
  Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> I think Emacs should give some feedback at some point between the C-x
> RET f and the actual file save that the coding-system specified can't
> be used.

I think the same.  It has happened more than once that I had to
figure out what was going on behind my back.

> Ideally, it should also show the offending chars as is done when none of the
> default coding systems can be used.

I'd appreciate such a feature a lot.

-- 
http://www.gnu.franken.de/ke/                           |      ,__o
                                                        |    _-\_<,
                                                        |   (*)/'(*)
Key fingerprint = F138 B28F B7ED E0AC 1AB4  AA7F C90A 35C3 E9D0 5D1C
