From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Stefan Monnier"
Newsgroups: gmane.emacs.devel
Subject: Re: Cyrillic vs UTF-8
Date: Mon, 19 May 2003 09:49:05 -0400
Message-ID: <200305191349.h4JDn5Ua019134@rum.cs.yale.edu>
References: <1858-Fri25Apr2003194023+0300-eliz@elta.co.il>
	<200304282149.h3SLnxSU002624@rum.cs.yale.edu>
	<200305190040.JAA01942@etlken.m17n.org>
	<200305190052.h4J0qUfa017404@rum.cs.yale.edu>
	<200305190231.LAA02082@etlken.m17n.org>
	<200305191328.h4JDSuWf019090@rum.cs.yale.edu>
Cc: monnier+gnu/emacs@rum.cs.yale.edu
Original-To: Kenichi Handa
Original-cc: jas@extundo.com
Original-cc: eliz@elta.co.il
Original-cc: emacs-devel@gnu.org
List-Id: Emacs development discussions.

> > > Maybe it is. In my situation, I'd like utf-8 to be at the top
> > > of the preferences w.r.t decoding because it virtually never
> > > guesses wrong.
> > > OTOH, I'm still using a mostly-latin-1 environment, so I'd
> > > still rather avoid utf-8 when I can. I.e. latin-1 should be at
> > > the top of my preferences w.r.t encoding.
> >
> > In that case, I think the source of the problem is that the
> > command prefer-coding-system doesn't satisfy this request of
> > yours:
> >   Prefer utf-8 only in automatic detection on reading a
> >   file, not for the other situations.
> >
> > (defun prefer-coding-system (coding-system)
> >   "Add CODING-SYSTEM at the front of the priority list for automatic detection.
> > This also sets the following coding systems:
> >   o coding system of a newly created buffer
> >   o default coding system for subprocess I/O
> > This also sets the following values:
> >   o default value used as `file-name-coding-system' for converting file names.
> >   o default value for the command `set-terminal-coding-system' (not on MSDOS)
> >   o default value for the command `set-keyboard-coding-system'
> >
> > How about changing it to skip "This also ..." parts if
> > called with a prefix argument?
> >
> > Then, on writing, if buffer-file-coding-system is not
> > locally bound, default-buffer-file-coding-system is tried
> > automatically.
> >
> > And, for the case that buffer-file-coding-system is locally
> > bound differently from default-buffer-file-coding-system,
> > but it can't encode the current buffer, we can change
> > select-safe-coding-system to try
> > default-buffer-file-coding-system before trying the most
> > preferred coding system.
> >
> > That way, I think we can satisfy your request completely.

That seems like a cheap way to get what I want indeed.
Actually I don't currently use prefer-coding-system (specifically
because I didn't want to set all those other coding-systems),
instead I use

   (when (boundp 'coding-category-utf-8)
     (set-coding-priority '(coding-category-utf-8)))

so I guess the only change that I care about is the part that uses
default-buffer-file-coding-system in preference to the most preferred
coding system (although it does sound paradoxical ;-)

The patch below would work for me; any comments or objections?


        Stefan


Index: mule-cmds.el
===================================================================
RCS file: /cvsroot/emacs/emacs/lisp/international/mule-cmds.el,v
retrieving revision 1.231
diff -u -u -b -r1.231 mule-cmds.el
--- mule-cmds.el	16 May 2003 04:15:20 -0000	1.231
+++ mule-cmds.el	19 May 2003 13:45:16 -0000
@@ -1,5 +1,5 @@
 ;;; mule-cmds.el --- commands for mulitilingual environment
-;; Copyright (C) 1995 Electrotechnical Laboratory, JAPAN.
+;; Copyright (C) 1995, 2003 Electrotechnical Laboratory, JAPAN.
 ;; Licensed to the Free Software Foundation.
 ;; Copyright (C) 2000, 2001, 2002, 2003 Free Software Foundation, Inc.
 
@@ -631,7 +631,8 @@
 between FROM and TO are shown in a popup window.  Among them, the most
 proper one is suggested as the default.
 
-The list of `buffer-file-coding-system' of the current buffer and the
+The list of `buffer-file-coding-system' of the current buffer,
+the `default-buffer-file-coding-system', and the
 most preferred coding system (if it corresponds to a MIME charset) is
 treated as the default coding system list.  Among them, the first one
 that safely encodes the text is normally selected silently and
@@ -648,8 +649,8 @@
 list of coding systems to be prepended to the default coding system
 list.  However, if DEFAULT-CODING-SYSTEM is a list and the first
 element is t, the cdr part is used as the defualt coding system list,
-i.e. `buffer-file-coding-system' and the most prepended coding system
-is not used.
+i.e. `buffer-file-coding-system', `default-buffer-file-coding-system',
+and the most preferred coding system are not used.
 
 Optional 4th arg ACCEPT-DEFAULT-P, if non-nil, is a function to
 determine the acceptability of the silently selected coding system.
@@ -679,6 +680,9 @@
             (mapcar (function (lambda (x) (cons x (coding-system-base x))))
                     default-coding-system))
 
+  ;; From now on, the list of defaults is reversed.
+  (setq default-coding-system (nreverse default-coding-system))
+
   (unless no-other-defaults
     ;; If buffer-file-coding-system is not nil nor undecided, append it
     ;; to the defaults.
@@ -686,24 +690,30 @@
         (let ((base (coding-system-base buffer-file-coding-system)))
           (or (eq base 'undecided)
               (rassq base default-coding-system)
-              (setq default-coding-system
-                    (append default-coding-system
-                            (list (cons buffer-file-coding-system base)))))))
+              (push (cons buffer-file-coding-system base)
+                    default-coding-system))))
+
+    ;; If default-buffer-file-coding-system is not nil nor undecided,
+    ;; append it to the defaults.
+    (if default-buffer-file-coding-system
+        (let ((base (coding-system-base default-buffer-file-coding-system)))
+          (or (eq base 'undecided)
+              (rassq base default-coding-system)
+              (push (cons default-buffer-file-coding-system base)
+                    default-coding-system))))
 
     ;; If the most preferred coding system has the property mime-charset,
     ;; append it to the defaults.
     (let ((tail coding-category-list)
           preferred base)
-      (while (and tail
-                  (not (setq preferred (symbol-value (car tail)))))
+      (while (and tail (not (setq preferred (symbol-value (car tail)))))
         (setq tail (cdr tail)))
       (and (coding-system-p preferred)
            (setq base (coding-system-base preferred))
            (coding-system-get preferred 'mime-charset)
            (not (rassq base default-coding-system))
-           (setq default-coding-system
-                 (append default-coding-system
-                         (list (cons preferred base))))))))
+           (push (cons preferred base)
+                 default-coding-system)))))
 
   (if select-safe-coding-system-accept-default-p
       (setq accept-default-p select-safe-coding-system-accept-default-p))
@@ -724,7 +734,7 @@
               (push (car elt) safe))
           (push (car elt) unsafe)))
       (if safe
-          (setq coding-system (car (last safe)))))
+          (setq coding-system (car safe))))
 
       ;; If all the defaults failed, ask a user.
       (when (not coding-system)
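
For anyone who wants to see the resulting order of defaults without
reading the whole function, here is a rough standalone sketch of the
idea. It is an illustration under assumptions, not the code in
mule-cmds.el: the my/ function names are hypothetical, the
mime-charset check on the most preferred coding system is left out,
and the safety test is passed in as a predicate instead of being
computed from the buffer text.

```elisp
;;; -*- lexical-binding: t -*-
;; Illustration only: the my/ names are hypothetical and are not part
;; of mule-cmds.el.
(require 'cl-lib)

(defun my/default-candidates (explicit buffer-cs default-cs preferred)
  "Return coding systems in the order the defaults would be tried.
EXPLICIT is a list of caller-supplied coding systems, kept as is.
BUFFER-CS, DEFAULT-CS and PREFERRED stand in for
`buffer-file-coding-system', `default-buffer-file-coding-system' and
the most preferred coding system; each may be nil.  One of them is
skipped when its base is `undecided' or is already in the list."
  (let ((result (reverse explicit)))    ; built in reverse, as in the patch
    (dolist (cs (list buffer-cs default-cs preferred))
      (when cs
        (let ((base (coding-system-base cs)))
          (unless (or (eq base 'undecided)
                      (cl-find base result :key #'coding-system-base))
            (push cs result)))))
    (nreverse result)))

(defun my/first-safe (candidates safe-p)
  "Return the first of CANDIDATES for which SAFE-P returns non-nil."
  (cl-find-if safe-p candidates))

;; Mostly-latin-1 setup with utf-8 as the most preferred coding system.
(let ((candidates (my/default-candidates nil nil 'iso-latin-1 'utf-8)))
  ;; Text that both can encode: the default is tried first and wins.
  (my/first-safe candidates (lambda (_cs) t))
  ;; Text that only utf-8 can encode (e.g. Cyrillic): fall through to utf-8.
  (my/first-safe candidates
                 (lambda (cs) (eq (coding-system-base cs) 'utf-8))))
```

With the default placed ahead of the most preferred coding system,
utf-8 is only picked for writing when the earlier candidates cannot
encode the text, which is the behaviour asked for above.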