Default setting of `mm-coding-system-priorities'

From: David Kastrup <dak@gnu.org>
To: emacs-devel@gnu.org
Subject: Default setting of `mm-coding-system-priorities'
Date: Mon, 12 Feb 2007 09:14:45 +0100
Message-ID: <86sldbdc7e.fsf@lola.quinscape.zz>

Hi,

Nowadays a lot of GNU/Linux systems set up a UTF-8 locale by
default.  The current default value of
`mm-coding-system-priorities' is

(defcustom mm-coding-system-priorities
  (if (boundp 'current-language-environment)
      (let ((lang (symbol-value 'current-language-environment)))
	(cond ((string= lang "Japanese")
	       ;; Japanese users prefer iso-2022-jp to euc-japan or
	       ;; shift_jis, however iso-8859-1 should be used when
	       ;; there are only ASCII text and Latin-1 characters.
	       '(iso-8859-1 iso-2022-jp iso-2022-jp-2 shift_jis utf-8)))))
  "Preferred coding systems for encoding outgoing messages.

More than one suitable coding system may be found for some text.
By default, the coding system with the highest priority is used
to encode outgoing messages (see `sort-coding-systems').  If this
variable is set, it overrides the default priority."
  :version "21.2"
  :type '(repeat (symbol :tag "Coding system"))
  :group 'mime)

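Outside the Japanese language environment this default is nil, so the
ordinary coding system priority applies, and in a UTF-8 locale that
puts utf-8 first.

Now the problem is that a _lot_ of mail and news readers in frequent
use (often people's old favorites) don't grok utf-8, while pretty
much every one of them gets along fine with Latin-1.  So I think that
we should prefer Latin-1 for outgoing messages, at least in a standard
language environment.  Mine is reported as: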

English language environment

Sample text:
  Hello!, Hi!, How are you?

Input methods:
  english-dvorak ("DV@" in mode line)

Character sets:
  ascii: ASCII (ISO646 IRV)

Coding systems:
  nothing specific to English

And I have

Coding system for saving this buffer:
  = -- emacs-mule

Default coding system (for new files):
  u -- mule-utf-8 (alias: utf-8)

Coding system for keyboard input:
  nil
Coding system for terminal output:
  u -- utf-8 (alias of mule-utf-8)

Defaults for subprocess I/O:
  decoding: u -- mule-utf-8 (alias: utf-8)

  encoding: u -- mule-utf-8 (alias: utf-8)

Priority order for recognizing coding systems when reading files:
  1. mule-utf-8 (alias: utf-8)
  2. iso-latin-1 (alias: iso-8859-1 latin-1)
  3. mule-utf-16be-with-signature (alias: utf-16be-with-signature mule-utf-16-be utf-16-be)
  4. mule-utf-16le-with-signature (alias: utf-16le-with-signature mule-utf-16-le utf-16-le)
  5. iso-2022-jp (alias: junet)
  6. iso-2022-7bit 
  7. iso-2022-7bit-lock (alias: iso-2022-int-1)
  8. iso-2022-8bit-ss2 
  9. emacs-mule 
  10. raw-text 
  11. japanese-shift-jis (alias: shift_jis sjis cp932)
  12. chinese-big5 (alias: big5 cn-big5 cp950)
  13. no-conversion 

  Other coding systems cannot be distinguished automatically
  from these, and therefore cannot be recognized automatically
  with the present coding system priorities.

  The following are decoded correctly but recognized as iso-2022-7bit-lock:
    iso-2022-7bit-ss2 iso-2022-7bit-lock-ss2 iso-2022-cn iso-2022-cn-ext
    iso-2022-jp-2 iso-2022-kr

Particular coding systems specified for certain file names:

  OPERATION	TARGET PATTERN		CODING SYSTEM(s)
  ---------	--------------		----------------
  File I/O	"\\.dz\\'"		(no-conversion . no-conversion)
		"\\.g?z\\(~\\|\\.~[0-9]+~\\)?\\'"
					(no-conversion . no-conversion)
		"\\.tgz\\'"		(no-conversion . no-conversion)
		"\\.tbz\\'"		(no-conversion . no-conversion)
		"\\.bz2\\'"		(no-conversion . no-conversion)
		"\\.Z\\(~\\|\\.~[0-9]+~\\)?\\'"
					(no-conversion . no-conversion)
		"\\.elc\\'"		(emacs-mule . emacs-mule)
		"\\.utf\\(-8\\)?\\'"	utf-8
		"\\(\\`\\|/\\)loaddefs.el\\'"
					(raw-text . raw-text-unix)
		"\\.tar\\'"		(no-conversion . no-conversion)
		"\\.po[tx]?\\'\\|\\.po\\."
					po-find-file-coding-system
		"\\.\\(tex\\|ltx\\|dtx\\|drv\\)\\'"
					latexenc-find-file-coding-system
		""			(undecided)
  Process I/O	nothing specified
  Network I/O	nothing specified

It would now seem appropriate to make `mm-coding-system-priorities'
effectively default to '(iso-8859-1 utf-8), that is, to try
iso-8859-1 first and go over to utf-8 only when that fails.
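
Until the default changes, a user can get the same effect locally.
The following is only a minimal sketch for a personal init file (the
file name in the comment is an assumption); it is not the proposed
patch to the defcustom itself:

```elisp
;; Sketch for a personal init file (for example ~/.emacs); this does
;; not change the shipped default.  Prefer Latin-1 when encoding
;; outgoing messages and use UTF-8 only for text that Latin-1 cannot
;; represent.
(setq mm-coding-system-priorities '(iso-8859-1 utf-8))
```

According to its docstring the variable governs the encoding of
outgoing messages, so the priority order used for reading and writing
files should stay as shown above.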

I am not sure whether this change should be made at the level of
`mm-coding-system-priorities' or in the English language
environment, though.  But since in a UTF-8 locale it is reasonable
to treat files that are read and written primarily as utf-8, it
might really be appropriate to confine the Latin-1 preference to
mail and news interchange.

But there it is _definitely_ preferable.

-- 
David Kastrup
