unofficial mirror of emacs-devel@gnu.org 
* Default setting of `mm-coding-system-priorities'
@ 2007-02-12  8:14 David Kastrup
From: David Kastrup @ 2007-02-12  8:14 UTC
  To: emacs-devel


Hi,

Nowadays a lot of GNU/Linux systems set up a utf-8 locale by
default.  The default value of mm-coding-system-priorities, however,
is nil unless the language environment is Japanese:

(defcustom mm-coding-system-priorities
  (if (boundp 'current-language-environment)
      (let ((lang (symbol-value 'current-language-environment)))
	(cond ((string= lang "Japanese")
	       ;; Japanese users prefer iso-2022-jp to euc-japan or
	       ;; shift_jis, however iso-8859-1 should be used when
	       ;; there are only ASCII text and Latin-1 characters.
	       '(iso-8859-1 iso-2022-jp iso-2022-jp-2 shift_jis utf-8)))))
  "Preferred coding systems for encoding outgoing messages.

More than one suitable coding system may be found for some text.
By default, the coding system with the highest priority is used
to encode outgoing messages (see `sort-coding-systems').  If this
variable is set, it overrides the default priority."
  :version "21.2"
  :type '(repeat (symbol :tag "Coding system"))
  :group 'mime)

The problem is that a _lot_ of mail and news readers in frequent
use (often people's old favorites) don't grok utf-8, while pretty
much every one of them gets along fine with Latin-1.  So I think that,
at least in a standard language locale, we should prefer Latin-1
whenever it can encode the message.  My language environment here is:

English language environment

Sample text:
  Hello!, Hi!, How are you?

Input methods:
  english-dvorak ("DV@" in mode line)

Character sets:
  ascii: ASCII (ISO646 IRV)

Coding systems:
  nothing specific to English


And my coding system settings are:

Coding system for saving this buffer:
  = -- emacs-mule

Default coding system (for new files):
  u -- mule-utf-8 (alias: utf-8)

Coding system for keyboard input:
  nil
Coding system for terminal output:
  u -- utf-8 (alias of mule-utf-8)

Defaults for subprocess I/O:
  decoding: u -- mule-utf-8 (alias: utf-8)

  encoding: u -- mule-utf-8 (alias: utf-8)


Priority order for recognizing coding systems when reading files:
  1. mule-utf-8 (alias: utf-8)
  2. iso-latin-1 (alias: iso-8859-1 latin-1)
  3. mule-utf-16be-with-signature (alias: utf-16be-with-signature mule-utf-16-be utf-16-be)
  4. mule-utf-16le-with-signature (alias: utf-16le-with-signature mule-utf-16-le utf-16-le)
  5. iso-2022-jp (alias: junet)
  6. iso-2022-7bit 
  7. iso-2022-7bit-lock (alias: iso-2022-int-1)
  8. iso-2022-8bit-ss2 
  9. emacs-mule 
  10. raw-text 
  11. japanese-shift-jis (alias: shift_jis sjis cp932)
  12. chinese-big5 (alias: big5 cn-big5 cp950)
  13. no-conversion 

  Other coding systems cannot be distinguished automatically
  from these, and therefore cannot be recognized automatically
  with the present coding system priorities.

  The following are decoded correctly but recognized as iso-2022-7bit-lock:
    iso-2022-7bit-ss2 iso-2022-7bit-lock-ss2 iso-2022-cn iso-2022-cn-ext
    iso-2022-jp-2 iso-2022-kr

Particular coding systems specified for certain file names:

  OPERATION	TARGET PATTERN		CODING SYSTEM(s)
  ---------	--------------		----------------
  File I/O	"\\.dz\\'"		(no-conversion . no-conversion)
		"\\.g?z\\(~\\|\\.~[0-9]+~\\)?\\'"
					(no-conversion . no-conversion)
		"\\.tgz\\'"		(no-conversion . no-conversion)
		"\\.tbz\\'"		(no-conversion . no-conversion)
		"\\.bz2\\'"		(no-conversion . no-conversion)
		"\\.Z\\(~\\|\\.~[0-9]+~\\)?\\'"
					(no-conversion . no-conversion)
		"\\.elc\\'"		(emacs-mule . emacs-mule)
		"\\.utf\\(-8\\)?\\'"	utf-8
		"\\(\\`\\|/\\)loaddefs.el\\'"
					(raw-text . raw-text-unix)
		"\\.tar\\'"		(no-conversion . no-conversion)
		"\\.po[tx]?\\'\\|\\.po\\."
					po-find-file-coding-system
		"\\.\\(tex\\|ltx\\|dtx\\|drv\\)\\'"
					latexenc-find-file-coding-system
		""			(undecided)
  Process I/O	nothing specified
  Network I/O	nothing specified



It would now seem appropriate to make mm-coding-system-priorities
effectively default to '(iso-8859-1 utf-8), namely first try
iso-8859-1 before going over to utf-8.
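
To make "first try iso-8859-1 before going over to utf-8" concrete,
here is a rough sketch of that fallback.  The helper name
my-pick-coding-system is hypothetical and is not how Gnus implements
the selection; it only assumes that `unencodable-char-position' is
available:

```elisp
(defun my-pick-coding-system (text priorities)
  "Return the first coding system in PRIORITIES that can encode TEXT.
Return nil if none of them can.  Hypothetical helper for
illustration only, not part of Gnus."
  (catch 'found
    (dolist (cs priorities)
      ;; `unencodable-char-position' returns nil when every
      ;; character of TEXT can be encoded in CS.
      (unless (unencodable-char-position 0 (length text) cs nil text)
        (throw 'found cs)))
    nil))

;; "ü" and "ß" are Latin-1 characters.
(my-pick-coding-system "Grüße" '(iso-8859-1 utf-8))
;; The euro sign is not part of Latin-1.
(my-pick-coding-system "Grüße €" '(iso-8859-1 utf-8))
```

With such an order, utf-8 would only be chosen for messages that
Latin-1 cannot represent.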

I am not sure, though, whether this change should be made at the
level of mm-coding-system-priorities or in the English language
environment.  But since it is reasonable in a utf-8 locale to treat
files that are read and written as primarily utf-8, it might really
be appropriate to confine the Latin-1 preference to mail and news
interchange.

But there it is _definitely_ preferable.
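
Until the default changes, a minimal sketch of the per-user setting,
assuming it goes into the init file.  It only affects how outgoing
mail and news are encoded and leaves the file coding priorities
listed above untouched:

```elisp
;; Assumption: placed in the user's init file (e.g. ~/.emacs).
;; Try Latin-1 first for outgoing messages; fall back to utf-8
;; only for text that Latin-1 cannot encode.
(setq mm-coding-system-priorities '(iso-8859-1 utf-8))
```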

-- 
David Kastrup
