From: Eli Zaretskii <eliz@gnu.org>
To: help-gnu-emacs@gnu.org
Subject: Re: Understanding how to specify UTF-8
Date: Thu, 13 Apr 2017 10:18:04 +0300
Message-ID: <83a87kixn7.fsf@gnu.org>
In-Reply-To: <ocn17208ke@news4.newsguy.com> (btraven@nihilo.net)

[Resending with the correct Subject.]

> From: "B. T. Raven" <btraven@nihilo.net>
> Date: Thu, 13 Apr 2017 00:09:51 -0500
> 
> I also have these lines in my .emacs:
> 
>    (set-locale-environment   "utf-8")
>          (set-language-environment               'utf-8)
>          (set-default-coding-systems             'utf-8)
>          (setq file-name-coding-system           'utf-8)
>          (setq buffer-file-coding-system 'utf-8)
>          (setq coding-system-for-write           'utf-8)
>          (set-keyboard-coding-system             'utf-8)
>          (set-terminal-coding-system          'utf-8)
>          (prefer-coding-system                   'utf-8)
>          ;; (set-buffer-process-coding-system 'utf-8 'utf-8)
>          (modify-coding-system-alist 'process 
> "[cC][mM][dD][pP][rR][oO][xX][yY]" 'utf-8-dos)
> 
> 
> The line commented out caused a problem but I don't remember what it 
> was. My os w64 vers. 7

Some of the above are not recommended, and some are downright
dangerous (a.k.a. "shooting yourself in the foot").  Especially on
MS-Windows, UTF-8 should be used with extra care, because Windows only
partially supports this encoding in its APIs.

Specifically:

>    (set-locale-environment   "utf-8")

Don't do this on Windows, as Windows locales cannot use UTF-8 as their
encoding.

>          (set-language-environment               'utf-8)
>          (set-default-coding-systems             'utf-8)

Redundant as long as you have the prefer-coding-system call below.

>          (setq file-name-coding-system           'utf-8)

This is a no-op: except on Windows 9X, Emacs on Windows ignores the
value of this variable, and file names cannot be encoded in UTF-8 on
Windows anyway.  Starting with Emacs 24.4, Emacs on Windows
uses Unicode APIs to deal with file names, so it supports non-ASCII
file names with all Unicode characters, and you don't need to do
anything to get this support.

>          (setq buffer-file-coding-system 'utf-8)

Dangerous.  Also redundant with prefer-coding-system below.

>          (setq coding-system-for-write           'utf-8)

This is dangerous: it will produce subtle issues with some commands,
notably when invoking subprocesses with non-ASCII strings in
command-line arguments.  This variable exists so that Lisp programs
could force specific encoding where appropriate, so leave it to that
and don't globally set it.
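
As a rough sketch of the intended use, a Lisp program can bind the
variable with `let' around the one operation that needs it, so the
binding disappears as soon as the call returns.  The helper name
`my-write-utf8-file' is hypothetical, made up for this illustration,
and is not part of Emacs:

```elisp
;; Hypothetical helper (not part of Emacs): write TEXT to FILE as UTF-8.
;; `coding-system-for-write' is bound only for the duration of the call,
;; so the global value stays nil and other commands are unaffected.
(defun my-write-utf8-file (file text)
  "Write TEXT to FILE encoded as UTF-8, without changing global state."
  (let ((coding-system-for-write 'utf-8))
    (with-temp-buffer
      (insert text)
      ;; A VISIT argument that is neither t, nil, nor a string
      ;; suppresses the "Wrote file" message.
      (write-region (point-min) (point-max) file nil 'silent))))
```

Something like (my-write-utf8-file "~/notes.txt" "some text") would then
write that one file as UTF-8 and leave everything else alone.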

>          (set-keyboard-coding-system             'utf-8)
>          (set-terminal-coding-system          'utf-8)

These are wrong, and will get in the way when you work in -nw
sessions.  Emacs on MS-Windows doesn't fully support UTF-8 encoding of
keyboard input and console output, even if you tweak your system's
codepage to be 65001 (did you?).

>          (prefer-coding-system                   'utf-8)

This is the only setting that you should have if you want to use UTF-8
wherever possible and reasonable.
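
A minimal sketch of what the coding-system part of the init file could
then be reduced to, assuming nothing else in the init file touches
coding systems:

```elisp
;; Prefer UTF-8 whenever Emacs has a choice of encoding; leave the
;; locale, keyboard, terminal, and process settings at their defaults.
(prefer-coding-system 'utf-8)
```

After restarting Emacs, `C-h C RET' lists the coding systems currently
in effect, which is one way to check the result.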

>          ;; (set-buffer-process-coding-system 'utf-8 'utf-8)
>          (modify-coding-system-alist 'process 
> "[cC][mM][dD][pP][rR][oO][xX][yY]" 'utf-8-dos)

This is wrong: Emacs on MS-Windows doesn't support UTF-8 encoding of
program command-line arguments for subprocesses, and most Windows
programs will NOT talk UTF-8 in their standard streams.
prefer-coding-system should take care of those situations where this
is possible and actually happens; the rest should be left alone, or
you will have subtle problems with non-ASCII I/O vis-a-vis subprocesses.

HTH


