unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Eli Zaretskii <eliz@gnu.org>
To: Paul Eggert <eggert@cs.ucla.edu>
Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
Subject: Re: [Emacs-diffs] master db828f6: Don't rely on defaults in decoding UTF-8 encoded Lisp files
Date: Sat, 26 Sep 2015 20:25:19 +0300	[thread overview]
Message-ID: <83wpvdf5z4.fsf@gnu.org> (raw)
In-Reply-To: <5606C140.6090309@cs.ucla.edu>

> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sat, 26 Sep 2015 09:01:04 -0700
> 
> Eli Zaretskii wrote:
> > So you are, in effect, saying that it is incorrect to derive the
> > default encodings from the locale's codeset?
> 
> Yes, for Emacs developers.  And come to think of it, for most Emacs users. 
> Nowadays in my experience most non-ASCII text files use UTF-8, regardless of 
> locale.

Are you sure your experience isn't biased by the fact you mostly work
in UTF-8 locales?

> The old days of having to guess encoding from the locale are passing
> away.  This is partly due to UTF-8 being the encoding of choice for
> HTML and XML, where UTF-8 overtook the older 8-bit encodings in 2008
> and now is by far the dominant encoding.

We already DTRT with XML files, and should be doing TRT with any file
format that includes the specification of the encoding in it.

The problem, IMO, is not only with disk files.  It is also with email
messages, output from processes, etc.  E.g., I routinely get Latin-1
encoded email from people whose platform is GNU/Linux.  IOW, non-UTF
encodings are far from being dead yet.

Using UTF-8 by default is certainly wrong on MS-Windows.

> One way to accommodate the new reality would be to change Emacs so that by 
> default the system locale does not affect Emacs's guess of a file's encoding if 
> the file's initial sample is valid UTF-8.  Users could set a variable to 
> re-enable the old behavior.

The problem with this line of thought is that "initial sample" part --
how far into the file should we look, how far is far enough?  E.g.,
tips.texi has its first non-ASCII character at character position
25353.  We've been there before, and found this not reliable enough.

Anyway, doesn't "(prefer-coding-system 'utf-8)" already does what you
want us to offer?



  parent reply	other threads:[~2015-09-26 17:25 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20150921165211.20434.28114@vcs.savannah.gnu.org>
     [not found] ` <E1Ze4K3-0005KC-5U@vcs.savannah.gnu.org>
2015-09-21 19:57   ` [Emacs-diffs] master db828f6: Don't rely on defaults in decoding UTF-8 encoded Lisp files Stefan Monnier
2015-09-21 20:07     ` Eli Zaretskii
2015-09-24 16:44       ` Eli Zaretskii
2015-09-24 21:29         ` Stefan Monnier
2015-09-25  7:55           ` Eli Zaretskii
2015-09-25 12:21             ` Stefan Monnier
2015-09-25 13:37               ` Eli Zaretskii
2015-09-25 22:32               ` Paul Eggert
2015-09-26  6:27                 ` Eli Zaretskii
2015-09-26  6:32                   ` Eli Zaretskii
2015-09-26 14:31                   ` Paul Eggert
2015-09-26 15:15                     ` Eli Zaretskii
2015-09-26 16:01                       ` Paul Eggert
2015-09-26 16:09                         ` David Kastrup
2015-09-26 17:26                           ` Eli Zaretskii
2015-09-26 18:53                           ` Paul Eggert
2015-09-26 19:35                             ` Eli Zaretskii
2015-09-26 20:26                               ` Chad Brown
2015-09-26 21:50                                 ` David Kastrup
2015-09-27  4:44                                   ` Paul Eggert
2015-09-27  5:29                                     ` David Kastrup
2015-09-27  7:38                                       ` Paul Eggert
2015-09-27  7:46                                         ` David Kastrup
2015-09-27  7:52                                           ` Paul Eggert
2015-09-27  9:47                                       ` Andreas Schwab
2015-09-27  9:54                                         ` David Kastrup
2015-09-27 10:03                                           ` Andreas Schwab
2015-09-27 10:12                                             ` David Kastrup
2015-09-27 11:10                                               ` Andreas Schwab
2015-09-27 22:48                                       ` Richard Stallman
2015-09-28  2:41                                         ` Paul Eggert
2015-09-28  6:53                                           ` Eli Zaretskii
2015-09-28 15:08                                             ` Paul Eggert
2015-09-28 15:58                                               ` Eli Zaretskii
2015-09-27  7:39                                     ` Eli Zaretskii
2015-09-27  7:52                                       ` Paul Eggert
2015-09-27  8:00                                         ` David Kastrup
2015-09-27  8:03                                         ` Eli Zaretskii
2015-09-27  8:29                                           ` Paul Eggert
2015-09-27  8:37                                             ` David Kastrup
2015-09-27  8:40                                               ` Paul Eggert
2015-09-27  8:50                                                 ` David Kastrup
2015-09-27 10:14                                                 ` Eli Zaretskii
2015-09-27  8:57                                             ` Eli Zaretskii
2015-09-27  7:34                                 ` Eli Zaretskii
2015-09-27 16:03                                   ` Chad Brown
2015-09-27 18:41                                     ` Eli Zaretskii
2015-09-27 19:52                                       ` Chad Brown
2015-09-27 20:52                                         ` Eli Zaretskii
2015-09-26 20:32                               ` Paul Eggert
2015-09-27  7:27                                 ` Eli Zaretskii
2015-09-27  7:42                                   ` David Kastrup
2015-09-27  9:20                                     ` Rustom Mody
2015-09-27 10:13                                       ` Eli Zaretskii
2015-09-27 20:21                                         ` Paul Eggert
2015-09-27 21:04                                           ` Eli Zaretskii
2015-09-27  8:22                                   ` Paul Eggert
2015-09-27  8:55                                     ` Eli Zaretskii
2015-09-27  9:56                                     ` Andreas Schwab
2015-09-27 10:04                                       ` David Kastrup
2015-09-27 10:16                                         ` Eli Zaretskii
2015-09-27 10:36                                           ` Eli Zaretskii
2015-09-27 10:59                                             ` Eli Zaretskii
2015-09-27 20:05                                               ` Paul Eggert
2015-09-26 17:25                         ` Eli Zaretskii [this message]
2015-09-26 18:51                           ` Paul Eggert
2015-09-27  0:12                         ` stephen
2015-09-27  4:44                           ` Paul Eggert
2015-09-27  6:20                             ` stephen
2015-09-27  8:34                               ` Paul Eggert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=83wpvdf5z4.fsf@gnu.org \
    --to=eliz@gnu.org \
    --cc=eggert@cs.ucla.edu \
    --cc=emacs-devel@gnu.org \
    --cc=monnier@iro.umontreal.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).