From: Eli Zaretskii <eliz@gnu.org>
To: Paul Eggert <eggert@cs.ucla.edu>
Cc: dak@gnu.org, monnier@iro.umontreal.ca, emacs-devel@gnu.org
Subject: Re: [Emacs-diffs] master db828f6: Don't rely on defaults in decoding UTF-8 encoded Lisp files
Date: Sun, 27 Sep 2015 11:55:36 +0300
Message-ID: <83fv20fdh3.fsf@gnu.org>
In-Reply-To: <5607A758.4020205@cs.ucla.edu>

> Cc: dak@gnu.org, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sun, 27 Sep 2015 01:22:48 -0700
> 
> Eli Zaretskii wrote:
> > I've also looked at the *.po files in the latest releases of GNU Make,
> > Gawk, Texinfo, and Binutils, and I find that between 20% and 25% of
> > such files still use non-UTF-8 encodings.
> 
> Yes, and those files are a pain to look at with Emacs now, since it typically 
> misguesses their encodings.  Presumably Emacs should be looking at .po files' 
> charset= decorations.

You need to install po-mode.

But anyway, that's not the issue at hand.  I just used those files as
indicators of preferences of some locales.

> > while I agree with you that UTF-8 encoded files are the majority
> > among non-ASCII files (and Emacs development aligns itself with that
> > fact very well), the non-UTF-8 minority, even in the Posix world, is
> > still significant enough, and we cannot possibly ignore it.
> 
> Naturally we cannot ignore it.  All I'm suggesting is that we change the default 
> behavior so that it's more UTF-8 friendly, since that's the way the world is 
> going.  The old Emacs behavior should still be available, for people who need it.

You use "default" here in a sense that is different from what the Mule
stuff does.  Since Emacs attempts to support i18n, not just l10n, it
cannot ask users to modify their defaults whenever they meet a file
that's decoded incorrectly.  Emacs uses the defaults in this area as
the last resort, when no other information is available in the file
itself or its accompanying meta-data.  That default is already as
friendly to UTF-8 as possible: UTF-8 is used in any locale where
that's the default.  Going further, i.e. preferring UTF-8 in locales
whose preferences are different, will simply bring back the old bugs
and misfeatures of Emacs 20 and 21 which we worked so hard to
eradicate.
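
To illustrate the order of precedence described above, here is a minimal
sketch in Python. It is not how Emacs implements this; the function name
`choose_encoding` and the two regular expressions are assumptions made up
for the example. It looks for metadata inside the file first (a "coding:"
cookie or a PO-style charset= field) and consults the locale's preference
only when nothing else is available.

```python
import locale
import re
from typing import Optional

# Hypothetical patterns for in-file metadata: an Emacs-style "coding:"
# cookie and the charset= field of a PO file header.
CODING_COOKIE = re.compile(rb"coding:\s*([A-Za-z0-9._-]+)")
PO_CHARSET = re.compile(rb"charset=([A-Za-z0-9._-]+)")


def choose_encoding(head: bytes, locale_default: Optional[str] = None) -> str:
    """Return an encoding name: in-file metadata first, locale default last."""
    for pattern in (CODING_COOKIE, PO_CHARSET):
        match = pattern.search(head)
        if match:
            return match.group(1).decode("ascii").lower()
    # Last resort: no information in the file, so use the locale's preference.
    if locale_default is None:
        locale_default = locale.getpreferredencoding(False)
    return locale_default.lower()
```

With this ordering, a user in a non-UTF-8 locale never has to change a
default to read a file that declares its own encoding; the default only
matters for files that say nothing about themselves.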

IMO, the _only_ sane way forward is to introduce more reliable ways of
detecting the encoding, whether by using some new kinds of meta-data
or by more extensive analysis of the text itself.  (The latter
solution will probably have difficulties with decoding sub-process
output, but it could be very efficient with disk files and large
bodies of text made available to Emacs at once.)
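
As a rough sketch of what "analysis of the text itself" could mean in its
simplest form, the Python snippet below tests whether a whole buffer is
valid UTF-8 and otherwise keeps the fallback. The name `guess_by_content`
is hypothetical, and this is far cruder than anything the detect_coding_*
code does; it only shows the idea.

```python
def guess_by_content(data: bytes, fallback: str) -> str:
    """Guess an encoding from the bytes alone, else return the fallback."""
    if data.isascii():
        # Pure ASCII gives no evidence either way; keep the fallback.
        return fallback
    try:
        # Strict decoding rejects any malformed UTF-8 sequence.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return fallback
    # Non-ASCII bytes that all form valid UTF-8 sequences are strong
    # evidence for UTF-8.
    return "utf-8"
```

A check like this needs the complete text to be meaningful. With
sub-process output that arrives in chunks, a chunk can end in the middle
of a multibyte sequence, so a verdict reached on a partial buffer may be
wrong.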

IOW, I don't think we will be able to change our locale-derived
defaults any time soon.  What we can do is minimize the probability of
having to fall back on those defaults.  But this requires that
Someone™ volunteers to revamp our detect_coding_* implementations in
that direction.



