unofficial mirror of emacs-devel@gnu.org 
From: Paul Eggert <eggert@cs.ucla.edu>
To: stephen@xemacs.org
Cc: emacs-devel@gnu.org
Subject: Re: [Emacs-diffs] master db828f6: Don't rely on defaults in decoding UTF-8 encoded Lisp files
Date: Sun, 27 Sep 2015 01:34:39 -0700
Message-ID: <5607AA1F.4030508@cs.ucla.edu>
In-Reply-To: <E1Zg5KS-0005NI-Ul@turnbull.sk.tsukuba.ac.jp>

stephen@xemacs.org wrote:
> Perhaps most
> recently authored pages are UTF-8.  But the data sets themselves are
> typically flat files, either CSV or plaintext.  The explanatory pages,
> even if in HTML, often haven't been revised in decades.

Yes, that's pretty much my experience.  In Japan older stuff is mostly 
Shift-JIS, EUC, or maybe ISO-2022-JP.  New stuff is mostly UTF-8.  People using 
old email software send old encodings because that's what they've been doing for 
decades.  Normally it works, because the email envelope tells you the encoding.
But sometimes people screw up and you get mojibake.

But this situation is not an argument for having the locale determine encoding 
when visiting random imported files that lack envelopes.  For such files, it 
often doesn't work to set LC_ALL=ja_JP.ujis and expect Emacs to get things 
right.  (This is one of the things that Eli has noted multiple times, and he's right.)
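
To make the failure mode concrete, here is a rough Python sketch (not Emacs code, and not a description of how Emacs actually detects encodings). It assumes a hypothetical helper, decode_guess, that tries strict UTF-8 first and only then falls back to a locale-style encoding, and it shows what happens when the fallback guess is wrong for an envelope-less file.

```python
from typing import Tuple


def decode_guess(data: bytes, fallback: str) -> Tuple[str, str]:
    """Hypothetical helper: return (text, encoding_used).

    Tries strict UTF-8 first; if the bytes are not valid UTF-8,
    decodes with the caller-supplied fallback encoding instead,
    replacing undecodable bytes rather than raising.
    """
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return data.decode(fallback, errors="replace"), fallback


sample = "日本語"
sjis_bytes = sample.encode("shift_jis")   # an "old" file with no envelope
utf8_bytes = sample.encode("utf-8")       # a "new" file

# A UTF-8 file decodes correctly no matter what the fallback is.
print(decode_guess(utf8_bytes, "euc_jp"))

# A Shift-JIS file is only readable if the fallback happens to match.
print(decode_guess(sjis_bytes, "shift_jis"))

# With an EUC-JP locale as the guess, the same bytes come out as mojibake.
print(repr(decode_guess(sjis_bytes, "euc_jp")[0]))
```

The point of the sketch is only that a single locale-derived guess cannot be right for a mix of Shift-JIS, EUC-JP, and ISO-2022-JP files, whereas valid UTF-8 is easy to recognize on its own.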

Of course if one is working in a conservative Japanese government ministry that 
standardized on Shift-JIS back in 1992 and hasn't changed since then, then 
things are different, and Emacs should support such users.  But typical Emacs 
users are not in this situation, and the Emacs default should cater to the 
more-typical case today.

To narrow things down a bit I briefly looked for .jp websites that talk about 
Emacs.  Google reported the following first page's worth of hits (I list year of 
composition, encoding, and URL).  Again, the new stuff is mostly UTF-8, and the 
old stuff is a mishmash, so it's another data point suggesting that defaulting 
to UTF-8 would not be such a bad thing for editing today's text.

2002 Shift-JIS   http://www.rsch.tuis.ac.jp/~ohmi/literacy/emacs/quick.html
2008 ISO-2022-JP http://www.wakayama-u.ac.jp/~takehiko/webprg/03.html
2015 EUC-JP      http://d.hatena.ne.jp/tarao/20150221/1424518030
2015 UTF-8       http://uguisu.skr.jp/Windows/emacs.html
2015 UTF-8       http://www.amazon.co.jp/Emacs%E5%AE%9F%E8%B7%B5%E5%85%A5%E9%96%80-%EF%BD%9E%E6%80%9D%E8%80%83%E3%82%92%E7%9B%B4%E6%84%9F%E7%9A%84%E3%81%AB%E3%82%B3%E3%83%BC%E3%83%89%E5%8C%96%E3%81%97%E3%80%81%E9%96%8B%E7%99%BA%E3%82%92%E5%8A%A0%E9%80%9F%E3%81%99%E3%82%8B-WEB-DB-PRESS-plus/dp/4774150029
2015 UTF-8       http://www.sigasi.jp/better-emacs-vhdl-mode
2006 Shift-JIS   http://www.math.kobe-u.ac.jp/icms2006/icms2006-video/slides/grayson/share/doc/Macaulay2/Macaulay2/html/_teaching_spemacs_sphow_spto_spfind_sp__M2.html
2015 UTF-8       https://osdn.jp/projects/gnupack/



