From: Paul Eggert <eggert@cs.ucla.edu>
To: stephen@xemacs.org
Cc: emacs-devel@gnu.org
Subject: Re: [Emacs-diffs] master db828f6: Don't rely on defaults in decoding UTF-8 encoded Lisp files
Date: Sun, 27 Sep 2015 01:34:39 -0700
Message-ID: <5607AA1F.4030508@cs.ucla.edu>
In-Reply-To: <E1Zg5KS-0005NI-Ul@turnbull.sk.tsukuba.ac.jp>
stephen@xemacs.org wrote:
> Perhaps most
> recently authored pages are UTF-8. But the data sets themselves are
> typically flat files, either CSV or plaintext. The explanatory pages,
> even if in HTML, often haven't been revised in decades.
Yes, that's pretty much my experience. In Japan older stuff is mostly
Shift-JIS, EUC, or maybe ISO-2022-JP. New stuff is mostly UTF-8. People using
old email software send old encodings because that's what they've been doing for
decades. Normally it works, because the email envelope tells you the encoding.
But sometimes people screw up and you get mojibake.
But this situation is not an argument for having the locale determine encoding
when visiting random imported files that lack envelopes. For such files, it
often doesn't work to set LC_ALL=ja_JP.ujis and expect Emacs to get things
right. (This is one of the things that Eli has noted multiple times, and he's right.)
Of course if one is working in a conservative Japanese government ministry that
standardized on Shift-JIS back in 1992 and hasn't changed since then, then
things are different, and Emacs should support such users. But typical Emacs
users are not in this situation, and the Emacs default should cater to the
more-typical case today.
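A minimal sketch of why one locale-derived codec tends to fail on a mix of imported files. It assumes a made-up sample string and Python's standard codecs; the helper name decode_with_locale_default is hypothetical, and this is not how Emacs itself detects encodings. The same text is written out in four encodings and then decoded as if the locale (here EUC-JP, as with LC_ALL=ja_JP.ujis) decided everything.

```python
# Hypothetical sample text; any Japanese string would do.
text = "日本語のテキスト"

# The same text as it might arrive in four different envelope-less files.
encoded = {
    name: text.encode(name)
    for name in ("shift_jis", "euc_jp", "iso2022_jp", "utf-8")
}


def decode_with_locale_default(data: bytes, locale_codec: str) -> str:
    """Decode bytes using a single codec for every file, as a locale default would."""
    # errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
    return data.decode(locale_codec, errors="replace")


for name, data in encoded.items():
    result = decode_with_locale_default(data, "euc_jp")
    status = "ok" if result == text else "mojibake"
    print(f"{name:12} -> {status}: {result!r}")
```

Only the file that happens to match the locale's codec round-trips; the other three come out garbled, which is the situation the envelope normally prevents for email.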
To narrow things down a bit I briefly looked for .jp websites that talk about
Emacs. Google reported the following first page's worth of hits (I list year of
composition, encoding, and URL). Again, the new stuff is mostly UTF-8, and the
old stuff is a mishmash, so it's another data point suggesting that defaulting
to UTF-8 would not be such a bad thing for editing today's text.
2002 Shift-JIS http://www.rsch.tuis.ac.jp/~ohmi/literacy/emacs/quick.html
2008 ISO-2022-JP http://www.wakayama-u.ac.jp/~takehiko/webprg/03.html
2015 EUC-JP http://d.hatena.ne.jp/tarao/20150221/1424518030
2015 UTF-8 http://uguisu.skr.jp/Windows/emacs.html
2015 UTF-8 http://www.amazon.co.jp/Emacs%E5%AE%9F%E8%B7%B5%E5%85%A5%E9%96%80-%EF%BD%9E%E6%80%9D%E8%80%83%E3%82%92%E7%9B%B4%E6%84%9F%E7%9A%84%E3%81%AB%E3%82%B3%E3%83%BC%E3%83%89%E5%8C%96%E3%81%97%E3%80%81%E9%96%8B%E7%99%BA%E3%82%92%E5%8A%A0%E9%80%9F%E3%81%99%E3%82%8B-WEB-DB-PRESS-plus/dp/4774150029
2015 UTF-8 http://www.sigasi.jp/better-emacs-vhdl-mode
2006 Shift-JIS http://www.math.kobe-u.ac.jp/icms2006/icms2006-video/slides/grayson/share/doc/Macaulay2/Macaulay2/html/_teaching_spemacs_sphow_spto_spfind_sp__M2.html
2015 UTF-8 https://osdn.jp/projects/gnupack/