From: Paul Eggert
Newsgroups: gmane.emacs.devel
Subject: Re: [Emacs-diffs] master db828f6: Don't rely on defaults in decoding UTF-8 encoded Lisp files
Date: Sat, 26 Sep 2015 21:44:33 -0700
Organization: UCLA Computer Science Department
Message-ID: <56077431.7010906@cs.ucla.edu>
To: stephen@xemacs.org, emacs-devel@gnu.org

stephen@xemacs.org wrote:
> > This is partly due to UTF-8 being the encoding of
> > choice for HTML and XML, where UTF-8 overtook the older 8-bit
> > encodings in 2008 and now is by far the dominant encoding.
>
> On the commercial internet, yes, but not for government and academic
> sites in Japan and China.

I think your information is out of date. Yes, ten years ago there was a lot of non-UTF-8 out there, but nowadays those sites have largely moved on to UTF-8. For fun I just now visited a few of the top government and academic websites in Japan:

http://www.japan.go.jp/
http://www.mofa.go.jp/
http://nettv.gov-online.go.jp/
http://www.e-kokusei.go.jp/
https://www.env.go.jp/
http://www.u-tokyo.ac.jp/
http://www.kyoto-u.ac.jp/
http://www.osaka-u.ac.jp/
http://www.keio.ac.jp/

I configured my browser to say that I preferred Japanese text. All of these web sites gave me UTF-8. Feel free to canvass China, but I daresay you'll find the same.

Of course one can still find a few web sites using other encodings, but like it or not, UTF-8 dominates now.
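
For anyone who wants to repeat this sort of informal check without a browser, here is a rough Python sketch. It is not how the survey above was done (that was by hand), and the function names are made up for illustration. It assumes the site declares its encoding either in the Content-Type response header or in a <meta> tag near the top of the page, and it reports what the site declares, not what the bytes actually are.

```python
import re
import urllib.request
from typing import Optional

# Hypothetical helpers for illustration; none of these names come from the thread.
_CHARSET_IN_HEADER = re.compile(r'charset\s*=\s*"?([^\s;"]+)', re.IGNORECASE)
_CHARSET_IN_META = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def charset_from_header(content_type: Optional[str]) -> Optional[str]:
    """Return the charset parameter of a Content-Type header, lowercased."""
    if not content_type:
        return None
    match = _CHARSET_IN_HEADER.search(content_type)
    return match.group(1).lower() if match else None


def charset_from_meta(body: bytes) -> Optional[str]:
    """Return the charset declared in an HTML <meta> tag, lowercased."""
    # Only the start of the page is searched; declarations are expected there.
    match = _CHARSET_IN_META.search(body[:4096])
    return match.group(1).decode("ascii").lower() if match else None


def declared_charset(content_type: Optional[str], body: bytes) -> Optional[str]:
    """Prefer the HTTP header; fall back to the <meta> declaration."""
    return charset_from_header(content_type) or charset_from_meta(body)


def fetch_declared_charset(url: str, timeout: float = 10.0) -> Optional[str]:
    """Fetch a page while asking for Japanese text and report its declared charset."""
    request = urllib.request.Request(url, headers={"Accept-Language": "ja"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        content_type = response.headers.get("Content-Type")
        return declared_charset(content_type, response.read(4096))
```

Calling fetch_declared_charset("http://www.u-tokyo.ac.jp/") would need network access, and the answer may well differ from what the site served in 2015.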