From: "Stephen J. Turnbull"
Newsgroups: gmane.emacs.devel
Subject: Re: Emacs Lisp's future
Date: Tue, 14 Oct 2014 16:03:42 +0900
Message-ID: <8761fnnne9.fsf@uwakimon.sk.tsukuba.ac.jp>
In-Reply-To: <83h9z77p7d.fsf@gnu.org>
To: Eli Zaretskii
Cc: dak@gnu.org, emacs-devel@gnu.org

Eli Zaretskii writes:

 > That's not true: we try using UTF-8 wherever possible.  The few files
 > that don't use that simply cannot.

That doesn't seem to be true.  In fact many of the encodings
discovered by "grep -r -e '-\\*- coding:" are ISO 2022 conformant, and
a few indeed appear to be EUC encodings under an alias (eg,
chinese-iso-8bit-unix).  AFAICS, the only encodings listed that can't
be encoded in UTF-8 are the Big 5 family -- and that's only if you
demand bug-compatibility.[1]  So "simply cannot" evidently is your way
of saying "inconvenient".[2]

Note that because of multiple encodings, in the Emacs tree "grep -r"
is probably just a bug.  It's not that you can't read the foreign
languages in "wrong" encodings.
Rather, if your search key is in one of those languages, you'll *miss
occurrences* in the "wrong" encodings.  With your preferred default,
most users will live their whole lives without recognizing the bug.
With a strict default, they have a fighting chance of learning about
it.

Footnotes:

[1]  Big 5 contains a few duplicated characters (at different code
points), so *as text* those files can be represented in Unicode (no
text information is lost since the characters in question are
identical in all ways except Big 5 code point), although *as binary
files* they may not be roundtrippable to UTF-8 (it depends on which
code point is chosen for the duplicated character).

[2]  The inconvenience is pretty significant, here: you'd lose
diff'ability across the conversion boundary.  Thus only new files are
*required* to use UTF-8 (no diff discontinuity), and conversions of
existing files are presumably done only with great care, if at all.
Still, I would think the benefits of having these files be greppable
(and etags-able!) would outweigh that inconvenience in a very short
period of time (maybe a year?).  Except for documentation files, the
files that need these characters probably don't change much.
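
To make the "miss occurrences" point about "grep -r" concrete, here is a
minimal Python sketch.  It assumes the search key is typed in UTF-8 and
that one file is stored as UTF-8 while another holds the same text in
EUC-JP.  The sample text, the key, and the helper names grep_bytes and
grep_decoded are hypothetical illustrations, not anything taken from the
Emacs tree.

```python
# Illustrative only: sample text, key, and helper names are hypothetical.

def grep_bytes(key: str, data: bytes, key_encoding: str = "utf-8") -> bool:
    """Byte-wise search, roughly what grep does with a UTF-8 search key."""
    return key.encode(key_encoding) in data


def grep_decoded(key: str, data: bytes, file_encoding: str) -> bool:
    """Decode the file with its own encoding first, then search as text."""
    return key in data.decode(file_encoding)


text = "これは日本語のテキストです"
key = "日本語"

# The same text stored in two different encodings.
utf8_file = text.encode("utf-8")
eucjp_file = text.encode("euc_jp")

print(grep_bytes(key, utf8_file))               # True: the bytes match
print(grep_bytes(key, eucjp_file))              # False: same text, silently missed
print(grep_decoded(key, eucjp_file, "euc_jp"))  # True once the encoding is honored
```

The second call is the failure mode in question: nothing errors out, the
search just reports no match, so the user has no hint that an occurrence
was skipped.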