From: Paul Eggert
Newsgroups: gmane.emacs.bugs
Subject: bug#13936: Default to UTF-8 for most Emacs source files
Date: Wed, 20 Mar 2013 09:43:38 -0700
Message-ID: <5149E73A.60304@cs.ucla.edu>
References: <871ubaa3i5.fsf@gnu.org>
In-Reply-To: <871ubaa3i5.fsf@gnu.org>
To: Kenichi Handa
Cc: 13936@debbugs.gnu.org

On 03/20/13 01:18, Kenichi Handa wrote:

> Among CJK files, I think K(orean) files can be in UTF-8
> without problem.

It's easy enough to convert the K files to UTF-8 too, and I'll
propose a patch to do that in followup email.

> Are there any people familiar with Korean situation?

Sorry, I don't know.  For what it's worth, when I use Emacs
to convert TUTORIAL.ko to UTF-8 and back, the result is identical
to the original, so no information is lost by making that change.
(This is not true for TUTORIAL.ja.)

I have another question.  Shouldn't it be OK to convert Elisp
source files such as leim/quail/japanese.el to UTF-8 as well?
Emacs internally converts their text to UTF-8 while compiling
them, so the corresponding .elc files are in UTF-8 already, and
there should be no functional difference if we convert the .el
files to UTF-8.

Converting these files to UTF-8 would fix an inconsistency in
Emacs behavior.  For example, if I visit the file
leim/quail/japanese.el I see this definition:

  (defvar quail-japanese-use-double-n nil
    "If non-nil, use type \"nn\" to insert ん.")

where the character 'ん' is displayed using code point 0x2473
in charset japanese-jisx0208.  But if I *use* the above definition
string, by typing "C-h v quail-japanese-use-double-n RET", the help
string that I see has been translated to UTF-8, so Emacs displays
that character using code point 0x3093 in charset unicode instead.
It would be better if the runtime behavior matched the source code,
and an easy way to do that would be to convert the source code to
UTF-8.
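
To make the two code points concrete, here is a small sketch in
Python, chosen only so it can be run outside Emacs.  It assumes that
Python's euc_jp codec is an acceptable stand-in for the
japanese-jisx0208 charset; it does not reproduce what Emacs does
internally.

```python
# Illustration only: Python codecs stand in for Emacs charsets here.
ch = "\u3093"  # HIRAGANA LETTER N, the character in the docstring

# Code point of the character in Unicode.
unicode_cp = ord(ch)

# EUC-JP encodes a JIS X 0208 character as its two code bytes with the
# high bit set, so masking with 0x7F recovers the JIS X 0208 code point.
euc = ch.encode("euc_jp")
jisx0208_cp = ((euc[0] & 0x7F) << 8) | (euc[1] & 0x7F)

# The same character as it would be stored in a UTF-8 source file.
utf8_bytes = ch.encode("utf-8")

print(f"japanese-jisx0208: {jisx0208_cp:#06x}")
print(f"unicode:           {unicode_cp:#06x}")
print(f"UTF-8 bytes:       {utf8_bytes.hex(' ')}")
```

The point of the sketch is only that one character has different
code points depending on the charset used to represent it, which is
the mismatch between the visited .el file and the help string.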

Here is the list of the remaining .el files that I'd like to convert
to UTF-8:

  leim/quail/cyril-jis.el
  leim/quail/hanja-jis.el
  leim/quail/japanese.el
  leim/quail/py-punct.el
  leim/quail/pypunct-b5.el
  lisp/international/ja-dic-cnv.el
  lisp/international/ja-dic-utl.el
  lisp/international/kinsoku.el
  lisp/international/kkc.el
  lisp/international/titdic-cnv.el
  lisp/language/japan-util.el
  lisp/language/japanese.el
  lisp/term/x-win.el

x-win.el is a special case, since it has two "Kana: Fixme:" lines
talking about problems when converting to UTF-8 -- evidently these
are issues in our current setup anyway since Emacs converts the text
to UTF-8 before compiling it.
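
Before converting the files above, the kind of round-trip check
described for TUTORIAL.ko could be scripted.  The following is a
rough sketch in Python, not the procedure actually used (that was
done inside Emacs).  The helper name round_trips and the sample
bytes are hypothetical, and Python's codecs are not guaranteed to
agree with Emacs's coding systems in every detail.

```python
def round_trips(data: bytes, encoding: str) -> bool:
    """Return True if DATA survives legacy -> UTF-8 -> legacy unchanged."""
    try:
        text = data.decode(encoding)                   # legacy bytes -> text
        utf8 = text.encode("utf-8")                    # text -> UTF-8 bytes
        back = utf8.decode("utf-8").encode(encoding)   # and back again
    except UnicodeError:
        # Bytes that cannot be decoded or re-encoded do not round-trip.
        return False
    return back == data


# Made-up sample data, not taken from any Emacs file.
sample = "한글".encode("euc_kr")
print(round_trips(sample, "euc_kr"))
print(round_trips(b"\xff\xff", "euc_kr"))  # not valid EUC-KR
```

A file for which such a check fails would need a closer look before
conversion, since the result would no longer be byte-identical to
the original after converting back.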