From: Oliver Scholz
To: Kenichi Handa
Cc: emacs-devel@gnu.org, rms@gnu.org, alkibiades@gmx.de
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] Unicode Lisp reader escapes
Date: Sun, 07 May 2006 23:26:02 +0200
Message-ID: <87irohfrx1.fsf@gmx.de>
References: <17491.34779.959316.484740@parhasard.net> <87odyfnqcj.fsf-monnier+emacs@gnu.org> <17498.27200.911709.330947@parhasard.net> <877j4z5had.fsf@gmx.de>
In-Reply-To: (Kenichi Handa's message of "Sun, 07 May 2006 21:38:17 +0900")

Kenichi Handa writes:

> In article , Richard Stallman writes:
[...]
> When you byte-compile a x.el file, x.el file is at first
> decoded.  How x.el file is decoded depends on many thing,
> and thus, of course, the resulting x.elc files become
> different.

Yes, that's what I meant.

> If you say that is a bug, I think there's no way to fix it.
>
> The very simple testcase is this:
>
> (progn
>   (let ((str "(setq x \"\300\300\")\n")
>         (coding-system-for-write 'no-conversion))
>     (write-region str nil "~/test1.el")
>     (write-region str nil "~/test2.el"))
>   (set-language-environment "Latin-1")
>   (byte-compile-file "~/test1.el")
>   (set-language-environment "Japanese")
>   (byte-compile-file "~/test2.el"))

That's not exactly what I meant. This happens basically because Emacs
has no indication on how to decode that file properly.
Here's a test case for what I had in mind:

(let ((str1 (format "\
;; -*- coding: utf-8 -*-
\(defvar my-string \"The Greek letter alpha: %c\")"
                    (decode-char 'ucs #x3B1)))
      (str2 (format "\
;; -*- coding: iso-8859-7 -*-
\(defvar my-string \"The Greek letter alpha: %c\")"
                    (decode-char 'ucs #x3B1))))

  (let ((coding-system-for-write 'utf-8))
    (write-region str1 nil "~/fragment-test-1.el")
    (write-region str1 nil "~/fragment-test-2.el"))

  (let ((coding-system-for-write 'iso-8859-7))
    (write-region str2 nil "~/unify-test-1.el")
    (write-region str2 nil "~/unify-test-2.el"))

  (unify-8859-on-decoding-mode -1)
  (byte-compile-file "~/unify-test-1.el") ; ch. 2913 from
                                          ; greek-iso8859-7

  (unify-8859-on-decoding-mode 1)
  (byte-compile-file "~/unify-test-2.el") ; ch. 332721 from
                                          ; mule-unicode-0100-24ff

  ;; Assuming `utf-fragment-on-decoding' is nil.
  (byte-compile-file "~/fragment-test-1.el") ; ch. 332721 from
                                             ; mule-unicode-0100-24ff

  ;; AFAICS there is no way to change the settings associated with
  ;; `utf-fragment-on-decoding' programmatically. However, the
  ;; following (taken from the variable's `defcustom' declaration)
  ;; should have the same effect as customizing it.
  (progn
    (define-translation-table 'utf-translation-table-for-decode
      utf-fragmentation-table)
    (unless (eq (get 'utf-translation-table-for-encode
                     'translation-table)
                ucs-mule-to-mule-unicode)
      (define-translation-table 'utf-translation-table-for-encode
        utf-defragmentation-table)))

  (byte-compile-file "~/fragment-test-2.el") ; ch. 2913 from
                                             ; greek-iso8859-7
  )

As Richard wrote, the fix would be to change the settings to their
default, unless the files set a specific variable. But given the work
this would require and given that the value of changing the defaults
is IMO somewhat dubious, you could as well just document it in
etc/PROBLEMS.

    Oliver
-- 
Oliver Scholz               18 Floréal an 214 de la Révolution
Ostendstr. 61               Liberté, Egalité, Fraternité!
60314 Frankfurt a. M.