From: "Jan D."
Newsgroups: gmane.emacs.devel
Subject: Re: Strange behaviour with dired and UTF8
Date: Fri, 2 May 2003 10:16:52 +0200
Message-ID: <6DDE98F0-7C76-11D7-8080-00039363E640@swipnet.se>
References: <200305010652.PAA14919@etlken.m17n.org>
In-Reply-To: <200305010652.PAA14919@etlken.m17n.org>
To: Kenichi Handa
Cc: emacs-devel@gnu.org

> In article <200304241235.h3OCZdbL023178@stubby.bodenonline.com>, "Jan
> D." writes:
>> Maybe I am doing this wrong, but here is what I try to do.
>> My language environment is ISO-8859-1.
>> I have a directory that contains files with file names in UTF-8.
>> I start dired on that directory. I want to see the UTF-8 characters
>> so I do C-x RET r utf-8. File names display OK now.
>
>> But when trying to operate on a file, say opening it, I get
>> "File no longer exists; type `g' to update Dired buffer"
>> It seems that dired does not keep the original file name around, but
>> tries to open with the display name representation of the file name.
>
>> When I type g, I lose the UTF-8 coding and files are now displayed
>> as ISO-8859-1 again. Setting buffer coding to UTF-8 does not help.
>
>> Do I have to set file-name-coding-system to UTF-8? This solves the
>> problem, but my file-name-coding-system is really ISO-8859-1, it is
>> just this one directory that is UTF-8.
>
> The current Emacs doesn't have a facility to cope with such
> a situation well.
>
> How about this?
>
> (1) Make a customizable variable
>     file-name-coding-system-alist; the format is the same as
>     file-coding-system-alist.
>
> (2) Make the macro ENCODE_FILE and DECODE_FILE to check that
>     variable before using file-name-coding-system and
>     default-file-name-coding-system.
>
> (3) Enhance the function dired-revert to update
>     file-name-coding-system-alist automatically if it is
>     called with coding-system-for-read being bound to
>     non-nil. In that case, it may also have to ask a user
>     to save that modification for the future session (via
>     customize).
>
> What do people think? Isn't there a better idea?

This sounds very complicated.

As I understand it, dired first gets the file name from ls (the
original representation), then converts it to whatever encoding it
uses to show it in the buffer (the view representation). When dired
operates on the file (opening it, for example), it converts back from
the view representation, hoping to recover the original
representation. But this may fail, since the conversion from view
back to original is not one-to-one.

This round trip (original representation -> view representation ->
original representation) should not be needed, IMHO. Why not just
keep the original representation around (as some kind of text property
on the file name?) and always use that when operating on the file?
That change would be transparent to users.

I do not know how dired works, but I think separating the original
representation from the view representation would make it easier for
dired to use any encoding to view the files.

	Jan D.
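
P.S. To make the round-trip problem concrete, here is a minimal sketch in Python (not Emacs Lisp, and not how dired is actually implemented). It assumes the failure comes from decoding the name with one coding system for display and re-encoding it with another for file operations. All names in it (`Entry`, `view_name`, `round_trip`, `make_entry`, `name_for_file_operation`) are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One listing line: the untouched on-disk bytes plus a display string."""
    raw: bytes   # original representation, never re-derived from the view
    view: str    # view representation, used only for display


def view_name(raw: bytes, display_coding: str) -> str:
    """Decode the on-disk bytes for display only."""
    return raw.decode(display_coding, errors="replace")


def round_trip(raw: bytes, display_coding: str, file_name_coding: str) -> bytes:
    """Assumed failing path: re-encode the displayed name to reach the file."""
    shown = view_name(raw, display_coding)
    return shown.encode(file_name_coding, errors="replace")


def make_entry(raw: bytes, display_coding: str) -> Entry:
    """Proposed path: keep the original bytes next to the display string."""
    return Entry(raw=raw, view=view_name(raw, display_coding))


def name_for_file_operation(entry: Entry) -> bytes:
    """File operations use the stored bytes, whatever the display coding is."""
    return entry.raw


# A UTF-8 file name in an otherwise ISO-8859-1 environment.
raw = "\u00e9.txt".encode("utf-8")                      # b'\xc3\xa9.txt'

# Displayed as UTF-8, re-encoded as ISO-8859-1: a different byte string.
lossy = round_trip(raw, "utf-8", "iso-8859-1")          # b'\xe9.txt'

# With the original kept alongside, the display coding no longer matters.
entry = make_entry(raw, "utf-8")
exact = name_for_file_operation(entry)
```

In this sketch `lossy` differs from `raw`, so a lookup by that name would miss the file, while `exact` is always the byte string that came from the listing. In dired the stored original could live in a text property on the displayed name, as suggested above.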