From: Peter Dyballa
Newsgroups: gmane.emacs.help
Subject: Re: UTF-8 in path / filename
Date: Fri, 25 Aug 2006 14:08:11 +0200
To: Grégory SCHMITT
Cc: help-gnu-emacs@gnu.org

On 24.08.2006 at 15:59, Grégory SCHMITT wrote:

> Hi everyone,
>
> I'm running emacs 21.4.1 using Linux (Fedora Core 5). When I try to open a
> file and the path name contains UTF-8 letters, emacs won't be able to find
> the file.
>
> I create a folder called "Grégory". I put any file in it (let's call it
> "test") and if I, from a simple xterm, try to do "emacs Grégory/test",
> emacs won't be able to open the file. However, it will be successful if I
> manually visit using C-x C-f.
>
> If I use any other editor (such as mcedit), it will open OK.
>
> Any explanation ?
>

Yes: your terminal emulation/shell swallows/hides information.

On Mac OS X in Apple's Terminal (TERM is xterm-color) I can see UTF-8 filenames, for example äöüßÜÖÄ€. File name expansion/completion does *not* work on them (although RGB äöüæÆÜÖÄ.txt gets expanded to RGB a?^?o?^?u?^?æ?^?U?^?O?^?A?^?.txt). And of course it does not work to invoke GNU Emacs with this file name as argument (nor the 'built-in' vi or nano).
It *works* though when I do that from the *shell* buffer in Unicode Emacs 23.0.0 or GNU Emacs 22.0.50 ... (although without file name completion, and with the latter showing the ¨ as empty boxes in the file name). If I, for example, paste a name with UTF-8 contents from ls output to pass it to vi (it gives the best complaints), I can see that the decomposed UTF-8 characters are strangely interpreted. An ä seems to vanish and become a kind of control character; the ¨ component of A¨, i.e. Ä, is passed as [...] or such ...

Since in your case mcedit accepts the file name, mcedit and your terminal seem to use the same character encoding, so for both é *is* an é. GNU Emacs lives in its own world of almost unlimited character encodings. One way to make Emacs work correctly is to set environment variables like LC_ALL, LANG, or LC_CTYPE, which obviously just repeat what your shell and your OS' standard utilities know. Next is *not* to set current-language-environment! From LC_CTYPE etc. Emacs learns what encodings to set for buffer contents, file names, process data. If it makes mistakes in this, you might consider using

    (prefer-coding-system 'iso-latin-9-unix) ; the one with €

or a few such statements with different codings each. GNU Emacs will then try to apply these encodings first. Since you're working with a non-Unicode Emacs you might need to set

    (unify-8859-on-decoding-mode t)
    (unify-8859-on-encoding-mode t)

to make the 8 bit ISO Latin encodings be handled as quite the same, i.e. é would be the same character in every one of these encodings in which it exists, so you could search for it in all buffers after telling isearch only once to look for é.

One important thing is that *you* may already have messed up your .emacs file. Try to launch Emacs also with --no-init-file and/or --no-site-file, and also with -nw, i.e. running inside the terminal without X windows.
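To make the encoding mismatch above more concrete, here is a small Python sketch (not Emacs code, just an illustration) of how the one visible name "Grégory" turns into different byte sequences depending on the encoding and on whether the é is precomposed or decomposed. The variable names are made up for this example. Which form your terminal, shell and file system actually use is something you would have to check on your own machine.

```python
import unicodedata

# The folder name from the question, written with an escape so that the
# example does not depend on the encoding of this message.
name = "Gr\u00e9gory"

# Precomposed (NFC) and decomposed (NFD) forms of the same visible name.
nfc = unicodedata.normalize("NFC", name)
nfd = unicodedata.normalize("NFD", name)

# Three byte sequences a program might receive for "the same" file name.
latin1_bytes = nfc.encode("iso-8859-1")   # 8 bit ISO Latin
utf8_nfc_bytes = nfc.encode("utf-8")      # UTF-8, precomposed e-acute
utf8_nfd_bytes = nfd.encode("utf-8")      # UTF-8, e followed by combining acute

# What a program shows if it decodes UTF-8 bytes as ISO Latin 1.
misread = utf8_nfc_bytes.decode("iso-8859-1")

for label, data in [
    ("latin-1", latin1_bytes),
    ("utf-8 NFC", utf8_nfc_bytes),
    ("utf-8 NFD", utf8_nfd_bytes),
]:
    print(f"{label:10} {len(data)} bytes  {data!r}")
print("misread   ", misread)
```

If the program that receives the file name and the program that produced it disagree on any of these three forms, the name no longer matches what is stored on disk, which would be consistent with the "file not found" behaviour described above.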
--
Greetings

Pete

The most exciting phrase to hear in science, the one that heralds new discoveries, is not "Eureka!" (I found it!) but "That's funny..."
        Isaac Asimov