From: Kenichi Handa
Newsgroups: gmane.emacs.devel,gmane.emacs.pretest.bugs
Subject: Re: File names with accented Latin characters are not displayed correctly
Date: Sat, 19 Nov 2005 10:51:10 +0900
To: Peter Dyballa
Cc: emacs-pretest-bug@gnu.org, emacs-devel@gnu.org
In-reply-to: <0ed07c7b9fe05cb0334850bba636bd40@Web.DE> (message from Peter Dyballa on Thu, 17 Nov 2005 23:54:39 +0100)
References: <4a688eb6976fe19f639d8ae0fec0126d@Web.DE> <0ed07c7b9fe05cb0334850bba636bd40@Web.DE>
Content-Type: text/plain; charset=UTF-8

In article <0ed07c7b9fe05cb0334850bba636bd40@Web.DE>, Peter Dyballa writes:

>> When I use 'ls -lw' to display the file names in xterm, I get:
>>
>> -rw-r--r-- 1 pete pete 62 25 Mär 2005 áÛïǓà.txt
>> -rw-r--r-- 1 pete pete 62 25 Mär 2005 äÖüÄöÜ.txt
>> -rw-r--r-- 1 pete pete 107 2 Dez 2004 äöüßÜÖÄ€
>>
>> Doing the same in Emacs' *shell* buffer I get:
>>
>> -rw-r--r-- 1 pete pete 62 25 Mär 2005 áÛïǓà.txt
>> -rw-r--r-- 1 pete pete 62 25 Mär 2005 äÖüÄöÜ.txt
>> -rw-r--r-- 1 pete pete 107 2 Dez 2004 äöüßÜÖÄ€

You are using Unicode Emacs on Mac Darwin.
I heard that all file names on that system are treated in a
decomposed form (normalization form NFD or NFKD, see UAX #15 of
Unicode).  But currently Emacs doesn't have converters between the
normalization forms.  So a-umlaut in a file name is actually the
two-character sequence "a" and "umlaut" (U+0308) on that system, but
when you type that character in Emacs, Emacs produces U+00E4.

[...]

> OK, now an explanation is given: no font. The question is: do I need to
> supply a font? If so: how? Hitting C-h v on that
> `reference-point-alist' gives a reference to a variable (I think: too
> big to cite it here) defined in `composite'. There I found a reference
> to the function toggle-auto-composition. When I apply this function to
> the *Buffers List* I can see that it "changes" one file name: obviously
> one which is the exact copy of the entry in dired-mode!

You said that Emacs used to show those characters in an ugly but
correct way, and stopped displaying them that way recently.  I think
that is because some font/fontset related code was changed for Mac
recently, and I have no idea why Emacs can't find a correct font on
Mac now.  Could someone who is working on the Mac port help him?

> And I now recognised too that when I open a file with the ä in the
> name, it appears in mode-line correct. In the pop-up buffers menu I see
> its name printed in normal UTF-8 representation, i.e. C3 A4 = Ã¤. OK, I
> can guess the right name. When I open such a file from dired-mode by
> pressing the mouse, the ä is represented by a hollow box in the
> mode-line. This hollow box is "translated" in pop-up buffers menu to
> "Ì▢." OK, I am cheating a bit: when I open the file with C-x f or
> change any name to a name with ä, then the name is correct in
> mode-line. In *Buffer List* this name is displayed correctly too.
> The other file name, which I open 'with the mouse,' has the de-composed ä
> glyph which is described by C-u C-x = as the ä in dired mode. And in
> this file name I can toggle the representation between "a" and "a▢" --
> but no change in pop-up buffers menu!

All this confusion is caused by the normalization form used on your
system.  Emacs and the system don't agree on the file name encoding.

Mr. Kawabata is now working on implementing converters between all
normalization forms.  He has already finished writing the code, sent
the assignment papers to the FSF, and is now waiting for a reply.  As
soon as his contribution is accepted, I'll install it.  Then I think
your problem will be solved.

---
Kenichi Handa
handa@m17n.org
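
The mismatch between the composed and decomposed forms discussed above
can be reproduced outside Emacs.  The following is only a minimal
sketch in Python, using the standard unicodedata module; it is not the
converter mentioned in this thread, and the helper name same_name is
hypothetical, introduced here purely for illustration.

```python
import unicodedata

# Composed form: the single code point U+00E4, as produced when the
# character is typed directly.
composed = "\u00e4"

# Decomposed form: "a" followed by COMBINING DIAERESIS (U+0308), as a
# file system that stores names in decomposed form would return it.
decomposed = "a\u0308"

# The two strings render alike but are different code point sequences.
differ_raw = composed != decomposed

# Converting between the forms makes them comparable.
nfc = unicodedata.normalize("NFC", decomposed)  # compose
nfd = unicodedata.normalize("NFD", composed)    # decompose


def same_name(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(differ_raw)                       # raw strings differ
    print([hex(ord(c)) for c in nfd])       # code points after NFD
    print(same_name(composed, decomposed))  # equal once normalized
```

Comparing names only after bringing both sides to one form (NFC here,
but NFD would work equally well) is the usual way to avoid this kind
of disagreement; which form a given program should prefer is a design
choice this sketch does not settle.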