From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: ISO-8859-1 encoded file names and UTF-8
Date: Thu, 20 Mar 2003 08:52:24 +0900 (JST)
Message-ID: <200303192352.IAA00475@etlken.m17n.org>
Cc: emacs-devel@gnu.org
Original-To: keichwa@gmx.net
In-reply-to: (message from Karl Eichwalder on Wed, 19 Mar 2003 17:15:42 +0100)
User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.2.92 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)
List-Id: Emacs development discussions.

Karl Eichwalder writes:

> I think there is still a subtle bug left; in an ISO-8859-1 locale do:

> touch "Maler Müller"

> Then call emacs:

> LANG=de_DE.UTF-8 emacs -q --no-site --no-splash .

> In dired you can see:

> -rw-r--r-- 1 ke users 0 2003-03-19 16:10 Maler M\374ller\374rle
> good part                                ^^^^^^^^^^^^^^^|||||||
> trailing garbage                         ------------>>>^^^^^^^

Ah!  That's a bug in the utf-8 decoder.  I've just installed
the attached fix.

>> Should recoding a file name be regarded as a kind of file name
>> change?  If so, perhaps we should make the function rename-file
>> handle recoding as well.  In that case, how should we tell
>> rename-file to actually recode the file name?

> If the user calls rename-file it should be up to him to specify a proper
> file name.  In other words I vote to provide a separate function like
> convert-file-name to do the right thing; by default convert-file-name
> should try to convert the file name to the user's locale.

As we already have the function convert-standard-filename, I
think the name convert-file-name is confusing.
So, I prefer the name recode-file-name if we are to have a
separate function.

---
Ken'ichi HANDA
handa@m17n.org

*** utf-8.el.~1.26.~	Tue Mar 18 09:09:15 2003
--- utf-8.el	Thu Mar 20 08:22:42 2003
***************
*** 479,497 ****
                              (write-multibyte-character r5 r3))
                          (write-multibyte-character r6 r3))
                      (if (r0 >= #xf8) ; 5- or 6-byte encoding
!                         ((read r1)
!                          (if (r1 < #xa0)
!                              (if (r1 < #x80) ; invalid byte
!                                  (write r1)
!                                (write-multibyte-character r5 r1))
!                            (write-multibyte-character r6 r1))
                           (if (r0 >= #xfc) ; 6-byte
!                              ((read r1)
!                               (if (r1 < #xa0)
!                                   (if (r1 < #x80) ; invalid byte
!                                       (write r1)
!                                     (write-multibyte-character r5 r1))
!                                 (write-multibyte-character r6 r1)))))))
                  ;; else invalid byte >= #xfe
                  (write-multibyte-character r6 r0))))))
      (repeat)))
--- 479,499 ----
                              (write-multibyte-character r5 r3))
                          (write-multibyte-character r6 r3))
                      (if (r0 >= #xf8) ; 5- or 6-byte encoding
!                         ((r0 = -1)
!                          (read r0)
!                          (if (r0 < #xa0)
!                              (if (r0 < #x80) ; invalid byte
!                                  (write r0)
!                                (write-multibyte-character r5 r0))
!                            (write-multibyte-character r6 r0))
                           (if (r0 >= #xfc) ; 6-byte
!                              ((r0 = -1)
!                               (read r0)
!                               (if (r0 < #xa0)
!                                   (if (r0 < #x80) ; invalid byte
!                                       (write r0)
!                                     (write-multibyte-character r5 r0))
!                                 (write-multibyte-character r6 r0)))))))
                  ;; else invalid byte >= #xfe
                  (write-multibyte-character r6 r0))))))
      (repeat)))
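
For anyone who wants to check the expected dired display outside Emacs, here is a minimal Python sketch. It is not part of the patch and does not use the CCL decoder; the helper name show_as_utf8 is hypothetical, and it relies on Python's own surrogateescape error handler. It assumes the file name was created as ISO-8859-1 bytes, so the ü is the single byte 0xFC (octal 374), which is a value the `(r0 >= #xfc)` branch in the patch tests for.

```python
def show_as_utf8(raw: bytes) -> str:
    """Decode raw as UTF-8, rendering each undecodable byte as an octal escape.

    Hypothetical helper for illustration only; it mimics the way the
    listing in this thread shows the stray byte as \\374.
    """
    # surrogateescape maps every undecodable byte b to the code point U+DC00 + b
    text = raw.decode("utf-8", errors="surrogateescape")
    pieces = []
    for ch in text:
        code = ord(ch)
        if 0xDC80 <= code <= 0xDCFF:
            pieces.append("\\%03o" % (code - 0xDC00))  # e.g. 0xFC -> \374
        else:
            pieces.append(ch)
    return "".join(pieces)


# "Maler M\u00fcller" is "Maler Müller"; encode it the way an
# ISO-8859-1 locale would store the file name on disk.
latin1_name = "Maler M\u00fcller".encode("iso-8859-1")
print(show_as_utf8(latin1_name))  # prints: Maler M\374ller
```

The point of the sketch is only that the invalid byte should surface once, followed by the remaining bytes of the name unchanged, with no trailing garbage.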