From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Albinus
Newsgroups: gmane.emacs.devel
Subject: Re: Multibyte and unibyte file names
Date: Wed, 23 Jan 2013 21:58:59 +0100
Message-ID: <87ham71ur0.fsf@gmx.de>
References: <83ehhbn680.fsf@gnu.org> <87mwvz1y9s.fsf@gmx.de> <83a9rzmzq1.fsf@gnu.org>
In-Reply-To: <83a9rzmzq1.fsf@gnu.org> (Eli Zaretskii's message of "Wed, 23 Jan 2013 22:05:58 +0200")
To: Eli Zaretskii
Cc: kzhr@d1.dion.ne.jp, emacs-devel@gnu.org
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)
Mime-Version: 1.0
Content-Type: text/plain

Eli Zaretskii writes:

> For example, in the particular case of file-name-directory, I think
> Tramp should simply do its job by a straightforward removal of the
> portion after the last slash in Lisp, instead of calling the native
> implementation.

This would duplicate code, which I try to avoid when possible.

>> I agree, Tramp shall check carefully what a file name encoding is. This
>> must be added to the code.
>
> Sorry, I don't follow. File names in Lisp are not encoded in any
> way. You only need to encode them when you pass them to commands
> executed on the remote host, and decode the results that are output by
> those remote commands.

Maybe there's a misunderstanding here. But you gave an example with a
file name with Japanese codings.

>> There might be a chance to switch to en_US.UTF-8 on the remote side. But
>> even here I would propose to start with the unibyte subset. "en_US",
>> because Tramp parses the output of commands, which must not be
>> localized.
>
> Why "must not be localized"?

Tramp does not understand German messages, for example. "de_DE.UTF-8"
would be a no-go. That's why Tramp sets the remote locale to English
messages. Currently it is "C"; it could be "en_US.UTF-8" in the future.
But I don't know whether all remote hosts are already prepared for
UTF-8.

>> Other encodings but UTF-8 will be hard to support. It is not only that
>> Tramp calls "native" file name primitives, there are also several
>> parsing routines for commands on the remote side, which have their
>> expectations on file name syntax and their encodings.
>
> I'm afraid I don't follow here, either. Emacs is well equipped to
> do code conversions from and to almost any encoding out there. The
> only problem is to know which encoding to use when communicating with
> the commands on the remote host. What am I missing?

Maybe one could teach Tramp to convert file names in whatever coding to
UTF-8. But shall we do it? And how would that work with other Emacs
flavors? Yes, I must keep XEmacs in mind.

Best regards, Michael.