From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Albinus
Newsgroups: gmane.emacs.devel
Subject: Re: Multibyte and unibyte file names
Date: Wed, 23 Jan 2013 20:42:55 +0100
Message-ID: <87mwvz1y9s.fsf@gmx.de>
References: <83ehhbn680.fsf@gnu.org>
In-Reply-To: <83ehhbn680.fsf@gnu.org> (Eli Zaretskii's message of "Wed, 23 Jan 2013 19:45:35 +0200")
Cc: Kazuhiro Ito, emacs-devel@gnu.org
To: Eli Zaretskii
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)

Eli Zaretskii writes:

> 2) This gets worse with remote file names.  For these, the handlers
>    are always called first, and the result is never run through
>    dostounix_filename.  However, Tramp sometimes turns around and
>    calls the "real" handler on parts of the remote file name,
>    evidently expecting that "real" handler not to do any harm.  But
>    due to the above, it does do harm.  While it might be justified to
>    limit native file name support to file names encodable with the
>    current file-name-coding-system, it _cannot_ be justified for
>    remote file names.  An example of this is file-name-directory:
>
>    (defun tramp-handle-file-name-directory (file)
>      "Like `file-name-directory' but aware of Tramp files."
>      ;; Everything except the last filename thing is the directory.  We
>      ;; cannot apply `with-parsed-tramp-file-name', because this expands
>      ;; the remote file name parts.  This is a problem when we are in
>      ;; file name completion.
>      (let ((v (tramp-dissect-file-name file t)))
>        ;; Run the command on the localname portion only.
>        (tramp-make-tramp-file-name
>         (tramp-file-name-method v)
>         (tramp-file-name-user v)
>         (tramp-file-name-host v)
>         (tramp-run-real-handler
>          'file-name-directory (list (or (tramp-file-name-localname v) ""))))))
>
>    which on Windows means that, e.g.
>
>      (let ((file-name-coding-system 'cp1252))
>        (file-name-directory "/eliz@fencepost.gnu.org:漢字/"))
>
>       => "/eliz@fencepost.gnu.org: /"
>
>    And there are other similar handlers in Tramp (e.g., the
>    file-name-nondirectory handler) which do the same.  IOW, they seem
>    to _assume_ that the corresponding "real" handler never needs to
>    encode the file name.  A false assumption.

Tramp is not prepared to handle encoded file names.  One of the first
actions on the remote side is to set "LC_ALL=C" in the environment.
Android devices are an exception; they require UTF-8.

I agree, Tramp should check carefully what encoding a file name uses.
This must be added to the code.

There might be a chance to switch to en_US.UTF-8 on the remote side.
But even here I would propose to start with the unibyte subset.
"en_US" because Tramp parses the output of commands, which must not be
localized.

Encodings other than UTF-8 will be hard to support.  Not only does
Tramp call "native" file name primitives; there are also several
parsing routines for commands on the remote side, which have their own
expectations about file name syntax and encoding.

> TIA

Best regards, Michael.
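
For illustration, a minimal sketch of what a check for the "unibyte
subset" could look like in Emacs Lisp.  It assumes that "unibyte
subset" can be read as "ASCII characters only", which is an
interpretation rather than something stated above.  The function name
`my-localname-ascii-only-p' is hypothetical; it is not part of Tramp.

```elisp
;; Hypothetical helper, NOT part of Tramp.  Assumption: a localname is
;; safe under LC_ALL=C when it contains ASCII characters only.
(defun my-localname-ascii-only-p (localname)
  "Return t if LOCALNAME consists of ASCII characters only, else nil."
  ;; `string-match-p' returns the position of the first non-ASCII
  ;; character, or nil if there is none; `not' turns that into t/nil.
  (not (string-match-p "[^[:ascii:]]" localname)))

(my-localname-ascii-only-p "/tmp/foo.txt")  ; => t
(my-localname-ascii-only-p "/tmp/漢字/")    ; => nil
```

A handler could use such a predicate to decide whether passing the
localname to the "real" handler is harmless, and fall back to some
other code path otherwise.  What that other path should be is left
open here.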