From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Multibyte and unibyte file names
Date: Sun, 27 Jan 2013 20:55:16 -0500
References: <83ehhbn680.fsf@gnu.org> <83wqv2ldk1.fsf@gnu.org>
 <83obgel94c.fsf@gnu.org> <83k3r1lnlb.fsf@gnu.org>
 <83vcalj97s.fsf@gnu.org> <83r4l8jjtv.fsf@gnu.org>
 <83k3r0jd9r.fsf@gnu.org> <834ni3jefn.fsf@gnu.org>
To: Eli Zaretskii
Cc: kzhr@d1.dion.ne.jp, michael.albinus@gmx.de, emacs-devel@gnu.org
In-Reply-To: <834ni3jefn.fsf@gnu.org> (Eli Zaretskii's message of "Sun, 27 Jan 2013 09:03:08 +0200")

>> > OK, but as long as file-name primitives are required to support
>> > unibyte strings, you cannot be sure these situations won't pop up in
>> > the future.
>> I don't see a need to disallow unibyte strings, but I don't see the need
>> to be particularly careful about it either.  Basically Elisp code which
>> provides unibyte file names does it at its own risks.
> What about C code that calls these primitives?  Can we consider every
> such instance a bug in the caller?

Most likely, yes.

>> But that's exactly the behavior stipulated by POSIX (tho for '/' rather
>> than '\\').  I.e. if you use file names on a POSIX host with
>> a coding-system that occasionally uses '/' within its multibyte
>> sequences, you'll get those surprises regardless of Emacs.  And for that
>> reason, Emacs would be right to cut those file names in the middle of
>> a multibyte sequence.
> Then why did you regard this:
>   (let ((file-name-coding-system 'cp932))
>     (expand-file-name "表" "C:/"))
>   => "c:/\225/"
> as a bug?

Because expand-file-name works on Emacs strings, not on file-system strings.

>> And since Emacs is largely based on "POSIX semantics for the generic
>> code, plus an emulation layer in w32.c", we have a problem of subtly
>> incompatible semantics.
> Maybe so, but it certainly isn't the only place in Emacs with subtly
> incompatible semantics.  And anyway, I don't see how this observation
> helps to decide what, if anything, to do to fix this.

It helps me understand the problem, at least.
Maybe it also points out that we might like to change the interface so
that generic code does not encode strings before passing them to the
OS-specific primitives.

>> Could you specify a bit more precisely which primitives you have
>> in mind?
> Those in fileio.c and in dired.c.  I could give an explicit list, if
> you want.

At least I disagree with your Ffile_name_directory suggestion: if the
file-name is already encoded and it results in bugs, the fix should be
in the caller.


        Stefan
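
To make the cp932 example above concrete, here is a small sketch in Python (not Emacs code) of why it matters whether a file name is split before or after encoding.  In cp932, "表" encodes to the two bytes 0x95 0x5C, and 0x5C is also the ASCII backslash, which matches the "\225" (octal for 0x95) left behind in the quoted result.  The helper names `split_chars` and `split_bytes` are hypothetical, made up for this illustration only; they do not correspond to any Emacs primitive.

```python
# Illustration only: contrasts splitting a decoded string with splitting
# its encoded bytes.  The helper names are hypothetical.

def split_chars(path, sep="\\"):
    """Split a decoded (character) string on the directory separator."""
    return path.split(sep)

def split_bytes(path, coding="cp932", sep=b"\\"):
    """Encode first, then split the raw bytes on the separator byte."""
    return path.encode(coding).split(sep)

if __name__ == "__main__":
    name = "C:\\表\\file.txt"

    # The second byte of "表" in cp932 is 0x5C, the same as a backslash.
    print("表".encode("cp932"))   # b'\x95\\'

    # Working on characters keeps "表" intact.
    print(split_chars(name))      # ['C:', '表', 'file.txt']

    # Working on encoded bytes cuts the two-byte sequence in half.
    print(split_bytes(name))      # [b'C:', b'\x95', b'', b'file.txt']
```

The byte-level split leaves a lone 0x95 as a path component, which is the same kind of truncation as in the quoted `expand-file-name` result.  Under the assumption that the primitives are meant to operate on decoded Emacs strings, a caller that passes already-encoded text is the one asking for the byte-level behavior.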