From: Klaus-Dieter Bauer
Newsgroups: gmane.emacs.devel
Subject: Re: Passing unicode filenames to start-process on Windows?
Date: Fri, 8 Jan 2016 00:31:38 +0100
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
References: <83si2a3cuo.fsf@gnu.org> <83h9ip2xdg.fsf@gnu.org>
In-Reply-To: <83h9ip2xdg.fsf@gnu.org>

2016-01-07 17:00 GMT+01:00 Eli Zaretskii <eliz@gnu.org>:

> > From: Klaus-Dieter Bauer <bauer.klaus.dieter@gmail.com>
> > Date: Wed, 6 Jan 2016 22:19:39 +0100
> > Cc: emacs-devel@gnu.org
> >
> > I thought up some workarounds, but they all run into limitations:
> >
> > * w32-short-file-name: Doesn't work, because in modern Windows systems
> >   8.3 file names may not be generated, so it may just return the
> >   unchanged filename.
> > * rename-file: Allows working with a name via a temporary supported
> >   file name. Sadly there is no way to guarantee that such renaming is
> >   undone afterwards.
> > * copy-file (to a temporary directory): Would work for the current
> >   application, but unviable when larger amounts of data are involved.
> >
> > Would you happen to know any other possible workaround?
>
> The only one that would work reliably is to pass arguments via a file
> or a pipe.  (Some programs support "response files" as a replacement
> for command-line arguments, or can read the arguments from stdin.)
>
> Do you really have programs that can support text outside of the
> current system codepage?  If you don't, then passing arguments with
> such strings is the least of your problems: once you do get these
> strings into the program, the program won't be able to do anything
> useful with them: all the library functions that receive C strings
> will misbehave, you won't be able to open files with such names, etc.
>
> IOW, I'm not sure I understand your use case in enough detail to
> provide useful advice.  Perhaps describe what you want to do and the
> program you want to invoke from Emacs in more detail.

I have two use cases where I run into the issue:

- I want at some point to write an incremental backup utility
  that uses md5sum to identify renamed files. Since the precompiled
  Windows binaries of Emacs are 32-bit, however, only the first
  512 MB of any given file is accessible to elisp, so I wanted to
  use GnuWin32's md5sum.exe (but it turns out that it doesn't
  support Unicode filenames anyway).

- I want to verify a convention where filenames should mirror
  the metadata in my music library. Here I intended to write
  an elisp tool (for easy interactive processing in Emacs)
  and tried to use ffmpeg (which does support Unicode filenames
  in cmd.exe).

I checked, and both tools can read their input data from
a pipe (`type UNICODE.mp3 | ffmpeg -i - ...` or `md5sum`
respectively), so that workaround is applicable to all my use cases.

Thanks for the help!

- Klaus
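
A minimal sketch of the pipe workaround discussed in this message, written in Python purely for illustration (the thread itself concerns elisp, where the INFILE argument of `call-process` is the analogous mechanism). The idea is that the parent process opens the file and connects it to the child's stdin, so the file name never has to be passed on the child's command line. The helper names `run_with_file_on_stdin` and `md5_via_pipe` are hypothetical, and the child used here is a Python one-liner standing in for `md5sum`; it is an assumption of this sketch, not something taken from the thread.

```python
import subprocess
import sys


def run_with_file_on_stdin(command, path):
    """Run `command` with the contents of `path` connected to its stdin.

    The parent opens the file itself, so the file name is never placed on
    the child's command line and is never converted to the system codepage.
    """
    with open(path, "rb") as source:
        completed = subprocess.run(
            command,
            stdin=source,            # child reads the file data from its stdin
            stdout=subprocess.PIPE,  # capture whatever the child prints
            check=True,              # raise if the child exits with an error
        )
    return completed.stdout


# Stand-in child process (an assumption of this sketch): hashes whatever
# arrives on stdin, playing the role that `md5sum` has in the message.
HASH_STDIN = [
    sys.executable,
    "-c",
    "import hashlib, sys; "
    "print(hashlib.md5(sys.stdin.buffer.read()).hexdigest())",
]


def md5_via_pipe(path):
    """Return the hex MD5 digest of `path`, computed by a child process."""
    return run_with_file_on_stdin(HASH_STDIN, path).decode("ascii").strip()
```

With a real tool, the same helper would presumably be called with a command list such as `["md5sum"]` or `["ffmpeg", "-i", "-", ...]`, leaving the remaining ffmpeg options as elided in the message. Because the data is streamed by the operating system rather than loaded by the parent, the size of the file should not matter to the calling program.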