From: Klaus-Dieter Bauer <bauer.klaus.dieter@gmail.com>
To: emacs-devel@gnu.org
Subject: Passing unicode filenames to start-process on Windows?
Date: Wed, 6 Jan 2016 16:20:29 +0100
Message-ID: <CANtbJLHOJOsy+CfgsvgCYN-aA6Ur1UYuRENPREpsRW_JaSJpDg@mail.gmail.com>

Hello!

Is there a reliable way to pass Unicode file names as
arguments through `start-process'?

I have run into two limitations:


1. Using `prefer-coding-system' with anything other than
   `locale-coding-system', e.g.

       (prefer-coding-system 'utf-8),

   causes a file name "Ö.txt" to be misdecoded by
   subprocesses -- notably including "emacs.exe", but also
   all other executables I tried (both Windows builtins like
   where.exe and third-party executables like ffmpeg.exe or
   GnuWin32 utilities).

   In my case (German locale, 'utf-8 preferred coding
   system) it is mis-decoded as "Ã–.txt", i.e. Emacs encodes
   the process argument as 'utf-8 but the subprocess decodes
   it as 'latin-1.

   While this can be fixed by an explicit encoding

       (start-process ...
         (encode-coding-string filename locale-coding-system))

   such code will probably not be used in most projects, as
   the issue occurs only on Windows, dependent on the user
   configuration (-> hard-to-find bug?). I have added some
   elisp for demonstration at the end of the mail.


2. When a file name contains characters that cannot be
   encoded in the locale's encoding, e.g. Japanese
   characters in a German locale, I cannot find any way to
   pass the file name through the `start-process' interface.
   Unlike the case of characters that the locale supports,
   this fails even in a clean "emacs -Q" session.

   Curiously, the file name can still be used in cmd.exe,
   though entering it may require TAB-completion (even
   though the active codepage shouldn't support those
   characters).
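
The mix-up described in limitation 1 can be reproduced without
starting a subprocess. The following is only a sketch: the helper
name `my/argument-as-seen' is hypothetical, and it assumes that the
subprocess decodes its arguments with Windows-1252, the usual ANSI
codepage of a German locale. It encodes a string with one coding
system and decodes the resulting bytes with another.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my/argument-as-seen (string sent-coding seen-coding)
  "Return STRING as read by a program that decodes with SEEN-CODING.
The bytes are produced by encoding STRING with SENT-CODING."
  (decode-coding-string (encode-coding-string string sent-coding)
                        seen-coding))

;; Encoded as UTF-8 but decoded as Windows-1252 (assumed codepage):
;; the two UTF-8 bytes of "Ö" are read as two separate characters.
(my/argument-as-seen "Ö.txt" 'utf-8 'windows-1252)   ; => "Ã–.txt"

;; Encoded and decoded with the same coding system: no damage.
(my/argument-as-seen "Ö.txt" 'windows-1252 'windows-1252) ; => "Ö.txt"
```

The second call mirrors what the explicit `encode-coding-string'
workaround achieves, provided `locale-coding-system' matches the
codepage the subprocess uses.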

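For limitation 2, it may help to detect in advance whether a file
name can be represented in a given coding system at all. A minimal
sketch, assuming again Windows-1252 as a stand-in for the locale's
encoding; the predicate name `my/encodable-in-p' is hypothetical,
while `unencodable-char-position' is a built-in function.

```elisp
;; Hypothetical predicate, not part of Emacs.
(defun my/encodable-in-p (string coding-system)
  "Return non-nil if all characters of STRING can be encoded in CODING-SYSTEM."
  (null (unencodable-char-position 0 (length string)
                                   coding-system nil string)))

;; "Ö" exists in Windows-1252, so explicit encoding can work.
(my/encodable-in-p "Ö.txt" 'windows-1252)              ; => t

;; Japanese characters do not exist in Windows-1252, so no explicit
;; encoding to that codepage can preserve this name.
(my/encodable-in-p "こんにちは世界.txt" 'windows-1252)  ; => nil
```

Such a check only tells the caller that the argument will be
damaged; it does not offer a way to pass the name through
`start-process'.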

- Klaus


---------------- EXAMPLE CODE --------------------

;; Setup: Create a file "unifilebug/Ö.txt" with
;; some arbitrary text. Make sure it is the only file in
;; "unifilebug".
;;
;; Note that for this issue it doesn't matter what coding system
;; is chosen for file names (Unix only; On Windows the coding
;; system for file names is fixed anyway.)


;; Set the preferred coding system.
(prefer-coding-system 'utf-8)


;; Try opening it in an emacs subprocess.
;;
;; On Windows this breaks
;; if `prefer-coding-system' was called with anything other than
;; `locale-coding-system', here 'utf-8.
;;
;; On Unix (tested with cygwin), it works fine; Presumably because
;; the file name is decoded (in `directory-files') and encoded (in
;; `start-process') with the same preferred coding system.
(let ((file-name (car (directory-files "unifilebug" t "txt$"))))
  (start-process "" nil "emacs" "-Q" file-name))


;; It can be fixed by explicitly encoding file-names. This
;; thankfully works both in the W32 and the Cygwin version of
;; emacs.
(let ((file-name (car (directory-files "unifilebug" t "txt$"))))
  (start-process "" nil "emacs" "-Q"
    (encode-coding-string file-name locale-coding-system)))


;; Now we create a file called "ufb2/こんにちは世界.txt"
;; Even in an Emacs session without prefer-coding-system it will
;; fail, decoding the file-name as "ufb2/ .txt".
(let ((file-name (car (directory-files "ufb2" t "txt$"))))
  (start-process "" nil "emacs" "-Q" file-name))


--------------------------------------------------
