unofficial mirror of emacs-devel@gnu.org 
* Passing unicode filenames to start-process on Windows?
@ 2016-01-06 15:20 Klaus-Dieter Bauer
From: Klaus-Dieter Bauer @ 2016-01-06 15:20 UTC (permalink / raw)
  To: emacs-devel

Hello!

Is there a reliable way to pass unicode file names as
arguments through `start-process'?

I ran into two limitations:


1. Using `prefer-coding-system' with anything other than
   `locale-coding-system', e.g.

       (prefer-coding-system 'utf-8),

   causes a file name "Ö.txt" to be misdecoded by
   subprocesses -- notably including "emacs.exe", but also
   all other executables I tried (both Windows builtins like
   where.exe and third party executables like ffmpeg.exe or
   GnuWin32 utilities).

   In my case (German locale, 'utf-8 preferred coding
   system) it is mis-decoded as "Ã–.txt", i.e. Emacs encodes
   the process argument as 'utf-8 but the subprocess decodes
   it as 'latin-1.

   While this can be fixed by an explicit encoding

       (start-process ...
         (encode-coding-string filename locale-coding-system))

   such code will probably not be used in most projects, as
   the issue occurs only on Windows, dependent on the user
   configuration (-> hard-to-find bug?). I have added some
   elisp for demonstration at the end of the mail.


2. When a file name contains characters that cannot be
   encoded in the locale's encoding, e.g. Japanese
   characters in a German locale, I cannot find any way to
   pass the file name through the `start-process' interface.
   Unlike for characters that are supported by the locale,
   it fails even in a clean "emacs -Q" session.

   Curiously, the file name can still be used in cmd.exe,
   though entering it may require TAB-completion (even
   though the active codepage shouldn't support these
   characters).
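
   Before calling `start-process', it may help to check whether
   a name falls under case 1 or case 2. The following is only a
   sketch: `my-locale-encodable-p' is a hypothetical helper, not
   part of Emacs, and the examples assume that the locale coding
   system of a German Windows setup is windows-1252.

```elisp
;; Hypothetical helper (not part of Emacs): return non-nil if every
;; character of NAME can be encoded in CODING, which defaults to
;; `locale-coding-system'.
(defun my-locale-encodable-p (name &optional coding)
  (null (unencodable-char-position
         0 (length name) (or coding locale-coding-system) nil name)))

;; Assumption: windows-1252 stands in for the German locale encoding.
(my-locale-encodable-p "Ö.txt" 'windows-1252)              ; => t
(my-locale-encodable-p "こんにちは世界.txt" 'windows-1252)  ; => nil
```

   A name for which this returns nil cannot be repaired by the
   explicit `encode-coding-string' workaround from case 1.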


- Klaus


---------------- EXAMPLE CODE --------------------

;; Setup: Create a file "unifilebug/Ö.txt" with
;; some arbitrary text. Make sure it is the only file in
;; "unifilebug".
;;
;; Note that for this issue it doesn't matter what coding system
;; is chosen for file names (Unix only; on Windows the coding
;; system for file names is fixed anyway).


;; Set the preferred coding system.
(prefer-coding-system 'utf-8)


;; Try opening it in an emacs subprocess.
;;
;; On Windows this breaks
;; if `prefer-coding-system' was called with anything other than
;; `locale-coding-system', here 'utf-8.
;;
;; On Unix (tested with Cygwin), it works fine, presumably because
;; the file name is decoded (in `directory-files') and encoded (in
;; `start-process') with the same preferred coding system.
(let ((file-name (car (directory-files "unifilebug" t "txt$"))))
  (start-process "" nil "emacs" "-Q" file-name))


;; It can be fixed by explicitly encoding file-names. This
;; thankfully works both in the W32 and the Cygwin version of
;; emacs.
(let ((file-name (car (directory-files "unifilebug" t "txt$"))))
  (start-process "" nil "emacs" "-Q"
    (encode-coding-string file-name locale-coding-system)))


;; Now we create a file called "ufb2/こんにちは世界.txt".
;; Even in an Emacs session without `prefer-coding-system' it will
;; fail, decoding the file-name as "ufb2/ .txt".
(let ((file-name (car (directory-files "ufb2" t "txt$"))))
  (start-process "" nil "emacs" "-Q" file-name))


--------------------------------------------------
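
The first failure above can probably be reproduced at the byte level
without starting a subprocess. The sketch below assumes that the
receiving program decodes its arguments as windows-1252 (a common
ANSI codepage for a German locale; this is an assumption, not
something the example code above establishes).

```elisp
;; Assumption: the subprocess decodes its arguments as windows-1252.
(let* ((name "Ö.txt")
       ;; Bytes passed to the subprocess when UTF-8 is the
       ;; preferred coding system, as described in the report.
       (utf8-bytes (encode-coding-string name 'utf-8))
       ;; What a program reading those bytes as windows-1252 sees.
       (seen (decode-coding-string utf8-bytes 'windows-1252))
       ;; Encoding with the receiver's own codepage round-trips.
       (fixed (decode-coding-string
               (encode-coding-string name 'windows-1252)
               'windows-1252)))
  (list seen fixed))
;; => ("Ã–.txt" "Ö.txt")
```

The second element corresponds to the explicit `encode-coding-string'
workaround, with windows-1252 standing in for `locale-coding-system'.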
