unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#12807: 24.2; Emacs cannot edit file with funny Unicode characters in the file name on Windows
From: Nils Gösche @ 2012-11-05 20:52 UTC
  To: 12807

Dear Sirs,

I keep a bunch of text files on my Windows 7 desktop containing my thoughts
about the solutions of chess problems I am trying to solve. Now, one of these
problems was composed by a Russian. So, I named the file Кузовков_Lösung.txt:
First the name of the Russian composer, then the German word for »solution«.
However, when I tried to edit that file in Emacs, I only got error messages,
probably because of the funny Unicode characters in the file name. (See below
for the exact wording of the messages.)

Another file with only English/German characters in the name,
Thorton_Lösung.txt, does not cause any trouble at all (oh, but it seems I
misspelled the name, actually).

(BTW, Notepad does not have any problems editing the same file. So, it is
not some weird, OS-related problem, either).

Regards,
Nils Gösche

======== End of bug report======


In GNU Emacs 24.2.1 (i386-mingw-nt6.1.7601)
 of 2012-08-29 on MARVIN
Windowing system distributor `Microsoft Corp.', version 6.1.7601
Configured using:
 `configure --with-gcc (4.6) --cflags
 -ID:/devel/emacs/libs/libXpm-3.5.8/include
 -ID:/devel/emacs/libs/libXpm-3.5.8/src
 -ID:/devel/emacs/libs/libpng-dev_1.4.3-1/include
 -ID:/devel/emacs/libs/zlib-dev_1.2.5-2/include
 -ID:/devel/emacs/libs/giflib-4.1.4-1/include
 -ID:/devel/emacs/libs/jpeg-6b-4/include
 -ID:/devel/emacs/libs/tiff-3.8.2-1/include
 -ID:/devel/emacs/libs/gnutls-3.0.9/include'

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: en_US
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: DEU
  value of $XMODIFIERS: nil
  locale-coding-system: cp1252
  default enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  display-time-mode: t
  tooltip-mode: t
  mouse-wheel-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t

Recent input:
<down-mouse-1> <mouse-1> <down-mouse-1> <mouse-1> C-x 
C-f d e s k <tab> K u s o w k o w _ L ö s u n g . t 
x t C-g <help-echo> <down-mouse-1> <mouse-1> C-x C-f 
K u <backspace> <backspace> d e s k <tab> K u s o w 
k o s <backspace> w _ L ö s u n g . t x t <return> 
b l a r k <return> C-x C-s C-x k <return> C-z <down-mouse-1> 
<mouse-1> <down-mouse-1> <mouse-1> d x f o i p g d 
o j g <return> C-x C-s <down-mouse-1> <mouse-1> <down-mouse-1> 
<mouse-1> C-x k <return> y e s <return> C-z <down-mouse-1> 
<mouse-1> <return> C-x C-s <backspace> C-x C-s C-x 
k <return> C-z C-x k <return> C-z <down-mouse-1> <mouse-1> 
C-x k <return> C-z C-x C-f d e s k <tab> k <tab> <backspace> 
<tab> <tab> <down-mouse-1> <mouse-2> <end> F a r k 
. <return> C-x C-s C-x k <return> y e s <return> C-z 
M-x M-x C-g M-x r e p o r <tab> <return>

Recent messages:
Wrote c:/Users/cartan/Desktop/Thorton_Lösung.txt
Saving file c:/Users/cartan/Desktop/Thorton_Lösung.txt...
Wrote c:/Users/cartan/Desktop/Thorton_Lösung.txt
(New file) [2 times]
Making completion list...
Mark set
Saving file c:/Users/cartan/Desktop/????????_Lösung.txt...
basic-save-buffer-2: Opening output file: invalid argument, c:/Users/cartan/Desktop/????????_Lösung.txt
completing-read-default: Command attempted to use minibuffer while in minibuffer
Quit
Quit

Load-path shadows:
None found.

Features:
(shadow sort gnus-util mail-extr emacsbug message format-spec rfc822 mml
mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils
mailheader sendmail regexp-opt rfc2047 rfc2045 ietf-drums mm-util mail-prsvr
mail-utils help-mode easymenu view eliserv doctor server time cl time-date
tooltip ediff-hook vc-hooks lisp-float-type mwheel dos-w32 disp-table ls-lisp
w32-win w32-vars tool-bar dnd fontset image fringe lisp-mode register page
menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock syntax
facemenu font-core frame cham georgian utf-8-lang misc-lang vietnamese tibetan
thai tai-viet lao korean japanese hebrew greek romanian slovak czech european
ethiopic indian cyrillic chinese case-table epa-hook jka-cmpr-hook help simple
abbrev minibuffer loaddefs button faces cus-face files text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote make-network-process multi-tty emacs)

* bug#12807: 24.2; Emacs cannot edit file with funny Unicode characters in the file name on Windows
From: Eli Zaretskii @ 2012-11-05 21:47 UTC
  To: Nils Gösche; +Cc: 12807

> From: Nils Gösche <cartan@cartan.de>
> Date: Mon, 05 Nov 2012 21:52:05 +0100
> 
> I keep a bunch of text files on my Windows 7 desktop containing my thoughts
> about the solutions of chess problems I am trying to solve. Now, one of these
> problems was composed by a Russian. So, I named the file Кузовков_Lösung.txt:
> First the name of the Russian composer, then the German word for »solution«.
> However, when I tried to edit that file in Emacs, I only got error messages,
> probably because of the funny Unicode characters in the file name. (See below
> for the exact wording of the messages.)
> 
> Another file with only English/German characters in the name,
> Thorton_Lösung.txt, does not cause any trouble at all (oh, but it seems I
> misspelled the name, actually).

Emacs on Windows currently supports only file names that can be
expressed in the system codepage.  So unless someone writes the code
to support the Unicode APIs throughout, this limitation will remain
for some time to come.  Volunteers are welcome.

> (BTW, Notepad does not have any problems editing the same file. So, it is
> not some weird, OS-related problem, either).

Yes, but the Explorer and the Notepad are about the only programs that
do.  Many others don't.  Emacs is one of them.

Sorry.

* bug#12807: AW: bug#12807: 24.2; Emacs cannot edit file with funny Unicode characters in the file name on Windows
From: Nils Gösche @ 2012-11-05 22:05 UTC
  To: 'Eli Zaretskii'; +Cc: 12807

You wrote:

> Emacs on Windows currently supports only file names that can be
> expressed in the system codepage.  So unless someone writes the code to
> support the Unicode APIs throughout, this limitation will remain for
> some time to come.  Volunteers are welcome.

Subtle hint noted. Ok ok, I'll look into it.

> > (BTW, Notepad does not have any problems editing the same file. So, it
> > is not some weird, OS-related problem, either).
> 
> Yes, but the Explorer and the Notepad are about the only programs that
> do.  Many others don't.  Emacs is one of them.

»About the only« is a bit of an exaggeration ;-)  Anything that is written
in C# or Java shouldn't have that problem; or Common Lisp, come to think of
it. But yeah, back in the old days, pretty much nobody felt like using
wchar_t instead of char everywhere in C. I didn't, either, back then. (Not
to mention that in the really old days, wchar_t didn't even exist ;-)

I'll see what I can do.

Regards,
-- 
Nils Gösche
Don't ask for whom the <Ctrl-G> tolls.

* bug#12807: AW: bug#12807: 24.2; Emacs cannot edit file with funny Unicode characters in the file name on Windows
From: Eli Zaretskii @ 2012-11-06  3:57 UTC
  To: Nils Gösche; +Cc: 12807

> From: Nils Gösche <cartan@cartan.de>
> Cc: <12807@debbugs.gnu.org>
> Date: Mon, 5 Nov 2012 23:05:57 +0100
> 
> > Yes, but the Explorer and the Notepad are about the only programs that
> > do.  Many others don't.  Emacs is one of them.
> 
> »About the only« is a bit of an exaggeration ;-)  Anything that is written
> in C# or Java shouldn't have that problem; or Common Lisp, come to think of
> it. But yeah, back in the old days, pretty much nobody felt like using
> wchar_t instead of char everywhere in C. I didn't, either, back then. (Not
> to mention that in the really old days, wchar_t didn't even exist ;-)

Using wchar_t is not going to solve the whole problem, unfortunately.
The problem is that the mainline Emacs code uses APIs that don't
accept wide characters.  Examples include 'stat', 'access', 'open',
'fopen', etc.  To fix the problem, we'd need to provide our own
implementation of these APIs that would accept a UTF-8 encoded file
name, then re-encode the file name in UTF-16, and call the Unicode
APIs as part of the implementation.  This is a large job.
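
The approach described above can be sketched roughly as follows.  This
is an illustration only, not code from Emacs: the names utf8_to_utf16,
utf8_fopen and NAME_CAP are hypothetical, the 260-unit limit is an
assumption, and only 'fopen' is wrapped.  The one Windows-specific call
is the C runtime's _wfopen; on other systems the wrapper simply falls
through to plain 'fopen'.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#ifdef _WIN32
#include <wchar.h>   /* _wfopen */
#endif

enum { NAME_CAP = 260 };  /* hypothetical limit, in UTF-16 units incl. NUL */

/* Re-encode the NUL-terminated UTF-8 string IN as UTF-16 in OUT, which
   has room for CAP units including the terminator.  Returns the number
   of units written (terminator excluded), or -1 if the input is
   malformed or does not fit.  */
int
utf8_to_utf16 (const char *in, uint16_t *out, size_t cap)
{
  static const uint32_t min_cp[] = { 0, 0x80, 0x800, 0x10000 };
  const unsigned char *s = (const unsigned char *) in;
  size_t n = 0;

  if (cap == 0)
    return -1;
  while (*s)
    {
      uint32_t cp;
      int extra, i;

      if (*s < 0x80)                { cp = *s;        extra = 0; }
      else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
      else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
      else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
      else
        return -1;                  /* stray continuation or invalid lead byte */
      s++;
      for (i = 0; i < extra; i++, s++)
        {
          if ((*s & 0xC0) != 0x80)  /* also catches a premature NUL */
            return -1;
          cp = (cp << 6) | (*s & 0x3F);
        }
      /* Reject overlong forms, surrogates, and values past U+10FFFF.  */
      if (cp < min_cp[extra] || cp > 0x10FFFF
          || (cp >= 0xD800 && cp <= 0xDFFF))
        return -1;
      if (cp < 0x10000)
        {
          if (n + 1 >= cap)
            return -1;
          out[n++] = (uint16_t) cp;
        }
      else
        {
          if (n + 2 >= cap)
            return -1;
          cp -= 0x10000;            /* encode as a surrogate pair */
          out[n++] = (uint16_t) (0xD800 | (cp >> 10));
          out[n++] = (uint16_t) (0xDC00 | (cp & 0x3FF));
        }
    }
  out[n] = 0;
  return (int) n;
}

/* fopen replacement taking a UTF-8 file name.  On Windows it calls the
   wide-character _wfopen; elsewhere it falls through to fopen.  */
FILE *
utf8_fopen (const char *name, const char *mode)
{
#ifdef _WIN32
  uint16_t u16[NAME_CAP], m16[16];
  wchar_t wname[NAME_CAP], wmode[16];
  int n = utf8_to_utf16 (name, u16, NAME_CAP);
  int m = utf8_to_utf16 (mode, m16, 16);
  int i;

  if (n < 0 || m < 0)
    return NULL;
  for (i = 0; i <= n; i++)          /* copy including the terminator */
    wname[i] = (wchar_t) u16[i];
  for (i = 0; i <= m; i++)
    wmode[i] = (wchar_t) m16[i];
  return _wfopen (wname, wmode);
#else
  return fopen (name, mode);
#endif
}
```

With a wrapper of this kind, a UTF-8 encoded name such as
Кузовков_Lösung.txt would reach the file system as UTF-16 instead of
passing through the system codepage.  'stat', 'access' and 'open' would
each need an analogous wrapper, which is where most of the work lies.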





