From: Mathias Dahl <brakjoller@hotmail.com>
Subject: Re: opening files with unicode characters in the file name on windows
Date: 04 Aug 2004 16:27:05 +0200
Message-ID: <uoelr7zd2.fsf@hotmail.com>
In-Reply-To: mailman.2622.1091561203.1960.help-gnu-emacs@gnu.org

"Eli Zaretskii" <eliz@gnu.org> writes:

> Your original message said ``file names with Unicode characters''.
> Can you tell what characters are those, and why do you think they
> are encoded in some Unicode-related encoding, like UTF-16?  Can you
> look at the file's name as recorded in the directory with some
> low-level tool that actually shows the byte values that encode the
> file's name?

I have done some investigation and I am pretty sure UTF-16 is the
encoding used. The following VBScript program (sorry for pasting
non-Emacs stuff here) loops through all files in a folder and, for
each file name containing a character with a value above 255, displays
a list of the name's characters and their Unicode code point values:

' -- TestUnicodeFileNames.vbs ---

Option Explicit

' --------- Main program starts

Dim sFileName
Dim oFSO
Dim oFile

Set oFSO = CreateObject("Scripting.FileSystemObject")

For Each oFile In oFSO.GetFolder("c:\document\my docs").Files
  ' Call the Sub without parentheses, the usual VBScript statement form.
  checkUnicodeFileName oFile.Name
Next

Set oFSO = Nothing

' --------- Main program ends

Private Sub checkUnicodeFileName(fileName)

  Dim i
  Dim c
  Dim n

  For i = 1 to Len(fileName)

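    c = Mid(fileName, i, 1)
    ' AscW returns a signed 16-bit Integer, so characters at U+8000 and
    ' above come out negative; masking gives a value in 0..65535.
    n = AscW(c) And &HFFFF&

    If n > 255 Then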
      MsgBox "File name contains unicode characters: " & _
             Chr(10) & Chr(10) & _
             "File name: " & fileName & _
             Chr(10) & Chr(10) & _
             "Characters and their unicode code points:" & _
             Chr(10) & Chr(10) & _
             getStringInfo(fileName)
      Exit Sub
    End If

  Next

End Sub

Private Function getStringInfo(s)
  Dim i
  Dim n
  Dim c
  Dim h
  Dim result

  result = "Char" & Chr(9) & "U+NNNN" & Chr(10) & Chr(10)

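  ' Mid walks the string one UTF-16 code unit at a time, so a character
  ' outside the BMP is listed as two surrogate values.
  For i = 1 to Len(s)
    c = Mid(s, i, 1)
    n = AscW(c) And &HFFFF&   ' mask the signed result to 0..65535
    h = Hex(n)
    result = result & c & Chr(9) & Right("0000" & h, 4) & Chr(10)
  Next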

  getStringInfo = result

End Function

' -- TestUnicodeFileNames.vbs ends here ---

The output looks like this (you may not see the actual characters
here; I do see them when I use a "Unicode font" for message boxes):

File name contains unicode characters: 

File name: pravda_правда.txt

Characters and their unicode code points:

Char	U+NNNN

p	0070
r	0072
a	0061
v	0076
d	0064
a	0061
_	005F
п	043F
р	0440
а	0430
в	0432
д	0434
а	0430
.	002E
t	0074
x	0078
t	0074
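
Since the question was about byte values rather than code points,
here is a rough sketch of how the values listed above would map to
bytes. It assumes the name is stored as little-endian UTF-16, which I
believe is what NTFS does, but I have not looked at the directory
entry itself. getUtf16LeBytes is a made-up helper, not part of the
script above:

```vbscript
Option Explicit

' Hypothetical helper: returns the bytes of s as hex pairs, assuming
' each UTF-16 code unit is written low byte first (little-endian).
Function getUtf16LeBytes(s)
  Dim i
  Dim n
  Dim result

  result = ""

  For i = 1 To Len(s)
    ' Mask the signed AscW result so n is always in 0..65535.
    n = AscW(Mid(s, i, 1)) And &HFFFF&
    result = result & Right("0" & Hex(n And &HFF), 2) & " " & _
                      Right("0" & Hex(n \ 256), 2) & " "
  Next

  getUtf16LeBytes = Trim(result)
End Function

' U+043F is the first Cyrillic letter in the file name above.
WScript.Echo getUtf16LeBytes(ChrW(&H043F))   ' displays: 3F 04
```

This only derives the bytes from the string the scripting runtime
hands back; it does not read them from the disk, so it is not a
substitute for a real low-level tool.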

/Mathias
