From: Peter Dyballa <Peter_Dyballa@Web.DE>
Cc: help-gnu-emacs@gnu.org, Miles Bader <miles@gnu.org>
Subject: Re: UTF-8 in path / filename
Date: Sun, 27 Aug 2006 15:12:15 +0200
Message-ID: <25A143BA-4E99-4FF9-B6C0-A8F42146D0C9@Web.DE>
In-Reply-To: <m3wt8vm9g6.fsf@lugabout.jhcloos.org>


Am 27.08.2006 um 00:13 schrieb James Cloos:

> Peter> Files with UTF-8 characters in them are shown in dired (has -u: in
> Peter> mode-line, i.e. uses UTF-8) à la <vowel><empty box>. Some UTF-8
> Peter> characters like ß or Û show up as themselves.
>
> Doesn't Apple by default use NFD (Normalization Form Decomposed) for
> filenames?  That would explain the <vowel><box> sequences.

Yes, that's the correct term for the way file names are recorded in  
HFS+.
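
To make the decomposition concrete, here is a minimal sketch in Python (my choice of language; nothing in this thread uses it). It applies the standard Unicode NFD, which may differ in detail from what HFS+ actually does, and the helper names decompose and octets are made up for illustration:

```python
import unicodedata

def decompose(name):
    """Return the canonical decomposition (standard NFD) of a string."""
    return unicodedata.normalize("NFD", name)

def octets(name):
    """Render the UTF-8 encoding of a string as space-separated hex bytes."""
    return " ".join(f"{b:02x}" for b in name.encode("utf-8"))

precomposed = "\u00e4"                 # ä stored as a single code point
decomposed = decompose(precomposed)    # 'a' followed by U+0308 COMBINING DIAERESIS

print(len(precomposed), octets(precomposed))   # 1 c3 a4
print(len(decomposed), octets(decomposed))     # 2 61 cc 88
```

The combining mark is a separate character, so a font without a glyph for U+0308 would draw the base vowel followed by an empty box.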

The font file, LucidaTypewriterRegular.ttf, has no combining  
diacritical marks defined (only some modifiers), so these empty boxes  
are displayed instead.

>
> Can you get at the actual octet-sequence of the filenames?

Do you know a tool that can do that? I can only think of a C  
programme that reads the inode and then outputs the octets. Doing the  
same as Harald did, I get different output in Terminal (because UTF-8  
characters are substituted with question marks), for example:

	pete 140 /\ l -1 | grep .txt | grep ' ' | grep -v Mac
	RGB äöüæÆÜÖÄ.txt
	pete 141 /\ l -1 | grep .txt | grep ' ' | grep -v Mac | od -t a
	    R   G   B  sp   a   ?  88   o   ?  88   u   ?  88   ?   ?   ?
	   86   U   ?  88   O   ?  88   A   ?  88   .   t   x   t  nl

In the shells of the Emacsen I get:

	    R   G   B  sp   a   \314  88   o   \314  88   u   \314  88   \303   \246   \303
	   86   U   \314  88   O   \314  88   A   \314  88   .   t   x   t  nl
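
A way to get at the octets without the terminal substituting anything could look like the following. This is only a sketch under assumptions: Python is my choice, not something from this thread, and filename_octets is a hypothetical helper. It relies on os.listdir returning the names as bytes when it is given a bytes path:

```python
import os

def filename_octets(directory="."):
    """List (raw_name, hex_dump) pairs for every entry in a directory.

    A bytes path makes os.listdir return the names as bytes, so no
    decoding or terminal rendering gets in the way.
    """
    names = os.listdir(os.fsencode(directory))
    return [(n, " ".join(f"{b:02x}" for b in n)) for n in sorted(names)]

if __name__ == "__main__":
    for raw, dump in filename_octets():
        print(dump)
```

Each output line is the hex dump of one file name, so a decomposed ä would show up as 61 cc 88 rather than as a question mark.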

The file name áÛïǓà.txt is interpreted as:

	    a   \314  81   U   \314  82   i   \314  88   U   \314  8c   a   \314  80   .
	    t   x   t  nl
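
The dump above can be turned back into characters to check the reading. A minimal sketch in Python, assuming that tokens starting with a backslash are octal and bare two-character tokens are hexadecimal (which is how the dumps in this message read); parse_dump is a hypothetical helper and only knows the named characters sp and nl that occur here:

```python
import unicodedata

# Only the named characters that occur in the dumps in this message.
NAMED = {"sp": 0x20, "nl": 0x0A}

def parse_dump(dump):
    """Convert whitespace-separated dump tokens back into bytes."""
    out = bytearray()
    for tok in dump.split():
        if tok in NAMED:
            out.append(NAMED[tok])
        elif tok.startswith("\\"):
            out.append(int(tok[1:], 8))    # assumed: backslash + octal digits
        elif len(tok) == 2:
            out.append(int(tok, 16))       # assumed: two bare hex digits
        else:
            out.append(ord(tok))           # a plain ASCII character
    return bytes(out)

dump = r"a \314 81 U \314 82 i \314 88 U \314 8c a \314 80 . t x t nl"
raw = parse_dump(dump)
text = raw.decode("utf-8").rstrip("\n")
composed = unicodedata.normalize("NFC", text)

# Name each combining mark that follows a base letter.
print([unicodedata.name(c) for c in text if unicodedata.combining(c)])
print(composed, len(text), len(composed))
```

Recomposing with NFC gives back the five precomposed letters, which confirms that each letter is stored as a base character plus one combining mark.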

--
Greetings

   Pete

"Isn't vi that text editor with two modes... one that beeps and one
that corrupts your file?" -- Dan Jacobson, on comp.os.linux.advocacy
