unofficial mirror of guile-user@gnu.org 
From: John Darrington <john@darrington.wattle.id.au>
To: Ludovic Courtès <ludo@gnu.org>
Cc: guile-user@gnu.org
Subject: Re: Filename encoding
Date: Thu, 16 Jan 2014 15:07:43 +0100
Message-ID: <20140116140743.GA16999@jocasta.intra>
In-Reply-To: <8738koaxkm.fsf@gnu.org>

On Thu, Jan 16, 2014 at 02:03:05PM +0100, Ludovic Courtès wrote:
     Eli Zaretskii <eliz@gnu.org> skribis:
     
     >> From: ludo@gnu.org (Ludovic Courtès)
     >> Date: Thu, 16 Jan 2014 00:29:06 +0100
     >> 
     >> Does anyone know of systems where the file name encoding is commonly
     >> different from locale encoding?  Is it the case on Windows?
     >
     > Windows stores file names on disk encoded in UTF-16, but converts them
     > to the current codepage if you use Posix-style interfaces like 'open'
     > and 'rename'.
     
     So in practice, given that Guile uses the POSIX interfaces, the
     assumption that file names are in the locale encoding is valid on
     Windows.
     

If you know that the filename was always obtained through Guile's own
interface, then the issue never arises.  The problem comes when a function
is asked to open a file with a non-ASCII name, without any information about
where that filename came from.


There is no answer to this general problem.  We've encountered it over the years
in PSPP.  What we do now is pass the filename around in a structure, along
with a variable indicating the encoding in which that filename should be interpreted.
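
To make the idea concrete, here is a minimal sketch in Python of a filename
carried together with its encoding.  It is not PSPP's actual code; the names
EncodedFilename, raw, encoding, decode and recode are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodedFilename:
    """A filename as raw bytes plus the encoding used to interpret them.

    Hypothetical illustration only; not taken from PSPP or Guile.
    """
    raw: bytes       # the bytes exactly as they were received
    encoding: str    # how those bytes should be interpreted

    def decode(self) -> str:
        # Interpret the bytes using the encoding that travels with them.
        return self.raw.decode(self.encoding)

    def recode(self, target: str) -> "EncodedFilename":
        # Re-express the same name in another encoding, e.g. just before
        # handing it to an interface that expects the locale encoding.
        return EncodedFilename(self.decode().encode(target), target)


# A Latin-1 name converted for a UTF-8 interface.
name = EncodedFilename("café.txt".encode("latin-1"), "latin-1")
print(name.recode("utf-8").raw)
```

The point of keeping the two together is that a function receiving the name
never has to guess; it only has to convert at the boundary.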

This works up to a point, but eventually there comes an interface where the crucial
information is missing.  For example, what happens if the filename comes from a text file?
We have heuristics which can guess the encoding of a file, but they are of course not
completely reliable.
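
As an illustration of why such guessing is unreliable, here is one very simple
heuristic sketched in Python.  It is an assumption for the sake of example, not
the heuristic PSPP uses: try strict UTF-8 first and otherwise fall back to a
caller-supplied encoding.  The name guess_encoding and its fallback parameter
are hypothetical.

```python
def guess_encoding(data: bytes, fallback: str = "latin-1") -> str:
    """Guess how `data` is encoded (hypothetical helper, illustration only).

    Strict UTF-8 decoding rejects most text written in a legacy 8-bit
    encoding, so success is treated as evidence for UTF-8.  Anything
    else is assumed to be in `fallback`.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return fallback
    return "utf-8"


# These bytes are valid UTF-8 ("café") but equally valid Latin-1 ("cafÃ©").
# The heuristic answers "utf-8" and has no way to know if that is wrong.
ambiguous = b"caf\xc3\xa9"
print(guess_encoding(ambiguous))
```

Pure ASCII input is reported as UTF-8 too, which is harmless, but any byte
sequence that happens to be valid in more than one encoding is decided by the
order of the checks rather than by what the writer intended.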

One has to decide on an approach which will give the lowest probability of surprises.

J'




     

-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.




