From: Chris Vine <chris@cvine.freeserve.co.uk>
To: Mark H Weaver <mhw@netris.org>
Cc: guile-user@gnu.org
Subject: Re: Filename encoding
Date: Wed, 15 Jan 2014 19:50:51 +0000
Message-ID: <20140115195051.3272023c@bother.homenet>
In-Reply-To: <87bnzdun74.fsf@netris.org>
On Wed, 15 Jan 2014 13:14:39 -0500
Mark H Weaver <mhw@netris.org> wrote:
> Chris Vine <chris@cvine.freeserve.co.uk> writes:
>
> > A number of guile's scheme procedures look-up or reference files on
> > a file system (open-file, load and so forth).
> >
> > How does guile translate filenames from its internal string
> > representation (ISO-8859-1/UTF-32) to narrow string filename
> > encoding when looking up the file? Does it assume filenames are in
> > locale encoding (not particularly safe on networked file systems)
> > or does it provide a fluid for this? (glib caters for this with the
> > G_FILENAME_ENCODING environmental variable.)
>
> It assumes filenames are in locale encoding. Ditto for virtually
> everything that interfaces with POSIX-style byte strings, including
> environment variables, command-line arguments, etc. Encoding errors
> will raise exceptions by default.
>
> My hope is that this will become less of an issue over time, as
> systems increasingly standardize on UTF-8. I see no other good
> solution.
>
> Thoughts?

POSIX system calls are encoding agnostic. A filename is just a sequence
of bytes terminated by a NUL byte. All guile needs to know is which
encoding the person who created the filesystem adopted when naming
files, so that it can map to it. So far as filenames are concerned,
this seems to me to be something for which a fluid would be just the
thing: it could default to the locale encoding, but a user could set it
to something else. I suppose command lines and environment variables
are less problematic because they are usually local to a particular
machine, although that may not be so true these days for command lines.
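
To make the point concrete, here is a minimal sketch in Python rather
than guile. Everything named in it is hypothetical: `filename_encoding`
is a context variable standing in for the proposed fluid, and
`with_filename_encoding` mimics what `with-fluids` would do. It is not
an existing guile or glib interface. It assumes a Linux-style file
system that stores names as uninterpreted bytes.

```python
import contextvars
import os
import sys
import tempfile

# Hypothetical stand-in for the proposed fluid: a dynamically scoped
# setting that defaults to the locale-derived filesystem encoding.
filename_encoding = contextvars.ContextVar(
    "filename_encoding", default=sys.getfilesystemencoding()
)


def encode_filename(name: str) -> bytes:
    """Map a string to the byte sequence handed to the system call."""
    return name.encode(filename_encoding.get())


def with_filename_encoding(encoding, thunk):
    """Call thunk with the encoding rebound, then restore the old value."""
    token = filename_encoding.set(encoding)
    try:
        return thunk()
    finally:
        filename_encoding.reset(token)


name = "caf\u00e9.txt"
utf8_bytes = with_filename_encoding("utf-8", lambda: encode_filename(name))
latin1_bytes = with_filename_encoding(
    "iso-8859-1", lambda: encode_filename(name)
)

# The system calls take either byte sequence as given, so the same
# string ends up naming two different files.
with tempfile.TemporaryDirectory() as tmp:
    tmp_bytes = os.fsencode(tmp)
    for raw in (utf8_bytes, latin1_bytes):
        with open(os.path.join(tmp_bytes, raw), "wb"):
            pass
    stored = sorted(os.listdir(tmp_bytes))
```

The only thing that changes between the two calls is the mapping from
string to bytes; the file system itself never sees an encoding.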

Fluids would have a substantial advantage over glib's approach of an
environment variable: fluids can be thread safe, whereas environment
variables are not. (Incidentally, with glib you can set the environment
variable G_BROKEN_FILENAMES instead of G_FILENAME_ENCODING, which causes
the glib file functions to use the locale encoding; I guess that name
expresses their view on the issue. However, their solution of using
environment variables is not ideal.)

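
The thread-safety difference can be sketched the same way, again in
Python as an analogy only. The context variable plays the role of a
fluid, and `EXAMPLE_FILENAME_ENCODING` is a made-up environment variable
name used purely for illustration.

```python
import contextvars
import os
import threading

# Hypothetical stand-in for a fluid holding the filename encoding.
filename_encoding = contextvars.ContextVar(
    "filename_encoding", default="utf-8"
)
seen = {}


def worker(label, encoding, barrier):
    # Each thread rebinds the setting for itself only.
    filename_encoding.set(encoding)
    # Wait until both threads have set a value before either reads it.
    barrier.wait()
    seen[label] = filename_encoding.get()
    # An environment variable, by contrast, is shared by the whole process.
    if label == "a":
        os.environ["EXAMPLE_FILENAME_ENCODING"] = encoding


barrier = threading.Barrier(2)
threads = [
    threading.Thread(target=worker, args=("a", "iso-8859-1", barrier)),
    threading.Thread(target=worker, args=("b", "shift_jis", barrier)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

main_value = filename_encoding.get()
leaked_value = os.environ["EXAMPLE_FILENAME_ENCODING"]
```

Each thread reads back its own setting and the main thread still sees
the default, whereas the environment variable written by one thread is
visible everywhere in the process.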
Chris