From: David Kastrup <dak@gnu.org>
To: guile-user@gnu.org
Subject: Re: guile can't find a chinese named file
Date: Wed, 15 Feb 2017 10:54:06 +0100
Message-ID: <87a89n6apt.fsf@fencepost.gnu.org>
In-Reply-To: 20170215091832.GA28017@tuxteam.de
<tomas@tuxteam.de> writes:
> On Tue, Feb 14, 2017 at 10:19:14PM +0000, Chris Vine wrote:
>> On Tue, 14 Feb 2017 21:52:01 +0000 (UTC)
>> Mike Gran <spk121@yahoo.com> wrote:
>> [snip]
>> > > In particular, filenames are *not*, nor can they be mapped to,
>> > > Unicode strings in Linux.
>> >
>> > True. Linux should follow OpenBSD and make all locales UTF-8.
>>
>> Filenames and locales are not necessarily related. When you access a
>> networked file system, you get the filename encoding you are given,
>> which may or may not be the same as the particular locale encoding on
>> your particular machine on one particular day, and may or may not be a
>> unicode encoding. Glib, for example, enables you to set this with the
>> G_FILENAME_ENCODING environmental variable [...]
>
> which is, btw., "just a better approximation", but still wrong: the
> application creating a directory might have been "in" a different
> locale (and thus have had a different encoding) than the one creating
> the file within that directory.
>
> Most notably, the whole path might cross several mount points, thus
> the whole path can well have fragments coming from several file systems.
>
> I think the only sane way to see a Linux file system path is the way
> Linux sees it: as a byte string.
>
> Sure, some helper infrastructure to try to make characters of that
> mess will be welcome, but that should be absolutely robust wrt.
> unexpected input (e.g. bad UTF-8) and leave control to the application.
>
> Not easy.
If you tell Emacs that some external entity is in UTF-8, it will
represent all valid UTF-8 sequences as properly decoded characters, and
it has special codes for all bytes not part of valid UTF-8.

As a result, it works with valid UTF-8 exactly as expected, and it also
reproduces arbitrary byte streams thrown at it exactly when they are
decoded as UTF-8 and then re-encoded into UTF-8.

Guile lacks this byte-stream reproducibility when decoding and
re-encoding. That makes it a whole lot less robust for dealing with
externally provided material.
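
To make the round-trip property concrete, here is a minimal sketch in
Python rather than Emacs Lisp or Guile. It assumes that Python's
"surrogateescape" error handler is an acceptable stand-in for the
special codes described above: it is a different mechanism from the one
Emacs uses internally, but it has the same observable effect. The file
names and the roundtrip helper are made up for illustration.

```python
def roundtrip(raw: bytes) -> bytes:
    """Decode bytes as UTF-8, then re-encode, without losing any byte."""
    # Valid UTF-8 sequences become ordinary characters. Every byte that
    # is not part of valid UTF-8 becomes a lone surrogate code point in
    # the range U+DC80..U+DCFF, so it can be restored on encoding.
    text = raw.decode("utf-8", errors="surrogateescape")
    return text.encode("utf-8", errors="surrogateescape")


# Hypothetical file names: one valid UTF-8, one containing stray bytes.
valid = "中文.txt".encode("utf-8")
broken = b"caf\xe9-\xff\xfe.txt"

# Valid UTF-8 decodes to the expected characters.
decoded_valid = valid.decode("utf-8", errors="surrogateescape")

# Both byte strings survive decoding and re-encoding unchanged.
same_valid = roundtrip(valid) == valid
same_broken = roundtrip(broken) == broken

# For contrast, the "replace" handler substitutes U+FFFD for the stray
# bytes, so the original byte string cannot be recovered afterwards.
lossy = broken.decode("utf-8", errors="replace").encode("utf-8")
```

A strict decode of the second name would raise UnicodeDecodeError, and
the "replace" variant silently changes the name; only the escaping
variant gives back the bytes the file system handed out, which is what
matters when the string is passed back to open the file.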
--
David Kastrup