unofficial mirror of guile-user@gnu.org 
 help / color / mirror / Atom feed
From: David Kastrup <dak@gnu.org>
To: guile-user@gnu.org
Subject: Re: guile can't find a chinese named file
Date: Wed, 15 Feb 2017 10:54:06 +0100	[thread overview]
Message-ID: <87a89n6apt.fsf@fencepost.gnu.org> (raw)
In-Reply-To: 20170215091832.GA28017@tuxteam.de

<tomas@tuxteam.de> writes:

> On Tue, Feb 14, 2017 at 10:19:14PM +0000, Chris Vine wrote:
>> On Tue, 14 Feb 2017 21:52:01 +0000 (UTC)
>> Mike Gran <spk121@yahoo.com> wrote:
>> [snip]
>> > > In particular, filenames are *not*, nor can they be mapped to,
>> > > Unicode  
>> > 
>> > > strings in Linux.  
>> > 
>> > True. Linux should follow OpenBSD and make all locales UTF-8.
>> 
>> Filenames and locales are not necessarily related.  When you access a
>> networked file system, you get the filename encoding you are given,
>> which may or may not be the same as the particular locale encoding on
>> your particular machine on one particular day, and may or may not be a
>> unicode encoding.  Glib, for example, enables you to set this with the
>> G_FILENAME_ENCODING environmental variable [...]
>
> which is, btw., "just a better approximation", but still wrong: the
> application creating a directory might have been "in" a different
> locale (and thus having a different encoding) that the one creating
> the file whithin that directory.
>
> Most notably, the whole path might cross several mount points, thus
> the whole path can well have fragments coming from several file systems.
>
> I think the only sane way to see a Linux file system path is the way
> Linux sees it: as a byte string.
>
> Sure, some helper infrastructure to try to make characters of that
> mess will be welcome, but that should be absolutely robust wrt.
> unexpected input e.g. bad UTF-8) and leave control to the application.
>
> Not easy.

If you tell Emacs that some external entity is in UTF-8, it will
represent all valid UTF-8 sequences as properly decoded characters, and
it has special codes for all bytes not part of valid UTF-8.

As a result, it works with valid UTF-8 perfectly as expected but will
reproduce arbitrary byte streams thrown at it perfectly when decoding as
UTF-8 and then reencoding into UTF-8 again.

Guile is lacking this byte stream reproducibility when
decoding/reencoding.  That makes it a whole lot less robust for dealing
with externally provided material.

-- 
David Kastrup




  reply	other threads:[~2017-02-15  9:54 UTC|newest]

Thread overview: 110+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-27 11:58 guile can't find a chinese named file Thomas Morley
2016-11-27 12:16 ` Chaos Eternal
2016-11-28  8:54   ` Thomas Morley
2017-01-26 21:59     ` Linas Vepstas
2017-01-30 14:20 ` Ludovic Courtès
2017-01-30 15:48   ` David Kastrup
2017-01-30 16:41     ` Ludovic Courtès
2017-01-30 17:04       ` David Kastrup
2017-01-30 15:54   ` Marko Rauhamaa
2017-01-30 16:19     ` David Kastrup
2017-01-30 16:33       ` Marko Rauhamaa
2017-01-30 16:42         ` David Kastrup
2017-01-30 17:58           ` Marko Rauhamaa
2017-01-30 18:32             ` David Kastrup
2017-01-30 18:50               ` Eli Zaretskii
2017-01-30 19:00                 ` David Kastrup
2017-01-30 19:32                   ` Eli Zaretskii
2017-01-30 19:59                     ` Eli Zaretskii
2017-01-30 20:42                       ` Mike Gran
2017-01-31  3:31                         ` Eli Zaretskii
2017-01-31  6:16                           ` Mike Gran
2017-01-31  8:51                           ` David Kastrup
2017-01-30 19:01               ` Marko Rauhamaa
2017-01-30 19:27                 ` David Kastrup
2017-02-14 20:10                   ` Linas Vepstas
2017-02-14 20:54                     ` Mike Gran
2017-02-14 21:07                       ` Marko Rauhamaa
2017-02-14 21:52                         ` Mike Gran
2017-02-14 22:12                           ` Marko Rauhamaa
2017-02-14 22:19                           ` Chris Vine
2017-02-15  7:15                             ` Marko Rauhamaa
2017-02-15  9:18                             ` tomas
2017-02-15  9:54                               ` David Kastrup [this message]
2017-02-15 10:10                                 ` tomas
2017-02-15 17:04                                   ` Eli Zaretskii
2017-02-15 20:07                                     ` tomas
2017-02-15 20:22                                       ` Eli Zaretskii
2017-02-15 10:50                                 ` Marko Rauhamaa
2017-02-15 11:18                                   ` David Kastrup
2017-02-15 10:15                               ` Chris Vine
2017-02-15 11:48                                 ` tomas
2017-02-15 12:13                                   ` Chris Vine
2017-02-15 12:41                                     ` tomas
2017-02-15 13:11                                       ` Chris Vine
2017-02-15 13:31                                         ` tomas
2017-02-15 17:07                                     ` Eli Zaretskii
2017-02-26 20:58                                       ` Andy Wingo
2017-02-27 16:02                                         ` Eli Zaretskii
2017-02-26 20:52                                 ` Andy Wingo
2017-02-15 16:59                               ` Eli Zaretskii
2017-02-15 17:53                                 ` Marko Rauhamaa
2017-02-15 20:20                                 ` tomas
2017-02-15 20:32                                   ` Eli Zaretskii
2017-02-15 21:04                                     ` Marko Rauhamaa
2017-02-16  5:44                                       ` Eli Zaretskii
2017-02-16  6:15                                         ` Marko Rauhamaa
2017-02-16  6:29                                           ` Eli Zaretskii
2017-02-16  6:41                                             ` Eli Zaretskii
2017-02-16  7:16                                               ` Marko Rauhamaa
2017-02-16  8:26                                                 ` David Kastrup
2017-02-16 10:21                                                   ` Marko Rauhamaa
2017-02-16 10:43                                                     ` David Kastrup
2017-02-16 11:04                                                       ` Marko Rauhamaa
2017-02-16 11:11                                                         ` David Kastrup
2017-02-16 11:32                                                           ` Marko Rauhamaa
2017-02-16 11:49                                                             ` David Kastrup
2017-02-16 12:14                                                               ` Marko Rauhamaa
2017-02-16 16:21                                                                 ` Eli Zaretskii
2017-02-16 16:38                                                                   ` Marko Rauhamaa
2017-02-16 17:46                                                                     ` Eli Zaretskii
2017-02-16 18:38                                                                       ` Marko Rauhamaa
2017-02-16 18:46                                                                         ` Eli Zaretskii
2017-02-16 19:35                                                                           ` Marko Rauhamaa
2017-02-16 20:10                                                                             ` Eli Zaretskii
2017-02-16 20:52                                                                               ` David Kastrup
2017-02-16 21:13                                                                                 ` Marko Rauhamaa
2017-02-17  6:44                                                                                   ` Eli Zaretskii
2017-02-17  8:46                                                                                     ` Marko Rauhamaa
2017-02-17  9:04                                                                                       ` David Kastrup
2017-02-17  9:57                                                                                         ` tomas
2017-02-17  9:07                                                                                       ` Eli Zaretskii
2017-02-17  6:32                                                                                 ` Eli Zaretskii
2017-02-16 16:06                                                 ` Eli Zaretskii
2017-02-16 16:35                                                   ` Marko Rauhamaa
2017-02-16 17:41                                                     ` Eli Zaretskii
2017-02-16 18:30                                                     ` Mike Gran
2017-02-16 18:48                                                       ` David Kastrup
2017-02-16  7:02                                             ` Marko Rauhamaa
2017-02-16 15:47                                               ` Eli Zaretskii
2017-02-15 21:15                                     ` tomas
2017-02-16  5:54                                       ` Eli Zaretskii
2017-02-14 23:58                       ` David Kastrup
2017-02-15 10:12                         ` tomas
2017-02-15 12:04                           ` Marko Rauhamaa
2017-02-26 21:20                         ` Andy Wingo
2017-02-27  9:10                           ` David Kastrup
2017-02-27 11:02                             ` Andy Wingo
2017-02-27 12:09                               ` David Kastrup
2017-02-27 12:33                                 ` Andy Wingo
2017-02-27 16:07                           ` Eli Zaretskii
2017-02-27 19:29                             ` Andy Wingo
2017-02-27 20:24                               ` Jan Wedekind
2017-02-27 20:33                                 ` Eli Zaretskii
2017-02-14 22:26                     ` Ludovic Courtès
2017-02-26 21:23                       ` Andy Wingo
2017-01-30 19:41                 ` Eli Zaretskii
2017-01-30 20:46                   ` Marko Rauhamaa
2017-01-31 12:20                     ` tomas
2017-02-14 19:58             ` Linas Vepstas
2017-02-26 21:33               ` Andy Wingo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a89n6apt.fsf@fencepost.gnu.org \
    --to=dak@gnu.org \
    --cc=guile-user@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).