From: David Kastrup
Newsgroups: gmane.lisp.guile.user
Subject: Re: guile can't find a chinese named file
Date: Wed, 15 Feb 2017 10:54:06 +0100
Message-ID: <87a89n6apt.fsf@fencepost.gnu.org>
To: guile-user@gnu.org

writes:

> On Tue, Feb 14, 2017 at 10:19:14PM +0000, Chris Vine wrote:
>> On Tue, 14 Feb 2017 21:52:01 +0000 (UTC)
>> Mike Gran wrote:
>> [snip]
>> > > In particular, filenames are *not*, nor can they be mapped to,
>> > > Unicode
>> >
>> > > strings in Linux.
>> >
>> > True.  Linux should follow OpenBSD and make all locales UTF-8.
>>
>> Filenames and locales are not necessarily related.  When you access a
>> networked file system, you get the filename encoding you are given,
>> which may or may not be the same as the particular locale encoding on
>> your particular machine on one particular day, and may or may not be a
>> unicode encoding.  Glib, for example, enables you to set this with the
>> G_FILENAME_ENCODING environmental variable [...]
>
> which is, btw., "just a better approximation", but still wrong: the
> application creating a directory might have been "in" a different
> locale (and thus having a different encoding) than the one creating
> the file within that directory.
>
> Most notably, the whole path might cross several mount points, thus
> the whole path can well have fragments coming from several file systems.
>
> I think the only sane way to see a Linux file system path is the way
> Linux sees it: as a byte string.
>
> Sure, some helper infrastructure to try to make characters of that
> mess will be welcome, but that should be absolutely robust wrt.
> unexpected input (e.g. bad UTF-8) and leave control to the application.
>
> Not easy.

If you tell Emacs that some external entity is in UTF-8, it will
represent all valid UTF-8 sequences as properly decoded characters, and
it has special codes for all bytes that are not part of valid UTF-8.

As a result, it works with valid UTF-8 exactly as expected, and it will
also reproduce arbitrary byte streams thrown at it exactly when decoding
them as UTF-8 and then reencoding into UTF-8 again.

Guile is lacking this byte stream reproducibility when
decoding/reencoding.  That makes it a whole lot less robust for dealing
with externally provided material.

-- 
David Kastrup
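
The round-trip property described above can be illustrated with a small sketch. It uses Python's `surrogateescape` error handler purely as an analogy: this is not Emacs's internal representation, and it is not something Guile provides. The helper name `roundtrip` and the sample file names are made up for the illustration.

```python
def roundtrip(raw: bytes) -> bytes:
    """Decode bytes as UTF-8 without losing invalid bytes, then encode back."""
    # With surrogateescape, each undecodable byte 0xNN becomes the lone
    # surrogate U+DC00 + 0xNN instead of raising or turning into U+FFFD.
    text = raw.decode("utf-8", errors="surrogateescape")
    # Encoding with the same handler turns those surrogates back into
    # the original bytes, so the byte stream is reproduced exactly.
    return text.encode("utf-8", errors="surrogateescape")


# A valid UTF-8 file name (two Chinese characters, written as escapes).
valid = "\u4e2d\u6587.txt".encode("utf-8")
# A hypothetical file name containing bytes that are not valid UTF-8.
broken = b"caf\xe9-\xff\xfe.txt"

assert roundtrip(valid) == valid      # valid UTF-8 behaves as expected
assert roundtrip(broken) == broken    # arbitrary bytes survive unchanged

# For contrast: substituting U+FFFD for bad bytes is lossy, so the
# original name can no longer be reconstructed after reencoding.
lossy = broken.decode("utf-8", errors="replace").encode("utf-8")
assert lossy != broken
```

The contrast in the last lines is the point of the complaint: a decoder that substitutes or rejects invalid bytes cannot hand the original file name back to the operating system.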