From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Kastrup
Newsgroups: gmane.lisp.guile.user
Subject: Re: guile can't find a chinese named file
Date: Mon, 30 Jan 2017 17:42:07 +0100
Message-ID: <87h94gqz34.fsf@fencepost.gnu.org>
References: <874m0gd3z4.fsf@gnu.org> <87wpdc8rx7.fsf@elektro.pacujo.net> <87poj4r04c.fsf@fencepost.gnu.org> <87k29c8q3b.fsf@elektro.pacujo.net>
To: guile-user@gnu.org

Marko Rauhamaa writes:

> David Kastrup:
>
>> Marko Rauhamaa writes:
>>> ludo@gnu.org (Ludovic Courtès):
>>>> Guile assumes its command-line arguments are UTF-8-encoded and
>>>> decodes them accordingly.
>>>
>>> I'm afraid that choice (which Python made, as well) was a bad one
>>> because Linux doesn't guarantee UTF-8 purity.
>>
>> Have you looked at the error messages? They are all perfect UTF-8. As
>> was the command line locale.
>
> I was responding to Ludovic.
>
>> Apparently, Guile can open the file just fine, and it sees the command
>> line just fine as encoded in utf-8.
>
> My problem is when it is not valid UTF-8.
>
>> So I really, really, really suggest that before people post their
>> theories that they actually bother cross-checking them with Guile.
>
> Well, execute these commands from bash:
>
>    $ touch $'\xee'
>    $ touch xyz
>    $ ls -a
>    . .. ''$'\356' xyz

We are not talking about file names not encoded in UTF-8.
It is well known that Guile is unable to work with strings in UTF-8
encoding when their byte pattern is not valid UTF-8. This is a red
herring. The problem is not that Guile is unable to deal with badly
encoded UTF-8 file names. The problem is that Guile is unable to deal
with properly encoded UTF-8 file names when it is supposed to execute
them from the command line.

> Then, execute this guile program:
>
> ========================================================================
> (let ((dir (opendir ".")))
>   (let loop ()
>     (let ((filename (readdir dir)))
>       (if (not (eof-object? filename))
>           (begin
>             (if (access? filename R_OK)
>                 (format #t "~s\n" filename))
>             (loop))))))
> ========================================================================
>
> It outputs:
>
>    ".."
>    "."
>    "xyz"
>
> skipping a file. This is a security risk. Files like these appear easily
> when extracting zip files, for example.

I am surprised this does not just throw a bad encoding exception. But
at any rate, this cannot easily be fixed since Guile uses libraries for
encoding/decoding that cannot deal reproducibly with improper byte
patterns.

The problem here is that Guile cannot even deal with _properly_ encoded
UTF-8 file names on the command line.

-- 
David Kastrup
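
The two cases being argued about in this thread can be told apart from a shell. The following is only a rough sketch: `is_valid_utf8` is a hypothetical helper, not anything Guile provides, and the Chinese file name is made up for illustration since the original report's name is not quoted here. It assumes an iconv(1) that exits non-zero on malformed input.

```shell
#!/bin/sh
# Hypothetical helper (not part of Guile): succeeds when its argument is a
# valid UTF-8 byte sequence. Assumes iconv(1) fails on malformed input.
is_valid_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# A made-up Chinese file name (U+4E2D U+6587 followed by ".scm"),
# spelled as octal escapes: properly encoded UTF-8.
good=$(printf '\344\270\255\346\226\207.scm')

# The single byte 0xEE from the quoted `touch $'\xee'` example:
# a lead byte with no continuation bytes, hence not valid UTF-8.
bad=$(printf '\356')

for name in "$good" "$bad"; do
    if is_valid_utf8 "$name"; then
        echo "valid UTF-8"
    else
        echo "not valid UTF-8"
    fi
done
```

Only the first kind of name is at issue for the command-line problem; the second is the separate, already known limitation.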