From: Mark H Weaver
Newsgroups: gmane.lisp.guile.bugs
Subject: bug#11197: problems with string ports and unicode
Date: Wed, 11 Apr 2012 12:08:09 -0400
Message-ID: <87ty0q8d5h.fsf@netris.org>
References: <87ty0sa9tu.fsf@gnu.org>
In-Reply-To: <87ty0sa9tu.fsf@gnu.org> ("Ludovic Courtès"'s message of "Mon, 09 Apr 2012 23:12:29 +0200")
To: ludo@gnu.org (Ludovic Courtès)
Cc: 11197@debbugs.gnu.org, Klaus Stehle

ludo@gnu.org (Ludovic Courtès)
writes:

> It may be that your string ports are created with a non-Unicode-capable
> encoding.  Try something like:
>
>   (define p
>     (with-fluids ((%default-port-encoding "UTF-8"))
>       (open-input-string "čtyři")))

IMO, this should not be needed.  Port encodings should only be relevant
when reading from ports involving byte strings, such as file ports or
socket ports.  The encoding used by Scheme strings is a purely internal
matter; from the user's perspective, Scheme strings are simply a
sequence of Unicode code points.

What _is_ needed is a file coding declaration near the top of the source
file, e.g. "coding: utf-8" (see "Character Encoding of Source Files" in
the manual).  I tried that and it still fails for me.  I think this is a
genuine bug.

    Mark