From: Christopher Lam
Newsgroups: gmane.lisp.guile.user
Subject: Re: string-ports issue on Windows
Date: Tue, 16 Apr 2019 23:26:49 +0000
References: <877ebt7tc0.fsf@netris.org>
In-Reply-To: <877ebt7tc0.fsf@netris.org>
To: Mark H Weaver
Cc: guile-user

Thank you Mark,

The problem is rather obscure and may have been fixed in 2.2.

I've taken the reins of handling the Guile code in GnuCash. For various
reasons I can't fathom, the Windows build includes Guile 2.0.14 rather
than Guile 2.2. I've checked NEWS and there was a change in SRFI-6
string ports to make them Unicode-capable in 2.0.6.

Bear in mind that the majority of the string code in GnuCash handles
Unicode just fine. However, some currencies, e.g. TRY
https://en.wikipedia.org/wiki/Turkish_lira, need extended Unicode and
are misprinted as ? in the reports.

I've delved into this and found only 2 offending functions:
(format #f "~a bla" str) and (with-output-to-string), as described
above. After much experimentation I can fix it by changing (format) to
(string-append), and by changing (with-output-to-string) to an SRFI-6
string port (open-output-string) and importing srfi-6 as described in
the original post. These changes fix the TRY symbol display. Hence my
suspicion that string ports on Windows are munging Unicode.

To try to elucidate this I've also tried removing (setlocale LC_ALL "")
and dumping (locale-encoding), which is "CP1252". There are also other
places where UTF-8 is being interpreted as CP1252, but these are outside
the scope of this post.

So, I'm rather late to this game (I started diving into Scheme 18 months
ago) and have probably missed many controversial changes in the past
years, but the issue above seems weird to me: why is the Windows port
munging Unicode? :)

On Tue, 16 Apr 2019 at 17:29, Mark H Weaver wrote:

> Hi Christopher,
>
> Christopher Lam writes:
>
> > I'm struggling with string-ports on Windows.
> >
> > Last para of
> > https://www.gnu.org/software/guile/manual/html_node/String-Ports.html
> > "With string ports, the port-encoding is treated differently than other
> > types of ports. When string ports are created, they do not inherit a
> > character encoding from the current locale. They are given a default
> > locale that allows them to handle all valid string characters."
> >
> > This causes a string-sanitize function to not run correctly in Windows.
> > (locale-encoding) says "CP1252" no matter what LANG or setlocale I try.
> >
> > The use case is to sanitize string for html, but on Windows it munges
> > extended-unicode.
>
> Can you explain more fully what the problem is? I know a fair amount
> about Unicode, but my knowledge of Windows is extremely weak.
>
> What exactly is "extended-unicode" in this context? References welcome.
>
> Thanks,
> Mark
>
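
For reference, the two substitutions described above can be sketched
roughly as follows. This is only an illustration, not the actual GnuCash
code: the procedure names (label-with-format, label-with-append,
render-with-thunk, render-with-port) are made up for the example, and it
assumes the replacement string port is an SRFI-6 output string port
created with open-output-string. U+20BA is the Turkish lira sign.

```scheme
;; Hypothetical sketch of the two workarounds; names are illustrative only.
(use-modules (srfi srfi-6))   ; SRFI-6 basic string ports

;; A test string containing a character outside CP1252 (U+20BA).
(define lira (string (integer->char #x20BA)))

;; Original style: build the string through (format #f ...).
(define (label-with-format str)
  (format #f "~a bla" str))

;; Workaround 1: plain string concatenation, no port involved.
(define (label-with-append str)
  (string-append str " bla"))

;; Original style: capture output written to the current output port.
(define (render-with-thunk str)
  (with-output-to-string
    (lambda ()
      (display str)
      (display " bla"))))

;; Workaround 2: write to an explicit SRFI-6 output string port.
(define (render-with-port str)
  (let ((port (open-output-string)))
    (display str port)
    (display " bla" port)
    (get-output-string port)))
```

If string ports behave as the manual describes, all four procedures
should return the same string for lira; comparing their results with
string=? on the Windows build would show which of them loses the
character.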