From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christopher Lam
Newsgroups: gmane.lisp.guile.user
Subject: Re: string-ports issue on Windows
Date: Tue, 14 May 2019 14:42:29 +1000
References: <877ebt7tc0.fsf@netris.org> <87tvew4efa.fsf@netris.org> <875zrb3ydk.fsf@netris.org> <871s1z3tbq.fsf@netris.org>
To: Mark H Weaver
Cc: guile-user

Hi Mark,

Final update - first, we've reused your efficient substring-replace
function in
https://github.com/Gnucash/gnucash/commit/7d15e6e4e727c87fb4a501e924c4ae02276e508d
from a few years ago.

Second, the email thread
https://lists.gnu.org/archive/html/guile-devel/2014-03/msg00060.html
confirmed that a lot of issues in guile-2.0 could be solved on Windows by
upgrading to guile-2.2.

So, GnuCash has now upgraded to guile-2.2 on Windows and the string-ports
are now behaving.

Thank you (twice) :)

On Fri, 19 Apr 2019 at 10:26, Christopher Lam wrote:

> Hi,
> The patch *does* work and handles unicode properly :) There are unintended
> consequences however, whereby other (probably C-based) string-code in
> Windows is now reading the lira-symbol into unexpected chars (eg
> lira-symbol -> "â‚°" i.e. #xe2 #x201a #xba) but this is now outside the
> scope of this post.
> Thank you again!
>
> On Thu, 18 Apr 2019 at 21:20, Mark H Weaver wrote:
>
>> Hi again,
>>
>> Earlier, I wrote:
>>
>> > Christopher Lam writes:
>> >
>> >> Hi Mark
>> >> Thank you so much for looking into this.
>> >> I'm reviewing the GnuCash for Windows package (v3.5 released April 2019)
>> >> which contains the following libraries:
>> >> - guile 2.0.14
>> >
>> > Ah, for some reason I thought you were using Guile 2.2. That explains
>> > the problem.
>> >
>> > In Guile 2.0, string ports internally used the locale encoding by
>> > default, which meant that any characters not supported by the locale
>> > encoding would be munged.
>> >
>> > Guile 2.2 changed the behavior of string ports to always use UTF-8
>> > internally, which ensures that all valid Guile strings can pass through
>> > unmunged.
>> >
>> > So, this problem would almost certainly be fixed by updating to
>> > Guile 2.2.
>>
>> It's probably a good idea to update to Guile 2.2 anyway, but I'd like to
>> also offer the following workaround, which monkey patches the string
>> port procedures in Guile 2.0 to behave more like Guile 2.2.
>>
>> Note that it only patches the Scheme APIs for string ports, and not the
>> underlying C functions. It might be that some code, possibly within
>> Guile itself, creates a string port using the C functions, and such
>> string ports may still munge characters.
>>
>> Anyway, if you want to try it, arrange for GnuCash to evaluate the code
>> below, after initializing Guile.
>>
>> Mark
>>
>>
>> (when (string=? (effective-version) "2.0")
>>   ;; When using Guile 2.0.x, use monkey patching to change the
>>   ;; behavior of string ports to use UTF-8 as the internal encoding.
>>   ;; Note that this is the default behavior in Guile 2.2 or later.
>>   (let* ((mod                     (resolve-module '(guile)))
>>          (orig-open-input-string  (module-ref mod 'open-input-string))
>>          (orig-open-output-string (module-ref mod 'open-output-string))
>>          (orig-object->string     (module-ref mod 'object->string))
>>          (orig-simple-format      (module-ref mod 'simple-format)))
>>
>>     (define (open-input-string str)
>>       (with-fluids ((%default-port-encoding "UTF-8"))
>>         (orig-open-input-string str)))
>>
>>     (define (open-output-string)
>>       (with-fluids ((%default-port-encoding "UTF-8"))
>>         (orig-open-output-string)))
>>
>>     (define (object->string . args)
>>       (with-fluids ((%default-port-encoding "UTF-8"))
>>         (apply orig-object->string args)))
>>
>>     (define (simple-format . args)
>>       (with-fluids ((%default-port-encoding "UTF-8"))
>>         (apply orig-simple-format args)))
>>
>>     (define (call-with-input-string str proc)
>>       (proc (open-input-string str)))
>>
>>     (define (call-with-output-string proc)
>>       (let ((port (open-output-string)))
>>         (proc port)
>>         (get-output-string port)))
>>
>>     (module-set! mod 'open-input-string       open-input-string)
>>     (module-set! mod 'open-output-string      open-output-string)
>>     (module-set! mod 'object->string          object->string)
>>     (module-set! mod 'simple-format           simple-format)
>>     (module-set! mod 'call-with-input-string  call-with-input-string)
>>     (module-set! mod 'call-with-output-string call-with-output-string)
>>
>>     (when (eqv? (module-ref mod 'format) orig-simple-format)
>>       (module-set! mod 'format simple-format))))
>>
>
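
For anyone who wants to check whether string ports are munging
characters in their own build, here is a minimal sketch, not something
taken from the GnuCash sources. It writes the lira sign through a string
port and compares the result with the original string. The names `lira`
and `string-port-round-trip` are made up for this illustration; the
character is written as a hex escape so the check does not depend on the
encoding of the source file.

```scheme
;; Hypothetical helper names, for illustration only.
;; U+20A4 LIRA SIGN, written as a hex escape.
(define lira (string #\x20a4))

;; Write STR to a fresh output string port and return what comes back.
(define (string-port-round-trip str)
  (call-with-output-string
    (lambda (port)
      (display str port))))

;; Report whether the character survived the trip through the port.
(display
 (if (string=? lira (string-port-round-trip lira))
     "string ports preserved the character\n"
     "string ports munged the character\n"))
```

Going by Mark's explanation above, the second message would be expected
on Guile 2.0 only when the locale encoding cannot represent the
character, and the first message on Guile 2.2 or later.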