unofficial mirror of guile-user@gnu.org 
From: Christopher Lam <christopher.lck@gmail.com>
To: Mark H Weaver <mhw@netris.org>
Cc: guile-user <guile-user@gnu.org>
Subject: Re: string-ports issue on Windows
Date: Tue, 16 Apr 2019 23:26:49 +0000
Message-ID: <CAKVAZZLTL1+AJCkQ0t8VbdS=39BxLG5=T7TD+C0_2_7-kPuY0g@mail.gmail.com>
In-Reply-To: <877ebt7tc0.fsf@netris.org>

Thank you Mark

The problem is rather obscure and may have been fixed in 2.2.

I've taken the reins of handling the Guile code in GnuCash. For various
reasons I can't fathom, the Windows build includes Guile 2.0.14 rather than
Guile 2.2. I've checked NEWS, and there was a change to SRFI-6 string ports
to make them Unicode-capable in 2.0.6.

Bear in mind that the majority of the string code in GnuCash handles Unicode
just fine. However, some currencies, e.g. TRY
(https://en.wikipedia.org/wiki/Turkish_lira), need extended Unicode and are
misprinted as ? in the reports.

I've delved down and figure there are only 2 offending functions: (format
#f "~a bla" str) and (with-output-to-string), as described above. After much
experimentation I can fix this by changing (format) to (string-append) and
changing (with-output-to-string) to (open-string-port), importing srfi-6
as described in the original post; these fix the TRY symbol display. Hence
my suspicion that string ports on Windows are munging Unicode. To try to
elucidate this I've also tried removing (setlocale LC_ALL "") and dumping
(locale-encoding), which is "CP1252".
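
A minimal sketch of the comparison described above, for anyone who wants to
reproduce it. It is not the GnuCash code: the helper names are hypothetical,
U+20BA (the lira sign) is assumed to be the affected character, and the
SRFI-6 procedure is assumed to be open-output-string. Each variant prints
the code points of its result, so a munged character shows up as 63 (?)
instead of 8378.

```scheme
;; Illustrative only: helper names are hypothetical, not from GnuCash.
(use-modules (srfi srfi-6))   ; string ports (also provided by Guile core)

;; Assumed test character: U+20BA TURKISH LIRA SIGN.
(define lira (string (integer->char #x20BA)))

;; Variant reported to misprint on the Windows build: (format #f ...)
;; collects its output through a string port.
(define (label-with-format str)
  (format #f "~a bla" str))

;; Workaround 1: build the string directly, with no port involved.
(define (label-with-append str)
  (string-append str " bla"))

;; Workaround 2: an explicit SRFI-6 output string port.
(define (label-with-port str)
  (let ((port (open-output-string)))
    (display str port)
    (display " bla" port)
    (get-output-string port)))

;; Print the code points each variant produces, one list per line.
(define (show-code-points proc)
  (write (map char->integer (string->list (proc lira))))
  (newline))

(for-each show-code-points
          (list label-with-format label-with-append label-with-port))
```

If all three lines of output match, the string ports are not at fault on
that build; a differing first line would point at (format #f ...).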

There are also other bits where UTF8 is being interpreted as CP1252 but
these are outside the scope of this post.

So, I'm rather late to this game (I started diving into Scheme 18 months
ago) and have probably missed many controversial changes in the past years,
but the issue above seems weird to me: why is the Windows port munging
Unicode? :)

On Tue, 16 Apr 2019 at 17:29, Mark H Weaver <mhw@netris.org> wrote:

> Hi Christopher,
>
> Christopher Lam <christopher.lck@gmail.com> writes:
>
> > I'm struggling with string-ports on Windows.
> >
> > Last para of
> > https://www.gnu.org/software/guile/manual/html_node/String-Ports.html
> > "With string ports, the port-encoding is treated differently than other
> > types of ports. When string ports are created, they do not inherit a
> > character encoding from the current locale. They are given a default
> > locale that allows them to handle all valid string characters."
> >
> > This causes a string-sanitize function to not run correctly in Windows.
> > (locale-encoding) says "CP1252" no matter what LANG or setlocale I try.
> >
> > The use case is to sanitize string for html, but on Windows it munges
> > extended-unicode.
>
> Can you explain more fully what the problem is?  I know a fair amount
> about Unicode, but my knowledge of Windows is extremely weak.
>
> What exactly is "extended-unicode" in this context?  References welcome.
>
>       Thanks,
>         Mark
>



Thread overview: 14+ messages
2019-04-16  4:13 string-ports issue on Windows Christopher Lam
2019-04-16 14:34 ` Eli Zaretskii
2019-04-16 17:15   ` Mark H Weaver
2019-04-16 17:28 ` Mark H Weaver
2019-04-16 23:26   ` Christopher Lam [this message]
2019-04-17 19:30     ` Mark H Weaver
2019-04-18 16:22       ` Christopher Lam
2019-04-18 18:51         ` Eli Zaretskii
2019-04-18 19:29         ` Mark H Weaver
2019-04-18 21:18           ` Mark H Weaver
2019-04-19 10:26             ` Christopher Lam
2019-05-14  4:42               ` Christopher Lam
2019-05-26 10:52                 ` Christopher Lam
2019-05-26 20:48                   ` Mark H Weaver
