unofficial mirror of bug-guile@gnu.org 
From: Mark H Weaver <mhw@netris.org>
To: Tom de Vries <tdevries@suse.de>
Cc: 33044@debbugs.gnu.org
Subject: bug#33044: Guile misbehaves in the "ja_JP.sjis" locale
Date: Tue, 16 Oct 2018 01:13:43 -0400
Message-ID: <87tvlmo4mw.fsf@netris.org>
In-Reply-To: <87y3ayodqp.fsf_-_@netris.org> (Mark H. Weaver's message of "Mon,  15 Oct 2018 21:57:02 -0400")

Mark H Weaver <mhw@netris.org> writes:

> Shift_JIS is _mostly_ ASCII-compatible, except that code points 0x5C and
> 0x7E, which represent backslash (\) and tilde (~) in ASCII, are mapped
> to the Yen sign (¥) and overline (‾) in Shift_JIS.  Backslash (\) and
> tilde (~) are multibyte characters in Shift_JIS.

Although I wrote above that "Backslash (\) and tilde (~) are multibyte
characters in Shift_JIS", that was admittedly my assumption, based on
the absence of those characters in the "First byte" map shown here:

  https://en.wikipedia.org/wiki/Shift_JIS#As_defined_in_JIS_X_0208:1997

However, now I'm unsure.  I've spent some time attempting to find the
Shift_JIS encodings for backslash and tilde, but I've not yet found an
answer.

I've asked Emacs 26 to write a file containing backslashes and Yen signs
using the "shift_jis" encoding, and both characters seem to be mapped to
the same code: 0x5C.

I've also used the 'iconv' utility from GNU libc to convert backslashes
and Yen signs to Shift_JIS, and it also maps these two characters to the
same code, 0x5C:

--8<---------------cut here---------------start------------->8---
mhw@jojen ~$ echo '\\¥¥' | iconv -f UTF-8 -t SHIFT-JIS > Shift_JIS_test.txt
mhw@jojen ~$ hexdump -C Shift_JIS_test.txt
00000000  5c 5c 5c 5c 0a                                    |\\\\.|
00000005
--8<---------------cut here---------------end--------------->8---
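
The same comparison can be repeated without iconv.  Here is a minimal
sketch in Python, assuming CPython's built-in "shift_jis" codec (an
independent implementation, not one of the tools used above).  The
helper name sjis_hex is made up for illustration.  It prints the encoded
bytes for backslash, Yen sign, tilde and overline so they can be
compared side by side:

```python
def sjis_hex(text, codec="shift_jis"):
    """Encode `text` with `codec` and return the bytes as hex, e.g. '5c 5c'."""
    return " ".join(f"{byte:02x}" for byte in text.encode(codec))


def main():
    # The two ASCII characters in question and their JIS X 0201 counterparts.
    samples = [
        ("backslash", "\\"),
        ("yen sign", "\u00a5"),
        ("tilde", "~"),
        ("overline", "\u203e"),
    ]
    for name, char in samples:
        try:
            print(f"{name:10} U+{ord(char):04X} -> {sjis_hex(char)}")
        except UnicodeEncodeError:
            # A stricter codec may refuse a character instead of folding it.
            print(f"{name:10} U+{ord(char):04X} -> not encodable")


if __name__ == "__main__":
    main()
```

If two distinct characters print the same bytes, the encoding cannot
round-trip them, which is the property in question here.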

While investigating, I found this bug report asking GNU libc to add an
SJIS locale, which the developers strongly opposed:

  https://bugzilla.redhat.com/show_bug.cgi?id=136290

At this point, I'm inclined to believe that Shift_JIS is not suitable as
a locale encoding on POSIX systems, and that we should not try to
support it in Guile.

What do you think?

Can you tell me how backslash and tilde are represented in Shift_JIS?

     Regards,
       Mark
