unofficial mirror of guile-user@gnu.org 
From: Mark H Weaver <mhw@netris.org>
To: Jean Abou Samra <jean@abou-samra.fr>
Cc: guile-user@gnu.org
Subject: Re: UTF16 encoding adds BOM everywhere?
Date: Wed, 20 Jul 2022 16:42:53 -0400
Message-ID: <87pmhz347r.fsf@netris.org>
In-Reply-To: <63f03f91-58d8-c247-2d2a-78848c2e5ca9@abou-samra.fr>

Hi,

Jean Abou Samra <jean@abou-samra.fr> wrote:

> With this code:
> 
> (let ((p (open-output-file "x.txt")))
>    (set-port-encoding! p "UTF16")
>    (display "ABC" p)
>    (close-port p))
> 
> the sequence of bytes in the output file x.txt is
> 
> ['FF', 'FE', '41', '0', 'FF', 'FE', '42', '0', 'FF', 'FE', '43', '0']
> 
> FF FE is a little-endian Byte Order Mark (BOM), fine.
> But why is Guile adding it before every character
> instead of just at the beginning of the string?
> Is that expected?

No, this is certainly a bug.  It sounds like the
'at_stream_start_for_bom_write' port flag is not being cleared, as it
should be, after the first character is written.  I suspect that it
worked correctly when I first implemented proper BOM handling in 2013
(commit cdd3d6c9f423d5b95f05193fe3c27d50b56957e9), but the ports code
has seen some major reworking since then.  I guess that BOM handling was
broken somewhere along the way.

I would suggest filing a bug report.  I don't have time to look into it,
sorry.  I don't work on Guile anymore.  I only happened to see your
message by chance.
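
As a possible stopgap until the bug is fixed, here is a minimal sketch
that sidesteps the port's text encoder entirely: it encodes the string
to UTF-16LE bytes and writes a single BOM by hand through a binary
port.  The procedure name write-utf16le-with-bom is hypothetical, and
the sketch assumes little-endian output is what is wanted; it has not
been checked against the affected Guile versions.

```scheme
(use-modules (rnrs bytevectors)     ; string->utf16, endianness
             (ice-9 binary-ports))  ; put-bytevector

;; Hypothetical helper: write STR to FILENAME as UTF-16LE,
;; preceded by exactly one little-endian BOM (FF FE).
(define (write-utf16le-with-bom str filename)
  (call-with-output-file filename
    (lambda (port)
      ;; Emit the BOM once, as raw bytes.
      (put-bytevector port #vu8(#xFF #xFE))
      ;; Encode the whole string ourselves, so the port's
      ;; BOM logic is never involved.
      (put-bytevector port (string->utf16 str (endianness little))))
    #:binary #t))

(write-utf16le-with-bom "ABC" "x.txt")
```

With the call above, x.txt should contain the bytes
FF FE 41 00 42 00 43 00, i.e. one BOM followed by the three
characters.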

     Regards,
       Mark

-- 
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>.



Thread overview: 3+ messages
2022-07-14 19:16 UTF16 encoding adds BOM everywhere? Jean Abou Samra
2022-07-20 20:42 ` Mark H Weaver [this message]
2022-07-20 21:45   ` Jean Abou Samra
