unofficial mirror of bug-gnu-emacs@gnu.org 
From: Eli Zaretskii <eliz@gnu.org>
To: npostavs@gmail.com
Cc: jszabo_98@hotmail.com, 35766@debbugs.gnu.org
Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be
Date: Sat, 18 May 2019 10:26:09 +0300
Message-ID: <83ef4w9qb2.fsf@gnu.org>
In-Reply-To: <85a7fldp15.fsf@gmail.com> (npostavs@gmail.com)

merge 8282 35766
close 35766
thanks

> From: npostavs@gmail.com
> Cc: Noam Postavsky <npostavs@gmail.com>,  jszabo_98@hotmail.com,  35766@debbugs.gnu.org
> Date: Fri, 17 May 2019 12:27:50 -0400
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Perhaps we should by default produce encoding with BOM when XML header
> > specifies UTF-16?
> 
> I think yes, https://www.w3.org/TR/xml/#charencoding says
> 
>     Entities encoded in UTF-16 MUST [...] begin with the Byte Order Mark

OK, I did that as well, and pushed the changes to master.
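
As a rough byte-level illustration of what "encoding with BOM" means
here, the following Python sketch prepends the byte order mark to
UTF-16 output.  It is not the Lisp change that went to master; the
helper name encode_xml_utf16 and the sample document are hypothetical.

```python
import codecs


def encode_xml_utf16(text: str, little_endian: bool = True) -> bytes:
    """Encode TEXT as UTF-16, always starting with a byte order mark.

    Hypothetical helper for illustration.  The endian-specific codecs
    ("utf-16-le", "utf-16-be") do not emit a BOM by themselves, so it
    is prepended explicitly.
    """
    if little_endian:
        return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")


doc = '<?xml version="1.0" encoding="UTF-16"?><a/>'
data = encode_xml_utf16(doc)

# The generic "utf-16" decoder reads the BOM to pick the byte order,
# so the text round-trips regardless of which endianness was written.
assert data.decode("utf-16") == doc
```

Without the BOM, a reader has to guess the byte order, which is how a
little-endian file can end up being treated as big-endian.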

> By the way, is Bug#8282 the same as this one, or just closely related?

It's the same problem; merged the bugs.

> It's talking about sgml-html-meta-auto-coding-function (though maybe
> sgml-xml-auto-coding-function is more relevant).  I'm getting a little
> confused between all the different *-find/auto-coding-* functions.

The function relevant for the recipe in bug#8282 is
sgml-xml-auto-coding-function, which is where I made the changes.  If
the HTML and/or SGML specs also mandate that we use BOM, then maybe we
need the same changes in sgml-html-meta-auto-coding-function as well.
Note that there's no equivalent of xml-find-file-coding-system for
non-XML files, so recognition of visited UTF-16 HTML files will not
work even if they do have a BOM.
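
For readers unfamiliar with BOM-based recognition, here is a minimal
Python sketch of the idea: look at the first two bytes of the file and
infer the byte order from them.  It only illustrates the principle and
says nothing about how the Emacs functions are implemented; the name
sniff_utf16_bom is hypothetical.

```python
import codecs
from typing import Optional


def sniff_utf16_bom(head: bytes) -> Optional[str]:
    """Return the UTF-16 byte order implied by a leading BOM, if any.

    Hypothetical helper for illustration.  HEAD is the start of the
    file.  Caveat: a UTF-32 LE BOM (ff fe 00 00) also begins with the
    UTF-16 LE BOM, and this sketch does not tell the two apart.
    """
    if head.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if head.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return None  # no UTF-16 BOM; the caller must decide some other way
```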

> There is also nxml-set-auto-coding which seems to be mostly unused.

It is supposed to be used by packages that build on top of nXML,
AFAIU.

Thanks.




