From: J S
Newsgroups: gmane.emacs.bugs
Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be
Date: Sat, 18 May 2019 20:57:51 +0000
To: Eli Zaretskii
Cc: "35766@debbugs.gnu.org" <35766@debbugs.gnu.org>, "npostavs@gmail.com"
In-Reply-To: <83o9409vj6.fsf@gnu.org>

RFC 2781, section 4.3 ("Interpreting text labelled as UTF-16"), says that if a document is labelled "UTF-16", the application should check the byte order mark to see whether it is little endian or big endian. Only if there is no byte order mark should the document be interpreted as big endian.
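
To make the rule concrete, here is a minimal sketch in Python of how a reader of text labelled "UTF-16" might pick the byte order. The function names `utf16_byte_order` and `decode_labelled_utf16` are hypothetical and only for illustration; this is not how Emacs or .NET implement it.

```python
import codecs


def utf16_byte_order(data: bytes) -> str:
    """Return 'big' or 'little' for text labelled "UTF-16" (RFC 2781, section 4.3)."""
    if data.startswith(codecs.BOM_UTF16_BE):  # first two octets are 0xFE 0xFF
        return "big"
    if data.startswith(codecs.BOM_UTF16_LE):  # first two octets are 0xFF 0xFE
        return "little"
    # No byte order mark: the text should be interpreted as big endian.
    return "big"


def decode_labelled_utf16(data: bytes) -> str:
    """Decode bytes labelled "UTF-16", dropping a leading byte order mark if present."""
    order = utf16_byte_order(data)
    has_bom = data[:2] in (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)
    body = data[2:] if has_bom else data
    return body.decode("utf-16-be" if order == "big" else "utf-16-le")


# Three serializations of the same XML declaration.
xml_decl = '<?xml version="1.0" encoding="UTF-16"?>'
little = codecs.BOM_UTF16_LE + xml_decl.encode("utf-16-le")
big = codecs.BOM_UTF16_BE + xml_decl.encode("utf-16-be")
no_bom = xml_decl.encode("utf-16-be")
```

Under this reading, a little-endian file that starts with a byte order mark is still valid "UTF-16"; big endian is only the fallback when the mark is missing.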


From: Eli Zaretskii <eliz@gnu.org>
Sent: Saturday, May 18, 2019 5:33 AM
To: J S
Cc: npostavs@gmail.com; 35766@debbugs.gnu.org
Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be
 
> From: J S <jszabo_98@hotmail.com>
> CC: "npostavs@gmail.com" <npostavs@gmail.com>, "35766@debbugs.gnu.org"
>        <35766@debbugs.gnu.org>
> Date: Fri, 17 May 2019 20:16:41 +0000
>
> For example, if I save this xml file in emacs, it saves it as utf-16 big endian:
>
> <?xml version="1.0" encoding="UTF-16"?>

This is the Emacs default, which is well documented, and is also
according to what the UTF-16 spec (RFC 2781) says.

> If I do this in powershell (really a .net method), it saves it as utf-16 little endian (osx or windows):

Then PowerShell behaves in violation of RFC 2781.