From: Simon Ledergerber
Newsgroups: gmane.emacs.bugs
Subject: bug#20623: XML and HTML files with encoding/charset="utf-8" declaration loose BOM; Coding system is reset from utf-8-with-signature to utf-8 on save
Date: Sat, 23 May 2015 19:11:15 +0200
Message-ID: <0LpKKr-1Zb1Pi3ZTR-00fE69@mail.gmx.com>
References: <555E2912.7060509@gmx.net> <83iobl67ao.fsf@gnu.org> <83iobk4oqm.fsf@gnu.org> <83mw0v3i9v.fsf@gnu.org>
In-Reply-To: <83mw0v3i9v.fsf@gnu.org>
To: Eli Zaretskii, Stefan Monnier
Cc: 20623@debbugs.gnu.org

As already mentioned in my last post, even when I started Emacs with the
-Q option, which should bypass my customizations, it made no difference.
So the source of the problem must be somewhere else.

-----Original Message-----
From: "Eli Zaretskii"
Sent: 23.05.2015 08:44
To: "Stefan Monnier"
Cc: "sledergerber@gmx.net"; "20623@debbugs.gnu.org" <20623@debbugs.gnu.org>
Subject: Re: bug#20623: XML and HTML files with encoding/charset="utf-8" declaration loose BOM; Coding system is reset from utf-8-with-signature to utf-8 on save

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: sledergerber@gmx.net, 20623@debbugs.gnu.org
> Date: Fri, 22 May 2015 17:51:07 -0400
>
> >> > What would you expect Emacs to do instead?  It just obeys the stated
> >> > encoding, which says nothing about the BOM.  How can Emacs know when
> >> > to use utf-8 and when utf-8-with-signature?
> >> To the extent that Emacs has seen the BOM when opening the file, it
> >> would make sense for Emacs to try and preserve this detail.
> > It does.
>
> While there are cases where it does, this bug report is about a case
> where it doesn't, IIUC.

AFAIU, that happened because the user has this in ~/.emacs:

  (setq-default buffer-file-coding-system 'utf-8-dos)

IMO, this bad customization should be removed, and then the problem
will go away.
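
For anyone trying to reproduce the behavior discussed in this thread, the following is a minimal diagnostic sketch, not a fix. It assumes you evaluate it (for example with M-:) in the buffer visiting the affected XML or HTML file, ideally in a session started with `emacs -Q`. The helper name `my-report-coding-system` is hypothetical and not part of Emacs; `buffer-file-coding-system`, `after-save-hook`, and `add-hook` are standard.

```elisp
;; Sketch only: a hypothetical helper for observing the coding system
;; that Emacs associates with the current buffer.
(defun my-report-coding-system ()
  "Echo the coding system that will be used to save the current buffer."
  (interactive)
  (message "buffer-file-coding-system: %S" buffer-file-coding-system))

;; Report the value now, before saving.
(my-report-coding-system)

;; Report it again after every save, in this buffer only (LOCAL = t),
;; so that a change from utf-8-with-signature to utf-8 becomes visible.
(add-hook 'after-save-hook #'my-report-coding-system nil t)
```

Comparing the value echoed before and after C-x C-s shows whether the coding system changed on save, which helps to separate the effect of a `setq-default` customization in ~/.emacs from the behavior seen under `emacs -Q`.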