unofficial mirror of bug-gnu-emacs@gnu.org 
From: Eli Zaretskii <eliz@gnu.org>
To: Vincent Lefevre <vincent@vinc17.net>
Cc: a.s@realize.ch, monnier@iro.umontreal.ca, 20623@debbugs.gnu.org,
	sledergerber@gmx.net
Subject: bug#20623: XML and HTML files with encoding/charset="utf-8" declaration loose BOM; Coding system is reset from utf-8-with-signature to utf-8 on save
Date: Sat, 11 Aug 2018 19:27:33 +0300
Message-ID: <83pnyolvgq.fsf@gnu.org>
In-Reply-To: <20180811154101.GB4800@zira.vinc17.org> (message from Vincent Lefevre on Sat, 11 Aug 2018 17:41:01 +0200)

> Date: Sat, 11 Aug 2018 17:41:01 +0200
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: monnier@iro.umontreal.ca, rgm@gnu.org, sledergerber@gmx.net,
> 	a.s@realize.ch, 20623@debbugs.gnu.org
> 
> > > You're completely wrong. The presence of BOM or not is very important
> > > for some applications, such as Firefox (not to determine the charset,
> > > but the MIME type of local files).
> > 
> > Please provide the details, including the use case, if possible.  I'm
> > still in the dark regarding the importance of the BOM in UTF-8 encoded
> > HTML stuff.
> 
>   https://bugzilla.mozilla.org/show_bug.cgi?id=1422889
> 
> for HTML. Wontfix because of:
> 
>   https://mimesniff.spec.whatwg.org/#mime-type-sniffing-algorithm
> 
> For text/plain only (but this is another example that BOM can matter
> in practice), there's
> 
>   https://bugzilla.mozilla.org/show_bug.cgi?id=1071816
> 
> (which is a bug that should be fixed).

Maybe I'm missing something, but none of these issues describes the
situation in this bug report, namely: an HTML file with an explicit
charset= tag, with or without a BOM.  In fact, the first of these
issues happens only in files that _do_ have a BOM, so you could say
that Emacs did you a favor by removing it ;-)
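
To make "with or without a BOM" concrete, here is a minimal Python sketch (not Emacs code, and not taken from the bug report) of what the signature is at the byte level. The sample markup and the helper name has_utf8_bom are made up for illustration; Python's "utf-8-sig" codec is used here only as a rough analogue of Emacs's utf-8-with-signature.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

# Hypothetical HTML content with an explicit charset declaration.
html = '<meta charset="utf-8"><p>h\u00e9llo</p>'

# "utf-8-sig" prepends the BOM on encoding; plain "utf-8" does not.
with_bom = html.encode("utf-8-sig")
without_bom = html.encode("utf-8")

print(has_utf8_bom(with_bom))      # the signature is present
print(has_utf8_bom(without_bom))   # the signature is absent

# The two encodings differ only by the three signature bytes.
print(with_bom[len(codecs.BOM_UTF8):] == without_bom)
```

The text decodes to the same characters either way; only the three leading bytes differ, which is the information at issue when a file is saved as utf-8 instead of utf-8-with-signature.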

> > I agree about the user not knowing, but that doesn't yet qualify as
> > "data loss", which has a widely accepted meaning.
> 
> This is data corruption, which is a form of data loss, because some
> information is lost in the process (I recall that Emacs does not
> provide any information to the user about this transformation).

That is the most inclusive interpretation of "data loss" I've ever
seen.  "Some information is lost" is nowhere near what "grave bug"
means by "data loss", so I don't think "grave" applies here.

Anyway, the Emacs issue is now fixed.