unofficial mirror of emacs-devel@gnu.org 
From: "Stephen J. Turnbull" <stephen@xemacs.org>
To: David Kastrup <dak@gnu.org>
Cc: emacs-pretest-bug@gnu.org,
	Patrick Drechsler <patrick@pdrechsler.de>,
	Miles Bader <miles@gnu.org>
Subject: Re: 23.0.60; [nxml] BOM and utf-8
Date: Mon, 19 May 2008 12:05:51 +0900
Message-ID: <87fxsff0xc.fsf@uwakimon.sk.tsukuba.ac.jp>
In-Reply-To: <85y768ug6x.fsf@lola.goethe.zz>

David Kastrup writes:

 > It would be sufficient to use an encoding variation which adds the bom
 > back on writing.
 > 
 > I am actually surprised that this is not done right now: I thought we
 > had a discussion about having the BOM-encodings early in the automatic
 > encoding detections.

IIRC this is an issue recently reported by Eli, discussed, and ISTR
already fixed by Handa-san.  Don't have time to dig it up though.
Something about -le vs -littleendian.

 > > Alternatively, sabotage the Microsoft users by silently eating the BOM
 > > on the way in, and writing the file in GNU substandard[1] format on
 > > the way out.
 > 
 > Emacs developers are not nonchalant about having Emacs write a byte
 > sequence differing from what it read in

OK, I should always use smileys on this list, my bad.  The main point
was to get in the "substandard" joke, YHBT HAND.  And I was
recommending that for Miles's benefit, not as an Emacs default.

In any case, maintaining faithfulness of representation is simply not
possible, as you point out (safe-character-sets or whatever you call
your analog to latin-unity being another case).  It's also not at all
obvious that that is a very useful requirement when dealing with a
character-oriented standard like Unicode or XML, since you can expect
many applications to canonicalize the text "behind your back".

Users should get used to it, and we should document how to force Emacs
to signal an error rather than change anything behind their backs, for
those who need binary faithfulness rather than text faithfulness.
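
To make the distinction concrete, here is a minimal sketch in Python
(not Emacs code; the function names read_text, write_text and
roundtrip_strict are hypothetical and chosen only for illustration).
Text faithfulness means the decoded characters survive the round trip;
binary faithfulness means the bytes on disk do, which requires either
remembering whether a BOM was present or refusing to write when the
output would differ.

```python
import codecs
from typing import Tuple


def read_text(raw: bytes) -> Tuple[str, bool]:
    """Decode UTF-8 and report whether a leading BOM was present."""
    had_bom = raw.startswith(codecs.BOM_UTF8)
    # "utf-8-sig" strips one leading BOM if there is one.
    return raw.decode("utf-8-sig"), had_bom


def write_text(text: str, with_bom: bool) -> bytes:
    """Encode as UTF-8, adding the BOM back only when asked to."""
    return text.encode("utf-8-sig" if with_bom else "utf-8")


def roundtrip_strict(raw: bytes, with_bom: bool) -> bytes:
    """Raise an error instead of silently changing the bytes."""
    text, _ = read_text(raw)
    out = write_text(text, with_bom)
    if out != raw:
        raise ValueError("round trip would change the bytes on disk")
    return out


raw = codecs.BOM_UTF8 + b"<doc/>"
text, had_bom = read_text(raw)
# Text-faithful either way; binary-faithful only if the BOM is restored.
same_bytes = write_text(text, had_bom) == raw
```

Passing with_bom=False to roundtrip_strict for the same input raises
ValueError, which is the "error rather than change anything" behaviour
in miniature.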




