From: "Stephen J. Turnbull" <stephen@xemacs.org>
To: David Kastrup <dak@gnu.org>
Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
Subject: Re: utf-16le vs utf-16-le
Date: Tue, 15 Apr 2008 03:25:51 +0900
Message-ID: <87lk3gfg40.fsf@uwakimon.sk.tsukuba.ac.jp>
In-Reply-To: <851w58q24a.fsf@lola.goethe.zz>
David Kastrup writes:
> "Stephen J. Turnbull" <stephen@xemacs.org> writes:
> > I don't know, in fact I think [having BOM-specific coding
> > systems is] a bad idea. That's what the part of my message that
> > you snipped was saying. But I'll have to defer to Handa-san on
> > that.
>
> I think it obvious: if a BOM mark gets detected on read, one wants
> to have it removed from the buffer and reinserted on saving the
> buffer.
I agree, as you state it, it's obvious. My question is "why does that
need to be part of the coding system?" At present the UTF-16 and
UTF-32 Unicode coding systems (in the abstract) have *twenty-seven*
variants each (BOM-required, BOM-prohibited, BOM-autodetected X be,
le, system-dependent X CR, LF, CRLF), and UTF-8 needs *nine*. This is
nuts, from a user-education standpoint.
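To make the arithmetic concrete, here is a small Python sketch that enumerates the combinations listed above. The label strings are just illustrative names for the three independent choices, not actual coding-system names in Emacs or XEmacs.

```python
from itertools import product

# Illustrative labels only; these are not real coding-system names.
SIGNATURE = ["BOM-required", "BOM-prohibited", "BOM-autodetected"]
BYTE_ORDER = ["be", "le", "system-dependent"]
EOL = ["CR", "LF", "CRLF"]

# UTF-16 and UTF-32: signature x byte order x EOL convention.
utf16_variants = list(product(SIGNATURE, BYTE_ORDER, EOL))

# UTF-8 has no byte-order choice: signature x EOL convention.
utf8_variants = list(product(SIGNATURE, EOL))

print(len(utf16_variants))  # 3 * 3 * 3
print(len(utf8_variants))   # 3 * 3
```

Each list entry would be a separately named coding system if every combination were exposed to the user.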
What I proposed was a more generic concept where use of signatures and
the EOL convention would (at least to the user) appear as buffer-local
variables.
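As a rough sketch of that idea, in Python rather than Lisp: the signature and EOL choices are kept as separate per-buffer settings and applied around one base codec at save time. The class, field, and function names are hypothetical and do not correspond to any existing Emacs or XEmacs variables.

```python
from dataclasses import dataclass

EOLS = {"lf": "\n", "crlf": "\r\n", "cr": "\r"}

@dataclass
class BufferFileSettings:
    # Hypothetical stand-ins for buffer-local variables.
    codec: str = "utf-16-le"   # base encoding; byte order is part of the name here
    signature: bool = False    # write a BOM when saving?
    eol: str = "lf"            # key into EOLS

def encode_for_save(text: str, settings: BufferFileSettings) -> bytes:
    """Apply the EOL convention and optional signature, then encode."""
    body = text.replace("\n", EOLS[settings.eol])
    if settings.signature:
        # U+FEFF encoded in the target codec is that codec's BOM.
        body = "\ufeff" + body
    return body.encode(settings.codec)

settings = BufferFileSettings(codec="utf-16-le", signature=True, eol="crlf")
saved = encode_for_save("a\nb", settings)
```

With this split the user sees one encoding plus two orthogonal switches, instead of one name per combination.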
> I am just not sure what the semantics for recoding/encoding/decoding
> regions are. They should not mess with BOM in any case, I would
> suppose. But then reading a file is not equivalent to reading it
> literally in unibyte mode and then decoding the buffer-region.
That's correct. The thing is, processing the BOM is a question of
*initialization* of a stream.
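Python's standard codecs happen to illustrate the point, so here is a short sketch using them (an analogy only; this is not how Emacs implements decoding). The "utf-16" codec looks for a signature once, when the stream starts, while "utf-16-le" does no signature processing at all.

```python
import codecs

data = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")

# "utf-16" checks for a signature at the start of the stream and strips it.
auto = data.decode("utf-16")

# "utf-16-le" treats the same two bytes as an ordinary U+FEFF character.
raw = data.decode("utf-16-le")

# An incremental decoder makes the initialization step explicit: the
# signature is consumed on the first chunk only.
dec = codecs.getincrementaldecoder("utf-16")()
first = dec.decode(data[:4])                       # BOM plus 'h'
middle = dec.decode(codecs.BOM_UTF16_LE)           # same bytes, mid-stream
rest = dec.decode(data[4:], final=True)            # 'i'
```

Here `auto` has no U+FEFF, `raw` starts with one, and `middle` is a U+FEFF character because the stream was already initialized when those bytes arrived.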
> Maybe there never was such an equivalence (can't be for shift codes, can
> it?).
In my view, there cannot be an equivalence. An Emacs buffer in
unibyte mode is a *different* stream from the file it was read from,
and the decision about BOM processing will have to be made differently
from the way the decision is made at the time of reading from the
file. You could add yet another option for BOM mode, namely "if this
stream is an Emacs buffer that is visiting a file in unibyte mode, then
do BOM processing on conversion as if you were reading in the file in
multibyte mode." I don't much like this....