From: Kenichi Handa <handa@m17n.org>
Cc: emacs-devel@gnu.org, mew-int@mew.org
Subject: [mew-int 01596] Re: windows 1252
Date: Fri, 7 Nov 2003 16:13:45 +0900 (JST)
Message-ID: <200311070713.QAA24793@etlken.m17n.org>
In-Reply-To: <20031104.111334.60445673.kazu@iijlab.net>
I'm sorry for the late response on this thread.
First, I want to clarify these points:
(1) windows-1252
This is actually not a charset but a coding system in
Emacs. When Emacs reads a file with this coding system, it
decodes each byte into a character of one of these character sets:
ascii, latin-iso8859-1, mule-unicode-0100-24ff
(2) ctext (alias of compound-text)
On conversion, it is not fully compatible with the
specification of X Compound Text, because it encodes any
Emacs character, using a designation sequence even for
private character sets (please note that all Emacs charsets
have an iso-final-char). So Big5 characters are preceded by
ESC $ ( 0 or ESC $ ( 1, and mule-unicode-0100-24ff characters
are preceded by ESC - 1.
(3) ctext-with-extensions (alias of compound-text-with-extensions)
It can handle several kinds of "extended segment". On
decoding, it handles ESC % / N M L ... ^b for the encodings listed in
ctext-non-standard-encoding-alist, and ESC % G ... ESC % @
for UTF-8. On encoding, it does two-pass encoding: first it
encodes with `compound-text', then it re-encodes whatever was
encoded with a designation sequence listed in
ctext-non-standard-designations-alist using the "extended
segment". Currently only ESC $ ( 0 and ESC $ ( 1 are
listed. Thus only Big5 characters are encoded using the "extended
segment".
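To make the ESC % / N M L ... ^b layout in (3) concrete, here is a minimal Python sketch that builds and parses one extended segment. It is not the Emacs implementation. It assumes, following my reading of the X Compound Text specification, that N is a digit 0-4 giving the octets per character (0 meaning variable), that M and L each have the high bit set, and that (M - 128) * 128 + (L - 128) is the number of octets following L, including the encoding name and the terminating ^b (STX, 0x02). The function names and the "big5-0" name in the usage note are illustrative only.

```python
ESC_PREFIX = b"\x1b%/"   # ESC % /
STX = b"\x02"            # ^b, terminates the encoding name


def build_extended_segment(name: str, payload: bytes, octets_per_char: int = 0) -> bytes:
    """Wrap payload in ESC % / N M L name STX payload (hypothetical helper)."""
    if not 0 <= octets_per_char <= 4:
        raise ValueError("octets_per_char must be 0..4")
    body = name.encode("latin-1") + STX + payload
    if len(body) >= 128 * 128:
        raise ValueError("segment too long for a two-octet length")
    m, l = divmod(len(body), 128)
    header = bytes([0x30 + octets_per_char, 0x80 + m, 0x80 + l])
    return ESC_PREFIX + header + body


def parse_extended_segment(data: bytes, pos: int = 0):
    """Parse one segment at data[pos].

    Returns (octets_per_char, encoding_name, payload, next_pos).
    """
    head = data[pos:pos + 6]
    if len(head) < 6 or head[:3] != ESC_PREFIX:
        raise ValueError("not an extended segment")
    n, m, l = head[3], head[4], head[5]
    if not 0x30 <= n <= 0x34 or m < 0x80 or l < 0x80:
        raise ValueError("malformed N M L octets")
    length = (m - 128) * 128 + (l - 128)
    body = data[pos + 6:pos + 6 + length]
    if len(body) != length:
        raise ValueError("truncated segment")
    name, sep, payload = body.partition(STX)
    if not sep:
        raise ValueError("missing STX after encoding name")
    return n - 0x30, name.decode("latin-1"), payload, pos + 6 + length
```

For example, `build_extended_segment("big5-0", b"\xa4\x40", 2)` produces a 15-byte segment, and `parse_extended_segment` on that result returns the same name and payload along with the offset of the next byte to decode.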
As for the Mew case, I think the following approach is good.
When it runs under the current Emacs, keep using ctext but
add a coding tag to the file. Emacs should be able to
encode/decode all Emacs characters.
When it runs under the emacs-unicode version: on writing the
file, if all the characters can be encoded by ctext, keep
using it. If not (because, in emacs-unicode, some characters
don't belong to any charset that has an iso-final-char), use
utf-8. In both cases, add a coding tag. On reading,
check the coding tag first. If there is no coding tag, read with
ctext; otherwise, read with the coding system specified in the
tag.
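The write/read policy above can be sketched outside Emacs as follows. This is only an illustration in Python, not Mew code: the `latin-1` codec is a stand-in for ctext (Python has no compound-text codec), the function names are hypothetical, the tag is assumed to sit on the first line in the `-*- coding: NAME -*-` form, and the tag carries a Python codec name rather than an Emacs coding-system name.

```python
import re

LEGACY = "latin-1"  # assumption: stand-in for ctext in this sketch
TAG_RE = re.compile(rb"-\*-\s*coding:\s*([A-Za-z0-9_.-]+)\s*-\*-")


def encode_with_tag(text: str, legacy: str = LEGACY) -> bytes:
    """Keep the legacy coding if it can encode everything, else use utf-8.

    A coding tag is written as the first line in both cases.
    """
    try:
        body = text.encode(legacy)
        codec = legacy
    except UnicodeEncodeError:
        body = text.encode("utf-8")
        codec = "utf-8"
    header = f"-*- coding: {codec} -*-\n".encode("ascii")
    return header + body


def decode_with_tag(data: bytes, legacy: str = LEGACY) -> str:
    """Check the coding tag first; with no tag, read with the legacy coding."""
    first_line, _, rest = data.partition(b"\n")
    match = TAG_RE.search(first_line)
    if match is None:
        # Untagged file written by an older version: the whole file is content.
        return data.decode(legacy)
    return rest.decode(match.group(1).decode("ascii"))
```

Because untagged files are always read with the legacy coding, files written before the tag was introduced remain readable, while new files are self-describing.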
By the way,
> The one-and-only coding-system which, I found, meets the requirements
> above is 'ctext.
I think iso-latin-1-with-esc also meets your requirements.
---
Ken'ichi HANDA
handa@m17n.org