From: "Stephen J. Turnbull" <stephen@xemacs.org>
To: James Cloos <cloos@jhcloos.com>
Cc: emacs-devel@gnu.org, David De La Harpe Golden <david@harpegolden.net>
Subject: Re: X11 Compound Text vs ISO 2022
Date: Wed, 07 Jul 2010 09:36:41 +0900
Message-ID: <87r5jgnn52.fsf@uwakimon.sk.tsukuba.ac.jp>
In-Reply-To: <m3aaq4w8d7.fsf@carbon.jhcloos.org>

James Cloos writes:

 > I think utf8 is the only significant difference between the upstream
 > Xorg spec and the Xfree86 modification.  I vaguely recall the
 > discussions on the xfree86 list(s) when it was introduced (too many
 > years ago, [SIGH]).  The EWMH spec and the UTF8_STRING format came
 > about, in part, out of that discussion, IIRC.

As of about 2004, the XFree86 spec was totally bogus (internally
contradictory on the subject of encoding some ISO 8859 coded character
sets), and the XFree86 implementation ignored it anyway in many cases.

 > Emacs does need to limit what it is willing to encode in COMPOUND_TEXT,
 > and to use utf8-in-ctext for everything which is not in the 8859, GB,
 > JISX, KSC, CNS or BIG5 variants libX11 supports.  I'd go a bit further
 > and prefer utf8 over the CJK encodings for characters which are not
 > part of a CJK string.

But that goes against the spec, which AFAIK still provides that in
COMPOUND_TEXT the escape to non-ISO-2022 should only be used for
characters not in the repertoires of the registered charsets:

    Extended segments are not to be used for any character set
    encoding that can be constructed from a GL/GR pair of approved
    standard encodings. For example, it is incorrect to use an
    extended segment for any of the ISO 8859 family of encodings.

I would argue that you have two choices here: consider the whole
string to be Unicode, and use an extended segment for the whole
thing; or consider the string to be pieced together from segments in
approved standard encodings, in which case a character that can be
represented in those encodings should be.
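
To illustrate the second choice, here is a minimal Python sketch that
splits a string into runs, labelling each run with the first "approved"
charset able to represent it and falling back to UTF-8 only for
characters that none of them cover.  The names APPROVED, charset_for
and segment are hypothetical, and Python codecs merely stand in for the
registered ISO 2022 charsets; it does not emit real COMPOUND_TEXT
escape sequences.

```python
from itertools import groupby

# Assumption: Python codec names used as stand-ins for the approved
# standard encodings.  The real list and its order come from the spec.
APPROVED = ["ascii", "latin-1", "iso8859_2", "euc_jp"]
FALLBACK = "utf-8"  # used only when no approved charset has the character


def charset_for(ch, approved=APPROVED):
    """Return the first approved charset that can encode ch, else FALLBACK."""
    for name in approved:
        try:
            ch.encode(name)
            return name
        except UnicodeEncodeError:
            continue
    return FALLBACK


def segment(text, approved=APPROVED):
    """Split text into (charset, run) pairs of consecutive characters."""
    return [
        (name, "".join(run))
        for name, run in groupby(text, key=lambda c: charset_for(c, approved))
    ]


if __name__ == "__main__":
    # MIDDLE DOT (U+00B7) is covered by Latin-1, so it never reaches
    # the UTF-8 fallback in this sketch.
    for name, run in segment("a\u00b7b \u65e5\u672c \U0001F600"):
        print(name, repr(run))
```

Under this policy a character lands in the fallback segment only when
every approved charset fails, which is the reading of the spec argued
for above.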

BTW, for the case of MIDDLE DOT using JIS X 0213, the most recent spec
I could find on the web doesn't admit JIS X 0213 (or JIS X 0212 for
that matter).

 > The question, then, is how best to do that?

Wouldn't it be better to avoid use of COMPOUND_TEXT targets?  How many
apps prefer it to UTF8_STRING?  So, for example, when asked for
supported targets Emacs could list UTF8_STRING first.
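
As a rough sketch of that ordering idea, the following Python function
moves preferred targets to the front of a TARGETS reply while keeping
the rest in their original order.  The name order_targets and the
sample target list are hypothetical, and whether a given requestor
actually honours the order of the list is an assumption, not something
the protocol guarantees.

```python
def order_targets(supported, preferred=("UTF8_STRING",)):
    """Return supported targets with preferred ones moved to the front."""
    front = [t for t in preferred if t in supported]
    rest = [t for t in supported if t not in front]
    return front + rest


if __name__ == "__main__":
    # Hypothetical list of targets a selection owner might advertise.
    print(order_targets(["STRING", "COMPOUND_TEXT", "UTF8_STRING", "TEXT"]))
```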


