From: Eli Zaretskii <eliz@gnu.org>
To: Stefan Monnier <monnier@IRO.UMontreal.CA>, Kenichi Handa <handa@gnu.org>
Cc: emacs-devel@gnu.org
Subject: Re: Encoding of etc/HELLO
Date: Sat, 21 Apr 2018 10:07:38 +0300
Message-ID: <83k1t1xcjp.fsf@gnu.org>
In-Reply-To: <jwvo9idaa1s.fsf-monnier+emacs@gnu.org> (message from Stefan Monnier on Fri, 20 Apr 2018 16:42:02 -0400)

> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Cc: emacs-devel@gnu.org
> Date: Fri, 20 Apr 2018 16:42:02 -0400
> 
> > The whole point of ISO-2022 is that the same Unicode codepoints can
> > come from different ISO-2022 charsets, and the ISO-2022 encoding keeps
> > that information in the bytestream.
> 
> My question was meant to see if there's a way to encode a similar kind
> of charset info into the bytestream.  From what you say above, there is
> such a thing but its use is discouraged.

If you mean a Unicode-compatible bytestream, then yes, that's the
feature I know of.  But if we want to use it in Emacs, we should
modify the UTF-x decoders to put the charset properties on the decoded
text, or invent a new property (since charset is currently 'unicode'),
and then augment the font selection code to consider that new
property.
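
To illustrate what would be lost, here is a minimal sketch in Python.
It assumes Python's standard iso2022_jp and iso2022_kr codecs as
stand-ins for the Emacs decoders, and the sample character U+4E2D is
my own choice for illustration.  The same codepoint produces different
ISO-2022 byte streams, each carrying its charset designation, while
the decoded strings are indistinguishable unless the decoder records
the charset somewhere, such as in a text property.

```python
# Illustration only: Python's standard codecs stand in for Emacs's decoders.
HAN = "\u4e2d"  # U+4E2D, a Han character present in both JIS X 0208 and KS X 1001

# Encode the same codepoint through two ISO-2022 variants and through UTF-8.
jp_bytes = HAN.encode("iso2022_jp")
kr_bytes = HAN.encode("iso2022_kr")
utf8_bytes = HAN.encode("utf-8")

# The ISO-2022 streams differ: each contains escape sequences that
# designate the charset the character came from.
print("iso-2022-jp:", jp_bytes)
print("iso-2022-kr:", kr_bytes)

# UTF-8 has a single encoding for the codepoint, with no charset information.
print("utf-8:      ", utf8_bytes)

# After decoding, both streams yield the same string, so the charset
# distinction is gone unless the decoder stores it separately.
jp_text = jp_bytes.decode("iso2022_jp")
kr_text = kr_bytes.decode("iso2022_kr")
print("decoded strings equal:", jp_text == kr_text)
```

A decoder that wanted to preserve the distinction would have to attach
the originating charset to the decoded text itself, which is the role
the charset property plays for ISO-2022 text in Emacs today.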

> Clearly this problem is not specific to Emacs, so what do people do?
> Hold on to iso-2022 for as long as they can (like we do in Emacs)?
> Give up on these "details" of rendering for files using a mix of C, J, and K?
> Rely on higher-level info (XML tags and friends) to carry the charset info?

I don't know.  Several years ago, I think each vendor used a private
extension of ISO-2022 to support emoji; I'm not sure whether that is
still the case, especially since the number of standardized emoji
keeps growing.  We could perhaps follow one such extension in our
support of ISO-2022.  Or we could decide that Han unification has
conquered the world, and that the CJK charset distinction is therefore
no longer important enough for font selection, in which case we could
recode HELLO in UTF-8.

I've added Handa-san to this discussion in the hope that he could
comment on what would be the best way forward.


