all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Kenichi Handa <handa@m17n.org>
Cc: emacs-devel@gnu.org
Subject: Re: Cyrillic vs UTF-8
Date: Mon, 28 Apr 2003 18:18:41 +0900 (JST)	[thread overview]
Message-ID: <200304280918.SAA10779@etlken.m17n.org> (raw)
In-Reply-To: <ilullxxxx78.fsf@latte.josefsson.org> (message from Simon Josefsson on Sat, 26 Apr 2003 14:25:15 +0200)

In article <ilullxxxx78.fsf@latte.josefsson.org>, Simon Josefsson <jas@extundo.com> writes:

> Kenichi Handa <handa@m17n.org> writes:
>>  Unfortunately, the current Emacs doesn't have a facility to
>>  detect UTF-8 byte sequence.  So, if we put UTF-8 the higher
>>  priority, all files are detected as UTF-8.  :-(

> I see.  Is this very difficult to solve, or why hasn't it?  The
> algorithm to detect UTF-8 is not that complicated.

Ooops, I'm very sorry that I was wrong.  The current Emacs
contains a builtin utf-8 and utf-16 (with BOM) detectors.
So, putting UTF-8 the higher priority should have no
problem.

Richard Stallman <rms@gnu.org> writes:
>     It seems binary is preferred over utf-8 and utf-16-* in
>     coding-category-list.  This seems extremely conservative.  I guess it
>     means UTF-8 can never be autodetected by default?

> That certainly seems undesirable.  Unless there is a specific reason
> why it needs to be this way, I agree with you that we should raise
> the priority of utf-8 and utf-16.

We can raise the priority of utf-16-le-with-signature and
utf-16-be-with-signature, but can't raise the priority of
utf-16-le, utf-16-be, utf-16 because it's impossible to
distinguish them from binary data.

So, I've just installed these changes.


2003-04-28  Kenichi Handa  <handa@m17n.org>

	* international/mule-cmds.el (reset-language-environment): Raise
	the priority of mule-utf-8, mule-utf-16-be-with-signature and
	mule-utf-16-le.-with-signature.

	* international/mule-conf.el: Set coding-category-utf-16-be to
	mule-utf-16-be-with-signature, coding-category-utf-16-le to
	mule-utf-16-le-with-signature.  Raise the priority of
	coding-category-utf-8, coding-category-utf-16-be, and
	coding-category-utf-16-le

---
Ken'ichi HANDA
handa@m17n.org

  reply	other threads:[~2003-04-28  9:18 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-04-25 16:12 Cyrillic vs UTF-8 Simon Josefsson
2003-04-25 16:40 ` Eli Zaretskii
2003-04-25 17:09   ` Simon Josefsson
2003-04-25 22:39     ` Eli Zaretskii
2003-04-26  8:11     ` Kenichi Handa
2003-04-26 12:25       ` Simon Josefsson
2003-04-28  9:18         ` Kenichi Handa [this message]
2003-04-28 11:11           ` Simon Josefsson
2003-04-26 16:21       ` Benjamin Riefenstahl
2003-04-26 16:27         ` Benjamin Riefenstahl
2003-04-28  4:38       ` Richard Stallman
2003-05-01  8:27         ` Kenichi Handa
2003-05-02  7:06           ` Richard Stallman
2003-05-02 21:51             ` Eli Zaretskii
2003-05-03 13:37               ` Juanma Barranquero
2003-05-03 19:04                 ` Eli Zaretskii
2003-05-04 13:03               ` Richard Stallman
2003-05-04 11:04           ` Dave Love
2003-05-04 12:01             ` Simon Josefsson
2003-05-04 17:13               ` Dave Love
2003-05-04 18:03                 ` Simon Josefsson
2003-05-05  8:47             ` Kenichi Handa
2003-04-26 13:44     ` Richard Stallman
2003-04-26 14:10       ` Simon Josefsson
2003-04-28 21:49     ` Stefan Monnier
2003-04-28 22:29       ` Simon Josefsson
2003-04-29 13:49         ` Stefan Monnier
2003-04-29 14:27           ` Simon Josefsson
2003-04-30  4:42             ` Stephen J. Turnbull
2003-04-30  5:43           ` Richard Stallman
2003-05-19  0:40       ` Kenichi Handa
2003-05-19  0:52         ` Stefan Monnier
2003-05-19  2:31           ` Kenichi Handa
2003-05-19 13:28             ` Stefan Monnier
2003-05-19 13:49               ` Stefan Monnier
2003-04-25 16:54 ` Simon Josefsson
2003-04-26  3:55   ` Implementing charset-aware X font names [was: Cyrillic vs UTF-8] Stephen J. Turnbull
2003-04-28 11:09     ` Kenichi Handa
2003-04-28 12:27       ` Implementing charset-aware X font names Stephen J. Turnbull
2003-05-01 11:13         ` Kenichi Handa
2003-05-01 14:14           ` Alex Schroeder
2003-05-01 23:16             ` Kenichi Handa
2003-04-26  7:59   ` Cyrillic vs UTF-8 Kenichi Handa
2003-04-26 12:14     ` Simon Josefsson
2003-05-01  7:20       ` Kenichi Handa
2003-05-01 14:06         ` Alex Schroeder
2003-05-01 18:03         ` Customizing fontsets (was: Cyrillic vs UTF-8) Oliver Scholz
2003-05-02  5:17           ` Customizing fontsets Alex Schroeder
2003-05-02  6:32             ` Kenichi Handa
2003-05-02 13:25               ` Stefan Monnier
2003-05-03  0:40               ` Oliver Scholz
2003-05-03  1:50                 ` Kenichi Handa
2003-05-03 12:08                   ` Oliver Scholz
2003-05-07  1:22                     ` Kenichi Handa
2003-05-03  0:33             ` Oliver Scholz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200304280918.SAA10779@etlken.m17n.org \
    --to=handa@m17n.org \
    --cc=emacs-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.