From: Simon Josefsson <jas@extundo.com>
Cc: emacs-devel@gnu.org
Subject: Re: Cyrillic vs UTF-8
Date: Fri, 25 Apr 2003 19:09:07 +0200
Message-ID: <iluvfx21p3g.fsf@latte.josefsson.org>
In-Reply-To: <1858-Fri25Apr2003194023+0300-eliz@elta.co.il> (Eli Zaretskii's message of "Fri, 25 Apr 2003 19:40:23 +0300")

"Eli Zaretskii" <eliz@elta.co.il> writes:

>> From: Simon Josefsson <jas@extundo.com>
>> Date: Fri, 25 Apr 2003 18:12:17 +0200
>>
>> I think there are two problems. Opening the file the first time
>> should guess it is a utf-8 file.
>
> IIRC, you need to make the priority of utf-8 higher for this to
> happen. Unless that's changed in the current CVS, try evaluating the
> following expression:
>
> (prefer-coding-system 'utf-8)
>
> before you visit a utf-8 encoded file, and see if that helps. I think
> this is because the encoding detection routines cannot distinguish
> between Latin-n and utf encoding without some help.

This works, but note that Emacs didn't recognize the file as being in
any encoding without it. The modeline says '-:--'.

It seems binary is preferred over utf-8 and utf-16-* in
coding-category-list. This seems extremely conservative. I guess it
means UTF-8 can never be autodetected by default? Is the Unicode
support so bad that it shouldn't even be preferred over binary? UTF-8
is well formed and restricted; detecting it properly (even compared to
Latin-n) can be done well enough that failures rarely happen in
practice.
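
To illustrate why UTF-8 is easy to tell apart from Latin-n and similar
8-bit encodings, here is a minimal Python sketch. It is not Emacs code,
and the name looks_like_utf8 is made up for this example. It assumes
nothing beyond strict decoding: non-ASCII text in an 8-bit encoding
almost never happens to form valid UTF-8 multi-byte sequences.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data is valid UTF-8 with at least one non-ASCII byte.

    Hypothetical helper for illustration only; it does not model the
    detection routines in Emacs.
    """
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    # Pure ASCII is also valid UTF-8, but it gives no evidence either way.
    return any(b >= 0x80 for b in data)


cyrillic_utf8 = "Привет".encode("utf-8")
cyrillic_koi8 = "Привет".encode("koi8-r")
latin1_text = "café".encode("latin-1")

print(looks_like_utf8(cyrillic_utf8))   # True
print(looks_like_utf8(cyrillic_koi8))   # False
print(looks_like_utf8(latin1_text))     # False
print(looks_like_utf8(b"plain ascii"))  # False
```

The KOI8-R and Latin-1 samples fail because their high bytes are not
followed by the continuation bytes that UTF-8 requires.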

Can't we move binary down below UTF-8 in CVS? IMHO we should move
UTF-8 earlier still, since determining whether data is UTF-8 or not
can be done with good probability. Preferring binary over UTF-8 seems
just wrong.
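
The effect of the ordering can be shown with a toy model in Python.
This is only a sketch under the assumption that detection walks a
priority list and takes the first category that accepts the data; the
real Emacs routines are more involved, and the names detect, is_utf8
and is_binary are invented for this example.

```python
from typing import Callable, List, Optional, Tuple

Category = Tuple[str, Callable[[bytes], bool]]


def is_utf8(data: bytes) -> bool:
    """Accept only byte sequences that decode as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def is_binary(data: bytes) -> bool:
    """Accept anything: every byte sequence is valid 'binary' data."""
    return True


def detect(data: bytes, priority: List[Category]) -> Optional[str]:
    """Return the first category in priority order that accepts data."""
    for name, accepts in priority:
        if accepts(data):
            return name
    return None


sample = "Кириллица".encode("utf-8")
binary_first = [("binary", is_binary), ("utf-8", is_utf8)]
utf8_first = [("utf-8", is_utf8), ("binary", is_binary)]

print(detect(sample, binary_first))  # binary
print(detect(sample, utf8_first))    # utf-8
```

Because the binary category accepts everything, any category listed
after it can never be chosen in this model, which is why the order
matters.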

There used to be (in Emacs 21.2) a PROBLEMS entry suggesting what you
say, but it has been removed both in 21.3 and in CVS. I thought that
meant UTF-8 was better supported now, but this doesn't seem to be the
case.