* Cyrillic, utf-8 and windows
From: Sam Steingold @ 2003-12-08 18:39 UTC
GNU Emacs 21.3.50.1 (i386-msvc-nt5.0.2195)
of 2003-11-20 on WINSTEINGOLDLAP
--with-msvc (12.00)
I can open in Emacs a utf-8 file with Cyrillic characters in it and it
is displayed just fine - with correct glyphs &c.
I set `default-input-method' to "cyrillic-yawerty" in .emacs,
so when I try C-\ `toggle-input-method', I get 2 "character outline
boxes" in the modeline and when I type, I see these "character outline
boxes" in the buffer instead of the characters I just typed.
When I save the buffer, kill it, and re-visit the file,
I see what I just typed displayed correctly as Cyrillic!
So, why does Emacs display the characters that I type as boxes
(rectangles) but shows them correctly when loaded from a file on disk?
I use:
(setq default-input-method "cyrillic-yawerty")
(prefer-coding-system 'utf-8)
(when (fboundp 'utf-translate-cjk-mode) (utf-translate-cjk-mode 1))
Is there anything else I need to do?
Thanks!
--
Sam Steingold (http://www.podval.org/~sds) running w2k
<http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.honestreporting.com>
NY survival guide: when crossing a street, mind cars, not streetlights.
* Re: Cyrillic, utf-8 and windows
From: Sam Steingold @ 2003-12-09 18:25 UTC
when I type using cyrillic-yawerty, I get this:
character: а (07120, 3664, 0xe50, U+0430)
charset: cyrillic-iso8859-5
(Right-Hand Part of Latin/Cyrillic Alphabet (ISO/IEC 8859-5): ISO-IR-144.)
code point: 80
syntax: w which means: word
category: y:Cyrillic
buffer code: 0x8C 0xD0
file code: 0xD0 0xB0 (encoded by coding system mule-utf-8-unix)
display: no font available
when I save the file, kill the buffer and visit the file again, that
character becomes
character: а (01212120, 332880, 0x51450, U+0430)
charset: mule-unicode-0100-24ff
(Unicode characters of the range U+0100..U+24FF.)
code point: 40 80
syntax: w which means: word
category: y:Cyrillic
buffer code: 0x9C 0xF4 0xA8 0xD0
file code: 0xD0 0xB0 (encoded by coding system mule-utf-8-unix)
display: by this font (glyph code)
-outline-Courier New-normal-r-normal-normal-13-97-96-96-c-80-iso10646-1 (0x430)
So, how do I tell cyrillic-yawerty to insert UTF-8?!
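
The numbers in the two reports above can be cross-checked by hand. The sketch below is illustrative only and assumes a current Emacs, where characters are plain Unicode code points; it is not how Emacs 21 computes these values internally. The 96x96 layout starting at U+0100 for mule-unicode-0100-24ff is an assumption inferred from the "40 80" code point shown above.

```elisp
;; Sketch only: cross-check the describe-char numbers for U+0430.
(let* ((ucs #x430)                      ; CYRILLIC SMALL LETTER A
       ;; ISO 8859-5 encodes U+0430 as the single byte 0xD0.
       (iso-byte (aref (encode-coding-string (string ucs) 'iso-8859-5) 0))
       ;; The cyrillic-iso8859-5 code point is that byte without the high bit.
       (iso-code-point (logand iso-byte #x7f))
       ;; Assumed layout: 96 columns per row, rows and columns offset by 32.
       (offset (- ucs #x100))
       (row (+ 32 (/ offset 96)))
       (col (+ 32 (% offset 96))))
  (list iso-byte iso-code-point row col
        ;; UTF-8 bytes written to the file in both cases.
        (append (encode-coding-string (string ucs) 'utf-8) nil)))
;; => (208 80 40 80 (208 176))
```

Here 208 is 0xD0 and 176 is 0xB0, so both reports describe the same letter with the same file code; only the internal charset (and hence the font lookup) differs.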
* Re: Cyrillic, utf-8 and windows
From: Kenichi Handa @ 2003-12-09 23:58 UTC
Cc: emacs-devel
Thank you for the report.
In article <ufzftq26b.fsf@gnu.org>, Sam Steingold <sds@gnu.org> writes:
>>
>> I can open in Emacs a utf-8 file with Cyrillic characters in it and it
>> is displayed just fine - with correct glyphs &c.
>> I set `default-input-method' to "cyrillic-yawerty" in .emacs,
>> so when I try C-\ `toggle-input-method', I get 2 "character outline
>> boxes" in the modeline and when I type, I see these "character outline
>> boxes" in the buffer instead of the characters I just typed.
>> When I save the buffer, kill it, and re-visit the file,
>> I see what I just typed displayed correctly as Cyrillic!
>> So, why does Emacs display the characters that I type as boxes
>> (rectangles) but shows them correctly when loaded from a file on disk?
Because, as you found below, those are different characters
as far as Emacs is concerned.
> when I type using cyrillic-yawerty, I get this:
[...]
> charset: cyrillic-iso8859-5
[...]
> when I save the file, kill the buffer and visit the file again, that
> character becomes
[...]
> charset: mule-unicode-0100-24ff
> So, how do I tell cyrillic-yawerty to insert UTF-8?!
The input method cyrillic-yawerty generates iso-8859-5
characters, and Emacs has a facility that automatically
converts an input character to what the
buffer-file-coding-system expects. However, I found a bug in
that facility and a shortcoming in set-default-coding-systems
(called from prefer-coding-system). Please try the attached
patch.
One problem still remains, though. As you don't have
iso8859-5 fonts, the input-method indicator in the modeline
can't be displayed correctly. For the moment, Emacs doesn't
have a facility to automatically try other fonts
(e.g. iso10646-1). The Emacs-unicode version has one.
---
Ken'ichi HANDA
handa@m17n.org
*** ucs-tables.el.~1.34.~ Tue Sep 2 08:25:38 2003
--- ucs-tables.el Wed Dec 10 08:17:57 2003
***************
*** 2507,2512 ****
--- 2507,2514 ----
(coding-system-base default-buffer-file-coding-system))))
(when cs
(setq table (coding-system-get cs 'translation-table-for-encode))
+ (if (and table (symbolp table))
+ (setq table (get table 'translation-table)))
(unless (char-table-p table)
(setq table (coding-system-get cs 'translation-table-for-input)))
(when (char-table-p table)
*** mule-cmds.el.~1.249.~ Wed Nov 26 08:10:10 2003
--- mule-cmds.el Wed Dec 10 08:42:43 2003
***************
*** 321,326 ****
--- 321,331 ----
o default value for the command `set-keyboard-coding-system'."
(check-coding-system coding-system)
(setq-default buffer-file-coding-system coding-system)
+ (if (fboundp 'ucs-set-table-for-input)
+ (dolist (buffer (buffer-list))
+ (or (local-variable-p 'buffer-file-coding-system buffer)
+ (ucs-set-table-for-input buffer))))
+
(if default-enable-multibyte-characters
(setq default-file-name-coding-system coding-system))
;; If coding-system is nil, honor that on MS-DOS as well, so
* Re: Cyrillic, utf-8 and windows
From: Jason Rumney @ 2003-12-10 1:27 UTC
Cc: emacs-devel
Sam Steingold <sds@gnu.org> writes:
> when I type using cyrillic-yawerty, I get this:
>
> character: а (07120, 3664, 0xe50, U+0430)
> charset: cyrillic-iso8859-5
> (Right-Hand Part of Latin/Cyrillic Alphabet (ISO/IEC 8859-5): ISO-IR-144.)
> code point: 80
> syntax: w which means: word
> category: y:Cyrillic
> buffer code: 0x8C 0xD0
> file code: 0xD0 0xB0 (encoded by coding system mule-utf-8-unix)
> display: no font available
There seems to be a bug in the font handling of the Windows
version. Courier New (the default font) has Cyrillic characters, and
most versions of Windows support iso8859-5 - you can verify this: if
(w32-get-valid-codepages) lists 28595, then it should be supported;
so Cyrillic should display automatically. But I just checked and it
doesn't seem to be the case.
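
The check described above could be written roughly as follows. This is a minimal sketch: `w32-get-valid-codepages' is only defined in Windows builds, so the call is guarded, and the message texts are made up for illustration.

```elisp
;; Sketch only: report whether Windows lists code page 28595 (ISO 8859-5).
(if (fboundp 'w32-get-valid-codepages)
    (if (memq 28595 (w32-get-valid-codepages))
        (message "Code page 28595 (ISO 8859-5) is listed")
      (message "Code page 28595 (ISO 8859-5) is not listed"))
  (message "w32-get-valid-codepages is not available in this build"))
```

Evaluate it in the *scratch* buffer with C-x C-e after the last closing parenthesis.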
* Re: Cyrillic, utf-8 and windows
From: Roman Belenov @ 2003-12-10 7:20 UTC
Cc: help-emacs-windows, sds, emacs-devel
Jason Rumney <jasonr@gnu.org> writes:
> There seems to be a bug in the font handling of the Windows
> version. Courier New (the default font) has Cyrillic characters, and
> most versions of Windows support iso8859-5 - you can verify this: if
> (w32-get-valid-codepages) lists 28595, then it should be supported;
> so Cyrillic should display automatically. But I just checked and it
> doesn't seem to be the case.
The list of character sets supported by Windows can be customized (at
least in NT derivatives) via Control Panel->Regional Settings
("Advanced" tab). It seems that iso8859-5 is enabled automatically if
Russian locale is in use, but in other locales you may have to enable
iso8859-5 manually.
--
With regards, Roman.
* Re: Cyrillic, utf-8 and windows
From: Sam Steingold @ 2003-12-11 19:38 UTC
> * Kenichi Handa <unaqn@z17a.bet> [2003-12-10 08:58:28 +0900]:
>
> Please try the attached patch.
works, thanks!
BTW, why does gnus save koi8 messages in something like uuencode?
* Re: Cyrillic, utf-8 and windows
From: Sam Steingold @ 2003-12-11 19:39 UTC
Cc: emacs-devel
> * Roman Belenov <eoryrabi@lnaqrk.eh> [2003-12-10 10:20:04 +0300]:
>
> The list of character sets supported by Windows can be customized (at
> least in NT derivatives) via Control Panel->Regional Settings
> ("Advanced" tab). It seems that iso8859-5 is enabled automatically if
> Russian locale is in use, but in other locales you may have to enable
> iso8859-5 manually.
thanks, this helped!
* Re: Cyrillic, utf-8 and windows
From: Kenichi Handa @ 2003-12-11 23:20 UTC
Cc: emacs-devel
In article <u4qw7no0w.fsf@gnu.org>, Sam Steingold <sds@gnu.org> writes:
>> * Kenichi Handa <unaqn@z17a.bet> [2003-12-10 08:58:28 +0900]:
>>
>> Please try the attached patch.
> works, thanks!
Thank you for testing.
> BTW, why does gnus save koi8 messages in something like uuencode?
Sorry, I have no idea about it.
---
Ken'ichi HANDA
handa@m17n.org