* Cyrillic, utf-8 and windows
From: Sam Steingold @ 2003-12-08 18:39 UTC

GNU Emacs 21.3.50.1 (i386-msvc-nt5.0.2195)
 of 2003-11-20 on WINSTEINGOLDLAP
--with-msvc (12.00)

I can open in Emacs a utf-8 file with Cyrillic characters in it and it
is displayed just fine - with correct glyphs &c.
I set `default-input-method' to "cyrillic-yawerty" in .emacs,
so when I try C-\ `toggle-input-method', I get 2 "character outline
boxes" in the modeline and when I type, I see these "character outline
boxes" in the buffer instead of the characters I just typed.
When I save the buffer, kill it, and re-visit the file,
I see what I just typed displayed correctly as Cyrillic!
So, why does Emacs display the characters that I type as boxes
(rectangles) but shows them correctly when loaded from a file on disk?

I use:

(setq default-input-method "cyrillic-yawerty")
(prefer-coding-system 'utf-8)
(when (fboundp 'utf-translate-cjk-mode) (utf-translate-cjk-mode 1))

Is there anything else I need to do?
Thanks!

--
Sam Steingold (http://www.podval.org/~sds) running w2k
<http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.honestreporting.com>
NY survival guide: when crossing a street, mind cars, not streetlights.
* Re: Cyrillic, utf-8 and windows
From: Sam Steingold @ 2003-12-09 18:25 UTC

> * Sam Steingold <fqf@tah.bet> [2003-12-08 13:39:32 -0500]:
>
> GNU Emacs 21.3.50.1 (i386-msvc-nt5.0.2195)
>  of 2003-11-20 on WINSTEINGOLDLAP
> --with-msvc (12.00)
>
> I can open in Emacs a utf-8 file with Cyrillic characters in it and it
> is displayed just fine - with correct glyphs &c.
> I set `default-input-method' to "cyrillic-yawerty" in .emacs,
> so when I try C-\ `toggle-input-method', I get 2 "character outline
> boxes" in the modeline and when I type, I see these "character outline
> boxes" in the buffer instead of the characters I just typed.
> When I save the buffer, kill it, and re-visit the file,
> I see what I just typed displayed correctly as Cyrillic!
> So, why does Emacs display the characters that I type as boxes
> (rectangles) but shows them correctly when loaded from a file on disk?
>
> I use:
>
> (setq default-input-method "cyrillic-yawerty")
> (prefer-coding-system 'utf-8)
> (when (fboundp 'utf-translate-cjk-mode) (utf-translate-cjk-mode 1))

when I type using cyrillic-yawerty, I get this:

  character: а (07120, 3664, 0xe50, U+0430)
    charset: cyrillic-iso8859-5
             (Right-Hand Part of Latin/Cyrillic Alphabet (ISO/IEC 8859-5): ISO-IR-144.)
 code point: 80
     syntax: w  which means: word
   category: y:Cyrillic
buffer code: 0x8C 0xD0
  file code: 0xD0 0xB0 (encoded by coding system mule-utf-8-unix)
    display: no font available

when I save the file, kill the buffer and visit the file again, that
character becomes

  character: а (01212120, 332880, 0x51450, U+0430)
    charset: mule-unicode-0100-24ff
             (Unicode characters of the range U+0100..U+24FF.)
 code point: 40 80
     syntax: w  which means: word
   category: y:Cyrillic
buffer code: 0x9C 0xF4 0xA8 0xD0
  file code: 0xD0 0xB0 (encoded by coding system mule-utf-8-unix)
    display: by this font (glyph code)
     -outline-Courier New-normal-r-normal-normal-13-97-96-96-c-80-iso10646-1 (0x430)

So, how do I tell cyrillic-yawerty to insert UTF-8?!

--
Sam Steingold (http://www.podval.org/~sds) running w2k
<http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.honestreporting.com>
When you talk to God, it's prayer; when He talks to you, it's schizophrenia.
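Editorial note: both reports show the same "file code" (0xD0 0xB0) because the two internal characters encode to the same UTF-8 bytes; only the charset Emacs 21 uses inside the buffer differs. The sketch below is not from the thread. It assumes a Unicode-based Emacs (23 or later), where the distinction no longer exists and both forms are simply U+0430, and it uses a hypothetical helper name, `demo-encoded-bytes'. It only illustrates the byte values that appear in the output above.

```elisp
;; Assumption: Emacs 23 or later, where characters are Unicode code points.
;; `demo-encoded-bytes' is a hypothetical helper, not an Emacs built-in.
(defun demo-encoded-bytes (char coding-system)
  "Return the bytes of CHAR encoded by CODING-SYSTEM, as a list of integers."
  ;; `encode-coding-string' returns a unibyte string; `append' with nil
  ;; turns it into a list of byte values.
  (append (encode-coding-string (string char) coding-system) nil))

;; U+0430 CYRILLIC SMALL LETTER A, the character from the report.
(demo-encoded-bytes #x430 'utf-8)       ; => (208 176), i.e. 0xD0 0xB0
(demo-encoded-bytes #x430 'iso-8859-5)  ; => (208), i.e. 0xD0
```

The single ISO 8859-5 byte 0xD0 with its high bit cleared is 0x50, which is decimal 80 and appears to correspond to the "code point: 80" reported for the cyrillic-iso8859-5 character.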
* Re: Cyrillic, utf-8 and windows
From: Kenichi Handa @ 2003-12-09 23:58 UTC
Cc: emacs-devel

Thank you for the report.

In article <ufzftq26b.fsf@gnu.org>, Sam Steingold <sds@gnu.org> writes:
>>
>> I can open in Emacs a utf-8 file with Cyrillic characters in it and it
>> is displayed just fine - with correct glyphs &c.
>> I set `default-input-method' to "cyrillic-yawerty" in .emacs,
>> so when I try C-\ `toggle-input-method', I get 2 "character outline
>> boxes" in the modeline and when I type, I see these "character outline
>> boxes" in the buffer instead of the characters I just typed.
>> When I save the buffer, kill it, and re-visit the file,
>> I see what I just typed displayed correctly as Cyrillic!
>> So, why does Emacs display the characters that I type as boxes
>> (rectangles) but shows them correctly when loaded from a file on disk?

Because those are different characters to Emacs, as you already found
below.

> when I type using cyrillic-yawerty, I get this:
[...]
>     charset: cyrillic-iso8859-5
[...]
> when I save the file, kill the buffer and visit the file again, that
> character becomes
[...]
>     charset: mule-unicode-0100-24ff

> So, how do I tell cyrillic-yawerty to insert UTF-8?!

The input method cyrillic-yawerty generates iso-8859-5 characters,
and Emacs has a facility to automatically adjust an input character
to what the buffer-file-coding-system expects. But I found a bug in
that facility and a shortcoming in set-default-coding-systems (called
from prefer-coding-system).

Please try the attached patch.

But there is still one problem. As you don't have iso8859-5 fonts,
the input-method indicator in the modeline can't be displayed
correctly. For the moment, Emacs doesn't have a facility to
automatically try other fonts (e.g. iso10646-1). The Emacs-unicode
version has it.

---
Ken'ichi HANDA
handa@m17n.org

*** ucs-tables.el.~1.34.~	Tue Sep  2 08:25:38 2003
--- ucs-tables.el	Wed Dec 10 08:17:57 2003
***************
*** 2507,2512 ****
--- 2507,2514 ----
  	      (coding-system-base default-buffer-file-coding-system))))
      (when cs
        (setq table (coding-system-get cs 'translation-table-for-encode))
+       (if (and table (symbolp table))
+ 	  (setq table (get table 'translation-table)))
        (unless (char-table-p table)
  	(setq table (coding-system-get cs 'translation-table-for-input)))
        (when (char-table-p table)
*** mule-cmds.el.~1.249.~	Wed Nov 26 08:10:10 2003
--- mule-cmds.el	Wed Dec 10 08:42:43 2003
***************
*** 321,326 ****
--- 321,331 ----
    o default value for the command `set-keyboard-coding-system'."
    (check-coding-system coding-system)
    (setq-default buffer-file-coding-system coding-system)
+   (if (fboundp 'ucs-set-table-for-input)
+       (dolist (buffer (buffer-list))
+ 	(or (local-variable-p 'buffer-file-coding-system buffer)
+ 	    (ucs-set-table-for-input buffer))))
+ 
    (if default-enable-multibyte-characters
        (setq default-file-name-coding-system coding-system))
    ;; If coding-system is nil, honor that on MS-DOS as well, so
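Editorial note: the first hunk of the patch handles the case where a coding system's `translation-table-for-encode' property holds a symbol naming a translation table rather than the char-table itself, so that the later `char-table-p' test succeeds. The sketch below is not the patched code. It is a minimal, standalone illustration of that symbol-to-table lookup, and the names `demo-table' and `demo-resolve-translation-table' are hypothetical.

```elisp
;; Hypothetical names throughout: `demo-table' and
;; `demo-resolve-translation-table' are illustrations, not Emacs built-ins.

;; Define a named translation table that maps ?a to ?b.  The char-table
;; is stored on the symbol's `translation-table' property.
(define-translation-table 'demo-table '((?a . ?b)))

(defun demo-resolve-translation-table (table)
  "Return TABLE as a char-table, following a symbol name if necessary.
Return nil when TABLE does not resolve to a char-table."
  (when (and table (symbolp table))
    (setq table (get table 'translation-table)))
  (and (char-table-p table) table))

;; The symbol itself is not a char-table ...
(char-table-p 'demo-table)                               ; => nil
;; ... but the table it names is, and it translates ?a to ?b.
(aref (demo-resolve-translation-table 'demo-table) ?a)   ; => 98, i.e. ?b
```

Without the symbol lookup, a caller that tests the property value directly with `char-table-p' would skip a table that is present but stored by name, which matches the behaviour the patch corrects.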
* Re: Cyrillic, utf-8 and windows
From: Sam Steingold @ 2003-12-11 19:38 UTC

> * Kenichi Handa <unaqn@z17a.bet> [2003-12-10 08:58:28 +0900]:
>
> Please try the attached patch.

works, thanks!

BTW, why does gnus save koi8 messages in something like uuencode?

--
Sam Steingold (http://www.podval.org/~sds) running w2k
<http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.honestreporting.com>
Abandon all hope, all ye who press Enter.
* Re: Cyrillic, utf-8 and windows
From: Kenichi Handa @ 2003-12-11 23:20 UTC
Cc: emacs-devel

In article <u4qw7no0w.fsf@gnu.org>, Sam Steingold <sds@gnu.org> writes:
>> * Kenichi Handa <unaqn@z17a.bet> [2003-12-10 08:58:28 +0900]:
>>
>> Please try the attached patch.

> works, thanks!

Thank you for testing.

> BTW, why does gnus save koi8 messages in something like uuencode?

Sorry, I have no idea about it.

---
Ken'ichi HANDA
handa@m17n.org
* Re: Cyrillic, utf-8 and windows
From: Jason Rumney @ 2003-12-10 1:27 UTC
Cc: emacs-devel

Sam Steingold <sds@gnu.org> writes:

> when I type using cyrillic-yawerty, I get this:
>
>   character: а (07120, 3664, 0xe50, U+0430)
>     charset: cyrillic-iso8859-5
>              (Right-Hand Part of Latin/Cyrillic Alphabet (ISO/IEC 8859-5): ISO-IR-144.)
>  code point: 80
>      syntax: w  which means: word
>    category: y:Cyrillic
> buffer code: 0x8C 0xD0
>   file code: 0xD0 0xB0 (encoded by coding system mule-utf-8-unix)
>     display: no font available

There seems to be a bug in the font handling of the Windows
version. Courier New (the default font) has Cyrillic characters, and
most versions of Windows support iso8859-5 - you can verify this: if
(w32-get-valid-codepages) lists 28595, then it should be supported;
so Cyrillic should display automatically. But I just checked and it
doesn't seem to be the case.
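Editorial note: the check described above can be wrapped in a small function. This is a minimal sketch, not something posted in the thread. The helper name `demo-iso8859-5-codepage-p' is hypothetical, and the sketch assumes only that `w32-get-valid-codepages' returns a list of integer codepage numbers on Windows builds of Emacs.

```elisp
;; `demo-iso8859-5-codepage-p' is a hypothetical helper, not an Emacs built-in.
(defun demo-iso8859-5-codepage-p ()
  "Return t if Windows lists codepage 28595 (ISO 8859-5) as valid, else nil.
Return the symbol `unknown' when `w32-get-valid-codepages' is unavailable,
as on non-Windows builds."
  (if (fboundp 'w32-get-valid-codepages)
      ;; `memq' returns a list tail; `and ... t' normalizes it to t or nil.
      (and (memq 28595 (w32-get-valid-codepages)) t)
    'unknown))

(message "ISO 8859-5 codepage valid: %s" (demo-iso8859-5-codepage-p))
```

The `fboundp' guard keeps the snippet loadable from a shared .emacs on systems where the w32 functions do not exist.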
* Re: Cyrillic, utf-8 and windows
From: Roman Belenov @ 2003-12-10 7:20 UTC
Cc: help-emacs-windows, sds, emacs-devel

Jason Rumney <jasonr@gnu.org> writes:

> There seems to be a bug in the font handling of the Windows
> version. Courier New (the default font) has Cyrillic characters, and
> most versions of Windows support iso8859-5 - you can verify this: if
> (w32-get-valid-codepages) lists 28595, then it should be supported;
> so Cyrillic should display automatically. But I just checked and it
> doesn't seem to be the case.

The list of character sets supported by Windows can be customized (at
least in NT derivatives) via Control Panel->Regional Settings
("Advanced" tab). It seems that iso8859-5 is enabled automatically if
Russian locale is in use, but in other locales you may have to enable
iso8859-5 manually.

--
With regards, Roman.
* Re: Cyrillic, utf-8 and windows
From: Sam Steingold @ 2003-12-11 19:39 UTC
Cc: emacs-devel

> * Roman Belenov <eoryrabi@lnaqrk.eh> [2003-12-10 10:20:04 +0300]:
>
> The list of character sets supported by Windows can be customized (at
> least in NT derivatives) via Control Panel->Regional Settings
> ("Advanced" tab). It seems that iso8859-5 is enabled automatically if
> Russian locale is in use, but in other locales you may have to enable
> iso8859-5 manually.

thanks, this helped!

--
Sam Steingold (http://www.podval.org/~sds) running w2k
<http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.honestreporting.com>
Those who can't write, write manuals.