unofficial mirror of bug-gnu-emacs@gnu.org 
* Proposal about adding cp1251 (Russian Windoz) encoding in Emacs
@ 2002-08-27  9:04 Krjukov Victor
  2002-08-28  1:14 ` Kenichi Handa
  0 siblings, 1 reply; 10+ messages in thread
From: Krjukov Victor @ 2002-08-27  9:04 UTC (permalink / raw)


Greetings, Emacs developers.

MULE support in Emacs contains several encodings for the Russian language:
ISO 8859-5, used as the main encoding (it's called a 'standard' encoding but
_nobody_ here actually uses it);
koi8-r (the de facto standard for email and the standard on *nix
systems - almost all *nix systems in Russia use it);
and Alternativnyj (also known as cp866) - an ancient encoding that comes from
the DOS world.

But Emacs has no support for cp1251 (the standard Russian encoding in
Windows), even though we have Emacs for Windows and use it actively.

I've made changes to several *.el files from the Emacs 21.2
distribution, but it seems to me that you need to recompile Emacs to
include an additional language in MULE support (and I have no C compiler
on my Windows system).

Could someone check the attached files and possibly include them in a
future Emacs version? This would be a great help to Russian users of
Emacs on Windows.

Thank you very much for this great program,

		Victor V. Kryukov.

PS. I've also made some changes in lisp/language/cyrillic.el: fixed a
typo (staff instead of stuff ;) and fixed some bugs (e.g.
cyrillic-encode-koi8-r-char instead of
cyrillic-encode-alternativnyj-char at the bottom of the following
excerpt, lines 203-217 of cyrillic.el, looks really strange).

I also have another question: comparing the sections in cyrillic.el for
koi8-r and for alternativnyj, you may notice that the line
(if (r0 == ,(charset-id 'cyrillic-iso8859-5))
is present only in define-ccl-program ccl-encode-koi8 and not in
ccl-encode-alternativnyj - is it a bug or a feature?
---
(make-coding-system
 'cyrillic-alternativnyj 4 ?A
 "ALTERNATIVNYJ 8-bit encoding for Cyrillic"
 '(ccl-decode-alternativnyj . ccl-encode-alternativnyj)
 `((safe-chars . ,(let ((table (make-char-table 'safe-chars))
			(i 0))
		    (while (< i 256)
                      (aset table (aref cyrillic-alternativnyj-decode-table i)
                            t)
                      (setq i (1+ i)))
                    table))
   (valid-codes (0 . 175) (224 . 241) 255)
   (charset-origin-alist (cyrillic-iso8859-5 "ALTERNATIVNYJ"
                                             cyrillic-encode-koi8-r-char))))
---

----
Sincerely Yours, Victor V. Kryukov, UFG
phone: +7501 967 3727, ext. 4387
email: vkryukov@ufg.com


* RE: Proposal about adding cp1251 (Russian Windoz) encoding in Emacs
@ 2002-08-27  9:05 Krjukov Victor
  0 siblings, 0 replies; 10+ messages in thread
From: Krjukov Victor @ 2002-08-27  9:05 UTC (permalink / raw)


[-- Attachment #1: Type: text/plain, Size: 2777 bytes --]

I forgot to attach changed files - here they are.

----
Sincerely Yours, Victor V. Kryukov, UFG
phone: +7501 967 3727, ext. 4387
email: vkryukov@ufg.com


[-- Attachment #2: emacs-21.2.zip --]
[-- Type: application/x-zip-compressed, Size: 270324 bytes --]


* Re: Proposal about adding cp1251 (Russian Windoz) encoding in Emacs
  2002-08-27  9:04 Krjukov Victor
@ 2002-08-28  1:14 ` Kenichi Handa
  2002-08-28  5:31   ` Eli Zaretskii
  2002-08-29 22:47   ` Dave Love
  0 siblings, 2 replies; 10+ messages in thread
From: Kenichi Handa @ 2002-08-28  1:14 UTC (permalink / raw)
  Cc: bug-gnu-emacs, d.love

In article <7CF3E427AE54FC42A26E68C5A89C186910C763@zeal.ufg.com>, "Krjukov Victor" <VKryukov@ufg.com> writes:
> Greetings, Emacs developers.
> MULE support in Emacs contains several encodings for the Russian language:
> ISO 8859-5, used as the main encoding (it's called a 'standard' encoding but
> _nobody_ here actually uses it);
> koi8-r (the de facto standard for email and the standard on *nix
> systems - almost all *nix systems in Russia use it);
> and Alternativnyj (also known as cp866) - an ancient encoding that comes from
> the DOS world.

> But Emacs has no support for cp1251 (the standard Russian encoding in
> Windows), even though we have Emacs for Windows and use it actively.

Thank you for the report.  We already support cp1251
in the latest CVS code.  Could you please try it?  When you
set the language environment to "Windows-1251", you can
start using the coding system windows-1251 (alias cp1251).

But I don't know whether Emacs automatically switches to that
lang. env. when started on Windows.  Does anyone know?

> I've made changes to several *.el files from the Emacs 21.2
> distribution, but it seems to me that you need to recompile Emacs to
> include an additional language in MULE support (and I have no C compiler
> on my Windows system).

> Could someone check the attached files and possibly include them in a
> future Emacs version? This would be a great help to Russian users of
> Emacs on Windows.

It seems that Emacs already has all the features you want,
so I haven't read through your code.  We would greatly appreciate
it if you tried the latest Emacs code and reported any problems.

Dave, you have improved cyrillic.el greatly.  Are the
following problems already fixed?  Could you please comment
on them?

> PS. I've also made some changes in lisp/language/cyrillic.el: fixed a
> typo (staff instead of stuff ;) and fixed some bugs (e.g.
> cyrillic-encode-koi8-r-char instead of
> cyrillic-encode-alternativnyj-char at the bottom of the following
> excerpt, lines 203-217 of cyrillic.el, looks really strange).

> I also have another question: comparing the sections in cyrillic.el for
> koi8-r and for alternativnyj, you may notice that the line
> (if (r0 == ,(charset-id 'cyrillic-iso8859-5))
> is present only in define-ccl-program ccl-encode-koi8 and not in
> ccl-encode-alternativnyj - is it a bug or a feature?
> ---
> (make-coding-system
>  'cyrillic-alternativnyj 4 ?A
>  "ALTERNATIVNYJ 8-bit encoding for Cyrillic"
>  '(ccl-decode-alternativnyj . ccl-encode-alternativnyj)
>  `((safe-chars . ,(let ((table (make-char-table 'safe-chars))
>                         (i 0))
>                     (while (< i 256)
>                       (aset table (aref cyrillic-alternativnyj-decode-table i)
>                             t)
>                       (setq i (1+ i)))
>                     table))
>    (valid-codes (0 . 175) (224 . 241) 255)
>    (charset-origin-alist (cyrillic-iso8859-5 "ALTERNATIVNYJ"
>                                              cyrillic-encode-koi8-r-char))))
> ---

---
Ken'ichi HANDA
handa@etl.go.jp
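
A minimal sketch of the setup described in the message above, assuming
an Emacs built from the CVS trunk under discussion (or any later version
that still defines the "Windows-1251" language environment).  The sample
string and the call to prefer-coding-system are illustrative additions,
not something taken from the thread:

```elisp
;; Select the language environment named in the message above.
(set-language-environment "Windows-1251")

;; Optional: give windows-1251 top priority when Emacs has to
;; guess the encoding of a file.
(prefer-coding-system 'windows-1251)

;; Both names should be valid coding systems if the alias is defined.
(list (coding-system-p 'windows-1251)
      (coding-system-p 'cp1251))

;; Round-trip a short Cyrillic string through the coding system;
;; the result should equal the original string.
(decode-coding-string
 (encode-coding-string "Привет" 'windows-1251)
 'windows-1251)
```

If either coding-system-p call returns nil, the coding system has not
been defined in that session yet, which is the loading question raised
later in the thread.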


* Re: Proposal about adding cp1251 (Russian Windoz) encoding in Emacs
  2002-08-28  1:14 ` Kenichi Handa
@ 2002-08-28  5:31   ` Eli Zaretskii
  2002-08-29 22:47   ` Dave Love
  1 sibling, 0 replies; 10+ messages in thread
From: Eli Zaretskii @ 2002-08-28  5:31 UTC (permalink / raw)
  Cc: VKryukov, bug-gnu-emacs, d.love


On Wed, 28 Aug 2002, Kenichi Handa wrote:

> Thank you for the report.  We already support cp1251
> in the latest CVS code.  Could you please try it?  When you
> set the language environment to "Windows-1251", you can
> start using the coding system windows-1251 (alias cp1251).
> 
> But I don't know whether Emacs automatically switches to that
> lang. env. when started on Windows.

If it doesn't, it should, IMHO.


* RE: Proposal about adding cp1251 (Russian Windoz) encoding in Emacs
@ 2002-08-28  5:48 Krjukov Victor
  2002-08-28  6:37 ` Kenichi Handa
  0 siblings, 1 reply; 10+ messages in thread
From: Krjukov Victor @ 2002-08-28  5:48 UTC (permalink / raw)
  Cc: bug-gnu-emacs, d.love

If I'm not wrong (http://www.microsoft.com/typography/unicode/1251.htm),
cp1251 is a character set with Cyrillic letters in the upper half
of the character table; I'm not sure people in Korea, for example, or even
in France with their accented letters, really want their Emacs on Windows
to switch to this language environment automatically. On the other hand,
this should certainly be the default _Cyrillic_ language environment on
Windows systems.

----
Sincerely Yours, Victor V. Kryukov, UFG
phone: +7501 967 3727, ext. 4387
email: vkryukov@ufg.com



* Re: Proposal about adding cp1251 (Russian Windoz) encoding in Emacs
  2002-08-28  5:48 Proposal about adding cp1251 (Russian Windoz) encoding in Emacs Krjukov Victor
@ 2002-08-28  6:37 ` Kenichi Handa
  2002-08-28 13:06   ` Eli Zaretskii
  0 siblings, 1 reply; 10+ messages in thread
From: Kenichi Handa @ 2002-08-28  6:37 UTC (permalink / raw)
  Cc: eliz, bug-gnu-emacs, d.love

In article <7CF3E427AE54FC42A26E68C5A89C186910C765@zeal.ufg.com>, "Krjukov Victor" <VKryukov@ufg.com> writes:
> If I'm not wrong (http://www.microsoft.com/typography/unicode/1251.htm),
> cp1251 is a character set with Cyrillic letters in the upper half
> of the character table; I'm not sure people in Korea, for example, or even
> in France with their accented letters, really want their Emacs on Windows
> to switch to this language environment automatically. On the other hand,
> this should certainly be the default _Cyrillic_ language environment on
> Windows systems.

What I mean is that Windows-1251 must be the default
lang. env. IF your Windows' codepage is cp1251.

---
Ken'ichi HANDA
handa@etl.go.jp


* Re: Proposal about adding cp1251 (Russian Windoz) encoding in Emacs
  2002-08-28  6:37 ` Kenichi Handa
@ 2002-08-28 13:06   ` Eli Zaretskii
  0 siblings, 0 replies; 10+ messages in thread
From: Eli Zaretskii @ 2002-08-28 13:06 UTC (permalink / raw)
  Cc: VKryukov, bug-gnu-emacs, d.love


On Wed, 28 Aug 2002, Kenichi Handa wrote:

> What I mean is that Windows-1251 must be the default
> lang. env. IF your Windows' codepage is cp1251.

Right, and that's what I meant as well.  Sorry if it was unclear.


* Re: Proposal about adding cp1251 (Russian Windoz) encoding in Emacs
  2002-08-28  1:14 ` Kenichi Handa
  2002-08-28  5:31   ` Eli Zaretskii
@ 2002-08-29 22:47   ` Dave Love
  2002-08-30  2:39     ` Kenichi Handa
  1 sibling, 1 reply; 10+ messages in thread
From: Dave Love @ 2002-08-29 22:47 UTC (permalink / raw)
  Cc: VKryukov, bug-gnu-emacs

Kenichi Handa <handa@etl.go.jp> writes:

> But I don't know whether Emacs automatically switches to that
> lang. env. when started on Windows.  Does anyone know?

Regardless of what happens on Windows, and I don't think it does that,
there isn't generally a language environment for all the relevant
coding systems.  I think processing locale specifications (and the
Windows equivalent) should be done differently, but no-one seemed to
agree when this last came up.  In particular, I was told it couldn't
work under Windows the same way as POSIX, for reasons I couldn't
follow.  (I haven't had a chance to implement something for Emacs 22.)

Note that windows-1251 & al aren't available until code-pages is
explicitly required, since coding system autoloading wasn't installed
and code-pages isn't preloaded.  It may interact with the codepage.el
nastiness.  (codepage has unfortunate special cases in various parts
of the Mule code and apparently can't be got rid of.)

> Dave, you have improved cyrillic.el greatly.  Are the
> following problem already fixed?  Could you please comment
> on that?

I don't think there have been any significant changes for Emacs 21.3.
Cyrillic support is basically the same as it always was, since changes
were blocked.

Actually, I don't know what the correct definition for alternativnyj
should be -- I've never found anything that looked `official'.  I've
seen suggestions both that it's the same as cp866 and that it's not
the same.  I'd welcome an authoritative answer.

I can't check the original mail to see whether it's all accounted for
now, since the list isn't now being gated to gnusnet and I can no
longer access the GNU systems for the archive.  I don't know of any
missing features for Cyrillic, though.

[Users of Cyrillic who want improved support can look in
<URL:ftp://dlpx1.dl.ac.uk/fx/emacs/Mule>, which contains versions of
my original changes for use with Emacs 21.1.  There are coding
systems, language environments and input methods.  However, I think
there are some errors and omissions there which I haven't had time to
check.

Just to show it works, including in Gnus, here's
M-x list-charset-chars windows-1251:

Characters in the coded character set windows-1251.
----------------------------------------------------
      0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
  0x  \0 \x01 \x02 \x03 \x04 \x05 \x06 \a \b \t \n \v \f \r \x0e \x0f
  1x  \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a ^[ \x1c \x1d \x1e \x1f
  2x     !  "  #  $  %  &  '  (  )  *  +  ,  -  .  /
  3x  0  1  2  3  4  5  6  7  8  9  :  ;  <  =  >  ?
  4x  @  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O
  5x  P  Q  R  S  T  U  V  W  X  Y  Z  [  \  ]  ^  _
  6x  `  a  b  c  d  e  f  g  h  i  j  k  l  m  n  o
  7x  p  q  r  s  t  u  v  w  x  y  z  {  |  }  ~  \x7f
  8x  Ђ  Ѓ  ‚  ѓ  „  …  †  ‡  €  ‰  Љ  ‹  Њ  Ќ  Ћ  Џ
  9x  ђ  ‘  ’  “  ”  •  –  —     ™  љ  ›  њ  ќ  ћ  џ
  Ax     Ў  ў  Ј  ¤  Ґ  ¦  §  Ё  ©  Є  «  ¬  ­  ®  Ї
  Bx  °  ±  І  і  ґ  µ  ¶  ·  ё  №  є  »  ј  Ѕ  ѕ  ї
  Cx  А  Б  В  Г  Д  Е  Ж  З  И  Й  К  Л  М  Н  О  П
  Dx  Р  С  Т  У  Ф  Х  Ц  Ч  Ш  Щ  Ъ  Ы  Ь  Э  Ю  Я
  Ex  а  б  в  г  д  е  ж  з  и  й  к  л  м  н  о  п
  Fx  р  с  т  у  ф  х  ц  ч  ш  щ  ъ  ы  ь  э  ю  я
]
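
The point about loading made in this message can be checked
interactively.  The sketch below assumes the code-pages library that
shipped with the Emacs versions discussed here; the NOERROR argument to
require is an addition so that the form does not signal an error on
versions where that library is absent:

```elisp
;; Load code-pages explicitly.  The third argument suppresses the
;; error if the library does not exist in this Emacs.
(require 'code-pages nil t)

;; Non-nil once windows-1251 is defined as a coding system.
(coding-system-p 'windows-1251)

;; Display the character table; this is the non-interactive form of
;; M-x list-charset-chars RET windows-1251 RET.
(list-charset-chars 'windows-1251)
```

Whether list-charset-chars accepts this name depends on the Emacs
version, so an error at that step does not by itself mean the coding
system is missing.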


* Re: Proposal about adding cp1251 (Russian Windoz) encoding in Emacs
  2002-08-29 22:47   ` Dave Love
@ 2002-08-30  2:39     ` Kenichi Handa
  2002-08-31 17:20       ` Dave Love
  0 siblings, 1 reply; 10+ messages in thread
From: Kenichi Handa @ 2002-08-30  2:39 UTC (permalink / raw)
  Cc: VKryukov, bug-gnu-emacs

In article <rzq8z2p1duq.fsf@albion.dl.ac.uk>, Dave Love <d.love@dl.ac.uk> writes:
> Kenichi Handa <handa@etl.go.jp> writes:
>>  But I don't know whether Emacs automatically switches to that
>>  lang. env. when started on Windows.  Does anyone know?

> Regardless of what happens on Windows, and I don't think it does that,
> there isn't generally a language environment for all the relevant
> coding systems.

Dave, here we are discussing the latest CVS code
(i.e. trunk HEAD), not the coming 21.3 (trunk RC).

And, in HEAD, you have already installed the language
environment Windows-1251.  So the question is whether
or not the Windows port of Emacs running under the codepage
cp1251 can automatically set the language environment to
Windows-1251.  And I don't know how the Windows port decides
the language environment.

> I think processing locale specifications (and the
> Windows equivalent) should be done differently, but no-one seemed to
> agree when this last came up.  In particular, I was told it couldn't
> work under Windows the same way as POSIX, for reasons I couldn't
> follow.  (I haven't had a chance to implement something for Emacs 22.)

I understand that the current method is not that good, but
I don't have time to provide any concrete solution.

> Note that windows-1251 & al aren't available until code-pages is
> explicitly required, since coding system autoloading wasn't installed
> and code-pages isn't preloaded.

Yes.  But you set up HEAD so that selecting the Windows-1251
lang. env. automatically loads code-pages.  Thus, in this
lang. env., there should be no problem with using the coding
system windows-1251.

---
Ken'ichi HANDA
handa@etl.go.jp


* Re: Proposal about adding cp1251 (Russian Windoz) encoding in Emacs
  2002-08-30  2:39     ` Kenichi Handa
@ 2002-08-31 17:20       ` Dave Love
  0 siblings, 0 replies; 10+ messages in thread
From: Dave Love @ 2002-08-31 17:20 UTC (permalink / raw)
  Cc: VKryukov, bug-gnu-emacs

Kenichi Handa <handa@etl.go.jp> writes:

> And, in HEAD, you have already installed the language
> environment Windows-1251.

But I should actually have removed it after making the environments
for Bulgarian and Belarusian.  My point was that there aren't any
other windows-nnnn environments.

> So the question is whether
> or not the Windows port of Emacs running under the codepage
> cp1251 can automatically set the language environment to
> Windows-1251.

I'm sure it could in this special case, but it needs a general
mechanism.

> And I don't know how the Windows port decides
> the language environment.

I guess it has no way of doing that properly until someone correlates
the Windows locales (for which there seem to be w32-... access
functions) with the Emacs environments, like eggert originally did for
gnunix.  (I assume that the locales don't correspond properly to the
language environments of the same name, because of differing charsets,
but I haven't looked at it.)

> Yes.  But you set up HEAD so that selecting the Windows-1251
> lang. env. automatically loads code-pages.  Thus, in this
> lang. env., there should be no problem with using the coding
> system windows-1251.

Yes, as a special case, but I assume people want a real environment
that gives them the correct language and input methods, not just the
coding system.  The two GNU locales which use windows-1251 are
Bulgarian and Belarusian, so they are all I provided.  However, for
Russian users of windows-1251, I think LC_CTYPE=ru_RU.windows-1251
should work iff windows-1251 was available at startup, but there
probably won't be a corresponding system locale outside Windows.
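
One way to try the LC_CTYPE idea from inside a running Emacs, without
changing the system locale, is to hand the locale name to
set-locale-environment directly.  This is only a sketch: it assumes the
code-pages library is available, and the thread does not confirm that
this locale name actually selects a Russian environment:

```elisp
;; Make sure the coding system exists before the locale is processed
;; (only relevant on versions that ship code-pages).
(require 'code-pages nil t)

;; Process the locale name as if it had come from LC_CTYPE.
(set-locale-environment "ru_RU.windows-1251")

;; Inspect what Emacs chose.
(list current-language-environment
      (default-value 'buffer-file-coding-system))
```

If the language environment is unchanged afterwards, the locale name was
not recognized, which would match the observation that no general
mechanism for the windows-nnnn coding systems exists yet.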
