unofficial mirror of emacs-devel@gnu.org 
* Re: "Outer world" encoding for non-Latin1 language environments
  2002-02-27 11:29 "Outer world" encoding for non-Latin1 language environments Anton Zinoviev
@ 2002-02-27 11:10 ` Eli Zaretskii
  2002-03-01 18:00   ` Anton Zinoviev
  2002-03-01  1:28 ` Stefan Monnier
  1 sibling, 1 reply; 18+ messages in thread
From: Eli Zaretskii @ 2002-02-27 11:10 UTC (permalink / raw)
  Cc: emacs-devel


On Wed, 27 Feb 2002, Anton Zinoviev wrote:

> Even when users choose some non-Latin1 language environment, Emacs
> doesn't assume that its new default encoding is actually the
> encoding of the X-clipboard, the console font, etc.  This can be
> easily changed in this way:

Why change it?  The default for X selections is compound-text, which can 
handle many different languages/scripts mixed in a single selection.  
What you propose (to set it to koi8-u) will limit the selections to a 
single charset.  Why is that a good idea?
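
The limitation can be checked from Lisp.  A minimal sketch, assuming an
Emacs recent enough to provide `unencodable-char-position' (a function not
mentioned in this thread) and to define koi8-u; the helper name and the
sample string are made up for illustration:

```elisp
;; Sketch: can a selection that mixes Cyrillic and Greek be encoded?
(defun my-first-unencodable (string coding-system)
  "Return the index of the first char of STRING that CODING-SYSTEM
cannot encode, or nil if the whole string can be encoded."
  (unencodable-char-position 0 (length string) coding-system nil string))

(let ((mixed "Привіт, κόσμε"))
  (list (my-first-unencodable mixed 'koi8-u)
        (my-first-unencodable mixed 'compound-text)))
```

A non-nil element is the index of the first character that the coding
system cannot represent; nil means the whole string can be encoded.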
> 
> (defun setup-koi8u-coding-system () 
>   (let ()
>     (set-keyboard-coding-system 'koi8-u)
>     (set-clipboard-coding-system 'koi8-u)
>     (set-terminal-coding-system 'koi8-u)))

Whether defining a language environment should set keyboard and terminal 
encodings is an old issue, but it isn't a clear-cut one.  It is possible 
that someone defines a language environment but her keyboard and/or 
terminal cannot cope with encoded characters.

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/emacs-devel



* "Outer world" encoding for non-Latin1 language environments
@ 2002-02-27 11:29 Anton Zinoviev
  2002-02-27 11:10 ` Eli Zaretskii
  2002-03-01  1:28 ` Stefan Monnier
  0 siblings, 2 replies; 18+ messages in thread
From: Anton Zinoviev @ 2002-02-27 11:29 UTC (permalink / raw)


Hi!

Even when users choose some non-Latin1 language environment, Emacs
doesn't assume that its new default encoding is actually the
encoding of the X-clipboard, the console font, etc.  This can be
easily changed in this way:

(set-language-info-alist
 "Ukrainian" `((documentation . "\
Support for Ukrainian with KOI8-U coding system.  If you prefer
UTF-8, please put in ~/.emacs the following line:
  (prefer-coding-system 'utf-8)")
                (charset ascii cyrillic-iso8859-5 mule-unicode-0100-24ff)
                (sample-text . "here some Ukrainian tekst")
                (setup-function . setup-koi8u-coding-system)
                (exit-function . reset-coding-system)
                (coding-system . (koi8-u utf-8))
                (coding-priority . (koi8-u utf-8))
                (nonascii-translation
                 . ,(get 'cyrillic-koi8u-nonascii-translation-table
                         'translation-table))
                (input-method . "cyrillic-ukrainian")
                (unibyte-display . cyrillic-koi8u)
                (features cyril-util))
 '("Cyrillic"))

Note the (setup-function . setup-koi8u-coding-system) and
(exit-function . reset-coding-system).  They can be defined by
(features cyril-util) in this way:

(defun setup-koi8u-coding-system () 
  (let ()
    (set-keyboard-coding-system 'koi8-u)
    (set-clipboard-coding-system 'koi8-u)
    (set-terminal-coding-system 'koi8-u)))

(defun reset-coding-system () 
  (let ()
    (set-keyboard-coding-system 'latin-1)
    (set-clipboard-coding-system nil)
    (set-terminal-coding-system nil)))

Maybe someone will propose a better solution?

Regards, Anton Zinoviev


* Re: "Outer world" encoding for non-Latin1 language environments
@ 2002-02-27 11:46 Kenichi Handa
  2002-02-27 12:11 ` Eli Zaretskii
  0 siblings, 1 reply; 18+ messages in thread
From: Kenichi Handa @ 2002-02-27 11:46 UTC (permalink / raw)
  Cc: anton, emacs-devel

Eli Zaretskii <eliz@is.elta.co.il> writes:
> On Wed, 27 Feb 2002, Anton Zinoviev wrote:

>>  Even when users choose some non-Latin1 language environment, Emacs
>>  doesn't assume that its new default encoding is actually the
>>  encoding of the X-clipboard, the console font, etc.  This can be
>>  easily changed in this way:

> Why change it?  The default for X selections is compound-text, which can 
> handle many different languages/scripts mixed in a single selection.  
> What you propose (to set it to koi8-u) will limit the selections to a 
> single charset.  Why is that a good idea?

It may be good for a koi8-u environment in which koi8-u is used
directly for X selections and all the other X applications
expect that.

But such usage goes against X's ICCCM (Inter-Client
Communication Conventions Manual), which says that a STRING type
selection can contain ASCII and Latin-1 only.

And it seems that the world is moving toward using UTF-8 or
using "Non-Standard Character Set Encodings" of
compound-text.

Is there any consensus in the koi8-u community for such usage
(i.e. using koi8-u directly in X selections)?

If not, I agree with Eli: it is better to keep using compound-text
as the default coding system for X selections.

By the way, the support for "Non-Standard Character Set
Encodings" in compound-text was recently added to Emacs through
Eli's efforts.  Currently it doesn't support koi8-u
because we haven't heard that people have started to use that
method for koi8-u.

---
Ken'ichi HANDA
handa@etl.go.jp



* Re: "Outer world" encoding for non-Latin1 language environments
  2002-02-27 11:46 Kenichi Handa
@ 2002-02-27 12:11 ` Eli Zaretskii
  0 siblings, 0 replies; 18+ messages in thread
From: Eli Zaretskii @ 2002-02-27 12:11 UTC (permalink / raw)
  Cc: anton, emacs-devel


On Wed, 27 Feb 2002, Kenichi Handa wrote:

> By the way, the support for "Non-Standard Character Set
> Encodings" in compound-text was recently added to Emacs through
> Eli's efforts.  Currently it doesn't support koi8-u
> because we haven't heard that people have started to use that
> method for koi8-u.

It supports koi8-r, but only in one direction--from X to Emacs.  Adding 
koi8-u in the same direction is trivial.


* Re: "Outer world" encoding for non-Latin1 language environments
  2002-02-27 11:29 "Outer world" encoding for non-Latin1 language environments Anton Zinoviev
  2002-02-27 11:10 ` Eli Zaretskii
@ 2002-03-01  1:28 ` Stefan Monnier
  2002-03-01  7:36   ` Eli Zaretskii
  1 sibling, 1 reply; 18+ messages in thread
From: Stefan Monnier @ 2002-03-01  1:28 UTC (permalink / raw)
  Cc: emacs-devel

> Note the (setup-function . setup-koi8u-coding-system) and
> (exit-function . reset-coding-system).  They can be defined by
> (features cyril-util) in this way:
> 
> (defun setup-koi8u-coding-system () 
>   (let ()
>     (set-keyboard-coding-system 'koi8-u)
>     (set-clipboard-coding-system 'koi8-u)
>     (set-terminal-coding-system 'koi8-u)))
> 
> (defun reset-coding-system () 
>   (let ()
>     (set-keyboard-coding-system 'latin-1)
>     (set-clipboard-coding-system nil)
>     (set-terminal-coding-system nil)))
> 
> Maybe someone will propose a better solution?

Maybe all it takes is to use set-locale-environment rather than
set-language-environment ?
Also maybe koi8-u should be added to `standard-keyboard-coding-systems' ?
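
As a rough sketch of the first suggestion (the locale name "uk_UA.KOI8-U"
is only an example, and what exactly gets set depends on the Emacs version
and on how the locale is mapped in mule-cmds.el):

```elisp
;; Let Emacs derive the language environment and coding systems from a
;; locale name instead of hard-coding them in a setup function.
(set-locale-environment "uk_UA.KOI8-U")

;; Inspect what was chosen.
(list current-language-environment
      locale-coding-system
      (keyboard-coding-system))
```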


	Stefan



* Re: "Outer world" encoding for non-Latin1 language environments
  2002-03-01  1:28 ` Stefan Monnier
@ 2002-03-01  7:36   ` Eli Zaretskii
  0 siblings, 0 replies; 18+ messages in thread
From: Eli Zaretskii @ 2002-03-01  7:36 UTC (permalink / raw)
  Cc: anton, emacs-devel

> From: "Stefan Monnier" <monnier+gnu/emacs@RUM.cs.yale.edu>
> Date: Thu, 28 Feb 2002 20:28:08 -0500
> 
> Also maybe koi8-u should be added to `standard-keyboard-coding-systems' ?

Yes, I think so.


* Re: "Outer world" encoding for non-Latin1 language environments
  2002-02-27 11:10 ` Eli Zaretskii
@ 2002-03-01 18:00   ` Anton Zinoviev
  2002-03-01 18:23     ` Stefan Monnier
  2002-03-01 19:57     ` Eli Zaretskii
  0 siblings, 2 replies; 18+ messages in thread
From: Anton Zinoviev @ 2002-03-01 18:00 UTC (permalink / raw)
  Cc: Anton Zinoviev, emacs-devel

On 27.II.2002 at 13:10 (+0200) Eli Zaretskii wrote:
> 
> On Wed, 27 Feb 2002, Anton Zinoviev wrote:
> 
> > Even when users choose some non-Latin1 language environment, Emacs
> > doesn't assume that its new default encoding is actually the
> > encoding of the X-clipboard, the console font, etc.  This can be
> > easily changed in this way:
> 
> Why change it?  The default for X selections is compound-text, which can 
> handle many different languages/scripts mixed in a single selection.  
> What you propose (to set it to koi8-u) will limit the selections to a 
> single charset.  Why is that a good idea?

I was under the impression that Emacs doesn't support compound-text.  It
doesn't touch command sequences, and pasting from xterm gives something
like:

^[%/1\200\212\koi8-r^B......

where the dots are Latin-1 letters instead of Cyrillic.

Many programs don't use such command sequences, and
set-clipboard-coding-system helps when pasting from them.  However, the
right solution is to improve the support for compound-text in Emacs.
It should interpret these control sequences when pasting and generate
them when copying.  For the latter task it has to know the default
encoding for the language environment and use it if possible, because
most programs can't re-encode text that is pasted into them.
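
To make the sequence above concrete: in the Compound Text "extended
segment" syntax, ESC % / is followed by one byte giving the number of
octets per character, two length bytes M and L (each with the high bit
set, so the length is (M-128)*128 + (L-128)), the encoding name, STX (^B),
and then the data.  A minimal sketch of a parser follows; the function
name and the sample bytes are made up, and this is not how Emacs itself
implements it:

```elisp
(defun my-ctext-extended-segment (bytes)
  "Split unibyte string BYTES, one extended segment, into (NAME . DATA).
Return nil if BYTES does not look like a complete extended segment."
  (when (and (>= (length bytes) 6)
             (= (aref bytes 0) ?\e)
             (= (aref bytes 1) ?%)
             (= (aref bytes 2) ?/)
             (>= (aref bytes 4) 128)
             (>= (aref bytes 5) 128))
    ;; Bytes 4 and 5 hold the length of everything after them,
    ;; 7 bits each, with the high bit set.
    (let ((len (+ (* 128 (- (aref bytes 4) 128))
                  (- (aref bytes 5) 128))))
      (when (<= (+ 6 len) (length bytes))
        (let* ((body (substring bytes 6 (+ 6 len)))
               (stx (string-match "\002" body)))
          (when stx
            (cons (substring body 0 stx)
                  (substring body (1+ stx)))))))))

;; "koi8-r" (6 bytes) + STX + 3 data bytes = 10, hence \200\212.
(let ((segment (my-ctext-extended-segment
                "\e%/1\200\212koi8-r\002\301\302\303")))
  (list (car segment)
        (decode-coding-string (cdr segment) 'koi8-r)))
;; => ("koi8-r" "абц")
```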

> > (defun setup-koi8u-coding-system () 
> >   (let ()
> >     (set-keyboard-coding-system 'koi8-u)
> >     (set-clipboard-coding-system 'koi8-u)
> >     (set-terminal-coding-system 'koi8-u)))
> 
> Whether defining a language environment should set keyboard and terminal 
> encodings is an old issue, but it isn't a clear-cut one.  It is possible 
> that someone defines a language environment but her keyboard and/or 
> terminal cannot cope with encoded characters.

In this case setting the keyboard won't do any harm to this user, as
then he or she can use one of the input methods provided by Emacs.
There is no situation in which it would be useful for Emacs to interpret
my Cyrillic input as Latin-1.

After a wrong setting of the terminal encoding the user will see garbage
instead of question marks, so again that's no big harm.


Regards, Anton Zinoviev

P.S. Should I cc you when replying?


* Re: "Outer world" encoding for non-Latin1 language environments
  2002-03-01 18:00   ` Anton Zinoviev
@ 2002-03-01 18:23     ` Stefan Monnier
  2002-03-01 19:57     ` Eli Zaretskii
  1 sibling, 0 replies; 18+ messages in thread
From: Stefan Monnier @ 2002-03-01 18:23 UTC (permalink / raw)
  Cc: Eli Zaretskii, emacs-devel

> In this case setting the keyboard won't do any harm to this user, as
> then he or she can use one of the input methods provided by Emacs.
> There is no situation in which it would be useful for Emacs to interpret
> my Cyrillic input as Latin-1.

The default keyboard encoding is not latin-1 but ascii (at least until
Emacs-21.3), so I'm not sure what makes you think that Emacs interprets
your input as latin-1.  Maybe your locale settings are wrong or maybe
Emacs mistakenly associates your locale setting with latin-1 ?

What is the value of your locale envvars like LANG ?
Can you give us an example starting with `emacs -q --no-site-file' which
shows a case where Emacs mistakenly treats your input as latin-1 ?


	Stefan



* Re: "Outer world" encoding for non-Latin1 language environments
  2002-03-01 18:00   ` Anton Zinoviev
  2002-03-01 18:23     ` Stefan Monnier
@ 2002-03-01 19:57     ` Eli Zaretskii
  2002-03-05 10:57       ` Anton Zinoviev
  1 sibling, 1 reply; 18+ messages in thread
From: Eli Zaretskii @ 2002-03-01 19:57 UTC (permalink / raw)
  Cc: emacs-devel

> Date: Fri, 1 Mar 2002 20:00:58 +0200
> From: Anton Zinoviev <anton@lml.bas.bg>
> > 
> > Why change it?  The default for X selections is compound-text, which can 
> > handle many different languages/scripts mixed in a single selection.  
> > What you propose (to set it to koi8-u) will limit the selections to a 
> > single charset.  Why is that a good idea?
> 
> I was under the impression that Emacs doesn't support compound-text.  It
> doesn't touch command sequences and pasting from xterm gives something
> like:
> 
> ^[%/1\200\212\koi8-r^B......
> 
> where dots are Latin-1 letters instead of Cyrillic.

The current CVS version and the pretest of Emacs 21.2 do support the
above.  Please try a newer version.

> In this case setting the keyboard won't do any harm to this user, as
> then he or she can use one of the input methods provided by Emacs.
> There is no situation in which it would be useful for Emacs to interpret
> my Cyrillic input as Latin-1.

See Stefan's response: the default is not Latin-1, and Emacs should
interpret your keyboard input correctly even without
keyboard-coding-system being set.  If that doesn't work, please tell
the details; perhaps there's some bug.

> P.S. Should I cc you when replying?

You don't have to.


* Re: "Outer world" encoding for non-Latin1 language environments
  2002-03-01 19:57     ` Eli Zaretskii
@ 2002-03-05 10:57       ` Anton Zinoviev
  2002-03-05 17:28         ` Eli Zaretskii
  0 siblings, 1 reply; 18+ messages in thread
From: Anton Zinoviev @ 2002-03-05 10:57 UTC (permalink / raw)


Thanks to all who replied!

On  1.III.2002 at 21:57 (+0200) Eli Zaretskii wrote:
> > Date: Fri, 1 Mar 2002 20:00:58 +0200
> > From: Anton Zinoviev <anton@lml.bas.bg>
> > 
> > I was under the impression that Emacs doesn't support compound-text.  It
> > doesn't touch command sequences and pasting from xterm gives something
> > like:
> > 
> > ^[%/1\200\212\koi8-r^B......
> > 
> > where dots are Latin-1 letters instead of Cyrillic.
> 
> The current CVS version and the pretest of Emacs 21.2 do support the
> above.  Please try a newer version.

Pasting from xterm in utf-8 mode works fine.  I think that pasting
from koi8-r generates double-width Cyrillic letters from JISX0208
instead of regular Cyrillic letters from ISO 8859-5.

I tried rgrep koi8-r on sources and couldn't find where the clipboard
support is located.  Where is it?

> > In this case setting the keyboard won't do any harm to this user, as
> > then he or she can use one of the input methods provided by Emacs.
> > There is no situation in which it would be useful for Emacs to interpret
> > my Cyrillic input as Latin-1.
> 
> See Stefan's response: the default is not Latin-1, and Emacs should
> interpret your keyboard input correctly even without
> keyboard-coding-system being set.  If that doesn't work, please tell
> the details; perhaps there's some bug.

The bug is in lisp/international/mule-cmds.el: Latin-5 is stated to be
the coding system for some of the Cyrillic locales.  However, Latin-5 is
actually ISO 8859-9 (quite similar to Latin-1) and has nothing to do
with ISO 8859-5 (Cyrillic-ISO).
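
The difference is easy to see by decoding one byte under each coding
system.  A small sketch, assuming an Emacs in which the coding-system
names `iso-8859-5', `iso-8859-9' and `iso-8859-1' are defined:

```elisp
;; The same byte, 0xC1, decoded under three coding systems.
(list (decode-coding-string "\301" 'iso-8859-5)   ; Cyrillic capital Es
      (decode-coding-string "\301" 'iso-8859-9)   ; Latin-5
      (decode-coding-string "\301" 'iso-8859-1))  ; Latin-1
;; => ("С" "Á" "Á")
```

Latin-5 agrees with Latin-1 here, while ISO 8859-5 yields a Cyrillic
letter, so treating a Cyrillic locale as Latin-5 cannot work.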

I observed a change between the Emacs I used before and the Emacs I got
from CVS.  If I set the keyboard coding system appropriately, the
previous version of Emacs worked fine both on the text-mode console
and in X Window.  The latest version of Emacs works in text mode as well,
but in X Window it only beeps on non-ASCII symbols.  It would be nice
if Emacs understood xkeysyms directly, but the previous behaviour is
also fine.  The version of Emacs I used before is 21.1.1 from Debian
and uses the Athena widgets.  I compiled the new version of Emacs with
Lesstif.

Regards, Anton Zinoviev




* Re: "Outer world" encoding for non-Latin1 language environments
  2002-03-05 10:57       ` Anton Zinoviev
@ 2002-03-05 17:28         ` Eli Zaretskii
  2002-03-06 18:27           ` Anton Zinoviev
  0 siblings, 1 reply; 18+ messages in thread
From: Eli Zaretskii @ 2002-03-05 17:28 UTC (permalink / raw)
  Cc: emacs-devel

> From: Anton Zinoviev <anton@lml.bas.bg>
> Date: Tue, 5 Mar 2002 12:57:39 +0200
> 
> > > ^[%/1\200\212\koi8-r^B......
> > > 
> > > where dots are Latin-1 letters instead of Cyrillic.
> > 
> > The current CVS version and the pretest of Emacs 21.2 do support the
> > above.  Please try a newer version.
> 
> Pasting from xterm in utf-8 mode works fine.  I think that pasting
> from koi8-r generates double-width Cyrillic letters from JISX0208
> instead of regular Cyrillic letters from ISO 8859-5.

That's strange: I thought compound-text decoded Cyrillic text into
cyrillic-iso8859-5 characters, not jisx0208 characters.

Are you sure you get jisx0208?  What does Emacs say if you go to one
of the Cyrillic characters you pasted and type "C-u C-x ="?  Please
post everything Emacs displays in the buffer it pops up.

> I tried rgrep koi8-r on sources and couldn't find where the clipboard
> support is located.  Where is it?

If you are looking for the support for ICCCM Extended Segments, it's
in mule.el, functions ctext-post-read-conversion and
ctext-pre-write-conversion (that's in CVS head; in the RC branch and
in the pretest 21.1.95, these functions are in mule-conf.el instead).

The rest of support for X selections is in xselect.c.

> The bug is in lisp/international/mule-cmds.el: Latin-5 is stated to be
> the coding system for some of the Cyrillic locales.  However, Latin-5 is
> actually ISO 8859-9 (quite similar to Latin-1) and has nothing to do
> with ISO 8859-5 (Cyrillic-ISO).

Yes, you are right.  Please suggest what changes should be done in
locale-language-names for the Cyrillic locales.

> I observed a change between the Emacs I used before and the Emacs I got
> from CVS.  If I set the keyboard coding system appropriately, the
> previous version of Emacs worked fine both on the text-mode console
> and in X Window.  The latest version of Emacs works in text mode as well,
> but in X Window it only beeps on non-ASCII symbols.

Please send more details about this problem; if you can debug this by
yourself, it would be even better.  I don't have any access to a
system in Cyrillic locale running X.

Is it possible that v21.1 that worked for you was compiled with XIM,
whereas the CVS version is not, or vice versa?  For XIM, the
locale-coding-system should be set correctly, or else non-ASCII input
will not DTRT.  Perhaps the same bug in mule-cmds.el that you
mentioned also affects this issue, as it sets the wrong
locale-coding-system given the value of LANG and LC_* environment
variables?


* Re: "Outer world" encoding for non-Latin1 language environments
  2002-03-05 17:28         ` Eli Zaretskii
@ 2002-03-06 18:27           ` Anton Zinoviev
  2002-03-06 19:22             ` Eli Zaretskii
  0 siblings, 1 reply; 18+ messages in thread
From: Anton Zinoviev @ 2002-03-06 18:27 UTC (permalink / raw)


On  5.III.2002 at 19:28 Eli Zaretskii wrote:
> 
> Are you sure you get jisx0208?  What does Emacs say if you go to one
> of the Cyrillic characters you pasted and type "C-u C-x ="?  Please
> post everything Emacs displays in the buffer it pops up.

I will post the result.

> Yes, you are right.  Please suggest what changes should be done in
> locale-language-names for the Cyrillic locales.

This is a general question about what Cyrillic language environments
Emacs should have.  Please give a suggestion.  This is the current
situation about locales and languages:

be in mule-cmds.el maps to the language environment "Belarussian" (the
right spelling is Belarusian, not Belarussian).  There is no such language
environment in Emacs.  The locale be_BY is for CP1251 but Emacs doesn't
support CP1251 in cyrillic.el.  I can make the necessary changes for
that, but AFAIK Dave Love is doing something similar, so I'd better
contact him.

According to Alexander Mikhailian (the author of "Belarusian-HOWTO"), ISO
8859-5 is better supported than CP1251, and that's why Emacs should
also support Belarusian+ISO-8859-5.  I guess this means two language
environments for Belarusian: Belarusian-CP1251 and Belarusian-ISO?

bg in mule-cmds.el maps to the language environment "Bulgarian", which
also doesn't exist.  The locale bg_BG is for CP1251.  Bulgarian GNU/Linux
users have always used CP1251, even before XFree4.0.2 (the first version
that supports this encoding).  Two completely different keyboard layouts
are used in Bulgaria.  This has been discussed already
<http://mail.gnu.org/pipermail/bug-gnu-emacs/2002-January/009497.html>.  I
think it is better to have two language environments, Bulgarian-BDS and
Bulgarian-phonetic.

mk (Macedonian) in mule-cmds.el maps to "Latin-5".  The locale mk_MK is
for ISO 8859-5.  The closest language environment is "Cyrillic-ISO",
but it uses the Russian keyboard.  If you agree I will make a language
environment for Macedonian; otherwise it will be "Cyrillic-ISO".

ru in mule-cmds.el maps to "Latin-5".  That definitely should be
"Cyrillic-ISO".

tg (Tadjik) in mule-cmds.el maps to "Cyrillic-KOI8-T".  There is no such
language environment in Emacs, but I can make it.  KOI8-T has to be
added in cyrillic.el.

uk (Ukrainian) maps to the language environment "Ukrainian".  There is no
such environment in Emacs, but I can make it.  KOI8-U has to be added in
cyrillic.el.

> Please send more details about this problem; if you can debug this by
> yourself, it would be even better.  I don't have any access to a
> system in Cyrillic locale running X.

I will see what I can do.

> Is it possible that v21.1 that worked for you was compiled with XIM,
> whereas the CVS version is not, or vice versa?  For XIM, the
> locale-coding-system should be set correctly, or else non-ASCII input
> will not DTRT.  Perhaps the same bug in mule-cmds.el that you
> mentioned also affects this issue, as it sets the wrong
> locale-coding-system given the value of LANG and LC_* environment
> variables?

There are no problems in text-mode.  Emacs beeps not only for
alphanumeric characters but also for xkeysyms like "ISO_Next_Group".

Regards, Anton Zinoviev



* Re: "Outer world" encoding for non-Latin1 language environments
  2002-03-06 18:27           ` Anton Zinoviev
@ 2002-03-06 19:22             ` Eli Zaretskii
  2002-03-06 20:16               ` Anton Zinoviev
  0 siblings, 1 reply; 18+ messages in thread
From: Eli Zaretskii @ 2002-03-06 19:22 UTC (permalink / raw)
  Cc: emacs-devel, d.love

> From: Anton Zinoviev <anton@lml.bas.bg>
> Date: Wed, 6 Mar 2002 20:27:44 +0200
> 
> This is a general question about what Cyrillic language environments
> Emacs should have.  Please give a suggestion.  This is the current
> situation about locales and languages:

Thanks for the status report; it sounds like a lot has to be done.

> be in mule-cmds.el maps to the language environment "Belarussian" (the
> right spelling is Belarusian, not Belarussian).  There is no such language
> environment in Emacs.  The locale be_BY is for CP1251 but Emacs doesn't
> support CP1251 in cyrillic.el.  I can make the necessary changes for
> that, but AFAIK Dave Love is doing something similar, so I'd better
> contact him.
> 
> According to Alexander Mikhailian (the author of "Belarusian-HOWTO"), ISO
> 8859-5 is better supported than CP1251, and that's why Emacs should
> also support Belarusian+ISO-8859-5.  I guess this means two language
> environments for Belarusian: Belarusian-CP1251 and Belarusian-ISO?
> 
> bg in mule-cmds.el maps to the language environment "Bulgarian", which
> also doesn't exist.  The locale bg_BG is for CP1251.  Bulgarian GNU/Linux
> users have always used CP1251, even before XFree4.0.2 (the first version
> that supports this encoding).  Two completely different keyboard layouts
> are used in Bulgaria.  This has been discussed already
> <http://mail.gnu.org/pipermail/bug-gnu-emacs/2002-January/009497.html>.  I
> think it is better to have two language environments, Bulgarian-BDS and
> Bulgarian-phonetic.

I, too, remember that Dave was discussing these issues, so I'll leave
it to him to comment on this.

> mk (Macedonian) in mule-cmds.el maps to "Latin-5".  The locale mk_MK is
> for ISO 8859-5.  The closest language environment is "Cyrillic-ISO",
> but it uses the Russian keyboard.  If you agree I will make a language
> environment for Macedonian; otherwise it will be "Cyrillic-ISO".

Adding the Macedonian environment sounds like a better idea.  Thanks.

> ru in mule-cmds.el maps to "Latin-5".  That definitely should be
> "Cyrillic-ISO".

Yes, agreed.

> tg (Tadjik) in mule-cmds.el maps to "Cyrillic-KOI8-T".  There is no such
> language environment in Emacs, but I can make it.  KOI8-T has to be
> added in cyrillic.el.
> 
> uk (Ukrainian) maps to the language environment "Ukrainian".  There is no
> such environment in Emacs, but I can make it.  KOI8-U has to be added in
> cyrillic.el.

Please do add these two environments.
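
The mapping under discussion could be sketched as follows.  This is only
an illustration: the variable and function names are hypothetical, the
real table is locale-language-names in mule-cmds.el and may use a
different format, and the single "Belarusian" and "Bulgarian" entries are
placeholders for questions that are still open above.

```elisp
(defvar my-cyrillic-locale-environments
  '(("be" . "Belarusian")
    ("bg" . "Bulgarian")
    ("mk" . "Macedonian")
    ("ru" . "Cyrillic-ISO")
    ("tg" . "Cyrillic-KOI8-T")
    ("uk" . "Ukrainian"))
  "Hypothetical map from a locale's language code to a language environment.")

(defun my-locale-environment (locale)
  "Return the language environment name proposed for LOCALE, or nil."
  (cdr (assoc (substring locale 0 (min 2 (length locale)))
              my-cyrillic-locale-environments)))

(my-locale-environment "mk_MK")  ; => "Macedonian"
(my-locale-environment "en_US")  ; => nil
```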

> > Is it possible that v21.1 that worked for you was compiled with XIM,
> > whereas the CVS version is not, or vice versa?  For XIM, the
> > locale-coding-system should be set correctly, or else non-ASCII input
> > will not DTRT.  Perhaps the same bug in mule-cmds.el that you
> > mentioned also affects this issue, as it sets the wrong
> > locale-coding-system given the value of LANG and LC_* environment
> > variables?
> 
> There are no problems in text-mode.

Hmm... not sure how this is relevant to the issue.  Are you saying
that Emacs beeps in some modes, but not in others?


* Re: "Outer world" encoding for non-Latin1 language environments
  2002-03-06 19:22             ` Eli Zaretskii
@ 2002-03-06 20:16               ` Anton Zinoviev
  2002-03-07  6:02                 ` Eli Zaretskii
  0 siblings, 1 reply; 18+ messages in thread
From: Anton Zinoviev @ 2002-03-06 20:16 UTC (permalink / raw)


On  6.III.2002 at 21:22 Eli Zaretskii wrote:
>
> > There are no problems in text-mode.
> 
> Hmm... not sure how this is relevant to the issue.  Are you saying
> that Emacs beeps in some modes, but not in others?

I wanted to say that there are no problems in Linux console.  The
problems are only in X Window.  This makes me think that the problem is
not because of the locale-coding-system (I am not able to check this
right now). The xkeysym ISO_Next_Group is also irrelevant to coding
system settings.

Regards, Anton Zinoviev



* Re: "Outer world" encoding for non-Latin1 language environments
  2002-03-06 20:16               ` Anton Zinoviev
@ 2002-03-07  6:02                 ` Eli Zaretskii
  2002-03-13 20:13                   ` Anton Zinoviev
  0 siblings, 1 reply; 18+ messages in thread
From: Eli Zaretskii @ 2002-03-07  6:02 UTC (permalink / raw)
  Cc: emacs-devel


On Wed, 6 Mar 2002, Anton Zinoviev wrote:

> I wanted to say that there are no problems in Linux console.  The
> problems are only in X Window.  This makes me think that the problem is
> not because of the locale-coding-system (I am not able to check this
> right now).

Actually, this points in the direction I was thinking of: on a text 
terminal locale-coding-system is not used to decode keyboard input, while 
on X it is.


* Re: "Outer world" encoding for non-Latin1 language environments
  2002-03-07  6:02                 ` Eli Zaretskii
@ 2002-03-13 20:13                   ` Anton Zinoviev
  2002-03-15  7:09                     ` Eli Zaretskii
  2002-03-22 12:25                     ` Eli Zaretskii
  0 siblings, 2 replies; 18+ messages in thread
From: Anton Zinoviev @ 2002-03-13 20:13 UTC (permalink / raw)


[-- Attachment #1: Type: text/plain, Size: 1635 bytes --]

On  7.III.2002 at 08:02 Eli Zaretskii wrote:
> 
> On Wed, 6 Mar 2002, Anton Zinoviev wrote:
> 
> > I wanted to say that there are no problems in Linux console.  The
> > problems are only in X Window.  This makes me think that the problem is
> > not because of the locale-coding-system (I am not able to check this
> > right now).
> 
> Actually, this points in the direction I was thinking of: on a text 
> terminal locale-coding-system is not used to decode keyboard input, while 
> on X it is.

(Sorry for the delayed reply.)

I investigated what happened.

Emacs tries to get a code for the received xkeysym.  It gets this code
according to the encoding of the locale Emacs is started with.  If this
encoding includes the key-pressed symbol, the result is an Emacs event of
the form [194], where 194 is the code of the key-pressed symbol.  If
this symbol is not included in the encoding of the locale, then the
result is a symbol of the form [S-Cyrillic-A].  Emacs can't interpret
events of this sort (they are not bound to any action) and that's why I
got beeps -- I started Emacs under LANG=C and thus the Cyrillic symbols
were not included in the locale encoding (i.e. ASCII).  This is not a real
bug, only a wishlist item.

However, Emacs also beeps for xkeysyms like ISO_Next_Group, though it must
interpret them the same way as Mode_switch.  This is a bug; I've
attached the fix.
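
For illustration, the range check made by the attached patch can be
mirrored in Lisp.  The numeric values are those of XK_ISO_Lock,
XK_ISO_Next_Group and XK_ISO_Last_Group_Lock in X11's keysymdef.h; the
constant and function names are hypothetical, and Emacs itself performs
this test in C in xterm.c:

```elisp
(defconst my-xk-iso-lock #xFE01 "Value of XK_ISO_Lock.")
(defconst my-xk-iso-next-group #xFE08 "Value of XK_ISO_Next_Group.")
(defconst my-xk-iso-last-group-lock #xFE0F "Value of XK_ISO_Last_Group_Lock.")

(defun my-iso-group-keysym-p (keysym)
  "Return non-nil if KEYSYM is in the XK_ISO_Lock..XK_ISO_Last_Group_Lock range."
  (and (>= keysym my-xk-iso-lock)
       (<= keysym my-xk-iso-last-group-lock)))

(my-iso-group-keysym-p my-xk-iso-next-group)  ; => t
(my-iso-group-keysym-p ?a)                    ; => nil
```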

Copying from a KOI8-R or ISO-8859-5 xterm and pasting into Emacs works
fine -- the result is symbols from ISO-8859-5.  Copying from a Unicode
xterm gives Japanese double-width Cyrillic symbols.  The output of C-u
C-x = is attached.

Regards, Anton Zinoviev


[-- Attachment #2: emacs-isokeys.patch --]
[-- Type: text/plain, Size: 1001 bytes --]

diff -Naur emacs.old/src/xterm.c emacs.new/src/xterm.c
--- emacs.old/src/xterm.c	Sat Mar  2 00:38:47 2002
+++ emacs.new/src/xterm.c	Sun Mar 10 23:05:22 2002
@@ -1,5 +1,5 @@
 /* X Communication module for terminals which understand the X protocol.
-   Copyright (C) 1989, 93, 94, 95, 96, 1997, 1998, 1999, 2000, 2001
+   Copyright (C) 1989, 93, 94, 95, 96, 1997, 1998, 1999, 2000, 2001, 2002
    Free Software Foundation, Inc.
 
 This file is part of GNU Emacs.
@@ -10629,6 +10629,14 @@
 				|| ((unsigned)(orig_keysym) == XK_Num_Lock)
 #endif
 #endif /* not HAVE_X11R5 */
+				/* The symbols from XK_ISO_Lock to
+				   XK_ISO_Last_Group_Lock doesn't have real
+				   modifiers but should be treated similarly
+				   to Mode_switch by Emacs. */
+#if defined XK_ISO_Lock && defined XK_ISO_Last_Group_Lock
+				|| ((unsigned)(orig_keysym) >=  XK_ISO_Lock
+				    && (unsigned)(orig_keysym) <= XK_ISO_Last_Group_Lock)
+#endif
 				))
 			{
 			  if (temp_index == sizeof temp_buffer / sizeof (short))

[-- Attachment #3: emacs-paste --]
[-- Type: text/plain, Size: 513 bytes --]

  character: ^[$B'U^[(B (0151725, 54229, 0xd3d5)
    charset: japanese-jisx0208 (JISX0208.1983/1990 Japanese Kanji: ISO-IR-87.)
 code point: 39 85
     syntax: word
   category: Y:Cyrillic characters of 2-byte character sets   j:Japanese  
	     |:While filling, we can break a line at this character.  
buffer code: 0x92 0xA7 0xD5
  file code: ESC 24 42 27 55 (encoded by coding system iso-2022-7bit)
       font: -Misc-Fixed-Medium-R-Normal--14-130-75-75-C-140-JISX0208.1983-0


adsf ^[$B'Q'c'U'f^[(B



* Re: "Outer world" encoding for non-Latin1 language environments
  2002-03-13 20:13                   ` Anton Zinoviev
@ 2002-03-15  7:09                     ` Eli Zaretskii
  2002-03-22 12:25                     ` Eli Zaretskii
  1 sibling, 0 replies; 18+ messages in thread
From: Eli Zaretskii @ 2002-03-15  7:09 UTC (permalink / raw)
  Cc: emacs-devel

> From: Anton Zinoviev <anton@lml.bas.bg>
> Date: Wed, 13 Mar 2002 22:13:13 +0200
> 
> However, Emacs also beeps for X keysyms like ISO_Next_Group, though it
> should interpret them the same way as Mode_switch.  This is a bug; I've
> attached the fix.

Can someone who knows about X keysym input please comment on this
change?

> Copying from a Unicode xterm gives Japanese double-width Cyrillic
> symbols.

That's probably something determined by the Unicode xterm--it needs to
decide what character set of those defined by ICCCM to use for Unicode
characters.  Perhaps there's some way a user can influence that
decision?  I don't have access to a Unicode xterm, so I cannot check.

Thanks for working on this.




* Re: "Outer world" encoding for non-Latin1 language environments
  2002-03-13 20:13                   ` Anton Zinoviev
  2002-03-15  7:09                     ` Eli Zaretskii
@ 2002-03-22 12:25                     ` Eli Zaretskii
  1 sibling, 0 replies; 18+ messages in thread
From: Eli Zaretskii @ 2002-03-22 12:25 UTC (permalink / raw)
  Cc: emacs-devel

> From: Anton Zinoviev <anton@lml.bas.bg>
> Date: Wed, 13 Mar 2002 22:13:13 +0200
> 
> However, Emacs also beeps for X keysyms like ISO_Next_Group, though it
> should interpret them the same way as Mode_switch.  This is a bug; I've
> attached the fix.

Thanks, I installed this.




end of thread, other threads:[~2002-03-22 12:25 UTC | newest]

Thread overview: 18+ messages
2002-02-27 11:29 "Outer world" encoding for non-Latin1 language environments Anton Zinoviev
2002-02-27 11:10 ` Eli Zaretskii
2002-03-01 18:00   ` Anton Zinoviev
2002-03-01 18:23     ` Stefan Monnier
2002-03-01 19:57     ` Eli Zaretskii
2002-03-05 10:57       ` Anton Zinoviev
2002-03-05 17:28         ` Eli Zaretskii
2002-03-06 18:27           ` Anton Zinoviev
2002-03-06 19:22             ` Eli Zaretskii
2002-03-06 20:16               ` Anton Zinoviev
2002-03-07  6:02                 ` Eli Zaretskii
2002-03-13 20:13                   ` Anton Zinoviev
2002-03-15  7:09                     ` Eli Zaretskii
2002-03-22 12:25                     ` Eli Zaretskii
2002-03-01  1:28 ` Stefan Monnier
2002-03-01  7:36   ` Eli Zaretskii
  -- strict thread matches above, loose matches on Subject: below --
2002-02-27 11:46 Kenichi Handa
2002-02-27 12:11 ` Eli Zaretskii
