* getting Mule, Unicode & X selection to play together
@ 2002-12-15 0:09 Michael Livshin
2002-12-15 6:14 ` Eli Zaretskii
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Michael Livshin @ 2002-12-15 0:09 UTC
so the day had come and I decided to explore the wonderful world of
Emacs 21 and Mule, what with all the nice Debian packaging of them out
there.
so I installed Emacs 21 and mule-ucs (it seemed like a good idea, or
was it?), and I've put the following into .emacs:
(set-language-environment "Cyrillic-KOI8") ; I want my cyrillics
(define-coding-system-alias 'mule-utf-8 'utf-8) ; per mule-ucs README
(set-keyboard-coding-system 'utf-8) ; my keyboard generates
; Unicode-encoded cyrillic chars
now, I'm mostly interested in making X selection play well between
Emacs and several Unicode-based apps (mainly Mozilla and a couple of
GTK2-based critters).
I had to play with the locale settings, to get the X clipboard to
approach at least some sanity. so they ended up like this
(considering that I don't want the programs to speak Russian at me and
I live in Israel):
LANG=ru_RU.UTF-8
LC_CTYPE=ru_RU.UTF-8
LC_NUMERIC=he_IL
LC_TIME=he_IL
LC_COLLATE=ru_RU.UTF-8
LC_MONETARY=he_IL
LC_MESSAGES=C
LC_PAPER=C
LC_NAME=C
LC_ADDRESS=he_IL
LC_TELEPHONE=he_IL
LC_MEASUREMENT=he_IL
LC_IDENTIFICATION=ru_RU.UTF-8
LC_ALL=
if I select a chunk of cyrillic text in Emacs and paste it into
Mozilla, all is well.
now, if I select a chunk of cyrillic text in Mozilla and paste it into
Emacs, I do indeed get the same-looking text. however, the char codes
are different from whatever Emacs itself chooses for the same entities
if I type them into it (which is just weird, but no biggie), and (as a
consequence, probably) the pasted text is shown in a different font
(which is butt ugly).
so basically I'd like Emacs to somehow recognize the cyrillic
characters in the X selection it receives, and to convert them into
the codes it itself uses for the same characters. how do I do that?
--
There are few personal problems which can't be solved by the suitable
application of high explosives.
* Re: getting Mule, Unicode & X selection to play together
From: Eli Zaretskii @ 2002-12-15 6:14 UTC
On Sun, 15 Dec 2002, Michael Livshin wrote:
> so the day had come and I decided to explore the wonderful world of
> Emacs 21 and Mule, what with all the nice Debian packaging of them out
> there.
>
> so I installed Emacs 21 and mule-ucs (it seemed like a good idea, or
> was it?)
It's not necessarily a good idea to bring Mule-UCS into this equation. In
any case, you will be much better off using the CVS version of Emacs,
in which several related bugs were fixed recently.
> now, if I select a chunk of cyrillic text in Mozilla and paste it into
> Emacs, I do indeed get the same-looking text. however, the char codes
> are different from whatever Emacs itself chooses for the same entities
> if I type them into it (which is just weird, but no biggie), and (as a
> consequence, probably) the pasted text is shown in a different font
> (which is butt ugly).
I suspect that Emacs converts the pasted text into Unicode codepoints,
and that your Unicode font is ugly. What does "C-u C-x =" report for the
Cyrillic characters you paste this way?
> so basically I'd like Emacs to somehow recognize the cyrillic
> characters in the X selection it receives, and to convert them into
> the codes it itself uses for the same characters.
The problem is that Emacs 21 has two different codepoints for each
Cyrillic character: one based on ISO-8859-5, the other based on Unicode.
Conversion between them is not supported in stock Emacs distributions;
AFAIK, you need either add-on packages (such as ucs-tables, which you can
find on gnu.emacs.sources) or the latest development code from CVS.
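
To illustrate why one Cyrillic letter can carry several different codes,
here is a minimal sketch in Python. Python is used purely for illustration,
the helper name codes_for is made up, and the sketch says nothing about
Emacs's internal character representation. It shows the code that the
capital letter А receives in ISO-8859-5, KOI8-R, and Unicode, using only
Python's standard codecs.

```python
# Illustration only: the same Cyrillic letter has a different code in each
# coded character set, so text must be converted when it moves between them.
# `codes_for` is a hypothetical helper, not part of Emacs or Mule.

def codes_for(ch: str) -> dict:
    """Return the code of a single character in several character sets."""
    return {
        "iso-8859-5": ch.encode("iso8859_5")[0],  # single byte value
        "koi8-r": ch.encode("koi8_r")[0],         # single byte value
        "unicode": ord(ch),                       # code point
        "utf-8": ch.encode("utf-8"),              # byte sequence
    }


if __name__ == "__main__":
    # U+0410 is CYRILLIC CAPITAL LETTER A.
    for name, code in codes_for("\u0410").items():
        shown = code.hex() if isinstance(code, bytes) else hex(code)
        print(f"{name:11s} {shown}")
```

A program that stores the ISO-8859-5 form for typed text and the Unicode
form for pasted text therefore ends up with two different codes for what
looks like the same letter, unless something converts between them.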
* Re: getting Mule, Unicode & X selection to play together
From: Tatsuya Kinoshita @ 2002-12-15 21:33 UTC
On December 15, 2002 at 2:09AM +0200,
Michael Livshin <usenet@cmm.kakpryg.net> wrote:
> so the day had come and I decided to explore the wonderful world of
> Emacs 21 and Mule, what with all the nice Debian packaging of them out
> there.
>
> so I installed Emacs 21 and mule-ucs
> now, if I select a chunk of cyrillic text in Mozilla and paste it into
> Emacs, I do indeed get the same-looking text. however, the char codes
> are different from whatever Emacs itself chooses for the same entities
> if I type them into it (which is just weird, but no biggie), and (as a
> consequence, probably) the pasted text is shown in a different font
> (which is butt ugly).
Did you install the xfonts-base-transcoded package?
In Debian, fonts in several ISO 8859 encodings, transcoded from
ISO 10646-1, are provided by the xfonts-*-transcoded packages.
See also `apt-cache show xfonts-base-transcoded' and
`apt-cache search transcoded'.
--
Tatsuya Kinoshita
* Re: getting Mule, Unicode & X selection to play together
From: Roman Belenov @ 2002-12-16 9:30 UTC
Michael Livshin <usenet@cmm.kakpryg.net> writes:
> so basically I'd like Emacs to somehow recognize the cyrillic
> characters in the X selection it receives, and to convert them into
> the codes it itself uses for the same characters. how do I do that?
I've patched utf-8.el (located in lisp/international) to make it use
the cyrillic-iso8859-5 character set (the patch is against the file from
GNU Emacs 21.2); I guess it should solve your problem.
The CVS version already supports this (although I have not tried it,
I only looked through the sources).
==============================================================================
--- utf-8.el.orig 2002-12-16 12:06:57.000000000 +0300
+++ utf-8.el 2002-09-04 17:39:50.000000000 +0400
@@ -116,14 +116,18 @@
((r0 = ,(charset-id 'latin-iso8859-1))
(r1 -= 128)
(write-multibyte-character r0 r1))
-
+ ((r2 = (r1 <= #x045f))
+ (if ((r1 >= #x0400) & r2)
+ ((r0 = ,(charset-id 'cyrillic-iso8859-5))
+ (r1 -= #x03e0)
+ (write-multibyte-character r0 r1))
;; mule-unicode-0100-24ff (< 0800)
((r0 = ,(charset-id 'mule-unicode-0100-24ff))
(r1 -= #x0100)
(r2 = (((r1 / 96) + 32) << 7))
(r1 %= 96)
(r1 += (r2 + 32))
- (write-multibyte-character r0 r1)))))))
+ (write-multibyte-character r0 r1)))))))))
;; 3byte encoding
;; zzzzyyyyyyxxxxxx = 1110zzzz 10yyyyyy 10xxxxxx
@@ -246,6 +250,12 @@
(r1 &= #x3f)
(r1 |= #x80)
(write r0 r1))
+ (if (r0 == ,(charset-id 'cyrillic-iso8859-5))
+ ((r0 = (((r1 - #x20) >> 6) | #xd0))
+ (r1 -= #x20)
+ (r1 &= #x3f)
+ (r1 |= #x80)
+ (write r0 r1))
(if (r0 == ,(charset-id 'mule-unicode-0100-24ff))
((r0 = ((((r1 & #x3f80) >> 7) - 32) * 96))
@@ -327,7 +337,7 @@
;; Output U+FFFD, which is `ef bf bd' in UTF-8.
((write #xef)
(write #xbf)
- (write #xbd)))))))))
+ (write #xbd))))))))))
(repeat)))
(if (r1 >= #xa0)
(write r1)
@@ -348,6 +358,7 @@
eight-bit-control
eight-bit-graphic
latin-iso8859-1
+ cyrillic-iso8859-5
mule-unicode-0100-24ff
mule-unicode-2500-33ff
mule-unicode-e000-ffff
@@ -367,6 +378,7 @@
eight-bit-control
eight-bit-graphic
latin-iso8859-1
+ cyrillic-iso8859-5
mule-unicode-0100-24ff
mule-unicode-2500-33ff
mule-unicode-e000-ffff)
==============================================================================
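
The arithmetic in the patch can be checked outside Emacs. The sketch below
is Python, not CCL, and its function names are made up for illustration. It
assumes, as the patch appears to, that a cyrillic-iso8859-5 position is the
ISO-8859-5 byte minus #x80. It reproduces the `r1 -= #x03e0` step of the
decoder and the `#xd0` lead-byte computation of the encoder, and compares
both against Python's own codecs. The check skips U+0400, U+040D, U+0450
and U+045D: ISO 8859-5 has no slots for those four characters, so a
constant offset cannot be right for them, although they fall inside the
range U+0400..U+045F that the patch tests for.

```python
OFFSET = 0x03E0  # same constant as `r1 -= #x03e0` in the patch

# Code points in U+0400..U+045F that ISO 8859-5 does not place at cp - 0x0360.
EXCEPTIONS = {0x0400, 0x040D, 0x0450, 0x045D}


def unicode_to_position(cp: int) -> int:
    """Decoder step: Unicode code point -> 7-bit charset position."""
    return cp - OFFSET


def position_to_utf8(pos: int) -> bytes:
    """Encoder step: 7-bit charset position -> two UTF-8 bytes."""
    low = pos - 0x20             # equals cp - 0x0400
    lead = (low >> 6) | 0xD0     # 110xxxxx lead byte for this range
    trail = (low & 0x3F) | 0x80  # 10xxxxxx continuation byte
    return bytes([lead, trail])


def check() -> int:
    """Compare both steps with Python's codecs; return how many were checked."""
    checked = 0
    for cp in range(0x0400, 0x0460):
        if cp in EXCEPTIONS:
            continue
        pos = unicode_to_position(cp)
        # Position must match the ISO 8859-5 byte with the high bit dropped.
        assert pos == chr(cp).encode("iso8859_5")[0] - 0x80
        # The encoder arithmetic must reproduce the real UTF-8 bytes.
        assert position_to_utf8(pos) == chr(cp).encode("utf-8")
        checked += 1
    return checked


if __name__ == "__main__":
    print(check(), "code points agree")
```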
--
With regards, Roman.
* Re: getting Mule, Unicode & X selection to play together
From: Tatsuya Kinoshita @ 2002-12-16 11:21 UTC
On December 15, 2002 at 2:09AM +0200,
Michael Livshin <usenet@cmm.kakpryg.net> wrote:
> LANG=ru_RU.UTF-8
> LC_CTYPE=ru_RU.UTF-8
> now, if I select a chunk of cyrillic text in Mozilla and paste it into
> Emacs, I do indeed get the same-looking text. however, the char codes
> are different from whatever Emacs itself chooses for the same entities
> if I type them into it (which is just weird, but no biggie), and (as a
> consequence, probably) the pasted text is shown in a different font
> (which is butt ugly).
How about `/usr/bin/env LC_ALL=ru_RU.ISO-8859-5 /usr/bin/mozilla'?
(The UTF-8 bug might exist in Mozilla or in Xlib...)
--
Tatsuya Kinoshita
* Re: getting Mule, Unicode & X selection to play together
From: Michael Livshin @ 2002-12-17 23:02 UTC
Eli Zaretskii <eliz@is.elta.co.il> writes:
> The problem is that Emacs 21 has two different codepoints for each
> Cyrillic character: one based on ISO-8859-5, the other based on Unicode.
> Conversion between them is not supported in stock Emacs distributions;
> AFAIK, you need either add-on packages (such as ucs-tables, which you can
> find on gnu.emacs.sources) or the latest development code from CVS.
thanks! I installed the CVS version and got everything to work.
to possible future sufferers: the _key_ thing about getting anything
MULE-related to work seems to be /letting go/. by any means, don't
try logic! you'll spend hours in pain, you'll pull half your hair
out, and it just won't work for you.
my satori found me the minute I happened upon the following, in
fontset.el:
(defvar x-font-name-charset-alist
'(...
("koi8" ascii cyrillic-iso8859-5)
...))
after seeing the above, it was a matter of slapping self on whatever
passes for forehead, setting the global locale to ru_RU.KOI8-R,
setting the Emacs language environment to "Cyrillic-ISO", not
forgetting to explicitly map the "cyrillic-iso8859-5" encoding to an
iso8859-5 font in the fontset (no sir, Emacs *won't* grok it by
itself, how could it?), and voila!
(OK, so as a consequence of setting the global locale to something
un-Unicodelly, I won't be able to cut-n-paste in more than 2
languages at the same time. no biggie, at least cyrillics are
working.)
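
The trade-off described above can be illustrated with a short Python sketch
(Python and the helper name representable are used only for illustration,
and this is not a model of how Emacs or X handle selections). A legacy
8-bit encoding such as KOI8-R has room for ASCII and Cyrillic, so text in a
third script, Hebrew in this example, cannot be represented in it, whereas
UTF-8 covers all three.

```python
def representable(text: str, encoding: str) -> bool:
    """Return True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    samples = {"english": "hello", "russian": "привет", "hebrew": "שלום"}
    for encoding in ("koi8_r", "utf-8"):
        for language, text in samples.items():
            verdict = "ok" if representable(text, encoding) else "not representable"
            print(f"{encoding:7s} {language:8s} {verdict}")
```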
bitter unfunny sarcasm aside, Emacs 21.3.50 seems to be a *really*
nice piece of work, so far.
thanks all,
--m
--
This program posts news to billions of machines throughout the galaxy. Your
message will cost the net enough to bankrupt your entire planet. As a result
your species will be sold into slavery. Be sure you know what you are doing.
Are you absolutely sure you want to do this? [yn] y
* Re: getting Mule, Unicode & X selection to play together
From: Eli Zaretskii @ 2002-12-18 5:47 UTC
On Wed, 18 Dec 2002, Michael Livshin wrote:
> to possible future sufferers: the _key_ thing about getting anything
> MULE-related to work seems to be /letting go/. by any means, don't
> try logic! you'll spend hours in pain, you'll pull half your hair
> out, and it just won't work for you.
OTOH, you _can_ try logic after you read the sources ;-)
> not
> forgetting to explicitly map the "cyrillic-iso8859-5" encoding to an
> iso8859-5 font in the fontset (no sir, Emacs *won't* grok it by
> itself, how could it?)
I'm surprised this is so, but if it is, it sounds like a bug, so please
report it (with a precise test case to reproduce) to
emacs-pretest-bug@gnu.org. Thanks.
Btw, cyrillic-iso8859-5 is not an encoding, it's a character set. But
that's nitpicking.
* Re: getting Mule, Unicode & X selection to play together
From: Michael Livshin @ 2002-12-18 10:44 UTC
Eli Zaretskii <eliz@is.elta.co.il> writes:
>> not forgetting to explicitly map the "cyrillic-iso8859-5" encoding
>> to an iso8859-5 font in the fontset (no sir, Emacs *won't* grok it
>> by itself, how could it?)
>
> I'm surprised this is so, but if it is, it sounds like a bug, so please
> report it (with a precise test case to reproduce) to
> emacs-pretest-bug@gnu.org. Thanks.
well, let me clarify. Emacs *did* find an appropriately-encoded font,
the problem was that it took the iso8859-5 font from the "standard"
fontset and not from my fontset, even though there surely *is* a
matching font in my fontset (and it works great once I map
"cyrillic-iso8859-5" to it explicitly).
perhaps it's a kind of feature?
it would be nice to simply be able to tell Emacs to forget the
standard fontset altogether, or at least to completely ignore it.
> Btw, cyrillic-iso8859-5 is not an encoding, it's a character set.
> But that's nitpicking.
it's kind of subtle, so it's certainly worth pointing out. thank you.
--
Incrementally extended heuristic algorithms tend inexorably toward the
incomprehensible.
* Re: getting Mule, Unicode & X selection to play together
From: Eli Zaretskii @ 2002-12-18 10:56 UTC
On Wed, 18 Dec 2002, Michael Livshin wrote:
> well, let me clarify. Emacs *did* find an appropriately-encoded font,
> the problem was that it took the iso8859-5 font from the "standard"
> fontset and not from my fontset, even though there surely *is* a
> matching font in my fontset (and it works great once I map
> "cyrillic-iso8859-5" to it explicitly).
>
> perhaps it's a kind of feature?
I still think it's worth reporting as a possible misfeature or bug. (I
myself don't know enough about fontsets to give an opinion.)