* getting Mule, Unicode & X selection to play together
From: Michael Livshin @ 2002-12-15 0:09 UTC

so the day had come and I decided to explore the wonderful world of
Emacs 21 and Mule, what with all the nice Debian packaging of them out
there.

so I installed Emacs 21 and mule-ucs (it seemed like a good idea, or
was it?), and I've put the following into .emacs:

(set-language-environment "Cyrillic-KOI8") ; I want my cyrillics
(define-coding-system-alias 'mule-utf-8 'utf-8) ; per mule-ucs README
(set-keyboard-coding-system 'utf-8) ; my keyboard generates
                                    ; Unicode-encoded cyrillic chars

now, I'm mostly interested in making X selection play well between
Emacs and several Unicode-based apps (mainly Mozilla and a couple of
GTK2-based critters).

I had to play with the locale settings, to get the X clipboard to
approach at least some sanity. so they ended up like this
(considering that I don't want the programs to speak Russian at me and
I live in Israel):

LANG=ru_RU.UTF-8
LC_CTYPE=ru_RU.UTF-8
LC_NUMERIC=he_IL
LC_TIME=he_IL
LC_COLLATE=ru_RU.UTF-8
LC_MONETARY=he_IL
LC_MESSAGES=C
LC_PAPER=C
LC_NAME=C
LC_ADDRESS=he_IL
LC_TELEPHONE=he_IL
LC_MEASUREMENT=he_IL
LC_IDENTIFICATION=ru_RU.UTF-8
LC_ALL=

if I select a chunk of cyrillic text in Emacs and paste it into
Mozilla, all is well.

now, if I select a chunk of cyrillic text in Mozilla and paste it into
Emacs, I do indeed get the same-looking text. however, the char codes
are different from whatever Emacs itself chooses for the same entities
if I type them into it (which is just weird, but no biggie), and (as a
consequence, probably) the pasted text is shown in a different font
(which is butt ugly).

so basically I'd like Emacs to somehow recognize the cyrillic
characters in the X selection it receives, and to convert them into
the codes it itself uses for the same characters.

how do I do that?

--
There are few personal problems which can't be solved by the suitable
application of high explosives.
* Re: getting Mule, Unicode & X selection to play together
From: Eli Zaretskii @ 2002-12-15 6:14 UTC

On Sun, 15 Dec 2002, Michael Livshin wrote:

> so the day had come and I decided to explore the wonderful world of
> Emacs 21 and Mule, what with all the nice Debian packaging of them out
> there.
>
> so I installed Emacs 21 and mule-ucs (it seemed like a good idea, or
> was it?)

It's not necessarily a good idea to bring Mule-UCS into this equation.

In any case, you will be much better off using the CVS version of Emacs
where several related bugs were fixed lately.

> now, if I select a chunk of cyrillic text in Mozilla and paste it into
> Emacs, I do indeed get the same-looking text. however, the char codes
> are different from whatever Emacs itself chooses for the same entities
> if I type them into it (which is just weird, but no biggie), and (as a
> consequence, probably) the pasted text is shown in a different font
> (which is butt ugly).

I suspect that Emacs converts the pasted text into Unicode codepoints,
and that your Unicode font is ugly. What does "C-u C-x =" tell about
the cyrillic characters you paste this way?

> so basically I'd like Emacs to somehow recognize the cyrillic
> characters in the X selection it receives, and to convert them into
> the codes it itself uses for the same characters.

The problem is, Emacs 21 uses two different codepoints for Cyrillic
characters: one based on ISO-8859-5, the other based on Unicode.
Conversion between them is not supported in stock Emacs distributions,
AFAIK you need either add-on packages (such as ucs-tables you can find
on gnu.emacs.sources) or the latest development code from CVS.
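Eli's "C-u C-x =" check looks at one character at a time; the same question can be asked of a whole region. A minimal sketch, assuming an Emacs that provides the Mule function `find-charset-region`; the command name `my-show-region-charsets` is hypothetical and does not come from the thread:

```elisp
;; Hypothetical helper (not from the thread): report which Mule
;; charsets occur in the active region, e.g. to compare typed text
;; with text pasted from the X selection.
(defun my-show-region-charsets (beg end)
  "Show the charsets of the characters between BEG and END."
  (interactive "r")
  (message "Charsets in region: %S" (find-charset-region beg end)))
```

If Eli's diagnosis holds on the Emacs 21 setup described above, typed and pasted Cyrillic text should report different charsets. Later Emacs versions unify these characters internally, so the result there may well differ.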
* Re: getting Mule, Unicode & X selection to play together
From: Tatsuya Kinoshita @ 2002-12-15 21:33 UTC

On December 15, 2002 at 2:09AM +0200,
Michael Livshin <usenet@cmm.kakpryg.net> wrote:

> so the day had come and I decided to explore the wonderful world of
> Emacs 21 and Mule, what with all the nice Debian packaging of them out
> there.
>
> so I installed Emacs 21 and mule-ucs

> now, if I select a chunk of cyrillic text in Mozilla and paste it into
> Emacs, I do indeed get the same-looking text. however, the char codes
> are different from whatever Emacs itself chooses for the same entities
> if I type them into it (which is just weird, but no biggie), and (as a
> consequence, probably) the pasted text is shown in a different font
> (which is butt ugly).

Did you install the xfonts-base-transcoded package?

In Debian, fonts in several ISO 8859 encodings transcoded from
ISO 10646-1 are contained in the xfonts-*-transcoded packages.

See also `apt-cache show xfonts-base-transcoded' and
`apt-cache search transcoded'.

--
Tatsuya Kinoshita
* Re: getting Mule, Unicode & X selection to play together
From: Roman Belenov @ 2002-12-16 9:30 UTC

Michael Livshin <usenet@cmm.kakpryg.net> writes:

> so basically I'd like Emacs to somehow recognize the cyrillic
> characters in the X selection it receives, and to convert them into
> the codes it itself uses for the same characters. how do I do that?

I've patched utf-8.el (located in lisp/international) to make it use
the cyrillic-iso8859-5 character set (the patch is against the file
from GNU Emacs 21.2); I guess it should solve your problem. The CVS
version already has support for it (although I never tried it, just
looked through the sources).

==============================================================================
--- utf-8.el.orig 2002-12-16 12:06:57.000000000 +0300
+++ utf-8.el 2002-09-04 17:39:50.000000000 +0400
@@ -116,14 +116,18 @@
 ((r0 = ,(charset-id 'latin-iso8859-1))
 (r1 -= 128)
 (write-multibyte-character r0 r1))
-
+ ((r2 = (r1 <= #x045f))
+ (if ((r1 >= #x0400) & r2)
+ ((r0 = ,(charset-id 'cyrillic-iso8859-5))
+ (r1 -= #x03e0)
+ (write-multibyte-character r0 r1))
 ;; mule-unicode-0100-24ff (< 0800)
 ((r0 = ,(charset-id 'mule-unicode-0100-24ff))
 (r1 -= #x0100)
 (r2 = (((r1 / 96) + 32) << 7))
 (r1 %= 96)
 (r1 += (r2 + 32))
- (write-multibyte-character r0 r1)))))))
+ (write-multibyte-character r0 r1)))))))))
 
 ;; 3byte encoding
 ;; zzzzyyyyyyxxxxxx = 1110zzzz 10yyyyyy 10xxxxxx
@@ -246,6 +250,12 @@
 (r1 &= #x3f)
 (r1 |= #x80)
 (write r0 r1))
+ (if (r0 == ,(charset-id 'cyrillic-iso8859-5))
+ ((r0 = (((r1 - #x20) >> 6) | #xd0))
+ (r1 -= #x20)
+ (r1 &= #x3f)
+ (r1 |= #x80)
+ (write r0 r1))
 
 (if (r0 == ,(charset-id 'mule-unicode-0100-24ff))
 ((r0 = ((((r1 & #x3f80) >> 7) - 32) * 96))
@@ -327,7 +337,7 @@
 ;; Output U+FFFD, which is `ef bf bd' in UTF-8.
 ((write #xef)
 (write #xbf)
- (write #xbd)))))))))
+ (write #xbd))))))))))
 (repeat)))
 (if (r1 >= #xa0)
 (write r1)
@@ -348,6 +358,7 @@
 eight-bit-control
 eight-bit-graphic
 latin-iso8859-1
+ cyrillic-iso8859-5
 mule-unicode-0100-24ff
 mule-unicode-2500-33ff
 mule-unicode-e000-ffff
@@ -367,6 +378,7 @@
 eight-bit-control
 eight-bit-graphic
 latin-iso8859-1
+ cyrillic-iso8859-5
 mule-unicode-0100-24ff
 mule-unicode-2500-33ff
 mule-unicode-e000-ffff)
==============================================================================

--
With regards, Roman.
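The offsets in the patch can be checked outside CCL. The sketch below, in plain Emacs Lisp, mirrors the two added branches: subtracting #x03e0 from a code point in U+0400..U+045F to get a 7-bit position code in cyrillic-iso8859-5, and turning such a position code back into two UTF-8 bytes. The function names are hypothetical; this only illustrates the arithmetic and is not code from utf-8.el.

```elisp
;; Hypothetical helpers illustrating the offsets used in the patch.

(defun my-ucs-to-cyrillic-position (ucs)
  "Return the 7-bit cyrillic-iso8859-5 position code for UCS, or nil.
Mirrors the decoder branch: code points #x0400..#x045f minus #x03e0."
  (if (and (>= ucs #x0400) (<= ucs #x045f))
      (- ucs #x03e0)))

(defun my-cyrillic-position-to-utf-8 (pos)
  "Return a list of the two UTF-8 bytes for position code POS.
Mirrors the encoder branch added by the patch."
  (let ((offset (- pos #x20)))
    (list (logior (ash offset -6) #xd0)           ; lead byte: #xd0 or #xd1
          (logior (logand offset #x3f) #x80))))   ; continuation byte

;; U+0410 (CYRILLIC CAPITAL LETTER A) has position code #x30,
;; i.e. byte #xb0 in ISO-8859-5 once the high bit is set.
(my-ucs-to-cyrillic-position #x0410)   ; => 48 (#x30)
(my-cyrillic-position-to-utf-8 #x30)   ; => (208 144), i.e. #xd0 #x90
```

One caveat, as far as the ISO 8859-5 table goes: a few positions in the range are not Cyrillic letters (#x20 is no-break space, #x2d soft hyphen, #x70 the numero sign, #x7d the section sign), so the purely linear mapping is only approximate at U+0400, U+040D, U+0450 and U+045D.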
* Re: getting Mule, Unicode & X selection to play together
From: Tatsuya Kinoshita @ 2002-12-16 11:21 UTC

On December 15, 2002 at 2:09AM +0200,
Michael Livshin <usenet@cmm.kakpryg.net> wrote:

> LANG=ru_RU.UTF-8
> LC_CTYPE=ru_RU.UTF-8

> now, if I select a chunk of cyrillic text in Mozilla and paste it into
> Emacs, I do indeed get the same-looking text. however, the char codes
> are different from whatever Emacs itself chooses for the same entities
> if I type them into it (which is just weird, but no biggie), and (as a
> consequence, probably) the pasted text is shown in a different font
> (which is butt ugly).

How about `/usr/bin/env LC_ALL=ru_RU.ISO-8859-5 /usr/bin/mozilla'?

(The UTF-8 bug might exist in Mozilla or in Xlib...)

--
Tatsuya Kinoshita
* Re: getting Mule, Unicode & X selection to play together
From: Michael Livshin @ 2002-12-17 23:02 UTC

Eli Zaretskii <eliz@is.elta.co.il> writes:

> The problem is, Emacs 21 uses two different codepoints for Cyrillic
> characters: one based on ISO-8859-5, the other based on Unicode.
> Conversion between them is not supported in stock Emacs distributions,
> AFAIK you need either add-on packages (such as ucs-tables you can find
> on gnu.emacs.sources) or the latest development code from CVS.

thanks! I installed the CVS version and got everything to work.

to possible future sufferers: the _key_ thing about getting anything
MULE-related to work seems to be /letting go/. by any means, don't
try logic! you'll spend hours in pain, you'll pull half your hair
out, and it just won't work for you.

my satori found me the minute I happened upon the following, in
fontset.el:

(defvar x-font-name-charset-alist
  '(...
    ("koi8" ascii cyrillic-iso8859-5)
    ...))

after seeing the above, it was a matter of slapping self on whatever
passes for forehead, setting the global locale to ru_RU.KOI8-R,
setting the Emacs language environment to "Cyrillic-ISO", not
forgetting to explicitly map the "cyrillic-iso8859-5" encoding to an
iso8859-5 font in the fontset (no sir, Emacs *won't* grok it by
itself, how could it?), and voila!

(OK, so as a consequence of setting the global locale to something
un-Unicodelly, I won't be able to cut-n-paste in more than 2 languages
at the same time. no biggie, at least cyrillics are working.)

bitter unfunny sarcasm aside, Emacs 21.3.50 seems to be a *really*
nice piece of work, so far.

thanks all,
--m

--
This program posts news to billions of machines throughout the galaxy.
Your message will cost the net enough to bankrupt your entire planet.
As a result your species will be sold into slavery. Be sure you know
what you are doing. Are you absolutely sure you want to do this? [yn] y
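The Emacs side of the setup Michael describes might look roughly like the following in .emacs. This is a sketch under assumptions, not his actual configuration: the fontset name `fontset-cyr` and both X font patterns are placeholders that depend on the fonts installed, and the spec string uses the syntax accepted by `create-fontset-from-fontset-spec`. The locale (ru_RU.KOI8-R) is set outside Emacs.

```elisp
;; Sketch only; the font patterns and the fontset name are assumptions.
(set-language-environment "Cyrillic-ISO")

;; Define a fontset that explicitly maps the cyrillic-iso8859-5
;; character set to an ISO 8859-5 encoded X font.
(create-fontset-from-fontset-spec
 (concat
  "-*-fixed-medium-r-normal-*-16-*-*-*-*-*-fontset-cyr,"
  "cyrillic-iso8859-5:-*-fixed-medium-r-normal-*-16-*-*-*-*-*-iso8859-5"))

;; Use the new fontset for frames created from now on.
(add-to-list 'default-frame-alist '(font . "fontset-cyr"))
```

Whether the explicit mapping is still needed probably depends on the Emacs version; the thread only establishes it for the 21.3.50 CVS build.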
* Re: getting Mule, Unicode & X selection to play together
From: Eli Zaretskii @ 2002-12-18 5:47 UTC

On Wed, 18 Dec 2002, Michael Livshin wrote:

> to possible future sufferers: the _key_ thing about getting anything
> MULE-related to work seems to be /letting go/. by any means, don't
> try logic! you'll spend hours in pain, you'll pull half your hair
> out, and it just won't work for you.

OTOH, you _can_ try logic after you read the sources ;-)

> not
> forgetting to explicitly map the "cyrillic-iso8859-5" encoding to an
> iso8859-5 font in the fontset (no sir, Emacs *won't* grok it by
> itself, how could it?)

I'm surprised this is so, but if it is, it sounds like a bug, so please
report it (with a precise test case to reproduce) to
emacs-pretest-bug@gnu.org. Thanks.

Btw, cyrillic-iso8859-5 is not an encoding, it's a character set.
But that's nitpicking.
* Re: getting Mule, Unicode & X selection to play together
From: Michael Livshin @ 2002-12-18 10:44 UTC

Eli Zaretskii <eliz@is.elta.co.il> writes:

>> not forgetting to explicitly map the "cyrillic-iso8859-5" encoding
>> to an iso8859-5 font in the fontset (no sir, Emacs *won't* grok it
>> by itself, how could it?)
>
> I'm surprised this is so, but if it is, it sounds like a bug, so please
> report it (with a precise test case to reproduce) to
> emacs-pretest-bug@gnu.org. Thanks.

well, let me clarify. Emacs *did* find an appropriately-encoded font,
the problem was that it took the iso8859-5 font from the "standard"
fontset and not from my fontset, even though there surely *is* a
matching font in my fontset (and it works great once I map
"cyrillic-iso8859-5" to it explicitly).

perhaps it's a kind of feature? it would be nice to simply be able to
tell Emacs to forget the standard fontset altogether, or at least to
completely ignore it.

> Btw, cyrillic-iso8859-5 is not an encoding, it's a character set.
> But that's nitpicking.

it's kind of subtle, so it's certainly worth pointing out. thank you.

--
Incrementally extended heuristic algorithms tend inexorably toward the
incomprehensible.
* Re: getting Mule, Unicode & X selection to play together
From: Eli Zaretskii @ 2002-12-18 10:56 UTC

On Wed, 18 Dec 2002, Michael Livshin wrote:

> well, let me clarify. Emacs *did* find an appropriately-encoded font,
> the problem was that it took the iso8859-5 font from the "standard"
> fontset and not from my fontset, even though there surely *is* a
> matching font in my fontset (and it works great once I map
> "cyrillic-iso8859-5" to it explicitly).
>
> perhaps it's a kind of feature?

I still think it's worth reporting as a possible misfeature or bug.
(I myself don't know enough about fontsets to give an opinion.)