all messages for Emacs-related lists mirrored at yhetil.org
* emacs, umlauts,  x-windows text mark and paste
@ 2003-12-16  8:36 josh buhl
  2003-12-16  9:11 ` Sergei Pokrovsky
                   ` (4 more replies)
  0 siblings, 5 replies; 24+ messages in thread
From: josh buhl @ 2003-12-16  8:36 UTC (permalink / raw)


Hi,

When I mark text in some other application that contains non-ascii 
characters, e.g. in mozilla from a german webpage containing ö ä ü ß, 
and paste it into an emacs buffer, then all the special characters get 
shown as control sequences. Here's an example:

I mark this text in mozilla:

Soße wird in einer extra Soßenschüssel...


Paste it into my Emacs buffer and get this:

So\x{00DF}e wird in einer extra So\x{00DF}ensch\x{00FC}ssel...



Questions:

1. how can I get this to work properly?

2. which command could I execute in emacs to get it to switch the 
encoding of the current buffer, or whatever, so that the garbled 
characters magically get converted to what they're supposed to be?

3. why is this like this?

I've struggled with this problem _for years_ and never have found an 
easy solution. I'm running emacs 21.3.1 on debian testing under gnome 
2.4 with english as the default language environment. I do know that if 
I log out, and log in setting the session language to german, then I can 
cut and paste german text into an emacs buffer with no problem. However, 
it's not enough to set the LANG variable: if I open a terminal, set 
LANG=german or de, and start emacs, it still doesn't work.

I've read the mule section in the emacs manual about ten times, but 
never have been able to make heads or tails of it.

If anybody can help me clear up this problem I'd really appreciate it!

-jb

-- 
I awaken to hear for the first time
the music I remember I've known all along,
and find it playing Everywhere.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-16  8:36 emacs, umlauts, x-windows text mark and paste josh buhl
@ 2003-12-16  9:11 ` Sergei Pokrovsky
  2003-12-16  9:40 ` Harald Maier
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 24+ messages in thread
From: Sergei Pokrovsky @ 2003-12-16  9:11 UTC (permalink / raw)


>>>>> "josh" == josh buhl <uzs33d@uni-bonn.de> writes:

[...]

  josh> I've struggled with this problem _for years_ and never have found an
  josh> easy solution. I'm running emacs 21.3.1 on debian testing under gnome
  josh> 2.4 with english as the default language environment. I do know that
  josh> if I log out, and log in setting the session language to german, then
  josh> I can cut and paste german text into an emacs buffer with no
  josh> problem. However, it's not enough to set the LANG variable: if I open
  josh> a terminal, set LANG=german or de, and start emacs, it still doesn't
  josh> work.

It's not $LANG which matters ($LANG concerns the message language),
it's $LC_ALL generally (or maybe $LC_CTYPE specifically).
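
For reference, POSIX resolves a locale category such as LC_CTYPE by
checking LC_ALL first, then the category variable itself, and only then
LANG as a fallback. A minimal Python sketch of that lookup follows; the
function name effective_ctype is purely illustrative, an empty value is
treated as unset, and the final "C" default is an assumption about the
typical system default:

```python
import os

def effective_ctype(env=None):
    """Return the locale name that governs LC_CTYPE under POSIX precedence."""
    env = os.environ if env is None else env
    # LC_ALL overrides everything; LANG is only the last fallback.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:  # an empty string counts as unset
            return value
    return "C"  # assumed default when none of the variables is set

print(effective_ctype({"LANG": "de_DE", "LC_ALL": "en_US.UTF-8"}))
```

This only models the variable lookup; whether a given name such as
"german" is actually a valid locale on the system is a separate question.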

Here on Solaris I've set the locale to en_US.UTF-8 and I can copy
between emacs and other applications.  Not quite well, though: when I
copy Russian text from Emacs buffer it works up to the first
whitespace (the first word is copied correctly, the rest is garbage).

But there is no such problem for the Latin alphabets, which seems to
be your case.

[...]

-- 
Sergei


* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-16  8:36 emacs, umlauts, x-windows text mark and paste josh buhl
  2003-12-16  9:11 ` Sergei Pokrovsky
@ 2003-12-16  9:40 ` Harald Maier
  2003-12-16 10:11 ` erasurehead
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 24+ messages in thread
From: Harald Maier @ 2003-12-16  9:40 UTC (permalink / raw)


josh buhl <uzs33d@uni-bonn.de> writes:

> Hi,
>
> When I mark text in some other application that contains non-ascii
> characters, e.g. in mozilla from a german webpage containing ö ä ü ß,
> and paste it into an emacs buffer, then all the special characters get
> shown as control sequences. Here's an example:
>
> I mark this text in mozilla:
>
> Soße wird in einer extra Soßenschüssel...
>
> Paste it into my Emacs buffer and get this:
>
> So\x{00DF}e wird in einer extra So\x{00DF}ensch\x{00FC}ssel...
>
> Questions:
>
> 1. how can I get this to work properly?

Works fine for me out of the box. In mozilla the character coding is
Western ISO8859-1 or ISO8859-15.

>
> 2. which command could I execute in emacs to get it to switch the
> encoding of the current buffer, or whatever, so that the garbled
> characters magically get converted to what they're supposed to be?

C-x RET f (set-buffer-file-coding-system)

>
> 3. why is this like this?
>
> I've struggled with this problem _for years_ and never have found an
> easy solution. I'm running emacs 21.3.1 on debian testing under gnome
> 2.4 with english as the default language environment. I do know that
> if I log out, and log in setting the session language to german, then
> I can cut and paste german text into an emacs buffer with no
> problem. However, it's not enough to set the LANG variable: if I open
> a terminal, set LANG=german or de, and start emacs, it still doesn't
> work.

I am using as LANG environment de_DE or de_DE@euro. But it also works
fine with "C". As far as I know "german" is not a valid entry. In
emacs you can see the LANG value as follows:

M-: (getenv "LANG")

Harald


* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-16  8:36 emacs, umlauts, x-windows text mark and paste josh buhl
  2003-12-16  9:11 ` Sergei Pokrovsky
  2003-12-16  9:40 ` Harald Maier
@ 2003-12-16 10:11 ` erasurehead
  2003-12-16 10:33   ` Harald Maier
  2003-12-16 10:57   ` Sergei Pokrovsky
  2003-12-16 10:12 ` josh buhl
  2003-12-18  9:54 ` josh buhl
  4 siblings, 2 replies; 24+ messages in thread
From: erasurehead @ 2003-12-16 10:11 UTC (permalink / raw)


Hello Harald and Sergei,

thanks for your replies.

I've tried setting LC_ALL to german, C, de_DE, de_DE@euro, and 
en_US.UTF-8 and then starting emacs. I can tell this makes a difference 
because the symbol in the lower lefthand corner of emacs which indicates 
the default encoding changes appropriately (e.g. -1 for de_DE (-> 
iso-8859-1), and -u for en_US.UTF-8). However, pasting in the german text 
still makes garbage.

Also, I tried running C-x RET f (set-buffer-file-coding-system) on the 
garbled buffer and tried many different encodings (which one should I 
choose?), but it never changes the text in the buffer *one whit*! I 
would expect it to junk the entire screen if I pick chinese-big5-unix, 
but it doesn't seem to care, the buffer still looks the same. I had 
already tried this many times after reading all the mule garbage in the 
manual, and have never been able to get emacs to redisplay some buffer 
in a different encoding.
:-(

very frustrated,

josh buhl


* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-16  8:36 emacs, umlauts, x-windows text mark and paste josh buhl
                   ` (2 preceding siblings ...)
  2003-12-16 10:11 ` erasurehead
@ 2003-12-16 10:12 ` josh buhl
  2003-12-18  9:54 ` josh buhl
  4 siblings, 0 replies; 24+ messages in thread
From: josh buhl @ 2003-12-16 10:12 UTC (permalink / raw)


oops, wrong account. the reply from "erasurehead" was from me.

-jb


* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-16 10:11 ` erasurehead
@ 2003-12-16 10:33   ` Harald Maier
  2003-12-16 10:54     ` josh buhl
  2003-12-16 10:57   ` Sergei Pokrovsky
  1 sibling, 1 reply; 24+ messages in thread
From: Harald Maier @ 2003-12-16 10:33 UTC (permalink / raw)


erasurehead <erasurehead@goatrance.net> writes:

> I've tried setting LC_ALL to german, C, de_DE, de_DE@euro, and
> en_US.UTF-8 and then starting emacs. I can tell this makes a
> difference because the symbol in the lower lefthand corner of emacs
> which indicates the default encoding changes appropriately (e.g. -1
> for de_DE (->
> iso-8859-1), and -u for en_US.UTF-8) However pasting in the german
> text still makes garbage.

I fear the character coding in mozilla differs from the emacs
encoding. What is the character coding in mozilla? You should find it
in mozilla under View -> Character Coding. 

Harald

PS: I am not an expert in this mule stuff either.


* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-16 10:33   ` Harald Maier
@ 2003-12-16 10:54     ` josh buhl
  2003-12-16 11:16       ` josh buhl
  0 siblings, 1 reply; 24+ messages in thread
From: josh buhl @ 2003-12-16 10:54 UTC (permalink / raw)


Harald Maier wrote:
> I fear the character coding in mozilla differs from the emacs
> encoding. What is the character coding in mozilla? You should find it
> in mozilla under View -> Character Coding. 

On the page I've been using for test purposes Mozilla says iso-8859-1, 
which emacs should be able to handle, but this has actually nothing to 
do with mozilla. It's a problem with how the text is being put in the 
x-windows systems clipboard and how emacs is reading it back out. Only 
emacs seems to have a problem with this. Running my usual C (en) locale, 
I can mark and copy this same german text back and forth between open 
office writer, gedit, kword, kedit, and even the ancient xedit. They all 
get the special characters properly, only Emacs barfs, and Emacs is 
otherwise (in my humble opinion) the most advanced editor in the world, 
but it can't seem to get a simple copy and paste right. What gives?

What about this other point of telling emacs what encoding to use for a 
buffer, and then not having the text change at all? Probably you're just 
telling it what format to use when it saves. What I'd like to see, is a 
command that tells emacs to reinterpret the buffer assuming that the 
encoding is different, so that, e.g., if I pick a chinese encoding and 
have entered ascii text, it should junk the screen, but if I've pasted 
in chinese text, and the buffer looks junked, then the chinese suddenly 
appears when selecting the correct encoding.
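
The operation described here amounts to taking the same bytes and
decoding them again under a different assumed encoding. A small Python
sketch of that idea follows; it is an illustration of the concept, not
an Emacs command, and the helper name reinterpret is made up:

```python
def reinterpret(text, wrong, right):
    """Undo a wrong decoding: recover the bytes, then decode them properly."""
    return text.encode(wrong).decode(right)

original = "Soße"

# UTF-8 bytes misread as Latin-1 come out "junked" ...
junked = original.encode("utf-8").decode("latin-1")
print(repr(junked))

# ... and decoding the same bytes with the right encoding restores the text.
print(reinterpret(junked, "latin-1", "utf-8"))
```

This only works when the wrong decoding was lossless, as Latin-1 is for
arbitrary bytes; a decoding that dropped or replaced bytes cannot be
undone this way.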


still frustrated ;-)
-jb


* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-16 10:11 ` erasurehead
  2003-12-16 10:33   ` Harald Maier
@ 2003-12-16 10:57   ` Sergei Pokrovsky
  1 sibling, 0 replies; 24+ messages in thread
From: Sergei Pokrovsky @ 2003-12-16 10:57 UTC (permalink / raw)


>>>>> "josh" == erasurehead  <erasurehead@goatrance.net> writes:

  josh> Hello Harald and Sergei,
  josh> thanks for your replies.

  josh> I've tried setting LC_ALL to german, C, de_DE, de_DE@euro, and
  josh> en_US.UTF-8 and then starting emacs.

And what about the other application (your browser)?  I guess, it is
important that both of them are set consistently, so that the
clipboard treatment is the same on both sides.

  josh> I can tell this makes a difference because the symbol in the
  josh> lower lefthand corner of emacs which indicates the default
  josh> encoding changes appropriately (e.g. -1 for de_DE (->
  josh> iso-8859-1), and -u for en_US.UTF-8) However pasting in the
  josh> german text still makes garbage.

  josh> Also, I tried running C-x RET f
  josh> (set-buffer-file-coding-system) on the garbled buffer and
  josh> tried many different encodings (which one should I choose?),
  josh> but it never changes the text in the buffer *one whit*!

The buffer uses the internal Emacs encoding, and the
file-coding-system should matter only when you save or "find" (= open)
the file.
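
The same distinction can be shown outside Emacs. In the Python sketch
below (an analogy only, not a description of how Emacs is implemented),
the in-memory text stays the same while the choice of encoding changes
only the bytes that would be written out:

```python
text = "Soßenschüssel"  # in-memory text, independent of any file encoding

as_latin1 = text.encode("iso-8859-1")  # bytes that a Latin-1 save would write
as_utf8 = text.encode("utf-8")         # bytes that a UTF-8 save would write

# The byte sequences differ in length and content ...
print(len(as_latin1), len(as_utf8))

# ... but reading each back with its own encoding gives the same text.
print(as_latin1.decode("iso-8859-1") == as_utf8.decode("utf-8") == text)
```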

  josh> I would expect it to junk the entire screen if I pick
  josh> chinese-big5-unix, but it doesn't seem to care, the buffer
  josh> still looks the same. I had already tried this many times

[...]

-- 
Sergei


* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-16 10:54     ` josh buhl
@ 2003-12-16 11:16       ` josh buhl
  2003-12-17  6:07         ` Eli Zaretskii
       [not found]         ` <mailman.175.1071644836.868.help-gnu-emacs@gnu.org>
  0 siblings, 2 replies; 24+ messages in thread
From: josh buhl @ 2003-12-16 11:16 UTC (permalink / raw)


So, I've narrowed it down further. It seems to be an emacs/gtk+ 2 
problem. Emacs has  a problem pasting in text with non-ascii characters 
from any of the apps which are compiled with gtk+ 2. Emacs inserts the 
text properly when it has been freshly marked in kword, kate, xedit, or 
open office writer, and barfs if the same text has been marked in 
mozilla, gedit, or *any gtk+ 2* dialog like any of the gnome dialogs. So 
I can mark a text in mozilla, paste it into xedit, _re-mark_ it and paste 
it into emacs, and it works, but if I don't re-mark, emacs barfs.

I'm sure this is related to this general problem:

ISO 14755 specifies using Ctrl+Shift+hex-digit to input unicode.
gtk2 implemented ISO 14755 input method.

There are several apps which are now having problems with this (see
http://bugzilla.mozilla.org/show_bug.cgi?id=186789 for example.)


The garbled text corresponds exactly to the unicode hex code points of 
the characters. For example, hex 00DF is ß, and emacs displays the pasted 
in ß as \x{00DF}. This certainly isn't a coincidence. Why does it work 
if I login with the session language set to german, but not if I set LC_ALL?
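
Since each escape spells out a code point in hex, already-pasted text
can in principle be repaired after the fact. The following Python sketch
is a workaround under that assumption, not a fix for the underlying
problem; the name unescape and the 4-to-6-digit pattern are illustrative
choices:

```python
import re

# Matches a literal backslash-x escape such as \x{00DF}.
ESCAPE = re.compile(r"\\x\{([0-9A-Fa-f]{4,6})\}")

def unescape(text):
    """Replace each literal \\x{HHHH} with the character at that code point."""
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

pasted = r"So\x{00DF}e wird in einer extra So\x{00DF}ensch\x{00FC}ssel..."
print(unescape(pasted))
```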


-jb



josh buhl wrote:

> On the page I've been using for test purposes Mozilla says iso-8859-1, 
> which emacs should be able to handle, but this has actually nothing to 
> do with mozilla. It's a problem with how the text is being put in the 
> x-windows systems clipboard and how emacs is reading it back out. Only 
> emacs seems to have a problem with this. Running my usual C (en) locale, 
> I can mark and copy this same german text back and forth between open 
> office writer, gedit, kword, kedit, and even the ancient xedit. They all 
> get the special characters properly, only Emacs barfs, and Emacs is 
> otherwise (in my humble opinion) the most advanced editor in the world, 
> but it can't seem to get a simple copy and paste right. What gives?


* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-16 11:16       ` josh buhl
@ 2003-12-17  6:07         ` Eli Zaretskii
       [not found]         ` <mailman.175.1071644836.868.help-gnu-emacs@gnu.org>
  1 sibling, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2003-12-17  6:07 UTC (permalink / raw)


> From: josh buhl <uzs33d@uni-bonn.de>
> Newsgroups: gnu.emacs.help
> Date: Tue, 16 Dec 2003 12:16:28 +0100
> 
> I'm sure this is related to this general problem:
> 
> ISO 14755 specifies using Ctrl+Shift+hex-digit to input unicode.
> gtk2 implemented ISO 14755 input method.

Does it help to say

  C-x RET X utf-8 RET

immediately before pasting a selection from a GTK2-compiled
application?

This command tells Emacs to assume that text in the X selection is
encoded in UTF-8.


* Re: emacs, umlauts,  x-windows text mark and paste
       [not found]         ` <mailman.175.1071644836.868.help-gnu-emacs@gnu.org>
@ 2003-12-17  8:17           ` josh buhl
  2003-12-17  8:19           ` josh buhl
  1 sibling, 0 replies; 24+ messages in thread
From: josh buhl @ 2003-12-17  8:17 UTC (permalink / raw)


Eli Zaretskii wrote:
>>From: josh buhl <uzs33d@uni-bonn.de>
>>Newsgroups: gnu.emacs.help
>>Date: Tue, 16 Dec 2003 12:16:28 +0100
>>
>>I'm sure this is related to this general problem:
>>
>>ISO 14755 specifies using Ctrl+Shift+hex-digit to input unicode.
>>gtk2 implemented ISO 14755 input method.
> 
> 
> Does it help to say
> 
>   C-x RET X utf-8 RET

Sounded promising, but unfortunately, no.


-jb


* Re: emacs, umlauts,  x-windows text mark and paste
       [not found]         ` <mailman.175.1071644836.868.help-gnu-emacs@gnu.org>
  2003-12-17  8:17           ` josh buhl
@ 2003-12-17  8:19           ` josh buhl
  2003-12-17  9:25             ` Eli Zaretskii
       [not found]             ` <mailman.181.1071656744.868.help-gnu-emacs@gnu.org>
  1 sibling, 2 replies; 24+ messages in thread
From: josh buhl @ 2003-12-17  8:19 UTC (permalink / raw)


Eli Zaretskii wrote:
>>From: josh buhl <uzs33d@uni-bonn.de>
>>Newsgroups: gnu.emacs.help
>>Date: Tue, 16 Dec 2003 12:16:28 +0100
>>
>>I'm sure this is related to this general problem:
>>
>>ISO 14755 specifies using Ctrl+Shift+hex-digit to input unicode.
>>gtk2 implemented ISO 14755 input method.
> 
> 
> Does it help to say
> 
>   C-x RET X utf-8 RET

and C-x RET X iso-8859-1 RET doesn't work either.

-jb


* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-17  8:19           ` josh buhl
@ 2003-12-17  9:25             ` Eli Zaretskii
       [not found]             ` <mailman.181.1071656744.868.help-gnu-emacs@gnu.org>
  1 sibling, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2003-12-17  9:25 UTC (permalink / raw)


> From: josh buhl <uzs33d@uni-bonn.de>
> Newsgroups: gnu.emacs.help
> Date: Wed, 17 Dec 2003 09:19:46 +0100
> > 
> > Does it help to say
> > 
> >   C-x RET X utf-8 RET
> 
> and C-x RET X iso-8859-1 RET doesn't work either.

What do you get if you say

  C-x RET X raw-text RET

before pasting?  Please tell what characters are pasted into the Emacs
buffer in this case.

Also, if you can find some tool that will show you the contents of the
X selection buffer after you cut/copy from a gtk2 application, please
tell what that tool shows.  Perhaps by looking at the contents of
the selection buffer we will be able to guess what kind of encoding is
used.
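
To show what such a byte-level view would reveal, here is a Python
sketch that prints one character in a few candidate encodings. It does
not read the X selection itself; it assumes the raw bytes have been
captured by some other means, and the helper name hex_bytes is
illustrative:

```python
def hex_bytes(data):
    """Render a bytes object as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in data)

# The same character yields a distinct byte pattern per encoding,
# which is what makes the encoding guessable from a dump.
for name in ("iso-8859-1", "utf-8", "utf-16-be"):
    print(f"{name:>10}: {hex_bytes('ß'.encode(name))}")
```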

I'm also slightly confused by the fact that someone said in this
thread they have no problems pasting from Mozilla.  Could it be
something in the setup of your system?  Does it help to invoke Emacs
with "emacs -q --no-site-file"?


* Re: emacs, umlauts,  x-windows text mark and paste
       [not found]             ` <mailman.181.1071656744.868.help-gnu-emacs@gnu.org>
@ 2003-12-17  9:57               ` Sergei Pokrovsky
  2003-12-17 10:51               ` Harald Maier
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 24+ messages in thread
From: Sergei Pokrovsky @ 2003-12-17  9:57 UTC (permalink / raw)


>>>>> "EZ" == Eli Zaretskii <eliz@elta.co.il> writes:

[...]

  EZ> I'm also slightly confused by the fact that someone told in this
  EZ> thread they have no problems pasting from Mozilla.

Do you mean me?

I have no problem in pasting Unicode from Netscape 7.0 = Mozilla/5.0
(X11; U; SunOS sun4u; ru-RU; rv:1.0.1) Gecko/20020920 Netscape/7.0

but that's on Solaris, in sessions started with "en_US.UTF-8".

I've also mentioned that there _is_ a problem in copying Russian text
_from_ Emacs into Netscape form fields: Every first Russian word in a
line is copied correctly (it need not be the very first word in a
line: there may be some ASCII words before it); but all the Russian
words after a space following that Russian word are in Latin-1.  OTOH
a TAB does not have that evil effect.

GNU Emacs 21.3.2 (sparc-sun-solaris2.8, X toolkit).

[...]

-- 
Sergei


* Re: emacs, umlauts,  x-windows text mark and paste
       [not found]             ` <mailman.181.1071656744.868.help-gnu-emacs@gnu.org>
  2003-12-17  9:57               ` Sergei Pokrovsky
@ 2003-12-17 10:51               ` Harald Maier
  2003-12-17 13:56                 ` josh buhl
  2003-12-17 11:43               ` Edi Weitz
  2003-12-17 13:54               ` erasurehead
  3 siblings, 1 reply; 24+ messages in thread
From: Harald Maier @ 2003-12-17 10:51 UTC (permalink / raw)


Eli Zaretskii <eliz@elta.co.il> writes:

> I'm also slightly confused by the fact that someone told in this
> thread they have no problems pasting from Mozilla.  Could it be
> something in the setup of your system?  Does it help to invoke Emacs
> with "emacs -q --no-site-file"?

That was me. I am using mozilla-1.5a on a SuSE GNU/Linux system:

ldd /opt/mozilla-1.5a/mozilla-bin 
...
...
        libgtk-1.2.so.0 => /usr/lib/libgtk-1.2.so.0 (0x40077000)
        libgdk-1.2.so.0 => /usr/lib/libgdk-1.2.so.0 (0x401b9000)
        libgmodule-1.2.so.0 => /usr/lib/libgmodule-1.2.so.0 (0x401f2000)
        libglib-1.2.so.0 => /usr/lib/libglib-1.2.so.0 (0x401f5000)

If I understand Josh correctly then he is using a newer GTK version.

Harald


* Re: emacs, umlauts,  x-windows text mark and paste
       [not found]             ` <mailman.181.1071656744.868.help-gnu-emacs@gnu.org>
  2003-12-17  9:57               ` Sergei Pokrovsky
  2003-12-17 10:51               ` Harald Maier
@ 2003-12-17 11:43               ` Edi Weitz
  2003-12-17 18:20                 ` Eli Zaretskii
  2003-12-17 13:54               ` erasurehead
  3 siblings, 1 reply; 24+ messages in thread
From: Edi Weitz @ 2003-12-17 11:43 UTC (permalink / raw)


On 17 Dec 2003 11:25:22 +0200, Eli Zaretskii <eliz@elta.co.il> wrote:

>> From: josh buhl <uzs33d@uni-bonn.de>
>> Newsgroups: gnu.emacs.help
>> Date: Wed, 17 Dec 2003 09:19:46 +0100
>> > 
>> > Does it help to say
>> > 
>> >   C-x RET X utf-8 RET
>> 
>> and C-x RET X iso-8859-1 RET doesn't work either.
>
> What do you get if you say
>
>   C-x RET X raw-text RET
>
> before pasting?  Please tell what characters are pasted into the Emacs
> buffer in this case.

Let me chime in here a bit late and note that I have a similar
problem. Pasting text selected in Mozilla will result in question
marks instead of umlauts in Emacs (no matter which C-x RET x
... setting I use). Likewise, I see the correct characters with Opera
or Konqueror, no matter which coding system I use.

> Also, if you can find some tool that will show you the contents of
> the X selection buffer after you cut/copy from a gtk2 application,
> please tell what does that tool show.  Perhaps by looking at the
> contents of the selection buffer we will be able to guess what kind
> of encoding is used.

Which tools would that be?

Edi.


* Re: emacs, umlauts,  x-windows text mark and paste
       [not found]             ` <mailman.181.1071656744.868.help-gnu-emacs@gnu.org>
                                 ` (2 preceding siblings ...)
  2003-12-17 11:43               ` Edi Weitz
@ 2003-12-17 13:54               ` erasurehead
  2003-12-17 14:00                 ` josh buhl
                                   ` (2 more replies)
  3 siblings, 3 replies; 24+ messages in thread
From: erasurehead @ 2003-12-17 13:54 UTC (permalink / raw)




Eli Zaretskii wrote:
> What do you get if you say
> 
>   C-x RET X raw-text RET
> 
> before pasting?  Please tell what characters are pasted into the Emacs
> buffer in this case.

same thing. I mark this text in mozilla:

Minuten köcheln lassen. Diese Soße sollte weder zu dick noch zu dünn sein


and I get this in emacs:

Minuten k\x{00F6}cheln lassen. Diese So\x{00DF}e sollte weder zu dick 
noch zu d\x{00FC}nn sein

and the control sequences aren't one character, they are in the buffer 
as literal (e.g.) "\x{00F6}"  consisting of the 8 ascii characters you see.
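
If the escape really is plain ASCII by the time it is read, that would
explain why no coding-system choice changes the result: ASCII bytes
decode identically under all of the encodings tried. A Python sketch of
that observation, offered as an illustration rather than a diagnosis:

```python
# Raw bytes literal: the escape is eight literal ASCII characters.
pasted = rb"k\x{00F6}cheln"

decoded = {name: pasted.decode(name)
           for name in ("ascii", "iso-8859-1", "utf-8")}

# All decodings agree, so the choice of coding system cannot matter here.
print(len(set(decoded.values())) == 1)

# Length of one escape sequence in characters.
print(len(r"\x{00F6}"))
```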

> 
> Also, if you can find some tool that will show you the contents of the
> X selection buffer after you cut/copy from a gtk2 application, please
> tell what does that tool show.  Perhaps by looking at the contents of
> the selection buffer we will be able to guess what kind of encoding is
> used.

I'll try to find one.

> 
> I'm also slightly confused by the fact that someone told in this
> thread they have no problems pasting from Mozilla.

It works properly for german language stuff if I log in setting the 
session language to german (who knows what gnome does with the gtk2 
pango stuff in this case), but then it doesn't copy other languages 
properly. It also doesn't work if I set LC_ALL to any type of german or 
utf locale in a terminal window and then start emacs, though I can 
verify that this has an effect since it changes the default encoding for 
saving a buffer.


> Could it be
> something in the setup of your system?  Does it help to invoke Emacs
> with "emacs -q --no-site-file"?

No. same thing.


* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-17 10:51               ` Harald Maier
@ 2003-12-17 13:56                 ` josh buhl
  0 siblings, 0 replies; 24+ messages in thread
From: josh buhl @ 2003-12-17 13:56 UTC (permalink / raw)


Here's my ldd output:
josh@spleen:/home/music/misc$ ldd /usr/lib/mozilla/mozilla-bin
         libmozjs.so => /usr/lib/libmozjs.so (0x40028000)
         libplds4.so => /usr/lib/libplds4.so (0x400a7000)
         libplc4.so => /usr/lib/libplc4.so (0x400aa000)
         libnspr4.so => /usr/lib/libnspr4.so (0x400b0000)
         libpthread.so.0 => /lib/libpthread.so.0 (0x400e4000)
         libdl.so.2 => /lib/libdl.so.2 (0x40135000)
         libgtk-x11-2.0.so.0 => /usr/lib/libgtk-x11-2.0.so.0 (0x40138000)
         libgdk-x11-2.0.so.0 => /usr/lib/libgdk-x11-2.0.so.0 (0x4038a000)
         libatk-1.0.so.0 => /usr/lib/libatk-1.0.so.0 (0x403f7000)
         libgdk_pixbuf-2.0.so.0 => /usr/lib/libgdk_pixbuf-2.0.so.0 (0x40412000)
         libpangoxft-1.0.so.0 => /usr/lib/libpangoxft-1.0.so.0 (0x40425000)
         libpangox-1.0.so.0 => /usr/lib/libpangox-1.0.so.0 (0x40446000)
         libpango-1.0.so.0 => /usr/lib/libpango-1.0.so.0 (0x40453000)
         libgobject-2.0.so.0 => /usr/lib/libgobject-2.0.so.0 (0x40486000)
         libgmodule-2.0.so.0 => /usr/lib/libgmodule-2.0.so.0 (0x404b6000)
         libglib-2.0.so.0 => /usr/lib/libglib-2.0.so.0 (0x404bb000)
         libm.so.6 => /lib/libm.so.6 (0x4051f000)
         libstdc++.so.5 => /usr/lib/libstdc++.so.5 (0x40541000)
         libc.so.6 => /lib/libc.so.6 (0x405f8000)
         libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x4072a000)
         /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
         libX11.so.6 => /usr/X11R6/lib/libX11.so.6 (0x40732000)
         libXi.so.6 => /usr/X11R6/lib/libXi.so.6 (0x407fa000)
         libXext.so.6 => /usr/X11R6/lib/libXext.so.6 (0x40802000)
         libXft.so.2 => /usr/lib/libXft.so.2 (0x40810000)
         libXrender.so.1 => /usr/lib/libXrender.so.1 (0x40822000)
         libfontconfig.so.1 => /usr/lib/libfontconfig.so.1 (0x4082a000)
         libfreetype.so.6 => /usr/lib/libfreetype.so.6 (0x40851000)
         libz.so.1 => /usr/lib/libz.so.1 (0x408be000)
         libexpat.so.1 => /usr/lib/libexpat.so.1 (0x408cf000)
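
The GTK generation can be read off such a listing mechanically. The
Python sketch below assumes the library naming seen in the two ldd
outputs in this thread (libgtk-1.2.so.0 and libgtk-x11-2.0.so.0); other
systems may name the library differently, and the function name
gtk_version is illustrative:

```python
import re

# Matches libgtk-1.2.so... as well as libgtk-x11-2.0.so...
GTK_LIB = re.compile(r"\blibgtk(?:-x11)?-(\d+)\.(\d+)\.so")

def gtk_version(ldd_output):
    """Return (major, minor) of the first GTK library listed, or None."""
    match = GTK_LIB.search(ldd_output)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))

print(gtk_version("libgtk-1.2.so.0 => /usr/lib/libgtk-1.2.so.0 (0x40077000)"))
print(gtk_version("libgtk-x11-2.0.so.0 => /usr/lib/libgtk-x11-2.0.so.0 (0x40138000)"))
```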



Harald Maier wrote:
> That was me. I am using mozilla-1.5a on a SuSE GNU/Linux system:
> 
> ldd /opt/mozilla-1.5a/mozilla-bin 
> ...
> ...
>         libgtk-1.2.so.0 => /usr/lib/libgtk-1.2.so.0 (0x40077000)
>         libgdk-1.2.so.0 => /usr/lib/libgdk-1.2.so.0 (0x401b9000)
>         libgmodule-1.2.so.0 => /usr/lib/libgmodule-1.2.so.0 (0x401f2000)
>         libglib-1.2.so.0 => /usr/lib/libglib-1.2.so.0 (0x401f5000)
> 
> If I understand Josh correctly then he is using a newer GTK version.
> 
> Harald

-- 
I awaken to hear for the first time
the music I remember I've known all along,
and find it playing Everywhere.


* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-17 13:54               ` erasurehead
@ 2003-12-17 14:00                 ` josh buhl
  2003-12-17 18:24                 ` Eli Zaretskii
       [not found]                 ` <mailman.210.1071689295.868.help-gnu-emacs@gnu.org>
  2 siblings, 0 replies; 24+ messages in thread
From: josh buhl @ 2003-12-17 14:00 UTC (permalink / raw)


Sorry Eli, sent that from the wrong account. This was a reply to your 
questions.

erasurehead wrote:
> 
> 
> Eli Zaretskii wrote:
> 
>> What do you get if you say
>>
>>   C-x RET X raw-text RET
>>
>> before pasting?  Please tell what characters are pasted into the Emacs
>> buffer in this case.
> 
> 
> same thing. I mark this text in mozilla:
> 
> Minuten köcheln lassen. Diese Soße sollte weder zu dick noch zu dünn sein
> 
> 
> and I get this in emacs:
> 
> Minuten k\x{00F6}cheln lassen. Diese So\x{00DF}e sollte weder zu dick 
> noch zu d\x{00FC}nn sein
> 
> and the control sequences aren't one character, they are in the buffer 
> as literal (e.g.) "\x{00F6}"  consisting of the 8 ascii characters you see.
> 
>>
>> Also, if you can find some tool that will show you the contents of the
>> X selection buffer after you cut/copy from a gtk2 application, please
>> tell what does that tool show.  Perhaps by looking at the contents of
>> the selection buffer we will be able to guess what kind of encoding is
>> used.
> 
> 
> I'll try to find one.
> 
>>
>> I'm also slightly confused by the fact that someone told in this
>> thread they have no problems pasting from Mozilla.
> 
> 
> It works properly for german language stuff if I log in setting the 
> session language to german (who knows what gnome does with the gtk2 
> pango stuff in this case), but then it doesn't copy other languages 
> properly. It also doesn't work if I set LC_ALL to any type of german or 
> utf locale in a terminal window and then start emacs, though I can 
> verify that this has an effect since it changes the default encoding for 
> saving a buffer.
> 
> 
>   Could it be
> 
>> something in the setup of your system?  Does it help to invoke Emacs
>> with "emacs -q --no-site-file"?
> 
> 
> No. same thing.
> 


* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-17 11:43               ` Edi Weitz
@ 2003-12-17 18:20                 ` Eli Zaretskii
  0 siblings, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2003-12-17 18:20 UTC (permalink / raw)


> From: Edi Weitz <edi@agharta.de>
> Newsgroups: gnu.emacs.help
> Date: Wed, 17 Dec 2003 12:43:01 +0100
> 
> > Also, if you can find some tool that will show you the contents of
> > the X selection buffer after you cut/copy from a gtk2 application,
> > please tell what does that tool show.  Perhaps by looking at the
> > contents of the selection buffer we will be able to guess what kind
> > of encoding is used.
> 
> Which tools would that be?

Sorry, I don't know.  I just assumed there should be something like
that.

Anybody?


* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-17 13:54               ` erasurehead
  2003-12-17 14:00                 ` josh buhl
@ 2003-12-17 18:24                 ` Eli Zaretskii
       [not found]                 ` <mailman.210.1071689295.868.help-gnu-emacs@gnu.org>
  2 siblings, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2003-12-17 18:24 UTC (permalink / raw)


> From: erasurehead <erasurehead@goatrance.net>
> Newsgroups: gnu.emacs.help
> Date: Wed, 17 Dec 2003 14:54:40 +0100
> 
> I get this in emacs:
> 
> Minuten k\x{00F6}cheln lassen. Diese So\x{00DF}e sollte weder zu dick 
> noch zu d\x{00FC}nn sein
> 
> and the control sequences aren't one character, they are in the buffer 
> as literal (e.g.) "\x{00F6}"  consisting of the 8 ascii characters you see.

It begins to sound as if gtk2/pango actually converts the text like
that when it puts it into the selection buffer.

Perhaps you should post a question to some forum related to gtk2
development, or google for similar issues.


* Re: emacs, umlauts,  x-windows text mark and paste
       [not found]                 ` <mailman.210.1071689295.868.help-gnu-emacs@gnu.org>
@ 2003-12-18  8:53                   ` josh buhl
  0 siblings, 0 replies; 24+ messages in thread
From: josh buhl @ 2003-12-18  8:53 UTC (permalink / raw)


Eli Zaretskii wrote:
>>and the control sequences aren't one character, they are in the buffer 
>>as literal (e.g.) "\x{00F6}"  consisting of the 8 ascii characters you see.
> 
> 
> It begins to sound as if gtk2/pango actually converts the text like
> that when it puts it into the selection buffer.

Maybe, but then why are the other non-gtk2 apps, like xedit, kedit, kate,
and OpenOffice, able to insert it properly?


> 
> Perhaps you should post a question to some forum related to gtk2
> development, or google for similar issues.

I'll see what I can find...

-jb


* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-16  8:36 emacs, umlauts, x-windows text mark and paste josh buhl
                   ` (3 preceding siblings ...)
  2003-12-16 10:12 ` josh buhl
@ 2003-12-18  9:54 ` josh buhl
  2003-12-18  9:56   ` josh buhl
  4 siblings, 1 reply; 24+ messages in thread
From: josh buhl @ 2003-12-18  9:54 UTC (permalink / raw)


I had also sent this report to bug-gnu-emacs@gnu.org and received this reply:

-------- Original Message --------
Subject: Re: [uzs33d@uni-bonn.de: gtk2, iso14755, pasting non-ascii 
characters,	and the x-windows clipboard]
Date: Thu, 18 Dec 2003 11:15:39 +0900 (JST)
From: Kenichi Handa <handa@m17n.org>
To: rms@gnu.org
CC: uzs33d@uni-bonn.de, bug-gnu-emacs@gnu.org, handa@m17n.org
References: <E1AWn7Q-0006EZ-Fs@fencepost.gnu.org>

In article <E1AWn7Q-0006EZ-Fs@fencepost.gnu.org>, Richard Stallman 
<rms@gnu.org> writes:
 > Would you please investigate this?

Ok.

 > From: josh buhl <uzs33d@uni-bonn.de>
 > Newsgroups: gnu.emacs.bug
 > To: bug-gnu-emacs@gnu.org
[...]
 > Subject: gtk2, iso14755, pasting non-ascii characters,
 > 	and the x-windows clipboard
[...]
 > Emacs has a problem pasting in text with non-ascii characters from any
 > of the apps which are compiled with gtk2 (via marking with mouse, and
 > inserting per mouse-2 click). Here's an example:

 > I mark this text from a german webpage displayed in mozilla 1.5
 > compiled with gtk2:

 > "Soße wird in einer extra Soßenschüssel..."

 > Paste it into my Emacs buffer and get this:

 > "So\x{00DF}e wird in einer extra So\x{00DF}ensch\x{00FC}ssel..."

Actually, this should be the exact text Emacs received from
the gtk2 application, thus it seems that gtk2 has a bug in
producing COMPOUND_TEXT.

 > Emacs inserts the text correctly when it has been marked in kword,
 > kate, xedit, open office writer, or any other non-gtk2 app, and barfs
 > if the same text has been marked in mozilla, gedit, or *any gtk+ 2*
 > dialog like any of the gnome 2.4 dialogs. So I can mark a text in
 > mozilla, paste it into xedit, _remark_ it and paste it into emacs, and
 > it works, but if I don't remark, emacs barfs. If I mark the text in
 > Emacs, then I can paste it correctly into any non-gtk2 app, but if I
 > try to paste it into a gtk2 app, *nothing* gets pasted in.

 > However, the gtk2 apps and the non-gtk2 apps aside from emacs, all
 > seem to be able to paste this text in from each other properly. Only
 > emacs has this problem.

Perhaps that is because the other apps use the UTF8_STRING request
on the selection (which is an XFree86 extension), whereas Emacs 21.3
uses only the COMPOUND_TEXT request (the X standard).  The latest
CVS version of Emacs supports UTF8_STRING.
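
To make the idea concrete, here is a minimal sketch of how a requestor
might choose a selection target.  In X a requestor can ask the owner for
TARGETS to learn which conversions it offers; everything else below
(the function name pick_target, the OFFERED list, and both preference
lists) is a hypothetical illustration, not the actual logic of Emacs or
any toolkit.

```python
def pick_target(offered, preferred):
    """Return the first target in `preferred` that the owner offers,
    or None if there is no overlap."""
    for target in preferred:
        if target in offered:
            return target
    return None


# Hypothetical list of targets a selection owner advertises.
OFFERED = ["UTF8_STRING", "COMPOUND_TEXT", "STRING"]

# A client that knows UTF8_STRING asks for it first.
utf8_aware = pick_target(OFFERED, ["UTF8_STRING", "COMPOUND_TEXT", "STRING"])

# A client that only knows COMPOUND_TEXT gets whatever the owner
# produces for that target, however it was converted.
compound_only = pick_target(OFFERED, ["COMPOUND_TEXT"])

print(utf8_aware, compound_only)
```

Under this model the two kinds of client receive different conversions
of the same marked text, which would fit the symptoms described.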

 > This behaviour is independent of what I've set LC_ALL to before
 > starting emacs, but if I logout and login with default session
 > language set to german, then all the pasting functions work properly.

???  Then, in what locale were you running gtk2 apps when
pasting didn't work?

 > I'm sure this is related to this: ISO 14755 specifies using
 > Ctrl+Shift+hex-digit to input unicode.  gtk2 implemented ISO 14755
 > input method.

I'm sure this is not related to input method.

 > The garbled text corresponds exactly to the Unicode hex encodings for
 > the characters. For example, the Unicode hex encoding of ß is 00DF and
 > Emacs displays the pasted-in ß as \x{00DF}. This certainly isn't a
 > coincidence.

Emacs never generates such \x{.....} notation automatically.
So the text must have been generated on the sender's side.
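
Since the escapes are plain ASCII text that spells out the code point,
already-pasted text can be repaired mechanically.  A minimal sketch,
assuming the escapes always have the form \x{HEX} with four to six hex
digits as in the examples above (the function name unescape is made up
for illustration):

```python
import re

# Matches a literal backslash, "x", and 4-6 hex digits in braces.
_ESCAPE = re.compile(r"\\x\{([0-9A-Fa-f]{4,6})\}")


def unescape(text):
    """Replace each \\x{HEX} escape with the character it names.

    Escapes outside the Unicode range are left untouched.
    """
    def repl(match):
        value = int(match.group(1), 16)
        if value > 0x10FFFF:
            return match.group(0)
        return chr(value)

    return _ESCAPE.sub(repl, text)


print(unescape(r"So\x{00DF}e wird in einer extra So\x{00DF}ensch\x{00FC}ssel..."))
```

Note that this would also rewrite any text that legitimately contains
such a sequence, so it is only safe on text known to be affected.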

---
Ken'ichi HANDA
handa@m17n.org


* Re: emacs, umlauts,  x-windows text mark and paste
  2003-12-18  9:54 ` josh buhl
@ 2003-12-18  9:56   ` josh buhl
  0 siblings, 0 replies; 24+ messages in thread
From: josh buhl @ 2003-12-18  9:56 UTC (permalink / raw)


Here's my reply:

-------- Original Message --------
Subject: Re: [uzs33d@uni-bonn.de: gtk2, iso14755, pasting non-ascii 
characters, and the x-windows clipboard]
Date: Thu, 18 Dec 2003 10:50:25 +0100
From: josh buhl <uzs33d@uni-bonn.de>
To: Kenichi Handa <handa@m17n.org>
CC: bug-gnu-emacs@gnu.org
References: <E1AWn7Q-0006EZ-Fs@fencepost.gnu.org> 
<200312180215.LAA00397@etlken.m17n.org>

Kenichi Handa wrote:
 >>However, the gtk2 apps and the non-gtk2 apps aside from emacs, all
 >>seem to be able to paste this text in from each other properly. Only
 >>emacs has this problem.
 >
 >
 > Perhaps that is because the other apps use the UTF8_STRING request
 > on the selection (which is an XFree86 extension), whereas Emacs 21.3
 > uses only the COMPOUND_TEXT request (the X standard).  The latest
 > CVS version of Emacs supports UTF8_STRING.

That sounds plausible. If I tried to checkout and compile the latest cvs
of emacs to test this, would I have to somehow enable utf8_string, or
would it be automatically supported?


 >>This behaviour is independent of what I've set LC_ALL to before
 >>starting emacs, but if I logout and login with default session
 >>language set to german, then all the pasting functions work properly.
 >
 >
 > ???  Then, in what locale were you running gtk2 apps when
 > pasting didn't work?

The system default, which is no default language (as recommended by
the Debian locales configuration script for multi-language systems), so
just POSIX:

josh@spleen:~$ locale
LANG=POSIX
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
josh@spleen:~$ locale -a
C
de_DE
de_DE@euro
de_DE.iso88591
de_DE.iso885915@euro
de_DE.utf8
de_DE.utf8@euro
deutsch
en_US
en_US.iso88591
en_US.utf8
german
POSIX
josh@spleen:~$

But like I said, I can open a terminal, set LC_ALL=en_US.utf8, start
emacs, and the pasting does not work (but only for emacs, it still works
with other apps). *HOWEVER*, if I log out, select any of the available
locales for the session language in the gdm login, e.g. de_DE.ISO-8859-1
or en_US.UTF-8, and then login, then all the pasting works properly.

I suppose that the session locale setting might also alter the way the X
selection buffer deals with the marked text.
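
A rough model of that guess, not taken from gtk2's source: suppose the
sender converts the marked text to the charset of its own locale and
writes an escape for every character that charset cannot represent.
The function name encode_with_escapes and the escape-on-failure rule
are assumptions made purely to illustrate the hypothesis.

```python
def encode_with_escapes(text, charset):
    """Encode `text` in `charset`, writing \\x{XXXX} for any character
    the charset cannot represent (hypothetical sender behaviour)."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(charset)
        except UnicodeEncodeError:
            # Fall back to an ASCII escape naming the code point.
            out += f"\\x{{{ord(ch):04X}}}".encode("ascii")
    return bytes(out)


# POSIX locale: plain ASCII, so ß cannot be encoded and gets escaped.
print(encode_with_escapes("Soße", "ascii"))
# de_DE.ISO-8859-1 session: ß has a single-byte encoding.
print(encode_with_escapes("Soße", "iso-8859-1"))
```

If the sender behaved like this, a POSIX session would produce exactly
the escapes seen in Emacs, while a Latin-1 or UTF-8 session would not.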

 >>The garbled text corresponds exactly to the Unicode hex encodings for
 >>the characters. For example, the Unicode hex encoding of ß is 00DF and
 >>Emacs displays the pasted-in ß as \x{00DF}. This certainly isn't a
 >>coincidence.
 >
 >
 > Emacs never generates such \x{.....} notation automatically.
 > So the text must have been generated on the sender's side.

This corroborates the suggestion that the session locale setting is also
affecting the text in the X selection buffer. But there's still the
question (apart from your UTF8_STRING explanation) of why other apps can
insert this text, but Emacs can't.

-jb


end of thread, other threads:[~2003-12-18  9:56 UTC | newest]

Thread overview: 24+ messages
2003-12-16  8:36 emacs, umlauts, x-windows text mark and paste josh buhl
2003-12-16  9:11 ` Sergei Pokrovsky
2003-12-16  9:40 ` Harald Maier
2003-12-16 10:11 ` erasurehead
2003-12-16 10:33   ` Harald Maier
2003-12-16 10:54     ` josh buhl
2003-12-16 11:16       ` josh buhl
2003-12-17  6:07         ` Eli Zaretskii
     [not found]         ` <mailman.175.1071644836.868.help-gnu-emacs@gnu.org>
2003-12-17  8:17           ` josh buhl
2003-12-17  8:19           ` josh buhl
2003-12-17  9:25             ` Eli Zaretskii
     [not found]             ` <mailman.181.1071656744.868.help-gnu-emacs@gnu.org>
2003-12-17  9:57               ` Sergei Pokrovsky
2003-12-17 10:51               ` Harald Maier
2003-12-17 13:56                 ` josh buhl
2003-12-17 11:43               ` Edi Weitz
2003-12-17 18:20                 ` Eli Zaretskii
2003-12-17 13:54               ` erasurehead
2003-12-17 14:00                 ` josh buhl
2003-12-17 18:24                 ` Eli Zaretskii
     [not found]                 ` <mailman.210.1071689295.868.help-gnu-emacs@gnu.org>
2003-12-18  8:53                   ` josh buhl
2003-12-16 10:57   ` Sergei Pokrovsky
2003-12-16 10:12 ` josh buhl
2003-12-18  9:54 ` josh buhl
2003-12-18  9:56   ` josh buhl
