unofficial mirror of bug-gnu-emacs@gnu.org 
* Re: [uzs33d@uni-bonn.de: gtk2, iso14755, pasting non-ascii characters, and the x-windows clipboard]
       [not found] <E1AWn7Q-0006EZ-Fs@fencepost.gnu.org>
@ 2003-12-18  2:15 ` Kenichi Handa
  2003-12-18  9:50   ` josh buhl
  0 siblings, 1 reply; 3+ messages in thread
From: Kenichi Handa @ 2003-12-18  2:15 UTC (permalink / raw)
  Cc: bug-gnu-emacs, uzs33d, handa

In article <E1AWn7Q-0006EZ-Fs@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:
> Would you please investigate this?

Ok.

> From: josh buhl <uzs33d@uni-bonn.de>
> Newsgroups: gnu.emacs.bug
> To: bug-gnu-emacs@gnu.org
[...]
> Subject: gtk2, iso14755, pasting non-ascii characters,
> 	and the x-windows clipboard
[...]
> Emacs has a problem pasting in text with non-ascii characters from any
> of the apps which are compiled with gtk2 (via marking with mouse, and
> inserting per mouse-2 click). Here's an example:

> I mark this text from a german webpage displayed in mozilla 1.5
> compiled with gtk2:

> "Soße wird in einer extra Soßenschüssel..."

> Paste it into my Emacs buffer and get this:

> "So\x{00DF}e wird in einer extra So\x{00DF}ensch\x{00FC}ssel..."

Actually, this should be the exact text Emacs received from
the gtk2 application, so it seems that gtk2 has a bug in
producing COMPOUND_TEXT.

> Emacs inserts the text correctly when it has been marked in kword,
> kate, xedit, open office writer, or any other non-gtk2 app, and barfs
> if the same text has been marked in mozilla, gedit, or *any gtk+ 2*
> dialog like any of the gnome 2.4 dialogs. So I can mark a text in 
> mozilla, paste it into xedit, _remark_ it and paste it into emacs, and 
> it works, but if I don't remark, emacs barfs. If I mark the text in
> Emacs, then I can paste it correctly into any non-gtk2 app, but if I
> try to paste it into a gtk2 app, *nothing* gets pasted in.

> However, the gtk2 apps and the non-gtk2 apps aside from emacs, all
> seem to be able to paste this text in from each other properly. Only
> emacs has this problem.

Perhaps that is because the other apps request UTF8_STRING
for the selection (an XFree86 extension), whereas Emacs 21.3
requests only COMPOUND_TEXT (the X standard).  The latest
CVS version of Emacs supports UTF8_STRING.

> This behaviour is independent of what I've set LC_ALL to before
> starting emacs, but if I logout and login with default session
> language set to german, then all the pasting functions work properly.

???  Then, in what locale were you running gtk2 apps when
pasting didn't work?

> I'm sure this is related to this: ISO 14755 specifies using
> Ctrl+Shift+hex-digit to input unicode.  gtk2 implemented ISO 14755
> input method.

I'm sure this is not related to input method.

> The garbled text corresponds exactly to the Unicode hex encodings for
> the characters. For example, the Unicode hex encoding of ß is 00DF and
> emacs displays the pasted in ß as \x{00DF}. This certainly isn't a 
> coincidence.

Emacs never generates such \x{.....} notation automatically.
So, the text must have been generated on the sender side.
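
To make the correspondence between the pasted escapes and the original
characters concrete, here is a minimal Python sketch. It is only an
illustration: the helper name decode_hex_escapes is hypothetical and is
not part of Emacs or gtk2. It treats each \x{HHHH} sequence as a
hexadecimal Unicode code point and maps it back to the character.

```python
import re

# Matches the \x{HHHH} escapes seen in the pasted text
# (a hexadecimal code point wrapped in braces).
ESCAPE = re.compile(r"\\x\{([0-9A-Fa-f]+)\}")


def decode_hex_escapes(text: str) -> str:
    """Replace each \\x{HHHH} escape with the character at that code point.

    Hypothetical helper for illustration only.
    """
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# The text as it arrived in the Emacs buffer, taken from the report above.
pasted = r"So\x{00DF}e wird in einer extra So\x{00DF}ensch\x{00FC}ssel..."
print(decode_hex_escapes(pasted))
```

Decoding the escapes recovers the text that was marked in mozilla, which
fits the view that the substitution happened on the sending side.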

---
Ken'ichi HANDA
handa@m17n.org


* Re: [uzs33d@uni-bonn.de: gtk2, iso14755, pasting non-ascii characters, and the x-windows clipboard]
  2003-12-18  2:15 ` [uzs33d@uni-bonn.de: gtk2, iso14755, pasting non-ascii characters, and the x-windows clipboard] Kenichi Handa
@ 2003-12-18  9:50   ` josh buhl
  2003-12-18 11:28     ` Kenichi Handa
  0 siblings, 1 reply; 3+ messages in thread
From: josh buhl @ 2003-12-18  9:50 UTC (permalink / raw)
  Cc: bug-gnu-emacs

Kenichi Handa wrote:
>>However, the gtk2 apps and the non-gtk2 apps aside from emacs, all
>>seem to be able to paste this text in from each other properly. Only
>>emacs has this problem.
> 
> 
> Perhaps that is because the other apps request UTF8_STRING
> for the selection (an XFree86 extension), whereas Emacs 21.3
> requests only COMPOUND_TEXT (the X standard).  The latest
> CVS version of Emacs supports UTF8_STRING.

That sounds plausible. If I tried to check out and compile the latest CVS
version of Emacs to test this, would I have to somehow enable UTF8_STRING,
or would it be automatically supported?


>>This behaviour is independent of what I've set LC_ALL to before
>>starting emacs, but if I logout and login with default session
>>language set to german, then all the pasting functions work properly.
> 
> 
> ???  Then, in what locale were you running gtk2 apps when
> pasting didn't work?

The system default, which is no default language (as recommended by
the Debian locales configuration script for multi-language systems), so
just POSIX:

josh@spleen:~$ locale
LANG=POSIX
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
josh@spleen:~$ locale -a
C
de_DE
de_DE@euro
de_DE.iso88591
de_DE.iso885915@euro
de_DE.utf8
de_DE.utf8@euro
deutsch
en_US
en_US.iso88591
en_US.utf8
german
POSIX
josh@spleen:~$

But like I said, I can open a terminal, set LC_ALL=en_US.utf8, start 
emacs, and the pasting does not work (but only for emacs, it still works 
with other apps). *HOWEVER*, if I log out, select any of the available 
locales for the session language in the gdm login, e.g. de_DE.ISO-8859-1 
or en_US.UTF-8, and then login, then all the pasting works properly.

I suppose that the session locale setting might also alter the way the X 
selection buffer deals with the marked text.

>>The garbled text corresponds exactly to the Unicode hex encodings for
>>the characters. For example, the Unicode hex encoding of ß is 00DF and
>>emacs displays the pasted in ß as \x{00DF}. This certainly isn't a 
>>coincidence.
> 
> 
> Emacs never generates such \x{.....} notation automatically.
> So, the text must have been generated on the sender side.

This corroborates the suggestion that the session locale setting is also
affecting the text in the X selection buffer. But there's still the
question (except for your UTF8_STRING explanation) of why other apps can
insert this, but Emacs can't.

-jb


* Re: [uzs33d@uni-bonn.de: gtk2, iso14755, pasting non-ascii characters, and the x-windows clipboard]
  2003-12-18  9:50   ` josh buhl
@ 2003-12-18 11:28     ` Kenichi Handa
  0 siblings, 0 replies; 3+ messages in thread
From: Kenichi Handa @ 2003-12-18 11:28 UTC (permalink / raw)
  Cc: bug-gnu-emacs, handa

In article <3FE17861.9090809@uni-bonn.de>, josh buhl <uzs33d@uni-bonn.de> writes:

> Kenichi Handa wrote:
>>> However, the gtk2 apps and the non-gtk2 apps aside from emacs, all
>>> seem to be able to paste this text in from each other properly. Only
>>> emacs has this problem.
>>  
>>  Perhaps that is because the other apps request UTF8_STRING
>>  for the selection (an XFree86 extension), whereas Emacs 21.3
>>  requests only COMPOUND_TEXT (the X standard).  The latest
>>  CVS version of Emacs supports UTF8_STRING.

> That sounds plausible. If I tried to check out and compile the latest CVS
> version of Emacs to test this, would I have to somehow enable UTF8_STRING,
> or would it be automatically supported?

In CVS Emacs, we introduced this variable.

----------------------------------------------------------------------
x-select-request-type's value is nil

*Data type request for X selection.
The value is nil, one of the following data types, or a list of them:
  `COMPOUND_TEXT', `UTF8_STRING', `STRING', `TEXT'

If the value is nil, try `COMPOUND_TEXT' and `UTF8_STRING', and
use the more appropriate result.  If both fail, try `STRING', and
then `TEXT'.

If the value is one of the above symbols, try only the specified
type.

If the value is a list of them, try each of them in the specified
order until succeed.
----------------------------------------------------------------------

As the default is still nil, Emacs tries both COMPOUND_TEXT
and UTF8_STRING.  And to decide "the more appropriate
result", we currently do this:

;; Helper function for x-selection-value.  Select UTF8 or CTEXT
;; whichever is more appropriate.  Here, we use these heuristics.
;;
;;   (1) If their lengths are different, select the longer one.  This
;;   is because an X client may just cut off unsupported characters.
;;
;;   (2) Otherwise, if the Nth character of CTEXT is an ASCII
;;   character that is different from the Nth character of UTF8,
;;   select UTF8.  This is because an X client may replace unsupported
;;   characters with some ASCII character (typically ` ' or `?') in
;;   CTEXT.
;;
;;   (3) Otherwise, select CTEXT.  This is because legacy charsets are
;;   better for the current Emacs, especially when the selection owner
;;   is also Emacs.

But, considering the described behaviour of gtk2, it seems
that we should test (2) first.
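
The effect of the rule order can be sketched in Python. This is a
simplified model of the three heuristics quoted above, not the Emacs
implementation: the function name choose_selection and the
check_ascii_first flag are hypothetical, and comparing only up to the
length of the shorter string when rule (2) runs first is an assumption.

```python
def choose_selection(utf8: str, ctext: str, check_ascii_first: bool = False) -> str:
    """Pick between the UTF8_STRING and COMPOUND_TEXT results.

    Simplified model of heuristics (1)-(3); names are hypothetical.
    """

    def ascii_substituted() -> bool:
        # Rule (2): some position holds an ASCII character in CTEXT that
        # differs from the character at the same position in UTF8.
        # zip() stops at the shorter string (an assumption of this sketch).
        return any(ord(c) < 128 and c != u for c, u in zip(ctext, utf8))

    # Proposed order: apply rule (2) before the length comparison.
    if check_ascii_first and ascii_substituted():
        return utf8
    # Rule (1): different lengths, take the longer one.
    if len(utf8) != len(ctext):
        return utf8 if len(utf8) > len(ctext) else ctext
    # Rule (2) in its current position.
    if ascii_substituted():
        return utf8
    # Rule (3): otherwise prefer CTEXT.
    return ctext


utf8 = "Soße"
ctext = r"So\x{00DF}e"  # what the gtk2 sender appears to deliver

print(choose_selection(utf8, ctext))                          # current order
print(choose_selection(utf8, ctext, check_ascii_first=True))  # rule (2) first
```

With the current order, rule (1) selects the escaped CTEXT simply because
it is longer. Testing rule (2) first notices the ASCII backslash where
UTF8 has ß and selects the UTF8 result instead.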

>>  ???  Then, in what locale were you running gtk2 apps when
>>  pasting didn't work?

> The system default, which is no default language (as recommended by
> the Debian locales configuration script for multi-language systems), so
> just POSIX:

I see.  I suspect that gtk2's COMPOUND_TEXT encoder produces
\x{...} because Latin-1 accented letters are not supported
in that locale.
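
The suspected behaviour can be modelled with a short Python sketch. This
is only an assumption about what the sender does, based on the observed
output; it does not reproduce the gtk2 code or real COMPOUND_TEXT
encoding, and the function name encode_with_hex_fallback is hypothetical.
Each character is encoded in the locale's charset, and any character the
charset cannot represent is written as a \x{HHHH} escape.

```python
def encode_with_hex_fallback(text: str, charset: str = "ascii") -> bytes:
    """Encode text in `charset`, escaping unencodable characters as \\x{HHHH}.

    Hypothetical model of the suspected sender-side fallback.
    """
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(charset)
        except UnicodeEncodeError:
            # Character not available in this charset: emit its code point
            # as four (or more) uppercase hex digits inside \x{...}.
            out += f"\\x{{{ord(ch):04X}}}".encode("ascii")
    return bytes(out)


print(encode_with_hex_fallback("Soße"))             # POSIX-like: ASCII only
print(encode_with_hex_fallback("Soße", "latin-1"))  # e.g. de_DE.ISO-8859-1
print(encode_with_hex_fallback("Soße", "utf-8"))    # never needs the fallback
```

Under this model an ASCII-only locale yields the escapes from the report,
a Latin-1 locale encodes ß directly, and UTF-8 can represent every
character, so the fallback is never reached.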

[...]

> But like I said, I can open a terminal, set LC_ALL=en_US.utf8, start 
> emacs, and the pasting does not work (but only for emacs, it still works 
> with other apps). *HOWEVER*, if I log out, select any of the available 
> locales for the session language in the gdm login, e.g. de_DE.ISO-8859-1 
> or en_US.UTF-8, and then login, then all the pasting works properly.

> I suppose that the session locale setting might also alter the way the X 
> selection buffer deals with the marked text.

Perhaps.  As the selection owner has no way to know in which
locale a selection requester is running, gtk2 probably
assumes that the requester is in the session locale.

>>> The garbled text corresponds exactly to the Unicode hex encodings for
>>> the characters. For example, the Unicode hex encoding of ß is 00DF and
>>> emacs displays the pasted in ß as \x{00DF}. This certainly isn't a 
>>> coincidence.
>>  
>>  
>>  Emacs never generates such \x{.....} notation automatically.
>>  So, the text must have been generated on the sender side.

> This corroborates the suggestion that the session locale setting is also
> affecting the text in the X selection buffer. But there's still the
> question (except for your UTF8_STRING explanation) of why other apps can
> insert this, but Emacs can't.

As I wrote, I think they request UTF8_STRING first, and a
UTF-8 encoder always encodes all characters correctly
regardless of the current locale.

---
Ken'ichi HANDA
handa@m17n.org

