* Re: [uzs33d@uni-bonn.de: gtk2, iso14755, pasting non-ascii characters, and the x-windows clipboard]
From: Kenichi Handa @ 2003-12-18  2:15 UTC
Cc: bug-gnu-emacs, uzs33d, handa

In article <E1AWn7Q-0006EZ-Fs@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:
> Would you please investigate this?

Ok.

> From: josh buhl <uzs33d@uni-bonn.de>
> Newsgroups: gnu.emacs.bug
> To: bug-gnu-emacs@gnu.org
[...]
> Subject: gtk2, iso14755, pasting non-ascii characters,
> and the x-windows clipboard
[...]
> Emacs has a problem pasting in text with non-ascii characters from any
> of the apps which are compiled with gtk2 (via marking with mouse, and
> inserting per mouse-2 click). Here's an example:

> I mark this text from a german webpage displayed in mozilla 1.5
> compiled with gtk2:

> "Soße wird in einer extra Soßenschüssel..."

> Paste it into my Emacs buffer and get this:

> "So\x{00DF}e wird in einer extra So\x{00DF}ensch\x{00FC}ssel..."

Actually, this should be the exact text Emacs received from the gtk2
application, thus it seems that gtk2 has a bug in producing
COMPOUND_TEXT.

> Emacs inserts the text correctly when it has been marked in kword,
> kate, xedit, open office writer, or any other non-gtk2 app, and barfs
> if the same text has been marked in mozilla, gedit, or *any gtk+ 2*
> dialog like any of the gnome 2.4 dialogs. So I can mark a text in
> mozilla, paste it into xedit, _remark_ it and paste it into emacs, and
> it works, but if I don't remark, emacs barfs. If I mark the text in
> Emacs, then I can paste it correctly into any non-gtk2 app, but if I
> try to paste it into a gtk2 app, *nothing* gets pasted in.

> However, the gtk2 apps and the non-gtk2 apps aside from emacs, all
> seem to be able to paste this text in from each other properly. Only
> emacs has this problem.
Perhaps that is because the other apps use a UTF8_STRING request
on the selection (which is an XFree86 extension), but Emacs 21.3
uses only a COMPOUND_TEXT request (the X standard).  The latest
CVS version of Emacs supports UTF8_STRING.

> This behaviour is independent of what I've set LC_ALL to before
> starting emacs, but if I logout and login with default session
> language set to german, then all the pasting functions work properly.

???  Then, in what locale were you running gtk2 apps when
pasting didn't work?

> I'm sure this is related to this: ISO 14755 specifies using
> Ctrl+Shift+hex-digit to input unicode. gtk2 implemented ISO 14755
> input method.

I'm sure this is not related to the input method.

> The garbaged text corresponds exactly to the unicode hex encodings for
> the characters. for example the unicode hex encoding of ß is 00DF and
> emacs displays the pasted in ß as \x{00DF}. This certainly isn't a
> coincidence.

Emacs never generates such \x{.....} notation automatically.
So the text must have been generated on the sender side.

---
Ken'ichi HANDA
handa@m17n.org
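The correspondence between the pasted \x{HHHH} sequences and Unicode
code points can be checked mechanically. The following is only a
minimal Python sketch for illustration; the function name
decode_hex_escapes is hypothetical and is not part of Emacs or gtk2.
It assumes every \x{...} sequence holds a hexadecimal Unicode code
point, as the report suggests.

```python
import re

# Matches the \x{HHHH} notation seen in the pasted text.
_ESCAPE = re.compile(r"\\x\{([0-9A-Fa-f]+)\}")


def decode_hex_escapes(text: str) -> str:
    """Replace each \\x{HHHH} sequence with the character at that code point.

    Hypothetical helper: it assumes the hex digits are a Unicode code point.
    """
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


if __name__ == "__main__":
    pasted = r"So\x{00DF}e wird in einer extra So\x{00DF}ensch\x{00FC}ssel..."
    print(decode_hex_escapes(pasted))
```

Run on the string from the report, this gives back the text that was
marked in mozilla, which is consistent with the escapes being produced
from code points 00DF (ß) and 00FC (ü) on the sending side.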
* Re: [uzs33d@uni-bonn.de: gtk2, iso14755, pasting non-ascii characters, and the x-windows clipboard]
From: josh buhl @ 2003-12-18  9:50 UTC
Cc: bug-gnu-emacs

Kenichi Handa wrote:

>>However, the gtk2 apps and the non-gtk2 apps aside from emacs, all
>>seem to be able to paste this text in from each other properly. Only
>>emacs has this problem.
>
>
> Perhaps that is because the other apps use a UTF8_STRING request
> on the selection (which is an XFree86 extension), but Emacs 21.3
> uses only a COMPOUND_TEXT request (the X standard).  The latest
> CVS version of Emacs supports UTF8_STRING.

That sounds plausible. If I tried to check out and compile the latest cvs
of emacs to test this, would I have to somehow enable utf8_string, or
would it be automatically supported?

>>This behaviour is independent of what I've set LC_ALL to before
>>starting emacs, but if I logout and login with default session
>>language set to german, then all the pasting functions work properly.
>
>
> ???  Then, in what locale were you running gtk2 apps when
> pasting didn't work?
The system default, which is no default language (as recommended during
the debian locales configuration script for multi-language systems), so
just POSIX:

josh@spleen:~$ locale
LANG=POSIX
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
josh@spleen:~$ locale -a
C
de_DE
de_DE@euro
de_DE.iso88591
de_DE.iso885915@euro
de_DE.utf8
de_DE.utf8@euro
deutsch
en_US
en_US.iso88591
en_US.utf8
german
POSIX
josh@spleen:~$

But like I said, I can open a terminal, set LC_ALL=en_US.utf8, start
emacs, and the pasting does not work (but only for emacs, it still works
with other apps). *HOWEVER*, if I log out, select any of the available
locales for the session language in the gdm login, e.g. de_DE.ISO-8859-1
or en_US.UTF-8, and then login, then all the pasting works properly.

I suppose that the session locale setting might also alter the way the X
selection buffer deals with the marked text.

>>The garbaged text corresponds exactly to the unicode hex encodings for
>>the characters. for example the unicode hex encoding of ß is 00DF and
>>emacs displays the pasted in ß as \x{00DF}. This certainly isn't a
>>coincidence.
>
>
> Emacs never generates such \x{.....} notation automatically.
> So the text must have been generated on the sender side.

This corroborates the suggestion that the session locale setting is also
affecting the text in the x selection buffer. But there's still the
question (except for your utf8-string explanation) of why other apps can
insert this, but emacs can't.

-jb
* Re: [uzs33d@uni-bonn.de: gtk2, iso14755, pasting non-ascii characters, and the x-windows clipboard]
From: Kenichi Handa @ 2003-12-18 11:28 UTC
Cc: bug-gnu-emacs, handa

In article <3FE17861.9090809@uni-bonn.de>, josh buhl <uzs33d@uni-bonn.de> writes:
> Kenichi Handa wrote:
>>> However, the gtk2 apps and the non-gtk2 apps aside from emacs, all
>>> seem to be able to paste this text in from each other properly. Only
>>> emacs has this problem.
>>
>> Perhaps that is because the other apps use a UTF8_STRING request
>> on the selection (which is an XFree86 extension), but Emacs 21.3
>> uses only a COMPOUND_TEXT request (the X standard).  The latest
>> CVS version of Emacs supports UTF8_STRING.

> That sounds plausible. If I tried to check out and compile the latest cvs
> of emacs to test this, would I have to somehow enable utf8_string, or
> would it be automatically supported?

In CVS Emacs, we introduced this variable.

----------------------------------------------------------------------
x-select-request-type's value is nil

*Data type request for X selection.
The value is nil, one of the following data types, or a list of them:
  `COMPOUND_TEXT', `UTF8_STRING', `STRING', `TEXT'

If the value is nil, try `COMPOUND_TEXT' and `UTF8_STRING', and
use the more appropriate result.  If both fail, try `STRING', and
then `TEXT'.

If the value is one of the above symbols, try only the specified
type.

If the value is a list of them, try each of them in the specified
order until succeed.
----------------------------------------------------------------------

As the default is still nil, Emacs tries both COMPOUND_TEXT
and UTF8_STRING.  And to decide "the more appropriate
result", we currently do this:

;; Helper function for x-selection-value.  Select UTF8 or CTEXT
;; whichever is more appropriate.  Here, we use these heuristics.
;;
;; (1) If their lengths are different, select the longer one.  This
;; is because an X client may just cut off unsupported characters.
;;
;; (2) Otherwise, if the Nth character of CTEXT is an ASCII
;; character that is different from the Nth character of UTF8,
;; select UTF8.  This is because an X client may replace unsupported
;; characters with some ASCII character (typically ` ' or `?') in
;; CTEXT.
;;
;; (3) Otherwise, select CTEXT.  This is because legacy charsets are
;; better for the current Emacs, especially when the selection owner
;; is also Emacs.

But, considering the described behaviour of gtk2, it seems
that we should test (2) first.

>> ???  Then, in what locale were you running gtk2 apps when
>> pasting didn't work?

> The system default, which is no default language (as recommended during
> the debian locales configuration script for multi-language systems), so
> just POSIX:

I see.  I suspect that gtk2 produces \x{...} in its COMPOUND_TEXT
encoder because Latin-1 accented letters are not supported
in that locale.

[...]
> But like I said, I can open a terminal, set LC_ALL=en_US.utf8, start
> emacs, and the pasting does not work (but only for emacs, it still works
> with other apps). *HOWEVER*, if I log out, select any of the available
> locales for the session language in the gdm login, e.g. de_DE.ISO-8859-1
> or en_US.UTF-8, and then login, then all the pasting works properly.

> I suppose that the session locale setting might also alter the way the X
> selection buffer deals with the marked text.

Perhaps.  As the selection owner has no way to know in which
locale a selection requester is running, it is likely that
gtk2 assumes that the requester is in the session locale.

>>> The garbaged text corresponds exactly to the unicode hex encodings for
>>> the characters. for example the unicode hex encoding of ß is 00DF and
>>> emacs displays the pasted in ß as \x{00DF}. This certainly isn't a
>>> coincidence.
>>
>>
>> Emacs never generates such \x{.....} notation automatically.
>> So the text must have been generated on the sender side.

> This corroborates the suggestion that the session locale setting is also
> affecting the text in the x selection buffer. But there's still the
> question (except for your utf8-string explanation) of why other apps can
> insert this, but emacs can't.

As I wrote, I think they request UTF8_STRING first, and the
UTF-8 encoder always encodes all characters correctly regardless
of the current locale.

---
Ken'ichi HANDA
handa@m17n.org
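The selection heuristic quoted in this message, and the proposed change
of testing rule (2) first, can be sketched as follows. This is a rough
Python illustration under stated assumptions, not the Emacs Lisp
implementation: the names choose_selection, ascii_mismatch, and
ascii_check_first are hypothetical, both inputs are assumed to be
already-decoded strings (or None when a request failed), and when
rule (2) runs first it is assumed to compare characters only over the
common prefix of the two strings.

```python
from typing import Optional


def ascii_mismatch(ctext: str, utf8: str) -> bool:
    """Rule (2): some ASCII character in CTEXT differs from the character
    at the same position in UTF8 (compared over the common prefix)."""
    return any(ord(c) < 128 and c != u for c, u in zip(ctext, utf8))


def choose_selection(ctext: Optional[str], utf8: Optional[str],
                     ascii_check_first: bool = False) -> Optional[str]:
    """Pick the more appropriate of the CTEXT and UTF8 results.

    ascii_check_first=False follows the order quoted in the message;
    ascii_check_first=True is the hypothetical reordering that tests
    rule (2) before rule (1).
    """
    if ctext is None or utf8 is None:
        return utf8 if ctext is None else ctext
    if ascii_check_first and ascii_mismatch(ctext, utf8):
        return utf8
    # Rule (1): a client may have cut off unsupported characters.
    if len(ctext) != len(utf8):
        return ctext if len(ctext) > len(utf8) else utf8
    # Rule (2): a client may have substituted ASCII for unsupported characters.
    if ascii_mismatch(ctext, utf8):
        return utf8
    # Rule (3): otherwise prefer CTEXT.
    return ctext


if __name__ == "__main__":
    ctext = r"So\x{00DF}e"  # what gtk2 reportedly sends as COMPOUND_TEXT
    utf8 = "Soße"
    print(choose_selection(ctext, utf8))                          # rule (1) wins
    print(choose_selection(ctext, utf8, ascii_check_first=True))  # rule (2) wins
```

With the reported gtk2 text, the original order selects the escaped
CTEXT because it is the longer string, while checking rule (2) first
selects the UTF8 result, since the backslash in CTEXT is an ASCII
character that differs from ß at the same position.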