From: Kenichi Handa <handa@m17n.org>
Cc: bug-gnu-emacs@gnu.org, handa@m17n.org
Subject: Re: [uzs33d@uni-bonn.de: gtk2, iso14755, pasting non-ascii characters, and the x-windows clipboard]
Date: Thu, 18 Dec 2003 20:28:21 +0900 (JST)
Message-ID: <200312181128.UAA00989@etlken.m17n.org>
In-Reply-To: <3FE17861.9090809@uni-bonn.de> (message from josh buhl on Thu, 18 Dec 2003 10:50:25 +0100)
In article <3FE17861.9090809@uni-bonn.de>, josh buhl <uzs33d@uni-bonn.de> writes:
> Kenichi Handa wrote:
>>> However, the gtk2 apps and the non-gtk2 apps aside from emacs, all
>>> seem to be able to paste this text in from each other properly. Only
>>> emacs has this problem.
>>
>> Perhaps that is because the other apps use a UTF8_STRING request
>> on selection (which is an XFree86 extension) but Emacs 21.3
>> uses only the COMPOUND_TEXT request (the X standard). The latest
>> CVS version of Emacs supports UTF8_STRING.
> That sounds plausible. If I tried to checkout and compile the latest cvs
> of emacs to test this, would I have to somehow enable utf8_string, or
> would it be automatically supported?
In CVS Emacs, we introduced this variable.
----------------------------------------------------------------------
x-select-request-type's value is nil
*Data type request for X selection.
The value is nil, one of the following data types, or a list of them:
`COMPOUND_TEXT', `UTF8_STRING', `STRING', `TEXT'
If the value is nil, try `COMPOUND_TEXT' and `UTF8_STRING', and
use the more appropriate result. If both fail, try `STRING', and
then `TEXT'.
If the value is one of the above symbols, try only the specified
type.
If the value is a list of them, try each of them in the specified
order until succeed.
----------------------------------------------------------------------
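So UTF8_STRING is tried without any extra setup. If you want Emacs to prefer it over COMPOUND_TEXT instead of relying on the heuristic, something like the following in your init file should work. This is only a sketch: it assumes a CVS Emacs build in which x-select-request-type exists, and the order of the list is just one reasonable choice.

```elisp
;; Request UTF8_STRING first, then fall back to the other types in
;; the listed order.  Assumes a build that defines
;; `x-select-request-type' (CVS Emacs at the time of writing).
(setq x-select-request-type '(UTF8_STRING COMPOUND_TEXT STRING TEXT))
```

Setting the variable to the single symbol UTF8_STRING would make Emacs try only that type, as described in the docstring above.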
As the default is still nil, Emacs tries both COMPOUND_TEXT
and UTF8_STRING. And to decide "the more appropriate
result", we currently do this:
;; Helper function for x-selection-value. Select UTF8 or CTEXT
;; whichever is more appropriate. Here, we use these heuristics.
;;
;; (1) If their lengths are different, select the longer one. This
;; is because an X client may just cut off unsupported characters.
;;
;; (2) Otherwise, if the Nth character of CTEXT is an ASCII
;; character that is different from the Nth character of UTF8,
;; select UTF8. This is because an X client may replace unsupported
;; characters with some ASCII character (typically ` ' or `?') in
;; CTEXT.
;;
;; (3) Otherwise, select CTEXT. This is because legacy charsets are
;; better for the current Emacs, especially when the selection owner
;; is also Emacs.
But, considering the described behaviour of gtk2, it seems
that we should test (2) first.
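To make the idea concrete, here is a rough sketch of the heuristics with (2) tested first. It is not the code in the Emacs sources: the name my-select-utf8-or-ctext is made up for illustration, both arguments are assumed to be already-decoded strings, and comparing only the common prefix when the lengths differ is my own assumption about how (2) could be applied before (1).

```elisp
;; Hypothetical helper, NOT the function in the Emacs sources.
;; UTF8 and CTEXT are the decoded results of the two requests.
(defun my-select-utf8-or-ctext (utf8 ctext)
  "Return UTF8 or CTEXT, whichever looks like the better result."
  (let ((common (min (length utf8) (length ctext)))
        (i 0)
        (mismatch nil))
    ;; Rule (2), tested first, over the common prefix only: an ASCII
    ;; character in CTEXT that differs from UTF8 at the same position
    ;; suggests the owner replaced an unsupported character.
    (while (and (not mismatch) (< i common))
      (let ((c (aref ctext i)))
        (when (and (< c 128) (/= c (aref utf8 i)))
          (setq mismatch t)))
      (setq i (1+ i)))
    (cond
     (mismatch utf8)
     ;; Rule (1): different lengths, select the longer one.
     ((/= (length utf8) (length ctext))
      (if (> (length utf8) (length ctext)) utf8 ctext))
     ;; Rule (3): otherwise prefer CTEXT.
     (t ctext))))
```

With this ordering, (my-select-utf8-or-ctext "Straße" "Stra\\x{00DF}e") selects the UTF8 string, because the `\' in CTEXT differs from `ß' at the same position. With (1) tested first, the longer CTEXT string containing \x{00DF} would win.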
>> ??? Then, in what locale were you running gtk2 apps when
>> pasting didn't work?
> The system default, which is no default language (as recommended during
> the debian locales configuration script for multi-language systems), so
> just POSIX:
I see. I suspect that gtk2's COMPOUND_TEXT encoder produces
\x{...} because latin-1 accented letters are not supported in
that locale.
[...]
> But like I said, I can open a terminal, set LC_ALL=en_US.utf8, start
> emacs, and the pasting does not work (but only for emacs, it still works
> with other apps). *HOWEVER*, if I log out, select any of the available
> locales for the session language in the gdm login, e.g. de_DE.ISO-8859-1
> or en_US.UTF-8, and then login, then all the pasting works properly.
> I suppose that the session locale setting might also alter the way the X
> selection buffer deals with the marked text.
Perhaps. As the selection owner has no way to know in which
locale a selection requester is running, it is likely that
gtk2 assumes that the requester is in the session locale.
>>> The garbaged text corresponds exactly to the unicode hex encodings for
>>> the characters. for example the unicode hex encoding of ß is 00DF and
>>> emacs displays the pasted in ß as \x{00DF}. This certainly isn't a
>>> coincidence.
>>
>>
>> Emacs never generates such \x{.....} notation automatically.
>> So, the text should be generated on the sender side.
> This corroborates the suggestion that the session locale setting is also
> affecting the text in the x selection buffer. But there's still the
> question (except for your utf8-string explanation) of why other apps can
> insert this, but emacs can't.
As I wrote, I think they request UTF8_STRING first, and the
UTF-8 encoder always encodes all characters correctly
regardless of the current locale.
---
Ken'ichi HANDA
handa@m17n.org