From mboxrd@z Thu Jan 1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: Reporting UTF-8 related problems?
Date: Wed, 31 Jul 2002 21:26:40 +0900 (JST)
Sender: emacs-devel-admin@gnu.org
Message-ID: <200207311226.VAA08114@etlken.m17n.org>
References: <2110-Sun28Jul2002212621+0300-eliz@is.elta.co.il> <200207290518.OAA04004@etlken.m17n.org> <200207300522.OAA05828@etlken.m17n.org> <200207300711.QAA05993@etlken.m17n.org>
NNTP-Posting-Host: localhost.gmane.org
Mime-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset=US-ASCII
X-Trace: main.gmane.org 1028118471 3229 127.0.0.1 (31 Jul 2002 12:27:51 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Wed, 31 Jul 2002 12:27:51 +0000 (UTC)
Cc: eliz@is.elta.co.il, emacs-devel@gnu.org, schwab@suse.de
Return-path:
Original-Received: from quimby.gnus.org ([80.91.224.244]) by main.gmane.org with esmtp (Exim 3.33 #1 (Debian)) id 17ZsZy-0000py-00 for ; Wed, 31 Jul 2002 14:27:50 +0200
Original-Received: from fencepost.gnu.org ([199.232.76.164]) by quimby.gnus.org with esmtp (Exim 3.12 #1 (Debian)) id 17ZssA-00054c-00 for ; Wed, 31 Jul 2002 14:46:38 +0200
Original-Received: from localhost ([127.0.0.1] helo=fencepost.gnu.org) by fencepost.gnu.org with esmtp (Exim 3.35 #1 (Debian)) id 17ZsaO-0007cK-00; Wed, 31 Jul 2002 08:28:16 -0400
Original-Received: from tsukuba.m17n.org ([192.47.44.130]) by fencepost.gnu.org with smtp (Exim 3.35 #1 (Debian)) id 17Zsa4-0007bY-00 for ; Wed, 31 Jul 2002 08:27:57 -0400
Original-Received: from fs.m17n.org (fs.m17n.org [192.47.44.2]) by tsukuba.m17n.org (8.11.6/3.7W-20010518204228) with ESMTP id g6VCQfl24411; Wed, 31 Jul 2002 21:26:41 +0900 (JST) (envelope-from handa@m17n.org)
Original-Received: from etlken.m17n.org (etlken.m17n.org [192.47.44.125]) by fs.m17n.org (8.11.3/3.7W-20010823150639) with ESMTP id g6VCQf920135; Wed, 31 Jul 2002 21:26:41 +0900 (JST)
Original-Received: (from handa@localhost) by
    etlken.m17n.org (8.8.8+Sun/3.7W-2001040620) id VAA08114; Wed, 31 Jul 2002 21:26:40 +0900 (JST)
Original-To: keichwa@gmx.net
In-Reply-To: (message from Karl Eichwalder on Tue, 30 Jul 2002 20:58:32 +0200)
User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.1.30 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)
Errors-To: emacs-devel-admin@gnu.org
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.0.11
Precedence: bulk
List-Help:
List-Post:
List-Subscribe: ,
List-Id: Emacs development discussions.
List-Unsubscribe: ,
List-Archive:
Xref: main.gmane.org gmane.emacs.devel:6204
X-Report-Spam: http://spam.gmane.org/gmane.emacs.devel:6204

In article , Karl Eichwalder writes:
> Yes, but once in the X selection I'd like to see Emacs honor them.

> The spacing problem also occurs when I try to cut and paste from Markus
> Kuhn's demo file
> (http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt):

As far as I understand, that's not a spacing problem.  Those clients
send Emacs the designation sequence for jisx0208 characters, so Emacs
simply decodes them correctly (i.e. honors them) and displays them
with a Japanese double-width font.

> When I insert (C-x RET c utf-8 RET C-x C-f UTF-8-demo.txt RET), things
> are correctly displayed (the characters are different):

That's because the file is correctly encoded in utf-8, so Emacs can
decode it correctly.

> Cut and paste both these examples from Emacs (this mail buffer) to a
> UTF-8 xterm doesn't work neither; instead of the quotes I see "-1" and
> garbage.

Yes, because I have not yet installed code for encoding an Emacs
string into what a UTF-8 xterm expects.  I have confirmed that a UTF-8
xterm does request the target type UTF8_STRING first.  I'm now
looking for a way to handle it.

While tracing the whole procedure by which Emacs handles a selection
request, I found the following.  Could someone else also check
whether I have missed something?

When Emacs receives a selection request, x_handle_selection_request
(xselect.c) is called.
The flow is as follows:

x_handle_selection_request (EVENT) -- xselect.c
  x_get_local_selection (SELECTION, TARGET_TYPE) -- xselect.c
    xselect-convert-to-string (SELECTION, TARGET-TYPE, VALUE) -- select.el
      => returns MULTIBYTE-STRING
    => returns MULTIBYTE-STRING
  lisp_data_to_selection_data (EVENT, MULTIBYTE-STRING, ...)
    => returns encoded string
  x_reply_selection_request (EVENT, above returned encoded string)
    ;; sends selection data to the other client

So, it seems that we can perform the encoding in the Lisp function
xselect-convert-to-string rather than in lisp_data_to_selection_data.

BUT... xselect-convert-to-string is also called in this way:

yank -- simple.el
  current-kill -- simple.el
    x-cut-buffer-or-selection-value -- x-win.el
      x-get-selection -- select.el
        Fx_get_selection_internal -- xselect.c
          x_get_local_selection -- xselect.c
            xselect-convert-to-string -- select.el !!!

And, in the latter case, xselect-convert-to-string must return an
Emacs string without encoding it.  Currently,
xselect-convert-to-string has no way to know in which situation it is
called.

So, how about calling xselect-convert-to-string with TARGET-TYPE nil
in the latter case?  This can be done by adding one more arg,
LOCAL-REQUEST, to x_get_local_selection.

If the above analysis is correct, we can implement in Lisp the rather
delicate string-handling code that now lives in
lisp_data_to_selection_data and x_encode_text.  That would make
Emacs's response to selection requests more flexible and would also
make future maintenance easier.

What do you think?

---
Ken'ichi HANDA
handa@etl.go.jp
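
The proposal above (a nil TARGET-TYPE meaning "local request, do not
encode") can be sketched in C roughly as follows.  This is only a
model under stated assumptions, not the actual xselect.c or select.el
code: every name starting with sketch_ is hypothetical, an Emacs
string is stood in for by an array of Unicode code points, and only
the UTF8_STRING target is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: either raw code points or UTF-8 bytes. */
typedef struct {
  int encoded;          /* 1: data holds UTF-8 bytes, 0: raw code points */
  size_t nbytes;        /* number of bytes stored in data */
  unsigned char *data;  /* malloc'ed, caller frees; NULL on failure */
} sketch_value;

/* Encode one code point as UTF-8; returns the number of bytes written. */
static size_t sketch_utf8_encode_char(uint32_t c, unsigned char *out)
{
  if (c < 0x80) {
    out[0] = (unsigned char) c;
    return 1;
  }
  if (c < 0x800) {
    out[0] = (unsigned char) (0xC0 | (c >> 6));
    out[1] = (unsigned char) (0x80 | (c & 0x3F));
    return 2;
  }
  if (c < 0x10000) {
    out[0] = (unsigned char) (0xE0 | (c >> 12));
    out[1] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
    out[2] = (unsigned char) (0x80 | (c & 0x3F));
    return 3;
  }
  out[0] = (unsigned char) (0xF0 | (c >> 18));
  out[1] = (unsigned char) (0x80 | ((c >> 12) & 0x3F));
  out[2] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
  out[3] = (unsigned char) (0x80 | (c & 0x3F));
  return 4;
}

/* Stand-in for xselect-convert-to-string.  A NULL target_type plays the
   role of TARGET-TYPE nil: return the string unencoded.  */
sketch_value sketch_convert_to_string(const uint32_t *chars, size_t n,
                                      const char *target_type)
{
  sketch_value v = {0, 0, NULL};
  assert(chars != NULL || n == 0);

  if (target_type == NULL) {
    /* Local request (e.g. yank): copy the code points as they are. */
    size_t nbytes = n * sizeof *chars;
    v.data = malloc(nbytes ? nbytes : 1);
    if (v.data != NULL && nbytes > 0)
      memcpy(v.data, chars, nbytes);
    if (v.data != NULL)
      v.nbytes = nbytes;
    return v;
  }

  if (strcmp(target_type, "UTF8_STRING") != 0)
    return v;  /* other targets are not modelled in this sketch */

  /* Request from another client: encode for the requested target. */
  v.data = malloc(4 * n + 1);
  if (v.data == NULL)
    return v;
  for (size_t i = 0; i < n; i++)
    v.nbytes += sketch_utf8_encode_char(chars[i], v.data + v.nbytes);
  v.encoded = 1;
  return v;
}

/* Stand-in for x_get_local_selection with the extra LOCAL-REQUEST arg:
   a local request drops the target type before calling the converter. */
sketch_value sketch_get_local_selection(const uint32_t *chars, size_t n,
                                        const char *target_type,
                                        int local_request)
{
  return sketch_convert_to_string(chars, n,
                                  local_request ? NULL : target_type);
}
```

With this shape the converter itself decides whether to encode, so
the caller only has to say whether the request is local; the yank
path would pass local_request nonzero and the selection-request path
would pass zero.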