From: Benjamin Riefenstahl
Newsgroups: gmane.emacs.devel
Subject: Re: Unicode support for the MS Windows clipboard
Date: Thu, 27 May 2004 11:45:42 +0200
To: Eli Zaretskii
Cc: sds@gnu.org, emacs-devel@gnu.org
References: <9681-Thu27May2004100522+0300-eliz@gnu.org>
In-Reply-To: <9681-Thu27May2004100522+0300-eliz@gnu.org> (Eli Zaretskii's message of "Thu, 27 May 2004 10:05:22 +0200")

Hi Eli,

>> From: Benjamin Riefenstahl
>> - Introduce a new variable `w32-clipboard-type'

"Eli Zaretskii" writes:
> Couldn't this be done without introducing Windows-specific options?

I'd certainly love to.  In general all three modes have their use:

- CF_TEXT - This is obviously the fallback.

- CF_UNICODETEXT - The most capable choice on NT/W2K/XP.  According
  to MSDN not supported on 9x/Me.

- CF_OEMTEXT - If you want to cut-and-paste line drawing characters
  between Emacs and other console apps on 9x/Me this would be the
  type to use.  You could consider this scenario too exotic, so that
  we could drop it.  OTOH I know of at least one user that is
  actually using and maintaining files with line drawing characters
  in Emacs.

> AFAIK, the logic employed by Windows when it encodes clipboard text
> is quite simple, something like: if it cannot be encoded with the
> system codepage, use Unicode.  Why cannot Emacs simply follow this
> logic?

Conceptually, Windows doesn't encode, it just marshals memory blocks.
Windows *will* automatically generate any additional supported text
type from whatever an application provides, so that an application
only needs to know one of the three text types.  In this context at
least, 9x/Me only supports CF_TEXT and CF_OEMTEXT while Windows
NT/W2K/XP also supports CF_UNICODETEXT (or so MSDN says).
I don't have a 9x machine ready here at the moment, but I will
probably install one on the weekend for testing of other things
anyway.

> Also, AFAIK CF_UNICODETEXT _can_ be used on Windows 9x, as any
> program like clipbrd.exe or ClipConvert will show you.

9x/Me is explicitly documented on MSDN not to support that.  Since -
from Windows' POV - we are just talking about memory blocks,
CF_UNICODETEXT is probably marshalled fine, but is it also
automatically converted?  I.e. will any non-Unicode application be
able to retrieve the CF_TEXT format that it is entitled to expect,
when we just post CF_UNICODETEXT?  And the other way around?

Even if not, we could probably try a scheme similar to what you
outlined above:

- Receiving: First try CF_UNICODETEXT.  If CF_UNICODETEXT doesn't
  exist, try CF_TEXT.

- Posting: Post CF_UNICODETEXT.  Test if CF_TEXT is there now.  If
  CF_TEXT is not automatically provided by Windows, post CF_TEXT
  ourselves in addition to CF_UNICODETEXT.  Note that this last
  situation would triple the amount of memory required.

Anyway, what happens to the MULE problem in this unified scenario?
Do all problems go away with unify-8859-on-{de,en}coding?

>> - Drop optimizations for ASCII-only text.

> Is that optimization indeed an optimization?

It was obviously intended as such, but it may just have been a
remnant of the code that was there before the introduction of the
{en,de}coding via coding systems into that module.  I will build a
version without my patch and test it.

Thanks for your input,
benny
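
The receiving/posting scheme outlined in the message can be sketched
roughly as follows.  This is only a toy model in Python, not the Emacs
patch and not real Win32 code: FakeClipboard, post_text and
receive_text are hypothetical names, the clipboard is a plain
dictionary, cp1252 stands in for "the system codepage", and the
terminating NUL that real clipboard text carries is left out.  Only
the two format ids (CF_TEXT = 1, CF_UNICODETEXT = 13) are taken from
the Windows headers.

```python
# Toy model of the fallback scheme; no Windows API is called here.
# FakeClipboard, post_text and receive_text are hypothetical names.

CF_TEXT = 1           # format id as defined in the Windows headers
CF_UNICODETEXT = 13   # format id as defined in the Windows headers


class FakeClipboard:
    """Stand-in for the system clipboard: maps format id -> bytes."""

    def __init__(self, synthesizes_text):
        # True models a system that derives CF_TEXT from
        # CF_UNICODETEXT on its own; False models one that does not.
        self.synthesizes_text = synthesizes_text
        self.data = {}

    def empty(self):
        self.data.clear()

    def set(self, fmt, blob):
        self.data[fmt] = blob
        if fmt == CF_UNICODETEXT and self.synthesizes_text:
            text = blob.decode("utf-16-le")
            # Assumption: the synthesized copy uses cp1252.
            self.data.setdefault(CF_TEXT, text.encode("cp1252", "replace"))

    def has(self, fmt):
        return fmt in self.data

    def get(self, fmt):
        return self.data[fmt]


def post_text(clip, text, codepage="cp1252"):
    """Post CF_UNICODETEXT, then add CF_TEXT only if it did not appear."""
    clip.empty()
    clip.set(CF_UNICODETEXT, text.encode("utf-16-le"))
    if not clip.has(CF_TEXT):
        clip.set(CF_TEXT, text.encode(codepage, "replace"))


def receive_text(clip, codepage="cp1252"):
    """Prefer CF_UNICODETEXT, fall back to CF_TEXT, else return None."""
    if clip.has(CF_UNICODETEXT):
        return clip.get(CF_UNICODETEXT).decode("utf-16-le")
    if clip.has(CF_TEXT):
        return clip.get(CF_TEXT).decode(codepage, "replace")
    return None
```

In this model, posting to a clipboard created with
synthesizes_text=False leaves two copies behind: two bytes per
character for CF_UNICODETEXT plus one for CF_TEXT, i.e. three bytes
per character where CF_TEXT alone would need one.  That is one
possible reading of the "triple the amount of memory" remark; whether
a real 9x/Me system behaves like either variant is exactly what still
needs testing.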