From mboxrd@z Thu Jan 1 00:00:00 1970
From: Benjamin Riefenstahl
Newsgroups: gmane.emacs.devel
Subject: [Patch] Unicode support for the MS Windows clipboard
Date: Wed, 26 May 2004 20:01:22 +0200
Cc: Sam Steingold
Original-To: emacs-devel@gnu.org
User-Agent: Gnus/5.1001 (Gnus v5.10.1) Emacs/21.3.50 (gnu/linux)
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
List-Id: "Emacs development discussions."
Xref: main.gmane.org gmane.emacs.devel:23994

--=-=-=

Hi everybody,

As we just had another discussion on this, I sat down and wrote a
preliminary implementation of Unicode support for the MS Windows
clipboard, see attached patch.

I assume some of you have opinions on how this should actually be
packaged, so this patch is mostly for introduction, testing and
playing around.

If this code is acceptable in principle, I will probably also need to
send in papers for the copyright assignment before it can be used.

What does it do:

- Introduce a new variable `w32-clipboard-type' to use with
  cut-and-paste instead of the hard-coded CF_TEXT.

  The default for `w32-clipboard-type' is CF_TEXT, because
  CF_UNICODETEXT is not compatible with 9x/Me and uses more memory,
  and because CF_TEXT was used before.

- Drop optimizations for ASCII-only text.  This is mostly because I
  couldn't get all combinations straight in my mind, between this, the
  `last_clipboard_text' mechanism and CF_UNICODETEXT.

Open questions:

- Support for CF_OEMTEXT (console text) in the clipboard may look
  superfluous.  OTOH this may be useful for full console mode support
  (emacs -nw) on 9x/Me.

  If we keep this, the default for `w32-clipboard-type' should depend
  on the display mode (console vs GUI).  In that case
  `w32-clipboard-type' should be set from the Lisp code, though, I
  think.

- `selection-coding-system' and `w32-clipboard-type' need to be set in
  a synchronized manner.  I'm not yet sure if we want this to be done
  by Lisp code or by combining/synchronizing the two variables in the
  C code somehow.

  If we keep them separate, users can in theory use other
  coding-systems with this as they see fit.  They can adapt it to
  whatever exotic locales and coding-systems they may have.  But
  that may be an academic issue.

  Anyway, as long as they are kept separate, these are the recommended
  combinations for an English version of Windows:

    w32-clipboard-type | selection-coding-system
    --------------------------------------------
    CF_TEXT            | cp1252-dos (GetACP())
    CF_OEMTEXT         | cp850-dos (GetOEMCP())
    CF_UNICODETEXT     | utf-16le-dos

- If we keep `w32-clipboard-type', we could try to use more
  user-oriented names for the different types instead of just taking
  them from the C API names.

- We want to either drop the #if 0 sections in the code completely or
  make that code work.

benny

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=w32select.c.patch

Index: w32select.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/w32select.c,v
retrieving revision 1.32
diff -u -p -r1.32 w32select.c
--- w32select.c 18 Apr 2004 18:34:03 -0000 1.32
+++ w32select.c 26 May 2004 17:35:14 -0000
@@ -41,6 +41,10 @@ static Lisp_Object Vselection_coding_sys
 /* Coding system for the next communicating with other Windows programs. */
 static Lisp_Object Vnext_selection_coding_system;
 
+/* Type of clipboard transfer method that should be used. */
+static Lisp_Object Vw32_clipboard_type;
+static Lisp_Object QCF_TEXT, QCF_OEMTEXT, QCF_UNICODETEXT;
+
 /* Sequence number, used where possible to detect when we are pasting
    our own text. */
 static DWORD last_clipboard_sequence_number;
@@ -110,6 +114,20 @@ DEFUN ("w32-close-clipboard", Fw32_close
 
 #endif
 
+static UINT
+get_cf_type (void)
+{
+  CHECK_SYMBOL (Vw32_clipboard_type);
+
+  if (EQ (Vw32_clipboard_type, QCF_UNICODETEXT))
+    return CF_UNICODETEXT;
+
+  if (EQ (Vw32_clipboard_type, QCF_OEMTEXT))
+    return CF_OEMTEXT;
+
+  return CF_TEXT;
+}
+
 DEFUN ("w32-set-clipboard-data", Fw32_set_clipboard_data,
        Sw32_set_clipboard_data, 1, 2, 0,
        doc: /* This sets the clipboard data to the given text. */)
@@ -134,6 +152,7 @@ DEFUN ("w32-set-clipboard-data", Fw32_se
   src = SDATA (string);
   dst = src;
 
+#if 0 /* Disable ASCII-only optimizations */
   /* We need to know how many lines there are, since we need CRLF line
      termination for compatibility with other Windows Programs.
      avoid using strchr because it recomputes the length every time */
@@ -142,8 +161,10 @@ DEFUN ("w32-set-clipboard-data", Fw32_se
       nlines++;
       dst++;
     }
+#endif
 
   {
+#if 0 /* Disable ASCII-only optimizations */
     /* Since we are now handling multilingual text, we must consider
        encoding text for the clipboard. */
     int charset_info = find_charset_in_text (src, SCHARS (string),
@@ -192,6 +213,7 @@ DEFUN ("w32-set-clipboard-data", Fw32_se
         Vlast_coding_system_used = Qraw_text;
       }
     else
+#endif
       {
         /* We must encode contents of OBJ to the selection coding
            system. */
@@ -242,7 +264,14 @@ DEFUN ("w32-set-clipboard-data", Fw32_se
                                                      clipboard_storage_size);
           }
         if (last_clipboard_text)
-          memcpy (last_clipboard_text, dst, coding.produced);
+          {
+            memcpy (last_clipboard_text, dst, coding.produced);
+            /* Add a string terminator. Set *two* bytes after the
+               string to NUL, the second just in case we are using
+               CF_UNICODETEXT. */
+            last_clipboard_text[coding.produced] =
+              last_clipboard_text[coding.produced+1] = '\0';
+          }
       }
 
     GlobalUnlock (htext);
@@ -257,7 +286,7 @@ DEFUN ("w32-set-clipboard-data", Fw32_se
   if (!OpenClipboard ((!NILP (frame) && FRAME_W32_P (XFRAME (frame))) ? FRAME_W32_WINDOW (XFRAME (frame)) : NULL))
     goto error;
 
-  ok = EmptyClipboard () && SetClipboardData (CF_TEXT, htext);
+  ok = EmptyClipboard () && SetClipboardData (get_cf_type(), htext);
 
   CloseClipboard ();
 
@@ -277,8 +306,11 @@ DEFUN ("w32-set-clipboard-data", Fw32_se
 
   ok = FALSE;
   if (htext) GlobalFree (htext);
+
+  /* Set the first *two* bytes to NUL, the second just in case we are
+     using CF_UNICODETEXT. */
   if (last_clipboard_text)
-    *last_clipboard_text = '\0';
+    last_clipboard_text[0] = last_clipboard_text[1] = '\0';
 
   last_clipboard_sequence_number = 0;
 
@@ -296,6 +328,7 @@ DEFUN ("w32-get-clipboard-data", Fw32_ge
 {
   HANDLE htext;
   Lisp_Object ret = Qnil;
+  UINT cf_type;
 
   if (!NILP (frame))
     CHECK_LIVE_FRAME (frame);
@@ -305,7 +338,8 @@ DEFUN ("w32-get-clipboard-data", Fw32_ge
   if (!OpenClipboard ((!NILP (frame) && FRAME_W32_P (XFRAME (frame))) ? FRAME_W32_WINDOW (XFRAME (frame)) : NULL))
     goto done;
 
-  if ((htext = GetClipboardData (CF_TEXT)) == NULL)
+  cf_type = get_cf_type();
+  if ((htext = GetClipboardData (cf_type)) == NULL)
     goto closeclip;
 
   {
@@ -313,12 +347,17 @@ DEFUN ("w32-get-clipboard-data", Fw32_ge
     unsigned char *dst;
     int nbytes;
     int truelen;
+#if 0 /* Disable ASCII-only optimizations */
     int require_decoding = 0;
+#endif
 
     if ((src = (unsigned char *) GlobalLock (htext)) == NULL)
       goto closeclip;
 
-    nbytes = strlen (src);
+    if (cf_type == CF_UNICODETEXT)
+      nbytes = (lstrlenW ((WCHAR *)src) + 1) * 2;
+    else
+      nbytes = strlen (src) + 1;
 
     /* If the text in clipboard is identical to what we put there
        last time w32_set_clipboard_data was called, pretend there's no
@@ -332,6 +371,12 @@ DEFUN ("w32-get-clipboard-data", Fw32_ge
             && memcmp(last_clipboard_text, src, nbytes) == 0))
       goto closeclip;
 
+    /* Drop the string terminator from here on. */
+    nbytes --;
+    if (cf_type == CF_UNICODETEXT)
+      nbytes --;
+
+#if 0 /* Disable ASCII-only optimizations */
     {
       /* If the clipboard data contains any non-ascii code, we
          need to decode it. */
@@ -346,8 +391,11 @@ DEFUN ("w32-get-clipboard-data", Fw32_ge
             }
         }
     }
+#endif
 
+#if 0 /* Disable ASCII-only optimizations */
     if (require_decoding)
+#endif
       {
         int bufsize;
         unsigned char *buf;
@@ -376,6 +424,7 @@ DEFUN ("w32-get-clipboard-data", Fw32_ge
             && !NILP (Ffboundp (coding.post_read_conversion)))
           ret = run_pre_post_conversion_on_str (ret, &coding, 0);
       }
+#if 0 /* Disable ASCII-only optimizations */
     else
       {
         /* Need to know final size after CR chars are removed because we
@@ -421,6 +470,7 @@ DEFUN ("w32-get-clipboard-data", Fw32_ge
 
         Vlast_coding_system_used = Qraw_text;
       }
+#endif
 
     GlobalUnlock (htext);
   }
@@ -499,7 +549,21 @@ next communication only. After the comm
 set to nil. */);
   Vnext_selection_coding_system = Qnil;
 
-  QCLIPBOARD = intern ("CLIPBOARD"); staticpro (&QCLIPBOARD);
+  DEFVAR_LISP ("w32-clipboard-type", &Vw32_clipboard_type,
+               doc: /* MS Windows clipboard type for the communication with other programs.
+When sending or receiving text via clipboard, this is the Windows text
+type that is used. It can be set to `CF_TEXT', `CF_OEMTEXT' or
+`CF_UNICODETEXT'. `CF_UNICODETEXT' is only valid on NT, Windows 2000
+or Windows XP. `selection-coding-system' must be set to match. The
+default value is `CF_TEXT'. */);
+
+  QCF_TEXT = intern ("CF_TEXT"); staticpro (&QCF_TEXT);
+  QCF_OEMTEXT = intern ("CF_OEMTEXT"); staticpro (&QCF_OEMTEXT);
+  QCF_UNICODETEXT = intern ("CF_UNICODETEXT"); staticpro (&QCF_UNICODETEXT);
+  Vw32_clipboard_type = QCF_TEXT;
+
+
+  QCLIPBOARD = intern ("CLIPBOARD"); staticpro (&QCLIPBOARD);
 }
 
 /* arch-tag: c96e9724-5eb1-4dad-be07-289f092fd2af

--=-=-=
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/emacs-devel

--=-=-=--
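
The terminator handling in the patch is easy to get wrong, so here is a minimal, portable sketch of just the size arithmetic: one NUL byte for the 8-bit formats, two for UTF-16. It is not part of the patch. The names `clip_format`, `FMT_*`, `utf16_len`, `clip_bytes_with_terminator` and `clip_bytes_payload` are hypothetical, `uint16_t` stands in for WCHAR, and the hand-written loop stands in for lstrlenW so that it compiles without <windows.h>.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for CF_TEXT, CF_OEMTEXT and CF_UNICODETEXT;
   the real constants come from <windows.h>. */
enum clip_format { FMT_TEXT, FMT_OEMTEXT, FMT_UNICODETEXT };

/* Count UTF-16 code units before the 16-bit NUL (assumed portable
   replacement for lstrlenW). */
static size_t
utf16_len (const uint16_t *s)
{
  size_t n = 0;
  while (s[n] != 0)
    n++;
  return n;
}

/* Size in bytes of the clipboard text including its terminator, as
   needed for the memcmp against the remembered text. */
static size_t
clip_bytes_with_terminator (enum clip_format fmt, const void *src)
{
  if (fmt == FMT_UNICODETEXT)
    return (utf16_len ((const uint16_t *) src) + 1) * 2;
  return strlen ((const char *) src) + 1;
}

/* Size in bytes without the terminator, as the decoding step wants it. */
static size_t
clip_bytes_payload (enum clip_format fmt, const void *src)
{
  size_t n = clip_bytes_with_terminator (fmt, src);
  return n - (fmt == FMT_UNICODETEXT ? 2 : 1);
}
```

With these definitions the 8-bit string "abc" gives 4 bytes with terminator and 3 without, while the same three characters as UTF-16 give 8 and 6.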