From: Joakim Hårsman
Newsgroups: gmane.emacs.bugs
Subject: bug#10299: Emacs doesn't handle Unicode characters in keyboard layout on MS Windows
Date: Tue, 20 Dec 2011 22:16:53 +0100
To: Eli Zaretskii
Cc: 10299@debbugs.gnu.org
In-Reply-To: <83pqflzr1d.fsf@gnu.org>

On 18 December 2011 19:13, Eli Zaretskii wrote:
>> Date: Sun, 18 Dec 2011 18:31:55 +0100
>> From: Joakim Hårsman
>>
>> > That's good news.  However, I'm puzzled: are you saying that the code
>> > points passed by Windows to Emacs for the characters generated by MKLC
>> > are outside the Unicode BMP, i.e. larger than 65535?  If so, what code
>> > points are they?
>>
>> No, none of the characters I needed are outside the BMP.
>>
>> WM_CHAR encodes the codepoint in UTF-16 inside wParam, while
>> WM_UNICHAR uses UTF-32. So if I press something which gives U+2218
>> RING OPERATOR, I get a WM_CHAR event with a wParam of 2228248 or
>> 0x220018.
>
> ??? UTF-16 encodes the characters in the BMP as themselves, i.e. a
> single 16-bit value that is numerically identical to the codepoint.
> That is, you should have gotten 0x2218.  What am I missing?
>
>> I experimented a bit, and CreateWindowW isn't needed after all. As
>> long as I use RegisterClassW and GetMessageW, things work. I'm unsure
>> if it's TranslateMessage that translates the key press to a question
>> mark or if it's GetMessage that does it on receiving the message.
>
> Question marks are a sign that Windows tried to convert the character
> to its ANSI equivalent, and failed.  I.e., it means that Windows
> thought the program asked for ANSI encoded characters.  So it's
> probably TranslateMessage that did it.
>
>> I'll try to get frame titles working again as well, then I can
>> probably switch on os_subtype in two or three places and Windows 95
>> won't be affected at all. Do you think that is a good plan?
>
> Yes, thanks.

I've fixed the issues with the frame titles, and everything appears to
work. However, there are a number of things I find very confusing.

Here's the state of my changes as of now:

=== modified file 'src/w32fns.c'
--- src/w32fns.c 2011-12-04 08:02:42 +0000
+++ src/w32fns.c 2011-12-20 20:46:40 +0000
@@ -1697,10 +1697,10 @@
   if (FRAME_W32_WINDOW (f))
     {
       if (STRING_MULTIBYTE (name))
-        name = ENCODE_SYSTEM (name);
+        name = ENCODE_SYSTEM (name);

       BLOCK_INPUT;
-      SetWindowText (FRAME_W32_WINDOW (f), SDATA (name));
+      SetWindowTextW (FRAME_W32_WINDOW (f), SDATA (name));
       UNBLOCK_INPUT;
     }
 }
@@ -1746,7 +1746,7 @@
         name = ENCODE_SYSTEM (name);

       BLOCK_INPUT;
-      SetWindowText (FRAME_W32_WINDOW (f), SDATA (name));
+      SetWindowTextW (FRAME_W32_WINDOW (f), SDATA (name));
       UNBLOCK_INPUT;
     }
 }
@@ -1785,7 +1785,7 @@
 static BOOL
 w32_init_class (HINSTANCE hinst)
 {
-  WNDCLASS wc;
+  WNDCLASSW wc;

   wc.style = CS_HREDRAW | CS_VREDRAW;
   wc.lpfnWndProc = (WNDPROC) w32_wnd_proc;
@@ -1796,9 +1796,9 @@
   wc.hCursor = w32_load_cursor (IDC_ARROW);
   wc.hbrBackground = NULL; /* GetStockObject (WHITE_BRUSH); */
   wc.lpszMenuName = NULL;
-  wc.lpszClassName = EMACS_CLASS;
+  wc.lpszClassName = L"Emacs";

-  return (RegisterClass (&wc));
+  return (RegisterClassW (&wc));
 }

 static HWND
@@ -2248,7 +2248,7 @@

   msh_mousewheel = RegisterWindowMessage (MSH_MOUSEWHEEL);

-  while (GetMessage (&msg, NULL, 0, 0))
+  while (GetMessageW (&msg, NULL, 0, 0))
     {
       if (msg.hwnd == NULL)
         {
@@ -2915,8 +2915,21 @@

     case WM_SYSCHAR:
     case WM_CHAR:
-      post_character_message (hwnd, msg, wParam, lParam,
-                              w32_get_key_modifiers (wParam, lParam));
+      if (wParam > 255 )
+        {
+          unsigned short lo = wParam & 0x0000FFFF;
+          unsigned short hi = (wParam & 0xFFFF0000) >> 8;
+          wParam = hi | lo;
+
+          W32Msg wmsg;
+          wmsg.dwModifiers = w32_get_key_modifiers (wParam, lParam);
+          signal_user_input ();
+          my_post_msg (&wmsg, hwnd, WM_UNICHAR, wParam, lParam);
+
+        }
+      else
+        post_character_message (hwnd, msg, wParam, lParam,
+                                w32_get_key_modifiers (wParam, lParam));
       break;

     case WM_UNICHAR:

I should probably also do this only on NT (to avoid breaking things on
Windows 95), but that should be easy to fix.

There are a couple of very weird things going on, however:

1. Why is the character encoded in a weird format, spread over the low
   and high words of the wParam DWORD?

2. Why does sending 8-bit strings to SetWindowTextW work, while sending
   8-bit strings to SetWindowTextA for a window with a "Unicode" window
   class results in only the first character being used?

My guess is that the correct solution for 2 is to always encode frame
captions in utf-16le before sending them to SetWindowTextW. However,
I'm not sure what the best way to do this is. I figure I should use
something like this:

    Lisp_Object encoding = intern_c_string ("utf-16le-dos");
    name = code_convert_string_norecord (name, encoding, 1);
    SetWindowTextW (FRAME_W32_WINDOW (f), SDATA (name));

Sadly, that didn't work (I still get single-character frame captions),
and I never managed to get gdb on Windows to print Lisp objects
correctly, so I had a hard time understanding why it didn't work.
Looking at the data that actually gets sent to SetWindowText might make
things clearer.

Anyway, the current patch works fine as far as I can tell, but it's a
bit disconcerting not to know *why* things work the way they do.
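
The two puzzles above can be reproduced in plain C without any Windows
headers. The sketch below is only an illustration under stated
assumptions: recombine_wparam, ascii_to_utf16le and narrow_length_of
are hypothetical helper names that do not exist in Emacs or the Win32
API. The first repeats the arithmetic of the WM_CHAR branch in the
patch. The other two show what UTF-16LE bytes look like to code that
expects a NUL-terminated 8-bit string, which is at least consistent
with the single-character captions, though it does not establish that
as the cause.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: same arithmetic as the WM_CHAR branch in the
   patch.  The high word is shifted right by 8 and truncated to 16
   bits, then OR-ed with the low word.  */
uint32_t
recombine_wparam (uint32_t wparam)
{
  uint16_t lo = (uint16_t) (wparam & 0x0000FFFFu);
  uint16_t hi = (uint16_t) ((wparam & 0xFFFF0000u) >> 8);
  return (uint32_t) (hi | lo);
}

/* Hypothetical helper: widen a 7-bit ASCII string into UTF-16LE bytes,
   including the two-byte terminator.  Returns the number of bytes
   written, or 0 if DST is too small or SRC has a non-ASCII byte.  */
size_t
ascii_to_utf16le (const char *src, unsigned char *dst, size_t dst_size)
{
  size_t len, i;

  assert (src != NULL && dst != NULL);
  len = strlen (src);
  if (dst_size < 2 * (len + 1))
    return 0;
  for (i = 0; i < len; i++)
    {
      if ((unsigned char) src[i] > 0x7F)
        return 0;
      dst[2 * i] = (unsigned char) src[i]; /* low byte comes first */
      dst[2 * i + 1] = 0;                  /* high byte is 0 for ASCII */
    }
  dst[2 * len] = 0;
  dst[2 * len + 1] = 0;
  return 2 * (len + 1);
}

/* Length that code expecting a NUL-terminated 8-bit string would see
   if it were handed UTF-16LE bytes.  */
size_t
narrow_length_of (const unsigned char *bytes)
{
  return strlen ((const char *) bytes);
}
```

For the value reported earlier in the thread, recombine_wparam
(0x220018) yields 0x2218. For the caption "Emacs", ascii_to_utf16le
writes 12 bytes, and narrow_length_of reports a length of 1 for them,
because the second byte is already zero.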