From mboxrd@z Thu Jan 1 00:00:00 1970
From: Joakim Hårsman
Newsgroups: gmane.emacs.bugs
Subject: bug#10299: Emacs doesn't handle Unicode characters in keyboard layout on MS Windows
Date: Sat, 14 Jan 2012 17:40:51 +0100
To: Eli Zaretskii
Cc: 10299@debbugs.gnu.org

Ok, here's the final version of the patch I'm currently using. It now
only switches to the new behavior when running on NT which should
maintain compatibility with Windows 95.

This does make the code somewhat uglier, but when Emacs stops
supporting Windows 95 maybe the source can be cleaned up and switched
to always use the full Unicode API on all Windows platforms.

I've been using an Emacs with this patch for a couple of weeks now and
haven't discovered any problems so far. I don't have a Windows 95
system to test on though.

=== modified file 'src/w32fns.c'
--- src/w32fns.c        2011-12-04 08:02:42 +0000
+++ src/w32fns.c        2012-01-08 15:30:12 +0000
@@ -1697,10 +1697,13 @@
   if (FRAME_W32_WINDOW (f))
     {
       if (STRING_MULTIBYTE (name))
-       name = ENCODE_SYSTEM (name);
-
+        name = ENCODE_SYSTEM (name);
+
       BLOCK_INPUT;
-      SetWindowText (FRAME_W32_WINDOW (f), SDATA (name));
+      if (os_subtype == OS_NT)
+        SetWindowTextW (FRAME_W32_WINDOW (f), SDATA (name));
+      else
+        SetWindowText (FRAME_W32_WINDOW (f), SDATA (name));
       UNBLOCK_INPUT;
     }
 }
@@ -1746,7 +1749,10 @@
         name = ENCODE_SYSTEM (name);

       BLOCK_INPUT;
-      SetWindowText (FRAME_W32_WINDOW (f), SDATA (name));
+      if (os_subtype == OS_NT)
+        SetWindowTextW (FRAME_W32_WINDOW (f), SDATA (name));
+      else
+        SetWindowText (FRAME_W32_WINDOW (f), SDATA (name));
       UNBLOCK_INPUT;
     }
 }
@@ -1785,20 +1791,39 @@
 static BOOL
 w32_init_class (HINSTANCE hinst)
 {
-  WNDCLASS wc;
-
-  wc.style = CS_HREDRAW | CS_VREDRAW;
-  wc.lpfnWndProc = (WNDPROC) w32_wnd_proc;
-  wc.cbClsExtra = 0;
-  wc.cbWndExtra = WND_EXTRA_BYTES;
-  wc.hInstance = hinst;
-  wc.hIcon = LoadIcon (hinst, EMACS_CLASS);
-  wc.hCursor = w32_load_cursor (IDC_ARROW);
-  wc.hbrBackground = NULL; /* GetStockObject (WHITE_BRUSH); */
-  wc.lpszMenuName = NULL;
-  wc.lpszClassName = EMACS_CLASS;
-
-  return (RegisterClass (&wc));
+  WNDCLASSW uwc;
+  WNDCLASS wc;
+
+  if (os_subtype == OS_NT)
+    {
+      uwc.style = CS_HREDRAW | CS_VREDRAW;
+      uwc.lpfnWndProc = (WNDPROC) w32_wnd_proc;
+      uwc.cbClsExtra = 0;
+      uwc.cbWndExtra = WND_EXTRA_BYTES;
+      uwc.hInstance = hinst;
+      uwc.hIcon = LoadIcon (hinst, EMACS_CLASS);
+      uwc.hCursor = w32_load_cursor (IDC_ARROW);
+      uwc.hbrBackground = NULL; /* GetStockObject (WHITE_BRUSH); */
+      uwc.lpszMenuName = NULL;
+      uwc.lpszClassName = L"Emacs";
+
+      return (RegisterClassW (&uwc));
+    }
+  else
+    {
+      wc.style = CS_HREDRAW | CS_VREDRAW;
+      wc.lpfnWndProc = (WNDPROC) w32_wnd_proc;
+      wc.cbClsExtra = 0;
+      wc.cbWndExtra = WND_EXTRA_BYTES;
+      wc.hInstance = hinst;
+      wc.hIcon = LoadIcon (hinst, EMACS_CLASS);
+      wc.hCursor = w32_load_cursor (IDC_ARROW);
+      wc.hbrBackground = NULL; /* GetStockObject (WHITE_BRUSH); */
+      wc.lpszMenuName = NULL;
+      wc.lpszClassName = EMACS_CLASS;
+
+      return (RegisterClass (&wc));
+    }
 }

 static HWND
@@ -2248,8 +2273,16 @@

   msh_mousewheel = RegisterWindowMessage (MSH_MOUSEWHEEL);

-  while (GetMessage (&msg, NULL, 0, 0))
+  while (1)
     {
+      if (os_subtype == OS_NT)
+        result = GetMessageW (&msg, NULL, 0, 0);
+      else
+        result = GetMessage (&msg, NULL, 0, 0);
+
+      if (!result)
+        break;
+
       if (msg.hwnd == NULL)
         {
           switch (msg.message)
@@ -2915,8 +2948,21 @@

     case WM_SYSCHAR:
     case WM_CHAR:
-      post_character_message (hwnd, msg, wParam, lParam,
-                              w32_get_key_modifiers (wParam, lParam));
+      if (wParam > 255 )
+        {
+          unsigned short lo = wParam & 0x0000FFFF;
+          unsigned short hi = (wParam & 0xFFFF0000) >> 8;
+          wParam = hi | lo;
+
+          W32Msg wmsg;
+          wmsg.dwModifiers = w32_get_key_modifiers (wParam, lParam);
+          signal_user_input ();
+          my_post_msg (&wmsg, hwnd, WM_UNICHAR, wParam, lParam);
+
+        }
+      else
+        post_character_message (hwnd, msg, wParam, lParam,
+                                w32_get_key_modifiers (wParam, lParam));
       break;

     case WM_UNICHAR:

On 20 December 2011 22:16, Joakim Hårsman wrote:
> On 18 December 2011 19:13, Eli Zaretskii wrote:
>>> Date: Sun, 18 Dec 2011 18:31:55 +0100
>>> From: Joakim Hårsman
>>>
>>> > That's good news.  However, I'm puzzled: are you saying that the code
>>> > points passed by Windows to Emacs for the characters generated by MKLC
>>> > are outside the Unicode BMP, i.e. larger than 65535?  If so, what code
>>> > points are they?
>>>
>>> No, none of the characters I needed are outside the BMP.
>>>
>>> WM_CHAR encodes the codepoint in UTF-16 inside wParam, while
>>> WM_UNICHAR uses UTF-32. So if I press something which gives U+2218
>>> RING OPERATOR, I get a WM_CHAR event with a wParam of 2228248 or
>>> 0x220018.
>>
>> ??? UTF-16 encodes the characters in the BMP as themselves, i.e. a
>> single 16-bit value that is numerically identical to the codepoint.
>> That is, you should have gotten 0x2218.  What am I missing?
>>
>>> I experimented a bit, and CreateWindowW isn't needed after all. As
>>> long as I use RegisterClassW and GetMessageW, things work. I'm unsure
>>> if it's TranslateMessage that translates the key press to a question
>>> mark or if it's GetMessage that does it on receiving the message.
>>
>> Question marks are a sign that Windows tried to convert the character
>> to its ANSI equivalent, and failed.  I.e., it means that Windows
>> thought the program asked for ANSI encoded characters.  So it's
>> probably TranslateMessage that did it.
>>
>>> I'll try to get frame titles working again as well, then I can
>>> probably switch on os_subtype in two or three places and Windows 95
>>> won't be affected at all. Do you think that is a good plan?
>>
>> Yes, thanks.
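For reference, the arithmetic being discussed above can be checked in
isolation. The following is a minimal sketch in portable C, not part of
the patch: utf16_encode and repack_wparam are hypothetical helper names
introduced only for this illustration. The first function shows the
standard UTF-16 encoding (a BMP code point such as U+2218 is a single
code unit equal to the code point); the second reproduces the repacking
done in the WM_CHAR hunk, which turns the reported 0x220018 into 0x2218.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-16.  Stores one or two code
   units in OUT and returns how many were written.  (Hypothetical
   helper, for illustration only.)  */
size_t
utf16_encode (uint32_t cp, uint16_t out[2])
{
  /* Reject values that are not Unicode scalar values.  */
  assert (cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

  if (cp < 0x10000)
    {
      /* BMP: the single code unit is numerically the code point.  */
      out[0] = (uint16_t) cp;
      return 1;
    }

  /* Outside the BMP: split the 20-bit offset into a surrogate pair.  */
  cp -= 0x10000;
  out[0] = (uint16_t) (0xD800 | (cp >> 10));    /* high surrogate */
  out[1] = (uint16_t) (0xDC00 | (cp & 0x3FF));  /* low surrogate */
  return 2;
}

/* The same bit manipulation as the WM_CHAR hunk of the patch, lifted
   out of the window procedure.  (Hypothetical helper name.)  */
uint32_t
repack_wparam (uint32_t wparam)
{
  uint16_t lo = wparam & 0x0000FFFF;
  uint16_t hi = (wparam & 0xFFFF0000) >> 8;
  return hi | lo;
}
```

With these definitions, utf16_encode (0x2218, units) writes the single
unit 0x2218, and repack_wparam (0x220018) returns 0x2218. The sketch
only confirms the arithmetic; it does not explain why the value arrived
split across the two words of wParam.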
>
> I've fixed the issues with the frame titles, and everything appears to
> work, there are a number of issues I find very confusing however.
>
> Here's the state of my changes as of now:
>
> === modified file 'src/w32fns.c'
> --- src/w32fns.c        2011-12-04 08:02:42 +0000
> +++ src/w32fns.c        2011-12-20 20:46:40 +0000
> @@ -1697,10 +1697,10 @@
>    if (FRAME_W32_WINDOW (f))
>      {
>        if (STRING_MULTIBYTE (name))
> -       name = ENCODE_SYSTEM (name);
> +        name = ENCODE_SYSTEM (name);
>
>        BLOCK_INPUT;
> -      SetWindowText (FRAME_W32_WINDOW (f), SDATA (name));
> +      SetWindowTextW (FRAME_W32_WINDOW (f), SDATA (name));
>        UNBLOCK_INPUT;
>      }
>  }
> @@ -1746,7 +1746,7 @@
>          name = ENCODE_SYSTEM (name);
>
>        BLOCK_INPUT;
> -      SetWindowText (FRAME_W32_WINDOW (f), SDATA (name));
> +      SetWindowTextW (FRAME_W32_WINDOW (f), SDATA (name));
>        UNBLOCK_INPUT;
>      }
>  }
> @@ -1785,7 +1785,7 @@
>  static BOOL
>  w32_init_class (HINSTANCE hinst)
>  {
> -  WNDCLASS wc;
> +  WNDCLASSW wc;
>
>    wc.style = CS_HREDRAW | CS_VREDRAW;
>    wc.lpfnWndProc = (WNDPROC) w32_wnd_proc;
> @@ -1796,9 +1796,9 @@
>    wc.hCursor = w32_load_cursor (IDC_ARROW);
>    wc.hbrBackground = NULL; /* GetStockObject (WHITE_BRUSH);  */
>    wc.lpszMenuName = NULL;
> -  wc.lpszClassName = EMACS_CLASS;
> +  wc.lpszClassName = L"Emacs";
>
> -  return (RegisterClass (&wc));
> +  return (RegisterClassW (&wc));
>  }
>
>  static HWND
> @@ -2248,7 +2248,7 @@
>
>    msh_mousewheel = RegisterWindowMessage (MSH_MOUSEWHEEL);
>
> -  while (GetMessage (&msg, NULL, 0, 0))
> +  while (GetMessageW (&msg, NULL, 0, 0))
>      {
>        if (msg.hwnd == NULL)
>          {
> @@ -2915,8 +2915,21 @@
>
>      case WM_SYSCHAR:
>      case WM_CHAR:
> -      post_character_message (hwnd, msg, wParam, lParam,
> -                              w32_get_key_modifiers (wParam, lParam));
> +      if (wParam > 255 )
> +        {
> +          unsigned short lo = wParam & 0x0000FFFF;
> +          unsigned short hi = (wParam & 0xFFFF0000) >> 8;
> +          wParam  = hi | lo;
> +
> +          W32Msg wmsg;
> +          wmsg.dwModifiers = w32_get_key_modifiers (wParam, lParam);
> +          signal_user_input ();
> +          my_post_msg (&wmsg, hwnd, WM_UNICHAR, wParam, lParam);
> +
> +        }
> +      else
> +        post_character_message (hwnd, msg, wParam, lParam,
> +                                w32_get_key_modifiers (wParam, lParam));
>        break;
>
>      case WM_UNICHAR:
>
> I should probably also only do this on NT (to avoid breaking stuff on
> Windows 95), but that should be easy to fix.
>
> There are a couple of very weird things going on however:
>
> 1. Why is wParam encoded in a weird format spread over the lo and hi
> word of the wParam DWORD?
>
> 2. Why does sending 8-bit strings to SetWindowTextW work, but sending
> 8-bit strings to SetWindowTextA for a window with a "Unicode" window
> class only use the first character?
>
> My guess would be that the correct solution for 2 is to always encode
> frame captions in utf-16le before sending them to SetWindowTextW,
> however I'm not sure what the best way to do this is.
>
> I figure I should use something like this:
>
> Lisp_Object encoding = intern_c_string ("utf-16le-dos");
> name = code_convert_string_norecord (name, encoding, 1);
> SetWindowTextW (FRAME_W32_WINDOW (f), SDATA (name));
>
> Sadly that didn't work (I still get single char frame captions), and I
> never managed to get gdb on Windows to print Lisp objects correctly,
> so I had a hard time understanding why it didn't work. Looking at the
> data that actually gets sent to SetWindowText might make things
> clearer.
>
> Anyway, the current patch works fine as far as I can tell, but it's a
> bit disconcerting to not know *why* things work the way they do.
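
One general effect that produces exactly a one-character caption is a
UTF-16LE buffer being read somewhere as a NUL-terminated 8-bit string:
for ASCII text, every second byte is zero, so a byte-oriented reader
stops after the first character. The sketch below demonstrates only
that effect in portable C. It is an illustration, not a diagnosis of
which code path in w32fns.c misreads the buffer, and ascii_to_utf16le
and length_as_byte_string are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write ASCII text as UTF-16LE bytes (the character byte first, then a
   zero high byte), followed by a two-byte terminator.  OUT must have
   room for 2 * (strlen (ascii) + 1) bytes.  Returns the number of
   bytes written, terminator included.  (Hypothetical helper.)  */
size_t
ascii_to_utf16le (const char *ascii, unsigned char *out)
{
  size_t n = 0;

  for (; *ascii; ascii++)
    {
      /* This sketch handles 7-bit ASCII input only.  */
      assert ((unsigned char) *ascii < 0x80);
      out[n++] = (unsigned char) *ascii;  /* low byte */
      out[n++] = 0;                       /* high byte */
    }

  out[n++] = 0;
  out[n++] = 0;
  return n;
}

/* The length a byte-oriented consumer would see in the same buffer:
   it stops at the first zero byte.  (Hypothetical helper.)  */
size_t
length_as_byte_string (const unsigned char *buf)
{
  return strlen ((const char *) buf);
}
```

For the caption "Emacs", ascii_to_utf16le writes 12 bytes, and
length_as_byte_string on the result is 1: only the "E" is visible to a
reader that treats the buffer as an 8-bit string.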