From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Lennart Borgman
Newsgroups: gmane.emacs.bugs
Subject: bug#10299: Emacs doesn't handle Unicode characters in keyboard layout on MS Windows
Date: Mon, 19 Dec 2011 12:17:50 +0100
References: <8739clgapc.fsf@gnu.org> <83zket20xw.fsf@gnu.org> <83vcph0w9t.fsf@gnu.org> <83obv821wv.fsf@gnu.org> <831us31atj.fsf@gnu.org> <83pqflzr1d.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Cc: 10299@debbugs.gnu.org
To: Joakim Hårsman
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:55061

On Mon, Dec 19, 2011 at 12:04, Joakim Hårsman wrote:
> On 19 December 2011 11:59, Lennart Borgman wrote:
>> On Mon, Dec 19, 2011 at 11:44, Joakim Hårsman wrote:
>>> On 18 December 2011 19:13, Eli Zaretskii wrote:
>>>>> Date: Sun, 18 Dec 2011 18:31:55 +0100
>>>>> From: Joakim Hårsman
>>>>>
>>>>> > That's good news.  However, I'm puzzled: are you saying that the code
>>>>> > points passed by Windows to Emacs for the characters generated by MKLC
>>>>> > are outside the Unicode BMP, i.e. larger than 65535?  If so, what code
>>>>> > points are they?
>>>>>
>>>>> No, none of the characters I needed are outside the BMP.
>>>>>
>>>>> WM_CHAR encodes the codepoint in UTF-16 inside wParam, while
>>>>> WM_UNICHAR uses UTF-32. So if I press something which gives U+2218
>>>>> RING OPERATOR, I get a WM_CHAR event with a wParam of 2228248 or
>>>>> 0x220018.
>>>>
>>>> ??? UTF-16 encodes the characters in the BMP as themselves, i.e. a
>>>> single 16-bit value that is numerically identical to the codepoint.
>>>> That is, you should have gotten 0x2218.  What am I missing?
>>>
>>> I just assumed Windows encoded the codepoints into a DWORD in some
>>> funky way, but looking more closely at the documentation it appears
>>> like wParam should just be the codepoint. Even more strangely, some
>>> places claim that if a keyboard produces a character outside the BMP,
>>> you get two WM_CHAR events.
>>>
>>> From what I can tell, Emacs itself never alters wParam, but I guess
>>> Windows might do some funky multibyte encoding since Emacs isn't
>>> completely Unicode?
>>
>> Maybe Emacs on Windows still is using the ANSI version of DefWindowProc? See
>>
>> http://blogs.msdn.com/b/michkap/archive/2007/03/25/1945659.aspx
>
> I looked at that page as well, but it says that the ANSI DefWindowProc
> is supposed to post one or two ANSI characters, and it definitely
> isn't doing that. I get the correct Unicode character, just spread
> over the low and high word of the wParam dword.

Strange. What is the reason Emacs is still using the ANSI version?
Maybe a mix of ANSI and UNICODE versions gives strange results?
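
As a rough illustration of the numbers quoted above (a sketch in Python, not Emacs or Windows API code): 2228248 is 0x220018, so the high word of wParam is 0x22 and the low word is 0x18. Assuming these are the two bytes of the UTF-16 code unit placed in separate words, shifting the high word left by 8 bits and OR-ing in the low word gives back 0x2218. The helper names split_words and recombine are hypothetical, and the byte-per-word layout is only inferred from the single value reported in this thread.

```python
def split_words(wparam):
    """Return (high_word, low_word) of a 32-bit value, like HIWORD/LOWORD."""
    return (wparam >> 16) & 0xFFFF, wparam & 0xFFFF


def recombine(wparam):
    """Hypothetical repair: treat the high word as the upper byte and the
    low word as the lower byte of one UTF-16 code unit (an assumption)."""
    high, low = split_words(wparam)
    return (high << 8) | low


if __name__ == "__main__":
    observed = 2228248                  # wParam value reported for U+2218
    high, low = split_words(observed)
    print(hex(observed))                # 0x220018
    print(hex(high), hex(low))          # 0x22 0x18
    print(hex(recombine(observed)))     # 0x2218
```

This only shows that the reported value is consistent with "the correct character, spread over the low and high word"; it says nothing about which component produces that layout.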