From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#10299: Emacs doesn't handle Unicode characters in keyboard layout on MS Windows
Date: Thu, 15 Dec 2011 01:22:11 -0500
To: Joakim Hårsman
Cc: 10299@debbugs.gnu.org
In-reply-to: (message from Joakim Hårsman on Wed, 14 Dec 2011 21:39:28 +0100)

> Date: Wed, 14 Dec 2011 21:39:28 +0100
> From: Joakim Hårsman
>
> However, Emacs doesn't seem to handle the case when the keyboard
> layout contains characters not available in the ANSI code page, and
> just prints a question mark character instead.

Yes, Emacs on Windows uses the ANSI codepage to read the keyboard
input. Does it help to play with the value of
keyboard-coding-system?

> For certain characters,
> a character that is visually similar to the actual character is
> printed instead of a question mark. For example, if I use a layout
> where AltGr+O produces U+2218 RING OPERATOR, Emacs prints U+00B0
> DEGREE SYMBOL instead. The degree symbol is available in Windows 1252,
> the default ANSI code page on my system, but the ring operator
> isn't.

I'm guessing that this is Windows trying to translate the characters
to the ANSI codepage behind the scenes.

> However, if the layout maps AltGr+R to U+0220A SMALL ELEMENT OF, Emacs
> just prints a question mark, presumably because Windows 1252 doesn't
> contain a reasonable replacement for that character.

Will inputting these characters with "C-x 8 RET 0220a RET" or
"C-x 8 RET SMALL ELEMENT OF RET" be a good enough solution for you?
You can input any Unicode character by its name or codepoint using
"C-x 8 RET".

> I'd be happy to help debug this but I have no idea where to even
> start. Is there an easy way to find out if it's the C code that
> clobbers the character or if it happens in lisp for example?

I don't think there is any "clobbering". Emacs deliberately converts
the Unicode characters to the current locale's ANSI codepage. I think
(but I'm not sure) the reason is that Emacs cannot use UTF-16 for
keyboard input.

Perhaps Jason and Handa-san could comment on this.
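
To illustrate the conversion described above, here is a minimal Python
sketch. It is not Emacs code, and the helper name to_ansi is made up
for this example. It encodes the characters from the report to Windows
1252 and substitutes "?" when the code page has no mapping. Note one
assumption: Python's cp1252 codec does no "best fit" substitution, so
unlike what Windows appears to do for AltGr+O, it yields "?" for U+2218
rather than the degree sign.

```python
import unicodedata

# Characters from the bug report, keyed by the key combination that produces them.
CHARS = {"AltGr+O": "\u2218", "AltGr+R": "\u220a"}


def to_ansi(ch, codepage="cp1252"):
    """Encode one character to the code page; unmappable characters become b'?'."""
    return ch.encode(codepage, errors="replace")


for key, ch in CHARS.items():
    # Show the code point, its Unicode name, and the bytes after conversion.
    print(key, f"U+{ord(ch):04X}", unicodedata.name(ch), to_ansi(ch))

# The degree sign does exist in Windows 1252, so it survives the conversion.
print("U+00B0", to_ansi("\u00b0"))
```

Both characters from the report come out as "?" here, which matches the
question mark seen for U+220A; the degree sign encodes to the single
byte 0xB0.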