From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not
 correctly displayed on a Windows terminal
Date: Sat, 28 Jul 2012 13:06:30 +0300
Message-ID: <83a9yki4ih.fsf@gnu.org>
References: <83vchajyb1.fsf@gnu.org> <83txwujwyg.fsf@gnu.org>
 <83pq7ijv9m.fsf@gnu.org> <83k3xqjnns.fsf@gnu.org>
 <83hastk8i1.fsf@gnu.org> <83d33hk212.fsf@gnu.org>
 <83zk6li6gd.fsf@gnu.org> <83d33gia5u.fsf@gnu.org>
In-reply-to: <83d33gia5u.fsf@gnu.org>
To: dmoncayo@gmail.com, lekktu@gmail.com
Cc: 12055@debbugs.gnu.org

> Date: Sat, 28 Jul 2012 11:04:29 +0300
> From: Eli Zaretskii
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
>
> > Date: Sat, 28 Jul 2012 03:12:12 +0200
> > From: Dani Moncayo
> > Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> >
> > > Please
> > > post here the exact output, and please tell for each pair of such
> > > messages which character did you type.
> >
> > Sorry for the delay. I've not had time until now.
> >
> > Here is my data:
>
> Thanks to both of you. Now I see that my theory is correct, and I can
> sit down and code the solution for this problem.

Please try the patch below. It works for me.

Please try it also when Unicode input is not used (it is used by
default on Windows NT and later, as a result of this patch). You can
do that by forcing w32_console_unicode_input to zero (either by
modifying the source of w32console.c and rebuilding, or by setting the
variable's value in GDB).

TIA

=== modified file 'lisp/international/mule-cmds.el'
--- lisp/international/mule-cmds.el 2012-07-25 23:11:23 +0000
+++ lisp/international/mule-cmds.el 2012-07-28 09:43:40 +0000
@@ -2655,23 +2655,29 @@ See also `locale-charset-language-names'
     ;; On Windows, override locale-coding-system,
     ;; default-file-name-coding-system, keyboard-coding-system,
-    ;; terminal-coding-system with system codepage.
+    ;; terminal-coding-system with the appropriate codepages.
     (when (boundp 'w32-ansi-code-page)
-      (let ((code-page-coding (intern (format "cp%d" w32-ansi-code-page))))
-        (when (coding-system-p code-page-coding)
-          (unless frame (setq locale-coding-system code-page-coding))
-          (set-keyboard-coding-system code-page-coding frame)
-          (set-terminal-coding-system code-page-coding frame)
-          ;; Set default-file-name-coding-system last, so that Emacs
-          ;; doesn't try to use cpNNNN when it defines keyboard and
-          ;; terminal encoding. That's because the above two lines
-          ;; will want to load code-pages.el, where cpNNNN are
-          ;; defined; if default-file-name-coding-system were set to
-          ;; cpNNNN while these two lines run, Emacs will want to use
-          ;; it for encoding the file name it wants to load. And that
-          ;; will fail, since cpNNNN is not yet usable until
-          ;; code-pages.el finishes loading.
-          (setq default-file-name-coding-system code-page-coding))))
+      (let ((ansi-code-page-coding (intern (format "cp%d" w32-ansi-code-page)))
+            (oem-code-page-coding
+             (intern (format "cp%d" (w32-get-console-codepage))))
+            ansi-cs-p oem-cs-p)
+        (and (coding-system-p ansi-code-page-coding)
+             (setq ansi-cs-p t))
+        (and (coding-system-p oem-code-page-coding)
+             (setq oem-cs-p t))
+        ;; Set the keyboard and display encoding to either the current
+        ;; ANSI codepage or the OEM codepage, depending on whether
+        ;; this is a GUI or a TTY frame.
+        (when ansi-cs-p
+          (unless frame (setq locale-coding-system ansi-code-page-coding))
+          (when (display-graphic-p frame)
+            (set-keyboard-coding-system ansi-code-page-coding frame)
+            (set-terminal-coding-system ansi-code-page-coding frame))
+          (setq default-file-name-coding-system ansi-code-page-coding))
+        (when oem-cs-p
+          (unless (display-graphic-p frame)
+            (set-keyboard-coding-system oem-code-page-coding frame)
+            (set-terminal-coding-system oem-code-page-coding frame)))))
     (when (eq system-type 'darwin)
       ;; On Darwin, file names are always encoded in utf-8, no matter

=== modified file 'src/w32console.c'
--- src/w32console.c 2012-06-28 07:50:27 +0000
+++ src/w32console.c 2012-07-28 09:48:41 +0000
@@ -37,6 +37,7 @@ along with GNU Emacs. If not, see 0)
         {
-          char cp[20];
-          int cpId;
+          int cpId = GetConsoleCP ();
           event->uChar.UnicodeChar = buf[isdead - 1];
-
-          GetLocaleInfo (GetThreadLocale (),
-                         LOCALE_IDEFAULTANSICODEPAGE, cp, 20);
-          cpId = atoi (cp);
           isdead = WideCharToMultiByte (cpId, 0, buf, isdead,
                                         ansi_code, 4, NULL, NULL);
         }
@@ -447,26 +452,34 @@ key_event (KEY_EVENT_RECORD *event, stru
         }
       else if (event->uChar.AsciiChar > 0)
         {
+          /* Pure ASCII characters < 128. */
           emacs_ev->kind = ASCII_KEYSTROKE_EVENT;
           emacs_ev->code = event->uChar.AsciiChar;
         }
-      else if (event->uChar.UnicodeChar > 0)
+      else if (event->uChar.UnicodeChar > 0
+               && w32_console_unicode_input)
         {
+          /* Unicode codepoint; only valid if we are using Unicode
+             console input mode. */
           emacs_ev->kind = MULTIBYTE_CHAR_KEYSTROKE_EVENT;
           emacs_ev->code = event->uChar.UnicodeChar;
         }
       else
         {
-          /* Fallback for non-Unicode versions of Windows. */
+          /* Fallback handling of non-ASCII characters for non-Unicode
+             versions of Windows, and for non-Unicode input on NT
+             family of Windows. Only characters in the current
+             console codepage are supported by this fallback. */
           wchar_t code;
           char dbcs[2];
-          char cp[20];
           int cpId;
-          /* Get the codepage to interpret this key with. */
-          GetLocaleInfo (GetThreadLocale (),
-                         LOCALE_IDEFAULTANSICODEPAGE, cp, 20);
-          cpId = atoi (cp);
+          /* Get the current console input codepage to interpret this
+             key with. Note that the system defaults for the OEM
+             codepage could have been changed by calling SetConsoleCP
+             or w32-set-console-codepage, so using GetLocaleInfo to
+             get LOCALE_IDEFAULTCODEPAGE is not TRT here. */
+          cpId = GetConsoleCP ();
           dbcs[0] = dbcs_lead;
           dbcs[1] = event->uChar.AsciiChar;
@@ -501,6 +514,7 @@ key_event (KEY_EVENT_RECORD *event, stru
     }
   else
     {
+      /* Function keys and other non-character keys. */
       emacs_ev->kind = NON_ASCII_KEYSTROKE_EVENT;
       emacs_ev->code = event->wVirtualKeyCode;
     }

=== modified file 'src/w32inevt.h'
--- src/w32inevt.h 2012-01-19 07:21:25 +0000
+++ src/w32inevt.h 2012-07-28 08:39:49 +0000
@@ -19,6 +19,8 @@ along with GNU Emacs. If not, see