* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Dani Moncayo @ 2012-07-26 12:13 UTC
To: 12055

[-- Attachment #1: Type: text/plain, Size: 1082 bytes --]

Hello,

On my Windows 7, 64-bit system, if I start Emacs from a cmd.exe
console (emacs -nw -Q) and type the text "áéíóú", the first two
characters are not displayed correctly (see attached screenshot).

If I exit Emacs, and type those characters again in the console
prompt, they are all displayed correctly.

I've been able to reproduce this problem both in Emacs 24.1 and 23.4.

In GNU Emacs 24.1.50.1 (i386-mingw-nt6.1.7601)
 of 2012-07-19 on DANI-PC
Bzr revision: 109159 monnier@iro.umontreal.ca-20120719113938-sgu5ruqm1vcbchtw
Windowing system distributor `Microsoft Corp.', version 6.1.7601
Configured using:
 `configure --with-gcc (4.7) --enable-checking --cflags
 -I../../libs/libiconv-1.14-2-mingw32-dev/include
 -I../../libs/libxml2-2.7.8-w32-bin/include/libxml2
 -I../../libs/giflib-4.1.4-1/include
 -I../../emacs/libs/gnutls-3.0.16/include
 -I../../libs/jpeg-6b-4/include
 -I../../libs/libpng-1.4.10
 -I../../libs/libxpm-3.5.8/include
 -I../../libs/libxpm-3.5.8/src
 -I../../libs/tiff-3.8.2-1/include
 -I../../libs/zlib-1.2.6'

--
Dani Moncayo

[-- Attachment #2: img1.png --]
[-- Type: image/png, Size: 25235 bytes --]

* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Eli Zaretskii @ 2012-07-26 16:13 UTC
To: Dani Moncayo; +Cc: 12055

> Date: Thu, 26 Jul 2012 14:13:23 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
>
> On my Windows 7, 64-bit system, if I start Emacs from a cmd.exe
> console (emacs -nw -Q) and type the text "áéíóú", the first two
> characters are not displayed correctly (see attached screenshot).
>
> If I exit Emacs, and type those characters again in the console
> prompt, they are all displayed correctly.

Do these two characters _always_ display incorrectly inside Emacs, or
only when they are the first you type after starting the -nw session?

Also, how did you type these characters, and what does Emacs say if
you go to each one of them and type "C-u C-x ="?

> I've been able to reproduce this problem both in Emacs 24.1 and 23.4.

Thanks for the report.

First, please always make a point of reporting bugs via
"M-x report-emacs-bug RET", as that command collects and sends lots of
useful information about your system and Emacs setup.  This is
especially important when non-ASCII characters are involved, as
report-emacs-bug provides important information related to that.
Please send that info, as collected in the same version of Emacs and
in the same console session as the one where you see the problem.

In addition, please tell what these expressions produce in the console
session of the current trunk Emacs version:

  (terminal-coding-system)
  (keyboard-coding-system)
  w32-ansi-codepage
  (w32-get-console-codepage)
  (w32-get-console-output-codepage)

Finally, in the console outside Emacs type "chcp" at the Windows
cmd.exe shell prompt, and tell what it says.

* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Juanma Barranquero @ 2012-07-26 16:24 UTC
To: Eli Zaretskii; +Cc: 12055

I see the same.

> Do these two characters _always_ display incorrectly inside Emacs, or
> only when they are the first you type after starting the -nw session?

Always.

> Also, how did you type these characters,

' a and ' e  (' is on the Spanish keyboards, to type accented vowels).

> and what does Emacs say if
> you go to each one of them and type "C-u C-x ="?

  position: 206 of 210 (98%), column: 0
  character: (displayed as ) (codepoint 160, #o240, #xa0)
  preferred charset: unicode (Unicode (ISO10646))
  code point in charset: 0xA0
  syntax: . which means: punctuation
  category: .:Base, b:Arabic, j:Japanese, l:Latin
  to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
  buffer code: #xC2 #xA0
  file code: #xC2 #xA0 (encoded by coding system nil)
  display: terminal code #xA0
  hardcoded face: nobreak-space

  Character code properties: customize what to show
  name: NO-BREAK SPACE
  old-name: NON-BREAKING SPACE
  general-category: Zs (Separator, Space)
  decomposition: (noBreak 32) (noBreak ' ')


  position: 210 of 210 (100%), column: 0
  character: ‚ (displayed as ‚) (codepoint 130, #o202, #x82)
  preferred charset: unicode (Unicode (ISO10646))
  code point in charset: 0x82
  syntax: w which means: word
  category: l:Latin
  to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
  buffer code: #xC2 #x82
  file code: #xC2 #x82 (encoded by coding system nil)
  display: not encodable for terminal

  Character code properties: customize what to show
  name: <control>
  old-name: BREAK PERMITTED HERE
  general-category: Cc (Other, Control)
  decomposition: (130) ('‚')

> First, please always make a point of reporting bugs via
> "M-x report-emacs-bug RET", as that command collects and sends
> lots of useful information about your system and Emacs setup.
> This is especially important when non-ASCII characters are involved,
> as report-emacs-bug provides important information related to that.

Important settings:
  value of $LANG: C
  locale-coding-system: cp1252
  default enable-multibyte-characters: t

> (terminal-coding-system)

cp1252

> (keyboard-coding-system)

windows-1252-unix

> w32-ansi-codepage

1252 (BTW, we're not very consistent here, the variable is w32-ansi-code-page)

> (w32-get-console-codepage)

850

> (w32-get-console-output-codepage)

850

> Finally, in the console outside Emacs type "chcp" at the Windows
> cmd.exe shell prompt, and tell what it says.

Página de códigos activa: 850

    Juanma

* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Eli Zaretskii @ 2012-07-26 16:42 UTC
To: Juanma Barranquero; +Cc: 12055

> From: Juanma Barranquero <lekktu@gmail.com>
> Date: Thu, 26 Jul 2012 18:24:14 +0200
> Cc: Dani Moncayo <dmoncayo@gmail.com>, 12055@debbugs.gnu.org
>
> I see the same.

I have no doubt ;-)

> > (terminal-coding-system)
>
> cp1252
>
> > (keyboard-coding-system)
>
> windows-1252-unix
>
> > w32-ansi-codepage
>
> 1252 (BTW, we're not very consistent here, the variable is w32-ansi-code-page)
>
> > (w32-get-console-codepage)
>
> 850
>
> > (w32-get-console-output-codepage)
>
> 850
>
> > Finally, in the console outside Emacs type "chcp" at the Windows
> > cmd.exe shell prompt, and tell what it says.
>
> Página de códigos activa: 850

Does it help to say

  C-x RET t cp850 RET
  C-x RET k cp850 RET

before typing those characters?  Do they display correctly then, and
most importantly, does "C-u C-x =" report in that case the characters
you really intended to type?

* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Juanma Barranquero @ 2012-07-26 16:49 UTC
To: Eli Zaretskii; +Cc: 12055

On Thu, Jul 26, 2012 at 6:42 PM, Eli Zaretskii <eliz@gnu.org> wrote:

> Does it help to say
>
>   C-x RET t cp850 RET
>   C-x RET k cp850 RET
>
> before typing those characters?  Do they display correctly then, and
> most importantly, does "C-u C-x =" report in that case the characters
> you really intended to type?

No. The problem worsens. Now á é are still incorrect, and í ó ú ñ ç
turn into ¡ ¢ £ ¤‡ \207

Note: I see that at some point in the past I surely hit the problem
and for some reason I failed to report it, because I have this in my
.emacs:

(unless (or window-system noninteractive
            (not (boundp 'w32-ansi-code-page)))
  (let ((cicp (w32-get-console-codepage))
        (cocp (w32-get-console-output-codepage)))
    (w32-set-console-codepage w32-ansi-code-page)
    (w32-set-console-output-codepage w32-ansi-code-page)
    (add-hook 'kill-emacs-hook
              `(lambda ()
                 (w32-set-console-codepage ,cicp)
                 (w32-set-console-output-codepage ,cocp)))))

though that's irrelevant to the tests above, which are all -Q -nw.

    Juanma

* bug#12055: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Eli Zaretskii @ 2012-07-26 17:18 UTC
To: Juanma Barranquero; +Cc: 12055

> From: Juanma Barranquero <lekktu@gmail.com>
> Date: Thu, 26 Jul 2012 18:49:33 +0200
> Cc: dmoncayo@gmail.com, 12055@debbugs.gnu.org
>
> On Thu, Jul 26, 2012 at 6:42 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>
> > Does it help to say
> >
> >   C-x RET t cp850 RET
> >   C-x RET k cp850 RET
> >
> > before typing those characters?  Do they display correctly then, and
> > most importantly, does "C-u C-x =" report in that case the characters
> > you really intended to type?
>
> No. The problem worsens. Now á é are still incorrect, and í ó ú ñ ç
> turn into ¡ ¢ £ ¤‡ \207

What are the codes of these characters, as "C-u C-x =" sees them?

> Note: I see that at some point in the past I surely hit the problem
> and for some reason I failed to report it, because I have this in my
> .emacs:
>
> (unless (or window-system noninteractive
>             (not (boundp 'w32-ansi-code-page)))
>   (let ((cicp (w32-get-console-codepage))
>         (cocp (w32-get-console-output-codepage)))
>     (w32-set-console-codepage w32-ansi-code-page)
>     (w32-set-console-output-codepage w32-ansi-code-page)
>     (add-hook 'kill-emacs-hook
>               `(lambda ()
>                  (w32-set-console-codepage ,cicp)
>                  (w32-set-console-output-codepage ,cocp)))))
>
> though that's irrelevant to the tests above, which are all -Q -nw.

Does the above solve the problem at hand, though?  If it does, we can
do that at startup in the -nw session.

* bug#12055: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Eli Zaretskii @ 2012-07-26 18:09 UTC
To: lekktu; +Cc: 12055

> Date: Thu, 26 Jul 2012 20:18:45 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: 12055@debbugs.gnu.org
>
> > > Does it help to say
> > >
> > >   C-x RET t cp850 RET
> > >   C-x RET k cp850 RET
> > >
> > > before typing those characters?  Do they display correctly then, and
> > > most importantly, does "C-u C-x =" report in that case the characters
> > > you really intended to type?
> >
> > No. The problem worsens. Now á é are still incorrect, and í ó ú ñ ç
> > turn into ¡ ¢ £ ¤‡ \207

I think we need to establish whether the problem is with input or
output (or both).  (I think it's with input, but let's make sure.)  If
you type these same characters using some latin-1 Leim input method
(e.g., latin-1-postfix), and set the terminal encoding to cp850, do
all the Latin-1 characters display correctly?

* bug#12055: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Juanma Barranquero @ 2012-07-26 18:42 UTC
To: Eli Zaretskii; +Cc: 12055

On Thu, Jul 26, 2012 at 8:09 PM, Eli Zaretskii <eliz@gnu.org> wrote:

> I think we need to establish whether the problem is with input or
> output (or both).  (I think it's with input, but let's make sure.)  If
> you type these same characters using some latin-1 Leim input method
> (e.g., latin-1-postfix), and set the terminal encoding to cp850, do
> all the Latin-1 characters display correctly?

Yes, after

  C-x RET t cp850 RET
  C-\ latin-1-postfix RET

the accented characters can be input with latin-1-postfix and display
correctly (and C-u M-x describe-char confirms they are the expected
chars).

    Juanma

* bug#12055: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Juanma Barranquero @ 2012-07-26 18:29 UTC
To: Eli Zaretskii; +Cc: 12055

On Thu, Jul 26, 2012 at 7:18 PM, Eli Zaretskii <eliz@gnu.org> wrote:

> What are the codes of these characters, as "C-u C-x =" sees them?

á and é, as above.

As for í, ó, ú, ñ and ç, in that order:

  position: 194 of 198 (97%), column: 5
  character: ¡ (displayed as ¡) (codepoint 161, #o241, #xa1)
  preferred charset: unicode (Unicode (ISO10646))
  code point in charset: 0xA1
  syntax: . which means: punctuation
  category: .:Base, h:Korean, j:Japanese, l:Latin
  to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
  buffer code: #xC2 #xA1
  file code: #xC2 #xA1 (encoded by coding system nil)
  display: terminal code #xAD

  Character code properties: customize what to show
  name: INVERTED EXCLAMATION MARK
  general-category: Po (Punctuation, Other)
  decomposition: (161) ('¡')


  position: 195 of 198 (98%), column: 6
  character: ¢ (displayed as ¢) (codepoint 162, #o242, #xa2)
  preferred charset: unicode (Unicode (ISO10646))
  code point in charset: 0xA2
  syntax: _ which means: symbol
  category: .:Base, j:Japanese, l:Latin
  to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
  buffer code: #xC2 #xA2
  file code: #xC2 #xA2 (encoded by coding system nil)
  display: terminal code #xBD

  Character code properties: customize what to show
  name: CENT SIGN
  general-category: Sc (Symbol, Currency)
  decomposition: (162) ('¢')


  position: 196 of 198 (98%), column: 7
  character: £ (displayed as £) (codepoint 163, #o243, #xa3)
  preferred charset: unicode (Unicode (ISO10646))
  code point in charset: 0xA3
  syntax: _ which means: symbol
  category: .:Base, j:Japanese, l:Latin
  to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
  buffer code: #xC2 #xA3
  file code: #xC2 #xA3 (encoded by coding system nil)
  display: terminal code #x9C

  Character code properties: customize what to show
  name: POUND SIGN
  general-category: Sc (Symbol, Currency)
  decomposition: (163) ('£')


  position: 197 of 198 (99%), column: 8
  character: ¤ (displayed as ¤) (codepoint 164, #o244, #xa4)
  preferred charset: unicode (Unicode (ISO10646))
  code point in charset: 0xA4
  syntax: _ which means: symbol
  category: .:Base, b:Arabic, c:Chinese, h:Korean, j:Japanese, l:Latin
  to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
  buffer code: #xC2 #xA4
  file code: #xC2 #xA4 (encoded by coding system nil)
  display: terminal code #xCF


  position: 198 of 198 (99%), column: 9
  character: ‡ (displayed as ‡) (codepoint 135, #o207, #x87)
  preferred charset: unicode (Unicode (ISO10646))
  code point in charset: 0x87
  syntax: w which means: word
  category: l:Latin
  to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
  buffer code: #xC2 #x87
  file code: #xC2 #x87 (encoded by coding system nil)
  display: not encodable for terminal

  Character code properties: customize what to show
  name: <control>
  old-name: END OF SELECTED AREA
  general-category: Cc (Other, Control)
  decomposition: (135) ('‡')

> Does the above solve the problem at hand, though?  If it does, we can
> do that at startup in the -nw session.

Yes, if you evaluate that code prior to typing á, é, etc, it works as expected.

    Juanma

* bug#12055: Re: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Eli Zaretskii @ 2012-07-26 20:03 UTC
To: Juanma Barranquero; +Cc: 12055

> From: Juanma Barranquero <lekktu@gmail.com>
> Date: Thu, 26 Jul 2012 20:29:57 +0200
> Cc: dmoncayo@gmail.com, 12055@debbugs.gnu.org
>
> On Thu, Jul 26, 2012 at 7:18 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>
> > What are the codes of these characters, as "C-u C-x =" sees them?
>
> á and é, as above.
>
> As for í, ó, ú, ñ and ç, in that order:
>
>   position: 194 of 198 (97%), column: 5
>   character: ¡ (displayed as ¡) (codepoint 161, #o241, #xa1)
>   preferred charset: unicode (Unicode (ISO10646))
>   code point in charset: 0xA1
>   syntax: . which means: punctuation
>   category: .:Base, h:Korean, j:Japanese, l:Latin
>   to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
>   buffer code: #xC2 #xA1
>   file code: #xC2 #xA1 (encoded by coding system nil)
>   display: terminal code #xAD
>
>   Character code properties: customize what to show
>   name: INVERTED EXCLAMATION MARK
>   general-category: Po (Punctuation, Other)
>   decomposition: (161) ('¡')
>
>
>   position: 195 of 198 (98%), column: 6
>   character: ¢ (displayed as ¢) (codepoint 162, #o242, #xa2)
>   preferred charset: unicode (Unicode (ISO10646))
>   code point in charset: 0xA2
>   syntax: _ which means: symbol
>   category: .:Base, j:Japanese, l:Latin
>   to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
>   buffer code: #xC2 #xA2
>   file code: #xC2 #xA2 (encoded by coding system nil)
>   display: terminal code #xBD
>
>   Character code properties: customize what to show
>   name: CENT SIGN
>   general-category: Sc (Symbol, Currency)
>   decomposition: (162) ('¢')
>
>
>   position: 196 of 198 (98%), column: 7
>   character: £ (displayed as £) (codepoint 163, #o243, #xa3)
>   preferred charset: unicode (Unicode (ISO10646))
>   code point in charset: 0xA3
>   syntax: _ which means: symbol
>   category: .:Base, j:Japanese, l:Latin
>   to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
>   buffer code: #xC2 #xA3
>   file code: #xC2 #xA3 (encoded by coding system nil)
>   display: terminal code #x9C
>
>   Character code properties: customize what to show
>   name: POUND SIGN
>   general-category: Sc (Symbol, Currency)
>   decomposition: (163) ('£')
>
>
>   position: 197 of 198 (99%), column: 8
>   character: ¤ (displayed as ¤) (codepoint 164, #o244, #xa4)
>   preferred charset: unicode (Unicode (ISO10646))
>   code point in charset: 0xA4
>   syntax: _ which means: symbol
>   category:
>     .:Base, b:Arabic, c:Chinese, h:Korean, j:Japanese, l:Latin
>   to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
>   buffer code: #xC2 #xA4
>   file code: #xC2 #xA4 (encoded by coding system nil)
>   display: terminal code #xCF
>
>
>   position: 198 of 198 (99%), column: 9
>   character: ‡ (displayed as ‡) (codepoint 135, #o207, #x87)
>   preferred charset: unicode (Unicode (ISO10646))
>   code point in charset: 0x87
>   syntax: w which means: word
>   category: l:Latin
>   to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
>   buffer code: #xC2 #x87
>   file code: #xC2 #x87 (encoded by coding system nil)
>   display: not encodable for terminal
>
>   Character code properties: customize what to show
>   name: <control>
>   old-name: END OF SELECTED AREA
>   general-category: Cc (Other, Control)
>   decomposition: (135) ('‡')

That's strange: these are definitely the cp850 codes for the Latin-1
characters you typed, so I wonder why just setting
terminal-coding-system to that doesn't fix the problem...

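For reference, the seven code points reported above line up with the
cp850 encodings of the characters that were typed.  A minimal sketch in
C of that correspondence follows.  The function name cp850_to_unicode
is hypothetical, and the table covers only the bytes seen in this
thread, not the whole codepage.

```c
/* Decode one cp850 byte to a Unicode code point.
   Hypothetical helper: only the seven bytes reported in this thread
   are covered; any other byte returns 0.  */
unsigned int
cp850_to_unicode (unsigned char byte)
{
  switch (byte)
    {
    case 0xA0: return 0x00E1;	/* a with acute (shown in Emacs as U+00A0) */
    case 0x82: return 0x00E9;	/* e with acute (shown as U+0082) */
    case 0xA1: return 0x00ED;	/* i with acute (shown as U+00A1) */
    case 0xA2: return 0x00F3;	/* o with acute (shown as U+00A2) */
    case 0xA3: return 0x00FA;	/* u with acute (shown as U+00A3) */
    case 0xA4: return 0x00F1;	/* n with tilde (shown as U+00A4) */
    case 0x87: return 0x00E7;	/* c with cedilla (shown as U+0087) */
    default:   return 0;	/* not covered by this sketch */
    }
}
```

In other words, the buffer ends up holding the raw cp850 byte value as
if it were already a Unicode code point, which is consistent with the
keyboard input never being decoded from cp850.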
* bug#12055: Re: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Dani Moncayo @ 2012-07-26 22:40 UTC
To: Eli Zaretskii; +Cc: Juanma Barranquero, 12055

[-- Attachment #1: Type: text/plain, Size: 660 bytes --]

FWIW, another experiment that produces unexpected results:

The attached screenshot shows how two Emacs instances (GUI and
non-GUI) show the contents of a test file (previously written by me
and saved to disk).

IIUC, this time the input method is irrelevant, since the text has
been read from a file (not the keyboard).

As you can see in the modelines, both instances of Emacs have selected
the latin-1 coding system, but the non-GUI instance fails to show the
characters correctly (and not only "á" and "é").

I don't know if this problem has the same root cause as the original
one.  If not, I could file a separate bug report.

--
Dani Moncayo

[-- Attachment #2: img2.png --]
[-- Type: image/png, Size: 40071 bytes --]

* bug#12055: Re: Re: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Eli Zaretskii @ 2012-07-27 6:45 UTC
To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Fri, 27 Jul 2012 00:40:22 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: Juanma Barranquero <lekktu@gmail.com>, 12055@debbugs.gnu.org
>
> The attached screenshot shows how two Emacs instances (GUI and
> non-GUI) show the contents of a test file (previously written by me
> and saved to disk).
>
> IIUC, this time the input method is irrelevant, since the text has
> been read from a file (not the keyboard).
>
> As you can see in the modelines, both instances of Emacs have selected
> the latin-1 coding system, but the non-GUI instance fails to show the
> characters correctly (and not only "á" and "é").

Please try that in the non-GUI session where you first set the
terminal coding-system to cp850.  Juanma said that doing so and using
a Leim input method (which is guaranteed to produce correct characters
in the buffer) shows these characters correctly.  So I expect the same
to work with a file (unless that file was also produced in a non-GUI
Emacs session...).

Which leaves us with the input problem...

* bug#12055: Re: Re: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Dani Moncayo @ 2012-07-27 8:35 UTC
To: Eli Zaretskii; +Cc: lekktu, 12055

> Please try that in the non-GUI session where you first set the
> terminal coding-system to cp850.

Ok.  If I do:
 1. emacs -nw -Q
 2. C-x RET t cp850 RET
 3. Visit the test file.

Then the file is correctly displayed.

--
Dani Moncayo

* bug#12055: Re: Re: Re: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Eli Zaretskii @ 2012-07-27 9:04 UTC
To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Fri, 27 Jul 2012 10:35:53 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
>
> > Please try that in the non-GUI session where you first set the
> > terminal coding-system to cp850.
>
> Ok.  If I do:
>  1. emacs -nw -Q
>  2. C-x RET t cp850 RET
>  3. Visit the test file.
>
> Then the file is correctly displayed.

Thanks.  If no one beats me to it, I will look into the input issue
when I have time.

* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Eli Zaretskii @ 2012-07-27 15:12 UTC
To: Eli Zaretskii; +Cc: lekktu, 12055

> Date: Fri, 27 Jul 2012 12:04:57 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
>
> > Date: Fri, 27 Jul 2012 10:35:53 +0200
> > From: Dani Moncayo <dmoncayo@gmail.com>
> > Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> >
> > > Please try that in the non-GUI session where you first set the
> > > terminal coding-system to cp850.
> >
> > Ok.  If I do:
> >  1. emacs -nw -Q
> >  2. C-x RET t cp850 RET
> >  3. Visit the test file.
> >
> > Then the file is correctly displayed.
>
> Thanks.  If no one beats me to it, I will look into the input issue
> when I have time.

Well, I see some strange stuff in the input processing.

Please add this snippet:

  DebPrint (("key_event: %d %d 0x%x 0x%x {0x%x 0x%x} 0x%x\n",
	     event->bKeyDown, event->wRepeatCount,
	     event->wVirtualKeyCode, event->wVirtualScanCode,
	     event->uChar.AsciiChar, event->uChar.UnicodeChar,
	     event->dwControlKeyState));

at the very beginning of the key_event function (in w32inevt.c),
attach GDB to a running "emacs -Q -nw", and tell me what GDB reports
when you type non-ASCII keys on your keyboard.  That is,

  emacs -Q -nw
  gdb -p EMACS-PID
  (gdb) continue

then type non-ASCII characters into Emacs.  You should see messages
such as these:

  warning: key_event: 0 1 0x54 0x14 {0xffffff80 0x580} 0x20
  warning: key_event: 1 1 0x54 0x14 {0xffffff80 0x580} 0x20

(but with different codes).  There are 2 messages for each keystroke:
one when the key is pressed, the other when it is released.  Please
post the exact output here, and say for each pair of such messages
which character you typed.

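A note on reading that output: the fifth field can show up as
0xffffff80 rather than 0x80.  A likely explanation, assuming that
uChar.AsciiChar is a signed char on the build in question and that int
is 32 bits wide, is sign extension when the value is promoted to int
for the %x format.  A small sketch in C, with hypothetical helper
names:

```c
#include <stdio.h>

/* Format C the way a "0x%x" conversion shows a signed char argument.
   Assumption: `signed char' stands in for the Windows CHAR type of
   uChar.AsciiChar, and int is 32 bits wide.  The value is promoted to
   int first, so a byte with its high bit set becomes a negative int,
   which prints as a 32-bit two's-complement pattern.  */
void
format_promoted_char (signed char c, char *out, size_t outlen)
{
  snprintf (out, outlen, "0x%x", (unsigned int) (int) c);
}

/* Format only the 8-bit value, by converting to unsigned char before
   the promotion takes place.  */
void
format_byte_value (signed char c, char *out, size_t outlen)
{
  snprintf (out, outlen, "0x%x", (unsigned int) (unsigned char) c);
}
```

Under those assumptions only the low byte of the fifth field is
meaningful; the leading ffffff is a byproduct of the promotion.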
* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Jason Rumney @ 2012-07-27 16:46 UTC
To: Eli Zaretskii; +Cc: lekktu, 12055

Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Fri, 27 Jul 2012 12:04:57 +0300
>> From: Eli Zaretskii <eliz@gnu.org>
>> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
>>
>> > Date: Fri, 27 Jul 2012 10:35:53 +0200
>> > From: Dani Moncayo <dmoncayo@gmail.com>
>> > Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
>> >
>> > > Please try that in the non-GUI session where you first set the
>> > > terminal coding-system to cp850.
>> >
>> > Ok.  If I do:
>> >  1. emacs -nw -Q
>> >  2. C-x RET t cp850 RET
>> >  3. Visit the test file.
>> >
>> > Then the file is correctly displayed.
>>
>> Thanks.  If no one beats me to it, I will look into the input issue
>> when I have time.
>
> Well, I see some strange stuff in the input processing.

      /* Get the codepage to interpret this key with.  */
      GetLocaleInfo (GetThreadLocale (),
		     LOCALE_IDEFAULTANSICODEPAGE, cp, 20);
      cpId = atoi (cp);

is quite suspicious.  It appears in two places: one is a fallback for
older versions of Windows that do not fully support Unicode; the other
is more interesting for this case, as it is in the dead key handling,
and from Juanma's description, a dead key is being used to input the
problem characters.

The above lines should probably be replaced with

      cpId = GetConsoleCP ();

* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Eli Zaretskii @ 2012-07-27 18:03 UTC
To: Jason Rumney; +Cc: lekktu, 12055

> From: Jason Rumney <jasonr@gnu.org>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> Date: Sat, 28 Jul 2012 00:46:08 +0800
>
> > Well, I see some strange stuff in the input processing.
>
>       /* Get the codepage to interpret this key with.  */
>       GetLocaleInfo (GetThreadLocale (),
> 		     LOCALE_IDEFAULTANSICODEPAGE, cp, 20);
>       cpId = atoi (cp);
>
> is quite suspicious.  It appears in two places - one is a fallback for
> older versions of Windows that do not fully support Unicode, the other
> is more interesting for this case, as it is in the dead key handling,
> and from Juanma's description, a dead key is being used to input the
> problem characters.
>
> The above lines should probably be replaced with
>
>       cpId = GetConsoleCP ();

Thanks.  Yes, I wondered about that as well.

However, this is not my problem right now.  If we were decoding input
with a wrong codepage, I should have at least seen correct Unicode
character codes right at entry into key_event.  But what I see on my
machine (whose ANSI encoding is cp1255 and the corresponding OEM
encoding is cp862) is something really weird.  When I switch the
keyboard to Hebrew and type ALEPH, BET, GIMEL, whose Unicode
codepoints are, respectively, u+05D0, u+05D1, u+05D2, I see 0x0580,
0x0581, and 0x0582 instead.  That makes no sense at all, and no amount
of tinkering with the input codepage can ever fix that.

Besides, at least in my locale, the code that you mention is never
executed at all.  Instead, we return the original Unicode character
codepoint via this fragment:

      else if (event->uChar.UnicodeChar > 0)
	{
	  emacs_ev->kind = MULTIBYTE_CHAR_KEYSTROKE_EVENT;
	  emacs_ev->code = event->uChar.UnicodeChar;
	}

And since, at least in my locale, event->uChar.UnicodeChar is wrong,
the rest is a logical consequence of this.

So my current theory is that it is simply wrong to look at
uChar.UnicodeChar unless we call ReadConsoleInputW, the wide-character
version of the API.  But I need data from other locales to make sure
this theory is correct.

The theory is based on the following vague portion of
ReadConsoleInput's documentation:

  This function uses either Unicode characters or 8-bit characters
  from the console's current code page.

There isn't a word about when it does one or the other (AFAICS), which
led me to the above hypothesis, since that's the only cause that
doesn't need to be explicitly documented.

Btw, the MSDN documentation about this stuff is not as helpful as it
could have been (so what else is new?).  This page

  http://msdn.microsoft.com/en-us/library/windows/desktop/ms684166%28v=vs.85%29.aspx

says:

  uChar
    A union of the following members.

  UnicodeChar
    Translated Unicode character.

  AsciiChar
    Translated ASCII character.

What the heck do they mean by "translated" here?  "Translated" by whom
and how?

* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Eli Zaretskii @ 2012-07-27 18:22 UTC
To: jasonr; +Cc: lekktu, 12055

> Date: Fri, 27 Jul 2012 21:03:43 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
>
> So my current theory is that it is simply wrong to look at
> uChar.UnicodeChar unless we call ReadConsoleInputW, the wide-character
> version of the API.

Forgot to mention an important detail: if I replace the call to
ReadConsoleInput with ReadConsoleInputW, I do see the expected 0x05D0
etc. codes in uChar.UnicodeChar of each event, and Emacs inserts the
correct characters into the buffer.

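One way to read the odd values reported earlier, offered as an
assumption rather than something established in this thread: the low
byte of 0x0580, 0x0581 and 0x0582 matches the cp862 encoding of ALEPH,
BET and GIMEL (cp862 places the Hebrew letters at 0x80..0x9A), which
would fit the idea that the non-W call delivers 8-bit console codepage
characters.  A sketch in C with hypothetical function names:

```c
/* Decode a cp862 byte in the Hebrew letter range to Unicode.
   cp862 maps 0x80..0x9A to U+05D0..U+05EA; any other byte returns 0
   in this sketch.  */
unsigned int
cp862_hebrew_to_unicode (unsigned char byte)
{
  if (byte >= 0x80 && byte <= 0x9A)
    return 0x05D0 + (unsigned int) (byte - 0x80);
  return 0;
}

/* Hypothetical reinterpretation of a value found in uChar.UnicodeChar
   after the non-W ReadConsoleInput call: ignore the high byte and
   decode the low byte as a cp862 character.  */
unsigned int
reinterpret_low_byte_as_cp862 (unsigned int unicode_char_field)
{
  return cp862_hebrew_to_unicode ((unsigned char) (unicode_char_field & 0xFF));
}
```

With that reading, 0x0580 would come out as U+05D0, the value that
ReadConsoleInputW is reported to deliver directly.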
* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-27 15:12 ` Eli Zaretskii 2012-07-27 16:46 ` Jason Rumney @ 2012-07-27 23:45 ` Juanma Barranquero 2012-07-28 1:12 ` Dani Moncayo 2 siblings, 0 replies; 41+ messages in thread From: Juanma Barranquero @ 2012-07-27 23:45 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 12055

On Fri, Jul 27, 2012 at 5:12 PM, Eli Zaretskii <eliz@gnu.org> wrote:

> There are 2 messages for each keystroke:

Not exactly, see below.

> one when the key is pressed, the other when it is released. Please
> post here the exact output, and please tell for each pair of such
> messages which character did you type.

Dead key '
warning: key_event: 1 1 0xde 0x28 {0x0 0x0} 0x0
warning: key_event: 0 1 0xde 0x0 {0xffffffef 0xef} 0x0
warning: key_event: 0 1 0xde 0x28 {0xffffffef 0xef} 0x0

a
warning: key_event: 1 1 0x41 0x1e {0xffffffa0 0xa0} 0x0
warning: key_event: 0 1 0x41 0x1e {0x61 0x61} 0x0

Dead key '
warning: key_event: 1 1 0xde 0x28 {0x0 0x0} 0x0
warning: key_event: 0 1 0xde 0x0 {0xffffffef 0xef} 0x0
warning: key_event: 0 1 0xde 0x28 {0xffffffef 0xef} 0x0

e
warning: key_event: 1 1 0x45 0x12 {0xffffff82 0x82} 0x0
warning: key_event: 0 1 0x45 0x12 {0x65 0x65} 0x0

etc.

    Juanma

^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-27 15:12 ` Eli Zaretskii 2012-07-27 16:46 ` Jason Rumney 2012-07-27 23:45 ` Juanma Barranquero @ 2012-07-28 1:12 ` Dani Moncayo 2012-07-28 8:04 ` bug#12055: " Eli Zaretskii 2 siblings, 1 reply; 41+ messages in thread From: Dani Moncayo @ 2012-07-28 1:12 UTC (permalink / raw) To: Eli Zaretskii; +Cc: lekktu, 12055

> Please
> post here the exact output, and please tell for each pair of such
> messages which character did you type.

Sorry for the delay. I've not had time until now.

Here is my data:

[´] (dead key. used before vowels for inserting accented vowels like "á")
warning: key_event: 1 1 0xde 0x28 {0x0 0x0} 0x20
warning: key_event: 0 1 0xde 0x0 {0xffffffef 0xef} 0x20
warning: key_event: 0 1 0xde 0x28 {0xffffffef 0xef} 0x20

[a]
warning: key_event: 1 1 0x41 0x1e {0xffffffa0 0xa0} 0x20
warning: key_event: 0 1 0x41 0x1e {0x61 0x61} 0x20

[e]
warning: key_event: 1 1 0x45 0x12 {0xffffff82 0x82} 0x20
warning: key_event: 0 1 0x45 0x12 {0x65 0x65} 0x20

[i]
warning: key_event: 1 1 0x49 0x17 {0xffffffa1 0xa1} 0x20
warning: key_event: 0 1 0x49 0x17 {0x69 0x69} 0x20

[o]
warning: key_event: 1 1 0x4f 0x18 {0xffffffa2 0xa2} 0x20
warning: key_event: 0 1 0x4f 0x18 {0x6f 0x6f} 0x20

[u]
warning: key_event: 1 1 0x55 0x16 {0xffffffa3 0xa3} 0x20
warning: key_event: 0 1 0x55 0x16 {0x75 0x75} 0x20

[ñ]
warning: key_event: 1 1 0xc0 0x27 {0xffffffa4 0xa4} 0x20
warning: key_event: 0 1 0xc0 0x27 {0xffffffa4 0xa4} 0x20

[ç]
warning: key_event: 1 1 0xbf 0x2b {0xffffff87 0x87} 0x20
warning: key_event: 0 1 0xbf 0x2b {0xffffff87 0x87} 0x20

-- 
Dani Moncayo

^ permalink raw reply [flat|nested] 41+ messages in thread
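Editorial sketch, not part of the original messages: the values in braces in the logs above can be read as code page 850 bytes rather than Unicode codepoints. The sketch assumes that the first value in each pair is uChar.AsciiChar printed through a signed char (hence the 0xffffff.. sign extension) and that the second is uChar.UnicodeChar; the thread does not show the debugging printf, so that reading is an assumption. The table is a hand-written excerpt of cp850 covering only the bytes seen in the logs, and all names (cp850_sample, low_byte, and so on) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Excerpt of code page 850, limited to the bytes that appear in the
   key_event logs.  This is NOT a complete code page table. */
struct cp850_entry
{
  unsigned char byte;
  unsigned int codepoint;
};

static const struct cp850_entry cp850_sample[] = {
  { 0x82, 0x00E9 },  /* e with acute */
  { 0x87, 0x00E7 },  /* c with cedilla */
  { 0xA0, 0x00E1 },  /* a with acute */
  { 0xA1, 0x00ED },  /* i with acute */
  { 0xA2, 0x00F3 },  /* o with acute */
  { 0xA3, 0x00FA },  /* u with acute */
  { 0xA4, 0x00F1 },  /* n with tilde */
  { 0xEF, 0x00B4 },  /* acute accent (the dead key) */
};

/* What a negative signed char looks like once promoted to int and
   printed with %x, assuming a 32-bit int. */
static unsigned int
sign_extended (signed char c)
{
  return (unsigned int) (int) c;
}

/* Undo the sign extension: keep only the low 8 bits. */
static unsigned char
low_byte (unsigned int logged)
{
  return (unsigned char) (logged & 0xFFu);
}

/* Map a byte to Unicode using the excerpt above.  ASCII maps to
   itself; bytes missing from the excerpt yield 0. */
static unsigned int
cp850_sample_to_unicode (unsigned char byte)
{
  size_t i;

  if (byte < 0x80)
    return byte;
  for (i = 0; i < sizeof cp850_sample / sizeof cp850_sample[0]; i++)
    if (cp850_sample[i].byte == byte)
      return cp850_sample[i].codepoint;
  return 0;
}

/* Print one logged value next to its interpretation.  Only call this
   with values whose low byte is covered by the excerpt. */
static void
print_decoded (unsigned int logged)
{
  unsigned char b = low_byte (logged);
  unsigned int cp = cp850_sample_to_unicode (b);

  assert (cp != 0);
  printf ("0x%x -> byte 0x%02X -> U+%04X\n", logged, b, cp);
}
```

Under these assumptions, calling print_decoded (0xffffffa0u) reports byte 0xA0 and U+00E1, so the 0xa0 that the 8-bit API leaves in UnicodeChar is the cp850 byte for "á" and not its Unicode codepoint.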
* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-28 1:12 ` Dani Moncayo @ 2012-07-28 8:04 ` Eli Zaretskii 2012-07-28 10:06 ` Eli Zaretskii 0 siblings, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2012-07-28 8:04 UTC (permalink / raw) To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Sat, 28 Jul 2012 03:12:12 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
>
> > Please
> > post here the exact output, and please tell for each pair of such
> > messages which character did you type.
>
> Sorry for the delay. I've not had time until now.
>
> Here is my data:

Thanks to both of you. Now I see that my theory is correct, and I can sit down and code the solution for this problem.

^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-28 8:04 ` bug#12055: " Eli Zaretskii @ 2012-07-28 10:06 ` Eli Zaretskii 2012-07-28 11:55 ` Dani Moncayo ` (2 more replies) 0 siblings, 3 replies; 41+ messages in thread From: Eli Zaretskii @ 2012-07-28 10:06 UTC (permalink / raw) To: dmoncayo, lekktu; +Cc: 12055 > Date: Sat, 28 Jul 2012 11:04:29 +0300 > From: Eli Zaretskii <eliz@gnu.org> > Cc: lekktu@gmail.com, 12055@debbugs.gnu.org > > > Date: Sat, 28 Jul 2012 03:12:12 +0200 > > From: Dani Moncayo <dmoncayo@gmail.com> > > Cc: lekktu@gmail.com, 12055@debbugs.gnu.org > > > > > Please > > > post here the exact output, and please tell for each pair of such > > > messages which character did you type. > > > > Sorry for the delay. I've not had time until now. > > > > Here is my data: > > Thanks to both of you. Now I see that my theory is correct, and I can > sit down and code the solution for this problem. Please try the patch below. It works for me. Please try it also when Unicode input is not used (it is by default on Windows NT and later, as result of this patch). You can do that by forcing w32_console_unicode_input to zero (either by modifying the source of w32console.c and rebuilding, or by setting the variable's value in GDB. TIA === modified file 'lisp/international/mule-cmds.el' --- lisp/international/mule-cmds.el 2012-07-25 23:11:23 +0000 +++ lisp/international/mule-cmds.el 2012-07-28 09:43:40 +0000 @@ -2655,23 +2655,29 @@ See also `locale-charset-language-names' ;; On Windows, override locale-coding-system, ;; default-file-name-coding-system, keyboard-coding-system, - ;; terminal-coding-system with system codepage. + ;; terminal-coding-system with the appropriate codepages. 
(when (boundp 'w32-ansi-code-page) - (let ((code-page-coding (intern (format "cp%d" w32-ansi-code-page)))) - (when (coding-system-p code-page-coding) - (unless frame (setq locale-coding-system code-page-coding)) - (set-keyboard-coding-system code-page-coding frame) - (set-terminal-coding-system code-page-coding frame) - ;; Set default-file-name-coding-system last, so that Emacs - ;; doesn't try to use cpNNNN when it defines keyboard and - ;; terminal encoding. That's because the above two lines - ;; will want to load code-pages.el, where cpNNNN are - ;; defined; if default-file-name-coding-system were set to - ;; cpNNNN while these two lines run, Emacs will want to use - ;; it for encoding the file name it wants to load. And that - ;; will fail, since cpNNNN is not yet usable until - ;; code-pages.el finishes loading. - (setq default-file-name-coding-system code-page-coding)))) + (let ((ansi-code-page-coding (intern (format "cp%d" w32-ansi-code-page))) + (oem-code-page-coding + (intern (format "cp%d" (w32-get-console-codepage)))) + ansi-cs-p oem-cs-p) + (and (coding-system-p ansi-code-page-coding) + (setq ansi-cs-p t)) + (and (coding-system-p oem-code-page-coding) + (setq oem-cs-p t)) + ;; Set the keyboard and display encoding to either the current + ;; ANSI codepage of the OEM codepage, depending on whether + ;; this is a GUI or a TTY frame. 
+ (when ansi-cs-p + (unless frame (setq locale-coding-system ansi-code-page-coding)) + (when (display-graphic-p frame) + (set-keyboard-coding-system ansi-code-page-coding frame) + (set-terminal-coding-system ansi-code-page-coding frame)) + (setq default-file-name-coding-system ansi-code-page-coding)) + (when oem-cs-p + (unless (display-graphic-p frame) + (set-keyboard-coding-system oem-code-page-coding frame) + (set-terminal-coding-system oem-code-page-coding frame))))) (when (eq system-type 'darwin) ;; On Darwin, file names are always encoded in utf-8, no matter === modified file 'src/w32console.c' --- src/w32console.c 2012-06-28 07:50:27 +0000 +++ src/w32console.c 2012-07-28 09:48:41 +0000 @@ -37,6 +37,7 @@ along with GNU Emacs. If not, see <http #include "termhooks.h" #include "termchar.h" #include "dispextern.h" +#include "w32heap.h" /* for os_subtype */ #include "w32inevt.h" /* from window.c */ @@ -67,6 +68,7 @@ static CONSOLE_CURSOR_INFO prev_console_ #endif HANDLE keyboard_handle; +int w32_console_unicode_input; /* Setting this as the ctrl handler prevents emacs from being killed when @@ -786,6 +788,11 @@ initialize_w32_display (struct terminal info.srWindow.Left); } + if (os_subtype == OS_NT) + w32_console_unicode_input = 1; + else + w32_console_unicode_input = 0; + /* Setup w32_display_info structure for this frame. */ w32_initialize_display_info (build_string ("Console")); === modified file 'src/w32inevt.c' --- src/w32inevt.c 2012-05-26 11:58:19 +0000 +++ src/w32inevt.c 2012-07-28 09:57:11 +0000 @@ -41,6 +41,7 @@ along with GNU Emacs. If not, see <http #include "termchar.h" #include "w32heap.h" #include "w32term.h" +#include "w32inevt.h" /* stdin, from w32console.c */ extern HANDLE keyboard_handle; @@ -61,6 +62,15 @@ static INPUT_RECORD *queue_ptr = event_q /* Temporarily store lead byte of DBCS input sequences. 
*/ static char dbcs_lead = 0; +static inline BOOL +w32_read_console_input (HANDLE h, INPUT_RECORD *rec, DWORD recsize, + DWORD *waiting) +{ + return (w32_console_unicode_input + ? ReadConsoleInputW (h, rec, recsize, waiting) + : ReadConsoleInputA (h, rec, recsize, waiting)); +} + static int fill_queue (BOOL block) { @@ -80,8 +90,8 @@ fill_queue (BOOL block) return 0; } - rc = ReadConsoleInput (keyboard_handle, event_queue, EVENT_QUEUE_SIZE, - &events_waiting); + rc = w32_read_console_input (keyboard_handle, event_queue, EVENT_QUEUE_SIZE, + &events_waiting); if (!rc) return -1; queue_ptr = event_queue; @@ -224,7 +234,7 @@ w32_kbd_patch_key (KEY_EVENT_RECORD *eve #endif /* On NT, call ToUnicode instead and then convert to the current - locale's default codepage. */ + console input codepage. */ if (os_subtype == OS_NT) { WCHAR buf[128]; @@ -233,14 +243,9 @@ w32_kbd_patch_key (KEY_EVENT_RECORD *eve keystate, buf, 128, 0); if (isdead > 0) { - char cp[20]; - int cpId; + int cpId = GetConsoleCP (); event->uChar.UnicodeChar = buf[isdead - 1]; - - GetLocaleInfo (GetThreadLocale (), - LOCALE_IDEFAULTANSICODEPAGE, cp, 20); - cpId = atoi (cp); isdead = WideCharToMultiByte (cpId, 0, buf, isdead, ansi_code, 4, NULL, NULL); } @@ -447,26 +452,34 @@ key_event (KEY_EVENT_RECORD *event, stru } else if (event->uChar.AsciiChar > 0) { + /* Pure ASCII characters < 128. */ emacs_ev->kind = ASCII_KEYSTROKE_EVENT; emacs_ev->code = event->uChar.AsciiChar; } - else if (event->uChar.UnicodeChar > 0) + else if (event->uChar.UnicodeChar > 0 + && w32_console_unicode_input) { + /* Unicode codepoint; only valid if we are using Unicode + console input mode. */ emacs_ev->kind = MULTIBYTE_CHAR_KEYSTROKE_EVENT; emacs_ev->code = event->uChar.UnicodeChar; } else { - /* Fallback for non-Unicode versions of Windows. */ + /* Fallback handling of non-ASCII characters for non-Unicode + versions of Windows, and for non-Unicode input on NT + family of Windows. 
Only characters in the current + console codepage are supported by this fallback. */ wchar_t code; char dbcs[2]; - char cp[20]; int cpId; - /* Get the codepage to interpret this key with. */ - GetLocaleInfo (GetThreadLocale (), - LOCALE_IDEFAULTANSICODEPAGE, cp, 20); - cpId = atoi (cp); + /* Get the current console input codepage to interpret this + key with. Note that the system defaults for the OEM + codepage could have been changed by calling SetConsoleCP + or w32-set-console-codepage, so using GetLocaleInfo to + get LOCALE_IDEFAULTCODEPAGE is not TRT here. */ + cpId = GetConsoleCP (); dbcs[0] = dbcs_lead; dbcs[1] = event->uChar.AsciiChar; @@ -501,6 +514,7 @@ key_event (KEY_EVENT_RECORD *event, stru } else { + /* Function keys and other non-character keys. */ emacs_ev->kind = NON_ASCII_KEYSTROKE_EVENT; emacs_ev->code = event->wVirtualKeyCode; } === modified file 'src/w32inevt.h' --- src/w32inevt.h 2012-01-19 07:21:25 +0000 +++ src/w32inevt.h 2012-07-28 08:39:49 +0000 @@ -19,6 +19,8 @@ along with GNU Emacs. If not, see <http #ifndef EMACS_W32INEVT_H #define EMACS_W32INEVT_H +extern int w32_console_unicode_input; + extern int w32_console_read_socket (struct terminal *term, int numchars, struct input_event *hold_quit); extern void w32_console_mouse_position (FRAME_PTR *f, int insist, ^ permalink raw reply [flat|nested] 41+ messages in thread
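Editorial sketch, not part of the original messages: the behavioural change that the key_event hunk makes can be modelled in portable C. This is a simplified model under stated assumptions. The real uChar is a union inside KEY_EVENT_RECORD, while the hypothetical struct key_sample below keeps the two views as separate fields so the example does not depend on byte order, and the enum names are invented for illustration. Only the order of the tests mirrors the patch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the three character branches of key_event. */
enum key_kind
{
  KEY_ASCII,              /* ASCII_KEYSTROKE_EVENT branch */
  KEY_UNICODE,            /* MULTIBYTE_CHAR_KEYSTROKE_EVENT branch */
  KEY_CODEPAGE_FALLBACK   /* decode via the console codepage */
};

/* Simplified view of the character data in one key event. */
struct key_sample
{
  signed char ascii_char;       /* models uChar.AsciiChar */
  unsigned short unicode_char;  /* models uChar.UnicodeChar */
};

/* Branch selection before the patch: UnicodeChar is trusted whenever
   it is non-zero, even if the 8-bit API filled it in. */
static enum key_kind
classify_old (const struct key_sample *k)
{
  assert (k);
  if (k->ascii_char > 0)
    return KEY_ASCII;
  else if (k->unicode_char > 0)
    return KEY_UNICODE;
  else
    return KEY_CODEPAGE_FALLBACK;
}

/* Branch selection after the patch: UnicodeChar is trusted only when
   Unicode console input is in use. */
static enum key_kind
classify_new (const struct key_sample *k, int unicode_input)
{
  assert (k);
  if (k->ascii_char > 0)
    return KEY_ASCII;
  else if (k->unicode_char > 0 && unicode_input)
    return KEY_UNICODE;
  else
    return KEY_CODEPAGE_FALLBACK;
}
```

With the logged event for "á" from the 8-bit API (ascii_char = -96, unicode_char = 0xa0), classify_old picks the Unicode branch and so treats a codepage byte as a codepoint, while classify_new with unicode_input set to 0 falls through to the codepage fallback.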
* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-28 10:06 ` Eli Zaretskii @ 2012-07-28 11:55 ` Dani Moncayo 2012-07-28 12:23 ` bug#12055: " Eli Zaretskii 2012-07-28 12:30 ` bug#12055: " Eli Zaretskii 2012-07-28 13:57 ` Dani Moncayo 2012-07-28 16:11 ` Juanma Barranquero 2 siblings, 2 replies; 41+ messages in thread From: Dani Moncayo @ 2012-07-28 11:55 UTC (permalink / raw) To: Eli Zaretskii; +Cc: lekktu, 12055

> Please try the patch below. It works for me.

I'm having problems applying your patch to my trunk branch (updated right now).

This is what I'm trying to do (from an Emacs -Q):
1. Copy your patch and paste it in a new Emacs buffer, and save it to a file "patch.diff" (with UNIX-type EOL format).
2. Go to each hunk and type "C-c C-a".

This is failing for me in the hunks that begin with:
@@ -786,6 +788,11 @@ initialize_w32_display (struct terminal
@@ -61,6 +62,15 @@ static INPUT_RECORD *queue_ptr = event_q
@@ -80,8 +90,8 @@ fill_queue (BOOL block)
@@ -447,26 +452,34 @@ key_event (KEY_EVENT_RECORD *event, stru

For these hunks, I receive the error message "Can't find the text to patch".

And another oddity: for the last hunk in the patch (which affects the file "src/w32inevt.h"), the patch is apparently applied (I see the message "hunk applied"), but if I look at the corresponding buffer, the patch is not applied (the added line "extern int w32_console_unicode_input;" is not there).

Am I missing something?
Is this an Emacs bug?

-- 
Dani Moncayo

^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#12055: Re: bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-28 11:55 ` Dani Moncayo @ 2012-07-28 12:23 ` Eli Zaretskii 2012-07-28 12:49 ` Dani Moncayo 2012-07-28 12:30 ` bug#12055: " Eli Zaretskii 1 sibling, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2012-07-28 12:23 UTC (permalink / raw) To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Sat, 28 Jul 2012 13:55:41 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
>
> > Please try the patch below. It works for me.
>
> I'm having problems for applying your patch to my trunk branch
> (updated right now).
> [...]
> Am I missing something?
> Is this an Emacs bug?

I have no idea, but can we please solve bugs one at a time? Can you apply the patch outside Emacs, by using the Patch utility? The command you should type at the shell prompt should be:

  patch --binary -p0 < patch.diff

This command should be issued from the root directory of the Emacs tree, the one that has src and lisp as its subdirectories.

It should also work to do this from inside Emacs, like this:

  . put the region around the diffs I sent
  . type this command:

      C-x RET c unix RET M-| patch -d /path/to/emacs/root/dir --binary -p0

Thanks.

^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#12055: Re: bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-28 12:23 ` bug#12055: " Eli Zaretskii @ 2012-07-28 12:49 ` Dani Moncayo 2012-07-28 15:02 ` bug#12055: " Eli Zaretskii 0 siblings, 1 reply; 41+ messages in thread From: Dani Moncayo @ 2012-07-28 12:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: lekktu, 12055

> Can you
> apply the patch outside Emacs, by using the Patch utility?

I'm sorry, but the patch utility is giving me problems too :(

I've installed the GnuWin32 version, and added its "bin" directory (where the "patch.exe" program is) to my system PATH.

After doing this, if I open a cmd console and type "patch", I see a dialog box from Windows 7 asking me to allow the execution of the program. I click "yes" and then a new console window is opened, with no text inside it.

Where can I download a working "patch" utility for Windows?

-- 
Dani Moncayo

^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#12055: Re: Re: bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-28 12:49 ` Dani Moncayo @ 2012-07-28 15:02 ` Eli Zaretskii 0 siblings, 0 replies; 41+ messages in thread From: Eli Zaretskii @ 2012-07-28 15:02 UTC (permalink / raw) To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Sat, 28 Jul 2012 14:49:34 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
>
> > Can you
> > apply the patch outside Emacs, by using the Patch utility?
>
> I'm sorry, but the patch utility is giving me problems too :(
>
> I've installed the GnuWin32 version, and added its "bin" directory
> (where the "patch.exe" program is) to my system PATH.
>
> After doing this, if I open a cmd console and type "patch", I see a
> dialog box from Windows 7 asking me to allow the execution of the
> program. I click "yes" and then a new console window is opened, with
> no text inside it.

This is UAC in action. You need a manifest for Patch.

^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#12055: Re: bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-28 11:55 ` Dani Moncayo 2012-07-28 12:23 ` bug#12055: " Eli Zaretskii @ 2012-07-28 12:30 ` Eli Zaretskii 1 sibling, 0 replies; 41+ messages in thread From: Eli Zaretskii @ 2012-07-28 12:30 UTC (permalink / raw) To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Sat, 28 Jul 2012 13:55:41 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
>
> This is what I'm trying to do (from an Emacs -Q):
> 1. Copy your patch and paste it in a new Emacs buffer, and save it to
> a file "patch.diff" (with UNIX-type EOL format).
> 2. Go to each hunk and type "C-c C-a".
>
> This is failing for me in the hunks that begin with:
> @@ -786,6 +788,11 @@ initialize_w32_display (struct terminal
> @@ -61,6 +62,15 @@ static INPUT_RECORD *queue_ptr = event_q
> @@ -80,8 +90,8 @@ fill_queue (BOOL block)
> @@ -447,26 +452,34 @@ key_event (KEY_EVENT_RECORD *event, stru
>
> For these hunks, I receive the error message "Can't find the text to patch".

Perhaps your copy/paste procedure didn't preserve the TAB characters, converting them into spaces instead. Can "C-c C-a" ignore whitespace changes? If not, Patch can, if you use the -l (the letter ell, not the digit 1) option.

^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-28 10:06 ` Eli Zaretskii 2012-07-28 11:55 ` Dani Moncayo @ 2012-07-28 13:57 ` Dani Moncayo 2012-07-28 16:07 ` Juanma Barranquero 2012-07-28 16:11 ` Juanma Barranquero 2 siblings, 1 reply; 41+ messages in thread From: Dani Moncayo @ 2012-07-28 13:57 UTC (permalink / raw) To: Eli Zaretskii; +Cc: lekktu, 12055

[-- Attachment #1: Type: text/plain, Size: 1585 bytes --]

> Please try the patch below. It works for me.

Well, I've finally managed to install a working "patch" utility for Windows (MinGW has a package for it).

I've applied the patch and built the branch.

Now, after starting emacs (-Q -nw) and typing "áéíóúñç", Emacs shows different symbols (see attached screenshot), but if I copy them and paste here, the pasted symbols are the correct ones (áéíóúñç), instead of the ones I see on the screen.

Also, if I go to one char, for example the "á", and do C-u C-x =, Emacs says (*):

  position: 192 of 196 (97%), column: 0
  character: á (displayed as á) (codepoint 225, #o341, #xe1)
  preferred charset: iso-8859-1 (Latin-1 (ISO/IEC 8859-1))
  code point in charset: 0xE1
  syntax: w which means: word
  category: .:Base, L:Left-to-right (strong), c:Chinese, j:Japanese, l:Latin, v:Viet
  to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
  buffer code: #xC3 #xA1
  file code: #xE1 (encoded by coding system iso-latin-1-dos)
  display: terminal code #xE1

  Character code properties: customize what to show
    name: LATIN SMALL LETTER A WITH ACUTE
    old-name: LATIN SMALL LETTER A ACUTE
    general-category: Ll (Letter, Lowercase)
    decomposition: (97 769) ('a' '́')

  There are text properties here:
    fontified t

-------
(*) Although, as I said, in the second line I don't see the "á" symbols on the screen, but other symbols (like a Greek "beta").

-- 
Dani Moncayo

[-- Attachment #2: img3.png --]
[-- Type: image/png, Size: 18600 bytes --]

^ permalink raw reply [flat|nested] 41+ messages in thread
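Editorial sketch, not part of the original messages: the describe-char output above reports "buffer code: #xC3 #xA1" and "display: terminal code #xE1". The first is the UTF-8 encoding of U+00E1; the second is its Latin-1 style single-byte encoding. In code page 850 the byte 0xE1 is "ß", which looks like a Greek beta, so a console running with code page 850 would plausibly show that glyph. This is offered as one reading of the report, not something the message itself states, and the function names below are hypothetical.

```c
#include <assert.h>

/* Encode a code point below 0x10000 as UTF-8.  Returns the number of
   bytes written to OUT (1 to 3).  Surrogates are not checked. */
static int
utf8_encode_bmp (unsigned int cp, unsigned char out[3])
{
  assert (cp < 0x10000u);
  if (cp < 0x80)
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xC0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3F));
      return 2;
    }
  out[0] = (unsigned char) (0xE0 | (cp >> 12));
  out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
  out[2] = (unsigned char) (0x80 | (cp & 0x3F));
  return 3;
}

/* Latin-1 encodes U+0000..U+00FF as the byte of the same value.
   Returns -1 for anything else. */
static int
latin1_encode (unsigned int cp)
{
  return cp <= 0xFF ? (int) cp : -1;
}

/* What a cp850 console shows for a byte.  Only the two bytes relevant
   here are covered; other non-ASCII bytes yield 0. */
static unsigned int
cp850_sample_decode (unsigned char byte)
{
  switch (byte)
    {
    case 0xA0: return 0x00E1;  /* a with acute */
    case 0xE1: return 0x00DF;  /* sharp s, similar to a Greek beta */
    default:   return byte < 0x80 ? byte : 0;
    }
}
```

On this reading, the character in the buffer is correct (U+00E1, stored as C3 A1), and only the byte sent to the terminal is wrong for the console's code page: 0xE1 instead of 0xA0.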
* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-28 13:57 ` Dani Moncayo @ 2012-07-28 16:07 ` Juanma Barranquero 2012-07-28 16:12 ` Dani Moncayo 0 siblings, 1 reply; 41+ messages in thread From: Juanma Barranquero @ 2012-07-28 16:07 UTC (permalink / raw) To: Dani Moncayo; +Cc: 12055

On Sat, Jul 28, 2012 at 3:57 PM, Dani Moncayo <dmoncayo@gmail.com> wrote:

> Well, I've finally managed to install a working "patch" utility for
> Windows (MinGW has a package for it).

Dani, if you're using Gmail, never copy a patch from the main message window. While displaying the relevant message, use the "Show original" option of the menu on the right, and copy from the original message. I had no trouble applying Eli's patch with "bzr patch". The main message window of Gmail murders tabs and wraps long lines at will.

    Juanma

^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-28 16:07 ` Juanma Barranquero @ 2012-07-28 16:12 ` Dani Moncayo 0 siblings, 0 replies; 41+ messages in thread From: Dani Moncayo @ 2012-07-28 16:12 UTC (permalink / raw) To: Juanma Barranquero; +Cc: 12055

>> Well, I've finally managed to install a working "patch" utility for
>> Windows (MinGW has a package for it).
>
> Dani, if you're using Gmail, never copy a patch from the main message
> window. While displaying the relevant message, use the "Show original"
> option of the menu on the right, and copy from the original message. I
> had no trouble applying Eli's patch with "bzr patch". The main message
> window of Gmail murders tabs and wraps long lines at will.

Thank you so much!

And bzr integrates a patch utility... good to know.

-- 
Dani Moncayo

^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-28 10:06 ` Eli Zaretskii 2012-07-28 11:55 ` Dani Moncayo 2012-07-28 13:57 ` Dani Moncayo @ 2012-07-28 16:11 ` Juanma Barranquero 2012-07-28 16:44 ` bug#12055: " Eli Zaretskii 2 siblings, 1 reply; 41+ messages in thread From: Juanma Barranquero @ 2012-07-28 16:11 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 12055

On Sat, Jul 28, 2012 at 12:06 PM, Eli Zaretskii <eliz@gnu.org> wrote:

> Please try the patch below. It works for me.
>
> Please try it also when Unicode input is not used (it is by default on
> Windows NT and later, as result of this patch). You can do that by
> forcing w32_console_unicode_input to zero (either by modifying the
> source of w32console.c and rebuilding, or by setting the variable's
> value in GDB.

It works for me in both cases.

    Juanma

^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#12055: Re: bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-28 16:11 ` Juanma Barranquero @ 2012-07-28 16:44 ` Eli Zaretskii 2012-07-28 17:01 ` Eli Zaretskii 0 siblings, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2012-07-28 16:44 UTC (permalink / raw) To: Juanma Barranquero; +Cc: 12055

> From: Juanma Barranquero <lekktu@gmail.com>
> Date: Sat, 28 Jul 2012 18:11:51 +0200
> Cc: dmoncayo@gmail.com, 12055@debbugs.gnu.org
>
> On Sat, Jul 28, 2012 at 12:06 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>
> > Please try the patch below. It works for me.
> >
> > Please try it also when Unicode input is not used (it is by default on
> > Windows NT and later, as result of this patch). You can do that by
> > forcing w32_console_unicode_input to zero (either by modifying the
> > source of w32console.c and rebuilding, or by setting the variable's
> > value in GDB.
>
> It works for me in both cases.

Thanks, I will install it now.

^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#12055: Re: bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-28 16:44 ` bug#12055: " Eli Zaretskii @ 2012-07-28 17:01 ` Eli Zaretskii 0 siblings, 0 replies; 41+ messages in thread From: Eli Zaretskii @ 2012-07-28 17:01 UTC (permalink / raw) To: Eli Zaretskii; +Cc: lekktu, 12055-done

> Date: Sat, 28 Jul 2012 19:44:31 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: 12055@debbugs.gnu.org
>
> Thanks, I will install it now.

Done as trunk revision 109251.

Thanks to both of you, and to Jason, for helping resolve this tricky problem.

^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-26 16:24 ` Juanma Barranquero 2012-07-26 16:42 ` bug#12055: " Eli Zaretskii @ 2012-07-26 16:44 ` Dani Moncayo 1 sibling, 0 replies; 41+ messages in thread From: Dani Moncayo @ 2012-07-26 16:44 UTC (permalink / raw) To: Juanma Barranquero; +Cc: 12055

> I see the same.

Good to hear that the problem is not Dani-specific :).

I have the same data given by Juanma, except one (probably irrelevant) detail:

> value of $LANG: C

I have:

  value of $LANG: ESN

-- 
Dani Moncayo

^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-26 12:13 bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal Dani Moncayo 2012-07-26 16:13 ` Eli Zaretskii @ 2012-07-28 14:12 ` Dani Moncayo 2012-07-28 15:01 ` bug#12055: " Eli Zaretskii 1 sibling, 1 reply; 41+ messages in thread From: Dani Moncayo @ 2012-07-28 14:12 UTC (permalink / raw) To: Eli Zaretskii; +Cc: lekktu, 12055

> I've applied the patch and built the branch.
>
> Now, after starting emacs (-Q -nw) and typing "áéíóúñç", Emacs shows
> different symbols (see attached screenshot)...

And after doing "C-x RET t cp850 RET", everything seems to work fine.

-- 
Dani Moncayo

^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-28 14:12 ` Dani Moncayo @ 2012-07-28 15:01 ` Eli Zaretskii 2012-07-28 15:23 ` Dani Moncayo 0 siblings, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2012-07-28 15:01 UTC (permalink / raw) To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Sat, 28 Jul 2012 16:12:59 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
>
> > I've applied the patch and built the branch.
> >
> > Now, after starting emacs (-Q -nw) and typing "áéíóúñç", Emacs shows
> > different symbols (see attached screenshot)...
>
> And after doing "C-x RET t cp850 RET", everything seem to work fine.

Did you compile international/mule-cmds.el (which was modified by the patch) and did you re-dump Emacs after byte-compiling mule-cmds.el?

If you did all that, what is the value you get by evaluating (terminal-coding-system), and what is the value you get from w32-get-console-codepage?

^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal 2012-07-28 15:01 ` bug#12055: " Eli Zaretskii @ 2012-07-28 15:23 ` Dani Moncayo 2012-07-28 15:34 ` Dani Moncayo 2012-07-28 15:35 ` Eli Zaretskii 0 siblings, 2 replies; 41+ messages in thread From: Dani Moncayo @ 2012-07-28 15:23 UTC (permalink / raw) To: Eli Zaretskii; +Cc: lekktu, 12055 [-- Attachment #1: Type: text/plain, Size: 942 bytes --] > Did you compile international/mule-cmds.el (which was modified by the > patch) Yes. > and did you re-dump Emacs after byte-compiling mule-cmds.el? I don't know. This is exactly what I did: 1. I applied your patch to my branch (the whole patch, which includes the file you mention). 2. To make sure that the patch was correctly applied, I compared your patch with the output of "bzr diff" (I'm attaching this output if you want to check it). 3. I went to the "nt" subdirectory and ran a plain "mingw32-make". I thought that the build system would know what to recompile based on what files have changed since the last time. 4. I ran a "mingw32-make install". > If you did all that, what is the value you get by evaluating > (terminal-coding-system) cp1252 >, and what is the value you get from > w32-get-console-codepage? "w32-get-console-codepage" is void as a variable. "(w32-get-console-codepage)" returns 850. -- Dani Moncayo [-- Attachment #2: bzr-diff --] [-- Type: application/octet-stream, Size: 7267 bytes --] === modified file 'lisp/international/mule-cmds.el' --- lisp/international/mule-cmds.el 2012-07-25 23:11:23 +0000 +++ lisp/international/mule-cmds.el 2012-07-28 13:21:52 +0000 @@ -2655,23 +2655,29 @@ ;; On Windows, override locale-coding-system, ;; default-file-name-coding-system, keyboard-coding-system, - ;; terminal-coding-system with system codepage. + ;; terminal-coding-system with the appropriate codepages. 
       (when (boundp 'w32-ansi-code-page)
-        (let ((code-page-coding (intern (format "cp%d" w32-ansi-code-page))))
-          (when (coding-system-p code-page-coding)
-            (unless frame (setq locale-coding-system code-page-coding))
-            (set-keyboard-coding-system code-page-coding frame)
-            (set-terminal-coding-system code-page-coding frame)
-            ;; Set default-file-name-coding-system last, so that Emacs
-            ;; doesn't try to use cpNNNN when it defines keyboard and
-            ;; terminal encoding.  That's because the above two lines
-            ;; will want to load code-pages.el, where cpNNNN are
-            ;; defined; if default-file-name-coding-system were set to
-            ;; cpNNNN while these two lines run, Emacs will want to use
-            ;; it for encoding the file name it wants to load.  And that
-            ;; will fail, since cpNNNN is not yet usable until
-            ;; code-pages.el finishes loading.
-            (setq default-file-name-coding-system code-page-coding))))
+        (let ((ansi-code-page-coding (intern (format "cp%d" w32-ansi-code-page)))
+              (oem-code-page-coding
+               (intern (format "cp%d" (w32-get-console-codepage))))
+              ansi-cs-p oem-cs-p)
+          (and (coding-system-p ansi-code-page-coding)
+               (setq ansi-cs-p t))
+          (and (coding-system-p oem-code-page-coding)
+               (setq oem-cs-p t))
+          ;; Set the keyboard and display encoding to either the current
+          ;; ANSI codepage of the OEM codepage, depending on whether
+          ;; this is a GUI or a TTY frame.
+          (when ansi-cs-p
+            (unless frame (setq locale-coding-system ansi-code-page-coding))
+            (when (display-graphic-p frame)
+              (set-keyboard-coding-system ansi-code-page-coding frame)
+              (set-terminal-coding-system ansi-code-page-coding frame))
+            (setq default-file-name-coding-system ansi-code-page-coding))
+          (when oem-cs-p
+            (unless (display-graphic-p frame)
+              (set-keyboard-coding-system oem-code-page-coding frame)
+              (set-terminal-coding-system oem-code-page-coding frame)))))
 
       (when (eq system-type 'darwin)
         ;; On Darwin, file names are always encoded in utf-8, no matter

=== modified file 'src/w32console.c'
--- src/w32console.c 2012-06-28 07:50:27 +0000
+++ src/w32console.c 2012-07-28 13:21:52 +0000
@@ -37,6 +37,7 @@
 #include "termhooks.h"
 #include "termchar.h"
 #include "dispextern.h"
+#include "w32heap.h" /* for os_subtype */
 #include "w32inevt.h"
 
 /* from window.c */
@@ -67,6 +68,7 @@
 #endif
 
 HANDLE keyboard_handle;
+int w32_console_unicode_input;
 
 
 /* Setting this as the ctrl handler prevents emacs from being killed when
@@ -786,6 +788,11 @@
                     info.srWindow.Left);
     }
 
+  if (os_subtype == OS_NT)
+    w32_console_unicode_input = 1;
+  else
+    w32_console_unicode_input = 0;
+
   /* Setup w32_display_info structure for this frame. */
   w32_initialize_display_info (build_string ("Console"));
 

=== modified file 'src/w32inevt.c'
--- src/w32inevt.c 2012-05-26 11:58:19 +0000
+++ src/w32inevt.c 2012-07-28 13:21:52 +0000
@@ -41,6 +41,7 @@
 #include "termchar.h"
 #include "w32heap.h"
 #include "w32term.h"
+#include "w32inevt.h"
 
 /* stdin, from w32console.c */
 extern HANDLE keyboard_handle;
@@ -61,6 +62,15 @@
 /* Temporarily store lead byte of DBCS input sequences. */
 static char dbcs_lead = 0;
 
+static inline BOOL
+w32_read_console_input (HANDLE h, INPUT_RECORD *rec, DWORD recsize,
+                        DWORD *waiting)
+{
+  return (w32_console_unicode_input
+          ? ReadConsoleInputW (h, rec, recsize, waiting)
+          : ReadConsoleInputA (h, rec, recsize, waiting));
+}
+
 static int
 fill_queue (BOOL block)
 {
@@ -80,8 +90,8 @@
       return 0;
     }
 
-  rc = ReadConsoleInput (keyboard_handle, event_queue, EVENT_QUEUE_SIZE,
-                         &events_waiting);
+  rc = w32_read_console_input (keyboard_handle, event_queue, EVENT_QUEUE_SIZE,
+                               &events_waiting);
   if (!rc)
     return -1;
   queue_ptr = event_queue;
@@ -224,7 +234,7 @@
 #endif
 
   /* On NT, call ToUnicode instead and then convert to the current
-     locale's default codepage. */
+     console input codepage. */
   if (os_subtype == OS_NT)
     {
       WCHAR buf[128];
@@ -233,14 +243,9 @@
                           keystate, buf, 128, 0);
       if (isdead > 0)
         {
-          char cp[20];
-          int cpId;
+          int cpId = GetConsoleCP ();
 
           event->uChar.UnicodeChar = buf[isdead - 1];
-
-          GetLocaleInfo (GetThreadLocale (),
-                         LOCALE_IDEFAULTANSICODEPAGE, cp, 20);
-          cpId = atoi (cp);
           isdead = WideCharToMultiByte (cpId, 0, buf, isdead,
                                         ansi_code, 4, NULL, NULL);
         }
@@ -447,26 +452,34 @@
         }
       else if (event->uChar.AsciiChar > 0)
         {
+          /* Pure ASCII characters < 128. */
           emacs_ev->kind = ASCII_KEYSTROKE_EVENT;
           emacs_ev->code = event->uChar.AsciiChar;
         }
-      else if (event->uChar.UnicodeChar > 0)
+      else if (event->uChar.UnicodeChar > 0
+               && w32_console_unicode_input)
         {
+          /* Unicode codepoint; only valid if we are using Unicode
+             console input mode. */
           emacs_ev->kind = MULTIBYTE_CHAR_KEYSTROKE_EVENT;
           emacs_ev->code = event->uChar.UnicodeChar;
         }
       else
         {
-          /* Fallback for non-Unicode versions of Windows. */
+          /* Fallback handling of non-ASCII characters for non-Unicode
+             versions of Windows, and for non-Unicode input on NT
+             family of Windows. Only characters in the current
+             console codepage are supported by this fallback. */
           wchar_t code;
           char dbcs[2];
-          char cp[20];
           int cpId;
 
-          /* Get the codepage to interpret this key with. */
-          GetLocaleInfo (GetThreadLocale (),
-                         LOCALE_IDEFAULTANSICODEPAGE, cp, 20);
-          cpId = atoi (cp);
+          /* Get the current console input codepage to interpret this
+             key with. Note that the system defaults for the OEM
+             codepage could have been changed by calling SetConsoleCP
+             or w32-set-console-codepage, so using GetLocaleInfo to
+             get LOCALE_IDEFAULTCODEPAGE is not TRT here. */
+          cpId = GetConsoleCP ();
 
           dbcs[0] = dbcs_lead;
           dbcs[1] = event->uChar.AsciiChar;
@@ -501,6 +514,7 @@
         }
       else
         {
+          /* Function keys and other non-character keys. */
           emacs_ev->kind = NON_ASCII_KEYSTROKE_EVENT;
           emacs_ev->code = event->wVirtualKeyCode;
         }

=== modified file 'src/w32inevt.h'
--- src/w32inevt.h 2012-01-19 07:21:25 +0000
+++ src/w32inevt.h 2012-07-28 13:21:52 +0000
@@ -19,6 +19,8 @@
 #ifndef EMACS_W32INEVT_H
 #define EMACS_W32INEVT_H
 
+extern int w32_console_unicode_input;
+
 extern int w32_console_read_socket (struct terminal *term, int numchars,
                                     struct input_event *hold_quit);
 extern void w32_console_mouse_position (FRAME_PTR *f, int insist,
* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Dani Moncayo @ 2012-07-28 15:34 UTC
To: Eli Zaretskii; +Cc: lekktu, 12055

On Sat, Jul 28, 2012 at 5:23 PM, Dani Moncayo <dmoncayo@gmail.com> wrote:
>> Did you compile international/mule-cmds.el (which was modified by the
>> patch)
>
> Yes.

I'm sorry. I thought that the build process would do every needed compilation.

I've just recompiled the file "lisp/international/mule-cmds.elc" and
rebuilt Emacs ("mingw32-make" + "mingw32-make install").

Now everything seems to work well.

--
Dani Moncayo
* bug#12055: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Eli Zaretskii @ 2012-07-28 16:27 UTC
To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Sat, 28 Jul 2012 17:34:34 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
>
> I'm sorry. I thought that the build process would do every needed compilation.

Never mind that.

> I've just recompiled the file "lisp/international/mule-cmds.elc" and
> rebuilt Emacs ("mingw32-make" + "mingw32-make install").
>
> Now everything seems to work well.

Thanks! I will wait for Juanma to confirm these good results before
committing.

The actual changes I will install include one more subtlety that I
missed: the encoding of console input and output on Windows can
generally be different, so mule-cmds.el needs one more small tweak.
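The point about console input and output encodings being independent can be illustrated with a minimal Emacs Lisp sketch. This is not the change that was eventually committed. It assumes a Windows build of Emacs where `w32-get-console-codepage` and `w32-get-console-output-codepage` are defined, and it assumes the matching `cpNNN` coding systems exist. The helper name `my-w32-console-coding-systems` is hypothetical.

```elisp
;; Hypothetical helper for illustration; not part of the patch in this thread.
(defun my-w32-console-coding-systems (&optional frame)
  "Set keyboard and terminal coding systems of TTY FRAME separately.
The console input and output codepages are queried independently,
because on Windows they may differ."
  (when (and (fboundp 'w32-get-console-codepage)
             (fboundp 'w32-get-console-output-codepage)
             (not (display-graphic-p frame)))
    (let ((input-cs (intern (format "cp%d" (w32-get-console-codepage))))
          (output-cs (intern (format "cp%d"
                                     (w32-get-console-output-codepage)))))
      ;; Keyboard input is decoded with the console *input* codepage.
      (when (coding-system-p input-cs)
        (set-keyboard-coding-system input-cs frame))
      ;; Text sent to the screen is encoded with the console *output* codepage.
      (when (coding-system-p output-cs)
        (set-terminal-coding-system output-cs frame)))))
```

Calling `(my-w32-console-coding-systems)` in an `emacs -nw` session would leave GUI frames untouched, which mirrors the GUI/TTY split made by the patch.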
* bug#12055: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Eli Zaretskii @ 2012-07-28 15:35 UTC
To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Sat, 28 Jul 2012 17:23:19 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
>
> > Did you compile international/mule-cmds.el (which was modified by the
> > patch)
>
> Yes.
>
> > and did you re-dump Emacs after byte-compiling mule-cmds.el?
>
> I don't know. This is exactly what I did:
> 1. I applied your patch to my branch (the whole patch, which includes
> the file you mention).
> 2. To make sure that the patch was correctly applied, I compared your
> patch with the output of "bzr diff" (I'm attaching this output if you
> want to check it).
> 3. I went to the "nt" subdirectory and ran a plain "mingw32-make". I
> thought that the build system would know what to recompile based on
> what files have changed since the last time.
> 4. I ran a "mingw32-make install".
>
> > If you did all that, what is the value you get by evaluating
> > (terminal-coding-system)
>
> cp1252
>
> >, and what is the value you get from
> > w32-get-console-codepage?
>
> "w32-get-console-codepage" is void as a variable.
> "(w32-get-console-codepage)" returns 850.

This looks as if the changes in mule-cmds did not take effect at all.
Please try rebuilding one more time, and please run emacs.exe from
src/oo-spd/i386 or src/oo/i386 (depending on whether you build it
optimized or not).
* bug#12055: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
From: Dani Moncayo @ 2012-07-28 15:46 UTC
To: Eli Zaretskii; +Cc: lekktu, 12055

> This looks as if the changes in mule-cmds didn't take place at all.
> Please try rebuilding one more time, and please run emacs.exe from
> src/oo-spd/i386 or src/oo/i386 (depending on whether you build it
> optimized or not).

Indeed. As I said, after byte-compiling
"lisp/international/mule-cmds.el" and rebuilding Emacs, now the
problems discussed in this thread seem to be solved, and now:

(terminal-coding-system) => cp850
(w32-get-console-codepage) => 850

--
Dani Moncayo
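The manual check used in this exchange (comparing the terminal coding system with the console codepage) could be wrapped into a small command, sketched below. It assumes a Windows build where `w32-get-console-codepage` is defined. The function name `my-w32-check-console-encoding` is hypothetical and not part of Emacs.

```elisp
;; Hypothetical diagnostic command, for illustration only.
(defun my-w32-check-console-encoding ()
  "Report whether the terminal coding system matches the console codepage."
  (interactive)
  (if (not (fboundp 'w32-get-console-codepage))
      (message "w32-get-console-codepage is not available in this build")
    (let ((expected (intern (format "cp%d" (w32-get-console-codepage))))
          (actual (terminal-coding-system)))
      ;; Compare base coding systems so that EOL variants and aliases
      ;; of the same codepage still count as a match.
      (if (and (coding-system-p expected)
               (eq (coding-system-base expected)
                   (coding-system-base actual)))
          (message "OK: terminal coding system is %s" actual)
        (message "Mismatch: terminal is %s, console codepage suggests %s"
                 actual expected)))))
```

Run with `M-x my-w32-check-console-encoding` inside the `emacs -nw` session. A mismatch such as cp1252 against codepage 850 would suggest, as it did here, that the modified mule-cmds.el was not byte-compiled and re-dumped.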