unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
@ 2012-07-26 12:13 Dani Moncayo
  2012-07-26 16:13 ` Eli Zaretskii
  2012-07-28 14:12 ` Dani Moncayo
  0 siblings, 2 replies; 41+ messages in thread
From: Dani Moncayo @ 2012-07-26 12:13 UTC (permalink / raw)
  To: 12055

[-- Attachment #1: Type: text/plain, Size: 1082 bytes --]

Hello,

On my Windows 7, 64-bit system, if I start Emacs from a cmd.exe
console (emacs -nw -Q) and type the text "áéíóú", the first two
characters are not displayed correctly (see attached screenshot).

If I exit Emacs, and type those characters again in the console
prompt, they are all displayed correctly.

I've been able to reproduce this problem both in Emacs 24.1 and 23.4.

In GNU Emacs 24.1.50.1 (i386-mingw-nt6.1.7601)
 of 2012-07-19 on DANI-PC
Bzr revision: 109159 monnier@iro.umontreal.ca-20120719113938-sgu5ruqm1vcbchtw
Windowing system distributor `Microsoft Corp.', version 6.1.7601
Configured using:
 `configure --with-gcc (4.7) --enable-checking --cflags
 -I../../libs/libiconv-1.14-2-mingw32-dev/include
 -I../../libs/libxml2-2.7.8-w32-bin/include/libxml2
 -I../../libs/giflib-4.1.4-1/include
 -I../../emacs/libs/gnutls-3.0.16/include -I../../libs/jpeg-6b-4/include
 -I../../libs/libpng-1.4.10 -I../../libs/libxpm-3.5.8/include
 -I../../libs/libxpm-3.5.8/src -I../../libs/tiff-3.8.2-1/include
 -I../../libs/zlib-1.2.6'


-- 
Dani Moncayo

[-- Attachment #2: img1.png --]
[-- Type: image/png, Size: 25235 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-26 12:13 bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal Dani Moncayo
@ 2012-07-26 16:13 ` Eli Zaretskii
  2012-07-26 16:24   ` Juanma Barranquero
  2012-07-28 14:12 ` Dani Moncayo
  1 sibling, 1 reply; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-26 16:13 UTC (permalink / raw)
  To: Dani Moncayo; +Cc: 12055

> Date: Thu, 26 Jul 2012 14:13:23 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> 
> On my Windows 7, 64-bit system, if I start Emacs from a cmd.exe
> console (emacs -nw -Q) and type the text "áéíóú", the first two
> characters are not displayed correctly (see attached screenshot).
> 
> If I exit Emacs, and type those characters again in the console
> prompt, they are all displayed correctly.

Do these two characters _always_ display incorrectly inside Emacs, or
only when they are the first you type after starting the -nw session?

Also, how did you type these characters, and what does Emacs say if
you go to each one of them and type "C-u C-x ="?

> I've been able to reproduce this problem both in Emacs 24.1 and 23.4.

Thanks for the report.

First, please always make a point of reporting bugs via 
"M-x report-emacs-bug RET", as that command collects and sends
lots of useful information about your system and Emacs setup.
This is especially important when non-ASCII characters are involved,
as report-emacs-bug provides important information related to that.

Please send that info, as collected in the same version of Emacs and
in the same console session as the one where you see the problem.

In addition, please tell what these expressions produce in the console
session of the current trunk Emacs version:

  (terminal-coding-system)

  (keyboard-coding-system)

  w32-ansi-codepage

  (w32-get-console-codepage)

  (w32-get-console-output-codepage)

Finally, in the console outside Emacs type "chcp" at the Windows
cmd.exe shell prompt, and tell what it says.

* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-26 16:13 ` Eli Zaretskii
@ 2012-07-26 16:24   ` Juanma Barranquero
  2012-07-26 16:42     ` bug#12055: " Eli Zaretskii
  2012-07-26 16:44     ` Dani Moncayo
  0 siblings, 2 replies; 41+ messages in thread
From: Juanma Barranquero @ 2012-07-26 16:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 12055

I see the same.

> Do these two characters _always_ display incorrectly inside Emacs, or
> only when they are the first you type after starting the -nw session?

Always.

> Also, how did you type these characters,

' a  and  ' e  (' is a dead key on Spanish keyboards, used to type accented vowels).

> and what does Emacs say if
> you go to each one of them and type "C-u C-x ="?

             position: 206 of 210 (98%), column: 0
            character:   (displayed as  ) (codepoint 160, #o240, #xa0)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0xA0
               syntax: . 	which means: punctuation
             category: .:Base, b:Arabic, j:Japanese, l:Latin
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC2 #xA0
            file code: #xC2 #xA0 (encoded by coding system nil)
              display: terminal code #xA0
       hardcoded face: nobreak-space

Character code properties: customize what to show
  name: NO-BREAK SPACE
  old-name: NON-BREAKING SPACE
  general-category: Zs (Separator, Space)
  decomposition: (noBreak 32) (noBreak ' ')


             position: 210 of 210 (100%), column: 0
            character: ‚ (displayed as ‚) (codepoint 130, #o202, #x82)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x82
               syntax: w 	which means: word
             category: l:Latin
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC2 #x82
            file code: #xC2 #x82 (encoded by coding system nil)
              display: not encodable for terminal

Character code properties: customize what to show
  name: <control>
  old-name: BREAK PERMITTED HERE
  general-category: Cc (Other, Control)
  decomposition: (130) ('‚')


> First, please always make a point of reporting bugs via
> "M-x report-emacs-bug RET", as that command collects and sends
> lots of useful information about your system and Emacs setup.
> This is especially important when non-ASCII characters are involved,
> as report-emacs-bug provides important information related to that.

Important settings:
  value of $LANG: C
  locale-coding-system: cp1252
  default enable-multibyte-characters: t

>   (terminal-coding-system)

cp1252

>   (keyboard-coding-system)

windows-1252-unix

>   w32-ansi-codepage

1252  (BTW, we're not very consistent here, the variable is w32-ansi-code-page)

>   (w32-get-console-codepage)

850

>   (w32-get-console-output-codepage)

850

> Finally, in the console outside Emacs type "chcp" at the Windows
> cmd.exe shell prompt, and tell what it says.

Página de códigos activa: 850


    Juanma

* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-26 16:24   ` Juanma Barranquero
@ 2012-07-26 16:42     ` Eli Zaretskii
  2012-07-26 16:49       ` Juanma Barranquero
  2012-07-26 16:44     ` Dani Moncayo
  1 sibling, 1 reply; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-26 16:42 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: 12055

> From: Juanma Barranquero <lekktu@gmail.com>
> Date: Thu, 26 Jul 2012 18:24:14 +0200
> Cc: Dani Moncayo <dmoncayo@gmail.com>, 12055@debbugs.gnu.org
> 
> I see the same.

I have no doubt ;-)

> >   (terminal-coding-system)
> 
> cp1252
> 
> >   (keyboard-coding-system)
> 
> windows-1252-unix
> 
> >   w32-ansi-codepage
> 
> 1252  (BTW, we're not very consistent here, the variable is w32-ansi-code-page)
> 
> >   (w32-get-console-codepage)
> 
> 850
> 
> >   (w32-get-console-output-codepage)
> 
> 850
> 
> > Finally, in the console outside Emacs type "chcp" at the Windows
> > cmd.exe shell prompt, and tell what it says.
> 
> Página de códigos activa: 850

Does it help to say

  C-x RET t cp850 RET
  C-x RET k cp850 RET

before typing those characters?  Do they display correctly then, and
most importantly, does "C-u C-x =" report in that case the characters
you really intended to type?

* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-26 16:24   ` Juanma Barranquero
  2012-07-26 16:42     ` bug#12055: " Eli Zaretskii
@ 2012-07-26 16:44     ` Dani Moncayo
  1 sibling, 0 replies; 41+ messages in thread
From: Dani Moncayo @ 2012-07-26 16:44 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: 12055

> I see the same.

Good to hear that the problem is not Dani-specific :).

I have the same data given by Juanma, except one (probably irrelevant) detail:

>   value of $LANG: C

I have:  value of $LANG: ESN

-- 
Dani Moncayo

* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-26 16:42     ` bug#12055: " Eli Zaretskii
@ 2012-07-26 16:49       ` Juanma Barranquero
  2012-07-26 17:18         ` bug#12055: " Eli Zaretskii
  0 siblings, 1 reply; 41+ messages in thread
From: Juanma Barranquero @ 2012-07-26 16:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 12055

On Thu, Jul 26, 2012 at 6:42 PM, Eli Zaretskii <eliz@gnu.org> wrote:

> Does it help to say
>
>   C-x RET t cp850 RET
>   C-x RET k cp850 RET
>
> before typing those characters?  Do they display correctly then, and
> most importantly, does "C-u C-x =" report in that case the characters
> you really intended to type?

No. The problem worsens. Now á é are still incorrect, and  í ó ú ñ ç
turn into ¡ ¢ £ ¤‡ \207

Note: I see that at some point in the past I surely hit the problem
and for some reason I failed to report it, because I have this in my
.emacs:

(unless (or window-system noninteractive
            (not (boundp 'w32-ansi-code-page)))
  (let ((cicp (w32-get-console-codepage))
        (cocp (w32-get-console-output-codepage)))
    (w32-set-console-codepage w32-ansi-code-page)
    (w32-set-console-output-codepage w32-ansi-code-page)
    (add-hook 'kill-emacs-hook
              `(lambda ()
                 (w32-set-console-codepage ,cicp)
                 (w32-set-console-output-codepage ,cocp)))))

though that's irrelevant to the tests above, which are all -Q -nw.

    Juanma

* bug#12055: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-26 16:49       ` Juanma Barranquero
@ 2012-07-26 17:18         ` Eli Zaretskii
  2012-07-26 18:09           ` Eli Zaretskii
  2012-07-26 18:29           ` Juanma Barranquero
  0 siblings, 2 replies; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-26 17:18 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: 12055

> From: Juanma Barranquero <lekktu@gmail.com>
> Date: Thu, 26 Jul 2012 18:49:33 +0200
> Cc: dmoncayo@gmail.com, 12055@debbugs.gnu.org
> 
> On Thu, Jul 26, 2012 at 6:42 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> > Does it help to say
> >
> >   C-x RET t cp850 RET
> >   C-x RET k cp850 RET
> >
> > before typing those characters?  Do they display correctly then, and
> > most importantly, does "C-u C-x =" report in that case the characters
> > you really intended to type?
> 
> No. The problem worsens. Now á é are still incorrect, and  í ó ú ñ ç
> turn into ¡ ¢ £ ¤‡ \207

What are the codes of these characters, as "C-u C-x =" sees them?

> Note: I see that at some point in the past I surely hit the problem
> and for some reason I failed to report it, because I have this in my
> .emacs:
> 
> (unless (or window-system noninteractive
>             (not (boundp 'w32-ansi-code-page)))
>   (let ((cicp (w32-get-console-codepage))
>         (cocp (w32-get-console-output-codepage)))
>     (w32-set-console-codepage w32-ansi-code-page)
>     (w32-set-console-output-codepage w32-ansi-code-page)
>     (add-hook 'kill-emacs-hook
>               `(lambda ()
>                  (w32-set-console-codepage ,cicp)
>                  (w32-set-console-output-codepage ,cocp)))))
> 
> though that's irrelevant to the tests above, which are all -Q -nw.

Does the above solve the problem at hand, though?  If it does, we can
do that at startup in the -nw session.

* bug#12055: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-26 17:18         ` bug#12055: " Eli Zaretskii
@ 2012-07-26 18:09           ` Eli Zaretskii
  2012-07-26 18:42             ` Juanma Barranquero
  2012-07-26 18:29           ` Juanma Barranquero
  1 sibling, 1 reply; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-26 18:09 UTC (permalink / raw)
  To: lekktu; +Cc: 12055

> Date: Thu, 26 Jul 2012 20:18:45 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: 12055@debbugs.gnu.org
> 
> > > Does it help to say
> > >
> > >   C-x RET t cp850 RET
> > >   C-x RET k cp850 RET
> > >
> > > before typing those characters?  Do they display correctly then, and
> > > most importantly, does "C-u C-x =" report in that case the characters
> > > you really intended to type?
> > 
> > No. The problem worsens. Now á é are still incorrect, and  í ó ú ñ ç
> > turn into ¡ ¢ £ ¤‡ \207

I think we need to establish whether the problem is with input or
output (or both).  (I think it's with input, but let's make sure.)  If
you type these same characters using some latin-1 Leim input method
(e.g., latin-1-postfix), and set the terminal encoding to cp850, do
all the Latin-1 characters display correctly?

* bug#12055: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-26 17:18         ` bug#12055: " Eli Zaretskii
  2012-07-26 18:09           ` Eli Zaretskii
@ 2012-07-26 18:29           ` Juanma Barranquero
  2012-07-26 20:03             ` bug#12055: " Eli Zaretskii
  1 sibling, 1 reply; 41+ messages in thread
From: Juanma Barranquero @ 2012-07-26 18:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 12055

On Thu, Jul 26, 2012 at 7:18 PM, Eli Zaretskii <eliz@gnu.org> wrote:

> What are the codes of these characters, as "C-u C-x =" sees them?

á and é, as above.

As for í, ó, ú, ñ and ç, in that order:

             position: 194 of 198 (97%), column: 5
            character: ¡ (displayed as ¡) (codepoint 161, #o241, #xa1)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0xA1
               syntax: . 	which means: punctuation
             category: .:Base, h:Korean, j:Japanese, l:Latin
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC2 #xA1
            file code: #xC2 #xA1 (encoded by coding system nil)
              display: terminal code #xAD

Character code properties: customize what to show
  name: INVERTED EXCLAMATION MARK
  general-category: Po (Punctuation, Other)
  decomposition: (161) ('¡')


             position: 195 of 198 (98%), column: 6
            character: ¢ (displayed as ¢) (codepoint 162, #o242, #xa2)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0xA2
               syntax: _ 	which means: symbol
             category: .:Base, j:Japanese, l:Latin
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC2 #xA2
            file code: #xC2 #xA2 (encoded by coding system nil)
              display: terminal code #xBD

Character code properties: customize what to show
  name: CENT SIGN
  general-category: Sc (Symbol, Currency)
  decomposition: (162) ('¢')


             position: 196 of 198 (98%), column: 7
            character: £ (displayed as £) (codepoint 163, #o243, #xa3)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0xA3
               syntax: _ 	which means: symbol
             category: .:Base, j:Japanese, l:Latin
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC2 #xA3
            file code: #xC2 #xA3 (encoded by coding system nil)
              display: terminal code #x9C

Character code properties: customize what to show
  name: POUND SIGN
  general-category: Sc (Symbol, Currency)
  decomposition: (163) ('£')


             position: 197 of 198 (99%), column: 8
            character: ¤ (displayed as ¤) (codepoint 164, #o244, #xa4)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0xA4
               syntax: _ 	which means: symbol
             category:
		       .:Base, b:Arabic, c:Chinese, h:Korean, j:Japanese, l:Latin
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC2 #xA4
            file code: #xC2 #xA4 (encoded by coding system nil)
              display: terminal code #xCF


             position: 198 of 198 (99%), column: 9
            character: ‡ (displayed as ‡) (codepoint 135, #o207, #x87)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x87
               syntax: w 	which means: word
             category: l:Latin
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC2 #x87
            file code: #xC2 #x87 (encoded by coding system nil)
              display: not encodable for terminal

Character code properties: customize what to show
  name: <control>
  old-name: END OF SELECTED AREA
  general-category: Cc (Other, Control)
  decomposition: (135) ('‡')


> Does the above solve the problem at hand, though?  If it does, we can
> do that at startup in the -nw session.

Yes, if you evaluate that code prior to typing á, é, etc, it works as expected.

    Juanma

* bug#12055: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-26 18:09           ` Eli Zaretskii
@ 2012-07-26 18:42             ` Juanma Barranquero
  0 siblings, 0 replies; 41+ messages in thread
From: Juanma Barranquero @ 2012-07-26 18:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 12055

On Thu, Jul 26, 2012 at 8:09 PM, Eli Zaretskii <eliz@gnu.org> wrote:

> I think we need to establish whether the problem is with input or
> output (or both).  (I think it's with input, but let's make sure.)  If
> you type these same characters using some latin-1 Leim input method
> (e.g., latin-1-postfix), and set the terminal encoding to cp850, do
> all the Latin-1 characters display correctly?

Yes, after

   C-x RET t cp850 RET
   C-\ latin-1-postfix RET

the accented characters can be input with latin-1-postfix and display
correctly (and C-u M-x describe-char confirms they are the expected
chars).

    Juanma

* bug#12055: Re: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-26 18:29           ` Juanma Barranquero
@ 2012-07-26 20:03             ` Eli Zaretskii
  2012-07-26 22:40               ` Dani Moncayo
  0 siblings, 1 reply; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-26 20:03 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: 12055

> From: Juanma Barranquero <lekktu@gmail.com>
> Date: Thu, 26 Jul 2012 20:29:57 +0200
> Cc: dmoncayo@gmail.com, 12055@debbugs.gnu.org
> 
> On Thu, Jul 26, 2012 at 7:18 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> > What are the codes of these characters, as "C-u C-x =" sees them?
> 
> á and é, as above.
> 
> As for í, ó, ú, ñ and ç, in that order:
> 
>              position: 194 of 198 (97%), column: 5
>             character: ¡ (displayed as ¡) (codepoint 161, #o241, #xa1)
>     preferred charset: unicode (Unicode (ISO10646))
> code point in charset: 0xA1
>                syntax: . 	which means: punctuation
>              category: .:Base, h:Korean, j:Japanese, l:Latin
>              to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
>           buffer code: #xC2 #xA1
>             file code: #xC2 #xA1 (encoded by coding system nil)
>               display: terminal code #xAD
> 
> Character code properties: customize what to show
>   name: INVERTED EXCLAMATION MARK
>   general-category: Po (Punctuation, Other)
>   decomposition: (161) ('¡')
> 
> 
>              position: 195 of 198 (98%), column: 6
>             character: ¢ (displayed as ¢) (codepoint 162, #o242, #xa2)
>     preferred charset: unicode (Unicode (ISO10646))
> code point in charset: 0xA2
>                syntax: _ 	which means: symbol
>              category: .:Base, j:Japanese, l:Latin
>              to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
>           buffer code: #xC2 #xA2
>             file code: #xC2 #xA2 (encoded by coding system nil)
>               display: terminal code #xBD
> 
> Character code properties: customize what to show
>   name: CENT SIGN
>   general-category: Sc (Symbol, Currency)
>   decomposition: (162) ('¢')
> 
> 
>              position: 196 of 198 (98%), column: 7
>             character: £ (displayed as £) (codepoint 163, #o243, #xa3)
>     preferred charset: unicode (Unicode (ISO10646))
> code point in charset: 0xA3
>                syntax: _ 	which means: symbol
>              category: .:Base, j:Japanese, l:Latin
>              to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
>           buffer code: #xC2 #xA3
>             file code: #xC2 #xA3 (encoded by coding system nil)
>               display: terminal code #x9C
> 
> Character code properties: customize what to show
>   name: POUND SIGN
>   general-category: Sc (Symbol, Currency)
>   decomposition: (163) ('£')
> 
> 
>              position: 197 of 198 (99%), column: 8
>             character: ¤ (displayed as ¤) (codepoint 164, #o244, #xa4)
>     preferred charset: unicode (Unicode (ISO10646))
> code point in charset: 0xA4
>                syntax: _ 	which means: symbol
>              category:
> 		       .:Base, b:Arabic, c:Chinese, h:Korean, j:Japanese, l:Latin
>              to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
>           buffer code: #xC2 #xA4
>             file code: #xC2 #xA4 (encoded by coding system nil)
>               display: terminal code #xCF
> 
> 
>              position: 198 of 198 (99%), column: 9
>             character: ‡ (displayed as ‡) (codepoint 135, #o207, #x87)
>     preferred charset: unicode (Unicode (ISO10646))
> code point in charset: 0x87
>                syntax: w 	which means: word
>              category: l:Latin
>              to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
>           buffer code: #xC2 #x87
>             file code: #xC2 #x87 (encoded by coding system nil)
>               display: not encodable for terminal
> 
> Character code properties: customize what to show
>   name: <control>
>   old-name: END OF SELECTED AREA
>   general-category: Cc (Other, Control)
>   decomposition: (135) ('‡')

That's strange: these are definitely the cp850 codes for the Latin-1
characters you typed, so I wonder why just setting
terminal-coding-system to that doesn't fix the problem...







* bug#12055: Re: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-26 20:03             ` bug#12055: " Eli Zaretskii
@ 2012-07-26 22:40               ` Dani Moncayo
  2012-07-27  6:45                 ` bug#12055: " Eli Zaretskii
  0 siblings, 1 reply; 41+ messages in thread
From: Dani Moncayo @ 2012-07-26 22:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Juanma Barranquero, 12055

[-- Attachment #1: Type: text/plain, Size: 660 bytes --]

FWIW, another experiment that produces unexpected results:

The attached screenshot shows how two Emacs instances (GUI and
non-GUI) show the contents of a test file (previously written by me
and saved to disk).

IIUC, this time the input method is irrelevant, since the text has
been read from a file (not the keyboard).

As you can see in the modelines, both instances of Emacs have selected
the latin-1 coding system, but the non-GUI instance fails to show the
characters correctly (and not only "á" and "é").

I don't know if this problem has the same root as the original.  If
not, I could file a separate bug report.

-- 
Dani Moncayo

[-- Attachment #2: img2.png --]
[-- Type: image/png, Size: 40071 bytes --]


* bug#12055: Re: Re: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-26 22:40               ` Dani Moncayo
@ 2012-07-27  6:45                 ` Eli Zaretskii
  2012-07-27  8:35                   ` Dani Moncayo
  0 siblings, 1 reply; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-27  6:45 UTC (permalink / raw)
  To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Fri, 27 Jul 2012 00:40:22 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: Juanma Barranquero <lekktu@gmail.com>, 12055@debbugs.gnu.org
> 
> The attached screenshot shows how two Emacs instances (GUI and
> non-GUI) show the contents of a test file (previously written by me
> and saved to disk).
> 
> IIUC, this time the input method is irrelevant, since the text has
> been read from a file (not the keyboard).
> 
> As you can see in the modelines, both instances of Emacs have selected
> the latin-1 coding system, but the non-GUI instance fails to show the
> characters correctly (and not only "á" and "é").

Please try that in the non-GUI session where you first set the
terminal coding-system to cp850.  Juanma said that doing so and using
a Leim input method (which is guaranteed to produce correct characters
in the buffer) shows these characters correctly.  So I expect the same
to work with a file (unless that file was also produced in a non-GUI
Emacs session...).

Which leaves us with the input problem...

* bug#12055: Re: Re: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-27  6:45                 ` bug#12055: " Eli Zaretskii
@ 2012-07-27  8:35                   ` Dani Moncayo
  2012-07-27  9:04                     ` bug#12055: " Eli Zaretskii
  0 siblings, 1 reply; 41+ messages in thread
From: Dani Moncayo @ 2012-07-27  8:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lekktu, 12055

> Please try that in the non-GUI session where you first set the
> terminal coding-system to cp850.

Ok.  If I do:
1. emacs -nw -Q
2. C-x RET t cp850 RET
3. Visit the test file.

Then the file is correctly displayed.

-- 
Dani Moncayo

* bug#12055: Re: Re: Re: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-27  8:35                   ` Dani Moncayo
@ 2012-07-27  9:04                     ` Eli Zaretskii
  2012-07-27 15:12                       ` Eli Zaretskii
  0 siblings, 1 reply; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-27  9:04 UTC (permalink / raw)
  To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Fri, 27 Jul 2012 10:35:53 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> 
> > Please try that in the non-GUI session where you first set the
> > terminal coding-system to cp850.
> 
> Ok.  If I do:
> 1. emacs -nw -Q
> 2. C-x RET t cp850 RET
> 3. Visit the test file.
> 
> Then the file is correctly displayed.

Thanks.  If no one beats me to it, I will look into the input issue
when I have time.

* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-27  9:04                     ` bug#12055: " Eli Zaretskii
@ 2012-07-27 15:12                       ` Eli Zaretskii
  2012-07-27 16:46                         ` Jason Rumney
                                           ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-27 15:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lekktu, 12055

> Date: Fri, 27 Jul 2012 12:04:57 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> 
> > Date: Fri, 27 Jul 2012 10:35:53 +0200
> > From: Dani Moncayo <dmoncayo@gmail.com>
> > Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> > 
> > > Please try that in the non-GUI session where you first set the
> > > terminal coding-system to cp850.
> > 
> > Ok.  If I do:
> > 1. emacs -nw -Q
> > 2. C-x RET t cp850 RET
> > 3. Visit the test file.
> > 
> > Then the file is correctly displayed.
> 
> Thanks.  If no one beats me to it, I will look into the input issue
> when I have time.

Well, I see some strange stuff in the input processing.

Please add this snippet:

  DebPrint (("key_event: %d %d 0x%x 0x%x {0x%x 0x%x} 0x%x\n",
	     event->bKeyDown, event->wRepeatCount,
	     event->wVirtualKeyCode, event->wVirtualScanCode,
	     event->uChar.AsciiChar, event->uChar.UnicodeChar,
	     event->dwControlKeyState));

at the very beginning of the key_event function (in w32inevt.c), attach
GDB to a running "emacs -Q -nw", and tell me what GDB reports when
you type non-ASCII keys on your keyboard.

That is,

  emacs -Q -nw
  gdb -p EMACS-PID
  (gdb) continue

then type non-ASCII characters into Emacs.  You should see messages
such as these:

  warning: key_event: 0 1 0x54 0x14 {0xffffff80 0x580} 0x20

  warning: key_event: 1 1 0x54 0x14 {0xffffff80 0x580} 0x20

(but with different codes).  There are 2 messages for each keystroke:
one when the key is pressed, the other when it is released.  Please
post here the exact output, and please tell for each pair of such
messages which character you typed.






* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-27 15:12                       ` Eli Zaretskii
@ 2012-07-27 16:46                         ` Jason Rumney
  2012-07-27 18:03                           ` Eli Zaretskii
  2012-07-27 23:45                         ` Juanma Barranquero
  2012-07-28  1:12                         ` Dani Moncayo
  2 siblings, 1 reply; 41+ messages in thread
From: Jason Rumney @ 2012-07-27 16:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lekktu, 12055

Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Fri, 27 Jul 2012 12:04:57 +0300
>> From: Eli Zaretskii <eliz@gnu.org>
>> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
>> 
>> > Date: Fri, 27 Jul 2012 10:35:53 +0200
>> > From: Dani Moncayo <dmoncayo@gmail.com>
>> > Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
>> > 
>> > > Please try that in the non-GUI session where you first set the
>> > > terminal coding-system to cp850.
>> > 
>> > Ok.  If I do:
>> > 1. emacs -nw -Q
>> > 2. C-x RET t cp850 RET
>> > 3. Visit the test file.
>> > 
>> > Then the file is correctly displayed.
>> 
>> Thanks.  If no one beats me to it, I will look into the input issue
>> when I have time.
>
> Well, I see some strange stuff in the input processing.

	  /* Get the codepage to interpret this key with.  */
          GetLocaleInfo (GetThreadLocale (),
			 LOCALE_IDEFAULTANSICODEPAGE, cp, 20);
	  cpId = atoi (cp);

is quite suspicious. It appears in two places - one is a fallback for
older versions of Windows that do not fully support Unicode, the other
is more interesting for this case, as it is in the dead key handling,
and from Juanma's description, a dead key is being used to input the
problem characters.

The above lines should probably be replaced with

   cpId = GetConsoleCP ();

* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-27 16:46                         ` Jason Rumney
@ 2012-07-27 18:03                           ` Eli Zaretskii
  2012-07-27 18:22                             ` Eli Zaretskii
  0 siblings, 1 reply; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-27 18:03 UTC (permalink / raw)
  To: Jason Rumney; +Cc: lekktu, 12055

> From: Jason Rumney <jasonr@gnu.org>
> Cc: lekktu@gmail.com,  12055@debbugs.gnu.org
> Date: Sat, 28 Jul 2012 00:46:08 +0800
> 
> > Well, I see some strange stuff in the input processing.
> 
> 	  /* Get the codepage to interpret this key with.  */
>           GetLocaleInfo (GetThreadLocale (),
> 			 LOCALE_IDEFAULTANSICODEPAGE, cp, 20);
> 	  cpId = atoi (cp);
> 
> is quite suspicious. It appears in two places - one is a fallback for
> older versions of Windows that do not fully support Unicode, the other
> is more interesting for this case, as it is in the dead key handling,
> and from Juanma's description, a dead key is being used to input the
> problem characters.
> 
> The above lines should probably be replaced with
> 
>    cpId = GetConsoleCP ();

Thanks.  Yes, I wondered about that as well.  However, this is not my
problem right now.  If we were decoding input with a wrong codepage, I
should have at least seen correct Unicode character codes right at
entry into key_event.  But what I see on my machine (whose ANSI
encoding is cp1255 and the corresponding OEM encoding is cp862) is
something really weird.  When I switch the keyboard to Hebrew and type
ALEPH, BET, GIMEL, whose Unicode codepoints are, respectively, u+05D0,
u+05D1, u+05D2, I see 0x0580, 0x0581, and 0x0582 instead.  That makes
no sense at all, and no amount of tinkering with input codepage can
ever fix that.
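
One possible reading of those values, offered as an assumption and not
as something established at this point in the thread: in cp862 the
letters ALEPH, BET and GIMEL are the bytes 0x80, 0x81 and 0x82, and
uChar is a union, so an 8-bit AsciiChar stored over a 16-bit
UnicodeChar would replace only its low byte.  A minimal sketch of
that overlay follows; the function name is hypothetical, and the mask
models the little-endian x86 layout explicitly instead of relying on
the real union:

```c
#include <stdint.h>

/* Model what happens when an 8-bit character is stored into the low
   byte of a slot that already holds a 16-bit code unit, as it would
   be for a char/WCHAR union on little-endian x86.  This is an
   illustration only, not code from Emacs or from the Windows API.  */
uint16_t
overlay_low_byte (uint16_t wide, uint8_t narrow)
{
  /* Keep the high byte of the wide value, replace the low byte.  */
  return (uint16_t) ((wide & 0xFF00u) | narrow);
}
```

With U+05D0 and the cp862 byte 0x80 this yields 0x0580, which matches
the first value reported above; BET and GIMEL follow the same pattern.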

Besides, at least in my locale, the code that you mention is never
executed at all.  Instead, we return the original Unicode character
codepoint via this fragment:

      else if (event->uChar.UnicodeChar > 0)
	{
	  emacs_ev->kind = MULTIBYTE_CHAR_KEYSTROKE_EVENT;
	  emacs_ev->code = event->uChar.UnicodeChar;
	}

And since, at least in my locale, event->uChar.UnicodeChar is wrong,
the rest is a logical consequence of this.

So my current theory is that it is simply wrong to look at
uChar.UnicodeChar unless we call ReadConsoleInputW, the wide-character
version of the API.  But I need data from other locales to make sure
this theory is correct.  The theory is based on the following vague
portion of ReadConsoleInput's documentation:

  This function uses either Unicode characters or 8-bit characters
  from the console's current code page.

There isn't a word about when it does one or the other (AFAICS), which
led me to the above hypothesis, since that's the only cause that
doesn't need to be explicitly documented.

Btw, the MSDN documentation about this stuff is not as helpful as it
could have been (so what else is new?).  This page

  http://msdn.microsoft.com/en-us/library/windows/desktop/ms684166%28v=vs.85%29.aspx

says:

  uChar
      A union of the following members.

      UnicodeChar
	  Translated Unicode character.

      AsciiChar
	  Translated ASCII character.

What the heck do they mean by "translated" here?  "Translated" by whom
and how?





^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-27 18:03                           ` Eli Zaretskii
@ 2012-07-27 18:22                             ` Eli Zaretskii
  0 siblings, 0 replies; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-27 18:22 UTC (permalink / raw)
  To: jasonr; +Cc: lekktu, 12055

> Date: Fri, 27 Jul 2012 21:03:43 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> 
> So my current theory is that it is simply wrong to look at
> uChar.UnicodeChar unless we call ReadConsoleInputW, the wide-character
> version of the API.

Forgot to mention an important detail: if I replace the call to
ReadConsoleInput with ReadConsoleInputW, I do see the expected 0x05D0
etc. codes in uChar.UnicodeChar of each event, and Emacs inserts the
correct characters into the buffer.





^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-27 15:12                       ` Eli Zaretskii
  2012-07-27 16:46                         ` Jason Rumney
@ 2012-07-27 23:45                         ` Juanma Barranquero
  2012-07-28  1:12                         ` Dani Moncayo
  2 siblings, 0 replies; 41+ messages in thread
From: Juanma Barranquero @ 2012-07-27 23:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 12055

On Fri, Jul 27, 2012 at 5:12 PM, Eli Zaretskii <eliz@gnu.org> wrote:

> There are 2 messages for each keystroke:

Not exactly, see below.

> one when the key is pressed, the other when it is released.  Please
> post here the exact output, and please tell for each pair of such
> messages which character did you type.

Dead key '
warning: key_event: 1 1 0xde 0x28 {0x0 0x0} 0x0
warning: key_event: 0 1 0xde 0x0 {0xffffffef 0xef} 0x0
warning: key_event: 0 1 0xde 0x28 {0xffffffef 0xef} 0x0

a
warning: key_event: 1 1 0x41 0x1e {0xffffffa0 0xa0} 0x0
warning: key_event: 0 1 0x41 0x1e {0x61 0x61} 0x0

Dead key '
warning: key_event: 1 1 0xde 0x28 {0x0 0x0} 0x0
warning: key_event: 0 1 0xde 0x0 {0xffffffef 0xef} 0x0
warning: key_event: 0 1 0xde 0x28 {0xffffffef 0xef} 0x0

e
warning: key_event: 1 1 0x45 0x12 {0xffffff82 0x82} 0x0
warning: key_event: 0 1 0x45 0x12 {0x65 0x65} 0x0

etc.

    Juanma





^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-27 15:12                       ` Eli Zaretskii
  2012-07-27 16:46                         ` Jason Rumney
  2012-07-27 23:45                         ` Juanma Barranquero
@ 2012-07-28  1:12                         ` Dani Moncayo
  2012-07-28  8:04                           ` bug#12055: " Eli Zaretskii
  2 siblings, 1 reply; 41+ messages in thread
From: Dani Moncayo @ 2012-07-28  1:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lekktu, 12055

> Please
> post here the exact output, and please tell for each pair of such
> messages which character did you type.

Sorry for the delay.  I've not had time until now.

Here is my data:

  [´] (dead key, used before vowels to insert accented vowels like "á")
  warning: key_event: 1 1 0xde 0x28 {0x0 0x0} 0x20
  warning: key_event: 0 1 0xde 0x0 {0xffffffef 0xef} 0x20
  warning: key_event: 0 1 0xde 0x28 {0xffffffef 0xef} 0x20

  [a]
  warning: key_event: 1 1 0x41 0x1e {0xffffffa0 0xa0} 0x20
  warning: key_event: 0 1 0x41 0x1e {0x61 0x61} 0x20

  [e]
  warning: key_event: 1 1 0x45 0x12 {0xffffff82 0x82} 0x20
  warning: key_event: 0 1 0x45 0x12 {0x65 0x65} 0x20

  [i]
  warning: key_event: 1 1 0x49 0x17 {0xffffffa1 0xa1} 0x20
  warning: key_event: 0 1 0x49 0x17 {0x69 0x69} 0x20

  [o]
  warning: key_event: 1 1 0x4f 0x18 {0xffffffa2 0xa2} 0x20
  warning: key_event: 0 1 0x4f 0x18 {0x6f 0x6f} 0x20

  [u]
  warning: key_event: 1 1 0x55 0x16 {0xffffffa3 0xa3} 0x20
  warning: key_event: 0 1 0x55 0x16 {0x75 0x75} 0x20

  [ñ]
  warning: key_event: 1 1 0xc0 0x27 {0xffffffa4 0xa4} 0x20
  warning: key_event: 0 1 0xc0 0x27 {0xffffffa4 0xa4} 0x20

  [ç]
  warning: key_event: 1 1 0xbf 0x2b {0xffffff87 0x87} 0x20
  warning: key_event: 0 1 0xbf 0x2b {0xffffff87 0x87} 0x20
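
  The logged values can be read as cp850 bytes.  The sketch below
  assumes that the first number in braces is uChar.AsciiChar printed
  as a sign-extended int and the second is uChar.UnicodeChar; the
  log format itself is not spelled out in this message.  The table
  covers only the characters typed above, and all names in it are
  hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

struct cp850_entry
{
  uint8_t byte;        /* byte value in codepage 850 */
  uint16_t codepoint;  /* corresponding Unicode codepoint */
};

/* Partial cp850 table: only the characters typed in this report.  */
static const struct cp850_entry cp850_subset[] = {
  { 0x82, 0x00E9 },  /* e with acute */
  { 0x87, 0x00E7 },  /* c with cedilla */
  { 0xA0, 0x00E1 },  /* a with acute */
  { 0xA1, 0x00ED },  /* i with acute */
  { 0xA2, 0x00F3 },  /* o with acute */
  { 0xA3, 0x00FA },  /* u with acute */
  { 0xA4, 0x00F1 },  /* n with tilde */
};

/* Take a value as printed in the log (possibly sign-extended, such
   as 0xffffffa0), keep its low byte, and decode that byte as cp850.
   Returns 0 for bytes outside this partial table.  */
uint16_t
decode_logged_char (uint32_t logged)
{
  uint8_t byte = (uint8_t) (logged & 0xFFu);
  size_t i;

  if (byte < 0x80)
    return byte;  /* ASCII is the same in cp850 and Unicode */
  for (i = 0; i < sizeof cp850_subset / sizeof cp850_subset[0]; i++)
    if (cp850_subset[i].byte == byte)
      return cp850_subset[i].codepoint;
  return 0;
}
```

  Read this way, 0xffffffa0 is the cp850 byte for "á" (U+00E1), not a
  Unicode codepoint, so the UnicodeChar column in these logs appears
  to carry codepage bytes as well.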

-- 
Dani Moncayo





^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28  1:12                         ` Dani Moncayo
@ 2012-07-28  8:04                           ` Eli Zaretskii
  2012-07-28 10:06                             ` Eli Zaretskii
  0 siblings, 1 reply; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-28  8:04 UTC (permalink / raw)
  To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Sat, 28 Jul 2012 03:12:12 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> 
> > Please
> > post here the exact output, and please tell for each pair of such
> > messages which character did you type.
> 
> Sorry for the delay.  I've not had time until now.
> 
> Here is my data:

Thanks to both of you.  Now I see that my theory is correct, and I can
sit down and code the solution for this problem.





^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28  8:04                           ` bug#12055: " Eli Zaretskii
@ 2012-07-28 10:06                             ` Eli Zaretskii
  2012-07-28 11:55                               ` Dani Moncayo
                                                 ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-28 10:06 UTC (permalink / raw)
  To: dmoncayo, lekktu; +Cc: 12055

> Date: Sat, 28 Jul 2012 11:04:29 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> 
> > Date: Sat, 28 Jul 2012 03:12:12 +0200
> > From: Dani Moncayo <dmoncayo@gmail.com>
> > Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> > 
> > > Please
> > > post here the exact output, and please tell for each pair of such
> > > messages which character did you type.
> > 
> > Sorry for the delay.  I've not had time until now.
> > 
> > Here is my data:
> 
> Thanks to both of you.  Now I see that my theory is correct, and I can
> sit down and code the solution for this problem.

Please try the patch below.  It works for me.

Please try it also when Unicode input is not used (it is used by
default on Windows NT and later, as a result of this patch).  You can
do that by forcing w32_console_unicode_input to zero (either by
modifying the source of w32console.c and rebuilding, or by setting the
variable's value in GDB).
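
The order in which the patched key_event chooses how to interpret a
character can be sketched in isolation as follows.  This is a
simplified model and not the Emacs code: the enum, the function name
and the separate parameters are hypothetical, whereas the real code
reads both values from the same uChar union.

```c
#include <stdint.h>

/* Hypothetical labels for the three outcomes in the patched code.  */
enum demo_event_kind
{
  DEMO_ASCII,             /* pure ASCII, below 128 */
  DEMO_UNICODE,           /* UnicodeChar trusted as a codepoint */
  DEMO_CODEPAGE_FALLBACK  /* decode via the console input codepage */
};

/* Mirror the branch order of the patch: ASCII first, then Unicode
   (only when wide-character console input is in use), and otherwise
   the codepage-based fallback.  */
enum demo_event_kind
classify_char_event (signed char ascii_char, uint16_t unicode_char,
                     int unicode_input)
{
  if (ascii_char > 0)
    return DEMO_ASCII;
  if (unicode_char > 0 && unicode_input)
    return DEMO_UNICODE;
  return DEMO_CODEPAGE_FALLBACK;
}
```

Forcing the flag to zero, as suggested above, sends every non-ASCII
character down the fallback path, which is the case worth testing
separately.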

TIA


=== modified file 'lisp/international/mule-cmds.el'
--- lisp/international/mule-cmds.el	2012-07-25 23:11:23 +0000
+++ lisp/international/mule-cmds.el	2012-07-28 09:43:40 +0000
@@ -2655,23 +2655,29 @@ See also `locale-charset-language-names'
 
     ;; On Windows, override locale-coding-system,
     ;; default-file-name-coding-system, keyboard-coding-system,
-    ;; terminal-coding-system with system codepage.
+    ;; terminal-coding-system with the appropriate codepages.
     (when (boundp 'w32-ansi-code-page)
-      (let ((code-page-coding (intern (format "cp%d" w32-ansi-code-page))))
-	(when (coding-system-p code-page-coding)
-	  (unless frame (setq locale-coding-system code-page-coding))
-	  (set-keyboard-coding-system code-page-coding frame)
-	  (set-terminal-coding-system code-page-coding frame)
-	  ;; Set default-file-name-coding-system last, so that Emacs
-	  ;; doesn't try to use cpNNNN when it defines keyboard and
-	  ;; terminal encoding.  That's because the above two lines
-	  ;; will want to load code-pages.el, where cpNNNN are
-	  ;; defined; if default-file-name-coding-system were set to
-	  ;; cpNNNN while these two lines run, Emacs will want to use
-	  ;; it for encoding the file name it wants to load.  And that
-	  ;; will fail, since cpNNNN is not yet usable until
-	  ;; code-pages.el finishes loading.
-	  (setq default-file-name-coding-system code-page-coding))))
+      (let ((ansi-code-page-coding (intern (format "cp%d" w32-ansi-code-page)))
+	    (oem-code-page-coding
+	     (intern (format "cp%d" (w32-get-console-codepage))))
+	    ansi-cs-p oem-cs-p)
+	(and (coding-system-p ansi-code-page-coding)
+	     (setq ansi-cs-p t))
+	(and (coding-system-p oem-code-page-coding)
+	     (setq oem-cs-p t))
+	;; Set the keyboard and display encoding to either the current
+	;; ANSI codepage of the OEM codepage, depending on whether
+	;; this is a GUI or a TTY frame.
+	(when ansi-cs-p
+	  (unless frame (setq locale-coding-system ansi-code-page-coding))
+	  (when (display-graphic-p frame)
+	    (set-keyboard-coding-system ansi-code-page-coding frame)
+	    (set-terminal-coding-system ansi-code-page-coding frame))
+	  (setq default-file-name-coding-system ansi-code-page-coding))
+	(when oem-cs-p
+	  (unless (display-graphic-p frame)
+	    (set-keyboard-coding-system oem-code-page-coding frame)
+	    (set-terminal-coding-system oem-code-page-coding frame)))))
 
     (when (eq system-type 'darwin)
       ;; On Darwin, file names are always encoded in utf-8, no matter

=== modified file 'src/w32console.c'
--- src/w32console.c	2012-06-28 07:50:27 +0000
+++ src/w32console.c	2012-07-28 09:48:41 +0000
@@ -37,6 +37,7 @@ along with GNU Emacs.  If not, see <http
 #include "termhooks.h"
 #include "termchar.h"
 #include "dispextern.h"
+#include "w32heap.h"	/* for os_subtype */
 #include "w32inevt.h"
 
 /* from window.c */
@@ -67,6 +68,7 @@ static CONSOLE_CURSOR_INFO prev_console_
 #endif
 
 HANDLE  keyboard_handle;
+int w32_console_unicode_input;
 
 
 /* Setting this as the ctrl handler prevents emacs from being killed when
@@ -786,6 +788,11 @@ initialize_w32_display (struct terminal 
 		       info.srWindow.Left);
     }
 
+  if (os_subtype == OS_NT)
+    w32_console_unicode_input = 1;
+  else
+    w32_console_unicode_input = 0;
+
   /* Setup w32_display_info structure for this frame. */
 
   w32_initialize_display_info (build_string ("Console"));

=== modified file 'src/w32inevt.c'
--- src/w32inevt.c	2012-05-26 11:58:19 +0000
+++ src/w32inevt.c	2012-07-28 09:57:11 +0000
@@ -41,6 +41,7 @@ along with GNU Emacs.  If not, see <http
 #include "termchar.h"
 #include "w32heap.h"
 #include "w32term.h"
+#include "w32inevt.h"
 
 /* stdin, from w32console.c */
 extern HANDLE keyboard_handle;
@@ -61,6 +62,15 @@ static INPUT_RECORD *queue_ptr = event_q
 /* Temporarily store lead byte of DBCS input sequences.  */
 static char dbcs_lead = 0;
 
+static inline BOOL
+w32_read_console_input (HANDLE h, INPUT_RECORD *rec, DWORD recsize,
+			DWORD *waiting)
+{
+  return (w32_console_unicode_input
+	  ? ReadConsoleInputW (h, rec, recsize, waiting)
+	  : ReadConsoleInputA (h, rec, recsize, waiting));
+}
+
 static int
 fill_queue (BOOL block)
 {
@@ -80,8 +90,8 @@ fill_queue (BOOL block)
 	return 0;
     }
 
-  rc = ReadConsoleInput (keyboard_handle, event_queue, EVENT_QUEUE_SIZE,
-			 &events_waiting);
+  rc = w32_read_console_input (keyboard_handle, event_queue, EVENT_QUEUE_SIZE,
+			       &events_waiting);
   if (!rc)
     return -1;
   queue_ptr = event_queue;
@@ -224,7 +234,7 @@ w32_kbd_patch_key (KEY_EVENT_RECORD *eve
 #endif
 
   /* On NT, call ToUnicode instead and then convert to the current
-     locale's default codepage.  */
+     console input codepage.  */
   if (os_subtype == OS_NT)
     {
       WCHAR buf[128];
@@ -233,14 +243,9 @@ w32_kbd_patch_key (KEY_EVENT_RECORD *eve
 			  keystate, buf, 128, 0);
       if (isdead > 0)
 	{
-	  char cp[20];
-	  int cpId;
+	  int cpId = GetConsoleCP ();
 
 	  event->uChar.UnicodeChar = buf[isdead - 1];
-
-	  GetLocaleInfo (GetThreadLocale (),
-			 LOCALE_IDEFAULTANSICODEPAGE, cp, 20);
-	  cpId = atoi (cp);
 	  isdead = WideCharToMultiByte (cpId, 0, buf, isdead,
 					ansi_code, 4, NULL, NULL);
 	}
@@ -447,26 +452,34 @@ key_event (KEY_EVENT_RECORD *event, stru
 	}
       else if (event->uChar.AsciiChar > 0)
 	{
+	  /* Pure ASCII characters < 128.  */
 	  emacs_ev->kind = ASCII_KEYSTROKE_EVENT;
 	  emacs_ev->code = event->uChar.AsciiChar;
 	}
-      else if (event->uChar.UnicodeChar > 0)
+      else if (event->uChar.UnicodeChar > 0
+	       && w32_console_unicode_input)
 	{
+	  /* Unicode codepoint; only valid if we are using Unicode
+	     console input mode.  */
 	  emacs_ev->kind = MULTIBYTE_CHAR_KEYSTROKE_EVENT;
 	  emacs_ev->code = event->uChar.UnicodeChar;
 	}
       else
 	{
-	  /* Fallback for non-Unicode versions of Windows.  */
+	  /* Fallback handling of non-ASCII characters for non-Unicode
+	     versions of Windows, and for non-Unicode input on NT
+	     family of Windows.  Only characters in the current
+	     console codepage are supported by this fallback.  */
 	  wchar_t code;
 	  char dbcs[2];
-          char cp[20];
           int cpId;
 
-	  /* Get the codepage to interpret this key with.  */
-          GetLocaleInfo (GetThreadLocale (),
-			 LOCALE_IDEFAULTANSICODEPAGE, cp, 20);
-          cpId = atoi (cp);
+	  /* Get the current console input codepage to interpret this
+	     key with.  Note that the system defaults for the OEM
+	     codepage could have been changed by calling SetConsoleCP
+	     or w32-set-console-codepage, so using GetLocaleInfo to
+	     get LOCALE_IDEFAULTCODEPAGE is not TRT here.  */
+          cpId = GetConsoleCP ();
 
 	  dbcs[0] = dbcs_lead;
 	  dbcs[1] = event->uChar.AsciiChar;
@@ -501,6 +514,7 @@ key_event (KEY_EVENT_RECORD *event, stru
     }
   else
     {
+      /* Function keys and other non-character keys.  */
       emacs_ev->kind = NON_ASCII_KEYSTROKE_EVENT;
       emacs_ev->code = event->wVirtualKeyCode;
     }

=== modified file 'src/w32inevt.h'
--- src/w32inevt.h	2012-01-19 07:21:25 +0000
+++ src/w32inevt.h	2012-07-28 08:39:49 +0000
@@ -19,6 +19,8 @@ along with GNU Emacs.  If not, see <http
 #ifndef EMACS_W32INEVT_H
 #define EMACS_W32INEVT_H
 
+extern int w32_console_unicode_input;
+
 extern int w32_console_read_socket (struct terminal *term, int numchars,
 				    struct input_event *hold_quit);
 extern void w32_console_mouse_position (FRAME_PTR *f, int insist,






^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 10:06                             ` Eli Zaretskii
@ 2012-07-28 11:55                               ` Dani Moncayo
  2012-07-28 12:23                                 ` bug#12055: " Eli Zaretskii
  2012-07-28 12:30                                 ` bug#12055: " Eli Zaretskii
  2012-07-28 13:57                               ` Dani Moncayo
  2012-07-28 16:11                               ` Juanma Barranquero
  2 siblings, 2 replies; 41+ messages in thread
From: Dani Moncayo @ 2012-07-28 11:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lekktu, 12055

> Please try the patch below.  It works for me.

I'm having problems applying your patch to my trunk branch
(updated right now).

This is what I'm trying to do (from an Emacs -Q):
1. Copy your patch and paste it in a new Emacs buffer, and save it to
a file "patch.diff" (with UNIX-type EOL format).
2. Go to each hunk and type "C-c C-a".

This is failing for me in the hunks that begin with:
  @@ -786,6 +788,11 @@ initialize_w32_display (struct terminal
  @@ -61,6 +62,15 @@ static INPUT_RECORD *queue_ptr = event_q
  @@ -80,8 +90,8 @@ fill_queue (BOOL block)
  @@ -447,26 +452,34 @@ key_event (KEY_EVENT_RECORD *event, stru

For these hunks, I receive the error message "Can't find the text to patch".

And another oddity: for the last hunk in the patch (which affects the
file "src/w32inevt.h"), the patch is apparently applied (I see the
message "hunk applied"), but if I look at the corresponding buffer,
the patch is not applied (the added line "extern int
w32_console_unicode_input;" is not there).

Am I missing something?
Is this an Emacs bug?


-- 
Dani Moncayo





^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#12055: Re: bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 11:55                               ` Dani Moncayo
@ 2012-07-28 12:23                                 ` Eli Zaretskii
  2012-07-28 12:49                                   ` Dani Moncayo
  2012-07-28 12:30                                 ` bug#12055: " Eli Zaretskii
  1 sibling, 1 reply; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-28 12:23 UTC (permalink / raw)
  To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Sat, 28 Jul 2012 13:55:41 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> 
> > Please try the patch below.  It works for me.
> 
> I'm having problems applying your patch to my trunk branch
> (updated right now).
> [...]
> Am I missing something?
> Is this an Emacs bug?

I have no idea, but can we please solve bugs one at a time?  Can you
apply the patch outside Emacs, by using the Patch utility?  The
command you should type at the shell prompt should be:

  patch --binary -p0 < patch.diff

This command should be issued from the root directory of the Emacs
tree, the one that has src and lisp as its subdirectories.

It should also work to do this from inside Emacs, like this:

 . put the region around the diffs I sent

 . type this command:

    C-x RET c unix RET M-| patch -d /path/to/emacs/root/dir --binary -p0

Thanks.





^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#12055: Re: bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 11:55                               ` Dani Moncayo
  2012-07-28 12:23                                 ` bug#12055: " Eli Zaretskii
@ 2012-07-28 12:30                                 ` Eli Zaretskii
  1 sibling, 0 replies; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-28 12:30 UTC (permalink / raw)
  To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Sat, 28 Jul 2012 13:55:41 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> 
> This is what I'm trying to do (from an Emacs -Q):
> 1. Copy your patch and paste it in a new Emacs buffer, and save it to
> a file "patch.diff" (with UNIX-type EOL format).
> 2. Go to each hunk and type "C-c C-a".
> 
> This is failing for me in the hunks that begin with:
>   @@ -786,6 +788,11 @@ initialize_w32_display (struct terminal
>   @@ -61,6 +62,15 @@ static INPUT_RECORD *queue_ptr = event_q
>   @@ -80,8 +90,8 @@ fill_queue (BOOL block)
>   @@ -447,26 +452,34 @@ key_event (KEY_EVENT_RECORD *event, stru
> 
> For these hunks, I receive the error message "Can't find the text to patch".

Perhaps your copy/paste procedure didn't preserve the TAB characters,
converting them into spaces instead.  Can "C-c C-a" ignore whitespace
changes?  If not, Patch can, if you use the -l (the letter ell, not
the digit 1) option.





^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#12055: Re: bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 12:23                                 ` bug#12055: " Eli Zaretskii
@ 2012-07-28 12:49                                   ` Dani Moncayo
  2012-07-28 15:02                                     ` bug#12055: " Eli Zaretskii
  0 siblings, 1 reply; 41+ messages in thread
From: Dani Moncayo @ 2012-07-28 12:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lekktu, 12055

> Can you
> apply the patch outside Emacs, by using the Patch utility?

I'm sorry, but the patch utility is giving me problems too :(

I've installed the GnuWin32 version, and added its "bin" directory
(where the "patch.exe" program is) to my system PATH.

After doing this, if I open a cmd console and type "patch", I see a
dialog box from Windows 7 asking me to allow the execution of the
program.  I click "yes" and then a new console window is opened, with
no text inside it.

Where can I download a working "patch" utility for Windows?

-- 
Dani Moncayo





^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 10:06                             ` Eli Zaretskii
  2012-07-28 11:55                               ` Dani Moncayo
@ 2012-07-28 13:57                               ` Dani Moncayo
  2012-07-28 16:07                                 ` Juanma Barranquero
  2012-07-28 16:11                               ` Juanma Barranquero
  2 siblings, 1 reply; 41+ messages in thread
From: Dani Moncayo @ 2012-07-28 13:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lekktu, 12055

[-- Attachment #1: Type: text/plain, Size: 1585 bytes --]

> Please try the patch below.  It works for me.

Well, I've finally managed to install a working "patch" utility for
Windows (MinGW has a package for it).

I've applied the patch and built the branch.

Now, after starting emacs (-Q -nw) and typing "áéíóúñç", Emacs shows
different symbols (see attached screenshot), but if I copy them and
paste them here, the pasted symbols are the correct ones (áéíóúñç),
instead of the ones I see on the screen.

Also, if I go to one char, for example the "á", and do C-u C-x =,
Emacs says (*):

             position: 192 of 196 (97%), column: 0
            character: á (displayed as á) (codepoint 225, #o341, #xe1)
    preferred charset: iso-8859-1 (Latin-1 (ISO/IEC 8859-1))
code point in charset: 0xE1
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), c:Chinese,
j:Japanese, l:Latin, v:Viet
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC3 #xA1
            file code: #xE1 (encoded by coding system iso-latin-1-dos)
              display: terminal code #xE1

Character code properties: customize what to show
  name: LATIN SMALL LETTER A WITH ACUTE
  old-name: LATIN SMALL LETTER A ACUTE
  general-category: Ll (Letter, Lowercase)
  decomposition: (97 769) ('a' '́')

There are text properties here:
  fontified            t


-------
(*) Although, as I said, in the second line I don't see the "á"
symbols on the screen, but other symbols (like a Greek "beta").

-- 
Dani Moncayo

[-- Attachment #2: img3.png --]
[-- Type: image/png, Size: 18600 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-26 12:13 bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal Dani Moncayo
  2012-07-26 16:13 ` Eli Zaretskii
@ 2012-07-28 14:12 ` Dani Moncayo
  2012-07-28 15:01   ` bug#12055: " Eli Zaretskii
  1 sibling, 1 reply; 41+ messages in thread
From: Dani Moncayo @ 2012-07-28 14:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lekktu, 12055

> I've applied the patch and built the branch.
>
> Now, after starting emacs (-Q -nw) and typing "áéíóúñç", Emacs shows
> different symbols (see attached screenshot)...

And after doing "C-x RET t cp850 RET", everything seems to work fine.
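
A likely explanation, assuming the terminal coding system was still
the ANSI codepage cp1252 while the console rendered cp850: "á"
(U+00E1) is encoded as byte 0xE1 in cp1252, which agrees with the
"terminal code #xE1" shown by C-u C-x =, and byte 0xE1 in cp850 is
"ß", which looks much like a Greek beta.  The sketch below uses
hypothetical helper names and a deliberately partial cp850 table:

```c
#include <stdint.h>

/* Encode a codepoint in cp1252.  Only ASCII and the range
   U+00A0..U+00FF are handled, where cp1252 bytes equal the
   codepoint; returns 1 on success, 0 for anything else.  */
int
encode_cp1252_subset (uint16_t codepoint, uint8_t *out)
{
  if (codepoint < 0x80
      || (codepoint >= 0x00A0 && codepoint <= 0x00FF))
    {
      *out = (uint8_t) codepoint;
      return 1;
    }
  return 0;
}

/* Codepoint of the glyph a cp850 console shows for BYTE.  Partial
   table for illustration; returns 0 for bytes it does not cover.  */
uint16_t
cp850_glyph (uint8_t byte)
{
  switch (byte)
    {
    case 0x82: return 0x00E9;  /* e with acute */
    case 0xA0: return 0x00E1;  /* a with acute */
    case 0xE1: return 0x00DF;  /* sharp s, similar to Greek beta */
    case 0xE9: return 0x00DA;  /* capital U with acute */
    default:   return byte < 0x80 ? byte : 0;
    }
}
```

Under that assumption, encoding with cp1252 and displaying with cp850
turns "á" into "ß", while encoding with cp850 sends byte 0xA0, which
the console shows correctly.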

-- 
Dani Moncayo





^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 14:12 ` Dani Moncayo
@ 2012-07-28 15:01   ` Eli Zaretskii
  2012-07-28 15:23     ` Dani Moncayo
  0 siblings, 1 reply; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-28 15:01 UTC (permalink / raw)
  To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Sat, 28 Jul 2012 16:12:59 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> 
> > I've applied the patch and built the branch.
> >
> > Now, after starting emacs (-Q -nw) and typing "áéíóúñç", Emacs shows
> > different symbols (see attached screenshot)...
> 
> And after doing "C-x RET t cp850 RET", everything seems to work fine.

Did you compile international/mule-cmds.el (which was modified by the
patch) and did you re-dump Emacs after byte-compiling mule-cmds.el?

If you did all that, what is the value you get by evaluating
(terminal-coding-system), and what is the value you get from
w32-get-console-codepage?






^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#12055: Re: Re: bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 12:49                                   ` Dani Moncayo
@ 2012-07-28 15:02                                     ` Eli Zaretskii
  0 siblings, 0 replies; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-28 15:02 UTC (permalink / raw)
  To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Sat, 28 Jul 2012 14:49:34 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> 
> > Can you
> > apply the patch outside Emacs, by using the Patch utility?
> 
> I'm sorry, but the patch utility is giving me problems too :(
> 
> I've installed the GnuWin32 version, and added its "bin" directory
> (where the "patch.exe" program is) to my system PATH.
> 
> After doing this, if I open a cmd console and type "patch", I see a
> dialog box from Windows 7 asking me to allow the execution of the
> program.  I click "yes" and then a new console window is opened, with
> no text inside it.

This is UAC in action.  You need a manifest for Patch.





^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 15:01   ` bug#12055: " Eli Zaretskii
@ 2012-07-28 15:23     ` Dani Moncayo
  2012-07-28 15:34       ` Dani Moncayo
  2012-07-28 15:35       ` Eli Zaretskii
  0 siblings, 2 replies; 41+ messages in thread
From: Dani Moncayo @ 2012-07-28 15:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lekktu, 12055

[-- Attachment #1: Type: text/plain, Size: 942 bytes --]

> Did you compile international/mule-cmds.el (which was modified by the
> patch)

Yes.

> and did you re-dump Emacs after byte-compiling mule-cmds.el?

I don't know.  This is exactly what I did:
1. I applied your patch to my branch (the whole patch, which includes
the file you mention).
2. To make sure that the patch was correctly applied, I compared your
patch with the output of "bzr diff" (I'm attaching this output if you
want to check it).
3. I went to the "nt" subdirectory and ran a plain "mingw32-make".  I
thought that the build system would know what to recompile based on
what files have changed since the last time.
4. I ran a "mingw32-make install".

> If you did all that, what is the value you get by evaluating
> (terminal-coding-system)

cp1252

>, and what is the value you get from
> w32-get-console-codepage?

"w32-get-console-codepage" is void as a variable.
"(w32-get-console-codepage)" returns 850.

-- 
Dani Moncayo

[-- Attachment #2: bzr-diff --]
[-- Type: application/octet-stream, Size: 7267 bytes --]

=== modified file 'lisp/international/mule-cmds.el'
--- lisp/international/mule-cmds.el	2012-07-25 23:11:23 +0000
+++ lisp/international/mule-cmds.el	2012-07-28 13:21:52 +0000
@@ -2655,23 +2655,29 @@
 
     ;; On Windows, override locale-coding-system,
     ;; default-file-name-coding-system, keyboard-coding-system,
-    ;; terminal-coding-system with system codepage.
+    ;; terminal-coding-system with the appropriate codepages.
     (when (boundp 'w32-ansi-code-page)
-      (let ((code-page-coding (intern (format "cp%d" w32-ansi-code-page))))
-	(when (coding-system-p code-page-coding)
-	  (unless frame (setq locale-coding-system code-page-coding))
-	  (set-keyboard-coding-system code-page-coding frame)
-	  (set-terminal-coding-system code-page-coding frame)
-	  ;; Set default-file-name-coding-system last, so that Emacs
-	  ;; doesn't try to use cpNNNN when it defines keyboard and
-	  ;; terminal encoding.  That's because the above two lines
-	  ;; will want to load code-pages.el, where cpNNNN are
-	  ;; defined; if default-file-name-coding-system were set to
-	  ;; cpNNNN while these two lines run, Emacs will want to use
-	  ;; it for encoding the file name it wants to load.  And that
-	  ;; will fail, since cpNNNN is not yet usable until
-	  ;; code-pages.el finishes loading.
-	  (setq default-file-name-coding-system code-page-coding))))
+      (let ((ansi-code-page-coding (intern (format "cp%d" w32-ansi-code-page)))
+           (oem-code-page-coding
+            (intern (format "cp%d" (w32-get-console-codepage))))
+           ansi-cs-p oem-cs-p)
+       (and (coding-system-p ansi-code-page-coding)
+            (setq ansi-cs-p t))
+       (and (coding-system-p oem-code-page-coding)
+            (setq oem-cs-p t))
+       ;; Set the keyboard and display encoding to either the current
+       ;; ANSI codepage or the OEM codepage, depending on whether
+       ;; this is a GUI or a TTY frame.
+       (when ansi-cs-p
+         (unless frame (setq locale-coding-system ansi-code-page-coding))
+         (when (display-graphic-p frame)
+           (set-keyboard-coding-system ansi-code-page-coding frame)
+           (set-terminal-coding-system ansi-code-page-coding frame))
+         (setq default-file-name-coding-system ansi-code-page-coding))
+       (when oem-cs-p
+         (unless (display-graphic-p frame)
+           (set-keyboard-coding-system oem-code-page-coding frame)
+           (set-terminal-coding-system oem-code-page-coding frame)))))
 
     (when (eq system-type 'darwin)
       ;; On Darwin, file names are always encoded in utf-8, no matter

=== modified file 'src/w32console.c'
--- src/w32console.c	2012-06-28 07:50:27 +0000
+++ src/w32console.c	2012-07-28 13:21:52 +0000
@@ -37,6 +37,7 @@
 #include "termhooks.h"
 #include "termchar.h"
 #include "dispextern.h"
+#include "w32heap.h"   /* for os_subtype */
 #include "w32inevt.h"
 
 /* from window.c */
@@ -67,6 +68,7 @@
 #endif
 
 HANDLE  keyboard_handle;
+int w32_console_unicode_input;
 
 
 /* Setting this as the ctrl handler prevents emacs from being killed when
@@ -786,6 +788,11 @@
 		       info.srWindow.Left);
     }
 
+  if (os_subtype == OS_NT)
+    w32_console_unicode_input = 1;
+  else
+    w32_console_unicode_input = 0;
+
   /* Setup w32_display_info structure for this frame. */
 
   w32_initialize_display_info (build_string ("Console"));

=== modified file 'src/w32inevt.c'
--- src/w32inevt.c	2012-05-26 11:58:19 +0000
+++ src/w32inevt.c	2012-07-28 13:21:52 +0000
@@ -41,6 +41,7 @@
 #include "termchar.h"
 #include "w32heap.h"
 #include "w32term.h"
+#include "w32inevt.h"
 
 /* stdin, from w32console.c */
 extern HANDLE keyboard_handle;
@@ -61,6 +62,15 @@
 /* Temporarily store lead byte of DBCS input sequences.  */
 static char dbcs_lead = 0;
 
+static inline BOOL
+w32_read_console_input (HANDLE h, INPUT_RECORD *rec, DWORD recsize,
+                       DWORD *waiting)
+{
+  return (w32_console_unicode_input
+         ? ReadConsoleInputW (h, rec, recsize, waiting)
+         : ReadConsoleInputA (h, rec, recsize, waiting));
+}
+
 static int
 fill_queue (BOOL block)
 {
@@ -80,8 +90,8 @@
 	return 0;
     }
 
-  rc = ReadConsoleInput (keyboard_handle, event_queue, EVENT_QUEUE_SIZE,
-			 &events_waiting);
+  rc = w32_read_console_input (keyboard_handle, event_queue, EVENT_QUEUE_SIZE,
+                              &events_waiting);
   if (!rc)
     return -1;
   queue_ptr = event_queue;
@@ -224,7 +234,7 @@
 #endif
 
   /* On NT, call ToUnicode instead and then convert to the current
-     locale's default codepage.  */
+     console input codepage.  */
   if (os_subtype == OS_NT)
     {
       WCHAR buf[128];
@@ -233,14 +243,9 @@
 			  keystate, buf, 128, 0);
       if (isdead > 0)
 	{
-	  char cp[20];
-	  int cpId;
+         int cpId = GetConsoleCP ();
 
 	  event->uChar.UnicodeChar = buf[isdead - 1];
-
-	  GetLocaleInfo (GetThreadLocale (),
-			 LOCALE_IDEFAULTANSICODEPAGE, cp, 20);
-	  cpId = atoi (cp);
 	  isdead = WideCharToMultiByte (cpId, 0, buf, isdead,
 					ansi_code, 4, NULL, NULL);
 	}
@@ -447,26 +452,34 @@
 	}
       else if (event->uChar.AsciiChar > 0)
 	{
+         /* Pure ASCII characters < 128.  */
 	  emacs_ev->kind = ASCII_KEYSTROKE_EVENT;
 	  emacs_ev->code = event->uChar.AsciiChar;
 	}
-      else if (event->uChar.UnicodeChar > 0)
+      else if (event->uChar.UnicodeChar > 0
+              && w32_console_unicode_input)
 	{
+         /* Unicode codepoint; only valid if we are using Unicode
+            console input mode.  */
 	  emacs_ev->kind = MULTIBYTE_CHAR_KEYSTROKE_EVENT;
 	  emacs_ev->code = event->uChar.UnicodeChar;
 	}
       else
 	{
-	  /* Fallback for non-Unicode versions of Windows.  */
+         /* Fallback handling of non-ASCII characters for non-Unicode
+            versions of Windows, and for non-Unicode input on NT
+            family of Windows.  Only characters in the current
+            console codepage are supported by this fallback.  */
 	  wchar_t code;
 	  char dbcs[2];
-          char cp[20];
           int cpId;
 
-	  /* Get the codepage to interpret this key with.  */
-          GetLocaleInfo (GetThreadLocale (),
-			 LOCALE_IDEFAULTANSICODEPAGE, cp, 20);
-          cpId = atoi (cp);
+         /* Get the current console input codepage to interpret this
+            key with.  Note that the system defaults for the OEM
+            codepage could have been changed by calling SetConsoleCP
+            or w32-set-console-codepage, so using GetLocaleInfo to
+            get LOCALE_IDEFAULTCODEPAGE is not TRT here.  */
+          cpId = GetConsoleCP ();
 
 	  dbcs[0] = dbcs_lead;
 	  dbcs[1] = event->uChar.AsciiChar;
@@ -501,6 +514,7 @@
     }
   else
     {
+      /* Function keys and other non-character keys.  */
       emacs_ev->kind = NON_ASCII_KEYSTROKE_EVENT;
       emacs_ev->code = event->wVirtualKeyCode;
     }

=== modified file 'src/w32inevt.h'
--- src/w32inevt.h	2012-01-19 07:21:25 +0000
+++ src/w32inevt.h	2012-07-28 13:21:52 +0000
@@ -19,6 +19,8 @@
 #ifndef EMACS_W32INEVT_H
 #define EMACS_W32INEVT_H
 
+extern int w32_console_unicode_input;
+
 extern int w32_console_read_socket (struct terminal *term, int numchars,
 				    struct input_event *hold_quit);
 extern void w32_console_mouse_position (FRAME_PTR *f, int insist,



* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 15:23     ` Dani Moncayo
@ 2012-07-28 15:34       ` Dani Moncayo
  2012-07-28 16:27         ` bug#12055: " Eli Zaretskii
  2012-07-28 15:35       ` Eli Zaretskii
  1 sibling, 1 reply; 41+ messages in thread
From: Dani Moncayo @ 2012-07-28 15:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lekktu, 12055

On Sat, Jul 28, 2012 at 5:23 PM, Dani Moncayo <dmoncayo@gmail.com> wrote:
>> Did you compile international/mule-cmds.el (which was modified by the
>> patch)
>
> Yes.

I'm sorry.  I thought that the build process would do every needed compilation.

I've just recompiled the file "lisp/international/mule-cmds.elc" and
rebuilt Emacs ("mingw32-make" + "mingw32-make install").

Now everything seems to work well.

-- 
Dani Moncayo






* bug#12055: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 15:23     ` Dani Moncayo
  2012-07-28 15:34       ` Dani Moncayo
@ 2012-07-28 15:35       ` Eli Zaretskii
  2012-07-28 15:46         ` Dani Moncayo
  1 sibling, 1 reply; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-28 15:35 UTC (permalink / raw)
  To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Sat, 28 Jul 2012 17:23:19 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> 
> > Did you compile international/mule-cmds.el (which was modified by the
> > patch)
> 
> Yes.
> 
> > and did you re-dump Emacs after byte-compiling mule-cmds.el?
> 
> I don't know.  This is exactly what I did:
> 1. I applied your patch to my branch (the whole patch, which includes
> the file you mention).
> 2. To make sure that the patch was correctly applied, I compared your
> patch with the output of "bzr diff" (I'm attaching this output if you
> want to check it).
> 3. I went to the "nt" subdirectory and ran a plain "mingw32-make".  I
> thought that the build system would know what to recompile based on
> what files have changed since the last time.
> 4. I ran a "mingw32-make install".
> 
> > If you did all that, what is the value you get by evaluating
> > (terminal-coding-system)
> 
> cp1252
> 
> >, and what is the value you get from
> > w32-get-console-codepage?
> 
> "w32-get-console-codepage" is void as a variable.
> "(w32-get-console-codepage)" returns 850.

This looks as if the changes in mule-cmds didn't take place at all.
Please try rebuilding one more time, and please run emacs.exe from
src/oo-spd/i386 or src/oo/i386 (depending on whether you build it
optimized or not).






* bug#12055: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 15:35       ` Eli Zaretskii
@ 2012-07-28 15:46         ` Dani Moncayo
  0 siblings, 0 replies; 41+ messages in thread
From: Dani Moncayo @ 2012-07-28 15:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lekktu, 12055

> This looks as if the changes in mule-cmds didn't take place at all.
> Please try rebuilding one more time, and please run emacs.exe from
> src/oo-spd/i386 or src/oo/i386 (depending on whether you build it
> optimized or not).

Indeed.  As I said, after byte-compiling
"lisp/international/mule-cmds.el" and rebuilding Emacs, the
problems discussed in this thread seem to be solved, and I now get:

(terminal-coding-system) => cp850

(w32-get-console-codepage) => 850
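
The two values above can be compared mechanically.  As a minimal
sketch in portable C (the helper name coding_matches_console is
hypothetical and not part of Emacs), the check amounts to formatting
the console codepage as "cpNNN", the same naming the mule-cmds.el
hunk uses, and comparing it with the terminal coding system:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, for illustration only.  Returns 1 when the
   coding-system name reported by (terminal-coding-system) is the
   "cpNNN" name of the codepage number reported by
   (w32-get-console-codepage), and 0 otherwise.  */
int
coding_matches_console (const char *terminal_coding, int console_codepage)
{
  char expected[32];

  assert (terminal_coding != NULL);
  /* Build the expected name, e.g. 850 -> "cp850".  */
  snprintf (expected, sizeof expected, "cp%d", console_codepage);
  return strcmp (terminal_coding, expected) == 0;
}
```

With the values reported earlier in this thread, ("cp1252", 850)
yields 0 (the patched Lisp was not in effect), while ("cp850", 850)
yields 1.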


-- 
Dani Moncayo






* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 13:57                               ` Dani Moncayo
@ 2012-07-28 16:07                                 ` Juanma Barranquero
  2012-07-28 16:12                                   ` Dani Moncayo
  0 siblings, 1 reply; 41+ messages in thread
From: Juanma Barranquero @ 2012-07-28 16:07 UTC (permalink / raw)
  To: Dani Moncayo; +Cc: 12055

On Sat, Jul 28, 2012 at 3:57 PM, Dani Moncayo <dmoncayo@gmail.com> wrote:

> Well, I've finally managed to install a working "patch" utility for
> Windows (MinGW has a package for it).

Dani, if you're using Gmail, never copy a patch from the main message
window. While displaying the relevant message, use the "Show original"
option of the menu on the right, and copy from the original message. I
had no trouble applying Eli's patch with "bzr patch". The main message
window of Gmail murders tabs and wraps long lines at will.

    Juanma






* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 10:06                             ` Eli Zaretskii
  2012-07-28 11:55                               ` Dani Moncayo
  2012-07-28 13:57                               ` Dani Moncayo
@ 2012-07-28 16:11                               ` Juanma Barranquero
  2012-07-28 16:44                                 ` bug#12055: " Eli Zaretskii
  2 siblings, 1 reply; 41+ messages in thread
From: Juanma Barranquero @ 2012-07-28 16:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 12055

On Sat, Jul 28, 2012 at 12:06 PM, Eli Zaretskii <eliz@gnu.org> wrote:

> Please try the patch below.  It works for me.
>
> Please try it also when Unicode input is not used (it is by default on
> Windows NT and later, as a result of this patch).  You can do that by
> forcing w32_console_unicode_input to zero (either by modifying the
> source of w32console.c and rebuilding, or by setting the variable's
> value in GDB).

It works for me in both cases.

    Juanma






* bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 16:07                                 ` Juanma Barranquero
@ 2012-07-28 16:12                                   ` Dani Moncayo
  0 siblings, 0 replies; 41+ messages in thread
From: Dani Moncayo @ 2012-07-28 16:12 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: 12055

>> Well, I've finally managed to install a working "patch" utility for
>> Windows (MinGW has a package for it).
>
> Dani, if you're using Gmail, never copy a patch from the main message
> window. While displaying the relevant message, use the "Show original"
> option of the menu on the right, and copy from the original message. I
> had no trouble applying Eli's patch with "bzr patch". The main message
> window of Gmail murders tabs and wraps long lines at will.

Thank you so much!

And bzr integrates a patch utility... good to know.

-- 
Dani Moncayo






* bug#12055: Re: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 15:34       ` Dani Moncayo
@ 2012-07-28 16:27         ` Eli Zaretskii
  0 siblings, 0 replies; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-28 16:27 UTC (permalink / raw)
  To: Dani Moncayo; +Cc: lekktu, 12055

> Date: Sat, 28 Jul 2012 17:34:34 +0200
> From: Dani Moncayo <dmoncayo@gmail.com>
> Cc: lekktu@gmail.com, 12055@debbugs.gnu.org
> 
> I'm sorry.  I thought that the build process would do every needed compilation.

Never mind that.

> I've just recompiled the file "lisp/international/mule-cmds.elc" and
> rebuilt Emacs ("mingw32-make" + "mingw32-make install").
> 
> Now everything seems to work well.

Thanks!  I will wait for Juanma to confirm these good results, before
committing.

The actual changes I will install include one more subtlety that I
missed: the encoding of console input and output on Windows can
generally be different, so mule-cmds.el needs one more small tweak.
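
To illustrate that last point: on Windows the console input codepage
and the console output codepage are reported separately (by
GetConsoleCP and GetConsoleOutputCP, respectively).  The following is
only a sketch in portable C of what selecting them per frame could
look like; the type and function names are hypothetical, the codepage
numbers are passed in rather than queried, and the change actually
installed may be structured differently:

```c
#include <assert.h>

/* Hypothetical result type: the codepage numbers chosen for one frame.  */
struct frame_codepages
{
  int keyboard;   /* codepage used to decode keyboard input */
  int terminal;   /* codepage used to encode text sent to the display */
};

/* Hypothetical helper mirroring the idea of the mule-cmds.el hunk:
   GUI frames use the ANSI codepage for both directions, while TTY
   frames use the console codepages, which may differ for input and
   output.  */
struct frame_codepages
choose_frame_codepages (int is_graphic, int ansi_cp,
                        int console_input_cp, int console_output_cp)
{
  struct frame_codepages cps;

  assert (ansi_cp > 0 && console_input_cp > 0 && console_output_cp > 0);
  if (is_graphic)
    {
      cps.keyboard = ansi_cp;
      cps.terminal = ansi_cp;
    }
  else
    {
      cps.keyboard = console_input_cp;
      cps.terminal = console_output_cp;
    }
  return cps;
}
```

For a TTY frame with ANSI codepage 1252, console input codepage 850
and console output codepage 437, this picks 850 for the keyboard and
437 for the terminal; for a GUI frame it picks 1252 for both.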






* bug#12055: Re: bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 16:11                               ` Juanma Barranquero
@ 2012-07-28 16:44                                 ` Eli Zaretskii
  2012-07-28 17:01                                   ` Eli Zaretskii
  0 siblings, 1 reply; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-28 16:44 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: 12055

> From: Juanma Barranquero <lekktu@gmail.com>
> Date: Sat, 28 Jul 2012 18:11:51 +0200
> Cc: dmoncayo@gmail.com, 12055@debbugs.gnu.org
> 
> On Sat, Jul 28, 2012 at 12:06 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> > Please try the patch below.  It works for me.
> >
> > Please try it also when Unicode input is not used (it is by default on
> > Windows NT and later, as a result of this patch).  You can do that by
> > forcing w32_console_unicode_input to zero (either by modifying the
> > source of w32console.c and rebuilding, or by setting the variable's
> > value in GDB).
> 
> It works for me in both cases.

Thanks, I will install it now.






* bug#12055: Re: bug#12055: Re: bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal
  2012-07-28 16:44                                 ` bug#12055: " Eli Zaretskii
@ 2012-07-28 17:01                                   ` Eli Zaretskii
  0 siblings, 0 replies; 41+ messages in thread
From: Eli Zaretskii @ 2012-07-28 17:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lekktu, 12055-done

> Date: Sat, 28 Jul 2012 19:44:31 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: 12055@debbugs.gnu.org
> 
> Thanks, I will install it now.

Done as trunk revision 109251.

Thanks to both of you, and to Jason, for helping resolve this tricky
problem.






Thread overview: 41+ messages
2012-07-26 12:13 bug#12055: 24.1.50; Characters "á" and "é" are not correctly displayed on a Windows terminal Dani Moncayo
2012-07-26 16:13 ` Eli Zaretskii
2012-07-26 16:24   ` Juanma Barranquero
2012-07-26 16:42     ` bug#12055: " Eli Zaretskii
2012-07-26 16:49       ` Juanma Barranquero
2012-07-26 17:18         ` bug#12055: " Eli Zaretskii
2012-07-26 18:09           ` Eli Zaretskii
2012-07-26 18:42             ` Juanma Barranquero
2012-07-26 18:29           ` Juanma Barranquero
2012-07-26 20:03             ` bug#12055: " Eli Zaretskii
2012-07-26 22:40               ` Dani Moncayo
2012-07-27  6:45                 ` bug#12055: " Eli Zaretskii
2012-07-27  8:35                   ` Dani Moncayo
2012-07-27  9:04                     ` bug#12055: " Eli Zaretskii
2012-07-27 15:12                       ` Eli Zaretskii
2012-07-27 16:46                         ` Jason Rumney
2012-07-27 18:03                           ` Eli Zaretskii
2012-07-27 18:22                             ` Eli Zaretskii
2012-07-27 23:45                         ` Juanma Barranquero
2012-07-28  1:12                         ` Dani Moncayo
2012-07-28  8:04                           ` bug#12055: " Eli Zaretskii
2012-07-28 10:06                             ` Eli Zaretskii
2012-07-28 11:55                               ` Dani Moncayo
2012-07-28 12:23                                 ` bug#12055: " Eli Zaretskii
2012-07-28 12:49                                   ` Dani Moncayo
2012-07-28 15:02                                     ` bug#12055: " Eli Zaretskii
2012-07-28 12:30                                 ` bug#12055: " Eli Zaretskii
2012-07-28 13:57                               ` Dani Moncayo
2012-07-28 16:07                                 ` Juanma Barranquero
2012-07-28 16:12                                   ` Dani Moncayo
2012-07-28 16:11                               ` Juanma Barranquero
2012-07-28 16:44                                 ` bug#12055: " Eli Zaretskii
2012-07-28 17:01                                   ` Eli Zaretskii
2012-07-26 16:44     ` Dani Moncayo
2012-07-28 14:12 ` Dani Moncayo
2012-07-28 15:01   ` bug#12055: " Eli Zaretskii
2012-07-28 15:23     ` Dani Moncayo
2012-07-28 15:34       ` Dani Moncayo
2012-07-28 16:27         ` bug#12055: " Eli Zaretskii
2012-07-28 15:35       ` Eli Zaretskii
2012-07-28 15:46         ` Dani Moncayo
