unofficial mirror of help-gnu-emacs@gnu.org
* Windows + Eshell: fixing character encoding?
@ 2009-07-28 10:19 Elena
  2009-07-28 17:35 ` Eli Zaretskii
       [not found] ` <mailman.3340.1248802527.2239.help-gnu-emacs@gnu.org>
  0 siblings, 2 replies; 17+ messages in thread
From: Elena @ 2009-07-28 10:19 UTC (permalink / raw)
  To: help-gnu-emacs

Hi,

when running Eshell on Windows, programs' output characters such as
'è', 'à', etc. are printed as \212,  \205, etc.

How can I see actual characters?

Thanks


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Windows + Eshell: fixing character encoding?
  2009-07-28 10:19 Windows + Eshell: fixing character encoding? Elena
@ 2009-07-28 17:35 ` Eli Zaretskii
       [not found] ` <mailman.3340.1248802527.2239.help-gnu-emacs@gnu.org>
  1 sibling, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2009-07-28 17:35 UTC (permalink / raw)
  To: help-gnu-emacs

> From: Elena <egarrulo@gmail.com>
> Date: Tue, 28 Jul 2009 03:19:51 -0700 (PDT)
> 
> when running Eshell on Windows, programs' output characters such as
> 'è', 'à', etc. are printed as \212,  \205, etc.
> 
> How can I see actual characters?

What is the value of buffer-file-coding-system in the Eshell buffer?

Also, if you go to one of these characters and type "C-u C-x =", what
does Emacs report about that character?






* Re: Windows + Eshell: fixing character encoding?
       [not found] ` <mailman.3340.1248802527.2239.help-gnu-emacs@gnu.org>
@ 2009-07-29  8:12   ` Elena
  2009-07-29 10:23     ` Peter Dyballa
  0 siblings, 1 reply; 17+ messages in thread
From: Elena @ 2009-07-29  8:12 UTC (permalink / raw)
  To: help-gnu-emacs

On 28 Jul, 17:35, Eli Zaretskii <e...@gnu.org> wrote:
> > From: Elena <egarr...@gmail.com>
> > Date: Tue, 28 Jul 2009 03:19:51 -0700 (PDT)
>
> > when running Eshell on Windows, programs' output characters such as
> > 'è', 'à', etc. are printed as \212,  \205, etc.
>
> > How can I see actual characters?
>
> What is the value of buffer-file-coding-system in the Eshell buffer?
>
> Also, if you go to one of these characters and type "C-u C-x =", what
> does Emacs report about that character?

buffer-file-coding-system is iso-latin-1-dos. I think it should be
iso8859-1, but the variable's description does not say it is
customizable.

"C-u C-x =" prints:

  character: … (133, #o205, #x85, U+0085)
    charset: eight-bit-control (8-bit control code (0x80..0x9F))
 code point: #x85
     syntax:   	which means: whitespace
buffer code: #x85
  file code: not encodable by coding system iso-latin-1-dos
    display: by this font (glyph code)
     -outline-Courier New-normal-r-normal-normal-13-97-96-96-c-*-iso8859-1 (#x85)

Following the documentation, I've tried customizing
"current-language-environment" to "Italian" and restarting Eshell,
but nothing changed.

Thanks.



* Re: Windows + Eshell: fixing character encoding?
  2009-07-29  8:12   ` Elena
@ 2009-07-29 10:23     ` Peter Dyballa
  2009-07-29 11:38       ` Elena Garrulo
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Dyballa @ 2009-07-29 10:23 UTC (permalink / raw)
  To: Elena; +Cc: help-gnu-emacs


On 29.07.2009 at 10:12, Elena wrote:

> buffer-file-coding-system is iso-latin-1-dos. I think it should be
> iso8859-1, but the variable's description does not say it is
> customizable.

Both are the same, just two different names.

>
> "C-u C-x =" prints:
>
>   character: … (133, #o205, #x85, U+0085)
>     charset: eight-bit-control (8-bit control code (0x80..0x9F))


How do you know that \205 and \212 stand for à and è etc.?

The NeXT encoding comes closest to your assumption:

	;   oct   dec   hex    UCS2    UTF-8
	;=====================================
	Ä = 205 = 133 = 85 = U+00C4 =    C3 84 : A diaeresis
	Ê = 212 = 138 = 8A = U+00CA =    C3 8A : E circumflex

In ISO Latin-1 or ISO 8859-1 the two characters are:

	à = 340 = 224 = E0 = U+00E0 =    C3 A0 : LATIN SMALL LETTER A WITH GRAVE
	è = 350 = 232 = E8 = U+00E8 =    C3 A8 : LATIN SMALL LETTER E WITH GRAVE
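
The columns in the tables above are one byte value written in octal, decimal, and hexadecimal. As a rough check, here is a minimal Python sketch (Python is used only as a calculator here, and the variable names are illustrative, not from the thread) that converts the octal escapes Emacs displays and puts them next to the bytes Latin-1 would need for the expected letters:

```python
# Octal escapes as displayed by Emacs in the Eshell buffer: \205 and \212.
observed = [int("205", 8), int("212", 8)]

# Bytes that ISO 8859-1 (Latin-1) uses for the letters the user expects.
expected_latin1 = list("àè".encode("latin-1"))

for seen, want in zip(observed, expected_latin1):
    print(f"seen {seen:#o} = {seen} = {seen:#x}; "
          f"Latin-1 needs {want:#o} = {want} = {want:#x}")
```

The observed values (133 and 138) and the Latin-1 values (224 and 232) do not coincide, which suggests the program's output is not Latin-1.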


--
Greetings

   Pete

Basic, n.:
	A programming language. Related to certain social diseases in
	that those who have it will not admit it in polite company.








* Re: Windows + Eshell: fixing character encoding?
  2009-07-29 10:23     ` Peter Dyballa
@ 2009-07-29 11:38       ` Elena Garrulo
  2009-07-29 16:10         ` Eli Zaretskii
                           ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Elena Garrulo @ 2009-07-29 11:38 UTC (permalink / raw)
  To: Peter Dyballa; +Cc: help-gnu-emacs

2009/7/29 Peter Dyballa <Peter_Dyballa@web.de>:
> How do you know that \205 and \212 stand for à and è etc.?

Because I know the Italian words the program is outputting: "già" ->
"gi\205", "è" -> "\212".





* Re: Windows + Eshell: fixing character encoding?
  2009-07-29 11:38       ` Elena Garrulo
@ 2009-07-29 16:10         ` Eli Zaretskii
  2009-07-29 19:36         ` Peter Dyballa
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2009-07-29 16:10 UTC (permalink / raw)
  To: help-gnu-emacs

> Date: Wed, 29 Jul 2009 11:38:07 +0000
> From: Elena Garrulo <egarrulo@gmail.com>
> Cc: help-gnu-emacs@gnu.org
> 
> 2009/7/29 Peter Dyballa <Peter_Dyballa@web.de>:
> > How do you know that \205 and \212 stand for à and è etc.?
> 
> Because I know the Italian words the program is outputting: "già" ->
> "gi\205", "è" -> "\212".

That's not a proof.  Peter is right, the octal escapes you see are not
the codepoints of the Latin-1 characters you expect to see.  In fact,
these codepoints are invalid in Latin-1.

I suspect that the programs you run from Eshell (which programs are
those, by the way?) produce a Windows codepage encoding, not a Latin-1
encoding.

I have a few more questions:

  . What is the value of default-process-coding-system?

  . What is the value of locale-coding-system?

  . Does the problem happen in Emacs invoked as "emacs -Q"?






* Re: Windows + Eshell: fixing character encoding?
  2009-07-29 11:38       ` Elena Garrulo
  2009-07-29 16:10         ` Eli Zaretskii
@ 2009-07-29 19:36         ` Peter Dyballa
       [not found]         ` <mailman.3409.1248896205.2239.help-gnu-emacs@gnu.org>
       [not found]         ` <mailman.3394.1248883861.2239.help-gnu-emacs@gnu.org>
  3 siblings, 0 replies; 17+ messages in thread
From: Peter Dyballa @ 2009-07-29 19:36 UTC (permalink / raw)
  To: Elena Garrulo; +Cc: help-gnu-emacs


On 29.07.2009 at 13:38, Elena Garrulo wrote:

> 2009/7/29 Peter Dyballa <Peter_Dyballa>:
>> How do you know that \205 and \212 stand for à and è etc.?
>
> Because I know the Italian words the program is outputting: "già" ->
> "gi\205", "è" -> "\212".


Can you check whether these are encoded in some DOS code page? Neither
the ISO 8859 (ISO Latin) encodings nor Unicode use the range

	;   oct   dec   hex
	;==================
	? = 200 = 128 = 80
		...
	? = 239 = 159 = 9F

to encode characters that can be displayed as glyphs. The values in
this range are used as so-called 8-bit control characters. DOS code
pages, Mac encodings, and the NeXT encoding are different. (HP Roman too?)
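
The claim about the 0x80..0x9F range can be checked against the Unicode character database. This is a small Python sketch, assuming only the standard `unicodedata` module; the variable names are made up for illustration:

```python
import unicodedata

# Code points 0x80..0x9F, the range in question.
c1_range = range(0x80, 0xA0)

# Unicode general category "Cc" means "control character".
categories = {unicodedata.category(chr(cp)) for cp in c1_range}
print(categories)

# Python's Latin-1 codec maps each byte to the code point with the same
# number, so byte 0x85 decodes to the control character U+0085, not a letter.
decoded = bytes([0x85]).decode("latin-1")
print(hex(ord(decoded)), unicodedata.category(decoded))
```

Every code point in the range reports category `Cc`, which matches what "C-u C-x =" showed earlier in the thread (charset eight-bit-control).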

--
Greetings

   Pete

Remember: use logout to logout.








* Re: Windows + Eshell: fixing character encoding?
       [not found]         ` <mailman.3409.1248896205.2239.help-gnu-emacs@gnu.org>
@ 2009-07-30  2:01           ` Jason Rumney
  0 siblings, 0 replies; 17+ messages in thread
From: Jason Rumney @ 2009-07-30  2:01 UTC (permalink / raw)
  To: help-gnu-emacs

On Jul 30, 3:36 am, Peter Dyballa <Peter_Dyba...@web.de> wrote:
> On 29.07.2009 at 13:38, Elena Garrulo wrote:
>
> > 2009/7/29 Peter Dyballa <Peter_Dyballa>:
> >> How do you know that \205 and \212 stand for à and è etc.?
>
> > Because I know the Italian words the program is outputting: "già" ->
> > "gi\205", "è" -> "\212".
>
> Can you check whether these are encoded in some DOS code page? No ISO  
> 8859/ISO Latin encoding and neither Unicode uses the range
>
>         ;   oct   dec   hex
>         ;==================
>         ? = 200 = 128 = 80
>                 ...
>         ? = 239 = 159 = 9F
>
> to encode valid characters which can be presented as glyphs. They're  
> used as so-called 8-bit control characters. DOS code pages, Mac  
> encodings, and the NeXT encoding are different. (HP Roman too?)
>
> --
> Greetings
>
>    Pete
>
> Remember: use logout to logout.

Those codepoints do match the characters expected in cp850, which is
what I'd expect by default from console programs on a West European
Windows system.
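
A quick way to confirm this outside Emacs is to decode the raw bytes with a cp850 codec. The sketch below uses Python's built-in "cp850" codec; the variable names are illustrative only:

```python
# The raw bytes behind the octal escapes \205 and \212.
raw = bytes([0o205, 0o212])

# Decode them as code page 850.
letters = raw.decode("cp850")
print(letters)

# The word from the thread: "gi\205" should come out as "già".
word = b"gi\x85".decode("cp850")
print(word)
```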



* Re: Windows + Eshell: fixing character encoding?
       [not found]         ` <mailman.3394.1248883861.2239.help-gnu-emacs@gnu.org>
@ 2009-07-30 13:39           ` Elena
  2009-07-30 14:11             ` Lennart Borgman
                               ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Elena @ 2009-07-30 13:39 UTC (permalink / raw)
  To: help-gnu-emacs

On 29 Jul, 16:10, Eli Zaretskii <e...@gnu.org> wrote:
> I suspect that the programs you run from Eshell (which programs are
> those, by the way?) produce a Windows codepage encoding, not a Latin-1
> encoding.

I'm running Ant (a console Java program) which spawns NMake
(Microsoft's make utility). BTW, if I run Ant from the Windows command
prompt, accented characters are printed correctly.

>
> I have a few more questions:
>
>   . What is the value of default-process-coding-system?

default-process-coding-system is a variable defined in `C source
code'.
Its value is
(iso-latin-1-dos . iso-latin-1-unix)


>
>   . What is the value of locale-coding-system?

locale-coding-system is a variable defined in `C source code'.
Its value is cp1252

>
>   . Does the problem happen in Emacs invoked as "emacs -Q"?

Yes.



* Re: Windows + Eshell: fixing character encoding?
  2009-07-30 13:39           ` Elena
@ 2009-07-30 14:11             ` Lennart Borgman
  2009-07-30 14:41               ` Elena Garrulo
  2009-07-30 14:41             ` Peter Dyballa
  2009-07-30 18:47             ` Eli Zaretskii
  2 siblings, 1 reply; 17+ messages in thread
From: Lennart Borgman @ 2009-07-30 14:11 UTC (permalink / raw)
  To: Elena; +Cc: help-gnu-emacs

On Thu, Jul 30, 2009 at 3:39 PM, Elena<egarrulo@gmail.com> wrote:

>>   . Does the problem happen in Emacs invoked as "emacs -Q"?
>
> Yes.

What version of Emacs are you running? What is the output from "M-x version"?





* Re: Windows + Eshell: fixing character encoding?
  2009-07-30 14:11             ` Lennart Borgman
@ 2009-07-30 14:41               ` Elena Garrulo
  2009-07-30 14:54                 ` Peter Dyballa
       [not found]                 ` <mailman.3482.1248965709.2239.help-gnu-emacs@gnu.org>
  0 siblings, 2 replies; 17+ messages in thread
From: Elena Garrulo @ 2009-07-30 14:41 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: help-gnu-emacs

2009/7/30 Lennart Borgman <lennart.borgman@gmail.com>:
> On Thu, Jul 30, 2009 at 3:39 PM, Elena<egarrulo@gmail.com> wrote:
>
>>>   . Does the problem happen in Emacs invoked as "emacs -Q"?
>>
>> Yes.
>
> What version of Emacs are you running? What is the output from "M-x version"?
>

GNU Emacs 22.3.1 (i386-mingw-nt5.1.2600) of 2008-09-06 on SOFT-MJASON

Anyway, I've tried writing a simple .bat which echoes accented
characters and I've launched it from Eshell: the accented characters
are printed correctly.





* Re: Windows + Eshell: fixing character encoding?
  2009-07-30 13:39           ` Elena
  2009-07-30 14:11             ` Lennart Borgman
@ 2009-07-30 14:41             ` Peter Dyballa
  2009-07-30 18:47             ` Eli Zaretskii
  2 siblings, 0 replies; 17+ messages in thread
From: Peter Dyballa @ 2009-07-30 14:41 UTC (permalink / raw)
  To: Elena; +Cc: help-gnu-emacs


On 30.07.2009 at 15:39, Elena wrote:

> locale-coding-system is a variable defined in `C source code'.
> Its value is cp1252


This is as wrong as iso-latin-1. In CP1252 your \205 and \212 codes are:

	;   oct   dec   hex    UCS2    UTF-8
	;=====================================
	… = 205 = 133 = 85 = U+2026 = E2 80 A6 : HORIZONTAL ELLIPSIS
	Š = 212 = 138 = 8A = U+0160 =    C5 A0 : LATIN CAPITAL LETTER S WITH CARON

Use Jason Rumney's recommendation! I can see from the ICU files that in
CP850:

	\205 -> [à]  00E0  LATIN SMALL LETTER A WITH GRAVE
	\212 -> [è]  00E8  LATIN SMALL LETTER E WITH GRAVE
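
The comparison between the candidate encodings can be reproduced mechanically. This is only a sketch: the list of candidate codecs is an assumption chosen for illustration (Python has no codec for the NeXT encoding, for example), and the variable names are hypothetical:

```python
raw = b"gi\x85"       # what the program actually emitted
expected = "già"      # what the user knows it should say

# Candidate codecs to try; this list is an assumption, not exhaustive.
candidates = ["latin-1", "cp1252", "cp850", "cp437"]

matches = []
for name in candidates:
    text = raw.decode(name, errors="replace")
    print(f"{name:8} -> {text!r}")
    if text == expected:
        matches.append(name)

print("codecs that reproduce the expected text:", matches)
```

Both DOS code pages in the list reproduce the word, so this byte alone cannot tell cp850 from cp437; it only rules out Latin-1 and CP1252.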


GNU Emacs 22 and 23 let you re-open a *file* in a different encoding:
C-x RET r <your choice> RET (Options menu -> Mule -> Set Coding Systems
-> For Reverting This File Now). Once you have saved the garbled
buffer to a (temporary, growing) file (don't kill the buffer), you
can revert it with a different interpretation of its bytes.

--
Greetings

   Pete

Klingons do not believe in indentation - except perhaps in the skulls  
of their project managers.







* Re: Windows + Eshell: fixing character encoding?
  2009-07-30 14:41               ` Elena Garrulo
@ 2009-07-30 14:54                 ` Peter Dyballa
       [not found]                 ` <mailman.3482.1248965709.2239.help-gnu-emacs@gnu.org>
  1 sibling, 0 replies; 17+ messages in thread
From: Peter Dyballa @ 2009-07-30 14:54 UTC (permalink / raw)
  To: Elena Garrulo; +Cc: help-gnu-emacs


On 30.07.2009 at 16:41, Elena Garrulo wrote:

> Anyway, I've tried writing a simple .bat which echoes accented
> characters and I've launched it from Eshell: the accented characters
> are printed correctly.

Because you've inserted them either in ISO Latin or CP1252, i.e.,  
input = output. Check with C-u C-x =!

--
Greetings

   Pete

If my theory of relativity is proven successful, Germany will claim  
me as a German, and France will declare that I am a citizen of the  
world. Should my theory prove untrue, France will say that I am a  
German, and Germany will declare that I am a Jew.
				– Albert Einstein, 1929








* Re: Windows + Eshell: fixing character encoding?
       [not found]                 ` <mailman.3482.1248965709.2239.help-gnu-emacs@gnu.org>
@ 2009-07-30 15:05                   ` Elena
  2009-07-30 15:49                     ` Peter Dyballa
  0 siblings, 1 reply; 17+ messages in thread
From: Elena @ 2009-07-30 15:05 UTC (permalink / raw)
  To: help-gnu-emacs

On 30 Jul, 14:54, Peter Dyballa <Peter_Dyba...@Web.DE> wrote:
> On 30.07.2009 at 16:41, Elena Garrulo wrote:
>
> > Anyway, I've tried writing a simple .bat which echoes accented
> > characters and I've launched it from Eshell: the accented characters
> > are printed correctly.
>
> Because you've inserted them either in ISO Latin or CP1252, i.e.,  
> input = output. Check with C-u C-x =!

I don't know: I've inserted them using Notepad.



* Re: Windows + Eshell: fixing character encoding?
  2009-07-30 15:05                   ` Elena
@ 2009-07-30 15:49                     ` Peter Dyballa
  2009-07-30 18:49                       ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Dyballa @ 2009-07-30 15:49 UTC (permalink / raw)
  To: Elena; +Cc: help-gnu-emacs


On 30.07.2009 at 17:05, Elena wrote:

> I don't know: I've inserted them using Notepad.


So it is probably using your system's default, ISO Latin (BTW, ISO
Latin-9 or ISO 8859-15, which encodes €, should be a good choice for
a standard 8-bit encoding).

Have you thought of configuring Ant to use an ISO Latin encoding
instead of a DOS code page from the last millennium?

--
Greetings

   Pete

The best way to accelerate a PC is 9.8 m/s²






* Re: Windows + Eshell: fixing character encoding?
  2009-07-30 13:39           ` Elena
  2009-07-30 14:11             ` Lennart Borgman
  2009-07-30 14:41             ` Peter Dyballa
@ 2009-07-30 18:47             ` Eli Zaretskii
  2 siblings, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2009-07-30 18:47 UTC (permalink / raw)
  To: help-gnu-emacs

> From: Elena <egarrulo@gmail.com>
> Date: Thu, 30 Jul 2009 06:39:35 -0700 (PDT)
> 
> default-process-coding-system is a variable defined in `C source
> code'.
> Its value is
> (iso-latin-1-dos . iso-latin-1-unix)

That's strange.  It should be (undecided-dos . undecided-unix) in
"emacs -Q".  Can you set it to that value and see if it helps?  If that
doesn't help, please try (cp850-dos . cp850-unix).





* Re: Windows + Eshell: fixing character encoding?
  2009-07-30 15:49                     ` Peter Dyballa
@ 2009-07-30 18:49                       ` Eli Zaretskii
  0 siblings, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2009-07-30 18:49 UTC (permalink / raw)
  To: help-gnu-emacs

> From: Peter Dyballa <Peter_Dyballa@Web.DE>
> Date: Thu, 30 Jul 2009 17:49:15 +0200
> Cc: help-gnu-emacs@gnu.org
> 
> 
> On 30.07.2009 at 17:05, Elena wrote:
> 
> > I don't know: I've inserted them using Notepad.
> 
> 
> So it is probably using your system's default, ISO Latin

No, it probably uses cp1252.  On Windows, the encodings used by GUI
programs (such as Notepad) and console programs are different.
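
This difference would also explain why the .bat test looked fine. A minimal Python sketch of the idea follows; that Notepad writes cp1252 and the console program emits cp850 are the assumptions made in this thread, and the variable names are illustrative:

```python
letter = "è"

# Assumed: a GUI editor such as Notepad saves the letter in cp1252.
gui_bytes = letter.encode("cp1252")
# Assumed: a console program emits the letter in cp850.
console_bytes = letter.encode("cp850")

print(gui_bytes, console_bytes)

# Emacs was decoding process output as Latin-1 in this thread.
gui_seen = gui_bytes.decode("latin-1")          # cp1252 agrees with Latin-1 here
console_seen = console_bytes.decode("latin-1")  # lands in the control range

print(repr(gui_seen), repr(console_seen))
```

For this letter the cp1252 byte is the same as the Latin-1 byte, so text echoed from a Notepad-authored file survives a Latin-1 decode, while the cp850 byte turns into a control character that Emacs displays as an octal escape.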



