* Windows + Eshell: fixing character encoding?
@ 2009-07-28 10:19 Elena
2009-07-28 17:35 ` Eli Zaretskii
[not found] ` <mailman.3340.1248802527.2239.help-gnu-emacs@gnu.org>
0 siblings, 2 replies; 17+ messages in thread
From: Elena @ 2009-07-28 10:19 UTC (permalink / raw)
To: help-gnu-emacs
Hi,
when running Eshell on Windows, programs' output characters such as
'è', 'à', etc. are printed as \212, \205, etc.
How can I see actual characters?
Thanks
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Windows + Eshell: fixing character encoding?
2009-07-28 10:19 Windows + Eshell: fixing character encoding? Elena
@ 2009-07-28 17:35 ` Eli Zaretskii
[not found] ` <mailman.3340.1248802527.2239.help-gnu-emacs@gnu.org>
1 sibling, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2009-07-28 17:35 UTC (permalink / raw)
To: help-gnu-emacs
> From: Elena <egarrulo@gmail.com>
> Date: Tue, 28 Jul 2009 03:19:51 -0700 (PDT)
>
> when running Eshell on Windows, programs' output characters such as
> 'è', 'à', etc. are printed as \212, \205, etc.
>
> How can I see actual characters?
What is the value of buffer-file-coding-system in the Eshell buffer?
Also, if you go to one of these characters and type "C-u C-x =", what
does Emacs tell you about that character?
* Re: Windows + Eshell: fixing character encoding?
[not found] ` <mailman.3340.1248802527.2239.help-gnu-emacs@gnu.org>
@ 2009-07-29 8:12 ` Elena
2009-07-29 10:23 ` Peter Dyballa
0 siblings, 1 reply; 17+ messages in thread
From: Elena @ 2009-07-29 8:12 UTC (permalink / raw)
To: help-gnu-emacs
On 28 Lug, 17:35, Eli Zaretskii <e...@gnu.org> wrote:
> > From: Elena <egarr...@gmail.com>
> > Date: Tue, 28 Jul 2009 03:19:51 -0700 (PDT)
>
> > when running Eshell on Windows, programs' output characters such as
> > 'è', 'à', etc. are printed as \212, \205, etc.
>
> > How can I see actual characters?
>
> What is the value of buffer-file-coding-system in the Eshell buffer?
>
> Also, if you go to one of these characters and type "C-u C-x =", what
> does Emacs tell you about that character?
buffer-file-coding-system is iso-latin-1-dos. I think it should be
iso8859-1, but the variable's description does not say it is
customizable.
"C-u C-x =" prints:
character: (133, #o205, #x85, U+0085)
charset: eight-bit-control (8-bit control code (0x80..0x9F))
code point: #x85
syntax: which means: whitespace
buffer code: #x85
file code: not encodable by coding system iso-latin-1-dos
display: by this font (glyph code)
-outline-Courier New-normal-r-normal-normal-13-97-96-96-c-*-iso8859-1 (#x85)
Following the documentation, I've tried customizing
"current-language-environment" to "Italian" and restarting "eshell", but
it doesn't change anything.
Thanks.
* Re: Windows + Eshell: fixing character encoding?
2009-07-29 8:12 ` Elena
@ 2009-07-29 10:23 ` Peter Dyballa
2009-07-29 11:38 ` Elena Garrulo
0 siblings, 1 reply; 17+ messages in thread
From: Peter Dyballa @ 2009-07-29 10:23 UTC (permalink / raw)
To: Elena; +Cc: help-gnu-emacs
Am 29.07.2009 um 10:12 schrieb Elena:
> buffer-file-coding-system is iso-latin-1-dos. I think it should be
> iso8859-1, but the variable's description does not say it is
> customizable.
Both are the same, just two different names.
>
> "C-u C-x =" prints:
>
> character: … (133, #o205, #x85, U+0085)
> charset: eight-bit-control (8-bit control code (0x80..0x9F))
From where do you know that \205 and \212 stand for à and è etc.?
The NeXT encoding comes next to your assumption:
; oct dec hex UCS2 UTF-8
;=====================================
Ä = 205 = 133 = 85 = U+00C4 = C3 84 : A diaeresis
Ê = 212 = 138 = 8A = U+00CA = C3 8A : E circumflex
In ISO Latin-1 or ISO 8859-1 the two characters are:
à = 340 = 224 = E0 = U+00E0 = C3 A0 : LATIN SMALL LETTER A WITH GRAVE
è = 350 = 232 = E8 = U+00E8 = C3 A8 : LATIN SMALL LETTER E WITH GRAVE
--
Greetings
Pete
Basic, n.:
A programming language. Related to certain social diseases in
that those who have it will not admit it in polite company.
* Re: Windows + Eshell: fixing character encoding?
2009-07-29 10:23 ` Peter Dyballa
@ 2009-07-29 11:38 ` Elena Garrulo
2009-07-29 16:10 ` Eli Zaretskii
` (3 more replies)
0 siblings, 4 replies; 17+ messages in thread
From: Elena Garrulo @ 2009-07-29 11:38 UTC (permalink / raw)
To: Peter Dyballa; +Cc: help-gnu-emacs
2009/7/29 Peter Dyballa <Peter_Dyballa@web.de>:
> From where do you know that \205 and \212 stand for à and è etc.?
Because I know the italian words the program is outputting: "già" ->
"gi\205", "è" -> "\212".
* Re: Windows + Eshell: fixing character encoding?
2009-07-29 11:38 ` Elena Garrulo
@ 2009-07-29 16:10 ` Eli Zaretskii
2009-07-29 19:36 ` Peter Dyballa
` (2 subsequent siblings)
3 siblings, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2009-07-29 16:10 UTC (permalink / raw)
To: help-gnu-emacs
> Date: Wed, 29 Jul 2009 11:38:07 +0000
> From: Elena Garrulo <egarrulo@gmail.com>
> Cc: help-gnu-emacs@gnu.org
>
> 2009/7/29 Peter Dyballa <Peter_Dyballa@web.de>:
> > From where do you know that \205 and \212 stand for à and è etc.?
>
> Because I know the italian words the program is outputting: "già" ->
> "gi\205", "è" -> "\212".
That's not a proof. Peter is right, the octal escapes you see are not
the codepoints of the Latin-1 characters you expect to see. In fact,
these codepoints are invalid in Latin-1.
I suspect that the programs you run from Eshell (which programs are
those, by the way?) produce a Windows codepage encoding, not a Latin-1
encoding.
I have a few more questions:
. What is the value of default-process-coding-system?
. What is the value of locale-coding-system?
. Does the problem happen in Emacs invoked as "emacs -Q"?
* Re: Windows + Eshell: fixing character encoding?
2009-07-29 11:38 ` Elena Garrulo
2009-07-29 16:10 ` Eli Zaretskii
@ 2009-07-29 19:36 ` Peter Dyballa
[not found] ` <mailman.3409.1248896205.2239.help-gnu-emacs@gnu.org>
[not found] ` <mailman.3394.1248883861.2239.help-gnu-emacs@gnu.org>
3 siblings, 0 replies; 17+ messages in thread
From: Peter Dyballa @ 2009-07-29 19:36 UTC (permalink / raw)
To: Elena Garrulo; +Cc: help-gnu-emacs
Am 29.07.2009 um 13:38 schrieb Elena Garrulo:
> 2009/7/29 Peter Dyballa <Peter_Dyballa>:
>> From where do you know that \205 and \212 stand for à and è etc.?
>
> Because I know the italian words the program is outputting: "già" ->
> "gi\205", "è" -> "\212".
Can you check whether these are encoded in some DOS code page? Neither
the ISO 8859/ISO Latin encodings nor Unicode use the range
; oct dec hex
;==================
? = 200 = 128 = 80
...
? = 237 = 159 = 9F
to encode valid characters which can be presented as glyphs. They're
used as so-called 8-bit control characters. DOS code pages, Mac
encodings, and the NeXT encoding are different. (HP Roman too?)
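The claim about the 0x80..0x9F range can be checked with a short Python sketch. Python is not part of the original discussion; it is used here only because its `unicodedata` module and `latin-1` codec make the check easy to reproduce.

```python
import unicodedata

# Code points 0x80..0x9F form the C1 control range in ISO 8859-1 and Unicode.
c1_range = range(0x80, 0xA0)
categories = {unicodedata.category(chr(cp)) for cp in c1_range}
print(categories)  # a single category, "Cc" (control), covers the whole range

# The latin-1 codec maps each byte to the code point with the same value,
# so the bytes from the thread decode to control characters, not letters.
decoded = bytes([0x85, 0x8A]).decode("latin-1")
print([f"U+{ord(ch):04X} octal {ord(ch):o}" for ch in decoded])
```

The octal values printed for the two bytes are 205 and 212, the same escapes Eshell displays.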
--
Greetings
Pete
Remember: use logout to logout.
* Re: Windows + Eshell: fixing character encoding?
[not found] ` <mailman.3409.1248896205.2239.help-gnu-emacs@gnu.org>
@ 2009-07-30 2:01 ` Jason Rumney
0 siblings, 0 replies; 17+ messages in thread
From: Jason Rumney @ 2009-07-30 2:01 UTC (permalink / raw)
To: help-gnu-emacs
On Jul 30, 3:36 am, Peter Dyballa <Peter_Dyba...@web.de> wrote:
> Am 29.07.2009 um 13:38 schrieb Elena Garrulo:
>
> > 2009/7/29 Peter Dyballa <Peter_Dyballa>:
> >> From where do you know that \205 and \212 stand for à and è etc.?
>
> > Because I know the italian words the program is outputting: "già" ->
> > "gi\205", "è" -> "\212".
>
> Can you check whether these are encoded in some DOS code page? Neither
> the ISO 8859/ISO Latin encodings nor Unicode use the range
>
> ; oct dec hex
> ;==================
> ? = 200 = 128 = 80
> ...
> ? = 237 = 159 = 9F
>
> to encode valid characters which can be presented as glyphs. They're
> used as so-called 8-bit control characters. DOS code pages, Mac
> encodings, and the NeXT encoding are different. (HP Roman too?)
>
> --
> Greetings
>
> Pete
>
> Remember: use logout to logout.
Those codepoints do match the characters expected in cp850, which is
what I'd expect by default from console programs on a West European
windows system.
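A minimal sketch of that comparison, in Python rather than Emacs (an assumption of convenience; the codec names `cp850`, `cp1252`, and `latin-1` are Python's standard ones). It decodes the bytes reported in this thread under each candidate encoding:

```python
# Bytes reported in the thread: "gi\205" and "\212" (octal escapes).
raw = b"gi\x85 \x8a"

# Decode the same bytes under each encoding discussed so far.
for codec in ("cp850", "cp1252", "latin-1"):
    print(codec, repr(raw.decode(codec)))

as_cp850 = raw.decode("cp850")
as_cp1252 = raw.decode("cp1252")
```

Only the cp850 decoding yields the Italian words Elena expects; cp1252 gives an ellipsis and an S with caron, and latin-1 gives control characters.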
* Re: Windows + Eshell: fixing character encoding?
[not found] ` <mailman.3394.1248883861.2239.help-gnu-emacs@gnu.org>
@ 2009-07-30 13:39 ` Elena
2009-07-30 14:11 ` Lennart Borgman
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Elena @ 2009-07-30 13:39 UTC (permalink / raw)
To: help-gnu-emacs
On 29 Lug, 16:10, Eli Zaretskii <e...@gnu.org> wrote:
> I suspect that the programs you run from Eshell (which programs are
> those, by the way?) produce a Windows codepage encoding, not a Latin-1
> encoding.
I'm running Ant (a console Java program) which spawns NMake
(Microsoft's make utility). BTW, if I run Ant from the Windows command
prompt, accented characters are printed correctly.
>
> I have a few more questions:
>
> . What is the value of default-process-coding-system?
default-process-coding-system is a variable defined in `C source
code'.
Its value is
(iso-latin-1-dos . iso-latin-1-unix)
>
> . What is the value of locale-coding-system?
locale-coding-system is a variable defined in `C source code'.
Its value is cp1252
>
> . Does the problem happen in Emacs invoked as "emacs -Q"?
Yes.
* Re: Windows + Eshell: fixing character encoding?
2009-07-30 13:39 ` Elena
@ 2009-07-30 14:11 ` Lennart Borgman
2009-07-30 14:41 ` Elena Garrulo
2009-07-30 14:41 ` Peter Dyballa
2009-07-30 18:47 ` Eli Zaretskii
2 siblings, 1 reply; 17+ messages in thread
From: Lennart Borgman @ 2009-07-30 14:11 UTC (permalink / raw)
To: Elena; +Cc: help-gnu-emacs
On Thu, Jul 30, 2009 at 3:39 PM, Elena<egarrulo@gmail.com> wrote:
>> . Does the problem happen in Emacs invoked as "emacs -Q"?
>
> Yes.
What version of Emacs are you running? What is the output from "M-x version"?
* Re: Windows + Eshell: fixing character encoding?
2009-07-30 14:11 ` Lennart Borgman
@ 2009-07-30 14:41 ` Elena Garrulo
2009-07-30 14:54 ` Peter Dyballa
[not found] ` <mailman.3482.1248965709.2239.help-gnu-emacs@gnu.org>
0 siblings, 2 replies; 17+ messages in thread
From: Elena Garrulo @ 2009-07-30 14:41 UTC (permalink / raw)
To: Lennart Borgman; +Cc: help-gnu-emacs
2009/7/30 Lennart Borgman <lennart.borgman@gmail.com>:
> On Thu, Jul 30, 2009 at 3:39 PM, Elena<egarrulo@gmail.com> wrote:
>
>>> . Does the problem happen in Emacs invoked as "emacs -Q"?
>>
>> Yes.
>
> What version of Emacs are you running? What is the output from "M-x version"?
>
GNU Emacs 22.3.1 (i386-mingw-nt5.1.2600) of 2008-09-06 on SOFT-MJASON
Anyway, I've tried writing a simple .bat which echoes accented
characters and I've launched it from Eshell: the accented characters
are printed correctly.
* Re: Windows + Eshell: fixing character encoding?
2009-07-30 13:39 ` Elena
2009-07-30 14:11 ` Lennart Borgman
@ 2009-07-30 14:41 ` Peter Dyballa
2009-07-30 18:47 ` Eli Zaretskii
2 siblings, 0 replies; 17+ messages in thread
From: Peter Dyballa @ 2009-07-30 14:41 UTC (permalink / raw)
To: Elena; +Cc: help-gnu-emacs
Am 30.07.2009 um 15:39 schrieb Elena:
> locale-coding-system is a variable defined in `C source code'.
> Its value is cp1252
This is as wrong as is iso-latin-1. In CP1252 your \205 and \212
codes are:
; oct dec hex UCS2 UTF-8
;=====================================
… = 205 = 133 = 85 = U+2026 = E2 80 A6 : HORIZONTAL ELLIPSIS
Š = 212 = 138 = 8A = U+0160 = C5 A0 : LATIN CAPITAL LETTER S WITH CARON
Use Jason Rumney's recommendation! I can see from ICU files that in
CP850:
\205 -> [à] 00E0 LATIN SMALL LETTER A WITH GRAVE
\212 -> [è] 00E8 LATIN SMALL LETTER E WITH GRAVE
GNU Emacs 22 and 23 allow you to re-open a *file* in a new encoding:
C-x RET r <your choice> RET (Options menu -> Mule -> Set Coding Systems
-> For Reverting This File Now). Once you have saved the faulty-looking
buffer into a (temporary, growing) file (don't kill the buffer), you
can revert it to a different representation of its internal bits and
bytes.
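The same idea of reinterpreting unchanged bytes under a different encoding can be sketched outside Emacs. The Python below is only an analogy for the save-and-revert step, not how Emacs implements it, and `redecode` is a hypothetical helper name:

```python
def redecode(text: str, wrong: str, right: str) -> str:
    """Undo a decode done with the `wrong` codec, then decode with `right`.

    This only works when `wrong` can map every character back to its
    original byte, which holds for latin-1.
    """
    return text.encode(wrong).decode(right)


garbled = b"gi\x85".decode("latin-1")  # what a Latin-1 reader ends up with
fixed = redecode(garbled, "latin-1", "cp850")
print(repr(garbled), "->", repr(fixed))
```

The bytes themselves never change; only the table used to turn them into characters does.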
--
Greetings
Pete
Klingons do not believe in indentation - except perhaps in the skulls
of their project managers.
* Re: Windows + Eshell: fixing character encoding?
2009-07-30 14:41 ` Elena Garrulo
@ 2009-07-30 14:54 ` Peter Dyballa
[not found] ` <mailman.3482.1248965709.2239.help-gnu-emacs@gnu.org>
1 sibling, 0 replies; 17+ messages in thread
From: Peter Dyballa @ 2009-07-30 14:54 UTC (permalink / raw)
To: Elena Garrulo; +Cc: help-gnu-emacs
Am 30.07.2009 um 16:41 schrieb Elena Garrulo:
> Anyway, I've tried writing a simple .bat which echoes accented
> characters and I've launched it from Eshell: the accented characters
> are printed correctly.
Because you've inserted them either in ISO Latin or CP1252, i.e.,
input = output. Check with C-u C-x =!
--
Greetings
Pete
If my theory of relativity is proven successful, Germany will claim
me as a German, and France will declare that I am a citizen of the
world. Should my theory prove untrue, France will say that I am a
German, and Germany will declare that I am a Jew.
– Albert Einstein, 1929
* Re: Windows + Eshell: fixing character encoding?
[not found] ` <mailman.3482.1248965709.2239.help-gnu-emacs@gnu.org>
@ 2009-07-30 15:05 ` Elena
2009-07-30 15:49 ` Peter Dyballa
0 siblings, 1 reply; 17+ messages in thread
From: Elena @ 2009-07-30 15:05 UTC (permalink / raw)
To: help-gnu-emacs
On 30 Lug, 14:54, Peter Dyballa <Peter_Dyba...@Web.DE> wrote:
> Am 30.07.2009 um 16:41 schrieb Elena Garrulo:
>
> > Anyway, I've tried writing a simple .bat which echoes accented
> > characters and I've launched it from Eshell: the accented characters
> > are printed correctly.
>
> Because you've inserted them either in ISO Latin or CP1252, i.e.,
> input = output. Check with C-u C-x =!
I don't know: I've inserted them using Notepad.
* Re: Windows + Eshell: fixing character encoding?
2009-07-30 15:05 ` Elena
@ 2009-07-30 15:49 ` Peter Dyballa
2009-07-30 18:49 ` Eli Zaretskii
0 siblings, 1 reply; 17+ messages in thread
From: Peter Dyballa @ 2009-07-30 15:49 UTC (permalink / raw)
To: Elena; +Cc: help-gnu-emacs
Am 30.07.2009 um 17:05 schrieb Elena:
> I don't know: I've inserted them using Notepad.
So it is probably using your system's default, ISO Latin (BTW, ISO
Latin-9 or ISO 8859-15, which encoded €, should be a good choice for
a standard 8-bit encoding).
Have you thought of teaching (configuring) ant to use an ISO Latin
encoding instead of a DOS code page from last millennium?
--
Greetings
Pete
The best way to accelerate a PC is 9.8 m/s²
* Re: Windows + Eshell: fixing character encoding?
2009-07-30 13:39 ` Elena
2009-07-30 14:11 ` Lennart Borgman
2009-07-30 14:41 ` Peter Dyballa
@ 2009-07-30 18:47 ` Eli Zaretskii
2 siblings, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2009-07-30 18:47 UTC (permalink / raw)
To: help-gnu-emacs
> From: Elena <egarrulo@gmail.com>
> Date: Thu, 30 Jul 2009 06:39:35 -0700 (PDT)
>
> default-process-coding-system is a variable defined in `C source
> code'.
> Its value is
> (iso-latin-1-dos . iso-latin-1-unix)
That's strange. It should be (undecided-dos . undecided-unix) in
"emacs -Q". Can you try that and see if it helps? If that doesn't
help, please try (cp850-dos . cp850-unix).
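In Emacs the suggestion amounts to setting `default-process-coding-system` to the pair `(cp850-dos . cp850-unix)`. The effect of choosing the decoding side can be illustrated with a Python analogy (an assumption for illustration only; the child process below is a hypothetical stand-in for a console program such as Ant, not the real tool):

```python
import subprocess
import sys

# Hypothetical stand-in for a console program that emits cp850 bytes.
child_code = r"import sys; sys.stdout.buffer.write(b'gi\x85 \x8a')"

# Capture the child's raw output bytes without decoding them.
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    check=True,
)

# Decode explicitly with the code page the child actually used.
text = result.stdout.decode("cp850")
print(text)
```

Decoding the same captured bytes as latin-1 instead would reproduce the \205 and \212 symptom from the start of the thread.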
* Re: Windows + Eshell: fixing character encoding?
2009-07-30 15:49 ` Peter Dyballa
@ 2009-07-30 18:49 ` Eli Zaretskii
0 siblings, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2009-07-30 18:49 UTC (permalink / raw)
To: help-gnu-emacs
> From: Peter Dyballa <Peter_Dyballa@Web.DE>
> Date: Thu, 30 Jul 2009 17:49:15 +0200
> Cc: help-gnu-emacs@gnu.org
>
>
> Am 30.07.2009 um 17:05 schrieb Elena:
>
> > I don't know: I've inserted them using Notepad.
>
>
> So it is probably using your system's default, ISO Latin
No, it probably uses cp1252. On Windows, the encodings used by GUI
programs (such as Notepad) and console programs are different.
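A small Python sketch of that difference, assuming cp1252 for GUI programs and cp850 for console programs as suggested in this thread (the actual code pages depend on the Windows locale):

```python
# Encode the same letters under the two code pages discussed in the thread.
for ch in "àè":
    gui_bytes = ch.encode("cp1252")     # code page Notepad probably uses
    console_bytes = ch.encode("cp850")  # code page expected from console programs
    print(ch, gui_bytes.hex(), console_bytes.hex())

# cp1252 and Latin-1 agree on these letters, so a file typed in Notepad
# decodes correctly as iso-latin-1, while console output does not.
notepad_as_latin1 = "àè".encode("cp1252").decode("latin-1")
```

This would also explain why the .bat test displayed correctly: its bytes came from Notepad, not from a console program.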