python-mode, c-c c-c and unicode in output buffer


* python-mode, c-c c-c and unicode in output buffer
@ 2009-10-24 13:56 sandro dentella
  2009-10-24 15:26 ` LanX
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: sandro dentella @ 2009-10-24 13:56 UTC
  To: help-gnu-emacs

Hi,

  this simple Python code, print u'è', works outside Emacs but raises an
error inside Emacs:

sandro@bluff:~$ cat /tmp/u.py
# coding: utf-8
print u'è'

sandro@bluff:~$ python /tmp/u.py
è

  So I am asking here rather than in a Python group.
  The buffers are all UTF-8 encoded. The error that Python raises when
trying to write to the buffer is:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe8' in
position 0: ordinal not in range(128)

  It seems Python believes it needs to encode in ASCII, but the Emacs
output buffer shows the 'u' in the lower left corner...


Thanks in advance
sandro
*:-)





* Re: python-mode, c-c c-c and unicode in output buffer
  2009-10-24 13:56 python-mode, c-c c-c and unicode in output buffer sandro dentella
@ 2009-10-24 15:26 ` LanX
  2009-10-24 17:33   ` sandro dentella
  2009-10-24 15:33 ` Peter Dyballa
       [not found] ` <mailman.9411.1256398409.2239.help-gnu-emacs@gnu.org>
  2 siblings, 1 reply; 7+ messages in thread
From: LanX @ 2009-10-24 15:26 UTC
  To: help-gnu-emacs

I had a similar problem with perl and German umlauts because the
coding-system for _saving_ and compilation-buffer were not in sync.

Maybe try running

M-x describe-current-coding-system

to get more details.


Cheers
  Rolf



* Re: python-mode, c-c c-c and unicode in output buffer
  2009-10-24 13:56 python-mode, c-c c-c and unicode in output buffer sandro dentella
  2009-10-24 15:26 ` LanX
@ 2009-10-24 15:33 ` Peter Dyballa
       [not found] ` <mailman.9411.1256398409.2239.help-gnu-emacs@gnu.org>
  2 siblings, 0 replies; 7+ messages in thread
From: Peter Dyballa @ 2009-10-24 15:33 UTC
  To: sandro dentella; +Cc: help-gnu-emacs

On 24.10.2009 at 15:56, sandro dentella wrote:

> Traceback (most recent call last):
>   File "<stdin>", line 2, in <module>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe8' in
> position 0: ordinal not in range(128)

This comes from Python!

The reason is that you are *not* creating an UTF-8 encoded file,  
which you can notice from the message "character u'\xe8'" which means  
some 8-bit value, i.e., not Unicode. To achieve UTF-8 encoding you  
need to update the first line to real ELisp:

	# coding: utf-8;

The final delimiter is needed, I think...

--
Greetings

   Pete

A morning without coffee is like something without something else.


* Re: python-mode, c-c c-c and unicode in output buffer
  2009-10-24 15:26 ` LanX
@ 2009-10-24 17:33   ` sandro dentella
  0 siblings, 0 replies; 7+ messages in thread
From: sandro dentella @ 2009-10-24 17:33 UTC
  To: help-gnu-emacs

On 24 Oct, 17:26, LanX <lanx.p...@googlemail.com> wrote:
> I had a similar problem with perl and German umlauts because the
> coding-system for _saving_ and compilation-buffer were not in sync.
>
> Maybe try running
>
> M-x describe-current-coding-system

I don't know how to interpret the last 2 lines, but the rest seems to
me exactly what it should be...
Am I wrong?


sandro
*:-)

python code buffer
==============
Coding system for saving this buffer:
  u -- mule-utf-8-unix


Default coding system (for new files):
  u -- mule-utf-8 (alias: utf-8)

Coding system for keyboard input:
  nil
Coding system for terminal output:
  u -- utf-8 (alias of mule-utf-8)

Defaults for subprocess I/O:
  decoding: u -- mule-utf-8 (alias: utf-8)

  encoding: u -- mule-utf-8 (alias: utf-8)


Priority order for recognizing coding systems when reading files:
  1. mule-utf-8 (alias: utf-8)
  2. iso-latin-1 (alias: iso-8859-1 latin-1)
  3. mule-utf-16be-with-signature (alias: utf-16be-with-signature mule-utf-16-be utf-16-be)
  4. mule-utf-16le-with-signature (alias: utf-16le-with-signature mule-utf-16-le utf-16-le)
  5. iso-2022-jp (alias: junet)
  6. iso-2022-7bit
  7. iso-2022-7bit-lock (alias: iso-2022-int-1)
  8. iso-2022-8bit-ss2
  9. emacs-mule
  10. raw-text
  11. japanese-shift-jis (alias: shift_jis sjis cp932)
  12. chinese-big5 (alias: big5 cn-big5 cp950)
  13. no-conversion

  Other coding systems cannot be distinguished automatically
  from these, and therefore cannot be recognized automatically
  with the present coding system priorities.

  The following are decoded correctly but recognized as iso-2022-7bit-lock:
    iso-2022-7bit-ss2 iso-2022-7bit-lock-ss2 iso-2022-cn iso-2022-cn-ext iso-2022-jp-2 iso-2022-kr

Particular coding systems specified for certain file names:

  OPERATION	TARGET PATTERN		CODING SYSTEM(s)
  ---------	--------------		----------------
  File I/O	"\\.elc\\'"		(emacs-mule . emacs-mule)
		"\\.utf\\(-8\\)?\\'"	utf-8
		"\\(\\`\\|/\\)loaddefs.el\\'"
					(raw-text . raw-text-unix)
		"\\.tar\\'"		(no-conversion . no-conversion)
		"\\.po[tx]?\\'\\|\\.po\\."
					po-find-file-coding-system
		"\\.\\(tex\\|ltx\\|dtx\\|drv\\)\\'"
					latexenc-find-file-coding-system
		""			(undecided)
  Process I/O	nothing specified
  Network I/O	nothing specified



output buffer
============
Coding system for saving this buffer:
  Not set locally, use the default.
Default coding system (for new files):
  u -- mule-utf-8 (alias: utf-8)

Coding system for keyboard input:
  nil
Coding system for terminal output:
  u -- utf-8 (alias of mule-utf-8)

Defaults for subprocess I/O:
  decoding: u -- mule-utf-8 (alias: utf-8)

  encoding: u -- mule-utf-8 (alias: utf-8)





* Re: python-mode, c-c c-c and unicode in output buffer
       [not found] ` <mailman.9411.1256398409.2239.help-gnu-emacs@gnu.org>
@ 2009-10-24 17:39   ` sandro dentella
  2009-11-01 19:27     ` Dave Love
  2009-11-01 19:24   ` Dave Love
  1 sibling, 1 reply; 7+ messages in thread
From: sandro dentella @ 2009-10-24 17:39 UTC
  To: help-gnu-emacs

On 24 Oct, 17:33, Peter Dyballa <Peter_Dyba...@Web.DE> wrote:
> On 24.10.2009 at 15:56, sandro dentella wrote:
>
> > Traceback (most recent call last):
> >   File "<stdin>", line 2, in <module>
> > UnicodeEncodeError: 'ascii' codec can't encode character u'\xe8' in
> > position 0: ordinal not in range(128)
>
> This comes from Python!

agreed

>
> The reason is that you are *not* creating an UTF-8 encoded file,

mmh, python says it's trying to create an ascii coded string and it
can't write char u'\xe8' in ascii.

> which you can notice from the message "character u'\xe8'" which means
> some 8-bit value, i.e., not Unicode.
On the contrary, Python knows perfectly well which character it is, but
tries to encode it in ASCII.

> To achieve UTF-8 encoding you
> need to update the first line to real ELisp:
>
>         # coding: utf-8;
>
> The final delimiter is needed, I think...



No, this does not change anything. In fact the buffer reports that it is
already a UTF-8 encoded buffer.
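
To make the point about u'\xe8' concrete: that is simply how Python
spells the code point U+00E8 (è) in an error message, so the character
itself is fully known and only the target codec is the problem. A
minimal sketch, written for illustration and not taken from the original
script; it avoids print so that it reads the same on Python 2 and 3:

```python
# -*- coding: utf-8 -*-
import unicodedata

ch = u'\xe8'  # the same character as u'è'

# Python knows exactly which Unicode character this is.
name = unicodedata.name(ch)

# Encoding to UTF-8 succeeds and yields a two-byte sequence.
utf8_bytes = ch.encode('utf-8')

# Only the ASCII codec fails, because U+00E8 is outside range(128).
try:
    ch.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```

Here the ASCII attempt is the only step that fails, which matches the
traceback: the failure is in the choice of output codec, not in the
source file's encoding.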


sandro
*:-)



* Re: python-mode, c-c c-c and unicode in output buffer
       [not found] ` <mailman.9411.1256398409.2239.help-gnu-emacs@gnu.org>
  2009-10-24 17:39   ` sandro dentella
@ 2009-11-01 19:24   ` Dave Love
  1 sibling, 0 replies; 7+ messages in thread
From: Dave Love @ 2009-11-01 19:24 UTC
  To: help-gnu-emacs

Peter Dyballa <Peter_Dyballa@Web.DE> writes:

> The reason is that you are *not* creating an UTF-8 encoded file,

This is the interpreter output stream from a file which was utf-8.

> To achieve UTF-8 encoding you  need to
> update the first line to real ELisp:
>
> 	# coding: utf-8;
>
> The final delimiter is needed, I think...

Not according to the Python manual.
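
For reference, the language reference describes the declaration as a
comment on the first or second line matching the regular expression
coding[=:]\s*([-\w.]+). A small sketch applying that pattern; the helper
name declared_encoding is made up for this illustration:

```python
import re

# Pattern given in the Python language reference for encoding declarations.
CODING_RE = re.compile(r'coding[=:]\s*([-\w.]+)')


def declared_encoding(line):
    """Return the encoding named in a source comment line, or None."""
    match = CODING_RE.search(line)
    return match.group(1) if match else None


plain = declared_encoding('# coding: utf-8')
with_semicolon = declared_encoding('# coding: utf-8;')
emacs_style = declared_encoding('# -*- coding: latin-1 -*-')
```

Under this pattern the plain form and the form with a trailing
semicolon both yield utf-8, since ';' is not part of the captured name;
the delimiter is harmless but not required.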



* Re: python-mode, c-c c-c and unicode in output buffer
  2009-10-24 17:39   ` sandro dentella
@ 2009-11-01 19:27     ` Dave Love
  0 siblings, 0 replies; 7+ messages in thread
From: Dave Love @ 2009-11-01 19:27 UTC
  To: help-gnu-emacs

sandro dentella <sandro@e-den.it> writes:

> mmh, python says it's trying to create an ascii coded string and it
> can't write char u'\xe8' in ascii.

Right.  The Python manual doesn't tell you it does that when writing to
a pipe (used to suppress paging in the Emacs sub-process) rather than a
pty:

  $ python u.py | cat
  Traceback (most recent call last):
    File "u.py", line 2, in <module>
      print u"è"
  UnicodeEncodeError: 'ascii' codec can't encode character u'\xe8' in position 0: ordinal not in range(128)

I fixed this recently in my version (at least as far as pydoc's handling
of paging goes, which was the reason for setting process-connection-type
for the sub-process).  There was previously a misleading comment about
non-ASCII code.  <URL:http://www.loveshack.ukfsn.org/emacs/#python.el>
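
The pipe behaviour can also be checked from Python itself. The sketch
below is an illustration only, not code from python.el: it starts a
child interpreter with its stdout on a pipe and asks it which encoding
it picked. The helper name child_stdout_encoding is hypothetical. On the
Python 2 interpreters of that period the first call should report None
(so print falls back to ASCII), while Python 3 normally picks a
locale-based encoding. The PYTHONIOENCODING environment variable,
available since Python 2.6, overrides the choice either way:

```python
import os
import subprocess
import sys


def child_stdout_encoding(extra_env=None):
    """Run a child Python with stdout on a pipe; return the encoding it reports."""
    env = dict(os.environ)
    env.update(extra_env or {})
    code = "import sys; sys.stdout.write(str(sys.stdout.encoding))"
    # check_output connects the child's stdout to a pipe, not a terminal.
    out = subprocess.check_output([sys.executable, "-c", code], env=env)
    return out.decode("ascii").strip()


if __name__ == "__main__":
    # Whatever the interpreter chooses by default when writing to a pipe.
    print(child_stdout_encoding())
    # The same child, with the encoding forced through the environment.
    print(child_stdout_encoding({"PYTHONIOENCODING": "utf-8"}))
```

Setting PYTHONIOENCODING=utf-8 in the environment Emacs passes to the
inferior process is therefore one possible workaround, assuming the
output buffer decodes subprocess output as UTF-8 as shown earlier in the
thread.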
