decode-coding-region returns octal escapes

unofficial mirror of emacs-devel@gnu.org 

* decode-coding-region returns octal escapes
@ 2009-02-07 12:00 Eli Zaretskii
  2009-02-10  6:55 ` Kenichi Handa
  0 siblings, 1 reply; 2+ messages in thread
From: Eli Zaretskii @ 2009-02-07 12:00 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-devel

To reproduce:

  emacs -Q
  M-x find-file-literally RET etc/tutorials/TUTORIAL.de RET

(The German tutorial is just an example, you can use any one, but
others might need different values of the 3rd argument to
decode-coding-region below.)

Now mark a region that includes non-ASCII characters, for example,
this one, on line 7 of the file:

    da\337 die CONTROL-Taste gedr\374ckt sein mu\337

and type

    M-: (decode-coding-region (mark) (point) 'latin-1 t) RET

The result is that the echo area shows this:

    #("da\337 die CONTROL-Taste gedr\374ckt sein mu\337" 0 39 (charset iso-8859-1))

and the *Messages* buffer shows this:

    #("da\303\237 die CONTROL-Taste gedr\303\274ckt sein mu\303\237" 0 39 (charset iso-8859-1))

But I expected to see this in both cases:

    #("daß die CONTROL-Taste gedrückt sein muß" 0 39 (charset iso-8859-1))

Is this a bug?  If not, what is the explanation for what I see?  Why
are raw bytes inserted instead of decoded characters?
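
A note on the two outputs above: \337 is the single byte that ß occupies in Latin-1, and \303\237 is the two-byte UTF-8 encoding of the same character. The Emacs Lisp sketch below shows only that relationship between the byte values; it does not by itself explain why the echo area and *Messages* differ. The function name my/eszett-bytes is made up for this example.

```elisp
;; Hypothetical helper, not part of Emacs: compare the two encodings of ß.
(defun my/eszett-bytes ()
  "Return the Latin-1 and UTF-8 encodings of U+00DF as unibyte strings."
  (let ((eszett (string #xdf)))          ; one-character multibyte string "ß"
    (list (encode-coding-string eszett 'latin-1)
          (encode-coding-string eszett 'utf-8))))
```

Evaluating (my/eszett-bytes) with M-: should give a list of two unibyte strings, which can be compared with the escapes shown in the echo area and in *Messages*.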






* Re: decode-coding-region returns octal escapes
  2009-02-07 12:00 decode-coding-region returns octal escapes Eli Zaretskii
@ 2009-02-10  6:55 ` Kenichi Handa
  0 siblings, 0 replies; 2+ messages in thread
From: Kenichi Handa @ 2009-02-10  6:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

In article <uab8ytq12.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> To reproduce:
>   emacs -Q
>   M-x find-file-literally RET etc/tutorials/TUTORIAL.de RET

> (The German tutorial is just an example, you can use any one, but
> others might need different values of the 3rd argument to
> decode-coding-region below.)

> Now mark a region that includes non-ASCII characters, for example,
> this one, on line 7 of the file:

>     da\337 die CONTROL-Taste gedr\374ckt sein mu\337

> and type

>     M-: (decode-coding-region (mark) (point) 'latin-1 t) RET

> The result is that the echo area shows this:

>     #("da\337 die CONTROL-Taste gedr\374ckt sein mu\337" 0 39 (charset iso-8859-1))

> and the *Messages* buffer shows this:

>     #("da\303\237 die CONTROL-Taste gedr\303\274ckt sein mu\303\237" 0 39 (charset iso-8859-1))

> But I expected to see this in both cases:

>     #("daß die CONTROL-Taste gedrückt sein muß" 0 39 (charset iso-8859-1))

At least decode-coding-region itself is working correctly.  Please
try this instead:
    M-: (setq str (decode-coding-region (mark) (point) 'latin-1 t)) RET
    C-h v str RET
You'll see the correct multibyte string.
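
The same check can be made without relying on how any command displays the string. This is a minimal sketch, assuming a temporary unibyte buffer is a fair stand-in for one visited with find-file-literally; the function name my/decode-latin1-sample and the sample text are made up for illustration.

```elisp
;; Hypothetical helper, not part of Emacs: inspect the decoded string directly.
(defun my/decode-latin1-sample ()
  "Decode Latin-1 bytes from a unibyte buffer and describe the result."
  (with-temp-buffer
    ;; A unibyte buffer, similar to one visited with find-file-literally.
    (set-buffer-multibyte nil)
    ;; Raw Latin-1 bytes; the octal escapes make this a unibyte string.
    (insert "da\337 gedr\374ckt")
    (let ((str (decode-coding-region (point-min) (point-max) 'latin-1 t)))
      (list (multibyte-string-p str)   ; non-nil if STR is multibyte
            (aref str 2)               ; code of the third character
            (length str)))))           ; length in characters
```

If decoding worked, the second element is 223 (U+00DF, ß) and the third is 12, whatever the echo area happens to show.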

> Is this a bug?  If not, what is the explanation for what I see?  Why
> are raw bytes inserted instead of decoded characters?

It seems that the `message' function somehow uses the value of
enable-multibyte-characters in the current buffer to decide
how to display a string.

When I do this after the above C-h v:
    M-: (message "%s" str) RET
I see the octal escapes, but when I do the same thing while I'm in
a multibyte buffer, I see the correct characters.

---
Kenichi Handa
handa@m17n.org



