* decode-coding-region returns octal escapes
@ 2009-02-07 12:00 Eli Zaretskii
2009-02-10 6:55 ` Kenichi Handa
0 siblings, 1 reply; 2+ messages in thread
From: Eli Zaretskii @ 2009-02-07 12:00 UTC (permalink / raw)
To: Kenichi Handa; +Cc: emacs-devel
To reproduce:
emacs -Q
M-x find-file-literally RET etc/tutorials/TUTORIAL.de RET
(The German tutorial is just an example, you can use any one, but
others might need different values of the 3rd argument to
decode-coding-region below.)
Now mark a region that includes non-ASCII characters, for example,
this one, on line 7 of the file:
da\337 die CONTROL-Taste gedr\374ckt sein mu\337
and type
M-: (decode-coding-region (mark) (point) 'latin-1 t) RET
The result is that the echo area shows this:
#("da\337 die CONTROL-Taste gedr\374ckt sein mu\337" 0 39 (charset iso-8859-1))
and the *Messages* buffer shows this:
#("da\303\237 die CONTROL-Taste gedr\303\274ckt sein mu\303\237" 0 39 (charset iso-8859-1))
But I expected to see this in both cases:
#("daß die CONTROL-Taste gedrückt sein muß" 0 39 (charset iso-8859-1))
Is this a bug? If not, what is the explanation for what I see? Why
are raw bytes inserted instead of decoded characters?
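For reference, the octal escapes in the two outputs above can be related with a small sketch. The helper name `demo-octal-bytes' is hypothetical, not something Emacs provides. It only shows that \337 is the Latin-1 byte for ß, while \303\237 is the UTF-8 encoding of the same character. Whether that is why *Messages* shows two bytes per character is an assumption on my side, not something established in this thread.

```elisp
;; Hypothetical helper, for illustration only.
(defun demo-octal-bytes (char coding)
  "Return the bytes of CHAR encoded in CODING as a list of octal strings."
  (mapcar (lambda (byte) (format "%o" byte))
          (encode-coding-string (string char) coding)))

(demo-octal-bytes #xdf 'latin-1) ; ß in Latin-1: ("337")
(demo-octal-bytes #xdf 'utf-8)   ; ß in UTF-8:   ("303" "237")
(demo-octal-bytes #xfc 'utf-8)   ; ü in UTF-8:   ("303" "274")
```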
* Re: decode-coding-region returns octal escapes
2009-02-07 12:00 decode-coding-region returns octal escapes Eli Zaretskii
@ 2009-02-10 6:55 ` Kenichi Handa
0 siblings, 0 replies; 2+ messages in thread
From: Kenichi Handa @ 2009-02-10 6:55 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
In article <uab8ytq12.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:
> To reproduce:
> emacs -Q
> M-x find-file-literally RET etc/tutorials/TUTORIAL.de RET
> (The German tutorial is just an example, you can use any one, but
> others might need different values of the 3rd argument to
> decode-coding-region below.)
> Now mark a region that includes non-ASCII characters, for example,
> this one, on line 7 of the file:
> da\337 die CONTROL-Taste gedr\374ckt sein mu\337
> and type
> M-: (decode-coding-region (mark) (point) 'latin-1 t) RET
> The result is that the echo area shows this:
> #("da\337 die CONTROL-Taste gedr\374ckt sein mu\337" 0 39 (charset iso-8859-1))
> and the *Messages* buffer shows this:
> #("da\303\237 die CONTROL-Taste gedr\303\274ckt sein mu\303\237" 0 39 (charset iso-8859-1))
> But I expected to see this in both cases:
> #("daß die CONTROL-Taste gedrückt sein muß" 0 39 (charset iso-8859-1))
At least decode-coding-region is working correctly. Please
try this instead:
M-: (setq str (decode-coding-region (mark) (point) 'latin-1 t)) RET
C-h v str RET
You'll see the correct multibyte string.
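The same check can be made without the echo area at all. This is a minimal sketch, assuming the raw bytes of the tutorial line are typed in as a unibyte string literal and decoded with decode-coding-string rather than decode-coding-region; `demo-raw' and `demo-str' are hypothetical names.

```elisp
;; Sketch only: `demo-raw' and `demo-str' are hypothetical names.
(defvar demo-raw "da\337 die CONTROL-Taste gedr\374ckt sein mu\337"
  "Unibyte string holding the Latin-1 bytes of the tutorial line.")

(defvar demo-str (decode-coding-string demo-raw 'latin-1)
  "The same text decoded into a multibyte string.")

(multibyte-string-p demo-raw) ; nil: the input is unibyte
(multibyte-string-p demo-str) ; t: the decoded result is multibyte
(length demo-str)             ; 39 characters, matching the "0 39" above
(aref demo-str 2)             ; 223 (#xdf), the code point of ß
```

Inspecting the string this way does not depend on how the echo area chooses to display it.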
> Is this a bug? If not, what is the explanation for what I see? Why
> are raw bytes inserted instead of decoded characters?
It seems that the `message' function somehow uses the value of
enable-multibyte-characters in the current buffer to decide
how to show a string.
When I do this after the above C-h v:
M-: (message "%s" str) RET
I see those octal escapes, but when I do that while I'm in a
multibyte buffer, I see the correct characters.
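The comparison can be repeated with a small sketch like the following. The function name `demo-message-in-buffer' is hypothetical, and it assumes `str' was set as above. It only makes a temporary buffer current with the chosen multibyteness before calling `message'; what the echo area then displays may well depend on the Emacs version.

```elisp
;; Hypothetical helper, for illustration only.
(defun demo-message-in-buffer (multibyte string)
  "Call `message' on STRING with a temporary buffer current.
MULTIBYTE non-nil makes that buffer multibyte, nil makes it unibyte.
Return the value of `message', which is the formatted string."
  (with-temp-buffer
    (set-buffer-multibyte multibyte)
    (message "%s" string)))

;; Usage, to be compared by eye in the echo area and in *Messages*:
;;   (demo-message-in-buffer nil str)  ; current buffer is unibyte
;;   (demo-message-in-buffer t str)    ; current buffer is multibyte
```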
---
Kenichi Handa
handa@m17n.org