* strange UTF8 encoding problem (relevant to decoding-system-gone-awry?)
@ 2005-02-17 12:48 Nic Ferrier
2005-02-22 7:38 ` Kenichi Handa
From: Nic Ferrier @ 2005-02-17 12:48 UTC
I've noted the current discussion on Emacs coding.
I am experiencing a strange problem with Emacs encoding which I
thought I might share.
I'm reading the tcpd package's hosts_access man page with Emacs man
from this version of Emacs:
GNU Emacs 21.3.50.22 (i686-pc-linux-gnu, GTK+ Version 2.4.10) of
2004-12-14
In the man page viewed on a terminal there are nice little bullet
characters. Hexdump shows these characters as B7 so obviously the
terminal is not UTF-8.
The UTF-8 sequence for that character (U+00B7) is octal 0302 0267.
When I view the man page in Emacs with utf-8 encoding on by default I
get a \267. Encoding the page as unix produces: \302\267 which
*does* look like a valid UTF-8 byte sequence.
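To make the byte values above concrete, here is a minimal Python sketch (not part of the Emacs setup; `as_octal` is a hypothetical helper written only for this illustration). It assumes the bullet is U+00B7 MIDDLE DOT, which is the character that the single byte B7 denotes in Latin-1, and prints its Latin-1 and UTF-8 encodings as hex and as Emacs-style octal escapes:

```python
# Assumption: the bullet is U+00B7 MIDDLE DOT (byte B7 in Latin-1).
middle_dot = "\u00b7"

latin1_bytes = middle_dot.encode("latin-1")  # single byte
utf8_bytes = middle_dot.encode("utf-8")      # two-byte sequence


def as_octal(data: bytes) -> str:
    """Render bytes as backslash-octal escapes, one escape per byte."""
    return "".join(f"\\{b:03o}" for b in data)


print("latin-1:", latin1_bytes.hex(), as_octal(latin1_bytes))
print("utf-8:  ", utf8_bytes.hex(), as_octal(utf8_bytes))
```

Under that assumption the Latin-1 form is the lone byte \267 and the UTF-8 form is \302\267, which matches the two displays described above.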
When I do (what-cursor-position) on the character I get 302 which is
the first byte in the sequence.
I'm not sure what Emacs is doing here. It looks like valid UTF-8 and
yet (what-cursor-position) obviously does not believe there is a UTF-8
character.
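One possible reading of the 302 is that the two bytes are being treated as separate one-byte characters rather than decoded as a single UTF-8 sequence. The Python sketch below only models that reading; it is an assumption about the symptom, not a description of what Emacs does internally, and `first_code_octal` is a hypothetical helper:

```python
raw = b"\xc2\xb7"  # the bytes shown as \302\267

decoded = raw.decode("utf-8")      # decoded as one UTF-8 character
per_byte = raw.decode("latin-1")   # each byte taken as its own character


def first_code_octal(text: str) -> str:
    """Return the code of the first character in octal, without a prefix."""
    return format(ord(text[0]), "o")


print(len(decoded), first_code_octal(decoded))
print(len(per_byte), first_code_octal(per_byte))
```

When the bytes are decoded as UTF-8 there is one character with code 267; when each byte stands alone, the character at that position has code 302, which is the value reported above.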
Anybody got any idea why the correct character doesn't display?
By the way, WoMan displays the manual page with the strange bullet
converted to an asterisk.
--
Nic Ferrier
http://www.tapsellferrier.co.uk
* Re: strange UTF8 encoding problem (relevant to decoding-system-gone-awry?)
From: Kenichi Handa @ 2005-02-22 7:38 UTC
Cc: emacs-devel
In article <87y8dnmjx5.fsf@kanga.tapsellferrier.co.uk>, Nic Ferrier <nferrier@tapsellferrier.co.uk> writes:
> I've noted the current discussion on Emacs coding.
> I am experiencing a strange problem with Emacs encoding which I
> thought I might share.
> I'm reading the tcpd package's hosts_access man page with Emacs man
> from this version of Emacs:
> GNU Emacs 21.3.50.22 (i686-pc-linux-gnu, GTK+ Version 2.4.10) of
> 2004-12-14
> In the man page viewed on a terminal there are nice little bullet
> characters. Hexdump shows these characters as B7 so obviously the
> terminal is not UTF-8.
> The UTF-8 sequence for that character (U+00B7) is octal 0302 0267.
> When I view the man page in Emacs with utf-8 encoding on by default I
> get a \267. Encoding the page as unix produces: \302\267 which
> *does* look like a valid UTF-8 byte sequence.
> When I do (what-cursor-position) on the character I get 302 which is
> the first byte in the sequence.
> I'm not sure what Emacs is doing here. It looks like valid UTF-8 and
> yet (what-cursor-position) obviously does not believe there is a UTF-8
> character.
> Anybody got any idea why the correct character doesn't display?
I can't reproduce it. What I did was:
% LANG=de_DE.UTF-8 emacs -Q
and M-x man RET man RET
It decodes the UTF-8 output of the man command correctly.
What is the value of enable-multibyte-characters?
Can you reproduce the bug with the -Q argument?
---
Ken'ichi HANDA
handa@m17n.org