unofficial mirror of emacs-devel@gnu.org 
* strange UTF8 encoding problem (relevant to decoding-system-gone-awry?)
@ 2005-02-17 12:48 Nic Ferrier
  2005-02-22  7:38 ` Kenichi Handa
  0 siblings, 1 reply; 2+ messages in thread
From: Nic Ferrier @ 2005-02-17 12:48 UTC


I've noted the current discussion on Emacs coding.


I am experiencing a strange problem with Emacs encoding which I
thought I might share.

I'm reading the tcpd package's hosts_access man page with Emacs man
from this version of Emacs:

  GNU Emacs 21.3.50.22 (i686-pc-linux-gnu, GTK+ Version 2.4.10) of
  2004-12-14


In the man page viewed on a terminal there are nice little bullet
characters. Hexdump shows these characters as B7 so obviously the
terminal is not UTF-8.

The UTF-8 sequence for B7 is 0302 0267.
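
As a quick check of the byte values discussed here, the following Python
sketch encodes U+00B7 (MIDDLE DOT, the character that Latin-1 stores as
the single byte B7) and prints its UTF-8 bytes as octal escapes. The
helper name octal_escapes is made up for this illustration, and nothing
here is taken from Emacs itself.

```python
# U+00B7 MIDDLE DOT is the character that Latin-1 stores as byte 0xB7.
middle_dot = "\u00b7"

latin1_bytes = middle_dot.encode("latin-1")
utf8_bytes = middle_dot.encode("utf-8")


def octal_escapes(data: bytes) -> str:
    """Render each byte as a backslash followed by three octal digits."""
    return "".join(f"\\{b:03o}" for b in data)


print(latin1_bytes.hex())         # single byte in Latin-1
print(octal_escapes(utf8_bytes))  # two bytes in UTF-8
```

The octal form matches the \302\267 notation that Emacs uses when it
shows raw bytes.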

When I view the man page in Emacs with utf-8 encoding on by default I
get a \267. Encoding the page as unix produces:  \302\267 which
*does* look like a valid UTF-8 byte sequence.
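
For illustration only, and not as a diagnosis of what Emacs does
internally: the Python sketch below shows that a lone B7 byte cannot be
decoded as UTF-8, because B7 is a continuation byte, while the two-byte
pair C2 B7 decodes to a single character. The variable names are
invented for this example.

```python
# A lone 0xB7 is a UTF-8 continuation byte, so it is not valid by itself.
lone_byte = b"\xb7"
try:
    lone_byte.decode("utf-8")
    lone_byte_is_valid = True
except UnicodeDecodeError:
    lone_byte_is_valid = False

# The full sequence 0xC2 0xB7 (octal 302 267) decodes to one character.
pair_decodes_to = b"\xc2\xb7".decode("utf-8")

print(lone_byte_is_valid)
print(len(pair_decodes_to), hex(ord(pair_decodes_to)))
```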

When I do (what-cursor-position) on the character I get 302 which is
the first byte in the sequence.

I'm not sure what Emacs is doing here. It looks like valid UTF-8 and
yet (what-cursor-position) obviously does not believe there is a UTF-8
character.

Anybody got any idea why the correct character doesn't display?


btw WoMan displays the manual page with the strange bullet converted to
an asterisk.


-- 
Nic Ferrier
http://www.tapsellferrier.co.uk


* Re: strange UTF8 encoding problem (relevant to decoding-system-gone-awry?)
  2005-02-17 12:48 strange UTF8 encoding problem (relevant to decoding-system-gone-awry?) Nic Ferrier
@ 2005-02-22  7:38 ` Kenichi Handa
  0 siblings, 0 replies; 2+ messages in thread
From: Kenichi Handa @ 2005-02-22  7:38 UTC
  Cc: emacs-devel

In article <87y8dnmjx5.fsf@kanga.tapsellferrier.co.uk>, Nic Ferrier <nferrier@tapsellferrier.co.uk> writes:

> I've noted the current discussion on Emacs coding.
> I am experiencing a strange problem with Emacs encoding which I
> thought I might share.

> I'm reading the tcpd package's hosts_access man page with Emacs man
> from this version of Emacs:

>   GNU Emacs 21.3.50.22 (i686-pc-linux-gnu, GTK+ Version 2.4.10) of
>   2004-12-14


> In the man page viewed on a terminal there are nice little bullet
> characters. Hexdump shows these characters as B7 so obviously the
> terminal is not UTF-8.

> The UTF-8 sequence for B7 is 0302 0267.

> When I view the man page in Emacs with utf-8 encoding on by default I
> get a \267. Encoding the page as unix produces:  \302\267 which
> *does* look like a valid UTF-8 byte sequence.

> When I do (what-cursor-position) on the character I get 302 which is
> the first byte in the sequence.

> I'm not sure what Emacs is doing here. It looks like valid UTF-8 and
> yet (what-cursor-position) obviously does not believe there is a UTF-8
> character.

> Anybody got any idea why the correct character doesn't display?

I can't reproduce it.  What I did is:
% LANG=de_DE.UTF-8 emacs -Q
and M-x man RET man RET

It decodes the UTF-8 output of the man command correctly.

What is the value of enable-multibyte-characters?
Can you reproduce the bug with the -Q argument?

---
Ken'ichi HANDA
handa@m17n.org
