unofficial mirror of emacs-devel@gnu.org 
* Re: Fwd: Problem with non-bmp unicode
From: Kenichi Handa @ 2006-11-12  2:32 UTC
  Cc: emacs-devel

In article <200611101127.47456.jerome@marant.org>, Jérôme Marant <jerome@marant.org> writes:

> Do you have any clue about this?

Sorry for the late response on this thread.

> Subject: Problem with non-bmp unicode
> Date: Wednesday 08 November 2006 09:26
[...]
> A UTF-8 file (attached) with these three characters:
> U+0022 U+00010380 U+0022
> shows with "emacs -nw":
> "\360\220\216\200"
> which is not usable at all. The file displays correctly if I cat it.

> I tried a bunch of other characters outside the BMP, all of which
> fail in the same way. Characters in the BMP work nicely.

Emacs 22 still doesn't support Unicode characters outside the BMP.
If you really need to handle them, please use the CVS branch
emacs-unicode-2.
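
(For reference, the four octal escapes reported above are simply the
UTF-8 bytes of U+10380, which lies outside the BMP, i.e. above U+FFFF.
A minimal Python sketch, not Emacs code; the helper names
`utf8_octal_escapes` and `is_in_bmp` are made up for illustration.)

```python
def utf8_octal_escapes(code_point: int) -> str:
    """Return the UTF-8 bytes of a code point as backslash-octal escapes."""
    encoded = chr(code_point).encode("utf-8")
    # Each byte is printed as a three-digit octal escape, e.g. 0xF0 -> \360.
    return "".join(f"\\{byte:03o}" for byte in encoded)


def is_in_bmp(code_point: int) -> bool:
    """The Basic Multilingual Plane covers U+0000 through U+FFFF."""
    return 0 <= code_point <= 0xFFFF


if __name__ == "__main__":
    print(utf8_octal_escapes(0x10380))  # \360\220\216\200
    print(is_in_bmp(0x10380))           # False
    print(is_in_bmp(0x22))              # True
```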

> Apparently, emacs 22 shows a question mark instead of "\360\220\216\200"
> but trying to delete the question mark character with backspace turns it into
> "\360\220\216".

This behavior is described in a comment in utf-8.el:

;; We compose the untranslatable sequences into a single character,
;; and move point to the next character.
;; This is infelicitous for editing, because there's currently no
;; mechanism for treating compositions as atomic, but is OK for
;; display.  They are composed to U+FFFD with help-echo which
;; indicates the unicodes they represent.  This function GCs too much.

I tried to fix this editing problem by using the
modification-hooks text property, but couldn't get a
good result.
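
(To illustrate why the leftover "\360\220\216" is unusable: since the
composition is not treated as atomic, backspace removes only the last of
the four bytes, and the remaining three form an incomplete UTF-8
sequence. A minimal Python sketch, assuming exactly one trailing byte is
deleted as the report suggests; `decodes_as_utf8` is a hypothetical
helper, not anything in Emacs.)

```python
def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is a complete, well-formed UTF-8 byte string."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The four bytes of U+10380, and what is left after dropping the last one.
full = chr(0x10380).encode("utf-8")
truncated = full[:-1]

if __name__ == "__main__":
    print(decodes_as_utf8(full))       # True
    print(decodes_as_utf8(truncated))  # False
```

Treating the sequence as atomic would mean deleting all four bytes at
once, which is the behavior the modification-hooks attempt was aiming for.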

---
Kenichi Handa
handa@m17n.org


* Re: Fwd: Problem with non-bmp unicode
From: Jérôme Marant @ 2006-11-18 13:22 UTC
  Cc: Kenichi Handa

On Sunday 12 November 2006 03:32, Kenichi Handa wrote:

> > I tried a bunch of other characters outside the BMP, all of which
> > fail in the same way. Characters in the BMP work nicely.
> 
> Emacs 22 still doesn't support Unicode characters outside the BMP.
> If you really need to handle them, please use the CVS branch
> emacs-unicode-2.

OK. Noted.

> > Apparently, emacs 22 shows a question mark instead of "\360\220\216\200"
> > but trying to delete the question mark character with backspace turns it into
> > "\360\220\216".
> 
> This behavior is described in a comment in utf-8.el:
> 
> ;; We compose the untranslatable sequences into a single character,
> ;; and move point to the next character.
> ;; This is infelicitous for editing, because there's currently no
> ;; mechanism for treating compositions as atomic, but is OK for
> ;; display.  They are composed to U+FFFD with help-echo which
> ;; indicates the unicodes they represent.  This function GCs too much.
> 
> I tried to fix this editing problem by using the
> modification-hooks text property, but couldn't get a
> good result.

Thanks for trying anyway.

-- 
Jérôme Marant
