From: David Kastrup <dak@gnu.org>
To: emacs-devel@gnu.org
Subject: Re: Bug with UTF-8 string and dbus
Date: Wed, 09 Jun 2010 22:42:55 +0200
Message-ID: <87fx0wuecw.fsf@lola.goethe.zz>
In-Reply-To: m2r5kgj6e6.fsf@igel.home

Andreas Schwab <schwab@linux-m68k.org> writes:

> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>> AFAIK, Emacs's internal encoding is valid utf-8.  It uses private
>> characters for some things, but I don't think that makes it invalid.
>
> The eight-bit characters are encoded outside of the Unicode range, and a
> good utf-8 decoder must treat them as invalid.

Yes, that's the whole point.  Indeed, Emacs' own utf-8 decoder treats
them as invalid too: when Emacs considers the data to be in utf-8
instead of emacs-internal encoding, it will decode the respective codes
into its "raw byte" representation.  Which again is not legal utf-8 (but a
rather obvious "extension" of the utf-8 encoding scheme, which quite
artificially stops at 2^20+2^16 or something similar which I don't
accurately remember, a limit that is a consequence of the range
encodable with utf-16 with surrogate codes).
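
To make the limit concrete, here is a minimal Python sketch (not Emacs
code).  It assumes only the standard utf-8 bit layout; the helper names
extended_utf8_4byte and is_strict_utf8 are hypothetical, and 0x110000 is
just an arbitrary out-of-range value, not Emacs' actual raw-byte
representation.  It shows that the 4-byte pattern has room beyond the
Unicode limit, and that a strict decoder rejects such sequences anyway.

```python
# Highest code point reachable with utf-16: the 2^16 values of the basic
# plane plus the 2^20 values addressed by surrogate pairs (1024 x 1024).
MAX_UNICODE = 2**16 + 2**20 - 1  # 0x10FFFF


def extended_utf8_4byte(cp: int) -> bytes:
    """Encode cp with the 4-byte utf-8 bit pattern, skipping the Unicode
    range check.  Hypothetical helper, for illustration only."""
    if not 0x10000 <= cp <= 0x1FFFFF:
        raise ValueError("the 4-byte pattern holds 17 to 21 bit values")
    return bytes([
        0xF0 | (cp >> 18),            # lead byte: 11110xxx
        0x80 | ((cp >> 12) & 0x3F),   # continuation: 10xxxxxx
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


def is_strict_utf8(data: bytes) -> bool:
    """Return True if Python's strict utf-8 decoder accepts data."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


in_range = extended_utf8_4byte(MAX_UNICODE)       # last legal code point
out_of_range = extended_utf8_4byte(0x110000)      # one past the limit

print(in_range.hex(), is_strict_utf8(in_range))
print(out_of_range.hex(), is_strict_utf8(out_of_range))
```

The bit pattern itself does not care about the boundary; the cut-off at
0x10FFFF is a rule imposed on decoders, which is why an "extended"
encoding is easy to produce but must be refused by a conforming reader.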

-- 
David Kastrup