From: Kenichi Handa <handa@m17n.org>
Cc: emacs-pretest-bug@gnu.org, ihs_4664@yahoo.com,
christopher.ian.moore@gmail.com, emacs-devel@gnu.org,
richard.stallman@gnu.org
Subject: Re: Emacs puts binary junk into the clipboard, marking it as text
Date: Tue, 19 Sep 2006 16:14:01 +0900
Message-ID: <E1GPZnt-0000jy-00@etlken>
In-Reply-To: <450F8AF7.5010702@swipnet.se> (message from Jan Djärv on Tue, 19 Sep 2006 08:15:19 +0200)
In article <450F8AF7.5010702@swipnet.se>, Jan Djärv <jan.h.d@swipnet.se> writes:
> > AFAIK, only when TEXT is requested can a selection owner
> > choose the returned type from STRING, COMPOUND_TEXT, or
> > UTF8_STRING. When UTF8_STRING is requested, we should
> > return it or return nothing.
> >
> > And, if Emacs owns a unibyte string, perhaps the right thing
> > is to first make it multibyte according to the current
> > lang. env. (by string-make-multibyte), then encode
> > it in utf-8.
> What would that do to illegal UTF-8 sequences in the original unibyte string?
The original unibyte string won't be in UTF-8 format. But
string-make-multibyte will convert it to a correct multibyte
string, so encoding that multibyte string in UTF-8 will
produce a correct UTF-8 string ... usually.
> I.e. will this procedure always produce valid UTF-8 data?
No. If a byte in the original unibyte string is not a valid
code point of the primary charset of the current lang. env.,
string-make-multibyte will produce a multibyte string that
contains an eight-bit-control or eight-bit-graphic character.
Encoding that string in UTF-8 will then result in an incorrect
UTF-8 sequence. So, for safety, we must delete such eight-bit
characters or replace them with U+FFFD (REPLACEMENT
CHARACTER) before encoding in UTF-8.
Or, in such a case, don't return anything (which means Emacs
doesn't hold the requested data).
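
To make the three choices concrete, here is a rough sketch in
Python rather than Emacs Lisp; it is only an analogy, not the
Emacs code. The function name unibyte_to_utf8 and its on_invalid
parameter are made up for illustration, and cp1252 merely stands
in for "the primary charset of the current lang. env." because it
has unassigned bytes such as 0x81.

```python
from typing import Optional


def unibyte_to_utf8(data: bytes,
                    charset: str = "cp1252",
                    on_invalid: str = "replace") -> Optional[bytes]:
    """Convert raw bytes to valid UTF-8 via an assumed 8-bit charset.

    on_invalid selects what happens to bytes that are not valid
    code points of `charset`:
      "replace" - substitute U+FFFD (REPLACEMENT CHARACTER)
      "delete"  - drop the offending bytes
      "refuse"  - return None, i.e. hold no data for the request
    """
    if on_invalid == "replace":
        text = data.decode(charset, errors="replace")
    elif on_invalid == "delete":
        text = data.decode(charset, errors="ignore")
    elif on_invalid == "refuse":
        try:
            text = data.decode(charset)  # strict: raises on invalid bytes
        except UnicodeDecodeError:
            return None
    else:
        raise ValueError("unknown on_invalid policy: %r" % on_invalid)
    # Every character in `text` is now a real Unicode character,
    # so the encoded result is always well-formed UTF-8.
    return text.encode("utf-8")


if __name__ == "__main__":
    junk = b"a\x81b"  # 0x81 is unassigned in cp1252
    for policy in ("replace", "delete", "refuse"):
        print(policy, unibyte_to_utf8(junk, on_invalid=policy))
```

Whichever policy is chosen, the point is the same: the invalid
bytes are dealt with before the UTF-8 encoding step, so nothing
that is not valid UTF-8 is ever handed out as UTF8_STRING.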
---
Kenichi Handa
handa@m17n.org