unofficial mirror of bug-gnu-emacs@gnu.org 
From: Eli Zaretskii <eliz@gnu.org>
To: Kenichi Handa <handa@gnu.org>
Cc: larsi@gnus.org, 31149@debbugs.gnu.org, monnier@IRO.UMontreal.CA
Subject: bug#31149: 27.0.50; (gui-get-selection nil 'text/html) returns mis-decoded text
Date: Fri, 11 May 2018 12:18:13 +0300
Message-ID: <83vabuo8iy.fsf@gnu.org>
In-Reply-To: <83h8nmsasr.fsf@gnu.org> (message from Eli Zaretskii on Sat, 05 May 2018 12:37:24 +0300)

Ping! Ping! Ping!

> Date: Sat, 05 May 2018 12:37:24 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: larsi@gnus.org, 31149@debbugs.gnu.org, monnier@IRO.UMontreal.CA
> 
> Ping! Ping!
> 
> > Date: Tue, 24 Apr 2018 21:11:10 +0300
> > From: Eli Zaretskii <eliz@gnu.org>
> > Cc: larsi@gnus.org, 31149@debbugs.gnu.org, monnier@IRO.UMontreal.CA
> > 
> > Ping!
> > 
> > > Date: Sat, 14 Apr 2018 09:32:41 +0300
> > > From: Eli Zaretskii <eliz@gnu.org>
> > > Cc: larsi@gnus.org, 31149@debbugs.gnu.org
> > > 
> > > > From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> > > > Date: Fri, 13 Apr 2018 16:55:26 -0400
> > > > Cc: Lars Ingebrigtsen <larsi@gnus.org>
> > > > 
> > > > (gui-get-selection nil 'text/html)
> > > > 
> > > > returns utf-16 text when the primary selection is owned by Mozilla, but
> > > > we decode it as latin-1 instead, so it looks like garbage.
> > > > 
> > > > I don't know why we're getting utf-16.  Is that what standards say it
> > > > should do?  If so, we should adjust our code (which currently knows
> > > > nothing about the `text/html` target-type).
> > > > 
> > > > As for why we decode it as latin-1, it's (under GNU/Linux; Lars may be
> > > > using something else because he's getting something with a `charset`
> > > > property which I don't get here) because:
> > > > - selection_data_to_lisp_data (in xselect.c) makes a unibyte string with
> > > >   the property `foreign-selection` set to `STRING` when the actual
> > > >   string type is not known (as opposed to COMPOUND-TEXT and
> > > >   UTF8-STRING, basically).
> > > > - in gui-get-selection we then have a mapping from `STRING` to
> > > >   `iso-8859-1` (which is apparently the right thing for the official
> > > >   `STRING` target-type in X11).
> > > > 
> > > > I can't figure out if/where these kinds of details about the X11
> > > > selection protocol are described, but at least in `xclip` they have
> > > > a hack specifically for this case:
> > > > 
> > > >     [...]
> > > >     if (html != None && sel_type == html) {
> > > >         /* if the buffer contains UCS-2 (UTF-16), convert to
> > > >          * UTF-8.  Mozilla-based browsers do this for the
> > > >          * text/html target.
> > > >          */
> > > >     [...]
> > > > 
> > > > and according to the subsequent code it's not even always the
> > > > same endianness.
> > > > 
> > > > I don't know what the difference is between the `target-type` passed to
> > > > x-get-selection-internal and the `foreign-selection` property we get on
> > > > the returned string (they seem to be the same in my tests, except when
> > > > the type is not one of the known ones, in which case we force
> > > > `foreign-selection` to be `STRING`).
> > > 
> > > I hope Handa-san (CC'ed) could comment on this.
> > 
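
To make the failure mode described above concrete, here is a minimal
sketch in Python (not Emacs code, and not what xclip does verbatim).
It shows how a UTF-16 payload turns into garbage when decoded as
Latin-1, and how a byte-order mark could be used to pick the right
endianness.  The helper name decode_html_selection is hypothetical,
and the sketch assumes the selection owner prepends a BOM, which the
thread does not confirm.

```python
import codecs


def decode_html_selection(raw: bytes) -> str:
    """Decode text/html selection bytes (hypothetical helper, not an Emacs API)."""
    # Assumption: a UTF-16 payload starts with a byte-order mark that
    # tells us which endianness the selection owner used.
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[2:].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[2:].decode("utf-16-be")
    # No BOM: try UTF-8 first, then fall back to Latin-1, which accepts
    # any byte sequence.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


html = "<b>caf\u00e9</b>"
raw = codecs.BOM_UTF16_LE + html.encode("utf-16-le")

# Treating the bytes as Latin-1 (the STRING -> iso-8859-1 mapping
# discussed above) yields the BOM as two characters plus a NUL after
# every ASCII character.
garbled = raw.decode("latin-1")

# Honouring the BOM recovers the original markup.
fixed = decode_html_selection(raw)
```

Whether a BOM is always present is exactly the kind of detail the
thread leaves open, so a real fix might need a heuristic for BOM-less
UTF-16 as well.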




