* Question about (gui-get-selection nil 'text/html)
@ 2018-04-13 19:00 Lars Ingebrigtsen
2018-04-13 19:07 ` Lars Ingebrigtsen
2018-04-13 21:42 ` Stefan Monnier
0 siblings, 2 replies; 8+ messages in thread
From: Lars Ingebrigtsen @ 2018-04-13 19:00 UTC (permalink / raw)
To: emacs-devel
So, we can yank HTML that we cut from Firefox like so:
(gui-get-selection nil 'text/html)
... sort of.
I've put the result into a binary file so it'll hopefully survive the
email transport.
[-- Attachment #2: selection.bin --]
[-- Type: application/octet-stream, Size: 125 bytes --]
So... what is that? I've tried to google, but I found nothing
promising, so my Google-fu is probably bad.
It rather looks like UTF-16 -- every other byte is a NUL. But
Emacs claims that it's iso-8859-1. And... I've tried decoding various
instances of these things as UTF-16, and it almost kinda works, but
not quite.
Any ideas?
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
* Re: Question about (gui-get-selection nil 'text/html)
2018-04-13 19:00 Question about (gui-get-selection nil 'text/html) Lars Ingebrigtsen
@ 2018-04-13 19:07 ` Lars Ingebrigtsen
2018-04-13 20:27 ` Eli Zaretskii
2018-04-13 21:42 ` Stefan Monnier
1 sibling, 1 reply; 8+ messages in thread
From: Lars Ingebrigtsen @ 2018-04-13 19:07 UTC (permalink / raw)
To: emacs-devel
Oh, wow. If I just do
(decode-coding-region (point-min) (point-max) 'utf-16-le)
instead of utf-16, I get the HTML I expect instead of a bunch of Chinese
characters. :-)
There's a byte order mark at the start -- isn't utf-16 supposed to use
that to get the byte order?
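The "bunch of Chinese characters" effect can be reproduced outside Emacs. The following is a minimal sketch in Python, not Emacs Lisp, and the sample string is invented for illustration. It assumes the selection data is little-endian UTF-16 without a usable BOM, and that the decoder falls back to big-endian; neither assumption is confirmed in the thread.

```python
# Hypothetical sample; the real selection contents are not shown in the thread.
html = "<html>"

# Little-endian UTF-16: for ASCII text, every other byte is NUL.
raw = html.encode("utf-16-le")
every_other_byte = raw[1::2]

# Decoding with the matching byte order gives the original text back.
as_le = raw.decode("utf-16-le")

# Decoding with the wrong byte order swaps each byte pair, so "<" (3C 00)
# becomes U+3C00, which lies in a CJK block.
as_be = raw.decode("utf-16-be")

print(repr(as_le))
print([hex(ord(ch)) for ch in as_be])
```

Each ASCII character ends up with its code shifted into the high byte, which is why the wrongly decoded text lands in CJK ranges rather than looking like random noise.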
* Re: Question about (gui-get-selection nil 'text/html)
2018-04-13 19:07 ` Lars Ingebrigtsen
@ 2018-04-13 20:27 ` Eli Zaretskii
2018-04-13 20:29 ` Lars Ingebrigtsen
0 siblings, 1 reply; 8+ messages in thread
From: Eli Zaretskii @ 2018-04-13 20:27 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: emacs-devel
> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Fri, 13 Apr 2018 21:07:27 +0200
>
> Oh, wow. If I just do
>
> (decode-coding-region (point-min) (point-max) 'utf-16-le)
>
> instead of utf-16, I get the HTML I expect instead of a bunch of Chinese
> characters. :-)
>
> There's a byte order mark at the start -- isn't utf-16 supposed to use
> that to get the byte order?
The file you attached has no BOM.
* Re: Question about (gui-get-selection nil 'text/html)
2018-04-13 20:27 ` Eli Zaretskii
@ 2018-04-13 20:29 ` Lars Ingebrigtsen
2018-04-13 22:07 ` Andreas Schwab
2018-04-14 6:42 ` Eli Zaretskii
0 siblings, 2 replies; 8+ messages in thread
From: Lars Ingebrigtsen @ 2018-04-13 20:29 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Lars Ingebrigtsen <larsi@gnus.org>
>> Date: Fri, 13 Apr 2018 21:07:27 +0200
>>
>> Oh, wow. If I just do
>>
>> (decode-coding-region (point-min) (point-max) 'utf-16-le)
>>
>> instead of utf-16, I get the HTML I expect instead of a bunch of Chinese
>> characters. :-)
>>
>> There's a byte order mark at the start -- isn't utf-16 supposed to use
>> that to get the byte order?
>
> The file you attached has no BOM.
The first four bytes were
\303\277\303\276
Isn't that the BOM? Or do I misremember?
* Re: Question about (gui-get-selection nil 'text/html)
2018-04-13 19:00 Question about (gui-get-selection nil 'text/html) Lars Ingebrigtsen
2018-04-13 19:07 ` Lars Ingebrigtsen
@ 2018-04-13 21:42 ` Stefan Monnier
1 sibling, 0 replies; 8+ messages in thread
From: Stefan Monnier @ 2018-04-13 21:42 UTC (permalink / raw)
To: emacs-devel
> So, we can yank HTML that we cut from Firefox like so:
>
> (gui-get-selection nil 'text/html)
I've opened a bug report for that: bug#31149
Stefan
* Re: Question about (gui-get-selection nil 'text/html)
2018-04-13 20:29 ` Lars Ingebrigtsen
@ 2018-04-13 22:07 ` Andreas Schwab
2018-04-13 22:18 ` Lars Ingebrigtsen
2018-04-14 6:42 ` Eli Zaretskii
1 sibling, 1 reply; 8+ messages in thread
From: Andreas Schwab @ 2018-04-13 22:07 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: Eli Zaretskii, emacs-devel
On Apr 13 2018, Lars Ingebrigtsen <larsi@gnus.org> wrote:
> The first four bytes were
>
> \303\277\303\276
>
> Isn't that the BOM? Or do I misremember?
It's a BOM encoded as UTF-16 encoded as UTF-8.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
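The double encoding described above can be checked byte by byte. This is a minimal sketch in Python. It assumes the intermediate step treated the two BOM bytes as Latin-1 characters, which would fit Emacs reporting iso-8859-1 earlier in the thread, but that step is an inference and not something the thread states.

```python
# The four bytes reported in the thread, written in octal as Emacs showed them.
observed = bytes([0o303, 0o277, 0o303, 0o276])

# Step 1: the BOM character U+FEFF encoded as little-endian UTF-16.
bom_utf16le = "\ufeff".encode("utf-16-le")

# Step 2 (assumed): those two bytes misread as two Latin-1 characters.
misread = bom_utf16le.decode("latin-1")

# Step 3: the two characters encoded as UTF-8, two bytes each.
double_encoded = misread.encode("utf-8")

# Reversing steps 3 and 2 recovers the raw BOM bytes.
recovered = observed.decode("utf-8").encode("latin-1")

print(bom_utf16le.hex(), double_encoded.hex(), recovered.hex())
```

Under that assumption the four observed bytes are exactly what the pipeline produces, and undoing the UTF-8 layer yields the two-byte little-endian BOM.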
* Re: Question about (gui-get-selection nil 'text/html)
2018-04-13 22:07 ` Andreas Schwab
@ 2018-04-13 22:18 ` Lars Ingebrigtsen
0 siblings, 0 replies; 8+ messages in thread
From: Lars Ingebrigtsen @ 2018-04-13 22:18 UTC (permalink / raw)
To: Andreas Schwab; +Cc: Eli Zaretskii, emacs-devel
Andreas Schwab <schwab@linux-m68k.org> writes:
> On Apr 13 2018, Lars Ingebrigtsen <larsi@gnus.org> wrote:
>
>> The first four bytes were
>>
>> \303\277\303\276
>>
>> Isn't that the BOM? Or do I misremember?
>
> It's a BOM encoded as UTF-16 encoded as UTF-8.
Heh heh. Beautiful.
* Re: Question about (gui-get-selection nil 'text/html)
2018-04-13 20:29 ` Lars Ingebrigtsen
2018-04-13 22:07 ` Andreas Schwab
@ 2018-04-14 6:42 ` Eli Zaretskii
1 sibling, 0 replies; 8+ messages in thread
From: Eli Zaretskii @ 2018-04-14 6:42 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: emacs-devel
> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Fri, 13 Apr 2018 22:29:41 +0200
> Cc: emacs-devel@gnu.org
>
> >> There's a byte order mark at the start -- isn't utf-16 supposed to use
> >> that to get the byte order?
> >
> > The file you attached has no BOM.
>
> The first four bytes were
>
> \303\277\303\276
>
> Isn't that the BOM? Or do I misremember?
Look at it with hexl-find-file or with "od -x", and you will see it's
not a BOM (which should be either FFFE or FEFF).
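The check suggested above, looking at the first two raw bytes, can be sketched as follows. This is Python and not part of Emacs; the helper name `utf16_bom` is hypothetical. It only looks for the two-byte UTF-16 BOM patterns and makes no attempt to guess an encoding otherwise.

```python
import codecs
from typing import Optional


def utf16_bom(data: bytes) -> Optional[str]:
    """Return the byte order implied by a leading UTF-16 BOM, or None."""
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "utf-16-be"
    return None


# A real little-endian BOM followed by "<" is recognized.
with_bom = utf16_bom(b"\xff\xfe<\x00")

# The four bytes quoted in the thread do not start with FF FE or FE FF.
quoted = utf16_bom(bytes([0o303, 0o277, 0o303, 0o276]))

print(with_bom, quoted)
```

A decoder that relies on the BOM to pick the byte order has nothing to go on when this check fails, so the byte order then has to be given explicitly.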
Thread overview: 8+ messages
2018-04-13 19:00 Question about (gui-get-selection nil 'text/html) Lars Ingebrigtsen
2018-04-13 19:07 ` Lars Ingebrigtsen
2018-04-13 20:27 ` Eli Zaretskii
2018-04-13 20:29 ` Lars Ingebrigtsen
2018-04-13 22:07 ` Andreas Schwab
2018-04-13 22:18 ` Lars Ingebrigtsen
2018-04-14 6:42 ` Eli Zaretskii
2018-04-13 21:42 ` Stefan Monnier