From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#31149: 27.0.50; (gui-get-selection nil 'text/html) returns mis-decoded text
Date: Sun, 29 Sep 2019 12:31:58 +0300
Message-ID: <83o8z3fnxt.fsf@gnu.org>
References: <87h84vqynz.fsf@gnus.org>
In-reply-to: <87h84vqynz.fsf@gnus.org> (message from Lars Ingebrigtsen on Sun, 29 Sep 2019 10:44:48 +0200)
To: Lars Ingebrigtsen
Cc: 31149@debbugs.gnu.org, monnier@IRO.UMontreal.CA
Xref: news.gmane.org gmane.emacs.bugs:167633

> From: Lars Ingebrigtsen
> Date: Sun, 29 Sep 2019 10:44:48 +0200
> Cc: 31149@debbugs.gnu.org
>
> > if (html != None && sel_type == html) {
> >     /* if the buffer contains UCS-2 (UTF-16), convert to
> >      * UTF-8.  Mozilla-based browsers do this for the
> >      * text/html target.
> >      */
> > [...]
> >
> > and according to the subsequent code it's not even always the
> > same endianness.
>
> I think it would make sense for us to do the same here.  It should be
> easy enough for us to detect that the string is utf-16, I think?

I think you want to use auto-coding-regexp-alist-lookup.

> The data has a BOM

Does it?  It doesn't have to, at least not in principle.
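
To make the BOM question concrete, here is a minimal C sketch of one way to guess whether selection data is UTF-16 and in which byte order.  It is not Emacs code and not the code quoted above; the names guess_utf16 and enum utf16_guess are hypothetical.  It trusts a BOM when one is present and otherwise falls back on a heuristic that assumes the text/html payload is mostly ASCII markup, so that every other byte is NUL.  That assumption can fail for text that is mostly non-ASCII.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type; these names are illustrative only.  */
enum utf16_guess { NOT_UTF16, UTF16_LE, UTF16_BE };

/* Guess whether BUF (LEN bytes of selection data) is UTF-16.
   A BOM decides the question when present.  Without one, assume the
   data is mostly ASCII markup: in UTF-16 every other byte is then
   NUL, and the position of the NULs reveals the byte order.  */
enum utf16_guess
guess_utf16 (const unsigned char *buf, size_t len)
{
  assert (buf != NULL || len == 0);

  if (len < 2)
    return NOT_UTF16;

  /* Byte order mark: FF FE is little-endian, FE FF is big-endian.  */
  if (buf[0] == 0xFF && buf[1] == 0xFE)
    return UTF16_LE;
  if (buf[0] == 0xFE && buf[1] == 0xFF)
    return UTF16_BE;

  /* No BOM: count NUL bytes at even and odd offsets.  */
  size_t pairs = len / 2;
  size_t nul_even = 0, nul_odd = 0;
  for (size_t i = 0; i < pairs; i++)
    {
      if (buf[2 * i] == 0)
        nul_even++;
      if (buf[2 * i + 1] == 0)
        nul_odd++;
    }

  /* ASCII in little-endian puts the NUL second, in big-endian first.
     Require a majority in one position and none in the other.  */
  if (nul_odd * 2 > pairs && nul_even == 0)
    return UTF16_LE;
  if (nul_even * 2 > pairs && nul_odd == 0)
    return UTF16_BE;

  return NOT_UTF16;
}
```

With a BOM the answer is immediate; without one the function can only guess, which is why relying on the BOM alone is not enough in principle.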