From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: eww doesn't decode %AA%BB%CC URL names
Date: Thu, 24 Dec 2015 22:40:09 +0200
Message-ID: <83k2o3mvh2.fsf@gnu.org>
References: <83r3n0llkt.fsf@gnu.org> <87vb7nsq1g.fsf@gnus.org> <83ziwzmzyp.fsf@gnu.org> <8760znslig.fsf@gnus.org> <83twn7myin.fsf@gnu.org> <87bn9fr59a.fsf@gnus.org>
In-reply-to: <87bn9fr59a.fsf@gnus.org> (message from Lars Ingebrigtsen on Thu, 24 Dec 2015 20:55:13 +0100)
To: Lars Ingebrigtsen
Cc: emacs-devel@gnu.org, yuri.v.khan@gmail.com

> From: Lars Ingebrigtsen
> Cc: yuri.v.khan@gmail.com, emacs-devel@gnu.org
> Date: Thu, 24 Dec 2015 20:55:13 +0100
>
> > I think you should use UTF-8 literally as the first choice.
>
> Right.  How do I check whether the bytes are a valid utf-8 sequence,
> though?  I thought I remembered something called
> `valid-something-something-p', but I can't find it now...

I think you can run find-charset-string on the decoded string, and if
the result is just (unicode), you can be sure.
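
For readers outside Emacs, the idea discussed above (try UTF-8 first, and only accept it if the percent-decoded bytes form a valid UTF-8 sequence) might be sketched roughly as follows. This is an illustration in Python, not Emacs Lisp and not code from this thread; the function name `decode_percent_name` and the `latin-1` fallback are assumptions made for the example, since the thread does not say which charset to fall back to.

```python
from urllib.parse import unquote_to_bytes


def decode_percent_name(name, fallback="latin-1"):
    """Percent-decode NAME, preferring UTF-8 when the bytes are valid UTF-8.

    Returns a (text, charset) pair.  The fallback charset is a hypothetical
    choice for this sketch, not something specified in the discussion.
    """
    raw = unquote_to_bytes(name)  # e.g. "%D0%90" becomes b"\xd0\x90"
    try:
        # Strict decoding raises UnicodeDecodeError on any invalid sequence,
        # which serves as the "is this valid UTF-8?" check.
        return raw.decode("utf-8", errors="strict"), "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8: decode with the assumed fallback charset instead.
        return raw.decode(fallback), fallback
```

With this sketch, `decode_percent_name("%D0%90")` is accepted as UTF-8, while `decode_percent_name("%AA%BB%CC")` is rejected by the strict decoder (0xAA cannot start a UTF-8 sequence) and goes through the fallback.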