From: Lars Ingebrigtsen
Newsgroups: gmane.emacs.devel
Subject: Re: eww doesn't decode %AA%BB%CC URL names
Date: Thu, 24 Dec 2015 20:55:13 +0100
Message-ID: <87bn9fr59a.fsf@gnus.org>
References: <83r3n0llkt.fsf@gnu.org> <87vb7nsq1g.fsf@gnus.org> <83ziwzmzyp.fsf@gnu.org> <8760znslig.fsf@gnus.org> <83twn7myin.fsf@gnu.org>
In-Reply-To: <83twn7myin.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 24 Dec 2015 21:34:24 +0200")
To: Eli Zaretskii
Cc: emacs-devel@gnu.org, yuri.v.khan@gmail.com
User-Agent: Gnus/5.130014 (Ma Gnus v0.14) Emacs/25.1.50 (gnu/linux)

Eli Zaretskii writes:

>> That's basically just (car (decode-coding-string ...))
>
> I believe you meant detect-coding-string.

Yup.  :-)

>> though, since it'll return utf-8 first if that's a possible charset,
>> won't it?
>
> You cannot rely on it returning UTF-8, that depends on coding
> priorities (that are subject to customizations) and other things.
>
> I think you should use UTF-8 literally as the first choice.

Right.  How do I check whether the bytes are a valid utf-8 sequence,
though?  I thought I remembered something called
`valid-something-something-p', but I can't find it now...

> Yes, they use these in preference to everything else, something like
> this:
>
>   (let ((coding (or coding-system-for-read
>                     document-encoding
>                     locale-coding-system
>                     ...)))
>     (decode-coding-string ... coding))

Okidoke.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
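
For the record, one possible way to do the "is this valid utf-8" check
and the fallback chain discussed above.  This is only a sketch, not what
eww does: the names `my/valid-utf-8-p' and `my/decode-url-file-name' are
made up for illustration, DOCUMENT-ENCODING is a hypothetical argument
standing in for whatever encoding the page declared, and the exact
ordering of the fallbacks is an assumption.  The validity check relies
on the decoder keeping bytes it cannot interpret as raw 8-bit characters
(charset `eight-bit').

```elisp
(require 'url-util)                     ; for `url-unhex-string'

(defun my/valid-utf-8-p (bytes)
  "Return non-nil if the unibyte string BYTES decodes cleanly as UTF-8.
Hypothetical helper.  Bytes that are not valid UTF-8 are assumed to
survive decoding as raw 8-bit characters, which belong to the charset
`eight-bit', so their presence marks the input as invalid."
  (not (memq 'eight-bit
             (find-charset-string (decode-coding-string bytes 'utf-8)))))

(defun my/decode-url-file-name (name &optional document-encoding)
  "Decode the %XX escapes in NAME and return a readable string.
DOCUMENT-ENCODING, if non-nil, is the coding system the page declared.
The order below is an assumption: an explicit user override first,
then literal UTF-8 when the bytes are valid, then the document and
locale encodings, and only then a guess."
  (let* ((bytes (url-unhex-string name))
         (coding (or coding-system-for-read
                     (and (my/valid-utf-8-p bytes) 'utf-8)
                     document-encoding
                     locale-coding-system
                     ;; Last resort; the result depends on the
                     ;; user's coding priorities.
                     (car (detect-coding-string bytes)))))
    (decode-coding-string bytes coding)))

;; Example calls:
;; (my/decode-url-file-name "%C3%A9")            ; bytes are valid UTF-8
;; (my/decode-url-file-name "%E9" 'iso-8859-1)   ; falls back to Latin-1
```

Putting `utf-8' ahead of `detect-coding-string' means the result for
valid UTF-8 names does not depend on the coding priorities; the
detection step is only reached when everything else is nil.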