From: Lars Ingebrigtsen
To: Dmitry Gutov
Cc: stakemorii@gmail.com, Ted Zlatanov, schwab@linux-m68k.org, 24117@debbugs.gnu.org
Newsgroups: gmane.emacs.bugs
Subject: bug#24117: 25.1; url-http-create-request: Multibyte text in HTTP request
Date: Mon, 08 May 2017 22:57:34 +0200
In-Reply-To: <605199d2-551d-07c8-71b4-ca73c008246a@yandex.ru> (Dmitry Gutov's message of "Mon, 8 May 2017 16:36:36 +0300")

Dmitry Gutov writes:

> Just got around to this. The test I came up with looks like this:
>
> (ert-deftest url-generic-parse-url/multibyte-host-and-path ()
>   (should (equal (url-generic-parse-url "http://банки.рф/фыва/")
>                  (url-parse-make-urlobj "http" nil nil "банки.рф" nil
>                                         "/фыва/" nil nil t))))

That looks like the correct decomposition of this URL, so
url-generic-parse-url does the right thing.

> But! What behavior would this test?
> If we're making sure here that
> url-generic-parse-url can cope with multibyte characters anywhere in
> the URL, the encode-coding-string/decode-coding-string logic in
> url-encode-url is extraneous. I'm not sure that it is, or is there are
> some edge cases (are they fixable? should we add tests for them?).

(url-encode-url "http://банки.рф/фыва/")
=> "http://банки.рф/%D1%84%D1%8B%D0%B2%D0%B0/"

It is perhaps debatable whether the host name should be encoded (with
punycode) here, but this is otherwise correct.

> So if this test goes in, it should be accompanied with the
> simplification of url-encode-url.
>
> Lars, what do you think?

The utf-8 encoding does seem superfluous, especially since
url-hexify-string also does the encoding...

(url-hexify-string "фыва")
=> "%D1%84%D1%8B%D0%B2%D0%B0"

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
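
The point about url-hexify-string doing the encoding itself can be
checked with a small sketch. It assumes a UTF-8 source file and that
url-hexify-string converts multibyte input to UTF-8 bytes before
escaping; sketch-hexify-both-ways is a hypothetical helper made up for
this illustration and is not part of url.el.

```elisp
;; -*- lexical-binding: t; coding: utf-8 -*-
(require 'url-util)

(defun sketch-hexify-both-ways (string)
  "Return (DIRECT . PRE-ENCODED) percent-encodings of STRING.
DIRECT hexifies the multibyte STRING as-is; PRE-ENCODED first runs
it through `encode-coding-string' with utf-8, the way an explicit
encoding step in a caller would.  Hypothetical helper, for
illustration only."
  (cons (url-hexify-string string)
        (url-hexify-string (encode-coding-string string 'utf-8))))

;; Compare the two results for the path component from the message.
(let ((result (sketch-hexify-both-ways "фыва")))
  (message "direct:      %s" (car result))
  (message "pre-encoded: %s" (cdr result))
  (message "identical:   %s" (string= (car result) (cdr result))))
```

If the two strings come out identical, an explicit utf-8 encoding step
ahead of url-hexify-string adds nothing for this kind of input, which
is the case for simplifying url-encode-url.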