From: Dmitry Gutov
Newsgroups: gmane.emacs.bugs
Subject: bug#24117: 25.1; url-http-create-request: Multibyte text in HTTP request
Date: Wed, 10 Aug 2016 10:12:40 +0300
Message-ID: <65f6508f-a464-7f66-fd14-1372dce86aa7@yandex.ru>
In-Reply-To: <834m6uhu87.fsf@gnu.org>
To: Eli Zaretskii, Andreas Schwab
Cc: stakemorii@gmail.com, larsi@gnus.org, 24117@debbugs.gnu.org

On 08/09/2016 05:50 PM, Eli Zaretskii wrote:

>> You can't
>> encode it properly without parsing it first.
>
> You don't say what you meant by "encode properly".  It's just a
> string, and there are ways to make a string unibyte without any
> parsing.

Different parts of a URL are supposed to be encoded in different ways.

For instance, http://банки.рф/фыва/ turns into

   http://xn--80abwho.xn--p1ai/%D1%84%D1%8B%D0%B2%D0%B0/

The domain is encoded with IDNA, whereas the path uses percent-encoding.

And they're also often encoded separately (e.g. when you copy-paste the above URL from Firefox to a text editor, the result is http://банки.рф/%D1%84%D1%8B%D0%B2%D0%B0/).

So I think the encoding of the URL parts should be performed inside url-http-create-request. On the master branch, host is passed through IDNA encoding, but real-fname is untouched.

On emacs-25, I think we should convert both to unibyte. I'm not sure encode-coding-string is the way to go (why would we assume UTF-8?). Personally, I think using string-as-unibyte makes more sense (neither string should contain any multibyte characters at that point), but I defer to the more qualified colleagues.

(Why doesn't (encode-coding-string "aaaa" 'ascii) work?)
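
To make the per-part encoding above concrete, here is a minimal sketch. It is not the code in url-http-create-request: my-encode-url-parts is a hypothetical helper, and it assumes puny.el is available for the IDNA step (as far as I know that library is not part of emacs-25). It also assumes the URL has no port, query string, or already-escaped octets, since url-hexify-string would escape "?" and "%" as well.

```elisp
;; Sketch only: `my-encode-url-parts' is a hypothetical name, not part
;; of url-http.el.  Assumes puny.el can be loaded.
(require 'url-parse)
(require 'url-util)
(require 'puny)

(defun my-encode-url-parts (url)
  "Return URL with its host IDNA-encoded and its path percent-encoded.
Assumes URL has no port, query string, or pre-escaped octets."
  (let* ((parsed (url-generic-parse-url url))
         ;; The domain gets IDNA (Punycode) encoding.
         (host (puny-encode-domain (url-host parsed)))
         ;; Each path segment is percent-encoded on its own, so the
         ;; "/" separators survive.  `url-hexify-string' encodes
         ;; multibyte input as UTF-8 before escaping.
         (path (mapconcat #'url-hexify-string
                          (split-string (url-filename parsed) "/")
                          "/")))
    (concat (url-type parsed) "://" host path)))

;; Once both parts are encoded the string is pure ASCII, so the
;; conversion to unibyte cannot lose anything.  `string-to-unibyte'
;; signals an error if a non-ASCII character is still present.
(defun my-encoded-url-as-unibyte (url)
  "Return the encoded form of URL as a unibyte string."
  (string-to-unibyte (my-encode-url-parts url)))

(my-encoded-url-as-unibyte "http://банки.рф/фыва/")
```

With the example URL from this message, the result should be the xn--.../%D1%84... form shown earlier. Using string-to-unibyte rather than string-as-unibyte is my choice for the sketch, because it fails loudly if something non-ASCII slipped through instead of silently reinterpreting the bytes.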