unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
From: Dmitry Gutov <dgutov@yandex.ru>
To: Eli Zaretskii <eliz@gnu.org>
Cc: stakemorii@gmail.com, larsi@gnus.org, schwab@linux-m68k.org,
	24117@debbugs.gnu.org
Subject: bug#24117: 25.1; url-http-create-request: Multibyte text in HTTP request
Date: Thu, 11 Aug 2016 05:52:42 +0300	[thread overview]
Message-ID: <ef8575e5-8583-a7da-020b-c5ceca311362@yandex.ru> (raw)
In-Reply-To: <83bn10hetr.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 2184 bytes --]

On 08/10/2016 05:35 PM, Eli Zaretskii wrote:

> Are you saying that url-generic-parse-url performs this encoding, and
> that using a unibyte buffer causes that to fail?

No, url-generic-parse-url contains logic that allows to distinguish 
between the domain and the path parts of an URL. So apparently it might 
have to work on multibyte URLs.

That's not strictly necessary, however, given how url-encode-url uses it 
currently (it performs encode-coding-string and decode-coding-string on 
the URL string).

That approach seems flawed to me, but either way, someone will have to 
choose how url-encode-url should use url-generic-parse-url. If we intend 
to leave it as-is, then the proposed patch using set-buffer-multibyte 
actually works fine, even on master, with multibyte URLs.

>> So I think the encoding of the URL parts should be performed inside
>> url-http-create-request.
>
> Fine with me, but when I suggested that, you didn't like the
> suggestion.  If you changed your mind, let's do that.

See below. But yes, I'm more inclined toward this approach now, after 
Lar's objection, and after looking at the code in master.

>> On the master branch, host is passed through IDNA encoding, but
>> real-fname is untouched. On emacs-25, I think we should convert both
>> to unibyte.
>
> Not sure I understand why there should be a difference between the two
> branches.  Encoding an ASCII string doesn't do any harm.

Since it's ASCII, using utf-8 there seems misleading to me. It's a 
question of readability. As a bonus, using us-ascii will validate that 
the strings indeed do not contain any unexpected characters.

>> (Why doesn't (encode-coding-string "aaaa" 'ascii) work?)
>
> It's 'us-ascii, not 'ascii.

Thanks. Attaching a patch, it seems to work well enough.

I'd like to wait for Lar's response now, but someone will have to make 
an executive decision. Both patches (this and the set-multibyte-buffer-p 
one), work in the cases I've tested.

This one seems more conservative, but it'll require a manual merge to 
master. The other one is very trivial, will merge automatically, but 
might cause problems for potential less-careful uses of 
url-generic-parse-url.

[-- Attachment #2: url-http--encode-string.diff --]
[-- Type: text/x-patch, Size: 1359 bytes --]

diff --git a/lisp/url/url-http.el b/lisp/url/url-http.el
index 7156e6f..860e652 100644
--- a/lisp/url/url-http.el
+++ b/lisp/url/url-http.el
@@ -235,7 +235,7 @@ url-http-create-request
 			      'url-http-proxy-basic-auth-storage))
 			 (url-get-authentication url-http-proxy nil 'any nil))))
 	 (real-fname (url-filename url-http-target-url))
-	 (host (url-host url-http-target-url))
+	 (host (url-http--encode-string (url-host url-http-target-url)))
 	 (auth (if (cdr-safe (assoc "Authorization" url-http-extra-headers))
 		   nil
 		 (url-get-authentication (or
@@ -278,7 +278,8 @@ url-http-create-request
           (concat
              ;; The request
              (or url-http-method "GET") " "
-             (if using-proxy (url-recreate-url url-http-target-url) real-fname)
+             (url-http--encode-string
+              (if using-proxy (url-recreate-url url-http-target-url) real-fname))
              " HTTP/" url-http-version "\r\n"
              ;; Version of MIME we speak
              "MIME-Version: 1.0\r\n"
@@ -360,6 +361,11 @@ url-http-create-request
     (url-http-debug "Request is: \n%s" request)
     request))
 
+(defun url-http--encode-string (s)
+  (if (multibyte-string-p s)
+      (encode-coding-string s 'us-ascii)
+    s))
+
 ;; Parsing routines
 (defun url-http-clean-headers ()
   "Remove trailing \r from header lines.

  reply	other threads:[~2016-08-11  2:52 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-31  8:26 bug#24117: 25.1; url-http-create-request: Multibyte text in HTTP request Sho Takemori
2016-07-31 14:31 ` Eli Zaretskii
2016-07-31 23:21   ` Sho Takemori
2016-08-01 13:17     ` Eli Zaretskii
2016-08-02  0:52       ` Dmitry Gutov
2016-08-02 15:25         ` Eli Zaretskii
2016-08-03  2:39           ` Dmitry Gutov
2016-08-04 17:02             ` Eli Zaretskii
2016-08-08  1:56               ` Dmitry Gutov
2016-08-08 13:32                 ` Ted Zlatanov
2016-08-08 23:48                   ` Katsumi Yamaoka
2016-08-08 15:33                 ` Eli Zaretskii
2016-08-08 15:52                 ` Lars Ingebrigtsen
2016-08-08 15:54                 ` Lars Ingebrigtsen
2016-08-08 16:14                   ` Eli Zaretskii
2016-08-08 16:18                     ` Lars Ingebrigtsen
2016-08-08 16:33                       ` Eli Zaretskii
2016-08-08 17:11                         ` Andreas Schwab
2016-08-08 17:30                           ` Eli Zaretskii
2016-08-08 19:16                             ` Andreas Schwab
2016-08-09  2:32                               ` Eli Zaretskii
2016-08-09  8:05                                 ` Andreas Schwab
2016-08-09 14:50                                   ` Eli Zaretskii
2016-08-10  7:12                                     ` Dmitry Gutov
2016-08-10 14:35                                       ` Eli Zaretskii
2016-08-11  2:52                                         ` Dmitry Gutov [this message]
2016-08-11  8:53                                           ` Ted Zlatanov
2016-08-11 12:31                                             ` Dmitry Gutov
2016-08-11 12:57                                               ` Ted Zlatanov
2016-08-11 13:00                                                 ` Lars Ingebrigtsen
2016-08-11 13:18                                                   ` Ted Zlatanov
2017-05-08 13:36                                                 ` Dmitry Gutov
2017-05-08 20:57                                                   ` Lars Ingebrigtsen
2017-05-10  0:40                                                     ` Dmitry Gutov
2016-08-11 11:05                                           ` Lars Ingebrigtsen
2016-08-11 14:47                                           ` Eli Zaretskii
2016-08-11 14:59                                             ` Dmitry Gutov
2016-08-11 15:31                                               ` Eli Zaretskii
2016-08-11 18:07                                                 ` Dmitry Gutov
2016-08-11 19:47                                                   ` Eli Zaretskii
2016-08-12 21:44                                                   ` John Wiegley
2016-08-13  0:30                                           ` Sho Takemori
2016-08-13  7:02                                             ` Eli Zaretskii
2016-08-13  7:31                                               ` Sho Takemori
2016-08-13  8:31                                                 ` Eli Zaretskii
2016-08-13 13:02                                                   ` Sho Takemori
2016-08-13 13:11                                                     ` Eli Zaretskii
2016-08-13 15:32                                                   ` Dmitry Gutov
2016-08-13 15:56                                                     ` Eli Zaretskii
2016-08-08 16:21                     ` Lars Ingebrigtsen
2016-08-08 16:33                       ` Eli Zaretskii
2016-08-08 16:58                         ` Lars Ingebrigtsen
2016-08-08 17:11                           ` Eli Zaretskii
2016-08-08 19:46                   ` Dmitry Gutov
2016-08-08 20:19                     ` Lars Ingebrigtsen
2016-08-08 20:35                       ` Dmitry Gutov
2016-08-08 20:36                         ` Lars Ingebrigtsen
2016-08-09  2:13                           ` Dmitry Gutov
2016-08-09  9:39                             ` Lars Ingebrigtsen
2016-08-10  6:50                               ` Dmitry Gutov
2016-08-11  1:31                                 ` Dmitry Gutov
2016-08-02  3:26       ` Sho Takemori

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ef8575e5-8583-a7da-020b-c5ceca311362@yandex.ru \
    --to=dgutov@yandex.ru \
    --cc=24117@debbugs.gnu.org \
    --cc=eliz@gnu.org \
    --cc=larsi@gnus.org \
    --cc=schwab@linux-m68k.org \
    --cc=stakemorii@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).