From: Ted Zlatanov <tzz@lifelogs.com>
To: Dmitry Gutov <dgutov@yandex.ru>
Cc: stakemorii@gmail.com, Lars Ingebrigtsen <larsi@gnus.org>,
	schwab@linux-m68k.org, 24117@debbugs.gnu.org
Subject: bug#24117: 25.1; url-http-create-request: Multibyte text in HTTP request
Date: Thu, 11 Aug 2016 08:57:50 -0400
Message-ID: <874m6rjwdt.fsf_-_@lifelogs.com>
In-Reply-To: <50426141-3483-e5e4-a252-20b1198cde30@yandex.ru> (Dmitry Gutov's message of "Thu, 11 Aug 2016 15:31:11 +0300, Thu, 11 Aug 2016 13:05:12 +0200")

On Thu, 11 Aug 2016 15:31:11 +0300 Dmitry Gutov <dgutov@yandex.ru> wrote: 

DG> On 08/11/2016 11:53 AM, Ted Zlatanov wrote:
>> Could you add to your patch the cases you've tested? There's a specific
>> place for URL parsing tests in test/lisp/url/url-parse-tests.el that
>> would help everyone.

DG> Sure, but only one of the patches affects URL parsing (and Lars prefers the
DG> other one).

Maybe the tests should be in a separate patch then. Neither your Russian
example nor Lars' example has a parallel in the tests AFAICS. I'd also
add the example hostname that Katsumi Yamaoka gave from the w3m source.

Somewhat related: it would be nice if the URL parser also listed the
non-ASCII scripts used in the domain name. Then eww and other programs
could do one of the typical defenses: either ensure only one script is
used; or allow only scripts that match the user's locale; or catch any
non-ASCII domain names. Typically they'd use Punycode to display such
suspicious domain names:
https://en.wikipedia.org/wiki/IDN_homograph_attack
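
To make the "list the scripts" idea concrete, here is a rough sketch in
Python (not Emacs Lisp, and not part of any patch in this thread). It
assumes that the first word of a character's Unicode name is a good
enough stand-in for its script; a real implementation would use the
Unicode Script property and would allow legitimate combinations such as
kanji with kana. The names label_scripts and mixed_script_labels are
made up for this example.

```python
import unicodedata


def label_scripts(label):
    """Return rough script names for the letters in one hostname label."""
    scripts = set()
    for ch in label:
        if not ch.isalpha():
            continue  # digits and hyphens carry no script information here
        # Assumption: the first word of the Unicode name approximates the
        # script, e.g. "CYRILLIC SMALL LETTER A" gives "CYRILLIC".
        name = unicodedata.name(ch, "")
        if name:
            scripts.add(name.split()[0])
    return scripts


def mixed_script_labels(hostname):
    """Return the labels of hostname that mix more than one script."""
    return [label for label in hostname.split(".")
            if len(label_scripts(label)) > 1]
```

Checking each label separately means an all-Cyrillic name under a Latin
TLD is not flagged, while a single label that mixes Latin and Cyrillic
letters is.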

I bring it up since explicitly allowing non-ASCII domain names
automatically opens up these security concerns, and it's a bit hard to
collect the confusables externally:
https://elpa.gnu.org/packages/uni-confusables.html

On Thu, 11 Aug 2016 13:05:12 +0200 Lars Ingebrigtsen <larsi@gnus.org> wrote: 

LI> Yes, the fix here should be in url-http-create-request, not in the URL
LI> parsing functions.  The main issue here is that the URL request buffer
LI> is a multibyte buffer and (as with all network connection buffers), it
LI> shouldn't be.  (Or, rather, that function just creates a string instead
LI> of a buffer, but the same principle applies.)
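
As an illustration of that principle only (a Python sketch, not the
actual change to url-http-create-request): everything that goes on the
wire is reduced to ASCII bytes before the request is assembled, so no
multibyte text can leak into it. The function name build_request is
hypothetical, and the sketch ignores ports, request bodies and extra
headers.

```python
from urllib.parse import quote, urlsplit


def build_request(url):
    """Build a minimal GET request for url as a pure-ASCII byte string."""
    parts = urlsplit(url)
    # Host names go through IDNA so only ASCII ("xn--...") reaches the wire.
    host = (parts.hostname or "").encode("idna").decode("ascii")
    # Non-ASCII path characters are percent-encoded as UTF-8; "%" stays
    # safe so escapes that are already present are not encoded twice.
    target = quote(parts.path or "/", safe="/%")
    if parts.query:
        target += "?" + quote(parts.query, safe="=&%")
    request = (f"GET {target} HTTP/1.1\r\n"
               f"Host: {host}\r\n"
               "Connection: close\r\n\r\n")
    # encode("ascii") raises if anything non-ASCII slipped through.
    return request.encode("ascii")
```

The final encode step acts as a guard: a stray non-ASCII character
produces an error at construction time instead of a malformed request.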

I think this is correct: the URL parsing should not care about the
provenance or potential use of that URL to make an HTTP request or
otherwise. But maybe the URL parsing can be smart enough to return both
the IDNA version and the original domain name, plus some parsing
information like the list of scripts I suggested above, to save user
agents from doing that extra work?
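
Roughly what I have in mind, sketched in Python rather than Emacs Lisp:
a parse result that carries the original host, its IDNA form, and the
scripts seen. HostInfo and host_info are hypothetical names, the script
detection is the same crude first-word-of-the-Unicode-name heuristic,
and Python's built-in "idna" codec implements the older IDNA 2003 rules.

```python
import unicodedata
from typing import NamedTuple
from urllib.parse import urlsplit


class HostInfo(NamedTuple):
    original: str       # host as written in the URL (lowercased)
    ascii: str          # IDNA ("xn--...") form for the wire or display
    scripts: frozenset  # rough script names found in the host


def host_info(url):
    """Parse url and return its host in both forms plus the scripts used."""
    host = urlsplit(url).hostname or ""
    ascii_host = host.encode("idna").decode("ascii")
    # Assumption: first word of the Unicode name approximates the script.
    names = (unicodedata.name(ch, "") for ch in host if ch.isalpha())
    scripts = frozenset(name.split()[0] for name in names if name)
    return HostInfo(host, ascii_host, scripts)
```

A user agent could then show the ascii field whenever the scripts set
contains anything unexpected, without re-parsing the URL.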

Ted




