unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#42382: 26.3; url-http handling of Location redirection headers containing whitespace
@ 2020-07-15 20:40 Daniele Nicolodi
  2020-07-16 16:08 ` Robert Pluim
  2020-07-19 19:17 ` Lars Ingebrigtsen
  0 siblings, 2 replies; 8+ messages in thread
From: Daniele Nicolodi @ 2020-07-15 20:40 UTC (permalink / raw)
  To: 42382

[-- Attachment #1: Type: text/plain, Size: 1327 bytes --]

url-http.el interprets HTTP responses in url-http-parse-headers. This
function contains the following code:

	 (when redirect-uri
	   ;; Clean off any whitespace and/or <...> cruft.
	   (if (string-match "\\([^ \t]+\\)[ \t]" redirect-uri)
	       (setq redirect-uri (match-string 1 redirect-uri)))
	   (if (string-match "^<\\(.*\\)>$" redirect-uri)
	       (setq redirect-uri (match-string 1 redirect-uri)))

which truncates the value of the Location header at the first whitespace
character and removes surrounding angle brackets quoting.

In RFC 7231 the Location header is defined to carry a URI-reference.
According to RFC 3986 it should be percent-encoded and thus should not
contain spaces. However, there are HTTP server implementation (notably
nginx) that do not do that. While this is a bug in those HTTP server
implementations, I think Emacs should follow what most other HTTP client
implementatios (all the ones I tested) and use the content of the
Location header unmodified. Stripping of angle bracket quotes is
unnecessary as they are not valid according to the RFCs.

Also, accordingly to the RFCs, the location header may contain a
relative location. Thus the comment that suggest that such a response is
a bug in the server should be reworded.

The attached patches implement the proposed changes.

Thank you.

[-- Attachment #2: 0001-url-http-Fix-handling-of-redirect-locations.patch --]
[-- Type: text/plain, Size: 1493 bytes --]

From 052a9934380fdd7517b5bf38aca37aaf951a644e Mon Sep 17 00:00:00 2001
From: Daniele Nicolodi <daniele@grinta.net>
Date: Fri, 10 Jul 2020 12:16:01 -0600
Subject: [PATCH 1/2] url-http: Fix handling of redirect locations

Do not break the redirect Location header when it contain spaces. In
RFC 7231 the Location header is defined to carry a URI-reference.
According to RFC 3986 it should be percent-encoded and thus should not
contain spaces. However, there are HTTP server implementation (notably
nginx) that do not do that. This makes Emacs url-http.el behave like
most other HTTP client implementatios. Also remove the stripping of
angle bracket quotes as they are not valid according to the RFCs.
---
 lisp/url/url-http.el | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/lisp/url/url-http.el b/lisp/url/url-http.el
index 669c24571f..a746be8475 100644
--- a/lisp/url/url-http.el
+++ b/lisp/url/url-http.el
@@ -702,12 +702,6 @@ should be shown to the user."
 	    ;; Treat everything like '300'
 	    nil))
 	 (when redirect-uri
-	   ;; Clean off any whitespace and/or <...> cruft.
-	   (if (string-match "\\([^ \t]+\\)[ \t]" redirect-uri)
-	       (setq redirect-uri (match-string 1 redirect-uri)))
-	   (if (string-match "^<\\(.*\\)>$" redirect-uri)
-	       (setq redirect-uri (match-string 1 redirect-uri)))
-
 	   ;; Some stupid sites (like sourceforge) send a
 	   ;; non-fully-qualified URL (ie: /), which royally confuses
 	   ;; the URL library.
-- 
2.24.3 (Apple Git-128)


[-- Attachment #3: 0002-url-http-Do-not-suggest-a-broken-HTTP-server-impleme.patch --]
[-- Type: text/plain, Size: 1083 bytes --]

From 8da3326f64aaeb47f4306a4bacc3067d4bc38141 Mon Sep 17 00:00:00 2001
From: Daniele Nicolodi <daniele@grinta.net>
Date: Fri, 10 Jul 2020 18:50:39 -0600
Subject: [PATCH 2/2] url-http: Do not suggest a broken HTTP server
 implementation

Relative URIs are allowed in the Location header by the relvant RFCs.
---
 lisp/url/url-http.el | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/lisp/url/url-http.el b/lisp/url/url-http.el
index a746be8475..e0c925e13f 100644
--- a/lisp/url/url-http.el
+++ b/lisp/url/url-http.el
@@ -702,9 +702,7 @@ should be shown to the user."
 	    ;; Treat everything like '300'
 	    nil))
 	 (when redirect-uri
-	   ;; Some stupid sites (like sourceforge) send a
-	   ;; non-fully-qualified URL (ie: /), which royally confuses
-	   ;; the URL library.
+	   ;; Handle relative redirect URIs.
 	   (if (not (string-match url-nonrelative-link redirect-uri))
                ;; Be careful to use the real target URL, otherwise we may
                ;; compute the redirection relative to the URL of the proxy.
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* bug#42382: 26.3; url-http handling of Location redirection headers containing whitespace
  2020-07-15 20:40 bug#42382: 26.3; url-http handling of Location redirection headers containing whitespace Daniele Nicolodi
@ 2020-07-16 16:08 ` Robert Pluim
  2020-07-16 16:30   ` Daniele Nicolodi
  2020-07-19 19:17 ` Lars Ingebrigtsen
  1 sibling, 1 reply; 8+ messages in thread
From: Robert Pluim @ 2020-07-16 16:08 UTC (permalink / raw)
  To: Daniele Nicolodi; +Cc: 42382

>>>>> On Wed, 15 Jul 2020 14:40:36 -0600, Daniele Nicolodi <daniele@grinta.net> said:

    Daniele> In RFC 7231 the Location header is defined to carry a URI-reference.
    Daniele> According to RFC 3986 it should be percent-encoded and thus should not
    Daniele> contain spaces. However, there are HTTP server implementation (notably
    Daniele> nginx) that do not do that. While this is a bug in those HTTP server
    Daniele> implementations, I think Emacs should follow what most other HTTP client
    Daniele> implementatios (all the ones I tested) and use the content of the
    Daniele> Location header unmodified. Stripping of angle bracket quotes is
    Daniele> unnecessary as they are not valid according to the RFCs.

Nor is embedded whitespace in the URI :-)

Are you sure this won't break anything? ie are you sure there are 0
server implementations out there that send angle brackets?

Iʼd be conservative, and just replace the truncation on whitespace
with percent-encoding of said whitespace.

    Daniele> Also, accordingly to the RFCs, the location header may contain a
    Daniele> relative location. Thus the comment that suggest that such a response is
    Daniele> a bug in the server should be reworded.

    Daniele> The attached patches implement the proposed changes.

The second patch is small enough that I think you can combine the two.

Robert





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#42382: 26.3; url-http handling of Location redirection headers containing whitespace
  2020-07-16 16:08 ` Robert Pluim
@ 2020-07-16 16:30   ` Daniele Nicolodi
  2020-07-16 17:10     ` Robert Pluim
  0 siblings, 1 reply; 8+ messages in thread
From: Daniele Nicolodi @ 2020-07-16 16:30 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 42382

On 16-07-2020 10:08, Robert Pluim wrote:
>>>>>> On Wed, 15 Jul 2020 14:40:36 -0600, Daniele Nicolodi <daniele@grinta.net> said:
> 
>     Daniele> In RFC 7231 the Location header is defined to carry a URI-reference.
>     Daniele> According to RFC 3986 it should be percent-encoded and thus should not
>     Daniele> contain spaces. However, there are HTTP server implementation (notably
>     Daniele> nginx) that do not do that. While this is a bug in those HTTP server
>     Daniele> implementations, I think Emacs should follow what most other HTTP client
>     Daniele> implementatios (all the ones I tested) and use the content of the
>     Daniele> Location header unmodified. Stripping of angle bracket quotes is
>     Daniele> unnecessary as they are not valid according to the RFCs.
> 
> Nor is embedded whitespace in the URI :-)

I don't understand this remark. Truncating at the first whitespace
character (current behavior) is a valid arbitrary decision for an
RFC-invalid URI-reference value. However, it is different from what all
other HTTP clients implement and it results in practical problems.

> Are you sure this won't break anything? ie are you sure there are 0
> server implementations out there that send angle brackets?

I don't see any reason why there should be angle brackets around the
value of a Location header and the current code or changelog or commit
messages does not provide a justification or a case where these have
been encountered. No other HTTP client I looked at does something like
this. I think there are many HTTP client implementations out there that
are more widely used and tested for interoperability than url-http.

> Iʼd be conservative, and just replace the truncation on whitespace
> with percent-encoding of said whitespace.

Why is percent-encoding better? The URI-reference value should not be
interpreted in any way, simply passed along. Again, all other HTTP
clients I looked at do not do this, or other manipulation of the header.

>     Daniele> Also, accordingly to the RFCs, the location header may contain a
>     Daniele> relative location. Thus the comment that suggest that such a response is
>     Daniele> a bug in the server should be reworded.
> 
>     Daniele> The attached patches implement the proposed changes.
> 
> The second patch is small enough that I think you can combine the two.

They are divided to provide justification for the changes in the commit
messages.

Thank you.

Daniele





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#42382: 26.3; url-http handling of Location redirection headers containing whitespace
  2020-07-16 16:30   ` Daniele Nicolodi
@ 2020-07-16 17:10     ` Robert Pluim
  2020-07-17  0:01       ` Lars Ingebrigtsen
  2020-07-17 15:47       ` Daniele Nicolodi
  0 siblings, 2 replies; 8+ messages in thread
From: Robert Pluim @ 2020-07-16 17:10 UTC (permalink / raw)
  To: Daniele Nicolodi; +Cc: 42382

>>>>> On Thu, 16 Jul 2020 10:30:49 -0600, Daniele Nicolodi <daniele@grinta.net> said:

    Daniele> On 16-07-2020 10:08, Robert Pluim wrote:
    >>>>>>> On Wed, 15 Jul 2020 14:40:36 -0600, Daniele Nicolodi <daniele@grinta.net> said:
    >> 
    Daniele> In RFC 7231 the Location header is defined to carry a URI-reference.
    Daniele> According to RFC 3986 it should be percent-encoded and thus should not
    Daniele> contain spaces. However, there are HTTP server implementation (notably
    Daniele> nginx) that do not do that. While this is a bug in those HTTP server
    Daniele> implementations, I think Emacs should follow what most other HTTP client
    Daniele> implementatios (all the ones I tested) and use the content of the
    Daniele> Location header unmodified. Stripping of angle bracket quotes is
    Daniele> unnecessary as they are not valid according to the RFCs.
    >> 
    >> Nor is embedded whitespace in the URI :-)

    Daniele> I don't understand this remark. Truncating at the first whitespace
    Daniele> character (current behavior) is a valid arbitrary decision for an
    Daniele> RFC-invalid URI-reference value. However, it is different from what all
    Daniele> other HTTP clients implement and it results in practical problems.

You stated that angle quotes were invalid, and proposed to remove the
support for their presence. I was merely pointing out that spaces are
equally invalid, and you propose to accomodate them. If one, why not
the other?

    >> Are you sure this won't break anything? ie are you sure there are 0
    >> server implementations out there that send angle brackets?

    Daniele> I don't see any reason why there should be angle brackets around the
    Daniele> value of a Location header and the current code or changelog or commit
    Daniele> messages does not provide a justification or a case where these have
    Daniele> been encountered. No other HTTP client I looked at does something like
    Daniele> this. I think there are many HTTP client implementations out there that
    Daniele> are more widely used and tested for interoperability than url-http.

    >> Iʼd be conservative, and just replace the truncation on whitespace
    >> with percent-encoding of said whitespace.

    Daniele> Why is percent-encoding better? The URI-reference value should not be
    Daniele> interpreted in any way, simply passed along. Again, all other HTTP
    Daniele> clients I looked at do not do this, or other manipulation of the header.

Because then it become a valid URI, and other parts of emacs that
donʼt know how to treat literal spaces in a URI wonʼt break. But Iʼll
agree that itʼs conjecture on my part.

Robert





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#42382: 26.3; url-http handling of Location redirection headers containing whitespace
  2020-07-16 17:10     ` Robert Pluim
@ 2020-07-17  0:01       ` Lars Ingebrigtsen
  2020-07-17 15:47       ` Daniele Nicolodi
  1 sibling, 0 replies; 8+ messages in thread
From: Lars Ingebrigtsen @ 2020-07-17  0:01 UTC (permalink / raw)
  To: Robert Pluim; +Cc: Daniele Nicolodi, 42382

Robert Pluim <rpluim@gmail.com> writes:

> You stated that angle quotes were invalid, and proposed to remove the
> support for their presence. I was merely pointing out that spaces are
> equally invalid, and you propose to accomodate them. If one, why not
> the other?

I think it makes sense to remove the support for the angle brackets --
I've never seen those in use anywhere.  If they turn out to be a thing,
we can add that back.

On the other hand, spaces in Location headers have been observed in the
wild, so supporting that seems like a good idea to me.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#42382: 26.3; url-http handling of Location redirection headers containing whitespace
  2020-07-16 17:10     ` Robert Pluim
  2020-07-17  0:01       ` Lars Ingebrigtsen
@ 2020-07-17 15:47       ` Daniele Nicolodi
  1 sibling, 0 replies; 8+ messages in thread
From: Daniele Nicolodi @ 2020-07-17 15:47 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 42382

On 16/07/2020 11:10, Robert Pluim wrote:
> You stated that angle quotes were invalid, and proposed to remove the
> support for their presence. I was merely pointing out that spaces are
> equally invalid, and you propose to accomodate them. If one, why not
> the other?

Technically, I am not proposing to accommodate anything. I am proposing
to use the Location header as returned by the server, without trying to
guess what the server really meant.

If you want to look at it in another way, I am proposing to modify
url-http to follow what all other HTTP client implementations do. The
standard does not specify what to do in case of malformed header values,
thus adopting the most common approach allows for greater
interoperability. The fact that this is also the simplest thing to do is
also very attractive.

Finally, as Lars points out, angle bracket enclosed URIs in Location
headers have not been seen anywhere. Thus stripping them out is fixing
an imaginary problem, and we can come up with an infinite list of
imaginary problems that need to be "fixed". Non fixing any of them is
the reasonable thing to do.

>     Daniele> Why is percent-encoding better? The URI-reference value should not be
>     Daniele> interpreted in any way, simply passed along. Again, all other HTTP
>     Daniele> clients I looked at do not do this, or other manipulation of the header.
> 
> Because then it become a valid URI, and other parts of emacs that
> donʼt know how to treat literal spaces in a URI wonʼt break. But Iʼll
> agree that itʼs conjecture on my part.

The code I would like to fix does not expose the values of the headers
in any way, and it works just fine with non-escaped spaces, thus there
is no need to fix the percent-encoding.

Trying to fix the value of the URI-location value is a very slippery
slope. We can fix non-escaped spaces, but what about other non-escaped
characters? Should we escape them as well or leave them alone? If we
leave them alone, there is little value in escaping spaces (the
invariable "correctly percent-escaped URIs are returned, SOMETIMES" is
weaker than "the content of the Location header is return as is,
ALWAYS"). If we try to fix the escaping, how can we know that when we
get a URI containing a % character, this is a valid escape sequence an
not a literal % that needs to be escaped? We are back to the "correct
answer most of the times", which is not very helpful.

Thank you.

Cheers,
Dan





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#42382: 26.3; url-http handling of Location redirection headers containing whitespace
  2020-07-15 20:40 bug#42382: 26.3; url-http handling of Location redirection headers containing whitespace Daniele Nicolodi
  2020-07-16 16:08 ` Robert Pluim
@ 2020-07-19 19:17 ` Lars Ingebrigtsen
  2020-07-20 14:46   ` Daniele Nicolodi
  1 sibling, 1 reply; 8+ messages in thread
From: Lars Ingebrigtsen @ 2020-07-19 19:17 UTC (permalink / raw)
  To: Daniele Nicolodi; +Cc: 42382

Daniele Nicolodi <daniele@grinta.net> writes:

> Also, accordingly to the RFCs, the location header may contain a
> relative location. Thus the comment that suggest that such a response is
> a bug in the server should be reworded.
>
> The attached patches implement the proposed changes.

Thanks; I've now applied the patches to Emacs 28.1.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#42382: 26.3; url-http handling of Location redirection headers containing whitespace
  2020-07-19 19:17 ` Lars Ingebrigtsen
@ 2020-07-20 14:46   ` Daniele Nicolodi
  0 siblings, 0 replies; 8+ messages in thread
From: Daniele Nicolodi @ 2020-07-20 14:46 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 42382

On 19-07-2020 13:17, Lars Ingebrigtsen wrote:
> Daniele Nicolodi <daniele@grinta.net> writes:
> 
>> Also, accordingly to the RFCs, the location header may contain a
>> relative location. Thus the comment that suggest that such a response is
>> a bug in the server should be reworded.
>>
>> The attached patches implement the proposed changes.
> 
> Thanks; I've now applied the patches to Emacs 28.1.

Thanks Lars.

Cheers,
Dan






^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-07-20 14:46 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-15 20:40 bug#42382: 26.3; url-http handling of Location redirection headers containing whitespace Daniele Nicolodi
2020-07-16 16:08 ` Robert Pluim
2020-07-16 16:30   ` Daniele Nicolodi
2020-07-16 17:10     ` Robert Pluim
2020-07-17  0:01       ` Lars Ingebrigtsen
2020-07-17 15:47       ` Daniele Nicolodi
2020-07-19 19:17 ` Lars Ingebrigtsen
2020-07-20 14:46   ` Daniele Nicolodi

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).