unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* url-encode-url: do not add a trailing slash for "bare" URLs (with no file/directory)
@ 2014-03-12 10:19 Bastien
  2014-03-12 13:40 ` Stefan Monnier
  0 siblings, 1 reply; 8+ messages in thread
From: Bastien @ 2014-03-12 10:19 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 711 bytes --]

For now url-encode-url does this:

(url-encode-url "http://www.gnu.org")
   => http://www.gnu.org/

I suggest to not add the trailing slash:

(url-encode-url "http://www.gnu.org")
   => http://www.gnu.org

Adding a trailing slash is only recommended when there is a directory
after the domain name, like in "http://www.gnu.org/software", which
should be written "http://www.gnu.org/software/".

For now url-encode-url is right about *not* adding the trailing slash
in the above case (as this would lead to too many false positives) but
wrong in adding it for http://www.gnu.org, as this disambiguation is
not needed.

Am I missing cases when it is definitely needed?

The following trivial patch changes this.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: url-parse.el.patch --]
[-- Type: text/x-diff, Size: 548 bytes --]

=== modified file 'lisp/url/url-parse.el'
*** lisp/url/url-parse.el	2014-01-01 07:43:34 +0000
--- lisp/url/url-parse.el	2014-03-12 10:08:48 +0000
***************
*** 94,100 ****
  			"@"))
  	    host
  	    (if port (format ":%d" (url-port urlobj)))
! 	    (or file "/")
  	    (if frag (concat "#" frag)))))
  
  (defun url-recreate-url-attributes (urlobj)
--- 94,100 ----
  			"@"))
  	    host
  	    (if port (format ":%d" (url-port urlobj)))
! 	    file
  	    (if frag (concat "#" frag)))))
  
  (defun url-recreate-url-attributes (urlobj)


[-- Attachment #3: Type: text/plain, Size: 14 bytes --]


-- 
 Bastien

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: url-encode-url: do not add a trailing slash for "bare" URLs (with no file/directory)
  2014-03-12 10:19 url-encode-url: do not add a trailing slash for "bare" URLs (with no file/directory) Bastien
@ 2014-03-12 13:40 ` Stefan Monnier
  2014-03-12 14:05   ` Tim Visher
  2014-03-12 14:42   ` Bastien
  0 siblings, 2 replies; 8+ messages in thread
From: Stefan Monnier @ 2014-03-12 13:40 UTC (permalink / raw)
  To: Bastien; +Cc: emacs-devel

> The following trivial patch changes this.

I'm no expert on URL conventions, but it looks OK to me, tho maybe the /
is needed when there's a fragment.


        Stefan



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: url-encode-url: do not add a trailing slash for "bare" URLs (with no file/directory)
  2014-03-12 13:40 ` Stefan Monnier
@ 2014-03-12 14:05   ` Tim Visher
  2014-03-12 14:42   ` Bastien
  1 sibling, 0 replies; 8+ messages in thread
From: Tim Visher @ 2014-03-12 14:05 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Bastien, Emacs Development List

Hi Bastien and Stefan,

On Wed, Mar 12, 2014 at 9:40 AM, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
>> The following trivial patch changes this.
>
> I'm no expert on URL conventions, but it looks OK to me, tho maybe the /
> is needed when there's a fragment.

I'd agree with you here, in that the enclosed patch probably wouldn't
work as expected in the case of a fragment being present.

I'd agree with Bastien in that it's never the right thing to default
to a directory, especially since most servers these days will
automatically redirect you to the directory if one is present with the
same name as the 'file' you requested.

--

In Christ,

Timmy V.

http://blog.twonegatives.com/
http://five.sentenc.es/ -- Spend less time on mail



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: url-encode-url: do not add a trailing slash for "bare" URLs (with no file/directory)
  2014-03-12 13:40 ` Stefan Monnier
  2014-03-12 14:05   ` Tim Visher
@ 2014-03-12 14:42   ` Bastien
  2014-03-12 22:28     ` David Caldwell
  1 sibling, 1 reply; 8+ messages in thread
From: Bastien @ 2014-03-12 14:42 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hi Stefan,

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> The following trivial patch changes this.
>
> I'm no expert on URL conventions, but it looks OK to me, tho maybe the /
> is needed when there's a fragment.

Do you mean when there is a #name like in "http://www.gnu.org/#"
or "http://www.gnu.org/index.html#name" ?

The patch does not the current behavior wrt fragments.

(url-encode-url "http://www.gnu.org/#")
  => http://www.gnu.org/

in both cases.

With the patch we have

(url-encode-url "http://www.gnu.org#")
  => http://www.gnu.org

instead of

(url-encode-url "http://www.gnu.org#")
  => http://www.gnu.org/

(no trailing slash with the patch), but I don't think
http://www.gnu.org# is correct anyway.

Let me know,

-- 
 Bastien



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: url-encode-url: do not add a trailing slash for "bare" URLs (with no file/directory)
  2014-03-12 14:42   ` Bastien
@ 2014-03-12 22:28     ` David Caldwell
  2014-03-13  8:06       ` Bastien
  0 siblings, 1 reply; 8+ messages in thread
From: David Caldwell @ 2014-03-12 22:28 UTC (permalink / raw)
  To: emacs-devel

Bastien <bzg <at> gnu.org> writes:

> 
> Hi Stefan,
> 
> Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
> 
> >> The following trivial patch changes this.
> >
> > I'm no expert on URL conventions, but it looks OK to me, tho maybe the /
> > is needed when there's a fragment.

The relevant RFC here is 3986 (https://tools.ietf.org/html/rfc3986). Section
3 (https://tools.ietf.org/html/rfc3986#page-16) talks about the syntax.

> With the patch we have
> 
> (url-encode-url "http://www.gnu.org#")
>   => http://www.gnu.org
> 
> instead of
> 
> (url-encode-url "http://www.gnu.org#")
>   => http://www.gnu.org/
> 
> (no trailing slash with the patch), but I don't think
> http://www.gnu.org# is correct anyway.

That appears to be allowed by my reading of the RFC. The path is allowed to
be empty, so http://www.gnu.org#something and http://www.gnu.org?some-query
would both be valid. That being said, a quick test of curl shows that it
doesn't like # right on the end:

  $ curl 'http://www.gnu.org#someting'
  curl: (6) Could not resolve host: www.gnu.org#something

Firefox handles the same URL correctly.

That being said, there's no point at all to elide the / at the root. With
other paths there's a difference at the http level:

   http://gnu.org/emacs/ -> http command "GET /emacs/"
   http://gnu.org/emacs  -> http command "GET /emacs"

But the root one doesn't change:

   http://gnu.org/  -> http command "GET /"
   http://gnu.org   -> http command "GET /"

…since it's not valid http to just say "GET".

-David




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: url-encode-url: do not add a trailing slash for "bare" URLs (with no file/directory)
  2014-03-12 22:28     ` David Caldwell
@ 2014-03-13  8:06       ` Bastien
  2014-03-13 13:12         ` Stefan Monnier
  0 siblings, 1 reply; 8+ messages in thread
From: Bastien @ 2014-03-13  8:06 UTC (permalink / raw)
  To: David Caldwell; +Cc: emacs-devel

Hi David,

David Caldwell <david@porkrind.org> writes:

> The relevant RFC here is 3986 (https://tools.ietf.org/html/rfc3986). Section
> 3 (https://tools.ietf.org/html/rfc3986#page-16) talks about the
> syntax.

Thanks, I checked it before submitting the patch.

>> With the patch we have
>> 
>> (url-encode-url "http://www.gnu.org#")
>>   => http://www.gnu.org
>> 
>> instead of
>> 
>> (url-encode-url "http://www.gnu.org#")
>>   => http://www.gnu.org/
>> 
>> (no trailing slash with the patch), but I don't think
>> http://www.gnu.org# is correct anyway.
>
> That appears to be allowed by my reading of the RFC. The path is allowed to
> be empty, so http://www.gnu.org#something and http://www.gnu.org?some-query
> would both be valid. That being said, a quick test of curl shows that it
> doesn't like # right on the end:
>
>   $ curl 'http://www.gnu.org#someting'
>   curl: (6) Could not resolve host: www.gnu.org#something
>
> Firefox handles the same URL correctly.

OKay, but the patch does not really change the output of
(url-encode-url "http://www.gnu.org#"), except the for trailing slash.

> That being said, there's no point at all to elide the / at the
> root.

The point is: there is no point in adding it when it is not needed,
which is what the current code does.

Admittedly, this is not a bug, but a small discrepancy that makes bare
URLs looks weird/ugly.

Stefan, let me know if you're fine with the patch and if I can commit
this in trunk or on some other branch.

Thanks,

-- 
 Bastien



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: url-encode-url: do not add a trailing slash for "bare" URLs (with no file/directory)
  2014-03-13  8:06       ` Bastien
@ 2014-03-13 13:12         ` Stefan Monnier
  2014-03-13 13:50           ` Bastien
  0 siblings, 1 reply; 8+ messages in thread
From: Stefan Monnier @ 2014-03-13 13:12 UTC (permalink / raw)
  To: Bastien; +Cc: emacs-devel, David Caldwell

> OKay, but the patch does not really change the output of
> (url-encode-url "http://www.gnu.org#"), except the for trailing slash.

But it changes the output of (url-encode-url "http://www.gnu.org/#foo"),
doesn't it.  And it changes it from one that works with curl to one that
doesn't work with curl, right?

> Admittedly, this is not a bug, but a small discrepancy that makes bare
> URLs looks weird/ugly.

I personally find http://www.gnu.org more weird than http://www.gnu.org/

> Stefan, let me know if you're fine with the patch and if I can commit
> this in trunk or on some other branch.

I think we're better off leaving this as it is, since the two forms are
just as valid, and the current form seems to be marginally safer.


        Stefan



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: url-encode-url: do not add a trailing slash for "bare" URLs (with no file/directory)
  2014-03-13 13:12         ` Stefan Monnier
@ 2014-03-13 13:50           ` Bastien
  0 siblings, 0 replies; 8+ messages in thread
From: Bastien @ 2014-03-13 13:50 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel, David Caldwell

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> OKay, but the patch does not really change the output of
>> (url-encode-url "http://www.gnu.org#"), except the for trailing slash.
>
> But it changes the output of (url-encode-url "http://www.gnu.org/#foo"),
> doesn't it.

No it doesn't.

With and without my patch, it produces "http://www.gnu.org/#foo" as
expected.

> And it changes it from one that works with curl to one that
> doesn't work with curl, right?

No: curl can't parse http://www.gnu.org#foo (while firefox can).

> I personally find http://www.gnu.org more weird than
> http://www.gnu.org/

To be more precise, I find it weird that `url-encode-url' _adds_
something that is not needed.

-- 
 Bastien



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-03-13 13:50 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-12 10:19 url-encode-url: do not add a trailing slash for "bare" URLs (with no file/directory) Bastien
2014-03-12 13:40 ` Stefan Monnier
2014-03-12 14:05   ` Tim Visher
2014-03-12 14:42   ` Bastien
2014-03-12 22:28     ` David Caldwell
2014-03-13  8:06       ` Bastien
2014-03-13 13:12         ` Stefan Monnier
2014-03-13 13:50           ` Bastien

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).