unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
From: Maxime Devos <maximedevos@telenet.be>
To: Vivien Kraus <vivien@planete-kraus.eu>, guile-devel@gnu.org
Subject: Re: [PATCH] Add resolve-relative-reference in (web uri), as in RFC 3986 5.2.
Date: Mon, 25 Sep 2023 22:46:15 +0200	[thread overview]
Message-ID: <c515b924-ea1c-43c5-712c-1afbc0fe0b0b@telenet.be> (raw)
In-Reply-To: <61e17faa8546f6ff79e9bbe1f25f0bf687d3dce1.1695667513.git.vivien@planete-kraus.eu>


[-- Attachment #1.1.1: Type: text/plain, Size: 5390 bytes --]



Op 25-09-2023 om 18:48 schreef Vivien Kraus:
> * module/web/uri.scm (remove-dot-segments): Implement algorithm 5.2.4.
> (merge-paths): Implement algorithm 5.2.3.
> (resolve-relative-reference): Implement algorithm 5.2.2.
> (module): Export resolve-relative-reference.
> * NEWS: Reference it here.
> ---
> Dear Guile developers,
> 
> When you request https://example.com/resource an URI and get redirected
> to "here", you end up with 2 URI references:
> 
>    - https://example.com/resource
>    - here
> 
> What should you request next? The answer is,
> "https://example.com/here". It seems evident how we go from one to the
> other.
> 
> However, there are more subtle cases. What if you get redirected to
> "../here", for instance?
> 
> RFC 3986 has you covered, in section 5.2. It explains how we go from a
> base URI and a URI reference to the new URI.
> What do you think?
> 
> Best regards,

Sounds very useful.  However, there are also some dangers on doing this 
thing -- the ‘external’ page https://example.com/data.json could 
redirect to 
http://localhost/unsecured-secret-but-its-localhost-only-so-it-is-safe.

I forgot the name of this attack, but there is probably a page somewhere 
that documents the danger and how to mitigate it (I think I read some 
Firefox documentation somewhere?).  Perhaps there exists an informative 
RFC about it?   I think it put a warning about this somewhere in the 
documentation.

(Another related problem is that example.com could have IP address ::1, 
but that's a different problem.)

> 
> Vivien
> 
>   NEWS                          |   7 ++
>   module/web/uri.scm            | 152 +++++++++++++++++++++++++++++++++-
>   test-suite/tests/web-uri.test |  68 +++++++++++++++

You forgot modifying the non-docstring documentation to properly 
document the new procedure.

>   3 files changed, 226 insertions(+), 1 deletion(-)

Given the existence of resolve-relative-reference, it is easy to expect 
http-request to do redirection.  I think it would be best to adjust to 
the documentation of http-request / http-get / ... to mention whether 
automatic redirection is done or not.

> +(define (resolve-relative-reference base relative)
> +  "Resolve @var{relative} on top of @var{base}, as RFC3986, section 5.2."

I don't like the mutation, but it's a completely deterministic procedure 
(ignoring memory allocation) so it can't cause problems and hence I 
suppose it's not worth rewriting (unless you or someone else really 
wants to rewrite it, I suppose).

I suppose it also avoids the risk of accidentally deviating from the RFC 
it is supposed to implement, which is major advantage of sticking with 
the mutation.

I like that you say _which_ resolution method you are using instead of 
saying or implying that this is the always the _right_ way of resolving 
relative references, because some URI schemes are rather quirky.
(I don't know any quirkiness w.r.t. relative references, but wouldn't be 
surprised if it exists.)

Also I think it's worth stating that both base and relative are URIs -- 
with the current docstring, I find (resolve-... uri "./whatever") a 
reasonable thing to do.

IIUC, there currently is nothing preventing

(resolve-... (uri object for "https://example.com/a")
              (uri object for "https://example.com/b")).


IIUC, this is supposed to return (uri object for 
"https://example.com/b"), but that could be more explicit with a change 
of variable name.

(define (resolve-... base maybe-relative)
   [...])

> +(with-test-prefix "resolve relative reference"
> +  ;; Test suite in RFC3986, section 5.4.
> +  (let ((base (string->uri "http://a/b/c/d;p?q"))
> +        (equal/encoded?
> +         ;; The test suite checks for ';' characters, but Guile escapes
> +         ;; them in URIs. Same for '='.

IIUC, that's a bug!

6.2.2.2.  Percent-Encoding Normalization

    The percent-encoding mechanism (Section 2.1) is a frequent source of
    variance among otherwise identical URIs.  In addition to the case
    normalization issue noted above, some URI producers percent-encode
    octets that do not require percent-encoding, resulting in URIs that
    are equivalent to their non-encoded counterparts.  __These URIs
    __should be normalized by decoding any percent-encoded octet that
    corresponds to an unreserved character, as described in
    Section 2.3.__

(Emphasis added.)

I am assuming here that ; is an unreserved character, if it isn't, there 
isn't a bug here.

However, I sense a lack of normalisation in resolve-relative-reference, 
so unless Guile already does normalisation elsewhere (perhaps it does 
during the URI object construction?), there might be a bug here -- ok, 
technically perhaps not a bug because the docstring only mentions 
‘implements section 5.2 and 5.2. doesn't seem to mention section 6 (and 
section 6 says ‘should’, not ‘shall/must’, but some people use ‘should’ 
as ‘shall/must’, so dunno), but in that case that's still a potential 
pitfall that could be mentioned in the docstring. Could be as simple as 
"No normalisation is performed.".

I guess it shouldn't do normalisation, but guesswork seems better to be 
avoided/confirmed or disconfirmed when possible.

Best regards,
Maxime Devos.

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 929 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]

  reply	other threads:[~2023-09-25 20:46 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-25 16:48 [PATCH] Add resolve-relative-reference in (web uri), as in RFC 3986 5.2 Vivien Kraus
2023-09-25 20:46 ` Maxime Devos [this message]
2023-09-25 16:48   ` [PATCH v2] " Vivien Kraus
2023-10-02 16:32     ` Vivien Kraus
2023-10-03 18:49       ` Maxime Devos
2023-09-25 16:48         ` [PATCH v3] " Vivien Kraus
2023-10-03 18:56         ` [PATCH v2] " Dale Mellor
2023-10-03 19:04           ` Maxime Devos
2023-10-03 20:03   ` [PATCH] " Vivien Kraus
2023-10-03 22:22     ` Maxime Devos
2023-10-03 22:30       ` Maxime Devos
2023-10-04  5:29         ` Vivien Kraus
2023-10-10 21:44           ` Maxime Devos
2023-09-25 16:48             ` [PATCH v4] " Vivien Kraus
2023-11-02 20:00               ` Nathan via Developers list for Guile, the GNU extensibility library
2023-11-02 20:48                 ` Vivien Kraus
2023-11-03 17:49                   ` Nathan via Developers list for Guile, the GNU extensibility library
2023-11-03 18:19                     ` Vivien Kraus
2023-11-27 17:10                 ` Vivien Kraus
2023-11-27 17:15                   ` Vivien Kraus
2023-11-29  1:08                     ` Nathan via Developers list for Guile, the GNU extensibility library

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c515b924-ea1c-43c5-712c-1afbc0fe0b0b@telenet.be \
    --to=maximedevos@telenet.be \
    --cc=guile-devel@gnu.org \
    --cc=vivien@planete-kraus.eu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).