unofficial mirror of guile-devel@gnu.org 
From: Vivien Kraus <vivien@planete-kraus.eu>
To: Nathan <nathan_mail@nborghese.com>, guile-devel@gnu.org
Subject: Re: [PATCH v4] Add resolve-relative-reference in (web uri), as in RFC 3986 5.2.
Date: Thu, 02 Nov 2023 21:48:55 +0100
Message-ID: <92c8a960b091300107e523126887fd1c246a0fda.camel@planete-kraus.eu>
In-Reply-To: <87fs1n53de.fsf@nborghese.com>

Hello Nathan!

On Thursday, 2 November 2023 at 16:00 -0400, Nathan wrote:
> There is a problem and I fixed it by rewriting a bunch of code myself
> because I need similar code.

Thank you!

> remove-dot-segments:
> You cannot split-and-decode-uri-path and then
> encode-and-join-uri-path.
> Those are terrible functions that don't work on all URIs.
> URI schemes are allowed to specify that certain reserved characters
> (sub-delims) are special.
> In that case, a sub-delim that IS escaped is different from a
> sub-delim that IS NOT escaped.
> 
> Example input to your remove-dot-segments:
> (resolve-relative-reference
>   (string->uri-reference "/")
>   (string->uri-reference "excitement://a.com/a!a!%21!"))
> Your wrong output:
> excitement://a.com/a%21a%21%21%21

I see.
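
For the record, I believe the loss can be reproduced with the two
(web uri) procedures alone, without going through my patch. This is a
minimal sketch, assuming only the path component of your example; the
variable names are mine, chosen for illustration:

```scheme
(use-modules (web uri))

;; Path component of the example above.  The third "!" is
;; percent-encoded, the other three are literal sub-delims.
(define path "/a!a!%21!")

;; Decoding turns "%21" into "!", so the escaped and unescaped
;; sub-delims can no longer be told apart.
(define decoded (split-and-decode-uri-path path))

;; Re-encoding then escapes every "!", because by default uri-encode
;; leaves only unreserved characters untouched.
(define round-tripped (encode-and-join-uri-path decoded))

(display round-tripped)
(newline)
```

If I read the procedures correctly, every "!" comes back as "%21",
which matches the path in the wrong output you reported.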

> 
> One solution would be to only percent-decode dots. Because dot is
> unreserved, that solution doesn't have any URI equivalence issues.
> But I still think decoding dots automatically is a bad, unexpected
> side-effect to have.
> I rewrote this function so that it:
> - works on both escaped and unescaped dots
> - doesn't unescape any unnecessary characters

This pushes the limits of my understanding of URIs, as I did not know
we had to consider '%2E%2E' the same as '..'. However, RFC 3986 is not
very explicit on this point. The closest passages I could find are:

2.3: Unreserved Characters:
   For consistency, percent-encoded octets in the ranges of ALPHA
   (%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E),
   underscore (%5F), or tilde (%7E) should not be created by URI
   producers and, when found in a URI, should be decoded to their
   corresponding unreserved characters by URI normalizers.

5.2.1: Pre-parse the Base URI:
   Normalization of the base URI, as described in Sections 6.2.2 and
   6.2.3, is optional.  A URI reference must be transformed to its
   target URI before it can be normalized.

Did you find something more precise than that?  In any case, decoding
the dots is probably the least unsafe thing to do.
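
To check that I understand the approach, here is a minimal sketch of a
dot-segment removal that recognizes '%2E' as a dot without decoding
anything else. It is neither your patch nor the code from my v4: the
names dot-segment-kind and remove-dot-segments/sketch are hypothetical,
and it works on segments split at '/' instead of following the
buffer-based wording of section 5.2.4.

```scheme
;; Hypothetical helper: classify SEGMENT as 'single ("."), 'double
;; (".."), or #f, reading "%2E" and "%2e" as a dot.  Nothing is
;; decoded; the segment is only inspected.
(define (dot-segment-kind segment)
  (let loop ((i 0) (dots 0))
    (cond
     ((= i (string-length segment))
      (case dots ((1) 'single) ((2) 'double) (else #f)))
     ((char=? (string-ref segment i) #\.)
      (loop (+ i 1) (+ dots 1)))
     ((and (<= (+ i 3) (string-length segment))
           (string-ci=? (substring segment i (+ i 3)) "%2e"))
      (loop (+ i 3) (+ dots 1)))
     (else #f))))

;; Hypothetical sketch: drop "." segments and let ".." segments pop
;; the previous one.  All other segments are copied verbatim, so
;; escaped and unescaped sub-delims stay exactly as they were.
(define (remove-dot-segments/sketch path)
  (let* ((absolute? (string-prefix? "/" path))
         (segments (string-split (if absolute? (substring path 1) path)
                                 #\/)))
    (let loop ((in segments) (out '()))
      (if (null? in)
          (string-append (if absolute? "/" "")
                         (string-join (reverse out) "/"))
          (let ((last? (null? (cdr in))))
            (case (dot-segment-kind (car in))
              ((single)
               ;; A trailing "." leaves a trailing slash behind.
               (loop (cdr in) (if last? (cons "" out) out)))
              ((double)
               ;; ".." above the root is silently dropped.
               (let ((popped (if (null? out) out (cdr out))))
                 (loop (cdr in) (if last? (cons "" popped) popped))))
              (else
               (loop (cdr in) (cons (car in) out)))))))))
```

With this sketch, (remove-dot-segments/sketch "/a/%2E%2E/b!%21")
evaluates to "/b!%21": the escaped dots are honoured, and both the
literal "!" and the "%21" come through unchanged.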

> 
> The test suite no longer needs to check for incorrect output either:
> > ;; The test suite checks for ';' characters, but Guile escapes
> > ;; them in URIs. Same for '='.
> 
> ----
> 
> resolve-relative-reference:
> I rewrote this procedure so it is shorter.
> I also added #:strict? to toggle "strict parser" as mentioned in the
> RFC.

As far as I understand, your code is correct. The tests pass.

Thank you again!

Best regards,

Vivien



