Re: Guile web server example serving static files


From: Zelphir Kaltstahl <zelphirkaltstahl@posteo.de>
To: divoplade <d@divoplade.fr>
Cc: Guile User <guile-user@gnu.org>
Subject: Re: Guile web server example serving static files
Date: Sat, 19 Sep 2020 12:39:54 +0200
Message-ID: <cf222f51-c0b8-6ed2-7bf3-eddd80fc76a3@posteo.de>
In-Reply-To: <c83137b77d42e0f8a231570e7af276d22c59f0b4.camel@divoplade.fr>

Hello divoplade,

On 18.09.20 07:56, divoplade wrote:
> Hello Zelphir,
>
> On Thursday, 17 September 2020 at 23:45 +0200, Zelphir Kaltstahl wrote:
>> I finally managed to create an example for using Guile's web server and
>> serving static files. A rather silly bug kept me from making progress
>> for a few days, but today I finally fixed it.
>>
>> I tried to implement some security checks about the path of the
>> requested static assets. If anyone wants to look at it and point out
>> issues with it, I will try to fix it, or you could make a pull
>> request.
>> If there are any other issues, it would also be great to know them :)
>>
>> Here is the code in my repository:
>>
>> https://notabug.org/ZelphirKaltstahl/guile-examples/src/65ba7cead2983f1ceb8aa2d4eedfe37734e5ca56/web-development/example-03-serve-static-assets
>>
>> I tried to comment most stuff, so that the code can be understood more
>> easily.
>>
>> And here is a pointer to the path security stuff:
>>
>> https://notabug.org/ZelphirKaltstahl/guile-examples/src/65ba7cead2983f1ceb8aa2d4eedfe37734e5ca56/web-development/example-03-serve-static-assets/web-path-handling.scm#L50
> As for why Guile avoids reasoning about "paths", see
> https://www.gnu.org/prep/standards/standards.html#GNU-Manuals:
>
> Please do not use the term “pathname” that is used in Unix
> documentation; use “file name” (two words) instead. We use the term
> “path” only for search paths, which are lists of directory names.

I see! I was not aware of any of that. That explains the naming. Perhaps
I need to think of a name other than "path" or "file". It still seems
weird to me to call everything a "file" when the name can also refer to
directories, symlinks and whatever else there is. Is there a commonly
used alternative name?

But perhaps on the lower level those are really the same? I have no idea
how a typical file system or OS handles directories.

> Also, your functions "absolute-path" and "complex-path?" in
> path-handling.scm
> (https://notabug.org/ZelphirKaltstahl/guile-examples/src/65ba7cead2983f1ceb8aa2d4eedfe37734e5ca56/web-development/example-03-serve-static-assets/path-handling.scm)
> do not seem to me that they would work correctly when passed something
> starting with "../" (as opposed to containing "/../").

I think that is a remnant of my thinking that anything containing
something with special meaning, like "..", should be considered "complex"
and be refused. That is probably why I did not think about it. Perhaps I
overlooked this case. I need to check my code. I'll add test cases for
those.

> I think that
> with a little bit of work you could accept "../" in arguments and tweak
> path-join to go up (by discarding anything in path1 after the last '/'
> and go to the next part, if there is something to discard).

When a "../" is encountered, it should go one level up, which means
discarding one part (one "thing between the separators"; I have no better
name for it right now). So I would either have to stop using fold or do
something special inside fold to modify the already accumulated parts, I
guess.
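
A minimal sketch of that idea, written in Python only for illustration
(the name join_parts is hypothetical and not taken from the repository).
It keeps the accumulator of the fold as a list of parts, so that ".."
can drop the most recently accumulated part:

```python
from functools import reduce


def join_parts(parts):
    """Fold over the parts; ".." discards the last accumulated part."""
    def step(accumulated, part):
        if part == "..":
            # Go one level up. At the top there is nothing left to discard,
            # so the ".." is silently dropped (an assumption of this sketch).
            return accumulated[:-1]
        if part in ("", "."):
            # Empty parts and "." do not change the result.
            return accumulated
        return accumulated + [part]

    return "/".join(reduce(step, parts, []))


print(join_parts(["static", "css", "..", "js", "app.js"]))
```

Whether a leading ".." should be dropped, kept, or rejected is a design
decision; the sketch drops it only to stay short.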

I think you are correct in that this function is in the path-handling
module and should work properly also for special cases like "../" parts.
It is an abstraction layer below the web-path-handling module. I should
probably completely separate it out into its own example directory and
make it more comprehensive, adding things whenever I realize something
is missing.

> Also I am not sure how it would remove inclusions of '/./' or leading
> './' in the name.

I think in that case such "./" parts could be skipped, as they do not
change the meaning or the result. If a part begins with a "/", it causes
the previously accumulated parts to be discarded, as it is itself an
absolute path. I think this is the behavior of the Python 3 function,
which I took as the example of what the results of my own function
should be.

I should add test cases for this and then fix it.
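
For reference, a short sketch of the behavior described above, assuming
that the Python 3 function meant here is os.path.join (the message does
not name it). posixpath is used so that the results do not depend on the
platform:

```python
import posixpath

# A part that starts with "/" is absolute and discards what came before it.
absolute_wins = posixpath.join("static", "/etc/passwd")

# join itself does not remove "./"; it only concatenates with separators.
joined = posixpath.join("static", "./css/site.css")

# normpath is the function that collapses "." and ".." parts.
normalized = posixpath.normpath("static/./css/../js/./app.js")

print(absolute_wins)  # /etc/passwd
print(joined)         # static/./css/site.css
print(normalized)     # static/js/app.js
```

The first result shows why joining a base directory with an untrusted
name is not enough on its own to keep the result inside that directory.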

> The URI RFC (https://tools.ietf.org/html/rfc3986#section-5) describes
> an algorithm in section 5.2, "Relative Resolution", that canonicalizes
> a URI relative to an absolute URI (you just need to ignore the scheme,
> authority, query and fragment parts and focus on the path). This is
> similar to canonicalization of file names, except for the \\ difficulty.
> In particular, see 5.2.4, "Remove Dot Segments".

That is a lot to read.

Seems like the whole problem calls for a separate path handling library.
I have not yet looked in detail at the algorithm in the RFC, but
hopefully it will be readable, understandable and not too difficult to
implement, so that I can create that library and call it
"RFC-conformant" or something. Seems like I opened a can of worms with
the whole path stuff. It's always those little things that explode ...
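
As a rough idea of the size of the task, here is a sketch of the
"Remove Dot Segments" routine from RFC 3986, section 5.2.4, written in
Python for illustration. The steps follow the RFC; the function name and
the use of a list as output buffer are choices of this sketch:

```python
def remove_dot_segments(path):
    """Remove "." and ".." segments, following RFC 3986, section 5.2.4."""
    output = []  # output buffer; each entry is a segment with its leading "/"
    while path:
        if path.startswith("../"):        # step A
            path = path[3:]
        elif path.startswith("./"):       # step A
            path = path[2:]
        elif path.startswith("/./"):      # step B: "/./" becomes "/"
            path = path[2:]
        elif path == "/.":                # step B
            path = "/"
        elif path.startswith("/../"):     # step C: "/../" becomes "/"
            path = path[3:]
            if output:
                output.pop()              # drop the last output segment
        elif path == "/..":               # step C
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):         # step D
            path = ""
        else:                             # step E: move one segment over
            end = path.find("/", 1)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return "".join(output)


print(remove_dot_segments("/a/b/c/./../../g"))    # /a/g
print(remove_dot_segments("mid/content=5/../6"))  # mid/6
```

Both inputs are the examples given in the RFC itself. Note that a
leading "../" is dropped by step A, so the result never climbs above the
starting point.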

> Also, you should refrain from checking if a file exists, because it
> could be deleted between your call to file-exists? and when you
> actually open the file. Thus, passing the file-exists? test will not
> guarantee that the file will exist when you want to use it, and even
> less that you will be able to open it and read it.

That makes sense. Thanks for pointing it out. I should rather handle the
exception that is raised if the file turns out not to exist when I try
to read it.

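
A small sketch of that approach in Python (the name read_static_file is
hypothetical): open the file directly and treat a failure as "not
available", instead of asking beforehand whether the file exists:

```python
def read_static_file(file_name):
    """Return the file content as bytes, or None if it cannot be read."""
    try:
        with open(file_name, "rb") as port:
            return port.read()
    except (FileNotFoundError, PermissionError, IsADirectoryError):
        # The file is missing, unreadable, or a directory. A web server
        # would typically answer with 404 or 403 here.
        return None
```

There is no gap between a check and the use, so a file that is deleted
at the wrong moment simply leads to the None branch.
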
> Finally, you don't need to check if a file name is "safe" at all. The
> file procedures do not interpret or substitute variables or ~ or ``
> (try it: change directory to /tmp and write to files named ~root,
> `pwd`, $PATH, '*', ... just be aware that you will have a hard time
> deleting them from bash!), and there is nothing special with files
> named as a series of dots. That's good, otherwise you would also need
> to check for '%' in mingw and whatever stuff microsoft invented to
> change the file name experience.

Ha! I simply assumed I would need to do this. Good to know!

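
The same experiment can be sketched in Python, assuming a POSIX file
system (some of these names are not valid on Windows). The names are
taken from the examples above and are created in a temporary directory:

```python
import os
import tempfile

names = ["~root", "`pwd`", "$PATH", "*", "..."]

with tempfile.TemporaryDirectory() as directory:
    for name in names:
        # The name is used literally; nothing is expanded or substituted.
        with open(os.path.join(directory, name), "w") as port:
            port.write("x")
    created = sorted(os.listdir(directory))

print(created == sorted(names))
```

The directory listing contains exactly the literal names, because
expansion of ~, $VARIABLE, backticks and * is done by the shell and not
by the file procedures.
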
> Best regards,
>
> divoplade

All good input, thank you!

Regards,
Zelphir

-- 
repositories: https://notabug.org/ZelphirKaltstahl


Thread overview: 10+ messages
2020-09-17 21:45 Guile web server example serving static files Zelphir Kaltstahl
2020-09-18  5:56 ` divoplade
2020-09-19 10:39   ` Zelphir Kaltstahl [this message]
2020-09-20  7:48   ` tomas
2020-09-20  7:52     ` divoplade
2020-09-20  8:29       ` tomas
2020-09-20  8:54         ` divoplade
2020-09-20  9:07           ` tomas
2020-09-18  7:47 ` Dr. Arne Babenhauserheide
2020-09-19 10:57   ` Zelphir Kaltstahl
