From: Hartmut Goebel <h.goebel@crazy-compilers.com>
To: 28159@debbugs.gnu.org
Subject: bug#28159: Updater needs to support HTTP(S) servers
Date: Sun, 20 Aug 2017 14:06:02 +0200
Message-ID: <2c2838f3-24d6-5010-faf6-49e70f85e963@crazy-compilers.com>

Hi,

our updater currently supports only FTP servers, but more and more
projects are shutting down their FTP service and providing HTTP(S)
servers only (e.g. the Linux kernel). For other projects, the main
distribution point has changed to HTTP, and the mirrors still providing
FTP are lagging (e.g. KDE, see [1]).

A common case is to simply use Apache to serve the directories, but it
will deliver an HTML view of the directory contents (using mod_autoindex
[3]).

In [2] Ludo wrote:

    So we need a way to list the latest releases somehow.  If they publish
    JSON, XML, or some other structured info format, that’s fine too.  But
    HTTP alone is not good: we’d have to infer the information from HTML
    pages, which sounds fragile.

IMHO we cannot expect project and mirror sites to provide this
additional data. Most projects simply will not, since this would
require the server to generate some data files on the fly.

OTOH, I assume the delivered directory index pages to be well-formed
(X)HTML. Thus parsing the HTML should be quite simple: we only need to
pattern-match "<A>" tags or, if Guile has a decent XML/HTML parser,
use that to query the data.

Only relative links without a slash (except a trailing one) have to be
handled. Links with a trailing slash can be assumed to be directories.
(Since auto-index only works if the URL points to a directory and the
directory is marked by a trailing slash, we can assume the generated
links for directories will all have the trailing slash.) At least this
would be a good start, which could be refined if necessary.

Please note that I'm not suggesting writing a general-purpose parser,
but aiming for auto-index HTML pages only.

Some things I already found out:

  * Directory listings generated by mod_autoindex can be provided as a
    simple list by passing the query parameter "F=0" in the URL [4].
    There are other query parameters for sorting and pattern matching.
  * nginx's "ngx_http_autoindex_module" [6] seems not to use query
    parameters, but can be configured (on the server side) to provide
    the content as XML or JSON. The "fancy_index" module [7] is
    documented to "Allow choosing to sort elements", but [7] does not
    state how, or whether, "fancy" can be switched off.
  * lighttpd supports some of these options [5].
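
For the mod_autoindex case, requesting the simple list comes down to
appending query options to the directory URL. A minimal sketch in Python
(for illustration only; simple_listing_url is a hypothetical helper): the
"F=0", "C=M" and "O=D" options are taken from [4], and joining them with
";" follows the form Apache uses in its own column-sorting links. Note that
a server may be configured to ignore these options, so the result still
has to be parsed defensively.

```python
def simple_listing_url(directory_url, newest_first=False):
    """Return directory_url with mod_autoindex query options appended.

    F=0 asks for the simple (non-fancy) list; C=M with O=D sorts by
    last-modified date, descending. The helper name and the
    newest_first parameter are made up for this example.
    """
    if not directory_url.endswith("/"):
        # Auto-index pages are served for directory URLs only.
        directory_url += "/"
    options = ["F=0"]
    if newest_first:
        options += ["C=M", "O=D"]
    return directory_url + "?" + ";".join(options)
```

For example, simple_listing_url("https://example.org/pub") yields
"https://example.org/pub/?F=0" (example.org is a placeholder host).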

[1] http://lists.gnu.org/archive/html/guix-devel/2017-05/msg00237.html
[2] http://lists.gnu.org/archive/html/guix-devel/2017-05/msg00292.html
[3] https://httpd.apache.org/docs/2.4/mod/mod_autoindex.html
[4] https://httpd.apache.org/docs/2.4/mod/mod_autoindex.html#query
[5] https://redmine.lighttpd.net/projects/1/wiki/Docs_ModDirlisting#Table-sorting
[6] http://nginx.org/en/docs/http/ngx_http_autoindex_module.html
[7] https://www.nginx.com/resources/wiki/modules/fancy_index/

-- 

Regards
Hartmut Goebel

| Hartmut Goebel          | h.goebel@crazy-compilers.com               |
| www.crazy-compilers.com | compilers which you thought are impossible |


