all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Arun Isaac <arunisaac@systemreboot.net>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 33801@debbugs.gnu.org
Subject: [bug#33801] import: github: Support source URIs that redirect to GitHub
Date: Thu, 20 Dec 2018 12:26:26 +0530	[thread overview]
Message-ID: <cu7mup04rfp.fsf@systemreboot.net> (raw)
In-Reply-To: <87d0pxtciz.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 1210 bytes --]


> Do you know how many packages fall into that category?

With this patch, we have a problem estimating the coverage using `guix
refresh -L'. Now, to estimate coverage, we need to make HTTP requests
for every single source tarball in Guix to determine if it redirects to
GitHub. This is an enormous number of HTTP requests! When I ran `guix
refresh -L', it took a very long time to finish coverage estimation. So,
I cancelled the command. Any better way to handle this?

>> +(define (follow-redirects-to-github uri)
>> +  "Follow redirects of URI until a GitHub URI is found. Return that GitHub
>> +URI. If no GitHub URI is found, return #f."
>
> Perhaps add the yt-dl.org example as a comment here.

I added a reference to the youtube-dl package in the comments. I also
added a few more comments in other places.

>> +  (define (follow-redirect uri)
>> +    (receive (response body) (http-get uri #:streaming? #t)
>
> Add: (close-port body).

I switched to using (http-head uri) instead of (http-get uri
#:streaming? #t). So, (close-port body) should no longer be required.

I also modified follow-redirects-to-github to avoid following redirects
on mirror and file URIs.

Please find attached a new patch.


[-- Attachment #2: 0001-import-github-Support-source-URIs-that-redirect-to-G.patch --]
[-- Type: text/x-patch, Size: 4666 bytes --]

From 7fa1daaf44720fa31813e4f07a2c49a2540a0526 Mon Sep 17 00:00:00 2001
From: Arun Isaac <arunisaac@systemreboot.net>
Date: Wed, 19 Dec 2018 15:59:52 +0530
Subject: [PATCH] import: github: Support source URIs that redirect to GitHub.

* guix/import/github.scm (follow-redirects-to-github): New function.
(updated-github-url)[updated-url]: For source URIs on other domains, replace
all instances of the old version with the new version.
(latest-release)[origin-github-uri]: If necessary, follow redirects to find
the GitHub URI.
---
 guix/import/github.scm | 41 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/guix/import/github.scm b/guix/import/github.scm
index af9f56e1d..8db7db305 100644
--- a/guix/import/github.scm
+++ b/guix/import/github.scm
@@ -1,6 +1,7 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2016 Ben Woodcroft <donttrustben@gmail.com>
 ;;; Copyright © 2017, 2018 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2018 Arun Isaac <arunisaac@systemreboot.net>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -19,6 +20,8 @@
 
 (define-module (guix import github)
   #:use-module (ice-9 match)
+  #:use-module (ice-9 receive)
+  #:use-module (ice-9 regex)
   #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
@@ -29,6 +32,8 @@
   #:use-module (guix packages)
   #:use-module (guix upstream)
   #:use-module (guix http-client)
+  #:use-module (web client)
+  #:use-module (web response)
   #:use-module (web uri)
   #:export (%github-updater))
 
@@ -39,12 +44,30 @@ false if none is recognized"
         (list ".tar.gz" ".tar.bz2" ".tar.xz" ".zip" ".tar"
               ".tgz" ".tbz" ".love")))
 
+(define (follow-redirects-to-github uri)
+  "Follow redirects of URI until a GitHub URI is found. Return that GitHub
+URI. If no GitHub URI is found, return #f."
+  (define (follow-redirect uri)
+    (receive (response body) (http-head uri)
+      (case (response-code response)
+        ((301 302)
+         (uri->string (assoc-ref (response-headers response) 'location)))
+        (else #f))))
+
+  (cond
+   ((string-prefix? "https://github.com/" uri) uri)
+   ((string-prefix? "http" uri)
+    (and=> (follow-redirect uri) follow-redirects-to-github))
+   ;; Do not attempt to follow redirects on URIs other than http and https
+   ;; (such as mirror, file)
+   (else #f)))
+
 (define (updated-github-url old-package new-version)
   ;; Return a url for the OLD-PACKAGE with NEW-VERSION.  If no source url in
   ;; the OLD-PACKAGE is a GitHub url, then return false.
 
   (define (updated-url url)
-    (if (string-prefix? "https://github.com/" url)
+    (if (follow-redirects-to-github url)
         (let ((ext     (or (find-extension url) ""))
               (name    (package-name old-package))
               (version (package-version old-package))
@@ -83,7 +106,14 @@ false if none is recognized"
                             url)
             (string-append "/releases/download/" repo "-" version "/" repo "-"
                            version ext))
-           (#t #f))) ; Some URLs are not recognised.
+           ;; As a last resort, attempt to replace all instances of the old
+           ;; version with the new version. This is necessary to handle URIs
+           ;; hosted on other domains that redirect to GitHub (for an example,
+           ;; see the youtube-dl package). We do not know the internal
+           ;; structure of these URIs and cannot handle them more
+           ;; intelligently.
+           (else (regexp-substitute/global
+                  #f version url 'pre new-version 'post))))
         #f))
 
   (let ((source-url (and=> (package-source old-package) origin-uri))
@@ -210,11 +240,14 @@ https://github.com/settings/tokens"))
 (define (latest-release pkg)
   "Return an <upstream-source> for the latest release of PKG."
   (define (origin-github-uri origin)
+    ;; We follow redirects to GitHub because the origin URI might appear to be
+    ;; hosted on some other domain but just redirects to GitHub. For example,
+    ;; see the youtube-dl package.
     (match (origin-uri origin)
       ((? string? url)
-       url)                                       ;surely a github.com URL
+       (follow-redirects-to-github url))
       ((urls ...)
-       (find (cut string-contains <> "github.com") urls))))
+       (find follow-redirects-to-github urls))))
 
   (let* ((source-uri (origin-github-uri (package-source pkg)))
          (name (package-name pkg))
-- 
2.19.2


  reply	other threads:[~2018-12-20  6:57 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-19 10:44 [bug#33801] import: github: Support source URIs that redirect to GitHub Arun Isaac
2018-12-19 21:47 ` Ludovic Courtès
2018-12-20  6:56   ` Arun Isaac [this message]
2018-12-20 10:55     ` Ludovic Courtès
2018-12-20 11:20       ` Arun Isaac
2018-12-20 11:22         ` Ludovic Courtès
2018-12-20 13:07       ` Arun Isaac
2018-12-20 16:28         ` Ludovic Courtès
2018-12-20 16:48           ` Arun Isaac
2018-12-21 12:27             ` Arun Isaac
2018-12-21 15:18               ` Ludovic Courtès
2018-12-22 10:08                 ` Arun Isaac
2018-12-23 17:23                   ` Ludovic Courtès
2019-01-05 23:18                   ` Ludovic Courtès
2019-01-07 17:48                     ` Arun Isaac
2019-01-08  8:40                       ` Ludovic Courtès
2019-01-08 13:19                         ` Arun Isaac
2019-01-09 14:11                           ` bug#33801: " Ludovic Courtès
2019-01-10  7:45                             ` [bug#33801] " Arun Isaac
2019-01-10  8:52                               ` Ludovic Courtès
2019-01-10 10:12                                 ` Arun Isaac
2018-12-21  0:12           ` Eric Bavier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cu7mup04rfp.fsf@systemreboot.net \
    --to=arunisaac@systemreboot.net \
    --cc=33801@debbugs.gnu.org \
    --cc=ludo@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.