From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:50675) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gZsGe-0000ia-5o for guix-patches@gnu.org; Thu, 20 Dec 2018 01:57:13 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gZsGZ-00089I-03 for guix-patches@gnu.org; Thu, 20 Dec 2018 01:57:12 -0500 Received: from debbugs.gnu.org ([208.118.235.43]:51111) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gZsGT-00085v-Vr for guix-patches@gnu.org; Thu, 20 Dec 2018 01:57:03 -0500 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1gZsGT-0004Gl-RP for guix-patches@gnu.org; Thu, 20 Dec 2018 01:57:01 -0500 Subject: [bug#33801] import: github: Support source URIs that redirect to GitHub Resent-Message-ID: From: Arun Isaac In-Reply-To: <87d0pxtciz.fsf@gnu.org> References: <87d0pxtciz.fsf@gnu.org> Date: Thu, 20 Dec 2018 12:26:26 +0530 Message-ID: MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-patches-bounces+kyle=kyleam.com@gnu.org Sender: "Guix-patches" To: Ludovic =?UTF-8?Q?Court=C3=A8s?= Cc: 33801@debbugs.gnu.org --=-=-= Content-Type: text/plain > Do you know how many packages fall into that category? With this patch, we have a problem estimating the coverage using `guix refresh -L'. Now, to estimate coverage, we need to make HTTP requests for every single source tarball in Guix to determine if it redirects to GitHub. This is an enormous number of HTTP requests! When I ran `guix refresh -L', it took a very long time to finish coverage estimation. So, I cancelled the command. Any better way to handle this? >> +(define (follow-redirects-to-github uri) >> + "Follow redirects of URI until a GitHub URI is found. Return that GitHub >> +URI. If no GitHub URI is found, return #f." > > Perhaps add the yt-dl.org example as a comment here. I added a reference to the youtube-dl package in the comments. I also added a few more comments in other places. >> + (define (follow-redirect uri) >> + (receive (response body) (http-get uri #:streaming? #t) > > Add: (close-port body). I switched to using (http-head uri) instead of (http-get uri #:streaming? #t). So, (close-port body) should no longer be required. I also modified follow-redirects-to-github to avoid following redirects on mirror and file URIs. Please find attached a new patch. --=-=-= Content-Type: text/x-patch; charset=utf-8 Content-Disposition: inline; filename=0001-import-github-Support-source-URIs-that-redirect-to-G.patch Content-Transfer-Encoding: quoted-printable >From 7fa1daaf44720fa31813e4f07a2c49a2540a0526 Mon Sep 17 00:00:00 2001 From: Arun Isaac Date: Wed, 19 Dec 2018 15:59:52 +0530 Subject: [PATCH] import: github: Support source URIs that redirect to GitHu= b. * guix/import/github.scm (follow-redirects-to-github): New function. (updated-github-url)[updated-url]: For source URIs on other domains, replace all instances of the old version with the new version. (latest-release)[origin-github-uri]: If necessary, follow redirects to find the GitHub URI. --- guix/import/github.scm | 41 +++++++++++++++++++++++++++++++++++++---- 1 file changed, 37 insertions(+), 4 deletions(-) diff --git a/guix/import/github.scm b/guix/import/github.scm index af9f56e1d..8db7db305 100644 --- a/guix/import/github.scm +++ b/guix/import/github.scm @@ -1,6 +1,7 @@ ;;; GNU Guix --- Functional package management for GNU ;;; Copyright =C2=A9 2016 Ben Woodcroft ;;; Copyright =C2=A9 2017, 2018 Ludovic Court=C3=A8s +;;; Copyright =C2=A9 2018 Arun Isaac ;;; ;;; This file is part of GNU Guix. ;;; @@ -19,6 +20,8 @@ =20 (define-module (guix import github) #:use-module (ice-9 match) + #:use-module (ice-9 receive) + #:use-module (ice-9 regex) #:use-module (srfi srfi-1) #:use-module (srfi srfi-26) #:use-module (srfi srfi-34) @@ -29,6 +32,8 @@ #:use-module (guix packages) #:use-module (guix upstream) #:use-module (guix http-client) + #:use-module (web client) + #:use-module (web response) #:use-module (web uri) #:export (%github-updater)) =20 @@ -39,12 +44,30 @@ false if none is recognized" (list ".tar.gz" ".tar.bz2" ".tar.xz" ".zip" ".tar" ".tgz" ".tbz" ".love"))) =20 +(define (follow-redirects-to-github uri) + "Follow redirects of URI until a GitHub URI is found. Return that GitHub +URI. If no GitHub URI is found, return #f." + (define (follow-redirect uri) + (receive (response body) (http-head uri) + (case (response-code response) + ((301 302) + (uri->string (assoc-ref (response-headers response) 'location))) + (else #f)))) + + (cond + ((string-prefix? "https://github.com/" uri) uri) + ((string-prefix? "http" uri) + (and=3D> (follow-redirect uri) follow-redirects-to-github)) + ;; Do not attempt to follow redirects on URIs other than http and https + ;; (such as mirror, file) + (else #f))) + (define (updated-github-url old-package new-version) ;; Return a url for the OLD-PACKAGE with NEW-VERSION. If no source url = in ;; the OLD-PACKAGE is a GitHub url, then return false. =20 (define (updated-url url) - (if (string-prefix? "https://github.com/" url) + (if (follow-redirects-to-github url) (let ((ext (or (find-extension url) "")) (name (package-name old-package)) (version (package-version old-package)) @@ -83,7 +106,14 @@ false if none is recognized" url) (string-append "/releases/download/" repo "-" version "/" repo= "-" version ext)) - (#t #f))) ; Some URLs are not recognised. + ;; As a last resort, attempt to replace all instances of the old + ;; version with the new version. This is necessary to handle UR= Is + ;; hosted on other domains that redirect to GitHub (for an exam= ple, + ;; see the youtube-dl package). We do not know the internal + ;; structure of these URIs and cannot handle them more + ;; intelligently. + (else (regexp-substitute/global + #f version url 'pre new-version 'post)))) #f)) =20 (let ((source-url (and=3D> (package-source old-package) origin-uri)) @@ -210,11 +240,14 @@ https://github.com/settings/tokens")) (define (latest-release pkg) "Return an for the latest release of PKG." (define (origin-github-uri origin) + ;; We follow redirects to GitHub because the origin URI might appear t= o be + ;; hosted on some other domain but just redirects to GitHub. For examp= le, + ;; see the youtube-dl package. (match (origin-uri origin) ((? string? url) - url) ;surely a github.com URL + (follow-redirects-to-github url)) ((urls ...) - (find (cut string-contains <> "github.com") urls)))) + (find follow-redirects-to-github urls)))) =20 (let* ((source-uri (origin-github-uri (package-source pkg))) (name (package-name pkg)) --=20 2.19.2 --=-=-=--