From: Ben Woodcroft <b.woodcroft@uq.edu.au>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: "guix-devel@gnu.org" <guix-devel@gnu.org>
Subject: Re: [PATCH] draft addition of github updater
Date: Sun, 20 Dec 2015 10:42:22 +1000
Message-ID: <5675F96E.4090609@uq.edu.au>
In-Reply-To: <87h9kmb8zs.fsf@gnu.org>

Thanks for the encouraging words. Here's the next revision.

On 16/11/15 19:15, Ludovic Courtès wrote:
> Hi!
>
> Ben Woodcroft <b.woodcroft@uq.edu.au> skribis:
>
>> Importing from GitHub seems very non-trivial, but can we update?
>> There's a number of issues with the attached patch but so far out of
>> the 171 github package in guix, it recognizes 101, and 17 are detected
>> as out of date (see below).
It seems I miscounted before. The updater now recognises 129 of the 146
GitHub "release" packages, and 28 of those are reported as out of date
(see the end of this email for the list). There is one false positive:

gnu/packages/ocaml.scm:202:13: camlp4 would be upgraded from 4.02+6 to 
4.02.0+1

This happens because the newer versions were only tagged, not published
as official releases, so they are omitted from the API response. The odd
version numbering scheme adds to the confusion. The Guix package is in
fact up to date.
>> I have two questions:
>>
>> 1. Some guess-work is required to get between the version as it is
>> defined in guix, and that presented in the github json, where only the
>> "tag_name" is available. Is it OK to be a little speculative in this
>> conversion e.g. "v1.0" => "1.0"?
> I guess so.  What I would do is do that conversion when the tag matches
> “^v[0-9]” and leave the tag as-is in other cases.  WDYT?
>
> We can always add more heuristics later if we find that there’s another
> widely-used convention for tag names.
Most seem to follow those few conventions, but there are still repos
that decided to be different, e.g.:

https://github.com/vapoursynth/vapoursynth/archive/R28.tar.gz
https://github.com/synergy/synergy/archive/v1.7.4-stable.tar.gz

Having gotten this far, I wonder if I've gone about it backwards.
Currently the updater decides whether a package is refreshable by
inspecting the source URI only. It might be easier to decide this with
an API response on hand, by matching the current version number against
the release tags. If we then assume that the newest release uses the
same tag-to-version transformation, the transformation can be reversed
on the newest tag to recover its version number. By transformation I
mean the addition of [a-z\.\-] characters before and after the version
number. This is easier because guesswork is only needed to convert
between the tag and the version number, without reference to a URI.
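
A minimal sketch of that idea, assuming the current version occurs
verbatim inside one of the tags. The helper names tag-affixes and
tag->version are hypothetical and not part of the attached patch, and
the newer tags in the usage note are made up for illustration.

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper: find the text surrounding VERSION inside TAG.
;; Return a pair (PREFIX . SUFFIX), or #f when VERSION does not occur
;; in TAG.
(define (tag-affixes version tag)
  (let ((start (string-contains tag version)))
    (and start
         (cons (substring tag 0 start)
               (substring tag (+ start (string-length version)))))))

;; Hypothetical helper: strip the same PREFIX and SUFFIX from
;; NEWEST-TAG.  Return #f when NEWEST-TAG does not follow the same
;; convention, or when AFFIXES is #f.
(define (tag->version newest-tag affixes)
  (match affixes
    ((prefix . suffix)
     (and (string-prefix? prefix newest-tag)
          (string-suffix? suffix newest-tag)
          (> (string-length newest-tag)
             (+ (string-length prefix) (string-length suffix)))
          (substring newest-tag
                     (string-length prefix)
                     (- (string-length newest-tag)
                        (string-length suffix)))))
    (#f #f)))
```

With the synergy tag above, (tag-affixes "1.7.4" "v1.7.4-stable") gives
("v" . "-stable"), and applying that to a hypothetical newer tag
"v1.7.5-stable" gives "1.7.5". A tag that does not share the affixes
yields #f, which the updater could treat as "not recognised".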

This means more work for me, so is it a good idea? As I understand it,
it would involve returning #t more often from "github-package?". If an
updater returns #f, do the updaters further down the chain get a bite at
the cherry too? It doesn't matter for now since the github updater is
last, but it might in the future.
>> 2. For mass-updates, it fails when it hits the abuse limit on github
>> (60 api requests per hour). This can be overcome by authenticating
>> with an access token, but I don't think that token should go in the
>> git repository. So I'm after some guidance on the best way of the user
>> providing a token to the updater (or some other workaround).
> Argh, that’s annoying.  How does it fail exactly?  What’s the impact on
> the behavior of ‘guix refresh’?
I didn't investigate thoroughly, but I believe it gives either a 403 or
a more descriptive JSON string, depending on the user-agent.

I added an error message and now error out when json-fetch* returns #f.
This was potentially a little lazy on my part, as it might be better to
distinguish the 403 error from other kinds of errors, but it wasn't
immediately obvious to me how to do this without going too deep into the
fetching functions and/or duplicating code. WDYT?
>
> I guess (guix import github) could contain something like:
>
>    (define %github-token
>      ;; Token to be passed to Github.com to avoid the 60-request per hour
>      ;; limit, or #f.
>      (make-parameter (getenv "GUIX_GITHUB_TOKEN")))
>
> and we’d need to document that, or maybe write a message hinting at it
> when we know the limit has been reached.
>
> WDYT?
Seems we were all thinking the same thing - I've integrated this. Should 
we check that the token matches ^[0-9a-f]+$ for security and UI?
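
A rough sketch of such a check, assuming tokens really are non-empty
lowercase hexadecimal strings as the regexp above supposes (the actual
GitHub token format is an assumption here). The predicate name
plausible-github-token? is hypothetical and not part of the patch.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical predicate: return #t when TOKEN is a non-empty string
;; made only of lowercase hexadecimal digits, #f otherwise.  TOKEN may
;; be #f, as returned by `getenv' when the variable is unset.
(define (plausible-github-token? token)
  (and (string? token)
       (string-match "^[0-9a-f]+$" token)
       #t))
```

A value such as "0123abcd" passes, while "abc&foo=bar" is rejected, so
nothing unexpected would be spliced into the request URL.
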
> I was thinking we could have a generic Git updater that would look for 
> available tags upstream. I wonder how efficient that would be compared 
> to using the GitHub-specific API, and if there would be other 
> differences. What are your thoughts on this?
This sounds like an excellent idea, but I was unable to find any way to
fetch tags without cloning first. I would imagine a clone could take a
long time and a lot of bandwidth. Also, I don't think there is a way to
tell regular releases from pre-releases. It is a bit unclear to me how
conservative these updaters should be: are tags sufficiently synonymous
with releases to be reported by refresh?

A number of the GitHub repos packaged refer to git commits directly
too. These are ignored by the current updater but might benefit from
this approach (as would non-GitHub git repos, of course).
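
On fetching tags without a clone: "git ls-remote --tags URL" prints a
remote's tag refs without downloading any objects. A rough sketch of
turning that output into tag names follows. The helper name
ls-remote-output->tags is hypothetical, running git itself is left out,
and the sketch does nothing to tell releases from pre-releases.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical helper: extract tag names from the text printed by
;; `git ls-remote --tags URL', where each line has the form
;; "<sha>\trefs/tags/<name>".
(define (ls-remote-output->tags output)
  (filter-map
   (lambda (line)
     (let ((tab (string-index line #\tab)))
       (and tab
            (let ((ref (substring line (+ tab 1))))
              (and (string-prefix? "refs/tags/" ref)
                   ;; Skip peeled entries such as "refs/tags/v1.0^{}",
                   ;; which repeat an annotated tag.
                   (not (string-suffix? "^{}" ref))
                   (substring ref (string-length "refs/tags/")))))))
   (string-split output #\newline)))
```

The resulting list of tag names could then go through the same
tag-to-version guesswork as the "tag_name" fields of the GitHub API.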

Thanks,
ben



gnu/packages/xml.scm:380:13: pugixml would be upgraded from 1.6 to 1.7
gnu/packages/web.scm:353:13: libpsl would be upgraded from 0.7.1 to 0.11.0
gnu/packages/web.scm:685:6: sassc would be upgraded from 3.2.5 to 3.3.2
gnu/packages/version-control.scm:934:13: findnewest would be upgraded 
from 0.2 to 0.3
gnu/packages/telephony.scm:192:13: libsrtp would be upgraded from 1.5.2 
to 1.5.3
gnu/packages/ruby.scm:2373:13: ruby-sanitize would be upgraded from 
4.0.0 to 4.0.1
gnu/packages/ocaml.scm:202:13: camlp4 would be upgraded from 4.02+6 to 
4.02.0+1
gnu/packages/ninja.scm:31:13: ninja would be upgraded from 1.5.3 to 1.6.0
gnu/packages/maths.scm:1855:13: dealii would be upgraded from 8.2.1 to 8.3.0
gnu/packages/jrnl.scm:30:13: jrnl would be upgraded from 1.8.4 to 1.9.7
gnu/packages/gl.scm:453:13: libepoxy would be upgraded from 1.2 to 1.3.1
gnu/packages/game-development.scm:125:13: tiled would be upgraded from 
0.13.1 to 0.14.2
gnu/packages/fontutils.scm:285:13: libuninameslist would be upgraded 
from 0.4.20140731 to 0.5.20150701
gnu/packages/engineering.scm:58:13: librecad would be upgraded from 
2.0.6-rc to 2.0.8
gnu/packages/emacs.scm:436:13: haskell-mode would be upgraded from 
13.14.2 to 13.16
gnu/packages/conky.scm:35:13: conky would be upgraded from 1.10.0 to 1.10.1
gnu/packages/bioinformatics.scm:974:13: deeptools would be upgraded from 
1.5.11 to 1.5.12
gnu/packages/bioinformatics.scm:1532:13: htsjdk would be upgraded from 
1.129 to 2.0.1
gnu/packages/bioinformatics.scm:207:13: bedtools would be upgraded from 
2.24.0 to 2.25.0
gnu/packages/bioinformatics.scm:1880:13: orfm would be upgraded from 
0.4.1 to 0.5.2
gnu/packages/bioinformatics.scm:758:13: clipper would be upgraded from 
0.3.0 to 1.0
gnu/packages/bioinformatics.scm:1612:13: idr would be upgraded from 
2.0.0 to 2.0.2
gnu/packages/bioinformatics.scm:2592:13: preseq would be upgraded from 
2.0 to 2.0.2
gnu/packages/bioinformatics.scm:2978:13: vsearch would be upgraded from 
1.4.1 to 1.9.5
gnu/packages/bioinformatics.scm:1360:13: grit would be upgraded from 
2.0.2 to 2.0.4
gnu/packages/bioinformatics.scm:1577:13: htslib would be upgraded from 
1.2.1 to 1.3
gnu/packages/bioinformatics.scm:1013:13: diamond would be upgraded from 
0.7.9 to 0.7.10
gnu/packages/bioinformatics.scm:613:13: bowtie would be upgraded from 
2.2.4 to 2.2.6


[-- Attachment #2: 0001-import-Add-github-updater.patch --]
[-- Type: text/x-patch; name="0001-import-Add-github-updater.patch", Size: 11421 bytes --]

From a42eda6b9631cc28dfdd02d2c8bb02eabb2626b9 Mon Sep 17 00:00:00 2001
From: Ben Woodcroft <donttrustben@gmail.com>
Date: Sun, 15 Nov 2015 10:18:05 +1000
Subject: [PATCH] import: Add github-updater.

* guix/import/github.scm: New file.
* guix/scripts/refresh.scm (%updaters): Add %GITHUB-UPDATER.
* doc/guix.texi (Invoking guix refresh): Mention it.
---
 doc/guix.texi            |  14 ++++
 guix/import/github.scm   | 167 +++++++++++++++++++++++++++++++++++++++++++++++
 guix/scripts/refresh.scm |   5 +-
 3 files changed, 185 insertions(+), 1 deletion(-)
 create mode 100644 guix/import/github.scm

diff --git a/doc/guix.texi b/doc/guix.texi
index 06d70ba..f6b7368 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -16,6 +16,7 @@ Copyright @copyright{} 2013 Nikita Karetnikov@*
 Copyright @copyright{} 2015 Mathieu Lirzin@*
 Copyright @copyright{} 2014 Pierre-Antoine Rault@*
 Copyright @copyright{} 2015 Taylan Ulrich Bayırlı/Kammer
+Copyright @copyright{} 2015 Ben Woodcroft
 
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
@@ -4354,6 +4355,16 @@ attempt is made to automatically retrieve it from a public key server;
 when it's successful, the key is added to the user's keyring; otherwise,
 @command{guix refresh} reports an error.
 
+The @code{github} updater uses the
+@uref{https://developer.github.com/v3/, GitHub API} to query for new
+releases.  When it is used repeatedly, e.g.@: when refreshing all packages,
+GitHub eventually refuses to answer further API requests.  By default 60
+API requests per hour are allowed, and a full refresh of all GitHub
+packages in Guix requires more than this.  Authentication with GitHub
+through an API token raises this limit.  To use an API token, set the
+environment variable @code{GUIX_GITHUB_TOKEN} to a token procured from
+@uref{https://github.com/settings/tokens} or otherwise.
+
 The following options are supported:
 
 @table @code
@@ -4415,6 +4426,8 @@ the updater for @uref{http://elpa.gnu.org/, ELPA} packages;
 the updater for @uref{http://cran.r-project.org/, CRAN} packages;
 @item pypi
 the updater for @uref{https://pypi.python.org, PyPI} packages.
+@item github
+the updater for @uref{https://github.com, GitHub} packages.
 @end table
 
 For instance, the following commands only checks for updates of Emacs
@@ -4501,6 +4514,7 @@ Use @var{host} as the OpenPGP key server when importing a public key.
 
 @end table
 
+
 @node Invoking guix lint
 @section Invoking @command{guix lint}
 The @command{guix lint} is meant to help package developers avoid common
diff --git a/guix/import/github.scm b/guix/import/github.scm
new file mode 100644
index 0000000..2ed477e
--- /dev/null
+++ b/guix/import/github.scm
@@ -0,0 +1,167 @@
+;;; GNU Guix --- Functional package management for GNU
+;;; Copyright © 2015 Ben Woodcroft <donttrustben@gmail.com>
+;;;
+;;; This file is part of GNU Guix.
+;;;
+;;; GNU Guix is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; GNU Guix is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
+
+;; TODO: Are all of these imports used?
+(define-module (guix import github)
+  #:use-module (ice-9 match)
+  #:use-module (srfi srfi-1)
+  #:use-module (json)
+  #:use-module (guix utils)
+  #:use-module ((guix download) #:prefix download:)
+  #:use-module (guix import utils)
+  #:use-module (guix packages)
+  #:use-module (guix upstream)
+  #:use-module (gnu packages)
+  #:export (%github-updater))
+
+(define (json-fetch* url)
+  "Return a list/hash representation of the JSON resource URL, or #f on
+failure."
+  (call-with-output-file "/dev/null"
+    (lambda (null)
+      (with-error-to-port null
+        (lambda ()
+          (call-with-temporary-output-file
+           (lambda (temp port)
+             (and (url-fetch url temp)
+                  (call-with-input-file temp json->scm)))))))))
+
+;; TODO: is there some code from elsewhere in guix that can be used instead of
+;; redefining?
+(define (find-extension url)
+  "Return the extension of the archive e.g. '.tar.gz' given a URL, or
+false if none is recognized."
+  (find (lambda (ext) (string-suffix? ext url))
+        (list ".tar.gz" ".tar.bz2" ".tar.xz" ".zip" ".tar")))
+
+(define (github-package? package)
+  "Return true if PACKAGE is a package from GitHub."
+
+  (define (github-url? url)
+    (and
+     (string-prefix? "https://github.com/" url)
+     (let ((ext (find-extension url)))
+       (and ext
+            (or
+             (string-suffix?
+              (string-append "/archive/v" (package-version package) ext) url)
+             (string-suffix?
+              (string-append "/archive/" (package-version package) ext) url)
+             (string-suffix?
+              (string-append "/archive/" (package-name package) "-"
+                             (package-version package) ext)
+              url)
+             (string-suffix?
+              (string-append "/releases/download/v" (package-version package)
+                             "/" (package-name package) "-"
+                             (package-version package) ext)
+              url)
+             (string-suffix?
+              (string-append "/releases/download/" (package-version package)
+                             "/" (package-name package) "-"
+                             (package-version package) ext)
+              url))))))
+
+  (let ((source-url (and=> (package-source package) origin-uri))
+        (fetch-method (and=> (package-source package) origin-method)))
+    (and (eq? fetch-method download:url-fetch)
+         (match source-url
+           ((? string?)
+            (github-url? source-url))
+           ((source-url ...)
+            (any github-url? source-url))))))
+
+(define (github-user-slash-repository url)
+  "Return a string e.g. arq5x/bedtools2 of the owner and the name of the
+repository separated by a forward slash, from a string URL of the form
+'https://github.com/arq5x/bedtools2/archive/v2.24.0.tar.gz'"
+  (let ((splits (string-split url #\/)))
+    (string-append (list-ref splits 3) "/" (list-ref splits 4))))
+
+(define %github-token
+  ;; Token to be passed to Github.com to avoid the 60-request per hour
+  ;; limit, or #f.
+  ;; QUESTION: is there a need to check that the token looks like a token, for
+  ;; security, since it gets used in a fetch as is?
+  (make-parameter (getenv "GUIX_GITHUB_TOKEN")))
+
+(define (latest-released-version url package-name)
+  "Return a string of the newest released version name given a string URL like
+'https://github.com/arq5x/bedtools2/archive/v2.24.0.tar.gz' and the name of
+the package e.g. 'bedtools2'. Return #f if there are no releases."
+  (let* ((token (%github-token))
+         (api-url (string-append
+                   "https://api.github.com/repos/"
+                   (github-user-slash-repository url)
+                   "/releases"))
+         (json (json-fetch*
+                (if token
+                    (string-append api-url "?access_token=" token)
+                    api-url))))
+    (if (eq? json #f)
+        (if token
+            (error "Error downloading release information through the GitHub
+API when using a GitHub token")
+            (error "Error downloading release information through the GitHub
+API. This may be fixed by using an access token and setting the environment
+variable GUIX_GITHUB_TOKEN, for instance one procured from
+https://github.com/settings/tokens"))
+        (let ((proper-releases
+               (filter
+                (lambda (x)
+                  ;; example pre-release:
+                  ;; https://github.com/wwood/OrfM/releases/tag/v0.5.1
+                  ;; or an all-prerelease set
+                  ;; https://github.com/powertab/powertabeditor/releases
+                  (not (assoc-ref (hash-table->alist x) "prerelease")))
+                json)))
+          (if (null? proper-releases) #f ;empty releases list
+              (let*
+                  ((tag (assoc-ref (hash-table->alist (first proper-releases))
+                                   "tag_name"))
+                   (name-length (string-length package-name)))
+                ;; some tags include the name of the package e.g. "fdupes-1.51"
+                ;; so remove these
+                (if (and (< name-length (string-length tag))
+                         (string=? (string-append package-name "-")
+                                   (substring tag 0 (+ name-length 1))))
+                    (substring tag (+ name-length 1))
+                    ;; some tags start with a "v" e.g. "v0.25.0"
+                    ;; where some are just the version number
+                    (if (eq? (string-ref tag 0) #\v)
+                        (substring tag 1) tag))))))))
+
+(define (latest-release guix-package)
+  "Return an <upstream-source> for the latest release of GUIX-PACKAGE."
+  (let* ((pkg (specification->package guix-package))
+         (source-uri (origin-uri (package-source pkg)))
+         (name (package-name pkg))
+         (version (latest-released-version source-uri name)))
+    (if version
+        (upstream-source
+         (package guix-package)
+         (version version)
+         (urls (list source-uri)))
+        #f)))
+
+(define %github-updater
+  (upstream-updater
+   (name 'github)
+   (description "Updater for GitHub packages")
+   (pred github-package?)
+   (latest latest-release)))
diff --git a/guix/scripts/refresh.scm b/guix/scripts/refresh.scm
index a5834d1..adbcf28 100644
--- a/guix/scripts/refresh.scm
+++ b/guix/scripts/refresh.scm
@@ -3,6 +3,7 @@
 ;;; Copyright © 2013 Nikita Karetnikov <nikita@karetnikov.org>
 ;;; Copyright © 2014 Eric Bavier <bavier@member.fsf.org>
 ;;; Copyright © 2015 Alex Kost <alezost@gmail.com>
+;;; Copyright © 2015 Ben Woodcroft <donttrustben@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -34,6 +35,7 @@
                 #:select (%gnu-updater %gnome-updater))
   #:use-module (guix import elpa)
   #:use-module (guix import cran)
+  #:use-module (guix import github)
   #:use-module (guix gnupg)
   #:use-module (gnu packages)
   #:use-module ((gnu packages commencement) #:select (%final-inputs))
@@ -195,7 +197,8 @@ unavailable optional dependencies such as Guile-JSON."
                  %gnome-updater
                  %elpa-updater
                  %cran-updater
-                 ((guix import pypi) => %pypi-updater)))
+                 ((guix import pypi) => %pypi-updater)
+                 %github-updater))
 
 (define (lookup-updater name)
   "Return the updater called NAME."
-- 
2.5.0

