unofficial mirror of guix-patches@gnu.org 
 help / color / mirror / code / Atom feed
* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
@ 2022-03-03 21:13 Ludovic Courtès
  2022-03-03 21:14 ` [bug#54241] [PATCH 1/4] http-client: Add response headers to '&http-get-error' Ludovic Courtès
  2022-03-04 12:07 ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Maxime Devos
  0 siblings, 2 replies; 27+ messages in thread
From: Ludovic Courtès @ 2022-03-03 21:13 UTC (permalink / raw)
  To: 54241; +Cc: Ludovic Courtès, Nicolas Goaziou

Hi Guix!

These patches address a famous complaint about “the GitHub problem”
that affects ‘guix refresh’¹, shown here in its naked awfulness:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix refresh
gnu/packages/zig.scm:32:13: zig would be upgraded from 0.9.0 to 0.9.1

[...]

In guix/scripts/refresh.scm:
   578:14  5 (_ _)
In srfi/srfi-1.scm:
    634:9  4 (for-each #<procedure 7fe85c9a8e00 at guix/scripts/refresh.scm:578:24 (t-916fdc98f4be2f1-1d48)> _)
In guix/scripts/refresh.scm:
    378:2  3 (check-for-package-update #<package redshift-wayland@1.12-1.7da875d gnu/packages/xdisorg.scm:1425 7fe85879e790> (#<<upstream…>) …)
In guix/import/github.scm:
   232:12  2 (latest-release _)
In ice-9/boot-9.scm:
  1685:16  1 (raise-exception _ #:continuable? _)
  1685:16  0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Error downloading release information through the GitHub
API. This may be fixed by using an access token and setting the environment
variable GUIX_GITHUB_TOKEN, for instance one procured from
https://github.com/settings/tokens
--8<---------------cut here---------------end--------------->8---

With this change, ‘guix refresh’ warns you when the GitHub rate limit
is reached, but it keeps going, falling back to the ‘generic-git’
updater if it’s among the applicable updaters:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix refresh -t github,generic-git

[...]

guix refresh: warning: GitHub rate limit exceeded; disallowing requests for 1477 seconds
hint: You can waive the rate limit by setting the `GUIX_GITHUB_TOKEN' environment variable to a
token obtained from `https://github.com/settings/tokens' with your GitHub account.

Alternatively, you can wait until your rate limit is reset, or use the `generic-git' updater
instead.

gnu/packages/zile.scm:113:14: warning: no tags were found for zile-on-guile
gnu/packages/zig.scm:32:13: zig would be upgraded from 0.9.0 to 0.9.1
gnu/packages/xorg.scm:2830:7: warning: no valid tags found for xf86-video-freedreno
gnu/packages/xml.scm:2132:13: java-kxml2 would be upgraded from 2.4.2 to 2.5.0
--8<---------------cut here---------------end--------------->8---

The GitHub updater becomes functional again once the rate limit has
been reset.

The code to deal with rate limiting is similar to that in (guix swh).

Thoughts?

Thanks,
Ludo’.

¹ https://issues.guix.gnu.org/53818#50

Ludovic Courtès (4):
  http-client: Add response headers to '&http-get-error'.
  import: github: Gracefully handle rate limit exhaustion.
  http-client: Correctly handle redirects when #:keep-alive? #t.
  import: github: Reuse HTTP connection for the /tags URL fallback.

 .dir-locals.el         |   1 +
 guix/http-client.scm   |  39 +++++++++-----
 guix/import/github.scm | 119 +++++++++++++++++++++++++++++------------
 3 files changed, 112 insertions(+), 47 deletions(-)


base-commit: be84fb701bf7a36a0eb50147ccbb988aa3f41209
-- 
2.34.0





^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 1/4] http-client: Add response headers to '&http-get-error'.
  2022-03-03 21:13 [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
@ 2022-03-03 21:14 ` Ludovic Courtès
  2022-03-03 21:14   ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Ludovic Courtès
                     ` (2 more replies)
  2022-03-04 12:07 ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Maxime Devos
  1 sibling, 3 replies; 27+ messages in thread
From: Ludovic Courtès @ 2022-03-03 21:14 UTC (permalink / raw)
  To: 54241; +Cc: Ludovic Courtès

* guix/http-client.scm (&http-get-error)[headers]: New field.
(http-fetch): Initialize the 'headers' field.
---
 guix/http-client.scm | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/guix/http-client.scm b/guix/http-client.scm
index 10bc278023..4b01e31165 100644
--- a/guix/http-client.scm
+++ b/guix/http-client.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2020, 2021 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2012-2018, 2020-2022 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2015 Mark H Weaver <mhw@netris.org>
 ;;; Copyright © 2012, 2015 Free Software Foundation, Inc.
 ;;; Copyright © 2017 Tobias Geerinckx-Rice <me@tobias.gr>
@@ -52,6 +52,7 @@ (define-module (guix http-client)
             http-get-error-uri
             http-get-error-code
             http-get-error-reason
+            http-get-error-headers
 
             http-fetch
             http-multiple-get
@@ -69,9 +70,10 @@ (define-module (guix http-client)
 ;; HTTP GET error.
 (define-condition-type &http-get-error &error
   http-get-error?
-  (uri    http-get-error-uri)                     ; URI
-  (code   http-get-error-code)                    ; integer
-  (reason http-get-error-reason))                 ; string
+  (uri      http-get-error-uri)                   ;URI
+  (code     http-get-error-code)                  ;integer
+  (reason   http-get-error-reason)                ;string
+  (headers  http-get-error-headers))              ;alist
 
 
 (define* (http-fetch uri #:key port (text? #f) (buffered? #t)
@@ -138,7 +140,8 @@ (define* (http-fetch uri #:key port (text? #f) (buffered? #t)
            (raise (condition (&http-get-error
                               (uri uri)
                               (code code)
-                              (reason (response-reason-phrase resp)))
+                              (reason (response-reason-phrase resp))
+                              (headers (response-headers resp)))
                              (&message
                               (message
                                (format
-- 
2.34.0





^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion.
  2022-03-03 21:14 ` [bug#54241] [PATCH 1/4] http-client: Add response headers to '&http-get-error' Ludovic Courtès
@ 2022-03-03 21:14   ` Ludovic Courtès
  2022-03-05  9:35     ` Maxime Devos
                       ` (4 more replies)
  2022-03-03 21:14   ` [bug#54241] [PATCH 3/4] http-client: Correctly handle redirects when #:keep-alive? #t Ludovic Courtès
  2022-03-03 21:14   ` [bug#54241] [PATCH 4/4] import: github: Reuse HTTP connection for the /tags URL fallback Ludovic Courtès
  2 siblings, 5 replies; 27+ messages in thread
From: Ludovic Courtès @ 2022-03-03 21:14 UTC (permalink / raw)
  To: 54241; +Cc: Ludovic Courtès

Previously, 'guix refresh' would literally crash when the rate limit was
reached due to the call to 'error'.  With this change, the updater
notices when the rate limit is reached and it turns itself into a no-op
until the rate limit has been reset.

When running "guix refresh" (with no arguments), the 'github' updater
gets used until the rate limit has been reached, after which "guix
refresh" automatically picks up the next valid updater, typically
'generic-git'.

* guix/import/github.scm (fetch-releases-or-tags): Use 'http-fetch'
directly instead of 'json-fetch' to let 'http-get-error?' exceptions
through.  Handle 403 errors with an 'X-RateLimit-Remaining' header.
(%rate-limit-reset-time): New variable.
(update-rate-limit-reset-time!, request-rate-limit-reached?): New
procedures.
(latest-released-version): Remove calls to 'error'.
---
 guix/import/github.scm | 113 +++++++++++++++++++++++++++++------------
 1 file changed, 80 insertions(+), 33 deletions(-)

diff --git a/guix/import/github.scm b/guix/import/github.scm
index 8c1898c0c5..5062bad80d 100644
--- a/guix/import/github.scm
+++ b/guix/import/github.scm
@@ -1,6 +1,6 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2016 Ben Woodcroft <donttrustben@gmail.com>
-;;; Copyright © 2017, 2018, 2019, 2020 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2017-2020, 2022 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2018 Eric Bavier <bavier@member.fsf.org>
 ;;; Copyright © 2019 Arun Isaac <arunisaac@systemreboot.net>
 ;;; Copyright © 2019 Efraim Flashner <efraim@flashner.co.il>
@@ -30,15 +30,16 @@ (define-module (guix import github)
   #:use-module (guix utils)
   #:use-module (guix i18n)
   #:use-module (guix diagnostics)
+  #:use-module ((guix ui) #:select (display-hint))
   #:use-module ((guix download) #:prefix download:)
   #:use-module ((guix git-download) #:prefix download:)
   #:use-module (guix import utils)
-  #:use-module (guix import json)
   #:use-module (json)
   #:use-module (guix packages)
   #:use-module (guix upstream)
   #:use-module (guix http-client)
   #:use-module (web uri)
+  #:use-module (web response)
   #:export (%github-api %github-updater))
 
 ;; For tests.
@@ -140,6 +141,30 @@ (define %github-token
   ;; limit, or #f.
   (make-parameter (getenv "GUIX_GITHUB_TOKEN")))
 
+(define %rate-limit-reset-time
+  ;; Time (seconds since the Epoch, UTC) when the rate limit for GitHub
+  ;; requests will be reset, or #f if the rate limit hasn't been reached.
+  #f)
+
+(define (update-rate-limit-reset-time! headers)
+  "Update the rate limit reset time based on HEADERS, the HTTP response
+headers."
+  (match (assq-ref headers 'x-ratelimit-reset)
+    ((= string->number (? number? reset))
+     (set! %rate-limit-reset-time reset)
+     reset)
+    (_
+     0)))
+
+(define (request-rate-limit-reached?)
+  "Return true if the rate limit has been reached."
+  (and %rate-limit-reset-time
+       (match (< (car (gettimeofday)) %rate-limit-reset-time)
+         (#t #t)
+         (#f
+          (set! %rate-limit-reset-time #f)
+          #f))))
+
 (define (fetch-releases-or-tags url)
   "Fetch the list of \"releases\" or, if it's empty, the list of tags for the
 repository at URL.  Return the corresponding JSON dictionaries (alists),
@@ -170,20 +195,49 @@ (define headers
             `((Authorization . ,(string-append "token " (%github-token))))
             '())))
 
-  (guard (c ((and (http-get-error? c)
-                  (= 404 (http-get-error-code c)))
-             (warning (G_ "~a is unreachable (~a)~%")
-                      release-url (http-get-error-code c))
-             '#()))                               ;return an empty release set
-    (let* ((port   (http-fetch release-url #:headers headers))
-           (result (json->scm port)))
-      (close-port port)
-      (match result
-        (#()
-         ;; We got the empty list, presumably because the user didn't use GitHub's
-         ;; "release" mechanism, but hopefully they did use Git tags.
-         (json-fetch tag-url #:headers headers))
-        (x x)))))
+  (and (not (request-rate-limit-reached?))
+       (guard (c ((and (http-get-error? c)
+                       (= 404 (http-get-error-code c)))
+                  (warning (G_ "~a is unreachable (~a)~%")
+                           (uri->string (http-get-error-uri c))
+                           (http-get-error-code c))
+                  '#())                           ;return an empty release set
+                 ((and (http-get-error? c)
+                       (= 403 (http-get-error-code c)))
+                  ;; See
+                  ;; <https://docs.github.com/en/rest/overview/resources-in-the-rest-api#rate-limiting>.
+                  (match (assq-ref (http-get-error-headers c)
+                                   'x-ratelimit-remaining)
+                    (#f
+                     (raise c))
+                    ((? (compose zero? string->number))
+                     (let ((reset (update-rate-limit-reset-time!
+                                   (http-get-error-headers c))))
+                       (warning (G_ "GitHub rate limit exceeded; \
+disallowing requests for ~a seconds~%")
+                                (- reset (car (gettimeofday))))
+                       (display-hint (G_ "You can waive the rate limit by
+setting the @env{GUIX_GITHUB_TOKEN} environment variable to a token obtained
+from @url{https://github.com/settings/tokens} with your GitHub account.
+
+Alternatively, you can wait until your rate limit is reset, or use the
+@code{generic-git} updater instead."))
+                       #f))                       ;bail out
+                    (_
+                     (raise c)))))
+
+         (let* ((port   (http-fetch release-url #:headers headers))
+                (result (json->scm port)))
+           (close-port port)
+           (match result
+             (#()
+              ;; We got the empty list, presumably because the user didn't use GitHub's
+              ;; "release" mechanism, but hopefully they did use Git tags.
+              (let* ((port (http-fetch tag-url #:headers headers))
+                     (json (json->scm port)))
+                (close-port port)
+                json))
+             (x x))))))
 
 (define (latest-released-version url package-name)
   "Return the newest released version and its tag given a string URL like
@@ -223,23 +277,16 @@ (define (release->version release)
         (cons tag tag))
        (else #f))))
 
-  (let* ((json (and=> (fetch-releases-or-tags url)
-                      vector->list)))
-    (if (eq? json #f)
-        (if (%github-token)
-            (error "Error downloading release information through the GitHub
-API when using a GitHub token")
-            (error "Error downloading release information through the GitHub
-API. This may be fixed by using an access token and setting the environment
-variable GUIX_GITHUB_TOKEN, for instance one procured from
-https://github.com/settings/tokens"))
-        (match (sort (filter-map release->version
-                                 (match (remove pre-release? json)
-                                   (() json) ; keep everything
-                                   (releases releases)))
-                     (lambda (x y) (version>? (car x) (car y))))
-          (((latest-version . tag) . _) (values latest-version tag))
-          (() (values #f #f))))))
+  (match (and=> (fetch-releases-or-tags url) vector->list)
+    (#f (values #f #f))
+    (json
+     (match (sort (filter-map release->version
+                              (match (remove pre-release? json)
+                                (() json)         ; keep everything
+                                (releases releases)))
+                  (lambda (x y) (version>? (car x) (car y))))
+       (((latest-version . tag) . _) (values latest-version tag))
+       (() (values #f #f))))))
 
 (define (latest-release pkg)
   "Return an <upstream-source> for the latest release of PKG."
-- 
2.34.0





^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 3/4] http-client: Correctly handle redirects when #:keep-alive? #t.
  2022-03-03 21:14 ` [bug#54241] [PATCH 1/4] http-client: Add response headers to '&http-get-error' Ludovic Courtès
  2022-03-03 21:14   ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Ludovic Courtès
@ 2022-03-03 21:14   ` Ludovic Courtès
  2022-03-04 12:39     ` Maxime Devos
  2022-03-03 21:14   ` [bug#54241] [PATCH 4/4] import: github: Reuse HTTP connection for the /tags URL fallback Ludovic Courtès
  2 siblings, 1 reply; 27+ messages in thread
From: Ludovic Courtès @ 2022-03-03 21:14 UTC (permalink / raw)
  To: 54241; +Cc: Ludovic Courtès

Previously PORT would be closed unconditionally, which broke redirects
when #:keep-alive? #t is given.

* guix/http-client.scm (http-fetch): Make 'port' a parameter of 'loop'.
Upon 3xx responses, do not close PORT is KEEP-ALIVE? is true, but consume
RESP's body.  Add second argument to 'loop'.
---
 guix/http-client.scm | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/guix/http-client.scm b/guix/http-client.scm
index 4b01e31165..4784497e5f 100644
--- a/guix/http-client.scm
+++ b/guix/http-client.scm
@@ -102,12 +102,12 @@ (define* (http-fetch uri #:key port (text? #f) (buffered? #t)
 Raise an '&http-get-error' condition if downloading fails."
   (let loop ((uri (if (string? uri)
                       (string->uri uri)
-                      uri)))
-    (let ((port (or port (open-connection uri
-                                          #:verify-certificate?
-                                          verify-certificate?
-                                          #:timeout timeout)))
-          (headers (match (uri-userinfo uri)
+                      uri))
+             (port (or port (open-connection uri
+                                             #:verify-certificate?
+                                             verify-certificate?
+                                             #:timeout timeout))))
+    (let ((headers (match (uri-userinfo uri)
                      ((? string? str)
                       (cons (cons 'Authorization
                                   (string-append "Basic "
@@ -131,11 +131,19 @@ (define* (http-fetch uri #:key port (text? #f) (buffered? #t)
             303                                   ; see other
             307                                   ; temporary redirection
             308)                                  ; permanent redirection
-           (let ((uri (resolve-uri-reference (response-location resp) uri)))
-             (close-port port)
+           (let ((host (uri-host uri))
+                 (uri  (resolve-uri-reference (response-location resp) uri)))
+             (if keep-alive?
+                 (dump-port data (%make-void-port "w0")
+                            (response-content-length resp))
+                 (close-port port))
              (format log-port (G_ "following redirection to `~a'...~%")
                      (uri->string uri))
-             (loop uri)))
+             (loop uri
+                   (and keep-alive?
+                        (or (not (uri-host uri))
+                            (string=? host (uri-host uri)))
+                        port))))
           (else
            (raise (condition (&http-get-error
                               (uri uri)
-- 
2.34.0





^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 4/4] import: github: Reuse HTTP connection for the /tags URL fallback.
  2022-03-03 21:14 ` [bug#54241] [PATCH 1/4] http-client: Add response headers to '&http-get-error' Ludovic Courtès
  2022-03-03 21:14   ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Ludovic Courtès
  2022-03-03 21:14   ` [bug#54241] [PATCH 3/4] http-client: Correctly handle redirects when #:keep-alive? #t Ludovic Courtès
@ 2022-03-03 21:14   ` Ludovic Courtès
  2 siblings, 0 replies; 27+ messages in thread
From: Ludovic Courtès @ 2022-03-03 21:14 UTC (permalink / raw)
  To: 54241; +Cc: Ludovic Courtès

* guix/import/github.scm (fetch-releases-or-tags): Call
'open-connection-for-uri' and reuse the same connection for the two
'http-fetch' calls.
* .dir-locals.el (scheme-mode): Add 'call-with-port'.
---
 .dir-locals.el         |  1 +
 guix/import/github.scm | 30 ++++++++++++++++++------------
 2 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/.dir-locals.el b/.dir-locals.el
index 0edf2a8d23..bee745cc27 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -52,6 +52,7 @@
    (eval . (put 'test-equal 'scheme-indent-function 1))
    (eval . (put 'test-eq 'scheme-indent-function 1))
    (eval . (put 'call-with-input-string 'scheme-indent-function 1))
+   (eval . (put 'call-with-port 'scheme-indent-function 1))
    (eval . (put 'guard 'scheme-indent-function 1))
    (eval . (put 'lambda* 'scheme-indent-function 1))
    (eval . (put 'substitute* 'scheme-indent-function 1))
diff --git a/guix/import/github.scm b/guix/import/github.scm
index 5062bad80d..84f70eed0f 100644
--- a/guix/import/github.scm
+++ b/guix/import/github.scm
@@ -33,6 +33,7 @@ (define-module (guix import github)
   #:use-module ((guix ui) #:select (display-hint))
   #:use-module ((guix download) #:prefix download:)
   #:use-module ((guix git-download) #:prefix download:)
+  #:autoload   (guix build download) (open-connection-for-uri)
   #:use-module (guix import utils)
   #:use-module (json)
   #:use-module (guix packages)
@@ -226,18 +227,23 @@ (define headers
                     (_
                      (raise c)))))
 
-         (let* ((port   (http-fetch release-url #:headers headers))
-                (result (json->scm port)))
-           (close-port port)
-           (match result
-             (#()
-              ;; We got the empty list, presumably because the user didn't use GitHub's
-              ;; "release" mechanism, but hopefully they did use Git tags.
-              (let* ((port (http-fetch tag-url #:headers headers))
-                     (json (json->scm port)))
-                (close-port port)
-                json))
-             (x x))))))
+         (let ((release-uri (string->uri release-url)))
+           (call-with-port (open-connection-for-uri release-uri)
+             (lambda (connection)
+               (let* ((result (json->scm
+                               (http-fetch release-uri
+                                           #:port connection
+                                           #:keep-alive? #t
+                                           #:headers headers))))
+                 (match result
+                   (#()
+                    ;; We got the empty list, presumably because the user didn't use GitHub's
+                    ;; "release" mechanism, but hopefully they did use Git tags.
+                    (json->scm (http-fetch tag-url
+                                           #:port connection
+                                           #:keep-alive? #t
+                                           #:headers headers)))
+                   (x x)))))))))
 
 (define (latest-released-version url package-name)
   "Return the newest released version and its tag given a string URL like
-- 
2.34.0





^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-03 21:13 [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
  2022-03-03 21:14 ` [bug#54241] [PATCH 1/4] http-client: Add response headers to '&http-get-error' Ludovic Courtès
@ 2022-03-04 12:07 ` Maxime Devos
  2022-03-04 20:45   ` Ludovic Courtès
  1 sibling, 1 reply; 27+ messages in thread
From: Maxime Devos @ 2022-03-04 12:07 UTC (permalink / raw)
  To: Ludovic Courtès, 54241; +Cc: Nicolas Goaziou

[-- Attachment #1: Type: text/plain, Size: 489 bytes --]

Ludovic Courtès schreef op do 03-03-2022 om 22:13 [+0100]:
> With this change, ‘guix refresh’ warns you when the GitHub rate limit
> is reached, but it keeps going, falling back to the ‘generic-git’
> updater if it’s among the applicable updaters:
> [...]

WDYT of avoiding the rate limit by caching, using 'http-fetch/cached'?
GitHub does not count requests setting If-Modified-Since against the
rate limit (assuming the answer hasn't changed).

Greetings,
Maxime.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 3/4] http-client: Correctly handle redirects when #:keep-alive? #t.
  2022-03-03 21:14   ` [bug#54241] [PATCH 3/4] http-client: Correctly handle redirects when #:keep-alive? #t Ludovic Courtès
@ 2022-03-04 12:39     ` Maxime Devos
  2022-03-06 21:55       ` bug#54241: [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
  0 siblings, 1 reply; 27+ messages in thread
From: Maxime Devos @ 2022-03-04 12:39 UTC (permalink / raw)
  To: Ludovic Courtès, 54241

[-- Attachment #1: Type: text/plain, Size: 660 bytes --]

Ludovic Courtès schreef op do 03-03-2022 om 22:14 [+0100]:
> Previously PORT would be closed unconditionally, which broke redirects
> when #:keep-alive? #t is given.
> 
> * guix/http-client.scm (http-fetch): Make 'port' a parameter of 'loop'.
> Upon 3xx responses, do not close PORT is KEEP-ALIVE? is true, but consume
> RESP's body.  Add second argument to 'loop'.
> ---

HTTP things can become complicated, with lots of edge cases. Could an
appropriate test be added to prevent this becoming buggy in the future,
in case of future changed to 'http-fetch'?  And a few source code
comments about what is going on exactly?

Greetings,
Maxime.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-04 12:07 ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Maxime Devos
@ 2022-03-04 20:45   ` Ludovic Courtès
  2022-03-05  9:44     ` Maxime Devos
  0 siblings, 1 reply; 27+ messages in thread
From: Ludovic Courtès @ 2022-03-04 20:45 UTC (permalink / raw)
  To: Maxime Devos; +Cc: 54241, Nicolas Goaziou

Hi,

Maxime Devos <maximedevos@telenet.be> skribis:

> Ludovic Courtès schreef op do 03-03-2022 om 22:13 [+0100]:
>> With this change, ‘guix refresh’ warns you when the GitHub rate limit
>> is reached, but it keeps going, falling back to the ‘generic-git’
>> updater if it’s among the applicable updaters:
>> [...]
>
> WDYT of avoiding the rate limit by caching, using 'http-fetch/cached'?
> GitHub does not count requests setting If-Modified-Since against the
> rate limit (assuming the answer hasn't changed).

My concern is that we’d end up caching one or two little files in
~/.cache for each candidate package, and (rate limit aside) the overhead
of dealing with the cache might outweigh the benefits.  I’d rather use
‘http-fetch/cached’ for bigger files, like in (guix cve).

WDYT?

My goal here was to ensure the ‘github’ updater doesn’t get in the way
of those who don’t want to specify a token.

Thanks,
Ludo’.




^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion.
  2022-03-03 21:14   ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Ludovic Courtès
@ 2022-03-05  9:35     ` Maxime Devos
  2022-03-05 22:00       ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
  2022-03-05  9:37     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
                       ` (3 subsequent siblings)
  4 siblings, 1 reply; 27+ messages in thread
From: Maxime Devos @ 2022-03-05  9:35 UTC (permalink / raw)
  To: Ludovic Courtès, 54241

[-- Attachment #1: Type: text/plain, Size: 418 bytes --]

Ludovic Courtès schreef op do 03-03-2022 om 22:14 [+0100]:
> When running "guix refresh" (with no arguments), the 'github' updater
> gets used until the rate limit has been reached, after which "guix
> refresh" automatically picks up the next valid updater, typically
> 'generic-git'.

Wouldn't this make "guix refresh" non-deterministic (though this might
be an acceptable trade-off)?

Greetings,
Maxime.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion.
  2022-03-03 21:14   ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Ludovic Courtès
  2022-03-05  9:35     ` Maxime Devos
@ 2022-03-05  9:37     ` Maxime Devos
  2022-03-05 22:01       ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
  2022-03-05  9:48     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
                       ` (2 subsequent siblings)
  4 siblings, 1 reply; 27+ messages in thread
From: Maxime Devos @ 2022-03-05  9:37 UTC (permalink / raw)
  To: Ludovic Courtès, 54241

[-- Attachment #1: Type: text/plain, Size: 1512 bytes --]

Ludovic Courtès schreef op do 03-03-2022 om 22:14 [+0100]:
> +  (and (not (request-rate-limit-reached?))
> +       (guard (c ([...]
> +                  ;; See
> +                  ;; <https://docs.github.com/en/rest/overview/resources-in-the-rest-api#rate-limiting>.
> +                  (match (assq-ref (http-get-error-headers c)
> +                                   'x-ratelimit-remaining)
> +                    (#f
> +                     (raise c))
> +                    ((? (compose zero? string->number))
> +                     (let ((reset (update-rate-limit-reset-time!
> +                                   (http-get-error-headers c))))
> +                       (warning (G_ "GitHub rate limit exceeded; \
> +disallowing requests for ~a seconds~%")
> +                                (- reset (car (gettimeofday))))
> +                       (display-hint (G_ "You can waive the rate limit by
> +setting the @env{GUIX_GITHUB_TOKEN} environment variable to a token obtained
> +from @url{https://github.com/settings/tokens} with your GitHub account.

IIRC, the GitHub documentation doesn't state that this waives the rate
limit, rather it increases the rate limit a lot, but there's still a
limit.

Greetings,
Maxime.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-04 20:45   ` Ludovic Courtès
@ 2022-03-05  9:44     ` Maxime Devos
  2022-03-05 21:58       ` Ludovic Courtès
  0 siblings, 1 reply; 27+ messages in thread
From: Maxime Devos @ 2022-03-05  9:44 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 54241, Nicolas Goaziou

[-- Attachment #1: Type: text/plain, Size: 860 bytes --]

Ludovic Courtès schreef op vr 04-03-2022 om 21:45 [+0100]:
> My concern is that we’d end up caching one or two little files in
> ~/.cache for each candidate package, and (rate limit aside) the overhead
> of dealing with the cache might outweigh the benefits.  I’d rather use
> ‘http-fetch/cached’ for bigger files, like in (guix cve).
> 
> WDYT?

If the overhead of caching little files is a concern, then perhaps a
SQLite (or GDBM) database could be used instead of the filesystem-based
cache?  The number of packages in Guix was about 150 000 IIRC, if we
assume something around the magnitude of 200 bytes per package, then
we end up with about 29 MiB for the entirity of Guix.  And there might
be some opportunities for compression, reducing this number.

Something like this could be left for later though.

Greetings,
Maxime.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion.
  2022-03-03 21:14   ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Ludovic Courtès
  2022-03-05  9:35     ` Maxime Devos
  2022-03-05  9:37     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
@ 2022-03-05  9:48     ` Maxime Devos
  2022-03-05 22:09       ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
  2022-03-05  9:48     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
  2022-03-05  9:52     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
  4 siblings, 1 reply; 27+ messages in thread
From: Maxime Devos @ 2022-03-05  9:48 UTC (permalink / raw)
  To: Ludovic Courtès, 54241

[-- Attachment #1: Type: text/plain, Size: 189 bytes --]

Ludovic Courtès schreef op do 03-03-2022 om 22:14 [+0100]:
> [...]

It would be nice to have some tests for the rate limiting code,
in tests/import-github.scm.

Greetings,
Maxime.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion.
  2022-03-03 21:14   ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Ludovic Courtès
                       ` (2 preceding siblings ...)
  2022-03-05  9:48     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
@ 2022-03-05  9:48     ` Maxime Devos
  2022-03-05 22:03       ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
  2022-03-05  9:52     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
  4 siblings, 1 reply; 27+ messages in thread
From: Maxime Devos @ 2022-03-05  9:48 UTC (permalink / raw)
  To: Ludovic Courtès, 54241

[-- Attachment #1: Type: text/plain, Size: 466 bytes --]

Ludovic Courtès schreef op do 03-03-2022 om 22:14 [+0100]:
> +(define (update-rate-limit-reset-time! headers)
> +  "Update the rate limit reset time based on HEADERS, the HTTP response
> +headers."
> +  (match (assq-ref headers 'x-ratelimit-reset)
> +    ((= string->number (? number? reset))
> +     (set! %rate-limit-reset-time reset)
> +     reset)
> +    (_
> +     0)))

When can this second case happen?

Greetings,
Maxime.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion.
  2022-03-03 21:14   ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Ludovic Courtès
                       ` (3 preceding siblings ...)
  2022-03-05  9:48     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
@ 2022-03-05  9:52     ` Maxime Devos
  2022-03-05 22:06       ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
  4 siblings, 1 reply; 27+ messages in thread
From: Maxime Devos @ 2022-03-05  9:52 UTC (permalink / raw)
  To: Ludovic Courtès, 54241

[-- Attachment #1: Type: text/plain, Size: 755 bytes --]

Ludovic Courtès schreef op do 03-03-2022 om 22:14 [+0100]:
> +(define (request-rate-limit-reached?)
> +  "Return true if the rate limit has been reached."
> +  (and %rate-limit-reset-time
> +       (match (< (car (gettimeofday)) %rate-limit-reset-time)
> +         (#t #t)
> +         (#f
> +          (set! %rate-limit-reset-time #f)
> +          #f))))

The clocks used by the GitHub server cannot exactly be the clock of the
local Guix (at least, not in a realistic setting).  WDYT of adding a
little margin, accounting for the impossibility of clocks exactly
matching and allowing for some clock skew?

  (< (car (gettimeofday)) (+ [5 minutes] %rate-limit-reset-time))

Greetings,
Maxime.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-05  9:44     ` Maxime Devos
@ 2022-03-05 21:58       ` Ludovic Courtès
  2022-03-05 22:04         ` Maxime Devos
  2022-03-05 22:11         ` Maxime Devos
  0 siblings, 2 replies; 27+ messages in thread
From: Ludovic Courtès @ 2022-03-05 21:58 UTC (permalink / raw)
  To: Maxime Devos; +Cc: 54241, Nicolas Goaziou

Hi,

Maxime Devos <maximedevos@telenet.be> skribis:

> Ludovic Courtès schreef op vr 04-03-2022 om 21:45 [+0100]:
>> My concern is that we’d end up caching one or two little files in
>> ~/.cache for each candidate package, and (rate limit aside) the overhead
>> of dealing with the cache might outweigh the benefits.  I’d rather use
>> ‘http-fetch/cached’ for bigger files, like in (guix cve).
>> 
>> WDYT?
>
> If the overhead of caching little files is a concern, then perhaps a
> SQLite (or GDBM) database could be used instead of the filesystem-based
> cache?  The number of packages in Guix was about 150 000 IIRC, if we
> assume something around the magnitude of 200 bytes per package, then
> we end up with about 29 MiB for the entirity of Guix.  And there might
> be some opportunities for compression, reducing this number.

I think this would be going overboard in terms of complexity :-), and it
wouldn’t radically change the run-time overhead (you still potentially
have to do an HTTP round trip with ‘If-Modified-Since’, you’re just
saving a few hundred bytes on the response in the best case.)

> Something like this could be left for later though.

Yup!

Ludo’.




^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-05  9:35     ` Maxime Devos
@ 2022-03-05 22:00       ` Ludovic Courtès
  0 siblings, 0 replies; 27+ messages in thread
From: Ludovic Courtès @ 2022-03-05 22:00 UTC (permalink / raw)
  To: Maxime Devos; +Cc: 54241

Maxime Devos <maximedevos@telenet.be> skribis:

> Ludovic Courtès schreef op do 03-03-2022 om 22:14 [+0100]:
>> When running "guix refresh" (with no arguments), the 'github' updater
>> gets used until the rate limit has been reached, after which "guix
>> refresh" automatically picks up the next valid updater, typically
>> 'generic-git'.
>
> Wouldn't this make "guix refresh" non-deterministic (though this might
> be an acceptable trade-off)?

Correct.  It could be said that ‘guix refresh’ is “non-deterministic”
anyway since its result depends on external services.  I think it’s an
acceptable tradeoff.

Ludo’.




^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-05  9:37     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
@ 2022-03-05 22:01       ` Ludovic Courtès
  0 siblings, 0 replies; 27+ messages in thread
From: Ludovic Courtès @ 2022-03-05 22:01 UTC (permalink / raw)
  To: Maxime Devos; +Cc: 54241

Maxime Devos <maximedevos@telenet.be> skribis:

> Ludovic Courtès schreef op do 03-03-2022 om 22:14 [+0100]:
>> +  (and (not (request-rate-limit-reached?))
>> +       (guard (c ([...]
>> +                  ;; See
>> +                  ;; <https://docs.github.com/en/rest/overview/resources-in-the-rest-api#rate-limiting>.
>> +                  (match (assq-ref (http-get-error-headers c)
>> +                                   'x-ratelimit-remaining)
>> +                    (#f
>> +                     (raise c))
>> +                    ((? (compose zero? string->number))
>> +                     (let ((reset (update-rate-limit-reset-time!
>> +                                   (http-get-error-headers c))))
>> +                       (warning (G_ "GitHub rate limit exceeded; \
>> +disallowing requests for ~a seconds~%")
>> +                                (- reset (car (gettimeofday))))
>> +                       (display-hint (G_ "You can waive the rate limit by
>> +setting the @env{GUIX_GITHUB_TOKEN} environment variable to a token obtained
>> +from @url{https://github.com/settings/tokens} with your GitHub account.
>
> IIRC, the GitHub documentation doesn't state that this waives the rate
> limit, rather it increases the rate limit a lot, but there's still a
> limit.

Good point, I’ll adjust the message accordingly.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-05  9:48     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
@ 2022-03-05 22:03       ` Ludovic Courtès
  2022-03-05 22:16         ` Maxime Devos
  2022-03-05 22:21         ` Maxime Devos
  0 siblings, 2 replies; 27+ messages in thread
From: Ludovic Courtès @ 2022-03-05 22:03 UTC (permalink / raw)
  To: Maxime Devos; +Cc: 54241

Maxime Devos <maximedevos@telenet.be> skribis:

> Ludovic Courtès schreef op do 03-03-2022 om 22:14 [+0100]:
>> +(define (update-rate-limit-reset-time! headers)
>> +  "Update the rate limit reset time based on HEADERS, the HTTP response
>> +headers."
>> +  (match (assq-ref headers 'x-ratelimit-reset)
>> +    ((= string->number (? number? reset))
>> +     (set! %rate-limit-reset-time reset)
>> +     reset)
>> +    (_
>> +     0)))
>
> When can this second case happen?

I don’t know if it’s supposed to happen.  It’s defensive programming:
better keep going than crash if the server starts behaving slightly
differently.




^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-05 21:58       ` Ludovic Courtès
@ 2022-03-05 22:04         ` Maxime Devos
  2022-03-05 22:11         ` Maxime Devos
  1 sibling, 0 replies; 27+ messages in thread
From: Maxime Devos @ 2022-03-05 22:04 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 54241, Nicolas Goaziou

[-- Attachment #1: Type: text/plain, Size: 649 bytes --]

Ludovic Courtès schreef op za 05-03-2022 om 22:58 [+0100]:
> [...] and it wouldn’t radically change the run-time overhead (you still
> potentially have to do an HTTP round trip with ‘If-Modified-Since’,
> you’re just saving a few hundred bytes on the response in the best case.)

IIUC, when the TTL hasn't been exceeded, then the file from the file
system is served without contacting the remote server at all.  So in
the best case, you only ‘round-trip’ to the disk instead of the HTTP
server.  So I think there's some potential benefits to be had here.

That assumes a sufficiently large TTL though.

Greetings,
Maxime.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-05  9:52     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
@ 2022-03-05 22:06       ` Ludovic Courtès
  0 siblings, 0 replies; 27+ messages in thread
From: Ludovic Courtès @ 2022-03-05 22:06 UTC (permalink / raw)
  To: Maxime Devos; +Cc: 54241

Maxime Devos <maximedevos@telenet.be> skribis:

> Ludovic Courtès schreef op do 03-03-2022 om 22:14 [+0100]:
>> +(define (request-rate-limit-reached?)
>> +  "Return true if the rate limit has been reached."
>> +  (and %rate-limit-reset-time
>> +       (match (< (car (gettimeofday)) %rate-limit-reset-time)
>> +         (#t #t)
>> +         (#f
>> +          (set! %rate-limit-reset-time #f)
>> +          #f))))
>
> The clocks used by the GitHub server cannot exactly be the clock of the
> local Guix (at least, not in a realistic setting).  WDYT of adding a
> little margin, accounting for the impossibility of clocks exactly
> matching and allowing for some clock skew?
>
>   (< (car (gettimeofday)) (+ [5 minutes] %rate-limit-reset-time))

I don’t think it’s necessary.  The worst that can happen is that we
retry too early, get another 403 response, and retry later.  (In
practice, on my NTP-synchronized laptop, things just work.)




^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-05  9:48     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
@ 2022-03-05 22:09       ` Ludovic Courtès
  0 siblings, 0 replies; 27+ messages in thread
From: Ludovic Courtès @ 2022-03-05 22:09 UTC (permalink / raw)
  To: Maxime Devos; +Cc: 54241

Maxime Devos <maximedevos@telenet.be> skribis:

> It would be nice to have some tests for the rate limiting code,
> in tests/import-github.scm.

It would but… I think I’ll punt on that one.  :-)




^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-05 21:58       ` Ludovic Courtès
  2022-03-05 22:04         ` Maxime Devos
@ 2022-03-05 22:11         ` Maxime Devos
  2022-03-06 17:21           ` Ludovic Courtès
  1 sibling, 1 reply; 27+ messages in thread
From: Maxime Devos @ 2022-03-05 22:11 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 54241, Nicolas Goaziou

[-- Attachment #1: Type: text/plain, Size: 856 bytes --]

Ludovic Courtès schreef op za 05-03-2022 om 22:58 [+0100]:
> I think this would be going overboard in terms of complexity :-)

There's some complexity here, but assuming a sufficient amount of
tests, I believe it would be worth it if it allows side-stepping the
rate limit to some degree.  And the extra complexity would mostly
disappear if the overhead of tiny files was accepted (*).

There are also some other benefits, e.g. a kind of ‘download
resumption’ but for linters, reducing network traffic after retrying
"guix lint" on a lossy network (or because the terminal tab was closed
to early, etc.).

All stuff that can be left for later though!

Greetings,
Maxime.

(*) Assuming 150 000 packages and 1 KiB per package (this would be
file-system dependent!), I end up with 150 MiB.  That's a bit on the
large size though ...

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-05 22:03       ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
@ 2022-03-05 22:16         ` Maxime Devos
  2022-03-06 17:18           ` Ludovic Courtès
  2022-03-05 22:21         ` Maxime Devos
  1 sibling, 1 reply; 27+ messages in thread
From: Maxime Devos @ 2022-03-05 22:16 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 54241

[-- Attachment #1: Type: text/plain, Size: 852 bytes --]

Ludovic Courtès schreef op za 05-03-2022 om 23:03 [+0100]:
> I don’t know if it’s supposed to happen.  It’s defensive programming:
> better keep going than crash if the server starts behaving slightly
> differently.

That's called total programming I think?  From a OOP I'm following:

  * total: handle all cases without complaints (no throwing exceptions
    or such), assign every case a well-defined (and documented!)
    behaviour
  * nominal: document the preconditions, but don't bother checking them
  * defensive: check inputs, if they are wrong, throw an exception

(it was probably formulated a bit differently but that was the gist of
it)

At least according to this classification, this
'update-rate-limit-reset-time!' would be total (except for the lack of
documentation), not defensive.

Greetings,
Maxime.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-05 22:03       ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
  2022-03-05 22:16         ` Maxime Devos
@ 2022-03-05 22:21         ` Maxime Devos
  1 sibling, 0 replies; 27+ messages in thread
From: Maxime Devos @ 2022-03-05 22:21 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 54241

[-- Attachment #1: Type: text/plain, Size: 1264 bytes --]

Ludovic Courtès schreef op za 05-03-2022 om 23:03 [+0100]:
> Maxime Devos <maximedevos@telenet.be> skribis:
> 
> > Ludovic Courtès schreef op do 03-03-2022 om 22:14 [+0100]:
> > > +(define (update-rate-limit-reset-time! headers)
> > > +  "Update the rate limit reset time based on HEADERS, the HTTP response
> > > +headers."
> > > +  (match (assq-ref headers 'x-ratelimit-reset)
> > > +    ((= string->number (? number? reset))
> > > +     (set! %rate-limit-reset-time reset)
> > > +     reset)
> > > +    (_
> > > +     0)))
> > 
> > When can this second case happen?
> 
> I don’t know if it’s supposed to happen.  It’s defensive programming:
> better keep going than crash if the server starts behaving slightly
> differently.

If it's not supposed to happen, can it at least be reported with a
warning, such that we then know that 'update-rate-limit-reset-time!'
needs to be extended or GitHub needs to be contacted?

FWIW, I think crashing in case of bogus HTTP answers is fine, as long
as it crashes with a _nice_ error message ("guix: error: HTTP server
foo.com returned an unrecoginised X-Ratelimit-Reset $SOME_STRING" or
something like that) instead of some vague backtrace.

Greetings,
Maxime.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-05 22:16         ` Maxime Devos
@ 2022-03-06 17:18           ` Ludovic Courtès
  0 siblings, 0 replies; 27+ messages in thread
From: Ludovic Courtès @ 2022-03-06 17:18 UTC (permalink / raw)
  To: Maxime Devos; +Cc: 54241

Hi,

Maxime Devos <maximedevos@telenet.be> skribis:

> That's called total programming I think?  From a OOP I'm following:
>
>   * total: handle all cases without complaints (no throwing exceptions
>     or such), assign every case a well-defined (and documented!)
>     behaviour
>   * nominal: document the preconditions, but don't bother checking them
>   * defensive: check inputs, if they are wrong, throw an exception

Interesting; I stand corrected!

> If it's not supposed to happen, can it at least be reported with a
> warning, such that we then know that 'update-rate-limit-reset-time!'
> needs to be extended or GitHub needs to be contacted?

Yes, sounds reasonable.

Thanks,
Ludo’.




^ permalink raw reply	[flat|nested] 27+ messages in thread

* [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-05 22:11         ` Maxime Devos
@ 2022-03-06 17:21           ` Ludovic Courtès
  0 siblings, 0 replies; 27+ messages in thread
From: Ludovic Courtès @ 2022-03-06 17:21 UTC (permalink / raw)
  To: Maxime Devos; +Cc: 54241, Nicolas Goaziou

Maxime Devos <maximedevos@telenet.be> skribis:

> Ludovic Courtès schreef op za 05-03-2022 om 22:58 [+0100]:
>> I think this would be going overboard in terms of complexity :-)
>
> There's some complexity here, but assuming a sufficient amount of
> tests, I believe it would be worth it if it allows side-stepping the
> rate limit to some degree.

What should also be taken into account is the usefulness of the ‘github’
updater—investment should be proportionate.  I suspect it’s much less
useful now that we have the ‘generic-git’ updater.  Maybe, maybe it
gives slightly more accurate data in some cases, maybe it can be
slightly faster, but that’s not entirely clear to me.




^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#54241: [PATCH 0/4] 'github' importer gracefully handles rate limiting
  2022-03-04 12:39     ` Maxime Devos
@ 2022-03-06 21:55       ` Ludovic Courtès
  0 siblings, 0 replies; 27+ messages in thread
From: Ludovic Courtès @ 2022-03-06 21:55 UTC (permalink / raw)
  To: Maxime Devos; +Cc: 54241-done

Hello,

I committed an updated version of these patches, taking some of your
suggestions into account:

  a8d3033da6 import: github: Reuse HTTP connection for the /tags URL fallback.
  8786c2e8d7 http-client: Correctly handle redirects when #:keep-alive? #t.
  55e8e283ae import: github: Gracefully handle rate limit exhaustion.
  ecad9b2213 http-client: Add response headers to '&http-get-error'.
  049aefddb2 tests: Add (guix http-client) tests.

The first patch adds tests for ‘http-fetch’, as you suggested, which
allowed me to find a thinko in the keep-alive patch.  The test doesn’t
test keep-alive support though, because the server in (guix tests http)
doesn’t support it currently (I tried to retrofit it based on (web
server http) but that’s not really possible because we’d need to either
access internals of the <http> record type, specifically it’s poll set,
or re-implement it entirely.)

Thanks a lot for your help!

Ludo’.




^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2022-03-06 21:57 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-03 21:13 [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
2022-03-03 21:14 ` [bug#54241] [PATCH 1/4] http-client: Add response headers to '&http-get-error' Ludovic Courtès
2022-03-03 21:14   ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Ludovic Courtès
2022-03-05  9:35     ` Maxime Devos
2022-03-05 22:00       ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
2022-03-05  9:37     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
2022-03-05 22:01       ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
2022-03-05  9:48     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
2022-03-05 22:09       ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
2022-03-05  9:48     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
2022-03-05 22:03       ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
2022-03-05 22:16         ` Maxime Devos
2022-03-06 17:18           ` Ludovic Courtès
2022-03-05 22:21         ` Maxime Devos
2022-03-05  9:52     ` [bug#54241] [PATCH 2/4] import: github: Gracefully handle rate limit exhaustion Maxime Devos
2022-03-05 22:06       ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
2022-03-03 21:14   ` [bug#54241] [PATCH 3/4] http-client: Correctly handle redirects when #:keep-alive? #t Ludovic Courtès
2022-03-04 12:39     ` Maxime Devos
2022-03-06 21:55       ` bug#54241: [PATCH 0/4] 'github' importer gracefully handles rate limiting Ludovic Courtès
2022-03-03 21:14   ` [bug#54241] [PATCH 4/4] import: github: Reuse HTTP connection for the /tags URL fallback Ludovic Courtès
2022-03-04 12:07 ` [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting Maxime Devos
2022-03-04 20:45   ` Ludovic Courtès
2022-03-05  9:44     ` Maxime Devos
2022-03-05 21:58       ` Ludovic Courtès
2022-03-05 22:04         ` Maxime Devos
2022-03-05 22:11         ` Maxime Devos
2022-03-06 17:21           ` Ludovic Courtès

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).