unofficial mirror of guix-patches@gnu.org 
* [bug#42019] [PATCH 0/1] sources.json compliant with SWH loader
@ 2020-06-23 15:13 zimoun
  2020-06-23 15:21 ` [bug#42019] [PATCH 1/1] website: Add integrity to JSON sources zimoun
  2020-06-29 16:50 ` [bug#42019] [PATCH v2] " zimoun
  0 siblings, 2 replies; 10+ messages in thread
From: zimoun @ 2020-06-23 15:13 UTC (permalink / raw)
  To: 42019; +Cc: zimoun

Dear all,

This patch adds the "integrity" field.  It uses the SRI format, i.e., the
algorithm name followed by the 'origin-hash' value encoded in base64.
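
As an illustration only (not part of the patch), here is a minimal Python
sketch of how such an SRI string can be built from a raw digest.  The helper
name `to_sri` is hypothetical, and the digest of empty input stands in for a
real 'origin-hash' value.

```python
import base64
import hashlib


def to_sri(algorithm: str, digest: bytes) -> str:
    """Return '<algorithm>-<base64 of the raw digest>'."""
    encoded = base64.b64encode(digest).decode("ascii")
    return f"{algorithm}-{encoded}"


# Stand-in for a real hash value: the SHA-256 digest of empty input.
digest = hashlib.sha256(b"").digest()
sri = to_sri("sha256", digest)
print(sri)
```

The base64 part encodes the raw digest bytes, not their hexadecimal or
base32 rendering.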

The "revision" field is the Guix commit.  It is meant to be used by SWH; for
example, SWH could fetch several sources.json files.

Currently, the SWH loader only supports the formats [1]
".tar.gz$|.zip$|tar.bz2$|.tbz$|.tar.xz$|.tgz$|.tar$" and their advice is
to filter out any other files (e.g., Gem).  For now, there is no filter;
one could be added later if it really is an issue for them.
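
If such a filter turns out to be needed, it could look roughly like the
Python sketch below.  This is an assumption, not the SWH loader's code: the
pattern is an escaped reading of the list quoted above, the function names
are hypothetical, and the Gem URL is made up.

```python
import re

# Assumption: escaped, anchored version of the extension list quoted above.
SUPPORTED = re.compile(r"\.(tar\.gz|zip|tar\.bz2|tbz|tar\.xz|tgz|tar)$")


def is_supported(url: str) -> bool:
    """True when the URL ends with one of the supported archive extensions."""
    return SUPPORTED.search(url) is not None


def filter_urls(urls):
    """Keep only the URLs that point to a supported archive format."""
    return [url for url in urls if is_supported(url)]


urls = [
    "https://ftpmirror.gnu.org/gnu/a2ps/a2ps-4.14.tar.gz",
    "https://rubygems.org/downloads/example-1.0.gem",  # hypothetical URL
]
kept = filter_urls(urls)
print(kept)
```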


1: https://forge.softwareheritage.org/T1352#45459

All the best,
simon


zimoun (1):
  website: Add integrity to JSON sources.

 website/apps/packages/builder.scm | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)


base-commit: 36fdde5b3efad445291588a5bc17a11802eb7ff8
-- 
2.26.2





* [bug#42019] [PATCH 1/1] website: Add integrity to JSON sources.
  2020-06-23 15:13 [bug#42019] [PATCH 0/1] sources.json compliant with SWH loader zimoun
@ 2020-06-23 15:21 ` zimoun
  2020-06-27 17:05   ` Ludovic Courtès
  2020-06-29 16:50 ` [bug#42019] [PATCH v2] " zimoun
  1 sibling, 1 reply; 10+ messages in thread
From: zimoun @ 2020-06-23 15:21 UTC (permalink / raw)
  To: 42019; +Cc: zimoun

* website/apps/packages/builder.scm (origin->json): Rename the "url" field to
"urls" and add the "integrity" field using the SRI format.
(sources-json-builder): Add the "revision" field.
---
 website/apps/packages/builder.scm | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/website/apps/packages/builder.scm b/website/apps/packages/builder.scm
index d2bccd7..e20d672 100644
--- a/website/apps/packages/builder.scm
+++ b/website/apps/packages/builder.scm
@@ -46,6 +46,8 @@
   #:use-module (guix hg-download)
   #:use-module (guix utils)                       ;location
   #:use-module ((guix build download) #:select (maybe-expand-mirrors))
+  #:use-module ((guix base64) #:select (base64-encode))
+  #:use-module ((guix config) #:select (%guix-version))
   #:use-module (json)
   #:use-module (ice-9 match)
   #:use-module ((web uri) #:select (string->uri uri->string))
@@ -114,7 +116,7 @@
     ,@(cond ((or (eq? url-fetch method)
                  (eq? url-fetch/tarbomb method)
                  (eq? url-fetch/zipbomb method))
-             `(("url" . ,(list->vector
+             `(("urls" . ,(list->vector
                           (resolve
                            (match uri
                              ((? string? url) (list url))
@@ -128,6 +130,16 @@
             ((eq? hg-fetch method)
              `(("hg_url" . ,(hg-reference-url uri))))
             (else '()))
+    ,@(if (or (eq? url-fetch method)
+              (eq? url-fetch/tarbomb method)
+              (eq? url-fetch/zipbomb method))
+          (let* ((content-hash (origin-hash origin))
+                 (hash-value (content-hash-value content-hash))
+                 (hash-algorithm (content-hash-algorithm content-hash))
+                 (algorithm-string (symbol->string hash-algorithm)))
+            `(("integrity" . ,(string-append algorithm-string "-"
+                                             (base64-encode hash-value)))))
+          '())
     ,@(if (eq? method git-fetch)
           `(("git_ref" . ,(git-reference-commit uri)))
           '())
@@ -174,9 +186,11 @@
              scm->json))
 
 (define (sources-json-builder)
-  "Return a JSON page listing all the sources.
-
-See <https://forge.softwareheritage.org/D2025#51269>."
+  "Return a JSON page listing all the sources."
+  ;; The Software Heritage format is described here:
+  ;; https://forge.softwareheritage.org/source/swh-loader-core/browse/master/swh/loader/package/nixguix/tests/data/https_nix-community.github.io/nixpkgs-swh_sources.json
+  ;; And the loader is implemented here:
+  ;; https://forge.softwareheritage.org/source/swh-loader-core/browse/master/swh/loader/package/nixguix/
   (define (package->json package)
     `(,@(if (origin? (package-source package))
             (origin->json (package-source package))
@@ -185,7 +199,8 @@ See <https://forge.softwareheritage.org/D2025#51269>."
 
   (make-page "sources.json"
              `(("sources" . ,(list->vector (map package->json (all-packages))))
-               ("version" . "1"))
+               ("version" . "1")
+               ("revision" . ,%guix-version))
              scm->json))
 
 (define (index-builder)
-- 
2.26.2





* [bug#42019] [PATCH 1/1] website: Add integrity to JSON sources.
  2020-06-23 15:21 ` [bug#42019] [PATCH 1/1] website: Add integrity to JSON sources zimoun
@ 2020-06-27 17:05   ` Ludovic Courtès
  2020-06-27 17:41     ` zimoun
  0 siblings, 1 reply; 10+ messages in thread
From: Ludovic Courtès @ 2020-06-27 17:05 UTC (permalink / raw)
  To: zimoun; +Cc: 42019

Hi!

zimoun <zimon.toutoune@gmail.com> skribis:

> * website/apps/packages/builder.scm (origin->json): Add integrity field using
> SRI format.

[...]

> -             `(("url" . ,(list->vector
> +             `(("urls" . ,(list->vector
>                            (resolve
>                             (match uri
>                               ((? string? url) (list url))

Is this change OK for Repology?  Or should we keep “url” in addition to
“urls”?

>    (make-page "sources.json"
>               `(("sources" . ,(list->vector (map package->json (all-packages))))
> -               ("version" . "1"))
> +               ("version" . "1")
> +               ("revision" . ,%guix-version))

There’s no guarantee that ‘%guix-version’ is a commit ID, so perhaps we
should do something like:

  (match (current-profile)
    (#f %guix-version)   ;for lack of a better ID
    (profile
     (let ((channel (find guix-channel? (profile-channels profile))))
       (channel-commit channel))))

Otherwise LGTM, thank you!

Ludo’.




* [bug#42019] [PATCH 1/1] website: Add integrity to JSON sources.
  2020-06-27 17:05   ` Ludovic Courtès
@ 2020-06-27 17:41     ` zimoun
  2020-06-29 17:01       ` zimoun
  0 siblings, 1 reply; 10+ messages in thread
From: zimoun @ 2020-06-27 17:41 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 42019

Hi Ludo,

Thank you for the review.

On Sat, 27 Jun 2020 at 19:05, Ludovic Courtès <ludo@gnu.org> wrote:

>> -             `(("url" . ,(list->vector
>> +             `(("urls" . ,(list->vector
>>                            (resolve
>>                             (match uri
>>                               ((? string? url) (list url))
>
> Is this change OK for Repology?  Or should we keep “url” in addition to
> “urls”?

From what I understood of their API [1] when I checked it, I would say yes. :-)
Well, I do not think that Repology parses the 'sources' field.

1: https://repology.org/addrepo


> There’s no guarantee that ‘%guix-version’ is a commit ID, so perhaps we
> should do something like:

Thanks for the tip, I did not know.  I will send a v2 with your
suggestion, or feel free to update the patch and push it. :-)


Cheers,
simon




* [bug#42019] [PATCH v2] website: Add integrity to JSON sources.
  2020-06-23 15:13 [bug#42019] [PATCH 0/1] sources.json compliant with SWH loader zimoun
  2020-06-23 15:21 ` [bug#42019] [PATCH 1/1] website: Add integrity to JSON sources zimoun
@ 2020-06-29 16:50 ` zimoun
  1 sibling, 0 replies; 10+ messages in thread
From: zimoun @ 2020-06-29 16:50 UTC (permalink / raw)
  To: 42019; +Cc: zimoun

* website/apps/packages/builder.scm (origin->json): Rename the "url" field to
"urls" and add the "integrity" field using the SRI format.
(sources-json-builder): Add the "revision" field.
---
 website/apps/packages/builder.scm | 31 ++++++++++++++++++++++++++-----
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/website/apps/packages/builder.scm b/website/apps/packages/builder.scm
index d2bccd7..fa488a5 100644
--- a/website/apps/packages/builder.scm
+++ b/website/apps/packages/builder.scm
@@ -46,6 +46,9 @@
   #:use-module (guix hg-download)
   #:use-module (guix utils)                       ;location
   #:use-module ((guix build download) #:select (maybe-expand-mirrors))
+  #:use-module ((guix base64) #:select (base64-encode))
+  #:use-module ((guix describe) #:select (current-profile))
+  #:use-module ((guix config) #:select (%guix-version))
   #:use-module (json)
   #:use-module (ice-9 match)
   #:use-module ((web uri) #:select (string->uri uri->string))
@@ -114,7 +117,7 @@
     ,@(cond ((or (eq? url-fetch method)
                  (eq? url-fetch/tarbomb method)
                  (eq? url-fetch/zipbomb method))
-             `(("url" . ,(list->vector
+             `(("urls" . ,(list->vector
                           (resolve
                            (match uri
                              ((? string? url) (list url))
@@ -128,6 +131,16 @@
             ((eq? hg-fetch method)
              `(("hg_url" . ,(hg-reference-url uri))))
             (else '()))
+    ,@(if (or (eq? url-fetch method)
+              (eq? url-fetch/tarbomb method)
+              (eq? url-fetch/zipbomb method))
+          (let* ((content-hash (origin-hash origin))
+                 (hash-value (content-hash-value content-hash))
+                 (hash-algorithm (content-hash-algorithm content-hash))
+                 (algorithm-string (symbol->string hash-algorithm)))
+            `(("integrity" . ,(string-append algorithm-string "-"
+                                             (base64-encode hash-value)))))
+          '())
     ,@(if (eq? method git-fetch)
           `(("git_ref" . ,(git-reference-commit uri)))
           '())
@@ -174,9 +187,11 @@
              scm->json))
 
 (define (sources-json-builder)
-  "Return a JSON page listing all the sources.
-
-See <https://forge.softwareheritage.org/D2025#51269>."
+  "Return a JSON page listing all the sources."
+  ;; The Software Heritage format is described here:
+  ;; https://forge.softwareheritage.org/source/swh-loader-core/browse/master/swh/loader/package/nixguix/tests/data/https_nix-community.github.io/nixpkgs-swh_sources.json
+  ;; And the loader is implemented here:
+  ;; https://forge.softwareheritage.org/source/swh-loader-core/browse/master/swh/loader/package/nixguix/
   (define (package->json package)
     `(,@(if (origin? (package-source package))
             (origin->json (package-source package))
@@ -185,7 +200,13 @@ See <https://forge.softwareheritage.org/D2025#51269>."
 
   (make-page "sources.json"
              `(("sources" . ,(list->vector (map package->json (all-packages))))
-               ("version" . "1"))
+               ("version" . "1")
+               ("revision" .
+                ,(match (current-profile)
+                   (#f %guix-version)   ;for lack of a better ID
+                   (profile
+                    (let ((channel (find guix-channel? (profile-channels profile))))
+                      (channel-commit channel))))))
              scm->json))
 
 (define (index-builder)
-- 
2.26.2





* [bug#42019] [PATCH 1/1] website: Add integrity to JSON sources.
  2020-06-27 17:41     ` zimoun
@ 2020-06-29 17:01       ` zimoun
  2020-06-29 20:41         ` Ludovic Courtès
  0 siblings, 1 reply; 10+ messages in thread
From: zimoun @ 2020-06-29 17:01 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 42019

Hi Ludo,

On Sat, 27 Jun 2020 at 19:42, zimoun <zimon.toutoune@gmail.com> wrote:

> Thanks for the tip, I did not know.  I will sent a v2 with your
> suggestion or feel free to update the patch and push it. :-)

v2 is sent.

BTW, regarding the SWH picture, and after a video chat with lewo, I do not
think that the website is the right place.  Instead, it should go to
ci.guix.gnu.org or data.guix.gnu.org, or maybe be integrated with "guix
publish".  Well, the next step is to have a collection of sources.json
-- that is the point of "revision".  WDYT?

Cheers,
simon




* [bug#42019] [PATCH 1/1] website: Add integrity to JSON sources.
  2020-06-29 17:01       ` zimoun
@ 2020-06-29 20:41         ` Ludovic Courtès
  2020-06-29 23:28           ` zimoun
  0 siblings, 1 reply; 10+ messages in thread
From: Ludovic Courtès @ 2020-06-29 20:41 UTC (permalink / raw)
  To: zimoun; +Cc: 42019, Christopher Baines

Hi,

zimoun <zimon.toutoune@gmail.com> skribis:

> BTW, in the SWH picture and after a chat video with lewo, I do not
> think that the website is the right place.  Instead, it should go to
> ci.guix.gnu.org or data.guix.gnu.org.  Or maybe integrated with "guix
> publish".  Well, the next step is to have a collection of sources.json
> -- that the point of "revision".  WDYT?

The Guix Data Service would be a natural place for ‘sources.json’ IMO.
Thoughts, Chris?

Thanks,
Ludo’.




* [bug#42019] [PATCH 1/1] website: Add integrity to JSON sources.
  2020-06-29 20:41         ` Ludovic Courtès
@ 2020-06-29 23:28           ` zimoun
  2020-07-01 19:35             ` Christopher Baines
  0 siblings, 1 reply; 10+ messages in thread
From: zimoun @ 2020-06-29 23:28 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 42019, Christopher Baines

Hi Chris,

On Mon, 29 Jun 2020 at 22:41, Ludovic Courtès <ludo@gnu.org> wrote:
> zimoun <zimon.toutoune@gmail.com> skribis:
>
>> BTW, in the SWH picture and after a chat video with lewo, I do not
>> think that the website is the right place.  Instead, it should go to
>> ci.guix.gnu.org or data.guix.gnu.org.  Or maybe integrated with "guix
>> publish".  Well, the next step is to have a collection of sources.json
>> -- that the point of "revision".  WDYT?
>
> The Guix Data Service would be a natural place for ‘sources.json’ IMO.
> Thoughts, Chris?

If it goes to the GDS, then first, please point me to where to start. :-)

And second, it would be nice in the "near" future to have at least 2
sources.json: one for the latest commit, refreshed every X minutes (or
hours), and another one containing the concatenation of all the sources
of Guix (at least those reachable by guix time-machine, i.e., after the
big overhaul of Inferiors).  I will go on #swh-devel or reach out to lewo
to know how "near" it is on the SWH side.
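
To make the "concatenation" idea concrete, here is a rough Python sketch,
under stated assumptions, of how several per-revision sources.json documents
could be merged.  The function names, the "revisions" key of the result, and
the sample revisions and URLs are all hypothetical; nothing here is an agreed
format.

```python
import json


def source_key(source: dict) -> tuple:
    """Build a deduplication key for one entry of "sources"."""
    if source.get("type") == "url" and "integrity" in source:
        return ("url", source["integrity"])
    if source.get("type") == "git":
        return ("git", source.get("git_url"), source.get("git_ref"))
    # Fallback for other entries: the canonical JSON text of the entry.
    return ("other", json.dumps(source, sort_keys=True))


def merge_sources(documents: list) -> dict:
    """Merge several sources.json documents, keeping each source once."""
    merged = {}
    revisions = []
    for document in documents:
        revisions.append(document.get("revision"))
        for source in document["sources"]:
            merged.setdefault(source_key(source), source)
    # "revisions" is a hypothetical key, not part of the current format.
    return {"version": "1",
            "revisions": revisions,
            "sources": list(merged.values())}


# Two made-up documents sharing one git source.
shared = {"type": "git",
          "git_url": "https://example.org/foo.git",
          "git_ref": "0123abcd"}
older = {"version": "1", "revision": "rev-a", "sources": [shared]}
newer = {"version": "1", "revision": "rev-b",
         "sources": [shared,
                     {"type": "url",
                      "urls": ["https://example.org/bar-1.0.tar.gz"],
                      "integrity": "sha256-placeholder"}]}
combined = merge_sources([older, newer])
print(len(combined["sources"]))
```

Deduplicating on "integrity" for URL sources means the same tarball listed
at two revisions is kept only once, whatever its mirror list.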


Cheers,
simon




* [bug#42019] [PATCH 1/1] website: Add integrity to JSON sources.
  2020-06-29 23:28           ` zimoun
@ 2020-07-01 19:35             ` Christopher Baines
  2020-07-01 20:29               ` zimoun
  0 siblings, 1 reply; 10+ messages in thread
From: Christopher Baines @ 2020-07-01 19:35 UTC (permalink / raw)
  To: zimoun; +Cc: Ludovic Courtès, 42019


zimoun <zimon.toutoune@gmail.com> writes:

> Hi Chris,
>
> On Mon, 29 Jun 2020 at 22:41, Ludovic Courtès <ludo@gnu.org> wrote:
>> zimoun <zimon.toutoune@gmail.com> skribis:
>>
>>> BTW, in the SWH picture and after a chat video with lewo, I do not
>>> think that the website is the right place.  Instead, it should go to
>>> ci.guix.gnu.org or data.guix.gnu.org.  Or maybe integrated with "guix
>>> publish".  Well, the next step is to have a collection of sources.json
>>> -- that the point of "revision".  WDYT?
>>
>> The Guix Data Service would be a natural place for ‘sources.json’ IMO.
>> Thoughts, Chris?
>
> If it goes to the GDS, then first let point me where to start. :-)

I think this does sound like a good use of the Guix Data
Service. Unfortunately, the sources of packages aren't currently stored
in the Guix Data Service database, so I'm guessing this will require
storing some new data, then working out how to present it.

A question maybe for you Simon, what would be the perfect data for this
particular use case? I gather it's something about the (source ...)
field in packages, probably for all the exported packages (plus maybe
the non-exported ones).

> And second, it could be nice in the "near" future to have at least 2
> sources.json: one for the last commit refreshed every X minutes (or
> hours) and another one containing the concatenation of all the sources
> of Guix (at least the one reachable by guix time-machine i.e. after the
> big overhaul of Inferiors).  I will go on #swh-devel or reach lewo to
> know how "near" it is on SWH side.

Once you can get the data for an individual revision in the Guix Data
Service, it should be reasonably easy to just get the data for multiple
revisions, say all for the last week.

Chris


* [bug#42019] [PATCH 1/1] website: Add integrity to JSON sources.
  2020-07-01 19:35             ` Christopher Baines
@ 2020-07-01 20:29               ` zimoun
  0 siblings, 0 replies; 10+ messages in thread
From: zimoun @ 2020-07-01 20:29 UTC (permalink / raw)
  To: Christopher Baines; +Cc: Ludovic Courtès, 42019

Hi Chris,

On Wed, 01 Jul 2020 at 20:35, Christopher Baines <mail@cbaines.net> wrote:

> A question maybe for you Simon, what would be the perfect data for this
> particular use case? I gather it's something about the (source ...)
> field in packages, probably for all the exported (plus maybe
> not-exported packages).

Currently the website builds sources.json by using 'fold-packages'
(traversing all the modules and returning all the public variables, if I
read correctly), then excluding 'package-superseded' and
'package-replacement'.

Well, maybe an example is simpler than a lot of words.  The resulting
JSON looks like:

--8<---------------cut here---------------start------------->8---
    {
      "type": "url",
      "urls": [
        "https://ftpmirror.gnu.org/gnu/a2ps/a2ps-4.14.tar.gz",
        "ftp://ftp.cs.tu-berlin.de/pub/gnu/a2ps/a2ps-4.14.tar.gz",
        "ftp://ftp.funet.fi/pub/mirrors/ftp.gnu.org/gnu/a2ps/a2ps-4.14.tar.gz",
        "http://ftp.gnu.org/pub/gnu/a2ps/a2ps-4.14.tar.gz"
      ],
      "integrity": "sha256-866NPUVkpBtuKiHyN9LysQT0gQhZHouDSXUAGCo6s6Q="
    },
    {
      "type": "git",
      "git_url": "https://github.com/opencog/agi-bio.git",
      "git_ref": "b5c6f3d99e8cca3798bf0cdf2c32f4bdb8098efb"
    },
--8<---------------cut here---------------end--------------->8---

So basically, the data are: origin-method, origin-uri (implies reference
URLs and {git,hg,svn}-{commit,revision}), origin-hash (implies
content-hash-{value,algorithm}).  Note that the list of mirrors is
necessary too.
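
For what it is worth, a consumer could check downloaded bytes against the
"integrity" field roughly as in the Python sketch below.  It assumes that the
hash covers the file content as-is, which I believe holds for 'url-fetch'
origins; the function names and the sample payload are hypothetical.

```python
import base64
import hashlib


def parse_sri(integrity: str):
    """Split '<algorithm>-<base64>' into the algorithm and the raw digest."""
    algorithm, _, encoded = integrity.partition("-")
    return algorithm, base64.b64decode(encoded)


def matches(data: bytes, integrity: str) -> bool:
    """True when the digest of DATA equals the one recorded in INTEGRITY."""
    algorithm, expected = parse_sri(integrity)
    return hashlib.new(algorithm, data).digest() == expected


# The value from the example above decodes to a 32-byte SHA-256 digest.
algorithm, expected = parse_sri(
    "sha256-866NPUVkpBtuKiHyN9LysQT0gQhZHouDSXUAGCo6s6Q=")
print(algorithm, len(expected))

# Round trip on made-up content standing in for a downloaded tarball.
payload = b"example tarball bytes"
payload_sri = "sha256-" + base64.b64encode(
    hashlib.sha256(payload).digest()).decode("ascii")
print(matches(payload, payload_sri))
```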

I have had a look at

  http://git.savannah.gnu.org/cgit/guix/data-service.git/tree/

but I am not sure I understand where the SQL table is defined.


Thanks,
simon




Thread overview: 10+ messages
2020-06-23 15:13 [bug#42019] [PATCH 0/1] sources.json compliant with SWH loader zimoun
2020-06-23 15:21 ` [bug#42019] [PATCH 1/1] website: Add integrity to JSON sources zimoun
2020-06-27 17:05   ` Ludovic Courtès
2020-06-27 17:41     ` zimoun
2020-06-29 17:01       ` zimoun
2020-06-29 20:41         ` Ludovic Courtès
2020-06-29 23:28           ` zimoun
2020-07-01 19:35             ` Christopher Baines
2020-07-01 20:29               ` zimoun
2020-06-29 16:50 ` [bug#42019] [PATCH v2] " zimoun
