unofficial mirror of guix-patches@gnu.org 
 help / color / mirror / code / Atom feed
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: "JOULAUD François" <Francois.JOULAUD@radiofrance.com>
Cc: "Ludovic Courtès" <ludo@gnu.org>,
	"44178@debbugs.gnu.org" <44178@debbugs.gnu.org>,
	"Katherine Cox-Buday" <cox.katherine.e@gmail.com>
Subject: [bug#44178] [PATCH v4] Re: bug#44178: Add a Go Module Importer
Date: Thu, 04 Mar 2021 00:40:36 -0500	[thread overview]
Message-ID: <8735xbqxwr.fsf_-_@gmail.com> (raw)
In-Reply-To: <871rcxte52.fsf_-_@gnu.org> ("Ludovic Courtès"'s message of "Tue, 02 Mar 2021 22:54:49 +0100")

[-- Attachment #1: Type: text/plain, Size: 3179 bytes --]

Hi François, Ludovic, et al!

Sorry for bumping in the review, but I have been experimenting with this
importer, and it looks promising; thanks for everyone involved!  I made
a couple changes, mostly with regard to integrating support for the
synopsis, description and license field of the package, plus other
cosmetic changes.  I thought I should share it quickly so that it can be
used as the basis for a v5, so here's the patch, attached.

I hope you don't mind!

I tested it with:

$ ./pre-inst-env guix environment guix

$ ./pre-inst-env guix import go -r github.com/dgraph-io/badger/v2

--8<---------------cut here---------------start------------->8---
[...]

(define-public go-github-com-dgraph-io-badger-v2
  (package
    (name "go-github-com-dgraph-io-badger-v2")
    (version "2.2007.2")
    (source
      (origin
        (method git-fetch)
        (uri (git-reference
               (url "https://github.com/dgraph-io/badger.git")
               (commit (go-version->git-ref version))))
        (file-name (git-file-name name version))
        (sha256
          (base32
            "0000000000000000000000000000000000000000000000000000"))))
    (build-system go-build-system)
    (arguments
      '(#:import-path "github.com/dgraph-io/badger"))
    (inputs
      `(("go-gopkg-in-check-v1" ,go-gopkg-in-check-v1)
        ("go-golang-org-x-sys" ,go-golang-org-x-sys)
        ("go-golang-org-x-net" ,go-golang-org-x-net)
        ("go-github-com-stretchr-testify"
         ,go-github-com-stretchr-testify)
        ("go-github-com-spf13-cobra"
         ,go-github-com-spf13-cobra)
        ("go-github-com-spaolacci-murmur3"
         ,go-github-com-spaolacci-murmur3)
        ("go-github-com-pkg-errors"
         ,go-github-com-pkg-errors)
        ("go-github-com-kr-pretty"
         ,go-github-com-kr-pretty)
        ("go-github-com-golang-snappy"
         ,go-github-com-golang-snappy)
        ("go-github-com-golang-protobuf"
         ,go-github-com-golang-protobuf)
        ("go-github-com-dustin-go-humanize"
         ,go-github-com-dustin-go-humanize)
        ("go-github-com-dgryski-go-farm"
         ,go-github-com-dgryski-go-farm)
        ("go-github-com-dgraph-io-ristretto"
         ,go-github-com-dgraph-io-ristretto)
        ("go-github-com-cespare-xxhash"
         ,go-github-com-cespare-xxhash)
        ("go-github-com-datadog-zstd"
         ,go-github-com-datadog-zstd)))
    (home-page "https://github.com/dgraph-io/badger")
    (synopsis "BadgerDB")
    (description
      "Package badger implements an embeddable, simple and fast key-value database, written in pure Go. It is designed to be highly performant for both reads and writes simultaneously. Badger uses Multi-Version Concurrency Control (MVCC), and supports transactions. It runs transactions concurrently, with serializable snapshot isolation guarantees.")
    (license (license:asl2.0))))
--8<---------------cut here---------------end--------------->8---

Attached is the fixup commit which should apply cleanly on top of your
v3 patch on master, along a (now required) commit to use a temporary
fork of guile-lib:


[-- Attachment #2: 0001-gnu-guile-lib-Update-to-a-temporary-fork.patch --]
[-- Type: text/x-patch, Size: 5679 bytes --]

From 16c07537375ab5d18ee76a5fdfb2b8ed7192b395 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Wed, 3 Mar 2021 16:20:22 -0500
Subject: [PATCH] gnu: guile-lib: Update to a temporary fork.

This fork add support to enable stricter/more correct parsing of HTML in
htmlprag, which is used by the go importer.

* gnu/packages/guile-xyz.scm (guile-lib)[source]: Fetch from git.
Remove snippet and modules field.
[native-inputs]: Add autoconf, automake, gettext and texinfo.
---
 gnu/packages/guile-xyz.scm | 96 ++++++++++++++++++--------------------
 1 file changed, 46 insertions(+), 50 deletions(-)

diff --git a/gnu/packages/guile-xyz.scm b/gnu/packages/guile-xyz.scm
index ce5aad8ec7..c14193921b 100644
--- a/gnu/packages/guile-xyz.scm
+++ b/gnu/packages/guile-xyz.scm
@@ -16,7 +16,7 @@
 ;;; Copyright © 2017 Theodoros Foradis <theodoros@foradis.org>
 ;;; Copyright © 2017 Nikita <nikita@n0.is>
 ;;; Copyright © 2017, 2018 Tobias Geerinckx-Rice <me@tobias.gr>
-;;; Copyright © 2018 Maxim Cournoyer <maxim.cournoyer@gmail.com>
+;;; Copyright © 2018, 2021 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;; Copyright © 2018, 2019, 2020 Arun Isaac <arunisaac@systemreboot.net>
 ;;; Copyright © 2018 Pierre-Antoine Rouby <pierre-antoine.rouby@inria.fr>
 ;;; Copyright © 2018 Eric Bavier <bavier@member.fsf.org>
@@ -2167,59 +2167,55 @@ library.")
               ("guile" ,guile-3.0)))))
 
 (define-public guile-lib
-  (package
-    (name "guile-lib")
-    (version "0.2.6.1")
-    (source (origin
-              (method url-fetch)
-              (uri (string-append "mirror://savannah/guile-lib/guile-lib-"
-                                  version ".tar.gz"))
-              (sha256
-               (base32
-                "0aizxdif5dpch9cvs8zz5g8ds5s4xhfnwza2il5ji7fv2h7ks7bd"))
-              (modules '((guix build utils)))
-              (snippet
-               '(begin
-                  ;; Work around miscompilation on Guile 3.0.0 at -O2:
-                  ;; <https://bugs.gnu.org/39251>.
-                  (substitute* "src/md5.scm"
-                    (("\\(define f-ash ash\\)")
-                     "(define f-ash (@ (guile) ash))\n")
-                    (("\\(define f-add \\+\\)")
-                     "(define f-add (@ (guile) +))\n"))
-                  #t))))
-    (build-system gnu-build-system)
-    (arguments
-     '(#:make-flags
-       '("GUILE_AUTO_COMPILE=0")        ; to prevent guild errors
-       #:phases
-       (modify-phases %standard-phases
-         (add-before 'configure 'patch-module-dir
-           (lambda _
-             (substitute* "src/Makefile.in"
-               (("^moddir = ([[:graph:]]+)")
-                "moddir = $(datadir)/guile/site/@GUILE_EFFECTIVE_VERSION@\n")
-               (("^godir = ([[:graph:]]+)")
-                "godir = \
-$(libdir)/guile/@GUILE_EFFECTIVE_VERSION@/site-ccache\n"))
-             #t)))))
-    (native-inputs
-     `(("guile" ,guile-3.0)
-       ("pkg-config" ,pkg-config)))
-    (inputs
-     `(("guile" ,guile-3.0)))
-    (home-page "https://www.nongnu.org/guile-lib/")
-    (synopsis "Collection of useful Guile Scheme modules")
-    (description
-     "Guile-Lib is intended as an accumulation place for pure-scheme Guile
+  (let ((revision "1")
+        (commit "c059f13e332347201eaa4a32ef27c53d064f2d17"))
+    (package
+      (name "guile-lib")
+      (version (git-version "0.2.6.1" revision commit))
+      (source (origin
+                (method git-fetch)
+                (uri (git-reference
+                      (url "https://notabug.org/apteryx/guile-lib/")
+                      (commit commit)))
+                (file-name (git-file-name name version))
+                (sha256
+                 (base32
+                  "1dl2f53p737n637n2805slci5i32s6cy0bq1j0xkmzd5piymg4f8"))))
+      (build-system gnu-build-system)
+      (arguments
+       '(#:make-flags
+         '("GUILE_AUTO_COMPILE=0")      ;to prevent guild errors
+         #:phases
+         (modify-phases %standard-phases
+           (add-before 'configure 'patch-module-dir
+             (lambda _
+               (substitute* "src/Makefile.in"
+                 (("^moddir = ([[:graph:]]+)")
+                  "moddir = $(datadir)/guile/site/@GUILE_EFFECTIVE_VERSION@\n")
+                 (("^godir = ([[:graph:]]+)")
+                  "godir = \
+$(libdir)/guile/@GUILE_EFFECTIVE_VERSION@/site-ccache\n")))))))
+      (native-inputs
+       `(("autoconf" ,autoconf)
+         ("automake" ,automake)
+         ("gettext" ,gettext-minimal)
+         ("guile" ,guile-3.0)
+         ("pkg-config" ,pkg-config)
+         ("texinfo" ,texinfo)))
+      (inputs
+       `(("guile" ,guile-3.0)))
+      (home-page "https://www.nongnu.org/guile-lib/")
+      (synopsis "Collection of useful Guile Scheme modules")
+      (description
+       "Guile-Lib is intended as an accumulation place for pure-scheme Guile
 modules, allowing for people to cooperate integrating their generic Guile
 modules into a coherent library.  Think \"a down-scaled, limited-scope CPAN
 for Guile\".")
 
-    ;; The whole is under GPLv3+, but some modules are under laxer
-    ;; distribution terms such as LGPL and public domain.  See `COPYING' for
-    ;; details.
-    (license license:gpl3+)))
+      ;; The whole is under GPLv3+, but some modules are under laxer
+      ;; distribution terms such as LGPL and public domain.  See `COPYING' for
+      ;; details.
+      (license license:gpl3+))))
 
 (define-public guile2.0-lib
   (package
-- 
2.30.1


[-- Attachment #3: 0002-fixup-Create-importer-for-Go-modules.patch --]
[-- Type: text/x-patch, Size: 29727 bytes --]

From f3a6130577252e3d079a6209ec2e21bf5d8baf25 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Wed, 3 Mar 2021 16:45:11 -0500
Subject: [PATCH] fixup! Create importer for Go modules

---
 guix/build-system/go.scm |  34 ++--
 guix/import/go.scm       | 420 ++++++++++++++++++++++-----------------
 2 files changed, 257 insertions(+), 197 deletions(-)

diff --git a/guix/build-system/go.scm b/guix/build-system/go.scm
index 594e0cb4f3..d07c703a6a 100644
--- a/guix/build-system/go.scm
+++ b/guix/build-system/go.scm
@@ -34,30 +34,28 @@
             go-version->git-ref))
 
 (define (go-version->git-ref version)
-  "GO-VERSION->GIT-REF parse pseudo-versions and extract the commit
-   hash from it, defaulting to full VERSION if we don't recognise a
-   pseudo-version pattern."
-  ;; A module version like v1.2.3 is introduced by tagging a revision in
-  ;; the underlying source repository. Untagged revisions can be referred
-  ;; to using a "pseudo-version" like v0.0.0-yyyymmddhhmmss-abcdefabcdef,
-  ;; where the time is the commit time in UTC and the final suffix is the
-  ;; prefix of the commit hash.
-  ;; cf. https://golang.org/cmd/go/#hdr-Pseudo_versions
+  "GO-VERSION->GIT-REF parse pseudo-versions and extract the commit hash from
+it, defaulting to full VERSION if a pseudo-version pattern is not recognized."
+  ;; A module version like v1.2.3 is introduced by tagging a revision in the
+  ;; underlying source repository.  Untagged revisions can be referred to
+  ;; using a "pseudo-version" like v0.0.0-yyyymmddhhmmss-abcdefabcdef, where
+  ;; the time is the commit time in UTC and the final suffix is the prefix of
+  ;; the commit hash (see: https://golang.org/cmd/go/#hdr-Pseudo_versions).
   (let* ((version
-          ;; if a source code repository has a v2.0.0 or later tag for
-          ;; a file tree with no go.mod, the version is considered to be
-          ;; part of the v1 module's available versions and is given an
-          ;; +incompatible suffix
-          ;; https://golang.org/cmd/go/#hdr-Module_compatibility_and_semantic_versioning
+          ;; If a source code repository has a v2.0.0 or later tag for a file
+          ;; tree with no go.mod, the version is considered to be part of the
+          ;; v1 module's available versions and is given an +incompatible
+          ;; suffix
+          ;; (see:https://golang.org/cmd/go/#hdr-Module_compatibility_and_semantic_versioning).
           (if (string-suffix? "+incompatible" version)
               (string-drop-right version 13)
               version))
          (re (string-concatenate
               (list
-               "(v?[0-9]\\.[0-9]\\.[0-9])" ; "v" prefix can be omitted in version prefix
-               "(-|-pre\\.0\\.|-0\\.)"     ; separator
-               "([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])-" ; timestamp
-               "([0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f])"))) ; commit hash
+               "(v?[0-9]\\.[0-9]\\.[0-9])" ;"v" prefix can be omitted in version prefix
+               "(-|-pre\\.0\\.|-0\\.)"     ;separator
+               "([0-9]{14})-"              ;timestamp
+               "([0-9A-Fa-f]{12})")))      ;commit hash
          (match (string-match re version)))
     (if match
         (match:substring match 4)
diff --git a/guix/import/go.scm b/guix/import/go.scm
index fead355bd2..7bc97c5c92 100644
--- a/guix/import/go.scm
+++ b/guix/import/go.scm
@@ -2,6 +2,7 @@
 ;;; Copyright © 2020 Katherine Cox-Buday <cox.katherine.e@gmail.com>
 ;;; Copyright © 2020 Helio Machado <0x2b3bfa0+guix@googlemail.com>
 ;;; Copyright © 2021 François Joulaud <francois.joulaud@radiofrance.com>
+;;; Copyright © 2021 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -18,51 +19,37 @@
 ;;; You should have received a copy of the GNU General Public License
 ;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
 
-;;; (guix import golang) wants to make easier to create Guix package
-;;; declaration for Go modules.
+;;; (guix import golang) attempts to make it easier to create Guix package
+;;; declarations for Go modules.
 ;;;
-;;; Modules in Go are "collection of related Go packages" which are
-;;; "the unit of source code interchange and versioning".
-;;; Modules are generally hosted in a repository.
+;;; Modules in Go are a "collection of related Go packages" which are "the
+;;; unit of source code interchange and versioning".  Modules are generally
+;;; hosted in a repository.
 ;;;
-;;; At this point it should handle correctly modules which
-;;; have only Go dependencies and are accessible from proxy.golang.org
-;;; (or configured GOPROXY).
+;;; At this point it should handle correctly modules which have only Go
+;;; dependencies and are accessible from proxy.golang.org (or configured via
+;;; GOPROXY).
 ;;;
 ;;; We want it to work more or less this way:
 ;;; - get latest version for the module from GOPROXY
 ;;; - infer VCS root repo from which we will check-out source by
 ;;;   + recognising known patterns (like github.com)
-;;;   + or (TODO) recognising .vcs suffix
-;;;   + or parsing meta tag in html served at the URL
+;;;   + or recognizing .vcs suffix
+;;;   + or parsing meta tag in HTML served at the URL
 ;;;   + or (TODO) if nothing else works by using zip file served by GOPROXY
 ;;; - get go.mod from GOPROXY (which is able to synthetize one if needed)
 ;;; - extract list of dependencies from this go.mod
 ;;;
-;;; We translate Go module paths to a Guix package name under the
+;;; The Go module paths are translated to a Guix package name under the
 ;;; assumption that there will be no collision.
 
 ;;; TODO list
 ;;; - get correct hash in vcs->origin
 ;;; - print partial result during recursive imports (need to catch
 ;;;   exceptions)
-;;; - infer repo from module path with VCS qualifier
-;;;   (e.g. site.example/my/path/to/repo.git/and/subdir/module)
-;;; - don't print fetch messages to stdout
-;;; - pre-fill synopsis, description and license
 
 (define-module (guix import go)
-  #:use-module (ice-9 match)
-  #:use-module (ice-9 rdelim)
-  #:use-module (ice-9 receive)
-  #:use-module (ice-9 regex)
   #:use-module (guix build-system go)
-  #:use-module (htmlprag)
-  #:use-module (sxml xpath)
-  #:use-module (srfi srfi-1)
-  #:use-module (srfi srfi-9)
-  #:use-module (srfi srfi-11)
-  #:use-module (json)
   #:use-module ((guix download) #:prefix download:)
   #:use-module (guix git)
   #:use-module (guix import utils)
@@ -75,49 +62,134 @@
   #:use-module (guix base32)
   #:use-module (guix memoization)
   #:use-module ((guix build download) #:prefix build-download:)
+  #:use-module (htmlprag)
+  #:use-module (ice-9 match)
+  #:use-module (ice-9 rdelim)
+  #:use-module (ice-9 receive)
+  #:use-module (ice-9 regex)
+  #:use-module (json)
+  #:use-module (rnrs io ports)
+  #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-9)
+  #:use-module (srfi srfi-11)
+  #:use-module (srfi srfi-26)
+  #:use-module (sxml xpath)
+  #:use-module (web client)
+  #:use-module (web response)
   #:use-module (web uri)
 
   #:export (go-module->guix-package
-            go-module-recursive-import
-            infer-module-root-repo))
+            go-module-recursive-import))
 
+;;; Parameterize htmlprag to parse valid HTML more reliably.
+(%strict-tokenizer? #t)
 
 (define (go-path-escape path)
-  "Escape a module path by replacing every uppercase letter with an exclamation
-mark followed with its lowercase equivalent, as per the module Escaped Paths
-specification. https://godoc.org/golang.org/x/mod/module#hdr-Escaped_Paths"
+  "Escape a module path by replacing every uppercase letter with an
+exclamation mark followed with its lowercase equivalent, as per the module
+Escaped Paths specification (see:
+https://godoc.org/golang.org/x/mod/module#hdr-Escaped_Paths)."
   (define (escape occurrence)
     (string-append "!" (string-downcase (match:substring occurrence))))
   (regexp-substitute/global #f "[A-Z]" path 'pre escape 'post))
 
-
 (define (go-module-latest-version goproxy-url module-path)
-  "Fetches the version number of the latest version for MODULE-PATH from the
+  "Fetch the version number of the latest version for MODULE-PATH from the
 given GOPROXY-URL server."
-  (assoc-ref
-   (json-fetch (format #f "~a/~a/@latest" goproxy-url
-                       (go-path-escape module-path)))
-   "Version"))
+  (assoc-ref (json-fetch (format #f "~a/~a/@latest" goproxy-url
+                                 (go-path-escape module-path)))
+             "Version"))
+
+(define (go-package-licenses name)
+  "Retrieve the list of licenses that apply to NAME, a Go package or module
+name (e.g. \"github.com/golang/protobuf/proto\").  The data is scraped from
+the https://pkg.go.dev/ web site."
+  (let*-values (((url) (string-append "https://pkg.go.dev/" name
+                                      "?tab=licenses"))
+                ((response body) (http-get url))
+                ;; Extract the text contained in a h2 child node of any
+                ;; element marked with a "License" class attribute.
+                ((select) (sxpath `(// (* (@ (equal? (class "License"))))
+                                        h2 // *text*))))
+    (and (eq? (response-code response) 200)
+         (match (select (html->sxml body))
+           (() #f)                      ;nothing selected
+           (licenses licenses)))))
+
+(define (go-package-description name)
+  "Retrieve a short description for NAME, a Go package name,
+e.g. \"google.golang.org/protobuf/proto\".  The data is scraped from the
+https://pkg.go.dev/ web site."
+  (let*-values (((url) (string-append "https://pkg.go.dev/" name))
+                ((response body) (http-get url))
+                ;; Extract the text contained in a h2 child node of any
+                ;; element marked with a "License" class attribute.
+                ((select) (sxpath
+                           `(// (section
+                                 (@ (equal? (class "Documentation-overview"))))
+                                (p 1)))))
+    (and (eq? (response-code response) 200)
+         (match (select (html->sxml body))
+           (() #f)                      ;nothing selected
+           (((p . strings))
+            ;; The paragraph text is returned as a list of strings embedding
+            ;; newline characters.  Join them and strip the newline
+            ;; characters.
+            (string-delete #\newline (string-join strings)))))))
+
+(define (go-package-synopsis module-name)
+  "Retrieve a short synopsis for a Go module named MODULE-NAME,
+e.g. \"google.golang.org/protobuf\".  The data is scraped from
+the https://pkg.go.dev/ web site."
+  ;; Note: Only the *module* (rather than package) page has the README title
+  ;; used as a synopsis on the https://pkg.go.dev web site.
+  (let*-values (((url) (string-append "https://pkg.go.dev/" module-name))
+                ((response body) (http-get url))
+                ;; Extract the text contained in a h2 child node of any
+                ;; element marked with a "License" class attribute.
+                ((select) (sxpath
+                           `(// (div (@ (equal? (class "UnitReadme-content"))))
+                                // h3 *text*))))
+    (and (eq? (response-code response) 200)
+         (match (select (html->sxml body))
+           (() #f)                      ;nothing selected
+           ((title more ...)            ;title is the first string of the list
+            (string-trim-both title))))))
 
-(define go-module-latest-version* (memoize go-module-latest-version))
+(define (list->licenses licenses)
+  "Given a list of LICENSES mostly following the SPDX conventions, return the
+corresponding Guix license or 'unknown-license!"
+  (filter-map (lambda (license)
+                (and (not (string-null? license))
+                     (not (any (cut string=? <> license)
+                               '("AND" "OR" "WITH")))
+                     ;; Adjust the license names scraped from
+                     ;; https://pkg.go.dev to an equivalent SPDX identifier,
+                     ;; if they differ (see: https://github.com/golang/pkgsite
+                     ;; /internal/licenses/licenses.go#L174).
+                     (or (spdx-string->license
+                          (match license
+                            ("BlueOak-1.0" "BlueOak-1.0.0")
+                            ("BSD-0-Clause" "0BSD")
+                            ("BSD-2-Clause" "BSD-2-Clause-FreeBSD")
+                            ("GPL2" "GPL-2.0")
+                            ("GPL3" "GPL-3.0")
+                            ("NIST" "NIST-PD")
+                            (_ license)))
+                         'unknown-license!)))
+              licenses))
 
-(define (fetch-go.mod goproxy-url module-path version file)
-  "Fetches go.mod from the given GOPROXY-URL server for the given MODULE-PATH
-and VERSION."
+(define (fetch-go.mod goproxy-url module-path version)
+  "Fetch go.mod from the given GOPROXY-URL server for the given MODULE-PATH
+and VERSION and return an input port."
   (let ((url (format #f "~a/~a/@v/~a.mod" goproxy-url
                      (go-path-escape module-path)
                      (go-path-escape version))))
-    (parameterize ((current-output-port (current-error-port)))
-      (build-download:url-fetch url
-                                file
-                                #:print-build-trace? #f))))
+    (build-download:http-fetch (string->uri url))))
 
-(define (parse-go.mod go.mod-path)
-  (parse-go.mod-port (open-input-file go.mod-path)))
-
-(define (parse-go.mod-port go.mod-port)
-  "PARSE-GO.MOD takes a filename in GO.MOD-PATH and extract a list of
-requirements from it."
+(define (parse-go.mod port)
+  "Parse the go.mod file accessible via the input PORT, returning a list of
+requirements."
   ;; We parse only a subset of https://golang.org/ref/mod#go-mod-file-grammar
   ;; which we think necessary for our use case.
   (define (toplevel results)
@@ -147,6 +219,7 @@ requirements from it."
        (#t
         ;; unrecognised line, ignore silently
         (toplevel results)))))
+
   (define (in-require results)
     (let ((line (read-line)))
       (cond
@@ -158,6 +231,7 @@ requirements from it."
         (toplevel results))
        (#t
         (in-require (require-directive results line))))))
+
   (define (in-replace results)
     (let ((line (read-line)))
       (cond
@@ -169,6 +243,7 @@ requirements from it."
         (toplevel results))
        (#t
         (in-replace (replace-directive results line))))))
+
   (define (replace-directive results line)
     "Extract replaced modules and new requirements from replace directive
     in LINE and add to RESULTS."
@@ -191,6 +266,7 @@ requirements from it."
                 requirements
                 (acons new-module-path new-version requirements))))
       (cons new-requirements new-replaced)))
+
   (define (require-directive results line)
     "Extract requirement from LINE and add it to RESULTS."
     (let* ((requirements (car results))
@@ -209,7 +285,8 @@ requirements from it."
            (module-path (string-trim-both module-path #\"))
            (version (match:substring match 2)))
       (cons (acons module-path version requirements) replaced)))
-  (with-input-from-port go.mod-port
+
+  (with-input-from-port port
     (lambda ()
       (let* ((results (toplevel '(() . ())))
              (requirements (car results))
@@ -221,120 +298,102 @@ requirements from it."
          requirements
          replaced)))))
 
-(define (infer-module-root-repo module-path)
-  "Go modules can be defined at any level of a repository's tree, but querying
-for the meta tag usually can only be done at the webpage at the root of the
-repository. Therefore, it is sometimes necessary to try and derive a module's
-root path from its path. For a set of well-known forges, the pattern of what
-consists of a module's root page is known before hand."
+(define (module-path->repository-root module-path)
+  "Infer the repository root from a module path.  Go modules can be
+defined at any level of a repository tree, but querying for the meta tag
+usually can only be done from the web page at the root of the repository,
+hence the need to derive this information."
   ;; See the following URL for the official Go equivalent:
   ;; https://github.com/golang/go/blob/846dce9d05f19a1f53465e62a304dea21b99f910/src/cmd/go/internal/vcs/vcs.go#L1026-L1087
-  ;;
-  ;; TODO: handle module path with VCS qualifier as described in
-  ;; https://golang.org/ref/mod#vcs-find and
-  ;; https://golang.org/cmd/go/#hdr-Remote_import_paths
+
   (define-record-type <vcs>
     (make-vcs url-prefix root-regex type)
     vcs?
     (url-prefix vcs-url-prefix)
     (root-regex vcs-root-regex)
     (type vcs-type))
-  (let* ((known-vcs
-          (list
-           (make-vcs
-            "github.com"
-            "^(github\\.com/[A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+)(/[A-Za-z0-9_.\\-]+)*$"
-            'git)
-           (make-vcs
-            "bitbucket.org"
-            "^(bitbucket\\.org/([A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+))(/[A-Za-z0-9_.\\-]+)*$"
-            'unknown)
-           (make-vcs
-            "hub.jazz.net/git/"
-            "^(hub\\.jazz\\.net/git/[a-z0-9]+/[A-Za-z0-9_.\\-]+)(/[A-Za-z0-9_.\\-]+)*$"
-            'git)
-           (make-vcs
-            "git.apache.org"
-            "^(git\\.apache\\.org/[a-z0-9_.\\-]+\\.git)(/[A-Za-z0-9_.\\-]+)*$"
-            'git)
-           (make-vcs
-            "git.openstack.org"
-            "^(git\\.openstack\\.org/[A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+)(\\.git)?(/[A-Za-z0-9_.\\-]+)*$"
-            'git)))
-         (vcs (find (lambda (vcs) (string-prefix? (vcs-url-prefix vcs) module-path))
-                    known-vcs)))
-    (if vcs
-        (match:substring (string-match (vcs-root-regex vcs) module-path) 1)
-        module-path)))
+
+  (define known-vcs
+    (list
+     (make-vcs
+      "github.com"
+      "^(github\\.com/[A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+)(/[A-Za-z0-9_.\\-]+)*$"
+      'git)
+     (make-vcs
+      "bitbucket.org"
+      "^(bitbucket\\.org/([A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+))(/[A-Za-z0-9_.\\-]+)*$"
+      'unknown)
+     (make-vcs
+      "hub.jazz.net/git/"
+      "^(hub\\.jazz\\.net/git/[a-z0-9]+/[A-Za-z0-9_.\\-]+)(/[A-Za-z0-9_.\\-]+)*$"
+      'git)
+     (make-vcs
+      "git.apache.org"
+      "^(git\\.apache\\.org/[a-z0-9_.\\-]+\\.git)(/[A-Za-z0-9_.\\-]+)*$"
+      'git)
+     (make-vcs
+      "git.openstack.org"
+      "^(git\\.openstack\\.org/[A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+)(\\.git)?\
+(/[A-Za-z0-9_.\\-]+)*$"
+      'git)))
+
+  ;; For reference, see: https://golang.org/ref/mod#vcs-find.
+  (define vcs-qualifiers '(".bzr" ".fossil" ".git" ".hg" ".svn"))
+
+  (define (vcs-qualified-module-path->root-repo-url module-path)
+    (let* ((vcs-qualifiers-group (string-join vcs-qualifiers "|"))
+           (pattern (format #f "^(.*(~a))(/|$)" vcs-qualifiers-group))
+           (m (string-match pattern module-path)))
+      (and=> m (cut match:substring <> 1))))
+
+  (or (and=> (find (lambda (vcs)
+                     (string-prefix? (vcs-url-prefix vcs) module-path))
+                   known-vcs)
+             (lambda (vcs)
+               (match:substring (string-match (vcs-root-regex vcs)
+                                              module-path) 1)))
+      (vcs-qualified-module-path->root-repo-url module-path)
+      module-path))
 
 (define (go-module->guix-package-name module-path)
   "Converts a module's path to the canonical Guix format for Go packages."
-  (string-downcase
-   (string-append "go-"
-                  (string-replace-substring
-                   (string-replace-substring
-                    module-path
-                    "." "-")
-                   "/" "-"))))
+  (string-downcase (string-append "go-" (string-replace-substring
+                                         (string-replace-substring
+                                          module-path
+                                          "." "-")
+                                         "/" "-"))))
 
 (define-record-type <module-meta>
   (make-module-meta import-prefix vcs repo-root)
   module-meta?
   (import-prefix module-meta-import-prefix)
-  ;; VCS field is a symbol
-  (vcs module-meta-vcs)
+  (vcs module-meta-vcs)                 ;a symbol
   (repo-root module-meta-repo-root))
 
 (define (fetch-module-meta-data module-path)
-  "Fetches module meta-data from a module's landing page. This is
-  necessary because goproxy servers don't currently provide all the
-  information needed to build a package."
+  "Retrieve the module meta-data from its landing page.  This is necessary
+because goproxy servers don't currently provide all the information needed to
+build a package."
   ;; <meta name="go-import" content="import-prefix vcs repo-root">
-  (define (meta-go-import->module-meta text)
-    "Takes the content of the go-import meta tag as TEXT and gives back
-     a MODULE-META record"
-    (define (get-component s start)
-      (let*
-          ((start (string-skip s char-set:whitespace start))
-           (end (string-index s char-set:whitespace start))
-           (end (if end end (string-length s)))
-           (result (substring s start end)))
-        (values result end)))
-    (let*-values (((import-prefix end) (get-component text 0))
-                  ((vcs end) (get-component text end))
-                  ((repo-root end) (get-component text end)))
-      (make-module-meta import-prefix (string->symbol vcs) repo-root)))
-  (define (html->meta-go-import port)
-    "Read PORT with HTML content. Find the go-import meta tag and gives
-    back its content as a string."
-    (let* ((parsedhtml (html->sxml port))
-           (extract-content (node-join
-                             (select-kids (node-typeof? 'html))
-                             (select-kids (node-typeof? 'head))
-                             (select-kids (node-typeof? 'meta))
-                             (select-kids (node-typeof? '@))
-                             (node-self
-                              (node-join
-                               (select-kids (node-typeof? 'name))
-                               (select-kids (node-equal? "go-import"))))
-                             (select-kids (node-typeof? 'content))
-                             (select-kids (lambda (_) #t))))
-           (content (car (extract-content parsedhtml))))
-      content))
-  (let* ((port (build-download:http-fetch (string->uri (format #f "https://~a?go-get=1" module-path))))
-         (meta-go-import (html->meta-go-import port))
-         (module-metadata (meta-go-import->module-meta meta-go-import)))
-    (close-port port)
-    module-metadata))
+  (let* ((port (build-download:http-fetch
+                (string->uri (format #f "https://~a?go-get=1" module-path))))
+         (select (sxpath `(// head (meta (@ (equal? (name "go-import"))))
+                              // content))))
+    (match (select (call-with-port port html->sxml))
+      (() #f)                         ;nothing selected
+      (((content content-text))
+       (match (string-split content-text #\space)
+         ((root-path vcs repo-url)
+          (make-module-meta root-path (string->symbol vcs) repo-url)))))))
 
 (define (module-meta-data-repo-url meta-data goproxy-url)
-  "Return the URL where the fetcher which will be used can download the source
-control."
-  (if (member (module-meta-vcs meta-data)'(fossil mod))
+  "Return the URL where the fetcher which will be used can download the
+source."
+  (if (member (module-meta-vcs meta-data) '(fossil mod))
       goproxy-url
       (module-meta-repo-root meta-data)))
 
-(define (vcs->origin vcs-type vcs-repo-url version file)
+(define (vcs->origin vcs-type vcs-repo-url version)
   "Generate the `origin' block of a package depending on what type of source
 control system is being used."
   (case vcs-type
@@ -347,61 +406,64 @@ control system is being used."
         (file-name (git-file-name name version))
         (sha256
          (base32
-           ;; FIXME: get hash for git repo checkout
-           "0000000000000000000000000000000000000000000000000000"))))
+          ;; FIXME: populate hash for git repo checkout
+          "0000000000000000000000000000000000000000000000000000"))))
     ((hg)
      `(origin
         (method hg-fetch)
         (uri (hg-reference
               (url ,vcs-repo-url)
               (changeset ,version)))
-        (file-name (format #f "~a-~a-checkout" name version))))
+        (file-name (string-append name "-" version "-checkout"))
+        (sha256
+         (base32
+          ;; FIXME: populate hash for hg repo checkout
+          "0000000000000000000000000000000000000000000000000000"))))
     ((svn)
      `(origin
         (method svn-fetch)
         (uri (svn-reference
               (url ,vcs-repo-url)
-              (revision (string->number version))
-              (recursive? #f)))
-        (file-name (format #f "~a-~a-checkout" name version))
+              (revision (string->number version))))
+        (file-name (string-append name "-" version "-checkout"))
         (sha256
          (base32
-          ,(guix-hash-url file)))))
+          ;; FIXME: populate hash for svn repo checkout
+          "0000000000000000000000000000000000000000000000000000"))))
     (else
      (raise-exception (format #f "unsupported vcs type: ~a" vcs-type)))))
 
-(define* (go-module->guix-package module-path #:key (goproxy-url "https://proxy.golang.org"))
-  (call-with-temporary-output-file
-   (lambda (temp port)
-     (let* ((latest-version (go-module-latest-version* goproxy-url module-path))
-            (go.mod-path (fetch-go.mod goproxy-url module-path latest-version
-                                       temp))
-            (dependencies (map car (parse-go.mod temp)))
-            (guix-name (go-module->guix-package-name module-path))
-            (root-module-path (infer-module-root-repo module-path))
-            ;; VCS type and URL are not included in goproxy information. For
-            ;; this we need to fetch it from the official module page.
-            (meta-data (fetch-module-meta-data root-module-path))
-            (vcs-type (module-meta-vcs meta-data))
-            (vcs-repo-url (module-meta-data-repo-url meta-data goproxy-url)))
-       (values
-        `(package
-           (name ,guix-name)
-           ;; Elide the "v" prefix Go uses
-           (version ,(string-trim latest-version #\v))
-           (source
-            ,(vcs->origin vcs-type vcs-repo-url latest-version temp))
-           (build-system go-build-system)
-           (arguments
-            '(#:import-path ,root-module-path))
-           ,@(maybe-inputs (map go-module->guix-package-name dependencies))
-           ;; TODO(katco): It would be nice to make an effort to fetch this
-           ;; from known forges, e.g. GitHub
-           (home-page ,(format #f "https://~a" root-module-path))
-           (synopsis "A Go package")
-           (description ,(format #f "~a is a Go package." guix-name))
-           (license #f))
-        dependencies)))))
+(define* (go-module->guix-package module-path #:key
+                                  (goproxy-url "https://proxy.golang.org"))
+  (let* ((latest-version (go-module-latest-version goproxy-url module-path))
+         (port (fetch-go.mod goproxy-url module-path latest-version))
+         (dependencies (map car (call-with-port port parse-go.mod)))
+         (guix-name (go-module->guix-package-name module-path))
+         (root-module-path (module-path->repository-root module-path))
+         ;; The VCS type and URL are not included in goproxy information. For
+         ;; this we need to fetch it from the official module page.
+         (meta-data (fetch-module-meta-data root-module-path))
+         (vcs-type (module-meta-vcs meta-data))
+         (vcs-repo-url (module-meta-data-repo-url meta-data goproxy-url))
+         (synopsis (go-package-synopsis root-module-path))
+         (description (go-package-description module-path))
+         (licenses (go-package-licenses module-path)))
+    (values
+     `(package
+        (name ,guix-name)
+        ;; Elide the "v" prefix Go uses
+        (version ,(string-trim latest-version #\v))
+        (source
+         ,(vcs->origin vcs-type vcs-repo-url latest-version))
+        (build-system go-build-system)
+        (arguments
+         '(#:import-path ,root-module-path))
+        ,@(maybe-inputs (map go-module->guix-package-name dependencies))
+        (home-page ,(format #f "https://~a" root-module-path))
+        (synopsis ,synopsis)
+        (description ,description)
+        (license ,(and=> licenses list->licenses)))
+     dependencies)))
 
 (define go-module->guix-package* (memoize go-module->guix-package))
 
-- 
2.30.1


[-- Attachment #4: Type: text/plain, Size: 95 bytes --]


I hope I'm not making things more difficult for you!

Thank you for working on it! :-)

Maxim

  reply	other threads:[~2021-03-04  5:41 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-23 14:06 [bug#44178] Add a Go Module Importer Katherine Cox-Buday
2020-10-28 10:41 ` Ludovic Courtès
2020-10-28 10:42 ` Ludovic Courtès
2020-11-10 20:26 ` Marius Bakke
     [not found]   ` <CANe01w55ZO=_9v0HcDv248UsoLUXb_9WVAgM4LqiZ4E-r1XgXg@mail.gmail.com>
2020-11-11  1:23     ` Helio Machado
2021-01-23 21:35       ` [bug#44178] [PATCH] Create importer for Go modules guix-patches--- via
2021-01-23 22:41         ` Katherine Cox-Buday
2021-01-25 21:03           ` guix-patches--- via
2021-01-27 14:38             ` Katherine Cox-Buday
2021-01-28 13:27               ` Ludovic Courtès
2021-01-29 16:43                 ` guix-patches--- via
2021-01-29 16:52                   ` [bug#44178] [PATCHv2] " guix-patches--- via
2021-01-31 16:23                   ` [bug#44178] [PATCH] " Ludovic Courtès
2021-02-19 15:51                     ` JOULAUD François via Guix-patches via
2021-02-19 16:21                       ` [bug#44178] [PATCHv3] " JOULAUD François via Guix-patches via
2021-03-02 21:54                         ` [bug#44178] Add a Go Module Importer Ludovic Courtès
2021-03-04  5:40                           ` Maxim Cournoyer [this message]
2021-03-04 14:14                             ` [bug#44178] [PATCH v4] Re: bug#44178: " JOULAUD François via Guix-patches via
2021-03-04 15:47                               ` Maxim Cournoyer
2021-03-08 13:54                           ` [bug#44178] " JOULAUD François via Guix-patches via
2021-03-10 17:12                             ` bug#44178: " Ludovic Courtès
2021-01-28  5:01             ` [bug#44178] [PATCH] Create importer for Go modules Timmy Douglas
2020-11-11 20:48   ` [bug#44178] Add a Go Module Importer Katherine Cox-Buday
2020-12-09 14:22 ` [bug#44178] dftxbs3e
2020-12-10  2:42   ` [bug#44178] dftxbs3e
2020-12-10  3:14     ` [bug#44178] dftxbs3e
2021-01-28  7:29 ` [bug#44178] [PATCH] Create importer for Go modules guix-patches--- via

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8735xbqxwr.fsf_-_@gmail.com \
    --to=maxim.cournoyer@gmail.com \
    --cc=44178@debbugs.gnu.org \
    --cc=Francois.JOULAUD@radiofrance.com \
    --cc=cox.katherine.e@gmail.com \
    --cc=ludo@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).