unofficial mirror of guix-patches@gnu.org 
From: Katherine Cox-Buday <cox.katherine.e@gmail.com>
To: "JOULAUD François" <Francois.JOULAUD@radiofrance.com>
Cc: "44178@debbugs.gnu.org" <44178@debbugs.gnu.org>,
	Helio Machado <0x2b3bfa0@gmail.com>
Subject: [bug#44178] [PATCH] Create importer for Go modules
Date: Sat, 23 Jan 2021 16:41:18 -0600
Message-ID: <87r1mb6zu9.fsf@gmail.com>
In-Reply-To: <20210123212742.m2thdeuzdvgpkgeo@fjo-extia-HPdeb.example.avalenn.eu> ("JOULAUD François"'s message of "Sat, 23 Jan 2021 21:35:08 +0000")

Thanks so much for the patches, Helio, Joulaud!

I apologize for the long delay before looking at this again. My time
right now is extremely limited due to COVID-19 related childcare and
activities. I was negligent and left a patch to bitrot on my
computer[1]. This patch supersedes it.

In addition to the things this patch corrects, I was/am working on a few
other bugs:

- There are valid Go module paths which, when queried, do not serve the
  requisite meta tag. I had modified `fetch-module-meta-data` to
  recursively walk up the module path searching for a valid meta tag
  (see diff[1]).

- Replacements relative to a module are a possibility, though I think
  Joulaud's patch covers this.

- For reasons Joulaud calls out, a simple line-parser of the HTML for a
  module is not sufficient. Since we are pulling pages down from the
  wider internet, we should fall back on a library made for parsing HTML
  so we handle any edge cases (e.g. meta tags split across lines). I am
  currently looking at `sxml` and, if that doesn't pan out, `htmlprag`.

- Some module homepages issue HTTP redirects. Last time I tested this,
  `http-fetch` did not handle them properly.
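
To make the first point concrete, here is a minimal sketch of the "walk up
the module path" idea. It only computes the candidate paths to query, longest
first. `module-path-candidates` is a hypothetical name used for illustration
and is not part of the patch; the actual change would presumably call
`fetch-module-meta-data` on each candidate until one serves a meta tag.

```scheme
(use-modules (srfi srfi-1))   ; for drop-right

;; Hypothetical helper, not part of the patch: return MODULE-PATH followed
;; by each of its ancestor paths, longest first.
(define (module-path-candidates module-path)
  (let loop ((components (string-split module-path #\/))
             (candidates '()))
    (if (null? components)
        (reverse candidates)
        (loop (drop-right components 1)
              (cons (string-join components "/") candidates)))))

(module-path-candidates "k8s.io/api/core")
;; => ("k8s.io/api/core" "k8s.io/api" "k8s.io")
```

A caller could then walk this list and stop at the first path whose page
carries the go-import meta tag.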

I think that covers everything.

I have pushed everything (including Joulaud's patch with appropriate
attribution) here[2]. I am admittedly new at using email to organize
code changes, but using a forge seems easier.

[1] https://github.com/guix-mirror/guix/commit/cce35c6d68a9bddf9558e85d2cb88be323da9247
[2] https://github.com/kat-co/guix/tree/create-go-importer

Can I suggest we coordinate there, or is that too much of an imposition?

-
Katherine

JOULAUD François <Francois.JOULAUD@radiofrance.com> writes:

> This patch adds a `guix import go` command.
>
> It was tested with several big repositories and seems to mostly work for
> the import part (building Guix packages is another story). There
> are still bugs blocking, e.g., the use of any k8s.io modules.
>
> * guix/import/go.scm: Created Go Importer
> * guix/scripts/import/go.scm: Created Go Importer Subcommand
> * guix/scripts/import.scm (importers): Added Go Importer Subcommand
>
> Signed-off-by: Francois Joulaud <francois.joulaud@radiofrance.com>
> ---
> The patch is a rebased and modified version of the one proposed by
> Katherine Cox-Buday.
>
> Notable modifications are:
> - move from (guix json) to (json)
> - new parse-go.mod with no "set!", which parses some go.mod files that
>   previously failed
> - adding comments (maybe too many comments)
> - renamed SCS to VCS to match the vocabulary used in the Guix and Go
>   worlds
> - replacing escape-capital-letters with Helio Machado's go-path-escape
> - no pruning of the major version in Go module names, as they are
>   considered completely different artefacts by Go programmers
> - fixed recursive-import, probably broken by the rebase
> - force usage of url-fetch from (guix build download)
>
> I would be happy to hear about problems and perspectives for this patch and
> will now focus on my next step, which is actually building any package.
>
> Hope I CCed the right people; I am not really aware of the applicable
> netiquette here.
>
> Interdiff:
>   diff --git a/guix/import/go.scm b/guix/import/go.scm
>   index 61009f3565..7f5f300f0a 100644
>   --- a/guix/import/go.scm
>   +++ b/guix/import/go.scm
>   @@ -1,5 +1,7 @@
>    ;;; GNU Guix --- Functional package management for GNU
>    ;;; Copyright © 2020 Katherine Cox-Buday <cox.katherine.e@gmail.com>
>   +;;; Copyright © 2020 Helio Machado <0x2b3bfa0+guix@googlemail.com>
>   +;;; Copyright © 2021 François Joulaud <francois.joulaud@radiofrance.com>
>    ;;;
>    ;;; This file is part of GNU Guix.
>    ;;;
>   @@ -16,6 +18,21 @@
>    ;;; You should have received a copy of the GNU General Public License
>    ;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
>    
>   +;;; (guix import go) aims to make it easier to create Guix package
>   +;;; declarations for Go modules.
>   +;;;
>   +;;; A Go module is a "collection of related Go packages"; modules are
>   +;;; "the unit of source code interchange and versioning".
>   +;;; Modules are generally hosted in a repository.
>   +;;;
>   +;;; At this point it should correctly handle modules which
>   +;;; - have only Go dependencies;
>   +;;; - use go.mod;
>   +;;; - and are accessible from proxy.golang.org (or configured GOPROXY).
>   +;;;
>   +;;; We translate a Go module path to a Guix package name under the
>   +;;; assumption that there will be no collision.
>   +
>    (define-module (guix import go)
>      #:use-module (ice-9 match)
>      #:use-module (ice-9 rdelim)
>   @@ -23,7 +40,7 @@
>      #:use-module (ice-9 regex)
>      #:use-module (srfi srfi-1)
>      #:use-module (srfi srfi-9)
>   -  #:use-module (guix json)
>   +  #:use-module (json)
>      #:use-module ((guix download) #:prefix download:)
>      #:use-module (guix import utils)
>      #:use-module (guix import json)
>   @@ -33,88 +50,129 @@
>      #:use-module ((guix licenses) #:prefix license:)
>      #:use-module (guix base16)
>      #:use-module (guix base32)
>   -  #:use-module (guix build download)
>   +  #:use-module ((guix build download) #:prefix build-download:)
>      #:use-module (web uri)
>    
>      #:export (go-module->guix-package
>                go-module-recursive-import
>                infer-module-root))
>    
>   -(define (escape-capital-letters s)
>   -  "To avoid ambiguity when serving from case-insensitive file systems, the
>   -$module and $version elements are case-encoded by replacing every uppercase
>   -letter with an exclamation mark followed by the corresponding lower-case
>   -letter."
>   -  (let ((escaped-string (string)))
>   -    (string-for-each-index
>   -     (lambda (i)
>   -       (let ((c (string-ref s i)))
>   -         (set! escaped-string
>   -           (string-concatenate
>   -            (list escaped-string
>   -                  (if (char-upper-case? c) "!" "")
>   -                  (string (char-downcase c)))))))
>   -     s)
>   -    escaped-string))
>   +(define (go-path-escape path)
>   +  "Escape a module path by replacing every uppercase letter with an exclamation
>   +mark followed by its lowercase equivalent, as per the module Escaped Paths
>   +specification. https://godoc.org/golang.org/x/mod/module#hdr-Escaped_Paths"
>   +  (define (escape occurrence)
>   +    (string-append "!" (string-downcase (match:substring occurrence))))
>   +  (regexp-substitute/global #f "[A-Z]" path 'pre escape 'post))
>   +
>    
>    (define (fetch-latest-version goproxy-url module-path)
>      "Fetches the version number of the latest version for MODULE-PATH from the
>    given GOPROXY-URL server."
>      (assoc-ref
>       (json-fetch (format #f "~a/~a/@latest" goproxy-url
>   -                       (escape-capital-letters module-path)))
>   +                       (go-path-escape module-path)))
>       "Version"))
>    
>    (define (fetch-go.mod goproxy-url module-path version file)
>      "Fetches go.mod from the given GOPROXY-URL server for the given MODULE-PATH
>    and VERSION."
>   -  (url-fetch (format #f "~a/~a/@v/~a.mod" goproxy-url
>   -                     (escape-capital-letters module-path)
>   -                     (escape-capital-letters version))
>   -             file
>   -             #:print-build-trace? #f))
>   +  (let ((url (format #f "~a/~a/@v/~a.mod" goproxy-url
>   +                     (go-path-escape module-path)
>   +                     (go-path-escape version))))
>   +    (parameterize ((current-output-port (current-error-port)))
>   +      (build-download:url-fetch url
>   +                                file
>   +                                #:print-build-trace? #f))))
>    
>    (define (parse-go.mod go.mod-path)
>   -  "Parses a go.mod file and returns an alist of module path to version."
>   +  "PARSE-GO.MOD takes a filename in GO.MOD-PATH and extracts a list of
>   +requirements from it."
>   +  ;; We parse only a subset of https://golang.org/ref/mod#go-mod-file-grammar
>   +  ;; which we think necessary for our use case.
>   +  (define (toplevel results)
>   +    "Main parser, RESULTS is a pair of alists serving as accumulators for
>   +     all encountered requirements and replacements."
>   +    (let ((line (read-line)))
>   +      (cond
>   +       ((eof-object? line)
>   +        ;; parsing ended, give back the result
>   +        results)
>   +       ((string=? line "require (")
>   +        ;; a require block begins, delegate parsing to IN-REQUIRE
>   +        (in-require results))
>   +       ((string-prefix? "require " line)
>   +        ;; a require directive by itself
>   +        (let* ((stripped-line (string-drop line 8))
>   +               (new-results (require-directive results stripped-line)))
>   +          (toplevel new-results)))
>   +       ((string-prefix? "replace " line)
>   +        ;; a replace directive by itself
>   +        (let* ((stripped-line (string-drop line 8))
>   +               (new-results (replace-directive results stripped-line)))
>   +          (toplevel new-results)))
>   +       (#t
>   +        ;; unrecognised line, ignore silently
>   +        (toplevel results)))))
>   +  (define (in-require results)
>   +    (let ((line (read-line)))
>   +      (cond
>   +       ((eof-object? line)
>   +        ;; this should never happen here but we ignore silently
>   +        results)
>   +       ((string=? line ")")
>   +        ;; end of block, coming back to toplevel
>   +        (toplevel results))
>   +       (#t
>   +        (in-require (require-directive results line))))))
>   +  (define (replace-directive results line)
>   +    "Extract replaced modules and new requirements from replace directive
>   +    in LINE and add to RESULTS."
>   +    ;; ReplaceSpec = ModulePath [ Version ] "=>" FilePath newline
>   +    ;;             | ModulePath [ Version ] "=>" ModulePath Version newline .
>   +    (let* ((requirements (car results))
>   +           (replaced (cdr results))
>   +           (re (string-concatenate
>   +                '("([^[:blank:]]+)([[:blank:]]+([^[:blank:]]+))?"
>   +                  "[[:blank:]]+" "=>" "[[:blank:]]+"
>   +                  "([^[:blank:]]+)([[:blank:]]+([^[:blank:]]+))?")))
>   +           (match (string-match re line))
>   +           (module-path (match:substring match 1))
>   +           (version (match:substring match 3))
>   +           (new-module-path (match:substring match 4))
>   +           (new-version (match:substring match 6))
>   +           (new-replaced (acons module-path version replaced))
>   +           (new-requirements
>   +            (if (string-match "^\\.?\\./" new-module-path)
>   +                requirements
>   +                (acons new-module-path new-version requirements))))
>   +      (cons new-requirements new-replaced)))
>   +  (define (require-directive results line)
>   +    "Extract requirement from LINE and add it to RESULTS."
>   +    (let* ((requirements (car results))
>   +           (replaced (cdr results))
>   +           ;; A line in a require directive is composed of a module path and
>   +           ;; a version separated by whitespace and an optional '//' comment at
>   +           ;; the end.
>   +           (re (string-concatenate
>   +                '("^[[:blank:]]*"
>   +                  "([^[:blank:]]+)[[:blank:]]+([^[:blank:]]+)"
>   +                  "([[:blank:]]+//.*)?")))
>   +           (match (string-match re line))
>   +           (module-path (match:substring match 1))
>   +           (version (match:substring match 2)))
>   +      (cons (acons module-path version requirements) replaced)))
>      (with-input-from-file go.mod-path
>        (lambda ()
>   -      (let ((in-require? #f)
>   -            (requirements (list)))
>   -        (do ((line (read-line) (read-line)))
>   -            ((eof-object? line))
>   -          (set! line (string-trim line))
>   -          ;; The parser is either entering, within, exiting, or after the
>   -          ;; require block. The Go toolchain is trustworthy so edge-cases like
>   -          ;; double-entry, etc. need not complect the parser.
>   -          (cond
>   -           ((string=? line "require (")
>   -            (set! in-require? #t))
>   -           ((and in-require? (string=? line ")"))
>   -            (set! in-require? #f))
>   -           (in-require?
>   -            (let* ((requirement (string-split line #\space))
>   -                   ;; Modules should be unquoted
>   -                   (module-path (string-delete #\" (car requirement)))
>   -                   (version (list-ref requirement 1)))
>   -              (set! requirements (acons module-path version requirements))))
>   -           ((string-prefix? "replace" line)
>   -            (let* ((requirement (string-split line #\space))
>   -                   (module-path (list-ref requirement 1))
>   -                   (new-module-path (list-ref requirement 3))
>   -                   (version (list-ref requirement 4)))
>   -              (set! requirements (assoc-remove! requirements module-path))
>   -              (set! requirements (acons new-module-path version requirements))))))
>   -        requirements))))
>   -
>   -(define (module-path-without-major-version module-path)
>   -  "Go modules can be appended with a major version indicator,
>   -e.g. /v3. Sometimes it is desirable to work with the root module path. For
>   -instance, for a module path github.com/foo/bar/v3 this function returns
>   -github.com/foo/bar."
>   -  (let ((m (string-match "(.*)\\/v[0-9]+$" module-path)))
>   -    (if m
>   -        (match:substring m 1)
>   -        module-path)))
>   +      (let* ((results (toplevel '(() . ())))
>   +             (requirements (car results))
>   +             (replaced (cdr results)))
>   +        ;; At last we remove replaced modules from the requirements list
>   +        (fold
>   +         (lambda (replacedelem requirements)
>   +             (alist-delete! (car replacedelem) requirements))
>   +         requirements
>   +         replaced)))))
>    
>    (define (infer-module-root module-path)
>      "Go modules can be defined at any level of a repository's tree, but querying
>   @@ -124,38 +182,42 @@ root path from its path. For a set of well-known forges, the pattern of what
>    constitutes a module's root page is known beforehand."
>      ;; See the following URL for the official Go equivalent:
>      ;; https://github.com/golang/go/blob/846dce9d05f19a1f53465e62a304dea21b99f910/src/cmd/go/internal/vcs/vcs.go#L1026-L1087
>   -  (define-record-type <scs>
>   -    (make-scs url-prefix root-regex type)
>   -    scs?
>   -    (url-prefix	scs-url-prefix)
>   -    (root-regex scs-root-regex)
>   -    (type	scs-type))
>   -  (let* ((known-scs
>   +  ;;
>   +  ;; FIXME: handle module path with VCS qualifier as described in
>   +  ;; https://golang.org/ref/mod#vcs-find and
>   +  ;; https://golang.org/cmd/go/#hdr-Remote_import_paths
>   +  (define-record-type <vcs>
>   +    (make-vcs url-prefix root-regex type)
>   +    vcs?
>   +    (url-prefix vcs-url-prefix)
>   +    (root-regex vcs-root-regex)
>   +    (type vcs-type))
>   +  (let* ((known-vcs
>              (list
>   -           (make-scs
>   +           (make-vcs
>                "github.com"
>                "^(github\\.com/[A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+)(/[A-Za-z0-9_.\\-]+)*$"
>                'git)
>   -           (make-scs
>   +           (make-vcs
>               "bitbucket.org"
>               "^(bitbucket\\.org/([A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+))(/[A-Za-z0-9_.\\-]+)*$"
>                'unknown)
>   -           (make-scs
>   +           (make-vcs
>                "hub.jazz.net/git/"
>                "^(hub\\.jazz\\.net/git/[a-z0-9]+/[A-Za-z0-9_.\\-]+)(/[A-Za-z0-9_.\\-]+)*$"
>                'git)
>   -           (make-scs
>   +           (make-vcs
>                "git.apache.org"
>                "^(git\\.apache\\.org/[a-z0-9_.\\-]+\\.git)(/[A-Za-z0-9_.\\-]+)*$"
>                'git)
>   -           (make-scs
>   +           (make-vcs
>                "git.openstack.org"
>                "^(git\\.openstack\\.org/[A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+)(\\.git)?(/[A-Za-z0-9_.\\-]+)*$"
>                'git)))
>   -         (scs (find (lambda (scs) (string-prefix? (scs-url-prefix scs) module-path))
>   -                    known-scs)))
>   -    (if scs
>   -        (match:substring (string-match (scs-root-regex scs) module-path) 1)
>   +         (vcs (find (lambda (vcs) (string-prefix? (vcs-url-prefix vcs) module-path))
>   +                    known-vcs)))
>   +    (if vcs
>   +        (match:substring (string-match (vcs-root-regex vcs) module-path) 1)
>            module-path)))
>    
>    (define (to-guix-package-name module-path)
>   @@ -164,8 +226,7 @@ consists of a module's root page is known before hand."
>       (string-append "go-"
>                      (string-replace-substring
>                       (string-replace-substring
>   -                    ;; Guix has its own field for version
>   -                    (module-path-without-major-version module-path)
>   +                    module-path
>                        "." "-")
>                       "/" "-"))))
>    
>   @@ -173,7 +234,9 @@ consists of a module's root page is known before hand."
>      "Fetches module meta-data from a module's landing page. This is necessary
>    because goproxy servers don't currently provide all the information needed to
>    build a package."
>   -  (let* ((port (http-fetch (string->uri (format #f "https://~a?go-get=1" module-path))))
>   +  ;; FIXME: This code breaks on k8s.io, which has a meta tag split
>   +  ;; across several lines.
>   +  (let* ((port (build-download:http-fetch (string->uri (format #f "https://~a?go-get=1" module-path))))
>             (module-metadata #f)
>             (meta-tag-prefix "<meta name=\"go-import\" content=\"")
>             (meta-tag-prefix-length (string-length meta-tag-prefix)))
>   @@ -185,7 +248,7 @@ build a package."
>              (let* ((start (+ meta-tag-index meta-tag-prefix-length))
>                     (end (string-index line #\" start)))
>                (set! module-metadata
>   -              (string-split (substring/shared line start end) #\space))))))
>   +                  (string-split (substring/shared line start end) #\space))))))
>        (close-port port)
>        module-metadata))
>    
>   @@ -244,7 +307,7 @@ control system is being used."
>                (dependencies (map car (parse-go.mod temp)))
>                (guix-name (to-guix-package-name module-path))
>                (root-module-path (infer-module-root module-path))
>   -            ;; SCS type and URL are not included in goproxy information. For
>   +            ;; VCS type and URL are not included in goproxy information. For
>                ;; this we need to fetch it from the official module page.
>                (meta-data (fetch-module-meta-data root-module-path))
>                (scs-type (module-meta-data-scs meta-data))
>   @@ -268,9 +331,10 @@ control system is being used."
>    
>    (define* (go-module-recursive-import package-name
>                                         #:key (goproxy-url "https://proxy.golang.org"))
>   -  (recursive-import package-name #f
>   -                    #:repo->guix-package
>   -                    (lambda (name _)
>   -                      (go-module->guix-package name
>   -                                               #:goproxy-url goproxy-url))
>   -                    #:guix-name to-guix-package-name))
>   +  (recursive-import
>   +   package-name
>   +   #:repo->guix-package (lambda* (name . _)
>   +                          (go-module->guix-package
>   +                           name
>   +                           #:goproxy-url goproxy-url))
>   +   #:guix-name to-guix-package-name))
>
>  guix/import/go.scm         | 340 +++++++++++++++++++++++++++++++++++++
>  guix/scripts/import.scm    |   2 +-
>  guix/scripts/import/go.scm | 118 +++++++++++++
>  3 files changed, 459 insertions(+), 1 deletion(-)
>  create mode 100644 guix/import/go.scm
>  create mode 100644 guix/scripts/import/go.scm
>
> diff --git a/guix/import/go.scm b/guix/import/go.scm
> new file mode 100644
> index 0000000000..7f5f300f0a
> --- /dev/null
> +++ b/guix/import/go.scm
> @@ -0,0 +1,340 @@
> +;;; GNU Guix --- Functional package management for GNU
> +;;; Copyright © 2020 Katherine Cox-Buday <cox.katherine.e@gmail.com>
> +;;; Copyright © 2020 Helio Machado <0x2b3bfa0+guix@googlemail.com>
> +;;; Copyright © 2021 François Joulaud <francois.joulaud@radiofrance.com>
> +;;;
> +;;; This file is part of GNU Guix.
> +;;;
> +;;; GNU Guix is free software; you can redistribute it and/or modify it
> +;;; under the terms of the GNU General Public License as published by
> +;;; the Free Software Foundation; either version 3 of the License, or (at
> +;;; your option) any later version.
> +;;;
> +;;; GNU Guix is distributed in the hope that it will be useful, but
> +;;; WITHOUT ANY WARRANTY; without even the implied warranty of
> +;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +;;; GNU General Public License for more details.
> +;;;
> +;;; You should have received a copy of the GNU General Public License
> +;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
> +
> +;;; (guix import go) aims to make it easier to create Guix package
> +;;; declarations for Go modules.
> +;;;
> +;;; A Go module is a "collection of related Go packages"; modules are
> +;;; "the unit of source code interchange and versioning".
> +;;; Modules are generally hosted in a repository.
> +;;;
> +;;; At this point it should correctly handle modules which
> +;;; - have only Go dependencies;
> +;;; - use go.mod;
> +;;; - and are accessible from proxy.golang.org (or configured GOPROXY).
> +;;;
> +;;; We translate a Go module path to a Guix package name under the
> +;;; assumption that there will be no collision.
> +
> +(define-module (guix import go)
> +  #:use-module (ice-9 match)
> +  #:use-module (ice-9 rdelim)
> +  #:use-module (ice-9 receive)
> +  #:use-module (ice-9 regex)
> +  #:use-module (srfi srfi-1)
> +  #:use-module (srfi srfi-9)
> +  #:use-module (json)
> +  #:use-module ((guix download) #:prefix download:)
> +  #:use-module (guix import utils)
> +  #:use-module (guix import json)
> +  #:use-module (guix packages)
> +  #:use-module (guix upstream)
> +  #:use-module (guix utils)
> +  #:use-module ((guix licenses) #:prefix license:)
> +  #:use-module (guix base16)
> +  #:use-module (guix base32)
> +  #:use-module ((guix build download) #:prefix build-download:)
> +  #:use-module (web uri)
> +
> +  #:export (go-module->guix-package
> +            go-module-recursive-import
> +            infer-module-root))
> +
> +(define (go-path-escape path)
> +  "Escape a module path by replacing every uppercase letter with an exclamation
> +mark followed by its lowercase equivalent, as per the module Escaped Paths
> +specification. https://godoc.org/golang.org/x/mod/module#hdr-Escaped_Paths"
> +  (define (escape occurrence)
> +    (string-append "!" (string-downcase (match:substring occurrence))))
> +  (regexp-substitute/global #f "[A-Z]" path 'pre escape 'post))
> +
> +
> +(define (fetch-latest-version goproxy-url module-path)
> +  "Fetches the version number of the latest version for MODULE-PATH from the
> +given GOPROXY-URL server."
> +  (assoc-ref
> +   (json-fetch (format #f "~a/~a/@latest" goproxy-url
> +                       (go-path-escape module-path)))
> +   "Version"))
> +
> +(define (fetch-go.mod goproxy-url module-path version file)
> +  "Fetches go.mod from the given GOPROXY-URL server for the given MODULE-PATH
> +and VERSION."
> +  (let ((url (format #f "~a/~a/@v/~a.mod" goproxy-url
> +                     (go-path-escape module-path)
> +                     (go-path-escape version))))
> +    (parameterize ((current-output-port (current-error-port)))
> +      (build-download:url-fetch url
> +                                file
> +                                #:print-build-trace? #f))))
> +
> +(define (parse-go.mod go.mod-path)
> +  "PARSE-GO.MOD takes a filename in GO.MOD-PATH and extracts a list of
> +requirements from it."
> +  ;; We parse only a subset of https://golang.org/ref/mod#go-mod-file-grammar
> +  ;; which we think necessary for our use case.
> +  (define (toplevel results)
> +    "Main parser, RESULTS is a pair of alists serving as accumulators for
> +     all encountered requirements and replacements."
> +    (let ((line (read-line)))
> +      (cond
> +       ((eof-object? line)
> +        ;; parsing ended, give back the result
> +        results)
> +       ((string=? line "require (")
> +        ;; a require block begins, delegate parsing to IN-REQUIRE
> +        (in-require results))
> +       ((string-prefix? "require " line)
> +        ;; a require directive by itself
> +        (let* ((stripped-line (string-drop line 8))
> +               (new-results (require-directive results stripped-line)))
> +          (toplevel new-results)))
> +       ((string-prefix? "replace " line)
> +        ;; a replace directive by itself
> +        (let* ((stripped-line (string-drop line 8))
> +               (new-results (replace-directive results stripped-line)))
> +          (toplevel new-results)))
> +       (#t
> +        ;; unrecognised line, ignore silently
> +        (toplevel results)))))
> +  (define (in-require results)
> +    (let ((line (read-line)))
> +      (cond
> +       ((eof-object? line)
> +        ;; this should never happen here but we ignore silently
> +        results)
> +       ((string=? line ")")
> +        ;; end of block, coming back to toplevel
> +        (toplevel results))
> +       (#t
> +        (in-require (require-directive results line))))))
> +  (define (replace-directive results line)
> +    "Extract replaced modules and new requirements from replace directive
> +    in LINE and add to RESULTS."
> +    ;; ReplaceSpec = ModulePath [ Version ] "=>" FilePath newline
> +    ;;             | ModulePath [ Version ] "=>" ModulePath Version newline .
> +    (let* ((requirements (car results))
> +           (replaced (cdr results))
> +           (re (string-concatenate
> +                '("([^[:blank:]]+)([[:blank:]]+([^[:blank:]]+))?"
> +                  "[[:blank:]]+" "=>" "[[:blank:]]+"
> +                  "([^[:blank:]]+)([[:blank:]]+([^[:blank:]]+))?")))
> +           (match (string-match re line))
> +           (module-path (match:substring match 1))
> +           (version (match:substring match 3))
> +           (new-module-path (match:substring match 4))
> +           (new-version (match:substring match 6))
> +           (new-replaced (acons module-path version replaced))
> +           (new-requirements
> +            (if (string-match "^\\.?\\./" new-module-path)
> +                requirements
> +                (acons new-module-path new-version requirements))))
> +      (cons new-requirements new-replaced)))
> +  (define (require-directive results line)
> +    "Extract requirement from LINE and add it to RESULTS."
> +    (let* ((requirements (car results))
> +           (replaced (cdr results))
> +           ;; A line in a require directive is composed of a module path and
> +           ;; a version separated by whitespace and an optional '//' comment at
> +           ;; the end.
> +           (re (string-concatenate
> +                '("^[[:blank:]]*"
> +                  "([^[:blank:]]+)[[:blank:]]+([^[:blank:]]+)"
> +                  "([[:blank:]]+//.*)?")))
> +           (match (string-match re line))
> +           (module-path (match:substring match 1))
> +           (version (match:substring match 2)))
> +      (cons (acons module-path version requirements) replaced)))
> +  (with-input-from-file go.mod-path
> +    (lambda ()
> +      (let* ((results (toplevel '(() . ())))
> +             (requirements (car results))
> +             (replaced (cdr results)))
> +        ;; At last we remove replaced modules from the requirements list
> +        (fold
> +         (lambda (replacedelem requirements)
> +             (alist-delete! (car replacedelem) requirements))
> +         requirements
> +         replaced)))))
> +
> +(define (infer-module-root module-path)
> +  "Go modules can be defined at any level of a repository's tree, but querying
> +for the meta tag usually can only be done at the webpage at the root of the
> +repository. Therefore, it is sometimes necessary to try and derive a module's
> +root path from its path. For a set of well-known forges, the pattern of what
> +constitutes a module's root page is known beforehand."
> +  ;; See the following URL for the official Go equivalent:
> +  ;; https://github.com/golang/go/blob/846dce9d05f19a1f53465e62a304dea21b99f910/src/cmd/go/internal/vcs/vcs.go#L1026-L1087
> +  ;;
> +  ;; FIXME: handle module path with VCS qualifier as described in
> +  ;; https://golang.org/ref/mod#vcs-find and
> +  ;; https://golang.org/cmd/go/#hdr-Remote_import_paths
> +  (define-record-type <vcs>
> +    (make-vcs url-prefix root-regex type)
> +    vcs?
> +    (url-prefix vcs-url-prefix)
> +    (root-regex vcs-root-regex)
> +    (type vcs-type))
> +  (let* ((known-vcs
> +          (list
> +           (make-vcs
> +            "github.com"
> +            "^(github\\.com/[A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+)(/[A-Za-z0-9_.\\-]+)*$"
> +            'git)
> +           (make-vcs
> +            "bitbucket.org"
> +            "^(bitbucket\\.org/([A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+))(/[A-Za-z0-9_.\\-]+)*$"
> +            'unknown)
> +           (make-vcs
> +            "hub.jazz.net/git/"
> +            "^(hub\\.jazz\\.net/git/[a-z0-9]+/[A-Za-z0-9_.\\-]+)(/[A-Za-z0-9_.\\-]+)*$"
> +            'git)
> +           (make-vcs
> +            "git.apache.org"
> +            "^(git\\.apache\\.org/[a-z0-9_.\\-]+\\.git)(/[A-Za-z0-9_.\\-]+)*$"
> +            'git)
> +           (make-vcs
> +            "git.openstack.org"
> +            "^(git\\.openstack\\.org/[A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+)(\\.git)?(/[A-Za-z0-9_.\\-]+)*$"
> +            'git)))
> +         (vcs (find (lambda (vcs) (string-prefix? (vcs-url-prefix vcs) module-path))
> +                    known-vcs)))
> +    (if vcs
> +        (match:substring (string-match (vcs-root-regex vcs) module-path) 1)
> +        module-path)))
> +
> +(define (to-guix-package-name module-path)
> +  "Converts a module's path to the canonical Guix format for Go packages."
> +  (string-downcase
> +   (string-append "go-"
> +                  (string-replace-substring
> +                   (string-replace-substring
> +                    module-path
> +                    "." "-")
> +                   "/" "-"))))
> +
> +(define (fetch-module-meta-data module-path)
> +  "Fetches module meta-data from a module's landing page. This is necessary
> +because goproxy servers don't currently provide all the information needed to
> +build a package."
> +  ;; FIXME: This code breaks on k8s.io, which has a meta tag split
> +  ;; across several lines.
> +  (let* ((port (build-download:http-fetch (string->uri (format #f "https://~a?go-get=1" module-path))))
> +         (module-metadata #f)
> +         (meta-tag-prefix "<meta name=\"go-import\" content=\"")
> +         (meta-tag-prefix-length (string-length meta-tag-prefix)))
> +    (do ((line (read-line port) (read-line port)))
> +        ((or (eof-object? line)
> +             module-metadata))
> +      (let ((meta-tag-index (string-contains line meta-tag-prefix)))
> +        (when meta-tag-index
> +          (let* ((start (+ meta-tag-index meta-tag-prefix-length))
> +                 (end (string-index line #\" start)))
> +            (set! module-metadata
> +                  (string-split (substring/shared line start end) #\space))))))
> +    (close-port port)
> +    module-metadata))
> +
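
Regarding the FIXME above: one possible stopgap, short of a real HTML
parser, is to match against the whole page rather than line by line.
The sketch below assumes the page body is already available as a
string; `%go-import-regex', `parse-go-import' and the sample page are
hypothetical and hand-written for illustration.  It still assumes the
`name' attribute precedes `content' and that both use double quotes, so
it is not a substitute for proper parsing.

```scheme
(use-modules (ice-9 regex))

(define %go-import-regex
  ;; [[:space:]] and [^"] both match newlines here, so the tag and its
  ;; content may be spread over several lines.
  "<meta[[:space:]]+name=\"go-import\"[[:space:]]+content=\"([^\"]*)\"")

(define (parse-go-import html)
  "Return the go-import fields found in HTML as a list of strings (import
prefix, VCS, repository URL), or #f when no such meta tag is present."
  (let ((m (string-match %go-import-regex html)))
    (and m
         ;; `string-tokenize' splits on any run of whitespace.
         (string-tokenize (match:substring m 1)))))

;; Hand-written sample, not a capture of a real page.
(define %sample-page
  "<html><head>
<meta name=\"go-import\"
      content=\"k8s.io/api
               git https://github.com/kubernetes/api\">
</head></html>")

(write (parse-go-import %sample-page))
(newline)
```

The second element of the returned list is what `module-meta-data-scs'
turns into a symbol, and the third is the repository URL.
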
> +(define (module-meta-data-scs meta-data)
> +  "Return the source control system specified by a module's meta-data."
> +  (string->symbol (list-ref meta-data 1)))
> +
> +(define (module-meta-data-repo-url meta-data goproxy-url)
> +  "Return the URL where the fetcher which will be used can download the source
> +control."
> +  (if (member (module-meta-data-scs meta-data) '(fossil mod))
> +      goproxy-url
> +      (list-ref meta-data 2)))
> +
> +(define (source-uri scs-type scs-repo-url file)
> +  "Generate the `origin' block of a package depending on what type of source
> +control system is being used."
> +  (case scs-type
> +    ((git)
> +     `(origin
> +        (method git-fetch)
> +        (uri (git-reference
> +              (url ,scs-repo-url)
> +              (commit (string-append "v" version))))
> +        (file-name (git-file-name name version))
> +        (sha256
> +         (base32
> +          ,(guix-hash-url file)))))
> +    ((hg)
> +     `(origin
> +        (method hg-fetch)
> +        (uri (hg-reference
> +              (url ,scs-repo-url)
> +              (changeset version)))
> +        (file-name (format #f "~a-~a-checkout" name version))))
> +    ((svn)
> +     `(origin
> +        (method svn-fetch)
> +        (uri (svn-reference
> +              (url ,scs-repo-url)
> +              (revision (string->number version))
> +              (recursive? #f)))
> +        (file-name (format #f "~a-~a-checkout" name version))
> +        (sha256
> +         (base32
> +          ,(guix-hash-url file)))))
> +    (else
> +     (raise-exception (format #f "unsupported scs type: ~a" scs-type)))))
> +
> +(define* (go-module->guix-package module-path #:key (goproxy-url "https://proxy.golang.org"))
> +  (call-with-temporary-output-file
> +   (lambda (temp port)
> +     (let* ((latest-version (fetch-latest-version goproxy-url module-path))
> +            (go.mod-path (fetch-go.mod goproxy-url module-path latest-version
> +                                       temp))
> +            (dependencies (map car (parse-go.mod temp)))
> +            (guix-name (to-guix-package-name module-path))
> +            (root-module-path (infer-module-root module-path))
> +            ;; VCS type and URL are not included in goproxy information. For
> +            ;; this we need to fetch it from the official module page.
> +            (meta-data (fetch-module-meta-data root-module-path))
> +            (scs-type (module-meta-data-scs meta-data))
> +            (scs-repo-url (module-meta-data-repo-url meta-data goproxy-url)))
> +       (values
> +        `(package
> +           (name ,guix-name)
> +           ;; Elide the "v" prefix Go uses
> +           (version ,(string-trim latest-version #\v))
> +           (source
> +            ,(source-uri scs-type scs-repo-url temp))
> +           (build-system go-build-system)
> +           ,@(maybe-inputs (map to-guix-package-name dependencies))
> +           ;; TODO(katco): It would be nice to make an effort to fetch this
> +           ;; from known forges, e.g. GitHub
> +           (home-page ,(format #f "https://~a" root-module-path))
> +           (synopsis "A Go package")
> +           (description ,(format #f "~a is a Go package." guix-name))
> +           (license #f))
> +        dependencies)))))
> +
> +(define* (go-module-recursive-import package-name
> +                                     #:key (goproxy-url "https://proxy.golang.org"))
> +  (recursive-import
> +   package-name
> +   #:repo->guix-package (lambda* (name . _)
> +                          (go-module->guix-package
> +                           name
> +                           #:goproxy-url goproxy-url))
> +   #:guix-name to-guix-package-name))
> diff --git a/guix/scripts/import.scm b/guix/scripts/import.scm
> index 0a3863f965..1d2b45d942 100644
> --- a/guix/scripts/import.scm
> +++ b/guix/scripts/import.scm
> @@ -77,7 +77,7 @@ rather than \\n."
>  ;;;
>  
>  (define importers '("gnu" "nix" "pypi" "cpan" "hackage" "stackage" "elpa" "gem"
> -                    "cran" "crate" "texlive" "json" "opam"))
> +                    "go" "cran" "crate" "texlive" "json" "opam"))
>  
>  (define (resolve-importer name)
>    (let ((module (resolve-interface
> diff --git a/guix/scripts/import/go.scm b/guix/scripts/import/go.scm
> new file mode 100644
> index 0000000000..fde7555973
> --- /dev/null
> +++ b/guix/scripts/import/go.scm
> @@ -0,0 +1,118 @@
> +;;; GNU Guix --- Functional package management for GNU
> +;;; Copyright © 2020 Katherine Cox-Buday <cox.katherine.e@gmail.com>
> +;;;
> +;;; This file is part of GNU Guix.
> +;;;
> +;;; GNU Guix is free software; you can redistribute it and/or modify it
> +;;; under the terms of the GNU General Public License as published by
> +;;; the Free Software Foundation; either version 3 of the License, or (at
> +;;; your option) any later version.
> +;;;
> +;;; GNU Guix is distributed in the hope that it will be useful, but
> +;;; WITHOUT ANY WARRANTY; without even the implied warranty of
> +;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +;;; GNU General Public License for more details.
> +;;;
> +;;; You should have received a copy of the GNU General Public License
> +;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
> +
> +(define-module (guix scripts import go)
> +  #:use-module (guix ui)
> +  #:use-module (guix utils)
> +  #:use-module (guix scripts)
> +  #:use-module (guix import go)
> +  #:use-module (guix scripts import)
> +  #:use-module (srfi srfi-1)
> +  #:use-module (srfi srfi-11)
> +  #:use-module (srfi srfi-37)
> +  #:use-module (ice-9 match)
> +  #:use-module (ice-9 format)
> +  #:export (guix-import-go))
> +
> +
> +;;;
> +;;; Command-line options.
> +;;;
> +
> +(define %default-options
> +  '())
> +
> +(define (show-help)
> +  (display (G_ "Usage: guix import go PACKAGE-PATH
> +Import and convert the Go module for PACKAGE-PATH.\n"))
> +  (display (G_ "
> +  -h, --help             display this help and exit"))
> +  (display (G_ "
> +  -V, --version          display version information and exit"))
> +  (display (G_ "
> +  -r, --recursive        generate package expressions for all Go modules\
> + that are not yet in Guix"))
> +  (display (G_ "
> +  -p, --goproxy=GOPROXY  specify which goproxy server to use"))
> +  (newline)
> +  (show-bug-report-information))
> +
> +(define %options
> +  ;; Specification of the command-line options.
> +  (cons* (option '(#\h "help") #f #f
> +                 (lambda args
> +                   (show-help)
> +                   (exit 0)))
> +         (option '(#\V "version") #f #f
> +                 (lambda args
> +                   (show-version-and-exit "guix import go")))
> +         (option '(#\r "recursive") #f #f
> +                 (lambda (opt name arg result)
> +                   (alist-cons 'recursive #t result)))
> +         (option '(#\p "goproxy") #t #f
> +                 (lambda (opt name arg result)
> +                   (alist-cons 'goproxy
> +                               (string->symbol arg)
> +                               (alist-delete 'goproxy result))))
> +         %standard-import-options))
> +
> +
> +;;;
> +;;; Entry point.
> +;;;
> +
> +(define (guix-import-go . args)
> +  (define (parse-options)
> +    ;; Return the alist of option values.
> +    (args-fold* args %options
> +                (lambda (opt name arg result)
> +                  (leave (G_ "~A: unrecognized option~%") name))
> +                (lambda (arg result)
> +                  (alist-cons 'argument arg result))
> +                %default-options))
> +
> +  (let* ((opts (parse-options))
> +         (args (filter-map (match-lambda
> +                             (('argument . value)
> +                              value)
> +                             (_ #f))
> +                           (reverse opts))))
> +    (match args
> +      ((module-name)
> +       (if (assoc-ref opts 'recursive)
> +           (map (match-lambda
> +                  ((and ('package ('name name) . rest) pkg)
> +                   `(define-public ,(string->symbol name)
> +                      ,pkg))
> +                  (_ #f))
> +                (go-module-recursive-import module-name
> +                                            #:goproxy-url
> +                                            (or (assoc-ref opts 'goproxy)
> +                                                "https://proxy.golang.org")))
> +           (let ((sexp (go-module->guix-package module-name
> +                                                #:goproxy-url
> +                                                (or (assoc-ref opts 'goproxy)
> +                                                    "https://proxy.golang.org"))))
> +             (unless sexp
> +               (leave (G_ "failed to download meta-data for module '~a'~%")
> +                      module-name))
> +             sexp)))
> +      (()
> +       (leave (G_ "too few arguments~%")))
> +      ((many ...)
> +       (leave (G_ "too many arguments~%"))))))

-- 
Katherine




