unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / code / Atom feed
* bug#24450: pypi importer outputs strange character series in optional dependency case.
@ 2016-09-16 20:00 ng0
  2019-03-29  4:24 ` Maxim Cournoyer
                   ` (3 more replies)
  0 siblings, 4 replies; 41+ messages in thread
From: ng0 @ 2016-09-16 20:00 UTC (permalink / raw)
  To: 24450

I think this should not happen with pypi import:

  (inputs
    `(("python-certifi==2016.2.28"
       ,python-certifi==2016.2.28)
      ("python-dateutil==2.5.3"
       ,python-dateutil==2.5.3)
      ("python-flask-babel==0.11.1"
       ,python-flask-babel==0.11.1)
      ("python-flask==0.11.1" ,python-flask==0.11.1)
      ("python-lxml==3.6.0" ,python-lxml==3.6.0)
      ("python-ndg-httpsclient==0.4.1"
       ,python-ndg-httpsclient==0.4.1)
      ("python-pyasn1-modules==0.0.8"
       ,python-pyasn1-modules==0.0.8)
      ("python-pyasn1==0.1.9" ,python-pyasn1==0.1.9)
      ("python-pygments==2.1.3"
       ,python-pygments==2.1.3)
      ("python-pyopenssl==0.15.1"
       ,python-pyopenssl==0.15.1)
      ("python-pyyaml==3.11" ,python-pyyaml==3.11)
      ("python-requests[socks]==2.10.0"
       ,#{python-requests\x5b;socks\x5d;==2.10.0}#)
      ("python-setuptools" ,python-setuptools)))


I can understand the version numbers, I can also understand the optional
socks building/module of the python-requests, but why does it read like
Gobbledygook?  Can't we improve the output here?

For version numbers, this is not a format which happened recently which
is exclusive for python build system right? This is just bad formated
because of the pypi query.
I will first try and not pin the application to these version numbers,
maybe itjustworks™.


To reproduce: "guix import pypi searx"
-- 
              ng0

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: pypi importer outputs strange character series in optional dependency case.
  2016-09-16 20:00 bug#24450: pypi importer outputs strange character series in optional dependency case ng0
@ 2019-03-29  4:24 ` Maxim Cournoyer
  2019-06-16 17:02   ` ng0
  2019-03-29  4:34 ` bug#24450: [PATCH] " Maxim Cournoyer
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 41+ messages in thread
From: Maxim Cournoyer @ 2019-03-29  4:24 UTC (permalink / raw)
  To: ng0; +Cc: 24450

ng0 <ng0@we.make.ritual.n0.is> writes:

> I think this should not happen with pypi import:
>
>   (inputs
>     `(("python-certifi==2016.2.28"
>        ,python-certifi==2016.2.28)
>       ("python-dateutil==2.5.3"
>        ,python-dateutil==2.5.3)
>       ("python-flask-babel==0.11.1"
>        ,python-flask-babel==0.11.1)
>       ("python-flask==0.11.1" ,python-flask==0.11.1)
>       ("python-lxml==3.6.0" ,python-lxml==3.6.0)
>       ("python-ndg-httpsclient==0.4.1"
>        ,python-ndg-httpsclient==0.4.1)
>       ("python-pyasn1-modules==0.0.8"
>        ,python-pyasn1-modules==0.0.8)
>       ("python-pyasn1==0.1.9" ,python-pyasn1==0.1.9)
>       ("python-pygments==2.1.3"
>        ,python-pygments==2.1.3)
>       ("python-pyopenssl==0.15.1"
>        ,python-pyopenssl==0.15.1)
>       ("python-pyyaml==3.11" ,python-pyyaml==3.11)
>       ("python-requests[socks]==2.10.0"
>        ,#{python-requests\x5b;socks\x5d;==2.10.0}#)
>       ("python-setuptools" ,python-setuptools)))
>
>
> I can understand the version numbers, I can also understand the optional
> socks building/module of the python-requests, but why does it read like
> Gobbledygook?  Can't we improve the output here?
>
> For version numbers, this is not a format which happened recently which
> is exclusive for python build system right? This is just bad formated
> because of the pypi query.
> I will first try and not pin the application to these version numbers,
> maybe itjustworks™.
>
>
> To reproduce: "guix import pypi searx"

This would now give (change to be sent for review soon):

--8<---------------cut here---------------start------------->8---
./pre-inst-env guix import pypi searx

Starting download of /tmp/guix-file.1wD8K4
From https://files.pythonhosted.org/packages/75/3f/5941ad2d500ff7cf6f8da1022c78013dcd2207941d533586a8e7bfe699d3/searx-0.15.0.tar.gz...
 …5.0.tar.gz  1.6MiB                  729KiB/s 00:02 [##################] 100.0%
(package
  (name "python-searx")
  (version "0.15.0")
  (source
    (origin
      (method url-fetch)
      (uri (pypi-uri "searx" version))
      (sha256
        (base32
          "1gmww73q7wydkvlyz73wnr3sybpjn40wha7avnz9ak9m365zcjxf"))))
  (build-system python-build-system)
  (propagated-inputs
    `(("python-certifi" ,python-certifi)
      ("python-dateutil" ,python-dateutil)
      ("python-flask" ,python-flask)
      ("python-flask-babel" ,python-flask-babel)
      ("python-idna" ,python-idna)
      ("python-lxml" ,python-lxml)
      ("python-pygments" ,python-pygments)
      ("python-pyopenssl" ,python-pyopenssl)
      ("python-pyyaml" ,python-pyyaml)
      ("python-requests" ,python-requests)))
  (native-inputs
    `(("python-babel" ,python-babel)
      ("python-cov-core" ,python-cov-core)
      ("python-mock" ,python-mock)
      ("python-nose2" ,python-nose2)
      ("python-pep8" ,python-pep8)
      ("python-plone.testing" ,python-plone.testing)
      ("python-selenium" ,python-selenium)
      ("python-splinter" ,python-splinter)
      ("python-transifex-client"
       ,python-transifex-client)
      ("python-unittest2" ,python-unittest2)
      ("python-zope.testrunner"
       ,python-zope.testrunner)))
  (home-page "https://github.com/asciimoo/searx")
  (synopsis
    "A privacy-respecting, hackable metasearch engine")
  (description
    "A privacy-respecting, hackable metasearch engine")
  (license #f))
--8<---------------cut here---------------end--------------->8---

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCH] bug#24450: pypi importer outputs strange character series in optional dependency case.
  2016-09-16 20:00 bug#24450: pypi importer outputs strange character series in optional dependency case ng0
  2019-03-29  4:24 ` Maxim Cournoyer
@ 2019-03-29  4:34 ` Maxim Cournoyer
  2019-03-30  2:12   ` bug#24450: [PATCHv2] " T460s laptop
  2019-05-15 11:06 ` Ricardo Wurmus
  2019-06-12  3:00 ` Maxim Cournoyer
  3 siblings, 1 reply; 41+ messages in thread
From: Maxim Cournoyer @ 2019-03-29  4:34 UTC (permalink / raw)
  To: ng0; +Cc: 24450


[-- Attachment #1.1: Type: text/plain, Size: 1606 bytes --]

Hello,

ng0 <ng0@we.make.ritual.n0.is> writes:

> I think this should not happen with pypi import:
>
>   (inputs
>     `(("python-certifi==2016.2.28"
>        ,python-certifi==2016.2.28)
>       ("python-dateutil==2.5.3"
>        ,python-dateutil==2.5.3)
>       ("python-flask-babel==0.11.1"
>        ,python-flask-babel==0.11.1)
>       ("python-flask==0.11.1" ,python-flask==0.11.1)
>       ("python-lxml==3.6.0" ,python-lxml==3.6.0)
>       ("python-ndg-httpsclient==0.4.1"
>        ,python-ndg-httpsclient==0.4.1)
>       ("python-pyasn1-modules==0.0.8"
>        ,python-pyasn1-modules==0.0.8)
>       ("python-pyasn1==0.1.9" ,python-pyasn1==0.1.9)
>       ("python-pygments==2.1.3"
>        ,python-pygments==2.1.3)
>       ("python-pyopenssl==0.15.1"
>        ,python-pyopenssl==0.15.1)
>       ("python-pyyaml==3.11" ,python-pyyaml==3.11)
>       ("python-requests[socks]==2.10.0"
>        ,#{python-requests\x5b;socks\x5d;==2.10.0}#)
>       ("python-setuptools" ,python-setuptools)))
>
>
> I can understand the version numbers, I can also understand the optional
> socks building/module of the python-requests, but why does it read like
> Gobbledygook?  Can't we improve the output here?
>
> For version numbers, this is not a format which happened recently which
> is exclusive for python build system right? This is just bad formated
> because of the pypi query.
> I will first try and not pin the application to these version numbers,
> maybe itjustworks™.
>
>
> To reproduce: "guix import pypi searx"

The following patches fix this, and more!


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.2: 0001-import-pypi-Do-not-consider-requirements.txt-files.patch --]
[-- Type: text/x-patch, Size: 5825 bytes --]

From 54e44b7397f17910d95dbdb233d23e5c97c095aa Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:00 -0400
Subject: [PATCH 1/7] import: pypi: Do not consider requirements.txt files.

* guix/import/pypi.scm (guess-requirements): Update comment.
[guess-requirements-from-source]: Do not attempt to parse the file
requirements.txt.  Streamline logic.
---
 guix/import/pypi.scm | 35 +++++++++++++----------------------
 tests/pypi.scm       | 23 +++++++++++------------
 2 files changed, 24 insertions(+), 34 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 3a20fc4b9b..8269aa61d7 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -206,35 +206,26 @@ cannot determine package dependencies"))
           (call-with-temporary-directory
            (lambda (dir)
              (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
-                    (req-files (list (string-append dirname "/requirements.txt")
-                                     (string-append dirname "/" pypi-name ".egg-info"
-                                                    "/requires.txt")))
-                    (exit-codes (map (lambda (file-name)
-                                       (parameterize ((current-error-port (%make-void-port "rw+"))
-                                                      (current-output-port (%make-void-port "rw+")))
-                                         (system* "tar" "xf" tarball "-C" dir file-name)))
-                                     req-files)))
-               ;; Only one of these files needs to exist.
-               (if (any zero? exit-codes)
-                   (match (find-files dir)
-                     ((file . _)
-                      (read-requirements file))
-                     (()
-                      (warning (G_ "No requirements file found.\n"))))
+                    (requires.txt (string-append dirname "/" pypi-name
+                                                 ".egg-info" "/requires.txt"))
+                    (exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
+                                              (current-output-port (%make-void-port "rw+")))
+                                 (system* "tar" "xf" tarball "-C" dir requires.txt))))
+               (if (zero? exit-code)
+                   (read-requirements (string-append dir "/" requires.txt))
                    (begin
-                     (warning (G_ "Failed to extract requirements files\n"))
+                     (warning
+                      (G_ "Failed to extract file: ~a from source.~%")
+                      requires.txt)
                      '())))))
           '())))
 
-  ;; First, try to compute the requirements using the wheel, since that is the
-  ;; most reliable option. If a wheel is not provided for this package, try
-  ;; getting them by reading either the "requirements.txt" file or the
-  ;; "requires.txt" from the egg-info directory from the source tarball. Note
-  ;; that "requirements.txt" is not mandatory, so this is likely to fail.
+  ;; First, try to compute the requirements using the wheel, else, fallback to
+  ;; reading the "requires.txt" from the egg-info directory from the source
+  ;; tarball.
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
-
 (define (compute-inputs source-url wheel-url tarball)
   "Given the SOURCE-URL of an already downloaded TARBALL, return a list of
 name/variable pairs describing the required inputs of this package.  Also
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 6daa44a6e7..335be42644 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -23,7 +23,7 @@
   #:use-module (gcrypt hash)
   #:use-module (guix tests)
   #:use-module (guix build-system python)
-  #:use-module ((guix build utils) #:select (delete-file-recursively which))
+  #:use-module ((guix build utils) #:select (delete-file-recursively which mkdir-p))
   #:use-module (srfi srfi-64)
   #:use-module (ice-9 match))
 
@@ -55,11 +55,10 @@
 (define test-source-hash
   "")
 
-(define test-requirements
-"# A comment
- # A comment after a space
+(define test-requires.txt "\
 bar
-baz > 13.37")
+baz > 13.37
+")
 
 (define test-metadata
   "{
@@ -107,10 +106,10 @@ baz > 13.37")
              (match url
                ("https://example.com/foo-1.0.0.tar.gz"
                 (begin
-                  (mkdir "foo-1.0.0")
-                  (with-output-to-file "foo-1.0.0/requirements.txt"
+                  (mkdir-p "foo-1.0.0/foo.egg-info/")
+                  (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                     (lambda ()
-                      (display test-requirements)))
+                      (display test-requires.txt)))
                   (system* "tar" "czvf" file-name "foo-1.0.0/")
                   (delete-file-recursively "foo-1.0.0")
                   (set! test-source-hash
@@ -157,11 +156,11 @@ baz > 13.37")
          (lambda (url file-name)
            (match url
              ("https://example.com/foo-1.0.0.tar.gz"
-               (begin
-                 (mkdir "foo-1.0.0")
-                 (with-output-to-file "foo-1.0.0/requirements.txt"
+              (begin
+                (mkdir-p "foo-1.0.0/foo.egg-info/")
+                (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                    (lambda ()
-                     (display test-requirements)))
+                     (display test-requires.txt)))
                  (system* "tar" "czvf" file-name "foo-1.0.0/")
                  (delete-file-recursively "foo-1.0.0")
                  (set! test-source-hash
-- 
2.20.1


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.3: 0002-import-pypi-Do-not-parse-optional-requirements-from-.patch --]
[-- Type: text/x-patch, Size: 7840 bytes --]

From 5f79b0502f62bd1dacc8ea143c1dbd9ef7cfc29d Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:00 -0400
Subject: [PATCH 2/7] import: pypi: Do not parse optional requirements from
 source.

* guix/import/pypi.scm: Export PARSE-REQUIRES.TXT.
(guess-requirements): Move the READ-REQUIREMENTS procedure to the top level,
and rename it to PARSE-REQUIRES.TXT.  Move the CLEAN-REQUIREMENT and COMMENT?
functions inside the READ-REQUIREMENTS procedure.
(parse-requires.txt): Add a SECTION-HEADER? predicate, and use it to prevent
parsing optional requirements.

* tests/pypi.scm (test-requires-with-sections): New variable.
("parse-requires.txt, with sections"): New test.
("pypi->guix-package"): Mute tar output to stdout.
---
 guix/import/pypi.scm | 76 +++++++++++++++++++++++++++-----------------
 tests/pypi.scm       | 21 ++++++++++--
 2 files changed, 65 insertions(+), 32 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 8269aa61d7..91e987e9f1 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -47,7 +47,8 @@
   #:use-module (guix upstream)
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
-  #:export (guix-package->pypi-name
+  #:export (parse-requires.txt
+            guix-package->pypi-name
             pypi-recursive-import
             pypi->guix-package
             %pypi-updater))
@@ -117,6 +118,49 @@ package definition."
     ((package-inputs ...)
      `((propagated-inputs (,'quasiquote ,package-inputs))))))
 
+(define (clean-requirement s)
+  ;; Given a requirement LINE, as can be found in a setuptools requires.txt
+  ;; file, remove everything other than the actual name of the required
+  ;; package, and return it.
+  (string-take s (or (string-index s (lambda (chr)
+                                       (member chr '(#\space #\> #\= #\<))))
+                     (string-length s))))
+
+(define (parse-requires.txt requires.txt)
+  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
+requirement names."
+  ;; This is a very incomplete parser, which job is to select the non-optional
+  ;; dependencies and strip them out of any version information.
+  ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
+  ;; library and the requirements grammar defined by PEP-0508
+  ;; (https://www.python.org/dev/peps/pep-0508/).
+
+  (define (comment? line)
+    ;; Return #t if the given LINE is a comment, #f otherwise.
+    (eq? (string-ref (string-trim line) 0) #\#))
+
+  (define (section-header? line)
+    ;; Return #t if the given LINE is a section header, #f otherwise.
+    (let ((trimmed-line (string-trim line)))
+      (and (not (string-null? trimmed-line))
+           (eq? (string-ref trimmed-line 0) #\[))))
+
+  (call-with-input-file requires.txt
+    (lambda (port)
+      (let loop ((result '()))
+        (let ((line (read-line port)))
+          ;; Stop when a section is encountered, as sections contains optional
+          ;; (extra) requirements.  Non-optional requirements must appear
+          ;; before any section is defined.
+          (if (or (eof-object? line) (section-header? line))
+              (reverse result)
+              (cond
+               ((or (string-null? line) (comment? line))
+                (loop result))
+               (else
+                (loop (cons (clean-requirement line)
+                            result))))))))))
+
 (define (guess-requirements source-url wheel-url tarball)
   "Given SOURCE-URL, WHEEL-URL and a TARBALL of the package, return a list
 of the required packages specified in the requirements.txt file.  TARBALL will
@@ -139,34 +183,6 @@ be extracted in a temporary directory."
 cannot determine package dependencies"))
           #f)))))
 
-  (define (clean-requirement s)
-    ;; Given a requirement LINE, as can be found in a Python requirements.txt
-    ;; file, remove everything other than the actual name of the required
-    ;; package, and return it.
-    (string-take s
-      (or (string-index s (lambda (chr) (member chr '(#\space #\> #\= #\<))))
-          (string-length s))))
-
-  (define (comment? line)
-    ;; Return #t if the given LINE is a comment, #f otherwise.
-    (eq? (string-ref (string-trim line) 0) #\#))
-
-  (define (read-requirements requirements-file)
-    ;; Given REQUIREMENTS-FILE, a Python requirements.txt file, return a list
-    ;; of name/variable pairs describing the requirements.
-    (call-with-input-file requirements-file
-      (lambda (port)
-        (let loop ((result '()))
-          (let ((line (read-line port)))
-            (if (eof-object? line)
-                result
-                (cond
-                 ((or (string-null? line) (comment? line))
-                  (loop result))
-                 (else
-                  (loop (cons (clean-requirement line)
-                              result))))))))))
-
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
     ;; requirements.
@@ -212,7 +228,7 @@ cannot determine package dependencies"))
                                               (current-output-port (%make-void-port "rw+")))
                                  (system* "tar" "xf" tarball "-C" dir requires.txt))))
                (if (zero? exit-code)
-                   (read-requirements (string-append dir "/" requires.txt))
+                   (parse-requires.txt (string-append dir "/" requires.txt))
                    (begin
                      (warning
                       (G_ "Failed to extract file: ~a from source.~%")
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 335be42644..e4b7142311 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -60,6 +60,15 @@ bar
 baz > 13.37
 ")
 
+(define test-requires-with-sections "\
+# A comment
+foo ~= 3
+bar != 2
+
+[test]
+pytest (>=2.5.0)
+")
+
 (define test-metadata
   "{
   \"run_requires\": [
@@ -99,6 +108,12 @@ baz > 13.37
                     (uri (list "https://bitheap.org/cram/cram-0.7.tar.gz"
                                (pypi-uri "cram" "0.7"))))))))
 
+(test-equal "parse-requires.txt, with sections"
+  '("foo" "bar")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-requires.txt test-requires-with-sections)))
+
 (test-assert "pypi->guix-package"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
@@ -110,7 +125,8 @@ baz > 13.37
                   (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                     (lambda ()
                       (display test-requires.txt)))
-                  (system* "tar" "czvf" file-name "foo-1.0.0/")
+                  (parameterize ((current-output-port (%make-void-port "rw+")))
+                    (system* "tar" "czvf" file-name "foo-1.0.0/"))
                   (delete-file-recursively "foo-1.0.0")
                   (set! test-source-hash
                     (call-with-input-file file-name port-sha256))))
@@ -161,7 +177,8 @@ baz > 13.37
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                    (lambda ()
                      (display test-requires.txt)))
-                 (system* "tar" "czvf" file-name "foo-1.0.0/")
+                (parameterize ((current-output-port (%make-void-port "rw+")))
+                  (system* "tar" "czvf" file-name "foo-1.0.0/"))
                  (delete-file-recursively "foo-1.0.0")
                  (set! test-source-hash
                        (call-with-input-file file-name port-sha256))))
-- 
2.20.1


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.4: 0003-import-pypi-Improve-parsing-of-requirement-specifica.patch --]
[-- Type: text/x-patch, Size: 5396 bytes --]

From 0c62b541a3e8925b5ca31fe55dbe7536cf95151f Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:01 -0400
Subject: [PATCH 3/7] import: pypi: Improve parsing of requirement
 specifications.

The previous solution was fragile and could leave unwanted characters in a
requirement name, such as '[' or ']'.

Partially fixes issue #33047 (see:
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=33047).

* guix/import/pypi.scm (use-modules): Export SPECIFICATION->REQUIREMENT-NAME
(%requirement-name-regexp): New variable.
(clean-requirement): Rename to...
(specification->requirement-name): this, which now uses
%requirement-name-regexp to select the requirement name from the requirement
specification.
(parse-requires.txt): Adapt.
---
 guix/import/pypi.scm | 43 ++++++++++++++++++++++++++++++++++---------
 tests/pypi.scm       | 12 ++++++++++++
 2 files changed, 46 insertions(+), 9 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 91e987e9f1..efb5939c78 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -48,6 +48,7 @@
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
   #:export (parse-requires.txt
+            specification->requirement-name
             guix-package->pypi-name
             pypi-recursive-import
             pypi->guix-package
@@ -118,13 +119,37 @@ package definition."
     ((package-inputs ...)
      `((propagated-inputs (,'quasiquote ,package-inputs))))))
 
-(define (clean-requirement s)
-  ;; Given a requirement LINE, as can be found in a setuptools requires.txt
-  ;; file, remove everything other than the actual name of the required
-  ;; package, and return it.
-  (string-take s (or (string-index s (lambda (chr)
-                                       (member chr '(#\space #\> #\= #\<))))
-                     (string-length s))))
+(define %requirement-name-regexp
+  ;; Regexp to match the requirement name in a requirement specification.
+
+  ;; Some grammar, taken from PEP-0508 (see:
+  ;; https://www.python.org/dev/peps/pep-0508/).
+
+  ;; The unified rule can be expressed as:
+  ;; specification = wsp* ( url_req | name_req ) wsp*
+
+  ;; where url_req is:
+  ;; url_req = name wsp* extras? wsp* urlspec wsp+ quoted_marker?
+
+  ;; and where name_req is:
+  ;; name_req = name wsp* extras? wsp* versionspec? wsp* quoted_marker?
+
+  ;; Thus, we need only matching NAME, which is expressed as:
+  ;; identifer_end = letterOrDigit | (('-' | '_' | '.' )* letterOrDigit)
+  ;; identifier    = letterOrDigit identifier_end*
+  ;; name          = identifier
+  (let* ((letter-or-digit "[A-Za-z0-9]")
+         (identifier-end (string-append "(" letter-or-digit "|"
+                                        "[-_.]*" letter-or-digit ")"))
+         (identifier (string-append "^" letter-or-digit identifier-end "*"))
+         (name identifier))
+    (make-regexp name)))
+
+(define (specification->requirement-name spec)
+  "Given a specification SPEC, return the requirement name."
+  (match:substring
+   (or (regexp-exec %requirement-name-regexp spec)
+       (error (G_ "Could not extract requirement name in spec:") spec))))
 
 (define (parse-requires.txt requires.txt)
   "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
@@ -158,7 +183,7 @@ requirement names."
                ((or (string-null? line) (comment? line))
                 (loop result))
                (else
-                (loop (cons (clean-requirement line)
+                (loop (cons (specification->requirement-name line)
                             result))))))))))
 
 (define (guess-requirements source-url wheel-url tarball)
@@ -200,7 +225,7 @@ cannot determine package dependencies"))
                                             (hash-ref (list-ref run_requires 0)
                                                        "requires")
                                             '())))
-                     (map clean-requirement requirements)))))
+                     (map specification->requirement-name requirements)))))
              (lambda ()
                (delete-file json-file)
                (rmdir dirname))))))
diff --git a/tests/pypi.scm b/tests/pypi.scm
index e4b7142311..82d6bba8dd 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -55,6 +55,14 @@
 (define test-source-hash
   "")
 
+(define test-specifications
+  '("Fizzy [foo, bar]"
+    "PickyThing<1.6,>1.9,!=1.9.6,<2.0a0,==2.4c1"
+    "SomethingWithMarker[foo]>1.0;python_version<\"2.7\""
+    "requests [security,tests] >= 2.8.1, == 2.8.* ; python_version < \"2.7\""
+    "pip @ https://github.com/pypa/pip/archive/1.3.1.zip#\
+sha1=da9234ee9982d4bbb3c72346a6de940a148ea686"))
+
 (define test-requires.txt "\
 bar
 baz > 13.37
@@ -108,6 +116,10 @@ pytest (>=2.5.0)
                     (uri (list "https://bitheap.org/cram/cram-0.7.tar.gz"
                                (pypi-uri "cram" "0.7"))))))))
 
+(test-equal "specification->requirement-name"
+  '("Fizzy" "PickyThing" "SomethingWithMarker" "requests" "pip")
+  (map specification->requirement-name test-specifications))
+
 (test-equal "parse-requires.txt, with sections"
   '("foo" "bar")
   (mock ((ice-9 ports) call-with-input-file
-- 
2.20.1


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.5: 0004-import-pypi-Deduplicate-requirements.patch --]
[-- Type: text/x-patch, Size: 1253 bytes --]

From 76e4a3150f8126e0b952c6129b6e1371afba80c0 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:01 -0400
Subject: [PATCH 4/7] import: pypi: Deduplicate requirements.

* guix/import/pypi.scm (parse-requires.txt): Remove potential duplicates.
---
 guix/import/pypi.scm | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index efb5939c78..a90be67bb0 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -178,7 +178,11 @@ requirement names."
           ;; (extra) requirements.  Non-optional requirements must appear
           ;; before any section is defined.
           (if (or (eof-object? line) (section-header? line))
-              (reverse result)
+              ;; Duplicates can occur, since the same requirement can be
+              ;; listed multiple times with different conditional markers, e.g.
+              ;; pytest >= 3 ; python_version >= "3.3"
+              ;; pytest < 3 ; python_version < "3.3"
+              (reverse (delete-duplicates result))
               (cond
                ((or (string-null? line) (comment? line))
                 (loop result))
-- 
2.20.1


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.6: 0005-import-pypi-Support-more-types-of-archives.patch --]
[-- Type: text/x-patch, Size: 5498 bytes --]

From 73e27235cac1275ba7671fd2364325cf5788cb3c Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:02 -0400
Subject: [PATCH 5/7] import: pypi: Support more types of archives.

This change enables the PyPI importer to look for requirements in a source
archive of a different type than "tar.gz" or "tar.bz2".

* guix/import/pypi.scm: (guess-requirements)[tarball-directory]: Rename to...
[archive-root-directory]: this. Use COMPRESSED-FILED? to determine if an
archive is supported or not.
[guess-requirements-from-source]: Adapt to use the new method, and use unzip
to extract ZIP archives.
(guess-requirements): Rename the TARBALL argument to ARCHIVE, to denote the
archive format is no longer bound specifically to the Tar format.
---
 guix/import/pypi.scm | 47 ++++++++++++++++++++++----------------------
 1 file changed, 24 insertions(+), 23 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index a90be67bb0..8e93653717 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -190,27 +190,24 @@ requirement names."
                 (loop (cons (specification->requirement-name line)
                             result))))))))))
 
-(define (guess-requirements source-url wheel-url tarball)
-  "Given SOURCE-URL, WHEEL-URL and a TARBALL of the package, return a list
-of the required packages specified in the requirements.txt file.  TARBALL will
+(define (guess-requirements source-url wheel-url archive)
+  "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
+of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
-  (define (tarball-directory url)
-    ;; Given the URL of the package's tarball, return the name of the directory
+  (define (archive-root-directory url)
+    ;; Given the URL of the package's archive, return the name of the directory
     ;; that will be created upon decompressing it. If the filetype is not
     ;; supported, return #f.
-    ;; TODO: Support more archive formats.
-    (let ((basename (substring url (+ 1 (string-rindex url #\/)))))
-      (cond
-       ((string-suffix? ".tar.gz" basename)
-        (string-drop-right basename 7))
-       ((string-suffix? ".tar.bz2" basename)
-        (string-drop-right basename 8))
-       (else
+    (if (compressed-file? url)
+        (let ((root-directory (file-sans-extension (basename url))))
+          (if (string=? "tar" (file-extension root-directory))
+              (file-sans-extension root-directory)
+              root-directory))
         (begin
-          (warning (G_ "Unsupported archive format: \
-cannot determine package dependencies"))
-          #f)))))
+          (warning (G_ "Unsupported archive format (~a): \
+cannot determine package dependencies") (file-extension url))
+          #f)))
 
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
@@ -246,16 +243,20 @@ cannot determine package dependencies"))
 
   (define (guess-requirements-from-source)
     ;; Return the package's requirements by guessing them from the source.
-    (let ((dirname (tarball-directory source-url)))
+    (let ((dirname (archive-root-directory source-url))
+          (extension (file-extension source-url)))
       (if (string? dirname)
           (call-with-temporary-directory
            (lambda (dir)
              (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
                     (requires.txt (string-append dirname "/" pypi-name
                                                  ".egg-info" "/requires.txt"))
-                    (exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
-                                              (current-output-port (%make-void-port "rw+")))
-                                 (system* "tar" "xf" tarball "-C" dir requires.txt))))
+                    (exit-code
+                     (parameterize ((current-error-port (%make-void-port "rw+"))
+                                    (current-output-port (%make-void-port "rw+")))
+                       (if (string=? "zip" extension)
+                           (system* "unzip" archive "-d" dir requires.txt)
+                           (system* "tar" "xf" archive "-C" dir requires.txt)))))
                (if (zero? exit-code)
                    (parse-requires.txt (string-append dir "/" requires.txt))
                    (begin
@@ -271,13 +272,13 @@ cannot determine package dependencies"))
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
-(define (compute-inputs source-url wheel-url tarball)
-  "Given the SOURCE-URL of an already downloaded TARBALL, return a list of
+(define (compute-inputs source-url wheel-url archive)
+  "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
 name/variable pairs describing the required inputs of this package.  Also
 return the unaltered list of upstream dependency names."
   (let ((dependencies
          (remove (cut string=? "argparse" <>)
-                 (guess-requirements source-url wheel-url tarball))))
+                 (guess-requirements source-url wheel-url archive))))
     (values (sort
              (map (lambda (input)
                     (let ((guix-name (python->package-name input)))
-- 
2.20.1


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.7: 0006-import-pypi-Parse-wheel-METADATA-instead-of-metadata.patch --]
[-- Type: text/x-patch, Size: 9785 bytes --]

From fb0547ef225103c0f8355a7eccc41e0d028f6563 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:03 -0400
Subject: [PATCH 6/7] import: pypi: Parse wheel METADATA instead of
 metadata.json.

With newer Wheel releases, there is no more metadata.json file; the METADATA
file should be used instead (see: https://github.com/pypa/wheel/issues/195).

This change updates our PyPI importer so that it uses the later.

* guix/import/pypi.scm (define-module): Remove unnecessary modules and export
  the PARSE-WHEEL-METADATA method.
(parse-wheel-metadata): Add method.
(guess-requirements): Use it.
* tests/pypi.scm (test-metadata): Test it.
---
 guix/import/pypi.scm | 66 +++++++++++++++++++++++++++++---------------
 tests/pypi.scm       | 60 ++++++++++++++++++++++++++++++----------
 2 files changed, 89 insertions(+), 37 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 8e93653717..c520213b6a 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -21,9 +21,7 @@
 ;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
 
 (define-module (guix import pypi)
-  #:use-module (ice-9 binary-ports)
   #:use-module (ice-9 match)
-  #:use-module (ice-9 pretty-print)
   #:use-module (ice-9 regex)
   #:use-module (ice-9 receive)
   #:use-module ((ice-9 rdelim) #:select (read-line))
@@ -31,9 +29,6 @@
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
   #:use-module (srfi srfi-35)
-  #:use-module (rnrs bytevectors)
-  #:use-module (json)
-  #:use-module (web uri)
   #:use-module (guix ui)
   #:use-module (guix utils)
   #:use-module ((guix build utils)
@@ -48,6 +43,7 @@
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
   #:export (parse-requires.txt
+            parse-wheel-metadata
             specification->requirement-name
             guix-package->pypi-name
             pypi-recursive-import
@@ -190,6 +186,37 @@ requirement names."
                 (loop (cons (specification->requirement-name line)
                             result))))))))))
 
+(define (parse-wheel-metadata metadata)
+  "Given METADATA, a Wheel metadata file, return a list of requirement names."
+  ;; METADATA is a RFC-2822-like, header based file.
+
+  (define (requires-dist-header? line)
+    ;; Return #t if the given LINE is a Requires-Dist header.
+    (regexp-match? (string-match "^Requires-Dist: " line)))
+
+  (define (requires-dist-value line)
+    (string-drop line (string-length "Requires-Dist: ")))
+
+  (define (extra? line)
+    ;; Return #t if the given LINE is an "extra" requirement.
+    (regexp-match? (string-match "extra == " line)))
+
+  (call-with-input-file metadata
+    (lambda (port)
+      (let loop ((requirements '()))
+        (let ((line (read-line port)))
+          ;; Stop at the first 'Provides-Extra' section: the non-optional
+          ;; requirements appear before the optional ones.
+          (if (eof-object? line)
+              (reverse (delete-duplicates requirements))
+              (cond
+               ((and (requires-dist-header? line) (not (extra? line)))
+                (loop (cons (specification->requirement-name
+                             (requires-dist-value line))
+                            requirements)))
+               (else
+                (loop requirements)))))))))
+
 (define (guess-requirements source-url wheel-url archive)
   "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
 of the required packages specified in the requirements.txt file.  ARCHIVE will
@@ -211,25 +238,18 @@ cannot determine package dependencies") (file-extension url))
 
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
-    ;; requirements.
+    ;; requirements, or #f if the metadata file contained therein couldn't be
+    ;; extracted.
     (let* ((dirname (wheel-url->extracted-directory wheel-url))
-           (json-file (string-append dirname "/metadata.json")))
-      (and (zero? (system* "unzip" "-q" wheel-archive json-file))
-           (dynamic-wind
-             (const #t)
-             (lambda ()
-               (call-with-input-file json-file
-                 (lambda (port)
-                   (let* ((metadata (json->scm port))
-                          (run_requires (hash-ref metadata "run_requires"))
-                          (requirements (if run_requires
-                                            (hash-ref (list-ref run_requires 0)
-                                                       "requires")
-                                            '())))
-                     (map specification->requirement-name requirements)))))
-             (lambda ()
-               (delete-file json-file)
-               (rmdir dirname))))))
+           (metadata (string-append dirname "/METADATA")))
+      (call-with-temporary-directory
+       (lambda (dir)
+         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
+             (parse-wheel-metadata (string-append dir "/" metadata))
+             (begin
+               (warning
+                (G_ "Failed to extract file: ~a from wheel.~%") metadata)
+               #f))))))
 
   (define (guess-requirements-from-wheel)
     ;; Return the package's requirements using the wheel, or #f if an error
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 82d6bba8dd..ca8cb5f6de 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -21,6 +21,7 @@
   #:use-module (guix import pypi)
   #:use-module (guix base32)
   #:use-module (gcrypt hash)
+  #:use-module (guix memoization)
   #:use-module (guix tests)
   #:use-module (guix build-system python)
   #:use-module ((guix build utils) #:select (delete-file-recursively which mkdir-p))
@@ -77,17 +78,33 @@ bar != 2
 pytest (>=2.5.0)
 ")
 
-(define test-metadata
-  "{
-  \"run_requires\": [
-    {
-      \"requires\": [
-        \"bar\",
-        \"baz (>13.37)\"
-      ]
-    }
-  ]
-}")
+(define test-metadata "\
+Classifier: Programming Language :: Python :: 3.7
+Requires-Dist: baz ~= 3
+Requires-Dist: bar != 2
+Provides-Extra: test
+pytest (>=2.5.0)
+")
+
+(define test-metadata-with-extras "
+Classifier: Programming Language :: Python :: 3.7
+Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*
+Requires-Dist: wrapt (<2,>=1)
+Requires-Dist: bar
+
+Provides-Extra: dev
+Requires-Dist: tox ; extra == 'dev'
+Requires-Dist: bumpversion (<1) ; extra == 'dev'
+")
+
+;;; Provides-Extra can appear before Requires-Dist.
+(define test-metadata-with-extras-jedi "\
+Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*
+Provides-Extra: testing
+Requires-Dist: parso (>=0.3.0)
+Provides-Extra: testing
+Requires-Dist: pytest (>=3.1.0); extra == 'testing'
+")
 
 (test-begin "pypi")
 
@@ -126,6 +143,18 @@ pytest (>=2.5.0)
          call-with-input-string)
         (parse-requires.txt test-requires-with-sections)))
 
+(test-equal "parse-wheel-metadata, with extras"
+  '("wrapt" "bar")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-wheel-metadata test-metadata-with-extras)))
+
+(test-equal "parse-wheel-metadata, with extras - Jedi"
+  '("parso")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-wheel-metadata test-metadata-with-extras-jedi)))
+
 (test-assert "pypi->guix-package"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
@@ -188,7 +217,7 @@ pytest (>=2.5.0)
                 (mkdir-p "foo-1.0.0/foo.egg-info/")
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                    (lambda ()
-                     (display test-requires.txt)))
+                     (display "wrong data to make sure we're testing wheels ")))
                 (parameterize ((current-output-port (%make-void-port "rw+")))
                   (system* "tar" "czvf" file-name "foo-1.0.0/"))
                  (delete-file-recursively "foo-1.0.0")
@@ -197,13 +226,13 @@ pytest (>=2.5.0)
              ("https://example.com/foo-1.0.0-py2.py3-none-any.whl"
                (begin
                  (mkdir "foo-1.0.0.dist-info")
-                 (with-output-to-file "foo-1.0.0.dist-info/metadata.json"
+                 (with-output-to-file "foo-1.0.0.dist-info/METADATA"
                    (lambda ()
                      (display test-metadata)))
                  (let ((zip-file (string-append file-name ".zip")))
                    ;; zip always adds a "zip" extension to the file it creates,
                    ;; so we need to rename it.
-                   (system* "zip" zip-file "foo-1.0.0.dist-info/metadata.json")
+                   (system* "zip" zip-file "foo-1.0.0.dist-info/METADATA")
                    (rename-file zip-file file-name))
                  (delete-file-recursively "foo-1.0.0.dist-info")))
              (_ (error "Unexpected URL: " url)))))
@@ -215,6 +244,9 @@ pytest (>=2.5.0)
                             (string-length test-json)))
                    ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
                    (_ (error "Unexpected URL: " url)))))
+              ;; Not clearing the memoization cache here would mean returning the value
+              ;; computed in the previous test.
+              (invalidate-memoization! pypi->guix-package)
               (match (pypi->guix-package "foo")
                 (('package
                    ('name "python-foo")
-- 
2.20.1


[-- Attachment #1.8: 0007-import-pypi-Include-optional-test-inputs-as-native-i.patch --]
[-- Type: text/x-patch, Size: 22689 bytes --]

From ea0f24eb7b19c57ebb24ec48ba776b240bccfc99 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 23:12:26 -0400
Subject: [PATCH 7/7] import: pypi: Include optional test inputs as
 native-inputs.

* guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
(test-section?): New predicate.
(parse-requires.txt): Collect the optional test inputs, and return them as the
second element of the returned list.
(parse-wheel-metadata): Likewise.
(guess-requirements): Adapt, and hide unzip output.
(make-pypi-sexp): Likewise, and include the test inputs requirements as native
inputs in the returned package expression.

* tests/pypi.scm (test-requires.txt): Include a test section in the
test-requires.txt data.
(test-requires.txt-beaker): New variable.
("parse-requires.txt"): Adapt.
("parse-requires.txt - Beaker"): New test.
("parse-wheel-metadata, with extras"): Adapt.
("parse-wheel-metadata, with extras - Jedi"): Adapt.
("pypi->guix-package, no wheel"): Re-indent, and add the expected
native-inputs.
("pypi->guix-package, wheels"): Likewise.
("pypi->guix-package, no usable requirement file."): New test.
---
 guix/import/pypi.scm | 158 ++++++++++++++++++++++++++++---------------
 tests/pypi.scm       | 123 +++++++++++++++++++++++++--------
 2 files changed, 199 insertions(+), 82 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index c520213b6a..f84ad88e44 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -4,6 +4,7 @@
 ;;; Copyright © 2015, 2016, 2017 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2017 Mathieu Othacehe <m.othacehe@gmail.com>
 ;;; Copyright © 2018 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -26,6 +27,7 @@
   #:use-module (ice-9 receive)
   #:use-module ((ice-9 rdelim) #:select (read-line))
   #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-11)
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
   #:use-module (srfi srfi-35)
@@ -106,14 +108,15 @@ package on PyPI."
     ((name version _ ...)
      (string-append name "-" version ".dist-info"))))
 
-(define (maybe-inputs package-inputs)
+(define (maybe-inputs package-inputs input-type)
   "Given a list of PACKAGE-INPUTS, tries to generate the 'inputs' field of a
-package definition."
+package definition.  INPUT-TYPE, a symbol, is used to populate the name of
+the input field."
   (match package-inputs
     (()
      '())
     ((package-inputs ...)
-     `((propagated-inputs (,'quasiquote ,package-inputs))))))
+     `((,input-type (,'quasiquote ,package-inputs))))))
 
 (define %requirement-name-regexp
   ;; Regexp to match the requirement name in a requirement specification.
@@ -147,11 +150,21 @@ package definition."
    (or (regexp-exec %requirement-name-regexp spec)
        (error (G_ "Could not extract requirement name in spec:") spec))))
 
+(define (test-section? name)
+  "Return #t if the section name contains 'test' or 'dev'."
+  (any (cut string-contains-ci name <>)
+       '("test" "dev")))
+
 (define (parse-requires.txt requires.txt)
-  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
-requirement names."
-  ;; This is a very incomplete parser, which job is to select the non-optional
-  ;; dependencies and strip them out of any version information.
+  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a pair of requirements.
+
+The first element of the pair contains the required dependencies while the
+second the optional test dependencies.  Note that currently, optional,
+non-test dependencies are omitted since these can be difficult or expensive to
+satisfy."
+
+  ;; This is a very incomplete parser, which job is to read in the requirement
+  ;; specification lines, and strip them out of any version information.
   ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
   ;; library and the requirements grammar defined by PEP-0508
   ;; (https://www.python.org/dev/peps/pep-0508/).
@@ -168,57 +181,89 @@ requirement names."
 
   (call-with-input-file requires.txt
     (lambda (port)
-      (let loop ((result '()))
+      (let loop ((required-deps '())
+                 (test-deps '())
+                 (inside-test-section? #f)
+                 (optional? #f))
         (let ((line (read-line port)))
-          ;; Stop when a section is encountered, as sections contains optional
-          ;; (extra) requirements.  Non-optional requirements must appear
-          ;; before any section is defined.
-          (if (or (eof-object? line) (section-header? line))
+          (if (eof-object? line)
               ;; Duplicates can occur, since the same requirement can be
               ;; listed multiple times with different conditional markers, e.g.
               ;; pytest >= 3 ; python_version >= "3.3"
               ;; pytest < 3 ; python_version < "3.3"
-              (reverse (delete-duplicates result))
+              (map (compose reverse delete-duplicates)
+                   (list required-deps test-deps))
               (cond
                ((or (string-null? line) (comment? line))
-                (loop result))
-               (else
+                (loop required-deps test-deps inside-test-section? optional?))
+               ((section-header? line)
+                ;; Encountering a section means that all the requirements
+                ;; listed below are optional. Since we want to pick only the
+                ;; test dependencies from the optional dependencies, we must
+                ;; track those separately.
+                (loop required-deps test-deps (test-section? line) #t))
+               (inside-test-section?
+                (loop required-deps
+                      (cons (specification->requirement-name line)
+                            test-deps)
+                      inside-test-section? optional?))
+               ((not optional?)
                 (loop (cons (specification->requirement-name line)
-                            result))))))))))
+                            required-deps)
+                      test-deps inside-test-section? optional?))
+               (optional?
+                ;; Skip optional items.
+                (loop required-deps test-deps inside-test-section? optional?))
+               (else
+                (warning (G_ "parse-requires.txt reached an unexpected \
+condition on line ~a~%") line)))))))))
 
 (define (parse-wheel-metadata metadata)
-  "Given METADATA, a Wheel metadata file, return a list of requirement names."
+  "Given METADATA, a Wheel metadata file, return a pair of requirements.
+
+The first element of the pair contains the required dependencies while the second the optional
+test dependencies.  Note that currently, optional, non-test dependencies are
+omitted since these can be difficult or expensive to satisfy."
   ;; METADATA is a RFC-2822-like, header based file.
 
   (define (requires-dist-header? line)
     ;; Return #t if the given LINE is a Requires-Dist header.
-    (regexp-match? (string-match "^Requires-Dist: " line)))
+    (string-match "^Requires-Dist: " line))
 
   (define (requires-dist-value line)
     (string-drop line (string-length "Requires-Dist: ")))
 
   (define (extra? line)
     ;; Return #t if the given LINE is an "extra" requirement.
-    (regexp-match? (string-match "extra == " line)))
+    (string-match "extra == '(.*)'" line))
+
+  (define (test-requirement? line)
+    (let ((extra-label (match:substring (extra? line) 1)))
+      (and extra-label (test-section? extra-label))))
 
   (call-with-input-file metadata
     (lambda (port)
-      (let loop ((requirements '()))
+      (let loop ((required-deps '())
+                 (test-deps '()))
         (let ((line (read-line port)))
-          ;; Stop at the first 'Provides-Extra' section: the non-optional
-          ;; requirements appear before the optional ones.
           (if (eof-object? line)
-              (reverse (delete-duplicates requirements))
+              (map (compose reverse delete-duplicates)
+                   (list required-deps test-deps))
               (cond
                ((and (requires-dist-header? line) (not (extra? line)))
                 (loop (cons (specification->requirement-name
                              (requires-dist-value line))
-                            requirements)))
+                            required-deps)
+                      test-deps))
+               ((and (requires-dist-header? line) (test-requirement? line))
+                (loop required-deps
+                      (cons (specification->requirement-name (requires-dist-value line))
+                            test-deps)))
                (else
-                (loop requirements)))))))))
+                (loop required-deps test-deps))))))))) ;skip line
 
 (define (guess-requirements source-url wheel-url archive)
-  "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
+  "Given SOURCE-URL, WHEEL-URL and an ARCHIVE of the package, return a list
 of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
@@ -244,7 +289,10 @@ cannot determine package dependencies") (file-extension url))
            (metadata (string-append dirname "/METADATA")))
       (call-with-temporary-directory
        (lambda (dir)
-         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
+         (if (zero?
+              (parameterize ((current-error-port (%make-void-port "rw+"))
+                             (current-output-port (%make-void-port "rw+")))
+                (system* "unzip" wheel-archive "-d" dir metadata)))
              (parse-wheel-metadata (string-append dir "/" metadata))
              (begin
                (warning
@@ -283,32 +331,38 @@ cannot determine package dependencies") (file-extension url))
                      (warning
                       (G_ "Failed to extract file: ~a from source.~%")
                       requires.txt)
-                     '())))))
-          '())))
+                     (list '() '()))))))
+          (list '() '()))))
 
   ;; First, try to compute the requirements using the wheel, else, fallback to
   ;; reading the "requires.txt" from the egg-info directory from the source
-  ;; tarball.
+  ;; archive.
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
 (define (compute-inputs source-url wheel-url archive)
-  "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
-name/variable pairs describing the required inputs of this package.  Also
-return the unaltered list of upstream dependency names."
-  (let ((dependencies
-         (remove (cut string=? "argparse" <>)
-                 (guess-requirements source-url wheel-url archive))))
-    (values (sort
-             (map (lambda (input)
-                    (let ((guix-name (python->package-name input)))
-                      (list guix-name (list 'unquote (string->symbol guix-name)))))
-                  dependencies)
-             (lambda args
-               (match args
-                 (((a _ ...) (b _ ...))
-                  (string-ci<? a b)))))
-            dependencies)))
+  "Given the SOURCE-URL and WHEEL-URL of an already downloaded ARCHIVE, return
+a pair of lists, each consisting of a list of name/variable pairs, for the
+propagated inputs and the native inputs, respectively."
+
+  (define (strip-argparse deps)
+    (remove (cut string=? "argparse" <>) deps))
+
+  (define (requirement->package-name/sort deps)
+    (sort
+     (map (lambda (input)
+            (let ((guix-name (python->package-name input)))
+              (list guix-name (list 'unquote (string->symbol guix-name)))))
+          deps)
+     (lambda args
+       (match args
+         (((a _ ...) (b _ ...))
+          (string-ci<? a b))))))
+
+  (define process-requirements
+    (compose requirement->package-name/sort strip-argparse))
+
+  (map process-requirements (guess-requirements source-url wheel-url archive)))
 
 (define (make-pypi-sexp name version source-url wheel-url home-page synopsis
                         description license)
@@ -317,15 +371,13 @@ VERSION, SOURCE-URL, HOME-PAGE, SYNOPSIS, DESCRIPTION, and LICENSE."
   (call-with-temporary-output-file
    (lambda (temp port)
      (and (url-fetch source-url temp)
-          (receive (input-package-names upstream-dependency-names)
-              (compute-inputs source-url wheel-url temp)
-            (values
+          (match (compute-inputs source-url wheel-url temp)
+            ((required-inputs test-inputs)
              `(package
                 (name ,(python->package-name name))
                 (version ,version)
                 (source (origin
                           (method url-fetch)
-
                           ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
                           ;; cases in NAME, for instance, as is the case with
                           ;; "uwsgi".  In that case, fall back to a full URL.
@@ -334,12 +386,12 @@ VERSION, SOURCE-URL, HOME-PAGE, SYNOPSIS, DESCRIPTION, and LICENSE."
                            (base32
                             ,(guix-hash-url temp)))))
                 (build-system python-build-system)
-                ,@(maybe-inputs input-package-names)
+                ,@(maybe-inputs required-inputs 'propagated-inputs)
+                ,@(maybe-inputs test-inputs 'native-inputs)
                 (home-page ,home-page)
                 (synopsis ,synopsis)
                 (description ,description)
-                (license ,(license->symbol license)))
-             upstream-dependency-names))))))
+                (license ,(license->symbol license)))))))))
 
 (define pypi->guix-package
   (memoize
diff --git a/tests/pypi.scm b/tests/pypi.scm
index ca8cb5f6de..aa08e2cb54 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -1,6 +1,7 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2014 David Thompson <davet@gnu.org>
 ;;; Copyright © 2016 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -65,11 +66,6 @@
 sha1=da9234ee9982d4bbb3c72346a6de940a148ea686"))
 
 (define test-requires.txt "\
-bar
-baz > 13.37
-")
-
-(define test-requires-with-sections "\
 # A comment
 foo ~= 3
 bar != 2
@@ -78,12 +74,25 @@ bar != 2
 pytest (>=2.5.0)
 ")
 
+;; Beaker contains only optional dependencies.
+(define test-requires.txt-beaker "\
+[crypto]
+pycryptopp>=0.5.12
+
+[cryptography]
+cryptography
+
+[testsuite]
+Mock
+coverage
+")
+
 (define test-metadata "\
 Classifier: Programming Language :: Python :: 3.7
 Requires-Dist: baz ~= 3
 Requires-Dist: bar != 2
 Provides-Extra: test
-pytest (>=2.5.0)
+Requires-Dist: pytest (>=2.5.0) ; extra == 'test'
 ")
 
 (define test-metadata-with-extras "
@@ -137,25 +146,31 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
   '("Fizzy" "PickyThing" "SomethingWithMarker" "requests" "pip")
   (map specification->requirement-name test-specifications))
 
-(test-equal "parse-requires.txt, with sections"
-  '("foo" "bar")
+(test-equal "parse-requires.txt"
+  (list '("foo" "bar") '("pytest"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
-        (parse-requires.txt test-requires-with-sections)))
+        (parse-requires.txt test-requires.txt)))
+
+(test-equal "parse-requires.txt - Beaker"
+  (list '() '("Mock" "coverage"))
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-requires.txt test-requires.txt-beaker)))
 
 (test-equal "parse-wheel-metadata, with extras"
-  '("wrapt" "bar")
+  (list '("wrapt" "bar") '("tox" "bumpversion"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
         (parse-wheel-metadata test-metadata-with-extras)))
 
 (test-equal "parse-wheel-metadata, with extras - Jedi"
-  '("parso")
+  (list '("parso") '("pytest"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
         (parse-wheel-metadata test-metadata-with-extras-jedi)))
 
-(test-assert "pypi->guix-package"
+(test-assert "pypi->guix-package, no wheel"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
            (lambda (url file-name)
@@ -195,7 +210,10 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                      ('propagated-inputs
                       ('quasiquote
                        (("python-bar" ('unquote 'python-bar))
-                        ("python-baz" ('unquote 'python-baz)))))
+                        ("python-foo" ('unquote 'python-foo)))))
+                     ('native-inputs
+                      ('quasiquote
+                       (("python-pytest" ('unquote 'python-pytest)))))
                      ('home-page "http://example.com")
                      ('synopsis "summary")
                      ('description "summary")
@@ -216,25 +234,25 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
               (begin
                 (mkdir-p "foo-1.0.0/foo.egg-info/")
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
-                   (lambda ()
-                     (display "wrong data to make sure we're testing wheels ")))
+                  (lambda ()
+                    (display "wrong data to make sure we're testing wheels ")))
                 (parameterize ((current-output-port (%make-void-port "rw+")))
                   (system* "tar" "czvf" file-name "foo-1.0.0/"))
-                 (delete-file-recursively "foo-1.0.0")
-                 (set! test-source-hash
-                       (call-with-input-file file-name port-sha256))))
+                (delete-file-recursively "foo-1.0.0")
+                (set! test-source-hash
+                  (call-with-input-file file-name port-sha256))))
              ("https://example.com/foo-1.0.0-py2.py3-none-any.whl"
-               (begin
-                 (mkdir "foo-1.0.0.dist-info")
-                 (with-output-to-file "foo-1.0.0.dist-info/METADATA"
-                   (lambda ()
-                     (display test-metadata)))
-                 (let ((zip-file (string-append file-name ".zip")))
-                   ;; zip always adds a "zip" extension to the file it creates,
-                   ;; so we need to rename it.
-                   (system* "zip" zip-file "foo-1.0.0.dist-info/METADATA")
-                   (rename-file zip-file file-name))
-                 (delete-file-recursively "foo-1.0.0.dist-info")))
+              (begin
+                (mkdir "foo-1.0.0.dist-info")
+                (with-output-to-file "foo-1.0.0.dist-info/METADATA"
+                  (lambda ()
+                    (display test-metadata)))
+                (let ((zip-file (string-append file-name ".zip")))
+                  ;; zip always adds a "zip" extension to the file it creates,
+                  ;; so we need to rename it.
+                  (system* "zip" "-q" zip-file "foo-1.0.0.dist-info/METADATA")
+                  (rename-file zip-file file-name))
+                (delete-file-recursively "foo-1.0.0.dist-info")))
              (_ (error "Unexpected URL: " url)))))
         (mock ((guix http-client) http-fetch
                (lambda (url . rest)
@@ -262,6 +280,9 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                     ('quasiquote
                      (("python-bar" ('unquote 'python-bar))
                       ("python-baz" ('unquote 'python-baz)))))
+                   ('native-inputs
+                    ('quasiquote
+                     (("python-pytest" ('unquote 'python-pytest)))))
                    ('home-page "http://example.com")
                    ('synopsis "summary")
                    ('description "summary")
@@ -272,4 +293,48 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                 (x
                  (pk 'fail x #f))))))
 
+(test-assert "pypi->guix-package, no usable requirement file."
+  ;; Replace network resources with sample data.
+  (mock ((guix import utils) url-fetch
+         (lambda (url file-name)
+           (match url
+             ("https://example.com/foo-1.0.0.tar.gz"
+              (set! test-source-hash
+                (call-with-input-file file-name port-sha256))
+              #t)
+             ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #t)
+             (_ (error "Unexpected URL: " url)))))
+        (mock ((guix http-client) http-fetch
+               (lambda (url . rest)
+                 (match url
+                   ("https://pypi.org/pypi/foo/json"
+                    (values (open-input-string test-json)
+                            (string-length test-json)))
+                   ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
+                   (_ (error "Unexpected URL: " url)))))
+              ;; Not clearing the memoization cache here would mean returning the value
+              ;; computed in the previous test.
+              (invalidate-memoization! pypi->guix-package)
+              (match (pypi->guix-package "foo")
+                (('package
+                   ('name "python-foo")
+                   ('version "1.0.0")
+                   ('source ('origin
+                              ('method 'url-fetch)
+                              ('uri ('pypi-uri "foo" 'version))
+                              ('sha256
+                               ('base32
+                                (? string? hash)))))
+                   ('build-system 'python-build-system)
+                   ('home-page "http://example.com")
+                   ('synopsis "summary")
+                   ('description "summary")
+                   ('license 'license:lgpl2.0))
+                 (string=? (bytevector->nix-base32-string
+                            test-source-hash)
+                           hash))
+                (x
+                 (pk 'fail x #f)))
+              )))
+
 (test-end "pypi")
-- 
2.20.1


[-- Attachment #1.9: Type: text/plain, Size: 16 bytes --]


Thanks,

Maxim

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] bug#24450: pypi importer outputs strange character series in optional dependency case.
  2019-03-29  4:34 ` bug#24450: [PATCH] " Maxim Cournoyer
@ 2019-03-30  2:12   ` T460s laptop
  2019-03-31 14:40     ` bug#24450: [PATCH] " Maxim Cournoyer
  2019-04-01 15:28     ` bug#24450: [PATCHv2] " Ludovic Courtès
  0 siblings, 2 replies; 41+ messages in thread
From: T460s laptop @ 2019-03-30  2:12 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: ng0, 24450


[-- Attachment #1.1: Type: text/plain, Size: 177 bytes --]


The previous 0007 patch had broken the recursive importer.  This
reworked version fixes this.  The rest of the patches stack sent in the
previous message is still good as is.


[-- Attachment #1.2: 0007-import-pypi-Include-optional-test-inputs-as-native-i.patch --]
[-- Type: text/x-patch, Size: 23891 bytes --]

From 37e499d5d5d5f690aa0a065c730e13f6a31dd30d Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 23:12:26 -0400
Subject: [PATCH] import: pypi: Include optional test inputs as native-inputs.

* guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
(test-section?): New predicate.
(parse-requires.txt): Collect the optional test inputs, and return them as the
second element of the returned list.
(parse-wheel-metadata): Likewise.
(guess-requirements): Adapt, and hide unzip output.
(make-pypi-sexp): Likewise, and include the test inputs requirements as native
inputs in the returned package expression.

* tests/pypi.scm (test-requires.txt): Include a test section in the
test-requires.txt data.
(test-requires.txt-beaker): New variable.
("parse-requires.txt"): Adapt.
("parse-requires.txt - Beaker"): New test.
("parse-wheel-metadata, with extras"): Adapt.
("parse-wheel-metadata, with extras - Jedi"): Adapt.
("pypi->guix-package, no wheel"): Re-indent, and add the expected
native-inputs.
("pypi->guix-package, wheels"): Likewise.
("pypi->guix-package, no usable requirement file."): New test.
---
 guix/import/pypi.scm | 195 ++++++++++++++++++++++++++++---------------
 tests/pypi.scm       | 123 ++++++++++++++++++++-------
 2 files changed, 222 insertions(+), 96 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index c520213b6a..099768f0c8 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -4,6 +4,7 @@
 ;;; Copyright © 2015, 2016, 2017 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2017 Mathieu Othacehe <m.othacehe@gmail.com>
 ;;; Copyright © 2018 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -26,6 +27,7 @@
   #:use-module (ice-9 receive)
   #:use-module ((ice-9 rdelim) #:select (read-line))
   #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-11)
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
   #:use-module (srfi srfi-35)
@@ -106,14 +108,15 @@ package on PyPI."
     ((name version _ ...)
      (string-append name "-" version ".dist-info"))))
 
-(define (maybe-inputs package-inputs)
+(define (maybe-inputs package-inputs input-type)
   "Given a list of PACKAGE-INPUTS, tries to generate the 'inputs' field of a
-package definition."
+package definition.  INPUT-TYPE, a symbol, is used to populate the name of
+the input field."
   (match package-inputs
     (()
      '())
     ((package-inputs ...)
-     `((propagated-inputs (,'quasiquote ,package-inputs))))))
+     `((,input-type (,'quasiquote ,package-inputs))))))
 
 (define %requirement-name-regexp
   ;; Regexp to match the requirement name in a requirement specification.
@@ -147,11 +150,21 @@ package definition."
    (or (regexp-exec %requirement-name-regexp spec)
        (error (G_ "Could not extract requirement name in spec:") spec))))
 
+(define (test-section? name)
+  "Return #t if the section name contains 'test' or 'dev'."
+  (any (cut string-contains-ci name <>)
+       '("test" "dev")))
+
 (define (parse-requires.txt requires.txt)
-  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
-requirement names."
-  ;; This is a very incomplete parser, which job is to select the non-optional
-  ;; dependencies and strip them out of any version information.
+  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a pair of requirements.
+
+The first element of the pair contains the required dependencies while the
+second the optional test dependencies.  Note that currently, optional,
+non-test dependencies are omitted since these can be difficult or expensive to
+satisfy."
+
+  ;; This is a very incomplete parser, which job is to read in the requirement
+  ;; specification lines, and strip them out of any version information.
   ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
   ;; library and the requirements grammar defined by PEP-0508
   ;; (https://www.python.org/dev/peps/pep-0508/).
@@ -168,57 +181,89 @@ requirement names."
 
   (call-with-input-file requires.txt
     (lambda (port)
-      (let loop ((result '()))
+      (let loop ((required-deps '())
+                 (test-deps '())
+                 (inside-test-section? #f)
+                 (optional? #f))
         (let ((line (read-line port)))
-          ;; Stop when a section is encountered, as sections contains optional
-          ;; (extra) requirements.  Non-optional requirements must appear
-          ;; before any section is defined.
-          (if (or (eof-object? line) (section-header? line))
+          (if (eof-object? line)
               ;; Duplicates can occur, since the same requirement can be
               ;; listed multiple times with different conditional markers, e.g.
               ;; pytest >= 3 ; python_version >= "3.3"
               ;; pytest < 3 ; python_version < "3.3"
-              (reverse (delete-duplicates result))
+              (map (compose reverse delete-duplicates)
+                   (list required-deps test-deps))
               (cond
                ((or (string-null? line) (comment? line))
-                (loop result))
-               (else
+                (loop required-deps test-deps inside-test-section? optional?))
+               ((section-header? line)
+                ;; Encountering a section means that all the requirements
+                ;; listed below are optional. Since we want to pick only the
+                ;; test dependencies from the optional dependencies, we must
+                ;; track those separately.
+                (loop required-deps test-deps (test-section? line) #t))
+               (inside-test-section?
+                (loop required-deps
+                      (cons (specification->requirement-name line)
+                            test-deps)
+                      inside-test-section? optional?))
+               ((not optional?)
                 (loop (cons (specification->requirement-name line)
-                            result))))))))))
+                            required-deps)
+                      test-deps inside-test-section? optional?))
+               (optional?
+                ;; Skip optional items.
+                (loop required-deps test-deps inside-test-section? optional?))
+               (else
+                (warning (G_ "parse-requires.txt reached an unexpected \
+condition on line ~a~%") line)))))))))
 
 (define (parse-wheel-metadata metadata)
-  "Given METADATA, a Wheel metadata file, return a list of requirement names."
+  "Given METADATA, a Wheel metadata file, return a pair of requirements.
+
+The first element of the pair contains the required dependencies while the second the optional
+test dependencies.  Note that currently, optional, non-test dependencies are
+omitted since these can be difficult or expensive to satisfy."
   ;; METADATA is a RFC-2822-like, header based file.
 
   (define (requires-dist-header? line)
     ;; Return #t if the given LINE is a Requires-Dist header.
-    (regexp-match? (string-match "^Requires-Dist: " line)))
+    (string-match "^Requires-Dist: " line))
 
   (define (requires-dist-value line)
     (string-drop line (string-length "Requires-Dist: ")))
 
   (define (extra? line)
     ;; Return #t if the given LINE is an "extra" requirement.
-    (regexp-match? (string-match "extra == " line)))
+    (string-match "extra == '(.*)'" line))
+
+  (define (test-requirement? line)
+    (let ((extra-label (match:substring (extra? line) 1)))
+      (and extra-label (test-section? extra-label))))
 
   (call-with-input-file metadata
     (lambda (port)
-      (let loop ((requirements '()))
+      (let loop ((required-deps '())
+                 (test-deps '()))
         (let ((line (read-line port)))
-          ;; Stop at the first 'Provides-Extra' section: the non-optional
-          ;; requirements appear before the optional ones.
           (if (eof-object? line)
-              (reverse (delete-duplicates requirements))
+              (map (compose reverse delete-duplicates)
+                   (list required-deps test-deps))
               (cond
                ((and (requires-dist-header? line) (not (extra? line)))
                 (loop (cons (specification->requirement-name
                              (requires-dist-value line))
-                            requirements)))
+                            required-deps)
+                      test-deps))
+               ((and (requires-dist-header? line) (test-requirement? line))
+                (loop required-deps
+                      (cons (specification->requirement-name (requires-dist-value line))
+                            test-deps)))
                (else
-                (loop requirements)))))))))
+                (loop required-deps test-deps))))))))) ;skip line
 
 (define (guess-requirements source-url wheel-url archive)
-  "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
+  "Given SOURCE-URL, WHEEL-URL and an ARCHIVE of the package, return a list
 of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
@@ -244,7 +289,10 @@ cannot determine package dependencies") (file-extension url))
            (metadata (string-append dirname "/METADATA")))
       (call-with-temporary-directory
        (lambda (dir)
-         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
+         (if (zero?
+              (parameterize ((current-error-port (%make-void-port "rw+"))
+                             (current-output-port (%make-void-port "rw+")))
+                (system* "unzip" wheel-archive "-d" dir metadata)))
              (parse-wheel-metadata (string-append dir "/" metadata))
              (begin
                (warning
@@ -283,32 +331,41 @@ cannot determine package dependencies") (file-extension url))
                      (warning
                       (G_ "Failed to extract file: ~a from source.~%")
                       requires.txt)
-                     '())))))
-          '())))
+                     (list '() '()))))))
+          (list '() '()))))
 
   ;; First, try to compute the requirements using the wheel, else, fallback to
   ;; reading the "requires.txt" from the egg-info directory from the source
-  ;; tarball.
+  ;; archive.
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
 (define (compute-inputs source-url wheel-url archive)
-  "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
-name/variable pairs describing the required inputs of this package.  Also
+  "Given the SOURCE-URL and WHEEL-URL of an already downloaded ARCHIVE, return
+a pair of lists, each consisting of a list of name/variable pairs, for the
+propagated inputs and the native inputs, respectively.  Also
 return the unaltered list of upstream dependency names."
-  (let ((dependencies
-         (remove (cut string=? "argparse" <>)
-                 (guess-requirements source-url wheel-url archive))))
-    (values (sort
-             (map (lambda (input)
-                    (let ((guix-name (python->package-name input)))
-                      (list guix-name (list 'unquote (string->symbol guix-name)))))
-                  dependencies)
-             (lambda args
-               (match args
-                 (((a _ ...) (b _ ...))
-                  (string-ci<? a b)))))
-            dependencies)))
+
+  (define (strip-argparse deps)
+    (remove (cut string=? "argparse" <>) deps))
+
+  (define (requirement->package-name/sort deps)
+    (sort
+     (map (lambda (input)
+            (let ((guix-name (python->package-name input)))
+              (list guix-name (list 'unquote (string->symbol guix-name)))))
+          deps)
+     (lambda args
+       (match args
+         (((a _ ...) (b _ ...))
+          (string-ci<? a b))))))
+
+  (define process-requirements
+    (compose requirement->package-name/sort strip-argparse))
+
+  (let ((dependencies (guess-requirements source-url wheel-url archive)))
+    (values (map process-requirements dependencies)
+            (concatenate dependencies))))
 
 (define (make-pypi-sexp name version source-url wheel-url home-page synopsis
                         description license)
@@ -317,29 +374,33 @@ VERSION, SOURCE-URL, HOME-PAGE, SYNOPSIS, DESCRIPTION, and LICENSE."
   (call-with-temporary-output-file
    (lambda (temp port)
      (and (url-fetch source-url temp)
-          (receive (input-package-names upstream-dependency-names)
+          (receive (guix-dependencies upstream-dependencies)
               (compute-inputs source-url wheel-url temp)
-            (values
-             `(package
-                (name ,(python->package-name name))
-                (version ,version)
-                (source (origin
-                          (method url-fetch)
-
-                          ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
-                          ;; cases in NAME, for instance, as is the case with
-                          ;; "uwsgi".  In that case, fall back to a full URL.
-                          (uri (pypi-uri ,(string-downcase name) version))
-                          (sha256
-                           (base32
-                            ,(guix-hash-url temp)))))
-                (build-system python-build-system)
-                ,@(maybe-inputs input-package-names)
-                (home-page ,home-page)
-                (synopsis ,synopsis)
-                (description ,description)
-                (license ,(license->symbol license)))
-             upstream-dependency-names))))))
+            (match guix-dependencies
+              ((required-inputs test-inputs)
+               (values
+                `(package
+                   (name ,(python->package-name name))
+                   (version ,version)
+                   (source (origin
+                             (method url-fetch)
+                             ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
+                             ;; cases in NAME, for instance, as is the case with
+                             ;; "uwsgi".  In that case, fall back to a full URL.
+                             (uri (pypi-uri ,(string-downcase name) version))
+                             (sha256
+                              (base32
+                               ,(guix-hash-url temp)))))
+                   (build-system python-build-system)
+                   ,@(maybe-inputs required-inputs 'propagated-inputs)
+                   ,@(maybe-inputs test-inputs 'native-inputs)
+                   (home-page ,home-page)
+                   (synopsis ,synopsis)
+                   (description ,description)
+                   (license ,(license->symbol license)))
+                ;; Flatten the nested lists and return the upstream
+                ;; dependencies.
+                upstream-dependencies))))))))
 
 (define pypi->guix-package
   (memoize
diff --git a/tests/pypi.scm b/tests/pypi.scm
index ca8cb5f6de..aa08e2cb54 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -1,6 +1,7 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2014 David Thompson <davet@gnu.org>
 ;;; Copyright © 2016 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -65,11 +66,6 @@
 sha1=da9234ee9982d4bbb3c72346a6de940a148ea686"))
 
 (define test-requires.txt "\
-bar
-baz > 13.37
-")
-
-(define test-requires-with-sections "\
 # A comment
 foo ~= 3
 bar != 2
@@ -78,12 +74,25 @@ bar != 2
 pytest (>=2.5.0)
 ")
 
+;; Beaker contains only optional dependencies.
+(define test-requires.txt-beaker "\
+[crypto]
+pycryptopp>=0.5.12
+
+[cryptography]
+cryptography
+
+[testsuite]
+Mock
+coverage
+")
+
 (define test-metadata "\
 Classifier: Programming Language :: Python :: 3.7
 Requires-Dist: baz ~= 3
 Requires-Dist: bar != 2
 Provides-Extra: test
-pytest (>=2.5.0)
+Requires-Dist: pytest (>=2.5.0) ; extra == 'test'
 ")
 
 (define test-metadata-with-extras "
@@ -137,25 +146,31 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
   '("Fizzy" "PickyThing" "SomethingWithMarker" "requests" "pip")
   (map specification->requirement-name test-specifications))
 
-(test-equal "parse-requires.txt, with sections"
-  '("foo" "bar")
+(test-equal "parse-requires.txt"
+  (list '("foo" "bar") '("pytest"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
-        (parse-requires.txt test-requires-with-sections)))
+        (parse-requires.txt test-requires.txt)))
+
+(test-equal "parse-requires.txt - Beaker"
+  (list '() '("Mock" "coverage"))
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-requires.txt test-requires.txt-beaker)))
 
 (test-equal "parse-wheel-metadata, with extras"
-  '("wrapt" "bar")
+  (list '("wrapt" "bar") '("tox" "bumpversion"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
         (parse-wheel-metadata test-metadata-with-extras)))
 
 (test-equal "parse-wheel-metadata, with extras - Jedi"
-  '("parso")
+  (list '("parso") '("pytest"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
         (parse-wheel-metadata test-metadata-with-extras-jedi)))
 
-(test-assert "pypi->guix-package"
+(test-assert "pypi->guix-package, no wheel"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
            (lambda (url file-name)
@@ -195,7 +210,10 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                      ('propagated-inputs
                       ('quasiquote
                        (("python-bar" ('unquote 'python-bar))
-                        ("python-baz" ('unquote 'python-baz)))))
+                        ("python-foo" ('unquote 'python-foo)))))
+                     ('native-inputs
+                      ('quasiquote
+                       (("python-pytest" ('unquote 'python-pytest)))))
                      ('home-page "http://example.com")
                      ('synopsis "summary")
                      ('description "summary")
@@ -216,25 +234,25 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
               (begin
                 (mkdir-p "foo-1.0.0/foo.egg-info/")
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
-                   (lambda ()
-                     (display "wrong data to make sure we're testing wheels ")))
+                  (lambda ()
+                    (display "wrong data to make sure we're testing wheels ")))
                 (parameterize ((current-output-port (%make-void-port "rw+")))
                   (system* "tar" "czvf" file-name "foo-1.0.0/"))
-                 (delete-file-recursively "foo-1.0.0")
-                 (set! test-source-hash
-                       (call-with-input-file file-name port-sha256))))
+                (delete-file-recursively "foo-1.0.0")
+                (set! test-source-hash
+                  (call-with-input-file file-name port-sha256))))
              ("https://example.com/foo-1.0.0-py2.py3-none-any.whl"
-               (begin
-                 (mkdir "foo-1.0.0.dist-info")
-                 (with-output-to-file "foo-1.0.0.dist-info/METADATA"
-                   (lambda ()
-                     (display test-metadata)))
-                 (let ((zip-file (string-append file-name ".zip")))
-                   ;; zip always adds a "zip" extension to the file it creates,
-                   ;; so we need to rename it.
-                   (system* "zip" zip-file "foo-1.0.0.dist-info/METADATA")
-                   (rename-file zip-file file-name))
-                 (delete-file-recursively "foo-1.0.0.dist-info")))
+              (begin
+                (mkdir "foo-1.0.0.dist-info")
+                (with-output-to-file "foo-1.0.0.dist-info/METADATA"
+                  (lambda ()
+                    (display test-metadata)))
+                (let ((zip-file (string-append file-name ".zip")))
+                  ;; zip always adds a "zip" extension to the file it creates,
+                  ;; so we need to rename it.
+                  (system* "zip" "-q" zip-file "foo-1.0.0.dist-info/METADATA")
+                  (rename-file zip-file file-name))
+                (delete-file-recursively "foo-1.0.0.dist-info")))
              (_ (error "Unexpected URL: " url)))))
         (mock ((guix http-client) http-fetch
                (lambda (url . rest)
@@ -262,6 +280,9 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                     ('quasiquote
                      (("python-bar" ('unquote 'python-bar))
                       ("python-baz" ('unquote 'python-baz)))))
+                   ('native-inputs
+                    ('quasiquote
+                     (("python-pytest" ('unquote 'python-pytest)))))
                    ('home-page "http://example.com")
                    ('synopsis "summary")
                    ('description "summary")
@@ -272,4 +293,48 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                 (x
                  (pk 'fail x #f))))))
 
+(test-assert "pypi->guix-package, no usable requirement file."
+  ;; Replace network resources with sample data.
+  (mock ((guix import utils) url-fetch
+         (lambda (url file-name)
+           (match url
+             ("https://example.com/foo-1.0.0.tar.gz"
+              (set! test-source-hash
+                (call-with-input-file file-name port-sha256))
+              #t)
+             ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #t)
+             (_ (error "Unexpected URL: " url)))))
+        (mock ((guix http-client) http-fetch
+               (lambda (url . rest)
+                 (match url
+                   ("https://pypi.org/pypi/foo/json"
+                    (values (open-input-string test-json)
+                            (string-length test-json)))
+                   ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
+                   (_ (error "Unexpected URL: " url)))))
+              ;; Not clearing the memoization cache here would mean returning the value
+              ;; computed in the previous test.
+              (invalidate-memoization! pypi->guix-package)
+              (match (pypi->guix-package "foo")
+                (('package
+                   ('name "python-foo")
+                   ('version "1.0.0")
+                   ('source ('origin
+                              ('method 'url-fetch)
+                              ('uri ('pypi-uri "foo" 'version))
+                              ('sha256
+                               ('base32
+                                (? string? hash)))))
+                   ('build-system 'python-build-system)
+                   ('home-page "http://example.com")
+                   ('synopsis "summary")
+                   ('description "summary")
+                   ('license 'license:lgpl2.0))
+                 (string=? (bytevector->nix-base32-string
+                            test-source-hash)
+                           hash))
+                (x
+                 (pk 'fail x #f)))
+              )))
+
 (test-end "pypi")
-- 
2.20.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* bug#24450: [PATCH] bug#24450: pypi importer outputs strange character series in optional dependency case.
  2019-03-30  2:12   ` bug#24450: [PATCHv2] " T460s laptop
@ 2019-03-31 14:40     ` Maxim Cournoyer
  2019-04-01 15:28     ` bug#24450: [PATCHv2] " Ludovic Courtès
  1 sibling, 0 replies; 41+ messages in thread
From: Maxim Cournoyer @ 2019-03-31 14:40 UTC (permalink / raw)
  To: ng0; +Cc: 24450


[-- Attachment #1.1: Type: text/plain, Size: 442 bytes --]

Hello!

I've yet another commit to add on the top of the current stack of
patches already sent. This one makes it so that the source archive is
searched for the '[...].egg-info/requires.txt' file rather than check
the more common, fixed location.

This makes the importer more useful, as some packages such as
robotframework-sshlibrary use a "src" directory in their hierarchy which
would cause the previous scheme to not find requires.txt.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.2: 0008-import-pypi-Scan-source-archive-to-find-requires.txt.patch --]
[-- Type: text/x-patch, Size: 7538 bytes --]

From cfde6e09f8f8c692fe252d76ed27e8c50a9e5377 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Sat, 30 Mar 2019 23:13:26 -0400
Subject: [PATCH] import: pypi: Scan source archive to find requires.txt file.

* guix/import/pypi.scm (use-modules): Use invoke from (guix build utils).
(guess-requirements)[archive-root-directory]: Remove procedure.
[guess-requirements-from-wheel]: Re-ident.
[guess-requirements-from-source]: Search for the requires.txt file in the
archive instead of using a static, expected location.
* tests/pypi.scm ("pypi->guix-package, no wheel"): Mock the requires.txt at a
non-standard location to test the new feature.
("pypi->guix-package, no usable requirement file."): Adapt.
---
 guix/import/pypi.scm | 65 ++++++++++++++++++--------------------------
 tests/pypi.scm       | 17 +++++++-----
 2 files changed, 37 insertions(+), 45 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 099768f0c8..a2ce14b192 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -36,7 +36,8 @@
   #:use-module ((guix build utils)
                 #:select ((package-name->name+version
                            . hyphen-package-name->name+version)
-                          find-files))
+                          find-files
+                          invoke))
   #:use-module (guix import utils)
   #:use-module ((guix download) #:prefix download:)
   #:use-module (guix import json)
@@ -267,19 +268,6 @@ omitted since these can be difficult or expensive to satisfy."
 of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
-  (define (archive-root-directory url)
-    ;; Given the URL of the package's archive, return the name of the directory
-    ;; that will be created upon decompressing it. If the filetype is not
-    ;; supported, return #f.
-    (if (compressed-file? url)
-        (let ((root-directory (file-sans-extension (basename url))))
-          (if (string=? "tar" (file-extension root-directory))
-              (file-sans-extension root-directory)
-              root-directory))
-        (begin
-          (warning (G_ "Unsupported archive format (~a): \
-cannot determine package dependencies") (file-extension url))
-          #f)))
 
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
@@ -305,33 +293,34 @@ cannot determine package dependencies") (file-extension url))
     (call-with-temporary-output-file
      (lambda (temp port)
        (if wheel-url
-         (and (url-fetch wheel-url temp)
-              (read-wheel-metadata temp))
-         #f))))
+           (and (url-fetch wheel-url temp)
+                (read-wheel-metadata temp))
+           #f))))
 
   (define (guess-requirements-from-source)
     ;; Return the package's requirements by guessing them from the source.
-    (let ((dirname (archive-root-directory source-url))
-          (extension (file-extension source-url)))
-      (if (string? dirname)
-          (call-with-temporary-directory
-           (lambda (dir)
-             (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
-                    (requires.txt (string-append dirname "/" pypi-name
-                                                 ".egg-info" "/requires.txt"))
-                    (exit-code
-                     (parameterize ((current-error-port (%make-void-port "rw+"))
-                                    (current-output-port (%make-void-port "rw+")))
-                       (if (string=? "zip" extension)
-                           (system* "unzip" archive "-d" dir requires.txt)
-                           (system* "tar" "xf" archive "-C" dir requires.txt)))))
-               (if (zero? exit-code)
-                   (parse-requires.txt (string-append dir "/" requires.txt))
-                   (begin
-                     (warning
-                      (G_ "Failed to extract file: ~a from source.~%")
-                      requires.txt)
-                     (list '() '()))))))
+    (if (compressed-file? source-url)
+        (call-with-temporary-directory
+         (lambda (dir)
+           (parameterize ((current-error-port (%make-void-port "rw+"))
+                          (current-output-port (%make-void-port "rw+")))
+             (if (string=? "zip" (file-extension source-url))
+                 (invoke "unzip" archive "-d" dir)
+                 (invoke "tar" "xf" archive "-C" dir)))
+           (let ((requires.txt-files
+                  (find-files dir (lambda (abs-file-name _)
+		                    (string-match "\\.egg-info/requires.txt$"
+                                                  abs-file-name)))))
+             (if (> (length requires.txt-files) 0)
+                 (begin
+                   (parse-requires.txt (first requires.txt-files)))
+                 (begin (warning (G_ "Cannot guess requirements from source archive:\
+ no requires.txt file found.~%"))
+                        (list '() '()))))))
+        (begin
+          (warning (G_ "Unsupported archive format; \
+cannot determine package dependencies from source archive: ~a~%")
+                   (basename source-url))
           (list '() '()))))
 
   ;; First, try to compute the requirements using the wheel, else, fallback to
diff --git a/tests/pypi.scm b/tests/pypi.scm
index aa08e2cb54..ad188df16c 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -177,8 +177,9 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
              (match url
                ("https://example.com/foo-1.0.0.tar.gz"
                 (begin
-                  (mkdir-p "foo-1.0.0/foo.egg-info/")
-                  (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
+                  ;; Unusual requires.txt location should still be found.
+                  (mkdir-p "foo-1.0.0/src/bizarre.egg-info")
+                  (with-output-to-file "foo-1.0.0/src/bizarre.egg-info/requires.txt"
                     (lambda ()
                       (display test-requires.txt)))
                   (parameterize ((current-output-port (%make-void-port "rw+")))
@@ -299,10 +300,13 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
          (lambda (url file-name)
            (match url
              ("https://example.com/foo-1.0.0.tar.gz"
+              (mkdir-p "foo-1.0.0/foo.egg-info/")
+              (parameterize ((current-output-port (%make-void-port "rw+")))
+                (system* "tar" "czvf" file-name "foo-1.0.0/"))
+              (delete-file-recursively "foo-1.0.0")
               (set! test-source-hash
-                (call-with-input-file file-name port-sha256))
-              #t)
-             ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #t)
+                (call-with-input-file file-name port-sha256)))
+             ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
              (_ (error "Unexpected URL: " url)))))
         (mock ((guix http-client) http-fetch
                (lambda (url . rest)
@@ -334,7 +338,6 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                             test-source-hash)
                            hash))
                 (x
-                 (pk 'fail x #f)))
-              )))
+                 (pk 'fail x #f))))))
 
 (test-end "pypi")
-- 
2.20.1


[-- Attachment #1.3: Type: text/plain, Size: 16 bytes --]


Thanks,

Maxim

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] bug#24450: pypi importer outputs strange character series in optional dependency case.
  2019-03-30  2:12   ` bug#24450: [PATCHv2] " T460s laptop
  2019-03-31 14:40     ` bug#24450: [PATCH] " Maxim Cournoyer
@ 2019-04-01 15:28     ` Ludovic Courtès
  1 sibling, 0 replies; 41+ messages in thread
From: Ludovic Courtès @ 2019-04-01 15:28 UTC (permalink / raw)
  To: T460s laptop; +Cc: ng0, 24450

Hi Maxim,

T460s laptop <maxim.cournoyer@gmail.com> skribis:

> From 37e499d5d5d5f690aa0a065c730e13f6a31dd30d Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Thu, 28 Mar 2019 23:12:26 -0400
> Subject: [PATCH] import: pypi: Include optional test inputs as native-inputs.
>
> * guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
> (test-section?): New predicate.
> (parse-requires.txt): Collect the optional test inputs, and return them as the
> second element of the returned list.
> (parse-wheel-metadata): Likewise.
> (guess-requirements): Adapt, and hide unzip output.
> (make-pypi-sexp): Likewise, and include the test inputs requirements as native
> inputs in the returned package expression.
>
> * tests/pypi.scm (test-requires.txt): Include a test section in the
> test-requires.txt data.
> (test-requires.txt-beaker): New variable.
> ("parse-requires.txt"): Adapt.
> ("parse-requires.txt - Beaker"): New test.
> ("parse-wheel-metadata, with extras"): Adapt.
> ("parse-wheel-metadata, with extras - Jedi"): Adapt.
> ("pypi->guix-package, no wheel"): Re-indent, and add the expected
> native-inputs.
> ("pypi->guix-package, wheels"): Likewise.
> ("pypi->guix-package, no usable requirement file."): New test.

It seems that the patch does several unrelated things, such as silencing
‘unzip’, improving docstrings, handling inputs other than
‘propagated-inputs’, and correctly parsing wheels.

Could you try to separate these as much as possible to simplify review?

Thank you!

Ludo’.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: pypi importer outputs strange character series in optional dependency case.
  2016-09-16 20:00 bug#24450: pypi importer outputs strange character series in optional dependency case ng0
  2019-03-29  4:24 ` Maxim Cournoyer
  2019-03-29  4:34 ` bug#24450: [PATCH] " Maxim Cournoyer
@ 2019-05-15 11:06 ` Ricardo Wurmus
  2019-05-20  4:05   ` bug#24450: [PATCHv2] " Maxim Cournoyer
  2019-06-12  3:00 ` Maxim Cournoyer
  3 siblings, 1 reply; 41+ messages in thread
From: Ricardo Wurmus @ 2019-05-15 11:06 UTC (permalink / raw)
  To: maxim.cournoyer; +Cc: 24450

Hi Maxim,

I would very much like to see your improvements to the pypi importer to
be merged.  Have you been able to separate the independent changes as
suggested by Ludo?

-- 
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-15 11:06 ` Ricardo Wurmus
@ 2019-05-20  4:05   ` Maxim Cournoyer
  2019-05-20 15:05     ` Ludovic Courtès
                       ` (9 more replies)
  0 siblings, 10 replies; 41+ messages in thread
From: Maxim Cournoyer @ 2019-05-20  4:05 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 24450


[-- Attachment #1.1: Type: text/plain, Size: 450 bytes --]

Hi Ricardo!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> Hi Maxim,
>
> I would very much like to see your improvements to the pypi importer to
> be merged.  Have you been able to separate the independent changes as
> suggested by Ludo?

I'm thrilled that someone has an interest in this :-)

I took my time, but finally got around to restructure the changes a
bit.  I hope it'll be easier to review this time around!

Thank you!

Maxim


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.2: 0001-import-pypi-Do-not-consider-requirements.txt-files.patch --]
[-- Type: text/x-patch, Size: 5825 bytes --]

From 54e44b7397f17910d95dbdb233d23e5c97c095aa Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:00 -0400
Subject: [PATCH 1/9] import: pypi: Do not consider requirements.txt files.

* guix/import/pypi.scm (guess-requirements): Update comment.
[guess-requirements-from-source]: Do not attempt to parse the file
requirements.txt.  Streamline logic.
---
 guix/import/pypi.scm | 35 +++++++++++++----------------------
 tests/pypi.scm       | 23 +++++++++++------------
 2 files changed, 24 insertions(+), 34 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 3a20fc4b9b..8269aa61d7 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -206,35 +206,26 @@ cannot determine package dependencies"))
           (call-with-temporary-directory
            (lambda (dir)
              (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
-                    (req-files (list (string-append dirname "/requirements.txt")
-                                     (string-append dirname "/" pypi-name ".egg-info"
-                                                    "/requires.txt")))
-                    (exit-codes (map (lambda (file-name)
-                                       (parameterize ((current-error-port (%make-void-port "rw+"))
-                                                      (current-output-port (%make-void-port "rw+")))
-                                         (system* "tar" "xf" tarball "-C" dir file-name)))
-                                     req-files)))
-               ;; Only one of these files needs to exist.
-               (if (any zero? exit-codes)
-                   (match (find-files dir)
-                     ((file . _)
-                      (read-requirements file))
-                     (()
-                      (warning (G_ "No requirements file found.\n"))))
+                    (requires.txt (string-append dirname "/" pypi-name
+                                                 ".egg-info" "/requires.txt"))
+                    (exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
+                                              (current-output-port (%make-void-port "rw+")))
+                                 (system* "tar" "xf" tarball "-C" dir requires.txt))))
+               (if (zero? exit-code)
+                   (read-requirements (string-append dir "/" requires.txt))
                    (begin
-                     (warning (G_ "Failed to extract requirements files\n"))
+                     (warning
+                      (G_ "Failed to extract file: ~a from source.~%")
+                      requires.txt)
                      '())))))
           '())))
 
-  ;; First, try to compute the requirements using the wheel, since that is the
-  ;; most reliable option. If a wheel is not provided for this package, try
-  ;; getting them by reading either the "requirements.txt" file or the
-  ;; "requires.txt" from the egg-info directory from the source tarball. Note
-  ;; that "requirements.txt" is not mandatory, so this is likely to fail.
+  ;; First, try to compute the requirements using the wheel, else, fallback to
+  ;; reading the "requires.txt" from the egg-info directory from the source
+  ;; tarball.
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
-
 (define (compute-inputs source-url wheel-url tarball)
   "Given the SOURCE-URL of an already downloaded TARBALL, return a list of
 name/variable pairs describing the required inputs of this package.  Also
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 6daa44a6e7..335be42644 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -23,7 +23,7 @@
   #:use-module (gcrypt hash)
   #:use-module (guix tests)
   #:use-module (guix build-system python)
-  #:use-module ((guix build utils) #:select (delete-file-recursively which))
+  #:use-module ((guix build utils) #:select (delete-file-recursively which mkdir-p))
   #:use-module (srfi srfi-64)
   #:use-module (ice-9 match))
 
@@ -55,11 +55,10 @@
 (define test-source-hash
   "")
 
-(define test-requirements
-"# A comment
- # A comment after a space
+(define test-requires.txt "\
 bar
-baz > 13.37")
+baz > 13.37
+")
 
 (define test-metadata
   "{
@@ -107,10 +106,10 @@ baz > 13.37")
              (match url
                ("https://example.com/foo-1.0.0.tar.gz"
                 (begin
-                  (mkdir "foo-1.0.0")
-                  (with-output-to-file "foo-1.0.0/requirements.txt"
+                  (mkdir-p "foo-1.0.0/foo.egg-info/")
+                  (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                     (lambda ()
-                      (display test-requirements)))
+                      (display test-requires.txt)))
                   (system* "tar" "czvf" file-name "foo-1.0.0/")
                   (delete-file-recursively "foo-1.0.0")
                   (set! test-source-hash
@@ -157,11 +156,11 @@ baz > 13.37")
          (lambda (url file-name)
            (match url
              ("https://example.com/foo-1.0.0.tar.gz"
-               (begin
-                 (mkdir "foo-1.0.0")
-                 (with-output-to-file "foo-1.0.0/requirements.txt"
+              (begin
+                (mkdir-p "foo-1.0.0/foo.egg-info/")
+                (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                    (lambda ()
-                     (display test-requirements)))
+                     (display test-requires.txt)))
                  (system* "tar" "czvf" file-name "foo-1.0.0/")
                  (delete-file-recursively "foo-1.0.0")
                  (set! test-source-hash
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.3: 0002-import-pypi-Do-not-parse-optional-requirements-from-.patch --]
[-- Type: text/x-patch, Size: 7840 bytes --]

From 5f79b0502f62bd1dacc8ea143c1dbd9ef7cfc29d Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:00 -0400
Subject: [PATCH 2/9] import: pypi: Do not parse optional requirements from
 source.

* guix/import/pypi.scm: Export PARSE-REQUIRES.TXT.
(guess-requirements): Move the READ-REQUIREMENTS procedure to the top level,
and rename it to PARSE-REQUIRES.TXT.  Move the CLEAN-REQUIREMENT and COMMENT?
functions inside the READ-REQUIREMENTS procedure.
(parse-requires.txt): Add a SECTION-HEADER? predicate, and use it to prevent
parsing optional requirements.

* tests/pypi.scm (test-requires-with-sections): New variable.
("parse-requires.txt, with sections"): New test.
("pypi->guix-package"): Mute tar output to stdout.
---
 guix/import/pypi.scm | 76 +++++++++++++++++++++++++++-----------------
 tests/pypi.scm       | 21 ++++++++++--
 2 files changed, 65 insertions(+), 32 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 8269aa61d7..91e987e9f1 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -47,7 +47,8 @@
   #:use-module (guix upstream)
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
-  #:export (guix-package->pypi-name
+  #:export (parse-requires.txt
+            guix-package->pypi-name
             pypi-recursive-import
             pypi->guix-package
             %pypi-updater))
@@ -117,6 +118,49 @@ package definition."
     ((package-inputs ...)
      `((propagated-inputs (,'quasiquote ,package-inputs))))))
 
+(define (clean-requirement s)
+  ;; Given a requirement LINE, as can be found in a setuptools requires.txt
+  ;; file, remove everything other than the actual name of the required
+  ;; package, and return it.
+  (string-take s (or (string-index s (lambda (chr)
+                                       (member chr '(#\space #\> #\= #\<))))
+                     (string-length s))))
+
+(define (parse-requires.txt requires.txt)
+  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
+requirement names."
+  ;; This is a very incomplete parser, which job is to select the non-optional
+  ;; dependencies and strip them out of any version information.
+  ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
+  ;; library and the requirements grammar defined by PEP-0508
+  ;; (https://www.python.org/dev/peps/pep-0508/).
+
+  (define (comment? line)
+    ;; Return #t if the given LINE is a comment, #f otherwise.
+    (eq? (string-ref (string-trim line) 0) #\#))
+
+  (define (section-header? line)
+    ;; Return #t if the given LINE is a section header, #f otherwise.
+    (let ((trimmed-line (string-trim line)))
+      (and (not (string-null? trimmed-line))
+           (eq? (string-ref trimmed-line 0) #\[))))
+
+  (call-with-input-file requires.txt
+    (lambda (port)
+      (let loop ((result '()))
+        (let ((line (read-line port)))
+          ;; Stop when a section is encountered, as sections contains optional
+          ;; (extra) requirements.  Non-optional requirements must appear
+          ;; before any section is defined.
+          (if (or (eof-object? line) (section-header? line))
+              (reverse result)
+              (cond
+               ((or (string-null? line) (comment? line))
+                (loop result))
+               (else
+                (loop (cons (clean-requirement line)
+                            result))))))))))
+
 (define (guess-requirements source-url wheel-url tarball)
   "Given SOURCE-URL, WHEEL-URL and a TARBALL of the package, return a list
 of the required packages specified in the requirements.txt file.  TARBALL will
@@ -139,34 +183,6 @@ be extracted in a temporary directory."
 cannot determine package dependencies"))
           #f)))))
 
-  (define (clean-requirement s)
-    ;; Given a requirement LINE, as can be found in a Python requirements.txt
-    ;; file, remove everything other than the actual name of the required
-    ;; package, and return it.
-    (string-take s
-      (or (string-index s (lambda (chr) (member chr '(#\space #\> #\= #\<))))
-          (string-length s))))
-
-  (define (comment? line)
-    ;; Return #t if the given LINE is a comment, #f otherwise.
-    (eq? (string-ref (string-trim line) 0) #\#))
-
-  (define (read-requirements requirements-file)
-    ;; Given REQUIREMENTS-FILE, a Python requirements.txt file, return a list
-    ;; of name/variable pairs describing the requirements.
-    (call-with-input-file requirements-file
-      (lambda (port)
-        (let loop ((result '()))
-          (let ((line (read-line port)))
-            (if (eof-object? line)
-                result
-                (cond
-                 ((or (string-null? line) (comment? line))
-                  (loop result))
-                 (else
-                  (loop (cons (clean-requirement line)
-                              result))))))))))
-
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
     ;; requirements.
@@ -212,7 +228,7 @@ cannot determine package dependencies"))
                                               (current-output-port (%make-void-port "rw+")))
                                  (system* "tar" "xf" tarball "-C" dir requires.txt))))
                (if (zero? exit-code)
-                   (read-requirements (string-append dir "/" requires.txt))
+                   (parse-requires.txt (string-append dir "/" requires.txt))
                    (begin
                      (warning
                       (G_ "Failed to extract file: ~a from source.~%")
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 335be42644..e4b7142311 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -60,6 +60,15 @@ bar
 baz > 13.37
 ")
 
+(define test-requires-with-sections "\
+# A comment
+foo ~= 3
+bar != 2
+
+[test]
+pytest (>=2.5.0)
+")
+
 (define test-metadata
   "{
   \"run_requires\": [
@@ -99,6 +108,12 @@ baz > 13.37
                     (uri (list "https://bitheap.org/cram/cram-0.7.tar.gz"
                                (pypi-uri "cram" "0.7"))))))))
 
+(test-equal "parse-requires.txt, with sections"
+  '("foo" "bar")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-requires.txt test-requires-with-sections)))
+
 (test-assert "pypi->guix-package"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
@@ -110,7 +125,8 @@ baz > 13.37
                   (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                     (lambda ()
                       (display test-requires.txt)))
-                  (system* "tar" "czvf" file-name "foo-1.0.0/")
+                  (parameterize ((current-output-port (%make-void-port "rw+")))
+                    (system* "tar" "czvf" file-name "foo-1.0.0/"))
                   (delete-file-recursively "foo-1.0.0")
                   (set! test-source-hash
                     (call-with-input-file file-name port-sha256))))
@@ -161,7 +177,8 @@ baz > 13.37
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                    (lambda ()
                      (display test-requires.txt)))
-                 (system* "tar" "czvf" file-name "foo-1.0.0/")
+                (parameterize ((current-output-port (%make-void-port "rw+")))
+                  (system* "tar" "czvf" file-name "foo-1.0.0/"))
                  (delete-file-recursively "foo-1.0.0")
                  (set! test-source-hash
                        (call-with-input-file file-name port-sha256))))
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.4: 0003-import-pypi-Improve-parsing-of-requirement-specifica.patch --]
[-- Type: text/x-patch, Size: 5396 bytes --]

From 0c62b541a3e8925b5ca31fe55dbe7536cf95151f Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:01 -0400
Subject: [PATCH 3/9] import: pypi: Improve parsing of requirement
 specifications.

The previous solution was fragile and could leave unwanted characters in a
requirement name, such as '[' or ']'.

Partially fixes issue #33047 (see:
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=33047).

* guix/import/pypi.scm (use-modules): Export SPECIFICATION->REQUIREMENT-NAME
(%requirement-name-regexp): New variable.
(clean-requirement): Rename to...
(specification->requirement-name): this, which now uses
%requirement-name-regexp to select the requirement name from the requirement
specification.
(parse-requires.txt): Adapt.
---
 guix/import/pypi.scm | 43 ++++++++++++++++++++++++++++++++++---------
 tests/pypi.scm       | 12 ++++++++++++
 2 files changed, 46 insertions(+), 9 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 91e987e9f1..efb5939c78 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -48,6 +48,7 @@
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
   #:export (parse-requires.txt
+            specification->requirement-name
             guix-package->pypi-name
             pypi-recursive-import
             pypi->guix-package
@@ -118,13 +119,37 @@ package definition."
     ((package-inputs ...)
      `((propagated-inputs (,'quasiquote ,package-inputs))))))
 
-(define (clean-requirement s)
-  ;; Given a requirement LINE, as can be found in a setuptools requires.txt
-  ;; file, remove everything other than the actual name of the required
-  ;; package, and return it.
-  (string-take s (or (string-index s (lambda (chr)
-                                       (member chr '(#\space #\> #\= #\<))))
-                     (string-length s))))
+(define %requirement-name-regexp
+  ;; Regexp to match the requirement name in a requirement specification.
+
+  ;; Some grammar, taken from PEP-0508 (see:
+  ;; https://www.python.org/dev/peps/pep-0508/).
+
+  ;; The unified rule can be expressed as:
+  ;; specification = wsp* ( url_req | name_req ) wsp*
+
+  ;; where url_req is:
+  ;; url_req = name wsp* extras? wsp* urlspec wsp+ quoted_marker?
+
+  ;; and where name_req is:
+  ;; name_req = name wsp* extras? wsp* versionspec? wsp* quoted_marker?
+
+  ;; Thus, we need only matching NAME, which is expressed as:
+  ;; identifer_end = letterOrDigit | (('-' | '_' | '.' )* letterOrDigit)
+  ;; identifier    = letterOrDigit identifier_end*
+  ;; name          = identifier
+  (let* ((letter-or-digit "[A-Za-z0-9]")
+         (identifier-end (string-append "(" letter-or-digit "|"
+                                        "[-_.]*" letter-or-digit ")"))
+         (identifier (string-append "^" letter-or-digit identifier-end "*"))
+         (name identifier))
+    (make-regexp name)))
+
+(define (specification->requirement-name spec)
+  "Given a specification SPEC, return the requirement name."
+  (match:substring
+   (or (regexp-exec %requirement-name-regexp spec)
+       (error (G_ "Could not extract requirement name in spec:") spec))))
 
 (define (parse-requires.txt requires.txt)
   "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
@@ -158,7 +183,7 @@ requirement names."
                ((or (string-null? line) (comment? line))
                 (loop result))
                (else
-                (loop (cons (clean-requirement line)
+                (loop (cons (specification->requirement-name line)
                             result))))))))))
 
 (define (guess-requirements source-url wheel-url tarball)
@@ -200,7 +225,7 @@ cannot determine package dependencies"))
                                             (hash-ref (list-ref run_requires 0)
                                                        "requires")
                                             '())))
-                     (map clean-requirement requirements)))))
+                     (map specification->requirement-name requirements)))))
              (lambda ()
                (delete-file json-file)
                (rmdir dirname))))))
diff --git a/tests/pypi.scm b/tests/pypi.scm
index e4b7142311..82d6bba8dd 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -55,6 +55,14 @@
 (define test-source-hash
   "")
 
+(define test-specifications
+  '("Fizzy [foo, bar]"
+    "PickyThing<1.6,>1.9,!=1.9.6,<2.0a0,==2.4c1"
+    "SomethingWithMarker[foo]>1.0;python_version<\"2.7\""
+    "requests [security,tests] >= 2.8.1, == 2.8.* ; python_version < \"2.7\""
+    "pip @ https://github.com/pypa/pip/archive/1.3.1.zip#\
+sha1=da9234ee9982d4bbb3c72346a6de940a148ea686"))
+
 (define test-requires.txt "\
 bar
 baz > 13.37
@@ -108,6 +116,10 @@ pytest (>=2.5.0)
                     (uri (list "https://bitheap.org/cram/cram-0.7.tar.gz"
                                (pypi-uri "cram" "0.7"))))))))
 
+(test-equal "specification->requirement-name"
+  '("Fizzy" "PickyThing" "SomethingWithMarker" "requests" "pip")
+  (map specification->requirement-name test-specifications))
+
 (test-equal "parse-requires.txt, with sections"
   '("foo" "bar")
   (mock ((ice-9 ports) call-with-input-file
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.5: 0004-import-pypi-Deduplicate-requirements.patch --]
[-- Type: text/x-patch, Size: 1253 bytes --]

From 76e4a3150f8126e0b952c6129b6e1371afba80c0 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:01 -0400
Subject: [PATCH 4/9] import: pypi: Deduplicate requirements.

* guix/import/pypi.scm (parse-requires.txt): Remove potential duplicates.
---
 guix/import/pypi.scm | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index efb5939c78..a90be67bb0 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -178,7 +178,11 @@ requirement names."
           ;; (extra) requirements.  Non-optional requirements must appear
           ;; before any section is defined.
           (if (or (eof-object? line) (section-header? line))
-              (reverse result)
+              ;; Duplicates can occur, since the same requirement can be
+              ;; listed multiple times with different conditional markers, e.g.
+              ;; pytest >= 3 ; python_version >= "3.3"
+              ;; pytest < 3 ; python_version < "3.3"
+              (reverse (delete-duplicates result))
               (cond
                ((or (string-null? line) (comment? line))
                 (loop result))
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.6: 0005-import-pypi-Support-more-types-of-archives.patch --]
[-- Type: text/x-patch, Size: 5498 bytes --]

From 73e27235cac1275ba7671fd2364325cf5788cb3c Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:02 -0400
Subject: [PATCH 5/9] import: pypi: Support more types of archives.

This change enables the PyPI importer to look for requirements in a source
archive of a different type than "tar.gz" or "tar.bz2".

* guix/import/pypi.scm: (guess-requirements)[tarball-directory]: Rename to...
[archive-root-directory]: this. Use COMPRESSED-FILED? to determine if an
archive is supported or not.
[guess-requirements-from-source]: Adapt to use the new method, and use unzip
to extract ZIP archives.
(guess-requirements): Rename the TARBALL argument to ARCHIVE, to denote the
archive format is no longer bound specifically to the Tar format.
---
 guix/import/pypi.scm | 47 ++++++++++++++++++++++----------------------
 1 file changed, 24 insertions(+), 23 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index a90be67bb0..8e93653717 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -190,27 +190,24 @@ requirement names."
                 (loop (cons (specification->requirement-name line)
                             result))))))))))
 
-(define (guess-requirements source-url wheel-url tarball)
-  "Given SOURCE-URL, WHEEL-URL and a TARBALL of the package, return a list
-of the required packages specified in the requirements.txt file.  TARBALL will
+(define (guess-requirements source-url wheel-url archive)
+  "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
+of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
-  (define (tarball-directory url)
-    ;; Given the URL of the package's tarball, return the name of the directory
+  (define (archive-root-directory url)
+    ;; Given the URL of the package's archive, return the name of the directory
     ;; that will be created upon decompressing it. If the filetype is not
     ;; supported, return #f.
-    ;; TODO: Support more archive formats.
-    (let ((basename (substring url (+ 1 (string-rindex url #\/)))))
-      (cond
-       ((string-suffix? ".tar.gz" basename)
-        (string-drop-right basename 7))
-       ((string-suffix? ".tar.bz2" basename)
-        (string-drop-right basename 8))
-       (else
+    (if (compressed-file? url)
+        (let ((root-directory (file-sans-extension (basename url))))
+          (if (string=? "tar" (file-extension root-directory))
+              (file-sans-extension root-directory)
+              root-directory))
         (begin
-          (warning (G_ "Unsupported archive format: \
-cannot determine package dependencies"))
-          #f)))))
+          (warning (G_ "Unsupported archive format (~a): \
+cannot determine package dependencies") (file-extension url))
+          #f)))
 
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
@@ -246,16 +243,20 @@ cannot determine package dependencies"))
 
   (define (guess-requirements-from-source)
     ;; Return the package's requirements by guessing them from the source.
-    (let ((dirname (tarball-directory source-url)))
+    (let ((dirname (archive-root-directory source-url))
+          (extension (file-extension source-url)))
       (if (string? dirname)
           (call-with-temporary-directory
            (lambda (dir)
              (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
                     (requires.txt (string-append dirname "/" pypi-name
                                                  ".egg-info" "/requires.txt"))
-                    (exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
-                                              (current-output-port (%make-void-port "rw+")))
-                                 (system* "tar" "xf" tarball "-C" dir requires.txt))))
+                    (exit-code
+                     (parameterize ((current-error-port (%make-void-port "rw+"))
+                                    (current-output-port (%make-void-port "rw+")))
+                       (if (string=? "zip" extension)
+                           (system* "unzip" archive "-d" dir requires.txt)
+                           (system* "tar" "xf" archive "-C" dir requires.txt)))))
                (if (zero? exit-code)
                    (parse-requires.txt (string-append dir "/" requires.txt))
                    (begin
@@ -271,13 +272,13 @@ cannot determine package dependencies"))
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
-(define (compute-inputs source-url wheel-url tarball)
-  "Given the SOURCE-URL of an already downloaded TARBALL, return a list of
+(define (compute-inputs source-url wheel-url archive)
+  "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
 name/variable pairs describing the required inputs of this package.  Also
 return the unaltered list of upstream dependency names."
   (let ((dependencies
          (remove (cut string=? "argparse" <>)
-                 (guess-requirements source-url wheel-url tarball))))
+                 (guess-requirements source-url wheel-url archive))))
     (values (sort
              (map (lambda (input)
                     (let ((guix-name (python->package-name input)))
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.7: 0006-import-pypi-Parse-wheel-METADATA-instead-of-metadata.patch --]
[-- Type: text/x-patch, Size: 9785 bytes --]

From fb0547ef225103c0f8355a7eccc41e0d028f6563 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:03 -0400
Subject: [PATCH 6/9] import: pypi: Parse wheel METADATA instead of
 metadata.json.

With newer Wheel releases, there is no more metadata.json file; the METADATA
file should be used instead (see: https://github.com/pypa/wheel/issues/195).

This change updates our PyPI importer so that it uses the later.

* guix/import/pypi.scm (define-module): Remove unnecessary modules and export
  the PARSE-WHEEL-METADATA method.
(parse-wheel-metadata): Add method.
(guess-requirements): Use it.
* tests/pypi.scm (test-metadata): Test it.
---
 guix/import/pypi.scm | 66 +++++++++++++++++++++++++++++---------------
 tests/pypi.scm       | 60 ++++++++++++++++++++++++++++++----------
 2 files changed, 89 insertions(+), 37 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 8e93653717..c520213b6a 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -21,9 +21,7 @@
 ;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
 
 (define-module (guix import pypi)
-  #:use-module (ice-9 binary-ports)
   #:use-module (ice-9 match)
-  #:use-module (ice-9 pretty-print)
   #:use-module (ice-9 regex)
   #:use-module (ice-9 receive)
   #:use-module ((ice-9 rdelim) #:select (read-line))
@@ -31,9 +29,6 @@
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
   #:use-module (srfi srfi-35)
-  #:use-module (rnrs bytevectors)
-  #:use-module (json)
-  #:use-module (web uri)
   #:use-module (guix ui)
   #:use-module (guix utils)
   #:use-module ((guix build utils)
@@ -48,6 +43,7 @@
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
   #:export (parse-requires.txt
+            parse-wheel-metadata
             specification->requirement-name
             guix-package->pypi-name
             pypi-recursive-import
@@ -190,6 +186,37 @@ requirement names."
                 (loop (cons (specification->requirement-name line)
                             result))))))))))
 
+(define (parse-wheel-metadata metadata)
+  "Given METADATA, a Wheel metadata file, return a list of requirement names."
+  ;; METADATA is a RFC-2822-like, header based file.
+
+  (define (requires-dist-header? line)
+    ;; Return #t if the given LINE is a Requires-Dist header.
+    (regexp-match? (string-match "^Requires-Dist: " line)))
+
+  (define (requires-dist-value line)
+    (string-drop line (string-length "Requires-Dist: ")))
+
+  (define (extra? line)
+    ;; Return #t if the given LINE is an "extra" requirement.
+    (regexp-match? (string-match "extra == " line)))
+
+  (call-with-input-file metadata
+    (lambda (port)
+      (let loop ((requirements '()))
+        (let ((line (read-line port)))
+          ;; Stop at the first 'Provides-Extra' section: the non-optional
+          ;; requirements appear before the optional ones.
+          (if (eof-object? line)
+              (reverse (delete-duplicates requirements))
+              (cond
+               ((and (requires-dist-header? line) (not (extra? line)))
+                (loop (cons (specification->requirement-name
+                             (requires-dist-value line))
+                            requirements)))
+               (else
+                (loop requirements)))))))))
+
 (define (guess-requirements source-url wheel-url archive)
   "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
 of the required packages specified in the requirements.txt file.  ARCHIVE will
@@ -211,25 +238,18 @@ cannot determine package dependencies") (file-extension url))
 
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
-    ;; requirements.
+    ;; requirements, or #f if the metadata file contained therein couldn't be
+    ;; extracted.
     (let* ((dirname (wheel-url->extracted-directory wheel-url))
-           (json-file (string-append dirname "/metadata.json")))
-      (and (zero? (system* "unzip" "-q" wheel-archive json-file))
-           (dynamic-wind
-             (const #t)
-             (lambda ()
-               (call-with-input-file json-file
-                 (lambda (port)
-                   (let* ((metadata (json->scm port))
-                          (run_requires (hash-ref metadata "run_requires"))
-                          (requirements (if run_requires
-                                            (hash-ref (list-ref run_requires 0)
-                                                       "requires")
-                                            '())))
-                     (map specification->requirement-name requirements)))))
-             (lambda ()
-               (delete-file json-file)
-               (rmdir dirname))))))
+           (metadata (string-append dirname "/METADATA")))
+      (call-with-temporary-directory
+       (lambda (dir)
+         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
+             (parse-wheel-metadata (string-append dir "/" metadata))
+             (begin
+               (warning
+                (G_ "Failed to extract file: ~a from wheel.~%") metadata)
+               #f))))))
 
   (define (guess-requirements-from-wheel)
     ;; Return the package's requirements using the wheel, or #f if an error
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 82d6bba8dd..ca8cb5f6de 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -21,6 +21,7 @@
   #:use-module (guix import pypi)
   #:use-module (guix base32)
   #:use-module (gcrypt hash)
+  #:use-module (guix memoization)
   #:use-module (guix tests)
   #:use-module (guix build-system python)
   #:use-module ((guix build utils) #:select (delete-file-recursively which mkdir-p))
@@ -77,17 +78,33 @@ bar != 2
 pytest (>=2.5.0)
 ")
 
-(define test-metadata
-  "{
-  \"run_requires\": [
-    {
-      \"requires\": [
-        \"bar\",
-        \"baz (>13.37)\"
-      ]
-    }
-  ]
-}")
+(define test-metadata "\
+Classifier: Programming Language :: Python :: 3.7
+Requires-Dist: baz ~= 3
+Requires-Dist: bar != 2
+Provides-Extra: test
+pytest (>=2.5.0)
+")
+
+(define test-metadata-with-extras "
+Classifier: Programming Language :: Python :: 3.7
+Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*
+Requires-Dist: wrapt (<2,>=1)
+Requires-Dist: bar
+
+Provides-Extra: dev
+Requires-Dist: tox ; extra == 'dev'
+Requires-Dist: bumpversion (<1) ; extra == 'dev'
+")
+
+;;; Provides-Extra can appear before Requires-Dist.
+(define test-metadata-with-extras-jedi "\
+Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*
+Provides-Extra: testing
+Requires-Dist: parso (>=0.3.0)
+Provides-Extra: testing
+Requires-Dist: pytest (>=3.1.0); extra == 'testing'
+")
 
 (test-begin "pypi")
 
@@ -126,6 +143,18 @@ pytest (>=2.5.0)
          call-with-input-string)
         (parse-requires.txt test-requires-with-sections)))
 
+(test-equal "parse-wheel-metadata, with extras"
+  '("wrapt" "bar")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-wheel-metadata test-metadata-with-extras)))
+
+(test-equal "parse-wheel-metadata, with extras - Jedi"
+  '("parso")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-wheel-metadata test-metadata-with-extras-jedi)))
+
 (test-assert "pypi->guix-package"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
@@ -188,7 +217,7 @@ pytest (>=2.5.0)
                 (mkdir-p "foo-1.0.0/foo.egg-info/")
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                    (lambda ()
-                     (display test-requires.txt)))
+                     (display "wrong data to make sure we're testing wheels ")))
                 (parameterize ((current-output-port (%make-void-port "rw+")))
                   (system* "tar" "czvf" file-name "foo-1.0.0/"))
                  (delete-file-recursively "foo-1.0.0")
@@ -197,13 +226,13 @@ pytest (>=2.5.0)
              ("https://example.com/foo-1.0.0-py2.py3-none-any.whl"
                (begin
                  (mkdir "foo-1.0.0.dist-info")
-                 (with-output-to-file "foo-1.0.0.dist-info/metadata.json"
+                 (with-output-to-file "foo-1.0.0.dist-info/METADATA"
                    (lambda ()
                      (display test-metadata)))
                  (let ((zip-file (string-append file-name ".zip")))
                    ;; zip always adds a "zip" extension to the file it creates,
                    ;; so we need to rename it.
-                   (system* "zip" zip-file "foo-1.0.0.dist-info/metadata.json")
+                   (system* "zip" zip-file "foo-1.0.0.dist-info/METADATA")
                    (rename-file zip-file file-name))
                  (delete-file-recursively "foo-1.0.0.dist-info")))
              (_ (error "Unexpected URL: " url)))))
@@ -215,6 +244,9 @@ pytest (>=2.5.0)
                             (string-length test-json)))
                    ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
                    (_ (error "Unexpected URL: " url)))))
+              ;; Not clearing the memoization cache here would mean returning the value
+              ;; computed in the previous test.
+              (invalidate-memoization! pypi->guix-package)
               (match (pypi->guix-package "foo")
                 (('package
                    ('name "python-foo")
-- 
2.21.0


[-- Attachment #1.8: 0007-import-pypi-Include-optional-test-inputs-as-native-i.patch --]
[-- Type: text/x-patch, Size: 23897 bytes --]

From 37e499d5d5d5f690aa0a065c730e13f6a31dd30d Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 23:12:26 -0400
Subject: [PATCH 7/9] import: pypi: Include optional test inputs as
 native-inputs.

* guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
(test-section?): New predicate.
(parse-requires.txt): Collect the optional test inputs, and return them as the
second element of the returned list.
(parse-wheel-metadata): Likewise.
(guess-requirements): Adapt, and hide unzip output.
(make-pypi-sexp): Likewise, and include the test inputs requirements as native
inputs in the returned package expression.

* tests/pypi.scm (test-requires.txt): Include a test section in the
test-requires.txt data.
(test-requires.txt-beaker): New variable.
("parse-requires.txt"): Adapt.
("parse-requires.txt - Beaker"): New test.
("parse-wheel-metadata, with extras"): Adapt.
("parse-wheel-metadata, with extras - Jedi"): Adapt.
("pypi->guix-package, no wheel"): Re-indent, and add the expected
native-inputs.
("pypi->guix-package, wheels"): Likewise.
("pypi->guix-package, no usable requirement file."): New test.
---
 guix/import/pypi.scm | 195 ++++++++++++++++++++++++++++---------------
 tests/pypi.scm       | 123 ++++++++++++++++++++-------
 2 files changed, 222 insertions(+), 96 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index c520213b6a..099768f0c8 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -4,6 +4,7 @@
 ;;; Copyright © 2015, 2016, 2017 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2017 Mathieu Othacehe <m.othacehe@gmail.com>
 ;;; Copyright © 2018 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -26,6 +27,7 @@
   #:use-module (ice-9 receive)
   #:use-module ((ice-9 rdelim) #:select (read-line))
   #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-11)
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
   #:use-module (srfi srfi-35)
@@ -106,14 +108,15 @@ package on PyPI."
     ((name version _ ...)
      (string-append name "-" version ".dist-info"))))
 
-(define (maybe-inputs package-inputs)
+(define (maybe-inputs package-inputs input-type)
   "Given a list of PACKAGE-INPUTS, tries to generate the 'inputs' field of a
-package definition."
+package definition.  INPUT-TYPE, a symbol, is used to populate the name of
+the input field."
   (match package-inputs
     (()
      '())
     ((package-inputs ...)
-     `((propagated-inputs (,'quasiquote ,package-inputs))))))
+     `((,input-type (,'quasiquote ,package-inputs))))))
 
 (define %requirement-name-regexp
   ;; Regexp to match the requirement name in a requirement specification.
@@ -147,11 +150,21 @@ package definition."
    (or (regexp-exec %requirement-name-regexp spec)
        (error (G_ "Could not extract requirement name in spec:") spec))))
 
+(define (test-section? name)
+  "Return #t if the section name contains 'test' or 'dev'."
+  (any (cut string-contains-ci name <>)
+       '("test" "dev")))
+
 (define (parse-requires.txt requires.txt)
-  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
-requirement names."
-  ;; This is a very incomplete parser, which job is to select the non-optional
-  ;; dependencies and strip them out of any version information.
+  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a pair of requirements.
+
+The first element of the pair contains the required dependencies while the
+second the optional test dependencies.  Note that currently, optional,
+non-test dependencies are omitted since these can be difficult or expensive to
+satisfy."
+
+  ;; This is a very incomplete parser, which job is to read in the requirement
+  ;; specification lines, and strip them out of any version information.
   ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
   ;; library and the requirements grammar defined by PEP-0508
   ;; (https://www.python.org/dev/peps/pep-0508/).
@@ -168,57 +181,89 @@ requirement names."
 
   (call-with-input-file requires.txt
     (lambda (port)
-      (let loop ((result '()))
+      (let loop ((required-deps '())
+                 (test-deps '())
+                 (inside-test-section? #f)
+                 (optional? #f))
         (let ((line (read-line port)))
-          ;; Stop when a section is encountered, as sections contains optional
-          ;; (extra) requirements.  Non-optional requirements must appear
-          ;; before any section is defined.
-          (if (or (eof-object? line) (section-header? line))
+          (if (eof-object? line)
               ;; Duplicates can occur, since the same requirement can be
               ;; listed multiple times with different conditional markers, e.g.
               ;; pytest >= 3 ; python_version >= "3.3"
               ;; pytest < 3 ; python_version < "3.3"
-              (reverse (delete-duplicates result))
+              (map (compose reverse delete-duplicates)
+                   (list required-deps test-deps))
               (cond
                ((or (string-null? line) (comment? line))
-                (loop result))
-               (else
+                (loop required-deps test-deps inside-test-section? optional?))
+               ((section-header? line)
+                ;; Encountering a section means that all the requirements
+                ;; listed below are optional. Since we want to pick only the
+                ;; test dependencies from the optional dependencies, we must
+                ;; track those separately.
+                (loop required-deps test-deps (test-section? line) #t))
+               (inside-test-section?
+                (loop required-deps
+                      (cons (specification->requirement-name line)
+                            test-deps)
+                      inside-test-section? optional?))
+               ((not optional?)
                 (loop (cons (specification->requirement-name line)
-                            result))))))))))
+                            required-deps)
+                      test-deps inside-test-section? optional?))
+               (optional?
+                ;; Skip optional items.
+                (loop required-deps test-deps inside-test-section? optional?))
+               (else
+                (warning (G_ "parse-requires.txt reached an unexpected \
+condition on line ~a~%") line)))))))))
 
 (define (parse-wheel-metadata metadata)
-  "Given METADATA, a Wheel metadata file, return a list of requirement names."
+  "Given METADATA, a Wheel metadata file, return a pair of requirements.
+
+The first element of the pair contains the required dependencies while the second the optional
+test dependencies.  Note that currently, optional, non-test dependencies are
+omitted since these can be difficult or expensive to satisfy."
   ;; METADATA is a RFC-2822-like, header based file.
 
   (define (requires-dist-header? line)
     ;; Return #t if the given LINE is a Requires-Dist header.
-    (regexp-match? (string-match "^Requires-Dist: " line)))
+    (string-match "^Requires-Dist: " line))
 
   (define (requires-dist-value line)
     (string-drop line (string-length "Requires-Dist: ")))
 
   (define (extra? line)
     ;; Return #t if the given LINE is an "extra" requirement.
-    (regexp-match? (string-match "extra == " line)))
+    (string-match "extra == '(.*)'" line))
+
+  (define (test-requirement? line)
+    (let ((extra-label (match:substring (extra? line) 1)))
+      (and extra-label (test-section? extra-label))))
 
   (call-with-input-file metadata
     (lambda (port)
-      (let loop ((requirements '()))
+      (let loop ((required-deps '())
+                 (test-deps '()))
         (let ((line (read-line port)))
-          ;; Stop at the first 'Provides-Extra' section: the non-optional
-          ;; requirements appear before the optional ones.
           (if (eof-object? line)
-              (reverse (delete-duplicates requirements))
+              (map (compose reverse delete-duplicates)
+                   (list required-deps test-deps))
               (cond
                ((and (requires-dist-header? line) (not (extra? line)))
                 (loop (cons (specification->requirement-name
                              (requires-dist-value line))
-                            requirements)))
+                            required-deps)
+                      test-deps))
+               ((and (requires-dist-header? line) (test-requirement? line))
+                (loop required-deps
+                      (cons (specification->requirement-name (requires-dist-value line))
+                            test-deps)))
                (else
-                (loop requirements)))))))))
+                (loop required-deps test-deps))))))))) ;skip line
 
 (define (guess-requirements source-url wheel-url archive)
-  "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
+  "Given SOURCE-URL, WHEEL-URL and an ARCHIVE of the package, return a list
 of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
@@ -244,7 +289,10 @@ cannot determine package dependencies") (file-extension url))
            (metadata (string-append dirname "/METADATA")))
       (call-with-temporary-directory
        (lambda (dir)
-         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
+         (if (zero?
+              (parameterize ((current-error-port (%make-void-port "rw+"))
+                             (current-output-port (%make-void-port "rw+")))
+                (system* "unzip" wheel-archive "-d" dir metadata)))
              (parse-wheel-metadata (string-append dir "/" metadata))
              (begin
                (warning
@@ -283,32 +331,41 @@ cannot determine package dependencies") (file-extension url))
                      (warning
                       (G_ "Failed to extract file: ~a from source.~%")
                       requires.txt)
-                     '())))))
-          '())))
+                     (list '() '()))))))
+          (list '() '()))))
 
   ;; First, try to compute the requirements using the wheel, else, fallback to
   ;; reading the "requires.txt" from the egg-info directory from the source
-  ;; tarball.
+  ;; archive.
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
 (define (compute-inputs source-url wheel-url archive)
-  "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
-name/variable pairs describing the required inputs of this package.  Also
+  "Given the SOURCE-URL and WHEEL-URL of an already downloaded ARCHIVE, return
+a pair of lists, each consisting of a list of name/variable pairs, for the
+propagated inputs and the native inputs, respectively.  Also
 return the unaltered list of upstream dependency names."
-  (let ((dependencies
-         (remove (cut string=? "argparse" <>)
-                 (guess-requirements source-url wheel-url archive))))
-    (values (sort
-             (map (lambda (input)
-                    (let ((guix-name (python->package-name input)))
-                      (list guix-name (list 'unquote (string->symbol guix-name)))))
-                  dependencies)
-             (lambda args
-               (match args
-                 (((a _ ...) (b _ ...))
-                  (string-ci<? a b)))))
-            dependencies)))
+
+  (define (strip-argparse deps)
+    (remove (cut string=? "argparse" <>) deps))
+
+  (define (requirement->package-name/sort deps)
+    (sort
+     (map (lambda (input)
+            (let ((guix-name (python->package-name input)))
+              (list guix-name (list 'unquote (string->symbol guix-name)))))
+          deps)
+     (lambda args
+       (match args
+         (((a _ ...) (b _ ...))
+          (string-ci<? a b))))))
+
+  (define process-requirements
+    (compose requirement->package-name/sort strip-argparse))
+
+  (let ((dependencies (guess-requirements source-url wheel-url archive)))
+    (values (map process-requirements dependencies)
+            (concatenate dependencies))))
 
 (define (make-pypi-sexp name version source-url wheel-url home-page synopsis
                         description license)
@@ -317,29 +374,33 @@ VERSION, SOURCE-URL, HOME-PAGE, SYNOPSIS, DESCRIPTION, and LICENSE."
   (call-with-temporary-output-file
    (lambda (temp port)
      (and (url-fetch source-url temp)
-          (receive (input-package-names upstream-dependency-names)
+          (receive (guix-dependencies upstream-dependencies)
               (compute-inputs source-url wheel-url temp)
-            (values
-             `(package
-                (name ,(python->package-name name))
-                (version ,version)
-                (source (origin
-                          (method url-fetch)
-
-                          ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
-                          ;; cases in NAME, for instance, as is the case with
-                          ;; "uwsgi".  In that case, fall back to a full URL.
-                          (uri (pypi-uri ,(string-downcase name) version))
-                          (sha256
-                           (base32
-                            ,(guix-hash-url temp)))))
-                (build-system python-build-system)
-                ,@(maybe-inputs input-package-names)
-                (home-page ,home-page)
-                (synopsis ,synopsis)
-                (description ,description)
-                (license ,(license->symbol license)))
-             upstream-dependency-names))))))
+            (match guix-dependencies
+              ((required-inputs test-inputs)
+               (values
+                `(package
+                   (name ,(python->package-name name))
+                   (version ,version)
+                   (source (origin
+                             (method url-fetch)
+                             ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
+                             ;; cases in NAME, for instance, as is the case with
+                             ;; "uwsgi".  In that case, fall back to a full URL.
+                             (uri (pypi-uri ,(string-downcase name) version))
+                             (sha256
+                              (base32
+                               ,(guix-hash-url temp)))))
+                   (build-system python-build-system)
+                   ,@(maybe-inputs required-inputs 'propagated-inputs)
+                   ,@(maybe-inputs test-inputs 'native-inputs)
+                   (home-page ,home-page)
+                   (synopsis ,synopsis)
+                   (description ,description)
+                   (license ,(license->symbol license)))
+                ;; Flatten the nested lists and return the upstream
+                ;; dependencies.
+                upstream-dependencies))))))))
 
 (define pypi->guix-package
   (memoize
diff --git a/tests/pypi.scm b/tests/pypi.scm
index ca8cb5f6de..aa08e2cb54 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -1,6 +1,7 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2014 David Thompson <davet@gnu.org>
 ;;; Copyright © 2016 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -65,11 +66,6 @@
 sha1=da9234ee9982d4bbb3c72346a6de940a148ea686"))
 
 (define test-requires.txt "\
-bar
-baz > 13.37
-")
-
-(define test-requires-with-sections "\
 # A comment
 foo ~= 3
 bar != 2
@@ -78,12 +74,25 @@ bar != 2
 pytest (>=2.5.0)
 ")
 
+;; Beaker contains only optional dependencies.
+(define test-requires.txt-beaker "\
+[crypto]
+pycryptopp>=0.5.12
+
+[cryptography]
+cryptography
+
+[testsuite]
+Mock
+coverage
+")
+
 (define test-metadata "\
 Classifier: Programming Language :: Python :: 3.7
 Requires-Dist: baz ~= 3
 Requires-Dist: bar != 2
 Provides-Extra: test
-pytest (>=2.5.0)
+Requires-Dist: pytest (>=2.5.0) ; extra == 'test'
 ")
 
 (define test-metadata-with-extras "
@@ -137,25 +146,31 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
   '("Fizzy" "PickyThing" "SomethingWithMarker" "requests" "pip")
   (map specification->requirement-name test-specifications))
 
-(test-equal "parse-requires.txt, with sections"
-  '("foo" "bar")
+(test-equal "parse-requires.txt"
+  (list '("foo" "bar") '("pytest"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
-        (parse-requires.txt test-requires-with-sections)))
+        (parse-requires.txt test-requires.txt)))
+
+(test-equal "parse-requires.txt - Beaker"
+  (list '() '("Mock" "coverage"))
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-requires.txt test-requires.txt-beaker)))
 
 (test-equal "parse-wheel-metadata, with extras"
-  '("wrapt" "bar")
+  (list '("wrapt" "bar") '("tox" "bumpversion"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
         (parse-wheel-metadata test-metadata-with-extras)))
 
 (test-equal "parse-wheel-metadata, with extras - Jedi"
-  '("parso")
+  (list '("parso") '("pytest"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
         (parse-wheel-metadata test-metadata-with-extras-jedi)))
 
-(test-assert "pypi->guix-package"
+(test-assert "pypi->guix-package, no wheel"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
            (lambda (url file-name)
@@ -195,7 +210,10 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                      ('propagated-inputs
                       ('quasiquote
                        (("python-bar" ('unquote 'python-bar))
-                        ("python-baz" ('unquote 'python-baz)))))
+                        ("python-foo" ('unquote 'python-foo)))))
+                     ('native-inputs
+                      ('quasiquote
+                       (("python-pytest" ('unquote 'python-pytest)))))
                      ('home-page "http://example.com")
                      ('synopsis "summary")
                      ('description "summary")
@@ -216,25 +234,25 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
               (begin
                 (mkdir-p "foo-1.0.0/foo.egg-info/")
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
-                   (lambda ()
-                     (display "wrong data to make sure we're testing wheels ")))
+                  (lambda ()
+                    (display "wrong data to make sure we're testing wheels ")))
                 (parameterize ((current-output-port (%make-void-port "rw+")))
                   (system* "tar" "czvf" file-name "foo-1.0.0/"))
-                 (delete-file-recursively "foo-1.0.0")
-                 (set! test-source-hash
-                       (call-with-input-file file-name port-sha256))))
+                (delete-file-recursively "foo-1.0.0")
+                (set! test-source-hash
+                  (call-with-input-file file-name port-sha256))))
              ("https://example.com/foo-1.0.0-py2.py3-none-any.whl"
-               (begin
-                 (mkdir "foo-1.0.0.dist-info")
-                 (with-output-to-file "foo-1.0.0.dist-info/METADATA"
-                   (lambda ()
-                     (display test-metadata)))
-                 (let ((zip-file (string-append file-name ".zip")))
-                   ;; zip always adds a "zip" extension to the file it creates,
-                   ;; so we need to rename it.
-                   (system* "zip" zip-file "foo-1.0.0.dist-info/METADATA")
-                   (rename-file zip-file file-name))
-                 (delete-file-recursively "foo-1.0.0.dist-info")))
+              (begin
+                (mkdir "foo-1.0.0.dist-info")
+                (with-output-to-file "foo-1.0.0.dist-info/METADATA"
+                  (lambda ()
+                    (display test-metadata)))
+                (let ((zip-file (string-append file-name ".zip")))
+                  ;; zip always adds a "zip" extension to the file it creates,
+                  ;; so we need to rename it.
+                  (system* "zip" "-q" zip-file "foo-1.0.0.dist-info/METADATA")
+                  (rename-file zip-file file-name))
+                (delete-file-recursively "foo-1.0.0.dist-info")))
              (_ (error "Unexpected URL: " url)))))
         (mock ((guix http-client) http-fetch
                (lambda (url . rest)
@@ -262,6 +280,9 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                     ('quasiquote
                      (("python-bar" ('unquote 'python-bar))
                       ("python-baz" ('unquote 'python-baz)))))
+                   ('native-inputs
+                    ('quasiquote
+                     (("python-pytest" ('unquote 'python-pytest)))))
                    ('home-page "http://example.com")
                    ('synopsis "summary")
                    ('description "summary")
@@ -272,4 +293,48 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                 (x
                  (pk 'fail x #f))))))
 
+(test-assert "pypi->guix-package, no usable requirement file."
+  ;; Replace network resources with sample data.
+  (mock ((guix import utils) url-fetch
+         (lambda (url file-name)
+           (match url
+             ("https://example.com/foo-1.0.0.tar.gz"
+              (set! test-source-hash
+                (call-with-input-file file-name port-sha256))
+              #t)
+             ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #t)
+             (_ (error "Unexpected URL: " url)))))
+        (mock ((guix http-client) http-fetch
+               (lambda (url . rest)
+                 (match url
+                   ("https://pypi.org/pypi/foo/json"
+                    (values (open-input-string test-json)
+                            (string-length test-json)))
+                   ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
+                   (_ (error "Unexpected URL: " url)))))
+              ;; Not clearing the memoization cache here would mean returning the value
+              ;; computed in the previous test.
+              (invalidate-memoization! pypi->guix-package)
+              (match (pypi->guix-package "foo")
+                (('package
+                   ('name "python-foo")
+                   ('version "1.0.0")
+                   ('source ('origin
+                              ('method 'url-fetch)
+                              ('uri ('pypi-uri "foo" 'version))
+                              ('sha256
+                               ('base32
+                                (? string? hash)))))
+                   ('build-system 'python-build-system)
+                   ('home-page "http://example.com")
+                   ('synopsis "summary")
+                   ('description "summary")
+                   ('license 'license:lgpl2.0))
+                 (string=? (bytevector->nix-base32-string
+                            test-source-hash)
+                           hash))
+                (x
+                 (pk 'fail x #f)))
+              )))
+
 (test-end "pypi")
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.9: 0008-import-pypi-Scan-source-archive-to-find-requires.txt.patch --]
[-- Type: text/x-patch, Size: 7544 bytes --]

From cfde6e09f8f8c692fe252d76ed27e8c50a9e5377 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Sat, 30 Mar 2019 23:13:26 -0400
Subject: [PATCH 8/9] import: pypi: Scan source archive to find requires.txt
 file.

* guix/import/pypi.scm (use-modules): Use invoke from (guix build utils).
(guess-requirements)[archive-root-directory]: Remove procedure.
[guess-requirements-from-wheel]: Re-ident.
[guess-requirements-from-source]: Search for the requires.txt file in the
archive instead of using a static, expected location.
* tests/pypi.scm ("pypi->guix-package, no wheel"): Mock the requires.txt at a
non-standard location to test the new feature.
("pypi->guix-package, no usable requirement file."): Adapt.
---
 guix/import/pypi.scm | 65 ++++++++++++++++++--------------------------
 tests/pypi.scm       | 17 +++++++-----
 2 files changed, 37 insertions(+), 45 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 099768f0c8..a2ce14b192 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -36,7 +36,8 @@
   #:use-module ((guix build utils)
                 #:select ((package-name->name+version
                            . hyphen-package-name->name+version)
-                          find-files))
+                          find-files
+                          invoke))
   #:use-module (guix import utils)
   #:use-module ((guix download) #:prefix download:)
   #:use-module (guix import json)
@@ -267,19 +268,6 @@ omitted since these can be difficult or expensive to satisfy."
 of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
-  (define (archive-root-directory url)
-    ;; Given the URL of the package's archive, return the name of the directory
-    ;; that will be created upon decompressing it. If the filetype is not
-    ;; supported, return #f.
-    (if (compressed-file? url)
-        (let ((root-directory (file-sans-extension (basename url))))
-          (if (string=? "tar" (file-extension root-directory))
-              (file-sans-extension root-directory)
-              root-directory))
-        (begin
-          (warning (G_ "Unsupported archive format (~a): \
-cannot determine package dependencies") (file-extension url))
-          #f)))
 
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
@@ -305,33 +293,34 @@ cannot determine package dependencies") (file-extension url))
     (call-with-temporary-output-file
      (lambda (temp port)
        (if wheel-url
-         (and (url-fetch wheel-url temp)
-              (read-wheel-metadata temp))
-         #f))))
+           (and (url-fetch wheel-url temp)
+                (read-wheel-metadata temp))
+           #f))))
 
   (define (guess-requirements-from-source)
     ;; Return the package's requirements by guessing them from the source.
-    (let ((dirname (archive-root-directory source-url))
-          (extension (file-extension source-url)))
-      (if (string? dirname)
-          (call-with-temporary-directory
-           (lambda (dir)
-             (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
-                    (requires.txt (string-append dirname "/" pypi-name
-                                                 ".egg-info" "/requires.txt"))
-                    (exit-code
-                     (parameterize ((current-error-port (%make-void-port "rw+"))
-                                    (current-output-port (%make-void-port "rw+")))
-                       (if (string=? "zip" extension)
-                           (system* "unzip" archive "-d" dir requires.txt)
-                           (system* "tar" "xf" archive "-C" dir requires.txt)))))
-               (if (zero? exit-code)
-                   (parse-requires.txt (string-append dir "/" requires.txt))
-                   (begin
-                     (warning
-                      (G_ "Failed to extract file: ~a from source.~%")
-                      requires.txt)
-                     (list '() '()))))))
+    (if (compressed-file? source-url)
+        (call-with-temporary-directory
+         (lambda (dir)
+           (parameterize ((current-error-port (%make-void-port "rw+"))
+                          (current-output-port (%make-void-port "rw+")))
+             (if (string=? "zip" (file-extension source-url))
+                 (invoke "unzip" archive "-d" dir)
+                 (invoke "tar" "xf" archive "-C" dir)))
+           (let ((requires.txt-files
+                  (find-files dir (lambda (abs-file-name _)
+		                    (string-match "\\.egg-info/requires.txt$"
+                                                  abs-file-name)))))
+             (if (> (length requires.txt-files) 0)
+                 (begin
+                   (parse-requires.txt (first requires.txt-files)))
+                 (begin (warning (G_ "Cannot guess requirements from source archive:\
+ no requires.txt file found.~%"))
+                        (list '() '()))))))
+        (begin
+          (warning (G_ "Unsupported archive format; \
+cannot determine package dependencies from source archive: ~a~%")
+                   (basename source-url))
           (list '() '()))))
 
   ;; First, try to compute the requirements using the wheel, else, fallback to
diff --git a/tests/pypi.scm b/tests/pypi.scm
index aa08e2cb54..ad188df16c 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -177,8 +177,9 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
              (match url
                ("https://example.com/foo-1.0.0.tar.gz"
                 (begin
-                  (mkdir-p "foo-1.0.0/foo.egg-info/")
-                  (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
+                  ;; Unusual requires.txt location should still be found.
+                  (mkdir-p "foo-1.0.0/src/bizarre.egg-info")
+                  (with-output-to-file "foo-1.0.0/src/bizarre.egg-info/requires.txt"
                     (lambda ()
                       (display test-requires.txt)))
                   (parameterize ((current-output-port (%make-void-port "rw+")))
@@ -299,10 +300,13 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
          (lambda (url file-name)
            (match url
              ("https://example.com/foo-1.0.0.tar.gz"
+              (mkdir-p "foo-1.0.0/foo.egg-info/")
+              (parameterize ((current-output-port (%make-void-port "rw+")))
+                (system* "tar" "czvf" file-name "foo-1.0.0/"))
+              (delete-file-recursively "foo-1.0.0")
               (set! test-source-hash
-                (call-with-input-file file-name port-sha256))
-              #t)
-             ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #t)
+                (call-with-input-file file-name port-sha256)))
+             ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
              (_ (error "Unexpected URL: " url)))))
         (mock ((guix http-client) http-fetch
                (lambda (url . rest)
@@ -334,7 +338,6 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                             test-source-hash)
                            hash))
                 (x
-                 (pk 'fail x #f)))
-              )))
+                 (pk 'fail x #f))))))
 
 (test-end "pypi")
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.10: 0009-import-pypi-Preserve-package-name-case-when-forming-.patch --]
[-- Type: text/x-patch, Size: 3185 bytes --]

From 1290f9d1f0d594fdd4723d76b94116be25da9dd5 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Sat, 30 Mar 2019 20:27:35 -0400
Subject: [PATCH 9/9] import: pypi: Preserve package name case when forming
 pypi-uri.

Fixes issue: #33046.

* guix/build-system/python.scm (pypi-uri): Update the host URI to
"files.pythonhosted.org".
* guix/import/pypi.scm (make-pypi-sexp): Preserve the package name case when
the source URL calls for it.
---
 guix/build-system/python.scm |  2 +-
 guix/import/pypi.scm         | 23 ++++++++++++++---------
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/guix/build-system/python.scm b/guix/build-system/python.scm
index b753940bad..e39c06528e 100644
--- a/guix/build-system/python.scm
+++ b/guix/build-system/python.scm
@@ -50,7 +50,7 @@
   "Return a URI string for the Python package hosted on the Python Package
 Index (PyPI) corresponding to NAME and VERSION.  EXTENSION is the file name
 extension, such as '.tar.gz'."
-  (string-append "https://pypi.org/packages/source/"
+  (string-append "https://files.pythonhosted.org/packages/source/"
                  (string-take name 1) "/" name "/"
                  name "-" version extension))
 
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index a2ce14b192..fecf95d0a7 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -371,15 +371,20 @@ VERSION, SOURCE-URL, HOME-PAGE, SYNOPSIS, DESCRIPTION, and LICENSE."
                 `(package
                    (name ,(python->package-name name))
                    (version ,version)
-                   (source (origin
-                             (method url-fetch)
-                             ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
-                             ;; cases in NAME, for instance, as is the case with
-                             ;; "uwsgi".  In that case, fall back to a full URL.
-                             (uri (pypi-uri ,(string-downcase name) version))
-                             (sha256
-                              (base32
-                               ,(guix-hash-url temp)))))
+                   (source
+                    (origin
+                      (method url-fetch)
+                      ;; PyPI URL are case sensitive, but sometimes a project
+                      ;; named using mixed case has a URL using lower case, so
+                      ;; we must work around this inconsistency.  For actual
+                      ;; examples, compare the URLs of the "Deprecated" and
+                      ;; "uWSGI" PyPI packages.
+                      (uri ,(if (string-contains source-url name)
+                                `(pypi-uri ,name version)
+                                `(pypi-uri ,(string-downcase name) version)))
+                      (sha256
+                       (base32
+                        ,(guix-hash-url temp)))))
                    (build-system python-build-system)
                    ,@(maybe-inputs required-inputs 'propagated-inputs)
                    ,@(maybe-inputs test-inputs 'native-inputs)
-- 
2.21.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-20  4:05   ` bug#24450: [PATCHv2] " Maxim Cournoyer
@ 2019-05-20 15:05     ` Ludovic Courtès
  2019-05-22  1:13       ` Maxim Cournoyer
  2019-05-27 14:48     ` Ricardo Wurmus
                       ` (8 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Ludovic Courtès @ 2019-05-20 15:05 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: Ricardo Wurmus, 24450

Hello!

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

> Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:
>
>> Hi Maxim,
>>
>> I would very much like to see your improvements to the pypi importer to
>> be merged.  Have you been able to separate the independent changes as
>> suggested by Ludo?
>
> I'm thrilled that someone has an interest in this :-)
>
> I took my time, but finally got around to restructure the changes a
> bit.  I hope it'll be easier to review this time around!

I’ll let Ricardo comment.  For my part, I see that it has tests, and
that’s enough to make me happy.  :-)

Thanks for improving the importer!

Ludo’.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-20 15:05     ` Ludovic Courtès
@ 2019-05-22  1:13       ` Maxim Cournoyer
  0 siblings, 0 replies; 41+ messages in thread
From: Maxim Cournoyer @ 2019-05-22  1:13 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Ricardo Wurmus, 24450

Greetings!

Ludovic Courtès <ludo@gnu.org> writes:

> Hello!
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:
>>
>>> Hi Maxim,
>>>
>>> I would very much like to see your improvements to the pypi importer to
>>> be merged.  Have you been able to separate the independent changes as
>>> suggested by Ludo?
>>
>> I'm thrilled that someone has an interest in this :-)
>>
>> I took my time, but finally got around to restructure the changes a
>> bit.  I hope it'll be easier to review this time around!
>
> I’ll let Ricardo comment.  For my part, I see that it has tests, and
> that’s enough to make me happy.  :-)

OK!

> Thanks for improving the importer!

My pleasure! One less itch to scratch ;-)

Maxim

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-20  4:05   ` bug#24450: [PATCHv2] " Maxim Cournoyer
  2019-05-20 15:05     ` Ludovic Courtès
@ 2019-05-27 14:48     ` Ricardo Wurmus
  2019-06-10  2:10       ` Maxim Cournoyer
  2019-05-27 15:11     ` Ricardo Wurmus
                       ` (7 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Ricardo Wurmus @ 2019-05-27 14:48 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 24450


Hi Maxim,

> Subject: [PATCH 1/9] import: pypi: Do not consider requirements.txt files.
>
> * guix/import/pypi.scm (guess-requirements): Update comment.
> [guess-requirements-from-source]: Do not attempt to parse the file
> requirements.txt.  Streamline logic.

Why remove the handling of the requirements.txt?  Is it no longer
popular enough to expect its availability in the source archives?

Please also mention in the commit message that and how you adjusted the
tests.  You removed the comments from the example requires.txt — are
comments no longer permitted in these files?  If they are, please don’t
include those changes.

--
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-20  4:05   ` bug#24450: [PATCHv2] " Maxim Cournoyer
  2019-05-20 15:05     ` Ludovic Courtès
  2019-05-27 14:48     ` Ricardo Wurmus
@ 2019-05-27 15:11     ` Ricardo Wurmus
  2019-06-10  3:30       ` Maxim Cournoyer
  2019-05-27 15:54     ` Ricardo Wurmus
                       ` (6 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Ricardo Wurmus @ 2019-05-27 15:11 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 24450

Hi Maxim,

on to patch number 2!

> From 5f79b0502f62bd1dacc8ea143c1dbd9ef7cfc29d Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Thu, 28 Mar 2019 00:26:00 -0400
> Subject: [PATCH 2/9] import: pypi: Do not parse optional requirements from
>  source.
>
> * guix/import/pypi.scm: Export PARSE-REQUIRES.TXT.
> (guess-requirements): Move the READ-REQUIREMENTS procedure to the top level,
> and rename it to PARSE-REQUIRES.TXT.  Move the CLEAN-REQUIREMENT and COMMENT?
> functions inside the READ-REQUIREMENTS procedure.
> (parse-requires.txt): Add a SECTION-HEADER? predicate, and use it to prevent
> parsing optional requirements.
>
> * tests/pypi.scm (test-requires-with-sections): New variable.
> ("parse-requires.txt, with sections"): New test.
> ("pypi->guix-package"): Mute tar output to stdout.

The commit log does not match the changes.  CLEAN-REQUIREMENT is now a
top-level procedure, not a local procedure inside of READ-REQUIREMENTS
as reported in the commit message.  Which is correct?

> +  (call-with-input-file requires.txt
> +    (lambda (port)
> +      (let loop ((result '()))
> +        (let ((line (read-line port)))
> +          ;; Stop when a section is encountered, as sections contains optional

Should be “contain”.

> +          ;; (extra) requirements.  Non-optional requirements must appear
> +          ;; before any section is defined.
> +          (if (or (eof-object? line) (section-header? line))
> +              (reverse result)
> +              (cond
> +               ((or (string-null? line) (comment? line))
> +                (loop result))
> +               (else
> +                (loop (cons (clean-requirement line)
> +                            result))))))))))
> +

I think it would be better to use “match” here instead of nested “let”,
“if” and “cond”.  At least you can drop the “if” and just use cond.

The loop let and the inner let can be merged.


> +(define (parse-requires.txt requires.txt)
> +  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
> +requirement names."
> +  ;; This is a very incomplete parser, which job is to select the non-optional

“which” –> “whose”

> +  ;; dependencies and strip them out of any version information.
> +  ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
> +  ;; library and the requirements grammar defined by PEP-0508
> +  ;; (https://www.python.org/dev/peps/pep-0508/).

Let’s remove the sentence starting with “Alternatively…”.  We could do
that but we didn’t :)

> +  (define (section-header? line)
> +    ;; Return #t if the given LINE is a section header, #f otherwise.
> +    (let ((trimmed-line (string-trim line)))
> +      (and (not (string-null? trimmed-line))
> +           (eq? (string-ref trimmed-line 0) #\[))))
> +

How about using string-prefix? instead?  This looks more complicated
than it deserves.  You can get rid of string-null? and eq? and string-ref
and all that.

Same here:

> +  (define (comment? line)
> +    ;; Return #t if the given LINE is a comment, #f otherwise.
> +    (eq? (string-ref (string-trim line) 0) #\#))

I’d just use string-prefix? here.

> +(define (clean-requirement s)
> +  ;; Given a requirement LINE, as can be found in a setuptools requires.txt
> +  ;; file, remove everything other than the actual name of the required
> +  ;; package, and return it.
> +  (string-take s (or (string-index s (lambda (chr)
> +                                       (member chr '(#\space #\> #\= #\<))))
> +                     (string-length s))))

“string-take” with “string-length” is not very elegant.  The char
predicate in string-index could better be a char set:

--8<---------------cut here---------------start------------->8---
(define (clean-requirement s)
 (cond
  ((string-index s (char-set #\space #\> #\= #\<)) => (cut string-take s <>))
  (else s)))
--8<---------------cut here---------------end--------------->8---


> ("pypi->guix-package"): Mute tar output to stdout.

Finally, I think it would be better to keep this separate because it’s
really orthogonal to the other changes in this patch.

What do you think?

-- 
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-20  4:05   ` bug#24450: [PATCHv2] " Maxim Cournoyer
                       ` (2 preceding siblings ...)
  2019-05-27 15:11     ` Ricardo Wurmus
@ 2019-05-27 15:54     ` Ricardo Wurmus
  2019-06-10  8:32       ` Maxim Cournoyer
  2019-05-27 15:58     ` Ricardo Wurmus
                       ` (5 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Ricardo Wurmus @ 2019-05-27 15:54 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 24450

Patch number 3!

> From 0c62b541a3e8925b5ca31fe55dbe7536cf95151f Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Thu, 28 Mar 2019 00:26:01 -0400
> Subject: [PATCH 3/9] import: pypi: Improve parsing of requirement
>  specifications.
>
> The previous solution was fragile and could leave unwanted characters in a
> requirement name, such as '[' or ']'.

Wouldn’t it be sufficient to add [ and ] to the list of forbidden
characters?  The tests pass with this implementation of
clean-requirements:

(define (clean-requirements s)
 (cond
  ((string-index s (char-set #\space #\> #\= #\< #\[ #\])) => (cut string-take s <>))
  (else s)))

> +(define %requirement-name-regexp
> +  ;; Regexp to match the requirement name in a requirement specification.
> +
> +  ;; Some grammar, taken from PEP-0508 (see:
> +  ;; https://www.python.org/dev/peps/pep-0508/).
> +
> +  ;; The unified rule can be expressed as:
> +  ;; specification = wsp* ( url_req | name_req ) wsp*
> +
> +  ;; where url_req is:
> +  ;; url_req = name wsp* extras? wsp* urlspec wsp+ quoted_marker?
> +
> +  ;; and where name_req is:
> +  ;; name_req = name wsp* extras? wsp* versionspec? wsp* quoted_marker?
> +
> +  ;; Thus, we need only matching NAME, which is expressed as:
> +  ;; identifer_end = letterOrDigit | (('-' | '_' | '.' )* letterOrDigit)
> +  ;; identifier    = letterOrDigit identifier_end*
> +  ;; name          = identifier
> +  (let* ((letter-or-digit "[A-Za-z0-9]")
> +         (identifier-end (string-append "(" letter-or-digit "|"
> +                                        "[-_.]*" letter-or-digit ")"))
> +         (identifier (string-append "^" letter-or-digit identifier-end "*"))
> +         (name identifier))
> +    (make-regexp name)))

This seems a little too complicated.  Translating a grammar into a
regexp is probably not a good idea in general.  Since we don’t care
about anything other than the name it seems easier to just chop off
the string tail as soon as we find an invalid character.

-- 
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-20  4:05   ` bug#24450: [PATCHv2] " Maxim Cournoyer
                       ` (3 preceding siblings ...)
  2019-05-27 15:54     ` Ricardo Wurmus
@ 2019-05-27 15:58     ` Ricardo Wurmus
  2019-05-28 10:23     ` Ricardo Wurmus
                       ` (4 subsequent siblings)
  9 siblings, 0 replies; 41+ messages in thread
From: Ricardo Wurmus @ 2019-05-27 15:58 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 24450

> From 76e4a3150f8126e0b952c6129b6e1371afba80c0 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Thu, 28 Mar 2019 00:26:01 -0400
> Subject: [PATCH 4/9] import: pypi: Deduplicate requirements.
>
> * guix/import/pypi.scm (parse-requires.txt): Remove potential duplicates.

This looks fine to me, but it is subject to changes that I requested to
the procedure in my comments to an earlier patch.

-- 
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-20  4:05   ` bug#24450: [PATCHv2] " Maxim Cournoyer
                       ` (4 preceding siblings ...)
  2019-05-27 15:58     ` Ricardo Wurmus
@ 2019-05-28 10:23     ` Ricardo Wurmus
  2019-06-10 13:30       ` Maxim Cournoyer
  2019-05-28 11:04     ` Ricardo Wurmus
                       ` (3 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Ricardo Wurmus @ 2019-05-28 10:23 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 24450

On to the next:

> From 73e27235cac1275ba7671fd2364325cf5788cb3c Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Thu, 28 Mar 2019 00:26:02 -0400
> Subject: [PATCH 5/9] import: pypi: Support more types of archives.
>
> This change enables the PyPI importer to look for requirements in a source
> archive of a different type than "tar.gz" or "tar.bz2".

Okay.

> * guix/import/pypi.scm: (guess-requirements)[tarball-directory]: Rename to...
> [archive-root-directory]: this. Use COMPRESSED-FILED? to determine if an
> archive is supported or not.

Nitpick: please use “...this.” and leave two spaces between sentences.

Typo: it should be COMPRESSED-FILE?

> [guess-requirements-from-source]: Adapt to use the new method, and use unzip
> to extract ZIP archives.

s/method/procedure/

Please also mention that “compute-inputs” has been adjusted.

> -  (define (tarball-directory url)
> -    ;; Given the URL of the package's tarball, return the name of the directory
> +  (define (archive-root-directory url)
> +    ;; Given the URL of the package's archive, return the name of the directory
>      ;; that will be created upon decompressing it. If the filetype is not
>      ;; supported, return #f.
> -    ;; TODO: Support more archive formats.
> -    (let ((basename (substring url (+ 1 (string-rindex url #\/)))))
> -      (cond
> -       ((string-suffix? ".tar.gz" basename)
> -        (string-drop-right basename 7))
> -       ((string-suffix? ".tar.bz2" basename)
> -        (string-drop-right basename 8))
> -       (else
> +    (if (compressed-file? url)
> +        (let ((root-directory (file-sans-extension (basename url))))
> +          (if (string=? "tar" (file-extension root-directory))
> +              (file-sans-extension root-directory)
> +              root-directory))
>          (begin
> -          (warning (G_ "Unsupported archive format: \
> -cannot determine package dependencies"))
> -          #f)))))
> +          (warning (G_ "Unsupported archive format (~a): \
> +cannot determine package dependencies") (file-extension url))
> +          #f)))

I think the double application of file-sans-extension and the
intermediate variable name “root-directory” for something that is a file
is a little confusing, but I don’t have a better proposal (other than to
replace file-extension and file-sans-extension with a match expression).

>    (define (read-wheel-metadata wheel-archive)
>      ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
> @@ -246,16 +243,20 @@ cannot determine package dependencies"))

>    (define (guess-requirements-from-source)
>      ;; Return the package's requirements by guessing them from the source.
> -    (let ((dirname (tarball-directory source-url)))
> +    (let ((dirname (archive-root-directory source-url))
> +          (extension (file-extension source-url)))
>        (if (string? dirname)
>            (call-with-temporary-directory
>             (lambda (dir)
>               (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
>                      (requires.txt (string-append dirname "/" pypi-name
>                                                   ".egg-info" "/requires.txt"))
> -                    (exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
> -                                              (current-output-port (%make-void-port "rw+")))
> -                                 (system* "tar" "xf" tarball "-C" dir requires.txt))))
> +                    (exit-code
> +                     (parameterize ((current-error-port (%make-void-port "rw+"))
> +                                    (current-output-port (%make-void-port "rw+")))
> +                       (if (string=? "zip" extension)
> +                           (system* "unzip" archive "-d" dir requires.txt)
> +                           (system* "tar" "xf" archive "-C" dir requires.txt)))))

I guess this is why I’m not too happy with this: we’re checking in
multiple places if the format is supported but then forget about this
again until the next time we need to do something to the file.

I wonder if we could do better and answer the question just once.

--
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-20  4:05   ` bug#24450: [PATCHv2] " Maxim Cournoyer
                       ` (5 preceding siblings ...)
  2019-05-28 10:23     ` Ricardo Wurmus
@ 2019-05-28 11:04     ` Ricardo Wurmus
  2019-06-11  0:39       ` Maxim Cournoyer
  2019-05-28 13:21     ` Ricardo Wurmus
                       ` (2 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Ricardo Wurmus @ 2019-05-28 11:04 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 24450

Patch number 6:

> From fb0547ef225103c0f8355a7eccc41e0d028f6563 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Thu, 28 Mar 2019 00:26:03 -0400
> Subject: [PATCH 6/9] import: pypi: Parse wheel METADATA instead of
>  metadata.json.

> With newer Wheel releases, there is no more metadata.json file; the METADATA
> file should be used instead (see: https://github.com/pypa/wheel/issues/195).

> This change updates our PyPI importer so that it uses the later.

Typo: should be “latter” instead of “later”.

> * guix/import/pypi.scm (define-module): Remove unnecessary modules and export
>   the PARSE-WHEEL-METADATA method.

Please remove the indentation here.  Also, please don’t use “method”
(because it’s not); use “procedure” instead.

> (parse-wheel-metadata): Add method.

Same here.

> +  (define (requires-dist-header? line)
> +    ;; Return #t if the given LINE is a Requires-Dist header.
> +    (regexp-match? (string-match "^Requires-Dist: " line)))
> +
> +  (define (requires-dist-value line)
> +    (string-drop line (string-length "Requires-Dist: ")))
> +
> +  (define (extra? line)
> +    ;; Return #t if the given LINE is an "extra" requirement.
> +    (regexp-match? (string-match "extra == " line)))

The use of “regexp-match?” here isn’t strictly necessary as the return
value is true-ish anyway.

> +  (call-with-input-file metadata
> +    (lambda (port)
> +      (let loop ((requirements '()))
> +        (let ((line (read-line port)))
> +          ;; Stop at the first 'Provides-Extra' section: the non-optional
> +          ;; requirements appear before the optional ones.
> +          (if (eof-object? line)
> +              (reverse (delete-duplicates requirements))
> +              (cond
> +               ((and (requires-dist-header? line) (not (extra? line)))
> +                (loop (cons (specification->requirement-name
> +                             (requires-dist-value line))
> +                            requirements)))
> +               (else
> +                (loop requirements)))))))))
> +

As before you can simplify the nested let and merge “if” and "cond“.

>    (define (read-wheel-metadata wheel-archive)
>      ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
> -    ;; requirements.
> +    ;; requirements, or #f if the metadata file contained therein couldn't be
> +    ;; extracted.
>      (let* ((dirname (wheel-url->extracted-directory wheel-url))
> -           (json-file (string-append dirname "/metadata.json")))
> -      (and (zero? (system* "unzip" "-q" wheel-archive json-file))
> -           (dynamic-wind
> -             (const #t)
> -             (lambda ()
> -               (call-with-input-file json-file
> -                 (lambda (port)
> -                   (let* ((metadata (json->scm port))
> -                          (run_requires (hash-ref metadata "run_requires"))
> -                          (requirements (if run_requires
> -                                            (hash-ref (list-ref run_requires 0)
> -                                                       "requires")
> -                                            '())))
> -                     (map specification->requirement-name requirements)))))
> -             (lambda ()
> -               (delete-file json-file)
> -               (rmdir dirname))))))
> +           (metadata (string-append dirname "/METADATA")))
> +      (call-with-temporary-directory
> +       (lambda (dir)
> +         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
> +             (parse-wheel-metadata (string-append dir "/" metadata))
> +             (begin
> +               (warning
> +                (G_ "Failed to extract file: ~a from wheel.~%") metadata)
> +               #f))))))

The old approach took care of removing the unpacked archive no matter
what happened.  The new code doesn’t do that.

> --- a/tests/pypi.scm
> +++ b/tests/pypi.scm

Thanks for the tests!

--
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-20  4:05   ` bug#24450: [PATCHv2] " Maxim Cournoyer
                       ` (6 preceding siblings ...)
  2019-05-28 11:04     ` Ricardo Wurmus
@ 2019-05-28 13:21     ` Ricardo Wurmus
  2019-05-28 14:48     ` Ricardo Wurmus
  2019-05-28 14:53     ` Ricardo Wurmus
  9 siblings, 0 replies; 41+ messages in thread
From: Ricardo Wurmus @ 2019-05-28 13:21 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 24450

Next up: Seven of Nine, tertiary adjunct of unimatrix zero one:

> From 37e499d5d5d5f690aa0a065c730e13f6a31dd30d Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Thu, 28 Mar 2019 23:12:26 -0400
> Subject: [PATCH 7/9] import: pypi: Include optional test inputs as
>  native-inputs.
>
> * guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
> (test-section?): New predicate.
> (parse-requires.txt): Collect the optional test inputs, and return them as the
> second element of the returned list.

AFAICT parse-requires.txt now returns a list of pairs, but used to
return a plain list before.  Is this correct?

>  (define (parse-requires.txt requires.txt)
> -  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
> -requirement names."
> -  ;; This is a very incomplete parser, which job is to select the non-optional
> -  ;; dependencies and strip them out of any version information.
> +  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a pair of requirements.
> +
> +The first element of the pair contains the required dependencies while the
> +second the optional test dependencies.  Note that currently, optional,
> +non-test dependencies are omitted since these can be difficult or expensive to
> +satisfy."
> +
> +  ;; This is a very incomplete parser, which job is to read in the requirement
> +  ;; specification lines, and strip them out of any version information.
>    ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
>    ;; library and the requirements grammar defined by PEP-0508
>    ;; (https://www.python.org/dev/peps/pep-0508/).

Does it really return a pair?  Or a list of pairs?  Or is it a
two-element list of lists?

>    (call-with-input-file requires.txt
>      (lambda (port)
> -      (let loop ((result '()))
> +      (let loop ((required-deps '())
> +                 (test-deps '())
> +                 (inside-test-section? #f)
> +                 (optional? #f))
>          (let ((line (read-line port)))
> -          ;; Stop when a section is encountered, as sections contains optional
> -          ;; (extra) requirements.  Non-optional requirements must appear
> -          ;; before any section is defined.
> -          (if (or (eof-object? line) (section-header? line))
> +          (if (eof-object? line)
>                ;; Duplicates can occur, since the same requirement can be
>                ;; listed multiple times with different conditional markers, e.g.
>                ;; pytest >= 3 ; python_version >= "3.3"
>                ;; pytest < 3 ; python_version < "3.3"
> -              (reverse (delete-duplicates result))
> +              (map (compose reverse delete-duplicates)
> +                   (list required-deps test-deps))

Looks like a list of lists to me.  “delete-duplicates” now won’t delete
a name that is in both “required-deps” as well as in “test-deps”.  Is
this acceptable?

Personally, I’m not a fan of using data structures for returning
multiple values, because we can simply return multiple values.

Or we could have more than just strings.  The meaning of these strings
is provided by the bin into which they are thrown — either
“required-deps” or “test-deps”.  It could be an option to collect tagged
values instead and have the caller deal with filtering.

>  (define (parse-wheel-metadata metadata)
> -  "Given METADATA, a Wheel metadata file, return a list of requirement names."
> +  "Given METADATA, a Wheel metadata file, return a pair of requirements.
> +
> +The first element of the pair contains the required dependencies while the second the optional
> +test dependencies.  Note that currently, optional, non-test dependencies are
> +omitted since these can be difficult or expensive to satisfy."
>    ;; METADATA is a RFC-2822-like, header based file.

This sounds like this is going to duplicate the previous procedures.

>    (define (requires-dist-header? line)
>      ;; Return #t if the given LINE is a Requires-Dist header.
> -    (regexp-match? (string-match "^Requires-Dist: " line)))
> +    (string-match "^Requires-Dist: " line))
>
>    (define (requires-dist-value line)
>      (string-drop line (string-length "Requires-Dist: ")))
>
>    (define (extra? line)
>      ;; Return #t if the given LINE is an "extra" requirement.
> -    (regexp-match? (string-match "extra == " line)))
> +    (string-match "extra == '(.*)'" line))

These hunks should be part of the previous patch where they were
introduced.  (See my comments there about “regexp-match?”.)

> +  (define (test-requirement? line)
> +    (let ((extra-label (match:substring (extra? line) 1)))
> +      (and extra-label (test-section? extra-label))))

You can use “and=>” instead of binding a name:

    (and=> (match:substring (extra? line) 1) test-section?)

>    (call-with-input-file metadata
>      (lambda (port)
> -      (let loop ((requirements '()))
> +      (let loop ((required-deps '())
> +                 (test-deps '()))
>          (let ((line (read-line port)))
> -          ;; Stop at the first 'Provides-Extra' section: the non-optional
> -          ;; requirements appear before the optional ones.
>            (if (eof-object? line)
> -              (reverse (delete-duplicates requirements))
> +              (map (compose reverse delete-duplicates)
> +                   (list required-deps test-deps))
>                (cond
>                 ((and (requires-dist-header? line) (not (extra? line)))
>                  (loop (cons (specification->requirement-name
>                               (requires-dist-value line))
> -                            requirements)))
> +                            required-deps)
> +                      test-deps))
> +               ((and (requires-dist-header? line) (test-requirement? line))
> +                (loop required-deps
> +                      (cons (specification->requirement-name (requires-dist-value line))
> +                            test-deps)))
>                 (else
> -                (loop requirements)))))))))
> +                (loop required-deps test-deps))))))))) ;skip line

Could you refactor this so that the other parser can be reused?  If not,
the same comments about “if” and “cond” and the use of pairs/lists
instead of “values” applies here.

>  (define (guess-requirements source-url wheel-url archive)
> -  "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
> +  "Given SOURCE-URL, WHEEL-URL and an ARCHIVE of the package, return a list
>  of the required packages specified in the requirements.txt file.  ARCHIVE will
>  be extracted in a temporary directory."
>
> @@ -244,7 +289,10 @@ cannot determine package dependencies") (file-extension url))
>             (metadata (string-append dirname "/METADATA")))
>        (call-with-temporary-directory
>         (lambda (dir)
> -         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
> +         (if (zero?
> +              (parameterize ((current-error-port (%make-void-port "rw+"))
> +                             (current-output-port (%make-void-port "rw+")))
> +                (system* "unzip" wheel-archive "-d" dir metadata)))
>               (parse-wheel-metadata (string-append dir "/" metadata))
>               (begin
>                 (warning
> @@ -283,32 +331,41 @@ cannot determine package dependencies") (file-extension url))
>                       (warning
>                        (G_ "Failed to extract file: ~a from source.~%")
>                        requires.txt)
> -                     '())))))
> -          '())))
> +                     (list '() '()))))))
> +          (list '() '()))))

I would like to see cosmetic changes like these three hunks in separate
commits.

>  (define (compute-inputs source-url wheel-url archive)
> -  "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
> -name/variable pairs describing the required inputs of this package.  Also
> +  "Given the SOURCE-URL and WHEEL-URL of an already downloaded ARCHIVE, return
> +a pair of lists, each consisting of a list of name/variable pairs, for the
> +propagated inputs and the native inputs, respectively.  Also
>  return the unaltered list of upstream dependency names."
> -  (let ((dependencies
> -         (remove (cut string=? "argparse" <>)
> -                 (guess-requirements source-url wheel-url archive))))
> -    (values (sort
> -             (map (lambda (input)
> -                    (let ((guix-name (python->package-name input)))
> -                      (list guix-name (list 'unquote (string->symbol guix-name)))))
> -                  dependencies)
> -             (lambda args
> -               (match args
> -                 (((a _ ...) (b _ ...))
> -                  (string-ci<? a b)))))
> -            dependencies)))
> +
> +  (define (strip-argparse deps)
> +    (remove (cut string=? "argparse" <>) deps))
> +
> +  (define (requirement->package-name/sort deps)
> +    (sort
> +     (map (lambda (input)
> +            (let ((guix-name (python->package-name input)))
> +              (list guix-name (list 'unquote (string->symbol guix-name)))))
> +          deps)
> +     (lambda args
> +       (match args
> +         (((a _ ...) (b _ ...))
> +          (string-ci<? a b))))))
> +
> +  (define process-requirements
> +    (compose requirement->package-name/sort strip-argparse))
> +
> +  (let ((dependencies (guess-requirements source-url wheel-url archive)))
> +    (values (map process-requirements dependencies)
> +            (concatenate dependencies))))

Giving names to these processing steps is fine and improves clarity, but
I’m not so comfortable about returning ad-hoc pairs *and* multiple
values (like above).

> +            (match guix-dependencies
> +              ((required-inputs test-inputs)
> +               (values
> +                `(package
> +                   (name ,(python->package-name name))
> +                   (version ,version)
> +                   (source (origin
> +                             (method url-fetch)
> +                             ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
> +                             ;; cases in NAME, for instance, as is the case with
> +                             ;; "uwsgi".  In that case, fall back to a full URL.
> +                             (uri (pypi-uri ,(string-downcase name) version))
> +                             (sha256
> +                              (base32
> +                               ,(guix-hash-url temp)))))
> +                   (build-system python-build-system)
> +                   ,@(maybe-inputs required-inputs 'propagated-inputs)
> +                   ,@(maybe-inputs test-inputs 'native-inputs)
> +                   (home-page ,home-page)
> +                   (synopsis ,synopsis)
> +                   (description ,description)
> +                   (license ,(license->symbol license)))
> +                ;; Flatten the nested lists and return the upstream
> +                ;; dependencies.
> +                upstream-dependencies))))))))

I don’t see anything being flattened here?

--
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-20  4:05   ` bug#24450: [PATCHv2] " Maxim Cournoyer
                       ` (7 preceding siblings ...)
  2019-05-28 13:21     ` Ricardo Wurmus
@ 2019-05-28 14:48     ` Ricardo Wurmus
  2019-06-16  5:10       ` Maxim Cournoyer
  2019-05-28 14:53     ` Ricardo Wurmus
  9 siblings, 1 reply; 41+ messages in thread
From: Ricardo Wurmus @ 2019-05-28 14:48 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 24450


> From cfde6e09f8f8c692fe252d76ed27e8c50a9e5377 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Sat, 30 Mar 2019 23:13:26 -0400
> Subject: [PATCH 8/9] import: pypi: Scan source archive to find requires.txt
>  file.

> * guix/import/pypi.scm (use-modules): Use invoke from (guix build utils).
> (guess-requirements)[archive-root-directory]: Remove procedure.

Oh, I guess I reviewed this procedure in vain :(

Please modify the commits so that added procedures are not removed in
later commits.  This is easier on the reviewer and makes for a clearer
commit history.

>    (define (guess-requirements-from-source)
>      ;; Return the package's requirements by guessing them from the source.
> -    (let ((dirname (archive-root-directory source-url))
> -          (extension (file-extension source-url)))
> -      (if (string? dirname)
> -          (call-with-temporary-directory
> -           (lambda (dir)
> -             (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
> -                    (requires.txt (string-append dirname "/" pypi-name
> -                                                 ".egg-info" "/requires.txt"))
> -                    (exit-code
> -                     (parameterize ((current-error-port (%make-void-port "rw+"))
> -                                    (current-output-port (%make-void-port "rw+")))
> -                       (if (string=? "zip" extension)
> -                           (system* "unzip" archive "-d" dir requires.txt)
> -                           (system* "tar" "xf" archive "-C" dir requires.txt)))))
> -               (if (zero? exit-code)
> -                   (parse-requires.txt (string-append dir "/" requires.txt))
> -                   (begin
> -                     (warning
> -                      (G_ "Failed to extract file: ~a from source.~%")
> -                      requires.txt)
> -                     (list '() '()))))))
> +    (if (compressed-file? source-url)
> +        (call-with-temporary-directory
> +         (lambda (dir)
> +           (parameterize ((current-error-port (%make-void-port "rw+"))
> +                          (current-output-port (%make-void-port "rw+")))
> +             (if (string=? "zip" (file-extension source-url))
> +                 (invoke "unzip" archive "-d" dir)
> +                 (invoke "tar" "xf" archive "-C" dir)))
> +           (let ((requires.txt-files
> +                  (find-files dir (lambda (abs-file-name _)
> +		                    (string-match "\\.egg-info/requires.txt$"
> +                                                  abs-file-name)))))
> +             (if (> (length requires.txt-files) 0)

Let’s work on the empty list directly.  Here “match” would be better.

> +                 (begin
> +                   (parse-requires.txt (first requires.txt-files)))

No need for “begin” here.

> +                 (begin (warning (G_ "Cannot guess requirements from source archive:\
> + no requires.txt file found.~%"))
> +                        (list '() '()))))))

I know that this is from an earlier commit, but I don’t like the look of
“(list '() '())” at all :)

> +        (begin
> +          (warning (G_ "Unsupported archive format; \
> +cannot determine package dependencies from source archive: ~a~%")
> +                   (basename source-url))
>            (list '() '()))))

Same here.  Certainly there’s a better return value.

--
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-20  4:05   ` bug#24450: [PATCHv2] " Maxim Cournoyer
                       ` (8 preceding siblings ...)
  2019-05-28 14:48     ` Ricardo Wurmus
@ 2019-05-28 14:53     ` Ricardo Wurmus
  2019-05-30  2:24       ` Maxim Cournoyer
  2019-06-16  5:53       ` Maxim Cournoyer
  9 siblings, 2 replies; 41+ messages in thread
From: Ricardo Wurmus @ 2019-05-28 14:53 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 24450

And finally: Number 9!

> From 1290f9d1f0d594fdd4723d76b94116be25da9dd5 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Sat, 30 Mar 2019 20:27:35 -0400
> Subject: [PATCH 9/9] import: pypi: Preserve package name case when forming
>  pypi-uri.
>
> Fixes issue: #33046.

Please change this to:

Fixes <https://bugs.gnu.org/33046>.

> * guix/build-system/python.scm (pypi-uri): Update the host URI to
> "files.pythonhosted.org".
> * guix/import/pypi.scm (make-pypi-sexp): Preserve the package name case when
> the source URL calls for it.

Is the first change to use files.pythonhosted.org required to fix this?
Or is this unrelated?

If it is required this looks fine to me.

Thank you!

-- 
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-28 14:53     ` Ricardo Wurmus
@ 2019-05-30  2:24       ` Maxim Cournoyer
  2019-06-16  5:53       ` Maxim Cournoyer
  1 sibling, 0 replies; 41+ messages in thread
From: Maxim Cournoyer @ 2019-05-30  2:24 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 24450

Hello Ricardo!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> And finally: Number 9!
>
>> From 1290f9d1f0d594fdd4723d76b94116be25da9dd5 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Sat, 30 Mar 2019 20:27:35 -0400
>> Subject: [PATCH 9/9] import: pypi: Preserve package name case when forming
>>  pypi-uri.
>>
>> Fixes issue: #33046.
>
> Please change this to:
>
> Fixes <https://bugs.gnu.org/33046>.
>
>> * guix/build-system/python.scm (pypi-uri): Update the host URI to
>> "files.pythonhosted.org".
>> * guix/import/pypi.scm (make-pypi-sexp): Preserve the package name case when
>> the source URL calls for it.
>
> Is the first change to use files.pythonhosted.org required to fix this?
> Or is this unrelated?
>
> If it is required this looks fine to me.
>
> Thank you!

Thank you for this thorough review!  I'll need some time
to go through it, and an upcoming travel will delay it some more, so
don't fret if you don't hear back from me before a couple days. Sorry!

I'll keep you posted.

Thanks a bunch!

Maxim

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-27 14:48     ` Ricardo Wurmus
@ 2019-06-10  2:10       ` Maxim Cournoyer
  0 siblings, 0 replies; 41+ messages in thread
From: Maxim Cournoyer @ 2019-06-10  2:10 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 24450

Hello Ricardo!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> Hi Maxim,
>
>> Subject: [PATCH 1/9] import: pypi: Do not consider requirements.txt files.
>>
>> * guix/import/pypi.scm (guess-requirements): Update comment.
>> [guess-requirements-from-source]: Do not attempt to parse the file
>> requirements.txt.  Streamline logic.
>
> Why remove the handling of the requirements.txt?  Is it no longer
> popular enough to expect its availability in the source archives?
>
> Please also mention in the commit message that and how you adjusted the
> tests.

The commit message now explains the above:

    import: pypi: Do not consider requirements.txt files.

    PyPI packages are mandated to have a setup.py file, which contains a listing
    of the required dependencies.  The setuptools/distutils machinery embed
    metadata in the archives they produce, which contains this information. There
    is no need nor gain to collect the requirements from a "requirements.txt"
    file, as it is not the true record of dependencies for PyPI packages and may
    contain extraneous requirements or not exist at all.

    * guix/import/pypi.scm (guess-requirements): Update comment.
    [guess-requirements-from-source]: Do not attempt to parse the file
    requirements.txt.  Streamline logic.
    * tests/pypi.scm (test-requires.txt): Rename from test-requirements, to hint
    at the file being tested.
    ("pypi->guix-package"): Adapt so that the fake package contains a requires.txt
    file rather than a requirements.txt file.
    ("pypi->guix-package, wheels"): Likewise.

> You removed the comments from the example requires.txt — are
> comments no longer permitted in these files?  If they are, please don’t
> include those changes.

The comments are now preserved. Thank you!

Maxim

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-27 15:11     ` Ricardo Wurmus
@ 2019-06-10  3:30       ` Maxim Cournoyer
  2019-06-10  9:23         ` Ricardo Wurmus
  0 siblings, 1 reply; 41+ messages in thread
From: Maxim Cournoyer @ 2019-06-10  3:30 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 24450

Hello again!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> Hi Maxim,
>
> on to patch number 2!

Yay!

>> From 5f79b0502f62bd1dacc8ea143c1dbd9ef7cfc29d Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Thu, 28 Mar 2019 00:26:00 -0400
>> Subject: [PATCH 2/9] import: pypi: Do not parse optional requirements from
>>  source.
>>
>> * guix/import/pypi.scm: Export PARSE-REQUIRES.TXT.
>> (guess-requirements): Move the READ-REQUIREMENTS procedure to the top level,
>> and rename it to PARSE-REQUIRES.TXT.  Move the CLEAN-REQUIREMENT and COMMENT?
>> functions inside the READ-REQUIREMENTS procedure.
>> (parse-requires.txt): Add a SECTION-HEADER? predicate, and use it to prevent
>> parsing optional requirements.
>>
>> * tests/pypi.scm (test-requires-with-sections): New variable.
>> ("parse-requires.txt, with sections"): New test.
>> ("pypi->guix-package"): Mute tar output to stdout.
>
> The commit log does not match the changes.  CLEAN-REQUIREMENT is now a
> top-level procedure, not a local procedure inside of READ-REQUIREMENTS
> as reported in the commit message.  Which is correct?

Fixed.

>> +  (call-with-input-file requires.txt
>> +    (lambda (port)
>> +      (let loop ((result '()))
>> +        (let ((line (read-line port)))
>> +          ;; Stop when a section is encountered, as sections contains optional
>
> Should be “contain”.

Fixed.

>> +          ;; (extra) requirements.  Non-optional requirements must appear
>> +          ;; before any section is defined.
>> +          (if (or (eof-object? line) (section-header? line))
>> +              (reverse result)
>> +              (cond
>> +               ((or (string-null? line) (comment? line))
>> +                (loop result))
>> +               (else
>> +                (loop (cons (clean-requirement line)
>> +                            result))))))))))
>> +
>
> I think it would be better to use “match” here instead of nested “let”,
> “if” and “cond”.  At least you can drop the “if” and just use cond.
>
> The loop let and the inner let can be merged.

I'm not sure I understand; wouldn't merging the named let with the plain
let mean adding an extra LINE argument to my LOOP procedure?  I don't
want that.

Also, how could the above code be expressed using "match"? I'm using
predicates which tests for (special) characters in a string; I don't see
how the more primitive pattern language of "match" will enable me to do
the same.

>> +(define (parse-requires.txt requires.txt)
>> +  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
>> +requirement names."
>> +  ;; This is a very incomplete parser, which job is to select the non-optional
>
> “which” –> “whose”

Fixed, with due diligence reading on English grammar ;-)

>> +  ;; dependencies and strip them out of any version information.
>> +  ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
>> +  ;; library and the requirements grammar defined by PEP-0508
>> +  ;; (https://www.python.org/dev/peps/pep-0508/).
>
> Let’s remove the sentence starting with “Alternatively…”.  We could do
> that but we didn’t :)

Alright; done!

>> +  (define (section-header? line)
>> +    ;; Return #t if the given LINE is a section header, #f otherwise.
>> +    (let ((trimmed-line (string-trim line)))
>> +      (and (not (string-null? trimmed-line))
>> +           (eq? (string-ref trimmed-line 0) #\[))))
>> +
>
> How about using string-prefix? instead?  This looks more complicated
> than it deserves.  You can get rid of string-null? and eq? and string-ref
> and all that.
>
> Same here:
>
>> +  (define (comment? line)
>> +    ;; Return #t if the given LINE is a comment, #f otherwise.
>> +    (eq? (string-ref (string-trim line) 0) #\#))
>
> I’d just use string-prefix? here.

Neat! Adjusted, for both counts.

>> +(define (clean-requirement s)
>> +  ;; Given a requirement LINE, as can be found in a setuptools requires.txt
>> +  ;; file, remove everything other than the actual name of the required
>> +  ;; package, and return it.
>> +  (string-take s (or (string-index s (lambda (chr)
>> +                                       (member chr '(#\space #\> #\= #\<))))
>> +                     (string-length s))))
>
> “string-take” with “string-length” is not very elegant.  The char
> predicate in string-index could better be a char set:
>
> (define (clean-requirement s)
>  (cond
>   ((string-index s (char-set #\space #\> #\= #\<)) => (cut string-take s <>))
>   (else s)))

That's nicer, thanks!

>> ("pypi->guix-package"): Mute tar output to stdout.
>
> Finally, I think it would be better to keep this separate because it’s
> really orthogonal to the other changes in this patch.

OK, done!

> What do you think?

All good points; I just don't understand the one about using a match
and/or merging the regular "let" with the named "let".

Thanks,

Maxim

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-27 15:54     ` Ricardo Wurmus
@ 2019-06-10  8:32       ` Maxim Cournoyer
  2019-06-10  9:12         ` Ricardo Wurmus
  0 siblings, 1 reply; 41+ messages in thread
From: Maxim Cournoyer @ 2019-06-10  8:32 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 24450

Hello!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> Patch number 3!

Yay!

>> From 0c62b541a3e8925b5ca31fe55dbe7536cf95151f Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Thu, 28 Mar 2019 00:26:01 -0400
>> Subject: [PATCH 3/9] import: pypi: Improve parsing of requirement
>>  specifications.
>>
>> The previous solution was fragile and could leave unwanted characters in a
>> requirement name, such as '[' or ']'.
>
> Wouldn’t it be sufficient to add [ and ] to the list of forbidden
> characters?  The tests pass with this implementation of
> clean-requirements:
>
> (define (clean-requirements s)
>  (cond
>   ((string-index s (char-set #\space #\> #\= #\< #\[ #\])) => (cut string-take s <>))
>   (else s)))

Indeed this would be sufficient to make the tests pass, but the tests
don't cover all the cases; as an example, consider:

--8<---------------cut here---------------start------------->8---
argparse;python_version<"2.7"
--8<---------------cut here---------------end--------------->8---

While we could make it work with the current logic by adding more
invalid characters (such as ';' here) to the character set, it seems
less error prone to use the upstream provided regex to match a package
name.  [0]

>> +(define %requirement-name-regexp
>> +  ;; Regexp to match the requirement name in a requirement specification.
>> +
>> +  ;; Some grammar, taken from PEP-0508 (see:
>> +  ;; https://www.python.org/dev/peps/pep-0508/).
>> +
>> +  ;; The unified rule can be expressed as:
>> +  ;; specification = wsp* ( url_req | name_req ) wsp*
>> +
>> +  ;; where url_req is:
>> +  ;; url_req = name wsp* extras? wsp* urlspec wsp+ quoted_marker?
>> +
>> +  ;; and where name_req is:
>> +  ;; name_req = name wsp* extras? wsp* versionspec? wsp* quoted_marker?
>> +
>> +  ;; Thus, we need only matching NAME, which is expressed as:
>> +  ;; identifer_end = letterOrDigit | (('-' | '_' | '.' )* letterOrDigit)
>> +  ;; identifier    = letterOrDigit identifier_end*
>> +  ;; name          = identifier
>> +  (let* ((letter-or-digit "[A-Za-z0-9]")
>> +         (identifier-end (string-append "(" letter-or-digit "|"
>> +                                        "[-_.]*" letter-or-digit ")"))
>> +         (identifier (string-append "^" letter-or-digit identifier-end "*"))
>> +         (name identifier))
>> +    (make-regexp name)))
>
> This seems a little too complicated.  Translating a grammar into a
> regexp is probably not a good idea in general.  Since we don’t care
> about anything other than the name it seems easier to just chop off
> the string tail as soon as we find an invalid character.

While I agree that a regexp is a bigger hammer than basic string
manipulation, I see some merit to it here:

1) We can be assured of conformance with upstream, again, per PEP-0508.
2) It is easier to extend; we might want to add parsing for the version
spec in order to disregard dependencies specified for Python < 3, for
example.

The use of the PEP-0508 grammar to define the regexp is useful to detail
in a more human-friendly language the components of the regexp.  We
could have otherwise used the more cryptic regexp for Python
distribution names:

--8<---------------cut here---------------start------------->8---
^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$
--8<---------------cut here---------------end--------------->8---

So I guess that what I'm saying is that I prefer this approach to using
string-index with invalid characters, for the reasons above.

[0]  https://www.python.org/dev/peps/pep-0508/

Thanks!

Maxim

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-06-10  8:32       ` Maxim Cournoyer
@ 2019-06-10  9:12         ` Ricardo Wurmus
  2019-06-16  6:05           ` Maxim Cournoyer
  0 siblings, 1 reply; 41+ messages in thread
From: Ricardo Wurmus @ 2019-06-10  9:12 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 24450


Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> While I agree that a regexp is a bigger hammer than basic string
> manipulation, I see some merit to it here:
>
> 1) We can be assured of conformance with upstream, again, per PEP-0508.
> 2) It is easier to extend; we might want to add parsing for the version
> spec in order to disregard dependencies specified for Python < 3, for
> example.
>
> The use of the PEP-0508 grammar to define the regexp is useful to detail
> in a more human-friendly language the components of the regexp.  We
> could have otherwise used the more cryptic regexp for Python
> distribution names:
>
> --8<---------------cut here---------------start------------->8---
> ^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$
> --8<---------------cut here---------------end--------------->8---
>
> So I guess that what I'm saying is that I prefer this approach to using
> string-index with invalid characters, for the reasons above.
>
> [0]  https://www.python.org/dev/peps/pep-0508/

Okay, sounds good.  Please make sure to note this in a comment, so that
I won’t be asking myself this same question in a year :)

--
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-06-10  3:30       ` Maxim Cournoyer
@ 2019-06-10  9:23         ` Ricardo Wurmus
  2019-06-16 14:11           ` Maxim Cournoyer
  0 siblings, 1 reply; 41+ messages in thread
From: Ricardo Wurmus @ 2019-06-10  9:23 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 24450


Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

>>> +          ;; (extra) requirements.  Non-optional requirements must appear
>>> +          ;; before any section is defined.
>>> +          (if (or (eof-object? line) (section-header? line))
>>> +              (reverse result)
>>> +              (cond
>>> +               ((or (string-null? line) (comment? line))
>>> +                (loop result))
>>> +               (else
>>> +                (loop (cons (clean-requirement line)
>>> +                            result))))))))))
>>> +
>>
>> I think it would be better to use “match” here instead of nested “let”,
>> “if” and “cond”.  At least you can drop the “if” and just use cond.
>>
>> The loop let and the inner let can be merged.
>
> I'm not sure I understand; wouldn't merging the named let with the plain
> let mean adding an extra LINE argument to my LOOP procedure?  I don't
> want that.

Let’s forget about merging the nested “let”, because you would indeed
need to change a few more things.  It’s fine to keep that as it is.  But
(if … (cond …)) is not pretty.  At least it could be done in one “cond”:

    (cond
     ((or (eof-object? line) (section-header? line))
      (reverse result))
     ((or (string-null? line) (comment? line))
      (loop result))
     (else
      (loop (cons (clean-requirement line)
                  result))))

> Also, how could the above code be expressed using "match"? I'm using
> predicates which tests for (special) characters in a string; I don't see
> how the more primitive pattern language of "match" will enable me to do
> the same.

“match” has support for predicates, so you could do something like this:

    (match line
     ((or (eof-object) (? section-header?))
      (reverse result))
     ((or '() (? comment?))
      (loop result))
     (_ (loop (cons (clean-requirement line) result))))

This allows you to match “eof-object” and '() directly.  Whenever I see
“string-null?” I think it might be better to “match” on the empty list
directly.

But really, that’s up to you.  I only feel strongly about avoiding “(if
… (cond …))”.

--
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-28 10:23     ` Ricardo Wurmus
@ 2019-06-10 13:30       ` Maxim Cournoyer
  2019-06-10 20:13         ` Ricardo Wurmus
  0 siblings, 1 reply; 41+ messages in thread
From: Maxim Cournoyer @ 2019-06-10 13:30 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 24450

Hello again!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> On to the next:
>
>> From 73e27235cac1275ba7671fd2364325cf5788cb3c Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Thu, 28 Mar 2019 00:26:02 -0400
>> Subject: [PATCH 5/9] import: pypi: Support more types of archives.
>>
>> This change enables the PyPI importer to look for requirements in a source
>> archive of a different type than "tar.gz" or "tar.bz2".
>
> Okay.
>
>> * guix/import/pypi.scm: (guess-requirements)[tarball-directory]: Rename to...
>> [archive-root-directory]: this. Use COMPRESSED-FILED? to determine if an
>> archive is supported or not.
>
> Nitpick: please use “...this.” and leave two spaces between sentences.

Done.

> Typo: it should be COMPRESSED-FILE?

Fixed.

>> [guess-requirements-from-source]: Adapt to use the new method, and use unzip
>> to extract ZIP archives.
>
> s/method/procedure/

Done.

> Please also mention that “compute-inputs” has been adjusted.

Done.

>> -  (define (tarball-directory url)
>> -    ;; Given the URL of the package's tarball, return the name of the directory
>> +  (define (archive-root-directory url)
>> +    ;; Given the URL of the package's archive, return the name of the directory
>>      ;; that will be created upon decompressing it. If the filetype is not
>>      ;; supported, return #f.
>> -    ;; TODO: Support more archive formats.
>> -    (let ((basename (substring url (+ 1 (string-rindex url #\/)))))
>> -      (cond
>> -       ((string-suffix? ".tar.gz" basename)
>> -        (string-drop-right basename 7))
>> -       ((string-suffix? ".tar.bz2" basename)
>> -        (string-drop-right basename 8))
>> -       (else
>> +    (if (compressed-file? url)
>> +        (let ((root-directory (file-sans-extension (basename url))))
>> +          (if (string=? "tar" (file-extension root-directory))
>> +              (file-sans-extension root-directory)
>> +              root-directory))
>>          (begin
>> -          (warning (G_ "Unsupported archive format: \
>> -cannot determine package dependencies"))
>> -          #f)))))
>> +          (warning (G_ "Unsupported archive format (~a): \
>> +cannot determine package dependencies") (file-extension url))
>> +          #f)))
>
> I think the double application of file-sans-extension and the
> intermediate variable name “root-directory” for something that is a file
> is a little confusing, but I don’t have a better proposal (other than to
> replace file-extension and file-sans-extension with a match expression).

Done, w.r.t. using "match":

--8<---------------cut here---------------start------------->8---
@@ -198,10 +198,12 @@ be extracted in a temporary directory."
     ;; that will be created upon decompressing it. If the filetype is not
     ;; supported, return #f.
     (if (compressed-file? url)
-        (let ((root-directory (file-sans-extension (basename url))))
-          (if (string=? "tar" (file-extension root-directory))
-              (file-sans-extension root-directory)
-              root-directory))
+        (match (file-sans-extension (basename url))
+          (root-directory
+           (match (file-extension root-directory)
+             ("tar"
+              (file-sans-extension root-directory))
+             (_ root-directory))))
         (begin
           (warning (G_ "Unsupported archive format (~a): \
 cannot determine package dependencies") (file-extension url))
--8<---------------cut here---------------end--------------->8---


>>    (define (read-wheel-metadata wheel-archive)
>>      ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
>> @@ -246,16 +243,20 @@ cannot determine package dependencies"))
>
>>    (define (guess-requirements-from-source)
>>      ;; Return the package's requirements by guessing them from the source.
>> -    (let ((dirname (tarball-directory source-url)))
>> +    (let ((dirname (archive-root-directory source-url))
>> +          (extension (file-extension source-url)))
>>        (if (string? dirname)
>>            (call-with-temporary-directory
>>             (lambda (dir)
>>               (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
>>                      (requires.txt (string-append dirname "/" pypi-name
>>                                                   ".egg-info" "/requires.txt"))
>> -                    (exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
>> -                                              (current-output-port (%make-void-port "rw+")))
>> -                                 (system* "tar" "xf" tarball "-C" dir requires.txt))))
>> +                    (exit-code
>> +                     (parameterize ((current-error-port (%make-void-port "rw+"))
>> +                                    (current-output-port (%make-void-port "rw+")))
>> +                       (if (string=? "zip" extension)
>> +                           (system* "unzip" archive "-d" dir requires.txt)
>> +                           (system* "tar" "xf" archive "-C" dir requires.txt)))))
>
> I guess this is why I’m not too happy with this: we’re checking in
> multiple places if the format is supported but then forget about this
> again until the next time we need to do something to the file.

I don't see much of a problem with the current design since there are
two questions being answered:

1) What should be the directory name of the extracted package (retrieved
   from the base name of the archive).
2) What extractor should be used (zip vs tar).

These two questions are orthogonal, and that the same primitive get used
to answer both is an implementation, or rather, an optimization detail.

> I wonder if we could do better and answer the question just once.

The questions are different :-). We could optimize, but that would be at
the price of expressiveness (squash the two questions into one solving
space).

What do you think?

Maxim

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-06-10 13:30       ` Maxim Cournoyer
@ 2019-06-10 20:13         ` Ricardo Wurmus
  0 siblings, 0 replies; 41+ messages in thread
From: Ricardo Wurmus @ 2019-06-10 20:13 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 24450


Hi Maxim,

thanks for your patience in addressing my comments.  I appreciate it!

>> I think the double application of file-sans-extension and the
>> intermediate variable name “root-directory” for something that is a file
>> is a little confusing, but I don’t have a better proposal (other than to
>> replace file-extension and file-sans-extension with a match expression).
>
> Done, w.r.t. using "match":
>
> --8<---------------cut here---------------start------------->8---
> @@ -198,10 +198,12 @@ be extracted in a temporary directory."
>      ;; that will be created upon decompressing it. If the filetype is not
>      ;; supported, return #f.
>      (if (compressed-file? url)
> -        (let ((root-directory (file-sans-extension (basename url))))
> -          (if (string=? "tar" (file-extension root-directory))
> -              (file-sans-extension root-directory)
> -              root-directory))
> +        (match (file-sans-extension (basename url))
> +          (root-directory
> +           (match (file-extension root-directory)
> +             ("tar"
> +              (file-sans-extension root-directory))
> +             (_ root-directory))))
>          (begin
>            (warning (G_ "Unsupported archive format (~a): \
>  cannot determine package dependencies") (file-extension url))
> --8<---------------cut here---------------end--------------->8---

The first application of “match” matches anything.  What I had in mind
was really a slightly different approach, namely to split up the “url”
string at dots and then match the resulting list of strings.

Something like this:

  (match (string-split "hello.tar.gz" #\.)
   ((base "tar" (or "bz2" "gz")) base)
   ((base ext) base))

> I don't see much of a problem with the current design since there are
> two questions being answered:
>
> 1) What should be the directory name of the extracted package (retrieved
>    from the base name of the archive).
> 2) What extractor should be used (zip vs tar).
>
> These two questions are orthogonal, and that the same primitive get used
> to answer both is an implementation, or rather, an optimization detail.
>
>> I wonder if we could do better and answer the question just once.
>
> The questions are different :-). We could optimize, but that would be at
> the price of expressiveness (squash the two questions into one solving
> space).

Okay.  I guess I’m too picky :)
I’d be happy if we could move the checks to tiny named procedures, but
it’s probably fine the way it is.

Thanks!

--
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-28 11:04     ` Ricardo Wurmus
@ 2019-06-11  0:39       ` Maxim Cournoyer
  2019-06-11 11:56         ` Ricardo Wurmus
  0 siblings, 1 reply; 41+ messages in thread
From: Maxim Cournoyer @ 2019-06-11  0:39 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 24450

Hello!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> Patch number 6:
>
>> From fb0547ef225103c0f8355a7eccc41e0d028f6563 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Thu, 28 Mar 2019 00:26:03 -0400
>> Subject: [PATCH 6/9] import: pypi: Parse wheel METADATA instead of
>>  metadata.json.
>
>> With newer Wheel releases, there is no more metadata.json file; the METADATA
>> file should be used instead (see: https://github.com/pypa/wheel/issues/195).
>
>> This change updates our PyPI importer so that it uses the later.
>
> Typo: should be “latter” instead of “later”.

Fixed.

>> * guix/import/pypi.scm (define-module): Remove unnecessary modules and export
>>   the PARSE-WHEEL-METADATA method.
>
> Please remove the indentation here.  Also, please don’t use “method”
> (because it’s not); use “procedure” instead.

Done. Thanks for fixing my terminology :-).

>> (parse-wheel-metadata): Add method.
>
> Same here.

Done.

>> +  (define (requires-dist-header? line)
>> +    ;; Return #t if the given LINE is a Requires-Dist header.
>> +    (regexp-match? (string-match "^Requires-Dist: " line)))
>> +
>> +  (define (requires-dist-value line)
>> +    (string-drop line (string-length "Requires-Dist: ")))
>> +
>> +  (define (extra? line)
>> +    ;; Return #t if the given LINE is an "extra" requirement.
>> +    (regexp-match? (string-match "extra == " line)))
>
> The use of “regexp-match?” here isn’t strictly necessary as the return
> value is true-ish anyway.

Done.

>> +  (call-with-input-file metadata
>> +    (lambda (port)
>> +      (let loop ((requirements '()))
>> +        (let ((line (read-line port)))
>> +          ;; Stop at the first 'Provides-Extra' section: the non-optional
>> +          ;; requirements appear before the optional ones.
>> +          (if (eof-object? line)
>> +              (reverse (delete-duplicates requirements))
>> +              (cond
>> +               ((and (requires-dist-header? line) (not (extra? line)))
>> +                (loop (cons (specification->requirement-name
>> +                             (requires-dist-value line))
>> +                            requirements)))
>> +               (else
>> +                (loop requirements)))))))))
>> +
>
> As before you can simplify the nested let and merge “if” and "cond“.

Oh, I get it now, I think:

--8<---------------cut here---------------start------------->8---
 
   (call-with-input-file metadata
     (lambda (port)
       (let loop ((requirements '()))
-        (let ((line (read-line port)))
-          ;; Stop at the first 'Provides-Extra' section: the non-optional
-          ;; requirements appear before the optional ones.
-          (if (eof-object? line)
-              (reverse (delete-duplicates requirements))
-              (cond
-               ((and (requires-dist-header? line) (not (extra? line)))
-                (loop (cons (specification->requirement-name
-                             (requires-dist-value line))
-                            requirements)))
-               (else
-                (loop requirements)))))))))
+        (match (read-line port)
+          (line
+           ;; Stop at the first 'Provides-Extra' section: the non-optional
+           ;; requirements appear before the optional ones.
+           (cond
+            ((eof-object? line)
+             (reverse (delete-duplicates requirements)))
+            ((and (requires-dist-header? line) (not (extra? line)))
+             (loop (cons (specification->requirement-name
+                          (requires-dist-value line))
+                         requirements)))
+            (else
+             (loop requirements)))))))))
 
 (define (guess-requirements source-url wheel-url archive)
   "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
--8<---------------cut here---------------end--------------->8---

>>    (define (read-wheel-metadata wheel-archive)
>>      ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
>> -    ;; requirements.
>> +    ;; requirements, or #f if the metadata file contained therein couldn't be
>> +    ;; extracted.
>>      (let* ((dirname (wheel-url->extracted-directory wheel-url))
>> -           (json-file (string-append dirname "/metadata.json")))
>> -      (and (zero? (system* "unzip" "-q" wheel-archive json-file))
>> -           (dynamic-wind
>> -             (const #t)
>> -             (lambda ()
>> -               (call-with-input-file json-file
>> -                 (lambda (port)
>> -                   (let* ((metadata (json->scm port))
>> -                          (run_requires (hash-ref metadata "run_requires"))
>> -                          (requirements (if run_requires
>> -                                            (hash-ref (list-ref run_requires 0)
>> -                                                       "requires")
>> -                                            '())))
>> -                     (map specification->requirement-name requirements)))))
>> -             (lambda ()
>> -               (delete-file json-file)
>> -               (rmdir dirname))))))
>> +           (metadata (string-append dirname "/METADATA")))
>> +      (call-with-temporary-directory
>> +       (lambda (dir)
>> +         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
>> +             (parse-wheel-metadata (string-append dir "/" metadata))
>> +             (begin
>> +               (warning
>> +                (G_ "Failed to extract file: ~a from wheel.~%") metadata)
>> +               #f))))))
>
> The old approach took care of removing the unpacked archive no matter
> what happened.  The new code doesn’t do that.

The temporary directory where the archive is unpacked should be cleared
when leaving upon leaving its scope; the docstring of
"call-with-temporary-directory" says:

    "Call PROC with a name of a temporary directory; close the directory and
    delete it when leaving the dynamic extent of this call."

>> --- a/tests/pypi.scm
>> +++ b/tests/pypi.scm
>
> Thanks for the tests!

:-)

Maxim

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-06-11  0:39       ` Maxim Cournoyer
@ 2019-06-11 11:56         ` Ricardo Wurmus
  0 siblings, 0 replies; 41+ messages in thread
From: Ricardo Wurmus @ 2019-06-11 11:56 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 24450


Hi Maxim,

>>> +  (call-with-input-file metadata
>>> +    (lambda (port)
>>> +      (let loop ((requirements '()))
>>> +        (let ((line (read-line port)))
>>> +          ;; Stop at the first 'Provides-Extra' section: the non-optional
>>> +          ;; requirements appear before the optional ones.
>>> +          (if (eof-object? line)
>>> +              (reverse (delete-duplicates requirements))
>>> +              (cond
>>> +               ((and (requires-dist-header? line) (not (extra? line)))
>>> +                (loop (cons (specification->requirement-name
>>> +                             (requires-dist-value line))
>>> +                            requirements)))
>>> +               (else
>>> +                (loop requirements)))))))))
>>> +
>>
>> As before you can simplify the nested let and merge “if” and "cond“.
>
> Oh, I get it now, I think:
>
> --8<---------------cut here---------------start------------->8---
>
>    (call-with-input-file metadata
>      (lambda (port)
>        (let loop ((requirements '()))
> -        (let ((line (read-line port)))
> -          ;; Stop at the first 'Provides-Extra' section: the non-optional
> -          ;; requirements appear before the optional ones.
> -          (if (eof-object? line)
> -              (reverse (delete-duplicates requirements))
> -              (cond
> -               ((and (requires-dist-header? line) (not (extra? line)))
> -                (loop (cons (specification->requirement-name
> -                             (requires-dist-value line))
> -                            requirements)))
> -               (else
> -                (loop requirements)))))))))
> +        (match (read-line port)
> +          (line
> +           ;; Stop at the first 'Provides-Extra' section: the non-optional
> +           ;; requirements appear before the optional ones.
> +           (cond
> +            ((eof-object? line)
> +             (reverse (delete-duplicates requirements)))
> +            ((and (requires-dist-header? line) (not (extra? line)))
> +             (loop (cons (specification->requirement-name
> +                          (requires-dist-value line))
> +                         requirements)))
> +            (else
> +             (loop requirements)))))))))
>
>  (define (guess-requirements source-url wheel-url archive)
>    "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
> --8<---------------cut here---------------end--------------->8---

Not quite.  Your ‘match’ expression here doesn’t do anything that a
‘let’ wouldn’t have done.  It really just binds the return value of
(read-line port) to ‘line’; that’s the same as (let ((line (read-line
port))) …).

I gave a match example using predicate matchers in a previous reply.  In
any case, using ‘cond’ inside of a let would be just fine.  If you
wanted to go with ‘match’, though, you’d replace the ‘cond’.

--
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2016-09-16 20:00 bug#24450: pypi importer outputs strange character series in optional dependency case ng0
                   ` (2 preceding siblings ...)
  2019-05-15 11:06 ` Ricardo Wurmus
@ 2019-06-12  3:00 ` Maxim Cournoyer
  2019-06-12  6:39   ` Ricardo Wurmus
  3 siblings, 1 reply; 41+ messages in thread
From: Maxim Cournoyer @ 2019-06-12  3:00 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 24450

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> Next up: Seven of Nine, tertiary adjunct of unimatrix zero one:

Ehe! I had to look up the reference; I'm not much of a Star Trek fan
obviously :-P.

>> From 37e499d5d5d5f690aa0a065c730e13f6a31dd30d Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Thu, 28 Mar 2019 23:12:26 -0400
>> Subject: [PATCH 7/9] import: pypi: Include optional test inputs as
>>  native-inputs.
>>
>> * guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
>> (test-section?): New predicate.
>> (parse-requires.txt): Collect the optional test inputs, and return them as the
>> second element of the returned list.
>
> AFAICT parse-requires.txt now returns a list of pairs, but used to
> return a plain list before.  Is this correct?

Right, a list of two lists to be technically correct.

>>  (define (parse-requires.txt requires.txt)
>> -  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
>> -requirement names."
>> -  ;; This is a very incomplete parser, which job is to select the non-optional
>> -  ;; dependencies and strip them out of any version information.
>> +  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a pair of requirements.
>> +
>> +The first element of the pair contains the required dependencies while the
>> +second the optional test dependencies.  Note that currently, optional,
>> +non-test dependencies are omitted since these can be difficult or expensive to
>> +satisfy."
>> +
>> +  ;; This is a very incomplete parser, which job is to read in the requirement
>> +  ;; specification lines, and strip them out of any version information.
>>    ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
>>    ;; library and the requirements grammar defined by PEP-0508
>>    ;; (https://www.python.org/dev/peps/pep-0508/).
>
> Does it really return a pair?  Or a list of pairs?  Or is it a
> two-element list of lists?

The latter! I've fixed the docstring accordingly.

>>    (call-with-input-file requires.txt
>>      (lambda (port)
>> -      (let loop ((result '()))
>> +      (let loop ((required-deps '())
>> +                 (test-deps '())
>> +                 (inside-test-section? #f)
>> +                 (optional? #f))
>>          (let ((line (read-line port)))
>> -          ;; Stop when a section is encountered, as sections contains optional
>> -          ;; (extra) requirements.  Non-optional requirements must appear
>> -          ;; before any section is defined.
>> -          (if (or (eof-object? line) (section-header? line))
>> +          (if (eof-object? line)
>>                ;; Duplicates can occur, since the same requirement can be
>>                ;; listed multiple times with different conditional markers, e.g.
>>                ;; pytest >= 3 ; python_version >= "3.3"
>>                ;; pytest < 3 ; python_version < "3.3"
>> -              (reverse (delete-duplicates result))
>> +              (map (compose reverse delete-duplicates)
>> +                   (list required-deps test-deps))
>
> Looks like a list of lists to me.  “delete-duplicates” now won’t delete
> a name that is in both “required-deps” as well as in “test-deps”.  Is
> this acceptable?

It is acceptable, as this corner case cannot exist given the current
code (a requirement can exist in either required-deps or test-deps, but
never in both). It also doesn't make sense that a run time requirement
would also be listed as a test requirement, so that corner case is not
likely to exist in the future either.

> Personally, I’m not a fan of using data structures for returning
> multiple values, because we can simply return multiple values.

I thought the Guile supported multiple values return value would be
great here as well, but I've found that for this specific case here, a
list of lists worked better, since the two lists contain requirements to
be processed the same, which "map" can readily do (i.e. less ceremony is
required).

> Or we could have more than just strings.  The meaning of these strings
> is provided by the bin into which they are thrown — either
> “required-deps” or “test-deps”.  It could be an option to collect tagged
> values instead and have the caller deal with filtering.

Sounds neat, but I'd rather punt on this one for now.

>>  (define (parse-wheel-metadata metadata)
>> -  "Given METADATA, a Wheel metadata file, return a list of requirement names."
>> +  "Given METADATA, a Wheel metadata file, return a pair of requirements.
>> +
>> +The first element of the pair contains the required dependencies while the second the optional
>> +test dependencies.  Note that currently, optional, non-test dependencies are
>> +omitted since these can be difficult or expensive to satisfy."
>>    ;; METADATA is a RFC-2822-like, header based file.
>
> This sounds like this is going to duplicate the previous procedures.

Instead of duplicating the docstring, I'm now referring to that of
PARSE-REQUIRES.TXT for PARSE-WHEEL-METADATA. The two procedures share
the same interface, but implement a different parser, which consist of
mostly a loop + conditional branches. IMHO, it's not worth, or even,
desirable to try to merge those two parsers into one; I prefer to keep
the logic of two distinct parsers separate.

>>    (define (requires-dist-header? line)
>>      ;; Return #t if the given LINE is a Requires-Dist header.
>> -    (regexp-match? (string-match "^Requires-Dist: " line)))
>> +    (string-match "^Requires-Dist: " line))
>>
>>    (define (requires-dist-value line)
>>      (string-drop line (string-length "Requires-Dist: ")))
>>
>>    (define (extra? line)
>>      ;; Return #t if the given LINE is an "extra" requirement.
>> -    (regexp-match? (string-match "extra == " line)))
>> +    (string-match "extra == '(.*)'" line))
>
> These hunks should be part of the previous patch where they were
> introduced.  (See my comments there about “regexp-match?”.)

Done.

>> +  (define (test-requirement? line)
>> +    (let ((extra-label (match:substring (extra? line) 1)))
>> +      (and extra-label (test-section? extra-label))))
>
> You can use “and=>” instead of binding a name:
>
>     (and=> (match:substring (extra? line) 1) test-section?)

Neat! I still don't have the reflex to use "and=>", thanks for reminding
me about it!

>>    (call-with-input-file metadata
>>      (lambda (port)
>> -      (let loop ((requirements '()))
>> +      (let loop ((required-deps '())
>> +                 (test-deps '()))
>>          (let ((line (read-line port)))
>> -          ;; Stop at the first 'Provides-Extra' section: the non-optional
>> -          ;; requirements appear before the optional ones.
>>            (if (eof-object? line)
>> -              (reverse (delete-duplicates requirements))
>> +              (map (compose reverse delete-duplicates)
>> +                   (list required-deps test-deps))
>>                (cond
>>                 ((and (requires-dist-header? line) (not (extra? line)))
>>                  (loop (cons (specification->requirement-name
>>                               (requires-dist-value line))
>> -                            requirements)))
>> +                            required-deps)
>> +                      test-deps))
>> +               ((and (requires-dist-header? line) (test-requirement? line))
>> +                (loop required-deps
>> +                      (cons (specification->requirement-name (requires-dist-value line))
>> +                            test-deps)))
>>                 (else
>> -                (loop requirements)))))))))
>> +                (loop required-deps test-deps))))))))) ;skip line
>
> Could you refactor this so that the other parser can be reused?  If not,
> the same comments about “if” and “cond” and the use of pairs/lists
> instead of “values” applies here.

Done w.r.t. using only "cond" rather than "if" + "cond". As explained
above, the docstring's body now refer to the documentation of
PARSE-REQUIRES.TXT; otherwise I prefer to keep the parsers separate.

>>  (define (guess-requirements source-url wheel-url archive)
>> -  "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
>> +  "Given SOURCE-URL, WHEEL-URL and an ARCHIVE of the package, return a list
>>  of the required packages specified in the requirements.txt file.  ARCHIVE will
>>  be extracted in a temporary directory."
>>
>> @@ -244,7 +289,10 @@ cannot determine package dependencies") (file-extension url))
>>             (metadata (string-append dirname "/METADATA")))
>>        (call-with-temporary-directory
>>         (lambda (dir)
>> -         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
>> +         (if (zero?
>> +              (parameterize ((current-error-port (%make-void-port "rw+"))
>> +                             (current-output-port (%make-void-port "rw+")))
>> +                (system* "unzip" wheel-archive "-d" dir metadata)))
>>               (parse-wheel-metadata (string-append dir "/" metadata))
>>               (begin
>>                 (warning
>> @@ -283,32 +331,41 @@ cannot determine package dependencies") (file-extension url))
>>                       (warning
>>                        (G_ "Failed to extract file: ~a from source.~%")
>>                        requires.txt)
>> -                     '())))))
>> -          '())))
>> +                     (list '() '()))))))
>> +          (list '() '()))))
>
> I would like to see cosmetic changes like these three hunks in separate
> commits.

Done for the first two hunks; the last one isn't cosmetic; it changes
the default return from an empty list to a list of two empty lists.

>>  (define (compute-inputs source-url wheel-url archive)
>> -  "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
>> -name/variable pairs describing the required inputs of this package.  Also
>> +  "Given the SOURCE-URL and WHEEL-URL of an already downloaded ARCHIVE, return
>> +a pair of lists, each consisting of a list of name/variable pairs, for the
>> +propagated inputs and the native inputs, respectively.  Also
>>  return the unaltered list of upstream dependency names."
>> -  (let ((dependencies
>> -         (remove (cut string=? "argparse" <>)
>> -                 (guess-requirements source-url wheel-url archive))))
>> -    (values (sort
>> -             (map (lambda (input)
>> -                    (let ((guix-name (python->package-name input)))
>> -                      (list guix-name (list 'unquote (string->symbol guix-name)))))
>> -                  dependencies)
>> -             (lambda args
>> -               (match args
>> -                 (((a _ ...) (b _ ...))
>> -                  (string-ci<? a b)))))
>> -            dependencies)))
>> +
>> +  (define (strip-argparse deps)
>> +    (remove (cut string=? "argparse" <>) deps))
>> +
>> +  (define (requirement->package-name/sort deps)
>> +    (sort
>> +     (map (lambda (input)
>> +            (let ((guix-name (python->package-name input)))
>> +              (list guix-name (list 'unquote (string->symbol guix-name)))))
>> +          deps)
>> +     (lambda args
>> +       (match args
>> +         (((a _ ...) (b _ ...))
>> +          (string-ci<? a b))))))
>> +
>> +  (define process-requirements
>> +    (compose requirement->package-name/sort strip-argparse))
>> +
>> +  (let ((dependencies (guess-requirements source-url wheel-url archive)))
>> +    (values (map process-requirements dependencies)
>> +            (concatenate dependencies))))
>
> Giving names to these processing steps is fine and improves clarity, but
> I’m not so comfortable about returning ad-hoc pairs *and* multiple
> values (like above).

Using "values" is required by the API of the recursive importer.

>> +            (match guix-dependencies
>> +              ((required-inputs test-inputs)
>> +               (values
>> +                `(package
>> +                   (name ,(python->package-name name))
>> +                   (version ,version)
>> +                   (source (origin
>> +                             (method url-fetch)
>> +                             ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
>> +                             ;; cases in NAME, for instance, as is the case with
>> +                             ;; "uwsgi".  In that case, fall back to a full URL.
>> +                             (uri (pypi-uri ,(string-downcase name) version))
>> +                             (sha256
>> +                              (base32
>> +                               ,(guix-hash-url temp)))))
>> +                   (build-system python-build-system)
>> +                   ,@(maybe-inputs required-inputs 'propagated-inputs)
>> +                   ,@(maybe-inputs test-inputs 'native-inputs)
>> +                   (home-page ,home-page)
>> +                   (synopsis ,synopsis)
>> +                   (description ,description)
>> +                   (license ,(license->symbol license)))
>> +                ;; Flatten the nested lists and return the upstream
>> +                ;; dependencies.
>> +                upstream-dependencies))))))))
>
> I don’t see anything being flattened here?

Good catch! Seems a remnant of something that is now gone :-).

Thanks for the review.

Maxim

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-06-12  3:00 ` Maxim Cournoyer
@ 2019-06-12  6:39   ` Ricardo Wurmus
  2019-06-16 14:29     ` Maxim Cournoyer
  0 siblings, 1 reply; 41+ messages in thread
From: Ricardo Wurmus @ 2019-06-12  6:39 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 24450


Hi Maxim,

>>>    (call-with-input-file requires.txt
>>>      (lambda (port)
>>> -      (let loop ((result '()))
>>> +      (let loop ((required-deps '())
>>> +                 (test-deps '())
>>> +                 (inside-test-section? #f)
>>> +                 (optional? #f))
>>>          (let ((line (read-line port)))
>>> -          ;; Stop when a section is encountered, as sections contains optional
>>> -          ;; (extra) requirements.  Non-optional requirements must appear
>>> -          ;; before any section is defined.
>>> -          (if (or (eof-object? line) (section-header? line))
>>> +          (if (eof-object? line)
>>>                ;; Duplicates can occur, since the same requirement can be
>>>                ;; listed multiple times with different conditional markers, e.g.
>>>                ;; pytest >= 3 ; python_version >= "3.3"
>>>                ;; pytest < 3 ; python_version < "3.3"
>>> -              (reverse (delete-duplicates result))
>>> +              (map (compose reverse delete-duplicates)
>>> +                   (list required-deps test-deps))
>>
>> Looks like a list of lists to me.  “delete-duplicates” now won’t delete
>> a name that is in both “required-deps” as well as in “test-deps”.  Is
>> this acceptable?
>
> It is acceptable, as this corner case cannot exist given the current
> code (a requirement can exist in either required-deps or test-deps, but
> never in both). It also doesn't make sense that a run time requirement
> would also be listed as a test requirement, so that corner case is not
> likely to exist in the future either.

I mentioned it because I believe I’ve seen this in the past where the
importer would return some of the same inputs as both regular inputs and
test dependencies.

>> Personally, I’m not a fan of using data structures for returning
>> multiple values, because we can simply return multiple values.
>
> I thought the Guile supported multiple values return value would be
> great here as well, but I've found that for this specific case here, a
> list of lists worked better, since the two lists contain requirements to
> be processed the same, which "map" can readily do (i.e. less ceremony is
> required).

“map” can also operate on more than one list at a time:

    (call-with-values
      (lambda ()
        (values (list 1 2 3)
                (list 9 8 7)))
      (lambda (a b) (map + a b)))

    => (10 10 10)

Of course, it would be simpler to just use a single list of tagged
items.

--
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-28 14:48     ` Ricardo Wurmus
@ 2019-06-16  5:10       ` Maxim Cournoyer
  0 siblings, 0 replies; 41+ messages in thread
From: Maxim Cournoyer @ 2019-06-16  5:10 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 24450

Hello Ricardo!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

>> From cfde6e09f8f8c692fe252d76ed27e8c50a9e5377 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Sat, 30 Mar 2019 23:13:26 -0400
>> Subject: [PATCH 8/9] import: pypi: Scan source archive to find requires.txt
>>  file.
>
>> * guix/import/pypi.scm (use-modules): Use invoke from (guix build utils).
>> (guess-requirements)[archive-root-directory]: Remove procedure.
>
> Oh, I guess I reviewed this procedure in vain :(
>
> Please modify the commits so that added procedures are not removed in
> later commits.  This is easier on the reviewer and makes for a clearer
> commit history.

Indeed; I'll be more careful about this is the future; sorry!

I've squashed this commit along with the one enabling more archive types
support, as this is where the modified (and later removed) procedure
originated.

>>    (define (guess-requirements-from-source)
>>      ;; Return the package's requirements by guessing them from the source.
>> -    (let ((dirname (archive-root-directory source-url))
>> -          (extension (file-extension source-url)))
>> -      (if (string? dirname)
>> -          (call-with-temporary-directory
>> -           (lambda (dir)
>> -             (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
>> -                    (requires.txt (string-append dirname "/" pypi-name
>> -                                                 ".egg-info" "/requires.txt"))
>> -                    (exit-code
>> -                     (parameterize ((current-error-port (%make-void-port "rw+"))
>> -                                    (current-output-port (%make-void-port "rw+")))
>> -                       (if (string=? "zip" extension)
>> -                           (system* "unzip" archive "-d" dir requires.txt)
>> -                           (system* "tar" "xf" archive "-C" dir requires.txt)))))
>> -               (if (zero? exit-code)
>> -                   (parse-requires.txt (string-append dir "/" requires.txt))
>> -                   (begin
>> -                     (warning
>> -                      (G_ "Failed to extract file: ~a from source.~%")
>> -                      requires.txt)
>> -                     (list '() '()))))))
>> +    (if (compressed-file? source-url)
>> +        (call-with-temporary-directory
>> +         (lambda (dir)
>> +           (parameterize ((current-error-port (%make-void-port "rw+"))
>> +                          (current-output-port (%make-void-port "rw+")))
>> +             (if (string=? "zip" (file-extension source-url))
>> +                 (invoke "unzip" archive "-d" dir)
>> +                 (invoke "tar" "xf" archive "-C" dir)))
>> +           (let ((requires.txt-files
>> +                  (find-files dir (lambda (abs-file-name _)
>> +		                    (string-match "\\.egg-info/requires.txt$"
>> +                                                  abs-file-name)))))
>> +             (if (> (length requires.txt-files) 0)
>
> Let’s work on the empty list directly.  Here “match” would be better.

Done, like this:

--8<---------------cut here---------------start------------->8---
-             (if (> (length requires.txt-files) 0)
-                 (parse-requires.txt (first requires.txt-files))
-                 (begin (warning (G_ "Cannot guess requirements from source archive:\
+             (match requires.txt-files
+               (()
+                (warning (G_ "Cannot guess requirements from source archive:\
  no requires.txt file found.~%"))
-                        '())))))
+                '())
+               (else (parse-requires.txt (first requires.txt-files)))))))
--8<---------------cut here---------------end--------------->8---

>> +                 (begin
>> +                   (parse-requires.txt (first requires.txt-files)))
>
> No need for “begin” here.

Fixed.

>> +                 (begin (warning (G_ "Cannot guess requirements from source archive:\
>> + no requires.txt file found.~%"))
>> +                        (list '() '()))))))
>
> I know that this is from an earlier commit, but I don’t like the look of
> “(list '() '())” at all :)
>
>> +        (begin
>> +          (warning (G_ "Unsupported archive format; \
>> +cannot determine package dependencies from source archive: ~a~%")
>> +                   (basename source-url))
>>            (list '() '()))))
>
> Same here.  Certainly there’s a better return value.

This might look ugly, but I can't think of a better return value, since
using anything else would mean having to introduce extra logic in the
callers, while it is now a correct value that needs no special case.

Maxim

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-05-28 14:53     ` Ricardo Wurmus
  2019-05-30  2:24       ` Maxim Cournoyer
@ 2019-06-16  5:53       ` Maxim Cournoyer
  1 sibling, 0 replies; 41+ messages in thread
From: Maxim Cournoyer @ 2019-06-16  5:53 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 24450

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> And finally: Number 9!

Yay!

>> From 1290f9d1f0d594fdd4723d76b94116be25da9dd5 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Sat, 30 Mar 2019 20:27:35 -0400
>> Subject: [PATCH 9/9] import: pypi: Preserve package name case when forming
>>  pypi-uri.
>>
>> Fixes issue: #33046.
>
> Please change this to:
>
> Fixes <https://bugs.gnu.org/33046>.

Done!

>> * guix/build-system/python.scm (pypi-uri): Update the host URI to
>> "files.pythonhosted.org".
>> * guix/import/pypi.scm (make-pypi-sexp): Preserve the package name case when
>> the source URL calls for it.
>
> Is the first change to use files.pythonhosted.org required to fix this?
> Or is this unrelated?
>
> If it is required this looks fine to me.
>
> Thank you!

The permanent redirection was found while fixing the issue; but it's
better to have the fix separate.  I've separated it into its own commit.

Thank you!

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-06-10  9:12         ` Ricardo Wurmus
@ 2019-06-16  6:05           ` Maxim Cournoyer
  0 siblings, 0 replies; 41+ messages in thread
From: Maxim Cournoyer @ 2019-06-16  6:05 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 24450

Hi again,

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
>
>> While I agree that a regexp is a bigger hammer than basic string
>> manipulation, I see some merit to it here:
>>
>> 1) We can be assured of conformance with upstream, again, per PEP-0508.
>> 2) It is easier to extend; we might want to add parsing for the version
>> spec in order to disregard dependencies specified for Python < 3, for
>> example.
>>
>> The use of the PEP-0508 grammar to define the regexp is useful to detail
>> in a more human-friendly language the components of the regexp.  We
>> could have otherwise used the more cryptic regexp for Python
>> distribution names:
>>
>> --8<---------------cut here---------------start------------->8---
>> ^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$
>> --8<---------------cut here---------------end--------------->8---
>>
>> So I guess that what I'm saying is that I prefer this approach to using
>> string-index with invalid characters, for the reasons above.
>>
>> [0]  https://www.python.org/dev/peps/pep-0508/
>
> Okay, sounds good.  Please make sure to note this in a comment, so that
> I won’t be asking myself this same question in a year :)

Done!

Maxim

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-06-10  9:23         ` Ricardo Wurmus
@ 2019-06-16 14:11           ` Maxim Cournoyer
  2019-06-17  1:41             ` Ricardo Wurmus
  0 siblings, 1 reply; 41+ messages in thread
From: Maxim Cournoyer @ 2019-06-16 14:11 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 24450

Hello!

Continued feedback about your much appreciated comments! :-)

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
>
>>>> +          ;; (extra) requirements.  Non-optional requirements must appear
>>>> +          ;; before any section is defined.
>>>> +          (if (or (eof-object? line) (section-header? line))
>>>> +              (reverse result)
>>>> +              (cond
>>>> +               ((or (string-null? line) (comment? line))
>>>> +                (loop result))
>>>> +               (else
>>>> +                (loop (cons (clean-requirement line)
>>>> +                            result))))))))))
>>>> +
>>>
>>> I think it would be better to use “match” here instead of nested “let”,
>>> “if” and “cond”.  At least you can drop the “if” and just use cond.
>>>
>>> The loop let and the inner let can be merged.
>>
>> I'm not sure I understand; wouldn't merging the named let with the plain
>> let mean adding an extra LINE argument to my LOOP procedure?  I don't
>> want that.
>
> Let’s forget about merging the nested “let”, because you would indeed
> need to change a few more things.  It’s fine to keep that as it is.  But
> (if … (cond …)) is not pretty.  At least it could be done in one “cond”:
>
>     (cond
>      ((or (eof-object? line) (section-header? line))
>       (reverse result))
>      ((or (string-null? line) (comment? line))
>       (loop result))
>      (else
>       (loop (cons (clean-requirement line)
>                   result))))

Agreed and fixed, thanks.

>> Also, how could the above code be expressed using "match"? I'm using
>> predicates which tests for (special) characters in a string; I don't see
>> how the more primitive pattern language of "match" will enable me to do
>> the same.
>
> “match” has support for predicates, so you could do something like this:
>
>     (match line
>      ((or (eof-object) (? section-header?))
>       (reverse result))
>      ((or '() (? comment?))
>       (loop result))
>      (_ (loop (cons (clean-requirement line) result))))

Oh, that's neat! I had no idea that predicates could be used with
"match".  '() would need to be replaced by "" to match the empty
string.  Another gotcha with "match", is that the "or" seems to evaluate
every component, no matter if a early true condition was found; this
resulted in the following error:

--8<---------------cut here---------------start------------->8---
+ (wrong-type-arg
+   "string-trim"
+   "Wrong type argument in position ~A (expecting ~A): ~S"
+   (1 "string" #<eof>)
+   (#<eof>))
result: FAIL
--8<---------------cut here---------------end--------------->8---

Due to the "(or (eof-object) (? section-header?)" match clause
evaluating the section-header? predicate despite the line being an EOF
character.

> This allows you to match “eof-object” and '() directly.  Whenever I see
> “string-null?” I think it might be better to “match” on the empty list
> directly.

string-null? and an empty list are not the same, unless I'm missing something.

> But really, that’s up to you.  I only feel strongly about avoiding “(if
> … (cond …))”.

Due to the problem mentioned above, I stayed with "cond".

Thanks!

Maxim

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-06-12  6:39   ` Ricardo Wurmus
@ 2019-06-16 14:29     ` Maxim Cournoyer
  2019-06-16 14:36       ` bug#24450: [PATCHv3] " Maxim Cournoyer
  0 siblings, 1 reply; 41+ messages in thread
From: Maxim Cournoyer @ 2019-06-16 14:29 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 24450

Hey Ricardo :-)

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> Hi Maxim,
>
>>>>    (call-with-input-file requires.txt
>>>>      (lambda (port)
>>>> -      (let loop ((result '()))
>>>> +      (let loop ((required-deps '())
>>>> +                 (test-deps '())
>>>> +                 (inside-test-section? #f)
>>>> +                 (optional? #f))
>>>>          (let ((line (read-line port)))
>>>> -          ;; Stop when a section is encountered, as sections contains optional
>>>> -          ;; (extra) requirements.  Non-optional requirements must appear
>>>> -          ;; before any section is defined.
>>>> -          (if (or (eof-object? line) (section-header? line))
>>>> +          (if (eof-object? line)
>>>>                ;; Duplicates can occur, since the same requirement can be
>>>>                ;; listed multiple times with different conditional markers, e.g.
>>>>                ;; pytest >= 3 ; python_version >= "3.3"
>>>>                ;; pytest < 3 ; python_version < "3.3"
>>>> -              (reverse (delete-duplicates result))
>>>> +              (map (compose reverse delete-duplicates)
>>>> +                   (list required-deps test-deps))
>>>
>>> Looks like a list of lists to me.  “delete-duplicates” now won’t delete
>>> a name that is in both “required-deps” as well as in “test-deps”.  Is
>>> this acceptable?
>>
>> It is acceptable, as this corner case cannot exist given the current
>> code (a requirement can exist in either required-deps or test-deps, but
>> never in both). It also doesn't make sense that a run time requirement
>> would also be listed as a test requirement, so that corner case is not
>> likely to exist in the future either.
>
> I mentioned it because I believe I’ve seen this in the past where the
> importer would return some of the same inputs as both regular inputs and
> test dependencies.

OK!

>>> Personally, I’m not a fan of using data structures for returning
>>> multiple values, because we can simply return multiple values.
>>
>> I thought the Guile supported multiple values return value would be
>> great here as well, but I've found that for this specific case here, a
>> list of lists worked better, since the two lists contain requirements to
>> be processed the same, which "map" can readily do (i.e. less ceremony is
>> required).
>
> “map” can also operate on more than one list at a time:
>
>     (call-with-values
>       (lambda ()
>         (values (list 1 2 3)
>                 (list 9 8 7)))
>       (lambda (a b) (map + a b)))
>
>     => (10 10 10)

That's what I meant by "requires more ceremony".  I can simply apply
"map" to the return value of the function and get what I need, rather
than having to use "values" in the callee, then "call-with-values" in
the caller and establish a binding for each list.
  
> Of course, it would be simpler to just use a single list of tagged
> items.

Do you feel strongly about it? I don't; I'm open to try to use a tagged
list if you feel this is worth it.

Maxim

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv3] Re: pypi importer outputs strange character series in optional dependency case.
  2019-06-16 14:29     ` Maxim Cournoyer
@ 2019-06-16 14:36       ` Maxim Cournoyer
  2019-07-02  1:54         ` Maxim Cournoyer
  0 siblings, 1 reply; 41+ messages in thread
From: Maxim Cournoyer @ 2019-06-16 14:36 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 24450


[-- Attachment #1.1: Type: text/plain, Size: 105 bytes --]

Here's the current patch set, version 3, with the modifications
as discussed in the previous exchanges:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.2: 0001-import-pypi-Do-not-consider-requirements.txt-files.patch --]
[-- Type: text/x-patch, Size: 6551 bytes --]

From 215d0bde103e7e8df0d1d1df81965e973f38783a Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:00 -0400
Subject: [PATCH 01/12] import: pypi: Do not consider requirements.txt files.

PyPI packages are mandated to have a setup.py file, which contains a listing
of the required dependencies.  The setuptools/distutils machinery embed
metadata in the archives they produce, which contains this information. There
is no need nor gain to collect the requirements from a "requirements.txt"
file, as it is not the true record of dependencies for PyPI packages and may
contain extraneous requirements or not exist at all.

* guix/import/pypi.scm (guess-requirements): Update comment.
[guess-requirements-from-source]: Do not attempt to parse the file
requirements.txt.  Streamline logic.
* tests/pypi.scm (test-requires.txt): Rename from test-requirements, to hint
at the file being tested.
("pypi->guix-package"): Adapt so that the fake package contains a requires.txt
file rather than a requirements.txt file.
("pypi->guix-package, wheels"): Likewise.
---
 guix/import/pypi.scm | 35 +++++++++++++----------------------
 tests/pypi.scm       | 23 ++++++++++++-----------
 2 files changed, 25 insertions(+), 33 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 3a20fc4b9b..8269aa61d7 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -206,35 +206,26 @@ cannot determine package dependencies"))
           (call-with-temporary-directory
            (lambda (dir)
              (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
-                    (req-files (list (string-append dirname "/requirements.txt")
-                                     (string-append dirname "/" pypi-name ".egg-info"
-                                                    "/requires.txt")))
-                    (exit-codes (map (lambda (file-name)
-                                       (parameterize ((current-error-port (%make-void-port "rw+"))
-                                                      (current-output-port (%make-void-port "rw+")))
-                                         (system* "tar" "xf" tarball "-C" dir file-name)))
-                                     req-files)))
-               ;; Only one of these files needs to exist.
-               (if (any zero? exit-codes)
-                   (match (find-files dir)
-                     ((file . _)
-                      (read-requirements file))
-                     (()
-                      (warning (G_ "No requirements file found.\n"))))
+                    (requires.txt (string-append dirname "/" pypi-name
+                                                 ".egg-info" "/requires.txt"))
+                    (exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
+                                              (current-output-port (%make-void-port "rw+")))
+                                 (system* "tar" "xf" tarball "-C" dir requires.txt))))
+               (if (zero? exit-code)
+                   (read-requirements (string-append dir "/" requires.txt))
                    (begin
-                     (warning (G_ "Failed to extract requirements files\n"))
+                     (warning
+                      (G_ "Failed to extract file: ~a from source.~%")
+                      requires.txt)
                      '())))))
           '())))
 
-  ;; First, try to compute the requirements using the wheel, since that is the
-  ;; most reliable option. If a wheel is not provided for this package, try
-  ;; getting them by reading either the "requirements.txt" file or the
-  ;; "requires.txt" from the egg-info directory from the source tarball. Note
-  ;; that "requirements.txt" is not mandatory, so this is likely to fail.
+  ;; First, try to compute the requirements using the wheel, else, fallback to
+  ;; reading the "requires.txt" from the egg-info directory from the source
+  ;; tarball.
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
-
 (define (compute-inputs source-url wheel-url tarball)
   "Given the SOURCE-URL of an already downloaded TARBALL, return a list of
 name/variable pairs describing the required inputs of this package.  Also
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 6daa44a6e7..a0271fffad 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -23,7 +23,7 @@
   #:use-module (gcrypt hash)
   #:use-module (guix tests)
   #:use-module (guix build-system python)
-  #:use-module ((guix build utils) #:select (delete-file-recursively which))
+  #:use-module ((guix build utils) #:select (delete-file-recursively which mkdir-p))
   #:use-module (srfi srfi-64)
   #:use-module (ice-9 match))
 
@@ -55,11 +55,12 @@
 (define test-source-hash
   "")
 
-(define test-requirements
-"# A comment
+(define test-requires.txt "\
+# A comment
  # A comment after a space
 bar
-baz > 13.37")
+baz > 13.37
+")
 
 (define test-metadata
   "{
@@ -107,10 +108,10 @@ baz > 13.37")
              (match url
                ("https://example.com/foo-1.0.0.tar.gz"
                 (begin
-                  (mkdir "foo-1.0.0")
-                  (with-output-to-file "foo-1.0.0/requirements.txt"
+                  (mkdir-p "foo-1.0.0/foo.egg-info/")
+                  (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                     (lambda ()
-                      (display test-requirements)))
+                      (display test-requires.txt)))
                   (system* "tar" "czvf" file-name "foo-1.0.0/")
                   (delete-file-recursively "foo-1.0.0")
                   (set! test-source-hash
@@ -157,11 +158,11 @@ baz > 13.37")
          (lambda (url file-name)
            (match url
              ("https://example.com/foo-1.0.0.tar.gz"
-               (begin
-                 (mkdir "foo-1.0.0")
-                 (with-output-to-file "foo-1.0.0/requirements.txt"
+              (begin
+                (mkdir-p "foo-1.0.0/foo.egg-info/")
+                (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                    (lambda ()
-                     (display test-requirements)))
+                     (display test-requires.txt)))
                  (system* "tar" "czvf" file-name "foo-1.0.0/")
                  (delete-file-recursively "foo-1.0.0")
                  (set! test-source-hash
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.3: 0001-import-pypi-Do-not-parse-optional-requirements-from-.patch --]
[-- Type: text/x-patch, Size: 6516 bytes --]

From 22da422c3cea1f6d05a11123dd201c1ec54c9ff8 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:00 -0400
Subject: [PATCH] import: pypi: Do not parse optional requirements from source.

* guix/import/pypi.scm: Export PARSE-REQUIRES.TXT.
(clean-requirement): Move procedure to the top level.
(guess-requirements): Move the READ-REQUIREMENTS procedure to the top level,
and rename it to PARSE-REQUIRES.TXT.  Move the CLEAN-REQUIREMENT procedure to
the top level.  Move the COMMENT? functions inside the PARSE-REQUIRES.TXT
procedure.
(parse-requires.txt): Add a SECTION-HEADER? predicate, and use it to prevent
parsing optional requirements.

* tests/pypi.scm (test-requires-with-sections): New variable.
("parse-requires.txt, with sections"): New test.
---
 guix/import/pypi.scm | 74 ++++++++++++++++++++++++++------------------
 tests/pypi.scm       | 14 +++++++++
 2 files changed, 58 insertions(+), 30 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 8269aa61d7..d9db876222 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -47,7 +47,8 @@
   #:use-module (guix upstream)
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
-  #:export (guix-package->pypi-name
+  #:export (parse-requires.txt
+            guix-package->pypi-name
             pypi-recursive-import
             pypi->guix-package
             %pypi-updater))
@@ -117,6 +118,47 @@ package definition."
     ((package-inputs ...)
      `((propagated-inputs (,'quasiquote ,package-inputs))))))
 
+(define (clean-requirement s)
+  ;; Given a requirement LINE, as can be found in a setuptools requires.txt
+  ;; file, remove everything other than the actual name of the required
+  ;; package, and return it.
+  (cond
+   ((string-index s (char-set #\space #\> #\= #\<)) => (cut string-take s <>))
+   (else s)))
+
+(define (parse-requires.txt requires.txt)
+  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
+requirement names."
+  ;; This is a very incomplete parser, whose job is to select the non-optional
+  ;; dependencies and strip them out of any version information.
+  ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
+  ;; library and the requirements grammar defined by PEP-0508
+  ;; (https://www.python.org/dev/peps/pep-0508/).
+
+  (define (comment? line)
+    ;; Return #t if the given LINE is a comment, #f otherwise.
+    (string-prefix? "#" (string-trim line)))
+
+  (define (section-header? line)
+    ;; Return #t if the given LINE is a section header, #f otherwise.
+    (string-prefix? "[" (string-trim line)))
+
+  (call-with-input-file requires.txt
+    (lambda (port)
+      (let loop ((result '()))
+        (let ((line (read-line port)))
+          ;; Stop when a section is encountered, as sections contain optional
+          ;; (extra) requirements.  Non-optional requirements must appear
+          ;; before any section is defined.
+          (if (or (eof-object? line) (section-header? line))
+              (reverse result)
+              (cond
+               ((or (string-null? line) (comment? line))
+                (loop result))
+               (else
+                (loop (cons (clean-requirement line)
+                            result))))))))))
+
 (define (guess-requirements source-url wheel-url tarball)
   "Given SOURCE-URL, WHEEL-URL and a TARBALL of the package, return a list
 of the required packages specified in the requirements.txt file.  TARBALL will
@@ -139,34 +181,6 @@ be extracted in a temporary directory."
 cannot determine package dependencies"))
           #f)))))
 
-  (define (clean-requirement s)
-    ;; Given a requirement LINE, as can be found in a Python requirements.txt
-    ;; file, remove everything other than the actual name of the required
-    ;; package, and return it.
-    (string-take s
-      (or (string-index s (lambda (chr) (member chr '(#\space #\> #\= #\<))))
-          (string-length s))))
-
-  (define (comment? line)
-    ;; Return #t if the given LINE is a comment, #f otherwise.
-    (eq? (string-ref (string-trim line) 0) #\#))
-
-  (define (read-requirements requirements-file)
-    ;; Given REQUIREMENTS-FILE, a Python requirements.txt file, return a list
-    ;; of name/variable pairs describing the requirements.
-    (call-with-input-file requirements-file
-      (lambda (port)
-        (let loop ((result '()))
-          (let ((line (read-line port)))
-            (if (eof-object? line)
-                result
-                (cond
-                 ((or (string-null? line) (comment? line))
-                  (loop result))
-                 (else
-                  (loop (cons (clean-requirement line)
-                              result))))))))))
-
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
     ;; requirements.
@@ -212,7 +226,7 @@ cannot determine package dependencies"))
                                               (current-output-port (%make-void-port "rw+")))
                                  (system* "tar" "xf" tarball "-C" dir requires.txt))))
                (if (zero? exit-code)
-                   (read-requirements (string-append dir "/" requires.txt))
+                   (parse-requires.txt (string-append dir "/" requires.txt))
                    (begin
                      (warning
                       (G_ "Failed to extract file: ~a from source.~%")
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 6df69073dc..03455ba6be 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -62,6 +62,14 @@ bar
 baz > 13.37
 ")
 
+(define test-requires-with-sections "\
+foo ~= 3
+bar != 2
+
+[test]
+pytest (>=2.5.0)
+")
+
 (define test-metadata
   "{
   \"run_requires\": [
@@ -101,6 +109,12 @@ baz > 13.37
                     (uri (list "https://bitheap.org/cram/cram-0.7.tar.gz"
                                (pypi-uri "cram" "0.7"))))))))
 
+(test-equal "parse-requires.txt, with sections"
+  '("foo" "bar")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-requires.txt test-requires-with-sections)))
+
 (test-assert "pypi->guix-package"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.4: 0002-tests-pypi-Mute-the-output-of-tar.patch --]
[-- Type: text/x-patch, Size: 1827 bytes --]

From 17419a5f9572b7b886544531324b7b2ec431555e Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Mon, 10 Jun 2019 12:20:08 +0900
Subject: [PATCH 02/12] tests: pypi: Mute the output of tar.

The output of tar when creating archives for the purpose of tests is not
useful, so we mute it.

* tests/pypi.scm ("pypi->guix-package"): Mute the output of tar.
("pypi->guix-package, wheels"): Likewise.
---
 tests/pypi.scm | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tests/pypi.scm b/tests/pypi.scm
index a0271fffad..6df69073dc 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -112,7 +112,8 @@ baz > 13.37
                   (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                     (lambda ()
                       (display test-requires.txt)))
-                  (system* "tar" "czvf" file-name "foo-1.0.0/")
+                  (parameterize ((current-output-port (%make-void-port "rw+")))
+                    (system* "tar" "czvf" file-name "foo-1.0.0/"))
                   (delete-file-recursively "foo-1.0.0")
                   (set! test-source-hash
                     (call-with-input-file file-name port-sha256))))
@@ -163,7 +164,8 @@ baz > 13.37
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                    (lambda ()
                      (display test-requires.txt)))
-                 (system* "tar" "czvf" file-name "foo-1.0.0/")
+                (parameterize ((current-output-port (%make-void-port "rw+")))
+                  (system* "tar" "czvf" file-name "foo-1.0.0/"))
                  (delete-file-recursively "foo-1.0.0")
                  (set! test-source-hash
                        (call-with-input-file file-name port-sha256))))
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.5: 0003-import-pypi-Do-not-parse-optional-requirements-from-.patch --]
[-- Type: text/x-patch, Size: 6524 bytes --]

From 22da422c3cea1f6d05a11123dd201c1ec54c9ff8 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:00 -0400
Subject: [PATCH 03/12] import: pypi: Do not parse optional requirements from
 source.

* guix/import/pypi.scm: Export PARSE-REQUIRES.TXT.
(clean-requirement): Move procedure to the top level.
(guess-requirements): Move the READ-REQUIREMENTS procedure to the top level,
and rename it to PARSE-REQUIRES.TXT.  Move the CLEAN-REQUIREMENT procedure to
the top level.  Move the COMMENT? functions inside the PARSE-REQUIRES.TXT
procedure.
(parse-requires.txt): Add a SECTION-HEADER? predicate, and use it to prevent
parsing optional requirements.

* tests/pypi.scm (test-requires-with-sections): New variable.
("parse-requires.txt, with sections"): New test.
---
 guix/import/pypi.scm | 74 ++++++++++++++++++++++++++------------------
 tests/pypi.scm       | 14 +++++++++
 2 files changed, 58 insertions(+), 30 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 8269aa61d7..d9db876222 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -47,7 +47,8 @@
   #:use-module (guix upstream)
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
-  #:export (guix-package->pypi-name
+  #:export (parse-requires.txt
+            guix-package->pypi-name
             pypi-recursive-import
             pypi->guix-package
             %pypi-updater))
@@ -117,6 +118,47 @@ package definition."
     ((package-inputs ...)
      `((propagated-inputs (,'quasiquote ,package-inputs))))))
 
+(define (clean-requirement s)
+  ;; Given a requirement LINE, as can be found in a setuptools requires.txt
+  ;; file, remove everything other than the actual name of the required
+  ;; package, and return it.
+  (cond
+   ((string-index s (char-set #\space #\> #\= #\<)) => (cut string-take s <>))
+   (else s)))
+
+(define (parse-requires.txt requires.txt)
+  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
+requirement names."
+  ;; This is a very incomplete parser, whose job is to select the non-optional
+  ;; dependencies and strip them out of any version information.
+  ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
+  ;; library and the requirements grammar defined by PEP-0508
+  ;; (https://www.python.org/dev/peps/pep-0508/).
+
+  (define (comment? line)
+    ;; Return #t if the given LINE is a comment, #f otherwise.
+    (string-prefix? "#" (string-trim line)))
+
+  (define (section-header? line)
+    ;; Return #t if the given LINE is a section header, #f otherwise.
+    (string-prefix? "[" (string-trim line)))
+
+  (call-with-input-file requires.txt
+    (lambda (port)
+      (let loop ((result '()))
+        (let ((line (read-line port)))
+          ;; Stop when a section is encountered, as sections contain optional
+          ;; (extra) requirements.  Non-optional requirements must appear
+          ;; before any section is defined.
+          (if (or (eof-object? line) (section-header? line))
+              (reverse result)
+              (cond
+               ((or (string-null? line) (comment? line))
+                (loop result))
+               (else
+                (loop (cons (clean-requirement line)
+                            result))))))))))
+
 (define (guess-requirements source-url wheel-url tarball)
   "Given SOURCE-URL, WHEEL-URL and a TARBALL of the package, return a list
 of the required packages specified in the requirements.txt file.  TARBALL will
@@ -139,34 +181,6 @@ be extracted in a temporary directory."
 cannot determine package dependencies"))
           #f)))))
 
-  (define (clean-requirement s)
-    ;; Given a requirement LINE, as can be found in a Python requirements.txt
-    ;; file, remove everything other than the actual name of the required
-    ;; package, and return it.
-    (string-take s
-      (or (string-index s (lambda (chr) (member chr '(#\space #\> #\= #\<))))
-          (string-length s))))
-
-  (define (comment? line)
-    ;; Return #t if the given LINE is a comment, #f otherwise.
-    (eq? (string-ref (string-trim line) 0) #\#))
-
-  (define (read-requirements requirements-file)
-    ;; Given REQUIREMENTS-FILE, a Python requirements.txt file, return a list
-    ;; of name/variable pairs describing the requirements.
-    (call-with-input-file requirements-file
-      (lambda (port)
-        (let loop ((result '()))
-          (let ((line (read-line port)))
-            (if (eof-object? line)
-                result
-                (cond
-                 ((or (string-null? line) (comment? line))
-                  (loop result))
-                 (else
-                  (loop (cons (clean-requirement line)
-                              result))))))))))
-
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
     ;; requirements.
@@ -212,7 +226,7 @@ cannot determine package dependencies"))
                                               (current-output-port (%make-void-port "rw+")))
                                  (system* "tar" "xf" tarball "-C" dir requires.txt))))
                (if (zero? exit-code)
-                   (read-requirements (string-append dir "/" requires.txt))
+                   (parse-requires.txt (string-append dir "/" requires.txt))
                    (begin
                      (warning
                       (G_ "Failed to extract file: ~a from source.~%")
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 6df69073dc..03455ba6be 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -62,6 +62,14 @@ bar
 baz > 13.37
 ")
 
+(define test-requires-with-sections "\
+foo ~= 3
+bar != 2
+
+[test]
+pytest (>=2.5.0)
+")
+
 (define test-metadata
   "{
   \"run_requires\": [
@@ -101,6 +109,12 @@ baz > 13.37
                     (uri (list "https://bitheap.org/cram/cram-0.7.tar.gz"
                                (pypi-uri "cram" "0.7"))))))))
 
+(test-equal "parse-requires.txt, with sections"
+  '("foo" "bar")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-requires.txt test-requires-with-sections)))
+
 (test-assert "pypi->guix-package"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.6: 0004-import-pypi-Improve-parsing-of-requirement-specifica.patch --]
[-- Type: text/x-patch, Size: 6147 bytes --]

From bbdfb0ff2cc0347df73bb21b1443d6ef1f138a43 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:01 -0400
Subject: [PATCH 04/12] import: pypi: Improve parsing of requirement
 specifications.

The previous solution was fragile and could leave unwanted characters in a
requirement name, such as '[' or ']'.

Partially fixes <https://bugs.gnu.org/33047>.

* guix/import/pypi.scm (use-modules): Export SPECIFICATION->REQUIREMENT-NAME
(%requirement-name-regexp): New variable.
(clean-requirement): Rename to...
(specification->requirement-name): this, which now uses
%requirement-name-regexp to select the requirement name from the requirement
specification.
(parse-requires.txt): Adapt.
---
 guix/import/pypi.scm | 54 ++++++++++++++++++++++++++++++++------------
 tests/pypi.scm       | 12 ++++++++++
 2 files changed, 52 insertions(+), 14 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index d9db876222..6a881bda12 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -48,6 +48,7 @@
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
   #:export (parse-requires.txt
+            specification->requirement-name
             guix-package->pypi-name
             pypi-recursive-import
             pypi->guix-package
@@ -118,22 +119,47 @@ package definition."
     ((package-inputs ...)
      `((propagated-inputs (,'quasiquote ,package-inputs))))))
 
-(define (clean-requirement s)
-  ;; Given a requirement LINE, as can be found in a setuptools requires.txt
-  ;; file, remove everything other than the actual name of the required
-  ;; package, and return it.
-  (cond
-   ((string-index s (char-set #\space #\> #\= #\<)) => (cut string-take s <>))
-   (else s)))
+(define %requirement-name-regexp
+  ;; Regexp to match the requirement name in a requirement specification.
+
+  ;; Some grammar, taken from PEP-0508 (see:
+  ;; https://www.python.org/dev/peps/pep-0508/).
+
+  ;; Using this grammar makes the PEP-0508 regexp easier to understand for
+  ;; humans.  The use of a regexp is preferred to more primitive string
+  ;; manipulations because we can more directly match what upstream uses
+  ;; (again, per PEP-0508).  The regexp approach is also easier to extend,
+  ;; should we want to implement more completely the grammar of PEP-0508.
+
+  ;; The unified rule can be expressed as:
+  ;; specification = wsp* ( url_req | name_req ) wsp*
+
+  ;; where url_req is:
+  ;; url_req = name wsp* extras? wsp* urlspec wsp+ quoted_marker?
+
+  ;; and where name_req is:
+  ;; name_req = name wsp* extras? wsp* versionspec? wsp* quoted_marker?
+
+  ;; Thus, we need only matching NAME, which is expressed as:
+  ;; identifer_end = letterOrDigit | (('-' | '_' | '.' )* letterOrDigit)
+  ;; identifier    = letterOrDigit identifier_end*
+  ;; name          = identifier
+  (let* ((letter-or-digit "[A-Za-z0-9]")
+         (identifier-end (string-append "(" letter-or-digit "|"
+                                        "[-_.]*" letter-or-digit ")"))
+         (identifier (string-append "^" letter-or-digit identifier-end "*"))
+         (name identifier))
+    (make-regexp name)))
+
+(define (specification->requirement-name spec)
+  "Given a specification SPEC, return the requirement name."
+  (match:substring
+   (or (regexp-exec %requirement-name-regexp spec)
+       (error (G_ "Could not extract requirement name in spec:") spec))))
 
 (define (parse-requires.txt requires.txt)
   "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
 requirement names."
-  ;; This is a very incomplete parser, whose job is to select the non-optional
-  ;; dependencies and strip them out of any version information.
-  ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
-  ;; library and the requirements grammar defined by PEP-0508
-  ;; (https://www.python.org/dev/peps/pep-0508/).
 
   (define (comment? line)
     ;; Return #t if the given LINE is a comment, #f otherwise.
@@ -156,7 +182,7 @@ requirement names."
                ((or (string-null? line) (comment? line))
                 (loop result))
                (else
-                (loop (cons (clean-requirement line)
+                (loop (cons (specification->requirement-name line)
                             result))))))))))
 
 (define (guess-requirements source-url wheel-url tarball)
@@ -198,7 +224,7 @@ cannot determine package dependencies"))
                                             (hash-ref (list-ref run_requires 0)
                                                        "requires")
                                             '())))
-                     (map clean-requirement requirements)))))
+                     (map specification->requirement-name requirements)))))
              (lambda ()
                (delete-file json-file)
                (rmdir dirname))))))
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 03455ba6be..c40be6c21d 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -55,6 +55,14 @@
 (define test-source-hash
   "")
 
+(define test-specifications
+  '("Fizzy [foo, bar]"
+    "PickyThing<1.6,>1.9,!=1.9.6,<2.0a0,==2.4c1"
+    "SomethingWithMarker[foo]>1.0;python_version<\"2.7\""
+    "requests [security,tests] >= 2.8.1, == 2.8.* ; python_version < \"2.7\""
+    "pip @ https://github.com/pypa/pip/archive/1.3.1.zip#\
+sha1=da9234ee9982d4bbb3c72346a6de940a148ea686"))
+
 (define test-requires.txt "\
 # A comment
  # A comment after a space
@@ -109,6 +117,10 @@ pytest (>=2.5.0)
                     (uri (list "https://bitheap.org/cram/cram-0.7.tar.gz"
                                (pypi-uri "cram" "0.7"))))))))
 
+(test-equal "specification->requirement-name"
+  '("Fizzy" "PickyThing" "SomethingWithMarker" "requests" "pip")
+  (map specification->requirement-name test-specifications))
+
 (test-equal "parse-requires.txt, with sections"
   '("foo" "bar")
   (mock ((ice-9 ports) call-with-input-file
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.7: 0005-import-pypi-Deduplicate-requirements.patch --]
[-- Type: text/x-patch, Size: 1255 bytes --]

From 475d5e483e7bfdae8db271fc513172c4f05e7358 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:01 -0400
Subject: [PATCH 05/12] import: pypi: Deduplicate requirements.

* guix/import/pypi.scm (parse-requires.txt): Remove potential duplicates.
---
 guix/import/pypi.scm | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 6a881bda12..ad59a8b731 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -177,7 +177,11 @@ requirement names."
           ;; (extra) requirements.  Non-optional requirements must appear
           ;; before any section is defined.
           (if (or (eof-object? line) (section-header? line))
-              (reverse result)
+              ;; Duplicates can occur, since the same requirement can be
+              ;; listed multiple times with different conditional markers, e.g.
+              ;; pytest >= 3 ; python_version >= "3.3"
+              ;; pytest < 3 ; python_version < "3.3"
+              (reverse (delete-duplicates result))
               (cond
                ((or (string-null? line) (comment? line))
                 (loop result))
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.8: 0006-import-pypi-Support-more-types-of-archives.patch --]
[-- Type: text/x-patch, Size: 10481 bytes --]

From 49b2232b6a3421e3b7d3b59be5b965c5bb11392e Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:02 -0400
Subject: [PATCH 06/12] import: pypi: Support more types of archives.

This change enables the PyPI importer to look for requirements in a source
archive of a different type than "tar.gz" or "tar.bz2".  Also, scan the source
archive to find a requires.txt file.

* guix/import/pypi.scm: (guess-requirements)[tarball-directory]: Remove procedure.
[guess-requirements-from-source]: Use COMRESSED-FILE? to determine if an
archive type is supported, and some file extension logic that chooses either
"tar" or "unzip" as the extractor.  Search for the requires.txt file in the
archive instead of using a static, expected location.
(guess-requirements): Rename the TARBALL argument to ARCHIVE, to denote the
archive format is no longer bound specifically to the Tar format.
(compute-inputs): Likewise.
* tests/pypi.scm ("pypi->guix-package, no wheel"): Mock the requires.txt at a
non-standard location.
("pypi->guix-package, no usable requirement file."): New test.
---
 guix/import/pypi.scm | 77 +++++++++++++++++++-------------------------
 tests/pypi.scm       | 52 ++++++++++++++++++++++++++++--
 2 files changed, 83 insertions(+), 46 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index ad59a8b731..a6106ab4ec 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -39,7 +39,8 @@
   #:use-module ((guix build utils)
                 #:select ((package-name->name+version
                            . hyphen-package-name->name+version)
-                          find-files))
+                          find-files
+                          invoke))
   #:use-module (guix import utils)
   #:use-module ((guix download) #:prefix download:)
   #:use-module (guix import json)
@@ -189,28 +190,11 @@ requirement names."
                 (loop (cons (specification->requirement-name line)
                             result))))))))))
 
-(define (guess-requirements source-url wheel-url tarball)
-  "Given SOURCE-URL, WHEEL-URL and a TARBALL of the package, return a list
-of the required packages specified in the requirements.txt file.  TARBALL will
+(define (guess-requirements source-url wheel-url archive)
+  "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
+of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
-  (define (tarball-directory url)
-    ;; Given the URL of the package's tarball, return the name of the directory
-    ;; that will be created upon decompressing it. If the filetype is not
-    ;; supported, return #f.
-    ;; TODO: Support more archive formats.
-    (let ((basename (substring url (+ 1 (string-rindex url #\/)))))
-      (cond
-       ((string-suffix? ".tar.gz" basename)
-        (string-drop-right basename 7))
-       ((string-suffix? ".tar.bz2" basename)
-        (string-drop-right basename 8))
-       (else
-        (begin
-          (warning (G_ "Unsupported archive format: \
-cannot determine package dependencies"))
-          #f)))))
-
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
     ;; requirements.
@@ -239,29 +223,34 @@ cannot determine package dependencies"))
     (call-with-temporary-output-file
      (lambda (temp port)
        (if wheel-url
-         (and (url-fetch wheel-url temp)
-              (read-wheel-metadata temp))
-         #f))))
+           (and (url-fetch wheel-url temp)
+                (read-wheel-metadata temp))
+           #f))))
 
   (define (guess-requirements-from-source)
     ;; Return the package's requirements by guessing them from the source.
-    (let ((dirname (tarball-directory source-url)))
-      (if (string? dirname)
-          (call-with-temporary-directory
-           (lambda (dir)
-             (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
-                    (requires.txt (string-append dirname "/" pypi-name
-                                                 ".egg-info" "/requires.txt"))
-                    (exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
-                                              (current-output-port (%make-void-port "rw+")))
-                                 (system* "tar" "xf" tarball "-C" dir requires.txt))))
-               (if (zero? exit-code)
-                   (parse-requires.txt (string-append dir "/" requires.txt))
-                   (begin
-                     (warning
-                      (G_ "Failed to extract file: ~a from source.~%")
-                      requires.txt)
-                     '())))))
+    (if (compressed-file? source-url)
+        (call-with-temporary-directory
+         (lambda (dir)
+           (parameterize ((current-error-port (%make-void-port "rw+"))
+                          (current-output-port (%make-void-port "rw+")))
+             (if (string=? "zip" (file-extension source-url))
+                 (invoke "unzip" archive "-d" dir)
+                 (invoke "tar" "xf" archive "-C" dir)))
+           (let ((requires.txt-files
+                  (find-files dir (lambda (abs-file-name _)
+		                    (string-match "\\.egg-info/requires.txt$"
+                                                  abs-file-name)))))
+             (match requires.txt-files
+               (()
+                (warning (G_ "Cannot guess requirements from source archive:\
+ no requires.txt file found.~%"))
+                '())
+               (else (parse-requires.txt (first requires.txt-files)))))))
+        (begin
+          (warning (G_ "Unsupported archive format; \
+cannot determine package dependencies from source archive: ~a~%")
+                   (basename source-url))
           '())))
 
   ;; First, try to compute the requirements using the wheel, else, fallback to
@@ -270,13 +259,13 @@ cannot determine package dependencies"))
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
-(define (compute-inputs source-url wheel-url tarball)
-  "Given the SOURCE-URL of an already downloaded TARBALL, return a list of
+(define (compute-inputs source-url wheel-url archive)
+  "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
 name/variable pairs describing the required inputs of this package.  Also
 return the unaltered list of upstream dependency names."
   (let ((dependencies
          (remove (cut string=? "argparse" <>)
-                 (guess-requirements source-url wheel-url tarball))))
+                 (guess-requirements source-url wheel-url archive))))
     (values (sort
              (map (lambda (input)
                     (let ((guix-name (python->package-name input)))
diff --git a/tests/pypi.scm b/tests/pypi.scm
index c40be6c21d..b45d2c9d2f 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -20,6 +20,7 @@
 (define-module (test-pypi)
   #:use-module (guix import pypi)
   #:use-module (guix base32)
+  #:use-module (guix memoization)
   #:use-module (gcrypt hash)
   #:use-module (guix tests)
   #:use-module (guix build-system python)
@@ -134,8 +135,9 @@ pytest (>=2.5.0)
              (match url
                ("https://example.com/foo-1.0.0.tar.gz"
                 (begin
-                  (mkdir-p "foo-1.0.0/foo.egg-info/")
-                  (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
+                  ;; Unusual requires.txt location should still be found.
+                  (mkdir-p "foo-1.0.0/src/bizarre.egg-info")
+                  (with-output-to-file "foo-1.0.0/src/bizarre.egg-info/requires.txt"
                     (lambda ()
                       (display test-requires.txt)))
                   (parameterize ((current-output-port (%make-void-port "rw+")))
@@ -241,4 +243,50 @@ pytest (>=2.5.0)
                 (x
                  (pk 'fail x #f))))))
 
+(test-assert "pypi->guix-package, no usable requirement file."
+  ;; Replace network resources with sample data.
+  (mock ((guix import utils) url-fetch
+         (lambda (url file-name)
+           (match url
+             ("https://example.com/foo-1.0.0.tar.gz"
+              (mkdir-p "foo-1.0.0/foo.egg-info/")
+              (parameterize ((current-output-port (%make-void-port "rw+")))
+                (system* "tar" "czvf" file-name "foo-1.0.0/"))
+              (delete-file-recursively "foo-1.0.0")
+              (set! test-source-hash
+                (call-with-input-file file-name port-sha256)))
+             ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
+             (_ (error "Unexpected URL: " url)))))
+        (mock ((guix http-client) http-fetch
+               (lambda (url . rest)
+                 (match url
+                   ("https://pypi.org/pypi/foo/json"
+                    (values (open-input-string test-json)
+                            (string-length test-json)))
+                   ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
+                   (_ (error "Unexpected URL: " url)))))
+              ;; Not clearing the memoization cache here would mean returning the value
+              ;; computed in the previous test.
+              (invalidate-memoization! pypi->guix-package)
+              (match (pypi->guix-package "foo")
+                (('package
+                   ('name "python-foo")
+                   ('version "1.0.0")
+                   ('source ('origin
+                              ('method 'url-fetch)
+                              ('uri ('pypi-uri "foo" 'version))
+                              ('sha256
+                               ('base32
+                                (? string? hash)))))
+                   ('build-system 'python-build-system)
+                   ('home-page "http://example.com")
+                   ('synopsis "summary")
+                   ('description "summary")
+                   ('license 'license:lgpl2.0))
+                 (string=? (bytevector->nix-base32-string
+                            test-source-hash)
+                           hash))
+                (x
+                 (pk 'fail x #f))))))
+
 (test-end "pypi")
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.9: 0007-import-pypi-Parse-wheel-METADATA-instead-of-metadata.patch --]
[-- Type: text/x-patch, Size: 10948 bytes --]

From 3a0082de1bed3eca6dedf71dec34efd0457bd58f Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:03 -0400
Subject: [PATCH 07/12] import: pypi: Parse wheel METADATA instead of
 metadata.json.

With newer Wheel releases, there is no more metadata.json file; the METADATA
file should be used instead (see: https://github.com/pypa/wheel/issues/195).

This change updates our PyPI importer so that it uses the latter.

* guix/import/pypi.scm (define-module): Remove unnecessary modules and export
the PARSE-WHEEL-METADATA procedure.
(parse-wheel-metadata): Add procedure.
(guess-requirements): Use it.
* tests/pypi.scm (test-metadata): Test it.
---
 guix/import/pypi.scm | 90 +++++++++++++++++++++++++++-----------------
 tests/pypi.scm       | 60 ++++++++++++++++++++++-------
 2 files changed, 101 insertions(+), 49 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index a6106ab4ec..7cf1e92101 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -21,9 +21,7 @@
 ;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
 
 (define-module (guix import pypi)
-  #:use-module (ice-9 binary-ports)
   #:use-module (ice-9 match)
-  #:use-module (ice-9 pretty-print)
   #:use-module (ice-9 regex)
   #:use-module (ice-9 receive)
   #:use-module ((ice-9 rdelim) #:select (read-line))
@@ -31,9 +29,6 @@
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
   #:use-module (srfi srfi-35)
-  #:use-module (rnrs bytevectors)
-  #:use-module (json)
-  #:use-module (web uri)
   #:use-module (guix ui)
   #:use-module (guix utils)
   #:use-module ((guix build utils)
@@ -49,6 +44,7 @@
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
   #:export (parse-requires.txt
+            parse-wheel-metadata
             specification->requirement-name
             guix-package->pypi-name
             pypi-recursive-import
@@ -177,18 +173,49 @@ requirement names."
           ;; Stop when a section is encountered, as sections contain optional
           ;; (extra) requirements.  Non-optional requirements must appear
           ;; before any section is defined.
-          (if (or (eof-object? line) (section-header? line))
-              ;; Duplicates can occur, since the same requirement can be
-              ;; listed multiple times with different conditional markers, e.g.
-              ;; pytest >= 3 ; python_version >= "3.3"
-              ;; pytest < 3 ; python_version < "3.3"
-              (reverse (delete-duplicates result))
-              (cond
-               ((or (string-null? line) (comment? line))
-                (loop result))
-               (else
-                (loop (cons (specification->requirement-name line)
-                            result))))))))))
+          (cond
+           ((or (eof-object? line) (section-header? line))
+            ;; Duplicates can occur, since the same requirement can be
+            ;; listed multiple times with different conditional markers, e.g.
+            ;; pytest >= 3 ; python_version >= "3.3"
+            ;; pytest < 3 ; python_version < "3.3"
+            (reverse (delete-duplicates result)))
+           ((or (string-null? line) (comment? line))
+            (loop result))
+           (else
+            (loop (cons (specification->requirement-name line)
+                        result)))))))))
+
+(define (parse-wheel-metadata metadata)
+  "Given METADATA, a Wheel metadata file, return a list of requirement names."
+  ;; METADATA is a RFC-2822-like, header based file.
+
+  (define (requires-dist-header? line)
+    ;; Return #t if the given LINE is a Requires-Dist header.
+    (string-match "^Requires-Dist: " line))
+
+  (define (requires-dist-value line)
+    (string-drop line (string-length "Requires-Dist: ")))
+
+  (define (extra? line)
+    ;; Return #t if the given LINE is an "extra" requirement.
+    (string-match "extra == '(.*)'" line))
+
+  (call-with-input-file metadata
+    (lambda (port)
+      (let loop ((requirements '()))
+        (let ((line (read-line port)))
+          ;; Stop at the first 'Provides-Extra' section: the non-optional
+          ;; requirements appear before the optional ones.
+          (cond
+           ((eof-object? line)
+            (reverse (delete-duplicates requirements)))
+           ((and (requires-dist-header? line) (not (extra? line)))
+            (loop (cons (specification->requirement-name
+                         (requires-dist-value line))
+                        requirements)))
+           (else
+            (loop requirements))))))))
 
 (define (guess-requirements source-url wheel-url archive)
   "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
@@ -197,25 +224,18 @@ be extracted in a temporary directory."
 
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
-    ;; requirements.
+    ;; requirements, or #f if the metadata file contained therein couldn't be
+    ;; extracted.
     (let* ((dirname (wheel-url->extracted-directory wheel-url))
-           (json-file (string-append dirname "/metadata.json")))
-      (and (zero? (system* "unzip" "-q" wheel-archive json-file))
-           (dynamic-wind
-             (const #t)
-             (lambda ()
-               (call-with-input-file json-file
-                 (lambda (port)
-                   (let* ((metadata (json->scm port))
-                          (run_requires (hash-ref metadata "run_requires"))
-                          (requirements (if run_requires
-                                            (hash-ref (list-ref run_requires 0)
-                                                       "requires")
-                                            '())))
-                     (map specification->requirement-name requirements)))))
-             (lambda ()
-               (delete-file json-file)
-               (rmdir dirname))))))
+           (metadata (string-append dirname "/METADATA")))
+      (call-with-temporary-directory
+       (lambda (dir)
+         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
+             (parse-wheel-metadata (string-append dir "/" metadata))
+             (begin
+               (warning
+                (G_ "Failed to extract file: ~a from wheel.~%") metadata)
+               #f))))))
 
   (define (guess-requirements-from-wheel)
     ;; Return the package's requirements using the wheel, or #f if an error
diff --git a/tests/pypi.scm b/tests/pypi.scm
index b45d2c9d2f..8b42c2f071 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -22,6 +22,7 @@
   #:use-module (guix base32)
   #:use-module (guix memoization)
   #:use-module (gcrypt hash)
+  #:use-module (guix memoization)
   #:use-module (guix tests)
   #:use-module (guix build-system python)
   #:use-module ((guix build utils) #:select (delete-file-recursively which mkdir-p))
@@ -79,17 +80,33 @@ bar != 2
 pytest (>=2.5.0)
 ")
 
-(define test-metadata
-  "{
-  \"run_requires\": [
-    {
-      \"requires\": [
-        \"bar\",
-        \"baz (>13.37)\"
-      ]
-    }
-  ]
-}")
+(define test-metadata "\
+Classifier: Programming Language :: Python :: 3.7
+Requires-Dist: baz ~= 3
+Requires-Dist: bar != 2
+Provides-Extra: test
+pytest (>=2.5.0)
+")
+
+(define test-metadata-with-extras "
+Classifier: Programming Language :: Python :: 3.7
+Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*
+Requires-Dist: wrapt (<2,>=1)
+Requires-Dist: bar
+
+Provides-Extra: dev
+Requires-Dist: tox ; extra == 'dev'
+Requires-Dist: bumpversion (<1) ; extra == 'dev'
+")
+
+;;; Provides-Extra can appear before Requires-Dist.
+(define test-metadata-with-extras-jedi "\
+Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*
+Provides-Extra: testing
+Requires-Dist: parso (>=0.3.0)
+Provides-Extra: testing
+Requires-Dist: pytest (>=3.1.0); extra == 'testing'
+")
 
 (test-begin "pypi")
 
@@ -128,6 +145,18 @@ pytest (>=2.5.0)
          call-with-input-string)
         (parse-requires.txt test-requires-with-sections)))
 
+(test-equal "parse-wheel-metadata, with extras"
+  '("wrapt" "bar")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-wheel-metadata test-metadata-with-extras)))
+
+(test-equal "parse-wheel-metadata, with extras - Jedi"
+  '("parso")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-wheel-metadata test-metadata-with-extras-jedi)))
+
 (test-assert "pypi->guix-package"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
@@ -191,7 +220,7 @@ pytest (>=2.5.0)
                 (mkdir-p "foo-1.0.0/foo.egg-info/")
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                    (lambda ()
-                     (display test-requires.txt)))
+                     (display "wrong data to make sure we're testing wheels ")))
                 (parameterize ((current-output-port (%make-void-port "rw+")))
                   (system* "tar" "czvf" file-name "foo-1.0.0/"))
                  (delete-file-recursively "foo-1.0.0")
@@ -200,13 +229,13 @@ pytest (>=2.5.0)
              ("https://example.com/foo-1.0.0-py2.py3-none-any.whl"
                (begin
                  (mkdir "foo-1.0.0.dist-info")
-                 (with-output-to-file "foo-1.0.0.dist-info/metadata.json"
+                 (with-output-to-file "foo-1.0.0.dist-info/METADATA"
                    (lambda ()
                      (display test-metadata)))
                  (let ((zip-file (string-append file-name ".zip")))
                    ;; zip always adds a "zip" extension to the file it creates,
                    ;; so we need to rename it.
-                   (system* "zip" zip-file "foo-1.0.0.dist-info/metadata.json")
+                   (system* "zip" zip-file "foo-1.0.0.dist-info/METADATA")
                    (rename-file zip-file file-name))
                  (delete-file-recursively "foo-1.0.0.dist-info")))
              (_ (error "Unexpected URL: " url)))))
@@ -218,6 +247,9 @@ pytest (>=2.5.0)
                             (string-length test-json)))
                    ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
                    (_ (error "Unexpected URL: " url)))))
+              ;; Not clearing the memoization cache here would mean returning the value
+              ;; computed in the previous test.
+              (invalidate-memoization! pypi->guix-package)
               (match (pypi->guix-package "foo")
                 (('package
                    ('name "python-foo")
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.10: 0008-import-pypi-Fix-typo-in-docstring.patch --]
[-- Type: text/x-patch, Size: 958 bytes --]

From 04e119d50d8622bf5bf8f1a6c6c80cd51e3e1658 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Wed, 12 Jun 2019 11:34:23 +0900
Subject: [PATCH 08/12] import: pypi: Fix typo in docstring.

* guix/import/pypi.scm (guess-requirements): Fix typo.
---
 guix/import/pypi.scm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 7cf1e92101..d861dd960d 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -218,7 +218,7 @@ requirement names."
             (loop requirements))))))))
 
 (define (guess-requirements source-url wheel-url archive)
-  "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
+  "Given SOURCE-URL, WHEEL-URL and an ARCHIVE of the package, return a list
 of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.11: 0009-import-pypi-Completely-mute-the-output-of-the-unzip-.patch --]
[-- Type: text/x-patch, Size: 1236 bytes --]

From 141551b4dae1f7d9920495eddfd85d49e3a45262 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Wed, 12 Jun 2019 11:36:39 +0900
Subject: [PATCH 09/12] import: pypi: Completely mute the output of the "unzip"
 command.

* guix/import/pypi.scm (guess-requirements): Completely mute the output of the
"unzip" command.
---
 guix/import/pypi.scm | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index d861dd960d..23a1e69061 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -230,7 +230,10 @@ be extracted in a temporary directory."
            (metadata (string-append dirname "/METADATA")))
       (call-with-temporary-directory
        (lambda (dir)
-         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
+         (if (zero?
+              (parameterize ((current-error-port (%make-void-port "rw+"))
+                             (current-output-port (%make-void-port "rw+")))
+                (system* "unzip" wheel-archive "-d" dir metadata)))
              (parse-wheel-metadata (string-append dir "/" metadata))
              (begin
                (warning
-- 
2.21.0


[-- Attachment #1.12: 0010-import-pypi-Include-optional-test-inputs-as-native-i.patch --]
[-- Type: text/x-patch, Size: 19751 bytes --]

From b435bb15a1905fee13fcee5423a6d2dc2300fa9a Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 23:12:26 -0400
Subject: [PATCH 10/12] import: pypi: Include optional test inputs as
 native-inputs.

* guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
(test-section?): New predicate.
(parse-requires.txt): Collect the optional test inputs, and return them as the
second element of the returned list.
(parse-wheel-metadata): Likewise.
(guess-requirements): Adapt.
(make-pypi-sexp): Likewise, and include the test inputs requirements as native
inputs in the returned package expression.

* tests/pypi.scm (test-requires.txt): Include a test section in the
test-requires.txt data.
(test-requires.txt-beaker): New variable.
("parse-requires.txt"): Adapt.
("parse-requires.txt - Beaker"): New test.
("parse-wheel-metadata, with extras"): Adapt.
("parse-wheel-metadata, with extras - Jedi"): Adapt.
("pypi->guix-package, no wheel"): Re-indent, and add the expected
native-inputs.
("pypi->guix-package, wheels"): Likewise.
---
 guix/import/pypi.scm | 177 ++++++++++++++++++++++++++++---------------
 tests/pypi.scm       |  79 ++++++++++++-------
 2 files changed, 166 insertions(+), 90 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 23a1e69061..537431dd69 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -4,6 +4,7 @@
 ;;; Copyright © 2015, 2016, 2017 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2017 Mathieu Othacehe <m.othacehe@gmail.com>
 ;;; Copyright © 2018 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -26,6 +27,7 @@
   #:use-module (ice-9 receive)
   #:use-module ((ice-9 rdelim) #:select (read-line))
   #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-11)
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
   #:use-module (srfi srfi-35)
@@ -107,14 +109,15 @@ package on PyPI."
     ((name version _ ...)
      (string-append name "-" version ".dist-info"))))
 
-(define (maybe-inputs package-inputs)
+(define (maybe-inputs package-inputs input-type)
   "Given a list of PACKAGE-INPUTS, tries to generate the 'inputs' field of a
-package definition."
+package definition.  INPUT-TYPE, a symbol, is used to populate the name of
+the input field."
   (match package-inputs
     (()
      '())
     ((package-inputs ...)
-     `((propagated-inputs (,'quasiquote ,package-inputs))))))
+     `((,input-type (,'quasiquote ,package-inputs))))))
 
 (define %requirement-name-regexp
   ;; Regexp to match the requirement name in a requirement specification.
@@ -154,9 +157,19 @@ package definition."
    (or (regexp-exec %requirement-name-regexp spec)
        (error (G_ "Could not extract requirement name in spec:") spec))))
 
+(define (test-section? name)
+  "Return #t if the section name contains 'test' or 'dev'."
+  (any (cut string-contains-ci name <>)
+       '("test" "dev")))
+
 (define (parse-requires.txt requires.txt)
-  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
-requirement names."
+  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of lists
+of requirements.
+
+The first list contains the required dependencies while the second the
+optional test dependencies.  Note that currently, optional, non-test
+dependencies are omitted since these can be difficult or expensive to
+satisfy."
 
   (define (comment? line)
     ;; Return #t if the given LINE is a comment, #f otherwise.
@@ -168,26 +181,49 @@ requirement names."
 
   (call-with-input-file requires.txt
     (lambda (port)
-      (let loop ((result '()))
+      (let loop ((required-deps '())
+                 (test-deps '())
+                 (inside-test-section? #f)
+                 (optional? #f))
         (let ((line (read-line port)))
-          ;; Stop when a section is encountered, as sections contain optional
-          ;; (extra) requirements.  Non-optional requirements must appear
-          ;; before any section is defined.
           (cond
-           ((or (eof-object? line) (section-header? line))
+           ((eof-object? line)
             ;; Duplicates can occur, since the same requirement can be
             ;; listed multiple times with different conditional markers, e.g.
             ;; pytest >= 3 ; python_version >= "3.3"
             ;; pytest < 3 ; python_version < "3.3"
-            (reverse (delete-duplicates result)))
+            (map (compose reverse delete-duplicates)
+                 (list required-deps test-deps)))
            ((or (string-null? line) (comment? line))
-            (loop result))
-           (else
+            (loop required-deps test-deps inside-test-section? optional?))
+           ((section-header? line)
+            ;; Encountering a section means that all the requirements
+            ;; listed below are optional. Since we want to pick only the
+            ;; test dependencies from the optional dependencies, we must
+            ;; track those separately.
+            (loop required-deps test-deps (test-section? line) #t))
+           (inside-test-section?
+            (loop required-deps
+                  (cons (specification->requirement-name line)
+                        test-deps)
+                  inside-test-section? optional?))
+           ((not optional?)
             (loop (cons (specification->requirement-name line)
-                        result)))))))))
+                        required-deps)
+                  test-deps inside-test-section? optional?))
+           (optional?
+            ;; Skip optional items.
+            (loop required-deps test-deps inside-test-section? optional?))
+           (else
+            (warning (G_ "parse-requires.txt reached an unexpected \
+condition on line ~a~%") line))))))))
 
 (define (parse-wheel-metadata metadata)
-  "Given METADATA, a Wheel metadata file, return a list of requirement names."
+  "Given METADATA, a Wheel metadata file, return a list of lists of
+requirements.
+
+Refer to the documentation of PARSE-REQUIRES.TXT for a description of the
+returned value."
   ;; METADATA is a RFC-2822-like, header based file.
 
   (define (requires-dist-header? line)
@@ -201,21 +237,29 @@ requirement names."
     ;; Return #t if the given LINE is an "extra" requirement.
     (string-match "extra == '(.*)'" line))
 
+  (define (test-requirement? line)
+    (and=> (match:substring (extra? line) 1) test-section?))
+
   (call-with-input-file metadata
     (lambda (port)
-      (let loop ((requirements '()))
+      (let loop ((required-deps '())
+                 (test-deps '()))
         (let ((line (read-line port)))
-          ;; Stop at the first 'Provides-Extra' section: the non-optional
-          ;; requirements appear before the optional ones.
           (cond
            ((eof-object? line)
-            (reverse (delete-duplicates requirements)))
+            (map (compose reverse delete-duplicates)
+                 (list required-deps test-deps)))
            ((and (requires-dist-header? line) (not (extra? line)))
             (loop (cons (specification->requirement-name
                          (requires-dist-value line))
-                        requirements)))
+                        required-deps)
+                  test-deps))
+           ((and (requires-dist-header? line) (test-requirement? line))
+            (loop required-deps
+                  (cons (specification->requirement-name (requires-dist-value line))
+                        test-deps)))
            (else
-            (loop requirements))))))))
+            (loop required-deps test-deps)))))))) ;skip line
 
 (define (guess-requirements source-url wheel-url archive)
   "Given SOURCE-URL, WHEEL-URL and an ARCHIVE of the package, return a list
@@ -268,37 +312,46 @@ be extracted in a temporary directory."
                (()
                 (warning (G_ "Cannot guess requirements from source archive:\
  no requires.txt file found.~%"))
-                '())
+                (list '() '()))
                (else (parse-requires.txt (first requires.txt-files)))))))
         (begin
           (warning (G_ "Unsupported archive format; \
 cannot determine package dependencies from source archive: ~a~%")
                    (basename source-url))
-          '())))
+          (list '() '()))))
 
   ;; First, try to compute the requirements using the wheel, else, fallback to
   ;; reading the "requires.txt" from the egg-info directory from the source
-  ;; tarball.
+  ;; archive.
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
 (define (compute-inputs source-url wheel-url archive)
-  "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
-name/variable pairs describing the required inputs of this package.  Also
+  "Given the SOURCE-URL and WHEEL-URL of an already downloaded ARCHIVE, return
+a pair of lists, each consisting of a list of name/variable pairs, for the
+propagated inputs and the native inputs, respectively.  Also
 return the unaltered list of upstream dependency names."
-  (let ((dependencies
-         (remove (cut string=? "argparse" <>)
-                 (guess-requirements source-url wheel-url archive))))
-    (values (sort
-             (map (lambda (input)
-                    (let ((guix-name (python->package-name input)))
-                      (list guix-name (list 'unquote (string->symbol guix-name)))))
-                  dependencies)
-             (lambda args
-               (match args
-                 (((a _ ...) (b _ ...))
-                  (string-ci<? a b)))))
-            dependencies)))
+
+  (define (strip-argparse deps)
+    (remove (cut string=? "argparse" <>) deps))
+
+  (define (requirement->package-name/sort deps)
+    (sort
+     (map (lambda (input)
+            (let ((guix-name (python->package-name input)))
+              (list guix-name (list 'unquote (string->symbol guix-name)))))
+          deps)
+     (lambda args
+       (match args
+         (((a _ ...) (b _ ...))
+          (string-ci<? a b))))))
+
+  (define process-requirements
+    (compose requirement->package-name/sort strip-argparse))
+
+  (let ((dependencies (guess-requirements source-url wheel-url archive)))
+    (values (map process-requirements dependencies)
+            (concatenate dependencies))))
 
 (define (make-pypi-sexp name version source-url wheel-url home-page synopsis
                         description license)
@@ -307,29 +360,31 @@ VERSION, SOURCE-URL, HOME-PAGE, SYNOPSIS, DESCRIPTION, and LICENSE."
   (call-with-temporary-output-file
    (lambda (temp port)
      (and (url-fetch source-url temp)
-          (receive (input-package-names upstream-dependency-names)
+          (receive (guix-dependencies upstream-dependencies)
               (compute-inputs source-url wheel-url temp)
-            (values
-             `(package
-                (name ,(python->package-name name))
-                (version ,version)
-                (source (origin
-                          (method url-fetch)
-
-                          ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
-                          ;; cases in NAME, for instance, as is the case with
-                          ;; "uwsgi".  In that case, fall back to a full URL.
-                          (uri (pypi-uri ,(string-downcase name) version))
-                          (sha256
-                           (base32
-                            ,(guix-hash-url temp)))))
-                (build-system python-build-system)
-                ,@(maybe-inputs input-package-names)
-                (home-page ,home-page)
-                (synopsis ,synopsis)
-                (description ,description)
-                (license ,(license->symbol license)))
-             upstream-dependency-names))))))
+            (match guix-dependencies
+              ((required-inputs test-inputs)
+               (values
+                `(package
+                   (name ,(python->package-name name))
+                   (version ,version)
+                   (source (origin
+                             (method url-fetch)
+                             ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
+                             ;; cases in NAME, for instance, as is the case with
+                             ;; "uwsgi".  In that case, fall back to a full URL.
+                             (uri (pypi-uri ,(string-downcase name) version))
+                             (sha256
+                              (base32
+                               ,(guix-hash-url temp)))))
+                   (build-system python-build-system)
+                   ,@(maybe-inputs required-inputs 'propagated-inputs)
+                   ,@(maybe-inputs test-inputs 'native-inputs)
+                   (home-page ,home-page)
+                   (synopsis ,synopsis)
+                   (description ,description)
+                   (license ,(license->symbol license)))
+                upstream-dependencies))))))))
 
 (define pypi->guix-package
   (memoize
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 8b42c2f071..43d45f1dd8 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -1,6 +1,7 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2014 David Thompson <davet@gnu.org>
 ;;; Copyright © 2016 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -68,11 +69,6 @@ sha1=da9234ee9982d4bbb3c72346a6de940a148ea686"))
 (define test-requires.txt "\
 # A comment
  # A comment after a space
-bar
-baz > 13.37
-")
-
-(define test-requires-with-sections "\
 foo ~= 3
 bar != 2
 
@@ -80,12 +76,25 @@ bar != 2
 pytest (>=2.5.0)
 ")
 
+;; Beaker contains only optional dependencies.
+(define test-requires.txt-beaker "\
+[crypto]
+pycryptopp>=0.5.12
+
+[cryptography]
+cryptography
+
+[testsuite]
+Mock
+coverage
+")
+
 (define test-metadata "\
 Classifier: Programming Language :: Python :: 3.7
 Requires-Dist: baz ~= 3
 Requires-Dist: bar != 2
 Provides-Extra: test
-pytest (>=2.5.0)
+Requires-Dist: pytest (>=2.5.0) ; extra == 'test'
 ")
 
 (define test-metadata-with-extras "
@@ -139,25 +148,31 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
   '("Fizzy" "PickyThing" "SomethingWithMarker" "requests" "pip")
   (map specification->requirement-name test-specifications))
 
-(test-equal "parse-requires.txt, with sections"
-  '("foo" "bar")
+(test-equal "parse-requires.txt"
+  (list '("foo" "bar") '("pytest"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
-        (parse-requires.txt test-requires-with-sections)))
+        (parse-requires.txt test-requires.txt)))
+
+(test-equal "parse-requires.txt - Beaker"
+  (list '() '("Mock" "coverage"))
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-requires.txt test-requires.txt-beaker)))
 
 (test-equal "parse-wheel-metadata, with extras"
-  '("wrapt" "bar")
+  (list '("wrapt" "bar") '("tox" "bumpversion"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
         (parse-wheel-metadata test-metadata-with-extras)))
 
 (test-equal "parse-wheel-metadata, with extras - Jedi"
-  '("parso")
+  (list '("parso") '("pytest"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
         (parse-wheel-metadata test-metadata-with-extras-jedi)))
 
-(test-assert "pypi->guix-package"
+(test-assert "pypi->guix-package, no wheel"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
            (lambda (url file-name)
@@ -198,7 +213,10 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                      ('propagated-inputs
                       ('quasiquote
                        (("python-bar" ('unquote 'python-bar))
-                        ("python-baz" ('unquote 'python-baz)))))
+                        ("python-foo" ('unquote 'python-foo)))))
+                     ('native-inputs
+                      ('quasiquote
+                       (("python-pytest" ('unquote 'python-pytest)))))
                      ('home-page "http://example.com")
                      ('synopsis "summary")
                      ('description "summary")
@@ -219,25 +237,25 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
               (begin
                 (mkdir-p "foo-1.0.0/foo.egg-info/")
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
-                   (lambda ()
-                     (display "wrong data to make sure we're testing wheels ")))
+                  (lambda ()
+                    (display "wrong data to make sure we're testing wheels ")))
                 (parameterize ((current-output-port (%make-void-port "rw+")))
                   (system* "tar" "czvf" file-name "foo-1.0.0/"))
-                 (delete-file-recursively "foo-1.0.0")
-                 (set! test-source-hash
-                       (call-with-input-file file-name port-sha256))))
+                (delete-file-recursively "foo-1.0.0")
+                (set! test-source-hash
+                  (call-with-input-file file-name port-sha256))))
              ("https://example.com/foo-1.0.0-py2.py3-none-any.whl"
-               (begin
-                 (mkdir "foo-1.0.0.dist-info")
-                 (with-output-to-file "foo-1.0.0.dist-info/METADATA"
-                   (lambda ()
-                     (display test-metadata)))
-                 (let ((zip-file (string-append file-name ".zip")))
-                   ;; zip always adds a "zip" extension to the file it creates,
-                   ;; so we need to rename it.
-                   (system* "zip" zip-file "foo-1.0.0.dist-info/METADATA")
-                   (rename-file zip-file file-name))
-                 (delete-file-recursively "foo-1.0.0.dist-info")))
+              (begin
+                (mkdir "foo-1.0.0.dist-info")
+                (with-output-to-file "foo-1.0.0.dist-info/METADATA"
+                  (lambda ()
+                    (display test-metadata)))
+                (let ((zip-file (string-append file-name ".zip")))
+                  ;; zip always adds a "zip" extension to the file it creates,
+                  ;; so we need to rename it.
+                  (system* "zip" "-q" zip-file "foo-1.0.0.dist-info/METADATA")
+                  (rename-file zip-file file-name))
+                (delete-file-recursively "foo-1.0.0.dist-info")))
              (_ (error "Unexpected URL: " url)))))
         (mock ((guix http-client) http-fetch
                (lambda (url . rest)
@@ -265,6 +283,9 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                     ('quasiquote
                      (("python-bar" ('unquote 'python-bar))
                       ("python-baz" ('unquote 'python-baz)))))
+                   ('native-inputs
+                    ('quasiquote
+                     (("python-pytest" ('unquote 'python-pytest)))))
                    ('home-page "http://example.com")
                    ('synopsis "summary")
                    ('description "summary")
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.13: 0011-import-pypi-Update-the-host-URI.patch --]
[-- Type: text/x-patch, Size: 1043 bytes --]

From 736e91e7c1d9d7e7c6f93342766fdcd4b78cbfa1 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Sun, 16 Jun 2019 14:52:25 +0900
Subject: [PATCH 11/12] import: pypi: Update the host URI.

* guix/build-system/python.scm (pypi-uri): Update the host URI to
"files.pythonhosted.org".
---
 guix/build-system/python.scm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/guix/build-system/python.scm b/guix/build-system/python.scm
index b753940bad..e39c06528e 100644
--- a/guix/build-system/python.scm
+++ b/guix/build-system/python.scm
@@ -50,7 +50,7 @@
   "Return a URI string for the Python package hosted on the Python Package
 Index (PyPI) corresponding to NAME and VERSION.  EXTENSION is the file name
 extension, such as '.tar.gz'."
-  (string-append "https://pypi.org/packages/source/"
+  (string-append "https://files.pythonhosted.org/packages/source/"
                  (string-take name 1) "/" name "/"
                  name "-" version extension))
 
-- 
2.21.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.14: 0012-import-pypi-Preserve-package-name-case-when-forming-.patch --]
[-- Type: text/x-patch, Size: 2535 bytes --]

From 5373f50088cd0d37e1dbc64e6ca505224a452d8e Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Sat, 30 Mar 2019 20:27:35 -0400
Subject: [PATCH 12/12] import: pypi: Preserve package name case when forming
 pypi-uri.

Fixes <https://bugs.gnu.org/33046>.

* guix/build-system/python.scm (pypi-uri): Update the host URI to
"files.pythonhosted.org".
* guix/import/pypi.scm (make-pypi-sexp): Preserve the package name case when
the source URL calls for it.
---
 guix/import/pypi.scm | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 537431dd69..ab7a024ee0 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -368,15 +368,20 @@ VERSION, SOURCE-URL, HOME-PAGE, SYNOPSIS, DESCRIPTION, and LICENSE."
                 `(package
                    (name ,(python->package-name name))
                    (version ,version)
-                   (source (origin
-                             (method url-fetch)
-                             ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
-                             ;; cases in NAME, for instance, as is the case with
-                             ;; "uwsgi".  In that case, fall back to a full URL.
-                             (uri (pypi-uri ,(string-downcase name) version))
-                             (sha256
-                              (base32
-                               ,(guix-hash-url temp)))))
+                   (source
+                    (origin
+                      (method url-fetch)
+                      ;; PyPI URL are case sensitive, but sometimes a project
+                      ;; named using mixed case has a URL using lower case, so
+                      ;; we must work around this inconsistency.  For actual
+                      ;; examples, compare the URLs of the "Deprecated" and
+                      ;; "uWSGI" PyPI packages.
+                      (uri ,(if (string-contains source-url name)
+                                `(pypi-uri ,name version)
+                                `(pypi-uri ,(string-downcase name) version)))
+                      (sha256
+                       (base32
+                        ,(guix-hash-url temp)))))
                    (build-system python-build-system)
                    ,@(maybe-inputs required-inputs 'propagated-inputs)
                    ,@(maybe-inputs test-inputs 'native-inputs)
-- 
2.21.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* bug#24450: pypi importer outputs strange character series in optional dependency case.
  2019-03-29  4:24 ` Maxim Cournoyer
@ 2019-06-16 17:02   ` ng0
  2019-06-26  4:12     ` Maxim Cournoyer
  0 siblings, 1 reply; 41+ messages in thread
From: ng0 @ 2019-06-16 17:02 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: ng0, 24450

Hi Maxim,

great to see this fixed. Weird it bumped into my view after
almost 3 years. I trust that other people will have tested
it, I'm all over the place and mostly doing NetBSD and GNUnet
now.
Sorry if I have written a reply before, the thread was
marked as unread here.
Maxim Cournoyer transcribed 3.5K bytes:
> ng0 <ng0@we.make.ritual.n0.is> writes:
> 
> > I think this should not happen with pypi import:
> >
> >   (inputs
> >     `(("python-certifi==2016.2.28"
> >        ,python-certifi==2016.2.28)
> >       ("python-dateutil==2.5.3"
> >        ,python-dateutil==2.5.3)
> >       ("python-flask-babel==0.11.1"
> >        ,python-flask-babel==0.11.1)
> >       ("python-flask==0.11.1" ,python-flask==0.11.1)
> >       ("python-lxml==3.6.0" ,python-lxml==3.6.0)
> >       ("python-ndg-httpsclient==0.4.1"
> >        ,python-ndg-httpsclient==0.4.1)
> >       ("python-pyasn1-modules==0.0.8"
> >        ,python-pyasn1-modules==0.0.8)
> >       ("python-pyasn1==0.1.9" ,python-pyasn1==0.1.9)
> >       ("python-pygments==2.1.3"
> >        ,python-pygments==2.1.3)
> >       ("python-pyopenssl==0.15.1"
> >        ,python-pyopenssl==0.15.1)
> >       ("python-pyyaml==3.11" ,python-pyyaml==3.11)
> >       ("python-requests[socks]==2.10.0"
> >        ,#{python-requests\x5b;socks\x5d;==2.10.0}#)
> >       ("python-setuptools" ,python-setuptools)))
> >
> >
> > I can understand the version numbers, I can also understand the optional
> > socks building/module of the python-requests, but why does it read like
> > Gobbledygook?  Can't we improve the output here?
> >
> > For version numbers, this is not a format which happened recently which
> > is exclusive for python build system right? This is just bad formated
> > because of the pypi query.
> > I will first try and not pin the application to these version numbers,
> > maybe itjustworks™.
> >
> >
> > To reproduce: "guix import pypi searx"
> 
> This would now give (change to be sent for review soon):
> 
> --8<---------------cut here---------------start------------->8---
> ./pre-inst-env guix import pypi searx
> 
> Starting download of /tmp/guix-file.1wD8K4
> From https://files.pythonhosted.org/packages/75/3f/5941ad2d500ff7cf6f8da1022c78013dcd2207941d533586a8e7bfe699d3/searx-0.15.0.tar.gz...
>  …5.0.tar.gz  1.6MiB                  729KiB/s 00:02 [##################] 100.0%
> (package
>   (name "python-searx")
>   (version "0.15.0")
>   (source
>     (origin
>       (method url-fetch)
>       (uri (pypi-uri "searx" version))
>       (sha256
>         (base32
>           "1gmww73q7wydkvlyz73wnr3sybpjn40wha7avnz9ak9m365zcjxf"))))
>   (build-system python-build-system)
>   (propagated-inputs
>     `(("python-certifi" ,python-certifi)
>       ("python-dateutil" ,python-dateutil)
>       ("python-flask" ,python-flask)
>       ("python-flask-babel" ,python-flask-babel)
>       ("python-idna" ,python-idna)
>       ("python-lxml" ,python-lxml)
>       ("python-pygments" ,python-pygments)
>       ("python-pyopenssl" ,python-pyopenssl)
>       ("python-pyyaml" ,python-pyyaml)
>       ("python-requests" ,python-requests)))
>   (native-inputs
>     `(("python-babel" ,python-babel)
>       ("python-cov-core" ,python-cov-core)
>       ("python-mock" ,python-mock)
>       ("python-nose2" ,python-nose2)
>       ("python-pep8" ,python-pep8)
>       ("python-plone.testing" ,python-plone.testing)
>       ("python-selenium" ,python-selenium)
>       ("python-splinter" ,python-splinter)
>       ("python-transifex-client"
>        ,python-transifex-client)
>       ("python-unittest2" ,python-unittest2)
>       ("python-zope.testrunner"
>        ,python-zope.testrunner)))
>   (home-page "https://github.com/asciimoo/searx")
>   (synopsis
>     "A privacy-respecting, hackable metasearch engine")
>   (description
>     "A privacy-respecting, hackable metasearch engine")
>   (license #f))
> --8<---------------cut here---------------end--------------->8---
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
  2019-06-16 14:11           ` Maxim Cournoyer
@ 2019-06-17  1:41             ` Ricardo Wurmus
  0 siblings, 0 replies; 41+ messages in thread
From: Ricardo Wurmus @ 2019-06-17  1:41 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 24450


Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

>> This allows you to match “eof-object” and '() directly.  Whenever I see
>> “string-null?” I think it might be better to “match” on the empty list
>> directly.
>
> string-null? and an empty list are not the same, unless I'm missing something.

Yes, sorry, I meant “null?”.  Using “string-null?” is equivalent to
matching the empty string, of course.

>> But really, that’s up to you.  I only feel strongly about avoiding “(if
>> … (cond …))”.
>
> Due to the problem mentioned above, I stayed with "cond".

Okay!  Thanks.

-- 
Ricardo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: pypi importer outputs strange character series in optional dependency case.
  2019-06-16 17:02   ` ng0
@ 2019-06-26  4:12     ` Maxim Cournoyer
  0 siblings, 0 replies; 41+ messages in thread
From: Maxim Cournoyer @ 2019-06-26  4:12 UTC (permalink / raw)
  To: ng0; +Cc: 24450

Hey ng0!

ng0 <ng0@we.make.ritual.n0.is> writes:

> Hi Maxim,
>
> great to see this fixed. Weird it bumped into my view after
> almost 3 years.

Yes! Bugs don't really expire, do they? :-)

> I trust that other people will have tested
> it, I'm all over the place and mostly doing NetBSD and GNUnet
> now.

I'm glad to know you're still involved with GNUnet :-)

> Sorry if I have written a reply before, the thread was
> marked as unread here.

No problem! I'll wait a few more days to allow for commenting some more,
then push.

Cheers!

Maxim

^ permalink raw reply	[flat|nested] 41+ messages in thread

* bug#24450: [PATCHv3] Re: pypi importer outputs strange character series in optional dependency case.
  2019-06-16 14:36       ` bug#24450: [PATCHv3] " Maxim Cournoyer
@ 2019-07-02  1:54         ` Maxim Cournoyer
  0 siblings, 0 replies; 41+ messages in thread
From: Maxim Cournoyer @ 2019-07-02  1:54 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 24450-done

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> Here's the current patch set, version 3, with the modifications
> as discussed in the previous exchanges [...]

I've now merged this patch series (version 3) into master, with top
commit 4b60ab8c006964d026dee8cf5f1260eba0b2bb81.

Closing.

Thank you!

Maxim

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2019-07-02  1:56 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-16 20:00 bug#24450: pypi importer outputs strange character series in optional dependency case ng0
2019-03-29  4:24 ` Maxim Cournoyer
2019-06-16 17:02   ` ng0
2019-06-26  4:12     ` Maxim Cournoyer
2019-03-29  4:34 ` bug#24450: [PATCH] " Maxim Cournoyer
2019-03-30  2:12   ` bug#24450: [PATCHv2] " T460s laptop
2019-03-31 14:40     ` bug#24450: [PATCH] " Maxim Cournoyer
2019-04-01 15:28     ` bug#24450: [PATCHv2] " Ludovic Courtès
2019-05-15 11:06 ` Ricardo Wurmus
2019-05-20  4:05   ` bug#24450: [PATCHv2] " Maxim Cournoyer
2019-05-20 15:05     ` Ludovic Courtès
2019-05-22  1:13       ` Maxim Cournoyer
2019-05-27 14:48     ` Ricardo Wurmus
2019-06-10  2:10       ` Maxim Cournoyer
2019-05-27 15:11     ` Ricardo Wurmus
2019-06-10  3:30       ` Maxim Cournoyer
2019-06-10  9:23         ` Ricardo Wurmus
2019-06-16 14:11           ` Maxim Cournoyer
2019-06-17  1:41             ` Ricardo Wurmus
2019-05-27 15:54     ` Ricardo Wurmus
2019-06-10  8:32       ` Maxim Cournoyer
2019-06-10  9:12         ` Ricardo Wurmus
2019-06-16  6:05           ` Maxim Cournoyer
2019-05-27 15:58     ` Ricardo Wurmus
2019-05-28 10:23     ` Ricardo Wurmus
2019-06-10 13:30       ` Maxim Cournoyer
2019-06-10 20:13         ` Ricardo Wurmus
2019-05-28 11:04     ` Ricardo Wurmus
2019-06-11  0:39       ` Maxim Cournoyer
2019-06-11 11:56         ` Ricardo Wurmus
2019-05-28 13:21     ` Ricardo Wurmus
2019-05-28 14:48     ` Ricardo Wurmus
2019-06-16  5:10       ` Maxim Cournoyer
2019-05-28 14:53     ` Ricardo Wurmus
2019-05-30  2:24       ` Maxim Cournoyer
2019-06-16  5:53       ` Maxim Cournoyer
2019-06-12  3:00 ` Maxim Cournoyer
2019-06-12  6:39   ` Ricardo Wurmus
2019-06-16 14:29     ` Maxim Cournoyer
2019-06-16 14:36       ` bug#24450: [PATCHv3] " Maxim Cournoyer
2019-07-02  1:54         ` Maxim Cournoyer

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).