unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / code / Atom feed
From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
To: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Cc: 24450@debbugs.gnu.org
Subject: bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
Date: Tue, 28 May 2019 15:21:56 +0200	[thread overview]
Message-ID: <idjzhn690jv.fsf@bimsb-sys02.mdc-berlin.net> (raw)
In-Reply-To: <87pnod7ot4.fsf@gmail.com>

Next up: Seven of Nine, tertiary adjunct of unimatrix zero one:

> From 37e499d5d5d5f690aa0a065c730e13f6a31dd30d Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Thu, 28 Mar 2019 23:12:26 -0400
> Subject: [PATCH 7/9] import: pypi: Include optional test inputs as
>  native-inputs.
>
> * guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
> (test-section?): New predicate.
> (parse-requires.txt): Collect the optional test inputs, and return them as the
> second element of the returned list.

AFAICT parse-requires.txt now returns a list of pairs, but used to
return a plain list before.  Is this correct?

>  (define (parse-requires.txt requires.txt)
> -  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
> -requirement names."
> -  ;; This is a very incomplete parser, which job is to select the non-optional
> -  ;; dependencies and strip them out of any version information.
> +  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a pair of requirements.
> +
> +The first element of the pair contains the required dependencies while the
> +second the optional test dependencies.  Note that currently, optional,
> +non-test dependencies are omitted since these can be difficult or expensive to
> +satisfy."
> +
> +  ;; This is a very incomplete parser, which job is to read in the requirement
> +  ;; specification lines, and strip them out of any version information.
>    ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
>    ;; library and the requirements grammar defined by PEP-0508
>    ;; (https://www.python.org/dev/peps/pep-0508/).

Does it really return a pair?  Or a list of pairs?  Or is it a
two-element list of lists?

>    (call-with-input-file requires.txt
>      (lambda (port)
> -      (let loop ((result '()))
> +      (let loop ((required-deps '())
> +                 (test-deps '())
> +                 (inside-test-section? #f)
> +                 (optional? #f))
>          (let ((line (read-line port)))
> -          ;; Stop when a section is encountered, as sections contains optional
> -          ;; (extra) requirements.  Non-optional requirements must appear
> -          ;; before any section is defined.
> -          (if (or (eof-object? line) (section-header? line))
> +          (if (eof-object? line)
>                ;; Duplicates can occur, since the same requirement can be
>                ;; listed multiple times with different conditional markers, e.g.
>                ;; pytest >= 3 ; python_version >= "3.3"
>                ;; pytest < 3 ; python_version < "3.3"
> -              (reverse (delete-duplicates result))
> +              (map (compose reverse delete-duplicates)
> +                   (list required-deps test-deps))

Looks like a list of lists to me.  “delete-duplicates” now won’t delete
a name that is in both “required-deps” as well as in “test-deps”.  Is
this acceptable?

Personally, I’m not a fan of using data structures for returning
multiple values, because we can simply return multiple values.

Or we could have more than just strings.  The meaning of these strings
is provided by the bin into which they are thrown — either
“required-deps” or “test-deps”.  It could be an option to collect tagged
values instead and have the caller deal with filtering.

>  (define (parse-wheel-metadata metadata)
> -  "Given METADATA, a Wheel metadata file, return a list of requirement names."
> +  "Given METADATA, a Wheel metadata file, return a pair of requirements.
> +
> +The first element of the pair contains the required dependencies while the second the optional
> +test dependencies.  Note that currently, optional, non-test dependencies are
> +omitted since these can be difficult or expensive to satisfy."
>    ;; METADATA is a RFC-2822-like, header based file.

This sounds like this is going to duplicate the previous procedures.

>    (define (requires-dist-header? line)
>      ;; Return #t if the given LINE is a Requires-Dist header.
> -    (regexp-match? (string-match "^Requires-Dist: " line)))
> +    (string-match "^Requires-Dist: " line))
>
>    (define (requires-dist-value line)
>      (string-drop line (string-length "Requires-Dist: ")))
>
>    (define (extra? line)
>      ;; Return #t if the given LINE is an "extra" requirement.
> -    (regexp-match? (string-match "extra == " line)))
> +    (string-match "extra == '(.*)'" line))

These hunks should be part of the previous patch where they were
introduced.  (See my comments there about “regexp-match?”.)

> +  (define (test-requirement? line)
> +    (let ((extra-label (match:substring (extra? line) 1)))
> +      (and extra-label (test-section? extra-label))))

You can use “and=>” instead of binding a name:

    (and=> (match:substring (extra? line) 1) test-section?)

>    (call-with-input-file metadata
>      (lambda (port)
> -      (let loop ((requirements '()))
> +      (let loop ((required-deps '())
> +                 (test-deps '()))
>          (let ((line (read-line port)))
> -          ;; Stop at the first 'Provides-Extra' section: the non-optional
> -          ;; requirements appear before the optional ones.
>            (if (eof-object? line)
> -              (reverse (delete-duplicates requirements))
> +              (map (compose reverse delete-duplicates)
> +                   (list required-deps test-deps))
>                (cond
>                 ((and (requires-dist-header? line) (not (extra? line)))
>                  (loop (cons (specification->requirement-name
>                               (requires-dist-value line))
> -                            requirements)))
> +                            required-deps)
> +                      test-deps))
> +               ((and (requires-dist-header? line) (test-requirement? line))
> +                (loop required-deps
> +                      (cons (specification->requirement-name (requires-dist-value line))
> +                            test-deps)))
>                 (else
> -                (loop requirements)))))))))
> +                (loop required-deps test-deps))))))))) ;skip line

Could you refactor this so that the other parser can be reused?  If not,
the same comments about “if” and “cond” and the use of pairs/lists
instead of “values” applies here.

>  (define (guess-requirements source-url wheel-url archive)
> -  "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
> +  "Given SOURCE-URL, WHEEL-URL and an ARCHIVE of the package, return a list
>  of the required packages specified in the requirements.txt file.  ARCHIVE will
>  be extracted in a temporary directory."
>
> @@ -244,7 +289,10 @@ cannot determine package dependencies") (file-extension url))
>             (metadata (string-append dirname "/METADATA")))
>        (call-with-temporary-directory
>         (lambda (dir)
> -         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
> +         (if (zero?
> +              (parameterize ((current-error-port (%make-void-port "rw+"))
> +                             (current-output-port (%make-void-port "rw+")))
> +                (system* "unzip" wheel-archive "-d" dir metadata)))
>               (parse-wheel-metadata (string-append dir "/" metadata))
>               (begin
>                 (warning
> @@ -283,32 +331,41 @@ cannot determine package dependencies") (file-extension url))
>                       (warning
>                        (G_ "Failed to extract file: ~a from source.~%")
>                        requires.txt)
> -                     '())))))
> -          '())))
> +                     (list '() '()))))))
> +          (list '() '()))))

I would like to see cosmetic changes like these three hunks in separate
commits.

>  (define (compute-inputs source-url wheel-url archive)
> -  "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
> -name/variable pairs describing the required inputs of this package.  Also
> +  "Given the SOURCE-URL and WHEEL-URL of an already downloaded ARCHIVE, return
> +a pair of lists, each consisting of a list of name/variable pairs, for the
> +propagated inputs and the native inputs, respectively.  Also
>  return the unaltered list of upstream dependency names."
> -  (let ((dependencies
> -         (remove (cut string=? "argparse" <>)
> -                 (guess-requirements source-url wheel-url archive))))
> -    (values (sort
> -             (map (lambda (input)
> -                    (let ((guix-name (python->package-name input)))
> -                      (list guix-name (list 'unquote (string->symbol guix-name)))))
> -                  dependencies)
> -             (lambda args
> -               (match args
> -                 (((a _ ...) (b _ ...))
> -                  (string-ci<? a b)))))
> -            dependencies)))
> +
> +  (define (strip-argparse deps)
> +    (remove (cut string=? "argparse" <>) deps))
> +
> +  (define (requirement->package-name/sort deps)
> +    (sort
> +     (map (lambda (input)
> +            (let ((guix-name (python->package-name input)))
> +              (list guix-name (list 'unquote (string->symbol guix-name)))))
> +          deps)
> +     (lambda args
> +       (match args
> +         (((a _ ...) (b _ ...))
> +          (string-ci<? a b))))))
> +
> +  (define process-requirements
> +    (compose requirement->package-name/sort strip-argparse))
> +
> +  (let ((dependencies (guess-requirements source-url wheel-url archive)))
> +    (values (map process-requirements dependencies)
> +            (concatenate dependencies))))

Giving names to these processing steps is fine and improves clarity, but
I’m not so comfortable about returning ad-hoc pairs *and* multiple
values (like above).

> +            (match guix-dependencies
> +              ((required-inputs test-inputs)
> +               (values
> +                `(package
> +                   (name ,(python->package-name name))
> +                   (version ,version)
> +                   (source (origin
> +                             (method url-fetch)
> +                             ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
> +                             ;; cases in NAME, for instance, as is the case with
> +                             ;; "uwsgi".  In that case, fall back to a full URL.
> +                             (uri (pypi-uri ,(string-downcase name) version))
> +                             (sha256
> +                              (base32
> +                               ,(guix-hash-url temp)))))
> +                   (build-system python-build-system)
> +                   ,@(maybe-inputs required-inputs 'propagated-inputs)
> +                   ,@(maybe-inputs test-inputs 'native-inputs)
> +                   (home-page ,home-page)
> +                   (synopsis ,synopsis)
> +                   (description ,description)
> +                   (license ,(license->symbol license)))
> +                ;; Flatten the nested lists and return the upstream
> +                ;; dependencies.
> +                upstream-dependencies))))))))

I don’t see anything being flattened here?

--
Ricardo

  parent reply	other threads:[~2019-05-28 13:23 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-16 20:00 bug#24450: pypi importer outputs strange character series in optional dependency case ng0
2019-03-29  4:24 ` Maxim Cournoyer
2019-06-16 17:02   ` ng0
2019-06-26  4:12     ` Maxim Cournoyer
2019-03-29  4:34 ` bug#24450: [PATCH] " Maxim Cournoyer
2019-03-30  2:12   ` bug#24450: [PATCHv2] " T460s laptop
2019-03-31 14:40     ` bug#24450: [PATCH] " Maxim Cournoyer
2019-04-01 15:28     ` bug#24450: [PATCHv2] " Ludovic Courtès
2019-05-15 11:06 ` Ricardo Wurmus
2019-05-20  4:05   ` bug#24450: [PATCHv2] " Maxim Cournoyer
2019-05-20 15:05     ` Ludovic Courtès
2019-05-22  1:13       ` Maxim Cournoyer
2019-05-27 14:48     ` Ricardo Wurmus
2019-06-10  2:10       ` Maxim Cournoyer
2019-05-27 15:11     ` Ricardo Wurmus
2019-06-10  3:30       ` Maxim Cournoyer
2019-06-10  9:23         ` Ricardo Wurmus
2019-06-16 14:11           ` Maxim Cournoyer
2019-06-17  1:41             ` Ricardo Wurmus
2019-05-27 15:54     ` Ricardo Wurmus
2019-06-10  8:32       ` Maxim Cournoyer
2019-06-10  9:12         ` Ricardo Wurmus
2019-06-16  6:05           ` Maxim Cournoyer
2019-05-27 15:58     ` Ricardo Wurmus
2019-05-28 10:23     ` Ricardo Wurmus
2019-06-10 13:30       ` Maxim Cournoyer
2019-06-10 20:13         ` Ricardo Wurmus
2019-05-28 11:04     ` Ricardo Wurmus
2019-06-11  0:39       ` Maxim Cournoyer
2019-06-11 11:56         ` Ricardo Wurmus
2019-05-28 13:21     ` Ricardo Wurmus [this message]
2019-05-28 14:48     ` Ricardo Wurmus
2019-06-16  5:10       ` Maxim Cournoyer
2019-05-28 14:53     ` Ricardo Wurmus
2019-05-30  2:24       ` Maxim Cournoyer
2019-06-16  5:53       ` Maxim Cournoyer
2019-06-12  3:00 ` Maxim Cournoyer
2019-06-12  6:39   ` Ricardo Wurmus
2019-06-16 14:29     ` Maxim Cournoyer
2019-06-16 14:36       ` bug#24450: [PATCHv3] " Maxim Cournoyer
2019-07-02  1:54         ` Maxim Cournoyer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=idjzhn690jv.fsf@bimsb-sys02.mdc-berlin.net \
    --to=ricardo.wurmus@mdc-berlin.de \
    --cc=24450@debbugs.gnu.org \
    --cc=maxim.cournoyer@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).