From: Amirouche Boubekki <amirouche@hypermove.net>
To: Cyril Roelandt <tipecaml@gmail.com>
Cc: guix-devel@gnu.org, guix-devel-bounces+amirouche=hypermove.net@gnu.org
Subject: Re: [PATCH] import: pypi: Detect inputs.
Date: Thu, 18 Jun 2015 12:45:41 +0200 [thread overview]
Message-ID: <dcefa440e96b445bc3185f21b7f03629@hypermove.net> (raw)
In-Reply-To: <1434331554-13170-1-git-send-email-tipecaml@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 12655 bytes --]
Héllo,
If I'm not mistaken this patch relies only on the presence of
requirements.txt. This is not a required file in python packaging.
otherwise said, we miss a lot using this method. I think the best way to
do that would be to:
- download the package and extract it
- create an environment (#)
- create a virtual env with access to system site package of the
environment (#)
- enter the venv and install the package
- use `pip freeze -l` to retrieve the full set of dependencies
If it fails (because of missing system dependencies) fallback to parse
setup.py (with guile-log?) and plain requirements.txt. It would be nice
to allow to drop to guix environment (#) when the first option fails to
inspect and install missing system dependencies manually.
Maybe [1] can be helpful, I attached both data and a script to extract.
the dataset is missing and needs cleanup. It helped me to see that *a
lot* of django packages miss django dependency on pypi.
WDYT?
[1]
https://ogirardot.wordpress.com/2013/01/31/sharing-pypimaven-dependency-data/
On 2015-06-15 03:25, Cyril Roelandt wrote:
> * guix/import/pypi.scm (python->package-name, maybe-inputs,
> compute-inputs,
> guess-requirements): New procedures.
> * guix/import/pypi.scm (guix-hash-url): Now takes a filename instead of
> an
> URL as input.
> * guix/import/pypi.scm (make-pypi-sexp): Now tries to generate the
> inputs
> automagically.
> * tests/pypi.scm: Update the test.
> ---
> guix/import/pypi.scm | 160
> +++++++++++++++++++++++++++++++++++++++++----------
> tests/pypi.scm | 42 +++++++++-----
> 2 files changed, 158 insertions(+), 44 deletions(-)
>
> diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
> index 8567cad..cf0a7bb 100644
> --- a/guix/import/pypi.scm
> +++ b/guix/import/pypi.scm
> @@ -21,10 +21,13 @@
> #:use-module (ice-9 match)
> #:use-module (ice-9 pretty-print)
> #:use-module (ice-9 regex)
> + #:use-module ((ice-9 rdelim) #:select (read-line))
> #:use-module (srfi srfi-1)
> + #:use-module (srfi srfi-26)
> #:use-module (rnrs bytevectors)
> #:use-module (json)
> #:use-module (web uri)
> + #:use-module (guix ui)
> #:use-module (guix utils)
> #:use-module (guix import utils)
> #:use-module (guix import json)
> @@ -77,42 +80,137 @@ or #f on failure."
> with dashes."
> (string-join (string-split (string-downcase str) #\_) "-"))
>
> -(define (guix-hash-url url)
> - "Download the resource at URL and return the hash in nix-base32
> format."
> - (call-with-temporary-output-file
> - (lambda (temp port)
> - (and (url-fetch url temp)
> - (bytevector->nix-base32-string
> - (call-with-input-file temp port-sha256))))))
> +(define (guix-hash-url filename)
> + "Return the hash of FILENAME in nix-base32 format."
> + (bytevector->nix-base32-string (file-sha256 filename)))
> +
> +(define (python->package-name name)
> + "Given the NAME of a package on PyPI, return a Guix-compliant name
> for the
> +package."
> + (if (string-prefix? "python-" name)
> + (snake-case name)
> + (string-append "python-" (snake-case name))))
> +
> +(define (maybe-inputs package-inputs)
> + "Given a list of PACKAGE-INPUTS, tries to generate the 'inputs'
> field of a
> +package definition."
> + (match package-inputs
> + (()
> + '())
> + ((package-inputs ...)
> + `((inputs (,'quasiquote ,package-inputs))))))
> +
> +(define (guess-requirements source-url tarball)
> + "Given SOURCE-URL and a TARBALL of the package, return a list of the
> required
> +packages specified in the requirements.txt file. TARBALL will be
> extracted in
> +the current directory, and will be deleted."
> +
> + (define (tarball-directory url)
> + ;; Given the URL of the package's tarball, return the name of the
> directory
> + ;; that will be created upon decompressing it. If the filetype is
> not
> + ;; supported, return #f.
> + ;; TODO: Support more archive formats.
> + (let ((basename (substring url (+ 1 (string-rindex url #\/)))))
> + (cond
> + ((string-suffix? ".tar.gz" basename)
> + (string-drop-right basename 7))
> + ((string-suffix? ".tar.bz2" basename)
> + (string-drop-right basename 8))
> + (else
> + (begin
> + (warning (_ "Unsupported archive format: \
> +cannot determine package dependencies"))
> + #f)))))
> +
> + (define (clean-requirement s)
> + ;; Given a requirement LINE, as can be found in a Python
> requirements.txt
> + ;; file, remove everything other than the actual name of the
> required
> + ;; package, and return it.
> + (string-take s
> + (or (string-index s #\space)
> + (string-length s))))
> +
> + (define (comment? line)
> + ;; Return #t if the given LINE is a comment, #f otherwise.
> + (eq? (string-ref (string-trim line) 0) #\#))
> +
> + (define (read-requirements requirements-file)
> + ;; Given REQUIREMENTS-FILE, a Python requirements.txt file, return
> a list
> + ;; of name/variable pairs describing the requirements.
> + (call-with-input-file requirements-file
> + (lambda (port)
> + (let loop ((result '()))
> + (let ((line (read-line port)))
> + (if (eof-object? line)
> + result
> + (cond
> + ((or (string-null? line) (comment? line))
> + (loop result))
> + (else
> + (loop (cons (python->package-name (clean-requirement
> line))
> + result))))))))))
> +
> + (let ((dirname (tarball-directory source-url)))
> + (if (string? dirname)
> + (let* ((req-file (string-append dirname "/requirements.txt"))
> + (exit-code (system* "tar" "xf" tarball req-file)))
> + ;; TODO: support more formats.
> + (if (zero? exit-code)
> + (dynamic-wind
> + (const #t)
> + (lambda ()
> + (read-requirements req-file))
> + (lambda ()
> + (delete-file req-file)
> + (rmdir dirname)))
> + (begin
> + (warning (_ "tar xf failed with exit code ~a")
> exit-code)
> + '())))
> + '())))
> +
> +(define (compute-inputs source-url tarball)
> + "Given the SOURCE-URL of an already downloaded TARBALL, return a
> list of
> +name/variable pairs describing the required inputs of this package."
> + (sort
> + (map (lambda (input)
> + (list input (list 'unquote (string->symbol input))))
> + (append '("python-setuptools")
> + ;; Argparse has been part of Python since 2.7.
> + (remove (cut string=? "python-argparse" <>)
> + (guess-requirements source-url tarball))))
> + (lambda args
> + (match args
> + (((a _ ...) (b _ ...))
> + (string-ci<? a b))))))
>
> (define (make-pypi-sexp name version source-url home-page synopsis
> description license)
> "Return the `package' s-expression for a python package with the
> given NAME,
> VERSION, SOURCE-URL, HOME-PAGE, SYNOPSIS, DESCRIPTION, and LICENSE."
> - `(package
> - (name ,(if (string-prefix? "python-" name)
> - (snake-case name)
> - (string-append "python-" (snake-case name))))
> - (version ,version)
> - (source (origin
> - (method url-fetch)
> - (uri (string-append ,@(factorize-uri source-url
> version)))
> - (sha256
> - (base32
> - ,(guix-hash-url source-url)))))
> - (build-system python-build-system)
> - (inputs
> - `(("python-setuptools" ,python-setuptools)))
> - (home-page ,home-page)
> - (synopsis ,synopsis)
> - (description ,description)
> - (license ,(assoc-ref `((,lgpl2.0 . lgpl2.0)
> - (,gpl3 . gpl3)
> - (,bsd-3 . bsd-3)
> - (,expat . expat)
> - (,public-domain . public-domain)
> - (,asl2.0 . asl2.0))
> - license))))
> + (call-with-temporary-output-file
> + (lambda (temp port)
> + (and (url-fetch source-url temp)
> + `(package
> + (name ,(python->package-name name))
> + (version ,version)
> + (source (origin
> + (method url-fetch)
> + (uri (string-append ,@(factorize-uri
> source-url version)))
> + (sha256
> + (base32
> + ,(guix-hash-url temp)))))
> + (build-system python-build-system)
> + ,@(maybe-inputs (compute-inputs source-url temp))
> + (home-page ,home-page)
> + (synopsis ,synopsis)
> + (description ,description)
> + (license ,(assoc-ref `((,lgpl2.0 . lgpl2.0)
> + (,gpl3 . gpl3)
> + (,bsd-3 . bsd-3)
> + (,expat . expat)
> + (,public-domain . public-domain)
> + (,asl2.0 . asl2.0))
> + license)))))))
>
> (define (pypi->guix-package package-name)
> "Fetch the metadata for PACKAGE-NAME from pypi.python.org, and
> return the
> diff --git a/tests/pypi.scm b/tests/pypi.scm
> index 45cf7ca..c772474 100644
> --- a/tests/pypi.scm
> +++ b/tests/pypi.scm
> @@ -21,6 +21,7 @@
> #:use-module (guix base32)
> #:use-module (guix hash)
> #:use-module (guix tests)
> + #:use-module ((guix build utils) #:select (delete-file-recursively))
> #:use-module (srfi srfi-64)
> #:use-module (ice-9 match))
>
> @@ -46,8 +47,14 @@
> }
> }")
>
> -(define test-source
> - "foobar")
> +(define test-source-hash
> + "")
> +
> +(define test-requirements
> +"# A comment
> + # A comment after a space
> +bar
> +baz > 13.37")
>
> (test-begin "pypi")
>
> @@ -55,15 +62,22 @@
> ;; Replace network resources with sample data.
> (mock ((guix import utils) url-fetch
> (lambda (url file-name)
> - (with-output-to-file file-name
> - (lambda ()
> - (display
> - (match url
> - ("https://pypi.python.org/pypi/foo/json"
> - test-json)
> - ("https://example.com/foo-1.0.0.tar.gz"
> - test-source)
> - (_ (error "Unexpected URL: " url))))))))
> + (match url
> + ("https://pypi.python.org/pypi/foo/json"
> + (with-output-to-file file-name
> + (lambda ()
> + (display test-json))))
> + ("https://example.com/foo-1.0.0.tar.gz"
> + (begin
> + (mkdir "foo-1.0.0")
> + (with-output-to-file "foo-1.0.0/requirements.txt"
> + (lambda ()
> + (display test-requirements)))
> + (system* "tar" "czvf" file-name "foo-1.0.0/")
> + (delete-file-recursively "foo-1.0.0")
> + (set! test-source-hash
> + (call-with-input-file file-name port-sha256))))
> + (_ (error "Unexpected URL: " url)))))
> (match (pypi->guix-package "foo")
> (('package
> ('name "python-foo")
> @@ -78,13 +92,15 @@
> ('build-system 'python-build-system)
> ('inputs
> ('quasiquote
> - (("python-setuptools" ('unquote 'python-setuptools)))))
> + (("python-bar" ('unquote 'python-bar))
> + ("python-baz" ('unquote 'python-baz))
> + ("python-setuptools" ('unquote 'python-setuptools)))))
> ('home-page "http://example.com")
> ('synopsis "summary")
> ('description "summary")
> ('license 'lgpl2.0))
> (string=? (bytevector->nix-base32-string
> - (call-with-input-string test-source port-sha256))
> + test-source-hash)
> hash))
> (x
> (pk 'fail x #f)))))
--
Amirouche ~ amz3 ~ http://www.hyperdev.fr
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: extract.py --]
[-- Type: text/x-python; name=extract.py, Size: 755 bytes --]
from re import compile
from json import loads
from pathlib import Path
from base64 import b64decode
NAME = compile('^([\w\.]+)')
# the pypi dataset was built by O. Girardot
# https://ogirardot.wordpress.com/2013/01/31/sharing-pypimaven-dependency-data/
with Path('./import.log').open('w') as log:
with Path('./pypi-deps.csv').resolve().open() as f:
lines = f.read().split('\n')
count = len(lines)
for num, line in enumerate(lines):
name, version, dependencies = line.split('\t')
dependencies = loads(b64decode(dependencies).decode('utf-8'))
dependencies = map(lambda x: x.strip(), dependencies)
dependencies = list(dependencies)
print(name, version, dependencies)
[-- Attachment #3: pypi-deps.csv.gz --]
[-- Type: application/x-gzip, Size: 223837 bytes --]
next prev parent reply other threads:[~2015-06-18 10:46 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-07 23:51 [PATCH] import: pypi: detect requirements from requirements.txt Cyril Roelandt
2015-03-08 0:41 ` David Thompson
2015-03-22 21:05 ` [PATCH] import: pypi: Detect inputs Cyril Roelandt
2015-03-26 13:15 ` Ludovic Courtès
2015-03-27 12:36 ` David Thompson
2015-03-29 13:46 ` Ludovic Courtès
2015-06-04 22:56 ` Cyril Roelandt
2015-06-07 20:03 ` Ludovic Courtès
2015-06-15 1:25 ` Cyril Roelandt
2015-06-18 10:45 ` Amirouche Boubekki [this message]
2015-06-19 15:32 ` Christopher Allan Webber
2015-06-20 18:01 ` Amirouche Boubekki
2015-06-21 20:56 ` Ludovic Courtès
2015-06-21 21:32 ` Amirouche Boubekki
2015-06-23 21:04 ` Cyril Roelandt
2015-06-24 19:49 ` Ludovic Courtès
2015-06-24 21:42 ` Cyril Roelandt
2015-03-22 21:05 ` [PATCH] import: pypi: detect requirements from requirements.txt Cyril Roelandt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://guix.gnu.org/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=dcefa440e96b445bc3185f21b7f03629@hypermove.net \
--to=amirouche@hypermove.net \
--cc=guix-devel-bounces+amirouche=hypermove.net@gnu.org \
--cc=guix-devel@gnu.org \
--cc=tipecaml@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/guix.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).