From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christopher Allan Webber
Subject: Re: [PATCH] import: pypi: Detect inputs.
Date: Fri, 19 Jun 2015 10:32:11 -0500
Message-ID: <87twu3btmq.fsf@earlgrey.lan>
References: <87d217uvzd.fsf@gnu.org> <1434331554-13170-1-git-send-email-tipecaml@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Return-path:
Received: from eggs.gnu.org ([2001:4830:134:3::10]:42586)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1Z5yHa-0005b3-8g for guix-devel@gnu.org; Fri, 19 Jun 2015 11:32:44 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1Z5yHX-0000I7-Nb
 for guix-devel@gnu.org; Fri, 19 Jun 2015 11:32:42 -0400
Received: from [2600:3c02::f03c:91ff:feae:cb51] (port=55610 helo=dustycloud.org)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1Z5yHX-0000Hq-Gv for guix-devel@gnu.org; Fri, 19 Jun 2015 11:32:39 -0400
In-reply-to:
List-Id: "Development of GNU Guix and the GNU System distribution."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org
Sender: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org
To: Amirouche Boubekki
Cc: guix-devel@gnu.org, guix-devel-bounces+amirouche=hypermove.net@gnu.org

Amirouche Boubekki writes:

> Héllo,
>
>
> If I'm not mistaken this patch relies only on the presence of
> requirements.txt. This is not a required file in python packaging.
> otherwise said, we miss a lot using this method. I think the best way to
> do that would be to:
>
> - download the package and extract it
> - create an environment (#)
> - create a virtual env with access to system site package of the
>   environment (#)
> - enter the venv and install the package
> - use `pip freeze -l` to retrieve the full set of dependencies

Using pip freeze is an interesting idea.  Setting up a virtualenv...
that's interesting.  Would it be written to a temporary directory?

> If it fails (because of missing system dependencies) fallback to parse
> setup.py (with guile-log?) and plain requirements.txt. It would be nice
> to allow to drop to guix environment (#) when the first option fails to
> inspect and install missing system dependencies manually.
>
> Maybe [1] can be helpful, I attached both data and a script to extract.
> the dataset is missing and needs cleanup. It helped me to see that *a
> lot* of django packages miss django dependency on pypi.
>
> WDYT?
>
> [1]
> https://ogirardot.wordpress.com/2013/01/31/sharing-pypimaven-dependency-data/
>
>
> On 2015-06-15 03:25, Cyril Roelandt wrote:
>> * guix/import/pypi.scm (python->package-name, maybe-inputs, compute-inputs,
>> guess-requirements): New procedures.
>> * guix/import/pypi.scm (guix-hash-url): Now takes a filename instead of an
>> URL as input.
>> * guix/import/pypi.scm (make-pypi-sexp): Now tries to generate the inputs
>> automagically.
>> * tests/pypi.scm: Update the test.
>> ---
>>  guix/import/pypi.scm | 160 +++++++++++++++++++++++++++++++++++++++++----------
>>  tests/pypi.scm       |  42 +++++++++-----
>>  2 files changed, 158 insertions(+), 44 deletions(-)
>>
>> diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
>> index 8567cad..cf0a7bb 100644
>> --- a/guix/import/pypi.scm
>> +++ b/guix/import/pypi.scm
>> @@ -21,10 +21,13 @@
>>    #:use-module (ice-9 match)
>>    #:use-module (ice-9 pretty-print)
>>    #:use-module (ice-9 regex)
>> +  #:use-module ((ice-9 rdelim) #:select (read-line))
>>    #:use-module (srfi srfi-1)
>> +  #:use-module (srfi srfi-26)
>>    #:use-module (rnrs bytevectors)
>>    #:use-module (json)
>>    #:use-module (web uri)
>> +  #:use-module (guix ui)
>>    #:use-module (guix utils)
>>    #:use-module (guix import utils)
>>    #:use-module (guix import json)
>> @@ -77,42 +80,137 @@ or #f on failure."
>>  with dashes."
>>    (string-join (string-split (string-downcase str) #\_) "-"))
>>
>> -(define (guix-hash-url url)
>> -  "Download the resource at URL and return the hash in nix-base32 format."
>> -  (call-with-temporary-output-file
>> -   (lambda (temp port)
>> -     (and (url-fetch url temp)
>> -          (bytevector->nix-base32-string
>> -           (call-with-input-file temp port-sha256))))))
>> +(define (guix-hash-url filename)
>> +  "Return the hash of FILENAME in nix-base32 format."
>> +  (bytevector->nix-base32-string (file-sha256 filename)))
>> +
>> +(define (python->package-name name)
>> +  "Given the NAME of a package on PyPI, return a Guix-compliant name for the
>> +package."
>> +  (if (string-prefix? "python-" name)
>> +      (snake-case name)
>> +      (string-append "python-" (snake-case name))))
>> +
>> +(define (maybe-inputs package-inputs)
>> +  "Given a list of PACKAGE-INPUTS, tries to generate the 'inputs' field of a
>> +package definition."
>> +  (match package-inputs
>> +    (()
>> +     '())
>> +    ((package-inputs ...)
>> +     `((inputs (,'quasiquote ,package-inputs))))))
>> +
>> +(define (guess-requirements source-url tarball)
>> +  "Given SOURCE-URL and a TARBALL of the package, return a list of the required
>> +packages specified in the requirements.txt file. TARBALL will be extracted in
>> +the current directory, and will be deleted."
>> +
>> +  (define (tarball-directory url)
>> +    ;; Given the URL of the package's tarball, return the name of the directory
>> +    ;; that will be created upon decompressing it. If the filetype is not
>> +    ;; supported, return #f.
>> +    ;; TODO: Support more archive formats.
>> +    (let ((basename (substring url (+ 1 (string-rindex url #\/)))))
>> +      (cond
>> +       ((string-suffix? ".tar.gz" basename)
>> +        (string-drop-right basename 7))
>> +       ((string-suffix? ".tar.bz2" basename)
>> +        (string-drop-right basename 8))
>> +       (else
>> +        (begin
>> +          (warning (_ "Unsupported archive format: \
>> +cannot determine package dependencies"))
>> +          #f)))))
>> +
>> +  (define (clean-requirement s)
>> +    ;; Given a requirement LINE, as can be found in a Python requirements.txt
>> +    ;; file, remove everything other than the actual name of the required
>> +    ;; package, and return it.
>> +    (string-take s
>> +                 (or (string-index s #\space)
>> +                     (string-length s))))
>> +
>> +  (define (comment? line)
>> +    ;; Return #t if the given LINE is a comment, #f otherwise.
>> +    (eq? (string-ref (string-trim line) 0) #\#))
>> +
>> +  (define (read-requirements requirements-file)
>> +    ;; Given REQUIREMENTS-FILE, a Python requirements.txt file, return a list
>> +    ;; of name/variable pairs describing the requirements.
>> +    (call-with-input-file requirements-file
>> +      (lambda (port)
>> +        (let loop ((result '()))
>> +          (let ((line (read-line port)))
>> +            (if (eof-object? line)
>> +                result
>> +                (cond
>> +                 ((or (string-null? line) (comment? line))
>> +                  (loop result))
>> +                 (else
>> +                  (loop (cons (python->package-name (clean-requirement line))
>> +                              result))))))))))
>> +
>> +  (let ((dirname (tarball-directory source-url)))
>> +    (if (string? dirname)
>> +        (let* ((req-file (string-append dirname "/requirements.txt"))
>> +               (exit-code (system* "tar" "xf" tarball req-file)))
>> +          ;; TODO: support more formats.
>> +          (if (zero? exit-code)
>> +              (dynamic-wind
>> +                (const #t)
>> +                (lambda ()
>> +                  (read-requirements req-file))
>> +                (lambda ()
>> +                  (delete-file req-file)
>> +                  (rmdir dirname)))
>> +              (begin
>> +                (warning (_ "tar xf failed with exit code ~a") exit-code)
>> +                '())))
>> +        '())))
>> +
>> +(define (compute-inputs source-url tarball)
>> +  "Given the SOURCE-URL of an already downloaded TARBALL, return a list of
>> +name/variable pairs describing the required inputs of this package."
>> +  (sort
>> +    (map (lambda (input)
>> +           (list input (list 'unquote (string->symbol input))))
>> +         (append '("python-setuptools")
>> +                 ;; Argparse has been part of Python since 2.7.
>> +                 (remove (cut string=? "python-argparse" <>)
>> +                         (guess-requirements source-url tarball))))
>> +    (lambda args
>> +      (match args
>> +        (((a _ ...) (b _ ...))
>> +         (string-ci<? a b))))))
>> +
>>  (define (make-pypi-sexp name version source-url home-page synopsis
>>                          description license)
>>    "Return the `package' s-expression for a python package with the given NAME,
>>  VERSION, SOURCE-URL, HOME-PAGE, SYNOPSIS, DESCRIPTION, and LICENSE."
>> -  `(package
>> -     (name ,(if (string-prefix? "python-" name)
>> -                (snake-case name)
>> -                (string-append "python-" (snake-case name))))
>> -     (version ,version)
>> -     (source (origin
>> -               (method url-fetch)
>> -               (uri (string-append ,@(factorize-uri source-url version)))
>> -               (sha256
>> -                (base32
>> -                 ,(guix-hash-url source-url)))))
>> -     (build-system python-build-system)
>> -     (inputs
>> -      `(("python-setuptools" ,python-setuptools)))
>> -     (home-page ,home-page)
>> -     (synopsis ,synopsis)
>> -     (description ,description)
>> -     (license ,(assoc-ref `((,lgpl2.0 . lgpl2.0)
>> -                            (,gpl3 . gpl3)
>> -                            (,bsd-3 . bsd-3)
>> -                            (,expat . expat)
>> -                            (,public-domain . public-domain)
>> -                            (,asl2.0 . asl2.0))
>> -                          license))))
>> +  (call-with-temporary-output-file
>> +   (lambda (temp port)
>> +     (and (url-fetch source-url temp)
>> +          `(package
>> +             (name ,(python->package-name name))
>> +             (version ,version)
>> +             (source (origin
>> +                       (method url-fetch)
>> +                       (uri (string-append ,@(factorize-uri source-url version)))
>> +                       (sha256
>> +                        (base32
>> +                         ,(guix-hash-url temp)))))
>> +             (build-system python-build-system)
>> +             ,@(maybe-inputs (compute-inputs source-url temp))
>> +             (home-page ,home-page)
>> +             (synopsis ,synopsis)
>> +             (description ,description)
>> +             (license ,(assoc-ref `((,lgpl2.0 . lgpl2.0)
>> +                                    (,gpl3 . gpl3)
>> +                                    (,bsd-3 . bsd-3)
>> +                                    (,expat . expat)
>> +                                    (,public-domain . public-domain)
>> +                                    (,asl2.0 . asl2.0))
>> +                                  license)))))))
>>
>>  (define (pypi->guix-package package-name)
>>    "Fetch the metadata for PACKAGE-NAME from pypi.python.org, and return the
>> diff --git a/tests/pypi.scm b/tests/pypi.scm
>> index 45cf7ca..c772474 100644
>> --- a/tests/pypi.scm
>> +++ b/tests/pypi.scm
>> @@ -21,6 +21,7 @@
>>    #:use-module (guix base32)
>>    #:use-module (guix hash)
>>    #:use-module (guix tests)
>> +  #:use-module ((guix build utils) #:select (delete-file-recursively))
>>    #:use-module (srfi srfi-64)
>>    #:use-module (ice-9 match))
>>
>> @@ -46,8 +47,14 @@
>>    }
>>  }")
>>
>> -(define test-source
>> -  "foobar")
>> +(define test-source-hash
>> +  "")
>> +
>> +(define test-requirements
>> +"# A comment
>> + # A comment after a space
>> +bar
>> +baz > 13.37")
>>
>>  (test-begin "pypi")
>>
>> @@ -55,15 +62,22 @@
>>    ;; Replace network resources with sample data.
>>    (mock ((guix import utils) url-fetch
>>           (lambda (url file-name)
>> -           (with-output-to-file file-name
>> -             (lambda ()
>> -               (display
>> -                (match url
>> -                  ("https://pypi.python.org/pypi/foo/json"
>> -                   test-json)
>> -                  ("https://example.com/foo-1.0.0.tar.gz"
>> -                   test-source)
>> -                  (_ (error "Unexpected URL: " url))))))))
>> +           (match url
>> +             ("https://pypi.python.org/pypi/foo/json"
>> +              (with-output-to-file file-name
>> +                (lambda ()
>> +                  (display test-json))))
>> +             ("https://example.com/foo-1.0.0.tar.gz"
>> +              (begin
>> +                (mkdir "foo-1.0.0")
>> +                (with-output-to-file "foo-1.0.0/requirements.txt"
>> +                  (lambda ()
>> +                    (display test-requirements)))
>> +                (system* "tar" "czvf" file-name "foo-1.0.0/")
>> +                (delete-file-recursively "foo-1.0.0")
>> +                (set! test-source-hash
>> +                      (call-with-input-file file-name port-sha256))))
>> +             (_ (error "Unexpected URL: " url)))))
>>      (match (pypi->guix-package "foo")
>>        (('package
>>          ('name "python-foo")
>> @@ -78,13 +92,15 @@
>>           ('build-system 'python-build-system)
>>           ('inputs
>>            ('quasiquote
>> -           (("python-setuptools" ('unquote 'python-setuptools)))))
>> +           (("python-bar" ('unquote 'python-bar))
>> +            ("python-baz" ('unquote 'python-baz))
>> +            ("python-setuptools" ('unquote 'python-setuptools)))))
>>           ('home-page "http://example.com")
>>           ('synopsis "summary")
>>           ('description "summary")
>>           ('license 'lgpl2.0))
>>          (string=? (bytevector->nix-base32-string
>> -                   (call-with-input-string test-source port-sha256))
>> +                   test-source-hash)
>>                    hash))
>>         (x
>>          (pk 'fail x #f)))))
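
For what it's worth, the venv route quoted at the top could look roughly
like the Python sketch below.  This is only an illustration under
assumptions, not something taken from the patch: it assumes the virtualenv
is created in a temporary directory that is thrown away afterwards, and the
names guix_name, parse_freeze and freeze_dependencies are made up for the
example.  guix_name is meant to mirror what python->package-name does in
the patch.

```python
import re
import subprocess
import sys
import tempfile
import venv
from pathlib import Path


def guix_name(pypi_name):
    """Hypothetical mirror of python->package-name from the quoted patch:
    lowercase, replace '_' with '-', and add a 'python-' prefix if missing."""
    lowered = pypi_name.lower().replace("_", "-")
    return lowered if pypi_name.startswith("python-") else "python-" + lowered


def parse_freeze(output):
    """Turn the text printed by `pip freeze -l` into a sorted list of names."""
    names = set()
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and option lines such as "-e git+...".
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        # Keep only the project name in front of "==" or any other marker.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(guix_name(match.group(0)))
    return sorted(names)


def freeze_dependencies(source_dir):
    """Install SOURCE_DIR into a throwaway venv and list what got installed.

    Assumption: the venv is written to a temporary directory and may see the
    system site packages, as suggested in the quoted proposal.
    """
    with tempfile.TemporaryDirectory() as tmp:
        venv.create(tmp, system_site_packages=True, with_pip=True)
        bindir = "Scripts" if sys.platform == "win32" else "bin"
        python = str(Path(tmp) / bindir / "python")
        subprocess.run([python, "-m", "pip", "install", str(source_dir)],
                       check=True)
        result = subprocess.run([python, "-m", "pip", "freeze", "-l"],
                                check=True, capture_output=True, text=True)
    return parse_freeze(result.stdout)
```

Note that the frozen list would also contain the package being imported
itself, so a real importer would have to drop that entry, and the install
step needs network access unless every dependency is already available.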