unofficial mirror of guix-devel@gnu.org 
 help / color / mirror / code / Atom feed
* [RFC] Improve Python package quality
@ 2021-01-03 10:00 Lars-Dominik Braun
  2021-01-03 10:48 ` Hartmut Goebel
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Lars-Dominik Braun @ 2021-01-03 10:00 UTC (permalink / raw)
  To: guix-devel

[-- Attachment #1: Type: text/plain, Size: 1624 bytes --]

Hi,

I’d like to propose adding an additional phase to python-build-system to
improve Guix’ Python package quality.

Python is an interpreted language without any compile-time checks. Any
errors are only visible at run-time, including missing or wrong imports.
Additionally most Python packages use an additional packaging layer
through setuptools, which also declares mandatory and optional
dependencies.

This can result in buildable, but broken packages. An example of a
botched upgrade is fixed in commit
22e06297b1982f75aaadddba616b1052e506e4a0. The package python-httpx was
upgraded, but that new version declared a dependency to a newer
python-httpcore via setuptools, which wasn’t available and thus packages
depending on python-httpx were broken.

My proposal adds some build-time checks to guarentee three properties:

1) Depending on the package (distribution in Python language) via
setuptools works. This ensures dependencies are complete and have the
correct version as determined by setup.{py,cfg}.
2) Console entry points can be loaded. This ensures scripts installed to
bin/ actually work.
3) Top-level modules can be imported. This ensures the package is
actually usable and has no undeclared dependencies.

The attached patch implements all three. I’ve been rebuilding a few
Python packages using the patch and uncovered some issues already. I’m
sure it’s not perfect yet and thus I’m open to improvements. If this
idea is well received I’m willing to rebuild and fix *all* Python
packages broken by this new phase. I’m also aware this change needs to
go to core-updates.

Cheers,
Lars


[-- Attachment #2: 0001-WIP-build-system-python-Validate-installed-package.patch --]
[-- Type: text/x-diff, Size: 3449 bytes --]

From 919fd78b1aff6c277baca396ef962a5dcd4e23ae Mon Sep 17 00:00:00 2001
From: Lars-Dominik Braun <lars@6xq.net>
Date: Sun, 3 Jan 2021 10:30:29 +0100
Subject: [PATCH] [WIP] build-system/python: Validate installed package

* guix/build/python-build-system.scm (validate-loadable): New phase.
(%standard-phases): Use it.
---
 guix/build/python-build-system.scm | 51 ++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/guix/build/python-build-system.scm b/guix/build/python-build-system.scm
index 09bd8465c8..edb772a7a4 100644
--- a/guix/build/python-build-system.scm
+++ b/guix/build/python-build-system.scm
@@ -148,6 +148,56 @@
       (format #t "test suite not run~%"))
   #t)
 
+(define* (validate-loadable #:key tests? inputs outputs #:allow-other-keys)
+  "Ensure packages depending on this package via setuptools work properly,
+their advertised endpoints work and their top level modules are importable
+without errors."
+  (let ((script (string-join
+'(
+;; Python 2 support.
+"from __future__ import print_function"
+"import pkg_resources, sys, importlib"
+;; Only check site-packages installed by this package, but not dependencies
+;; (which pkg_resources.working_set would include). Path supplied via argv.
+"ws = pkg_resources.find_distributions (sys.argv[1])"
+"for dist in ws:"
+"    print ('validating', repr (dist.project_name), dist.location)"
+"    try:"
+"        req = str (dist.as_requirement ())"
+;; dist.activate() is not enough to actually check requirements, we have to
+;; .require() it.
+"        pkg_resources.require (req)"
+"    except Exception as e:"
+"        print (req, e)"
+"        sys.exit (1)"
+;; Try to load entry points of console scripts too, making sure they work. They
+;; should be removed if they don’t. Other groups may not be safe, as they can
+;; depend on optional packages.
+"    for group, v in dist.get_entry_map ().items ():"
+"       if group not in {'console_scripts', }:"
+"           continue"
+"       for name, ep in v.items ():"
+"           print ('...trying to load endpoint', group, name)"
+"           ep.load ()"
+;; And finally try to load top level modules. This should not have any
+;; side-effects.
+"    for name in dist.get_metadata_lines ('top_level.txt'):"
+"        print ('...trying to load module', name)"
+"        try:"
+"            importlib.import_module (name)"
+;; Ignore non-existent modules, we only want to know if the existing ones work.
+"        except ModuleNotFoundError:"
+"            print ('......WARNING: module', name, 'not found, continuing')"
+"            continue")
+"\n")))
+    (add-installed-pythonpath inputs outputs)
+    ;; Make sure the working directory is empty (i.e. no Python modules in it)
+    (with-directory-excursion "/tmp"
+    ;; XXX: Cloak command run. Long and unreadable if it fails, provide an
+    ;; explanation instead.
+      (invoke "python" "-c" script (site-packages inputs outputs))))
+  #t)
+
 (define (python-version python)
   (let* ((version     (last (string-split python #\-)))
          (components  (string-split version #\.))
@@ -267,6 +317,7 @@ installed with setuptools."
     (replace 'install install)
     (add-after 'install 'check check)
     (add-after 'install 'wrap wrap)
+    (add-after 'check 'validate-loadable validate-loadable)
     (add-before 'strip 'rename-pth-file rename-pth-file)))
 
 (define* (python-build #:key inputs (phases %standard-phases)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [RFC] Improve Python package quality
  2021-01-03 10:00 [RFC] Improve Python package quality Lars-Dominik Braun
@ 2021-01-03 10:48 ` Hartmut Goebel
  2021-01-05  9:19   ` Lars-Dominik Braun
  2021-01-03 15:04 ` Tobias Geerinckx-Rice
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Hartmut Goebel @ 2021-01-03 10:48 UTC (permalink / raw)
  To: Lars-Dominik Braun, guix-devel

[-- Attachment #1: Type: text/plain, Size: 2006 bytes --]

Hi Lars,

this is a good idea. (Since you where mentioning setuptools, I first was 
afraid your solution would be tightened to setuptools, but it is not. 
Well done!)

Some comments (most of which are nit-picking):

> +;; Python 2 support.
>
> +"from __future__ import print_function"
>
This comment should go behind the line of code, as it only related to 
that single line.

> +" req = str (dist.as_requirement ())"
>
> +;; dist.activate() is not enough to actually check requirements, we 
> have to
>
> +;; .require() it.
>
> +" pkg_resources.require (req)"
>
I suggest putting the comments into the python source. This would allow 
to indent them according the the python code, which would make it easier 
to understand. This would also allow to use a single multi-line 
guile-string, which allows to easiyl copy the script out and in from the 
guile-source for testing it.


> +" except Exception as e:"
>
> +" print (req, e)"
>
> +" sys.exit (1)"
>
raise SystemExit(1)


> +" print ('...trying to load endpoint', group, name)"

Please follow PEP8 (no space before opening parentheses) - also at other 
places.


> +" for name in dist.get_metadata_lines ('top_level.txt'):"
>
> +" print ('...trying to load module', name)"
>
Add `end=""`, thus the "result" can be printed on the same line.

> +" except ModuleNotFoundError:"
>
> +" print ('......WARNING: module', name, 'not found, continuing')"
>
Print result terse, on same line, without repeating the name:

   print (' WARNING: not found')

> +" continue")
>
Add printing success:

     else:
        print("passed")

> + (add-installed-pythonpath inputs outputs)
>
> + ;; Make sure the working directory is empty (i.e. no Python modules 
> in it)
>
> + (with-directory-excursion "/tmp"
>
Would is be better to use mkdtemp here to ge a fresh, empty directory?


-- 
Regards
Hartmut Goebel

| Hartmut Goebel          | h.goebel@crazy-compilers.com               |
| www.crazy-compilers.com | compilers which you thought are impossible |


[-- Attachment #2: Type: text/html, Size: 5277 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] Improve Python package quality
  2021-01-03 10:00 [RFC] Improve Python package quality Lars-Dominik Braun
  2021-01-03 10:48 ` Hartmut Goebel
@ 2021-01-03 15:04 ` Tobias Geerinckx-Rice
  2021-01-05  9:31 ` Vincent Legoll
  2021-01-07 13:31 ` Lars-Dominik Braun
  3 siblings, 0 replies; 9+ messages in thread
From: Tobias Geerinckx-Rice @ 2021-01-03 15:04 UTC (permalink / raw)
  To: Lars-Dominik Braun; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 308 bytes --]

Lars(-Dominik?),

I'll defer to others on the finer points of 
(Scheme-string-embedded-)Python, but just wanted to say:

Lars-Dominik Braun 写道:
> My proposal adds some build-time checks to guarentee three 
> properties:

This is utterly awesome.  Thank you so much.

Kind regards,

T G-R

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 247 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] Improve Python package quality
  2021-01-03 10:48 ` Hartmut Goebel
@ 2021-01-05  9:19   ` Lars-Dominik Braun
  0 siblings, 0 replies; 9+ messages in thread
From: Lars-Dominik Braun @ 2021-01-05  9:19 UTC (permalink / raw)
  To: Hartmut Goebel; +Cc: guix-devel

Hi Hartmut,

> this is a good idea. (Since you where mentioning setuptools, I first was 
> afraid your solution would be tightened to setuptools, but it is not. 
> Well done!)
afaik pkg_resources is technically a part of setuptools, although it is
distributed with Python.

> This comment should go behind the line of code, as it only related to 
> that single line.
> […]
> I suggest putting the comments into the python source. This would allow 
> to indent them according the the python code, which would make it easier 
> to understand. This would also allow to use a single multi-line 
> guile-string, which allows to easiyl copy the script out and in from the 
> guile-source for testing it.
> […]
> Please follow PEP8 (no space before opening parentheses) - also at other 
> places.
> […]
> Add `end=""`, thus the "result" can be printed on the same line.
> Print result terse, on same line, without repeating the name:
You’re right, all fixed. I’ll send a non-hacky patch (with test-cases!)
to guix-patches@ for review once we’ve figured out a path to merge it. I
guess it would be best to fix packages directly on master and merge this
new phase to core-updates? Shall I apply for commit access or can you
(or Tobias?) review and “proxy” required changes? Right now I have fixes
for about 10 packges, but there will be more.

> Would is be better to use mkdtemp here to ge a fresh, empty directory?
I tried that, but mkdtemp! is not available and I’m not confident enough
to add that module to the closure. Any ideas?

Cheers,
Lars



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] Improve Python package quality
  2021-01-03 10:00 [RFC] Improve Python package quality Lars-Dominik Braun
  2021-01-03 10:48 ` Hartmut Goebel
  2021-01-03 15:04 ` Tobias Geerinckx-Rice
@ 2021-01-05  9:31 ` Vincent Legoll
  2021-01-05 10:20   ` Lars-Dominik Braun
  2021-01-07 13:31 ` Lars-Dominik Braun
  3 siblings, 1 reply; 9+ messages in thread
From: Vincent Legoll @ 2021-01-05  9:31 UTC (permalink / raw)
  To: Lars-Dominik Braun; +Cc: guix-devel

Hello,

I like the idea of better testing for our python packages, but would it be
possible to avoid embedding the python code as scheme strings ?

Like in a separate pure-python file.

WDYT ?

-- 
Vincent Legoll


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] Improve Python package quality
  2021-01-05  9:31 ` Vincent Legoll
@ 2021-01-05 10:20   ` Lars-Dominik Braun
  2021-01-05 10:48     ` Vincent Legoll
  0 siblings, 1 reply; 9+ messages in thread
From: Lars-Dominik Braun @ 2021-01-05 10:20 UTC (permalink / raw)
  To: Vincent Legoll; +Cc: guix-devel

Hi Vincent,

> I like the idea of better testing for our python packages, but would it be
> possible to avoid embedding the python code as scheme strings ?
> Like in a separate pure-python file.
I don’t know how unfortunately. Any ideas?

I moved it into a separate top-level variable now and turned it into a
single multi-line Scheme string. That makes it easier to read.

Cheers,
Lars



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] Improve Python package quality
  2021-01-05 10:20   ` Lars-Dominik Braun
@ 2021-01-05 10:48     ` Vincent Legoll
  2021-01-06 16:09       ` Hartmut Goebel
  0 siblings, 1 reply; 9+ messages in thread
From: Vincent Legoll @ 2021-01-05 10:48 UTC (permalink / raw)
  To: Lars-Dominik Braun; +Cc: guix-devel

On Tue, Jan 5, 2021 at 11:28 AM Lars-Dominik Braun <lars@6xq.net> wrote:
> > Like in a separate pure-python file.
> I don’t know how unfortunately. Any ideas?

No sorry, I'm still a newbie

> I moved it into a separate top-level variable now and turned it into a
> single multi-line Scheme string. That makes it easier to read.

That is better, but the separate file would allow to have proper
syntax highlighting, allow linting/pep8'ing, etc.

Cheers

-- 
Vincent Legoll


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] Improve Python package quality
  2021-01-05 10:48     ` Vincent Legoll
@ 2021-01-06 16:09       ` Hartmut Goebel
  0 siblings, 0 replies; 9+ messages in thread
From: Hartmut Goebel @ 2021-01-06 16:09 UTC (permalink / raw)
  To: Vincent Legoll, Lars-Dominik Braun; +Cc: guix-devel

Am 05.01.21 um 11:48 schrieb Vincent Legoll:
> That is better, but the separate file would allow to have proper
> syntax highlighting, allow linting/pep8'ing, etc.

Probably not worth the effort for trying to put this into a separate file.

-- 
Regards
Hartmut Goebel

| Hartmut Goebel          | h.goebel@crazy-compilers.com               |
| www.crazy-compilers.com | compilers which you thought are impossible |



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] Improve Python package quality
  2021-01-03 10:00 [RFC] Improve Python package quality Lars-Dominik Braun
                   ` (2 preceding siblings ...)
  2021-01-05  9:31 ` Vincent Legoll
@ 2021-01-07 13:31 ` Lars-Dominik Braun
  3 siblings, 0 replies; 9+ messages in thread
From: Lars-Dominik Braun @ 2021-01-07 13:31 UTC (permalink / raw)
  To: guix-devel

Hi,

for reference, the patchset is now tracked by issue 45712
(http://issues.guix.gnu.org/45712)

Cheers,
Lars



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-01-07 16:54 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-03 10:00 [RFC] Improve Python package quality Lars-Dominik Braun
2021-01-03 10:48 ` Hartmut Goebel
2021-01-05  9:19   ` Lars-Dominik Braun
2021-01-03 15:04 ` Tobias Geerinckx-Rice
2021-01-05  9:31 ` Vincent Legoll
2021-01-05 10:20   ` Lars-Dominik Braun
2021-01-05 10:48     ` Vincent Legoll
2021-01-06 16:09       ` Hartmut Goebel
2021-01-07 13:31 ` Lars-Dominik Braun

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).