unofficial mirror of guix-devel@gnu.org 
* Python and propagation
From: Ricardo Wurmus @ 2016-02-18 12:21 UTC
  To: guix-devel

Hi Guix,

I’m dealing with Python in Guix pretty often, because numpy and scipy
are very popular here — and I have been bitten by propagated inputs a
little too often, so I’d like to discuss an alternative.

A user has the following packages installed in the default profile:

    python-2.7.10
    python2-numpy
    python2-scipy

This installation is already a little stale as it was done a couple of
months ago, so the versions of numpy and scipy are not the latest.  Now
the user chooses to install the innocuous-looking “htseq” package.  Guix
downloads the substitute (and more) and then proceeds to build a new
generation of the profile.

But, oh, what is *that*?  There are hundreds of lines in which Guix
warns about numpy conflicts, shrugs and picks one or the other version
of the conflicting file.  “How could this have happened?”, the user
cries out in despair.

Well, the user did not see that “htseq” is not only an executable, but
also a Python library.  As a Python library it depends on numpy, and
thus propagates numpy, as is customary in Guix.  This is necessary
because Python libraries do not have a RUNPATH feature and must be able
to find named imports in some directory in the PYTHONPATH.  To make
packages available on a simple user-controlled PYTHONPATH, we propagate
all dependent Python packages, i.e. we install them into the very same
profile, so that the user only has to do this

    export PYTHONPATH=$HOME/.guix-profile/lib/python2.7

instead of the unmanageable

    export PYTHONPATH=/gnu/store/...dep1/...:/gnu/store/...dep2...:...

I wonder if we could do better than this.  Here are two proposals for
discussion; one is probably not very contentious, the other is straight
from my dream diary in which I collect all sorts of strange and
unrealistic ideas.

1) print a warning when a collision is expected to happen, not when a
collision has happened.

Guix knows what is currently installed in a profile, so it knows that,
say, “python2-numpy” is installed.  It also knows that installing
“htseq” is also going to install “python2-numpy” into that profile.  It
also knows that the propagated “python2-numpy” is different from the
installed “python2-numpy”.  Knowing all that, it should also know that
there are going to be file conflicts, no?

If that’s all true, could we make Guix print a warning about impending
doom *before* it creates a new profile generation?  It’s a much nicer
user experience (at least to me) when I’m being told about possible
conflicts (at the package level) before Guix creates a new profile and
nonchalantly informs me about arbitrarily resolving conflicts (at the
file level).  The difference is in the number of warnings I get (package
vs. individual files) and in the time I would otherwise have to waste
to roll back the profile and upgrade already installed libraries or
choose to install “htseq” into a new profile.
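
To make the idea concrete, here is a minimal Python sketch of such a
package-level check.  It is not how Guix represents profiles: the
dictionaries mapping package names to store items, the function name
`predict_conflicts`, and the shortened store paths are all placeholders
made up for illustration.

```python
# Hypothetical data model: a profile is a mapping from package name to
# store item.  The store paths below are invented placeholders.

def predict_conflicts(installed, to_install):
    """Return (name, installed_item, incoming_item) for every package that
    is already in the profile under a different store item."""
    conflicts = []
    for name, incoming in sorted(to_install.items()):
        current = installed.get(name)
        if current is not None and current != incoming:
            conflicts.append((name, current, incoming))
    return conflicts


installed = {
    "python": "/gnu/store/aaaa-python-2.7.10",
    "python2-numpy": "/gnu/store/bbbb-python2-numpy-old",
    "python2-scipy": "/gnu/store/cccc-python2-scipy-old",
}

# "htseq" together with the packages it propagates.
to_install = {
    "htseq": "/gnu/store/dddd-htseq",
    "python2-numpy": "/gnu/store/eeee-python2-numpy-new",
}

for name, old, new in predict_conflicts(installed, to_install):
    print("warning: %s would collide: %s (installed) vs. %s (propagated)"
          % (name, old, new))
```

With the example data this reports one warning for "python2-numpy"
rather than one per conflicting file, which is the point of the proposal.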


2) avoid PYTHONPATH, patch all Python files invasively!

Python does not have any feature that is comparable to RUNPATH.  It is
only concerned with finding libraries/modules by *name* in one of the
directories specified by the PYTHONPATH environment variable.

But actually the PYTHONPATH variable is not the only means to affect the
search path for modules.  It is possible to change the search path
programmatically:

    import sys
    sys.path.append("/gnu/store/cabba9e...-numpy.../lib/...")
    import numpy

The first two lines add an explicit store item path to the search path;
the third line just imports the numpy module by name (as usual).  Even
without setting the PYTHONPATH to include the numpy in the profile the
third line won’t fail.
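
The same effect can be tried without a store at all.  The following is
only a sketch: a temporary directory stands in for the store item, and
the module name `guix_demo_mod` is made up.

```python
import importlib
import os
import sys
import tempfile

# Stand-in for a store item: a temporary directory holding one module.
store_item = tempfile.mkdtemp(prefix="fake-store-item-")
with open(os.path.join(store_item, "guix_demo_mod.py"), "w") as f:
    f.write("ANSWER = 42\n")

# The directory is not on the search path yet.
assert store_item not in sys.path

sys.path.append(store_item)      # same call as in the example above
importlib.invalidate_caches()    # harmless; ensures the directory is scanned
import guix_demo_mod             # found by name, without touching PYTHONPATH

print(guix_demo_mod.ANSWER)
```

Note that `append` adds the directory to the end of `sys.path`, so a
module of the same name found earlier on the path would still win;
`sys.path.insert(0, ...)` is the variant that takes precedence.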

I wonder if we could design a phase that — very much like the
“wrap-program” phase — modifies *every* Python file and injects lines
like the first two in the example above, appending explicit store item
paths, so that all dependent Python libraries can be found without the
need to have them installed in the same profile and without the need to
set PYTHONPATH.
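
As a rough illustration of what such a rewriting step would have to do
to a single file, here is a Python sketch.  It is not a Guix build phase
(those are written in Scheme); the function name `inject_search_paths`
is hypothetical, and it assumes the file parses as Python 3 (3.8 or
later, for `end_lineno`).  The injected lines are placed after any
shebang, encoding comment, module docstring and `from __future__`
imports, because putting them earlier would change the file's meaning
or make it invalid.

```python
import ast


def inject_search_paths(source, store_paths):
    """Return SOURCE with sys.path.append() lines for STORE_PATHS added."""
    header = ["import sys"]
    header += ["sys.path.append(%r)" % path for path in store_paths]

    lines = source.splitlines()

    # Keep a shebang and an encoding comment (first two lines) at the top.
    insert_at = 0
    while insert_at < min(2, len(lines)) and lines[insert_at].startswith("#"):
        insert_at += 1

    # A module docstring and "from __future__" imports must stay first too.
    for index, node in enumerate(ast.parse(source).body):
        is_docstring = (index == 0
                        and isinstance(node, ast.Expr)
                        and isinstance(node.value, ast.Constant)
                        and isinstance(node.value.value, str))
        is_future = (isinstance(node, ast.ImportFrom)
                     and node.module == "__future__")
        if not (is_docstring or is_future):
            break
        insert_at = max(insert_at, node.end_lineno)

    return "\n".join(lines[:insert_at] + header + lines[insert_at:]) + "\n"


example = (
    "#!/usr/bin/env python\n"
    '"""Doc."""\n'
    "from __future__ import print_function\n"
    "import numpy\n"
)
# The store path is kept elided, exactly as in the example above.
patched = inject_search_paths(example, ["/gnu/store/cabba9e...-numpy.../lib/..."])
print(patched)
```

The sketch ignores corner cases such as several statements on one line
separated by semicolons, and it says nothing about compiled extension
modules or byte-compiled files, which would need separate treatment.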

Maybe this is crazy, and maybe this causes other annoying problems, but
I think it is the closest we can get to a RUNPATH-like feature in
Python.

What do you think?  Do I need a long vacation or is this something we
might realistically do in an automated fashion?

~~ Ricardo

* Re: Python and propagation
From: Federico Beffa @ 2016-02-22 17:08 UTC
  To: ricardo.wurmus; +Cc: Guix-devel

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> 1) print a warning when a collision is expected to happen, not when a
> collision has happened.

+1

> 2) avoid PYTHONPATH, patch all Python files invasively!
>
> Python does not have any feature that is comparable to RUNPATH.  It is
> only concerned with finding libraries/modules by *name* in one of the
> directories specified by the PYTHONPATH environment variable.
>
> But actually the PYTHONPATH variable is not the only means to affect the
> search path for modules.  It is possible to change the search path
> programmatically:
>
>     import sys
>     sys.path.append("/gnu/store/cabba9e...-numpy.../lib/...")
>     import numpy
>
> The first two lines add an explicit store item path to the search path;
> the third line just imports the numpy module by name (as usual).  Even
> without setting the PYTHONPATH to include the numpy in the profile the
> third line won’t fail.
>
> I wonder if we could design a phase that — very much like the
> “wrap-program” phase — modifies *every* Python file and injects lines
> like the first two in the example above, appending explicit store item
> paths, so that all dependent Python libraries can be found without the
> need to have them installed in the same profile and without the need to
> set PYTHONPATH.
>
> Maybe this is crazy, and maybe this causes other annoying problems, but
> I think it is the closest we can get to a RUNPATH-like feature in
> Python.
>
> What do you think?  Do I need a long vacation or is this something we
> might realistically do in an automated fashion?

Doesn't that still leave the problem of how the Python interpreter
finds those patched libraries in the first place?

Maybe something along these lines could work:

* Instead of installing all files related to a module into profiles,
  only install a .pth file in 'site-packages' with a unique name. The
  name could even include the Guix package hash. This file should
  include the full path to the store ('real') package. This could maybe
  be achieved with 'python-build-system' creating two derivations: a
  derivation including only the .pth file and the 'real' package
  derivation (similar to a wrapped program pointing to the real one).

  For example

  $ cd ~/tmp/ttmp
  $ echo -e "aspell\nmechanics" > test.pth
  $ cd
  $ python
  >>> import sys
  >>> import site
  >>> site.addsitedir("/home/beffa/tmp/ttmp")
  >>> sys.path[-3:]
  ['/home/beffa/tmp/ttmp', '/home/beffa/tmp/ttmp/aspell',
  '/home/beffa/tmp/ttmp/mechanics']

* Then you can install many different versions of a package and they
  will not mask each other.

  To select a specific version we could then possibly use pkg_resources.

  For example:

  >>> import pkg_resources
  >>> pkg_resources.require("numpy==1.9.1")
  [numpy 1.9.1 (/gnu/store/hbdzccnvnlf54mflgcigqw2jv4ybippv-profile/lib/python3.4/site-packages)]
  >>> import numpy

  If this always works (I'm not sure), we could patch this info into the
  packages at compile time.
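
The .pth part of this can be tried in isolation.  The sketch below uses
two temporary directories as stand-ins: one plays the 'real' store item,
the other plays the profile's 'site-packages' directory, which receives
nothing but a .pth file.  The module name `guix_pth_demo` and the .pth
file name are made up, and unlike the session above the .pth file holds
an absolute path, which is what the proposal would need.

```python
import os
import site
import sys
import tempfile

# Stand-ins for the 'real' store item and for the profile's site-packages.
real_package = tempfile.mkdtemp(prefix="fake-store-item-")
site_packages = tempfile.mkdtemp(prefix="fake-site-packages-")

with open(os.path.join(real_package, "guix_pth_demo.py"), "w") as f:
    f.write("VERSION = '1.0'\n")

# Each line of a .pth file names a directory to add to sys.path.
# Absolute paths are accepted; entries that do not exist are skipped.
with open(os.path.join(site_packages, "hash-guix-pth-demo.pth"), "w") as f:
    f.write(real_package + "\n")

# Process the .pth files in the stand-in site-packages directory.
site.addsitedir(site_packages)

import guix_pth_demo             # lives only in the 'real' directory

print(guix_pth_demo.VERSION)
```

This only shows that the indirection works; which version is picked
when several .pth files lead to modules of the same name is decided by
the order of `sys.path` and is not addressed here.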

Regards,
Fede

