From: Ricardo Wurmus
Subject: PYTHONPATH woes
Date: Tue, 20 Feb 2018 11:53:54 +0100
To: guix-devel@gnu.org
List-Id: "Development of GNU Guix and the GNU System distribution."

Hi Guix,

we have a couple of packages that provide scripts that depend on Python
modules.  We wrap them in PYTHONPATH to ensure that the correct Python
modules are found at runtime.  This is not enough.
We don’t wrap them tightly enough; instead we allow for a user-provided
PYTHONPATH value to be added to the PYTHONPATH we specified.  The result
is that a user-set PYTHONPATH can act like LD_LIBRARY_PATH: it causes
chaos.  This is despite the fact that we make sure that the wrapper’s
PYTHONPATH comes first!

Suppose a user installs python@2 and python2-statsmodels; at a later
point the user upgrades Guix, and then installs the ribodiff package.
The user does not know that ribodiff is written in Python, nor should
the user be aware of that.  Because Python is installed in the profile,
etc/profile will contain a definition for PYTHONPATH.  The user may
source that etc/profile file to set up all required environment
variables.  But now running the ribodiff scripts fails!

Here’s what happens: the PYTHONPATH that Guix sets for the profile now
contains an incompatible variant of the python2-statsmodels package.
Guix has been upgraded between installing python2-statsmodels and
ribodiff, so a different version of Python was used to build these
modules.  Since the ribodiff wrapper script gladly accepts any set
PYTHONPATH, it causes the ribodiff scripts to load the old and
incompatible python2-statsmodels package instead of the compatible one
from the wrapper.  I don’t know why this happens.

I find it puzzling that in this particular case the user’s profile
contains an *older* version of statsmodels (0.6.1).  The wrapper
includes the correct version of statsmodels (0.8.0) in the PYTHONPATH.
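
To make the difference between the current wrapping and a stricter one
concrete, here is a minimal Python sketch.  It only models how the
PYTHONPATH string gets assembled; it is not the code Guix uses to
generate wrappers, and the function names and both paths are made up for
illustration.

```python
import os

# Hypothetical paths, for illustration only; real store paths carry a hash.
WRAPPED = ["/gnu/store/example-ribodiff-0.2.2/lib/python2.7/site-packages"]
PROFILE = "/home/alice/.guix-profile/lib/python2.7/site-packages"


def split_path(value):
    """Split a PYTHONPATH-style string, dropping empty entries."""
    return [entry for entry in (value or "").split(os.pathsep) if entry]


def lenient_pythonpath(wrapped, inherited):
    """Model of the current behaviour as described above: the wrapper's
    entries come first, followed by whatever the caller's environment had."""
    return os.pathsep.join(list(wrapped) + split_path(inherited))


def strict_pythonpath(wrapped, inherited):
    """Model of a stricter wrapper: the inherited value is ignored."""
    return os.pathsep.join(wrapped)


if __name__ == "__main__":
    # Pretend the user sourced etc/profile, which exported PYTHONPATH.
    print("lenient:", lenient_pythonpath(WRAPPED, PROFILE))
    print("strict: ", strict_pythonpath(WRAPPED, PROFILE))
```

With the lenient policy the profile’s site-packages directory stays on
the search path behind the wrapper’s entries; with the strict policy it
is not on the path at all, so the script can only see what the wrapper
lists.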

Here’s the backtrace:

--8<---------------cut here---------------start------------->8---
Traceback (most recent call last):
  File "/gnu/store/bz9l68hwlvwbp21msm2v002y7s8qfdd3-ribodiff-0.2.2/bin/.TE.py-real", line 81, in <module>
    main()
  File "/gnu/store/bz9l68hwlvwbp21msm2v002y7s8qfdd3-ribodiff-0.2.2/bin/.TE.py-real", line 26, in main
    import ribodiff.estimatedisp as ed
  File "/gnu/store/bz9l68hwlvwbp21msm2v002y7s8qfdd3-ribodiff-0.2.2/lib/python2.7/site-packages/ribodiff/estimatedisp.py", line 7, in <module>
    import rawdisp as rd
  File "/gnu/store/bz9l68hwlvwbp21msm2v002y7s8qfdd3-ribodiff-0.2.2/lib/python2.7/site-packages/ribodiff/rawdisp.py", line 8, in <module>
    import statsmodels.api as sm
  File "/home/uzinnal/.guix-profile/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-linux-x86_64.egg/statsmodels/__init__.py", line 8, in <module>
    from .tools.sm_exceptions import (ConvergenceWarning, CacheWriteWarning,
  File "/home/uzinnal/.guix-profile/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-linux-x86_64.egg/statsmodels/tools/__init__.py", line 1, in <module>
    from .tools import add_constant, categorical
  File "/home/uzinnal/.guix-profile/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-linux-x86_64.egg/statsmodels/tools/tools.py", line 11, in <module>
    from statsmodels.datasets import webuse
  File "/home/uzinnal/.guix-profile/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-linux-x86_64.egg/statsmodels/datasets/__init__.py", line 5, in <module>
    from . import (anes96, cancer, committee, ccard, copper, cpunish, elnino,
  File "/home/uzinnal/.guix-profile/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-linux-x86_64.egg/statsmodels/datasets/anes96/__init__.py", line 1, in <module>
    from .data import *
  File "/home/uzinnal/.guix-profile/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-linux-x86_64.egg/statsmodels/datasets/anes96/data.py", line 90, in <module>
    from statsmodels.datasets import utils as du
  File "/home/uzinnal/.guix-profile/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-linux-x86_64.egg/statsmodels/datasets/utils.py", line 13, in <module>
    from pandas import read_csv, DataFrame, Index
  File "/home/uzinnal/.guix-profile/lib/python2.7/site-packages/pandas-0.18.1-py2.7-linux-x86_64.egg/pandas/__init__.py", line 31, in <module>
    "extensions first.".format(module))
ImportError: C extension: /home/uzinnal/.guix-profile/lib/python2.7/site-packages/pandas-0.18.1-py2.7-linux-x86_64.egg/pandas/hashtable.so: undefined symbol: PyUnicodeUCS2_FromStringAndSize not built. If you want to import pandas from the source directory, you may need to run 'python setup.py build_ext --inplace' to build the C extensions first.
--8<---------------cut here---------------end--------------->8---

Now you could say that this is the user’s fault for not using
manifests.  But consider this: what happens if the user had a manifest
and installed “python-statsmodels” instead of the Python 2 variant?
Guix would still set PYTHONPATH and the ribodiff wrapper would still
prefer the profile’s PYTHONPATH over the wrapped value, so it would
cause Python 2 (from ribodiff) to load a Python 3 module of statsmodels.
These are not compatible and again we have a runtime crash.  Manifests
wouldn’t avoid this problem.

Avoiding this problem now requires that users know what language a tool
is implemented in (e.g.
Python 2 for Ribodiff) and make a conscious effort to install these
tools in a separate profile containing no Python 3 modules.  This is not
a reasonable burden to put on users.

What can we do to fix this?  Would it be good to make the wrappers for
Python scripts stricter and not accept any user-set PYTHONPATH?

How do we approach the problem of having both Python 2 modules and
Python 3 modules in the same profile?  PYTHONPATH will be set to refer
to the site-packages directories of both versions, which is never good.
Does Python offer us a way to do better?  Can we make use of pth files
to get around this problem somehow?

-- 
Ricardo