From: zimoun <zimon.toutoune@gmail.com>
To: Phil Beadling <phil@beadling.co.uk>
Cc: help-guix <help-guix@gnu.org>
Subject: Re: python-pyarrow broken for parquet?
Date: Fri, 2 Jul 2021 13:01:11 +0200
Message-ID: <CAJ3okZ2nk=P27ECocr5dXsQgK1k0+um3UFZmdGY2-aER2iCknQ@mail.gmail.com>
In-Reply-To: <CAOvsyQvEZHuDoaTnjWt8xDfOn2=2bd=XaLhKBGViv1JVeg3w=g@mail.gmail.com>

Hi,

On Wed, 30 Jun 2021 at 21:10, Phil Beadling <phil@beadling.co.uk> wrote:

> When trying to use the parquet module in pyarrow - it's not finding the
> internal module.  This is true for latest 4.0.1 and also the previous
> version (shown below).
>
> This is despite the fact that in the underlying apache-arrow package
> parquet support is explicitly turned on as far as I can see:
> https://github.com/guix-mirror/guix/blob/5ed105a8bb1a812975496dc3a091596355a0234c/gnu/packages/databases.scm#L3687
>
> #:configure-flags
> (list "-DARROW_PYTHON=ON"
> "-DARROW_GLOG=ON"
> ;; Parquet options
> "-DARROW_PARQUET=ON"
> "-DPARQUET_BUILD_EXECUTABLES=ON".....

The package 'apache-arrow' is built with Parquet support, but the
package 'python-pyarrow' is not.  To add Parquet support to the Python
bindings, the environment variable PYARROW_WITH_PARQUET should be set
to 1, if I recall the documentation correctly.

Tweaking the 'python-pyarrow' with:

--8<---------------cut here---------------start------------->8---
      (add-after 'unpack 'set-env
        (lambda _ (setenv "PYARROW_WITH_PARQUET" "1") #t)))))
--8<---------------cut here---------------end--------------->8---

then running "./pre-inst-env guix build python-pyarrow --no-grafts" I get:

--8<---------------cut here---------------start------------->8---
-- Found Arrow:
/gnu/store/bx4lprdcq5zw372f2rcc27c412chximr-apache-arrow-4.0.1-include/share/include
(found version "4.0.1")
-- Arrow version: 4.0.1 (CMake package configuration: Arrow)
-- Arrow SO and ABI version: 400
-- Arrow full SO version: 400.1.0
-- Found the Arrow core shared library:
/gnu/store/cagr2n03155mdx1mim8cxva1v22qhh0y-apache-arrow-4.0.1-lib/lib/libarrow.so
-- Found the Arrow core import library:
-- Found the Arrow core static library:
-- Could NOT find ArrowPython (missing: ArrowPython_DIR)
-- Checking for module 'arrow-python'
--   Found arrow-python, version 4.0.1
-- Found ArrowPython:
/gnu/store/bx4lprdcq5zw372f2rcc27c412chximr-apache-arrow-4.0.1-include/share/include
(found version "4.0.1")
-- Found the Arrow Python by pkg-config: arrow-python
-- Found the Arrow Python shared library:
/gnu/store/cagr2n03155mdx1mim8cxva1v22qhh0y-apache-arrow-4.0.1-lib/lib/libarrow_python.so
-- Found the Arrow Python import library:
-- Found the Arrow Python static library: ARROW_PYTHON_static_lib-NOTFOUND
-- Could NOT find Parquet (missing: Parquet_DIR)
-- Checking for module 'parquet'
--   Found parquet, version 4.0.1
-- Found Parquet:
/gnu/store/cagr2n03155mdx1mim8cxva1v22qhh0y-apache-arrow-4.0.1-lib//gnu/store/bx4lprdcq5zw372f2rcc27c412chximr-apache-arrow-4.0.1-include/share/include
(found version "4.0.1")
-- Parquet version: 4.0.1 (pkg-config: parquet)
-- Found the Parquet shared library:
/gnu/store/cagr2n03155mdx1mim8cxva1v22qhh0y-apache-arrow-4.0.1-lib/lib/libparquet.so
-- Found the Parquet import library:
-- Found the Parquet static library: PARQUET_static_lib-NOTFOUND
-- Configuring done
CMake Error in CMakeLists.txt:
  Imported target "parquet_shared" includes non-existent path

    "/gnu/store/cagr2n03155mdx1mim8cxva1v22qhh0y-apache-arrow-4.0.1-lib//gnu/store/bx4lprdcq5zw372f2rcc27c412chximr-apache-arrow-4.0.1-include/share/include"

  in its INTERFACE_INCLUDE_DIRECTORIES.  Possible reasons include:

  * The path was deleted, renamed, or moved to another location.

  * An install or uninstall procedure did not complete successfully.

  * The installation package was faulty and references files it does not
  provide.


CMake Error in CMakeLists.txt:
  Imported target "parquet_shared" includes non-existent path

    "/gnu/store/cagr2n03155mdx1mim8cxva1v22qhh0y-apache-arrow-4.0.1-lib//gnu/store/bx4lprdcq5zw372f2rcc27c412chximr-apache-arrow-4.0.1-include/share/include"

  in its INTERFACE_INCLUDE_DIRECTORIES.  Possible reasons include:

  * The path was deleted, renamed, or moved to another location.

  * An install or uninstall procedure did not complete successfully.

  * The installation package was faulty and references files it does not
  provide.
--8<---------------cut here---------------end--------------->8---

Without the environment variable PYARROW_WITH_PARQUET set to 1, the
configuration does not look for Parquet at all, for instance:

--8<---------------cut here---------------start------------->8---
-- Found Arrow:
/gnu/store/bx4lprdcq5zw372f2rcc27c412chximr-apache-arrow-4.0.1-include/share/include
(found version "4.0.1")
-- Arrow version: 4.0.1 (CMake package configuration: Arrow)
-- Arrow SO and ABI version: 400
-- Arrow full SO version: 400.1.0
-- Found the Arrow core shared library:
/gnu/store/cagr2n03155mdx1mim8cxva1v22qhh0y-apache-arrow-4.0.1-lib/lib/libarrow.so
-- Found the Arrow core import library:
-- Found the Arrow core static library:
-- Could NOT find ArrowPython (missing: ArrowPython_DIR)
-- Checking for module 'arrow-python'
--   Found arrow-python, version 4.0.1
-- Found ArrowPython:
/gnu/store/bx4lprdcq5zw372f2rcc27c412chximr-apache-arrow-4.0.1-include/share/include
(found version "4.0.1")
-- Found the Arrow Python by pkg-config: arrow-python
-- Found the Arrow Python shared library:
/gnu/store/cagr2n03155mdx1mim8cxva1v22qhh0y-apache-arrow-4.0.1-lib/lib/libarrow_python.so
-- Found the Arrow Python import library:
-- Found the Arrow Python static library: ARROW_PYTHON_static_lib-NOTFOUND
-- Configuring done
-- Generating done
-- Build files have been written to:
/tmp/guix-build-python-pyarrow-4.0.1.drv-0/source/python/build/temp.linux-x86_64-3.8
-- Finished cmake for pyarrow
-- Running cmake --build for pyarrow
cmake --build . --config release --
make[1]: Entering directory
'/tmp/guix-build-python-pyarrow-4.0.1.drv-0/source/python/build/temp.linux-x86_64-3.8'
--8<---------------cut here---------------end--------------->8---

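Well, setting the environment variable is not enough.  :-)  The include
path reported for "parquet_shared" looks like the 'lib' output prefix
glued to the absolute path of the 'include' output, and that path does
not exist.  Do you want to try fixing it?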
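
To reproduce this without editing a Guix checkout, the same phase can
be added through a package variant.  This is only a minimal sketch: it
assumes 'python-pyarrow' is exported by (gnu packages databases) and
that the package already defines #:phases; the name
'python-pyarrow/parquet' is made up for the example.  Going by the log
above, the build should stop at the same CMake error rather than fix
it.

--8<---------------cut here---------------start------------->8---
;; Sketch only: a variant of 'python-pyarrow' with the extra phase.
;; The variable name 'python-pyarrow/parquet' is hypothetical.
(use-modules (guix packages)
             (guix utils)
             (gnu packages databases)) ; assumed home of 'python-pyarrow'

(define python-pyarrow/parquet
  (package
    (inherit python-pyarrow)
    (arguments
     (substitute-keyword-arguments (package-arguments python-pyarrow)
       ((#:phases phases)
        `(modify-phases ,phases
           (add-after 'unpack 'set-env
             (lambda _
               ;; Ask pyarrow's setup.py to build the Parquet bindings.
               (setenv "PYARROW_WITH_PARQUET" "1")
               #t))))))))

;; Evaluate to the package so that "guix build -f FILE" can build it.
python-pyarrow/parquet
--8<---------------cut here---------------end--------------->8---

Saved as, say, pyarrow-parquet.scm (a hypothetical file name), it can
be built with "guix build -f pyarrow-parquet.scm --no-grafts".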


All the best,
simon


Thread overview: 8+ messages
2021-06-30 18:35 python-pyarrow broken for parquet? Phil Beadling
2021-07-02 11:01 ` zimoun [this message]
2021-07-02 15:34   ` phil
2021-07-05 12:13     ` Phil Beadling
2021-07-05 12:15       ` Phil Beadling
2021-07-28 18:10         ` Ricardo Wurmus
2021-07-28 20:37           ` Phil
2021-08-17 12:29             ` zimoun
