* [bug#41677] Update apache-arrow to v0.17.1
@ 2020-06-02 22:35 Katherine Cox-Buday
2020-06-09 8:47 ` bug#41677: " Ludovic Courtès
From: Katherine Cox-Buday @ 2020-06-02 22:35 UTC
To: 41677
[-- Attachment #1: Type: text/plain, Size: 257 bytes --]
17:34 kate says: guix refresh --list-dependent apache-arrow@0.10.0
Building the following 2 packages would ensure 3 dependent packages
are rebuilt: python-feather-format@0.4.0 python2-pyarrow@0.10.0
All dependent packages have been updated so that they continue to build.
[-- Attachment #2: 0002-gnu-apache-arrow-Update-to-0.17.1.patch --]
[-- Type: text/x-patch, Size: 11207 bytes --]
From 56162b1b94cbc9a85e7ee7358c9140dd80052cd4 Mon Sep 17 00:00:00 2001
From: Katherine Cox-Buday <cox.katherine.e@gmail.com>
Date: Tue, 2 Jun 2020 16:33:36 -0500
Subject: [PATCH 2/2] gnu: apache-arrow: Update to 0.17.1.
* gnu/packages/databases.scm (apache-arrow): Update to 0.17.1.
* gnu/packages/databases.scm (python-pyarrow): Update to 0.17.1.
* gnu/packages/serialization.scm (python-feather-format): Update to 0.4.1.
---
gnu/packages/databases.scm | 145 ++++++++++++++++++++++-----------
gnu/packages/serialization.scm | 4 +-
2 files changed, 100 insertions(+), 49 deletions(-)
diff --git a/gnu/packages/databases.scm b/gnu/packages/databases.scm
index ba15260c1f..c5d5c661a3 100644
--- a/gnu/packages/databases.scm
+++ b/gnu/packages/databases.scm
@@ -88,7 +88,9 @@
#:use-module (gnu packages language)
#:use-module (gnu packages libevent)
#:use-module (gnu packages linux)
+ #:use-module (gnu packages logging)
#:use-module (gnu packages man)
+ #:use-module (gnu packages maths)
#:use-module (gnu packages ncurses)
#:use-module (gnu packages onc-rpc)
#:use-module (gnu packages parallel)
@@ -98,6 +100,7 @@
#:use-module (gnu packages perl-web)
#:use-module (gnu packages pkg-config)
#:use-module (gnu packages popt)
+ #:use-module (gnu packages protobuf)
#:use-module (gnu packages python)
#:use-module (gnu packages python-crypto)
#:use-module (gnu packages python-web)
@@ -105,6 +108,8 @@
#:use-module (gnu packages python-xyz)
#:use-module (gnu packages rdf)
#:use-module (gnu packages readline)
+ #:use-module (gnu packages regex)
+ #:use-module (gnu packages rpc)
#:use-module (gnu packages ruby)
#:use-module (gnu packages serialization)
#:use-module (gnu packages sphinx)
@@ -3217,20 +3222,22 @@ Monitor read/write activity on a mongo server
@end table")
(license license:asl2.0)))
+;; There are many wrappers for this in other languages. When touching, please
+;; make sure all dependent packages continue to build.
(define-public apache-arrow
(package
(name "apache-arrow")
- (version "0.10.0")
+ (version "0.17.1")
(source
- (origin
- (method git-fetch)
- (uri (git-reference
- (url "https://github.com/apache/arrow")
- (commit (string-append "apache-arrow-" version))))
- (file-name (git-file-name name version))
- (sha256
- (base32
- "04xkp922b8qrrnpvv9ixxnvk7151n1plzx6aqdff6frn9651zvxs"))))
+ (origin
+ (method git-fetch)
+ (uri (git-reference
+ (url "https://github.com/apache/arrow")
+ (commit (string-append "apache-arrow-" version))))
+ (file-name (git-file-name name version))
+ (sha256
+ (base32
+ "02r6yx3yhywzikd3b0vfkjgddhfiriyx2vpm3jf5880wq59x798a"))))
(build-system cmake-build-system)
(arguments
`(#:tests? #f
@@ -3243,91 +3250,135 @@ Monitor read/write activity on a mongo server
(setenv "BOOST_ROOT" (assoc-ref %build-inputs "boost"))
(setenv "BROTLI_HOME" (assoc-ref %build-inputs "brotli"))
(setenv "FLATBUFFERS_HOME" (assoc-ref %build-inputs "flatbuffers"))
- (setenv "JEMALLOC_HOME" (assoc-ref %build-inputs "jemalloc"))
(setenv "RAPIDJSON_HOME" (assoc-ref %build-inputs "rapidjson"))
#t)))
#:build-type "Release"
#:configure-flags
(list "-DARROW_PYTHON=ON"
-
- ;; Install to PREFIX/lib (the default is
- ;; PREFIX/lib64).
- (string-append "-DCMAKE_INSTALL_LIBDIR="
- (assoc-ref %outputs "out")
+ "-DARROW_GLOG=ON"
+ ;; Parquet options
+ "-DARROW_PARQUET=ON"
+ "-DPARQUET_BUILD_EXECUTABLES=ON"
+ ;; The maintainers disallow using system versions of
+ ;; jemalloc:
+ ;; https://issues.apache.org/jira/browse/ARROW-3507. This
+ ;; is unfortunate because jemalloc increases performance:
+ ;; https://arrow.apache.org/blog/2018/07/20/jemalloc/.
+ "-DARROW_JEMALLOC=OFF"
+
+ ;; The CMake option ARROW_DEPENDENCY_SOURCE is a global
+ ;; option that instructs the build system how to resolve
+ ;; each dependency. SYSTEM means finding the dependency in
+ ;; system paths using CMake's built-in find_package
+ ;; function, or using pkg-config for packages that do not
+ ;; have this feature.
+ "-DARROW_DEPENDENCY_SOURCE=SYSTEM"
+
+ ;; Split output into its component packages.
+ (string-append "-DCMAKE_INSTALL_PREFIX="
+ (assoc-ref %outputs "lib"))
+ (string-append "-DCMAKE_INSTALL_RPATH="
+ (assoc-ref %outputs "lib")
"/lib")
-
- ;; XXX These Guix package offer static
- ;; libraries that are not position independent,
- ;; and ld fails to link them into the arrow .so
- "-DARROW_WITH_SNAPPY=OFF"
- "-DARROW_WITH_ZLIB=OFF"
- "-DARROW_WITH_ZSTD=OFF"
- "-DARROW_WITH_LZ4=OFF"
+ (string-append "-DCMAKE_INSTALL_BINDIR="
+ (assoc-ref %outputs "out")
+ "/bin")
+ (string-append "-DCMAKE_INSTALL_INCLUDEDIR="
+ (assoc-ref %outputs "include")
+ "/share/include")
+
+
+ "-DARROW_WITH_SNAPPY=ON"
+ "-DARROW_WITH_ZLIB=ON"
+ "-DARROW_WITH_ZSTD=ON"
+ "-DARROW_WITH_LZ4=ON"
+ "-DARROW_COMPUTE=ON"
+ "-DARROW_CSV=ON"
+ "-DARROW_DATASET=ON"
+ "-DARROW_FILESYSTEM=ON"
+ "-DARROW_HDFS=ON"
+ "-DARROW_JSON=ON"
+ ;; Arrow Python C++ integration library (required for
+ ;; building pyarrow). This library must be built against
+ ;; the same Python version for which you are building
+ ;; pyarrow. NumPy must also be installed. Enabling this
+ ;; option also enables ARROW_COMPUTE, ARROW_CSV,
+ ;; ARROW_DATASET, ARROW_FILESYSTEM, ARROW_HDFS, and
+ ;; ARROW_JSON.
+ "-DARROW_PYTHON=ON"
;; Building the tests forces on all the
;; optional features and the use of static
;; libraries.
"-DARROW_BUILD_TESTS=OFF"
+ "-DBENCHMARK_ENABLE_GTEST_TESTS=OFF"
+ ;;"-DBENCHMARK_ENABLE_TESTING=OFF"
"-DARROW_BUILD_STATIC=OFF")))
(inputs
`(("boost" ,boost)
- ("rapidjson" ,rapidjson)
("brotli" ,google-brotli)
- ("flatbuffers" ,flatbuffers)
- ("jemalloc" ,jemalloc)
+ ("double-conversion" ,double-conversion)
+ ("snappy" ,snappy)
+ ("gflags" ,gflags)
+ ("glog" ,glog)
+ ("apache-thrift" ,apache-thrift "lib")
+ ("protobuf" ,protobuf)
+ ("rapidjson" ,rapidjson)
+ ("zlib" ,zlib)
+ ("bzip2" ,bzip2)
+ ("lz4" ,lz4)
+ ("zstd" ,zstd "lib")
+ ("re2" ,re2)
+ ("grpc" ,grpc)
("python-3" ,python)
("python-numpy" ,python-numpy)))
+ (native-inputs
+ `(("pkg-config" ,pkg-config)))
+ (outputs '("out" "lib" "include"))
(home-page "https://arrow.apache.org/")
(synopsis "Columnar in-memory analytics")
(description "Apache Arrow is a columnar in-memory analytics layer
-designed to accelerate big data. It houses a set of canonical in-memory
+designed to accelerate big data. It houses a set of canonical in-memory
representations of flat and hierarchical data along with multiple
-language-bindings for structure manipulation. It also provides IPC and common
+language-bindings for structure manipulation. It also provides IPC and common
algorithm implementations.")
(license license:asl2.0)))
(define-public python-pyarrow
(package
+ (inherit apache-arrow)
(name "python-pyarrow")
- (version "0.10.0")
- (source
- (origin
- (method git-fetch)
- (uri (git-reference
- (url "https://github.com/apache/arrow")
- (commit (string-append "apache-arrow-" version))))
- (file-name (git-file-name name version))
- (sha256
- (base32
- "04xkp922b8qrrnpvv9ixxnvk7151n1plzx6aqdff6frn9651zvxs"))))
(build-system python-build-system)
(arguments
- '(#:tests? #f ; XXX There are no tests in the "python" directory
+ '(#:tests? #f ; XXX There are no tests in the "python" directory
#:phases
(modify-phases %standard-phases
(delete 'build) ; XXX the build is performed again during the install phase
(add-after 'unpack 'enter-source-directory
(lambda _ (chdir "python") #t))
- (add-after 'unpack 'set-env
+ (add-after 'unpack 'make-git-checkout-writable
(lambda _
- (setenv "ARROW_HOME" (assoc-ref %build-inputs "apache-arrow"))
+ (for-each make-file-writable (find-files "."))
#t)))))
(propagated-inputs
- `(("apache-arrow" ,apache-arrow)
+ `(("apache-arrow" ,apache-arrow "lib")
("python-numpy" ,python-numpy)
("python-pandas" ,python-pandas)
("python-six" ,python-six)))
(native-inputs
`(("cmake" ,cmake-minimal)
+ ("pkg-config" ,pkg-config)
("python-cython" ,python-cython)
("python-pytest" ,python-pytest)
("python-pytest-runner" ,python-pytest-runner)
("python-setuptools-scm" ,python-setuptools-scm)))
+ (outputs '("out"))
(home-page "https://arrow.apache.org/docs/python/")
(synopsis "Python bindings for Apache Arrow")
- (description "This library provides a Pythonic API wrapper for the reference
-Arrow C++ implementation, along with tools for interoperability with pandas,
-NumPy, and other traditional Python scientific computing packages.")
+ (description
+ "This library provides a Pythonic API wrapper for the reference Arrow C++
+implementation, along with tools for interoperability with pandas, NumPy, and
+other traditional Python scientific computing packages.")
(license license:asl2.0)))
(define-public python2-pyarrow
diff --git a/gnu/packages/serialization.scm b/gnu/packages/serialization.scm
index bee7a2e917..be81715116 100644
--- a/gnu/packages/serialization.scm
+++ b/gnu/packages/serialization.scm
@@ -467,14 +467,14 @@ game development and other performance-critical applications.")
(define-public python-feather-format
(package
(name "python-feather-format")
- (version "0.4.0")
+ (version "0.4.1")
(source
(origin
(method url-fetch)
(uri (pypi-uri "feather-format" version))
(sha256
(base32
- "1adivm5w5ji4qv7hq7942vqlk8l2wgw87bdlsia771z14z3zp857"))))
+ "00w9hwz7sj3fkdjc378r066vdy6lpxmn6vfac3qx956k8lvpxxj5"))))
(build-system python-build-system)
(propagated-inputs
`(("python-pandas" ,python-pandas)
--
2.26.2
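
For readers following the output split in the apache-arrow patch above, here is a minimal sketch of how the configure flags map each output name to an install directory. It runs in plain Guile rather than inside a Guix build, so `%outputs` is defined by hand here, and the store paths are hypothetical placeholders (`HASH` stands in for a real store hash). Only the flag construction itself is taken from the patch.

```scheme
;; Illustrative sketch only: runs in plain Guile, outside of a Guix build.
;; In a real build, Guix binds %outputs; here it is a hand-written stand-in
;; and the store paths are hypothetical placeholders, not real hashes.
(define %outputs
  '(("out"     . "/gnu/store/HASH-apache-arrow-0.17.1")
    ("lib"     . "/gnu/store/HASH-apache-arrow-0.17.1-lib")
    ("include" . "/gnu/store/HASH-apache-arrow-0.17.1-include")))

;; Same flag construction as in the patch: each CMake install variable is
;; pointed at the directory of a different output.
(define configure-flags
  (list (string-append "-DCMAKE_INSTALL_PREFIX="
                       (assoc-ref %outputs "lib"))
        (string-append "-DCMAKE_INSTALL_RPATH="
                       (assoc-ref %outputs "lib") "/lib")
        (string-append "-DCMAKE_INSTALL_BINDIR="
                       (assoc-ref %outputs "out") "/bin")
        (string-append "-DCMAKE_INSTALL_INCLUDEDIR="
                       (assoc-ref %outputs "include") "/share/include")))

;; Print one flag per line to show where each component would be installed.
(for-each (lambda (flag) (display flag) (newline))
          configure-flags)
```

A dependent package then selects one of these outputs by name in its input list, as python-pyarrow does in the patch with `("apache-arrow" ,apache-arrow "lib")`.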
[-- Attachment #3: 0001-gnu-Add-apache-thrift.patch --]
[-- Type: text/x-patch, Size: 2864 bytes --]
From 8765a3958d8940f3897ce32c22c239a0727e3f23 Mon Sep 17 00:00:00 2001
From: Katherine Cox-Buday <cox.katherine.e@gmail.com>
Date: Tue, 2 Jun 2020 16:25:37 -0500
Subject: [PATCH 1/2] gnu: Add apache-thrift.
* gnu/packages/rpc.scm (apache-thrift): New variable.
---
gnu/packages/rpc.scm | 49 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)
diff --git a/gnu/packages/rpc.scm b/gnu/packages/rpc.scm
index 28c61b54f9..de7d08ae4d 100644
--- a/gnu/packages/rpc.scm
+++ b/gnu/packages/rpc.scm
@@ -25,10 +25,16 @@
#:use-module (guix download)
#:use-module (guix utils)
#:use-module (guix build-system cmake)
+ #:use-module (guix build-system gnu)
#:use-module (guix build-system python)
#:use-module (gnu packages adns)
+ #:use-module (gnu packages autotools)
+ #:use-module (gnu packages bison)
+ #:use-module (gnu packages boost)
#:use-module (gnu packages compression)
#:use-module (gnu packages cpp)
+ #:use-module (gnu packages flex)
+ #:use-module (gnu packages pkg-config)
#:use-module (gnu packages protobuf)
#:use-module (gnu packages python)
#:use-module (gnu packages python-xyz)
@@ -192,3 +198,46 @@ browsers to backend services.")
(description "This package provides a Python library for communicating
with the HTTP/2-based RPC framework gRPC.")
(license license:asl2.0)))
+
+(define-public apache-thrift
+ (package
+ (name "apache-thrift")
+ (version "0.13.0")
+ (source
+ (origin
+ (method git-fetch)
+ (uri (git-reference
+ (url "https://github.com/apache/thrift.git")
+ (commit (string-append "v" version))))
+ (file-name (git-file-name name version))
+ (sha256
+ (base32
+ "17ckl7p7s3ga33yrjisilsimp80ansqxl54wvpkv0j7vx2zvc13y"))))
+ (build-system gnu-build-system)
+ (arguments
+ '(#:tests? #f
+ #:configure-flags
+ (list (string-append "--with-boost="
+ (assoc-ref %build-inputs "boost")))))
+ (native-inputs
+ `(("autoconf" ,autoconf)
+ ("automake" ,automake)
+ ("libtool" ,libtool)
+ ("pkg-config" ,pkg-config)
+ ("flex" ,flex)
+ ("bison" ,bison)))
+ (inputs
+ `(("boost" ,boost)
+ ("libressl" ,libressl)))
+ (outputs '("out" "lib" "include"))
+ (home-page "https://thrift.apache.org/")
+ (synopsis
+ "Lightweight, language-independent software stack for point-to-point
+RPC")
+ (description
+ "Thrift provides clean abstractions and implementations for data
+transport, data serialization, and application level processing. The code
+generation system takes a simple definition language as input and generates
+code across programming languages that uses the abstracted stack to build
+interoperable RPC clients and servers.")
+ (license license:asl2.0)))
--
2.26.2
* bug#41677: Update apache-arrow to v0.17.1
From: Ludovic Courtès @ 2020-06-09 8:47 UTC
To: Katherine Cox-Buday; +Cc: 41677-done
Hello Katherine,
Katherine Cox-Buday <cox.katherine.e@gmail.com> skribis:
> From 56162b1b94cbc9a85e7ee7358c9140dd80052cd4 Mon Sep 17 00:00:00 2001
> From: Katherine Cox-Buday <cox.katherine.e@gmail.com>
> Date: Tue, 2 Jun 2020 16:33:36 -0500
> Subject: [PATCH 2/2] gnu: apache-arrow: Update to 0.17.1.
>
> * gnu/packages/databases.scm (apache-arrow): Update to 0.17.1.
> * gnu/packages/databases.scm (python-pyarrow): Update to 0.17.1.
> * gnu/packages/serialization.scm (python-feather-format): Update to 0.4.1.
[...]
> From 8765a3958d8940f3897ce32c22c239a0727e3f23 Mon Sep 17 00:00:00 2001
> From: Katherine Cox-Buday <cox.katherine.e@gmail.com>
> Date: Tue, 2 Jun 2020 16:25:37 -0500
> Subject: [PATCH 1/2] gnu: Add apache-thrift.
>
> * gnu/packages/rpc.scm (apache-thrift): New variable.
Applied, thanks!
Ludo’.