unofficial mirror of guix-devel@gnu.org 
 help / color / mirror / code / Atom feed
* [PATCH] Add khmer.
@ 2016-06-17  1:23 Ben Woodcroft
  2016-06-17  1:23 ` [PATCH 1/3] gnu: Add python-screed Ben Woodcroft
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Ben Woodcroft @ 2016-06-17  1:23 UTC (permalink / raw)
  To: Guix-devel

This isn't the best name for a piece of software I suggest, but it is
reasonably well known in the field.

'murmur-hash' takes the code from a repository SMHasher, but I'm not
interested in packaging that. That is OK?

TIA as usual.
ben

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/3] gnu: Add python-screed.
  2016-06-17  1:23 [PATCH] Add khmer Ben Woodcroft
@ 2016-06-17  1:23 ` Ben Woodcroft
  2016-06-17  2:07   ` Leo Famulari
  2016-06-17  1:23 ` [PATCH 2/3] gnu: Add murmur-hash Ben Woodcroft
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Ben Woodcroft @ 2016-06-17  1:23 UTC (permalink / raw)
  To: Guix-devel

* gnu/packages/bioinformatics.scm (python-screed, python2-screed):
New variables.
---
 gnu/packages/bioinformatics.scm | 41 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 04ed769..22ed71a 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -3359,6 +3359,47 @@ optimize the sequencing depth, or to screen multiple libraries to avoid low
 complexity samples.")
     (license license:gpl3+)))
 
+(define-public python-screed
+  (package
+    (name "python-screed")
+    (version "0.9")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (string-append
+             "https://pypi.python.org/packages/a0/6b/"
+             "a90c7ab7b0ad1eab211053c85c1242a81379f34c5f5933392079c9b36858"
+             "/screed-" version ".tar.gz"))
+       (sha256
+        (base32
+         "18czszp9fkx3j6jr7y5kp6dfialscgddk05mw1zkhh2zhn0jd8i0"))))
+    (build-system python-build-system)
+    (arguments
+     `(#:phases
+       (modify-phases %standard-phases
+         (replace 'check
+           (lambda _
+             (setenv "PYTHONPATH"
+                     (string-append (getenv "PYTHONPATH") ":."))
+             (zero? (system* "nosetests" "--attr" "!known_failing")))))))
+    (native-inputs
+     `(("python-nose" ,python-nose)))
+    (inputs
+     `(("python-bz2file" ,python-bz2file)))
+    (home-page "http://github.com/dib-lab/screed/")
+    (synopsis "Short read sequence utils")
+    (description "Screed parses FASTA and FASTQ files and generates databases.
+Values such as sequence name, sequence description, sequence quality and the
+sequence itself can be retrieved from these databases.")
+    (license license:bsd-3)))
+
+(define-public python2-screed
+  (let ((base (package-with-python2 (strip-python2-variant python-screed))))
+    (package
+      (inherit base)
+      (native-inputs `(("python2-setuptools" ,python2-setuptools)
+                       ,@(package-native-inputs base))))))
+
 (define-public sra-tools
   (package
     (name "sra-tools")
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/3] gnu: Add murmur-hash.
  2016-06-17  1:23 [PATCH] Add khmer Ben Woodcroft
  2016-06-17  1:23 ` [PATCH 1/3] gnu: Add python-screed Ben Woodcroft
@ 2016-06-17  1:23 ` Ben Woodcroft
  2016-06-17  1:23 ` [PATCH 3/3] gnu: Add khmer Ben Woodcroft
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Ben Woodcroft @ 2016-06-17  1:23 UTC (permalink / raw)
  To: Guix-devel

* gnu/packages/textutils.scm (murmur-hash): New variable.
---
 gnu/packages/textutils.scm | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/gnu/packages/textutils.scm b/gnu/packages/textutils.scm
index ebcf4b9..ce83c4d 100644
--- a/gnu/packages/textutils.scm
+++ b/gnu/packages/textutils.scm
@@ -373,3 +373,42 @@ runs Word\".")
     (description "UTF8-CPP is a C++ library for handling UTF-8 encoded text
 in a portable way.")
     (license license:boost1.0)))
+
+(define-public murmur-hash
+  ;; There are no releases so package directly from a git commit.
+  (let ((commit "61a0530f28277f2e850bfc39600ce61d02b518de"))
+    (package
+      (name "murmur-hash")
+      (version (string-append "0-1." (string-take commit 7)))
+      (source (origin
+                (method git-fetch)
+                (uri (git-reference
+                      (url "https://github.com/aappleby/smhasher.git")
+                      (commit commit)))
+                (file-name (string-append name "-" version ".tar.gz"))
+                (sha256
+                 (base32
+                  "1hasigzz5dyx8dzf1arzwahhph2p4vnsq32wlyh99nga2z4hd8si"))))
+      (build-system gnu-build-system)
+      (arguments
+       `(#:phases
+         (modify-phases %standard-phases
+           (delete 'configure)
+           (delete 'build)
+           (delete 'check)
+           (replace 'install
+             (lambda* (#:key outputs #:allow-other-keys)
+               (with-directory-excursion "src"
+                 (let ((out (string-append
+                             (assoc-ref outputs "out")
+                             "/include")))
+                   (install-file "MurmurHash3.cpp" out)
+                   (install-file "MurmurHash3.h" out)))
+               #t)))))
+      (home-page "https://github.com/aappleby/smhasher")
+      (synopsis "Non-cryptographic hash function for lookup")
+      (description
+       "MurmurHash is a non-cryptographic hash function suitable for general
+hash-based lookup.  The name comes from two basic operations, multiply (MU)
+and rotate (R), used in its inner loop.")
+      (license license:public-domain))))
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/3] gnu: Add khmer.
  2016-06-17  1:23 [PATCH] Add khmer Ben Woodcroft
  2016-06-17  1:23 ` [PATCH 1/3] gnu: Add python-screed Ben Woodcroft
  2016-06-17  1:23 ` [PATCH 2/3] gnu: Add murmur-hash Ben Woodcroft
@ 2016-06-17  1:23 ` Ben Woodcroft
  2016-06-17  7:23   ` Leo Famulari
  2016-06-17  8:16 ` [PATCH] " Leo Famulari
  2016-06-25 17:29 ` Leo Famulari
  4 siblings, 1 reply; 19+ messages in thread
From: Ben Woodcroft @ 2016-06-17  1:23 UTC (permalink / raw)
  To: Guix-devel

* gnu/packages/bioinformatics.scm (khmer): New variable.
* gnu/packages/patches/khmer-use-libraries.patch: New file.
* gnu/local.mk (dist_patch_DATA): Add it.
---
 gnu/local.mk                                   |  1 +
 gnu/packages/bioinformatics.scm                | 92 ++++++++++++++++++++++++++
 gnu/packages/patches/khmer-use-libraries.patch | 16 +++++
 3 files changed, 109 insertions(+)
 create mode 100644 gnu/packages/patches/khmer-use-libraries.patch

diff --git a/gnu/local.mk b/gnu/local.mk
index 55fea0e..bbbe986 100644
--- a/gnu/local.mk
+++ b/gnu/local.mk
@@ -590,6 +590,7 @@ dist_patch_DATA =						\
   %D%/packages/patches/jasper-CVE-2016-2089.patch		\
   %D%/packages/patches/jasper-CVE-2016-2116.patch		\
   %D%/packages/patches/jbig2dec-ignore-testtest.patch		\
+  %D%/packages/patches/khmer-use-libraries.patch                \
   %D%/packages/patches/kmod-module-directory.patch		\
   %D%/packages/patches/ldc-disable-tests.patch			\
   %D%/packages/patches/lftp-dont-save-unknown-host-fingerprint.patch \
diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 22ed71a..7445d7b 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -2301,6 +2301,98 @@ command, or queried for specific k-mers with @code{jellyfish query}.")
     ;; files such as lib/jsoncpp.cpp are released under the Expat license.
     (license (list license:gpl3+ license:expat))))
 
+(define-public khmer
+  (package
+    (name "khmer")
+    (version "2.0")
+    (source
+     (origin
+       (method url-fetch)
+       (uri
+        (string-append
+         "https://pypi.python.org/packages/"
+         "52/3b/2c52a13937197391775f274ed75b4a33b2d7767a904faaf4032e14e10a55/"
+         "khmer-" version ".tar.gz"))
+       (sha256
+        (base32
+         "0wb05shqh77v00256qlm68vbbx3kl76fyzihszbz5nhanl4ni33a"))
+       (patches (search-patches "khmer-use-libraries.patch"))))
+    (build-system python-build-system)
+    (arguments
+     `(#:phases
+       (modify-phases %standard-phases
+         (add-after 'unpack 'set-paths
+           (lambda* (#:key inputs outputs #:allow-other-keys)
+             ;; Delete bundled libraries.
+             (delete-file-recursively "third-party/zlib")
+             (delete-file-recursively "third-party/bzip2")
+             ;; Replace bundled seqan.
+             (let* ((seqan-all "third-party/seqan")
+                    (seqan-include (string-append
+                                    seqan-all "/core/include/seqan")))
+               (delete-file-recursively seqan-all)
+               (mkdir-p seqan-include)
+               (rmdir seqan-include)
+               (copy-file (string-append (assoc-ref inputs "seqan")
+                                         "/include/seqan")
+                          seqan-include))
+             ;; Replace bundled MurmurHash.
+             (let ((smhasher "third-party/smhasher/"))
+               (delete-file-recursively smhasher)
+               (mkdir smhasher)
+               (for-each
+                (lambda (file)
+                  (copy-file
+                   (string-append
+                    (assoc-ref inputs "murmur-hash") "/include/" file)
+                   (string-append smhasher file)))
+                (list "MurmurHash3.cpp" "MurmurHash3.h"))
+               (rename-file
+                (string-append smhasher "MurmurHash3.cpp")
+                (string-append smhasher "MurmurHash3.cc")))
+             (setenv "CC" "gcc")
+             #t))
+         ;; It is simpler to test after installation.
+         (delete 'check)
+         (add-after 'install 'post-install-check
+           (lambda* (#:key inputs outputs #:allow-other-keys)
+             (let ((out (assoc-ref outputs "out")))
+               (setenv "PATH"
+                       (string-append
+                        (getenv "PATH")
+                        ":"
+                        (assoc-ref outputs "out")
+                        "/bin"))
+               (setenv "PYTHONPATH"
+                       (string-append
+                        (getenv "PYTHONPATH")
+                        ":"
+                        out
+                        "/lib/python"
+                        (string-take (string-take-right
+                                      (assoc-ref inputs "python") 5) 3)
+                        "/site-packages"))
+               (with-directory-excursion "build"
+                 (zero? (system* "nosetests" "khmer" "--attr"
+                                 "!known_failing")))))))))
+    (native-inputs
+     `(("murmur-hash" ,murmur-hash)
+       ("seqan" ,seqan)
+       ("python-nose" ,python-nose)))
+    (inputs
+     `(("zlib" ,zlib)
+       ("bzip2" ,bzip2)
+       ("python-screed" ,python-screed)
+       ("python-bz2file" ,python-bz2file)))
+    (home-page "https://khmer.readthedocs.org/")
+    (synopsis "K-mer counting, filtering and graph traversal library")
+    (description "The khmer software is a set of command-line tools for
+working with DNA shotgun sequencing data from genomes, transcriptomes,
+metagenomes and single cells.  Khmer can make de novo assemblies faster, and
+sometimes better.  Khmer can also identify and fix problems with shotgun
+data.")
+    (license license:bsd-3)))
+
 (define-public macs
   (package
     (name "macs")
diff --git a/gnu/packages/patches/khmer-use-libraries.patch b/gnu/packages/patches/khmer-use-libraries.patch
new file mode 100644
index 0000000..47d163a
--- /dev/null
+++ b/gnu/packages/patches/khmer-use-libraries.patch
@@ -0,0 +1,16 @@
+Change setup.cfg so that the bundled zlib and bzip2 are not used.  This cannot
+currently be achieved using "--library z,bz2" as instructed in the setup.py.
+
+diff --git a/setup.cfg b/setup.cfg
+index c054092..080992e 100644
+--- a/setup.cfg
++++ b/setup.cfg
+@@ -1,7 +1,7 @@
+ [build_ext]
+ define = SEQAN_HAS_BZIP2,SEQAN_HAS_ZLIB
+ undef = NO_UNIQUE_RC
+-# libraries = z,bz2
++libraries = z,bz2
+ ## if using system libraries
+ include-dirs = lib:third-party/zlib:third-party/bzip2:third-party/seqan/core/include:third-party/smhasher
+ # include-dirs = lib
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3] gnu: Add python-screed.
  2016-06-17  1:23 ` [PATCH 1/3] gnu: Add python-screed Ben Woodcroft
@ 2016-06-17  2:07   ` Leo Famulari
  2016-06-17 22:57     ` Ben Woodcroft
  0 siblings, 1 reply; 19+ messages in thread
From: Leo Famulari @ 2016-06-17  2:07 UTC (permalink / raw)
  To: Ben Woodcroft; +Cc: Guix-devel

On Fri, Jun 17, 2016 at 11:23:19AM +1000, Ben Woodcroft wrote:
> * gnu/packages/bioinformatics.scm (python-screed, python2-screed):
> New variables.

> +       (uri (string-append
> +             "https://pypi.python.org/packages/a0/6b/"
> +             "a90c7ab7b0ad1eab211053c85c1242a81379f34c5f5933392079c9b36858"

Can you keep the hash on a single line? Just to reduce noise in the
commit log for future updates...

> +    (synopsis "Short read sequence utils")

s/utils/utilities

> +    (description "Screed parses FASTA and FASTQ files and generates databases.

Can you unpack these acronyms?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] gnu: Add khmer.
  2016-06-17  1:23 ` [PATCH 3/3] gnu: Add khmer Ben Woodcroft
@ 2016-06-17  7:23   ` Leo Famulari
  2016-06-17 23:41     ` Ben Woodcroft
  0 siblings, 1 reply; 19+ messages in thread
From: Leo Famulari @ 2016-06-17  7:23 UTC (permalink / raw)
  To: Ben Woodcroft; +Cc: Guix-devel

On Fri, Jun 17, 2016 at 11:23:21AM +1000, Ben Woodcroft wrote:
> +    (build-system python-build-system)
> +    (arguments
> +     `(#:phases
> +       (modify-phases %standard-phases
> +         (add-after 'unpack 'set-paths

This phase deletes bundled libraries and then copies one of the
libraries back in from another package (murmur-hash).

Can khmer refer to murmur-hash without it being bundled at all?

> +           (lambda* (#:key inputs outputs #:allow-other-keys)
> +             ;; Delete bundled libraries.
> +             (delete-file-recursively "third-party/zlib")
> +             (delete-file-recursively "third-party/bzip2")
> +             ;; Replace bundled seqan.
> +             (let* ((seqan-all "third-party/seqan")
> +                    (seqan-include (string-append
> +                                    seqan-all "/core/include/seqan")))
> +               (delete-file-recursively seqan-all)
> +               (mkdir-p seqan-include)
> +               (rmdir seqan-include)

Here it makes the directory seqan-include and then removes it. Should it
be reversed? Would it be simpler to delete the directory and then use
copy-recursively, which I don't think requires mkdir-p?

> +               (copy-file (string-append (assoc-ref inputs "seqan")
> +                                         "/include/seqan")
> +                          seqan-include))
> +             ;; Replace bundled MurmurHash.

> +             (let ((smhasher "third-party/smhasher/"))
> +               (delete-file-recursively smhasher)
> +               (mkdir smhasher)
> +               (for-each
> +                (lambda (file)
> +                  (copy-file
> +                   (string-append
> +                    (assoc-ref inputs "murmur-hash") "/include/" file)
> +                   (string-append smhasher file)))
> +                (list "MurmurHash3.cpp" "MurmurHash3.h"))
> +               (rename-file
> +                (string-append smhasher "MurmurHash3.cpp")
> +                (string-append smhasher "MurmurHash3.cc")))
> +             (setenv "CC" "gcc")

I think this variable setting should be in its own phase.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Add khmer.
  2016-06-17  1:23 [PATCH] Add khmer Ben Woodcroft
                   ` (2 preceding siblings ...)
  2016-06-17  1:23 ` [PATCH 3/3] gnu: Add khmer Ben Woodcroft
@ 2016-06-17  8:16 ` Leo Famulari
  2016-06-17  9:42   ` Ben Woodcroft
  2016-06-25 17:29 ` Leo Famulari
  4 siblings, 1 reply; 19+ messages in thread
From: Leo Famulari @ 2016-06-17  8:16 UTC (permalink / raw)
  To: Ben Woodcroft; +Cc: Guix-devel

On Fri, Jun 17, 2016 at 11:23:18AM +1000, Ben Woodcroft wrote:
> 'murmur-hash' takes the code from a repository SMHasher, but I'm not
> interested in packaging that. That is OK?

I'll try packaging SMHasher. Hopefully it's not too hard.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Add khmer.
  2016-06-17  8:16 ` [PATCH] " Leo Famulari
@ 2016-06-17  9:42   ` Ben Woodcroft
  2016-06-17 17:54     ` Leo Famulari
  0 siblings, 1 reply; 19+ messages in thread
From: Ben Woodcroft @ 2016-06-17  9:42 UTC (permalink / raw)
  To: Leo Famulari, Ben Woodcroft; +Cc: Guix-devel

Hey Leo, thanks for the thoughts.

On 17/06/16 18:16, Leo Famulari wrote:
> On Fri, Jun 17, 2016 at 11:23:18AM +1000, Ben Woodcroft wrote:
>> 'murmur-hash' takes the code from a repository SMHasher, but I'm not
>> interested in packaging that. That is OK?
> I'll try packaging SMHasher. Hopefully it's not too hard.

Go right ahead if you like. However, I wouldn't really recommend it 
because the only function of it is to test the quality of different hash 
algorithms, which probably isn't useful as anything except as an end in 
itself, and in that case a user might want to manage the compilation 
flags themselves anyway. But I won't stand in your way, of course.

What if we change the name of the package to 'smhasher' and leave 
packaging the binary for later if someone is really interested?

Thanks
ben

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Add khmer.
  2016-06-17  9:42   ` Ben Woodcroft
@ 2016-06-17 17:54     ` Leo Famulari
  0 siblings, 0 replies; 19+ messages in thread
From: Leo Famulari @ 2016-06-17 17:54 UTC (permalink / raw)
  To: Ben Woodcroft; +Cc: Guix-devel

On Fri, Jun 17, 2016 at 07:42:52PM +1000, Ben Woodcroft wrote:
> Hey Leo, thanks for the thoughts.
> 
> On 17/06/16 18:16, Leo Famulari wrote:
> > On Fri, Jun 17, 2016 at 11:23:18AM +1000, Ben Woodcroft wrote:
> > > 'murmur-hash' takes the code from a repository SMHasher, but I'm not
> > > interested in packaging that. That is OK?
> > I'll try packaging SMHasher. Hopefully it's not too hard.
> 
> Go right ahead if you like. However, I wouldn't really recommend it because
> the only function of it is to test the quality of different hash algorithms,
> which probably isn't useful as anything except as an end in itself, and in
> that case a user might want to manage the compilation flags themselves
> anyway. But I won't stand in your way, of course.

I saw that SMHasher was just a test suite for hash algorithms [0], but I
don't really have any idea how it would be used in practice. So, perhaps
you're right that every user would have their own set of flags they want
to build with, making our package less useful.

> What if we change the name of the package to 'smhasher' and leave packaging
> the binary for later if someone is really interested?

Is murmur designed to be bundled? If there are no proper releases of a
library, then does it make sense to roll-our-own packaging like this?

[0] But also the reference implementation of murmur and potentially some
other algorithms?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3] gnu: Add python-screed.
  2016-06-17  2:07   ` Leo Famulari
@ 2016-06-17 22:57     ` Ben Woodcroft
  0 siblings, 0 replies; 19+ messages in thread
From: Ben Woodcroft @ 2016-06-17 22:57 UTC (permalink / raw)
  To: Leo Famulari, Ben Woodcroft; +Cc: Guix-devel



On 17/06/16 12:07, Leo Famulari wrote:
> On Fri, Jun 17, 2016 at 11:23:19AM +1000, Ben Woodcroft wrote:
>> * gnu/packages/bioinformatics.scm (python-screed, python2-screed):
>> New variables.
>> +       (uri (string-append
>> +             "https://pypi.python.org/packages/a0/6b/"
>> +             "a90c7ab7b0ad1eab211053c85c1242a81379f34c5f5933392079c9b36858"
> Can you keep the hash on a single line? Just to reduce noise in the
> commit log for future updates...

OK.

>
>> +    (synopsis "Short read sequence utils")
> s/utils/utilities

"database utilities" would be even better I think.

>
>> +    (description "Screed parses FASTA and FASTQ files and generates databases.
> Can you unpack these acronyms?

Acronyms are no fun in descriptions I agree. But the problem here is 
that FASTA and FASTQ aren't really acronyms, and besides these terms are 
extremely well known (like 'tar' or 'zip').

On a side note, FASTA was first used by the "FASTA" program circa 1985 
(just like I am). And just like me it is still going - last release was 
April. I might have a go at packaging it.

ben

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] gnu: Add khmer.
  2016-06-17  7:23   ` Leo Famulari
@ 2016-06-17 23:41     ` Ben Woodcroft
  2016-06-25 17:30       ` Leo Famulari
  0 siblings, 1 reply; 19+ messages in thread
From: Ben Woodcroft @ 2016-06-17 23:41 UTC (permalink / raw)
  To: Leo Famulari, Ben Woodcroft; +Cc: Guix-devel



On 17/06/16 17:23, Leo Famulari wrote:
> On Fri, Jun 17, 2016 at 11:23:21AM +1000, Ben Woodcroft wrote:
>> +    (build-system python-build-system)
>> +    (arguments
>> +     `(#:phases
>> +       (modify-phases %standard-phases
>> +         (add-after 'unpack 'set-paths
> This phase deletes bundled libraries and then copies one of the
> libraries back in from another package (murmur-hash).
>
> Can khmer refer to murmur-hash without it being bundled at all?

By changing the Makefile or similar so that it refers to a 'murmur-hash' 
shared library? Maybe, but I don't think SMHasher creates one of these.

If you mean just referring to the code as it is in murmur-hash, then 
we'd have to change the Makefile so that it refers to that code. I 
figured copying the code in achieves the same thing with less effort on 
our part.

>> +           (lambda* (#:key inputs outputs #:allow-other-keys)
>> +             ;; Delete bundled libraries.
>> +             (delete-file-recursively "third-party/zlib")
>> +             (delete-file-recursively "third-party/bzip2")
>> +             ;; Replace bundled seqan.
>> +             (let* ((seqan-all "third-party/seqan")
>> +                    (seqan-include (string-append
>> +                                    seqan-all "/core/include/seqan")))
>> +               (delete-file-recursively seqan-all)
>> +               (mkdir-p seqan-include)
>> +               (rmdir seqan-include)
> Here it makes the directory seqan-include and then removes it. Should it
> be reversed? Would it be simpler to delete the directory and then use
> copy-recursively, which I don't think requires mkdir-p?

The issue is that there are other (unused) files in 'third-part/seqan' 
that aren't useful code, and I wanted to be sure to delete all the 
bundle before starting the build. How about the slightly simplified

+             (let* ((seqan-all "third-party/seqan")
+                    (seqan-include (string-append
+                                    seqan-all "/core/include")))
+               (delete-file-recursively seqan-all)
+               (mkdir-p seqan-include)
+               (copy-file (string-append (assoc-ref inputs "seqan")
+                                         "/include/seqan")
+                          (string-append seqan-include "/seqan")))

It would be better if I knew a way of copying directory into another 
directory, like an "install-directory".

>
>> +               (copy-file (string-append (assoc-ref inputs "seqan")
>> +                                         "/include/seqan")
>> +                          seqan-include))
>> +             ;; Replace bundled MurmurHash.
>> +             (let ((smhasher "third-party/smhasher/"))
>> +               (delete-file-recursively smhasher)
>> +               (mkdir smhasher)
>> +               (for-each
>> +                (lambda (file)
>> +                  (copy-file
>> +                   (string-append
>> +                    (assoc-ref inputs "murmur-hash") "/include/" file)
>> +                   (string-append smhasher file)))
>> +                (list "MurmurHash3.cpp" "MurmurHash3.h"))
>> +               (rename-file
>> +                (string-append smhasher "MurmurHash3.cpp")
>> +                (string-append smhasher "MurmurHash3.cc")))
>> +             (setenv "CC" "gcc")
> I think this variable setting should be in its own phase.

OK.

ben

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Add khmer.
  2016-06-17  1:23 [PATCH] Add khmer Ben Woodcroft
                   ` (3 preceding siblings ...)
  2016-06-17  8:16 ` [PATCH] " Leo Famulari
@ 2016-06-25 17:29 ` Leo Famulari
  2016-06-26  9:58   ` Ludovic Courtès
  4 siblings, 1 reply; 19+ messages in thread
From: Leo Famulari @ 2016-06-25 17:29 UTC (permalink / raw)
  To: Ben Woodcroft; +Cc: Guix-devel

On Fri, Jun 17, 2016 at 11:23:18AM +1000, Ben Woodcroft wrote:
> This isn't the best name for a piece of software I suggest, but it is
> reasonably well known in the field.
> 
> 'murmur-hash' takes the code from a repository SMHasher, but I'm not
> interested in packaging that. That is OK?

I looked at the Debian package [0]. It uses the bundled copy or
murmur-hash. Since there doesn't really seem to be an independent
upstream distribution of murmur, I think it's fine to do the same.

What do others think?

[0]
https://packages.debian.org/sid/khmer

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] gnu: Add khmer.
  2016-06-17 23:41     ` Ben Woodcroft
@ 2016-06-25 17:30       ` Leo Famulari
  0 siblings, 0 replies; 19+ messages in thread
From: Leo Famulari @ 2016-06-25 17:30 UTC (permalink / raw)
  To: Ben Woodcroft; +Cc: Guix-devel

On Sat, Jun 18, 2016 at 09:41:46AM +1000, Ben Woodcroft wrote:
> It would be better if I knew a way of copying directory into another
> directory, like an "install-directory".

How about copy-recursively, from (guix build utils)?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Add khmer.
  2016-06-25 17:29 ` Leo Famulari
@ 2016-06-26  9:58   ` Ludovic Courtès
  2016-06-26 11:59     ` Ben Woodcroft
  0 siblings, 1 reply; 19+ messages in thread
From: Ludovic Courtès @ 2016-06-26  9:58 UTC (permalink / raw)
  To: Leo Famulari; +Cc: Guix-devel

Leo Famulari <leo@famulari.name> skribis:

> On Fri, Jun 17, 2016 at 11:23:18AM +1000, Ben Woodcroft wrote:
>> This isn't the best name for a piece of software I suggest, but it is
>> reasonably well known in the field.
>> 
>> 'murmur-hash' takes the code from a repository SMHasher, but I'm not
>> interested in packaging that. That is OK?
>
> I looked at the Debian package [0]. It uses the bundled copy or
> murmur-hash. Since there doesn't really seem to be an independent
> upstream distribution of murmur, I think it's fine to do the same.
>
> What do others think?

I agree.  Just leave a comment explaining the situation so we can
revisit it later if the need arises.

Thanks,
Ludo’.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Add khmer.
  2016-06-26  9:58   ` Ludovic Courtès
@ 2016-06-26 11:59     ` Ben Woodcroft
  2016-06-26 17:34       ` Leo Famulari
  0 siblings, 1 reply; 19+ messages in thread
From: Ben Woodcroft @ 2016-06-26 11:59 UTC (permalink / raw)
  To: Ludovic Courtès, Leo Famulari; +Cc: Guix-devel

[-- Attachment #1: Type: text/plain, Size: 1217 bytes --]



On 26/06/16 19:58, Ludovic Courtès wrote:
> Leo Famulari <leo@famulari.name> skribis:
>
>> On Fri, Jun 17, 2016 at 11:23:18AM +1000, Ben Woodcroft wrote:
>>> This isn't the best name for a piece of software I suggest, but it is
>>> reasonably well known in the field.
>>>
>>> 'murmur-hash' takes the code from a repository SMHasher, but I'm not
>>> interested in packaging that. That is OK?
>> I looked at the Debian package [0]. It uses the bundled copy or
>> murmur-hash. Since there doesn't really seem to be an independent
>> upstream distribution of murmur, I think it's fine to do the same.
>>
>> What do others think?
> I agree.  Just leave a comment explaining the situation so we can
> revisit it later if the need arises.

I'm not convinced, I'm afraid. According to the SMHasher README [0]:

 >This is the home for the MurmurHash 
<https://github.com/aappleby/smhasher/tree/master/src> family of hash 
functions along with the SMHasher 
<https://github.com/aappleby/smhasher/tree/master/src> test suite used 
to verify them.

So that would count that as an independent upstream distribution of 
murmurhash, right?

Thanks,
ben

[0] https://github.com/aappleby/smhasher

[-- Attachment #2: Type: text/html, Size: 2038 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Add khmer.
  2016-06-26 11:59     ` Ben Woodcroft
@ 2016-06-26 17:34       ` Leo Famulari
  2016-06-26 22:48         ` Ben Woodcroft
  0 siblings, 1 reply; 19+ messages in thread
From: Leo Famulari @ 2016-06-26 17:34 UTC (permalink / raw)
  To: Ben Woodcroft; +Cc: Guix-devel

On Sun, Jun 26, 2016 at 09:59:51PM +1000, Ben Woodcroft wrote:
> I'm not convinced, I'm afraid. According to the SMHasher README [0]:
> 
> >This is the home for the MurmurHash
> <https://github.com/aappleby/smhasher/tree/master/src> family of hash
> functions along with the SMHasher
> <https://github.com/aappleby/smhasher/tree/master/src> test suite used to
> verify them.
> 
> So that would count that as an independent upstream distribution of
> murmurhash, right?

My understanding is that SMHasher does not create a library, but just
provides raw source code, leaving it up to 3rd party software authors to
integrate it into their applications. Is that right?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Add khmer.
  2016-06-26 17:34       ` Leo Famulari
@ 2016-06-26 22:48         ` Ben Woodcroft
  2016-06-27  5:01           ` Leo Famulari
  0 siblings, 1 reply; 19+ messages in thread
From: Ben Woodcroft @ 2016-06-26 22:48 UTC (permalink / raw)
  To: Leo Famulari; +Cc: Guix-devel



On 27 June 2016 3:34:42 AM AEST, Leo Famulari <leo@famulari.name> wrote:
>On Sun, Jun 26, 2016 at 09:59:51PM +1000, Ben Woodcroft wrote:
>> I'm not convinced, I'm afraid. According to the SMHasher README [0]:
>> 
>> >This is the home for the MurmurHash
>> <https://github.com/aappleby/smhasher/tree/master/src> family of hash
>> functions along with the SMHasher
>> <https://github.com/aappleby/smhasher/tree/master/src> test suite
>used to
>> verify them.
>> 
>> So that would count that as an independent upstream distribution of
>> murmurhash, right?
>
>My understanding is that SMHasher does not create a library, but just
>provides raw source code, leaving it up to 3rd party software authors
>to
>integrate it into their applications. Is that right?

Yes. I don't see how that matters though, it is still bundling which should be avoided if possible right? What am I missing?

Ta, ben

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Add khmer.
  2016-06-26 22:48         ` Ben Woodcroft
@ 2016-06-27  5:01           ` Leo Famulari
  2016-06-27  5:47             ` Ben Woodcroft
  0 siblings, 1 reply; 19+ messages in thread
From: Leo Famulari @ 2016-06-27  5:01 UTC (permalink / raw)
  To: Ben Woodcroft; +Cc: Guix-devel

On Mon, Jun 27, 2016 at 08:48:14AM +1000, Ben Woodcroft wrote:
> On 27 June 2016 3:34:42 AM AEST, Leo Famulari <leo@famulari.name> wrote:
> >On Sun, Jun 26, 2016 at 09:59:51PM +1000, Ben Woodcroft wrote:
> >> I'm not convinced, I'm afraid. According to the SMHasher README [0]:
> >> 
> >> >This is the home for the MurmurHash
> >> <https://github.com/aappleby/smhasher/tree/master/src> family of hash
> >> functions along with the SMHasher
> >> <https://github.com/aappleby/smhasher/tree/master/src> test suite
> >used to
> >> verify them.
> >> 
> >> So that would count that as an independent upstream distribution of
> >> murmurhash, right?
> >
> >My understanding is that SMHasher does not create a library, but just
> >provides raw source code, leaving it up to 3rd party software authors
> >to
> >integrate it into their applications. Is that right?
> 
> Yes. I don't see how that matters though, it is still bundling which should be avoided if possible right? What am I missing?

So far my understanding has been that, in Guix, we don't want to create
Guix-specific distributions of upstream code. SMHasher is the canonical
source for Murmur, but SMHasher doesn't seem suitable for building a
library that other applications can use, right?

Do any other distros have a package for Murmur or SMHasher that we can
use as an example?

Does anyone else have any thoughts about this?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Add khmer.
  2016-06-27  5:01           ` Leo Famulari
@ 2016-06-27  5:47             ` Ben Woodcroft
  0 siblings, 0 replies; 19+ messages in thread
From: Ben Woodcroft @ 2016-06-27  5:47 UTC (permalink / raw)
  To: Leo Famulari; +Cc: Guix-devel



On 27/06/16 15:01, Leo Famulari wrote:
> On Mon, Jun 27, 2016 at 08:48:14AM +1000, Ben Woodcroft wrote:
>> On 27 June 2016 3:34:42 AM AEST, Leo Famulari <leo@famulari.name> wrote:
>>> On Sun, Jun 26, 2016 at 09:59:51PM +1000, Ben Woodcroft wrote:
>>>> I'm not convinced, I'm afraid. According to the SMHasher README [0]:
>>>>
>>>>> This is the home for the MurmurHash
>>>> <https://github.com/aappleby/smhasher/tree/master/src> family of hash
>>>> functions along with the SMHasher
>>>> <https://github.com/aappleby/smhasher/tree/master/src> test suite
>>> used to
>>>> verify them.
>>>>
>>>> So that would count that as an independent upstream distribution of
>>>> murmurhash, right?
>>> My understanding is that SMHasher does not create a library, but just
>>> provides raw source code, leaving it up to 3rd party software authors
>>> to
>>> integrate it into their applications. Is that right?
>> Yes. I don't see how that matters though, it is still bundling which should be avoided if possible right? What am I missing?
> So far my understanding has been that, in Guix, we don't want to create
> Guix-specific distributions of upstream code. SMHasher is the canonical
> source for Murmur, but SMHasher doesn't seem suitable for building a
> library that other applications can use, right?

It is true that copying code into 'include' is probably changing the 
developer's intended distribution model.

> Do any other distros have a package for Murmur or SMHasher that we can
> use as an example?

I could not find any.

> Does anyone else have any thoughts about this?

Let's not worry about this further, I do not have such strong opinions 
and I've been outvoted. So I'll push with Ludo's and your suggested 
changes soon.

Thanks for the review and discussion.
ben

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2016-06-27  5:48 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-06-17  1:23 [PATCH] Add khmer Ben Woodcroft
2016-06-17  1:23 ` [PATCH 1/3] gnu: Add python-screed Ben Woodcroft
2016-06-17  2:07   ` Leo Famulari
2016-06-17 22:57     ` Ben Woodcroft
2016-06-17  1:23 ` [PATCH 2/3] gnu: Add murmur-hash Ben Woodcroft
2016-06-17  1:23 ` [PATCH 3/3] gnu: Add khmer Ben Woodcroft
2016-06-17  7:23   ` Leo Famulari
2016-06-17 23:41     ` Ben Woodcroft
2016-06-25 17:30       ` Leo Famulari
2016-06-17  8:16 ` [PATCH] " Leo Famulari
2016-06-17  9:42   ` Ben Woodcroft
2016-06-17 17:54     ` Leo Famulari
2016-06-25 17:29 ` Leo Famulari
2016-06-26  9:58   ` Ludovic Courtès
2016-06-26 11:59     ` Ben Woodcroft
2016-06-26 17:34       ` Leo Famulari
2016-06-26 22:48         ` Ben Woodcroft
2016-06-27  5:01           ` Leo Famulari
2016-06-27  5:47             ` Ben Woodcroft

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).