* [PATCH] gnu: Add mash.
From: Marius Bakke @ 2016-08-30 17:54 UTC
To: guix-devel

[-- Attachment #1: 0001-gnu-Add-mash.patch --]
[-- Type: text/x-patch, Size: 3158 bytes --]

From 20974083333c8e94d10423d4a156caa5298d6dcb Mon Sep 17 00:00:00 2001
From: Marius Bakke <mbakke@fastmail.com>
Date: Tue, 30 Aug 2016 18:49:21 +0100
Subject: [PATCH 1/1] gnu: Add mash.

* gnu/packages/bioinformatics.scm (mash): New variable.
---
 gnu/packages/bioinformatics.scm | 53 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index ed20b56..9b96d37 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -76,6 +76,7 @@
   #:use-module (gnu packages python)
   #:use-module (gnu packages readline)
   #:use-module (gnu packages ruby)
+  #:use-module (gnu packages serialization)
   #:use-module (gnu packages statistics)
   #:use-module (gnu packages tbb)
   #:use-module (gnu packages tex)
@@ -3046,6 +3047,58 @@ sequences).")
                    "http://mafft.cbrc.jp/alignment/software/license.txt"
                    "BSD-3 with different formatting"))))
 
+(define-public mash
+  (package
+    (name "mash")
+    (version "1.1.1")
+    (source (origin
+              (method url-fetch)
+              (uri (string-append
+                    "https://github.com/marbl/mash/archive/v"
+                    version ".tar.gz"))
+              (file-name (string-append name "-" version ".tar.gz"))
+              (sha256
+               (base32
+                "08znbvqq5xknfhmpp3wcj574zvi4p7i8zifi67c9qw9a6ikp42fj"))
+              (modules '((guix build utils)))
+              (snippet
+               ;; Delete bundled kseq.
+               ;; TODO: Also delete bundled murmurhash and open bloom filter.
+               '(delete-file "src/mash/kseq.h"))))
+    (build-system gnu-build-system)
+    (arguments
+     `(#:tests? #f ; No tests.
+       #:configure-flags
+       (list
+        (string-append "--with-capnp=" (assoc-ref %build-inputs "capnproto"))
+        (string-append "--with-gsl=" (assoc-ref %build-inputs "gsl")))
+       #:make-flags (list "CC=gcc")
+       #:phases
+       (modify-phases %standard-phases
+         (add-after 'unpack 'fix-includes
+           (lambda _
+             (substitute* '("src/mash/Sketch.cpp" "src/mash/CommandFind.cpp")
+               (("^#include \"kseq\\.h\"")
+                "#include \"htslib/kseq.h\""))
+             #t))
+         (add-before 'configure 'autoconf
+           (lambda _ (zero? (system* "autoconf")))))))
+    (native-inputs
+     `(("autoconf" ,autoconf)
+       ("capnproto" ,capnproto)
+       ("htslib" ,htslib)))
+    (inputs
+     `(("gsl" ,gsl)
+       ("zlib" ,zlib)))
+    (home-page "https://mash.readthedocs.io")
+    (synopsis "Fast genome and metagenome distance estimation using MinHash")
+    (description "Mash is a fast sequence distance estimator that uses the
+MinHash algorithm and is designed to work with genomes and metagenomes in the
+form of assemblies or reads.")
+    ;; Mash is distributed under 3-clause BSD, but includes software covered
+    ;; by other licenses.
+    (license (list license:bsd-3 license:public-domain license:cpl1.0))))
+
 (define-public metabat
   (package
     (name "metabat")
-- 
2.9.3
* Re: [PATCH] gnu: Add mash.
From: Leo Famulari @ 2016-08-31 19:44 UTC
To: Marius Bakke; +Cc: guix-devel

On Tue, Aug 30, 2016 at 06:54:49PM +0100, Marius Bakke wrote:
> * gnu/packages/bioinformatics.scm (mash): New variable.

Thanks!

> +         (add-after 'unpack 'fix-includes
> +           (lambda _
> +             (substitute* '("src/mash/Sketch.cpp" "src/mash/CommandFind.cpp")
> +               (("^#include \"kseq\\.h\"")
> +                "#include \"htslib/kseq.h\""))
> +             #t))
> +         (add-before 'configure 'autoconf
> +           (lambda _ (zero? (system* "autoconf")))))))
> +    (native-inputs
> +     `(("autoconf" ,autoconf)
> +       ("capnproto" ,capnproto)
> +       ("htslib" ,htslib)))

Does it only need to use capnproto and htslib while building? Okay if
so.
* Re: [PATCH] gnu: Add mash.
From: Ricardo Wurmus @ 2016-08-31 20:16 UTC
To: Leo Famulari; +Cc: guix-devel

Leo Famulari <leo@famulari.name> writes:

> On Tue, Aug 30, 2016 at 06:54:49PM +0100, Marius Bakke wrote:
>> * gnu/packages/bioinformatics.scm (mash): New variable.
>
> Thanks!
>
>> +         (add-after 'unpack 'fix-includes
>> +           (lambda _
>> +             (substitute* '("src/mash/Sketch.cpp" "src/mash/CommandFind.cpp")
>> +               (("^#include \"kseq\\.h\"")
>> +                "#include \"htslib/kseq.h\""))
>> +             #t))
>> +         (add-before 'configure 'autoconf
>> +           (lambda _ (zero? (system* "autoconf")))))))
>> +    (native-inputs
>> +     `(("autoconf" ,autoconf)
>> +       ("capnproto" ,capnproto)
>> +       ("htslib" ,htslib)))
>
> Does it only need to use capnproto and htslib while building? Okay if
> so.

Looking at the substitution in “fix-includes” htslib probably should be
a regular input.

~~ Ricardo
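
One possible reading of Ricardo's suggestion, sketched as a fragment of the
package definition rather than a tested change: htslib moves from
native-inputs to inputs, while autoconf and capnproto stay where the patch put
them. Whether capnproto should move as well is not settled in this message, so
leaving it in place is an assumption of this sketch.

```scheme
;; Sketch only: two fields of the mash package, not a complete definition.
(native-inputs
 `(("autoconf" ,autoconf)
   ("capnproto" ,capnproto)))   ; left as in the patch (assumption)
(inputs
 `(("gsl" ,gsl)
   ("htslib" ,htslib)           ; provides htslib/kseq.h, used by fix-includes
   ("zlib" ,zlib)))
```

The configure flags in the patch look inputs up by label with assoc-ref, so
the labels "capnproto" and "gsl" are kept unchanged here.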
* Re: [PATCH] gnu: Add mash.
From: Marius Bakke @ 2016-09-01 10:00 UTC
To: Ricardo Wurmus, Leo Famulari; +Cc: guix-devel

Leo Famulari <leo@famulari.name> writes:

>> +         (add-after 'unpack 'fix-includes
>> +           (lambda _
>> +             (substitute* '("src/mash/Sketch.cpp" "src/mash/CommandFind.cpp")
>> +               (("^#include \"kseq\\.h\"")
>> +                "#include \"htslib/kseq.h\""))
>> +             #t))
>> +         (add-before 'configure 'autoconf
>> +           (lambda _ (zero? (system* "autoconf")))))))
>> +    (native-inputs
>> +     `(("autoconf" ,autoconf)
>> +       ("capnproto" ,capnproto)
>> +       ("htslib" ,htslib)))
>
> Does it only need to use capnproto and htslib while building? Okay if
> so.

I had these in inputs initially and was surprised to see no references.
Both seem to be compiled into the final program[0]: when running "mash
info" on an invalid file (the provided data/refseq.msh), a generic
capnproto exception is thrown (src/capnp/serialize.c++:159).

That raises another question: should the htslib and capnproto licenses
be listed too, since they are part of the binary output?

I'm not a bioinformatician (just a mere sysadmin for such), but have
been going through the tutorial and things appear to work fine.

0: https://github.com/marbl/Mash/blob/master/Makefile.in#L38
* Re: [PATCH] gnu: Add mash.
From: Leo Famulari @ 2016-09-06 21:01 UTC
To: Marius Bakke; +Cc: guix-devel

On Thu, Sep 01, 2016 at 11:00:39AM +0100, Marius Bakke wrote:
> I had these in inputs initially and was surprised to see no references.
> Both seems to be compiled into the final program[0]: when running "mash
> info" on an invalid file (the provided data/refseq.msh), a generic
> capnproto exception is thrown (src/capnp/serialize.c++:159).

I wonder, does using native-inputs work when building mash for another
architecture?

> That raises another question: should the htslib and capnproto licenses
> be listed too, since they are part of the binary output?

Good question, I'm not sure. I'd guess "yes", along with a code comment
explaining what's going on.

>
> I'm not a bioinformatician (just a mere sysadmin for such), but have
> been going through the tutorial and things appear to work fine.

Ah, bioinformatics software... all bets are off ;)
* Re: [PATCH] gnu: Add mash.
From: Marius Bakke @ 2016-09-08 22:06 UTC
To: Leo Famulari; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 1373 bytes --]

Leo Famulari <leo@famulari.name> writes:

> On Thu, Sep 01, 2016 at 11:00:39AM +0100, Marius Bakke wrote:
>> I had these in inputs initially and was surprised to see no references.
>> Both seems to be compiled into the final program[0]: when running "mash
>> info" on an invalid file (the provided data/refseq.msh), a generic
>> capnproto exception is thrown (src/capnp/serialize.c++:159).
>
> I wonder, does using native-inputs work when building mash for another
> architecture?

That's interesting, the package indeed fails to build on i686.

Sketch.cpp:(.text+0xdf): undefined reference to `memcpy@GLIBC_2.2.5'

I don't understand why, the symbol versions should be the same, no? Are
there any clever linker flags we can throw at it, or is setting
supported-systems acceptable?

>> That raises another question: should the htslib and capnproto licenses
>> be listed too, since they are part of the binary output?
>
> Good question, I'm not sure. I'd guess "yes", along with a code comment
> explaining what's going on.

I've attached a patch below, with license comments and amd64 only.

>> I'm not a bioinformatician (just a mere sysadmin for such), but have
>> been going through the tutorial and things appear to work fine.
>
> Ah, bioinformatics software... all bets are off ;)

You haven't seen anything yet! This is the nice part of my queue ;)

Thanks!
Marius

[-- Attachment #2: 0001-gnu-Add-mash.patch --]
[-- Type: text/x-patch, Size: 3394 bytes --]

From 9e8102ed2d5bf9334e5311f2ac917aed2f451361 Mon Sep 17 00:00:00 2001
From: Marius Bakke <mbakke@fastmail.com>
Date: Tue, 30 Aug 2016 18:49:21 +0100
Subject: [PATCH] gnu: Add mash.

* gnu/packages/bioinformatics.scm (mash): New variable.
---
 gnu/packages/bioinformatics.scm | 57 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index f34acd1..decca6c 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -76,6 +76,7 @@
   #:use-module (gnu packages python)
   #:use-module (gnu packages readline)
   #:use-module (gnu packages ruby)
+  #:use-module (gnu packages serialization)
   #:use-module (gnu packages statistics)
   #:use-module (gnu packages tbb)
   #:use-module (gnu packages tex)
@@ -3046,6 +3047,62 @@ sequences).")
                    "http://mafft.cbrc.jp/alignment/software/license.txt"
                    "BSD-3 with different formatting"))))
 
+(define-public mash
+  (package
+    (name "mash")
+    (version "1.1.1")
+    (source (origin
+              (method url-fetch)
+              (uri (string-append
+                    "https://github.com/marbl/mash/archive/v"
+                    version ".tar.gz"))
+              (file-name (string-append name "-" version ".tar.gz"))
+              (sha256
+               (base32
+                "08znbvqq5xknfhmpp3wcj574zvi4p7i8zifi67c9qw9a6ikp42fj"))
+              (modules '((guix build utils)))
+              (snippet
+               ;; Delete bundled kseq.
+               ;; TODO: Also delete bundled murmurhash and open bloom filter.
+               '(delete-file "src/mash/kseq.h"))))
+    (build-system gnu-build-system)
+    (arguments
+     `(#:tests? #f ; No tests.
+       #:configure-flags
+       (list
+        (string-append "--with-capnp=" (assoc-ref %build-inputs "capnproto"))
+        (string-append "--with-gsl=" (assoc-ref %build-inputs "gsl")))
+       #:make-flags (list "CC=gcc")
+       #:phases
+       (modify-phases %standard-phases
+         (add-after 'unpack 'fix-includes
+           (lambda _
+             (substitute* '("src/mash/Sketch.cpp" "src/mash/CommandFind.cpp")
+               (("^#include \"kseq\\.h\"")
+                "#include \"htslib/kseq.h\""))
+             #t))
+         (add-before 'configure 'autoconf
+           (lambda _ (zero? (system* "autoconf")))))))
+    (native-inputs
+     `(("autoconf" ,autoconf)
+       ;; Capnproto and htslib are statically embedded in the final
+       ;; application. Therefore we also list their licenses, below.
+       ("capnproto" ,capnproto)
+       ("htslib" ,htslib)))
+    (inputs
+     `(("gsl" ,gsl)
+       ("zlib" ,zlib)))
+    (supported-systems '("x86_64-linux"))
+    (home-page "https://mash.readthedocs.io")
+    (synopsis "Fast genome and metagenome distance estimation using MinHash")
+    (description "Mash is a fast sequence distance estimator that uses the
+MinHash algorithm and is designed to work with genomes and metagenomes in the
+form of assemblies or reads.")
+    (license (list license:bsd-3          ; Mash
+                   license:expat          ; HTSlib and capnproto
+                   license:public-domain  ; MurmurHash 3
+                   license:cpl1.0))))     ; Open Bloom Filter
+
 (define-public metabat
   (package
     (name "metabat")
-- 
2.9.3
* Re: [PATCH] gnu: Add mash.
From: Leo Famulari @ 2016-09-10 21:20 UTC
To: Marius Bakke; +Cc: guix-devel

On Thu, Sep 08, 2016 at 11:06:44PM +0100, Marius Bakke wrote:
> Leo Famulari <leo@famulari.name> writes:
>
> > On Thu, Sep 01, 2016 at 11:00:39AM +0100, Marius Bakke wrote:
> >> I had these in inputs initially and was surprised to see no references.
> >> Both seems to be compiled into the final program[0]: when running "mash
> >> info" on an invalid file (the provided data/refseq.msh), a generic
> >> capnproto exception is thrown (src/capnp/serialize.c++:159).
> >
> > I wonder, does using native-inputs work when building mash for another
> > architecture?
>
> That's interesting, the package indeed fails to build on i686.

Do you mean that it fails when you try to build from x86_64 using `guix
build --system=i686-linux`?

I wondered if that particular case would work since the packages that
are native-inputs would only be built for the architecture of the
builder, if I understand correctly.
* Re: [PATCH] gnu: Add mash.
From: Leo Famulari @ 2016-09-10 21:42 UTC
To: Marius Bakke; +Cc: guix-devel

On Thu, Sep 08, 2016 at 11:06:44PM +0100, Marius Bakke wrote:
> * gnu/packages/bioinformatics.scm (mash): New variable.

Anyways, pushed as 84be3b9920120e7cc03095baca06d61b7f3fb741.

If the package needs more changes, we will change it :)