* [PATCH] gnu: Add mash.
@ 2016-08-30 17:54 Marius Bakke
2016-08-31 19:44 ` Leo Famulari
0 siblings, 1 reply; 8+ messages in thread
From: Marius Bakke @ 2016-08-30 17:54 UTC (permalink / raw)
To: guix-devel
[-- Attachment #1: 0001-gnu-Add-mash.patch --]
[-- Type: text/x-patch, Size: 3158 bytes --]
From 20974083333c8e94d10423d4a156caa5298d6dcb Mon Sep 17 00:00:00 2001
From: Marius Bakke <mbakke@fastmail.com>
Date: Tue, 30 Aug 2016 18:49:21 +0100
Subject: [PATCH 1/1] gnu: Add mash.
* gnu/packages/bioinformatics.scm (mash): New variable.
---
gnu/packages/bioinformatics.scm | 53 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 53 insertions(+)
diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index ed20b56..9b96d37 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -76,6 +76,7 @@
   #:use-module (gnu packages python)
   #:use-module (gnu packages readline)
   #:use-module (gnu packages ruby)
+  #:use-module (gnu packages serialization)
   #:use-module (gnu packages statistics)
   #:use-module (gnu packages tbb)
   #:use-module (gnu packages tex)
@@ -3046,6 +3047,58 @@ sequences).")
"http://mafft.cbrc.jp/alignment/software/license.txt"
"BSD-3 with different formatting"))))
+(define-public mash
+  (package
+    (name "mash")
+    (version "1.1.1")
+    (source (origin
+              (method url-fetch)
+              (uri (string-append
+                    "https://github.com/marbl/mash/archive/v"
+                    version ".tar.gz"))
+              (file-name (string-append name "-" version ".tar.gz"))
+              (sha256
+               (base32
+                "08znbvqq5xknfhmpp3wcj574zvi4p7i8zifi67c9qw9a6ikp42fj"))
+              (modules '((guix build utils)))
+              (snippet
+               ;; Delete bundled kseq.
+               ;; TODO: Also delete bundled murmurhash and open bloom filter.
+               '(delete-file "src/mash/kseq.h"))))
+    (build-system gnu-build-system)
+    (arguments
+     `(#:tests? #f ; No tests.
+       #:configure-flags
+       (list
+        (string-append "--with-capnp=" (assoc-ref %build-inputs "capnproto"))
+        (string-append "--with-gsl=" (assoc-ref %build-inputs "gsl")))
+       #:make-flags (list "CC=gcc")
+       #:phases
+       (modify-phases %standard-phases
+         (add-after 'unpack 'fix-includes
+           (lambda _
+             (substitute* '("src/mash/Sketch.cpp" "src/mash/CommandFind.cpp")
+               (("^#include \"kseq\\.h\"")
+                "#include \"htslib/kseq.h\""))
+             #t))
+         (add-before 'configure 'autoconf
+           (lambda _ (zero? (system* "autoconf")))))))
+    (native-inputs
+     `(("autoconf" ,autoconf)
+       ("capnproto" ,capnproto)
+       ("htslib" ,htslib)))
+    (inputs
+     `(("gsl" ,gsl)
+       ("zlib" ,zlib)))
+    (home-page "https://mash.readthedocs.io")
+    (synopsis "Fast genome and metagenome distance estimation using MinHash")
+    (description "Mash is a fast sequence distance estimator that uses the
+MinHash algorithm and is designed to work with genomes and metagenomes in the
+form of assemblies or reads.")
+    ;; Mash is distributed under 3-clause BSD, but includes software covered
+    ;; by other licenses.
+    (license (list license:bsd-3 license:public-domain license:cpl1.0))))
+
(define-public metabat
(package
(name "metabat")
--
2.9.3
* Re: [PATCH] gnu: Add mash.
2016-08-30 17:54 [PATCH] gnu: Add mash Marius Bakke
@ 2016-08-31 19:44 ` Leo Famulari
2016-08-31 20:16 ` Ricardo Wurmus
0 siblings, 1 reply; 8+ messages in thread
From: Leo Famulari @ 2016-08-31 19:44 UTC (permalink / raw)
To: Marius Bakke; +Cc: guix-devel
On Tue, Aug 30, 2016 at 06:54:49PM +0100, Marius Bakke wrote:
> * gnu/packages/bioinformatics.scm (mash): New variable.
Thanks!
> +         (add-after 'unpack 'fix-includes
> +           (lambda _
> +             (substitute* '("src/mash/Sketch.cpp" "src/mash/CommandFind.cpp")
> +               (("^#include \"kseq\\.h\"")
> +                "#include \"htslib/kseq.h\""))
> +             #t))
> +         (add-before 'configure 'autoconf
> +           (lambda _ (zero? (system* "autoconf")))))))
> +    (native-inputs
> +     `(("autoconf" ,autoconf)
> +       ("capnproto" ,capnproto)
> +       ("htslib" ,htslib)))
Does it only need to use capnproto and htslib while building? Okay if
so.
* Re: [PATCH] gnu: Add mash.
2016-08-31 19:44 ` Leo Famulari
@ 2016-08-31 20:16 ` Ricardo Wurmus
2016-09-01 10:00 ` Marius Bakke
0 siblings, 1 reply; 8+ messages in thread
From: Ricardo Wurmus @ 2016-08-31 20:16 UTC (permalink / raw)
To: Leo Famulari; +Cc: guix-devel
Leo Famulari <leo@famulari.name> writes:
> On Tue, Aug 30, 2016 at 06:54:49PM +0100, Marius Bakke wrote:
>> * gnu/packages/bioinformatics.scm (mash): New variable.
>
> Thanks!
>
>> +         (add-after 'unpack 'fix-includes
>> +           (lambda _
>> +             (substitute* '("src/mash/Sketch.cpp" "src/mash/CommandFind.cpp")
>> +               (("^#include \"kseq\\.h\"")
>> +                "#include \"htslib/kseq.h\""))
>> +             #t))
>> +         (add-before 'configure 'autoconf
>> +           (lambda _ (zero? (system* "autoconf")))))))
>> +    (native-inputs
>> +     `(("autoconf" ,autoconf)
>> +       ("capnproto" ,capnproto)
>> +       ("htslib" ,htslib)))
>
> Does it only need to use capnproto and htslib while building? Okay if
> so.
Looking at the substitution in “fix-includes” htslib probably should be
a regular input.
~~ Ricardo
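
For illustration, moving htslib to the regular inputs as suggested here might look like the sketch below. It only rearranges the input lists from the posted patch and is not part of it; whether capnproto should move as well is left open.

```scheme
;; Sketch only, not part of the posted patch: htslib listed as a
;; regular input instead of a native input.  The remaining fields of
;; the package definition are assumed to stay unchanged.
(native-inputs
 `(("autoconf" ,autoconf)
   ("capnproto" ,capnproto)))
(inputs
 `(("gsl" ,gsl)
   ("htslib" ,htslib)
   ("zlib" ,zlib)))
```

The package labels ("htslib", "gsl" and so on) are the same ones used in the patch, so the configure flags that look up "capnproto" and "gsl" in %build-inputs would not need to change under this arrangement.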
* Re: [PATCH] gnu: Add mash.
2016-08-31 20:16 ` Ricardo Wurmus
@ 2016-09-01 10:00 ` Marius Bakke
2016-09-06 21:01 ` Leo Famulari
0 siblings, 1 reply; 8+ messages in thread
From: Marius Bakke @ 2016-09-01 10:00 UTC (permalink / raw)
To: Ricardo Wurmus, Leo Famulari; +Cc: guix-devel
Leo Famulari <leo@famulari.name> writes:
>> +         (add-after 'unpack 'fix-includes
>> +           (lambda _
>> +             (substitute* '("src/mash/Sketch.cpp" "src/mash/CommandFind.cpp")
>> +               (("^#include \"kseq\\.h\"")
>> +                "#include \"htslib/kseq.h\""))
>> +             #t))
>> +         (add-before 'configure 'autoconf
>> +           (lambda _ (zero? (system* "autoconf")))))))
>> +    (native-inputs
>> +     `(("autoconf" ,autoconf)
>> +       ("capnproto" ,capnproto)
>> +       ("htslib" ,htslib)))
>
> Does it only need to use capnproto and htslib while building? Okay if
> so.
I had these in inputs initially and was surprised to see no references.
Both seem to be compiled into the final program[0]: when running "mash
info" on an invalid file (the provided data/refseq.msh), a generic
capnproto exception is thrown (src/capnp/serialize.c++:159).
That raises another question: should the htslib and capnproto licenses
be listed too, since they are part of the binary output?
I'm not a bioinformatician (just a mere sysadmin for such), but have
been going through the tutorial and things appear to work fine.
0: https://github.com/marbl/Mash/blob/master/Makefile.in#L38
* Re: [PATCH] gnu: Add mash.
2016-09-01 10:00 ` Marius Bakke
@ 2016-09-06 21:01 ` Leo Famulari
2016-09-08 22:06 ` Marius Bakke
0 siblings, 1 reply; 8+ messages in thread
From: Leo Famulari @ 2016-09-06 21:01 UTC (permalink / raw)
To: Marius Bakke; +Cc: guix-devel
On Thu, Sep 01, 2016 at 11:00:39AM +0100, Marius Bakke wrote:
> I had these in inputs initially and was surprised to see no references.
> Both seem to be compiled into the final program[0]: when running "mash
> info" on an invalid file (the provided data/refseq.msh), a generic
> capnproto exception is thrown (src/capnp/serialize.c++:159).
I wonder, does using native-inputs work when building mash for another
architecture?
> That raises another question: should the htslib and capnproto licenses
> be listed too, since they are part of the binary output?
Good question, I'm not sure. I'd guess "yes", along with a code comment
explaining what's going on.
>
> I'm not a bioinformatician (just a mere sysadmin for such), but have
> been going through the tutorial and things appear to work fine.
Ah, bioinformatics software... all bets are off ;)
* Re: [PATCH] gnu: Add mash.
2016-09-06 21:01 ` Leo Famulari
@ 2016-09-08 22:06 ` Marius Bakke
2016-09-10 21:20 ` Leo Famulari
2016-09-10 21:42 ` Leo Famulari
0 siblings, 2 replies; 8+ messages in thread
From: Marius Bakke @ 2016-09-08 22:06 UTC (permalink / raw)
To: Leo Famulari; +Cc: guix-devel
[-- Attachment #1: Type: text/plain, Size: 1373 bytes --]
Leo Famulari <leo@famulari.name> writes:
> On Thu, Sep 01, 2016 at 11:00:39AM +0100, Marius Bakke wrote:
>> I had these in inputs initially and was surprised to see no references.
>> Both seem to be compiled into the final program[0]: when running "mash
>> info" on an invalid file (the provided data/refseq.msh), a generic
>> capnproto exception is thrown (src/capnp/serialize.c++:159).
>
> I wonder, does using native-inputs work when building mash for another
> architecture?
That's interesting, the package indeed fails to build on i686.
Sketch.cpp:(.text+0xdf): undefined reference to `memcpy@GLIBC_2.2.5'
I don't understand why, the symbol versions should be the same, no?
Are there any clever linker flags we can throw at it, or is setting
supported-systems acceptable?
>> That raises another question: should the htslib and capnproto licenses
>> be listed too, since they are part of the binary output?
>
> Good question, I'm not sure. I'd guess "yes", along with a code comment
> explaining what's going on.
I've attached a patch below, with license comments and amd64 only.
>> I'm not a bioinformatician (just a mere sysadmin for such), but have
>> been going through the tutorial and things appear to work fine.
>
> Ah, bioinformatics software... all bets are off ;)
You haven't seen anything yet! This is the nice part of my queue ;)
Thanks!
Marius
[-- Attachment #2: 0001-gnu-Add-mash.patch --]
[-- Type: text/x-patch, Size: 3394 bytes --]
From 9e8102ed2d5bf9334e5311f2ac917aed2f451361 Mon Sep 17 00:00:00 2001
From: Marius Bakke <mbakke@fastmail.com>
Date: Tue, 30 Aug 2016 18:49:21 +0100
Subject: [PATCH] gnu: Add mash.
* gnu/packages/bioinformatics.scm (mash): New variable.
---
gnu/packages/bioinformatics.scm | 57 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 57 insertions(+)
diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index f34acd1..decca6c 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -76,6 +76,7 @@
   #:use-module (gnu packages python)
   #:use-module (gnu packages readline)
   #:use-module (gnu packages ruby)
+  #:use-module (gnu packages serialization)
   #:use-module (gnu packages statistics)
   #:use-module (gnu packages tbb)
   #:use-module (gnu packages tex)
@@ -3046,6 +3047,62 @@ sequences).")
"http://mafft.cbrc.jp/alignment/software/license.txt"
"BSD-3 with different formatting"))))
+(define-public mash
+  (package
+    (name "mash")
+    (version "1.1.1")
+    (source (origin
+              (method url-fetch)
+              (uri (string-append
+                    "https://github.com/marbl/mash/archive/v"
+                    version ".tar.gz"))
+              (file-name (string-append name "-" version ".tar.gz"))
+              (sha256
+               (base32
+                "08znbvqq5xknfhmpp3wcj574zvi4p7i8zifi67c9qw9a6ikp42fj"))
+              (modules '((guix build utils)))
+              (snippet
+               ;; Delete bundled kseq.
+               ;; TODO: Also delete bundled murmurhash and open bloom filter.
+               '(delete-file "src/mash/kseq.h"))))
+    (build-system gnu-build-system)
+    (arguments
+     `(#:tests? #f ; No tests.
+       #:configure-flags
+       (list
+        (string-append "--with-capnp=" (assoc-ref %build-inputs "capnproto"))
+        (string-append "--with-gsl=" (assoc-ref %build-inputs "gsl")))
+       #:make-flags (list "CC=gcc")
+       #:phases
+       (modify-phases %standard-phases
+         (add-after 'unpack 'fix-includes
+           (lambda _
+             (substitute* '("src/mash/Sketch.cpp" "src/mash/CommandFind.cpp")
+               (("^#include \"kseq\\.h\"")
+                "#include \"htslib/kseq.h\""))
+             #t))
+         (add-before 'configure 'autoconf
+           (lambda _ (zero? (system* "autoconf")))))))
+    (native-inputs
+     `(("autoconf" ,autoconf)
+       ;; Capnproto and htslib are statically embedded in the final
+       ;; application. Therefore we also list their licenses, below.
+       ("capnproto" ,capnproto)
+       ("htslib" ,htslib)))
+    (inputs
+     `(("gsl" ,gsl)
+       ("zlib" ,zlib)))
+    (supported-systems '("x86_64-linux"))
+    (home-page "https://mash.readthedocs.io")
+    (synopsis "Fast genome and metagenome distance estimation using MinHash")
+    (description "Mash is a fast sequence distance estimator that uses the
+MinHash algorithm and is designed to work with genomes and metagenomes in the
+form of assemblies or reads.")
+    (license (list license:bsd-3 ; Mash
+                   license:expat ; HTSlib and capnproto
+                   license:public-domain ; MurmurHash 3
+                   license:cpl1.0)))) ; Open Bloom Filter
+
(define-public metabat
(package
(name "metabat")
--
2.9.3
* Re: [PATCH] gnu: Add mash.
2016-09-08 22:06 ` Marius Bakke
@ 2016-09-10 21:20 ` Leo Famulari
2016-09-10 21:42 ` Leo Famulari
1 sibling, 0 replies; 8+ messages in thread
From: Leo Famulari @ 2016-09-10 21:20 UTC (permalink / raw)
To: Marius Bakke; +Cc: guix-devel
On Thu, Sep 08, 2016 at 11:06:44PM +0100, Marius Bakke wrote:
> Leo Famulari <leo@famulari.name> writes:
>
> > On Thu, Sep 01, 2016 at 11:00:39AM +0100, Marius Bakke wrote:
> >> I had these in inputs initially and was surprised to see no references.
> >> Both seem to be compiled into the final program[0]: when running "mash
> >> info" on an invalid file (the provided data/refseq.msh), a generic
> >> capnproto exception is thrown (src/capnp/serialize.c++:159).
> >
> > I wonder, does using native-inputs work when building mash for another
> > architecture?
>
> That's interesting, the package indeed fails to build on i686.
Do you mean that it fails when you try to build from x86_64 using `guix
build --system=i686-linux`? I wondered if that particular case would
work since the packages that are native-inputs would only be built for
the architecture of the builder, if I understand correctly.
* Re: [PATCH] gnu: Add mash.
2016-09-08 22:06 ` Marius Bakke
2016-09-10 21:20 ` Leo Famulari
@ 2016-09-10 21:42 ` Leo Famulari
1 sibling, 0 replies; 8+ messages in thread
From: Leo Famulari @ 2016-09-10 21:42 UTC (permalink / raw)
To: Marius Bakke; +Cc: guix-devel
On Thu, Sep 08, 2016 at 11:06:44PM +0100, Marius Bakke wrote:
> * gnu/packages/bioinformatics.scm (mash): New variable.
Anyways, pushed as 84be3b9920120e7cc03095baca06d61b7f3fb741. If the
package needs more changes, we will change it :)