unofficial mirror of guix-devel@gnu.org 
From: Marius Bakke <mbakke@fastmail.com>
To: Leo Famulari <leo@famulari.name>
Cc: guix-devel@gnu.org
Subject: Re: [PATCH] gnu: Add mash.
Date: Thu, 08 Sep 2016 23:06:44 +0100
Message-ID: <87eg4urqq3.fsf@ike.i-did-not-set--mail-host-address--so-tickle-me>
In-Reply-To: <20160906210127.GB1089@jasmine>

Leo Famulari <leo@famulari.name> writes:

> On Thu, Sep 01, 2016 at 11:00:39AM +0100, Marius Bakke wrote:
>> I had these in inputs initially and was surprised to see no references.
>> Both seem to be compiled into the final program[0]: when running "mash
>> info" on an invalid file (the provided data/refseq.msh), a generic
>> capnproto exception is thrown (src/capnp/serialize.c++:159).
>
> I wonder, does using native-inputs work when building mash for another
> architecture?

That's interesting: the package indeed fails to build on i686, with this
linker error:

  Sketch.cpp:(.text+0xdf): undefined reference to `memcpy@GLIBC_2.2.5'

I don't understand why. Shouldn't the symbol versions be the same?

Are there any clever linker flags we can throw at it, or is setting
supported-systems acceptable?
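
For comparison, here is a minimal sketch of the two ways the
supported-systems field could be written. It is plain Guile with no Guix
modules; `example-systems' is a hypothetical stand-in for Guix's
%supported-systems, and its contents are an assumption. Only the i686
failure is confirmed above, so whether the other systems would build is
untested.

```scheme
;; Sketch only: plain Guile, no Guix modules required.
;; `example-systems' is a hypothetical stand-in for Guix's
;; %supported-systems; the entries are assumed for illustration.
(define example-systems
  '("x86_64-linux" "i686-linux" "armhf-linux" "mips64el-linux"))

;; Option A: allow-list.  Name only the system known to build.
(define allow-list '("x86_64-linux"))

;; Option B: deny-list.  Drop only the system known to fail and keep
;; the rest of the default list.
(define deny-list (delete "i686-linux" example-systems))

;; Print both candidate values for the supported-systems field.
(display allow-list) (newline)
(display deny-list) (newline)
```

In a package definition the second form would read
(supported-systems (delete "i686-linux" %supported-systems)). The
attached patch uses the allow-list form, which is the more conservative
choice while the other systems remain untested.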

>> That raises another question: should the htslib and capnproto licenses
>> be listed too, since they are part of the binary output?
>
> Good question, I'm not sure. I'd guess "yes", along with a code comment
> explaining what's going on.

I've attached a patch below, with license comments and with
supported-systems restricted to x86_64 (amd64).

>> I'm not a bioinformatician (just a mere sysadmin for such), but have
>> been going through the tutorial and things appear to work fine.
>
> Ah, bioinformatics software... all bets are off ;)

You haven't seen anything yet! This is the nice part of my queue ;)

Thanks!
Marius


[-- Attachment #2: 0001-gnu-Add-mash.patch --]
[-- Type: text/x-patch, Size: 3394 bytes --]

From 9e8102ed2d5bf9334e5311f2ac917aed2f451361 Mon Sep 17 00:00:00 2001
From: Marius Bakke <mbakke@fastmail.com>
Date: Tue, 30 Aug 2016 18:49:21 +0100
Subject: [PATCH] gnu: Add mash.

* gnu/packages/bioinformatics.scm (mash): New variable.
---
 gnu/packages/bioinformatics.scm | 57 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index f34acd1..decca6c 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -76,6 +76,7 @@
   #:use-module (gnu packages python)
   #:use-module (gnu packages readline)
   #:use-module (gnu packages ruby)
+  #:use-module (gnu packages serialization)
   #:use-module (gnu packages statistics)
   #:use-module (gnu packages tbb)
   #:use-module (gnu packages tex)
@@ -3046,6 +3047,62 @@ sequences).")
               "http://mafft.cbrc.jp/alignment/software/license.txt"
               "BSD-3 with different formatting"))))
 
+(define-public mash
+  (package
+    (name "mash")
+    (version "1.1.1")
+    (source (origin
+              (method url-fetch)
+              (uri (string-append
+                    "https://github.com/marbl/mash/archive/v"
+                    version ".tar.gz"))
+              (file-name (string-append name "-" version ".tar.gz"))
+              (sha256
+               (base32
+                "08znbvqq5xknfhmpp3wcj574zvi4p7i8zifi67c9qw9a6ikp42fj"))
+              (modules '((guix build utils)))
+              (snippet
+               ;; Delete bundled kseq.
+               ;; TODO: Also delete bundled murmurhash and open bloom filter.
+               '(delete-file "src/mash/kseq.h"))))
+    (build-system gnu-build-system)
+    (arguments
+     `(#:tests? #f ; No tests.
+       #:configure-flags
+       (list
+        (string-append "--with-capnp=" (assoc-ref %build-inputs "capnproto"))
+        (string-append "--with-gsl=" (assoc-ref %build-inputs "gsl")))
+       #:make-flags (list "CC=gcc")
+       #:phases
+       (modify-phases %standard-phases
+         (add-after 'unpack 'fix-includes
+           (lambda _
+             (substitute* '("src/mash/Sketch.cpp" "src/mash/CommandFind.cpp")
+               (("^#include \"kseq\\.h\"")
+                "#include \"htslib/kseq.h\""))
+             #t))
+         (add-before 'configure 'autoconf
+           (lambda _ (zero? (system* "autoconf")))))))
+    (native-inputs
+     `(("autoconf" ,autoconf)
+       ;; Capnproto and htslib are statically embedded in the final
+       ;; application. Therefore we also list their licenses, below.
+       ("capnproto" ,capnproto)
+       ("htslib" ,htslib)))
+    (inputs
+     `(("gsl" ,gsl)
+       ("zlib" ,zlib)))
+    (supported-systems '("x86_64-linux"))
+    (home-page "https://mash.readthedocs.io")
+    (synopsis "Fast genome and metagenome distance estimation using MinHash")
+    (description "Mash is a fast sequence distance estimator that uses the
+MinHash algorithm and is designed to work with genomes and metagenomes in the
+form of assemblies or reads.")
+    (license (list license:bsd-3          ; Mash
+                   license:expat          ; HTSlib and capnproto
+                   license:public-domain  ; MurmurHash 3
+                   license:cpl1.0))))     ; Open Bloom Filter
+
 (define-public metabat
   (package
     (name "metabat")
-- 
2.9.3


Thread overview: 8+ messages
2016-08-30 17:54 [PATCH] gnu: Add mash Marius Bakke
2016-08-31 19:44 ` Leo Famulari
2016-08-31 20:16   ` Ricardo Wurmus
2016-09-01 10:00     ` Marius Bakke
2016-09-06 21:01       ` Leo Famulari
2016-09-08 22:06         ` Marius Bakke [this message]
2016-09-10 21:20           ` Leo Famulari
2016-09-10 21:42           ` Leo Famulari