unofficial mirror of guix-devel@gnu.org 
* [PATCH] Add Jellyfish.
@ 2015-12-18 16:42 Ricardo Wurmus
  2015-12-18 18:19 ` Eric Bavier
  0 siblings, 1 reply; 7+ messages in thread
From: Ricardo Wurmus @ 2015-12-18 16:42 UTC (permalink / raw)
  To: Guix-devel

[-- Attachment #1: 0001-gnu-Add-Jellyfish.patch --]
[-- Type: text/x-patch, Size: 2900 bytes --]

From 9fcbc3e10773c7ee73b232c8e16e20a807318bbc Mon Sep 17 00:00:00 2001
From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
Date: Fri, 18 Dec 2015 17:40:02 +0100
Subject: [PATCH] gnu: Add Jellyfish.

* gnu/packages/bioinformatics.scm (jellyfish): New variable.
---
 gnu/packages/bioinformatics.scm | 41 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 4c350ff..430c568 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -56,6 +56,7 @@
   #:use-module (gnu packages statistics)
   #:use-module (gnu packages tbb)
   #:use-module (gnu packages textutils)
+  #:use-module (gnu packages time)
   #:use-module (gnu packages tls)
   #:use-module (gnu packages vim)
   #:use-module (gnu packages web)
@@ -1756,6 +1757,46 @@ to measure the reproducibility of findings identified from replicate
 experiments and provide highly stable thresholds based on reproducibility.")
     (license license:gpl3+)))
 
+(define-public jellyfish
+  (package
+    (name "jellyfish")
+    (version "2.2.4")
+    (source (origin
+              (method url-fetch)
+              (uri (string-append "https://github.com/gmarcais/Jellyfish/"
+                                  "releases/download/v" version
+                                  "/jellyfish-" version ".tar.gz"))
+              (sha256
+               (base32
+                "0a6xnynqy2ibfbfz86b9g2m2dgm7f1469pmymkpam333gi3p26nk"))))
+    (build-system gnu-build-system)
+    (arguments
+     `(#:phases
+       (modify-phases %standard-phases
+         (add-before 'check 'set-SHELL-variable
+           (lambda _
+             ;; generator_manager.hpp either uses /bin/sh or $SHELL
+             ;; to run tests.
+             (setenv "SHELL" (which "bash"))
+             #t)))))
+    (native-inputs
+     `(("bc" ,bc)
+       ("time" ,time)
+       ("gunzip" ,gzip)))
+    (synopsis "Tool for fast counting of k-mers in DNA")
+    (description
+     "Jellyfish is a tool for fast, memory-efficient counting of k-mers in
+DNA.  A k-mer is a substring of length k, and counting the occurrences of all
+such substrings is a central step in many analyses of DNA sequence.  Jellyfish
+is a command-line program that reads FASTA and multi-FASTA files containing
+DNA sequences.  It outputs its k-mer counts in a binary format, which can be
+translated into a human-readable text format using the @code{jellyfish dump}
+command, or queried for specific k-mers with @code{jellyfish query}.")
+    (home-page "http://www.genome.umd.edu/jellyfish.html")
+    ;; The combined work is published under the GPLv3 or later.  Individual
+    ;; files such as lib/jsoncpp.cpp are released under the Expat license.
+    (license (list license:gpl3+ license:expat))))
+
 (define-public macs
   (package
     (name "macs")
-- 
2.1.0


* Re: [PATCH] Add Jellyfish.
  2015-12-18 16:42 [PATCH] Add Jellyfish Ricardo Wurmus
@ 2015-12-18 18:19 ` Eric Bavier
  2015-12-18 23:25   ` Ben Woodcroft
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Bavier @ 2015-12-18 18:19 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: Guix-devel, guix-devel-bounces+ericbavier=openmailbox.org

On 2015-12-18 17:42, Ricardo Wurmus wrote:
> * gnu/packages/bioinformatics.scm (jellyfish): New variable.
[...]
> +    (native-inputs
> +     `(("bc" ,bc)
> +       ("time" ,time)
> +       ("gunzip" ,gzip)))

gzip is an implicit input of gnu-build-system, so could it be left out?

Otherwise LGTM.
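
For illustration, the field with this suggestion applied might look like the fragment below. It is only a sketch and assumes that the gzip provided implicitly by gnu-build-system is enough for the test suite to find gunzip.

```scheme
;; Fragment of the jellyfish package definition, not a standalone file.
;; Assumption: gnu-build-system's implicit inputs already provide
;; gzip (and thus gunzip), so the explicit "gunzip" entry is dropped.
(native-inputs
 `(("bc" ,bc)
   ("time" ,time)))
```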

-- 
`~Eric


* Re: [PATCH] Add Jellyfish.
  2015-12-18 18:19 ` Eric Bavier
@ 2015-12-18 23:25   ` Ben Woodcroft
  2015-12-20 22:43     ` Ludovic Courtès
  0 siblings, 1 reply; 7+ messages in thread
From: Ben Woodcroft @ 2015-12-18 23:25 UTC (permalink / raw)
  To: Eric Bavier, Ricardo Wurmus
  Cc: Guix-devel, guix-devel-bounces+ericbavier=openmailbox.org

Hi Ricardo,

On 19/12/15 04:19, Eric Bavier wrote:
> On 2015-12-18 17:42, Ricardo Wurmus wrote:
>> * gnu/packages/bioinformatics.scm (jellyfish): New variable.
> [...]
>> +    (native-inputs
>> +     `(("bc" ,bc)
>> +       ("time" ,time)
>> +       ("gunzip" ,gzip)))
>
> gzip is an implicit input of gnu-build-system, so could it be left out?

My testing confirms this.

But would it be possible to include the scripting language bindings, 
something along these lines?

+    (arguments
+     `(#:configure-flags '("--enable-ruby-binding"
+                           "--enable-python-binding"
+                           "--enable-perl-binding")
+       #:phases
+       (modify-phases %standard-phases
+         (add-before 'check 'set-SHELL-variable
+           (lambda _
+             ;; generator_manager.hpp either uses /bin/sh or $SHELL
+             ;; to run tests.
+             (setenv "SHELL" (which "bash"))
+             #t)))))
+    (native-inputs
+     `(("bc" ,bc)
+       ("time" ,time)
+       ("ruby" ,ruby)
+       ("python" ,python-2)
+       ("perl" ,perl)))

Currently the perl tests fail for reasons I don't have time to 
investigate, and perhaps some search paths need to be exported too.

I did find the time to confirm that the most important part of this 
program works though:

sh-4.3# jellyfish jf
                    .......
           ..........      .....
        ....                   ....
       ..     /-+       +---\     ...
       .     /--|       +----\      ...
      ..                              ...
      .                                 .
      ..      +----------------+         .
       .      |. AAGATGGAGCGC .|         ..
       .      |---.        .--/           .
      ..          \--------/     .        .
      .     .            ..     ..        .
      .    ... .....   .....    ..        ..
      .   .. . .   .  ..   .   ....        .
      .  ..  . ..   . .    ..  .  .         .
      . ..   .  .   ...     . ..  ..        .
     ....    . ..   ..      ...    ..       .
    .. .     ...     .      ..      ..      .
    . ..      .      .       .       ...    ..
    ...       .      .      ..         ...   .
    .         ..     .      ..           .....
   ____  ____  ._    __   _  _  ____  ____  ___  _   _
  (_  _)( ___)(  )  (  ) ( \/ )( ___)(_  _)/ __)( )_( )
.-_)(   )__)  )(__  )(__ \  /  )__)  _)(_ \__ \ ) _ (
\____) (____)(____)(____)(__) (__)  (____)(___/(_) (_)


* Re: [PATCH] Add Jellyfish.
  2015-12-18 23:25   ` Ben Woodcroft
@ 2015-12-20 22:43     ` Ludovic Courtès
  2015-12-30 15:58       ` Ricardo Wurmus
  0 siblings, 1 reply; 7+ messages in thread
From: Ludovic Courtès @ 2015-12-20 22:43 UTC (permalink / raw)
  To: Ben Woodcroft; +Cc: Guix-devel

Ben Woodcroft <b.woodcroft@uq.edu.au> skribis:

> But would it be possible to include the scripting language bindings,
> something along these lines?
>
> +    (arguments
> +     `(#:configure-flags '("--enable-ruby-binding"
> +                           "--enable-python-binding"
> +                           "--enable-perl-binding")

There’s the usual space/popularity tradeoff to take into account: adding
them all makes the package’s closure much larger, so it’s important to
add only the useful bindings by default.

Ideally, the .so for these bindings could be moved to separate outputs
(like we did for the “tk” output of Python), but it’s not always easy to
do.

> I did find the time to confirm that the most important part of this
> program works though:

Nice.  :-)

Thanks,
Ludo’.


* Re: [PATCH] Add Jellyfish.
  2015-12-20 22:43     ` Ludovic Courtès
@ 2015-12-30 15:58       ` Ricardo Wurmus
  2016-01-01 22:15         ` Ludovic Courtès
  2016-01-06 11:13         ` Ricardo Wurmus
  0 siblings, 2 replies; 7+ messages in thread
From: Ricardo Wurmus @ 2015-12-30 15:58 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Guix-devel


Ludovic Courtès <ludo@gnu.org> writes:

> Ben Woodcroft <b.woodcroft@uq.edu.au> skribis:
>
>> But would it be possible to include the scripting language bindings,
>> something along these lines?
>>
>> +    (arguments
>> +     `(#:configure-flags '("--enable-ruby-binding"
>> +                           "--enable-python-binding"
>> +                           "--enable-perl-binding")
>
> There’s the usual space/popularity tradeoff to take into account: adding
> them all makes the package’s closure much larger, so it’s important to
> add only the useful bindings by default.
>
> Ideally, the .so for these bindings could be moved to separate outputs
> (like we did for the “tk” output of Python), but it’s not always easy to
> do.

In this case it seems to be very easy to separate the bindings into
different outputs as the flags take an optional path.
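
A rough sketch of how that separation could look, leaving Perl aside. It assumes that each flag accepts "=DIR" as the installation directory for its binding, which is how "the flags take an optional path" is read here. The output names "ruby" and "python" are chosen for illustration.

```scheme
;; Fragment of the jellyfish package definition, not a standalone file.
(outputs '("out"        ;library and command-line tool
           "ruby"       ;Ruby bindings
           "python"))   ;Python bindings
(arguments
 `(#:configure-flags
   ;; Assumption: each flag takes "=DIR" as the place to install
   ;; the corresponding binding.
   (list (string-append "--enable-ruby-binding="
                        (assoc-ref %outputs "ruby"))
         (string-append "--enable-python-binding="
                        (assoc-ref %outputs "python")))
   #:phases
   (modify-phases %standard-phases
     (add-before 'check 'set-SHELL-variable
       (lambda _
         ;; generator_manager.hpp either uses /bin/sh or $SHELL
         ;; to run tests.
         (setenv "SHELL" (which "bash"))
         #t)))))
(native-inputs
 `(("bc" ,bc)
   ("time" ,time)
   ("ruby" ,ruby)
   ("python" ,python-2)))
```

With this layout the bindings would be installed under their own store items, so the plain "out" output should not need to refer to Ruby or Python.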

However, the test for the Perl bindings does not pass:

  /gnu/store/czs63sm4l0s4a56ab38dqvkx19yzylbq-perl-5.16.1/bin/perl: symbol lookup error: /tmp/nix-build-jellyfish-2.2.4.drv-0/jellyfish-2.2.4/.libs/libjellyfish-2.0.so.2: undefined symbol: pthread_create
  FAIL tests/swig_perl.sh (exit status: 127)

Maybe the library needs another linker flag?  I’ll play with this later
and see if I can make it work.  If not, I’ll leave out the Perl bindings
(and the “perl” output) for now.

It’s a bit unfortunate that this library would gain so many transitive
inputs just for these bindings (all of these three languages require a
lot of inputs).  It would be nice if we could somehow mark certain
inputs to be used only for certain outputs, but that’s probably a silly
wish.

~~ Ricardo


* Re: [PATCH] Add Jellyfish.
  2015-12-30 15:58       ` Ricardo Wurmus
@ 2016-01-01 22:15         ` Ludovic Courtès
  2016-01-06 11:13         ` Ricardo Wurmus
  1 sibling, 0 replies; 7+ messages in thread
From: Ludovic Courtès @ 2016-01-01 22:15 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: Guix-devel

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Ben Woodcroft <b.woodcroft@uq.edu.au> skribis:
>>
>>> But would it be possible to include the scripting language bindings,
>>> something along these lines?
>>>
>>> +    (arguments
>>> +     `(#:configure-flags '("--enable-ruby-binding"
>>> +                           "--enable-python-binding"
>>> +                           "--enable-perl-binding")
>>
>> There’s the usual space/popularity tradeoff to take into account: adding
>> them all makes the package’s closure much larger, so it’s important to
>> add only the useful bindings by default.
>>
>> Ideally, the .so for these bindings could be moved to separate outputs
>> (like we did for the “tk” output of Python), but it’s not always easy to
>> do.
>
> In this case it seems to be very easy to separate the bindings into
> different outputs as the flags take an optional path.

Great.

> However, the test for the Perl bindings does not pass:
>
>   /gnu/store/czs63sm4l0s4a56ab38dqvkx19yzylbq-perl-5.16.1/bin/perl: symbol lookup error: /tmp/nix-build-jellyfish-2.2.4.drv-0/jellyfish-2.2.4/.libs/libjellyfish-2.0.so.2: undefined symbol: pthread_create
>   FAIL tests/swig_perl.sh (exit status: 127)
>
> Maybe the library needs another linker flag?  I’ll play with this later
> and see if I can make it work.  If not, I’ll leave out the Perl bindings
> (and the “perl” output) for now.

Looks like libjellyfish-2.0.so lacks “-pthread” in its LDFLAGS.  The
problem only surfaces when Perl dlopens the library, it seems.
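
One way to try this, as an untested sketch: pass the flag to the linker through configure.  Whether this is enough to make tests/swig_perl.sh pass has not been verified.

```scheme
;; Untested fragment of the jellyfish package definition.
;; Assumption: adding -pthread to LDFLAGS at configure time links
;; libjellyfish against libpthread, so that pthread_create resolves
;; when Perl loads the library.
(arguments
 `(#:configure-flags '("LDFLAGS=-pthread")
   #:phases
   (modify-phases %standard-phases
     (add-before 'check 'set-SHELL-variable
       (lambda _
         ;; generator_manager.hpp either uses /bin/sh or $SHELL
         ;; to run tests.
         (setenv "SHELL" (which "bash"))
         #t)))))
```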

> It’s a bit unfortunate that this library would gain so many transitive
> inputs just for these bindings (all of these three languages require a
> lot of inputs).  It would be nice if we could somehow mark certain
> inputs to be used only for certain outputs, but that’s probably a silly
> wish.

At least someone using substitutes won’t need to download all these
prerequisites if bindings are moved to individual outputs.

Thanks,
Ludo’.


* Re: [PATCH] Add Jellyfish.
  2015-12-30 15:58       ` Ricardo Wurmus
  2016-01-01 22:15         ` Ludovic Courtès
@ 2016-01-06 11:13         ` Ricardo Wurmus
  1 sibling, 0 replies; 7+ messages in thread
From: Ricardo Wurmus @ 2016-01-06 11:13 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Guix-devel


Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Ben Woodcroft <b.woodcroft@uq.edu.au> skribis:
>>
>>> But would it be possible to include the scripting language bindings,
>>> something along these lines?
>>>
>>> +    (arguments
>>> +     `(#:configure-flags '("--enable-ruby-binding"
>>> +                           "--enable-python-binding"
>>> +                           "--enable-perl-binding")
>>
>> There’s the usual space/popularity tradeoff to take into account: adding
>> them all makes the package’s closure much larger, so it’s important to
>> add only the useful bindings by default.
>>
>> Ideally, the .so for these bindings could be moved to separate outputs
>> (like we did for the “tk” output of Python), but it’s not always easy to
>> do.
>
> In this case it seems to be very easy to separate the bindings into
> different outputs as the flags take an optional path.
>
> However, the test for the Perl bindings does not pass:
>
>   /gnu/store/czs63sm4l0s4a56ab38dqvkx19yzylbq-perl-5.16.1/bin/perl: symbol lookup error: /tmp/nix-build-jellyfish-2.2.4.drv-0/jellyfish-2.2.4/.libs/libjellyfish-2.0.so.2: undefined symbol: pthread_create
>   FAIL tests/swig_perl.sh (exit status: 127)
>
> Maybe the library needs another linker flag?  I’ll play with this later
> and see if I can make it work.  If not, I’ll leave out the Perl bindings
> (and the “perl” output) for now.

I pushed a version with Ruby and Python bindings placed in different
outputs.  “guix size” confirmed that there are no references to Ruby or
Python in the plain output.

I did not add the Perl bindings.

Thanks for the suggestions!

~~ Ricardo


