From: Mark H Weaver <mhw@netris.org>
To: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
Cc: guix-devel <guix-devel@gnu.org>
Subject: Re: [PATCH] Add Blast+.
Date: Tue, 16 Jun 2015 17:37:09 -0400
Message-ID: <87zj3ze3ne.fsf@netris.org>
In-Reply-To: <idjlhfjpw5i.fsf@bimsb-sys02.mdc-berlin.net> (Ricardo Wurmus's message of "Tue, 16 Jun 2015 16:26:01 +0200")

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> From 81cbb9bfa523d56c68d5f9f4feed3676edb5a414 Mon Sep 17 00:00:00 2001
> From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
> Date: Tue, 16 Jun 2015 16:24:24 +0200
> Subject: [PATCH] gnu: Add Blast+.
>
> * gnu/packages/bioinformatics.scm (blast+): New variable.
> ---
>  gnu/packages/bioinformatics.scm | 156 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 156 insertions(+)
>
> diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
> index ac4c50d..4a55040 100644
> --- a/gnu/packages/bioinformatics.scm
> +++ b/gnu/packages/bioinformatics.scm
> @@ -31,6 +31,7 @@
>    #:use-module (gnu packages base)
>    #:use-module (gnu packages boost)
>    #:use-module (gnu packages compression)
> +  #:use-module (gnu packages cpio)
>    #:use-module (gnu packages file)
>    #:use-module (gnu packages java)
>    #:use-module (gnu packages linux)
> @@ -294,6 +295,161 @@ into separate processes; and more.")
>      (inputs
>       `(("python2-numpy" ,python2-numpy)))))
>  
> +(define-public blast+
> +  (package
> +    (name "blast+")
> +    (version "2.2.30")
> +    (source (origin
> +              (method url-fetch)
> +              (uri (string-append
> +                    "ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/"
> +                    version "/ncbi-blast-" version "+-src.tar.gz"))
> +              (sha256
> +               (base32
> +                "0h0fj5cpx6zpfwixgx5f5xbr4rn3cnai0x3j7grrg50vr18jvxr6"))))
> +    (build-system gnu-build-system)
> +    (arguments
> +     `(;; There are three(!) tests for this massive library, and all fail with
> +       ;; "unparsable timing stats".
> +       ;; ERR [127] --  [util/regexp] test_pcre.sh     (unparsable timing stats)
> +       ;; ERR [127] --  [serial/datatool] datatool.sh     (unparsable timing stats)
> +       ;; ERR [127] --  [serial/datatool] datatool_xml.sh     (unparsable timing stats)
> +       #:tests? #f

Just a guess, but maybe this is because you replaced "/bin/date" with
"echo -n 0".  How about replacing it with "date -d @0" instead?

It would be great to get the tests working, even if we have to disable
some of them.  Otherwise we have no way of knowing that we're not
distributing broken garbage :)

> +       #:out-of-source? #t
> +       #:parallel-build? #f ; not supported
> +       #:phases
> +       (modify-phases %standard-phases
> +         (add-before
> +          'configure 'set-HOME
> +          ;; $HOME needs to be set at some point during the configure phase
> +          (lambda _ (setenv "HOME" "/tmp") #t))
> +         (add-after
> +          'unpack 'enter-dir
> +          (lambda _ (chdir "c++") #t))
> +         (add-after
> +          'enter-dir 'fix-build-system
> +          (lambda _
> +            ;; Proceed even though the weird build system says that generated
> +            ;; files are out of date
> +            (setenv "NCBICXX_RECONF_POLICY" "warn")
> +
> +            ;; Remove bundled bzip2 and zlib
> +            (delete-file-recursively "src/util/compress/bzip2")
> +            (delete-file-recursively "src/util/compress/zlib")
> +            (substitute* "src/util/compress/Makefile.in"
> +              (("bzip2 zlib api") "api"))
> +
> +            ;; Remove useless msbuild directory
> +            (delete-file-recursively "src/build-system/project_tree_builder/msbuild")
> +
> +            ;; Some of the files we're patching are
> +            ;; ISO-8859-1-encoded, so choose it as the default
> +            ;; encoding so the byte encoding is preserved.
> +            (with-fluids ((%default-port-encoding #f))
> +              (substitute* (find-files "src/build-system" "config.*")

"^config"

> +                (("LN_S=/bin/\\$LN_S") (string-append "LN_S=" (which "ln")))
> +                (("/bin/sh") (which "bash"))

(which "sh") might be better.  Bash behaves differently when it's
invoked as 'sh'.

> +                (("^PATH=.*") "")))
> +
> +            ;; fix static and generated shebangs
> +            (substitute* (find-files "scripts/common/check" "\\.sh")

"\\.sh$"

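To illustrate what the anchors in "^config" and "\\.sh$" change, here
is a small sketch in plain Guile.  It assumes that 'find-files' applies
its regexp to each file's base name; the 'matching' helper and the
second list of file names are made up for the example, while the first
list uses names that appear elsewhere in this patch.

```scheme
(use-modules (ice-9 regex))

;; Keep the NAMES in which PATTERN matches somewhere.  Without "^" or
;; "$", a regexp may match in the middle of a name.
(define (matching pattern names)
  (filter (lambda (name) (string-match pattern name)) names))

;; File names taken from the patch (all under src/build-system).
(define build-system-files
  '("configure" "configure.ac" "Makefile.configurables.real"))

(matching "config.*" build-system-files)
;; => ("configure" "configure.ac" "Makefile.configurables.real")
(matching "^config" build-system-files)
;; => ("configure" "configure.ac")

;; Hypothetical file names, only to show the effect of "$".
(define check-files
  '("check_make_unix.sh" "check.sh.in" "notes.shar"))

(matching "\\.sh" check-files)
;; => ("check_make_unix.sh" "check.sh.in" "notes.shar")
(matching "\\.sh$" check-files)
;; => ("check_make_unix.sh")
```

So the unanchored "config.*" would also pick up
Makefile.configurables.real, which is probably not what was intended.
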
> +              (("/bin/sh") (which "bash")))

(which "sh")

> +
> +            ;; rewrite "/var/tmp" in check script
> +            (substitute* "scripts/common/check/check_make_unix.sh"
> +              (("/var/tmp") (string-append (getcwd) "/build/build")))

Or maybe just "/tmp"?

> +
> +            ;; fix path to "echo"
> +            (substitute* '("src/build-system/Makefile.rules_with_autodep.in"
> +                           "src/build-system/Makefile.meta.gmake=no"
> +                           "src/build-system/Makefile.meta_r"
> +                           "src/build-system/Makefile.requirements")
> +              (("/bin/echo") (which "echo")))
> +
> +            ;; fix path to "basename"
> +            (substitute* '("src/build-system/Makefile.in.top")
> +              (("/usr/bin/basename") (which "basename")))
> +
> +            ;; fix path to "mv"
> +            (substitute* '("src/build-system/Makefile.rules_with_autodep.in"
> +                           "src/build-system/Makefile.meta_p")
> +              (("/bin/mv") (which "mv")))
> +
> +            ;; fix path to "rm"
> +            (substitute* '("src/build-system/Makefile.mk.in"
> +                           "src/build-system/Makefile.meta.in"
> +                           "scripts/common/impl/run_with_lock.sh")
> +              (("/bin/rm") (which "rm")))
> +
> +            ;; fix path to "cp"
> +            (substitute* '("src/build-system/Makefile.configurables.real"
> +                           "src/build-system/Makefile.mk.in"
> +                           "src/build-system/configure"
> +                           "src/build-system/configure.ac"
> +                           "scripts/common/impl/if_diff.sh")
> +              (("/bin/cp") (which "cp")))
> +
> +            ;; fix path to "mkdir"
> +            (substitute* '("src/build-system/Makefile.mk.in"
> +                           "src/build-system/Makefile.meta.in")
> +              (("/bin/mkdir") (which "mkdir")))
> +
> +            ;; fix path to "dirname"
> +            (substitute* '("src/build-system/Makefile.configurables.real"
> +                           "src/build-system/Makefile.meta_p")
> +              (("/usr/bin/dirname") (which "dirname")))
> +
> +            ;; make call to "date" deterministic
> +            (substitute* "src/build-system/Makefile.meta_l"
> +              (("/bin/date") "echo -n 0"))

All of these plus the ones for 'sh' could be combined into something
like this: (untested)

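  ;; Return the replacement text for CMD, or #f if none is known.
  (define (which* cmd)
    (cond ((string=? cmd "date")
           ;; make call to "date" deterministic
           "date -d @0")
          ((which cmd)
           => identity)
          (else
           (format (current-error-port)
                   "WARNING: Unable to find absolute path for ~s~%"
                   cmd)
           #f)))

  ;; Rewrite every "/bin/CMD" or "/usr/bin/CMD"; keep the original text
  ;; when 'which*' returns #f.
  (substitute* <file-list>
    (("(/usr/bin/|/bin/)([a-z][-_.a-z]*)" all dir cmd)
     (or (which* cmd) all)))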

The definition must be placed at the beginning of a <body>, i.e. before
any non-definition expressions within a 'lambda', 'let', or similar
form.  In this case it would go just inside the 'lambda' for
'fix-build-system'.

I did something similar in the 'wicd' package.
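
To see what that regexp does to individual lines, here is a sketch that
can be run in plain Guile, outside of a build.  It is only an
approximation: 'regexp-substitute/global' stands in for 'substitute*',
and a made-up lookup table ('known-commands', with invented paths)
stands in for 'which'.  'rewrite-line' is likewise a name invented for
the example.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical stand-in for the commands 'which' would find in $PATH.
(define known-commands
  '(("echo"     . "/hypothetical/bin/echo")
    ("basename" . "/hypothetical/bin/basename")))

;; Same shape as the 'which*' above, minus the warning message.
(define (which* cmd)
  (cond ((string=? cmd "date") "date -d @0")
        ((assoc-ref known-commands cmd) => identity)
        (else #f)))

;; Replace each "/bin/CMD" or "/usr/bin/CMD" in LINE; when 'which*'
;; returns #f, put the matched text back unchanged.
(define (rewrite-line line)
  (regexp-substitute/global
   #f "(/usr/bin/|/bin/)([a-z][-_.a-z]*)" line
   'pre
   (lambda (m)
     (or (which* (match:substring m 2))
         (match:substring m 0)))
   'post))

(rewrite-line "/usr/bin/basename foo")
;; => "/hypothetical/bin/basename foo"
(rewrite-line "/bin/date")
;; => "date -d @0"
(rewrite-line "/bin/unknown-tool x")
;; => "/bin/unknown-tool x"
```
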

> +
> +            ;; do not reset PATH
> +            (substitute* (find-files "scripts/common/impl/" "\\.sh")

"\\.sh$"

> +              (("^ *PATH=.*") "")
> +              (("action=/bin/") "action=")
> +              (("export PATH") "echo -n 0"))

Why "echo -n 0" here?  Maybe ":" would be better?  It is a no-op
built-in command in Bourne shell.
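
The difference is easy to check from Guile: ":" writes nothing, whereas
the 'echo' replacement leaves stray output wherever "export PATH" used
to be.  A sketch, where 'output-of' is a helper name made up for the
example and the commands run through the system's 'sh':

```scheme
(use-modules (ice-9 popen)
             (ice-9 rdelim))

;; Run COMMAND through the shell and return whatever it wrote to
;; standard output, as a string.
(define (output-of command)
  (let* ((port (open-input-pipe command))
         (text (read-string port)))
    (close-pipe port)
    (if (eof-object? text) "" text)))

(string-null? (output-of ":"))          ; => #t
(string-null? (output-of "echo -n 0"))  ; => #f
```
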

> +            #t))
> +         (replace
> +          'configure
> +          (lambda* (#:key inputs outputs #:allow-other-keys)
> +            (let ((out (assoc-ref outputs "out"))
> +                  (lib (string-append (assoc-ref outputs "lib") "/lib"))
> +                  (include (string-append (assoc-ref outputs "include")
> +                                          "/include/ncbi-tools++")))

How about lining up the initializers of this 'let'?
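
Something like the following, shown as a self-contained sketch: the
'outputs' alist and its paths are made up so that the form can be
evaluated on its own, and the body just returns the three values.

```scheme
;; Hypothetical output directories, only for illustration.
(define outputs
  '(("out"     . "/hypothetical/out")
    ("lib"     . "/hypothetical/lib-output")
    ("include" . "/hypothetical/include-output")))

(let ((out     (assoc-ref outputs "out"))
      (lib     (string-append (assoc-ref outputs "lib") "/lib"))
      (include (string-append (assoc-ref outputs "include")
                              "/include/ncbi-tools++")))
  (list out lib include))
```
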

> +              ;; The 'configure' script doesn't recognize things like
> +              ;; '--enable-fast-install'.
> +              (zero? (system* "./configure.orig"
> +                              (string-append "--with-build-root=" (getcwd) "/build")
> +                              (string-append "--prefix=" out)
> +                              (string-append "--libdir=" lib)
> +                              (string-append "--includedir=" include)
> +                              (string-append "--with-bz2="
> +                                             (assoc-ref inputs "bzip2"))
> +                              (string-append "--with-z="
> +                                             (assoc-ref inputs "zlib"))
> +                              ;; Each library is built twice by default, once
> +                              ;; with "-static" in its name, and again
> +                              ;; without.
> +                              "--without-static"
> +                              "--with-dll"))))))))
> +    (outputs '("out"       ; 19 MB
> +               "lib"       ; 203MB
> +               "include")) ; 32MB
> +    (inputs
> +     `(("bzip2" ,bzip2)
> +       ("zlib" ,zlib)))
> +    (native-inputs
> +     `(("cpio" ,cpio)))
> +    (home-page "http://blast.ncbi.nlm.nih.gov")
> +    (synopsis "Basic local alignment search tool")
> +    (description
> +     "BLAST is a popular method of performing a DNA or protein sequence
> +similarity search, using heuristics to produce results quickly.  It also
> +calculates an “expect value” that estimates how many matches would have
> +occurred at a given score by chance, which can aid a user in judging how much
> +confidence to have in an alignment.")
> +    (license license:public-domain)))
> +

Is everything in here really in the public domain?  I'd guess that in
order to make this true, you'd need to remove bzip2 and zlib in a
snippet, and even then I'd be doubtful :)

Actually, it might be a good idea for us to remove bundled stuff in a
snippet whenever possible, since we won't be applying security updates
to those things, and it's probably better to remove them than to
distribute bundled source code with security holes.
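
A rough, untested sketch of what that could look like here.  It assumes
that the snippet runs at the top of the unpacked tarball, so that the
bundled code sits under "c++/" (the directory the 'enter-dir' phase
enters); the binding name 'blast+-source' is only there to make the
example stand on its own.

```scheme
(use-modules (guix packages)
             (guix download))

;; Hypothetical standalone binding; in the package this would be the
;; value of the 'source' field.
(define blast+-source
  (let ((version "2.2.30"))
    (origin
      (method url-fetch)
      (uri (string-append
            "ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/"
            version "/ncbi-blast-" version "+-src.tar.gz"))
      (sha256
       (base32
        "0h0fj5cpx6zpfwixgx5f5xbr4rn3cnai0x3j7grrg50vr18jvxr6"))
      ;; Make 'delete-file-recursively' available to the snippet.
      (modules '((guix build utils)))
      (snippet
       '(begin
          ;; Assumed paths: the directories that 'fix-build-system'
          ;; deletes, as seen from the top of the tarball.
          (delete-file-recursively "c++/src/util/compress/bzip2")
          (delete-file-recursively "c++/src/util/compress/zlib")
          #t)))))
```

The 'substitute*' on src/util/compress/Makefile.in would still be
needed, either in the snippet or in the phase where it is now.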

     Thanks!
       Mark
