* [PATCH] Add cd-hit.
@ 2016-02-21 11:02 Ben Woodcroft
  2016-02-23 13:24 ` Ludovic Courtès
  2016-02-23 13:33 ` Ricardo Wurmus
  0 siblings, 2 replies; 5+ messages in thread
From: Ben Woodcroft @ 2016-02-21 11:02 UTC (permalink / raw)
  To: guix-devel@gnu.org

[-- Attachment #1: Type: text/plain, Size: 19 bytes --]

Thanks in advance.

[-- Attachment #2: 0001-gnu-Add-cd-hit.patch --]
[-- Type: text/x-patch, Size: 2138 bytes --]

From 95ae898345774f6bb26ce11a340b688118d6c4ba Mon Sep 17 00:00:00 2001
From: Ben Woodcroft <donttrustben@gmail.com>
Date: Sat, 16 Jan 2016 14:31:25 +1000
Subject: [PATCH] gnu: Add cd-hit.

* gnu/packages/bioinformatics.scm (cd-hit): New variable.
---
 gnu/packages/bioinformatics.scm | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index a72765a..87b2519 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -807,6 +807,46 @@ and more accurate.  BWA-MEM also has better performance than BWA-backtrack for
 multiple sequence alignments.")
     (license license:expat)))
 
+(define-public cd-hit
+  (package
+    (name "cd-hit")
+    (version "4.6.4")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (string-append
+             "https://github.com/weizhongli/cdhit/releases/download/V"
+             version
+             "/cd-hit-v"
+             version
+             "-2015-0603.tar.gz"))
+       (sha256
+        (base32
+         "0b6r52hhz3apx3wbc3zpzmxzyv6p7n3w4x3didadsjbnnav5006a"))))
+    (build-system gnu-build-system)
+    (arguments
+     `(#:tests? #f ; no tests
+       #:make-flags
+       (list (string-append "PREFIX="
+                            (assoc-ref %outputs "out")
+                            "/bin"))
+       #:phases
+       (modify-phases %standard-phases
+         (delete 'configure)
+         (add-before 'install 'create-output-directory
+           (lambda* (#:key outputs #:allow-other-keys)
+             (mkdir-p (string-append
+                       (assoc-ref outputs "out") "/bin")))))))
+    (inputs
+     `(("perl" ,perl)))
+    (home-page "http://weizhongli-lab.org/cd-hit/")
+    (synopsis "Cluster and compare protein or nucleotide sequences")
+    (description
+     "CD-HIT is a program for clustering and comparing protein or nucleotide
+sequences.  CD-HIT is designed to be fast and handle extremely large
+databases.")
+    (license license:gpl2)))
+
 (define-public clipper
   (package
     (name "clipper")
-- 
2.6.3



* Re: [PATCH] Add cd-hit.
  2016-02-21 11:02 [PATCH] Add cd-hit Ben Woodcroft
@ 2016-02-23 13:24 ` Ludovic Courtès
  2016-02-23 13:33 ` Ricardo Wurmus
  1 sibling, 0 replies; 5+ messages in thread
From: Ludovic Courtès @ 2016-02-23 13:24 UTC (permalink / raw)
  To: Ben Woodcroft; +Cc: guix-devel@gnu.org

Ben Woodcroft <b.woodcroft@uq.edu.au> skribis:

> From 95ae898345774f6bb26ce11a340b688118d6c4ba Mon Sep 17 00:00:00 2001
> From: Ben Woodcroft <donttrustben@gmail.com>
> Date: Sat, 16 Jan 2016 14:31:25 +1000
> Subject: [PATCH] gnu: Add cd-hit.
>
> * gnu/packages/bioinformatics.scm (cd-hit): New variable.

LGTM, thanks!

Ludo’.


* Re: [PATCH] Add cd-hit.
  2016-02-21 11:02 [PATCH] Add cd-hit Ben Woodcroft
  2016-02-23 13:24 ` Ludovic Courtès
@ 2016-02-23 13:33 ` Ricardo Wurmus
  1 sibling, 0 replies; 5+ messages in thread
From: Ricardo Wurmus @ 2016-02-23 13:33 UTC (permalink / raw)
  To: Ben Woodcroft; +Cc: guix-devel@gnu.org


Ben Woodcroft <b.woodcroft@uq.edu.au> writes:

> +(define-public cd-hit
> +  (package
> +    (name "cd-hit")
> +    (version "4.6.4")
> +    (source
> +     (origin
> +       (method url-fetch)
> +       (uri (string-append
> +             "https://github.com/weizhongli/cdhit/releases/download/V"
> +             version
> +             "/cd-hit-v"
> +             version
> +             "-2015-0603.tar.gz"))

I’m not a fan of this layout (in particular: distributing the string on
so many lines).

Is ‘-2015-0603’ actually part of the version number?

If so you may need to use “(version "4.6.4-2015-0603")” and strip off
the part beginning with “-” in the first instance of “version” in the
URL.
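
A minimal sketch of this suggestion in plain Guile, assuming the date
suffix is folded into the version string.  The helper name
`upstream-release` is hypothetical and not part of Guix; `string-index`
and `string-take` are standard Guile procedures.

```scheme
;; Assumption: the date suffix becomes part of the version string.
(define version "4.6.4-2015-0603")

;; Hypothetical helper: keep only the part of VERSION before the first "-".
;; If there is no "-", return VERSION unchanged.
(define (upstream-release version)
  (let ((dash (string-index version #\-)))
    (if dash
        (string-take version dash)
        version)))

;; The release directory uses the short version, the file name the full one.
(define url
  (string-append "https://github.com/weizhongli/cdhit"
                 "/releases/download/V" (upstream-release version)
                 "/cd-hit-v" version ".tar.gz"))

(display url)
(newline)
```

With these definitions the URL is the same one the patch builds, while the
date lives in a single place.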

I checked the license and according to the documentation it’s really
just GPLv2.  (There are no license headers.)

~~ Ricardo


* [PATCH] Add CD-HIT
@ 2016-03-11 14:18 Ricardo Wurmus
  2016-03-11 16:02 ` Ben Woodcroft
  0 siblings, 1 reply; 5+ messages in thread
From: Ricardo Wurmus @ 2016-03-11 14:18 UTC (permalink / raw)
  To: guix-devel

[-- Attachment #1: 0001-gnu-Add-CD-HIT.patch --]
[-- Type: text/x-patch, Size: 2922 bytes --]

From 19d0402a90ee8f93f099fb026a7ba5436f77a21b Mon Sep 17 00:00:00 2001
From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
Date: Fri, 11 Mar 2016 14:57:29 +0100
Subject: [PATCH] gnu: Add CD-HIT.

* gnu/packages/bioinformatics.scm (cd-hit): New variable.
---
 gnu/packages/bioinformatics.scm | 50 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 5cb5fa2..2f0d2db 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -875,6 +875,56 @@ also includes an interface for tabix.")
 (define-public python2-pysam
   (package-with-python2 python-pysam))
 
+(define-public cd-hit
+  (package
+    (name "cd-hit")
+    (version "4.6.5")
+    (source (origin
+              (method url-fetch)
+              (uri (string-append "https://github.com/weizhongli/cdhit"
+                                  "/releases/download/V" version
+                                  "/cd-hit-v" version "-2016-0304.tar.gz"))
+              (sha256
+               (base32
+                "15db0hq38yyifwqx9b6l34z14jcq576dmjavhj8a426c18lvnhp3"))))
+    (build-system gnu-build-system)
+    (arguments
+     `(#:tests? #f ; there are no tests
+       #:make-flags
+       ;; Executables are copied directly to the PREFIX.
+       (list (string-append "PREFIX=" (assoc-ref %outputs "out") "/bin"))
+       #:phases
+       (modify-phases %standard-phases
+         ;; No "configure" script
+         (delete 'configure)
+         ;; Remove sources of non-determinism
+         (add-after 'unpack 'be-timeless
+           (lambda _
+             (substitute* "cdhit-utility.c++"
+               ((" \\(built on \" __DATE__ \"\\)") ""))
+             (substitute* "cdhit-common.c++"
+               (("__DATE__") "\"0\"")
+               (("\", %s, \" __TIME__ \"\\\\n\", date") ""))
+             #t))
+         ;; The "install" target does not create the target directory
+         (add-before 'install 'create-target-dir
+           (lambda* (#:key outputs #:allow-other-keys)
+             (mkdir-p (string-append (assoc-ref outputs "out") "/bin"))
+             #t)))))
+    (inputs
+     `(("perl" ,perl)))
+    (home-page "http://weizhongli-lab.org/cd-hit/")
+    (synopsis "Cluster and compare protein or nucleotide sequences")
+    (description
+     "CD-HIT is a program for clustering and comparing protein or nucleotide
+sequences.  CD-HIT is very fast and can handle extremely large databases.
+CD-HIT helps to significantly reduce the computational and manual efforts in
+many sequence analysis tasks and aids in understanding the data structure and
+correct the bias within a dataset.")
+    ;; The manual says: "It can be copied under the GNU General Public License
+    ;; version 2 (GPLv2)."
+    (license license:gpl2)))
+
 (define-public clipper
   (package
     (name "clipper")
-- 
2.1.0


* Re: [PATCH] Add CD-HIT
  2016-03-11 14:18 [PATCH] Add CD-HIT Ricardo Wurmus
@ 2016-03-11 16:02 ` Ben Woodcroft
  0 siblings, 0 replies; 5+ messages in thread
From: Ben Woodcroft @ 2016-03-11 16:02 UTC (permalink / raw)
  To: Ricardo Wurmus, guix-devel

Hi Ricardo,

Looks good, and much the same as the patch I hadn't got around to 
submitting minus your good work with determinism.
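
For readers unfamiliar with the determinism fix: the idea is to replace
the compiler's `__DATE__` macro with a constant so that two builds of
the same source produce identical binaries.  A minimal sketch in plain
Guile, using `regexp-substitute/global` from `(ice-9 regex)` rather than
Guix's `substitute*`; the sample line and the name `freeze-date` are
hypothetical and not taken from the CD-HIT sources.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical line standing in for a line of the C++ sources.
(define sample-line "printf(\"built on %s\\n\", __DATE__);")

;; Replace every occurrence of the __DATE__ macro with the literal "0".
(define (freeze-date str)
  (regexp-substitute/global #f "__DATE__" str 'pre "\"0\"" 'post))

(display (freeze-date sample-line))
(newline)
```

The patch applies the same kind of rewrite to the files in the build
tree right after the `unpack` phase.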

On 11/03/16 09:18, Ricardo Wurmus wrote:
> +    (description
> +     "CD-HIT is a program for clustering and comparing protein or nucleotide
> +sequences.  CD-HIT is very fast and can handle extremely large databases.
> +CD-HIT helps to significantly reduce the computational and manual efforts in
> +many sequence analysis tasks and aids in understanding the data structure and
> +correct the bias within a dataset.")

I didn't understand that last sentence, and I thought the 2nd sentence 
was a bit too opinionated. In my patch the description was simply:

+    (description
+     "CD-HIT is a program for clustering and comparing protein or nucleotide
+sequences.  CD-HIT is designed to be fast and handle extremely large
+databases.")

HTH,
ben
