unofficial mirror of guix-devel@gnu.org 
* [PATCH] gnu: Add bedtools
@ 2014-12-12 10:11 Ricardo Wurmus
  2014-12-12 23:00 ` Ludovic Courtès
  2014-12-13  6:42 ` Mark H Weaver
  0 siblings, 2 replies; 6+ messages in thread
From: Ricardo Wurmus @ 2014-12-12 10:11 UTC (permalink / raw)
  To: guix-devel

[-- Attachment #1: Type: text/plain, Size: 750 bytes --]

This patch adds bedtools to the bioinformatics module that was created
with the samtools patch.  bedtools depends on samtools.

Since the Makefile has no install target I had to write a replacement
install phase that copies all tools from the build bin/ directory to the
output /bin/ directory.  I don't know whether it is better to list the
tools explicitly or to select them with a glob pattern.

Another source of ugliness in the recipe is the
patch-makefile-SHELL-definition phase which is really just
patch-makefile-SHELL but working on ":=" definitions rather than
"=" assignments.  Augmenting patch-makefile-SHELL to handle definitions
as well would result in a cleaner package recipe.
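
As a rough sketch of what handling both forms could look like, here is
a plain Guile helper that rewrites a SHELL line whether it uses "=" or
":=" and keeps the original operator.  The name `patch-shell-line' is
made up for illustration; this is not the actual patch-makefile-SHELL
code, and any flags such as "-e" have to be passed in as part of the
shell string.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical helper, not part of Guix.  Matches "SHELL = ..." as well
;; as "SHELL := ..." and remembers which operator was used.
(define shell-line-rx
  (make-regexp "^SHELL[ \t]*(:?=)[ \t]*.*$"))

(define (patch-shell-line line shell)
  ;; Return LINE with its shell replaced by SHELL, or LINE unchanged
  ;; when it is not a SHELL assignment or definition.
  (let ((m (regexp-exec shell-line-rx line)))
    (if m
        (string-append "SHELL " (match:substring m 1) " " shell)
        line)))
```

Lines that do not start with SHELL are returned untouched, so the helper
could be applied to every line of a Makefile.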

Cheers,
rekado


[-- Attachment #2: 0001-gnu-Add-bedtools.patch --]
[-- Type: text/x-patch, Size: 4431 bytes --]

From c8d71da303ff6b82a30db542d382cab57a00699e Mon Sep 17 00:00:00 2001
From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
Date: Thu, 11 Dec 2014 17:37:16 +0100
Subject: [PATCH] gnu: Add bedtools

* gnu/packages/bioinformatics.scm (bedtools): New variable.
---
 gnu/packages/bioinformatics.scm | 68 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 6f6178a..bcc5d43 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -28,6 +28,74 @@
   #:use-module (gnu packages pkg-config)
   #:use-module (gnu packages python))
 
+(define-public bedtools
+  (package
+    (name "bedtools")
+    (version "2.22.0")
+    (source (origin
+              (method url-fetch)
+              (uri (string-append "https://github.com/arq5x/bedtools2/archive/v"
+                                  version ".tar.gz"))
+              (sha256
+               (base32
+                "16aq0w3dmbd0853j32xk9jin4vb6v6fgakfyvrsmsjizzbn3fpfl"))))
+    (build-system gnu-build-system)
+    (inputs `(("python" ,python)
+              ("samtools" ,samtools)
+              ("zlib" ,zlib)))
+    (arguments
+     '(#:test-target "test"
+       #:phases
+       (alist-cons-after
+        'unpack 'patch-makefile-SHELL-definition
+                (lambda _
+                  ;; patch-makefile-SHELL cannot be used here as it does not
+                  ;; yet patch definitions with `:='.  Since changes to
+                  ;; patch-makefile-SHELL result in a full rebuild, features
+                  ;; of patch-makefile-SHELL are reimplemented here.
+                  (define (find-shell name)
+                    (let ((shell
+                           (search-path (search-path-as-string->list (getenv "PATH"))
+                                        name)))
+                      (unless shell
+                        (format (current-error-port)
+                                "patch-makefile-SHELL: warning: no binary for shell `~a' found in $PATH~%"
+                                name))
+                      shell))
+                  (substitute* "Makefile"
+                    (("^SHELL := .*$") (string-append "SHELL := " (find-shell "bash") " -e \n"))))
+                (alist-delete
+                 'configure
+                 (alist-replace
+                  'install (lambda* (#:key outputs #:allow-other-keys)
+                             (let* ((out (assoc-ref outputs "out"))
+                                    (bin (string-append out "/bin"))
+                                    (tools
+                                     '("bamToFastq" "mapBed" "shuffleBed" "bed12ToBed6" "bedToBam"
+                                       "multiIntersectBed" "complementBed" "randomBed" "tagBam" "sortBed"
+                                       "annotateBed" "clusterBed" "fastaFromBed" "coverageBed" "bedpeToBam"
+                                       "pairToPair" "subtractBed" "nucBed" "expandCols" "bedToIgv" "slopBed"
+                                       "closestBed" "windowMaker" "linksBed" "getOverlap" "mergeBed" "windowBed"
+                                       "flankBed" "pairToBed" "intersectBed" "bamToBed" "multiBamCov"
+                                       "unionBedGraphs" "genomeCoverageBed" "groupBy" "maskFastaFromBed"
+                                       "bedtools")))
+                               (mkdir-p bin)
+                               (map (lambda (tool)
+                                      (copy-file (string-append "bin/" tool)
+                                                 (string-append bin "/" tool)))
+                                    tools)))
+                  %standard-phases)))))
+    (home-page "https://github.com/arq5x/bedtools2")
+    (synopsis "Swiss army knife for genome arithmetic")
+    (description
+     "Collectively, the bedtools utilities are a swiss-army knife of tools for
+a wide-range of genomics analysis tasks.  The most widely-used tools enable
+genome arithmetic: that is, set theory on the genome.  For example, bedtools
+allows one to intersect, merge, count, complement, and shuffle genomic
+intervals from multiple files in widely-used genomic file formats such as BAM,
+BED, GFF/GTF, VCF.")
+    (license license:gpl2)))
+
 (define-public samtools
   (package
     (name "samtools")
-- 
1.9.3



* Re: [PATCH] gnu: Add bedtools
  2014-12-12 10:11 [PATCH] gnu: Add bedtools Ricardo Wurmus
@ 2014-12-12 23:00 ` Ludovic Courtès
  2014-12-13  6:42 ` Mark H Weaver
  1 sibling, 0 replies; 6+ messages in thread
From: Ludovic Courtès @ 2014-12-12 23:00 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: guix-devel

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> skribis:

> Since the Makefile has no install target I had to write a replacement
> install phase that copies all tools from the build bin/ directory to the
> output /bin/ directory.  I don't know whether it is better to list the
> tools explicitly or to select them with a glob pattern.

Is a glob pattern really needed, or is it that bin/* must be copied?

If the latter, you could use ‘scandir’ to obtain the list of files in
that directory, or (find-files "build/bin" ".*") where the 2nd argument
is a regexp, not a glob pattern.
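
A minimal sketch of the ‘scandir’ variant, using only Guile's (ice-9 ftw)
module.  The procedure name `directory-entries' is an assumption made for
this example.  Note that ‘scandir’ returns bare entry names, so the
directory has to be prepended again before copying, and that it returns
#f when the directory cannot be opened.

```scheme
(use-modules (ice-9 ftw))

;; Hypothetical helper: list the entries of DIR in sorted order,
;; skipping the "." and ".." pseudo-entries.
(define (directory-entries dir)
  (scandir dir
           (lambda (name)
             (not (member name '("." ".."))))))
```

In an install phase this could be used as (directory-entries "bin"),
assuming the tools really are the only files in that directory.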

> Another source of ugliness in the recipe is the
> patch-makefile-SHELL-definition phase which is really just
> patch-makefile-SHELL but working on ":=" definitions rather than
> "=" assignments.  Augmenting patch-makefile-SHELL to handle definitions
> as well would result in a cleaner package recipe.

Yeah; let’s fix that in core-updates.

> +       (alist-cons-after
> +        'unpack 'patch-makefile-SHELL-definition
> +                (lambda _

Please align ‘(lambda’ with ‘'unpack’.

> +                  (define (find-shell name)
> +                    (let ((shell
> +                           (search-path (search-path-as-string->list (getenv "PATH"))
> +                                        name)))
> +                      (unless shell
> +                        (format (current-error-port)
> +                                "patch-makefile-SHELL: warning: no binary for shell `~a' found in $PATH~%"
> +                                name))
> +                      shell))

‘find-shell’ is not needed: just use the procedure called ‘which’.

Could you send an updated patch?

Thanks,
Ludo’.


* Re: [PATCH] gnu: Add bedtools
  2014-12-12 10:11 [PATCH] gnu: Add bedtools Ricardo Wurmus
  2014-12-12 23:00 ` Ludovic Courtès
@ 2014-12-13  6:42 ` Mark H Weaver
  2014-12-15 15:46   ` Ricardo Wurmus
  1 sibling, 1 reply; 6+ messages in thread
From: Mark H Weaver @ 2014-12-13  6:42 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: guix-devel

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

> From c8d71da303ff6b82a30db542d382cab57a00699e Mon Sep 17 00:00:00 2001
> From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
> Date: Thu, 11 Dec 2014 17:37:16 +0100
> Subject: [PATCH] gnu: Add bedtools

Ludovic mostly covered this, but I have two additional suggestions:

> +                 (alist-replace
> +                  'install (lambda* (#:key outputs #:allow-other-keys)

Please align the lambda* under 'install.

> +                             (let* ((out (assoc-ref outputs "out"))
> +                                    (bin (string-append out "/bin"))
> +                                    (tools
> +                                     '("bamToFastq" "mapBed" "shuffleBed" "bed12ToBed6" "bedToBam"
> +                                       "multiIntersectBed" "complementBed" "randomBed" "tagBam" "sortBed"
> +                                       "annotateBed" "clusterBed" "fastaFromBed" "coverageBed" "bedpeToBam"
> +                                       "pairToPair" "subtractBed" "nucBed" "expandCols" "bedToIgv" "slopBed"
> +                                       "closestBed" "windowMaker" "linksBed" "getOverlap" "mergeBed" "windowBed"
> +                                       "flankBed" "pairToBed" "intersectBed" "bamToBed" "multiBamCov"
> +                                       "unionBedGraphs" "genomeCoverageBed" "groupBy" "maskFastaFromBed"
> +                                       "bedtools")))
> +                               (mkdir-p bin)
> +                               (map (lambda (tool)

It would be more appropriate to use 'for-each' here instead of 'map'.

> +                                      (copy-file (string-append "bin/" tool)
> +                                                 (string-append bin "/" tool)))
> +                                    tools)))
> +                  %standard-phases)))))
[...]
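
To illustrate the 'for-each' suggestion with a small stand-alone sketch
(the variable names are invented for the example): 'map' builds a list
of results and leaves the order of application unspecified, while
'for-each' is meant for side effects and processes the elements in
order.  Copying files is a side effect, so nothing useful would be
collected by 'map'.

```scheme
;; 'map' collects the return values into a new list.
(define squares
  (map (lambda (n) (* n n)) '(1 2 3)))

;; 'for-each' runs the procedure for effect, first element first.
;; Here the "effect" is recording each tool name, standing in for
;; a call such as copy-file.
(define installed '())

(for-each (lambda (tool)
            (set! installed (cons tool installed)))
          '("bedtools" "mapBed"))
```

After this runs, `installed' holds the names in reverse order of
processing, because each one was consed onto the front of the list.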

    Thanks!
      Mark


* Re: [PATCH] gnu: Add bedtools
  2014-12-13  6:42 ` Mark H Weaver
@ 2014-12-15 15:46   ` Ricardo Wurmus
  2014-12-16 17:18     ` Ludovic Courtès
  0 siblings, 1 reply; 6+ messages in thread
From: Ricardo Wurmus @ 2014-12-15 15:46 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 271 bytes --]

Thanks for the review and the suggestions.  Attached is an updated patch
for bedtools.

Python has been marked as a native input as it is only required at build
time to run a script that creates shell script wrappers for the various
executables.  I hope that's correct.


[-- Attachment #2: 0001-gnu-Add-bedtools.patch --]
[-- Type: text/x-patch, Size: 2856 bytes --]

From 5d1d383417992369e6fa3f81331e87471183ffc6 Mon Sep 17 00:00:00 2001
From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
Date: Thu, 11 Dec 2014 17:37:16 +0100
Subject: [PATCH] gnu: Add bedtools

* gnu/packages/bioinformatics.scm (bedtools): New variable.
---
 gnu/packages/bioinformatics.scm | 49 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 6f6178a..d608a34 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -28,6 +28,55 @@
   #:use-module (gnu packages pkg-config)
   #:use-module (gnu packages python))
 
+(define-public bedtools
+  (package
+    (name "bedtools")
+    (version "2.22.0")
+    (source (origin
+              (method url-fetch)
+              (uri (string-append "https://github.com/arq5x/bedtools2/archive/v"
+                                  version ".tar.gz"))
+              (sha256
+               (base32
+                "16aq0w3dmbd0853j32xk9jin4vb6v6fgakfyvrsmsjizzbn3fpfl"))))
+    (build-system gnu-build-system)
+    (native-inputs `(("python" ,python-2)))
+    (inputs `(("samtools" ,samtools)
+              ("zlib" ,zlib)))
+    (arguments
+     '(#:test-target "test"
+       #:phases
+       (alist-cons-after
+        'unpack 'patch-makefile-SHELL-definition
+        (lambda _
+          ;; patch-makefile-SHELL cannot be used here as it does not
+          ;; yet patch definitions with `:='.  Since changes to
+          ;; patch-makefile-SHELL result in a full rebuild, features
+          ;; of patch-makefile-SHELL are reimplemented here.
+          (substitute* "Makefile"
+            (("^SHELL := .*$") (string-append "SHELL := " (which "bash") " -e \n"))))
+        (alist-delete
+         'configure
+         (alist-replace
+          'install
+          (lambda* (#:key outputs #:allow-other-keys)
+            (let ((bin (string-append (assoc-ref outputs "out") "/bin/")))
+              (mkdir-p bin)
+              (for-each (lambda (file)
+                          (copy-file file (string-append bin (basename file))))
+                        (find-files "bin" ".*"))))
+          %standard-phases)))))
+    (home-page "https://github.com/arq5x/bedtools2")
+    (synopsis "Swiss army knife for genome arithmetic")
+    (description
+     "Collectively, the bedtools utilities are a swiss-army knife of tools for
+a wide-range of genomics analysis tasks.  The most widely-used tools enable
+genome arithmetic: that is, set theory on the genome.  For example, bedtools
+allows one to intersect, merge, count, complement, and shuffle genomic
+intervals from multiple files in widely-used genomic file formats such as BAM,
+BED, GFF/GTF, VCF.")
+    (license license:gpl2)))
+
 (define-public samtools
   (package
     (name "samtools")
-- 
1.9.3



* Re: [PATCH] gnu: Add bedtools
       [not found] ` <idj61db7vu0.fsf@bimsb-sys02.mdc-berlin.net>
@ 2014-12-16 10:15   ` John Darrington
  0 siblings, 0 replies; 6+ messages in thread
From: John Darrington @ 2014-12-16 10:15 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 2171 bytes --]

On Tue, Dec 16, 2014 at 11:06:15AM +0100, Ricardo Wurmus wrote:
     
     John Darrington writes:
     
     > I suggest that in the description and particularly in the synopsis fields you
     > avoid persiflage like "swiss-army knife".  It doesn't add any information, and
     > only goes to increase the noise.
     >
     > (The same goes for phrases such as "high quality", "powerful" and "state of the art").
     >
     > J'
     >
     > On Thu, Dec 11, 2014 at 05:37:16PM +0100, Ricardo Wurmus wrote:
     >      * gnu/packages/bioinformatics.scm (bedtools): New variable.
     >
     >      +    (synopsis "Swiss army knife for genome arithmetic")
     >      +    (description
     >      +     "Collectively, the bedtools utilities are a swiss-army knife of tools for
     >      +a wide-range of genomics analysis tasks.  The most widely-used tools enable
     >      +genome arithmetic: that is, set theory on the genome.  For example, bedtools
     >      +allows one to intersect, merge, count, complement, and shuffle genomic
     >      +intervals from multiple files in widely-used genomic file formats such as BAM,
     >      +BED, GFF/GTF, VCF.")
     
     I do agree.  I sheepishly copied the synopsis and description from the
     project website and I wasn't quite satisfied either.  Do you have a
     suggestion?  "Tools for performing genome arithmetic", maybe?
     
Without knowing the package I cannot say.  But I presume that "Swiss Army
Knife" is supposed to imply that there are various tools with differing
purposes.

So perhaps "Miscellaneous tools for performing genome arithmetic"?  On the
other hand, unless some other package contains specialised tools for one
particular branch of genome arithmetic, the word "miscellaneous" is
redundant, and simply "Tools for performing genome arithmetic", as you
suggested, would be best.  It is short and to the point, as a synopsis
should be.

J'

-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.



* Re: [PATCH] gnu: Add bedtools
  2014-12-15 15:46   ` Ricardo Wurmus
@ 2014-12-16 17:18     ` Ludovic Courtès
  0 siblings, 0 replies; 6+ messages in thread
From: Ludovic Courtès @ 2014-12-16 17:18 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: guix-devel

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> skribis:

> Thanks for the review and the suggestions.  Attached is an updated patch
> for bedtools.
>
> Python has been marked as a native input as it is only required at build
> time to run a script that creates shell script wrappers for the various
> executables.  I hope that's correct.

Yes it is.

> From 5d1d383417992369e6fa3f81331e87471183ffc6 Mon Sep 17 00:00:00 2001
> From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
> Date: Thu, 11 Dec 2014 17:37:16 +0100
> Subject: [PATCH] gnu: Add bedtools
>
> * gnu/packages/bioinformatics.scm (bedtools): New variable.

Pushed.  As John suggested, I changed the synopsis to "Tools for genome
analysis and arithmetic", which is more rigorous and less marketingish.
Let me know if you think this should be changed to something else.

Thanks!

Ludo’.
