From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ricardo Wurmus
Subject: [PATCH] Add Couger.
Date: Wed, 3 Jun 2015 13:56:33 +0200
Message-ID:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Return-path:
Received: from eggs.gnu.org ([2001:4830:134:3::10]:42690)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1Z07Hu-0002W6-6z for guix-devel@gnu.org; Wed, 03 Jun 2015 07:56:51 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1Z07Hq-0008FT-0L
 for guix-devel@gnu.org; Wed, 03 Jun 2015 07:56:50 -0400
Received: from sinope.bbbm.mdc-berlin.de ([141.80.25.23]:39900)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1Z07Hp-0008FD-Mj for guix-devel@gnu.org; Wed, 03 Jun 2015 07:56:45 -0400
Received: from localhost (localhost [127.0.0.1])
 by sinope.bbbm.mdc-berlin.de (Postfix) with ESMTP id ABE91280A5D
 for ; Wed, 3 Jun 2015 13:56:44 +0200 (CEST)
Received: from sinope.bbbm.mdc-berlin.de ([127.0.0.1])
 by localhost (sinope.bbbm.mdc-berlin.de [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id LSFTmptbglZ1 for ; Wed, 3 Jun 2015 13:56:38 +0200 (CEST)
Received: from HTCAONE.mdc-berlin.net (mab.citx.mdc-berlin.de [141.80.36.102])
 by sinope.bbbm.mdc-berlin.de (Postfix) with ESMTP
 for ; Wed, 3 Jun 2015 13:56:38 +0200 (CEST)
List-Id: "Development of GNU Guix and the GNU System distribution."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org
Sender: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org
To: guix-devel

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename="0001-gnu-Add-randomjungle.patch"

>From d4f171912387304e61f2536a2eb167c7321e7663 Mon Sep 17 00:00:00 2001
From: Ricardo Wurmus
Date: Tue, 2 Jun 2015 15:47:22 +0200
Subject: [PATCH 1/2] gnu: Add randomjungle.

* gnu/packages/machine-learning.scm (randomjungle): New variable.
---
 gnu/packages/machine-learning.scm | 50 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 3b4af19..37e1c94 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -23,7 +23,12 @@
   #:use-module (guix download)
   #:use-module (guix build-system gnu)
   #:use-module (gnu packages)
-  #:use-module (gnu packages python))
+  #:use-module (gnu packages boost)
+  #:use-module (gnu packages compression)
+  #:use-module (gnu packages gcc)
+  #:use-module (gnu packages maths)
+  #:use-module (gnu packages python)
+  #:use-module (gnu packages xml))
 
 (define-public libsvm
   (package
@@ -95,3 +100,46 @@ classification.")
               #t)))))
     (inputs
      `(("python" ,python)))))
+
+(define-public randomjungle
+  (package
+    (name "randomjungle")
+    (version "2.1.0")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (string-append
+             "http://www.imbs-luebeck.de/imbs/sites/default/files/u59/"
+             "randomjungle-" version ".tar_.gz"))
+       (sha256
+        (base32
+         "12c8rf30cla71swx2mf4ww9mfd8jbdw5lnxd7dxhyw1ygrvg6y4w"))))
+    (build-system gnu-build-system)
+    (arguments
+     `(#:configure-flags
+       (list (string-append "--with-boost="
+                            (assoc-ref %build-inputs "boost")))
+       #:phases
+       (modify-phases %standard-phases
+         (add-before
+          'configure 'set-CXXFLAGS
+          (lambda _
+            (setenv "CXXFLAGS" "-fpermissive ")
+            #t)))))
+    (inputs
+     `(("boost" ,boost)
+       ("gsl" ,gsl)
+       ("libxml2" ,libxml2)
+       ("zlib" ,zlib)))
+    (native-inputs
+     `(("gfortran" ,gfortran-4.8)))
+    (home-page "http://www.imbs-luebeck.de/imbs/de/node/227/")
+    (synopsis "Implementation of the Random Forests machine learning method")
+    (description
+     "Random Jungle is an implementation of Random Forests. It is supposed to
+analyse high dimensional data. In genetics, it can be used for analysing big
+Genome Wide Association (GWA) data. Random Forests is a powerful machine
+learning method. Most interesting features are variable selection, missing
+value imputation, classifier creation, generalization error estimation and
+sample proximities between pairs of cases.")
+    (license license:gpl3+)))
-- 
2.1.0

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename="0002-gnu-Add-Couger.patch"

>From c9246d7cfc56a0e4a9039a4b5e2dd9a7216ebeef Mon Sep 17 00:00:00 2001
From: Ricardo Wurmus
Date: Wed, 3 Jun 2015 12:56:16 +0200
Subject: [PATCH 2/2] gnu: Add Couger.

* gnu/packages/bioinformatics.scm (couger): New variable.
---
 gnu/packages/bioinformatics.scm | 71 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index afcfecf..fd1c1dd 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -34,6 +34,7 @@
   #:use-module (gnu packages file)
   #:use-module (gnu packages java)
   #:use-module (gnu packages linux)
+  #:use-module (gnu packages machine-learning)
   #:use-module (gnu packages maths)
   #:use-module (gnu packages ncurses)
   #:use-module (gnu packages perl)
@@ -438,6 +439,76 @@ multiple sequence alignments.")
      "CLIPper is a tool to define peaks in CLIP-seq datasets.")
     (license license:gpl2)))
 
+(define-public couger
+  (package
+    (name "couger")
+    (version "1.8.1")
+    (source (origin
+              (method url-fetch)
+              (uri (string-append
+                    "http://couger.oit.duke.edu/static/assets/COUGER"
+                    version ".zip"))
+              (sha256
+               (base32
+                "1bs151a54wnjl1bwnvlf31p17vs73vgiidnka3al02rmp1vrli8g"))))
+    (build-system gnu-build-system)
+    (arguments
+     `(#:tests? #f
+       #:phases
+       (modify-phases %standard-phases
+         (delete 'configure)
+         (delete 'build)
+         (replace
+          'install
+          (lambda* (#:key outputs #:allow-other-keys)
+            (let ((out (assoc-ref outputs "out")))
+              (copy-recursively "src" (string-append out "/src"))
+              (mkdir (string-append out "/bin"))
+              ;; Add "src" directory to module lookup path.
+              (substitute* "couger"
+                (("from argparse")
+                 (string-append "import sys\nsys.path.append(\""
+                                out "\")\nfrom argparse")))
+              (copy-file "couger" (string-append out "/bin/couger")))
+            #t))
+         (add-after
+          'install 'wrap-program
+          (lambda* (#:key inputs outputs #:allow-other-keys)
+            ;; Make sure 'couger' runs with the correct PYTHONPATH.
+            (let* ((out (assoc-ref outputs "out"))
+                   (path (getenv "PYTHONPATH")))
+              (wrap-program (string-append out "/bin/couger")
+                `("PYTHONPATH" ":" prefix (,path))))
+            #t)))))
+    (inputs
+     `(("python" ,python-2)
+       ("python2-pillow" ,python2-pillow)
+       ("python2-numpy" ,python2-numpy)
+       ("python2-scipy" ,python2-scipy)
+       ("python2-matplotlib" ,python2-matplotlib)))
+    (propagated-inputs
+     `(("r" ,r)
+       ("libsvm" ,libsvm)
+       ("randomjungle" ,randomjungle)))
+    (native-inputs
+     `(("unzip" ,unzip)))
+    (home-page "http://couger.oit.duke.edu")
+    (synopsis "Identify co-factors in sets of genomic regions")
+    (description
+     "COUGER can be applied to any two sets of genomic regions bound by
+paralogous TFs (e.g., regions derived from ChIP-seq experiments) to identify
+putative co-factors that provide specificity to each TF. The framework
+determines the genomic targets uniquely-bound by each TF, and identifies a
+small set of co-factors that best explain the in vivo binding differences
+between the two TFs.
+
+COUGER uses classification algorithms (support vector machines and random
+forests) with features that reflect the DNA binding specificities of putative
+co-factors. The features are generated either from high-throughput TF-DNA
+binding data (from protein binding microarray experiments), or from large
+collections of DNA motifs.")
+    (license license:gpl3+)))
+
 (define-public clustal-omega
   (package
     (name "clustal-omega")
-- 
2.1.0

--=-=-=--