From: Ben Woodcroft
Subject: Re: [PATCH] gnu: Add r-centipede.
Date: Tue, 3 May 2016 09:24:08 +1000
Message-ID: <5727E198.5050904@uq.edu.au>
In-Reply-To: <1462200561-19162-1-git-send-email-ricardo.wurmus@mdc-berlin.de>
References: <1462200561-19162-1-git-send-email-ricardo.wurmus@mdc-berlin.de>
To: Ricardo Wurmus, guix-devel@gnu.org

On 03/05/16 00:49, Ricardo Wurmus wrote:
> * gnu/packages/bioinformatics.scm (r-centipede): New variable.
> ---
>  gnu/packages/bioinformatics.scm | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
>
> diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
> index 7d025ef..d7957cf 100644
> --- a/gnu/packages/bioinformatics.scm
> +++ b/gnu/packages/bioinformatics.scm
> @@ -441,6 +441,27 @@ pybedtools extends BEDTools by offering feature-level manipulations from with
>  Python.")
>      (license license:gpl2+)))
>
> +(define-public r-centipede
> +  (package
> +    (name "r-centipede")
> +    (version "1.2")
> +    (source (origin
> +              (method url-fetch)
> +              (uri (string-append "http://download.r-forge.r-project.org/"
> +                                  "src/contrib/CENTIPEDE_" version ".tar.gz"))
> +              (sha256
> +               (base32
> +                "1hsx6qgwr0i67fhy9257zj7s0ppncph2hjgbia5nn6nfmj0ax6l9"))))
> +    (build-system r-build-system)
> +    (home-page "http://centipede.uchicago.edu/")
> +    (synopsis "Predict transcription factor binding sites")
> +    (description
> +     "Centipede fits a bayesian hierarchical mixture model to learn
> +transcription-factor-specific distribution of experimental data on a
> +particular cell-type for a set of candidate binding sites described by a
> +genetic motif.")

Perhaps this is just personal opinion, but I prefer not to suggest that experiments can only be done in the lab. Also, I don't think that sentence makes sense grammatically: s/distribution/distributions/ would help, but even then, it doesn't learn the experimental data. Maybe steal from the website and cut this down a bit?

> CENTIPEDE applies a hierarchical Bayesian mixture model to infer regions of the genome that are bound by particular transcription factors. It starts by identifying a set of candidate binding sites (e.g., sites that match a certain position weight matrix (PWM)), and then aims to classify the sites according to whether each site is bound or not bound by a TF.
CENTIPEDE is an unsupervised learning algorithm that discriminates between two different types of motif instances using as much relevant information as possible.

Thanks,
ben