all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Ben Woodcroft <b.woodcroft@uq.edu.au>
To: Ricardo Wurmus <rekado@elephly.net>
Cc: "guix-devel@gnu.org" <guix-devel@gnu.org>
Subject: Re: [PATCH] Add fraggenescan.
Date: Mon, 28 Dec 2015 07:53:27 +1000	[thread overview]
Message-ID: <56805DD7.8010900@uq.edu.au> (raw)
In-Reply-To: <87egehl1ai.fsf@elephly.net>

[-- Attachment #1: Type: text/plain, Size: 6568 bytes --]

Heya,

On 20/12/15 23:03, Ricardo Wurmus wrote:
> thanks for your patch!
and thank you.
>> +(define-public fraggenescan
>> +  (package
>> +    (name "fraggenescan")
>> +    (version "1.20")
>> +    (source
>> +     (origin
>> +       (method url-fetch)
>> +       (uri
>> +        (string-append "mirror://sourceforge/fraggenescan/"
>> +                       "FragGeneScan" version ".tar.gz"))
>> +       (sha256
>> +        (base32 "1zzigqmvqvjyqv4945kv6nc5ah2xxm1nxgrlsnbzav3f5c0n0pyj"))))
>> +    (build-system gnu-build-system)
>> +    (arguments
>> +     `(#:phases
>> +       (modify-phases %standard-phases
>> +         (delete 'configure)
>> +         (add-before 'build 'patch-run-script
> This phase does more than just patch the run script (it changes paths in
> two scripts and a C source file).  Could you find a better name?
ok, done.
>> +           (lambda* (#:key outputs #:allow-other-keys)
>> +             (let* ((out (string-append (assoc-ref outputs "out")))
>> +                    (share (string-append out "/share/fraggenescan/")))
>> +               (substitute* "run_FragGeneScan.pl"
>> +                 (("system\\(\"rm")
>> +                  (string-append "system(\"" (which "rm")))
>> +                 (("system\\(\"mv")
>> +                  (string-append "system(\"" (which "mv")))
>> +                 ;; This script and other programs expect the training files
>> +                 ;; to be in the non-standard location bin/train/XXX. Change
>> +                 ;; this to be share/fraggenescan/train/XXX instead.
>> +                 (("^\\$train.file = \\$dir.\\\"train/\\\".\\$FGS_train_file;")
>> +                  (string-append "$train_file = \""
>> +                                 share
>> +                                 "train/\".$FGS_train_file;")))
> I might look clearer if you captured a part of the matches,
>
> (("(^\\$train\\.file = )\\$dir\\.\\\"(train/\\\"\\.\\$FGS_train_file;)" _ pre post)
>   (string-append pre "\"" share post))
>
> Or since there’s really just one line beginning with “$train.file” maybe
> you could do this:
>
> (("(^\\$train\\.file = ).*" _ prefix)
>   (string-append prefix "\"" share "train/\".$FGS_train_file;"))
>
> The regular expressions above look quite scary, so maybe the latter
> proposal is best here.
Indeed they are scary, and writing regexes for Guix rarely fails to trip 
me up. I'd actually prefer a less powerful alternative (just a regular 
string replacement), and better yet one that stops the build process 
when it matches 0 or 2 or more times. There's nothing like that around 
is there?

But for now I just took your suggestions about shortening the regexes.
>
>> +               (substitute* "run_hmm.c"
>> +                 (("^  strcat\\(train_dir, \\\"train/\\\"\\);")
>> +                  (string-append "  strcpy(train_dir, \"" share "/train/\");")))
> Why do you replace “strcat” with “strcpy” here?
The line above does a strcpy we don't want, so strcat would keep that. I 
could remove the line above if you want, but this effectively makes no 
difference to the running of the program.

[..]
>> +             #t))
>> +         (replace 'build
>> +           (lambda _ (and (zero? (system* "make" "clean"))
>> +                          (zero? (system* "make" "fgs")))))
> Why must “make clean” be run first?  I know the README says so, but is
> it really required?  If it is not you could just use the default build
> phase, possibly specifying “fgs” as the target.
Yeh the tarball comes with compiled files. I've added a comment.
>
>> +         (delete 'check)
> How about “#:tests? #f” instead?
>
>> +         (replace 'install
>> +           (lambda* (#:key outputs #:allow-other-keys)
>> +             (let* ((out (string-append (assoc-ref outputs "out")))
>> +                    (bin (string-append out "/bin/"))
>> +                    (share (string-append out "/share/fraggenescan/train")))
>> +               (install-file "run_FragGeneScan.pl" bin)
>> +               (install-file "FragGeneScan" bin)
>> +               (install-file "FGS_gff.py" bin)
>> +               (install-file "post_process.pl" bin)
>> +               (copy-recursively "train" share))))
>> +         (add-after 'install 'post-install-check
>> +           ;; In lieu of 'make check', run one of the examples and check the
>> +           ;; output files gets created.
> Oh, I see.  Maybe you could delete the “check” phase right before this,
> so that it’s obvious you are moving it after the “install” phase.
ok, done.
>
>> +           (lambda* (#:key outputs #:allow-other-keys)
>> +             (let* ((out (string-append (assoc-ref outputs "out")))
>> +                    (bin (string-append out "/bin/")))
>> +               (and (zero? (system* (string-append bin "run_FragGeneScan.pl")
>> +                             "-genome=./example/NC_000913.fna"
>> +                             "-out=./test2"
>> +                             "-complete=1"
>> +                             "-train=complete"))
>> +                    (file-exists? "test2.faa")
>> +                    (file-exists? "test2.ffn")
>> +                    (file-exists? "test2.gff")
>> +                    (file-exists? "test2.out"))))))))
>> +    (inputs
>> +     `(("perl" ,perl)
>> +       ("python" ,python-2))) ;not compatible with python 3.
>> +    (home-page "https://sourceforge.net/projects/fraggenescan/")
>> +    (synopsis "Finds potentially fragmented genes in short reads")
>> +    (description
>> +     "FragGeneScan is a program for predicting bacterial and archaeal genes in
>> +short and error-prone DNA sequencing reads.  It can also be applied to predict
>> +genes in incomplete assemblies or complete genomes.")
>> +    (license license:gpl1)))
> I didn’t see any mention of a particular GPL version.  The README says
> this:
>
>      License
>      ============
>      Copyright (C) 2010 Mina Rho, Yuzhen Ye and Haixu Tang.
>      You may redistribute this software under the terms of the GNU General Public License.
>
> This looks like it could be any version of the GPL (as it is implied
> when R packages just declare “GPL” as a license).  I would do this, but
> I’m not sure that’s okay:
>
>      ;; Released under any version of the GPL
>      (license license:gpl3+)
It seems your interpretation was better than mine. The authors said 
gpl3+ over email.

Thanks,
ben

[-- Attachment #2: 0001-gnu-Add-fraggenescan.patch --]
[-- Type: text/x-patch, Size: 4717 bytes --]

From a140b0816f095a7a13d6bdd4dcbddbae1d020fbb Mon Sep 17 00:00:00 2001
From: Ben Woodcroft <donttrustben@gmail.com>
Date: Sun, 20 Dec 2015 22:23:17 +1000
Subject: [PATCH] gnu: Add fraggenescan.

* gnu/packages/bioinformatics.scm (fraggenescan): New variable.
---
 gnu/packages/bioinformatics.scm | 81 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 81 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 7c573e1..afa32f2 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -1354,6 +1354,87 @@ supports next-generation sequencing data in fasta/q and csfasta/q format from
 Illumina, Roche 454, and the SOLiD platform.")
     (license license:gpl3)))
 
+(define-public fraggenescan
+  (package
+    (name "fraggenescan")
+    (version "1.20")
+    (source
+     (origin
+       (method url-fetch)
+       (uri
+        (string-append "mirror://sourceforge/fraggenescan/"
+                       "FragGeneScan" version ".tar.gz"))
+       (sha256
+        (base32 "1zzigqmvqvjyqv4945kv6nc5ah2xxm1nxgrlsnbzav3f5c0n0pyj"))))
+    (build-system gnu-build-system)
+    (arguments
+     `(#:phases
+       (modify-phases %standard-phases
+         (delete 'configure)
+         (add-before 'build 'patch-paths
+           (lambda* (#:key outputs #:allow-other-keys)
+             (let* ((out (string-append (assoc-ref outputs "out")))
+                    (share (string-append out "/share/fraggenescan/")))
+               (substitute* "run_FragGeneScan.pl"
+                 (("system\\(\"rm")
+                  (string-append "system(\"" (which "rm")))
+                 (("system\\(\"mv")
+                  (string-append "system(\"" (which "mv")))
+                 ;; This script and other programs expect the training files
+                 ;; to be in the non-standard location bin/train/XXX. Change
+                 ;; this to be share/fraggenescan/train/XXX instead.
+                 (("^\\$train.file = \\$dir.*")
+                  (string-append "$train_file = \""
+                                 share
+                                 "train/\".$FGS_train_file;")))
+               (substitute* "run_hmm.c"
+                 (("^  strcat\\(train_dir, \\\"train/\\\"\\);")
+                  (string-append "  strcpy(train_dir, \"" share "/train/\");")))
+               (substitute* "post_process.pl"
+                 (("^my \\$dir = substr.*")
+                  (string-append "my $dir = \"" share "\";"))))
+             #t))
+         (replace 'build
+           (lambda _ (and (zero? (system* "make" "clean"))
+                          (zero? (system* "make" "fgs")))))
+         (replace 'install
+           (lambda* (#:key outputs #:allow-other-keys)
+             (let* ((out (string-append (assoc-ref outputs "out")))
+                    (bin (string-append out "/bin/"))
+                    (share (string-append out "/share/fraggenescan/train")))
+               (install-file "run_FragGeneScan.pl" bin)
+               (install-file "FragGeneScan" bin)
+               (install-file "FGS_gff.py" bin)
+               (install-file "post_process.pl" bin)
+               (copy-recursively "train" share))))
+         (delete 'check)
+         (add-after 'install 'post-install-check
+           ;; In lieu of 'make check', run one of the examples and check the
+           ;; output files gets created.
+           (lambda* (#:key outputs #:allow-other-keys)
+             (let* ((out (string-append (assoc-ref outputs "out")))
+                    (bin (string-append out "/bin/")))
+               (and (zero? (system* (string-append bin "run_FragGeneScan.pl")
+                             "-genome=./example/NC_000913.fna"
+                             "-out=./test2"
+                             "-complete=1"
+                             "-train=complete"))
+                    (file-exists? "test2.faa")
+                    (file-exists? "test2.ffn")
+                    (file-exists? "test2.gff")
+                    (file-exists? "test2.out"))))))))
+    (inputs
+     `(("perl" ,perl)
+       ("python" ,python-2))) ;not compatible with python 3.
+    (home-page "https://sourceforge.net/projects/fraggenescan/")
+    (synopsis "Finds potentially fragmented genes in short reads")
+    (description
+     "FragGeneScan is a program for predicting bacterial and archaeal genes in
+short and error-prone DNA sequencing reads.  It can also be applied to predict
+genes in incomplete assemblies or complete genomes.")
+    ;; GPL3+ according to private correspondense with the authors.
+    (license license:gpl3+)))
+
 (define-public grit
   (package
     (name "grit")
-- 
2.5.0


  reply	other threads:[~2015-12-27 21:53 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-20 12:36 [PATCH] Add fraggenescan Ben Woodcroft
2015-12-20 13:03 ` Ricardo Wurmus
2015-12-27 21:53   ` Ben Woodcroft [this message]
2015-12-28  0:38     ` Leo Famulari
2015-12-28  0:56       ` Ben Woodcroft
2015-12-28  5:43         ` Leo Famulari
2015-12-29 23:23           ` Ludovic Courtès
2015-12-30  1:19             ` Leo Famulari

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56805DD7.8010900@uq.edu.au \
    --to=b.woodcroft@uq.edu.au \
    --cc=guix-devel@gnu.org \
    --cc=rekado@elephly.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.