From: Ben Woodcroft <b.woodcroft@uq.edu.au>
To: Ricardo Wurmus <rekado@elephly.net>
Cc: "guix-devel@gnu.org" <guix-devel@gnu.org>
Subject: Re: [PATCH] Add fraggenescan.
Date: Mon, 28 Dec 2015 07:53:27 +1000 [thread overview]
Message-ID: <56805DD7.8010900@uq.edu.au> (raw)
In-Reply-To: <87egehl1ai.fsf@elephly.net>
[-- Attachment #1: Type: text/plain, Size: 6568 bytes --]
Heya,
On 20/12/15 23:03, Ricardo Wurmus wrote:
> thanks for your patch!
and thank you.
>> +(define-public fraggenescan
>> + (package
>> + (name "fraggenescan")
>> + (version "1.20")
>> + (source
>> + (origin
>> + (method url-fetch)
>> + (uri
>> + (string-append "mirror://sourceforge/fraggenescan/"
>> + "FragGeneScan" version ".tar.gz"))
>> + (sha256
>> + (base32 "1zzigqmvqvjyqv4945kv6nc5ah2xxm1nxgrlsnbzav3f5c0n0pyj"))))
>> + (build-system gnu-build-system)
>> + (arguments
>> + `(#:phases
>> + (modify-phases %standard-phases
>> + (delete 'configure)
>> + (add-before 'build 'patch-run-script
> This phase does more than just patch the run script (it changes paths in
> two scripts and a C source file). Could you find a better name?
ok, done.
>> + (lambda* (#:key outputs #:allow-other-keys)
>> + (let* ((out (string-append (assoc-ref outputs "out")))
>> + (share (string-append out "/share/fraggenescan/")))
>> + (substitute* "run_FragGeneScan.pl"
>> + (("system\\(\"rm")
>> + (string-append "system(\"" (which "rm")))
>> + (("system\\(\"mv")
>> + (string-append "system(\"" (which "mv")))
>> + ;; This script and other programs expect the training files
>> + ;; to be in the non-standard location bin/train/XXX. Change
>> + ;; this to be share/fraggenescan/train/XXX instead.
>> + (("^\\$train.file = \\$dir.\\\"train/\\\".\\$FGS_train_file;")
>> + (string-append "$train_file = \""
>> + share
>> + "train/\".$FGS_train_file;")))
> I might look clearer if you captured a part of the matches,
>
> (("(^\\$train\\.file = )\\$dir\\.\\\"(train/\\\"\\.\\$FGS_train_file;)" _ pre post)
> (string-append pre "\"" share post))
>
> Or since there’s really just one line beginning with “$train.file” maybe
> you could do this:
>
> (("(^\\$train\\.file = ).*" _ prefix)
> (string-append prefix "\"" share "train/\".$FGS_train_file;"))
>
> The regular expressions above look quite scary, so maybe the latter
> proposal is best here.
Indeed they are scary, and writing regexes for Guix rarely fails to trip
me up. I'd actually prefer a less powerful alternative (just a regular
string replacement), and better yet one that stops the build process
when it matches 0 or 2 or more times. There's nothing like that around
is there?
But for now I just took your suggestions about shortening the regexes.
>
>> + (substitute* "run_hmm.c"
>> + (("^ strcat\\(train_dir, \\\"train/\\\"\\);")
>> + (string-append " strcpy(train_dir, \"" share "/train/\");")))
> Why do you replace “strcat” with “strcpy” here?
The line above does a strcpy we don't want, so strcat would keep that. I
could remove the line above if you want, but this effectively makes no
difference to the running of the program.
[..]
>> + #t))
>> + (replace 'build
>> + (lambda _ (and (zero? (system* "make" "clean"))
>> + (zero? (system* "make" "fgs")))))
> Why must “make clean” be run first? I know the README says so, but is
> it really required? If it is not you could just use the default build
> phase, possibly specifying “fgs” as the target.
Yeh the tarball comes with compiled files. I've added a comment.
>
>> + (delete 'check)
> How about “#:tests? #f” instead?
>
>> + (replace 'install
>> + (lambda* (#:key outputs #:allow-other-keys)
>> + (let* ((out (string-append (assoc-ref outputs "out")))
>> + (bin (string-append out "/bin/"))
>> + (share (string-append out "/share/fraggenescan/train")))
>> + (install-file "run_FragGeneScan.pl" bin)
>> + (install-file "FragGeneScan" bin)
>> + (install-file "FGS_gff.py" bin)
>> + (install-file "post_process.pl" bin)
>> + (copy-recursively "train" share))))
>> + (add-after 'install 'post-install-check
>> + ;; In lieu of 'make check', run one of the examples and check the
>> + ;; output files gets created.
> Oh, I see. Maybe you could delete the “check” phase right before this,
> so that it’s obvious you are moving it after the “install” phase.
ok, done.
>
>> + (lambda* (#:key outputs #:allow-other-keys)
>> + (let* ((out (string-append (assoc-ref outputs "out")))
>> + (bin (string-append out "/bin/")))
>> + (and (zero? (system* (string-append bin "run_FragGeneScan.pl")
>> + "-genome=./example/NC_000913.fna"
>> + "-out=./test2"
>> + "-complete=1"
>> + "-train=complete"))
>> + (file-exists? "test2.faa")
>> + (file-exists? "test2.ffn")
>> + (file-exists? "test2.gff")
>> + (file-exists? "test2.out"))))))))
>> + (inputs
>> + `(("perl" ,perl)
>> + ("python" ,python-2))) ;not compatible with python 3.
>> + (home-page "https://sourceforge.net/projects/fraggenescan/")
>> + (synopsis "Finds potentially fragmented genes in short reads")
>> + (description
>> + "FragGeneScan is a program for predicting bacterial and archaeal genes in
>> +short and error-prone DNA sequencing reads. It can also be applied to predict
>> +genes in incomplete assemblies or complete genomes.")
>> + (license license:gpl1)))
> I didn’t see any mention of a particular GPL version. The README says
> this:
>
> License
> ============
> Copyright (C) 2010 Mina Rho, Yuzhen Ye and Haixu Tang.
> You may redistribute this software under the terms of the GNU General Public License.
>
> This looks like it could be any version of the GPL (as it is implied
> when R packages just declare “GPL” as a license). I would do this, but
> I’m not sure that’s okay:
>
> ;; Released under any version of the GPL
> (license license:gpl3+)
It seems your interpretation was better than mine. The authors said
gpl3+ over email.
Thanks,
ben
[-- Attachment #2: 0001-gnu-Add-fraggenescan.patch --]
[-- Type: text/x-patch, Size: 4717 bytes --]
From a140b0816f095a7a13d6bdd4dcbddbae1d020fbb Mon Sep 17 00:00:00 2001
From: Ben Woodcroft <donttrustben@gmail.com>
Date: Sun, 20 Dec 2015 22:23:17 +1000
Subject: [PATCH] gnu: Add fraggenescan.
* gnu/packages/bioinformatics.scm (fraggenescan): New variable.
---
gnu/packages/bioinformatics.scm | 81 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 81 insertions(+)
diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 7c573e1..afa32f2 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -1354,6 +1354,87 @@ supports next-generation sequencing data in fasta/q and csfasta/q format from
Illumina, Roche 454, and the SOLiD platform.")
(license license:gpl3)))
+(define-public fraggenescan
+ (package
+ (name "fraggenescan")
+ (version "1.20")
+ (source
+ (origin
+ (method url-fetch)
+ (uri
+ (string-append "mirror://sourceforge/fraggenescan/"
+ "FragGeneScan" version ".tar.gz"))
+ (sha256
+ (base32 "1zzigqmvqvjyqv4945kv6nc5ah2xxm1nxgrlsnbzav3f5c0n0pyj"))))
+ (build-system gnu-build-system)
+ (arguments
+ `(#:phases
+ (modify-phases %standard-phases
+ (delete 'configure)
+ (add-before 'build 'patch-paths
+ (lambda* (#:key outputs #:allow-other-keys)
+ (let* ((out (string-append (assoc-ref outputs "out")))
+ (share (string-append out "/share/fraggenescan/")))
+ (substitute* "run_FragGeneScan.pl"
+ (("system\\(\"rm")
+ (string-append "system(\"" (which "rm")))
+ (("system\\(\"mv")
+ (string-append "system(\"" (which "mv")))
+ ;; This script and other programs expect the training files
+ ;; to be in the non-standard location bin/train/XXX. Change
+ ;; this to be share/fraggenescan/train/XXX instead.
+ (("^\\$train.file = \\$dir.*")
+ (string-append "$train_file = \""
+ share
+ "train/\".$FGS_train_file;")))
+ (substitute* "run_hmm.c"
+ (("^ strcat\\(train_dir, \\\"train/\\\"\\);")
+ (string-append " strcpy(train_dir, \"" share "/train/\");")))
+ (substitute* "post_process.pl"
+ (("^my \\$dir = substr.*")
+ (string-append "my $dir = \"" share "\";"))))
+ #t))
+ (replace 'build
+ (lambda _ (and (zero? (system* "make" "clean"))
+ (zero? (system* "make" "fgs")))))
+ (replace 'install
+ (lambda* (#:key outputs #:allow-other-keys)
+ (let* ((out (string-append (assoc-ref outputs "out")))
+ (bin (string-append out "/bin/"))
+ (share (string-append out "/share/fraggenescan/train")))
+ (install-file "run_FragGeneScan.pl" bin)
+ (install-file "FragGeneScan" bin)
+ (install-file "FGS_gff.py" bin)
+ (install-file "post_process.pl" bin)
+ (copy-recursively "train" share))))
+ (delete 'check)
+ (add-after 'install 'post-install-check
+ ;; In lieu of 'make check', run one of the examples and check the
+ ;; output files gets created.
+ (lambda* (#:key outputs #:allow-other-keys)
+ (let* ((out (string-append (assoc-ref outputs "out")))
+ (bin (string-append out "/bin/")))
+ (and (zero? (system* (string-append bin "run_FragGeneScan.pl")
+ "-genome=./example/NC_000913.fna"
+ "-out=./test2"
+ "-complete=1"
+ "-train=complete"))
+ (file-exists? "test2.faa")
+ (file-exists? "test2.ffn")
+ (file-exists? "test2.gff")
+ (file-exists? "test2.out"))))))))
+ (inputs
+ `(("perl" ,perl)
+ ("python" ,python-2))) ;not compatible with python 3.
+ (home-page "https://sourceforge.net/projects/fraggenescan/")
+ (synopsis "Finds potentially fragmented genes in short reads")
+ (description
+ "FragGeneScan is a program for predicting bacterial and archaeal genes in
+short and error-prone DNA sequencing reads. It can also be applied to predict
+genes in incomplete assemblies or complete genomes.")
+ ;; GPL3+ according to private correspondense with the authors.
+ (license license:gpl3+)))
+
(define-public grit
(package
(name "grit")
--
2.5.0
next prev parent reply other threads:[~2015-12-27 21:53 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-12-20 12:36 [PATCH] Add fraggenescan Ben Woodcroft
2015-12-20 13:03 ` Ricardo Wurmus
2015-12-27 21:53 ` Ben Woodcroft [this message]
2015-12-28 0:38 ` Leo Famulari
2015-12-28 0:56 ` Ben Woodcroft
2015-12-28 5:43 ` Leo Famulari
2015-12-29 23:23 ` Ludovic Courtès
2015-12-30 1:19 ` Leo Famulari
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56805DD7.8010900@uq.edu.au \
--to=b.woodcroft@uq.edu.au \
--cc=guix-devel@gnu.org \
--cc=rekado@elephly.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/guix.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.