unofficial mirror of guix-devel@gnu.org 
* [PATCH] Add python2-seqmagick.
@ 2015-09-17 11:47 Ben Woodcroft
  2015-09-17 15:51 ` Ricardo Wurmus
  0 siblings, 1 reply; 16+ messages in thread
From: Ben Woodcroft @ 2015-09-17 11:47 UTC (permalink / raw)
  To: guix-devel@gnu.org

[-- Attachment #1: Type: text/plain, Size: 38 bytes --]

Thanks in advance for review peoples.

[-- Attachment #2: 0001-gnu-Add-python2-seqmagick.patch --]
[-- Type: text/x-patch, Size: 2390 bytes --]

From 298c9aabc2d042c45c8f96d83229016dc5c1cbd6 Mon Sep 17 00:00:00 2001
From: Ben Woodcroft <donttrustben@gmail.com>
Date: Thu, 17 Sep 2015 21:43:12 +1000
Subject: [PATCH] gnu: Add python2-seqmagick.

* gnu/packages/bioinformatics.scm (python2-seqmagick): New variable.
---
 gnu/packages/bioinformatics.scm | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 03eb2df..cffe0e9 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -2489,6 +2489,45 @@ manipulation, online and indexed string search, efficient I/O of
 bioinformatics file formats, sequence alignment, and more.")
     (license license:bsd-3)))
 
+(define-public python2-seqmagick
+  (package
+    (name "python2-seqmagick")
+    (version "0.6.1")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (string-append
+             "https://pypi.python.org/packages/source/s/seqmagick/seqmagick-"
+             version ".tar.gz"))
+       (sha256
+        (base32
+         "0cgn477n74gsl4qdaakrrhi953kcsd4q3ivk2lr18x74s3g4ma1d"))))
+    (build-system python-build-system)
+    (arguments
+     ;; python2 only, see https://github.com/fhcrc/seqmagick/issues/56
+     `(#:python ,python-2
+       #:phases
+       (modify-phases %standard-phases
+         ;; current test in setup.py does not work as of 0.6.1,
+         ;; so use nose to run tests instead for now. See
+         ;; https://github.com/fhcrc/seqmagick/issues/55
+         (replace 'check (lambda _ (zero? (system* "nosetests")))))))
+    (inputs
+     `(("python-setuptools" ,python2-setuptools)
+       ("python-biopython" ,python2-biopython)))
+    (native-inputs
+     `(("python-nose" ,python2-nose)))
+    (home-page "http://github.com/fhcrc/seqmagick")
+    (synopsis
+     "Tools for converting and modifying sequence files from the command-line")
+    (description
+     "Bioinformaticians often have to convert sequence files between formats
+and do little manipulations on them, and it's not worth writing scripts for
+that.  Seqmagick is a utility to expose the file format conversion in
+BioPython in a convenient way.  Instead of having a big mess of scripts, there
+is one that takes arguments.")
+    (license license:gpl3)))
+
 (define-public star
   (package
     (name "star")
-- 
2.4.3



* Re: [PATCH] Add python2-seqmagick.
  2015-09-17 11:47 [PATCH] Add python2-seqmagick Ben Woodcroft
@ 2015-09-17 15:51 ` Ricardo Wurmus
  2015-09-19  9:36   ` Ben Woodcroft
  2015-09-21 22:41   ` Cyril Roelandt
  0 siblings, 2 replies; 16+ messages in thread
From: Ricardo Wurmus @ 2015-09-17 15:51 UTC (permalink / raw)
  To: Ben Woodcroft; +Cc: guix-devel@gnu.org

Hi Ben,

thank you very much for your patch!

> From 298c9aabc2d042c45c8f96d83229016dc5c1cbd6 Mon Sep 17 00:00:00 2001
> From: Ben Woodcroft <donttrustben@gmail.com>
> Date: Thu, 17 Sep 2015 21:43:12 +1000
> Subject: [PATCH] gnu: Add python2-seqmagick.

> * gnu/packages/bioinformatics.scm (python2-seqmagick): New variable.

Maybe this should be just called “seqmagick”.  It’s written in Python,
but since it’s not a library I don’t think it needs to have the
“python2-” prefix.

> +(define-public python2-seqmagick
> +  (package
> +    (name "python2-seqmagick")
> +    (version "0.6.1")
> +    (source
> +     (origin
> +       (method url-fetch)
> +       (uri (string-append
> +             "https://pypi.python.org/packages/source/s/seqmagick/seqmagick-"
> +             version ".tar.gz"))
> +       (sha256
> +        (base32
> +         "0cgn477n74gsl4qdaakrrhi953kcsd4q3ivk2lr18x74s3g4ma1d"))))
> +    (build-system python-build-system)
> +    (arguments
> +     ;; python2 only, see https://github.com/fhcrc/seqmagick/issues/56
> +     `(#:python ,python-2
> +       #:phases
> +       (modify-phases %standard-phases
> +         ;; current test in setup.py does not work as of 0.6.1,
> +         ;; so use nose to run tests instead for now. See
> +         ;; https://github.com/fhcrc/seqmagick/issues/55
> +         (replace 'check (lambda _ (zero? (system* "nosetests")))))))
> +    (inputs
> +     `(("python-setuptools" ,python2-setuptools)

I think this should be a native input instead.

> +       ("python-biopython" ,python2-biopython)))

And this looks like it should be a propagated input instead.  Have you
tried running seqmagick after installing it with this package recipe?  I
found that Python executables often require either propagated inputs or
wrapping in PYTHONPATH to work without runtime errors.

> +    (native-inputs
> +     `(("python-nose" ,python2-nose)))
> +    (home-page "http://github.com/fhcrc/seqmagick")
> +    (synopsis
> +     "Tools for converting and modifying sequence files from the command-line")

The synopsis is a bit long.  You could shave off two words like this:

  “Command-line tools for converting and modifying sequence files”

but that’s not really much better.  I’m open to suggestions.

> +    (description
> +     "Bioinformaticians often have to convert sequence files between formats
> +and do little manipulations on them, and it's not worth writing scripts for
> +that.  Seqmagick is a utility to expose the file format conversion in
> +BioPython in a convenient way.  Instead of having a big mess of scripts, there
> +is one that takes arguments.")
> +    (license license:gpl3)))
> +

I’m not sure if it’s really “GPLv3 only” or “GPLv3 or later” as there
are no license headers anywhere.  Maybe others could comment what’s the
proper declaration here.

~~ Ricardo


* Re: [PATCH] Add python2-seqmagick.
  2015-09-17 15:51 ` Ricardo Wurmus
@ 2015-09-19  9:36   ` Ben Woodcroft
  2015-09-21  7:34     ` Pjotr Prins
  2015-09-25 14:09     ` [PATCH] Add python2-seqmagick Ricardo Wurmus
  2015-09-21 22:41   ` Cyril Roelandt
  1 sibling, 2 replies; 16+ messages in thread
From: Ben Woodcroft @ 2015-09-19  9:36 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: guix-devel@gnu.org


[-- Attachment #1.1: Type: text/plain, Size: 3980 bytes --]



On 18/09/15 01:51, Ricardo Wurmus wrote:
> Hi Ben,
>
> thank you very much for your patch!
and you sir, for the review.
>> +    (inputs
>> +     `(("python-setuptools" ,python2-setuptools)
> I think this should be a native input instead.
ok
>> +       ("python-biopython" ,python2-biopython)))
> And this looks like it should be a propagated input instead.  Have you
> tried running seqmagick after installing it with this package recipe?  I
> found that Python executables often require either propagated inputs or
> wrapping in PYTHONPATH to work without runtime errors.
I did, although not well enough to pick up the error you point out. 
Actually I'm a bit confused as to the difference between the input types 
even after reading the manual. Is this a fair summary?

native-inputs: required for building but not runtime - installing a
    package through a substitute won't install these inputs
inputs: installed in the store but not in the profile, as well as being
    present at build time
propagated-inputs: installed in the store and in the profile, as well as
    being present at build time


Anyway, it seems as if the package would have worked because a wrapper 
is generated with PYTHONPATH including inputs, propagated-inputs and 
native-inputs. But this seems a bit strange - why would native inputs be 
in the runtime wrapper?
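
The PYTHONPATH value in such a wrapper can be pictured as a plain join
over the site-packages directories of the package and its inputs.  A
minimal sketch in plain Guile, assuming that picture is accurate; this
is not the python-build-system code, and the procedure names
site-packages-directory, python-path and export-line are made up for
illustration.

```scheme
;; Illustrative sketch only; not taken from the Guix source.
;; All procedure names here are hypothetical.

(define (site-packages-directory store-item python-version)
  ;; Append the conventional lib/pythonX.Y/site-packages suffix.
  (string-append store-item "/lib/python" python-version "/site-packages"))

(define (python-path store-items python-version)
  ;; Join the site-packages directory of every store item with ":".
  (string-join (map (lambda (item)
                      (site-packages-directory item python-version))
                    store-items)
               ":"))

(define (export-line store-items python-version)
  ;; Build a shell line like the one in the wrapper.  Any PYTHONPATH the
  ;; user already has is appended, with ":" added only when it is set.
  (string-append "export PYTHONPATH=\""
                 (python-path store-items python-version)
                 "${PYTHONPATH:+:}$PYTHONPATH\""))

;; Store paths copied from the session shown in this message.
(display
 (export-line
  '("/gnu/store/bjd27hqk0wnh28nzmawjymdh3gfk98pn-seqmagick-0.6.1"
    "/gnu/store/c7navsrzibg2rjgs1fzqry711wch5f3p-python2-biopython-1.65")
  "2.7"))
(newline)
```

Seen this way, the open question is only which inputs end up in the
list passed as store-items; in the wrapper printed further down,
python2-nose and python2-setuptools are in it as well.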

$ cat /tmp/a.fa
 >a
ATGG
$ ./pre-inst-env guix environment --ad-hoc seqmagick --pure -E 'echo $PATH; echo $PYTHONPATH; seqmagick info /tmp/a.fa'
;;; note: source file /home/ben/git/guix/gnu/packages/bioinformatics.scm
;;;       newer than compiled /home/ben/git/guix/gnu/packages/bioinformatics.go
/gnu/store/bjd27hqk0wnh28nzmawjymdh3gfk98pn-seqmagick-0.6.1/bin

name      alignment    min_len   max_len   avg_len  num_seqs
/tmp/a.fa FALSE              4         4      4.00         1



$ cat /gnu/store/bjd27hqk0wnh28nzmawjymdh3gfk98pn-seqmagick-0.6.1/bin/.seqmagick-wrap-01
#!/gnu/store/cpx9iibpdwi3wb81glpnnlxr9zra2iiv-bash-4.3.39/bin/bash
export PYTHONPATH="/gnu/store/bjd27hqk0wnh28nzmawjymdh3gfk98pn-seqmagick-0.6.1/lib/python2.7/site-packages:/gnu/store/qlf4k2b81wms6avlyny7hix175asg8kg-python-2.7.10/lib/python2.7/site-packages:/gnu/store/zmhk5r3lmc3s8ivdx9vv6m5vzn6fqim5-python2-setuptools-12.1/lib/python2.7/site-packages:/gnu/store/20s2vvslsapynhmbfsa3q2iyv9dzjzr2-python2-nose-1.3.4/lib/python2.7/site-packages:/gnu/store/c7navsrzibg2rjgs1fzqry711wch5f3p-python2-biopython-1.65/lib/python2.7/site-packages:/gnu/store/bjd27hqk0wnh28nzmawjymdh3gfk98pn-seqmagick-0.6.1/lib/python2.7/site-packages/${PYTHONPATH:+:}$PYTHONPATH"
exec -a "$0" "/gnu/store/bjd27hqk0wnh28nzmawjymdh3gfk98pn-seqmagick-0.6.1/bin/.seqmagick-real" "$@"

>> +    (native-inputs
>> +     `(("python-nose" ,python2-nose)))
>> +    (home-page "http://github.com/fhcrc/seqmagick")
>> +    (synopsis
>> +     "Tools for converting and modifying sequence files from the command-line")
> The synopsis is a bit long.  You could shave off two words like this:
>
>    “Command-line tools for converting and modifying sequence files”
>
> but that’s not really much better.  I’m open to suggestions.
"Tools for converting and modifying sequence files"
>> +    (description
>> +     "Bioinformaticians often have to convert sequence files between formats
>> +and do little manipulations on them, and it's not worth writing scripts for
>> +that.  Seqmagick is a utility to expose the file format conversion in
>> +BioPython in a convenient way.  Instead of having a big mess of scripts, there
>> +is one that takes arguments.")
>> +    (license license:gpl3)))
>> +
> I’m not sure if it’s really “GPLv3 only” or “GPLv3 or later” as there
> are no license headers anywhere.  Maybe others could comment what’s the
> proper declaration here.
From the readme:
> seqmagick is free software under the GPL v3.

Is that not straightforward enough?

Thanks,
ben


[-- Attachment #2: 0001-gnu-Add-seqmagick.patch --]
[-- Type: text/x-patch, Size: 2347 bytes --]

From 9328316f5ff7a454e0370bab9ac4897926629b72 Mon Sep 17 00:00:00 2001
From: Ben Woodcroft <donttrustben@gmail.com>
Date: Sat, 19 Sep 2015 19:31:57 +1000
Subject: [PATCH] gnu: Add seqmagick.

* gnu/packages/bioinformatics.scm (seqmagick): New variable.
---
 gnu/packages/bioinformatics.scm | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 03eb2df..08b3500 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -2489,6 +2489,45 @@ manipulation, online and indexed string search, efficient I/O of
 bioinformatics file formats, sequence alignment, and more.")
     (license license:bsd-3)))
 
+(define-public seqmagick
+  (package
+    (name "seqmagick")
+    (version "0.6.1")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (string-append
+             "https://pypi.python.org/packages/source/s/seqmagick/seqmagick-"
+             version ".tar.gz"))
+       (sha256
+        (base32
+         "0cgn477n74gsl4qdaakrrhi953kcsd4q3ivk2lr18x74s3g4ma1d"))))
+    (build-system python-build-system)
+    (arguments
+     ;; python2 only, see https://github.com/fhcrc/seqmagick/issues/56
+     `(#:python ,python-2
+       #:phases
+       (modify-phases %standard-phases
+         ;; current test in setup.py does not work as of 0.6.1,
+         ;; so use nose to run tests instead for now. See
+         ;; https://github.com/fhcrc/seqmagick/issues/55
+         (replace 'check (lambda _ (zero? (system* "nosetests")))))))
+    (propagated-inputs
+     `(("python-biopython" ,python2-biopython)))
+    (native-inputs
+     `(("python-setuptools" ,python2-setuptools)
+       ("python-nose" ,python2-nose)))
+    (home-page "http://github.com/fhcrc/seqmagick")
+    (synopsis
+     "Tools for converting and modifying sequence files")
+    (description
+     "Bioinformaticians often have to convert sequence files between formats
+and do little manipulations on them, and it's not worth writing scripts for
+that.  Seqmagick is a utility to expose the file format conversion in
+BioPython in a convenient way.  Instead of having a big mess of scripts, there
+is one that takes arguments.")
+    (license license:gpl3)))
+
 (define-public star
   (package
     (name "star")
-- 
2.4.3



* Re: [PATCH] Add python2-seqmagick.
  2015-09-19  9:36   ` Ben Woodcroft
@ 2015-09-21  7:34     ` Pjotr Prins
  2015-09-21 16:13       ` Ludovic Courtès
  2015-09-25 14:09     ` [PATCH] Add python2-seqmagick Ricardo Wurmus
  1 sibling, 1 reply; 16+ messages in thread
From: Pjotr Prins @ 2015-09-21  7:34 UTC (permalink / raw)
  To: Ben Woodcroft; +Cc: guix-devel@gnu.org

This contains the most lucid description of 'inputs' I have yet seen.
Could it go into the main Guix documentation?

Pj.

On Sat, Sep 19, 2015 at 07:36:17PM +1000, Ben Woodcroft wrote:
>    On 18/09/15 01:51, Ricardo Wurmus wrote:
> 
>  Hi Ben,
> 
>  thank you very much for your patch!
> 
>    and you sir, for the review.
> 
>  +    (inputs
>  +     `(("python-setuptools" ,python2-setuptools)
> 
>  I think this should be a native input instead.
> 
>    ok
> 
>  +       ("python-biopython" ,python2-biopython)))
> 
>  And this looks like it should be a propagated input instead.  Have you
>  tried running seqmagick after installing it with this package recipe?  I
>  found that Python executables often require either propagated inputs or
>  wrapping in PYTHONPATH to work without runtime errors.
> 
>    I did, although not well enough to pick up the error you point out.
>    Actually I'm a bit confused as to the difference between the input types
>    even after reading the manual. Is this a fair summary?
> 
>    native-inputs: required for building but not runtime - installing a
>    package through a substitute won't install these inputs
>    inputs: installed in the store but not in the profile, as well as being
>    present at build time
>    propagated-inputs: installed in the store and in the profile, as well as
>    being present at build time
> 
>    Anyway, it seems as if the package would have worked because a wrapper is
>    generated with PYTHONPATH including inputs, propagated-inputs and
>    native-inputs. But this seems a bit strange - why would native inputs be
>    in the runtime wrapper?
>    $ cat /tmp/a.fa
>    >a


* Re: [PATCH] Add python2-seqmagick.
  2015-09-21  7:34     ` Pjotr Prins
@ 2015-09-21 16:13       ` Ludovic Courtès
  2015-09-21 22:36         ` Ben Woodcroft
  2015-09-24  5:08         ` Pjotr Prins
  0 siblings, 2 replies; 16+ messages in thread
From: Ludovic Courtès @ 2015-09-21 16:13 UTC (permalink / raw)
  To: Pjotr Prins; +Cc: guix-devel@gnu.org

Pjotr Prins <pjotr.public12@thebird.nl> skribis:

> This contains the most lucid description of 'inputs' I have yet seen.
> Could they go into the main Guix documentation?

What do you think needs to be changed compared to the text at
<http://www.gnu.org/software/guix/manual/html_node/package-Reference.html>?

TIA,
Ludo’.


* Re: [PATCH] Add python2-seqmagick.
  2015-09-21 16:13       ` Ludovic Courtès
@ 2015-09-21 22:36         ` Ben Woodcroft
  2015-09-24  5:08         ` Pjotr Prins
  1 sibling, 0 replies; 16+ messages in thread
From: Ben Woodcroft @ 2015-09-21 22:36 UTC (permalink / raw)
  To: Ludovic Courtès, Pjotr Prins; +Cc: guix-devel@gnu.org

[-- Attachment #1: Type: text/plain, Size: 3950 bytes --]

On 22/09/15 02:13, Ludovic Courtès wrote:
> Pjotr Prins <pjotr.public12@thebird.nl> skribis:
>
>> This contains the most lucid description of 'inputs' I have yet seen.
>> Could they go into the main Guix documentation?
> What do you think needs to be changed compared to the text at
> <http://www.gnu.org/software/guix/manual/html_node/package-Reference.html>?
So, those descriptions are right?

The manual from the POV of someone a bit confused:
>
> |inputs| (default: |'()|)
>
>     Package or derivation inputs to the build. This is a list of
>     lists, where each list has the name of the input (a string) as its
>     first element, a package or derivation object as its second
>     element, and optionally the name of the output of the package or
>     derivation that should be used, which defaults to |"out"|.
>
That paragraph doesn't tell me much about what inputs actually are in 
any detail, only the semantics of how to specify them.
> |propagated-inputs| (default: |'()|)
>
>     This field is like |inputs|, but the specified packages will be
>     force-installed alongside the package they belong to (see |guix
>     package|
>     <http://www.gnu.org/software/guix/manual/html_node/Invoking-guix-package.html#package_002dcmd_002dpropagated_002dinputs>,
>     for information on how |guix package| deals with propagated inputs.)
>
I guess it is initially confusing why propagated-inputs exist as a 
concept - I presumed that inputs were "installed" too (an input of my 
input is my input). "force-install" is a bit ambiguous - force installed 
in the profile? in the store? What is "forced" - isn't every input 
required? What is the meaning of "install" exactly?
>
>     For example this is necessary when a library needs headers of
>     another library to compile, or needs another shared library to be
>     linked alongside itself when a program wants to link to it.
>
So I'm guessing this is supposed to mean that if library (A) needs 
headers of another library (B) when trying to compile (C) which requires 
(A), then library B should be in the propagated-inputs list of library 
A? This doesn't seem to have anything to do with being force installed. 
Also, an example from an interpreted language would be useful. 
Particularly since there seems to be some discussion about this on the 
mailing list atm.
http://lists.gnu.org/archive/html/guix-devel/2015-09/msg00597.html
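
One way to read "installed alongside": when a package lands in a
profile, so does everything reachable through its propagated-inputs.
Below is a toy model of that reading in plain Guile.  It is a sketch
under that assumption, not Guix code: profile-entries is a hypothetical
name, and the dependency table is invented apart from seqmagick
propagating python2-biopython as in the revised patch.

```scheme
;; Toy model; not the Guix implementation.
;; Hypothetical table mapping a package name to its propagated inputs.
(define %propagated
  '(("seqmagick"         . ("python2-biopython"))
    ("python2-biopython" . ("example-dep"))     ; made-up dependency
    ("example-dep"       . ())))

(define (profile-entries name)
  ;; Return NAME followed by every package reachable through
  ;; propagated inputs, each listed once, in discovery order.
  (let loop ((pending (list name))
             (seen '()))
    (cond ((null? pending)
           (reverse seen))
          ((member (car pending) seen)
           (loop (cdr pending) seen))
          (else
           (let ((deps (or (assoc-ref %propagated (car pending)) '())))
             (loop (append (cdr pending) deps)
                   (cons (car pending) seen)))))))

(display (profile-entries "seqmagick"))
(newline)
```

In this model, plain inputs and native-inputs never show up in the
result, which matches the summary earlier in the thread where only
propagated-inputs reach the profile.
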
> |native-inputs| (default: |'()|)
>
>     This field is like |inputs|, but in case of a cross-compilation it
>     will be ensured that packages for the architecture of the build
>     machine are present, such that executables from them can be used
>     during the build.
>
>     This is typically where you would list tools needed at build time
>     but not at run time, such as Autoconf, Automake, pkg-config,
>     Gettext, or Bison. |guix lint| can report likely mistakes in this
>     area (see Invoking guix lint
>     <http://www.gnu.org/software/guix/manual/html_node/Invoking-guix-lint.html#Invoking-guix-lint>).
>
>
This makes the most sense out of the three input types to me, although 
again the second sentence doesn't follow logically from the first (to 
me). Also the second sentence might include a nod to testing e.g. this 
would be where packages required only for testing are specified. Also 
unclear what mistakes lint is picking up here (how can lint know what is 
being used at runtime?), and thus the reference to lint seems of little 
benefit (authors should always run lint, so what's the point of 
mentioning it here?).

I imagine that examples would help too - the example at the top of that 
section is very useful but too simple, perhaps a second example with an 
interpreted language using different input types would be of use.
http://www.gnu.org/software/guix/manual/html_node/Defining-Packages.html#Defining-Packages

Thanks,
ben



* Re: [PATCH] Add python2-seqmagick.
  2015-09-17 15:51 ` Ricardo Wurmus
  2015-09-19  9:36   ` Ben Woodcroft
@ 2015-09-21 22:41   ` Cyril Roelandt
  1 sibling, 0 replies; 16+ messages in thread
From: Cyril Roelandt @ 2015-09-21 22:41 UTC (permalink / raw)
  To: guix-devel

On 09/17/2015 05:51 PM, Ricardo Wurmus wrote:
> I
> found that Python executables often require either propagated inputs or
> wrapping in PYTHONPATH to work without runtime errors.
Right. I usually have to grep the libraries to see whether they are
actually imported in the program, or they are just used for
tests/installation.
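
That check can be approximated mechanically.  A rough sketch in plain
Guile, assuming it is enough to look for "import X" and "from X import
..." lines; imports-module? is a hypothetical helper, and dynamic
imports would be missed.

```scheme
(use-modules (ice-9 regex)   ; string-match, regexp-quote
             (srfi srfi-1))  ; any

(define (imports-module? lines module)
  ;; True if any line looks like "import MODULE" or
  ;; "from MODULE import ...".  MODULE must be followed by a space,
  ;; dot, comma or end of line, so "Bio" does not match "Biology".
  (let ((pattern (string-append "^[ \t]*(import|from)[ \t]+"
                                (regexp-quote module)
                                "([ .,]|$)")))
    (any (lambda (line)
           (and (string-match pattern line) #t))
         lines)))

;; "Bio" is the module name that biopython installs.
(display (imports-module? '("import os" "from Bio import SeqIO") "Bio"))
(newline)
(display (imports-module? '("import nose") "Bio"))
(newline)
```

Running this over the installed program files versus the test files
would give a first guess at which inputs are needed at runtime.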

Cyril.


* Re: [PATCH] Add python2-seqmagick.
  2015-09-21 16:13       ` Ludovic Courtès
  2015-09-21 22:36         ` Ben Woodcroft
@ 2015-09-24  5:08         ` Pjotr Prins
  2015-09-24  7:36           ` Ludovic Courtès
  1 sibling, 1 reply; 16+ messages in thread
From: Pjotr Prins @ 2015-09-24  5:08 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel@gnu.org

Hi Ludo,

On Mon, Sep 21, 2015 at 06:13:19PM +0200, Ludovic Courtès wrote:
> Pjotr Prins <pjotr.public12@thebird.nl> skribis:
> 
> > This contains the most lucid description of 'inputs' I have yet seen.
> > Could they go into the main Guix documentation?
> 
> What do you think needs to be changed compared to the text at
> <http://www.gnu.org/software/guix/manual/html_node/package-Reference.html>?

I am aligned with Ben that the current documentation is not clear
enough, especially on the difference between inputs and native-inputs.
It was only when I read the e-mail mentioned above that it started to
click.

You know, that Aha-Erlebnis :)

Are you asking us to come up with an improved text? I'd be happy to try.

Pj.


* Re: [PATCH] Add python2-seqmagick.
  2015-09-24  5:08         ` Pjotr Prins
@ 2015-09-24  7:36           ` Ludovic Courtès
  2015-09-24  8:37             ` R and R modules (and a Ruby twist) Pjotr Prins
  0 siblings, 1 reply; 16+ messages in thread
From: Ludovic Courtès @ 2015-09-24  7:36 UTC (permalink / raw)
  To: Pjotr Prins; +Cc: guix-devel@gnu.org

Pjotr Prins <pjotr.public12@thebird.nl> skribis:

> Hi Ludo,
>
> On Mon, Sep 21, 2015 at 06:13:19PM +0200, Ludovic Courtès wrote:
>> Pjotr Prins <pjotr.public12@thebird.nl> skribis:
>> 
>> > This contains the most lucid description of 'inputs' I have yet seen.
>> > Could they go into the main Guix documentation?
>> 
>> What do you think needs to be changed compared to the text at
>> <http://www.gnu.org/software/guix/manual/html_node/package-Reference.html>?

[...]

> Are you asking us to come with an improved text? 

Yes, that would be perfect.

Ludo’.


* R and R modules (and a Ruby twist)
  2015-09-24  7:36           ` Ludovic Courtès
@ 2015-09-24  8:37             ` Pjotr Prins
  2015-09-24  9:40               ` Ricardo Wurmus
  0 siblings, 1 reply; 16+ messages in thread
From: Pjotr Prins @ 2015-09-24  8:37 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel@gnu.org

When we add an R module, such as R-qtl, the R-build-system does not
provide R itself as a propagated input, i.e., the R interpreter is not
in the profile. In the R world this is kinda odd.  Almost all modules
are used from within R, i.e. you start up R and

  library(qtl)
  do something with R/qtl

That is how most people would use the module: in interactive mode. With
the current package, installing a module does not symlink R into the
profile, so R needs to be installed separately.

Is this really a good idea? 

Personally I think R should be in the path because that is the only
way a module is useful to 99% of users.

We can actually instruct users to install R separately (no different
from other packaging systems). Note that this opens its own can of
worms when the two are out of sync. I can imagine Guix modules having
different R's as dependencies.

It is one other thing I am trying to think through. With a standard R
distribution, every package is strictly aligned with the interpreter
(they get installed from inside R).

With Guix' rolling model of package updates modules may go out of sync
- even if they are correctly linked with an underlying R. So mixing
interpreters and modules/packages may potentially give problems. 

The Ruby twist:

With Ruby we use the major version numbering of the interpreter to
share modules (named gems) - i.e. anything 2.2.x gets installed in the
profile under 2.2.0 - this is the default model in the Ruby world. I
am actually not completely happy with this model because it does not
isolate interpreters and the installed gems. Even though it `works' in
almost 100% of use cases. What happens now, for example, is that a
Ruby with ssl and a Ruby without ssl always share gems. This is
pretty evil. I would prefer incorporating the SHA value of the
interpreter into the profile directory that hosts the gems. That would
guarantee perfect isolation and is pretty easy to achieve.

Pj.


* Re: R and R modules (and a Ruby twist)
  2015-09-24  8:37             ` R and R modules (and a Ruby twist) Pjotr Prins
@ 2015-09-24  9:40               ` Ricardo Wurmus
  2015-09-24 15:16                 ` Pjotr Prins
  0 siblings, 1 reply; 16+ messages in thread
From: Ricardo Wurmus @ 2015-09-24  9:40 UTC (permalink / raw)
  To: Pjotr Prins; +Cc: guix-devel@gnu.org


Pjotr Prins <pjotr.public12@thebird.nl> writes:

> When we add an R module, such as R-qtl, the R-build-system does not
> provide R itself as a propagated input, i.e., the R interpreter is not
> in the profile. In the R world this is kinda odd.  Almost all modules
> used from R. I.e. start up R and
>
>   library(qtl)
>   do something with R/qtl
>
> Would have use people use that module in interactive mode. In the
> current package install R is not included as a symlink and needs to be
> separately installed.

Correct.  I didn’t think of it as a problem as I assumed people would
have R installed in their profile if they wanted to interactively use an
R package.  But now that you mention it, I think it might lead to
problems (see below).

> It is one other thing I am trying to think through. With a standard R
> distribution, every package is strictly aligned with the interpreter
> (they get installed from inside R).
>
> With Guix' rolling model of package updates modules may go out of sync
> - even if they are correctly linked with an underlying R. So mixing
> interpreters and modules/packages may potentially give problems. 

Users can have any number of “libraries” (directories containing
installed R packages) in R_LIBS_SITE.  Currently, our R package suggests
R_LIBS_SITE to be set to “$profile/site-library” and the r-build-system
installs packages to “$out/site-library”.

We could add a level for the R version, e.g. “$out/site-library/3.2.2/”,
but it should be noted that R_LIBS_SITE makes no distinction for
different versions of R.  It’s just a single list of directories.  I
don’t know what would happen if you had

    R_LIBS_SITE=$HOME/site-library/3.2.2:$HOME/site-library/3.1.3

and then ran one or the other version of R.  (Note that currently there
can only be one version of R in a single profile anyway.)

I guess the problem is with updates.  If you had R 3.1.3 in your profile
and installed a new R package that is then built with the latest version
of R (3.2.2), this might lead to problems actually using the package in
an R session using version 3.1.3.

Maybe it would be best to append the R version to the site-library
directory.  I don’t think we should go further than that and bring in
the Guix hash, because I’m willing to trust that packages built with
version 3.2.2 are compatible with R 3.2.2, even if the inputs to our R
package changed and thus the hash is different.
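
The version-qualified layout could look roughly like this.  A sketch in
plain Guile, not the r-build-system code; both procedure names are
hypothetical, and as noted above R itself attaches no meaning to the
version component of the path.

```scheme
;; Illustrative sketch only; procedure names are hypothetical.

(define (site-library-directory prefix r-version)
  ;; "$out/site-library/3.2.2" instead of "$out/site-library".
  (string-append prefix "/site-library/" r-version))

(define (r-libs-site prefixes r-version)
  ;; List only the directories for R-VERSION, so that a session running
  ;; another version of R is never pointed at these packages.
  (string-join (map (lambda (prefix)
                      (site-library-directory prefix r-version))
                    prefixes)
               ":"))

(display (r-libs-site '("/tmp/profile-a" "/tmp/profile-b") "3.2.2"))
(newline)
```

The value of R_LIBS_SITE would then have to be computed per R version,
which is the part a single fixed search path cannot express today.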

~~ Ricardo


* Re: R and R modules (and a Ruby twist)
  2015-09-24  9:40               ` Ricardo Wurmus
@ 2015-09-24 15:16                 ` Pjotr Prins
  2015-09-25  9:14                   ` Ricardo Wurmus
  0 siblings, 1 reply; 16+ messages in thread
From: Pjotr Prins @ 2015-09-24 15:16 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: guix-devel@gnu.org

On Thu, Sep 24, 2015 at 11:40:57AM +0200, Ricardo Wurmus wrote:
> Maybe it would be best to append the R version to the site-library
> directory.  I don’t think we should go further than that and bring in
> the Guix hash, because I’m willing to trust that packages built with
> version 3.2.2 are compatible with R 3.2.2, even if the inputs to our R
> package changed and thus the hash is different.

The exception I can think of is when R provides compile time switches
for blas or ssl (for example). We don't do that now (Nix does!), but
if you had two R's with the same version number, it could just be that
a module 'lifts' that dependency and strictly works with one R (and
not the other).

It is the same for Ruby, Perl, Python, Apache, Firefox, etc. Anything
that allows for building 'site' modules.

I know this is mostly theoretical at this stage, but why not encourage
strict isolation of interpreter+modules? That is the only way we'll
guarantee independence between graphs. Nix/Guix does such a great job
there, and now we allow interpreters to 'leak' their environments,
just because of their convention and our trust in things that ought to
work. And all it costs us is a partial SHA added to the path. So for
Ruby it would be

  ~/.guix-profile/lib/ruby/2.2.0-edb92950/

instead of

  ~/.guix-profile/lib/ruby/2.2.0/

Personally I can live with the status quo, but somehow I prefer the
exact isolation. Maybe it will come when someone gets hurt.
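
The directory naming proposed above can be sketched in a few lines of
plain Guile.  This only illustrates the idea; ruby-api-version and
gem-directory are hypothetical names, and the identifier passed in is
the example value from this message, not a real store hash.

```scheme
;; Illustrative sketch only; procedure names are hypothetical.

(define (ruby-api-version version)
  ;; "2.2.3" -> "2.2.0": today all 2.2.x interpreters share one
  ;; gem directory named after the major.minor series.
  (let ((parts (string-split version #\.)))
    (string-join (list (car parts) (cadr parts) "0") ".")))

(define (gem-directory profile version interpreter-id)
  ;; Append up to eight characters identifying the exact interpreter
  ;; build, so that two variants of one version get separate gems.
  (string-append profile "/lib/ruby/"
                 (ruby-api-version version)
                 "-"
                 (substring interpreter-id
                            0
                            (min 8 (string-length interpreter-id)))))

(display (gem-directory "~/.guix-profile" "2.2.3" "edb92950"))
(newline)
```

With the suffix dropped this gives the current shared layout, so the
change is confined to how the directory name is derived.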

Pj.
 


* Re: R and R modules (and a Ruby twist)
  2015-09-24 15:16                 ` Pjotr Prins
@ 2015-09-25  9:14                   ` Ricardo Wurmus
  0 siblings, 0 replies; 16+ messages in thread
From: Ricardo Wurmus @ 2015-09-25  9:14 UTC (permalink / raw)
  To: Pjotr Prins; +Cc: guix-devel@gnu.org


Pjotr Prins <pjotr.public12@thebird.nl> writes:

> On Thu, Sep 24, 2015 at 11:40:57AM +0200, Ricardo Wurmus wrote:
>> Maybe it would be best to append the R version to the site-library
>> directory.  I don’t think we should go further than that and bring in
>> the Guix hash, because I’m willing to trust that packages built with
>> version 3.2.2 are compatible with R 3.2.2, even if the inputs to our R
>> package changed and thus the hash is different.
>
> The exception I can think of is when R provides compile time switches
> for blas or ssl (for example). We don't do that now (Nix does!), but
> if you had two R's with the same version number, it could just be that
> a module 'lifts' that dependency and strictly works with one R (and
> not the other).

Isn’t that expected, though?  That’s a property of the used version of
R, then, not a problem with the package.

> It is the same for Ruby, Perl, Python, Apache, Firefox, etc. Anything
> that allows for building 'site' modules.

I don’t disagree in general.  There may be cases where the variant of
the build-time dependency must be identical to that used at runtime.
But I don’t think this is true for more than a few special packages.
Take R as an example.  Most packages are written in pure R, and thus
only depend on features provided by R.  What features are provided by
the language depends only on the version, not on configure flags.

If a user builds a variant of R that lacks Cairo, for example, then
certain packages won’t work as intended.  But does this mean that we
need to disallow installing packages that would have reduced feature
sets for a mutilated version of R in that case?

> I know this is mostly theoretical at this stage, but why not encourage
> strict isolation of interpreter+modules? That is the only way we'll
> guarantee independence between graphs. Nix/Guix does such a great job
> there, and now we allow interpreters to 'leak' their environments,
> just because of their convention and our trust in things that ought to
> work. And all it costs us is a partial SHA added to the path. So for
> Ruby it would be
>
>   ~/.guix-profile/lib/ruby/2.2.0-edb92950/
>
> instead of
>
>   ~/.guix-profile/lib/ruby/2.2.0/
>
> Personally I can live with the status quo, but somehow I prefer the
> exact isolation. Maybe it will come when someone gets hurt.

For R, Perl, Ruby and Python we are often forced to propagate inputs, so
that they end up in the profile and can be loaded by looking up the path
to the union in some environment variable, such as R_LIBS_SITE,
GEM_HOME, or PYTHONPATH.  These environment variables do not make a
distinction between versions or variants.  (Only Perl allows for a
distinction between major versions by having the major version number as
part of its environment variable: PERL5LIB.)

How would a *user* make sure to use different sets of packages with
different variants of languages?  At the moment, the only way is to
manually set the environment variable to point to the desired path.

With propagated inputs we cannot achieve as much isolation as we would
like to.  There might be a way to actually patch the mechanisms that
these languages use to load additional libraries/packages, patching them
such that they load dependencies by full path rather than by simple
name, similar to how we patch ‘dlopen()’ calls in C programmes.

Only if we can avoid using these inflexible environment variables can we
achieve the kind of isolation you try to get by adding a partial hash to
the output directory.

Just a data point: last time I checked Ruby’s “require” directive allows
for a full path to be given instead of a simple string.  Might there be
a way to forgo propagating inputs by patching all “require $string”
statements in Ruby sources in a build phase, much like we automatically
patch shebangs?

To note: this would make it impossible for users to override
libraries/modules by adding an alternative directory containing a
modified version of a module to the list of search paths in the
consulted environment variable.  That’s akin to disabling the
LD_LIBRARY_PATH feature in C programmes.
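
The “require” rewriting idea could be sketched roughly as follows.  This
is Python rather than a real Guix build phase, and the function name
“rewrite_requires” as well as the store path in the usage note are made
up for illustration.  It only handles plain string literals; a real
phase would also have to deal with “require_relative”, computed names,
and so on.

```python
import re

# Matches lines such as:  require "foo"   or   require 'foo/bar'
# Only plain string literals are handled; anything computed at run time
# is left untouched.
REQUIRE_RE = re.compile(r"""^([ \t]*require[ \t]+)(['"])([^'"]+)\2""",
                        re.MULTILINE)


def rewrite_requires(source, resolved):
    """Return SOURCE with known library names replaced by absolute paths.

    RESOLVED maps a library name (as written in the Ruby file) to the
    absolute path that should be loaded instead.  Names that are not in
    the mapping are kept as they are.
    """
    def replace(match):
        prefix, quote, name = match.groups()
        target = resolved.get(name)
        if target is None:
            return match.group(0)
        return prefix + quote + target + quote

    return REQUIRE_RE.sub(replace, source)
```

Called with a mapping such as {"foo": "/gnu/store/HASH-ruby-foo/lib/foo"}
(a placeholder path), a line reading require 'foo' would come out as
require '/gnu/store/HASH-ruby-foo/lib/foo', while names missing from the
mapping stay as simple strings.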

~~ Ricardo

* Re: [PATCH] Add python2-seqmagick.
  2015-09-19  9:36   ` Ben Woodcroft
  2015-09-21  7:34     ` Pjotr Prins
@ 2015-09-25 14:09     ` Ricardo Wurmus
  2015-09-25 22:14       ` Ben Woodcroft
  1 sibling, 1 reply; 16+ messages in thread
From: Ricardo Wurmus @ 2015-09-25 14:09 UTC (permalink / raw)
  To: Ben Woodcroft; +Cc: guix-devel@gnu.org


Ben Woodcroft <b.woodcroft@uq.edu.au> writes:

>>> +       ("python-biopython" ,python2-biopython)))
>> And this looks like it should be a propagated input instead.  Have you
>> tried running seqmagick after installing it with this package recipe?  I
>> found that Python executables often require either propagated inputs or
>> wrapping in PYTHONPATH to work without runtime errors.
>
> I did, although not well enough to pick up the error you point out. 
> Actually I'm a bit confused as to the difference between the input types 
> even after reading the manual. Is this a fair summary?

I’m sorry to have confused you here.  “biopython” should *not* be a
propagated input here, because “seqmagick” provides an executable, not a
library.  I was not aware of the fact that the executables are
automatically wrapped here (although the PYTHONPATH is a little too
broad as you also noted).

> Anyway, it seems as if the package would have worked because a wrapper 
> is generated with PYTHONPATH including inputs, propagated-inputs and 
> native-inputs. But this seems a bit strange - why would native inputs be 
> in the runtime wrapper?

Good question.  I think it’s because the wrapping phase just wraps the
scripts in “$out/bin” with whatever the PYTHONPATH variable contains.
It doesn’t construct a minimally sufficient PYTHONPATH.  Maybe that’s
worth changing in the python-build-system?
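
To make “minimally sufficient” a bit more concrete, here is a rough
sketch in Python.  It is not what the python-build-system does; the
function name “minimal_pythonpath” and the store directories in the
usage note are hypothetical, and it assumes the usual
lib/pythonX.Y/site-packages layout.  It also ignores transitively
propagated inputs, which a real implementation would have to follow.

```python
def minimal_pythonpath(inputs, propagated_inputs, python_version="2.7"):
    """Build a PYTHONPATH from runtime inputs only.

    Both arguments are lists of (name, store_directory) pairs.  Native
    inputs are deliberately not accepted, so build-time tools such as a
    test runner cannot end up in the wrapper.
    """
    site = "lib/python" + python_version + "/site-packages"
    entries = []
    for _name, directory in list(inputs) + list(propagated_inputs):
        entry = directory.rstrip("/") + "/" + site
        if entry not in entries:      # keep order, drop duplicates
            entries.append(entry)
    return ":".join(entries)
```

With inputs [("python-biopython", "/gnu/store/HASH-python2-biopython")]
and no propagated inputs, the result is a single entry ending in
lib/python2.7/site-packages, and nothing from the native inputs appears.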

>>> +    (synopsis
>>> +     "Tools for converting and modifying sequence files from the command-line")
>> The synopsis is a bit long.  You could shave off two words like this:
>>
>>    “Command-line tools for converting and modifying sequence files”
>>
>> but that’s not really much better.  I’m open to suggestions.
> "Tools for converting and modifying sequence files"

That’s okay.

>>> +    (description
>>> +     "Bioinformaticians often have to convert sequence files between formats
>>> +and do little manipulations on them, and it's not worth writing scripts for
>>> +that.  Seqmagick is a utility to expose the file format conversion in
>>> +BioPython in a convenient way.  Instead of having a big mess of scripts, there
>>> +is one that takes arguments.")
>>> +    (license license:gpl3)))
>>> +
>> I’m not sure if it’s really “GPLv3 only” or “GPLv3 or later” as there
>> are no license headers anywhere.  Maybe others could comment what’s the
>> proper declaration here.
>  From the readme:
>  >|seqmagick| is free software under the GPL v3.
>
> Is that not straightforward enough?

I’m still not sure, but the explicit mention of “v3” is enough for me to
not write “gpl3+” here.

I’ll push your latest patch with minor modifications (undoing the
“propagated-inputs” confusion I caused and moving the synopsis onto one
line).

Thanks again!

~~ Ricardo

* Re: [PATCH] Add python2-seqmagick.
  2015-09-25 14:09     ` [PATCH] Add python2-seqmagick Ricardo Wurmus
@ 2015-09-25 22:14       ` Ben Woodcroft
  2015-09-28  9:54         ` Ricardo Wurmus
  0 siblings, 1 reply; 16+ messages in thread
From: Ben Woodcroft @ 2015-09-25 22:14 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: guix-devel@gnu.org

Thanks for that, Ricardo. One question though.

On 26/09/15 00:09, Ricardo Wurmus wrote:
> [..]
>   (although the PYTHONPATH is a little too
> broad as you also noted).
I was wondering whether including the native-inputs breaks 
reproducibility. For instance, if we install seqmagick through a 
substitute, then the wrapper will point to a python-nose (a native-input 
of seqmagick) directory in the store, even if this directory does not 
exist. So then, later building python-nose and creating the directory in 
the PYTHONPATH might change the behavior of seqmagick, no?

Thanks,
ben

* Re: [PATCH] Add python2-seqmagick.
  2015-09-25 22:14       ` Ben Woodcroft
@ 2015-09-28  9:54         ` Ricardo Wurmus
  0 siblings, 0 replies; 16+ messages in thread
From: Ricardo Wurmus @ 2015-09-28  9:54 UTC (permalink / raw)
  To: Ben Woodcroft; +Cc: guix-devel@gnu.org


Ben Woodcroft <b.woodcroft@uq.edu.au> writes:

> Thanks for that, Ricardo. One question though.
>
> On 26/09/15 00:09, Ricardo Wurmus wrote:
>> [..]
>>   (although the PYTHONPATH is a little too
>> broad as you also noted).
> I was wondering whether including the native-inputs breaks 
> reproducibility. For instance, if we install seqmagick through a 
> substitute, then the wrapper will point to a python-nose (a native-input 
> of seqmagick) directory in the store, even if this directory does not 
> exist. So then, later building python-nose and creating the directory in 
> the PYTHONPATH might change the behavior of seqmagick, no?

I think this possibility does in fact exist, but it usually isn’t a
problem unless the programme were designed to behave differently in the
presence of the native-input.

If, for example, python-nose provided a conflicting module overriding
one used by seqmagick at runtime, this could conceivably lead to an
error if python-nose were added to the store.

I do not think that this is a realistic problem, but wrapping an
executable in a PYTHONPATH that is needlessly large is certainly not
nice.
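
The effect can be imitated in a few lines of Python.  This is only a
sketch with temporary directories standing in for store items; the
module name “shared_mod” and the function “demo” are invented, and the
cache handling merely simulates what a fresh run of the wrapper would
see.  A search path entry that does not exist is skipped, and once the
directory appears it takes precedence.

```python
import importlib
import os
import shutil
import sys
import tempfile


def demo():
    """Return the module's ORIGIN before and after the late directory appears."""
    root = tempfile.mkdtemp()
    runtime_dir = os.path.join(root, "runtime")  # always present
    late_dir = os.path.join(root, "late")        # missing at first
    os.makedirs(runtime_dir)
    with open(os.path.join(runtime_dir, "shared_mod.py"), "w") as f:
        f.write("ORIGIN = 'runtime input'\n")

    saved_path = list(sys.path)
    # The missing directory is listed first, as it could be in a wrapper.
    sys.path[:0] = [late_dir, runtime_dir]
    try:
        importlib.invalidate_caches()
        before = importlib.import_module("shared_mod").ORIGIN

        # Later the directory shows up and ships a module of the same name.
        os.makedirs(late_dir)
        with open(os.path.join(late_dir, "shared_mod.py"), "w") as f:
            f.write("ORIGIN = 'native input'\n")

        # Forget cached state to imitate a fresh run of the wrapper.
        del sys.modules["shared_mod"]
        sys.path_importer_cache.pop(late_dir, None)
        importlib.invalidate_caches()
        after = importlib.import_module("shared_mod").ORIGIN
    finally:
        sys.path[:] = saved_path
        sys.modules.pop("shared_mod", None)
        shutil.rmtree(root, ignore_errors=True)
    return before, after
```

Here demo() first loads the module from the runtime directory and, after
the second directory has been created, loads the conflicting one from
there instead.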

~~ Ricardo

end of thread, other threads:[~2015-09-28  9:54 UTC | newest]

Thread overview: 16+ messages
2015-09-17 11:47 [PATCH] Add python2-seqmagick Ben Woodcroft
2015-09-17 15:51 ` Ricardo Wurmus
2015-09-19  9:36   ` Ben Woodcroft
2015-09-21  7:34     ` Pjotr Prins
2015-09-21 16:13       ` Ludovic Courtès
2015-09-21 22:36         ` Ben Woodcroft
2015-09-24  5:08         ` Pjotr Prins
2015-09-24  7:36           ` Ludovic Courtès
2015-09-24  8:37             ` R and R modules (and a Ruby twist) Pjotr Prins
2015-09-24  9:40               ` Ricardo Wurmus
2015-09-24 15:16                 ` Pjotr Prins
2015-09-25  9:14                   ` Ricardo Wurmus
2015-09-25 14:09     ` [PATCH] Add python2-seqmagick Ricardo Wurmus
2015-09-25 22:14       ` Ben Woodcroft
2015-09-28  9:54         ` Ricardo Wurmus
2015-09-21 22:41   ` Cyril Roelandt

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git
