unofficial mirror of guix-devel@gnu.org 
* bioinformatics.scm vs bioconductor.scm ?
@ 2018-12-11 18:21 zimoun
  2018-12-12  3:44 ` Ricardo Wurmus
  0 siblings, 1 reply; 10+ messages in thread
From: zimoun @ 2018-12-11 18:21 UTC (permalink / raw)
  To: Guix Devel

Dear,

Thank you for the nice importers.

I am not sure I understand the rule for assigning a package to
bioinformatics.scm or to bioconductor.scm when it comes from
Bioconductor.

For example, the package DESeq2 from Bioconductor is in bioinformatics.scm.
To be concrete,
  $ grep -e bioconductor-uri bioconductor.scm | wc -l
33
  $ grep -e bioconductor-uri bioinformatics.scm  | wc -l
116

Lastly, the packages in the files are not in alphabetical order.
How is the order determined?


Thank you in advance.

All the best,
simon


* Re: bioinformatics.scm vs bioconductor.scm ?
  2018-12-11 18:21 bioinformatics.scm vs bioconductor.scm ? zimoun
@ 2018-12-12  3:44 ` Ricardo Wurmus
  2018-12-12 11:42   ` zimoun
  0 siblings, 1 reply; 10+ messages in thread
From: Ricardo Wurmus @ 2018-12-12  3:44 UTC (permalink / raw)
  To: zimoun; +Cc: Guix Devel


zimoun <zimon.toutoune@gmail.com> writes:

> I am not sure I understand the rule for assigning a package to
> bioinformatics.scm or to bioconductor.scm when it comes from
> Bioconductor.

bioinformatics.scm was there first.  Later I added bioconductor.scm
because I didn’t want bioinformatics.scm to eventually be full of R
packages.

New Bioconductor packages should go to bioconductor.scm.  Eventually we
may move all remaining R packages from bioinformatics to
bioconductor.scm.

> Lastly, the packages in the files are not in alphabetical order.
> How is the order determined?

Whichever package is added first comes first; later packages are usually
added to the bottom of the file.  The order has no significance.

--
Ricardo


* Re: bioinformatics.scm vs bioconductor.scm ?
  2018-12-12  3:44 ` Ricardo Wurmus
@ 2018-12-12 11:42   ` zimoun
  2018-12-12 12:45     ` Ricardo Wurmus
  0 siblings, 1 reply; 10+ messages in thread
From: zimoun @ 2018-12-12 11:42 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: Guix Devel

Thank you for the explanations.


> New Bioconductor packages should go to bioconductor.scm.  Eventually we
> may move all remaining R packages from bioinformatics to
> bioconductor.scm.

I am a bit confused.
The file bioconductor.scm contains (or will contain) all R packages
from Bioconductor, right?
But what about R packages from CRAN that are used in bioinformatics?
Do they go in bioconductor.scm or bioinformatics.scm?

I am also wondering whether a bulk import from Bioconductor would be
possible.


* Re: bioinformatics.scm vs bioconductor.scm ?
  2018-12-12 11:42   ` zimoun
@ 2018-12-12 12:45     ` Ricardo Wurmus
  2018-12-18 11:31       ` zimoun
  0 siblings, 1 reply; 10+ messages in thread
From: Ricardo Wurmus @ 2018-12-12 12:45 UTC (permalink / raw)
  To: zimoun; +Cc: Guix Devel


zimoun <zimon.toutoune@gmail.com> writes:

> Thank you for the explanations.
>
>
>> New Bioconductor packages should go to bioconductor.scm.  Eventually we
>> may move all remaining R packages from bioinformatics to
>> bioconductor.scm.
>
> I am a bit confused.
> The file bioconductor.scm contains (or will contain) all R packages
> from Bioconductor, right?

Correct.

> But what about R packages from CRAN that are used in bioinformatics?
> Do they go in bioconductor.scm or bioinformatics.scm?

Neither :)  We put them in cran.scm.  At least that’s the new way of
doing this.  Previously it was all ad-hoc, meaning that packages would
end up in bioinformatics.scm…

Ideally, bioinformatics.scm would only contain non-R tools like
samtools, bamtools, bioinfo pipelines, etc.
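
The placement rule above can be summarized as a small decision procedure.
This is only an illustrative sketch in plain Guile; the procedure name
`target-file` and the origin symbols are hypothetical and do not exist in
Guix.

```scheme
;; Hypothetical helper, not part of Guix: return the file in which a new
;; package definition belongs, following the rule described in this message.
(define (target-file origin)
  (case origin
    ;; R packages from Bioconductor
    ((bioconductor) "gnu/packages/bioconductor.scm")
    ;; R packages from CRAN, even when used for bioinformatics
    ((cran)         "gnu/packages/cran.scm")
    ;; non-R bioinformatics tools such as samtools or bamtools
    ((non-r-tool)   "gnu/packages/bioinformatics.scm")
    (else (error "unknown package origin:" origin))))

(display (target-file 'cran))
(newline)
```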

> I am also wondering whether a bulk import from Bioconductor would be
> possible.

Certainly!  I’ve done this before actually, but I hit two minor
problems:

1. the bioconductor recursive importer does not *automatically* switch
   to “CRAN mode” when a dependent package isn’t found on Bioconductor.
   Not a big problem, but it means that the import isn’t fully
   automatic.

2. compiling big Guile modules (such as a future (gnu packages cran))
   requires lots of memory since Guile 2.2(?), so I didn’t add all these
   packages.  This is a bug and we’d have to split the module, probably,
   to work around it.

--
Ricardo


* Re: bioinformatics.scm vs bioconductor.scm ?
  2018-12-12 12:45     ` Ricardo Wurmus
@ 2018-12-18 11:31       ` zimoun
  2018-12-18 18:26         ` Björn Höfling
  2018-12-18 22:45         ` Ricardo Wurmus
  0 siblings, 2 replies; 10+ messages in thread
From: zimoun @ 2018-12-18 11:31 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: Guix Devel

Dear Ricardo,

Thank you for your explanations.

> > I am also wondering whether a bulk import from Bioconductor would be
> > possible.
>
> Certainly!  I’ve done this before actually, but I hit two minor
> problems:
>
> 1. the bioconductor recursive importer does not *automatically* switch
>    to “CRAN mode” when a dependent package isn’t found on Bioconductor.
>    Not a big problem, but it means that the import isn’t fully
>    automatic.

I am not sure I understand.
Is the bioconductor importer usable from `guix import`?

I tried it once by hand, to understand it step by step. :-)
Thanks to Pierre for the nice tutorial!
See the package definition below.


> 2. compiling big Guile modules (such as a future (gnu packages cran))
>    requires lots of memory since Guile 2.2(?), so I didn’t add all these
>    packages.  This is a bug and we’d have to split the module, probably,
>    to work around it.

OK, although I have no idea how to work around it.



So, as I need the flowCore package to process cytometry data, let me
start my first attempt with this one. :-)

This package is on Bioconductor:
https://bioconductor.org/packages/release/bioc/html/flowCore.html
Defining the package by hand was then straightforward! :-)
I am not sure it is compliant...  Basically, I just copied and pasted
the modules from bioinformatics.scm (or bioconductor.scm), then
checked whether each dependency was already there.  r-corpcor from CRAN
was missing, so I used `guix import cran`.

Hmm, flowCore needs BiocGenerics version >= 0.1.14, and this is not
expressed in the package definition.
Also, the packages grDevices, graphics, methods, stats, and stats4 are
required (see the Bioconductor web page) but are not defined anywhere.
Is that OK?

What is the convention about license ?
(license name) or (license license:name)

Well, then I ran guix package --install-from-file=my.scm and it
seems to work; I mean, the first examples from the vignette do not
complain. ;-)


Thank you for any comments.


All the best
simon

--

(define-module (gnu packages my-bioinformatics)
  #:use-module ((guix licenses) #:prefix license:)
  #:use-module (guix packages)
  #:use-module (guix utils)
  #:use-module (guix download)
  #:use-module (guix git-download)
  #:use-module (guix hg-download)
  #:use-module (guix build-system ant)
  #:use-module (guix build-system gnu)
  #:use-module (guix build-system cmake)
  #:use-module (guix build-system haskell)
  #:use-module (guix build-system ocaml)
  #:use-module (guix build-system perl)
  #:use-module (guix build-system python)
  #:use-module (guix build-system r)
  #:use-module (guix build-system ruby)
  #:use-module (guix build-system scons)
  #:use-module (guix build-system trivial)
  #:use-module (gnu packages)
  #:use-module (gnu packages autotools)
  #:use-module (gnu packages algebra)
  #:use-module (gnu packages base)
  #:use-module (gnu packages bash)
  #:use-module (gnu packages bison)
  #:use-module (gnu packages bioconductor)
  #:use-module (gnu packages bioinformatics)
  #:use-module (gnu packages boost)
  #:use-module (gnu packages check)
  #:use-module (gnu packages compression)
  #:use-module (gnu packages cpio)
  #:use-module (gnu packages cran)
  #:use-module (gnu packages curl)
  #:use-module (gnu packages documentation)
  #:use-module (gnu packages databases)
  #:use-module (gnu packages datastructures)
  #:use-module (gnu packages file)
  #:use-module (gnu packages flex)
  #:use-module (gnu packages gawk)
  #:use-module (gnu packages gcc)
  #:use-module (gnu packages gd)
  #:use-module (gnu packages gtk)
  #:use-module (gnu packages glib)
  #:use-module (gnu packages graph)
  #:use-module (gnu packages groff)
  #:use-module (gnu packages guile)
  #:use-module (gnu packages haskell)
  #:use-module (gnu packages haskell-check)
  #:use-module (gnu packages haskell-web)
  #:use-module (gnu packages image)
  #:use-module (gnu packages imagemagick)
  #:use-module (gnu packages java)
  #:use-module (gnu packages jemalloc)
  #:use-module (gnu packages dlang)
  #:use-module (gnu packages linux)
  #:use-module (gnu packages logging)
  #:use-module (gnu packages machine-learning)
  #:use-module (gnu packages man)
  #:use-module (gnu packages maths)
  #:use-module (gnu packages mpi)
  #:use-module (gnu packages ncurses)
  #:use-module (gnu packages ocaml)
  #:use-module (gnu packages pcre)
  #:use-module (gnu packages parallel)
  #:use-module (gnu packages pdf)
  #:use-module (gnu packages perl)
  #:use-module (gnu packages perl-check)
  #:use-module (gnu packages pkg-config)
  #:use-module (gnu packages popt)
  #:use-module (gnu packages protobuf)
  #:use-module (gnu packages python)
  #:use-module (gnu packages python-web)
  #:use-module (gnu packages readline)
  #:use-module (gnu packages ruby)
  #:use-module (gnu packages serialization)
  #:use-module (gnu packages shells)
  #:use-module (gnu packages statistics)
  #:use-module (gnu packages swig)
  #:use-module (gnu packages tbb)
  #:use-module (gnu packages tex)
  #:use-module (gnu packages texinfo)
  #:use-module (gnu packages textutils)
  #:use-module (gnu packages time)
  #:use-module (gnu packages tls)
  #:use-module (gnu packages vim)
  #:use-module (gnu packages web)
  #:use-module (gnu packages xml)
  #:use-module (gnu packages xorg)
  #:use-module (srfi srfi-1)
  #:use-module (ice-9 match))


(define-public r-corpcor
  (package
   (name "r-corpcor")
   (version "1.6.9")
   (source
    (origin
     (method url-fetch)
     (uri (cran-uri "corpcor" version))
     (sha256
      (base32
       "1hi3i9d3841snppq1ks5pd8cliq1b4rm4dpsczmfqvwksg8snkrf"))))
   (build-system r-build-system)
   (home-page
    "http://strimmerlab.org/software/corpcor/")
   (synopsis
    "Efficient Estimation of Covariance and (Partial) Correlation")
   (description
    "Implements a James-Stein-type shrinkage estimator for the
covariance matrix, with separate shrinkage for variances and
correlations.  The details of the method are explained in Schafer and
Strimmer (2005) <DOI:10.2202/1544-6115.1175> and Opgen-Rhein and
Strimmer (2007) <DOI:10.2202/1544-6115.1252>.  The approach is both
computationally as well as statistically very efficient, it is
applicable to \"small n, large p\" data, and always returns a positive
definite and well-conditioned covariance matrix.  In addition to
inferring the covariance matrix the package also provides shrinkage
estimators for partial correlations and partial variances.  The
inverse of the covariance and correlation matrix can be efficiently
computed, as well as any arbitrary power of the shrinkage correlation
matrix.  Furthermore, functions are available for fast singular value
decomposition, for computing the pseudoinverse, and for checking the
rank and positive definiteness of a matrix.")
   (license license:gpl3+)))


(define-public r-flowcore
  (package
    (name "r-flowcore")
    (version "1.48.0")
    (source
     (origin
       (method url-fetch)
       (uri (bioconductor-uri "flowCore" version))
       (sha256
        (base32
         "16mh3xlrcxkrqvhv3pry325jzsz97yg84ya8rpvd2lvlpqrz6k3h"))))
    (build-system r-build-system)
    (propagated-inputs
     `(("r-biobase" ,r-biobase)
       ("r-biocgenerics" ,r-biocgenerics)
       ("r-biocmanager" ,r-biocmanager)
       ("r-bh" ,r-bh)
       ("r-graph" ,r-graph)
       ("r-rrcov" ,r-rrcov)
       ("r-r-utils" ,r-r-utils)
       ("r-corpcor" ,r-corpcor)
       ("r-rcpp" ,r-rcpp)
       ("r-matrixstats" ,r-matrixstats)
       ("r-mass" ,r-mass)))
    (inputs
     `(("zlib" ,zlib)))
    (home-page "https://bioconductor.org/packages/flowCore")
    (synopsis "Basic structures for flow cytometry data")
    (description
     "Provides S4 data structures and basic functions to deal with
flow cytometry data.")
    (license license:artistic2.0)))


;; Dependencies listed on the Bioconductor page but not declared above:
;;   BiocGenerics (>= 0.1.14),
;;   grDevices, graphics, methods,
;;   stats, stats4


r-flowcore


* Re: bioinformatics.scm vs bioconductor.scm ?
  2018-12-18 11:31       ` zimoun
@ 2018-12-18 18:26         ` Björn Höfling
  2018-12-18 18:36           ` zimoun
  2018-12-18 22:45         ` Ricardo Wurmus
  1 sibling, 1 reply; 10+ messages in thread
From: Björn Höfling @ 2018-12-18 18:26 UTC (permalink / raw)
  To: zimoun; +Cc: Guix Devel


On Tue, 18 Dec 2018 06:31:44 -0500
zimoun <zimon.toutoune@gmail.com> wrote:

> What is the convention about license ?
> (license name) or (license license:name)

Just about this point: This is not a "convention", this is part of the
language definition of Guile, the underlying Scheme implementation:

In the module gnu/packages/cran.scm (and many others too) you find:

(define-module (gnu packages cran)
  #:use-module ((guix licenses) #:prefix license:)
  #:use-module (guix packages)
[...]
)

That means: use everything from module "guix licenses" and prefix it
with "license:". So, in the cran module, you must use "license:name" to
use the publicly defined "name" from the "guix licenses" module. 

In other packages that import "guix licenses" without the prefix, you
use "name" directly. See gnu/packages/scsi.scm for an example.
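
The effect of #:prefix can be tried in plain Guile, without Guix.  The
sketch below is only an illustration: it imports the standard SRFI-1 module
instead of (guix licenses), and the prefix "list:" is an arbitrary choice
made for this example.

```scheme
;; Import SRFI-1 with a prefix: every binding it exports is made
;; available under a name that starts with "list:".
(use-modules ((srfi srfi-1) #:prefix list:))

;; SRFI-1's `fold' is therefore reachable as `list:fold'.
(display (list:fold + 0 '(1 2 3)))   ; prints 6
(newline)
```

The same mechanism turns gpl3+ from (guix licenses) into license:gpl3+ in
modules that import it with the "license:" prefix.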

Björn




* Re: bioinformatics.scm vs bioconductor.scm ?
  2018-12-18 18:26         ` Björn Höfling
@ 2018-12-18 18:36           ` zimoun
  2018-12-18 22:49             ` Ricardo Wurmus
  0 siblings, 1 reply; 10+ messages in thread
From: zimoun @ 2018-12-18 18:36 UTC (permalink / raw)
  To: Björn Höfling; +Cc: Guix Devel

Dear,

Thank you for your explanations.
And sorry if I am still slow to understand.

> > What is the convention about license ?
> > (license name) or (license license:name)
>
> Just about this point: This is not a "convention", this is part of the
> language definition of Guile, the underlying Scheme implementation:

I understand this. :-)

>
> In the module gnu/packages/cran.scm (and many others too) you find:
>
> (define-module (gnu packages cran)
>   #:use-module ((guix licenses) #:prefix license:)
>   #:use-module (guix packages)
> [...]
> )
>
> That means: use everything from module "guix licenses" and prefix it
> with "license:". So, in the cran module, you must use "license:name" to
> use the publicly defined "name" from the "guix licenses" module.

OK, but this convention in cran.scm is not consistent with the
importer, for example. :-)
  guix import cran corpcor -r
fills the license field with (license gpl3+) and not (license license:gpl3+)

In other words, why does cran.scm need a prefix for the license field?

> In other packages that import "guix licenses" without the prefix, you
> use "name" directly. See gnu/packages/scsi.scm for an example.

Ok.
But is there a convention, or an explanation, for why some modules use a
prefix (e.g. cran.scm) and others do not (e.g. scsi.scm)?

Thank you again for your explanations.

Best regards,
simon


* Re: bioinformatics.scm vs bioconductor.scm ?
  2018-12-18 11:31       ` zimoun
  2018-12-18 18:26         ` Björn Höfling
@ 2018-12-18 22:45         ` Ricardo Wurmus
  2018-12-20  8:46           ` Ricardo Wurmus
  1 sibling, 1 reply; 10+ messages in thread
From: Ricardo Wurmus @ 2018-12-18 22:45 UTC (permalink / raw)
  To: zimoun; +Cc: Guix Devel

Hi,

> Is the bioconductor importer usable from `guix import`?

Yes.  You may encounter minor problems when using the recursive
bioconductor importer, as it may try to look up CRAN packages on
Bioconductor.

> This package is on Bioconductor:
> https://bioconductor.org/packages/release/bioc/html/flowCore.html

I’d do

    ./pre-inst-env guix import cran -a bioconductor -r flowCore

This fails because it wants corpcor from CRAN.  So we do:

    ./pre-inst-env guix import cran -r corpcor

We dump the result (with minor changes) in (gnu packages cran) and try
again to import flowCore.  This time it succeeds.

> Hmm, flowCore needs BiocGenerics version >= 0.1.14, and this is not
> expressed in the package definition.

We have r-biocgenerics 0.28.0 in gnu/packages/bioinformatics.scm.
That’s one of the packages that should move eventually.

> Also, the packages grDevices, graphics, methods, stats, and stats4 are
> required (see the Bioconductor web page) but are not defined anywhere.
> Is that OK?

These are all default packages that are part of R itself.  The importer
skips them.
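
The skipping step can be pictured as a simple filter over the dependency
names.  This is a plain-Guile sketch, not the importer's code: the
procedure name `external-dependencies` is hypothetical, and the list of
default packages is an assumption that may differ from the one the importer
really uses.

```scheme
(use-modules (srfi srfi-1))   ; for `remove'

;; Assumption: packages that ship with R itself (illustrative list).
(define %default-r-packages
  '("base" "compiler" "datasets" "grDevices" "graphics" "grid" "methods"
    "parallel" "splines" "stats" "stats4" "tcltk" "tools" "utils"))

;; Hypothetical helper: drop the dependencies that are part of R itself,
;; keeping only those that need a package definition of their own.
(define (external-dependencies names)
  (remove (lambda (name) (member name %default-r-packages))
          names))

;; Some of the dependencies of flowCore mentioned in this thread.
(define flowcore-dependencies
  '("BiocGenerics" "grDevices" "graphics" "methods" "stats" "stats4"
    "corpcor"))

(write (external-dependencies flowcore-dependencies))
(newline)
;; prints ("BiocGenerics" "corpcor")
```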

> What is the convention about license ?
> (license name) or (license license:name)

This depends on the target module.  cran.scm, bioinformatics.scm, and
bioconductor.scm all use the “license:” prefix.  web.scm on the other
hand uses the “l:” prefix.  Take a look at the #:use-module clause at
the top of the module.

--
Ricardo


* Re: bioinformatics.scm vs bioconductor.scm ?
  2018-12-18 18:36           ` zimoun
@ 2018-12-18 22:49             ` Ricardo Wurmus
  0 siblings, 0 replies; 10+ messages in thread
From: Ricardo Wurmus @ 2018-12-18 22:49 UTC (permalink / raw)
  To: zimoun; +Cc: Guix Devel


zimoun <zimon.toutoune@gmail.com> writes:

> OK, but this convention in cran.scm is not consistent with the
> importer, for example. :-)
>   guix import cran corpcor -r
> fills the license field with (license gpl3+) and not (license license:gpl3+)

That’s right.  The importer does not know where the generated package
definition is supposed to be used.
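
Adapting the importer's output to a module that uses a prefix is a small
mechanical change.  As an illustration only, the plain-Guile sketch below
rewrites the license field of a package S-expression; the procedure
`prefix-license` is hypothetical and not part of Guix, and it only handles
a license field that holds a single symbol.

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper, not part of Guix: prepend PREFIX to the symbol in
;; the (license ...) field of a package S-expression.  All other fields,
;; and license fields that are not a single symbol, are left unchanged.
(define (prefix-license package-sexp prefix)
  (match package-sexp
    (('package fields ...)
     `(package
        ,@(map (match-lambda
                 (('license (? symbol? name))
                  `(license ,(symbol-append prefix name)))
                 (field field))
               fields)))))

;; Abridged form of what the importer prints for corpcor.
(define imported
  '(package
     (name "r-corpcor")
     (version "1.6.9")
     (license gpl3+)))

(write (prefix-license imported 'license:))
(newline)
;; prints (package (name "r-corpcor") (version "1.6.9") (license license:gpl3+))
```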

> In other words, why does cran.scm need a prefix for the license field?

It uses a prefix because we use the “zlib” package often, but not the
“zlib” license.  We could exclude the “zlib” license from (guix
licenses), or import only a specified list of licenses, or we can solve
this naming conflict by prefixing all values from (guix licenses) with
“license:” (or anything else, really).

Really small modules often don’t have this problem in the first place,
so they don’t need to find a solution to work around the naming
conflicts.
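
The three options can be compared in a self-contained Guile file.  This is
a sketch under stated assumptions: the modules (demo licenses) and
(demo compression) are hypothetical stand-ins for (guix licenses) and for a
package module that exports zlib, and their values are plain symbols rather
than real licenses and packages.

```scheme
;; Stand-in for (guix licenses): exports a license named `zlib'.
(define-module (demo licenses)
  #:export (zlib gpl3+))
(define zlib 'zlib-license)
(define gpl3+ 'gpl3+-license)

;; Stand-in for a package module: exports a package named `zlib'.
(define-module (demo compression)
  #:export (zlib))
(define zlib 'zlib-package)

;; Option 1: exclude the conflicting license from the import.
(define-module (demo user-hide)
  #:use-module ((demo licenses) #:hide (zlib))
  #:use-module (demo compression))
(write (list zlib gpl3+)) (newline)

;; Option 2: import only a chosen list of licenses.
(define-module (demo user-select)
  #:use-module ((demo licenses) #:select (gpl3+))
  #:use-module (demo compression))
(write (list zlib gpl3+)) (newline)

;; Option 3: prefix everything that comes from the license module.
(define-module (demo user-prefix)
  #:use-module ((demo licenses) #:prefix license:)
  #:use-module (demo compression))
(write (list zlib license:zlib license:gpl3+)) (newline)
```

In all three user modules the plain name zlib refers to the package; only
the third one still gives access to the zlib license, as license:zlib.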

--
Ricardo


* Re: bioinformatics.scm vs bioconductor.scm ?
  2018-12-18 22:45         ` Ricardo Wurmus
@ 2018-12-20  8:46           ` Ricardo Wurmus
  0 siblings, 0 replies; 10+ messages in thread
From: Ricardo Wurmus @ 2018-12-20  8:46 UTC (permalink / raw)
  To: zimoun; +Cc: Guix Devel


Ricardo Wurmus <rekado@elephly.net> writes:

>> Is the bioconductor importer usable from `guix import`?
>
> Yes.  You may encounter minor problems when using the recursive
> bioconductor importer, as it may try to look up CRAN packages on
> Bioconductor.

This is now fixed in commit 10a1cacb1.  The importer will retry the
import from CRAN when the package cannot be found on Bioconductor.

So this works now:

    ./pre-inst-env guix import cran -a bioconductor -r flowCore
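
The retry behaviour described above amounts to "try Bioconductor first,
then CRAN".  The plain-Guile sketch below only illustrates that control
flow; it is not the code from the commit.  The association lists stand in
for the real archives, and the procedure names are hypothetical.

```scheme
;; Toy repositories standing in for the real archives (assumption: the
;; real importer queries the network rather than association lists).
(define %bioconductor '(("flowCore" . "1.48.0")))
(define %cran         '(("corpcor"  . "1.6.9")))

;; Hypothetical lookup: return (NAME VERSION REPOSITORY-NAME), or #f when
;; NAME is not in REPOSITORY.
(define (lookup name repository-name repository)
  (let ((entry (assoc name repository)))
    (and entry
         (list (car entry) (cdr entry) repository-name))))

;; Try Bioconductor first; fall back to CRAN when the package is missing.
(define (import-with-fallback name)
  (or (lookup name 'bioconductor %bioconductor)
      (lookup name 'cran %cran)
      (error "package not found in any repository:" name)))

(write (import-with-fallback "flowCore")) (newline)
;; prints ("flowCore" "1.48.0" bioconductor)
(write (import-with-fallback "corpcor")) (newline)
;; prints ("corpcor" "1.6.9" cran)
```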


-- 
Ricardo

