From: Simon TOURNIER <simon.tournier@inserm.fr>
To: "guix-science@gnu.org" <guix-science@gnu.org>
Cc: Konrad Hinsen <konrad.hinsen@cnrs.fr>
Subject: Reproducibility for Python and beyond
Date: Fri, 14 Apr 2023 10:43:37 +0000
Message-ID: <73307ac3c0ef44ea9dcdc220a2307506@inserm.fr>

Hi Konrad, all,

For French speakers, here is an interesting presentation by Konrad on the state of Python for scientific computing and reproducibility.

https://reproducibility.gricad-pages.univ-grenoble-alpes.fr/web/presentation_110423.html#presentation_110423

Even without watching the video, here are the questions I would like to discuss. :-)

1. Considering Konrad's schema of a scientific computation (Model --technical choices--> Code --computational env--> Results), there are also technical choices about the computational environment, but they are implicit, and often impossible to scrutinize for lack of transparency.  The key, IMHO, is not the determinism of the computation but its transparency.  Determinism is one means of obtaining transparency, but not the only one.  For instance, checking determinism is not affordable for very intensive computations, which are not feasible to repeat.  How should we think about determinism in the statistical training of machine learning models?  In other words, in some cases the "compilation" (Code -> Results) of the scientific model is too costly to redo.

2. "Redoing" a computation is only possible when the citation is correct.  Inria is, in a way, proposing <https://hal.science/hal-02135891> together with the BibLaTeX style <https://mirrors.ircam.fr/pub/CTAN/macros/latex/contrib/biblatex-contrib/biblatex-software/software-biblatex.pdf>.  However, this captures, at best, some of the technical choices made when implementing the model, and it does not capture the complete computational environment at all.  What are your ideas for tackling this citation issue?

For instance, the file produced by "guix describe -f channels" is one means of capturing (and citing too!) a computational environment.  Do we need to make it more popular?  How can we link it with the archiving of source code (relying on SWH, say)?
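
To make the SWH question a bit more concrete, here is a minimal Python sketch of one possible link, not a proposal for how it must be done.  It computes the Software Heritage content identifier (SWHID of type "cnt") of a channels file, using the same hash that Git uses for blobs.  The function name swhid_for_content and the channels text are hypothetical; in particular the commit is an all-zero placeholder, not a real Guix revision.  Whether a content SWHID of the channels file is the right thing to cite remains an open question.

```python
import hashlib


def swhid_for_content(data: bytes) -> str:
    """Return the SWHID (type 'cnt') for raw file content.

    The identifier is the SHA-1 of a Git blob object:
    the header b"blob <length>\\0" followed by the content itself.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + data).hexdigest()
    return f"swh:1:cnt:{digest}"


# Hypothetical output of "guix describe -f channels", saved as text.
# The commit below is a placeholder, not a real revision.
EXAMPLE_CHANNELS = b"""(list (channel
        (name 'guix)
        (url "https://git.savannah.gnu.org/git/guix.git")
        (commit "0000000000000000000000000000000000000000")))
"""

if __name__ == "__main__":
    # This identifier could accompany the citation of a computation,
    # provided the channels file itself has been archived.
    print(swhid_for_content(EXAMPLE_CHANNELS))
```

The identifier only pins the bytes of the channels file; the file must still be archived somewhere (a published repository, say) for the identifier to resolve to anything.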


Cheers,
simon


