* User & Developer Meetup on Sept. 27th: quick wrap-up
From: Simon Tournier @ 2021-09-28  7:22 UTC
  To: guix-science


On Monday, September 27th, about 10 people met for about 2 hours to map
out actions for the coming year around Guix in scientific contexts, from
reproducible research to high-performance computing. Here is a quick
wrap-up!

Do not hesitate to drop an email about something you would like to see,
or feel free to join #guix-hpc to discuss one specific item. Or please
go ahead and help make it happen. :-)

* Organizing training sessions and workshops

  + we need more!
    how do we give more visibility to the material that already exists?
  + sharing material à la Software Carpentry
  + "format" (package?) to easily share this material
  + sharing reading lists
    (the idea behind this is to have a weekly/monthly/? "newsletter")
  + PRACE-like EU training sessions 
    (seasonal schools: summer, winter, etc.)
  + outreach effort
  + Libraries (jackhill): for archival as well as reproducing library

* Funding opportunities

  + EU project participation and grants to fund specific tasks
  + BIMSB aims to secure funding for PiGx, which uses Guix and Guile.
    There is a good chance to hire a person to hack on this in the
    coming year
  + research topics
  + funding a few months of development/integration work
    (Software Heritage)
  + "keeping in touch", sharing opportunities
    (share calls for funding on topics: reproducibility on HPC,
     bioinformatics, etc.)
  + join GSoC or other organizations
  + run mentoring programs under our own organization?

* Long-term archival using Software Heritage and Disarchive

  + status: "guix lint -c archival" (git-fetch)
    and (url-fetch)
    missing tools to archive svn-fetch (and hg-fetch and minor others:
    CVS, etc.)
  + Disarchive can be used to "rebuild" tarballs 
    from content (SWH) and metadata (disarchive-DB)
  + Data Service or Cuirass (Berlin)?
  + Timothy Sample started to create a Disarchive database 
    for previous Guix releases
  + left to be done:
    + build/maintain/publish the Disarchive database
    + archive the database
      (-> coordinate with Software Heritage)
  + metrics: what's the archive coverage on Software Heritage?
    what's the coverage of the Disarchive database?
  + want to contribute? email Simon Tournier or the mailing list
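
To illustrate the "rebuild tarballs from content and metadata" idea
above, here is a minimal Python sketch. It is not Disarchive's actual
format or API: the metadata field names and the `rebuild_tarball` helper
are hypothetical, and the "original" tarball is simulated locally. It
only shows that file contents plus recorded headers and ordering are
enough to get back a bit-identical uncompressed tarball, which can then
be checked against a recorded hash.

```python
import hashlib
import io
import tarfile


def rebuild_tarball(contents, metadata):
    """Rebuild an uncompressed tarball from contents plus metadata.

    contents: dict mapping member name -> bytes (the "content" side)
    metadata: list of dicts, one per member, in archive order
              (the "metadata" side; field names are hypothetical)
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w",
                      format=tarfile.USTAR_FORMAT) as tar:
        for entry in metadata:
            data = contents[entry["name"]]
            info = tarfile.TarInfo(name=entry["name"])
            info.size = len(data)
            # Header fields come from the metadata, not from the
            # local file system, so the result is deterministic.
            info.mode = entry["mode"]
            info.mtime = entry["mtime"]
            info.uid = entry["uid"]
            info.gid = entry["gid"]
            info.uname = entry["uname"]
            info.gname = entry["gname"]
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def sha256_hex(data):
    """Return the hexadecimal SHA-256 digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


# Made-up example data standing in for archived file contents.
contents = {
    "hello-1.0/README": b"Hello!\n",
    "hello-1.0/hello.c": b"int main(void) { return 0; }\n",
}
metadata = [
    {"name": "hello-1.0/README", "mode": 0o644, "mtime": 0,
     "uid": 0, "gid": 0, "uname": "", "gname": ""},
    {"name": "hello-1.0/hello.c", "mode": 0o644, "mtime": 0,
     "uid": 0, "gid": 0, "uname": "", "gname": ""},
]

# Simulate the upstream tarball and the hash recorded for it.
original = rebuild_tarball(contents, metadata)
recorded_hash = sha256_hex(original)

# Later: rebuild from contents + metadata and verify against the hash.
rebuilt = rebuild_tarball(contents, metadata)
```

Compressed tarballs would additionally need the compressor parameters
recorded, which this sketch leaves out.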

* Citing software using Software Heritage IDs and Guix

  + ReScience has been using SWHIDs for submitted software
    (submitters must provide the SWHID,
     obtained by just clicking in the SWH web interface)
  + "guix show --format=bibtex" to produce a biblatex software style
  + ideal goal: feed Guix with a BibTeX snippet so it starts deploying
    + a command to export the state and a list of packages
      which could be included in a paper as a citation
    + another command to import this citation and deploy again
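
For readers unfamiliar with SWHIDs, here is a minimal Python sketch of
how the identifier of a single file ("content" object) can be computed
locally. It assumes the `swh:1:cnt:` scheme, where the hash is the same
as the Git blob hash of the file. The function name is only
illustrative; directories, revisions and releases use other object types
and are not covered here.

```python
import hashlib


def swhid_for_content(data: bytes) -> str:
    """Return the SWHID of a file content, given its bytes."""
    # Same construction as a Git blob: a short header carrying the
    # length, a NUL byte, then the raw bytes, hashed with SHA-1.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + data).hexdigest()
    return "swh:1:cnt:" + digest


# The empty file has a well-known Git blob hash:
print(swhid_for_content(b""))
# swh:1:cnt:e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```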

* “Converting” reproducible/active papers to use Guix

  + Example:
  + write "source-code-to-PDF" (PDF or document in a broad sense)
    + pipelines automated with Guix for other papers out there
  + task: find candidate papers that can be automated
    (example: the ReScience collection)
  + task: organize a hackathon to work collectively on papers?

* Packaging machine learning frameworks

  + status: we have Tensorflow 1.9, Tensorflow-lite 2.x, PyTorch,
  + we are stuck with Tensorflow 1.9
    because that's the last version to provide a CMake build system
  + Tensorflow blocker: Bazel build system 
    (in Java, cannot be built from source)
    + Debian package?
    + Ricardo looked at Bazel-to-CMake converters
      - cons: not good enough for Tensorflow
    + idea: use Bazel to produce a "degenerate" build system
    + task: package more Java dependencies (for Bazel)
  + Required reading:
  + tensorflow-lite still needs its Python bindings
    (this should not be too difficult)
    - cons: tensorflow-lite is "not very useful"
  + related question: how do we package ML applications?
    what's the source: the trained model, or the data set?
    how do we distribute huge data sets?

* Packaging (or not packaging) datasets

  + relevant for ML models but also many other domains
  + idea: establish contacts with communities working on this question
    + data management
    + git-annex

* Programming with GPUs

  + NVIDIA has a monopoly; CUDA is available in the guix-hpc-non-free
    channel at INRIA
  + but this very much goes against our goal of building a transparent
    software stack
  + how could we make it easier to support "quick-and-dirty" packaging?
    + binary-build-system from the nonguix channel
    + example: Zotero
    + idea: "guix environment --fhs" to provide an FHS-compliant file
      system
      + Possible starting point:

* Julia packaging and importer

   + Efraim prepared 100+ packages
   + Documenter.jl (JS stuff) required by Flux (ML) and many many
     + patches on the mailing list for package without JS support
   + Simon started writing an importer
     + problem: information in the package registry is hard to use;
       a service by Julia Computing, Inc. helps a bit
       - registry: requires parsing a lot of TOML, and
         synopsis+description are not there
       - service: API with JSON containing almost everything
     + policy question: can Guix (the importer?) rely on this service?

* CPU micro-architecture support with function multi-versioning

   + function multiversioning, like in Clear Linux
   + benchmarking to see if it even provides a benefit
   + prototype:
   + task: find candidate code for automatic FMV + benchmarks
   + gcc-11 supports different x86_64 "micro-architectures"
     + GCC now supports micro-architecture levels
       defined in the x86-64 psABI
       via -march=x86-64-v2, -march=x86-64-v3 and -march=x86-64-v4.
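
As a rough illustration of what these levels mean in practice, the
Python sketch below maps a set of CPU feature flags to the highest
matching `-march=x86-64-vN` level. The flag names are the ones Linux
prints in /proc/cpuinfo (an assumption: for instance "pni" for SSE3 and
"abm" for LZCNT), the feature lists are my reading of the x86-64 psABI,
and the helper names are hypothetical. This is not how GCC or Guix
detect features.

```python
# Features each level adds on top of the previous one, using Linux
# /proc/cpuinfo flag names (assumed mapping to the psABI lists).
LEVELS = [
    ("x86-64-v2", {"cx16", "lahf_lm", "popcnt", "pni",
                   "sse4_1", "sse4_2", "ssse3"}),
    ("x86-64-v3", {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma",
                   "abm", "movbe", "xsave"}),
    ("x86-64-v4", {"avx512f", "avx512bw", "avx512cd", "avx512dq",
                   "avx512vl"}),
]


def highest_level(flags):
    """Return the highest -march level supported by a set of flags."""
    level = "x86-64"  # baseline, assumed for any x86_64 CPU
    for name, required in LEVELS:
        if not required <= flags:
            break  # levels are cumulative: stop at the first gap
        level = name
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the feature flags of the first CPU listed (Linux only)."""
    with open(path) as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()


# Example with a hand-written flag set (not read from a real machine).
example_flags = LEVELS[0][1] | LEVELS[1][1] | {"sse", "sse2"}
print(highest_level(example_flags))  # x86-64-v3
```

On a Linux machine, `highest_level(read_cpu_flags())` gives a first
guess at which multi-versioned variant a dispatcher could pick.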

* Relocatable pack execution on clusters that lack user namespaces

  + fixing issues when running MPI applications using the "fakechroot"
    execution engine on big clusters

* Run the FEniCSX and Firedrake tutorial cases using Guix-Jupyter

   + main tasks: package FEniCSX and Firedrake (+ dependencies, mostly
   + initial work to happen on a channel
     Contact Paul Garlick

* Hosting a list of scientific channels at

   + provide substitutes via Cuirass
   + discussion about free software and binary substitutes?

We plan to organize a hackathon day soon* to tackle and make progress
on one specific item. Stay tuned!

*soon: date not fixed yet, most likely in November or December.

All the best


* Re: User & Developer Meetup on Sept. 27th: quick wrap-up
From: Ludovic Courtès @ 2021-10-04 12:16 UTC
  To: guix-science


Simon Tournier wrote:

> * Hosting a list of scientific channels at
>    + provide substitutes via Cuirass
>    + discussion about free software and binaries substitutes?

I forgot to mention it on this list, but there’s now a low-tech list of
channels available on the web site:


We could provide proper integration with hpcguix-web, which would allow
users to navigate packages per channel, things like that (though some
pages already show which channel packages come from).

I’ve also added the ‘guix-science’ channel on our tiny build farm at
Inria (it doesn’t have a lot of CPU power but it has a lot of disk


