From: "Simon Tournier" <simon.tournier@univ-paris-diderot.fr>
To: guix-science@gnu.org
Subject: User & Developer Meetup on Sept. 27th: quick wrap-up
Date: Tue, 28 Sep 2021 09:22:44 +0200
Message-ID: <web-253615875@univ-paris7.fr>
Dear all,
On Monday 27th, ~10 people met for ~2h to map out actions for the
coming year around Guix in scientific contexts, from reproducible research
to high-performance computing. Here is a quick wrap-up!
Do not hesitate to send an email about something you would like to see,
or join #guix-hpc on irc.libera.chat to discuss a specific item.
Or please go ahead and help make it happen. :-)
* Organizing training sessions and workshops
+ we need more!
+ https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/talks
how do we give more visibility to the material already there?
+ sharing material à la Software Carpentry
https://github.com/swcarpentry/swcarpentry
+ "format" (package?) to easily share this material
+ sharing reading lists
(the underlying idea is to have a weekly/monthly/? "newsletter")
+ PRACE-like EU training sessions
(seasonal [summer, winter, etc.] schools)
+ outreach effort
+ Libraries (jackhill): for archival as well as reproducing library
practices
* Funding opportunities
+ EU project participation and grants to fund specific tasks
+ BIMSB aims to secure funding for PiGx, which uses Guix and Guile.
There is a good chance to hire a person to hack on this in the
coming year
http://bioinformatics.mdc-berlin.de/pigx/
+ research topics
+ funding a few months of development/integration work
(Software Heritage)
+ "keeping in touch", sharing opportunities
(share calls for funding on topics: Repro on HPC, Bioinfo, etc.)
+ join GSoC or other organizations
+ run mentoring programs under our own organization?
* Long-term archival using Software Heritage and Disarchive
+ status: "guix lint -c archival" covers git-fetch
and https://guix.gnu.org/sources.json covers url-fetch;
tools are still missing to archive svn-fetch (and hg-fetch and minor
others: CVS, etc.)
+ Disarchive can be used to "rebuild" tarballs
from content (SWH) and metadata (disarchive-DB)
https://git.ngyro.com/disarchive
PoC: https://git.ngyro.com/disarchive-db/
+ Data Service or Cuirass (Berlin)?
+ Timothy Sample started to create a Disarchive database
for previous Guix releases
+ left to be done:
+ build/maintain/publish the Disarchive database
+ archive the database
(-> coordinate with Software Heritage)
+ metrics: what's the archive coverage on Software Heritage?
what's the coverage of the Disarchive database?
+ want to contribute? email Simon Tournier or mailing list
* Citing software using Software Heritage IDs and Guix
+ ReScience has been using SWHIDs for software submitted
(submitters must provide the SWHID,
by just clicking on the SWH web interface)
+ "guix show --format=bibtex" to produce an entry in the biblatex-software style
https://www.ctan.org/tex-archive/macros/latex/contrib/biblatex-contrib/biblatex-software
+ ideal goal: feed Guix a BibTeX snippet so that it deploys the cited
software
+ a command to export the state and a list of packages
which could be included in a paper as a citation
+ another command to import this citation and deploy again
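For readers unfamiliar with SWHIDs, here is a minimal sketch in Python of how the identifier of a single file content ("cnt" object) can be derived. It assumes the identifier is the Git blob hash of the bytes behind an "swh:1:cnt:" prefix. The function name swhid_for_content is made up for illustration; this is not the code used by Guix, ReScience or Software Heritage. Identifiers for directories and revisions, which are what one would normally cite for software, are not covered.

```python
import hashlib


def swhid_for_content(data: bytes) -> str:
    """Return the SWHID of a file content (hypothetical helper name)."""
    # Assumption: same hashing as a Git blob, i.e. SHA-1 over the header
    # "blob <size>\0" followed by the raw bytes.
    header = b"blob %d\x00" % len(data)
    digest = hashlib.sha1(header + data).hexdigest()
    return "swh:1:cnt:" + digest


if __name__ == "__main__":
    # The empty content has the well-known Git blob hash e69de29b...
    print(swhid_for_content(b""))
    # prints swh:1:cnt:e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

The result can be compared against what "git hash-object" reports for the same file.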
* Converting reproducible/active papers to use Guix
+ Example: https://rescience.github.io/bibliography/Courtes_2020.html
+ write "source-code-to-PDF" (PDF or document in a broad sense)
+ pipelines automated with Guix for other papers out there
+ task: find candidate papers that can be automated
(example: the ReScience collection)
+ task: organize a hackathon to work collectively on papers?
* Packaging machine learning frameworks
+ status: we have Tensorflow 1.9, Tensorflow-lite 2.x, PyTorch,
scikit-learn
+ we are stuck with Tensorflow 1.9
because that's the last version to provide a CMake build system
+ Tensorflow blocker: Bazel build system
(in Java, cannot be built from source)
+ Debian package?
https://sources.debian.org/src/bazel-bootstrap/3.5.1+ds-3/debian/control/
+ Ricardo looked at Bazel-to-CMake converters
- cons: not good enough for Tensorflow
+ idea: use Bazel to produce a "degenerate" build system
+ task: package more Java dependencies (for Bazel)
+ Required reading:
https://hpc.guix.info/blog/2021/09/whats-in-a-package/
+ tensorflow-lite still needs its Python bindings
(this should not be too difficult)
- cons: tensorflow-lite is "not very useful"
+ related question: how do we package ML applications?
what's the source: the trained model, or the data set?
how do we distribute huge data sets?
* Packaging (or not packaging) datasets
+ relevant for ML models but also many other domains
+ idea: establish contacts with communities working on this question
+ https://www.datalad.org/
+ https://www.pachyderm.com/ data management
+ git-annex
* Programming with GPUs
+ NVIDIA has a monopoly; CUDA is available in the guix-hpc-non-free
channel at INRIA
+ but this very much goes against our goal of building a transparent
software stack
+ how could we make it easier to support "quick-and-dirty" packaging?
+ binary-build-system from the nonguix channel
+ example: Zotero
+ idea: "guix environment --fhs" to provide an FHS-compliant file
tree
+ Possible starting point:
https://gitlab.com/pkill-9/guix-packages-free/blob/master/pkill9/services/fhs.scm
* Julia packaging and importer
+ Efraim prepared 100+ packages
+ Documenter.jl (JS stuff) required by Flux (ML) and many, many others
+ patches on the mailing list for a package without JS support
+ Simon started writing an importer
+ problem: information in the package registry is hard to use
as-is;
a service by Julia Computing, Inc. helps a bit
+ https://github.com/JuliaRegistries/General/tree/master/F/Flux
requires parsing a lot of TOML, and synopsis+description are
not there
+ https://juliahub.com/docs/Flux/QdkVy/0.11.6/pkg.json
API with JSON containing almost everything
+ policy question: can Guix (the importer?) rely on this service?
* CPU micro-architecture support with function multi-versioning
+ function multiversioning, like in Clear Linux
+ benchmarking to see if it even provides a benefit
+ prototype:
https://gitlab.inria.fr/guix-hpc/function-multi-versioning
+ task: find candidate code for automatic FMV + benchmarks
+ gcc-11 supports different x86_64 "micro-architectures"
+ https://gcc.gnu.org/gcc-11/changes.html
+ GCC now supports micro-architecture levels
defined in the x86-64 psABI
via -march=x86-64-v2, -march=x86-64-v3 and -march=x86-64-v4.
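To get a feel for which of these levels a given machine could run, here is a rough sketch in Python that maps a set of CPU flags to the highest matching level. The flag names use the Linux /proc/cpuinfo spelling (for instance "pni" for SSE3 and "abm" for LZCNT) and the per-level lists are one reading of the psABI levels, so both are assumptions to double-check. The names LEVELS, march_level and read_cpu_flags are hypothetical, and the x86-64 baseline is taken for granted.

```python
# Extra flags required by each level on top of the previous one.
# Assumption: Linux /proc/cpuinfo flag spelling, lists per the psABI levels.
LEVELS = [
    ("x86-64-v2", {"cx16", "lahf_lm", "popcnt", "pni",
                   "sse4_1", "sse4_2", "ssse3"}),
    ("x86-64-v3", {"avx", "avx2", "bmi1", "bmi2", "f16c",
                   "fma", "abm", "movbe", "xsave"}),
    ("x86-64-v4", {"avx512f", "avx512bw", "avx512cd",
                   "avx512dq", "avx512vl"}),
]


def march_level(flags):
    """Return the highest level whose flags, and all lower ones, are present."""
    level = "x86-64"  # baseline, assumed to be supported
    for name, required in LEVELS:
        if not required <= set(flags):
            break  # levels are cumulative: stop at the first gap
        level = name
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flags of the first CPU listed, or an empty set."""
    try:
        with open(path) as cpuinfo:
            for line in cpuinfo:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass  # not on Linux, or file not readable
    return set()


if __name__ == "__main__":
    print("-march=" + march_level(read_cpu_flags()))
```

Such a check only says what the hardware could run; whether a given package actually benefits is the benchmarking question listed above.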
* Relocatable pack execution on clusters that lack user namespaces
+ fixing issues when running MPI applications using the "fakechroot"
execution engine on big clusters
* Run the FEniCSX and Firedrake tutorial cases using Guix-Jupyter
+ main tasks: package FEniCSX and Firedrake (+ dependencies, mostly
Python)
+ initial work to happen in a channel
(contact: Paul Garlick)
* Hosting a list of scientific channels at hpc.guix.info/channels
+ provide substitutes via Cuirass
+ discussion about free software and binary substitutes?
We plan to organize a hackathon day soon* to make progress
on one specific item. Stay tuned!
*soon: date not fixed yet, most likely in November or December.
All the best