From: Simon Tournier <zimon.toutoune@gmail.com>
To: Kyle <kyle@posteo.net>,
	Spencer Skylar Chan <schan12@terpmail.umd.edu>,
	Ricardo Wurmus <rekado@elephly.net>
Cc: guix-devel@gnu.org
Subject: Re: Google Summer of Code 2023 Inquiry
Date: Tue, 04 Apr 2023 19:15:54 +0200
Message-ID: <86h6tvtvp1.fsf@gmail.com>
In-Reply-To: <0DE0A1F6-58C0-47AD-BBDA-D99E4CD4213A@posteo.net>

Hi Kyle,

On Tue, 04 Apr 2023 at 14:32, Kyle <kyle@posteo.net> wrote:

>           The CRAN importer, for example, cannot yet detect non-R
> dependencies. So, the profile author has to figure those out for
> themselves. It's still very useful despite not being perfect.  

Yeah, improving the importers is very helpful…

> Sure, but as is shown with "guix import cran" as I previously
> mentioned, it doesn't have to be perfect to be really useful in many
> cases.

…but please note the R ecosystem is probably one of the best around.

Well, I would not extrapolate to other ecosystems, such as Python, based
on what Lars did with the guix-cran channel [1].

For more details, have a look at this thread [2],

        Accuracy of importers?
        Ludovic Courtès <ludovic.courtes@inria.fr>
        Thu, 28 Oct 2021 09:02:27 +0200

or slide 53 of
https://git.savannah.gnu.org/cgit/guix/maintenance.git/plain/talks/packaging-con-2021/grail/talk.20211110.pdf 
  

In addition, quoting another discussion from [3]:

        Well, it strongly depends on the quality of the targeted language
        ecosystem.  For some, they provide enough metadata to rely on for good
        automatizing; for instance, R with CRAN or Bioconductor.

        Sadly, for many others ecosystem, they (upstream) do not provide enough
        metadata to automatically fill all the package fields.  And some manual
        tweaks are required.

        For example, let count the number of packages that are tweaking their
        ’arguments’ fields (from ’#:tests? #f’ to complex phases modifications).
        This is far from being a perfect metrics but it is a rough indication
        about upstream quality: if they provide clean package respecting their
        build system or if the package requires Guix adjustments.

        Well, I get:

              r            : 2093 = 2093 = 1991 + 102 

        which is good (only ~5% require ’arguments’ tweaks), but

              python       : 2630 = 2630 = 803  + 1827

        is bad (only ~31% do not require an ’arguments’ tweak).

and the analysis can be refined: for instance, which ’arguments’
keywords are being tweaked?  I did it [4] for the emacs-build-system:

                emacs        : 1234 = 1234 = 878  + 356
                    ("phases" . 213)
                    ("tests?" . 144)
                    ("test-command" . 127)
                    ("include" . 87)
                    ("emacs" . 25)
                    ("exclude" . 20)
                    ("modules" . 7)
                    ("imported-modules" . 4)
                    ("parallel-tests?" . 1) 

        Considering this 356 packages, 144 modifies the keyword #:tests?.  Note
        that ’#:tests? #t’ is counted in these 144 and it reads,

            $ ag 'tests\? #t' gnu/packages/emacs-xyz.scm | wc -l
            117

        Ah!  It requires some investigations. :-)
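
The per-keyword tally quoted above can be approximated with a few lines
of Python.  This is only a rough sketch, not the method used in [4]: the
SAMPLE text is a made-up excerpt standing in for a file such as
gnu/packages/emacs-xyz.scm, and the regular expression counts every
`#:keyword` it sees, whether or not it sits inside an ’arguments’ field.

```python
import re
from collections import Counter

# Hypothetical excerpt standing in for a file such as
# gnu/packages/emacs-xyz.scm; it is not taken from Guix.
SAMPLE = """
(arguments
 `(#:tests? #t
   #:test-command '("make" "check")
   #:phases (modify-phases %standard-phases ...)))
(arguments
 '(#:tests? #f))
"""

# A Scheme keyword written as #:name.  Assumption: names only use
# these characters, which covers the keywords listed in the message.
KEYWORD = re.compile(r"#:([A-Za-z0-9?!*<>=/+-]+)")


def count_keywords(text):
    """Count how often each #:keyword appears in TEXT."""
    return Counter(KEYWORD.findall(text))


counts = count_keywords(SAMPLE)

# Same idea as the `ag 'tests\\? #t'` command quoted above, but counting
# occurrences rather than matching lines.
explicit_true = len(re.findall(r"#:tests\? #t", SAMPLE))

for keyword, n in counts.most_common():
    print(f'("{keyword}" . {n})')
print("explicit '#:tests? #t':", explicit_true)
```

A text-based count like this over-counts keywords that appear outside
package definitions, so it is only good for a first look.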

Last, in addition to the ideas for improvement discussed in threads
[3,4], the conclusion still holds:

        Indeed, it could be worth to identify common sources of the extra
        modifications we are doing compared to the default emacs-build-system.

Yeah, improving the importers is very helpful! :-)

Well, considering that 95% of the current R packages in Guix just work
out of the box from the CRAN metadata, and considering how many packages
guix-cran provides compared to how many CRAN provides, we can roughly
extrapolate what “doesn't have to be perfect” would mean for other
ecosystems such as Python.  Roughly speaking, only about 30% of the
current Python packages in Guix work out of the box.

Yeah, these numbers are very partial, and a finer analysis could help
improve the importers.  But they show that the conclusion drawn from the
CRAN example would not apply as-is to other ecosystems, IMHO.
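
For reference, the percentages above follow directly from the counts
quoted earlier in this message.  A minimal sketch of the arithmetic,
assuming that in each "total = untweaked + tweaked" line the first
summand is the number of packages without an ’arguments’ tweak (the
names `counts` and `untweaked_share` are only illustrative):

```python
# Counts quoted in this message, per build system:
# (total, without 'arguments' tweaks, with 'arguments' tweaks).
counts = {
    "r": (2093, 1991, 102),
    "python": (2630, 803, 1827),
    "emacs": (1234, 878, 356),
}


def untweaked_share(total, untweaked, tweaked):
    """Return the percentage of packages needing no 'arguments' tweak."""
    # Sanity check that the quoted figures add up.
    assert total == untweaked + tweaked
    return 100.0 * untweaked / total


shares = {name: untweaked_share(*row) for name, row in counts.items()}

for name, share in shares.items():
    print(f"{name:<8}: {share:.1f}% without 'arguments' tweaks")
```

This gives roughly 95% for R and 31% for Python, which is where the
"95%" and "30%" figures come from.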


1: https://hpc.guix.info/blog/2022/12/cran-a-practical-example-for-being-reproducible-at-large-scale-using-gnu-guix/
2: https://yhetil.org/guix/878ryd8we4.fsf@inria.fr/#r
3: https://yhetil.org/guix/86cz9kk71y.fsf@gmail.com
4: https://yhetil.org/guix/87cz9gunwx.fsf@gmail.com


Cheers,
simon

