unofficial mirror of guix-devel@gnu.org 
From: jbranso@dismail.de
To: "zimoun" <zimon.toutoune@gmail.com>, "Guix Devel" <guix-devel@gnu.org>
Subject: Re: Some stats about the graph of dependencies
Date: Sat, 10 Dec 2022 00:19:01 +0000
Message-ID: <0790f770f69af2bffd6d6d9d4ba22bc7@dismail.de>
In-Reply-To: <874ju4qyd4.fsf@gmail.com>

December 9, 2022 12:32 PM, "zimoun" <zimon.toutoune@gmail.com> wrote:

> Hi,
> 
> Preparing some Python stuff, I was toying with the package
> python-networkx. And Guix is awesome because it is easy to extract the
> graph of dependencies.
> 
> Here dependencies are just inputs, native-inputs and propagated-inputs.
> It could be interesting to also include build-system dependencies, I
> have been lazy. :-)
> 
> My initial question is: what are the “essentials”? By essential,
> I mean the “important” ones, the “hot” ones, etc. The ones which are
> “influencers” – yeah the world is a social network. :-)
> 
> First, let's extract the graph with a tiny Scheme script:
> 
> $ guix repl -- packages-to-dict.scm > dod.py
> 
> Then, let's import that into an IPython session:
> 
> $ guix shell python python-ipython \
> python-scipy python-matplotlib python-networkx -- ipython
> 
> and run another tiny Python script for plotting. See Figure attached.
> 
> We can compare a link analysis metric [1] and a centrality measure
> [2]; say PageRank [3] and Eigenvector [4]. The larger the value, the
> more “important” the package is (for this metric).
> 
> And the Directed and Undirected graphs can be compared, using NetworkX
> [5,6]. Well, Eigenvector centrality (or Katz centrality [7]) is failing
> because the power iteration does not converge, but other metrics could
> also be considered. This is just a first rough toy. :-)
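
One possible reason for the non-convergence (an assumption on my part, the
message above does not diagnose it): a package dependency graph is acyclic,
so its adjacency matrix is nilpotent and a plain, unshifted power iteration
collapses to the zero vector after a few steps. Below is a minimal
pure-Python sketch of that effect. The function name and the toy graphs are
hypothetical, and this is not NetworkX's actual implementation.

```python
def steps_until_zero(succ, max_iter=100):
    """Iterate x <- A^T x without normalization.

    Returns the step at which x becomes all zeros, or None if that
    never happens within max_iter steps.
    """
    nodes = sorted(set(succ) | {v for vs in succ.values() for v in vs})
    x = {n: 1.0 for n in nodes}
    for step in range(1, max_iter + 1):
        nxt = {n: 0.0 for n in nodes}
        # Each node passes its current score to every successor.
        for u, vs in succ.items():
            for v in vs:
                nxt[v] += x[u]
        x = nxt
        if all(val == 0.0 for val in x.values()):
            return step
    return None


# Toy acyclic chain (hypothetical names): app -> lib -> libc
dag = {"app": ["lib"], "lib": ["libc"], "libc": []}
# Toy graph with a cycle, for contrast
cyc = {"a": ["b"], "b": ["a"]}

collapse_step = steps_until_zero(dag)
cycle_step = steps_until_zero(cyc)
print(collapse_step, cycle_step)
```

On the acyclic chain the vector is all zeros after as many steps as there
are nodes on the longest path, so there is nothing left to normalize. With
a cycle the scores survive.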
> 
> According to PageRank applied to the Directed Graph, the 10 most
> “important” packages are:
> 
> --8<---------------cut here---------------start------------->8---
> [('pkg-config-0.29.2', 0.02418335991713879),
> ('perl-5.34.0', 0.015404032767249512),
> ('coreutils-minimal-8.32', 0.013240458675517012),
> ('zlib-1.2.11', 0.009107245584307803),
> ('python-pytest-6.2.5', 0.008413060648307678),
> ('ncurses-6.2.20210619', 0.007598925467605917),
> ('r-knitr-1.41', 0.00554772892485958),
> ('sbcl-rt-1990.12.19-1.a6a7503', 0.004884721933452539),
> ('bzip2-1.0.8', 0.004800877844001881),
> ('python-3.9.9', 0.00415536078558266)]
> --8<---------------cut here---------------end--------------->8---
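
For readers without the attached scripts, here is a minimal pure-Python
sketch of how such a PageRank top-N list can be computed. It is a simplified
stand-in for what a library routine such as NetworkX's PageRank does, not the
script used above. Assumptions: edges point from a package to its inputs, the
damping factor is 0.85, and the toy graph below is made up for illustration
(it is not the real Guix graph).

```python
def pagerank(succ, damping=0.85, tol=1e-10, max_iter=200):
    """PageRank by power iteration on a dict mapping node -> list of successors."""
    nodes = sorted(set(succ) | {v for vs in succ.values() for v in vs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not succ.get(u))
        nxt = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            outs = succ.get(u, [])
            for v in outs:
                nxt[v] += damping * rank[u] / len(outs)
        if sum(abs(nxt[u] - rank[u]) for u in nodes) < tol:
            return nxt
        rank = nxt
    raise RuntimeError("power iteration did not converge")


# Toy graph: package -> inputs (invented edges, illustration only)
toy = {
    "app-a": ["pkg-config", "zlib"],
    "app-b": ["pkg-config"],
    "app-c": ["pkg-config", "zlib"],
    "pkg-config": [],
    "zlib": [],
}

scores = pagerank(toy)
top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:3]
print(top)
```

With edges pointing at the inputs, rank flows toward the packages that many
others depend on, which matches the shape of the list above.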
> 
> And if we compare the 3 results (Undirected with PageRank and
> Eigenvector, and Directed with PageRank only), then the 10 most
> “important” packages are:
> 
> --8<---------------cut here---------------start------------->8---
> ['pkg-config-0.29.2',
> 'glib-2.70.2',
> 'zlib-1.2.11',
> 'gtk+-3.24.30',
> 'perl-5.34.0',
> 'gettext-minimal-0.21',
> 'qtbase-5.15.5',
> 'libxml2-2.9.12',
> 'python-3.9.9',
> 'autoconf-2.69']
> --8<---------------cut here---------------end--------------->8---
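
The message does not say how the three rankings were merged. One simple
option, offered purely as an assumption and not as the method used above, is
to order packages by their mean position across the rankings. The helper name
and the toy rankings below are hypothetical.

```python
def combine_rankings(rankings):
    """Order items by mean position across several ranked lists.

    An item missing from a list is given that list's length as its
    position, i.e. one place below the last ranked item.
    """
    items = {name for r in rankings for name in r}

    def mean_pos(name):
        total = 0
        for r in rankings:
            total += r.index(name) if name in r else len(r)
        return total / len(rankings)

    # Ties are broken alphabetically so the result is deterministic.
    return sorted(items, key=lambda name: (mean_pos(name), name))


# Toy rankings (invented orderings, illustration only)
r1 = ["pkg-config", "perl", "zlib"]
r2 = ["pkg-config", "glib", "zlib"]
r3 = ["glib", "pkg-config", "zlib"]

combined = combine_rankings([r1, r2, r3])
print(combined)
```

Other aggregation rules (sum of scores, intersection of the top-N lists) would
be just as defensible and may give a different order.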
> 
> Somehow, it means that these packages have a high influence on all the
> others. Now, we can roughly compare with the release-manifest.scm [8],
> 
> --8<---------------cut here---------------start------------->8---
> '("bootstrap-tarballs" "gcc-toolchain" "nss-certs"
> "openssh" "emacs" "vim" "python" "guile" "guix")))
> '("coreutils" "grep" "findutils" "gawk" "make"
> #;"gcc-toolchain" "tar" "xz")))
> '("xorg-server" "xfce" "gnome" "mate" "enlightenment"
> "openbox" "awesome" "i3-wm" "ratpoison"
> "emacs" "emacs-exwm" "emacs-desktop-environment"
> "xlockmore" "slock" "libreoffice"
> "connman" "network-manager" "network-manager-applet"
> "openssh" "ntp" "tor"
> "linux-libre" "grub-hybrid"
> '("coreutils" "grep" "sed" "findutils" "diffutils" "patch"
> "gawk" "gettext" "gzip" "xz"
> "hello" "zlib"))))
> --8<---------------cut here---------------end--------------->8---
> 
> Well, we could investigate more and play more with some graph tools.
> For instance, include all the build-system dependencies and so on.
> 
> A list of “statistically important” packages could help improve
> the list of “essential” packages.
> 
> Although Python is great, I would like to run Guile. Is there any Guile
> library for manipulating graphs around?

https://packages.guix.gnu.org/packages/guile2.2-charting/0.2.0-1.75f755b/

Though it may be Guile 2 only...?

> 
> All that to say, Guix is great! :-) And perhaps some of you already
> have some Guile code for analysing graphs. Maybe.
> 
> Well, comments or ideas are welcome. :-)
> 
> 1: <https://en.wikipedia.org/wiki/Network_theory#Link_analysis>
> 2: <https://en.wikipedia.org/wiki/Network_theory#Centrality_measures>
> 3: <https://en.wikipedia.org/wiki/PageRank>
> 4: <https://en.wikipedia.org/wiki/Eigenvector_centrality>
> 5: <https://networkx.org/documentation/stable/reference/algorithms/link_analysis.html>
> 6: <https://networkx.org/documentation/stable/reference/algorithms/centrality.html>
> 7: <https://en.wikipedia.org/wiki/Katz_centrality>
> 8: <https://git.savannah.gnu.org/cgit/guix.git/tree/etc/release-manifest.scm#n47>
> 
> Cheers,
> simon

