* [bug#74268] [PATCH 0/1] teams: Add packages stats script.
@ 2024-11-08 21:31 Sharlatan Hellseher
2024-11-08 21:32 ` [bug#74268] [PATCH 1/1] etc: Add teams-packages-stats script Sharlatan Hellseher
0 siblings, 1 reply; 3+ messages in thread
From: Sharlatan Hellseher @ 2024-11-08 21:31 UTC
To: 74268; +Cc: Sharlatan Hellseher
Hi Guix!
While working on the python-team branch I aimed to fix as many packages as
possible to prepare it for the upcoming merge to master. I ran into the fact
that there is no tooling (or maybe I do not know about it) providing
larger-scale stats for packages within a team's scope.
Just some simple questions: Which N packages may be updated without triggering
larger-scale rebuilds? Which packages need to be updated on the team branch?
What affect ratio does a given package have? Which type of package
modification would trigger a rebuild of dependents (it was a surprise that the
order of inputs does trigger one...)?
So...
> ./pre-inst-env etc/teams-packages-stats.scm stats python
This will generate a list of all packages for the python team (defined for now
as any package whose build system is python or pyproject), with these fields:
- module-file-name
- build-system-name
- package-name
- package-guix-version
- package-upstream-version
- all-inputs-count
- dependents-count
- affect-ratio
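As a rough illustration of the last field: going by the docstring in the
patch, affect-ratio is dependents / all-packages * 100.0. Below is a minimal
Python sketch of that arithmetic. The function name and the sample numbers are
assumptions made for illustration; they are not part of the script.

```python
def affect_ratio(dependents: int, all_packages: int) -> float:
    """Return the share of all packages (in percent) that depend on a package.

    Mirrors the formula from the patch's docstring:
    dependents / all-packages * 100.0
    """
    if all_packages <= 0:
        raise ValueError("all_packages must be positive")
    return dependents / all_packages * 100.0


# Hypothetical numbers, for illustration only.
print(affect_ratio(0, 30000))    # a leaf package: nothing else is rebuilt
print(affect_ratio(150, 30000))  # a package with 150 dependents
```

A package with zero dependents always ends up with a ratio of 0.0, which is
what the listing further down selects for.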
The `column' command may be used to produce JSON for further analysis and
preparation.
e.g. some packages which have 0 impact if they are refreshed:
--8<---------------cut here---------------start------------->8---
> column -s, -t 1731089576-python-team | sort -k8 -n | head -n20
deprecated-package pyproject python-language-server 1.11.0 nil 23 0 0.0
deprecated-package python beets-next 1.6.0 nil 31 0 0.0
deprecated-package python python-trytond-purchase 6.2.3 nil 21 0 0.0
gnu/packages/android.scm python fdroidserver 1.1.9 nil 21 0 0.0
gnu/packages/astronomy.scm pyproject ginga-qt5 5.1.0 nil 22 0 0.0
gnu/packages/astronomy.scm pyproject python-poliastro 0.17.0 nil 20 0 0.0
gnu/packages/bioinformatics.scm pyproject fanc 0-1.354401e nil 29 0 0.0
gnu/packages/bioinformatics.scm pyproject python-baltica 1.1.2 nil 27 0 0.0
gnu/packages/bioinformatics.scm pyproject python-episcanpy 0.4.0 nil 25 0 0.0
gnu/packages/bioinformatics.scm pyproject python-fanc 0.9.25 nil 26 0 0.0
gnu/packages/bioinformatics.scm pyproject python-hicexplorer 3.7.4 nil 27 0 0.0
gnu/packages/bioinformatics.scm pyproject python-liana-py 1.1.0 nil 25 0 0.0
gnu/packages/bioinformatics.scm pyproject python-metacells 0.9.4 nil 26 0 0.0
gnu/packages/bittorrent.scm python deluge 2.1.1 nil 21 0 0.0
gnu/packages/bootloaders.scm pyproject patman 2024.01 nil 20 0 0.0
gnu/packages/databases.scm pyproject datasette 1.0a7 nil 29 0 0.0
gnu/packages/databases.scm python python-pyarrow 0.16.0 nil 28 0 0.0
gnu/packages/finance.scm python electron-cash 4.4.1 nil 23 0 0.0
gnu/packages/genealogy.scm python gramps 5.1.4 nil 23 0 0.0
gnu/packages/gnome.scm python terminator 2.1.4 nil 20 0 0.0
--8<---------------cut here---------------end--------------->8---
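The same selection could be sketched in Python instead of the column/sort
pipeline. This assumes the script's raw output is comma-separated, carries the
eight fields listed above, and was generated with the dependents check enabled
(so the counts are numbers rather than "nil"). The names `read_stats' and
`zero_impact', and the second sample row, are hypothetical.

```python
import csv
import io

# Field order as listed in the cover letter.
FIELDS = ["module-file-name", "build-system-name", "package-name",
          "package-guix-version", "package-upstream-version",
          "all-inputs-count", "dependents-count", "affect-ratio"]


def read_stats(text):
    """Parse the comma-separated output into a list of dictionaries."""
    rows = []
    for values in csv.reader(io.StringIO(text)):
        if len(values) != len(FIELDS):
            continue  # skip blank or malformed lines
        # Strip the padding that surrounds some values (e.g. the ratio).
        row = dict(zip(FIELDS, (value.strip() for value in values)))
        row["all-inputs-count"] = int(row["all-inputs-count"])
        row["dependents-count"] = int(row["dependents-count"])
        row["affect-ratio"] = float(row["affect-ratio"])
        rows.append(row)
    return rows


def zero_impact(rows, limit=20):
    """Return up to LIMIT packages whose refresh rebuilds nothing else."""
    leaves = [row for row in rows if row["dependents-count"] == 0]
    leaves.sort(key=lambda row: (row["module-file-name"], row["package-name"]))
    return leaves[:limit]


# The first row is taken from the listing above; the second is made up.
SAMPLE = """\
gnu/packages/android.scm,python,fdroidserver,1.1.9,nil,21,0,     0.0
gnu/packages/python-xyz.scm,pyproject,python-example,1.0,nil,3,120,     0.4
"""

for row in zero_impact(read_stats(SAMPLE)):
    print(row["module-file-name"], row["package-name"])
```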
I'm not confident in my Guile Scheme ;-) so any review comments are welcome.
CC'ing the core team for wider visibility among teams.
Sharlatan Hellseher (1):
etc: Add teams-packages-stats script.
etc/teams-packages-stats.scm | 218 +++++++++++++++++++++++++++++++++++
1 file changed, 218 insertions(+)
create mode 100755 etc/teams-packages-stats.scm
base-commit: 2a6d96425eea57dc6dd48a2bec16743046e32e06
--
2.46.0
* [bug#74268] [PATCH 1/1] etc: Add teams-packages-stats script.
From: Sharlatan Hellseher @ 2024-11-08 21:32 UTC
To: 74268; +Cc: Sharlatan Hellseher
This is a proposal for a helper script which aims to assist decision making
during cascading package refresh tasks within a team's scope.
* etc/teams-packages-stats.scm: New file.
Change-Id: I4af5ce1c3cbebed1793628229b29acba1f737c9d
---
etc/teams-packages-stats.scm | 218 +++++++++++++++++++++++++++++++++++
1 file changed, 218 insertions(+)
create mode 100755 etc/teams-packages-stats.scm
diff --git a/etc/teams-packages-stats.scm b/etc/teams-packages-stats.scm
new file mode 100755
index 0000000000..a95d913a79
--- /dev/null
+++ b/etc/teams-packages-stats.scm
@@ -0,0 +1,218 @@
+#!/bin/sh
+# -*- mode: scheme; -*-
+# Extra care is taken here to ensure this script can run in most environments,
+# since it is invoked by 'git send-email'.
+pre_inst_env_maybe=
+command -v guix > /dev/null || pre_inst_env_maybe=./pre-inst-env
+exec $pre_inst_env_maybe guix repl -- "$0" "$@"
+!#
+
+;;; GNU Guix --- Functional package management for GNU
+;;; Copyright © 2024 Sharlatan Hellseher <sharlatanus@mgail.com>
+;;;
+;;; This file is part of GNU Guix.
+;;;
+;;; GNU Guix is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; GNU Guix is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>.
+
+;;; This file returns a manifest containing origins of all the packages. The
+;;; main purpose is to allow continuous integration services to keep upstream
+;;; source code around. It can also be passed to 'guix weather -m'.
+
+;;; Commentary:
+
+;; This code defines helpers for cascade packages refresh within team scopes.
+;; The output may be piped to CLI commands like awk or column to compile a
+;; dataframe (e.g. JSON).
+;;
+;; ~$ column \
+;; --json \
+;; --table \
+;; --separator=, \
+;; --table-columns=module-file-name,build-system-name,package-name,\
+;; package-guix-version,package-upstream-version,all-inputs-count,\
+;; dependents-count,affect-ratio \
+;; <output> \
+;; > <output>.json
+;;
+;; TODO:
+;; - Implement manifests per team based on some gradual criteria
+;; - Add more controls via command-line options
+;; - Improve the performance of dependents calculation, it takes about 30min
+;; to provide a list for packages with python/pyproject build system
+;; - Add saving as JSON and CSV data formats for further analysis
+
+
+;;; Code:
+\f
+(use-modules (git)
+ (gnu packages)
+ (guix build-system)
+ (guix diagnostics)
+ (guix discovery)
+ (guix gnupg)
+ (guix graph)
+ (guix hash)
+ (guix monads)
+ (guix packages)
+ (guix profiles)
+ (guix scripts graph)
+ (guix scripts)
+ (guix store)
+ (guix ui)
+ (guix upstream)
+ (guix utils)
+ (ice-9 format)
+ (ice-9 match)
+ (ice-9 rdelim)
+ (ice-9 regex)
+ (srfi srfi-1)
+ (srfi srfi-26)
+ (srfi srfi-37)
+ (srfi srfi-71)
+ (srfi srfi-9))
+
+(define* (packages-by-team #:key (team "all"))
+ "Return the list of packages for the TEAM by certain criteria or fail over
+to all packages available."
+ (cond
+ ((string=? team "go")
+ (fold-packages
+ (lambda (package result)
+ (if (or (eq? (build-system-name (package-build-system package))
+ (quote go))
+ ;; XXX: Add other checks, such as Go being in inputs*.
+ )
+ (cons package result) result)) (list)))
+ ((string=? team "python")
+ (fold-packages
+ (lambda (package result)
+ (if (or (eq? (build-system-name (package-build-system package))
+ (quote pyproject))
+ (eq? (build-system-name (package-build-system package))
+ (quote python)))
+ (cons package result) result)) (list)))
+ ((string=? team "ruby")
+ (fold-packages
+ (lambda (package result)
+ (if (or (eq? (build-system-name (package-build-system package))
+ (quote ruby))
+ ;; XXX: Add other checks, such as Ruby being in inputs*.
+ )
+ (cons package result) result)) (list)))
+ (else
+ (fold-packages
+ (lambda (package result)
+ (if (package-superseded package)
+ result
+ (cons package result)))
+ '()
+ #:select? (const #true)))))
+
+(define (dependents-count package)
+ "Return the number of packages requiring a rebuild when PACKAGE is updated."
+ (with-error-handling ;; XXX: Taken from guix scripts refresh
+ (with-store store
+ (run-with-store store
+ (mlet %store-monad ((edges
+ (node-back-edges %bag-node-type
+ (package-closure (packages-by-team)))))
+ (let* ((dependents
+ (node-transitive-edges (list package) edges)))
+ (return (length dependents))))))))
+
+(define* (stats team
+ #:key (build-systems '())
+ (check-dependents? #false)
+ (check-deprecated? #false)
+ (check-upstream-version? #false)
+ (dependents-threshold-ratio 0.001)
+ (inputs-threshold 0))
+ "Return detailed stats for the given TEAM packages which may help to make
+a decision during cascade updates.
+
+Parameters:
+- build-systems :: The optional list of build system names to select.
+
+- check-dependents? :: Whether to query the dependents count; it might take
+a long time for a long list of provided packages.
+
+- check-deprecated? :: Whether to show the deprecated packages.
+
+- check-upstream-version? :: Whether to check for the latest version
+available upstream.
+
+- dependents-threshold-ratio :: Print only packages whose dependents ratio
+is greater than or equal to this threshold (dependents/all-packages * 100.0).
+
+- inputs-threshold :: The minimum number of inputs which a package needs to
+have.
+
+Returns:
+- module-file-name
+- build-system-name
+- package-name
+- package-guix-version
+- package-upstream-version
+- all-inputs-count
+- dependents-count
+- affect-ratio"
+ (let ((team-packages (packages-by-team #:team team))
+ (all-packages-count (length (packages-by-team))))
+ (map (lambda (package)
+ (let ((all-inputs-count
+ (+ (length (package-inputs package))
+ (length (package-native-inputs package))
+ (length (package-propagated-inputs package))))
+ (module-path
+ (false-if-exception
+ (location-file (package-definition-location package))))
+ (build-system-name
+ (build-system-name (package-build-system package))))
+ (if (>= all-inputs-count inputs-threshold)
+ (let* ((dependents
+ (if check-dependents?
+ (dependents-count package)
+ "nil"))
+ (affect-ratio
+ (if check-dependents?
+ (* (/ dependents all-packages-count) 100.0)
+ "nil")))
+ (format #true "~{~a,~}~8f~%"
+ (list
+ (if (string? module-path)
+ module-path
+ "deprecated-package")
+ build-system-name
+ (package-name package)
+ (package-version package)
+ (if check-upstream-version? "TBA" "nil")
+ all-inputs-count
+ dependents)
+ affect-ratio)))))
+ team-packages)))
+\f
+(define (main . args)
+  (match args
+    (("stats" team-name)
+     (stats team-name #:check-dependents? #true))
+    (anything
+     (format (current-error-port)
+             "Usage: etc/teams-packages-stats.scm <command> [<args>]
+
+Commands:
+  stats <team-name>
+    get a list of packages belonging to the given <team-name> with basic
+    affect ratio, which may help to plan cascade packages refresh task.~%"))))
+
+(apply main (cdr (command-line)))
--
2.46.0
* [bug#74268] [PATCH 1/1] etc: Add teams-packages-stats script.
From: Troy Figiel @ 2024-11-10 16:44 UTC
To: Sharlatan Hellseher; +Cc: 74268
Hi Oleg,
I do not have too much to add, but just wanted to mention this solves a
pain point I had a while back. I thought it would be nice to have a
tool that in some sense does the opposite of `guix refresh -l' and
`guix refresh -T', showing me all packages with a given number of
dependencies. When you are new to updating packages in Guix, it can be
quite difficult to find some simple candidates.
Two other points that come to mind regarding your script:
- jsonl output probably gives you more flexibility in the future.
- I could imagine you might want to filter for module paths.
As a sidenote, when I run your script, the final 0.0 is often
misaligned. It seems to have extra spaces.
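To make the jsonl idea a bit more concrete, a conversion could be sketched
roughly as below. It is only an illustration in Python, not part of the patch:
it assumes one comma-separated line per package with the eight fields from the
cover letter, and the names `FIELDS' and `to_jsonl' are made up. Stripping the
whitespace around each value would also sidestep the padding in front of the
final ratio.

```python
import json

# Field order as listed in the cover letter of this series.
FIELDS = ["module-file-name", "build-system-name", "package-name",
          "package-guix-version", "package-upstream-version",
          "all-inputs-count", "dependents-count", "affect-ratio"]


def to_jsonl(lines):
    """Yield one JSON object per well-formed comma-separated line."""
    for line in lines:
        # Strip padding such as the spaces in front of the final ratio.
        values = [value.strip() for value in line.split(",")]
        if len(values) != len(FIELDS):
            continue  # skip blank or malformed lines
        # Values are kept as strings; "nil" is passed through unchanged.
        yield json.dumps(dict(zip(FIELDS, values)))


# Sample line in the script's output format (package taken from the listing).
sample = ["gnu/packages/gnome.scm,python,terminator,2.1.4,nil,20,0,     0.0\n"]
for record in to_jsonl(sample):
    print(record)
```

Each output line is then a standalone JSON object, so the result can be
appended to, or filtered line by line, for example by module path.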
In any case, I think it is a nice tool to have.
Best wishes,
Troy