unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / code / Atom feed
* bug#51021: detect loops in module/package graph
@ 2021-10-05  0:58 raingloom
  2021-10-05  7:59 ` Mark H Weaver
  0 siblings, 1 reply; 6+ messages in thread
From: raingloom @ 2021-10-05  0:58 UTC (permalink / raw)
  To: 51021

I'll be short and blunt, currently it sucks big time whenever you have
a loop in your packages.
There is no indication other than Guix hanging forever without any
output.

I don't want to spend my time manually finding loops in graphs,
computers are better at that.

Sadly I don't know when I'll have time to implement this, so if someone
knows of a solution, they should not hesitate with sending a patch and
making all our lives easier.




^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#51021: detect loops in module/package graph
  2021-10-05  0:58 bug#51021: detect loops in module/package graph raingloom
@ 2021-10-05  7:59 ` Mark H Weaver
  2021-10-05  8:03   ` Mark H Weaver
  2021-10-07 13:28   ` Ludovic Courtès
  0 siblings, 2 replies; 6+ messages in thread
From: Mark H Weaver @ 2021-10-05  7:59 UTC (permalink / raw)
  To: raingloom, 51021

Hi,

raingloom <raingloom@riseup.net> writes:
> I'll be short and blunt, currently it sucks big time whenever you have
> a loop in your packages.

Agreed.  I've been concerned about this problem since the early days of
Guix.  See <https://bugs.gnu.org/18247>.

Back in August 2014, there was a strongly connected component (SCC)
containing 51 package modules:

  (acl algebra attr avahi base curl cyrus-sasl dejagnu docbook doxygen
   fltk fontutils gcc gd gdb gettext ghostscript gl glib gnome gnupg
   gnutls graphviz groff gsasl gstreamer gtk iso-codes lesstif
   libcanberra linux lisp maths mit-krb5 mpi netpbm openldap pdf
   pulseaudio rrdtool shishi ssh tcl tcsh texinfo texlive valgrind web
   xiph xml xorg)

Since then, that SCC has grown to 277 package modules:

  (acl admin aidc algebra apr aspell assembly astronomy attr audio
   augeas authentication autogen autotools avahi backup base bash bdw-gc
   bison bittorrent boost build-tools c calendar cdrom certs check cmake
   code compression cook cpio cpp crates-graphics crates-gtk crates-io
   cross-base crypto cryptsetup cups curl cyrus-sasl databases
   datastructures dav debian dejagnu dictionaries disk django djvu dns
   docbook docker documentation ebook ed elf emacs emacs-xyz enchant
   enlightenment fabric-management fcitx file-systems finance firmware
   flex fltk fonts fontutils freedesktop freeipmi ftp game-development
   gawk gcc gd gdb geo gettext ghostscript gimp gl glib gnome gnome-xyz
   gnu-doc gnunet gnupg gnuzilla golang gpodder gps graphics graphviz
   groff groovy gsasl gstreamer gtk guile guile-xyz gv haskell
   haskell-apps haskell-check haskell-crypto haskell-web haskell-xyz
   hurd ibus icu4c image image-processing imagemagick inkscape iso-codes
   java java-compression javascript jemalloc jupyter kde kde-frameworks
   kde-internet kde-pim kde-plasma kerberos language less lesstif
   libcanberra libedit libevent libffi libftdi libidn libphidget
   libreoffice libunistring libusb linphone linux lirc lisp lisp-xyz
   llvm logging lsof lua mail man markup maths matrix maven
   maven-parent-pom mcrypt mes messaging mingw monitoring mono mp3 mpd
   mpi multiprecision music nano ncurses netpbm nettle networking nfs
   ninja node noweb nss ocaml ocr onc-rpc openbox opencl openldap
   openstack package-management parallel password-utils patchutils
   pciutils pcre pdf perl perl-check perl-compression perl-maths
   perl-web photo php plotutils polkit popt pretty-print protobuf
   pulseaudio python python-check python-compression python-crypto
   python-science python-web python-xyz qt rails rdesktop rdf readline
   rpc rrdtool rsync ruby rust rust-apps samba scanner scheme screen sdl
   search security-token selinux serialization shells slang speech
   sphinx spice sqlite ssh sssd suckless swig sync tcl telephony
   terminals tex texinfo text-editors textutils time tls tor uml upnp
   valgrind version-control video vim virtualization vpn vulkan w3m web
   web-browsers webkit wget wine wordnet wxwidgets xdisorg xfig xiph xml
   xorg)

There's also a second, much smaller SCC:

  (bioconductor bioinformatics cran graph machine-learning statistics)

> I don't want to spend my time manually finding loops in graphs,
> computers are better at that.
>
> Sadly I don't know when I'll have time to implement this, so if someone
> knows of a solution, they should not hesitate with sending a patch and
> making all our lives easier.

I've attached a script that I hacked up in 2014 to analyze the Guix
package module dependency graph.  Note the (chdir "gnu/packages") in the
middle of it, so it must be loaded from the top directory of the Guix
source code, and the REPL will be in "gnu/packages" after loading it.
Here's an example of its use:

--8<---------------cut here---------------start------------->8---
mhw@jojen ~/guix$ guile -l cycle-viewer.scm
Found the following non-trivial strongly-connected components:
(bioconductor
  bioinformatics
  cran
  graph
  statistics
  machine-learning)

(autotools
  perl
  base
[… 272 lines elided …]
  libftdi
  perl-maths)

GNU Guile 3.0.2
Copyright (C) 1995-2020 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (length non-trivial-sccs)
$1 = 2
scheme@(guile-user)> (map length non-trivial-sccs)
$2 = (6 277)
scheme@(guile-user)> (first non-trivial-sccs)
$3 = (bioconductor bioinformatics cran graph statistics machine-learning)
scheme@(guile-user)> (second non-trivial-sccs)
$4 = (autotools perl base acl attr gettext check bash compression …)
scheme@(guile-user)> ,pp (edges-within (first non-trivial-sccs))
$5 = ((bioconductor . statistics)
 (bioconductor . graph)
 (bioconductor . cran)
 (bioconductor . bioinformatics)
 (bioinformatics . statistics)
 (bioinformatics . machine-learning)
 (bioinformatics . graph)
 (bioinformatics . cran)
 (bioinformatics . bioconductor)
 (cran . statistics)
 (cran . machine-learning)
 (cran . graph)
 (cran . bioinformatics)
 (graph . statistics)
 (graph . cran)
 (graph . bioinformatics)
 (graph . bioconductor)
 (machine-learning . statistics)
 (machine-learning . cran)
 (statistics . machine-learning)
 (statistics . cran))
scheme@(guile-user)> (with-output-to-file "/tmp/BIO-SCC.dot" (lambda () (write-scc-dot (first non-trivial-sccs))))
scheme@(guile-user)> (with-output-to-file "/tmp/MAIN-SCC.dot" (lambda () (write-scc-dot (second non-trivial-sccs))))
scheme@(guile-user)> ,quit
--8<---------------cut here---------------end--------------->8---

If someone would like to polish this into a more usable tool, and
possibly integrate it into Guix, please feel free.

       Mark

-- 
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>.




^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#51021: detect loops in module/package graph
  2021-10-05  7:59 ` Mark H Weaver
@ 2021-10-05  8:03   ` Mark H Weaver
  2021-10-07 13:28   ` Ludovic Courtès
  1 sibling, 0 replies; 6+ messages in thread
From: Mark H Weaver @ 2021-10-05  8:03 UTC (permalink / raw)
  To: raingloom, 51021

[-- Attachment #1: Type: text/plain, Size: 143 bytes --]

Earlier, I wrote
> I've attached a script that I hacked up in 2014 to analyze the Guix
> package module dependency graph.

Here's the script:


[-- Attachment #2: cycle-viewer.scm --]
[-- Type: text/plain, Size: 6974 bytes --]

;;; cycle-viewer.scm: a Guix package module dependency graph analyzer
;;; Copyright (C) 2014  Mark H Weaver <mhw@netris.org>
;;;
;;; This program is free software: you can redistribute it and/or modify
;;; it under the terms of the GNU General Public License as published by
;;; the Free Software Foundation, either version 3 of the License, or
;;; (at your option) any later version.
;;;
;;; This program is distributed in the hope that it will be useful,
;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;;; GNU General Public License for more details.
;;;
;;; You should have received a copy of the GNU General Public License
;;; along with this program.  If not, see <http://www.gnu.org/licenses/>.

(use-modules (srfi srfi-1)
             (srfi srfi-26)
             (ice-9 match)
             (ice-9 ftw)
             (ice-9 pretty-print))

;;;
;;; Tarjan's strongly connected components algorithm
;;;
;;; Robert Tarjan, Depth-first search and linear graph algorithms.
;;; SIAM Journal on Computing, 1(2):146-160, 1972.
;;;
;;;
;;; vertices is the list of vertices, which may be any objects that
;;; can be distinguished using 'equal?'.
;;;
;;; edges is the list of edges, where each edge is a pair (w . v)
;;; representing the directed edge w => v, for vertices w and v.
;;;
;;; The return value is a list of the strongly-connected components,
;;; where each strongly-connected component (SCC) is represented as a
;;; list of the vertices it contains.  The returned SCCs are sorted in
;;; topological order.
;;;
(define (strongly-connected-components vertices edges)
  (define size (length vertices))
  (define vs (iota size))

  (define lookup
    (let ((t (make-hash-table size)))
      (for-each (cut hash-set! t <> <>) vertices vs)
      (cut hash-ref t <>)))

  (define name
    (let ((t (make-vector size #f)))
      (for-each (cut vector-set! t <> <>) vs vertices)
      (cut vector-ref t <>)))

  (define (vector-update! v i f)
    (vector-set! v i (f (vector-ref v i))))

  (define (compose f g) (lambda (x) (f (g x))))

  (define successors
    (let ((t (make-vector size '())))
      (for-each (lambda (v w) (vector-update! t v (cut cons w <>)))
                (map (compose lookup car) edges)
                (map (compose lookup cdr) edges))
      (cut vector-ref t <>)))

  (define new-index
    (let ((i -1))
      (lambda ()
        (set! i (+ i 1))
        i)))

  (define index-table (make-vector size #f))
  (define index       (cut vector-ref  index-table <>))
  (define set-index!  (cut vector-set! index-table <> <>))

  (define lowlink-table (make-vector size size))
  (define lowlink (cut vector-ref lowlink-table <>))
  (define (update-lowlink! v x)
    (if v (vector-update! lowlink-table v (cut min x <>))))

  (define done-table (make-bitvector size #f))
  (define done? (cut bitvector-ref  done-table <>))
  (define done! (cut bitvector-set! done-table <> #t))

  (define results '())
  (define pending '())

  (define (finalize! v)
    (let loop ((names '()) (p pending))
      (done! (car p))
      (cond ((eqv? v (car p))
             (set! pending (cdr p))
             (set! results (cons (cons (name v) names)
                                 results)))
            (else (loop (cons (name (car p))
                              names)
                        (cdr p))))))

  (let loop ((v #f) (ws vs) (stack '()))
    (cond ((pair? ws)
           (let ((w (car ws)))
             (cond ((index w) => (lambda (wi)
                                   (if (not (done? w))
                                       (update-lowlink! v wi))
                                   (loop v (cdr ws) stack)))
                   (else (let ((wi (new-index)))
                           (set-index! w wi)
                           (update-lowlink! w wi)
                           (set! pending (cons w pending))
                           (loop w (successors w)
                                 (cons (cons v (cdr ws))
                                       stack)))))))
          ((pair? stack)
           (if (and v (= (index v) (lowlink v)))
               (finalize! v))
           (update-lowlink! (caar stack) (lowlink v))
           (loop (caar stack) (cdar stack) (cdr stack)))
          (else results))))

(chdir "gnu/packages")

(define files (scandir "." (cut string-suffix? ".scm" <>)))
(define headers (map (cut call-with-input-file <> read)
                     files))
(define modules
  (filter-map
   (lambda (header)
     (match header
       (('define-module ('gnu 'packages name) . _) name)
       (('define-module module-name . _)
        (format (current-warning-port)
                "Warning: found unexpected module name ~S in gnu/packages/*.scm~%"
                module-name)
        #f)))
   headers))

(define dependencies
  (append-map
   (lambda (header)
     (match header
       (('define-module ('gnu 'packages module) . rest)
        (let loop ((rest rest)
                   (deps '()))
          (match rest
            (() deps)
            ((#:use-module ('gnu 'packages name) . rest)
             (loop rest `((,module . ,name) . ,deps)))
            ((#:use-module (('gnu 'packages name) . _) . rest)
             (loop rest `((,module . ,name) . ,deps)))
            ((#:use-module _ . rest)
             (loop rest deps))
            ((#:export _ . rest)
             (loop rest deps))
            ((#:autoload _ _ . rest)
             (loop rest deps)))))
       (('define-module module-name . _) '())))
   headers))

(define sccs (strongly-connected-components modules dependencies))

(define (non-trivial? scc)
  (not (= 1 (length scc))))

(define non-trivial-sccs (filter non-trivial? sccs))

(unless (null? non-trivial-sccs)
  (display "Found the following non-trivial strongly-connected components:")
  (newline)
  (for-each (lambda (scc)
              (pretty-print scc)
              (newline))
            non-trivial-sccs))

(define (edges-within vs)
  (filter (match-lambda
            ((a . b) (and (member a vs) (member b vs))))
          dependencies))

(define (edges-involving vs)
  (filter (match-lambda
            ((a . b) (or (member a vs) (member b vs))))
          dependencies))

(define (edges-from vs)
  (filter (match-lambda
            ((a . b) (member a vs)))
          dependencies))

(define (edges-to vs)
  (filter (match-lambda
            ((a . b) (member b vs)))
          dependencies))

(define (module-label module)
  (symbol->string module))

(define* (write-edges-dot edges #:optional (port (current-output-port)))
  (display "digraph {\n" port)
  (for-each (match-lambda
              ((a . b) (format port "  ~S -> ~S;\n"
                               (module-label a)
                               (module-label b))))
            edges)
  (display "}\n" port))

(define* (write-scc-dot scc #:optional (port (current-output-port)))
  (write-edges-dot (edges-within scc) port))


[-- Attachment #3: Type: text/plain, Size: 154 bytes --]


-- 
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#51021: detect loops in module/package graph
  2021-10-05  7:59 ` Mark H Weaver
  2021-10-05  8:03   ` Mark H Weaver
@ 2021-10-07 13:28   ` Ludovic Courtès
  2021-10-11  7:49     ` zimoun
  1 sibling, 1 reply; 6+ messages in thread
From: Ludovic Courtès @ 2021-10-07 13:28 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: 51021

Hi Mark,

Mark H Weaver <mhw@netris.org> skribis:

> raingloom <raingloom@riseup.net> writes:
>> I'll be short and blunt, currently it sucks big time whenever you have
>> a loop in your packages.
>
> Agreed.  I've been concerned about this problem since the early days of
> Guix.  See <https://bugs.gnu.org/18247>.
>
> Back in August 2014, there was a strongly connected component (SCC)
> containing 51 package modules:

Thanks for the analysis and for the updated patch!

Module cycles are something we allow and even rely on, so finding cycles
in itself is not necessarily helpful.  What would help is finding cyclic
top-level references, which are those that cause problems, but that’s
another story.

WDYT?

Now, I’m not sure if raingloom was talking about these cycles, or rather
about cycles in the package dependency graph?

Chris Baines proposed a patch a while back to report those, though I
can’t find it anymore.  IIRC, the difficulty was in making sure cycle
detection would not be too expensive, and in keeping a readable style.

Thanks,
Ludo’.




^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#51021: detect loops in module/package graph
  2021-10-07 13:28   ` Ludovic Courtès
@ 2021-10-11  7:49     ` zimoun
  2021-10-12  9:47       ` Ludovic Courtès
  0 siblings, 1 reply; 6+ messages in thread
From: zimoun @ 2021-10-11  7:49 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 51021

Hi Ludo,

On Thu, 7 Oct 2021 at 15:34, Ludovic Courtès <ludo@gnu.org> wrote:
> Mark H Weaver <mhw@netris.org> skribis:
> > raingloom <raingloom@riseup.net> writes:

> >> I'll be short and blunt, currently it sucks big time whenever you have
> >> a loop in your packages.
> >
> > Agreed.  I've been concerned about this problem since the early days of
> > Guix.  See <https://bugs.gnu.org/18247>.
> >
> > Back in August 2014, there was a strongly connected component (SCC)
> > containing 51 package modules:
>
> Thanks for the analysis and for the updated patch!
>
> Module cycles are something we allow and even rely on, so finding cycles
> in itself is not necessarily helpful.  What would help is finding cyclic
> top-level references, which are those that cause problems, but that’s
> another story.

What Mark had implemented [1] works for any directed graph.  What do
you mean by "top-level references"?

1: <https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm>

> Now, I’m not sure if raingloom was talking about these cycles, or rather
> about cycles in the package dependency graph?

Probably. ;-)

But the way to detect cycle could be applied to any graph that Guix
uses.  For instance, something totally irrelevant that I would never
think of: channels [2]. :-)

2: <http://issues.guix.gnu.org/issue/41069>

> Chris Baines proposed a patch a while back to report those, though I
> can’t find it anymore.  IIRC, the difficulty was in making sure cycle
> detection would not be too expensive, and in keeping a readable style.

From my memories about Graph Theory, the algorithm Mark is proposing
is an efficient way to detect cycles (cycle is strong connected
component).  BTW, detect cycle is (almost) the same complexity as
topological sort; since cycles are obstacle for topological sort to
exist, somehow.  I remember too the Chris's proposal but I do not
remember what they implemented.

I do not understand what you mean by "keeping a readable style".

All the best,
simon




^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#51021: detect loops in module/package graph
  2021-10-11  7:49     ` zimoun
@ 2021-10-12  9:47       ` Ludovic Courtès
  0 siblings, 0 replies; 6+ messages in thread
From: Ludovic Courtès @ 2021-10-12  9:47 UTC (permalink / raw)
  To: zimoun; +Cc: 51021

Hi,

zimoun <zimon.toutoune@gmail.com> skribis:

> What Mark had implemented [1] works for any directed graph.  What do
> you mean by "top-level references"?

Reference to variables coming from one module of an SCC that appear at
the top level of another module in the SCC.

>> Chris Baines proposed a patch a while back to report those, though I
>> can’t find it anymore.  IIRC, the difficulty was in making sure cycle
>> detection would not be too expensive, and in keeping a readable style.
>
> From my memories about Graph Theory, the algorithm Mark is proposing
> is an efficient way to detect cycles

What I meant is that ‘package-derivation’ traverses the package graph,
so it’s a natural place to add cycle detection.

But since ‘package-derivation’ is so central, care must be taken about
the performance hit and about its readability.

Thanks,
Ludo’.




^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-10-12  9:56 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-10-05  0:58 bug#51021: detect loops in module/package graph raingloom
2021-10-05  7:59 ` Mark H Weaver
2021-10-05  8:03   ` Mark H Weaver
2021-10-07 13:28   ` Ludovic Courtès
2021-10-11  7:49     ` zimoun
2021-10-12  9:47       ` Ludovic Courtès

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).