2016-07-25 23:26 GMT+02:00 Ludovic Courtès <ludo@gnu.org>:
Hello!

Jelle Licht <jlicht@fsfe.org> skribis:

> On Ludo's advice, I snarfed Ricardo's recursive importer and bolted it
> on my npm importer. After leaving the importer running for a quite
> some hours (and making it more robust in the face of inconsistent npm
> information), it turns out that jQuery has a direct or indirect
> dependcy on about everything. We are talking pretty much all of the
> build systems, all of the testing frameworks and all of the test
> runners. Literally thousands of packages, and multiple (conflicting)
> versions of most.

I’m really impressed that your importer can already grovel this much!
In itself, that’s already a significant achievement, despite the
frustration of not getting “guix package -i jquery” right away.

Do you have figures on the number of vertices and edges on this graph?
Could it be that the recursive importer keeps revisiting the same nodes
over and over again?  :-)

 
I would suggest publishing the code somewhere so others can try to
import their favorite JS library and give feedback.


I'd like to indicate that in the Guix code base there's a function visiting the graph of the dependencies of a package with the concern of covering it all AND of not considering the same node more than one time

It's in guix/packages.scm on line 552, it's called "transitive-inputs" and takes as an arguments the inputs of a single package.

It could be used as a blueprint for such task of dependencies graphs covering.

The main observation that I can do is that both versions do check whether a package being considered is already been seen in a previous passage.

So, it doesn't seem to me that Jelle's version could revisit the same package (node) over and over

Also, the "official" version clearly distinguishes between the current depth level in the graph and the next one, using 2 different variables for containing the packages (nodes) at the current depth and those in the next level ("inputs" and "propagated") as it does a breadth first visit.

Instead the Jelle's version uses the same variable for the current AND the next level and it's a list and it conses the nodes for the next level on the current list

So it seems to me that it does a depth first visit (I might be wrong, I didn't do a complete trace)

If anything, the "official" version uses a VHash to check if a package has already been "seen" while Jelle's uses a plain list (again)

So the cost of this check is constant in the official version and linear in Jelle's version.

Further, in Jelle's version, every package gets checked against the list of already imported packages (as a plain list) AND against the packages already present in the store.

Both checks at every iteration. It seems to me that there's space for improving here ;-)

In fact, in Jelle's guix/import/npm.scm, on line 462 the "and" could be switched to an "or". It would work anyway and it could save a handful of checks on a complete run of the algorithm

Anyway, I tried to "import" a random package and it has a ton of dependencies.

It seems to me that a more systematic approach (like that of a real data scientist ;-) ) could help here

The whole graph should be imported in some database and then graph algorithms should be run on it

For example: which are the packages with less or no dependencies (and a lot of dependants) ?
Because those should be imported first, in my opinion.

Jelle came to the notion that testing frameworks are a dependencies for almost all the packages but as far as I understand that's a quite empirical knowledge.
 
> I am currently hovering between version 0.6 and 0.7, which can
> properly recompile itself and slightly more contemporary
> version. Getting to version 1.6 from June 2013 should be doable using
> this exact same approach.  This will allow us to package a 2014
> version of the Mocha testing framework.

Impressive.

Indeed. Impressive.

Ok, so these are the first impressions that I came up with. I'd like my observations on the algorithms to be vetted by more experienced people.

Just to help me verify if I can actually understand someone's else scheme code

Thanks