Quick reply from my phone, but thanks for the feedback. On Jul 29, 2016 16:53, "Catonano" wrote: > 2016-07-25 23:26 GMT+02:00 Ludovic Courtès : > >> Hello! >> >> Jelle Licht skribis: >> >> > On Ludo's advice, I snarfed Ricardo's recursive importer and bolted it >> > on my npm importer. After leaving the importer running for a quite >> > some hours (and making it more robust in the face of inconsistent npm >> > information), it turns out that jQuery has a direct or indirect >> > dependcy on about everything. We are talking pretty much all of the >> > build systems, all of the testing frameworks and all of the test >> > runners. Literally thousands of packages, and multiple (conflicting) >> > versions of most. >> >> I’m really impressed that your importer can already grovel this much! >> In itself, that’s already a significant achievement, despite the >> frustration of not getting “guix package -i jquery” right away. >> >> Do you have figures on the number of vertices and edges on this graph? >> Could it be that the recursive importer keeps revisiting the same nodes >> over and over again? :-) >> > > > >> I would suggest publishing the code somewhere so others can try to >> import their favorite JS library and give feedback. >> >> > I'd like to indicate that in the Guix code base there's a function > visiting the graph of the dependencies of a package with the concern of > covering it all AND of not considering the same node more than one time > > It's in guix/packages.scm on line 552, it's called "transitive-inputs" and > takes as an arguments the inputs of a single package. > > It could be used as a blueprint for such task of dependencies graphs > covering. > > The main observation that I can do is that both versions do check whether > a package being considered is already been seen in a previous passage. > > So, it doesn't seem to me that Jelle's version could revisit the same > package (node) over and over > > Also, the "official" version clearly distinguishes between the current > depth level in the graph and the next one, using 2 different variables for > containing the packages (nodes) at the current depth and those in the next > level ("inputs" and "propagated") as it does a breadth first visit. > > Instead the Jelle's version uses the same variable for the current AND the > next level and it's a list and it conses the nodes for the next level on > the current list > > So it seems to me that it does a depth first visit (I might be wrong, I > didn't do a complete trace) > > If anything, the "official" version uses a VHash to check if a package has > already been "seen" while Jelle's uses a plain list (again) > > So the cost of this check is constant in the official version and linear > in Jelle's version. > > Further, in Jelle's version, every package gets checked against the list > of already imported packages (as a plain list) AND against the packages > already present in the store. > > Both checks at every iteration. It seems to me that there's space for > improving here ;-) > > In fact, in Jelle's guix/import/npm.scm, on line 462 the "and" could be > switched to an "or". It would work anyway and it could save a handful of > checks on a complete run of the algorithm > You are totally right here! The quick and dirty way of doing it, which I am currently doing, is to wrap these checks in a memoized function. I'll take some of your points into account to properly rewrite this part. > > Anyway, I tried to "import" a random package and it has a ton of > dependencies. > Story of my life these past few weeks :-) > > It seems to me that a more systematic approach (like that of a real data > scientist ;-) ) could help here > > > The whole graph should be imported in some database and then graph > algorithms should be run on it > > For example: which are the packages with less or no dependencies (and a > lot of dependants) ? > Because those should be imported first, in my opinion. > > Jelle came to the notion that testing frameworks are a dependencies for > almost all the packages but as far as I understand that's a quite empirical > knowledge. > ^ This, I like. Does anyone have any suggestions on tools that could help me do this in guile? I know of projects like [1] that already do something similar, although I do no have any experience with constructing and querying graphs using free software. > > >> > I am currently hovering between version 0.6 and 0.7, which can >> > properly recompile itself and slightly more contemporary >> > version. Getting to version 1.6 from June 2013 should be doable using >> > this exact same approach. This will allow us to package a 2014 >> > version of the Mocha testing framework. >> >> Impressive. >> > > Indeed. Impressive. > > Ok, so these are the first impressions that I came up with. I'd like my > observations on the algorithms to be vetted by more experienced people. > > Just to help me verify if I can actually understand someone's else scheme > code > > Thanks > I think you definitely got the gist of it. Thanks - Jelle [1]: https://exploringdata.github.io/info/npm-packages-dependencies/