Quick reply from my phone, but thanks for the feedback.

On Jul 29, 2016 16:53, "Catonano" <catonano@gmail.com> wrote:

> 2016-07-25 23:26 GMT+02:00 Ludovic Courtès <ludo@gnu.org>:
>
>> Hello!
>>
>> Jelle Licht <jlicht@fsfe.org> skribis:
>>
>> > On Ludo's advice, I snarfed Ricardo's recursive importer and bolted it
>> > on my npm importer. After leaving the importer running for a quite
>> > some hours (and making it more robust in the face of inconsistent npm
>> > information), it turns out that jQuery has a direct or indirect
>> > dependcy on about everything. We are talking pretty much all of the
>> > build systems, all of the testing frameworks and all of the test
>> > runners. Literally thousands of packages, and multiple (conflicting)
>> > versions of most.
>>
>> I’m really impressed that your importer can already grovel this much!
>> In itself, that’s already a significant achievement, despite the
>> frustration of not getting “guix package -i jquery” right away.
>>
>> Do you have figures on the number of vertices and edges on this graph?
>> Could it be that the recursive importer keeps revisiting the same nodes
>> over and over again?  :-)
>>
>
>
>
>> I would suggest publishing the code somewhere so others can try to
>> import their favorite JS library and give feedback.
>>
>>
> I'd like to indicate that in the Guix code base there's a function
> visiting the graph of the dependencies of a package with the concern of
> covering it all AND of not considering the same node more than one time
>
> It's in guix/packages.scm on line 552, it's called "transitive-inputs" and
> takes as an arguments the inputs of a single package.
>
> It could be used as a blueprint for such task of dependencies graphs
> covering.
>
> The main observation that I can do is that both versions do check whether
> a package being considered is already been seen in a previous passage.
>
> So, it doesn't seem to me that Jelle's version could revisit the same
> package (node) over and over
>
> Also, the "official" version clearly distinguishes between the current
> depth level in the graph and the next one, using 2 different variables for
> containing the packages (nodes) at the current depth and those in the next
> level ("inputs" and "propagated") as it does a breadth first visit.
>
> Instead the Jelle's version uses the same variable for the current AND the
> next level and it's a list and it conses the nodes for the next level on
> the current list
>
> So it seems to me that it does a depth first visit (I might be wrong, I
> didn't do a complete trace)
>
> If anything, the "official" version uses a VHash to check if a package has
> already been "seen" while Jelle's uses a plain list (again)
>
> So the cost of this check is constant in the official version and linear
> in Jelle's version.
>
> Further, in Jelle's version, every package gets checked against the list
> of already imported packages (as a plain list) AND against the packages
> already present in the store.
>
> Both checks at every iteration. It seems to me that there's space for
> improving here ;-)
>
> In fact, in Jelle's guix/import/npm.scm, on line 462 the "and" could be
> switched to an "or". It would work anyway and it could save a handful of
> checks on a complete run of the algorithm
>

You are totally right here! The quick and dirty way of doing it, which I am
currently doing,
is to wrap these checks in a memoized function.
I'll take some of your points into account to properly rewrite this part.


>
> Anyway, I tried to "import" a random package and it has a ton of
> dependencies.
>

Story of my life these past few weeks :-)


>
> It seems to me that a more systematic approach (like that of a real data
> scientist ;-) ) could help here
>
>

> The whole graph should be imported in some database and then graph
> algorithms should be run on it
>
> For example: which are the packages with less or no dependencies (and a
> lot of dependants) ?
> Because those should be imported first, in my opinion.
>
> Jelle came to the notion that testing frameworks are a dependencies for
> almost all the packages but as far as I understand that's a quite empirical
> knowledge.
>

^ This, I like. Does anyone have any suggestions on tools that could help
me do this in guile?
I know of projects like [1] that already do something similar, although I
do no have any experience
with constructing and querying graphs using free software.

>
>
>> > I am currently hovering between version 0.6 and 0.7, which can
>> > properly recompile itself and slightly more contemporary
>> > version. Getting to version 1.6 from June 2013 should be doable using
>> > this exact same approach.  This will allow us to package a 2014
>> > version of the Mocha testing framework.
>>
>> Impressive.
>>
>
> Indeed. Impressive.
>
> Ok, so these are the first impressions that I came up with. I'd like my
> observations on the algorithms to be vetted by more experienced people.
>
> Just to help me verify if I can actually understand someone's else scheme
> code
>
> Thanks
>

I think you definitely got the gist of it.

Thanks
- Jelle

[1]: https://exploringdata.github.io/info/npm-packages-dependencies/