From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jelle Licht Subject: Re: [GSoC update] Npm & guix Date: Fri, 29 Jul 2016 17:26:53 +0200 Message-ID: References: <8737mxflz4.fsf@gnu.org> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary=001a114b32cc56cf8d0538c7e180 Return-path: Received: from eggs.gnu.org ([2001:4830:134:3::10]:50416) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bT9go-0000IP-69 for guix-devel@gnu.org; Fri, 29 Jul 2016 11:27:08 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bT9gk-0000EK-UU for guix-devel@gnu.org; Fri, 29 Jul 2016 11:27:06 -0400 In-Reply-To: List-Id: "Development of GNU Guix and the GNU System distribution." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org Sender: "Guix-devel" To: Catonano Cc: guix-devel --001a114b32cc56cf8d0538c7e180 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Quick reply from my phone, but thanks for the feedback. On Jul 29, 2016 16:53, "Catonano" wrote: > 2016-07-25 23:26 GMT+02:00 Ludovic Court=C3=A8s : > >> Hello! >> >> Jelle Licht skribis: >> >> > On Ludo's advice, I snarfed Ricardo's recursive importer and bolted it >> > on my npm importer. After leaving the importer running for a quite >> > some hours (and making it more robust in the face of inconsistent npm >> > information), it turns out that jQuery has a direct or indirect >> > dependcy on about everything. We are talking pretty much all of the >> > build systems, all of the testing frameworks and all of the test >> > runners. Literally thousands of packages, and multiple (conflicting) >> > versions of most. >> >> I=E2=80=99m really impressed that your importer can already grovel this = much! >> In itself, that=E2=80=99s already a significant achievement, despite the >> frustration of not getting =E2=80=9Cguix package -i jquery=E2=80=9D righ= t away. >> >> Do you have figures on the number of vertices and edges on this graph? >> Could it be that the recursive importer keeps revisiting the same nodes >> over and over again? :-) >> > > > >> I would suggest publishing the code somewhere so others can try to >> import their favorite JS library and give feedback. >> >> > I'd like to indicate that in the Guix code base there's a function > visiting the graph of the dependencies of a package with the concern of > covering it all AND of not considering the same node more than one time > > It's in guix/packages.scm on line 552, it's called "transitive-inputs" an= d > takes as an arguments the inputs of a single package. > > It could be used as a blueprint for such task of dependencies graphs > covering. > > The main observation that I can do is that both versions do check whether > a package being considered is already been seen in a previous passage. > > So, it doesn't seem to me that Jelle's version could revisit the same > package (node) over and over > > Also, the "official" version clearly distinguishes between the current > depth level in the graph and the next one, using 2 different variables fo= r > containing the packages (nodes) at the current depth and those in the nex= t > level ("inputs" and "propagated") as it does a breadth first visit. > > Instead the Jelle's version uses the same variable for the current AND th= e > next level and it's a list and it conses the nodes for the next level on > the current list > > So it seems to me that it does a depth first visit (I might be wrong, I > didn't do a complete trace) > > If anything, the "official" version uses a VHash to check if a package ha= s > already been "seen" while Jelle's uses a plain list (again) > > So the cost of this check is constant in the official version and linear > in Jelle's version. > > Further, in Jelle's version, every package gets checked against the list > of already imported packages (as a plain list) AND against the packages > already present in the store. > > Both checks at every iteration. It seems to me that there's space for > improving here ;-) > > In fact, in Jelle's guix/import/npm.scm, on line 462 the "and" could be > switched to an "or". It would work anyway and it could save a handful of > checks on a complete run of the algorithm > You are totally right here! The quick and dirty way of doing it, which I am currently doing, is to wrap these checks in a memoized function. I'll take some of your points into account to properly rewrite this part. > > Anyway, I tried to "import" a random package and it has a ton of > dependencies. > Story of my life these past few weeks :-) > > It seems to me that a more systematic approach (like that of a real data > scientist ;-) ) could help here > > > The whole graph should be imported in some database and then graph > algorithms should be run on it > > For example: which are the packages with less or no dependencies (and a > lot of dependants) ? > Because those should be imported first, in my opinion. > > Jelle came to the notion that testing frameworks are a dependencies for > almost all the packages but as far as I understand that's a quite empiric= al > knowledge. > ^ This, I like. Does anyone have any suggestions on tools that could help me do this in guile? I know of projects like [1] that already do something similar, although I do no have any experience with constructing and querying graphs using free software. > > >> > I am currently hovering between version 0.6 and 0.7, which can >> > properly recompile itself and slightly more contemporary >> > version. Getting to version 1.6 from June 2013 should be doable using >> > this exact same approach. This will allow us to package a 2014 >> > version of the Mocha testing framework. >> >> Impressive. >> > > Indeed. Impressive. > > Ok, so these are the first impressions that I came up with. I'd like my > observations on the algorithms to be vetted by more experienced people. > > Just to help me verify if I can actually understand someone's else scheme > code > > Thanks > I think you definitely got the gist of it. Thanks - Jelle [1]: https://exploringdata.github.io/info/npm-packages-dependencies/ --001a114b32cc56cf8d0538c7e180 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

Quick reply from my phone, but thanks for t= he feedback.


On Jul 29, 2016 1= 6:53, "Catonano" <catonano@gmail.com> wrote:
2016-07-25 23= :26 GMT+02:00 Ludovic Court=C3=A8s <ludo@gnu.org>:
Hello!

Jelle Licht <jlicht= @fsfe.org> skribis:

> On Ludo's advice, I snarfed Ricardo's recursive importer and b= olted it
> on my npm importer. After leaving the importer running for a quite
> some hours (and making it more robust in the face of inconsistent npm<= br> > information), it turns out that jQuery has a direct or indirect
> dependcy on about everything. We are talking pretty much all of the > build systems, all of the testing frameworks and all of the test
> runners. Literally thousands of packages, and multiple (conflicting) > versions of most.

I=E2=80=99m really impressed that your importer can already grovel t= his much!
In itself, that=E2=80=99s already a significant achievement, despite the frustration of not getting =E2=80=9Cguix package -i jquery=E2=80=9D right a= way.

Do you have figures on the number of vertices and edges on this graph?
Could it be that the recursive importer keeps revisiting the same nodes
over and over again?=C2=A0 :-)

=C2=A0
I would suggest publishing the code somewhere so others can try to
import their favorite JS library and give feedback.


I'd like to indicate that i= n the Guix code base there's a function=20 visiting the graph of the dependencies of a package with the concern of=20 covering it all AND of not considering the same node more than one time
=
It's in guix/packages.scm on line 552, it's called &= quot;transitive-inputs" and takes as an arguments the inputs of a sing= le package.

It could be used as a blueprint for such task= of dependencies graphs covering.

The main observation that I can do is that both versions do check whether a package being considered is already been seen in a previous passage.

So, it doesn't seem to me that Jelle's vers= ion could revisit the same package (node) over and over

A= lso, the "official" version clearly distinguishes between the current= depth=20 level in the graph and the next one, using 2 different variables for=20 containing the packages (nodes) at the current depth and those in the=20 next level ("inputs" and "propagated") as it does a bre= adth first visit.

Instead the=20 Jelle's version uses the same variable for the current AND the next=20 level and it's a list and it conses the nodes for the next level on the= =20 current list

So it seems to me that it does a depth first= visit (I might be wrong, I didn't do a complete trace)

If anything, the "official" version uses a VHash to check if a= package has already been "seen" while Jelle's uses a plain l= ist (again)

So the cost of this check is constant in the = official version and linear in Jelle's version.

Further, in Jelle's version, every package gets checked against the list of=20 already imported packages (as a plain list) AND against the packages=20 already present in the store.

Both checks at every iterat= ion. It seems to me that there's space for improving here ;-)

In fact, in Jelle's guix/import/npm.scm, on line 462 the "and" could be switched to an "or". It= would work=20 anyway and it could save a handful of checks on a complete run of the=20 algorithm

You are to= tally right here! The quick and dirty way of doing it, which I am currently= doing,
is to wrap these checks in a memoized function.
I'll tak= e some of your points into account to properly rewrite this part.
=
=C2=A0
Anyway, I tried to "import" a random package and it has a = ton of dependencies.
<= div>
Story of my life these past few weeks :-)
= =C2=A0
It seems to me that a more systematic approach (like that of a= real data scientist ;-) ) could help here
=C2=A0

The whole graph should be imported in some datab= ase and then graph algorithms should be run on it

For exa= mple: which are the packages with less or no dependencies (and a lot of dep= endants) ?
Because those should be imported first, in my opin= ion.

Jelle came to the notion that testing frameworks are= a dependencies for almost all the packages but as far as I understand that= 's a quite empirical knowledge.
=

^ This, I like. Does anyone have any sugge= stions on tools that could help me do this in guile?
I know o= f projects like [1] that already do something similar, although I do no hav= e any experience
with constructing and querying graphs using = free software.
<= div dir=3D"ltr">
=
=C2=A0
> I am currently hovering between version 0.6 and 0.7, which can
> properly recompile itself and slightly more contemporary
> version. Getting to version 1.6 from June 2013 should be doable using<= br> > this exact same approach.=C2=A0 This will allow us to package a 2014 > version of the Mocha testing framework.

Impressive.

Indeed. Impressive. =

Ok, so these are the first impressions that I came up with. I'd like my=20 observations on the algorithms to be vetted by more experienced people.
=
Just to help me verify if I can actually understand someone&= #39;s else scheme code

Thanks
=

I think you definitely got the gist = of it.

Thanks
- Jelle
--001a114b32cc56cf8d0538c7e180--