From: Jelle Licht <jlicht@fsfe.org>
To: Catonano <catonano@gmail.com>
Cc: guix-devel <guix-devel@gnu.org>
Subject: Re: [GSoC update] Npm & guix
Date: Fri, 29 Jul 2016 17:26:53 +0200 [thread overview]
Message-ID: <CAPsKtfL5YT3Jz0WJOEk3gz=V5ANvBYc6L8sBeeNJeu+XbuUjUQ@mail.gmail.com> (raw)
In-Reply-To: <CAJ98PDyMSRMrkKmiR0WcDe9-htjXnuBPsO+vFK3isuepK8qvWQ@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 5341 bytes --]
Quick reply from my phone, but thanks for the feedback.
On Jul 29, 2016 16:53, "Catonano" <catonano@gmail.com> wrote:
> 2016-07-25 23:26 GMT+02:00 Ludovic Courtès <ludo@gnu.org>:
>
>> Hello!
>>
>> Jelle Licht <jlicht@fsfe.org> skribis:
>>
>> > On Ludo's advice, I snarfed Ricardo's recursive importer and bolted it
>> > on my npm importer. After leaving the importer running for a quite
>> > some hours (and making it more robust in the face of inconsistent npm
>> > information), it turns out that jQuery has a direct or indirect
>> > dependcy on about everything. We are talking pretty much all of the
>> > build systems, all of the testing frameworks and all of the test
>> > runners. Literally thousands of packages, and multiple (conflicting)
>> > versions of most.
>>
>> I’m really impressed that your importer can already grovel this much!
>> In itself, that’s already a significant achievement, despite the
>> frustration of not getting “guix package -i jquery” right away.
>>
>> Do you have figures on the number of vertices and edges on this graph?
>> Could it be that the recursive importer keeps revisiting the same nodes
>> over and over again? :-)
>>
>
>
>
>> I would suggest publishing the code somewhere so others can try to
>> import their favorite JS library and give feedback.
>>
>>
> I'd like to indicate that in the Guix code base there's a function
> visiting the graph of the dependencies of a package with the concern of
> covering it all AND of not considering the same node more than one time
>
> It's in guix/packages.scm on line 552, it's called "transitive-inputs" and
> takes as an arguments the inputs of a single package.
>
> It could be used as a blueprint for such task of dependencies graphs
> covering.
>
> The main observation that I can do is that both versions do check whether
> a package being considered is already been seen in a previous passage.
>
> So, it doesn't seem to me that Jelle's version could revisit the same
> package (node) over and over
>
> Also, the "official" version clearly distinguishes between the current
> depth level in the graph and the next one, using 2 different variables for
> containing the packages (nodes) at the current depth and those in the next
> level ("inputs" and "propagated") as it does a breadth first visit.
>
> Instead the Jelle's version uses the same variable for the current AND the
> next level and it's a list and it conses the nodes for the next level on
> the current list
>
> So it seems to me that it does a depth first visit (I might be wrong, I
> didn't do a complete trace)
>
> If anything, the "official" version uses a VHash to check if a package has
> already been "seen" while Jelle's uses a plain list (again)
>
> So the cost of this check is constant in the official version and linear
> in Jelle's version.
>
> Further, in Jelle's version, every package gets checked against the list
> of already imported packages (as a plain list) AND against the packages
> already present in the store.
>
> Both checks at every iteration. It seems to me that there's space for
> improving here ;-)
>
> In fact, in Jelle's guix/import/npm.scm, on line 462 the "and" could be
> switched to an "or". It would work anyway and it could save a handful of
> checks on a complete run of the algorithm
>
You are totally right here! The quick and dirty way of doing it, which I am
currently doing,
is to wrap these checks in a memoized function.
I'll take some of your points into account to properly rewrite this part.
>
> Anyway, I tried to "import" a random package and it has a ton of
> dependencies.
>
Story of my life these past few weeks :-)
>
> It seems to me that a more systematic approach (like that of a real data
> scientist ;-) ) could help here
>
>
> The whole graph should be imported in some database and then graph
> algorithms should be run on it
>
> For example: which are the packages with less or no dependencies (and a
> lot of dependants) ?
> Because those should be imported first, in my opinion.
>
> Jelle came to the notion that testing frameworks are a dependencies for
> almost all the packages but as far as I understand that's a quite empirical
> knowledge.
>
^ This, I like. Does anyone have any suggestions on tools that could help
me do this in guile?
I know of projects like [1] that already do something similar, although I
do no have any experience
with constructing and querying graphs using free software.
>
>
>> > I am currently hovering between version 0.6 and 0.7, which can
>> > properly recompile itself and slightly more contemporary
>> > version. Getting to version 1.6 from June 2013 should be doable using
>> > this exact same approach. This will allow us to package a 2014
>> > version of the Mocha testing framework.
>>
>> Impressive.
>>
>
> Indeed. Impressive.
>
> Ok, so these are the first impressions that I came up with. I'd like my
> observations on the algorithms to be vetted by more experienced people.
>
> Just to help me verify if I can actually understand someone's else scheme
> code
>
> Thanks
>
I think you definitely got the gist of it.
Thanks
- Jelle
[1]: https://exploringdata.github.io/info/npm-packages-dependencies/
[-- Attachment #2: Type: text/html, Size: 8088 bytes --]
next prev parent reply other threads:[~2016-07-29 15:27 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-07-24 1:06 [GSoC update] Npm & guix Jelle Licht
2016-07-25 8:41 ` Andy Wingo
2016-07-25 11:33 ` Jan Nieuwenhuizen
2016-07-25 21:26 ` Ludovic Courtès
2016-07-29 14:53 ` Catonano
2016-07-29 15:26 ` Jelle Licht [this message]
2016-07-30 22:19 ` Package graph queries Ludovic Courtès
2016-08-11 4:37 ` Catonano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://guix.gnu.org/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAPsKtfL5YT3Jz0WJOEk3gz=V5ANvBYc6L8sBeeNJeu+XbuUjUQ@mail.gmail.com' \
--to=jlicht@fsfe.org \
--cc=catonano@gmail.com \
--cc=guix-devel@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/guix.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).