unofficial mirror of guix-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Jelle Licht <jlicht@fsfe.org>
To: Catonano <catonano@gmail.com>
Cc: guix-devel <guix-devel@gnu.org>
Subject: Re: [GSoC update] Npm & guix
Date: Fri, 29 Jul 2016 17:26:53 +0200	[thread overview]
Message-ID: <CAPsKtfL5YT3Jz0WJOEk3gz=V5ANvBYc6L8sBeeNJeu+XbuUjUQ@mail.gmail.com> (raw)
In-Reply-To: <CAJ98PDyMSRMrkKmiR0WcDe9-htjXnuBPsO+vFK3isuepK8qvWQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 5341 bytes --]

Quick reply from my phone, but thanks for the feedback.

On Jul 29, 2016 16:53, "Catonano" <catonano@gmail.com> wrote:

> 2016-07-25 23:26 GMT+02:00 Ludovic Courtès <ludo@gnu.org>:
>
>> Hello!
>>
>> Jelle Licht <jlicht@fsfe.org> skribis:
>>
>> > On Ludo's advice, I snarfed Ricardo's recursive importer and bolted it
>> > on my npm importer. After leaving the importer running for a quite
>> > some hours (and making it more robust in the face of inconsistent npm
>> > information), it turns out that jQuery has a direct or indirect
>> > dependcy on about everything. We are talking pretty much all of the
>> > build systems, all of the testing frameworks and all of the test
>> > runners. Literally thousands of packages, and multiple (conflicting)
>> > versions of most.
>>
>> I’m really impressed that your importer can already grovel this much!
>> In itself, that’s already a significant achievement, despite the
>> frustration of not getting “guix package -i jquery” right away.
>>
>> Do you have figures on the number of vertices and edges on this graph?
>> Could it be that the recursive importer keeps revisiting the same nodes
>> over and over again?  :-)
>>
>
>
>
>> I would suggest publishing the code somewhere so others can try to
>> import their favorite JS library and give feedback.
>>
>>
> I'd like to indicate that in the Guix code base there's a function
> visiting the graph of the dependencies of a package with the concern of
> covering it all AND of not considering the same node more than one time
>
> It's in guix/packages.scm on line 552, it's called "transitive-inputs" and
> takes as an arguments the inputs of a single package.
>
> It could be used as a blueprint for such task of dependencies graphs
> covering.
>
> The main observation that I can do is that both versions do check whether
> a package being considered is already been seen in a previous passage.
>
> So, it doesn't seem to me that Jelle's version could revisit the same
> package (node) over and over
>
> Also, the "official" version clearly distinguishes between the current
> depth level in the graph and the next one, using 2 different variables for
> containing the packages (nodes) at the current depth and those in the next
> level ("inputs" and "propagated") as it does a breadth first visit.
>
> Instead the Jelle's version uses the same variable for the current AND the
> next level and it's a list and it conses the nodes for the next level on
> the current list
>
> So it seems to me that it does a depth first visit (I might be wrong, I
> didn't do a complete trace)
>
> If anything, the "official" version uses a VHash to check if a package has
> already been "seen" while Jelle's uses a plain list (again)
>
> So the cost of this check is constant in the official version and linear
> in Jelle's version.
>
> Further, in Jelle's version, every package gets checked against the list
> of already imported packages (as a plain list) AND against the packages
> already present in the store.
>
> Both checks at every iteration. It seems to me that there's space for
> improving here ;-)
>
> In fact, in Jelle's guix/import/npm.scm, on line 462 the "and" could be
> switched to an "or". It would work anyway and it could save a handful of
> checks on a complete run of the algorithm
>

You are totally right here! The quick and dirty way of doing it, which I am
currently doing,
is to wrap these checks in a memoized function.
I'll take some of your points into account to properly rewrite this part.


>
> Anyway, I tried to "import" a random package and it has a ton of
> dependencies.
>

Story of my life these past few weeks :-)


>
> It seems to me that a more systematic approach (like that of a real data
> scientist ;-) ) could help here
>
>

> The whole graph should be imported in some database and then graph
> algorithms should be run on it
>
> For example: which are the packages with less or no dependencies (and a
> lot of dependants) ?
> Because those should be imported first, in my opinion.
>
> Jelle came to the notion that testing frameworks are a dependencies for
> almost all the packages but as far as I understand that's a quite empirical
> knowledge.
>

^ This, I like. Does anyone have any suggestions on tools that could help
me do this in guile?
I know of projects like [1] that already do something similar, although I
do no have any experience
with constructing and querying graphs using free software.

>
>
>> > I am currently hovering between version 0.6 and 0.7, which can
>> > properly recompile itself and slightly more contemporary
>> > version. Getting to version 1.6 from June 2013 should be doable using
>> > this exact same approach.  This will allow us to package a 2014
>> > version of the Mocha testing framework.
>>
>> Impressive.
>>
>
> Indeed. Impressive.
>
> Ok, so these are the first impressions that I came up with. I'd like my
> observations on the algorithms to be vetted by more experienced people.
>
> Just to help me verify if I can actually understand someone's else scheme
> code
>
> Thanks
>

I think you definitely got the gist of it.

Thanks
- Jelle

[1]: https://exploringdata.github.io/info/npm-packages-dependencies/

[-- Attachment #2: Type: text/html, Size: 8088 bytes --]

  reply	other threads:[~2016-07-29 15:27 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-24  1:06 [GSoC update] Npm & guix Jelle Licht
2016-07-25  8:41 ` Andy Wingo
2016-07-25 11:33 ` Jan Nieuwenhuizen
2016-07-25 21:26 ` Ludovic Courtès
2016-07-29 14:53   ` Catonano
2016-07-29 15:26     ` Jelle Licht [this message]
2016-07-30 22:19       ` Package graph queries Ludovic Courtès
2016-08-11  4:37         ` Catonano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPsKtfL5YT3Jz0WJOEk3gz=V5ANvBYc6L8sBeeNJeu+XbuUjUQ@mail.gmail.com' \
    --to=jlicht@fsfe.org \
    --cc=catonano@gmail.com \
    --cc=guix-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).