From mboxrd@z Thu Jan 1 00:00:00 1970
From: swedebugia
Subject: Re: Improved NPM importer with blacklist
Date: Fri, 30 Nov 2018 18:20:42 +0100
Message-ID: <9207d4fe-8c6e-fd7d-0587-0a44fb9eb976@riseup.net>
References: <70F182DB-C157-4763-A4C6-89985545661C@lepiller.eu>
 <12fdf913-eb03-b898-f9ff-8dd455935975@riseup.net>
 <0cfa10d0f59225c3897d4fc004722ee2@lepiller.eu>
In-Reply-To: <0cfa10d0f59225c3897d4fc004722ee2@lepiller.eu>
Mime-Version: 1.0
Content-Type: multipart/mixed;
 boundary="------------703C54EDBA8B69833FF207CF"
Content-Language: en-US
List-Id: "Development of GNU Guix and the GNU System distribution."
Sender: "Guix-devel"
To: Julien Lepiller
Cc: guix-devel@gnu.org

This is a multi-part message in MIME format.
--------------703C54EDBA8B69833FF207CF
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: quoted-printable

On 2018-11-30 17:24, Julien Lepiller wrote:
> On 2018-11-30 17:13, swedebugia wrote:
snip
> Hi,
>
> I never used the recursive importer, so I didn't know it wasn't very good.
>
> I wonder if we really need to import every version of the packages. That
> doesn't seem very practical.
> There are a few cyclic dependency issues
> in Java packages too, and they are dealt with on a case-by-case basis.
> Most often, we made a degraded version of one of the packages, which the
> second can use to build itself, then we rebuild the first with the
> second package.

Sounds good.

> Sometimes, we also have to adapt some of our packages for the newer
> versions of the dependencies we have. If we didn't, we'd have a lot of
> versions of every package, and most of them would be outdated, probably
> buggy or contain security holes. I'd prefer using the latest versions of
> dependencies, and contribute patches back to upstream, so they can use
> the latest and greatest too :)
>
> That's obviously a lot more work, but that's also probably a saner way
> of doing things.

Agreed, this seems better. With a good tree browser we can probably
avoid importing more than 2-5 versions of the worst packages.

I collected a few cyclic devdeps. See attached. (These definitions are of
little value, as the versions of deps and devdeps are discarded.)

>> TODO:
>> * make npm-recursive-import work by not fetching blacklisted packages
>
> Let's be careful though: we don't want to fetch blacklisted packages
> when they are devDependencies, but we still want them if they are
> runtime dependencies.

Totally agree. This is exactly why I only implemented blacklisting of
native-inputs.

>> * implement keyword blacklisting based on the descriptions
>
> We can probably use tags instead of the description: '("test" "testing"
> "check" "doc" "coverage" "unit") seem like a good approximation of what
> we want to blacklist.

Fewer than half the npm packages have tags, to my knowledge. We can do
both though :D

>
>>
>> * match not just the whole string of blacklisted packages:
>>   e.g. match also "rollup-plugin" when "rollup" is in the blacklist.
>>
>> * get the tarballs from npm-registry instead as they are never missing
>>   (GitHub's sometimes are) and likely reproducible.
>
> Are they actual source tarballs, or are they somewhat different than the
> source used to build the "binary" npm package? With maven (for java) for
> instance, some sources are hosted, but they aren't supposed to be used
> to build the package, they're only here for the debugger.

Fortunately it seems it is the full source. :D
See
https://registry.npmjs.org/underscore/-/underscore-1.9.1.tgz
https://registry.npmjs.org/nodeunit/-/nodeunit-0.5.5.tgz
https://registry.npmjs.org/async/-/async-0.9.0.tgz

>
>>
>> * Output a (define-public (inherit -)) for
>>   all imported npm-packages.
>
> I don't think that's a good idea: if we have multiple versions of a
> package, we'll have multiple packages...

Ok, got it. I thought the define-publics would collide, but I guess not.

>
>>
>> * Make it possible to specify a specific version to import (and perhaps
>>   the latest of all minor versions of a package :D).
>> (For async that would be "0.1.22", "0.2.10", "0.3.0", etc. all the way
>> up to "2.6.1" which is the current beast. This would mean that we in
>> total import about 477,000 packages times the number of minor releases
>> (mean ~10?) that equals 4.7 million npm-packages :p) Then we will
>> definitely need to speed up guile. My guess is that we will have to
>> import at least 1.5 versions for every npm package to mitigate cyclic
>> dependencies (this means 477,000 * 1.5 = 715,500 npm-package-versions).
>
> Again, I'm more in favor of patching them, rather than importing more
> versions. Do we still have as many cyclic deps with the blacklist?

No, the blacklist makes a BIG difference (but only to the cyclic devdeps).
The deps still introduce just as many cycles.
These can be avoided by
carefully choosing a version just before the cyclic dep was added :) (or by
patching, but I know nothing about JS so I leave that to others)

>> * Make it easy to analyze a given npm-package to see when deps/devdeps
>> were added. In the case of async, I propose we import 0.9.0 first, which is
>> the last version without lodash as devdep. From 1.0.0 more devdeps
>> were added. (source: https://registry.npmjs.org/async)
>>
>> Perhaps some kind of tree output for these complex packages with
>> versions as branches and dependencies as subbranches would be nice?

I will try parsing the registry to output something intelligent to
help the user choose which version to import.

> Thanks for your work!

Thanks for sharing so we can improve this together :)

-- 
Cheers
Swedebugia

--------------703C54EDBA8B69833FF207CF
Content-Type: text/x-scheme;
 name="node-cyclic.scm"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="node-cyclic.scm"

(define-public node-rimraf
  (package
    (name "node-rimraf")
    (version "2.6.2")
    (source
      (origin
        (method url-fetch)
        (uri "https://github.com/isaacs/rimraf/archive/v2.6.2/rimraf-v2.6.2.tar.gz")
        (sha256
          (base32
            "0bmssxz3s30nhq5f8ldssf6s8ga5w0aarn71wjsmvqb1j15b2r6d"))))
    (build-system node-build-system)
    (inputs `(("node-glob" ,node-glob)))
    (native-inputs
      `(
        ;; tests ("node-tap" ,node-tap)
        ("node-mkdirp" ,node-mkdirp)))
    (synopsis
      "A deep deletion module for node (like `rm -rf`)")
    (description
      "A deep deletion module for node (like `rm -rf`)")
    (home-page "https://github.com/isaacs/rimraf#readme")
    (license license:isc)))

(define-public node-glob
  (package
    (name "node-glob")
    (version "7.1.3")
    (source
      (origin
        (method url-fetch)
        (uri "https://github.com/isaacs/node-glob/archive/v7.1.3/node-glob-v7.1.3.tar.gz")
        (sha256
          (base32
            "0qcymwljbm947gvfn7g7871dnwv5s0jq0r8c8ih9xgrfcynfw3hx"))))
    (build-system node-build-system)
    (inputs
      `(("node-inflight" ,node-inflight)
        ("node-once" ,node-once)
        ("node-path-is-absolute"
         ,node-path-is-absolute)
        ("node-minimatch" ,node-minimatch)
        ("node-fs.realpath" ,node-fs.realpath)
        ("node-inherits" ,node-inherits)))
    (native-inputs
      `(
        ;; benchm ("node-tick" ,node-tick)
        ;; tests ("node-tap" ,node-tap)
        ("node-rimraf" ,node-rimraf)
        ("node-mkdirp" ,node-mkdirp)))
    (synopsis "a little globber")
    (description "a little globber")
    (home-page "https://github.com/isaacs/node-glob#readme")
    (license license:isc)))

(define-public node-jasmine-core
  (package
    (name "node-jasmine-core")
    (version "3.3.0")
    (source
      (origin
        (method url-fetch)
        (uri "https://github.com/jasmine/jasmine/archive/v3.3.0/jasmine-v3.3.0.tar.gz")
        (sha256
          (base32
            "1rg4p487hf8mlxcj99wywzwp7jp3s4d114n4j12r3mkh8qyi8nck"))))
    (build-system node-build-system)
    (inputs `())
    (native-inputs
      `(
        ;; ("node-grunt" ,node-grunt)
        ;; ("node-grunt-contrib-compass"
        ;; ,node-grunt-contrib-compass)
        ("node-jsdom" ,node-jsdom)
        ("node-shelljs" ,node-shelljs)
        ("node-jasmine" ,node-jasmine)
        ;; ("node-load-grunt-tasks" ,node-load-grunt-tasks)
        ;; ("node-grunt-contrib-compress"
        ;; ,node-grunt-contrib-compress)
        ;; ("node-grunt-contrib-concat"
        ;; ,node-grunt-contrib-concat)
        ;; ("node-grunt-cli" ,node-grunt-cli)
        ("node-temp" ,node-temp)
        ("node-glob" ,node-glob)
        ;;
        ;; ("node-grunt-contrib-jshint"
        ;; ,node-grunt-contrib-jshint)
        ))
    (synopsis
      "Official packaging of Jasmine's core files for use by Node.js projects.")
    (description
      "Official packaging of Jasmine's core files for use by Node.js projects.")
    (home-page "https://jasmine.github.io")
    (license license:expat)))

(define-public node-jasmine
  (package
    (name "node-jasmine")
    (version "3.3.0")
    (source
      (origin
        (method url-fetch)
        (uri "https://github.com/jasmine/jasmine-npm/archive/v3.3.0/jasmine-npm-v3.3.0.tar.gz")
        (sha256
          (base32
            "1b6mgxmxv71bpr4fg75azfyh1v0m469prb7srg990fkf7i5bszw9"))))
    (build-system node-build-system)
    (inputs
      `(("node-jasmine-core" ,node-jasmine-core)
        ("node-glob" ,node-glob)))
    (native-inputs
      `(("node-grunt" ,node-grunt)
        ("node-shelljs" ,node-shelljs)
        ("node-grunt-cli" ,node-grunt-cli)
        ("node-grunt-contrib-jshint"
         ,node-grunt-contrib-jshint)))
    (synopsis "Command line jasmine")
    (description "Command line jasmine")
    (home-page "http://jasmine.github.io/")
    (license license:expat)))

(define-public node-domhandler
  (package
    (name "node-domhandler")
    (version "2.4.2")
    (source
      (origin
        (method url-fetch)
        (uri "https://github.com/fb55/DomHandler/archive/v2.4.2/DomHandler-v2.4.2.tar.gz")
        (sha256
          (base32
            "16hi0vapmavw9g9s321b4c9nvwfg06cclj7pjnvjzk0imnzxjngp"))))
    (build-system node-build-system)
    (inputs
      `(("node-domelementtype" ,node-domelementtype)))
    (native-inputs
      `(("node-htmlparser2" ,node-htmlparser2)
        ("node-jshint" ,node-jshint)
        ("node-mocha" ,node-mocha)))
    (synopsis
      "handler for htmlparser2 that turns pages into a dom")
    (description
      "handler for htmlparser2 that turns pages into a dom")
    (home-page "https://github.com/fb55/DomHandler#readme")
    (license #f)))

(define-public node-htmlparser2
  (package
    (name "node-htmlparser2")
    (version "3.10.0")
    (source
      (origin
        (method url-fetch)
        (uri "https://github.com/fb55/htmlparser2/archive/v3.10.0/htmlparser2-v3.10.0.tar.gz")
        (sha256
          (base32
            "1qvsv4aixmgnh4h7q726wapg7qnk7srw4z9nmy71jc5r2krimnvn"))))
    (build-system node-build-system)
    (inputs
      `(("node-readable-stream" ,node-readable-stream)
        ("node-domhandler" ,node-domhandler)
        ("node-domelementtype" ,node-domelementtype)
        ("node-inherits" ,node-inherits)
        ("node-domutils" ,node-domutils)
        ("node-entities" ,node-entities)))
    (native-inputs
      `(("node-eslint" ,node-eslint)
        ("node-coveralls" ,node-coveralls)
        ("node-istanbul" ,node-istanbul)
        ("node-mocha-lcov-reporter"
         ,node-mocha-lcov-reporter)
        ("node-mocha" ,node-mocha)))
    (synopsis "Fast & forgiving HTML/XML/RSS parser")
    (description
      "Fast & forgiving HTML/XML/RSS parser")
    (home-page "https://github.com/fb55/htmlparser2#readme")
    (license license:expat)))

--------------703C54EDBA8B69833FF207CF--