From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christopher Baines
Subject: Re: More progress with the Guix Data Service
Date: Mon, 20 May 2019 21:14:52 +0100
Message-ID: <87imu4narn.fsf@cbaines.net>
References: <87pnohms3t.fsf@cbaines.net> <871s0trw43.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha512; protocol="application/pgp-signature"
In-reply-to: <871s0trw43.fsf@gnu.org>
List-Id: "Development of GNU Guix and the GNU System distribution."
To: Ludovic Courtès
Cc: guix-devel@gnu.org

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

Ludovic Courtès writes:

>> As well as listening to the Guix Commits mailing list for emails about
>> new revisions, more of the information in these emails is now stored, in
>> particular, the time they were sent, and the branch the email applies
>> to. This can be seen on the new Branches page [4].
>>
>> 4: https://prototype-guix-data-service.cbaines.net/branches
>
> This is really nice.
>
> This information could also be gathered directly from the repo though,
> right?
>
> I would expect only patch submission info, and possibly commit
> notifications, to be grabbed from email, while the rest would be
> extracted from the repo, thereby hopefully limiting the risk of
> misinterpreting email.  WDYT?

So, currently the branch name, commit hash and date are taken from the email.
As far as I know, git branches are just pointers to commits, and don't
have any date/time associated with them. The commit date or author date
of the commits could be stored and used, but I think these are less
interesting, and often misleading. The author date is often quite
different from the time a commit is pushed, and the commit date is often
different by some amount as well.

Currently, if you actually want to know what the state of a particular
branch in the Guix git repository on Savannah was at a particular time,
I think the most reliable way of checking would probably be to check the
guix-commits mailing list. As the branch name and commit hash both
relate to the date, I don't see much of a problem with storing them.

One thing I've also been thinking about is loading in the guix-commits
mailing list archives. That would backfill the branch information, which
might be useful/interesting...

I did consider trying to access the clone of the Git repository that's
managed by the (guix inferiors) module, but I couldn't see an easy way to
do it, and as above, I'm not sure the date/time information is as useful
as what you can get from the mailing list.

>> There's now a basic search function on the packages page [5], and the
>> location, and the licenses for packages is now being stored (which can
>> be seen on the page for a package, for example [6]).
>>
>> 5: https://prototype-guix-data-service.cbaines.net/revision/f52e83470b05b2473ea13feb2842a1330c316a00/packages?search_query=Guile&field=version&field=synopsis&after_name=&limit_results=1000
>> 6: https://prototype-guix-data-service.cbaines.net/revision/f52e83470b05b2473ea13feb2842a1330c316a00/package/0ad/0.0.23b-alpha
>
> Nice!
>
> One thing that be great is a page similar to
> ,
> but keyed by package, where you get a list of the recent package
> versions (and/or derivations) and map them to specific commits.

Interesting, yeah, were you thinking of filtering that data for a
specific branch (like master or staging), or showing data for all
branches?

>> The URL is a bit long, but I think that is now close to being possible
>> with the Guix data service. I haven't got something working yet to
>> easily access data for the latest revision, but for a particular
>> revision, you can request a JSON file containing all the information I
>> think Repology currently gets about all packages. For example:
>>
>>   https://prototype-guix-data-service.cbaines.net/revision/f52e83470b05b2473ea13feb2842a1330c316a00/packages.json?field=version&field=synopsis&field=description&field=home-page&field=location&field=licenses&limit_results=99999
>
> Awesome.  (I advise passing “limit_results=900” though, because the URL
> above gives a pretty big result.  ;-))

Well, not that big? Icecat tells me it's 12MB. Also, I've recently added
an "All results" checkbox/query parameter, so you no longer have to make
up a large number. I wanted to make it possible to get all the data as a
single file, as that could simplify processing it, but there's also some
support for pagination.

  https://prototype-guix-data-service.cbaines.net/revision/f52e83470b05b2473ea13feb2842a1330c316a00/packages.json?field=version&field=synopsis&field=description&field=home-page&field=location&field=licenses&all_results=on

The "All results" option is especially important as I've now done some
work on caching. That page should be served with a max-age of a day. It
could probably be even longer, as the only thing that will change the
contents is software changes. Nginx is also now caching responses, and
you can see what it's doing by looking at the X-Cache-Status header in
the response.

>> This is just the software side of the problem though.
>> If this was to be used by Repology, it would have to be a more
>> permanent thing, similar to the Cuirass and Mumi services that are
>> currently setup around Guix. Does anyone have any thoughts on this?
>
> I’d suggest having a Guix service for the whole thing, and making a
> branch in guix-maintenance.git such that bayfront (say) can run the
> service.
>
> Then we’ll have to reach consensus on guix-sysadmin as to which machine
> to use depending on the resources it needs, but if you have the config,
> I’d argue that we can happily run it on bayfront or perhaps berlin.  And
> we can give you access to the machine so you can reconfigure once in a
> while.
>
> WDYT?

That all sounds really good :D

A package and service have been on my list of things to do, and I'll
hopefully sort that out in the next few weeks. Currently I'm running it
with Guix + support for isolated inferiors [1], but I think that's
something that can be made optional in the Guix Data Service code, as
initially I'd just be thinking about processing revisions in the Guix
git repository.
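
For anyone wanting to script against the packages.json endpoint and the
caching headers discussed above, here is a rough Python sketch. It is not
part of the Guix Data Service itself. It builds the request URL from a
revision and a list of fields, and reads the caching headers from a
response. The host, path layout and query parameters are copied from the
URLs in this thread; the helper names (packages_json_url,
max_age_seconds, fetch_packages) are made up for illustration, and it
assumes the max-age is sent in a Cache-Control header.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Host and path layout are taken from the example URLs in this thread.
BASE_URL = "https://prototype-guix-data-service.cbaines.net"

# Same fields, in the same order, as the example packages.json URL.
DEFAULT_FIELDS = ("version", "synopsis", "description",
                  "home-page", "location", "licenses")


def packages_json_url(revision, fields=DEFAULT_FIELDS,
                      all_results=True, limit=None):
    """Build a packages.json URL for one revision (a commit hash)."""
    # The service takes one repeated "field" parameter per column.
    params = [("field", name) for name in fields]
    if all_results:
        params.append(("all_results", "on"))
    elif limit is not None:
        params.append(("limit_results", str(limit)))
    return f"{BASE_URL}/revision/{revision}/packages.json?{urlencode(params)}"


def max_age_seconds(cache_control):
    """Return max-age from a Cache-Control header value, or None."""
    for directive in (cache_control or "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def fetch_packages(revision):
    """Fetch the JSON plus caching headers (hypothetical helper, needs network)."""
    with urlopen(packages_json_url(revision)) as response:
        cache_status = response.headers.get("X-Cache-Status")
        max_age = max_age_seconds(response.headers.get("Cache-Control"))
        return json.load(response), cache_status, max_age
```

With the default fields,
packages_json_url("f52e83470b05b2473ea13feb2842a1330c316a00") should
produce the same all_results URL as the one given earlier in this
message. fetch_packages needs network access and a running instance, so
treat it as untested against the prototype.
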

1: https://issues.guix.info/issue/34638

Thanks,

Chris

--=-=-=
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQKTBAEBCgB9FiEEPonu50WOcg2XVOCyXiijOwuE9XcFAlzjCrxfFIAAAAAALgAo
aXNzdWVyLWZwckBub3RhdGlvbnMub3BlbnBncC5maWZ0aGhvcnNlbWFuLm5ldDNF
ODlFRUU3NDU4RTcyMEQ5NzU0RTBCMjVFMjhBMzNCMEI4NEY1NzcACgkQXiijOwuE
9Xe3QRAAiAATt/nHbu8ZG8Gae6WJj195Hwf7qVA7SneCjBY2UVLWue3WoJi1WE4k
UFcEXjK11nDlqbp4/+AtCmzYe4TyluRC29Gm9BpyNhBYr8He7/+21Rikh9p85U+3
p93Ueate9ZFcO8QDXh+wE9zjzj0whyZXej4fqZJc/094aJyqywqGFjqX1wWH202S
TOgEzHowMVKqbjk1e/1shOY3FaKifWCvddlDeSwmY36PIMowlCDefDhQ8xZnROIa
1EPqxChHRLpdJq9aq6pfqC94k1SHAMqFVfpVDwoc28r7+BVx24I6Pz4WdkeZKXzZ
scWDHMZwPRJCxn64TT5+BrqyUmzl6ZqnuhuubQDVRmVjpwebKropGoBGWi70y0Q6
AN/ezCwhqw9N5AsnPqH3ZByVtEIEUnrQWIf7ePJiMv/ySNLWSVQzgcoHf8DCVcM8
iZu9v2h6mL+5VOtypK1BVqACHLEJBA2t5WjLSLLT5vM1mp3P4sFsaCtlKRbHZErk
f0/Qg9PR8xov6lonpSz10YICRjSrfgfHfv26ZSchkLupdvyXpY8Cn8zCbVSAUfov
6fBE19M2X8kcElPwYOZfmzrvF3dBhy5FeGGF9EChKJfamkdex3pFbIhPBV81Aomx
9K+LPnmcnM9nSDRb4DJLWa7GLRT9mxmPSz5iNQQ0ccmlaH/Nk9s=
=zVbC
-----END PGP SIGNATURE-----
--=-=-=--