From: Ludovic Courtès
To: zimoun
Cc: Guix Devel
Subject: Re: Accuracy of importers?
Date: Tue, 09 Nov 2021 17:48:50 +0100
Message-ID: <87o86tjmvh.fsf@gnu.org>
In-Reply-To: <86wnlujyvw.fsf@gmail.com> (zimoun's message of "Sat, 30 Oct 2021 17:49:55 +0200")
References: <878ryd8we4.fsf@inria.fr> <87ilxfwl2q.fsf@gnu.org> <86wnlujyvw.fsf@gmail.com>
List-Id: "Development of GNU Guix and the GNU System distribution."

Hi,

zimoun wrote:

> On Fri, 29 Oct 2021 at 23:57, Ludovic Courtès wrote:

[...]

>> --8<---------------cut here---------------start------------->8---
>> $ SAMPLE_SIZE=200 ./pre-inst-env guile ~/src/guix-debugging/importer-accuracy.scm
>> […]
>> Accuracy for 'pypi' (200 packages):
>>   accurate: 58 (29%)
>>   different inputs: 142 (71%)
>>   different source: 0 (0%)
>>   inconclusive: 0 (0%)
>> Accuracy for 'cran' (200 packages):
>>   accurate: 176 (88%)
>>   different inputs: 23 (12%)
>>   different source: 1 (0%)
>>   inconclusive: 0 (0%)
>> --8<---------------cut here---------------end--------------->8---
>
> [...]
>
>> The script doesn’t do anything useful for crates because they have their
>> own way of representing inputs.  It doesn’t account for changes in
>> ‘arguments’ like zimoun suggested, meaning it’s overestimating
>> accuracy.
>
> It is already quite interesting results.  Because it shows upstream
> stability, IIUC.  Well, it means that running “guix import pypi” one
> month ago and running the same now, 71% of packages have different
> inputs.  Right?  It is because some metadata from PyPI changed, right?

No no; I’m assuming PyPI, CRAN, etc. provide the same info as they did
back when the package was imported (which is probably the case).

> Not because “guix import pypi” was doing wrong and now it does better,
> right?

I’m also assuming that the importer didn’t change significantly in the
meantime, which is probably a good approximation.

What I think those figures show is the amount of manual tweaks necessary
to get a proper package “à la Guix”, with tests running, etc.
For PyPI we often need to add things under ‘native-inputs’, hence the
71% “different inputs” line.  For CRAN that’s sometimes necessary, but
much less frequently.  There are also cases with non-R/non-Python
dependencies.

> IMHO, it shows how PyPI allows bad practises about packaging, isn’t it?
>
> My understanding of this experiment is about upstream “quality”, not
> about importer “accuracy”.  Do I incorrectly understand?

Your understanding is right, in a way: assuming our importers are not
lossy, this tells us whether the upstream repo contains enough
information and/or whether that information is accurate.

Thanks,
Ludo’.
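
The four categories in the quoted output can be illustrated with a minimal Python sketch. This is not the importer-accuracy.scm script from the message (which is Guile code and is not reproduced here); the names `Snapshot`, `classify`, `tally` and `report` are hypothetical, and so is the rule that a source mismatch is checked before an input mismatch. It only shows one plausible way to compare a package as it exists in the distribution with what a fresh import would produce, and to print a tally in the same shape as above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

# Category names taken from the output quoted in the message.
CATEGORIES = ("accurate", "different inputs", "different source", "inconclusive")


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for the package fields being compared."""
    source: str          # e.g. a source hash or URL
    inputs: frozenset    # names of all declared inputs


def classify(packaged: Snapshot, imported: Optional[Snapshot]) -> str:
    """Compare the in-tree package with a freshly imported one.

    Assumption: `imported` is None when the importer could not be run,
    and a source mismatch takes precedence over an input mismatch.
    """
    if imported is None:
        return "inconclusive"
    if packaged.source != imported.source:
        return "different source"
    if packaged.inputs != imported.inputs:
        return "different inputs"
    return "accurate"


def tally(pairs: Iterable[Tuple[Snapshot, Optional[Snapshot]]]) -> dict:
    """Count how many (packaged, imported) pairs fall into each category."""
    counts = Counter(classify(packaged, imported) for packaged, imported in pairs)
    return {category: counts.get(category, 0) for category in CATEGORIES}


def report(name: str, counts: dict) -> str:
    """Format a tally as one header line plus one line per category."""
    total = sum(counts.values())
    lines = [f"Accuracy for '{name}' ({total} packages):"]
    for category in CATEGORIES:
        n = counts[category]
        pct = round(100 * n / total) if total else 0
        lines.append(f"  {category}: {n} ({pct}%)")
    return "\n".join(lines)
```

Under these assumptions, a package whose maintainers added something to ‘native-inputs’ after importing it would land in “different inputs”, which is the case the message describes as common for PyPI.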