From: zimoun
To: Ludovic Courtès, Guix Devel
Subject: Re: Accuracy of importers?
In-Reply-To: <87ilxfwl2q.fsf@gnu.org>
References: <878ryd8we4.fsf@inria.fr> <87ilxfwl2q.fsf@gnu.org>
Date: Sat, 30 Oct 2021 17:49:55 +0200
Message-ID: <86wnlujyvw.fsf@gmail.com>
List-Id: "Development of GNU Guix and the GNU System distribution."

Hi Ludo,

On Fri, 29 Oct 2021 at 23:57, Ludovic Courtès wrote:

> (It’s quite expensive to run because it downloads a whole bunch of
> things and tries many 404 URLs in the case of CRAN before finding the
> right one.)

Ah… so this requires investigation.

> --8<---------------cut here---------------start------------->8---
> $ SAMPLE_SIZE=200 ./pre-inst-env guile ~/src/guix-debugging/importer-accuracy.scm
> […]
> Accuracy for 'pypi' (200 packages):
>   accurate: 58 (29%)
>   different inputs: 142 (71%)
>   different source: 0 (0%)
>   inconclusive: 0 (0%)
> Accuracy for 'cran' (200 packages):
>   accurate: 176 (88%)
>   different inputs: 23 (12%)
>   different source: 1 (0%)
>   inconclusive: 0 (0%)
> --8<---------------cut here---------------end--------------->8---

[...]

> The script doesn’t do anything useful for crates because they have their
> own way of representing inputs.  It doesn’t account for changes in
> ‘arguments’ like zimoun suggested, meaning it’s overestimating
> accuracy.

These are already quite interesting results, because they show upstream
stability, IIUC.  Well, it means that if you ran “guix import pypi” one
month ago and run the same thing now, 71% of the packages have different
inputs.  Right?  That is because some metadata from PyPI changed, right?
Not because “guix import pypi” was doing something wrong and now does
better, right?

IMHO, it shows how PyPI allows bad packaging practices, doesn’t it?

My understanding is that this experiment is about upstream “quality”, not
about importer “accuracy”.  Am I misunderstanding?

Cheers,
simon