From: Mathieu Othacehe
To: Ludovic Courtès
Cc: Guix Devel
Subject: Re: File search
Date: Fri, 21 Jan 2022 11:35:36 +0100
Message-ID: <87czklwf47.fsf@gnu.org>
In-Reply-To: <8735lh5ukw.fsf@inria.fr> (Ludovic Courtès's message of "Fri, 21 Jan 2022 10:03:43 +0100")
References: <8735lh5ukw.fsf@inria.fr>

Hello Ludo!

> Lately I found myself going several times to
> to look for packages providing a given
> file and I thought it’s time to do something about it.

Yeah, I also think about it regularly, but I keep giving up because
setting up this mechanism properly turns out to be much more complex
than initially expected.

> The script below creates an SQLite database for the current set of
> packages, but only for those already in the store:
>
>   guix repl file-database.scm populate
>
> That creates /tmp/db; it took about 25mn on berlin, for 18K packages.
> Then you can run, say:
>
>   guix repl file-database.scm search boot-9.scm

Nice proof of concept :).

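For readers without the script at hand, here is a minimal sketch of the
populate/search idea, written in Python rather than Guile. It is not the
script from the quoted message: the table layout, the function names
populate, search and demo, and the use of plain directories in place of
store items are all assumptions made for illustration.

```python
import os
import sqlite3
import tempfile


def populate(db, packages):
    """Index every file found under each package directory.

    `packages` maps a package name to a directory holding its files
    (a hypothetical stand-in for a store item).
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " package TEXT NOT NULL,"
        " basename TEXT NOT NULL,"
        " path TEXT NOT NULL)"
    )
    # Searches are by base name, so that is the column worth indexing.
    db.execute("CREATE INDEX IF NOT EXISTS files_basename ON files (basename)")
    for package, root in packages.items():
        for directory, _subdirs, names in os.walk(root):
            for name in names:
                relative = os.path.relpath(os.path.join(directory, name), root)
                db.execute(
                    "INSERT INTO files VALUES (?, ?, ?)",
                    (package, name, relative),
                )
    db.commit()


def search(db, basename):
    """Return (package, relative path) pairs for files named `basename`."""
    rows = db.execute(
        "SELECT package, path FROM files"
        " WHERE basename = ? ORDER BY package, path",
        (basename,),
    )
    return rows.fetchall()


def demo():
    """Build two fake package trees, index them, and run one search."""
    with tempfile.TemporaryDirectory() as tmp:
        guile = os.path.join(tmp, "guile")
        hello = os.path.join(tmp, "hello")
        ice9 = os.path.join(guile, "share", "guile", "3.0", "ice-9")
        os.makedirs(ice9)
        os.makedirs(os.path.join(hello, "bin"))
        open(os.path.join(ice9, "boot-9.scm"), "w").close()
        open(os.path.join(hello, "bin", "hello"), "w").close()
        db = sqlite3.connect(":memory:")
        populate(db, {"guile": guile, "hello": hello})
        return search(db, "boot-9.scm")


if __name__ == "__main__":
    for package, path in demo():
        print(package, path)
```

Storing the base name in its own indexed column keeps lookups such as
"which package provides boot-9.scm" cheap, at the cost of a larger
database.
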
> I think accuracy (making sure you get results that correspond precisely
> to, say, your current channel revisions and your current system) is not
> a high priority: some result is better than no result. Likewise for
> freshness: results for an older version of a given package may still be
> valid now.

Agreed.

> In terms of privacy, I think it’s better if we can avoid making one
> request per file searched for. Off-line operation would be sweet, and
> it comes with responsiveness; fast off-line search is necessary for
> things like ‘command-not-found’ (where the shell tells you what package
> to install when a command is not found).

Yeah, that's the tricky part. In terms of maintenance, it would probably
be easier to have Cuirass index the packages it builds, store the
results in the PostgreSQL database and serve them through the Cuirass
web server.

The pros are that we rely on only one database, which is very important
in my opinion, and that it's relatively easy to set up. The con is that
you need to be online to access this API.

If we instead decide to periodically build an SQLite database indexing
all the packages, in a cron job or so, users would still need to
download it, which would be an expensive operation as you mentioned. It
would also be difficult to index custom Guix channels with that
approach.

Another solution could be to have guix publish index the files from the
NARs in its cache and provide a file-search API. That would still
require being online, but it would allow searching across multiple
publish servers, hence possibly multiple Guix channels. Packages
without substitutes couldn't be searched, which is a significant
drawback.

I would still probably prefer that last option.

WDYT?

Thanks,

Mathieu