From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp11.migadu.com ([2001:41d0:2:bcc0::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms5.migadu.com with LMTPS id KGMMGEyTi2OaTAEAbAwnHQ (envelope-from ) for ; Sat, 03 Dec 2022 19:19:56 +0100 Received: from aspmx1.migadu.com ([2001:41d0:2:bcc0::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp11.migadu.com with LMTPS id IBcLGEyTi2Pp5gAA9RJhRA (envelope-from ) for ; Sat, 03 Dec 2022 19:19:56 +0100 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id E7AE314382 for ; Sat, 3 Dec 2022 19:19:55 +0100 (CET) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1p1X6j-0002th-SR; Sat, 03 Dec 2022 13:19:28 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1p1X6h-0002tK-Ov for guix-devel@gnu.org; Sat, 03 Dec 2022 13:19:23 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1p1X6h-0005Bj-DW; Sat, 03 Dec 2022 13:19:23 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:In-Reply-To:Date:References:Subject:To: From; bh=PTN+iHdWNytcuQp1kBZU5n5aHzpOoI2cG7p9yIEGZhE=; b=WUjPETLNxqerhJa3hnEy ZCjiDjybm//Y9ClSEhld0BtNtZDTaywkgkarshF1Q1zvuXtwCd9rYWBwScQ7AHQ5Nk/4vDcbKA9Op C4Lv4wlj51ik3ak707EojSFVKBqYsyqFAFCiTj0Flj01HXgSuPZ0En9kxCxrrqlwbwBrs3j07LiSc YWE8Xvuin2HspVZGk2M7JF4LCJMT8il49hkVX9MWNIayJMLdBXPVtv8C6RxR2sSHLQKpHKihr4gxp ERViIYCjyCQTMPXix1WY2wSqVKISKGKyTVzwBZVqkdiPpmXOlQmxbkvhIEulZbdJD3UYkNkxIBpZg OLoj0RWmAp3vbQ==; Received: from 91-160-117-201.subs.proxad.net ([91.160.117.201] helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1p1X6g-0006vS-RU; Sat, 03 Dec 2022 13:19:23 -0500 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: "Antoine R. Dumont (@ardumont)" Cc: guix-devel@gnu.org Subject: Re: File search References: <87pmd1r8kt.fsf@gmail.com> <87mt85r7h8.fsf@gmail.com> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: Tridi 13 Frimaire an 231 de la =?utf-8?Q?R=C3=A9volu?= =?utf-8?Q?tion=2C?= jour du =?utf-8?Q?C=C3=A8dre?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-pc-linux-gnu Date: Sat, 03 Dec 2022 19:19:20 +0100 In-Reply-To: <87mt85r7h8.fsf@gmail.com> (Antoine R. Dumont's message of "Fri, 02 Dec 2022 19:22:27 +0100") Message-ID: <87lenonydz.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-BeenThere: guix-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Development of GNU Guix and the GNU System distribution." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-devel-bounces+larch=yhetil.org@gnu.org Sender: guix-devel-bounces+larch=yhetil.org@gnu.org X-Migadu-Flow: FLOW_IN X-Migadu-Country: US ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=yhetil.org; s=key1; t=1670091596; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:in-reply-to:in-reply-to: references:references:list-id:list-help:list-unsubscribe: list-subscribe:list-post:dkim-signature; bh=ObNpmq62pyCzSfVPCiaB+czJcz2TSo2C5m8WbOEuSgc=; b=JXKd3eQF/p2GG4e3PRqI7kILTCSuOCaOHoSHYeDuzCFzEr3snmASgALOgoRimydtKr0OEh q7m9c38f/tWhjHKQxxnkg7MMhgz/04xKop9HyxfF05Uy22metgR9r7WuYfWGReduAUsADn H5SIMaYmlE6+CAzq2h/1E/fngLB3QFqWAxGoZHuNYda3JaPpO1wVGPA2DvLtUm4PH4Iphi qo2Y10zfRqCsR4o8ZY3L0EySVZMiczmH5rjEENy2WZGY/EEOhlg72M/MyPjTJj6DjzLpeI zxmPTd+3JCUuJZN90+oxegw4XEvFWlnI4awQji03mGuSARPy8JCvD+YGI7EtIQ== ARC-Seal: i=1; s=key1; d=yhetil.org; t=1670091596; a=rsa-sha256; cv=none; b=DnjhFkHp3aGNzrIJPLgImd41L3sKCsmD+4BWpOFrSY18Akdoek9WHTgNIV+mj5LsXbDPLL fVjMy8TwS49cqAC3WATQSm6mcxz4CpMLQbmtd/eV7VugtNj4PRrjhrkwiYj60E1E7NSMuA FwexGhj5zTLsr0Jq6pFcUiDS74om1dMiugR5A6kHSzyluRKrf9fugR4qnTv03RMftjVlb0 RHcpxsUm+zkn4ZcaLUMdRP7PrsFpH3YeRKAQgLV2bODERlLOsl3OdktbE1vdEAf0SVLkDU HUFNZ0Wi3doPdtBbft5ku+e6AGE4/mDNxSwF1jwU0X29vl2J+VSLWcLrOS1KTg== ARC-Authentication-Results: i=1; aspmx1.migadu.com; dkim=fail ("body hash did not verify") header.d=gnu.org header.s=fencepost-gnu-org header.b=WUjPETLN; dmarc=pass (policy=none) header.from=gnu.org; spf=pass (aspmx1.migadu.com: domain of "guix-devel-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="guix-devel-bounces+larch=yhetil.org@gnu.org" X-Migadu-Spam-Score: -3.46 Authentication-Results: aspmx1.migadu.com; dkim=fail ("body hash did not verify") header.d=gnu.org header.s=fencepost-gnu-org header.b=WUjPETLN; dmarc=pass (policy=none) header.from=gnu.org; spf=pass (aspmx1.migadu.com: domain of "guix-devel-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="guix-devel-bounces+larch=yhetil.org@gnu.org" X-Migadu-Queue-Id: E7AE314382 X-Spam-Score: -3.46 X-Migadu-Scanner: scn0.migadu.com X-TUID: cKK+4XBTNi7l --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Hi Antoine, "Antoine R. Dumont (@ardumont)" skribis: > After toying a bit with the initial code, I took the liberty to make it > a guix extension (we discussed it a bit with @zimoun). It was mostly to > get started with Guile (I know some lisp implems but not this one so i > had to familiarize myself with tools and whatnot ;). Anyway, that can be > reverted if you feel like it can be integrated as a Guix cli directly. > > Currently, the implementation scans and indexes whatever package is > present in the local store of the machine's user. From nix/guix's > design, it makes sense to do it that way as it's likely that even though > you don't have all the tools locally, it may be already present as a > dependency of some high level tools you already use (it's just not > exposed because not declared in config.scm or home-configuration.scm). > > You will find inlines (at the bottom) some cli usage calls [1] and the > current implementation [2]. Yay, nice work! I toyed a bit with your code and that gave me an idea: instead of the costly =E2=80=98fold-packages=E2=80=99 + =E2=80=98package-derivation=E2=80= =99, we can iterate over all the manifests on the system and index packages they refer to. That way, no need to talk to the daemon, computer derivations, etc. Should be faster, though of course it still needs to traverse those directories. Please find attached a modified version that illustrates that. (We=E2=80= =99ll need version control at some point. :-)) Thanks, Ludo=E2=80=99. --=-=-= Content-Type: text/plain; charset=utf-8 Content-Disposition: inline; filename=index.scm Content-Transfer-Encoding: quoted-printable Content-Description: the code ;;; GNU Guix --- Functional package management for GNU ;;; Copyright =C2=A9 2022 Ludovic Court=C3=A8s ;;; ;;; This file is part of GNU Guix. ;;; ;;; GNU Guix is free software; you can redistribute it and/or modify it ;;; under the terms of the GNU General Public License as published by ;;; the Free Software Foundation; either version 3 of the License, or (at ;;; your option) any later version. ;;; ;;; GNU Guix is distributed in the hope that it will be useful, but ;;; WITHOUT ANY WARRANTY; without even the implied warranty of ;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ;;; GNU General Public License for more details. ;;; ;;; You should have received a copy of the GNU General Public License ;;; along with GNU Guix. If not, see . (define-module (guix extensions index) #:use-module (guix config) ;; %guix-package-name, ... #:use-module (guix ui) ;; display G_ #:use-module (guix scripts) #:use-module (sqlite3) #:use-module (ice-9 match) #:use-module (guix describe) #:use-module (guix store) #:use-module (guix monads) #:autoload (guix combinators) (fold2) #:autoload (guix grafts) (%graft?) #:autoload (guix store roots) (gc-roots) #:use-module (guix derivations) #:use-module (guix packages) #:use-module (guix profiles) #:use-module (guix sets) #:use-module ((guix utils) #:select (cache-directory)) #:autoload (guix build utils) (find-files) #:autoload (gnu packages) (fold-packages) #:use-module (srfi srfi-1) #:use-module (srfi srfi-9) #:use-module (srfi srfi-71) #:export (guix-index)) (define debug #f) (define schema " create table if not exists Packages ( id integer primary key autoincrement not null, name text not null, version text not null, unique (name, version) -- add uniqueness constraint ); create table if not exists Directories ( id integer primary key autoincrement not null, name text not null, package integer not null, foreign key (package) references Packages(id) on delete cascade, unique (name, package) -- add uniqueness constraint ); create table if not exists Files ( name text not null, basename text not null, directory integer not null, foreign key (directory) references Directories(id) on delete cascade unique (name, basename, directory) -- add uniqueness constraint ); create index if not exists IndexFiles on Files(basename);") (define (call-with-database file proc) (let ((db (sqlite-open file))) (dynamic-wind (lambda () #t) (lambda () (sqlite-exec db schema) (proc db)) (lambda () (sqlite-close db))))) (define (insert-files db package version directories) "Insert files from DIRECTORIES as belonging to PACKAGE at VERSION." (define stmt-select-package (sqlite-prepare db "\ SELECT id FROM Packages WHERE name =3D :name AND version =3D :version;" #:cache? #t)) (define stmt-insert-package (sqlite-prepare db "\ INSERT OR IGNORE INTO Packages(name, version) -- to avoid spurious writes VALUES (:name, :version);" #:cache? #t)) (define stmt-select-directory (sqlite-prepare db "\ SELECT id FROM Directories WHERE name =3D :name AND package =3D :package;" #:cache? #t)) (define stmt-insert-directory (sqlite-prepare db "\ INSERT OR IGNORE INTO Directories(name, package) -- to avoid spurious writes VALUES (:name, :package);" #:cache? #t)) (define stmt-insert-file (sqlite-prepare db "\ INSERT OR IGNORE INTO Files(name, basename, directory) VALUES (:name, :basename, :directory);" #:cache? #t)) (sqlite-exec db "begin immediate;") (sqlite-bind-arguments stmt-insert-package #:name package #:version version) (sqlite-fold (const #t) #t stmt-insert-package) (sqlite-bind-arguments stmt-select-package #:name package #:version version) (match (sqlite-fold cons '() stmt-select-package) ((#(package-id)) (when debug (format #t "(pkg, version, pkg-id): (~a, ~a, ~a)" package version package-id)) (pk 'package package-id package) (for-each (lambda (directory) (define (strip file) (string-drop file (+ (string-length directory) 1))) (sqlite-reset stmt-insert-directory) (sqlite-bind-arguments stmt-insert-directory #:name directory #:package package-id) (sqlite-fold (const #t) #t stmt-insert-directory) (sqlite-reset stmt-select-directory) (sqlite-bind-arguments stmt-select-directory #:name directory #:package package-id) (match (sqlite-fold cons '() stmt-select-directory) ((#(directory-id)) (when debug (format #t "(name, package, dir-id): (~a, ~a, ~a)\n" directory package-id directory-id)) (for-each (lambda (file) ;; If DIRECTORY is a symlink, (find-files ;; DIRECTORY) returns the DIRECTORY singlet= on. (unless (string=3D? file directory) (sqlite-reset stmt-insert-file) (sqlite-bind-arguments stmt-insert-file #:name (strip file) #:basename (basename file) #:directory directory-id) (sqlite-fold (const #t) #t stmt-insert-fi= le))) (find-files directory))))) directories))) (sqlite-exec db "commit;")) (define (insert-package db package) "Insert all the files of PACKAGE into DB." (mlet %store-monad ((drv (package->derivation package #:graft? #f))) (match (derivation->output-paths drv) (((labels . directories) ...) (when (every file-exists? directories) (insert-files db (package-name package) (package-version package) directories)) (return #t))))) (define (filter-public-current-supported package) "Filter supported, not hidden (public) and not superseded (current) packa= ge." (and (not (hidden-package? package)) (not (package-superseded package)) (supported-package? package))) (define (filter-supported-package package) "Filter supported package (package might be hidden or superseded)." (and (supported-package? package))) (define (no-filter package) "No filtering on package" #t) (define* (insert-packages db #:optional (filter-policy filter-public-curren= t-supported)) "Insert all current packages matching `filter-package-policy` into DB." (with-store store (parameterize ((%graft? #f)) (fold-packages (lambda (package _) (run-with-store store (insert-package db package))) #t #:select? filter-policy)))) ;;; ;;; Indexing from local profiles. ;;; (define (all-profiles) "Return the list of profiles on the system." (delete-duplicates (filter-map (lambda (root) (if (file-exists? (string-append root "/manifest")) root (let ((root (string-append root "/profile"))) (and (file-exists? (string-append root "/manifest")) root)))) (gc-roots)))) (define (profiles->manifest-entries profiles) "Return manifest entries for all of PROFILES, without duplicates." (let loop ((visited (set)) (profiles profiles) (entries '())) (match profiles (() entries) ((profile . rest) (let* ((manifest (profile-manifest profile)) (entries visited (fold2 (lambda (entry lst visited) (let ((item (manifest-entry-item entry))) (if (set-contains? visited item) (values lst visited) (values (cons entry lst) (set-insert item visited))))) entries visited (manifest-transitive-entries manifest)))) (loop visited rest entries)))))) (define (insert-manifest-entry db entry) "Insert ENTRY, a manifest entry, into DB." (insert-files db (manifest-entry-name entry) (manifest-entry-version entry) (list (manifest-entry-item entry)))) ;FIXME: outputs? (define (index-manifests db-file) "Insert into DB-FILE entries for packages that appear in manifests available on the system." (call-with-database db-file (lambda (db) (for-each (lambda (entry) (insert-manifest-entry db entry)) (let ((lst (profiles->manifest-entries (all-profiles)))) (pk 'entries (length lst)) lst))))) ;;; ;;; Search. ;;; (define-record-type (package-match name version file) package-match? (name package-match-name) (version package-match-version) (file package-match-file)) (define (matching-packages db file) "Return unique corresponding to packages containing FILE." (define lookup-stmt (sqlite-prepare db "\ SELECT Packages.name, Packages.version, Directories.name, Files.name FROM Packages INNER JOIN Files, Directories ON files.basename =3D :file AND directories.id =3D files.directory AND packages.id =3D directories.package;")) (sqlite-bind-arguments lookup-stmt #:file file) (sqlite-fold (lambda (result lst) (match result (#(package version directory file) (cons (package-match package version (string-append directory "/" file)) lst)))) '() lookup-stmt)) (define (index-packages-with-db db-pathname) "Index packages using db at location DB-PATHNAME." (call-with-database db-pathname (lambda (db) (insert-packages db no-filter)))) (define (matching-packages-with-db db-pathname file) "Compute list of packages referencing FILE using db at DB-PATHNAME." (call-with-database db-pathname (lambda (db) (matching-packages db file)))) (define (print-matching-results matches) "Print the MATCHES matching results." (for-each (lambda (result) (format #t "~20a ~a~%" (string-append (package-match-name result) "@" (package-match-version result)) (package-match-file result))) matches)) (define default-db-path (string-append (cache-directory #:ensure? #f) "/index/db.sqlite")) (define (show-help) (display (G_ "Usage: guix index [OPTIONS...] [search FILE...] Without FILE, index (package, file) relationships in the local store. With 'search FILE', search for packages installing FILEx;x.\n Note: The internal cache is located at ~/.config/guix/locate-db.sqlite. See --db-path for customization.\n")) (newline) (display (G_ "The valid values for OPTIONS are:")) (newline) (display (G_ " -h, --help Display this help and exit")) (display (G_ " -V, --version Display version information and exit")) (display (G_ " --db-path=3DDIR Change default location of the cache db")) (newline) (newline) (display (G_ "The valid values for ARGS are:")) (newline) (display (G_ " search FILE Search for packages installing the FILE (from cache db)")) (newline) (show-bug-report-information)) (define-command (guix-index . args) (category extension) (synopsis "Index packages to allow searching package for a given filename= ") (define (parse-db-args args) "Parsing of string key=3Dvalue where we are only interested in 'value'" (match (string-split args #\=3D) ((unused db-path) db-path) (_ #f))) (define (display-help-and-exit) (show-help) (exit 0)) (match args ((or ("-h") ("--help")) (display-help-and-exit)) ((or ("-V") ("--version")) (show-version-and-exit "guix locate")) ((db-path-args) (let ((db-path (parse-db-args db-path-args))) (if db-path (index-packages-with-db db-path) (display-help-and-exit)))) (("search" file) (let ((matches (matching-packages-with-db default-db-path file))) (print-matching-results matches) (exit (pair? matches)))) ((db-path-args "search" file) (let ((db-path (parse-db-args db-path-args))) (if db-path (let ((matches (matching-packages-with-db db-path file))) (print-matching-results matches) (exit (pair? matches))) (display-help-and-exit)))) (_ ;; index by default ;; (index-packages-with-db default-db-path) (index-manifests default-db-path) ))) --=-=-=--