* File search
@ 2022-01-21  9:03 Ludovic Courtès
  2022-01-21 10:35 ` Mathieu Othacehe
                   ` (4 more replies)
  0 siblings, 5 replies; 33+ messages in thread
From: Ludovic Courtès @ 2022-01-21  9:03 UTC (permalink / raw)
  To: Guix Devel

[-- Attachment #1: Type: text/plain, Size: 2372 bytes --]

Hello Guix!

Lately I found myself going several times to
<https://packages.debian.org> to look for packages providing a given
file and I thought it’s time to do something about it.

The script below creates an SQLite database for the current set of
packages, but only for those already in the store:

  guix repl file-database.scm populate

That creates /tmp/db; it took about 25mn on berlin, for 18K packages.
Then you can run, say:

  guix repl file-database.scm search boot-9.scm

to find which packages provide a file named ‘boot-9.scm’.  That part is
instantaneous.

The database for 18K packages is quite big:

--8<---------------cut here---------------start------------->8---
$ du -h /tmp/db*
389M    /tmp/db
82M     /tmp/db.gz
61M     /tmp/db.zst
--8<---------------cut here---------------end--------------->8---

How do we expose that information?  There are several criteria I can
think of: accuracy, freshness, privacy, responsiveness, off-line
operation.

I think accuracy (making sure you get results that correspond precisely
to, say, your current channel revisions and your current system) is not
a high priority: some result is better than no result.  Likewise for
freshness: results for an older version of a given package may still be
valid now.

In terms of privacy, I think it’s better if we can avoid making one
request per file searched for.  Off-line operation would be sweet, and
it comes with responsiveness; fast off-line search is necessary for
things like ‘command-not-found’ (where the shell tells you what package
to install when a command is not found).

Based on that, it is tempting to just distribute a full database from
ci.guix, say, that the client command would regularly fetch.  The
downside is that that’s quite a lot of data to download; if you use the
file search command infrequently, you might find yourself spending more
time downloading the database than actually searching it.

We could have a hybrid solution: distribute a database that contains
only files in /bin and /sbin (it should be much smaller), and for
everything else, resort to a web service (the Data Service could be
extended to include file lists).  That way, we’d have fast
privacy-respecting search for command names, and on-line search for
everything else.

Thoughts?

Ludo’.

[-- Attachment #2: The file database tool --]
[-- Type: text/plain, Size: 7549 bytes --]

;;; GNU Guix --- Functional package management for GNU
;;; Copyright © 2022 Ludovic Courtès <ludo@gnu.org>
;;;
;;; This file is part of GNU Guix.
;;;
;;; GNU Guix is free software; you can redistribute it and/or modify it
;;; under the terms of the GNU General Public License as published by
;;; the Free Software Foundation; either version 3 of the License, or (at
;;; your option) any later version.
;;;
;;; GNU Guix is distributed in the hope that it will be useful, but
;;; WITHOUT ANY WARRANTY; without even the implied warranty of
;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;;; GNU General Public License for more details.
;;;
;;; You should have received a copy of the GNU General Public License
;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.

(define-module (file-database)
  #:use-module (sqlite3)
  #:use-module (ice-9 match)
  #:use-module (guix store)
  #:use-module (guix monads)
  #:autoload   (guix grafts) (%graft?)
  #:use-module (guix derivations)
  #:use-module (guix packages)
  #:autoload   (guix build utils) (find-files)
  #:autoload   (gnu packages) (fold-packages)
  #:use-module (srfi srfi-1)
  #:use-module (srfi srfi-9)
  #:export (file-database))

(define schema
  "
create table if not exists Packages (
  id        integer primary key autoincrement not null,
  name      text not null,
  version   text not null
);

create table if not exists Directories (
  id        integer primary key autoincrement not null,
  name      text not null,
  package   integer not null,
  foreign key (package) references Packages(id) on delete cascade
);

create table if not exists Files (
  name      text not null,
  basename  text not null,
  directory integer not null,
  foreign key (directory) references Directories(id) on delete cascade
);

create index if not exists IndexFiles on Files(basename);")

(define (call-with-database file proc)
  (let ((db (sqlite-open file)))
    (dynamic-wind
      (lambda () #t)
      (lambda ()
        (sqlite-exec db schema)
        (proc db))
      (lambda ()
        (sqlite-close db)))))

(define (insert-files db package version directories)
  "Insert the files contained in DIRECTORIES as belonging to PACKAGE at
VERSION."
  (define last-row-id-stmt
    (sqlite-prepare db "SELECT last_insert_rowid();"
                    #:cache? #t))

  (define package-stmt
    (sqlite-prepare db "\
INSERT OR REPLACE INTO Packages(name, version)
VALUES (:name, :version);"
                    #:cache? #t))

  (define directory-stmt
    (sqlite-prepare db "\
INSERT INTO Directories(name, package)
VALUES (:name, :package);"
                    #:cache? #t))

  (define file-stmt
    (sqlite-prepare db "\
INSERT INTO Files(name, basename, directory)
VALUES (:name, :basename, :directory);"
                    #:cache? #t))

  (sqlite-exec db "begin immediate;")
  (sqlite-bind-arguments package-stmt
                         #:name package
                         #:version version)
  (sqlite-fold (const #t) #t package-stmt)
  (match (sqlite-fold cons '() last-row-id-stmt)
    ((#(package-id))
     (pk 'package package-id package)
     (for-each (lambda (directory)
                 (define (strip file)
                   (string-drop file (+ (string-length directory) 1)))

                 (sqlite-reset directory-stmt)
                 (sqlite-bind-arguments directory-stmt
                                        #:name directory
                                        #:package package-id)
                 (sqlite-fold (const #t) #t directory-stmt)

                 (match (sqlite-fold cons '() last-row-id-stmt)
                   ((#(directory-id))
                    (for-each (lambda (file)
                                ;; If DIRECTORY is a symlink, (find-files
                                ;; DIRECTORY) returns the DIRECTORY singleton.
                                (unless (string=? file directory)
                                  (sqlite-reset file-stmt)
                                  (sqlite-bind-arguments file-stmt
                                                         #:name (strip file)
                                                         #:basename
                                                         (basename file)
                                                         #:directory
                                                         directory-id)
                                  (sqlite-fold (const #t) #t file-stmt)))
                              (find-files directory)))))
               directories)
     (sqlite-exec db "commit;"))))

(define (insert-package db package)
  "Insert all the files of PACKAGE into DB."
  (mlet %store-monad ((drv (package->derivation package #:graft? #f)))
    (match (derivation->output-paths drv)
      (((labels . directories) ...)
       (when (every file-exists? directories)
         (insert-files db (package-name package) (package-version package)
                       directories))
       (return #t)))))

(define (insert-packages db)
  "Insert all the current packages into DB."
  (with-store store
    (parameterize ((%graft? #f))
      (fold-packages (lambda (package _)
                       (run-with-store store
                         (insert-package db package)))
                     #t
                     #:select? (lambda (package)
                                 (and (not (hidden-package? package))
                                      (not (package-superseded package))
                                      (supported-package? package)))))))

(define-record-type <package-match>
  (package-match name version file)
  package-match?
  (name      package-match-name)
  (version   package-match-version)
  (file      package-match-file))

(define (matching-packages db file)
  "Return a list of <package-match> corresponding to packages containing
FILE."
  (define lookup-stmt
    (sqlite-prepare db "\
SELECT Packages.name, Packages.version, Directories.name, Files.name
FROM Packages
INNER JOIN Files, Directories
ON files.basename = :file AND directories.id = files.directory AND packages.id = directories.package;"))

  (sqlite-bind-arguments lookup-stmt #:file file)
  (sqlite-fold (lambda (result lst)
                 (match result
                   (#(package version directory file)
                    (cons (package-match package version
                                         (string-append directory "/" file))
                          lst))))
               '() lookup-stmt))


(define (file-database . args)
  (match args
    ((_ "populate")
     (call-with-database "/tmp/db"
       (lambda (db)
         (insert-packages db))))
    ((_ "search" file)
     (let ((matches (call-with-database "/tmp/db"
                      (lambda (db)
                        (matching-packages db file)))))
       (for-each (lambda (result)
                   (format #t "~20a ~a~%"
                           (string-append (package-match-name result)
                                          "@" (package-match-version result))
                           (package-match-file result)))
                 matches)
       (exit (pair? matches))))
    (_
     (format (current-error-port)
             "usage: file-database [populate|search] args ...~%")
     (exit 1))))

(apply file-database (command-line))

^ permalink raw reply	[flat|nested] 33+ messages in thread
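The schema and the basename lookup in the attached script can be tried without a Guix checkout.  What follows is a minimal sketch in Python, not the script itself: it reuses the three tables and the index from the attachment, while the helper names (`open_database`, `insert_files`, `matching_packages`) and the store path in the usage note are made up for illustration.

```python
import posixpath
import sqlite3

# Same tables and index as the `schema' string in the attachment.
SCHEMA = """
create table if not exists Packages (
  id integer primary key autoincrement not null,
  name text not null,
  version text not null
);
create table if not exists Directories (
  id integer primary key autoincrement not null,
  name text not null,
  package integer not null,
  foreign key (package) references Packages(id) on delete cascade
);
create table if not exists Files (
  name text not null,
  basename text not null,
  directory integer not null,
  foreign key (directory) references Directories(id) on delete cascade
);
create index if not exists IndexFiles on Files(basename);
"""


def open_database(path=":memory:"):
    """Open PATH (hypothetical helper) and make sure the schema exists."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def insert_files(db, package, version, directories):
    """Record files for PACKAGE at VERSION.

    DIRECTORIES maps each output directory to the list of file names
    relative to it, which stands in for what `find-files' returns.
    """
    with db:  # one transaction per package, committed on success
        cursor = db.execute(
            "INSERT INTO Packages(name, version) VALUES (?, ?)",
            (package, version))
        package_id = cursor.lastrowid
        for directory, files in directories.items():
            cursor = db.execute(
                "INSERT INTO Directories(name, package) VALUES (?, ?)",
                (directory, package_id))
            directory_id = cursor.lastrowid
            db.executemany(
                "INSERT INTO Files(name, basename, directory) VALUES (?, ?, ?)",
                [(name, posixpath.basename(name), directory_id)
                 for name in files])


def matching_packages(db, basename):
    """Return (name, version, full-path) tuples for files named BASENAME."""
    rows = db.execute("""
        SELECT Packages.name, Packages.version, Directories.name, Files.name
        FROM Files
        JOIN Directories ON Directories.id = Files.directory
        JOIN Packages ON Packages.id = Directories.package
        WHERE Files.basename = ?""", (basename,))
    return [(name, version, directory + "/" + file)
            for name, version, directory, file in rows]
```

For instance, after `insert_files(db, "guile", "3.0.7", {"/gnu/store/example-guile-3.0.7": ["bin/guile"]})`, calling `matching_packages(db, "guile")` returns the one matching row.  The lookup filters on `Files.basename`, the indexed column, which is presumably why the search step feels instantaneous even on a large database.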
* Re: File search
  2022-01-21  9:03 File search Ludovic Courtès
@ 2022-01-21 10:35 ` Mathieu Othacehe
  2022-01-22  0:35   ` Ludovic Courtès
  2022-01-21 19:00 ` Vagrant Cascadian
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 33+ messages in thread
From: Mathieu Othacehe @ 2022-01-21 10:35 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Guix Devel

Hello Ludo!

> Lately I found myself going several times to
> <https://packages.debian.org> to look for packages providing a given
> file and I thought it’s time to do something about it.

Yeah, I also think about it regularly, but I keep giving up because
setting up this mechanism properly turns out to be much more complex
than initially expected.

> The script below creates an SQLite database for the current set of
> packages, but only for those already in the store:
>
>   guix repl file-database.scm populate
>
> That creates /tmp/db; it took about 25mn on berlin, for 18K packages.
> Then you can run, say:
>
>   guix repl file-database.scm search boot-9.scm

Nice proof of concept :).

> I think accuracy (making sure you get results that correspond precisely
> to, say, your current channel revisions and your current system) is not
> a high priority: some result is better than no result.  Likewise for
> freshness: results for an older version of a given package may still be
> valid now.

Agreed.

> In terms of privacy, I think it’s better if we can avoid making one
> request per file searched for.  Off-line operation would be sweet, and
> it comes with responsiveness; fast off-line search is necessary for
> things like ‘command-not-found’ (where the shell tells you what package
> to install when a command is not found).

Yeah, that's the tricky part. In terms of maintenance, it would probably
be easier to have Cuirass index the packages it's building, store the
results in the PostgreSQL database and serve them using the Cuirass web
server. The pros are that we only rely on one database which is very
important in my opinion. It's also relatively easy to set up. The cons
are that you need to be online to access this API.

If we instead decide to periodically build an SQLite database indexing
all the packages in a cron job or so, users would still need to download
it, which would be an expensive operation as you mentioned. It would
also be difficult to index custom Guix channels with that approach.

Another solution could be to have guix publish index the files from the
NAR in its cache and provide a file searching API. That would still
require being online, but it would allow searching from multiple publish
servers, hence possibly multiple Guix channels. The packages that do not
have substitutes couldn't be searched, which is a strong con. I would
still maybe have a preference for that option.

WDYT?

Thanks,

Mathieu
* Re: File search
  2022-01-21 10:35 ` Mathieu Othacehe
@ 2022-01-22  0:35   ` Ludovic Courtès
  0 siblings, 0 replies; 33+ messages in thread
From: Ludovic Courtès @ 2022-01-22  0:35 UTC (permalink / raw)
  To: Mathieu Othacehe; +Cc: Guix Devel

Hi!

Mathieu Othacehe <othacehe@gnu.org> skribis:

>> I think accuracy (making sure you get results that correspond precisely
>> to, say, your current channel revisions and your current system) is not
>> a high priority: some result is better than no result.  Likewise for
>> freshness: results for an older version of a given package may still be
>> valid now.
>
> Agreed.
>
>> In terms of privacy, I think it’s better if we can avoid making one
>> request per file searched for.  Off-line operation would be sweet, and
>> it comes with responsiveness; fast off-line search is necessary for
>> things like ‘command-not-found’ (where the shell tells you what package
>> to install when a command is not found).
>
> Yeah, that's the tricky part. In terms of maintenance, it would probably
> be easier to have Cuirass index the packages it's building, store the
> results in the PostgreSQL database and serve them using the Cuirass web
> server. The pros are that we only rely on one database which is very
> important in my opinion. It's also relatively easy to set up. The cons
> are that you need to be online to access this API.

Like I wrote, I don’t think we should do on-line only; we need users to
be able to download a database at least for ‘command-not-found’.

> If we instead decide to periodically build an SQLite database indexing
> all the packages in a cron job or so, users would still need to download
> it, which would be an expensive operation as you mentioned. It would
> also be difficult to index custom Guix channels with that approach.

True!  Though for this matter I’d be very pragmatic and wouldn’t mind
sacrificing third-party channels until we have a better idea.

> Another solution could be to have guix publish index the files from the
> NAR in its cache and provide a file searching API. That would still
> require being online, but it would allow searching from multiple publish
> servers, hence possibly multiple Guix channels. The packages that do not
> have substitutes couldn't be searched, which is a strong con. I would
> still maybe have a preference for that option.

I also thought about doing it in ‘guix publish’.  One problem is that
it’s not the right level of abstraction: it publishes everything, not
just packages, and it can only guess whether something is a package and
what its name and version are.

Another option would be to have ‘guix publish’ provide digests (file
lists), similar to what I did in:

  https://lists.gnu.org/archive/html/guix-devel/2021-01/msg00080.html
  https://git.savannah.gnu.org/cgit/guix.git/log?h=wip-digests

That way, ‘file-database.scm populate’ could fetch those digests instead
of whole nars (or solely local info).  Users would have to run it
regularly.

The Nix folks have <https://github.com/bennofs/nix-index>, which
apparently creates a database based on substitutes.  Looking at
<https://github.com/bennofs/nix-index/blob/master/src/hydra.rs#L401>, it
seems Hydra provides “file listings”, similar to digests.  Then it’s up
to users to regularly run the indexer so they can use ‘nix-locate’,
which is typically done via a local cron job (similar to how one would
use ‘updatedb’).  Populating the database the first time may be rather
costly though.

Maybe that’s a more reasonable approach?

All that said, I think we could very much have, in parallel, a fancier
database, be it in the Data Service or in Cuirass, that one could query
on-line.  Its implementation would actually be less constrained.

Ludo’.
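To make the ‘command-not-found’ case more concrete, here is a hedged sketch, in Python, of how a commands-only subset could be derived from a file database.  It assumes, as in the script from the first message, that `Files.name` is relative to the output directory (so a command shows up as `bin/NAME` or `sbin/NAME`); the simplified schema, the function names `command_index` and `command_not_found`, and the message wording are assumptions made for illustration only.

```python
import sqlite3

# Simplified variant of the schema from the original script (assumption:
# only the columns needed for this query are kept).
SCHEMA = """
create table Packages (id integer primary key, name text not null,
                       version text not null);
create table Directories (id integer primary key, name text not null,
                          package integer not null references Packages(id));
create table Files (name text not null, basename text not null,
                    directory integer not null references Directories(id));
"""

# Keep only direct children of bin/ and sbin/.
COMMANDS_QUERY = """
SELECT DISTINCT Files.basename, Packages.name, Packages.version
FROM Files
JOIN Directories ON Directories.id = Files.directory
JOIN Packages ON Packages.id = Directories.package
WHERE Files.name = 'bin/' || Files.basename
   OR Files.name = 'sbin/' || Files.basename
ORDER BY Files.basename, Packages.name, Packages.version;
"""


def command_index(db):
    """Map each command name to the list of "name@version" providers."""
    index = {}
    for command, name, version in db.execute(COMMANDS_QUERY):
        index.setdefault(command, []).append(f"{name}@{version}")
    return index


def command_not_found(index, command):
    """Return the hint a shell hook could print for an unknown COMMAND."""
    providers = index.get(command)
    if not providers:
        return f"{command}: command not found"
    return (f"{command}: command not found; try installing one of: "
            + ", ".join(providers))
```

The resulting mapping is what a small, regularly downloaded database would need to contain; everything outside `bin/` and `sbin/` could then be left to an on-line service.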
* Re: File search
  2022-01-21  9:03 File search Ludovic Courtès
  2022-01-21 10:35 ` Mathieu Othacehe
@ 2022-01-21 19:00 ` Vagrant Cascadian
  2022-01-22  0:37   ` Ludovic Courtès
  2022-01-22  4:46 ` raingloom
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 33+ messages in thread
From: Vagrant Cascadian @ 2022-01-21 19:00 UTC (permalink / raw)
  To: Ludovic Courtès, Guix Devel

[-- Attachment #1: Type: text/plain, Size: 2181 bytes --]

On 2022-01-21, Ludovic Courtès wrote:
> Lately I found myself going several times to
> <https://packages.debian.org> to look for packages providing a given
> file and I thought it’s time to do something about it.

Hah!

> The script below creates an SQLite database for the current set of
> packages, but only for those already in the store:
>
>   guix repl file-database.scm populate
...
> I think accuracy (making sure you get results that correspond precisely
> to, say, your current channel revisions and your current system) is not
> a high priority: some result is better than no result.  Likewise for
> freshness: results for an older version of a given package may still be
> valid now.

Hear hear!

> In terms of privacy, I think it’s better if we can avoid making one
> request per file searched for.  Off-line operation would be sweet, and
> it comes with responsiveness; fast off-line search is necessary for
> things like ‘command-not-found’ (where the shell tells you what package
> to install when a command is not found).
>
> Based on that, it is tempting to just distribute a full database from
> ci.guix, say, that the client command would regularly fetch.  The
> downside is that that’s quite a lot of data to download; if you use the
> file search command infrequently, you might find yourself spending more
> time downloading the database than actually searching it.
>
> We could have a hybrid solution: distribute a database that contains
> only files in /bin and /sbin (it should be much smaller), and for
> everything else, resort to a web service (the Data Service could be
> extended to include file lists).  That way, we’d have fast
> privacy-respecting search for command names, and on-line search for
> everything else.

What about ... a roughly weekly job that runs on ci.guix to create the
database and packages of parts of the database and a channel that
includes those and utilities to query them so that you can install the
packages and refresh them at your leisure...

Or just put the packages in the main repository, and update it manually
roughly weekly?

live well,
  vagrant

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]
* Re: File search
  2022-01-21 19:00 ` Vagrant Cascadian
@ 2022-01-22  0:37   ` Ludovic Courtès
  2022-01-22  2:53     ` Maxim Cournoyer
  0 siblings, 1 reply; 33+ messages in thread
From: Ludovic Courtès @ 2022-01-22  0:37 UTC (permalink / raw)
  To: Vagrant Cascadian; +Cc: Guix Devel

Hi!

Vagrant Cascadian <vagrant@debian.org> skribis:

> What about ... a roughly weekly job that runs on ci.guix to create the
> database and packages of parts of the database and a channel that
> includes those and utilities to query them so that you can install the
> packages and refresh them at your leisure...
>
> Or just put the packages in the main repository, and update it manually
> roughly weekly?

Making the database a package (or set of packages) sounds quite weird or
at least very unusual from a Guix viewpoint, where packages normally
describe build procedures.

But a cron job publishing a file somewhere, why not!

Thanks,
Ludo’.
* Re: File search
  2022-01-22  0:37   ` Ludovic Courtès
@ 2022-01-22  2:53     ` Maxim Cournoyer
  2022-01-25 11:15       ` Ludovic Courtès
  0 siblings, 1 reply; 33+ messages in thread
From: Maxim Cournoyer @ 2022-01-22  2:53 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Vagrant Cascadian, Guix Devel

Hi Ludovic,

Thank you for this valuable initiative :-).  I like that it fits in a
few lines and should already be useful for local searches, given a
minimal front-end command to query it.

Ludovic Courtès <ludo@gnu.org> writes:

> Hi!
>
> Vagrant Cascadian <vagrant@debian.org> skribis:
>
>> What about ... a roughly weekly job that runs on ci.guix to create the
>> database and packages of parts of the database and a channel that
>> includes those and utilities to query them so that you can install the
>> packages and refresh them at your leisure...
>>
>> Or just put the packages in the main repository, and update it manually
>> roughly weekly?
>
> Making the database a package (or set of packages) sounds quite weird or
> at least very unusual from a Guix viewpoint, where packages normally
> describe build procedures.
>
> But a cron job publishing a file somewhere, why not!

I also had the idea of making it a package... this way only the people
who opt to install the database locally would incur the cost (in
bandwidth).

Perhaps a question for Vagrant: talking about size, is this SQLite
database file comparable to or smaller than the apt-file database that
needs to be downloaded?  With the Debian software catalog being about
30% bigger, I'd expect a similarly bigger file size.

If Debian is doing better in terms of database file size, we could look
at how they're doing it.

Thank you!

Maxim
* Re: File search
  2022-01-22  2:53     ` Maxim Cournoyer
@ 2022-01-25 11:15       ` Ludovic Courtès
  2022-01-25 11:20         ` Oliver Propst
  0 siblings, 1 reply; 33+ messages in thread
From: Ludovic Courtès @ 2022-01-25 11:15 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: Vagrant Cascadian, Guix Devel

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

> I also had the idea of making it a package... this way only the people
> who opt to install the database locally would incur the cost (in
> bandwidth).
>
> Perhaps a question for Vagrant: talking about size, is this SQLite
> database file comparable or smaller in size to the apt-file database
> that needs to be downloaded?  With the Debian software catalog being
> about 30% bigger, I'd expect a similarly bigger file size.
>
> If Debian is doing better in terms of database file size, we could look
> at how they're doing it.

As a back-of-the-envelope estimate, here’s the amount of text that needs
to be available in the database:

--8<---------------cut here---------------start------------->8---
ludo@berlin ~/src$ sqlite3 -csv /tmp/db 'select name,version from packages; select name from directories;select name from files;'|wc -c
197689978
ludo@berlin ~/src$ guile -c '(pk (/ 197689978 (expt 2. 20)))'

;;; (188.5318546295166)
ludo@berlin ~/src$ du -h /tmp/db
389M    /tmp/db
--8<---------------cut here---------------end--------------->8---

So roughly, SQLite with this particular schema ends up taking twice as
much space as the lower bound.

We can do a bit better (I’m not an expert, so I’m just trying things
naively) by dropping the index and cleaning up the database:

--8<---------------cut here---------------start------------->8---
ludo@berlin ~/src$ cp /tmp/db{,.without-index}
ludo@berlin ~/src$ sqlite3 /tmp/db.without-index
SQLite version 3.32.3 2020-06-18 14:00:33
Enter ".help" for usage hints.
sqlite> drop index IndexFiles;
sqlite> .quit
ludo@berlin ~/src$ du -h /tmp/db.without-index
389M    /tmp/db.without-index
ludo@berlin ~/src$ sqlite3 /tmp/db.without-index
SQLite version 3.32.3 2020-06-18 14:00:33
Enter ".help" for usage hints.
sqlite> vacuum;
sqlite> .quit
ludo@berlin ~/src$ du -h /tmp/db.without-index
290M    /tmp/db.without-index
--8<---------------cut here---------------end--------------->8---

With compression:

--8<---------------cut here---------------start------------->8---
ludo@berlin ~/src$ zstd -19 < /tmp/db.without-index > /tmp/db.without-index.zst
ludo@berlin ~/src$ du -h /tmp/db.without-index.zst
37M     /tmp/db.without-index.zst
--8<---------------cut here---------------end--------------->8---

(Down from 61MB.)

For comparison, this is smaller than guile, perl, gtk+, and roughly the
same as glibc:out.

For the record, with compression, the lower bound is about 12 MiB:

--8<---------------cut here---------------start------------->8---
ludo@berlin ~/src$ sqlite3 -csv /tmp/db 'select name,version from packages; select name from directories;select name from files;'|zstd -19|wc -c
12128674
ludo@berlin ~/src$ guile -c '(pk (/ 12128674 (expt 2. 20)))'

;;; (11.566804885864258)
--8<---------------cut here---------------end--------------->8---

All this to say that we could distribute the database in a form that
gets closer to the optimal size, at the expense of extra processing on
the client side upon reception to put it into shape (creating an index,
etc.).

Ludo’.
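The “ship it small, index it on arrival” idea can be sketched in a few lines.  The Python below is an illustration under assumptions, not a proposed implementation: the function names (`mib`, `strip_for_distribution`, `restore_index`) are hypothetical, and only the table and index names (`Files`, `IndexFiles`) and the two byte counts come from the messages in this thread.

```python
import os
import sqlite3


def mib(size_in_bytes):
    """Convert a byte count to MiB, like the `guile -c' one-liners."""
    return size_in_bytes / 2**20


def strip_for_distribution(path):
    """Server side: drop the basename index, then compact the file.

    Returns the resulting file size in bytes.
    """
    db = sqlite3.connect(path)
    db.execute("DROP INDEX IF EXISTS IndexFiles")
    db.commit()           # VACUUM cannot run inside a transaction
    db.execute("VACUUM")  # give the freed pages back to the file system
    db.close()
    return os.path.getsize(path)


def restore_index(path):
    """Client side: recreate the index after download.

    Returns the resulting file size in bytes.
    """
    db = sqlite3.connect(path)
    db.execute("CREATE INDEX IF NOT EXISTS IndexFiles ON Files(basename)")
    db.commit()
    db.close()
    return os.path.getsize(path)


print(round(mib(197689978), 1))  # 188.5: uncompressed text, in MiB
print(round(mib(12128674), 1))   # 11.6: same text after zstd -19, in MiB
```

The trade-off is the one described above: the published file carries no index and no free pages, and each client pays a one-time indexing cost after downloading it.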
* Re: File search
  2022-01-25 11:15       ` Ludovic Courtès
@ 2022-01-25 11:20         ` Oliver Propst
  2022-01-25 11:22           ` Oliver Propst
  0 siblings, 1 reply; 33+ messages in thread
From: Oliver Propst @ 2022-01-25 11:20 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Vagrant Cascadian, Guix Devel, Maxim Cournoyer

On 2022-01-25 12:15, Ludovic Courtès wrote:

I'm also not an expert at SQLite, but I can state that the effort looks
very nice and promising, Ludovic :)

-- 
Kind regards Oliver Propst
https://twitter.com/Opropst
* Re: File search
  2022-01-25 11:20         ` Oliver Propst
@ 2022-01-25 11:22           ` Oliver Propst
  0 siblings, 0 replies; 33+ messages in thread
From: Oliver Propst @ 2022-01-25 11:22 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Vagrant Cascadian, Guix Devel, Maxim Cournoyer

On 2022-01-25 12:20, Oliver Propst wrote:
> On 2022-01-25 12:15, Ludovic Courtès wrote:
> I'm also not an expert at SQLite, but I can state that the effort
> looks very nice and promising, Ludovic :)

And definitely a step up from the current implementation (obviously).

-- 
Kind regards Oliver Propst
https://twitter.com/Opropst
* Re: File search
  2022-01-21  9:03 File search Ludovic Courtès
  2022-01-21 10:35 ` Mathieu Othacehe
  2022-01-21 19:00 ` Vagrant Cascadian
@ 2022-01-22  4:46 ` raingloom
  2022-01-22  7:55   ` Ricardo Wurmus
  2022-01-25 23:45 ` Ryan Prior
  2022-02-06 13:27 ` André A. Gomes
  4 siblings, 1 reply; 33+ messages in thread
From: raingloom @ 2022-01-22  4:46 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Guix Devel

On Fri, 21 Jan 2022 10:03:43 +0100
Ludovic Courtès <ludo@gnu.org> wrote:

> Hello Guix!
>
> Lately I found myself going several times to
> <https://packages.debian.org> to look for packages providing a given
> file and I thought it’s time to do something about it.
>
> The script below creates an SQLite database for the current set of
> packages, but only for those already in the store:
>
>   guix repl file-database.scm populate
>
> That creates /tmp/db; it took about 25mn on berlin, for 18K packages.
> Then you can run, say:
>
>   guix repl file-database.scm search boot-9.scm
>
> to find which packages provide a file named ‘boot-9.scm’.  That part
> is instantaneous.
>
> The database for 18K packages is quite big:
>
> --8<---------------cut here---------------start------------->8---
> $ du -h /tmp/db*
> 389M    /tmp/db
> 82M     /tmp/db.gz
> 61M     /tmp/db.zst
> --8<---------------cut here---------------end--------------->8---
>
> How do we expose that information?  There are several criteria I can
> think of: accuracy, freshness, privacy, responsiveness, off-line
> operation.
>
> I think accuracy (making sure you get results that correspond
> precisely to, say, your current channel revisions and your current
> system) is not a high priority: some result is better than no result.
> Likewise for freshness: results for an older version of a given
> package may still be valid now.
>
> In terms of privacy, I think it’s better if we can avoid making one
> request per file searched for.  Off-line operation would be sweet, and
> it comes with responsiveness; fast off-line search is necessary for
> things like ‘command-not-found’ (where the shell tells you what
> package to install when a command is not found).
>
> Based on that, it is tempting to just distribute a full database from
> ci.guix, say, that the client command would regularly fetch.  The
> downside is that that’s quite a lot of data to download; if you use
> the file search command infrequently, you might find yourself
> spending more time downloading the database than actually searching
> it.
>
> We could have a hybrid solution: distribute a database that contains
> only files in /bin and /sbin (it should be much smaller), and for
> everything else, resort to a web service (the Data Service could be
> extended to include file lists).  That way, we’d have fast
> privacy-respecting search for command names, and on-line search for
> everything else.
>
> Thoughts?
>
> Ludo’.
>

One use case that I hope can be addressed is TeXlive packages. Trying
to figure out which package corresponded to which missing file was a
nightmare the last time I had to use LaTeX.
* Re: File search
  2022-01-22  4:46 ` raingloom
@ 2022-01-22  7:55   ` Ricardo Wurmus
  2022-01-24 15:48     ` Ludovic Courtès
  0 siblings, 1 reply; 33+ messages in thread
From: Ricardo Wurmus @ 2022-01-22  7:55 UTC (permalink / raw)
  To: raingloom; +Cc: guix-devel

raingloom <raingloom@riseup.net> writes:

> One use case that I hope can be addressed is TeXlive packages. Trying
> to figure out which package corresponded to which missing file was a
> nightmare the last time I had to use LaTeX.

The texlive package database is the authoritative source of information.
The file texlive.tlpdb is included in the texlive-bin package, and we’re
using it in the importer.  I also added (@ (guix import texlive)
files-differ?), which compares a texlive package’s output directory with
the files that the texlive.tlpdb lists for that package.

You can also use it to check what package should provide a certain file.
I do this all the time to figure out if our existing packages are
incomplete or if we’re just missing a package.

-- 
Ricardo
* Re: File search
  2022-01-22  7:55   ` Ricardo Wurmus
@ 2022-01-24 15:48     ` Ludovic Courtès
  2022-01-24 17:03       ` Ricardo Wurmus
  0 siblings, 1 reply; 33+ messages in thread
From: Ludovic Courtès @ 2022-01-24 15:48 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: guix-devel

Ricardo Wurmus <rekado@elephly.net> skribis:

> raingloom <raingloom@riseup.net> writes:
>
>> One use case that I hope can be addressed is TeXlive packages. Trying
>> to figure out which package corresponded to which missing file was a
>> nightmare the last time I had to use LaTeX.
>
> The texlive package database is the authoritative source of information.
> The file texlive.tlpdb is included in the texlive-bin package, and we’re
> using it in the importer.  I also added (@ (guix import texlive)
> files-differ?), which compares a texlive package’s output directory with
> the files that the texlive.tlpdb lists for that package.
>
> You can also use it to check what package should provide a certain file.
> I do this all the time to figure out if our existing packages are
> incomplete or if we’re just missing a package.

Oh, I had never tried that.  Is there a command that browses
texlive.tlpdb, or do you just open it or grep it?

Thanks for the tip,
Ludo’.
* Re: File search
  2022-01-24 15:48     ` Ludovic Courtès
@ 2022-01-24 17:03       ` Ricardo Wurmus
  2022-02-02 16:14         ` Maxim Cournoyer
  0 siblings, 1 reply; 33+ messages in thread
From: Ricardo Wurmus @ 2022-01-24 17:03 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

Ludovic Courtès <ludo@gnu.org> writes:

> Ricardo Wurmus <rekado@elephly.net> skribis:
>
>> raingloom <raingloom@riseup.net> writes:
>>
>>> One use case that I hope can be addressed is TeXlive packages. Trying
>>> to figure out which package corresponded to which missing file was a
>>> nightmare the last time I had to use LaTeX.
>>
>> The texlive package database is the authoritative source of information.
>> The file texlive.tlpdb is included in the texlive-bin package, and we’re
>> using it in the importer.  I also added (@ (guix import texlive)
>> files-differ?), which compares a texlive package’s output directory with
>> the files that the texlive.tlpdb lists for that package.
>>
>> You can also use it to check what package should provide a certain file.
>> I do this all the time to figure out if our existing packages are
>> incomplete or if we’re just missing a package.
>
> Oh, I had never tried that.  Is there a command that browses
> texlive.tlpdb, or do you just open it or grep it?

I just have it open in Emacs and search inside.  But we could easily add
a procedure to (guix import texlive) to check the texlive.tlpdb.  All
the hard work has already been done; we’re using the same mechanism for
the importer and “files-differ?”.

-- 
Ricardo
* Re: File search 2022-01-24 17:03 ` Ricardo Wurmus @ 2022-02-02 16:14 ` Maxim Cournoyer 2022-02-05 11:15 ` Ludovic Courtès 0 siblings, 1 reply; 33+ messages in thread From: Maxim Cournoyer @ 2022-02-02 16:14 UTC (permalink / raw) To: Ricardo Wurmus; +Cc: guix-devel Hi, Ricardo Wurmus <rekado@elephly.net> writes: > Ludovic Courtès <ludo@gnu.org> writes: > >> Ricardo Wurmus <rekado@elephly.net> skribis: >> >>> raingloom <raingloom@riseup.net> writes: >>> >>>> One use case that I hope can be addressed is TeXlive packages. Trying >>>> to figure out which package corresponded to which missing file was a >>>> nightmare the last I had to use LaTeX. >>> >>> The texlive package database is the authoritative source of information. >>> The file texlive.tlpdb is included in the texlive-bin package, and we’re >>> using it in the importer. I also added (@ (guix import texlive) >>> files-differ?), which compares a texlive package’s output directory with >>> the files that the texlive.tlpdb lists for that package. >>> >>> You can also use it to check what package should provide a certain file. >>> I do this all the time to figure out if our existing packages are >>> incomplete or if we’re just missing a package. >> >> Oh, I had never tried that. Is there a command that browses >> texlive.tlpdb, or do you just open it or grep it? > > I just have it open in Emacs and search inside. But we could easily add > a procedure to (guix import texlive) to check the texlive.tlpdb. All > the hard work has already been done; we’re using the same mechanism for > the importer and “files-differ?”. 
It used to be broken, but with the c-u-f merge the 'tlmgr' tool now
works as expected to search for things in the local texlive.tlpdb
database:

--8<---------------cut here---------------start------------->8---
$ guix shell --pure texlive-bin grep which coreutils sed gnupg -- tlmgr info cite.sty
tlmgr: cannot find package cite.sty, searching for other matches:

Packages containing `cite.sty' in their title/description:

Packages containing files matching `cite.sty':
abntex2:
    texmf-dist/tex/latex/abntex2/abntex2cite.sty
apacite:
    texmf-dist/tex/latex/apacite/apacite.sty
chscite:
    texmf-dist/tex/latex/chscite/chscite.sty
cite:
    texmf-dist/tex/latex/cite/cite.sty
    texmf-dist/tex/latex/cite/drftcite.sty
    texmf-dist/tex/latex/cite/overcite.sty
combine:
    texmf-dist/tex/latex/combine/combcite.sty
computational-complexity:
    texmf-dist/tex/latex/computational-complexity/cc2cite.sty
    texmf-dist/tex/latex/computational-complexity/cccite.sty
emojicite:
    texmf-dist/tex/lualatex/emojicite/emojicite.sty
gcite:
    texmf-dist/tex/latex/gcite/gcite.sty
icite:
    texmf-dist/tex/latex/icite/icite.sty
kluwer:
    texmf-dist/tex/latex/kluwer/klucite.sty
lwarp:
    texmf-dist/tex/latex/lwarp/lwarp-cite.sty
    texmf-dist/tex/latex/lwarp/lwarp-drftcite.sty
mcite:
    texmf-dist/tex/latex/mcite/mcite.sty
notoccite:
    texmf-dist/tex/latex/notoccite/notoccite.sty
velthuis:
    texmf-dist/tex/latex/velthuis/dvngcite.sty
xcite:
    texmf-dist/tex/latex/xcite/xcite.sty
--8<---------------cut here---------------end--------------->8---

It's not great that references to 'grep which coreutils sed gnupg'
aren't patched though (I thought I had made sure they were; perhaps it
regressed).

HTH,

Maxim

^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: File search
  2022-02-02 16:14 ` Maxim Cournoyer
@ 2022-02-05 11:15 ` Ludovic Courtès
  0 siblings, 0 replies; 33+ messages in thread
From: Ludovic Courtès @ 2022-02-05 11:15 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: guix-devel

Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

> It used to be broken, but with the c-u-f merge the 'tlmgr' tool now
> works as expected to search for things in the local texlive.tlpdb
> database:
>
> $ guix shell --pure texlive-bin grep which coreutils sed gnupg -- tlmgr info cite.sty
> tlmgr: cannot find package cite.sty, searching for other matches:
>
> Packages containing `cite.sty' in their title/description:
>
> Packages containing files matching `cite.sty':
> abntex2:
>     texmf-dist/tex/latex/abntex2/abntex2cite.sty

Nice!  I think we really need a section in the manual on TeX Live
usage, with tips and tricks like this.

Ludo’.

^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: File search
  2022-01-21 9:03 File search Ludovic Courtès
  ` (2 preceding siblings ...)
  2022-01-22 4:46 ` raingloom
@ 2022-01-25 23:45 ` Ryan Prior
  2022-02-05 11:18 ` Ludovic Courtès
  2022-02-06 13:27 ` André A. Gomes
  4 siblings, 1 reply; 33+ messages in thread
From: Ryan Prior @ 2022-01-25 23:45 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Guix Devel

On Friday, January 21st, 2022 at 9:03 AM, Ludovic Courtès <ludo@gnu.org> wrote:

> The database for 18K packages is quite big:
>
> --8<---------------cut here---------------start------------->8---
> $ du -h /tmp/db*
> 389M /tmp/db
> 82M /tmp/db.gz
> 61M /tmp/db.zst
> --8<---------------cut here---------------end--------------->8---
> [snip]
> In terms of privacy, I think it’s better if we can avoid making
> one request per file searched for. Off-line operation would be
> sweet, and it comes with responsiveness; fast off-line search is
> necessary for things like ‘command-not-found’ (where the shell
> tells you what package to install when a command is not found).

Offline operation is crucial, and I don't think it's desirable to
download tens or hundreds of megabytes. What about creating &
distributing a bloom filter per package, with members being file names?
This would allow us to dramatically reduce the size of data we
distribute, at the cost of not giving 100% reliable answers. We've
established, though, that some information is better than none, and the
uncertainty can be resolved by querying a web service or building the
package locally and searching its directory.

^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: File search
  2022-01-25 23:45 ` Ryan Prior
@ 2022-02-05 11:18 ` Ludovic Courtès
  0 siblings, 0 replies; 33+ messages in thread
From: Ludovic Courtès @ 2022-02-05 11:18 UTC (permalink / raw)
  To: Ryan Prior; +Cc: Guix Devel

Hi,

Ryan Prior <rprior@protonmail.com> skribis:

> On Friday, January 21st, 2022 at 9:03 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> The database for 18K packages is quite big:
>>
>> --8<---------------cut here---------------start------------->8---
>> $ du -h /tmp/db*
>> 389M /tmp/db
>> 82M /tmp/db.gz
>> 61M /tmp/db.zst
>> --8<---------------cut here---------------end--------------->8---
>> [snip]
>> In terms of privacy, I think it’s better if we can avoid making
>> one request per file searched for. Off-line operation would be
>> sweet, and it comes with responsiveness; fast off-line search is
>> necessary for things like ‘command-not-found’ (where the shell
>> tells you what package to install when a command is not found).
>
> Offline operation is crucial, and I don't think it's desirable to
> download tens or hundreds of megabytes. What about creating &
> distributing a bloom filter per package, with members being file
> names? This would allow us to dramatically reduce the size of data we
> distribute, at the cost of not giving 100% reliable answers. We've
> established, though, that some information is better than none, and
> the uncertainty can be resolved by querying a web service or building
> the package locally and searching its directory.

My understanding is that Bloom filters are sets essentially, but here we
need more than that: we need to map files to package names.

Or am I misunderstanding what you have in mind?

Thanks,
Ludo’.

^ permalink raw reply [flat|nested] 33+ messages in thread
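One way to read the "one Bloom filter per package" proposal is that the file-to-package mapping comes from testing the file name against every package's filter in turn. The sketch below illustrates that reading only; it is not code from this thread. The filter size, the number of hash functions, the seeding scheme, and all procedure names are illustrative assumptions.

```scheme
(use-modules (srfi srfi-1))   ;every, filter-map

;; Illustrative parameters, not tuned values.
(define %filter-size 1024)    ;number of bits per filter
(define %hash-count 3)        ;number of hash functions

(define (bit-indices file-name)
  "Return the %HASH-COUNT bit positions for FILE-NAME.  Each hash
function is simulated by prefixing the name with a distinct seed."
  (map (lambda (seed)
         (string-hash (string-append (number->string seed) ":" file-name)
                      %filter-size))
       (iota %hash-count)))

(define (make-bloom-filter file-names)
  "Return a Bloom filter, here a vector of booleans, for FILE-NAMES."
  (let ((bits (make-vector %filter-size #f)))
    (for-each (lambda (file-name)
                (for-each (lambda (index)
                            (vector-set! bits index #t))
                          (bit-indices file-name)))
              file-names)
    bits))

(define (bloom-filter-contains? bits file-name)
  "Return #f if FILE-NAME is definitely absent from BITS, and #t if it
may be present (false positives are possible)."
  (every (lambda (index)
           (vector-ref bits index))
         (bit-indices file-name)))

(define (candidate-packages filters file-name)
  "FILTERS is an alist of (PACKAGE . FILTER).  Return the names of the
packages whose filter may contain FILE-NAME."
  (filter-map (lambda (entry)
                (and (bloom-filter-contains? (cdr entry) file-name)
                     (car entry)))
              filters))

;; Made-up data: two packages and the base names of their files.
(define %filters
  (list (cons "guile" (make-bloom-filter '("boot-9.scm" "guile")))
        (cons "htop" (make-bloom-filter '("htop")))))

(define %candidates
  (candidate-packages %filters "boot-9.scm"))
```

The result always includes "guile", because a Bloom filter has no false negatives; it may occasionally include other packages as well. A query costs one filter test per package, so the approach trades lookup time and exactness for a smaller download.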
* Re: File search
  2022-01-21 9:03 File search Ludovic Courtès
  ` (3 preceding siblings ...)
  2022-01-25 23:45 ` Ryan Prior
@ 2022-02-06 13:27 ` André A. Gomes
  4 siblings, 0 replies; 33+ messages in thread
From: André A. Gomes @ 2022-02-06 13:27 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Guix Devel

Ludovic Courtès <ludo@gnu.org> writes:

> Hello Guix!
>
> Lately I found myself going several times to
> <https://packages.debian.org> to look for packages providing a given
> file and I thought it’s time to do something about it.

My understanding is very limited, but I thought that the following blog
post could be of some help.

https://batsov.com/articles/2022/01/22/how-to-find-which-package-a-file-belongs-to-in-debian-ubuntu/

--
André A. Gomes
"Free Thought, Free World"

^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: File search
@ 2022-12-02 17:58 antoine.romain.dumont
  2022-12-02 18:22 ` Antoine R. Dumont (@ardumont)
  0 siblings, 1 reply; 33+ messages in thread
From: antoine.romain.dumont @ 2022-12-02 17:58 UTC (permalink / raw)
  To: guix-devel

[-- Attachment #1: Type: text/plain, Size: 3099 bytes --]

Hello Guix!

Guix is top so thanks for the awesome work!

Just to give some feedback on this thread: it's good news that file
search functionality is on the radar.

> Lately I found myself going several times to
> <https://packages.debian.org> to look for packages providing a given
> file and I thought it’s time to do something about it.

I've finally started to set up my machine with Guix System (and
Guix Home). Finding out where a given program or CLI is packaged is
definitely something that I need in order to port my existing setup
(mainly nixified Debian or NixOS machines) to Guix.

And to answer such questions, I used existing "offline" programs on my
machines. I've bounced back and forth between `nix-locate` and `apt-file
search` to determine approximately the packages in Guix (names aren't
usually that different).

Hence, as a user, it's one of my expectations that the Guix CLI provides
some equivalent program to look up a package from a file ;).

> The script below creates an SQLite database for the current set of
> packages, but only for those already in the store:
>
>   guix repl file-database.scm populate
>
> That creates /tmp/db; it took about 25mn on berlin, for 18K packages.
> Then you can run, say:
>
>   guix repl file-database.scm search boot-9.scm
>
> to find which packages provide a file named ‘boot-9.scm’. That part is
> instantaneous.
>
> The database for 18K packages is quite big:
>
> --8<---------------cut here---------------start------------->8---
> $ du -h /tmp/db*
> 389M /tmp/db
> 82M /tmp/db.gz
> 61M /tmp/db.zst
> --8<---------------cut here---------------end--------------->8---

For information, in a more recent implementation (which @civodul
provided me in #guix-devel), I noticed that multiple calls to the
indexing step would duplicate information (at all levels: packages,
files, directories). So that might have had an impact on the sizes
reported above (if ludo had run the script several times at the time).

Just so you know, I have started iterating a bit on that provided
implementation (and fixed the caveat mentioned above), added some help
message... I'll follow up with it in a bit (same thread) to get some
more feedback on it.

> How do we expose that information? There are several criteria I can
> think of: accuracy, freshness, privacy, responsiveness, off-line
> operation.
>
> I think accuracy (making sure you get results that correspond precisely
> to, say, your current channel revisions and your current system) is not
> a high priority: some result is better than no result.

I definitely agree with this. At least from the offline use perspective.
I did not focus at all on the second part of the problem ("online"
and distribution use).

> Likewise for freshness: results for an older version of a given
> package may still be valid now.

Indeed.

Cheers,
--
tony / Antoine R. Dumont (@ardumont)

-----------------------------------------------------------------
gpg fingerprint BF00 203D 741A C9D5 46A8 BE07 52E2 E984 0D10 C3B8

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 877 bytes --]

^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: File search
  2022-12-02 17:58 antoine.romain.dumont
@ 2022-12-02 18:22 ` Antoine R. Dumont (@ardumont)
  2022-12-03 18:19 ` Ludovic Courtès
  0 siblings, 1 reply; 33+ messages in thread
From: Antoine R. Dumont (@ardumont) @ 2022-12-02 18:22 UTC (permalink / raw)
  To: guix-devel

[-- Attachment #1: Type: text/plain, Size: 19430 bytes --]

Hello again,

As mentioned previously, I have iterated on the work @civodul started.
After discussing it a bit over IRC, I proposed to try and push a bit
forward the discussion and the implementation [2] to see where this
goes.

After toying a bit with the initial code, I took the liberty to make it
a guix extension (we discussed it a bit with @zimoun). It was mostly to
get started with Guile (I know some lisp implems but not this one so I
had to familiarize myself with tools and whatnot ;). Anyway, that can be
reverted if you feel like it can be integrated as a Guix cli directly.

Currently, the implementation scans and indexes whatever package is
present in the local store of the machine's user. From nix/guix's
design, it makes sense to do it that way as it's likely that even though
you don't have all the tools locally, it may be already present as a
dependency of some high level tools you already use (it's just not
exposed because not declared in config.scm or home-configuration.scm).

You will find inline (at the bottom) some cli usage calls [1] and the
current implementation [2].

Thanks in advance for any feedback ;)

Cheers,
--
tony / Antoine R. Dumont (@ardumont)

-----------------------------------------------------------------
gpg fingerprint BF00 203D 741A C9D5 46A8 BE07 52E2 E984 0D10 C3B8

[1] Usage sample:

--8<---------------cut here---------------start------------->8---
$ env | grep GUIX_EXTENSION
GUIX_EXTENSIONS_PATH=$HOME/repo/public/guix/guix/guix/extensions

$ guix index
;;; (package 1 "acl")
;;; (package 595 "shepherd")
;;; (package 596 "guile2.2-shepherd")
;;; (package 2 "htop")
;;; (package 7 "shadow")
;;; (package 6 "shepherd")
;;; (package 5 "autojump")
...
^C

$ guix index search shepherd
guile2.2-shepherd@0.9.3 /gnu/store/cq8r2vzg56ax0iidgs4biz3sv0b9jxp3-guile2.2-shepherd-0.9.3/bin/shepherd
shepherd@0.9.3       /gnu/store/a9jdd8kgckwlq97yw3pjqs6sy4lqgrfq-shepherd-0.9.3/bin/shepherd
shepherd@0.8.1       /gnu/store/vza48khbaq0fdmcsrn27xj5y5yy76z6l-shepherd-0.8.1/bin/shepherd
shepherd@0.9.1       /gnu/store/gxz67p4gx9g6rpxxpsgmhsybczimdlx5-shepherd-0.9.1/bin/shepherd

$ guix help | grep -C3 extension
    repl       read-eval-print loop (REPL) for interactive programming

  extension commands
    index      Index packages to allow searching package for a given filename

Report bugs to: bug-guix@gnu.org.

$ guix help index # or: guix index [--help|-h]
Usage: guix index [OPTIONS...] [search FILE...]
Without FILE, index (package, file) relationships in the local store.
With 'search FILE', search for packages installing FILE.

Note: The internal cache is located at ~/.config/guix/locate-db.sqlite.
See --db-path for customization.

The valid values for OPTIONS are:

  -h, --help      Display this help and exit
  -V, --version   Display version information and exit
  --db-path=DIR   Change default location of the cache db

The valid values for ARGS are:

  search FILE     Search for packages installing the FILE (from cache db)

Report bugs to: bug-guix@gnu.org.
GNU Guix home page: <https://guix.gnu.org>
General help using Guix and GNU software: <https://guix.gnu.org/en/help/>

$ guix index --version # or guix index -V
guix locate (GNU Guix) 5ccb5837ccfb39af4e3e6399a0124997a187beb1
Copyright (C) 2022 the Guix authors
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
--8<---------------cut here---------------end--------------->8---

[2] The code:

--8<---------------cut here---------------start------------->8---
;;; GNU Guix --- Functional package management for GNU
;;; Copyright © 2022 Ludovic Courtès <ludo@gnu.org>
;;;
;;; This file is part of GNU Guix.
;;;
;;; GNU Guix is free software; you can redistribute it and/or modify it
;;; under the terms of the GNU General Public License as published by
;;; the Free Software Foundation; either version 3 of the License, or (at
;;; your option) any later version.
;;;
;;; GNU Guix is distributed in the hope that it will be useful, but
;;; WITHOUT ANY WARRANTY; without even the implied warranty of
;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;;; GNU General Public License for more details.
;;;
;;; You should have received a copy of the GNU General Public License
;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.

(define-module (guix extensions index)
  #:use-module (guix config) ;; %guix-package-name, ...
  #:use-module (guix ui) ;; display G_
  #:use-module (guix scripts)
  #:use-module (sqlite3)
  #:use-module (ice-9 match)
  #:use-module (guix describe)
  #:use-module (guix store)
  #:use-module (guix monads)
  #:autoload (guix grafts) (%graft?)
  #:use-module (guix derivations)
  #:use-module (guix packages)
  #:autoload (guix build utils) (find-files)
  #:autoload (gnu packages) (fold-packages)
  #:use-module (srfi srfi-1)
  #:use-module (srfi srfi-9)
  #:export (guix-index))

(define debug #f)

(define schema
  "
create table if not exists Packages (
  id        integer primary key autoincrement not null,
  name      text not null,
  version   text not null,
  unique    (name, version) -- add uniqueness constraint
);

create table if not exists Directories (
  id        integer primary key autoincrement not null,
  name      text not null,
  package   integer not null,
  foreign key (package) references Packages(id) on delete cascade,
  unique (name, package) -- add uniqueness constraint
);

create table if not exists Files (
  name      text not null,
  basename  text not null,
  directory integer not null,
  foreign key (directory) references Directories(id) on delete cascade,
  unique (name, basename, directory) -- add uniqueness constraint
);

create index if not exists IndexFiles on Files(basename);")

(define (call-with-database file proc)
  (let ((db (sqlite-open file)))
    (dynamic-wind
      (lambda () #t)
      (lambda ()
        (sqlite-exec db schema)
        (proc db))
      (lambda ()
        (sqlite-close db)))))

(define (insert-files db package version directories)
  "Insert files from DIRECTORIES as belonging to PACKAGE at VERSION."
  (define stmt-select-package
    (sqlite-prepare db "\
SELECT id FROM Packages WHERE name = :name AND version = :version;"
                    #:cache? #t))

  (define stmt-insert-package
    (sqlite-prepare db "\
INSERT OR IGNORE INTO Packages(name, version) -- to avoid spurious writes
VALUES (:name, :version);"
                    #:cache? #t))

  (define stmt-select-directory
    (sqlite-prepare db "\
SELECT id FROM Directories WHERE name = :name AND package = :package;"
                    #:cache? #t))

  (define stmt-insert-directory
    (sqlite-prepare db "\
INSERT OR IGNORE INTO Directories(name, package) -- to avoid spurious writes
VALUES (:name, :package);"
                    #:cache? #t))

  (define stmt-insert-file
    (sqlite-prepare db "\
INSERT OR IGNORE INTO Files(name, basename, directory)
VALUES (:name, :basename, :directory);"
                    #:cache? #t))

  (sqlite-exec db "begin immediate;")
  (sqlite-bind-arguments stmt-insert-package
                         #:name package
                         #:version version)
  (sqlite-fold (const #t) #t stmt-insert-package)

  (sqlite-bind-arguments stmt-select-package
                         #:name package
                         #:version version)

  (match (sqlite-fold cons '() stmt-select-package)
    ((#(package-id))
     (when debug
       (format #t "(pkg, version, pkg-id): (~a, ~a, ~a)"
               package version package-id))
     (pk 'package package-id package)
     (for-each (lambda (directory)
                 (define (strip file)
                   (string-drop file (+ (string-length directory) 1)))

                 (sqlite-reset stmt-insert-directory)
                 (sqlite-bind-arguments stmt-insert-directory
                                        #:name directory
                                        #:package package-id)
                 (sqlite-fold (const #t) #t stmt-insert-directory)

                 (sqlite-reset stmt-select-directory)
                 (sqlite-bind-arguments stmt-select-directory
                                        #:name directory
                                        #:package package-id)

                 (match (sqlite-fold cons '() stmt-select-directory)
                   ((#(directory-id))
                    (when debug
                      (format #t "(name, package, dir-id): (~a, ~a, ~a)\n"
                              directory package-id directory-id))
                    (for-each (lambda (file)
                                ;; If DIRECTORY is a symlink, (find-files
                                ;; DIRECTORY) returns the DIRECTORY singleton.
                                (unless (string=? file directory)
                                  (sqlite-reset stmt-insert-file)
                                  (sqlite-bind-arguments stmt-insert-file
                                                         #:name (strip file)
                                                         #:basename
                                                         (basename file)
                                                         #:directory
                                                         directory-id)
                                  (sqlite-fold (const #t) #t stmt-insert-file)))
                              (find-files directory)))))
               directories)))
  (sqlite-exec db "commit;"))

(define (insert-package db package)
  "Insert all the files of PACKAGE into DB."
  (mlet %store-monad ((drv (package->derivation package #:graft? #f)))
    (match (derivation->output-paths drv)
      (((labels . directories) ...)
       (when (every file-exists? directories)
         (insert-files db (package-name package) (package-version package)
                       directories))
       (return #t)))))

(define (filter-public-current-supported package)
  "Filter supported, not hidden (public) and not superseded (current) package."
  (and (not (hidden-package? package))
       (not (package-superseded package))
       (supported-package? package)))

(define (filter-supported-package package)
  "Filter supported package (package might be hidden or superseded)."
  (and (supported-package? package)))

(define (no-filter package)
  "No filtering on package"
  #t)

(define* (insert-packages db
                          #:optional
                          (filter-policy filter-public-current-supported))
  "Insert all current packages matching FILTER-POLICY into DB."
  (with-store store
    (parameterize ((%graft? #f))
      (fold-packages (lambda (package _)
                       (run-with-store store
                         (insert-package db package)))
                     #t
                     #:select? filter-policy))))

(define-record-type <package-match>
  (package-match name version file)
  package-match?
  (name    package-match-name)
  (version package-match-version)
  (file    package-match-file))

(define (matching-packages db file)
  "Return unique <package-match> corresponding to packages containing FILE."
  (define lookup-stmt
    (sqlite-prepare db "\
SELECT Packages.name, Packages.version, Directories.name, Files.name
FROM Packages
INNER JOIN Files, Directories
ON files.basename = :file
  AND directories.id = files.directory
  AND packages.id = directories.package;"))

  (sqlite-bind-arguments lookup-stmt #:file file)
  (sqlite-fold (lambda (result lst)
                 (match result
                   (#(package version directory file)
                    (cons (package-match package version
                                         (string-append directory "/" file))
                          lst))))
               '() lookup-stmt))


(define (index-packages-with-db db-pathname)
  "Index packages using db at location DB-PATHNAME."
  (call-with-database db-pathname
    (lambda (db)
      (insert-packages db no-filter))))

(define (matching-packages-with-db db-pathname file)
  "Compute list of packages referencing FILE using db at DB-PATHNAME."
  (call-with-database db-pathname
    (lambda (db)
      (matching-packages db file))))

(define (print-matching-results matches)
  "Print the MATCHES matching results."
  (for-each (lambda (result)
              (format #t "~20a ~a~%"
                      (string-append (package-match-name result)
                                     "@" (package-match-version result))
                      (package-match-file result)))
            matches))

(define default-db-path
  (let ((local-config-path
         (and=> (getenv "HOME")
                (lambda (home)
                  (string-append home "/.config/guix/")))))
    (string-append local-config-path "locate-db.sqlite")))

(define (show-bug-report-information)
  ;; TRANSLATORS: The placeholder indicates the bug-reporting address for this
  ;; package.  Please add another line saying "Report translation bugs to
  ;; ...\n" with the address for translation bugs (typically your translation
  ;; team's web or email address).
  (format #t (G_ "
Report bugs to: ~a.") %guix-bug-report-address)
  (format #t (G_ "
~a home page: <~a>") %guix-package-name %guix-home-page-url)
  (format #t (G_ "
General help using Guix and GNU software: <~a>")
          ;; TRANSLATORS: Change the "/en" bit of this URL appropriately if
          ;; the web site is translated in your language.
          (G_ "https://guix.gnu.org/en/help/"))
  (newline))

(define (show-help)
  (display (G_ "Usage: guix index [OPTIONS...] [search FILE...]
Without FILE, index (package, file) relationships in the local store.
With 'search FILE', search for packages installing FILE.\n
Note: The internal cache is located at ~/.config/guix/locate-db.sqlite.
See --db-path for customization.\n"))
  (newline)
  (display (G_ "The valid values for OPTIONS are:"))
  (newline)
  (display (G_ "
  -h, --help      Display this help and exit"))
  (display (G_ "
  -V, --version   Display version information and exit"))
  (display (G_ "
  --db-path=DIR   Change default location of the cache db"))
  (newline)
  (newline)
  (display (G_ "The valid values for ARGS are:"))
  (newline)
  (display (G_ "
  search FILE     Search for packages installing the FILE (from cache db)"))
  (newline)
  (show-bug-report-information))

(define-command (guix-index . args)
  (category extension)
  (synopsis "Index packages to allow searching package for a given filename")

  (define (parse-db-args args)
    "Parsing of string key=value where we are only interested in 'value'"
    (match (string-split args #\=)
      ((unused db-path)
       db-path)
      (_ #f)))

  (define (display-help-and-exit)
    (show-help)
    (exit 0))

  (match args
    ((or ("-h") ("--help"))
     (display-help-and-exit))
    ((or ("-V") ("--version"))
     (show-version-and-exit "guix locate"))
    ((db-path-args)
     (let ((db-path (parse-db-args db-path-args)))
       (if db-path
           (index-packages-with-db db-path)
           (display-help-and-exit))))
    (("search" file)
     (let ((matches (matching-packages-with-db default-db-path file)))
       (print-matching-results matches)
       (exit (pair? matches))))
    ((db-path-args "search" file)
     (let ((db-path (parse-db-args db-path-args)))
       (if db-path
           (let ((matches (matching-packages-with-db db-path file)))
             (print-matching-results matches)
             (exit (pair? matches)))
           (display-help-and-exit))))
    (_ ;; index by default
     (index-packages-with-db default-db-path))))
--8<---------------cut here---------------end--------------->8---

antoine.romain.dumont@gmail.com writes: > Hello Guix! > > Guix is top so thanks for the awesome work! > > Just to give some feedback on this thread. That's a good news that the > file search functionality in the radar.
> >> Lately I found myself going several times to >> <https://packages.debian.org> to look for packages providing a given >> file and I thought it’s time to do something about it. > > I've finally started to set up my machine with Guix system (and > Guix Home). Finding out where such program or cli is packaged is > definitely something that I need to port my existing use (from mainly > nixified debian or nixos machines) to Guix. > > And to answer such question, I used existing "offline" programs in my > machines. I've bounced back and forth between `nix-locate` and `apt-file > search` to determine approximately the packages in Guix (names aren't > usually that different). > > Hence, as a user, it's one of my expectation that the Guix cli provides > some equivalent program to lookup from file to package ;). > >> The script below creates an SQLite database for the current set of >> packages, but only for those already in the store: >> >> Guix repl file-database.scm populate >> >> That creates /tmp/db; it took about 25mn on berlin, for 18K packages. >> Then you can run, say: >> >> Guix repl file-database.scm search boot-9.scm >> >> to find which packages provide a file named ‘boot-9.scm’. That part is >> instantaneous. >> >> The database for 18K packages is quite big: >> >> --8<---------------cut here---------------start------------->8--- >> $ du -h /tmp/db* >> 389M /tmp/db >> 82M /tmp/db.gz >> 61M /tmp/db.zst >> --8<---------------cut here---------------end--------------->8--- > > For information, in a most recent implementation (@civodul provided me > in #guix-devel), I noticed multiple calls to the indexation step would > duplicate information (at all levels packages, files, directories). So > that might have had an impact in the extracted values above (if ludo had > triggered multiple times the script at the time). > > Jsyk, I have started iterating a bit over that provided implementation > (and fixed the current caveat mentioned), added some help message... 
> I'll follow up with it in a bit (same thread) to have some more feedback > on it. > >> How do we expose that information? There are several criteria I can >> think of: accuracy, freshness, privacy, responsiveness, off-line >> operation. >> >> I think accuracy (making sure you get results that correspond precisely >> to, say, your current channel revisions and your current system) is not >> a high priority: some result is better than no result. > > I definitely agree with this. At least from the offline use perspective. > I did not focus at all on the second part of the problematic ("online" > and distribution use). > >> Likewise for freshness: results for an older version of a given >> package may still be valid now. > > Indeed. > > Cheers, > -- > tony / Antoine R. Dumont (@ardumont) > > ----------------------------------------------------------------- > gpg fingerprint BF00 203D 741A C9D5 46A8 BE07 52E2 E984 0D10 C3B8 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 877 bytes --] ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: File search
  2022-12-02 18:22 ` Antoine R. Dumont (@ardumont)
@ 2022-12-03 18:19 ` Ludovic Courtès
  2022-12-04 16:35 ` Antoine R. Dumont (@ardumont)
  0 siblings, 1 reply; 33+ messages in thread
From: Ludovic Courtès @ 2022-12-03 18:19 UTC (permalink / raw)
  To: Antoine R. Dumont (@ardumont); +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 1541 bytes --]

Hi Antoine,

"Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com>
skribis:

> After toying a bit with the initial code, I took the liberty to make it
> a guix extension (we discussed it a bit with @zimoun). It was mostly to
> get started with Guile (I know some lisp implems but not this one so i
> had to familiarize myself with tools and whatnot ;). Anyway, that can be
> reverted if you feel like it can be integrated as a Guix cli directly.
>
> Currently, the implementation scans and indexes whatever package is
> present in the local store of the machine's user. From nix/guix's
> design, it makes sense to do it that way as it's likely that even though
> you don't have all the tools locally, it may be already present as a
> dependency of some high level tools you already use (it's just not
> exposed because not declared in config.scm or home-configuration.scm).
>
> You will find inlines (at the bottom) some cli usage calls [1] and the
> current implementation [2].

Yay, nice work!

I toyed a bit with your code and that gave me an idea: instead of the
costly ‘fold-packages’ + ‘package-derivation’, we can iterate over all
the manifests on the system and index packages they refer to. That way,
no need to talk to the daemon, compute derivations, etc. Should be
faster, though of course it still needs to traverse those directories.

Please find attached a modified version that illustrates that.

(We’ll need version control at some point. :-))

Thanks,
Ludo’.
[-- Attachment #2: the code --]
[-- Type: text/plain, Size: 14030 bytes --]

;;; GNU Guix --- Functional package management for GNU
;;; Copyright © 2022 Ludovic Courtès <ludo@gnu.org>
;;;
;;; This file is part of GNU Guix.
;;;
;;; GNU Guix is free software; you can redistribute it and/or modify it
;;; under the terms of the GNU General Public License as published by
;;; the Free Software Foundation; either version 3 of the License, or (at
;;; your option) any later version.
;;;
;;; GNU Guix is distributed in the hope that it will be useful, but
;;; WITHOUT ANY WARRANTY; without even the implied warranty of
;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;;; GNU General Public License for more details.
;;;
;;; You should have received a copy of the GNU General Public License
;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.

(define-module (guix extensions index)
  #:use-module (guix config) ;; %guix-package-name, ...
  #:use-module (guix ui) ;; display G_
  #:use-module (guix scripts)
  #:use-module (sqlite3)
  #:use-module (ice-9 match)
  #:use-module (guix describe)
  #:use-module (guix store)
  #:use-module (guix monads)
  #:autoload (guix combinators) (fold2)
  #:autoload (guix grafts) (%graft?)
  #:autoload (guix store roots) (gc-roots)
  #:use-module (guix derivations)
  #:use-module (guix packages)
  #:use-module (guix profiles)
  #:use-module (guix sets)
  #:use-module ((guix utils) #:select (cache-directory))
  #:autoload (guix build utils) (find-files)
  #:autoload (gnu packages) (fold-packages)
  #:use-module (srfi srfi-1)
  #:use-module (srfi srfi-9)
  #:use-module (srfi srfi-71)
  #:export (guix-index))

(define debug #f)

(define schema
  "
create table if not exists Packages (
  id        integer primary key autoincrement not null,
  name      text not null,
  version   text not null,
  unique    (name, version) -- add uniqueness constraint
);

create table if not exists Directories (
  id        integer primary key autoincrement not null,
  name      text not null,
  package   integer not null,
  foreign key (package) references Packages(id) on delete cascade,
  unique (name, package) -- add uniqueness constraint
);

create table if not exists Files (
  name      text not null,
  basename  text not null,
  directory integer not null,
  foreign key (directory) references Directories(id) on delete cascade,
  unique (name, basename, directory) -- add uniqueness constraint
);

create index if not exists IndexFiles on Files(basename);")

(define (call-with-database file proc)
  (let ((db (sqlite-open file)))
    (dynamic-wind
      (lambda () #t)
      (lambda ()
        (sqlite-exec db schema)
        (proc db))
      (lambda ()
        (sqlite-close db)))))

(define (insert-files db package version directories)
  "Insert files from DIRECTORIES as belonging to PACKAGE at VERSION."
  (define stmt-select-package
    (sqlite-prepare db "\
SELECT id FROM Packages WHERE name = :name AND version = :version;"
                    #:cache? #t))

  (define stmt-insert-package
    (sqlite-prepare db "\
INSERT OR IGNORE INTO Packages(name, version) -- to avoid spurious writes
VALUES (:name, :version);"
                    #:cache? #t))

  (define stmt-select-directory
    (sqlite-prepare db "\
SELECT id FROM Directories WHERE name = :name AND package = :package;"
                    #:cache? #t))

  (define stmt-insert-directory
    (sqlite-prepare db "\
INSERT OR IGNORE INTO Directories(name, package) -- to avoid spurious writes
VALUES (:name, :package);"
                    #:cache? #t))

  (define stmt-insert-file
    (sqlite-prepare db "\
INSERT OR IGNORE INTO Files(name, basename, directory)
VALUES (:name, :basename, :directory);"
                    #:cache? #t))

  (sqlite-exec db "begin immediate;")
  (sqlite-bind-arguments stmt-insert-package
                         #:name package
                         #:version version)
  (sqlite-fold (const #t) #t stmt-insert-package)

  (sqlite-bind-arguments stmt-select-package
                         #:name package
                         #:version version)

  (match (sqlite-fold cons '() stmt-select-package)
    ((#(package-id))
     (when debug
       (format #t "(pkg, version, pkg-id): (~a, ~a, ~a)"
               package version package-id))
     (pk 'package package-id package)
     (for-each (lambda (directory)
                 (define (strip file)
                   (string-drop file (+ (string-length directory) 1)))

                 (sqlite-reset stmt-insert-directory)
                 (sqlite-bind-arguments stmt-insert-directory
                                        #:name directory
                                        #:package package-id)
                 (sqlite-fold (const #t) #t stmt-insert-directory)

                 (sqlite-reset stmt-select-directory)
                 (sqlite-bind-arguments stmt-select-directory
                                        #:name directory
                                        #:package package-id)

                 (match (sqlite-fold cons '() stmt-select-directory)
                   ((#(directory-id))
                    (when debug
                      (format #t "(name, package, dir-id): (~a, ~a, ~a)\n"
                              directory package-id directory-id))
                    (for-each (lambda (file)
                                ;; If DIRECTORY is a symlink, (find-files
                                ;; DIRECTORY) returns the DIRECTORY singleton.
                                (unless (string=? file directory)
                                  (sqlite-reset stmt-insert-file)
                                  (sqlite-bind-arguments stmt-insert-file
                                                         #:name (strip file)
                                                         #:basename
                                                         (basename file)
                                                         #:directory
                                                         directory-id)
                                  (sqlite-fold (const #t) #t stmt-insert-file)))
                              (find-files directory)))))
               directories)))
  (sqlite-exec db "commit;"))

(define (insert-package db package)
  "Insert all the files of PACKAGE into DB."
  (mlet %store-monad ((drv (package->derivation package #:graft? #f)))
    (match (derivation->output-paths drv)
      (((labels . directories) ...)
       (when (every file-exists? directories)
         (insert-files db (package-name package) (package-version package)
                       directories))
       (return #t)))))

(define (filter-public-current-supported package)
  "Filter supported, not hidden (public) and not superseded (current) package."
  (and (not (hidden-package? package))
       (not (package-superseded package))
       (supported-package? package)))

(define (filter-supported-package package)
  "Filter supported package (package might be hidden or superseded)."
  (and (supported-package? package)))

(define (no-filter package)
  "No filtering on package"
  #t)

(define* (insert-packages db
                          #:optional
                          (filter-policy filter-public-current-supported))
  "Insert all current packages matching FILTER-POLICY into DB."
  (with-store store
    (parameterize ((%graft? #f))
      (fold-packages (lambda (package _)
                       (run-with-store store
                         (insert-package db package)))
                     #t
                     #:select? filter-policy))))


;;;
;;; Indexing from local profiles.
;;;

(define (all-profiles)
  "Return the list of profiles on the system."
  (delete-duplicates
   (filter-map (lambda (root)
                 (if (file-exists? (string-append root "/manifest"))
                     root
                     (let ((root (string-append root "/profile")))
                       (and (file-exists? (string-append root "/manifest"))
                            root))))
               (gc-roots))))

(define (profiles->manifest-entries profiles)
  "Return manifest entries for all of PROFILES, without duplicates."
  (let loop ((visited (set))
             (profiles profiles)
             (entries '()))
    (match profiles
      (()
       entries)
      ((profile . rest)
       (let* ((manifest (profile-manifest profile))
              (entries visited
                       (fold2 (lambda (entry lst visited)
                                (let ((item (manifest-entry-item entry)))
                                  (if (set-contains? visited item)
                                      (values lst visited)
                                      (values (cons entry lst)
                                              (set-insert item
                                                          visited)))))
                              entries
                              visited
                              (manifest-transitive-entries manifest))))
         (loop visited rest entries))))))

(define (insert-manifest-entry db entry)
  "Insert ENTRY, a manifest entry, into DB."
  (insert-files db (manifest-entry-name entry)
                (manifest-entry-version entry)
                (list (manifest-entry-item entry)))) ;FIXME: outputs?
(define (index-manifests db-file) "Insert into DB-FILE entries for packages that appear in manifests available on the system." (call-with-database db-file (lambda (db) (for-each (lambda (entry) (insert-manifest-entry db entry)) (let ((lst (profiles->manifest-entries (all-profiles)))) (pk 'entries (length lst)) lst))))) \f ;;; ;;; Search. ;;; (define-record-type <package-match> (package-match name version file) package-match? (name package-match-name) (version package-match-version) (file package-match-file)) (define (matching-packages db file) "Return unique <package-match> corresponding to packages containing FILE." (define lookup-stmt (sqlite-prepare db "\ SELECT Packages.name, Packages.version, Directories.name, Files.name FROM Packages INNER JOIN Files, Directories ON files.basename = :file AND directories.id = files.directory AND packages.id = directories.package;")) (sqlite-bind-arguments lookup-stmt #:file file) (sqlite-fold (lambda (result lst) (match result (#(package version directory file) (cons (package-match package version (string-append directory "/" file)) lst)))) '() lookup-stmt)) \f (define (index-packages-with-db db-pathname) "Index packages using db at location DB-PATHNAME." (call-with-database db-pathname (lambda (db) (insert-packages db no-filter)))) (define (matching-packages-with-db db-pathname file) "Compute list of packages referencing FILE using db at DB-PATHNAME." (call-with-database db-pathname (lambda (db) (matching-packages db file)))) (define (print-matching-results matches) "Print the MATCHES matching results." (for-each (lambda (result) (format #t "~20a ~a~%" (string-append (package-match-name result) "@" (package-match-version result)) (package-match-file result))) matches)) (define default-db-path (string-append (cache-directory #:ensure? #f) "/index/db.sqlite")) (define (show-help) (display (G_ "Usage: guix index [OPTIONS...] [search FILE...] Without FILE, index (package, file) relationships in the local store. 
With 'search FILE', search for packages installing FILE.\n Note: The internal cache is located at ~/.cache/guix/index/db.sqlite. See --db-path for customization.\n")) (newline) (display (G_ "The valid values for OPTIONS are:")) (newline) (display (G_ " -h, --help Display this help and exit")) (display (G_ " -V, --version Display version information and exit")) (display (G_ " --db-path=FILE Change default location of the cache db")) (newline) (newline) (display (G_ "The valid values for ARGS are:")) (newline) (display (G_ " search FILE Search for packages installing the FILE (from cache db)")) (newline) (show-bug-report-information)) (define-command (guix-index . args) (category extension) (synopsis "Index packages to allow searching for the packages that provide a given file") (define (parse-db-args args) "Parse ARGS, a string of the form key=value, and return 'value'; return #f if ARGS does not have that form." (match (string-split args #\=) ((unused db-path) db-path) (_ #f))) (define (display-help-and-exit) (show-help) (exit 0)) (match args ((or ("-h") ("--help")) (display-help-and-exit)) ((or ("-V") ("--version")) (show-version-and-exit "guix index")) ((db-path-args) (let ((db-path (parse-db-args db-path-args))) (if db-path (index-packages-with-db db-path) (display-help-and-exit)))) (("search" file) (let ((matches (matching-packages-with-db default-db-path file))) (print-matching-results matches) (exit (pair? matches)))) ((db-path-args "search" file) (let ((db-path (parse-db-args db-path-args))) (if db-path (let ((matches (matching-packages-with-db db-path file))) (print-matching-results matches) (exit (pair? matches))) (display-help-and-exit)))) (_ ;; index by default ;; (index-packages-with-db default-db-path) (index-manifests default-db-path) )))
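A side note on the 'parse-db-args' helper in the attachment above: it splits its argument on every '=', so a database path that itself contains '=' is rejected, and the option name before the '=' is never checked. Below is a minimal sketch in plain Guile of what a stricter variant could look like. 'parse-db-path' is a hypothetical name used only for illustration and is not part of the attachment or the patch series; 'parse-db-args' is copied here only so that the comparison is self-contained.

```scheme
(use-modules (ice-9 match))

;; Same logic as 'parse-db-args' in the attachment: split on every "=" and
;; accept only the case where exactly two fields come out.
(define (parse-db-args arg)
  (match (string-split arg #\=)
    ((unused db-path) db-path)
    (_ #f)))

;; Hypothetical stricter variant (an assumption, not code from the thread):
;; require the "--db-path=" prefix and keep everything after it, so that
;; paths containing "=" survive and unrelated "key=value" strings are
;; rejected.
(define (parse-db-path arg)
  (let ((prefix "--db-path="))
    (and (string-prefix? prefix arg)
         (string-drop arg (string-length prefix)))))

;; Print what each procedure returns for a few sample arguments.
(for-each (lambda (arg)
            (format #t "~s: parse-db-args => ~s, parse-db-path => ~s~%"
                    arg (parse-db-args arg) (parse-db-path arg)))
          '("--db-path=/tmp/db"
            "--db-path=/tmp/a=b"
            "bogus=/tmp/db"
            "search"))
```

With these inputs, the two procedures agree on "--db-path=/tmp/db" and on "search" (both return #f for the latter), but differ on the other two: 'parse-db-args' returns #f for "--db-path=/tmp/a=b" and accepts "bogus=/tmp/db", whereas the sketch does the opposite. Whether that stricter behaviour is wanted is a design choice; proper option parsing would probably be the longer-term answer.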
* Re: File search
From: Antoine R. Dumont (@ardumont) @ 2022-12-04 16:35 UTC
To: Ludovic Courtès; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 76105 bytes --]

Hello Guix, Ludo,

\o/, thanks for the iteration ;)

Not that I understood everything yet, but indeed, it's faster.

I've iterated over your work to:
- align calls to that new function
- improve some docstrings, the imports, and the help message
- drop dead (or redundant) code
- make sure the (xdg) folder holding the db is created if needed

Please find enclosed the latest implementation as a patch (somewhat vcs code ;). I've edited the commits to mark Ludo as author, with his initial/amended implementations first [0] (that should be in the patch).

For information, I extracted some numbers from runs to compare our iterations (see the org-file attachment). The first iteration is "extracts packages from the store" and the second one is "extracts packages from the system manifest". Those runs happened both on a guixified Debian host and on a raw Guix host (more packages).

It seems that with the new implementation we find slightly fewer packages, but it's faster, so I guess it's a tradeoff. It'd be nice to know how it runs on your build farm machine (if you get the time at some point [1]).

[0] fwiw, yeah git and magit! :D
[1] I noticed (through ml discussions) you all are quite busy at the moment ;)

Cheers,
--
tony / Antoine R. Dumont (@ardumont)
-----------------------------------------------------------------
gpg fingerprint BF00 203D 741A C9D5 46A8 BE07 52E2 E984 0D10 C3B8

Ludovic Courtès <ludo@gnu.org> writes:

> Yay, nice work!
> > I toyed a bit with your code and that gave me an idea: instead of the > costly ‘fold-packages’ + ‘package-derivation’, we can iterate over all > the manifests on the system and index packages they refer to. That way, > no need to talk to the daemon, computer derivations, etc. Should be > faster, though of course it still needs to traverse those directories. > > Please find attached a modified version that illustrates that. (We’ll > need version control at some point. :-)) > > Thanks, > Ludo’. > > ;;; GNU Guix --- Functional package management for GNU > ;;; Copyright © 2022 Ludovic Courtès <ludo@gnu.org> > ;;; > ;;; This file is part of GNU Guix. > ;;; > ;;; GNU Guix is free software; you can redistribute it and/or modify it > ;;; under the terms of the GNU General Public License as published by > ;;; the Free Software Foundation; either version 3 of the License, or (at > ;;; your option) any later version. > ;;; > ;;; GNU Guix is distributed in the hope that it will be useful, but > ;;; WITHOUT ANY WARRANTY; without even the implied warranty of > ;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > ;;; GNU General Public License for more details. > ;;; > ;;; You should have received a copy of the GNU General Public License > ;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. > > (define-module (guix extensions index) > #:use-module (guix config) ;; %guix-package-name, ... > #:use-module (guix ui) ;; display G_ > #:use-module (guix scripts) > #:use-module (sqlite3) > #:use-module (ice-9 match) > #:use-module (guix describe) > #:use-module (guix store) > #:use-module (guix monads) > #:autoload (guix combinators) (fold2) > #:autoload (guix grafts) (%graft?) 
> #:autoload (guix store roots) (gc-roots) > #:use-module (guix derivations) > #:use-module (guix packages) > #:use-module (guix profiles) > #:use-module (guix sets) > #:use-module ((guix utils) #:select (cache-directory)) > #:autoload (guix build utils) (find-files) > #:autoload (gnu packages) (fold-packages) > #:use-module (srfi srfi-1) > #:use-module (srfi srfi-9) > #:use-module (srfi srfi-71) > #:export (guix-index)) > > (define debug #f) > > (define schema > " > create table if not exists Packages ( > id integer primary key autoincrement not null, > name text not null, > version text not null, > unique (name, version) -- add uniqueness constraint > ); > > create table if not exists Directories ( > id integer primary key autoincrement not null, > name text not null, > package integer not null, > foreign key (package) references Packages(id) on delete cascade, > unique (name, package) -- add uniqueness constraint > ); > > create table if not exists Files ( > name text not null, > basename text not null, > directory integer not null, > foreign key (directory) references Directories(id) on delete cascade > unique (name, basename, directory) -- add uniqueness constraint > ); > > create index if not exists IndexFiles on Files(basename);") > > (define (call-with-database file proc) > (let ((db (sqlite-open file))) > (dynamic-wind > (lambda () #t) > (lambda () > (sqlite-exec db schema) > (proc db)) > (lambda () > (sqlite-close db))))) > > (define (insert-files db package version directories) > "Insert files from DIRECTORIES as belonging to PACKAGE at VERSION." > (define stmt-select-package > (sqlite-prepare db "\ > SELECT id FROM Packages WHERE name = :name AND version = :version;" > #:cache? #t)) > > (define stmt-insert-package > (sqlite-prepare db "\ > INSERT OR IGNORE INTO Packages(name, version) -- to avoid spurious writes > VALUES (:name, :version);" > #:cache? 
#t)) > > (define stmt-select-directory > (sqlite-prepare db "\ > SELECT id FROM Directories WHERE name = :name AND package = :package;" > #:cache? #t)) > > (define stmt-insert-directory > (sqlite-prepare db "\ > INSERT OR IGNORE INTO Directories(name, package) -- to avoid spurious writes > VALUES (:name, :package);" > #:cache? #t)) > > (define stmt-insert-file > (sqlite-prepare db "\ > INSERT OR IGNORE INTO Files(name, basename, directory) > VALUES (:name, :basename, :directory);" > #:cache? #t)) > > (sqlite-exec db "begin immediate;") > (sqlite-bind-arguments stmt-insert-package > #:name package > #:version version) > (sqlite-fold (const #t) #t stmt-insert-package) > > (sqlite-bind-arguments stmt-select-package > #:name package > #:version version) > (match (sqlite-fold cons '() stmt-select-package) > ((#(package-id)) > (when debug > (format #t "(pkg, version, pkg-id): (~a, ~a, ~a)" > package version package-id)) > (pk 'package package-id package) > (for-each (lambda (directory) > (define (strip file) > (string-drop file (+ (string-length directory) 1))) > > (sqlite-reset stmt-insert-directory) > (sqlite-bind-arguments stmt-insert-directory > #:name directory > #:package package-id) > (sqlite-fold (const #t) #t stmt-insert-directory) > > (sqlite-reset stmt-select-directory) > (sqlite-bind-arguments stmt-select-directory > #:name directory > #:package package-id) > (match (sqlite-fold cons '() stmt-select-directory) > ((#(directory-id)) > (when debug > (format #t "(name, package, dir-id): (~a, ~a, ~a)\n" > directory package-id directory-id)) > (for-each (lambda (file) > ;; If DIRECTORY is a symlink, (find-files > ;; DIRECTORY) returns the DIRECTORY singleton. > (unless (string=? 
file directory) > (sqlite-reset stmt-insert-file) > (sqlite-bind-arguments stmt-insert-file > #:name (strip file) > #:basename > (basename file) > #:directory > directory-id) > (sqlite-fold (const #t) #t stmt-insert-file))) > (find-files directory))))) > directories))) > (sqlite-exec db "commit;")) > > (define (insert-package db package) > "Insert all the files of PACKAGE into DB." > (mlet %store-monad ((drv (package->derivation package #:graft? #f))) > (match (derivation->output-paths drv) > (((labels . directories) ...) > (when (every file-exists? directories) > (insert-files db (package-name package) (package-version package) > directories)) > (return #t))))) > > (define (filter-public-current-supported package) > "Filter supported, not hidden (public) and not superseded (current) package." > (and (not (hidden-package? package)) > (not (package-superseded package)) > (supported-package? package))) > > (define (filter-supported-package package) > "Filter supported package (package might be hidden or superseded)." > (and (supported-package? package))) > > (define (no-filter package) "No filtering on package" #t) > > (define* (insert-packages db #:optional (filter-policy filter-public-current-supported)) > "Insert all current packages matching `filter-package-policy` into DB." > (with-store store > (parameterize ((%graft? #f)) > (fold-packages (lambda (package _) > (run-with-store store > (insert-package db package))) > #t > #:select? filter-policy)))) > > \f > ;;; > ;;; Indexing from local profiles. > ;;; > > (define (all-profiles) > "Return the list of profiles on the system." > (delete-duplicates > (filter-map (lambda (root) > (if (file-exists? (string-append root "/manifest")) > root > (let ((root (string-append root "/profile"))) > (and (file-exists? (string-append root "/manifest")) > root)))) > (gc-roots)))) > > (define (profiles->manifest-entries profiles) > "Return manifest entries for all of PROFILES, without duplicates." 
> (let loop ((visited (set)) > (profiles profiles) > (entries '())) > (match profiles > (() > entries) > ((profile . rest) > (let* ((manifest (profile-manifest profile)) > (entries visited > (fold2 (lambda (entry lst visited) > (let ((item (manifest-entry-item entry))) > (if (set-contains? visited item) > (values lst visited) > (values (cons entry lst) > (set-insert item > visited))))) > entries > visited > (manifest-transitive-entries manifest)))) > (loop visited rest entries)))))) > > (define (insert-manifest-entry db entry) > "Insert ENTRY, a manifest entry, into DB." > (insert-files db (manifest-entry-name entry) > (manifest-entry-version entry) > (list (manifest-entry-item entry)))) ;FIXME: outputs? > > (define (index-manifests db-file) > "Insert into DB-FILE entries for packages that appear in manifests > available on the system." > (call-with-database db-file > (lambda (db) > (for-each (lambda (entry) > (insert-manifest-entry db entry)) > (let ((lst (profiles->manifest-entries (all-profiles)))) > (pk 'entries (length lst)) > lst))))) > > \f > ;;; > ;;; Search. > ;;; > > (define-record-type <package-match> > (package-match name version file) > package-match? > (name package-match-name) > (version package-match-version) > (file package-match-file)) > > (define (matching-packages db file) > "Return unique <package-match> corresponding to packages containing FILE." 
> (define lookup-stmt > (sqlite-prepare db "\ > SELECT Packages.name, Packages.version, Directories.name, Files.name > FROM Packages > INNER JOIN Files, Directories > ON files.basename = :file > AND directories.id = files.directory > AND packages.id = directories.package;")) > > (sqlite-bind-arguments lookup-stmt #:file file) > (sqlite-fold (lambda (result lst) > (match result > (#(package version directory file) > (cons (package-match package version > (string-append directory "/" file)) > lst)))) > '() lookup-stmt)) > > \f > > > (define (index-packages-with-db db-pathname) > "Index packages using db at location DB-PATHNAME." > (call-with-database db-pathname > (lambda (db) > (insert-packages db no-filter)))) > > (define (matching-packages-with-db db-pathname file) > "Compute list of packages referencing FILE using db at DB-PATHNAME." > (call-with-database db-pathname > (lambda (db) > (matching-packages db file)))) > > (define (print-matching-results matches) > "Print the MATCHES matching results." > (for-each (lambda (result) > (format #t "~20a ~a~%" > (string-append (package-match-name result) > "@" (package-match-version result)) > (package-match-file result))) > matches)) > > (define default-db-path > (string-append (cache-directory #:ensure? #f) > "/index/db.sqlite")) > > (define (show-help) > (display (G_ "Usage: guix index [OPTIONS...] [search FILE...] > Without FILE, index (package, file) relationships in the local store. > With 'search FILE', search for packages installing FILEx;x.\n > Note: The internal cache is located at ~/.config/guix/locate-db.sqlite. 
> See --db-path for customization.\n")) > (newline) > (display (G_ "The valid values for OPTIONS are:")) > (newline) > (display (G_ " > -h, --help Display this help and exit")) > (display (G_ " > -V, --version Display version information and exit")) > (display (G_ " > --db-path=DIR Change default location of the cache db")) > (newline) > (newline) > (display (G_ "The valid values for ARGS are:")) > (newline) > (display (G_ " > search FILE Search for packages installing the FILE (from cache db)")) > (newline) > (show-bug-report-information)) > > (define-command (guix-index . args) > (category extension) > (synopsis "Index packages to allow searching package for a given filename") > > (define (parse-db-args args) > "Parsing of string key=value where we are only interested in 'value'" > (match (string-split args #\=) > ((unused db-path) > db-path) > (_ #f))) > > (define (display-help-and-exit) > (show-help) > (exit 0)) > > (match args > ((or ("-h") ("--help")) > (display-help-and-exit)) > ((or ("-V") ("--version")) > (show-version-and-exit "guix locate")) > ((db-path-args) > (let ((db-path (parse-db-args db-path-args))) > (if db-path > (index-packages-with-db db-path) > (display-help-and-exit)))) > (("search" file) > (let ((matches (matching-packages-with-db default-db-path file))) > (print-matching-results matches) > (exit (pair? matches)))) > ((db-path-args "search" file) > (let ((db-path (parse-db-args db-path-args))) > (if db-path > (let ((matches (matching-packages-with-db db-path file))) > (print-matching-results matches) > (exit (pair? 
matches))) > (display-help-and-exit)))) > (_ ;; index by default > ;; (index-packages-with-db default-db-path) > (index-manifests default-db-path) > ))) ===File ~/repo/public/guix/guix/add-extension-guix-index.patch=== From d3e658ca1e3ce2715e25450b794d139d3417c74c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo@gnu.org> Date: Wed, 30 Nov 2022 15:25:21 +0100 Subject: [PATCH 01/18] extensions-index: Add initial implementation from civodul Related to https://lists.gnu.org/archive/html/guix-devel/2022-01/msg00354.html --- guix/extensions/file-database.scm | 199 ++++++++++++++++++++++++++++++ 1 file changed, 199 insertions(+) create mode 100644 guix/extensions/file-database.scm diff --git a/guix/extensions/file-database.scm b/guix/extensions/file-database.scm new file mode 100644 index 0000000000..83aafbc554 --- /dev/null +++ b/guix/extensions/file-database.scm @@ -0,0 +1,199 @@ +;;; GNU Guix --- Functional package management for GNU +;;; Copyright © 2022 Ludovic Courtès <ludo@gnu.org> +;;; +;;; This file is part of GNU Guix. +;;; +;;; GNU Guix is free software; you can redistribute it and/or modify it +;;; under the terms of the GNU General Public License as published by +;;; the Free Software Foundation; either version 3 of the License, or (at +;;; your option) any later version. +;;; +;;; GNU Guix is distributed in the hope that it will be useful, but +;;; WITHOUT ANY WARRANTY; without even the implied warranty of +;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;;; GNU General Public License for more details. +;;; +;;; You should have received a copy of the GNU General Public License +;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. + +(define-module (file-database) + #:use-module (sqlite3) + #:use-module (ice-9 match) + #:use-module (guix store) + #:use-module (guix monads) + #:autoload (guix grafts) (%graft?) 
+ #:use-module (guix derivations) + #:use-module (guix packages) + #:autoload (guix build utils) (find-files) + #:autoload (gnu packages) (fold-packages) + #:use-module (srfi srfi-1) + #:use-module (srfi srfi-9) + #:export (file-database)) + +(define schema + " +create table if not exists Packages ( + id integer primary key autoincrement not null, + name text not null, + version text not null +); + +create table if not exists Directories ( + id integer primary key autoincrement not null, + name text not null, + package integer not null, + foreign key (package) references Packages(id) on delete cascade +); + +create table if not exists Files ( + name text not null, + basename text not null, + directory integer not null, + foreign key (directory) references Directories(id) on delete cascade +); + +create index if not exists IndexFiles on Files(basename);") + +(define (call-with-database file proc) + (let ((db (sqlite-open file))) + (dynamic-wind + (lambda () #t) + (lambda () + (sqlite-exec db schema) + (proc db)) + (lambda () + (sqlite-close db))))) + +(define (insert-files db package version directories) + "Insert the files contained in DIRECTORIES as belonging to PACKAGE at +VERSION." + (define last-row-id-stmt + (sqlite-prepare db "SELECT last_insert_rowid();" + #:cache? #t)) + + (define package-stmt + (sqlite-prepare db "\ +INSERT OR REPLACE INTO Packages(name, version) +VALUES (:name, :version);" + #:cache? #t)) + + (define directory-stmt + (sqlite-prepare db "\ +INSERT INTO Directories(name, package) VALUES (:name, :package);" + #:cache? #t)) + + (define file-stmt + (sqlite-prepare db "\ +INSERT INTO Files(name, basename, directory) +VALUES (:name, :basename, :directory);" + #:cache? 
#t)) + + (sqlite-exec db "begin immediate;") + (sqlite-bind-arguments package-stmt + #:name package + #:version version) + (sqlite-fold (const #t) #t package-stmt) + (match (sqlite-fold cons '() last-row-id-stmt) + ((#(package-id)) + (pk 'package package-id package) + (for-each (lambda (directory) + (define (strip file) + (string-drop file (+ (string-length directory) 1))) + + (sqlite-reset directory-stmt) + (sqlite-bind-arguments directory-stmt + #:name directory + #:package package-id) + (sqlite-fold (const #t) #t directory-stmt) + + (match (sqlite-fold cons '() last-row-id-stmt) + ((#(directory-id)) + (for-each (lambda (file) + ;; If DIRECTORY is a symlink, (find-files + ;; DIRECTORY) returns the DIRECTORY singleton. + (unless (string=? file directory) + (sqlite-reset file-stmt) + (sqlite-bind-arguments file-stmt + #:name (strip file) + #:basename + (basename file) + #:directory + directory-id) + (sqlite-fold (const #t) #t file-stmt))) + (find-files directory))))) + directories) + (sqlite-exec db "commit;")))) + +(define (insert-package db package) + "Insert all the files of PACKAGE into DB." + (mlet %store-monad ((drv (package->derivation package #:graft? #f))) + (match (derivation->output-paths drv) + (((labels . directories) ...) + (when (every file-exists? directories) + (insert-files db (package-name package) (package-version package) + directories)) + (return #t))))) + +(define (insert-packages db) + "Insert all the current packages into DB." + (with-store store + (parameterize ((%graft? #f)) + (fold-packages (lambda (package _) + (run-with-store store + (insert-package db package))) + #t + #:select? (lambda (package) + (and (not (hidden-package? package)) + (not (package-superseded package)) + (supported-package? package))))))) + +(define-record-type <package-match> + (package-match name version file) + package-match? 
+ (name package-match-name) + (version package-match-version) + (file package-match-file)) + +(define (matching-packages db file) + "Return a list of <package-match> corresponding to packages containing +FILE." + (define lookup-stmt + (sqlite-prepare db "\ +SELECT Packages.name, Packages.version, Directories.name, Files.name +FROM Packages +INNER JOIN Files, Directories +ON files.basename = :file AND directories.id = files.directory AND packages.id = directories.package;")) + + (sqlite-bind-arguments lookup-stmt #:file file) + (sqlite-fold (lambda (result lst) + (match result + (#(package version directory file) + (cons (package-match package version + (string-append directory "/" file)) + lst)))) + '() lookup-stmt)) + +\f +(define (file-database . args) + (match args + ((_ "populate") + (call-with-database "/tmp/db" + (lambda (db) + (insert-packages db)))) + ((_ "search" file) + (let ((matches (call-with-database "/tmp/db" + (lambda (db) + (matching-packages db file))))) + (for-each (lambda (result) + (format #t "~20a ~a~%" + (string-append (package-match-name result) + "@" (package-match-version result)) + (package-match-file result))) + matches) + (exit (pair? matches)))) + (_ + (format (current-error-port) + "usage: file-database [populate|search] args ...~%") + (exit 1)))) + +(apply file-database (command-line)) -- 2.38.1 From d9139cc86c26f76bc66f7d82868ebf6a03605f76 Mon Sep 17 00:00:00 2001 From: "Antoine R. 
Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Thu, 1 Dec 2022 13:36:28 +0100 Subject: [PATCH 02/18] extensions-index: Transform command into `guix locate` extension --- .../{file-database.scm => locate.scm} | 58 ++++++++++++------- 1 file changed, 36 insertions(+), 22 deletions(-) rename guix/extensions/{file-database.scm => locate.scm} (82%) diff --git a/guix/extensions/file-database.scm b/guix/extensions/locate.scm similarity index 82% rename from guix/extensions/file-database.scm rename to guix/extensions/locate.scm index 83aafbc554..1e42f5bad8 100644 --- a/guix/extensions/file-database.scm +++ b/guix/extensions/locate.scm @@ -16,7 +16,8 @@ ;;; You should have received a copy of the GNU General Public License ;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. -(define-module (file-database) +(define-module (guix extensions locate) + #:use-module (guix scripts) #:use-module (sqlite3) #:use-module (ice-9 match) #:use-module (guix store) @@ -28,7 +29,7 @@ (define-module (file-database) #:autoload (gnu packages) (fold-packages) #:use-module (srfi srfi-1) #:use-module (srfi srfi-9) - #:export (file-database)) + #:export (guix-locate)) (define schema " @@ -155,8 +156,7 @@ (define-record-type <package-match> (file package-match-file)) (define (matching-packages db file) - "Return a list of <package-match> corresponding to packages containing -FILE." + "Return list of <package-match> corresponding to packages containing FILE." (define lookup-stmt (sqlite-prepare db "\ SELECT Packages.name, Packages.version, Directories.name, Files.name @@ -174,26 +174,40 @@ (define lookup-stmt '() lookup-stmt)) \f -(define (file-database . args) + +(define (index-packages-with-db db-pathname) + "Index packages using db at location DB-PATHNAME." + (call-with-database db-pathname + (lambda (db) + (insert-packages db)))) + +(define (matching-packages-with-db db-pathname file) + "Compute list of packages referencing FILE using db at DB-PATHNAME." 
+ (call-with-database db-pathname + (lambda (db) + (matching-packages db file)))) + +(define (print-matching-results matches) + "Print the MATCHES matching results." + (for-each (lambda (result) + (format #t "~20a ~a~%" + (string-append (package-match-name result) + "@" (package-match-version result)) + (package-match-file result))) + matches)) + +(define-command (guix-locate . args) + (category extension) + (synopsis "Index packages then search what package declares a given file") (match args - ((_ "populate") - (call-with-database "/tmp/db" - (lambda (db) - (insert-packages db)))) - ((_ "search" file) - (let ((matches (call-with-database "/tmp/db" - (lambda (db) - (matching-packages db file))))) - (for-each (lambda (result) - (format #t "~20a ~a~%" - (string-append (package-match-name result) - "@" (package-match-version result)) - (package-match-file result))) - matches) + (("index") + (index-packages-with-db "/tmp/db")) + (("search" file) + (let ((matches (matching-packages-with-db "/tmp/db" file))) + (print-matching-results matches) (exit (pair? matches)))) (_ (format (current-error-port) - "usage: file-database [populate|search] args ...~%") + "usage: guix locate [index|search] args ...~% ~a" + args) (exit 1)))) - -(apply file-database (command-line)) -- 2.38.1 From eb474f3412ba19320dceda7d08c7f960d00cb898 Mon Sep 17 00:00:00 2001 From: "Antoine R. 
Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Thu, 1 Dec 2022 13:45:59 +0100 Subject: [PATCH 03/18] extensions-index: Avoid duplicating the hard-coded db path --- guix/extensions/locate.scm | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/guix/extensions/locate.scm b/guix/extensions/locate.scm index 1e42f5bad8..830dfc49fb 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/locate.scm @@ -196,14 +196,18 @@ (define (print-matching-results matches) (package-match-file result))) matches)) +;; TODO: Determine the current guile/guix mechanism to provide configuration +;; for this +(define default-location-db-path "/tmp/db") + (define-command (guix-locate . args) (category extension) (synopsis "Index packages then search what package declares a given file") (match args (("index") - (index-packages-with-db "/tmp/db")) + (index-packages-with-db default-location-db-path)) (("search" file) - (let ((matches (matching-packages-with-db "/tmp/db" file))) + (let ((matches (matching-packages-with-db default-location-db-path file))) (print-matching-results matches) (exit (pair? matches)))) (_ -- 2.38.1 From 309ecd5d5b7cdff012b66cbe9643c34725b22a2d Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Thu, 1 Dec 2022 13:47:19 +0100 Subject: [PATCH 04/18] extensions-index: Deduplicate lookup matching results --- guix/extensions/locate.scm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/guix/extensions/locate.scm b/guix/extensions/locate.scm index 830dfc49fb..ab0a0403ec 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/locate.scm @@ -156,10 +156,10 @@ (define-record-type <package-match> (file package-match-file)) (define (matching-packages db file) - "Return list of <package-match> corresponding to packages containing FILE." + "Return unique <package-match> corresponding to packages containing FILE." 
(define lookup-stmt (sqlite-prepare db "\ -SELECT Packages.name, Packages.version, Directories.name, Files.name +SELECT DISTINCT Packages.name, Packages.version, Directories.name, Files.name FROM Packages INNER JOIN Files, Directories ON files.basename = :file AND directories.id = files.directory AND packages.id = directories.package;")) -- 2.38.1 From 541615ab6638b1fb418531f961cfb6756b41499b Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Fri, 2 Dec 2022 14:09:52 +0100 Subject: [PATCH 05/18] extensions-index: Make insertion queries idempotent Prior to this, multiple runs of the index subcommand would append the same packages, directories or files in the db. --- guix/extensions/locate.scm | 71 ++++++++++++++++++++++++-------------- 1 file changed, 45 insertions(+), 26 deletions(-) diff --git a/guix/extensions/locate.scm b/guix/extensions/locate.scm index ab0a0403ec..ce8306531f 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/locate.scm @@ -36,14 +36,16 @@ (define schema create table if not exists Packages ( id integer primary key autoincrement not null, name text not null, - version text not null + version text not null, + unique (name, version) -- add uniqueness constraint ); create table if not exists Directories ( id integer primary key autoincrement not null, name text not null, package integer not null, - foreign key (package) references Packages(id) on delete cascade + foreign key (package) references Packages(id) on delete cascade, + unique (name, package) -- add uniqueness constraint ); create table if not exists Files ( @@ -51,6 +53,7 @@ (define schema basename text not null, directory integer not null, foreign key (directory) references Directories(id) on delete cascade + unique (name, basename, directory) -- add uniqueness constraint ); create index if not exists IndexFiles on Files(basename);") @@ -66,64 +69,78 @@ (define (call-with-database file proc) (sqlite-close db))))) (define 
(insert-files db package version directories) - "Insert the files contained in DIRECTORIES as belonging to PACKAGE at -VERSION." - (define last-row-id-stmt - (sqlite-prepare db "SELECT last_insert_rowid();" + "Insert files from DIRECTORIES as belonging to PACKAGE at VERSION." + (define stmt-select-package + (sqlite-prepare db "\ +SELECT id FROM Packages WHERE name = :name AND version = :version;" #:cache? #t)) - (define package-stmt + (define stmt-insert-package (sqlite-prepare db "\ -INSERT OR REPLACE INTO Packages(name, version) +INSERT OR IGNORE INTO Packages(name, version) -- to avoid spurious writes VALUES (:name, :version);" #:cache? #t)) - (define directory-stmt + (define stmt-select-directory (sqlite-prepare db "\ -INSERT INTO Directories(name, package) VALUES (:name, :package);" +SELECT id FROM Directories WHERE name = :name AND package = :package;" #:cache? #t)) - (define file-stmt + (define stmt-insert-directory (sqlite-prepare db "\ -INSERT INTO Files(name, basename, directory) +INSERT OR IGNORE INTO Directories(name, package) -- to avoid spurious writes +VALUES (:name, :package);" + #:cache? #t)) + + (define stmt-insert-file + (sqlite-prepare db "\ +INSERT OR IGNORE INTO Files(name, basename, directory) VALUES (:name, :basename, :directory);" #:cache? 
#t)) (sqlite-exec db "begin immediate;") - (sqlite-bind-arguments package-stmt + (sqlite-bind-arguments stmt-insert-package #:name package #:version version) - (sqlite-fold (const #t) #t package-stmt) - (match (sqlite-fold cons '() last-row-id-stmt) + (sqlite-fold (const #t) #t stmt-insert-package) + + (sqlite-bind-arguments stmt-select-package + #:name package + #:version version) + (match (sqlite-fold cons '() stmt-select-package) ((#(package-id)) (pk 'package package-id package) (for-each (lambda (directory) (define (strip file) (string-drop file (+ (string-length directory) 1))) - (sqlite-reset directory-stmt) - (sqlite-bind-arguments directory-stmt + (sqlite-reset stmt-insert-directory) + (sqlite-bind-arguments stmt-insert-directory #:name directory #:package package-id) - (sqlite-fold (const #t) #t directory-stmt) + (sqlite-fold (const #t) #t stmt-insert-directory) - (match (sqlite-fold cons '() last-row-id-stmt) + (sqlite-reset stmt-select-directory) + (sqlite-bind-arguments stmt-select-directory + #:name directory + #:package package-id) + (match (sqlite-fold cons '() stmt-select-directory) ((#(directory-id)) (for-each (lambda (file) ;; If DIRECTORY is a symlink, (find-files ;; DIRECTORY) returns the DIRECTORY singleton. (unless (string=? file directory) - (sqlite-reset file-stmt) - (sqlite-bind-arguments file-stmt + (sqlite-reset stmt-insert-file) + (sqlite-bind-arguments stmt-insert-file #:name (strip file) #:basename (basename file) #:directory directory-id) - (sqlite-fold (const #t) #t file-stmt))) + (sqlite-fold (const #t) #t stmt-insert-file))) (find-files directory))))) - directories) - (sqlite-exec db "commit;")))) + directories))) + (sqlite-exec db "commit;")) (define (insert-package db package) "Insert all the files of PACKAGE into DB." @@ -159,10 +176,12 @@ (define (matching-packages db file) "Return unique <package-match> corresponding to packages containing FILE." 
(define lookup-stmt (sqlite-prepare db "\ -SELECT DISTINCT Packages.name, Packages.version, Directories.name, Files.name +SELECT Packages.name, Packages.version, Directories.name, Files.name FROM Packages INNER JOIN Files, Directories -ON files.basename = :file AND directories.id = files.directory AND packages.id = directories.package;")) +ON files.basename = :file + AND directories.id = files.directory + AND packages.id = directories.package;")) (sqlite-bind-arguments lookup-stmt #:file file) (sqlite-fold (lambda (result lst) -- 2.38.1 From 09d5f6b30ac24a8e8261994a1011ddd13082a4bb Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Fri, 2 Dec 2022 14:10:59 +0100 Subject: [PATCH 06/18] extensions-index: Add debug statement This is conditional in the top-level debug module variable, false by default. --- guix/extensions/locate.scm | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/guix/extensions/locate.scm b/guix/extensions/locate.scm index ce8306531f..3b43ea887e 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/locate.scm @@ -31,6 +31,8 @@ (define-module (guix extensions locate) #:use-module (srfi srfi-9) #:export (guix-locate)) +(define debug #f) + (define schema " create table if not exists Packages ( @@ -109,6 +111,9 @@ (define stmt-insert-file #:version version) (match (sqlite-fold cons '() stmt-select-package) ((#(package-id)) + (when debug + (format #t "(pkg, version, pkg-id): (~a, ~a, ~a)" + package version package-id)) (pk 'package package-id package) (for-each (lambda (directory) (define (strip file) @@ -126,6 +131,9 @@ (define (strip file) #:package package-id) (match (sqlite-fold cons '() stmt-select-directory) ((#(directory-id)) + (when debug + (format #t "(name, package, dir-id): (~a, ~a, ~a)\n" + directory package-id directory-id)) (for-each (lambda (file) ;; If DIRECTORY is a symlink, (find-files ;; DIRECTORY) returns the DIRECTORY singleton. 
-- 2.38.1 From b50267e3d24162cd8c3908bbaa841d13363621e9 Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Fri, 2 Dec 2022 14:11:50 +0100 Subject: [PATCH 07/18] extensions-index: Play around the packaging filtering functions This keeps the default behavior but allows to change it (by the developer) to determine what's the best policy. --- guix/extensions/locate.scm | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/guix/extensions/locate.scm b/guix/extensions/locate.scm index 3b43ea887e..9679d643a6 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/locate.scm @@ -160,18 +160,27 @@ (define (insert-package db package) directories)) (return #t))))) -(define (insert-packages db) - "Insert all the current packages into DB." +(define (filter-public-current-supported package) + "Filter supported, not hidden (public) and not superseded (current) package." + (and (not (hidden-package? package)) + (not (package-superseded package)) + (supported-package? package))) + +(define (filter-supported-package package) + "Filter supported package (package might be hidden or superseded)." + (and (supported-package? package))) + +(define (no-filter package) "No filtering on package" #t) + +(define* (insert-packages db #:optional (filter-policy filter-public-current-supported)) + "Insert all current packages matching `filter-package-policy` into DB." (with-store store (parameterize ((%graft? #f)) (fold-packages (lambda (package _) (run-with-store store (insert-package db package))) #t - #:select? (lambda (package) - (and (not (hidden-package? package)) - (not (package-superseded package)) - (supported-package? package))))))) + #:select? filter-policy)))) (define-record-type <package-match> (package-match name version file) @@ -206,7 +215,7 @@ (define (index-packages-with-db db-pathname) "Index packages using db at location DB-PATHNAME." 
(call-with-database db-pathname (lambda (db) - (insert-packages db)))) + (insert-packages db no-filter)))) (define (matching-packages-with-db db-pathname file) "Compute list of packages referencing FILE using db at DB-PATHNAME." -- 2.38.1 From 3b5c765fc967cef1d6919b66acc2d7872ea1e48c Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Fri, 2 Dec 2022 15:19:24 +0100 Subject: [PATCH 08/18] extensions-index: Install db in ~/.config/guix/locate-db.sqlite --- guix/extensions/locate.scm | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/guix/extensions/locate.scm b/guix/extensions/locate.scm index 9679d643a6..7d19e64a07 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/locate.scm @@ -232,9 +232,12 @@ (define (print-matching-results matches) (package-match-file result))) matches)) -;; TODO: Determine the current guile/guix mechanism to provide configuration -;; for this -(define default-location-db-path "/tmp/db") +(define default-location-db-path + (let ((local-config-path + (and=> (getenv "HOME") + (lambda (home) + (string-append home "/.config/guix/"))))) + (string-append local-config-path "locate-db.sqlite"))) (define-command (guix-locate . args) (category extension) -- 2.38.1 From f101d12acf05c82cf9678d1cffec76cceba9e845 Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Fri, 2 Dec 2022 17:58:18 +0100 Subject: [PATCH 09/18] extensions-index: Improve cli parsing This unifies with some existing guix commands (import). --- guix/extensions/locate.scm | 80 +++++++++++++++++++++++++++++++++----- 1 file changed, 71 insertions(+), 9 deletions(-) diff --git a/guix/extensions/locate.scm b/guix/extensions/locate.scm index 7d19e64a07..630560b231 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/locate.scm @@ -17,9 +17,12 @@ ;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. 
(define-module (guix extensions locate) + #:use-module (guix config) ;; %guix-package-name, ... + #:use-module (guix ui) ;; display G_ #:use-module (guix scripts) #:use-module (sqlite3) #:use-module (ice-9 match) + #:use-module (guix describe) #:use-module (guix store) #:use-module (guix monads) #:autoload (guix grafts) (%graft?) @@ -232,25 +235,84 @@ (define (print-matching-results matches) (package-match-file result))) matches)) -(define default-location-db-path +(define default-db-path (let ((local-config-path (and=> (getenv "HOME") (lambda (home) (string-append home "/.config/guix/"))))) (string-append local-config-path "locate-db.sqlite"))) +(define (show-bug-report-information) + ;; TRANSLATORS: The placeholder indicates the bug-reporting address for this + ;; package. Please add another line saying "Report translation bugs to + ;; ...\n" with the address for translation bugs (typically your translation + ;; team's web or email address). + (format #t (G_ " +Report bugs to: ~a.") %guix-bug-report-address) + (format #t (G_ " +~a home page: <~a>") %guix-package-name %guix-home-page-url) + (format #t (G_ " +General help using Guix and GNU software: <~a>") + ;; TRANSLATORS: Change the "/en" bit of this URL appropriately if + ;; the web site is translated in your language. + (G_ "https://guix.gnu.org/en/help/")) + (newline)) + +(define (show-help) + (display (G_ "Usage: guix locate [OPTIONS...] [ARGS...] +Index packages and search what package declares a given file.\n +By default, the local cache db is located in ~/.config/guix/locate-db.sqlite. 
+See --db-path for customization.")) + (display (G_ " + index Index current packages from the local store (in cache db)")) + (display (G_ " + search FILE Search for packages that declares FILE (from cache db)")) + (newline) + (display (G_ " + --db-path=DIR Change default location of the cache db")) + (newline) + (display (G_ " + -h, --help Display this help and exit")) + (display (G_ " + -V, --version Display version information and exit")) + (newline) + (show-bug-report-information)) + (define-command (guix-locate . args) (category extension) - (synopsis "Index packages then search what package declares a given file") + (synopsis "Index packages to allow searching package for a given filename") + + (define (parse-db-args args) + "Parsing of string key=value where we are only interested in 'value'" + (match (string-split args #\=) + ((unused db-path) + db-path) + (_ #f))) + + (define (display-help-and-exit) + (show-help) + (exit 0)) + (match args + ((or ("-h") ("--help") ()) + (display-help-and-exit)) + ((or ("-V") ("--version")) + (show-version-and-exit "guix locate")) + ((db-path-args "index") + (let ((db-path (parse-db-args db-path-args))) + (if db-path + (index-packages-with-db db-path) + (display-help-and-exit)))) (("index") - (index-packages-with-db default-location-db-path)) + (index-packages-with-db default-db-path)) (("search" file) - (let ((matches (matching-packages-with-db default-location-db-path file))) + (let ((matches (matching-packages-with-db default-db-path file))) (print-matching-results matches) (exit (pair? matches)))) - (_ - (format (current-error-port) - "usage: guix locate [index|search] args ...~% ~a" - args) - (exit 1)))) + ((db-path-args "search" file) + (let ((db-path (parse-db-args db-path-args))) + (if db-path + (let ((matches (matching-packages-with-db db-path file))) + (print-matching-results matches) + (exit (pair? 
matches))) + (display-help-and-exit)))))) -- 2.38.1 From 9cb0826a71bdada345de100d98e9b44f3503a75a Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Fri, 2 Dec 2022 19:13:46 +0100 Subject: [PATCH 10/18] extensions-index: Improve cli options and help message This also renames the cli from locate to index. --- guix/extensions/{locate.scm => index.scm} | 40 +++++++++++++---------- 1 file changed, 22 insertions(+), 18 deletions(-) rename guix/extensions/{locate.scm => index.scm} (93%) diff --git a/guix/extensions/locate.scm b/guix/extensions/index.scm similarity index 93% rename from guix/extensions/locate.scm rename to guix/extensions/index.scm index 630560b231..ab7661dbac 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/index.scm @@ -16,7 +16,7 @@ ;;; You should have received a copy of the GNU General Public License ;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. -(define-module (guix extensions locate) +(define-module (guix extensions index) #:use-module (guix config) ;; %guix-package-name, ... #:use-module (guix ui) ;; display G_ #:use-module (guix scripts) @@ -32,7 +32,7 @@ (define-module (guix extensions locate) #:autoload (gnu packages) (fold-packages) #:use-module (srfi srfi-1) #:use-module (srfi srfi-9) - #:export (guix-locate)) + #:export (guix-index)) (define debug #f) @@ -259,26 +259,30 @@ (define (show-bug-report-information) (newline)) (define (show-help) - (display (G_ "Usage: guix locate [OPTIONS...] [ARGS...] -Index packages and search what package declares a given file.\n -By default, the local cache db is located in ~/.config/guix/locate-db.sqlite. -See --db-path for customization.")) - (display (G_ " - index Index current packages from the local store (in cache db)")) - (display (G_ " - search FILE Search for packages that declares FILE (from cache db)")) + (display (G_ "Usage: guix index [OPTIONS...] [search FILE...] 
+Without FILE, index (package, file) relationships in the local store. +With 'search FILE', search for packages installing FILEx;x.\n +Note: The internal cache is located at ~/.config/guix/locate-db.sqlite. +See --db-path for customization.\n")) (newline) - (display (G_ " - --db-path=DIR Change default location of the cache db")) + (display (G_ "The valid values for OPTIONS are:")) (newline) (display (G_ " -h, --help Display this help and exit")) (display (G_ " -V, --version Display version information and exit")) + (display (G_ " + --db-path=DIR Change default location of the cache db")) + (newline) + (newline) + (display (G_ "The valid values for ARGS are:")) + (newline) + (display (G_ " + search FILE Search for packages installing the FILE (from cache db)")) (newline) (show-bug-report-information)) -(define-command (guix-locate . args) +(define-command (guix-index . args) (category extension) (synopsis "Index packages to allow searching package for a given filename") @@ -294,17 +298,15 @@ (define (display-help-and-exit) (exit 0)) (match args - ((or ("-h") ("--help") ()) + ((or ("-h") ("--help")) (display-help-and-exit)) ((or ("-V") ("--version")) (show-version-and-exit "guix locate")) - ((db-path-args "index") + ((db-path-args) (let ((db-path (parse-db-args db-path-args))) (if db-path (index-packages-with-db db-path) (display-help-and-exit)))) - (("index") - (index-packages-with-db default-db-path)) (("search" file) (let ((matches (matching-packages-with-db default-db-path file))) (print-matching-results matches) @@ -315,4 +317,6 @@ (define (display-help-and-exit) (let ((matches (matching-packages-with-db db-path file))) (print-matching-results matches) (exit (pair? 
matches))) - (display-help-and-exit)))))) + (display-help-and-exit)))) + (_ ;; index by default + (index-packages-with-db default-db-path)))) -- 2.38.1 From f18d1f536bf6b13ec0dd8ee1e865ce21448e3836 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo@gnu.org> Date: Sun, 4 Dec 2022 14:42:45 +0100 Subject: [PATCH 11/18] extensions-index: Iterate over system manifests to index This should avoid the extra work of discussing with daemon, computing derivations, etc... --- guix/extensions/index.scm | 84 +++++++++++++++++++++++++++++++++++---- 1 file changed, 76 insertions(+), 8 deletions(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index ab7661dbac..a7a23c6194 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -25,13 +25,19 @@ (define-module (guix extensions index) #:use-module (guix describe) #:use-module (guix store) #:use-module (guix monads) + #:autoload (guix combinators) (fold2) #:autoload (guix grafts) (%graft?) + #:autoload (guix store roots) (gc-roots) #:use-module (guix derivations) #:use-module (guix packages) + #:use-module (guix profiles) + #:use-module (guix sets) + #:use-module ((guix utils) #:select (cache-directory)) #:autoload (guix build utils) (find-files) #:autoload (gnu packages) (fold-packages) #:use-module (srfi srfi-1) #:use-module (srfi srfi-9) + #:use-module (srfi srfi-71) #:export (guix-index)) (define debug #f) @@ -185,6 +191,67 @@ (define* (insert-packages db #:optional (filter-policy filter-public-current-sup #t #:select? filter-policy)))) +\f +;;; +;;; Indexing from local profiles. +;;; + +(define (all-profiles) + "Return the list of profiles on the system." + (delete-duplicates + (filter-map (lambda (root) + (if (file-exists? (string-append root "/manifest")) + root + (let ((root (string-append root "/profile"))) + (and (file-exists? 
(string-append root "/manifest")) + root)))) + (gc-roots)))) + +(define (profiles->manifest-entries profiles) + "Return manifest entries for all of PROFILES, without duplicates." + (let loop ((visited (set)) + (profiles profiles) + (entries '())) + (match profiles + (() + entries) + ((profile . rest) + (let* ((manifest (profile-manifest profile)) + (entries visited + (fold2 (lambda (entry lst visited) + (let ((item (manifest-entry-item entry))) + (if (set-contains? visited item) + (values lst visited) + (values (cons entry lst) + (set-insert item + visited))))) + entries + visited + (manifest-transitive-entries manifest)))) + (loop visited rest entries)))))) + +(define (insert-manifest-entry db entry) + "Insert ENTRY, a manifest entry, into DB." + (insert-files db (manifest-entry-name entry) + (manifest-entry-version entry) + (list (manifest-entry-item entry)))) ;FIXME: outputs? + +(define (index-manifests db-file) + "Insert into DB-FILE entries for packages that appear in manifests +available on the system." + (call-with-database db-file + (lambda (db) + (for-each (lambda (entry) + (insert-manifest-entry db entry)) + (let ((lst (profiles->manifest-entries (all-profiles)))) + (pk 'entries (length lst)) + lst))))) + +\f +;;; +;;; Search. +;;; + (define-record-type <package-match> (package-match name version file) package-match? @@ -214,6 +281,10 @@ (define lookup-stmt \f +;;; +;;; CLI +;;; + (define (index-packages-with-db db-pathname) "Index packages using db at location DB-PATHNAME." (call-with-database db-pathname @@ -236,11 +307,8 @@ (define (print-matching-results matches) matches)) (define default-db-path - (let ((local-config-path - (and=> (getenv "HOME") - (lambda (home) - (string-append home "/.config/guix/"))))) - (string-append local-config-path "locate-db.sqlite"))) + (string-append (cache-directory #:ensure? 
#f) + "/index/db.sqlite")) (define (show-bug-report-information) ;; TRANSLATORS: The placeholder indicates the bug-reporting address for this @@ -261,7 +329,7 @@ (define (show-bug-report-information) (define (show-help) (display (G_ "Usage: guix index [OPTIONS...] [search FILE...] Without FILE, index (package, file) relationships in the local store. -With 'search FILE', search for packages installing FILEx;x.\n +With 'search FILE', search for packages installing FILE.\n Note: The internal cache is located at ~/.config/guix/locate-db.sqlite. See --db-path for customization.\n")) (newline) @@ -318,5 +386,5 @@ (define (display-help-and-exit) (print-matching-results matches) (exit (pair? matches))) (display-help-and-exit)))) - (_ ;; index by default - (index-packages-with-db default-db-path)))) + (_ ;; By default, index + (index-manifests default-db-path)))) -- 2.38.1 From c9b02fc838237ebd7bc38ba7a71587fcdcaf6212 Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 4 Dec 2022 14:45:20 +0100 Subject: [PATCH 12/18] extensions-index: Improve help message --- guix/extensions/index.scm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index a7a23c6194..4a69df326e 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -328,9 +328,9 @@ (define (show-bug-report-information) (define (show-help) (display (G_ "Usage: guix index [OPTIONS...] [search FILE...] -Without FILE, index (package, file) relationships in the local store. +Without argument, indexes (package, file) relationships in the local store. With 'search FILE', search for packages installing FILE.\n -Note: The internal cache is located at ~/.config/guix/locate-db.sqlite. +Note: The internal cache is located at ~/.cache/guix/index/db.sqlite. 
See --db-path for customization.\n")) (newline) (display (G_ "The valid values for OPTIONS are:")) -- 2.38.1 From d63ef7a97f3fb47b5693b2c1d24bdf276ca6a6a8 Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 4 Dec 2022 14:46:04 +0100 Subject: [PATCH 13/18] extensions-index: Improve imports --- guix/extensions/index.scm | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index 4a69df326e..abaf7df071 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -17,8 +17,10 @@ ;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. (define-module (guix extensions index) - #:use-module (guix config) ;; %guix-package-name, ... - #:use-module (guix ui) ;; display G_ + #:use-module ((guix config) #:select (%guix-package-name + %guix-home-page-url + %guix-bug-report-address)) + #:use-module ((guix ui) #:select (G_)) #:use-module (guix scripts) #:use-module (sqlite3) #:use-module (ice-9 match) -- 2.38.1 From 14a9dafb2b927ba8435a26fdea04b00644e3ca3c Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 4 Dec 2022 15:52:15 +0100 Subject: [PATCH 14/18] extensions-index: Drop code duplication Import directly the right function from guix ui module. --- guix/extensions/index.scm | 23 +++-------------------- 1 file changed, 3 insertions(+), 20 deletions(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index abaf7df071..c40edc7944 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -17,10 +17,9 @@ ;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. 
(define-module (guix extensions index) - #:use-module ((guix config) #:select (%guix-package-name - %guix-home-page-url - %guix-bug-report-address)) - #:use-module ((guix ui) #:select (G_)) + #:use-module ((guix i18n) #:select (G_)) + #:use-module ((guix ui) #:select (show-version-and-exit + show-bug-report-information)) #:use-module (guix scripts) #:use-module (sqlite3) #:use-module (ice-9 match) @@ -312,22 +311,6 @@ (define default-db-path (string-append (cache-directory #:ensure? #f) "/index/db.sqlite")) -(define (show-bug-report-information) - ;; TRANSLATORS: The placeholder indicates the bug-reporting address for this - ;; package. Please add another line saying "Report translation bugs to - ;; ...\n" with the address for translation bugs (typically your translation - ;; team's web or email address). - (format #t (G_ " -Report bugs to: ~a.") %guix-bug-report-address) - (format #t (G_ " -~a home page: <~a>") %guix-package-name %guix-home-page-url) - (format #t (G_ " -General help using Guix and GNU software: <~a>") - ;; TRANSLATORS: Change the "/en" bit of this URL appropriately if - ;; the web site is translated in your language. - (G_ "https://guix.gnu.org/en/help/")) - (newline)) - (define (show-help) (display (G_ "Usage: guix index [OPTIONS...] [search FILE...] Without argument, indexes (package, file) relationships in the local store. -- 2.38.1 From ea1d8216bfe5f487de24d883891b6e07c8536cdd Mon Sep 17 00:00:00 2001 From: "Antoine R. 
Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 4 Dec 2022 16:01:33 +0100 Subject: [PATCH 15/18] extensions-index: Drop dead code we read from local profiles now --- guix/extensions/index.scm | 42 ++------------------------------------- 1 file changed, 2 insertions(+), 40 deletions(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index c40edc7944..a7c518e903 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -160,38 +160,6 @@ (define (strip file) directories))) (sqlite-exec db "commit;")) -(define (insert-package db package) - "Insert all the files of PACKAGE into DB." - (mlet %store-monad ((drv (package->derivation package #:graft? #f))) - (match (derivation->output-paths drv) - (((labels . directories) ...) - (when (every file-exists? directories) - (insert-files db (package-name package) (package-version package) - directories)) - (return #t))))) - -(define (filter-public-current-supported package) - "Filter supported, not hidden (public) and not superseded (current) package." - (and (not (hidden-package? package)) - (not (package-superseded package)) - (supported-package? package))) - -(define (filter-supported-package package) - "Filter supported package (package might be hidden or superseded)." - (and (supported-package? package))) - -(define (no-filter package) "No filtering on package" #t) - -(define* (insert-packages db #:optional (filter-policy filter-public-current-supported)) - "Insert all current packages matching `filter-package-policy` into DB." - (with-store store - (parameterize ((%graft? #f)) - (fold-packages (lambda (package _) - (run-with-store store - (insert-package db package))) - #t - #:select? filter-policy)))) - \f ;;; ;;; Indexing from local profiles. @@ -209,7 +177,7 @@ (define (all-profiles) (gc-roots)))) (define (profiles->manifest-entries profiles) - "Return manifest entries for all of PROFILES, without duplicates." 
+ "Return deduplicated manifest entries across all PROFILES." (let loop ((visited (set)) (profiles profiles) (entries '())) @@ -286,12 +254,6 @@ (define lookup-stmt ;;; CLI ;;; -(define (index-packages-with-db db-pathname) - "Index packages using db at location DB-PATHNAME." - (call-with-database db-pathname - (lambda (db) - (insert-packages db no-filter)))) - (define (matching-packages-with-db db-pathname file) "Compute list of packages referencing FILE using db at DB-PATHNAME." (call-with-database db-pathname @@ -358,7 +320,7 @@ (define (display-help-and-exit) ((db-path-args) (let ((db-path (parse-db-args db-path-args))) (if db-path - (index-packages-with-db db-path) + (index-manifests db-path) (display-help-and-exit)))) (("search" file) (let ((matches (matching-packages-with-db default-db-path file))) -- 2.38.1 From 8454f9f417c2781fded2c26a1b920174991ac1dc Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 4 Dec 2022 16:12:10 +0100 Subject: [PATCH 16/18] extensions-index: Rework docstrings --- guix/extensions/index.scm | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index a7c518e903..1c23d9a4f1 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -166,7 +166,7 @@ (define (strip file) ;;; (define (all-profiles) - "Return the list of profiles on the system." + "Return the list of system profiles." (delete-duplicates (filter-map (lambda (root) (if (file-exists? (string-append root "/manifest")) @@ -200,14 +200,13 @@ (define (profiles->manifest-entries profiles) (loop visited rest entries)))))) (define (insert-manifest-entry db entry) - "Insert ENTRY, a manifest entry, into DB." + "Insert a manifest ENTRY into DB." (insert-files db (manifest-entry-name entry) (manifest-entry-version entry) (list (manifest-entry-item entry)))) ;FIXME: outputs? 
(define (index-manifests db-file) - "Insert into DB-FILE entries for packages that appear in manifests -available on the system." + "Insert packages entries into DB-FILE from the system manifests." (call-with-database db-file (lambda (db) (for-each (lambda (entry) -- 2.38.1 From 98f9899d479cd62e93b86fab3448b2024db02621 Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 4 Dec 2022 16:12:24 +0100 Subject: [PATCH 17/18] extensions-index: Fix warning according to repl suggestion --- guix/extensions/index.scm | 1 + 1 file changed, 1 insertion(+) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index 1c23d9a4f1..42c2051f13 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -23,6 +23,7 @@ (define-module (guix extensions index) #:use-module (guix scripts) #:use-module (sqlite3) #:use-module (ice-9 match) + #:use-module (ice-9 format) #:use-module (guix describe) #:use-module (guix store) #:use-module (guix monads) -- 2.38.1 From bb80ad696e1a47651f2dc4a7c74ea577372c61b5 Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 4 Dec 2022 16:20:01 +0100 Subject: [PATCH 18/18] extensions-index: Ensure directory holding the db is created if needed. The creation is ignored if already present. --- guix/extensions/index.scm | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index 42c2051f13..627dddc119 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -208,6 +208,10 @@ (define (insert-manifest-entry db entry) (define (index-manifests db-file) "Insert packages entries into DB-FILE from the system manifests." + (let ((db-dirpath (dirname db-file))) + (unless (file-exists?
db-dirpath) + (mkdir db-dirpath))) + (call-with-database db-file (lambda (db) (for-each (lambda (entry) -- 2.38.1 ============================================================ ===File ~/repo/private/org/guix/guix-extensions-index.org=== #+title: Bootstrap guix index (search) #+author: civodul, ardumont Let's have a means to lookup from file to package holding that file: * sources - [[https://lists.gnu.org/archive/html/guix-devel/2022-01/msg00354.html][Initial discussion]] - [[https://web.fdn.fr/~lcourtes/pastebin/file-database.scm.html][Latest source]] - [[https://fulbert.neocities.org/guix/10-years-of-guix/simon-tournier-guix-repl/guix-cookbook/minimal-example.html][Extension mechanism]] + [[https://10years.guix.gnu.org/video/guix-repl-to-infinity-and-beyond/][demo presentation]] - [[https://issues.guix.gnu.org/58339][Simple extension package]] * kickoff discussion to propose contribution in irc #+begin_src txt 11:54 <ardumont> civodul: hello, i had a look at the discussion you pointed me (on store "file search"), it's definitely interesting (and the offline part is really what i need) 11:54 <ardumont> what's a way forward? 11:54 <ardumont> (if any) 11:55 <ardumont> (can i help or something? ;) 11:56 <ardumont> (i'm a somewhat "miscelleanous" lisper so) 12:01 <civodul> ardumont: hi! sure! i guess you could champion discussions & development of such a tool 12:02 <civodul> so it's mostly about finding out how to make the info available 12:03 <civodul> perhaps there could be a default mode of operation downloading the database from some server 12:03 <civodul> and an other mode of operation where it'd use purely local knowledge 12:03 <ardumont> regarding the implementation, the end discussion talked about compression, is that solely in regards to serving the result from somewhere? 12:03 <civodul> yes 12:03 <civodul> otherwise it doesn't really matter 12:03 <ardumont> (or is that some implementation adaptation so the tool is doing it?) 
12:03 <ardumont> i have my answer ;) 12:04 <civodul> this is the latest version i have: https://web.fdn.fr/~lcourtes/pastebin/file-database.scm.html 12:05 <ardumont> thx, where should that live? 12:06 <ardumont> (in the end i mean) in the guix repo in a branch? 12:07 <ardumont> (we can always sort out the details of what's there to do regarding licenses and whatnot, i'll comply to whatever is required) 12:17 <zimoun> hey ardumont :-) 12:17 <ardumont> civodul: no promise on eta yet but i'll check what i can do (i got one last question below your last answer, sorry, i had forgotten to highlight you ;) 12:17 <ardumont> hello zimoun ;) 12:17 <zimoun> civodul: the link fails for me 12:17 <civodul> ardumont: in the end it would be part of Guix 12:17 <civodul> that's the kind of tool that's generally useful 12:17 <ardumont> yes 12:17 <zimoun> or via an extension? 12:18 <zimoun> civodul: -^ 12:18 <ardumont> i was gonna ask, is guix providing a way to extend the guix cli through extension already? 12:18 <civodul> it can start its life as an extension, sure 12:18 <ardumont> (since it's lisp and all that, somehow that makes sense to me ;) 12:18 <civodul> but the way i see it it should be part of Guix proper at some point 12:18 <zimoun> ardumont: yes, exemples are here https://issues.guix.gnu.org/58463 12:19 <ardumont> nice 12:19 <civodul> zimoun demoed extensions at the 10 years :-) 12:19 <civodul> yep 12:19 <ardumont> (oh i missed it, i was not there yet) 12:19 <ardumont> (or already left ¯\_(ツ)_/¯) 12:19 <zimoun> https://10years.guix.gnu.org/video/guix-repl-to-infinity-and-beyond/ 12:20 <ardumont> i like that title ;) 12:20 <ardumont> (thx) 12:21 <zimoun> civodul: I think we should go a path where we have more extensions and less all-in subcommands. For sure, tradeoff with maintenance. 
:-) 12:21 <ardumont> yes, that'd make sense ^ #+end_src * Some Metrics ** iteration 1 (over nix store) *** guixified debian yavin4: #+begin_src sh $ time guix index ;;; (package 286 "xcb-util-renderutil") guix index 121.88s user 2.49s system 138% cpu 1:29.82 total $ sqlite3 ~/.config/guix/locate-db.sqlite SQLite version 3.34.1 2021-01-20 14:10:07 Enter ".help" for usage hints. sqlite> select count(*) from files; select count(*) from directories; select count(*) from packages; 50913 328 284 $ ls -lah ~/.config/guix/locate-db.sqlite -rw-r--r-- 1 tony tony 8.9M Dec 3 10:49 /home/tony/.config/guix/locate-db.sqlite #+end_src *** guix system node dagobah: #+begin_src sh $ time guix index ;;; (package 1 "acl") ;;; (package 2 "inetutils") ... ;;; (package 753 "xauth") guix index 413.55s user 6.16s system 124% cpu 5:36.67 total $ ls -lah ~/.config/guix/locate-db.sqlite -rw-r--r-- 1 tony users 30M Dec 3 10:42 /home/tony/.config/guix/locate-db.sqlite $ sqlite3 ~/.config/guix/locate-db.sqlite SQLite version 3.39.3 2022-09-05 11:02:23 Enter ".help" for usage hints. sqlite> select count(*) from files; select count(*) from directories; select count(*) from packages; 162035 830 749 #+end_src ** iteration 2 (over system manifests) *** guixified debian #+begin_src sh $ time guix index ;;; (package 110 "guix") guix index 1.30s user 0.34s system 94% cpu 1.735 total $ sqlite3 ~/.cache/guix/index/db.sqlite SQLite version 3.34.1 2021-01-20 14:10:07 Enter ".help" for usage hints. sqlite> select count(*) from files; select count(*) from directories; select count(*) from packages; 34339 110 101 ls -lah ~/.cache/guix/index/db.sqlite -rw-r--r-- 1 tony tony 5.8M Dec 4 16:22 /home/tony/.cache/guix/index/db.sqlite #+end_src *** guix host #+begin_src sh $ time guix index ;;; (package 515 "guix") guix index 11.54s user 2.22s system 87% cpu 15.693 total dagobah% sqlite3 ~/.cache/guix/index/db.sqlite SQLite version 3.39.3 2022-09-05 11:02:23 Enter ".help" for usage hints. 
sqlite> select count(*) from files; select count(*) from directories; select count(*) from packages; 152947 515 354 sqlite> .quit dagobah% ls -lah ~/.cache/guix/index/db.sqlite -rw-r--r-- 1 tony users 29M Dec 4 16:26 /home/tony/.cache/guix/index/db.sqlite #+end_src ============================================================ [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 877 bytes --] ^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: File search
  2022-12-04 16:35 ` Antoine R. Dumont (@ardumont)
@ 2022-12-06 10:01 ` Ludovic Courtès
  2022-12-06 12:59   ` zimoun
                     ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Ludovic Courtès @ 2022-12-06 10:01 UTC (permalink / raw)
  To: Antoine R. Dumont (@ardumont); +Cc: guix-devel

Howdy!

"Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> skribis:

> Please, find enclosed the latest implementation as a patch (somewhat vcs
> code ;). I've edited commits to mark Ludo as author with his
> started/amended implementations first [0] (that should be in the patch).

Nice!

> For information, I extracted some number from runs to compare our
> iterations (see the org-file attachment). The first iteration being
> "extracts packages from the store" and the second one "extracts packages
> from the system manifest". Those runs happened both on a guixified
> debian host and a raw guix host (more packages).

So we went from 413s to 11s (on the Guix System node) for only 6% fewer
files in the latter case? Do I get that right? That’s pretty cool.

The implementation based on manifests can of course miss packages, so
it’s a tradeoff. Purely local indexing will only find packages you
already have anyway, so eventually we’ll need a second mode that would
download a database.

BTW, I noticed outputs are not properly handled so far, as in this
example:

--8<---------------cut here---------------start------------->8---
$ GUIX_EXTENSIONS_PATH=$HOME/tmp/guix-index guix index search git-send-email
git@2.38.1 /gnu/store/g3lgyzr749l76qma7srycclgsm0f78iq-git-2.38.1-send-email/libexec/git-core/git-send-email
git@2.37.1 /gnu/store/n3hkzz5ydm0qm1c2jja2pwy2v19mq1k0-git-2.37.1-send-email/libexec/git-core/git-send-email
--8<---------------cut here---------------end--------------->8---

It should instead show “git@2.38.1:send-email”. We probably need an
‘output’ field in the ‘Packages’ table.

Also going forward we’ll need a schema version, as in:

--8<---------------cut here---------------start------------->8---
create table SchemaVersion (
  version integer not null
);
--8<---------------cut here---------------end--------------->8---

so that the tool can upgrade or discard databases that have the wrong
version.

Oh, and progress bars too. And a pony. :-)

Thanks for your work!

Ludo’.

^ permalink raw reply [flat|nested] 33+ messages in thread
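The schema-version idea in the message above could be sketched roughly as follows. This is only an illustration, not the code from the patch series: it assumes guile-sqlite3 is available, and the names `%expected-schema-version`, `read-schema-version` and `ensure-schema-version!` are hypothetical. Note that the column definition takes no semicolon inside the parentheses.

```scheme
(use-modules (sqlite3)
             (ice-9 match))

;; Hypothetical constant: the schema version this build of the tool expects.
(define %expected-schema-version 1)

(define (read-schema-version db)
  "Return the version recorded in DB, or #f if none has been recorded yet."
  (sqlite-exec db "\
create table if not exists SchemaVersion (
  version integer not null
);")
  (let* ((stmt (sqlite-prepare db "select version from SchemaVersion;"))
         (rows (sqlite-fold cons '() stmt)))
    (sqlite-finalize stmt)
    (match rows
      ((#(version) . _) version)          ;rows come back as vectors
      (() #f))))

(define (ensure-schema-version! db)
  "Stamp a fresh DB with %EXPECTED-SCHEMA-VERSION and return #t.  For an
existing DB, return #t if its version matches and #f otherwise, in which case
the caller should upgrade or discard the database."
  (match (read-schema-version db)
    (#f
     (sqlite-exec db
                  (string-append "insert into SchemaVersion values ("
                                 (number->string %expected-schema-version)
                                 ");"))
     #t)
    (version
     (= version %expected-schema-version))))
```

A caller would run `ensure-schema-version!` right after opening the database and, on `#f`, either apply migration steps or delete the file and re-index.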
* Re: File search
  2022-12-06 10:01 ` Ludovic Courtès
@ 2022-12-06 12:59 ` zimoun
  2022-12-06 18:27 ` (
  2022-12-09 10:05 ` Antoine R. Dumont (@ardumont)
  2 siblings, 0 replies; 33+ messages in thread
From: zimoun @ 2022-12-06 12:59 UTC (permalink / raw)
  To: Ludovic Courtès, Antoine R. Dumont (@ardumont); +Cc: guix-devel

Hi,

On Tue, 06 Dec 2022 at 11:01, Ludovic Courtès <ludo@gnu.org> wrote:
> "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com>
> skribis:
>
>> Please, find enclosed the latest implementation as a patch (somewhat vcs
>> code ;). I've edited commits to mark Ludo as author with his
>> started/amended implementations first [0] (that should be in the patch).

That’s cool!

> Also going forward we’ll need a schema version, as in:
>
> --8<---------------cut here---------------start------------->8---
> create table SchemaVersion (
>   version integer not null
> );
> --8<---------------cut here---------------end--------------->8---

Well, using plain SQLite as the backend will make querying more
complicated; for instance, “guix index output:doc gtk”. Maybe it would
be worth using guile-xapian as the backend.

Cheers,
simon

^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: File search
  2022-12-06 10:01 ` Ludovic Courtès
  2022-12-06 12:59 ` zimoun
@ 2022-12-06 18:27 ` (
  2022-12-08 15:41   ` Ludovic Courtès
  2022-12-09 10:05 ` Antoine R. Dumont (@ardumont)
  2 siblings, 1 reply; 33+ messages in thread
From: ( @ 2022-12-06 18:27 UTC (permalink / raw)
  To: Ludovic Courtès, Antoine R. Dumont (@ardumont); +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 986 bytes --]

On Tue Dec 6, 2022 at 10:01 AM GMT, Ludovic Courtès wrote:
> The implementation based on manifests can of course miss packages, so
> it’s a tradeoff. Purely local indexing will only find packages you
> already have anyway, so eventually we’ll need a second mode that would
> download a database.

Someone on IRC suggested that we use GraphQL to allow us to request
that a substitute server search for files in a database constructed
using all the packages that have been built.

I suggest something like this:

    # on the client
    when [built a package derivation]
      [add files to local database]

    when [user ran guix find]
      if [send find request to substitute server] didn't work
        [search through local database]
      [display result]

    # on the substitute server
    when [built a package definition]
      [add files to local database]

    when [user sent find request]
      [search through local database]
      [reply with result]

    -- (

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 659 bytes --]

^ permalink raw reply [flat|nested] 33+ messages in thread
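The client-side part of the pseudocode above ("ask the server, fall back to the local database") might look like this in Guile. It is a sketch only: `remote-search` and `local-search` are hypothetical one-argument procedures supplied by the caller, not existing Guix APIs, and the match string in the usage example is made up.

```scheme
;; REMOTE-SEARCH and LOCAL-SEARCH are hypothetical procedures that take a
;; file name and return a list of matches; neither exists in Guix.
(define (find-file file remote-search local-search)
  "Ask the substitute server for FILE first; if that request raises any
exception, search the local database instead."
  (catch #t
    (lambda () (remote-search file))
    (lambda _ (local-search file))))

;; Usage with stand-in procedures: the remote lookup fails, so the local
;; result is returned.
(find-file "boot-9.scm"
           (lambda (file) (error "server unreachable"))
           (lambda (file) '("guile")))
```

Passing the two lookups as arguments keeps the fallback policy separate from how either search is actually carried out.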
* Re: File search
  2022-12-06 18:27 ` (
@ 2022-12-08 15:41 ` Ludovic Courtès
  0 siblings, 0 replies; 33+ messages in thread
From: Ludovic Courtès @ 2022-12-08 15:41 UTC (permalink / raw)
  To: (; +Cc: Antoine R. Dumont (@ardumont), guix-devel

Hi,

"(" <paren@disroot.org> skribis:

> Someone on IRC suggested that we use GraphQL to allow us to request
> that a substitute server search for files in a database constructed
> using all the packages that have been built.
>
> I suggest something like this:
>
> # on the client
> when [built a package derivation]
>   [add files to local database]
>
> when [user ran guix find]
>   if [send find request to substitute server] didn't work
>     [search through local database]
>   [display result]

Upthread I started the discussion of criteria for file search:

  https://lists.gnu.org/archive/html/guix-devel/2022-01/msg00354.html

Among them, there’s privacy and support for off-line operation. To meet
these criteria, I think we should refrain from sending individual file
search requests to the server, and instead have the option of fetching a
full database.

Ludo’.

^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: File search
  2022-12-06 10:01 ` Ludovic Courtès
  2022-12-06 12:59 ` zimoun
  2022-12-06 18:27 ` (
@ 2022-12-09 10:05 ` Antoine R. Dumont (@ardumont)
  2022-12-09 18:05   ` zimoun
  2022-12-11 10:22   ` Ludovic Courtès
  2 siblings, 2 replies; 33+ messages in thread
From: Antoine R. Dumont (@ardumont) @ 2022-12-09 10:05 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 4343 bytes --]

Hello,

> So we went from 413s to 11s (on the Guix System node) for only 6% fewer
> files in the latter case? Do I get that right? That’s pretty cool.

Not a 6% loss, a bit more: only around half of the packages are detected
in the second round compared to the first. Here is the summary table [1]
(org-mode) that I should have sent to ease reading.

I don't have a better dataset to test against, so you tell me whether it
is still worth pushing in that direction ;)

[1]
|-----------+-------------+----------+----------|
| Iteration | Host System | Time (s) | Packages |
|-----------+-------------+----------+----------|
| 1st       | Debian      |   121.88 |      284 |
|           | Guix System |   413.55 |      749 |
|-----------+-------------+----------+----------|
| 2nd       | Debian      |      1.3 |      101 |
|           | Guix System |    11.54 |      354 |
|-----------+-------------+----------+----------|

> It should instead show “git@2.38.1:send-email”. We probably need an
> ‘output’ field in the ‘Packages’ table.

Why must that be "git@2.38.1:send-email", what does it mean?

Provided we continue in that direction (see first question above), I'll
check what I can do (I'm not sure how to properly do that just yet).

> Also going forward we’ll need a schema version, as in:

Yes, that I see how to do.

> Oh, and progress bars too.

I'm a bit unsettled on this. Hopefully it was mostly a joke ;)

If it's serious, how can we implement this, as we don't know in advance
how many packages we'll discover locally (if we don't change the current
approach, that is, to avoid incurring too much time penalty)?
And also, what implem should be used, I know the "pv" package provide some pipe utility for that but that's about it. Do you have some example in the guix codebase that does some progress bar already? > And a pony. :-) ;p Cheers, -- tony / Antoine R. Dumont (@ardumont) ----------------------------------------------------------------- gpg fingerprint BF00 203D 741A C9D5 46A8 BE07 52E2 E984 0D10 C3B8 Ludovic Courtès <ludo@gnu.org> writes: > Howdy! > > "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> > skribis: > >> Please, find enclosed the latest implementation as a patch (somewhat vcs >> code ;). I've edited commits to mark Ludo as author with his >> started/amended implementations first [0] (that should be in the patch). > > Nice! > >> For information, I extracted some number from runs to compare our >> iterations (see the org-file attachment). The first iteration being >> "extracts packages from the store" and the second one "extracts packages >> from the system manifest". Those runs happened both on a guixified >> debian host and a raw guix host (more packages). > > So we went from 413s to 11s (on the Guix System node) for only 6% fewer > files in the latter case? Do I get that right? That’s pretty cool. > > The implementation based on manifests can of course miss packages, so > it’s a tradeoff. Purely local indexing will only find packages you > already have anyway, so eventually we’ll need a second mode that would > download a database. 
> > BTW, I noticed outputs are not properly handled so far, as in this > example: > > --8<---------------cut here---------------start------------->8--- > $ GUIX_EXTENSIONS_PATH=$HOME/tmp/guix-index guix index search git-send-email > git@2.38.1 /gnu/store/g3lgyzr749l76qma7srycclgsm0f78iq-git-2.38.1-send-email/libexec/git-core/git-send-email > git@2.37.1 /gnu/store/n3hkzz5ydm0qm1c2jja2pwy2v19mq1k0-git-2.37.1-send-email/libexec/git-core/git-send-email > --8<---------------cut here---------------end--------------->8--- > > It should instead show “git@2.38.1:send-email”. We probably need an > ‘output’ field in the ‘Packages’ table. > > Also going forward we’ll need a schema version, as in: > > --8<---------------cut here---------------start------------->8--- > create table SchemaVersion ( > version integer not null; > ); > --8<---------------cut here---------------end--------------->8--- > > so that the tool can upgrade or discard databases that have the wrong > version. > > Oh, and progress bars too. And a pony. :-) > > Thanks for your work! > > Ludo’. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 873 bytes --] ^ permalink raw reply [flat|nested] 33+ messages in thread
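The "6%" and "around half" figures in this exchange measure different things: the first is the drop in indexed files, the second the drop in indexed packages. A small Guile calculation over the counts reported in the org-file metrics (Guix System node: 162035 vs. 152947 files, 749 vs. 354 packages) makes that explicit. The helper name `percent-fewer` is only for illustration.

```scheme
(define (percent-fewer before after)
  "Return how much smaller AFTER is than BEFORE, as a percentage of BEFORE."
  (* 100. (/ (- before after) before)))

;; Guix System node, store-based (1st) vs. manifest-based (2nd) indexing.
(percent-fewer 162035 152947)   ;files: about 5.6
(percent-fewer 749 354)         ;packages: about 52.7
(/ 413.55 11.54)                ;user-time ratio: about 35.8

;; Guixified Debian host, same comparison.
(percent-fewer 50913 34339)     ;files: about 32.6
(percent-fewer 284 101)         ;packages: about 64.4
```

So on the Guix System node the manifest-based run indexes roughly half as many packages while still covering about 94% of the files.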
* Re: File search
  2022-12-09 10:05 ` Antoine R. Dumont (@ardumont)
@ 2022-12-09 18:05 ` zimoun
  2022-12-11 10:22 ` Ludovic Courtès
  1 sibling, 0 replies; 33+ messages in thread
From: zimoun @ 2022-12-09 18:05 UTC (permalink / raw)
  To: Antoine R. Dumont (@ardumont), Ludovic Courtès; +Cc: guix-devel

Hi Antoine,

Cool! I have not really looked yet. Just a minor answer to one of your
questions. :-)

On Fri, 09 Dec 2022 at 11:05, "Antoine R. Dumont (@ardumont)" <ardumont@softwareheritage.org> wrote:

>> It should instead show “git@2.38.1:send-email”. We probably need an
>> ‘output’ field in the ‘Packages’ table.
>
> Why must that be "git@2.38.1:send-email", what does it mean?

It is about outputs [1]. Some packages have more than one output. For
instance,

--8<---------------cut here---------------start------------->8---
$ guix show git | recsel -p outputs
outputs:
+ send-email: [description missing]
+ svn: [description missing]
+ credential-netrc: [description missing]
+ credential-libsecret: [description missing]
+ subtree: [description missing]
+ gui: [description missing]
+ out: everything else
--8<---------------cut here---------------end--------------->8---

It means that “git:send-email” provides “git send-email”, not
necessarily Git itself.

1: https://guix.gnu.org/manual/devel/en/guix.html#Packages-with-Multiple-Outputs

Cheers,
simon

^ permalink raw reply [flat|nested] 33+ messages in thread
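To make the `name@version:output` notation concrete, here is a small sketch of how search results could be labelled once the output name is stored alongside each package. The procedure `package-spec` is hypothetical; in the manifest-based indexer the output name would presumably come from the manifest entry, which is an assumption about a future change rather than something the patch does today.

```scheme
;; Hypothetical helper, not part of the patch.
(define* (package-spec name version #:optional (output "out"))
  "Return NAME@VERSION, with :OUTPUT appended unless OUTPUT is the default
\"out\" output."
  (if (string=? output "out")
      (string-append name "@" version)
      (string-append name "@" version ":" output)))

(package-spec "git" "2.38.1" "send-email")   ;"git@2.38.1:send-email"
(package-spec "git" "2.38.1")                ;"git@2.38.1"
```

Leaving off `:out` mirrors how Guix package specifications are usually written, where the default output is implied.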
* Re: File search
  2022-12-09 10:05 ` Antoine R. Dumont (@ardumont)
  2022-12-09 18:05 ` zimoun
@ 2022-12-11 10:22 ` Ludovic Courtès
  2022-12-15 17:03   ` Antoine R. Dumont (@ardumont)
  1 sibling, 1 reply; 33+ messages in thread
From: Ludovic Courtès @ 2022-12-11 10:22 UTC (permalink / raw)
  To: Antoine R. Dumont (@ardumont); +Cc: guix-devel

Hi!

"Antoine R. Dumont (@ardumont)" <ardumont@softwareheritage.org> skribis:

> |-----------+-------------+----------+----------|
> | Iteration | Host System | Time (s) | Packages |
> |-----------+-------------+----------+----------|
> | 1st       | Debian      |   121.88 |      284 |
> |           | Guix System |   413.55 |      749 |
> |-----------+-------------+----------+----------|
> | 2nd       | Debian      |      1.3 |      101 |
> |           | Guix System |    11.54 |      354 |
> |-----------+-------------+----------+----------|

Ah, that’s a significant difference.

I guess we can keep both methods: the exhaustive one that goes over all
packages, and the quick one. Then we can have a switch to select the
method.

Typically, we may want to use the expensive one on the build farm to
publish a full database, while on users’ machines we may want to default
to the cheaper one.

>> Oh, and progress bars too.
>
> I'm a bit unsettled on this. Hopefully it was mostly a joke ;)

It wasn’t. :-)

In the manifest case, ‘all-profiles’ is almost instantaneous, so we
immediately know the number of manifests we’ll be working on.

In the package case, the number of packages is known ahead.

The (guix progress) module provides helpers.

But anyway, that’s more like icing on the cake, we can leave that for
later.

Thanks,
Ludo’.

^ permalink raw reply [flat|nested] 33+ messages in thread
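As a rough illustration of the (guix progress) suggestion above: once the list of items to index is known, its length can seed a bar reporter. This sketch assumes it runs where Guix modules are on the load path (for example under `guix repl`); `for-each-with-progress` is a hypothetical helper, not something from the patch.

```scheme
(use-modules (guix progress))

;; Hypothetical helper: wrap a plain for-each in a progress bar.
(define (for-each-with-progress proc items)
  "Call PROC on each element of ITEMS, advancing a progress bar by one step
after each call."
  (let ((reporter (progress-reporter/bar (length items))))
    (call-with-progress-reporter reporter
      (lambda (report)
        (for-each (lambda (item)
                    (proc item)
                    (report))       ;one step done
                  items)))))
```

In the manifest case, the items would be the deduplicated manifest entries and PROC would insert one entry into the database.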
* Re: File search
  2022-12-11 10:22 ` Ludovic Courtès
@ 2022-12-15 17:03 ` Antoine R. Dumont (@ardumont)
  2022-12-19 21:25   ` Ludovic Courtès
  0 siblings, 1 reply; 33+ messages in thread
From: Antoine R. Dumont (@ardumont) @ 2022-12-15 17:03 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

[-- Attachment #1.1: Type: text/plain, Size: 4802 bytes --]

Hello,

As mentioned last week (on irc), I've improved the implementation a bit
as per the last discussions in the thread. Please, find enclosed the
patch with those changes (hopefully a tad better attached than last
time...).

Here is the rough changelog:

- The local db cache is now versioned. Migration will transparently
  happen for users on each index command call (if need be).

  Note: Care should be taken by devs to provide the migration script
  step for each db schema bump (examples inside). The rest will happen
  on its own.

- The cli parsing got rewritten to be more flexible (inspired from
  existing code from guix, notably `guix home`).

- We can now choose the indexation method using the
  `--with-method={store|manifests}` flag. The "manifests" method is the
  default (see the help message for more details).

- Finally, the indexation methods are displayed using a progress bar.

Heads up, I did not yet address the "output" part. Thanks @zimoun for
the clarification btw ;)

> In the package case, the number of packages is known ahead.

@civodul For the index 'store' implementation, ^ I did not find that
information. So, as a costly implementation detail, I'm folding over all
packages first to know the total number of packages (for the progress
bar). And then another round trip to actually do the insert. I don't
like it at all. Plus, that number (21696 packages) seems off to me
compared to the number of packages indexed (522).

So, if you could have a rapid look to fix or tell me what's wrong,
that'd be great. I'm pretty sure it will hit you immediately (while i
still do not find it ¯\_(ツ)_/¯ ;).
---- Here is a rapid sample of the current command usage: --8<---------------cut here---------------end--------------->8--- $ guix index --version Extension local cache database: - path: /home/tony/.cache/guix/index/db.sqlite - version: 2 guix index (GNU Guix) 5ccb5837ccfb39af4e3e6399a0124997a187beb1 Copyright (C) 2022 the Guix authors License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. $ guix index --with-method=store --db-path=/tmp/db/db.sqlite Registering 21696 packages [######## ] $ guix index search git ... git-minimal@2.38.1 /gnu/store/xf734fz3jihgc5x4979ypyaxn8aday1k-git-minimal-2.38.1/bin/git git@2.38.1 /gnu/store/wx965ym3c5fwbcdp7i9lvzad3479vv7m-git-2.38.1/libexec/git-core/git git@2.38.1 /gnu/store/wx965ym3c5fwbcdp7i9lvzad3479vv7m-git-2.38.1/etc/bash_completion.d/git git@2.38.1 /gnu/store/wx965ym3c5fwbcdp7i9lvzad3479vv7m-git-2.38.1/bin/git $ guix index --help Usage: guix index [OPTIONS...] [search FILE...] Without argument, indexes (package, file) relationships from the machine. This allows indexation with 2 methods: - out of the local manifests. This is the fastest implementation but this indexes less packages. That'd be typically the use case for user local indexation. - out of the local store. This is slower due to implementation details (it discusses with the store daemon for one). That'd be typically the use case for building the largest db in one of the build farm node. With 'search FILE', search for packages installing FILE. Note: Internal cache is located at ~/.cache/guix/index/db.sqlite by default. See --db-path for customization. The valid values for OPTIONS are: -h, --help Display this help and exit -V, --version Display version information and exit --db-path=DIR Change default location of the cache db --with-method=METH Change default indexation method. 
By default it uses the local "manifests" (faster). It can also uses the local "store" (slower, typically on the farm build ci). The valid values for ARGS are: search FILE Search for packages installing the FILE (from cache db) <EMPTY> Without any argument, it index packages. This fills in the db cache using whatever indexation method is defined. Report bugs to: bug-guix@gnu.org. GNU Guix home page: <https://guix.gnu.org> General help using Guix and GNU software: <https://guix.gnu.org/en/help/> --8<---------------cut here---------------end--------------->8--- Hope you'll find it mostly to your taste! Note: I gather we'll rework the commits at some point (when it's ready) so I did not bother too much right now. Cheers, -- tony / Antoine R. Dumont (@ardumont) ----------------------------------------------------------------- gpg fingerprint BF00 203D 741A C9D5 46A8 BE07 52E2 E984 0D10 C3B8 [-- Attachment #1.2: add-extension-guix-index.patch --] [-- Type: text/x-diff, Size: 96284 bytes --] From d3e658ca1e3ce2715e25450b794d139d3417c74c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo@gnu.org> Date: Wed, 30 Nov 2022 15:25:21 +0100 Subject: [PATCH 01/25] extensions-index: Add initial implementation from civodul Related to https://lists.gnu.org/archive/html/guix-devel/2022-01/msg00354.html --- guix/extensions/file-database.scm | 199 ++++++++++++++++++++++++++++++ 1 file changed, 199 insertions(+) create mode 100644 guix/extensions/file-database.scm diff --git a/guix/extensions/file-database.scm b/guix/extensions/file-database.scm new file mode 100644 index 0000000000..83aafbc554 --- /dev/null +++ b/guix/extensions/file-database.scm @@ -0,0 +1,199 @@ +;;; GNU Guix --- Functional package management for GNU +;;; Copyright © 2022 Ludovic Courtès <ludo@gnu.org> +;;; +;;; This file is part of GNU Guix. 
+;;; +;;; GNU Guix is free software; you can redistribute it and/or modify it +;;; under the terms of the GNU General Public License as published by +;;; the Free Software Foundation; either version 3 of the License, or (at +;;; your option) any later version. +;;; +;;; GNU Guix is distributed in the hope that it will be useful, but +;;; WITHOUT ANY WARRANTY; without even the implied warranty of +;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;;; GNU General Public License for more details. +;;; +;;; You should have received a copy of the GNU General Public License +;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. + +(define-module (file-database) + #:use-module (sqlite3) + #:use-module (ice-9 match) + #:use-module (guix store) + #:use-module (guix monads) + #:autoload (guix grafts) (%graft?) + #:use-module (guix derivations) + #:use-module (guix packages) + #:autoload (guix build utils) (find-files) + #:autoload (gnu packages) (fold-packages) + #:use-module (srfi srfi-1) + #:use-module (srfi srfi-9) + #:export (file-database)) + +(define schema + " +create table if not exists Packages ( + id integer primary key autoincrement not null, + name text not null, + version text not null +); + +create table if not exists Directories ( + id integer primary key autoincrement not null, + name text not null, + package integer not null, + foreign key (package) references Packages(id) on delete cascade +); + +create table if not exists Files ( + name text not null, + basename text not null, + directory integer not null, + foreign key (directory) references Directories(id) on delete cascade +); + +create index if not exists IndexFiles on Files(basename);") + +(define (call-with-database file proc) + (let ((db (sqlite-open file))) + (dynamic-wind + (lambda () #t) + (lambda () + (sqlite-exec db schema) + (proc db)) + (lambda () + (sqlite-close db))))) + +(define (insert-files db package version directories) + "Insert the files contained in 
DIRECTORIES as belonging to PACKAGE at +VERSION." + (define last-row-id-stmt + (sqlite-prepare db "SELECT last_insert_rowid();" + #:cache? #t)) + + (define package-stmt + (sqlite-prepare db "\ +INSERT OR REPLACE INTO Packages(name, version) +VALUES (:name, :version);" + #:cache? #t)) + + (define directory-stmt + (sqlite-prepare db "\ +INSERT INTO Directories(name, package) VALUES (:name, :package);" + #:cache? #t)) + + (define file-stmt + (sqlite-prepare db "\ +INSERT INTO Files(name, basename, directory) +VALUES (:name, :basename, :directory);" + #:cache? #t)) + + (sqlite-exec db "begin immediate;") + (sqlite-bind-arguments package-stmt + #:name package + #:version version) + (sqlite-fold (const #t) #t package-stmt) + (match (sqlite-fold cons '() last-row-id-stmt) + ((#(package-id)) + (pk 'package package-id package) + (for-each (lambda (directory) + (define (strip file) + (string-drop file (+ (string-length directory) 1))) + + (sqlite-reset directory-stmt) + (sqlite-bind-arguments directory-stmt + #:name directory + #:package package-id) + (sqlite-fold (const #t) #t directory-stmt) + + (match (sqlite-fold cons '() last-row-id-stmt) + ((#(directory-id)) + (for-each (lambda (file) + ;; If DIRECTORY is a symlink, (find-files + ;; DIRECTORY) returns the DIRECTORY singleton. + (unless (string=? file directory) + (sqlite-reset file-stmt) + (sqlite-bind-arguments file-stmt + #:name (strip file) + #:basename + (basename file) + #:directory + directory-id) + (sqlite-fold (const #t) #t file-stmt))) + (find-files directory))))) + directories) + (sqlite-exec db "commit;")))) + +(define (insert-package db package) + "Insert all the files of PACKAGE into DB." + (mlet %store-monad ((drv (package->derivation package #:graft? #f))) + (match (derivation->output-paths drv) + (((labels . directories) ...) + (when (every file-exists? 
directories) + (insert-files db (package-name package) (package-version package) + directories)) + (return #t))))) + +(define (insert-packages db) + "Insert all the current packages into DB." + (with-store store + (parameterize ((%graft? #f)) + (fold-packages (lambda (package _) + (run-with-store store + (insert-package db package))) + #t + #:select? (lambda (package) + (and (not (hidden-package? package)) + (not (package-superseded package)) + (supported-package? package))))))) + +(define-record-type <package-match> + (package-match name version file) + package-match? + (name package-match-name) + (version package-match-version) + (file package-match-file)) + +(define (matching-packages db file) + "Return a list of <package-match> corresponding to packages containing +FILE." + (define lookup-stmt + (sqlite-prepare db "\ +SELECT Packages.name, Packages.version, Directories.name, Files.name +FROM Packages +INNER JOIN Files, Directories +ON files.basename = :file AND directories.id = files.directory AND packages.id = directories.package;")) + + (sqlite-bind-arguments lookup-stmt #:file file) + (sqlite-fold (lambda (result lst) + (match result + (#(package version directory file) + (cons (package-match package version + (string-append directory "/" file)) + lst)))) + '() lookup-stmt)) + +\f +(define (file-database . args) + (match args + ((_ "populate") + (call-with-database "/tmp/db" + (lambda (db) + (insert-packages db)))) + ((_ "search" file) + (let ((matches (call-with-database "/tmp/db" + (lambda (db) + (matching-packages db file))))) + (for-each (lambda (result) + (format #t "~20a ~a~%" + (string-append (package-match-name result) + "@" (package-match-version result)) + (package-match-file result))) + matches) + (exit (pair? 
matches)))) + (_ + (format (current-error-port) + "usage: file-database [populate|search] args ...~%") + (exit 1)))) + +(apply file-database (command-line)) -- 2.38.1 From d9139cc86c26f76bc66f7d82868ebf6a03605f76 Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Thu, 1 Dec 2022 13:36:28 +0100 Subject: [PATCH 02/25] extensions-index: Transform command into `guix locate` extension --- .../{file-database.scm => locate.scm} | 58 ++++++++++++------- 1 file changed, 36 insertions(+), 22 deletions(-) rename guix/extensions/{file-database.scm => locate.scm} (82%) diff --git a/guix/extensions/file-database.scm b/guix/extensions/locate.scm similarity index 82% rename from guix/extensions/file-database.scm rename to guix/extensions/locate.scm index 83aafbc554..1e42f5bad8 100644 --- a/guix/extensions/file-database.scm +++ b/guix/extensions/locate.scm @@ -16,7 +16,8 @@ ;;; You should have received a copy of the GNU General Public License ;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. -(define-module (file-database) +(define-module (guix extensions locate) + #:use-module (guix scripts) #:use-module (sqlite3) #:use-module (ice-9 match) #:use-module (guix store) @@ -28,7 +29,7 @@ (define-module (file-database) #:autoload (gnu packages) (fold-packages) #:use-module (srfi srfi-1) #:use-module (srfi srfi-9) - #:export (file-database)) + #:export (guix-locate)) (define schema " @@ -155,8 +156,7 @@ (define-record-type <package-match> (file package-match-file)) (define (matching-packages db file) - "Return a list of <package-match> corresponding to packages containing -FILE." + "Return list of <package-match> corresponding to packages containing FILE." (define lookup-stmt (sqlite-prepare db "\ SELECT Packages.name, Packages.version, Directories.name, Files.name @@ -174,26 +174,40 @@ (define lookup-stmt '() lookup-stmt)) \f -(define (file-database . 
args) + +(define (index-packages-with-db db-pathname) + "Index packages using db at location DB-PATHNAME." + (call-with-database db-pathname + (lambda (db) + (insert-packages db)))) + +(define (matching-packages-with-db db-pathname file) + "Compute list of packages referencing FILE using db at DB-PATHNAME." + (call-with-database db-pathname + (lambda (db) + (matching-packages db file)))) + +(define (print-matching-results matches) + "Print the MATCHES matching results." + (for-each (lambda (result) + (format #t "~20a ~a~%" + (string-append (package-match-name result) + "@" (package-match-version result)) + (package-match-file result))) + matches)) + +(define-command (guix-locate . args) + (category extension) + (synopsis "Index packages then search what package declares a given file") (match args - ((_ "populate") - (call-with-database "/tmp/db" - (lambda (db) - (insert-packages db)))) - ((_ "search" file) - (let ((matches (call-with-database "/tmp/db" - (lambda (db) - (matching-packages db file))))) - (for-each (lambda (result) - (format #t "~20a ~a~%" - (string-append (package-match-name result) - "@" (package-match-version result)) - (package-match-file result))) - matches) + (("index") + (index-packages-with-db "/tmp/db")) + (("search" file) + (let ((matches (matching-packages-with-db "/tmp/db" file))) + (print-matching-results matches) (exit (pair? matches)))) (_ (format (current-error-port) - "usage: file-database [populate|search] args ...~%") + "usage: guix locate [index|search] args ...~% ~a" + args) (exit 1)))) - -(apply file-database (command-line)) -- 2.38.1 From eb474f3412ba19320dceda7d08c7f960d00cb898 Mon Sep 17 00:00:00 2001 From: "Antoine R. 
Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Thu, 1 Dec 2022 13:45:59 +0100 Subject: [PATCH 03/25] extensions-index: Avoid duplicating the hard-coded db path --- guix/extensions/locate.scm | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/guix/extensions/locate.scm b/guix/extensions/locate.scm index 1e42f5bad8..830dfc49fb 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/locate.scm @@ -196,14 +196,18 @@ (define (print-matching-results matches) (package-match-file result))) matches)) +;; TODO: Determine the current guile/guix mechanism to provide configuration +;; for this +(define default-location-db-path "/tmp/db") + (define-command (guix-locate . args) (category extension) (synopsis "Index packages then search what package declares a given file") (match args (("index") - (index-packages-with-db "/tmp/db")) + (index-packages-with-db default-location-db-path)) (("search" file) - (let ((matches (matching-packages-with-db "/tmp/db" file))) + (let ((matches (matching-packages-with-db default-location-db-path file))) (print-matching-results matches) (exit (pair? matches)))) (_ -- 2.38.1 From 309ecd5d5b7cdff012b66cbe9643c34725b22a2d Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Thu, 1 Dec 2022 13:47:19 +0100 Subject: [PATCH 04/25] extensions-index: Deduplicate lookup matching results --- guix/extensions/locate.scm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/guix/extensions/locate.scm b/guix/extensions/locate.scm index 830dfc49fb..ab0a0403ec 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/locate.scm @@ -156,10 +156,10 @@ (define-record-type <package-match> (file package-match-file)) (define (matching-packages db file) - "Return list of <package-match> corresponding to packages containing FILE." + "Return unique <package-match> corresponding to packages containing FILE." 
(define lookup-stmt (sqlite-prepare db "\ -SELECT Packages.name, Packages.version, Directories.name, Files.name +SELECT DISTINCT Packages.name, Packages.version, Directories.name, Files.name FROM Packages INNER JOIN Files, Directories ON files.basename = :file AND directories.id = files.directory AND packages.id = directories.package;")) -- 2.38.1 From 541615ab6638b1fb418531f961cfb6756b41499b Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Fri, 2 Dec 2022 14:09:52 +0100 Subject: [PATCH 05/25] extensions-index: Make insertion queries idempotent Prior to this, multiple runs of the index subcommand would append the same packages, directories or files in the db. --- guix/extensions/locate.scm | 71 ++++++++++++++++++++++++-------------- 1 file changed, 45 insertions(+), 26 deletions(-) diff --git a/guix/extensions/locate.scm b/guix/extensions/locate.scm index ab0a0403ec..ce8306531f 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/locate.scm @@ -36,14 +36,16 @@ (define schema create table if not exists Packages ( id integer primary key autoincrement not null, name text not null, - version text not null + version text not null, + unique (name, version) -- add uniqueness constraint ); create table if not exists Directories ( id integer primary key autoincrement not null, name text not null, package integer not null, - foreign key (package) references Packages(id) on delete cascade + foreign key (package) references Packages(id) on delete cascade, + unique (name, package) -- add uniqueness constraint ); create table if not exists Files ( @@ -51,6 +53,7 @@ (define schema basename text not null, directory integer not null, foreign key (directory) references Directories(id) on delete cascade + unique (name, basename, directory) -- add uniqueness constraint ); create index if not exists IndexFiles on Files(basename);") @@ -66,64 +69,78 @@ (define (call-with-database file proc) (sqlite-close db))))) (define 
(insert-files db package version directories) - "Insert the files contained in DIRECTORIES as belonging to PACKAGE at -VERSION." - (define last-row-id-stmt - (sqlite-prepare db "SELECT last_insert_rowid();" + "Insert files from DIRECTORIES as belonging to PACKAGE at VERSION." + (define stmt-select-package + (sqlite-prepare db "\ +SELECT id FROM Packages WHERE name = :name AND version = :version;" #:cache? #t)) - (define package-stmt + (define stmt-insert-package (sqlite-prepare db "\ -INSERT OR REPLACE INTO Packages(name, version) +INSERT OR IGNORE INTO Packages(name, version) -- to avoid spurious writes VALUES (:name, :version);" #:cache? #t)) - (define directory-stmt + (define stmt-select-directory (sqlite-prepare db "\ -INSERT INTO Directories(name, package) VALUES (:name, :package);" +SELECT id FROM Directories WHERE name = :name AND package = :package;" #:cache? #t)) - (define file-stmt + (define stmt-insert-directory (sqlite-prepare db "\ -INSERT INTO Files(name, basename, directory) +INSERT OR IGNORE INTO Directories(name, package) -- to avoid spurious writes +VALUES (:name, :package);" + #:cache? #t)) + + (define stmt-insert-file + (sqlite-prepare db "\ +INSERT OR IGNORE INTO Files(name, basename, directory) VALUES (:name, :basename, :directory);" #:cache? 
#t)) (sqlite-exec db "begin immediate;") - (sqlite-bind-arguments package-stmt + (sqlite-bind-arguments stmt-insert-package #:name package #:version version) - (sqlite-fold (const #t) #t package-stmt) - (match (sqlite-fold cons '() last-row-id-stmt) + (sqlite-fold (const #t) #t stmt-insert-package) + + (sqlite-bind-arguments stmt-select-package + #:name package + #:version version) + (match (sqlite-fold cons '() stmt-select-package) ((#(package-id)) (pk 'package package-id package) (for-each (lambda (directory) (define (strip file) (string-drop file (+ (string-length directory) 1))) - (sqlite-reset directory-stmt) - (sqlite-bind-arguments directory-stmt + (sqlite-reset stmt-insert-directory) + (sqlite-bind-arguments stmt-insert-directory #:name directory #:package package-id) - (sqlite-fold (const #t) #t directory-stmt) + (sqlite-fold (const #t) #t stmt-insert-directory) - (match (sqlite-fold cons '() last-row-id-stmt) + (sqlite-reset stmt-select-directory) + (sqlite-bind-arguments stmt-select-directory + #:name directory + #:package package-id) + (match (sqlite-fold cons '() stmt-select-directory) ((#(directory-id)) (for-each (lambda (file) ;; If DIRECTORY is a symlink, (find-files ;; DIRECTORY) returns the DIRECTORY singleton. (unless (string=? file directory) - (sqlite-reset file-stmt) - (sqlite-bind-arguments file-stmt + (sqlite-reset stmt-insert-file) + (sqlite-bind-arguments stmt-insert-file #:name (strip file) #:basename (basename file) #:directory directory-id) - (sqlite-fold (const #t) #t file-stmt))) + (sqlite-fold (const #t) #t stmt-insert-file))) (find-files directory))))) - directories) - (sqlite-exec db "commit;")))) + directories))) + (sqlite-exec db "commit;")) (define (insert-package db package) "Insert all the files of PACKAGE into DB." @@ -159,10 +176,12 @@ (define (matching-packages db file) "Return unique <package-match> corresponding to packages containing FILE." 
(define lookup-stmt (sqlite-prepare db "\ -SELECT DISTINCT Packages.name, Packages.version, Directories.name, Files.name +SELECT Packages.name, Packages.version, Directories.name, Files.name FROM Packages INNER JOIN Files, Directories -ON files.basename = :file AND directories.id = files.directory AND packages.id = directories.package;")) +ON files.basename = :file + AND directories.id = files.directory + AND packages.id = directories.package;")) (sqlite-bind-arguments lookup-stmt #:file file) (sqlite-fold (lambda (result lst) -- 2.38.1 From 09d5f6b30ac24a8e8261994a1011ddd13082a4bb Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Fri, 2 Dec 2022 14:10:59 +0100 Subject: [PATCH 06/25] extensions-index: Add debug statement This is conditional on the top-level debug module variable, which is false by default. --- guix/extensions/locate.scm | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/guix/extensions/locate.scm b/guix/extensions/locate.scm index ce8306531f..3b43ea887e 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/locate.scm @@ -31,6 +31,8 @@ (define-module (guix extensions locate) #:use-module (srfi srfi-9) #:export (guix-locate)) +(define debug #f) + (define schema " create table if not exists Packages ( @@ -109,6 +111,9 @@ (define stmt-insert-file #:version version) (match (sqlite-fold cons '() stmt-select-package) ((#(package-id)) + (when debug + (format #t "(pkg, version, pkg-id): (~a, ~a, ~a)" + package version package-id)) (pk 'package package-id package) (for-each (lambda (directory) (define (strip file) @@ -126,6 +131,9 @@ (define (strip file) #:package package-id) (match (sqlite-fold cons '() stmt-select-directory) ((#(directory-id)) + (when debug + (format #t "(name, package, dir-id): (~a, ~a, ~a)\n" + directory package-id directory-id)) (for-each (lambda (file) ;; If DIRECTORY is a symlink, (find-files ;; DIRECTORY) returns the DIRECTORY singleton. 
-- 2.38.1 From b50267e3d24162cd8c3908bbaa841d13363621e9 Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Fri, 2 Dec 2022 14:11:50 +0100 Subject: [PATCH 07/25] extensions-index: Play around with the package filtering functions This keeps the default behavior but lets the developer change it to determine the best policy. --- guix/extensions/locate.scm | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/guix/extensions/locate.scm b/guix/extensions/locate.scm index 3b43ea887e..9679d643a6 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/locate.scm @@ -160,18 +160,27 @@ (define (insert-package db package) directories)) (return #t))))) -(define (insert-packages db) - "Insert all the current packages into DB." +(define (filter-public-current-supported package) + "Filter supported, not hidden (public) and not superseded (current) package." + (and (not (hidden-package? package)) + (not (package-superseded package)) + (supported-package? package))) + +(define (filter-supported-package package) + "Filter supported package (package might be hidden or superseded)." + (and (supported-package? package))) + +(define (no-filter package) "No filtering on package" #t) + +(define* (insert-packages db #:optional (filter-policy filter-public-current-supported)) + "Insert all current packages matching `filter-package-policy` into DB." (with-store store (parameterize ((%graft? #f)) (fold-packages (lambda (package _) (run-with-store store (insert-package db package))) #t - #:select? (lambda (package) - (and (not (hidden-package? package)) - (not (package-superseded package)) - (supported-package? package))))))) + #:select? filter-policy)))) (define-record-type <package-match> (package-match name version file) @@ -206,7 +215,7 @@ (define (index-packages-with-db db-pathname) "Index packages using db at location DB-PATHNAME." 
(call-with-database db-pathname (lambda (db) - (insert-packages db)))) + (insert-packages db no-filter)))) (define (matching-packages-with-db db-pathname file) "Compute list of packages referencing FILE using db at DB-PATHNAME." -- 2.38.1 From 3b5c765fc967cef1d6919b66acc2d7872ea1e48c Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Fri, 2 Dec 2022 15:19:24 +0100 Subject: [PATCH 08/25] extensions-index: Install db in ~/.config/guix/locate-db.sqlite --- guix/extensions/locate.scm | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/guix/extensions/locate.scm b/guix/extensions/locate.scm index 9679d643a6..7d19e64a07 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/locate.scm @@ -232,9 +232,12 @@ (define (print-matching-results matches) (package-match-file result))) matches)) -;; TODO: Determine the current guile/guix mechanism to provide configuration -;; for this -(define default-location-db-path "/tmp/db") +(define default-location-db-path + (let ((local-config-path + (and=> (getenv "HOME") + (lambda (home) + (string-append home "/.config/guix/"))))) + (string-append local-config-path "locate-db.sqlite"))) (define-command (guix-locate . args) (category extension) -- 2.38.1 From f101d12acf05c82cf9678d1cffec76cceba9e845 Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Fri, 2 Dec 2022 17:58:18 +0100 Subject: [PATCH 09/25] extensions-index: Improve cli parsing This unifies with some existing guix commands (import). --- guix/extensions/locate.scm | 80 +++++++++++++++++++++++++++++++++----- 1 file changed, 71 insertions(+), 9 deletions(-) diff --git a/guix/extensions/locate.scm b/guix/extensions/locate.scm index 7d19e64a07..630560b231 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/locate.scm @@ -17,9 +17,12 @@ ;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. 
(define-module (guix extensions locate) + #:use-module (guix config) ;; %guix-package-name, ... + #:use-module (guix ui) ;; display G_ #:use-module (guix scripts) #:use-module (sqlite3) #:use-module (ice-9 match) + #:use-module (guix describe) #:use-module (guix store) #:use-module (guix monads) #:autoload (guix grafts) (%graft?) @@ -232,25 +235,84 @@ (define (print-matching-results matches) (package-match-file result))) matches)) -(define default-location-db-path +(define default-db-path (let ((local-config-path (and=> (getenv "HOME") (lambda (home) (string-append home "/.config/guix/"))))) (string-append local-config-path "locate-db.sqlite"))) +(define (show-bug-report-information) + ;; TRANSLATORS: The placeholder indicates the bug-reporting address for this + ;; package. Please add another line saying "Report translation bugs to + ;; ...\n" with the address for translation bugs (typically your translation + ;; team's web or email address). + (format #t (G_ " +Report bugs to: ~a.") %guix-bug-report-address) + (format #t (G_ " +~a home page: <~a>") %guix-package-name %guix-home-page-url) + (format #t (G_ " +General help using Guix and GNU software: <~a>") + ;; TRANSLATORS: Change the "/en" bit of this URL appropriately if + ;; the web site is translated in your language. + (G_ "https://guix.gnu.org/en/help/")) + (newline)) + +(define (show-help) + (display (G_ "Usage: guix locate [OPTIONS...] [ARGS...] +Index packages and search what package declares a given file.\n +By default, the local cache db is located in ~/.config/guix/locate-db.sqlite. 
+See --db-path for customization.")) + (display (G_ " + index Index current packages from the local store (in cache db)")) + (display (G_ " + search FILE Search for packages that declares FILE (from cache db)")) + (newline) + (display (G_ " + --db-path=DIR Change default location of the cache db")) + (newline) + (display (G_ " + -h, --help Display this help and exit")) + (display (G_ " + -V, --version Display version information and exit")) + (newline) + (show-bug-report-information)) + (define-command (guix-locate . args) (category extension) - (synopsis "Index packages then search what package declares a given file") + (synopsis "Index packages to allow searching package for a given filename") + + (define (parse-db-args args) + "Parsing of string key=value where we are only interested in 'value'" + (match (string-split args #\=) + ((unused db-path) + db-path) + (_ #f))) + + (define (display-help-and-exit) + (show-help) + (exit 0)) + (match args + ((or ("-h") ("--help") ()) + (display-help-and-exit)) + ((or ("-V") ("--version")) + (show-version-and-exit "guix locate")) + ((db-path-args "index") + (let ((db-path (parse-db-args db-path-args))) + (if db-path + (index-packages-with-db db-path) + (display-help-and-exit)))) (("index") - (index-packages-with-db default-location-db-path)) + (index-packages-with-db default-db-path)) (("search" file) - (let ((matches (matching-packages-with-db default-location-db-path file))) + (let ((matches (matching-packages-with-db default-db-path file))) (print-matching-results matches) (exit (pair? matches)))) - (_ - (format (current-error-port) - "usage: guix locate [index|search] args ...~% ~a" - args) - (exit 1)))) + ((db-path-args "search" file) + (let ((db-path (parse-db-args db-path-args))) + (if db-path + (let ((matches (matching-packages-with-db db-path file))) + (print-matching-results matches) + (exit (pair? 
matches))) + (display-help-and-exit)))))) -- 2.38.1 From 9cb0826a71bdada345de100d98e9b44f3503a75a Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Fri, 2 Dec 2022 19:13:46 +0100 Subject: [PATCH 10/25] extensions-index: Improve cli options and help message This also renames the cli from locate to index. --- guix/extensions/{locate.scm => index.scm} | 40 +++++++++++++---------- 1 file changed, 22 insertions(+), 18 deletions(-) rename guix/extensions/{locate.scm => index.scm} (93%) diff --git a/guix/extensions/locate.scm b/guix/extensions/index.scm similarity index 93% rename from guix/extensions/locate.scm rename to guix/extensions/index.scm index 630560b231..ab7661dbac 100644 --- a/guix/extensions/locate.scm +++ b/guix/extensions/index.scm @@ -16,7 +16,7 @@ ;;; You should have received a copy of the GNU General Public License ;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. -(define-module (guix extensions locate) +(define-module (guix extensions index) #:use-module (guix config) ;; %guix-package-name, ... #:use-module (guix ui) ;; display G_ #:use-module (guix scripts) @@ -32,7 +32,7 @@ (define-module (guix extensions locate) #:autoload (gnu packages) (fold-packages) #:use-module (srfi srfi-1) #:use-module (srfi srfi-9) - #:export (guix-locate)) + #:export (guix-index)) (define debug #f) @@ -259,26 +259,30 @@ (define (show-bug-report-information) (newline)) (define (show-help) - (display (G_ "Usage: guix locate [OPTIONS...] [ARGS...] -Index packages and search what package declares a given file.\n -By default, the local cache db is located in ~/.config/guix/locate-db.sqlite. -See --db-path for customization.")) - (display (G_ " - index Index current packages from the local store (in cache db)")) - (display (G_ " - search FILE Search for packages that declares FILE (from cache db)")) + (display (G_ "Usage: guix index [OPTIONS...] [search FILE...] 
+Without FILE, index (package, file) relationships in the local store. +With 'search FILE', search for packages installing FILEx;x.\n +Note: The internal cache is located at ~/.config/guix/locate-db.sqlite. +See --db-path for customization.\n")) (newline) - (display (G_ " - --db-path=DIR Change default location of the cache db")) + (display (G_ "The valid values for OPTIONS are:")) (newline) (display (G_ " -h, --help Display this help and exit")) (display (G_ " -V, --version Display version information and exit")) + (display (G_ " + --db-path=DIR Change default location of the cache db")) + (newline) + (newline) + (display (G_ "The valid values for ARGS are:")) + (newline) + (display (G_ " + search FILE Search for packages installing the FILE (from cache db)")) (newline) (show-bug-report-information)) -(define-command (guix-locate . args) +(define-command (guix-index . args) (category extension) (synopsis "Index packages to allow searching package for a given filename") @@ -294,17 +298,15 @@ (define (display-help-and-exit) (exit 0)) (match args - ((or ("-h") ("--help") ()) + ((or ("-h") ("--help")) (display-help-and-exit)) ((or ("-V") ("--version")) (show-version-and-exit "guix locate")) - ((db-path-args "index") + ((db-path-args) (let ((db-path (parse-db-args db-path-args))) (if db-path (index-packages-with-db db-path) (display-help-and-exit)))) - (("index") - (index-packages-with-db default-db-path)) (("search" file) (let ((matches (matching-packages-with-db default-db-path file))) (print-matching-results matches) @@ -315,4 +317,6 @@ (define (display-help-and-exit) (let ((matches (matching-packages-with-db db-path file))) (print-matching-results matches) (exit (pair? 
matches))) - (display-help-and-exit)))))) + (display-help-and-exit)))) + (_ ;; index by default + (index-packages-with-db default-db-path)))) -- 2.38.1 From f18d1f536bf6b13ec0dd8ee1e865ce21448e3836 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo@gnu.org> Date: Sun, 4 Dec 2022 14:42:45 +0100 Subject: [PATCH 11/25] extensions-index: Iterate over system manifests to index This should avoid the extra work of talking to the daemon, computing derivations, etc. --- guix/extensions/index.scm | 84 +++++++++++++++++++++++++++++++++++---- 1 file changed, 76 insertions(+), 8 deletions(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index ab7661dbac..a7a23c6194 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -25,13 +25,19 @@ (define-module (guix extensions index) #:use-module (guix describe) #:use-module (guix store) #:use-module (guix monads) + #:autoload (guix combinators) (fold2) #:autoload (guix grafts) (%graft?) + #:autoload (guix store roots) (gc-roots) #:use-module (guix derivations) #:use-module (guix packages) + #:use-module (guix profiles) + #:use-module (guix sets) + #:use-module ((guix utils) #:select (cache-directory)) #:autoload (guix build utils) (find-files) #:autoload (gnu packages) (fold-packages) #:use-module (srfi srfi-1) #:use-module (srfi srfi-9) + #:use-module (srfi srfi-71) #:export (guix-index)) (define debug #f) @@ -185,6 +191,67 @@ (define* (insert-packages db #:optional (filter-policy filter-public-current-sup #t #:select? filter-policy)))) +\f +;;; +;;; Indexing from local profiles. +;;; + +(define (all-profiles) + "Return the list of profiles on the system." + (delete-duplicates + (filter-map (lambda (root) + (if (file-exists? (string-append root "/manifest")) + root + (let ((root (string-append root "/profile"))) + (and (file-exists? 
(string-append root "/manifest")) + root)))) + (gc-roots)))) + +(define (profiles->manifest-entries profiles) + "Return manifest entries for all of PROFILES, without duplicates." + (let loop ((visited (set)) + (profiles profiles) + (entries '())) + (match profiles + (() + entries) + ((profile . rest) + (let* ((manifest (profile-manifest profile)) + (entries visited + (fold2 (lambda (entry lst visited) + (let ((item (manifest-entry-item entry))) + (if (set-contains? visited item) + (values lst visited) + (values (cons entry lst) + (set-insert item + visited))))) + entries + visited + (manifest-transitive-entries manifest)))) + (loop visited rest entries)))))) + +(define (insert-manifest-entry db entry) + "Insert ENTRY, a manifest entry, into DB." + (insert-files db (manifest-entry-name entry) + (manifest-entry-version entry) + (list (manifest-entry-item entry)))) ;FIXME: outputs? + +(define (index-manifests db-file) + "Insert into DB-FILE entries for packages that appear in manifests +available on the system." + (call-with-database db-file + (lambda (db) + (for-each (lambda (entry) + (insert-manifest-entry db entry)) + (let ((lst (profiles->manifest-entries (all-profiles)))) + (pk 'entries (length lst)) + lst))))) + +\f +;;; +;;; Search. +;;; + (define-record-type <package-match> (package-match name version file) package-match? @@ -214,6 +281,10 @@ (define lookup-stmt \f +;;; +;;; CLI +;;; + (define (index-packages-with-db db-pathname) "Index packages using db at location DB-PATHNAME." (call-with-database db-pathname @@ -236,11 +307,8 @@ (define (print-matching-results matches) matches)) (define default-db-path - (let ((local-config-path - (and=> (getenv "HOME") - (lambda (home) - (string-append home "/.config/guix/"))))) - (string-append local-config-path "locate-db.sqlite"))) + (string-append (cache-directory #:ensure? 
#f) + "/index/db.sqlite")) (define (show-bug-report-information) ;; TRANSLATORS: The placeholder indicates the bug-reporting address for this @@ -261,7 +329,7 @@ (define (show-bug-report-information) (define (show-help) (display (G_ "Usage: guix index [OPTIONS...] [search FILE...] Without FILE, index (package, file) relationships in the local store. -With 'search FILE', search for packages installing FILEx;x.\n +With 'search FILE', search for packages installing FILE.\n Note: The internal cache is located at ~/.config/guix/locate-db.sqlite. See --db-path for customization.\n")) (newline) @@ -318,5 +386,5 @@ (define (display-help-and-exit) (print-matching-results matches) (exit (pair? matches))) (display-help-and-exit)))) - (_ ;; index by default - (index-packages-with-db default-db-path)))) + (_ ;; By default, index + (index-manifests default-db-path)))) -- 2.38.1 From c9b02fc838237ebd7bc38ba7a71587fcdcaf6212 Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 4 Dec 2022 14:45:20 +0100 Subject: [PATCH 12/25] extensions-index: Improve help message --- guix/extensions/index.scm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index a7a23c6194..4a69df326e 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -328,9 +328,9 @@ (define (show-bug-report-information) (define (show-help) (display (G_ "Usage: guix index [OPTIONS...] [search FILE...] -Without FILE, index (package, file) relationships in the local store. +Without argument, indexes (package, file) relationships in the local store. With 'search FILE', search for packages installing FILE.\n -Note: The internal cache is located at ~/.config/guix/locate-db.sqlite. +Note: The internal cache is located at ~/.cache/guix/index/db.sqlite. 
See --db-path for customization.\n")) (newline) (display (G_ "The valid values for OPTIONS are:")) -- 2.38.1 From d63ef7a97f3fb47b5693b2c1d24bdf276ca6a6a8 Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 4 Dec 2022 14:46:04 +0100 Subject: [PATCH 13/25] extensions-index: Improve imports --- guix/extensions/index.scm | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index 4a69df326e..abaf7df071 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -17,8 +17,10 @@ ;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. (define-module (guix extensions index) - #:use-module (guix config) ;; %guix-package-name, ... - #:use-module (guix ui) ;; display G_ + #:use-module ((guix config) #:select (%guix-package-name + %guix-home-page-url + %guix-bug-report-address)) + #:use-module ((guix ui) #:select (G_)) #:use-module (guix scripts) #:use-module (sqlite3) #:use-module (ice-9 match) -- 2.38.1 From 14a9dafb2b927ba8435a26fdea04b00644e3ca3c Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 4 Dec 2022 15:52:15 +0100 Subject: [PATCH 14/25] extensions-index: Drop code duplication Import directly the right function from guix ui module. --- guix/extensions/index.scm | 23 +++-------------------- 1 file changed, 3 insertions(+), 20 deletions(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index abaf7df071..c40edc7944 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -17,10 +17,9 @@ ;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. 
(define-module (guix extensions index) - #:use-module ((guix config) #:select (%guix-package-name - %guix-home-page-url - %guix-bug-report-address)) - #:use-module ((guix ui) #:select (G_)) + #:use-module ((guix i18n) #:select (G_)) + #:use-module ((guix ui) #:select (show-version-and-exit + show-bug-report-information)) #:use-module (guix scripts) #:use-module (sqlite3) #:use-module (ice-9 match) @@ -312,22 +311,6 @@ (define default-db-path (string-append (cache-directory #:ensure? #f) "/index/db.sqlite")) -(define (show-bug-report-information) - ;; TRANSLATORS: The placeholder indicates the bug-reporting address for this - ;; package. Please add another line saying "Report translation bugs to - ;; ...\n" with the address for translation bugs (typically your translation - ;; team's web or email address). - (format #t (G_ " -Report bugs to: ~a.") %guix-bug-report-address) - (format #t (G_ " -~a home page: <~a>") %guix-package-name %guix-home-page-url) - (format #t (G_ " -General help using Guix and GNU software: <~a>") - ;; TRANSLATORS: Change the "/en" bit of this URL appropriately if - ;; the web site is translated in your language. - (G_ "https://guix.gnu.org/en/help/")) - (newline)) - (define (show-help) (display (G_ "Usage: guix index [OPTIONS...] [search FILE...] Without argument, indexes (package, file) relationships in the local store. -- 2.38.1 From ea1d8216bfe5f487de24d883891b6e07c8536cdd Mon Sep 17 00:00:00 2001 From: "Antoine R. 
Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 4 Dec 2022 16:01:33 +0100 Subject: [PATCH 15/25] extensions-index: Drop dead code we read from local profiles now --- guix/extensions/index.scm | 42 ++------------------------------------- 1 file changed, 2 insertions(+), 40 deletions(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index c40edc7944..a7c518e903 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -160,38 +160,6 @@ (define (strip file) directories))) (sqlite-exec db "commit;")) -(define (insert-package db package) - "Insert all the files of PACKAGE into DB." - (mlet %store-monad ((drv (package->derivation package #:graft? #f))) - (match (derivation->output-paths drv) - (((labels . directories) ...) - (when (every file-exists? directories) - (insert-files db (package-name package) (package-version package) - directories)) - (return #t))))) - -(define (filter-public-current-supported package) - "Filter supported, not hidden (public) and not superseded (current) package." - (and (not (hidden-package? package)) - (not (package-superseded package)) - (supported-package? package))) - -(define (filter-supported-package package) - "Filter supported package (package might be hidden or superseded)." - (and (supported-package? package))) - -(define (no-filter package) "No filtering on package" #t) - -(define* (insert-packages db #:optional (filter-policy filter-public-current-supported)) - "Insert all current packages matching `filter-package-policy` into DB." - (with-store store - (parameterize ((%graft? #f)) - (fold-packages (lambda (package _) - (run-with-store store - (insert-package db package))) - #t - #:select? filter-policy)))) - \f ;;; ;;; Indexing from local profiles. @@ -209,7 +177,7 @@ (define (all-profiles) (gc-roots)))) (define (profiles->manifest-entries profiles) - "Return manifest entries for all of PROFILES, without duplicates." 
+ "Return deduplicated manifest entries across all PROFILES." (let loop ((visited (set)) (profiles profiles) (entries '())) @@ -286,12 +254,6 @@ (define lookup-stmt ;;; CLI ;;; -(define (index-packages-with-db db-pathname) - "Index packages using db at location DB-PATHNAME." - (call-with-database db-pathname - (lambda (db) - (insert-packages db no-filter)))) - (define (matching-packages-with-db db-pathname file) "Compute list of packages referencing FILE using db at DB-PATHNAME." (call-with-database db-pathname @@ -358,7 +320,7 @@ (define (display-help-and-exit) ((db-path-args) (let ((db-path (parse-db-args db-path-args))) (if db-path - (index-packages-with-db db-path) + (index-manifests db-path) (display-help-and-exit)))) (("search" file) (let ((matches (matching-packages-with-db default-db-path file))) -- 2.38.1 From 8454f9f417c2781fded2c26a1b920174991ac1dc Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 4 Dec 2022 16:12:10 +0100 Subject: [PATCH 16/25] extensions-index: Rework docstrings --- guix/extensions/index.scm | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index a7c518e903..1c23d9a4f1 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -166,7 +166,7 @@ (define (strip file) ;;; (define (all-profiles) - "Return the list of profiles on the system." + "Return the list of system profiles." (delete-duplicates (filter-map (lambda (root) (if (file-exists? (string-append root "/manifest")) @@ -200,14 +200,13 @@ (define (profiles->manifest-entries profiles) (loop visited rest entries)))))) (define (insert-manifest-entry db entry) - "Insert ENTRY, a manifest entry, into DB." + "Insert a manifest ENTRY into DB." (insert-files db (manifest-entry-name entry) (manifest-entry-version entry) (list (manifest-entry-item entry)))) ;FIXME: outputs? 
(define (index-manifests db-file) - "Insert into DB-FILE entries for packages that appear in manifests -available on the system." + "Insert packages entries into DB-FILE from the system manifests." (call-with-database db-file (lambda (db) (for-each (lambda (entry) -- 2.38.1 From 98f9899d479cd62e93b86fab3448b2024db02621 Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 4 Dec 2022 16:12:24 +0100 Subject: [PATCH 17/25] extensions-index: Fix warning according to repl suggestion --- guix/extensions/index.scm | 1 + 1 file changed, 1 insertion(+) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index 1c23d9a4f1..42c2051f13 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -23,6 +23,7 @@ (define-module (guix extensions index) #:use-module (guix scripts) #:use-module (sqlite3) #:use-module (ice-9 match) + #:use-module (ice-9 format) #:use-module (guix describe) #:use-module (guix store) #:use-module (guix monads) -- 2.38.1 From bb80ad696e1a47651f2dc4a7c74ea577372c61b5 Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 4 Dec 2022 16:20:01 +0100 Subject: [PATCH 18/25] extensions-index: Ensure directory holding the db is created if needed. The creation is ignored if already present. --- guix/extensions/index.scm | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index 42c2051f13..627dddc119 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -208,6 +208,10 @@ (define (insert-manifest-entry db entry) (define (index-manifests db-file) "Insert packages entries into DB-FILE from the system manifests." + (let ((db-dirpath (dirname db-file))) + (unless (file-exists? 
db-dirpath) + (mkdir db-dirpath))) + (call-with-database db-file (lambda (db) (for-each (lambda (entry) -- 2.38.1 From 34a86f977947371d1eae3be9953190464aa01a8c Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 11 Dec 2022 20:11:56 +0100 Subject: [PATCH 19/25] extensions-index: Add a schema db version Nothing is done with that version just yet besides displaying it in the --version call. --- guix/extensions/index.scm | 179 ++++++++++++++++++++++++-------------- 1 file changed, 112 insertions(+), 67 deletions(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index 627dddc119..b89eb9e6c8 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -44,8 +44,15 @@ (define-module (guix extensions index) (define debug #f) +(define application-version 1) + (define schema " +create table if not exists SchemaVersion ( + version integer primary key not null, + unique (version) +); + create table if not exists Packages ( id integer primary key autoincrement not null, name text not null, @@ -81,85 +88,107 @@ (define (call-with-database file proc) (lambda () (sqlite-close db))))) -(define (insert-files db package version directories) - "Insert files from DIRECTORIES as belonging to PACKAGE at VERSION." - (define stmt-select-package +(define (insert-version db version) + "Insert application VERSION into the DB." + (define stmt-insert-version (sqlite-prepare db "\ -SELECT id FROM Packages WHERE name = :name AND version = :version;" +INSERT OR IGNORE INTO SchemaVersion(version) +VALUES (:version);" #:cache? #t)) + (sqlite-exec db "begin immediate;") + (sqlite-bind-arguments stmt-insert-version #:version version) + (sqlite-fold (const #t) #t stmt-insert-version) + (sqlite-exec db "commit;")) - (define stmt-insert-package - (sqlite-prepare db "\ +(define (read-version db) + "Read the current application version from the DB." 
+ + (define stmt-select-version (sqlite-prepare db "\ +SELECT version FROM SchemaVersion;" + #:cache? #t)) + (match (sqlite-fold cons '() stmt-select-version) + ((#(version)) + version))) + +(define (insert-files db package version directories) + "Insert files from DIRECTORIES as belonging to PACKAGE at VERSION." + (define stmt-select-package + (sqlite-prepare db "\ +SELECT id FROM Packages WHERE name = :name AND version = :version;" + #:cache? #t)) + + (define stmt-insert-package + (sqlite-prepare db "\ INSERT OR IGNORE INTO Packages(name, version) -- to avoid spurious writes VALUES (:name, :version);" - #:cache? #t)) + #:cache? #t)) - (define stmt-select-directory - (sqlite-prepare db "\ + (define stmt-select-directory + (sqlite-prepare db "\ SELECT id FROM Directories WHERE name = :name AND package = :package;" - #:cache? #t)) + #:cache? #t)) - (define stmt-insert-directory - (sqlite-prepare db "\ + (define stmt-insert-directory + (sqlite-prepare db "\ INSERT OR IGNORE INTO Directories(name, package) -- to avoid spurious writes VALUES (:name, :package);" - #:cache? #t)) + #:cache? #t)) - (define stmt-insert-file - (sqlite-prepare db "\ + (define stmt-insert-file + (sqlite-prepare db "\ INSERT OR IGNORE INTO Files(name, basename, directory) VALUES (:name, :basename, :directory);" - #:cache? 
#t)) - - (sqlite-exec db "begin immediate;") - (sqlite-bind-arguments stmt-insert-package - #:name package - #:version version) - (sqlite-fold (const #t) #t stmt-insert-package) - - (sqlite-bind-arguments stmt-select-package - #:name package - #:version version) - (match (sqlite-fold cons '() stmt-select-package) - ((#(package-id)) - (when debug - (format #t "(pkg, version, pkg-id): (~a, ~a, ~a)" - package version package-id)) - (pk 'package package-id package) - (for-each (lambda (directory) - (define (strip file) - (string-drop file (+ (string-length directory) 1))) - - (sqlite-reset stmt-insert-directory) - (sqlite-bind-arguments stmt-insert-directory - #:name directory - #:package package-id) - (sqlite-fold (const #t) #t stmt-insert-directory) - - (sqlite-reset stmt-select-directory) - (sqlite-bind-arguments stmt-select-directory - #:name directory - #:package package-id) - (match (sqlite-fold cons '() stmt-select-directory) - ((#(directory-id)) - (when debug - (format #t "(name, package, dir-id): (~a, ~a, ~a)\n" - directory package-id directory-id)) - (for-each (lambda (file) - ;; If DIRECTORY is a symlink, (find-files - ;; DIRECTORY) returns the DIRECTORY singleton. - (unless (string=? file directory) - (sqlite-reset stmt-insert-file) - (sqlite-bind-arguments stmt-insert-file - #:name (strip file) - #:basename - (basename file) - #:directory - directory-id) - (sqlite-fold (const #t) #t stmt-insert-file))) - (find-files directory))))) - directories))) - (sqlite-exec db "commit;")) + #:cache? 
#t)) + + (sqlite-exec db "begin immediate;") + (sqlite-bind-arguments stmt-insert-package + #:name package + #:version version) + (sqlite-fold (const #t) #t stmt-insert-package) + + (sqlite-bind-arguments stmt-select-package + #:name package + #:version version) + (match (sqlite-fold cons '() stmt-select-package) + ((#(package-id)) + (when debug + (format #t "(pkg, version, pkg-id): (~a, ~a, ~a)" + package version package-id)) + (pk 'package package-id package) + (for-each (lambda (directory) + (define (strip file) + (string-drop file (+ (string-length directory) 1))) + + (sqlite-reset stmt-insert-directory) + (sqlite-bind-arguments stmt-insert-directory + #:name directory + #:package package-id) + (sqlite-fold (const #t) #t stmt-insert-directory) + + (sqlite-reset stmt-select-directory) + (sqlite-bind-arguments stmt-select-directory + #:name directory + #:package package-id) + (match (sqlite-fold cons '() stmt-select-directory) + ((#(directory-id)) + (when debug + (format #t "(name, package, dir-id): (~a, ~a, ~a)\n" + directory package-id directory-id)) + (for-each (lambda (file) + ;; If DIRECTORY is a symlink, (find-files + ;; DIRECTORY) returns the DIRECTORY singleton. + (unless (string=? file directory) + (sqlite-reset stmt-insert-file) + (sqlite-bind-arguments stmt-insert-file + #:name (strip file) + #:basename + (basename file) + #:directory + directory-id) + (sqlite-fold (const #t) #t stmt-insert-file))) + (find-files directory))))) + directories))) + (sqlite-exec db "commit;")) \f ;;; @@ -212,6 +241,8 @@ (define (index-manifests db-file) (unless (file-exists? db-dirpath) (mkdir db-dirpath))) + (insert-version-with-db db-file) + (call-with-database db-file (lambda (db) (for-each (lambda (entry) @@ -258,6 +289,16 @@ (define lookup-stmt ;;; CLI ;;; +(define (insert-version-with-db db-pathname) + "Insert application version into the database." 
+ (call-with-database db-pathname + (lambda (db) + (insert-version db application-version)))) + +(define (read-db-version-with-db db-pathname) + "Insert version into the database." + (call-with-database db-pathname read-version)) + (define (matching-packages-with-db db-pathname file) "Compute list of packages referencing FILE using db at DB-PATHNAME." (call-with-database db-pathname @@ -306,7 +347,7 @@ (define-command (guix-index . args) (synopsis "Index packages to allow searching package for a given filename") (define (parse-db-args args) - "Parsing of string key=value where we are only interested in 'value'" + "Parsing of string key=value where we are only interested in 'value'" (match (string-split args #\=) ((unused db-path) db-path) @@ -320,6 +361,10 @@ (define (display-help-and-exit) ((or ("-h") ("--help")) (display-help-and-exit)) ((or ("-V") ("--version")) + (with-exception-handler + (lambda (exn) 'meh) ;; noop exception + (simple-format #t "Extension db version: ~a\n" (read-db-version-with-db default-db-path)) + #:unwind? #t) (show-version-and-exit "guix locate")) ((db-path-args) (let ((db-path (parse-db-args db-path-args))) -- 2.38.1 From 2ecdab01c93fc4872803c5a2d16743214512cb5d Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Sun, 11 Dec 2022 20:13:44 +0100 Subject: [PATCH 20/25] extensions-index: Fix typo in help message --- guix/extensions/index.scm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index b89eb9e6c8..3a5015afe1 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -365,7 +365,7 @@ (define (display-help-and-exit) (lambda (exn) 'meh) ;; noop exception (simple-format #t "Extension db version: ~a\n" (read-db-version-with-db default-db-path)) #:unwind? 
#t) - (show-version-and-exit "guix locate")) + (show-version-and-exit "guix index")) ((db-path-args) (let ((db-path (parse-db-args db-path-args))) (if db-path -- 2.38.1 From a30dff0161f60288ce3b260a8429c2fd3c8b8e7c Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Thu, 15 Dec 2022 13:04:08 +0100 Subject: [PATCH 21/25] extensions-index: Allow user to choose the indexation method To do so, this: - reverts the earlier removal of the functions that index packages from the local store, so they can be reused - rewrites the CLI argument parsing logic. This allows more flexibility in the indexation method (at the cost of a bit more code). --- guix/extensions/index.scm | 250 ++++++++++++++++++++++++++++++-------- guix/scripts/home.scm | 2 +- 2 files changed, 199 insertions(+), 53 deletions(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index 3a5015afe1..878daf4fb6 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -19,11 +19,15 @@ (define-module (guix extensions index) #:use-module ((guix i18n) #:select (G_)) #:use-module ((guix ui) #:select (show-version-and-exit - show-bug-report-information)) + show-bug-report-information + with-error-handling + string->number*)) + #:use-module ((guix status) #:select (with-status-verbosity)) #:use-module (guix scripts) #:use-module (sqlite3) #:use-module (ice-9 match) #:use-module (ice-9 format) + #:use-module (ice-9 getopt-long) #:use-module (guix describe) #:use-module (guix store) #:use-module (guix monads) @@ -39,10 +43,11 @@ (define-module (guix extensions index) #:autoload (gnu packages) (fold-packages) #:use-module (srfi srfi-1) #:use-module (srfi srfi-9) + #:use-module (srfi srfi-37) ;; option #:use-module (srfi srfi-71) #:export (guix-index)) -(define debug #f) +(define debug #t) (define application-version 1) @@ -190,6 +195,34 @@ (define (strip file) directories))) (sqlite-exec db "commit;")) +\f +;;; +;;; Indexing from local packages. 
+;;; + +(define (insert-package db package) + "Insert all the files of PACKAGE into DB." + (mlet %store-monad ((drv (package->derivation package #:graft? #f))) + (match (derivation->output-paths drv) + (((labels . directories) ...) + (when (every file-exists? directories) + (insert-files db (package-name package) (package-version package) + directories)) + (return #t))))) + +(define* (index-packages-from-store db) + "Insert all current packages from the local store into the DB." + (with-store store + (parameterize ((%graft? #f)) + (fold-packages (lambda (package _) + (run-with-store store + (insert-package db package))) + #t + #:select? (lambda (package) + (and (not (hidden-package? package)) + (not (package-superseded package)) + (supported-package? package))))))) + \f ;;; ;;; Indexing from local profiles. @@ -235,14 +268,8 @@ (define (insert-manifest-entry db entry) (manifest-entry-version entry) (list (manifest-entry-item entry)))) ;FIXME: outputs? -(define (index-manifests db-file) - "Insert packages entries into DB-FILE from the system manifests." - (let ((db-dirpath (dirname db-file))) - (unless (file-exists? db-dirpath) - (mkdir db-dirpath))) - - (insert-version-with-db db-file) - +(define (index-packages-from-manifests-with-db db-file) + "Index packages entries into DB-FILE from the system manifests." (call-with-database db-file (lambda (db) (for-each (lambda (entry) @@ -289,6 +316,12 @@ (define lookup-stmt ;;; CLI ;;; +(define (index-packages-from-store-with-db db-pathname) + "Index packages using db at location DB-PATHNAME." + (call-with-database db-pathname + (lambda (db) + (index-packages-from-store db)))) + (define (insert-version-with-db db-pathname) "Insert application version into the database." (call-with-database db-pathname @@ -320,67 +353,180 @@ (define default-db-path (define (show-help) (display (G_ "Usage: guix index [OPTIONS...] [search FILE...] -Without argument, indexes (package, file) relationships in the local store. 
+Without argument, indexes (package, file) relationships from the machine. +This allows indexation with 2 methods: + +- out of the local manifests. This is the fastest implementation but this +indexes less packages. That'd be typically the use case for user local +indexation. + +- out of the local store. This is slower due to implementation details (it +discusses with the store daemon for one). That'd be typically the use case for +building the largest db in one of the build farm node. + With 'search FILE', search for packages installing FILE.\n -Note: The internal cache is located at ~/.cache/guix/index/db.sqlite. +Note: Internal cache is located at ~/.cache/guix/index/db.sqlite by default. See --db-path for customization.\n")) (newline) (display (G_ "The valid values for OPTIONS are:")) (newline) (display (G_ " - -h, --help Display this help and exit")) + -h, --help Display this help and exit")) (display (G_ " - -V, --version Display version information and exit")) + -V, --version Display version information and exit")) (display (G_ " - --db-path=DIR Change default location of the cache db")) + --db-path=DIR Change default location of the cache db")) (newline) + (display (G_ " + --with-method=METH Change default indexation method. By default it uses the + local \"manifests\" (faster). It can also uses the local + \"store\" (slower, typically on the farm build ci).")) (newline) (display (G_ "The valid values for ARGS are:")) (newline) (display (G_ " search FILE Search for packages installing the FILE (from cache db)")) (newline) + (display (G_ " + <EMPTY> Without any argument, it index packages. This fills in the + db cache using whatever indexation method is defined.")) (show-bug-report-information)) +(define (verbosity-level opts) + "Return the verbosity level based on OPTS, the alist of parsed options." + (or (assoc-ref opts 'verbosity) + (if (eq? 
(assoc-ref opts 'action) 'build) + 3 1))) + +(define %options + (list + (option '(#\h "help") #f #f + (lambda args (show-help) (exit 0))) + (option '(#\V "version") #f #f + (lambda args (show-version-and-exit "guix index"))) + (option '(#\v "verbosity") #f #t + (lambda (opt name arg result) + (let ((level (string->number* arg))) + (alist-cons 'verbosity level + (alist-delete 'verbosity result))))) + ;; index data out of the method (store or package) + (option '(#\d "db-path") #f #t + (lambda (opt name arg result) + (when debug + (format #t "%options: --with-method: opt ~a\n" opt) + (format #t "%options: --with-method: name ~a\n" name) + (format #t "%options: --with-method: arg ~a\n" arg) + (format #t "%options: --with-method: result ~a\n" result)) + (alist-cons 'db-path arg + (alist-delete 'db-path result)))) + + ;; index data out of the method (store or package) + (option '(#\m "with-method") #f #t + (lambda (opt name arg result) + (when debug + (format #t "%options: --with-method: opt ~a\n" opt) + (format #t "%options: --with-method: name ~a\n" name) + (format #t "%options: --with-method: arg ~a\n" arg) + (format #t "%options: --with-method: result ~a\n" result)) + (match arg + ((or "manifests" "store") + (alist-cons 'with-method arg + (alist-delete 'with-method result))) + (_ + (G_ "guix index: Wrong indexation method, either manifests + (fast) or store (slow)~%"))))))) + +(define %default-options + `((db-path . ,default-db-path) + (verbosity . #f) + (with-method . "manifests"))) + (define-command (guix-index . args) (category extension) - (synopsis "Index packages to allow searching package for a given filename") - - (define (parse-db-args args) - "Parsing of string key=value where we are only interested in 'value'" - (match (string-split args #\=) - ((unused db-path) - db-path) + (synopsis "Index packages to search package for a given filename") + + (define (parse-sub-command arg result) + ;; Parse sub-command ARG and augment RESULT accordingly. 
+ (when debug + (format #t "parse-sub-command: arg: ~a\n" arg) + (format #t "parse-sub-command: result: ~a\n" result) + (format #t "parse-sub-command: (assoc-ref result 'action): ~a\n" (assoc-ref result 'action)) + (format #t "parse-sub-command: (assoc-ref result 'argument): ~a\n" (assoc-ref result 'argument))) + (if (assoc-ref result 'action) + (alist-cons 'argument arg result) + (let ((action (string->symbol arg))) + (case action + ((search) + (alist-cons 'action action result)) + (else (leave (G_ "~a: unknown action~%") action)))))) + + (define (match-pair car) + ;; Return a procedure that matches a pair with CAR. + (match-lambda + ((head . tail) + (and (eq? car head) tail)) (_ #f))) - (define (display-help-and-exit) - (show-help) - (exit 0)) - - (match args - ((or ("-h") ("--help")) - (display-help-and-exit)) - ((or ("-V") ("--version")) - (with-exception-handler - (lambda (exn) 'meh) ;; noop exception - (simple-format #t "Extension db version: ~a\n" (read-db-version-with-db default-db-path)) - #:unwind? #t) - (show-version-and-exit "guix index")) - ((db-path-args) - (let ((db-path (parse-db-args db-path-args))) - (if db-path - (index-manifests db-path) - (display-help-and-exit)))) - (("search" file) - (let ((matches (matching-packages-with-db default-db-path file))) - (print-matching-results matches) - (exit (pair? matches)))) - ((db-path-args "search" file) - (let ((db-path (parse-db-args db-path-args))) - (if db-path - (let ((matches (matching-packages-with-db db-path file))) + (define (option-arguments opts) + ;; Extract the plain arguments from OPTS. 
+ (let* ((args (reverse (filter-map (match-pair 'argument) opts))) + (count (length args)) + (action (or (assoc-ref opts 'action) 'index))) + + (when debug + (format #t "option-arguments: args: ~a\n" args) + (format #t "option-arguments: count: ~a\n" count) + (format #t "option-arguments: action: ~a\n" action)) + + (define (fail) + (leave (G_ "wrong number of arguments for action '~a'~%") + action)) + + (unless action + (format (current-error-port) + (G_ "guix index: missing command name~%")) + (format (current-error-port) + (G_ "Try 'guix index --help' for more information.~%")) + (exit 1)) + (alist-cons 'argument (string-concatenate args) + (alist-delete 'argument + (alist-cons 'action action + (alist-delete 'action opts)))))) + + (with-error-handling + (let* ((opts (parse-command-line args %options + (list %default-options) + #:argument-handler + parse-sub-command)) + (args (option-arguments opts)) + (action (assoc-ref args 'action)) + (db-path (assoc-ref args 'db-path)) + (with-method (assoc-ref args 'with-method))) + (with-status-verbosity (verbosity-level opts) + (when debug + (format #t "main: opts: ~a\n" opts) + (format #t "main: args: ~a\n" args) + (format #t "main: action: ~a\n" action) + (format #t "main: db-path: ~a\n" db-path) + (format #t "main: with-method: ~a\n" with-method)) + + (match action + ('search + (unless (file-exists? db-path) + (format (current-error-port) + (G_ "guix index: The local cache db does not exist yet. +You need to index packages first.\nTry 'guix index --help' for more information.~%")) + (exit 1)) + (let* ((file (assoc-ref args 'argument)) + (matches (matching-packages-with-db db-path file))) (print-matching-results matches) - (exit (pair? matches))) - (display-help-and-exit)))) - (_ ;; By default, index - (index-manifests default-db-path)))) + (exit (pair? matches)))) + ('index + (let ((db-dirpath (dirname db-path))) + (unless (file-exists? 
db-dirpath) + (mkdir db-dirpath))) + ;; FIXME: Deal with check on version and destruction/migration if need be + (insert-version-with-db db-path) + (if (string= with-method "manifests") + (index-packages-from-manifests-with-db db-path) + (index-packages-from-store-with-db db-path)))))))) diff --git a/guix/scripts/home.scm b/guix/scripts/home.scm index 1c481ccf91..bdc903f393 100644 --- a/guix/scripts/home.scm +++ b/guix/scripts/home.scm @@ -69,7 +69,7 @@ (define-module (guix scripts home) #:use-module (srfi srfi-1) #:use-module (srfi srfi-26) #:use-module (srfi srfi-35) - #:use-module (srfi srfi-37) + #:use-module ((srfi srfi-37) #:select (option)) #:use-module (srfi srfi-71) #:use-module (ice-9 match) #:export (guix-home)) -- 2.38.1 From 295e4f85b6a967cd714712fe67bcaaef6bb5c29d Mon Sep 17 00:00:00 2001 From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com> Date: Thu, 15 Dec 2022 16:34:06 +0100 Subject: [PATCH 22/25] extensions-index: Deal with db schema migrations The db schema version is now handled, transparently for users. Whenever the schema evolves, we should also provide the intermediate SQL migration script so that existing dbs can be migrated without losing data. 
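The stepping idea in this message can be sketched in plain Guile, without touching SQLite. This is only an illustration under assumptions: `pending-migrations` and `example-migrations` are hypothetical names that do not appear in the patch, and the SQL strings are shortened stand-ins for the real scripts.

```scheme
;; Hypothetical migration table, shaped like `schema-to-migrate' in the
;; patch: an alist mapping a schema version to the SQL that upgrades the
;; db to that version.  The SQL here is abbreviated for illustration.
(define example-migrations
  '((1 . "create table if not exists SchemaVersion (version integer);")
    (2 . "alter table SchemaVersion add column date date;")))

(define (pending-migrations migrations current target)
  "Return the (VERSION . SQL) pairs of MIGRATIONS needed to go from schema
version CURRENT to TARGET, in increasing version order.  Stop at the first
missing step instead of skipping over it."
  (let loop ((next (+ current 1))
             (steps '()))
    (let ((sql (assv-ref migrations next)))
      (if (and (<= next target) sql)
          (loop (+ next 1) (cons (cons next sql) steps))
          (reverse steps)))))

;; Versions that would be applied to a db currently at version 0.
(display (map car (pending-migrations example-migrations 0 2)))
(newline)
```

Each returned pair could then be run with `sqlite-exec` and recorded with `insert-version` on a single open connection, which avoids re-opening the database for every step.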
--- guix/extensions/index.scm | 153 ++++++++++++++++++++++---------------- 1 file changed, 90 insertions(+), 63 deletions(-) diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm index 878daf4fb6..0fd361a485 100644 --- a/guix/extensions/index.scm +++ b/guix/extensions/index.scm @@ -22,7 +22,6 @@ (define-module (guix extensions index) show-bug-report-information with-error-handling string->number*)) - #:use-module ((guix status) #:select (with-status-verbosity)) #:use-module (guix scripts) #:use-module (sqlite3) #:use-module (ice-9 match) @@ -49,12 +48,18 @@ (define-module (guix extensions index) (define debug #t) -(define application-version 1) +(define application-version 2) -(define schema +;; The following schema is the full schema at the `application-version`. It +;; should be modified according to the development required. If the schema +;; needs modification across time, those should be changed directly in the +;; full-schema and the incremental changes should be referenced below +;; as migration step (for the existing dbs) below. +(define schema-full " create table if not exists SchemaVersion ( version integer primary key not null, + date date, unique (version) ); @@ -83,22 +88,32 @@ (define schema create index if not exists IndexFiles on Files(basename);") +;; List of tuple ((version . sqlite schema migration script)). There should +;; be as much version increments with each step needed to migrate the db. +(define schema-to-migrate '((1 . " +create table if not exists SchemaVersion ( + version integer primary key not null, + unique (version) +); +") + (2 . " +alter table SchemaVersion +add column date date; +"))) + (define (call-with-database file proc) (let ((db (sqlite-open file))) (dynamic-wind (lambda () #t) - (lambda () - (sqlite-exec db schema) - (proc db)) - (lambda () - (sqlite-close db))))) + (lambda () (proc db)) + (lambda () (sqlite-close db))))) (define (insert-version db version) "Insert application VERSION into the DB." 
(define stmt-insert-version (sqlite-prepare db "\ -INSERT OR IGNORE INTO SchemaVersion(version) -VALUES (:version);" +INSERT OR IGNORE INTO SchemaVersion(version, date) +VALUES (:version, CURRENT_TIMESTAMP);" #:cache? #t)) (sqlite-exec db "begin immediate;") (sqlite-bind-arguments stmt-insert-version #:version version) @@ -109,8 +124,8 @@ (define (read-version db) "Read the current application version from the DB." (define stmt-select-version (sqlite-prepare db "\ -SELECT version FROM SchemaVersion;" - #:cache? #t)) +SELECT version FROM SchemaVersion ORDER BY version DESC LIMIT 1;" + #:cache? #f)) (match (sqlite-fold cons '() stmt-select-version) ((#(version)) version))) @@ -268,9 +283,9 @@ (define (insert-manifest-entry db entry) (manifest-entry-version entry) (list (manifest-entry-item entry)))) ;FIXME: outputs? -(define (index-packages-from-manifests-with-db db-file) - "Index packages entries into DB-FILE from the system manifests." - (call-with-database db-file +(define (index-packages-from-manifests-with-db db-pathname) + "Index packages entries into DB-PATHNAME from the system manifests." + (call-with-database db-pathname (lambda (db) (for-each (lambda (entry) (insert-manifest-entry db entry)) @@ -322,21 +337,40 @@ (define (index-packages-from-store-with-db db-pathname) (lambda (db) (index-packages-from-store db)))) -(define (insert-version-with-db db-pathname) - "Insert application version into the database." +(define (matching-packages-with-db db-pathname file) + "Compute list of packages referencing FILE using db at DB-PATHNAME." (call-with-database db-pathname (lambda (db) - (insert-version db application-version)))) + (matching-packages db file)))) -(define (read-db-version-with-db db-pathname) - "Insert version into the database." 
-  (call-with-database db-pathname read-version))
+(define (read-version-from-db db-pathname)
+  (call-with-database db-pathname
+    (lambda (db) (read-version db))))
 
-(define (matching-packages-with-db db-pathname file)
-  "Compute list of packages referencing FILE using db at DB-PATHNAME."
+(define (migrate-schema-to-version db-pathname)
   (call-with-database db-pathname
     (lambda (db)
-      (matching-packages db file))))
+      (catch #t
+        (lambda ()
+          ;; Migrate from the current version to the full migrated schema
+          ;; This can raise sqlite-error if the db is not properly configured yet
+          (let* ((current-db-version (read-version db))
+                 (next-db-version (+ 1 current-db-version)))
+            (when (< current-db-version application-version)
+              ;; when the current db version is older than the current application
+              (let ((schema-migration-at-version (assoc-ref schema-to-migrate next-db-version)))
+                (when schema-migration-at-version
+                  ;; migrate the schema to the next version (if it exists)
+                  (sqlite-exec db schema-migration-at-version)
+                  ;; insert current version
+                  (insert-version db next-db-version)
+                  ;; iterate over the next migration if any
+                  (migrate-schema-to-version db))))))
+        (lambda (key . arg)
+          ;; exception handler in case failure to read an inexisting db
+          ;; Fallback to boostrap the schema
+          (sqlite-exec db schema-full)
+          (insert-version db application-version))))))
 
 (define (print-matching-results matches)
   "Print the MATCHES matching results."
@@ -392,23 +426,17 @@ (define (show-help)
 db cache using whatever indexation method is defined."))
   (show-bug-report-information))
 
-(define (verbosity-level opts)
-  "Return the verbosity level based on OPTS, the alist of parsed options."
-  (or (assoc-ref opts 'verbosity)
-      (if (eq? (assoc-ref opts 'action) 'build)
-          3 1)))
-
 (define %options
   (list
    (option '(#\h "help") #f #f
            (lambda args (show-help) (exit 0)))
    (option '(#\V "version") #f #f
-           (lambda args (show-version-and-exit "guix index")))
-   (option '(#\v "verbosity") #f #t
            (lambda (opt name arg result)
-             (let ((level (string->number* arg)))
-               (alist-cons 'verbosity level
-                           (alist-delete 'verbosity result)))))
+             (catch 'sqlite-error
+               (lambda ()
+                 (simple-format #t "Extension db version: ~a\n" (read-version-from-db (assoc-ref result 'db-path))))
+               (lambda (key . arg) 'no-db-yet-so-nothing-to-display))
+             (show-version-and-exit "guix index")))
    ;; index data out of the method (store or package)
    (option '(#\d "db-path") #f #t
            (lambda (opt name arg result)
@@ -438,7 +466,6 @@ (define %options
 (define %default-options
   `((db-path . ,default-db-path)
-    (verbosity . #f)
     (with-method . "manifests")))
 
 (define-command (guix-index . args)
@@ -502,31 +529,31 @@ (define (fail)
          (action (assoc-ref args 'action))
          (db-path (assoc-ref args 'db-path))
          (with-method (assoc-ref args 'with-method)))
-    (with-status-verbosity (verbosity-level opts)
-      (when debug
-        (format #t "main: opts: ~a\n" opts)
-        (format #t "main: args: ~a\n" args)
-        (format #t "main: action: ~a\n" action)
-        (format #t "main: db-path: ~a\n" db-path)
-        (format #t "main: with-method: ~a\n" with-method))
-
-      (match action
-        ('search
-         (unless (file-exists? db-path)
-           (format (current-error-port)
-                   (G_ "guix index: The local cache db does not exist yet.
+    (when debug
+      (format #t "main: opts: ~a\n" opts)
+      (format #t "main: args: ~a\n" args)
+      (format #t "main: action: ~a\n" action)
+      (format #t "main: db-path: ~a\n" db-path)
+      (format #t "main: with-method: ~a\n" with-method))
+
+    (match action
+      ('search
+       (unless (file-exists? db-path)
+         (format (current-error-port)
+                 (G_ "guix index: The local cache db does not exist yet.
 You need to index packages first.\nTry 'guix index --help' for more information.~%"))
-           (exit 1))
-         (let* ((file (assoc-ref args 'argument))
-                (matches (matching-packages-with-db db-path file)))
-           (print-matching-results matches)
-           (exit (pair? matches))))
-        ('index
-         (let ((db-dirpath (dirname db-path)))
-           (unless (file-exists? db-dirpath)
-             (mkdir db-dirpath)))
-         ;; FIXME: Deal with check on version and destruction/migration if need be
-         (insert-version-with-db db-path)
-         (if (string= with-method "manifests")
-             (index-packages-from-manifests-with-db db-path)
-             (index-packages-from-store-with-db db-path))))))))
+         (exit 1))
+       (let* ((file (assoc-ref args 'argument))
+              (matches (matching-packages-with-db db-path file)))
+         (print-matching-results matches)
+         (exit (pair? matches))))
+      ('index
+       (let ((db-dirpath (dirname db-path)))
+         (unless (file-exists? db-dirpath)
+           (mkdir db-dirpath)))
+       ;; Migrate/initialize db to schema at version application-version
+       (migrate-schema-to-version db-path)
+       ;; Finally index packages
+       (if (string= with-method "manifests")
+           (index-packages-from-manifests-with-db db-path)
+           (index-packages-from-store-with-db db-path)))))))
-- 
2.38.1


From 60b2d6e1e6c9ce286844354298a3c9f2fed0adff Mon Sep 17 00:00:00 2001
From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com>
Date: Thu, 15 Dec 2022 17:21:15 +0100
Subject: [PATCH 23/25] extensions-index: Deactivate debug

---
 guix/extensions/index.scm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm
index 0fd361a485..56841a4666 100644
--- a/guix/extensions/index.scm
+++ b/guix/extensions/index.scm
@@ -46,7 +46,7 @@ (define-module (guix extensions index)
   #:use-module (srfi srfi-71)
   #:export (guix-index))
 
-(define debug #t)
+(define debug #f)
 
 (define application-version 2)
 
-- 
2.38.1


From b7485e7302862ef3e96279eca9df6f4c63bfb94c Mon Sep 17 00:00:00 2001
From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com>
Date: Thu, 15 Dec 2022 17:22:39 +0100
Subject: [PATCH 24/25] extensions-index: Expose db information in guix index
 -V output

---
 guix/extensions/index.scm | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm
index 56841a4666..256a43d7fd 100644
--- a/guix/extensions/index.scm
+++ b/guix/extensions/index.scm
@@ -434,7 +434,12 @@ (define %options
            (lambda (opt name arg result)
              (catch 'sqlite-error
                (lambda ()
-                 (simple-format #t "Extension db version: ~a\n" (read-version-from-db (assoc-ref result 'db-path))))
+                 (let ((db-path (assoc-ref result 'db-path)))
+                   (simple-format
+                    #t
+                    "Extension local cache database:\n- path: ~a\n- version: ~a\n\n"
+                    db-path (read-version-from-db db-path))
+                   ))
                (lambda (key . arg) 'no-db-yet-so-nothing-to-display))
              (show-version-and-exit "guix index")))
    ;; index data out of the method (store or package)
-- 
2.38.1


From 93bb890ac2f887f338a9e2fa06e6d605bfc6722c Mon Sep 17 00:00:00 2001
From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com>
Date: Thu, 15 Dec 2022 17:22:53 +0100
Subject: [PATCH 25/25] extensions-index: Wrap index computations with
 progress bar output

---
 guix/extensions/index.scm | 48 +++++++++++++++++++++++++++------------
 1 file changed, 33 insertions(+), 15 deletions(-)

diff --git a/guix/extensions/index.scm b/guix/extensions/index.scm
index 256a43d7fd..12237f82ba 100644
--- a/guix/extensions/index.scm
+++ b/guix/extensions/index.scm
@@ -36,6 +36,8 @@ (define-module (guix extensions index)
   #:use-module (guix derivations)
   #:use-module (guix packages)
   #:use-module (guix profiles)
+  #:use-module ((guix progress) #:select (progress-reporter/bar
+                                          call-with-progress-reporter))
   #:use-module (guix sets)
   #:use-module ((guix utils) #:select (cache-directory))
   #:autoload (guix build utils) (find-files)
@@ -173,8 +175,8 @@ (define stmt-insert-file
       ((#(package-id))
        (when debug
          (format #t "(pkg, version, pkg-id): (~a, ~a, ~a)"
-                 package version package-id))
-       (pk 'package package-id package)
+                 package version package-id)
+         (pk 'package package-id package))
        (for-each (lambda (directory)
                    (define (strip file)
                      (string-drop file (+ (string-length directory) 1)))
@@ -229,14 +231,24 @@ (define* (index-packages-from-store db)
   "Insert all current packages from the local store into the DB."
   (with-store store
     (parameterize ((%graft? #f))
-      (fold-packages (lambda (package _)
-                       (run-with-store store
-                         (insert-package db package)))
-                     #t
-                     #:select? (lambda (package)
-                                 (and (not (hidden-package? package))
-                                      (not (package-superseded package))
-                                      (supported-package? package)))))))
+      (let* ((packages (fold-packages
+                        (lambda (package result)
+                          (cons package result))
+                        '()
+                        #:select? (lambda (package)
+                                    (and (not (hidden-package? package))
+                                         (not (package-superseded package))
+                                         (supported-package? package)))))
+             (nb-entries (length packages))
+             (prefix (format #f "Registering ~a packages" nb-entries))
+             (progress (progress-reporter/bar nb-entries prefix)))
+        (call-with-progress-reporter progress
+          (lambda (report)
+            (for-each (lambda (package)
+                        (run-with-store store
+                          (insert-package db package))
+                        (report))
+                      packages)))))))
 
 \f
 ;;;
@@ -287,11 +299,17 @@ (define (index-packages-from-manifests-with-db db-pathname)
   "Index packages entries into DB-PATHNAME from the system manifests."
   (call-with-database db-pathname
     (lambda (db)
-      (for-each (lambda (entry)
-                  (insert-manifest-entry db entry))
-                (let ((lst (profiles->manifest-entries (all-profiles))))
-                  (pk 'entries (length lst))
-                  lst)))))
+      (let* ((profiles (all-profiles))
+             (entries (profiles->manifest-entries profiles))
+             (nb-entries (length entries))
+             (prefix (format #f "Registering ~a packages" nb-entries))
+             (progress (progress-reporter/bar nb-entries prefix)))
+        (call-with-progress-reporter progress
+          (lambda (report)
+            (for-each (lambda (entry)
+                        (insert-manifest-entry db entry)
+                        (report))
+                      entries)))))))
 
 \f
 ;;;
-- 
2.38.1

[-- Attachment #1.3: Type: text/plain, Size: 1517 bytes --]

Ludovic Courtès <ludo@gnu.org> writes:

> Hi!
>
> "Antoine R. Dumont (@ardumont)" <ardumont@softwareheritage.org> skribis:
>
>> |-----------+-------------+----------+----------|
>> | Iteration | Host System | Time (s) | Packages |
>> |-----------+-------------+----------+----------|
>> | 1st       | Debian      |   121.88 |      284 |
>> |           | Guix System |   413.55 |      749 |
>> |-----------+-------------+----------+----------|
>> | 2nd       | Debian      |      1.3 |      101 |
>> |           | Guix System |    11.54 |      354 |
>> |-----------+-------------+----------+----------|
>
> Ah, that’s a significant difference.
>
> I guess we can keep both methods: the exhaustive one that goes over all
> packages, and the quick one.  Then we can have a switch to select the
> method.
>
> Typically, we may want to use the expensive one on the build farm to
> publish a full database, while on user’s machines we may want to default
> to the cheaper one.
>
>>> Oh, and progress bars too.
>>
>> I'm a bit unsettled on this. Hopefully it was mostly a joke ;)
>
> It wasn’t.  :-)
>
> In the manifest case, ‘all-profiles’ is almost instantaneous, so
> we immediately know the number of manifests we’ll be working on.
>
> In the package case, the number of packages is known ahead.
>
> The (guix progress) module provides helpers.
>
> But anyway, that’s more like icing on the cake, we can leave that for
> later.
>
> Thanks,
> Ludo’.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 877 bytes --]
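
The versioned-schema patch earlier in this message (the one adding `migrate-schema-to-version`) moves the database forward one version at a time until it reaches `application-version`. A minimal sketch of that stepping logic in plain Guile follows, with no SQLite involved. The names `%target-version`, `%migrations` and `migration-plan` are hypothetical, and the statements are placeholders rather than the extension's real SQL. Note that in the patch the recursive call passes the open `db` handle to a procedure whose parameter is a pathname, which may not be what was intended; the sketch sidesteps the question by looping on the version number only.

```scheme
;; Sketch only: %target-version, %migrations and migration-plan are
;; hypothetical names; the statements are placeholders, not real SQL
;; taken from the extension.
(define %target-version 3)

;; Each entry maps a schema version to the statement that upgrades the
;; database *to* that version.
(define %migrations
  '((2 . "-- statements upgrading the schema from 1 to 2")
    (3 . "-- statements upgrading the schema from 2 to 3")))

(define (migration-plan current-version)
  "Return, in order, the statements needed to go from CURRENT-VERSION to
%TARGET-VERSION.  Stop early when a step is missing."
  (let loop ((version current-version)
             (plan '()))
    (let ((step (assv-ref %migrations (+ version 1))))
      (if (and (< version %target-version) step)
          (loop (+ version 1) (cons step plan))
          (reverse plan)))))

;; Print the statements a version-1 database would go through.
(for-each (lambda (statement)
            (display statement)
            (newline))
          (migration-plan 1))
```

In a real implementation each statement would be handed to `sqlite-exec` and the new version recorded after every step, as the patch does, so that an interrupted migration can resume where it stopped.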
* Re: File search
  2022-12-15 17:03 ` Antoine R. Dumont (@ardumont)
@ 2022-12-19 21:25 ` Ludovic Courtès
  2022-12-19 22:44 ` zimoun
  2022-12-20 11:13 ` Antoine R. Dumont (@ardumont)
  0 siblings, 2 replies; 33+ messages in thread
From: Ludovic Courtès @ 2022-12-19 21:25 UTC (permalink / raw)
  To: Antoine R. Dumont (@ardumont); +Cc: guix-devel

Hi Antoine!

"Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com>
skribis:

> Here is the rough changelog:
>
> - The local db cache is now versioned. Migration will transparently
>   happen for users at each index command call (if need be).

Perfect!

> - The cli parsing got rewritten to be more flexible (inspired from
>   existing code from guix, notably `guix home`).
>
> - We can now choose the indexation method using the
>   `--with-method={store|manifests}` flag. The "manifests" method is the
>   default (see the help message for more details).

Excellent.  (I think we can call it ‘--method’, without “with”.)

> - Finally, the indexation methods are displayed using a progress bar.

Yay, I love progress bars.  :-)

> Heads up, I did not yet address the "output" part. Thanks @zimoun for
> the clarification btw ;)

Future work.  ;-)

>> In the package case, the number of packages is known ahead.
>
> @civodul For the index 'store' implementation, ^ I did not find that
> information.

(length (all-packages)) gives you the total number of packages you’re
going to traverse.  ‘all-packages’ is not instantaneous, but as a good
approximation the time spent in ‘all-packages’ can be ignored.

> So, as a costly implementation detail, I'm folding over all packages
> first to know the total number of packages (for the progress bar). And
> then another round trip to actually do the insert.

You could build up the package list just once and call ‘length’ on it.

> Hope you'll find it mostly to your taste!

I do!

> Note: I gather we'll rework the commits at some point (when it's ready)
> so I did not bother too much right now.

I think at this point we could consider integration in Guix proper,
under ‘guix/scripts’.  For that we could dismiss commit history.

That’ll entail extra work (d’oh!) such as fine-tuning, writing tests,
and writing a section for the manual.

The other option, if you prefer, would be to keep it in a separate repo
as an extension that people can install.  To me that would be more of a
temporary solution because I think it’s a useful feature that ought to
be provided by Guix proper eventually.

WDYT?  :-)

Ludo’.
* Re: File search
  2022-12-19 21:25 ` Ludovic Courtès
@ 2022-12-19 22:44 ` zimoun
  2022-12-20 11:13 ` Antoine R. Dumont (@ardumont)
  1 sibling, 0 replies; 33+ messages in thread
From: zimoun @ 2022-12-19 22:44 UTC (permalink / raw)
  To: Ludovic Courtès, Antoine R. Dumont (@ardumont); +Cc: guix-devel

Hi Ludo,

On Mon, 19 Dec 2022 at 22:25, Ludovic Courtès <ludo@gnu.org> wrote:

> I think at this point we could consider integration in Guix proper,
> under ‘guix/scripts’.  For that we could dismiss commit history.
>
> That’ll entail extra work (d’oh!) such as fine-tuning, writing tests,
> and writing a section for the manual.
>
> The other option, if you prefer, would be to keep it in a separate repo
> as an extension that people can install.  To me that would be more of a
> temporary solution because I think it’s a useful feature that ought to
> be provided by Guix proper eventually.

For what it is worth, I think it would be better to reduce the number of
scripts and instead have something more modular with extensions.

The tradeoff in maintenance cost is not clear, I agree.  On the other
hand, reducing the number of modules that “guix pull” processes would
help improve performance.  For instance, let's say that I am not
interested in “guix system”; computing its derivation at “guix pull”
time is still not negligible.  Another instance is all the plumbing
commands.

The manual would cover the extensions, but installing them would be an
opt-in choice by the user.  And it would reduce the load at “guix pull”
time.

My 2 cents. :-)

Cheers,
simon
* Re: File search
  2022-12-19 21:25 ` Ludovic Courtès
  2022-12-19 22:44 ` zimoun
@ 2022-12-20 11:13 ` Antoine R. Dumont (@ardumont)
  1 sibling, 0 replies; 33+ messages in thread
From: Antoine R. Dumont (@ardumont) @ 2022-12-20 11:13 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 5008 bytes --]

Hello Guix,

Thanks for the feedback!

Note: @civodul, assuming you are subscribed to the ml, I currently kept
you as a `To:` recipient but I can drop you from it, right?  I'm "also"
subscribed to the ml so you may drop me from the `To:` too (if i'm not
mistaken).

Ludovic Courtès <ludo@gnu.org> writes:

> Hi Antoine!
> "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com>
> skribis:
>
>> Here is the rough changelog:
>>
>> - The local db cache is now versioned. Migration will transparently
>>   happen for users at each index command call (if need be).
>
> Perfect!
>
>> - The cli parsing got rewritten to be more flexible (inspired from
>>   existing code from guix, notably `guix home`).
>>
>> - We can now choose the indexation method using the
>>   `--with-method={store|manifests}` flag. The "manifests" method is the
>>   default (see the help message for more details).
>
> Excellent.  (I think we can call it ‘--method’, without “with”.)

sure.

>> - Finally, the indexation methods are displayed using a progress bar.
>
> Yay, I love progress bars.  :-)

I have some work to improve the implementation to have more details in
the messages (typically make the "prefix" parameter of
`progress-reporter/bar` be a callable/function [1]...).  If that's of
some interest, i'll push forward (in another patch maybe?).

I stopped at the moment 'cause i had some strange issues with my env
(where i could not make guile see the changes for some reason...).
Anyway, that's of lesser priority than the rest so...

[1] or whatever the name is in guile context ;)

>> Heads up, I did not yet address the "output" part. Thanks @zimoun for
>> the clarification btw ;)
>
> Future work.  ;-)

ok!

If i'm getting out of all the modifications i need to do, and if I have
some energy left, I might attend to it too ;)

>>> In the package case, the number of packages is known ahead.
>>
>> @civodul For the index 'store' implementation, ^ I did not find that
>> information.
>
> (length (all-packages)) gives you the total number of packages you’re
> going to traverse.  ‘all-packages’ is not instantaneous, but as a good
> approximation the time spent in ‘all-packages’ can be ignored.

ok.  I missed that.  Although, the current call to `fold-packages` does
some package filtering first.  So, I guess that's why you call `(length
(all-packages))` an approximation (no filtering on that call), right?

>> So, as a costly implementation detail, I'm folding over all packages
>> first to know the total number of packages (for the progress bar). And
>> then another round trip to actually do the insert.
>
> You could build up the package list just once and call ‘length’ on it.

I explained myself wrongly.  That's what it is doing currently.  It does
that ^ folding and keeps the packages list, then does a `length` call on
it to have the exact number of entries.  And then does the actual loop
on that list to insert them in the db cache.

I naively thought that the `length` call on the list would cost one
round trip O(n), isn't it so?  Or is there some memoization somewhere?

>> Hope you'll find it mostly to your taste!
>
> I do!

\o/

>> Note: I gather we'll rework the commits at some point (when it's ready)
>> so I did not bother too much right now.
>
> I think at this point we could consider integration in Guix proper,
> under ‘guix/scripts’.  For that we could dismiss commit history.

Fine with me.  I'll do the adaptations to make it a script then.

> That’ll entail extra work (d’oh!) such as fine-tuning, writing tests,
> and writing a section for the manual.

Yes, i'm fine with that.

FWIW, I tried to have a look at how current unit tests were written last
week.  I did not grok it entirely yet.  I saw some script tests generate
some guile and I got lost there ;)  I'll have to double check.

I'll probably need some help for testing and documentation.  I guess
asking questions on irc is fine for that part, right?

> The other option, if you prefer, would be to keep it in a separate repo
> as an extension that people can install.  To me that would be more of a
> temporary solution because I think it’s a useful feature that ought to
> be provided by Guix proper eventually.

> WDYT?  :-)

If it's temporary then i'm fine with trying to do the extra work to
merge the work with proper Guix ;).  Although zimoun, down thread, has
some interesting remarks too.  I'll let you discuss those.

I have another extension idea [1] that might help anyway.  So we'll have
another opportunity to entertain the guix extensions features (if the
idea is interesting to proper Guix).

[1] `guix bug-report [--with-uname|--with-version|...]`

> Ludo’.

Cheers,
--
tony / Antoine R. Dumont (@ardumont)

-----------------------------------------------------------------
gpg fingerprint BF00 203D 741A C9D5 46A8 BE07 52E2 E984 0D10 C3B8

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 877 bytes --]
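
On the `length` question raised in this message: Scheme lists are singly linked, so `length` does walk the whole list (O(n)) and nothing is memoized. That walk only follows pointers, though, so it is likely to be negligible next to the per-package store and database work. Below is a minimal sketch of the "build the list once, count it, then iterate" shape in plain Guile. The procedure name `for-each-with-progress` and the toy data are hypothetical stand-ins: the filtered list plays the role of the packages selected by `fold-packages`, and the callback plays the role of `insert-package` followed by `(report)`.

```scheme
(use-modules (srfi srfi-1))             ;for 'filter' and 'iota'

;; Hypothetical helper, not part of Guix: call (PROC ITEM DONE TOTAL) for
;; each ITEM of LST and return TOTAL.  'length' walks LST once up front.
(define (for-each-with-progress proc lst)
  (let ((total (length lst)))
    (let loop ((rest lst)
               (done 1))
      (unless (null? rest)
        (proc (car rest) done total)
        (loop (cdr rest) (+ done 1))))
    total))

;; Toy stand-in for the filtered package list: built once, then reused
;; both for counting and for the actual work.
(define selected
  (filter even? (iota 10)))

(for-each-with-progress
 (lambda (item done total)
   (format #t "[~a/~a] processing ~a~%" done total item))
 selected)
```

The filtering pass, the `length` call and the processing loop are three linear traversals of an in-memory list, but only the last one does any expensive work per element.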