From: "Antoine R. Dumont (@ardumont)" <antoine.romain.dumont@gmail.com>
To: guix-devel@gnu.org
Subject: Re: File search
Date: Fri, 02 Dec 2022 19:22:27 +0100 [thread overview]
Message-ID: <87mt85r7h8.fsf@gmail.com> (raw)
In-Reply-To: <87pmd1r8kt.fsf@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 19430 bytes --]
Hello again,
As mentioned previously, I have iterated on the work @civodul started.
After discussing it a bit over irc, I proposed to try and push a bit
forward the discussion and the implementation [2] to see where this
goes.
After toying a bit with the initial code, I took the liberty to make it
a guix extension (we discussed it a bit with @zimoun). It was mostly to
get started with Guile (I know some lisp implems but not this one so i
had to familiarize myself with tools and whatnot ;). Anyway, that can be
reverted if you feel like it can be integrated as a Guix cli directly.
Currently, the implementation scans and indexes whatever package is
present in the local store of the machine's user. From nix/guix's
design, it makes sense to do it that way as it's likely that even though
you don't have all the tools locally, it may be already present as a
dependency of some high level tools you already use (it's just not
exposed because not declared in config.scm or home-configuration.scm).
You will find inlines (at the bottom) some cli usage calls [1] and the
current implementation [2].
Thanks in advance for any feedback ;)
Cheers,
--
tony / Antoine R. Dumont (@ardumont)
-----------------------------------------------------------------
gpg fingerprint BF00 203D 741A C9D5 46A8 BE07 52E2 E984 0D10 C3B8
[1] Usage sample:
--8<---------------cut here---------------start------------->8---
$ env | grep GUIX_EXTENSION
GUIX_EXTENSIONS_PATH=$HOME/repo/public/guix/guix/guix/extensions
$ guix index
;;; (package 1 "acl")
;;; (package 595 "shepherd")
;;; (package 596 "guile2.2-shepherd")
;;; (package 2 "htop")
;;; (package 7 "shadow")
;;; (package 6 "shepherd")
;;; (package 5 "autojump")
...
^C
$ guix index search shepherd
guile2.2-shepherd@0.9.3 /gnu/store/cq8r2vzg56ax0iidgs4biz3sv0b9jxp3-guile2.2-shepherd-0.9.3/bin/shepherd
shepherd@0.9.3 /gnu/store/a9jdd8kgckwlq97yw3pjqs6sy4lqgrfq-shepherd-0.9.3/bin/shepherd
shepherd@0.8.1 /gnu/store/vza48khbaq0fdmcsrn27xj5y5yy76z6l-shepherd-0.8.1/bin/shepherd
shepherd@0.9.1 /gnu/store/gxz67p4gx9g6rpxxpsgmhsybczimdlx5-shepherd-0.9.1/bin/shepherd
guix help | grep -C3 extension
repl read-eval-print loop (REPL) for interactive programming
extension commands
index Index packages to allow searching package for a given filename
Report bugs to: bug-guix@gnu.org.
$ guix help index # or: guix index [--help|-h]
Usage: guix index [OPTIONS...] [search FILE...]
Without FILE, index (package, file) relationships in the local store.
With 'search FILE', search for packages installing FILE.
Note: The internal cache is located at ~/.config/guix/locate-db.sqlite.
See --db-path for customization.
The valid values for OPTIONS are:
-h, --help Display this help and exit
-V, --version Display version information and exit
--db-path=DIR Change default location of the cache db
The valid values for ARGS are:
search FILE Search for packages installing the FILE (from cache db)
Report bugs to: bug-guix@gnu.org.
GNU Guix home page: <https://guix.gnu.org>
General help using Guix and GNU software: <https://guix.gnu.org/en/help/>
$ guix index --version # or guix index -V
guix locate (GNU Guix) 5ccb5837ccfb39af4e3e6399a0124997a187beb1
Copyright (C) 2022 the Guix authors
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
--8<---------------cut here---------------start------------->8---
[2] The code:
--8<---------------cut here---------------start------------->8---
;;; GNU Guix --- Functional package management for GNU
;;; Copyright © 2022 Ludovic Courtès <ludo@gnu.org>
;;;
;;; This file is part of GNU Guix.
;;;
;;; GNU Guix is free software; you can redistribute it and/or modify it
;;; under the terms of the GNU General Public License as published by
;;; the Free Software Foundation; either version 3 of the License, or (at
;;; your option) any later version.
;;;
;;; GNU Guix is distributed in the hope that it will be useful, but
;;; WITHOUT ANY WARRANTY; without even the implied warranty of
;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;;; GNU General Public License for more details.
;;;
;;; You should have received a copy of the GNU General Public License
;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>.
(define-module (guix extensions index)
#:use-module (guix config) ;; %guix-package-name, ...
#:use-module (guix ui) ;; display G_
#:use-module (guix scripts)
#:use-module (sqlite3)
#:use-module (ice-9 match)
#:use-module (guix describe)
#:use-module (guix store)
#:use-module (guix monads)
#:autoload (guix grafts) (%graft?)
#:use-module (guix derivations)
#:use-module (guix packages)
#:autoload (guix build utils) (find-files)
#:autoload (gnu packages) (fold-packages)
#:use-module (srfi srfi-1)
#:use-module (srfi srfi-9)
#:export (guix-index))
(define debug #f)
(define schema
"
create table if not exists Packages (
id integer primary key autoincrement not null,
name text not null,
version text not null,
unique (name, version) -- add uniqueness constraint
);
create table if not exists Directories (
id integer primary key autoincrement not null,
name text not null,
package integer not null,
foreign key (package) references Packages(id) on delete cascade,
unique (name, package) -- add uniqueness constraint
);
create table if not exists Files (
name text not null,
basename text not null,
directory integer not null,
foreign key (directory) references Directories(id) on delete cascade
unique (name, basename, directory) -- add uniqueness constraint
);
create index if not exists IndexFiles on Files(basename);")
(define (call-with-database file proc)
(let ((db (sqlite-open file)))
(dynamic-wind
(lambda () #t)
(lambda ()
(sqlite-exec db schema)
(proc db))
(lambda ()
(sqlite-close db)))))
(define (insert-files db package version directories)
"Insert files from DIRECTORIES as belonging to PACKAGE at VERSION."
(define stmt-select-package
(sqlite-prepare db "\
SELECT id FROM Packages WHERE name = :name AND version = :version;"
#:cache? #t))
(define stmt-insert-package
(sqlite-prepare db "\
INSERT OR IGNORE INTO Packages(name, version) -- to avoid spurious writes
VALUES (:name, :version);"
#:cache? #t))
(define stmt-select-directory
(sqlite-prepare db "\
SELECT id FROM Directories WHERE name = :name AND package = :package;"
#:cache? #t))
(define stmt-insert-directory
(sqlite-prepare db "\
INSERT OR IGNORE INTO Directories(name, package) -- to avoid spurious writes
VALUES (:name, :package);"
#:cache? #t))
(define stmt-insert-file
(sqlite-prepare db "\
INSERT OR IGNORE INTO Files(name, basename, directory)
VALUES (:name, :basename, :directory);"
#:cache? #t))
(sqlite-exec db "begin immediate;")
(sqlite-bind-arguments stmt-insert-package
#:name package
#:version version)
(sqlite-fold (const #t) #t stmt-insert-package)
(sqlite-bind-arguments stmt-select-package
#:name package
#:version version)
(match (sqlite-fold cons '() stmt-select-package)
((#(package-id))
(when debug
(format #t "(pkg, version, pkg-id): (~a, ~a, ~a)"
package version package-id))
(pk 'package package-id package)
(for-each (lambda (directory)
(define (strip file)
(string-drop file (+ (string-length directory) 1)))
(sqlite-reset stmt-insert-directory)
(sqlite-bind-arguments stmt-insert-directory
#:name directory
#:package package-id)
(sqlite-fold (const #t) #t stmt-insert-directory)
(sqlite-reset stmt-select-directory)
(sqlite-bind-arguments stmt-select-directory
#:name directory
#:package package-id)
(match (sqlite-fold cons '() stmt-select-directory)
((#(directory-id))
(when debug
(format #t "(name, package, dir-id): (~a, ~a, ~a)\n"
directory package-id directory-id))
(for-each (lambda (file)
;; If DIRECTORY is a symlink, (find-files
;; DIRECTORY) returns the DIRECTORY singleton.
(unless (string=? file directory)
(sqlite-reset stmt-insert-file)
(sqlite-bind-arguments stmt-insert-file
#:name (strip file)
#:basename
(basename file)
#:directory
directory-id)
(sqlite-fold (const #t) #t stmt-insert-file)))
(find-files directory)))))
directories)))
(sqlite-exec db "commit;"))
(define (insert-package db package)
"Insert all the files of PACKAGE into DB."
(mlet %store-monad ((drv (package->derivation package #:graft? #f)))
(match (derivation->output-paths drv)
(((labels . directories) ...)
(when (every file-exists? directories)
(insert-files db (package-name package) (package-version package)
directories))
(return #t)))))
(define (filter-public-current-supported package)
"Filter supported, not hidden (public) and not superseded (current) package."
(and (not (hidden-package? package))
(not (package-superseded package))
(supported-package? package)))
(define (filter-supported-package package)
"Filter supported package (package might be hidden or superseded)."
(and (supported-package? package)))
(define (no-filter package) "No filtering on package" #t)
(define* (insert-packages db #:optional (filter-policy filter-public-current-supported))
"Insert all current packages matching `filter-package-policy` into DB."
(with-store store
(parameterize ((%graft? #f))
(fold-packages (lambda (package _)
(run-with-store store
(insert-package db package)))
#t
#:select? filter-policy))))
(define-record-type <package-match>
(package-match name version file)
package-match?
(name package-match-name)
(version package-match-version)
(file package-match-file))
(define (matching-packages db file)
"Return unique <package-match> corresponding to packages containing FILE."
(define lookup-stmt
(sqlite-prepare db "\
SELECT Packages.name, Packages.version, Directories.name, Files.name
FROM Packages
INNER JOIN Files, Directories
ON files.basename = :file
AND directories.id = files.directory
AND packages.id = directories.package;"))
(sqlite-bind-arguments lookup-stmt #:file file)
(sqlite-fold (lambda (result lst)
(match result
(#(package version directory file)
(cons (package-match package version
(string-append directory "/" file))
lst))))
'() lookup-stmt))
\f
(define (index-packages-with-db db-pathname)
"Index packages using db at location DB-PATHNAME."
(call-with-database db-pathname
(lambda (db)
(insert-packages db no-filter))))
(define (matching-packages-with-db db-pathname file)
"Compute list of packages referencing FILE using db at DB-PATHNAME."
(call-with-database db-pathname
(lambda (db)
(matching-packages db file))))
(define (print-matching-results matches)
"Print the MATCHES matching results."
(for-each (lambda (result)
(format #t "~20a ~a~%"
(string-append (package-match-name result)
"@" (package-match-version result))
(package-match-file result)))
matches))
(define default-db-path
(let ((local-config-path
(and=> (getenv "HOME")
(lambda (home)
(string-append home "/.config/guix/")))))
(string-append local-config-path "locate-db.sqlite")))
(define (show-bug-report-information)
;; TRANSLATORS: The placeholder indicates the bug-reporting address for this
;; package. Please add another line saying "Report translation bugs to
;; ...\n" with the address for translation bugs (typically your translation
;; team's web or email address).
(format #t (G_ "
Report bugs to: ~a.") %guix-bug-report-address)
(format #t (G_ "
~a home page: <~a>") %guix-package-name %guix-home-page-url)
(format #t (G_ "
General help using Guix and GNU software: <~a>")
;; TRANSLATORS: Change the "/en" bit of this URL appropriately if
;; the web site is translated in your language.
(G_ "https://guix.gnu.org/en/help/"))
(newline))
(define (show-help)
(display (G_ "Usage: guix index [OPTIONS...] [search FILE...]
Without FILE, index (package, file) relationships in the local store.
With 'search FILE', search for packages installing FILEx;x.\n
Note: The internal cache is located at ~/.config/guix/locate-db.sqlite.
See --db-path for customization.\n"))
(newline)
(display (G_ "The valid values for OPTIONS are:"))
(newline)
(display (G_ "
-h, --help Display this help and exit"))
(display (G_ "
-V, --version Display version information and exit"))
(display (G_ "
--db-path=DIR Change default location of the cache db"))
(newline)
(newline)
(display (G_ "The valid values for ARGS are:"))
(newline)
(display (G_ "
search FILE Search for packages installing the FILE (from cache db)"))
(newline)
(show-bug-report-information))
(define-command (guix-index . args)
(category extension)
(synopsis "Index packages to allow searching package for a given filename")
(define (parse-db-args args)
"Parsing of string key=value where we are only interested in 'value'"
(match (string-split args #\=)
((unused db-path)
db-path)
(_ #f)))
(define (display-help-and-exit)
(show-help)
(exit 0))
(match args
((or ("-h") ("--help"))
(display-help-and-exit))
((or ("-V") ("--version"))
(show-version-and-exit "guix locate"))
((db-path-args)
(let ((db-path (parse-db-args db-path-args)))
(if db-path
(index-packages-with-db db-path)
(display-help-and-exit))))
(("search" file)
(let ((matches (matching-packages-with-db default-db-path file)))
(print-matching-results matches)
(exit (pair? matches))))
((db-path-args "search" file)
(let ((db-path (parse-db-args db-path-args)))
(if db-path
(let ((matches (matching-packages-with-db db-path file)))
(print-matching-results matches)
(exit (pair? matches)))
(display-help-and-exit))))
(_ ;; index by default
(index-packages-with-db default-db-path))))
--8<---------------cut here---------------start------------->8---
antoine.romain.dumont@gmail.com writes:
> Hello Guix!
>
> Guix is top so thanks for the awesome work!
>
> Just to give some feedback on this thread. That's a good news that the
> file search functionality in the radar.
>
>> Lately I found myself going several times to
>> <https://packages.debian.org> to look for packages providing a given
>> file and I thought it’s time to do something about it.
>
> I've finally started to set up my machine with Guix system (and
> Guix Home). Finding out where such program or cli is packaged is
> definitely something that I need to port my existing use (from mainly
> nixified debian or nixos machines) to Guix.
>
> And to answer such question, I used existing "offline" programs in my
> machines. I've bounced back and forth between `nix-locate` and `apt-file
> search` to determine approximately the packages in Guix (names aren't
> usually that different).
>
> Hence, as a user, it's one of my expectation that the Guix cli provides
> some equivalent program to lookup from file to package ;).
>
>> The script below creates an SQLite database for the current set of
>> packages, but only for those already in the store:
>>
>> Guix repl file-database.scm populate
>>
>> That creates /tmp/db; it took about 25mn on berlin, for 18K packages.
>> Then you can run, say:
>>
>> Guix repl file-database.scm search boot-9.scm
>>
>> to find which packages provide a file named ‘boot-9.scm’. That part is
>> instantaneous.
>>
>> The database for 18K packages is quite big:
>>
>> --8<---------------cut here---------------start------------->8---
>> $ du -h /tmp/db*
>> 389M /tmp/db
>> 82M /tmp/db.gz
>> 61M /tmp/db.zst
>> --8<---------------cut here---------------end--------------->8---
>
> For information, in a most recent implementation (@civodul provided me
> in #guix-devel), I noticed multiple calls to the indexation step would
> duplicate information (at all levels packages, files, directories). So
> that might have had an impact in the extracted values above (if ludo had
> triggered multiple times the script at the time).
>
> Jsyk, I have started iterating a bit over that provided implementation
> (and fixed the current caveat mentioned), added some help message...
> I'll follow up with it in a bit (same thread) to have some more feedback
> on it.
>
>> How do we expose that information? There are several criteria I can
>> think of: accuracy, freshness, privacy, responsiveness, off-line
>> operation.
>>
>> I think accuracy (making sure you get results that correspond precisely
>> to, say, your current channel revisions and your current system) is not
>> a high priority: some result is better than no result.
>
> I definitely agree with this. At least from the offline use perspective.
> I did not focus at all on the second part of the problematic ("online"
> and distribution use).
>
>> Likewise for freshness: results for an older version of a given
>> package may still be valid now.
>
> Indeed.
>
> Cheers,
> --
> tony / Antoine R. Dumont (@ardumont)
>
> -----------------------------------------------------------------
> gpg fingerprint BF00 203D 741A C9D5 46A8 BE07 52E2 E984 0D10 C3B8
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 877 bytes --]
next prev parent reply other threads:[~2022-12-02 18:42 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-02 17:58 File search antoine.romain.dumont
2022-12-02 18:22 ` Antoine R. Dumont (@ardumont) [this message]
2022-12-03 18:19 ` Ludovic Courtès
2022-12-04 16:35 ` Antoine R. Dumont (@ardumont)
2022-12-06 10:01 ` Ludovic Courtès
2022-12-06 12:59 ` zimoun
2022-12-06 18:27 ` (
2022-12-08 15:41 ` Ludovic Courtès
2022-12-09 10:05 ` Antoine R. Dumont (@ardumont)
2022-12-09 18:05 ` zimoun
2022-12-11 10:22 ` Ludovic Courtès
2022-12-15 17:03 ` Antoine R. Dumont (@ardumont)
2022-12-19 21:25 ` Ludovic Courtès
2022-12-19 22:44 ` zimoun
2022-12-20 11:13 ` Antoine R. Dumont (@ardumont)
-- strict thread matches above, loose matches on Subject: below --
2022-01-21 9:03 Ludovic Courtès
2022-01-21 10:35 ` Mathieu Othacehe
2022-01-22 0:35 ` Ludovic Courtès
2022-01-21 19:00 ` Vagrant Cascadian
2022-01-22 0:37 ` Ludovic Courtès
2022-01-22 2:53 ` Maxim Cournoyer
2022-01-25 11:15 ` Ludovic Courtès
2022-01-25 11:20 ` Oliver Propst
2022-01-25 11:22 ` Oliver Propst
2022-01-22 4:46 ` raingloom
2022-01-22 7:55 ` Ricardo Wurmus
2022-01-24 15:48 ` Ludovic Courtès
2022-01-24 17:03 ` Ricardo Wurmus
2022-02-02 16:14 ` Maxim Cournoyer
2022-02-05 11:15 ` Ludovic Courtès
2022-01-25 23:45 ` Ryan Prior
2022-02-05 11:18 ` Ludovic Courtès
2022-02-06 13:27 ` André A. Gomes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://guix.gnu.org/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87mt85r7h8.fsf@gmail.com \
--to=antoine.romain.dumont@gmail.com \
--cc=guix-devel@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/guix.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).