Hello Guix!

Lately I found myself going several times to look for packages providing a given file, and I thought it’s time to do something about it.

The script below creates an SQLite database for the current set of packages, but only for those already in the store:

  guix repl file-database.scm populate

That creates /tmp/db; it took about 25 minutes on berlin, for 18K packages. Then you can run, say:

  guix repl file-database.scm search boot-9.scm

to find which packages provide a file named ‘boot-9.scm’. That part is instantaneous.

The database for 18K packages is quite big:

--8<---------------cut here---------------start------------->8---
$ du -h /tmp/db*
389M	/tmp/db
82M	/tmp/db.gz
61M	/tmp/db.zst
--8<---------------cut here---------------end--------------->8---

How do we expose that information? There are several criteria I can think of: accuracy, freshness, privacy, responsiveness, off-line operation.

I think accuracy (making sure you get results that correspond precisely to, say, your current channel revisions and your current system) is not a high priority: some result is better than no result. Likewise for freshness: results for an older version of a given package may still be valid now.

In terms of privacy, I think it’s better if we can avoid making one request per file searched for. Off-line operation would be sweet, and it comes with responsiveness; fast off-line search is necessary for things like ‘command-not-found’ (where the shell tells you what package to install when a command is not found).

Based on that, it is tempting to just distribute a full database from ci.guix, say, that the client command would regularly fetch. The downside is that that’s quite a lot of data to download; if you use the file search command infrequently, you might find yourself spending more time downloading the database than actually searching it.

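To make the ‘populate’ and ‘search’ steps above a bit more concrete, here is a minimal sketch of the kind of database involved. It is not the file-database.scm script: it is written in Python rather than Guile, and the schema, the function names (open_db, add_package, search) and the sample package data are all made up for illustration. The point is only that an index on the file’s base name is what makes the lookup fast.

```python
import posixpath
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the database; the schema is an assumption."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS packages (
            id      INTEGER PRIMARY KEY,
            name    TEXT NOT NULL,
            version TEXT NOT NULL,
            UNIQUE (name, version)
        );
        CREATE TABLE IF NOT EXISTS files (
            package   INTEGER NOT NULL REFERENCES packages (id),
            directory TEXT NOT NULL,
            basename  TEXT NOT NULL
        );
        -- Searches are by base name, so that is the column to index.
        CREATE INDEX IF NOT EXISTS files_basename ON files (basename);
    """)
    return db


def add_package(db, name, version, files):
    """Record FILES (paths relative to the package's output) for a package."""
    db.execute("INSERT OR IGNORE INTO packages (name, version) VALUES (?, ?)",
               (name, version))
    (pkg_id,) = db.execute(
        "SELECT id FROM packages WHERE name = ? AND version = ?",
        (name, version)).fetchone()
    # Drop any previous file list so that re-populating is idempotent.
    db.execute("DELETE FROM files WHERE package = ?", (pkg_id,))
    db.executemany(
        "INSERT INTO files (package, directory, basename) VALUES (?, ?, ?)",
        [(pkg_id, *posixpath.split(f)) for f in files])
    db.commit()


def search(db, basename):
    """Return (name, version, path) for each package providing BASENAME."""
    rows = db.execute("""
        SELECT p.name, p.version, f.directory, f.basename
        FROM files f JOIN packages p ON p.id = f.package
        WHERE f.basename = ?
        ORDER BY p.name, p.version""", (basename,))
    return [(name, version, posixpath.join(directory, base))
            for name, version, directory, base in rows]


if __name__ == "__main__":
    db = open_db()
    # Hypothetical package data, for illustration only.
    add_package(db, "guile", "3.0.9",
                ["/share/guile/3.0/ice-9/boot-9.scm", "/bin/guile"])
    add_package(db, "hello", "2.12", ["/bin/hello"])
    for name, version, path in search(db, "boot-9.scm"):
        print(f"{name}@{version}: {path}")
```

Storing the directory and base name in separate columns means a search is an exact match on an indexed column rather than a substring scan over full paths.
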
We could have a hybrid solution: distribute a database that contains only files in /bin and /sbin (it should be much smaller), and for everything else, resort to a web service (the Data Service could be extended to include file lists). That way, we’d have fast privacy-respecting search for command names, and on-line search for everything else.

Thoughts?

Ludo’.