From: Mathieu Othacehe <othacehe@gnu.org>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: Guix Devel <guix-devel@gnu.org>
Subject: Re: File search
Date: Fri, 21 Jan 2022 11:35:36 +0100
Message-ID: <87czklwf47.fsf@gnu.org>
In-Reply-To: <8735lh5ukw.fsf@inria.fr> ("Ludovic Courtès"'s message of "Fri, 21 Jan 2022 10:03:43 +0100")


Hello Ludo!

> Lately I found myself going several times to
> <https://packages.debian.org> to look for packages providing a given
> file and I thought it’s time to do something about it.

Yeah, I also think about it regularly, but I keep giving up because
setting up this mechanism properly turns out to be much more complex
than initially expected.

> The script below creates an SQLite database for the current set of
> packages, but only for those already in the store:
>
>   guix repl file-database.scm populate
>
> That creates /tmp/db; it took about 25mn on berlin, for 18K packages.
> Then you can run, say:
>
>   guix repl file-database.scm search boot-9.scm

Nice proof of concept :).

> I think accuracy (making sure you get results that correspond precisely
> to, say, your current channel revisions and your current system) is not
> a high priority: some result is better than no result.  Likewise for
> freshness: results for an older version of a given package may still be
> valid now.

Agreed.

> In terms of privacy, I think it’s better if we can avoid making one
> request per file searched for.  Off-line operation would be sweet, and
> it comes with responsiveness; fast off-line search is necessary for
> things like ‘command-not-found’ (where the shell tells you what package
> to install when a command is not found).

Yeah, that's the tricky part. In terms of maintenance, it would probably
be easier to have Cuirass index the packages it's building, store the
results in the PostgreSQL database and serve them using the Cuirass web
server. The main advantage is that we rely on a single database, which
is very important in my opinion. It's also relatively easy to set up.
The drawback is that you need to be online to access this API.

If we instead decide to periodically build an SQLite database indexing
all the packages, in a cron job or so, users would still need to
download it, which would be an expensive operation as you mentioned. It
would also be difficult to index custom Guix channels with that
approach.
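
To make the SQLite option a bit more concrete, here is a rough sketch
of what such an index and lookup could look like. It is only an
illustration in Python, not what file-database.scm does: the schema,
the function names and the sample package data are all assumptions
made up for this example.

```python
import posixpath
import sqlite3


def create_index(conn):
    """Create a hypothetical schema: one row per file shipped by a package."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " package TEXT NOT NULL,"
        " version TEXT NOT NULL,"
        " path TEXT NOT NULL,"
        " basename TEXT NOT NULL)"
    )
    # Searches are by file name, so index the basename column.
    conn.execute("CREATE INDEX IF NOT EXISTS files_basename ON files (basename)")


def add_package(conn, package, version, paths):
    """Record every file of one package, storing the basename for fast lookup."""
    rows = [(package, version, p, posixpath.basename(p)) for p in paths]
    conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", rows)


def search(conn, basename):
    """Return (package, version, path) tuples for files with this name."""
    cur = conn.execute(
        "SELECT package, version, path FROM files"
        " WHERE basename = ? ORDER BY package, version, path",
        (basename,),
    )
    return cur.fetchall()


# Made-up sample data, only to exercise the functions above.
conn = sqlite3.connect(":memory:")
create_index(conn)
add_package(conn, "guile", "3.0.7",
            ["share/guile/3.0/ice-9/boot-9.scm", "bin/guile"])
add_package(conn, "coreutils", "8.32", ["bin/ls"])
print(search(conn, "boot-9.scm"))
```

The lookup itself is cheap once the database is local; the cost is in
building the database and getting it to users.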

Another solution could be to have guix publish index the files from the
NARs in its cache and provide a file search API. That would still
require being online, but it would make it possible to search across
multiple publish servers, and hence possibly multiple Guix channels.
Packages that do not have substitutes could not be searched, which is a
significant drawback. I would still lean towards that option.
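
As a rough sketch of the client side of that idea: query every
configured server and merge the answers, remembering which server
reported each hit. Note that guix publish has no file search API
today, so everything here is hypothetical: the server URLs, the shape
of the results and the function names are assumptions, and the
"servers" are plain Python callables standing in for HTTP requests.

```python
def search_all(servers, filename):
    """Merge file-search results from several servers.

    `servers` maps a server URL to a callable taking a file name and
    returning a list of (package, path) pairs.  The result maps each
    (package, path) pair to the list of servers that reported it.
    """
    merged = {}
    for url, query in servers.items():
        try:
            results = query(filename)
        except OSError:
            # An unreachable server should not break the whole search.
            continue
        for package, path in results:
            merged.setdefault((package, path), []).append(url)
    return merged


def fake_server(index):
    """Stand-in for a remote server: answers from an in-memory dict."""
    return lambda filename: index.get(filename, [])


def unreachable(filename):
    """Stand-in for a server that is down."""
    raise OSError("server unreachable")


# Made-up servers and data, only to exercise search_all.
boot9 = ("guile", "share/guile/3.0/ice-9/boot-9.scm")
servers = {
    "https://ci.example.org": fake_server({"boot-9.scm": [boot9]}),
    "https://channel.example.org": fake_server({"boot-9.scm": [boot9]}),
    "https://down.example.org": unreachable,
}
print(search_all(servers, "boot-9.scm"))
```

Keeping track of which server answered would also let the user see
which substitute server (and so which channel) a result comes from.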

WDYT?

Thanks,

Mathieu

