From: "Ludovic Courtès" <ludo@gnu.org>
To: Pierre Neidhardt <mail@ambrevar.xyz>
Cc: guix-devel@gnu.org, Mathieu Othacehe <othacehe@gnu.org>
Subject: Re: File search progress: database review and question on triggers
Date: Mon, 12 Oct 2020 12:20:43 +0200 [thread overview]
Message-ID: <87eem3u4n8.fsf@gnu.org> (raw)
In-Reply-To: <875z7oijxu.fsf@ambrevar.xyz> (Pierre Neidhardt's message of "Mon, 05 Oct 2020 20:53:01 +0200")
Hi!
Pierre Neidhardt <mail@ambrevar.xyz> skribis:
>> Could you post a summary of what you have done, what’s left to do, and
>> how you’d like to integrate it? (If you’ve already done it, my
>> apologies, but you can resend a link. :-))
>
> What I've done: mostly a database benchmark.
>
> - Textual database: slow and not lighter than SQLite. Not worth it I believe.
>
> - SQLite without full-text search: fast, supports classic patterns
> (e.g. "foo*bar") but does not support word permutations.
>
> - SQLite with full-text search: fast, supports word permutations but
> does not support suffix-matching (e.g. "bar" won't match "foobar").
> Size is about the same as without full-text search.
>
> - Include synopsis and descriptions. Maybe we should include all fields
> that are searched by `guix search`. This incurs a cost on the
> database size but it would fix the `guix search` speed issue. Size
> increases by some 10 MiB.
Oh so this is going beyond file search, right?
Perhaps it would make sense to focus on file search only as a first
step, and see what can be done with synopses/descriptions (like Arun and
zimoun did before) later, separately?
> What's left to do:
>
> - Populate the database on demand, either after a `guix build` or from a
> `guix filesearch...`. This is important so that `guix filesearch`
> works on packages built locally. If `guix build`, I need help to know
> where to plug it in.
>
> - Adapt Cuirass so that it builds its file database.
> I need pointers to get started here.
>
> - Sync the databases from the substitute server to the client when
> running `guix filesearch`. For this I suggest we send the compressed
> database corresponding to a guix generation over the network (around
> 10 MiB). Not sure sending just the delta is worth it.
It would be nice to see whether/how this could be integrated with
third-party channels. Of course it’s not a priority, but while
designing this feature, we should keep in mind that we might want
third-party channel authors to be able to offer such a database for
their packages.
> - Find a way to garbage-collect the database(s). My intuition is that
> we should have 1 database per Guix checkout and when we `guix gc` a
> Guix checkout we collect the corresponding database.
If we download a fresh database every time, we might as well simply
overwrite the one we have?
Thanks,
Ludo’.
next prev parent reply other threads:[~2020-10-12 10:21 UTC|newest]
Thread overview: 73+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-10 14:32 File search progress: database review and question on triggers Pierre Neidhardt
2020-08-11 9:43 ` Mathieu Othacehe
2020-08-11 12:35 ` Pierre Neidhardt
2020-08-15 12:48 ` Hartmut Goebel
2020-08-11 15:43 ` Ricardo Wurmus
2020-08-11 17:54 ` Pierre Neidhardt
2020-08-11 17:58 ` Pierre Neidhardt
2020-08-11 20:08 ` Ricardo Wurmus
2020-08-12 19:10 ` Pierre Neidhardt
2020-08-12 20:13 ` Julien Lepiller
2020-08-12 20:43 ` Pierre Neidhardt
2020-08-12 21:29 ` Julien Lepiller
2020-08-12 22:29 ` Ricardo Wurmus
2020-08-13 6:55 ` Pierre Neidhardt
2020-08-13 6:52 ` Pierre Neidhardt
2020-08-13 9:34 ` Ricardo Wurmus
2020-08-13 10:04 ` Pierre Neidhardt
2020-08-15 12:47 ` Hartmut Goebel
2020-08-15 21:20 ` Bengt Richter
2020-08-16 8:18 ` Hartmut Goebel
2020-08-12 20:32 ` Pierre Neidhardt
2020-08-13 0:17 ` Arun Isaac
2020-08-13 6:58 ` Pierre Neidhardt
2020-08-13 9:40 ` Pierre Neidhardt
2020-08-13 10:08 ` Pierre Neidhardt
2020-08-13 11:47 ` Ricardo Wurmus
2020-08-13 13:44 ` Pierre Neidhardt
2020-08-13 12:20 ` Arun Isaac
2020-08-13 13:53 ` Pierre Neidhardt
2020-08-13 15:14 ` Arun Isaac
2020-08-13 15:36 ` Pierre Neidhardt
2020-08-13 15:56 ` Pierre Neidhardt
2020-08-15 19:33 ` Arun Isaac
2020-08-24 8:29 ` Pierre Neidhardt
2020-08-24 10:53 ` Pierre Neidhardt
2020-09-04 19:15 ` Arun Isaac
2020-09-05 7:48 ` Pierre Neidhardt
2020-09-06 9:25 ` Arun Isaac
2020-09-06 10:05 ` Pierre Neidhardt
2020-09-06 10:33 ` Arun Isaac
2020-08-18 14:58 ` File search progress: database review and question on triggers OFF TOPIC PRAISE Joshua Branson
2020-08-27 10:00 ` File search progress: database review and question on triggers zimoun
2020-08-27 11:15 ` Pierre Neidhardt
2020-08-27 12:56 ` zimoun
2020-08-27 13:19 ` Pierre Neidhardt
2020-09-26 14:04 ` Pierre Neidhardt
2020-09-26 14:12 ` Pierre Neidhardt
2020-10-05 12:35 ` Ludovic Courtès
2020-10-05 18:53 ` Pierre Neidhardt
2020-10-09 21:16 ` zimoun
2020-10-10 8:57 ` Pierre Neidhardt
2020-10-10 14:58 ` zimoun
2020-10-12 10:16 ` Ludovic Courtès
2020-10-12 11:18 ` Pierre Neidhardt
2020-10-13 13:48 ` Ludovic Courtès
2020-10-13 13:59 ` Pierre Neidhardt
2020-10-10 16:03 ` zimoun
2020-10-11 11:19 ` Pierre Neidhardt
2020-10-11 13:02 ` zimoun
2020-10-11 14:25 ` Pierre Neidhardt
2020-10-11 16:05 ` zimoun
2020-10-12 10:20 ` Ludovic Courtès [this message]
2020-10-12 11:21 ` Pierre Neidhardt
2020-10-13 13:45 ` Ludovic Courtès
2020-10-13 13:56 ` Pierre Neidhardt
2020-10-13 21:22 ` Ludovic Courtès
2020-10-14 7:50 ` Pierre Neidhardt
2020-10-16 10:30 ` Ludovic Courtès
2020-10-17 9:14 ` Pierre Neidhardt
2020-10-17 19:17 ` Pierre Neidhardt
2020-10-21 9:53 ` Ludovic Courtès
2020-10-21 9:58 ` Pierre Neidhardt
2020-10-12 11:23 ` zimoun
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://guix.gnu.org/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87eem3u4n8.fsf@gnu.org \
--to=ludo@gnu.org \
--cc=guix-devel@gnu.org \
--cc=mail@ambrevar.xyz \
--cc=othacehe@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/guix.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).