From: Pierre Neidhardt <mail@ambrevar.xyz>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: guix-devel@gnu.org, Mathieu Othacehe <othacehe@gnu.org>
Subject: Re: File search progress: database review and question on triggers
Date: Mon, 05 Oct 2020 20:53:01 +0200
Message-ID: <875z7oijxu.fsf@ambrevar.xyz>
In-Reply-To: <87k0w4zw8q.fsf@gnu.org>
Hi Ludo!
Ludovic Courtès <ludo@gnu.org> writes:
> Nice!
Thanks!
> Could you post a summary of what you have done, what’s left to do, and
> how you’d like to integrate it? (If you’ve already done it, my
> apologies, but you can resend a link. :-))
What I've done: mostly a database benchmark.
- Textual database: slow and not lighter than SQLite. Not worth it I believe.
- SQLite without full-text search: fast, supports classic patterns
  (e.g. "foo*bar") but does not support word permutations.
- SQLite with full-text search: fast, supports word permutations but
  does not support suffix-matching (e.g. "bar" won't match "foobar").
  Size is about the same as without full-text search.
- Include synopsis and descriptions. Maybe we should include all fields
  that are searched by `guix search`. This incurs a cost on the
  database size but it would fix the `guix search` speed issue. Size
  increases by some 10 MiB.
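
To make the trade-off between the two SQLite variants concrete, here is a minimal sketch using Python's standard `sqlite3` module. The table names (`files`, `files_fts`), the columns, and the sample rows are made up for illustration and are not the schema from the benchmark. It also assumes the SQLite library behind Python was built with FTS5, which is common but not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per (package, file path) pair.
conn.execute("CREATE TABLE files (package TEXT, path TEXT)")
# The same rows in an FTS5 virtual table (requires an FTS5-enabled SQLite).
conn.execute("CREATE VIRTUAL TABLE files_fts USING fts5(package, path)")

rows = [
    ("foobar", "/bin/foobar"),
    ("hello", "/share/doc/hello/README"),
]
conn.executemany("INSERT INTO files VALUES (?, ?)", rows)
conn.executemany("INSERT INTO files_fts VALUES (?, ?)", rows)


def glob_search(pattern):
    """Classic pattern search on the plain table (order of words matters)."""
    cur = conn.execute("SELECT package FROM files WHERE path GLOB ?", (pattern,))
    return [row[0] for row in cur]


def fts_search(query):
    """Full-text search: whole tokens, in any order."""
    cur = conn.execute(
        "SELECT package FROM files_fts WHERE files_fts MATCH ?", (query,)
    )
    return [row[0] for row in cur]


# Plain table: patterns work, word permutations do not.
print(glob_search("*foo*bar"))       # pattern matches /bin/foobar
print(glob_search("*README*doc*"))   # words in the wrong order: no match

# FTS5: word permutations work, suffix matching does not.
print(fts_search("README doc"))      # matches regardless of word order
print(fts_search("bar"))             # "bar" is not a token of "foobar"
```

With the default FTS5 tokenizer, a path is split into whole words, which is why "bar" alone does not find "foobar" while a prefix query such as `foo*` would.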
I say we go with SQLite full-text search for now, with all package
details. Switching away from full-text search is only a minor
adjustment, which we can decide on later when merging the final
patch. The same goes if we decide not to include the description,
synopsis, etc.
What's left to do:
- Populate the database on demand, either after a `guix build` or from a
  `guix filesearch...`. This is important so that `guix filesearch`
  works on packages built locally. If we hook into `guix build`, I need
  help finding where to plug it in.
- Adapt Cuirass so that it builds its file database.
  I need pointers to get started here.
- Sync the databases from the substitute server to the client when
  running `guix filesearch`. For this I suggest we send the compressed
  database corresponding to a guix generation over the network (around
  10 MiB). Not sure sending just the delta is worth it.
- Find a way to garbage-collect the database(s). My intuition is that
  we should have 1 database per Guix checkout and when we `guix gc` a
  Guix checkout we collect the corresponding database.
  I would store the databases in /var/guix/...
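
The "one database per Guix checkout" idea can be sketched roughly as follows. Everything here is a hypothetical illustration: the file naming (`<commit>.db`), the helper names, and the notion of a "live" set of commits are assumptions, and a temporary directory stands in for the real storage location. It is not how `guix gc` works internally.

```python
import tempfile
from pathlib import Path


def database_path(root, commit):
    """Hypothetical layout: one database file per Guix checkout (commit)."""
    return Path(root) / f"{commit}.db"


def collect_databases(root, live_commits):
    """Delete databases whose checkout is no longer live.

    Returns the names of the removed files, sorted.
    """
    removed = []
    for db in sorted(Path(root).glob("*.db")):
        if db.stem not in live_commits:
            db.unlink()
            removed.append(db.name)
    return removed


# Demo in a throwaway directory standing in for the real storage location.
with tempfile.TemporaryDirectory() as root:
    for commit in ("aaa111", "bbb222", "ccc333"):
        database_path(root, commit).touch()

    # Pretend only the checkout "bbb222" survived garbage collection.
    removed = collect_databases(root, live_commits={"bbb222"})
    remaining = sorted(p.name for p in Path(root).glob("*.db"))

print(removed)
print(remaining)
```

Keying the database on the checkout keeps collection simple: once a checkout is gone, its database can go with it, with no per-package bookkeeping.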
Comments and help welcome! :)
--
Pierre Neidhardt
https://ambrevar.xyz/