From: Pierre Neidhardt
To: Ludovic Courtès
Cc: guix-devel@gnu.org, Mathieu Othacehe
Subject: Re: File search progress: database review and question on triggers
Date: Mon, 05 Oct 2020 20:53:01 +0200
Message-ID: <875z7oijxu.fsf@ambrevar.xyz>
In-Reply-To: <87k0w4zw8q.fsf@gnu.org>
References: <87sgcuh8rb.fsf@ambrevar.xyz> <86imd4e7cr.fsf@gmail.com>
 <87eenspcf8.fsf@ambrevar.xyz> <865z94dz83.fsf@gmail.com>
 <87zh6gns4l.fsf@ambrevar.xyz> <87zh5c7hx6.fsf@ambrevar.xyz>
 <87k0w4zw8q.fsf@gnu.org>
List-Id: "Development of GNU Guix and the GNU System distribution."

Hi Ludo!

Ludovic Courtès writes:

> Nice!

Thanks!

> Could you post a summary of what you have done, what’s left to do, and
> how you’d like to integrate it?  (If you’ve already done it, my
> apologies, but you can resend a link.  :-))

What I've done: mostly a database benchmark.

- Textual database: slow and not lighter than SQLite.  Not worth it, I
  believe.
- SQLite without full-text search: fast, supports classic patterns
  (e.g. "foo*bar") but does not support word permutations.
- SQLite with full-text search: fast, supports word permutations but
  does not support suffix matching (e.g. "bar" won't match "foobar").
  Size is about the same as without full-text search.
- Include synopsis and descriptions.  Maybe we should include all fields
  that are searched by `guix search`.  This incurs a cost on the
  database size but it would fix the `guix search` speed issue.  Size
  increases by some 10 MiB.

I say we go with SQLite full-text search for now, with all package
details.  Switching to the variant without full-text search is only a
minor adjustment, which we can decide on later when merging the final
patch.  The same goes if we decide not to include the description,
synopsis, etc.

What's left to do:

- Populate the database on demand, either after a `guix build` or from a
  `guix filesearch...`.  This is important so that `guix filesearch`
  works on packages built locally.  If we do it after `guix build`, I
  need help finding where to plug it in.
- Adapt Cuirass so that it builds its file database.  I need pointers to
  get started here.
- Sync the databases from the substitute server to the client when
  running `guix filesearch`.  For this I suggest we send the compressed
  database corresponding to a Guix generation over the network (around
  10 MiB).  I'm not sure sending just the delta is worth it.
- Find a way to garbage-collect the database(s).  My intuition is that
  we should have 1 database per Guix checkout, and when we `guix gc` a
  Guix checkout we collect the corresponding database.

I would store the databases in /var/guix/...

Comments and help welcome!
:)

-- 
Pierre Neidhardt
https://ambrevar.xyz/
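
To make the trade-off between the two SQLite variants in the benchmark
concrete, here is a minimal sketch using Python's sqlite3 module.  The
table names, the sample paths and the two helper functions are made up
for illustration and are not taken from the actual patch.  It also
assumes that the local SQLite build has the FTS5 extension enabled; FTS5
is only one of SQLite's full-text modules, and the patch may well use a
different one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: a plain table and a full-text (FTS5) table
# holding the same file paths.
conn.execute("CREATE TABLE files (path TEXT)")
conn.execute("CREATE VIRTUAL TABLE files_fts USING fts5(path)")

paths = [
    ("/bin/foobar",),
    ("/share/doc/foo/bar.txt",),
    ("/lib/libbaz.so",),
]
conn.executemany("INSERT INTO files VALUES (?)", paths)
conn.executemany("INSERT INTO files_fts VALUES (?)", paths)


def glob_search(pattern):
    """Classic pattern matching on the plain table."""
    query = "SELECT path FROM files WHERE path GLOB ? ORDER BY path"
    return [path for (path,) in conn.execute(query, (pattern,))]


def fts_search(terms):
    """Full-text search: every term must appear as a whole token."""
    query = ("SELECT path FROM files_fts "
             "WHERE files_fts MATCH ? ORDER BY path")
    return [path for (path,) in conn.execute(query, (terms,))]


# Patterns match substrings, but the order of the words is fixed.
print(glob_search("*foo*bar*"))  # both "foobar" and "foo/bar.txt" match
print(glob_search("*bar*foo*"))  # no match: no word permutations

# Full-text search accepts the words in any order...
print(fts_search("bar foo"))     # only "/share/doc/foo/bar.txt"
# ...but works on whole tokens, so "bar" does not match "foobar".
print(fts_search("bar"))         # only "/share/doc/foo/bar.txt"
```

With the default tokenizer, "/" and "." act as separators, so
"/share/doc/foo/bar.txt" is indexed as the tokens "share", "doc", "foo",
"bar" and "txt", while "/bin/foobar" yields "bin" and "foobar".  That is
why the full-text variant handles permutations but not suffix matching.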